Microsoft introduces the Azure AI Speech text-to-speech avatar, a tool that lets users generate photorealistic avatars that animate and speak scripted text. The tool, available in public preview, combines uploaded images of the person the avatar should resemble, a script, and a text-to-speech model to drive the animation. While it promises a more efficient way to create training videos, product introductions, and virtual assistants, it raises ethical concerns about potential misuse and compensation for actors. Additionally, Microsoft launches the Personal Voice feature, which lets users replicate their own voices, subject to explicit consent and specific usage restrictions.
Users create a photorealistic avatar by uploading images of the person the avatar should resemble and providing a script. The tool trains a model to drive the animation and uses a text-to-speech model to read the script aloud. It aims to facilitate the creation of training videos, product introductions, customer testimonials, chatbots, and more.
The tool raises ethical questions about potential misuse, including the creation of deepfake-like content. Microsoft acknowledges the potential for abuse and limits access to custom avatars: they are a "limited access" capability, available by registration and only for certain use cases.
The launch of avatar-generating tools draws parallels with the recent SAG-AFTRA strike over AI-generated digital likenesses. Microsoft's approach to compensating and notifying actors whose likenesses are used remains unclear, and the company has not said whether avatars will be labeled as AI-generated.
Microsoft introduces the Personal Voice feature within its custom neural voice service, enabling users to replicate their voices from a one-minute speech sample. Users must give explicit consent and agree to use personal voice only in specific applications, with restrictions on reading user-generated or open-ended content.
To avoid legal complications, Microsoft requires explicit consent and restricts how personal voice may be used, particularly in entertainment scenarios such as dubbing for films, TV, video, and audio. The company did not provide details on how actors might be compensated for their contributions.
Microsoft's introduction of the Azure AI Speech text-to-speech avatar and the Personal Voice feature showcases the company's advances in generative AI. While these tools offer new capabilities, they also raise ethical concerns about potential misuse and about compensation for actors whose likenesses and voices are replicated. As the technology evolves, addressing these concerns and implementing safeguards will be crucial to ensuring responsible use.