Microsoft introduces AI tool for realistic replication of faces and voices
Addressing claims that Azure AI Speech functioned merely as a ‘deepfakes creator,’ Microsoft said it had built in protective measures
Microsoft unveiled its recent advancement in the realm of artificial intelligence during its developer conference this week: software capable of crafting new avatars and voices or mimicking the established appearance and speech of a user. This development has sparked concerns that it could significantly enhance the production of deepfakes—AI-generated videos portraying events that never occurred.
Revealed at Microsoft Ignite 2023, Azure AI Speech is trained on images of humans and lets users enter a script that a photorealistic, AI-generated avatar then reads aloud. Users can choose a preloaded Microsoft avatar or upload footage of a person whose voice and appearance the tool will replicate. In a blog post published on Wednesday, Microsoft highlighted the tool’s potential for building “conversational agents, virtual assistants, chatbots, and more.”
The post explains, “Customers have the option to select either a prebuilt or a custom neural voice for their avatar. If the custom neural voice and the custom text-to-speech avatar both use the voice and likeness of the same person, the avatar will closely resemble that individual.”
Microsoft said the new text-to-speech software ships with a range of limits and safeguards to prevent misuse. The company emphasized, “As part of Microsoft’s commitment to responsible AI, text-to-speech avatar is designed with the intention of protecting the rights of individuals and society, fostering transparent human-computer interaction, and counteracting the proliferation of harmful deepfakes and misleading content.
Users have the option to upload their own video recordings featuring avatar talent, which the feature utilizes to train a synthetic video of the custom avatar speaking,” the post continued. In this context, “avatar talent” refers to a human posing for the AI’s metaphorical camera.
The announcement quickly drew criticism that Microsoft had launched a “deepfakes creator,” making it easier to replicate a person’s likeness and make them appear to say and do things they haven’t. Microsoft’s own president, Brad Smith, said in May that deepfakes are his “biggest concern” about the rise of artificial intelligence.
In response to the criticism, the company said the custom avatars are a “limited access” tool: customers must apply for and be granted approval from Microsoft, and must disclose when AI was used to create a synthetic voice or avatar.
“With these safeguards in place, we help limit potential risks and empower customers to infuse advanced voice and speech capabilities into their AI applications in a transparent and safe manner,” stated Sarah Bird of Microsoft’s responsible AI engineering division.
The text-to-speech avatar maker is the latest entry in major tech firms’ race to capitalize on the artificial intelligence boom of recent years. Since the success of ChatGPT, launched by the Microsoft-backed firm OpenAI, companies including Meta and Google have rushed their own AI tools to market.
The rise of AI has prompted growing concern over its capabilities, leading OpenAI CEO Sam Altman to warn Congress of its potential for election interference and to call for safeguards.
Experts are especially worried about the role deepfakes could play in election interference. Microsoft recently launched a tool that lets politicians and campaigns authenticate and watermark their videos to verify their legitimacy and curb the spread of deepfakes. And this month, Meta announced a policy requiring the disclosure of AI use in political ads and barring campaigns from using Meta’s own generative AI tools for advertising.