OpenAI Debuts Voice-Cloning Tech, But Won’t Release It Widely

March 29, 2024

OpenAI couldn’t help itself: The company has developed voice-cloning technology that’s so good it’s bound to both impress and scare users. But for now, OpenAI is only releasing the system to select partners. Called Voice Engine, the technology can clone your voice simply by listening to a 15-second clip of you talking. In addition, the replicated voice can convey emotion and the natural cadence of human speech, making the AI-generated dialog sound realistic. OpenAI says that it first developed Voice Engine in late 2022 to power the text-to-speech capability for ChatGPT. But rather than release Voice Engine to the public, the company has essentially decided that society isn’t quite ready for it — at least not yet. “We are taking a cautious and informed approach to a broader release due to the potential for synthetic voice misuse,” the company wrote in a blog post that showcases several examples of Voice Engine in action.

In OpenAI’s examples, the AI-generated speech is often indistinguishable from the reference audio. In the wrong hands, the technology could clearly be used to pump out deepfakes that misinform the public. But despite the potential for misuse, OpenAI says Voice Engine could be useful for society. The blog post goes on to say that starting late last year the company began testing Voice Engine “with a small group of trusted partners.” The results show that the voice cloning could be used as a reading assistant for schoolchildren. It can also act as a translator, using the person’s voice to speak in multiple languages. Another use case involves providing the voice-cloning technology to people who’ve lost the ability to speak, similar to what Apple is doing. As a result, OpenAI has decided to release Voice Engine in preview mode to partners who agree never to use the technology for unauthorized impersonation.

“Partners must also clearly disclose to their audience that the voices they’re hearing are AI-generated,” the company said. OpenAI has also added a watermarking system to help detect any AI-generated audio from Voice Engine. Still, OpenAI isn’t guaranteeing it’ll ever widely release the voice-cloning tech. The company says a lot will depend on how society responds to the rise of generative AI, which is already blurring lines between fiction and reality. “We hope to start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities,” the company added. “Based on these conversations and the results of these small scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.”
