Reliably cloning a human voice from a sample of just 15 seconds: that is what the latest artificial intelligence tool from OpenAI, the firm that dazzled the world with ChatGPT, its generative AI language program, can do.
All the user has to do is provide that sample. Once the Voice Engine program has it, the user can have it read any text aloud with the timbre and tone of that voice. The text doesn't even have to be in the same language: a Spanish speaker can provide the sample in her own language and then ask the program to read a text in English, Chinese, or other languages in her voice.
It can be used directly for audio translation. What's more, when used for translation, Voice Engine preserves the native accent of the original speaker: for example, generating English with an audio sample from a French speaker would produce French-accented speech.
For now, the company intends to trial the tool on a small scale rather than open it to the general public, as it did with ChatGPT, because it is aware of the risk of identity theft. With the tool, a 15-second recording of someone is enough to reproduce their voice.
OpenAI believes that several issues must be settled before access to the new tool is widened. For example, it calls for phasing out voice authentication as a security measure for accessing bank accounts and other sensitive information, since it would no longer be secure.
The company also considers it critical to educate the public about the capabilities and limitations of AI technologies, including the possibility of misleading AI content.
Another proposal it puts on the table is to accelerate the development and adoption of techniques for tracing the origin of audiovisual content, so that it is always clear whether you are interacting with a real person or with an AI.