OpenAI has unveiled a voice-cloning tool but said it plans to keep it tightly controlled until safeguards are in place to thwart audio fakes meant to dupe listeners.
Africa Today News, New York reports that the model, called “Voice Engine,” can essentially duplicate someone’s speech based on a 15-second audio sample, according to an OpenAI blog post sharing results of a small-scale test of the tool.
“We recognize that generating speech that resembles people’s voices has serious risks, which are especially top of mind in an election year,” the San Francisco-based company said.
“We are engaging with U.S. and international partners from across government, media, entertainment, education, civil society and beyond to ensure we are incorporating their feedback as we build.”
Disinformation specialists worry that the broad availability of voice-cloning tools, which are low-cost, simple to use, and difficult to trace, will lead to widespread abuse of AI-powered applications during a crucial election year.
Acknowledging these problems, OpenAI said it was “taking a cautious and informed approach to a broader release due to the potential for synthetic voice misuse.”
The cautious unveiling came a few months after a political consultant working for the long-shot presidential campaign of a Democratic rival to Joe Biden admitted being behind a robocall impersonating the US leader.
The AI-generated call, the brainchild of an operative for Minnesota congressman Dean Phillips, featured what sounded like Biden’s voice urging people not to cast ballots in January’s New Hampshire primary.
The incident caused alarm among experts who fear a deluge of AI-powered deepfake disinformation in the 2024 White House race as well as in other key elections around the globe this year.
OpenAI said that partners testing Voice Engine agreed to rules including requiring explicit and informed consent of any person whose voice is duplicated using the tool.
It must also be made clear to audiences when the voices they are hearing are AI-generated, the company added.