

It's a video of an internet user filming himself speaking in English on his mobile phone. Then we see the same clip again, but the person is now speaking in French, with no change in the tone of voice or lip movement. Then the video starts again, this time in German. This short demo, posted on September 11 by @mrjonfinger on X (formerly Twitter), illustrates the impressive and disturbing effect of new tools that use artificial intelligence (AI) to generate automatic dubbing while cloning the speaker's voice and synchronizing their diction. Already viewed over 7 million times, the video uses software created by an American startup, HeyGen. But other companies, including Google, have similar applications. This latest breakthrough in AI raises further questions about the future of translation and dubbing.
On the internet, several impressed or amused people have tried out HeyGen's tool. Digital consultant Michel Levy Provençal tested this "revolutionary new function for automatic video translation" in Spanish, Polish, Hindi, etc. Others have dubbed a Jacques Brel song, a Lionel Messi press conference and a Charles de Gaulle speech.
The HeyGen website allows users to try out the tool free of charge for around two minutes of video, but they have to wait in a queue which, on Thursday, September 14, held over 100,000 files. The startup also offers paid subscriptions, for example $48 (€45) a month for around 30 minutes of video.
These synthetic dubbing tools are impressive, bringing together several AI techniques already on the market: transcription from sound to text (Trint, DeepL or YouTube, which generates automatic subtitles on its videos); translation (DeepL, ChatGPT, Google Translate, etc.); text-to-speech synthesis; and the cloning of a voice from a recording. Apple will soon offer voice cloning from a recording with its Personal Voice tool, which is aimed at people at risk of losing their voice, for example through illness.
Software such as HeyGen raises the prospect of breathtaking AI developments in translation. Tomorrow, will we be able to hear any foreign speaker translated live through our smartphone's headphones? Will we be able to watch any video or film dubbed into any language?
Such a prospect raises questions and concerns. There are, of course, questions about translation quality and accents (the French in the X video sounds a little Québécois). In May, Google demonstrated a similar tool, Universal Translator, but opted to let "trusted" testers try it out before making it available to the general public. "Some of these technologies could be used by the wrong people or organizations, for example to create deepfakes [deceptive realistic videos generated by AI]," the company explained. Google added that it was working on tagging systems to recognize AI-created content, in order to combat misinformation.