


When you think of what a voice programmed by artificial intelligence would sound like, you might picture something robotic and stilted, with a staccato cadence incapable of capturing the inflections, speed and emotion required to sound even somewhat human. But this is 2024, and the robots have gotten a serious upgrade. Now they can imitate voices, accents and intonation to an almost creepy degree — for better or worse.
ChatGPT’s new Advanced Voice Mode, released to most ChatGPT users last week, is an audio version of the original ChatGPT, which uses artificial intelligence to respond to text prompts conversationally. It works the same way as the original, but with audio: users speak into the app and the voice responds automatically. Users can choose one of nine voices and then, through conversation and text prompts, teach those voices to talk in a way they like.
It works well enough that OpenAI, the company that owns ChatGPT, issued a safety report warning that people could become emotionally reliant on the feature.
“It recognizes from both the words you are using, as well as the inflections in your tone, but also informed by the context of the words that you’re leveraging, to respond in a way that best makes sense,” explained Celia Quillian, an artificial intelligence expert. “ChatGPT has always been a predictive model, right? So it’s just predicting what the most likely response should be based off of the input you give it, and now it’s doing that with sound.”
The new feature has prompted a host of TikTok users to post videos showing off Advanced Voice Mode’s capabilities, including speaking in slang, dialects and even the occasional regional accent. The results are surprising and often hilarious.
When Morissa Schwartz, a New Jersey native and entrepreneur, asked ChatGPT to mimic what she sounded like, the app didn’t hesitate.