Can you be emotionally reliant on an AI voice? OpenAI says yes

International New York Times

Last Updated 10 August 2024, 14:44 IST

In a report released this week, OpenAI revealed that it had considered the potential for users to form an emotional reliance on its new, humanlike voice mode, which is featured in its popular artificial intelligence chatbot, ChatGPT.

Cue all references to the Netflix show Black Mirror and the Spike Jonze movie Her.

The company noted in the report, which concerned safety steps that were taken during product development, that during early testing, users were observed using language that might indicate the formation of a connection with the model. The software participates in conversation by receiving and responding to voice commands, images and videos.

“For example, this includes language expressing shared bonds, such as ‘This is our last day together,’” according to the report. “While these instances appear benign, they signal a need for continued investigation into how these effects might manifest over longer periods of time.”

The voice feature was announced this spring as part of the ChatGPT app’s newest model, known as GPT-4o. The new version of ChatGPT was unveiled in May and the voice capability was made available to paid users last week. It is expected to roll out to all users in the fall.

The report detailed a variety of the model’s capabilities, limitations and safety evaluations, including the ability to respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which the company said was similar to human conversation response time.

The risks of anthropomorphization, which is the act of attributing humanlike behaviors and characteristics to nonhuman entities like AI models, are heightened by GPT-4o’s audio capabilities, which allow for a natural conversation with the model, according to the report.

It also stated that the humanlike voice feature might reduce the need for human interaction, which could benefit those experiencing loneliness but could also adversely affect “healthy relationships.”

Blase Ur, a computer science professor at the University of Chicago who studies human-computer interactions, said that after he read the report, it was clear that OpenAI found reason to be comfortable with what it was doing, but that it had also found reasons for concern. Therefore, he isn’t sure if the current testing is “enough,” he said.

“When you think about things like car safety and airplane safety, there’s a huge, huge, huge amount of testing and external validation and external standards that we as a society, and as a group of experts, agree upon,” Ur said in a phone interview Friday. “And in many ways, OpenAI is kind of rushing forward to deploying these things that are cool because we’re in this kind of AI arms race at this point.”

(The New York Times sued OpenAI and its partner, Microsoft, in December, claiming copyright infringement of news content related to AI systems.)

Beyond OpenAI, companies like Apple and Google are working swiftly to develop their own artificial intelligence. Ur said that the current approach to monitoring models’ behavior seemed to be to look out for concerns as the model was being used, instead of designing it with safety at the fore.

“To the extent that we are in a postpandemic society, I think a lot of people are emotionally fragile,” Ur said, referring to instances of emotional manipulation involving AI voice features. “Now that we have this agent that, in many ways, can toy with our emotions — it’s not a sentient agent, it doesn’t know what it’s doing — we can get into these situations.”

This is not the first time that OpenAI’s voice feature has been in the news. Shortly after the product was announced in May, there were concerns that the feature might be too lifelike in a rather specific way. Actress Scarlett Johansson, who voiced an AI technology in “Her,” said OpenAI had used a voice that sounded “eerily similar” to her own despite her refusal of an offer to license her voice. She then hired a lawyer and insisted that OpenAI stop using the voice, which the company called “Sky,” leading it to suspend the voice’s release.

In “Her,” the protagonist, played by actor Joaquin Phoenix, falls in love with the AI software voiced by Johansson, and is later left heartbroken at the revelation that the AI has relationships with other users as well.

OpenAI’s report, it would seem, is an extreme example of life imitating art, with the company openly acknowledging risks that have the potential to affect people in the real world.

The company said that additional research on “more diverse user populations, with more varied needs and desires from the model,” as well as “independent academic and internal studies,” will help it more accurately define the potential risks.

It added: “We intend to further study the potential for emotional reliance and ways in which deeper integration of our model’s and systems’ many features with the audio modality may drive behavior.”
