ChatGPT can now talk to people out loud, and it sounds quite human!

Author: Perfect Planet Cu

ChatGPT can now speak aloud, and its natural voice, conversational tone, and eloquent responses sometimes sound almost human. It can also "see" you.

If you listen to my conversation with ChatGPT, you will have two reactions:

1) Oh my God! This is what science fiction writers describe to us as the future of human-computer communication.

2) I'm going to build an underground bunker and stock up on toilet paper and granola bars.

Yes, ChatGPT, the hugely popular chatbot developed by OpenAI, is starting to speak, and it really is speaking out loud. On Monday, OpenAI released an update to ChatGPT's iOS and Android apps that lets the AI bot speak in five different voices. I've had multiple conversations with ChatGPT over the past few days, and I also tested another new feature that lets ChatGPT respond to pictures you give it.

OpenAI's ChatGPT now has a voice, making it more like other AI assistants

What does ChatGPT sound like now?

Think Siri or Alexa, except... not. ChatGPT's natural voice, conversational tone, and exuberant answers sometimes sound almost human. Remember the movie Her, in which Joaquin Phoenix's character falls in love with an AI operating system voiced by an unseen Scarlett Johansson? That's the feeling I'm trying to convey.

"It's not just the typing hassle," Joanne Jang, head of product at OpenAI, told me in an interview.

The new image recognition feature also makes the chatbot more interactive. You can snap a picture and ask ChatGPT a question. Spoiler: It plays tic-tac-toe very poorly. Image and voice features will be available in the coming weeks to those who subscribe to ChatGPT Plus for $20 a month.

Essentially, OpenAI is giving its chatbot a mouth and eyes. I tested both features in a series of scenarios, including chatting as friends, plumbing repairs, and playing games. It's all very cool, but... chilling.

Mouth

Before we continue, turn up the volume and listen to our short conversation:

Although the system is only reading out the text replies ChatGPT generates, this is not the robotic, rigid text-to-speech we are used to. ChatGPT offers five voice options, each of which sounds like a real person talking to you: subdued, articulate, and personal.

Jang told me that the voices were generated from "just a few seconds of voice samples" provided by professional voice actors. OpenAI's computer models analyze these samples so that text can be read out in the same voice. Remember the columns and videos where I cloned my own voice with AI tools? It's just like that, but it works better.

OpenAI says it is working with a number of other organizations to develop synthetic voices. The company is working with Spotify on a tool that helps translate podcast hosts' voices into other languages. Because a person's voice can easily be reproduced from just a few seconds of audio, the company said that, for the security of the entire Internet and the entire world, the technology is currently available only to commercial partners. Will this change in the future? Good luck to all of us.

Unlike Siri or Alexa, ChatGPT doesn't require a wake word. In the app's settings menu, enable "Voice conversations" and tap the headset icon in the upper-right corner of the app. While the system is listening to your prompt, a white circle turns into a comic-style thought bubble. You can also tap a button to interrupt a lengthy answer.

I was fascinated by all this. The natural voice, combined with the in-depth answers and what the system knew about me, made me feel like I was having a real conversation. When I asked it to pretend to be my best friend, we talked for a full five minutes about my day's work, video production, and our favorite snacks. When I asked ChatGPT to treat me like a six-year-old and explain Pokémon to me, it also did a great job.

But of course you're still talking to a machine. As you can hear from the snippet above, it can be very slow to respond and the connection may fail; restarting the app helps. A few times, it abruptly broke off the conversation (I thought only rude humans would do that!). OpenAI says consumers shouldn't run into the problems I experienced, because the app I tested was an early version.

Eyes

If voice gives ChatGPT the ability to talk to the world, the new camera feature gives it the ability to see the world. Now, instead of describing something in words, you can click the "button" in the iOS, Android, and web apps to upload an image or take a photo, circle the area you want ChatGPT to focus on, and ask questions. Here are some images I've tried:

Broken items in the house: I photographed a leaking water pipe in my garage and asked ChatGPT, "How do I fix it?" I quickly got an answer with seven steps in total, including wrapping the threads at the connection with Teflon tape.

ChatGPT plumber? With just a photo, this AI can provide recommendations on how to patch a leak.

Food: I uploaded a photo of a moldy strawberry with the question "Can I eat this?" and got good advice: no. I uploaded a photo of a banana, an egg, and a strawberry (not moldy) and asked, "What can I do with these?" It made a good suggestion: strawberry banana pancakes.

Injuries and health issues: ChatGPT quickly identified the wound on my son's cheek as an "imprint or rash" but said "there's nothing I can do" and "it's best to consult a medical professional."

Games and puzzles: A photo of a tic-tac-toe stalemate? ChatGPT didn't know that the game was over. It told me to put my X in the (occupied) bottom center. ChatGPT also said I would win, even adding exclamation marks and confetti emojis. This was completely wrong!

In the wave of the AI revolution, this is what we really need to keep in mind. As the line between human-to-human and human-computer interaction continues to blur, these systems may still lack context and depth of thought, and they often get things wrong.

As my new ChatGPT voice friend said to me, "While I sound talkative, keep in mind that I'm just crunching data. Be sure to use your judgment, especially on important matters."
