
Just now, OpenAI has released an advanced voice mode!

OpenAI has just announced an exciting new feature – Advanced Voice Mode (AVM) – marking another step forward for AI voice interaction. AVM leverages GPT-4o's native audio technology to give users a more natural, real-time conversational experience.

On Tuesday, OpenAI announced it is rolling out Advanced Voice Mode (AVM) to more paying ChatGPT users.

Currently, OpenAI offers two types of voice conversations – standard and advanced:

1) Advanced Voice: currently rolling out gradually to Plus and Team users. Built on GPT-4o's native audio technology, it enables more natural real-time conversation, picks up nonverbal cues such as speaking speed, and can respond with emotion.

Plus and Team users have a daily limit on Advanced Voice usage, covering both voice input and output. Enterprise and Education users will start getting the feature next week.

2) Standard Voice: Available to all users who log in to ChatGPT and can be used through iOS, macOS, and Android apps.

Standard Voice generates responses through a multi-step pipeline: your speech is transcribed into text, the text is passed to the model, and the model's reply is converted back into speech.

Although Standard Voice lacks Advanced Voice's native multimodal capabilities, it still uses GPT-4o and GPT-4o mini. Note that every prompt in Standard Voice counts toward the message limit.
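The difference between the two modes comes down to how many hops each conversational turn takes. Below is a minimal sketch of the Standard Voice pipeline described above, using hypothetical stub functions (these are not OpenAI's actual APIs): audio is transcribed to text, the text model replies, and the reply is synthesized back to audio.

```python
def transcribe(audio: bytes) -> str:
    """Stand-in for a speech-to-text step (e.g. a Whisper-style model).
    For the sketch, we pretend the audio bytes are their own transcript."""
    return audio.decode("utf-8")

def generate_reply(prompt: str) -> str:
    """Stand-in for the text-model call (GPT-4o / GPT-4o mini)."""
    return f"Echo: {prompt}"

def synthesize(text: str) -> bytes:
    """Stand-in for the text-to-speech step."""
    return text.encode("utf-8")

def standard_voice_turn(audio_in: bytes) -> bytes:
    """One conversational turn: audio -> text -> model -> text -> audio.
    Each hop adds latency and drops nonverbal cues such as pace or tone,
    which is why native-audio Advanced Voice can feel more natural."""
    transcript = transcribe(audio_in)
    reply_text = generate_reply(transcript)
    return synthesize(reply_text)

print(standard_voice_turn(b"Hello!").decode("utf-8"))  # -> Echo: Hello!
```

Advanced Voice, by contrast, lets GPT-4o process the audio stream directly, so there is no separate transcription or synthesis stage in the loop.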

To start a voice conversation, select the voice icon in the bottom-right corner of the screen.


When starting an Advanced Voice conversation, the user is taken to a screen with a blue sphere at its center.


AVM will roll out gradually to all Plus and Team users over the course of a week. Alongside the rollout, OpenAI has added custom instructions, a memory feature, five new voices, and improved accent support, as well as the ability to say "Sorry, I'm late" in over 50 languages.

In addition, ChatGPT has added five new voices for users to try: Arbor, Maple, Sol, Spruce, and Vale. Together with the existing Breeze, Juniper, Cove, and Ember, that brings ChatGPT's total to nine voices, nearly matching Google's Gemini Live.

  1. Arbor – easy-going and versatile
  2. Breeze – Lively and animated
  3. Cove – Calm and direct
  4. Ember – Confident and optimistic
  5. Juniper – Open and upbeat
  6. Maple – Cheerful and honest
  7. Sol – Savvy and relaxed
  8. Spruce – Calm and affirming
  9. Vale – Clever and curious

OpenAI says it has made several improvements since AVM's limited alpha test: ChatGPT's voice feature now understands accents better, and conversations are smoother and faster.

OpenAI has also extended some of ChatGPT's customization capabilities to AVM, including custom instructions, which let users personalize how ChatGPT responds, and memory, which lets ChatGPT recall previous conversations for later reference.

This article was published on Everyone is a Product Manager by [Jiang Tian Tim], WeChat public account: [There is a new Newin]. Original/authorized publication; reprinting without permission is prohibited.

Image from Unsplash, based on the CC0 license.
