
OpenAI model is finally updated! GPT-4o, with its powerful audio-visual capabilities, will be available to all users

author:51CTO

Edit | Yifeng

Produced by | 51CTO Technology Stack (WeChat ID: blog51cto)

Spring is finally here! The GPT series has received its long-awaited update: GPT-4o has surfaced.

Moreover, the previously mysterious "im-also-a-good-gpt2-chatbot" turns out to have been its test version.

Sam Altman did not appear in this launch; instead, it was hosted by OpenAI CTO Mira Murati. She has previously drawn some controversy in interviews over a lack of clarity about OpenAI's training data.

What did OpenAI announce at its spring launch? In a word, GPT-4o is faster, more multimodal, and cheaper!


1. The latest model, GPT-4o

The model update that made Altman exclaim "amazing work" is here!


As you can see, GPT-4o's benchmark performance is hard to beat. (As an aside, Alibaba's Tongyi Qianwen model quietly appears on the right side of this chart.)

The new large language model, trained on massive amounts of data from the Internet, is better at working with text and audio, and supports 50 languages.

OpenAI's updated GPT-4o generative AI model will officially roll out to developers and consumers in the coming weeks. The new model will be available to all users, though paid users will continue to "have five times the capacity limit of free users."

OpenAI CTO Mira Murati said GPT-4o provides "GPT-4 level" intelligence while improving on GPT-4's capabilities across text, vision, and audio.

Speaking in a keynote at OpenAI's office, Murati said, "The strength of GPT-4o is that it spans speech, text, and vision. This is very important, because we are looking at the future of human-machine interaction."

GPT-4, OpenAI's previous flagship model, was trained on a combination of images and text; it can analyze both, handling tasks such as extracting text from images or describing image content. GPT-4o adds voice capabilities on top of this foundation.

This matches the direction many had guessed: "ChatGPT + Voice Agent"!


Nvidia scientist Jim Fan shared his predictions before the live stream

2. GPT-4o's powerful "audio-visual" capabilities

OpenAI CEO Sam Altman posted that the model is "natively multimodal," meaning it can generate content from, and understand commands in, speech, text, or images.

What exactly can GPT-4o achieve in terms of speech?

GPT-4o dramatically improves the experience of ChatGPT, OpenAI's viral AI chatbot. ChatGPT has long offered a voice mode that uses a text-to-speech model to read ChatGPT's responses aloud. GPT-4o improves on this, letting users interact with ChatGPT more like an assistant.

For example, a user can ask ChatGPT, which is powered by GPT-4o, a question and interrupt ChatGPT as it answers. OpenAI says the model can provide a "real-time" response and can even capture emotion in the user's voice and generate a "range of different emotional styles" of speech.

GPT-4o also improves ChatGPT's visual abilities. Given a photo or a desktop screenshot, ChatGPT can now quickly answer related questions, from "What's going on in this code?" to "What brand of shirt is this person wearing?"

"We know that these models are getting more and more complex, but we want the interactive experience to actually become more natural and easy, so that you don't have to focus on the user interface at all, but only on working with [GPT]," Murati said.

OpenAI claims that GPT-4o is also more multilingual, with improved performance across 50 languages. Altman added on X that developers can access GPT-4o through the API at half the price and twice the speed of GPT-4 Turbo.

3. Closing thoughts

The launch of OpenAI's model GPT-4o with powerful audio capabilities gives us a further glimpse into the future of virtual assistants.

A tech blogger familiar with the matter said the timing of this release also signals that OpenAI and Apple have reached a deal, meaning the future of Siri may be powered by ChatGPT!


If OpenAI, Microsoft, and Apple all join hands, then Google, the "Wang Feng of AI" (a Chinese joke about always being upstaged), will truly be left in the awkward position of fighting alone.

Tomorrow, Google's developer conference begins. OpenAI's rush to release a product update now is clearly an attempt to steal Google's limelight!

So, what do you think Google can release to turn the tables?

Reference Links:

1. https://techcrunch.com/2024/05/13/openais-newest-model-is-gpt-4o/

2. https://www.theverge.com/2024/5/13/24155493/openai-gpt-4o-launching-free-for-all-chatgpt-users?showComments=1

Source: 51CTO Technology Stack
