
OpenAI Releases New Large Model GPT-4o

Author: Rui Finance

Recently, OpenAI, a leader in the field of artificial intelligence, released a new AI model, GPT-4o, a breakthrough hailed as "changing the history of human-computer interaction overnight". GPT-4o not only supports voice chat but also enables real-time video interaction that feels as fluid as talking to another person. The arrival of this technology will undoubtedly open up new opportunities in the field of artificial intelligence.


OpenAI's ambitions

OpenAI's flagship product, ChatGPT, can understand natural language and answer users' questions, but because it is pre-trained on a fixed corpus, it cannot search for up-to-the-minute content. In addition, the generation mechanism of large language models means ChatGPT cannot entirely avoid "talking nonsense in all seriousness", that is, hallucination. So people who want to stay current with real-time content still have to turn to search engines.

Traditional search engines are built on keyword matching: they identify the search scope from the keywords the user enters and match them against a massive pool of information that may fit the user's intent. The pain point of traditional search is the redundancy and inconsistency that come with huge volumes of information from different sources, so users retrieve a flood of results yet often find nothing useful.
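
As a rough illustration of that redundancy problem, here is a minimal keyword-matching sketch in Python; the toy corpus, query, and overlap scoring are invented for illustration and bear no relation to any real engine's ranking.

# Toy keyword-matching retrieval: score documents by how many query
# terms they contain. Corpus and query are invented for illustration.

corpus = [
    "GPT-4o supports real-time voice and video interaction",
    "GPT-4o real-time voice demo draws attention",          # near-duplicate
    "Voice assistants: a history of real-time interaction",
    "Cooking recipes for a quick dinner",
]

def keyword_search(query: str, docs: list[str]) -> list[tuple[int, str]]:
    terms = set(query.lower().split())
    scored = []
    for doc in docs:
        words = set(doc.lower().split())
        score = len(terms & words)          # count of overlapping terms
        if score:
            scored.append((score, doc))
    # Highest term overlap first; redundant near-duplicates rank side
    # by side, which is exactly the pain point described above.
    return sorted(scored, reverse=True)

for score, doc in keyword_search("real-time voice interaction", corpus):
    print(score, doc)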

OpenAI clearly wants to be a key connection point between humans and data, and ChatGPT alone (even the smartest GPT) can meet only part of that demand, so launching a search engine is imperative. What the industry most wants to know is what form OpenAI's search engine will take, and whether it can really shake the search-market ecosystem Google has dominated for so long.

Before OpenAI, the United States already had a generative search engine: Perplexity. Founded in 2022, Perplexity is a Silicon Valley startup that uses artificial intelligence to provide direct answers to search queries instead of a list of website links. Perplexity weaves videos, images, and other media into its answers and sometimes links directly to sources. It has won fans including Nvidia CEO Jensen Huang and reached 10 million monthly active users within a year and a half of its founding.

So, will OpenAI's search engine resemble Perplexity, or will it bring bigger surprises? We will have to wait for OpenAI's final reveal.

GPT-4o is not only completely free but also available in desktop and mobile apps, with greatly improved performance; it can process text, images, and audio in one model, making human-computer interaction more natural and simple. For example, users can have GPT-4o join a web conference and produce a summary of the discussion.

What exactly is GPT-4o for? Users can hand it the problem at hand and greatly improve their productivity, or hold real-time voice conversations with it that feel as natural and smooth as chatting with a real person. The model responds at human conversational speed and can even read the user's emotions and react accordingly.
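
For developers, the same multimodal ability is exposed through OpenAI's public API. A minimal sketch, assuming the official openai Python SDK, an OPENAI_API_KEY in the environment, and a placeholder image URL:

# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One request mixing text and an image; gpt-4o handles both natively.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Describe this image and note anything unusual."},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/photo.jpg"}},  # placeholder URL
        ],
    }],
)
print(response.choices[0].message.content)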

Steal the limelight from Microsoft

Facing OpenAI's deliberately timed launch and its seizure of the limelight, what AI products did Google bring to today's I/O conference, and were they shocking and novel enough?

The Google I/O developer conference is in its 16th year, and AI has long since become its absolute, even sole, protagonist. Google CEO Sundar Pichai joked at the end that "AI" had been said 121 times over the course of the event, drawing laughter from the audience. Although competitors were never mentioned, Pichai set out to showcase Google's AI prowess from the start of the keynote, declaring that Google has fully entered the Gemini era. He stressed that Google has been investing in AI for more than a decade, across every layer of the stack: research, products, and infrastructure.

While AI upstart OpenAI has a first-mover advantage in product launches, Google holds an overwhelming advantage in research papers, user scale, product breadth, and computing power, which is a direct reason OpenAI must ally with Microsoft: neither company could compete with Google alone.

Pichai also announced that the Gemini model now reaches 2 billion users across Google's platforms, with more than 1 million users signing up in just three months. The natively multimodal Gemini 1.5 Pro, released two months ago, is already used by more than 1.5 million developers.

In terms of performance, Google is the Thanos of the AI industry. Gemini 1.5 Pro pushed the context window to the million-token level, comprehensively overpowering the slower GPT-4 Turbo. Three months on, Google announced today that the improved Gemini 1.5 Pro is generally available to Gemini Advanced users in 35 languages.

More brutal still, Google has doubled Gemini 1.5 Pro's context window to 2 million tokens (available only to developers for now), a figure OpenAI can so far only aspire to. Pichai called this a significant step toward the ultimate goal of infinite context.

What kind of experience does Gemini 1.5 Pro bring to users? Google used Workspace to demonstrate the dramatic productivity gains AI can deliver. For example, in a remote meeting on Google Meet, you can ask Gemini to record the session and compile the minutes for you, even if you cannot attend.

With Gemini, Gmail gains a soul. Drafting an email is now a basic operation: users can ask Gemini to organize and summarize the mountain of messages in Gmail, for example collating recent receipts and credit-card statements into a professional, itemized list of spending.
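
These features live inside Workspace, but developers can approximate the same long-context summarization through the Gemini API. A minimal sketch, assuming the google-generativeai Python SDK, an API key in the GOOGLE_API_KEY environment variable, and a placeholder transcript file:

# pip install google-generativeai
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# The million-token window means a long transcript can be passed whole,
# with no chunking or retrieval pipeline in between.
model = genai.GenerativeModel("gemini-1.5-pro")

with open("meeting_transcript.txt", encoding="utf-8") as f:  # placeholder file
    transcript = f.read()

response = model.generate_content(
    ["List the key decisions and action items from this meeting:", transcript]
)
print(response.text)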

Equipping AI with eyes and a mouth

Zhou Hongyi noted that, judging from the brief technical overview at OpenAI's launch event, GPT-4o departs from the traditional approach of converting speech to text, processing the text, and converting the result back into speech. Instead, it processes speech directly in a single integrated large-model engine, understanding voice input directly, including the emotion, feeling, intonation, and accent it carries, and producing speech directly as output.

"This brings a new experience, that is, the delay is only about 300 milliseconds, which reaches the response speed of human conversation, so that not only can you understand the emotions in your words, but also can be accompanied by happiness, sadness, disappointment, excitement or more complex feelings when outputting answers." Zhou Hongyi said.

Zhou Hongyi also pointed out that beyond the astonishing voice capability, there is an easily overlooked point: GPT-4o can directly open the phone's camera, gaining a far more powerful ability to see through it. This may not be as striking as Sora, but it is a step up from GPT-4V's still-image input. "To sum up, GPT-4 gave artificial intelligence the ability to understand knowledge, the equivalent of a brain; GPT-4V added a rudimentary ability to see; and GPT-4o adds eyes that can truly understand the world, ears that can understand human speech, and a mouth that can freely express its feelings and emotions."

In Zhou Hongyi's view, some people will be disappointed that OpenAI did not launch GPT-5 this time, but the path to artificial general intelligence is not only about surpassing humans in reasoning, knowledge, and logic; the ability to interact with people matters even more. When AI can see the world through both phone cameras and ubiquitous IoT cameras, and can interact with people at human response speed, it becomes formidable indeed: "that is what makes artificial intelligence truly more human-like."

To sum up, artificial intelligence is advancing at a breathtaking pace, and every technological breakthrough brings new surprises. OpenAI's release of GPT-4o and Google's Gemini 1.5 Pro announcements at I/O are both important breakthroughs that will open new opportunities for the field and bring more convenience to our lives. At the same time, AI development still faces many challenges, such as how to keep AI safe and how to prevent its abuse, questions we must keep thinking about and exploring as the technology develops.
