
One of the biggest updates in the 25 years since the launch of Google's search engine: the "AI Overviews" experience officially launches

Author: TechNode

Grasp the pulse of AIGC, and grasp the pulse of technology. Every day, TechNode collects and summarizes global AIGC developments and hot topics, helping you understand AIGC in just five minutes a day. We hope to explore AIGC with you, decode new trends in the industry, and open a new era of intelligence!

Text | TechNode | Gao Zhu · Estimated reading time: 5 minutes

01

Image generation evolves as Google launches Imagen 3: more accurate and more creative

Google announced Imagen 3 at its I/O 2024 developer conference today, further enhancing its text-to-image capabilities. Compared with its predecessor, Imagen 2, Imagen 3 interprets text prompts more accurately and translates them into images, and the resulting images are more "creative and detailed," with fewer distracting artifacts and errors from the model. To address concerns about deepfakes, Google said Imagen 3 will use SynthID, a method developed by DeepMind, to apply invisible digital watermarks to generated media. Users can sign up for a private preview of Imagen 3 in Google's ImageFX tool, and Google says the model will soon be available to developers and enterprise customers through Vertex AI, Google's enterprise generative AI development platform.

02

One of the biggest updates to Google's search engine in 25 years: the "AI Overviews" experience is officially live

Google officially launched the "AI Overviews" search experience at its 2024 I/O developer conference today. It will roll out in the United States this week and expand to more countries and regions in the future. Previously known as Search Generative Experience, the feature lets users search with AI by asking questions and chatting. In the U.S., Google is working with the Reddit community to help answer users' questions. Google says it will provide AI-generated answers to online queries for users in the United States, calling it one of the biggest updates to its search engine in 25 years.

03

Taking aim at Sora, Google unveils the Veo text-to-video model: clips over 1 minute long, up to 1080p, with support for cinematic techniques

OpenAI's launch of the text-to-video model Sora three months ago sparked extensive discussion among netizens, the media, and industry insiders. At its 2024 I/O developer conference today, Google unveiled a competing product, Veo, which can generate "high-quality" video more than 1 minute long at resolutions up to 1080p, in a variety of visual and cinematic styles. According to Google's official press release, Veo has an advanced understanding of natural language and can interpret film terms such as "time-lapse" and "aerial shots of a landscape." Users can guide the desired output with text, image, or video prompts, and Google says Veo produces "more coherent" videos with more realistic movements of people, animals, and objects throughout a shot.

04

In response to GPT-4o, Google launches Project Astra: low-latency chat interaction through the phone camera

Google introduced Project Astra, its latest multimodal AI project, at its 2024 I/O developer conference today. Built on Gemini and able to run natively on Pixel phones, it can be seen as Google's answer to OpenAI's GPT-4o. Google says that with Project Astra, a user simply turns on the camera and the system directly interprets the objects in the camera's view.

05

ByteDance officially releases the "Doubao" large model family, including general-purpose, role-playing, voice-cloning, speech-recognition, and text-to-image models

This morning, ByteDance officially announced at the Spring 2024 Volcano Engine FORCE conference that its self-developed Doubao large models have opened external services. According to reports, the family includes the Doubao general model Pro, the Doubao general model Lite, the Doubao role-playing model, the Doubao speech-synthesis model, the Doubao voice-cloning model, the Doubao speech-recognition model, the Doubao text-to-image model, and the Doubao function-calling model. Alongside the release of its self-developed large models, ByteDance also announced a major upgrade to Volcano Ark, Volcano Engine's large model service platform.

06

Google previews new Android feature: AI-powered scam call detection

Google announced AI-powered scam call detection for Android at the I/O 2024 developer conference today, which alerts users to possible scam behavior during calls and encourages them to hang up. Google says the feature is based on the Gemini Nano model running on-device, which listens for fraudulent language and other conversation patterns commonly associated with scams and alerts users when a call appears to be fraudulent. On the security side, Google says these protections run entirely on the device, so conversations monitored by Gemini Nano remain private.

07

Google Workspace integrates Gemini: it can summarize email content and organize meeting points

At the I/O 2024 developer conference today, Google announced that Google Workspace will further integrate Gemini, with many skills based on Gemini 1.5 Pro introduced in a side panel. Google says Workspace's Gemini integration is meant to save users the time and effort of digging through files, emails, and other data across multiple apps. Gmail, Docs, Sheets, Slides, and Drive will be the first Workspace apps to get the Gemini side panel, which lets users organize, understand, and summarize information without leaving the app.

08

Gmail deeply integrates Gemini: summarizing email content and generating better replies

At the I/O 2024 developer conference today, Google announced that it will invite Workspace and Google One AI Premium users to try a new version of Gmail next month that lets Gemini summarize email content. Google says users can ask Gemini questions about the current email, or have it draft a reply based on the email's context, both in the mobile app and in Gmail on the web. Google had already introduced Smart Reply in Gmail; the new version upgrades it to "Contextual Smart Reply," which produces more nuanced, better-tailored responses based on context. Google is also adding a new Gemini button to the Gmail app: tapping it shows suggestions such as "Summarize this email" or "Suggested reply," and users can type prompts to ask questions about their emails.

09

Google Gemini unlocks travel planning skills to help you plan your trip in seconds

At today's I/O 2024 developer conference, Google announced a trip-planning feature for Gemini that combines personal and public travel information to help users plan flights, hotels, and more. Google says Gemini can pull out specific details such as flight times and hotel bookings based on user prompts to create a suitable vacation itinerary in seconds. Gemini builds itineraries from flight and hotel details found in the user's email. The model will also use Google Maps to find nearby restaurants and cultural attractions, and filter options based on specific preferences, such as dietary restrictions or things to avoid. Google says the new trip-planning feature will come to Gemini Advanced in the coming months.

10

The iOS version of ChatGPT adds Chinese to the app's preferred-language settings

The iOS version of ChatGPT released update 1.2024.129 early this morning, adding Chinese as an option in the app's preferred-language settings; previously only other languages were supported. On first launch after the update, the app displays a Chinese-language page. The app supports an application-language setting: tapping it jumps to the ChatGPT entry in the system Settings, where tapping "Preferred Language" lets users set the app's display language.

11

Baidu released the world's first L4 autonomous driving model, Apollo ADFM, which it says is safer than human driving

Baidu Apollo held Apollo Day 2024 today at the Apollo Go (Luobo Kuaipao) Robo-Mobility Valley in Wuhan and released Apollo ADFM (Autonomous Driving Foundation Model), which it calls the world's first large model supporting L4 autonomous driving. Baidu said Apollo ADFM rebuilds autonomous driving on large model technology, balancing safety with generalization: it is reportedly more than 10 times safer than human drivers and achieves city-scale coverage of complex scenarios. Relying on this autonomous driving model, Baidu's Apollo Go robotaxi service has mastered Wuhan's complex road conditions, covering the entire city across all areas and time periods. Meanwhile, in the L2+ intelligent driving space, ANP3, the only purely vision-based urban pilot-assisted driving product in China, will also fully adopt Apollo ADFM and be upgraded to ASD (Apollo Self-Driving), which will go into mass production across all Jiyue models, aiming to realize "intelligent driving that works nationwide, wherever there is a Baidu map."

This article was compiled by TechNode and may not be reproduced without authorization. To reprint or republish, please reply "reprint" in the background.

- - - - - - - - END - - - - - - - -
