
200PB of data! Demystifying Mobileye's Autonomous Driving "Secrets"

Abstract: On January 7, Mobileye announced at CES 2022 that it has collected 200 petabytes of driving data, a virtual treasure trove of real-world experience. Combined with Mobileye's computer vision technology and natural language understanding (NLU) models, this data can return thousands of query results in seconds, even for rare conditions and "long-tail" events. This helps Mobileye's computer vision systems handle edge cases, enabling self-driving cars to achieve an ultra-high mean time between failures (MTBF).


Professor Amnon Shashua, President and CEO of Mobileye, said: "Data, and the infrastructure to process it, are among the complexities of implementing autonomous driving technology. Mobileye has spent 25 years collecting and analyzing what we believe to be an industry-leading database of real-world and simulated driving experience, differentiating itself by enabling powerful autonomous driving solutions with ultra-high mean time between failures."

Mobileye holds one of the industry's most widely recognized automotive datasets: more than 200 petabytes of real-world driving video collected over the past 25 years, comprising 16 million one-minute video clips.


△ Mobileye dataset has more than 200PB of real-world environment driving video footage

Large-scale data annotation is at the core of the powerful computer vision engine that autonomous driving requires. Mobileye's rich, relevant dataset is labeled either manually, by more than 2,500 professional annotators, or through automated pipelines. Its compute engine relies on a peak of 500,000 CPU cores in the cloud to process 50 million datasets per month, the equivalent of 100 petabytes of data generated from 500,000 hours of driving footage each month.
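Taken at face value, those figures imply a substantial per-hour data rate. A quick back-of-envelope check, assuming decimal units (1 PB = 10^15 bytes; the article does not specify which convention it uses):

```python
# Sanity-check the quoted figures: 100 PB per month from 500,000
# hours of driving footage per month. Assumes decimal units.
PB = 1e15
monthly_bytes = 100 * PB      # ~100 PB generated per month
monthly_hours = 500_000       # from ~500,000 hours of footage

bytes_per_hour = monthly_bytes / monthly_hours
mb_per_second = bytes_per_hour / 3600 / 1e6

print(f"{bytes_per_hour / 1e9:.0f} GB per driving hour")  # 200 GB
print(f"{mb_per_second:.0f} MB/s sustained data rate")    # ~56 MB/s
```

Roughly 200 GB per driving hour, or about 56 MB/s sustained, is plausible for a multi-camera vehicle recording high-resolution video, which lends the quoted totals some credibility.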

The value of data lies in its being interpretable and usable, which requires deep natural language understanding and advanced computer vision algorithms; both have long been Mobileye's strengths.

Every autonomous driving company faces the "long-tail" problem: self-driving cars encounter situations they have never seen or experienced before. These long-tail events accumulate into huge datasets, but many companies lack the tools needed to understand them efficiently. Mobileye's advanced computer vision technology, paired with powerful natural language understanding models, can query long-tail datasets and return thousands of results in seconds. Mobileye then uses these results to train its computer vision system and make it more robust, greatly speeding up the development cycle.

The Mobileye team uses an internal search-engine database containing millions of images, video clips, and scenes. Its content covers everything from "tractors covered in snow" to "traffic lights at sunset", all collected by Mobileye and fed into its algorithms (see sample image).
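Mobileye has not published the internals of this search engine, but the general technique behind natural-language scene retrieval is to embed text queries and scenes into a shared vector space (as in CLIP-style models) and rank scenes by similarity. The sketch below illustrates only that general idea: the scene ids, embedding vectors, and `search` helper are all hypothetical toy values, not Mobileye's data or API.

```python
# Illustrative sketch of text-to-scene retrieval via vector
# similarity. The 3-dimensional "embeddings" here are hand-made
# toys standing in for the output of a real text/image encoder.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical scene index: scene id -> embedding vector.
scene_index = {
    "snow_covered_tractor_0041": [0.9, 0.1, 0.0],
    "sunset_traffic_light_1203": [0.1, 0.9, 0.1],
    "highway_rain_0007":         [0.2, 0.2, 0.9],
}

def search(query_embedding, index, top_k=2):
    """Return the top_k scene ids ranked by cosine similarity."""
    ranked = sorted(index.items(),
                    key=lambda kv: cosine(query_embedding, kv[1]),
                    reverse=True)
    return [scene_id for scene_id, _ in ranked[:top_k]]

# A query like "tractors covered in snow" would be embedded by the
# same encoder; here we fake its vector to land near that scene.
results = search([0.85, 0.15, 0.05], scene_index)
print(results)  # the snowy-tractor scene ranks first
```

In a production system the index would hold millions of precomputed embeddings behind an approximate nearest-neighbor structure, which is what makes "thousands of results in seconds" feasible at this scale.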


△ The Mobileye dataset contains millions of images, video clips, and scenes

With some of the highest-quality data and professional talent in the industry, Mobileye's driving policy ensures that reasonable, informed decisions are made, an approach that eliminates the uncertainty of AI decisions and statistically achieves an ultra-high mean time between failures. The dataset also speeds up the development process, bringing the life-saving promise of self-driving technology closer to reality.
