A winning hand right out of the gate? A photorealistic generative world model, backed by Pixar's co-founder

World models are booming!

Recently, research on world models has been springing up like mushrooms after rain. We have covered the Whale world model from Zhou Zhihua's team at Nanjing University, the world model research of Yann LeCun's team, the spatial intelligence research of Fei-Fei Li's World Labs, Google's powerful world model Genie 2, and the recently open-sourced generative physics engine Genesis.

Around the time Genesis was open-sourced, a startup called Odyssey also introduced its own world model, Explorer. At the same time, the company announced that Ed Catmull, co-founder of Pixar Animation Studios and a Turing Award winner, has joined its board of directors and invested in the company. Prior to that, on November 13, the company announced that it had closed an $18 million Series A funding round.

Like Genie 2, Explorer can reportedly generate high-quality 3D worlds from a single image. Judging by the demos the company has released, the quality and detail of the generated worlds are remarkable.

According to Odyssey's blog, Explorer serves the company's chosen focus: "We [with Pixar] share the belief that technology must serve the story and the storyteller. This is especially true in this age of AI." In short: story is king. The company hopes to bring "the next big technological breakthrough to film, games, and beyond: generative world models."

Like other generative models with impressive demos, Explorer has attracted a lot of praise.

Explorer: A generative world model

"The best stories take us into new worlds." For masterpieces such as Toy Story, Inside Out, Star Wars, Dune, Avatar, The Lord of the Rings, Jurassic Park, Red Dead Redemption, and The Last of Us, artists spent tens of thousands of hours in 3D authoring tools hand-crafting richly detailed worlds. These worlds are filled with unique characters, landscapes, and music. This time-consuming process is both a major enabler and a bottleneck for movies, games, and more.

Explorer makes this process easier. With just one image, you get a 3D world that is very realistic and rich in detail.

While Explorer is still in its early stages, it is already expected to dramatically speed up the creation of worlds compatible with movies and games, as well as enable entirely new applications or forms of entertainment.

Odyssey shows many examples on its blog; here are a few excerpts.

Prompt: An underground workshop with a muscle car covered in a white cloth

Prompt: An office interior from the 2000s

Prompt: A Japanese garden, with rich, green foliage

Odyssey claims several strengths for Explorer. First, it can generate photorealistic worlds, which is one of the model's core strengths.

Prompt: A street in London. Brick wall

Explorer can also generate worlds that move. Although the research is still at an early stage, the company says that generative world motion (all in 3D) has an exciting future: it would let artists generate and manipulate motion in new, more realistic ways, and it offers fine-grained control that generative video models find difficult to replicate.

Prompt: A family in the kitchen. Snowing

Prompt: A serene coral reef

Explorer can also generate Gaussian splats. Over the past 18 months, many of the world's top computer graphics and vision researchers have been focusing on Gaussian splatting. It is easy to see why: splats can reconstruct scenes with incredibly realistic detail that is almost indistinguishable from reality. Quite a few people believe that splats could become a dominant form of 3D representation. Explorer also uses splats as its representation of the world.
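
To make the idea of a Gaussian splat concrete, here is a minimal Python sketch of the parameterization commonly used in the Gaussian splatting literature: each splat has a center, per-axis scales, a rotation, an opacity, and a color, and its covariance is built as R · diag(s²) · Rᵀ. This is an illustration under those assumptions, not Odyssey's implementation; the names `Splat`, `covariance`, and `falloff` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Splat:
    """One 3D Gaussian (hypothetical container, for illustration only)."""
    mean: tuple      # (x, y, z) center of the Gaussian
    scale: tuple     # (sx, sy, sz) standard deviations along its local axes
    rotation: tuple  # quaternion (w, x, y, z) orienting the local axes
    opacity: float   # peak opacity in [0, 1]
    color: tuple     # (r, g, b)


def quat_to_matrix(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n  # normalize defensively
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance(splat):
    """Covariance Sigma = R * diag(scale^2) * R^T as a 3x3 nested list."""
    R = quat_to_matrix(splat.rotation)
    s2 = [s * s for s in splat.scale]
    return [
        [sum(R[i][k] * s2[k] * R[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def falloff(splat, point):
    """Opacity contribution at `point`: opacity * exp(-0.5 * d^T Sigma^-1 d)."""
    R = quat_to_matrix(splat.rotation)
    d = [p - m for p, m in zip(point, splat.mean)]
    # Express the offset in the splat's local frame (R^T d), then
    # divide by the per-axis scale to get the Mahalanobis distance.
    local = [sum(R[i][k] * d[i] for i in range(3)) for k in range(3)]
    m2 = sum((l / s) ** 2 for l, s in zip(local, splat.scale))
    return splat.opacity * math.exp(-0.5 * m2)
```

A scene is then just a large collection of such splats. A renderer projects them to the screen and blends their contributions, which is why editing tools can load and manipulate them much like other 3D primitives.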

Explorer-generated worlds can also be further edited by humans.

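Creative tool providers have noticed the momentum behind Gaussian splatting and have added early support for visualizing and manipulating splats in tools such as Unreal, Houdini, Blender, Maya, 3D Studio Max, and After Effects.

This means that you can use these tools to load and even edit worlds generated by Explorer.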

A world edited in Blender

A world edited in Unreal

The company says it has already tried Explorer in production: "To test if Explorer is ready for production use, we recently partnered with Garden Studios in London. We brought worlds generated by Explorer to their state-of-the-art virtual production stage (for recording feature films, TV, commercials, music videos, etc.) and were pleased to confirm that the worlds we generated could be used in today's actual production pipelines. They look fantastic."

Of course, Explorer is still at an early stage and far from perfect. The company points out several areas that need improvement:

  • Explorer doesn't yet support real-time world generation; generating a world currently takes about 10 minutes on average.
  • Resolution and world completeness leave much to be desired. The company hopes to scale generation seamlessly in the future to fill in any gaps and create a full spherical world.
  • Explorer's controllability needs to improve further, through video-to-world and world-to-world inputs, with the goal of taking real-world Gaussian splats as input and augmenting them with prompts or other guidance.

Currently, Explorer is not publicly available, but interested readers can apply for access themselves:

https://odyssey.systems/introducing-explorer

Odyssey: An AI company that wants to use technology to tell a story

Odyssey is clearly a startup that has established its direction early on.

The company was founded by Oliver Cameron, CEO, and Jeff Hawke, CTO.

X avatars of the two founders

Oliver Cameron worked on autonomous vehicles at Cruise and Voyage, while Jeff Hawke led the development of deep learning models for autonomous driving at Wayve. The team has recruited researchers from Cruise, Waymo, Wayve, Tesla, Microsoft, Meta, and NVIDIA, engineers who worked on video games such as Spore, SimCity, The Sims, Alien: Isolation, and the Tom Clancy franchise, and technical artists who worked on films such as Dune: Part Two, Godzilla, The Creator, Avengers: Age of Ultron, Alita: Battle Angel, and Jurassic World: Fallen Kingdom. In addition, several members of the team have won BAFTA awards.

As you can see, the company has a very strong background in autonomous driving R&D, which it also mentioned in its November blog post: "In fact, more than 90% of our technical staff have spent most of their careers working on autonomous vehicles at companies like Cruise, Wayve, Waymo, and Tesla. This experience has given us a unique insight into the problem of building world models." Only this time, instead of navigating the 3D world, they want to build a model that generates it.

To do this, the first problem they set out to solve was collecting real-world data. Cars can do some of this, but there are also places that cars can't go, such as forests, caves, trails, beaches, glaciers, and parks. Eventually, they came up with a solution: have humans collect the data.

Yes, you read that right! Specifically, the data is collected with a lightweight backpack computer connected to a very high-resolution multimodal sensor rig. The device weighs 25 pounds (about 11.3 kg), has a long battery life, and carries 6 cameras, 2 lidars, and an IMU. Combined, these sensors capture the world in 360 degrees at 13.5K resolution and in rich detail, and each panoramic capture contains physically accurate depth information. What's more, because a human has precise control over the sensors, they can make sure to capture every angle the generative model might need.

Now we know where the photorealism of Explorer comes from.

Odyssey announced the completion of a seed round led by Google Ventures on July 12 of this year. On November 13, it announced the closing of an $18 million Series A round led by EQT Ventures. The company's website also lists some of its investors, including Jeff Dean as well as researchers from AI companies such as OpenAI, DeepMind, and Midjourney.

What do you think of Explorer's generative world model? Are you looking forward to a movie or game made with an Explorer-generated world?

Reference Links:

https://x.com/odysseyml/status/1869417873938219360

https://odyssey.systems/learning-from-our-world

Link to video in the article: https://mp.weixin.qq.com/s/3whlcE6wMkJNBXWAg4PZ1w?token=1202965932&lang=zh_CN
