
GTC24 | Generate models on the fly: NVIDIA generative AI research enables 3D shapes to be generated in less than 1 second

Author: NVIDIA China

LATTE3D models quickly convert text prompts into high-quality 3D shapes that help populate virtual worlds.


NVIDIA researchers have unveiled LATTE3D, their latest text-to-3D generative AI model.

LATTE3D is like a virtual 3D printer that converts text prompts into 3D representations of objects and animals in less than 1 second.

The model outputs shapes in a format commonly used by standard rendering applications, so they can be dropped straight into virtual environments for video game development, marketing, design projects, or building virtual training grounds for robots.

The NVIDIA Toronto AI Lab, led by Sanja Fidler, vice president of AI research at NVIDIA, developed LATTE3D. "A year ago, it took an AI model one hour to generate 3D visuals of this quality, and the latest technology brought that down to 10 to 12 seconds," says Sanja Fidler. "Now, we're delivering results an order of magnitude faster, enabling near-real-time text-to-3D generation for creators across industries."

This advancement means that when running inference on a single GPU, such as the NVIDIA RTX GPU used in NVIDIA Research's demo, LATTE3D can generate 3D shapes almost instantaneously.

Shortening the cycle from idea to build to iteration

Creators don't need to start from scratch or dig through a library of 3D assets; with LATTE3D, they can turn an idea into a concrete object the moment it comes to mind.

The model generates a number of different 3D shapes for the creator to choose from based on each text prompt. Selected shapes will be optimized in minutes to improve quality. Users can then export shapes to a graphics software application or platform, such as NVIDIA Omniverse, which enables the development of 3D workflows and applications based on Universal Scene Description (OpenUSD).
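The export step above relies on the shapes being in a standard mesh format that graphics applications can open. As a minimal illustration (the mesh data here is a hand-made placeholder, not actual LATTE3D output), the sketch below serializes a triangle mesh to Wavefront OBJ, one of the common interchange formats that rendering tools and 3D pipelines accept:

```python
def write_obj(path, vertices, faces):
    """Serialize a triangle mesh to Wavefront OBJ, a plain-text
    format most 3D applications can import."""
    with open(path, "w") as f:
        for x, y, z in vertices:
            f.write(f"v {x} {y} {z}\n")
        for tri in faces:
            # OBJ face indices are 1-based, so shift from 0-based lists
            f.write("f " + " ".join(str(i + 1) for i in tri) + "\n")

# Placeholder mesh: a single triangle standing in for a generated shape
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(0, 1, 2)]
write_obj("shape.obj", vertices, faces)
```

A real workflow would export the refined shape from the generator and then import the file into a tool such as NVIDIA Omniverse for further OpenUSD-based work.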


Researchers trained LATTE3D on two specific datasets, animals and everyday objects, and developers can use the same model architecture to train the AI on other types of data.

If trained on a 3D plant dataset, LATTE3D could help landscape architects quickly populate garden renderings with trees, flowers, and succulents during client discussions. If trained on a dataset of household items, the model would generate objects suitable for a 3D home simulation environment. Developers could also use those objects to train a personal assistant robot before testing and deploying it in the real world.

LATTE3D was trained on NVIDIA Tensor Core GPUs. In addition to 3D shapes, the training data included a variety of text prompts generated by ChatGPT, so the model better handles the many phrasings a user might use to describe a particular 3D object; for example, it learns that prompts mentioning various canine breeds should all produce dog-like shapes.
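The prompt-diversification idea described above can be sketched in miniature. This toy augmenter (the template and synonym list are illustrative inventions, not NVIDIA's actual training data) expands one seed prompt into several paraphrases, the same pattern a ChatGPT-assisted pipeline would apply at scale so that many phrasings map to the same shape category:

```python
# Hypothetical synonym list standing in for ChatGPT-generated variations
CANINE_SYNONYMS = ["dog", "puppy", "golden retriever", "terrier", "hound"]

def augment_prompt(template: str, synonyms: list[str]) -> list[str]:
    """Fill a prompt template with each synonym to diversify
    the text side of a text-to-3D training pair."""
    return [template.format(animal=s) for s in synonyms]

prompts = augment_prompt("a 3D model of a {animal} wearing a hat",
                         CANINE_SYNONYMS)
```

Training on all five variants, paired with the same dog-shaped target, nudges the model to treat them as one concept.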


NVIDIA Research is made up of hundreds of scientists and engineers around the world and focuses on research in areas such as AI, computer graphics, computer vision, self-driving cars, and robotics.

At NVIDIA GTC 2024, the researchers presented their research findings that are pushing the frontier of diffusion model training technology. For the latest news on NVIDIA AI, visit the NVIDIA Technology Blog, or check out the full list of GTC NVIDIA Research sessions.

NVIDIA founder and CEO Jensen Huang's GTC 2024 keynote is now available with Chinese subtitles. Watch it to hear him share the AI breakthroughs shaping the future!
