
Tencent released the DynamiCrafter model, a new AI video generation tool

Author: The mountain monster Atu

In the field of AI video generation, before Sora arrived there were only three notable products: the open-source SVD, plus Runway and Pika. Tencent AI Lab has been working on AI video for some time, and recently released its new DynamiCrafter video generation model together with the Chinese University of Hong Kong. Judging from the demo videos compared against Pika and SVD, the results are quite good.


DynamiCrafter is a video generation model that animates open-domain images using video diffusion priors. Given an arbitrary static image and a text prompt, it leverages a pre-trained video diffusion model to generate plausible video content. It currently supports three output resolutions: 256×256, 320×512, and 576×1024. Let's take a look at the demo videos.

256×256


320×512


576×1024

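To get a feel for what these three resolutions mean inside a latent diffusion model, here is a small sketch computing the latent tensor shapes. The 16-frame length follows from the official note of 2-second clips at 8 FPS; the 8× VAE downsampling factor and 4 latent channels are assumptions typical of Stable-Diffusion-style autoencoders, not confirmed internals of DynamiCrafter.

```python
# Sketch: latent-space shapes for DynamiCrafter's three output resolutions.
# ASSUMPTIONS: an SD-style VAE with 8x spatial downsampling and 4 latent
# channels; 16 frames comes from the stated 2 s at 8 FPS.

RESOLUTIONS = [(256, 256), (320, 512), (576, 1024)]  # (height, width)
FRAMES = 2 * 8          # 2-second clip at 8 FPS -> 16 frames
VAE_FACTOR = 8          # assumed spatial downsampling of the autoencoder
LATENT_CHANNELS = 4     # assumed latent channel count

def latent_shape(height, width):
    """Return the (frames, channels, h, w) shape of the video latent."""
    return (FRAMES, LATENT_CHANNELS, height // VAE_FACTOR, width // VAE_FACTOR)

for h, w in RESOLUTIONS:
    print(f"{h}x{w} -> latent {latent_shape(h, w)}")
```

The higher resolutions multiply the latent area (and thus memory and compute) substantially, which is one reason short clip lengths are the norm for such models.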

DynamiCrafter's rough workflow is as follows: the input image is first projected into a rich, text-aligned context representation space by a learnable query transformer, which lets the video model understand the image content in a compatible way. The full image is then concatenated with the initial noise and fed into the diffusion model, whose motion prior drives the generation of a dynamic video sequence.

[Figure: DynamiCrafter flowchart]

Flowchart of DynamiCrafter. During training, video frames are randomly selected as the image condition for the denoising process; the proposed dual-stream image injection mechanism lets the model both inherit visual details and digest the input image in a context-aware manner. At inference time, the model generates an animated clip from noise, conditioned on the input static image.
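The dual-stream idea described above can be sketched in a few lines of numpy. Stream 1 stands in for the query-transformer context tokens (consumed via cross-attention in the real model); stream 2 is the channel-wise concatenation of the image latent with the initial noise. All shapes, token counts, and the random stand-in data here are illustrative assumptions, not the model's actual dimensions or weights.

```python
# Minimal numpy sketch of dual-stream image injection (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the VAE-encoded input image and the initial diffusion noise.
image_latent = rng.standard_normal((4, 32, 32))   # (channels, h, w) - assumed
noise = rng.standard_normal((4, 32, 32))

# Stream 1: text-aligned context tokens produced by a query transformer.
# (Hypothetical sizes; fed to the denoiser through cross-attention.)
num_queries, dim = 16, 1024
context_tokens = rng.standard_normal((num_queries, dim))

# Stream 2: concatenate image latent with noise along the channel axis,
# giving the denoiser direct access to the image's visual details.
denoiser_input = np.concatenate([noise, image_latent], axis=0)

print(denoiser_input.shape)   # doubled channel count, same spatial size
print(context_tokens.shape)
```

The two streams are complementary: concatenation preserves pixel-level detail, while the context tokens carry semantic content the motion prior can reason about.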

Here are several example applications of the DynamiCrafter model:

1 Storytelling video generation


2 Frame interpolation


3 Loop video generation


The DynamiCrafter model already has ComfyUI support; the nodes can be found and downloaded through the ComfyUI Manager. The team also provides a web trial page on Hugging Face.

Official address: https://github.com/doubiiu/dynamicrafter

Trial address: https://huggingface.co/spaces/Doubiiu/DynamiCrafter

Many bloggers have covered the model, and to grab attention their headlines call it some kind of earth-shattering game-changer. Although the official demos look great (everyone knows how padded demos can be), DynamiCrafter is in fact not a mature AI video generation model, but a newly formed tool that is still far from polished. The team itself lists the following shortcomings:

The generated videos are relatively short (2 seconds at 8 FPS);

The model is unable to render clear text;

In general, faces and people may not be generated correctly;

The autoencoding part of the model is lossy, resulting in slight flickering artifacts.

Of course, progress in domestic AI deserves encouragement, and having more AI video tools on the market is always a good thing.
