【Global Network Technology Comprehensive Report】September 6: Alibaba Cloud recently launched I2VGen-XL, a new large model for video generation, and opened it for trial in the ModelScope community. After uploading a picture, a user can generate a 1280*720 high-resolution video in about 2 minutes. The model's R&D lead said the team plans to reach 2K ultra-high-definition output in the future, and that the model can be applied to scenarios such as short-video content production and film production.
I2VGen-XL is open for trial in the ModelScope community
Compared with the large models for AI painting that are now popular in the industry, large models for video generation face a higher technical threshold. They must overcome challenges such as matching the video content to the text, video picture quality, and continuity between frames. Technology companies such as Alibaba Cloud and Microsoft have previously published a series of research results on controllable video generation, in which users generate video by specifying conditions such as spatial layout and motion pattern, but the picture clarity of those results falls short of what real-world applications need.
To address this problem, Alibaba Cloud proposed a new approach. The I2VGen-XL model is designed in two stages. The first stage works at low resolution and ensures that the generated result matches the semantics of the given image. The second stage uses a video diffusion model (VLDM) to raise the video resolution while improving consistency in time and space, which keeps the final video sharp and coherent. The result is a breakthrough to 1280*720 high resolution, with picture detail well ahead of existing models. According to reports, the model was also trained on video data in a variety of styles, so it can generate many types of video, including high-tech looks, cinematic color grading, cartoon styles, and sketches.
I2VGen-XL flowchart
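The two-stage idea described above can be illustrated with a toy sketch. This is not the I2VGen-XL code: the function names (`base_stage`, `refine_stage`, `generate`), the 448*256 default low resolution, and the frame count are all hypothetical, and the diffusion models are replaced by trivial stand-ins (nearest-neighbour resizing and neighbour-frame averaging). It only shows the shape of the pipeline: first produce low-resolution frames that stay close to the input image, then raise the resolution while smoothing across time.

```python
from typing import List, Tuple

Frame = List[List[float]]   # one grayscale frame: rows of pixel values
Video = List[Frame]         # a video is a list of frames


def resize_nearest(frame: Frame, new_w: int, new_h: int) -> Frame:
    """Nearest-neighbour resize; works for both down- and upscaling."""
    old_h, old_w = len(frame), len(frame[0])
    return [
        [frame[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


def base_stage(image: Frame, num_frames: int, low_size: Tuple[int, int]) -> Video:
    """Stage 1 stand-in: low-resolution frames that follow the input image.

    A real model would synthesize motion here; this toy version only adds
    a small brightness drift per frame.
    """
    low_w, low_h = low_size
    small = resize_nearest(image, low_w, low_h)
    return [[[px + 0.01 * t for px in row] for row in small] for t in range(num_frames)]


def refine_stage(video: Video, out_size: Tuple[int, int]) -> Video:
    """Stage 2 stand-in: upscale, then average each frame with its neighbours.

    The averaging is a crude placeholder for the temporal-consistency
    improvement that the article attributes to the second stage.
    """
    out_w, out_h = out_size
    up = [resize_nearest(f, out_w, out_h) for f in video]
    smoothed: Video = []
    for t in range(len(up)):
        window = up[max(0, t - 1): t + 2]   # previous, current, next frame
        smoothed.append([
            [sum(f[y][x] for f in window) / len(window) for x in range(out_w)]
            for y in range(out_h)
        ])
    return smoothed


def generate(image: Frame, num_frames: int = 8,
             low_size: Tuple[int, int] = (448, 256),     # assumed, not from the article
             out_size: Tuple[int, int] = (1280, 720)) -> Video:
    """Run the two stages in sequence: image -> low-res video -> high-res video."""
    return refine_stage(base_stage(image, num_frames, low_size), out_size)
```

Splitting the work this way means the first stage only has to get the content right at a small size, and the second stage only has to add resolution and coherence.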
The model and code of I2VGen-XL have now been open-sourced. Social media posts in China and abroad show that the model has attracted many users and developers who have tried it and built on it, and a large amount of creative AI-generated video has appeared, such as a dinosaur spreading its wings on a castle and sci-fi movie scenes of an astronaut walking through a spaceship. Ahsen Khaliq, a well-known AI social media analyst, posted several videos generated by the model on social media and said that the model has advantages in clarity, texture, semantics, and temporal continuity.
Netizens and developers in China and abroad have widely followed and tried the model
It is understood that in the field of visual generation, Alibaba Cloud has previously launched Tongyi Wanxiang, a large model for AI painting (built on the base model Composer), and VideoComposer, a controllable video generation model. The team has published more than 60 CCF-A papers in this field and has won more than 10 championships in top international vision competitions.
Source: Global Network