A Conversation with Zhu Wei of Wondershare Technology: Sora's success is "brute force working miracles", and large models will not go overseas directly | Titanium Media Exclusive

author:Titanium Media APP

Over the course of 2023, "large model" became a high-frequency term in the technology world and one of the most discussed topics across all industries. According to incomplete statistics, more than 238 large-model products were publicly released in China in 2023.

However, technology takes time to evolve and take off. Most of the large-model products released so far are built around natural language dialogue, providing services such as chat, creative writing, and code generation. By contrast, in the video field, where demand is stronger, large models are still far less mature.

Powered by large models, text-to-text and text-to-image generation have improved efficiency in work and study scenarios to some extent. But video is the medium of human communication with the highest information density, so text-to-video is what the whole industry is pursuing most urgently, which is one of the reasons Sora made such an impact at the beginning of the year.

Zhu Wei, Vice President of Wondershare Technology. Image source: Wondershare Technology

"At present, large models have been commercialized for text and images, but audio and video applications still face challenges: datasets, the complex structure and hierarchy of video content, and high computing costs. Mature applications will still take time," Zhu Wei, vice president of Wondershare Technology, told Titanium Media App.

Since its founding in 2003, Wondershare Technology has focused on research, development, and investment in video, and it has accelerated its work on large models since last year. In January this year, the Wondershare "Tianmu" audio and video multimedia large model was officially released.

A few days ago, Titanium Media App learned exclusively that Wondershare "Tianmu" will officially begin testing on April 28, and that its text-to-video capability supports one-click generation of videos 60 seconds or longer. In a brief exchange, Zhu Wei also stressed more than once that 2024 will be the year of AI video.

Sora's success comes from "brute force working miracles"

According to Gartner, 90% of digital content will be AI-generated by 2030, and the global AIGC market is expected to grow from $10.8 billion in 2022 to $118.1 billion by 2032. Meanwhile, an earlier Cisco report projected that video will account for 82% of consumer Internet traffic.

From a text perspective, large models are indeed quite mature, but from a video perspective they are far from it. There are currently 305 million video creators in the world, an audience of 4.3 billion video viewers, and more than 20 billion video plays per day. The arrival of the "video is king" era has also created demand for vertical multimedia models and applications.

According to research by a16z, a well-known Silicon Valley venture capital firm, there were no public video models on the market before 2023, but dozens emerged during 2023, attracting more than one million users worldwide. At present, 21 AI video models are in use and have made some headway in the market.

Zhu Wei said that, whether for text, image, or video, there are not many truly original large models in China; one could even say there are very few. "Video large models in particular, including the Wondershare 'Tianmu' we are working on, have not yet reached L0, the most basic foundation-model level."

One thing is certain: video applications are expected to see explosive growth this year, meaning AI video models will be applied more widely and more quickly. Sora's emergence at the beginning of the year excited the entire industry. Zhu Wei admitted that, at the foundation-model level, the gap between China and Sora is still fairly large, and that more resources need to be invested to iterate quickly on the technology. "Sora is the industry benchmark, and it is what we are trying to keep up with."

For video large models, algorithms, computing power, and data are the three major difficulties. Of these, thanks to open source, everyone's algorithmic framework is now much the same. "We have studied Sora, and there is no disruptive innovation in its overall technical framework; it is still the Transformer architecture," Zhu Wei pointed out. "The reason it works so well is mainly brute force working miracles through computing power and data."

He said that Sora had at least 5 million hours of video data for training, and that achieving monthly or quarterly iterations requires a training cluster of at least 10,000 GPU cards. It is understood that from the end of last year to this year, Wondershare Technology has invested nearly 100 million yuan in computing power alone.

However, while acknowledging the gap with Sora, Zhu Wei did not appear overly anxious. "That is a foundation model. Wondershare is actually application-oriented, and we won't try to catch up with it on the foundation model, because that is very expensive and laborious, and the end result won't bring a correspondingly large return. In short, the input-output ratio is not worthwhile."

Large models will not go overseas directly, and China is not yet a very good paid market

As mentioned above, video-generation large models can be divided into two categories, or two levels. The first level is building a foundation model, such as Sora. The second is building a vertical model, which starts from a model trained on some basic data and is then fine-tuned on top of it.

For vertical models, Zhu Wei believes that standing out in the large-model era still comes down to "application is king": achieving explosive growth quickly through a hit application.

It should be noted that the technology, both in China and abroad, is not yet very mature, and video models have not reached full maturity. Even though the videos Sora generates are already impressive, there is still a gap between them and the finished videos users can publish on social platforms. User videos tell a story and often include intros and outros, text, transitions, and so on, so they contain many more elements than a Sora clip.

Overall, video generation faces three major challenges. The first is a shortage of datasets: video content is expensive to store and annotate, and video training datasets are scarce. The second is the cost of computing power, which is much higher than for images, text, and other content. The third is generation quality: there is still no model with good enough results and usability to serve as a benchmark.

"We hope that each of Wondershare's products is a combination of 'technology + application' that solves specific problems in a particular segment and lets users truly gain value. All the multimodal elements need to be integrated well, so that users end up with a high-quality multimedia video when they edit. That is what Wondershare wants to do."

Screenshot from "Boy's Adventure", a text-to-video clip generated by Wondershare "Tianmu".

Reaching more markets and users through applications built on large-model capabilities is something Zhu Wei mentioned repeatedly. According to previously disclosed results, overseas revenue accounted for 90.23% of Wondershare Technology's revenue in the first half of 2023, and its customers are currently spread across more than 200 countries and regions. On the topic of large models going overseas, Zhu Wei believes that no company will really go overseas with a large model directly, and that Wondershare most likely will not either.

In his view, "large models going overseas" means products with large-model capabilities going overseas to solve overseas users' problems, rather than building a good video model and then taking that model overseas directly.

As for the domestic application market, Zhu Wei also discussed some current problems, such as promotion to consumers. "I agree that China is the largest app market, but it is not yet a very good paid market. In China, we started serving business customers this year with the help of large-model capabilities, but not consumers, because we feel that charging users directly for a large model as a tool is currently hard to make work in China." (This article was first published on Titanium Media APP. Author: Du Zhiqiang; editor: Zhong Yi)
