Original article from Heart of the Machine
Author: Du Wei
"Classic works embody the wisdom and spirit of an earlier generation of artists, and restoring them is worth the investment," said Ren Lifeng, president of Watermelon Video, speaking about the 4K restoration of classic videos.
"Gourd babies, gourd babies, seven flowers on one vine, unafraid of wind and rain, la la la la..."
When this familiar melody plays, many people born in the 1980s and 1990s think of "Hulu Brothers" as they saw it on a black-and-white TV, or in a more vivid color version.

Today, in both sharpness and color, these somewhat dated videos no longer meet the expectations of contemporary viewers. Meanwhile, with steady progress in AI techniques such as image super-resolution, frame rate enhancement, and colorization, more and more individual users and video websites want to restore old videos to 1080p, 2K, or 4K and revisit the past with a clearer view.
At the same time, the spread of 4K devices and 4K content has created strong demand for 4K restorations of classic films, which can preserve the original texture while delivering a sharper picture, fuller color, and better sound.
However, the classics number more than ten million videos, and 4K restoration is an enormous undertaking. How can viewers' demand for classic films in 4K be met?
Watermelon Video, working with Volcano Engine, has given its answer.
On October 20, Watermelon Video and Volcano Engine held a classic video 4K restoration conference with the theme "Repairing the Old Good", where they announced that more than 100 classic Chinese videos would be restored in 4K within one year by technical means. The Volcano Engine team provides the technical support, and users can watch the restored videos on Watermelon Video for free. Watermelon Video will also open a portal that gives ordinary users free AI restoration, and will provide in-depth, public-interest restoration for videos of particular value.
Ren Lifeng, president of Watermelon Video, said, "Restoring classics is a form of inheritance, and it also uses new technology to restore works as faithfully as possible, bringing everyone new feelings and new understanding. Whether we are restoring cartoons or old footage, in the end it is not just about improving clarity. What we want to restore is the memories behind this content, and to show the resonance and sparks that these memories have kindled across generations."
Ren Lifeng. Image source: Watermelon Video
The first batch of partners, including CCTV Animation and Shanghai Fine Arts Film Studio, was also announced at the conference. Some of the 100 titles planned for restoration are as follows:
Nezha Legend (2003), Go Boy (2005), Go Boy (2), Big Head Son and Little Head Daddy (1995), Black Cat Sheriff 1-5 episodes, I Am A Song Maniac (2001), Three Monks, The Adventures of Little Carp (2007), Hulu Brothers (1986), Hulu Little Kong 1-6 episodes, Journey to the West, Little Tadpole Finds Mom, Little Carp Jumping Dragon Gate, Big Ears Tutu (Season 1), Shuk and Beta Episodes 1-13, etc.
So far, six cartoons have completed 4K restoration and can be watched on Watermelon Video: The Adventures of Little Carp (2007), Little Tadpole Finds Mom, Nezha Legend (2003), Black Cat Sheriff Episodes 1-5, Big Head Son Little Head Daddy, and Hulu Brothers.
Let's start by watching the 4K restored clip from The Adventures of Little Carp:
Giving old film 4K resolution is difficult, but well worth it
Before explaining the difficulties of 4K film restoration, consider a few numbers. Old-fashioned SDTV has a resolution of only 720×480, which means 345,600 pixels per frame. HDTV has a resolution of 1920×1080, or 2,073,600 pixels, 6 times that of standard definition. And 4K, the resolution standard for the new generation of Hollywood blockbusters, is 4096×2160, or 8,847,360 pixels.
Several common standard video resolutions. Image source: Wikipedia
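The pixel counts quoted above are simple width-times-height products. The short Python sketch below recomputes them and the ratio to standard definition; the dictionary labels are illustrative names only.

```python
# Recompute the pixel counts quoted in the text (width x height).
RESOLUTIONS = {
    "SD": (720, 480),
    "HD": (1920, 1080),
    "4K": (4096, 2160),
}


def pixel_count(width, height):
    """Number of pixels in one frame."""
    return width * height


counts = {name: pixel_count(w, h) for name, (w, h) in RESOLUTIONS.items()}

for name, n in counts.items():
    ratio = n / counts["SD"]
    print(f"{name}: {n:,} pixels ({ratio:.1f}x SD)")
```

By the same arithmetic, a 4096×2160 frame holds 25.6 times as many pixels as a 720×480 frame.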
Technically, 4K restoration requires scanning the film into a digital image sequence at a resolution of 4096×3112 and then restoring the images through a 4K digital intermediate workflow. For example, the 4K restored version of "The Pianist at Sea", released in mainland China in November 2019, was restored from 35 mm film at 4096×3112. It is easy to describe but hard to do. Generally speaking, film restoration has three major steps: physical restoration, digital restoration, and artistic restoration.
First, many old films inevitably suffer damage such as mildew, contamination, fading, flicker, noise, discoloration, and missing frames, and often have surface problems such as dust and dirt. This is the first difficulty restoration faces: the old film is physically repaired through cleaning, patching, and similar work so that it is ready for the subsequent digital restoration.
Second, a film scanner digitizes the physically restored film at 2K or 4K. At this stage, professional software automatically repairs dirt, mold, scratches, and other problems in the film, color restoration follows, and high-resolution shots are output. In terms of steps, 2K restoration is no different from 4K restoration, except that 4K requires more labor and money. Reported figures show that an ordinary 2K restoration (resolution 2048×1556) done entirely by hand takes about two weeks per film and costs about 300,000 yuan. A 4K restoration (resolution 4096×3112) is about four times as laborious as a 2K one and takes two to three months, or even half a year, per film.
Finally, the artistic treatment of the restored film cannot be ignored. Some professionals who restore old films say that the hardest part is not the technical restoration but the artistic one. The artistic judgment of the restoration staff is crucial to "restoring the feeling of the old film": the restored film must not only look bright and fresh on the surface but also keep its authentic flavor.
Despite the technical and other challenges, 4K restoration is necessary for classic old films of great cultural, artistic, and historical value. At the end of 2006, the China Film Archive led the launch of the "Film Archive Film Digital Repair Project"; the Shanghai International Film Festival launched a domestic film restoration plan in 2011; and many video websites, such as iQiyi, have drawn on their own strengths to invest in restoring old film and television works.
Now the "Classic Video 4K Restoration Plan" launched by Watermelon Video and Volcano Engine adds a new force to the ranks of old-film restoration and contributes to passing classic videos on.
What is distinctive about the AI algorithms Volcano Engine uses?
As ByteDance's enterprise-level technical service platform, Volcano Engine opens up the growth methods, technical tools, and capabilities accumulated during ByteDance's rapid development to external enterprises, providing products and services such as cloud, AI, and big data technology to help enterprises achieve sustainable growth during digital upgrading. Within the multimedia middle platform of Volcano Engine's technology middle platform, the intelligent processing product draws on years of practical experience in multimedia processing and distills intelligent processing and enhancement technology for the whole video pipeline. Its main capability modules include image quality enhancement, video DNA, and others.
A major sub-function of intelligent processing is old-film restoration. In the 4K restoration process, Volcano Engine addresses the low definition, poor fluency, color distortion, and flaws of old films from four angles: clarity, fluency, color, and defects. The key is a set of AI algorithms, including the following:
Intelligent super-resolution
Intelligent frame interpolation
Color enhancement
Artifact removal (video noise reduction and scratch repair)
Aliasing (jagged edge) repair
Powered by these AI algorithms, Volcano Engine provides a top-tier picture quality restoration solution for 4K viewing scenarios (such as Watermelon Video's theater mode). It raises the resolution, frame rate, and color gamut of the source video and performs high-quality intelligent transcoding to achieve the best possible playback quality.
The complete intelligent processing pipeline of Volcano Engine.
As a very important video processing technology, super-resolution increases the resolution of the original image through hardware or software methods, with the purpose of reconstructing a high-resolution image based on a series of low-resolution images. Super-resolution algorithms based on deep learning have been a hot topic in recent years, and mainstream methods are generally divided into single-frame super-resolution and multi-frame super-resolution.
Single-frame super-resolution takes one image as input and outputs its high-resolution version; typical structures include predefined upsampling and single upsampling. Multi-frame super-resolution considers the relationship between preceding and following video frames and reconstructs more detail. These methods still have bottlenecks, however: when the upsampling factor is high, such as 16×, many algorithms cannot reconstruct the corresponding high-definition image well.
Volcano Engine's intelligent super-resolution algorithm is based on deep learning and reconstructs missing details from the existing image and video information. For video tasks in particular, it uses information from preceding and following frames and models it in the time domain to recover additional detail. In old-film restoration, where poor clarity, blur, and low resolution are common, intelligent super-resolution can significantly improve clarity and resolution. Compared with other super-resolution algorithms, it has two major advantages.
First, the blur degradation typical of old-film scenes is modeled specifically to optimize clarity. As a result, a 720p source animation is super-resolved and deblurred to reach ultra-high image quality at 4K resolution.
Second, content is processed adaptively, with different regions handled separately to preserve the original painting style. Take the cartoon "Hulu Brothers", which combines ink painting and paper cutting. The restoration must keep the characters sharp while respecting the artistic effect of the ink background. This places very high demands on the technology: the machine must accurately distinguish foreground from background.
As the animated comparison below shows, after restoration (right) the ink-painting area keeps its hazy feel while the paper-cut area becomes sharper, which demonstrates the processing capability of intelligent super-resolution:
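To make the single-frame case concrete: in structures that upsample first, a fixed interpolation enlarges the image and a neural network then refines the result. The Python sketch below shows only that fixed interpolation step, using plain bilinear interpolation on a grayscale image stored as a list of rows. It is an illustrative baseline, not Volcano Engine's algorithm, and the function name `bilinear_upscale` is hypothetical.

```python
def bilinear_upscale(img, scale):
    """Upscale a 2-D grayscale image (list of rows) by an integer factor.

    This is only the fixed interpolation step; a learned model would
    normally refine its output to restore fine detail.
    """
    h, w = len(img), len(img[0])
    out_h, out_w = h * scale, w * scale
    out = []
    for y in range(out_h):
        # Map the output row to a fractional source row (corners aligned).
        sy = y * (h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(out_w):
            sx = x * (w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            # Blend horizontally on the two source rows, then vertically.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


small = [[0, 10],
         [20, 30]]
large = bilinear_upscale(small, 2)  # 2x2 -> 4x4
```

Interpolation of this kind can only smooth between existing pixels, which is one way to see why large factors such as 16× are hard: the missing detail has to be inferred rather than interpolated.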
Frame rate is one of the important factors in the viewing experience. Generally speaking, the more consecutive frames the human eye sees per unit of time, the more realistic and natural the film feels. In other words, the higher the frame rate, the smoother the video. Frame interpolation converts low frame rate video into high frame rate video.
Many frame interpolation technologies exist in China and abroad. For example, SVP (Smooth Video Project) can convert 24 fps video to 48 or 60 fps; Nvidia's Super SloMo, which uses a neural network to fill in the missing frames, can take 30 fps video to 60 fps, 240 fps, or even higher; and DAIN, the open-source interpolation algorithm from Shanghai Jiaotong University, can interpolate 30 fps video up to 480 fps.
In old-film restoration, old cartoons were limited by production costs and drawn with few frames, generally fewer than 15 per second. This makes the picture choppy and stuttering, so frame interpolation is all the more necessary.
The intelligent interpolation algorithm used by Volcano Engine therefore generates intermediate frames by analyzing the motion and content of the preceding and following frames, raising the frame rate from below 15 fps to above 60 fps and greatly improving fluency. In addition, because animation has little texture, conventional frame-doubling schemes struggle to match the corresponding motion blocks between the two frames, so Volcano Engine uses block-based optical flow optimization to obtain more accurate interpolation results.
The previous frame, the interpolated frame, and the next frame from the animation "The Legend of Nezha" are shown below:
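The idea of generating an intermediate frame from the motion between two frames can be sketched in a few lines of Python. This is a deliberately simplified illustration, assuming a single global translation found by exhaustive block matching; it is not Volcano Engine's block optical flow method, and all function names are hypothetical.

```python
def shift(frame, dx, dy, fill=0):
    """Translate a 2-D frame by (dx, dy) pixels, padding exposed areas."""
    h, w = len(frame), len(frame[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = x - dx, y - dy
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = frame[sy][sx]
    return out


def estimate_motion(prev, nxt, radius=3):
    """Search all shifts within `radius` for the one that best maps
    prev onto nxt, scored by the sum of absolute differences (SAD)."""
    best, best_cost = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            moved = shift(prev, dx, dy)
            cost = sum(abs(a - b)
                       for row_a, row_b in zip(moved, nxt)
                       for a, b in zip(row_a, row_b))
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def interpolate_midframe(prev, nxt, radius=3):
    """Build the in-between frame by moving prev half of the estimated motion
    (integer floor division, so odd shifts are rounded down)."""
    dx, dy = estimate_motion(prev, nxt, radius)
    return shift(prev, dx // 2, dy // 2)


def make_frame(x0, size=8):
    """Toy frame: a bright 2x2 square on a black background."""
    frame = [[0] * size for _ in range(size)]
    for y in (1, 2):
        for x in (x0, x0 + 1):
            frame[y][x] = 255
    return frame


prev_frame = make_frame(1)
next_frame = make_frame(3)          # the square moved 2 pixels to the right
mid_frame = interpolate_midframe(prev_frame, next_frame)
```

Here the square lands halfway between its two positions. A practical system estimates motion per block or per pixel rather than for the whole frame, which is where sparse animation textures make matching ambiguous.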
Color distortion is another big problem with old films, which is mainly caused by the following two reasons.
First, during transfer and digitization the film is affected by the transfer equipment, which introduces varying degrees of color cast, so the result deviates from the picture the creator intended.
Second, old films were usually produced for the playback conditions and production standards of their time, with a narrow color gamut and a low brightness dynamic range, so contrast is relatively poor and the picture looks dim. Today, most mid-range and high-end mobile phones support HDR playback, with screen brightness of 1200 nit or higher and a DCI-P3 wide color gamut display.
Volcano Engine's color enhancement scheme targets both causes of color distortion in old films. On the one hand, AI-based color cast detection and correction restores the creators' original intent. On the other hand, SDR-to-HDR conversion (SDRToHDR) maps the dynamic range and color gamut of the picture to a larger space (peak brightness from 100 nit up to a maximum of 10,000 nit, BT.601 to BT.2020), making full use of the user's display to obtain the best result. At present, Volcano Engine's SDRToHDR color enhancement scheme is at an advanced level in the industry.
A before-and-after comparison of "Big Head Son Little Head Dad" with SDRToHDR is shown below; the picture on the right is clearly brighter and richer in color:
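The 10,000 nit ceiling matches the range of the PQ transfer function (SMPTE ST 2084) commonly used for HDR video. The article does not say which transfer function the SDRToHDR scheme targets, so treating it as PQ is an assumption made for this sketch. The Python code below encodes absolute luminance with the published PQ constants and shows where a 100 nit SDR peak lands on that scale; `sdr_to_pq` is a hypothetical helper that performs no highlight expansion or gamut conversion.

```python
# SMPTE ST 2084 (PQ) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_encode(nits):
    """Encode absolute luminance (0..10000 nit) as a PQ signal in 0..1."""
    y = min(max(nits / 10000.0, 0.0), 1.0)
    yp = y ** M1
    return ((C1 + C2 * yp) / (1.0 + C3 * yp)) ** M2


def sdr_to_pq(code_value, sdr_peak_nits=100.0, gamma=2.4):
    """Place an SDR code value (0..1) on the PQ scale without expansion.

    Assumes a simple power-law display gamma. A real converter would also
    expand highlights and convert the colour primaries (e.g. to BT.2020).
    """
    linear = max(code_value, 0.0) ** gamma
    return pq_encode(linear * sdr_peak_nits)


for nits in (100, 1200, 10000):
    print(f"{nits:>5} nit -> PQ signal {pq_encode(nits):.3f}")
```

SDR white at 100 nit sits at roughly half of the PQ signal range, which illustrates how much headroom an HDR container leaves for a conversion algorithm to use.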
Eliminate imperfections
Because of age, improper storage, and other subjective and objective factors, old film may suffer physical and chemical damage, leaving the footage covered with snowflake-like specks, black lines, flashes, and other flaws. Video noise reduction and repair of dead spots and scratches are then required.
Video distortion arises during acquisition, editing, encoding, transcoding, transmission, display, and other stages, and noise is a common distortion introduced during signal acquisition. Noise reduction is therefore a means of enhancing video quality and improving clarity. Traditional video noise reduction algorithms can be divided into spatial-domain and temporal-domain methods, and machine-learning-based algorithms are increasingly studied, such as ViDeNN, the deep blind video denoising algorithm proposed by Delft University of Technology in the Netherlands in April 2019.
In old films, damage to the film itself generally leaves many scratches on the picture, such as vertical lines, and removing them is essential. The classic approach has two steps: detection and removal. Scratch detection mostly uses line detection to find vertical and horizontal lines in the frame, and removal then fills in each line with other pixels through spatial or temporal interpolation.
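The classic detect-then-remove approach can be sketched as follows for vertical scratches. This is a minimal illustration under simple assumptions (full-height scratches, a grayscale frame stored as a list of rows, a hand-picked threshold), not the method used in production; the function names are hypothetical.

```python
from statistics import median


def column_means(img):
    """Mean brightness of every column of a 2-D grayscale image."""
    h = len(img)
    return [sum(row[x] for row in img) / h for x in range(len(img[0]))]


def detect_vertical_scratches(img, window=2, threshold=40):
    """Flag columns whose mean differs sharply from the local median."""
    means = column_means(img)
    hits = []
    for x in range(len(means)):
        lo, hi = max(0, x - window), min(len(means), x + window + 1)
        if abs(means[x] - median(means[lo:hi])) > threshold:
            hits.append(x)
    return hits


def repair_columns(img, columns):
    """Fill flagged columns by averaging the nearest clean columns
    on the left and right (spatial interpolation)."""
    bad = set(columns)
    w = len(img[0])
    out = [row[:] for row in img]
    for x in bad:
        left = next((i for i in range(x - 1, -1, -1) if i not in bad), None)
        right = next((i for i in range(x + 1, w) if i not in bad), None)
        sources = [i for i in (left, right) if i is not None]
        if not sources:          # every column is flagged; nothing to copy
            continue
        for y, row in enumerate(img):
            out[y][x] = sum(row[i] for i in sources) / len(sources)
    return out


# Toy frame: flat grey background with one bright scratch in column 3.
frame = [[100] * 7 for _ in range(6)]
for row in frame:
    row[3] = 255

scratches = detect_vertical_scratches(frame)
repaired = repair_columns(frame, scratches)
```

Comparing each column with the local median rather than with its immediate neighbours keeps the clean columns next to a scratch from being flagged as well.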
However, compared to common video imperfections, the imperfections of old films are not only complex but also more severe, so the volcano engine combines traditional signal processing and deep learning algorithms to make targeted repairs to noise and dead spot scratches: for smaller snowflake particle noise, traditional algorithms are used for processing; for larger dead spots and scratches, machine learning algorithms are used to identify and patch.
The effect is immediate. As the before-and-after comparison of a frame from "Cuckoo Is Late" below shows, the algorithm removes the defects thoroughly while leaving the original texture intact:
Algorithms, however, are not omnipotent. In actual restoration, if an algorithm is required to handle 100% of the defects, it can easily mistake some artistic effects for flaws and thereby damage the film.
Therefore, in restoring old films Volcano Engine combines algorithmic processing with manual annotation: the algorithm resolves more than 95% of the defects, and the remaining 5% are labeled with manual assistance. The labels are fed back to the algorithm, which is then adjusted for a second round of optimization. This removes flaws more thoroughly while protecting the film's original artistic style.
For badly damaged old films, however, eliminating flaws completely still takes a great deal of labor. For Hulu Brothers, the restoration team reviewed 200,000 frames of footage while removing flaws.
When old films were digitized, sampling was often done poorly, which produces jagged edges caused by spectral aliasing and makes the picture look bad. At present, most aliasing repair in the industry targets low-resolution upsampling, whereas much of the aliasing in older films arose during downsampling. Most industry algorithms therefore cannot handle spectral aliasing that has already occurred.
The obvious jagged lines on Nezha in the left picture of the figure below are caused by spectral aliasing during downsampling. The problem appears only in some scenes and is hard to locate, but if left unsolved it seriously affects both the viewing experience and the results of the other algorithms. For this scenario, Volcano Engine designed a separate optimization algorithm that greatly reduces the aliasing.
It should also be noted that no two cartoons have exactly the same picture quality problems, so Volcano Engine adopts a "right medicine, one plan for each problem" approach. Zhao Shijie, a researcher at the Volcano Engine Multimedia Laboratory, explained that a single animation (for example, "Black Cat Sheriff") may have problems with resolution, frame rate, artifacts, color, blur, and aliasing, so super-resolution, frame interpolation, denoising, scratch removal, and HDR are applied in a targeted way to achieve the most accurate restoration.
Finally, the technologies used in 4K restoration have already been made available through Volcano Engine's intelligent processing to internal and external customers including Douyin, Today's Headlines, Watermelon Video, Pippi Shrimp APP, FigureWorm, Tiger Punch and Zhiqiu Emperor, so that more enterprises can take part in restoring old films and bring more 4K ultra-high-quality films to contemporary audiences.
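Why downsampling creates aliasing can be seen on a one-dimensional toy signal. The Python sketch below compares naive decimation with averaging before decimation; it only illustrates the cause, not Volcano Engine's repair algorithm, and the function names are hypothetical.

```python
def decimate(signal, factor):
    """Keep every `factor`-th sample with no filtering (prone to aliasing)."""
    return signal[::factor]


def box_downsample(signal, factor):
    """Average each group of `factor` samples before keeping one value,
    a crude low-pass filter that suppresses aliasing."""
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal) - factor + 1, factor)]


# Fine stripes alternating between black and white on every sample.
stripes = [0, 255] * 8

aliased = decimate(stripes, 2)         # all 0: the stripes vanish into black
filtered = box_downsample(stripes, 2)  # all 127.5: the true average grey
```

Naive decimation turns the striped pattern into a solid black line, a structure that was never in the source. Once that has happened, the original detail cannot be recovered by simple filtering, which is consistent with the article's point that aliasing baked in during downsampling is harder to repair than aliasing from upsampling.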
Volcano Engine Intelligent Processing Official Website: https://www.volcengine.com/products/IMP