
CLIC Video Compression Challenge results announced: Chinese team takes first place

Author: QbitAI

Yunzhong, reporting from Aofeisi

QbitAI | WeChat official account QbitAI

As the new generation of artificial intelligence technology represented by deep learning continues to advance, academia and industry have come to recognize its large potential in image and video compression.

Deep-learning-based image and video compression is regarded as a rising technology that could push past the performance limits of traditional compression.

Recently, the results of the 6th Challenge on Learned Image Compression (hereinafter "CLIC") were announced. B-2, a joint team formed by the Volcano Engine Multimedia Laboratory and Peking University, took first place on both subjective and objective metrics in both the high-bitrate and low-bitrate video compression tracks. Building on deep learning technology, the B-2 team proposed a "traditional-intelligent hybrid solution".


Traditional-intelligent hybrid solution

Drawing on the principles of both traditional compression and deep learning compression, the B-2 team combined the strengths of the two technical routes and integrated them into a single traditional-intelligent hybrid solution.

The traditional coding module builds on an existing industry coding framework and adds innovations such as asymmetric quadtree partitioning. The intelligent coding module introduces techniques such as deep-learning-based loop filtering.


△ Asymmetric quadtree (UQT) partition structures: (a) H1 horizontal UQT, (b) H2 horizontal UQT, (c) V1 vertical UQT, (d) V2 vertical UQT.

Coding unit partitioning is the basis of the hybrid video coding framework: it determines the basic shape and size of each coding unit. Flexible partitioning can represent the rich texture and motion in video more effectively, which is crucial to coding performance.

The team proposed an asymmetric quadtree (UQT) partition structure to improve video coding efficiency. Compared with the existing quadtree (QT), binary tree (BT), and ternary tree (TT) partition structures, UQT produces its sub-coding units in a single split, so it reaches a deeper partition depth in one step and can capture fine video detail more effectively.

In addition, the sub-block shapes that UQT generates cannot be produced by any combination of QT, BT, and TT splits, so UQT covers cases the existing partitions miss and gives the encoder a richer set of partitions to choose from.
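The geometry of the four UQT modes named in the figure can be sketched as follows. This is only an illustration: the article does not give the split ratios, so the 1:4:2:1 and 1:2:4:1 proportions (in eighths of the parent side) are an assumption taken from commonly published descriptions of UQT, and the function name `uqt_split` is hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)

# ASSUMPTION: split ratios in eighths of the parent side. The article does
# not state them; these values are illustrative only.
UQT_RATIOS = {
    "H1": (1, 4, 2, 1),
    "H2": (1, 2, 4, 1),
    "V1": (1, 4, 2, 1),
    "V2": (1, 2, 4, 1),
}


def uqt_split(x: int, y: int, w: int, h: int, mode: str) -> List[Rect]:
    """Split one block into four sub-blocks with a single UQT split."""
    ratios = UQT_RATIOS[mode]
    horizontal = mode.startswith("H")
    side = h if horizontal else w
    if side % 8 != 0:
        raise ValueError("the split side must be a multiple of 8")
    unit = side // 8

    rects: List[Rect] = []
    offset = 0
    for ratio in ratios:
        size = ratio * unit
        if horizontal:
            # Horizontal split lines: full-width strips of varying height.
            rects.append((x, y + offset, w, size))
        else:
            # Vertical split lines: full-height strips of varying width.
            rects.append((x + offset, y, size, h))
        offset += size
    return rects
```

Under these assumed ratios, one split of a 64x64 block in mode H1 yields strips of height 8, 32, 16, and 8. Reaching the same 8-sample strips with binary splits alone would take three levels of splitting, which is the "deeper partition depth in one step" idea described above.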


△Schematic diagram of loop filtering network structure, including input, filtering and output modules of the network

In traditional video coding, loop filters remove coding artifacts and reduce the distortion between the original image and its reconstruction. Classical examples are the deblocking filter, sample adaptive offset (SAO), and the adaptive loop filter (ALF).

The team proposed an enhanced loop filter based on a residual convolutional network. It combines loop filtering with deep learning and uses the prior information available in traditional video coding, both in the network structure and in model training, to improve filtering efficiency.

For the network input, in addition to the reconstructed pixels, the team fed in the prediction information, partition information, boundary strength, and quantization parameters from the coding process as auxiliary inputs. This enriches the network's prior knowledge and helps it perceive compression distortion more accurately.
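One way to picture these auxiliary inputs is as extra channels stacked alongside the reconstructed pixels. The sketch below is not the team's implementation: the channel order, the normalization, the default `max_qp=63`, and the name `build_filter_input` are all assumptions made for illustration, and plain nested lists stand in for tensors.

```python
from typing import List, Sequence

Plane = List[List[float]]


def _normalize(plane: Sequence[Sequence[float]], max_val: float) -> Plane:
    """Divide every sample of a 2-D plane by max_val."""
    return [[v / max_val for v in row] for row in plane]


def build_filter_input(
    recon: Sequence[Sequence[float]],
    pred: Sequence[Sequence[float]],
    partition_map: Sequence[Sequence[float]],
    boundary_strength: Sequence[Sequence[float]],
    qp: int,
    bit_depth: int = 8,
    max_qp: int = 63,  # ASSUMPTION: hypothetical QP range, not from the article
) -> List[Plane]:
    """Stack reconstruction and coding side information as input channels."""
    height, width = len(recon), len(recon[0])
    for plane in (pred, partition_map, boundary_strength):
        if len(plane) != height or any(len(row) != width for row in plane):
            raise ValueError("all input planes must share the same size")

    max_val = float((1 << bit_depth) - 1)
    # The scalar QP is broadcast to a constant plane so it can be a channel.
    qp_plane = [[qp / max_qp] * width for _ in range(height)]
    return [
        _normalize(recon, max_val),         # channel 0: reconstructed pixels
        _normalize(pred, max_val),          # channel 1: prediction signal
        _normalize(partition_map, 1.0),     # channel 2: partition boundaries
        _normalize(boundary_strength, 1.0), # channel 3: boundary strength
        qp_plane,                           # channel 4: quantization parameter
    ]
```

The result is a channels-first list of five equally sized planes that a convolutional filter could consume after conversion to a tensor.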

In a hierarchical reference coding structure, the frame being encoded references previously reconstructed, higher-quality frames. The team therefore trains the filters used at different temporal layers iteratively, so that the training data matches what the encoder actually produces as closely as possible and the filters perform better.

In addition, each slice and each largest coding unit (LCU) can adaptively choose, from several filter models, the one with the best rate-distortion performance, and the choice is signaled to the decoder.
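The adaptive choice among filter models can be sketched as a standard rate-distortion decision, minimizing J = D + λR per block. The code below is a simplified illustration, not the team's encoder logic: the distortion measure (sum of squared errors), the signaling cost (one on/off flag plus `index_bits` for the model index), and the name `select_filter` are assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Block = List[float]
Filter = Callable[[Sequence[float]], Block]


def sse(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared errors between two equally sized sample lists."""
    return float(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_filter(
    original: Sequence[float],
    recon: Sequence[float],
    filters: Sequence[Filter],
    lam: float,
    index_bits: float,
) -> Tuple[int, Block]:
    """Return (model index, output block); index -1 means filtering is off.

    ASSUMPTION: "off" costs a 1-bit flag, and any model costs that flag
    plus index_bits for the model index.
    """
    best_index = -1
    best_out = list(recon)
    best_cost = sse(original, recon) + lam * 1.0

    for index, apply_filter in enumerate(filters):
        candidate = list(apply_filter(recon))
        cost = sse(original, candidate) + lam * (1.0 + index_bits)
        if cost < best_cost:
            best_index, best_out, best_cost = index, candidate, cost
    return best_index, best_out
```

The encoder would run this per slice or per LCU and write the returned index into the bitstream, so the decoder can apply the same model without access to the original frame. With a large λ the extra signaling bits outweigh small distortion gains and the block is left unfiltered.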


△ MOS-based leaderboard for the CLIC video compression track

Sponsored by the Institute of Electrical and Electronics Engineers (IEEE), the CLIC competition has received wide attention from academia and industry since its inception.

After a one-year hiatus in 2023, the CLIC competition resumed this year, held in conjunction with the Data Compression Conference (DCC), a leading conference in the data compression field. Eight papers from the Volcano Engine Multimedia Laboratory were accepted at this year's DCC. This is also the team's second consecutive win, following its 2022 CLIC victories in both the high-bitrate and low-bitrate video compression tracks.

Volcano Engine Multimedia Lab is a research team under ByteDance that explores cutting-edge multimedia technologies and takes part in international standardization work. Many of its algorithms and software and hardware solutions are widely used in the multimedia services of Douyin, Xigua Video, and other products, and it provides technical services to Volcano Engine's enterprise customers. Since the lab was founded, many of its papers have been accepted at top international conferences and flagship journals, and it has won several international technical competitions, industry innovation awards, and best paper awards.
