Reprinted from "Integration Technology"
The explosive growth of video data puts enormous pressure on storage and transmission and calls for fast, efficient video coding schemes. However, even next-generation video coding methods still rely on hand-designed prediction functions derived from statistical experience, which limits further performance gains. How to use advanced learning tools to maximize video quality and improve compression efficiency within a given bandwidth has become a key issue for future intelligent video coding optimization. From the perspective of computer vision and artificial intelligence, this paper models the chroma prediction problem in video coding as an image colorization problem, further removing redundancy between color channels. The proposed chroma prediction, based on convolutional neural networks, comprises two sub-networks: luminance downsampling and chroma prediction. The output of a linear model is used as the chroma initialization to improve performance, and the quantization parameter is used to characterize coding distortion and eliminate the impact of compression noise. In the encoder design, to achieve better coding performance, rate-distortion optimization selects the lowest-cost prediction strategy from among the traditional chroma prediction methods and the proposed method.
The results show that, compared with existing traditional methods, the proposed method can save 4.28%, 3.34%, and 4.63% of network bandwidth in the Y, U, and V components, respectively.
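The linear-model initialization and the rate-distortion selection mentioned in the abstract can be illustrated with a minimal sketch. It assumes a least-squares fit of chroma against co-located luminance over neighboring reconstructed samples and the usual Lagrangian cost J = D + λ·R. The function names, the `Candidate` type, and all sample values are hypothetical and are not taken from the paper or from any codec reference software, which may derive the linear parameters differently.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


def fit_linear_model(luma: Sequence[float], chroma: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of chroma ~ alpha * luma + beta over neighboring samples."""
    n = len(luma)
    if n == 0 or n != len(chroma):
        raise ValueError("need equally sized, non-empty neighbor lists")
    mean_l = sum(luma) / n
    mean_c = sum(chroma) / n
    var_l = sum((l - mean_l) ** 2 for l in luma)
    if var_l == 0:
        # Flat luminance neighborhood: fall back to the mean chroma value.
        return 0.0, mean_c
    cov = sum((l - mean_l) * (c - mean_c) for l, c in zip(luma, chroma))
    alpha = cov / var_l
    return alpha, mean_c - alpha * mean_l


def predict_chroma(luma_block: Sequence[Sequence[float]], alpha: float, beta: float) -> List[List[float]]:
    """Apply the linear model to a (downsampled) luminance block.

    In this sketch the result stands in for the chroma initialization
    that a prediction network would then refine.
    """
    return [[alpha * l + beta for l in row] for row in luma_block]


@dataclass
class Candidate:
    """One prediction strategy with its measured distortion and bit cost (hypothetical)."""
    name: str
    distortion: float  # e.g. sum of squared errors against the original block
    rate_bits: float   # bits needed to signal the mode and code the residual


def rd_cost(candidate: Candidate, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.rate_bits


def select_mode(candidates: Sequence[Candidate], lam: float) -> Candidate:
    """Pick the candidate with the lowest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


# Illustrative values only.
alpha, beta = fit_linear_model([10, 20, 30, 40], [25, 45, 65, 85])
initial_chroma = predict_chroma([[50, 60]], alpha, beta)

modes = [
    Candidate("linear_model", distortion=120.0, rate_bits=10.0),
    Candidate("cnn", distortion=80.0, rate_bits=12.0),
]
best = select_mode(modes, lam=10.0)
```

With these made-up numbers the fit gives alpha = 2.0 and beta = 5.0, and the selection picks the `cnn` candidate at λ = 10 (cost 200 against 220). At a larger λ such as 30, the cheaper-to-signal `linear_model` candidate wins instead, which is why the encoder keeps both strategies available rather than always using one.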
To address the limitations of existing video coding modules, this study shifts the focus from signal processing to computer vision and artificial intelligence. Drawing on massive amounts of video and image data, it investigates video coding methods that incorporate neural network models, with the goal of advancing the theory and methods of intelligent video coding optimization. The expected results can be applied to next-generation video coding standards and related areas of video compression.
Chroma prediction performance comparison (the block to be predicted is located in the lower right corner)