Encoder-Driven Inpainting Strategy in Multiview Video Compression


In free viewpoint video systems, a user has the freedom to select a virtual view from which an image of the 3D scene is rendered, and the scene is commonly represented by color and depth images of multiple nearby viewpoints. In such a representation, there exists information redundancy across multiple dimensions: 1) a 3D voxel may be represented by pixels in multiple viewpoint images (inter-view redundancy); 2) a pixel patch may recur in a distant spatial region of the same image due to self-similarity (inter-patch redundancy); and 3) pixels in a local spatial region tend to be similar (inter-pixel redundancy). It is important to exploit these redundancies during inter-view prediction toward effective multiview video compression. In this paper, we propose an encoder-driven inpainting strategy for inter-view predictive coding, where explicit instructions are transmitted minimally and the decoder is left to independently recover the remaining missing data via inpainting, resulting in lower coding overhead. In particular, after pixels in a reference view are projected to a target view via depth-image-based rendering at the decoder, the remaining holes in the target view are filled via an inpainting process in a block-by-block manner. First, blocks are ordered in terms of difficulty-to-inpaint by the decoder. Then, explicit instructions are sent only for the reconstruction of the most difficult blocks. In particular, the missing pixels are explicitly coded via a graph Fourier transform or a sparsification procedure using the discrete cosine transform, leading to low coding cost. For blocks that are easy to inpaint, the decoder independently completes the missing pixels via template-based inpainting.
We apply our proposed scheme to frames in a prediction structure defined by JCT-3V where inter-view prediction is dominant, and experimentally we show that our scheme achieves up to 3-dB gain in peak signal-to-noise ratio in reconstructed image quality over a comparable 3D-High Efficiency Video Coding implementation using a fixed 16 $\times$ 16 block size.
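
As an illustration of the graph Fourier transform mentioned above, the following is a minimal sketch of how the missing pixels of an arbitrarily shaped hole inside a block could be transformed. It is not the construction used in the paper: the 4-connected pixel graph, the unit edge weights, and the names `gft_basis` and `mask` are assumptions made here for illustration only.

```python
import numpy as np

def gft_basis(mask):
    """Build a graph Fourier basis for the pixels where `mask` is True.

    Assumption (not from the paper): pixels are joined to their
    4-connected neighbours inside the mask with unit edge weights.
    Returns the pixel coordinates, the Laplacian eigenvalues (graph
    frequencies, ascending) and the eigenvector matrix U (columns).
    """
    coords = [(int(r), int(c)) for r, c in np.argwhere(mask)]
    index = {p: i for i, p in enumerate(coords)}
    n = len(coords)
    W = np.zeros((n, n))
    for (r, c), i in index.items():
        # Only look down and right; symmetry fills in the other half.
        for nb in ((r + 1, c), (r, c + 1)):
            j = index.get(nb)
            if j is not None:
                W[i, j] = W[j, i] = 1.0
    L = np.diag(W.sum(axis=1)) - W      # combinatorial graph Laplacian
    eigvals, U = np.linalg.eigh(L)      # L is symmetric, so eigh applies
    return coords, eigvals, U

# An L-shaped hole inside a 4 x 4 block (hypothetical example data).
mask = np.zeros((4, 4), dtype=bool)
mask[1:, :2] = True
mask[3, 2:] = True

coords, eigvals, U = gft_basis(mask)
pixels = np.array([10.0 + r + c for r, c in coords])  # smooth ramp signal
coeffs = U.T @ pixels        # forward GFT
restored = U @ coeffs        # inverse GFT
```

Because the basis adapts to the shape of the hole, only as many coefficients as there are missing pixels are produced, whereas a block-based discrete cosine transform would operate on the full square block.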

