Depth-based Prediction Mode for 3D Video Coding

With the emergence of multiview displays, there is interest in a new 3D video (3DV) representation and compression standard that includes depth information, so that virtual views can be synthesized at the decoder from a sparse set of anchor views. With such a representation, the depth map can also be used to increase video coding efficiency. In this work, we propose a novel depth-based prediction mode (DBPM) that increases inter-view prediction efficiency. The proposed mode is designed so that its syntax is simple and efficient to signal, and it is compatible with existing H.264/MVC modes. Depending on the depth map quality, the proposed codec can provide gains over H.264/MVC coding, even with the depth overhead it requires.


Proposed Depth-based Prediction Mode

In our proposed codec, the base (first) view video and its associated depth map are first encoded individually with H.264/AVC. Then, while encoding a frame of an additional view, a virtual view is rendered from the base view data at the corresponding time instance. Note that we restrict DBPM to use only base view information, which allows the proposed mode to be signaled very efficiently. With this design, only a single depth map needs to be inserted in the bitstream; encoding additional depth maps is possible but is not required for DBPM.

The virtual view is rendered with the well-known “Depth Image Based Rendering” (DIBR) algorithm, without any hole filling in the disocclusion regions. Since hole filling is a costly and error-prone operation, we simply rely on existing MVC modes to encode macroblocks located around the disocclusion regions. Once the proposed mode is signaled, the decoder refers to the reference virtual view and copies the collocated macroblock as the prediction. If residual information is present, the decoder adds it to the predicted macroblock.

The proposed prediction mode is depicted in orange in Fig. 1, where existing MVC modes are depicted in yellow (motion prediction) and blue (disparity prediction), and the DIBR operation is depicted in red. Additionally, Fig. 2 provides a block diagram summarizing how DBPM is integrated within MVC’s macroblock prediction framework so that it can be used concurrently with the other MVC prediction modes.
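The rendering and prediction steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the codec's actual implementation: it assumes rectified, horizontally aligned cameras (so warping is a pure horizontal shift), an 8-bit depth map that stores quantized inverse depth with larger values meaning closer objects, integer-pixel disparities, and a target view for which pixels shift toward smaller x. The function names, the `HOLE` marker, and the parameters `focal`, `baseline`, `z_near` and `z_far` are hypothetical and chosen only for this example.

```python
HOLE = None  # marker for disoccluded pixels; no hole filling is applied


def depth_to_disparity(v, focal, baseline, z_near, z_far):
    """Convert an 8-bit depth sample to an integer horizontal disparity.

    Assumes the depth sample v encodes quantized inverse depth between
    1/z_far (v = 0) and 1/z_near (v = 255).
    """
    inv_z = (v / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return int(round(focal * baseline * inv_z))


def render_virtual_view(texture, depth, focal, baseline, z_near, z_far):
    """Forward-warp the base view into the target view (DIBR sketch).

    texture and depth are lists of rows with identical dimensions.
    Pixels that receive no sample stay marked as HOLE.
    """
    height, width = len(texture), len(texture[0])
    virtual = [[HOLE] * width for _ in range(height)]
    nearest = [[-1] * width for _ in range(height)]  # depth value already written
    for y in range(height):
        for x in range(width):
            v = depth[y][x]
            # Shift direction is an assumption; it depends on the camera order.
            xt = x - depth_to_disparity(v, focal, baseline, z_near, z_far)
            # When two pixels land on the same target, keep the closer one.
            if 0 <= xt < width and v > nearest[y][xt]:
                virtual[y][xt] = texture[y][x]
                nearest[y][xt] = v
    return virtual


def predict_macroblock(virtual, mb_x, mb_y, residual=None, size=16):
    """Copy the collocated block from the virtual view and add the residual.

    Returns None if the block touches a hole, in which case an encoder
    would be expected to pick one of the existing MVC modes instead.
    """
    block = []
    for j in range(size):
        row = []
        for i in range(size):
            p = virtual[mb_y * size + j][mb_x * size + i]
            if p is HOLE:
                return None
            r = residual[j][i] if residual is not None else 0
            row.append(max(0, min(255, p + r)))  # clip to the 8-bit sample range
        block.append(row)
    return block
```

In this sketch, a block that overlaps a disocclusion simply reports that the mode is unavailable, which mirrors the decision to leave such macroblocks to the existing MVC modes rather than fill the holes. A real encoder would select between DBPM and the other modes by rate-distortion cost, which is not modeled here.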

Fig. 1. Illustration of the prediction modes of the proposed codec.
Fig. 2. Block diagram of the proposed codec.

BD-Rate Results

Half resolution depth map (2x downsampling in each direction)

Depth QP  PoznanHall2  PoznanStreet  UndoDancer  Kendo  Balloons  Newspaper1
26               7.52          9.25       -0.84  21.16     14.69       26.83
31               1.80          3.19       -1.09  11.12      5.66       11.66
36              -1.06          0.41       -0.80   4.92      1.19        4.38
41              -1.91         -0.48       -0.57   1.92     -0.79        1.47

Quarter resolution depth map (4x downsampling in each direction)

Depth QP  PoznanHall2  PoznanStreet  UndoDancer  Kendo  Balloons  Newspaper1
26              -0.08          1.80       -2.47   8.80      3.83        9.35
31              -2.35         -0.43       -2.18   3.80     -0.03        3.51
36              -3.16         -1.33       -1.67   0.94     -1.60        0.77
41              -2.95         -1.44       -0.91  -0.25     -2.27       -0.29

Rate-Distortion Curves

[Figure panels: half-resolution depth; quarter-resolution depth]