Bidirectional, Occlusion-Aware Temporal Frame Interpolation in a Highly Scalable Video Setting

D. Ruefenacht, R. Mathew, and D. Taubman

Abstract

We present a bidirectional, occlusion-aware temporal frame interpolation (BOA-TFI) scheme that builds on our recently proposed highly scalable video coding scheme. Unlike previous TFI methods, our scheme attempts to put ‘‘correct’’ information in problematic regions around moving objects. From a ‘‘parent’’ motion field between two existing reference frames, we compose motion from both reference frames to the target frame. These motion fields, together with motion discontinuity information, are then warped to the target frame – a process during which we discover valuable information about disocclusions, which we then use to guide the bidirectional prediction of the interpolated frame. The scheme can be used with any state-of-the-art codec, but is most beneficial when used in conjunction with a highly scalable video coder. Evaluating the method on synthetic data allows us to shine a light on problematic regions around moving object boundaries, which have not been the focus of previous frame interpolation methods. The proposed frame interpolation method yields credible results, and compares favourably with current state-of-the-art frame interpolation methods.
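As a rough sketch of this pipeline (our own illustration, not the authors' implementation), the NumPy code below composes target-frame motion by halving a parent motion field under the constant motion assumption, splats it to the target grid, marks unreached cells as disoccluded, and predicts the remaining cells bidirectionally. All names are ours, and the simple fill of disoccluded cells from one reference is a stand-in for the paper's occlusion-guided prediction.

```python
import numpy as np

def boa_tfi_sketch(f_left, f_right, mv_parent):
    """Toy midpoint interpolation in the spirit of BOA-TFI.

    f_left, f_right : (H, W) float reference frames
    mv_parent       : (H, W, 2) parent motion field anchored at f_left,
                      mapping it to f_right (row, col displacements)
    """
    H, W = f_left.shape
    r, c = np.mgrid[0:H, 0:W]

    # Compose motion to the target frame: under constant motion, the target
    # lies halfway along the parent field, so left-frame pixel (r, c)
    # appears at (r, c) + 0.5 * mv_parent in the target frame.
    tr = np.round(r + 0.5 * mv_parent[..., 0]).astype(int)
    tc = np.round(c + 0.5 * mv_parent[..., 1]).astype(int)
    inside = (tr >= 0) & (tr < H) & (tc >= 0) & (tc < W)

    # The same parent vector links each left-frame pixel to its right-frame
    # counterpart, enabling a bidirectional (averaged) prediction.
    rr = np.clip(np.round(r + mv_parent[..., 0]).astype(int), 0, H - 1)
    cc = np.clip(np.round(c + mv_parent[..., 1]).astype(int), 0, W - 1)
    bidir = 0.5 * (f_left + f_right[rr, cc])

    # Warp (splat) to the target grid; target cells that no parent vector
    # reaches are disoccluded and cannot be predicted bidirectionally.
    pred = np.zeros((H, W))
    hits = np.zeros((H, W))
    np.add.at(pred, (tr[inside], tc[inside]), bidir[inside])
    np.add.at(hits, (tr[inside], tc[inside]), 1.0)

    disoccluded = hits == 0
    pred[~disoccluded] /= hits[~disoccluded]
    # Crude hole fill from one reference; the actual scheme uses the
    # disocclusion information to pick the reference frame in which the
    # missing content is visible.
    pred[disoccluded] = f_right[disoccluded]
    return pred, disoccluded
```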

Downloads

Preprint | Presentation Slides | Poster

Test Sequences

The following zip file contains four test sequences, along with ground truth motion fields: Download

Please refer to the included ReadMe.txt for instructions on how to read in the motion fields. We have also included the ‘‘ground truth’’ frame 3 of every sequence, which is the frame that should be reconstructed under the constant motion assumption.
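The snippet below is only a hypothetical illustration of such a reader; the authoritative layout (dimensions, channel order, endianness, dtype) is whatever ReadMe.txt specifies, and every assumption here should be adjusted to match it:

```python
import numpy as np

# Hypothetical reader for illustration only: assumes a raw little-endian
# float32 file holding an (H, W, 2) displacement field in row-major order
# (vertical component first). Check ReadMe.txt for the actual format and
# adapt height, width, dtype, and channel order accordingly.
def read_motion_field(path, height, width):
    data = np.fromfile(path, dtype='<f4')
    return data.reshape(height, width, 2)
```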

If you use these sequences in your work, please cite [1].

Experimental Results

The following table shows results for the reconstruction of frame 3 from the existing reference frames 2 and 4. The reference for all quality measures is the ground truth frame that should have been reconstructed under the constant motion assumption; note that this is not identical to frame 3 of the test sequence, since all our sequences contain accelerated motion.
The color code of the disocclusion maps is as follows:

  • Light Green: Occluded in left reference frame

  • Dark Green: Occluded in right reference frame

  • Red: Occluded in both reference frames

All entries are PSNR / occPSNR in dB; higher is better. (Images of the ground truth frames, disocclusion maps, and per-method reconstructions are omitted here.)

Sequence   Proposed (estimated motion) [1]   Jeong et al. [2]   Veselov et al. [3]   Proposed (ground truth motion) [1]
1          28.21 / 23.08                     27.15 / 21.61      25.61 / 19.70        31.67 / 26.91
2          31.09 / 23.86                     31.96 / 21.80      29.23 / 20.15        34.00 / 26.57
3          29.47 / 25.23                     28.34 / 22.51      28.67 / 21.79        30.52 / 26.27
4          24.33 / 19.76                     23.65 / 17.50      21.09 / 15.33        26.41 / 22.15
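On our reading, occPSNR is ordinary PSNR evaluated only over the pixels flagged in the disocclusion map described above (see [1] for the exact evaluation protocol). Here is a minimal sketch of both measures, with names of our choosing:

```python
import numpy as np

def psnr(ref, rec, mask=None, peak=255.0):
    """PSNR between ref and rec, optionally restricted to a boolean mask."""
    if mask is None:
        mask = np.ones(ref.shape, dtype=bool)
    mse = np.mean((ref[mask].astype(np.float64) - rec[mask]) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# psnr(gt, recon)                     -> PSNR over the whole frame
# psnr(gt, recon, disocclusion_mask)  -> occPSNR over disoccluded pixels only
```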

References

[1] D. Ruefenacht, R. Mathew, and D. Taubman, ‘‘Occlusion-Aware Bidirectional Temporal Frame Interpolation in a Highly Scalable Video Setting,’’ Picture Coding Symposium (PCS), Cairns, Australia, 2015.
[2] S. Jeong, C. Lee, and C. Kim, ‘‘Motion-compensated frame interpolation based on multihypothesis motion estimation and texture optimization,’’ IEEE Transactions on Image Processing, vol. 22, no. 11, pp. 4497–4509, 2013.
[3] A. Veselov and M. Gilmutdinov, ‘‘Iterative Hierarchical True Motion Estimation for Temporal Frame Interpolation,’’ IEEE International Workshop on Multimedia Signal Processing (MMSP), Jakarta, Indonesia, 2014.