
Improving stereo matching by incorporating geometry prior into ConvNet

Deep learning-based methods for stereo matching have shown superior performance over traditional ones. However, most of them ignore the inherent geometry prior of stereo matching during training, namely that the reference image can be reconstructed from the second image in the visible regions. The reconstruction is achieved by backward warping the second image using the disparity map of the reference image, while the visible regions can be determined by a left-right consistency check. This prior is useful especially when the ground-truth disparity is sparse (e.g. outdoor scenes such as KITTI 2015). The prior is incorporated into a two-stage end-to-end training process, in which both stages minimise the end-point error with respect to the sparse ground-truth disparity (supervised learning) together with the reconstruction error (self-supervised learning). The predicted disparity and the reconstruction error of the first stage act as additional information and are fed to the second stage, which exploits this prior knowledge further to improve performance. Experiments on the challenging KITTI 2015 dataset show that the method improves the results in the foreground region and ranks first among all published methods on the D1-fg metric.
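The two operations the abstract relies on, backward warping of the second (right) image with the reference (left) disparity map and the left-right consistency check for visibility, can be sketched as follows. This is a minimal NumPy illustration under the standard rectified-stereo convention (a left pixel at column x corresponds to the right pixel at column x − d); the function names and the consistency threshold are illustrative, not the paper's actual implementation.

```python
import numpy as np

def backward_warp(right_img, disp_left):
    """Reconstruct the left (reference) image by sampling the right image
    at x - d(x, y) for every pixel, with linear interpolation along x."""
    h, w = disp_left.shape
    xs = np.arange(w)[None, :] - disp_left      # source x-coordinates in the right image
    x0 = np.floor(xs).astype(int)
    frac = xs - x0
    x0c = np.clip(x0, 0, w - 2)                 # clamp so x0c and x0c+1 are in range
    rows = np.arange(h)[:, None]
    recon = (1 - frac) * right_img[rows, x0c] + frac * right_img[rows, x0c + 1]
    valid = (xs >= 0) & (xs <= w - 1)           # pixels that land inside the right image
    return recon, valid

def lr_consistency_mask(disp_left, disp_right, thresh=1.0):
    """Visible (non-occluded) pixels satisfy |d_L(x) - d_R(x - d_L(x))| <= thresh
    (thresh is an illustrative tolerance, typically around one pixel)."""
    h, w = disp_left.shape
    xs = np.clip(np.round(np.arange(w)[None, :] - disp_left).astype(int), 0, w - 1)
    rows = np.arange(h)[:, None]
    return np.abs(disp_left - disp_right[rows, xs]) <= thresh
```

During training, the reconstruction error would only be accumulated over pixels where both `valid` and the consistency mask hold, so occluded regions do not corrupt the self-supervised loss.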

http://iet.metastore.ingenta.com/content/journals/10.1049/el.2017.2418