Segmentation and semantic labelling of RGBD data with convolutional neural networks and surface fitting

We present an approach for the segmentation and semantic labelling of RGBD data that jointly exploits geometrical cues and deep learning techniques. An initial over-segmentation is performed using spectral clustering, and a set of non-uniform rational B-spline surfaces is fitted to the extracted segments. A convolutional neural network (CNN) then takes as input colour and geometry data together with the surface fitting parameters. The network consists of nine convolutional stages followed by a softmax classifier and produces a vector of descriptors for each sample. Next, an iterative merging algorithm recombines the output of the over-segmentation into larger regions matching the various elements of the scene. Pairs of adjacent segments with higher similarity according to the CNN features are candidates for merging, and the surface fitting accuracy is used to detect which pairs of segments belong to the same surface. Finally, a set of labelled segments is obtained by combining the segmentation output with the descriptors from the CNN. Experimental results show that the proposed approach outperforms state-of-the-art methods and provides an accurate segmentation and labelling.
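
The merging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: a least-squares plane stands in for the non-uniform rational B-spline fit, cosine similarity stands in for the CNN-feature similarity, and the names `merge_segments`, `plane_fit_rms`, `min_sim` and `max_err_increase` are hypothetical. The sketch repeatedly picks the most similar pair of adjacent segments and merges it only if a single surface fits the union about as well as separate surfaces fit the two parts.

```python
import numpy as np

def cosine_similarity(u, v):
    # Assumed similarity measure between two descriptor vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def plane_fit_rms(pts):
    # RMS distance of the points from their least-squares plane.
    # Assumption: a plane replaces the NURBS surface used in the paper.
    centred = pts - pts.mean(axis=0)
    s = np.linalg.svd(centred, compute_uv=False)
    return float(s[-1] / np.sqrt(len(pts)))

def merge_segments(points, descriptors, adjacency,
                   min_sim=0.9, max_err_increase=0.01):
    # points: {id: (N, 3) array}, descriptors: {id: vector},
    # adjacency: iterable of (id, id) pairs of neighbouring segments.
    # Returns {original id: id of the merged region it ends up in}.
    points = dict(points)
    descriptors = {k: np.asarray(v, dtype=float) for k, v in descriptors.items()}
    pairs = {frozenset(p) for p in adjacency}
    rejected = set()
    label = {k: k for k in points}
    while True:
        # Candidate pairs: adjacent, not yet rejected, similar enough.
        scored = []
        for p in pairs - rejected:
            a, b = sorted(p)
            sim = cosine_similarity(descriptors[a], descriptors[b])
            if sim >= min_sim:
                scored.append((sim, a, b))
        if not scored:
            return label
        _, a, b = max(scored)  # most similar pair first
        na, nb = len(points[a]), len(points[b])
        err_before = (na * plane_fit_rms(points[a])
                      + nb * plane_fit_rms(points[b])) / (na + nb)
        merged = np.vstack([points[a], points[b]])
        if plane_fit_rms(merged) > err_before + max_err_increase:
            # One surface fits the union poorly: keep the segments apart.
            rejected.add(frozenset((a, b)))
            continue
        # Merge b into a; the descriptor becomes a size-weighted mean.
        descriptors[a] = (na * descriptors[a] + nb * descriptors[b]) / (na + nb)
        points[a] = merged
        del points[b], descriptors[b]
        label = {k: (a if v == b else v) for k, v in label.items()}
        pairs = {frozenset(a if s == b else s for s in p) for p in pairs}
        pairs = {p for p in pairs if len(p) == 2}
        # Pairs involving the changed segment must be re-evaluated.
        rejected = {p for p in rejected if a not in p and b not in p}
```

In this sketch the descriptor similarity only proposes candidates, while the fitting error decides: two segments with near-identical descriptors that lie on different surfaces (for example a floor patch and an adjacent wall patch) remain separate.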

Inspec keywords: image colour analysis; image segmentation; iterative methods; neural nets; surface fitting; feature extraction

Other keywords: convolutional neural networks; initial over-segmentation; geometrical cues; iterative merging algorithm; surface fitting; CNN features; surface fitting accuracy; surface fitting parameters; input colour; geometry data; segmentation output; nonuniform rational B-spline surfaces; labelled segments; CNN; RGBD data; softmax classifier; deep learning technique; semantic labelling; convolutional neural network; over-segmentation; spectral clustering

Subjects: Interpolation and function approximation (numerical analysis); Image recognition; Computer vision and image processing techniques; Neural computing techniques

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cvi.2016.0502