Using mel-frequency audio features from footstep sound and spatial segmentation techniques to improve frame-based moving object detection

Moving object detection in video streams is a challenging and integral part of computer vision, with applications in surveillance, traffic and site monitoring, and navigation. Compared with background-based techniques, the frame differencing technique is computationally inexpensive. However, it detects only the boundary of a moving object. Changing light conditions, shadows, poor contrast between object and background, and slow-moving objects reduce its detection rate, because the number of noisy frames and of frames with a missing or partially detected object increases. Morphological operations with a large kernel size are not a reliable remedy for this noise, as they might also remove the boundary (or part) of a moving object. In this study, the authors propose a methodology to improve the frame differencing technique using the footstep sound generated by a moving object. Audio recorded with the video system is processed, and footstep sound is detected using audio features computed as mel-frequency cepstral coefficients. The frames within each footstep sound are counted and processed. Spatial segmentation is used to find the moving object in noisy frames. A missing or partially detected object is recovered by modelling an ellipse from the moving object detected in neighbouring frames.
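
As a rough illustration of two ideas in the abstract, the sketch below shows why plain frame differencing marks only the edges of a uniformly coloured moving object, and how video frames could be assigned to a detected footstep interval. It is not the authors' implementation: the function names, the threshold of 25 grey levels, and the half-open interval convention are assumptions made for this example.

```python
import numpy as np


def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Boolean mask of pixels whose grey level changed by more than `threshold`.

    `threshold` is a hypothetical value chosen for this example.
    """
    # Cast to a signed type so the subtraction cannot wrap around in uint8.
    prev = prev_frame.astype(np.int16)
    curr = curr_frame.astype(np.int16)
    return np.abs(curr - prev) > threshold


def frames_in_interval(start_s, end_s, fps):
    """Indices of frames whose timestamps lie in [start_s, end_s).

    Frame k is assumed to be captured at time k / fps.
    """
    first = int(np.ceil(start_s * fps))
    last = int(np.ceil(end_s * fps))
    return list(range(first, last))


# A uniform 4x3 block (grey level 200) on a background of 50,
# shifted one pixel to the right between two frames.
prev_frame = np.full((8, 8), 50, dtype=np.uint8)
curr_frame = np.full((8, 8), 50, dtype=np.uint8)
prev_frame[2:6, 2:5] = 200
curr_frame[2:6, 3:6] = 200

mask = frame_difference_mask(prev_frame, curr_frame)
# Only the trailing column (2) and leading column (5) change;
# the interior of the block is identical in both frames.
print(mask.astype(int))

# Frames covered by a footstep assumed to last from 0.5 s to 1.0 s at 30 fps.
footstep_frames = frames_in_interval(0.5, 1.0, 30)
print(len(footstep_frames))
```

In the toy frames, the interior of the block produces no difference, which is the boundary-only behaviour the abstract describes; a slower object or lower contrast would shrink the detected region further.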

Inspec keywords: object detection; image segmentation

Other keywords: multimodal systems; frame-based moving object detection; kernel size morphological operations; site monitoring; traffic navigation; computer vision; frame differencing technique; surveillance navigation; footstep sound; video streams; spatial segmentation

Subjects: Optical, image and video signal processing; Computer vision and image processing techniques
