Hybrid framework for categorising sounds of Mysticete whales

This study addresses a problem in whale audio processing: the automatic classification of sounds produced by Mysticete species. The task is challenging given the vast vocal repertoire of the species involved, the adverse acoustic conditions, and the scarcity of prior scientific work. Two feature sets drawn from different domains (frequency and wavelet) were designed to tackle the problem. These are modelled by a hybrid technique that combines the merits of a generative and a discriminative classifier. The dataset includes five species (Blue, Fin, Bowhead, Southern Right, and Humpback) and is publicly available at http://www.mobysound.org/. The authors followed a thorough experimental procedure and achieved encouraging recognition rates.
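The abstract describes a generative-discriminative hybrid: generative models capture each species' sound distribution, and a discriminative classifier then separates the classes. A common realisation of this idea, sketched below under assumptions (the paper's exact pipeline may differ), trains one Gaussian mixture model per class and feeds the per-class log-likelihood scores into an SVM; the feature extraction stage (frequency/wavelet features) is replaced here by synthetic vectors, and all variable names are illustrative.

```python
# Hypothetical sketch of a generative-discriminative hybrid classifier:
# per-class GMMs produce log-likelihood scores that an SVM then
# discriminates over. Synthetic data stands in for acoustic features.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, n_per_class, dim = 3, 60, 8

# Synthetic "acoustic feature" vectors, one cluster per species.
X = np.vstack([rng.normal(loc=3.0 * c, scale=1.0, size=(n_per_class, dim))
               for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_per_class)

# Generative stage: one GMM per class, fitted on that class's data only.
gmms = [GaussianMixture(n_components=2, random_state=0).fit(X[y == c])
        for c in range(n_classes)]

# Score space: each sample becomes a vector of per-class log-likelihoods.
scores = np.column_stack([g.score_samples(X) for g in gmms])

# Discriminative stage: an SVM separates the classes in score space.
svm = SVC(kernel="rbf").fit(scores, y)
print("training accuracy:", svm.score(scores, y))
```

The appeal of such a hybrid is that the generative stage compresses variable-length, high-dimensional audio into a fixed, low-dimensional likelihood vector, while the discriminative stage learns the class boundaries that raw likelihood comparison alone would miss.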

Inspec keywords: signal classification; acoustic signal processing

Other keywords: discriminative classifier; whale audio processing; hybrid technique; automatic sound classification; generative classifier; adverse acoustic conditions; Mysticete whales

Subjects: Signal processing and detection; Digital signal processing

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-spr.2015.0065