
Hybrid framework for categorising sounds of Mysticete whales

IET Signal Processing

This study addresses a problem in the domain of whale audio processing: the automatic classification of sounds produced by mysticete (baleen whale) species. The task is challenging given the vast vocal repertoire of the species involved, the adverse acoustic conditions, and the near absence of prior scientific work. Two feature sets from different domains (frequency and wavelet) were designed to tackle the problem. These are modelled by a hybrid technique that combines the merits of a generative and a discriminative classifier. The dataset covers five species (blue, fin, bowhead, southern right, and humpback whales) and is publicly available at http://www.mobysound.org/. The authors followed a thorough experimental procedure and achieved quite encouraging recognition rates.
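The full article is behind a paywall, so the authors' exact pipeline is not reproduced here. As a rough illustration of the general idea the abstract names (a generative stage whose likelihood scores feed a discriminative stage), the sketch below uses synthetic data, per-class diagonal Gaussians as a stand-in generative model, and a simple perceptron as a stand-in discriminative classifier; all of these choices are assumptions for illustration, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for per-frame acoustic features of two call types.
# (The actual paper extracts frequency- and wavelet-domain features from
# mysticete recordings; here we just draw two well-separated clusters.)
X0 = rng.normal(loc=0.0, scale=1.0, size=(200, 4))
X1 = rng.normal(loc=3.0, scale=1.0, size=(200, 4))
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

def fit_diag_gaussian(samples):
    """Generative stage: fit a per-class diagonal Gaussian (mean, variance)."""
    return samples.mean(axis=0), samples.var(axis=0) + 1e-6

def log_likelihood(samples, mean, var):
    """Per-sample log-likelihood under a diagonal Gaussian."""
    return -0.5 * (np.log(2 * np.pi * var) + (samples - mean) ** 2 / var).sum(axis=1)

models = [fit_diag_gaussian(X[y == c]) for c in (0, 1)]

# The per-class likelihood scores become the input space of the
# discriminative stage -- the "hybrid" part of the framework.
scores = np.column_stack([log_likelihood(X, m, v) for m, v in models])

# Discriminative stage: a plain perceptron on the likelihood scores.
Z = np.column_stack([scores, np.ones(len(scores))])  # append bias term
w = np.zeros(Z.shape[1])
for _ in range(20):
    for zi, yi in zip(Z, y):
        pred_i = 1 if zi @ w > 0 else 0
        w += (yi - pred_i) * zi

pred = (Z @ w > 0).astype(int)
accuracy = (pred == y).mean()
```

The design point being illustrated: the generative models compress each frame into a small vector of class-conditional scores, and the discriminative classifier then learns a decision boundary in that score space rather than in the raw feature space.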

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-spr.2015.0065