Bird and whale species identification using sound images

IET Computer Vision

Image-based identification of animals mostly relies on their appearance, but images can also be used to identify animals in other ways, for example by representing the sounds they make. In this study, the authors present a novel and effective approach for the automated identification of birds and whales using some of the best texture descriptors in the computer vision literature. The visual features of a sound are built from the audio file: they are extracted from images constructed from different spectrograms and from harmonic and percussion images. These images are divided into sub-windows, from which sets of texture descriptors are extracted. Experiments on a dataset of bird vocalisations targeted for species recognition and a dataset of right whale calls targeted for whale detection (as well as three well-known benchmarks for music genre classification) demonstrate that fusing different texture features enhances performance. The experiments also demonstrate that fusing texture features with audio features is not only comparable with existing audio signal approaches but also yields a statistically significant improvement over some of the stand-alone audio features. The code for the experiments will be publicly available at https://www.dropbox.com/s/bguw035yrqz0pwp/ElencoCode.docx?dl=0.
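
The pipeline summarised in the abstract (audio file, spectrogram image, sub-windows, texture descriptors, fusion) can be illustrated with a minimal sketch. It is not the authors' code. It assumes a plain log-magnitude spectrogram, three frequency sub-windows, the basic 8-neighbour local binary pattern (LBP) as the only texture descriptor, and a sum rule over normalised classifier scores as the fusion step. The function names (`spectrogram_image`, `lbp_histogram`, `describe`, `sum_rule`) and all parameter values are hypothetical, and the harmonic and percussion images are not covered.

```python
import numpy as np

def spectrogram_image(signal, n_fft=256, hop=128):
    """Log-magnitude STFT scaled to an 8-bit grey-level image (frequency x time)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1)).T        # (n_fft // 2 + 1, n_frames)
    log_mag = 20.0 * np.log10(mag + 1e-10)
    # Assumption: keep an 80 dB dynamic range below the peak.
    log_mag = np.clip(log_mag, log_mag.max() - 80.0, None)
    scaled = (log_mag - log_mag.min()) / (log_mag.max() - log_mag.min() + 1e-12)
    return (255 * scaled).astype(np.uint8)

def lbp_histogram(img):
    """Normalised 256-bin histogram of basic 8-neighbour LBP codes."""
    img = img.astype(np.int16)
    centre = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(centre.shape, dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy : img.shape[0] - 1 + dy,
                        1 + dx : img.shape[1] - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= centre).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()

def describe(signal, n_zones=3):
    """Concatenate one LBP histogram per frequency sub-window of the spectrogram."""
    image = spectrogram_image(signal)
    zones = np.array_split(image, n_zones, axis=0)     # split along frequency
    return np.concatenate([lbp_histogram(zone) for zone in zones])

def sum_rule(score_sets):
    """Fuse class scores from several classifiers: z-normalise each, then average."""
    normed = [(s - s.mean()) / (s.std() + 1e-12) for s in score_sets]
    return np.mean(normed, axis=0)

# Usage on a synthetic one-second tone sampled at 8 kHz (a stand-in for a recording).
t = np.arange(8000) / 8000.0
tone = np.sin(2 * np.pi * 1000 * t)
features = describe(tone)
print(features.shape)  # (768,) = 3 sub-windows x 256 LBP bins
```

In a setup like this, each texture descriptor would feed its own classifier, and `sum_rule` would combine the resulting per-class score matrices. The same function could combine texture-based scores with scores from classifiers trained on audio features.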

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cvi.2017.0075