Speaker identification using multimodal neural networks and wavelet analysis (open access)

References

    1. Pawar, R.V., Kajave, P.P., Mali, S.N.: ‘Speaker identification using neural networks’. Proc. World Academy of Science, Engineering and Technology, 2005, no. 7, pp. 429–433.
    2. Rabiner, L., Juang, B.H.: ‘Fundamentals of speech recognition’ (Prentice-Hall, 1993).
    3. Kinsner, W., Peters, D.: ‘A speech recognition system using linear predictive coding and dynamic time warping’. Proc. Annual Int. Conf. IEE, Engineering in Medicine & Biology Society, New Orleans, LA, 4–7 November 2006, no. 3, pp. 1070–1071.
    4. Benesty, J., Sondhi, M., Huang, Y.: ‘Springer handbook of speech processing’ (Springer, 2007).
    5. Abdalla, M.I., Ali, H.S.: ‘Wavelet-based Mel-frequency cepstral coefficients for speaker identification using hidden Markov models’, J. Telecommun., 2010, 1, (2), pp. 16–21.
    6. Suvarna Kumar, G., Prasad Raju, K.A., Rao, M., et al.: ‘Speaker recognition using GMM’, Int. J. Eng. Sci. Technol., 2010, 2, (6), pp. 2428–2436.
    7. Kekre, H.B., Kulkarni, V.: ‘Speaker identification by using vector quantization’, Int. J. Eng. Sci. Technol., 2010, 2, (5), pp. 1325–1331.
    8.
    9.
    10. Shukla, A., Tiwari, R., Hemant Kumar, M., Kala, R.: ‘Speaker identification using wavelet analysis and modular neural networks’, J. Acoust. Soc. India (JASI), 2009, 36, (1), pp. 14–19.
    11. Revada, L.K.V., Rambatla, V.K., Ande, K.V.N.: ‘A novel approach to speech recognition by using generalised regression neural networks’, IJCSI Int. J. Comput. Sci. Issues, 2011, 1, pp. 483–489.
    12.
    13. Hall, D.L., Llinas, J.: ‘Handbook of multi-sensor data fusion’ (CRC Press, UK, 2011).
    14. Ross, A., Jain, A.: ‘Information fusion in biometrics’, Pattern Recognit. Lett., 2003, 24, (3), pp. 2115–2125.
    15.
    16. Chetty, G., Wagner, M.: ‘Audio visual speaker verification based on hybrid fusion of cross modal features’, in Pattern Recognition and Machine Intelligence (Springer, Berlin, 2007).
    17. Chetty, G., Wagner, M.: ‘Investigating feature-level fusion for checking liveness in face-voice authentication’. Int. Symp. on Signal Processing and its Applications, 2005, vol. 1.
    18. Arora, S., Bhattacharjee, D., Nasipuri, M., Malik, L., Kundu, M., Basu, D.K.: ‘Performance comparison of SVM and ANN for handwritten Devnagari character recognition’, IJCSI Int. J. Comput. Sci., 2010, 7, (3), pp. 1–10.
    19.
    20. Mallat, S.: ‘A wavelet tour of signal processing’ (Elsevier, UK, 1999).
    21.
    22. Vetterli, M., Kovacevic, J.: ‘Wavelets and subband coding’ (Prentice-Hall, New Jersey, 1995).
    23.
    24. Deshpande, M.S., Holambe, R.S.: ‘Speaker identification using admissible wavelet packet based decomposition’, Int. J. Inf. Commun. Eng., 2011, 6, (1), pp. 20–23.
    25.
    26.
    27.
    28. Amrouche, A., Rouvaen, J.: ‘Efficient system for speech recognition using general regression neural network’, Int. J. Intell. Technol., 2006, 1, (2), pp. 183–189.
    29.
    30. Ye, J.: ‘Speech recognition using time domain features from phase space reconstructions’. PhD thesis, Marquette University, Milwaukee, Wisconsin, 2004.
    31.
    32. Wilpon, J.G., Lee, C.H., Rabiner, L.R.: ‘Improvements in connected digit recognition using higher order spectral and energy features’. Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Toronto, Canada, 1991.
    33. Rottland, J., Neukirchen, C., Willett, D., Rigoll, G.: ‘Large vocabulary speech recognition with context dependent MMI-connectionist/HMM systems using the WSJ database’. EUROSPEECH, 1997.
    34. Hamzah, R., Jamil, N., Seman, N.: ‘Filled pause classification using energy-boosted Mel-frequency cepstrum coefficients’. Proc. Int. Conf. on Robotic, Vision, Signal Processing & Power Applications, 2014, pp. 311–319.
    35. ‘The GRID audio corpus for speech recognition’. Available at http://www.dcs.shef.ac.uk/spandh/gridcorpus.
    36.
    37. Holmes, W.: ‘Speech synthesis and recognition’ (CRC Press, UK, 2001).
    38. Gelbart, D.: ‘Ensemble feature selection for multi-stream automatic speech recognition’. Technical Report No. UCB/EECS-2008-160, University of California at Berkeley, December 2008.
    39.
    40. Morris, A., Bloothooft, G., Barry, W., Andreeva, B., Koreman, J.C.: ‘Human and machine identification of consonantal place of articulation from vocalic transition segments’. EUROSPEECH, 1997.
    41.
    42. Morris, A., Wu, D., Koreman, J.: ‘GMM based clustering and speaker separability in the TIMIT speech database’, IEICE Trans. Fundam. Syst., 2005, 85, pp. 1–8.
    43.
    44. Chi, T.S., Lin, T.H., Hsu, C.C.: ‘Spectro-temporal modulation energy based mask for robust speaker identification’, J. Acoust. Soc. Am., 2012, 131, (5), pp. 368–374.
    45.
    46. Saeidi, R., Mowlaee, P., Kinnunen, T., Tan, Z., Christensen, M., Jensen, H., Franti, P.: ‘Signal-to-signal ratio independent speaker identification for co-channel speech signals’. Proc. IEEE Int. Conf. Pattern Recognition, 2010, pp. 4545–4548.
    47.
    48.
    49. Revathi, A., Ganapathy, R., Venkataramani, Y.: ‘Text independent speaker recognition and speaker independent speech recognition using iterative clustering approach’, Int. J. Comput. Sci. Inf. Technol., 2009, 1, (2), pp. 30–42.
    50. Gomez, P.: ‘A text independent speaker recognition system using a novel parametric neural network’, Int. J. Signal Process., Image Process. Pattern Recognit., 2011, 1, pp. 1–16.
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-bmt.2014.0011