Speaker identification using multimodal neural networks and wavelet analysis (open access)

References

    1. Pawar, R.V., Kajave, P.P., Mali, S.N.: ‘Speaker identification using neural networks’. Proc. World Academy of Science, Engineering and Technology, 2005, no. 7, pp. 429–433.
    2. Rabiner, L., Juang, B.H.: ‘Fundamentals of speech recognition’ (Prentice-Hall, 1993).
    3. Kinsner, W., Peters, D.: ‘A speech recognition system using linear predictive coding and dynamic time warping’. Proc. Annual Int. Conf. IEE, Engineering in Medicine & Biology Society, New Orleans, LA, 4–7 November 2006, no. 3, pp. 1070–1071.
    4. Benesty, J., Sondhi, M., Huang, Y.: ‘Springer handbook of speech processing’ (Springer, 2007).
    5. Abdalla, M.I., Ali, H.S.: ‘Wavelet-based Mel-frequency cepstral coefficients for speaker identification using hidden Markov models’, J. Telecommun., 2010, 1, (2), pp. 16–21.
    6. Suvarna Kumar, G., Prasad Raju, K.A., Rao, M., et al.: ‘Speaker recognition using GMM’, Int. J. Eng. Sci. Technol., 2010, 2, (6), pp. 2428–2436.
    7. Kekre, H.B., Kulkarni, V.: ‘Speaker identification by using vector quantization’, Int. J. Eng. Sci. Technol., 2010, 2, (5), pp. 1325–1331.
    10. Shukla, A., Tiwari, R., Hemant Kumar, M., Kala, R.: ‘Speaker identification using wavelet analysis and modular neural networks’, J. Acoust. Soc. India (JASI), 2009, 36, (1), pp. 14–19.
    11. Revada, L.K.V., Rambatla, V.K., Ande, K.V.N.: ‘A novel approach to speech recognition by using generalised regression neural networks’, IJCSI Int. J. Comput. Sci. Issues, 2011, 1, pp. 483–489.
    13. Hall, D.L., Llinas, J.: ‘Handbook of multi-sensor data fusion’ (CRC Press, UK, 2011).
    14. Ross, A., Jain, A.: ‘Information fusion in biometrics’, Pattern Recognit. Lett., 2003, 24, (3), pp. 2115–2125.
    16. Chetty, G., Wagner, M.: ‘Audio visual speaker verification based on hybrid fusion of cross modal features’, in Pattern Recognition and Machine Intelligence (Springer, Berlin, 2007).
    17. Chetty, G., Wagner, M.: ‘Investigating feature-level fusion for checking liveness in face-voice authentication’. Int. Symp. on Signal Processing and its Applications, 2005, vol. 1.
    18. Arora, S., Bhattacharjee, D., Nasipuri, M., Malik, L., Kundu, M., Basu, D.K.: ‘Performance comparison of SVM and ANN for handwritten Devnagari character recognition’, IJCSI Int. J. Comput. Sci., 2010, 7, (3), pp. 1–10.
    20. Mallat, S.: ‘A wavelet tour of signal processing’ (Elsevier, UK, 1999).
    22. Vetterli, M., Kovacevic, J.: ‘Wavelets and subband coding’ (Prentice-Hall, New Jersey, 1995).
    24. Deshpande, M.S., Holambe, R.S.: ‘Speaker identification using admissible wavelet packet based decomposition’, Int. J. Inf. Commun. Eng., 2011, 6, (1), pp. 20–23.
    28. Amrouche, A., Rouvaen, J.: ‘Efficient system for speech recognition using general regression neural network’, Int. J. Intell. Technol., 2006, 1, (2), pp. 183–189.
    30. Ye, J.: ‘Speech recognition using time domain features from phase space reconstructions’. PhD thesis, Marquette University, Milwaukee, Wisconsin, 2004.
    32. Wilpon, J.G., Lee, C.H., Rabiner, L.R.: ‘Improvements in connected digit recognition using higher order spectral and energy features’. Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Toronto, Canada, 1991.
    33. Rottland, J., Neukirchen, C., Willett, D., Rigoll, G.: ‘Large vocabulary speech recognition with context dependent MMI-connectionist/HMM systems using the WSJ database’. EUROSPEECH, 1997.
    34. Hamzah, R., Jamil, N., Seman, N.: ‘Filled pause classification using energy-boosted Mel-frequency cepstrum coefficients’. Proc. Int. Conf. on Robotic, Vision, Signal Processing & Power Applications, 2014, pp. 311–319.
    35. ‘The GRID audio corpus for speech recognition’. Available at http://www.dcs.shef.ac.uk/spandh/gridcorpus.
    37. Holmes, W.: ‘Speech synthesis and recognition’ (CRC Press, UK, 2001).
    38. Gelbart, D.: ‘Ensemble feature selection for multi-stream automatic speech recognition’. Technical Report No. UCB/EECS-2008-160, University of California at Berkeley, December 2008.
    40. Morris, A., Bloothooft, G., Barry, W., Andreeva, B., Koreman, J.C.: ‘Human and machine identification of consonantal place of articulation from vocalic transition segments’. EUROSPEECH, 1997.
    42. Morris, A., Wu, D., Koreman, J.: ‘GMM based clustering and speaker separability in the TIMIT speech database’, IEICE Trans. Fundam. Syst., 2005, 85, pp. 1–8.
    44. Chi, T.S., Lin, T.H., Hsu, C.C.: ‘Spectro-temporal modulation energy based mask for robust speaker identification’, J. Acoust. Soc. Am., 2012, 131, (5), pp. 368–374.
    46. Saeidi, R., Mowlaee, P., Kinnunen, T., Tan, Z., Christensen, M., Jensen, H., Franti, P.: ‘Signal-to-signal ratio independent speaker identification for co-channel speech signals’. Proc. IEEE Int. Conf. Pattern Recognition, 2010, pp. 4545–4548.
    49. Revathi, A., Ganapathy, R., Venkataramani, Y.: ‘Text independent speaker recognition and speaker independent speech recognition using iterative clustering approach’, Int. J. Comput. Sci. Inf. Technol., 2009, 1, (2), pp. 30–42.
    50. Gomez, P.: ‘A text independent speaker recognition system using a novel parametric neural network’, Int. J. Signal Process., Image Process. Pattern Recognit., 2011, 1, pp. 1–16.
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-bmt.2014.0011