A faithful 3D physiological articulatory model was constructed in previous work to investigate the mechanism of speech production. The model consists of the tongue, hyoid bone, jaw, larynx complex, bilateral piriform fossae, and a vocal tract wall that includes the hard palate, soft palate, and pharyngeal wall. In this study, we generated a series of time-varying vocal tract shapes by simulating vowel-vowel (VV) sequences with the 3D articulatory model and analyzed their acoustic characteristics using the finite-difference time-domain (FDTD) method. Area functions (AF) showed that the vocal tract shape varies smoothly frame by frame, while the transfer functions (TF) transition continuously from vowel to vowel for the first four formants. The acoustic role of the piriform fossae was also examined: they generate two spectral dips around 4 kHz in the transfer function. Pressure distribution patterns at the first four formant frequencies of the vowel /a/ were examined and found to be consistent with previous studies. The experimental results show that the physiological model can describe the details of the vocal tract.
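The link between an area function and its formants can be illustrated with a far simpler model than FDTD. The sketch below is not the paper's method: it is the standard lossless concatenated-tube (chain-matrix) approximation, with an assumed sound speed of 350 m/s, rho*c normalised to 1, and an idealised open-lip boundary, under which the formants are the zeros of the chain matrix's D entry.

```python
import cmath
import math

C_SOUND = 350.0  # assumed speed of sound inside the vocal tract, m/s

def chain_D(areas, lengths, f):
    """Real D entry of the lossless concatenated-tube chain matrix at f (Hz).

    With [P, U] at the glottis = M . [P, U] at the lips and an idealised
    open-lip boundary (P_lips = 0), the volume-velocity transfer function is
    U_lips / U_glottis = 1 / D, so the formants are the zeros of D.
    """
    k = 2.0 * math.pi * f / C_SOUND
    M = [[1 + 0j, 0j], [0j, 1 + 0j]]  # identity chain matrix
    for a, l in zip(areas, lengths):
        c, s = cmath.cos(k * l), cmath.sin(k * l)
        sec = [[c, 1j * s / a], [1j * a * s, c]]  # one tube section (rho*c = 1)
        M = [[M[0][0] * sec[0][0] + M[0][1] * sec[1][0],
              M[0][0] * sec[0][1] + M[0][1] * sec[1][1]],
             [M[1][0] * sec[0][0] + M[1][1] * sec[1][0],
              M[1][0] * sec[0][1] + M[1][1] * sec[1][1]]]
    return M[1][1].real

def formants(areas, lengths, fmax=3400.0, step=5.0):
    """Formant frequencies: zeros of D located by sign change + interpolation."""
    roots = []
    f_prev, d_prev = step, chain_D(areas, lengths, step)
    f = f_prev + step
    while f <= fmax:
        d = chain_D(areas, lengths, f)
        if d_prev * d < 0:  # a sign change brackets a zero of D
            roots.append(f_prev + step * d_prev / (d_prev - d))
        f_prev, d_prev, f = f, d, f + step
    return roots

# A uniform 17.5 cm tube (ten 1.75 cm sections of constant area) approximates
# a neutral vowel: formants near odd multiples of c / (4L) = 500 Hz.
print([round(x) for x in formants([4.0] * 10, [0.0175] * 10)])  # [500, 1500, 2500]
```

Replacing the uniform areas with a sampled area function of a vowel shifts the zeros to that vowel's formants, which is the AF-to-TF relation the abstract refers to.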
A unified 2-mode analysis method is presented for speaker adaptation methods that use bases. Different structures of training matrix produce different sets of bases within the same framework. Two forms of training matrix are investigated. Bases obtained from the same training matrix as eigenvoice span the same space as eigenvoice and hence produce the same recognition result; the new form of training matrix produces bases that span a different space while still capturing the variation of the training speakers. In an isolated-word recognition experiment, the speaker-adapted model using the new bases reduces the error rates of the speaker-independent model and of eigenvoice by about 71% and 26% on average, respectively, for adaptation data longer than 3 s.
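The basis idea behind such adaptation can be sketched in miniature. This toy is not the paper's training-matrix construction: it stacks hypothetical speaker supervectors as rows, extracts a single principal direction by power iteration, and places a new speaker at the mean plus a projection-derived weight along that basis.

```python
import math
import random

def leading_basis(rows, iters=100, seed=0):
    """Leading principal direction of the rows (one supervector per training
    speaker), found by power iteration on the centred scatter matrix."""
    dim = len(rows[0])
    mean = [sum(r[d] for r in rows) / len(rows) for d in range(dim)]
    centred = [[r[d] - mean[d] for d in range(dim)] for r in rows]
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(dim)]  # random positive start vector
    for _ in range(iters):
        # Multiply v by X^T X without forming the scatter matrix explicitly.
        proj = [sum(c[d] * v[d] for d in range(dim)) for c in centred]
        v = [sum(p * c[d] for p, c in zip(proj, centred)) for d in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return mean, v

def adapt(mean, basis, obs):
    """One-basis adaptation: mean plus a learned weight times the basis
    (here the weight is a simple projection of the observation)."""
    w = sum((o - m) * b for o, m, b in zip(obs, mean, basis))
    return [m + w * b for m, b in zip(mean, basis)]

# Toy speaker supervectors lying along one direction of inter-speaker variation.
speakers = [[0.0, 0.0], [0.6, 0.8], [1.2, 1.6], [-0.6, -0.8]]
mean, basis = leading_basis(speakers)
print(adapt(mean, basis, [1.2, 1.6]))
```

Different choices of the training matrix (the rows fed to the eigen-analysis) yield different bases, which is the distinction the abstract draws between the eigenvoice-style matrix and the new form.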
An efficient beamforming scheme for wireless binaural hearing aids is proposed that provides a trade-off between the transmission bit rate and the amount of noise reduction. Only the low-frequency part of the signal is transmitted from one hearing aid to the other and used in a binaural beamformer to generate the low-frequency part of the output; the high-frequency part is generated by a monaural beamformer using only the locally available microphone signals. The trade-off is controlled by adjusting the cutoff frequency of the lowpass filter. For speech sources with an 8 kHz bandwidth in the presence of an interfering source, it is shown that good performance can be achieved with a cutoff frequency of 4 kHz.
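The band-split structure can be sketched as follows. This is a minimal illustration, not the proposed beamformer: the "binaural" stage is a plain two-signal average, the lowpass is a windowed-sinc FIR, and the local high band is formed as the complement of the local low band.

```python
import math

def lowpass_fir(cutoff_hz, fs_hz, numtaps=101):
    """Windowed-sinc (Hamming) lowpass filter taps."""
    fc = cutoff_hz / fs_hz
    mid = (numtaps - 1) / 2.0
    taps = []
    for n in range(numtaps):
        x = n - mid
        h = 2.0 * fc if x == 0 else math.sin(2.0 * math.pi * fc * x) / (math.pi * x)
        w = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (numtaps - 1))
        taps.append(h * w)
    return taps

def fir(sig, taps):
    """Centred FIR convolution (zero-padded edges, no group delay)."""
    half = len(taps) // 2
    out = []
    for n in range(len(sig)):
        acc = 0.0
        for k, t in enumerate(taps):
            i = n - k + half
            if 0 <= i < len(sig):
                acc += t * sig[i]
        out.append(acc)
    return out

def band_split_output(left, right, cutoff_hz, fs_hz):
    """Low band from a two-signal average ('binaural' stage); high band
    taken unchanged from the local signal ('monaural' stage)."""
    taps = lowpass_fir(cutoff_hz, fs_hz)
    low_binaural = fir([0.5 * (l + r) for l, r in zip(left, right)], taps)
    high_local = [x - y for x, y in zip(left, fir(left, taps))]
    return [a + b for a, b in zip(low_binaural, high_local)]
```

Raising `cutoff_hz` widens the band that must be transmitted to the other ear (more bits, more binaural noise reduction); lowering it does the opposite, which is the trade-off the abstract describes.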
This paper details the signal processing techniques used to produce novel speech metrics for the quantitative assessment of dysarthric speech. Several different processing methods are used to produce three measures designed to aid speech therapists and patients alike in therapy and in the assessment of speech quality. The measures also have potential for diagnosis of the condition and for tracking trends in speech.
A system for remotely detecting vocal fold pathologies using telephone-quality speech is presented. Using VoiceXML, 631 clean speech files of the sustained phonation of the vowel /a/ (58 normal subjects, 573 pathologic) from the Disordered Voice Database Model 4337 were transmitted over telephone channels to produce a test corpus. Pitch perturbation features, amplitude perturbation features, and a set of harmonic-to-noise ratio measures were extracted from the clean and transmitted speech files. These feature sets were used to train and test automatic classifiers based on linear discriminant analysis, with cross-validation employed to measure classifier performance. While a sustained phonation can be classified as normal or pathologic with accuracy greater than 90% from clean speech, the results indicate that telephone-quality speech can be classified as normal or pathologic with an accuracy of 74.15%. Amplitude perturbation features proved most robust to channel transmission. This study highlights the real possibility of remote diagnosis of voice pathology.
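Pitch and amplitude perturbation features of this kind are commonly computed as "local jitter" and "local shimmer". A minimal sketch with the generic definitions (not necessarily the exact variants used in the study):

```python
def jitter_local(periods):
    """Local jitter: mean absolute difference between consecutive pitch
    periods, normalised by the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def shimmer_local(amplitudes):
    """Local shimmer: the same ratio computed on per-cycle peak amplitudes."""
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    return (sum(diffs) / len(diffs)) / (sum(amplitudes) / len(amplitudes))

# A perfectly steady phonation has zero jitter; alternating periods do not.
print(jitter_local([0.010] * 10))                 # 0.0
print(round(jitter_local([0.010, 0.011] * 5), 4))  # 0.0952
```

Both features are ratios, which makes them insensitive to overall gain; the abstract's finding that amplitude perturbation survived the telephone channel best is consistent with that normalisation.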
A new quantitative scheme for signal typing of pathological human voices is presented, based on nonlinear dynamic analysis. The analysis shows that correlation dimension reveals significant differences among the types of voice signals: nearly periodic type 1 signals, type 2 signals containing bifurcations or modulations, and aperiodic type 3 signals. The correlation dimension increases statistically significantly from type 1 to type 3 signals. This study suggests that nonlinear dynamic analysis is a valuable new method for quantitatively classifying pathological human voice signals.
A voice activity detector (VAD) based on a complex Laplacian model is proposed. The likelihood ratio derived from the Laplacian model is computed and then applied to the VAD decision. Experimental results show that the Laplacian statistical model is more effective for the VAD algorithm than the Gaussian model.
Nonlinear dynamic methods are employed to describe the complexity of speech from healthy subjects and from pathological subjects with vocal polyps. The analysis demonstrates the low-dimensional dynamic characteristics of normal and pathological voices, as well as statistically significant differences between them. A potential clinical application of nonlinear dynamics to speech signal processing of pathological voices is suggested.
Effective quality of service (QoS) metrics for describing the performance of telecommunications networks are increasingly vital. Speech quality is a major contributor to users' perception of QoS, and the ability to design for and monitor this quality is paramount. The authors describe work towards a non-intrusive speech quality assessment algorithm capable of predicting the speech quality received by a customer from the in-service signal. Modern telecommunications networks contain complex nonlinear elements that cannot be assessed with traditional engineering metrics. A novel use of vocal-tract modelling techniques is described, which enables predictions of the quality of a network-degraded speech stream. Details of the algorithm's adaptation to different talker characteristics are presented, together with a summary of the system's performance.
The problem of subjective, as opposed to objective, analysis of glottal waveforms obtained by inverse filtering the acoustic waveform is discussed. It is shown that a dynamical-system phase-plane plot can be used to view the residual resonance characteristics and hence to assess the quality of the glottal waveform.
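A phase-plane plot pairs the waveform with an estimate of its time derivative. A minimal sketch using a central difference:

```python
import math

def phase_plane(x, fs):
    """Phase-plane trajectory: pairs (x[n], dx/dt at n), with the derivative
    estimated by a central difference."""
    return [(x[n], (x[n + 1] - x[n - 1]) * fs / 2.0)
            for n in range(1, len(x) - 1)]

# For a pure sinusoid the trajectory is an ellipse; a well-recovered glottal
# waveform traces a clean closed loop per cycle, while residual formant
# ripple from imperfect inverse filtering shows up as small superimposed loops.
fs, f0 = 8000, 100.0
tone = [math.sin(2.0 * math.pi * f0 * n / fs) for n in range(200)]
orbit = phase_plane(tone, fs)
```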
Since the 1950s, several experiments have been run to evaluate the benefit of lip-reading for speech intelligibility, all presenting a natural face speaking at different levels of background noise. In this paper, we present a similar experiment run with French stimuli. Experiments run by McGrath (1985) and then by Summerfield et al. (1989) showed that the lips carry more than half the visual information provided by the whole face of an English speaker, and that vision of the teeth somewhat increases the intelligibility of a message. Similar experiments have been carried out in French at the Institut de la Communication Parlée. We compared the overall performance of normal hearers in audio-visual intelligibility tests where the visual displays were a natural face (Benoit et al., 1992), natural lips alone (Le Goff et al., 1995), and a set of 3D parametric models of the main components of a speaker's face: the lips, the jaw, and the skin (Guiard-Marigny et al., 1995). The same parameters used to animate our synthetic models of the face were measured on the same corpus to evaluate the performance of an HMM classifier in an identification task analogous to that performed by the human subjects (Adjoudani and Benoit, 1996). Overall results are also presented. (6 pages)
The authors deal with the problem of automatic speech recognition in the presence of additive white noise. The effect of noise is modelled as an additive term in the power spectrum of the original clean speech, and the cepstral coefficients of the noisy speech are derived from this model. The reference cepstral vectors trained on clean speech are adapted to their appropriate noisy versions to best fit the test speech cepstral vector. The LPC coefficients, LPC-derived cepstral coefficients, and the distance between test and reference are all regarded as functions of the noise ratio (the spectral power ratio of noise to noisy speech). A gradient-based algorithm is proposed to find the optimal noise ratio as well as the minimum distance between the test cepstral vector and the noise-adapted reference. A recursive algorithm based on the Levinson-Durbin recursion is proposed to simultaneously calculate the LPC coefficients and their derivatives with respect to the noise ratio. The stability of the proposed adaptation algorithm is also addressed. Experiments on multispeaker (50 male and 50 female) isolated Mandarin digit recognition demonstrate remarkable performance improvements over the non-compensated method in noisy environments. The results are also compared with the projection-based approach; experiments show that the proposed method is superior in severely noisy environments.
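The Levinson-Durbin recursion mentioned above solves the LPC normal equations from the autocorrelation sequence. A standard sketch of the basic recursion (without the noise-ratio derivatives the paper adds):

```python
def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelations r[0..order].

    Returns (a, err): predictor coefficients, x[n] ~ sum_j a[j] * x[n-1-j]
    with a 0-indexed, and the final prediction-error power.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for model order i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err
```

For a first-order autoregressive signal with normalised autocorrelation r[k] = 0.9^k, the recursion recovers the single true coefficient 0.9, a vanishing second coefficient, and error power 1 - 0.81 = 0.19.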