Estimation of glottal closure instants by considering speech signal as a spectrum

Close to glottal closure instants (GCIs), the speech signal is expected to change its amplitude rapidly and, at the GCIs themselves, it is expected to have strong negative peaks. A novel algorithm that exploits these two properties for the estimation of GCIs is presented. Here, a symmetrised speech segment is assumed to be the Fourier transform (FT) of an even function. In such a case, the strong negative peaks in the symmetrised speech segment at the locations of the GCIs correspond to zeros that lie considerably outside the unit circle in the z-plane. The group delay spectrum of the time-domain signal derived by taking the inverse FT of this assumed FT is expected to take a value close to −2π at the angular locations of these zeros. When the frequency scale is mapped to the time scale, the frequency bins at which the group delay reaches −2π correspond to the locations of the GCIs. A theoretical justification for the proposed approach is also presented by defining a novel function called the conditional group delay function. A systematic evaluation on the CMU Arctic database shows that the proposed technique performs better than the DYPSA, ZFF and YAGA algorithms and close to SEDREAMS.
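The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm and does not implement the conditional group delay function, whose definition is not given here. It only checks two standard facts the approach relies on: a circularly even real sequence has a real inverse FT, and a single zero outside the unit circle makes the unwrapped phase wind by a full turn, with the steepest change at the angle of the zero. The function names, the test zero at radius 1.5 and the first-order system H(z) = 1 − z₀z⁻¹ are assumptions chosen for illustration, and the sign of the 2π winding depends on the group delay convention used.

```python
import numpy as np


def symmetrise(segment):
    """Mirror a segment into a circularly even sequence of length 2N - 2.

    Hypothetical helper: the abstract does not say how the symmetrised
    segment is built, so this mirroring is an assumption.
    """
    seg = np.asarray(segment, dtype=float)
    # Append samples N-2 ... 1 in reverse so that y[k] == y[-k mod M].
    return np.concatenate([seg, seg[-2:0:-1]])


def winding_and_location(zero, n_bins=4096):
    """Phase winding and steepest-phase angle for H(z) = 1 - zero * z^-1.

    Returns the total unwrapped phase change over one trip around the
    unit circle and the angular frequency (radians) where the phase
    falls fastest.
    """
    omega = 2.0 * np.pi * np.arange(n_bins + 1) / n_bins
    h = 1.0 - zero * np.exp(-1j * omega)
    phase = np.unwrap(np.angle(h))
    winding = phase[-1] - phase[0]
    # The most negative bin-to-bin phase step marks the steepest descent.
    steepest = omega[np.argmin(np.diff(phase))]
    return winding, steepest


# Illustrative values only: a random segment and a zero at angle 1 rad.
rng = np.random.default_rng(0)
even_seq = symmetrise(rng.standard_normal(64))
time_signal = np.fft.ifft(even_seq)      # treat the sequence as a spectrum

outside = winding_and_location(1.5 * np.exp(1j * 1.0))
inside = winding_and_location(0.5 * np.exp(1j * 1.0))
```

In this sketch `time_signal` has a negligible imaginary part, the zero outside the unit circle yields a winding of magnitude 2π with the steepest phase change at its own angle, and the zero inside the unit circle yields no net winding. Converting the angle of the steepest change back to a bin index is the frequency-to-time mapping the abstract refers to.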

Inspec keywords: medical signal processing; bioelectric potentials; speech processing; electroencephalography; Fourier transforms

Other keywords: symmetrised speech segment; ZFF; DYPSA; inverse FT; strong negative peaks; z-plane; conditional group delay function; CMU arctic database; mapping frequency scale; time-domain signal; Fourier transform; SEDREAMS; speech signal; YAGA; glottal closure instant estimation; group delay spectrum; frequency bins

Subjects: Function theory, analysis; Bioelectric signals; Electrical activity in neurophysiological processes; Digital signal processing; Electrodiagnostics and other electrical measurement techniques; Integral transforms in numerical analysis; Biology and medical computing; Speech processing techniques; Speech and audio signal processing; Signal processing and detection

http://iet.metastore.ingenta.com/content/journals/10.1049/el.2014.4444