- Acoustical Society of Japan (一般社団法人 日本音響学会)
- Journal of the Acoustical Society of Japan (日本音響学会誌) (ISSN: 0369-4232)
- vol.33, no.9, pp.470-479, 1977-09-01 (Released:2017-06-02)
This paper describes a speech analysis method that estimates linear predictor coefficients, formant frequencies, and formant bandwidths from a portion of one pitch period of the speech waveform. The method rests on the fact that estimation errors vanish together with prediction errors when excitation-free segments are used (Eq. 6). In the analysis of actual speech, however, prediction errors may remain even when they are minimized (Fig. 1). It is therefore necessary to formulate the estimation errors in order to evaluate the performance of the proposed method. A theoretical study of the estimation errors is carried out. The estimation errors in the linear predictor coefficients depend on the unknown component of the excitation (Eqs. 7 through 11). Their expected values are derived under an assumption about the statistical properties of the excitation (Eqs. 13 through 15). A similar expression for the estimation errors in formant frequencies and bandwidths is derived in the same manner (Eqs. 16 through 18). Finally, two kinds of error estimates, expressed in terms of observed values, are introduced as criteria for determining the analysis conditions of the proposed method and for comparing it experimentally with the usual linear predictive analysis (Eq. 19). Simulation studies of these error estimates are performed using several kinds of synthetic speech, whose formant frequencies and bandwidths are given in Table 1. The relevance of the error estimates is shown both for the proposed method and for the usual linear predictive analysis (Figs. 2 through 5 and Tables 2 and 3). When the error estimates are used to evaluate the usual linear predictive analysis of periodic speech, the length of the analysis segment must be chosen properly (Eq. 20 and Fig. 6). The simulation results indicate that the error estimates are both adequate and valid.
Experimental studies were carried out to evaluate the performance of the proposed method on actual speech. The minimum values of the error estimates for the proposed method are as small as one half to one eighth of those for the usual linear predictive analysis (Figs. 7 through 9 and Tables 4 and 5). In addition, the glottal source waveform estimated from the result of the proposed method seems more plausible than that obtained from the result of the usual linear predictive analysis (Fig. 10). These results indicate that the proposed method yields more accurate parameter estimates than the usual linear predictive analysis, especially for low-pitched speech.
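
The central claim, that estimation errors vanish when the analysis segment is excitation-free, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the paper's equations are not reproduced in this abstract, so the sketch uses a generic covariance-type least-squares linear predictor. It also assumes an idealized source made of two unit pulses, and a hypothetical three-formant synthetic vowel. All function names, the 10 kHz sampling rate, the formant values, and the segment positions are illustrative assumptions.

```python
import numpy as np

def resonator_polynomial(formants_hz, bandwidths_hz, fs):
    """Denominator A(z) of an all-pole filter with the given resonances."""
    a = np.array([1.0])
    for f, b in zip(formants_hz, bandwidths_hz):
        r = np.exp(-np.pi * b / fs)          # pole radius from bandwidth
        theta = 2.0 * np.pi * f / fs         # pole angle from frequency
        a = np.convolve(a, [1.0, -2.0 * r * np.cos(theta), r * r])
    return a

def allpole_filter(a, excitation):
    """s[n] = e[n] - sum_{k>=1} a[k] * s[n-k], with zero initial state."""
    s = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, min(len(a), n + 1)):
            acc -= a[k] * s[n - k]
        s[n] = acc
    return s

def covariance_lpc(s, start, length, order):
    """Unwindowed least-squares predictor fitted on s[start:start+length].

    Requires start >= order so that every row has `order` past samples.
    """
    rows = np.array([s[n - order:n][::-1] for n in range(start, start + length)])
    target = s[start:start + length]
    coeffs, *_ = np.linalg.lstsq(rows, target, rcond=None)
    return np.concatenate(([1.0], -coeffs))

def formants_from_lpc(a, fs):
    """Formant frequencies and bandwidths (Hz) from the roots of A(z)."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]        # one root per conjugate pair
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bws = -np.log(np.abs(roots)) * fs / np.pi
    idx = np.argsort(freqs)
    return freqs[idx], bws[idx]

# Hypothetical synthetic vowel (assumed values, not taken from Table 1).
fs = 10000.0
true_f = [500.0, 1500.0, 2500.0]
true_bw = [60.0, 90.0, 120.0]
a_true = resonator_polynomial(true_f, true_bw, fs)

# Idealized excitation: unit pulses at n = 0 and n = 80 (8 ms pitch period).
excitation = np.zeros(160)
excitation[[0, 80]] = 1.0
s = allpole_filter(a_true, excitation)

order = 6
# Excitation-free segment: prediction targets n = 86..125, none hit a pulse.
a_free = covariance_lpc(s, start=86, length=40, order=order)
# Segment whose prediction targets n = 60..99 include the pulse at n = 80.
a_span = covariance_lpc(s, start=60, length=40, order=order)

err_free = np.max(np.abs(a_free - a_true))
err_span = np.max(np.abs(a_span - a_true))
f_est, bw_est = formants_from_lpc(a_free, fs)

print("coefficient error, excitation-free segment:", err_free)
print("coefficient error, segment spanning a pulse:", err_span)
print("formants (Hz):", f_est, "bandwidths (Hz):", bw_est)
```

In this idealized setting the excitation-free segment recovers the predictor coefficients up to rounding, while the segment that spans a pulse does not. Real speech has no exactly excitation-free interval, which is why the paper goes on to formulate error estimates for choosing the analysis conditions.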