- Authors
-
小橋 豊
藤村 靖
- Publisher
- 一般社団法人 日本音響学会 (Acoustical Society of Japan)
- Journal
- 日本音響学会誌 (The Journal of the Acoustical Society of Japan) (ISSN:03694232)
- Volume, issue, pages and publication date
- vol.11, no.1, pp.24-32, 1955
A simple method for vowel synthesis was devised and found to be useful in physical and phonetic research on speech sounds, especially when employed in combination with sound analysis. Essentially, the system is analogous to a compound filter formed by connecting several single-tuned filters in tandem, except for the noise and distortion introduced by the tape recording used in the process. The actual method is as follows: A source signal, such as a sawtooth wave, is fed into a suitably tuned variable tuned amplifier, and its output is recorded. In the second process this recorded sound, "Step I", is reproduced and fed into the same amplifier, now tuned to a different frequency. The output, the "Step II" sound, is recorded in its turn at the same time. A similar process is repeated several times. The limits of this method in playing the role of a compound filter, the complex frequency characteristics of which can be changed at will, are discussed. An example of the actual stages of formation of the five Japanese vowels from the original sawtooth wave is illustrated by spectra, which are seen to change step by step. The tones were made by cut-and-try, monitoring them at each step and trying to obtain the optimum effect. They were analysed afterwards with the sectioner of the Sonagraph. The tuning data settled empirically in this manner are given, and some of the authors' experiences in making vowel sounds by this method are described. The formant peak positions used coincide generally with those of natural vowels, as analysed by several investigators in this country. One exception is the formant of the Japanese "u" sound, which is reported by some authors as being located at about 1 kc.
This peak, whose appearance depends largely on individual pronunciation, and which the authors could not add in this synthesis experiment without contaminating the vowel quality, has also been found in some spectra of natural vowel sounds analysed by the present authors, and one instructive example is shown. The authors suspect that this discrepancy may be due to the impossibility of adding a very sharp yet small peak with the method in its present form. The vowel quality of the Japanese "u" sound varies very widely, however, depending on the individual speaker and on other factors, and the sound without the 1-kc peak appears as natural to the ear as the one with the peak. As the source vibration simulating the vocal cords, waveforms other than the sawtooth were also used, one example being the sound of a viola picked up by a microphone. The artificial vowel sounds composed with this source appeared very natural when heard as a human voice. The authors believe that this happy result is attributable chiefly to the fluctuating nature of the viola sounds. In this connection, sonagrams of the vowel sounds transformed from the sawtooth and viola sources are illustrated in comparison with those of natural sounds, together with their section patterns. The authors also suggest the possibility that noise components, contained to some extent in the viola sounds, affect the quality of the resultant imitated voice to some degree and help it approach the natural human voice. A sonagram and section patterns of "raw" viola sounds with pitch inflections are also given.
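
The step-by-step filtering described in the abstract (a source passed through one tuned stage per step, each output becoming the input of the next) can be sketched digitally. The following is a minimal Python illustration, not the authors' apparatus: it assumes a two-pole digital resonator as a stand-in for the analog variable tuned amplifier, and an in-memory list in place of each tape recording. The function names, the sampling rate, and the tuning values in `TUNINGS` are hypothetical placeholders and are not taken from the paper's data.

```python
import math

FS = 16000  # sampling rate in Hz (assumed for this sketch)

def sawtooth(f0, duration, fs):
    """Naive (not band-limited) sawtooth source with values in [-1, 1)."""
    n = int(duration * fs)
    return [2.0 * ((f0 * i / fs) % 1.0) - 1.0 for i in range(n)]

def single_tuned(signal, fc, bandwidth, fs):
    """One pass through a two-pole resonator tuned to fc (Hz).

    Digital stand-in for one pass through the tuned amplifier;
    the gain is normalised to unity at 0 Hz.
    """
    r = math.exp(-math.pi * bandwidth / fs)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * fc / fs)
    a2 = -r * r
    gain = 1.0 - a1 - a2
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = gain * x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize(source, tunings, fs):
    """Return [source, Step I, Step II, ...], one entry per filtering pass."""
    steps = [source]
    for fc, bandwidth in tunings:
        # Each pass re-filters the previous "recording" at a new tuning.
        steps.append(single_tuned(steps[-1], fc, bandwidth, fs))
    return steps

def level_at(signal, freq, fs):
    """Magnitude of the signal's projection onto a single frequency."""
    w = 2.0 * math.pi * freq / fs
    re = sum(x * math.cos(w * i) for i, x in enumerate(signal))
    im = sum(x * math.sin(w * i) for i, x in enumerate(signal))
    return math.hypot(re, im) / len(signal)

# Hypothetical (centre frequency, bandwidth) pairs in Hz, for illustration only.
TUNINGS = [(800.0, 80.0), (1200.0, 90.0), (2500.0, 120.0)]

source = sawtooth(100.0, 0.2, FS)
steps = synthesize(source, TUNINGS, FS)

for index, step in enumerate(steps):
    # Level at the first tuning frequency after each step.
    print(index, level_at(step, TUNINGS[0][0], FS))
```

Keeping every intermediate step mirrors the way the abstract describes the spectra changing by steps. Unlike the tape process, this sketch adds no recording noise or distortion, so it models only the idealised compound filter.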