Authors
小橋 豊 藤村 靖
Publisher
一般社団法人 日本音響学会
Journal
日本音響学会誌 (ISSN: 0369-4232)
Volume, issue, pages, and publication date
vol. 11, no. 1, pp. 24-32, 1955

A simple method for vowel synthesis was devised and found to be useful in physical and phonetic research on speech sounds, especially when combined with sound analysis. Essentially, the system is analogous to a compound filter formed by connecting several single-tuned filters in tandem, except that the use of tape recording introduces noise and distortion into the process. The actual method is as follows. A source signal, such as a sawtooth wave, is fed into a suitably tuned variable tuned amplifier, and its output is recorded. In the second process this recorded sound, the "Step I" sound, is reproduced and fed into the same amplifier, now tuned to a different frequency. The output, the "Step II" sound, is recorded in turn at the same time. A similar process is repeated several times. The limits of this method as a compound filter whose complex frequency characteristics can be changed at will are discussed. An example of the actual stages in forming the five Japanese vowels from the original sawtooth wave is illustrated by spectra, which are seen to change step by step. The tones were made by trial and error, monitoring them at each step and trying to obtain the optimum effect. They were analysed afterwards with the sectioner of a Sonagraph. The tuning data settled empirically in this manner are given, and some of the authors' experiences in making vowel sounds by this method are described. The formant peak positions used generally coincide with those of natural vowels as analysed by several investigators in this country. One exception is a formant of the Japanese "u" sound, which some authors report as being located at about 1 kc.
This 1 kc peak, whose appearance depends largely on individual pronunciation, and which the authors could not add in this synthesis experiment without contaminating the vowel quality, has also been found in some spectra of natural vowel sounds analysed by the present authors, and one instructive example is shown. The authors suspect that this discrepancy may have been due to the impossibility of adding a very sharp and yet small peak with the method in its present form. The vowel quality of the Japanese "u" sound varies very widely, however, depending on the individual speaker and on other factors, and the sound without the 1 kc peak appears as natural to the ear as the one with the peak. As the source vibration simulating the vocal cords, waveforms other than the sawtooth were also used, one example being the sound of a viola picked up by a microphone. The artificial vowel sounds composed with this source appeared very natural when heard as a human voice. The authors believe that this favorable result is attributable chiefly to the fluctuating nature of the viola sounds. In this connection, sonagrams of the vowel sounds derived from the sawtooth and viola sources are illustrated in comparison with those of natural sounds, together with their section patterns. The authors also suggest the possibility that noise components, contained to some extent in the viola sounds, affect the quality of the resultant imitated voice to some degree and help it approach the natural human voice. A sonagram and section patterns of "raw" viola sounds with pitch inflections are also given.
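The step-by-step procedure in the abstract above can be sketched digitally as a cascade of resonators applied one after another to a sawtooth source. This is only a rough illustration under stated assumptions, not the authors' analogue apparatus: the two-pole digital resonator stands in for the tuned amplifier, the sampling rate, fundamental, centre frequencies and bandwidths are placeholder values rather than the tunings reported in the paper, and tape noise and distortion are not modelled.

```python
import math

def sawtooth(f0, fs, n):
    """Naive sawtooth source in [-1, 1) with fundamental f0 (Hz)."""
    return [2.0 * ((f0 * i / fs) % 1.0) - 1.0 for i in range(n)]

def resonator(x, fc, bw, fs):
    """One 'single-tuned' stage: two-pole resonator at fc with bandwidth bw (Hz)."""
    r = math.exp(-math.pi * bw / fs)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * fc / fs)
    a2 = -r * r
    g = 1.0 - a1 - a2          # normalise to unity gain at 0 Hz
    y1 = y2 = 0.0
    out = []
    for s in x:
        y = g * s + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize(source, tunings, fs):
    """Return [source, Step I, Step II, ...]; each step refilters the previous one."""
    steps = [source]
    for fc, bw in tunings:
        steps.append(resonator(steps[-1], fc, bw, fs))
    return steps

def level_at(x, f, fs):
    """Magnitude of the component of x at frequency f (single-bin Fourier sum)."""
    re = sum(s * math.cos(2.0 * math.pi * f * i / fs) for i, s in enumerate(x))
    im = sum(s * math.sin(2.0 * math.pi * f * i / fs) for i, s in enumerate(x))
    return math.hypot(re, im) / len(x)

# Hypothetical settings, chosen only for illustration.
FS = 16000
source = sawtooth(100.0, FS, 1600)
steps = synthesize(source, [(700.0, 80.0), (1100.0, 90.0)], FS)
for k, step in enumerate(steps):
    print(k, round(level_at(step, 700.0, FS), 4), round(level_at(step, 1100.0, FS), 4))
```

Keeping every intermediate step mirrors the way the recorded "Step I", "Step II" sounds were monitored and analysed one at a time; comparing `level_at` across steps plays the role of the section spectra.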
Authors
颯田 琴次 菅原 淳夫 小橋 豊
Publisher
一般社団法人 日本音響学会
Journal
日本音響学会誌 (ISSN: 0369-4232)
Volume, issue, pages, and publication date
vol. 8, no. 2, pp. 97-103, 1952-06-30 (Released: 2017-06-02)

The authors studied body wall vibration (especially at the back of the nose) during phonation, using a crystal pick-up, to examine the correlation among the amplitude of the wall vibration (μ), the voice pressure (μbar) and the quantity of nasal respiration (cc). The ratio of the amplitude of wall vibration to the voice pressure, which the authors named the "vibration proportion", was found useful for determining the degree of nasalization. Japanese vowels and nasal sounds uttered by 40 persons (9 male, 12 female, 5 singers and 14 patients with nasal speech) were compared. (A) Normal persons: 1. Among speech sounds, the nasal sounds have the greatest amplitude of wall vibration. The vowels "i" and "u" have greater amplitudes than the other vowels, and "a" has the smallest. 2. On average, the amplitude is greater in the male voice than in the female voice. In the male voice the amplitude is great in the low vocal range, and in the female voice it is great in the high vocal range. This is of interest when considering vocal differences between the sexes. 3. When the nasopharynx is closed, the amplitude is smaller than when it is open, and this fact agrees with the degree of nasalization found by acoustical examination. 4. The measured values sometimes change slightly for the same person (with the same vocal range, voice intensity and speech sound). Accordingly, articulation is found not to be always constant. 5. The vibration of the nose wall increases with nasalization, and the relation between them, which is one of the main objects of this study, is made clearer by the "vibration proportion". Namely, it is possible to know the degree of nasalization of a person from the "vibration proportion". (B) Singers (singing voice): The differences between speech sounds generally observed in normal persons decrease in singers, and the vibration becomes more uniform. It is no wonder that this is the result of training in singing technique.
(C) Patients with nasal speech: The amplitude of nose wall vibration is larger than the normal range (sometimes several times larger). Therefore it is possible to know the degree of nasal speech by measuring the wall vibration.
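As a small worked illustration of the "vibration proportion" described in the abstract above, the ratio can be written as a one-line function. This is a sketch under assumptions: the abstract does not state which quantity is the numerator, so wall amplitude divided by voice pressure is assumed here (so that the value rises with nasalization), and the sample numbers are invented for illustration, not measurements from the paper.

```python
def vibration_proportion(wall_amplitude_um, voice_pressure_ubar):
    """Assumed definition: wall vibration amplitude (microns) / voice pressure (microbar)."""
    if voice_pressure_ubar <= 0:
        raise ValueError("voice pressure must be positive")
    return wall_amplitude_um / voice_pressure_ubar

# Hypothetical readings taken at the same voice pressure.
oral_vowel = vibration_proportion(1.5, 3.0)
nasal_sound = vibration_proportion(6.0, 3.0)
print(oral_vowel, nasal_sound)
```

Dividing by the voice pressure is what lets readings taken at different loudness levels be compared on one scale.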