Authors
Masanori MORISE, Fumiya YOKOMORI, Kenji OZAWA
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE Transactions on Information and Systems (ISSN:09168532)
Volume, issue, pages, and publication date
vol.E99.D, no.7, pp.1877-1884, 2016-07-01 (Released:2016-07-01)
Number of references
38
Number of citations
91 537

A vocoder-based speech synthesis system, named WORLD, was developed in an effort to improve the sound quality of real-time applications using speech. Speech analysis, manipulation, and synthesis on the basis of vocoders are used in various kinds of speech research. Although several high-quality speech synthesis systems have been developed, real-time processing has been difficult with them because of their high computational costs. This new speech synthesis system offers not only high sound quality but also fast processing. It consists of three analysis algorithms and one synthesis algorithm proposed in our previous research. The effectiveness of the system was evaluated by comparing its output against natural speech including consonants. Its processing speed was also compared with those of conventional systems. The results showed that WORLD was superior to the other systems in terms of both sound quality and processing speed. In particular, it was over ten times faster than the conventional systems, and the real-time factor (RTF) indicated that it was fast enough for real-time processing.
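The abstract does not define the real-time factor. As a minimal sketch, assuming the common convention that RTF is processing time divided by the duration of the processed audio (so a value below 1 means faster than real time), it could be measured as follows. The function names and the `process` callable are hypothetical and are not part of WORLD.

```python
import time


def real_time_factor(processing_seconds, audio_seconds):
    """Return processing time divided by audio duration (assumed RTF convention)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(process, samples, sample_rate):
    """Time one call of a hypothetical `process(samples)` and return its RTF."""
    start = time.perf_counter()
    process(samples)
    elapsed = time.perf_counter() - start
    # Audio duration in seconds is the number of samples over the sampling rate.
    return real_time_factor(elapsed, len(samples) / sample_rate)
```

Under this convention, a system that needs 0.5 s to process 2 s of speech has an RTF of 0.25.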
Authors
Kenji OZAWA, Yoshihiro CHUJO, Yôiti SUZUKI, Toshio SONE
Publisher
Japan Society of Kansei Engineering
Journal
KANSEI Engineering International (ISSN:13451928)
Volume, issue, pages, and publication date
vol.3, no.4, pp.25-30, 2002 (Released:2010-06-28)
Number of references
8
Number of citations
7 11

A psychoacoustical experiment was conducted to examine the basic characteristics of auditory presence as a first step toward a full understanding of the multi-modal sensation of presence. Twenty-three sounds were binaurally recorded using a dummy head and presented to four subjects over headphones. The experiment was carried out using Scheffé's paired comparison method as modified by Ura, with a five-category scale. Analysis of variance revealed that auditory presence, as the main effect, was highly significant. As a result of scaling, the sounds of moving sound sources were evaluated as having a higher presence than the sounds of stationary sound sources. Among the stationary sources, sounds with larger sound images were rated as having a relatively higher presence. The effects other than the main effect, and the interactions between them, were also significant, however. In the interactions, the individual differences for the main effect tended to be explained by differences in the weights that the subjects placed on the psychological factors involved in auditory presence.
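The scaling step can be illustrated with a minimal sketch of the main-effect estimate in Ura's variation of Scheffé's method, in which every subject judges every ordered pair. It assumes a data layout `scores[k][i][j]` holding the score subject k gave when stimulus i was presented before stimulus j, with the five categories coded -2 to 2 and positive values favouring the first stimulus. The layout, coding, and function name are assumptions for illustration, not details taken from the paper.

```python
def ura_scale_values(scores):
    """Estimate a scale value for each stimulus from ordered paired comparisons.

    scores[k][i][j] is the score subject k gave to stimulus i over stimulus j
    when i was presented first (diagonal entries are ignored).
    """
    n_subjects = len(scores)
    t = len(scores[0])  # number of stimuli
    values = []
    for i in range(t):
        others = [j for j in range(t) if j != i]
        # Total score when stimulus i was presented first.
        as_first = sum(scores[k][i][j] for k in range(n_subjects) for j in others)
        # Total score when stimulus i was presented second.
        as_second = sum(scores[k][j][i] for k in range(n_subjects) for j in others)
        # Subtracting the two totals cancels a constant order effect.
        values.append((as_first - as_second) / (2 * t * n_subjects))
    return values
```

With one subject and three stimuli scored as `[[[0, 1, 2], [-1, 0, 1], [-2, -1, 0]]]`, the estimates are 1.0, 0.0, and -1.0, so the first stimulus sits highest on the scale.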