石原 一志 坪田 康 奥乃 博
電子情報通信学会技術研究報告. SP, 音声 (ISSN:09135685)
vol.103, no.154, pp.19-24, 2003-06-19

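One problem in converting environmental sounds into onomatopoeia is listener dependence: the onomatopoeic expression chosen for a sound differs from listener to listener. This study avoids the problem by separating the conversion into listener-dependent and listener-independent processing, and proposes a method that converts environmental sounds into onomatopoeia in three stages: (1) the environmental sound is segmented into syllable units using its power envelope along the time axis, (2) the syllable structure is identified from the sound duration and decay rate, and (3) the syllable structure is converted into onomatopoeia by phoneme recognition using phoneme groups. Recognition experiments confirmed the validity of these methods. Precision/recall was 83.7 % / 99.1 % for segmentation and 84.6 % / 100.0 % for long-vowel recognition.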
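Stages (1) and (2) above can be illustrated with a minimal sketch. It is not the authors' implementation: the frame length, the fixed power threshold, the duration rule for long sounds, and every function name are assumptions chosen for illustration, and the decay-rate cue mentioned in the abstract is left out. The sketch computes a frame-wise power envelope, cuts the signal wherever the envelope falls below the threshold, and flags segments that last long enough to be treated as long sounds.

```python
import math


def power_envelope(samples, frame_len):
    """Mean power of consecutive non-overlapping frames."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def segment(envelope, threshold):
    """Return (start, end) frame ranges where the envelope is >= threshold."""
    segments, start = [], None
    for i, power in enumerate(envelope):
        if power >= threshold and start is None:
            start = i                      # a sounding region begins
        elif power < threshold and start is not None:
            segments.append((start, i))    # the region ended at frame i
            start = None
    if start is not None:                  # signal ended while still sounding
        segments.append((start, len(envelope)))
    return segments


def is_long(seg, min_frames):
    """Hypothetical duration rule: a segment of min_frames or more is 'long'."""
    return seg[1] - seg[0] >= min_frames


def burst(n, decay):
    """Synthetic decaying tone standing in for one sound event."""
    return [math.exp(-decay * t) * math.sin(0.3 * t) for t in range(n)]


# Two synthetic events separated by silence (assumed test data, not real audio).
signal = burst(400, 0.002) + [0.0] * 200 + burst(100, 0.002)
envelope = power_envelope(signal, frame_len=50)
segments = segment(envelope, threshold=0.01)
long_flags = [is_long(s, min_frames=4) for s in segments]
```

On this synthetic signal the first event spans eight frames and the second spans two, so only the first is flagged as long. A real system would need a threshold adapted to the recording level and noise floor.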
石原 一志 駒谷 和範 尾形 哲也 奥乃 博
一般社団法人 人工知能学会
人工知能学会論文誌 (ISSN:13460714)
vol.20, no.3, pp.229-236, 2005 (Released:2005-03-23)

Environmental sounds are very helpful in understanding environmental situations and in signaling approaching danger, and sound-imitation words (sound-related onomatopoeia) are important expressions for conveying such sounds in human communication, especially in Japanese. In this paper, we design a method to recognize sound-imitation words (SIWs) for environmental sounds. The critical issues in recognizing SIWs are how to divide an environmental sound into recognition units and how to resolve the representation ambiguity of the sounds. To solve these problems, we designed a three-stage procedure that transforms environmental sounds into sound-imitation words, and phoneme group expressions that can represent ambiguous sounds. The three-stage procedure is as follows: (1) the whole waveform is divided into chunks, (2) the chunks are transformed into sound-imitation syllables by phoneme recognition, and (3) a sound-imitation word is constructed from the sound-imitation syllables according to the requirements of the Japanese language. The ambiguity problem is that an environmental sound is often recognized differently by different listeners even in the same situation. Phoneme group expressions are new phonemes for environmental sounds, and a single such expression can represent multiple sound-imitation words. We designed two sets of phoneme groups to absorb the ambiguity: "a set of basic phoneme groups" and "a set of articulation-based phoneme groups". Based on subjective experiments, the set of basic phoneme groups proved more appropriate for representing environmental sounds than the articulation-based set or a set of normal Japanese phonemes.
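The claim that one phoneme group expression can stand for several sound-imitation words can be made concrete with a small sketch. The group symbols and their members below are hypothetical and are not taken from the paper; the code only illustrates the idea of expanding an expression into every ordinary-phoneme word it covers.

```python
from itertools import product

# Hypothetical phoneme groups: each group symbol covers several ordinary phonemes.
PHONEME_GROUPS = {
    "k-t": ("k", "t"),
    "b-d": ("b", "d"),
}


def expand(expression):
    """List every ordinary-phoneme word covered by one phoneme-group expression.

    `expression` is a sequence of symbols; a symbol that is not a group name
    is treated as an ordinary phoneme that stands only for itself.
    """
    choices = [PHONEME_GROUPS.get(symbol, (symbol,)) for symbol in expression]
    return ["".join(word) for word in product(*choices)]


# One ambiguous expression covers two candidate words.
words = expand(["k-t", "a", "N"])
```

Here `words` contains "kaN" and "taN", so listeners who hear either form are both represented by the single expression.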
石原 一志 駒谷 和範 尾形 哲也 奥乃 博
The Japanese Society for Artificial Intelligence
人工知能学会論文誌 = Transactions of the Japanese Society for Artificial Intelligence : AI (ISSN:13460714)
vol.20, pp.229-236, 2005-11-01
