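Processing of Unknown Words and Redundant Words in a Speech Recognition System Using a Word N-gram Language Model (単語N-gram言語モデルを用いた音声認識システムにおける未知語・冗長語の処理)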

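Authors: 甲斐充彦, 廣瀬良文, 中川聖一
Publisher: Information Processing Society of Japan (一般社団法人情報処理学会)
Journal: 情報処理学会論文誌 (ISSN:18827764)
Volume, issue, pages, publication date: vol.40, no.4, pp.1383-1394, 1999-04-15
Number of citing works: 39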

For practical use of spoken dialog systems and large-vocabulary dictation systems, it is important to handle out-of-vocabulary (unknown) words, which are not registered in the system's dictionary, and redundant words such as interjections, restarts, and hesitations, which are unrelated to the user's request. To address these problems, this study introduces unknown-word processing (UWP) into a continuous speech recognition algorithm based on a word N-gram language model and examines its effect. The UWP method runs a subword sequence decoder with subword-unit acoustic models alongside the word recognizer and uses it to generate and verify unknown-word hypotheses. This method was previously shown to be effective on a small-vocabulary task with a recognition system based on a context-free grammar. This paper proposes an efficient way to incorporate the UWP into a word N-gram based recognition algorithm. In evaluation experiments on a spoken dialog task, using utterances that contain unknown words and redundant words, semantic errors decreased by up to 48%. Compared with a similar system based on a context-free grammar, the word N-gram based system was also found to improve semantic understanding accuracy for spontaneous speech. Furthermore, to confirm the effect on a large-vocabulary continuous speech recognition task, a recognition experiment was performed on read speech of newspaper articles. Although the improvement in word-level recognition accuracy was small, the method achieved high performance in detecting, at the sentence level, whether an utterance contains an unknown word.

https://ci.nii.ac.jp/naid/110002724792

Mentions

Hatena Bookmark (1 user, 1 post)

Twitter (1 user, 1 post, 0 favorites)

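How about this paper? Processing of Unknown Words and Redundant Words in a Speech Recognition System Using a Word N-gram Language Model (<Special Issue> Spoken Language Information Processing) (甲斐充彦 et al.), 1999 http://id.CiNii.jp/La0QL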

Collected URL list

https://ci.nii.ac.jp/naid/110002724792 (2)