Large-Vocabulary Continuous Speech Recognition for Japanese Using Words as the Recognition Unit (単語を認識単位とした日本語の大語彙連続音声認識)

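Authors: 西村雅史, 伊東伸泰, 山崎一孝
Publisher: Information Processing Society of Japan (一般社団法人情報処理学会)
Journal: IPSJ Journal (情報処理学会論文誌) (ISSN: 1882-7764)
Volume, issue, pages, publication date: vol.40, no.4, pp.1395-1403, 1999-04-15
Cited by: 27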

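We previously proposed a method for automatically estimating the word units that Japanese speakers perceive intuitively, based on their statistical correspondence with the morpheme units output by an existing morphological analysis program, and built an isolated-word Japanese dictation system that uses those units as both the recognition unit and the utterance unit. Here we try using these human-intuitive word units as the recognition unit for continuous speech recognition as well, and investigate their effectiveness, in particular in comparison with the morpheme units often used in other large-vocabulary continuous speech recognition systems. Given that the definition of the recognition unit is currently not uniquely determined, we also propose an evaluation method for Japanese continuous speech recognition systems and report the results of speaker-independent large-vocabulary speech recognition experiments. Recognition experiments on 10 male and 10 female speakers yielded a character error rate of 3.0% and a word error rate of 4.3%. We further describe a method for automatically inserting punctuation marks and a method for automatically segmenting text into word units using an unknown-word model and a word N-gram model.

In this paper, we discuss a word-based continuous dictation system for Japanese. We previously proposed a statistical method for segmenting a text into words on the basis of human intuition, and developed an isolated-word-based Japanese dictation system. By comparing the word units used for the isolated word recognition with grammatical units, we show that the former are also very useful for continuous speech recognition. Evaluation of the performance of this continuous dictation system showed that the character error rate was 3.0%, and that the word error rate was 4.3%. We also present a method for inserting punctuation marks in spoken texts automatically, and a method for segmenting Japanese text into words by using an N-gram model, focusing on the handling of unknown words.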
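
The abstract reports both a character error rate and a word error rate, and notes that the recognition unit for Japanese has no single agreed definition. The sketch below illustrates why that matters, using the conventional edit-distance definition of an error rate. It is an assumption made for illustration, not the evaluation method proposed in the paper; the function names and the sample sentences are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance normalised by the reference length."""
    if len(ref_units) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_units, hyp_units) / len(ref_units)


def word_error_rate(ref_words, hyp_words):
    """Error rate over segmented units (depends on the segmentation)."""
    return error_rate(ref_words, hyp_words)


def character_error_rate(ref_words, hyp_words):
    """Error rate over characters (ignores unit boundaries)."""
    return error_rate("".join(ref_words), "".join(hyp_words))


# Hypothetical example: identical text, different segmentation.
reference = ["音声", "認識", "の", "評価"]
hypothesis = ["音声認識", "の", "評価"]

print(word_error_rate(reference, hypothesis))       # 0.5
print(character_error_rate(reference, hypothesis))  # 0.0
```

In this example the hypothesis contains exactly the same characters as the reference, so the character error rate is 0.0, while the unit-level rate is 0.5 purely because the two sides segment the text differently. A character-level measure is therefore independent of the choice of unit, whereas a word-level measure is not.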

https://ci.nii.ac.jp/naid/110002724793

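Mentions

Twitter (1 user, 1 post, 0 favorites)

How about this paper? Large-Vocabulary Continuous Speech Recognition for Japanese Using Words as the Recognition Unit (<Special Issue> Spoken Language Information Processing) (西村雅史 et al.), 1999 http://t.co/vkLKzE3ldi

Collected URLs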

https://ci.nii.ac.jp/naid/110002724793 (1)