Authors
向 奈津美 金寺 登 北口 直 荒井 隆行
Publisher
独立行政法人 石川工業高等専門学校
Journal
石川工業高等専門学校紀要 (ISSN:02866110)
Volume, pages, and publication date
vol.39, pp.51-56, 2007

Detecting the portions of a film that contain utterances is essential for captioning, yet at present it is generally carried out manually by translators. Automatic voice activity detection (VAD) in films therefore requires methods that are robust to irrelevant sounds such as background music. This paper proposes a new feature for automatic VAD. The proposed method uses the gradient of the spectrum in the high-frequency band (4-6 kHz) and the standard deviation of the modulation-filtered cepstrum. For the evaluation experiments, we used a portion (about 23 minutes) of an English-language musical film. The proposed method reduces the total error rate by 22.6% compared with the conventional method based on short-time energy.
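To make the two frame-level quantities named in the abstract more concrete, the following is a minimal sketch of the short-time energy baseline and of a spectral gradient over the 4-6 kHz band. The abstract does not define either feature precisely, so everything here is an assumption for illustration: the function names, the Hann window, the direct DFT, and the choice of a least-squares line fitted to log magnitude against frequency. The modulation-filtered cepstrum feature is not sketched, because the abstract gives no details of the filter.

```python
import cmath
import math


def split_frames(samples, frame_len, hop):
    """Split a list of samples into overlapping frames of frame_len samples."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame (the conventional baseline feature)."""
    return sum(x * x for x in frame) / len(frame)


def least_squares_slope(xs, ys):
    """Slope of the least-squares line through the points (xs[i], ys[i])."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def spectral_gradient(frame, sample_rate, f_lo=4000.0, f_hi=6000.0):
    """Slope of log magnitude versus frequency (per kHz) inside [f_lo, f_hi].

    Hypothetical definition: the source only says "gradient of spectrum".
    """
    n = len(frame)
    # Hann window (an assumption) to limit spectral leakage.
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    freqs_khz, log_mags = [], []
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if f_lo <= freq <= f_hi:
            # Direct DFT of bin k; only the in-band bins are evaluated.
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(windowed))
            freqs_khz.append(freq / 1000.0)
            log_mags.append(math.log(abs(coeff) + 1e-12))
    return least_squares_slope(freqs_khz, log_mags)
```

The sketch assumes a sampling rate of at least 12 kHz and a frame long enough to place two or more DFT bins inside the band. How the features are thresholded or combined into a speech/non-speech decision is not stated in the abstract and is left open here.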

Mentions

Twitter (2 users, 2 posts, 1 favorites)

How about this paper? "ミュージカル映画中の音声区間検出" (Voice activity detection in musical films) (向 奈津美 et al.), 2007 https://t.co/0tLuxhhw8o The process of detecting porti…
