Prosodic Features Control by Symbols as Input of Sequence-to-Sequence Acoustic Modeling for Neural TTS

doi:10.1587/transinf.2020EDP7104


Authors: Kiyoshi KURIHARA, Nobumasa SEIYAMA, Tadashi KUMANO
Publisher: The Institute of Electronics, Information and Communication Engineers
Journal: IEICE Transactions on Information and Systems (ISSN: 0916-8532)
Volume, issue, pages, publication date: vol.E104.D, no.2, pp.302-311, 2021-02-01 (Released: 2021-02-01)
References: 35
Cited by: 13

This paper describes a method to control prosodic features using phonetic and prosodic symbols as input of attention-based sequence-to-sequence (seq2seq) acoustic modeling (AM) for neural text-to-speech (TTS). The method involves inserting a sequence of prosodic symbols between phonetic symbols that are then used to reproduce prosodic acoustic features, i.e. accents, pauses, accent breaks, and sentence endings, in several seq2seq AM methods. The proposed phonetic and prosodic labels have simple descriptions and a low production cost. By contrast, the labels of conventional statistical parametric speech synthesis methods are complicated, and the cost of time alignments such as aligning the boundaries of phonemes is high. The proposed method does not need the boundary positions of phonemes. We propose an automatic conversion method for conventional labels and show how to automatically reproduce pitch accents and phonemes. The results of objective and subjective evaluations show the effectiveness of our method.
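To make the idea of inserting prosodic symbols between phonetic symbols concrete, here is a minimal Python sketch. It is not the authors' implementation: the symbol inventory (`[`, `]`, `#`, `_`, `.`), the mora-level input, and the function names `encode_phrase` and `encode_sentence` are all assumptions chosen for illustration. The point it shows is that the input is a flat symbol sequence with no phoneme boundary times.

```python
# Hypothetical prosodic symbols; the paper's actual inventory may differ.
ACCENT_RISE = "["    # pitch rises after this point
ACCENT_FALL = "]"    # pitch falls after this point (accent nucleus)
PHRASE_BREAK = "#"   # accent phrase boundary
PAUSE = "_"          # pause between phrases
SENTENCE_END = "."   # sentence ending


def encode_phrase(moras, nucleus):
    """Interleave prosodic symbols with the moras of one accent phrase.

    moras:   list of mora strings, e.g. ["ha", "shi"]
    nucleus: 1-based index of the accented mora, 0 if the phrase is
             unaccented (a Tokyo-dialect convention, assumed here).
    """
    out = []
    for i, mora in enumerate(moras, start=1):
        out.append(mora)
        # Pitch rises after the first mora unless that mora is the nucleus.
        if i == 1 and nucleus != 1 and len(moras) > 1:
            out.append(ACCENT_RISE)
        # Pitch falls immediately after the accent nucleus.
        if i == nucleus:
            out.append(ACCENT_FALL)
    return out


def encode_sentence(phrases, pause_after=()):
    """Build the full input sequence for a sentence.

    phrases:     list of (moras, nucleus) tuples, one per accent phrase
    pause_after: indices of phrases that are followed by a pause
    """
    out = []
    for idx, (moras, nucleus) in enumerate(phrases):
        out.extend(encode_phrase(moras, nucleus))
        if idx == len(phrases) - 1:
            out.append(SENTENCE_END)
        elif idx in pause_after:
            out.append(PAUSE)
        else:
            out.append(PHRASE_BREAK)
    return out


if __name__ == "__main__":
    # "ame ga furu", with the accent on the first mora of each phrase.
    sentence = [(["a", "me", "ga"], 1), (["fu", "ru"], 1)]
    print(" ".join(encode_sentence(sentence)))
```

Because the sequence carries only symbol order, a label of this kind can be written or corrected by hand without any time alignment, which is the low production cost the abstract refers to.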


Mentions

External databases (DOI)

Twitter (5 users, 5 posts, 3 favorites)

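@2dimlover The importance of accent in TTS is also discussed in the following paper: https://t.co/isj65sSuC1 On the other hand, how to annotate datasets with these accents, and how to handle them at inference time, remains an open issue for Japanese datasets...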

J-STAGE Articles - Prosodic Features Control by Symbols as Input of Sequence-to-Sequence Acoustic Modeling for Neural TTS https://t.co/vifj6DTKbL

1 @tankaru

Prosodic Features Control by Symbols as Input of Sequence-to-Sequence Acoustic Modeling for Neural TTS https://t.co/LIOidpnCSe

1 @vmpmember

3 @mfreixesg @mpetrochuk0 @heiga_zen
