HMM-Based Voice Conversion Using Quantized F0 Context

doi:10.1587/transinf.E93.D.2483


Authors: Takashi NOSE, Yuhei OTA, Takao KOBAYASHI
Publisher: The Institute of Electronics, Information and Communication Engineers
Journal: IEICE TRANSACTIONS on Information and Systems (ISSN:09168532)
Volume, issue, pages, and publication date: vol.E93-D, no.9, pp.2483-2490, 2010-09-01
Number of citing works: 9

We propose a segment-based voice conversion technique using hidden Markov model (HMM)-based speech synthesis with nonparallel training data. In the proposed technique, the phoneme information with durations and a quantized F0 contour are extracted from the input speech of a source speaker, and are transmitted to a synthesis part. In the synthesis part, the quantized F0 symbols are used as prosodic context. A phonetically and prosodically context-dependent label sequence is generated from the transmitted phoneme and the F0 symbols. Then, converted speech is generated from the label sequence with durations using the target speaker's pre-trained context-dependent HMMs. In the model training, the models of the source and target speakers can be trained separately, hence there is no need to prepare parallel speech data of the source and target speakers. Objective and subjective experimental results show that the segment-based voice conversion with phonetic and prosodic contexts works effectively even if the parallel speech data is not available.
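The label-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Segment` type, the per-phoneme mean log F0, the uniform quantization over the utterance's log-F0 range, the default of 4 levels, and the label format `prev-cur+next/F0:sym/D:frames` are all assumptions made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """Hypothetical container for one phoneme segment of the source speech."""
    phoneme: str
    duration_frames: int
    f0_hz: List[float]  # frame-level F0; 0.0 marks an unvoiced frame


def mean_log_f0(f0_hz: List[float]) -> Optional[float]:
    """Mean log F0 over voiced frames, or None if the segment is unvoiced."""
    voiced = [math.log(f) for f in f0_hz if f > 0.0]
    return sum(voiced) / len(voiced) if voiced else None


def quantize(value: Optional[float], lo: float, hi: float, levels: int) -> str:
    """Map a log-F0 value to a symbol '0'..'levels-1'; 'x' means unvoiced."""
    if value is None:
        return "x"
    if hi <= lo:  # degenerate range: only one distinct voiced value
        return "0"
    idx = int((value - lo) / (hi - lo) * levels)
    return str(min(max(idx, 0), levels - 1))


def make_labels(segments: List[Segment], levels: int = 4) -> List[str]:
    """Build context-dependent labels from phonemes, durations, and F0 symbols."""
    means = [mean_log_f0(s.f0_hz) for s in segments]
    voiced = [m for m in means if m is not None]
    lo, hi = (min(voiced), max(voiced)) if voiced else (0.0, 0.0)
    symbols = [quantize(m, lo, hi, levels) for m in means]

    labels = []
    for i, seg in enumerate(segments):
        # Assumed convention: utterance boundaries are padded with "sil".
        prev_p = segments[i - 1].phoneme if i > 0 else "sil"
        next_p = segments[i + 1].phoneme if i + 1 < len(segments) else "sil"
        labels.append(
            f"{prev_p}-{seg.phoneme}+{next_p}/F0:{symbols[i]}/D:{seg.duration_frames}"
        )
    return labels


segments = [
    Segment("k", 5, [0.0] * 5),
    Segment("a", 10, [100.0] * 10),
    Segment("i", 8, [200.0] * 8),
]
labels = make_labels(segments)
```

In this sketch only the phoneme identity, the duration, and the coarse F0 symbol are carried to the synthesis side, which is why speaker-specific detail in the source F0 contour would not reach the target speaker's models.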

2021-04-25 02:02:32

Mentions

External databases (DOI)

Twitter (5 users, 5 posts, 9 favorites)

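This is only a guess, but I imagine the method below is doing something similar to the any-speaker voice conversion method based on HMMs and quantized F0 that I proposed about 10 years ago. The biggest difference, I think, is that the synthesis is now neural-based. https://t.co/ZgZBr7BV1Y https://t.co/nezNDGwmO1 Apologies if I am mistaken ^^; https://t.co/tKbrWr3mw5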

4 @nullpodesu @chiteico @SythonUK @vmpmember

9 @nullpodesu @zenakuta @Gazelle20114921 @jandum81 @ejiwarp @r9y9 @ysks @ymas0315 @piercesword

Collected URL list

https://search.ieice.org/bin/summary.php?id=e93-d_9_2483 (5)