Investigation on an autoregressive recurrent mixture density network for parametric speech synthesis


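Authors: Xin Wang, Shinji Takaki, Junichi Yamagishi
Journal: 研究報告音声言語情報処理(SLP) [IPSJ SIG Technical Reports: Spoken Language Processing] (ISSN: 2188-8663)
Volume, issue, pages, and publication date: vol.2017-SLP-115, no.2, pp.1-6, 2017-02-10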

Neural-network-based mixture density networks are important tools for acoustic modeling in statistical parametric speech synthesis. We recently found that incorporating an autoregressive model into a recurrent mixture density network, referred to as AR-RMDN, enabled the network to generate quite smooth acoustic data trajectories without using delta and delta-delta coefficients. More interestingly, the new model generated trajectories with a dynamic range similar to that of the natural data, thus alleviating the over-smoothing effect. In this work, after explaining the AR-RMDN from a signal-and-filter perspective, we compare an AR-RMDN with a modulation-spectrum-based post-filtering method that also eases the over-smoothing effect. We show that the AR-RMDN also alters the modulation spectrum of the generated data trajectories, but in a different way from the post-filtering method. The AR-RMDN also generates synthetic speech with better perceived quality. Based on the signal-and-filter interpretation, we further extend the AR-RMDN so that the inverse AR filter can acquire complex poles and remain stable.
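
As a rough illustration of the signal-and-filter view mentioned in the abstract, the sketch below runs an excitation sequence through a second-order all-pole (inverse AR) filter whose poles form a complex-conjugate pair, and checks stability by testing whether the poles lie inside the unit circle. This is not the authors' implementation. The function names, the second-order form, and the pole radius and angle are assumptions chosen for illustration only.

```python
import numpy as np


def ar_coeffs_from_pole(radius, angle):
    """Hypothetical helper: coefficients (a1, a2) of the recursion
    o[t] = e[t] + a1 * o[t-1] + a2 * o[t-2]
    whose poles are the conjugate pair radius * exp(+/- j * angle)."""
    return np.array([2.0 * radius * np.cos(angle), -radius ** 2])


def is_stable(a):
    """The all-pole filter is stable when every pole lies strictly inside
    the unit circle. The poles are the roots of
    z^K - a1 * z^(K-1) - ... - aK."""
    poles = np.roots(np.concatenate(([1.0], -np.asarray(a, dtype=float))))
    return bool(np.all(np.abs(poles) < 1.0))


def inverse_ar_filter(excitation, a):
    """Apply the recursion o[t] = e[t] + sum_k a[k-1] * o[t-k],
    treating samples before t = 0 as zero."""
    out = np.zeros(len(excitation))
    for t in range(len(excitation)):
        acc = excitation[t]
        for k, a_k in enumerate(a, start=1):
            if t - k >= 0:
                acc += a_k * out[t - k]
        out[t] = acc
    return out


# Assumed example values: pole radius 0.9 (inside the unit circle).
a = ar_coeffs_from_pole(0.9, np.pi / 8)
impulse = np.zeros(50)
impulse[0] = 1.0
response = inverse_ar_filter(impulse, a)  # decaying oscillation
```

With a pole radius below 1 the impulse response decays. A radius above 1 would make the same recursion diverge, which is why a constraint on the poles is needed for the filter to stay stable.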

2020-09-01 21:24:08

http://id.nii.ac.jp/1001/00177342/

Mentions

Twitter (2 users, 2 posts, 1 favorite)

@r9y9 Understood. I was just reading an NII paper (https://t.co/D3QxtbvBne) and wondering what on earth GV/MS was, so that clears up one of my questions. I will study this further.

1 @chiteico

1 @r9y9

Collected URLs

https://ipsj.ixsq.nii.ac.jp/ej/index.php?action=pages_view_main&active_action=repository_action_common_download&item_id=177376&item_no=1&attribute_id=1&file_no=1&page_id=13&block_id=8 (2)