Integration of Spectral Feature Extraction and Modeling for HMM-Based Speech Synthesis


Authors: Kazuhiro NAKAMURA, Kei HASHIMOTO, Yoshihiko NANKAKU, Keiichi TOKUDA
Publisher: The Institute of Electronics, Information and Communication Engineers
Journal: IEICE TRANSACTIONS on Information and Systems (ISSN:09168532)
Volume, issue, pages, publication date: vol.E97-D, no.6, pp.1438-1448, 2014-06-01

This paper proposes a novel approach for integrating spectral feature extraction and acoustic modeling in hidden Markov model (HMM) based speech synthesis. The statistical modeling of speech waveforms is typically divided into two component modules: a frame-by-frame feature extraction module and an acoustic modeling module. The feature extraction module uses the statistical mel-cepstral analysis technique, whose objective function is the likelihood of the mel-cepstral coefficients for given speech waveforms. In the acoustic modeling module, the objective function is the likelihood of the model parameters for given mel-cepstral coefficients. Improving the performance of each component module is important for achieving higher-quality synthesized speech. However, the final objective of a speech synthesis system is to generate natural speech waveforms from given texts, and improving each component module separately does not always improve the quality of the synthesized speech. Ideally, therefore, all objective functions should be optimized under an integrated criterion that represents subjective speech quality as perceived by humans. In this paper, we propose an approach that models speech waveforms directly and optimizes the final objective function. Experimental results show that the proposed method outperformed the conventional methods in both objective and subjective measures.
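The contrast between the two-step pipeline and the integrated criterion described in the abstract can be sketched in symbols. The notation here is assumed for illustration and is not taken from the paper: x denotes the speech waveform, c the mel-cepstral coefficients, and λ the acoustic model parameters. The marginalized form on the last line is one natural way to write "modeling the waveform directly"; the paper's actual formulation may differ.

```latex
% Conventional two-step training (assumed notation):
% first extract features, then fit the acoustic model to the extracted features.
\hat{\boldsymbol{c}} = \arg\max_{\boldsymbol{c}} \, p(\boldsymbol{x} \mid \boldsymbol{c}),
\qquad
\hat{\lambda} = \arg\max_{\lambda} \, p(\hat{\boldsymbol{c}} \mid \lambda)

% Integrated criterion (illustrative reading):
% maximize the waveform likelihood itself, treating c as a latent variable.
\hat{\lambda} = \arg\max_{\lambda} \, p(\boldsymbol{x} \mid \lambda)
             = \arg\max_{\lambda} \int p(\boldsymbol{x} \mid \boldsymbol{c}) \,
               p(\boldsymbol{c} \mid \lambda) \, d\boldsymbol{c}
```

In the two-step form, the point estimate of c is fixed before the acoustic model is trained, so errors or mismatches in feature extraction cannot be corrected by the model; in the integrated form both factors contribute to a single objective defined on the waveform.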

2015-02-05 08:14:51

http://search.ieice.org/bin/summary.php?id=e97-d_6_1438&category=D&year=2014&lang=E&abst=

Mentions

Twitter (7 users, 8 posts, 2 favorites)

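@arctan_P The paper that directly maximizes the likelihood of the waveform, although it still goes through mel-cepstra, is this one that came out last year. http://t.co/2lEZmWipCu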

1 @arctan_P

Our research group's paper "Integration of Spectral Feature Extraction and Modeling for HMM-Based Speech Synthesis" has been published in the IEICE Transactions: http://t.co/CYUJ7RDdx0

5 @ballforest @tetsurokitahara @kyama0321 @dicekicker @knoike

2 @hiroyuki_hon @knoike


List of collected URLs