Authors
Shogo SEKI, Tomoki TODA, Kazuya TAKEDA
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences (ISSN: 0916-8508)
Volume/Issue/Pages/Date
vol.E101-A, no.7, pp.1057-1064, 2018-07-01

This paper proposes a semi-supervised source separation method for stereophonic music signals containing multiple recorded or processed signals, focusing in particular on synthesized music. Since synthesized music signals are often generated as linear combinations of many individual source signals and their respective mixing gains, inter-channel phase or phase-difference information, which represents the spatial characteristics of a recording environment, cannot be utilized as an acoustic clue for source separation. Non-negative Tensor Factorization (NTF) is an effective technique for addressing this problem, as it decomposes the amplitude spectrograms of stereo-channel music signals into basis vectors and activations of the individual music source signals, along with their corresponding mixing gains. However, it is difficult to achieve sufficient separation performance with this method alone, because the acoustic clues available for separation are limited. To address this issue, this paper proposes a Cepstral Distance Regularization (CDR) method for NTF-based stereo-channel separation, which makes the cepstra of the separated source signals follow Gaussian Mixture Models (GMMs) of the corresponding music source signals. These GMMs are trained in advance using available samples. Experimental evaluations separating three and four sound sources are conducted to investigate the effectiveness of the proposed method in both supervised and semi-supervised separation frameworks, and its performance is compared with that of a conventional NTF method. The results demonstrate that the proposed method yields significant improvements within both separation frameworks and that cepstral distance regularization provides better separation parameters.
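
As a rough illustration of the NTF model sketched in the abstract, the snippet below factorizes a stereo amplitude-spectrogram tensor into per-channel mixing gains, spectral basis vectors, and activations using standard multiplicative updates. This is a minimal sketch, not the authors' implementation: it omits the proposed cepstral distance regularization and the GMM priors, assumes a squared Euclidean cost (the divergence used in the paper is not stated in the abstract), and the toy dimensions and the names `G`, `W`, `H`, `K` are chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stereo amplitude spectrogram: C channels x F frequency bins x T frames.
# In practice X would hold |STFT| magnitudes of the left/right channels;
# here it is random non-negative data so the sketch runs stand-alone.
C, F, T, K = 2, 257, 100, 20          # K = assumed number of basis vectors
X = np.abs(rng.standard_normal((C, F, T)))

# Non-negative factors: mixing gains G (C x K), spectral bases W (F x K),
# activations H (K x T).  Model: V[c, f, t] = sum_k G[c, k] W[f, k] H[k, t].
G = np.abs(rng.standard_normal((C, K)))
W = np.abs(rng.standard_normal((F, K)))
H = np.abs(rng.standard_normal((K, T)))
eps = 1e-12                           # guards against division by zero

def model(G, W, H):
    return np.einsum('ck,fk,kt->cft', G, W, H)

# Multiplicative updates that decrease the squared error ||X - V||^2
# while keeping all factors non-negative.
for _ in range(100):
    V = model(G, W, H)
    W *= np.einsum('cft,ck,kt->fk', X, G, H) / (np.einsum('cft,ck,kt->fk', V, G, H) + eps)
    V = model(G, W, H)
    H *= np.einsum('cft,ck,fk->kt', X, G, W) / (np.einsum('cft,ck,fk->kt', V, G, W) + eps)
    V = model(G, W, H)
    G *= np.einsum('cft,fk,kt->ck', X, W, H) / (np.einsum('cft,fk,kt->ck', V, W, H) + eps)

print('final squared error:', np.sum((X - model(G, W, H)) ** 2))
```

In a supervised or semi-supervised setting such as the one described above, some columns of `W` would be trained in advance on available source samples and held fixed or regularized during separation; this sketch leaves all factors free.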
Authors
Madoka Takimoto, Takanori Nishino, Katunobu Itou, Kazuya Takeda
Publisher
Acoustical Society of Japan
Journal
Acoustical Science and Technology (ISSN: 1346-3969)
Volume/Issue/Pages/Date
vol.28, no.5, pp.335-342, 2007 (Released:2007-09-01)
Number of references
13
Number of citations
1 5

In this paper, we examine how covering one or both external ears affects sound localization on the horizontal plane. In our experiments, we covered subjects’ pinnae and external auditory canals with headphones, earphones, and earplugs, and conducted sound localization tests. Stimuli were presented from 12 different directions, and 12 subjects participated in the tests. The results indicate that covering one or both ears degraded the subjects’ sound localization performance. Front-back confusion rates increased, particularly when both outer ears were covered with open-air headphones or when one ear was covered with an intraconcha-type earphone or an earplug. Furthermore, incorrect answer rates were high when the sound source and the ear occluded with an intraconcha-type earphone or an earplug were on the same side. We consider that the factors causing this poor performance can be clarified by comparing these results with the characteristics of head-related transfer functions.
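
Since the abstract reports front-back confusion rates for 12 presentation directions, a small helper like the one below could tally them from presented and responded azimuths. This is only an assumed analysis sketch, not the authors' procedure: the azimuth convention (0° front, angles increasing clockwise, 30° spacing) and the rule of counting a response that lies closer to the front-back mirror of the stimulus than to the stimulus itself are illustrative choices.

```python
import numpy as np

def front_back_confusion_rate(presented_deg, responded_deg):
    """Fraction of trials in which the response is closer to the front-back
    mirror of the presented azimuth than to the presented azimuth itself.
    Azimuths in degrees: 0 = front, 90 = right, 180 = back (clockwise).
    Trials presented on the interaural axis (90, 270) are excluded,
    since they have no front-back counterpart."""
    presented = np.asarray(presented_deg, dtype=float) % 360
    responded = np.asarray(responded_deg, dtype=float) % 360
    mirrored = (180.0 - presented) % 360  # reflection about the interaural axis

    def ang_dist(a, b):
        d = np.abs(a - b) % 360
        return np.minimum(d, 360 - d)

    valid = ~np.isin(presented, (90.0, 270.0))
    confused = ang_dist(responded, mirrored) < ang_dist(responded, presented)
    return float(np.mean(confused[valid]))

# Example with the 30-degree spacing of a 12-direction setup:
# a 150-degree response to a 30-degree stimulus counts as a confusion.
print(front_back_confusion_rate([30, 60, 90, 300], [150, 60, 90, 240]))
```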