著者
Hosung PARK Seungsoo NAM Eun Man CHOI Daeseon CHOI
出版者
The Institute of Electronics, Information and Communication Engineers
雑誌
IEICE Transactions on Information and Systems (ISSN:09168532)
巻号頁・発行日
vol.E101.D, no.12, pp.3092-3101, 2018-12-01 (Released:2018-12-01)
参考文献数
35
被引用文献数
2 2

Hidden Singer is a television program in Korea. In the show, the original singer and four imitating singers sing a song in hiding behind a screen. The audience and TV viewers attempt to guess who the original singer is by listening to the singing voices. Usually, there are few correct answers from the audience, because the imitators are well trained and highly skilled. We propose a computerized system for distinguishing the original singer from the imitating singers. During the training phase, the system learns only the original singer's song because it is the one the audience has heard before. During the testing phase, the songs of five candidates are provided to the system and the system then determines the original singer. The system uses a 1-class authentication method, in which only a subject model is made. The subject model is used for measuring similarities between the candidate songs. In this problem, unlike other existing studies that require artist identification, we cannot utilize multi-class classifiers and supervised learning because songs of the imitators and the labels are not provided during the training phase. Therefore, we evaluate the performances of several 1-class learning algorithms to choose which one is more efficient in distinguishing an original singer from among highly skilled imitators. The experiment results show that the proposed system using the autoencoder performs better (63.33%) than other 1-class learning algorithms: Gaussian mixture model (GMM) (50%) and one class support vector machines (OCSVM) (26.67%). We also conduct a human contest to compare the performance of the proposed system with human perception. The accuracy of the proposed system is found to be better (63.33%) than the average accuracy of human perception (33.48%).