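- The Society of Instrument and Control Engineers (公益社団法人 計測自動制御学会)
- Transactions of the Society of Instrument and Control Engineers (計測自動制御学会論文集) (ISSN: 0453-4654)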
- vol.53, no.12, pp.644-653, 2017 (Released:2017-12-16)
In this paper, multi-speaker localization methods using voice signals are proposed. Sound source localization systems consisting of many microphones have been widely researched. In our laboratory, we have studied sound source localization of persons based on the TDOA (Time Difference of Arrival) between distributed microphones, with the aim of using air conditioning and lighting in offices efficiently. Localization systems based on sounds such as footsteps or voices are superior to image-based systems in terms of privacy and usability in the dark. Beamforming, subspace methods, and TDOA-based methods are widely used for sound source localization. CSP (Cross-power Spectrum Phase) analysis is one of the best-known methods for estimating TDOA. We have studied sound source localization methods using CSP analysis because its computational cost is lower than that of beamforming or subspace methods. However, CSP analysis alone does not work well in a real office environment because of long-lasting sounds, reverberation, low SNR, and similar factors. The main cause of difficulty in sound source localization in such cases is the “weak correlation” between different sound signals. In this paper, it is shown that the adverse effect of this “weak correlation” can be mitigated by our proposed methods, Wiener-filter-based CSP analysis and H-CSP (Histogram-CSP) analysis. H-CSP analysis is a statistical analysis method for CSP coefficients that uses their histogram. The effectiveness of the proposed methods is shown by sound source localization experiments for both stationary and moving sound sources.
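For readers unfamiliar with CSP analysis, the following is a minimal sketch of the conventional technique (phase-only cross-correlation between two microphone signals), not the Wiener-filter-based or H-CSP variants proposed in the paper, whose details the abstract does not give. The function names `dft`, `csp_coefficients`, and `estimate_tdoa`, the `eps` regularizer, and the `max_lag` search range are illustrative assumptions. A plain O(N²) DFT from the standard library is used only to keep the example self-contained; a practical system would use an FFT.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(N^2)); illustrative only."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def csp_coefficients(x1, x2, eps=1e-12):
    """CSP coefficients of two equal-length frames.

    The cross-power spectrum is normalized by its magnitude so that only
    phase information remains; `eps` (an assumption of this sketch) avoids
    division by zero in empty frequency bins.
    """
    spec1 = dft(x1)
    spec2 = dft(x2)
    cross = [b * a.conjugate() for a, b in zip(spec1, spec2)]
    whitened = [c / (abs(c) + eps) for c in cross]
    return [v.real for v in dft(whitened, inverse=True)]


def estimate_tdoa(x1, x2, max_lag):
    """Return the lag (in samples) by which x2 is delayed relative to x1.

    The lag is the position of the largest CSP coefficient within
    [-max_lag, max_lag]; negative lags wrap around the circular result.
    """
    csp = csp_coefficients(x1, x2)
    n = len(csp)
    return max(range(-max_lag, max_lag + 1), key=lambda lag: csp[lag % n])


if __name__ == "__main__":
    random.seed(0)
    n, delay = 64, 5
    mic1 = [random.gauss(0.0, 1.0) for _ in range(n)]
    # Hypothetical second microphone: a circularly delayed copy of mic1.
    mic2 = [mic1[(t - delay) % n] for t in range(n)]
    print(estimate_tdoa(mic1, mic2, max_lag=10))
```

In this idealized, noise-free case the CSP peak sits exactly at the true delay. Dividing the returned lag by the sampling frequency gives the TDOA in seconds. Under reverberation or low SNR the peak becomes ambiguous, which is the "weak correlation" problem the abstract describes.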