Authors
飯塚 海斗 大塚 和弘
Publisher
一般社団法人 人工知能学会
Journal
人工知能学会論文誌 (ISSN:13460714)
Volume, Issue, Pages, Publication Date
vol.38, no.3, pp.J-M91_1-17, 2023-05-01 (Released:2023-05-01)
Number of references
37

A novel machine learning framework is proposed to automatically recognize the synergetic functions of Aizuchi and the head movements of listeners in conversations. Listeners' head movements, such as nodding, and Aizuchi, i.e., listeners' short back-channel utterances, serve a variety of functions, such as signaling listening, agreement, and emotions. This paper presents a functional Aizuchi corpus and analyzes it together with the functional head-movement corpus that the authors previously created. The analysis reveals synergetic relationships between Aizuchi and head movements, including reinforcement, multiplexing, and complementation. This paper then defines a functional category system called synergetic functions, which classifies reinforcement and multiplexing as product functions and complementation as sum functions. Next, several models using convolutional neural networks (CNNs) are designed to recognize such synergetic functions from the time series of the listeners' prosodic features and head pose. More specifically, we compare different architectures that employ early/late feature fusion and single/two-stage decision-making. The experimental results show that the proposed models achieved a maximum F1-score of 0.71 for the product function of Aizuchi's continuer and head-movement back-channel, and 0.88 for a sum function called back-channel acknowledgment that was expressed complementarily by head movements and Aizuchi. These results confirm the potential of the proposed framework.
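To make the early-fusion variant concrete, the following is a minimal sketch (not the authors' implementation) of a 1D-CNN classifier that takes time series of prosodic features and head pose and predicts a synergetic-function class. The feature dimensions (4 prosodic channels, 3 head-pose angles), window length, layer sizes, and class count are illustrative assumptions only.

# Minimal early-fusion 1D-CNN sketch; all dimensions are assumptions.
import torch
import torch.nn as nn

class EarlyFusionCNN(nn.Module):
    def __init__(self, n_prosody=4, n_pose=3, n_classes=5):
        super().__init__()
        in_ch = n_prosody + n_pose  # early fusion: stack both modalities as input channels
        self.encoder = nn.Sequential(
            nn.Conv1d(in_ch, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # collapse the time axis to one vector
        )
        self.classifier = nn.Linear(64, n_classes)  # single-stage decision head

    def forward(self, prosody, pose):
        # prosody: (batch, n_prosody, time), pose: (batch, n_pose, time)
        x = torch.cat([prosody, pose], dim=1)  # feature-level (early) fusion
        h = self.encoder(x).squeeze(-1)
        return self.classifier(h)

# Example: two 100-frame windows of 4 prosodic and 3 head-pose channels.
model = EarlyFusionCNN()
logits = model(torch.randn(2, 4, 100), torch.randn(2, 3, 100))
print(logits.shape)  # torch.Size([2, 5])

A late-fusion counterpart would instead encode each modality with its own convolutional branch and concatenate the resulting embeddings before the decision layer; a two-stage variant would first decide the presence of a back-channel event and then its functional category.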