Authors
森 大河, 伝 康晴
Publisher
一般社団法人 人工知能学会
Journal
人工知能学会論文誌 (ISSN: 1346-0714)
Volume, issue, pages, and publication date
vol.37, no.3, pp.IDS-H_1-12, 2022-05-01 (Released:2022-05-01)
Number of references
38

In human-human interaction, listeners use both verbal response tokens and head nods as response signals, and the two frequently co-occur. When a humanoid robot or anthropomorphic agent responds to a user with a verbal token and a head nod simultaneously, the two must be appropriately timed relative to each other and have consistent features. In this paper, we propose models that predict the co-occurrence and physical features of head nods from the prosodic and syntactic features of verbal response tokens. As predictor variables, we used the form, position, and duration of each response token, the mean and standard deviation of its fundamental frequency and loudness, and the head position at the beginning of the token. In addition, to take the participation framework into account, we used the speaker's gaze and the listener's gaze at the beginning of the response token. We applied generalized mixed models to predict the co-occurrence, type, range, repetition, and velocity of head nods. The results confirmed that the proposed models can predict these outcomes effectively.
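
The co-occurrence model described in the abstract can be written out schematically. This is only a sketch, and it rests on assumptions the abstract does not state: that the "generalized mixed models" are generalized linear mixed models, that co-occurrence is treated as a binary outcome with a logit link, and that the random effect is a per-listener intercept. All symbols are illustrative and are not taken from the paper.

```latex
% Illustrative sketch only: link function and random-effect structure are assumptions.
\begin{align}
  y_{ij} &\sim \mathrm{Bernoulli}(p_{ij}), \\
  \operatorname{logit}(p_{ij}) &= \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j, \\
  u_j &\sim \mathcal{N}\!\left(0, \sigma_u^{2}\right).
\end{align}
```

Here $y_{ij} = 1$ if the $i$-th response token produced by listener $j$ is accompanied by a head nod, $\mathbf{x}_{ij}$ collects the predictors listed in the abstract (token form, position, duration, mean and standard deviation of fundamental frequency and loudness, head position, and speaker and listener gaze at token onset), and $u_j$ is the assumed listener-level random intercept. The other outcomes (type, range, repetition, and velocity of nods) would presumably need different response distributions, which the abstract does not specify.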