- Authors
- 上山 彩夏 (Ayaka Ueyama), 狩野 芳伸 (Yoshinobu Kano)
- Publisher
- The Japanese Society for Artificial Intelligence
- Journal
- Transactions of the Japanese Society for Artificial Intelligence (ISSN: 1346-0714)
- Volume, issue, pages, and date
- Vol. 37, No. 2, pp. G-L62_1-10, 2022-03-01 (Released: 2022-03-01)
- Number of references
- 39
In recent years, there has been much research on building dialogue systems with deep learning, which can generate relatively fluent responses to user utterances. Nevertheless, these systems tend to produce responses that lack diversity and are only weakly dependent on the context. Attributing this problem to the Softmax Cross-Entropy (SCE) loss, which treats all words equally regardless of their imbalance in the training data, prior work proposed the Inverse Token Frequency (ITF) loss, which multiplies the SCE loss by a weight based on the inverse of each token's frequency, and confirmed that it improves dialogue diversity. However, sentence-level diversity depends not only on the frequencies of individual tokens but also on the frequencies of token sequences. Weights computed from sequence frequencies change dynamically with the context and therefore better capture the diversity we seek. We thus propose the Inverse N-gram Frequency (INF) loss, which weights the SCE loss by the inverse of the n-gram frequency of the tokens rather than their individual token frequency. To confirm the effectiveness of the proposed INF loss, we conducted metric-based and human evaluations of sentences generated automatically by models trained on Japanese and English Twitter datasets. The metric-based evaluation used Perplexity, BLEU, DIST-N, ROUGE, and response length as metrics; the human evaluation assessed the coherence and diversity of the responses. In the metric-based evaluation, the proposed INF model achieved better Perplexity, DIST-N, and ROUGE scores than the previous methods, and it was also rated higher in the human evaluation.
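The paper defines the INF loss formally; as a rough illustration only, the following is a minimal PyTorch sketch of the idea described in the abstract: per-token softmax cross-entropy scaled by the inverse frequency of the n-gram that ends at each target token. The function names (`ngram_frequencies`, `inf_loss`), the smoothing constant `eps`, the fallback for positions without a full preceding n-gram, and the tensor shapes are all assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F
from collections import Counter

def ngram_frequencies(corpus, n=2):
    """Count n-gram occurrences over a training corpus of token-id lists
    (hypothetical helper; the paper computes frequencies over its own data)."""
    counts = Counter()
    for seq in corpus:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return counts

def inf_loss(logits, targets, ngram_counts, n=2, eps=1.0):
    """Sketch of INF loss: SCE weighted by the inverse frequency of the
    n-gram ending at each target position.

    logits:  (seq_len, vocab_size) float tensor
    targets: (seq_len,) long tensor of gold token ids
    """
    # Per-token softmax cross-entropy, shape (seq_len,)
    sce = F.cross_entropy(logits, targets, reduction="none")
    tgt = targets.tolist()
    weights = []
    for i in range(len(tgt)):
        if i + 1 >= n:
            ngram = tuple(tgt[i - n + 1:i + 1])
            freq = ngram_counts.get(ngram, 0)
        else:
            freq = 0  # too few preceding tokens to form an n-gram (assumed fallback)
        # Inverse-frequency weight; eps avoids division by zero for unseen n-grams.
        weights.append(1.0 / (freq + eps))
    w = torch.tensor(weights, dtype=sce.dtype, device=sce.device)
    return (w * sce).mean()

# Toy usage: vocabulary of 5 tokens, one 4-token target sequence.
corpus = [[0, 1, 2, 3], [0, 1, 4, 3]]
counts = ngram_frequencies(corpus, n=2)
logits = torch.randn(4, 5)            # (seq_len, vocab_size)
targets = torch.tensor([0, 1, 2, 3])  # gold next tokens
loss = inf_loss(logits, targets, counts, n=2)
```

With n = 1 this reduces to an ITF-style weighting on individual token frequencies; larger n makes the weight depend on the preceding tokens, which is how the weighting becomes context-dependent.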