Publication list: 長崎好輝 (author)

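OA Introducing Temporal Cross-Modal Attention for the Task of Event Estimation from Audio and Video in Videos (動画内の音と映像によるイベント推定タスクにおける時間方向クロスモーダルアテンションの導入)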

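Authors: 長崎好輝, 林昌希, 金子直史, 青木義満
Publisher: The Japan Society for Precision Engineering (公益社団法人精密工学会)
Journal: Journal of the Japan Society for Precision Engineering (精密工学会誌) (ISSN:09120289)
Volume, issue, pages, publication date: vol.88, no.3, pp.263-268, 2022-03-05 (Released:2022-03-05)
Number of references: 10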

In this paper, we propose a new method for audio-visual event localization 1) that finds the segments in which audio and visual events correspond. Previous methods use Long Short-Term Memory (LSTM) networks to extract temporal features, but recurrent neural networks such as LSTM cannot learn long-term features precisely. We therefore propose a Temporal Cross-Modal Attention (TCMA) module, which extracts temporal features from the two modalities more precisely. TCMA incorporates self-attention, inspired by the success of attention modules in capturing long-term features. With this module, we localize audio-visual events precisely and achieve higher accuracy than previous work.
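
The abstract does not give the equations of the TCMA module, so the following is only a rough illustration of the general idea of attention applied along the time axis across two modalities. It is a minimal sketch of plain scaled dot-product attention, not the authors' implementation. The function name `temporal_cross_attention`, the absence of learned projections, and the toy feature values are all assumptions made for this example.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_cross_attention(query_seq, context_seq):
    """Scaled dot-product attention along the time axis (hypothetical helper).

    query_seq:   list of feature vectors, one per time segment, from one
                 modality (e.g. audio)
    context_seq: list of feature vectors, one per time segment, from the
                 other modality (e.g. visual)
    Returns (outputs, weights), where outputs[t] is a weighted sum of the
    context features and weights[t] is the attention over context segments.
    No learned projection matrices are used; that is a simplification.
    """
    dim = len(query_seq[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for q in query_seq:
        # Similarity between this query segment and every context segment.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale
                  for k in context_seq]
        w = softmax(scores)
        # Weighted sum of context features, one output dimension at a time.
        out = [sum(w[t] * context_seq[t][j] for t in range(len(context_seq)))
               for j in range(dim)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy features: 3 time segments, 2 dimensions each (made-up numbers).
audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
visual = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

# Each modality attends to the other across all time segments.
audio_attended, audio_weights = temporal_cross_attention(audio, visual)
visual_attended, visual_weights = temporal_cross_attention(visual, audio)
```

Because every segment attends directly to every segment of the other modality, distant segments interact in a single step instead of passing through a recurrent state, which is the property the abstract contrasts with LSTM.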
