Authors
秋本 直郁, 林 昌希, 秋月 秀一, 青木 義満
Publisher
The Japan Society for Precision Engineering
Journal
Journal of the Japan Society for Precision Engineering (ISSN: 0912-0289)
Volume/Issue/Pages, Publication Date
vol.84, no.12, pp.1033-1040, 2018-12-05 (Released: 2018-12-05)
Number of References
20

In this paper, we address natural paste synthesis through color adjustment and image completion, in order to solve a completion problem in which the object appearing in the completion region can be specified. We propose a synthesis network that extracts the context features of the input image and reconstructs the image from those features, so that the inserted object appears naturally in the completion region. In addition, we propose an ingenious method for constructing input images and a training method using a Generative Adversarial Network (GAN) that does not require the collection of costly training data. We show that color adjustment and image completion based on context features are performed simultaneously, and that natural paste synthesis can be achieved using the proposed methods.
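As a rough illustration of the color-adjustment side of paste synthesis (a minimal NumPy sketch of a classic mean/std color transfer, not the authors' learned network), an inserted object's per-channel color statistics can be matched to the surrounding completion region before compositing:

```python
import numpy as np

def match_color_stats(obj, surround):
    """Shift and scale the object's per-channel color statistics to the
    surrounding region's statistics (Reinhard-style transfer).
    obj, surround: float arrays of shape (H, W, 3) with values in [0, 1]."""
    out = np.empty_like(obj)
    for c in range(3):
        o, s = obj[..., c], surround[..., c]
        scale = s.std() / (o.std() + 1e-8)
        out[..., c] = (o - o.mean()) * scale + s.mean()
    return np.clip(out, 0.0, 1.0)

def paste(background, obj, mask, top, left):
    """Composite the color-adjusted object into the completion region.
    mask: (H, W) array, nonzero where the object covers the region."""
    h, w = obj.shape[:2]
    region = background[top:top + h, left:left + w]
    adjusted = match_color_stats(obj, region)
    out = background.copy()
    out[top:top + h, left:left + w] = np.where(mask[..., None] > 0, adjusted, region)
    return out
```

The proposed network learns this kind of adjustment jointly with completion from context features, rather than applying fixed statistics as above.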
Authors
長崎 好輝, 林 昌希, 金子 直史, 青木 義満
Publisher
The Japan Society for Precision Engineering
Journal
Journal of the Japan Society for Precision Engineering (ISSN: 0912-0289)
Volume/Issue/Pages, Publication Date
vol.88, no.3, pp.263-268, 2022-03-05 (Released: 2022-03-05)
Number of References
10

In this paper, we propose a new method for audio-visual event localization 1) that finds the corresponding segments between audio and visual events. While previous methods use Long Short-Term Memory (LSTM) networks to extract temporal features, recurrent neural networks such as LSTMs cannot precisely learn long-term features. We therefore propose a Temporal Cross-Modal Attention (TCMA) module, which extracts temporal features from the two modalities more precisely. Inspired by the success of attention modules in capturing long-term features, TCMA incorporates self-attention. As a result, our method localizes audio-visual events precisely and achieves higher accuracy than previous work.
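A minimal sketch of the cross-modal attention idea (generic scaled dot-product attention in NumPy with hypothetical dimensions, not the paper's TCMA implementation): audio segment features act as queries over visual segment features, so each audio time step aggregates the visual context most relevant to it in one step, without LSTM-style recurrence:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(audio, visual):
    """Scaled dot-product attention with audio queries and visual keys/values.
    audio: (T, d) audio segment features; visual: (T, d) visual segment features.
    Returns (T, d) visually attended audio features and the (T, T) weights."""
    d = audio.shape[-1]
    scores = audio @ visual.T / np.sqrt(d)  # similarity of each audio step to each visual step
    weights = softmax(scores, axis=-1)      # each row sums to 1
    return weights @ visual, weights
```

Because every pair of time steps is compared directly, long-range correspondences are captured in a single matrix product rather than propagated through a recurrent state.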
Authors
笠井 誠斗, 秋本 直郁, 林 昌希, 青木 義満
Publisher
The Japan Society for Precision Engineering
Journal
Journal of the Japan Society for Precision Engineering (ISSN: 0912-0289)
Volume/Issue/Pages, Publication Date
vol.85, no.12, pp.1102-1109, 2019-12-05 (Released: 2019-12-05)
Number of References
15

In this paper, we tackle a novel setting in which a neural network generates object images with transferred attributes by conditioning on natural language commands. Conventional methods for object image transformation have used visual attributes, components that describe the object's color, posture, and so on. Building on this approach, we develop an algorithm that precisely extracts information from natural language commands, transfers the corresponding attributes of an image, and completes the image translation model. We evaluate the effectiveness of our information extraction model experimentally, with additional tests to verify that the change in visual attributes is correctly reflected in the generated image.
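As a toy illustration of mapping a natural language command to a visual attribute change (a hypothetical keyword lookup; the paper learns this extraction rather than hard-coding it):

```python
# Hypothetical attribute vocabularies for illustration only.
COLOR_ATTRS = {"red", "green", "blue"}
POSE_ATTRS = {"left", "right"}

def extract_attributes(command):
    """Return (attribute_type, value) pairs mentioned in the command."""
    found = []
    for token in command.lower().split():
        token = token.strip(".,!")
        if token in COLOR_ATTRS:
            found.append(("color", token))
        elif token in POSE_ATTRS:
            found.append(("pose", token))
    return found
```

In the paper's setting, the extracted attribute information conditions the image translation network, which then renders the object with the requested change.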