Authors
Tomoki Haruyama, Sho Takahashi, Takahiro Ogawa, Miki Haseyama
Publisher
The Institute of Image Information and Television Engineers
Journal
ITE Transactions on Media Technology and Applications (ISSN: 2186-7364)
Volume, issue, pages, and publication date
vol.8, no.2, pp.89-99, 2020 (Released: 2020-04-01)
Number of references
41

The details of soccer matches can be estimated from visual and audio sequences, which reflect the occurrence of important scenes; these sequences are therefore well suited to important scene detection. In this paper, a new multimodal method for important scene detection from visual and audio sequences in far-view soccer videos based on a single deep neural architecture is presented. A unique point of our method is that multiple classifiers can be realized by a single deep neural architecture that includes a Convolutional Neural Network-based feature extractor and a Support Vector Machine-based classifier. This approach provides a solution to the problem that multiple different deep neural architectures cannot be optimized simultaneously from a small amount of training data. We then monitor the confidence measures output by this architecture for the multimodal data and integrate them to obtain the final classification result.
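A minimal sketch of the confidence-integration step described in this abstract is given below, assuming pre-extracted CNN features and scikit-learn; the classifier settings, the weighting in `fuse_confidences`, and all names are illustrative assumptions rather than the authors' implementation.

```python
# Hedged sketch: per-modality SVM classifiers over CNN features, whose
# confidence measures are integrated by a simple weighted average.
import numpy as np
from sklearn.svm import SVC

def train_modality_classifier(features, labels):
    """Train an SVM on CNN features extracted from one modality (visual or audio)."""
    clf = SVC(kernel="rbf", probability=True)  # probability=True enables confidence output
    clf.fit(features, labels)
    return clf

def fuse_confidences(visual_clf, audio_clf, visual_feat, audio_feat, w=0.5):
    """Integrate the two modalities' confidence measures into a final decision.

    The confidences are class-posterior estimates from each SVM; the weighted
    average used here is an assumption made for illustration.
    """
    p_visual = visual_clf.predict_proba(visual_feat)[:, 1]
    p_audio = audio_clf.predict_proba(audio_feat)[:, 1]
    fused = w * p_visual + (1.0 - w) * p_audio
    return (fused >= 0.5).astype(int), fused  # predicted labels and fused confidences
```

In the paper the two feature streams come from the shared CNN-based extractor mentioned above; here they are simply passed in as feature arrays.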
Authors
Katsuki Kobayashi, Takahiro Ogawa, Miki Haseyama
Publisher
The Institute of Image Information and Television Engineers
Journal
ITE Transactions on Media Technology and Applications (ISSN: 2186-7364)
Volume, issue, pages, and publication date
vol.1, no.4, pp.333-342, 2013 (Released: 2013-10-01)
Number of references
24
Number of citations
5

This paper presents a new evaluation criterion for the visualization of image search results based on the feature integration theory. The criterion is derived by combining two elements: visual saliency on the visualization and the grouping degree of similar images. Visual saliency on the visualization of image search results, calculated on the basis of the feature integration theory, represents users' attention, which is closely related to how effectively images can be found. Furthermore, since users perceive similar images placed close to each other as one group, the grouping degree of similar images evaluates how effectively users can find images similar to a desired image. By combining visual saliency on the visualization with the grouping degree of similar images, the novel criterion is derived and the effectiveness of the visualization of image search results can be evaluated.
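The two elements of the criterion can be illustrated with the short sketch below; the scoring functions (a mean-saliency term and a nearest-neighbour grouping term) and their linear combination are assumptions for illustration, not the paper's exact formulation.

```python
# Hedged sketch: combine a saliency term with a grouping term into one score.
import numpy as np

def grouping_degree(positions, features, k=5):
    """Score how well visually similar images are placed close together.

    For each image, its k nearest neighbours on the 2-D layout are found and
    their average feature-space cosine similarity is used as the grouping score.
    """
    n = len(positions)
    scores = []
    for i in range(n):
        d = np.linalg.norm(positions - positions[i], axis=1)
        nn = np.argsort(d)[1:k + 1]                       # k closest images on screen
        sims = [features[i] @ features[j] /
                (np.linalg.norm(features[i]) * np.linalg.norm(features[j]) + 1e-9)
                for j in nn]
        scores.append(np.mean(sims))
    return float(np.mean(scores))

def visualization_criterion(saliency_map, positions, features, alpha=0.5):
    """Combine the saliency term and the grouping term (weighted sum is an assumption)."""
    saliency_term = float(np.mean(saliency_map))          # placeholder saliency statistic
    grouping_term = grouping_degree(positions, features)
    return alpha * saliency_term + (1.0 - alpha) * grouping_term
```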
Authors
Sho Takahashi, Miki Haseyama
Publisher
The Institute of Image Information and Television Engineers
Journal
ITE Transactions on Media Technology and Applications (ISSN: 2186-7364)
Volume, issue, pages, and publication date
vol.1, no.3, pp.220-225, 2013 (Released: 2013-07-01)
Number of references
13
Number of citations
1 2

An Active grid-based method for estimating pass regions from broadcast soccer videos is presented in this paper. A pass region is assumed to be a region in which a pass has a high probability of succeeding. In soccer matches, players discover pass regions based on previous and current player positions. In conventional methods, pass regions are estimated by applying Active Net to only a single frame of a soccer video. In the proposed method, the Active grid is applied to three-dimensional data in which frames of the soccer video are connected along the temporal dimension. The proposed method thus realizes robust estimation of pass regions based on player positions in multiple frames. The proposed method was applied to actual TV programs to verify its effectiveness.
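The following rough sketch illustrates only the spatio-temporal stacking idea: per-frame pass-likelihood maps are stacked along the temporal axis and combined over multiple frames. The distance-to-nearest-opponent scoring and the simple temporal averaging are stand-ins for the Active grid optimization and are assumptions for illustration.

```python
# Hedged sketch: stack per-frame pass maps into a 3-D volume and combine over time.
import numpy as np

def frame_pass_map(opponent_positions, field_shape=(68, 105)):
    """Score each field cell by its distance to the nearest opposing player."""
    h, w = field_shape
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.full(field_shape, np.inf)
    for (px, py) in opponent_positions:
        dist = np.minimum(dist, np.hypot(xs - px, ys - py))
    return dist / dist.max()                      # normalise to [0, 1]

def pass_region_estimate(frames_of_opponents):
    """Stack per-frame maps along the temporal axis and average for robustness."""
    volume = np.stack([frame_pass_map(p) for p in frames_of_opponents])
    return volume.mean(axis=0)                    # multi-frame pass-region estimate
```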
Authors
Soh Yoshida, Hiroshi Okada, Takahiro Ogawa, Miki Haseyama
Publisher
The Institute of Image Information and Television Engineers
Journal
ITE Transactions on Media Technology and Applications (ISSN: 2186-7364)
Volume, issue, pages, and publication date
vol.1, no.3, pp.237-243, 2013 (Released: 2013-07-01)
Number of references
19

This paper presents a new method for improving the performance of SVM-based image classification that incorporates a target object detection scheme. The proposed method detects target objects in training images and improves the classification performance by recalculating the hyperplane from the detection results. Specifically, the proposed method calculates a Support Vector Machine (SVM) hyperplane and detects rectangular areas surrounding the target objects based on the distances between their feature vectors and the separating hyperplane in the feature space. The feature vectors can then be modified by removing features that exist only in background areas. Furthermore, a new hyperplane is calculated by using the modified feature vectors. Since the removed features are not part of the target object, they are not relevant to the learning process, and their removal can therefore improve the classification performance. Experimental results obtained by applying the proposed method to several existing SVM-based classification methods show its effectiveness.
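A compact sketch of the two-pass procedure (initial hyperplane, rectangle selection by distance to the hyperplane, feature modification, retraining) is shown below, using a linear SVM over bag-of-visual-words histograms. The candidate-window search and the histogram representation are simplified assumptions, not the paper's exact detection scheme.

```python
# Hedged sketch of hyperplane-guided feature modification for SVM classification.
import numpy as np
from sklearn.svm import LinearSVC

def bow_histogram(words, n_words):
    """Bag-of-visual-words histogram of quantised local features."""
    return np.bincount(np.asarray(words, dtype=int), minlength=n_words).astype(float)

def best_rectangle(keypoints, words, clf, n_words, candidates):
    """Pick the candidate rectangle whose histogram lies farthest on the
    positive side of the current separating hyperplane."""
    best, best_score = candidates[0], -np.inf
    for (x0, y0, x1, y1) in candidates:
        mask = np.array([x0 <= x <= x1 and y0 <= y <= y1 for (x, y) in keypoints], dtype=bool)
        hist = bow_histogram(np.asarray(words)[mask], n_words)
        score = clf.decision_function(hist.reshape(1, -1))[0]
        if score > best_score:
            best, best_score = (x0, y0, x1, y1), score
    return best

def retrain_with_object_features(images, labels, n_words, candidates):
    """images: list of (keypoints, visual_word_ids); labels: binary class labels."""
    hists = np.array([bow_histogram(w, n_words) for (_, w) in images])
    clf = LinearSVC().fit(hists, labels)                 # initial hyperplane
    new_hists = []
    for (kps, words), hist in zip(images, hists):
        x0, y0, x1, y1 = best_rectangle(kps, words, clf, n_words, candidates)
        mask = np.array([x0 <= x <= x1 and y0 <= y <= y1 for (x, y) in kps], dtype=bool)
        new_hists.append(bow_histogram(np.asarray(words)[mask], n_words)
                         if mask.any() else hist)        # drop background-only features
    return LinearSVC().fit(np.array(new_hists), labels)  # refined hyperplane
```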
Authors
Nobuyuki Yagi, Norifumi Egami, Naoki Shimidzu, Miki Haseyama
Publisher
The Institute of Image Information and Television Engineers
Journal
ITE Transactions on Media Technology and Applications (ISSN: 2186-7364)
Volume, issue, pages, and publication date
vol.1, no.1, pp.10-19, 2013 (Released: 2013-01-01)
Number of references
139
Number of citations
2

This paper presents a survey of research trends in broadcasting technology from production to distribution and from fundamentals to applications. It briefly reviews work being done on UHDTV, 3DTV, IPTV, hybrid broadcasting, transmission technology, audio-visual coding, image sensing devices, display devices, media accessibility, content production, metadata, and content management.
Authors
Miki Haseyama, Takahiro Ogawa, Nobuyuki Yagi
Publisher
The Institute of Image Information and Television Engineers
Journal
ITE Transactions on Media Technology and Applications (ISSN: 2186-7364)
Volume, issue, pages, and publication date
vol.1, no.1, pp.2-9, 2013 (Released: 2013-01-01)
Number of references
82
Number of citations
4 28

Research trends in new video retrieval based on image and video semantic understanding are presented in this paper. First, recent studies related to image and video semantic analysis are introduced to understand leading-edge multimedia retrieval technologies. Several works related to visualization interfaces for multimedia retrieval are also presented. Finally, trends in state-of-the-art studies and the future outlook are described.
Authors
Marie Katsurai, Takahiro Ogawa, Miki Haseyama
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences (ISSN: 0916-8508)
Volume, issue, pages, and publication date
vol.E95-A, no.5, pp.927-937, 2012-05-01 (Released: 2012-05-01)
Number of references
37
Number of citations
1

In this paper, a novel framework for extracting visual feature-based keyword relationships from an image database is proposed. Based on the observation that a set of relevant keywords tends to share common visual features, the keyword relationships in a target image database are extracted in the following two steps. First, the relationship between each keyword and its corresponding visual features is modeled by a classifier; this step enables detection of the visual features related to each keyword. Second, the keyword relationships are extracted from the obtained results. Specifically, to measure the relevance between two keywords, the proposed method removes the visual features related to one keyword from the training images and monitors the performance of the classifier obtained for the other keyword. This measurement is the main difference from conventional methods, which focus only on keyword co-occurrences or visual similarities. Results of experiments conducted using an image database showed the effectiveness of the proposed method.
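The relevance measurement can be sketched as follows: the visual feature dimensions most associated with keyword A are suppressed and the performance drop of keyword B's classifier is monitored. Selecting A-related dimensions by linear-SVM weights and using the accuracy drop as the relevance score are assumptions made for illustration, not the paper's exact procedure.

```python
# Hedged sketch: keyword relevance via feature suppression and performance monitoring.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

def keyword_relevance(X_train, X_test, y_a_train, y_b_train, y_b_test, top_k=50):
    """Relevance of keyword A to keyword B on a shared visual feature space."""
    clf_a = LinearSVC().fit(X_train, y_a_train)
    clf_b = LinearSVC().fit(X_train, y_b_train)
    base = accuracy_score(y_b_test, clf_b.predict(X_test))

    # Suppress the feature dimensions that contribute most to keyword A.
    a_dims = np.argsort(np.abs(clf_a.coef_[0]))[-top_k:]
    X_train_mod = X_train.copy()
    X_train_mod[:, a_dims] = 0.0
    X_test_mod = X_test.copy()
    X_test_mod[:, a_dims] = 0.0

    # Monitor how much keyword B's classifier degrades without A's features.
    clf_b_mod = LinearSVC().fit(X_train_mod, y_b_train)
    degraded = accuracy_score(y_b_test, clf_b_mod.predict(X_test_mod))
    return max(0.0, base - degraded)   # larger drop -> stronger keyword relationship
```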