著者
Marie KATSURAI Ikki OHMUKAI Hideaki TAKEDA
出版者
一般社団法人 電子情報通信学会
雑誌
IEICE Transactions on Information and Systems (ISSN:09168532)
巻号頁・発行日
vol.E99.D, no.4, pp.1010-1018, 2016-04-01 (Released:2016-04-01)
参考文献数
33
被引用文献数
6

It is crucial to promote interdisciplinary research and recommend collaborators from different research fields via academic database analysis. This paper addresses a problem to characterize researchers' interests with a set of diverse research topics found in a large-scale academic database. Specifically, we first use latent Dirichlet allocation to extract topics as distributions over words from a training dataset. Then, we convert the textual features of a researcher's publications to topic vectors, and calculate the centroid of these vectors to summarize the researcher's interest as a single vector. In experiments conducted on CiNii Articles, which is the largest academic database in Japan, we show that the extracted topics reflect the diversity of the research fields in the database. The experiment results also indicate the applicability of the proposed topic representation to the author disambiguation problem.
著者
Marie KATSURAI Ikki OHMUKAI Hideaki TAKEDA
出版者
The Institute of Electronics, Information and Communication Engineers
雑誌
IEICE TRANSACTIONS on Information and Systems (ISSN:09168532)
巻号頁・発行日
vol.E99-D, no.4, pp.1010-1018, 2016-04-01

It is crucial to promote interdisciplinary research and recommend collaborators from different research fields via academic database analysis. This paper addresses a problem to characterize researchers' interests with a set of diverse research topics found in a large-scale academic database. Specifically, we first use latent Dirichlet allocation to extract topics as distributions over words from a training dataset. Then, we convert the textual features of a researcher's publications to topic vectors, and calculate the centroid of these vectors to summarize the researcher's interest as a single vector. In experiments conducted on CiNii Articles, which is the largest academic database in Japan, we show that the extracted topics reflect the diversity of the research fields in the database. The experiment results also indicate the applicability of the proposed topic representation to the author disambiguation problem.
著者
Marie KATSURAI Takahiro OGAWA Miki HASEYAMA
出版者
一般社団法人 電子情報通信学会
雑誌
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences (ISSN:09168508)
巻号頁・発行日
vol.E95.A, no.5, pp.927-937, 2012-05-01 (Released:2012-05-01)
参考文献数
37
被引用文献数
1

In this paper, a novel framework for extracting visual feature-based keyword relationships from an image database is proposed. From the characteristic that a set of relevant keywords tends to have common visual features, the keyword relationships in a target image database are extracted by using the following two steps. First, the relationship between each keyword and its corresponding visual features is modeled by using a classifier. This step enables detection of visual features related to each keyword. In the second step, the keyword relationships are extracted from the obtained results. Specifically, in order to measure the relevance between two keywords, the proposed method removes visual features related to one keyword from training images and monitors the performance of the classifier obtained for the other keyword. This measurement is the biggest difference from other conventional methods that focus on only keyword co-occurrences or visual similarities. Results of experiments conducted using an image database showed the effectiveness of the proposed method.
著者
Marie KATSURAI Takahiro OGAWA Miki HASEYAMA
出版者
The Institute of Electronics, Information and Communication Engineers
雑誌
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences (ISSN:09168508)
巻号頁・発行日
vol.E95-A, no.5, pp.927-937, 2012-05-01

In this paper, a novel framework for extracting visual feature-based keyword relationships from an image database is proposed. From the characteristic that a set of relevant keywords tends to have common visual features, the keyword relationships in a target image database are extracted by using the following two steps. First, the relationship between each keyword and its corresponding visual features is modeled by using a classifier. This step enables detection of visual features related to each keyword. In the second step, the keyword relationships are extracted from the obtained results. Specifically, in order to measure the relevance between two keywords, the proposed method removes visual features related to one keyword from training images and monitors the performance of the classifier obtained for the other keyword. This measurement is the biggest difference from other conventional methods that focus on only keyword co-occurrences or visual similarities. Results of experiments conducted using an image database showed the effectiveness of the proposed method.