- Authors
- 國府 久嗣, 山崎 治子, 野坂 政司
- Publisher
- 日本感性工学会
- Journal
- 日本感性工学会論文誌 (ISSN:18840833)
- Volume, issue, pages, publication date
- vol.12, no.4, pp.511-518, 2013 (Released:2013-12-11)
- Number of references
- 17
- Number of citing works
- 3
Extracting keywords from target text data is essential for analyses that describe the substantive characteristics of message content. Among the available alternatives, we chose a stopword filter because it is simple yet effective. The filter we present is made up of non-content words and low-content words. Non-content-bearing words consist mainly of function words and were removed using part-of-speech (POS) tag information. Among the remaining words, those with a high occurrence rate are candidate keywords; however, they usually include some low-content words such as delexical verbs. This article presents a stopword list for identifying low-content words, obtained through intuitive manual procedures carried out on 40 text files from the CASTEL/J database, and examines its general versatility.
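
The two-stage filter described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the POS tag names, the English sample tokens, the contents of `LOW_CONTENT_STOPWORDS`, and the function name `extract_keywords` are all assumptions made for the example, and the input is assumed to be already tokenized and POS-tagged by some external tool.

```python
from collections import Counter

# Assumed tag names for function words (non-content words); a real tagger
# would supply its own tag set.
FUNCTION_POS = {"particle", "auxiliary", "conjunction", "determiner"}

# Hypothetical low-content stopword list (e.g. delexical verbs). The list in
# the paper was compiled manually and is not reproduced here.
LOW_CONTENT_STOPWORDS = {"do", "make", "have", "thing"}


def extract_keywords(tagged_tokens, top_n=3):
    """Return the top_n most frequent words that survive both filter stages."""
    # Stage 1: drop non-content words using POS tag information.
    content = [word for word, pos in tagged_tokens if pos not in FUNCTION_POS]
    # Stage 2: drop low-content words using the stopword list.
    kept = [word for word in content if word not in LOW_CONTENT_STOPWORDS]
    # Remaining high-frequency words are treated as keyword candidates.
    return Counter(kept).most_common(top_n)


# Illustrative (word, POS) pairs; not taken from the CASTEL/J database.
sample = [
    ("the", "determiner"), ("filter", "noun"), ("make", "verb"),
    ("a", "determiner"), ("keyword", "noun"), ("filter", "noun"),
    ("and", "conjunction"), ("make", "verb"), ("keyword", "noun"),
    ("filter", "noun"), ("make", "verb"), ("text", "noun"),
]

print(extract_keywords(sample, top_n=2))
```

In this sample, "make" occurs as often as "filter" and passes the POS stage because it is tagged as a verb, so only the stopword list keeps it out of the candidates. That is the case the abstract raises for delexical verbs.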