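Building a Kigo Database and an Attempt at Automatic Identification of Kigo in Haiku (2): Expanding the Kigo Set and Improving the Identification Rate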


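Author: 吉岡亮衛
Publisher: Information Processing Society of Japan (一般社団法人情報処理学会)
Journal: IPSJ SIG Technical Reports, Computers and the Humanities (CH) (ISSN:09196072)
Volume, issue, pages, and date: vol.2001, no.6, pp.17-24, 2001-01-19
References: 6
Cited by: 2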

Computer-based research on haiku requires both a haiku database and a kigo (season word) database for analyzing the haiku. Ahead of the full-scale construction of these databases, this paper reports an examination of the structure of the kigo database and of how many kigo it should contain. Concretely, three published kigo collections were used as source material to build three kigo sets: (1) the kigo common to all three books, (2) all headword kigo found in the books, and (3) the headwords together with their alternative names, aliases, related terms, and counterpart terms, including the broader terms listed in the book "Shinhan-Kiyose". Each set was then used to try to identify the kigo in a sample of 448 haiku.

In the earlier study, 1,542 kigo identified the kigo in about 65% of the 448 haiku. With the enlarged sets, the identification rate rose to 344 haiku (76.8%) with 2,901 kigo, 399 haiku (89.1%) with 6,709 kigo, and 420 haiku (93.8%) with about 20,000 kigo. Enlarging the kigo database therefore improves the identification rate. For the haiku whose kigo could not be identified even with the largest set, the reasons are discussed and possible improvements are considered.
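The three kigo sets and the identification rate described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how matching was implemented, so plain substring matching is assumed here, and the toy `BOOKS` data, the sample `HAIKU` list, and all function names are hypothetical rather than taken from the paper.

```python
from typing import Dict, List, Set

# Hypothetical toy data: each "book" maps a headword kigo to its aliases.
# The real study used three published kigo collections.
BOOKS: List[Dict[str, Set[str]]] = [
    {"桜": {"花"}, "蛙": {"かはづ"}, "雪": set()},
    {"桜": set(), "蛙": set(), "月": {"名月"}},
    {"桜": set(), "蛙": set(), "雪": set(), "蝉": set()},
]

# Illustrative sample haiku (not the 448 haiku used in the paper).
HAIKU: List[str] = [
    "古池や蛙飛びこむ水の音",
    "閑さや岩にしみ入る蝉の声",
    "名月や池をめぐりて夜もすがら",
    "花の雲鐘は上野か浅草か",
]


def common_headwords(books: List[Dict[str, Set[str]]]) -> Set[str]:
    """Set (1): kigo that appear as headwords in every book."""
    return set.intersection(*(set(b) for b in books))


def all_headwords(books: List[Dict[str, Set[str]]]) -> Set[str]:
    """Set (2): every headword kigo found in any book."""
    return set().union(*(set(b) for b in books))


def headwords_with_aliases(books: List[Dict[str, Set[str]]]) -> Set[str]:
    """Set (3): all headwords plus their aliases and related terms."""
    words = all_headwords(books)
    for book in books:
        for aliases in book.values():
            words |= aliases
    return words


def find_kigo(haiku: str, kigo_set: Set[str]) -> List[str]:
    """Return the kigo found in the haiku, longest first.

    Assumption: a kigo is 'found' if it occurs as a substring.
    """
    hits = [k for k in kigo_set if k in haiku]
    return sorted(hits, key=lambda k: (-len(k), k))


def identification_rate(haiku_list: List[str], kigo_set: Set[str]) -> float:
    """Fraction of haiku in which at least one kigo was found."""
    hits = sum(1 for h in haiku_list if find_kigo(h, kigo_set))
    return hits / len(haiku_list)


if __name__ == "__main__":
    for name, build in [
        ("common", common_headwords),
        ("all headwords", all_headwords),
        ("with aliases", headwords_with_aliases),
    ]:
        kigo = build(BOOKS)
        print(f"{name}: {len(kigo)} kigo, rate {identification_rate(HAIKU, kigo):.1%}")
```

On this toy data the rate rises as the set grows, which mirrors the trend reported above. Substring matching can also produce false hits (a short kigo occurring inside an unrelated word), so a real system would likely need some disambiguation step.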


https://ci.nii.ac.jp/naid/110002930114
