Japanese Named Entity Extraction Based on the Shift-Reduce Method (Shift-Reduce法に基づく日本語固有表現抽出)


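Author: 山田寛康
Publisher: Information Processing Society of Japan (一般社団法人情報処理学会)
Journal: IPSJ SIG Technical Reports, Spoken Language Processing (SLP) (情報処理学会研究報告音声言語情報処理) (ISSN: 09196072)
Volume, issue, pages, and publication date: vol.2007, no.47, pp.13-18, 2007-05-24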

This paper proposes an extraction method for Japanese named entities based on the Shift-Reduce method and verifies its effectiveness on the IREX Japanese named entity extraction task. Because it is based on the Shift-Reduce method, the proposed approach realizes a natural analysis that proceeds from the start of the sentence, first estimating the word boundaries of a named entity and then estimating its type. In morpheme-level analysis of Japanese, extraction errors arise when morpheme boundaries differ from named entity boundaries. The proposed method handles this problem by adding simple extended actions, without analyzing the entire input sentence character by character. In an evaluation by five-fold cross-validation on the CRL named entity extraction data, the method obtained an F-value of 0.88 with an efficient approach that analyzes the sentence from beginning to end and switches to character-level analysis only partially.

We propose a deterministic method for Japanese Named Entity (NE) extraction based on shift-reduce parsing. After shift actions determine the word boundaries of an NE composed of multiple morphemes, a reduce action is applied to estimate the NE type. When Japanese NEs are analyzed morpheme by morpheme, incorrect extractions are inevitable because the word boundaries of some NEs differ from the morpheme boundaries. While most well-known work analyzes NEs character by character at the expense of efficiency, our method can analyze NEs morpheme by morpheme in most cases by introducing two types of additional shift-reduce actions that adjust to the word boundaries of an NE. The result of 5-fold cross-validation on the CRL NE data set shows an F-value of 0.88, which is comparable with related work, and our left-to-right morpheme-level analysis is more efficient.
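The basic shift/reduce loop described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the gazetteer `TOY_ENTITIES`, the rule-based `toy_policy`, the `Chunk` type, and the label names are all hypothetical stand-ins for the trained action classifier, and the two additional boundary-adjusting actions are omitted because the abstract does not specify them.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Iterable, List, Optional, Tuple


@dataclass
class Chunk:
    text: str
    label: Optional[str]  # NE type, or None for non-entity text


# Hypothetical toy gazetteer; a real system would use a learned classifier.
TOY_ENTITIES = {"山田寛康": "PERSON", "情報処理学会": "ORGANIZATION"}


def toy_policy(stack: List[str], buffer: Deque[str]) -> Tuple[str, Optional[str]]:
    """Choose the next action for a non-empty stack (assumed rule, not from the paper)."""
    candidate = "".join(stack)
    # Shift while the stack plus the next morpheme could still grow into an entity.
    if buffer and any(name.startswith(candidate + buffer[0]) for name in TOY_ENTITIES):
        return ("shift", None)
    # Otherwise close the span and assign its type (None if it is not an entity).
    return ("reduce", TOY_ENTITIES.get(candidate))


def extract(morphemes: Iterable[str], policy=toy_policy) -> List[Chunk]:
    """Left-to-right deterministic pass: shift fixes boundaries, reduce fixes the type."""
    stack: List[str] = []
    buffer: Deque[str] = deque(morphemes)
    chunks: List[Chunk] = []
    while buffer or stack:
        # An empty stack can only shift; otherwise ask the policy.
        action, label = policy(stack, buffer) if stack else ("shift", None)
        if action == "shift" and buffer:
            stack.append(buffer.popleft())
        else:
            chunks.append(Chunk("".join(stack), label))
            stack.clear()
    return chunks


if __name__ == "__main__":
    for chunk in extract(["山田", "寛康", "は", "情報", "処理", "学会", "の", "会員"]):
        print(chunk.text, chunk.label)
```

Each morpheme is shifted exactly once and each span is reduced exactly once, so the pass is linear in the number of morphemes, which is the efficiency argument the abstract makes against full character-level analysis.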

2014-06-20 00:21:59

https://ci.nii.ac.jp/naid/110006291111

Mentions

Twitter (1 user, 1 post, 2 favorites)

@takuya_a FYI: Come to think of it, someone I had the chance to work with at my previous job was an NE specialist http://t.co/XMr9ep8Flh

2 @takuya_a @taniokah

Collected URLs

https://ci.nii.ac.jp/naid/110006291111 (1)