金川 絵利子 岡留 剛
The Japanese Society for Artificial Intelligence
Transactions of the Japanese Society for Artificial Intelligence (ISSN: 1346-0714)
vol.32, no.3, pp.F-G94_1-14, 2017-05-01 (Released:2017-05-01)

The subtree kernel and the information tree kernel defined here allow us to measure the syntactic characteristics and similarity of sentences. The subtree kernel is the total number of subtrees common to two trees, and the information tree kernel is the total Shannon information content of those common subtrees. The information tree kernel enables us to capture structural characteristics peculiar to the styles of individual writers. Analyses using these kernels reveal syntactic characteristics and similarities in the writing styles of 31 Japanese authors. In particular, the results for five major authors, Soseki Natume, Ryunosuke Akutagawa, Osamu Dazai, Nankiti Niimi, and Kenzi Miyazawa, show, for example, that (1) Natume more often writes sentences whose dependency structure contains the same subtree structure multiple times; (2) Akutagawa uses dependency structures for extra or detailed expressions that modify a noun phrase more often than the others do; (3) Dazai often uses dependency structures that consist of many shallow subtrees arranged in parallel, whereas the others seldom write sentences with such parallel subtree structures; and (4) Niimi uses simpler dependency structures than Miyazawa does, while Miyazawa writes short sentences in a wider variety of dependency structures.
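
The two kernel definitions can be illustrated with a minimal Python sketch. It is not the paper's implementation, and the abstract does not specify several details, so the following are assumptions made only for illustration: a "subtree" is taken to be a node together with all of its descendants, sibling order is ignored, common subtrees are counted as a multiset intersection, and the probability of each subtree is supplied by the caller (for example, estimated from a corpus). The node labels and the names `canon`, `subtrees`, `subtree_kernel`, and `information_tree_kernel` are hypothetical.

```python
from collections import Counter
from math import log2

# A tree is a nested tuple: (label, child, child, ...).
# Labels are hypothetical node tags, not the paper's actual annotation.

def canon(tree):
    """Canonical string for the complete subtree rooted at `tree`.
    Children are sorted, so sibling order is ignored (an assumption)."""
    label, *children = tree
    return "(" + label + "".join(sorted(canon(c) for c in children)) + ")"

def subtrees(tree):
    """Multiset of canonical forms of every complete subtree in `tree`."""
    bag = Counter([canon(tree)])
    for child in tree[1:]:
        bag += subtrees(child)
    return bag

def subtree_kernel(t1, t2):
    """Number of subtrees shared by t1 and t2 (multiset intersection)."""
    return sum((subtrees(t1) & subtrees(t2)).values())

def information_tree_kernel(t1, t2, prob):
    """Sum of Shannon information content, -log2 p(s), over shared subtrees.
    `prob` maps a canonical subtree string to its assumed probability."""
    common = subtrees(t1) & subtrees(t2)
    return sum(n * -log2(prob[s]) for s, n in common.items())

# Two toy dependency trees that share the subtrees (A) and (N(A)).
a = ("V", ("N", ("A",)), ("N",))
b = ("V", ("N", ("A",)), ("P",))

# Made-up probabilities: the rarer subtree carries more information.
prob = {"(A)": 0.5, "(N(A))": 0.125}

print(subtree_kernel(a, b))                 # 2
print(information_tree_kernel(a, b, prob))  # 4.0
```

In this toy case both trees share two subtrees, but the information tree kernel weights the rarer one three times as heavily (3 bits versus 1 bit). That weighting is the property the abstract credits with capturing structures peculiar to a writer's style.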