- The Behaviormetric Society of Japan
- 行動計量学 (ISSN:03855481)
- vol.40, no.1, pp.17-28, 2013-03-28
This paper proposes a method for authorship identification based on phrase patterns in Japanese text and evaluates it empirically on literary works, student compositions, and journals. The results show that phrase patterns clearly capture a writer's stylistic characteristics. Using Random Forests to distinguish between two arbitrarily chosen authors, the rate of correct identification was 99% for literary works and for student compositions, and 92% for journals. To assess the effectiveness of the proposed method, phrase patterns were compared with part-of-speech (POS) trigrams. No obvious difference in the rate of correct identification was found between phrase patterns C and POS trigrams. However, combining phrase patterns C with morphological data yielded a higher rate of correct identification than combining POS trigrams with morphological data. On this basis, we analyzed the authorship doubts surrounding Kawabata Yasunari's works, alongside the works of Mishima Yukio, HMakoto and Sawana Hisao. The phrase pattern analysis suggested that there is no doubt about the authorship of Kawabata's works.
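As a rough illustration of the POS trigram baseline mentioned above, the following minimal sketch counts consecutive triples of part-of-speech tags and converts them to relative frequencies, which could then serve as input features for a classifier such as Random Forests. The function name `pos_trigram_features` and the tag labels are hypothetical and are not taken from the paper; the construction of the paper's phrase patterns is not reproduced here.

```python
from collections import Counter


def pos_trigram_features(pos_tags):
    """Return relative frequencies of consecutive POS-tag triples.

    pos_tags is a list of tag strings for one text, in reading order.
    Texts with fewer than three tags yield an empty feature dict.
    """
    # Slide a window of width 3 over the tag sequence.
    trigrams = [tuple(pos_tags[i:i + 3]) for i in range(len(pos_tags) - 2)]
    counts = Counter(trigrams)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalize so that texts of different lengths are comparable.
    return {trigram: n / total for trigram, n in counts.items()}


# Hypothetical tag sequence; a real one would come from a morphological analyzer.
tags = ["noun", "particle", "verb", "noun", "particle", "verb", "auxiliary"]
features = pos_trigram_features(tags)
```

Normalizing by the total number of trigrams is one simple way to keep texts of different lengths comparable; whether the paper uses this exact normalization is an assumption.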