Authors
小泉 利恵 渡邉 聡代
Publisher
The Japan Society of English Language Education
Journal
全国英語教育学会紀要 (ISSN: 1344-8560)
Volume, pages, and publication date
vol. 32, pp. 129-144, 2021-03-31 (Released: 2022-04-01)
Number of references
23

When teachers score classroom speaking tests, intensive rater training ahead of the test is not always possible. The current study examines the extent to which rater reliability can be maintained using a simple rubric without detailed rater training. We analyzed four speaking tests for senior high school students (N = 116). The tests, administered over seven months, comprised an individual presentation, a paired role play, and two group discussions. Each test was scored with a simple rubric by two or more raters who had not received intensive rater training. The data were analyzed using many-facet Rasch measurement and generalizability theory. The results suggest that, in general, raters scored similarly and consistently. At the overall test level, the number of raters required to maintain sufficient reliability (Φ = .70) ranged from one to four, with the group discussion tests requiring more raters or intensive rater training. Pedagogical implications for the allocation of limited time and rater resources are discussed.