Author
馬場 正太郎
Publisher
Japan Language Testing Association
Journal
JLTA Journal (ISSN:21895341)
Volume, issue, pages / publication date
vol.22, pp.44-64, 2019 (Released:2020-02-07)
Cited by
1

The purpose of this study is to propose a way of generating beneficial washback from high-stakes testing, drawing on the perspective of educational psychology. In the current educational climate in Japan, high school students are required to make the best use of language tests such as EIKEN, GTEC, TEAP, TOEFL, IELTS, and the Cambridge English Exam in order to improve their four-skill English proficiency. Although this educational reform has been criticized for its aggressive implementation, there has been little discussion of how to induce its beneficial washback while reducing its negative washback. It is therefore necessary not only to debate the flaws of the reform but also to seek practical solutions. This paper first briefly reviews the recent standards-based educational reform efforts in Japan. Next, the concepts of washback and validity are introduced to argue that washback should be considered a consequential aspect of validity. Then, an effective way to induce beneficial washback is discussed on the basis of previous studies in educational psychology. Specifically, by introducing research on learners’ beliefs about tests, this study illustrates what kinds of beliefs lead to beneficial washback. Lastly, the practical implications of this study and the need for future research are discussed.
Author
Takeshi KATO
Publisher
Japan Language Testing Association
Journal
JLTA Journal (ISSN:21895341)
Volume, issue, pages / publication date
vol.22, pp.23-43, 2019 (Released:2020-02-07)

Over the last four decades, the constructs of complexity, accuracy, and fluency have been a focus in the analysis of language learners’ performance. However, because complexity is polysemous, more and more sub-constructs have been proposed, making holistic measurement difficult. This study aims to construct a more appropriate measurement model of L2 complexity by implementing finer-grained and relatively novel linguistic indices that capture subordinate constructs which conventional indices could not measure. Using five natural language processing tools, conventional and fine-grained indices of complexity were computed from 503 argumentative essays written by Japanese learners of English. First, exploratory factor analysis was performed on the linguistic index values to extract the factor structures behind them. Second, confirmatory factor analysis was conducted to check whether these structures fit the data. Finally, a structural equation model in which the complexity constructs predict essay scores was tested to evaluate its applicability to writing evaluation. The series of factor analyses showed that the extracted factor structures fit the data reasonably well for syntactic complexity (CFI = .901 and RMSEA = .071) and for lexical complexity (CFI = .978 and RMSEA = .051). Furthermore, the structural equation model (SEM), proposed as a predictive model, accounted for 32.3% of the variance in essay scores (CFI = .916 and RMSEA = .077). Overall, the findings showed the effectiveness of the proposed approach, which combined conventional linguistic features with fine-grained and relatively novel indices.
Authors
David ALLEN, Tatsuro TAHARA
Publisher
Japan Language Testing Association
Journal
JLTA Journal (ISSN:21895341)
Volume, issue, pages / publication date
vol.24, pp.3-22, 2021 (Released:2022-05-25)
References
54
Cited by
3

Washback research in language education aims to demonstrate, explain, and ultimately predict the impact of tests on teaching and learning in educational contexts. A recent review in the international arena (Cheng et al., 2015) has revealed a rapidly growing field of empirical washback research, yet only two studies were identified as occurring in the Japanese context. The present article therefore sought to document more fully the washback research conducted in Japan prior to 2021, with the aim of facilitating future research in this important area. Following an extensive online search, 32 empirical washback studies in the Japanese context were identified. These studies were analyzed in terms of the following information: publication details, test(s) involved, context and participants, methodology, aspects of washback investigated, and type of consequence targeted. The review reveals a wealth of empirical literature that has adopted a variety of research methods and designs to investigate the impact of a variety of tests, notably university entrance exams. On the basis of these previous studies, a series of recommendations is made for future washback research in Japan.
Author
Paul WICKING
Publisher
Japan Language Testing Association
Journal
JLTA Journal (ISSN:21895341)
Volume, issue, pages / publication date
vol.20, pp.76-89, 2017 (Released:2017-12-13)
References
2
Cited by
3

Any discussion about English education in Japan is invariably bound up with a discussion about assessment. All too often, such discussions have focused on entrance examinations and other high-stakes summative tests. However, language testing and assessment do not take place in a vacuum, but are deeply affected by broader social and cultural contexts, as well as by individual features of the school and the classroom. In particular, perhaps more than any other factor, it is teachers who have great influence over how tests are created, conducted, and interpreted. For universities to foster an environment in which reliable assessment can take place, it is not enough to provide well-designed, rigorous tests. It is also imperative that teachers’ beliefs and practices concerning assessment be taken into account. This paper asks: what are the assessment beliefs and practices of EFL teachers working in Japanese universities? To answer this question, survey responses were gathered from English language teachers working in Japanese higher education. The results indicated some slight differences in belief and practice between native speaker teachers and non-native speaker teachers, as well as between full-time and part-time teachers. Despite these differences, it seems that most teachers have a learning-oriented approach to assessment. The findings provide support and direction for policy-makers and educational leaders seeking to promote better testing practice.
Author
Yusuke HASEGAWA
Publisher
Japan Language Testing Association
Journal
JLTA Journal (ISSN:21895341)
Volume, issue, pages / publication date
vol.22, pp.3-22, 2019 (Released:2020-02-07)

This study aims to reveal whether any specific type of vocabulary learning strategy (VLS) leads to higher scores on semi-contextualized word meaning tests, a multiple-choice gap-filling format in which short written contexts are provided. A total of 132 first-year university students learning English as a foreign language completed a VLS questionnaire and a semi-contextualized word meaning test. The relationship between these two variables was examined using Pearson’s correlation analysis, confirmatory factor analysis, and exploratory factor analysis. The results demonstrated that the relationships between VLS use and test scores were very weak (rs < .20), regardless of the strategy type. That these correlations are smaller than those reported in previous studies using vocabulary size tests may be due to the more complicated constructs involved in the semi-contextualized word meaning test, which requires not only receptive knowledge of word meanings, but also reading comprehension skills and knowledge of word forms and usage in a sentence. However, imagery strategies, such as creating a mental image of word forms, had a very weak but significant positive correlation with the test scores. Based on these results, this study further discusses how Japanese high school students who will take examinations that employ the semi-contextualized word meaning test format should learn vocabulary.
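As a rough illustration of the Pearson correlation analysis mentioned in this abstract, the following minimal sketch computes r for two small samples. The numbers are invented for illustration only and are not data from the study; the function name `pearson_r` is likewise hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    # Sums of squared deviations and of cross-products around the means.
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)

# Hypothetical strategy-use ratings and test scores (illustration only).
strategy_use = [1, 2, 3, 4, 5]
test_scores = [2, 4, 5, 4, 5]

r = pearson_r(strategy_use, test_scores)
```

Values of r near 0, such as those the abstract reports (below .20), indicate that strategy use explains very little of the variation in test scores.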
Authors
Masumi ONO, Hiroyuki YAMANISHI, Yuko HIJIKATA
Publisher
Japan Language Testing Association
Journal
JLTA Journal (ISSN:21895341)
Volume, issue, pages / publication date
vol.22, pp.65-88, 2019 (Released:2020-02-07)
Cited by
1 1

Integrated writing tasks are becoming popular in the field of language testing, but it remains unclear how teachers assess integrated writing tasks holistically and/or analytically and which method is more effective. This exploratory study aims to investigate the reliability and validity of teacher-raters’ holistic and analytic ratings and to reveal their perceptions of grading the integrated writing task on the Test of English as a Foreign Language Internet-based Test (TOEFL iBT). Thirty-six university students completed a reading-listening-writing task. Seven raters scored the 36 compositions using both a holistic and an analytic scale, and completed a questionnaire about their perceptions of the scales. Results indicated that the holistic and analytic scales exhibited high inter-rater reliability and that there were high correlations between the two rating methods. In analytic scoring, which contained four dimensions (content, organization, language use, and verbatim source use), the dimensions of content and organization were highly correlated with the overall analytic score (i.e., the mean score of the four dimensions). However, the dimension of verbatim source use was found to be peculiar in terms of construct validity for the analytic scale. The analyses also indicated various challenges the raters faced while scoring. Their perceptions varied particularly regarding verbatim source use: some raters tended to emphasize the intricate process of textual borrowing, while others stressed the difficulty of judging multiple types and degrees of textual borrowing. Pedagogical implications for the selection and use of rubrics as well as the teaching and assessment of source text use are suggested.
Publisher
Japan Language Testing Association
Journal
JLTA Journal (ISSN:21895341)
Volume, issue, pages / publication date
vol.19, no.2, 2016
Cited by
1

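Chapter 1: Historical Background of JLTA
  大友賢二
  ランディ・スラッシャー
Chapter 2: JLTA and Testing Research
  ジェームズ・ディーン・ブラウン
  クォン・オリャン
  アンソニー・グリーン
  ジョン・リード
  フレッド・デヴィッドソン
  デイヴィッド・ベグラー
  バリー・オサリバン
  ウォンキー・リー
  浪田克之介
  中村優治
  木下正義
  池田央
  羽鳥博愛
  田中正道
  野呂忠司
  柳瀬陽介
Chapter 3: The Future of Testing Research
  3.1 Requirements for appropriate test use
    3.1.1 Validity and reliability
    3.1.2 Washback and impact
    3.1.3 Fairness, codes of ethics, and standardization
  3.2 Principles of appropriate test development and use
    3.2.1 Test purposes and constructs
    3.2.2 Test specifications
    3.2.3 Test formats for receptive skills
    3.2.4 Test formats for productive skills
    3.2.5 Item writing and task design
    3.2.6 Development of rating scales
    3.2.7 Rating (scoring) by raters and rater training
    3.2.8 Test standardization and equating
    3.2.9 Validation
    3.2.10 Feedback of results to stakeholders
  3.3 Assessment of language knowledge and skills
    3.3.1 Assessment of listening
    3.3.2 Assessment of reading
    3.3.3 Assessment of monologic speaking
    3.3.4 Assessment of dialogic speaking
    3.3.5 Assessment of integrated speaking
    3.3.6 Assessment of independent writing
    3.3.7 Assessment of integrated writing
    3.3.8 Assessment of vocabulary
    3.3.9 Assessment of grammar
    3.3.10 Assessment of spelling
    3.3.11 Assessment of pronunciation
    3.3.12 Assessment of Japanese as a second language
  3.4 New directions in assessment
    3.4.1 Can-Do assessment
    3.4.2 Linking assessment to the Common European Framework of Reference for Languages
    3.4.3 Assessment of young learners' language ability
    3.4.4 Assessment of English teachers
    3.4.5 Classroom assessment
    3.4.6 Assessment for specific purposes
    3.4.7 Computer-adaptive testing (theory)
    3.4.8 Computer-adaptive testing (practice)
  3.5 Theories and methods in language testing research
    3.5.1 Classical test theory
    3.5.2 Generalizability theory
    3.5.3 Rasch analysis of dichotomous items
    3.5.4 Many-facet Rasch analysis
    3.5.5 Item response theory
    3.5.6 Latent rank theory
    3.5.7 Cognitive diagnostic modeling
    3.5.8 Differential item functioning
    3.5.9 Confirmatory factor analysis
    3.5.10 Multilevel analysis
    3.5.11 Meta-analysis
    3.5.12 Qualitative methods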
Author
三上 明洋
Publisher
Japan Language Testing Association
Journal
JLTA Journal (ISSN:21895341)
Volume, issue, pages / publication date
vol.21, pp.82-101, 2018 (Released:2018-12-24)

The aims of this study are to evaluate the content validity of a reflection tool for EFL teachers’ professional development in Japan, called the Self-Evaluation Checklist for EFL Teachers (SECEFLT), and to provide validity evidence for interpreting and using SECEFLT scores through Kane’s (2006) argument-based approach. SECEFLT was originally developed by Mikami (2015) to promote EFL teachers’ reflection on their professional competencies. It was revised by Mikami (2018) through a construct validation process using both exploratory and confirmatory factor analyses. To gather further validity evidence related to the content aspect of the revised SECEFLT, a survey was conducted with a panel of experts comprising six English language teachers (all English language education majors) at teacher education departments in national universities in Japan. The experts were asked to evaluate the extent to which each item in the revised SECEFLT was relevant to the content domain it aimed to measure, as well as the overall relevance of the revised SECEFLT to that content domain. The results showed that each individual item in the scale was appropriate in terms of content validity, and that the scale as a whole was also appropriate judging from the individual item evaluations. It was also confirmed that the experts judged the revised SECEFLT to be content-valid when asked directly whether it was appropriate overall. Based on the study results, interpretive arguments are discussed using Kane’s (2006) framework for indicators of theoretical constructs.
Author
John M. NORRIS
Publisher
Japan Language Testing Association
Journal
JLTA Journal (ISSN:21895341)
Volume, issue, pages / publication date
vol.21, pp.3-20, 2018 (Released:2018-12-24)
Cited by
11

Constructed-response tasks have captured the attention of testers and educators for some time (e.g., Cureton, 1951), because they present goal-oriented, contextualized challenges that prompt examinees to deploy cognitive skills and domain-related knowledge in authentic performances. Such performances present a distinct advantage when teaching, learning, and assessment focus on what learners can do rather than merely emphasizing what they know (Wiggins, 1998). Over the past several decades, communicative performance tasks have come to play a crucial role in language assessments on a variety of levels, from classroom-based tests, to professional certifications, to large-scale language proficiency exams (Norris, 2009, 2016). However, the use of such tasks for assessment purposes remains contentious, and numerous language testing alternatives are available at potentially lower cost and degree of effort. In order to facilitate decisions about when and why to adopt task-based designs for language assessment, I first outline the relationship between assessment designs and their intended uses and consequences. I then introduce two high-stakes examples of language assessment circumstances (job certification and admissions testing) that suggest a need for task-based designs, and I review the corresponding fit of several assessments currently in use for these purposes. In relation to these purposes, I also suggest some of the positive consequences of task-based designs for language learners, teachers, and society, and I point to the dangers of using assessments that do not incorporate communicative tasks or do so inappropriately. I conclude by highlighting other circumstances that call for task-based designs, and I suggest how advances in technology may help to address associated challenges.
Author
YOSHIDA Hiroko
Publisher
Japan Language Testing Association
Journal
JLTA Journal (ISSN:21895341)
Volume, issue, pages / publication date
vol.15, pp.101-114, 2012

This study investigated the relationship between the TOEIC Bridge and TOEIC test scores, in particular, the extent to which the TOEIC Bridge test scores can predict the TOEIC test scores. The participants in this study were 292 non-English major students who took both the TOEIC Bridge and TOEIC tests in 2009. They were first-year students enrolled in a private university in Western Japan. Their scores on both tests were statistically examined using regression analysis. The results of the study showed that (1) the scores of the TOEIC Bridge and TOEIC tests were moderately correlated and (2) the TOEIC Bridge scores significantly predicted the TOEIC scores. Equations for estimating the TOEIC scores using the TOEIC Bridge scores were also specified, from which a comparison of the predicted TOEIC scores from the ETS study and the present study was constructed. The results of the comparison showed that the predicted scores from the two studies had similar intercepts and slopes for a certain range of TOEIC Bridge scores, but that the predicted scores diverged above this range.
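The prediction step described in this abstract can be illustrated with a minimal ordinary least-squares sketch. The score pairs below are invented purely for illustration and are not data from the study; the function name `fit_line` is hypothetical, and the estimation equations actually reported in the paper are not reproduced here.

```python
from statistics import mean

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Hypothetical (TOEIC Bridge, TOEIC) score pairs, invented for illustration only.
bridge_scores = [100, 110, 120, 130, 140]
toeic_scores = [300, 330, 380, 400, 440]

intercept, slope = fit_line(bridge_scores, toeic_scores)

# Predicted TOEIC score for a hypothetical TOEIC Bridge score of 125.
predicted = intercept + slope * 125
```

Two fitted lines, such as one from an earlier study and one from a new sample, can be compared by evaluating both at the same input scores, which is how predictions that agree over one range can be seen to diverge over another.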