Author
Takeshi KATO
Publisher
Japan Language Testing Association
Journal
日本言語テスト学会誌 (ISSN:21895341)
Volume / Pages / Date
vol.22, pp.23-43, 2019 (Released:2020-02-07)

Over the last four decades, the constructs of complexity, accuracy, and fluency have been central to the analysis of language learners’ performance. However, owing to the polysemous nature of complexity, more and more sub-constructs have been posited, making holistic measurement difficult. This study aims to construct a more appropriate measurement model of L2 complexity by implementing fine-grained and relatively novel linguistic indices to capture subordinate constructs that conventional indices could not measure. Using five natural language processing tools, conventional and fine-grained indices of complexity were computed from 503 argumentative essays written by Japanese learners of English. First, exploratory factor analysis was performed on the linguistic index values to extract the factor structures behind them. Second, confirmatory factor analysis was conducted to confirm whether those structures fit the data. Finally, a structural equation model in which the complexity constructs predict essay scores was tested to evaluate its applicability to writing evaluation. The series of factor analyses showed that the extracted factor structures fitted the data reasonably well for syntactic complexity (CFI = .901, RMSEA = .071) and for lexical complexity (CFI = .978, RMSEA = .051). Furthermore, the structural equation modeling (SEM) analysis proposed as a predictive model accounted for 32.3% of the variance in essay scores (CFI = .916, RMSEA = .077). Overall, the findings demonstrate the effectiveness of the proposed approach, which combined conventional linguistic features with fine-grained and relatively novel indices.
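The first analytic step described above, exploratory factor analysis with rotation over a matrix of complexity indices, can be sketched as follows. This is a minimal illustration on synthetic data, not the study's actual pipeline: the index values, the number of factors, and the two-factor structure are all assumptions for demonstration.

```python
# Sketch of the EFA step: extract latent factors behind complexity indices.
# Synthetic data; in the study, rows would be the 503 essays and columns
# the complexity indices computed by the NLP tools.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n_essays, n_indices = 503, 6
# Simulate two latent factors (e.g., syntactic vs. lexical complexity),
# each loading on three of the six indices.
factors = rng.normal(size=(n_essays, 2))
loadings = np.array([[1.0, 0.0], [0.9, 0.0], [0.8, 0.0],
                     [0.0, 1.0], [0.0, 0.9], [0.0, 0.8]])
X = factors @ loadings.T + 0.3 * rng.normal(size=(n_essays, n_indices))

efa = FactorAnalysis(n_components=2, rotation="varimax")
efa.fit(X)
print(efa.components_.round(2))  # rotated loadings: factors x indices
```

In practice the recovered loading pattern would then be carried into the confirmatory factor analysis and SEM stages, which require dedicated SEM software rather than scikit-learn.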
Author
David ALLEN, Tatsuro TAHARA
Publisher
Japan Language Testing Association
Journal
日本言語テスト学会誌 (ISSN:21895341)
Volume / Pages / Date
vol.24, pp.3-22, 2021 (Released:2022-05-25)
References
54
Citations
3

Washback research in language education aims to demonstrate, explain, and ultimately predict the impact of tests on teaching and learning in educational contexts. A recent review in the international arena (Cheng et al., 2015) revealed a rapidly growing field of empirical washback research, yet only two studies were identified as occurring in the Japanese context. The present article therefore sought to document more fully the washback research conducted in Japan prior to 2021, with the aim of facilitating future research in this important area. Following an extensive online search, 32 empirical washback studies in the Japanese context were identified. These studies were analyzed in terms of the following information: publication details, test(s) involved, context and participants, methodology, aspects of washback investigated, and type of consequence targeted. The review reveals a wealth of empirical literature that has adopted a variety of research methods and designs to investigate the impact of a range of tests, notably university entrance exams. On the basis of these previous studies, a series of recommendations is made for future washback research in Japan.
Author
Paul WICKING
Publisher
Japan Language Testing Association
Journal
JLTA Journal (ISSN:21895341)
Volume / Pages / Date
vol.20, pp.76-89, 2017 (Released:2017-12-13)
References
2
Citations
3

Any discussion about English education in Japan is invariably bound up with a discussion about assessment. All too often, such discussions have focused on entrance examinations and other high-stakes summative tests. However, language testing and assessment do not take place in a vacuum; they are deeply affected by broader social and cultural contexts, as well as by individual features of the school and the classroom. Perhaps more than any other factor, it is teachers who have the greatest influence over how tests are created, conducted, and interpreted. For universities to foster an environment in which reliable assessment can take place, it is not enough to provide well-designed, rigorous tests. It is also imperative that teachers’ beliefs and practices concerning assessment be taken into account. This paper seeks to answer the question: What are the assessment beliefs and practices of EFL teachers working in Japanese universities? To answer this question, survey responses were gathered from English language teachers working in Japanese higher education. The results indicated slight differences in belief and practice between native-speaker and non-native-speaker teachers, as well as between full-time and part-time teachers. Despite these differences, it seems that most teachers have a learning-oriented approach to assessment. The findings provide support and direction for policy-makers and educational leaders seeking to promote better testing practice.
Author
Yusuke HASEGAWA
Publisher
Japan Language Testing Association
Journal
日本言語テスト学会誌 (ISSN:21895341)
Volume / Pages / Date
vol.22, pp.3-22, 2019 (Released:2020-02-07)

This study aims to reveal whether any specific type of vocabulary learning strategy (VLS) leads to higher scores on semi-contextualized word meaning tests—a multiple-choice gap-filling format in which short written contexts are provided. A total of 132 first-year university students learning English as a foreign language completed a VLS questionnaire and a semi-contextualized word meaning test. The relationship between these two variables was examined using Pearson’s correlation analysis, confirmatory factor analysis, and exploratory factor analysis. The results demonstrated that the relationships between VLS use and test scores were very weak (all rs < .20), regardless of the strategy type. The smaller correlations, compared to those reported in previous studies using vocabulary size tests, may be caused by the more complicated constructs involved in the semi-contextualized word meaning test, which requires not only receptive knowledge of word meanings but also reading comprehension skills and knowledge of word forms and usage in a sentence. However, imagery strategies, such as creating a mental image of word forms, had a very weak but significant positive correlation with the test scores. Based on these results, this study further discusses how Japanese high school students who will take examinations that employ the semi-contextualized word meaning test format should learn vocabulary.
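The core analysis here, Pearson's r between strategy-use scores and test scores, can be sketched in a few lines. The data below are synthetic and the effect size is merely tuned to mimic the "very weak" correlations the abstract reports; variable names are illustrative, not the study's.

```python
# Sketch of the correlation analysis: Pearson's r between VLS-use scores
# and semi-contextualized word meaning test scores (synthetic data).
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
n_learners = 132
test_scores = rng.normal(50, 10, size=n_learners)
# A weakly related strategy-use score, mimicking the weak rs reported.
vls_use = 0.15 * (test_scores - 50) / 10 + rng.normal(0, 1, size=n_learners)

r, p = pearsonr(vls_use, test_scores)
print(f"r = {r:.2f}, p = {p:.3f}")
```

With correlations this small, statistical significance depends heavily on sample size, which is one reason the abstract interprets the rs < .20 findings cautiously.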
Author
Masumi ONO, Hiroyuki YAMANISHI, Yuko HIJIKATA
Publisher
Japan Language Testing Association
Journal
日本言語テスト学会誌 (ISSN:21895341)
Volume / Pages / Date
vol.22, pp.65-88, 2019 (Released:2020-02-07)
Citations
1

Integrated writing tasks are becoming popular in the field of language testing, but it remains unclear how teachers assess integrated writing tasks holistically and/or analytically, and which approach is more effective. This exploratory study aims to investigate the reliability and validity of teacher-raters’ holistic and analytic ratings and to reveal their perceptions of grading the integrated writing task on the Test of English as a Foreign Language Internet-based Test (TOEFL iBT). Thirty-six university students completed a reading-listening-writing task. Seven raters scored the 36 compositions using both a holistic and an analytic scale, and completed a questionnaire about their perceptions of the scales. Results indicated that the holistic and analytic scales exhibited high inter-rater reliability and that the two rating methods were highly correlated. In analytic scoring, which contained four dimensions—content, organization, language use, and verbatim source use—the dimensions of content and organization were highly correlated with the overall analytic score (i.e., the mean score of the four dimensions). However, the dimension of verbatim source use was found to be peculiar in terms of construct validity for the analytic scale. The analyses also indicated various challenges the raters faced while scoring. Their perceptions varied particularly regarding verbatim source use: Some raters tended to emphasize the intricate process of textual borrowing, while others stressed the difficulty of judging multiple types and degrees of textual borrowing. Pedagogical implications for the selection and use of rubrics, as well as for the teaching and assessment of source text use, are suggested.
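An inter-rater reliability check of the kind reported above can be sketched with Cronbach's alpha over the raters' scores. This is an illustration on synthetic data under an assumed scoring model (true essay quality plus rater noise); the study's actual reliability statistics may have been computed differently.

```python
# Sketch of inter-rater reliability: Cronbach's alpha across seven raters'
# holistic scores for 36 compositions (synthetic data).
import numpy as np

rng = np.random.default_rng(2)
n_essays, n_raters = 36, 7
true_quality = rng.normal(3.0, 1.0, size=n_essays)
# Each rater's score = true essay quality + rater-specific noise.
scores = true_quality[:, None] + rng.normal(0, 0.5, size=(n_essays, n_raters))

k = n_raters
item_vars = scores.var(axis=0, ddof=1)       # variance of each rater's scores
total_var = scores.sum(axis=1).var(ddof=1)   # variance of summed scores
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```

When rater noise is small relative to true score variance, as simulated here, alpha approaches 1, which corresponds to the "high inter-rater reliability" the abstract describes.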
Author
Akiyo HIRAI, Rie KOIZUMI
Publisher
Japan Language Testing Association
Journal
日本言語テスト学会研究紀要 (ISSN:2433006X)
Volume / Pages / Date
vol.11, pp.1-20, 2008-09-20 (Released:2017-08-07)
Citations
1

Among the different types of rating scales for scoring speaking performance, the EBB (Empirically derived, Binary-choice, Boundary-definition) scale is claimed to be easy to use and highly reliable (Turner & Upshur, 1996; 2002). However, it has been questioned whether the EBB scale can be applied to other tasks. Thus, in this study, an EBB scale was compared with an analytic scale in terms of validity, reliability, and practicality. Fifty-two EFL learners were asked to read and retell four stories in a semi-direct Story Retelling Speaking Test (SRST). Their performances were scored using the two rating scales, and the scores were then compared using generalizability theory, a multitrait-multimethod approach, and a questionnaire administered to the raters. As a result, the EBB scale, which consists of four criteria, was found to be more generalizable (i.e., reliable) than the analytic scale and generally assessed the intended constructs. However, the present EBB scale turned out to be less practical than the analytic scale due to its binary format and the greater number of levels in each criterion. Further revisions seeking a better scale for the SRST are suggested.
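The generalizability-theory comparison described above rests on decomposing score variance into components. A minimal sketch for a fully crossed persons-by-raters design is shown below; the data are synthetic and the design is simplified (the actual SRST G-study also involved tasks and criteria), so the numbers are purely illustrative.

```python
# Sketch of a G-study for a crossed persons x raters design: estimate
# variance components from a two-way layout and compute the relative
# generalizability (G) coefficient. Synthetic data.
import numpy as np

rng = np.random.default_rng(3)
n_p, n_r = 52, 4  # 52 learners, 4 raters (illustrative)
person = rng.normal(0, 1.0, size=(n_p, 1))   # true person differences
rater = rng.normal(0, 0.3, size=(1, n_r))    # rater severity effects
X = 3.0 + person + rater + rng.normal(0, 0.5, size=(n_p, n_r))

grand = X.mean()
ss_p = n_r * ((X.mean(axis=1) - grand) ** 2).sum()
ss_r = n_p * ((X.mean(axis=0) - grand) ** 2).sum()
ss_pr = ((X - grand) ** 2).sum() - ss_p - ss_r
ms_p = ss_p / (n_p - 1)
ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))

var_p = max((ms_p - ms_pr) / n_r, 0.0)  # person variance component
var_pr = ms_pr                          # interaction/residual component
g_rel = var_p / (var_p + var_pr / n_r)  # relative G coefficient
print(f"G (relative) = {g_rel:.2f}")
```

A higher G coefficient for one scale than another, with the same facets, is what would ground a claim like "the EBB scale was more generalizable than the analytic scale."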
Author
John M. NORRIS
Publisher
Japan Language Testing Association
Journal
日本言語テスト学会誌 (ISSN:21895341)
Volume / Pages / Date
vol.21, pp.3-20, 2018 (Released:2018-12-24)
Citations
11

Constructed-response tasks have captured the attention of testers and educators for some time (e.g., Cureton, 1951), because they present goal-oriented, contextualized challenges that prompt examinees to deploy cognitive skills and domain-related knowledge in authentic performances. Such performances present a distinct advantage when teaching, learning, and assessment focus on what learners can do rather than merely emphasizing what they know (Wiggins, 1998). Over the past several decades, communicative performance tasks have come to play a crucial role in language assessments on a variety of levels, from classroom-based tests, to professional certifications, to large-scale language proficiency exams (Norris, 2009, 2016). However, the use of such tasks for assessment purposes remains contentious, and numerous language testing alternatives are available at potentially lower cost and degree of effort. In order to facilitate decisions about when and why to adopt task-based designs for language assessment, I first outline the relationship between assessment designs and their intended uses and consequences. I then introduce two high-stakes examples of language assessment circumstances (job certification and admissions testing) that suggest a need for task-based designs, and I review the corresponding fit of several assessments currently in use for these purposes. In relation to these purposes, I also suggest some of the positive consequences of task-based designs for language learners, teachers, and society, and I point to the dangers of using assessments that do not incorporate communicative tasks or do so inappropriately. I conclude by highlighting other circumstances that call for task-based designs, and I suggest how advances in technology may help to address associated challenges.