Author
前田 啓朗
Publisher
日本言語テスト学会
Journal
日本言語テスト学会研究紀要
Volume, issue, pages, and publication date
no.6, pp.140-147, 2004-08-30

This paper presents 1) the limitations of causal analyses, 2) how causal analyses are conducted in English language education research in Japan, 3) the problems seen in those analyses, and 4) how those problems can be addressed in further research. Causal analysis, especially analysis based on the multiple regression model, is a powerful tool for predicting a dependent variable from several independent variables. However, when the focus is on the degree of causal effect of each independent variable, the problem of multicollinearity, which is caused by correlations among the independent variables, arises. Moreover, when a stepwise method is used to decide which independent variables to include, multicollinearity may again cause problems by deleting independent variables that reasonably seem to contribute to the dependent variable. After reviewing these limitations of multiple regression models, eleven articles in English language education research in Japan were re-examined in terms of those problems. Finally, some suggestions, such as using correlation analysis instead of regression models, are presented.
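As a rough illustration of the multicollinearity problem described in this abstract, the sketch below computes the variance inflation factor (VIF) for a regression with two predictors. In that special case VIF = 1 / (1 - r²), where r is the correlation between the two predictors. The function names and the score data are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when a model has exactly two of them."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical test scores: two predictors that measure nearly the same thing.
vocabulary = [10, 12, 15, 18, 20, 23]
grammar = [11, 13, 14, 19, 21, 22]

print("r   =", round(pearson_r(vocabulary, grammar), 3))
print("VIF =", round(vif_two_predictors(vocabulary, grammar), 1))
```

A VIF well above 10 is often treated as a warning that the separate regression weights of the two predictors cannot be interpreted reliably, even when the model as a whole predicts well.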
Authors
田中 博晃, 前田 啓朗
Publisher
日本言語テスト学会
Journal
日本言語テスト学会研究紀要
Volume, issue, pages, and publication date
no.6, pp.128-139, 2004-08-30

The purpose of this study was to examine the construct of amotivation. When amotivation is measured, negative items in a questionnaire attenuate correlations, which can bias the construct of amotivation. A questionnaire was developed on the basis of Noels, Pelletier, Clement, and Vallerand (2000), and it included both positive items (P-type) and negative items (N-type) of amotivation. By analyzing the questionnaire data using Confirmatory Factor Analysis to correct for attenuation, we examined the systematic error caused by negative items. The results showed that (1) an artificial factor was identified when positive and negative items of amotivation were analyzed by Exploratory Factor Analysis; (2) the construct of amotivation was supported when a 7-factor model of motivation was examined by applying Confirmatory Factor Analysis to the P-type questionnaire; and (3) the P-type questionnaire was more appropriate than the N-type questionnaire as a measure of amotivation, because bipolarity between amotivation and self-determined forms of motivation was clearly identified in the P-type questionnaire.
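The study corrects for attenuation with Confirmatory Factor Analysis. As a simpler illustration of what attenuation means, the sketch below applies Spearman's classical correction, which divides an observed correlation by the square root of the product of the two measures' reliabilities. This is a different technique from the one used in the paper, and the function name and all numbers are hypothetical.

```python
import math


def disattenuate(r_xy, rel_x, rel_y):
    """Spearman's correction: estimated correlation between true scores."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical values: an observed correlation and two scale reliabilities.
observed = 0.42
corrected = disattenuate(observed, rel_x=0.70, rel_y=0.60)
print(f"observed r = {observed:.2f}, corrected r = {corrected:.2f}")
```

The lower the reliabilities, the larger the gap between the observed and the corrected correlation; with perfectly reliable measures the two coincide.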
Authors
Akiyo HIRAI, Rie KOIZUMI
Publisher
Japan Language Testing Association
Journal
日本言語テスト学会研究紀要 (ISSN:2433006X)
Volume, issue, pages, and publication date
vol.11, pp.1-20, 2008-09-20 (Released:2017-08-07)
Number of citations
1

Among the different types of rating scales used to score speaking performance, the EBB (Empirically derived, Binary-choice, Boundary-definition) scale is claimed to be easy to use and highly reliable (Turner & Upshur, 1996; 2002). However, it has been questioned whether the EBB scale can be applied to other tasks. Thus, in this study, an EBB scale was compared with an analytic scale in terms of validity, reliability, and practicality. Fifty-two EFL learners were asked to read and retell four stories in a semi-direct Story Retelling Speaking Test (SRST). Their performances were scored using these two rating scales, and the scores were then compared using generalizability theory, a multitrait-multimethod approach, and a questionnaire delivered to the raters. The EBB scale, which consists of four criteria, was found to be more generalizable (i.e., reliable) than the analytic scale and generally assessed the intended constructs. However, the present EBB scale turned out to be less practical than the analytic scale because of its binary format and because it had more levels in each criterion. Further revisions seeking a better scale for the SRST are suggested.
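To make the notion of generalizability more concrete, the sketch below computes the relative generalizability coefficient for the simplest crossed design, persons by raters, as the person variance divided by the person variance plus the residual variance averaged over raters. The actual study also involves tasks and criteria, so this is a simplified assumption, and the variance components shown are hypothetical.

```python
def g_coefficient(var_person, var_residual, n_raters):
    """Relative G coefficient for a crossed persons-by-raters design."""
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    # Relative error variance shrinks as scores are averaged over more raters.
    relative_error = var_residual / n_raters
    return var_person / (var_person + relative_error)


# Hypothetical variance components from a G study.
for n in (1, 2, 4):
    print(n, "rater(s):", round(g_coefficient(4.0, 2.0, n), 3))
```

Comparing two rating scales in this framework amounts to asking which one yields the higher coefficient for the same number of raters.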
Author
柳瀬 陽介
Publisher
日本言語テスト学会
Journal
日本言語テスト学会研究紀要 (ISSN:2433006X)
Volume, issue, pages, and publication date
vol.11, pp.77-95, 2008-09-20 (Released:2017-08-07)

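This three-dimensional understanding of linguistic communicative ability reconstructs the related concepts on the basis of the accumulated discussion in earlier theories of linguistic communicative ability. It is not merely a change of terminology: the main original contributions of this paper are that it (a) emphasizes the role of mind-reading ability, (b) restores physical ability, and (c) makes explicit the two senses of "knowledge" in linguistic ability. It would, however, be going too far to think that these points were entirely absent from earlier theories of linguistic communicative ability. Appendix 1 relates, somewhat forcibly, the concepts of past theories to the concepts of this paper. Have these improvements overcome the five challenges stated in the "Purpose" section of this paper? Challenge (1) was to explain, more theoretically than Bachman's concept of strategic competence does, the process by which knowledge of language is used in communication. This was achieved by positing mind-reading ability in (a), which made clear that, prior to linguistic communication, a mechanism for reading other people's minds, typified by "theory of mind," is at work in human communication, and that language use follows the principle of relevance, particularly in communication that makes sophisticated use of language. Challenge (2) was to make explicit the role of the body in linguistic communication. This paper not only revived, as "linguistic physical ability," the "psychophysiological mechanisms" that Bachman (1990) once proposed, but also, by positing "non-linguistic physical ability," showed that there is a domain that applied linguistics has not valued but whose importance is keenly felt in everyday life. Challenge (3) was to clarify, even if only slightly, the interactive nature of linguistic communication. By bringing the concept of mind-reading ability to the fore, this paper showed that a theory of linguistic communicative ability that does not concretely assume a specific communication partner is inadequate as a theory of communication. However, this interactivity remains one in which the other is incorporated within the individual, and it may still be called an individualistic conception. The "emergence" of communication, which Hymes (1972) discussed in pioneering work, has also not yet been addressed. This is left for future work (discussed later). Challenge (4) was to develop an argument that leans toward neither the pole of language nor the pole of communication. This challenge was overcome by treating mind-reading ability and linguistic ability as independent, representing them as orthogonal axes, and showing theoretically, on that two-dimensional plane, the range of variation from communication that succeeds almost by mind-reading ability alone to linguistic communication that succeeds through heavy reliance on linguistic ability. Challenge (5) was to aim for an argument that gives an overall view of linguistic communicative ability. This is considered to have been achieved by explaining the whole picture with a simple structural schema of three factors (three dimensions) while allowing a more detailed explanation within each dimension. Thus, the three-dimensional understanding of linguistic communicative ability presented in this paper, while building on the development of earlier theories of linguistic communicative ability, is considered to make a new and original contribution.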
Author
Shizuka Tetsuhito
Publisher
日本言語テスト学会
Journal
日本言語テスト学会研究紀要
Volume, issue, pages, and publication date
vol.6, pp.108-127, 2004

The purpose of this study was to explore the potential of 'invisible-gap filling' items primarily as an in-house achievement measure for reading-oriented courses and secondarily as a more general overall-ability measure. More specifically, it compared multiple-matching 'invisible-gap filling' items and their 'visible' counterparts in terms of item facility, item discrimination, test reliability, and test validity. Eighty-eight Japanese first-year university students took a 25-item invisible-gap filling test and its visible counterpart, along with two 25-item c-tests, the combination of which constituted a semester-end examination of a reading-oriented course. The invisible and visible gap filling tests were based on the same passage covered in the course. Target words (i.e., words to fill the gaps) were also the same in both versions, making the salience of the gaps the only difference between the two. Hence, any differences in psychometric properties between the two versions should be attributable to the difference in gap visibility. One c-test was created from a passage already covered in class and the other from a new passage. The former served as an achievement criterion while the latter was considered a proficiency criterion. Results indicated that the invisible-gap filling items had (1) lower facility values, (2) higher discrimination, (3) higher reliability, (4) higher validity as an achievement measure, and (5) higher validity as a proficiency measure than their visible counterparts. Based on these findings, it is contended that invisible gap filling is a technique that can be used to produce reliable and valid achievement tests with relative ease. After discussing possible limitations of the format, two possible modifications are proposed.
Author
斉田 智里
Publisher
日本言語テスト学会
Journal
日本言語テスト学会研究紀要
Volume, issue, pages, and publication date
no.10, pp.119-133, 2007-10-01

This research compared concurrent calibration under a polytomous IRT model with concurrent calibration under a dichotomous IRT model, using English achievement test data. Two forms of English achievement tests for senior high school students were composed of testlets (groups of items) to eliminate the effect of dependence among within-testlet items. The two forms were equated with common testlets through a polytomous IRT model. The testlet parameter estimates and the category characteristic curves were analyzed on a common scale. The results showed that one form was more difficult than the other, as the test designers had intended. The mean of the ability parameter estimates for the more difficult form was higher than that for the easier form. These findings yielded useful feedback for test designers. Item parameter estimates of independent dichotomous items, ability parameter estimates, and the amount of test information derived by concurrent calibration under the graded response model (polytomous IRT model) and the two-parameter logistic model (dichotomous IRT model) were compared. The results showed similar parameter estimates for the two IRT models. The standard errors of the ability parameter estimates for the two models were also highly correlated. The two-parameter logistic model provided a greater amount of test information than the graded response model.
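For readers unfamiliar with the two models compared in this abstract, the sketch below shows their standard response functions: the two-parameter logistic (2PL) model for a dichotomous item, with its item information, and the graded response model for a polytomous item or testlet score, whose category probabilities are differences of adjacent cumulative 2PL curves. The logistic metric without the 1.7 scaling constant is assumed, and all parameter values are hypothetical, not estimates from the study.

```python
import math


def p_2pl(theta, a, b):
    """2PL probability of a correct response (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def grm_category_probs(theta, a, thresholds):
    """Graded response model: probability of each ordered score category.

    thresholds must be in ascending order; m thresholds give m + 1 categories.
    """
    # Cumulative probabilities of scoring in category k or higher.
    cumulative = [1.0] + [p_2pl(theta, a, b) for b in thresholds] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical parameters for one dichotomous item and one four-category testlet.
print("2PL P(correct):", round(p_2pl(0.0, 1.2, 0.5), 3))
print("2PL information:", round(info_2pl(0.0, 1.2, 0.5), 3))
print("GRM categories:", [round(p, 3) for p in grm_category_probs(0.0, 1.2, [-1.0, 0.0, 1.0])])
```

Summing item information over all items gives the test information that the abstract compares between the two models.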
Author
Sugita Yoshihito
Publisher
日本言語テスト学会
Journal
日本言語テスト学会研究紀要
Volume, issue, pages, and publication date
no.13, pp.21-40, 2010-11-15

This article examines the main data of a task-based writing performance test in which five junior high school teachers participated as novice raters. The purpose of this research is to implement a task-based writing test (TBWT), developed on the basis of a construct-based processing approach to testing, and to examine the reliability and validity of the assessment tasks and rating scales. Accuracy and communicability were defined as constructs, and test development proceeded in three stages: designing and characterizing writing tasks, reviewing existing scoring procedures, and drafting rating scales. Each of the forty scripts collected from twenty undergraduate students was scored by the five new raters, and the analyses were done using FACETS. The results indicated that all novice raters displayed acceptable levels of self-consistency, and that scoring did not differ significantly across the two tasks and overall impression, which showed reasonable fit to the Rasch model. The modified scales associated with the five rating categories and their specific written samples were shown to be mostly comprehensible and usable by raters, and the results demonstrated that the students' ability was effectively measured using these tasks and rating scales. However, further research is needed on how to eliminate inter-rater differences.
Author
Sato Takanori
Publisher
日本言語テスト学会
Journal
日本言語テスト学会研究紀要
Volume, issue, pages, and publication date
vol.13, pp.1-20, 2010

The purpose of the present study was to examine the validity of 16 can-do items taken from the EIKEN can-do list (STEP, 2008). A total of 2,571 Japanese junior high school students were asked to assess their degree of confidence in the 16 can-do statements: four items each from EIKEN Grade 5, Grade 4, Grade 3, and Grade Pre-2. The present study employed the Rasch model to investigate whether (a) the items are unidimensional, (b) their item difficulty is appropriate, (c) item difficulty correlates with the items' EIKEN grades, and (d) the students' confidence levels correlate with their proficiency levels. The results showed that the can-do items are highly reliable and unidimensional. However, the students tended to feel that the items were unchallenging, especially the speaking and listening items.