Automated Scoring of Students’ English-to-Chinese Translations of Three Text Types

Jinlin Jiang, Lihua Jiang, Xiaofei Lu

Research output: Contribution to journal › Article

Abstract

This study attempts to construct automated scoring models for Chinese EFL (English as a Foreign Language) learners’ English-to-Chinese (E-C) translations in large-scale exams. Our data consisted of 900 human-scored translated texts of three source texts (an expository text, a narrative text, and a mixed narrative-argumentative text), with 300 for each source text. Text features were extracted using techniques such as n-gram matching, word alignment, and Latent Semantic Analysis. Scoring models were constructed using multiple linear regression, with the text features as independent variables and the human-assigned scores as the dependent variable. To determine the number of training texts required to yield optimal results, five scoring models were developed with training sets of 50, 100, 130, 150, and 180 texts of each text type, respectively. Results indicated that the correlation coefficients between the model-computed and human-assigned scores were above 0.8 for all five models. The model trained with 130 translated texts performed best on the expository and narrative texts, while the one trained with 100 translated texts performed best on the mixed narrative-argumentative texts. We therefore conclude that the text features extracted in this study are effective and that the finalized models can produce reliable scores for Chinese EFL learners’ E-C translations.
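The article does not include an implementation, but the pipeline the abstract describes (reference-based features obtained through n-gram matching and LSA similarity, combined in a multiple linear regression against human-assigned scores, and evaluated by the correlation between model-computed and human scores) can be illustrated with a short sketch. The code below is not the authors' code: the particular features, the use of scikit-learn's TruncatedSVD as the LSA step, and all function and variable names (ngram_precision, build_features, train_and_evaluate) are assumptions made purely for illustration, and the word-alignment features used in the study are omitted.

# Illustrative sketch only -- not the authors' implementation.
# Assumed stack: numpy, scipy, scikit-learn; all feature choices and names are hypothetical.
import numpy as np
from scipy.stats import pearsonr
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LinearRegression
from sklearn.metrics.pairwise import cosine_similarity


def ngram_precision(candidate_tokens, reference_tokens, n):
    # Fraction of the candidate's n-grams that also occur in the reference translation.
    cand = [tuple(candidate_tokens[i:i + n]) for i in range(len(candidate_tokens) - n + 1)]
    ref = {tuple(reference_tokens[i:i + n]) for i in range(len(reference_tokens) - n + 1)}
    return sum(g in ref for g in cand) / len(cand) if cand else 0.0


def build_features(texts, reference, vectorizer, lsa, reference_vec):
    # One feature row per student translation: unigram/bigram precision against a
    # reference translation, a length ratio, and LSA cosine similarity to the reference.
    ref_tokens = reference.split()
    rows = []
    for text in texts:
        tokens = text.split()  # texts assumed pre-segmented (whitespace-delimited) Chinese
        vec = lsa.transform(vectorizer.transform([text]))
        rows.append([
            ngram_precision(tokens, ref_tokens, 1),
            ngram_precision(tokens, ref_tokens, 2),
            len(tokens) / max(len(ref_tokens), 1),
            float(cosine_similarity(vec, reference_vec)[0, 0]),
        ])
    return np.array(rows)


def train_and_evaluate(train_texts, train_scores, test_texts, test_scores, reference):
    # Build an LSA space from the training translations plus the reference translation.
    vectorizer = TfidfVectorizer(token_pattern=r"\S+")  # keep every whitespace-delimited token
    doc_term = vectorizer.fit_transform(list(train_texts) + [reference])
    lsa = TruncatedSVD(n_components=50, random_state=0)  # must stay below the vocabulary size
    lsa.fit(doc_term)
    reference_vec = lsa.transform(vectorizer.transform([reference]))

    X_train = build_features(train_texts, reference, vectorizer, lsa, reference_vec)
    X_test = build_features(test_texts, reference, vectorizer, lsa, reference_vec)

    # Multiple linear regression: text features as predictors, human scores as the outcome.
    model = LinearRegression().fit(X_train, np.asarray(train_scores, dtype=float))
    predicted = model.predict(X_test)

    # Agreement between model-computed and human-assigned scores on held-out translations.
    r, _ = pearsonr(predicted, np.asarray(test_scores, dtype=float))
    return model, r

A call such as train_and_evaluate(train_texts, train_scores, test_texts, test_scores, reference_translation) would return the fitted regression model together with a Pearson r that plays the same role as, but is not the same quantity as, the above-0.8 correlations reported in the abstract.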

Original language: English (US)
Pages (from-to): 238-255
Number of pages: 18
Journal: Journal of Quantitative Linguistics
Volume: 25
Issue number: 3
DOIs: 10.1080/09296174.2017.1370192
State: Published - Jul 3, 2018

Fingerprint

  • Student
  • Narrative
  • Text Type
  • Scoring
  • Foreign Language
  • Regression Analysis
  • Semantics
  • Narrative Text
  • Expository Text

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Linguistics and Language

Cite this

@article{993211fd8be64760b952888e562aeae9,
title = "Automated Scoring of Students’ English-to-Chinese Translations of Three Text Types",
abstract = "This study attempts to construct automated scoring models for Chinese EFL (English as a Foreign Language) learners’ English-to-Chinese (E-C) translations in large-scale exams. Our data consisted of 900 human-scored translated texts of three source texts–an expository text, a narrative text and a mixed narrative-argumentative text–with 300 for each source text. Text features were extracted using technologies such as n-gram matching, word alignment and Latent Semantic Analysis. Computer scoring models were constructed using multiple linear regression analysis with text features as independent variables and human-assigned scores as the dependent variable. To determine the number of training texts required to yield the most optimal results, five scoring models were developed with a training set of 50, 100, 130, 150 and 180 texts of each text type, respectively. Results indicated that the correlation coefficients between the model-computed and human-assigned scores were above 0.8 for all five models. The model trained with 130 translated texts performed the best on expository and narrative texts, while that trained with 100 translated texts performed the best on mixed narrative-argumentative texts. Therefore, it is concluded that the text features extracted in this study are effective and that the finalized models can produce reliable scores for Chinese EFL learners’ E-C translations.",
author = "Jinlin Jiang and Lihua Jiang and Xiaofei Lu",
year = "2018",
month = "7",
day = "3",
doi = "10.1080/09296174.2017.1370192",
language = "English (US)",
volume = "25",
pages = "238--255",
journal = "Journal of Quantitative Linguistics",
issn = "0929-6174",
publisher = "Routledge",
number = "3",

}

Automated Scoring of Students’ English-to-Chinese Translations of Three Text Types. / Jiang, Jinlin; Jiang, Lihua; Lu, Xiaofei.

In: Journal of Quantitative Linguistics, Vol. 25, No. 3, 03.07.2018, p. 238-255.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Automated Scoring of Students’ English-to-Chinese Translations of Three Text Types

AU - Jiang, Jinlin

AU - Jiang, Lihua

AU - Lu, Xiaofei

PY - 2018/7/3

Y1 - 2018/7/3

UR - http://www.scopus.com/inward/record.url?scp=85029425982&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85029425982&partnerID=8YFLogxK

U2 - 10.1080/09296174.2017.1370192

DO - 10.1080/09296174.2017.1370192

M3 - Article

VL - 25

SP - 238

EP - 255

JO - Journal of Quantitative Linguistics

JF - Journal of Quantitative Linguistics

SN - 0929-6174

IS - 3

ER -