The relationship of lexical sophistication to second language (L2) production quality has received much attention in the past few decades. Viewed as a multidimensional construct, lexical sophistication has been measured using indices that tap into various distributional, formal, semantic, acquisitional, and psycholinguistic properties of words and certain n-gram properties (e.g., Kyle et al., 2018). However, existing indices have not systematically accounted for the fact that polysemous words are used with distinct senses in different contexts and that those senses may not be equally sophisticated for L2 learners. The current study addresses this gap by proposing three frequency-based lexical sophistication indices that take into account the reference-corpus frequency of the senses with which polysemous words are used in learner texts. It then assesses the predictive power of these indices for L2 English writing quality, both in comparison to and in combination with existing lexical sophistication indices. Results from the analysis of a corpus of exam scripts produced by L2 learners sitting for the Cambridge First Certificate in English (Yannakoudakis et al., 2011) show that two of the proposed sense-aware indices correlated more strongly with holistic scores of L2 English writing quality than existing indices did. Integrating the new sense-aware indices with existing ones in a regression model resulted in higher predictive power for L2 English writing quality than models built with either set of indices alone. The implications of our findings for future L2 lexical sophistication research are discussed.
All Science Journal Classification (ASJC) codes
- Experimental and Cognitive Psychology
- Developmental and Educational Psychology
- Arts and Humanities (miscellaneous)
- Psychology (miscellaneous)