Abstract
The present study explored different approaches for automatically scoring student essays that were written on the basis of multiple texts. Specifically, these approaches were developed to classify whether or not important elements of the texts were present in the essays. The first was a simple pattern-matching approach called "multi-word" that allowed for flexible matching of words and phrases in the sentences. The second technique was latent semantic analysis (LSA), which was used to compare student sentences to original source sentences using its high-dimensional vector-based representation. Finally, the third was a machine-learning technique, support vector machines, which learned a classification scheme from the corpus. The results of the study suggested that the LSA-based system was superior for detecting the presence of explicit content from the texts, but the multi-word pattern-matching approach was better for detecting inferences outside or across texts. These results suggest that the best approach for analyzing essays of this nature should draw upon multiple natural language processing approaches.
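As an illustration of the first approach only, the following is a minimal sketch of flexible multi-word matching. It is not the system used in the study. The pattern format (a list of slots, each a set of acceptable words), the `max_gap` parameter, the function names, and the example sentences are all assumptions introduced here. A sentence is treated as containing a text element when every slot is matched in order, with a bounded number of intervening words.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def matches_pattern(sentence, pattern, max_gap=3):
    """Return True if every slot of `pattern` occurs in order in `sentence`.

    `pattern` is a list of slots (hypothetical format); each slot is a set
    of acceptable words. At most `max_gap` other tokens may sit between two
    consecutive matched slots. The first slot may match anywhere.
    """
    tokens = tokenize(sentence)

    def search(slot_index, start, limit):
        # All slots matched: the pattern is present.
        if slot_index == len(pattern):
            return True
        # The first slot has no distance limit; later slots are bounded.
        if limit is None:
            end = len(tokens)
        else:
            end = min(len(tokens), start + limit + 1)
        for i in range(start, end):
            if tokens[i] in pattern[slot_index]:
                # Backtrack over alternative positions if the rest fails.
                if search(slot_index + 1, i + 1, max_gap):
                    return True
        return False

    return search(0, 0, None)


# Hypothetical pattern for one text element, with word variants per slot.
pattern = [
    {"volcano", "volcanic"},
    {"erupt", "erupted", "eruption"},
    {"ash"},
]

present = matches_pattern(
    "The volcano suddenly erupted and sent ash into the sky.", pattern
)
absent = matches_pattern(
    "Ash covered the town long before anyone knew the volcano had erupted.",
    pattern,
)
```

Under these assumptions, `present` is `True` and `absent` is `False`: the second sentence contains all three words, but not in the order the pattern requires. Allowing word variants and gaps is what makes this kind of matching more tolerant of paraphrase than exact phrase lookup.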
Original language | English (US) |
---|---|
Pages (from-to) | 622-633 |
Number of pages | 12 |
Journal | Behavior Research Methods |
Volume | 44 |
Issue number | 3 |
State | Published - Sep 2012 |
All Science Journal Classification (ASJC) codes
- Experimental and Cognitive Psychology
- Developmental and Educational Psychology
- Arts and Humanities (miscellaneous)
- Psychology (miscellaneous)
- Psychology (all)