Mining student-generated textual data in MOOCs and quantifying their effects on student performance and learning outcomes

Conrad Tucker, Barton K. Pursel, Anna Divinsky

Research output: Contribution to journal › Article

19 Citations (Scopus)

Abstract

Massive Open Online Courses (MOOCs) are freely available courses offered online for distance-based learners who have access to the internet. The tremendous success of MOOCs can, in part, be attributed to their global availability, enabling anyone in the world to sign up for or drop courses at any time during the course offerings. Course enrollment in MOOCs often ranges from 10,000 to 200,000 students, thereby providing a potentially rich venue for large-scale digital data (e.g., student course comments, temporal and geo-location data). However, despite the overabundance of digital data generated through MOOCs, research into how student interactions in MOOCs translate to student performance and learning outcomes is limited. The objective of this research is to mine student-generated textual data (e.g., online discussion forums) existing in MOOCs in order to quantify their impact on student performance and learning outcomes. Student performance is quantified based on grades attained in course homework assignments, quizzes and examinations. Similar to in-class learning environments, students enrolled in MOOCs often self-organize and form learning groups, where course topics and assignments can be discussed. One of the major benefits of MOOC data is that student networks and the discussions therein are digitally stored and readily available for data mining/statistical analysis. The proposed methodology employs robust natural language processing techniques and data mining algorithms to quantify temporal changes in student sentiments relating to course topics and instructor clarity. The researchers aim to determine whether textual content (e.g., quality vs. quantity of student forum discussions) expressed through MOOCs can serve as a leading indicator of student performance in MOOCs. A case study involving the Introduction to Art: Concepts and Techniques course, offered by Penn State University through the Coursera platform, is used to validate the proposed methodology.

Original language: English (US)
Pages (from-to): 84-95
Number of pages: 12
Journal: Computers in Education Journal
Volume: 5
Issue number: 4
State: Published - Jan 1 2014

Fingerprint

Students
learning
performance
student
Data mining
Statistical methods
quiz
homework
methodology
Availability
Internet
statistical analysis
instructor
learning environment
Processing
art

All Science Journal Classification (ASJC) codes

  • Computer Science (all)
  • Education

Cite this

@article{cac8ba403c2540bd8e06826f1e0336f3,
title = "Mining student-generated textual data in {MOOCs} and quantifying their effects on student performance and learning outcomes",
abstract = "Massive Open Online Courses (MOOCs) are freely available courses offered online for distance-based learners who have access to the internet. The tremendous success of MOOCs can, in part, be attributed to their global availability, enabling anyone in the world to sign up for or drop courses at any time during the course offerings. Course enrollment in MOOCs often ranges from 10,000 to 200,000 students, thereby providing a potentially rich venue for large-scale digital data (e.g., student course comments, temporal and geo-location data). However, despite the overabundance of digital data generated through MOOCs, research into how student interactions in MOOCs translate to student performance and learning outcomes is limited. The objective of this research is to mine student-generated textual data (e.g., online discussion forums) existing in MOOCs in order to quantify their impact on student performance and learning outcomes. Student performance is quantified based on grades attained in course homework assignments, quizzes and examinations. Similar to in-class learning environments, students enrolled in MOOCs often self-organize and form learning groups, where course topics and assignments can be discussed. One of the major benefits of MOOC data is that student networks and the discussions therein are digitally stored and readily available for data mining/statistical analysis. The proposed methodology employs robust natural language processing techniques and data mining algorithms to quantify temporal changes in student sentiments relating to course topics and instructor clarity. The researchers aim to determine whether textual content (e.g., quality vs. quantity of student forum discussions) expressed through MOOCs can serve as a leading indicator of student performance in MOOCs. A case study involving the Introduction to Art: Concepts and Techniques course, offered by Penn State University through the Coursera platform, is used to validate the proposed methodology.",
author = "Tucker, Conrad and Pursel, Barton K. and Divinsky, Anna",
year = "2014",
month = "1",
day = "1",
language = "English (US)",
volume = "5",
pages = "84--95",
journal = "Computers in Education Journal",
issn = "1069-3769",
publisher = "American Society for Engineering Education",
number = "4",

}

Mining student-generated textual data in MOOCs and quantifying their effects on student performance and learning outcomes. / Tucker, Conrad; Pursel, Barton K.; Divinsky, Anna.

In: Computers in Education Journal, Vol. 5, No. 4, 01.01.2014, p. 84-95.

TY - JOUR

T1 - Mining student-generated textual data in MOOCs and quantifying their effects on student performance and learning outcomes

AU - Tucker, Conrad

AU - Pursel, Barton K.

AU - Divinsky, Anna

PY - 2014/1/1

Y1 - 2014/1/1

N2 - Massive Open Online Courses (MOOCs) are freely available courses offered online for distance-based learners who have access to the internet. The tremendous success of MOOCs can, in part, be attributed to their global availability, enabling anyone in the world to sign up for or drop courses at any time during the course offerings. Course enrollment in MOOCs often ranges from 10,000 to 200,000 students, thereby providing a potentially rich venue for large-scale digital data (e.g., student course comments, temporal and geo-location data). However, despite the overabundance of digital data generated through MOOCs, research into how student interactions in MOOCs translate to student performance and learning outcomes is limited. The objective of this research is to mine student-generated textual data (e.g., online discussion forums) existing in MOOCs in order to quantify their impact on student performance and learning outcomes. Student performance is quantified based on grades attained in course homework assignments, quizzes and examinations. Similar to in-class learning environments, students enrolled in MOOCs often self-organize and form learning groups, where course topics and assignments can be discussed. One of the major benefits of MOOC data is that student networks and the discussions therein are digitally stored and readily available for data mining/statistical analysis. The proposed methodology employs robust natural language processing techniques and data mining algorithms to quantify temporal changes in student sentiments relating to course topics and instructor clarity. The researchers aim to determine whether textual content (e.g., quality vs. quantity of student forum discussions) expressed through MOOCs can serve as a leading indicator of student performance in MOOCs. A case study involving the Introduction to Art: Concepts and Techniques course, offered by Penn State University through the Coursera platform, is used to validate the proposed methodology.

UR - http://www.scopus.com/inward/record.url?scp=84912525270&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84912525270&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84912525270

VL - 5

SP - 84

EP - 95

JO - Computers in Education Journal

JF - Computers in Education Journal

SN - 1069-3769

IS - 4

ER -