Large, multivariate geographic datasets have been used to characterize geographic space with the help of spatial data mining tools. In our study, we explore the sufficiency of the Support Vector Machine (SVM), a popular machine-learning technique for supervised classification, to help recognize hidden patterns in a college admissions dataset. The dataset contains records for over 10,000 students who applied to an undisclosed university during one undisclosed year. Students are qualified almost exclusively by their standardized test scores and school records, and a known admissions decision is rendered based on these criteria. Given that the university also considers a number of political, social and geographic econometric factors in its admissions decisions, we use SVM to find implicit spatial patterns that may favor students from certain geographic regions. We first explore the characteristics of the applicants in the college admissions case study. Next, we explain the SVM technique and our unique 'threshold line' methodology for both discrete (regional) and continuous (k-neighbors) space. We then analyze the results of the regional and k-neighbors tests to answer the methodological and geographic research questions.
All Science Journal Classification (ASJC) codes
- Earth and Planetary Sciences (all)