Feature selection methods for optimal design of studies for developmental inquiry

Timothy R. Brick, Rachel E. Koffer, Denis Gerstorf, Nilam Ram

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Objectives: As diary, panel, and experience sampling methods become easier to implement, studies of development and aging are adopting more and more intensive study designs. However, if too many measures are included in such designs, interruptions for measurement may constitute a significant burden for participants. We propose the use of feature selection (a data-driven machine learning process) in study design and selection of measures that show the most predictive power in pilot data. Method: We introduce an analytical paradigm based on feature importance estimation and recursive feature elimination with decision tree ensembles and illustrate its utility using empirical data from the German Socio-Economic Panel (SOEP). Results: We identified a subset of 20 measures from the SOEP data set that maintain much of the ability of the original data set to predict life satisfaction and health across younger, middle, and older age groups. Discussion: Feature selection techniques permit researchers to choose measures that are maximally predictive of relevant outcomes, even when there are interactions or nonlinearities. These techniques facilitate decisions about which measures may be dropped from a study while maintaining efficiency of prediction across groups and reducing costs to the researcher and burden on the participants.
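The Method sentence of the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes synthetic data in place of the SOEP measures, uses depth-one regression trees (stumps) as a simplified stand-in for full decision trees, and all function and parameter names (`best_stump`, `ensemble_importance`, `recursive_feature_elimination`, `n_keep`, `n_trees`, `seed`) are hypothetical. Each round scores the remaining features by the variance reduction they earn across a bootstrapped ensemble, then drops the lowest-scoring feature until the target number of measures remains.

```python
import random
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = fmean(values)
    return sum((v - mean) ** 2 for v in values)


def best_stump(X, y, candidates):
    """Best single split among the candidate features: returns (feature, gain)."""
    total = sse(y)
    best_feature, best_gain = None, 0.0
    for f in candidates:
        order = sorted(range(len(y)), key=lambda i: X[i][f])
        for cut in range(1, len(order)):
            if X[order[cut - 1]][f] == X[order[cut]][f]:
                continue  # no threshold can separate equal values
            left = [y[i] for i in order[:cut]]
            right = [y[i] for i in order[cut:]]
            gain = total - sse(left) - sse(right)
            if gain > best_gain:
                best_feature, best_gain = f, gain
    return best_feature, best_gain


def ensemble_importance(X, y, features, n_trees, rng):
    """Sum the variance reduction each feature earns across bootstrapped stumps."""
    importance = {f: 0.0 for f in features}
    n = len(y)
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of rows
        candidates = rng.sample(features, min(2, len(features)))  # random feature subset
        Xb = [X[i] for i in rows]
        yb = [y[i] for i in rows]
        feature, gain = best_stump(Xb, yb, candidates)
        if feature is not None:
            importance[feature] += gain
    return importance


def recursive_feature_elimination(X, y, n_keep, n_trees=30, seed=0):
    """Repeatedly drop the least important feature until n_keep remain."""
    rng = random.Random(seed)
    features = list(range(len(X[0])))
    while len(features) > n_keep:
        importance = ensemble_importance(X, y, features, n_trees, rng)
        weakest = min(features, key=lambda f: importance[f])
        features.remove(weakest)
    return features


# Synthetic "pilot data" (an assumption for illustration): six candidate
# measures, of which only the first two drive the outcome, via step functions.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(6)] for _ in range(80)]
y = [3 * (row[0] > 0.5) + 3 * (row[1] > 0.5) + data_rng.gauss(0, 0.3) for row in X]

selected = recursive_feature_elimination(X, y, n_keep=2)
print("Measures retained:", selected)
```

Stumps can capture a nonlinear (step-shaped) effect of a single measure but not interactions between measures; deeper trees would be needed for that, and in practice an established library implementation would replace this hand-rolled ensemble.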

Original language: English (US)
Pages (from-to): 113-123
Number of pages: 11
Journal: Journals of Gerontology - Series B Psychological Sciences and Social Sciences
Volume: 73
Issue number: 1
DOI: 10.1093/geronb/gbx008
State: Published - Jan 1 2018

Fingerprint

Economics
Research Personnel
Decision Trees
Aptitude
SOEP
Age Groups
Efficiency
Costs and Cost Analysis
Health
age group
learning process
Datasets
paradigm
decision making
efficiency
ability
costs
interaction
health
experience

All Science Journal Classification (ASJC) codes

  • Health (social science)
  • Sociology and Political Science
  • Life-span and Life-course Studies

Cite this

@article{64fcf7dc45a847c798f2309744731f89,
title = "Feature selection methods for optimal design of studies for developmental inquiry",
abstract = "Objectives: As diary, panel, and experience sampling methods become easier to implement, studies of development and aging are adopting more and more intensive study designs. However, if too many measures are included in such designs, interruptions for measurement may constitute a significant burden for participants. We propose the use of feature selection (a data-driven machine learning process) in study design and selection of measures that show the most predictive power in pilot data. Method: We introduce an analytical paradigm based on feature importance estimation and recursive feature elimination with decision tree ensembles and illustrate its utility using empirical data from the German Socio-Economic Panel (SOEP). Results: We identified a subset of 20 measures from the SOEP data set that maintain much of the ability of the original data set to predict life satisfaction and health across younger, middle, and older age groups. Discussion: Feature selection techniques permit researchers to choose measures that are maximally predictive of relevant outcomes, even when there are interactions or nonlinearities. These techniques facilitate decisions about which measures may be dropped from a study while maintaining efficiency of prediction across groups and reducing costs to the researcher and burden on the participants.",
author = "Brick, {Timothy R.} and Koffer, {Rachel E.} and Gerstorf, Denis and Ram, Nilam",
year = "2018",
month = "1",
day = "1",
doi = "10.1093/geronb/gbx008",
language = "English (US)",
volume = "73",
pages = "113--123",
journal = "Journals of Gerontology - Series B Psychological Sciences and Social Sciences",
issn = "1079-5014",
publisher = "Gerontological Society of America",
number = "1",
}

Feature selection methods for optimal design of studies for developmental inquiry. / Brick, Timothy R.; Koffer, Rachel E.; Gerstorf, Denis; Ram, Nilam.

In: Journals of Gerontology - Series B Psychological Sciences and Social Sciences, Vol. 73, No. 1, 01.01.2018, p. 113-123.

TY  - JOUR
T1  - Feature selection methods for optimal design of studies for developmental inquiry
AU  - Brick, Timothy R.
AU  - Koffer, Rachel E.
AU  - Gerstorf, Denis
AU  - Ram, Nilam
PY  - 2018/1/1
Y1  - 2018/1/1
AB  - Objectives: As diary, panel, and experience sampling methods become easier to implement, studies of development and aging are adopting more and more intensive study designs. However, if too many measures are included in such designs, interruptions for measurement may constitute a significant burden for participants. We propose the use of feature selection (a data-driven machine learning process) in study design and selection of measures that show the most predictive power in pilot data. Method: We introduce an analytical paradigm based on feature importance estimation and recursive feature elimination with decision tree ensembles and illustrate its utility using empirical data from the German Socio-Economic Panel (SOEP). Results: We identified a subset of 20 measures from the SOEP data set that maintain much of the ability of the original data set to predict life satisfaction and health across younger, middle, and older age groups. Discussion: Feature selection techniques permit researchers to choose measures that are maximally predictive of relevant outcomes, even when there are interactions or nonlinearities. These techniques facilitate decisions about which measures may be dropped from a study while maintaining efficiency of prediction across groups and reducing costs to the researcher and burden on the participants.
UR  - http://www.scopus.com/inward/record.url?scp=85046261188&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85046261188&partnerID=8YFLogxK
U2  - 10.1093/geronb/gbx008
DO  - 10.1093/geronb/gbx008
M3  - Article
C2  - 28164232
AN  - SCOPUS:85046261188
VL  - 73
SP  - 113
EP  - 123
JO  - Journals of Gerontology - Series B Psychological Sciences and Social Sciences
JF  - Journals of Gerontology - Series B Psychological Sciences and Social Sciences
SN  - 1079-5014
IS  - 1
ER  -