An exploration of text mining of narrative reports of injury incidents to assess risk

David Lynn Passmore, Chungil Chae, Yulia Kustikova, Rose Baker, Jeong Ha Yim

Research output: Contribution to journal / Conference article

1 Citation (Scopus)

Abstract

A topic model was explored using unsupervised machine learning to summarize free-text narrative reports of 77,215 injuries that occurred in coal mines in the USA between 2000 and 2015. Latent Dirichlet Allocation modeling identified six topics from the free-text data. One topic, a theme describing primarily injury incidents resulting in strains and sprains of musculoskeletal systems, revealed differences in topic emphasis by the location on the mine property at which injuries occurred, the degree of injury, and the year of injury occurrence. Text narratives clustered around this topic refer most frequently to injuries that occurred at surface or other locations rather than underground, that resulted in disability, and that increased secularly over time. The modeling success enjoyed in this exploratory effort suggests that additional topic mining of these injury text narratives is justified, especially using a broad set of covariates to explain variations in topic emphasis and for comparison of surface mining injuries with injuries occurring during site preparation for construction.
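The workflow the abstract describes (fit a six-topic Latent Dirichlet Allocation model to free-text narratives, then compare topic emphasis across a covariate such as mine location) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the record does not say which software, preprocessing, or inference method was used. The toy narratives, the location labels, the stop-word list, the hyperparameters `alpha` and `beta`, and the function names `fit_lda_gibbs` and `mean_emphasis_by_group` are all assumptions made for the example, and collapsed Gibbs sampling is simply one common way to fit LDA.

```python
import random
from collections import defaultdict

# Hypothetical toy narratives and location labels; the study itself used 77,215 reports.
narratives = [
    "employee strained back while lifting cable on surface",
    "miner sprained ankle stepping off loader at surface shop",
    "roof bolter struck by falling rock underground",
    "employee strained shoulder pulling belt at preparation plant",
    "rock fell from rib and struck miner on hand underground",
    "miner twisted knee climbing ladder on haul truck",
]
locations = ["surface", "surface", "underground", "surface", "underground", "surface"]
STOP_WORDS = {"the", "a", "on", "at", "by", "of", "while", "and", "from", "off"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop a small assumed stop-word list."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def fit_lda_gibbs(docs, n_topics=6, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (doc-topic proportions, topic-word counts)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [defaultdict(int) for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_d)
    # Resample each token's topic conditional on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed per-document topic proportions; each row sums to 1.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    return theta, topic_word


def mean_emphasis_by_group(theta, groups):
    """Average the topic proportions of the documents in each covariate group."""
    sums, counts = {}, defaultdict(int)
    for row, g in zip(theta, groups):
        acc = sums.setdefault(g, [0.0] * len(row))
        for k, p in enumerate(row):
            acc[k] += p
        counts[g] += 1
    return {g: [s / counts[g] for s in acc] for g, acc in sums.items()}


docs = [tokenize(text) for text in narratives]
theta, topic_word = fit_lda_gibbs(docs)
emphasis = mean_emphasis_by_group(theta, locations)
```

Here `emphasis["surface"]` and `emphasis["underground"]` each hold six average topic proportions, so a difference in emphasis for any one topic can be read off by comparing the same position in the two lists. With a corpus this small the topics are not meaningful; the sketch only shows the shape of the computation. The same grouping step could be repeated for degree of injury or year of occurrence.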

Original language: English (US)
Article number: 06020
Journal: MATEC Web of Conferences
Volume: 251
DOI: 10.1051/matecconf/201825106020
State: Published - Dec 14 2018
Event: 6th International Scientific Conference on Integration, Partnership and Innovation in Construction Science and Education, IPICSE 2018 - Moscow, Russian Federation
Duration: Nov 14 2018 to Nov 16 2018

Fingerprint

Musculoskeletal system
Open pit mining
Coal mines
Learning systems

All Science Journal Classification (ASJC) codes

  • Chemistry(all)
  • Materials Science(all)
  • Engineering(all)

Cite this

Passmore, David Lynn ; Chae, Chungil ; Kustikova, Yulia ; Baker, Rose ; Yim, Jeong Ha. / An exploration of text mining of narrative reports of injury incidents to assess risk. In: MATEC Web of Conferences. 2018 ; Vol. 251.
@article{1fc3a1cd14a840dca240c7712c96576b,
title = "An exploration of text mining of narrative reports of injury incidents to assess risk",
author = "Passmore, {David Lynn} and Chungil Chae and Yulia Kustikova and Rose Baker and Yim, {Jeong Ha}",
year = "2018",
month = "12",
day = "14",
doi = "10.1051/matecconf/201825106020",
language = "English (US)",
volume = "251",
journal = "MATEC Web of Conferences",
issn = "2261-236X",
publisher = "EDP Sciences",

}

TY - JOUR

T1 - An exploration of text mining of narrative reports of injury incidents to assess risk

AU - Passmore, David Lynn

AU - Chae, Chungil

AU - Kustikova, Yulia

AU - Baker, Rose

AU - Yim, Jeong Ha

PY - 2018/12/14

Y1 - 2018/12/14

UR - http://www.scopus.com/inward/record.url?scp=85059185991&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85059185991&partnerID=8YFLogxK

U2 - 10.1051/matecconf/201825106020

DO - 10.1051/matecconf/201825106020

M3 - Conference article

AN - SCOPUS:85059185991

VL - 251

JO - MATEC Web of Conferences

JF - MATEC Web of Conferences

SN - 2261-236X

M1 - 06020

ER -