Nomadic speech-based text entry: A decision model strategy for improved speech to text processing

Kathleen J. Price, Min Lin, Jinjuan Feng, Rich Goldman, Andrew L. Sears, Julie Jacko

Research output: Contribution to journal, Article

5 Citations (Scopus)

Abstract

Speech text entry can be problematic even under ideal dictation conditions, and the difficulties are magnified when external conditions deteriorate. Motion during speech is an extraordinary condition that might have detrimental effects on automatic speech recognition. This research examined speech text entry while mobile. Speech enrollment profiles were created by participants in both seated and walking environments. Dictation tasks were also completed in both the seated and walking conditions. Although results from an earlier study suggested that completing the enrollment process under more challenging conditions may lead to improved recognition accuracy under both challenging and less challenging conditions, the current study provided contradictory results. A detailed review of error rates confirmed that some participants minimized errors by enrolling under more challenging conditions while others benefited by enrolling under less challenging conditions. Still others minimized errors when different enrollment models were used under the opposing condition. Leveraging these insights, we developed a decision model to minimize recognition error rates regardless of the conditions experienced while completing dictation tasks. When applying the model to existing data, error rates were reduced significantly, but additional research is necessary to effectively validate the proposed solution.
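The abstract does not specify how the decision model works, so the following is only a minimal sketch of one way such a selection could be framed, not the authors' method. It assumes that, for each participant, an error rate has been measured for every pairing of enrollment condition and dictation condition, and it picks the enrollment profile with the lowest observed error for each dictation condition. The function name `choose_profiles`, the table layout, and all numeric values are hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple

# Error rates keyed by (enrollment condition, dictation condition).
# Layout and values are illustrative assumptions, not data from the study.
ErrorTable = Dict[Tuple[str, str], float]

CONDITIONS = ("seated", "walking")


def choose_profiles(errors: ErrorTable) -> Dict[str, str]:
    """For each dictation condition, return the enrollment condition
    whose profile produced the lowest observed error rate."""
    return {
        dictation: min(CONDITIONS, key=lambda enroll: errors[(enroll, dictation)])
        for dictation in CONDITIONS
    }


# Hypothetical participant who does best with the matching profile.
participant_a: ErrorTable = {
    ("seated", "seated"): 0.12,
    ("walking", "seated"): 0.15,
    ("seated", "walking"): 0.30,
    ("walking", "walking"): 0.22,
}

# Hypothetical participant who does best with the walking profile throughout.
participant_b: ErrorTable = {
    ("seated", "seated"): 0.18,
    ("walking", "seated"): 0.14,
    ("seated", "walking"): 0.35,
    ("walking", "walking"): 0.25,
}

print(choose_profiles(participant_a))
print(choose_profiles(participant_b))
```

A per-participant lookup like this reflects the observation that no single enrollment condition was best for everyone. Whether the published model uses this rule or something more elaborate cannot be determined from the abstract alone.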

Original language: English (US)
Pages (from-to): 692-706
Number of pages: 15
Journal: International Journal of Human-Computer Interaction
Volume: 25
Issue number: 7
DOI: 10.1080/10447310902964132
State: Published - Sep 1 2009

Fingerprint

Text processing
Decision model
Speech recognition

All Science Journal Classification (ASJC) codes

  • Human Factors and Ergonomics
  • Human-Computer Interaction
  • Computer Science Applications

Cite this

Price, Kathleen J.; Lin, Min; Feng, Jinjuan; Goldman, Rich; Sears, Andrew L.; Jacko, Julie. / Nomadic speech-based text entry: A decision model strategy for improved speech to text processing. In: International Journal of Human-Computer Interaction. 2009; Vol. 25, No. 7. pp. 692-706.
@article{d97b76a463d544d9824727bbfcf09ab5,
title = "Nomadic speech-based text entry: A decision model strategy for improved speech to text processing",
author = "Price, {Kathleen J.} and Min Lin and Jinjuan Feng and Rich Goldman and Sears, {Andrew L.} and Julie Jacko",
year = "2009",
month = "9",
day = "1",
doi = "10.1080/10447310902964132",
language = "English (US)",
volume = "25",
pages = "692--706",
journal = "International Journal of Human-Computer Interaction",
issn = "1044-7318",
publisher = "Taylor and Francis Ltd.",
number = "7",

}

TY - JOUR

T1 - Nomadic speech-based text entry

T2 - A decision model strategy for improved speech to text processing

AU - Price, Kathleen J.

AU - Lin, Min

AU - Feng, Jinjuan

AU - Goldman, Rich

AU - Sears, Andrew L.

AU - Jacko, Julie

PY - 2009/9/1

Y1 - 2009/9/1

AB - Speech text entry can be problematic during ideal dictation conditions, but difficulties are magnified when external conditions deteriorate. Motion during speech is an extraordinary condition that might have detrimental effects on automatic speech recognition. This research examined speech text entry while mobile. Speech enrollment profiles were created by participants in both a seated and walking environment. Dictation tasks were also completed in both the seated and walking conditions. Although results from an earlier study suggested that completing the enrollment process under more challenging conditions may lead to improved recognition accuracy under both challenging and less challenging conditions, the current study provided contradictory results. A detailed review of error rates confirmed that some participants minimized errors by enrolling under more challenging conditions while others benefited by enrolling under less challenging conditions. Still others minimized errors when different enrollment models were used under the opposing condition. Leveraging these insights, we developed a decision model to minimize recognition error rates regardless of the conditions experienced while completing dictation tasks. When applying the model to existing data, error rates were reduced significantly but additional research is necessary to effectively validate the proposed solution.

UR - http://www.scopus.com/inward/record.url?scp=77949382487&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77949382487&partnerID=8YFLogxK

U2 - 10.1080/10447310902964132

DO - 10.1080/10447310902964132

M3 - Article

AN - SCOPUS:77949382487

VL - 25

SP - 692

EP - 706

JO - International Journal of Human-Computer Interaction

JF - International Journal of Human-Computer Interaction

SN - 1044-7318

IS - 7

ER -