Third-party error detection support mechanisms for dictation speech recognition

Lina Zhou, Yongmei Shi, Andrew Sears

Research output: Contribution to journal (article)

7 Scopus citations

Abstract

Although speech recognition has improved significantly in recent years, its adoption continues to be limited, in part, by the effort and frustration associated with correcting speech recognition errors. Error detection is a particularly challenging issue in third-party error correction, where different individuals are responsible for the original dictation and for correcting the resulting text. This research aims to address the difficulty experienced in third-party error detection by developing and evaluating a variety of support mechanisms. Drawing on a growing body of literature on human-computer interaction and speech recognition, four support mechanisms were designed and evaluated: indexed audio, speech summarization, error prediction, and the presentation of alternative hypotheses. A user study assessed the impact of these support mechanisms on both performance and perceptions during error detection tasks. Performance measures included effectiveness and efficiency, and perception measures included confidence, perceived usefulness, and cognitive workload. The results provide strong support for the use of indexed audio in the context of third-party error detection. The results also confirm that the consecutive error rate, that is, the percentage of recognition errors immediately adjacent to another error, has a negative impact on the effectiveness of third-party error detection. The other support mechanisms failed to improve either effectiveness or perceptions, but they did offset the negative impact of an increasing consecutive error rate. These findings have significant implications for speech recognition error detection research and the design of error detection support solutions.
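The consecutive error rate defined in the abstract can be illustrated with a minimal sketch. It assumes the recognized text is represented as a list of per-word boolean flags (True where the word is a recognition error). The function name and this input representation are hypothetical and are not taken from the article, whose exact computation may differ.

```python
def consecutive_error_rate(error_flags):
    """Percentage of errors that sit immediately next to another error.

    error_flags: list of booleans, one per recognized word, True marking
    a recognition error (an assumed representation, not the article's).
    Returns 0.0 when the transcript contains no errors.
    """
    error_positions = [i for i, is_error in enumerate(error_flags) if is_error]
    if not error_positions:
        return 0.0

    adjacent = 0
    for i in error_positions:
        # An error counts as consecutive if its left or right neighbor
        # is also an error.
        left_is_error = i > 0 and error_flags[i - 1]
        right_is_error = i + 1 < len(error_flags) and error_flags[i + 1]
        if left_is_error or right_is_error:
            adjacent += 1

    return 100.0 * adjacent / len(error_positions)
```

Under this reading, a transcript whose errors are all isolated scores 0, and one whose errors all occur in runs of two or more scores 100.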

Original language: English (US)
Pages (from-to): 375-388
Number of pages: 14
Journal: Interacting with Computers
Volume: 22
Issue number: 5
State: Published - Sep 1, 2010

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
