Error-driven generalist+experts (EDGE): A multi-stage ensemble framework for text categorization

Jian Huang, Omid Madani, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceedingConference contribution

2 Scopus citations

Abstract

We introduce a multi-stage ensemble framework, Error-Driven Generalist+Expert or Edge, for improved classica-tion on large-scale text categorization problems. Edgerst trains a generalist, capable of classifying under all classes, to deliver a reasonably accurate initial category ranking given an instance. Edge then computes a confusion graph for the generalist and allocates the learning resources to train experts on relatively small groups of classes that tend to be systematically confused with one another by the generalist. The experts' votes, when invoked on a given instance, yield a reranking of the classes, thereby correcting the errors of the generalist. Our evaluations showcase the improved classification and ranking performance on several large-scale text categorization datasets. Edge is in particular effcient when the underlying learners are effcient. Our study of confusion graphs is also of independent interest.

Original languageEnglish (US)
Title of host publicationProceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM'08
Pages83-92
Number of pages10
DOIs
StatePublished - Dec 1 2008
Event17th ACM Conference on Information and Knowledge Management, CIKM'08 - Napa Valley, CA, United States
Duration: Oct 26 2008Oct 30 2008

Publication series

NameInternational Conference on Information and Knowledge Management, Proceedings

Other

Other17th ACM Conference on Information and Knowledge Management, CIKM'08
CountryUnited States
CityNapa Valley, CA
Period10/26/0810/30/08

All Science Journal Classification (ASJC) codes

  • Decision Sciences(all)
  • Business, Management and Accounting(all)

Fingerprint Dive into the research topics of 'Error-driven generalist+experts (EDGE): A multi-stage ensemble framework for text categorization'. Together they form a unique fingerprint.

Cite this