Learning Decision Tree Classifiers from Attribute Value Taxonomies and Partially Specified Data

Jun Zhang, Vasant Honavar

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

48 Scopus citations

Abstract

We consider the problem of learning to classify partially specified instances, i.e., instances that are described in terms of attribute values at different levels of precision, using user-supplied attribute value taxonomies (AVT). We formalize the problem of learning from AVT and data and present an AVT-guided decision tree learning algorithm (AVT-DTL) to learn classification rules at multiple levels of abstraction. The proposed approach generalizes existing techniques for dealing with missing values to handle instances with partially missing values. We present experimental results that demonstrate that AVT-DTL is able to effectively learn robust, high-accuracy classifiers from partially specified examples. Our experiments also demonstrate that AVT-DTL outperforms the standard decision tree algorithm (C4.5 and its variants) when applied to data with missing attribute values, and that it produces substantially more compact decision trees than those obtained by the standard approach.
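To make the terminology concrete, the following is a minimal sketch of what an attribute value taxonomy and a partially specified value might look like. It is not the AVT-DTL algorithm from the paper. The `shape` taxonomy, the function names, and the uniform weighting over leaf values are all assumptions introduced here for illustration; the uniform split is only loosely modeled on the fractional-instance treatment of missing values in C4.5-style learners.

```python
# Hypothetical taxonomy for one attribute ("shape"); all names are illustrative.
# Each key is an abstract value, mapped to its more specific child values.
AVT = {
    "shape": ["polygon", "ellipse"],
    "polygon": ["triangle", "square"],
    "ellipse": ["circle", "oval"],
}


def leaves(avt, value):
    """Return the most specific (leaf) values at or below `value`."""
    children = avt.get(value, [])
    if not children:
        return [value]
    result = []
    for child in children:
        result.extend(leaves(avt, child))
    return result


def is_partially_specified(avt, value):
    """Treat a value as partially specified if it is an internal taxonomy node."""
    return bool(avt.get(value))


def leaf_weights(avt, value):
    """Spread one instance uniformly over the leaf values consistent with
    `value`. The uniform split is an assumption made for this sketch."""
    consistent = leaves(avt, value)
    return {leaf: 1.0 / len(consistent) for leaf in consistent}


if __name__ == "__main__":
    print(is_partially_specified(AVT, "polygon"))
    print(leaf_weights(AVT, "polygon"))
```

In this sketch, a value such as `polygon` is less precise than `triangle` or `square`, and a value recorded only as the taxonomy root (`shape`) carries no information about the attribute, which is one way to view a fully missing value as the extreme case of partial specification.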

Original language: English (US)
Title of host publication: Proceedings, Twentieth International Conference on Machine Learning
Editors: T. Fawcett, N. Mishra
Pages: 880-887
Number of pages: 8
State: Published - 2003
Event: Proceedings, Twentieth International Conference on Machine Learning - Washington, DC, United States
Duration: Aug 21 2003 to Aug 24 2003

Publication series

Name: Proceedings, Twentieth International Conference on Machine Learning
Volume: 2

Other

Other: Proceedings, Twentieth International Conference on Machine Learning
Country: United States
City: Washington, DC
Period: 8/21/03 to 8/24/03

All Science Journal Classification (ASJC) codes

  • Engineering(all)
