Abstract
Improved iterative scaling (IIS) is a simple, powerful algorithm for learning maximum entropy (ME) conditional probability models that has found great utility in natural language processing and related applications. In nearly all prior work on IIS, one considers discrete-valued feature functions, depending on the data observations and class label, and encodes statistical constraints on these discrete-valued random variables. Moreover, most significantly for our purposes, the (ground-truth) constraints are measured from frequency counts, based on hard (0-1) training set instances of feature values. Here, we extend IIS to the case where the training (and test) set consists of instances of probability mass functions on the features, rather than instances of hard feature values. We show that the IIS methodology extends in a natural way to this case. This extension has applications 1) to ME aggregation of soft classifier outputs in ensemble classification and 2) to ME classification on mixed discrete-continuous feature spaces. Moreover, we combine these methods, yielding an ME method that jointly performs (soft) decision-level fusion and feature-level fusion in making ensemble decisions. We demonstrate favorable comparisons against both standard boosting and bagging on UC Irvine benchmark data sets. We also discuss some of our continuing research directions.
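To make the soft-constraint idea in the abstract concrete, below is a minimal sketch (not the authors' implementation): it fits a maximum entropy conditional model whose empirical feature constraints are expectations under each training instance's probability mass function over feature values, rather than hard 0-1 counts. For brevity the sketch uses plain gradient ascent on the conditional log-likelihood instead of the IIS updates developed in the paper; the function names (`soft_expectations`, `fit_maxent`) and the toy data are illustrative assumptions.

```python
import numpy as np

def soft_expectations(pmfs, labels, n_values, n_classes):
    """Empirical feature expectations where each training instance contributes
    its pmf over discrete feature values instead of a hard 0-1 count."""
    E = np.zeros((n_values, n_classes))
    for pmf, y in zip(pmfs, labels):
        E[:, y] += np.asarray(pmf)          # soft count: add the whole pmf
    return E / len(labels)

def fit_maxent(pmfs, labels, n_values, n_classes, lr=0.5, iters=500):
    """Fit weights lam of p(y|x) proportional to exp(x . lam[:, y]) so that model
    feature expectations match the soft empirical expectations (ME constraints)."""
    lam = np.zeros((n_values, n_classes))
    target = soft_expectations(pmfs, labels, n_values, n_classes)
    X = np.asarray(pmfs, dtype=float)       # (N, n_values) soft instances
    N = len(labels)
    for _ in range(iters):
        scores = X @ lam                     # expected feature score per class
        scores -= scores.max(axis=1, keepdims=True)
        p = np.exp(scores)
        p /= p.sum(axis=1, keepdims=True)    # p(y | soft instance)
        model_E = (X.T @ p) / N              # model-side feature expectations
        lam += lr * (target - model_E)       # move toward constraint satisfaction
    return lam

# Toy usage: two discrete feature values, two classes, soft training instances.
pmfs = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
labels = [0, 0, 1, 1]
lam = fit_maxent(pmfs, labels, n_values=2, n_classes=2)
x_new = np.array([0.7, 0.3])                 # a new soft (pmf-valued) instance
s = x_new @ lam
print(np.exp(s - s.max()) / np.exp(s - s.max()).sum())
```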
Original language | English (US)
---|---
Title of host publication | 2005 IEEE Workshop on Machine Learning for Signal Processing
Pages | 61-66
Number of pages | 6
DOIs | 10.1109/MLSP.2005.1532875
State | Published - Dec 1 2005
Event | 2005 IEEE Workshop on Machine Learning for Signal Processing, Mystic, CT, United States; Duration: Sep 28 2005 → Sep 30 2005
Publication series
Name | 2005 IEEE Workshop on Machine Learning for Signal Processing
---|---
Other
Other | 2005 IEEE Workshop on Machine Learning for Signal Processing
---|---
Country | United States |
City | Mystic, CT |
Period | 9/28/05 → 9/30/05 |
All Science Journal Classification (ASJC) codes
- Engineering (all)
Cite this
An extension of iterative scaling for joint decision-level and feature-level fusion in ensemble classification. / Miller, David Jonathan; Pal, Siddharth.
2005 IEEE Workshop on Machine Learning for Signal Processing. 2005. p. 61-66, 1532875 (2005 IEEE Workshop on Machine Learning for Signal Processing). Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
TY - GEN
T1 - An extension of iterative scaling for joint decision-level and feature-level fusion in ensemble classification
AU - Miller, David Jonathan
AU - Pal, Siddharth
PY - 2005/12/1
Y1 - 2005/12/1
AB - Improved iterative scaling (IIS) is a simple, powerful algorithm for learning maximum entropy (ME) conditional probability models that has found great utility in natural language processing and related applications. In nearly all prior work on IIS, one considers discrete-valued feature functions, depending on the data observations and class label, and encodes statistical constraints on these discrete-valued random variables. Moreover, most significantly for our purposes, the (ground-truth) constraints are measured from frequency counts, based on hard (0-1) training set instances of feature values. Here, we extend IIS to the case where the training (and test) set consists of instances of probability mass functions on the features, rather than instances of hard feature values. We show that the IIS methodology extends in a natural way to this case. This extension has applications 1) to ME aggregation of soft classifier outputs in ensemble classification and 2) to ME classification on mixed discrete-continuous feature spaces. Moreover, we combine these methods, yielding an ME method that jointly performs (soft) decision-level fusion and feature-level fusion in making ensemble decisions. We demonstrate favorable comparisons against both standard boosting and bagging on UC Irvine benchmark data sets. We also discuss some of our continuing research directions.
UR - http://www.scopus.com/inward/record.url?scp=33749047673&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33749047673&partnerID=8YFLogxK
U2 - 10.1109/MLSP.2005.1532875
DO - 10.1109/MLSP.2005.1532875
M3 - Conference contribution
AN - SCOPUS:33749047673
SN - 0780395174
SN - 9780780395176
T3 - 2005 IEEE Workshop on Machine Learning for Signal Processing
SP - 61
EP - 66
BT - 2005 IEEE Workshop on Machine Learning for Signal Processing
ER -