A data mining-constraint satisfaction optimization problem for cost effective classification

Research output: Contribution to journal (Article)

11 Citations (Scopus)

Abstract

We propose a data mining-constraint satisfaction optimization problem (DM-CSOP) where it is desired to maximize the number of correct classifications at a lowest possible information acquisition cost. We show that the problem can be formulated as a set of several binary variable knapsack optimization problems, which are solved sequentially. We propose a heuristic hybrid simulated annealing and gradient-descent artificial neural network (ANN) procedure to solve the DM-CSOP. Using a real-world heart disease data set, we show that the proposed hybrid procedure provides a low-cost and high-quality solution when compared to a traditional ANN classification approach. The massive proliferation of very large databases in organizations makes it necessary to design cost effective and efficient data mining systems. This paper proposes a data mining constraint satisfaction optimization problem, which provides a high quality cost effective solution for a binary classification problem.

Original language: English (US)
Pages (from-to): 3124-3135
Number of pages: 12
Journal: Computers and Operations Research
Volume: 33
Issue number: 11
DOI: 10.1016/j.cor.2005.01.023
State: Published - Nov 1 2006

Fingerprint

Constraint Satisfaction Problem
Data mining
Optimization Problem
Costs
Artificial Neural Network
Neural networks
Binary Classification
Binary Variables
Gradient Descent
Knapsack Problem
Several Variables
Proliferation
Simulated annealing
Classification Problems
Lowest
Maximise
Heuristics
Necessary

All Science Journal Classification (ASJC) codes

  • Computer Science(all)
  • Modeling and Simulation
  • Management Science and Operations Research

Cite this

@article{90e000ac945a4793b5e976f8bef670b7,
title = "A data mining-constraint satisfaction optimization problem for cost effective classification",
abstract = "We propose a data mining-constraint satisfaction optimization problem (DM-CSOP) where it is desired to maximize the number of correct classifications at a lowest possible information acquisition cost. We show that the problem can be formulated as a set of several binary variable knapsack optimization problems, which are solved sequentially. We propose a heuristic hybrid simulated annealing and gradient-descent artificial neural network (ANN) procedure to solve the DM-CSOP. Using a real-world heart disease data set, we show that the proposed hybrid procedure provides a low-cost and high-quality solution when compared to a traditional ANN classification approach. The massive proliferation of very large databases in organizations makes it necessary to design cost effective and efficient data mining systems. This paper proposes a data mining constraint satisfaction optimization problem, which provides a high quality cost effective solution for a binary classification problem.",
author = "Pendharkar, {Parag C.}",
year = "2006",
month = "11",
day = "1",
doi = "10.1016/j.cor.2005.01.023",
language = "English (US)",
volume = "33",
pages = "3124--3135",
journal = "Computers and Operations Research",
issn = "0305-0548",
publisher = "Elsevier Limited",
number = "11",

}

A data mining-constraint satisfaction optimization problem for cost effective classification. / Pendharkar, Parag C.

In: Computers and Operations Research, Vol. 33, No. 11, 01.11.2006, p. 3124-3135.

TY - JOUR

T1 - A data mining-constraint satisfaction optimization problem for cost effective classification

AU - Pendharkar, Parag C.

PY - 2006/11/1

Y1 - 2006/11/1

AB - We propose a data mining-constraint satisfaction optimization problem (DM-CSOP) where it is desired to maximize the number of correct classifications at a lowest possible information acquisition cost. We show that the problem can be formulated as a set of several binary variable knapsack optimization problems, which are solved sequentially. We propose a heuristic hybrid simulated annealing and gradient-descent artificial neural network (ANN) procedure to solve the DM-CSOP. Using a real-world heart disease data set, we show that the proposed hybrid procedure provides a low-cost and high-quality solution when compared to a traditional ANN classification approach. The massive proliferation of very large databases in organizations makes it necessary to design cost effective and efficient data mining systems. This paper proposes a data mining constraint satisfaction optimization problem, which provides a high quality cost effective solution for a binary classification problem.

UR - http://www.scopus.com/inward/record.url?scp=33644688207&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33644688207&partnerID=8YFLogxK

U2 - 10.1016/j.cor.2005.01.023

DO - 10.1016/j.cor.2005.01.023

M3 - Article

VL - 33

SP - 3124

EP - 3135

JO - Computers and Operations Research

JF - Computers and Operations Research

SN - 0305-0548

IS - 11

ER -