Regression applications often run into difficulties caused by nonlinear relationships, heterogeneous subjects, or time series that are best represented by splines. In such cases, two or more regression functions are often needed to summarize the underlying structure of the data. In most cases, however, it is not known a priori which subset of observations should be approximated by which regression function. This paper presents a methodology that simultaneously clusters observations into a preset number of groups and estimates the coefficients of the corresponding regression functions, optimizing a common objective function. We describe the problem and discuss related procedures. We then describe a new simulated annealing-based methodology, along with program options that accommodate overlapping or nonoverlapping clustering, replications per subject, univariate or multivariate dependent variables, and constraints on cluster membership. Extensive Monte Carlo analyses investigate the overall performance of the methodology. A consumer psychology application presents a conjoint analysis of the determinants of consumer satisfaction. Finally, other applications and extensions of the methodology are discussed.
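To make the idea of clustering and fitting against a single objective concrete, the following is a minimal sketch, not the paper's program. It assumes the simplest setting: nonoverlapping clusters, one predictor, one dependent variable, and a summed least-squares objective. The function names (`line_sse`, `total_sse`, `anneal_clusterwise`), the geometric cooling schedule, and the minimum-cluster-size rule are all illustrative assumptions; the last is only a crude stand-in for a constraint on cluster membership.

```python
import math
import random

def line_sse(points):
    """Residual sum of squares of the least-squares line through (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    return sum((y - intercept - slope * x) ** 2 for x, y in points)

def total_sse(data, labels, k):
    """Common objective (assumed): summed SSE of one fitted line per cluster."""
    return sum(
        line_sse([p for p, g in zip(data, labels) if g == c]) for c in range(k)
    )

def anneal_clusterwise(data, k, n_iter=2000, t0=1.0, cooling=0.998,
                       min_size=3, seed=0):
    """Simulated annealing over cluster labels.

    Assumes k >= 2 and len(data) >= k * min_size.
    """
    rng = random.Random(seed)
    n = len(data)
    labels = [i % k for i in range(n)]  # balanced start, then shuffled
    rng.shuffle(labels)
    counts = [labels.count(c) for c in range(k)]
    current = total_sse(data, labels, k)
    best, best_labels = current, labels[:]
    for step in range(n_iter):
        temp = t0 * cooling ** step  # geometric cooling (assumed schedule)
        i = rng.randrange(n)
        old = labels[i]
        if counts[old] <= min_size:
            continue  # skip moves that would shrink a cluster below min_size
        new = (old + rng.randrange(1, k)) % k  # always a different cluster
        labels[i] = new
        cand = total_sse(data, labels, k)
        delta = cand - current
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = cand
            counts[old] -= 1
            counts[new] += 1
            if current < best:
                best, best_labels = current, labels[:]
        else:
            labels[i] = old  # reject: undo the move
    return best_labels, best
```

Calling `anneal_clusterwise(data, k=2)` on a list of `(x, y)` pairs returns the best labeling found and its objective value. Refitting every cluster on each proposed move keeps the sketch short but is wasteful; the size floor matters because, without it, very small clusters can fit their points exactly and drive the objective toward zero.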
All Science Journal Classification (ASJC) codes
- Applied Mathematics