Many organizations have an ongoing need to evaluate performance and to estimate the production frontier of their best performers. Part of the difficulty in constructing this frontier is that massive amounts of data are generated daily, so the set of best performers is constantly changing. Furthermore, measurement errors can affect both the datasets and the estimated production frontier, and outliers can substantially distort it. Efforts have been made over the past three decades to deal with datasets that might include such outliers, but these methods are mostly semiautomatic or require significant computation time when a large dataset is involved. The little existing research on large datasets also focuses on the computational process of measuring a production frontier without identifying the possible influence of outliers on the estimated frontier. In the current paper, for the first time in the data envelopment analysis (DEA) literature, we develop an automatic framework with the computational capability and accuracy needed for big datasets with multiple inputs and multiple outputs. Several examples, simulation experiments, and real-life applications are discussed to demonstrate the power of the proposed framework. A data analysis with illustrative graphs is provided to clearly show the methodology. In terms of estimating the production frontier, the method is robust and user-friendly, and it substantially reduces the need for user judgment while still allowing such judgment to be incorporated.
All Science Journal Classification (ASJC) codes
- Computer Science (all)
- Modeling and Simulation
- Management Science and Operations Research
- Information Systems and Management