In the traditional data envelopment analysis (DEA) approach for a set of n Decision Making Units (DMUs), a standard DEA model is solved n times, once for each DMU. As the number of DMUs increases, the running time required to solve the standard model rises sharply. In this study, a new framework is proposed that significantly decreases the required DEA calculation time in comparison with existing methodologies when a large set of DMUs (e.g., 20,000 DMUs or more) is present. The framework comprises five steps: (i) selecting a subsample of DMUs using a proposed algorithm, (ii) finding the best-practice DMUs in the selected subsample, (iii) finding the DMUs exterior to the hull of the selected subsample, (iv) identifying the set of all efficient DMUs, and (v) measuring the performance scores of the DMUs, which are the same as those arising from the traditional DEA approach. The variable returns to scale technology is assumed, and several simulation experiments are designed to estimate the running time of the proposed method on big data. The results indicate that the running time is decreased by up to 99.9% in comparison with existing techniques. In addition, we illustrate the computation time required by the proposed method as a function of the number of DMUs (cardinality), the number of inputs and outputs (dimension), and the proportion of efficient DMUs (density). The methods are also compared on a real data set consisting of 30,099 electric power plants in the United States from 1996 to 2016.
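
For orientation, the "standard DEA model" solved once per DMU can be illustrated by the textbook input-oriented envelopment model under variable returns to scale. This is a sketch under stated assumptions, not necessarily the exact model used in the study: the abstract does not specify the orientation, and the symbols are introduced here for illustration ($x_{ij}$ and $y_{rj}$ are input $i$ and output $r$ of DMU $j$, with $m$ inputs, $s$ outputs, and DMU $o$ under evaluation).

```latex
\begin{aligned}
\theta_o^{*} = \min_{\theta,\,\lambda}\quad & \theta \\
\text{s.t.}\quad
& \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta\, x_{io}, && i = 1,\dots,m, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{ro},          && r = 1,\dots,s, \\
& \sum_{j=1}^{n} \lambda_j = 1, \\
& \lambda_j \ge 0,                                      && j = 1,\dots,n.
\end{aligned}
```

Each such linear program has $n + 1$ variables and $m + s + 1$ constraints, and the traditional approach solves it for every $o = 1,\dots,n$, so both the number of programs and the size of each one grow with $n$. The optimal value is unchanged if the sums run only over the efficient DMUs, which is why identifying that set first, as in step (iv), can shrink every program.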
All Science Journal Classification (ASJC) codes
- Computer Science(all)
- Modeling and Simulation
- Management Science and Operations Research
- Information Systems and Management