The commoditization of high performance computer systems has resulted in their widespread deployment as servers in numerous environments. We find clusters and/or Symmetric Multiprocessors (SMPs) being extensively used for commercial services such as e-commerce and transaction processing, everyday file/web service needs, and for long running scientific applications in academic/research settings. While hardware and software procurement costs for their deployment have dropped significantly, the total cost of ownership is in fact growing because of the costs of involving a human in managing and tuning these systems. It is a non-trivial task today to tune a system and service for each environment/configuration. This proposal intends to develop a data-driven feedback framework - called Cruise Control - to aid in the design and deployment of such autonomic servers. There are several research questions to be investigated in the development of this framework: What system and workload events (data) should we monitor in the server that have a consequence on its performance? How do we represent and store this data in a meaningful manner (and compress them in the process) since it may be collected over several days and at very fine resolution? Based on this historical data and currently evolving conditions, how do we design a controller that can modulate the server execution to avoid performance bottlenecks in a cost-effective manner? What system mechanisms are needed within the underlying operating system and middleware layer to provide the data collection and server modulation functionalities? How do we structure and develop this complete infrastructure on commodity systems without degrading the performance of the server?
The Cruise Control framework will provide system mechanisms for collecting the data and will attempt to characterize them to compress their representation. and will develop within the operating system mechanisms for effecting such performance modulation will also be developed within the operating system. This general framework will be implemented and validated experimentally for two different server environments - a commercial database server and a high performance computing server for scientific applications - on a cluster and a SMP hardware platform.
|Effective start/end date||9/15/03 → 8/31/06|
- National Science Foundation: $550,590.00