Almost all application segments today are experiencing a data explosion, meaning that they need to store, access, manipulate, and transform extremely large amounts of data held in different media in a fashion that is simultaneously performance-aware and energy-aware. These data-hungry market segments include (i) consumer applications in the mobile and home electronics segment, (ii) desktop applications that provide rich content and user experience, (iii) scientific applications that generate petabytes of data for analyzing experiments and real-world phenomena on previously unheard-of temporal and spatial scales, (iv) enterprise applications that tirelessly store all kinds of data and knowledge for auditability, analytics, and optimization, (v) datacenters and cloud platforms that use storage to hold large virtual machine images of workloads for consolidation across different servers, (vi) Internet services and social networking platforms that need to store, track, and manage user patterns, and (vii) cyber-physical applications that continuously sense and store physical-world data for real-time analytics and control. Current computer infrastructures are poorly equipped to cope with this data demand. The primary reason is the inherent physical divide between computation and storage. While both computation and storage technologies have improved tremendously in recent decades, the interactions and interfaces between them have not, which limits the performance of critical data-intensive applications. If not addressed in a timely fashion, this problem has the potential to slow down scientific discoveries and engineering breakthroughs.
This project addresses the data management problem by breaking down the physical divide between computation and NAND-flash storage. Doing so can potentially allow the communication bandwidth between computation and storage to scale together with the parallelism-driven scaling of both computation and storage resources. It can also allow each to become more aware of the intentions and operations of the other, opening a wide spectrum of possibilities for managing storage more efficiently. This will in turn allow better co-design, co-management, and co-evolution of the two for better scalability in the future, as applications start imposing even more stringent computing and storage demands. Specifically, this project investigates three main strategies for bridging the physical divide between compute and NAND-flash storage. The first strategy enables better cooperation between flash storage and the host; the second strategy elevates NAND-flash storage to interface directly with the processors, similar to the way main memory DIMMs (dual in-line memory modules) interface with the on-chip cores through memory controllers; and the last strategy explores different placement options for tighter integration of NAND-flash storage with computational resources. The broader impacts of this research include student training, participation of under-represented groups, recruiting workshops, incorporation of educational modules into existing and future courses, and public-domain simulation tools. Further, through the Visit In Engineering Weekend (VIEW) program, the project fosters interest in computer science and engineering. The project provides hands-on design activities to motivate the VIEW participants in new areas of computer science and engineering related to storage systems and data management.
Effective start/end date: 7/1/13 → 6/30/17
- National Science Foundation: $800,000.00