TY - GEN
T1 - CHAMELEON
T2 - 51st Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2018
AU - Kotra, Jagadish B.
AU - Zhang, Haibo
AU - Alameldeen, Alaa R.
AU - Wilkerson, Chris
AU - Kandemir, Mahmut T.
N1 - Funding Information:
We thank Jaewoong Sim for his helpful input on an earlier version of this work. We thank the anonymous reviewers for their valuable feedback. This work is supported in part by NSF grants 1822923, 1439021, 1629915, 1626251, 1629129, 1763681, 1526750 and 1439057. AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
Publisher Copyright:
© 2018 IEEE.
PY - 2018/12/12
Y1 - 2018/12/12
N2 - Modern computing systems and applications have growing demand for memories with higher bandwidth. This demand can be alleviated using fast, large on-die or die-stacked memories. They are typically used with traditional DRAM as part of a heterogeneous memory system and used either as a DRAM cache or as a hardware- or OS-managed part of memory (PoM). Caches adapt rapidly to application needs and typically provide higher performance but reduce the total OS-visible memory capacity. PoM architectures increase the total OS-visible memory capacity but exhibit additional overheads due to swapping large blocks of data between fast and slow memory. In this paper, we propose Chameleon, a hybrid architecture that bridges the gap between cache and PoM architectures. When applications need a large memory, Chameleon uses both fast and slow memories as PoM, maximizing the available space for the application. When the application's footprint is smaller than the total physical memory capacity, Chameleon opportunistically uses free space in the system as a hardware-managed cache. Chameleon is a hardware-software co-designed system where the OS notifies the hardware of pages that are allocated or freed, and hardware decides on switching memory regions between PoM and cache modes dynamically. Based on our evaluation of multi-programmed workloads on a system with 4GB fast memory and 20GB slow memory, Chameleon improves the average performance by 11.6% over PoM and 24.2% over a latency-optimized cache.
AB - Modern computing systems and applications have growing demand for memories with higher bandwidth. This demand can be alleviated using fast, large on-die or die-stacked memories. They are typically used with traditional DRAM as part of a heterogeneous memory system and used either as a DRAM cache or as a hardware- or OS-managed part of memory (PoM). Caches adapt rapidly to application needs and typically provide higher performance but reduce the total OS-visible memory capacity. PoM architectures increase the total OS-visible memory capacity but exhibit additional overheads due to swapping large blocks of data between fast and slow memory. In this paper, we propose Chameleon, a hybrid architecture that bridges the gap between cache and PoM architectures. When applications need a large memory, Chameleon uses both fast and slow memories as PoM, maximizing the available space for the application. When the application's footprint is smaller than the total physical memory capacity, Chameleon opportunistically uses free space in the system as a hardware-managed cache. Chameleon is a hardware-software co-designed system where the OS notifies the hardware of pages that are allocated or freed, and hardware decides on switching memory regions between PoM and cache modes dynamically. Based on our evaluation of multi-programmed workloads on a system with 4GB fast memory and 20GB slow memory, Chameleon improves the average performance by 11.6% over PoM and 24.2% over a latency-optimized cache.
UR - http://www.scopus.com/inward/record.url?scp=85060028747&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85060028747&partnerID=8YFLogxK
U2 - 10.1109/MICRO.2018.00050
DO - 10.1109/MICRO.2018.00050
M3 - Conference contribution
AN - SCOPUS:85060028747
T3 - Proceedings of the Annual International Symposium on Microarchitecture, MICRO
SP - 533
EP - 545
BT - Proceedings - 51st Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2018
PB - IEEE Computer Society
Y2 - 20 October 2018 through 24 October 2018
ER -