ABSTRACT

The ever-increasing power consumption of the components within a computing system has resulted in tremendous costs and substantial failure rates, which are a roadblock to achieving optimal performance at reasonable cost. To mitigate these issues, strategies that reduce the dynamic power consumption of these components are needed. In this paper, we survey a subset of those strategies, describing their salient features and their efficacy in providing energy savings. The paper reviews energy-saving strategies proposed and verified in both simulators and real systems.

Keywords: DRAM Power Management, Power States.

I. INTRODUCTION

The ever-increasing costs and constraints on power consumption are limiting the leap to next-generation exascale systems [34], since the power limit set for these systems under DoE guidelines is 20 MW. This means that a nearly 60x improvement in power efficiency (watts per operation) is needed. Considering the general trend from one Intel processor generation to the next, the average increase in instructions per cycle (IPC) has been modest, so achieving such a large improvement in performance per watt in a relatively short time could be nearly impossible. Therefore, there is an urgent need to limit the power consumption of modern computing systems.

To minimize the impact of increasing power consumption on computing performance, novel strategies are needed that reduce power without hurting performance by much. Traditionally, energy-saving strategies have focused on the processor, since it has been found to account for the largest share of the power consumption of a full compute node. Of late, however, DRAM (dynamic random access memory) power consumption has been increasing in server platforms, and it can even exceed processor power given sufficient memory intensity and enough DRAM modules. Similarly, data-intensive applications in the HPC (high performance computing) domain demand more memory capacity and, consequently, memory accounts for a larger share of power consumption. For example, in an IBM server, the average memory power consumption was determined to be 1200 watts, compared to an average processor power consumption of 840 watts. This shows that there is an urgent need to limit the power consumption of the memory as well.

Many strategies have been proposed in the past to reduce processor power consumption, primarily through DVFS. Profiling-based strategies that make use of performance counters have been used in [1], [3], [8], [9], [10], [12], [14], [16], [18], [21], [25], [26], [27], [59], [60], [61], [62], [63], [69], [70], [71], [73], [74], [75], [76], [77], [78]. Communication intervals in parallel applications that use MPI and other communication mechanisms were determined, and DVFS subsequently applied, in [15], [19], [23], [32], [36], [58], [59], [60], [61], [62], [63]. The Intel Running Average Power Limit (RAPL) has also been used to limit power consumption in modern computing systems [6], [9], [18], [26], [78].

In this paper, we survey several DRAM power-saving techniques. Since it is impossible to survey all the available strategies, we review only the most representative ones to give an overview of the current state of work in DRAM power saving. Additionally, we focus only on strategies that were proposed explicitly to address power saving, rather than those that achieve it as a by-product of other research. Moreover, we focus on strategies proposed and verified in both simulation and real systems.

The rest of the paper is organized as follows. Section II provides background on the basic terminology of DRAM internals. Section III discusses previously proposed power-saving strategies for DRAM. Section IV presents the conclusions of the paper.

II. DRAM BASICS

DRAM consists of multiple arrays of one-bit storage cells arranged in two dimensions. Each 2-D matrix is formed by the intersection of rows (word lines) and columns (bit lines). This grid-like structure is called a bank. Banks are grouped within a DRAM device, with the number of banks per device depending on the DDR generation (for example, eight in DDR3), and the devices that operate together form the next higher logical unit in the DRAM hierarchy (Fig. 1), known as a rank.

A DIMM (dual in-line memory module) is a circuit board that carries one or more DRAM ranks and provides an interface to the memory bus. A DRAM channel is formed by one DIMM or a group of DIMMs that process requests from the DRAM controller, which is usually built on-chip to decrease the latency of memory access operations.
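The hierarchy described above can be illustrated with a minimal sketch that splits a physical address into channel, rank, bank, row, and column fields. The field widths and their ordering are hypothetical and chosen only for illustration; real memory controllers use platform-specific mappings.

```python
from typing import NamedTuple


class DramAddress(NamedTuple):
    channel: int
    rank: int
    bank: int
    row: int
    column: int


# Hypothetical field widths in bits, listed from least to most significant.
FIELD_BITS = (("column", 10), ("channel", 1), ("bank", 3), ("rank", 1), ("row", 15))


def decode(physical_address: int) -> DramAddress:
    """Split an address into DRAM coordinates using the assumed layout."""
    fields = {}
    for name, bits in FIELD_BITS:
        fields[name] = physical_address & ((1 << bits) - 1)  # keep the low bits
        physical_address >>= bits                            # move to next field
    return DramAddress(**fields)
```

Under this assumed layout, consecutive addresses fall into the same row of the same bank until the column bits overflow, after which accesses alternate between channels.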

III. DISCUSSION OF POWER SAVING STRATEGIES

The authors in [30] use a mechanism that estimates the timing of memory accesses to a bank and then switches the bank to a low-power mode accordingly. It also uses compiler analysis to modify the executable according to its memory accesses, and different power-saving modes are applied based on that analysis.

Lebeck et al. [47] describe a strategy for putting DRAM modules into low-power modes by modifying the virtual-to-physical address mapping so that physical pages are packed into a minimal number of chips, and chips that are not being used are put into a low-power mode.

An estimate of the idle time of memory chips was proposed in [38] to help decide the time after which the DRAM can be put into a low-power state. A similar strategy was proposed in [49] for reducing memory energy consumption by adaptively transitioning memory modules to low-power modes, with an added performance-loss constraint that cannot be violated and that provides an overall envelope within which energy savings are maximized.
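A minimal sketch of a timeout-based policy with a performance-loss budget, in the spirit of the strategies above, is given below. It is not the algorithm of [38] or [49]; the power values, the wake-up latency, and the function names are hypothetical and serve only to show the trade-off between energy and delay.

```python
def evaluate(idle_gaps, threshold, idle_power=1.0, sleep_power=0.2, wake_latency=5):
    """Return (energy, delay) for a timeout policy over the given idle gaps.

    All parameters are in arbitrary units and are illustrative assumptions.
    """
    energy = 0.0
    delay = 0
    for gap in idle_gaps:
        if gap > threshold:
            # Stay idle until the timeout expires, then sleep for the rest.
            energy += threshold * idle_power + (gap - threshold) * sleep_power
            delay += wake_latency  # the next access pays the wake-up cost
        else:
            energy += gap * idle_power
    return energy, delay


def best_threshold(idle_gaps, candidates, max_delay):
    """Pick the candidate with the lowest energy whose delay fits the budget."""
    feasible = []
    for t in candidates:
        energy, delay = evaluate(idle_gaps, t)
        if delay <= max_delay:
            feasible.append((energy, t))
    return min(feasible)[1] if feasible else None
```

A shorter timeout saves more energy but incurs more wake-ups, so the delay budget plays the role of the envelope within which energy savings are maximized.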

The authors in [31] propose an operating-system-scheduler-based power mode strategy for DRAM energy efficiency, which dynamically keeps track of the banks that are being utilized and switches banks on or off accordingly at each context switch detected by the scheduler.

In more detail, the technique of [47] transitions unused DRAM chips into low-power modes. It works by controlling the virtual-to-physical address mapping so that the physical pages of an application are clustered into a minimum number of DRAM chips, and the unused chips are transitioned to low-power modes. In addition, the technique monitors the time between accesses to a chip as a measure of how frequently the chip is reused. When this time exceeds a certain threshold, the chip is transitioned to the low-power mode.
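The two ingredients described above, clustering pages onto as few chips as possible and a threshold test on the time since the last access, can be sketched as follows. This is a simplified illustration rather than the implementation of [47]; the function names and the sequential packing order are assumptions.

```python
import math


def cluster_pages(num_pages, pages_per_chip, num_chips):
    """Map pages onto the fewest chips; return (page -> chip map, idle chips)."""
    needed = math.ceil(num_pages / pages_per_chip)
    if needed > num_chips:
        raise ValueError("not enough DRAM capacity")
    # Fill chip 0 first, then chip 1, and so on.
    mapping = {page: page // pages_per_chip for page in range(num_pages)}
    idle_chips = list(range(needed, num_chips))  # candidates for low-power mode
    return mapping, idle_chips


def should_power_down(cycles_since_last_access, threshold):
    """Threshold test on the time since a chip was last accessed."""
    return cycles_since_last_access > threshold
```

For instance, five pages with two pages per chip occupy three of four chips, leaving the fourth chip free to enter a low-power mode.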

Delaluz, Kandemir, et al. [28] reduce DRAM power and energy consumption by placing arrays with temporal affinity into the same set of banks at run time, which allows deeper sleep modes to be used for longer periods. A virtual-memory-management technique is proposed in [40] to reduce DRAM power consumption; it remaps pages to reduce the memory footprint of each application and turns off under-utilized DRAM modules. Zhou et al. [64] propose a memory allocation scheme based on component utilization, in which each application is allocated memory according to its utility. The unallocated memory is put into a low-power mode to save energy. The method in [41] saves memory energy by concentrating memory accesses on a few memory ranks so that the remaining ranks can be put into low-power modes. This strategy divides pages by utilization into hot and cold ranks, which lengthens the idle periods of the cold ranks.
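The hot and cold rank idea can be sketched by sorting pages by access count and filling ranks in that order, so that rarely accessed pages end up together in the last ranks. This is only an illustration under assumed inputs (a dictionary of per-page access counts and a fixed number of pages per rank), not the mechanism of [41].

```python
def partition_pages(access_counts, pages_per_rank):
    """Assign the most-accessed pages to the lowest-numbered (hot) ranks."""
    ordered = sorted(access_counts, key=access_counts.get, reverse=True)
    return {page: i // pages_per_rank for i, page in enumerate(ordered)}
```

With the hottest pages concentrated in rank 0, accesses to the higher-numbered ranks become rarer and their idle periods longer.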

Memory voltage and frequency scaling is used in [33], [35] to reduce DRAM power consumption: the frequency of devices, channels, etc. is lowered when memory utilization is relatively low, while performance degradation is kept to a minimum. The “rank-subsetting” strategy was proposed in Zheng et al. [65] to reduce memory power consumption by decreasing the amount of memory actively involved in serving each memory access. This is done by placing a buffer between the memory modules and the memory bus, which allows the DRAM ranks to be arranged into mini-ranks so that only one mini-rank is activated for a memory access while the rest are put into low-power modes.

The scheme in [46] increases the idle time of memory banks to reduce their power consumption: if requested data can be recomputed using the active banks, the banks that are in low-power mode are not activated. A DMA (direct memory access) based scheme for saving energy is discussed in [57]. Since DMA transfers tend to be larger than transfers done through the CPU, they are sliced into smaller transfers. As the gaps between these small transfers are too short to warrant switching the chips to low-power modes, the authors discuss aligning DMA requests coming from different I/O buses to the same memory device. Yoon et al. [66] discuss a strategy that exploits low-power mobile DRAM components to reduce memory power consumption, using a buffering mechanism that combines the data outputs from multiple ranks of low-frequency mobile DRAM devices to provide bandwidth equivalent to that of higher-power DRAM devices. A study in [67] explores the latency and reliability characteristics of modern DRAM when the supply voltage is modified. The authors determined that reducing the supply voltage below a certain threshold introduces errors in the data, and that these errors can be avoided by increasing the latency of major memory operations. Finally, they proposed a DRAM energy reduction strategy termed “Voltron”, which uses a performance model to determine how far the DRAM supply voltage can be reduced without introducing errors and without exceeding a performance loss constraint.
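The selection step described for Voltron can be illustrated with a toy sketch: for each candidate voltage, look up the extra latency assumed to be needed for error-free operation, predict the performance loss, and keep the lowest voltage that stays within the loss bound. The voltage table, the linear performance model, and all names below are hypothetical and are not taken from [67].

```python
# Hypothetical table: supply voltage (V) -> extra latency (ns) needed for
# error-free operation. These numbers are illustrative, not measured.
EXTRA_LATENCY_NS = {1.35: 0.0, 1.25: 1.0, 1.15: 3.0, 1.05: 8.0}


def predicted_loss(extra_latency_ns, memory_intensity):
    """Toy linear performance model (assumption): loss grows with latency."""
    return extra_latency_ns * memory_intensity


def select_voltage(memory_intensity, max_loss):
    """Return the lowest voltage whose predicted loss stays within max_loss."""
    ok = [v for v, lat in EXTRA_LATENCY_NS.items()
          if predicted_loss(lat, memory_intensity) <= max_loss]
    # The nominal voltage adds no latency, so it is always acceptable
    # for any non-negative max_loss.
    return min(ok)
```

In this sketch a memory-intensive workload tolerates less added latency and therefore stays closer to the nominal voltage.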

A strategy was proposed in [17] to reduce the delay in switching ranks to low-power modes by exploiting system I/O calls: since most I/O requests go through system calls, the OS can determine the completion time of these requests and use it to put idle memory ranks into low-power modes. The work in [68] proposes a main memory design with three different types of memory modules, each optimized for one parameter (power consumption, etc.) at the expense of the others, selected by profiling an application's last-level cache misses and memory-level parallelism.

The authors in [82] and Li et al. [83] discuss strategies for minimizing memory power consumption by exploiting spatial and temporal correlations in image data in video processing applications: image data is mapped in DRAM so that row activations are minimized, which in turn reduces memory power consumption. The authors in [84] propose a refresh method that reduces the refresh power consumption of memory devices by bypassing memory row refreshes using a counter for each row in the memory module. An analytical model is used for saving memory power in Lyuh et al. [51], which chooses a low-power mode for memory banks by scheduling memory access operations, among other means.
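One plausible reading of the per-row counter idea is sketched below: each row has a countdown counter that is restarted whenever the row is accessed, and a row is refreshed only when its counter expires. The class, its parameters, and the assumption that an access makes a pending refresh unnecessary are illustrative; the exact mechanism of [84] is not described here.

```python
class RefreshCounters:
    """Per-row countdown counters; an access implicitly restores a row."""

    def __init__(self, num_rows, max_count):
        self.max_count = max_count
        self.counters = [max_count] * num_rows
        self.refreshes = 0

    def access(self, row):
        # Reading or writing a row recharges its cells, so restart its counter.
        self.counters[row] = self.max_count

    def tick(self):
        """Advance one interval; refresh only rows whose counter has expired."""
        for row, count in enumerate(self.counters):
            if count == 0:
                self.refreshes += 1
                self.counters[row] = self.max_count
            else:
                self.counters[row] = count - 1
```

A row that is accessed in every interval never reaches zero and is never explicitly refreshed, which is where the refresh power saving comes from in this sketch.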

A strategy for tuning the garbage collector is used in [24] to reduce DRAM power by switching off banks that do not store live data. An application-level strategy is proposed in S. Liu et al. [50] that reduces refresh power in memory by identifying critical and non-critical data in the application, allocating them to different memory modules, and then using lower refresh rates for non-critical data and higher refresh rates for critical data.

A strategy is proposed in [85] to reduce DRAM temperature through thread scheduling and page allocation: threads are grouped and mapped to certain DIMMs so that at any time only a single group and its associated DIMM remain active, while the rest are deactivated to reduce their temperature. A buffering-based technique is studied in Trajkovic et al. [86] to reduce DRAM power consumption by examining the activate-precharge operation and avoiding the cost of activation. Their strategy works by prefetching cache blocks and by combining multiple blocks for write accesses to the same DRAM row.

The work in [37] proposes a strategy to keep the peak power consumption of DRAM within a given power budget, using basic algorithms such as knapsack to determine the schedule that decides when the memory should be put into low-power modes. A granularity-modification strategy to reduce memory power is proposed in Yoon et al. [87]; it manages virtual memory to specify the access granularity for each page based on the spatial locality of an application.
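To show how a knapsack formulation can relate to a power budget, the sketch below chooses which ranks to keep active so that their total power stays within the budget while a benefit score is maximized; the remaining ranks would be put into low-power modes. The formulation, the integer power costs, and the benefit scores are assumptions for illustration, since the formulation used in [37] is not given here.

```python
def choose_active(ranks, power_budget):
    """0/1 knapsack over ranks.

    ranks is a list of (name, power_cost, benefit) tuples with integer power
    costs. Returns the names of the ranks to keep active.
    """
    # best[b] = (total benefit, chosen names) achievable with budget b.
    best = [(0, [])] * (power_budget + 1)
    for name, cost, benefit in ranks:
        # Iterate budgets downwards so each rank is used at most once.
        for b in range(power_budget, cost - 1, -1):
            cand = best[b - cost][0] + benefit
            if cand > best[b][0]:
                best[b] = (cand, best[b - cost][1] + [name])
    return best[power_budget][1]
```

With ranks costing 3, 4, and 2 units and a budget of 5, the sketch keeps the first and third ranks active, since together they give the highest benefit that fits.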

David et al. [6] propose a memory frequency scaling strategy that uses memory bandwidth usage metrics so that frequency scaling is applied whenever memory bandwidth usage falls below a certain threshold. In addition, the work provides detailed power and performance models that describe the effect of memory frequency scaling on performance and power consumption.
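A threshold rule of this kind can be sketched as follows: pick the lowest memory frequency whose peak bandwidth, reduced by a headroom factor, still covers the measured bandwidth. The frequency table, the headroom factor, and the function name are hypothetical and do not reproduce the models of [6].

```python
# Hypothetical memory frequencies (MHz) and the peak bandwidth (GB/s) each supports.
FREQ_TO_PEAK_BW = {800: 6.4, 1066: 8.5, 1333: 10.6}


def select_frequency(measured_bw_gbps, headroom=0.8):
    """Pick the lowest frequency whose peak bandwidth, scaled by a headroom
    factor, still covers the measured bandwidth; otherwise use the highest."""
    for freq in sorted(FREQ_TO_PEAK_BW):
        if measured_bw_gbps <= headroom * FREQ_TO_PEAK_BW[freq]:
            return freq
    return max(FREQ_TO_PEAK_BW)
```

The headroom factor is a design choice in this sketch: a smaller value keeps more spare bandwidth and so scales down less aggressively.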

A coordinated mechanism for processor and memory frequency scaling is proposed in [88], with a detailed performance model that relies on multiple performance counters to gauge the combined effect of memory and processor frequency scaling on overall application performance. The authors present their results on a simulation framework using a timeslice-based approach. A relatively modest approach to jointly applying processor and memory frequency scaling is proposed in [27], where the authors use a basic timeslice-based strategy coupled with power and performance models that are corrected using feedback on performance misprediction. They show their results on a real system by emulating memory frequency modification from the BIOS.

IV. CONCLUSIONS

The desire to extract maximum performance from modern computing systems has greatly increased their power and energy consumption. To mitigate this issue, several strategies have been proposed to reduce the power consumption of memory in various ways. In this work, we have reviewed many power-saving strategies for DRAM that make use of frequency scaling, memory mapping, and other mechanisms. We hope that this survey will inform researchers, processor architects, and software engineers about the prominent strategies employed to reduce memory power consumption.

V. REFERENCES


[33]. Q. Deng et al. MultiScale: Memory System DVFS with Multiple Memory Controllers. ISLPED 2012.


[74]. V. Pallipadi and A. Starikovskiy. The Ondemand Governor: Past, Present and Future. 2:223-238, 01 2006.

[75]. R. Efraim, R. Ginosar, C. Weiser, and A. Mendelson. Energy Aware Race to Halt: A Down to Earth Approach for Platform Energy


