For multimedia applications, loop buffering is an efficient mechanism for reducing the power consumed in the instruction memory of embedded processors. In particular, software-controlled clustered loop buffers are very energy efficient. However, the code transformations that VLIW compilers need to reach a higher ILP can have a large negative influence on the energy consumed in the instruction memories (including the loop buffer). This paper shows that such code transformations can also have a positive impact on the instruction memory energy of processors if they are steered with the presence of the software-controlled clustered loop buffer taken into account. We propose guidelines for steering the code transformations and show that these guidelines should be applied differently in a system with a clustered loop buffer than in a system with only a normal instruction cache. Results are presented for a mix of three important ILP code transformations: software pipelining, if-conversion, and loop unrolling. They show an energy reduction between 15% and 25% and a delay reduction between 60% and 75% for an MPEG-2 video encoder benchmark.
|Title of host publication||11th Workshop on Interaction between Compilers and Computer Architecture 2007 : INTERACT 11 ; February 11, 2007, Phoenix, Arizona, USA|
|Place of Publication||Red Hook|
|Publication status||Published - 2007|