WG – Mathieu Stoffel: Improving power-efficiency through fine-grain monitoring in HPC clusters

2019-01-22

Title: Improving power-efficiency through fine-grain monitoring in HPC clusters

Speaker: Mathieu Stoffel (LIG, CORSE team)

Location: LIP, meeting room M7 3rd floor

Schedule: 14:30

Abstract:

Power and energy consumption are now of paramount importance in HPC. Moreover, reaching the Exascale target will not be possible in the short term without major breakthroughs in software and hardware technologies to meet power consumption constraints.
In this context, this paper discusses the design and implementation of a system-wide tool to monitor, analyze and control power and energy consumption in HPC clusters.
We developed a lightweight tool that relies on fine-grain sampling of two CPU performance metrics: instruction throughput (instructions per cycle, IPC) and last-level cache bandwidth.
Using the information these metrics provide about the activity of hardware resources, and using dynamic voltage and frequency scaling (DVFS) to control the power/performance trade-off, we show that it is possible to achieve up to 16% energy savings at the cost of less than 3% performance degradation on real HPC applications.
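
To make the idea concrete, here is a minimal sketch of how the two metrics could drive a DVFS decision. It is not the speaker's tool: the Sample fields, the thresholds, the two-level policy and all function names are assumptions chosen for illustration, and reading the hardware counters is left out. The last function only works through the arithmetic implied by the abstract's figures: since energy is average power times runtime, 16% less energy with a 3% longer runtime corresponds to roughly 18% lower average power.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One sampling interval of raw counter deltas (hypothetical layout)."""
    instructions: int   # instructions retired during the interval
    cycles: int         # core cycles elapsed during the interval
    llc_bytes: int      # bytes transferred through the last-level cache
    interval_s: float   # length of the interval in seconds


def ipc(sample: Sample) -> float:
    """Instructions per cycle; 0.0 if no cycles were counted."""
    return sample.instructions / sample.cycles if sample.cycles else 0.0


def llc_bandwidth_gbs(sample: Sample) -> float:
    """Last-level cache bandwidth in GB/s over the interval."""
    return sample.llc_bytes / sample.interval_s / 1e9


def choose_frequency(sample: Sample, freqs_khz: list[int],
                     ipc_threshold: float = 1.0,
                     bw_threshold_gbs: float = 10.0) -> int:
    """Pick a CPU frequency for the next interval.

    Assumed policy: low IPC together with high LLC bandwidth suggests a
    memory-bound phase, where lowering the core frequency should cost
    little performance. Otherwise keep the highest frequency.
    The thresholds are placeholders, not measured values.
    """
    memory_bound = (ipc(sample) < ipc_threshold
                    and llc_bandwidth_gbs(sample) > bw_threshold_gbs)
    return min(freqs_khz) if memory_bound else max(freqs_khz)


def relative_average_power(energy_saving: float, slowdown: float) -> float:
    """Average power relative to baseline, from E = P * t.

    E'/E = 1 - energy_saving and t'/t = 1 + slowdown,
    so P'/P = (1 - energy_saving) / (1 + slowdown).
    """
    return (1.0 - energy_saving) / (1.0 + slowdown)


if __name__ == "__main__":
    freqs = [1_200_000, 2_400_000]  # hypothetical available frequencies, in kHz
    stalled = Sample(1_000_000_000, 4_000_000_000, 20_000_000_000, 1.0)
    busy = Sample(8_000_000_000, 4_000_000_000, 1_000_000_000, 1.0)
    print(choose_frequency(stalled, freqs))
    print(choose_frequency(busy, freqs))
    print(round(relative_average_power(0.16, 0.03), 3))
```

On Linux, the chosen frequency would typically be applied through the cpufreq interface; that step is omitted here because it depends on the governor and driver in use.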