WG Philippe Virouleau: Improving OpenMP compilers and runtimes for task-based applications on NUMA architectures

2016-05-17 – Philippe Virouleau

Title: Improving OpenMP compilers and runtimes for task-based applications on NUMA architectures

Speaker: Philippe Virouleau

Abstract: The most popular architecture for building large-scale shared-memory machines nowadays is the NUMA architecture (Non-Uniform Memory Access). In such an architecture, the shared memory and cores are split into nodes that are physically separated from each other. The memory access time depends on which core accesses which data, and on the distance between the core and the data’s NUMA node. A popular application design for efficiently exploiting the parallelism offered by large multiprocessor architectures is to use fine-grain dependent tasks. To use this approach successfully on NUMA architectures, the application’s programmer must pay close attention to locality between the task being executed and the data manipulated by that task. OpenMP is the de facto standard for shared-memory parallel programming, and revision 4.0 introduced the tasks-with-dependencies model, in which the programmer can specify which data are read and/or written by a given task. Having the runtime use this information is a first step toward dynamically improving the application’s performance; however, more flexibility could be given to the programmer, e.g. by letting them specify which data are important for a given task. This presentation will describe my PhD work, which focuses on proposing and evaluating compiler and runtime extensions that help reduce the impact of NUMA architectures on the application’s performance and scalability.
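
For readers unfamiliar with the tasks-with-dependencies model mentioned in the abstract, the following is a minimal sketch of the standard OpenMP 4.0 depend clause. It is not taken from the talk: the function name chained_tasks and the variables x, y and result are hypothetical, chosen only for illustration. Each task declares which data it reads (in) and writes (out), and the runtime orders the tasks accordingly.

```c
/* Hypothetical illustration of OpenMP 4.0 task dependencies.
 * Compile with an OpenMP flag (e.g. -fopenmp for GCC or Clang).
 * Without it, the pragmas are ignored and the code runs sequentially
 * with the same result. */
int chained_tasks(int seed)
{
    int x = 0, y = 0, result = 0;

    #pragma omp parallel
    #pragma omp single
    {
        /* Producer task: declares that it writes x. */
        #pragma omp task depend(out: x) shared(x) firstprivate(seed)
        x = seed + 1;

        /* Reads x and writes y: the runtime starts it only after
         * the producer task has completed. */
        #pragma omp task depend(in: x) depend(out: y) shared(x, y)
        y = x * 2;

        /* Consumer task: reads y, so it runs after the previous task. */
        #pragma omp task depend(in: y) shared(y, result)
        result = y;

        /* Wait for all three child tasks before leaving the region. */
        #pragma omp taskwait
    }
    return result;
}
```

Calling chained_tasks(5) runs the three tasks in dependency order regardless of which threads execute them. The depend clauses tell the runtime which data each task touches, which is the kind of information the abstract suggests a NUMA-aware runtime could also use for placement decisions.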

PDF: WG_160517_avalon_2015

WG Issam Raïs: Towards Green Exascale Computing Challenges

2016-05-03 – Issam Raïs

Title: Towards Green Exascale Computing Challenges

Speaker: Issam Raïs

Abstract: Exascale is coming: massively heterogeneous machines with hundreds of thousands of computing nodes, each with hundreds of cores, connected to one another by a dedicated and efficient network. On every component of such a machine, there are many techniques for reducing energy consumption while maintaining good computing power. In this context, this presentation introduces the research problems being tackled in the current thesis.

PDF: presentation