2016-05-17 – Philippe Virouleau
Title: Improving OpenMP compilers and runtimes for task-based applications on NUMA architectures
Speaker: Philippe Virouleau
Abstract: The most popular architecture for building large-scale shared-memory machines nowadays is NUMA (Non-Uniform Memory Access). In such an architecture, the shared memory and the cores are split into nodes that are physically separated from each other. Memory access time depends on which core accesses which data, and on the distance between that core and the NUMA node holding the data. A popular application design for efficiently exploiting the parallelism offered by large multiprocessor architectures is to use fine-grained dependent tasks. To use this approach successfully on NUMA architectures, the application programmer must pay close attention to locality between the task being executed and the data it manipulates. OpenMP is the de facto standard for shared-memory parallel programming, and revision 4.0 introduced the tasks-with-dependencies model, in which the programmer can specify which data are read and/or written by a given task. Having the runtime use this information is a first step toward dynamically improving application performance; however, more flexibility could be given to the programmer, e.g. by making it possible to specify which data are important for a given task. This presentation will describe my PhD work, which focuses on proposing and evaluating compiler and runtime extensions that help reduce the impact of NUMA architectures on application performance and scalability.