WG – Victor Allombert: Programming Multi-BSP Algorithms in ML

2018-01-15

Title: Programming Multi-BSP Algorithms in ML

Speaker: Victor Allombert

Abstract: From personal computers with an increasing number of cores to supercomputers with millions of computing units, parallel architectures are the current standard. High-performance architectures are usually referred to as hierarchical, as they are composed of clusters of multi-processors of multi-cores. Programming such architectures is notoriously difficult: writing parallel programs is, most of the time, hard in both the algorithmic and the implementation phases. To address these concerns, many structured models and languages have been proposed in order to increase both expressiveness and efficiency. Among them, Multi-BSP is a bridging model dedicated to hierarchical architectures that ensures efficiency, execution safety, scalability and cost prediction. It is an extension of the well-known BSP model, which handles flat architectures. We introduce the Multi-ML language, which allows programming Multi-BSP algorithms “à la ML” and thus guarantees the properties of the Multi-BSP model as well as execution safety, thanks to an ML type system. To deal with the multi-level execution model of Multi-ML, we defined a formal semantics that describes the valid evaluation of an expression. To ensure the execution safety of Multi-ML programs, we also propose a type system that preserves replicated coherence. An abstract machine is defined to formally describe the evaluation of a Multi-ML program on a Multi-BSP architecture. An implementation of the language is available as a compilation toolchain. It is thus possible to generate efficient parallel code from a program written in Multi-ML and execute it on any hierarchical machine.

Inria Project Lab Discovery

Distributed and COoperative management of Virtual Environments autonomousLY

The DISCOVERY initiative aims at exploring a new way of operating Utility Computing (UC) resources.

To accommodate the ever-increasing demand for Utility Computing (UC) resources while taking into account both energy and economic issues, the current trend is to build larger and larger data centers in a few strategic locations. Although this approach enables UC providers to cope with current demand while continuing to operate UC resources through a centralized software system, it is far from delivering sustainable and efficient UC infrastructures. We claim that a disruptive change in UC infrastructures is required: UC resources should be managed differently, with locality as a primary concern. To this end, we propose to leverage any facilities available through the Internet in order to deliver widely distributed UC platforms that can better match the geographical dispersal of users as well as the unending demand. Critical to the emergence of such locality-based UC (LUC) platforms is the availability of appropriate operating mechanisms. We advocate the implementation of a unified system that drives the use of resources at an unprecedented scale by turning a complex and diverse infrastructure into a collection of abstracted computing facilities that is both easy to operate and reliable.

Start Date: January 2015

Duration: 4 years

Avalon Members: J. Darrous, G. Fedak, C. Perez

More information on the Discovery website.

Inria Project Lab C2S@Exa

Computer and Computational Sciences at Exascale INRIA Large Scale Initiative

The C2S@Exa INRIA large-scale initiative is concerned with the development of numerical modeling methodologies that fully exploit the processing capabilities of modern massively parallel architectures in the context of a number of selected applications related to important scientific and technological challenges for the quality and the security of life in our society. Avalon is a core-team member, co-leading Pole 4 on Programming models.

Start Date: 2013

Duration: 4 years

Avalon Members: T. Gautier, C. Perez, J. Richard

More information on the C2S@Exa website.

PIA ELCI

ELCI is a French software project that brings together academic and industrial partners to design and provide a software environment for the next generation of HPC systems. The project is funded by the participating partners and by the French FSN (“Fonds pour la Société Numérique”).

The principal objective for the project is to facilitate the development of a software environment that meets the demands of the new generation of HPC architectures. This will cover the whole software stack (system and programming environments), numerical solvers and pre/post/co processing software.

The project follows a co-design approach that covers the software environment for computer architectures and the requirements of more demanding applications, and that is adapted to future hardware architectures (multicore/many-core processors, high-speed networks and data storage).

These developments will be validated according to their capacity to deal with the new exascale challenges: larger scalability, higher resiliency, greater security and improved modularity, with better abstraction and interactivity for application cases.

Start Date: September 2014

Duration: 3 years

Avalon Members: T. Gautier, L. Lefevre, C. Perez, I. Rais, J. Richard

More information on the ELCI web site.

LEXISTEMS

LEXISTEMS develops Xact.ai, a solution that provides universal access to knowledge in natural language, with no limit on the data or on how the data is structured.

For organizations, Xact.ai is the most effective way to monetize data assets, whatever the nature and volume of their knowledge bases.

LEXISTEMS’ solutions streamline the use and analysis of natural language in business and personal applications.
A new era is opening. Users are empowered, and organizations leverage the true value of their data assets.

LEXISTEMS and Avalon collaborate on the design and development of NLP algorithms and high-level data structuring.

Start Date: September 2016

Duration:

Avalon Members: Marcos Assuncao, Eddy Caron and Thomas Pellisier-Tanon

More information on the LEXISTEMS website.

Inria Project Lab HAC-SPECIS

HAC SPECIS: Inria project lab on High-performance Application and Computers: Studying PErformance and Correctness In Simulation (2016-2020)

The goal of the HAC SPECIS (High-performance Application and Computers: Studying PErformance and Correctness In Simulation) project is to answer the methodological needs of HPC application and runtime developers and to make it possible to study real HPC systems from both the correctness and the performance points of view. To this end, we gather experts from the HPC, formal verification and performance evaluation communities.

Website: http://hacspecis.gforge.inria.fr/

Start Date: June 2016

Duration: 4 years

Avalon Members: F. Suter, L. Lefevre

CoE H2020 POP

Summary

Inaugurated on October 1, 2015, the new EU H2020 Center of Excellence (CoE) for Performance Optimisation and Productivity (POP) provides performance optimisation and productivity services for academic and industrial codes. Europe’s leading experts from the High Performance Computing field will help application developers get a precise understanding of application and system behaviour. This project is supported by the European Commission under H2020 Grant Agreement No. 676553.

Established codes, and especially codes that have never undergone any analysis or performance tuning, may profit from the expertise of the POP services, which use state-of-the-art tools to detect and locate bottlenecks in applications, suggest possible code improvements, and may even help with proof-of-concept experiments and mock-up tests for customer codes on their own platforms.

Partners

Barcelona Supercomputing Centre (BSC), High Performance Computing Center Stuttgart of the University of Stuttgart (HLRS), Jülich Supercomputing Centre (JSC), Numerical Algorithms Group (NAG), Rheinisch-Westfälische Technische Hochschule Aachen (RWTH), TERATEC (TERATEC).

Project Information

Start Date: October 2015

Duration: 3 years

Avalon Members:

Online Resources

More information on http://www.pop-coe.eu

Labex MILYON

Laboratory of Excellence in mathematics and fundamental computer science.

MILYON brings together the mathematics and computer science communities of Lyon around three axes: excellence in research, in particular in fields at the interface of the two disciplines or with other sciences; education, through support for innovative research-oriented curricula; and society, through the dissemination of scientific culture to the general public and technology transfer to industry.

It gathers more than 350 researchers and three joint research units of the Université de Lyon: the Institut Camille Jordan, the Laboratoire de l’Informatique du Parallélisme and the Unité de Mathématiques Pures et Appliquées.

More information on the MILYON website.

Start Date:

Duration: Until 2024

Avalon Members:

Labex PRIMES

Laboratory of Excellence on Physics, Radiobiology, Medical Imaging, and Simulation

The Laboratory of Excellence (LabEx) program aims to provide a set of research units with significant means in order to attract world-renowned researchers and to establish a high-level, integrated policy of research, training and technology transfer. The ambition of this program is to develop scientific originality, to foster multidisciplinarity, to increase the excellence and international visibility of French research, and to play a driving role in training at both the doctoral and master's levels.

PRIMES’s (Physics, Radiobiology, Medical Imaging, and Simulation) primary objective is to develop new concepts and methods for the exploration, the diagnosis and the therapy of cancer and ageing-related pathologies. PRIMES brings together the complementary skills of 16 recognized academic and medical partners with a long-standing experience to develop state-of-the-art methods, covering all necessary fields, from basic physics, instrumentation, radiobiology, data acquisition and processing, to image reconstruction, simulations and modeling supported by supercomputing.

Duration: 2012-2019

More information on the PRIMES website.

ANR MOEBUS

Multi-objective scheduling for large scale parallel systems.

The MOEBUS project focuses on the efficient execution of parallel applications submitted by various users and sharing resources in large-scale high-performance computing environments.

We propose to investigate new functionalities that can be added at low cost to current large-scale schedulers and programming standards, for a better use of the resources according to various objectives and criteria. We also propose to revisit the principles of existing schedulers after studying the main factors impacted by job submissions. We will then propose novel, efficient algorithms for optimizing the schedule for unconventional objectives such as energy consumption, and design provable approximation algorithms for multi-objective optimization over relevant combinations of objectives (performance, fairness, energy consumption, etc.). An important characteristic of the project is its balance between theoretical analysis and practical implementation. The most promising ideas will lead to integration in reference systems such as SLURM and OAR, as well as new features in implementations of programming standards such as MPI or OpenMP. We expect the MOEBUS results to influence the future use of very large-scale parallel platforms.

Start Date:

Duration:

Avalon Members:

More information on the MOEBUS website.