PIA ELCI

ELCI is a French software project that brings together academic and industrial partners to design and provide a software environment for the next generation of HPC systems. The project is funded by the participating partners and by the French FSN “Fonds pour la Société Numérique”.

The principal objective of the project is to facilitate the development of a software environment that meets the demands of the new generation of HPC architectures. This covers the whole software stack (system and programming environments), numerical solvers, and pre-/post-/co-processing software.

A co-design approach is employed that covers both the software environment for these computer architectures and the requirements of the most demanding applications, and that is adapted to future hardware architectures (multicore/many-core processors, high-speed networks and data storage).

These developments will be validated according to their capacity to address the new exascale challenges: greater scalability, higher resiliency, stronger security, improved modularity, and better abstraction and interactivity for application cases.

Start Date: September 2014

Duration: 3 years

Avalon Members: T. Gautier, L. Lefevre, C. Perez, I. Rais, J. Richard

More information on the ELCI web site.

ANR MOEBUS

Multi-objective scheduling for large scale parallel systems.

The MOEBUS project focuses on the efficient execution of parallel applications submitted by various users and sharing resources in large-scale high-performance computing environments.

We propose to investigate new functionalities that can be added at low cost to existing large-scale schedulers and programming standards, for better use of the resources according to various objectives and criteria. We also propose to revisit the principles of existing schedulers after studying the main factors impacted by job submissions. We will then propose novel, efficient algorithms for optimizing schedules for unconventional objectives such as energy consumption, and design provable approximation algorithms for multi-objective optimization over relevant combinations of objectives (performance, fairness, energy consumption, etc.). An important characteristic of the project is the balance it strikes between theoretical analysis and practical implementation. The most promising ideas will be integrated into reference systems such as SLURM and OAR, as well as into new features of programming standard implementations such as MPI or OpenMP. We expect MOEBUS results to impact further use of very large-scale parallel platforms.
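
To make the multi-objective idea concrete, the short Python sketch below filters a set of hypothetical candidate schedules down to their Pareto front over two objectives, makespan and energy consumption. The numbers and function names are illustrative assumptions only; this is not MOEBUS code, nor a SLURM/OAR integration.

```python
# Illustrative sketch only: a tiny Pareto-front filter over candidate
# schedules evaluated on two objectives (makespan, energy consumption).
# The schedule data and objective values are made up for illustration.

def dominates(a, b):
    """Return True if candidate a is at least as good as b on every
    objective and strictly better on at least one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other is not c)]

# Hypothetical (makespan in seconds, energy in joules) for four schedules.
schedules = [(120, 5_000), (150, 3_800), (110, 6_500), (160, 6_000)]
print(pareto_front(schedules))   # (160, 6000) is dominated and dropped
```

A multi-objective scheduler would then pick among the remaining trade-offs according to a policy (e.g. fairness constraints or an energy budget), rather than optimizing a single criterion.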

Start Date:

Duration:

Avalon Members:

More information on the MOEBUS website.

ANR SONGS

The last decade has brought tremendous changes to the characteristics of large-scale distributed computing platforms. Large grids processing terabytes of information a day and peer-to-peer technology have become common, even though understanding how to use such platforms efficiently still raises many challenges. As demonstrated by the USS SimGrid project funded by the ANR in 2008, simulation has proved to be a very effective approach for studying such platforms. Although even more challenging, we think the issues raised by petaflop/exaflop computers and emerging cloud infrastructures can be addressed with a similar simulation methodology.

The goal of the SONGS project is to extend the applicability of the SimGrid simulation framework from Grids and Peer-to-Peer systems to Clouds and High Performance Computation systems. Each type of large-scale computing system will be addressed through a set of use cases and led by researchers recognized as experts in this area.

Any sound study of such systems through simulation relies on the following pillars of simulation methodology: an efficient simulation kernel; sound and validated models; simulation analysis tools; and simulation campaign management.
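
As a rough illustration of what the "simulation kernel" and "models" pillars mean in practice, here is a minimal Python sketch of a discrete-event kernel with a crude latency/bandwidth network model. It is a didactic toy under assumed parameter values, not SimGrid code; SimGrid's actual kernel and models are far more elaborate.

```python
# Toy discrete-event simulation kernel: events are (time, callback) pairs
# kept in a min-heap and executed in timestamp order.
import heapq

class Kernel:
    def __init__(self):
        self.now = 0.0
        self._queue = []      # (time, seq, callback) min-heap
        self._seq = 0         # tie-breaker for events at the same time

    def schedule(self, delay, callback):
        heapq.heappush(self._queue, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run(self):
        while self._queue:
            self.now, _, callback = heapq.heappop(self._queue)
            callback(self)

def send(kernel, size_bytes, bandwidth=1e9, latency=1e-4):
    # Crude network model (assumed values): transfer time = latency + size / bandwidth.
    kernel.schedule(latency + size_bytes / bandwidth,
                    lambda k: print(f"t={k.now:.6f}s: transfer of {size_bytes} B done"))

k = Kernel()
send(k, 10_000_000)   # a hypothetical 10 MB message
k.run()
```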

For more information, please visit the project website.

ANR MapReduce

This project is devoted to using the MapReduce programming paradigm on clouds and hybrid infrastructures. Partners: Argonne National Laboratory (USA), the University of Illinois at Urbana-Champaign (USA), the UIUC-INRIA Joint Lab on Petascale Computing, IBM France, IBCP, MEDIT (SME), and the GRAAL/AVALON INRIA project-team.

This project aims to overcome the limitations of current Map-Reduce frameworks such as Hadoop, thereby enabling highly scalable Map-Reduce-based data processing on various physical platforms such as clouds, desktop grids, or hybrid infrastructures built by combining these two types of infrastructure. To meet this global goal, several critical aspects will be investigated.

Data storage and sharing architecture. First, we will explore advanced techniques for scalable, high-throughput, concurrency-optimized data and metadata management, based on recent preliminary contributions of the partners.

Scheduling. Second, we will investigate various scheduling issues related to large executions of Map-Reduce instances. In particular, we will study how the scheduler of the Hadoop implementation of Map-Reduce can scale over heterogeneous platforms; other issues include dynamic data replication and fair scheduling of multiple parallel jobs.

Fault tolerance and security. Finally, we intend to explore techniques to improve the execution of Map-Reduce applications on large-scale infrastructures with respect to fault tolerance and security.

Our global goal is to explore how combining these techniques can improve the behavior of Map-Reduce-based applications on the target large-scale infrastructures. To this purpose, we will rely on recent preliminary contributions of the partners associated in this project, illustrated through the following main building blocks.

BlobSeer, a new approach to distributed data management designed by the KerData team at INRIA Rennes – Bretagne Atlantique to enable scalable, efficient, fine-grain access to massive, distributed data under heavy concurrency.

BitDew, a data-sharing platform currently designed by the GRAAL team at INRIA Grenoble – Rhône-Alpes (ENS Lyon), with the goal of exploring the specificities of desktop grid infrastructures.

Nimbus, a reference open-source cloud management toolkit developed at the University of Chicago and Argonne National Laboratory (USA) with the goal of facilitating the operation of clusters as Infrastructure-as-a-Service (IaaS) clouds.
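
For readers unfamiliar with the paradigm itself, the following Python sketch shows the bare map/shuffle/reduce structure on a word-count example. It is a framework-independent illustration only, unrelated to the partners' actual Hadoop, BlobSeer, BitDew or Nimbus implementations.

```python
# Minimal sketch of the Map-Reduce programming model (word counting),
# showing the three phases a framework distributes and fault-tolerates.
from collections import defaultdict

def map_phase(documents):
    """Map: emit (word, 1) pairs from each input document."""
    for doc in documents:
        for word in doc.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Shuffle: group intermediate values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: aggregate the values of each key."""
    return {key: sum(values) for key, values in groups.items()}

docs = ["the quick brown fox", "the lazy dog", "the fox"]
print(reduce_phase(shuffle(map_phase(docs))))
# {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1}
```

In a real deployment the map and reduce tasks run on many nodes, which is exactly where the storage, scheduling, fault-tolerance and security questions above arise.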

More information on the MapReduce web site.

ANR COOP

Multi-level Cooperative Resource Management

The problem addressed by the COOP project (Dec. 2009 – May 2013) was to reconcile two layers, Programming Model Frameworks (PMFs) and Resource Management Systems (RMSs), with respect to a number of tasks that they both try to handle independently. A PMF needs knowledge of the resources to select the most efficient transformation of abstract programming concepts into executable ones. However, the actual management of resources is done by the RMS in an opaque way, based on a simple abstraction of applications.

More details are available on the ANR COOP website.

ANR SPADES

SPADES will propose solutions for the management of distributed schedulers in Desktop Computing environments, coping with a co-scheduling framework.

Today’s emergence of petascale architectures and the evolution of both research grids and computational grids greatly increase the number of potentially available resources. However, existing infrastructures and access rules do not make it possible to take full advantage of these resources.

One key idea of the SPADES project is to propose a non-intrusive but highly dynamic environment able to take advantage of available resources without disturbing their native use. In other words, the SPADES vision is to adapt the desktop grid paradigm by replacing users at the edge of the Internet with volatile resources. These volatile resources are in fact accessed via batch schedulers through reservation mechanisms that are limited in time or subject to preemption (best-effort mode).

One of the priorities of SPADES is to support platforms at a very large scale; petascale environments are therefore of particular interest. Nevertheless, these next-generation architectures still suffer from a lack of expertise regarding their accurate and relevant use.

One of the SPADES goals is to show how to take advantage of the power of such architectures. Another challenge of SPADES is to provide a software solution for a service discovery system able to cope with a highly dynamic platform. This system will be deployed over volatile nodes and must therefore tolerate failures. Implementing such an experimental system requires an interface to batch submission systems that can make reservations transparently for users, and that can also communicate with these batch systems to obtain the information required by our schedulers.

More information on the SPADES website.

ANR USS SimGrid

The USS-SimGrid project aims at Ultra Scalable Simulations with SimGrid. This tool is a leader in the simulation of HPC settings, and the main goal of this project is to allow its use in the simulation of desktop grids and peer-to-peer settings.

Computer science differs from other experimental sciences, such as biology or physics, in the way experimental results are presented in articles. In those other disciplines, articles always begin with a detailed presentation of the methods employed to produce the results, which often rely on previously described and acknowledged procedures. In computer science, and more particularly in the field of application simulation, only a short description of a (sometimes unavailable) ad-hoc simulation framework is provided. This prevents reproducibility of published results and thus objective comparisons between new research results and the state of the art. To reduce this gap between computer science and other experimental sciences, there is a need for powerful, validated, available and well-advertised tools and methods.

The general goal of this project is to provide such an application simulation framework, one that meets the needs of both the High Performance Computing and the Large Scale Distributed Computing communities. SimGrid is recognized in the HPC community as one of the most prominent simulation environments, as shown by its large community of users and the number of publications that use it. This project will make it possible to extend SimGrid to target the Large Scale Distributed Computing community, increase simulation realism, and provide useful tools for test campaign management.

More information on the USS SimGrid website.