WG Avalon public, March 31, 2025 (Quentin Guilloteau)

Quentin Guilloteau will give the following talk.

Title: Presentation and demonstration of Nix/NixOS/NixOS-Compose

Abstract:
In this talk, we will motivate the use of functional package
managers, such as Nix [1] or Guix [2], for building reproducible
software environments. We will then explain the key concepts behind
these tools, and illustrate their use with a demonstration.
(use case: data analysis on one's own machine)

Since Nix only covers the software stack above the kernel, we will
also present NixOS [3], the Nix-based Linux distribution, which
makes it possible to produce reproducible system images.
(use case: experiment on G5K with one node)

Finally, we will present NixOS-Compose [4], a tool built on top of
NixOS that aims to ease the development cycles of distributed
software environments while preserving reproducibility guarantees.
(use case: experiment on G5K with several nodes and "roles")
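The core idea behind these functional package managers can be sketched in a few lines: every package lives under a store path derived from a hash of all of its build inputs, so identical inputs always yield the identical environment. The toy Python sketch below (a hypothetical `store_path` helper; real Nix hashes full derivations, not a name/inputs tuple) illustrates the principle:

```python
import hashlib

def store_path(name, version, inputs):
    """Derive a content-addressed store path from a package's
    complete build inputs, in the spirit of Nix's /nix/store.
    Toy sketch only: not Nix's actual hashing scheme."""
    material = f"{name}-{version}:" + ",".join(sorted(inputs))
    digest = hashlib.sha256(material.encode()).hexdigest()[:32]
    return f"/nix/store/{digest}-{name}-{version}"

# Same inputs -> same path (reproducible); any change -> a new path.
a = store_path("myapp", "1.0", ["glibc-2.38", "python-3.11"])
b = store_path("myapp", "1.0", ["glibc-2.38", "python-3.11"])
c = store_path("myapp", "1.0", ["glibc-2.39", "python-3.11"])
print(a == b, a == c)  # True False
```

Because the path changes whenever any input changes, two machines that evaluate the same inputs end up referencing bit-identical environments, which is what makes them reproducible.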

[1] Nix: https://nixos.org/ (manual: https://nix.dev/manual/nix/2.24/)
[2] Guix: https://guix.gnu.org
[3] NixOS: https://nixos.org/manual/nixos/stable/
[4] NixOS-Compose: https://gitlab.inria.fr/nixos-compose/nixos-compose

ODISSEE: Online Data Intensive Solutions for Science in the Exabytes Era

Research sectors such as high-energy physics and astronomy offer critical insights that have the potential to revolutionise research and development, leading to breakthroughs in theoretical fields. However, these areas often require intensive research, expensive tools, and substantial long-term funding, without which their results may be limited. The EU-funded ODISSEE project will bring together the efforts of three pan-European ESFRI infrastructures in the physical sciences, alongside several other key sectors, to enhance European excellence and leadership in astronomy and high-energy physics. The project will provide access to essential high-quality hardware, software, R&D programmes and state-of-the-art experimental facilities to support key scientific projects and researchers, particularly those focused on the search for dark matter.

Project Information

  • URL: CORDIS web site
  • Starting date: 2025, January 1st
  • End date: 2027, December 31st

WG Avalon public, November 25, 2024 (Pierre Jacquet)

Pierre Jacquet (https://jacquetpi.github.io/) will give the
following talk.

Title: On the Sharing of Cloud Computing Resources

Abstract:
Cloud Data Centers are central to the popular Cloud Computing
paradigm. While consolidating diverse workloads within large sites has
improved the overall energy efficiency of these facilities, one
significant challenge remains: low server utilization. This
underutilization can be considered a form of hardware waste, as many
servers are required relative to the actual demand. In this
presentation, we will explore how physical resources can be shared
among clients in an “oversubscribed” model, examining various
scheduling techniques, architectural strategies, and paradigms to
optimize resource usage.
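As a rough illustration of the oversubscription idea, the sketch below packs VM vCPU demands onto servers whose capacity is inflated by an oversubscription ratio (a hypothetical first-fit placer, not any scheduler from the talk):

```python
def place_vms(vms, cores_per_server, ratio):
    """First-fit placement of VMs (vCPU counts) onto servers whose
    capacity is inflated by an oversubscription ratio.
    Hypothetical sketch, not a production scheduler."""
    capacity = cores_per_server * ratio
    servers = []  # committed vCPUs per server
    for vcpus in vms:
        for i, used in enumerate(servers):
            if used + vcpus <= capacity:
                servers[i] += vcpus
                break
        else:
            servers.append(vcpus)  # open a new server
    return len(servers)

vms = [4, 2, 8, 2, 4, 1, 2, 1]  # 24 vCPUs of demand
print(place_vms(vms, cores_per_server=8, ratio=1.0))  # 3 servers
print(place_vms(vms, cores_per_server=8, ratio=2.0))  # 2 servers
```

A 2:1 ratio here saves a server by betting that VMs will not all peak at once; the talk's scheduling techniques address exactly how far that bet can safely go.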

Pierre Jacquet is a Cloud Computing researcher starting a postdoctoral
position at ÉTS Montréal. He earned a Master’s degree in Computer
Science from Université Paris XII in 2021 and recently completed his
Ph.D. at Inria Lille, jointly with the Spirals and Stack teams. His
research interests include resource scheduling and sustainable
computing.

WG Avalon public, October 1, 2024 (Nikola Schuster)

Nikola Schuster (Stellenbosch University, South Africa) will give the following talk.

Title:
Energy-Efficient Processing in Radio Astronomy: Comparative Analysis
of Computing Platforms and Their Implementation

Short abstract:
As the Square Kilometre Array pushes the limits of radio astronomy,
energy-efficient computing is crucial. My research firstly compares
the energy efficiency of Field-Programmable Gate Arrays (FPGAs) and
Graphics Processing Units (GPUs), two key technologies in this field,
and secondly investigates the effect of using the HDL coder
abstraction layer.

First, we ran a correlator algorithm on both platforms and found that
FPGAs processed 127 times more instructions per kilowatt-hour than
GPUs, indicating significantly better energy efficiency. Second, we
examined the impact of using HDL Coder, an abstraction tool for
programming FPGAs. Using the same correlator algorithm, we found
that the abstraction layer increased power consumption by 11.18%.
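To make the two reported figures concrete, the sketch below recomputes them from hypothetical absolute measurements (the abstract only reports the 127x ratio and the 11.18% overhead; every absolute number here is invented for illustration):

```python
def instr_per_kwh(instructions, energy_joules):
    """Instructions executed per kilowatt-hour (1 kWh = 3.6e6 J)."""
    return instructions * 3.6e6 / energy_joules

# Hypothetical absolute measurements, chosen only to reproduce the
# reported 127x gap; they are not the talk's real data.
fpga = instr_per_kwh(instructions=5.08e15, energy_joules=1.44e8)
gpu = instr_per_kwh(instructions=4.0e13, energy_joules=1.44e8)
print(fpga / gpu)  # 127.0

# The 11.18% HDL Coder overhead on an assumed 40 W baseline design:
baseline_w = 40.0
with_hdl_coder_w = baseline_w * 1.1118
print(round(with_hdl_coder_w, 2))  # 44.47
```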

These results suggest that while GPUs are being adopted, FPGAs may
still offer superior energy efficiency for certain tasks. Moreover,
abstraction tools, though useful, come with an energy cost.

WG Avalon public, June 10, 2024

Sébastien Valat (Inria AirSea) gave the following seminar.

Title: Memory profiling, a presentation of MALT and NUMAPROF

Abstract:
Memory has become a major problem in many applications.

This is due both to access performance and to the sheer volumes to
be managed within applications that are increasingly dynamic and
complex, and often developed by many people over decades.

The following questions then arise:

  – How can one easily spot mistakes and problematic patterns that
are straightforward to fix?
  – How can one find where memory is consumed when reaching the
limits of the machine (malloc, global variables, TLS)?
  – With NUMA, how can one tell whether something went wrong, and
where?

After my PhD on memory management in HPC contexts (malloc, kernel,
NUMA, multi-threading, ...) for large numerical simulations, I had
the opportunity to develop two memory profilers, MALT (MALloc
Tracker) and NUMAPROF. With these tools, I tried to capture what I
had learned along the way and to make visible what I struggled to
visualize at the time in the unfamiliar target codes I was working
with. Both tools are now open source and support C/C++/Fortran
(and Rust).

I will therefore present these two tools, in principle with a few
examples of observations obtained with them.
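The kind of call-site attribution these profilers perform can be illustrated in a language-agnostic way with Python's built-in tracemalloc module (MALT itself targets C/C++/Fortran; this is only an analogy for how a profiler maps allocated bytes back to the allocating line):

```python
import tracemalloc

def churn():
    # A pattern a MALT-style profiler flags: many small allocations
    # kept alive, all attributed back to the allocating call site.
    return [bytes(1024) for _ in range(1000)]

tracemalloc.start()
data = churn()
snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics("lineno")[0]  # biggest allocation site
print(f"{top.traceback[0].filename}:{top.traceback[0].lineno} "
      f"allocated {top.size // 1024} KiB in {top.count} blocks")
tracemalloc.stop()
```

Here roughly 1 MiB is attributed to the single line holding the comprehension; MALT produces the same style of per-call-site report for native code, and NUMAPROF adds the NUMA-locality dimension.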

Website: https://memtt.github.io/

About the speaker:
After a background in particle physics, I changed direction to
answer my own questions about computing in science, with a PhD at
CEA on memory management for supercomputers. Since then I have
worked at CERN, and partly in the HPC industry on I/O. I am
currently at Inria, working on ocean simulation.
https://svalat.github.io/

The ECLAT Laboratory

A laboratory for astronomical instrumentation

ECLAT stands as a centre of excellence dedicated to High-Performance Computing (HPC) and Artificial Intelligence (AI) technologies and techniques applied to astronomical instrumentation. This project brings together sixteen partner laboratories and teams around a common roadmap, aimed at strengthening collaboration in research and development (R&D). The aim is to design and build the future cyber-physical systems for astronomy, capable of managing, processing and optimising vast volumes of data.

More information on ECLAT website.

Exa-SofT : HPC software and tools

A NumPEx PEPR project

Though significant efforts have been devoted to the implementation and optimization of several crucial parts of a typical HPC software stack, most HPC experts agree that exascale supercomputers will raise new challenges, mostly because the trend in exascale compute-node hardware is toward heterogeneity and scalability: Compute nodes of future systems will have a combination of regular CPUs and accelerators (typically GPUs), along with a diversity of GPU architectures.

Meeting the needs of complex parallel applications and the requirements of exascale architectures raises numerous challenges which are still left unaddressed.
As a result, several parts of the software stack must evolve to better support these architectures. More importantly, the links between these parts must be strengthened to form a coherent, tightly integrated software suite.

Our project aims at consolidating the exascale software ecosystem by providing a coherent, exascale-ready software stack featuring breakthrough research advances enabled by multidisciplinary collaborations between researchers.

The main scientific challenges we intend to address are:

  • productivity,
  • performance portability,
  • heterogeneity,
  • scalability and resilience,
  • performance and energy efficiency.

AVALON coordinates WP1 and participates in WP1 and WP2.

Project Information

  • URL: Not available yet
  • Starting date: 2023
  • End date: 2028

Taranis : Model, Deploy, Orchestrate, and Optimize Cloud

A PEPR Cloud project

New infrastructures, such as Edge Computing or the Cloud-Edge-IoT computing continuum, make cloud issues more complex as they add new challenges related to resource diversity and heterogeneity (from small sensor to data center/HPC, from low power network to core networks), geographical distribution, as well as increased dynamicity and security needs, all under energy consumption and regulatory constraints.

In order to efficiently exploit new infrastructures, we propose a strategy based on a significant abstraction of the application structure description to further automate application and infrastructure management. Thus, it will be possible to globally optimize the resources used with respect to multi-criteria objectives (price, deadline, performance, energy, etc.) on both the user side (applications) and the provider side (infrastructures). This abstraction also includes the challenges related to the abstraction of application reconfiguration and to automatically adapt the use of resources.

The Taranis project addresses these issues through four scientific work packages, each focusing on a phase of the application lifecycle: application and infrastructure description models, deployment and reconfiguration, orchestration, and optimization.

The first work package “Modeling” addresses the complexity of cloud-edge application and infrastructure models: formal verification and optimization of these models, multi-layer variability, the relationship between model expressiveness and efficient solution computation, lock-ins of proprietary models, and heterogeneity of cloud application and infrastructure modeling languages.

The second work package “Deployment and Reconfiguration” studies deployment and reconfiguration related issues to reduce management complexity and increase support for provisioning and configuration languages, while improving operations certification and increasing operations concurrency. The workpackage also aims to reduce the complexity of the bootstrapping problem on geo-distributed and heterogeneous resources.

The third work package “Orchestration of services and resources” aims at extending the orchestrators for the Cloud-Edge-IoT continuum, while making them more autonomous with respect to dynamic, functional and/or non-functional needs, in particular with respect to the network partitioning problem specific to Cloud-Edge-IoT infrastructures.

Finally, the fourth work package “Optimization” aims to revisit the optimization problems associated with the use of Cloud-Edge-IoT infrastructures and the execution of an application when a large number of decision variables need to be considered jointly. It also aims to make optimization techniques aware of the Cloud-Edge-IoT continuum, the heterogeneous distributed platforms and the wide range of application configurations involved.

AVALON coordinates the project and participates in the first two work packages.

Project Information

  • URL: Not available yet
  • Starting date: 2023, September 1st
  • End date: 2030, August 31st

Slices PP – Preparatory Phase

The digital infrastructures research community continues to face numerous new challenges towards the design of the Next Generation Internet. This is an extremely complex ecosystem encompassing communication, networking, data-management and data-intelligence issues, supported by established and emerging technologies such as IoT, 5/6G, and cloud-to-edge computing. Coupled with the enormous amount of data generated and exchanged over the network, this calls for incremental as well as radically new design paradigms. Experimentally-driven research is becoming a de facto standard worldwide, and it has to be supported by large-scale research infrastructures to make results trusted, repeatable and accessible to the research communities.
SLICES-RI (Research Infrastructure), which was recently included in the 2021 ESFRI roadmap, aims to answer these problems by building a large infrastructure needed for the experimental research on various aspects of distributed computing, networking, IoT and 5/6G networks. It will provide the resources needed to continuously design, experiment, operate and automate the full lifecycle management of digital infrastructures, data, applications, and services.
Based on the two preceding projects within SLICES-RI, SLICES-DS (Design Study) and SLICES-SC (Starting Community), the SLICES-PP (Preparatory Phase) project will validate the requirements to engage into the implementation phase of the RI lifecycle. It will set the policies and decision processes for the governance of SLICES-RI: i.e., the legal and financial frameworks, the business model, the required human resource capacities and training programme. It will also settle the final technical architecture design for implementation. It will engage member states and stakeholders to secure commitment and funding needed for the platform to operate. It will position SLICES as an impactful instrument to support European advanced research, industrial competitiveness and societal impact in the digital era.

Project Information

WG Avalon, May 3, 2022 (Élise Jeanneau)

Élise Jeanneau gave the following talk.

Title: SkyData, a new paradigm for data management

Abstract

Traditional data management systems are centered on applications
rather than on the data. The SkyData project proposes to reverse
this by managing autonomous data, capable of deciding on their own
migrations and replications. The resulting system is distributed,
dynamic, and fundamentally different from existing data management
systems. This new paradigm lets users integrate their data into the
system without having to cede control to a third-party data manager.

SkyData is a project submitted to the ANR. This talk aims to
introduce the basic structure of SkyData and to discuss some
possible uses of such autonomous data.
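As a toy sketch of the autonomous-data idea (hypothetical class and API names, not SkyData's actual design), a datum could carry its own placement policy and decide where to replicate itself:

```python
class AutonomousDatum:
    """Toy sketch of autonomous data: the datum itself, rather
    than an external data manager, decides its own placement.
    Hypothetical API, invented for illustration."""
    def __init__(self, payload, hosts):
        self.payload = payload
        self.hosts = hosts
        self.replicas = {hosts[0]}  # initially stored on one host

    def step(self, load):
        # Policy: replicate toward the least-loaded known host.
        target = min(self.hosts, key=lambda h: load.get(h, 0.0))
        if target not in self.replicas:
            self.replicas.add(target)

hosts = ["lyon", "grenoble", "lille"]
d = AutonomousDatum(b"sensor readings", hosts)
d.step(load={"lyon": 0.9, "grenoble": 0.2, "lille": 0.7})
print(sorted(d.replicas))  # ['grenoble', 'lyon']
```

The inversion is that no central manager ever calls into the datum: each datum observes its environment and acts, which is what makes the resulting system distributed and dynamic.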