WG – Houmani Zeina: Study and design of data-driven services/microservices discovery mechanisms

2018-11-13

Title: Study and design of data-driven services/microservices discovery mechanisms

Speaker: Houmani Zeina

Abstract:

— English version

Conventional microservice discovery mechanisms are typically based on user needs (goal-based approaches). However, in today’s rapidly evolving architectures, new microservices are created frequently, which makes the classic approach alone insufficient for discovering the available microservices. Customers therefore need to discover the features they can benefit from before searching for the available microservices in their domain. We will present a data-driven microservice architecture that allows customers to discover, starting from specific objects, the functionalities that can be applied to these objects as well as all the microservices dedicated to them. This architecture, based on the main components of classic microservice architectures, adopts a particular communication strategy between clients and registries to achieve this objective.

— French version

Les mécanismes de découverte de microservices classiques sont normalement basés sur les besoins des utilisateurs (Goal-based Approaches). Cependant, dans les architectures actuelles qui évoluent fréquemment, plusieurs nouveaux microservices peuvent être créés. Cela rend l’approche classique seule insuffisante pour découvrir les microservices disponibles. C’est pourquoi les clients ont besoin de découvrir les fonctionnalités dont ils peuvent bénéficier avant de rechercher dans leur domaine les microservices disponibles. Nous allons présenter une architecture microservices pilotée par les données qui permet aux clients de découvrir, à partir d’objets spécifiques, les fonctionnalités qui peuvent être exercées sur ces objets ainsi que l’ensemble des microservices qui leur sont dédiés. Cette architecture, basée sur les composants principaux des architectures microservices classiques, adopte une stratégie de communication particulière entre les clients et les registres permettant d’atteindre l’objectif recherché.
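To make the data-driven idea in the abstract concrete, here is a minimal sketch of a registry indexed by object type rather than by user goal. It is an illustration under stated assumptions, not the architecture presented in the talk: the class `DataDrivenRegistry`, its methods, and the service names are all hypothetical.

```python
from collections import defaultdict


class DataDrivenRegistry:
    """Toy registry indexed by data object type (hypothetical design)."""

    def __init__(self):
        # object type -> functionality -> set of microservice names
        self._index = defaultdict(lambda: defaultdict(set))

    def register(self, service, object_type, functionality):
        """Declare that `service` offers `functionality` on `object_type`."""
        self._index[object_type][functionality].add(service)

    def functionalities(self, object_type):
        """List what can be done with an object of the given type."""
        # .get() avoids creating empty entries for unknown types.
        return sorted(self._index.get(object_type, {}))

    def services(self, object_type, functionality=None):
        """List services for an object type, optionally for one functionality."""
        by_function = self._index.get(object_type, {})
        if functionality is not None:
            return sorted(by_function.get(functionality, ()))
        return sorted({s for names in by_function.values() for s in names})


# Hypothetical services, used only for illustration.
registry = DataDrivenRegistry()
registry.register("thumbnailer", "image", "resize")
registry.register("imgscale", "image", "resize")
registry.register("ocr-engine", "image", "extract-text")
registry.register("transcoder", "video", "convert")

print(registry.functionalities("image"))
print(registry.services("image", "resize"))
```

A client holding an image would first ask which functionalities exist for that object type, then ask which services provide the one it wants, which reverses the usual goal-first lookup.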

WG – Arthur Chevalier: Software licenses for fun and profit

2018-11-06

Title: Software licenses for fun and profit

Speaker: Arthur Chevalier

Abstract:

— French version :

Aujourd’hui, l’utilisation des logiciels est généralement réglementée par des licences, qu’elles soient gratuites ou payantes, et avec ou sans accès à leurs sources. L’univers des licences est très vaste et mal connu. Souvent on ne connaît que la version la plus répandue auprès du grand public (un achat de logiciel est égal à une licence). La réalité est bien plus complexe, surtout chez les grands éditeurs. Dans cette présentation je présenterai l’impact et l’importance de la gestion de ces licences lors de l’utilisation de logiciels dans une architecture Cloud. Je montrerai un cas d’étude pour prouver l’impact de la gestion dynamique des licences et la nécessité de proposer de nouvelles façons de gérer un patrimoine logiciel. Ce cas d’étude portera sur des logiciels vendus par quatre grands éditeurs (Microsoft, Red Hat, Software AG et Oracle).

— English version :

Today, the use of software is generally governed by licenses, whether free or paid, and with or without access to the source code. The world of licensing is vast and poorly understood. Often we only know the model most familiar to the general public (one software purchase equals one license). The reality is much more complex, especially with large vendors. In this presentation I will discuss the impact and importance of managing these licenses when using software in a cloud architecture. I will present a case study that demonstrates the impact of dynamic license management and the need for new ways to manage software assets. This case study focuses on software sold by four major vendors (Microsoft, Red Hat, Software AG and Oracle).

 

WG – Hadrien Croubois: Toward an autonomic engine for scientific workflows and elastic Cloud infrastructure

2018-10-02

Title: Toward an autonomic engine for scientific workflows and elastic Cloud infrastructure

Speaker: Hadrien Croubois

Abstract: The constant development of scientific and industrial computation infrastructures requires the concurrent development of scheduling and deployment mechanisms to manage such infrastructures. Throughout the last decade, the emergence of the Cloud paradigm raised many hopes, but achieving full platform autonomicity is still an ongoing challenge.

Work undertaken during this Ph.D. aimed at building a workflow engine that integrates the logic needed to manage workflow execution and Cloud deployment on its own. More precisely, we focused on Cloud solutions with a dedicated Data as a Service (DaaS) data management component. Our objective was to automate the execution of workflows submitted by many users on elastic Cloud resources.

This contribution proposes a modular middleware infrastructure and details the implementation of the underlying modules:

– A workflow clustering algorithm that optimises data locality in the context of DaaS-centered communications;

– A dynamic scheduler that executes clustered workflows on Cloud resources;

– A deployment manager that handles the allocation and deallocation of Cloud resources according to the workload characteristics and users’ requirements.

All these modules have been implemented in a simulator to analyse their behaviour and measure their effectiveness when running both synthetic and real scientific workflows. We also implemented these modules in the DIET middleware to give it new features and demonstrate the versatility of this approach. Simulations running the WASABI workflow (waves analysis based inference, a framework for the reconstruction of gene regulatory networks) showed that our approach can decrease the deployment cost by up to 44% while meeting the required deadlines.

WG – Carlos Cardonha: Network Models for Multi-Objective Discrete Optimization

2018-06-29

Title: Network Models for Multi-Objective Discrete Optimization

Speaker: Carlos Cardonha

Abstract: This work provides a novel framework for solving multi-objective discrete optimization problems with an arbitrary number of objectives. Our framework formulates these problems as network models, in which enumerating the Pareto frontier amounts to solving a multi-criteria shortest path problem in an auxiliary network. We design tools and techniques for exploiting the network model in order to accelerate the identification of the Pareto frontier, most notably a number of operations that simplify the network by removing nodes and arcs while preserving the set of nondominated solutions. We show that the proposed framework yields orders-of-magnitude performance improvements over existing state-of-the-art algorithms on four problem classes containing both linear and nonlinear objective functions.
This is joint work with David Bergman, Merve Bodur, and André Ciré.
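As a reminder of what "nondominated solutions" and "Pareto frontier" mean in the abstract above, the following sketch filters a small set of objective vectors by brute force. It is not the network-model method of the talk; it only illustrates the definition, assumes all objectives are minimised, and uses made-up data.

```python
def dominates(a, b):
    """Return True if a dominates b: no worse on every objective and
    strictly better on at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_frontier(points):
    """Return the nondominated points, sorted, by pairwise comparison.

    Quadratic in the number of points; fine for illustration only.
    """
    unique = set(points)
    return sorted(p for p in unique
                  if not any(dominates(q, p) for q in unique))


# Made-up objective vectors (cost, time) for six candidate solutions.
solutions = [(1, 9), (2, 7), (3, 8), (4, 4), (6, 5), (7, 1)]
frontier = pareto_frontier(solutions)
print(frontier)
```

Here (3, 8) is dominated by (2, 7) and (6, 5) by (4, 4), so only four points remain on the frontier.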

Mini-bio: Carlos Cardonha is a Research Staff Member of the Optimization under Uncertainty Group at IBM Research Brazil. He holds a Ph.D. in Mathematics (T.U. Berlin) and Bachelor’s and Master’s degrees in Computer Science (Universidade de São Paulo). His primary research interests are mathematical programming and theoretical computer science, with a focus on applying techniques from mixed integer linear programming, combinatorial optimization, and algorithm design and analysis to real-world and/or operations research problems.

WG – Alexandre da Silva Veith: Latency-Aware Placement of Data Stream Analytics on Edge Computing

2018-06-26

Title: Latency-Aware Placement of Data Stream Analytics on Edge Computing

Speaker: Alexandre da Silva Veith

Abstract: The interest in processing data events under stringent time constraints as they arrive has led to the emergence of architectures and engines for data stream processing. Edge computing, initially designed to minimize the latency of content delivered to mobile devices, can be used for executing certain stream processing operations. Moving operators from cloud to edge, however, is challenging, as operator-placement decisions must consider the application requirements and the network capabilities. In this work, we introduce strategies to create placement configurations for data stream processing applications whose operator topologies follow series-parallel graphs. We consider the operator characteristics and requirements to improve the response time of such applications. Results show that our strategies can improve the response time by up to 50% for application graphs comprising multiple forks and joins, while transferring less data and making better use of the resources.

WG – Cyril Seguin: Elasticity in Distributed File Systems.

2018-05-29

Title: Elasticity in Distributed File Systems.

Speaker: Cyril Seguin

Abstract: For several decades, distributed file systems have been increasingly used as a storage solution for distributed infrastructures.
They offer efficient, reliable and easy access to huge amounts of shared data by federating several storage resources and by replicating each piece of data across these resources.
In parallel, the advent of cloud computing platforms, and especially infrastructure-as-a-service platforms that offer users thousands of resources on demand for short periods, makes it possible to acquire inexpensive distributed infrastructures.
The elasticity and pay-per-use characteristics of the cloud allow users to dynamically increase or reduce the number of resources they use according to their needs, paying only for what they use.
Deploying a distributed file system on a cloud computing platform lets users adapt the number of resources used to the platform activity while taking advantage of the distributed file system’s performance.
However, this raises new challenges concerning data availability and the trade-off between the number of resources used and performance.
This talk focuses on addressing these issues in both a static context, in which the platform activity is known, and a dynamic context, in which it is not.
We show that new data placement strategies, adapting the number of replicas of each data item to its access frequency, and balancing the request load across the resources in use make it possible to address these issues.
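The idea of tying the number of replicas to access frequency can be sketched with a simple heuristic. This is an assumption made for illustration, not the strategy presented in the talk: the function `replica_count`, its parameters, and the default bounds are hypothetical.

```python
import math


def replica_count(requests_per_sec, capacity_per_replica,
                  min_replicas=2, max_replicas=10):
    """Pick a replica count for one data item (hypothetical heuristic).

    Enough replicas are requested to serve the observed load, assuming
    each replica can serve `capacity_per_replica` requests per second.
    The lower bound preserves availability for rarely accessed data;
    the upper bound caps storage cost for very popular data.
    """
    needed = math.ceil(requests_per_sec / capacity_per_replica)
    return max(min_replicas, min(max_replicas, needed))


# Cold, warm and hot data items, each replica serving 100 requests/s.
for load in (50, 450, 5000):
    print(load, replica_count(load, 100))
```

In a dynamic context the load would be re-measured periodically and the count recomputed, so replicas are added for data that becomes popular and released when demand drops.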

WG – Hadrien Croubois: A Cloud-aware autonomous workflow engine and its application to Gene Regulatory Networks inference.

2018-03-13

Title: A Cloud-aware autonomous workflow engine and its application to Gene Regulatory Networks inference.

Speaker: Hadrien Croubois

Abstract: With the recent development of commercial Cloud offers, Cloud solutions are today the obvious choice for many computing use cases. However, high performance scientific computing is still among the few domains where the Cloud raises more issues than it solves. Notably, combining the workflow representation of complex scientific applications with the dynamic allocation of resources in a Cloud environment is still a major challenge. In the meantime, users with monolithic applications face challenges when trying to move from classical HPC hardware to elastic platforms. In this paper, we present the structure of an autonomous workflow manager dedicated to IaaS-based Clouds (Infrastructure as a Service) with DaaS storage services (Data as a Service). The solution proposed in this paper fully handles the execution of multiple workflows on a dynamically allocated shared platform. As a proof of concept, we validate our solution on a biological application, the WASABI workflow.

WG – Alba Cristina Magalhaes Alves de Melo: Parallel Sequence Alignment of Whole Chromosomes with Hundreds of GPUs and Pruning

2018-02-28

Title: Parallel Sequence Alignment of Whole Chromosomes with Hundreds of GPUs and Pruning

Speaker: Alba Cristina Magalhaes Alves de Melo

Abstract: Biological sequence alignment is a basic operation in Bioinformatics, used routinely worldwide. Smith-Waterman is the exact algorithm used to compare two sequences, obtaining the optimal alignment in quadratic time and space. In order to accelerate Smith-Waterman, many GPU-based strategies have been proposed in the literature. However, aligning DNA sequences of millions of characters, or base pairs (MBP), is still a very challenging task. In this talk, we discuss related work in the area of parallel biological sequence alignment and present our multi-GPU strategy to align DNA sequences of up to 249 million characters on 384 GPUs. To achieve this, we propose an innovative speculation technique that parallelizes a phase of the Smith-Waterman algorithm that is inherently sequential. We combined our speculation technique with sophisticated buffer management and fine-grained linear-space matrix processing strategies to obtain our parallel algorithm. As far as we know, this is the first implementation of Smith-Waterman able to retrieve the optimal alignment between sequences with more than 50 million characters. We will also present a pruning technique for one GPU that is able to prune more than 50% of the Smith-Waterman matrix and still retrieve the optimal alignment. We will show the results obtained on the Keeneland cluster (USA), where we compared all the human and chimpanzee homologous chromosomes (ranging from 26 MBP to 249 MBP). The human-chimpanzee chromosome 5 comparison (180 MBP x 183 MBP) attained 10.35 TCUPS (Trillions of Cells Updated per Second) using 384 GPUs. In this case, we processed 45 petacells and produced the optimal alignment in 53 minutes and 7 seconds, with a speculation hit ratio of 98.2%.
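For readers unfamiliar with Smith-Waterman, the sketch below computes the optimal local alignment score with the textbook dynamic-programming recurrence. It is a sequential, CPU-only illustration, not the multi-GPU algorithm of the talk. The scoring values and the linear gap penalty are assumptions chosen for brevity. Only two rows of the matrix are kept, so memory is linear, but every cell is still visited, which is why the running time is quadratic.

```python
def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses a linear gap penalty (an assumption for this sketch) and keeps
    only the previous and current rows of the scoring matrix.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                           # start a new local alignment
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Recovering the alignment itself, rather than just its score, requires a traceback phase over the matrix; that is the part that becomes difficult in linear space for very long sequences.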

Speaker Bio: Alba Cristina Magalhaes Alves de Melo obtained her PhD degree in Computer Science from the Institut National Polytechnique de Grenoble (INPG), France, in 1996. In 2008, she did a postdoc at the University of Ottawa, Canada; in 2011, she was invited as Guest Scientist at Université Paris-Sud, France; and in 2013 she did a sabbatical at the Universitat Politècnica de Catalunya, Spain. Since 1997, she has worked at the Department of Computer Science at the University of Brasilia (UnB), Brazil, where she is now a Full Professor. She is also a CNPq Research Fellow level 1D in Brazil. She was the Coordinator of the Graduate Program in Informatics at UnB for several years (2000-2002, 2004-2006, 2008, 2010, 2014) and she coordinated international collaboration projects with the Universitat Politècnica de Catalunya, Spain (2012, 2014-2016) and with the University of Ottawa, Canada (2012-2015). In 2016, she received the Brazilian Capes Award on “Advisor of the Best PhD Thesis in Computer Science”. Her research interests are High Performance Computing, Bioinformatics and Cloud Computing. She has advised 2 postdocs, 4 PhD theses and 22 MSc dissertations. Currently, she advises 4 PhD students and 2 MSc students. She is a Senior Member of the IEEE and a Member of the Brazilian Computer Society. She has given invited talks at Universität Karlsruhe, Germany, Université Paris-Sud, France, Universitat Politècnica de Catalunya, Spain, University of Ottawa, Canada and at Universidad de Chile, Chile. She currently has 91 papers listed at DBLP (www.informatik.uni-trier.de/~ley/db/indices/a-tree/m/Melo:Alba_Cristina_Magalhaes_Alves_de.html).

WG – Prof. Rajkumar Buyya: New Frontiers in Cloud Computing for Big Data and Internet-of-Things (IoT) Applications

2018-02-27

Title: New Frontiers in Cloud Computing for Big Data and Internet-of-Things (IoT) Applications

Speaker: Prof. Rajkumar Buyya
Director, Cloud Computing and Distributed Systems (CLOUDS) Lab,
The University of Melbourne, Australia

CEO, Manjrasoft Pvt Ltd, Melbourne, Australia

Abstract: Computing is being transformed to a model consisting of services that are commoditised and delivered in a manner similar to utilities such as water, electricity, gas, and telephony. Several computing paradigms have promised to deliver this utility computing vision. Cloud computing has emerged as one of the buzzwords in the IT industry and turned the vision of “computing utilities” into a reality.  Clouds deliver infrastructure, platform, and software (application) as services, which are made available as subscription-based services in a pay-as-you-go model to consumers. Cloud application platforms need to offer (1) APIs and tools for rapid creation of elastic applications and (2) a runtime system for deployment of applications on geographically distributed computing infrastructure in a seamless manner.
The Internet of Things (IoT) paradigm enables seamless integration of the cyber and physical worlds and opens up opportunities for creating a new class of applications for domains such as smart cities. The emerging Fog computing paradigm extends Cloud computing to edge resources for latency-sensitive IoT applications.
This keynote presentation will cover (a) a 21st-century vision of computing and the various IT paradigms promising to deliver the vision of computing utilities; (b) opportunities and challenges for utility and market-oriented Cloud computing; (c) an innovative architecture for creating market-oriented and elastic Clouds by harnessing virtualisation technologies; (d) Aneka, a Cloud Application Platform, for rapid development of Cloud/Big Data applications and their deployment on private/public Clouds with resource provisioning driven by SLAs; (e) experimental results on deploying Cloud and Big Data/Internet-of-Things (IoT) applications in engineering, health care, satellite image processing, and smart cities on elastic Clouds; and (f) directions for delivering our 21st-century vision along with pathways for future research in Cloud and Fog computing.

Speaker Bio: Dr. Rajkumar Buyya is a Redmond Barry Distinguished Professor and Director of the Cloud Computing and Distributed Systems (CLOUDS) Laboratory at the University of Melbourne, Australia. He is also serving as the founding CEO of Manjrasoft, a spin-off company of the University, commercializing its innovations in Cloud Computing. He served as a Future Fellow of the Australian Research Council during 2012-2016. He has authored over 625 publications and seven textbooks including “Mastering Cloud Computing” published by McGraw Hill, China Machine Press, and Morgan Kaufmann for Indian, Chinese and international markets respectively. He also edited several books including “Cloud Computing: Principles and Paradigms” (Wiley Press, USA, Feb 2011). He is one of the highly cited authors in computer science and software engineering worldwide (h-index=114, g-index=245, 67,600+ citations). Dr. Buyya was recognized as a “Web of Science Highly Cited Researcher” in 2016 and 2017 by Thomson Reuters, is a Fellow of IEEE, and was named Scopus Researcher of the Year 2017 with the Excellence in Innovative Research Award by Elsevier for his outstanding contributions to Cloud computing.
Software technologies for Grid and Cloud computing developed under Dr. Buyya’s leadership have gained rapid acceptance and are in use at several academic institutions and commercial enterprises in 40 countries around the world. Dr. Buyya has led the establishment and development of key community activities, including serving as foundation Chair of the IEEE Technical Committee on Scalable Computing and five IEEE/ACM conferences. These contributions and the international research leadership of Dr. Buyya are recognized through the award of the “2009 IEEE Medal for Excellence in Scalable Computing” from the IEEE Computer Society TCSC.
Manjrasoft’s Aneka Cloud technology, developed under his leadership, received the “2010 Frost & Sullivan New Product Innovation Award”. Recently, Dr. Buyya received the “Mahatma Gandhi Award” along with Gold Medals for his outstanding and extraordinary achievements in the Information Technology field and services rendered to promote greater friendship and India-International cooperation. He served as the founding Editor-in-Chief of the IEEE Transactions on Cloud Computing. He is currently serving as Co-Editor-in-Chief of the Journal of Software: Practice and Experience, which was established over 45 years ago. For further information on Dr. Buyya, please visit his cyber home: www.buyya.com

WG – Laércio LIMA PILLA: Current Efforts in Global Scheduling and Fault Tolerance for HPC Systems

2018-01-23

Title: Current Efforts in Global Scheduling and Fault Tolerance for HPC Systems

Speaker: Laércio LIMA PILLA

Abstract: Performance, energy efficiency, and reliability have been important objectives and challenges in current and future computing systems. In this context, our approach has been based on understanding the details of the computing system architecture and the behavior of applications, in order to combine this information, identify issues and propose new solutions. In this presentation, I will discuss our experience with the development of new architecture-aware global scheduling algorithms for multiprocessor and multicomputer systems, and with fault tolerance mechanisms for radiation-induced errors in parallel accelerators. I will also present some future global scheduling plans to handle the inclusion of non-volatile random-access memories (NVRAMs) in computing systems.