WG – Cyril Seguin: Elasticity in Distributed File Systems.

2018-05-29

Title: Elasticity in Distributed File Systems.

Speaker: Cyril Seguin

Abstract: For several decades, distributed file systems have been increasingly used as a storage solution for distributed infrastructures.
They offer efficient, reliable and easy access to huge amounts of shared data by federating several storage resources and by replicating each piece of data across these resources.
In parallel, the advent of cloud computing platforms, and especially Infrastructure-as-a-Service platforms that temporarily offer users thousands of resources on demand, makes it possible to acquire inexpensive distributed infrastructures.
The elasticity and pay-per-use characteristics of the cloud allow users to dynamically increase or reduce the number of resources they use according to their needs, paying only for what they use.
Deploying a distributed file system on a cloud computing platform lets users adapt the number of resources in use to the platform activity while taking advantage of the performance of a distributed file system.
However, this raises new challenges concerning data availability and the trade-off between the number of resources used and performance.
This talk focuses on solving these issues in a static context and in a dynamic context, in which the platform activity is respectively known and unknown.
We show that new data placement strategies, adapting the number of replicas of each piece of data to its access frequency, and balancing the request load across the resources in use make it possible to address these issues.
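
As a rough illustration of the last point (not the speaker's actual algorithm), the sketch below sizes the replica count of each file from its access frequency and places replicas on the least-loaded nodes. The names `replica_count`, `place_replicas`, `per_replica_capacity` and `min_replicas`, as well as the capacity model itself, are assumptions made for this example only.

```python
import heapq
import math


def replica_count(access_freq, per_replica_capacity, min_replicas, n_nodes):
    """Hypothetical rule: one replica per `per_replica_capacity` requests,
    never fewer than `min_replicas` (availability) nor more than `n_nodes`."""
    needed = math.ceil(access_freq / per_replica_capacity)
    return min(n_nodes, max(min_replicas, needed))


def place_replicas(access_freqs, n_nodes, per_replica_capacity=100.0, min_replicas=2):
    """Greedy placement: handle the hottest files first and put each replica
    on one of the currently least-loaded nodes (assumed load model: a file's
    requests are spread evenly over its replicas)."""
    load = [0.0] * n_nodes
    placement = {}
    for name, freq in sorted(access_freqs.items(), key=lambda kv: -kv[1]):
        k = replica_count(freq, per_replica_capacity, min_replicas, n_nodes)
        # Pick k distinct nodes with the smallest current load.
        targets = heapq.nsmallest(k, range(n_nodes), key=lambda n: load[n])
        for node in targets:
            load[node] += freq / k
        placement[name] = sorted(targets)
    return placement, load


if __name__ == "__main__":
    # Made-up access frequencies (requests per period) for three files.
    placement, load = place_replicas({"a": 450, "b": 90, "c": 10}, n_nodes=4)
    print(placement)
    print(load)
```

In this sketch a frequently read file ends up on more nodes than a rarely read one, while the minimum replica count preserves availability when resources are released.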

WG – Hadrien Croubois: A Cloud-aware autonomous workflow engine and its application to Gene Regulatory Networks inference.

2018-03-13

Title: A Cloud-aware autonomous workflow engine and its application to Gene Regulatory Networks inference.

Speaker: Hadrien Croubois

Abstract: With the recent development of commercial Cloud offerings, Cloud solutions are today the obvious choice for many computing use cases. However, high-performance scientific computing is among the few domains where the Cloud still raises more issues than it solves. Notably, combining the workflow representation of complex scientific applications with the dynamic allocation of resources in a Cloud environment is still a major challenge. In the meantime, users with monolithic applications are facing challenges when trying to move from classical HPC hardware to elastic platforms. In this paper, we present the structure of an autonomous workflow manager dedicated to IaaS-based Clouds (Infrastructure as a Service) with DaaS storage services (Data as a Service). The proposed solution fully handles the execution of multiple workflows on a dynamically allocated shared platform. As a proof of concept, we validate our solution through a biological application with the WASABI workflow.

WG – Prof. Rajkumar Buyya: New Frontiers in Cloud Computing for Big Data and Internet-of-Things (IoT) Applications

2018-02-27

Title: New Frontiers in Cloud Computing for Big Data and Internet-of-Things (IoT) Applications

Speaker: Prof. Rajkumar Buyya
Director, Cloud Computing and Distributed Systems (CLOUDS) Lab,
The University of Melbourne, Australia

CEO, Manjrasoft Pvt Ltd, Melbourne, Australia

Abstract: Computing is being transformed to a model consisting of services that are commoditised and delivered in a manner similar to utilities such as water, electricity, gas, and telephony. Several computing paradigms have promised to deliver this utility computing vision. Cloud computing has emerged as one of the buzzwords in the IT industry and turned the vision of “computing utilities” into a reality. Clouds deliver infrastructure, platform, and software (application) as services, which are made available as subscription-based services in a pay-as-you-go model to consumers. Cloud application platforms need to offer (1) APIs and tools for rapid creation of elastic applications and (2) a runtime system for deployment of applications on geographically distributed computing infrastructure in a seamless manner.
The Internet of Things (IoT) paradigm enables seamless integration of the cyber and physical worlds and opens up opportunities for creating a new class of applications for domains such as smart cities. The emerging Fog computing paradigm extends the Cloud computing paradigm to edge resources for latency-sensitive IoT applications.
This keynote presentation will cover (a) a 21st-century vision of computing and the various IT paradigms promising to deliver the vision of computing utilities; (b) opportunities and challenges for utility and market-oriented Cloud computing; (c) an innovative architecture for creating market-oriented and elastic Clouds by harnessing virtualisation technologies; (d) Aneka, a Cloud Application Platform, for rapid development of Cloud/Big Data applications and their deployment on private/public Clouds with resource provisioning driven by SLAs; (e) experimental results on deploying Cloud and Big Data/Internet-of-Things (IoT) applications in engineering, health care, satellite image processing, and smart cities on elastic Clouds; and (f) directions for delivering our 21st-century vision along with pathways for future research in Cloud and Fog computing.

Speaker Bio:  Dr. Rajkumar Buyya is a Redmond Barry Distinguished Professor and Director of the Cloud Computing and Distributed Systems (CLOUDS) Laboratory at the University of Melbourne, Australia. He is also serving as the founding CEO of Manjrasoft, a spin-off company of the University, commercializing its innovations in Cloud Computing. He served as a Future Fellow of the Australian Research Council during 2012-2016. He has authored over 625 publications and seven textbooks including “Mastering Cloud Computing” published by McGraw Hill, China Machine Press, and Morgan Kaufmann for Indian, Chinese and international markets respectively. He also edited several books including “Cloud Computing: Principles and Paradigms” (Wiley Press, USA, Feb 2011). He is one of the highly cited authors in computer science and software engineering worldwide (h-index=114, g-index=245, 67,600+ citations).  Dr. Buyya is recognized as a “Web of Science Highly Cited Researcher” in 2016 and 2017 by Thomson Reuters, a Fellow of IEEE, and Scopus Researcher of the Year 2017 with Excellence in Innovative Research Award by Elsevier for his outstanding contributions to Cloud computing.
Software technologies for Grid and Cloud computing developed under Dr. Buyya’s leadership have gained rapid acceptance and are in use at several academic institutions and commercial enterprises in 40 countries around the world. Dr.  Buyya has led the establishment and development of key community activities, including serving as foundation Chair of the IEEE Technical Committee on Scalable Computing and five IEEE/ACM conferences. These contributions and international research leadership of Dr. Buyya are recognized through the award of “2009 IEEE Medal for Excellence in Scalable Computing” from the IEEE Computer Society TCSC.
Manjrasoft’s Aneka Cloud technology, developed under his leadership, received the “2010 Frost & Sullivan New Product Innovation Award”. Recently, Dr. Buyya received the “Mahatma Gandhi Award” along with Gold Medals for his outstanding and extraordinary achievements in the Information Technology field and services rendered to promote greater friendship and India-International cooperation. He served as the founding Editor-in-Chief of the IEEE Transactions on Cloud Computing. He is currently serving as Co-Editor-in-Chief of Journal of Software: Practice and Experience, which was established over 45 years ago. For further information on Dr. Buyya, please visit his cyber home: www.buyya.com

WG – Laércio LIMA PILLA: Current Efforts in Global Scheduling and Fault Tolerance for HPC Systems

2018-01-23

Title: Current Efforts in Global Scheduling and Fault Tolerance for HPC Systems

Speaker: Laércio LIMA PILLA

Abstract: Performance, energy efficiency, and reliability have been important objectives and challenges in current and future computing systems. In this context, our approach has been based on understanding the details of the computing system architecture and the behavior of applications, in order to combine this information, identify issues and propose new solutions. In this presentation, I will discuss our experience with the development of new architecture-aware global scheduling algorithms for multiprocessor and multicomputer systems, and with fault tolerance mechanisms for radiation-induced errors in parallel accelerators. I will also present some future global scheduling plans to handle the inclusion of non-volatile random-access memories (NVRAMs) in computing systems.

WG – Victor Allombert: Programming Multi-BSP Algorithms in ML

2018-01-15

Title: Programming Multi-BSP Algorithms in ML

Speaker: Victor Allombert

Abstract: From personal computers using an increasing number of cores to supercomputers having millions of computing units, parallel architectures are the current standard. High-performance architectures are usually referred to as hierarchical, as they are composed of clusters of multi-processors, which are themselves composed of multi-cores. Programming such architectures is known to be notoriously difficult. Writing parallel programs is, most of the time, difficult in both the algorithmic and the implementation phases. To answer those concerns, many structured models and languages have been proposed in order to increase both expressiveness and efficiency. Among other models, Multi-BSP is a bridging model dedicated to hierarchical architectures that ensures efficiency, execution safety, scalability and cost prediction. It is an extension of the well-known BSP model, which handles flat architectures. We introduce the Multi-ML language, which allows programming Multi-BSP algorithms “à la ML” and thus guarantees the properties of the Multi-BSP model and execution safety, thanks to an ML type system. To deal with the multi-level execution model of Multi-ML, we defined a formal semantics which describes the valid evaluation of an expression. To ensure the execution safety of Multi-ML programs, we also propose a type system that preserves replicated coherence. An abstract machine is defined to formally describe the evaluation of a Multi-ML program on a Multi-BSP architecture. An implementation of the language is available as a compilation toolchain. It is thus possible to generate efficient parallel code from a program written in Multi-ML and execute it on any hierarchical machine.