Large-scale distributed computing systems
Lecture 2: Remote services
December 2014
Johan Montagnat, CNRS, I3S, MODALIS
http://www.i3s.unice.fr/~johan/

Course overview
► 1. Distributed computing and models
► 2. Remote services and cloud computing
► 3. Infrastructures and deployment
► 4. Workload and performance modeling
► 5. Workflows
► 6. Authentication, authorization, security
► 7. Data management
► 8. Evaluation

Master Ubinet: Large-Scale Distributed Computing (2) Johan Montagnat

Course content
► 2. Remote Services
  ▸ Services and interoperability
  ▸ Web Services, WS-RF, WS-*
  ▸ Platform as a service, Cloud computing

Service-Oriented Architectures

Service architectures
► Middleware
  ▸ is distributed software
  ▸ requires interoperability, despite heterogeneity
  ▸ requires platform independence (e.g. MS Windows interfaces, Unix computing platforms)
► Service architectures help in achieving these goals
  ▸ Earlier initiatives (e.g.
GT2) based on more static technologies led to difficult middleware/application integration
  ▸ Web Services widely developed during the late 90s
  ▸ OGSA standards emerging in 2003
  ▸ WS-RF is an evolution of WS and OGSA towards convergence

Code reusability
► Modular design
  ▸ Reuse code: subroutines and functions
► Object-oriented software (C++/Java)
  ▸ Reuse functions: classes
► Component-based software
  ▸ Reuse a functionality
► Service Oriented Architecture (Web-Services)
  ▸ Reuse across platforms and protocols
[Figure annotations: increasing code abstraction level; increasing tooling and code generation]

Software composition
► Software engineering development
  ▸ From assembly and static binaries to dynamic object oriented programming, dynamic libraries, kernel modules, etc.
► Software components
  ▸ Define and publish a standard interface
  ▸ Interact by message exchanges
  ▸ Ease (dynamic) composition
  ▸ Modularity (no compilation time dependency)
  ▸ High level message exchange protocols (format, types...)
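
The component properties listed above (published interface, message exchanges, dynamic composition, no compile-time dependency) can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `ComponentRegistry` class, the interface names `math.scale` and `math.sum`, and the dictionary message format are assumptions chosen for illustration, not part of any real component framework.

```python
class ComponentRegistry:
    """Hypothetical registry: components are known only by a published interface name."""

    def __init__(self):
        self._components = {}

    def publish(self, interface, handler):
        # A component publishes its interface; callers never import its code.
        self._components[interface] = handler

    def send(self, interface, message):
        # Interaction is a message exchange: a dict in, a dict out.
        return self._components[interface](dict(message))


def scale_component(message):
    """Multiply every value by the requested factor."""
    return {"values": [v * message["factor"] for v in message["values"]]}


def sum_component(message):
    """Add up all values."""
    return {"value": sum(message["values"])}


def compose(registry, interfaces, message):
    """Dynamic composition: chain components chosen at run time."""
    for interface in interfaces:
        message = registry.send(interface, message)
    return message


registry = ComponentRegistry()
registry.publish("math.scale", scale_component)
registry.publish("math.sum", sum_component)
result = compose(registry, ["math.scale", "math.sum"], {"values": [1, 2, 3], "factor": 2})
```

The caller only depends on interface names and on the message format, so a component can be replaced at run time without recompiling its users.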

Remote Procedure Calls
► RPC
  ▸ Interface Description Language (IDL)
  ▸ Communication protocol
  ▸ Data exchange format
► RPC system examples
  ▸ RMI (Java description, Java serialization)
  ▸ CORBA (Common Object Request Broker Architecture) standard from OMG (CORBA IDL, CORBA bus)
    ∙ http://www.corba.org
  ▸ GridRPC (no standard IDL, different communication implementations)
    ∙ http://en.wikipedia.org/wiki/GridRPC
  ▸ Web Services standard from the W3C (WSDL, SOAP, XML)
    ∙ http://www.w3.org/TR/{wsdl,soap}

Service-Oriented-Architectures (SOA) in a nutshell
► Basic principles
[Figure label: Messages]
► A service is an exposed piece of functionality with 3 properties:
  (1) The interface contract to the service is platform-independent
  (2) The service can be dynamically located and invoked
  (3) Services maintain a relationship that minimizes dependencies and only requires that they maintain an awareness of each other (loose coupling)

SOA principles
► Reuse, modularization
  ▸ Language independence
  ▸ Loose coupling
  ▸ Black-box: Hide internal logic from the outside world
  ▸ Autonomy: Control over internal logic
  ▸ Granularity:
    ∙ Logic divided into services to promote reusability
    ∙ Yet, functionality is presented at a granularity recognized by the user as a meaningful service
► Interoperability
  ▸ Compliance to standards (interface, protocols)
  ▸ Contract: services adhere to a communication and quality of service agreement
  ▸ Composability: Collections of services can be coordinated and assembled to form composite services

SOA principles
► Usability
  ▸ Identification and categorization
    ∙ Possibly alternate services, providing different QoS for a given functionality
  ▸ Provisioning and delivery
  ▸ Monitoring and tracking
► Service
framework
  ▸ Encapsulation: Many services are consolidated to be used under the SOA
  ▸ Discoverability: Services are designed to be outwardly descriptive so that they can be found and assessed via available discovery mechanisms

Use-case scenario
[Figure: two users, X and Y]
► X wants to use the algorithm of Y on his data
► Y wants to test his algorithm on X's data
► Underlying issues:
  ▸ How to share algorithms?
  ▸ How to share data?

Usual solutions
► Share codes
  ▸ Y creates a code repository (or a .tgz) with his C code
  ▸ X checks it out, compiles it and tests the algorithm
  ▸ Technical problems:
    ∙ Y's code does not compile on X's machine and X does not know how to (or does not want to) debug C
    ∙ Y created a new version of the algorithm and X still uses the old one
► Share binaries
  ▸ X and Y need the same execution platform
  ▸ X needs to properly configure and deploy the software
► License problems (source availability, ...)

Use case revisited in SOA
[Figure: Y publishes an interface description (1); X writes a client (2); client X invokes the service at location Y (3)]
► Y has published the interface of its algorithm (1)
► X wrote a client to invoke it (2)
► X invokes Web-Service (3)
  ▸ All data is transferred as part of the WS parameters

Web Services basics
► Client/Server interaction
► Platform independent and language independent
  ▸ Client and server programs can be written in different languages, run in different environments
► Self-describing
  ▸ Once located you can ask it how to use it
► Standard
► Loosely coupled

Web-Services standard
► Originally from IBM and Microsoft
► Standardized by the W3C (WSDL, SOAP) and by OASIS (UDDI):
  ▸ Contract / Interface format: Web Services Description Language (WSDL)
  ▸ Messages format: Simple Object Access Protocol (SOAP)
  ▸ Discovery format: Universal Description Discovery & Integration (UDDI)
► Based on XML
  ▸ Text format
  ▸ Platform/language independence

Web services
► Typical use case
[Figure: discover, describe, invoke]

Service discovery: UDDI
► Current version: 3.0 (Feb. 2005) - OASIS
► UDDI is a set of Web-Services
► White pages:
  ▸ Web-Services classified w.r.t. service provider
► Yellow pages
  ▸ Web-Services classified w.r.t. service categories
  ▸ There might be several yellow pages per provider
► Green pages
  ▸ Web-Services classified w.r.t. technical interfaces
  ▸ Typically one green page per binding of a WS

Service description: WSDL
http://w3.org/TR/wsdl
► Describes the interface of the Web-Service
► Version described here: 1.1 (15th March 2001); WSDL 2.0 has been a W3C Recommendation since June 2007
► 7 basic XML tags, the first four describing what the service offers and the last three how to invoke it:
  ▸ Types: complex types made from basic types
  ▸ Message: set of parts (Input/Output parameters)
  ▸ Operation: set of messages (≃method)
  ▸ Port type: set of operations (≃class)
  ▸ Binding: protocol used to invoke the service (e.g. SOAP/HTTP)
  ▸ Port: internet address and port
  ▸ Service: set of ports
► Following example on an image registration Web-Service

WSDL structure: <types>
Types definition, based on XSD primitive types (the XML standard types, e.g. xsd:string)
  <types>
    <complexType name="image">
      <sequence>
        <element name="file-name" type="xsd:string" />
        <element name="modality" type="xsd:string"/>
      </sequence>
    </complexType>
  </types>

WSDL structure: <message>
Messages to be exchanged (as request or response)
  <message name="registrationRequest">
    <part name="reference-image" type="ns:image"/>
    <part name="floating-image" type="ns:image"/>
  </message>
  <message name="transformation"> ... </message>

WSDL structure: <portType> and <operation>
Operations (request + response message) and collections of operations (port type)
  <portType name="registrationPortType">
    <operation name="register">
      <input message="registrationRequest"/>
      <output message="transformation"/>
    </operation>
  </portType>

WSDL structure: <binding>
Protocol binding
  <binding name="SOAP11Binding" type="registrationPortType">
    <SOAP:binding style="rpc" transport="http"/>
  </binding>
transport: HTTP, SMTP...
style: “rpc” (Remote Procedure Call) or “document”

WSDL structure: <port> and <service>
Service definition and service endpoint (port)
  <service name="registration_service">
    <port name="registration1" binding="registrationB">
      <address location="http://rigid.registration.com:1234"/>
    </port>
    <port name="registration2" binding="registrationB">
      <address location="http://bestAlgos.fr/registration"/>
    </port>
  </service>

Invocation: SOAP
http://w3.org/TR/soap
► Describes the format of the messages
► Current version 1.2 - 24th June 2003
► Various underlying protocols (HTTP, SMTP, FTP...)
► SOAP envelope:
  ▸ Header
  ▸ Body: method invocation
► Faults mechanism
► Attachment of binary data

SOAP (over HTTP) example
  POST / HTTP/1.1
  Host: localhost:18006
  User-Agent: gSOAP/2.7
  Content-Type: text/xml; charset=utf-8
  Content-Length: 544
  Connection: close
  SOAPAction: ""

  <?xml version="1.0" encoding="UTF-8"?>
  <SOAP-ENV:Envelope>
    <SOAP-ENV:Body>
      <ns:register>
        <reference-image>
          <file-name>image1</file-name>
          <modality>MRI</modality>
        </reference-image>
        <floating-image>
          <file-name>image2</file-name>
          <modality>MRI</modality>
        </floating-image>
      </ns:register>
    </SOAP-ENV:Body>
  </SOAP-ENV:Envelope>
[Figure: a SOAP message is made of HTTP headers followed by a SOAP envelope; the envelope holds a SOAP header (headers) and a SOAP body (method call & data)]

Interoperability issues in practice
► Problems with specific data structures
  ▸ Complex types
  ▸ Alternatives in choosing data types
    ∙ Strong typing (more control, less interoperability)
    ∙ Weak typing: “everything is a string” (eases interoperability, no control)
► Problems with protocols supported
  ▸ SOAP 1.1 / SOAP 1.2 / HTTP POST
► Problems with encoding styles
  ▸ RPC / RPC literal / document literal
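
The envelope/body structure of the SOAP (over HTTP) example can be reproduced with Python's standard `xml.etree.ElementTree` module. This is only a sketch under stated assumptions: the SOAP 1.1 envelope namespace is used, the service namespace `urn:example:registration` and both function names are hypothetical, and a real toolkit (gSOAP, Axis...) would generate this code from the WSDL instead.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:registration"                 # hypothetical service namespace


def build_register_request(reference, floating):
    """Serialize a call to the 'register' operation as a SOAP envelope (bytes)."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    ET.register_namespace("ns", SERVICE_NS)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}register")
    # One child element per message part, as declared in the WSDL <message>.
    for part_name, image in (("reference-image", reference), ("floating-image", floating)):
        part = ET.SubElement(call, part_name)
        ET.SubElement(part, "file-name").text = image["file-name"]
        ET.SubElement(part, "modality").text = image["modality"]
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def parse_register_request(payload):
    """Server side: recover the native data structures from the envelope."""
    root = ET.fromstring(payload)
    call = root.find(f"{{{SOAP_ENV}}}Body/{{{SERVICE_NS}}}register")
    return {part.tag: {field.tag: field.text for field in part} for part in call}


request = build_register_request(
    {"file-name": "image1", "modality": "MRI"},
    {"file-name": "image2", "modality": "MRI"},
)
decoded = parse_register_request(request)
```

Every parameter travels as text inside the XML body, which is where the serialization cost and the typing issues listed above come from.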

Other limitations of Web-Services
► Statelessness
  ▸ Services as complete black-box is a desirable property in a pure SOA but it may bring limitations
► Low performance (text protocols)
  ▸ Alternatives: GridRPC (DIET [http://graal.ens-lyon.fr/DIET], NetSolve, Ninf-G) and other binary protocols
► Low scalability
  ▸ Static endpoint declaration
  ▸ No scheduling / load-balancing framework
► OGSA / WS-RF extensions
  ▸ Name and do bindings
  ▸ Start and end services
  ▸ Query, subscription, and notification
  ▸ Share error messages

RESTful services

REpresentational State Transfer
► REST is an architectural style
  ▸ High-performance, scalable client-server interaction
  ▸ Based on Web standards
    ∙ HTTP 1.1 protocol
    ∙ XML and JSON data formats
    ∙ URI identifiers
  ▸ Create/Read/Update/Delete (CRUD) actions only
    ∙ Binding HTTP POST/GET/PUT/DELETE messages to CRUD actions
    ∙ Must conform to correct message use
► RESTful services
  ▸ Are completely stateless
    ∙ Minimal load on server to improve scalability
  ▸ Use HTTP methods explicitly
  ▸ Expose structured data as URI trees
    ∙ Make data self-explanatory
  ▸ Transfer XML and/or JSON
    ∙ Independently of the internal client / server data structures

RESTful vs Web Services
► RESTful services
  ▸ Are typically lighter, simpler and minimize impact on server
  ▸ Can only use HTTP protocol
  ▸ Convert all data to XML/JSON
  ▸ Require client and server to have common data structures and context
    ∙ No self-explanatory WSDL description
  ▸ Do not benefit from message extensions
    ∙ No SOAP messaging
► Web Services
  ▸ Are typically heavier and leverage client / server complete decoupling
  ▸ Define WSDL contracts
  ▸ Embed higher-level functionality in SOAP messages (security, addressing, etc.)
  ▸ Support multiple protocols
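
The binding of HTTP POST/GET/PUT/DELETE to CRUD actions on a URI tree can be sketched without any web framework. The following Python example is an illustration only: the `RestResource` class, the `/images` URI tree and the JSON layout are hypothetical, and the `handle` method stands in for what an HTTP server would do on each request.

```python
import json


class RestResource:
    """In-memory collection exposed under /images (hypothetical URI tree)."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def handle(self, method, path, body=None):
        """Dispatch one HTTP-like request; return (status code, JSON text or None)."""
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "images" or len(parts) > 2:
            return 404, None
        if len(parts) == 1:                       # the collection URI: /images
            if method == "POST":                  # Create
                item_id = self._next_id
                self._next_id += 1
                self._items[item_id] = json.loads(body)
                return 201, json.dumps({"uri": f"/images/{item_id}"})
            if method == "GET":                   # Read (list of member URIs)
                return 200, json.dumps(sorted(f"/images/{i}" for i in self._items))
            return 405, None
        if not parts[1].isdigit() or int(parts[1]) not in self._items:
            return 404, None
        item_id = int(parts[1])                   # a member URI: /images/<id>
        if method == "GET":                       # Read
            return 200, json.dumps(self._items[item_id])
        if method == "PUT":                       # Update
            self._items[item_id] = json.loads(body)
            return 200, json.dumps(self._items[item_id])
        if method == "DELETE":                    # Delete
            del self._items[item_id]
            return 204, None
        return 405, None


store = RestResource()
created = store.handle("POST", "/images", json.dumps({"file-name": "image1", "modality": "MRI"}))
```

Each request carries everything needed to process it (method, URI, representation), so the server keeps the resources but no per-client session state.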

WS-* specifications

Open Grid Service Architecture
► OGSA aims at delivering improved services for compute intensive applications
  ▸ Defines a higher level service management system
  ▸ Defines a standard interface for grid services (extended WSDL)
  ▸ Obsoleted and now integrated in Web Services Resource Framework (WSRF)
► OGSA services vs regular Web Services
  ▸ Transient (destroyable) grid services vs persistent web services
  ▸ Factory services to manage other service instances
  ▸ Stateful services: information may be stored on server side and made available among multiple clients
    ∙ Getting away from the “service as a black box” model
  ▸ Service information metadata

Services lifecycle management
► Services management
  ▸ Service creation
  ▸ Life time
  ▸ Fault management
  ▸ State management
  ▸ Publication
  ▸ …
► Define service instantiation to support decentralization
  ▸ Multiple instances of a given service
► But no real support for load distribution or fault tolerance
  ▸ It is up to services to define their load management policy and reactions to faults

Stateful vs Stateless services
► Discoverable and accessible state
► A state is anything the service needs to expose

Web Services Resource Framework
► Resources: keep web services and state separate
  ▸ States are stored externally in “resources”
  ▸ Each resource has a unique key
  ▸ Resources can be anything

Resources Framework
► Web Service + Resource = WS-Resource
► Address of a WS-resource is called an endpoint reference

Standard interface
► Service information
► State representation
  ▸ Resource
  ▸ Resource Property
► State identification
  ▸ Endpoint Reference
► State Interfaces
  ▸ GetRP, SetRP, QueryRPs, GetMultipleRPs
► Lifetime Interfaces
  ▸ SetTerminationTime
  ▸ ImmediateDestruction
► Notification Interfaces
  ▸ Subscribe
  ▸ Notify
► ServiceGroups
[Figure: a client, a Web Service and a resource with its resource properties; operations shown: GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTerm Time, Destroy; the resource is designated by an endpoint reference]

WS-Notification
► Event notifications (asynchronous communications)
  ▸ Publication/Subscription model for WS
► Naming
  ▸ Every resource can be uniquely referenced
► Lifecycle
  ▸ Resources created following factory pattern / destroyed
► Information model (monitoring & discovery)
  ▸ Properties associated with resources (with queriers and setters)
  ▸ Asynchronous notification of changes to properties
► Service Groups (registries & collective services)
  ▸ Group membership rules & membership management

Other frameworks
► WS core led to a family of frameworks (WS-*) grouped in WS-RF
[Figure: layered stack, from top to bottom]
  Applications of the framework (Compute, network, storage provisioning, job reservation & submission, data management, application service QoS, …)
  WS-Agreement (Agreement negotiation)
  WS Distributed Management (Lifecycle, monitoring, …)
  WS-Resource Framework & WS-Notification (Resource identity, lifetime, inspection, subscription, …)
  Web services (WSDL, SOAP, WS-Security, WS-ReliableMessaging, …)

Practical use of Services in large-scale systems

Web-Services in practice
► What is needed to build a Web-Service (server side)
  ▸ Generate WSDL from a method implementation
  ▸ Receive HTTP requests (web server)
  ▸ Interpret SOAP requests (stub)
► What is needed to invoke a WS (client side)
  ▸ Generate method header from WSDL
  ▸ Write SOAP requests (stub)
► Language-specific tooling for generating Web Services
[Figure: on the client, business code, native data types, data bindings, serialization and stub; on the server, skeleton, serialization, data bindings, native data types and service code; both sides are connected by the transport layer]

Web Services container
► Web services: software exposing a set of operations
► Container
  ▸ SOAP Engine: handles SOAP requests and responses (e.g. Apache Axis)
  ▸ Application Server: provides “living space” for applications that must be accessed by different clients (e.g. Tomcat)
  ▸ HTTP server: also called a Web server, handles HTTP messages
► Managing contained services
  ▸ Services deployment (static / hot)
  ▸ Network connections manager / firewall
  ▸ Secure communications (HTTPS)

Web-Services development toolkits
► Apache Axis - http://ws.apache.org/axis2
  ▸ Platform: UNIX, Windows
  ▸ Language: Java
  ▸ Web server: Tomcat container / Apache server
  ▸ Stubs and bindings: wsdl2java, java2wsdl
► Sun Metro - https://metro.dev.java.net/
  ▸ Platform: SUN OS, UNIX, Windows
  ▸ Language: Java
  ▸ Web server: Glassfish / Tomcat container
  ▸ Stubs and bindings: wsgen (annotated classes), wsimport
► gSOAP - http://gsoap2.sourceforge.net/
  ▸ Platforms: UNIX, Windows
  ▸ Language: C/C++
  ▸ Web server: stand-alone / Apache module
  ▸ Stubs: wsdl2h

WS development toolkits (2)
► Microsoft .Net - http://www.microsoft.com/NET/
  ▸ Platform: Windows
  ▸ Languages: C, C++, C#, Visual Basic
  ▸ Web server: Microsoft IIS
► SOAP::Lite - http://www.soaplite.com
  ▸ Platforms: UNIX, Windows
  ▸ Languages: Perl, PHP
  ▸ A Web-Service client in 3 lines

Globus Software: dev.globus.org
Globus projects (figure labels): MPICHG2, GridWay, Incubator Mgmt, Common Runtime, GT4, Java Runtime, Delegation, MyProxy, Data Rep, Replica Location, C Runtime, CAS,
GSIOpenSSH, GridFTP, MDS4, GRAM, Reliable File Transfer, GT4 Docs, Python Runtime, Security, OGSA-DAI, C Sec, Execution Mgmt, Data Mgmt, Info Services, Other

Globus Toolkit
► A collection of services to address common requirements
► Software for grid infrastructure
  ▸ Strong emphasis on security and distributed resources
► Tools to build applications that exploit grid infrastructures
  ▸ Extensible framework, WS-RF compliant
  ▸ Services container
► Open source & open standards
  ▸ http://www.globus.org
► Enabler of a rich tool and service ecosystem
  ▸ With large community effort and incubated services
  ▸ It is “only” a toolkit: a grid middleware needs to be composed out of it

Globus Toolkit and applications
► Globus supports WSRF and legacy WS services
[Figure: user applications call custom Web Services, custom WSRF services and Globus WSRF Web Services, hosted with registry and admin services in the Globus container (e.g., Apache Axis); protocols: WS-A, WSRF, WS-Notification, WSDL, SOAP, WS-Security]

Data exchanges
► Serialization
  ▸ Data embedded in SOAP messages
  ▸ Converted into XML (text)
  ▸ Performance killer for large data sets
► Attachment
  ▸ Binary data attached to SOAP messages
  ▸ Interoperability issues
  ▸ Still not efficient (in a distributed environment, the client may not host the data)
► References
  ▸ SOAP messages only contain references to the data
  ▸ The grid provides a common reference space

Sharing data on a distributed system
► Each file is identified by a Logical File Name (LFN)
[Figure: client X and storage resources holding LFN1, LFN2 and LFNresult; (1) register images, (2) invoke (LFN1, LFN2) at service location Y, (3) get images, (4) put result, (5) reply(LFNresult)]
► X sends LFNs of its images to Y's Web-Service (2)
► Y's Web-Service sends back LFNs to the client (5)
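
The five steps of this exchange can be traced with a toy Python sketch in which only Logical File Names travel between client and service. All names are hypothetical: the `FileCatalog` class stands in for the grid storage resources, the `lfn:/...` naming scheme is invented, and the "registration" is a placeholder computation rather than Y's algorithm.

```python
import itertools


class FileCatalog:
    """Toy stand-in for grid storage: maps Logical File Names to file content."""

    def __init__(self):
        self._files = {}
        self._counter = itertools.count(1)

    def register(self, lfn, content):
        self._files[lfn] = content

    def fetch(self, lfn):
        return self._files[lfn]

    def new_result_lfn(self):
        # Hypothetical naming scheme for result files.
        return f"lfn:/results/result-{next(self._counter)}"


def registration_service(catalog, lfn_reference, lfn_floating):
    """Service Y: receives only LFNs, resolves them itself, replies with an LFN."""
    reference = catalog.fetch(lfn_reference)              # (3) get images
    floating = catalog.fetch(lfn_floating)
    # Placeholder computation standing in for the registration algorithm.
    result = f"offset={len(floating) - len(reference)}".encode()
    lfn_result = catalog.new_result_lfn()
    catalog.register(lfn_result, result)                  # (4) put result
    return lfn_result                                     # (5) reply(LFNresult)


catalog = FileCatalog()
catalog.register("lfn:/images/image1", b"\x00" * 10)      # (1) register images
catalog.register("lfn:/images/image2", b"\x00" * 14)
lfn_result = registration_service(                        # (2) invoke (LFN1, LFN2)
    catalog, "lfn:/images/image1", "lfn:/images/image2")
```

The invocation message stays small whatever the image size, since the data itself never goes through the SOAP layer.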

Distributing computations
► X1...X1000 want to access Y’s algorithm
[Figure: clients X1, X2, X3, X4, ..., X10^6; storage resources holding LFN1, LFN2 and LFNresult; (1) register images, (2) invoke (LFN1, LFN2) at service location Y, (3) Y submits/monitors grid job on computing resources, (4) get images, (5) put result, (6) reply(LFNresult)]
► Y submits jobs to answer computing demand

Service wrapper
► Embedding legacy codes (non-service oriented)
  ▸ Common in scientific areas
  ▸ Heavy-weight, difficult to recompile, modify and validate
  ▸ Provide a standard interface, description and remote invocation capability
► Hiding data transfers
  ▸ Including grid data access protocol and data referencing

Generic Service Wrapper
► The remote job handling can be decoupled from Y
  ▸ Independent from middleware evolutions
[Figure: storage elements holding LFNalgo, LFN1, LFN2 and LFNresult; (1) Y registers the algorithm (LFN-algo), (2) Y publishes its description, (3) X gets the algorithm description, (4) X registers images, (5) X invokes (LFN1, LFN2, algo desc.) at the generic service location, (6) the generic service submits/monitors a grid job on computing resources, (7) get algo, (8) get images, (9) put result, (10) reply(LFNresult)]

Generic algorithm descriptor
► Executable access method:
  ▸ URL
  ▸ Grid file
► Input/Output
  ▸ Command-line options
  ▸ Access methods (for files)
► Sandbox files
  ▸ Dynamic libraries
  ▸ Config files,...
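
A descriptor with these three parts (executable access, inputs/outputs with command-line options, sandbox files) carries enough information for a generic wrapper to assemble a command line and a list of file transfers. The Python sketch below is an interpretation, not the wrapper's actual logic: the `plan_invocation` function is hypothetical, the embedded descriptor mirrors the example used in this lecture, and the rule "parameters with an `<access>` child are files to transfer and are passed by local name" is an assumption.

```python
import posixpath
import xml.etree.ElementTree as ET

DESCRIPTOR = """<description>
  <executable name="ExecThisCode.pl">
    <access type="URL"><path value="http://somewhere.eu/"/></access>
    <value value="ExecThisCode.pl"/>
    <input name="image" option="-im1"><access type="LFN"/></input>
    <input name="scale" option="-s"/>
    <output name="crest_lines" option="-c2"><access type="LFN"/></output>
    <sandbox name="UtilityProgram.pl">
      <access type="URL"><path value="http://elsewhere.dk/"/></access>
      <value value="UtilityProgram.pl"/>
    </sandbox>
  </executable>
</description>"""


def plan_invocation(descriptor_xml, values):
    """Return (argv, files to download, files to upload, sandbox files)."""
    executable = ET.fromstring(descriptor_xml).find("executable")
    argv = [executable.find("value").get("value")]
    downloads, uploads = [], []
    for element in executable:
        if element.tag not in ("input", "output"):
            continue
        value = values[element.get("name")]
        if element.find("access") is not None:
            # Assumed rule: a parameter with an access method is a file;
            # it is transferred and passed to the code by its local name.
            (downloads if element.tag == "input" else uploads).append(value)
            value = posixpath.basename(value)
        argv += [element.get("option"), value]
    sandbox = [s.find("value").get("value") for s in executable.findall("sandbox")]
    return argv, downloads, uploads, sandbox


argv, downloads, uploads, sandbox = plan_invocation(
    DESCRIPTOR,
    {"image": "lfn:/images/image1.mhd", "scale": "2", "crest_lines": "lfn:/results/crest.out"},
)
```

Because the legacy code is only described, not modified, the same generic service can wrap any executable for which such a descriptor exists.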
  <description>
    <executable name="ExecThisCode.pl">
      <access type="URL">
        <path value="http://somewhere.eu/"/>
      </access>
      <value value="ExecThisCode.pl"/>
      <input name="image" option="-im1">
        <access type="LFN" />
      </input>
      <input name="scale" option="-s"/>
      <output name="crest_lines" option="-c2">
        <access type="LFN" />
      </output>
      <sandbox name="UtilityProgram.pl">
        <access type="URL">
          <path value="http://elsewhere.dk/"/>
        </access>
        <value value="UtilityProgram.pl"/>
      </sandbox>
    </executable>
  </description>

Generic Application Service Wrapper
► Provide service wrapper to non instrumented code
► Handle data transfers (references to grid data)
[Figure: the same generic service, given an algorithm parameters description, exposes Input 0, Input 1 and Output 0 for legacy code 1 and for legacy code 2; the descriptor shown next to it is the one above]

GASW setup
► Generic Application Service Wrapper
  ▸ Provide service wrapper to non instrumented code
  ▸ Handle data transfer (references to grid data)
► Invocation
Figure labels: User, Web Server, WSDL Contract, Code descriptor, Storage Resource, Executable, Input file, Required library, Wrapper script, GASW host, Input files transfer, Command-line, Output files transfer, Remote Job, Job submission service, Computing resource

GASW invocation
Figure labels: Storage Resource, User, Web
Server, WSDL Contract, Executable, Code descriptor, Input file, Required library, Wrapper script, Input files transfer, Command-line, Output files transfer, GASW host, Execute, Job submission service, Computing resource, Output file

GASW results
Figure labels: Storage Resource, User, Web Server, WSDL Contract, Executable, Code descriptor, Input file, Required library, Output file, Wrapper script, Input files transfer, Command-line, Output files transfer, GASW host, Job submission service, Computing resource

Service wrapper
► Possibly supporting other non-functional concerns
  ▸ Security (credentials, delegation, encryption of data...)
  ▸ Optimization (requests grouping, bulk submission...)
  ▸ ...
► Bridging service-oriented environment (metacomputing flavor) with global computing systems

Delegation service example
► A service needs to act on behalf of the client
  ▸ The client holds long-lived credentials
► Specific delegation service sharing and authentication resource
  ▸ Delegate once, share across services and invocation
  ▸ Medium-lived credentials
► Authentication protocol independent approach
[Figure: a hosting environment with Service1, Service2, Service3, their resources and a delegation service; the client delegates and refreshes its credentials at the delegation service and passes the resulting EPR to the services]

OGSA-DAI Example
► OGSA-compliant Data Access and Integration Service
► Objectives: sharing and accessing data resources in a grid
  ▸ Heterogeneous
  ▸ Distributed
► Scalability and other data-maintenance concerns prevent the use of a centralized server
[Figure: a client sends a FR query and receives FR data]

OGSA-DAI architecture
► Service-based framework
  ▸ access to
heterogeneous data resources
  ▸ provides services for data access, integration, transformation and delivery
  ▸ executes data-centric workflows
[Figure: the client sends a FR query to the OGSA-DAI service, which issues UK, ES and IA queries, collects UK, ES and IA data, then translates and joins them into FR data]

Query workflow
[Figure: the French query "SELECT Pays,Capital FROM Pays" is converted from French to English ("SELECT Country, Capital FROM Countries") and from French to Spanish ("SELECT País, Capital FROM Países"); each SQL query is run, its data are converted back to French, and the data are joined]
  English result            Converted from English to French
  Country  Capital          Pays             Capital
  UK       London           Grande-Bretagne  Londres
  France   Paris            France           Paris

  Spanish result            Converted from Spanish to French
  País     Capital          Pays     Capital
  España   Madrid           Espagne  Madrid
  Italia   Roma             Italie   Rome

  Joined data
  Pays             Capital
  Grande-Bretagne  Londres
  France           Paris
  Espagne          Madrid
  Italie           Rome

Cloud computing

Cloud computing
► Delivering computing resources as a service, on demand
  ▸ Computing resources are allocated through a service interface
► Three levels of service commonly identified
  ▸ IaaS: Infrastructure as a Service (compute, data and network resources)
  ▸ PaaS: Platform as a Service (infrastructure + facilities for application development, deployment, maintenance)
  ▸ SaaS: Software as a Service (complete application stack)
► Motivations
  ▸ Convergence of services industry and infrastructure provision
  ▸ Externalize management of computing resources to external resource providers
  ▸ Scale-effect: benefit from very large data-centers
  ▸ Business-centric: pay for resources as you go

History of on-demand
computing

Cloud and Grid computing
► Clouds focus is on resources provision
► Grid focus is on middleware capability
► Complementary
  ▸ Clusters, grid computing and cloud computing: the combination of multiple computers into larger metacomputers
  ▸ Clouds focus on infrastructure provision, yet require middleware to operate
  ▸ Cloud solutions are more or less flexible in terms of services / middleware provided

Virtualization
► Abstraction of computer resources
  ▸ Make it possible for a single physical computer to emulate several (different) virtual machines
  ▸ Separate the OS from the underlying platform resources
  ▸ Several OSes and software stacks can co-exist on a physical resource
  ▸ Resources (CPU time, memory, network...) are shared among virtual machines
► Why?
  ▸ Hardware is underutilized, especially multi-core machines
  ▸ Some services require permanent operation but limited machine resources
  ▸ Grouping services on a single host is much more energy efficient
  ▸ Data centers run out of space
  ▸ Increased flexibility and lower system administration costs

Hardware virtualization principle
► Add a hypervisor layer between OS and hardware
[Figure: on the left, applications on a single O.S. whose privileged code runs directly on the hardware; on the right, several virtual machines (applications on their own O.S.) whose privileged code goes through a hypervisor, sharing hardware resources]
► Different techniques
  ▸ Full virtualization through binary translation or CPU-assistance
  ▸ Kernel-level virtualization
  ▸ Paravirtualization for hypervisor-aware systems

Unmodified OS virtualization
► Run native guest OSes on top of a host OS
Figure labels: App, O.S., Load & Translate privileged instructions, Privileged code,
Translated code, Privileged code, Hypervisor, Hardware
► Pros/cons
  + Works with any OS unmodified
  + Host OS unaware of virtualization layer
  - Complex virtualization process with binary code translation and CPU calls emulation
  - Not all hardware devices are usually supported
  - Performance impact of emulation
► Example: VMware

Kernel-level virtualization
► Homogeneous OSes running on the same hardware
[Figure: applications run on compatible O.S. instances, each with a custom environment, custom system libraries and a custom root file system, on top of a host (linux) virtualization kernel whose privileged code runs on the hardware]
► Pros/cons
  + reduced overhead of virtualization process
  - limited to homogeneous OSes
► Example: KVM

Para-virtualization
► Cooperating OSes and hypervisor
  ▸ Each OS is controlled by a Virtual Machine Monitor (VMM) in the hypervisor
[Figure: applications run on O.S. instances that issue hypervisor calls to their VMM in the hypervisor, whose privileged code runs on the hardware]
► Pros/Cons
  + Low hypervisor overhead
  - With compliant OSes only
► Example: Xen

Hardware-assisted virtualization
► CPU support for virtualization
  ▸ Trapping privileged access from OS and notifying hypervisor
Figure labels: App, O.S.,
Privileged code, CPU call trapping, Hypervisor, VMM, Hardware
► Pros / Cons
  + Combine advantages of para-virtualization and full virtualization (unmodified OSes support)
  + Reduced privileged code trapping overhead
  - Still some performance penalty, several generations of hardware support
► Examples: Intel-VT and AMD-V compliant processors

More virtualization
► Virtual memory unit virtualization
  ▸ Memory is the most critical component to be shared between several guest OSes
  ▸ Hardware-assisted memory virtualization in latest x86 CPUs
► Devices virtualization
  ▸ Other devices may require virtualization (GPUs, network...)
  ▸ Network virtualization gives more control / guarantee on network bandwidth dedicated to applications
  ▸ Network devices increasingly programmable in software
► System manipulations by hypervisor
  ▸ Suspend / resume / stop system activity
  ▸ Write VM memory state on disk / restore
  ▸ Migrate VM from one host to another
  ▸ Checkpoint and run fail-safe backup VM...

Use of virtualization technologies in Clouds
► Virtual Machine
  ▸ A virtualized machine emulates a physical computing environment (including OS, libraries, applications) and requests hardware resources from a hypervisor
  ▸ VMs can be started (OS boot) / paused / resumed / stopped
  ▸ VMs are bundled in image files, easily transported
► Resources allocation
  ▸ Physical resources can be allocated to a VM on-demand (e.g. amount of memory available, number of CPU cores...)
A computer can host as many VMs as its physical resources can bear Cloud infrastructures use VMs ▸ ▸ For user environment installation and deployment To easily balance load over existing physical servers Master Ubinet: Large-Scale Distributed Computing (2) Johan Montagnat 75 Cloud computing ► ► Strong industry involvement Use virtualization to adapt to all customer needs ▸ ▸ ▸ To adapt hardware environments to software needs rather than the opposite Free-up the relationship between software and hardware Large hypervisor pool to allocate resources on-demand ∙ ► Automation ▸ ► Elastic resources provisioning framework Virtualized VMs can easily be managed to reach garanteed quality of services Service-oriented ▸ ▸ The cloud itself is a service accessible / configurable through an API Quality of service is usually guaranteed by contract Master Ubinet: Large-Scale Distributed Computing (2) Johan Montagnat 76 Heterogeneous interfaces Applications for thin clients Increasing level of complexity and vendor lock-in Browser-based, desktop-based, mobile Client Services Applications (SaaS) Platform (PaaS) Storage Infrastructure (IaaS) (e.g. Google/Yahoo Maps, WS-2.0) Some standardization of WS 2.0 Online apps (e.g. Google Apps) Completely vendor specific Cloud IDE (e.g. AppEngine) Complete development environment (OS, compilers, database server, web server...) Data storage (e.g. S3) Decouple long-term storage from computing Computing resources (e.g. EC2) Physical resources an hypervisors Master Ubinet: Large-Scale Distributed Computing (2) Johan Montagnat 77 Amazon Web Services example http://aws.amazon.com ► ► Extensive cloud-oriented services portofolio Web Service-oriented API ▸ ▸ ► SOAP and/or RESTful API Bindings for several languages From simple VM delivery to complete business hosting ▸ IaaS ∙ ▸ PaaS ∙ ∙ ∙ ∙ ∙ ∙ ▸ Virtual servers, storage servers, VPN... 
Multi-platform development environments, integration tools Identity management, data security Elastic scaling, workload splitting, load balancing Platform monitoring Databases (relational and NoSQL), web server, email server ... SaaS ∙ ∙ Data analysis services, workflow managers Communication services, data transcoding and streaming Master Ubinet: Large-Scale Distributed Computing (2) Johan Montagnat 78 Conclusions ► Service-oriented architectures widely adopted ▸ ► WS-RF architectures enable the design and deployment of Grid middlewares ▸ ▸ ▸ ► Toolkits available Interoperability issues addressed Basics for managing collection of services Still, tough problems have to be addressed when deploying a large-scale distributed system ▸ ▸ ▸ ► Convergence of distributed computing and web communities Scalability of all services Fault tolerance Deployment processes With the emergence of the Cloud concept, infrastructure provision also became a service Master Ubinet: Large-Scale Distributed Computing (2) Johan Montagnat 79 References ► Web Services ▸ ► OGSA ▸ ► http://www.w3.org/2002/ws “The GRID 2: blueprint for a new computing infrastructure”, Ian Foster, Carl Kesselman. Elsevier. Service wrapper ▸ "A Service-Oriented Architecture enabling dynamic services grouping for optimizing distributed workflows execution". T. Glatard, J. Montagnat, D. Emsellem, D. Lingrand. Future Generation Computer Systems (FGCS), 24 (7), pages 720–730, Elsevier, july 2008. Master Ubinet: Large-Scale Distributed Computing (2) Johan Montagnat 80
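
Supplementary sketch for the slide "Use of virtualization technologies in Clouds": the two resource-allocation points (a host can carry as many VMs as its physical resources can bear, and VMs help balance load over the physical servers) can be illustrated with a toy placement routine. This is only an illustration under stated assumptions, not how any real hypervisor or cloud scheduler works: the names `VM`, `Host` and `place_vm` are hypothetical, only CPU cores and memory are tracked, and the "least loaded host first" policy is one arbitrary choice among many.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    """Resources requested by a virtual machine (hypothetical model)."""
    name: str
    cores: int
    memory_gb: int


@dataclass
class Host:
    """A physical server with a fixed capacity and the VMs it currently runs."""
    name: str
    cores: int
    memory_gb: int
    vms: list = field(default_factory=list)

    def free_cores(self) -> int:
        return self.cores - sum(vm.cores for vm in self.vms)

    def free_memory_gb(self) -> int:
        return self.memory_gb - sum(vm.memory_gb for vm in self.vms)

    def can_host(self, vm: VM) -> bool:
        # A host accepts a VM only while its physical resources can bear it.
        return vm.cores <= self.free_cores() and vm.memory_gb <= self.free_memory_gb()


def place_vm(vm, hosts):
    """Place vm on the fitting host with the most free resources.

    Returns the chosen Host, or None when no host has enough capacity.
    """
    candidates = [h for h in hosts if h.can_host(vm)]
    if not candidates:
        return None
    # Assumed policy: prefer the host with the most free cores, then memory.
    target = max(candidates, key=lambda h: (h.free_cores(), h.free_memory_gb()))
    target.vms.append(vm)
    return target


if __name__ == "__main__":
    hosts = [Host("host-a", 8, 32), Host("host-b", 4, 16)]
    requests = [VM("vm-1", 4, 8), VM("vm-2", 4, 8), VM("vm-3", 4, 16), VM("vm-4", 2, 4)]
    for vm in requests:
        chosen = place_vm(vm, hosts)
        print(vm.name, "->", chosen.name if chosen else "rejected (no capacity)")
```

In this toy run the last request is rejected because both hosts are out of free cores. An elastic provisioning framework would instead react to such a rejection, for example by bringing another physical server into the pool.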