Overview of XtreemOS
Christine Morin
XtreemOS scientific coordinator
Phenix Workshop, Rennes
December 07, 2006
XtreemOS IP project is funded by the European Commission under contract IST-FP6-033576
Grid Environment & VO
• Multiple users from different institutions
• Multiple geographically distributed resources in different administrative domains
• Large scale
• Uncountable number of resources
• Dynamicity
• VO, users, resources
[Figure labels: VO1, VO2, WAN]
Overview of XtreemOS - Phenix Workshop, December 7, 2006
State of the Art
 Current OS are not Grid-aware & not VO-aware
 A variety of Grid middleware & Toolkits for Grid Computing
• Resource management
• Changing interfaces
• Security pitfalls
• Complexity for users, programmers & administrators
XtreemOS Objectives
 Design & implement a reference open source Grid operating
system based on Linux
– Native support for virtual organizations
 Validate the XtreemOS Grid OS with a set of real use cases on
a large Grid testbed
 Promote XtreemOS software in the Linux community and
create communities of users and developers
XtreemOS Research Challenges
 Identify fundamental functionalities to be embedded in Linux for
secure application execution in Grids
 Build a set of scalable self-healing OS services for secure
resource management in very large dynamic grids
 Provide a simple Grid API compliant with Posix while adding
new functionality and supporting Grid-aware applications
 Aggregate cluster resources into powerful grid nodes by
integrating single system image mechanisms in Linux
 Build an XtreemOS flavour for mobile devices enabling
ubiquitous access to grid resources
XtreemOS Flavours
[Figure: software stacks with layers labelled Application, Middleware, XtreemOS, Linux, Computer]
 PC
 Federation of PCs
– Cluster
 Mobile device
– PDA
– Mobile phone
XtreemOS Architecture
[Figure: architecture layers, listed top to bottom]
Scientific Applications | Business Applications
XtreemOS API
VO & Security | Application Management | Data Management
Infrastructure for Highly Available and Scalable Services
Linux-XOS: Grid-enabled Linux Operating System
Linux-XOS for PC | Linux-XOS for Cluster | Linux-XOS for Mobile Devices
XtreemOS Use Cases
 14 applications
– Simulation applications (aerospace, energy)
– Business applications
– Bioinformatics application
– Virtual reality application
– Finance application
– Telecom application
XtreemOS & Linux
 Acceptance in the Linux community is key for the success of the
XtreemOS project
– Packaging for multiple Linux distributions
 Mandriva Linux
 Red Flag Linux
 Debian
– Integration in OSCAR
– Get XtreemOS patches accepted in Linux OS
XtreemOS Project Phases
 Phase 1 (M1-M6)
– Specification of XtreemOS
 Phase 2 (M7-M18)
– Design and implementation of XtreemOS basic version
– Preliminary experiments with LinuxSSI
 Phase 3 (M19-M24)
– Integration of all XtreemOS components
– Delivery of first XtreemOS prototype
 Phase 4 (M25-M48)
– Evaluation with real use cases
– Design and implementation of advanced features of
XtreemOS
– Public releases
XtreemOS Sub-projects
 SP1 - Project Management
 SP2 - Linux for Virtual Organizations
 SP3 - Grid Support for Linux
 SP4 - Software integration, packaging, experimentation & validation
 SP5 - Communication, dissemination, exploitation & training
VO and Security Management
VO & Security Management
A VO can be seen as a temporary or permanent
coalition of geographically dispersed entities
(individuals, groups, organizational units or entire
organizations) that pool resources, capabilities and
information to achieve common objectives.
– Legal or contractual arrangements between
entities
– Resources can be physical equipment or other
capabilities such as knowledge, information or
data
Some Lessons from the State of the
Art
 Open issues
– Scalability of in-the-large VO management
• Short-lived VOs
– Ease of management of VO and VO identities
– Security and VO policy enforcement at the
node and site level
VO & Security Management
 Key components of VO
– Owner/administrator of the VO
– A set of participating users in different
participating domains
– A set of participating resources in different
participating domains
– A set of roles which users/resources can play
in the VO
– A set of rules/policies on resource availability
and access control
– A (renewable) expiry time of the VO
VO Lifecycle
 VO identification
– Identify and name VO candidates
 VO formation
– Creation and configuration of the VO according to the anticipated roles of members
 VO operation
– Members should be identified for effective logging and auditing
– The VO should be able to classify resources into different access control levels for effective management
 VO evolution
– Managing change in participating entities or in their conditions of use
– Members can be added and linked into a VO by authorization
– Users can be classified at different levels with associated operation rights
 VO dissolution
– Non-persistent information should be deleted, credentials reclaimed, and users and resource providers notified
– Should take place after all activities have finished
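The lifecycle phases above can be read as a small state machine. The following is a minimal sketch only: the class names, the transition table and the `active_jobs` counter are illustrative assumptions, since the slide lists the phases but not the allowed transitions or any API.

```python
from enum import Enum


class VOState(Enum):
    IDENTIFICATION = "identification"
    FORMATION = "formation"
    OPERATION = "operation"
    EVOLUTION = "evolution"
    DISSOLUTION = "dissolution"


# Assumed transitions: the slide names the phases but does not draw the arrows.
TRANSITIONS = {
    VOState.IDENTIFICATION: {VOState.FORMATION},
    VOState.FORMATION: {VOState.OPERATION},
    VOState.OPERATION: {VOState.EVOLUTION, VOState.DISSOLUTION},
    VOState.EVOLUTION: {VOState.OPERATION, VOState.DISSOLUTION},
    VOState.DISSOLUTION: set(),
}


class VirtualOrganization:
    """Hypothetical VO record that tracks its lifecycle phase."""

    def __init__(self, name):
        self.name = name
        self.state = VOState.IDENTIFICATION
        self.members = set()
        self.active_jobs = 0

    def advance(self, target):
        if target not in TRANSITIONS[self.state]:
            raise ValueError(f"{self.state.name} -> {target.name} not allowed")
        if target is VOState.DISSOLUTION:
            # Dissolution should take place only after all activities have finished.
            if self.active_jobs > 0:
                raise RuntimeError("activities still running")
            # Stands in for deleting non-persistent data and reclaiming credentials.
            self.members.clear()
        self.state = target
```

Encoding the "after all activities have finished" rule as a guard on the transition keeps the dissolution check in one place.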
VO Management
 Two levels
– VO level (administration)
• Performed by XtreemOS-G services
 Distributed information management for membership
tracking and accounting of users and resources
– Node level
• Performed by XtreemOS-F
• Add mechanisms to Linux OS for recognizing, controlling,
and enforcing usage of global Grid entities
 Grid identity management
 Resource access granting and accounting
 VO policy checking, auditing and enforcing
Node Level VO Management
 Minimal changes to the kernel code, to ease acceptance of VO-related changes by the Linux community
– Keep changes localized in dynamically loadable kernel modules
 Features
– PAM plug-in based authentication
– Static and dynamic identity mapping to local user/group ids
– Kernel-level key retention mechanisms
– ACL mechanisms
• A VO model that is flexible, secure, efficient and easily sustainable from a software engineering point of view
 Investigation of synergies with existing security enhancements for Linux
– Linux Security Module (LSM)
• Refinement of access control and enforcement mechanisms
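Static and dynamic identity mapping can be illustrated with a small user-space sketch. This is an assumption-laden illustration, not the XtreemOS mechanism: the `IdentityMapper` class, the UID pool range and the release step are all hypothetical.

```python
class IdentityMapper:
    """Hypothetical mapper from (VO, grid user) pairs to local UIDs."""

    def __init__(self, static_map, pool_start, pool_size):
        # Static entries: fixed mappings defined in advance by an administrator.
        self.static_map = dict(static_map)
        # Dynamic entries: UIDs leased on demand from a reserved pool.
        self.free = list(range(pool_start, pool_start + pool_size))
        self.dynamic = {}

    def map(self, vo, grid_user):
        key = (vo, grid_user)
        if key in self.static_map:
            return self.static_map[key]
        if key not in self.dynamic:
            if not self.free:
                raise RuntimeError("dynamic UID pool exhausted")
            self.dynamic[key] = self.free.pop(0)
        return self.dynamic[key]

    def release(self, vo, grid_user):
        # Return a leased UID to the pool, e.g. when the user's last job ends.
        uid = self.dynamic.pop((vo, grid_user), None)
        if uid is not None:
            self.free.append(uid)
```

The same grid user gets different local UIDs in different VOs here, which is one possible way to keep VO contexts separated on a node.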
Infrastructure for Highly Available
and Scalable Services
Infrastructure for Highly Available
and Scalable Grid Services
 Grid
– Very large number of nodes that are distributed worldwide
– Dynamicity: nodes join, leave, fail
 Applications
– Standalone (interact only with the user that launched
them)
– Services (present an interface to the outside world and
can be invoked)
• System level functionalities
• Application-level functionalities
 Targets of the infrastructure
– XtreemOS-G services
– Application-level services
Infrastructure for Highly Available
and Scalable Grid Services
 Management of collections of nodes
Infrastructure for Highly Available
and Scalable Grid Services
 Toolbox
– Facilities to construct structured collections
• Application initialization
• DHT, N-dimensional matrix, ranked nodes
– Distributed servers
• Present a single stable address to the external world
hiding the internal organization of the service
– Virtual nodes
• Fault tolerant groups of nodes capable of taking over each
other’s tasks
– Publish/Subscribe
• Useful for applications and also to build structured
collections
• Fully decentralized implementation
– Directory service
• Node monitoring and failure detection
• Adapt to the dynamicity of the monitored attributes
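One of the structured collections named in the toolbox is a DHT. A minimal sketch of the idea, assuming a consistent-hashing ring (a common DHT building block, not necessarily the design used in XtreemOS); the `Ring` class and node names are hypothetical.

```python
import bisect
import hashlib


def _h(key):
    """Hash a string to a position on the ring."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class Ring:
    """Toy consistent-hashing ring: each key is owned by the next node clockwise."""

    def __init__(self, nodes=()):
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            self.join(node)

    def join(self, node):
        p = _h(node)
        bisect.insort(self._points, p)
        self._owner[p] = node

    def leave(self, node):
        p = _h(node)
        self._points.remove(p)
        del self._owner[p]

    def lookup(self, key):
        if not self._points:
            raise LookupError("empty ring")
        i = bisect.bisect_right(self._points, _h(key)) % len(self._points)
        return self._owner[self._points[i]]
```

When a node joins or leaves, only the keys adjacent to it on the ring change owner, which suits the dynamicity (nodes join, leave, fail) described for the Grid.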
Application Management
Application Management
 Entities taking part in job execution
– Job
• One or more processes that collaborate to achieve a common goal
• Resource allocation unit
– Resources
• Physical or virtual component of limited availability within a
computer system
 Have static and dynamic characteristics
 Application execution management
– Job submission and scheduling
– Job and resource control
– Job and resource monitoring
Application Life Cycle
Application Execution Management
 AEM is as generic and flexible as possible
– Does not target specific users or types of jobs
 AEM allows users to exploit advantages of executing a job in a Grid
 AEM provides an easy to use job submission, control and monitoring
interface
– Unix-like submission (with default description of requirements)
– Batch-like submission
• Requirements
• Hints (additional information optionally provided by users)
– Adaptive and accurate monitoring
 AEM deals with Grid dynamicity
– Job migration and checkpointing
– Hide failures and changes from users as much as possible
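The two submission styles can be contrasted with a short sketch. Everything named here (`JobDescription`, `submit_unix_like`, `submit_batch`, the default values) is a hypothetical illustration; the slides do not define the AEM interface.

```python
from dataclasses import dataclass, field

# Assumed defaults used when a Unix-like submission gives no requirements.
DEFAULT_REQUIREMENTS = {"cpus": 1, "memory_mb": 512}


@dataclass
class JobDescription:
    command: list
    requirements: dict = field(default_factory=lambda: dict(DEFAULT_REQUIREMENTS))
    hints: dict = field(default_factory=dict)   # optional extra information from the user


def submit_unix_like(command):
    """Unix-like submission: only a command, requirements fall back to defaults."""
    return JobDescription(command=list(command))


def submit_batch(command, requirements, hints=None):
    """Batch-like submission: explicit requirements plus optional hints."""
    merged = {**DEFAULT_REQUIREMENTS, **requirements}
    return JobDescription(list(command), merged, dict(hints or {}))
```

For example, `submit_batch(["sim"], {"cpus": 16}, {"expected_runtime_s": 3600})` overrides the CPU count, keeps the default memory requirement, and passes the runtime estimate as a hint.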
Application Execution Management
 AEM has to guarantee access to authorized resources and their limited utilization
– Jobs executed in the context of a grid user and a VO
– Rely on VO and security management services (WP2.1, WP3.5)
 Scalability and fault tolerance taken into account in the design of AEM
– Most of AEM services are in the scope of a job which is suitable for scalability
• JExecMng and jMonitor could potentially have to manage hundreds of nodes
– JobDirectory and jController need to be fault tolerant
– WP3.2 services will be used as appropriate
• Resource discovery
• Distributed servers
 Tight integration with the Linux OS
– Enforcement in the usage of agreed resources (quota, access control)
• Job-id to be known by XtreemOS-F
– Users will have more information and control on how their jobs are running
• Performance metrics, occurred errors, exit status, …
 AEM provides a basic set of system-level functionalities
– Users may rely on user-level services (e.g. workflow manager, SAGA runtime)
Data management
Data Management
 XtreemFS
– Federated object-based file system for Grid environments
• Centralised metadata servers replaced by a federation of metadata
servers
 Independence of participating organizations while maintaining a global
view of the system
• Designed with wide-area networks in mind
 File replication
 Location and access management based on an intelligent monitoring
service
o Access pattern-aware replication
• Semantic naming and advanced query functions to allow users to
find data in huge archives
– Object Sharing Service (OSS)
• Inter-process communication via volatile memory, mapped files,
dynamically allocated objects and grid pipes
XtreemFS Components
 Object Storage Device (OSD)
– Data access in the file system
• Read/write access, concurrency control
– Object-based storage interface to hide complexity of
underlying block-based storage mechanisms
 Metadata and Replica Catalogue (MRC)
– Maintenance of all file system metadata
• Posix metadata
• Extended (user defined) metadata
• Information on replica locations
 Replica Management Service (RMS)
– Decides when files have to be replicated and with what
distribution among OSDs
– Replica removal
 Client
– Hosts running the access layer (file system adapter or
XtreemFS library)
• Linux traditional file system interface for transparent access to
MRC, OSD, RMS
• Native XtreemFS interface
Object Storage Device (OSD)
 Container of objects
– Reliably store and retrieve data from physical media
– Security enforcement for access to stored objects
• Capabilities built by MRC and received with each request
– Multi-object files
• Striping and/or replication
• Each file replica has its own striping policy
– Transactional files
• Changes performed on a local copy (and not forwarded to
other OSD) and committed or rolled back at some time
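Striping a multi-object file can be sketched as a mapping from a byte offset to an object and an OSD. The round-robin layout, the `locate` function and the OSD names below are assumptions for illustration; the slide only states that each replica has its own striping policy.

```python
def locate(offset, stripe_size, osds):
    """Return (osd, object_index, offset_in_object) for a byte offset.

    Assumes fixed-size objects distributed round-robin over the given OSDs.
    """
    if offset < 0 or stripe_size <= 0 or not osds:
        raise ValueError("invalid arguments")
    object_index = offset // stripe_size
    osd = osds[object_index % len(osds)]
    return osd, object_index, offset % stripe_size
```

With three OSDs and 4096-byte objects, offset 4096 falls at the start of object 1 on the second OSD, and object 3 wraps around to the first OSD again.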
Replica Management Service (RMS)
 Take care of autonomous creation and deletion of replicas
 Replication policies
– Must satisfy security needs and comply with local regulations
• Countries, real organization, VO, racks in a data centre
 Replica creation
– Gathering information from other services to decide when and
where to create a replica
• Each time a file is opened
 RMS is contacted to see if a better replica should be created
o Decision depends on the file size, OSD availability
o A client may start accessing a “bad replica” during the creation of a new
one
• MRC may keep track of opens to predict future access from the
previous ones
• AEM can inform RMS that a job is about to start its execution
 RMS can anticipate the creation of a new replica before the job
execution
 Removing “obsolete” replicas
– Lack of free space, file or replica very seldom used, nearby
replicas no longer useful, …
– A replica can be removed at any time even while being used
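The replica creation decision taken on file open can be sketched as a simple predicate. The slide names file size and OSD availability as inputs; the latency measure, the thresholds and the function name are assumptions added for illustration.

```python
def should_create_replica(file_size, latency_ms, osd_free_bytes,
                          max_latency_ms=50, max_file_size=10 * 2**30):
    """Return True if creating a replica nearer the client looks worthwhile.

    All thresholds are hypothetical defaults, not values from XtreemFS.
    """
    if latency_ms <= max_latency_ms:
        return False                      # the current replica is already good enough
    if file_size > max_file_size:
        return False                      # the file is too large to copy cheaply
    return osd_free_bytes >= file_size    # the candidate OSD must have room
```

While a new replica is being created, the client would keep using the existing one, which matches the note that a client may start on a "bad replica".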
MetaData and Replica Catalogue
(MRC)
 MRC
– Acts logically as one service but will be composed of replicated
service instances to improve availability and performance
– Access control management
• Support of a variety of policies
• Volume ACL
 Data model
– Hierarchical directory structure and/or extended metadata
– Core abstraction for controlling access to file metadata and file data is
the volume
– Files can be copied between volumes and links to files in other
volumes can be created
 Internal architecture
– Exactly one meta object per physical object on a storage device
 To what extent is it possible to decouple system components
while preserving a global view of the system?
Object Sharing Service (OSS)
 Inter-process communication via volatile memory, mapped
files, dynamically allocated objects and grid pipes
– All components designed to be scalable and fault
tolerant to deal with the dynamic behaviour of the Grid
 Features
– Management of shared objects containing references
– Object access detection
• Page based
– Object access monitoring to control false sharing and
object replicas
– Object consistency management
• Strict, weak and transactional memory consistency models
LinuxSSI: Linux-XOS for Clusters
LinuxSSI: XtreemOS-F Cluster
Flavour
 LinuxSSI will leverage Kerrighed SSI OS for
clusters
 Five work directions for LinuxSSI
– Scalability to hundreds of processors
– LinuxSSI file system
– Automatic reconfiguration of LinuxSSI
– Checkpoint/restart mechanisms for parallel
applications
– Customizable scheduler
Scalability & Reconfiguration
Management
 Scalability to hundreds of processors
– Removing hard limits on the number of nodes
– Evaluating the scalability of Kerrighed internal
algorithms
 Automatic reconfiguration of LinuxSSI
– Node addition, eviction or failure management
– Leverage the existing mechanisms provided by
Kerrighed in the HotPlug module
LinuxSSI File System
 LinuxSSI file system
– Exploitation of the disks attached to cluster
nodes
• Single name space (root file system)
• Policies for placing/replicating data on disk
• Efficient parallel accesses to large data volumes
– Performance as a primary target in LinuxSSI
basic version
– LinuxSSI file system should not fail in the
event of failures
• Better support for failures in the advanced version of
LinuxSSI
Checkpoint/Restart in LinuxSSI
 Checkpoint and restart of parallel application units in a cluster
– Shared memory and message-passing programming models
will be supported
– Checkpointer multi-level architecture
• Kernel checkpointer
 Process/thread checkpointing
 Based on Kerrighed mechanisms
 Transparent or application-aware checkpointing
• System checkpointer
 Application unit checkpointing (inside a cluster)
 Coordination of thread/process checkpoints for parallel applications
 Configurable service
• Grid checkpointer
 Application checkpointing (an application may span multiple Grid
nodes)
 Coordination of application unit checkpoints for an application
comprising multiple units
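The three checkpointer levels can be sketched as nested coordinators. This is a structural illustration only: the class and method names are hypothetical, and the "process image" is a placeholder dictionary rather than a real Kerrighed checkpoint.

```python
class KernelCheckpointer:
    """Lowest level: checkpoints a single process or thread."""

    def checkpoint(self, pid):
        # Placeholder for a real process image.
        return {"pid": pid, "state": f"image-of-{pid}"}


class SystemCheckpointer:
    """Cluster level: coordinates process checkpoints of one application unit."""

    def __init__(self, kernel):
        self.kernel = kernel

    def checkpoint_unit(self, unit_name, pids):
        return {"unit": unit_name,
                "processes": [self.kernel.checkpoint(p) for p in pids]}


class GridCheckpointer:
    """Grid level: coordinates unit checkpoints of an application spanning clusters."""

    def __init__(self, system_checkpointers):
        self.system = system_checkpointers   # cluster name -> SystemCheckpointer

    def checkpoint_application(self, app):
        # app maps a cluster name to (unit_name, [pids]).
        return {cluster: self.system[cluster].checkpoint_unit(unit, pids)
                for cluster, (unit, pids) in app.items()}
```

Each level only talks to the one directly below it, so a cluster-local application unit can be checkpointed without involving the Grid level.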
Customizable Scheduler
 Customizable scheduler
– Long-term scheduler
• Application admission in the cluster (job queuing system)
– Load balancing scheduler
• Balance the current workload between cluster nodes
 Long-term scheduler
– DRMAA standard interface
– Adapted to take advantage of the SSI “virtual multiprocessor”
– Resource sharing (a CPU may not be dedicated to a single application)
– Advanced monitoring capabilities
 Load balancing scheduler
– Policy customization
• Multilevel architecture (probes, analyzers, decision-making)
– Self adaptation of policy based on the current state of the cluster
– Advanced policies
• Shared memory, IPC
 Interaction with the Grid level services when needed
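The multilevel architecture of the load balancing scheduler (probes, analyzers, decision-making) can be sketched as three small functions. The load metric (process count), the imbalance threshold and the migration of a single process are assumptions chosen to keep the example short.

```python
def probe(nodes):
    """Probe: report the current load of every node (here, its process count)."""
    return {name: len(procs) for name, procs in nodes.items()}


def analyze(loads, threshold=2):
    """Analyzer: flag an imbalance when max and min load differ by at least threshold."""
    busiest = max(loads, key=loads.get)
    idlest = min(loads, key=loads.get)
    if loads[busiest] - loads[idlest] >= threshold:
        return busiest, idlest
    return None


def decide(nodes, imbalance):
    """Decision-making: migrate one process from the busiest to the idlest node."""
    if imbalance is None:
        return None
    src, dst = imbalance
    proc = nodes[src].pop()
    nodes[dst].append(proc)
    return proc, src, dst


def balance_once(nodes):
    """Run one probe, analyze, decide cycle over the cluster state."""
    return decide(nodes, analyze(probe(nodes)))
```

Keeping the three stages separate is what makes the policy customizable: a different probe or analyzer can be swapped in without touching the migration step.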
From LinuxSSI to LinuxSSI-XOS
 Virtual organization support
– Support of the kernel key retention system
• Impact on the Ghost module
– XtreemOS-G services will run as a single
instance on a LinuxSSI cluster
• Example: daemons in charge of mapping global user,
VO and group identities onto the Linux UID/GID
XtreemOS Consortium
 19 partners
– 1 public financial institution as coordinator
– 9 research centers & universities
– 9 industrial partners
• 4 SME
 8 countries
– Europe
• France, Germany, Italy, Slovenia, Spain, The Netherlands, UK
– China
XtreemOS Partners
Fact Sheet
 Start date
– June 1st, 2006
 Duration
– 4 years
 Budget
– Approx. 30 Meuros
– EC funding 14.2 Meuros
 Website
– http://www.xtreemos.eu
 Administrative and financial coordinator
– CDC, Jean-Noël Forget
 Scientific and technical staff
– More than 100 persons