Lecture10+11
Cloud Computing,
CS596-015
Amazon EC2 & Amazon
Web Services (AWS)
1
Outline
Introduction
Amazon Web Services (AWS) Components:
IaaS: EC2, S3, EBS
PaaS: SimpleDB, SQS, SNS, CloudFront, Relational Data
SaaS: AWS Web Services
AWS Integration and Management
AWS Billing
AWS Scalability
AWS Application Architecture: Design to Scale using AWS Elastic Features
Summary and Conclusions
2
Introduction
3
Introduction:
AWS Components
AWS spans IaaS, PaaS, and SaaS
4
Introduction:
Where AWS Fits?
5
Introduction:
Issues facing Web Developers
70% of Web Development Effort is “Muck”:
Data Centers
Bandwidth / Power / Cooling
Operations
Staffing
Scaling is Difficult and Expensive:
Large Up-Front Investment
Invest Ahead of Demand
Load is Unpredictable
6
Introduction:
Unpredictable Load
Slashdot/Digg/TechCrunch Effect
Rapid, unexpected customer demand/growth
7
Introduction:
Seasonal Spikes
8
Introduction:
How Do You Survive This?
9
Introduction:
Predictions Cost Money
[Figure: infrastructure cost ($) versus time. Curves: predicted demand, actual demand, traditional hardware, automated virtualization. Annotations: large capital expenditure, opportunity cost, "You just lost customers".]
10
Introduction:
Solution – Web-Scale Computing
Scale capacity on demand
Turn fixed costs into variable costs
Always available
Rock-solid reliability
Simple APIs and conceptual models
Cost-effective
Reduced time to market
Focus on product & core competencies
11
Amazon Web Services
Components
12
AWS Services Are:
Building block services that allow developers to innovate
and make money:
Infrastructure As a Service
Amazon Simple Storage Service
Amazon Elastic Compute Cloud
Amazon Simple Queue Service
Amazon SimpleDB
Commerce As a Service
Amazon Flexible Payments Service
Fulfillment Web Service
Data As A Service
Amazon E-Commerce Service
Amazon Historical Pricing
People As a Service
Amazon Mechanical Turk
Search As A Service (Alexa Web Services)
Alexa Web Information Service
Alexa Top Sites
Alexa Site Thumbnail
Alexa Web Search Platform
13
AWS Architecture:
14
AWS Components:
IaaS: Infrastructure Services
Elastic Compute Cloud: Compute
Simple Storage Service: Store
Simple Queue Service: Message
15
IaaS: Amazon Elastic
Compute Cloud – EC2
16
Amazon Elastic Compute Cloud
• Virtual Compute Cloud
• Elastic Capacity
• 1.7 GHz x86
• 1.7 GB RAM
• 160 GB Disk
• 250 Mb/s Network
• Network Security Model
Use cases: time- or traffic-based scaling, load testing, simulation and analysis, rendering, Software as a Service platform, hosting
Pricing: $0.10 per server hour; $0.10 - $0.18 per GB data transfer
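A minimal sketch of the pay-per-use arithmetic behind the per-server-hour price on this slide. The demand curve and the function names are illustrative assumptions; only the $0.10 per server hour rate comes from the slide, and data transfer charges are ignored.

```python
# Hypothetical sketch: compare paying per server hour for what is actually
# used against keeping enough servers running to cover the peak.

RATE_PER_SERVER_HOUR = 0.10  # from the slide: $0.10 per server hour


def elastic_cost(servers_needed_per_hour, rate=RATE_PER_SERVER_HOUR):
    """Cost when the fleet is resized every hour to match demand."""
    return sum(servers_needed_per_hour) * rate


def peak_provisioned_cost(servers_needed_per_hour, rate=RATE_PER_SERVER_HOUR):
    """Cost when the peak number of servers runs during every hour."""
    if not servers_needed_per_hour:
        return 0.0
    peak = max(servers_needed_per_hour)
    return peak * len(servers_needed_per_hour) * rate


if __name__ == "__main__":
    # Made-up demand: a quiet day with a three-hour traffic spike.
    demand = [2] * 21 + [20] * 3
    print(f"elastic:          ${elastic_cost(demand):.2f}")
    print(f"peak-provisioned: ${peak_provisioned_cost(demand):.2f}")
```

The gap between the two figures is the cost of capacity that sits idle outside the spike.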
17
Amazon EC2 Concepts
Amazon offers the user a choice of VM templates, called Amazon Machine Images (AMIs), that can be instantiated in a shared or virtual environment
Customers can use a pre-packaged AMI or build their own
AMIs vary in resources: RAM, compute units, local disk, and OS
Amazon Machine Image (AMI):
Bootable root disk
Pre-defined or user-built
Catalog of user-built AMIs
OS: Fedora, Centos, Gentoo, Debian,
Ubuntu, Windows Server
App Stack: LAMP, mpiBLAST, Hadoop
Instance:
Running copy of an AMI
Launch in less than 2 minutes
Start/stop programmatically
Network Security Model:
Explicit access control
Security groups
Inter-service bandwidth is free
18
Three Flavors of
Amazon Machine Images
Public AMIs: Use pre-configured, template AMIs to get
up and running immediately. Choose from Fedora,
Movable Type, Ubuntu configurations, and more
Private AMIs: Create an Amazon Machine Image (AMI)
containing your applications, libraries, data and
associated configuration settings
Paid AMIs: Set a price for your AMI and let others
purchase and use it (Single payment and/or per hour)
19
Amazon EC2 Concepts
Resizable compute capacity in the cloud
Obtain and boot new server instances in minutes
Quickly scale capacity, up or down, as your computing
requirements change
Full root access to a blank Linux machine
Simple Web service management interface
Changes the economics of computing
20
Amazon EC2 SOAP/Query API
Images:
RegisterImage
DescribeImages
DeregisterImage
Instances:
RunInstances
DescribeInstances
TerminateInstances
GetConsoleOutput
RebootInstances
Keypairs:
CreateKeyPair
DescribeKeyPairs
DeleteKeyPair
Image Attributes:
ModifyImageAttribute
DescribeImageAttribute
ResetImageAttribute
Security Groups:
CreateSecurityGroup
DescribeSecurityGroups
DeleteSecurityGroup
AuthorizeSecurityGroupIngress
RevokeSecurityGroupIngress
21
Three Amazon EC2 Choices
                     Small     Large     Extra Large
Bits                 32        64        64
RAM                  1.7 GB    7.5 GB    15 GB
Disk                 160 GB    850 GB    1690 GB
EC2 Compute Units    1         4         8
I/O Performance      Medium    High      High
Firewall             Yes       Yes       Yes
22
Amazon EC2 Growth
[Chart: number of Amazon EC2 users over time; vertical axis from 0 to 100,000 users.]
23
IaaS: Amazon Simple
Storage Service – S3
24
IaaS: Amazon Simple Storage Service (S3)
• Object-Based Storage
• 1 B – 5 GB / object
• Fast, Reliable, Scalable
• Redundant, Dispersed
• 99.99% Availability Goal
• Private or Public
• Per-object URLs & ACLs
• BitTorrent Support
Pricing: $0.15 per GB per month storage; $0.01 for 1000 to 10000 requests; $0.10 - $0.18 per GB data transfer
25
IaaS: Amazon Simple Storage Service (S3)
S3 is an opaque storage service
Highly scalable data storage in-the-cloud
Programmatic access via web services API: REST & SOAP
Simple to get going; provides objects of 1 B to 5 TB and leverages AWS authentication services
Highly available and durable
Offers distributed, redundant buckets replicated using CloudFront
Content Delivery Network (CDN) across continents
Pay-as-you-go:
Storage: $0.15 / GB / month
Data Transfer: starts at $0.18 / GB
Requests: nominal charges
26
IaaS: Amazon Simple Storage Service (S3) Amazon S3 Namespace
Amazon S3
  bucket
    object
  bucket
    object
    object
    object
  bucket
    object
    object
27
IaaS: Amazon Simple Storage Service (S3) Amazon S3 Namespace
Amazon S3
  mculver-images
    Beach.jpg
  media.mydomain.com
    2005/party/hat.jpg
    img1.jpg
    img2.jpg
  public.blueorigin.com
    index.html
    img/pic1.jpg
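The namespace above is flat: a bucket name plus an object key identifies one object, and slashes inside a key are just characters. A minimal sketch of how such a pair might map to a per-object URL follows. It assumes the virtual-hosted-style form (bucket name as a subdomain of s3.amazonaws.com); the helper names are hypothetical.

```python
from urllib.parse import quote

# The example namespace from the slide: bucket name -> object keys.
NAMESPACE = {
    "mculver-images": ["Beach.jpg"],
    "media.mydomain.com": ["2005/party/hat.jpg", "img1.jpg", "img2.jpg"],
    "public.blueorigin.com": ["index.html", "img/pic1.jpg"],
}


def object_url(bucket, key):
    """Build a virtual-hosted-style URL for one object (assumed scheme)."""
    # Slashes belong to the key itself, so they are left unescaped.
    return f"http://{bucket}.s3.amazonaws.com/{quote(key, safe='/')}"


def list_urls(namespace):
    """Return the URL of every object, bucket by bucket."""
    return [object_url(b, k) for b, keys in namespace.items() for k in keys]


if __name__ == "__main__":
    for url in list_urls(NAMESPACE):
        print(url)
```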
28
IaaS: Amazon Simple Storage Service (S3)
Billions of Objects Stored:
August 06: 800 Million
April 07: 5 Billion
October 07: 10 Billion
January 08: 14 Billion
29
IaaS: Amazon Simple Storage Service (S3) Open Source Backup
30
IaaS: Amazon Elastic Block
Storage – EBS
31
IaaS: Amazon Elastic Block Storage (EBS)
EBS is a high-performance virtual hard disk
It can be formatted as a file system and then mounted on an EC2 instance; a volume attaches to an instance in the same availability zone
Size can range from 1 GB to 1 TB
Storage: $0.10 / GB / month + $0.10 / million I/O ops
Snapshot: point-in-time backup of a volume to S3 (not to a bucket you can access)
Create new volume from snapshot
Incremental backup
Restore to new volume (instantaneous; lazy restore)
32
IaaS: Amazon Elastic Block Storage (EBS)
Incremental Snapshot:
Table of Contents vs. Data Blocks
Space used is difficult to gauge
Frequent snapshots: minimal cost
Freeze data while snapshotting: the XFS file system supports a freeze command
Volume is fragile in transit: unmounting can leave data in flight, and a mounting mistake is destructive!
Use snapshots for safety: perform a snapshot after unmount and create a fresh volume from the snapshot
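The "table of contents vs. data blocks" idea can be illustrated with a toy model: each snapshot keeps a full table of contents, but a data block is stored only if it has not been stored before. The sketch below detects changed blocks by content hash, which is an assumption made for brevity and not a description of how EBS tracks changes; the class and method names are hypothetical.

```python
import hashlib


class SnapshotStore:
    """Toy model: a snapshot is a table of contents pointing at shared blocks."""

    def __init__(self):
        self.blocks = {}     # content hash -> block bytes (each stored once)
        self.snapshots = []  # one table of contents (list of hashes) per snapshot

    def snapshot(self, volume):
        """Record a point-in-time copy; return how many new blocks were stored."""
        toc, new_blocks = [], 0
        for block in volume:
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block
                new_blocks += 1
            toc.append(digest)
        self.snapshots.append(toc)
        return new_blocks

    def restore(self, snapshot_id):
        """Rebuild a fresh volume (list of blocks) from one snapshot."""
        return [self.blocks[d] for d in self.snapshots[snapshot_id]]
```

Because unchanged blocks are shared, a second snapshot taken after a small change adds little storage, which is why frequent snapshots cost little and why the space used by any single snapshot is hard to gauge.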
33
IaaS: Amazon Elastic Block Storage (EBS)
Running MySQL with EBS:
Snapshot master
Create slave volume
Attach slave volume
Start replicating
34
IaaS: Amazon Elastic Block Storage (EBS)
MySQL Fail-Over:
Promote slave to master
Fail-over App Servers
Launch new slave
Init from snapshot
Start replication; always
roll forward
35
IaaS: Amazon Elastic Block Storage (EBS)
Multi-Zone Deployment:
36
PaaS: Amazon SimpleDB
37
PaaS: Amazon SimpleDB (SDB)
SDB is available for more structured data; it does not support schemas but instead defines “Domains” with items that consist of up to 256 attribute/value pairs. A value can be up to 1 KB. SDB supports simple operators such as: =, !=, <, >, <=, >=, STARTS-WITH, AND, OR, NOT, INTERSECTION, and UNION
SDB is a distributed, highly scalable, lightweight, queryable attribute store: a new style of DB for the cloud
CAP: Consistency, Availability, and tolerance to network Partitioning
A cloud DB needs to relax the consistency guarantees of a traditional DB; consistency options: client side, server side, and eventual
38
PaaS: Amazon SimpleDB (SDB)
The SimpleDB Model
39
PaaS: Amazon SimpleDB (SDB)
Developers want to:
Store data
Process data
Query data
Probably don’t want:
Schema management
Index management
Performance tuning
Data access scaling
All data is replicated in geographically dispersed data centers
(no explicit backup). Requests use HTTPS (security)
Complex JOIN applications (DW) are not a good match for
SimpleDB
40
PaaS: Amazon SimpleDB (SDB)
Architecture:
Attributes: name/value pair, multiple values per name
Items: consists of multiple attributes, can have different set of
attributes for each item in domain
Domain: elastic table structure – no schema is required
The ability to evolve your data model dynamically, on an as-needed basis, makes SimpleDB a perfect match for agile development
Domain ≈ table with a dynamic schema
Flexible, dynamic-schema data model
41
PaaS: Amazon SimpleDB (SDB)
PutAttributes(Joe:(Hair:Red));
PutAttributes(Sarah:(Age:13));
42
Amazon SimpleDB
item   description   color         material
123    Sweater       Blue, Red
456    Dress shirt   White, Blue
789    Shoes         Black         Leather
PUT (item, 123), (description, Sweater), (color, Blue), (color, Red)
PUT (item, 456), (description, Dress shirt), (color, White), (color, Blue)
PUT (item, 789), (description, Shoes), (color, Black), (material, Leather)
Query
Domain = MyStore
[‘description’ = ‘Sweater’]
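The PUT and Query lines above can be mimicked with a small in-memory model. This is a sketch of the data model only (schema-less items, multi-valued attributes, equality query); the `Domain` class and its methods are hypothetical and are not the SimpleDB API.

```python
from collections import defaultdict


class Domain:
    """Toy in-memory stand-in for a SimpleDB domain (no schema)."""

    def __init__(self, name):
        self.name = name
        # item name -> attribute name -> set of values
        self.items = defaultdict(lambda: defaultdict(set))

    def put(self, item, *pairs):
        """Add (attribute, value) pairs; an attribute may hold several values."""
        for attribute, value in pairs:
            self.items[item][attribute].add(value)

    def query(self, attribute, value):
        """Return the names of items whose attribute includes the given value."""
        return sorted(
            name for name, attrs in self.items.items()
            if value in attrs.get(attribute, ())
        )


if __name__ == "__main__":
    store = Domain("MyStore")
    store.put("123", ("description", "Sweater"), ("color", "Blue"), ("color", "Red"))
    store.put("456", ("description", "Dress shirt"), ("color", "White"), ("color", "Blue"))
    store.put("789", ("description", "Shoes"), ("color", "Black"), ("material", "Leather"))
    print(store.query("description", "Sweater"))
```

Item 789 carries a `material` attribute that the other items lack, which is the "no schema" property in action.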
43
PaaS: Amazon Simple
Queue Service - SQS
44
PaaS: Amazon Simple Queue Service (SQS)
• Scalable Queuing
• Elastic Capacity
• Reliable, Simple, Secure
Use cases: inter-process messaging, data buffering, architecture component
Pricing: $0.10 per 1000 messages; $0.10 - $0.18 per GB data transfer
45
PaaS: Amazon Simple Queue Service (SQS) Overview
A distributed queue in the cloud
Used for storing messages traveling between
computers
Reliable:
Runs within Amazon's high-availability data centers
Messages are stored redundantly across multiple servers
and locations
Scalable to millions of messages a day
Simple: Only 6 methods
Platform agnostic
Provides access control and message locking
46
PaaS: Amazon Simple Queue Service (SQS) Amazon SQS Concepts
Queues:
Named message container
Persistent
Messages:
Up to 256KB of data per message
Peek / Lock access model
Scalable:
Unlimited number of queues per account
Unlimited number of messages per queue
47
PaaS: Amazon Simple Queue Service (SQS) Amazon SQS Concepts
48
PaaS: Amazon Simple Queue Service
Application Architecture: Design to Scale using AWS Elastic Features
49
PaaS: Amazon Simple Queue Service
SQS SOAP / Query API
Queues:
ListQueues
DeleteQueue
SetVisibilityTimeout
GetVisibilityTimeout
Messages:
SendMessage
ReceiveMessage
DeleteMessage
PeekMessage
Security:
AddGrant
ListGrants
RemoveGrant
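The lock-style access model behind ReceiveMessage, DeleteMessage, and the visibility timeout can be sketched in memory as follows. This is a toy model, not the SQS client API: the class, its method names, and the explicit `now` argument (used instead of a real clock so the behaviour is easy to follow) are all assumptions.

```python
import itertools


class ToyQueue:
    """In-memory sketch of receive/lock/delete with a visibility timeout."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self._ids = itertools.count(1)
        self._messages = {}  # message id -> [body, time it becomes visible again]

    def send_message(self, body):
        message_id = next(self._ids)
        self._messages[message_id] = [body, 0.0]
        return message_id

    def receive_message(self, now):
        """Return (id, body) of one visible message and lock it, or None."""
        for message_id, entry in self._messages.items():
            if entry[1] <= now:
                # Hide the message from other consumers until the timeout expires.
                entry[1] = now + self.visibility_timeout
                return message_id, entry[0]
        return None

    def delete_message(self, message_id):
        """Called by the consumer once the work is done."""
        self._messages.pop(message_id, None)
```

If a consumer crashes before calling `delete_message`, the message reappears after the timeout and another consumer can pick it up.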
50
PaaS: Amazon Simple
Notification Service - SNS
51
PaaS: Amazon Simple Notification Service - SNS
Overview
SNS provides publish/subscribe messaging functionality
SNS is a distributed and redundant service that enables applications, end users, and devices to send and receive notifications from the cloud
The service works on specified topics, which are Uniform Resource Identifiers (URIs) that specify communication channels based on content or event types
Any web server, email address, or SQS queue can subscribe to
notification messages associated with a particular topic
Authorized publishers can post messages to the channel and
they will automatically be delivered to all subscribers
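The publish/subscribe flow described above can be sketched as a topic-to-subscribers map. This is a toy in-process model, not the SNS API; the class name, the topic string, and the use of plain callables as endpoints are assumptions.

```python
from collections import defaultdict


class ToyNotifier:
    """In-process sketch of topic-based publish/subscribe."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callables

    def subscribe(self, topic, endpoint):
        """Register an endpoint (any callable taking one message)."""
        self._subscribers[topic].append(endpoint)

    def publish(self, topic, message):
        """Deliver the message to every subscriber; return the delivery count."""
        endpoints = self._subscribers.get(topic, [])
        for endpoint in endpoints:
            endpoint(message)
        return len(endpoints)


if __name__ == "__main__":
    notifier = ToyNotifier()
    email_inbox, queue_inbox = [], []
    notifier.subscribe("orders/created", email_inbox.append)
    notifier.subscribe("orders/created", queue_inbox.append)
    notifier.publish("orders/created", "order 42 placed")
    print(email_inbox, queue_inbox)
```

The publisher never learns who the subscribers are, which is what keeps the two sides loosely coupled.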
52
PaaS: Amazon CloudFront
53
PaaS: Amazon CloudFront (~Akamai)
Overview
CloudFront is a web service for delivering both static and streaming content
Requests for objects are automatically routed to the nearest edge location
CloudFront is optimized to work with other Amazon services like S3 and EC2, but it also works with servers hosted by other providers
CloudFront objects are organized into distributions. A distribution specifies the location of the original version of the objects and is assigned a unique domain name (e.g., abc123.cloudfront.net), which can also be mapped to a proprietary domain (e.g., images.example.com)
Distributions can either download definitive content from the
origin/source server (HTTP/HTTPS) or stream the content using
RTMP (Real-Time Messaging Protocol).
54
PaaS: Amazon Relational
Data
55
PaaS: Amazon Relational Data
Overview
A significant portion of use cases involve data in tabular form and may include cross-references between tables
Scalability vs. integrity: SQL supports complex queries over transactional, normalized, and uniform data. On the other hand, SQL is not appropriate for unstructured data, because it enforces schema consistency. In the cloud, data changes too fast for a SQL engine to manage if all relations and schemas must be fully enforced
This limitation can be summarized as a need for systems that manipulate and analyze huge amounts of data without impacting availability, performance, or throughput
In other words, SQL is a good engine, but it is difficult to scale out to process huge amounts of data in a schema-less environment; hence NoSQL initiatives like Google BigTable
56
PaaS: Amazon Relational Data
Overview
NoSQL is a linear approach that has the potential to scale much higher, but it also brings a new set of scalability challenges (such as overloaded keys or heavy use of indexes; constraint enforcement is left to applications)

Query Model        Software-based Examples    Service-based Examples
SQL                LAMP/MySQL                 Amazon RDS
                   Windows/SQL Server         MS/SQL Azure
                   Oracle                     Zoho CloudSQL
PseudoSQL/NoSQL    Hypertable                 Amazon SDB
                   HBase                      Google GQL Datastore
                   MongoDB                    MS Azure storage
                   CouchDB
57
PaaS: Amazon Relational Data
Amazon Relational DB Service (RDS)
RDS is a web service that makes it easy to set up,
operate, and scale an RDBMS in the cloud
RDS reduces the time-consuming administration tasks
RDS gives you access to the capabilities of a familiar MySQL, Oracle, or MS SQL Server database engine, so existing applications and tools can be used with RDS
RDS automatically patches the database software and backs up your database, storing the backups for a user-defined retention period and enabling point-in-time recovery
Provisioned IOPS is a new storage option for RDS designed to deliver fast, predictable, and consistent I/O performance (up to 10,000 IOPS per DB instance)
58
PaaS: Amazon Relational Data
Amazon Relational DB Service (RDS)
An RDS DB instance can be provisioned with either standard storage or Provisioned IOPS storage
RDS makes it easy to use replication to enhance availability and reliability
The Multi-AZ (Availability Zones) deployment option allows you to run mission-critical workloads with high availability and built-in automated fail-over from your primary database to a synchronously replicated secondary database in case of failure
RDS for MySQL enables you to scale out beyond the capacity of a single DB deployment for read-heavy DB workloads
There is no up-front investment required; pay per usage
59
SaaS: AWS Web Services
60
PaaS: Amazon Web Services
Overview
AWS began offering IT infrastructure services to businesses in the form of web services in 2006; this is now called cloud computing
With AWS, businesses no longer need to plan for and procure servers and other IT infrastructure weeks or months in advance; instead they can spin up hundreds or thousands of servers in minutes and deliver results faster
AWS serves businesses in 190 countries, with data center locations around the world. It provides:
Low cost
Agility and Instant Elasticity
Open and Flexible
Secure
61
PaaS: Amazon Web Services
Overview
AWS Solutions:
Application hosting: reliable, on-demand infrastructure to power
your applications, from IaaS to SaaS offerings
Backup and Storage: store data and build dependable backup
solutions based on AWS inexpensive storage services
Content Delivery: distribute content to end users worldwide with low cost and high transfer rates
Web hosting: supports dynamic web hosting needs with AWS elastic infrastructure
Enterprise IT: host internal- or external-facing IT applications in the AWS secure environment
Databases: supports a variety of scalable DB solutions, including SQL and NoSQL databases
62
AWS Integration and
Management
70
AWS Integration and Management:
Integration Overview
AWS has a rich set of integration services:
Elastic IP Addresses: static IP addresses, associated with an account rather than a particular instance, designed for dynamic cloud computing
Simple Queue Service: provides an unlimited number of queues and messages of size up to 8 KB
Simple Notification Service: provides publish/subscribe messaging functionality
Virtual Private Cloud: provides a means for enterprises to extend their private data center into Amazon’s cloud in a secure fashion
VM Import: allows customers to import VM images from their existing environment into Amazon EC2
AWS Import/Export: accelerates moving large amounts of data into and out of AWS by bypassing the Internet and using portable storage devices for transport
71
AWS Integration and Management:
Management Overview
AWS Management Console is the main interface for managing AWS
It is also possible to use SSH or HTTP to interact with an instance directly
CloudFormation: gives the customer the option to collect related AWS resources in a so-called stack and provision them in an orderly fashion. The stack can include Amazon resources such as EC2 instances, security groups, SQS queues, RDS instances, load balancers, etc.
CloudWatch: a web service that provides monitoring for AWS cloud resources; metrics can be displayed on the management console as charts in real time
AWS Ecosystem: AWS services alone are not enough; hence AWS created an ecosystem of products that fill in any gaps that AWS does not cover
72
AWS Billing
73
AWS Billing:
Overview
Standard licensing terms
Commercially usable
Aggressive pricing
Monthly credit card billing
Self-serve model:
Sign up as developer
Choose services
Agree to service licenses
Enter payment info
Start coding
74
AWS Billing:
Overview
EC2 supports monetization; it exposes a set of financial services to its developers:
Flexible Payment Service (FPS): a service that Amazon created for developers that leverages Amazon’s sophisticated retail billing system. The customer can use the same identity, shipping details, and payment information as they would for ordering directly from Amazon
DevPay: an online billing and account management service supporting applications that are built for AWS. It uses Amazon’s authentication and settlement framework to manage customer subscriptions and billing for Amazon EC2 Machine Images (AMIs) or applications that use Amazon S3
75
AWS Scalability
76
AWS Scalability:
Overview
AWS also caters to enterprise needs for elastic computing with
capabilities that scale both vertically and horizontally:
High Performance Computing: The EC2 Cluster Compute and Cluster GPU instance types are designed to combine high compute and networking performance for HPC applications using MPI. A cluster can have up to 128 instances (nodes) with 10 Gbps bandwidth between them
Elastic Load Balancing: distributes incoming traffic for a given service
across multiple EC2 instances. Customer can enable Elastic Load
Balancing within a Single Availability Zone or across zones
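To make "distributes incoming traffic across multiple EC2 instances" concrete, here is a minimal round-robin sketch. Round-robin is an assumption chosen for simplicity; the slide does not say which algorithm Elastic Load Balancing uses, and the class name and instance identifiers are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Toy balancer that hands requests to instances in turn."""

    def __init__(self, instances):
        instances = list(instances)
        if not instances:
            raise ValueError("at least one instance is required")
        self._cycle = itertools.cycle(instances)

    def route(self, request):
        """Return (instance, request) for the next instance in the rotation."""
        return next(self._cycle), request


if __name__ == "__main__":
    # Hypothetical instance names spread over two availability zones.
    balancer = RoundRobinBalancer(["zone-a/i-1", "zone-b/i-2", "zone-a/i-3"])
    for n in range(5):
        print(balancer.route(f"GET /page/{n}"))
```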
77
AWS Scalability:
Overview
Auto Scaling: supports applications that experience hourly, daily, or weekly variability in usage by varying the number of EC2 instances during demand spikes. Amazon provides tools to define triggers (say, based on CPU utilization) for adding/removing EC2 instances
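A CPU-utilization trigger of the kind mentioned above can be sketched as a single decision function. The thresholds, the step size of one instance, and the minimum/maximum bounds are illustrative assumptions, not Auto Scaling defaults.

```python
def desired_instances(current, cpu_percent, low=30.0, high=70.0,
                      minimum=1, maximum=10):
    """Return the new instance count for one evaluation of the trigger.

    Adds one instance above the high threshold, removes one below the low
    threshold, and always stays within [minimum, maximum].
    """
    if cpu_percent > high:
        target = current + 1
    elif cpu_percent < low:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))


if __name__ == "__main__":
    fleet = 2
    # Made-up average CPU readings, one per evaluation period.
    for cpu in [85.0, 90.0, 60.0, 20.0, 10.0, 5.0]:
        fleet = desired_instances(fleet, cpu)
        print(f"cpu={cpu:5.1f}% -> {fleet} instance(s)")
```

Keeping a gap between the low and high thresholds avoids adding and removing instances on every small fluctuation.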
Elastic MapReduce: a web service that enables businesses and developers to process very large amounts of data. It is based on hosted Hadoop running on the Amazon Elastic Compute Cloud (EC2) and Amazon S3. Amazon Elastic MapReduce supports SQL-like tools, such as Hive and Pig, as well as many programming languages including C++, Java, Perl, PHP, Python, R, and Ruby
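The programming model that Hadoop implements can be shown on one machine with a word count. This sketch only illustrates the map, shuffle, and reduce phases in plain Python; it does not use Hadoop or Elastic MapReduce, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the cloud", "The elastic cloud"]))
```

In a real cluster the map calls run in parallel on different nodes and the shuffle moves data between them; the per-key independence of the reduce step is what lets the work scale out.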
78
AWS Application Architecture:
Design to Scale Using AWS
Elastic Features
79
AWS Application Architecture:
Overview
80
AWS Application Architecture:
Cloud Applications Design 10 Best Practices
Build cloud Apps, not apps in the cloud
Virtualize the application stack
Design for failures and nothing fails
Design for scalability
Loose coupling lets you maximize plug & play
Design for dynamism
Build security into every component
Leverage native cloud storage options
Leverage best cloud Management Tools
Don’t fear cloud constraints
81
AWS Application Architecture:
Don’t Just Build Apps in the Cloud
Don’t simply port traditional Apps to the cloud
Traditional Apps stacks are architected in functional silos
Each silo has its own machines, network, management and support
82
AWS Application Architecture:
Virtualize the Application Stack
Re-factor to use standardized VM containers; each instance should be self-discovering, self-configuring, and network independent
Use cloud standardized Messaging & DB when possible
Leverage inherent EBS replication & snapshots for DBMS
83
AWS Application Architecture:
Compensate for Ephemeral Storage
EC2 instance default storage can only be used for transient data, not for archival data logs; consider using SDB to store persistent archival data records that can be associated with a key (timestamp)
If it is acceptable to recover only from the most recent backup, consider restoring data from S3 at boot-up and backing up current data to S3 at shutdown
If that is not acceptable, use attached EBS volumes for all persistent file data
An RDBMS should always use EBS volumes
Consider using soft links (Linux) to map portions of the default storage to a persistent EBS volume
Consider using EBS volumes exported by an NFS server on EC2 if small chunks of persistent storage are needed
84
AWS Application Architecture:
Compensate for Dynamic IP Addresses
Attach ElasticIP for Internet-facing EC2 instances (e.g., HA
Proxy Load-balancer instance)
Use dynamic DNS registration of EC2 instance’s internal IP
address or use SDB
EC2 instances should only use the internal IP address for
communicating with each other (free!)
85
AWS Application Architecture:
Design for Failure
Everything fails all the time
Avoid single points of failure
Assume everything fails, and design backwards
Design for failure and your application won’t fail
What can fail:
EC2 instance may crash
Portion of zone may not be accessible due to network failure
AWS Services in a Region may not be accessible
86
AWS Application Architecture:
Design for Scalability
Use Load Balancing on multiple layers; use your own or AWS Elastic
Load Balancing
Use Cloud monitoring systems: either your own or AWS CloudWatch
Use Auto-scaling technology (free with CloudWatch)
Build Loosely Coupled Systems:
Use independent components
Design everything as a Black Box with well defined inputs & outputs
Use subsystems de-coupling for hybrid models
Use Load-balanced clusters of Black Boxes to maximize plug & play
87
AWS Application Architecture:
Design for Scalability
Use Message Queues:
Use MQ system such as Amazon SQS to pass along requests
Each MQ consumer can be a cluster of EC2 instances
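The pattern on this slide (producers put requests on a queue, a cluster of consumers drains it) can be sketched locally with threads standing in for EC2 instances and Python's `queue.Queue` standing in for the message queue. Both substitutions are assumptions for illustration; nothing here talks to SQS.

```python
import queue
import threading


def run_workers(requests, handler, worker_count=3):
    """Process every request with a pool of workers; return the results."""
    pending = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more work for this worker
                break
            outcome = handler(item)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for request in requests:
        pending.put(request)
    for _ in threads:
        pending.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # Results arrive in completion order, so sort before printing.
    print(sorted(run_workers(range(10), lambda n: n * n)))
```

Because the producer only sees the queue, the number of workers can change without touching the producer, which is the property that makes the consumer side easy to scale.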
88
AWS Application Architecture:
Design for Scalability
Leverage Amazon Storage Solutions:
Amazon S3: large static objects
Amazon CloudFront (CDN): content distribution
Amazon SimpleDB: simple data indexing/querying
Amazon EC2 local disk drive: transient data
Amazon EBS: RDBMS persistent storage + S3 snapshots
89
Summary and Conclusions
90
AWS: Summary and Conclusions
AWS is the leading solution among public cloud offerings
AWS supports IaaS, PaaS, and SaaS. It also has a comprehensive integration and management story in addition to billing
The IaaS offering includes EC2, S3, and EBS
The PaaS offering includes SDB, SQS, SNS, CloudFront, and RDS
The SaaS offering includes AWS web services
AWS supports scalability via elastic computing
AWS applications can be designed to scale by leveraging AWS elastic features
91
END
92