X-Trace: A Network Tracing Framework


UC Berkeley
Scaleable Structured
Datastorage for Web 2.0
Michael Armbrust, David Patterson
October, 2007
RAD Lab 5-year Mission
• Today’s Internet systems complex, fragile, manually
managed, rapidly evolving
– To scale eBay, must build eBay-sized company
• “Moon shot” mission statement:
Enable a single person to Develop, Assess, Deploy, and
Operate the next-generation IT service
– “The Fortune 1 Million” by enabling rapid innovation
• Create core technology to enable vision via synergy
across systems, networking, and Statistical Machine
Learning
• Making datacenter easier to manage enables vision of
single person to analyze, deploy and operate a
scalable IT service
If Datacenter is the computer…
• What is the programming language?
• What are the libraries?
• How do we trace/monitor programs?
• What is the simulator?
• What is Computer Aided Design?
• What is the Operating System?
• What is the Database System?
Storage Status Quo
• Current status of data storage for Web 2.0
apps
– Large relational databases running on
expensive hardware
– Manual horizontal and vertical partitioning of
data
• Problem: Requires redesign at each
scaling milestone
• Goal: Scalable structured data storage
for Web 2.0
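
As a rough illustration of why manual horizontal partitioning tends to force a redesign at each scaling milestone, the sketch below assigns rows to shards by user id modulo the shard count and then counts how many rows would have to migrate when one shard is added. The `shard_for` and `moved_keys` helpers and the shard counts are assumptions made for this example; they do not describe any particular site's setup.

```ruby
# Hypothetical helper: naive manual horizontal partitioning by user id.
def shard_for(user_id, shard_count)
  user_id % shard_count
end

# Count how many ids land on a different shard when the shard count
# changes, i.e. how many rows would have to be migrated.
def moved_keys(user_ids, old_count, new_count)
  user_ids.count { |id| shard_for(id, old_count) != shard_for(id, new_count) }
end

user_ids = (0...1000).to_a
moved = moved_keys(user_ids, 4, 5)
puts "#{moved} of #{user_ids.size} rows move when growing from 4 to 5 shards"
```

With this scheme most rows change shards whenever the shard count changes, which is one reason hand-rolled partitioning is costly to grow.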
Web 2.0 App Characteristics
• Need to scale to YouTube or MySpace
sizes
• Require geographic replication
• Short transactions
• No ad-hoc queries
• Willing to relax consistency in exchange for
scalability and availability
– Photos, not financials
Relaxed Consistency
• Some things can be updated lazily
• Eventual consistency is often acceptable
• However, users should see their own
writes immediately
• Need to provide simple choices to
developers
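
One way to picture "lazy updates, but users see their own writes" is the toy store below. It is a minimal sketch under stated assumptions: a single primary copy, a single lazily updated replica, and a hypothetical `LazyReplicatedStore` class invented for this example. It is not the design proposed in the talk.

```ruby
# Toy store: one primary copy plus one lazily updated replica.
class LazyReplicatedStore
  def initialize
    @primary = {}
    @replica = {}
    @pending = []                                            # writes not yet on the replica
    @pending_writers = Hash.new { |hash, key| hash[key] = [] } # key => users with pending writes
  end

  # Writes reach the primary immediately and are queued for the replica.
  def write(user, key, value)
    @primary[key] = value
    @pending << [key, value]
    @pending_writers[key] << user
  end

  # A user with a pending write to the key reads the primary
  # (read-your-writes); everyone else reads the possibly stale replica.
  def read(user, key)
    if @pending_writers.fetch(key, []).include?(user)
      @primary[key]
    else
      @replica[key]
    end
  end

  # Apply all queued writes to the replica, as a background job might.
  def replicate!
    @pending.each { |key, value| @replica[key] = value }
    @pending.clear
    @pending_writers.clear
  end
end

store = LazyReplicatedStore.new
store.write("alice", "photo:1:caption", "Sunset")
own_view   = store.read("alice", "photo:1:caption") # writer sees the new value
other_view = store.read("bob", "photo:1:caption")   # replica has not caught up yet
store.replicate!
later_view = store.read("bob", "photo:1:caption")   # replica has converged
```

The developer-facing choice here is reduced to a single rule (writers read the primary until replication catches up), which is the kind of simple option the slide asks for.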
Our Idea
• Large scale distributed database underneath
• Runs on 1000+ shared-nothing commodity
servers
• ActiveRecord-like layer in Ruby on Rails vs. SQL
– Provides simple relationships and consistency
guarantees between models
• has_many
• belongs_to
• searchable_by (for full-text search)
• Pre-compute joins for quick reads
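
The declarations named above might look roughly like the sketch below. Only the macro names `has_many` and `belongs_to` come from the slide; the `Model` base class, the `inverse:` option, the in-memory `STORE` hash standing in for the distributed database, and the example models are all assumptions invented for illustration (`searchable_by` is omitted). The point it tries to show is the last bullet: the relationship is materialized at write time, so reading it needs no join.

```ruby
# In-memory stand-in for the distributed store (an assumption of this sketch).
STORE = Hash.new { |hash, key| hash[key] = [] }

# Hypothetical model layer: macro names follow the slide, the
# implementation is invented purely for illustration.
class Model
  attr_reader :id, :attrs

  def initialize(id, attrs = {})
    @id = id
    @attrs = attrs
  end

  # Parent side: parent.<name> reads a precomputed list from the store.
  def self.has_many(name)
    define_method(name) { STORE[[self.class.name, id, name]] }
  end

  # Child side: remember which parent list must be maintained on save.
  def self.belongs_to(name, inverse:)
    (@parents ||= []) << [name, inverse]
  end

  def self.parents
    @parents || []
  end

  # On save, append the record to each parent's precomputed list so a
  # later read of the relationship is a single lookup.
  def save
    self.class.parents.each do |name, inverse|
      parent = attrs.fetch(name)
      STORE[[parent.class.name, parent.id, inverse]] << self
    end
    self
  end
end

class User < Model
  has_many :photos
end

class Photo < Model
  belongs_to :user, inverse: :photos
end

alice = User.new(1, { name: "Alice" })
Photo.new(10, { user: alice, title: "Sunset" }).save
Photo.new(11, { user: alice, title: "Harbor" }).save
titles = alice.photos.map { |photo| photo.attrs[:title] }
```

The trade-off is extra work on every write in exchange for reads that touch one key, which fits the short-transaction, no-ad-hoc-query workload described earlier in the deck.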
Related Work (we know of)
• G. DeCandia, D. Hastorun, et al. Dynamo: Amazon’s highly
available key-value store. In SOSP, 2007.
• M. Stonebraker and U. Cetintemel. “One size fits all”: an idea
whose time has come and gone. pp. 2-11, 2005.
• M. Stonebraker, S. R. Madden, et al. The end of an architectural era
(it’s time for a complete rewrite). In VLDB, Vienna, Austria, 2007.
• D. J. Abadi, A. Marcus, S. R. Madden, and K. Hollenbach. Scalable
semantic web data management using vertical partitioning. In VLDB,
Vienna, Austria, 2007.
• F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M.
Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A
distributed storage system for structured data. In OSDI’06: Seventh
Symposium on Operating System Design and Implementation,
November 2006.