Open Source Database
Rises to the Challenge
Britt Johnston
CTO
NuSphere
Agenda
Rise of Open Source Databases
Traditional and Open Source Licensing
Properties of Successful Community
MySQL Gemini Project
Future Trends
© 2001, NuSphere
Terminology
OSDB – Open Source Database
OSI – Open Source Initiative
OSD – Open Source Definition
FSF – Free Software Foundation
GPL – General Public License
On Relational Databases…
“Relational databases can handle no more
than one hundred megabytes of data and
maybe ten users.”
Database Product Manager
Digital Equipment Corporation
1985
Change is Constant
Relational databases
– Early debate on viability
– Clumsy positioning with existing products
– Tidal wave of acceptance
OSDB is in a similar position
– Reluctantly admit usefulness, but
– “Commercial databases are
required for backend functions.”
Rapid Evolution!
Top 3 Selling Database Books
1. SQL for Dummies
2. MySQL
3. Oracle 8i Reference
B&N Topsellers 10/28/00
References On the Web (Google Searches)
Product       10/28/00       01/31/01       Change
Oracle        3.0 Million    3.4 Million    +13%
MySQL         2.3 Million    2.9 Million    +26%
PostgreSQL    0.7 Million    0.8 Million    +14%
SQL Server    0.6 Million    0.6 Million
DB2           0.5 Million    0.5 Million
Interbase     0.1 Million    0.1 Million
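The growth percentages on the "References On the Web" slide can be checked with simple arithmetic. The sketch below recomputes them from the two Google search counts (in millions) given for 10/28/00 and 01/31/01. The dictionary layout and the function name `percent_change` are illustrative choices, not part of the original slides.

```python
# Google search counts in millions: (10/28/00, 01/31/01), as listed on the slide.
SEARCH_COUNTS = {
    "Oracle": (3.0, 3.4),
    "MySQL": (2.3, 2.9),
    "PostgreSQL": (0.7, 0.8),
    "SQL Server": (0.6, 0.6),
    "DB2": (0.5, 0.5),
    "Interbase": (0.1, 0.1),
}


def percent_change(before, after):
    """Return the growth from `before` to `after`, rounded to a whole percent."""
    return round((after - before) / before * 100)


for product, (before, after) in SEARCH_COUNTS.items():
    print(f"{product:<12} {percent_change(before, after):+d}%")
```

Under this calculation the three products that grew come out at the rounded figures shown on the slide, and the remaining three are flat.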
2000 –
First Boxed Commercial Distributions
– September 2000
First Open Source Database Summit
– October 2000
Four companies dedicated to Open
Source Database products were launched
2001 –
140,000 MySQL Books sell in 12 months
Oracle: MySQL to Oracle Migration Kit
MySQL Track at Major Conferences
NuSphere Delivers Gemini Beta
Agenda
Rise of Open Source Databases
Open Source Licensing
Properties of Successful Community
MySQL Gemini Project
Future Trends
The Open Source License
Nine Major Requirements:
1. Free redistribution – cannot require royalty
2. Source code – source and binary distribution
3. Derived works – allow distribution of changes
4. Integrity of author’s code – may require patch
5. No discrimination against persons or groups
6. No discrimination against fields of endeavor
7. License distributed – no new license required
8. License not product-specific – extracted code ok
9. No contamination of other software on same medium
Open Source Definition (Bruce Perens) www.opensource.org
MySQL License
Prior to June 2000
– Not open source license
June 2000 and Future
– All releases under GPL
On Open Source…
“We recommend that products near the end of
their life go open source.”
Gartner Group Analyst
October 2000
Agenda
Rise of Open Source Databases
Traditional and Open Source Licensing
Properties of Successful Community
MySQL Gemini Project
Future Trends
It’s Not Only About Technology
Feature Wars
– Oracle vs. Microsoft:
competition not based on user need
– Feature bloat will have long term impact
Speed, ease of use, integration, and a clean
programming model are most important
Signs of Healthy Community
Mix of Church and State membership
World-wide contributor community
Active development process
Rich collection of interfaces
Support from other products
Full service offerings
Clear license terms
Modular Architecture
Key to scalable community
Drive rapid evolution
Allow large contributions from
multiple sources
MySQL Modular Architecture
Table Handler
– Specialized storage for individual tables
MyISAM – High Speed Read Mostly
Heap – In Memory Tables
Gemini – Large Scale OLTP
– Row-level Locking, Transactions, Recovery
More under development
[Diagram: the MySQL server layered over the MyISAM, Heap, and Gemini table handlers]
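The table handler idea on the "MySQL Modular Architecture" slide can be illustrated with a small sketch: the server talks to every table through one interface, and each table type plugs in its own storage code. Everything named here (`TableHandler`, `HeapHandler`, `HANDLERS`, `Table`) is hypothetical and written in Python for brevity. MySQL's real table handler API is a much larger C++ interface and is not reproduced here.

```python
from abc import ABC, abstractmethod


class TableHandler(ABC):
    """Hypothetical storage interface that every table type implements."""

    @abstractmethod
    def insert(self, row):
        """Store one row (a dict of column name to value)."""

    @abstractmethod
    def scan(self):
        """Return an iterator over all stored rows."""


class HeapHandler(TableHandler):
    """Toy in-memory handler, loosely in the spirit of the Heap table type."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(dict(row))

    def scan(self):
        return iter(self.rows)


# Registry of available table types. A new engine is added by registering
# another TableHandler subclass; the server code below does not change.
HANDLERS = {"HEAP": HeapHandler}


class Table:
    """Server-side view of a table: a name plus the handler chosen for it."""

    def __init__(self, name, table_type):
        self.name = name
        self.handler = HANDLERS[table_type]()


orders = Table("orders", "HEAP")
orders.handler.insert({"id": 1, "item": "book"})
print(list(orders.handler.scan()))
```

The design point is that the choice of storage is made per table, so specialized engines can coexist in one server and be contributed independently.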
Agenda
Rise of Open Source Databases
Traditional and Open Source Licensing
Properties of Successful Community
MySQL Gemini Project
Future Trends
Gemini - Row Level Locking
Gemini is NuSphere’s contribution to MySQL
project – design targets:
Multi-threaded engine
Supports 10,000 concurrent transactions
Sustains 1 billion transactions per day (tpd) on single server
Small footprint for PC class hardware
SMP support for large systems
Concurrent operations on parallel threads
Familiar SQL standard programming model used
by commercial applications today.
– Table and Row-level Locking
– Standard Isolation Levels
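To put the design target of 1 billion transactions per day in more familiar terms, it can be converted to an average per-second rate. This is plain arithmetic and assumes load spread evenly over 24 hours, which the slide does not claim. Real peak rates would be higher. The function name `average_tps` is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_tps(transactions_per_day):
    """Average transactions per second, assuming an evenly spread load."""
    return transactions_per_day / SECONDS_PER_DAY


# The Gemini design target stated on the slide.
print(round(average_tps(1_000_000_000)))
```

Under that even-load assumption the target works out to roughly 11,600 transactions per second.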
Gemini – Historical Roots
[Diagram: the Progress RDBMS and MySQL, each shown as a Language Processor and Server layered over an engine: the Storage Engine in Progress, the Gemini Engine in MySQL]
Gemini – Historical Roots
Progress RDBMS source for Technology
#6 Relational DB Worldwide
#1 Embedded DB Worldwide
IDC Worldwide Database Market – May 2000
Proven Performance and Reliability
Technology is Base for Gemini
– Recovery, Locking, B-Tree, Concurrency,
Cache, I/O and Transaction mechanisms
Important Design Factors
Gemini is designed to be:
– Database Schema Independent
– Record Format Independent
– Index Key Format Independent
– Server Architecture Independent
– Gemini API Closely Matches MySQL Table Handler API
Gemini Properties
Targeted squarely at OLTP model
– Heavy concurrent update
– Online maintenance operations
Expands open source web platform
– Backend database for e-commerce sites
– Reliability with 10+ years of “experience”
– Proven enterprise-class technology
Gemini Properties
Multi-threaded storage manager
– Concurrent read and write operations
– Concurrent commit support
– Fine grained locking of internal structures
– Online Recovery from failed threads
Scalable database cache
– Dynamic data and index cache size
– 128GB capacity (RAM limited)
– LRU mechanism with index page priority
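The slide does not say how "index page priority" works inside the LRU mechanism. One plausible reading, sketched below purely as an assumption, is a single shared cache in which data pages are evicted before index pages, with least-recently-used order applied within each kind. The class `PageCache` and its methods are hypothetical and are not Gemini's implementation.

```python
from collections import OrderedDict


class PageCache:
    """Toy shared page cache: LRU eviction, data pages evicted before index pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        # One LRU queue per page kind; both draw on the same capacity, so the
        # split between data and index pages is dynamic.
        self.pools = {"data": OrderedDict(), "index": OrderedDict()}

    def __len__(self):
        return sum(len(pool) for pool in self.pools.values())

    def get(self, page_id):
        """Return the cached page and mark it most recently used, or None."""
        for pool in self.pools.values():
            if page_id in pool:
                pool.move_to_end(page_id)
                return pool[page_id]
        return None

    def put(self, page_id, contents, kind="data"):
        """Cache a page, evicting least-recently-used pages if over capacity."""
        for pool in self.pools.values():
            pool.pop(page_id, None)
        self.pools[kind][page_id] = contents
        while len(self) > self.capacity:
            # Assumed policy: take victims from data pages while any exist.
            victims = self.pools["data"] or self.pools["index"]
            victims.popitem(last=False)


cache = PageCache(capacity=2)
cache.put("d1", "row data")
cache.put("i1", "b-tree node", kind="index")
cache.put("d2", "more row data")  # evicts d1, keeps the index page
print(cache.get("i1") is not None, cache.get("d1") is None)
```

A strict priority like this is the simplest variant. A real engine would more likely weight index pages than protect them absolutely, since a cache full of index pages would otherwise evict every new data page at once.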
Gemini Properties - Transactions
Support for ACID Transactions
– Atomicity – All or nothing
– Consistency – Data in consistent state
– Isolation – Allow independence
– Durability – Effects persist always
Gemini Properties - Transactions
Support for 4 Standard Isolation levels
– Read uncommitted
– Read committed
– Repeatable read
– Serializable
Table and row lock support
– 6 mode lock manager (intent support)
– 2 phase with automatic lock acquisition
– Delegated delete locks
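The slide names a six-mode lock manager with intent support but does not list the modes. As an illustration only, the sketch below uses the classic six-mode set from the multiple-granularity locking literature (NL, IS, IX, S, SIX, X). Gemini's actual modes may differ. The names `COMPATIBLE`, `can_grant`, and `locks_for_row_write` are hypothetical.

```python
# Assumed mode set: null, intent-shared, intent-exclusive, shared,
# shared-with-intent-exclusive, exclusive.
MODES = ("NL", "IS", "IX", "S", "SIX", "X")

# For each requested mode, the modes other transactions may already hold.
COMPATIBLE = {
    "NL": {"NL", "IS", "IX", "S", "SIX", "X"},
    "IS": {"NL", "IS", "IX", "S", "SIX"},
    "IX": {"NL", "IS", "IX"},
    "S": {"NL", "IS", "S"},
    "SIX": {"NL", "IS"},
    "X": {"NL"},
}


def can_grant(requested, held_by_others):
    """True if `requested` is compatible with every mode already held."""
    return all(mode in COMPATIBLE[requested] for mode in held_by_others)


def locks_for_row_write():
    """Locks taken automatically for a row update: intent on the table first."""
    return [("table", "IX"), ("row", "X")]


# Two writers on different rows can share the table: IX is compatible with IX.
print(can_grant("IX", ["IX"]))
# A full-table reader must wait while any writer holds IX on the table.
print(can_grant("S", ["IX"]))
```

Intent modes are what let table locks and row locks coexist: a row writer announces itself at the table level with IX, so a later table-level S or X request can be checked without scanning every row lock.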
Transaction Isolation Levels
Levels described in terms of possible
anomalies
– Dirty read - read data written by concurrent
uncommitted transaction.
– Non-repeatable reads - re-read data
previously read and see data modified by
another committed transaction
– Phantom read - re-run same query and see
additional rows inserted by another
committed transaction.
Transaction Isolation Levels
Level               Dirty Read    Non-Repeatable Read    Phantom Read
Read uncommitted    Yes           Yes                    Yes
Read committed      No            Yes                    Yes
Repeatable read     No            No                     Yes
Serializable        No            No                     No
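The isolation-level table can also be read as a lookup: given the anomalies an application cannot tolerate, pick the weakest level that rules them all out. The sketch below encodes the table from this slide. The names `ALLOWED_ANOMALIES` and `weakest_level_preventing` are illustrative and not part of any database API.

```python
# Anomalies each level still permits, ordered from weakest to strongest level.
ALLOWED_ANOMALIES = {
    "read uncommitted": {"dirty read", "non-repeatable read", "phantom read"},
    "read committed": {"non-repeatable read", "phantom read"},
    "repeatable read": {"phantom read"},
    "serializable": set(),
}


def weakest_level_preventing(unwanted):
    """Return the first (weakest) level that permits none of `unwanted`."""
    unwanted = set(unwanted)
    for level, allowed in ALLOWED_ANOMALIES.items():
        if not (allowed & unwanted):
            return level
    raise ValueError(f"unknown anomaly in {sorted(unwanted)}")


print(weakest_level_preventing(["dirty read"]))
print(weakest_level_preventing(["dirty read", "non-repeatable read"]))
print(weakest_level_preventing(["phantom read"]))
```

Choosing the weakest sufficient level matters because stronger levels generally hold more locks for longer, which reduces concurrency.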
Durable Transactions
Recovery Log
DB on Disk
• Previous Record
• Transaction Notes
• Update Notes
Memory Copy
• Updated Record
Rollback and Recovery
Recovery Log
DB on Disk
• Previous Record
• Transaction Notes
• Update Notes
Memory Copy
• Updated Record
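One common way to read the "Durable Transactions" and "Rollback and Recovery" diagrams is as a form of logging in which every update note records the previous value before the in-memory copy changes. The toy store below sketches that idea under several assumptions: the log survives a crash, it covers every change since the disk image was written, and `None` means "no record". `TinyLogStore` and its methods are hypothetical and do not describe Gemini's actual recovery code.

```python
class TinyLogStore:
    """Toy key-value store with a recovery log holding before and after values."""

    def __init__(self, disk=None):
        self.disk = dict(disk or {})    # stands in for "DB on Disk"
        self.memory = dict(self.disk)   # stands in for "Memory Copy"
        self.log = []                   # stands in for "Recovery Log"

    @staticmethod
    def _apply(table, key, value):
        if value is None:
            table.pop(key, None)
        else:
            table[key] = value

    def update(self, txn, key, value):
        # Write the update note (with the previous record) before changing memory.
        self.log.append(("update", txn, key, self.memory.get(key), value))
        self._apply(self.memory, key, value)

    def commit(self, txn):
        self.log.append(("end", txn))   # transaction note

    def rollback(self, txn):
        # Undo newest first; each undo is itself logged as a compensating update.
        mine = [e for e in self.log if e[0] == "update" and e[1] == txn]
        for _, _, key, previous, _ in reversed(mine):
            self.update(txn, key, previous)
        self.log.append(("end", txn))

    def recover(self):
        """After a crash: redo every logged update, then undo unfinished work."""
        state = dict(self.disk)
        for entry in self.log:
            if entry[0] == "update":
                self._apply(state, entry[2], entry[4])
        finished = {entry[1] for entry in self.log if entry[0] == "end"}
        for entry in reversed(self.log):
            if entry[0] == "update" and entry[1] not in finished:
                self._apply(state, entry[2], entry[3])
        self.memory = state


store = TinyLogStore({"balance": 100})
store.update("t1", "balance", 80)
store.commit("t1")
store.update("t2", "balance", 50)   # t2 never commits
store.memory = {}                   # simulate losing memory in a crash
store.recover()
print(store.memory)
```

Committed work survives because it can be redone from the log, and the unfinished transaction disappears because its previous values can be restored from the same log.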
Scalability
What is scalability, why does it matter?
– Increase workload or hardware
More work gets done (concurrency)
Work gets done faster (response time)
– Improve throughput with additional
resources
Bottlenecks can be removed
– Minimal impact on response time when
scaling
Scalability
Architectural limits raised as high as possible
– Goal: storage engine does not impose limits.
– Available hardware and OS are limiting factors.
– Example – Concurrent Users:
Demonstration 5,000 users
Published limit is 10,000
Tested Limit is 32,000
Architecture limit is 4 billion
Scalability
A system may be limited by:
– Number of disks and controllers
– Number of open files, OS kernel
– Memory
Not the underlying database
Multi-Processor Support
Several flavors of spin locks
Can directly use 32 CPU SMP Hardware
Alpha, IA32, IA64, PA-RISC, Power, Sparc
Work with hardware vendors to tune chip
specific resource locking primitives.
– Non blocking, no unneeded system calls
– Account for CPU cache characteristics
– Instruction and Data Fence requirements
Gemini Crash Recovery
Automated recovery and logging
– Log is created at system startup
– Space is reused automatically
– Log is optionally removed at shutdown
Asynchronous checkpoint support
– Stable performance under load
– Self tuning multi-threaded I/O subsystem
24 X 7 Availability
Powerful High Availability solution:
– Automatic Crash Recovery
Online in 15 to 30 seconds
– Fail over Clusters
Eliminate single points of failure
– Flexible Backup Solutions
Online Backup – non blocking
Split Mirror Zero Impact Backup
– Table and Site Replication
Support Server Farm Model
Gemini – Ease of Use
Familiar storage model used by MySQL
MySQL native record format
Specify table type to use Gemini tables
CREATE TABLE … TYPE = GEMINI;
Programming model
– Table and row-level locking
– Statement atomicity
– Multi-statement ACID transactions
– Standard isolation levels
Gemini – Current Status
Beginning Formal Beta
Check-in via MySQL community process
Basic functions complete (insert, update, delete,
select) for all types
Active reliability testing
– Can run same tests against multiple table types
Benchmark work started
– Target: High concurrency, heavy update
– Goal: Fastest engine for transaction processing
Agenda
Rise of Open Source Databases
Traditional and Open Source Licensing
Properties of Successful Community
MySQL Gemini Project
Future Trends
It’s All In The Community!
A traditional software product rarely has a
sustaining community outside the
employees of the company.
Look for existing communities and create
or integrate missing technology.
Contributors using community process?
Corporate IT Is Catching On
Response to OSDB is no longer to challenge its
viability.
IT managers know they can get support and
packaged distributions.
OSDB is proven solution for wide range of web
infrastructure.
Gemini changes the rules for OLTP systems built
with open source software.
Corporate IT Is Catching On
New projects are looking at OSDB for
significant aspects of a solution.
Initial acquisition costs no longer a factor;
IT can choose the service level per application,
based on business needs.
IT will increasingly go with an OSDB
solution.
Questions…
[email protected]
www.nusphere.com