SQL Server 7.0 Strategy Deck

A Database View of Intelligent Disks
James Hamilton
Microsoft SQL Server
Overview

Do we agree on the assumptions?
A database perspective:
  Scalable DB clusters are inevitable
  Affordable SANs are (finally) here
  Admin & Mgmt costs dominate
  Intelligent disks are coming
  DB exploitation of intelligent disk
  Failed DB machines all over again?
Use of intelligent disk: NASD model?
  full server slice vs. a file block server
Conclusions

Do we agree on the assumptions?

From the NASD web page:
“Computer storage systems are built from sets of disks, tapes, disk arrays, tape robots, and so on, connected to one or more host computers. We are moving towards a world where the storage devices are no longer directly connected to their host systems, but instead talk to them through an intermediate high-performance, scalable network such as FibreChannel. To fully benefit from this, we believe that the storage devices will have to become smarter and more sophisticated.”

Premise: conclusion is 100% correct; we’ll question the assumptions that led to it
There are alternative architectures with strong advantages for DB workloads

Clusters Are Inevitable: Query

Data intensive application workloads (data warehousing, data mining, & complex query) are growing quickly
Greg’s law: DB capacity growing 2X every 9-12 months (Patterson)
DB capacity requirements growing super-Moore
Complex query workloads tend to scale with DB size
many CPUs required
“Shared memory is fine as long as you don’t share it” (Helland)
clusters only DB architecture with sufficient scale
We only debate at what point, not if, clusters are required

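The “super-Moore” claim above can be checked with simple arithmetic. The sketch below compares the slide’s 9-12 month doubling period against a processor doubling period of 18 months; that 18-month figure is an assumption (a common reading of Moore’s law), not a number from the deck, and the function names are illustrative only.

```python
# Assumption: processor capability doubles every 18 months.
# The deck only says "super-Moore"; it does not give this number.
MOORE_DOUBLING_MONTHS = 18.0


def growth_factor(years: float, doubling_months: float) -> float:
    """Multiple of the starting capacity after `years` of steady doubling."""
    return 2.0 ** (years * 12.0 / doubling_months)


def super_moore_gap(years: float, db_doubling_months: float) -> float:
    """How far DB capacity outruns a single processor over `years`."""
    db = growth_factor(years, db_doubling_months)
    cpu = growth_factor(years, MOORE_DOUBLING_MONTHS)
    return db / cpu


if __name__ == "__main__":
    for months in (9.0, 12.0):  # the range quoted on the slide
        gap = super_moore_gap(3.0, months)
        print(f"doubling every {months:.0f} months: {gap:.1f}x gap after 3 years")
```

Under these assumptions the gap after three years is between 2x and 4x per processor, which is the shortfall that additional CPUs (and eventually additional nodes) have to absorb.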
Clusters Are Inevitable: TP Apps

Most database transaction workloads currently hosted on SMPs
Prior to the web, TP workloads tended to be reasonably predictable
  TP workloads scale with customer base/business size
  Load changes at speed of business change (typically slow)
Web puts back office in the front office
  Much of the world has direct access - very volatile
Capacity planning goes from black art to impossible
Server capacity variation is getting much wider
Need near infinite, incremental growth capability with potential to later de-scale
  Wheel-in/wheel out upgrade model doesn’t work
  clusters are only DB architecture with sufficient incremental growth capability

Clusters Are Inevitable: Availability

Non-cluster server architectures suffer from many single points of failure
Web enabled direct server access model driving high availability requirements:
  recent high profile failures at eTrade and Charles Schwab
Web model enabling competition in access to information
  Drives much faster server side software innovation which negatively impacts quality
“Dark machine room” approach requires auto-admin and data redundancy (Inktomi model)
  42% of system failures admin error (Gray)
  Paging admin at 2am hoping for quality response is dangerous
Fail fast design approach is robust but only acceptable with redundant access to redundant copies of data
Cluster Architecture is required for availability

Shared Nothing Clusters Are Inevitable

Data-intensive application capacity growth requirement is seriously super-Moore
Increasing proportion of apps are becoming data intensive:
  E.g. High end web sites typically DB backed
Transaction workloads now change very rapidly and unpredictably
High availability increasingly important
Conclusion: cluster database architecture is required
  supported by Oracle, IBM, Informix, Tandem, …
Why don’t clusters dominate today?
  High inter-server communications costs
  Admin & management costs out of control

Affordable SANs Are (Finally) Here

TCP/IP send/receive costs on many O/Ss in 15k instr range
  some more than 30K
Communications costs make many cluster database application models impractical
Bandwidth important, but prime issues are CPU consumption and, to lesser extent, latency
A system area network (SAN) is used to connect clustered servers together
  typically high bandwidth
  Send/receive without O/S kernel transition (50 to 100 instructions common)
  Round trip latency in 15 microsecond range
SANs not new (e.g. Tandem)
Commodity-priced parts are new (Myrinet, Giganet, ServerNet, etc.) and available today
www.viarch.org

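To make the CPU consumption point concrete, here is a rough back-of-the-envelope sketch. The per-message instruction counts (15k for TCP/IP, 100 as the upper end of the SAN range) come from the slide; the message rate and the processor speed are hypothetical values chosen only for illustration.

```python
TCP_INSTR_PER_MSG = 15_000  # from the slide: TCP/IP send/receive cost
SAN_INSTR_PER_MSG = 100     # from the slide: upper end of the 50-100 range

# Hypothetical values, not from the deck.
CPU_INSTR_PER_SEC = 500e6   # assumed processor speed
MSGS_PER_SEC = 10_000       # assumed inter-node message rate per server


def cpu_fraction(msgs_per_sec: float, instr_per_msg: float,
                 cpu_instr_per_sec: float) -> float:
    """Fraction of one CPU spent purely on message send/receive."""
    return msgs_per_sec * instr_per_msg / cpu_instr_per_sec


if __name__ == "__main__":
    tcp = cpu_fraction(MSGS_PER_SEC, TCP_INSTR_PER_MSG, CPU_INSTR_PER_SEC)
    san = cpu_fraction(MSGS_PER_SEC, SAN_INSTR_PER_MSG, CPU_INSTR_PER_SEC)
    print(f"TCP/IP: {tcp:.1%} of a CPU, SAN: {san:.1%} of a CPU")
```

With these assumed numbers, TCP/IP messaging would consume about 30% of a processor while the SAN path consumes about 0.2%, which is why CPU cost, rather than bandwidth, is treated as the prime issue.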
Admin & Mgmt Costs Dominate

Bank of America: “You keep explaining to me how I can solve your problems”
Admin costs single largest driver of IT costs
Admitting we have a problem is first step to a cure:
  Most commercial DBs now focusing on admin costs
  SQL Server:
    Enterprise manager (MMC framework, same as O/S)
    Integrated security with O/S
    Index tuning wizard (Surajit Chaudhuri)
    Auto-statistics creation
    Auto-file grow/shrink
    Auto memory resource allocation
“Install and run” model is near
  Trades processor resources for admin costs

Intelligent Disks Are Coming

Fatalism: they’re building them so we might as well figure out how to exploit (Patterson trying to get us DB guys to catch on)
Reality: disk manufacturers work with very thin margins and will continue to try to add value to their devices (Gibson)
Many existing devices already (under-)exploiting commodity procs (e.g. 68020)
Counter argument: Prefer general purpose processor for DB workloads:
  Dynamic workload requirements: computing joins, aggregations, applying filters, etc.
What if it was both a general purpose proc and embedded on disk controller?

DB Exploitation of Intelligent Disk

Each disk includes network, CPU, memory and drive subsystem
  All on disk package; it already had power, chassis and PCB
  scales as a unit in small increments
  Runs full std O/S (e.g. Linux, NT, …)
Each is a node in single image, shared nothing database cluster
Continues long standing DB trend of moving function to the data:
  Stored procedures
  Joins done at server
  Internally as well: SARGable predicates run in storage engine

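The idea of moving function to the data in a shared nothing cluster can be sketched in a few lines. This is a toy illustration, not how SQL Server or any particular product implements it: rows are hash partitioned across nodes, each node applies the filter and a partial SUM to its own rows, and only the small partial results travel to a coordinator. All function names, the row layout, and the sample data are hypothetical.

```python
from collections import defaultdict


def partition(rows, key, node_count):
    """Hash partition rows across nodes (integer keys, so modulo suffices)."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[row[key] % node_count].append(row)
    return nodes


def node_scan(local_rows, predicate, group_key, value_key):
    """Runs where the data lives: filter first, then partial SUM per group."""
    partial = defaultdict(int)
    for row in local_rows:
        if predicate(row):
            partial[row[group_key]] += row[value_key]
    return dict(partial)


def coordinator_merge(partials):
    """Combine the per-node partial sums into the final answer."""
    total = defaultdict(int)
    for partial in partials:
        for group, value in partial.items():
            total[group] += value
    return dict(total)


def run_query(rows, node_count, predicate, group_key, value_key):
    nodes = partition(rows, "id", node_count)
    partials = [node_scan(n, predicate, group_key, value_key) for n in nodes]
    return coordinator_merge(partials)


ORDERS = [
    {"id": 1, "region": "east", "amount": 50},
    {"id": 2, "region": "west", "amount": 200},
    {"id": 3, "region": "east", "amount": 150},
    {"id": 4, "region": "west", "amount": 120},
    {"id": 5, "region": "east", "amount": 300},
]

if __name__ == "__main__":
    # Roughly: SELECT region, SUM(amount) ... WHERE amount >= 100 GROUP BY region
    print(run_query(ORDERS, 3, lambda r: r["amount"] >= 100, "region", "amount"))
```

The result does not depend on the number of nodes, and each node ships at most one number per group instead of its raw rows.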
DB Exploitation of Intelligent Disk

Client systems are sold complete:
  Include O/S, relevant device drivers, office productivity apps, …
Server systems require weeks to months of capacity planning, training, installing, configuring, and testing before going live
Let’s make the client model work for servers:
  Purchase a “system” (frame & 2 disk, cpu, memory, and network units)
  Purchase server-slices as required when required
Move to a design point where H/W is close to free and admin costs dominate design decisions
  High hardware volume still drives significant revenue

DB Exploitation of Intelligent Disk

Each slice contains S/W for file, DB, www, mail, directory, … no install
Adding capacity is plugging in a few more slices and choosing personality to extend
Due to large number of components in system, reliability is an issue
  “Nothing fails fast … just eventually performs poorly enough to be ‘fired’ … typically devices don’t just ‘quit’” (Patterson)
  Introspection is key: dedicate some resources to tracking intermittent errors and predicting failure
  Take action prior to failure … RAID-like model where disks fail but system keeps running
  Add slices when capacity increase or accumulating failures require it

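One very simple form of the introspection described above is a sliding window over recent operation outcomes per slice. The sketch below is only an illustration of that idea; the class name, the window size, and the 5% threshold are hypothetical choices, and a real system would use richer signals than a single error rate.

```python
from collections import deque


class SliceHealth:
    """Tracks recent I/O outcomes for one slice and flags it for retirement."""

    def __init__(self, window: int = 100, retire_threshold: float = 0.05):
        # Only the most recent `window` outcomes are kept.
        self.results = deque(maxlen=window)
        self.retire_threshold = retire_threshold  # hypothetical cutoff

    def record(self, ok: bool) -> None:
        self.results.append(ok)

    def error_rate(self) -> float:
        if not self.results:
            return 0.0
        return self.results.count(False) / len(self.results)

    def should_retire(self) -> bool:
        # Wait for a full window so a single early error does not "fire"
        # a slice, then act before the device quits outright.
        window_full = len(self.results) == self.results.maxlen
        return window_full and self.error_rate() > self.retire_threshold


if __name__ == "__main__":
    health = SliceHealth()
    for _ in range(100):
        health.record(True)
    for _ in range(10):          # intermittent errors start to accumulate
        health.record(False)
    print(health.error_rate(), health.should_retire())
```

A slice flagged this way would be drained and replaced while redundant copies of its data keep the system running, in line with the RAID-like model on the slide.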
Failed DB machines all over again?

Numerous past projects both commercial & research
  Britton Lee probably best remembered
  Solutions looking for a problem (Stonebraker)
What went wrong?
  Special purpose hardware with low volume
  High, difficult to amortize engineering costs
  Fell off general purpose system technology curve
  Database sizes were smaller and most server systems were not single function machines
  Non-standard models for admin, management, security, programming, etc.

How about H/W DB accelerators?

Many efforts to produce DB accelerators
  E.g. ICL CAFS
  I saw at least one of these proposals a year while I was working on DB2
Why not?
  The additional H/W only addresses a tiny portion of total DB function
  Device driver support required
  Substantial database engineering investment required to exploit
  Device must have intimate knowledge of database physical row format in addition to logical properties like international sort orders (bug-for-bug semantic match)
  Low volume so devices quickly fall off commodity technology curve
  ICL CAFS supported a single proc; general commodity SMPs made it irrelevant

Use of intelligent disk: NASD?

NASD has architectural advantages when data can be sent from block server directly to client:
  Many app-models require significant server side processing preventing direct transfer (e.g. all database processing)
Could treat the intermediate server as a NASD “client”
  Gives up advantages of not transferring data through intermediate server
  Each set of disk resources requires additional network, memory, and CPU resources
Why not add together as self contained locally attached unit?
Rather than directly transfer from the disk to the client, move intermediate processing to data (continuation of long database tradition)

Use of Intelligent disk: NASD Model?

Making disk unit full server-slice allows use of existing:
  Commodity operating system
  device driver framework and drivers
  file system (API and on-disk format)
  No client changes required
  Object naming and directory lookup
  Leverage on-going DB engineering investment
  LOB apps (SAP, PeopleSoft, …)
  security, admin, and mgmt infrastructure
  Customer training and experience
  Program development environment investment
If delivered as peer nodes in a cluster, no mass infrastructure re-write required prior to intelligent disk adoption

Use of Intelligent disk: NASD Eng. Costs

New device driver model hard to sell:
  OS/2 never fully got driver support
  NT still has less support than Win95/98
  Typical UNIX systems support far fewer devices
Getting new file system adoption difficult
  HPFS on OS/2 never got heavy use
  After a decade NTFS now getting server use
Will O/S and file system vendors want new server side infrastructure:
  What is upside for them?
  If written and evangelized by others, will it be adopted without system vendor support?
Intelligent disk is the right answer, question is what architecture exploits them best and promotes fastest adoption

Conclusions

Intelligent disk will happen
  An opportunity for all of us to substantially improve server-side infrastructure
NASD could happen but alternative architectures also based upon intelligent disk appear to:
  Require less infrastructure re-work
  Offer more benefit to non-file app models (e.g. DB)
Intelligent disk could form generalized, scalable server side component
  CPU, network, memory, and disk
  Emulate client-side sales and distribution model: all software and hardware included in package
  Client side usage model: use until fails and then discard