EML_PBDB - Jim Gray Summary Home Page

Building Peta-Byte Data Stores
Jim Gray
@
Klaus Tschira Anniversary
European Media Lab
12 February 2001
How Much Information Is There?
• Soon everything can be recorded and indexed.
• Most data will never be seen by humans.
• Precious resource: human attention.
• Auto-summarization and auto-search are the key technologies.
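[Figure: scale of information stores from Kilo to Yotta. Labels, top to bottom: Everything Recorded!, All Books MultiMedia, All LoC books (words), A Movie, A Photo, A Book.]
Large prefixes, ascending: Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta.
Small prefixes (negative powers of ten): 24 yocto, 21 zepto, 18 atto, 15 femto, 12 pico, 9 nano, 6 micro, 3 milli.
Source: www.lesk.com/mlesk/ksg97/ksg.html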
ops/s/$ Had Three Growth Phases
Now doubling every year.
• 1890-1945: mechanical, relay. 7-year doubling (chart label: 7.5 years).
• 1945-1985: tube, transistor, ... 2.3-year doubling.
• 1985-2000: microprocessor. 1.0-year doubling.
[Figure: ops per second/$ on a log scale (1.E-06 to 1.E+09) versus year (1880 to 2000), with the three doubling rates marked.]
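The doubling times above translate directly into yearly growth factors. The following is a minimal sketch of that arithmetic; the function names are illustrative, and the phase lengths are simply the year ranges quoted on the slide.

```python
def annual_growth_factor(doubling_years: float) -> float:
    """Multiplier per year for a quantity that doubles every `doubling_years`."""
    return 2.0 ** (1.0 / doubling_years)


def growth_over(span_years: float, doubling_years: float) -> float:
    """Total multiplier accumulated over `span_years` at a fixed doubling time."""
    return 2.0 ** (span_years / doubling_years)


# (label, phase length in years, doubling time in years), taken from the slide.
PHASES = [
    ("1890-1945 mechanical/relay", 55, 7.5),
    ("1945-1985 tube/transistor", 40, 2.3),
    ("1985-2000 microprocessor", 15, 1.0),
]

for label, span, doubling in PHASES:
    print(f"{label}: x{annual_growth_factor(doubling):.2f} per year, "
          f"x{growth_over(span, doubling):.3g} over the phase")
```

A 1.0-year doubling sustained for 15 years is a factor of 2^15 = 32768, which is why the last phase dominates the chart.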
Gilder’s Law:
3x bandwidth/year for 25 more years
• Today:
  – 10 Gbps per channel (per lambda)
  – 4 channels per fiber: 40 Gbps
  – 32 fibers/bundle = 1.2 Tbps/bundle
• In lab: 3 Tbps/fiber (400 x WDM)
• In theory: 25 Tbps per fiber
• 1 Tbps = USA 1996 WAN bisection bandwidth
• Aggregate bandwidth doubles every 8 months!
[Figure label: 1 fiber = 25 Tbps]
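The per-fiber and per-bundle figures, and the "doubles every 8 months" claim, can be checked with a few lines of arithmetic. This is a sketch using only the numbers on the slide; the constant and function names are illustrative.

```python
import math

GBPS_PER_LAMBDA = 10      # per channel, from the slide
LAMBDAS_PER_FIBER = 4     # channels per fiber
FIBERS_PER_BUNDLE = 32

fiber_gbps = GBPS_PER_LAMBDA * LAMBDAS_PER_FIBER
bundle_tbps = fiber_gbps * FIBERS_PER_BUNDLE / 1000.0


def doubling_time_months(growth_per_year: float) -> float:
    """Months needed to double when growing by `growth_per_year` times each year."""
    return 12.0 * math.log(2.0) / math.log(growth_per_year)


print(f"per fiber:  {fiber_gbps} Gbps")
print(f"per bundle: {bundle_tbps:.2f} Tbps")
print(f"3x/year doubles every {doubling_time_months(3.0):.1f} months")
```

The bundle works out to 1.28 Tbps, which the slide rounds to 1.2 Tbps, and tripling every year corresponds to a doubling time of a little under 8 months.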
[Figure: map of a wide-area network path, 5626 km and 10 hops. Sites labeled: Redmond/Seattle, WA; San Francisco, CA; New York; Arlington, VA. Participants labeled: Information Sciences Institute, Microsoft, Qwest, University of Washington, Pacific Northwest Gigapop, HSCC (high speed connectivity consortium), DARPA.]
Storage capacity beating Moore’s law
• 3 k$/TB today (raw disk)
• 3 M$/PB
Source: 1998 Disk Trend (Jim Porter), http://www.disktrend.com/pdf/portrpkg.pdf

Annual rates:
  Moore's law      58.70%/year
  Revenue           7.47%/year
  TB growth       112.30%/year (since 1993)
  Price decline    50.70%/year (since 1993)

[Figure: Disk TB Shipped per Year, log scale from 1E+3 to 1E+7 TB (ExaByte marked at 1E+6), 1988 to 2000; disk TB growth of 112%/y plotted against Moore's Law at 58.7%/y.]
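To compare the two growth rates on the same footing as the doubling times used elsewhere in the talk, the annual percentages can be converted to doubling times. This is a sketch of that conversion and of the $/PB figure; the function name is illustrative and 1 PB is taken as 1000 TB.

```python
import math


def doubling_time_years(annual_growth_pct: float) -> float:
    """Years to double at a compound growth of `annual_growth_pct` percent per year."""
    return math.log(2.0) / math.log(1.0 + annual_growth_pct / 100.0)


moore_doubling = doubling_time_years(58.7)      # Moore's law rate from the slide
disk_tb_doubling = doubling_time_years(112.3)   # disk TB shipped, since 1993

DOLLARS_PER_TB = 3_000        # "3 k$/TB today (raw disk)"
TB_PER_PB = 1_000             # decimal units
dollars_per_pb = DOLLARS_PER_TB * TB_PER_PB

print(f"Moore's law doubles every {moore_doubling:.2f} years")
print(f"Disk TB shipped doubles every {disk_tb_doubling:.2f} years")
print(f"Raw disk cost per PB: ${dollars_per_pb:,}")
```

58.7% per year is the familiar 18-month doubling, while 112.3% per year doubles shipped capacity in roughly 11 months.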
Microsoft TerraServer:
http://TerraServer.Microsoft.com/
• Build a multi-TB SQL Server database
• Data must be
  – 1 TB
  – Unencumbered
  – Interesting to everyone everywhere
  – And not offensive to anyone anywhere
• Loaded
  – 1.5 M place names from Encarta World Atlas
  – 7 M sq km USGS DOQ (1 meter resolution)
  – 10 M sq km USGS topos (2 m)
  – 1 M sq km from Russian Space Agency (2 m)
• On the web (world’s largest atlas)
• Sell images with commerce server.
TerraServer 4.0 Configuration
3 Active Database Servers
• SQL\Inst1: Topo & Relief Data
• SQL\Inst2: Aerial Imagery
• SQL\Inst3: Aerial Imagery
Logical Volume Structure
• One rack per database
• All volumes triple mirrored (3x)
• MetaData on 15k rpm 18.2 GB drives
• Image Data on 10k rpm 72.8 GB drives
• MetaData: 101 GB
• Image1-10: 3.4 TB cooked, as 10 x 339 GB volumes
• Spread across 3 servers
  – 2x4 to photo servers
  – 1x2 for topo/relief server

File Group    Rows (millions)   Total Size (GB)   Data Size (GB)   Index Size (GB)
Admin                       1                 0              0.1                 0
Gazetteer                  17                 5                1                 3
Image                     254             2,237            2,220                17
Meta                      254                70               53                17
Search                     46                10                5                 5
Grand Total               572             2,322            2,280                42

[Figure: hardware diagram. Labels: four Compaq 8500 servers (SQL\Inst1, SQL\Inst2, SQL\Inst3, Passive Srvr), Compaq disk controllers with lettered volumes, and 8 two-processor “Photon” DL360 web servers.]
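The volume layout and the file-group table can be cross-checked with simple arithmetic. The sketch below uses only figures from the slide; the raw-capacity line assumes that triple mirroring means three full copies of each volume and ignores any formatting overhead, which the slide does not specify.

```python
# (rows in millions, total GB, data GB, index GB) per file group, from the table.
FILE_GROUPS = {
    "Admin":     (1,   0,    0.1,  0),
    "Gazetteer": (17,  5,    1,    3),
    "Image":     (254, 2237, 2220, 17),
    "Meta":      (254, 70,   53,   17),
    "Search":    (46,  10,   5,    5),
}

rows_sum = sum(g[0] for g in FILE_GROUPS.values())
total_sum = sum(g[1] for g in FILE_GROUPS.values())
data_sum = sum(g[2] for g in FILE_GROUPS.values())
index_sum = sum(g[3] for g in FILE_GROUPS.values())

# Image volumes: 10 volumes of 339 GB each ("3.4 TB cooked").
image_usable_gb = 10 * 339
# Assumption: triple mirroring stores three full copies of every volume.
image_raw_gb = 3 * image_usable_gb

print(f"rows: {rows_sum} M, total: {total_sum} GB, "
      f"data: {data_sum:.1f} GB, index: {index_sum} GB")
print(f"image volumes: {image_usable_gb} GB usable, {image_raw_gb} GB raw if 3x mirrored")
```

The column sums agree with the Grand Total row to within rounding, and 10 x 339 GB gives the 3.4 TB quoted for the image volumes.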
TerraServer.Microsoft.NET
A Web Service
Before .NET:
  Web Browser <-> Internet (HTML page, image tile) <-> TerraServer Web Site <-> TerraServer SQL Db
With .NET:
  Application Program <-> Internet <-> TerraServer Web Service <-> TerraServer SQL Db
Web Service methods:
  GetAreaByPoint
  GetAreaByRect
  GetPlaceListByName
  GetPlaceListByRect
  GetTileMetaByLonLatPt
  GetTileMetaByTileId
  GetTile
  ConvertLonLatToNearestPlace
  ConvertPlaceToLonLatPt
  ...
TerraServer Recent/Current Effort
• Added USGS Topographic maps (4 TB)
• High availability (4 node cluster with failover)
• Integrated with Encarta Online
• The other 25% of the US DOQs (photos)
• Adding digital elevation maps
• Open architecture: publish SOAP interfaces.
• Adding multi-layer maps (with UC Berkeley)
• Geo-Spatial extension to SQL Server
Astronomy is Changing
(and so are other sciences)
The World Virtual Observatory
• Astronomers have a few PB
• Doubles every 2 years.
• Data is public after 2 years.
• So: everyone has ½ the data
• Some people have 5% more “private data”
• So, it’s a nearly level playing field:
  – Most accessible data is public.
• Cyberspace is the new telescope:
  – Multi-spectral, very deep,…
• Computer Science challenge:
  – Organize these datasets
  – Provide easy access to them.
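The "everyone has ½ the data" step follows from the two preceding numbers. A minimal sketch, assuming the total volume grows exponentially with a fixed doubling time and that data becomes public exactly after the embargo period; the function name is illustrative.

```python
def public_fraction(doubling_years: float, embargo_years: float) -> float:
    """Fraction of all data collected so far that is already public.

    If the cumulative volume at time t is proportional to 2**(t / doubling_years),
    the public part is whatever existed `embargo_years` ago.
    """
    return 2.0 ** (-embargo_years / doubling_years)


print(public_fraction(doubling_years=2, embargo_years=2))  # 0.5
print(public_fraction(doubling_years=2, embargo_years=1))  # about 0.71
```

With a 2-year doubling and a 2-year embargo, exactly half of everything ever collected is public at any moment; a shorter embargo would raise that share.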
The Sloan Digital Sky Survey
Goal: Create a detailed multicolor map of the Northern Sky over 5 years
• Special 2.5m telescope
• Two surveys in one:
  – Photometric survey in 5 bands.
  – Spectroscopic redshift survey.
• Huge CCD Mosaic
  – 30 CCDs 2K x 2K (imaging)
  – 22 CCDs 2K x 400 (astrometry)
• Two high resolution spectrographs
  – 2 x 320 fibers, with 3 arcsec diameter.
  – R=2000 resolution with 4096 pixels.
  – Spectral coverage from 3900Å to 9200Å.
• Automated data reduction
  – Over 70 man-years of development effort (Fermilab + collaboration scientists).
• Very high data volume
  – 40 TB of raw, 3 TB cooked data (all public).
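The camera and spectrograph figures can be turned into pixel counts and resolution elements. This is a sketch under two stated assumptions: "2K" is read as 2048 pixels, and R is the usual resolving power λ/Δλ. Names are illustrative.

```python
K = 2048  # assumption: "2K" on the slide means 2048 pixels

imaging_pixels = 30 * K * K        # 30 CCDs, 2K x 2K
astrometry_pixels = 22 * K * 400   # 22 CCDs, 2K x 400


def resolution_element(wavelength_angstrom: float, resolving_power: float) -> float:
    """Smallest resolvable wavelength difference, from R = wavelength / delta."""
    return wavelength_angstrom / resolving_power


print(f"imaging mosaic:    {imaging_pixels / 1e6:.1f} Mpixels")
print(f"astrometry mosaic: {astrometry_pixels / 1e6:.1f} Mpixels")
print(f"delta at 3900 A: {resolution_element(3900, 2000):.2f} A")
print(f"delta at 9200 A: {resolution_element(9200, 2000):.2f} A")
```

Under these assumptions the imaging mosaic alone is about 126 megapixels per exposure, which helps explain the 40 TB raw data volume.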
The University of Chicago
Princeton University
The Johns Hopkins University
The University of Washington
Fermi National Accelerator Laboratory
US Naval Observatory
The Japanese Participation Group
The Institute for Advanced Study
SLOAN Foundation, NSF, DOE, NASA
The Cosmic Genome Project
The SDSS will create the ultimate map of the Universe, with much more detail than any other measurement before.
[Figure: redshift survey maps. Panels: Gregory and Thompson 1978; de Lapparent, Geller and Huchra 1986; da Costa et al. 1995; SDSS Collaboration 2002.]
Area and Size of Redshift Surveys
[Figure: number of objects (1.00E+03 to 1.00E+09) versus volume in Mpc^3 (1.00E+04 to 1.00E+11), log-log. Surveys plotted: SDSS photo-z, SDSS main, SDSS abs line, SDSS red, CfA+SSRS, 2dF, LCRS, SAPM, 2dFR, QDOT.]
Experiment with Relational DBMS
• See if SQL’s good indexing and scanning compensate for poor object support.
• Leverage fast/big/cheap commodity hardware.
• Ported 40 GB sample database (from SDSS sample scan) to SQL Server 2000.
• Building public web site and data server.
20 Astronomy Queries
• Implemented spatial access extension to SQL (HTM)
• Implement 20 Astronomy Queries in SQL (see paper for details).
• 15M rows, 378 cols, 30 GB. Can scan it in 8 minutes (disk IO limited).
• Many queries run in seconds.
• Create covering indexes on queried columns.
• Create ‘Neighbors’ table listing objects within 1 arcminute (5 neighbors on average) for spatial joins.
• Install some more disks!
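The scan time implies a sustained disk throughput, and the table dimensions imply an average row width. A sketch of both calculations, assuming decimal units (1 GB = 1000 MB = 10^9 bytes); names are illustrative.

```python
def scan_rate_mb_per_s(size_gb: float, minutes: float) -> float:
    """Sustained throughput needed to read `size_gb` in `minutes` (decimal units)."""
    return size_gb * 1000.0 / (minutes * 60.0)


TABLE_GB = 30
ROWS = 15_000_000
COLUMNS = 378

rate = scan_rate_mb_per_s(TABLE_GB, 8)
bytes_per_row = TABLE_GB * 1e9 / ROWS
bytes_per_column = bytes_per_row / COLUMNS

print(f"full scan: {rate:.1f} MB/s")
print(f"average row: {bytes_per_row:.0f} bytes, about {bytes_per_column:.1f} bytes per column")
```

A full scan at this size needs about 62.5 MB/s. Rows average about 2 KB, so a covering index on a handful of columns reads only a small fraction of that, which is consistent with the advice to index and to add disks.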
Query to Find Gravitational Lenses
Find all objects within 1 arc-minute of each other that have very similar colors (the color ratios u-g, g-r, r-i differ by less than 0.05 mag).
[Figure label: 1 arc-minute]
SQL Query to Find Gravitational Lenses
Find nearby objects with similar color ratios.
select count(*)
from Objects L, Objects O, neighbors N
where L.Obj_id = N.Obj_id
  and O.Obj_id = N.neighbor_Obj_id
  and L.Obj_id < O.Obj_id                  -- no dups
  and ABS((L.u-L.g)-(O.u-O.g))<0.05        -- similar color
  and ABS((L.g-L.r)-(O.g-O.r))<0.05        -- ratios
  and ABS((L.r-L.i)-(O.r-O.i))<0.05        -- (= dif of log)
  and ABS((L.z-L.r)-(O.z-O.r))<0.05
Finds 5223 objects, executes in 6 minutes.
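The query can be tried on a toy dataset. The sketch below swaps in SQLite (through Python's standard sqlite3 module) for SQL Server 2000, keeps the table and column names from the slide, and uses three made-up objects; the magnitudes and the neighbor pairs are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Objects (Obj_id INTEGER PRIMARY KEY,
                          u REAL, g REAL, r REAL, i REAL, z REAL);
    CREATE TABLE Neighbors (Obj_id INTEGER, neighbor_Obj_id INTEGER);
""")

# Hypothetical magnitudes: objects 1 and 2 have nearly identical colors,
# object 3 is much redder in u-g.
conn.executemany("INSERT INTO Objects VALUES (?,?,?,?,?,?)", [
    (1, 20.00, 19.00, 18.50, 18.20, 18.00),
    (2, 20.52, 19.51, 19.00, 18.71, 18.50),
    (3, 22.00, 19.00, 18.50, 18.20, 18.00),
])
# The Neighbors table lists each close pair in both directions.
conn.executemany("INSERT INTO Neighbors VALUES (?,?)",
                 [(1, 2), (2, 1), (1, 3), (3, 1)])

LENS_QUERY = """
    select count(*)
    from Objects L, Objects O, Neighbors N
    where L.Obj_id = N.Obj_id
      and O.Obj_id = N.neighbor_Obj_id
      and L.Obj_id < O.Obj_id                  -- count each pair once
      and ABS((L.u-L.g)-(O.u-O.g)) < 0.05
      and ABS((L.g-L.r)-(O.g-O.r)) < 0.05
      and ABS((L.r-L.i)-(O.r-O.i)) < 0.05
      and ABS((L.z-L.r)-(O.z-O.r)) < 0.05
"""
lens_pairs = conn.execute(LENS_QUERY).fetchone()[0]
print(lens_pairs)  # 1: only the pair (1, 2) matches
```

The precomputed Neighbors table turns the spatial proximity test into an ordinary equi-join, so the color comparison runs only on pairs that are already known to be close.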
SQL Results so far.
• Have run 17 of 20 queries so far. Working on spectra load and queries now.
• Most queries are IO bound (80 MB/sec on 4 disks, in 6 minutes).
• Covering indexes reduce execution to < 30 secs.
• Common to get grid distributions:
select convert(int,ra*30)/30.0  as ra_bucket,
       convert(int,dec*30)/30.0 as dec_bucket,
       count(*)                 as bucket_count
from Galaxies
where (u-g) > 1 and r < 21.5
group by convert(int,ra*30)/30.0, convert(int,dec*30)/30.0
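The grid query bins the sky into cells of 1/30 degree (2 arcminutes) on a side and counts galaxies per cell. A sketch of the same idea on toy data, again using SQLite through Python's sqlite3 rather than SQL Server; SQLite's CAST(... AS INTEGER) stands in for convert(int, ...), and the five galaxies are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Galaxies (ra REAL, dec REAL, u REAL, g REAL, r REAL)")

# Hypothetical rows: (ra, dec, u, g, r).
conn.executemany("INSERT INTO Galaxies VALUES (?,?,?,?,?)", [
    (10.01, 5.01, 20.0, 18.5, 20.0),   # kept, cell (10.0, 5.0)
    (10.02, 5.02, 21.0, 19.5, 21.0),   # kept, same cell
    (10.50, 5.50, 20.0, 18.0, 20.0),   # kept, cell (10.5, 5.5)
    (10.01, 5.01, 19.0, 18.5, 20.0),   # dropped: u-g is not > 1
    (10.01, 5.01, 20.0, 18.5, 22.0),   # dropped: r is not < 21.5
])

GRID_QUERY = """
    select CAST(ra*30 AS INTEGER)/30.0  as ra_bucket,
           CAST(dec*30 AS INTEGER)/30.0 as dec_bucket,
           count(*)                     as bucket_count
    from Galaxies
    where (u-g) > 1 and r < 21.5
    group by ra_bucket, dec_bucket
    order by ra_bucket, dec_bucket
"""
rows = conn.execute(GRID_QUERY).fetchall()
for ra_bucket, dec_bucket, bucket_count in rows:
    print(ra_bucket, dec_bucket, bucket_count)
```

SQLite accepts the column aliases in GROUP BY; SQL Server does not, so there the bucket expressions have to be repeated in the GROUP BY clause.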
Summary
• Technology:
  – 1M$/PB: store everything online (twice!)
  – Gigabit to the desktop: store it anywhere
So: you can store everything, anywhere in the world, online everywhere.
• Research driven by apps:
  – TerraServer
  – National Virtual Astronomy Observatory.