Speech Title - Microsoft Research
Download
Report
Transcript Speech Title - Microsoft Research
SkyServer Database
Past, Present, and Future
Jim Gray: Microsoft
Alex Szalay (and friends): Johns Hopkins
Help from:
Cathan Cook (personal SkyServer),
Maria A. Nieto-Santisteban (image cutout service)
Peter Kukol, Gyorgy Fekete, Wil O’Mullane (evolved HTM to native C#)
Cesar Galindo-Legaria, Ming-Chuan Wu (statistics for optimizer)
Promised Abstract
Can database technology help manage and mine
scientific data? That is the question I have been
trying to answer with my astronomy colleagues
(especially Alex Szalay.) We have had some success
but still face many problems. I will start by
describing the odyssey of putting the Sloan Digital
Sky Survey online and give some statistics about
how it is used and what we are doing now. That
segment will end with a discussion of how the
integration of SQL with the CLR (common language
run time) makes it much easier for us to handle
scientific datatypes and spatial access methods. The
World-Wide Telescope is an attempt to federate all
the astronomy archives of the world. I will briefly
describe the architecture of SkyQuery a prototype
portal to several archives each of which is a web
service. Urls: http://research.microsoft.com/~gray
http://skyserver.pha.jhu.edu/dr1/ http://skyQuery.net
Instructions from Last Year
Less marketecture
More Substance
More demos and code
SkyServer Overview (10 min)
10 minute SkyServer tour
Pixel space
http://skyserver.sdss.org/en/
Record space:
http://skyserver.sdss.org/en/tools/explore/obj.asp?id=2255030989160697
select top 10 * from weblog..weblog
where yy = 2003 and mm=7 and dd =25
Doc space: Ned
order by seq desc
Set space:
select top 10 * from weblog..sqlLog
Web & Query Logs:
order by theTime Desc
http://skyserver.pha.jhu.edu/dr1/en/tools/chart/navi.asp
Dr1 WebService
You can download (thanks to Cathan Cook )
Data + Database code:
Website: http://research.microsoft.com/~gray/SDSS/personal_skyserver.htm
Data Mining the SDSS SkyServer Database
MSR-TR-2002-01
Cutout Service (10 min)
A typical web service
Show it
Show WSDL
Show fixing a bug
Rush through code.
You can download it.
Maria A. Nieto-Santisteban did most of this (Alex and I started it)
http://research.microsoft.com/~gray/SDSS/personal_skyserver.htm
SkyQuery: http://skyquery.net/
Distributed Query tool using a set of web services
Four astronomy archives from
Pasadena, Chicago, Baltimore, Cambridge (England).
Feasibility study, built in 6 weeks
Tanu Malik (JHU CS grad student)
Tamas Budavari (JHU astro postdoc)
With help from Szalay, Thakar, Gray
Implemented in C# and .NET
Allows queries like:
SELECT o.objId, o.r, o.type, t.objId
FROM SDSS:PhotoPrimary o,
TWOMASS:PhotoPrimary t
WHERE XMATCH(o,t)<3.5
AND AREA(181.3,-0.76,6.5)
AND o.type=3 and (o.I - t.m_j)>2
SkyQuery Structure
Each SkyNode publishes
Schema Web Service
Database Web Service
Image
Cutout
SDSS
SkyQuery
Portal
FIRST
2MASS
Portal is
Plans Query (2 phase)
Integrates answers
Is itself a web service
INT
Four Database Topics
Sparse tables:
column vs row store
tag and index tables
pivot
Maplist (cross apply)
Bookmark bug
Object Relational has arrived.
Column Store Pyramid
Users see fat base tables
(universal relation)
Define popular columns index
tag table 10% ~ 100 columns
Make many skinny indices
1% ~ 10 columns
Query optimizer picks right plan
Automate definition & use
Fast read, slow insert/update
Data warehouse
BASE
TAG
Obese query
Fat
quer
y
Typical Semi-join
Note: prior to Yukon,
index had 16 column limit.
A bane of my existence. Simpl
INDICIES
Examples
create table base (
id bigint, f1 int primary key,
f2 int, …,f1000 int)
create index tag on base (id)
include (f1, …, f100)
BASE
TAG
Obese query
Typical Semi-join
Fat
quer
y
create index skinny on
base(f2,…f17)
INDICIES
Simpl
A Semi-Join Example
create table fat(a int primary key, b int, c int, fat char (988))
declare @i int, @j int; set @i = 0
again:
insert fat values(@i, cast(100*rand() as int), cast (100*rand() as int), ' ')
set @i = @i + 1; if (@i < 1000000) goto again
create index ab on fat(a,b)
create index ac on fat(a,c)
b=c
137 K IO
31 sec
dbcc dropcleanbuffers with no_infomsgs
select count(*) from fat with(index (0)) where c = b
1GB
-- Table 'fat'. Scan 3, reads 137,230, CPU : 1.3 s, elapsed 31.1s.
ab
dbcc dropcleanbuffers with no_infomsgs
8MB 8MB
select count(*) from fat where b=c
-- Table 'fat'. Scan 2, reads: 3,482 CPU 1.1 s,
ac
elapsed: 1.4 s.
b=c
3.4K IO 1.4 sec
Moving From Rows to Columns
Pivot & UnPivot
What if the table is sparse?
LDAP has 7 mandatory
and 1,000 optional attributes
Store row, col, value
Features
object attribute value
create table Features ( object varchar ,
attribute varchar,
value varchar,
primary key ( object,
attribute))
select *
from (features pivot value
on
attribute
in
(year, color) ) as T
where object = ‘4PNC450’
●●●●
4PNC450
4PNC450
4PNC450
4PNC450
year
color
make
model
2000
white
Ford
Taurus
●●●●
T
Object year
4PNC450 2000
color
white
Maplist Meets SQL – cross apply
select p.*, q.*
from parent as p cross apply f(p.a, p.b, p.c) as q
where p.type = 1
f(p1)
Your table-valued function F(a,b,c)
returns all objects related to a,b,c.
p1
spatial neighbors,
sub-assemblies,
members of a group,
items in a folder,…
Apply this function to each row
Classic drill-down
use outer apply if f() may be null
p2
f(p2)
pn
f(pn)
The Bookmark Bug
SQL is a non-procedural language.
The compiler/optimizer picks the
procedure
based on statistics.
If the stats are wrong or missing….
Bad things happen.
Queries can run VERY slowly.
Strategy 1: allow users to specify plan.
Strategy 2: make the optimizer smarter
(and accept hints from the user.)
A query selects some fields
of an index
and of huge table.
Bookmark plan:
This is
look in index for a subset.
Lookup subset in Fat table.
Index
An Example of the Problem
great if subset << table.
terrible if subset ~ table.
If statistics are wrong,
or if predicates not independent,
you get the wrong plan.
How to fix the statistics?
Huge table
A Fix: Let user ask for stats
Create Statistics on View(f1,..,fn)
Then the optimizer has the right data
Picks the right plan.
Statistics on Views,
C. Galindo-Legaria, M. Josi, F. Waas, M. Wu, VLDB 2003,
Q3:
Select count(*) from Galaxy
where r < 22 and r_extinction > 0.120
Bookmark: 34 M
random IO,
520 minutes
Create Statistics on Galaxy(objID )
Scan:
5 M sequential IO
18 minutes
Ultimately this should be automated,
but for now,… it’s a step in the right direction.
Object Relational Has Arrived
VMs are moving inside the DB
Yukon includes Common Language Runtime
(Oracle & DB2 have similar mechanisms).
So, C++, VB, C# and Java are
data
code
co-equal with TransactSQL.
You can define classes and methods
SQL will store the instances
data
code
Access them via methods
+code
You can put your analysis code
INSIDE the database.
Minimizes data movement.
You can’t move petabytes to the client
But we will soon have petabyte databases.
And..
Fully-async and synchronous (blocking) calls
and multi-concurrent-result sets
per connection (transaction)
Queues built in:
Fire-and forget asynchronous processing
It listens to Port 80 for SOAP calls :
TP-lite is back
It’s a web service
Notification service and
data mining and
olap and
reporting and
xml and
xquery and
.... )
But, back to OR.
The Pre CLR design
Transact SQL
sp_HTM
(20 lines)
469 lines of
“glue”
looking like:
// Get Coordinates param datatype, and param length information of
if (srv_paraminfo(pSrvProc, 1, &bType1, &cbMaxLen1, &cbActualLen1, NULL, &fNull1) == FAIL)
ErrorExit("srv_paraminfo failed...");
// Is Coordinate param a character string
if (bType1 != SRVBIGVARCHAR && bType1 != SRVBIGCHAR &&
bType1 != SRVVARCHAR
&& bType1 != SRVCHAR)
ErrorExit("Coordinate param should be a string.");
The HTM code body
// Is Coordinate param non-null
if (fNull1 || cbActualLen1 < 1 || cbMaxLen1 <= cbActualLen1)
ErrorExit("Coordinate param is null.");
// Get pointer to Coordinate param
pzCoordinateSpec = (char *) srv_paramdata (pSrvProc, 1);
if (pzCoordinateSpec == NULL)
ErrorExit("Coordinate param is null.");
pzCoordinateSpec[cbActualLen1] = 0;
// Get OutputVector datatype, and param length information
if (srv_paraminfo(pSrvProc, 2, &bType2, &cbMaxLen2, &cbActualLen2, NULL, &fNull2) == FAIL)
ErrorExit("Failed to get type info on HTM Vector param...");
The “glue” CLR design
Discard 450 lines of UGLY
C# SQL
code
sp_HTM
(50 lines)
Thanks!!! To Peter Kukol
(who wrote this)
using System;
using System.Data;
using System.Data.SqlServer;
using System.Data.SqlTypes;
using System.Runtime.InteropServices;
namespace HTM {
public class HTM_wrapper {
[DllImport("SQL_HTM.dll")] static extern unsafe void * xp_HTM_Cover_get (byte *str);
public static unsafe void HTM_cover_RS(string input) {
// convert the input from Unicode (array of 2 bytes) to an array of bytes (not shown)
byte * input; byte * output;
// invoke the HTM routine
output = (byte *)xp_HTM_Cover_get(input);
// Convert the array to a table
SqlResultSet outputTable = SqlContext.GetReturnResultSet();
if (output[0] == 'O') { // if Output is “OK”
uint
c = *(UInt32 *)(s + 4); // cast results as dataset
Int64 * r = ( Int64 *)(s + 8); // Int64 r[c-1,2]
for (int i = 0; i < c; ++i) {
SqlDataRecord newRecord = outputTable.CreateRecord();
newRecord.SetSqlInt64(0, r[0]);
newRecord.SetSqlInt64(1, r[1]); r++;r++;
outputTable.Insert(newRecord);
} } // return outputTable;
}
}
}
The HTM
code body
The Clean CLR design
Discard all glue code
return array cast as table
CREATE ASSEMBLY HTM_A
FROM '\\localhost\HTM\HTM.dll'
CREATE FUNCTION HTM_cover( @input NVARCHAR(100) )
RETURNS @t TABLE (
HTM_ID_START BIGINT NOT NULL PRIMARY KEY,
HTM_ID_END BIGINT NOT NULL
)
AS
EXTERNAL NAME HTM_A:HTM_NS.HTM_C::HTM_cover
using System;
using System.Data;
using System.Data.Sql;
using System.Data.SqlServer;
using System.Data.SqlTypes;
using System.Runtime.InteropServices;
namespace HTM_NS {
public class HTM_C {
public static Int64[,2] HTM_cover(string input) {
// invoke the HTM routine
return (Int64[,2]) xp_HTM_Cover(input); // the actual HTM C# or C++ or Java or VB code goes here.
Your/My
code
goes here
}
} }
Performance (Beta1)
On a 2.2 Ghz Xeon
Call a Transact SQL function
33μs
Call a C# function
50μs
Table valued function
Array (== table) valued function 200 μs
+ per row 27 μs
not good in β1
The Code
CREATE ASSEMBLY ReturnOneA
FROM '\\localhost\C:\ReturnOne.dll'
GO
CREATE FUNCTION ReturnOne_Int( @input INT) RETURNS INT
AS EXTERNAL NAME ReturnOneA:ReturnOneNS.ReturnOneC::ReturnOne_Int
GO
---------------------------------------------- time echo an integer
declare @i int, @j int, @cpu_seconds float, @null_loop float
declare @start datetime, @end datetime
set @j = 0
set @i = 10000
set @start = current_Timestamp
while(@i > 0)
begin
set @j = @j + 1
set @i = @i -1
end
set @end = current_Timestamp
set @null_loop = datediff(ms, @start,@end) / 10.0
set @i = 10000
set @start = current_Timestamp
while(@i > 0)
begin
select @j = dbo.ReturnOne_Int(@i)
set @j = @j + 1
set @i = @i -1
end
set @end = current_Timestamp
set @cpu_seconds = datediff(ms, @start,@end) / 10.0 - @null_loop
print 'average cpu time for 1,000 calls to ReturnOne_Int was ' +
str(@cpu_seconds,8,2)+ ' micro seconds'
Program in DB
in different
language (Tsql)
calling function
Function
written in C#
inside the DB
using System;
using System.Data;
using System.Data.SqlServer;
using System.Data.SqlTypes;
using System.Runtime.InteropServices;
namespace ReturnOneNS {
public class ReturnOneC {
public static int ReturnOne_Int(int input) {
return input;
}
}
}
What Is the Significance?
No more inside/outside DB dichotomy.
You can put your code near the data.
Indeed, we are letting users put
personal databases near the data
archive.
This avoids moving large datasets.
Just move questions and answers.
Meta-Message
Trying to fit science data into
databases
When it does not fit, something is
wrong.
Look for solutions
Many solutions come from OR extensions
Some are fundamental engine changes
More structure in DB
Richer operator sets
Better statistics
© 2002 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.