Microsoft Research
Download
Report
Transcript Microsoft Research
Microsoft Research
Jim Gray
Researcher
Microsoft Research
Microsoft Corporation
Outline
Highlights of MS
Research
DB-Related Research
in some detail
Microsoft Research
http://Research.Microsoft.com/
Goal: pursue strategic technologies for MS
231 Researchers in 20 areas
Plan to grow to 600 in 3 years
Internationally recognized research teams
Hundreds of publications, conference presentations
Leadership in professional societies, journals,
conferences
Virtually all MS products use technology from
MS Research
Broad Research Areas
Systems & architecture
Make programmers more productive
Tools, compilers
Human-oriented interaction
Distributed & scaleable systems,
databases, cryptography, telepresence
Speech, natural language, vision, intelligence
3D Graphics, moving beyond GUI
Fundamental problems
Development Tools
Analysis of executables
Automatic reorganization of executables
Dynamic analysis driven by user scenarios
Instrumented code
Reduce code working set size
Branch straightening
Boot ordering for boot time reduction
Reduced code working sets up to 50%
Improved throughput by 10%
Smaller Working Sets
Faster Startup & Execution
Pages Referenced
600
50%
Original
Optimized
500
39%
40%
35%
33%
400
30%
300
20%
200
7% 8%
10%
100
10%
2%
0%
0
0
50
Seconds
100
TxRd$NoL
TxRW$NoL
TxWrt$
NT Network Drivers
Throughput Speedup
Speech Technology
Speech recognition
Trainable speech synthesis
Speaker independent, Command and Control
Dictation
Speaker independent, large vocabulary
Discrete and continuous speech
Prosody and concatenative speech units
learned from corpus
Demos and SDKs at web site
http://research.microsoft.com/research/srg/
Natural Language Understanding
Broad-coverage syntax analyzer.
Dictionary-based semantic network
provides growing knowledge base.
Underlying system is multi-lingual.
Contributions to MS word-processing
Levels of writing critiques
We scheduled teh next meeting for noon.
Each of the products are designed to help.
I saw the Grand Canyon flying to Arizona.
Ladies are requested not to have children
in the bar. (from a sign in a Norwegian
cocktail lounge)
Natural Language
Roadmap and Mile Stones
Robotics
Vision
Machine
learning
University
Discourse/
pragmatics
Library
UI
2nd F
Dictionary
SR
Probs
(DT)
fjeiofjdksl fjeiofjdksl
eriowe.asm eriowe.asm
qweiqpo ero qweiqpo ero
oei iqpwe iio oei iqpwe iio
qwpe ec,l;aklqwpe ec,l;akl
Concept
normalizing
Sense
choosing
Logical form
Revised syntax
Initial syntax
Morphology
Interactive movies
Info highway cruiser
Advanced summary
Interactive games
Peedy+
NL query
Improved IR
Bob+
Enhanced help
Improved SR
Semantic critiques
Auto indexing
Syntactic
critiques
Phrase spacing
Find and replace
3D Graphics Research
Bring very high performance,
high quality graphics to PCs
Modeling
Interactive
Uniform treatment of multimedia
Representation of 3D models
Automatic simplification
Animation
Simplification Problem
70,000
8,700
34,100
4,200
2,600
2,300
Competing goals: accuracy & conciseness
Simplification Results
(creases)
34,100
difficult to author directly!
2,300
original
2,300
simplified
(~10 minutes)
Vision Projects
3D
Reconstruction from Video and Images
Motion
analysis for video compression
Model acquisition for rendering
Visual
Human/Computer Interaction
Communication
by gestures and expressions
Multimodal speech/vision interfaces
Motion Analysis
convert
masked images into a background
sprite for content-based coding [Scrunch]
+
+
=
working
on motion tracking
+
Outline
Highlights of MS
Research
DB-Related Research
University of Washington
& Microsoft Research
Summer Institute 1997
Data Mining & Knowledge
Discovery
July 6-11, 1997
A collection of 75
researchers from around
the world: –Statistics
–Databases
–AI/pattern recognition
–Visualization
–Systems
–HPC
http://research.Microsoft.com/uwmsrdmi
Data Mining and OLAP
The Query formulation problem:
difficult to formulate as DB queries
which records are fraudulent transactions?
which households are likely to prefer a Ford
over a Toyota?
Who’s a good credit risk in my customer DB?
Yet database contains the information
good/bad customer, profitability, response rate
known objects of interest (stars/galaxies),
results in medical studies with higher details,
etc...
DM helps
Use data to build predictors
regression, classification, clustering etc.
Generate summaries to aid understanding
find “easy to describe” partitions in data
automatically (via clustering/segmentation)
find clusters not well-known to analyst
Example:
Analyze web server hits to categorize clients.
A new mode of interacting with database
Data Base Research
http://research.Microsoft.com/db/
Databases
Recovery
Recover application state after sys fail.
Data Structures
High concurrency data structures
Key range locking in SS 7 based on Lomet’s
work.
Auto DB Design & Admin
Index DB designer started as research
project.
More coming
Scaleable Servers
High Availability SQL Server
Billion Transactions Per Day demo
Virtual Server design
Always Up prototype
Combined SQL 6.5 with MTS, DTC
RAGS: stress test NT DBs
TerraServer: a giant single node SQL7 DB
Always Up prototype
Scaleable Servers
BOTH SMP and Cluster
Grow Up with SMP
4xP6 is now standard
SMP
Super Server
Grow Out with Cluster
Cluster has inexpensive parts
Departmental
Server
Personal
System
Cluster
of PCs
Scale OUT
Clusters Have Advantages
Fault tolerance:
Modular growth without limits
Grow by adding small modules
Parallel data search
Spare modules mask failures
Use multiple processors and disks
Clients and servers made from the same stuff
Inexpensive: built with commodity CyberBricks
Billion Transactions per Day Project
Built a 45-node Windows NT Cluster
(with help from Intel & Compaq)
> 900 disks
All off-the-shelf parts
Using SQL Server &
DTC distributed transactions
DCOM & ODBC clients
on 20 front-end nodes
DebitCredit Transaction
Each server node has 1/20 th of the DB
Each server node does 1/20 th of the work
15% of the transactions are “distributed”
1 B tpd: So What?
Shows what is possible, easy to build
Grows without limits
Shows scaleup of DTC, MTS, SQL…
Shows (again) that shared-nothing clusters
scale
Next task: make it easy.
auto partition data
auto partition application
auto manage & operate
Parallelism
The OTHER aspect of clusters
Clusters of machines
allow two kinds
of parallelism
Many little jobs: online
transaction processing
TPC-A, B, C…
A few big jobs: data
search and analysis
TPC-D, DSS, OLAP
Both give
automatic parallelism
Kinds of Parallel Execution
Pipeline
Partition
outputs split N ways
inputs merge M ways
Any
Sequential
Program
Any
Sequential
Program
Any
Sequential
Program
Any
Sequential
Program
Data Rivers
Split + Merge Streams
N X M Data Streams
M Consumers
N producers
River
Producers add records to the river,
Consumers consume records from the river
Purely sequential programming.
River does flow control and buffering
does partition and merge of data records
River = Exchange operator in SQL 7
N x M way Parallelism
Merge
Merge
Merge
Sort
Sort
Sort
Sort
Sort
Join
Join
Join
Join
Join
A...E
F...J
K...N
O...S
T...Z
N inputs, M outputs, no bottlenecks.
Partitioned Data
Partitioned and Pipelined Data Flows
Clustered DBs
Pieces are in place with SQL 7.0
Manage arrays of SQL Servers
Partition data among files
Disk and CPU Parallel query plans
Distributed query via OLE DB
High Availability with MSCS failover
Next Parallel Databases on NT clusters
Parallel Query
Transparent access to partitioned tables
Grow without limits (16 x 2 TB)
Do not add complexity
Design, Administer, Operate, Use
Scaleup - Big Database
Build a 1 TB SQL Server database
Data must be
1 TB
Unencumbered
Interesting to everyone everywhere
And not offensive to anyone anywhere
Loaded
Show off Windows NT and
SQL Server scalability
Stress test the product
1.1 M place names from Encarta World Atlas
1 M Sq Km from USGS (1 meter resolution)
2 M Sq Km from Russian Space agency (2 m)
Will be on web (world’s largest atlas)
Sell images with commerce server.
USGS CRDA: 3 TB more coming.
TerraServer
World’s Largest PC!
324 disks (2.9 terabytes)
NT EE & SQL 7.0
8 x 440Mhz Alpha CPUs
10 GB DRAM
Photo of the planet
USGS and Russian
images
Background
Earth is 500 Tera-meters square
100 TM2 land in 70ºN to 70ºS
We have pictures of 6% of it
USA is 10 tm2
3 tsm from USGS
2 tsm from Russian Space Agency
Compress 5:1 (JPEG) to 1.5 TB.
Slice into 10 KB chunks
Store chunks in DB
Navigate with
Encarta™ Atlas
globe
gazetteer
StreetsPlus™ in the USA
Someday
multi-spectral
image
of everywhere
once a day / hour
1.8x1.2 km2 tile
10x15 km2 thumbnail
20x30 km2 browse image
40x60 km2 jump image
USGS Digital Ortho Quads
(DOQ)
US Geologic Survey
3 TeraBytes
Most data not yet published
Based on a CRADA
TerraServer makes data available.
1x1 meter
4 TB
Continental
US
New Data
Coming
USGS “DOQ”
Russian Space Agency(SovInfomSputnik)
SPIN-2 (Aerial Images is Worldwide Distributor)
1.5 Meter Geo Rectified imagery of (almost) anywhere
Almost equal-area projection
De-classified satellite photos (from 200 KM),
More data coming (1 m)
Want to sell imagery on Internet.
Putting 2 tm2 onto TerraServer.
SPIN-2
Demo
http://www.TerraServer.com
Microsoft
BackOffice
SPIN-2
Hardware
1TB Database Server
AlphaServer 8400 4x400. 10 GB RAM
324 StorageWorks disks
10 drive tape library (STC Timber Wolf DLT7000 )
SPIN-2
Software
Web Client
Image
Server
Active Server Pages
Internet
Information
Server 4.0
Java
Viewer
broswer
MTS
Terra-Server
Stored Procedures
HTML
The Internet
Internet Info
Server 4.0
Sphinx
(SQL Server)
Microsoft Automap
ActiveX Server
Terra-Server DB
Automap Server
Terra-Server Web Site
Internet Information
Server 4.0
Microsoft
Site Server EE
Image Delivery SQL Server
Application
7
Image Provider Site(s)
System
Management &
Maintenance
Backup and Recovery
STC 9717 Tape robot
Legato NetWorker™
Sphinx Backup/Restore Utility
Clocked at 80 MBps!!
SQL Server Enterprise Mgr
DBA Maintenance
SQL Performance Monitor
TerraServer File Group Layout
Convert 324 disks to 28 RAID5 sets
plus 28 spare drives
Make 4 NT volumes (RAID 50)
595 GB per volume
Build 30 20GB files on each volume
DB is File Group of 120 files
E:
F:
G:
H:
Gazetteer Design
Classic Snowflake Schema
Fast First hint to Optimizer
PlaceGrid
Place
CountrySearch
AlternateName
CountryID
GazSrcID
1148
Country
CountryID
CountryName
UNcode
264
StateSerach
AlternateName
CountryID
StateID
FreatureID
GazSrcID
State
StateID
CountryID
StateName
1083
PlaceID
ImageFlag
AlternateName
Name
CountryID
StateID
TypeID
GazSourcID
Latitude
Longitude
UGridID
ZGridID
DOQdate
SPIN2date
3776
1,089,897
ZGridID
BestPlaceName
XDistance
YDistrance
50,000,000
FeatureType
TypeID
Description
13
GazetteerSource
GazSrcID
Description
1
Image Data Design
Image pyramid stored in DBMS (250 M recs)
OriginalMetaData
ImageMeta
OrigMetaID
SrcID
ImageSource
Agency
SourcePhotoID
SourcePhotoDate
SourceDEMDate
MetaDataDate
ProductionSystem
ProductionDate
DataFileSize
Compression
HeaderBytes
…
80 other fields
ImgMetaID
OrigMetaID
ImgStatus
ImgDate
ImgTypeID
JumpPixHeight
JumpPixWidth
BrowsePixHeight
BrowsePixWidth
ThumbPixWidth
ThumbPixHeight
CutCol
CutRow
MidLat
MidLong
NELat
NELong
NWLat
NWLong
SELat
SELong
SWLat
SWLong
UGridID
UTMZone
XUtmID
YUtmID
XGridID
YGridID
ZGridID
650 k SPIN2
2 M USGS
ImgSource
ImgType
ImgTypeID
ImgFileDesc
ImgFileExt
MimeStr
SrcID
SrcName
SrcTblName
SrcDescription
GridSysID
ImgTypeID
4
2
650 k SPIN2
2 M USGS
Pick
Log
UGridHits
Name
Description
Link
PickDate
URL
Time
<extensive
list of action
parameters
URL
UGridID
ZTileGridID
count
10
TileMeta
xxx
xxx
Jump
Browse
Thumb
Tile
UGridID
ZGridID
ZTileGridID
ImgData
ImgDate
ImgTypeID
ImgMetaID
SrcID
EncryptKey
File Name
UGridID
ZGridID
ZTileGridID
ImgData
ImgDate
ImgTypeID
ImgMetaID
SrcID
EncryptKey
File Name
UGridID
ZGridID
ZTileGridID
ImgData
ImgDate
ImgTypeID
ImgMetaID
SrcID
EncryptKey
File Name
UGridID
ZGridID
ZTileGridID
ImgData
1
ImgDate
ImgTypeID
ImgMetaID
SrcID
EncryptKey
File Name
.65 M SPIN2
1.5 M USGS
.65 M SPIN2
1.5 M USGS
.65 M SPIN2
1.5 M USGS
16 M SPIN2
96 M USGS
ImgMetaID
OrigMetaID
SrcID
ImgStatus
ImgDate
ImgTypeID
TilePixHeight
TilePixWidth
CutCol
CutRow
MidLat
MidLong
NELat
NELong
NWLat
NWLong
SELat
SELong
SWLat
SWLong
UGridID
UTMZone
XUtmID
YUtmID
XGridID
YGridID
ZGridID
16 M SPIN2
96 M USGS
Image Delivery and Load
DLT
Tape
DLT
Tape
“tar”
NT
DoJob
\Drop’N’
LoadMgr
DB
Wait 4
Load
Backup
LoadMgr
LoadMgr
ESA
Alpha
Server
4100
100mbit
EtherSwitch
60
4.3 GB
Drives
Alpha
Server
4100
ImgCutter
\Drop’N’
\Images
Enterprise Storage Array
STC
DLT
Tape
Library
108
9.1 GB
Drives
108
9.1 GB
Drives
108
9.1 GB
Drives
Alpha
Server
8400
10: ImgCutter
20: Partition
30: ThumbImg
40: BrowseImg
45: JumpImg
50: TileImg
55: Meta Data
60: Tile Meta
70: Img Meta
80: Update Place
...
Terra-Server Tables
USGS DOQ Data
SPIN-2 Data
3200 278 MB images are input (approximate size)
Creates 620,800 Jump, Thumb, & Browse images (2.5 m rows)
Creates 15.5 m Tile images (31 m rows)
Gazetteer Data
48,000 DOQ images are input (45-55mb / image)
Creates 864,000 Jump, Thumb, & Browse images (3.5 m rows)
Creates 55.3 m Tile images (110.6 m rows)
1.1 m named places (Encarta World Atlas)
45 m cell names
Total Rows = 228 M today (and growing fast)
Other Details
Active Server pages
faster and easier than DB stored procedures.
Commerce Server is interesting
Images the Inventory
USGS built their own
no SKU,
millions of them
they are very smart, but it is easy
masquerade as a credit-card reader.
The earth is a geoid, and
Every Geographer has a coordinate system
Tapes are still a nightmare.
Everyone is a UI expert.
(or two).
SQL 7 Testimonial
We started using it March 4 1997
Loaded the DB twice
SQL 7 Pre-Alpha
SQL 7 Alpha
SLQ 7 Beta 1
SQL 7 Beta
(we made application mistakes)
Now doing it “right”
Reliability: Great! SQL 7 never lost data
Ease of use: Great!
Functionality: Great!
RAGS:
RAndom SQL test Generator
Microsoft spends a LOT of money on testing.
Idea: test SQL by
generating random correct queries
executing queries against database
compare results with SQL 6.5, DB2, Oracle, Sybase
Being used in SQL 7.0 testing.
Very productive test tool
Sample
Rags
Generated
Statement
SELECT TOP 3 T1.royalty , T0.price , "Apr 15 1996 10:23AM" , T0.notes
FROM titles T0, roysched T1
WHERE EXISTS (
SELECT DISTINCT TOP 9 $3.11 , "Apr 15 1996 10:23AM" , T0.advance , (
"<v3``VF;" +(( UPPER(((T2.ord_num +"22\}0G3" )+T2.ord_num ))+("{1FL6t15m" +
RTRIM( UPPER((T1.title_id +((("MlV=Cf1kA" +"GS?" )+T2.payterms )+T2.payterms
))))))+(T2.ord_num +RTRIM((LTRIM((T2.title_id +T2.stor_id ))+"2" ))))),
T0.advance , (((-(T2.qty ))/(1.0 ))+(((-(-(-1 )))+( DEGREES(T2.qty )))-(-((
-4 )-(-(T2.qty ))))))+(-(-1 ))
FROM sales T2
WHERE EXISTS (
SELECT "fQDs" , T2.ord_date , AVG ((-(7 ))/(1 )), MAX (DISTINCT
-1 ), LTRIM("0I=L601]H" ), ("jQ\" +((( MAX(T3.phone )+ MAX((RTRIM( UPPER(
T5.stor_name ))+((("<" +"9n0yN" )+ UPPER("c" ))+T3.zip ))))+T2.payterms )+
MAX("\?" )))
FROM authors T3, roysched T4, stores T5
WHERE EXISTS (
SELECT DISTINCT TOP 5 LTRIM(T6.state )
FROM stores T6
WHERE ( (-(-(5 )))>= T4.royalty ) AND (( (
( LOWER( UPPER((("9W8W>kOa" +
T6.stor_address )+"{P~" ))))!= ANY (
SELECT TOP 2 LOWER(( UPPER("B9{WIX" )+"J" ))
FROM roysched T7
WHERE ( EXISTS (
This statement yields an error (now fixed)
SELECT (T8.city +(T9.pub_id +((">" +T10.country )+
UPPER( LOWER(T10.city))))), T7.lorange ,
((T7.lorange )*((T7.lorange )%(-2 )))/((-5 )-(-2.0 ))
FROM publishers T8, pub_info T9, publishers T10
WHERE ( (-10 )<= POWER((T7.royalty )/(T7.lorange ),1))
AND (-1.0 BETWEEN (-9.0 ) AND (POWER(-9.0 ,0.0)) )
)
--EOQ
) AND (NOT (EXISTS (
SELECT MIN (T9.i3 )
FROM roysched T8, d2 T9, stores T10
WHERE ( (T10.city + LOWER(T10.stor_id )) BETWEEN (("QNu@WI" +T10.stor_id
)) AND ("DT" ) ) AND ("R|J|" BETWEEN ( LOWER(T10.zip )) AND (LTRIM(
UPPER(LTRIM( LOWER(("_\tk`d" +T8.title_id )))))) )
GROUP BY T9.i3, T8.royalty, T9.i3
HAVING -1.0 BETWEEN (SUM (-( SIGN(-(T8.royalty ))))) AND (COUNT(*))
)
--EOQ
) )
)
--EOQ
) AND (((("i|Uv=" +T6.stor_id )+T6.state )+T6.city ) BETWEEN ((((T6.zip +(
UPPER(("ec4L}rP^<" +((LTRIM(T6.stor_name )+"fax<" )+("5adWhS" +T6.zip ))))
+T6.city ))+"" )+"?>_0:Wi" )) AND (T6.zip ) ) ) AND (T4.lorange BETWEEN (
3 ) AND (-(8 )) ) )
)
--EOQ
GROUP BY ( LOWER(((T3.address +T5.stor_address )+REVERSE((T5.stor_id +LTRIM(
T5.stor_address )))))+ LOWER((((";z^~tO5I" +"" )+("X3FN=" +(REVERSE((RTRIM(
LTRIM((("kwU" +"wyn_S@y" )+(REVERSE(( UPPER(LTRIM("u2C[" ))+T4.title_id ))+(
RTRIM(("s" +"1X" ))+ UPPER((REVERSE(T3.address )+T5.stor_name )))))))+
"6CRtdD" ))+"j?]=k" )))+T3.phone ))), T5.city, T5.stor_address
)
--EOQ
ORDER BY 1, 6, 5
)
Always UP: System pairs
Two clusters at two sites
Changes from one
sent to other
When one site fails
other provides service
Masks
Hardware/Software faults
Operations tasks (reorganize, upgrade
move
Environmental faults (power fail)
Outline
Highlights of MS
Research
DB-Related Research