Transcript - Microsoft Research

Managing Innovation:
How Microsoft Research Works
Jim Gray
Distinguished Engineer
Microsoft Corporation
Actionable Ideas
Co-lo if possible
Adopt a “university model”
Recruit from the top
Recruit for passion and
a desire to have impact
Install a Research Program Management
organization to orchestrate tech-transfer
Institute an annual TechFest
Innovation
Build versus Buy versus Invest
Build: Have in-house research
Bell Labs, IBM, GM, Pfizer, Merck, Microsoft…
Buy: Acquire startups or whole companies
IBM, Cisco, Intel, Microsoft, Pfizer, Merck…
Invest: All boats rise
Government research funding
IBM, Cisco, Intel, Microsoft, Pfizer, Merck…
All 3 approaches valid
Complement one another
Companies Are Different
[Figure: charts of FY02 financial breakdown (Product, S G&A, R&D, Gross, other) for Intel, Accenture, Cisco, HP, Oracle, IBM, Microsoft, Dell, and EDS; per-slice percentages are not recoverable from the extraction]
Selected IT company FY02 R&D budgets:
Notice that R&D is correlated with margin
IBM and HP have large service revenues
So, their “real” R&D investment rate is higher
Dell, Accenture, EDS have modest R&D – innovate in other ways
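The point about service revenue can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that R&D spending supports only the non-service part of revenue. The function name and the sample figures are hypothetical and are not taken from the chart.

```python
def adjusted_rd_rate(rd_share: float, service_share: float) -> float:
    """R&D as a fraction of non-service revenue.

    rd_share: R&D spending divided by total revenue.
    service_share: service revenue divided by total revenue.
    Assumption (not from the slide): all R&D supports non-service revenue.
    """
    if not 0.0 <= service_share < 1.0:
        raise ValueError("service_share must be in [0, 1)")
    return rd_share / (1.0 - service_share)


# Hypothetical company: R&D is 6% of revenue, half of revenue is services.
print(f"{adjusted_rd_rate(0.06, 0.5):.0%}")
```

Under that assumption, the larger the service share, the further the adjusted rate rises above the headline R&D percentage.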
Most R&D Is D
How to Do Basic Research in Industry?
Critical questions (from Rick Rashid)
How can I
create and maintain a world class research
organization in an industrial setting?
How do I
keep the lines of communication open
between product teams and researchers?
How do I
get new technology into products quickly?
Approach
Adapt the Academic Model
Organizational goal: Advance state of the art
University organizational model
Flat structure, critical mass groups
Open research environment
Aggressive publication in peer-reviewed literature
Frequent visitors, daily seminars
Strong ties to University Research
Nearly 15% of basic research budget
directly invested in Universities
Lab grants, research grants, fellowships, etc.
Hundreds of interns and visitors
Microsoft Research Today
Founded in 1991
Staff of over 700 in over 55 areas
Internationally recognized research teams
Research lab locations:
Redmond, Washington
San Francisco, California
Cambridge, United Kingdom
Beijing, People’s Republic of China
Mountain View, California
[Figure: percentage labels attached to the lab locations: 75%, 1%, 10%, 10%, 5%]
Microsoft Research
Expanding the State of the Art
Thousands of peer-reviewed publications
10% to 30% of papers at our focus conferences
graphics, programming, systems, data management…
Community leadership
Professional societies
Journals
Conferences
Mentoring Interns
Hosting academic summers and sabbaticals
Special workshops
How To Build A Group
Identify a promising area
Hire the leader (internal or external)
Support her/him
Build team around senior researcher
Look for people who
Want to have impact
Have passion for their ideas
Same template works for whole labs
Cambridge, Beijing, Silicon Valley
Keeping Open The Lines Of
Communication To Product Teams
Co-location helps: 75% “on campus”
“How can I help?” attitude
demonstrates willingness to “get dirty”
to help product succeed
Product group spin-offs build strong ties
Over time a number of product groups
evolved from research (e.g., Windows Media)
Researchers involved in all corporate product
reviews
MSR Relationship To MS Products
Virtually every research group actively
engaged with product groups
e.g., Windows, Office, streaming media, SQL,
Exchange, IIS, Commerce Server, Visual Studio,
consumer products, MSN, etc.
Tech transfer:
Ideas
Code
People
Contacts
Recruiting
Focused Technology Transfer
Quickly getting technology into products
Program management team
with sole focus on tech transfer
Researchers on product “advisory” boards
“Mind-swaps” – joint product/research off-sites
Joint product/research teams, e.g.,
ClearType (Windows XP)
Datamining (SQL 2000)
Natural Language & Speech (Office)
TabletPC
Smart Personal Objects (SPOT)
Encourage and recognize contributions
MSR Techfest
Internal open house for Microsoft Research
Annual event since 2001
~ 7000 attendees
170 demos, 26 lectures
“Research in progress”
Breadboard demos
This is research idea/prototype
Great networking event:
Breaks down barriers
Serendipitous connections.
Examples Of Technology Transfer
Critical support technologies
Memory Optimization Technology enabled
sim-ship of Win95/Office95
Automated bug detection in Windows 2000
Key technologies that drive products
e.g., MS Audio 4.0, ClearType, intelligent search,
collaborative filtering, IntelliMirror, etc.
Incubated major products
Windows streaming media
Windows CE, TabletPC, eBook
Ecommerce, Datamining
Natural language and speech technologies, etc.
MSR Mission Statement
Expand the state of the art in each of the
areas in which we do research
Rapidly transfer innovative technologies into
Microsoft products
Ensure that Microsoft products have a future
Personal Examples of R&D
Scalable Servers
TerraServer
SkyServer
Databases
Data Cube, Snapshot Isolation
SQL Stress testing
Reliable Multicast
Personal Media Management
TerraServer & TerraService
TerraServer
http://terraserver-usa.com
USGS Photo and Topo maps
16TB of data
Online since 1997
7 billion pages served
120 TB served
Shows
Scalability
Availability
Manageability
SQL + Windows
TerraService
http://terraservice.net
A .NET web service
OpenGIS
Place Search
TerraServer Map Server
Landmarks & annotations
layered on imagery
Used by thousands of real
apps today
Shows
Web Services
Performance
TerraServer Today
TerraServer Tomorrow
Mirrored System versus SAN
3 mirrored DB servers + spare versus 4 DB servers
Commodity versus Enterprise
White box Dual Xeon versus 8-way branded
DAS 250GB SATA versus FC-SAN 73GB SCSI
No Tape versus LTO Tape Robot
$0.1M versus $1.8M
Geoplex: 2 sites
You can afford 2!
KVM / IP
World Wide Telescope
http://www.voforum.org/
Premise: Most Astro data is online
So, the Internet is the
world’s best telescope:
Has data on every part of the sky
In every measured spectral band
As deep as the best instruments
It is up when you are up;
the “seeing” is always great
(no working at night, no clouds, no moons, no…)
It’s a smart telescope:
links objects and data
to literature on them
Next-Generation Data Analysis
Looking for
Needles in haystacks – the Higgs particle
Haystacks: Dark matter, Dark energy
Needles are easier than haystacks
Global statistics have poor scaling
Correlation functions are N^2,
likelihood techniques N^3
As data and computers grow at the same rate,
we can only keep up with N log N
A way out?
data is fuzzy, answers are approximate
Requires combination of statistics and
computer science
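The scaling argument on this slide can be checked numerically. The sketch below assumes that data size N and available compute both double each period (the doubling factor, the starting size, and the function names are illustrative assumptions) and reports how the running time of each algorithm class changes relative to the first period.

```python
import math


def relative_runtime(cost, n0: float, periods: int, growth: float = 2.0) -> float:
    """Runtime after `periods` growth steps, relative to the runtime at n0.

    Data size and compute speed are both multiplied by `growth` each period,
    so the runtime at any time is cost(N) / speed.
    """
    n = n0 * growth ** periods
    speed = growth ** periods
    return (cost(n) / speed) / cost(n0)


costs = {
    "N log N": lambda n: n * math.log2(n),
    "N^2": lambda n: n ** 2,
    "N^3": lambda n: n ** 3,
}

for name, cost in costs.items():
    ratio = relative_runtime(cost, n0=1e6, periods=10)
    print(f"{name:8s} runtime grows by a factor of {ratio:,.1f}")
```

With these assumptions the N log N runtime stays within a small constant factor, while the N^2 runtime grows by the same factor as the data and the N^3 runtime by its square.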
Data Federations Of Web Services
Massive datasets live near their owners:
Near the instrument’s software pipeline
Near the applications
Near data knowledge and curation
Super Computer centers become Super Data Centers
Each Archive publishes a web service
Schema: documents the data
Methods on objects (queries)
Scientists get “personalized” extracts
Uniform access to multiple Archives
A common global schema
Challenge:
Federation
What is the object model for your science?
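A toy version of this federation idea might look like the following. It is only a sketch: the `Archive` and `Federation` classes, the column names, and the sample rows are all hypothetical, and the "common global schema" is approximated as the set of columns every archive shares.

```python
from dataclasses import dataclass, field


@dataclass
class Archive:
    """A hypothetical archive that publishes its schema and a query method."""
    name: str
    columns: tuple[str, ...]
    rows: list[dict] = field(default_factory=list)

    def schema(self) -> tuple[str, ...]:
        # Documents the data this archive holds.
        return self.columns

    def query(self, predicate) -> list[dict]:
        # A "method on objects": return only the rows the caller asks for.
        return [row for row in self.rows if predicate(row)]


class Federation:
    """Uniform access to several archives through one call."""

    def __init__(self, archives: list[Archive]):
        self.archives = archives

    def common_schema(self) -> set[str]:
        # Columns shared by every archive stand in for a common global schema.
        schemas = [set(a.schema()) for a in self.archives]
        return set.intersection(*schemas) if schemas else set()

    def query(self, predicate) -> list[dict]:
        shared = self.common_schema()
        extract = []
        for archive in self.archives:
            for row in archive.query(predicate):
                trimmed = {k: row[k] for k in shared}
                trimmed["archive"] = archive.name
                extract.append(trimmed)
        return extract


optical = Archive("optical", ("ra", "dec", "mag_r"),
                  [{"ra": 181.3, "dec": -0.7, "mag_r": 17.2}])
infrared = Archive("infrared", ("ra", "dec", "mag_j"),
                   [{"ra": 181.3, "dec": -0.7, "mag_j": 15.0},
                    {"ra": 10.0, "dec": 5.0, "mag_j": 12.1}])

fed = Federation([optical, infrared])
print(fed.query(lambda row: row["ra"] > 180))
```

The data stays with each archive. The caller receives only a small extract expressed in the shared columns.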
Web Services – The Key?
Web SERVER:
Given a URL + parameters
Returns a web page (often dynamic)
Web SERVICE:
Given an XML document (SOAP message)
Returns an XML document
Tools make this look like an RPC.
F(x,y,z) returns (u, v, w)
Distributed objects for the web.
+ naming, discovery, security,..
Internet-scale distributed computing
[Diagram labels: Your program, Web Service, Data, In your address space]
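The "XML document in, XML document out, made to look like an RPC" idea can be sketched in a few lines. This is not SOAP: the element names, the in-process `service` function (standing in for a remote endpoint), and the arithmetic inside the hypothetical `F` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def encode_call(method: str, **params: float) -> str:
    """Pack a method name and named parameters into an XML request."""
    root = ET.Element("call", {"method": method})
    for name, value in params.items():
        ET.SubElement(root, "param", {"name": name}).text = repr(value)
    return ET.tostring(root, encoding="unicode")


def service(request_xml: str) -> str:
    """The 'web service': takes an XML document, returns an XML document."""
    call = ET.fromstring(request_xml)
    if call.get("method") != "F":
        raise ValueError("unknown method")
    args = {p.get("name"): float(p.text) for p in call.findall("param")}
    # Hypothetical body of F(x, y, z) -> (u, v, w).
    results = {"u": args["x"] + args["y"],
               "v": args["y"] * args["z"],
               "w": -args["x"]}
    root = ET.Element("response")
    for name, value in results.items():
        ET.SubElement(root, "result", {"name": name}).text = repr(value)
    return ET.tostring(root, encoding="unicode")


def F(x: float, y: float, z: float) -> tuple[float, float, float]:
    """Client stub: looks like a local call, but exchanges XML documents."""
    reply = ET.fromstring(service(encode_call("F", x=x, y=y, z=z)))
    values = {r.get("name"): float(r.text) for r in reply.findall("result")}
    return values["u"], values["v"], values["w"]


print(F(1.0, 2.0, 3.0))
```

The caller of `F` never sees the XML, which is the role the slide assigns to tooling.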
Federating Astronomy Archives
Great Test for data mining algorithms
It is real and well documented data
High-dimensional data (with confidence intervals)
Spatial data
Temporal data
Many different instruments from many different places and many different times
Federation is a goal
There is a lot of it (petabytes)
Can share cross company
University researchers
[Figure panels: IRAS 25µm, 2MASS 2µm, DSS Optical, IRAS 100µm, WENSS 92cm, NVSS 20cm, ROSAT ~keV, GB 6cm]
SkyServer – One such archive
SkyServer.SDSS.org
Sloan Digital Sky
Survey Pixels +
Data Mining
400 attributes
per “object”
Spectrograms for 1%
Demo: pixel space
record space
set space
teaching
SkyQuery: Federating Archives
http://skyquery.net/
Distributed Query tool using a set of web services
Federates ten astronomy archives from Pasadena,
Chicago, Baltimore, Cambridge (England)
Implemented in C# and .NET
Allows queries like:
SELECT o.objId, o.r, o.type, t.objId
FROM SDSS:PhotoPrimary o,
TWOMASS:PhotoPrimary t
WHERE XMATCH(o,t)<3.5
AND AREA(181.3,-0.76,6.5)
AND o.type=3 and (o.I - t.m_j)>2
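To give a feel for what a cross-match involves, here is a minimal sketch. It assumes the threshold is a plain angular separation in arcseconds, which may not be how `XMATCH` is actually defined, and it ignores the `AREA` clause. The `ra`/`dec` field names and the sample objects are hypothetical.

```python
import math


def separation_arcsec(ra1, dec1, ra2, dec2) -> float:
    """Angular separation of two sky positions (degrees in, arcseconds out)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which behaves well for small angles.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600


def cross_match(left, right, max_arcsec):
    """Brute-force match of every left object against every right object."""
    return [(o, t) for o in left for t in right
            if separation_arcsec(o["ra"], o["dec"], t["ra"], t["dec"]) < max_arcsec]


# Hypothetical objects from two archives.
sdss = [{"objId": 1, "ra": 181.3, "dec": -0.76, "type": 3, "i": 18.0}]
twomass = [
    {"objId": 10, "ra": 181.3, "dec": -0.7605, "m_j": 15.5},
    {"objId": 11, "ra": 181.31, "dec": -0.76, "m_j": 14.0},
]

# Same filters as the query: type = 3 and a colour difference above 2.
pairs = [(o["objId"], t["objId"])
         for o, t in cross_match(sdss, twomass, 3.5)
         if o["type"] == 3 and o["i"] - t["m_j"] > 2]
print(pairs)
```

The nested loop is quadratic in the number of objects, so this form only suits small samples.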
SkyQuery Structure
Each SkyNode publishes
Schema Web Service
Database Web Service
Portal
Plans Query (2 phase)
Integrates answers
Is itself a web service
[Diagram labels: SkyQuery Portal, Image Cutout, SDSS, 2MASS, FIRST, INT]
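The slide does not spell out what the two planning phases are. One plausible reading, sketched below as an assumption, is a cheap counting phase that decides the order of the nodes, followed by an execution phase that passes the surviving candidates from node to node. The `SkyNode` class is hypothetical, and matching by shared object id is a simplification of a positional cross-match.

```python
class SkyNode:
    """Hypothetical node: a name plus a dict of object id -> attributes."""

    def __init__(self, name: str, objects: dict):
        self.name = name
        self.objects = objects

    def count(self, predicate) -> int:
        # Phase 1 helper: how many local objects satisfy the predicate?
        return sum(1 for attrs in self.objects.values() if predicate(attrs))

    def select(self, predicate, candidates=None) -> set:
        # Phase 2 helper: filter either all local ids or the incoming candidates.
        if candidates is None:
            ids = set(self.objects)
        else:
            ids = candidates.intersection(self.objects)
        return {i for i in ids if predicate(self.objects[i])}


def portal_query(nodes, predicate):
    # Phase 1: count queries decide the order (smallest result first).
    order = sorted(nodes, key=lambda node: node.count(predicate))
    # Phase 2: run the query along the chain, shrinking the candidate set.
    candidates = None
    for node in order:
        candidates = node.select(predicate, candidates)
    return candidates, [node.name for node in order]


big = SkyNode("big", {i: {"type": 3 if i % 2 else 6} for i in range(1000)})
small = SkyNode("small", {i: {"type": 3} for i in (1, 2, 3, 5000)})

matches, order = portal_query([big, small], lambda attrs: attrs["type"] == 3)
print(order, sorted(matches))
```

Starting with the node that has the fewest matches keeps the intermediate candidate set small before it reaches the larger archive.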
Databases
Theory to practice
Data Cube
Wrote paper
SQL Server product and
ISO Standard adopted idea
Snapshot Isolation
Paper in 1996
Product in 2004
[Diagram labels: old version, new version, Reader]
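The slide gives no detail on the Data Cube itself. As it is generally described, the operator extends a group-by to every combination of the grouping dimensions, with a special ALL value marking a dimension that has been rolled up. The sketch below illustrates that idea in plain Python. The function name and the sample sales rows are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

ALL = "ALL"


def data_cube(rows, dims, measure):
    """Sum `measure` over every subset of `dims`, using ALL for rolled-up dimensions."""
    cube = defaultdict(float)
    for row in rows:
        for r in range(len(dims) + 1):
            for kept in combinations(dims, r):
                key = tuple(row[d] if d in kept else ALL for d in dims)
                cube[key] += row[measure]
    return dict(cube)


# Hypothetical fact table.
sales = [
    {"model": "Chevy", "year": 1994, "units": 50},
    {"model": "Chevy", "year": 1995, "units": 85},
    {"model": "Ford", "year": 1994, "units": 60},
]

cube = data_cube(sales, ("model", "year"), "units")
for key in sorted(cube, key=str):
    print(key, cube[key])
```

A single pass produces the detail rows, the per-model and per-year subtotals, and the grand total.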
Databases
Stress Test
Generate millions of
random SQL queries
Send them to 4 different products
Compare the answers:
If all agree, good!
If not, a bug somewhere
Found many bugs in DB products
Much appreciated by MS DB group
Tool cloned by other DB vendors
[Diagram: answers from SqlServer, DB2, Oracle, and Informix compared for equality (=)]
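The generate, send, and compare loop described on this slide is a form of differential testing, and can be sketched as follows. Every "product" here is an in-memory SQLite database, one of them with a deliberately injected fault so the comparison has something to find. The query generator, the table `t(a, b)`, and the engine labels W to Z are assumptions for illustration, not the actual tool.

```python
import random
import sqlite3
from collections import Counter


def random_query(rng: random.Random) -> str:
    """Build a small random SELECT over a fixed hypothetical table t(a, b)."""
    column = rng.choice(["a", "b", "a + b"])
    op = rng.choice(["<", "<=", "=", ">", ">="])
    return f"SELECT COUNT(*) FROM t WHERE {column} {op} {rng.randint(0, 9)}"


def make_engine(bug: bool = False):
    """Return a callable that runs SQL; all engines are SQLite in this sketch."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)",
                     [(i % 10, i % 7) for i in range(100)])

    def run(sql: str):
        rows = conn.execute(sql).fetchall()
        # Deliberately injected fault: wrong count for ">=" comparisons.
        if bug and ">=" in sql:
            rows = [(rows[0][0] + 1,)]
        return rows

    return run


def stress_test(engines: dict, n_queries: int, seed: int = 0):
    """Run random queries on every engine and record who disagrees with the majority."""
    rng = random.Random(seed)
    disagreements = []
    for _ in range(n_queries):
        sql = random_query(rng)
        answers = {name: repr(run(sql)) for name, run in engines.items()}
        majority = Counter(answers.values()).most_common(1)[0][0]
        suspects = [name for name, ans in answers.items() if ans != majority]
        if suspects:
            disagreements.append((sql, suspects))
    return disagreements


engines = {"W": make_engine(), "X": make_engine(),
           "Y": make_engine(), "Z": make_engine(bug=True)}
report = stress_test(engines, n_queries=200)
print(f"{len(report)} of 200 queries disagreed")
```

No oracle for the correct answer is needed. Agreement among independent systems stands in for correctness, and any disagreement flags a query worth investigating.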
SQL Automated Test Example
Four SQL systems on 2,000 statements
[Table: statement counts by Case and Error for systems W, X, Y, and Z; cell layout is not recoverable from the extraction]
All four agree 84%
W, X, and Y agree 95%
Problem with intermediate table.
PGM
Pretty Good Multicast
Reliable multicast protocol
Scales using hierarchy, suppression,
and FEC “on-demand”
(FEC on-demand is our contribution)
Joint work with Cisco and others
IETF standard
Implemented prototype
(Multicast PowerPoint)
Shipped in Windows XP
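The idea behind on-demand FEC can be illustrated with a deliberately simplified sketch: the sender transmits only data packets and computes a repair packet when a receiver reports a loss. Here a single XOR parity packet stands in for whatever erasure code a real implementation uses, and the `Sender` class, the `receive` function, and the packet contents are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


class Sender:
    def __init__(self, group: list[bytes]):
        # Equal-length packets are assumed so XOR parity is well defined.
        assert len({len(p) for p in group}) == 1
        self.group = group

    def data_packets(self) -> dict[int, bytes]:
        return dict(enumerate(self.group))

    def on_nak(self) -> bytes:
        """Parity is computed only when some receiver reports a loss."""
        return reduce(xor_bytes, self.group)


def receive(sender: Sender, lost: set[int]) -> list[bytes]:
    """Simulate a receiver that misses the packets whose indexes are in `lost`."""
    got = {i: p for i, p in sender.data_packets().items() if i not in lost}
    missing = [i for i in range(len(sender.group)) if i not in got]
    if len(missing) > 1:
        raise ValueError("a single parity packet can repair only one loss")
    if missing:
        parity = sender.on_nak()
        # XOR of the parity with all received packets yields the missing one.
        got[missing[0]] = reduce(xor_bytes, got.values(), parity)
    return [got[i] for i in range(len(sender.group))]


sender = Sender([b"AAAA", b"BBBB", b"CCCC"])
print(receive(sender, lost={1}))
```

The same parity packet repairs a different missing packet at each receiver, so one multicast repair can serve many receivers at once.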
MyLifeBits
“A lifetime store of everything”
The experiment:
digitizing Gordon Bell’s life
The software:
Based on SQL Server
Tools to capture web pages,
IM chats, TV, radio & telephone
Reports, links, full text search,
pivot by time or any other attribute
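A very small sketch of "pivot by time or any other attribute" over a database-backed store follows. SQLite stands in for SQL Server. The `items` table, its columns, and the sample rows are hypothetical, and the search is a plain substring match rather than real full-text search.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE items (
        id INTEGER PRIMARY KEY,
        kind TEXT,          -- e.g. web page, IM chat, phone call
        captured_at TEXT,   -- ISO 8601 timestamp
        title TEXT
    )
""")
conn.executemany(
    "INSERT INTO items (kind, captured_at, title) VALUES (?, ?, ?)",
    [
        ("web page", "2003-05-01T09:15:00", "Conference schedule"),
        ("IM chat", "2003-05-01T10:02:00", "Lunch plans"),
        ("phone call", "2003-05-02T16:40:00", "Call with travel agent"),
        ("web page", "2003-05-02T17:05:00", "Flight booking"),
    ],
)

# Whitelisted pivot expressions, so the attribute name is never pasted in raw.
ALLOWED = {"kind": "kind", "day": "date(captured_at)"}


def pivot(attribute: str) -> list[tuple]:
    """Count items grouped by a whitelisted attribute."""
    expr = ALLOWED[attribute]
    sql = (f"SELECT {expr} AS bucket, COUNT(*) FROM items "
           "GROUP BY bucket ORDER BY bucket")
    return conn.execute(sql).fetchall()


def search(term: str) -> list[str]:
    """Naive substring search over titles."""
    rows = conn.execute(
        "SELECT title FROM items WHERE title LIKE ? ORDER BY id", (f"%{term}%",))
    return [title for (title,) in rows]


print(pivot("day"))
print(pivot("kind"))
print(search("plan"))
```

Because every captured item is a row with attributes, pivoting by a new attribute is just another grouping expression.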
MyLifeBits Software
[Diagram components: Radio capture tool, TV capture tool, Internet, TV EPG download tool, Browser tool, Telephone capture tool, MyLifeBits store, database, PocketPC transfer tool, Radio EPG tool, MAPI interface, files, Legacy applications, MyLifeBits Shell, Voice annotation tool, Text annotation tool, PocketRadio player, Legacy email client]
Research Failures
Not everything is a success
We had technology transfer failures
We had projects with little impact
Success and Failure depend on environment
Even if you have a GREAT! idea
There are many exogenous factors in
technology transfer
And, sometimes the idea or focus is wrong
Allow people to fail once or twice.
Summary
Actionable Ideas
Co-lo if possible
Adopt a “university model”
Recruit from the top
Recruit for passion and a desire to
have impact
Install a Research Program Management
organization to orchestrate tech-transfer
Institute an annual TechFest
© 2003 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.