slides - sigmod

Database Theory:
Back to the Future
Victor Vianu
UC San Diego / INRIA
Predicting the future: a daunting task
The Oracle at Delphi
Arthur C. Clarke, 1964
Prediction in CS:
an inglorious history
T.J. Watson, 1943
“I think there is a world market for maybe five computers.”
Ken Olsen, 1977
“There is no reason anyone would want a computer in their home.”
Herbert A. Simon, 1965
“By 1985 machines will be capable of doing any work Man can do.”
History of Database Futurology
Laguna, Lagunita I and II, Asilomar, NSF Workshops, etc.
• Prediction: what will happen
• Prescription: what researchers should work on
Laguna report (1988)
Prediction
--Top 3 future applications:
CASE, CAD, Image and Spatial data
--Main issue of debate: object-oriented databases
Completely missed the Web revolution around the corner!
Prescription: should not work on dependency
theory, recursive queries, and new data models
“The overwhelming sentiment of the majority of participants
is that they did not want to see any more papers on
recursive queries. An analogy was drawn to dependency
theory which was explored at length a few years ago (…)
There was no support for any more data models. The
problem of data translation in a heterogeneous computing
environment was raised by one participant. This area was
discussed and most people believed it to be a solved
research problem.”
Niels Bohr:
"Prediction is very difficult, especially about the future."
Why is prediction so hard?
• Difficult to predict new applications
• Even harder to predict what technical issues
they will raise
Think of XML, data integration, data exchange…
DB Theory Futurology
Christos Papadimitriou, PODS 1995:
“Database Metatheory: Asking the Big Queries”
• retrospective of db theory
• reflections on theory, dynamics of the field
good vs. successful
positive vs. negative results
relationship to practice
paradigm shifts and scientific revolutions in CS
Future = evolution + revolution
extrapolated
???
Database research topics: 1981 - 1995
[Papadimitriou 1995]
Database research topics: 1996-2011
[Chart, 1997-2011. Topics plotted: Optimization, indexing; Relational Theory; Semistructured, XML; Constraint, spatial db; Probabilistic db]
Database research topics: 1996-2011
[Chart, 1997-2011. Topics plotted: Data integration/exchange; Streams; Data mining; Security, privacy]
Others (1996-2011)
• Recursive queries 13
• Workflows and web services 12
• Search and ranking 11
• Transactions 8
• Provenance 4
Can this be extrapolated into the future?
No!
Clustering by “surface” topic:
primary motivation, closely tied to timely applications
Alternative: look for “persistent” topics:
recurring as fundamental conceptual
and technical tools under wraps
Example: dependency theory
• started out as surface topic:
relational dbs need dependencies!
• but became persistent:
crucial tool in data exchange, data integration,
query optimization, data cleaning…
proof techniques used in all areas
Surface topic: Data integration/exchange
Persistent topics: Incomplete information, Datalog, Views, Dependency theory
Conjecture: past persistency is
a predictor of future persistency!
Some persistent themes
• query languages
• updates
• dependency theory
• recursive queries
• views and incomplete information
• connections to broader theory:
logic, complexity, automata theory
Query languages
For any data model:
• query language design, semantics
• expressiveness and complexity
• static analysis and optimization
Open question: language for PTIME
Updates
For any data model:
• semantics of updates
• update languages
• incremental computation
view maintenance, constraint checking, etc.
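As an illustration of incremental computation, here is a minimal sketch of view maintenance for a group-by-count view. It is not taken from the talk: the names `CountView`, `recompute`, `dept` and the sample `employees` rows are hypothetical, and the example assumes a single base relation with insertions and deletions only. The point it shows is that each update changes the view by a small delta, so the view never has to be recomputed from scratch.

```python
from collections import Counter


def recompute(rows, key):
    """Evaluate the view from scratch: number of rows per group key."""
    return Counter(key(row) for row in rows)


class CountView:
    """Materialized group-by-count view kept in sync by applying deltas."""

    def __init__(self, rows, key):
        self.key = key
        self.counts = recompute(rows, key)

    def insert(self, row):
        # An inserted base row raises its group's count by one.
        self.counts[self.key(row)] += 1

    def delete(self, row):
        # A deleted base row lowers its group's count; empty groups vanish.
        group = self.key(row)
        if self.counts[group] == 0:
            raise KeyError(f"no row in group {group!r} to delete")
        self.counts[group] -= 1
        if self.counts[group] == 0:
            del self.counts[group]


def dept(row):
    """Hypothetical grouping key: the department field of an employee row."""
    return row[1]


# Hypothetical base relation of (name, department) rows.
employees = [("ann", "db"), ("bob", "db"), ("cat", "ai")]
view = CountView(employees, dept)

# Apply the same updates to the base relation and, as deltas, to the view.
view.insert(("dan", "ai"))
employees.append(("dan", "ai"))
view.delete(("ann", "db"))
employees.remove(("ann", "db"))

# The incrementally maintained view agrees with full recomputation.
assert view.counts == recompute(employees, dept)
```

Each update touches one counter entry, whereas `recompute` scans the whole relation. Views with joins or negation need more elaborate delta rules than this sketch covers.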
Views and incomplete information
Central as surface topics, inseparable tandem
Everywhere under wraps:
• data integration
LAV, GAV, certain answers
• data exchange
• query optimization
• data cleaning and repair
• uncertain data
• privacy
Future = evolution + revolution
extrapolated
persistent themes
???
The revolutionary side:
wide open, pregnant with opportunity!
“the future is data + communication”
“computer users will spend their time extracting
information from multiple data sources”
“we need to understand the new kinds of data arising
from the web”
“we need to study data streams, large data collections”
J. Hopcroft
Computer science theory to support
research in the information age
Exciting times for PODS!
computer science is becoming
increasingly centered around
data, information, knowledge
Challenge: Data has many suitors!
• Mainstream databases
• Knowledge discovery / data mining
• Information retrieval
• Semantic web, searching, ranking
• High-dimensional data analysis
• Networks, distributed computing
• Cloud computing
• Scientific computing, data visualization
Increasingly pervasive: probability and statistics
PODS will have to reinvent its identity!
Reliable prediction:
PODS will remain a good place to be!
• exciting crossroads between hot applications and
beautiful theory
• researchers inspired by applications while maintaining
a long-term perspective
Will continue to have fun!
Best reason for optimism:
the human factor
• The field started by attracting first-rate
theoreticians from other areas in search
of trailblazing excitement
• It continues to attract incredibly talented
young researchers