dbhiring - UBC Department of Computer Science

Database Systems Research: Where Is It (or Should It Be) Headed?
(aka looking for a “perfect” candidate)
Laks V.S. Lakshmanan
Dept. of Computer Science
Univ. of British Columbia
December 6, 2001.
Disclaimers and Stage Setting

• not meant to be comprehensive
• necessarily biased
• database intended in a very broad sense, e.g.:
– relational databases, OO, object-relational, …
– legacy systems (hierarchical/network DBs)
– file systems, spreadsheets, network directories
– text, media, maps
– time series, biological sequences
– data on the web, XML
• data management research – more apt term
DB Research Paradigms

• three major streams:
– database theory (connections to math. logic, finite model theory, …)
– principles (data modeling, design, query languages, query optimization, …)
– systems (database tuning, benchmarking, …)
• all three have their place in general
• but there are limitations
DB Research: A data-driven perspective

[Diagram: kinds of data driving DB research]
– relational: alphanumeric, rigid structure
– OO
– spatial, temporal, mobility
– unstructured data: text/doc domination
– multi-media: raster, video, audio
– semi-structured data & XML: surprising e.g.: AcEDB
– data mining, OLAP
– other labels: data on the web; business data, scientific, biological data
DB Research: A process-driven perspective

• classical: e.g., transactions, triggers, integrity checking
• modern:
– richer transaction models
– active databases
– workflow
– data warehousing
– data integration
Note: the last two have substantial data modeling, query answering, and algorithmic components.
Some Database Theory

• what are queries?
• First (bad) answer: any computable IN → OUT function.
• Okay, efficiently computable ones: why is this still bad?
• What about the following “queries”?
– Find the 10th tuple in relation emp.
– Find the employees with an odd salary.
– Find the employees the internal representation of whose name is odd!
More on queries

• What went wrong: representation dependence.
• Queries are computable functions that commute with the representation map Rep (i.e., they are generic):

DB ──Q──> Ans
Rep(DB) ──Q──> Rep(Ans)

i.e., Q(Rep(DB)) = Rep(Q(DB)).
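The commuting condition can be checked on a toy example. The Python sketch below is illustrative only: it assumes Rep is modeled as a one-to-one renaming of domain values, and the schema, the data, and the names `rep`, `same_dept_pairs`, `odd_encoding`, and `commutes` are hypothetical, not taken from the talk. One query uses only equality between values; the other inspects the internal encoding of names, in the spirit of the third "query" listed earlier.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Row = Tuple[str, ...]
Relation = FrozenSet[Row]


def rep(db: Relation, encode: Dict[str, str]) -> Relation:
    """Assumed stand-in for Rep: rename every value via a one-to-one mapping."""
    return frozenset(tuple(encode[v] for v in row) for row in db)


def same_dept_pairs(emp: Relation) -> Relation:
    """Pairs of distinct employees in the same department (uses only equality)."""
    return frozenset(
        (a, b) for (a, d1) in emp for (b, d2) in emp if d1 == d2 and a != b
    )


def odd_encoding(emp: Relation) -> Relation:
    """Employees whose name bytes sum to an odd number (looks inside the encoding)."""
    return frozenset((n, d) for (n, d) in emp if sum(n.encode("utf-8")) % 2 == 1)


def commutes(q: Callable[[Relation], Relation], db: Relation,
             encode: Dict[str, str]) -> bool:
    """Check Q(Rep(DB)) == Rep(Q(DB)) for this one database and this one renaming."""
    return q(rep(db, encode)) == rep(q(db), encode)


# Hypothetical emp(name, dept) relation and a hypothetical renaming of its domain.
emp: Relation = frozenset({("ann", "db"), ("bob", "db"), ("cy", "ai")})
encode = {"ann": "e1", "bob": "e2", "cy": "e3", "db": "d1", "ai": "d2"}

print(commutes(same_dept_pairs, emp, encode))  # equality-only query
print(commutes(odd_encoding, emp, encode))     # encoding-dependent "query"
```

A single passing check does not prove a query generic, since genericity requires the condition to hold for every database and every renaming. A single failing check, however, is enough to show that a "query" depends on the representation.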
Interesting Questions

• what are meaningful queries for a given data model/application class?
• how do you design declarative query languages and algebras?
• build novel indices for new data types?
• design optimal strategies for clustering data
• deal with size: data compression, approximation, summarization, etc.
• resource conscious designs
• scalable algorithms for analysis queries (incl. data mining)
IQ (contd.)

• liberating data mining from present-day mindset
• answering queries using views and view maintenance
• semi-structured data management
• mixing paradigms: e.g., database style querying and information retrieval or media retrieval
• foundational questions in new domains: e.g., what does it mean to query sequences?
Profile of a perfect candidate

• some obvious desirables: is a hardcore system builder, architect of extensions
• has vision in traditional or new domains (e.g., web, biology, mobility, …)
– vision just as important as technical skills
• raises difficult questions and provides surprisingly elegant and/or efficient solutions
• complements the DB group’s strengths
• has unbounded energy and enthusiasm!!!!