dbhiring - UBC Department of Computer Science
Database Systems Research:
Where it is (or should be)
Headed?
(aka looking for a “perfect”
candidate)
Laks V.S. Lakshmanan
Dept. of Computer Science
Univ. of British Columbia
December 6, 2001.
Disclaimers and Stage Setting
not meant to be comprehensive
necessarily biased
database intended in a very broad sense
– e.g., relational databases, OO, object-relational, …
– legacy systems (hierarchical/network DBs)
– file systems, spreadsheets, network directories
– text, media, maps
– time series, biological sequences
– data on the web, XML
data management research – more apt term
DB Research Paradigms
three major streams:
– database theory (connections to math. logic,
finite model theory, …)
– principles (data modeling, design, query
languages, query optimization, …)
– systems (database tuning, benchmarking, …)
all three have their place in general
but there are limitations
DB Research: A data-driven perspective
[Diagram: kinds of data driving DB research. Node labels: relational (alphanumeric, rigid structure); OO; spatial; temporal; mobility; multi-media (raster, video, audio); unstructured data; semi-structured data & XML; data mining, OLAP. Annotations: data on the web; business, scientific, and biological data; text/doc domination; surprising e.g.: AcEDB.]
DB Research: A process-driven perspective
classical: e.g., transactions, triggers,
integrity checking
modern:
– richer transaction models
– active databases
– workflow
– data warehousing
– data integration
Note: the last two have substantial data
modeling, query answering, and
algorithmic components.
Some Database Theory
what are queries?
First (bad) answer: any computable
IN → OUT function.
Okay, efficiently computable ones: why is
this still bad?
What about the following “queries”?
– Find the 10th tuple in relation emp.
– Find the employees with an odd salary.
– Find the employees whose name has an odd
internal representation!
More on queries
What went wrong: representation dependence.
Queries are computable functions that commute
(i.e. they are generic):
  DB ------Q-----> Ans
   |                |
  Rep              Rep
   v                v
  Rep(DB) ---Q---> Rep(Ans)

That is, Q(Rep(DB)) = Rep(Q(DB)), where Ans = Q(DB).
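The commuting condition can be checked on a toy instance. The sketch below is only an illustration, not part of the talk: the `emp` rows, the two re-representations (`reverse_storage`, `rename_names`), and all function names are hypothetical, and a finite check like this can refute genericity but cannot prove it.

```python
from typing import Callable, List, Tuple

Row = Tuple[str, int]            # (name, salary); hypothetical emp schema
Table = List[Row]                # list order stands in for storage order

EMP: Table = [("ann", 40), ("bob", 60), ("cy", 75)]

# Two stand-ins for Rep: a change of storage order and a renaming of names.
def reverse_storage(rows: Table) -> Table:
    return list(reversed(rows))

RENAME = {"ann": "cy", "bob": "ann", "cy": "bob"}   # a bijection on names

def rename_names(rows: Table) -> Table:
    return [(RENAME[name], salary) for name, salary in rows]

# Candidate "queries".
def high_paid(rows: Table) -> Table:
    """Employees earning more than 50: uses only the logical content."""
    return [r for r in rows if r[1] > 50]

def first_tuple(rows: Table) -> Table:
    """Analogue of 'find the 10th tuple': depends on storage order."""
    return rows[:1]

def odd_name_code(rows: Table) -> Table:
    """Employees whose name's character codes sum to an odd number."""
    return [r for r in rows if sum(ord(c) for c in r[0]) % 2 == 1]

def commutes(query: Callable[[Table], Table],
             db: Table,
             rep: Callable[[Table], Table]) -> bool:
    """Check Q(Rep(DB)) == Rep(Q(DB)), comparing answers as sets."""
    return set(query(rep(db))) == set(rep(query(db)))

if __name__ == "__main__":
    for q in (high_paid, first_tuple, odd_name_code):
        for rep in (reverse_storage, rename_names):
            print(q.__name__, rep.__name__, commutes(q, EMP, rep))
```

On this instance `high_paid` commutes with both re-representations, while `first_tuple` fails under reordering and `odd_name_code` fails under renaming, which is the sense in which the last two are representation dependent.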
Interesting Questions
what are meaningful queries for a given
data model/application class?
how do you design declarative query
languages and algebras?
build novel indices for new data types?
design optimal strategies for clustering data
deal with size: data compression,
approximation, summarization, etc.
resource conscious designs
scalable algorithms for analysis queries
(incl. data mining)
Interesting Questions (contd.)
liberating data mining from present-day
mindset
answering queries using views and view
maintenance
semi-structured data management
mixing paradigms: e.g., database-style
querying and information retrieval or media
retrieval
foundational questions in new domains:
e.g., what does it mean to query sequences?
Profile of a perfect candidate
some obvious desirables: is a hardcore
system builder, architect of extensions
has vision in traditional or new domains
(e.g., web, biology, mobility, …)
– vision just as important as technical skills
raises difficult questions and provides
surprisingly elegant and/or efficient
solutions
complements the DB group’s strengths
has unbounded energy and enthusiasm!!!!