Advanced Database Applications Fall 2011


Advanced Database Applications:
CS562 -- Fall 2011
George Kollios
Boston University
Prof. George Kollios
Office: MCS 288
Office Hours: Monday 2:30pm-4:00pm
Thursday 11:00am-12:30pm
Web: http://www.cs.bu.edu/faculty/gkollios/ada11
History of Database Technology

1960s: Data collection, database creation, IMS and network DBMS
1970s: Relational data model, relational DBMS implementation
1980s: RDBMS, advanced data models (extended-relational, OO, deductive, etc.) and application-oriented DBMS (spatial, scientific, engineering, etc.)
1990s-2000s: Data mining and data warehousing, multimedia databases
2010s-: Data on the cloud, privacy, security, social network data (Facebook, Twitter, etc.), Web 3.0 and more
Structure of an RDBMS

A typical RDBMS has a layered architecture. A DBMS is an OS for data!

Layers, top to bottom:
Query Optimization and Execution
Relational Operators
Files and Access Methods
Buffer Management
Disk Space Management
DB

Modern Database Systems: extend these layers
Index Methods for RDBMS

Hashing Methods: Linear Hashing, Extensible Hashing
B-tree family: B+-trees and variations
Both of them are one-dimensional
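
To make "one-dimensional" concrete, here is a minimal Python sketch. It assumes a sorted list searched with the standard `bisect` module as a stand-in for the leaf level of a B+-tree; this is an illustration only, not a real B+-tree. The point data and the function names are hypothetical. A range query on the indexed key reads one contiguous run of entries, while a two-dimensional window query can use the index on one attribute only and has to filter the other.

```python
from bisect import bisect_left, bisect_right

# Hypothetical data: (x, y) points. The sorted list of x values stands in
# for the leaf level of a B+-tree on x (illustration only).
points = [(1, 9), (3, 2), (4, 7), (6, 1), (8, 5), (9, 3)]
by_x = sorted(points)              # ordered by x, then y
x_keys = [p[0] for p in by_x]

def range_on_x(lo, hi):
    """1-D range query: all points with lo <= x <= hi (one contiguous run)."""
    start = bisect_left(x_keys, lo)
    end = bisect_right(x_keys, hi)
    return by_x[start:end]

def window_query(x_lo, x_hi, y_lo, y_hi):
    """2-D window query: the 1-D index narrows x only; y must be filtered."""
    candidates = range_on_x(x_lo, x_hi)
    return [p for p in candidates if y_lo <= p[1] <= y_hi]
```

In this sketch the window query still inspects every point in the x range, even those far outside the y range, which is one way to see why spatial data may call for multi-dimensional index structures.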
Overview of the course

Spatial Database Systems: GIS, CAD/CAM, EOSDIS project (NASA). Manage points, lines and regions
Temporal Database Systems: billing, medical records
Spatio-temporal Databases: moving objects, changing regions, etc.
Overview of the course

Multimedia databases: a multimedia system can store and retrieve objects/documents with text, voice, images, video clips, etc.
Time series databases: stock market, ECG, trajectories, etc.
Multimedia databases

Applications:
Digital libraries, entertainment, office automation
Medical imaging: digitized X-rays and MRI images (2- and 3-dimensional)
Query by content (or QBE) should be:
Efficient
‘Complete’ (no false dismissals)
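
One common way to get both properties is filter-and-refine with a lower-bounding distance; the slides do not name a technique, so treat the following as a sketch under that assumption. A cheap distance is computed on a reduced representation (here, hypothetically, just the first `k` coordinates of each feature vector). Because it never overestimates the true distance, the filter step cannot discard a true answer (no false dismissals); it may let through false alarms, which the refine step removes. All names and data below are illustrative.

```python
import math

def full_dist(a, b):
    """Euclidean distance over all coordinates."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def prefix_dist(a, b, k):
    """Euclidean distance over the first k coordinates only.
    Dropping non-negative terms can only shrink the sum, so this
    never exceeds full_dist(a, b): it is a lower bound."""
    return full_dist(a[:k], b[:k])

def range_query(db, query, eps, k=2):
    """Filter with the cheap lower bound, then refine with the true distance."""
    candidates = [obj for obj in db if prefix_dist(obj, query, k) <= eps]
    return [obj for obj in candidates if full_dist(obj, query) <= eps]

# Hypothetical feature vectors
db = [(0, 0, 0, 0), (1, 1, 1, 1), (0, 0, 5, 5), (4, 4, 4, 4)]
```

In this toy data, `(0, 0, 5, 5)` passes the filter for the query `(0, 0, 0, 0)` because its first two coordinates match, and is then rejected by the refine step: a false alarm, but never a false dismissal.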
Database Outsourcing
Owner(s): publish database
Servers: host database and provide query services
Clients: query the owner’s database through servers
[Figure: Owner, Server, Clients]
Security Issues: untrusted or compromised servers
H. Hacigumus, B. R. Iyer, and S. Mehrotra, ICDE02
Security Issues



Query authentication and verification
Data privacy and confidentiality
Access control
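
As an illustration of query authentication, here is a minimal sketch of a Merkle hash tree, one standard technique for this problem; the slides do not say which technique the course uses, so this is an assumption. The owner computes a root hash over the records and is assumed to hand it to clients over a trusted channel (for example, signed). The untrusted server returns a record together with the sibling hashes on its path to the root, and the client recomputes the root. The record format and function names are hypothetical, and the sketch assumes the number of records is a power of two.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(records):
    """Owner side: all tree levels, leaves first.
    Assumes len(records) is a power of two (simplification)."""
    level = [h(r.encode()) for r in records]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove(levels, index):
    """Server side: sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1                      # neighbour in the same pair
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof

def verify(record, proof, root):
    """Client side: recompute the root from the record and the proof."""
    digest = h(record.encode())
    for sibling, sibling_is_left in proof:
        digest = h(sibling + digest) if sibling_is_left else h(digest + sibling)
    return digest == root
```

A possible use: the owner calls `build_levels` and publishes `levels[-1][0]` as the root; the server answers a lookup with the record and `prove(levels, i)`; the client accepts the record only if `verify` returns `True`.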
Databases on the Cloud




Cloud computing is a new trend
Data are stored “in the cloud”, accessed from everywhere
System should maximize utility, minimize response time
Use of large clusters (data centers)
MapReduce
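
For readers new to MapReduce, the following single-process Python sketch shows the shape of the programming model using the usual word-count example. It is not a distributed implementation and does not use any MapReduce framework; the function names are hypothetical. In a real cluster, the map and reduce calls would run in parallel on many machines and the shuffle would move data between them.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit (word, 1) for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values that share one key."""
    return key, sum(values)

def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```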
Semantic Web: A lot of data on the web…

There is a lot of data on the web…
Need to make it more accessible and useful
Machines should understand some of the semantics of the web data
Semantic Web: “a web of data that can be processed directly and indirectly by machines.” (Tim Berners-Lee)
Semantic Web


From document sharing to data sharing
Issues/Challenges:
Vastness: more than 24B pages
Vagueness and Uncertainty: meaning of “young”, “cheap”, “close”, etc.
Inconsistency: contradictions in data and semantics
Deceit: a user may want to mislead or deceive
Probabilistic (or Uncertain) Databases

Another approach to modeling many real-world applications
Data records are probabilistic or uncertain
Need to formally model and query them (correctly and efficiently) => Prob DBs
What is a Probabilistic Database?

“An item belongs to the database” is a probabilistic event
Tuple-existence uncertainty
Attribute-value uncertainty
“A tuple is an answer to the query” is a probabilistic event
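
A small sketch of tuple-existence uncertainty, assuming (this is not stated in the slides) that tuples exist independently of each other. Each tuple carries the probability that it belongs to the database, and the query "is there at least one tuple satisfying the predicate?" then has answer probability 1 minus the product of (1 - p) over the matching tuples. The relation and the function name are hypothetical.

```python
# Hypothetical relation: (sensor_id, temperature, existence probability)
readings = [
    ("s1", 72, 0.9),
    ("s2", 85, 0.5),
    ("s3", 90, 0.2),
    ("s4", 60, 0.7),
]

def prob_exists(tuples, predicate):
    """P(at least one tuple satisfying the predicate is in the database),
    assuming tuples exist independently of each other."""
    p_none = 1.0
    for t in tuples:
        if predicate(t):
            p_none *= 1.0 - t[2]   # this matching tuple is absent
    return 1.0 - p_none
```

For the predicate "temperature above 80", the matching tuples are `s2` and `s3`, so the probability is 1 - (0.5 × 0.8) = 0.6.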
Two Types of Probabilistic Data

Database is deterministic, query answers are probabilistic
E.g., IR-style/“fuzzy-match” queries
Approximate query answers
Database is probabilistic, query answers are probabilistic
Prob DB Models

The database is a probability distribution over possible instances of (deterministic) databases
Example: x-relations [Trio]
Each x-tuple represents a discrete probability distribution of tuples
x-tuples are mutually independent; the alternative tuples within an x-tuple are disjoint (mutually exclusive)
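
The possible-worlds view can be sketched by brute-force enumeration. In the Python below, each x-tuple is a list of alternative tuples with probabilities; one alternative (or none, if the probabilities sum to less than 1) is chosen per x-tuple, independently of the other x-tuples. The data, names, and representation are hypothetical and are not Trio's actual interface; enumeration is exponential in the number of x-tuples, so this is only for illustration.

```python
from itertools import product

# Hypothetical x-relation: each x-tuple is a list of (tuple, probability)
# alternatives. If the probabilities sum to less than 1, the remainder is
# the probability that the x-tuple contributes no tuple at all.
x_relation = [
    [(("Amy", "Honda"), 0.6), (("Amy", "Toyota"), 0.4)],
    [(("Bob", "Mazda"), 0.7)],
]

def possible_worlds(x_rel):
    """Enumerate (world, probability) pairs. Alternatives inside an x-tuple
    are mutually exclusive; different x-tuples are independent."""
    choices = []
    for alternatives in x_rel:
        options = list(alternatives)
        missing = 1.0 - sum(p for _, p in alternatives)
        if missing > 1e-12:
            options.append((None, missing))   # x-tuple absent in this world
        choices.append(options)
    worlds = []
    for combo in product(*choices):
        world = tuple(t for t, _ in combo if t is not None)
        prob = 1.0
        for _, p in combo:
            prob *= p                          # independence across x-tuples
        worlds.append((world, prob))
    return worlds
```

Each returned world is an ordinary deterministic relation, and the probabilities over all worlds sum to 1, which matches the definition of a probabilistic database as a distribution over deterministic instances.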
Back to reality…

Grading:
4 Homeworks: 0.2
1 Term Project: 0.3
You need to talk to me and get a problem
Project proposal due in a couple of weeks
Midterm: probably on Oct 26, in class: 0.2
Final: Dec 20 at 12:30pm (?): 0.3