
Distributed Database
Management Systems
Reading

Farkas
Textbook: Ch. 4
CSCE 824 - Spring 2011
Design Issues


Placing of data and programs
(DBMS and application)
Network issues
Level of Sharing



No sharing
Data sharing
Data and program sharing
Heterogeneous environment!
Top-Down Design

Global Conceptual schema 
distribution
– Fragmentation
– Replication
– Allocation

Figure 3.2
Correctness of
Fragmentation
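1. Completeness: for $F_R = \{R_1, \ldots, R_n\}$, every data item in $R$ appears in some $R_i$
2. Reconstruction: $R = \nabla R_i,\ \forall R_i \in F_R$ (where $\nabla$ is a relational operator, e.g., union or join)
1 & 2: Lossless-join (normalization)
3. Disjointness:
– Horizontal: there does not exist $d_j \in R_i$ such that $d_j \in R_k$, where $k \neq i$
– Vertical: same as horizontal for nonprimary key attributes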
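
The three criteria can be checked mechanically for a horizontal fragmentation. The sketch below is a minimal illustration, assuming relations are modeled as Python sets of tuples and that the reconstruction operator for horizontal fragments is set union. The relation `emp`, its fragments, and the function names are hypothetical and do not come from the slides; vertical fragmentation is not covered.

```python
from functools import reduce

def is_complete(relation, fragments):
    """Completeness: every tuple of R appears in at least one fragment."""
    return all(any(t in f for f in fragments) for t in relation)

def reconstructs(relation, fragments):
    """Reconstruction: the union of the horizontal fragments equals R."""
    return reduce(set.union, fragments, set()) == relation

def is_disjoint(fragments):
    """Disjointness: no tuple appears in two different fragments."""
    seen = set()
    for f in fragments:
        if seen & f:
            return False
        seen |= f
    return True

# Hypothetical relation EMP(id, name, site), fragmented by site.
emp = {(1, "Ann", "NY"), (2, "Bob", "LA"), (3, "Cy", "NY")}
emp_ny = {t for t in emp if t[2] == "NY"}
emp_la = {t for t in emp if t[2] == "LA"}
fragments = [emp_ny, emp_la]
```

With these definitions, `fragments` passes all three checks, while dropping `emp_la` breaks completeness and adding `emp` itself as a fragment breaks disjointness.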
Data Directory

Global vs. local conceptual
schemas
– How to search?
– Where to store?
– Single vs. multiple copies?
Current Research



Allocation: new requirements,
technology, etc.
Where to store the fragments?
Dynamic environment
– Usage pattern
– Application characteristics
– Network changes
– Security
Bottom-Up Approach


Multi-database systems
How to integrate them into 1
database?
– Interoperability
Database Integration

Physical integration
– Materialized database: data
warehouses
– Extract-transform-load (ETL) tools

Logical integration
– Virtual (not materialized) integration
– Enterprise Information Integration
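
The difference between the two approaches can be seen in a small sketch. It assumes a single hypothetical operational source held as a list of dictionaries; the field names and the cents-to-units transformation are invented for illustration. The materialized copy is a snapshot, whereas the virtual view is computed from the source at query time.

```python
# Hypothetical operational source; amounts are stored in cents.
source = [{"id": 1, "amount_cents": 1250}]

def etl(rows):
    """Physical integration: extract, transform, and load a materialized copy."""
    return [{"id": r["id"], "amount": r["amount_cents"] / 100} for r in rows]

def virtual_view(rows):
    """Logical integration: compute the integrated rows at query time."""
    return ({"id": r["id"], "amount": r["amount_cents"] / 100} for r in rows)

warehouse = etl(source)                         # snapshot taken now
source.append({"id": 2, "amount_cents": 300})   # later update at the source

stale_ids = [r["id"] for r in warehouse]             # materialized copy is unchanged
fresh_ids = [r["id"] for r in virtual_view(source)]  # virtual view sees the update
```

The stale snapshot is why a materialized database needs a refresh or maintenance step, while the virtual view pays the integration cost on every query.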
Data Warehouses

On-line Analytical Processing
(OLAP) applications:
– Decision support systems
– Trend analysis and forecasting


Complex queries, large databases
Materialized view maintenance
Logical Integration




No materialized global database
Virtual integration: data remains at
the local (operational) databases
Global conceptual schema may not
contain everything from local schemas
Autonomous and heterogeneous local
systems
Bottom-Up Design

Global Conceptual Schema (GCS
or mediated schema)
– Defined first: local conceptual
schemas (LCSs) are mapped to the GCS
– Defined during the integration of
the LCSs, together with the
corresponding mappings from the LCSs
to the GCS
GCS Defined First


Local-as-view (LAV) systems
– Each LCS is treated as a view over the GCS
– Query results: constrained to the objects in
the local DBs while the GCS definition may be
richer
– Potential incomplete answers
Global-as-view (GAV) systems: the GCS is defined as a set of views
over the LCSs
– View definition defines how to derive elements
of the GCS
– Query results: constrained to the GCS while
the local DBs might be richer
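
A minimal sketch of the global-as-view case follows, assuming two hypothetical local databases held as lists of dictionaries. The relation names `EMP`, `STAFF`, and `EMPLOYEE` and all attribute names are invented for illustration. The global relation is defined as a view (here a function) over the local data, and a global query is answered by evaluating that view.

```python
# Two hypothetical local databases with different schemas.
local_db1 = [{"ename": "Ann", "dept": "R&D"}]               # LCS1: EMP(ename, dept)
local_db2 = [{"name": "Bob", "unit": "Sales", "age": 41}]   # LCS2: STAFF(name, unit, age)

def gcs_employee():
    """GAV: the global relation EMPLOYEE(name, dept) is a view over the LCSs."""
    rows = [{"name": r["ename"], "dept": r["dept"]} for r in local_db1]
    rows += [{"name": r["name"], "dept": r["unit"]} for r in local_db2]
    return rows

def query_global(dept):
    """A global query is answered by evaluating the view definition."""
    return [r["name"] for r in gcs_employee() if r["dept"] == dept]
```

The local attribute `age` is not exposed by the view, which illustrates results being constrained to the GCS even though a local database is richer.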
Design Tasks



Schema translation
Schema generation
Figure 4.3
Intermediate Canonical
Representation




Expressive enough to incorporate all
concepts in the local databases
Simple, intuitive, practical, etc.
Example: E/R model, relational
model, graph/tree models, etc.
Tools
Schema Generation




Schema matching: syntax and
semantics
Integration of common schema
elements
Schema mapping
See example 4.1, 4.2
Schema Matching


Defined or discovered (e.g., web
data)
Rules:
– Correspondence between 2 elements
– Predicate indicating whether the
correspondence holds
– Similarity value between the 2
elements
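
The three parts of a matching rule can be represented as one record. The sketch below is an assumption-laden illustration: it uses string similarity of element names from Python's standard `difflib` as the similarity value and a hypothetical threshold of 0.6 as the predicate. Real matchers typically combine several kinds of evidence; the class and function names are invented here.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

@dataclass
class Match:
    source: str        # element of one schema
    target: str        # element of the other schema
    similarity: float  # similarity value in [0, 1]
    holds: bool        # predicate: does the correspondence hold?

def name_similarity(a, b):
    """Case-insensitive string similarity of two element names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match(source, target, threshold=0.6):
    """Build a matching rule; the threshold is an arbitrary assumption."""
    s = name_similarity(source, target)
    return Match(source, target, s, s >= threshold)
```

For example, `match("EmpName", "emp_name")` yields a correspondence that holds, while `match("salary", "dept")` does not, since the two names share no characters.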
Finding Correspondence


Difficult process due to schema
heterogeneity
Can it be automated?
– Insufficient schema and instance
information
– Unavailability of schema
documentation
– Subjectivity of matching
Matching Algorithm
Issues



Schema vs. instance matching
– Concept match
– Data instance: semantic inconsistencies
Element-level vs. structure-level mapping
– Element name vs. semantics
– Multiple attribute mapping?
Matching cardinality
– One-to-one, one-to-many, many-to-many
Semantic Schema
Heterogeneity

Semantic: meaning, interpretation,
and intended use of data
– Synonyms, homonyms, hypernyms
– Different ontologies
– Imprecise wording
Structural Schema
Heterogeneity
– Type conflict: attribute vs. entity
– Dependency conflict: mapping
cardinality inconsistencies
– Key conflict: different primary keys
– Behavioral conflict: modeling
assumptions, e.g., referential integrity,
deletion, etc.
Schema Integration


Binary
N-ary
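
The two strategies differ in how many schemas are combined per step. The sketch below is a deliberately simplified illustration, assuming each schema has already been matched and is modeled as a set of element names, so that integration reduces to set union. The function names and sample schemas are hypothetical.

```python
from functools import reduce

def integrate(*schemas):
    """Merge any number of schemas (modeled as sets of element names)."""
    return set().union(*schemas)

def binary_integration(schemas):
    """Integrate two schemas at a time, carrying an intermediate schema forward."""
    return reduce(lambda acc, s: integrate(acc, s), schemas)

def nary_integration(schemas):
    """Integrate all schemas in a single pass."""
    return integrate(*schemas)

# Hypothetical local schemas after matching.
schemas = [{"name", "dept"}, {"name", "salary"}, {"dept", "budget"}]
```

In this simplified model both strategies produce the same result; the difference lies in the number of intermediate schemas that have to be built and reconciled along the way.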
Schema Mapping



How the data from the local
databases can be mapped to the GCS
Mapping creation
Mapping maintenance
Mapping Creation


Input: LCS, GCS, M (schema
matches)
Output: Q={Q1, …, Qk} such that
– $DB_{GCS} = Q(DB_{LCS})$
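
The input/output relationship can be sketched as follows. This is an illustration only: it assumes the schema matches M are simple one-to-one attribute renamings, that each mapping query is a Python function over a local database, and that the GCS instance is the concatenation of the query results. The relations, attributes, and the helper `make_query` are hypothetical.

```python
def make_query(relation, matches):
    """Build one mapping query Q_i from schema matches {local_attr: gcs_attr}."""
    def query(local_db):
        return [
            {gcs: row[loc] for loc, gcs in matches.items()}
            for row in local_db[relation]
        ]
    return query

# Hypothetical local database contents and schema matches M.
db_lcs = {
    "EMP": [{"ename": "Ann", "dno": 10}],
    "STAFF": [{"name": "Bob", "unit": 20}],
}
M = {
    "EMP": {"ename": "name", "dno": "dept"},
    "STAFF": {"name": "name", "unit": "dept"},
}

# Output: the set of queries Q = {Q_1, ..., Q_k}, one per local relation.
Q = [make_query(rel, m) for rel, m in M.items()]

# Executing the queries over the local data yields the GCS instance.
db_gcs = [row for q in Q for row in q(db_lcs)]
```

Here `db_gcs` holds one row per local tuple, expressed with the GCS attribute names `name` and `dept`.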
Security Objectives



Confidentiality
Integrity
Availability
Question 1

How do distributed databases
impact the security objectives?
– Confidentiality in traditional vs.
distributed DBs
– Integrity in traditional vs.
distributed DBs
– Availability in traditional vs.
distributed DBs
Integrity

Correctness criteria
– Top-down design
– Bottom-up design
Availability

What are the issues related to
availability when dealing with
– Top-down design
– Bottom-up design
Confidentiality


(will be covered in 2nd part of
semester but…)
Centralized vs. distributed
security policy
– Top-down design
– Bottom-up design
Next Class
Semantics-based Database
Integration