Geog 495 GIS Database Design

Download Report

Transcript Geog 495 GIS Database Design

Geog 495
GIS Database Design
Midterm review
Outlines
1.
2.
3.
4.
5.
6.
Database Concepts
Relational Database
Object-oriented Database
Entity-Relationship Diagram
Unified Modeling Language
Normalization
1. Database concepts
•
•
•
•
•
•
•
Data vs. Information
Data vs. Database
DBMS vs. Database system
Level of abstraction
Data independence
Different database models
Architecture of database system
Data vs. Information
• Data: raw fact
• Information: data processed to reveal
meaning
• Database system transforms data into
information through queries
Data vs. Database
• Database is a collection of related data
• Database stores data in an organized
manner (i.e. minimal data redundancy)
• Database exists, first and the foremost, to
serve users’ requirements
DBMS vs. DB system
• DBMS allows users to access data (i.e.
interface between users and database)
• DB system is composed of DBMS, people,
database, and procedures
Different types of DB system
• Number of users
–
–
–
–
Single-user
Multi-user
Workgroup
Enterprise
• Database site location
– Centralized
– Distributed
• Database use
– Transactional (production)
– Data warehouse
Level of abstraction
Real-world view
Abstract
• Conceptual: independent of s/w and h/w
– How humans see the world
• Logical: s/w dependent
– How programs see the world
• Physical: h/w dependent
– How machines see the world
Machine code
Concrete
Data independence
• Separation of program from data
• Program’s ability to retrieve data without changing the
structure of code
• Logical data independence
– Program’s ability to retrieve data without changing the structure
of s/w-specific code
– When a system uses 4GL language (or non-procedural language)
• Physical data independence
– Program’s ability to retrieve data without changing the structure
of h/w-specific code
– When a system uses <= 3GL language (or procedural language)
Database models
•
•
•
•
Hierarchical
Network
Relational
Object-oriented
Hierarchical DB model
• The world is represented with tree-like
structure
• Only one-to-many relationships (i.e.
parent-child relationship) are allowed
• Relationships between entities are built
through reference point (e.g. pointer)
• Logical data independence (yes)
• Physical data independence (no)
Network DB model
• The world is represented with web-like
structure
• Many-to-many relationships are allowed
• Relationships between entities are built
through reference point (e.g. pointer)
• Logical data independence (yes)
• Physical data independence (no)
Relational DB model
• The world is represented with entities and
relationships
• M:N relationships are conceptually allowed,
but implemented through transformation
into 1:M relationship (e.g. composite
entities (a.k.a. bridge entities))
• Relationships are built through key
• Logical data independence (yes)
• Physical data independence (yes)
Object-oriented DB model
• The world is represented with a collection
of objects
• Object embeds attribute, operation and
relationships
• Complex objects can be represented
through abstract data type
• Embody OO concepts such as
encapsulation, inheritance and
polymorphism
Architecture of DB system
1. External view
user’s view, local (incomplete)
2. Conceptual/logical view
designer’s view, global
3. Internal view
Programmer’s view
•
Physical view
Implementation view, contains most details
•
•
Organized by level of abstraction
Data independence is embodied by the proper
separation between four layers
2. Relational DB
•
•
•
•
•
Representation
Table
Key
Integrity rules
Relationships
Representation
• The world are viewed as entities and
relationships
• Entities are modeled as table
• Relationships are built through common
attributes between entities
Table
• Row represents a single entity
• Column represents attribute
• Cell represents a single value in the
intersection between row and column
Key
• One or more attributes (columns) that
determines other attributes
• Primary key
– Uniquely identifies entity (should be unique)
– Should be Not Null (non-empty)
• Foreign key
– Common attributes that link one table to another
tables
– Placed in M side table in reference to primary key to 1
side table
Integrity rules
• Entity integrity
– Each entity (record) must be uniquely
identified (by primary key: PK)
– PK should be Not Null for entity integrity to be
enforced
• Referential integrity
– One table must reference another table
properly (by foreign key: FK)
– FK should be Not Null for referential integrity
to be enforced
Relationships
• M:N
– Yields data redundancy
– Composite (or bridge) entities are needed to
transform into 1:M
3. Object-oriented DB
• Object
• Difference between object in OODB and
entity in RDB
• Object and class
• OO concepts
Object
• Object has
– OID (identity): system-generated
– Instance variables (attributes): ADT allowed,
thus the representation of complex entities
are possible
– Methods (operations): make objects act upon
them, thus entities become autonomous
Objects (OODB) vs. Entities (RDB)
Objects, unlike entities
• Identity is not state-dependent
– Because OID is system-generated
• Relationship is embedded in the object
– Because objects store the reference to other
objects in themselves
• Autonomous
– Because objects can use methods stored in
class
Object and Class
• Class is a collection of objects with similar
attributes and behaviors
• Object is an instance of class from
conceptual point of view
• Class is an instantiation of objects from
implementation point of view (i.e. Object is
implemented through class: e.g. object
uses the methods stored in class)
• Objects are organized by class hierarchy
OO Concepts
• Encapsulation
– information can be selectively hidden
– enhances integrity
• Inheritance
– subclass can inherit common properties from
superclass
– enhances modularity
• Polymorphism
– operation can take many forms depending on
characteristics (through method overriding)
– enhances flexibility
4. Entity-Relationship Diagram
• Attributes
• Relationships
–
–
–
–
–
Connectivity
Cardinality
Participation (optionality)
Strength
Degree
• Entities
– Composite
– Weak
– Subtype/supertype
Attribute
• Simple/Composite
– Simple: cannot be subdivided (e.g. Sex)
– Composite: can be subdivided (e.g. Name = First
Name + Last Name; Address = street + city + state +
zipcode)
• Single-valued/Multi-valued
– Single-valued: can have a single value (e.g. age)
– Multi-valued: can have multiple values (e.g.
educational attainment: # degree can differ by
persons; address: you can live in many different
places such as permanent address, local address,
vacation home, and so on)
Relationships
• Connectivity: 1:1, 1:M, M:N
• Cardinalities: the number of entity
occurrence associated with another entity
• Participation: optional/mandatory,
determined by cardinalities
• Strength: existence-dependency + PK
derived from other table
• Degree: the number of entities associated
with the relationship
Entity
• Composite (bridge)
– entity that represents relationship between entities
(e.g. enrollment)
• Weak
– when the relationship is strong (e.g. dependent)
• Supertype/subtype
– characteristics of subtype entities are generalizable
from supertype entities (e.g. employee/secretary)
ERD notations
5. UML
•
•
•
•
What is UML?
Why UML?
UML Diagrams
Class Diagram
What is UML?
• Standardized modeling language for OO system
design & analysis
• UML notation 1.0 was formed in 1996, version
2.0 as of 2005
• Graphic notations: in between natural language
(too imprecise) and programming language (too
precise thus too much details)
• Use different diagrams depending on different
perspectives (conceptual, logical, physical)
Why UML?
• Let’s make OO system design unified
• Let’s make OO system design visual and easyto-learn
• Let’s make OO system design independent of
different programming languages
• Let’s promote good things about OO principles
– Modularity/code reusability
• Let’s make system extensible
– Stereotype, tagged value, constraint
• Let’s make model interchange easier
– XMI (XML Modeling Interchange)
UML Diagrams
• Behavior diagram
– Describe behavioral aspect of system
– Use case diagram, activity diagram
• Structure diagram
– Show the static structure of the model
– Class diagram, package diagram
– Component diagram, deployment diagram
• Interaction diagram
– Represents different aspects of interaction
– Sequence diagram, collaboration diagram
Class Diagram (overview)
• Shows the static structure of objectoriented database or database that is
implemented in OO system
• Equivalent to ERD with some differences
such as operation and more semantics on
relationships
• Can be seen from three different
perspectives (conceptual, specification,
implementation)
Class Diagram: class
• Class is represented as three-part
compartments (name, attribute, operation)
• Naming notation of attribute
– [visibility] name: data type = [initial-value]
• Naming notation of operations
– [visibility] name (parameter-list: data type):
[return value type]
Class Diagram: relationship
•
•
•
•
•
Association
Aggregation: part-whole relation
Composition: strong form of aggregation
Generalization: general/unique properties
Dependency: implementation of one class is
dependent on another class
• Multiplicities: # participants associated with
relationship
• Navigability: shows the direction of navigation
between classes
Class Diagram: relationship:
notation
6. Normalization
•
•
•
•
What is normalization?
1NF
2NF
3NF
What is normalization?
• Process for correcting table structure to
minimize data redundancies
• Usually follows three-step procedures:
conversion 1NF  2NF  3NF
• Operated by functional dependency between
attributes
• Two types of functional dependencies
– Partial dependency: nonkey attributes are dependent
on a part of composite PK
– Transitive dependency: nonkey attributes are
dependent on another nonkey attributes
First Normal Form (1NF)
• No repeating groups
• PK is defined
Second Normal Form (2NF)
• Table is in 1NF
• No partial dependency
Third Normal Form (3NF)
• Table is in 2NF
• No transitive dependency
Questions & concerns?
• Please talk to me
• Take advantage of office hours (Wed 2:304:30): tutorial is the best way to
communicate I believe.
• Please do some reading before you show
up in the class.