Basics of data management
Introduction to Data Management
Chapter 1, Pratt & Adamski
Data and Information
DATA: Facts concerning people, objects, events or
other entities. Databases store data.
INFORMATION: Data presented in a form
suitable for interpretation.
Data is converted into information by programs
and queries. Data may be stored in files or in
databases. Neither one stores information.
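A minimal sketch of a query turning stored data into information, using Python's built-in sqlite3 module. The sale table and its values are hypothetical and only illustrate the idea.

```python
import sqlite3

# Hypothetical raw data: individual sales facts (table name and values are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sale (region, amount) VALUES (?, ?)",
    [("East", 120.0), ("West", 80.0), ("East", 200.0), ("West", 50.0)],
)

# The query converts the stored data into information:
# one total per region, in a form that can be interpreted.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sale GROUP BY region ORDER BY region"
).fetchall()

for region, total in totals:
    print(f"{region}: {total:.2f}")

conn.close()
```

The database holds only the four sale rows; the per-region totals exist only once the query runs.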
KNOWLEDGE: Insights into appropriate actions
based on interpreted data.
Knowledge Generation
DATA
INFORMATION
Basic Principles
DATABASE: A shared collection of interrelated
data designed to meet the varied information
needs of an organization.
DATABASE MANAGEMENT SYSTEM: A
collection of programs to create and maintain a
database.
Define
Construct
Manipulate
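The three functions above can be sketched with SQLite through Python's sqlite3 module. The student table is a hypothetical example, and mapping each statement to one function is an illustration rather than a formal definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define: describe the structure of the data (hypothetical student table).
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, credits INTEGER)"
)

# Construct: load data into the defined structure.
conn.executemany(
    "INSERT INTO student (student_id, name, credits) VALUES (?, ?, ?)",
    [(1, "Ann", 30), (2, "Bob", 45)],
)

# Manipulate: update and query the stored data.
conn.execute("UPDATE student SET credits = credits + 3 WHERE student_id = 1")
rows = conn.execute(
    "SELECT name, credits FROM student ORDER BY student_id"
).fetchall()
print(rows)

conn.close()
```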
Advantages of Database Processing
More information from same data
Shared data
Balancing conflicts among users
Controlled redundancy
Consistency
Integrity
Security
Increased productivity
Data independence
Disadvantages of Database Processing
Increased size
Increased complexity
More expensive personnel
Increased impact of failure
Difficulty of recovery
Cost
Especially server and mainframe systems
Objectives of the DBMS Approach
SELF-DESCRIBING
DATA INDEPENDENCE
MULTIPLE VIEWS
MULTIPLE USERS
What is a Database Management System?
Data Files
Directory
Access Engine
Utility Programs
Database
DATA
METADATA
ACCESS ENGINE
UTILITIES
Files and Databases
Metadata
“Data about data”
Description of fields
Display and format instructions
Structure of files and tables
Security and access rules
Triggers and operational rules
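As one illustration of metadata, SQLite keeps a catalog that can be queried like any other data. The sketch below assumes a hypothetical instructor table and reads back the table list and the field descriptions through Python's sqlite3 module. Other database systems expose their metadata differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (instructor_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# Structure of files and tables: SQLite records each table in its sqlite_master catalog.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Description of fields: PRAGMA table_info returns one row per column,
# as (cid, name, type, notnull, dflt_value, pk).
fields = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(instructor)")
]

print(tables)
print(fields)
conn.close()
```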
Database Access
USER
INTERFACE
DATABASE
PROGRAM
History of Database Management
File Management Systems
Hierarchical Model
IBM “Information Management System (IMS)” 1966
Network Model
Charles Bachman’s “Integrated Data Store (IDS)” 1965
Conference on Data Systems Languages / Data Base Task
Group (CODASYL/DBTG), 1971
Relational Model
E.F. Codd, 1970
File Management Systems
Provided facilities to extract data and
share files, but did not implement any
way to connect records in one file to
those in another. Relationships had to be
implemented in application code.
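A rough sketch of what "relationships implemented in application code" can look like. Two in-memory CSV files stand in for separate data files; the file contents and field names are hypothetical.

```python
import csv
import io

# Two independent "files" (hypothetical contents, held in memory for this sketch).
student_file = io.StringIO("student_id,name,college_id\n1,Ann,C1\n2,Bob,C2\n")
college_file = io.StringIO("college_id,college_name\nC1,Business\nC2,Engineering\n")

# Nothing in the files says how they relate, so the program builds the
# connection itself: index one file, then look up each record of the other.
colleges = {
    row["college_id"]: row["college_name"] for row in csv.DictReader(college_file)
}
report = [
    (row["name"], colleges[row["college_id"]]) for row in csv.DictReader(student_file)
]

print(report)
```

Every program that needs this connection has to repeat the same lookup logic, which is the limitation the later database models address.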
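Database vs File Systems
FILE SYSTEM: Program 1, Program 2 and Program 3 each carry their own Meta-Data and work against the Data.
DATABASE: Program 1, Program 2 and Program 3 carry no Meta-Data of their own; the MetaData is stored in the database together with the Data.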
Structured Databases
Relationships were implemented by
physical pointers (called “sets”) which
allowed records to be connected in
different files. Hierarchical databases
allow only one parent set; networks allow
several. These permit efficient processing
but the sets must be constructed on data
entry and cannot be rearranged later.
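A loose analogy in Python, where object references stand in for physical pointers. The Record class and the store function are hypothetical and do not reproduce the behavior of IMS or IDS; they only show links being fixed when a record is entered, and the one-parent rule of the hierarchical case.

```python
class Record:
    """A record whose links to other records are stored as direct references."""

    def __init__(self, name):
        self.name = name
        self.owner = None   # single parent, as in the hierarchical case
        self.members = []   # child records belonging to this record's set


def store(owner, member):
    """Link a member record to its owner at data-entry time."""
    if member.owner is not None:
        raise ValueError(f"{member.name} already has a parent set")
    member.owner = owner
    owner.members.append(member)


college = Record("COLLEGE")
section = Record("SECTION")
student = Record("STUDENT")
store(college, student)

# Following a stored link needs no searching or matching of values.
members = [m.name for m in college.members]

# A second parent is rejected under the one-parent rule.
try:
    store(section, student)
    second_parent_allowed = True
except ValueError:
    second_parent_allowed = False

print(members, second_parent_allowed)
```

A network-style variant would keep a list of owners instead of a single owner field.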
Relational Models
Relational models implement relationships
with matched data values in related files
(called primary and foreign keys). Any
attributes can be matched. The
connection is established at retrieval so
interconnections can be developed as
needed.
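A minimal sketch of matching key values at retrieval, using SQLite through Python's sqlite3 module. The tables, column names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE college (college_key TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE student (
        student_key INTEGER PRIMARY KEY,
        name TEXT,
        college_key TEXT REFERENCES college (college_key)  -- foreign key
    );
    INSERT INTO college VALUES ('C1', 'Business'), ('C2', 'Engineering');
    INSERT INTO student VALUES (1, 'Ann', 'C1'), (2, 'Bob', 'C2'), (3, 'Cy', 'C1');
    """
)

# No stored pointer links the two tables. The connection is made at
# retrieval time by matching key values in the join condition.
joined = conn.execute(
    """
    SELECT s.name, c.name
    FROM student AS s
    JOIN college AS c ON s.college_key = c.college_key
    ORDER BY s.student_key
    """
).fetchall()

print(joined)
conn.close()
```

Because the match is written in the query, a different query can connect the same tables on other columns without restructuring the stored data.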
Hierarchy
SECTION
STUDENT
COLLEGE
INSTRUCTOR
COLLEGE
Each file can have only one parent. To implement a second
“parent” (COLLEGE) we have to implement a shadow copy.
Network
SECTION
STUDENT
INSTRUCTOR
COLLEGE
Each file can have several parents. Both SECTION and
COLLEGE are “parent” files.
Relational
SECTION
SECTION-STUDENT (SECTION-KEY, STUDENT-KEY)
SECTION-INSTRUCTOR (SECTION-KEY, INSTRUCTOR-KEY)
STUDENT (COLLEGE-KEY)
INSTRUCTOR (COLLEGE-KEY)
COLLEGE
Each file can have several parents. Both SECTION and
COLLEGE are “parent” files.
Relational Terminology
Entity
Person, place, thing or event about which we
wish to keep data
Attribute
property of an entity
Relationship
an association among entities (entity
records)
KERR MCGEE’S LIFE CYCLE
STAGE
PROCESS MODEL
DATA MODEL
Initialization
Report
Report
Feasibility
Report
High Level DFD
Process Analysis
(Business Chart)
High Level E/R Diagram
Requirements
General
DFD
High Level Dictionary
Top Down E/R
File Specifications
Requirements
Logical
DFD
Data Dictionary
File Specifications
Process Logic
Bottom Up E/R
Action Diagrams
System Design
Structure Charts
Module IPO Specification
Screen/Report Layouts
Cleanup
Volume/Usage Analysis
Physical Schema
Index/Record Specs
Coding/Testing
Test Plan
Logs and Documentation
Code
Implementation
Installation Plan
Population Plan
Data Management
Designing and managing information in a
database environment requires:
Understanding the principles of data
modeling in system design.
Using SQL for data manipulation.
Understanding the concepts of managing
data in a database environment.
Information System Modeling Approaches
PROCESS MODELING: The traditional method of
designing systems by following the changes to data
flows.
DATA MODELING: An approach to system development
that specifies the file structure that conforms to the
things important to the organization.
PROTOTYPING: An iterative approach that focuses on
building small operating
OBJECT MODELING (Event driven design):
Defines objects that contain data and associated
processing rules encapsulated together.