Transcript table

Databases and DBMS
General Concepts
• Database
– Organized collection of logically related data
• Data
– Known facts
– Types: text, figures, graphics, images, sound,
videos…
• Database management system (DBMS)
– Software package for defining and managing
a database
Databases and DBMS
Table
• A table is the primary unit of physical
storage for data in a database.
• Usually a database contains more than
one table.
Databases and DBMS
Field
Record
Primary key
Databases and DBMS
Metadata
• Data that describes the properties or characteristics of
other data
• Allows database designers and users to understand the
meaning of the data
Name
Type
Size
Description
REL_NUM
Integer
R_LOCATION
Text
R_SURV_DATE
Date/Time
R_SURVEYORS
Text
R_LATITUDE
Double
8 Northing, UTM
R_LONGITUDE
Double
8 Easting, UTM
R_PHOTO_N
Text
10 Aerial photo #
R_TOPOG_MAP
Text
20 Topographic map
2 Relevé number
255 Location
8 Date
50 Initials of surveyors
Databases and DBMS
Relational Database
• Data are organized into two or more tables
• Relationships between entities are represented
by values stored in the columns of the
corresponding tables (keys)
• Schema: a view that inter-relates tables to each
other
• Accessible through Standard Query Language
(SQL)
• Query: set of instructions to a database “engine”
to retrieve, sort and format returning data
Databases and DBMS
Schema
Schema
Databases and DBMS
SQL
• Structured Query Language
• SQL statements are used to retrieve and update data in
a database
• Syntax for executing queries, updating, inserting, and
deleting records.
–
–
–
–
SELECT - extracts data from one or more table
INSERT INTO - inserts new data
UPDATE - updates data
DELETE FROM - deletes data
Databases and DBMS
Example
SELECT tab_releve.* FROM tab_releve INNER JOIN tab_site ON
tab_releve.REL_NUM = tab_site.REL_NUM WHERE
tab_site.SG_ELEVATION > 2000
SELECT tab_releve.*
FROM tab_releve
INNER JOIN tab_site
ON tab_releve.REL_NUM = tab_site.REL_NUM
WHERE tab_site.SG_ELEVATION > 2000
Databases and DBMS
DBMS
• Software package for defining and managing a
database (aka DB “engine”)
• Examples:
– Proprietary: MS SQL Server, DB2, Oracle,
Sybase
– Open source: MySql, PostgreSQL
Databases and DBMS
What is Microsoft Access?
• “It is a Database”
• No! Microsoft Access is an application
development tool, not a Database Management
System (DBMS)
• Access has some DBMS-like features
• Strictly, these are the features of a software
component called Jet, which happens to be
bundled with Access and other Microsoft
products
Databases and DBMS
DBMS Advantages
• Enforced standards
• Increased productivity of application
development
– centralization
– availability
• Minimal data redundancy
• Improved data consistency and quality
– access control
– transaction control
– data independence
Databases and DBMS
Normalization
• A method for organizing data elements
into tables
• Done in order to avoid
– Duplication of data
– Insert anomaly
– Delete anomaly
– Update anomaly
Database normalization
Inconsistency
Name
Country
Sport1
Fee1
Sport2
Fee2
Amado Adalberto LOPEZ
BAUTISTA
Guatemala
Football
35
Golf
80
115
Sabil Damiao MANDALA
Mozambique
Swimming
60
Tennis
50
110
Egypt
Tennis
50
Basket
30
80
Italy
Football
30
Senegal
Golf
80
Basket
30
Italy
Football
30
Golf
80
Ahmed Mohamed Harb RABIA
Marta RONCHI
Alfred Kouly TINE
Cinzia LANFREDI SOFIA
Redundancy
Ambiguity
Sport3
Fee3
Total
30
Hockey
75
185
110
Redundancy
Database normalization
Name
Country
Sport1
Fee1
Sport2
Fee2
Amado Adalberto LOPEZ
BAUTISTA
Guatemala
Football
35
Golf
80
115
Sabil Damiao MANDALA
Mozambique
Swimming
60
Tennis
50
110
Egypt
Tennis
50
Basket
30
80
Italy
Football
30
Senegal
Golf
80
Basket
30
Italy
Football
30
Golf
80
Ahmed Mohamed Harb RABIA
Marta RONCHI
Alfred Kouly TINE
Cinzia LANFREDI SOFIA
Sport3
Fee3
30
Hockey
75
185
110
First Name
Second Name
Name
Family Name
Country
Sport1
Fee1
Sport2
Fee2
Sport3
Unpack the data
•
Reduce elements to non-decomposable (atomic) values
Total
Fee3
Database normalization
First Name
Second Name
Family Name
Country
Sport1
Fee1
Sport2
Fee2
Amado
Adalberto
LOPEZ BAUTISTA
Guatemala
Football
35
Golf
80
115
Sabil
Damiao
MANDALA
Mozambique
Swimming
60
Tennis
50
110
Ahmed
Mohamed Harb
RABIA
Egypt
Tennis
50
Basket
30
80
Italy
Football
30
Senegal
Golf
80
Basket
30
Italy
Football
30
Golf
80
RONCHI
Marta
Alfred
Cinzia
Kouly
TINE
LANFREDI SOFIA
Sport3
Fee3
Total
30
Hockey
75
110
First Name
Second Name
Family Name
Country
Sport1
Sport
Fee
Fee1
Sport2
Fee2
Sport3
Fee3
Total
First Normal Form (1NF)
1. Eliminate duplicative columns from the same table.
2. Create separate tables for each group of related data and identify each row
with a unique column or set of columns (the primary key).
185
Database normalization
Id_student
First Name
Second Name
1
Amado
Adalberto
LOPEZ BAUTISTA
2
Sabil
Damiao
MANDALA
3
Ahmed
Mohamed Harb
4
Marta
5
Alfred
6
Cinzia
Id_student
Id_sport
Family Name
Country
Total
Guatemala
115
Mozambique
110
RABIA
Egypt
80
RONCHI
Italy
30
Senegal
185
Italy
110
TINE
Kouly
LANFREDI SOFIA
Id_sport
Sport
Fee
1
Football
30
2
Swimming
60
3
Tennis
50
4
Golf
80
5
Basket
30
6
Hockey
75
Second Normal Form (2NF)
1. Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
2. Create relationships between these new tables and their predecessors through the use of foreign keys.
Database normalization
Id_student
First Name
Second Name
1
Amado
Adalberto
2
Sabil
3
Ahmed
4
Marta
5
Alfred
6
Cinzia
Id_student
Id_sport
Country
Total
LOPEZ BAUTISTA
1
115
Damiao
MANDALA
2
110
Mohamed Harb
RABIA
3
80
RONCHI
4
30
TINE
5
185
LANFREDI SOFIA
4
110
Kouly
Family Name
Id_sport
Sport
Fee
Id_country
Country
1
Football
30
1
Guatemala
2
Swimming
60
2
Mozambique
3
Tennis
50
4
Golf
80
3
Egypt
5
Basket
30
4
Italy
6
Hockey
75
5
Senegal
Third Normal Form (3NF)
•
Remove columns that are not fully dependent upon the primary key.
Database normalization
Id_student
First Name
Second Name
1
Amado
Adalberto
LOPEZ BAUTISTA
1
2
Sabil
Damiao
MANDALA
2
3
Ahmed
Mohamed Harb
RABIA
3
4
Marta
RONCHI
4
5
Alfred
TINE
5
6
Cinzia
LANFREDI SOFIA
4
Id_student
Id_sport
Kouly
Family Name
Country
Id_sport
Sport
Fee
Id_country
Country
1
Football
30
1
Guatemala
2
Swimming
60
2
Mozambique
3
Tennis
50
4
Golf
80
3
Egypt
5
Basket
30
4
Italy
6
Hockey
75
5
Senegal
Databases and DBMS
DBMS Advantages
• Enforced standards
• Increased productivity of application
development
• Minimal data redundancy
• Improved data consistency and quality
• Program-data independence
Databases and DBMS
DBMS Advantages
• Enforced standards
• Increased productivity of application
development
• Minimal data redundancy
• Improved data consistency and quality
• Program-data independence
• Improved accessibility and data sharing
Databases and DBMS
PostgreSQL
• PostgreSQL is an object-relational database
management system (DBMS) based on POSTGRES
developed at the University of California at Berkeley
Computer Science Department.
• POSTGRES pioneered many concepts that only became
available in some commercial database systems much
later.
• PostgreSQL is an open-source descendant of this
original Berkeley code. It supports a large part of the
SQL standard, offers many modern features and can be
extended by the user in many ways.
• The official manual is more than 2700 pages...
Databases and DBMS
PostgreSQL
• A server process, which manages the database files,
accepts connections to the database from client
applications, and performs database actions on behalf of
the clients. The database server program is called
postgres.
• The user’s client (frontend) application that wants to
perform database operations. Client applications can be
very diverse in nature: a client could be a text-oriented
tool, a graphical application, a web server that accesses
the database to display web pages, or a specialized
database maintenance tool.
• Some client applications are supplied with the
PostgreSQL distribution; most are developed by users.
Databases and DBMS
http://db-engines.com/en/