Transcript data type

Data Model / Database
Implementation
Jeffery S. Horsburgh
Hydroinformatics
Fall 2014
This work was funded by National
Science Foundation Grants EPS 1135482
and EPS 1208732
Objectives
• Understand common database software
packages, limitations and capabilities
• Where are certain databases commonly used?
• Why and when would you use certain
databases?
• Introduce MySQL functionality
• Create a physical implementation of an
Observations Data Model (ODM) database
within a RDBMS
Where We are Going
• This week: Database implementation
– How do you physically implement a data model as a
database and load data
• Next week: Using Structured Query Language
(SQL)
– Using code to slice and dice your data once you have
it in a database
• Two weeks: Accessing databases from client
software
– Getting data from a database into Python
Steps in Data Model Design
1. Identify entities
2. Identify relationships among entities
3. Determine the cardinality and participation
of relationships
4. Designate keys / identifiers for entities
5. List attributes of entities
6. Identify constraints and business rules
7. Map 1-6 to a physical implementation
Physical Data Model
• The “physical” means a specific implementation
of the data model
– Choice of hardware and operating system
– Choice of relational database management system
– Implementation of tables, relationships, constraints,
triggers, indices, data types
– Database access and security
– Performance
– Storage
Summary of 3 Levels of Data Model Design
Feature
Conceptual
Logical
Entity Names
X
X
Entity Relationships
X
X
Physical
Attributes
X
Primary Keys
X
X
Foreign Keys
X
X
Table Names
X
Column Names
X
Column Data Types
X
Views
X
Stored Procedures
X
Triggers
X
Constraints
X
Relational Database Management Systems
• Robust software for managing data
• Data stored in tables that may be related
• Based on Structured Query Language (SQL)
– Adopted by the American National Standards
Institute (ANSI) and the International Standards
Organization (ISO) as the standard data access
language
Relational Database Management
Systems (RDBMS)
•
•
•
•
•
File vs. server based
Free vs. commercial
Different data types
Potentially different syntax for SQL queries
Security models and concurrent users
File Versus Server-Based RDBMS
• File-based
– Everything contained within a single file
– Generally good for single user, desktop applications
– Examples: SQLite, Microsoft Access
• Server-based
– A database “server” manages databases and all
transactions
– Good for multiple, simultaneous connections and
transactions
– Examples: Oracle, Microsoft SQL Server, MySQL,
PostgreSQL
The Beauty of Server-Based RDBMS
Data
Database
Authentication and Access Control
Yay!
I want some
Database ServerDo I know you?Do I recognize
data!
Are you your IP address?
With RDBMS
authorized?
Firewall
Simultaneous Data Users
Server-Based RDBMS – “Granules”
• Database Server
• Databases
• Tables
• Records
DB1
DB2
DB3
...
DBn
Site Variable
Date
Value
1 Temperature 8/2/2007 14:00 12.4
1 Temperature 8/2/2007 14:30 12.7
1 Temperature 8/2/2007 15:00 13.1
1
Temperature
8/2/2007 14:00 12.4
Who are the main contenders?
• Commercial software
– Sybase Adaptive Server Enterprise
– IBM DB2
– Oracle
– Microsoft SQL Server
– Teradata
• Free/GPL/Open Source:
– MySQL
– PostgreSQL
MySQL
• No limit on # of CPUs, fully featured, support
for several OSs
• $0 without Support, approx. $600-$6000 for
subscription-based support
• Highly scalable, provides support for most
types of indexes and storage methods
Things to Consider When
Deciding on a RDBMS
• Price
• Operating System Support
• Most important features
– ACID (Atomicity, Consistency, Isolation, Durability)
– Referential Integrity support
– Transactions support
• Scalability and security
• Availability of support and software
• Potential longevity
ACID Properties of Database
Transactions
• Atomicity – each transaction is “all or
nothing”
• Consistency – any transaction brings the
database from one valid state to another
• Isolation – concurrent execution of
transactions results in a system state that
would be obtained if they executed serially
• Durability – once a transaction has been
committed, it will remain so
Physical Implementation of a Data
Model within a RDBMS
1.
2.
3.
4.
Create a new database
Create tables for entities
Define attributes and data types
Create relationships and define their
properties
5. Define constraints
All of this can be scripted/automated using SQL
Reduction of an ER Diagram to Tables
• Converting an ER diagram to table format is
the basis for deriving a relational database
– Entities are expressed as tables that contain data
– A database is a collection of tables
– Tables are assigned the same name as the entity
– Each table has columns that correspond to
attributes – each column has a unique name
– Each column must have a single data type
Don’t forget: Forward engineering in MySQL Workbench does
this automatically
Data Types
• Each attribute of an entity (column in a
database table) must have a single data type
• Data types are enforced by RDBMS software
Table: DataValues
Attribute
Data Type
Sample Data
ValueID
Integer
1
SiteID
Integer
5
VariableID
Integer
5
DateTime
Date/Time
8/15/2013 4:30 PM
DataValue
Double
4.567
Data Types
• Data types can be specific to RDBMS software
RDBMS
Integer
Floating Point
Decimal
String
Date/Time
MS SQL Server
TINYINT,
SMALLINT, INT,
BIGINT
FLOAT, REAL
NUMERIC,
DECIMAL,
SMALLMONEY,
MONEY
CHAR,
VARCHAR,
TEXT, NCHAR,
NVARCHAR,
NTEXT
DATE, DATETIMEOFFSET,
DATETIME2,
SMALLDATETIME,
DATETIME, TIME
MySQL
TINYINT (8-bit),
SMALLINT (16bit), MEDIUMINT
(24-bit), INT (32bit), BIGINT (64bit)
FLOAT (32-bit),
DOUBLE (aka
REAL) (64-bit)
DECIMAL
CHAR,
BINARY,
VARCHAR,
VARBINARY,
TEXT,
TINYTEXT,
MEDIUMTEXT,
LONGTEXT
DATETIME, DATE,
TIMESTAMP, YEAR
PostgreSQL
SMALLINT (16bit), INTEGER (32bit), BIGINT (64bit)
REAL (32-bit),
DOUBLE
PRECISION
(64-bit)
DECIMAL,
NUMERIC
CHAR,
VARCHAR,
TEXT
DATE, TIME (with/without
TIMEZONE), TIMESTAMP
(with/without
TIMEZONE), INTERVAL
Quick summary from: http://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems
Data Types in MySQL
http://dev.mysql.com/doc/refman/5.7/en/data-types.html
• Numeric
– Integer types: bit, tinyint, smallint, mediumint, int, bigint
– Fixed-point types: decimal, numeric (exact numeric
values to preserve precision)
– Floating-point types: float, double (approximate values)
• Date and Time
– date, datetime, timestamp, time, year
• Strings
– char(size) – fixed length string, right padded
– varchar(size) – variable length string
– tinytext, text, mediumtext, longtext - text data block
Data Types in MySQL (2)
• Binary data
– binary(size) – binary byte strings
– varbinary(size)
– tinyblob, blob, mediumblob, longblob – binary large
object
• Spatial Data Types
– geometry
– point, multipoint
– linestring, multilinestring
– polygon, multipolygon
Data Types in MySQL (3)
• Nullable and NOT NULL types
– All types in SQL Server may or may
not allow NULL values
• Primary key columns
– Define the primary key
• Auto-increment columns
– Automatically increased values when a new row is
inserted (auto-increment values)
– Used in combination with primary key
Advanced Database Objects
•
•
•
•
Views
Stored procedures
Triggers
Constraints
• Implementation of these objects may depend
on your choice of RDBMS software
Database Views
• A View is equivalent to a table, but is defined
by a SQL query
• Used to present a set of desired information,
independent of the underlying database
structure
• Can be used to hide complexities of the
underlying data model from the user
– One way to address the cons of normalization
Stored Procedures
• A set of structured query language (SQL)
statements that are stored and executed on
the server
• Useful for repetitive tasks
• Encapsulate functionality and isolate users
from data tables
• Can provide a security layer – software
applications have no access to the database
directly, but can execute stored procedures
Triggers
• Special kind of stored procedure
• Automatically executes on a table or view
when an event occurs in the database
• Events include: CREATE, ALTER, INSERT,
UPDATE, DELETE
• Mostly used to maintain the integrity of
information in the database
Constraints
• Common way to enforce data integrity
• Examples:
– Not NULL – value in a column must not be NULL
– Unique – value(s) in specified column(s) must be
unique for each row in a table
– Primary Key – value(s) in the specified column(s) must
be unique for each row in the table and not be NULL
– Foreign Key – values(s) in the specified column(s)
must reference an existing record in another table via
its primary key
– Check – an expression that validates data and must
not be FALSE
MySQL Workbench
• MySQL Workbench is a powerful graphical DB
management tool
– Create and modify ER diagrams
– Administrate databases (create, modify, backup /
restore DB)
– Create/ view / modify table data and other database
objects
– Write and execute SQL queries
28
Authentication in MySQL
• Performed when you connect to the database server
• Establishes your identity
• Enables you to create accounts for MySQL Server without
giving access to the rest of the server
• “root” = system administrator access
Connecting to a MySQL Server
• Connecting to a MySQL Server requires:
– The name of the server machine or IP address and
port number (3306 is default)
– Username
– Password
30
DEMO: Connecting to Local and
Remote MySQL Servers
Creating a New Database
• In MySQL Workbench, click the “Create new schema” button
• In MySQL, “Schema” = “Database”
32
Creating a New Database (2)
• In the "New schema" tab enter the name for your new
schema, set the Default Collation, and click “Apply”
Character Set:
A set of symbols
and encodings
Collation: A set
of rules for
comparing
characters in a
set
33
DEMO: Creating and Deleting
Databases in MySQL
Creating Tables
• Expand database view in the schemas list
• Right click on “Tables” and select “Create Table”
35
Creating Tables (2)
• Enter table name and define the table columns
(name, data type, and properties):
Enter the name of
Choose the data type of
the column here
the column here
36
Choose properties here
Creating Tables (3)
• Defining column properties
– PK – the column is the primary key for the table
– NN – the column is not nullable
– UQ – Unique Index, the values in the column must be
unique
– BIN – the values in the column are binary
– UN – the values in the column are unsigned
– ZF – Zero fill, integer values will be padded with leading
zeros for display
– AI – the values in the column should be set automatically
using an auto-increment (cannot be assigned manually)
37
Creating Tables (4)
• It is best practice to set the name of the table at the
time it is created
Enter the name of
the table here
38
DEMO: Creating and
Deleting Tables
39
Creating Relationships
• Alter child table to add foreign key attribute
• Alter parent table to add foreign key relationship
Create and
name the
relationship
Choose the
referenced
table
Match the
primary and
foreign key
attributes
40
Choose what
to do on
update and
delete
DEMO: Creating Relationships
41
Naming Conventions
• Tables
– Each word is capitalized (Pascal Case) –
• But wait - MySQL’s default settings are to lower case
database and table names!!!
– In English, plural
– Examples: Sites, Variables, DataValues
• Columns
– In English, singular
– Each word is capitalized (Pascal Case)
– Avoid reserved words and characters (e.g., key, int,
date)
– Examples: SiteID, SiteCode, SiteName
42
Naming Conventions (2)
• Primary key
– Use "ID" or name_of_the_table + "ID"
– Example: in the Sites table the PK column should
be be called ID or SiteID
• Foreign key
– Use the name of the referenced table + "ID"
– Example: in the DataValues table the foreign key
column that references the Sites table should be
named SiteID
43
Naming Conventions (3)
• Relationship names (constraints)
– In English, Pascal Case
– "fk_" + ParentTable + "_" + ChildTable
– Example: fk_Sites_DataValues
• View names
– "v_" + "name"
– Example: v_SitesWithData
• Stored procedures names
– "sp_" + "name"
– Example: sp_UpdateSeriesCatalog()
44
Authorization in MySQL
• Assigning specific permissions to specific users
for specific database objects
DEMO: Creating Database Users
and Assigning Permissions in MySQL
Summary
• Physical database implementation requires
choices about hardware, software, security,
data types, formats and storage, and other
factors
• RDBMS provide the ability to implement a
database in a multi-user, server environment
• MySQL Workbench provides tools for
database creation, editing, and administration
– including forward and reverse engineering