Databasebasics2014

Download Report

Transcript Databasebasics2014

CTFS Asia Region Workshop 2014
Shameema Esufali
Suzanne Lao
Data coordinators and technical resources for
the network
[email protected]
[email protected]
CTFS Workshop
Relational database basics

Why relational databases?

Why MySQL?

What about R?
From an input sheet to a database
What is a database?
Why do we need to convert our input sheet/
excel spreadsheet to a database?
What are the advantages and
disadvantages?
How does a data entry program help?
Input form / Excel sheet
How best to store data for

Accuracy

Easy retrieval
Relational Theory
In order to work with MySQL it is necessary to
understand the basics of relational theory.
i.e how and why data is stored and
managed in a relational database.
The guiding principle behind a relational
database is to store data once and only
once.
What is a Relation?
A table. Columns are fields (attributes) of data
related to other fields on the
same row (tuple).
Primary Key



Identifies the row of a table without
duplicates.
Tells you what the row contains
Eg. If treeid is the primary key then the row
has information about that tree
Candidate Primary Key



Any attribute(s) which together would serve as
the primary key.
Must uniquely identify a row of data.
Each part of the key must be essential to unique
identification. No redundancy.
Foreign Key

A foreign key is a column in a table that
matches the primary key column of another
table. Its function is to link the basic data of
two entities on demand, i.e. when two tables are
joined using the common key.
First Normal Form
One piece of information per column. No repeated rows. Eliminate fused
data eg Code1,Code2
Wrong!
Tag
Species
Code
1234
SHORME
A, BA
Tag
Species
Code
1234
SHORME
A
1234
SHORME
BA
Right
Second Normal Form
Each column depends on the entire primary key.
Wrong
Tag
Census
1234 1
Tag
Species
Seedsize X
SHORTR
Medium
Species
1234 SHORTR
Seedsize X
Medium
Y
DBH
11.3 15.4 12
Y
11.3 15.4
Right
Third Normal Form
Each column depends ONLY on the primary key. i.e.
there are no transitive dependencies
Wrong
Tag
Species
Seedsize
X
1234
SHORTR
Medium
11.3 15.4
Right
Tag
Species
X
Y
1234
SHORTR
11.3 15.4
Y
Fourth Normal Form
The table must contain no more than one
multi-valued dependency
Tag
DBH
Cod
e
1234
10
A
1234
11
A
1234
11
BA
Entity Relationship diagram (ERD)

Shows in a diagram how entities (tables) are
related to one another.

One to One

One to many

Many to many
One to one


Extension of number of attributes in a single
table
Rarely required
Tree
More tree
attributes
One to Many

Most
common
Child
Parent


Requires two
tables.
Linked by
Foreign Key
Family
Genus
Species
Many to many

Need to break down
to one to many
Measurement
Tree


Requires three tables
Associative table
provides common key
Measurement
Code
Code
Reassembling data



Data was broken down into tables to preserve
integrity
How can we put it together to derive
information?
Use Structured Query Language (SQL) to JOIN
tables using a common attribute
Joins


Two tables may be
joined when they
share at least one
common attribute
The Primary key of
the Parent table is
stored in the Child
table as a cross
reference. This is
called a Foreign Key.
Primary Key in Parent
GenusID
Genus
FamilyID
1
Acacia
4
2
Acalypha
3
3
Adelia
3
4
Aegiphila
3
5
Alchornea
3
SpeciesID
Species
Foreign Key
in Child
GenusId
1
melanoceras
1
2
diversifolia
2
3
macrostachya
2
4
triloba
3
5
panamensis
4
6
costaricensis
5
7
latifolia
5
Table joined on Foreign Key
GenusID
SpeciesID
1
2
3
4
5
6
7
Species
melanoceras
diversifolia
macrostachya
triloba
panamensis
costaricensis
latifolia
GenusId
1
2
2
3
4
5
5
⇿
⇿
⇿
⇿
⇿
⇿
⇿
GenusID
1
2
2
3
4
5
5
Genus
Acacia
Acalypha
Acalypha
Adelia
Aegiphila
Alchornea
Alchornea
The Genus ID in the Species table is used to pick
up information for the corresponding Genus. It
looks for a row with the matching Primary Key
FamilyID
4
3
3
3
3
3
3
Extend to join many tables

With SQL you can join as many tables as you
need to in order to get the set of information
you need. Thus the previous example can be
extended to include Family which is a parent
table of Genus and/or extended in the another
direction to include Tree which is a child of
Species as long as there is a linking attribute.
This attribute is called a Foreign Key.