access tervuren
Download
Report
Transcript access tervuren
Access Primer
Africamuseum
5 June 2013
MS Access
Relational Database Management
System
Data/information resides in series of related
tables
Principle of normalisation
Often ‘client-server’
End-user interface independent from
‘engine’
Access elements
Tables: to store data
Queries: to ask questions from database
Forms: to create a user interface
Macros: to automate repetitive tasks
Modules: create non-standard
functionality, using Visual Basic
Tables
Must be defined before they can be used
Vaguely resemble Excel sheets but:
Unit of information is a ‘record’ (line)
Consists of ‘fields’
Type must be chosen
All fields of a record remain together
No restriction on the number of lines
Saved automatically as soon as one moves to another
record
Excel: 64,000 lines
Order of records is not fixed!
Types of fields
Text (length; max = 255)
Memo
Integer (normal or long)
Autonumber
Real (single or double)
Date
Boolean (yes/no)
Indices
Facilitate sorting and selective extraction
of records
Special index: primary key
unique identifier for a record
often ‘autonumber’
Other indices:
Unique or not
Slow data entry down
Hands-on
Create table to store names and
addresses of students
Experiment with field types
‘Populate’ table – create records and
provide input
Play with facilities to display table
Filter, sort
Import text (or other) files
Download data from for example
data.gbif.org
Don’t select too many data!
Check data through text editor
Then we’ll import the data in our
database
Hands-on
Download data from for example
data.gbif.org
Don’t select too many data!
Check data through text editor
Play with filter, sort…
Normalisation
All data should be in the database once
and only once
Repeated data go to separate table
Relationships between the tables are part of
the ‘model’ of the database
Link between tables defined on pair of fields, one
in each table, that are of same type
Records of the two tables are ‘related’ if value in
the link fields is the same
Later more!!
Non-normalised Table
Species
Asterias rubens
Asterias rubens
Asterias rubens
Cancer pagurus
Cancer pagurus
Cancer pagurus
Asterias rubens
Asterias rubens
Cancer pagurus
# legs
5
5
5
10
10
10
5
5
10
# eyes
0
0
0
2
2
2
0
0
2
place Country
Oostende
Belgium
Zeebrugge
Belgium
Zeebrugge
Belgium
De Panne
Belgium
Oostende
Belgium
Zeebrugge
Belgium
Wimereux
France
Wimereux
France
Wimereux
France
date
12/3/2004
13/3/2005
14/3/2005
12/3/2004
12/3/2004
14/3/2004
13/3/2005
14/3/2005
12/3/2004
Normalisation
Table with morphological characters
fields: species, number of legs, number of eyes
Table with distribution records
fields: species, place, date
The field ‘Species’ is common to the two
tables, and makes it possible to link records
between two tables
Creating relations
‘one to many’ relations
Defined on fields of the same type
In case of text fields: same length
On the many side: primary key
Helps in maintaining the integrity of the data
Enforce presence of a ‘related record’ in the ‘related
table’
Avoid spelling variations (Oostende, Ostend, Ostende…)
Relations
Best on fields that do not carry any real-world
information
Autonumber
In the example: add field autonumber in the
species table, and use this as a code in the
distributions table; make autonumber field the
primary key
Also best to add Autonumber as primary key in
distributions table
And more Relations…
Also the locality of the species
distribution record should be in a
separate table
With fields for position, country…
Hands-on
Refine the data model by creating an
extra table for the countries
Create extra table with id and countryname
Make id the primary key in countries table
Change places table, change country to
country_id, and update field type
Create relationship between places and
countries
Queries
Select and sort records of a table
Pull together information from related tables
Simple statistics
In principle: SQL
But access interface hides SQL from the user
For now: query grid
Define which tables will be used, fields that have to
be shown, criteria for selection and sorting…
SQL statement
SELECT species.speciesname,
distrecords.observationdate, places.placename,
places.country
FROM species INNER JOIN (places INNER JOIN
distrecords ON places.id =
distrecords.place_id) ON species.id =
distrecords.species_id
WHERE (((places.country)="Belgium"))
ORDER by speciesname, observationdate;
Statistics query
Count the number of records in the
database given criteria in a ‘Group By’
clause
Cross-tab query
Not standard SQL, specific for Access
Create table, with one column header
and one or more row headers, and a
value for the cells
Hands-on
How many different species do you have
in your download from GBIF?
How many with latitude & longitude?
Check the ‘country’ field
Why do we need normalisation & standard
vocabularies?
Access Wizard
Forms
Create an interface for the users
Hide codes/autonumbers (combobox)
Elements of the form are linked to fields
in the tables
Can show single records at a time, or
‘repeated forms’
Entries/changes to the form are written
automatically to the table
ComboBox
Limit possible entries into a field
Hide codes/autonumbers from users
Two different representations of the
same information are combined
Code, to be entered into the table
Full text equivalent, for the benefit of the
user
Subforms
To display the content of a many-to-many
relationship
We don’t know how many related records there will
be, so impossible to solve with static, predefined
structure
A form can be repeated, showing multiple lines
Number of lines is not limited
First create the subform
Subforms
And then the main form
And now…
How to decide on the structure of a database?
Which information do we want to store?
Which tables do we need? Relationships between
them?
Which indices (primary and others) do we need?
Start from an analysis of user needs
In principle also, decide on a platform
Task
Create ‘model’ for
biodiversity/biogeography database
Which standard vocabularies to use?
What are important elements of
information?
Sources
Biology
Classification, synonymy
Geography