APPLICATION OF DATA
MODELING
In natural resources and forest management
Yujia Zhang and Bruce E. Borders
1
INTRODUCTION
A database is a collection of
related data. In forestry, data
are typically stored in
electronic files with
additional information stored
on field tally sheets.
2
During data collection, data
formats may be changed, or
files may be revised by users
individually without informing
others, which causes
problems in data storage and
analysis.
3
The traditional data storage
approach results in
redundancy in data storage.
For example, stand
information repeated in every
record of a tree-level file takes
up storage space and increases
storage cost.
4
Relational database
management systems (RDBMS)
provide a powerful tool to store
and update forest data, review
relationships among individual
components, and model forest
dynamics.
5
DATA MODELING
Data modeling is a conceptual
or logical design of a database.
A data model is a set of
concepts that describe the types,
relations, constraints, and
operations of data.
6
The basic operations of a data
model include accessing and
updating the database. We use
an entity-relationship (ER) data
model for conceptual analysis
and implementation of the
database.
7
The concepts involved in the
ER model are entities,
attributes, and relationships.
An entity is an object,
attributes are descriptions of
the properties of the entity,
and relationships are
interactions among entities.
8
Data are stored in tables. A row
in a table is called a tuple, a
column header is called an
attribute, and a table is called a
relation. The data illustrated
here are from the Consortium for
Accelerated Pine Plantation
Studies (CAPPS).
9
The CAPPS plantations were
established in 1987. The
applied silvicultural
treatments are herbicide (H),
fertilization (F), herbicide and
fertilization (HF), and control
(C).
10
RELATION SCHEMA
A relation schema R of degree
n can be denoted as:
R (A1, A2, … , An)
where R is the name of the
relation and A1, A2, … , An are
attributes.
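The notation can be made concrete with a small sketch. The class name RelationSchema and the three-attribute PLOT relation are hypothetical, chosen only to illustrate the idea that a schema is a name plus an ordered attribute list whose length is the degree.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RelationSchema:
    """A relation schema R(A1, A2, ..., An): a name and its attributes."""
    name: str
    attributes: Tuple[str, ...]

    @property
    def degree(self) -> int:
        # The degree n is the number of attributes.
        return len(self.attributes)

    def __str__(self) -> str:
        return f"{self.name} ({', '.join(self.attributes)})"


# Hypothetical relation, used only to illustrate the notation.
plot = RelationSchema("PLOT", ("Plot_ID", "Location", "Treatment"))
print(plot)
print(plot.degree)
```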
11
The relation schemas in our data
model are:
STAND (Plot_ID, Location, Block,
Plot, FirstGrowingSeason,
Treatment);
TREE (ID, PlantationAge, TreeNumber,
DBH, Height, CrownHeight,
CronartiumQuartileCode,
TipMothCode, DamageCode,
Plot_ID);
12
GROUNDCOVER (ID, SubPlot,
PlantationAge,
PercentAndropogon,
HeightAndropogon,
PercentGrass, HeightGrass,
PercentBroadLeaf,
HeightBroadLeaf, Plot_ID);
SMALLCOMPETITOR (ID, SubPlot,
PlantationAge, Species,
TreeHeight, CrownLength,
CrownWidth, Plot_ID);
13
LARGECOMPETITOR (ID, SubPlot,
PlantationAge, Species,
TreeHeight,
DBH, CrownHeight, BaseHeight,
Plot_ID);
14
A relation is a set of tuples.
An attribute whose values are
distinct can be used as a
primary key to identify a
tuple. The value of the
primary key must not be null;
this is called the entity
integrity constraint.
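A minimal sketch of the entity integrity constraint, assuming SQLite (through Python's sqlite3 module) as a stand-in for the Oracle server. The column types, the reduced STAND table, and the key value 'P1' are assumptions; the slides give attribute names only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical column types; only the attribute names come from the slides.
conn.execute(
    "CREATE TABLE STAND ("
    " Plot_ID TEXT PRIMARY KEY NOT NULL,"  # SQLite needs NOT NULL stated explicitly here
    " Location TEXT, Treatment TEXT)"
)
conn.execute("INSERT INTO STAND VALUES ('P1', 'ATH', 'HF')")


def try_insert(values):
    """Return None on success, or the constraint error message on failure."""
    try:
        conn.execute("INSERT INTO STAND VALUES (?, ?, ?)", values)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


null_key_error = try_insert((None, "ATH", "C"))       # null key: entity integrity violated
duplicate_key_error = try_insert(("P1", "ATH", "C"))  # repeated key: cannot identify one tuple
stand_count = conn.execute("SELECT COUNT(*) FROM STAND").fetchone()[0]
print(null_key_error)
print(duplicate_key_error)
print(stand_count)
```

Both rejected inserts leave the table with its single valid tuple.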
15
A foreign key is needed to
maintain consistency
among relations. For example,
tree-level information and
stand-level information can be
combined using the
foreign key Plot_ID.
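The role of the foreign key can be sketched as follows, again assuming SQLite in place of Oracle. The tables are cut down to a few attributes, and the key values 'P1' and 'P9' and the measurements are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE STAND (Plot_ID TEXT PRIMARY KEY NOT NULL, Location TEXT)")
conn.execute(
    "CREATE TABLE TREE ("
    " ID INTEGER PRIMARY KEY, TreeNumber INTEGER, DBH REAL,"
    " Plot_ID TEXT NOT NULL REFERENCES STAND (Plot_ID))"
)
conn.execute("INSERT INTO STAND VALUES ('P1', 'ATH')")
# A tree on an existing plot is accepted.
conn.execute("INSERT INTO TREE (TreeNumber, DBH, Plot_ID) VALUES (1, 10.7, 'P1')")

try:
    # 'P9' has no matching STAND row, so this tree would be an orphan.
    conn.execute("INSERT INTO TREE (TreeNumber, DBH, Plot_ID) VALUES (2, 8.9, 'P9')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

tree_count = conn.execute("SELECT COUNT(*) FROM TREE").fetchone()[0]
print(orphan_rejected, tree_count)
```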
16
The operations for a relational
database include select, project,
and join. The notation for
the select operation is:
σ<selection condition>(<relation name>)
To select trees with DBH larger
than 10 cm, we use:
σ DBH>10 (TREE)
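In SQL, the select operation corresponds to a WHERE clause that filters rows and keeps every column. A small sketch, assuming SQLite in place of Oracle, a reduced TREE table, and made-up measurements:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TREE (TreeNumber INTEGER, PlantationAge INTEGER, DBH REAL, Height REAL)"
)
# Made-up measurements, for illustration only.
conn.executemany(
    "INSERT INTO TREE VALUES (?, ?, ?, ?)",
    [(1, 5, 10.7, 7.0), (2, 5, 9.4, 6.5), (3, 5, 12.1, 7.4)],
)
# Select trees with DBH > 10: all columns kept, rows filtered.
large_trees = conn.execute(
    "SELECT * FROM TREE WHERE DBH > 10 ORDER BY TreeNumber"
).fetchall()
print(large_trees)  # [(1, 5, 10.7, 7.0), (3, 5, 12.1, 7.4)]
```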
17
The project operation selects
certain columns from a table:
π<attribute list>(<relation name>)
To list tree number, plantation
age, and tree height, we use:
π TreeNumber, PlantationAge, Height (TREE)
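In SQL, the project operation corresponds to the column list of a SELECT statement. Because a relation is a set, DISTINCT is added here to drop duplicate tuples. The sketch assumes SQLite in place of Oracle and uses made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TREE (TreeNumber INTEGER, PlantationAge INTEGER, DBH REAL, Height REAL)"
)
# Made-up measurements, for illustration only.
conn.executemany(
    "INSERT INTO TREE VALUES (?, ?, ?, ?)",
    [(1, 5, 10.7, 7.0), (2, 5, 9.4, 6.5)],
)
# Project onto TreeNumber, PlantationAge, Height: all rows kept, DBH dropped.
projected = conn.execute(
    "SELECT DISTINCT TreeNumber, PlantationAge, Height FROM TREE ORDER BY TreeNumber"
).fetchall()
print(projected)  # [(1, 5, 7.0), (2, 5, 6.5)]
```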
18
The join operation combines
related tuples from two
relations:
R ⋈<join condition> S
To access all trees from
location Athens, we join TREE to
the Athens stands on Plot_ID:
TREE ⋈ TREE.Plot_ID=STAND.Plot_ID (σ Location='ATH' (STAND))
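In SQL, a join pairs each tree with its stand through the shared Plot_ID, and a filter on Location keeps only the Athens plots. A sketch assuming SQLite in place of Oracle; the plot keys and the second location code 'XYZ' are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STAND (Plot_ID TEXT PRIMARY KEY NOT NULL, Location TEXT)")
conn.execute("CREATE TABLE TREE (TreeNumber INTEGER, Plot_ID TEXT)")
# Made-up rows; 'XYZ' is a hypothetical second location.
conn.executemany("INSERT INTO STAND VALUES (?, ?)", [("P1", "ATH"), ("P2", "XYZ")])
conn.executemany("INSERT INTO TREE VALUES (?, ?)", [(1, "P1"), (2, "P1"), (1, "P2")])

# Join on the key, then keep only trees whose stand is in Athens.
athens_trees = conn.execute(
    """
    SELECT t.TreeNumber, t.Plot_ID, s.Location
    FROM TREE t JOIN STAND s ON t.Plot_ID = s.Plot_ID
    WHERE s.Location = 'ATH'
    ORDER BY t.TreeNumber
    """
).fetchall()
print(athens_trees)  # [(1, 'P1', 'ATH'), (2, 'P1', 'ATH')]
```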
19
The RDBMS installed on the
server is Oracle8 Enterprise.
Each user can access the
database over a network. Data
files used by each user can be
stored on the server, on a PC,
or on other external devices.
20
DATA MODEL
IMPLEMENTATION
The implementation of a data
model includes establishing
relations and queries. The
following query is used to obtain
some stand level information
from relations STAND and
TREE.
21
Suppose a forester wants to
know the number of trees per
plot, the average DBH, and the
average height at each age for
stands located in Athens that
received herbicide and
fertilization. The corresponding
query is:
22
select
  s.Location, s.Block, s.Plot, s.FirstGrowingSeason,
  s.Treatment,
  t.PlantationAge, count(t.TreeNumber), avg(t.DBH),
  avg(t.Height)
from
  STAND s,
  TREE t
where
  s.Plot_ID = t.Plot_ID
  and s.Location = 'ATH'
  and s.Treatment = 'HF'
-- every non-aggregated column in the select list must be grouped
group by s.Location, s.Block, s.Plot,
  s.FirstGrowingSeason, s.Treatment,
  t.PlantationAge
order by t.PlantationAge;
23
THE OUTPUT PRODUCED IS:
==================================================
LOC  BLK  PLT  FGS  TRT  AGE  TREE  DBH(cm)  HT(m)
ATH    1    1   89  HF     1    81      0.0    0.7
ATH    1    1   89  HF     2    81      0.0    0.2
ATH    1    1   89  HF     3    81      5.3    3.9
ATH    1    1   89  HF     4    81      8.9    5.2
ATH    1    1   89  HF     5    81     10.7    7.0
ATH    1    1   89  HF     6    80     13.0    8.3
ATH    1    1   89  HF     7    78     14.7    9.8
ATH    1    1   89  HF     8    78     15.7   11.5
ATH    1    1   89  HF     9    77     17.0   13.0
ATH    1    1   89  HF    10    75     18.0   14.3
…
==================================================
24
where LOC = Location, BLK = Block, PLT
= Plot, FGS = First Growing Season, TRT
= Treatment, AGE = Plantation Age,
TREE = Number of Trees per plot, DBH = Average
Diameter at Breast Height per plot (cm), and
HT = Average Total Tree Height per plot (m).
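The stand-level summary query can be tried on a toy database. The sketch below assumes SQLite in place of Oracle, hypothetical column types and plot keys, and made-up tree rows that are not CAPPS measurements. It lists every non-aggregated column in the GROUP BY clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STAND (Plot_ID TEXT PRIMARY KEY NOT NULL, Location TEXT,"
    " Block INTEGER, Plot INTEGER, FirstGrowingSeason INTEGER, Treatment TEXT)"
)
conn.execute(
    "CREATE TABLE TREE (PlantationAge INTEGER, TreeNumber INTEGER,"
    " DBH REAL, Height REAL, Plot_ID TEXT REFERENCES STAND (Plot_ID))"
)
# Made-up rows for illustration; these are not CAPPS measurements.
conn.executemany(
    "INSERT INTO STAND VALUES (?, ?, ?, ?, ?, ?)",
    [("P1", "ATH", 1, 1, 89, "HF"), ("P2", "ATH", 1, 2, 89, "C")],
)
conn.executemany(
    "INSERT INTO TREE VALUES (?, ?, ?, ?, ?)",
    [
        (5, 1, 10.0, 7.0, "P1"), (5, 2, 12.0, 8.0, "P1"),
        (6, 1, 13.0, 8.0, "P1"), (6, 2, 14.0, 9.0, "P1"),
        (5, 1, 9.0, 6.0, "P2"),  # control plot, filtered out by the query
    ],
)
summary = conn.execute(
    """
    SELECT s.Location, s.Block, s.Plot, s.FirstGrowingSeason, s.Treatment,
           t.PlantationAge, COUNT(t.TreeNumber), AVG(t.DBH), AVG(t.Height)
    FROM STAND s JOIN TREE t ON s.Plot_ID = t.Plot_ID
    WHERE s.Location = 'ATH' AND s.Treatment = 'HF'
    GROUP BY s.Location, s.Block, s.Plot, s.FirstGrowingSeason, s.Treatment,
             t.PlantationAge
    ORDER BY t.PlantationAge
    """
).fetchall()
for row in summary:
    print(row)
```

Each result row is one plot at one age, with the tree count and the two averages computed over that plot's trees.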
25
SUMMARY
Forest data are characterized
by large numbers of records,
complicated relationships, and a
diversity of data types. The
traditional approach is far from
satisfactory for data storage and
manipulation.
26
In our database, the relations
among data files eliminate
redundancy in data storage.
All files are stored on a server,
which ensures that up-to-date data
are available to each user.
27
For data safety, the redundant
hard disks in the server store
mirrored data that can be
recovered when the server is
down. Also, the Oracle8 backup
manager can back up the
whole database to external
storage devices.
28
Users can be granted
privileges at different levels by
the database administrator
(DBA) to view, revise, and
transfer files. The database is
protected from unauthorized
access.
29
Data modeling helps establish
a comprehensive database
including tree, soil, hydrology,
GIS, and wildlife data, which
facilitates natural resources
and forest management.
30