Introduction to Data Models


Week 2 Introduction to Data Modelling
• Why modelling?
• Database Modelling Basics
• Design constraints
• Different Models in Database Design
Introduction to Data Models
• Why modelling?
– Modelling saves time by allowing designers to
evaluate and assess a system model before committing
to full system development. It is also true that, at the
outset, some of the requirements may not be clear,
and the system model can help to clarify these.
– All database design must start with a model in order
to establish a data structure and the relationships
between data. A model offers a convenient and
cost-effective way of building a design from concept to
the final system through a number of iterations. At each
iteration the model highlights any deficiencies in the
design, which can then be corrected and the model
tested again.
Different Models in Database Design
• In database design there are a number of different models. The two
most prevalent models to date are the relational model and the
object-oriented model. We take a brief look at the other models in
order to understand the need behind them. Since the introduction of
electronic databases, the way that data is stored and operated upon
has needed to be defined. This required a data model, and over time
several were developed. These are:
1. Hierarchical
2. Network
3. Relational
4. Entity relationship
5. Object-Oriented
6. Object-Relational
Database Modelling Basics
• Database modelling relies on three elements, namely entities,
attributes and relationships.
– Entity: the basic unit of data description. An entity can be a person, a
physical component or simply a value or a quantity. If you find it difficult
to identify an entity, think of the ‘nouns’ that exist in your system.
These will often make it easier to identify entities.
– Attributes: also called the properties of an entity. These are the
properties that can be used to describe an entity. To identify the
properties of an entity, it often helps to list the adjectives that can be
applied to it.
– Relationships: these describe how entities relate to each other. In
simple terms they can be thought of as links between entities. Thus,
when an entity changes, its link to another entity may cause a change in
that entity as well. For example, if a student is an entity, then so is the
student’s tutor, and the relationship between the two entities is itself
part of the data.
– A database designer must take these relationships into account and
build them into the database so that the database application
functions efficiently and accurately.
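The three elements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the attribute names (student_id, tutor_id, name) and the assign_tutor helper are hypothetical and do not come from the slides.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity, so the two-way links cannot recurse
class Tutor:
    # Attributes (properties) describing the Tutor entity (hypothetical names).
    tutor_id: int
    name: str
    # Relationship: one tutor is linked to many students.
    students: list[Student] = field(default_factory=list)


@dataclass(eq=False)
class Student:
    # Attributes describing the Student entity (hypothetical names).
    student_id: int
    name: str
    # Relationship: each student points back to a tutor (None if unassigned).
    tutor: Tutor | None = None


def assign_tutor(student: Student, tutor: Tutor) -> None:
    """Record the relationship on both sides so the link stays consistent."""
    if student.tutor is not None:
        # A change to one entity propagates along the link to the other.
        student.tutor.students.remove(student)
    student.tutor = tutor
    tutor.students.append(student)


tutor_a = Tutor(1, "Tutor A")
tutor_b = Tutor(2, "Tutor B")
sam = Student(10, "Sam")
assign_tutor(sam, tutor_a)
assign_tutor(sam, tutor_b)  # moving Sam also updates Tutor A's list
```

Updating both sides in one helper is one possible design choice; it keeps the link between the two entities from drifting out of step.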
Design constraints
• All database design has to take into account the
constraints on data. These constraints are sometimes
called business rules, and they reflect the working
practices of an organisation. For example, it may be
university policy that a class cannot run with fewer than
five students. It is essential that the database model
reflects this rule.
• All models must allow the designer to describe these
relationships and constraints on entities and their
properties. Furthermore, the designer must be able to
see where a model can generate problems, so that
the design can be suitably modified.
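As a rough illustration, the class-size rule mentioned above could be checked as follows. This is a minimal sketch: the names MIN_CLASS_SIZE, can_run and schedule_class are hypothetical, and in a real database the rule might instead be enforced by the database system itself.

```python
MIN_CLASS_SIZE = 5  # business rule from the text: no class with fewer than five students


def can_run(enrolled_students):
    """Return True if the class meets the minimum-size business rule."""
    # A set removes accidental duplicate enrolments before counting.
    return len(set(enrolled_students)) >= MIN_CLASS_SIZE


def schedule_class(title, enrolled_students):
    """Refuse to schedule a class that breaks the rule (hypothetical helper)."""
    unique_students = sorted(set(enrolled_students))
    if not can_run(unique_students):
        raise ValueError(
            f"{title}: only {len(unique_students)} students, "
            f"minimum is {MIN_CLASS_SIZE}"
        )
    return {"title": title, "students": unique_students}
```

Calling schedule_class with four names raises ValueError, while five distinct names returns the scheduled class.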
Classification of Relationships
• A further point worth mentioning is that
relationships are classified in terms of how many
objects are involved on each side of the relationship.
Three types are defined, namely:
– One to Many (1:M): as in one tutor to many students.
– Many to One (M:1): as in many modules to one
course.
– Many to Many (M:N): as in many units to many
students.
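The three relationship types can be sketched with plain Python data structures. The names and sample data below are made up for illustration only; the point is the shape of each structure.

```python
# 1:M, one tutor to many students: each student appears under exactly one tutor.
students_by_tutor = {
    "Tutor A": ["Ann", "Ben"],
    "Tutor B": ["Cal"],
}

# M:1, many modules to one course: each module maps to a single course.
course_of_module = {
    "Databases": "Computing",
    "Networks": "Computing",
    "Statistics": "Mathematics",
}

# M:N, many units to many students: stored as a set of (student, unit) pairs,
# because neither side can hold a single reference to the other.
enrolments = {("Ann", "Unit 1"), ("Ann", "Unit 2"), ("Ben", "Unit 1")}


def units_of(student):
    """All units taken by one student."""
    return sorted(unit for s, unit in enrolments if s == student)


def students_of(unit):
    """All students taking one unit."""
    return sorted(s for s, u in enrolments if u == unit)
```

Note that the M:N case needs its own collection of pairs, which is the same idea as a linking table in a relational database.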
Database model types - The Hierarchical Model
• The first type of data model used in database design was the hierarchical
model. The data structure in this type of model resembles an inverted tree,
where the root at the top of the tree is the highest level of the hierarchy
and the branches below constitute the lower levels.
• In this data structure the root is considered to be the parent of all objects
below it and, conversely, going down the structure each entity is the child of
the one above it. In the hierarchical model each child entity can have only one
parent, while a parent entity can have many child entities. Whilst this is
quite suitable for many database applications, the constraint that each child
can have only one parent restricts many database designs.
• Although the hierarchical model has been very useful, some modifications
became necessary as database applications became more complex.
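The single-parent constraint described above can be sketched as a small tree structure. This is an illustration only: the Node class and its method names are hypothetical and do not represent any particular hierarchical database product.

```python
class Node:
    """One record in a hierarchy: at most one parent, any number of children."""

    def __init__(self, name):
        self.name = name
        self.parent = None   # a child can have only one parent
        self.children = []   # a parent can have many children

    def add_child(self, child):
        # Enforce the single-parent constraint of the hierarchical model.
        if child.parent is not None:
            raise ValueError(
                f"{child.name} already has a parent: {child.parent.name}"
            )
        child.parent = self
        self.children.append(child)
        return child

    def path_from_root(self):
        """Walk up to the root and return the names from the top down."""
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return list(reversed(path))


university = Node("University")
computing = university.add_child(Node("Computing"))
maths = university.add_child(Node("Mathematics"))
ann = computing.add_child(Node("Ann"))

try:
    maths.add_child(ann)  # a second parent is not allowed in this model
except ValueError as error:
    print(error)
```

The rejected second add_child call is exactly the restriction that later models set out to remove.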
Network Model
• To remove the single-parent restriction, the network model
allows a child to have more than one parent. That is to
say, a student can belong to more than one course. For
example, a student can attend one course full time and
another part time. The data structure must tolerate this
relationship. In the network model, the terminology is
somewhat different. Here a relationship between entities
is called a set. Each set comprises at least two record
types: the owner (parent) and the member (child). A
record can be a member of more than one set and can
therefore have more than one owner, so the single-parent
constraint is removed.
• Although this was a significant improvement on the
hierarchical model, there was a need for more flexibility
in describing relationships between entities. Thus the
relational data model was conceived.
Network Model
• While the hierarchical database
model structures data as a tree of records, with
each record having one parent record and many
children, the network model allows each record
to have multiple parent and child records,
forming a generalized graph structure. This
property applies at two levels: the schema is a
generalized graph of record types connected by
relationship types (called "set types" in
CODASYL), and the database itself is a
generalized graph of record occurrences
connected by relationships (CODASYL "sets").
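The graph structure described above can be sketched as follows. This is a deliberately simplified illustration, not the CODASYL interface: the Record class and the connect helper are hypothetical names, and real network databases define named set types rather than plain lists.

```python
class Record:
    """One record occurrence that may have several owners and several members."""

    def __init__(self, name):
        self.name = name
        self.owners = []   # parent records (more than one is allowed)
        self.members = []  # child records


def connect(owner, member):
    """Link an owner record to a member record, in both directions."""
    owner.members.append(member)
    member.owners.append(owner)


full_time_course = Record("Computing (full time)")
part_time_course = Record("Statistics (part time)")
ann = Record("Ann")

# The same student is a member under two different owners, which the
# hierarchical model would not allow.
connect(full_time_course, ann)
connect(part_time_course, ann)

owner_names = [owner.name for owner in ann.owners]
```

Because links are stored on both records, a program can navigate from a course to its students or from a student to all of that student's courses.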
Network Model
• Cycles are permitted at both levels. The chief argument in favour of
the network model, in comparison to the hierarchic model, was that
it allowed a more natural modeling of relationships between entities.
Although the model was widely implemented and used, it failed to
become dominant for two main reasons. Firstly, IBM chose to stick
to the hierarchical model with semi-network extensions in their
established products such as IMS and DL/I. Secondly, it was
eventually displaced by the relational model, which offered a higher-level, more declarative interface. Until the early 1980s the
performance benefits of the low-level navigational interfaces offered
by hierarchical and network databases were persuasive for many
large-scale applications, but as hardware became faster, the extra
productivity and flexibility of the relational model led to the gradual
obsolescence of the network model in corporate enterprise usage.
Relational Database Model
• The next phase of evolution produced the relational database model, which is
perhaps the most common type of model in current database applications.
Databases that follow this model are called relational databases.
• In very simple terms, a relational data model can be described as a collection
of tables that store data. Data are stored in rows and columns, and the
intersection of a particular row and column holds a particular data value.
This representation allows the same data item to appear in more than one
table, and as a consequence tables can be linked together. Thus, for example,
a course director identifier, say CD_ID, can appear both in the table storing
student information and in the table used to store the course director data.
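The linking of tables through a shared column can be sketched with Python's built-in sqlite3 module. Only the CD_ID column name comes from the text; the table names, the other columns and the sample rows are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE course_director (
    CD_ID INTEGER PRIMARY KEY,
    name  TEXT NOT NULL
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    -- The same CD_ID value appears here and in course_director,
    -- which is what links the two tables.
    CD_ID      INTEGER REFERENCES course_director(CD_ID)
);
INSERT INTO course_director VALUES (1, 'Dr Patel');
INSERT INTO course_director VALUES (2, 'Dr Jones');
INSERT INTO student VALUES (100, 'Ann', 1);
INSERT INTO student VALUES (101, 'Ben', 1);
INSERT INTO student VALUES (102, 'Cal', 2);
""")

# Follow the shared CD_ID column to pair each student with a course director.
rows = conn.execute("""
    SELECT student.name, course_director.name
    FROM student
    JOIN course_director ON student.CD_ID = course_director.CD_ID
    ORDER BY student.student_id
""").fetchall()

for student_name, director_name in rows:
    print(student_name, "->", director_name)
```

The query states what is wanted (students paired with their directors) rather than how to navigate from record to record, which is the main practical difference from the hierarchical and network models.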
The entity relationship model (ERM)
• The ERM is a graphical representation of entities and their
relationships. This is done using entity relationship diagrams
(ERDs).
• The ERD uses a rectangular box to hold the entity name and a
diamond shape to hold the relationship. It also uses labels to indicate
whether the relationship is 1:1, 1:M or M:N. Straight lines between
entities indicate connectivity. For example, consider the following ERD
representation.
The entity relationship model (ERM)
• Symbols tend to vary, and perhaps the most common notation is the
Crow’s foot ERD model. This notation represents 1:1, 1:M and M:N as
symbols on the diagram rather than as labels on the ERD. Additionally,
the relationship diamond is replaced with a simple label above the line.
Thus, with Crow’s foot notation the above diagram would be as follows.