Fundamentals of Relational Database Design
Download
Report
Transcript Fundamentals of Relational Database Design
Fundamentals of Relational
Database Design
By
Ed Lance
Copyright © 2005 Ed Lance
About Me
Independent Database Consultant
Worked with Access since 1.0
Many years designing, developing, tuning,
and developing with databases.
Code with .Net platform
http://SanDiegoDataPro.com
Copyright © 2005 Ed Lance
What’s this about relationships?
Relational Database is a database system that
organizes data according to (more or less) the
rules of relational set theory. (Codd)
In a relational database, we break up data into
multiple tables.
These tables have ‘pointers’ (keys) in the data
that relate rows in one table to rows in another.
This allows us to have multiple dimensions in
our data without going insane.
Copyright © 2005 Ed Lance
Why do I care?
Relational Database Design is important
because…
– It allows us to organize our data in a way that doesn’t
limit the user.
– It allows us to model very complex data with relative
ease.
– It eliminates duplicate data that is hard to maintain.
– It provides us a way to search and retrieve data with
an easy language and gets us out of having to write
all our own data handling code.
Copyright © 2005 Ed Lance
The Penalties for Not Doing It
Poor database performance
Can’t write queries to get data the way you want
it.
Lots of extra work (code) to get reports they way
you want.
Always doing kludges to get around the design.
When the customer wants to add one more of
something, it’s a design change (“I need six
phone numbers instead of five.”)
Updating data is a nightmare
Copyright © 2005 Ed Lance
Basic Concepts
Keys
Indexes
Relationships
Normal Forms
Copyright © 2005 Ed Lance
Primary Keys
We need to be able to uniquely identify
rows in a table.
Natural key – Use one or more columns
together that aren’t repeated.
Identity – or Autonumber, a unique number
for each row, managed by the system.
PK’s help us connect data in other tables
to rows in the current table.
Copyright © 2005 Ed Lance
Indexes
An index is a data structure used by a
database engine to make searches faster.
Data is generally stored in the order it was
input. We want to be able to filter and
order by any column (or set of columns).
Indexes are defined by you, but
maintained by the engine.
You can index one column or a set of
columns.
Copyright © 2005 Ed Lance
Indexes
Deciding which columns to create indexes
for is a matter of judgment and skill.
An index can have a huge impact (usually
good) on performance of queries.
Too many indexes can bog down database
when modifying data.
Indexes do take space on disk.
Copyright © 2005 Ed Lance
Relationships
We use columns in tables to contain
pointers to rows in other tables.
By using primary key columns, we can
uniquely identify rows in a table.
Then by using the value of the primary key
in other tables, we can tie the tables
together.
Columns that contain key values from
another table are called foreign keys.
Copyright © 2005 Ed Lance
Relationships
In order for this to work, we must be
careful that we don’t end up with foreign
keys that point to rows that don’t exist.
This is called Referential Integrity
Most database engines (inc. Access, SQL
Server) can enforce this for you.
Copyright © 2005 Ed Lance
Relationship Types
1–1
– If two tables really represent the same set of items. In
other words they could be merged into one table.
1 – Many
– Items in one table are related to one or more items in
another table. (One library has many books.)
Many – Many
– Items in one table are related to one or more items in
another table, but the relationship exists the other
way as well. (One doctor has many patients, but each
patient may have more than one doctor)
Copyright © 2005 Ed Lance
Relationships
Finding these relationships in your
customer’s data and designing tables to
correctly model them is the art of relational
database design.
Look for repetition in existing data.
Look for ‘add-on’ fields that are used to
add similar items (PhoneNumber1,
PhoneNumber2)
Copyright © 2005 Ed Lance
Tips for Designing
Think of tables as entities (nouns).
– Books, products, people, events, orders
Think of relationships as verbs.
– Books have authors.
– Products have suppliers
– Orders have order details (the set of items being
ordered)
Test phrases for relationship types
– Suppliers have many products, but each product has
only one supplier. (one-to-many)
Copyright © 2005 Ed Lance
What is Normal?
The process of organizing relational data
can be called Normalization.
Codd (and follow on people) defined
criteria for increasing levels of
normalization in data.
In the practical world we care about the
first three.
Copyright © 2005 Ed Lance
Normal Forms
1NF – First Normal Form
– No duplicate data in columns. This means we have to
break up our data into relational tables, and establish
key values.
Name
Address
Title
Author
Phone1
Codd, Lance
Copyright © 2005 Ed Lance
Phone2
ISBN
..
Normal Forms
Name
Address
Phone1
Phone2
Ed
1234
555-1212
444-1234
NamesID
Name
Address
NamesID PhoneNumber
1
Ed
1234
1
555-1212
1
444-1234
Copyright © 2005 Ed Lance
Normal Forms
Title
Author
RDBMS
Codd, Lance
PubID
Title
1
RDBMS
ISBN
ISBN
..
PubsID
Name
1
Codd
1
Lance
Copyright © 2005 Ed Lance
Normal Forms
2NF – Second Normal Form
– Be in 1NF.
– All columns must be dependent on the whole primary
key, not just one part of it. Not an issue if you use
Autonumber keys.
Part_No (PK)
Supplier_Name (PK) Price
Copyright © 2005 Ed Lance
Supplier
Address
Normal Forms
3NF – Third Normal Form
– Be in 2NF
– Non-key columns should not have dependencies on
each other.
Part_No (PK) Mfgr. Name
Mfgr. Address …
Copyright © 2005 Ed Lance
Questions?
The End
Copyright © 2005 Ed Lance