Lecture 4 PowerPoint

Download Report

Transcript Lecture 4 PowerPoint

Instructor: Craig Duckett
Lecture 04: Thursday, April 6, 2016
Relationships
1
Assignment 1 is due LECTURE 5, Tuesday, April 11th
in StudentTracker by MIDNIGHT
• MID-TERM EXAM is LECTURE 10, Thursday, April 27th
• Assignment 2 is due LECTURE 12, Tuesday, May 9th
in StudentTracker by MIDNIGHT
2
3 x 150 Points (450 points Total)
• Assignment 1 (Stage 1): DUE LECTURE 5 Tuesday, April 11th
• Assignment 2 (Stage 2): DUE LECTURE 12 Tuesday, May 6th
• Assignment 3 (Stage 3): DUE LECTURE 20 Tuesday, June 6th
• Database Presentation: DUE LECTURE 20 Tuesday, June 6th
3
Tuesday (LECTURE 3)
• Database Design for Mere Mortals: Chapter 2, 3
Thursday (LECTURE 4)
• The Language of SQL:
• Chapter 3: Calculations and Aliases
• Chapter 4: Using Functions
4
• BIT275/276 Library: On Resources page (Username/Password)
• The Language of SQL: Word Documents now hosted on Resources pages
• Relationships
5
Links of Interest: Join, Alias, Between
•
•
•
•
•
•
•
•
•
•
•
•
SQL Joins (W3Schools)
SQL INNER JOIN (W3Schools)
SQL LEFT JOIN (W3Schools)
SQL RIGHT JOIN (W3Schools)
SQL FULL OUTER JOIN (W3Schools)
SQL JOIN Tutorial (Tech on the Net)
SQL JOIN Tutorial (SQL Guides)
Using JOINS (TutorialsPoint)
INNER JOIN (TutorialsPoint)
LEFT JOIN (TutorialsPoint)
RIGHT JOIN (TutorialsPoint)
FULL JOIN (TutorialsPoint)
•
•
•
•
•
SQL ALIAS (W3Schools)
SQL ALIAS (Tech on the Net)
SQL ALIAS (TutorialsPoint)
SQL ALIAS (Beginner-SQL-Tutorial)
SQL ALIAS (SQL Guides)
• SQL BETWEEN (W3Schools)
• SQL BETWEEN (Tech on the Net)
• SQL BETWEEN (Tizag)
• SQL Tutorial (Tizag)
• MySQL Tutorial (Tizag)
6
Relationships
7
Relationships
Last week we talked about Relationships in the earlier section on the features of relational
databases, but it's time to get more specific.
You're organizing your database into individual tables, and many of these tables will need to know
about each other.
Relationships
Now, if you've started by sketching out a rough Entity Relationship (ER) diagram, you begin by just
attempting to show that some kind of relationship exists between your entities, but to get closer to
actually building this in a database, we need to say, what these Relationships are. And while it is
common in an ER diagram to use a phrase or a verb to describe it like contains or refers to or
belongs to that's for your benefit. The database can't describe it that way.
Relationships
Now, there are only three kinds of relationship of cardinality* between tables in a relational
database: one-to-one, one-to-many and many-to-many. Some people might add a fourth of noneat-all.
The time to start defining these is yet another example of how you can keep to a process and order
when designing databases
* See: http://en.wikipedia.org/wiki/Cardinality_%28data_modeling%29
Relationships
First, you sketch out your entities, which will become your tables.
Then you're going to define the data, the attribute for those entities, which will become columns in
your tables, because only then can you say, which columns will be the primary keys, or if you need
to generate primary keys. You can't properly define your relationship until these primary keys are
specified.
Relationships
I'm not trying to pretend that it's a purely linear process. Not all your tables will become obvious in
the first couple of steps. When you go through the process of normalization, which we'll get to next
week, you'll probably end up with a few new tables. So we will revisit everything we've started to
define, but we can still work through a process.
Relationships
The benefit of specifying our relationships
formally in the database, rather than just
informally putting things together in an
application
is
that
the
database
management system itself will take care of
keeping those relationships valid and
meaningful.
You can then use those relationships to
answer all sorts of questions. Everything
from the basic idea of, how many orders has
a particular customer had.
How many products were in an order? Or
say in a game system. What weapons belong
to a particular player? How many quests has
one player completed compared to another?
All the data is stored in these separate
tables, but it becomes easy to jump from
one to another by following the
relationships.
Relationships
The most typical kind of cardinality in a relational database management system is one-to-many,
which is just as easily described as many-to-one. It just depends on which side of the relationship
you start at.
In the example I showed earlier was perhaps the archetypal one-to-many or many-to-one
relationship. Customer and Order, one customer can have many orders. Meaning, one Customer row
can be associated with multiple Order rows but each Order row is only for one Customer.
One-to-Many Relationships
Now creating this relationship depends on us having our primary keys defined first. If you're
attempting to describe that we need to link from a row in one table to specific rows in another table,
we need a way to get to those specific rows and getting to a specific row takes a primary key. So
going into this, we must have our tables with our columns and primary keys at least roughly planned
out.
One-to-Many Relationships
Next, implementing a new one-to-many relationship requires a change to whatever table represents
the many side of the relationship. So, to relate Customer and Order tables, I don't need to change
anything about the Customer table, that's the one side. I need to add some extra information to the
Order table, and it's the key to the Customer table.
One-to-Many Relationships
So, Customer ID, and this again is called a Foreign Key. It represents a column in this Order table that
is a key to a row in a different table. These will specifically refer to one and only one row in the
Customer table.
One-to-Many Relationships
You might find the Customer ID occurs more than once in order, but it's always pointing to only one
row in the customer. We always make the change to the many side of the relationship because it's
the only way to do it. I can add one column with one value to every Order row that will always point
to a correct customer, but I cannot add one column with one value to the Customer row that could
point to a variety of different orders.
One-to-Many Relationships
Now very often, we would use the same column name across both of those tables. So in this case,
Customer ID and Customer, Customer ID and Order, but it doesn't have to be the same because we
are really making the match on the values in these columns. In some cases, you couldn't use the
same name because there'd be a conflict. Perhaps both of these tables were defined with the
primary key just called ID, or we couldn't reuse the ID column twice in the Order table. There would
be a conflict there, it just wouldn't work. So, we call it something else like Customer ID. It's still a
foreign key. It just happens to have a different column name between the different tables.
One-to-Many Relationships
Another thing that you might see is a completely different name that attempts instead to describe
the relationship. So, instead of using Customer ID, we might use something like Placed By as a
column name. And Order is Placed By a particular Customer ID. It still refers to a value that's in the ID
column in the other table, it's just using a different name. So, the name of the column doesn't have
to match but the data type certainly should. It doesn't make sense to have a primary key that's an
integer in one table, try and match a column that's described as character data in a different table.
One-to-Many Relationships
Now it's also very common to have one table that takes part in multiple relationships. So, a customer
can have many orders. We go to one-to-many relationship here, but we may decide that our
customers can have multiple different addresses they may ship to, so we might add a new table for
Address and have another one-to-many relationship between Customer and Address. This is
perfectly acceptable and very common.
One-to-Many Relationships
Another option is that a table that is on the many side of a one-to-many relationship could be on the
one side of another. So, for example, Order might be the many side of the customer to order
relationship but Order itself could have many order items, so it's on the one side of that relationship.
One customer has many orders, one order can contain many order items.
One-to-Many Relationships
Now as we start to get a bit further, we're going to want to start to diagram these a bit more
specifically. I've shown the basics of entity relationship diagrams as they are very simple to just
get started conceptually. Basically, boxes with lines between them. Although as you start to more
formalize the actual database diagram, you'll see a different kind of layout emerge, and there's no
fixed one standard.
One-to-Many Relationships
You'll see different options used across different database management systems and different
charting applications. But they usually boil down to this kind of idea. A box for each table with
the name of the table at the top, then we'd have the column names. You may or may not add a
little extra to show what kind of data these are, character data, dates, binary, integers, and so on.
What is very common is you would add a PK for a Primary Key, and you would add an FK to
denote a Foreign Key and then start adding the relationship connector lines between them.
One-to-Many Relationships
This kind of diagram is very, very common across all relational databases. It's easy to see what
tables exist, what columns, what relationships exist between the tables. Now in most relational
database management system administration software, there is an option to generate these
diagrams from an actual existing database in that server, and the relationships themselves will be
shown with these connector lines. Different software tends to generate different looking lines. As
I mentioned earlier, some will use the Crow's foot style to show the many part of a one-to-many
relationship. Some will show the infinity symbol. There are other ways of doing this, but it's
usually not that difficult to pick any one relationship and figure out which way it's going.
Relationships
It's possible to create one-to-one relationships in your database, but it's actually very unusual and
here is why. Let's imagine I'm building an HR database. So, I sketched out a few entities based on
real world object. And as part of this system, I figured out I need to keep track of employee data
and driver's license information so I'll create two tables just for those pieces of data.
One-to-One Relationships
But let's say the only reason I need to do this, to keep track of driver's license data is so I can
associate a license with an employee for making travel reservations. So, in this system that I'm
describing, it's a pure one-to-one relationship. One employee row points to one driver's license
row. There are never multiple employees for one driver's license, and there are never multiple
licenses for one employee.
Now, I understand there are always caveats, so I'm assuming that this system just doesn't care
that a few people might possess two licenses for different states or countries. It's trying to model
the idea that we need one listed primary driver's license. So, I could create these as two tables
with a one-to-one relationship. I could store a foreign key to the driver's license table in my
employee row, or I could do it the other way around or in fact, like I'm showing here, I could just
use the same primary key for each table, just mapping one row directly to another.
One-to-One Relationships
But if this is the actual situation that it's a pure one-to-one, then I might as well just combine the
two tables and have the driver's license information stored directly in the employee row with no
relationship needed. So even if that driver's license might be a physical different entity, this
makes a lot more sense to do it this way.
One-to-One Relationships
One thing to take care of, when you're new to database design, it sometimes happen that you
think you've got a one-to-one relationship when you don't. Here is an example where I've seen
this happen multiple times. Let's say we're drawing up a fairly basic situation with an Order, an
OrderItem, and a Product. There's some kind of relationship here.
One-to-One Relationships
Order is storing information like an Order ID, a date, a total amount due. Each OrderItem is
storing a Foreign Key to order ID, storing a quantity, and product ID because it's linking to a
different table product that's storing information like the description in the list price. Now I know
there's a relationship between them. And if asked someone to diagram the probable
relationships in between basic entities like this, I usually have something that goes this way. They
know it's a one-to-many between Order and OrderItem. Okay, one order can contain multiple
OrderItems and then they take the OrderItem which is for a product so it has a product key and a
quantity. And the reasoning often goes, well, one OrderItem is for one and only one product so
it's a one-to-one. But of course it isn't.
One-to-One Relationships
What they've correctly determined is that it is a two-one relationship, meaning an OrderItem is
for one and only one product but they haven't looked at it the other way, that the same product
can be associated with multiple OrderItems.
One-to-One Relationships
So, this really is a many-to-one. It's going from OrderItem to product and a one-to-many going
from product to OrderItem. One product could be ordered multiple times. So, make sure that
you're looking at your relationship both ways, particularly when you think you have found a oneto-one.
Relationships
While I went through an example of a many-to-many relationship earlier, it's worth going
through again, particularly if you're new to database design.
So, I had used the example of author and book that one author could write multiple books, but
also one book could be written by multiple authors. So, this is a many-to-many business scenario.
Many-to-Many Relationships
I'm going to use a different business problem to describe at this time around, in this case, a
database for a training center. So, let's imagine we have two tables right now, Class and Student.
Class includes the information like a title of a training course, what date it's on, what classroom
it's in, and the Student is just exactly the information you would imagine, first name, last name,
email, and so on. And what I want to describe is a relationship between them, that a class can be
attended by multiple students but that also a student could take multiple different classes. It is a
many-to-many relationship.
Many-to-Many Relationships
And I cannot add one column to either table that would successfully represent all the possible
combinations because if I had a StudentID column to the Class table, then I can only have on
student in each class. If add a ClassID column to the Student table, then I can say that each
student can only take one class. So once again, in relational database, you cannot represent a
many-to-many relationship directly.
Many-to-Many Relationships
You need to create a new table to link the two, in several names with this kind of table, a joining
table, a joint table, a junction table, a linking table, a bridging table, a cross-reference table, it
really doesn't matter, what ever you prefer. By convention, the name of this table is just usually
made up by taking the names of the two tables that it's cross-referencing and putting them
together. So in this case, ClassStudent, or StudentClass, either would work. And there's a one-tomany relationship from Class, the ClassStudent linking table, and another one-to-many
relationship from the Student table to the ClassStudent table. But this linking table doesn't
contain any other data than just two columns.
Many-to-Many Relationships
One of those columns is a Foreign Key to the Class table with ClassID, the other is a Foreign Key
to Student with StudentID, and this is how we represent a many-to-many relationship. So, if I
want to take Student ID 102, Viola, and say that she takes multiple classes. Then what I need is
multiple rows in that liking table both referring to Student ID 102 but referring to different Class
IDs, in this case, 441 and 442, one student, multiple classes. And if I want to go the other way
taking one class, say the database design class, Class ID 441, I'll take that over into the linking
table, find all the rows with 441, find the student for those rows, 102, 101, 103, and just map
those to the Student table, one class, multiple students.
Relationships
So, why do we go to all this trouble? Well, it's usually easy to understand that relationships are a
convenient way to link from one table to another. For a customer, we can just get all the orders.
Relationships
Well, just as important, if not more so is the idea that a relationship describes a rule, a constraint
that I cannot violate, that I now must not have an order row for a customer that doesn't exist.
Relationships
The database will not let me add a new order row with a Customer ID 388 if there is no Customer
ID 388 in the Customer table. I may say but I'm just about to add that customer, but at that
moment the database would be no longer valid. It would no longer be internally consistent. This
is what's considered a referential constraint. It's a rule that applies between tables where each
table isn't just applying the rules of its own column definitions, but it's also cross-referencing the
data in the Customer table to make sure that this is or is not allowed. If there isn't a Customer ID
388, you would not be able to enter that row. So, defining these rules will often imply a sequence
in your database for creation of new content, but in this if I have a new customer and a new
order for that customer, I must add the customer first before I can add the order for it.
Referential Integrity
It would allow a customer without an order, but it won't allow an order for a non-existing
customer. The idea that data is not just valid within one particular row or one particular table
according to the rules of that table, but valid and meaningful between all your tables is known as
referential integrity.
There are two places that it primarily has an impact. Adding new rows as I just described, making
sure we can't add an order row with a Customer ID that doesn't exist. This also applies for
updates. I can't bypass it by adding a row for Customer 369 who does exist and then changing
that to a value that doesn't. It will refuse that too. This is still referential integrity enforcing the
rules of the relationship.
Referential Integrity
What's more interesting is what happens with referential integrity when you delete something.
So, if our data currently exist between these two tables in a valid meaningful state, say, I have
Customer ID 367 who has two associated orders, well, what happens if I delete Customer 367?
Well, in most database systems, it depends. It depends on you and your rules and what you want
to have happen. One option you have when deleting between tables that have relationships is
something called a Cascading Delete. If I want to delete a row from the Customer table, the
database says, "You're the boss", and we'll just immediately delete any related row.
Referential Integrity
Another option for deleting is cascading nullify. Some database systems have this option,
although it is rarely used, where you would null the Foreign Key column. So, in this case, I delete
Customer ID 367 and the database would automatically scan all the orders and where it finds
367, set that to null. So, it's a situation where I must keep the order, but it's kind of now detached
from a particular customer. It still exists but with a null column.
Referential Integrity
But more likely than either of this is simply refusal. No action. This is the default in most
database management systems, that if you've created a relationship between these two tables
and then you try and delete a customer that has existing orders, it's not going to let you do that.
It will refuse the delete option. When deleting, you'd need to delete all the orders first or transfer
their ownership to some other Customer ID and only then would I be allowed to delete that
Customer row.
NO ICE TODAY
In-CLASS Assignment 1 Workday
Also: It is okay to “look ahead” at upcoming
PowerPoints to learn about a topic we haven’t
covered yet. Look at the Topics listed in the
Lecture accordion bar to find the topic you’re
interested in.
45