Lecture Notes

LSP 121
Normalization
Queries (contd)
* Normalization
• Normalization is the process of efficiently
organizing your data
• Normalizing your database can drastically improve
its performance
– This becomes especially important with very large
databases
• Normalization techniques include:
– Avoiding data redundancy (i.e., avoiding repetition of data)
– Ensuring proper data dependencies
• See example…
Normalization
• Let’s create a database for a car club
• What if one person owns multiple cars? (One owner
can have many cars, so this is a 1:M relationship)
– Table: Members: In this table, we would have to repeat all
of the person’s info (MemberID, Name, Phone, etc.) for
EVERY car they own
• This is called data redundancy and is very BAD!
– Instead, let’s create an additional separate table just for
the cars. We’ll call this table Cars
• We can still connect each car with its owner by including the
MemberID field from the Members table as a field in the Cars table
• This is called establishing a “relationship” between the tables
* Data Redundancy
• A major no-no in database design.
• Can lead to all kinds of problems down the road.
– For example, suppose you had a database in which you
store a person’s phone number in three different tables.
Now, suppose you updated a member’s phone number in
one table, but forgot to do it in another? This is a very
common mistake and can leave the database with conflicting values.
– Try to store each field in one place only.
– However, you can “associate” fields between two tables.
– E.g.: In the ‘Cars’ table, you can refer to the MemberID from
the ‘Members’ table.
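The update problem described above can be sketched in a few lines. This uses Python's built-in sqlite3 module as a stand-in for Access, and the table layout (a phone number copied into a hypothetical OwnerPhone column of Cars) is an assumption made only to illustrate a value stored in two places.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Redundant design (hypothetical): the phone number lives in two tables.
cur.execute("CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Name TEXT, Phone TEXT)")
cur.execute(
    "CREATE TABLE Cars (CarID INTEGER PRIMARY KEY, Make TEXT, "
    "MemberID INTEGER, OwnerPhone TEXT)"
)
cur.execute("INSERT INTO Members VALUES (1, 'Smith', '555-5555')")
cur.execute("INSERT INTO Cars VALUES (1, 'Ford', 1, '555-5555')")

# The phone number is updated in one table but forgotten in the other.
cur.execute("UPDATE Members SET Phone = '555-0000' WHERE MemberID = 1")

member_phone = cur.execute("SELECT Phone FROM Members WHERE MemberID = 1").fetchone()[0]
car_phone = cur.execute("SELECT OwnerPhone FROM Cars WHERE MemberID = 1").fetchone()[0]

# The two copies now disagree, and nothing in the database flags it.
phones_agree = member_phone == car_phone
print(member_phone, car_phone, phones_agree)
```

If the phone number were stored only in Members, the single UPDATE would be enough and no stale copy could exist.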
Normalization Example
Before: Car Club (original table)
– Member ID (primary key)
– Member Name
– Member Address
– Member City
– Member State
– Member Zip
– Member Phone
– Dues Paid?
– National Member?
– Model of Car
– Make of Car
– Year of Car
Note: If a member has more than one car, the first 9 fields will be
repeated multiple times. This is pointless and dangerous!
After: two tables
MemberInfo
– Member ID (primary key)
– Name
– Address
– City
– State
– Zip
– Phone
– Dues Paid?
– National Member?
Cars
– Model of Car
– Make of Car
– Year of Car
– Member ID (not a primary key here!)
Relationship
• Member ID is the primary key in the MemberInfo table and a
“foreign key” in the Cars table
• This is a 1:M (“one-to-many”) relationship
• Why isn’t MemberID a primary key in the Cars table?
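A minimal sketch of the two-table design, written with Python's built-in sqlite3 module rather than Access. Only a few of the member fields are included, CarID is a hypothetical key that the slide does not name, and the member and car rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# MemberID is the primary key here: exactly one row per member.
cur.execute("""
    CREATE TABLE MemberInfo (
        MemberID INTEGER PRIMARY KEY,
        Name     TEXT,
        Phone    TEXT
    )
""")

# In Cars, MemberID is a foreign key, so the same value may appear many times.
# CarID is a hypothetical key added for this sketch.
cur.execute("""
    CREATE TABLE Cars (
        CarID    INTEGER PRIMARY KEY,
        Make     TEXT,
        Model    TEXT,
        Year     INTEGER,
        MemberID INTEGER REFERENCES MemberInfo(MemberID)
    )
""")

cur.execute("INSERT INTO MemberInfo VALUES (1, 'Smith', '555-5555')")
cur.executemany(
    "INSERT INTO Cars (Make, Model, Year, MemberID) VALUES (?, ?, ?, ?)",
    [("Ford", "Mustang", 1967, 1), ("Chevrolet", "Corvette", 1972, 1)],
)

member_rows = cur.execute("SELECT COUNT(*) FROM MemberInfo").fetchone()[0]
car_rows = cur.execute("SELECT COUNT(*) FROM Cars WHERE MemberID = 1").fetchone()[0]
print(member_rows, car_rows)  # one member row, two car rows
```

A primary key must be unique within its table. Because one member can own several cars, the same MemberID has to be allowed to repeat in Cars, which is why it cannot be the primary key there.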
Data Redundancy
• Data Redundancy: A bad idea!
– Key Point: Do everything you can to avoid repeating data.
– Database designers go to great lengths to avoid data redundancy.
• For example, each time you add a new car for a user, you
should not have to repeat all of the user’s personal info
(name, address, phone, etc, etc) all over again.
• Instead, place the user’s personal info in one table (e.g.
“Members”) and have a separate table for cars owned by the
members of the club.
• We then create a relationship between the two tables.
Another Normalization Example
Before: Student ClassRecords
– Student ID (primary key)
– Name
– Address
– City
– State
– Zip
– Phone
– Major
– Minor
– Degree Sought
– Class Name
– Grade
– Number Credits
Note: The first 10 fields are repeated for each course
After: two tables
Student Info
– Student ID
– Name
– Address
– City
– State
– Zip
– Phone
– Major
– Minor
– Degree Sought
Courses
– Class Name
– Grade
– Number Credits
– Student ID
• StudentID is a primary key in the StudentInfo table and
becomes a “foreign key” in the Courses table.
• There is a 1:M (one-to-many) relationship between students
and courses. That is, one student can have many courses.
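The split from one flat table into two can also be sketched in plain Python. The helper name split_records, the reduced set of fields, and the sample rows are all hypothetical and exist only for illustration.

```python
def split_records(flat_rows, student_fields, course_fields, key="StudentID"):
    """Split flat rows into a students table and a courses table."""
    students = {}
    courses = []
    for row in flat_rows:
        # Keep each student's details once, keyed by the primary key.
        students[row[key]] = {f: row[f] for f in student_fields}
        # Each course row carries only the key, acting as a foreign key.
        course = {f: row[f] for f in course_fields}
        course[key] = row[key]
        courses.append(course)
    return list(students.values()), courses


# Made-up flat records: the student fields repeat for every course taken.
flat = [
    {"StudentID": 1, "Name": "Smith", "Major": "Math", "ClassName": "Databases", "Grade": "A"},
    {"StudentID": 1, "Name": "Smith", "Major": "Math", "ClassName": "Statistics", "Grade": "B"},
    {"StudentID": 2, "Name": "Chen", "Major": "Art", "ClassName": "Databases", "Grade": "A"},
]

students, courses = split_records(
    flat, ["StudentID", "Name", "Major"], ["ClassName", "Grade"]
)
print(len(students), len(courses))
```

Three flat rows become two student rows and three course rows, so each student's details are stored once no matter how many courses they take.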
Practice Example
Imagine a table of your customers and all of your sales to them.
Would you change the design of your database?
If so, how many tables would you want in this case?
Primary keys? Foreign keys?
Customer ID
Customer Last Name
Customer Phone
Customer Address
Customer City
Customer State
Customer Zip
Sales Transaction Date
Sales Amount
Item
Clearance Item?
Practice Example
Before: one table
– CustomerID
– Customer Last Name
– Customer Phone
– Customer Address
– Customer City
– Customer State
– Customer Zip
– Sales Transaction Date
– Sales Amount
– Item
– Clearance Item?
After: two tables
Table: Customers
– CustomerID (primary key)
– LastName
– Phone
– Address
– City
– State
– Zip
Table: Sales
– InvoiceID (primary key)
– CustomerID (foreign key)
– SalesTransactionDate
– SalesAmount
– Item
– ClearanceItem?
Relationships
• You can create a relationship between any
two tables by hand
• Click Tools → Relationships
– Add the two tables to the view, click on one of the
fields (e.g. StudentID), drag it over to the other
table’s identical field (StudentID), and release the mouse button
– Check Enforce Referential Integrity (you don’t
want child records without parents)
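What referential integrity buys you can be sketched outside Access as well. The example below uses Python's built-in sqlite3 module, where enforcement has to be switched on with a pragma; the reduced StudentInfo and Courses tables and the sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE StudentInfo (StudentID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("""
    CREATE TABLE Courses (
        ClassName TEXT,
        StudentID INTEGER REFERENCES StudentInfo(StudentID)
    )
""")
conn.execute("INSERT INTO StudentInfo VALUES (1, 'Smith')")

# A child row whose parent exists is accepted.
conn.execute("INSERT INTO Courses VALUES ('LSP 121', 1)")

# A child row with no matching parent (StudentID 99) is rejected.
try:
    conn.execute("INSERT INTO Courses VALUES ('LSP 121', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

course_rows = conn.execute("SELECT COUNT(*) FROM Courses").fetchone()[0]
print(orphan_rejected, course_rows)
```

The second insert fails because no student 99 exists, so the Courses table never ends up holding a record without a parent.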
Relationships
• Access automatically creates a relationship
between the two tables if:
a. you create two tables
b. the first table has a primary key
c. you carry that primary key over to the second
table as a foreign key
d. the primary key and the foreign key are spelled
the same and have the same type
Data
Table: Customers
CustomerID  LastName  Phone     City
1           Smith     555-5555  Palos Heights
2           Chen      666-6666  LaGrange
3           Wilson    777-7777  Chicago
Table: Sales
SalesTransactionDate  SalesAmount  Item   ClearanceItem?  CustomerID
3/3/09                20.45        Shirt  N               1
3/3/09                5.99         Scarf  N               1
3/4/09                29.99        Jeans  Y               3
• Important: Note that the last column in the second table references the
primary key (e.g. CustomerID) from the first table.
• Let’s do the first part of today’s activity.
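As a rough illustration of how the foreign key ties the two tables together, the sample rows from the Data slide can be loaded into SQLite through Python's built-in sqlite3 module and joined. This assumes the slide's values line up row by row; the shortened column names and the ISO date format are choices made for this sketch, and InvoiceID values are generated by SQLite because the slide does not list them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, "
    "LastName TEXT, Phone TEXT, City TEXT)"
)
conn.execute("""
    CREATE TABLE Sales (
        InvoiceID  INTEGER PRIMARY KEY,  -- filled in automatically by SQLite
        SaleDate   TEXT,
        Amount     REAL,
        Item       TEXT,
        Clearance  TEXT,
        CustomerID INTEGER REFERENCES Customers(CustomerID)
    )
""")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?, ?)",
    [
        (1, "Smith", "555-5555", "Palos Heights"),
        (2, "Chen", "666-6666", "LaGrange"),
        (3, "Wilson", "777-7777", "Chicago"),
    ],
)
conn.executemany(
    "INSERT INTO Sales (SaleDate, Amount, Item, Clearance, CustomerID) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("2009-03-03", 20.45, "Shirt", "N", 1),
        ("2009-03-03", 5.99, "Scarf", "N", 1),
        ("2009-03-04", 29.99, "Jeans", "Y", 3),
    ],
)

# Follow the foreign key from each sale back to its customer.
rows = conn.execute("""
    SELECT c.LastName, s.Item, s.Amount
    FROM Sales AS s
    JOIN Customers AS c ON c.CustomerID = s.CustomerID
    ORDER BY s.InvoiceID
""").fetchall()
print(rows)
```

Smith appears twice in the result because two sales reference CustomerID 1, yet Smith's phone and city are stored only once in Customers.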
Simple Queries - contd
• To create an Access query, don’t use the query
wizard. Instead, create query in Design view
• Let’s see how Access does it
• Copy the Pets database from the Basic
Information page to your desktop (or My
Documents)
• Then open the Pets database
– If you don’t see any tables/queries etc on the left, click on
the down-arrow and choose ‘All Access Objects’
* Query on Dates
• You can query based on dates, but only if the
data was stored as date/time
– E.g. to search for dates after Jan 1 2004, you
would type: >1/1/2004
• In a query, dates should be entered with #
before and after the date. Note that dates can
be written in many different formats, e.g.
#1/1/2004#, #January 1, 2004#, #1-Jan-2004#
– Access typically puts the # signs in FOR you
• Different databases have different ways of
dealing with dates.
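As one example of a different database's approach, SQLite has no dedicated date type, so a common convention is to store dates as ISO text (YYYY-MM-DD), which sorts in date order. The sketch below uses Python's built-in sqlite3 module; the Pets table, its BirthDate column, and the sample rows are hypothetical.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pets (Name TEXT, BirthDate TEXT)")
conn.executemany(
    "INSERT INTO Pets VALUES (?, ?)",
    [
        ("Rex", date(2003, 6, 15).isoformat()),
        ("Tweety", date(2004, 1, 1).isoformat()),
        ("Slinky", date(2005, 3, 9).isoformat()),
    ],
)

# Equivalent in spirit to the Access criterion >#1/1/2004#.
cutoff = date(2004, 1, 1).isoformat()
born_after = [
    name
    for (name,) in conn.execute(
        "SELECT Name FROM Pets WHERE BirthDate > ? ORDER BY BirthDate", (cutoff,)
    )
]
print(born_after)
```

Only Slinky is returned: the comparison is strictly "after", so a pet born exactly on Jan 1 2004 is excluded.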
** Queries – Using ‘OR’
• In Access, put each criterion in a different row
– If you put criteria in the same row, the query will work as
an AND query (discussed next)
– Customers with City=“Chicago” OR first name = “Jack”
• Another way of using ‘OR’
• If you are looking for different values within a single field (e.g.
State), you can simply type the word ‘OR’ between each
value:
– E.g. You can look for records in the state of Illinois or
Tennessee or Ohio by entering
“IL” OR “TN” OR “OH”
– E.g. Show all pets that are birds or snakes:
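In SQL terms, the birds-or-snakes query might look like the sketch below, run here through Python's built-in sqlite3 module instead of the Access design grid. The Pets columns and the sample rows are assumptions, since the actual layout of the Pets database is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pets (Name TEXT, Type TEXT)")
conn.executemany(
    "INSERT INTO Pets VALUES (?, ?)",
    [("Rex", "Dog"), ("Tweety", "Bird"), ("Slinky", "Snake"), ("Tom", "Cat")],
)

# Criteria on separate rows of the Access grid behave like OR in SQL.
birds_or_snakes = [
    name
    for (name,) in conn.execute(
        "SELECT Name FROM Pets WHERE Type = 'Bird' OR Type = 'Snake' ORDER BY Name"
    )
]

# IN is a shorthand for several ORs on the same field.
same_with_in = [
    name
    for (name,) in conn.execute(
        "SELECT Name FROM Pets WHERE Type IN ('Bird', 'Snake') ORDER BY Name"
    )
]
print(birds_or_snakes, same_with_in)
```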
** Queries – Using ‘AND’
• Logical AND - you can make multiple entries in the
query boxes.
– E.g. In the Type field enter “Dog” and in the Color field
enter “Brown”
• In Access, putting criteria in the same row is how you
accomplish an AND effect
– Recall that this is different from an OR query, where the
different values must be on separate rows
– E.g. Show all brown dogs:
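A sketch of the brown-dogs query in SQL, run through Python's built-in sqlite3 module rather than Access. The Type and Color columns follow the field names mentioned above; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pets (Name TEXT, Type TEXT, Color TEXT)")
conn.executemany(
    "INSERT INTO Pets VALUES (?, ?, ?)",
    [("Rex", "Dog", "Brown"), ("Fido", "Dog", "Black"), ("Tom", "Cat", "Brown")],
)

# Criteria in the same row of the Access grid behave like AND in SQL:
# a pet must satisfy BOTH conditions to be returned.
brown_dogs = [
    name
    for (name,) in conn.execute(
        "SELECT Name FROM Pets WHERE Type = 'Dog' AND Color = 'Brown'"
    )
]
print(brown_dogs)
```

Fido (a black dog) and Tom (a brown cat) each match only one condition, so only Rex is returned.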
** Queries – Using ‘AND’
• In Access, putting criteria in the SAME row is a way of
accomplishing an AND effect
– This example shows customers with city = “Chicago” AND
first name = “Jack”
AND Queries contd
• Logical AND - You can also use an AND in one
field.
– For example, in the Size field you can enter
>=3 AND <=9
• Possible operators include =, <>, <, >, <=, >=
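The single-field range >=3 AND <=9 can be sketched the same way, again with Python's built-in sqlite3 module as a stand-in for Access. The Size values are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pets (Name TEXT, Size INTEGER)")
conn.executemany(
    "INSERT INTO Pets VALUES (?, ?)",
    [("Tweety", 2), ("Tom", 3), ("Rex", 7), ("Fido", 9), ("Duke", 12)],
)

# Two comparisons on the same field joined with AND; both ends are included.
in_range = [
    name
    for (name,) in conn.execute(
        "SELECT Name FROM Pets WHERE Size >= 3 AND Size <= 9 ORDER BY Size"
    )
]

# BETWEEN is an inclusive shorthand for the same range.
with_between = [
    name
    for (name,) in conn.execute(
        "SELECT Name FROM Pets WHERE Size BETWEEN 3 AND 9 ORDER BY Size"
    )
]
print(in_range, with_between)
```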
Queries That Calculate
• When performing a query, you can aggregate data
together from a series of records
– E.g. Find all customers born before 1970 and calculate their
average sales.
• You can do various basic statistical calculations such as:
Count, Sum, Avg, Max, Min, Standard deviation, etc
• Certain calculations can be performed only on certain
datatypes.
– E.g. You cannot calculate the average of FirstName
Viewing Totals
• To see these values, you need to click on Design → ‘Totals’
– Totals can be found in the ‘Show/Hide’ box
• You will now see an additional row called ‘Total’ in your query
design view
• Under Total, the ‘Group By’ criteria is important…
Example
• Say you have a database for a vet that treats dogs.
Each dog treated has an entry including ID, weight,
and height
• Suppose you want to find the average weight and height of
all pets
– IMPORTANT: Note that under ‘Total’, we change ‘Group By’ to ‘Count’ for
Type (it will show how many pets there were) and to ‘Avg’ for Weight (it
shows the average)
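The Count and Avg settings correspond roughly to SQL aggregate functions. Below is a minimal sketch using Python's built-in sqlite3 module; the Pets columns follow the fields named above, and the three sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pets (Type TEXT, Weight REAL, Height REAL)")
conn.executemany(
    "INSERT INTO Pets VALUES (?, ?, ?)",
    [("Dog", 30.0, 20.0), ("Dog", 50.0, 24.0), ("Cat", 10.0, 10.0)],
)

# Count on Type, Avg on Weight and Height, and no field left on Group By:
# the query collapses all records into a single summary row.
pet_count, avg_weight, avg_height = conn.execute(
    "SELECT COUNT(Type), AVG(Weight), AVG(Height) FROM Pets"
).fetchone()
print(pet_count, avg_weight, avg_height)
```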
Example
• What if you want to find the average height
and weight for all dogs?
Example
• What if you want to find the minimum and
maximum weight for all dogs?
Note that you have to include the Weight field twice.
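In SQL the same idea shows up as the Weight column appearing twice in the SELECT list, once under each aggregate. This is a sketch with Python's built-in sqlite3 module and invented sample rows, not the Access query itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pets (Type TEXT, Weight REAL)")
conn.executemany(
    "INSERT INTO Pets VALUES (?, ?)",
    [("Dog", 30.0), ("Dog", 50.0), ("Dog", 42.0), ("Cat", 10.0)],
)

# Weight is listed twice, once for Min and once for Max;
# the WHERE clause restricts the calculation to dogs.
min_weight, max_weight = conn.execute(
    "SELECT MIN(Weight), MAX(Weight) FROM Pets WHERE Type = 'Dog'"
).fetchone()
print(min_weight, max_weight)
```

The cat's weight of 10 is ignored because the filter is applied before the aggregates are computed.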
More Examples
• You can also perform totals on groups of records.
• For example, suppose you want to count how many
different types of pets the vet has on record
This example is not as intuitive as the others. For example, you would
not know that the ‘Type’ field should not be grouped. Learning to
skillfully query a database takes some reading and practice.
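Depending on how the question is read, "how many types" can mean the number of pets of each type or the number of distinct types. The sketch below shows both in SQL, using Python's built-in sqlite3 module with made-up rows; it does not reproduce the Access grid settings from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pets (Name TEXT, Type TEXT)")
conn.executemany(
    "INSERT INTO Pets VALUES (?, ?)",
    [("Rex", "Dog"), ("Fido", "Dog"), ("Tom", "Cat"), ("Tweety", "Bird")],
)

# Reading 1: one row per type, with the number of pets of that type.
per_type = conn.execute(
    "SELECT Type, COUNT(*) FROM Pets GROUP BY Type ORDER BY Type"
).fetchall()

# Reading 2: a single number, the count of distinct types on record.
distinct_types = conn.execute(
    "SELECT COUNT(DISTINCT Type) FROM Pets"
).fetchone()[0]
print(per_type, distinct_types)
```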
Further Examples?
• Experiment with more queries