ITS 180 – DBMS Week 1

Download Report

Transcript ITS 180 – DBMS Week 1

ITS 180:
Database Design
INSTRUCTOR INFORMATION:
MARK JAMIL
[email protected]
Who am I?
 Long time San Diegan
 I Love Science and Technology
 Built my 1st PC from thrown away parts during high school.

33Mhz CPU, 4x 64k RAM, 64k cache, 40MB Disk Drive, DOS 4.0
 1st post-college job: Human genome project
 Mathematical modeling of proteins
 1st well paying job: Systems Administrator
 Media production operations (news)
 Later started my own business: Media & Software
Production Studio

Work was bought by Disney and Art Institute of LA
About this course
This is not a pure programming course… per se. You will not be graded on
your knowledge of programming but rather on your ability to logically
design and use databases.
Functional code is important to help facilitate your learning however in most
cases I will accept pseudo-code and non-working code as long as you
demonstrate that you understood the concept.
This is not a writing class. You will not be graded on spelling, punctuation, or
grammar but you will be graded on the content of your answers.
I prefer teaching database design as a practical lab course. Rather than
lecture, most of the course time will be spent solving practical problems in
a lab style environment. During lab you will be able to ask questions, work
together to solve problems, and learn at your own pace.
Goals
The goals of this course include:





Understanding hardware requirements for creating database
driven applications
Learn how to set up a database server
Learn how to use support software (mysql workbench)
Learn how to use SQL
Learn how to design an efficient database application
Grading Information





Attendance
Class participation
Quizzes
Assignments
Final Project
(10 points possible)
(10 points possible)
(20 points possible)
(40 points possible)
(20 points possible)
----------------------------------------------------------
100 Points Possible
Points
Grade
|
Points
Grade
-------------------------------------------------------------------------------------------------------------------------98-100
A+
|
70-79
C
93-97
A
|
60-69
D
90-92
A|
59 or less
F
88-89
B+
|
80-87
B
|
80-82
B|
POSSIBLE EXTRA CREDIT


5 Extra Credit Points
10 Extra Credit Points
- Perfect Attendance
- Optional Programming Project
EXTRA CREDIT
 Perfect Attendance (5 Points Extra Credit)


Must be present at all class meetings
Cannot be more than 15 minutes late to class
 Extra Credit Programming Project (10 Points Extra Credit)


Create an end user GUI for the database driven application based on the
database we are working on in class.
You may use any programming language to create the GUI. It can be web
driven or standalone. You may use any of the following for development:
PHP: Netbeans (free) or PHPStorm (commercial)
C#:
Visual Studio Express (free) or MonoDevelop (free)
C++:
Visual Studio Express (free) or Qt (free)
Python: PyDev Eclipse (free) or PyCharm (commercial)
Hints for Getting an “A”
 Attendance is important
 Points just for being in class! (10 points)
 Course is structured around discussions (10 points)
 Quiz questions are taken directly from discussion (20 points)
 Perfect Attendance Bonus (5 points)
45 points can easily be earned
by simply attending class!

Most assignment can be completed in class (40 points)
Assessment Questionnaire (aka Quiz 1)
1.
Do you have any experience with database development? What software have you
used?
2.
Do you have any programming experience? If so, what languages and IDE’s are you
familiar with?
3.
What does the following SQL command do?
SELECT CustomerID, phone1
FROM Customers
WHERE CustomerName <> 'Jessica Simpson'
4.
What does the following SQL command do?
SELECT a.CustomerID, a.CustomerName, a.phone1
FROM Customers AS a INNER JOIN Customers AS b
ON a.City = b.City
WHERE b.CustomerName='Jessica Simpson'
AND a.CustomerName<>'Jessica Simpson‘
5.
What does the following SQL command do?
SELECT SUM (Sales) FROM StoreInformation
WHERE StoreName IN
(SELECT StoreName FROM Geography
WHERE RegionName = 'West');
In the Beginning…




Data was written into stone tablets.
Then we had paper.
Then we had a lot of paper bound together into books.
Then we started organizing books:



Page numbering.
We separated books into chapters.
Chapters were listed in an index.
 Books were taking up too much space so we started
looking for ways to shrink them… anyone remember the
microfiche?
 Then we created integrated circuits allowing us to store a
lot of data on electronic parts.
 Time for a game…
Early Electronic Data Storage
This is how data was stored initially. This is inefficient
because every time we wanted to search for a piece of
data, we had to transverse all the data until we found
what we are looking for. For example, how many
sentences are in this paragraph? You would need to read
the entire paragraph and search for end of sentence
punctuations to figure out that there are four sentences.
Or are there only three? Maybe there are five but the
point is that this is very time consuming and even a
computer can be brought to its knees if there is a very
large set of data to transverse. Okay lets call it seven
sentences. Really I am not kidding this time…
Enter Big Data
 Initially, electronics were so fast at storing and looking up data, people did
not worry about performance issues.
 Once the amount of data grew, it became obvious that we needed to find a
better way to organize the data.
 Enter IBM.




To meet the need for big data companies started creating commercial database systems.
The 1st IBM database was called IMS and was sold to the government for the Apollo space
program in the 1960’s.
The 1st commercially success database was Saber, and was purchased by American Airlines,
also in the 1960’s.
IMS and Saber were hierarchical databases.
 IBM researcher (Ed Cobb) theorizes the relational database.



IBM switched all their data products to the relational model in the 70s (System R) and
continued their virtual monopoly on business computing well into the 80s.
System R was the 1st database to use a query language called “Sequel.”
“Sequel” was later changed to “SQL” due to copyright issues with the name “Sequel.”
What happened next?
 Due to the high cost of IBM hardware/software, students at
UC Berkeley made their own relational database for
educational use in the early 70s. It was called Ingres.



Ingress was the 1st open source relational database.
It used a scripting language called QUEL .
Database systems that used Ingress source code include: MS SQL,
Sybase, & Posgres.
 Knowing IBM was costly and that UC Berkeley did not have
money to market their database; Larry Ellison decided to
make his own relational database backed by investors. It was
called Oracle.


Oracle took market share from Ingress mostly due to huge marketing
campaigns.
Oracle used SQL.
 Lets play another game…
Better Data Storage
1.
2.
3.
4.
5.
6.
7.
This is inefficient because every time we wanted to search for
a piece of data, we had to transverse all the data until we
found what we are looking for.
For example, how many sentences are in this paragraph?
You would need to read the entire paragraph and search for
end of sentence punctuations to figure out that there are four
sentences.
Or are there only three?
Maybe there are five but the point is that this is very time
consuming and even a computer can be brought to its knees
if there is a very large set of data to transverse.
Okay lets call it seven sentences.
Really I am not kidding this time.
Definitions
 SQL (pronounced S-Q-L or sequel)


Structured Query Language.
Is a computer language used to communicate with most modern
database systems.
 Database

Is an organized collection of data used to provide information.
 Table

A data storage method using rows and columns. In computer science,
rows are also referred to as “Records” and columns are referred to as
“Fields.”
 Record

Is a row in the database.
 Field

Is a column in the database.
Definitions
 Primary Key
 Is a field (or column) that contains unique data within a table.
 Because the data is the column is unique, every row (or record) becomes unique.
 A table may have multiple keys or no keys at all.
 If you have more than one key in a table, it is called a composite key.
 For our purposes, keys are required to form relations.
 Foreign Key
 Is a relation formed between two tables where one table utilizes a primary key
from the other related table, as part of its data.
 Foreign Keys are NOT necessarily unique.
 Foreign Keys provide integrity because they MUST match a Primary Key in the
related table.
 Index
 Can be applied to columns or rows.
 Indices are NOT Keys!!!!
 They create a pointer to a record or field for faster database lookup.
 Typically copy data thus requiring extra memory.
Relationships (aka Cardinality)
 One-to-One
 Each row in one table, corresponds to exactly one row in
another table.
 One-to-Many
 Each row on one table can correspond to many rows in another
table.
 Many-to-Many
 Every row may have one or more associated rows in another
related table.
Entity Relation (ER) Diagram
Computer Memory
 Current computers store memory in units called bits.
 Bits are represented by 0 and 1
 What does 32 bit mean?
 A computer that has a 32 bit CPU can process 32 bits (32 zero’s
or one’s) in a single cycle (measured in hertz or 1/s).
 What is a byte?
 A byte is 8 bits such as 0000 0000 or 1010 0101
Beware of the difference in Bit and Byte
 Historically computer memory is labeled in bytes.





512 kiloBYTEs Cache
128 megaBYTEs of RAM
4 teraBYTE Hard Drive
1 petaBYTE = 1000 teraBYTEs
1 exaBYTE = 1000 petaBYTEs
 Internet Service Providers use bits instead of bytes.


2 MB service means 2 megaBITs not bytes!!
2 Mbits service means 250 Kbytes per second.
Question: How long would it take to fill a 10 Mbyte hard
drive with data streaming from a 10 Mbit internet
connection?
How do we store data in bits?
 How do I store the number 5 in Binary?
 You need 3 bits of storage









000
=0
001
=1
010
=2
011
=3
100
=4
101
=5
110
=6
111
=7
What happens if someone wants to store larger numbers? Or even words?
 The relationship between bits and how many permutations it can have is 2n.
 Thus if you want to store the number 128, you need at least 7 bits of storage space (27 = 128)
 How are letters stored?
 Letters are encoded. There are many different encoding systems such as ASCII or UTF-8
 ASCII stores everything in 8 bits of data (aka hex)




“A” is equivalent to “65”
“a” is equivalent to “97”
“1” is equivalent to “1”
Null is equivalent to “0”
or
or
or
or
0100
0110
0000
0000
0001
0001
0001
0000
Data Types
 Databases generally store human readable data be it a number
or some text
 Common data types (sizes are not always the same!)



Data Type
Size Limit
How it is stored
-----------------------------------------------------------------------INT
4 bytes
Fixed





DATETIME
BOOLEAN
CHAR
VARCHAR


Can store value of + or - 2 billion. Or unsigned up to 4 billion
Fixed
Fixed
Fixed
Variable
Always adds 1 extra byte for storing its own length
BLOB

8 bytes
1 byte
0-255 bytes
0-255 bytes
0-65k bytes
Variable
Since it stores twice as much as Varchar, it needs 2 bytes of storage for length
Table Example
MovieName
MovieGenre
MovieYear
Star Wars
Drama, Action, Science Fiction
1977
Gladiator
Drama
1992
Ju-On
Horror
2001
Gladiator
Action
1955
The Matrix
Action, Sience Fiction
2003
Gladiator
Action
2000
Hercules
Action
2013
Hercules
Action
2013
Unique Records (Unique Keys)
MovieID MovieName
MovieGenre
MovieYear
1
Star Wars
Action, Drama, Science Fiction
1977
2
Gladiator
Drama
1992
3
Ju-On
Horror
2001
4
Gladiator
Action
1955
5
The Matrix
Action, Sience Fiction
2003
6
Gladiator
Action
2000
7
Hercules
Action
2013
8
Hercules
Action
2013
Data Integrity (Incorrect Data Entry)
MovieID MovieName
MovieGenre
MovieYear
1
Star Wars
Action, Drama, Science Fiction
1977
2
Gladiator
Drama
1992
3
Ju-On
Horror
2001
4
Gladiator
Action
1955
5
The Matrix
Action, Sience Fiction
2003
6
Gladiator
Action
2000
7
Hercules
Action
2013
8
Hercules
Action
2013
Better Integrity
MovieID
GenreID
1
1
1
2
1
5
MovieID
MovieName
MovieYear
1
Star Wars
1977
2
2
3
4
2
Gladiator
1992
4
1
3
Ju-On
2001
5
1
5
5
4
Gladiator
1955
6
1
5
The Matrix
2003
7
1
8
1
6
Gladiator
2000
7
Hercules
2013
8
Hercules
2013
Master Table
Transition Table
GenreID
Genre
1
Action
2
Drama
3
Comedy
4
Horror
5
Science Fiction