CSC 110 - Intro. to Computing - Canisius College Computer Science
Download
Report
Transcript CSC 110 - Intro. to Computing - Canisius College Computer Science
CSC 110 Intro. to Computing
Lecture 10:
Databases
Announcements
Enjoy the long weekend!
Bob Jones
Sam Smith
Tom Jones
Jim Mower
Bob DeNiro
Greg Martin
Mike Marino
Jim Woods
Bob Denver
Xavier Jobs
Sam Els
Ernie Smith
7
12
10
8
7
6
2
6
7
11
3
9
Elvis Gates
Steve Young
Bob Bledsoe
Michael Jones
Jack Woods
Josh Jones
Bob Martin
Fran McLean
Clive Deeker
Matthew Hertz
Zoe Elton
Kate Bolter
4
12
7
4
10
9
7
12
4
8
9
10
Bob Jones
Sam Smith
Tom Jones
Jim Mower
Bob DeNiro
Greg Martin
Mike Marino
Jim Woods
Bob Denver
Xavier Jobs
Sam Els
Ernie Smith
7
12
10
8
7
6
2
6
7
11
3
9
Elvis Gates
Steve Young
Bob Bledsoe
Michael Jones
Jack Woods
Josh Jones
Bob Martin
Fran McLean
Clive Deeker
Matthew Hertz
Zoe Elton
Kate Bolter
4
12
7
4
10
9
7
12
4
8
9
10
Problems With Humans
Humans cannot process large amounts of
data
Get
overwhelmed, lost, and miss patterns
(Other) people may also make mistakes
Patterns
and trends are important, however
Suggest methods of improving business
Identify new possibilities that can be exploited
Hints at unknown relationships
One Solution: Databases
Databases are programs that store and
manage information automatically
Typically
used when data is frequently added,
modified, or searched for
Examples of databases:
Prices & orders from amazon.com
Student records in a college
Phone calls made/received from a phone number
Card catalogue in a library
Why Databases Are Useful
Computers process 1 data item as easily
as they process 10 or 10 billion items
Filter
data to return only certain elements
E.g., Only consider results for people named “Bob”
Evaluate
E.g. Have we made our sales goals yet?
Combine
results on a minute-by-minute basis
data that are somehow related
Let the user look for missed relationships
Relational Databases
Most common databases today
Uses relationships to organize data
Starts with a table
Each
table contains a collection of records
Every record uses identical fields, but may have
different values
Fields must include at least one unique identifier
Database
usually contain multiple tables
And define how to combine data between tables
Commonly Used Database
Database Organization
Databases contain several different pieces
database – File(s) holding the tables
Database engine – Software enabling
programs and users to read, write, and modify
database contents
Database schema – Files specifying what
type of data each field in a table stores; also
specifies what fields enable tables to relate to
one another
Physical
Physical Database
Specifies how data are stored
Has
little/no relation to anything else about
the database
Usually uses many files to store all the data
Little relationship between files and the data
Physical database is only meant to be
accessed by database engine
May
not make sense to actual people
Database Schema
Specifies what the type of data each field
stores
Used to interpret the file(s) storing each
table
Example
schema for Movie table:
Movie (MovieId: key; Title, Genre: text;
Rating: Enum)
Database Engine
Program which must be used to access
database
Interprets
database schema to get/store
information in physical database
Ensures requests match what is expected by
database
Verifies program/user requesting or modifying
data is allowed to do so
SQL
Stands for Structured Query Language
Usually
pronounced “sequel”
Language used by database engines
Includes
commands retrieve, add, remove,
and modify records
Expects/requires commands to be given in
a very specific format
SQL Query Format
select Name from Student
Returns
Name in each record in Student table
select Name from Student where
StudentId =2009348
Return
the Name of any Student with a
StudentId of 2009348
select Name from Student order
by Name
Returns
sorted list of Names from Student
SQL Query Format
select * from Student
Return
all records from Movie table
select * from Student where
name like ‘John’ order by
StudentId
Return
all records from the Student table,
sorted by their id numbers, whose name
includes a word like “John”
Database Design
Designing good databases is a tricky,
complex process
No
single solution works for all data
Often the “best” design is a matter of
interpretation
But, bad designs limits data’s usefulness
May
limit what data can be stored
Cannot access information across tables
ER Modeling
ER stands for Entity-Relationship
Common way of designing databases
Ignores
“real-world” issues, implementation
Starts at important data to store
Evaluates relationships between the data
Builds tables and fields from there
One popular tool for ER modeling is ER
diagram
ER Diagram
Start with boxes
Each box captures single concept we wish
to consider/capture
Box
labels are usually singular noun
Should describe important actor in
relationship being modeled
Problem
Use
Homework
ER Diagram
Add ovals around each of the boxes
Represent
important attributes of each box
Labels should be simple, declarative noun
Include
all details we might want to capture
Each oval should define single piece of data
ProblemId
Question
Number
Problem
Use
Answer
PctCorrect
HomeworkId
DateGiven
Homework
DateDue
Avg Score
ER Diagram
Show relationships between boxes
Draw
line whenever we may wish to connect
data
Turn boxes that in middle of relationship into
diamonds
ProblemId
Question
Number
Problem
Use
Answer
PctCorrect
HomeworkId
DateGiven
Homework
DateDue
Avg Score
ER Diagram
Now build database
Boxes
and diamonds become tables
Ovals become fields of the table
Links between tables suggest tables should
share at least 1 field in common
ProblemId
Question
Number
Problem
Use
Answer
PctCorrect
HomeworkId
DateGiven
Homework
DateDue
Avg Score
For Next Lecture
Finish reading Chapter 12
Be prepared to discuss:
Spreadsheets