CSC 110 - Intro. to Computing - Canisius College Computer Science

Download Report

Transcript CSC 110 - Intro. to Computing - Canisius College Computer Science

CSC 110 Intro. to Computing
Lecture 10:
Databases
Announcements

Enjoy the long weekend!
Bob Jones
Sam Smith
Tom Jones
Jim Mower
Bob DeNiro
Greg Martin
Mike Marino
Jim Woods
Bob Denver
Xavier Jobs
Sam Els
Ernie Smith
7
12
10
8
7
6
2
6
7
11
3
9
Elvis Gates
Steve Young
Bob Bledsoe
Michael Jones
Jack Woods
Josh Jones
Bob Martin
Fran McLean
Clive Deeker
Matthew Hertz
Zoe Elton
Kate Bolter
4
12
7
4
10
9
7
12
4
8
9
10
Bob Jones
Sam Smith
Tom Jones
Jim Mower
Bob DeNiro
Greg Martin
Mike Marino
Jim Woods
Bob Denver
Xavier Jobs
Sam Els
Ernie Smith
7
12
10
8
7
6
2
6
7
11
3
9
Elvis Gates
Steve Young
Bob Bledsoe
Michael Jones
Jack Woods
Josh Jones
Bob Martin
Fran McLean
Clive Deeker
Matthew Hertz
Zoe Elton
Kate Bolter
4
12
7
4
10
9
7
12
4
8
9
10
Problems With Humans

Humans cannot process large amounts of
data
 Get

overwhelmed, lost, and miss patterns
(Other) people may also make mistakes
 Patterns
and trends are important, however
Suggest methods of improving business
 Identify new possibilities that can be exploited
 Hints at unknown relationships

One Solution: Databases

Databases are programs that store and
manage information automatically
 Typically
used when data is frequently added,
modified, or searched for
 Examples of databases:
Prices & orders from amazon.com
 Student records in a college
 Phone calls made/received from a phone number
 Card catalogue in a library

Why Databases Are Useful

Computers process 1 data item as easily
as they process 10 or 10 billion items
 Filter

data to return only certain elements
E.g., Only consider results for people named “Bob”
 Evaluate

E.g. Have we made our sales goals yet?
 Combine

results on a minute-by-minute basis
data that are somehow related
Let the user look for missed relationships
Relational Databases
Most common databases today
 Uses relationships to organize data
 Starts with a table

 Each
table contains a collection of records
Every record uses identical fields, but may have
different values
 Fields must include at least one unique identifier

 Database

usually contain multiple tables
And define how to combine data between tables
Commonly Used Database
Database Organization

Databases contain several different pieces
database – File(s) holding the tables
 Database engine – Software enabling
programs and users to read, write, and modify
database contents
 Database schema – Files specifying what
type of data each field in a table stores; also
specifies what fields enable tables to relate to
one another
 Physical
Physical Database

Specifies how data are stored
 Has
little/no relation to anything else about
the database
 Usually uses many files to store all the data


Little relationship between files and the data
Physical database is only meant to be
accessed by database engine
 May
not make sense to actual people
Database Schema
Specifies what the type of data each field
stores
 Used to interpret the file(s) storing each
table

 Example
schema for Movie table:
Movie (MovieId: key; Title, Genre: text;
Rating: Enum)
Database Engine

Program which must be used to access
database
 Interprets
database schema to get/store
information in physical database
 Ensures requests match what is expected by
database
 Verifies program/user requesting or modifying
data is allowed to do so
SQL

Stands for Structured Query Language
 Usually

pronounced “sequel”
Language used by database engines
 Includes
commands retrieve, add, remove,
and modify records

Expects/requires commands to be given in
a very specific format
SQL Query Format

select Name from Student
 Returns

Name in each record in Student table
select Name from Student where
StudentId =2009348
 Return
the Name of any Student with a
StudentId of 2009348

select Name from Student order
by Name
 Returns
sorted list of Names from Student
SQL Query Format

select * from Student
 Return

all records from Movie table
select * from Student where
name like ‘John’ order by
StudentId
 Return
all records from the Student table,
sorted by their id numbers, whose name
includes a word like “John”
Database Design

Designing good databases is a tricky,
complex process
 No
single solution works for all data
 Often the “best” design is a matter of
interpretation

But, bad designs limits data’s usefulness
 May
limit what data can be stored
 Cannot access information across tables
ER Modeling
ER stands for Entity-Relationship
 Common way of designing databases

 Ignores
“real-world” issues, implementation
 Starts at important data to store
 Evaluates relationships between the data


Builds tables and fields from there
One popular tool for ER modeling is ER
diagram
ER Diagram
Start with boxes
 Each box captures single concept we wish
to consider/capture

 Box
labels are usually singular noun
 Should describe important actor in
relationship being modeled
Problem
Use
Homework
ER Diagram

Add ovals around each of the boxes
 Represent

important attributes of each box
Labels should be simple, declarative noun
 Include
all details we might want to capture
 Each oval should define single piece of data
ProblemId
Question
Number
Problem
Use
Answer
PctCorrect
HomeworkId
DateGiven
Homework
DateDue
Avg Score
ER Diagram

Show relationships between boxes
 Draw
line whenever we may wish to connect
data
 Turn boxes that in middle of relationship into
diamonds
ProblemId
Question
Number
Problem
Use
Answer
PctCorrect
HomeworkId
DateGiven
Homework
DateDue
Avg Score
ER Diagram

Now build database
 Boxes
and diamonds become tables
 Ovals become fields of the table
 Links between tables suggest tables should
share at least 1 field in common
ProblemId
Question
Number
Problem
Use
Answer
PctCorrect
HomeworkId
DateGiven
Homework
DateDue
Avg Score
For Next Lecture
Finish reading Chapter 12
 Be prepared to discuss:

 Spreadsheets