Course introduction, relational theory

CSCI 6442
Database
Management II
INTRODUCTION
Copyright 2016 David C. Roberts, all rights reserved
Agenda
How the course works
◦ Homework
◦ Project
◦ Exams
◦ Grades
Prerequisite
◦ CSCI 6441: Mandatory prerequisite
◦ Take the prereq or get permission to take the course
Goals of the course
◦ Advanced topics
◦ Topics that are often not understood
◦ Realistic experience
Workload
◦ Fairly heavy workload from the beginning
◦ Workload gets heavier when project starts
Relational Principles
◦ Why relational, why it matters
Homework
Weekly assignments
Assignments are challenging and make an important point, no busywork
Submit assignments by email to [email protected]
No attachments
Assignments due at start of class
Late assignments not accepted
Project
One project will involve the entire class
Work will be performed in functional teams
Every student must produce programs and will be graded on personally
produced results
Details of project not yet determined
Project
We have a choice about the project
◦ We can build a multi-enterprise Web-accessed application with general
usefulness as a WordPress plugin.
◦ You can take it with you and use it as you wish in the future
◦ It’s a good thing to show for grad school or job interviews
◦ We can include both transaction and analytical processing in the project
◦ We can do a big data experiment with a lot of data
◦ New York City has made large data collections available
◦ We can get accounts on Amazon Web Services
◦ This would be a one-time project, no takeaways
Previous project built by a CSCI 6442 class: QuestionPeach.com
Exams
Midterm and final
Midterm will be closed book, in class
Midterm date will not change
Plan your schedule now:
◦ Be here for the midterm, no makeup exams
◦ Be here for the final, no makeup exams
Midterm will test your ability to work with concepts discussed in lecture
and covered by homework
Grading
A: Good quality graduate work, only minor issues with correctness
B: Acceptable graduate work, one or more major issues
C: Not acceptable graduate work, several serious issues
F: Does not show basic understanding
Prerequisites
CSCI 6441 is a mandatory prerequisite
Take it before this course
If you think you know the material, you need to explain it and get
permission
First assignment is intended to clear this up
Goals
Misunderstood topics
◦ Normalization
◦ Database design
◦ Performance
◦ SQL
Advanced topics
◦ Time in databases
◦ Translucency
◦ Performance
Realistic experience
◦ Realistic team size
◦ Accountability
◦ Emerging requirements
Current Developments
◦ Big data
◦ NoSQL
◦ Cloud Computing
Workload
This course is for advanced students who want to be technical leaders
of a database project
If you want to “slide by,” you are in the wrong course
But if you do want to be the database guru on a project at work, then
stay in this course!
Relational Principles
Earlier database systems: hierarchies, networks as data models
◦ Relationships represented as physical connections
◦ Structure of relationships embedded in applications
◦ When relationships changed, applications had to change
Relational: independent table as data model
◦ Relationships represented by equal values
◦ Structure of relationships invisible to applications
◦ Relationships can change structure without application change
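A minimal sketch of "relationships represented by equal values", assuming two hypothetical tables whose names and values are invented for illustration. No pointer connects an employee to a department; the relationship is recovered by matching equal department_no values.

```python
# Hypothetical data: each table is an independent list of rows.
employees = [
    {"employee_no": 1, "name": "Ada", "department_no": 10},
    {"employee_no": 2, "name": "Grace", "department_no": 20},
]
departments = [
    {"department_no": 10, "location": "Washington"},
    {"department_no": 20, "location": "New York"},
]


def employees_with_location(employees, departments):
    """Relate the two tables by equal department_no values (a join)."""
    return [
        (e["name"], d["location"])
        for e in employees
        for d in departments
        if e["department_no"] == d["department_no"]
    ]


pairs = employees_with_location(employees, departments)
```

Because the tables share only values, either one can be reorganized without changing the code that reads the other.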
Relational Database
Relational Database: a set of relations
Relation
Relation: a set of ordered pairs
Ordered pair: a pair of values, such that interchanging the two values
changes the meaning
◦ That is, <a,b> = <c,d> iff a = c and b = d, so <a,b> = <b,a> iff a = b
Specifying a relation by enumeration:
R = {<a,b>, <c,d>, <e,f>}
◦ This is a relation consisting of three ordered pairs.
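As an illustrative sketch, the enumerated relation can be modeled with a Python set of 2-tuples. The helper name pairs_equal is an assumption made for this example, not part of the course material.

```python
# Ordered pairs modeled as Python tuples; the relation R as a set of them.
R = {("a", "b"), ("c", "d"), ("e", "f")}


def pairs_equal(p, q):
    """<a,b> = <c,d> iff a = c and b = d."""
    return p[0] == q[0] and p[1] == q[1]


# Interchanging the values gives a different pair unless both values are equal.
swapped_differs = not pairs_equal(("a", "b"), ("b", "a"))
same_when_equal = pairs_equal(("a", "a"), ("a", "a"))
```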
Relation and File
Ordered pairs can model more than two values through nesting:
◦ <a, b, c> == <<a,b>, c>
◦ <a, b, c, d> == <<<a,b>, c>, d>
◦ And so on
This extends the ordered pair so that it can model a tuple of any length
Now a relation starts to look like our notion of a file, with each tuple
corresponding to our notion of a record
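The nesting can be sketched in a few lines. The function names nest and flatten are hypothetical, and left-nesting is the encoding assumed here.

```python
from functools import reduce


def nest(values):
    """Encode a tuple of length >= 2 as left-nested ordered pairs."""
    return reduce(lambda left, right: (left, right), values)


def flatten(pair, length):
    """Recover the flat tuple of the given length from its nested-pair form."""
    out = []
    for _ in range(length - 1):
        pair, last = pair  # peel off the outermost right-hand value
        out.append(last)
    out.append(pair)  # what remains is the first value
    return tuple(reversed(out))


triple = nest(("a", "b", "c"))
quad = nest(("a", "b", "c", "d"))
```

The length is passed to flatten explicitly because a stored value could itself be a pair, so the nesting depth cannot be inferred from the data alone.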
The Definition
Relation is a set of ordered pairs (modeling a set of tuples), so:
1. exchanging order of values within a tuple changes the meaning of the
tuple
2. exchanging the order of tuples within a relation does not change the
meaning of the relation
3. duplicate tuples are not allowed
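The three consequences can be checked directly if a relation is modeled as a Python set of tuples. This is a sketch with invented employee values.

```python
# Hypothetical tuples: (employee number, job, salary)
employee_a = (101, "analyst", 50000)
employee_b = (102, "clerk", 30000)

# 1. Order of values within a tuple matters.
order_in_tuple_matters = (101, "analyst") != ("analyst", 101)

# 2. Order of tuples within a relation does not matter.
order_of_tuples_irrelevant = {employee_a, employee_b} == {employee_b, employee_a}

# 3. Duplicate tuples are not allowed: a set keeps only one copy.
relation = {employee_a, employee_a, employee_b}
no_duplicates = len(relation) == 2
```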
Data Modeling
Now we build a database as a collection of independent relations, each
describing instances of a single entity type
For example:
◦ Employee (employee#, job, salary, department)
◦ Department (department#, departmentname, location)
(this is called schema notation)
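One possible translation of this schema into table definitions, using Python's built-in sqlite3 module. The column types, the key constraints, and the renaming of employee# and department# to employee_no and department_no are assumptions made for this sketch; the slide gives only the attribute names.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (
        department_no   INTEGER PRIMARY KEY,
        department_name TEXT,
        location        TEXT
    );
    CREATE TABLE employee (
        employee_no   INTEGER PRIMARY KEY,
        job           TEXT,
        salary        INTEGER,
        -- relates an employee to a department by equal values
        department_no INTEGER REFERENCES department(department_no)
    );
    """
)

# List the tables that now exist in the database.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```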
Data Language
We need a way to insert data into the database, retrieve data from the
database, and change values that are stored in the database
We define a data language that can be used from any programming
language to do that
The data language (SQL) has a lot of power and can save a lot of
programming work if you understand it
You’ll have a brief chance to learn more about SQL in this course
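A short sketch of the three operations, issued as SQL from Python through the standard sqlite3 module. The table layout and the values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_no INTEGER PRIMARY KEY, job TEXT, salary INTEGER)"
)

# Insert data into the database.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "analyst", 50000), (2, "clerk", 30000)],
)

# Change values that are stored: one statement updates every matching row.
conn.execute("UPDATE employee SET salary = salary + 1000 WHERE job = ?", ("clerk",))

# Retrieve data from the database.
rows = conn.execute(
    "SELECT employee_no, salary FROM employee ORDER BY employee_no"
).fetchall()
```

The UPDATE statement replaces a loop that an application would otherwise have to write by hand, which is one way the data language saves programming work.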
Normalization
Database courses talk about normalization
Students usually don’t learn more than memorizing definitions
We will talk about Roberts’s Rules, plain English rules that give you a
highly normalized database
Then we will talk about the normalization rules and what they mean in
English
You will have the chance to really understand how to do this
Time
Time in databases is a complex issue
There’s the time something happened and the time it’s entered into the
database
And there’s also the effective time, which may differ from those two
times
And there’s the need to capture a history of previous values and roll
back to them
We’ll examine all of these cases of time
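To make the distinction concrete, here is a minimal sketch that stores all three times with each version of a value and keeps old versions instead of overwriting them. The record layout, the field names, and the function salary_as_of are assumptions for illustration, not the design taught in the course.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SalaryVersion:
    employee_no: int
    salary: int
    occurred_on: date   # when the salary decision happened
    recorded_on: date   # when it was entered into the database
    effective_on: date  # when it takes effect


# Old versions are kept, so earlier values remain available.
history = [
    SalaryVersion(1, 50000, date(2016, 1, 4), date(2016, 1, 5), date(2016, 1, 1)),
    SalaryVersion(1, 55000, date(2016, 6, 10), date(2016, 6, 20), date(2016, 7, 1)),
]


def salary_as_of(history, employee_no, day):
    """Return the salary in effect on the given day, or None if there is none."""
    candidates = [
        v for v in history
        if v.employee_no == employee_no and v.effective_on <= day
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda v: v.effective_on).salary
```

Here the second version was recorded on June 20 but does not take effect until July 1, so a query for June 30 still returns the earlier salary.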
Translucency
Typically the GRANT statement is used to give access to a database
A DBA enters the statement, and a user has access
But that’s not good enough if there may be thousands of people on the
Web using a database
We’ll study translucency, a way to provide access control without
GRANT statements
Responsibility
Professional standards, like those of the work environment, will be followed:
◦ Arrive at class on time
◦ Submit homework on time
◦ Limit answers to 50 words
Class Web Site
A class Web site has been established
Everything about the class is on it
◦ Assignments
◦ Lecture slides
◦ Papers
Please read it
When changes are made to the site a note will be sent to the class email
list
It’s at http://csci6442.org
Email List
A class email list has been established
If you got the email sent today, then you’re on it
If not, please go to the site, follow instructions, and enroll yourself
You can follow instructions on the Web site to enroll a different email
address