Transcript ppt file

CMPT 354
Views and Indexes
Spring 2012
Instructor: Hassan Khosravi
Three level vision of a database
1.2
 What are views

Relations that are defined with a create table statement exist in
the physical layer and


do not change unless explicitly told so
Virtual views do not physically exist, they are defined by
expression over the tables.

Can be queries (most of the time) as if they were tables.
 Why use views?

Hide some data from users

Make some queries easier

Modularity of database

When not specified exactly based on tables.
1.3
Defining and using Views
 Create View <view name> As <view definition>

View definition is defined in SQL

From now on we can use the view almost as if it is just a normal
table
 View V (R1,…Rn )
 query Q involving V

Conceptually


V (R1,…Rn ) is used to evaluate Q
In reality

The evaluation is performed over R1,…Rn
1.4
Example of views
1.5
Views
 Conceptually
 Reality
 Query optimizer
 Users do not need to worry about any of this
1.6
Views on Views
1.7
Views make life easy
1.8
Deleting views
 Drop View <view name>

Dropping a view does not affect any tuples of the in the underlying
relation.
 Dropping a table would remove all of the tuples of the table and views
that use that table.

What would happen if CSaccept is dropped?

Some DBMS would not allow it because it has dependencies

Some would delete it and the be unable to answer queries on
CSberk
1.9
Modifying views
 Can views be modified?

Doesn’t make sense to modify a view since it doesn’t physically
exist

Should be possible since many users can only observe views and
do not have access to tables, so they should be able to alter
tables using views.
 Solution Modifications to the view rewritten to modify base tables.
1.10
Example
 R(A,B) which has (1, 2)
 V is just the A attribute of V = ∏
A
(R)
 Insert (3) into A

What would be insert into R(A,B)?

(3,1)?

(3,2)?

(3,null)?
 R(N)
 V = avg(N)

Update
1.11
Example
1.12
Example
1.13
Approaches to View Modification
 Rewriting process specified explicitly by view creator (user)

+ Can handle all modification

- No guarantee of correctness

- More work
 Restrict views and modifications so that translation to base tables can
be done automatically, correctly, and unambiguously

+ No User intervention

- Restrictions are significant
1.14
Rewriting process specified explicitly by
view creator
 Instead of Triggers are mainly used
 We can write a trigger to perform deletion on this view
1.15
 This will produce an error without having a trigger for it
 There is nothing stopping you from making a mistake in your trigger.
1.16
Restricting modifications in Views
 Showing students that are either in CS or EE
 Inserting new students
1.17
Automatic View Modification
 Restrict views and modifications so that translation to base tables can
be done automatically, correctly, and unambiguously

The view should be on one table and not join of tables

Attributes that are not in the view can be null or have default
values

Sub-queries must not refer to T

You cannot have group by or aggregation
 MYSQL supports automatic updates on views
 Postgress and SQLite do not support automatic updates on views and
triggers should be used
1.18
Deletion Example
 The deletion will be done in the apply table and would show in the
CSaccept view
1.19
Insertion Example
 You should not be able to add to view tuples that do not belong to the
view
1.20
Updatable Views -- Aggregation
 What would this deletion mean?

The system disallows this deletion.
1.21
Updatable views -- Sub-queries
 This is disallowed since it is vague

It would have surprising effects on the base table
1.22
Updatable views-- Joins
 What would insertion or deletion to this table mean?

Updates the apply table but not the student table so the first row
would not match any more.
1.23
Indexes in SQL
 Primary mechanism to get improved performance on database
 Data structure stored in the database

Implementation issues
 Full scan vs. immediate tuple location
1.24
1.25
Implementation of Indexes
 Underlying data structure


Balance Trees

A=v

A<v

A>v

Logarithm time
Hash tables

Only A = v

If designed properly, constant time
1.26
1.27
1.28
1.29
Downside of Indexes
 Extra space

Local hard space
 Over head in Index creation
 Index maintenance
1.30
Calculating the best indexes to create
 Typically a relation is stored over many disk blocks
 Principle cost of a query or modification is often the number of disks
that have to be brought into the main memory
 Indexes store where tuples are stored, so we know which block to
restore instead of full scan of the table
 However indexes themselves have to be stored and modified

Modification in the actual relation needs one page access to
modify the relation and one page access to modify the index
 Assuming we have history of the previous queries and modifications
how can we decide what to have indexes on?
1.31
Example

StarIn(movieTitle, movieYear, starName)

Three database operations are performed

(Q1)Look for title and year of movies in which a given star appeared (p1)

(Q2) Look for stars that appear in a given movie (p2)

(I) Insert a new tuple into the starIn table (1-p1-p2)

StarIn occupies 10 pages

On average a star has appeared in 3 movies

On average a movie has 3 stars

Movies of a star and stars of a movie are speared over all pages

One disk access is needed to read a page in the index

Insertion requires 1 access to find page to write to and 1 to actually write it
Action
No Index
Star Index
Movie
Index
Both
Indexes
Q1
10
4
10
4
Q2
10
10
4
4
I
2
4
4
6
Average
2 + 8p1 +8p2
4 + 6p2
4 + 6p1
6 - 2p1 - 2p2
1.32
Action
No Index
Star Index
Movie
Index
Both
Indexes
Q1
10
4
10
4
Q2
10
10
4
4
I
2
4
4
6
Average
2 + 8p1 +8p2
4 + 6p2
4 + 6p1
6 - 2p1 - 2p2
P1=0.33
P2=0.33
7.28
6
6
4.67
P1=0.1
p2=0.1
3.6
4.6
4.6
5.6
P1=0.1
p2=0.5
6.8
7
4.6
4.8
1.33
Automatic Selection of Indexes to create
1.34
Materialized views
 Materialized views
 Periodic Maintenance of Materialized views

Common for databases to serve two purposes

Record current inventory

Analysts to study patterns

Analyst’s queries involve many aggregation and joins

Not as sensitive to modifications

They are updated (reconstructed) periodically

Once a day or week
1.35