Introduction to Data Integrity

Download Report

Transcript Introduction to Data Integrity

Chapter No 4
Query optimization and Data Integrity & Security
What is a query optimizer?
Query optimization is of great importance for the
performance of a relational database, especially for the
execution of complex SQL statements. A query
optimizer determines the best strategy for performing
each query.
Concurrent user access to the same data is
one of the most central and vexing issues for
applications utilizing databases. Concurrency
can affect two of the most important facets of
any application: the underlying integrity of
the data and the performance of the
application system.
The query optimizer chooses, for example,
whether or not to use indexes for a given
query, and which join techniques to use
when joining multiple tables. These
decisions have a tremendous effect on SQL
performance, and query optimization is a
key technology for every application, from
operational systems to data warehouse
and analysis systems to contentmanagement systems.
Introduction to Data Integrity
Oracle uses integrity constraints to
prevent invalid data entry into the base
tables of the database. You can define
integrity constraints to enforce the
business rules you want to associate with
the information in a database.
Examples of Data Integrity
Types of Integrity Constraints
NOT NULL Integrity Constraints
•A NOT NULL constraint requires a column
of a table contain no null values.
•UNIQUE Key Integrity Constraints
A UNIQUE key integrity constraint requires
that every value in a column or set of
columns (key) be unique—that is, no two
rows of a table have duplicate values in a
specified column or set of columns.
PRIMARY KEY Integrity Constraints:
It consists of one or more columns to keep the
records unique in the table. It should not be
empty. It helps to find a record uniquely in the
table.
Referential Integrity Constraints
Different tables in a relational database can be
related by common columns, and this rule
governs the relationship of the columns must be
maintained.
CHECK Integrity Constraints
A CHECK integrity constraint on a
column or set of columns requires that a
specified condition be.
Database security is the system, processes,
and procedures that protect a database from
unintended activity. Unintended activity can
be categorized as authenticated misuse,
malicious attacks or unintentional mistakes
made by authorized individuals or processes.
Database Recovery:
A database recovery log keeps a record of all changes made to
a database, including the addition of new tables or updates to
existing ones. This log is made up of a number of log extents,
each contained in a separate file called a log file.
Data Security
• Two Basic Categories of Computer Security:• Physical Security:- It is concerned with physical
protecting the computer resources.
–Physical Security Control:
• Data Backup.
• Hardware Backup.
• Planning and testing.
Data Security …Cont
• Logical Security:- It is concerned in
controlling access to information.
–Logical Security Control:• Unauthorized disclosure. (viewing DB).
• Unauthorized modification.
• Unauthorized withholding.
Database/Application Security
• Ensure that only authenticated users can
access the system
• And can access (read/update) only
data/interfaces that they are authorized to
access
13
Database and Application Security, Nov 2006
Limitations of SQL Authorization
• SQL does not support authorization at a tuple
level
– E.g. we cannot restrict students to see only (the tuples
storing) their own grades
• Web applications are dominant users of
databases
– Application end users don't have database user ids,
they are all mapped to the same database user id
– Database access control provides only a very coarse
application-level access control
14
Access Control in Application Layer
• Applications authenticate end users and
decide what interfaces to give to whom
– Screen level authorization: which users are
allowed to access which screens
– Parameter checking: users only authorized to
execute forms with certain parameter values
• E.g. CSE faculty can see only CSE grades
Privacy
• Aggregate information about private
information can be very valuable
– E.g. identification of epidemics, mining for
patterns (e.g. disease causes) etc.
• Privacy preserving data release
– E.g. in US, many organizations released
“anonymized” medical data, with names
removed, but zipcode (= pincode), sex and
date of birth retained
• Turns out above (zipcode,sex,date of birth)
uniquely identify most people!
– Correlate anonymized data with (say) electoral data
with same information
–Recent problems at America Online
• Released search history, apparently
anonymized, but users could be easily
identified in several cases
–Several top officials were fired
–Earlier problems revealed medical history
of
Massachusetts state governer.
• Not yet a criminal issue, but lawsuits
have happened
• Conflict with Right To Information Act
–Many issues still to be resolved
References:
- http://www.ibm.com/support
- Database and Application Security By S. Sudarshan
Computer Science and Engg. Dept
I.I.T. Bombay