Transcript The Files
File Structures by Folk, Zoellick, and Riccardi
Chap1. Introduction to
File Structures
File StructuresFile Structure
SNU-OOPSLA Lab
1
The Files
A computer file is a resource for storing
information, which is available to a computer
program and is usually based on some kind of
secondary storage.
A file is a collection of related information
defined by its creator. Commonly, files
represent programs (both source and object
forms) and data.
File StructuresFile Structure
SNU-OOPSLA Lab
2
Chapter Objectives
Introduce the primary design issues
characterizing file structure design
Survey the history of file structure design
Introduce the notions of file structure literacy and
of a conceptual toolkit for file structure design
Discuss the need for specification of data
structures, operations and development of objectoriented toolkit
Introduce classes and overloading in C++
File StructuresFile Structure
SNU-OOPSLA Lab
3
Contents
1.1 The heart of file structure design
1.2 History of file structure design
1.3 Conceptual toolkit : file structure literacy
1.4 Object-Oriented Toolkit
: make file structure usable
1.5 Using objects in C++
File StructuresFile Structure
SNU-OOPSLA Lab
4
1.1 The Heart of File Structure Design
The Heart of File Structure Design
File structure
representation
of data + operation for accessing data
Disk vs. RAM
speed
: very slower than RAM
cost : enormous capacity at much less cost than RAM
nonvolatile in Disk vs. volatile in RAM
Good file structure design
access to all the capacity without spending much time waiting for
disk
uniform
average access time( The access
time or response time of a rotating drive is a measure
of the time it takes before the drive can actually transfer
data).
File StructuresFile Structure
SNU-OOPSLA Lab
5
1.2 History of File Structure Design
History of file structure design(1)
Sequential
look
file
at records in order
AVL tree (Balanced Binary Tree)
self-adjusting
binary tree
guarantee good access time for data in RAM
B-tree
balanced
tree structure
provide fast access to data in file
File StructuresFile Structure
SNU-OOPSLA Lab
6
1.2 History of File Structure Design
History of file structure design(2)
B+-tree
variation
on B-tree structure
provide both sequential access and fast-indexed
access(An indexed file is a computer file with
an index that allows easy random access to
any record given its file key).
File StructuresFile Structure
SNU-OOPSLA Lab
7
History of file structure
design(3)
Hashing
transform
search key into storage address(A hash
function is any algorithm that maps data of arbitrary
length to data of a fixed length. The values returned
by a hash function are called hash values)
provide fast access to stored data(because we don’t
have any search, we only calculate the address)
Extendible hashing
approach
to hashing that works well with files
undergoing many changes in size over time
Extendible hashing is a type of hash system which
treats a hash as a bit string, and uses a tire for bucket
lookup
SNU-OOPSLA Lab
8
1.2 History of File Structure Design
Taxonomy of File Structures
Single-key files(files access using single value).
Index-based (Tree Data Structure)
Indexed Sequential File -> B-Tree -> B+-Tree
Hashing-based (Address Computation)
Hashing File -> Extendible Hashing File
Multi-key files (multidimensional)
K-D-B tree: In computer science, a k-d tree (short for kdimensional tree ) is a space-partitioning data structure for
organizing points in a k-dimensional space. k-d trees are a
useful data structure for several applications, such as
searches involving a multidimensional search key
File StructuresFile Structure
SNU-OOPSLA Lab
9
Taxonomy of File Structures
Grid
file: It provides a grid of n-dimensions
where n represents how many keys can be
used to reference a single point.
R-tree: tree data structures used for spatial
access methods, i.e., for indexing multidimensional information such as geographical
coordinates.
Multimedia Indexing Techniques
Audio, Image, Video
Multimedia (MM) data indexing refers to the problem of
preprocessing a database of MM objects so that they can be
efficiently searched for on the basis of their content
File StructuresFile Structure
SNU-OOPSLA Lab
10
1.3 A Conceptual Toolkit
Conceptual Toolkit : File Structure Literacy
The toolkit for file structure system refer to what the specific
programming language contains to use the files structures.
Objective of Conceptual toolkit
Fundamental
file concepts
Generic file operations
Conceptual tools in this book
basic
tools + evolution of basic tools
basic tools : Chapter 2 ~ 6
evolution of basic tools : Chapter 7 ~ 12
B-trees, B+trees, hashed indexes, and extensible
dynamic hashed files
File StructuresFile Structure
SNU-OOPSLA Lab
11
1.4 Object-Oriented Toolkit
Object-Oriented Toolkit:
Making File Structures Usable
Object-Oriented Toolkit
making
file structures usable requires turning
conceptual toolkit into collections :(classes) of data
types and operations
Major problem
complicated
and progressive
often modified and extended from other classes
and details of classes become more complex
File StructuresFile Structure
SNU-OOPSLA Lab
12
1.5. Using Objects in C++
Using objects in C++(1)
Features
of object in C++
class
definition
data members(attributes) + methods(Functions)
constructor
provide a guarantee for initialization of every object &
called in creation time of object
public, private & protected sections
public label specifies that any users can freely access
private & protected label are restrict access
File StructuresFile Structure
SNU-OOPSLA Lab
13
1.5. Using Objects in C++
Using objects in C++(2)
operator
overloading
allows a particular symbol to have more than
one meaning
other
features
inheritance, virtual function, and templates
explained in later chapters
File StructuresFile Structure
SNU-OOPSLA Lab
14
Let’s Review!!
1.1 The heart of file structure design
1.2 History of file structure design
1.3 Conceptual toolkit : file structure literacy
1.4 Object-Oriented Toolkit
: make file structure usable
1.5 Using objects in C++
File StructuresFile Structure
SNU-OOPSLA Lab
15