Transcript The Files

File Structures by Folk, Zoellick, and Riccardi
Chap1. Introduction to
File Structures
File StructuresFile Structure
SNU-OOPSLA Lab
1
The Files
A computer file is a resource for storing
information, which is available to a computer
program and is usually based on some kind of
secondary storage.
 A file is a collection of related information
defined by its creator. Commonly, files
represent programs (both source and object
forms) and data.

File StructuresFile Structure
SNU-OOPSLA Lab
2
Chapter Objectives
Introduce the primary design issues
characterizing file structure design
 Survey the history of file structure design
 Introduce the notions of file structure literacy and
of a conceptual toolkit for file structure design
 Discuss the need for specification of data
structures, operations and development of objectoriented toolkit
 Introduce classes and overloading in C++

File StructuresFile Structure
SNU-OOPSLA Lab
3
Contents
1.1 The heart of file structure design
1.2 History of file structure design
1.3 Conceptual toolkit : file structure literacy
1.4 Object-Oriented Toolkit
: make file structure usable
1.5 Using objects in C++
File StructuresFile Structure
SNU-OOPSLA Lab
4
1.1 The Heart of File Structure Design
The Heart of File Structure Design

File structure
 representation

of data + operation for accessing data
Disk vs. RAM
 speed
: very slower than RAM
 cost : enormous capacity at much less cost than RAM
 nonvolatile in Disk vs. volatile in RAM

Good file structure design

access to all the capacity without spending much time waiting for
disk
 uniform
average access time( The access
time or response time of a rotating drive is a measure
of the time it takes before the drive can actually transfer
data).
File StructuresFile Structure
SNU-OOPSLA Lab
5
1.2 History of File Structure Design
History of file structure design(1)
 Sequential
 look

file
at records in order
AVL tree (Balanced Binary Tree)
 self-adjusting
binary tree
 guarantee good access time for data in RAM

B-tree
 balanced
tree structure
 provide fast access to data in file
File StructuresFile Structure
SNU-OOPSLA Lab
6
1.2 History of File Structure Design
History of file structure design(2)
 B+-tree
 variation
on B-tree structure
 provide both sequential access and fast-indexed
access(An indexed file is a computer file with
an index that allows easy random access to
any record given its file key).
File StructuresFile Structure
SNU-OOPSLA Lab
7
History of file structure
design(3)

Hashing
 transform
search key into storage address(A hash
function is any algorithm that maps data of arbitrary
length to data of a fixed length. The values returned
by a hash function are called hash values)
 provide fast access to stored data(because we don’t
have any search, we only calculate the address)

Extendible hashing
 approach
to hashing that works well with files
undergoing many changes in size over time
 Extendible hashing is a type of hash system which
treats a hash as a bit string, and uses a tire for bucket
lookup
SNU-OOPSLA Lab
8
1.2 History of File Structure Design
Taxonomy of File Structures

Single-key files(files access using single value).



Index-based (Tree Data Structure)
Indexed Sequential File -> B-Tree -> B+-Tree
Hashing-based (Address Computation)
Hashing File -> Extendible Hashing File
Multi-key files (multidimensional)

K-D-B tree: In computer science, a k-d tree (short for kdimensional tree ) is a space-partitioning data structure for
organizing points in a k-dimensional space. k-d trees are a
useful data structure for several applications, such as
searches involving a multidimensional search key
File StructuresFile Structure
SNU-OOPSLA Lab
9
Taxonomy of File Structures
 Grid
file: It provides a grid of n-dimensions
where n represents how many keys can be
used to reference a single point.
 R-tree: tree data structures used for spatial
access methods, i.e., for indexing multidimensional information such as geographical
coordinates.

Multimedia Indexing Techniques


Audio, Image, Video
Multimedia (MM) data indexing refers to the problem of
preprocessing a database of MM objects so that they can be
efficiently searched for on the basis of their content
File StructuresFile Structure
SNU-OOPSLA Lab
10
1.3 A Conceptual Toolkit
Conceptual Toolkit : File Structure Literacy

The toolkit for file structure system refer to what the specific
programming language contains to use the files structures.

Objective of Conceptual toolkit
 Fundamental
file concepts
 Generic file operations

Conceptual tools in this book
 basic
tools + evolution of basic tools
 basic tools : Chapter 2 ~ 6
 evolution of basic tools : Chapter 7 ~ 12

B-trees, B+trees, hashed indexes, and extensible
dynamic hashed files
File StructuresFile Structure
SNU-OOPSLA Lab
11
1.4 Object-Oriented Toolkit
Object-Oriented Toolkit:
Making File Structures Usable

Object-Oriented Toolkit
 making
file structures usable requires turning
conceptual toolkit into collections :(classes) of data
types and operations

Major problem
 complicated
and progressive
 often modified and extended from other classes
and details of classes become more complex
File StructuresFile Structure
SNU-OOPSLA Lab
12
1.5. Using Objects in C++
Using objects in C++(1)
 Features
of object in C++
 class
definition
 data members(attributes) + methods(Functions)
 constructor
 provide a guarantee for initialization of every object &
called in creation time of object
 public, private & protected sections
 public label specifies that any users can freely access
 private & protected label are restrict access
File StructuresFile Structure
SNU-OOPSLA Lab
13
1.5. Using Objects in C++
Using objects in C++(2)
 operator
overloading
 allows a particular symbol to have more than
one meaning
 other
features
 inheritance, virtual function, and templates
 explained in later chapters
File StructuresFile Structure
SNU-OOPSLA Lab
14
Let’s Review!!
1.1 The heart of file structure design
1.2 History of file structure design
1.3 Conceptual toolkit : file structure literacy
1.4 Object-Oriented Toolkit
: make file structure usable
1.5 Using objects in C++
File StructuresFile Structure
SNU-OOPSLA Lab
15