PowerPoint XP

Naming and Directories
Andy Wang
Operating Systems
COP 4610 / CGS 5675
Recall from last time…
 A file header associates the file with its data
blocks
File Header Storage
 Under UNIX, a file header is stored in a data
structure called an i-node
 For early UNIX systems

I-nodes are stored in a special array

Fixed number of array entries
 Maximum number of files fixed

Not stored near data blocks on disk
 Reading a small file involves
 One disk seek to get the i-node
 Other disk seek(s) to get file blocks
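A minimal sketch of why a fixed-size i-node array caps the number of files. The `InodeTable` class, its method names, and the table size of 4 are hypothetical and chosen only for illustration; they do not come from any real file system.

```python
class InodeTable:
    """Hypothetical fixed-size i-node array (illustrative only)."""

    def __init__(self, num_entries):
        # The array size is fixed when the file system is created.
        self.entries = [None] * num_entries

    def allocate(self, block_numbers):
        """Store a file header in the first free slot; return its i-node number."""
        for inode_number, entry in enumerate(self.entries):
            if entry is None:
                self.entries[inode_number] = list(block_numbers)
                return inode_number
        raise OSError("no free i-nodes: maximum number of files reached")


table = InodeTable(num_entries=4)
for i in range(4):
    table.allocate([100 + i])      # four files fit

try:
    table.allocate([200])          # a fifth file cannot be created
    fifth_file_created = True
except OSError:
    fifth_file_created = False
```

Even if plenty of data blocks remain free, the fifth allocation fails because every slot in the header array is taken.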
Reasons for Separate Allocations
 Reliability

Data corruptions are unlikely to affect i-nodes
 Reduced fragmentation


File headers are smaller than a whole block
By packing them in an array, multiple headers
can be fetched from disk in a single read
 File headers are accessed more often
 e.g., ls

Grouping file headers improves disk efficiency
For BSD 4.2…
 Portions of file header array stored on each
cylinder
 For small directories


All file headers and data are stored in the same
cylinder
Reduces seek time
Naming
 Remember that odd moment when your
computer asks you to name the first file?
 Naming: allows users to issue file names
instead of i-node numbers
- Users tend to come up with poor names

e.g., test
- Many files are difficult to name…
How do you name these photos?
Directories
 A table of file names and their i-node
numbers
 Under many file systems


Directories are implemented as normal files
Containing file names and i-node numbers
 Only the OS is permitted to modify directories
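The name-to-i-node table can be observed from user space. The sketch below uses Python's standard `os.scandir` to list a directory as a table of names and i-node numbers. The helper name `directory_table` and the file names are assumptions made for illustration.

```python
import os
import tempfile


def directory_table(path):
    """Return {file name: i-node number} for the entries in a directory."""
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}


with tempfile.TemporaryDirectory() as tmp:
    # Creating files goes through the OS, which updates the directory for us.
    for name in ("cat.jpg", "dog.jpg"):
        with open(os.path.join(tmp, name), "w") as f:
            f.write("placeholder")

    table = directory_table(tmp)

    # The number recorded in the directory matches the one stat() reports.
    matches = all(
        os.stat(os.path.join(tmp, name)).st_ino == inode_number
        for name, inode_number in table.items()
    )
```

Note that the program never writes the directory itself; it only asks the OS to create files and then reads the resulting table.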
Name Space
 Flat name space
 Hierarchical naming
 Relational name space
 Contextual naming
 Content-based naming
Flat Name Space
 All files are stored in a single directory
+ Easy to implement
- Not scalable for large directories

Name collisions: multiple files with the same
name
Hierarchical Naming
 Uses multiple levels of directories
 Most popular name space organization
+ Conceptual model maps well into the human
model of organizing things

A file cabinet contains many files
+ Scalable

The probability of name collisions decreases
+ Spatial locality

Store all files under a directory within a
cylinder to avoid disk seeks
More on Hierarchical Naming
 Absolute path name: consists of the path
from the root directory ‘/’ to the file

e.g., /pets/cat.jpg
‘/’ is the root directory, pets is a subdirectory, and cat.jpg is the file name
Drawbacks of Hierarchical Naming
- Not all files can fit into the hierarchical model
[Figure: two candidate directories, pets and pests, each marked with a question mark]
- Accessing a file may involve many levels of
directory lookups, or a path resolution
before getting to the file content
An Example of Path Resolution
 To access the data content of
/pets/cat.jpg
 The system needs to perform the following
disk I/Os
1. Read the file header for the root directory ‘/’
   (stored at a fixed location on disk)
2. Read the first data block for the root directory
   and look up the directory entry for pets
3. Read the file header for pets
4. Read the first data block for the pets directory
   and look up the directory entry for cat.jpg
5. Read the file header for cat.jpg
6. Read the data block for cat.jpg
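The six disk I/Os for /pets/cat.jpg can be traced with a small simulation. This is only a sketch under stated assumptions: the i-node numbers, block numbers, and the `INODES`/`BLOCKS` tables are hypothetical, and each directory is assumed to fit in one data block.

```python
# Hypothetical on-disk layout: i-node number -> file header.
# A header records the type and the data block holding the contents.
INODES = {
    2: {"type": "dir", "block": 10},    # root directory '/' (fixed location)
    7: {"type": "dir", "block": 11},    # pets
    9: {"type": "file", "block": 12},   # cat.jpg
}
# Data blocks: a directory block is a table of name -> i-node number.
BLOCKS = {
    10: {"pets": 7},
    11: {"cat.jpg": 9},
    12: b"<jpeg bytes>",
}
ROOT_INODE = 2


def read_file(path):
    """Resolve an absolute path; return (data, log of simulated disk I/Os)."""
    io_log = []

    def read_header(inode_number, name):
        io_log.append(f"read file header for {name}")
        return INODES[inode_number]

    def read_block(block_number, name):
        io_log.append(f"read data block for {name}")
        return BLOCKS[block_number]

    name = "/"
    header = read_header(ROOT_INODE, name)
    for component in path.strip("/").split("/"):
        if header["type"] != "dir":
            raise NotADirectoryError(name)
        entries = read_block(header["block"], name)
        header = read_header(entries[component], component)  # KeyError if missing
        name = component
    return read_block(header["block"], name), io_log


data, io_log = read_file("/pets/cat.jpg")
for step, io in enumerate(io_log, start=1):
    print(step, io)
```

Each path component costs two reads (a directory block, then a file header), which is why deep hierarchies make lookups expensive.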
Some Performance Optimizations…
 Top-level directories are usually cached
 A user inside a directory (e.g., /pets)

Can issue relative path names (e.g.,
cat.jpg) to refer to files within the current
directory
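A rough sketch of how much a relative path name can save. It assumes a hypothetical layout (the `INODES` and `BLOCKS` tables below) and assumes the system keeps the current directory's file header and data block in memory; real systems differ in what they cache.

```python
# Hypothetical layout: i-node number -> data block number, block -> contents.
INODES = {2: 10, 7: 11, 9: 12}
BLOCKS = {10: {"pets": 7}, 11: {"cat.jpg": 9}, 12: b"<jpeg bytes>"}
ROOT_INODE = 2


def count_disk_reads(path, cwd_inode, cache):
    """Count simulated disk reads needed to reach the data of `path`.

    `cache` is a set of ("inode", n) and ("block", n) keys already in memory.
    """
    reads = 0

    def fetch(kind, number):
        nonlocal reads
        if (kind, number) not in cache:
            reads += 1
            cache.add((kind, number))

    # Absolute names start at the root; relative names start at the cwd.
    inode = ROOT_INODE if path.startswith("/") else cwd_inode
    fetch("inode", inode)
    for component in path.strip("/").split("/"):
        block = INODES[inode]
        fetch("block", block)
        inode = BLOCKS[block][component]
        fetch("inode", inode)
    fetch("block", INODES[inode])
    return reads


cold = count_disk_reads("/pets/cat.jpg", cwd_inode=ROOT_INODE, cache=set())

# Assumption: the header and data block of the current directory (/pets)
# are already cached.
warm_cwd = {("inode", 7), ("block", 11)}
relative = count_disk_reads("cat.jpg", cwd_inode=7, cache=warm_cwd)
```

Under these assumptions the absolute name costs six reads from a cold cache, while the relative name needs only the file header and the data block of cat.jpg.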
Relational Name Space
 Hierarchical naming model is largely a tree
 Relational naming model allows the
construction of general graphs
 A file can belong to multiple folders


According to its attributes
Files can be accessed in a manner similar to
relational databases

e.g., keywords: cats and blinds
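A minimal sketch of attribute-based lookup, assuming a hypothetical `AttributeIndex` class and made-up photo names. A file may carry several attributes, so it can appear under several "folders" at once, and a query is a set intersection.

```python
from collections import defaultdict


class AttributeIndex:
    """Hypothetical attribute -> set-of-files index (illustrative only)."""

    def __init__(self):
        self._files_by_attribute = defaultdict(set)

    def tag(self, filename, *attributes):
        """Attach one or more attributes to a file."""
        for attribute in attributes:
            self._files_by_attribute[attribute].add(filename)

    def query(self, *attributes):
        """Return the files that carry every requested attribute."""
        sets = [self._files_by_attribute.get(a, set()) for a in attributes]
        return set.intersection(*sets) if sets else set()


index = AttributeIndex()
index.tag("photo1.jpg", "cats", "blinds")
index.tag("photo2.jpg", "cats")
index.tag("photo3.jpg", "blinds", "windows")

result = index.query("cats", "blinds")
```

Here photo1.jpg is reachable through both cats and blinds, which a strict tree cannot express without copies or links.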
Pros/Cons of Relational Name Space
+ More flexible than hierarchical naming
- May require a long list of attributes to name a
single piece of data

e.g., this lecture

Keywords: operating systems, file systems,
naming, PowerPoint XP
- Who will create those attributes?
Contextual Naming
 Takes advantage of the observation that
certain attributes can be added automatically

e.g., when you open a file in Word, the
system searches only files of the types
supported by Word (.doc, .txt, .html)
+ Avoids a long list of attributes
- A user may not remember the file name
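A small sketch of the Word example, where the application context supplies the file-type attribute automatically. The `SUPPORTED_TYPES` table, the function name, and the file names are hypothetical.

```python
import os

# Hypothetical mapping from an application to the file types it can open.
SUPPORTED_TYPES = {"word": {".doc", ".txt", ".html"}}


def candidates_for(application, filenames):
    """Keep only the files whose extension the application supports."""
    extensions = SUPPORTED_TYPES[application]
    return [
        name for name in filenames
        if os.path.splitext(name)[1].lower() in extensions
    ]


files = ["notes.txt", "cat.jpg", "report.doc", "index.html", "song.mp3"]
shown = candidates_for("word", files)
```

The user never types the attribute; it is implied by which program is doing the asking.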
Content-based Naming
 Searches for a file by its content instead of
its name
 File contents are extracted automatically

e.g., I want a photo of a cat taken five years
ago

The system returns all files satisfying the criteria
Content-based Naming
- Requires advanced information processing
techniques



e.g., image recognition
Many existing systems use manual indexing
Automated content-based naming is still an
active area of research
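For plain text, contents can be extracted automatically with a simple inverted index. The sketch below assumes hypothetical file names and contents, and it does not attempt the hard cases mentioned above, such as image recognition.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each word to the set of file names whose contents contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, *words):
    """Return the files whose contents contain every given word."""
    sets = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*sets) if sets else set()


# Hypothetical files and contents.
documents = {
    "a.txt": "My cat sleeps on the window sill.",
    "b.txt": "Operating systems lecture notes on naming.",
    "c.txt": "The cat knocked the lecture notes off the desk.",
}
index = build_index(documents)
hits = search(index, "cat", "lecture")
```

The query names no file at all; the file names only appear in the result.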
Example: The “Internet File System”
 Can be viewed as a worldwide file system
 What is the naming scheme for the Internet
file system?
The “Internet File System”
 Contains shades of various naming schemes
 Flat name space:
 Each URL provides a unique name
 Hierarchical name space:
 Within individual websites
 Relational name space
 Can search the Internet via search engines
 Contextual name space:
 Pages are ranked according to relevance
 Content-based name space:
 You can find your information without knowing the
exact file names
Example: Plan 9
 Modern UNIX systems are deeply influenced by the
Plan 9 OS

Developed at Bell Labs
 Major design philosophy: everything is a file
 A single hierarchical name space for
 Processes (e.g., /proc)
 Files
 IPC (e.g., pipe)
 Devices (e.g., /dev/fd0)
 Use open/close/read/write for everything
 e.g., /dev/mem
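A sketch of the "everything is a file" idea on a UNIX-like system, not on Plan 9 itself. The same `os.open`/`os.read`/`os.write`/`os.close` calls from Python's standard library are applied to a regular file, a pipe, and a device. The helper `write_then_close` is a hypothetical name; the null device stands in for a device such as /dev/mem, which normally requires special privileges.

```python
import os
import tempfile


def write_then_close(fd, payload):
    """Write bytes to any file descriptor, whatever object it names."""
    os.write(fd, payload)
    os.close(fd)


# 1. A regular file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note.txt")
    write_then_close(os.open(path, os.O_WRONLY | os.O_CREAT, 0o600), b"hello")
    fd = os.open(path, os.O_RDONLY)
    from_file = os.read(fd, 100)
    os.close(fd)

# 2. IPC: a pipe.
read_end, write_end = os.pipe()
write_then_close(write_end, b"hello")
from_pipe = os.read(read_end, 100)
os.close(read_end)

# 3. A device: the null device discards writes and reads as end-of-file.
write_then_close(os.open(os.devnull, os.O_WRONLY), b"hello")
fd = os.open(os.devnull, os.O_RDONLY)
from_device = os.read(fd, 100)
os.close(fd)
```

The helper works unchanged for all three objects because each one is reached through a file descriptor.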