PowerPoint XP
Naming and Directories
Andy Wang
Operating Systems
COP 4610 / CGS 5675
Recall from last time…
A file header associates the file with its data
blocks
File Header Storage
Under UNIX, a file header is stored in a data
structure called an i-node
For early UNIX systems
I-nodes are stored in a special array
Fixed number of array entries
Maximum number of files fixed
Not stored near data blocks on disk
Reading a small file involves
One disk seek to get the i-node
Other disk seek(s) to get file blocks
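The fixed-size i-node array can be pictured with a small sketch. This is only an illustration of why the maximum number of files is fixed; the class name `InodeTable` and its methods are hypothetical and do not correspond to any real UNIX data structure.

```python
class InodeTable:
    """Toy model of an early-UNIX style i-node array (hypothetical names)."""

    def __init__(self, num_entries):
        # The array size is chosen once, so the file limit is fixed.
        self.entries = [None] * num_entries

    def allocate(self, header):
        """Store a file header in the first free slot and return its i-node number."""
        for number, slot in enumerate(self.entries):
            if slot is None:
                self.entries[number] = header
                return number
        raise OSError("no free i-nodes: maximum number of files reached")


table = InodeTable(num_entries=2)
first = table.allocate({"blocks": [100]})
second = table.allocate({"blocks": [101]})
# A third allocate() call would raise OSError, even if data blocks are still free.
```

In this model a disk can run out of i-nodes while free data blocks remain, which is the cost of fixing the array size up front.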
Reasons for Separate Allocations
Reliability
Data corruption is unlikely to affect i-nodes
Reduced fragmentation
File headers are smaller than a whole block
By packing them in an array, multiple headers
can be fetched from disk in a single read
File headers are accessed more often
e.g., ls
Grouping file headers improves disk efficiency
For BSD 4.2…
Portions of file header array stored on each
cylinder
For small directories
All file headers and data stored in the same
cylinder
Reduce seek time
Naming
Remember that odd moment when your
computer asked you to name your first file?
Naming: allows users to issue file names
instead of i-node numbers
- Users tend to come up with poor names
e.g., test
- Many files are difficult to name…
How do you name these photos?
Directories
A table of file names and their i-node
numbers
Under many file systems
Directories are implemented as normal files
Containing file names and i-node numbers
Only the OS is permitted to modify directories
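The name-to-i-node table can be observed on a real UNIX-like system. The sketch below uses Python's standard `os.scandir`, whose entries expose `name` and `inode()`; the helper name `list_directory` and the file names are made up for illustration.

```python
import os
import tempfile


def list_directory(path):
    """Return {file name: i-node number} for the entries of a directory."""
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}


# Build a scratch directory with two empty files and print its table.
with tempfile.TemporaryDirectory() as scratch:
    for name in ("cat.jpg", "dog.jpg"):
        open(os.path.join(scratch, name), "w").close()
    for name, inode_number in sorted(list_directory(scratch).items()):
        print(name, inode_number)
```

The program only reads the directory through the OS interface; it never edits the directory contents directly, consistent with the rule that only the OS modifies directories.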
Name Space
Flat name space
Hierarchical naming
Relational name space
Contextual naming
Content-based naming
Flat Name Space
All files are stored in a single directory
+ Easy to implement
- Not scalable for large directories
Name collisions: multiple files with the same
name
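A name collision in a flat name space can be shown with a single table. This is a minimal sketch; `FlatNamespace` and its `create` method are hypothetical names, and i-node numbers are simply handed out in order.

```python
class FlatNamespace:
    """Toy flat name space: one directory holds every file (hypothetical)."""

    def __init__(self):
        self.entries = {}      # file name -> i-node number
        self.next_inode = 1

    def create(self, name):
        """Add a name, refusing it if another file already uses it."""
        if name in self.entries:
            raise FileExistsError(f"name collision: {name!r}")
        self.entries[name] = self.next_inode
        self.next_inode += 1
        return self.entries[name]


space = FlatNamespace()
space.create("test")
# A second user calling space.create("test") would raise FileExistsError,
# because there is nowhere else to put a file with that name.
```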
Hierarchical Naming
Uses multiple levels of directories
Most popular name space organization
+ Conceptual model maps well into the human
model of organizing things
A file cabinet contains many files
+ Scalable
The probability of name collisions decreases
+ Spatial locality
Store all files under a directory within a
cylinder to avoid disk seeks
More on Hierarchical Naming
Absolute path name: consists of the path
from the root directory ‘/’ to the file
e.g., /pets/cat.jpg
/ is the root directory, pets is a subdirectory,
cat.jpg is the file name
Drawbacks of Hierarchical Naming
- Not all files can fit into the hierarchical model
e.g., pets or pests?
- Accessing a file may involve many levels of
directory lookups, or a path resolution
before getting to the file content
An Example of Path Resolution
To access the data content of
/pets/cat.jpg
The system needs to perform the following
disk I/Os
1. Read in the file header for the root directory ‘/’
Stored at a fixed location on disk
2. Read the first data block for the root directory
Look up the directory entry for pets
3. Read the file header for pets
4. Read the first data block for the pets directory
Look up the directory entry for cat.jpg
5. Read the file header for cat.jpg
6. Read the data block for cat.jpg
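The six disk I/Os for /pets/cat.jpg can be traced with a small simulation. This is a sketch under stated assumptions: the i-node numbers, block numbers, and the `resolve` function are invented for illustration, and each directory is assumed to fit in its first data block, as in the example.

```python
ROOT_INODE = 2  # assumption: the root header sits at a fixed, known i-node number

# i-node number -> file header (type and data block numbers); all values made up
inodes = {
    2: {"type": "dir", "blocks": [100]},
    7: {"type": "dir", "blocks": [200]},
    9: {"type": "file", "blocks": [300]},
}

# block number -> contents; directory blocks map names to i-node numbers
blocks = {
    100: {"pets": 7},
    200: {"cat.jpg": 9},
    300: b"<image data>",
}


def resolve(path):
    """Return (file data, list of simulated disk I/Os) for an absolute path."""
    log = []

    def read_inode(number):
        log.append(f"read file header (i-node {number})")
        return inodes[number]

    def read_block(number):
        log.append(f"read data block {number}")
        return blocks[number]

    header = read_inode(ROOT_INODE)
    for name in [part for part in path.split("/") if part]:
        if header["type"] != "dir":
            raise NotADirectoryError(name)
        entries = read_block(header["blocks"][0])  # first block only
        if name not in entries:
            raise FileNotFoundError(name)
        header = read_inode(entries[name])
    data = read_block(header["blocks"][0])
    return data, log


data, io_log = resolve("/pets/cat.jpg")
for step, entry in enumerate(io_log, start=1):
    print(step, entry)
```

Each path component costs two reads (one directory data block, one file header), on top of the root header and the file's own data block, which gives six reads for this two-component path.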
Some Performance Optimizations…
Top-level directories are usually cached
A user inside a directory (e.g., /pets)
Can issue relative path names (e.g.,
cat.jpg) to refer to files within the current
directory
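The effect of both optimizations on the number of disk reads can be estimated with a toy model. The tables, the `count_ios` function, and its `cwd` and `cached` parameters are hypothetical; a real kernel's caches are far more elaborate.

```python
ROOT = 2  # assumed i-node number of the root directory

# i-node number -> (type, first data block); all numbers are made up
inodes = {2: ("dir", 100), 7: ("dir", 200), 9: ("file", 300)}
# data block -> contents; directory blocks map names to i-node numbers
blocks = {100: {"pets": 7}, 200: {"cat.jpg": 9}, 300: b"<image data>"}


def count_ios(path, cwd=ROOT, cached=frozenset()):
    """Count simulated disk reads needed to reach the file's data block."""
    ios = 0

    def read(kind, number):
        nonlocal ios
        if (kind, number) not in cached:  # cached items cost no disk read
            ios += 1

    # Absolute paths start at the root; relative paths start at cwd.
    current = ROOT if path.startswith("/") else cwd
    read("inode", current)
    for name in filter(None, path.split("/")):
        block = inodes[current][1]
        read("block", block)
        current = blocks[block][name]
        read("inode", current)
    read("block", inodes[current][1])
    return ios


uncached = count_ios("/pets/cat.jpg")
root_cached = count_ios("/pets/cat.jpg", cached={("inode", 2), ("block", 100)})
relative = count_ios("cat.jpg", cwd=7, cached={("inode", 7)})
print(uncached, root_cached, relative)  # prints: 6 4 3
```

The relative lookup assumes the header of the current directory is already held in memory, which is why it needs only three reads in this model.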
Relational Name Space
Hierarchical naming model is largely a tree
Relational naming model allows the
construction of general graphs
A file can belong to multiple folders
According to its attributes
Files can be accessed in a manner similar to
relational databases
e.g., keywords: cats and blinds
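A keyword lookup of this kind can be sketched with sets of attributes. The file names, attributes, and the `find` function below are invented for illustration; a real relational name space would sit on an index rather than a linear scan.

```python
# Each file carries a set of attributes instead of living in one folder.
files = {
    "img_001.jpg": {"cats", "blinds", "photo"},
    "img_002.jpg": {"cats", "photo"},
    "lecture.ppt": {"operating systems", "naming"},
}


def find(*keywords):
    """Return the names of all files that carry every requested keyword."""
    wanted = set(keywords)
    return sorted(name for name, attrs in files.items() if wanted <= attrs)


print(find("cats", "blinds"))  # ['img_001.jpg']
print(find("cats"))            # ['img_001.jpg', 'img_002.jpg']
```

The same file shows up under several queries, which is the sense in which a file can belong to multiple folders.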
Pros/Cons of Relational Name Space
+ More flexible than hierarchical naming
- May require a long list of attributes to name a
single piece of data
e.g., this lecture
Keywords: operating systems, file systems,
naming, PowerPoint XP
- Who will create those attributes?
Contextual Naming
Takes advantage of the observation that
certain attributes can be added automatically
e.g., when you try to open a file with Word, the
system searches only for the file types
supported by Word (.doc, .txt, .html)
+ Avoids a long list of attributes
- A user may not remember the file name
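The Word example amounts to filtering candidates by an attribute the system supplies on its own. A minimal sketch follows, assuming a hypothetical `SUPPORTED` table and `candidates` function; only the three extensions named above come from the slide.

```python
from pathlib import PurePosixPath

# Context supplied by the application, not typed by the user (hypothetical table).
SUPPORTED = {"word": {".doc", ".txt", ".html"}}


def candidates(app, filenames):
    """Keep only the files whose extension the given application can open."""
    extensions = SUPPORTED[app]
    return [f for f in filenames if PurePosixPath(f).suffix.lower() in extensions]


print(candidates("word", ["notes.doc", "cat.jpg", "README.TXT", "index.html"]))
# ['notes.doc', 'README.TXT', 'index.html']
```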
Content-based Naming
Searches for a file by its content instead of
its name
File contents are extracted automatically
e.g., I want a photo of a cat taken five years
ago
The system returns all files satisfying the criteria
Content-based Naming
- Requires advanced information processing
techniques
e.g., image recognition
Many existing systems use manual indexing
Automated content-based naming is still an
active area of research
Example: The “Internet File System”
Can be viewed as a worldwide file system
What is the naming scheme for the Internet
file system?
The “Internet File System”
Contains shades of various naming schemes
Flat name space:
Each URL provides a unique name
Hierarchical name space:
Within individual websites
Relational name space:
Can search the Internet via search engines
Contextual name space:
Pages ranked according to relevance
Content-based name space:
You can find your information without knowing the
exact file names
Example: Plan 9
The Plan 9 OS has had a deep-rooted influence on
modern UNIX
Developed at Bell Labs
Major design philosophy: everything is a file
A single hierarchical name space for
Processes (e.g., /proc)
Files
IPC (e.g., pipe)
Devices (e.g., /dev/fd0)
Use open/close/read/write for everything
e.g., /dev/mem
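The uniform open/close/read/write interface can be seen on any UNIX-like system. The sketch below is not Plan 9 code; it uses Python's standard `os` module to show that the same `os.read` and `os.write` calls work on a regular file and on a pipe. The function names are made up for illustration.

```python
import os
import tempfile


def roundtrip_file(message):
    """Write bytes to a temporary regular file and read them back."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, message)
        os.lseek(fd, 0, os.SEEK_SET)   # rewind before reading
        return os.read(fd, len(message))
    finally:
        os.close(fd)
        os.unlink(path)


def roundtrip_pipe(message):
    """Write bytes into a pipe (IPC) and read them back from the other end."""
    read_end, write_end = os.pipe()
    try:
        os.write(write_end, message)   # small message, fits in the pipe buffer
        return os.read(read_end, len(message))
    finally:
        os.close(read_end)
        os.close(write_end)


print(roundtrip_file(b"hello"), roundtrip_pipe(b"hello"))
```

Only the way the descriptor is obtained differs; once a descriptor exists, the calling code does not need to know whether it refers to a file or to an IPC channel.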