Implementing File Systems

Transcript Implementing File Systems

Chapter 11:
Implementing File Systems
Chapter 11: Implementing File Systems
 File-System Structure
 File-System Implementation
 Directory Implementation
 Allocation Methods
 Free-Space Management
 Efficiency and Performance
 Recovery
 Log-Structured File Systems
 11.9 NFS (skip)
 11.10 Example: WAFL File System (skip)
Operating System Principles
11.2
Silberschatz, Galvin and Gagne ©2005
Objectives
 To describe the details of implementing local file
systems and directory structures
 To describe the implementation of remote file systems
(11.9, skip)
 To discuss block allocation and free-block algorithms
and trade-offs
Operating System Principles
11.3
Silberschatz, Galvin and Gagne ©2005
11.1 File-System Structure
 Disk characteristics for storing multiple files


Can be rewritten in place
Can access directly any block of information it contains
 File structure


Logical storage unit
Collection of related information
 File system resides on secondary storage (disks)

Allow the data in disk to be stored, located, and retrieved easily
How the file system should look to the user
 How to map the logical file system to the physical secondary
storage devices

 File system organized into layers
 File control block (FCB) – storage structure consisting
of information about a file
Operating System Principles
11.4
Silberschatz, Galvin and Gagne ©2005
Layered File System
System calls like create( ),
open( ), close( )
Manages metadata
information
Translates logical block
addresses to physical
block addresses
Device driver
Operating System Principles
11.5
Silberschatz, Galvin and Gagne ©2005
11.2 File-System Implementation
 On-disk and in-memory structures are used to
implement a file system
 On disk



A boot control block
A volume control block
A per-file FCB: file permissions, ownership, size, and location
of the data blocks
In UNIX File System, it is called inode
 In Windows NTFS, it is stored as a record in master file table


A directory structure
In Unix File System, this include file names and associated inode
numbers
 In NTFS, it is stored in the master file table

Operating System Principles
11.6
Silberschatz, Galvin and Gagne ©2005
11.2 File-System Implementation
 In-memory information is used for file-system
management and performance improvement via
caching

In-memory mount table

In-memory directory-structure cache

System-wide open-file table


A copy of the FCB for each open file
Per-process open-file table

A pointer to the appropriate entry in system-wide open-file table

In Unix, it is called a file descriptor

In Windows, it is called a file handler
Operating System Principles
11.7
Silberschatz, Galvin and Gagne ©2005
A Typical File Control Block
Operating System Principles
11.8
Silberschatz, Galvin and Gagne ©2005
In-Memory File System Structures
 The following figure illustrates the necessary file system
structures provided by the operating systems.
Figure 11-3(a) refers to opening a file.
Operating System Principles
11.9
Silberschatz, Galvin and Gagne ©2005
In-Memory File System Structures
Figure 11-3(b) refers to reading a file.
Operating System Principles
11.10
Silberschatz, Galvin and Gagne ©2005
Partitions and Mounting
 A disk can be sliced into multiple partitions. A volume can
span multiple partitions on multiple disks (RAID, Section
12.7)
 Raw disk: no file system.

Used in Unix swap space, and database management systems
 Boot information has its own format, and is usually a
sequential series of blocks, loaded as an image into
memory

Allow dual-booted for installing multiple OS
 The root partition, containing the OS kernel and other
system files, is mounted at boot time


Other volumes can be automatically mounted at boot time or
manually mounted later
Skip
OS maintains a mount table for mounted file systems
11.2.3
Operating System Principles
11.11
Silberschatz, Galvin and Gagne ©2005
11.3 Directory Implementation
 Linear list of file names with pointer to the data blocks.



simple to program but time-consuming to execute
To create a new file, directory must be searched to be sure
that no existing file has the same name. To delete a file, we
search the directory for the named file, then release the space
allocated to it
To reuse the directory entry, several options

Mark the entry as unused by
–
Assigning it s special name
–
Or with a used-unused bit
Attach it to a list of free directory entries
 Copy the last entry in the directory into the freed location and
decrease the length of the directory


Disadvantage: finding a file requires a linear search

Make a list sorted would complicate the creating and deleting of
files
Operating System Principles
11.12
Silberschatz, Galvin and Gagne ©2005
Directory Implementation
 Hash Table – linear list stores the directory entries with a
hash table




The hash table takes a value computed from the file name and
returns a pointer to the file name in the linear list
decreases directory search time
Some provisions must be made for collisions – situations where
two file names hash to the same location
Difficulties:
fixed size (because it is a table)
 The dependence of the hash function on that size


Alternatively, a chained-overflow hash table can be used instead


each hash entry is a linked list instead of an individual value
Collisions resolved by adding the new entry to the linked list
Operating System Principles
11.13
Silberschatz, Galvin and Gagne ©2005
11.4 Allocation Methods
 An allocation method refers to how disk blocks are
allocated for files.

How to allocate space to these files so that disk space is
utilized effectively and files can be accessed quickly
 Three major methods

Contiguous allocation

Linked allocation

Indexed allocation
Operating System Principles
11.14
Silberschatz, Galvin and Gagne ©2005
Contiguous Allocation
 Each file occupies a set of contiguous blocks on the disk

Simple – only starting location (block #) and length (number of
blocks) are required

Random access (next page)
 Problems


Dynamic storage-allocation problem

First-fit, best-fit, worst-fit

Repacking off-line or on-line
Determining how much space is needed for a file when it is
created.

If we allocate too little space to a file, it cannot be extended

Pre-allocation may be inefficient
Operating System Principles
11.15
Silberschatz, Galvin and Gagne ©2005
Contiguous Allocation
 Mapping from logical to physical
Q (Quotient)
Logical Address/512
R (Remainder)
Block to be accessed = Q + starting address
Displacement into block = R
Operating System Principles
11.16
Silberschatz, Galvin and Gagne ©2005
Contiguous Allocation of Disk Space
Operating System Principles
11.17
Silberschatz, Galvin and Gagne ©2005
Extent-Based Systems
 Many newer file systems (I.e. Veritas File System)
use this modified contiguous allocation scheme
 Extent-based file systems allocate disk blocks in
extents

A contiguous chunk of space is allocated initially

If that amount is not large enough later, another chunk of
contiguous space, called extent, is added

A file consists of one or more extents.

The location of a file’s blocks is recorded as a location and a
block count, plus a link to the first block of the next extent
Operating System Principles
11.18
Silberschatz, Galvin and Gagne ©2005
Linked Allocation
 Each file is a linked list of disk blocks: blocks may
be scattered anywhere on the disk.
 The directory contains for each file a pointer to the
first and last blocks of the file
Pointer to the next block
Contents of a block
Operating System Principles
11.19
Silberschatz, Galvin and Gagne ©2005
Linked Allocation (Cont.)
 Simple – need only starting address
 Free-space management system – no waste of space
 No random access
 Mapping
Q (Quotient)
Logical Address/511
R (Remainder)
Block to be accessed is the Q-th block in the linked chain of
blocks representing the file.
Displacement into block = R + 1
File-allocation table (FAT) – disk-space allocation used by MS-DOS
and OS/2.
Operating System Principles
11.20
Silberschatz, Galvin and Gagne ©2005
Linked Allocation
Disadvantages:
1. Can be used effectively only for
sequential-access files
2. The space required for the
pointers.
Solution: collect blocks into
clusters, and allocate clusters
rather than blocks. Cost is
increased internal
fragmentation.
3. Reliability: disaster if pointers
were lost or damaged.
Solution: doubly linked lists or
store file name and relative
block number in each block.
10
16
25
1
Operating System Principles
11.21
Silberschatz, Galvin and Gagne ©2005
Linked Allocation
 Linear list of file names with pointer to the data
blocks

simple to program

time-consuming to execute
 Hash Table – linear list with hash data structure

decreases directory search time

collisions – situations where two file names hash to the
same location

fixed size
Operating System Principles
11.22
Silberschatz, Galvin and Gagne ©2005
File-Allocation Table (MS-DOS and OS/2)
Unused block: a 0 table value
Allocating a new block to a file:
Finding the first 0-valued table
entry and replacing the previous
end-of-file value with the address
of the new block. The 0 table
entry is then replaced by the
end-of-file value.
end-of-file
Operating System Principles
11.23
Silberschatz, Galvin and Gagne ©2005
Indexed Allocation
 Brings all pointers together into the index block
 Each file has its own index block
 Logical view.
index table
Operating System Principles
11.24
Silberschatz, Galvin and Gagne ©2005
Example of Indexed Allocation
Operating System Principles
11.25
Silberschatz, Galvin and Gagne ©2005
Indexed Allocation
 Need index table
 Random access
 Dynamic access without external fragmentation, but
have overhead of index block
 With only 1 block for index table and a block size of
512 words, mapping from logical to physical in a file
of maximum size of 256K (=0.5K * 512) words.
Q (Quotient)
Logical Address/512
R (Remainder)
Q = displacement into index table
R = displacement into block
Operating System Principles
11.26
Silberschatz, Galvin and Gagne ©2005
Indexed Allocation
 If the index block is too small, it will not be able to hole
enough pointers for a large file. Mechanisms to handle
this issue:

Linked scheme


Multilevel index


The last word in the index block is nil (for a small file) or is a
pointer to another index block (for a large file)
Use first-level index block to point to a set of second-level index
blocks, which point to the file blocks
Combined scheme

Example: In Unix File System, for the 15 pointers of the index
block in the file’s inode
–
The first 12 point to data of the file
–
The next three pointers point to (single, double, triple) indirect blocks
Operating System Principles
11.27
Silberschatz, Galvin and Gagne ©2005
Combined Scheme: UNIX (4K bytes per block)
Operating System Principles
11.28
Silberschatz, Galvin and Gagne ©2005
Performance
 Before selecting an allocation method, we need to know
how the system would be used

Contiguous allocation requires only one access to get a disk block.

Linked allocation is only good for sequential access.


Keeping index block in memory requires considerable space. The
performance of indexed allocation depends on the index structure
(how many level), on the size of the file, and on the position of the
block desired.


Some system supports both, but require the declaration of the type of
access in file creation.
Some system uses contiguous allocation for small files and
automatically switching to an index allocation if the file grows large
Many other optimizations are in use

It is reasonable to add (hundreds of) thousands of instructions to save
a few disk-head movements
Operating System Principles
11.29
Silberschatz, Galvin and Gagne ©2005