FileSystemImplementation
Download
Report
Transcript FileSystemImplementation
File-System Structure
File structure user view
Logical storage unit
Collection of related information
File system physical view
Resides on secondary storage (disks)
Partitions and mounting
Blocks
Operating System Concepts
12.1
Silberschatz, Galvin and Gagne 2002
Moving-Head Disk Mechanism
Operating System Concepts
12.2
Silberschatz, Galvin and Gagne 2002
Low Level Format
Low-level formatting, or physical formatting — Dividing a
disk into sectors that the disk controller can read and
write.
Zones of cylinders allow more (up to 40%) sectors on outer
tracks
The sectors and tracks are mapped to a large 1-
dimensional array of logical blocks,
Block 0 is the first sector of the first track on the outermost
or innermost cylinder.
Mapping proceeds in order through that track, then the rest
of the tracks in that cylinder, and then through the rest of the
cylinders from outermost to innermost.
The block is the smallest unit of transfer.
Operating System Concepts
12.3
Silberschatz, Galvin and Gagne 2002
Blocks and Fragments
Block sizes
Smaller => less internal fragmentation
Larger => larger transfers
Can improve efficiency by varying file-system parameters, e.g.,
block size = frame size on swap device
4.2BSD uses two block sized for files which have no
indirect blocks:
All the blocks of a file are of a large block size (such as 8K),
except the last.
The last block is an appropriate multiple of a smaller
fragment size (e.g., 1024) to fill out the file.
Thus, a file of size 18,000 bytes would have two 8K blocks
and one 2K fragment (which would not be filled completely).
Operating System Concepts
12.4
Silberschatz, Galvin and Gagne 2002
High Level Format
To use a disk to hold files, the operating system still
needs to record its own data structures on the disk.
Partition the disk into one or more groups of cylinders.
Logical formatting or “making a file system”.
Operating System Concepts
12.5
Silberschatz, Galvin and Gagne 2002
Partitions and Disks
A physical device may support several partitions, each hosting
a physical file system.
Operating System Concepts
12.6
Silberschatz, Galvin and Gagne 2002
Partitions and Disks
Different file systems can support different uses.
Potentially different OSs in different partitions
Prevents one program from using all available space for a large file.
Speeds up searches on backup tapes and restoring partitions from
tape.
Known as
IBM minidisks
Mac volumes
UNIX file systems
Windows drives
Operating System Concepts
12.7
Silberschatz, Galvin and Gagne 2002
Partitions
Boot block - for booting the OS
Partition control block
Information about the partition
Free space list
Free directory space list
Superblock in UNIX, Master File Table in NTFS
Directory structure
Directory nodes, exactly one per file
Inodes in UNIX
MFT entries in NTFS
Organization links nodes into a data structure
Provides naming
Data blocks
Operating System Concepts
12.8
Silberschatz, Galvin and Gagne 2002
Directory Implementation
Linear list of file names with pointer to the data blocks.
Simple to program
Inefficient - searching, empty entries
UNIX directory files
Sorted tree, e.g., B-tree
NTFS B+ tree
Hash Table – linear list with hash data structure.
Decreases directory search time
Collisions – situations where two file names hash to the
same location
Fixed size, or overflow facilities
Operating System Concepts
12.9
Silberschatz, Galvin and Gagne 2002
Disk Structures
A logical file system that a user ordinarily sees may consist
of several physical file systems, each hosted in a partition
of a physical device.
Permits very large logical file systems
Makes monolithic backups possible
The root file system is always available on a partition.
Other file systems may be mounted — i.e., integrated into
the directory hierarchy of the root file system.
A file system must be mounted before it can be accessed.
A unmounted file system mounted at a mount point.
Operating System Concepts
12.10
Silberschatz, Galvin and Gagne 2002
Mapping File System to Physical Devices
Operating System Concepts
12.11
Silberschatz, Galvin and Gagne 2002
Layered File System
Applications call the LFS via system calls
LFS deals with file names, file and
directory operations, file tables, security
FOM maps logical to physical, buffers,
tracks free blocks, tracks bad blocks
BFS sends commands to DDs to access
specified blocks on the device
I/O control consists of device drivers and
interrupts handlers for each device.
Operating System Concepts
12.12
Silberschatz, Galvin and Gagne 2002
File Tables
Draw my example
To open a file
System wide file table is search to see if already open
If no, the directory structure is searched (cached if possible)
Node copied to system wide file table
A new entry in the per-process file table
Refers to the existing SW file table entry
Contains mode of access, and offset
Counter incremented in the SW file table
To close a file
Per-process file table entry is removed
Count decremented in SW file table, and if 0 the node is
written to disk
To create a file
LFS allocates a new directory node
Reads in, updates, and writes out the directory
Operating System Concepts
12.13
Silberschatz, Galvin and Gagne 2002
Contiguous Allocation
Each file occupies a set
of contiguous blocks on
the disk
E.g., IBM, VMS
Operating System Concepts
12.14
Silberschatz, Galvin and Gagne 2002
Contiguous Allocation
Random access
Block = LA / BlockSize
Offset = LA % BlockSize
Simple – only starting location (block #) and length
(number of blocks) are required in a directory node
Wasteful of space
Dynamic storage-allocation problem
External fragmentation
Files cannot grow easily
Initial allocation?
Operating System Concepts
12.15
Silberschatz, Galvin and Gagne 2002
Linked Allocation
Each file is a linked list of disk blocks: blocks may be
scattered anywhere on the disk.
Block is a pointer (disk address) and data
block
Operating System Concepts
12.16
=
pointer
Silberschatz, Galvin and Gagne 2002
Linked Allocation (Cont.)
Random access
Block in chain = LA / (BlockSize - LinkSize)
Offset = LA % (BlockSize - LinkSize)
Simple – need only starting address
Free-space management system – no waste of space
New file has NULL pointer in directory
Easy to get another or release a block
Slow random access
Operating System Concepts
12.17
Silberschatz, Galvin and Gagne 2002
Linked Allocation (Cont.)
FAT variation
Table with index corresponding to blocks
Linked table entries - cached for fast traversal
Used by DOS and OS/2
Operating System Concepts
12.18
Silberschatz, Galvin and Gagne 2002
Extent-Based Systems
Many newer file systems (e.g., Veritas File System) use a
modified contiguous allocation scheme.
Extent-based file systems allocate disk blocks in extents.
An extent is a contiguous block of disks. Extents are
allocated for file allocation. A file consists of one or more
extents.
Extents are linked together
Extents reduce LinkSize overhead
E.g., NTFS
Operating System Concepts
12.19
Silberschatz, Galvin and Gagne 2002
Indexed Allocation
Brings all pointers together into the index block.
index table
Operating System Concepts
12.20
Silberschatz, Galvin and Gagne 2002
Indexed Allocation (Cont.)
Random access
Index table entry = LA / BlockSize
Offset = LA % BlockSize
Fast random access
Cached for speed
Need index block
Small files have mostly empty index block
Fixed maximal file size
Operating System Concepts
12.21
Silberschatz, Galvin and Gagne 2002
Linked - Indexed Allocation
Linked scheme – Link blocks of index table
Random access
Index block in chain =
LA / (((BlockSize-LinkSize)/LinkSize) * BlockSize)
LAInBlock = LA % (((BlockSize-LinkSize)/LinkSize) * BlockSize)
Index table entry = LAInBlock / BlockSize
Offset = LAInBlock % BlockSize
Operating System Concepts
12.22
Silberschatz, Galvin and Gagne 2002
Two Level Indexed Allocation
outer-index
index table
Operating System Concepts
12.23
file
Silberschatz, Galvin and Gagne 2002
Two Level Indexed Allocation
Random access
Outer index table entry = LA / (BlockSize2/LinkSize)
LAInBlock = LA % (BlockSize2/LinkSize)
Inner index table entry = LAInBlock / BlockSize
Offset = LAInBlock % BlockSize
Fixed, but large maximal size
Fast random access if cached
Operating System Concepts
12.24
Silberschatz, Galvin and Gagne 2002
Combined Scheme: UNIX (4K bytes per block)
Fast for small files
Gradual degradation as file size increases
Operating System Concepts
12.25
Silberschatz, Galvin and Gagne 2002
Free-Space Management
Linked list (free list)
Cannot get contiguous
space easily
No waste of space
Can use same routines as
for file management
Operating System Concepts
12.26
Silberschatz, Galvin and Gagne 2002
Free-Space Management
Bit vector (n blocks) (e.g., Mac)
0 1 2
n-1
bit[i] =
…
1 block[i] free
0 block[i] occupied
Can use Motorola “find bit” instruction
Easy to get contiguous blocks
Needs to be kept in memory
block size = 212 bytes (4KB)
disk size = 236 bytes (64GB)
n = 236/212 = 224 bits = 221 bytes (2MB (!))
Operating System Concepts
12.27
Silberschatz, Galvin and Gagne 2002
Efficiency and Performance
Efficiency dependent on:
Disk allocation and directory algorithms
Types of data kept in file’s directory entry
Performance
Disk cache – separate section of main memory for
frequently used blocks
Free-behind and read-ahead – techniques to optimize
cache performance for sequential access
A page cache caches pages rather than disk blocks using
virtual memory techniques.
Improve PC performance by dedicating section of memory
as virtual disk, or RAM disk.
Operating System Concepts
12.28
Silberschatz, Galvin and Gagne 2002
Log Structured File Systems
Log structured (or journaling) file systems record each
update to the file system as a transaction.
All transactions are written to a log. A transaction is
considered committed once it is written to the log.
However, the file system may not yet be updated.
The transactions in the log are asynchronously written to
the file system. When the file system is modified, the
transaction is removed from the log.
If the file system crashes, all remaining transactions in the
log must still be performed.
Operating System Concepts
12.29
Silberschatz, Galvin and Gagne 2002
UNIX File Systems
Size of a block is given in sys/param.h as BSIZE, and is
typically 512 or 1024 bytes.
The area of disc set aside for a file system is split into 4
regions:
Block 0 is set aside for a bootstrap program if required
Block 1 contains the superblock which describes the file
system as a whole, including free blocks and inodes lists.
Block 2 starts the inode area (fixed size)
Remaining blocks are allocated to files
The UNIX file system supports two main objects: files and
directories.
Directories are just files with a special format, so the
representation of a file is the basic UNIX concept.
Operating System Concepts
12.30
Silberschatz, Galvin and Gagne 2002
Inodes
A file (could be a directory) is represented by an inode —
a record that stores information about a specific file on
the disk.
Type of file
Plain
Directory
FIFO
Special, …
Access permissions.
Number of links to this inode
User and group IDs of the file owner.
Size of file in bytes for files, device number for devices.
Times of creation, last access, and last modification
Addresses of the files data blocks.
Inodes are numbered from 1
Inode 1 is reserved for the bad blocks list
Inode 2 for the root directory of the file system.
Operating System Concepts
12.31
Silberschatz, Galvin and Gagne 2002
Directories
The inode type field distinguishes between plain files and
directories.
Directory entries are of variable length, with fields:
The length of the entry
The file name
The inode number.
Operating System Concepts
12.32
Silberschatz, Galvin and Gagne 2002
Finding Inodes
The OS has to map the supplied user path name to an
inode
If the first character is “/”, the starting directory is the root
directory.
For any other starting character, the starting directory is the
current directory.
The search process continues until the end of the path
name is reached and the desired inode is returned.
4.3BSD improved file system performance by adding a
directory name cache to hold recent directory-to-inode
translations.
Operating System Concepts
12.33
Silberschatz, Galvin and Gagne 2002
File System Data Structures
UNIX maintains
System inode table - inodes of files that are open + name +
counter
System file table - offset, mode + counter
Per process file table - not much
Tables are of fixed length
Operating System Concepts
12.34
Silberschatz, Galvin and Gagne 2002
Files
To create a file
Allocate new inode
Write new entry in directory file
To open a file
Search inode table
If open, create system & process file table entries, increment counters
If not open, read in inode and create table entries
To use a file, use the process file table entry
To close a file
Delete process file table entry, decrement system file table counter
If system file table counter is 0, delete and decrement inode table
counter
If inode table counter is 0, write out inode and delete
To delete a file
Delete directory entry
Decrement link counter
If link counter is 0, and inode not in inode table, delete inode
Operating System Concepts
12.35
Silberschatz, Galvin and Gagne 2002
Schematic View of NFS Architecture
Operating System Concepts
12.36
Silberschatz, Galvin and Gagne 2002
NFS Protocol
Provides a set of remote procedure calls for remote file operations. The
procedures support the following operations:
searching for a file within a directory
reading a set of directory entries
manipulating links and directories
accessing file attributes
reading and writing files
NFS servers are stateless; each request has to provide a full set of
arguments.
Modified data must be committed to the server’s disk before results are
returned to the client (lose advantages of caching).
The NFS protocol does not provide concurrency-control mechanisms.
Operating System Concepts
12.37
Silberschatz, Galvin and Gagne 2002