FileSystemImplementation

Download Report

Transcript FileSystemImplementation

File-System Structure
 File structure user view
 Logical storage unit
 Collection of related information
 File system physical view
 Resides on secondary storage (disks)
 Partitions and mounting
 Blocks
Operating System Concepts
12.1
Silberschatz, Galvin and Gagne 2002
Moving-Head Disk Mechanism
Operating System Concepts
12.2
Silberschatz, Galvin and Gagne 2002
Low Level Format
 Low-level formatting, or physical formatting — Dividing a
disk into sectors that the disk controller can read and
write.
 Zones of cylinders allow more (up to 40%) sectors on outer
tracks
 The sectors and tracks are mapped to a large 1-
dimensional array of logical blocks,
 Block 0 is the first sector of the first track on the outermost
or innermost cylinder.
 Mapping proceeds in order through that track, then the rest
of the tracks in that cylinder, and then through the rest of the
cylinders from outermost to innermost.
 The block is the smallest unit of transfer.
Operating System Concepts
12.3
Silberschatz, Galvin and Gagne 2002
Blocks and Fragments
 Block sizes
 Smaller => less internal fragmentation
 Larger => larger transfers
 Can improve efficiency by varying file-system parameters, e.g.,
block size = frame size on swap device
 4.2BSD uses two block sized for files which have no
indirect blocks:
 All the blocks of a file are of a large block size (such as 8K),
except the last.
 The last block is an appropriate multiple of a smaller
fragment size (e.g., 1024) to fill out the file.
 Thus, a file of size 18,000 bytes would have two 8K blocks
and one 2K fragment (which would not be filled completely).
Operating System Concepts
12.4
Silberschatz, Galvin and Gagne 2002
High Level Format
 To use a disk to hold files, the operating system still
needs to record its own data structures on the disk.
 Partition the disk into one or more groups of cylinders.
 Logical formatting or “making a file system”.
Operating System Concepts
12.5
Silberschatz, Galvin and Gagne 2002
Partitions and Disks
 A physical device may support several partitions, each hosting
a physical file system.
Operating System Concepts
12.6
Silberschatz, Galvin and Gagne 2002
Partitions and Disks
 Different file systems can support different uses.
 Potentially different OSs in different partitions
 Prevents one program from using all available space for a large file.
 Speeds up searches on backup tapes and restoring partitions from
tape.
 Known as
 IBM minidisks
 Mac volumes
 UNIX file systems
 Windows drives
Operating System Concepts
12.7
Silberschatz, Galvin and Gagne 2002
Partitions
 Boot block - for booting the OS
 Partition control block
 Information about the partition
 Free space list
 Free directory space list
 Superblock in UNIX, Master File Table in NTFS
 Directory structure
 Directory nodes, exactly one per file
 Inodes in UNIX
 MFT entries in NTFS
 Organization links nodes into a data structure
 Provides naming
 Data blocks
Operating System Concepts
12.8
Silberschatz, Galvin and Gagne 2002
Directory Implementation
 Linear list of file names with pointer to the data blocks.
 Simple to program
 Inefficient - searching, empty entries
 UNIX directory files
 Sorted tree, e.g., B-tree
 NTFS B+ tree
 Hash Table – linear list with hash data structure.
 Decreases directory search time
 Collisions – situations where two file names hash to the
same location
 Fixed size, or overflow facilities
Operating System Concepts
12.9
Silberschatz, Galvin and Gagne 2002
Disk Structures
 A logical file system that a user ordinarily sees may consist
of several physical file systems, each hosted in a partition
of a physical device.
 Permits very large logical file systems
 Makes monolithic backups possible
 The root file system is always available on a partition.
 Other file systems may be mounted — i.e., integrated into
the directory hierarchy of the root file system.
 A file system must be mounted before it can be accessed.
 A unmounted file system mounted at a mount point.
Operating System Concepts
12.10
Silberschatz, Galvin and Gagne 2002
Mapping File System to Physical Devices
Operating System Concepts
12.11
Silberschatz, Galvin and Gagne 2002
Layered File System
 Applications call the LFS via system calls
 LFS deals with file names, file and
directory operations, file tables, security
 FOM maps logical to physical, buffers,
tracks free blocks, tracks bad blocks
 BFS sends commands to DDs to access
specified blocks on the device
 I/O control consists of device drivers and
interrupts handlers for each device.
Operating System Concepts
12.12
Silberschatz, Galvin and Gagne 2002
File Tables
 Draw my example
 To open a file
 System wide file table is search to see if already open
 If no, the directory structure is searched (cached if possible)
 Node copied to system wide file table
 A new entry in the per-process file table
 Refers to the existing SW file table entry
 Contains mode of access, and offset
 Counter incremented in the SW file table
 To close a file
 Per-process file table entry is removed
 Count decremented in SW file table, and if 0 the node is
written to disk
 To create a file
 LFS allocates a new directory node
 Reads in, updates, and writes out the directory
Operating System Concepts
12.13
Silberschatz, Galvin and Gagne 2002
Contiguous Allocation
 Each file occupies a set
of contiguous blocks on
the disk
 E.g., IBM, VMS
Operating System Concepts
12.14
Silberschatz, Galvin and Gagne 2002
Contiguous Allocation
 Random access
Block = LA / BlockSize
Offset = LA % BlockSize
 Simple – only starting location (block #) and length
(number of blocks) are required in a directory node
 Wasteful of space
 Dynamic storage-allocation problem
 External fragmentation
 Files cannot grow easily
 Initial allocation?
Operating System Concepts
12.15
Silberschatz, Galvin and Gagne 2002
Linked Allocation
 Each file is a linked list of disk blocks: blocks may be
scattered anywhere on the disk.
 Block is a pointer (disk address) and data
block
Operating System Concepts
12.16
=
pointer
Silberschatz, Galvin and Gagne 2002
Linked Allocation (Cont.)
 Random access
Block in chain = LA / (BlockSize - LinkSize)
Offset = LA % (BlockSize - LinkSize)
 Simple – need only starting address
 Free-space management system – no waste of space
 New file has NULL pointer in directory
 Easy to get another or release a block
 Slow random access
Operating System Concepts
12.17
Silberschatz, Galvin and Gagne 2002
Linked Allocation (Cont.)
 FAT variation
 Table with index corresponding to blocks
 Linked table entries - cached for fast traversal
 Used by DOS and OS/2
Operating System Concepts
12.18
Silberschatz, Galvin and Gagne 2002
Extent-Based Systems
 Many newer file systems (e.g., Veritas File System) use a




modified contiguous allocation scheme.
Extent-based file systems allocate disk blocks in extents.
An extent is a contiguous block of disks. Extents are
allocated for file allocation. A file consists of one or more
extents.
Extents are linked together
 Extents reduce LinkSize overhead
E.g., NTFS
Operating System Concepts
12.19
Silberschatz, Galvin and Gagne 2002
Indexed Allocation
 Brings all pointers together into the index block.
index table
Operating System Concepts
12.20
Silberschatz, Galvin and Gagne 2002
Indexed Allocation (Cont.)
 Random access
Index table entry = LA / BlockSize
Offset = LA % BlockSize
 Fast random access
 Cached for speed
 Need index block
 Small files have mostly empty index block
 Fixed maximal file size
Operating System Concepts
12.21
Silberschatz, Galvin and Gagne 2002
Linked - Indexed Allocation
 Linked scheme – Link blocks of index table
 Random access
Index block in chain =
LA / (((BlockSize-LinkSize)/LinkSize) * BlockSize)
LAInBlock = LA % (((BlockSize-LinkSize)/LinkSize) * BlockSize)
Index table entry = LAInBlock / BlockSize
Offset = LAInBlock % BlockSize
Operating System Concepts
12.22
Silberschatz, Galvin and Gagne 2002
Two Level Indexed Allocation

outer-index
index table
Operating System Concepts
12.23
file
Silberschatz, Galvin and Gagne 2002
Two Level Indexed Allocation
 Random access
Outer index table entry = LA / (BlockSize2/LinkSize)
LAInBlock = LA % (BlockSize2/LinkSize)
Inner index table entry = LAInBlock / BlockSize
Offset = LAInBlock % BlockSize
 Fixed, but large maximal size
 Fast random access if cached
Operating System Concepts
12.24
Silberschatz, Galvin and Gagne 2002
Combined Scheme: UNIX (4K bytes per block)
 Fast for small files
 Gradual degradation as file size increases
Operating System Concepts
12.25
Silberschatz, Galvin and Gagne 2002
Free-Space Management
 Linked list (free list)
 Cannot get contiguous
space easily
 No waste of space
 Can use same routines as
for file management
Operating System Concepts
12.26
Silberschatz, Galvin and Gagne 2002
Free-Space Management
 Bit vector (n blocks) (e.g., Mac)
0 1 2
n-1
bit[i] =

…
1  block[i] free
0  block[i] occupied
 Can use Motorola “find bit” instruction
 Easy to get contiguous blocks
 Needs to be kept in memory
block size = 212 bytes (4KB)
disk size = 236 bytes (64GB)
n = 236/212 = 224 bits = 221 bytes (2MB (!))
Operating System Concepts
12.27
Silberschatz, Galvin and Gagne 2002
Efficiency and Performance
 Efficiency dependent on:
 Disk allocation and directory algorithms
 Types of data kept in file’s directory entry
 Performance
 Disk cache – separate section of main memory for
frequently used blocks
 Free-behind and read-ahead – techniques to optimize
cache performance for sequential access
 A page cache caches pages rather than disk blocks using
virtual memory techniques.
 Improve PC performance by dedicating section of memory
as virtual disk, or RAM disk.
Operating System Concepts
12.28
Silberschatz, Galvin and Gagne 2002
Log Structured File Systems
 Log structured (or journaling) file systems record each
update to the file system as a transaction.
 All transactions are written to a log. A transaction is
considered committed once it is written to the log.
However, the file system may not yet be updated.
 The transactions in the log are asynchronously written to
the file system. When the file system is modified, the
transaction is removed from the log.
 If the file system crashes, all remaining transactions in the
log must still be performed.
Operating System Concepts
12.29
Silberschatz, Galvin and Gagne 2002
UNIX File Systems
 Size of a block is given in sys/param.h as BSIZE, and is
typically 512 or 1024 bytes.
 The area of disc set aside for a file system is split into 4
regions:
 Block 0 is set aside for a bootstrap program if required
 Block 1 contains the superblock which describes the file
system as a whole, including free blocks and inodes lists.
 Block 2 starts the inode area (fixed size)
 Remaining blocks are allocated to files
 The UNIX file system supports two main objects: files and
directories.
 Directories are just files with a special format, so the
representation of a file is the basic UNIX concept.
Operating System Concepts
12.30
Silberschatz, Galvin and Gagne 2002
Inodes
 A file (could be a directory) is represented by an inode —
a record that stores information about a specific file on
the disk.
 Type of file
 Plain
 Directory
 FIFO
 Special, …
 Access permissions.
 Number of links to this inode
 User and group IDs of the file owner.
 Size of file in bytes for files, device number for devices.
 Times of creation, last access, and last modification
 Addresses of the files data blocks.
 Inodes are numbered from 1
 Inode 1 is reserved for the bad blocks list
 Inode 2 for the root directory of the file system.
Operating System Concepts
12.31
Silberschatz, Galvin and Gagne 2002
Directories
 The inode type field distinguishes between plain files and
directories.
 Directory entries are of variable length, with fields:
 The length of the entry
 The file name
 The inode number.
Operating System Concepts
12.32
Silberschatz, Galvin and Gagne 2002
Finding Inodes
 The OS has to map the supplied user path name to an
inode
 If the first character is “/”, the starting directory is the root
directory.
 For any other starting character, the starting directory is the
current directory.
 The search process continues until the end of the path
name is reached and the desired inode is returned.
 4.3BSD improved file system performance by adding a
directory name cache to hold recent directory-to-inode
translations.
Operating System Concepts
12.33
Silberschatz, Galvin and Gagne 2002
File System Data Structures
 UNIX maintains
 System inode table - inodes of files that are open + name +
counter
 System file table - offset, mode + counter
 Per process file table - not much
 Tables are of fixed length
Operating System Concepts
12.34
Silberschatz, Galvin and Gagne 2002
Files
 To create a file
 Allocate new inode
 Write new entry in directory file
 To open a file
 Search inode table
 If open, create system & process file table entries, increment counters
 If not open, read in inode and create table entries
 To use a file, use the process file table entry
 To close a file
 Delete process file table entry, decrement system file table counter
 If system file table counter is 0, delete and decrement inode table
counter
 If inode table counter is 0, write out inode and delete
 To delete a file
 Delete directory entry
 Decrement link counter
 If link counter is 0, and inode not in inode table, delete inode
Operating System Concepts
12.35
Silberschatz, Galvin and Gagne 2002
Schematic View of NFS Architecture
Operating System Concepts
12.36
Silberschatz, Galvin and Gagne 2002
NFS Protocol

Provides a set of remote procedure calls for remote file operations. The
procedures support the following operations:
 searching for a file within a directory
 reading a set of directory entries
 manipulating links and directories
 accessing file attributes
 reading and writing files



NFS servers are stateless; each request has to provide a full set of
arguments.
Modified data must be committed to the server’s disk before results are
returned to the client (lose advantages of caching).
The NFS protocol does not provide concurrency-control mechanisms.
Operating System Concepts
12.37
Silberschatz, Galvin and Gagne 2002