LEC6-FileSystem

CS450/550
Operating Systems
Lecture 6 File Systems
Palden Lama
Department of Computer Science
CS450/550 FileSystems.1
UC. Colorado Springs
Adapted from MOS2E
Review: Summary of Chapter 5
° OS responsibilities in I/O operations
• Protection and Scheduling
• CPU communicates with I/O devices
• I/O devices notify OS/CPU
° I/O software hierarchy
• Interrupt handlers
• Device drivers
• Buffering
° Storage Systems
• Disk head scheduling algorithms
° Power Management
° More reading: textbook 5.1 - 5.11
Chapter 6: File Systems
6.1 Files
6.2 Directories
6.3 File system implementation
6.4 Example file systems
Long-term Information Storage
Three essential requirements for long-term information storage
• Must store large amounts of data
• Information stored must survive the termination of the process using it
• Multiple processes must be able to access the information concurrently
What are users’ concerns about the file system?
What are implementors’ concerns about the file system?
File Naming
° Files are an abstraction mechanism
• two-part file names
Typical file extensions.
File Structures
What do files look like from the programmer’s viewpoint?
° Three kinds of file structures
• Unstructured byte sequence (UNIX and Windows view)
• Record sequence (early machines’ view)
• Tree (mainframe view)
File Types
° Regular files
• ASCII files or binary files
° Directories
° Character special files
° Block special files
File Access
° Sequential access
• read all bytes/records from the beginning
• cannot jump around, could rewind or back up
• convenient when medium was mag tape
° Random access
• bytes/records read in any order
• essential for database systems
• read can be …
- move file marker (seek), then read or …
- read and then move file marker
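The two access styles above can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the fixed record size `RECORD_SIZE` and the helper `make_demo_file` are assumptions made only for the demo.

```python
import os
import tempfile

RECORD_SIZE = 4  # hypothetical fixed record size in bytes


def read_sequential(path):
    """Read every record in order, starting from the beginning of the file."""
    records = []
    with open(path, "rb") as f:
        while True:
            rec = f.read(RECORD_SIZE)
            if not rec:
                break
            records.append(rec)
    return records


def read_random(path, index):
    """Move the file marker with seek(), then read a single record."""
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)
        return f.read(RECORD_SIZE)


def make_demo_file():
    """Create a small temporary file holding three 4-byte records."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"AAAABBBBCCCC")
    return path
```

Sequential access has to pass over records 0 and 1 to reach record 2, while random access jumps straight to it with one seek.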
File Attributes
Possible file attributes
File Operations
1. Create
2. Delete
3. Open
4. Close
5. Read
6. Write
7. Append
8. Seek
9. Get attributes
10. Set attributes
11. Rename
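As a rough illustration, the operations listed above map onto calls in Python's `os` module roughly as follows. The file names and the function `file_operations_demo` are made up for this sketch; append is shown as a seek to the end followed by a write, which is one common way to express it.

```python
import os
import tempfile


def file_operations_demo():
    """Exercise the listed operations on a scratch file; return (bytes read, file size)."""
    workdir = tempfile.mkdtemp()
    old_path = os.path.join(workdir, "demo.txt")
    new_path = os.path.join(workdir, "renamed.txt")

    fd = os.open(old_path, os.O_CREAT | os.O_RDWR, 0o644)  # Create + Open
    os.write(fd, b"hello")                                 # Write
    os.lseek(fd, 0, os.SEEK_END)                           # Seek to the end ...
    os.write(fd, b" world")                                # ... then Append
    os.lseek(fd, 6, os.SEEK_SET)                           # Seek
    data = os.read(fd, 5)                                  # Read
    os.close(fd)                                           # Close

    size = os.stat(old_path).st_size                       # Get attributes
    os.chmod(old_path, 0o600)                              # Set attributes
    os.rename(old_path, new_path)                          # Rename
    os.remove(new_path)                                    # Delete
    os.rmdir(workdir)
    return data, size
```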
An Example Program Using File System Calls
An Example Program Using File System Calls (cont.)
Memory-Mapped Files
° The OS provides a way to map files into the address space of a running process: map() and unmap()
• No read or write system calls are needed thereafter
(a) Segmented process before mapping files into its address space
(b) Process after mapping existing file abc into one segment and creating a new segment for xyz
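The idea can be tried with Python's standard `mmap` module, which plays the role of the map() and unmap() calls named on the slide. This is a sketch under that assumption; the scratch file and the function name `mmap_demo` are invented for the example.

```python
import mmap
import os
import tempfile


def mmap_demo():
    """Map a scratch file into memory, change it through the mapping, and read it back."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "r+b") as f:
        f.write(b"hello world")
        f.flush()
        # A length of 0 maps the whole file into the process's address space.
        with mmap.mmap(f.fileno(), 0) as mapped:
            first = bytes(mapped[:5])   # read by memory access, no read() call
            mapped[0:5] = b"HELLO"      # write by memory access, no write() call
            mapped.flush()              # push the modified pages back to the file
    with open(path, "rb") as f:
        contents = f.read()
    os.remove(path)
    return first, contents
```

Leaving the inner `with` block closes the mapping, which corresponds to the unmap step.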
Directories: Single-Level Directory Systems
° A single-level directory system is simple to implement
• The example directory contains 4 files
• owned by 3 different people, A, B, and C
What is the key problem with single-level directory systems?
Different users may use the same names for their files
Two-level Directory Systems
Letters indicate owners of the directories and files
What additional operation is required, compared with single-level directory systems?
Login procedure
What if a user has many files and wants to group them in a logical way?
Hierarchical Directory Systems
A hierarchical directory system
Path Names
° Absolute path name
° Relative path name
A UNIX directory tree
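A short sketch of how a relative path name is interpreted against the working directory, using the standard `posixpath` module. The function name `resolve` and the paths used in the usage note are illustrative assumptions, not taken from the slide's figure.

```python
import posixpath


def resolve(path, working_dir):
    """Return the absolute path name for `path`, given the current working directory."""
    if posixpath.isabs(path):
        # Absolute path name: already starts at the root directory.
        return posixpath.normpath(path)
    # Relative path name: interpreted relative to the working directory.
    return posixpath.normpath(posixpath.join(working_dir, path))
```

With a working directory of `/usr/ast`, the relative name `mailbox` and the absolute name `/usr/ast/mailbox` resolve to the same file, and `../lib/dict` resolves to `/usr/lib/dict`.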
Directory Operations
1. Create
2. Delete
3. Opendir
4. Closedir
5. Readdir
6. Rename
7. Link
8. Unlink
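The directory operations above have close counterparts in Python's `os` module. The sketch below is only an illustration: the directory and file names are invented, and it assumes the underlying file system supports hard links.

```python
import os
import tempfile


def directory_operations_demo():
    """Exercise the listed directory operations; return (entry names, link count)."""
    root = tempfile.mkdtemp()
    d = os.path.join(root, "docs")
    os.mkdir(d)                                      # Create
    with open(os.path.join(d, "a.txt"), "w") as f:
        f.write("data")
    os.rename(d, os.path.join(root, "papers"))       # Rename
    d = os.path.join(root, "papers")
    os.link(os.path.join(d, "a.txt"),
            os.path.join(d, "b.txt"))                # Link (hard link)
    with os.scandir(d) as entries:                   # Opendir ... Closedir
        names = sorted(e.name for e in entries)      # Readdir
    link_count = os.stat(os.path.join(d, "a.txt")).st_nlink
    os.unlink(os.path.join(d, "b.txt"))              # Unlink
    os.unlink(os.path.join(d, "a.txt"))
    os.rmdir(d)                                      # Delete
    os.rmdir(root)
    return names, link_count
```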
What are file system implementors’ concerns?
How are files and directories stored?
How is disk space managed?
How can everything be made to work efficiently and reliably?
File System Implementation
° File system layout
• Most disks can be divided into one or more partitions
• BIOS MBR (Master Boot Record)
A possible file system layout
How to keep track of which disk blocks go with which file?
Implementing Files (1) – Contiguous Allocation
° Pros: simple addressing, and a whole file can be read with only one seek
° Cons: disk fragmentation (as with swapping)
• A good fit for CD-ROMs
(a) Contiguous block allocation of disk space for 7 files
(b) State of the disk after files D and E have been dynamically removed
Implementing Files (2) – Linked List Allocation
° Keep each file as a linked list of disk blocks
• Pros: no space is lost due to disk fragmentation
• Cons: how about random access?
Storing a file as a linked list of disk blocks
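To see why random access is the weak point, consider a sketch in which reaching a byte offset means following the chain from the first block. The dictionary `next_block` stands in for the pointer stored in each disk block, and the block numbers in the usage note are hypothetical.

```python
def block_of_offset(first_block, next_block, offset, block_size):
    """Follow the chain from the first block to the block holding byte `offset`.

    `next_block` maps a block number to the next block of the same file
    (None at the end of the file). Returns (block number, pointer hops needed).
    """
    block = first_block
    hops = offset // block_size
    for _ in range(hops):
        block = next_block[block]
        if block is None:
            raise ValueError("offset is past the end of the file")
    return block, hops
```

For a file stored in blocks 4, 7, 2, 10, 12 (in that order) with 1 KB blocks, byte 3500 lives in block 10 and costs three pointer hops. On a real disk each hop is a separate block read, so access time grows with the offset.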
Implementing Files (3) – FAT (File Allocation Table)
° FAT: a table in memory with the pointer word of each disk block
• High utilization + easy random access, but too “FAT” maybe?
Consider:
A 20 GB disk
1 KB block size
Each entry 3 B
How much space for a FAT?
How about paging it?
Linked list allocation using a file allocation table in RAM
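The sizing question above reduces to one multiplication, since the FAT needs one entry per disk block. The sketch below assumes binary units (1 GB = 2^30 bytes, 1 KB = 2^10 bytes); the function name is made up for the example.

```python
def fat_size_bytes(disk_bytes, block_bytes, entry_bytes):
    """One FAT entry per disk block: size = number of blocks * entry size."""
    num_blocks = disk_bytes // block_bytes
    return num_blocks * entry_bytes


GB = 2**30  # assumption: binary gigabytes
KB = 2**10  # assumption: binary kilobytes

# 20 GB disk, 1 KB blocks, 3 bytes per entry
size = fat_size_bytes(20 * GB, 1 * KB, 3)
size_in_mb = size / 2**20
```

Under these assumptions the disk has about 20 million blocks, so the table takes 60 MB, all of which must stay in memory unless the table itself is paged.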
Implementing Files (4) – I-nodes
° i-node: a data structure listing the attributes and disk addresses
of the file’s blocks; in memory when the corresponding file is open
Why does the i-node scheme require much less space than a FAT?
An example i-node
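A minimal sketch of an i-node and of how a file-relative block index is translated to a disk address. The field names, the number of direct addresses (`NUM_DIRECT`), and the single indirect block are assumptions chosen for illustration; real layouts differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List

NUM_DIRECT = 8  # hypothetical number of block addresses held in the i-node itself


@dataclass
class Inode:
    size: int                                         # one attribute, for illustration
    direct: List[int] = field(default_factory=list)   # addresses of the first blocks
    indirect: int = -1                                # block holding more addresses (-1: none)


def disk_address(inode, logical_block, disk: Dict[int, List[int]]):
    """Translate a file-relative block index into a disk block address.

    `disk` stands in for the disk: it maps the indirect block's address to the
    list of block addresses stored in that block.
    """
    if logical_block < NUM_DIRECT:
        return inode.direct[logical_block]
    return disk[inode.indirect][logical_block - NUM_DIRECT]
```

Only the i-nodes of open files need to be in memory, so the space needed is proportional to the number of open files rather than to the size of the disk.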
Implementing Files (5) – Summary
° How to find the disk blocks of a file?
• Contiguous allocation: the disk address of the entire file
• Linked list & FAT: the number of the first block
• i-node: the number of the i-node
Who provides the information above?
The directory entry (based on the path name)
Implementing Directories (1)
° The directory entry, based on the path name, provides the
information to find the disk blocks
……
What should be done about a few long, variable-length file names?
(a) A simple directory (MS-DOS/Windows)
Fixed-size entries
File names, attributes, and disk addresses in directory entry
(b) Directory (UNIX); each entry just refers to an i-node, i-number returned
Implementing Directories (2)
° Two ways of handling long and variable-length file names in a directory
(a) In-line: compaction and page fault.
(b) In a heap: page fault
Shared Files
° How to let multiple users share files?
What if directories contain the disk addresses?
File system containing a shared file
Shared Files in UNIX
° UNIX uses the i-node data structure
• What if a file is removed by the owner?
(a) Situation prior to linking; (b) After the link is created
(c) After the original owner removes the file
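The three situations (a) to (c) can be simulated with a link count kept in the i-node. This is a toy model, assuming directories are plain dictionaries; the class name `SharedInode` and the helper functions are invented for the sketch.

```python
class SharedInode:
    """Minimal stand-in for an i-node: an owner and a link count."""

    def __init__(self, owner):
        self.owner = owner
        self.count = 0


def link(directory, name, inode):
    """Add a directory entry that refers to the i-node."""
    directory[name] = inode
    inode.count += 1


def unlink(directory, name):
    """Remove a directory entry; return True if the file itself can now be freed."""
    inode = directory.pop(name)
    inode.count -= 1
    return inode.count == 0
```

If C creates the file and B links to it, the count is 2. When C later removes its entry, the count drops to 1 and the i-node survives with C still recorded as owner, even though only B's directory refers to it.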
Shared Files – Symbolic Linking
° A new file, created with type LINK, enters B’s directory
• The file contains just the path name of the linked file
• Con: extra overhead with each file access, since the path name must be parsed
• Pro 1: The file is destroyed only when the owner removes it
- Removing a symbolic link does not affect the file
• Pro 2: can link to files on other machines in networked file systems
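The behavior described above can be observed with `os.symlink`. This is a sketch that assumes a platform where unprivileged processes may create symbolic links (for example Linux or macOS); the file names are invented.

```python
import os
import tempfile


def symlink_demo():
    """Return (link stores the path, file survives link removal, link dangles)."""
    root = tempfile.mkdtemp()
    target = os.path.join(root, "target.txt")
    link = os.path.join(root, "link.txt")
    with open(target, "w") as f:
        f.write("data")

    os.symlink(target, link)             # new file of type LINK holding a path name
    stored_path = os.readlink(link)      # the link's contents: just the path
    os.remove(link)                      # removing the link ...
    survives = os.path.exists(target)    # ... does not affect the file

    os.symlink(target, link)
    os.remove(target)                    # the owner removes the file
    # The link still exists, but following it now fails.
    dangling = os.path.islink(link) and not os.path.exists(link)

    os.remove(link)
    os.rmdir(root)
    return stored_path == target, survives, dangling
```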
Disk Space Management – Block Size
° Nearly all file systems chop files into fixed-size blocks that need not be adjacent
° Block size is a trade-off between space utilization and data rate
• Three-step disk access: seek, rotational delay, and transfer
Figure: data rate and disk space efficiency as functions of block size
Dark line (left-hand scale) gives the data rate of a disk
Dotted line (right-hand scale) gives disk space efficiency
All files are 2 KB
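The trade-off can be reproduced numerically. The disk parameters below (seek time, rotation time, track size) are hypothetical values chosen only for illustration; the 2 KB file size follows the figure.

```python
import math

# Hypothetical disk parameters, for illustration only.
SEEK_MS = 10.0          # average seek time
ROTATION_MS = 8.33      # time for one full rotation
TRACK_BYTES = 131072    # bytes per track
FILE_BYTES = 2048       # every file is 2 KB, as in the figure


def data_rate(block_bytes):
    """Bytes per millisecond for one block: seek + rotational delay + transfer."""
    transfer_ms = (block_bytes / TRACK_BYTES) * ROTATION_MS
    return block_bytes / (SEEK_MS + ROTATION_MS / 2 + transfer_ms)


def space_efficiency(block_bytes):
    """Fraction of the allocated disk space that actually holds file data."""
    blocks_needed = math.ceil(FILE_BYTES / block_bytes)
    return FILE_BYTES / (blocks_needed * block_bytes)
```

Larger blocks spread the fixed seek and rotational cost over more bytes, so the data rate rises. Once the block is larger than the file, the unused tail of each block is wasted and space efficiency falls.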
Disk Space Management – Tracking Free Blocks
° How to keep track of free blocks?
(a) Storing the free list on a linked list.
(b) A bit map
Example of Tracking Free Blocks
° Consider a 16-GB disk with a 1-KB block size and 32-bit disk block numbers
If all blocks are empty, how many blocks are needed by the free list and by the bit map, respectively? Which one uses less space?
But what if the disk is nearly full?
How much information should be kept in memory for each scheme?
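One way to work the exercise is sketched below. It assumes binary units and that each block of the free list holds as many 32-bit numbers as fit, with one slot used as the pointer to the next list block; the function names are made up for the example.

```python
import math


def free_list_blocks(num_free_blocks, block_bytes, number_bits):
    """Blocks needed for a linked free list.

    Each list block holds block_bytes * 8 / number_bits entries; one entry is
    the pointer to the next list block, the rest are free block numbers.
    """
    entries_per_block = (block_bytes * 8) // number_bits
    return math.ceil(num_free_blocks / (entries_per_block - 1))


def bitmap_blocks(total_blocks, block_bytes):
    """Blocks needed for a bit map with one bit per disk block."""
    return math.ceil(total_blocks / (block_bytes * 8))


TOTAL_BLOCKS = (16 * 2**30) // 2**10  # 16 GB disk, 1 KB blocks: 2**24 blocks
```

With every block free, the list needs 65,794 blocks while the bit map needs 2,048. The bit map has a fixed size, whereas the free list shrinks as the disk fills, so on a nearly full disk (say 100 free blocks) the list fits in a single block.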
Disk Space Management – Disk Quotas
° An open file table in memory has attributes telling who the owner of an opened file is; and a per-user table contains the quota
Quotas for keeping track of each user’s disk use
File System Reliability
° Physical dumping: starts at block 0 and writes all the disk blocks onto the output tape in order
° Logical dumping: starts at one or more specified directories and recursively dumps all files and directories found there that have changed since some given base date (e.g., the last backup for an incremental dump, or system installation for a full dump)
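The selection step of a logical dump can be sketched as a recursive walk that keeps files modified after the base date. This is a simplification: it only lists files and does not write anything to a backup medium, and the function name `files_to_dump` is an assumption for the example.

```python
import os


def files_to_dump(root, base_time):
    """Return paths under `root` modified after `base_time` (seconds since the epoch)."""
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime > base_time:
                selected.append(path)
    return sorted(selected)
```

Passing a base time of 0 selects every file, which corresponds to a full dump; passing the time of the last backup gives an incremental dump.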
File System Consistency
° File system states
(a) consistent
(b) missing block
(c) duplicate block in free list
(d) duplicate data block
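The four states above can be detected with two counters per block: how often the block appears in files and how often it appears in the free list. The sketch below is a simplified checker with invented names; the extra "both in use and free" case is not on the slide and is included only to cover every combination.

```python
def check_blocks(num_blocks, blocks_in_files, free_list):
    """Classify each block from two counters: uses in files and free-list entries."""
    in_use = [0] * num_blocks
    free = [0] * num_blocks
    for b in blocks_in_files:
        in_use[b] += 1
    for b in free_list:
        free[b] += 1

    report = {}
    for b in range(num_blocks):
        if in_use[b] == 0 and free[b] == 0:
            report[b] = "missing block"
        elif in_use[b] == 0 and free[b] > 1:
            report[b] = "duplicate block in free list"
        elif in_use[b] > 1:
            report[b] = "duplicate data block"
        elif in_use[b] == 1 and free[b] >= 1:
            report[b] = "both in use and free"  # extra case, not on the slide
    return report  # an empty report means the block counts are consistent
```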
File System Performance - Caching
° Cache: a collection of blocks that logically belong on the
disk but are being kept in memory for performance reasons
• Hash the device and disk address and look up the result in a
hash table with collision chains
• Cache references are relatively infrequent
The block cache data structures with a bi-directional usage list
Why is LRU undesirable when file system consistency after a crash is an issue?
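A compact sketch of a block cache with LRU replacement. Python's `OrderedDict` stands in for both the hash table and the usage list (it handles collisions internally rather than with explicit chains), and `read_block` is a hypothetical callback that fetches a block from disk.

```python
from collections import OrderedDict


class BlockCache:
    """Block cache keyed by (device, block number) with LRU eviction."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block   # hypothetical callback that reads from disk
        self.blocks = OrderedDict()    # hash lookup plus usage order, oldest first

    def get(self, device, block_no):
        key = (device, block_no)
        if key in self.blocks:
            self.blocks.move_to_end(key)             # hit: now most recently used
            return self.blocks[key]
        data = self.read_block(device, block_no)     # miss: fetch from disk
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)          # evict the least recently used
        return data
```

This version ignores modified blocks entirely. A cache that delays writing a modified critical block, such as an i-node block, until it reaches the LRU end risks losing that update in a crash, which is the concern raised in the question above.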
File System Performance – Caching II
° Cache & Consistency
• UNIX: the sync() system call is issued every 30 s
• MS-DOS: write-through strategy
File System Performance – Block Read Ahead
° Block Read Ahead works well for files that are being read
sequentially
• Spatial locality
The Windows 98 File System (1)
The extended MS-DOS directory entry used in Windows 98
Summary
° Files and directories
° File system implementation
• Contiguous files
• Linked lists
• FAT
• i-nodes
° Disk space management
° File system performance and consistency
° More reading: textbook 6.1 - 6.6