Transcript files

Operating Systems
COMP 4850/CISG 5550
File Systems
Files
Dr. James Money
File Systems
• We have three requirements for long term
storage of data
– We must be able to store a large amount of
information
– The information must survive the termination of
the process
– Multiple processes should be able to access
the data concurrently
File Systems
• The solution involves storing the data on
disks in units called files
• The information in files must be persistent
– that is, not affected by the
creation/destruction of processes
• These files must be managed by the OS
• The part of the OS that handles this is
called the file system
Files
• There are several important aspects to the
file system:
– File Naming
– File Structure
– File Types
– File Access
– File Attributes
– File Operations
File Naming
• The naming of files is an abstraction of the
data on the disk
• We must shield the user from the details
of this process
• In order to do this, we usually refer to
files by name rather than location or a
number
File Naming
• The exact rules vary from one file system
to another
• All current OSes allow strings of one to
eight letters as file names
• Many times digits are permitted as well
• Sometimes even special characters are
allowed
• Many support names as long as 255
characters
File Naming
• Examples
– cathy
– bruce
– 2
– urgent!
– Fig.2-14
– supercalifragilicious
File Naming
• Some systems distinguish between upper
and lowercase letters
– UNIX, Linux
– John, john, and JOHN are all different files
• Others do not
– DOS, Windows 1,2,3,95,98,XP,Vista
– John, john, and JOHN are all the same file
File Naming
• Many operating systems have two-part file
names separated by a period
– File name
– File extension – usually indicates the program that
created the file and the data it contains
• UNIX does not differentiate this, but names can
include a period
• This also allows files such as file.tar.gz instead of
just file.tar or file.gz or file.tgz
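A minimal sketch of the two-part naming convention, using Python's os.path.splitext. It splits only at the last period, so file.tar.gz keeps "tar" in the base name. The helper name split_name is hypothetical and used only for illustration.

```python
import os.path

def split_name(filename):
    """Split a file name into (base, extension) at the last period."""
    base, ext = os.path.splitext(filename)
    return base, ext

print(split_name("file.tar.gz"))  # ('file.tar', '.gz')
print(split_name("cathy"))        # ('cathy', '')
```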
File Structure
• Files can be structured in many ways:
– Simple byte sequence
– Set of records
– Tree of data
• However, the file system usually sees
a file only as a sequence of bytes with no
structure to it
File Structure
• By assuming just a byte sequence for files,
the OS gives maximum flexibility
• User programs can interpret the data
in the file any way they want
• If the user wants to do unusual things,
this prevents the OS from getting in the
way
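To illustrate that the interpretation belongs to the user program and not to the file system, the sketch below reads the same eight bytes in two different ways with Python's struct module. The byte values and the function names are made up for the example.

```python
import struct

# Eight bytes as the file system would store them: no structure attached.
raw = bytes([1, 0, 0, 0, 2, 0, 0, 0])

def as_two_ints(data):
    """Interpret the bytes as two little-endian 32-bit unsigned integers."""
    return struct.unpack("<II", data)

def as_one_int(data):
    """Interpret the same bytes as one little-endian 64-bit unsigned integer."""
    return struct.unpack("<Q", data)[0]

print(as_two_ints(raw))  # (1, 2)
print(as_one_int(raw))   # 8589934593
```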
File Structure
• The second choice is a record-based
approach
• The records are fixed length with an
internal structure
• Historically used when punch cards were
common, b/c cards held 80-character records
(132-character records matched line printers)
• No current general-purpose system works
this way
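A rough sketch of what record-based access looks like when emulated on top of a byte-sequence file: every read returns one whole fixed-length record. The record length of 80 and the helper names make_record and read_record are assumptions for the example, not part of any real system's interface.

```python
import io

RECORD_LEN = 80  # assumed record length, one punched card

def make_record(text, record_len=RECORD_LEN):
    """Pad ASCII text with spaces to exactly one record."""
    encoded = text.encode("ascii")
    if len(encoded) > record_len:
        raise ValueError("text does not fit in one record")
    return encoded.ljust(record_len)

def read_record(f, n, record_len=RECORD_LEN):
    """Return record number n (0-based) from a file of fixed-length records."""
    f.seek(n * record_len)
    data = f.read(record_len)
    if len(data) != record_len:
        raise IndexError("record %d is missing or incomplete" % n)
    return data

# An in-memory file stands in for a file on disk.
cards = io.BytesIO(make_record("card one") + make_record("card two"))
print(read_record(cards, 1).rstrip())  # b'card two'
```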
File Structure
• The third type of file structure is a tree system
• This is a tree of records, usually not all of the
same length
• There is a fixed-length key field at a fixed
position
• The tree is sorted on this key field for rapid
searching
• You refer to a record by this key field
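The idea of looking up variable-length records by a fixed-position key can be sketched with a small binary search tree. This is only an illustration: the key position and length (KEY_POS, KEY_LEN), the record contents, and all function names are assumptions, and a real system would likely use a balanced on-disk tree instead.

```python
KEY_POS, KEY_LEN = 0, 4  # assumed: key is the first 4 characters of a record

def key_of(record):
    """Extract the fixed-length key from its fixed position in the record."""
    key = record[KEY_POS:KEY_POS + KEY_LEN]
    if len(key) != KEY_LEN:
        raise ValueError("record is too short to hold a key")
    return key

class Node:
    def __init__(self, record):
        self.key = key_of(record)
        self.record = record
        self.left = None
        self.right = None

def insert(root, record):
    """Insert a record into the tree and return the (possibly new) root."""
    key = key_of(record)
    if root is None:
        return Node(record)
    if key < root.key:
        root.left = insert(root.left, record)
    elif key > root.key:
        root.right = insert(root.right, record)
    else:
        root.record = record  # same key: replace the stored record
    return root

def lookup(root, key):
    """Return the record stored under key, or None if there is none."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return None if root is None else root.record

root = None
for rec in ["0042 hen", "0007 anteater", "0099 ox"]:  # lengths differ
    root = insert(root, rec)
print(lookup(root, "0007"))  # 0007 anteater
```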
File Types
• Most OSes have several types of files
– Most of them support regular files and
directories
– Regular files contain user information
– Directories are system files for maintaining
the structure of the file system
– Character special files are special I/O files for
serial devices
– Block special files are special I/O files for
block devices
File Types
• Regular Files can contain either ASCII or
binary data
• ASCII data just contains lines of text,
readable by humans
• Lines are terminated by carriage returns
or line feeds
• DOS uses both (CR-LF)
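A small sketch of the line terminator difference. Python's bytes.splitlines accepts LF, CR, and CR-LF, so the same two lines come back from either encoding, although the DOS version is one byte longer per line. The sample data and the helper name split_lines are made up for the example.

```python
def split_lines(data):
    """Split raw file bytes into lines, accepting LF, CR, or CR-LF endings."""
    return data.splitlines()

unix_text = b"first\nsecond\n"     # LF only
dos_text = b"first\r\nsecond\r\n"  # CR-LF

print(split_lines(unix_text))            # [b'first', b'second']
print(split_lines(dos_text))             # [b'first', b'second']
print(len(dos_text) - len(unix_text))    # 2
```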
File Types
• ASCII files can be edited with any basic text
editor
• It is easy to connect input and output of
two programs this way
• Interpretation of the data is easy in this
format
File Types
• The other type of regular file is the binary
file
• The data usually is not human readable
• They have an internal structure
• This structure usually varies from file to
file
File Types
(Figure: an executable file and an archive file)
File Access
• Early OSes had only one type of access:
sequential access
• In sequential access, you could only read
the bytes in a file in order
• There was no skipping around
• You could rewind to the beginning of the
file
• This worked well with tapes
File Access
• When disks became more popular for
storage, there was a need for out of order
reading of data in files
• Files that can be read this way are called
random access files
• This is required by most applications
File Access
• There are two methods to specify where
to read
– Every read() call can specify a starting
position
– A special function called seek() can set the
current position
• Older systems made you declare a file as
sequential or random access at creation
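The two ways of specifying where to read can be sketched with Python's os module on a POSIX system: os.pread takes the starting position in the call itself, while os.lseek sets the current position for a later os.read. The wrapper names read_at, seek_then_read, and demo are hypothetical, and os.pread is assumed to be available (it is not on Windows).

```python
import os
import tempfile

def read_at(fd, offset, count):
    """Method 1: the read call itself names the starting position."""
    return os.pread(fd, count, offset)

def seek_then_read(fd, offset, count):
    """Method 2: set the current position first, then read from it."""
    os.lseek(fd, offset, os.SEEK_SET)
    return os.read(fd, count)

def demo():
    """Write ten bytes to a scratch file and read bytes 4..6 both ways."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        return read_at(fd, 4, 3), seek_then_read(fd, 4, 3)
    finally:
        os.close(fd)
        os.remove(path)

print(demo())  # (b'456', b'456')
```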
File Attributes
• In addition to the name and data, every
file has other information associated with it
• For example, the creation and modification
dates and its size
• These are called file attributes
File Attributes
• The first four attributes relate to file
protection and who may access the file
• The flags are bits that control a particular
property
File Attributes
• The record length, key position, and
key length are only used by files whose
records are referenced by keys
• The times keep track of creation,
modification and access times
• The current size tells how big the file is
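As a hedged illustration of what attributes look like to a program, the sketch below uses Python's os.stat, which returns a structure holding the size, the protection bits, and the times. The helper name describe and the choice of fields are assumptions for the example, and the exact set of attributes differs between systems.

```python
import os
import tempfile

def describe(path):
    """Return a few attributes of a file as a dictionary."""
    st = os.stat(path)
    return {
        "size": st.st_size,                      # current size in bytes
        "protection": oct(st.st_mode & 0o777),   # permission bits
        "modified": st.st_mtime,                 # last modification time
        "accessed": st.st_atime,                 # last access time
    }

def demo():
    """Create a 5-byte scratch file and report its attributes."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello")
        os.close(fd)
        return describe(path)
    finally:
        os.remove(path)

print(demo()["size"])  # 5
```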
File Operations
• Recall that files exist to store information
and retrieve it later
• The exact calls vary among the different OSes
• However, most OSes provide functionality
for the operations listed next
File Operations
1. Create – The file is created with no data.
This tells the OS that there is a file
coming and to set its attributes
2. Delete – removes the file from the file
system. The file has to be deleted to free
up disk space
3. Open – Before the file can be read
from/written to, you must open it. The
open call may retrieve its attributes and
disk locations
File Operations
4. Close – when access is done, the file
should be closed. Many systems encourage
this by limiting the number of
open files per process. Closing also forces
the writing of the data to the disk if it has
not already happened.
5. Read – retrieve the data from the file.
The bytes usually come from the current
position, and the number of bytes is
specified.
File Operations
6. Write – Data is written to the file, at the
current position. If we are at the end of
the file, the file size increases. If we are
in the middle of the file, the data there is
overwritten.
7. Append – This is a special form of
write(). It adds data to the end of the
file. This is the same as seeking to the
end of file and calling write()
File Operations
8. Seek – For random access files,
repositions the current read/write
position.
9. Get attributes – retrieves the file
attributes, for example the modification
time. This is typically a struct of values.
File Operations
10. Set attributes – sets the attributes in a
similar fashion to get attributes. Note
that not all attributes are settable.
11. Rename – allows you to rename a file
after it has been created. Not always
needed since we can copy the file and
delete the old one.
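The operations above can be walked through in order with the low-level calls in Python's os module, which closely mirror the POSIX system calls. This is a sketch under assumptions: the file names and the function walk_through are made up, and other systems expose different calls for the same operations.

```python
import os
import tempfile

def walk_through(directory):
    """Exercise the file operations in order; return (data read, size)."""
    old = os.path.join(directory, "notes.txt")
    new = os.path.join(directory, "renamed.txt")

    # Create and open: O_CREAT makes an empty file, O_RDWR opens it.
    fd = os.open(old, os.O_CREAT | os.O_RDWR, 0o600)
    os.write(fd, b"hello world")   # write: at end of file, so it grows
    os.lseek(fd, 0, os.SEEK_SET)   # seek: back to the beginning
    os.write(fd, b"HELLO")         # write: existing bytes are overwritten
    os.lseek(fd, 0, os.SEEK_END)   # append = seek to end, then write
    os.write(fd, b"!")
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 100)        # read: up to 100 bytes from position 0
    size = os.fstat(fd).st_size    # get attributes
    os.close(fd)                   # close

    os.chmod(old, 0o400)           # set attributes: make it read-only
    os.rename(old, new)            # rename
    os.remove(new)                 # delete
    return data, size

with tempfile.TemporaryDirectory() as scratch:
    print(walk_through(scratch))   # (b'HELLO world!', 12)
```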
Example
• We consider an example program that
copies one file to another.
• The program name is copyfile
• Its syntax is
copyfile file1 file2
which copies file1 to file2
Example
• If file2 exists, it is overwritten
• If file2 does not exist, it will be created
• file1 must exist
• There are exactly two arguments
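A minimal sketch of copyfile that follows the rules above, written in Python with the low-level os calls. It is not the code from the original slides. The buffer size, the exit codes, and the main(argv) wrapper are assumptions made for illustration.

```python
import os
import sys

BUF_SIZE = 4096  # assumed buffer size

def copyfile(src, dst):
    """Copy src to dst, creating dst or overwriting it if it exists."""
    in_fd = os.open(src, os.O_RDONLY)  # fails if file1 does not exist
    try:
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(in_fd, BUF_SIZE)
                if not chunk:          # empty result means end of file
                    break
                while chunk:           # os.write may write only part
                    written = os.write(out_fd, chunk)
                    chunk = chunk[written:]
        finally:
            os.close(out_fd)
    finally:
        os.close(in_fd)

def main(argv):
    """Return 0 on success, 1 on a usage error, 2 on an I/O error."""
    if len(argv) != 3:                 # program name plus two arguments
        print("usage: copyfile file1 file2", file=sys.stderr)
        return 1
    try:
        copyfile(argv[1], argv[2])
    except OSError as err:
        print("copyfile: %s" % err, file=sys.stderr)
        return 2
    return 0
```

To run it from the command line, one could end the file with sys.exit(main(sys.argv)).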
Memory Mapped Files
• Many programmers feel that the prior way of
working with files is cumbersome
• They would prefer to access a file the way they
access memory
• One way to do this is to map/unmap a file to a
virtual address range
• map() provides a file and a starting address
• unmap() usually provides just the address
Memory Mapped Files
• If a file of length 32KB is mapped to
virtual memory at address 512K, then any
instruction that reads or writes between
512K and 544K refers to the file.
• 512K is the 0K block, 513K is the 1K block
of the file, and so on
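The mapping can be sketched with Python's mmap module, which wraps the operating system's mapping call. Python does not let the program choose the virtual address, so the 512K base appears only in the hypothetical helper file_offset, which shows the address arithmetic. The function names and the data written are assumptions for the example.

```python
import mmap
import os
import tempfile

K = 1024

def file_offset(address, base=512 * K, length=32 * K):
    """Translate a virtual address in the mapped range to a file offset."""
    if not base <= address < base + length:
        raise ValueError("address is outside the mapped range")
    return address - base

def demo():
    """Map a 32 KB scratch file, store into it, and read the file back."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, bytes(32 * K))          # 32 KB of zero bytes
        with mmap.mmap(fd, 0) as m:          # length 0 maps the whole file
            m[file_offset(512 * K)] = ord("A")     # store to the 0K block
            start = file_offset(513 * K)
            m[start:start + 2] = b"1K"             # store to the 1K block
            m.flush()
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 32 * K)           # ordinary read sees the stores
        return data[0:1], data[1024:1026]
    finally:
        os.close(fd)
        os.remove(path)

print(file_offset(513 * K))  # 1024
print(demo())                # (b'A', b'1K')
```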