Distributed File System Over Windows®

Distributed File System
Over Windows®
Project By:
Iyad Ibsais & Jamal Abdul-Haq
Supervisor:
Dr. Lu’ay Malhis
Presented to:
Dr. Lu’ay Malhis
Dr. Ra’ed Alqadi
Dr. Ashraf Armoush
Very special thanks to En. Samer Al-Arandi
for helping us to accomplish and complete this
project.
Introduction
A distributed file system for the
Windows environment is becoming more and
more important, especially in labs,
companies, and any other place where a
network exists. Users need to access any
file from anywhere on the network, so the
work is distributed, and if any machine
crashes, the files on the other machines are
still there: we don’t lose all the files.
Architecture of the project
Server:
A central Multithreaded Server with the
following responsibilities:
1. Authenticator
2. File Manager
3. Router between clients
Architecture (Cont.)
Client
A single-threaded client with many
features, such as the main browser, the local
browser, the file editor, the image viewer,
and others.
Database
• Our project currently uses a Microsoft
Access database.
• At the beginning of the project we
connected to an SQL server, but
performance was poor: our database has
many small tables, and the SQL server
gave very long response times for that
kind of database.
The Layered Model
The Core Functions
– Messages
– Multithreading
– Databases
– Intercommunication Between Databases and Architecture
– Distribution Methodology
– Distribution Policy
– The Role of Mutex in the Project
– Group And Global Sharing
– Logical Files And Physical Files
– Directory Concept
– Access Rights
Messages
• The message is the unit of communication between the
client and the server. Through it, the two sides exchange
the information each needs to decide its next action. It is
built as a structure called SMsg, which is divided into
5 attributes.
Message Types
• Type: every message sent or received is intended to
trigger a certain procedure, so the peers must be able to
tell the messages apart. That is the purpose of the “Type”
field. The receive function is built around a huge (really
huge!) switch statement whose cases are selected by the
type of the received message. The type is one byte long
and, by convention in the project, each group of
characters represents a category of messages: for
example, capital letters are for metadata messages,
digits are for administrative messages, and so on.
• Types and meanings of the exchanged messages:
• We can divide the messages into two kinds
• For details about the messages, see the documentation
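The type convention can be illustrated with a short C++ sketch. It is not the project's receive function: the names MsgCategory, categoryOf and describe are hypothetical, and only the 'A' (login) and 'N' (listing) types and the capital-letter and digit categories come from these slides.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical category names; the slides only say that capital letters mark
// metadata messages and digits mark administrative messages.
enum class MsgCategory { Metadata, Administrative, Other };

// Classify a one-byte message type by the character group it belongs to.
MsgCategory categoryOf(unsigned char type) {
    if (std::isupper(type)) return MsgCategory::Metadata;
    if (std::isdigit(type)) return MsgCategory::Administrative;
    return MsgCategory::Other;
}

// Hypothetical stand-in for the large switch statement in the receive function.
std::string describe(unsigned char type) {
    switch (type) {
        case 'A': return "login request";      // type 'A' is named in the slides
        case 'N': return "directory listing";  // type 'N' is named in the slides
        default:  break;
    }
    switch (categoryOf(type)) {
        case MsgCategory::Metadata:       return "other metadata message";
        case MsgCategory::Administrative: return "administrative message";
        case MsgCategory::Other:          break;
    }
    return "unclassified message";
}
```

Grouping types by character class keeps the one-byte field readable in a debugger while still leaving room for many message kinds per category.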
Multithreading
• We used a multithreaded server to give
every client its own independent thread, so
that each client behaves as if it had its
own server. Communication between
peers is synchronous and blocking, and is
built on TCP/IP. Using this protocol
simplifies sending and receiving (but not
that much!).
Multithreading Issues
• The first problem came from the TCP/IP
socket model: listening and accepting
must happen in the same thread.
We did both in the main thread of the
server, but that created another problem:
when the accept function returns the
pointer to the access socket (the one you
read from and write to), you can’t just pass
it to the thread class.
Problem and Solution
• Our server passes messages between
threads, and when this happens it switches
between threads. That causes the handle of
the socket that was passed to the thread to be
deleted, because the handle is considered
temporary and is sometimes erased. The usual
solution is not to pass the socket itself but to
pass its handle instead.
• We didn’t do that: we passed the pointer
and solved the problem above with a
simple trick.
Solution ( Cont )
• The handle is deleted after the socket has
been accessed 10 times following a thread
switch, i.e. if you have switched from one
thread to another, you can send only 10
times before the exception occurs. To work
around this, we send just one message
after switching and let the client respond;
after that we are free to do what we want
with the socket.
Identifying the Threads
• Since we used user-interface threads, i.e.
a class derived from CWinThread, every
thread has a member variable called
m_nThreadID. We used it as the
fingerprint of the thread. These are the
identifiers inserted in the database, since
the value is a DWORD rather than a
pointer or a handle, and the messaging
between threads makes use of them.
Database
• As mentioned before, the database is created with
Microsoft Access. It consists of two types of tables:
1. Static Tables
a. The Users table
b. The ThreadID table
c. The GlobalFileID table
d. The Groups Table
e. The Server cache table
2. Dynamic Tables
They are the username tables and the groupname
tables
Static Tables
The Users table
• It consists of the UserName, Password, and GroupName
fields, and whether the user is logged in or not.
The ThreadID table
• It consists of the IDs of the current active threads
The GlobalFileID table
• This is where the physical names of files are taken from.
The Groups Table
• This table holds the names of the groups in the system
The Server cache table
• This table consists of the names of the files in the
server cache and the corresponding Physical names.
Dynamically Created Tables
• For every new user, a table is created to
hold the logical and physical names of
the files and directories in the user’s
account, and the threads currently holding
each file.
Group Tables
• Similar to the user tables, but with an
extra field: the name of the user who owns
the file.
Intercommunication Between
Threads Through Tables
• The database is the rendezvous point where all threads
meet. For example, when a client requests a file, the
corresponding thread searches the database for the name
and path of that file and finds the thread that holds the
file physically. It then posts a message to that thread to
tell it exactly which file is wanted. That thread in turn asks
its own client to send the file so it can be saved in the
server file cache. Once the file is there, this thread tells
the calling thread that the file is in the server cache under
a dynamic (temporary) name, and the calling thread starts
sending the file to the requesting client.
• See the flash movie
Distribution Policy
• To keep network communication to a
minimum, we assumed that a user of the
system usually works on a small number of
machines, so the files are distributed across
the machines that the user works on.
Files are created on the current machine
and saved on their original machine. We
also use the server cache table to keep
track of recently used files, evicting the
least recently used (LRU), which reduces
the response time to half.
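A minimal C++ sketch of a least-recently-used cache, under the assumption that the server cache table maps a logical file name to the physical name of its cached copy. The class name ServerCache, its methods, and the fixed capacity are hypothetical; the project keeps this information in a database table rather than in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Maps a logical file name to the physical name of its cached copy and
// evicts the least recently used entry when the cache is full.
class ServerCache {
public:
    explicit ServerCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // On a hit, writes the physical name to `out`, marks the entry as the
    // most recently used one, and returns true.
    bool lookup(const std::string& logicalName, std::string& out) {
        auto it = index_.find(logicalName);
        if (it == index_.end()) return false;
        order_.splice(order_.begin(), order_, it->second);
        out = it->second->second;
        return true;
    }

    // Inserts or refreshes an entry, evicting the oldest one if necessary.
    void store(const std::string& logicalName, const std::string& physicalName) {
        auto it = index_.find(logicalName);
        if (it != index_.end()) {
            it->second->second = physicalName;
            order_.splice(order_.begin(), order_, it->second);
            return;
        }
        if (order_.size() == capacity_) {
            index_.erase(order_.back().first);
            order_.pop_back();
        }
        order_.emplace_front(logicalName, physicalName);
        index_[logicalName] = order_.begin();
    }

private:
    using Entry = std::pair<std::string, std::string>;
    std::size_t capacity_;
    std::list<Entry> order_;  // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

A cache hit lets the server answer from its own disk instead of asking another client for the file, which is where the saving in response time would come from.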
The Role of Mutex in the Project
• To add files from local machines or to
create new ones, a thread must obtain a unique
ID for each file. It gets the ID from the GlobalFileID
table, which has only one entry: the next ID that is
not used yet. The thread reads it and then
increments it. Because this is the only
variable shared between threads, it must be
protected by a lock so that no two threads access it
simultaneously. This is where the mutex comes in: it
locks this critical, shared section of the code.
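The read-then-increment step can be sketched in C++ with std::mutex. This is an illustration only: the class GlobalFileId and its take method are hypothetical, and the counter lives in memory here, whereas the project stores it in the GlobalFileID table (and, being an MFC program, may well use a different mutex type).

```cpp
#include <cassert>
#include <mutex>

// Hands out unique file IDs. The mutex makes "read the unused ID, then
// increment it" a single critical section, so two threads can never
// receive the same ID.
class GlobalFileId {
public:
    explicit GlobalFileId(unsigned long firstUnused) : next_(firstUnused) {}

    unsigned long take() {
        std::lock_guard<std::mutex> lock(mutex_);  // released on return
        return next_++;
    }

private:
    std::mutex mutex_;
    unsigned long next_;  // the single entry: the next ID not used yet
};
```

Each server thread would call take() on one shared instance whenever its client adds or creates a file.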
Logical And Physical File Name
• Every file in the system is identified by 3
attributes
– A unique ID ( physical name)
– The actual name ( logical name)
– The full parent path of that file
So the file \dir1\dir2\file1.txt can be easily
differentiated from the file \dir1\file1.txt
because both have different IDs and different
parent paths
Directory Concept
• In our project, a directory is not a special file
with metadata in it. Instead, it is a
virtual directory: nothing but an entry in
the database. This is where the unique file ID
becomes important, since files are actually
stored on the machines in a single-level directory.
By convention, when the system finds an
entry in the table with flag=1, it is considered a
directory; when sent to the client, it is marked
as a directory and drawn like one.
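A small C++ sketch of how a flat table can represent a directory tree. The struct FileEntry, its field names, and the convention that a parent path is written like \dir1 are assumptions made for illustration; only the three identifying attributes and the flag=1 rule come from the slides.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One row of a hypothetical per-user table.
struct FileEntry {
    unsigned long id;         // unique ID, used as the physical file name
    std::string logicalName;  // the name the user sees
    std::string parentPath;   // full path of the parent directory
    int flag;                 // 1 = directory, anything else = file
};

bool isDirectory(const FileEntry& e) { return e.flag == 1; }

// A directory's contents are simply the rows whose parent path matches it.
std::vector<FileEntry> childrenOf(const std::vector<FileEntry>& table,
                                  const std::string& path) {
    std::vector<FileEntry> result;
    for (const FileEntry& e : table) {
        if (e.parentPath == path) result.push_back(e);
    }
    return result;
}
```

Two files named file1.txt can coexist because they differ in ID and parent path, even though both are stored side by side in one physical directory.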
Sending a Msg
• Sending a message is not as simple as it
seems. We are dealing with a
multithreaded server, which means time slicing
between threads, and sending a message
with a single call to the Send method does not
work: TCP/IP may deliver it in several
fragments, and the receiving
thread may treat those fragments as
separate messages, so the server may stop
working or behave strangely.
The Correct Send
• To send a message, we first
convert it to a buffer of the same
size as the message, and then repeatedly
try to send the buffer, or whatever part of it
remains to be sent. The receiver
waits until it has a complete message before
passing it to the huge
switch statement.
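The send loop and the receiver's wait for a complete message can be sketched in C++ as follows. To keep the example independent of any socket library, the actual send call is passed in as a function; SendFn, sendAll and MessageAssembler are hypothetical names, and a fixed message size is assumed because the message is a structure.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Stand-in for a socket send call: takes up to `len` bytes and returns how
// many it actually accepted, or a value <= 0 on failure.
using SendFn = std::function<int(const char* data, int len)>;

// Keeps calling `send` with the unsent remainder until the buffer is empty.
bool sendAll(const std::string& buffer, const SendFn& send) {
    std::size_t sent = 0;
    while (sent < buffer.size()) {
        int n = send(buffer.data() + sent,
                     static_cast<int>(buffer.size() - sent));
        if (n <= 0) return false;
        sent += static_cast<std::size_t>(n);
    }
    return true;
}

// Collects incoming fragments until one full fixed-size message is available.
class MessageAssembler {
public:
    explicit MessageAssembler(std::size_t messageSize) : size_(messageSize) {
        assert(messageSize > 0);
    }

    void feed(const char* data, std::size_t len) { pending_.append(data, len); }

    // Moves one complete message into `out`; returns false if more bytes
    // are still needed.
    bool next(std::string& out) {
        if (pending_.size() < size_) return false;
        out = pending_.substr(0, size_);
        pending_.erase(0, size_);
        return true;
    }

private:
    std::size_t size_;
    std::string pending_;
};
```

Only when next() returns true would the receiver hand the message to the switch statement, so a fragment is never mistaken for a whole message.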
• See the flashes
What is really going on???
• Here is a simple example of the
infrastructure of the system and the actual
communication between client and server.
Suppose a user is trying to log in. When he/she
types the username and the password, the client
encrypts the password and sends an ‘A’ type
message to the server with the typed name and
the encrypted password. The encryption is
one-way, that is, it can’t be decrypted, but
the server understands it as it is.
Scenario (Cont.)
• Once the user is
authenticated, the corresponding thread
searches the user’s table for the files and directories
in the root directory. A small ‘f’ is
prepended to the name of each file and a ‘d’ to
the name of each directory, so that the client can
distinguish between the two. The thread
puts all the names in one long string with separators
between them and sends a message
holding that string, of type ‘N’ this time.
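Building the listing string might look like the C++ sketch below. The function name buildListing and the separator character '|' are assumptions; the slides do not say which separator the project uses.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical separator; the actual character is not given in the slides.
const char kSeparator = '|';

// Each item is (name, isDirectory). Every name gets a 'd' or 'f' prefix and
// the prefixed names are joined with the separator.
std::string buildListing(
        const std::vector<std::pair<std::string, bool>>& items) {
    std::string out;
    for (const auto& item : items) {
        if (!out.empty()) out += kSeparator;
        out += (item.second ? 'd' : 'f');
        out += item.first;
    }
    return out;
}
```

The separator has to be a character that cannot appear in a file name, otherwise the client could not split the string reliably.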
Scenario (Cont.)
• When the client receives this message,
it extracts the string and sends it to the
View dialog, where it is tokenized and
inserted in the view list. The icon for each
entry depends on the prefixed letter
(directory or file) and, for files, on the
file extension, which distinguishes
between different types of files
(txt, exe, jpg, ...).
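On the client side, tokenizing such a string could be sketched in C++ as below. ListedItem, parseListing and the default '|' separator are hypothetical; the sketch only extracts what the view needs to choose an icon.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

struct ListedItem {
    std::string name;
    bool isDirectory;
    std::string extension;  // empty for directories and files without one
};

// Splits the listing on the separator and strips the 'd'/'f' prefix.
std::vector<ListedItem> parseListing(const std::string& listing,
                                     char separator = '|') {
    std::vector<ListedItem> items;
    std::istringstream in(listing);
    std::string token;
    while (std::getline(in, token, separator)) {
        if (token.size() < 2) continue;  // need a prefix plus a name
        ListedItem item;
        item.isDirectory = (token[0] == 'd');
        item.name = token.substr(1);
        std::string::size_type dot = item.name.rfind('.');
        if (!item.isDirectory && dot != std::string::npos) {
            item.extension = item.name.substr(dot + 1);
        }
        items.push_back(item);
    }
    return items;
}
```

The view can then pick a folder icon when isDirectory is true and otherwise look up an icon by extension.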
Entering a directory
• When the user double-clicks a directory,
a message is sent to the server with the
name of that directory and its parent.
The server searches the
database for the files and
directories whose parent is that
directory, assembles them as described
before, and the same procedure
follows.
A Special User !
• If the logged-in user is an Administrator,
then after authentication the server
doesn’t look for any files. It just searches the
GroupNames table and sends the group names to the
client, and the client opens an
Administration Dialog.
Issues to be discussed in detail
- File Movement:
Such as creating files and directories,
editing files, copying files and directories,
cutting files and directories, deleting files
and directories
- Logging out of the account
Client Components
The client program consists of the following:
• The Connection And Login Dialog
• The Main Browser
• The Local File Browser
• The Administration Dialog
• The Editor
• The Media Player
• The Image Viewer
• Creating separate process for “EXE” files
Every one of these components will be fully explained
in the practical discussion.
What happens when a client
crashes or is destroyed
• When any client on the network is destroyed or
crashes, it automatically loses its connection to
the server, and a connection-lost routine is
executed on the server side. This includes:
– Marking the files on that machine as invalid
– Sending a refresh to all other clients so that the view will
no longer show the files on the crashed machine
– Removing the thread ID from the ThreadID table
– Killing the thread
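The steps of this routine can be sketched in C++ with in-memory containers standing in for the database tables. ServerState, its fields, and onConnectionLost are hypothetical names, and sending a refresh is modelled simply by recording which threads would be notified.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// In-memory stand-ins for the tables involved.
struct ServerState {
    std::set<unsigned long> threadIds;                // ThreadID table
    std::map<std::string, unsigned long> fileHolder;  // physical name -> holding thread
    std::set<std::string> invalidFiles;               // files marked invalid
    std::vector<unsigned long> refreshSentTo;         // threads told to refresh
};

void onConnectionLost(ServerState& s, unsigned long lostThread) {
    // 1. Mark the files held by the lost client as invalid.
    for (const auto& kv : s.fileHolder) {
        if (kv.second == lostThread) s.invalidFiles.insert(kv.first);
    }
    // 2. Ask every other client to refresh its view.
    for (unsigned long id : s.threadIds) {
        if (id != lostThread) s.refreshSentTo.push_back(id);
    }
    // 3. Remove the thread ID from the ThreadID table.
    s.threadIds.erase(lostThread);
    // 4. In the real server, the thread itself would be ended at this point.
}
```

Marking files invalid before refreshing the other clients means their next listing already excludes the files of the crashed machine.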
Implemented but canceled Ideas
- CMsg: the message implemented as a class instead of a
struct. The size of the class differs from the sum of the
sizes of its attributes.
- Multithreaded Client: this is a long story. The client was
designed to be composed of several subServers and
subClients so that the server would not be responsible for
transferring files among the clients. It would be just a
router between threads, which would make transmission
between clients faster. We didn’t have time to complete
it.
- An SDK for general use: a DLL that can be attached to
any program, so that the program can open, read, and
write files on the system as if they were on the user’s
current machine.