CMSC 621
Project
The Mechanics
You will do this project in groups. Each group shall have no fewer than three
and no more than four members. Groups with fewer than three members will
not be allowed, unless extreme circumstances demand it. In any case, such
groups will still be judged against the same criteria as regular groups.
- On or before October 25, you will submit (electronically) a document
which provides preliminary system specs and design, assumptions you have
made, and a plan of execution (including a proposed timeline). This should
be emailed to cmsc621@cs.umbc.edu with the subject "First Project
Report".
- On November 25, you will submit a brief report of your progress
to date, changes in the design if any, as well as an updated timeline.
Again, this will be via email, with the subject "Second Project Report".
- The project report detailing your work (the system design, experimental
studies, etc.), as well as your code, will be due on December
15, 1999, in a similar manner: emailed with the subject "Final Report".
Code should be attached as a tarred, gzipped file. Please note that pdf,
ps and text (formatted to 72 columns) are the only formats we will accept.
You will also arrange to demonstrate your project to the instructor between
December 6 and December 17. The report and code submission must
precede the demonstration, however.
You are allowed to discuss the project across groups. Clearly, you are not
allowed to share solutions. You may read papers and textbooks in this
area as well -- some pointers are provided in this document. However, you
should cite the sources you have consulted. It is intended that you will
do this project using Java on the CS/UCS Unix systems. However, you are
free to choose another imperative language should you so desire, as long
as the project runs on CS/UCS machines.
The Bare-Bones Task (65 points)
In this project, you will design a simple distributed file system with
replication and disconnection management features. The system will consist
of several file servers, and clients which interact with them. Normal operations
related to file systems, such as creating, deleting, opening, closing,
reading, writing, seeking etc. should be allowed. These requests should
be processed by the fileserver that the client is connected to. When a
file is created by a client, it resides on the fileserver which the client
is connected to, and is owned by this client. In other words, the
primary copy of the file is kept on this server. When some fileserver receives
a request for operations on a file, it should contact the server which
owns the file and create a replica locally. All operations should then
proceed on the replica. For the basic task, we will create a pessimistic
replication scheme. In other words, any operations (such as write) which
could throw the replicas out of sync will be permitted on one replica only.
However, any number of file servers can freely create replicas and perform
operations which do not alter the file. Your system should allow a fileserver
wanting to create a replica to tell the owner fileserver the mode in which
it wants a file. If the mode is one which will change the file, then the
requester should also specify a lease period. The lessee fileserver
promises to return any changes to the file to the lessor (owner) within
the lease period, or seek an extension, which may or may not be granted.
If changes are returned in time, then the owner fileserver should "commit"
them, and update all other replicas appropriately. If the lessee does not
return the changes before lease expiry, then the lessor should invalidate
any changes that the lessee may have made (and send it a message to this
effect). Any request for a lease on a file that has already been leased
out should be blocked. When the file becomes available, the owner should lease
it out in FIFO order and unblock the corresponding process.
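To make the lease rules above concrete, here is a minimal Java sketch of the bookkeeping an owner fileserver might keep for one file's write lease. The class name WriteLease, its method names, the explicit "now" argument (a clock value in milliseconds), and the policy of refusing an extension while other servers are waiting are all assumptions made for illustration; none of them is prescribed by this assignment. Expiry is checked lazily on each call rather than by a timer, and the invalidation message to the lessee is left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical owner-side state for the write lease on a single file. */
class WriteLease {
    /** One blocked lease request (assumed shape: requester name plus lease period). */
    private static final class Request {
        final String server;
        final long periodMillis;
        Request(String server, long periodMillis) {
            this.server = server;
            this.periodMillis = periodMillis;
        }
    }

    private final Deque<Request> waiting = new ArrayDeque<>(); // FIFO of blocked requesters
    private String holder;   // current lessee, or null if the file is not leased out
    private long expiresAt;  // absolute expiry time of the current lease

    /** Returns true if the lease is granted immediately; otherwise the request is queued. */
    synchronized boolean request(String server, long periodMillis, long now) {
        expireIfOverdue(now);
        if (holder == null) {
            grant(new Request(server, periodMillis), now);
            return true;
        }
        waiting.addLast(new Request(server, periodMillis));
        return false;
    }

    /** Returns true if the changes arrived in time and should be committed. */
    synchronized boolean returnChanges(String server, long now) {
        expireIfOverdue(now);
        if (!server.equals(holder)) {
            return false; // lease expired or never held: changes are invalidated
        }
        holder = null;
        grantNext(now);
        return true;
    }

    /** Assumed policy: extend only if the caller holds the lease and nobody is waiting. */
    synchronized boolean extend(String server, long extraMillis, long now) {
        expireIfOverdue(now);
        if (!server.equals(holder) || !waiting.isEmpty()) {
            return false;
        }
        expiresAt += extraMillis;
        return true;
    }

    synchronized String currentHolder(long now) {
        expireIfOverdue(now);
        return holder;
    }

    private void expireIfOverdue(long now) {
        if (holder != null && now >= expiresAt) {
            holder = null;   // the lessee missed its deadline
            grantNext(now);  // hand the lease to the oldest waiting request, if any
        }
    }

    private void grantNext(long now) {
        Request next = waiting.pollFirst();
        if (next != null) {
            grant(next, now);
        }
    }

    private void grant(Request r, long now) {
        holder = r.server;
        expiresAt = now + r.periodMillis;
    }
}
```

In a real server, a queued request would correspond to a blocked thread or a pending reply, and committing would also push the new contents to the other replicas. Those parts are design decisions for your group.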
Optimistic Replication (25 points)
The idea here is to extend the bare bones system to permit multiple servers
to simultaneously operate on replicas in a manner which will change them
(e.g. concurrent writes on replicas). There are several published approaches
in the literature to achieve this, and you can choose to implement (some subset)
of any one of them. Good starting points include the CODA and SPRITE
papers on the course web site, besides the URLs provided below. When the
changes are returned, they should be merged as needed and the file updated.
If the changes are inconsistent, your system should simply flag an error
and choose one of the changes to commit. Your design documents should specify
exactly what you would implement.
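One technique you will encounter in this literature for deciding whether two sets of changes are inconsistent is the version vector: each replica keeps a per-server count of updates, and two replicas conflict when neither vector dominates the other. The Java sketch below shows only that comparison. It is one possible approach under stated assumptions, not the required design; the class VersionVector and its methods are hypothetical names, and how you merge or pick a winner after detecting a conflict is left to your design document.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical per-replica version vector: server name -> number of updates seen. */
class VersionVector {
    enum Order { EQUAL, BEFORE, AFTER, CONCURRENT }

    private final Map<String, Integer> counts = new HashMap<>();

    /** Record one update performed at the given server. */
    void increment(String server) {
        counts.merge(server, 1, Integer::sum);
    }

    int get(String server) {
        return counts.getOrDefault(server, 0);
    }

    /** Copy used when a new replica is created from this one. */
    VersionVector copy() {
        VersionVector v = new VersionVector();
        v.counts.putAll(counts);
        return v;
    }

    /**
     * Compares this vector with another.
     * CONCURRENT means each side has an update the other has not seen,
     * which is the case the system should flag as a conflict.
     */
    Order compare(VersionVector other) {
        boolean less = false;
        boolean greater = false;
        Set<String> servers = new HashSet<>(counts.keySet());
        servers.addAll(other.counts.keySet());
        for (String s : servers) {
            int a = get(s);
            int b = other.get(s);
            if (a < b) less = true;
            if (a > b) greater = true;
        }
        if (less && greater) return Order.CONCURRENT;
        if (less) return Order.BEFORE;
        if (greater) return Order.AFTER;
        return Order.EQUAL;
    }
}
```

For example, if two servers each copy the same replica and then each records a write of its own, comparing their vectors yields CONCURRENT, whereas comparing either one with the original yields AFTER.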
Fileserver Discovery (10 points)
One problem with the scheme we have outlined is that a client needs to
know up front about the appropriate fileserver. This can create a problem
with mobile systems -- where the client moves and can reconnect at any
arbitrary network. (Imagine unplugging a machine in your office, plugging
it back in here on campus, and still having it work.) What is needed is a
mechanism that will allow a client to find its nearest fileserver automatically.
Your system should permit this. (Hint: Think Jini).
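As a small illustration of the last step of such a mechanism, the Java sketch below assumes the client has already sent out some kind of probe (for instance over multicast, which is how Jini's discovery protocols locate lookup services) and has collected replies from the fileservers that answered. It then treats the server with the smallest measured round-trip time as the "nearest" one. The classes DiscoveryReply and NearestServer, and the choice of round-trip time as the distance measure, are assumptions for illustration only; the probing itself is not shown.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical record of one fileserver's answer to a discovery probe. */
class DiscoveryReply {
    final String serverAddress;   // where the fileserver can be reached
    final long roundTripMillis;   // time between sending the probe and receiving this reply

    DiscoveryReply(String serverAddress, long roundTripMillis) {
        this.serverAddress = serverAddress;
        this.roundTripMillis = roundTripMillis;
    }
}

class NearestServer {
    /** Picks the reply with the smallest round-trip time; empty if nobody answered. */
    static Optional<DiscoveryReply> pick(List<DiscoveryReply> replies) {
        DiscoveryReply best = null;
        for (DiscoveryReply r : replies) {
            if (best == null || r.roundTripMillis < best.roundTripMillis) {
                best = r;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Other notions of "nearest" (hop count, same subnet, server load) would fit the same shape; whichever you choose, state it as an assumption in your design document.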
Experiments
A part of your project is to design and carry out an experimental validation
to convince us that your system works. You should also analyze your system
for scalability -- both in the number of clients and the number of servers.
You should experiment with different workload mixtures. Comment on the
results you get -- try to identify which of your assumptions or implementation
mechanisms cause particular scalability patterns. Try and
identify both the strengths and weaknesses of your system. If you use your
experimental results to refine your design, make sure you bring this out
in your report.
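One simple way to produce different workload mixtures is to generate a reproducible sequence of operations with a chosen fraction of writes. The Java sketch below does only that; the class WorkloadMix, the two-operation model (READ and WRITE), and the use of a fixed seed are assumptions for illustration, and a real experiment would also vary file sizes, numbers of clients and servers, and so on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical generator for a read/write workload with a given write fraction. */
class WorkloadMix {
    enum Op { READ, WRITE }

    /**
     * Generates count operations. Each one is a WRITE with probability
     * writeFraction (0.0 to 1.0) and a READ otherwise. Using the same seed
     * reproduces the same sequence, so runs can be compared fairly.
     */
    static List<Op> generate(int count, double writeFraction, long seed) {
        Random rng = new Random(seed);
        List<Op> ops = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            ops.add(rng.nextDouble() < writeFraction ? Op.WRITE : Op.READ);
        }
        return ops;
    }
}
```

Running the same generated sequence against different numbers of servers, or different write fractions against the same configuration, gives you comparable data points for the scalability discussion.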
Some general suggestions
As should be evident to most of you, it is imperative for a project of
this complexity and involving teams that you design your system before
you code! In your design, you will need to make assumptions as you
flesh out the details of the system. Please make sure that you state them
in your design document. Make a timeline for your work, and try and stick
to it. Where you divide tasks, make sure you clearly define points of articulation
and interfaces between modules. As you form groups, please make sure that
you can find a common time to meet. This is especially true for those who
are part-time students and hold jobs that restrict their schedules.
Please comment your code well -- it will help both you and us: you in figuring
out code your partners have written, and us in grading it. Also, use some form
of revision control on your source tree. CS/UCS machines have systems such
as CVS and RCS available for your use. This will help if lightning strikes,
the UPS fails and machines/disks crash, making your recent changes disappear!
Please do create makefiles as well.
References
- The Coda System (http://www.cs.cmu.edu/afs/cs/project/coda/Web/coda.html)
- The FICUS System (http://ficus-www.cs.ucla.edu/ficus/)
- The Sprite System (http://www.cs.berkeley.edu/projects/sprite/sprite.html)
- The NOW/xFS System (http://now.cs.berkeley.edu/Xfs/xfs.html)