EECS 489 Lab 4: DHT O(N) Case
This assignment is due on Wednesday,
12 Feb 2016, 6 pm.
Introduction
In this lab, we will implement a simplified, Chord-like distributed
hash table (DHT) that takes O(N) time to add a new node
to the DHT. Our DHT relies on on-demand correction of inconsistencies
arising from DHT node additions. You can review the DHT algorithm in
the lecture on DHT and the PA2 walk-through. You can also, optionally,
read the paper on Chord (the algorithm in the paper relies on a periodic,
rather than an on-demand, process to fix inconsistencies).
The dhtn, distributed hash table node, built from the support
code takes the following command line options:
% dhtn [ -p <node>:<port> -I <ID> -t <ttl> ]
If the node is run without the
-p option, it forms a new DHT with itself being the only node
overseeing the whole identifier space. The -p option
specifies the target of the node's message when joining an existing
DHT. A node's position in the DHT may not end up being adjacent to
the target node. A node's ID is based on the SHA1 hash of its IPv4
address and port number. As in Lab 3, we assume 8-bit identifiers.
The -I option allows you to override the ID computation and
give the node a static ID. This allows you to test how your code
handles ID collision. It also allows you to test specific node
addition orders and scenarios. The -t option allows you to set
a time-to-live (ttl) value different from the default
DHTM_TTL, defined in dhtn.h, for your join (and
in PA2, search) message(s).
A node is placed between its predecessor and successor in the
identifier ring if its ID is > that of the predecessor and
≤ that of its successor, where the ordering of the IDs
follows modulo arithmetic as in Lab 3. Entering 'p' on the
standard input (console) prints out the node's and
its predecessor's and successor's IDs (don't forget to hit
enter or return after the 'p').
[Unfortunately, console I/O is not implemented for Windows.]
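The placement rule above can be sketched as a small range test on the 8-bit identifier ring. This is only an illustration: the helper name in_ring_range and its signature are hypothetical and need not match your Lab 3 ID_inrange(). It also assumes that a node whose predecessor and successor IDs are equal owns the whole identifier space.

```cpp
#include <cstdint>

// Hypothetical helper, not the Lab 3 ID_inrange() signature.
// Returns true when id lies in the ring interval (pred, succ] on an
// 8-bit identifier ring, where all arithmetic wraps modulo 256.
bool in_ring_range(uint8_t id, uint8_t pred, uint8_t succ) {
    // Clockwise distances measured from pred; the casts wrap modulo 256.
    uint8_t span = static_cast<uint8_t>(succ - pred);
    uint8_t off = static_cast<uint8_t>(id - pred);
    // Assumption: pred == succ means the interval covers the whole ring.
    if (span == 0) return true;
    // Exclude pred itself (off == 0), include succ (off == span).
    return off != 0 && off <= span;
}
```

For example, with a predecessor of 250 and a successor of 10, the IDs 255, 3, and 10 are in range, while 250 and 100 are not.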
Assumptions
To make the lab more manageable, we make the following simplifying
assumptions:
- No node departure: once a node joins the DHT, it doesn't depart
until you take down the whole DHT. This means that, as in PA1, you
don't need to clean up after a node departure; you only need to make
sure that you can take down the DHT without any node crashing.
- Node join process does not fail. To assume otherwise would
require a somewhat more complicated two-phase-commit join protocol.
- No concurrent joins. Nodes are added one at a time. The provided
support code will most likely work with concurrent joins, but it
has not been tested for them. Consequently, until a node has completed its
join process, it will interpret receiving a join packet from another
node as an error.
- Every time a node needs to send a message, it opens a new
connection to the target node. The sender immediately closes the
connection once the message is sent. So there are no permanently
open connections, unlike in PA1. The only exception to this
single-message-per-connection rule is when performing on-demand
correction of DHT inconsistency due to node addition, as explained
below.
- When the ID range of a node changes, usually when its
predecessor node changes, its whole image database is reloaded
and its Bloom Filter recomputed.
Join Protocol
When the first node runs, it starts a listening socket, prints out its
<hostname>:<port> on the screen and waits for
join requests. Subsequent nodes should be started with an existing
node's <hostname>:<port> provided on the command
line. Each subsequent node first creates its own listening socket and
then connects to the provided node, sending over a join request with
its own ID, IPv4 address, and listening port number. The packet
format defined as dhtmsg_t in dhtn.h is as follows:
The version number MUST be DHTM_VERS, as
defined in dhtn.h. The dhtm_type field encodes the
type of packet. Packets of type DHTM_JOIN and
DHTM_REID use the same format. In the case of
DHTM_JOIN, the port and IPv4 address carried by the packet,
as part of dhtnode_t, are those of the joining node. In the
case of DHTM_REID, the dhtnode_t field is not
used.
After a node has sent out its join packet, it goes into a
select() loop waiting for a connection to arrive at its
listen socket or, if not running on Windows, for input on standard
input (console). Since we open a separate connection for each
message, when a connection is established, we immediately go into
receiving mode. Each new connection is assigned an ephemeral source
port different from the port on which the node listens for connections.
To identify a node, use its ID instead of its source port.
There are two possible outcomes to a join attempt: either there is an
ID collision and the node is asked to generate a new ID and try again
(i.e., it receives a DHTM_REID packet), or it receives a welcome
message, informing it of its successor and predecessor nodes. The packet
format used for DHTM_WLCM packet is dhtwlcm_t defined
in dhtn.h and it is very similar to dhtmsg_t but for an
additional dhtnode_t in the packet. The first
dhtnode_t in the welcome packet is the
successor to the joining node and the second
dhtnode_t its predecessor. The figure below shows the packet
format of dhtwlcm_t:
In dhtn::handlepkt() we check whether the returning packet is
of type DHTM_REID or DHTM_WLCM. In the former case,
we call dhtn::reID() to regenerate a new ID and then call
dhtn::join() again with the new ID to retry the join attempt.
In the latter case, we store the first dhtnode_t in the
return packet in the dhtn class member variable
fingers[DHTN_SUCC], which is where we keep the successor node's
information. Then we store the second dhtnode_t in the class
member variable fingers[DHTN_PRED]. We use a
fingers[] array instead of separate successor and predecessor
variables in anticipation of PA2.
The function dhtn::handlepkt() has been provided to you in
full. Please take your time to read it carefully and make sure you
understand what it is doing. Pay attention to how the socket whence a
packet arrived is closed as soon as we finish receiving the packet.
The only exception is when the arriving packet is a DHTM_JOIN
packet, in which case we call dhtn::handlejoin() to handle
the packet. In dhtn::handlejoin() be sure to close the
sender socket as soon as you're done with it. Otherwise, you could
run into a deadlock where multiple nodes are waiting for
each other to complete transmission and close the connection.
Task 1
Your first task is to write the function dhtn::handlejoin(). There are four cases you need to consider in writing this function:
- When the joining node's ID collides with that of the current
node or its predecessor, send back to the joining node a
DHTM_REID message, as described above.
- When the joining node's ID falls within the identifier range of
the current node, the correct spot on the identifier ring has been
found for the joining node. Insert it into place and inform the
joining node of its place in the identifier ring by sending it a
DHTM_WLCM message, as described above. When the joining
node is inserted into place, it splits the current node's ID range.
- When the joining node's ID is not within the current node's
identifier range, but the node forwarding the join request believes
it to be within the current node's range, the forwarding node's
successor information has become inconsistent due to earlier node
addition(s). Send a DHTM_RDRT message to the node forwarding the
join request (not to the joining node).
- Otherwise, the DHTM_JOIN message must be forwarded along the
DHT by calling dhtn::forward() (see Task 2 below).
See comments in dhtn.cpp:dhtn::handlejoin() for
what you need to do for each of the above cases. Don't forget to close
the socket passed to dhtn::handlejoin() as soon as you no longer
need it, to prevent a deadlock. This task takes no more
than 35-45 lines of code. You will need your
hash.cpp:ID_inrange() code from Lab 3.
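The four cases can be pictured as a single decision function. The sketch below is not the structure of dhtn::handlejoin(); the enum JoinAction, the function classify_join, and the boolean atloc (standing in for the DHTM_ATLOC bit of dhtm_type) are all hypothetical names chosen for illustration. It assumes the current node's range is (predecessor ID, own ID] and that a node that is its own predecessor owns the whole ring.

```cpp
#include <cstdint>

// Hypothetical labels for the four outcomes described in Task 1.
enum class JoinAction { SendReid, SendWelcome, SendRedirect, Forward };

// Hypothetical decision helper; the real work (sockets, packets) is omitted.
// self and pred are the current node's and its predecessor's IDs, joiner is
// the joining node's ID, atloc says whether the forwarder set DHTM_ATLOC.
JoinAction classify_join(uint8_t self, uint8_t pred, uint8_t joiner, bool atloc) {
    // Case 1: ID collision with the current node or its predecessor.
    if (joiner == self || joiner == pred) return JoinAction::SendReid;

    // Case 2: joiner falls in (pred, self], measured modulo 256.
    uint8_t span = static_cast<uint8_t>(self - pred);
    uint8_t off = static_cast<uint8_t>(joiner - pred);
    bool mine = (span == 0) || (off != 0 && off <= span);
    if (mine) return JoinAction::SendWelcome;

    // Case 3: the forwarder expected the joiner to belong here, but it does not.
    if (atloc) return JoinAction::SendRedirect;

    // Case 4: keep passing the join request along the ring.
    return JoinAction::Forward;
}
```

Checking the collision case first keeps the range test simple, since by then the joiner's ID is known to differ from both end points.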
Task 2
The function dhtn::forward() handles the forwarding of
DHTM_JOIN messages. It is also where we implement the
"on-demand" repair of the DHT identifier ring inconsistencies arising
from node additions. Before forwarding a join message, check whether
the ID of the joining node falls within the expected range of the node
to which the message is to be forwarded. If so, set the
DHTM_ATLOC bit of the message's dhtm_type. As
detailed in the description for Task 1 above, if this expectation is
misplaced, the node to which the join message is forwarded will return
a DHTM_RDRT message. A DHTM_RDRT message has the
same format as a DHTM_JOIN message, except that the
dhtnode_t field contains the address and port number of a
suggested replacement successor node. We always accept the
suggestion, make it our new successor node, and re-send the join
message to it, with DHTM_ATLOC set. If there have been
multiple additions to the DHT, the suggested successor may yet again
turn out to be inconsistent information, in which case we may have
to change our successor node and retransmit the join request multiple
times.
Every time a DHTM_JOIN packet is forwarded, including
re-forwarding in response to the receipt of a DHTM_RDRT, its
dhtm_ttl field is decremented by one. When the ttl
reaches 0, the packet is dropped. Since the join process is assumed
not to fail, and since we don't have the equivalent of ICMP
implemented for this assignment, if a join packet is dropped due to
ttl expiration, you'll simply be prompted to re-run your test
case with a larger ttl (specified using the -t
command line option).
The comments in dhtn.cpp:dhtn::forward() contain further
details that should help in implementing this function. This second
task should take about 20 lines of code. In completing both tasks 1
and 2, you may use the functions in socks.cpp provided as part
of the support code (or use your own implementation from PA1).
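The bookkeeping done before a join message leaves the node (ttl decrement and the DHTM_ATLOC decision) can be sketched as follows. The struct JoinMsg is a hypothetical stand-in for the relevant fields of dhtmsg_t, whose real layout is defined in dhtn.h, and prepare_forward is an illustrative name. The sketch assumes the ttl is decremented first and the packet dropped if the result is 0; check the comments in dhtn.cpp for the exact order expected. Handling of DHTM_RDRT replies is not modeled.

```cpp
#include <cstdint>

// Hypothetical stand-in for the join-related fields of dhtmsg_t.
struct JoinMsg {
    uint8_t ttl;        // remaining forwarding budget
    uint8_t joiner_id;  // ID of the joining node
    bool atloc;         // stands in for the DHTM_ATLOC bit of dhtm_type
};

// Prepares msg for forwarding from node self to its believed successor succ.
// Returns false when the packet must be dropped because its ttl ran out.
bool prepare_forward(JoinMsg& msg, uint8_t self, uint8_t succ) {
    if (msg.ttl == 0) return false;  // defensive: nothing left to spend
    msg.ttl = static_cast<uint8_t>(msg.ttl - 1);
    if (msg.ttl == 0) return false;  // assumption: drop once ttl reaches 0

    // The successor's expected range is (self, succ], modulo 256.
    uint8_t span = static_cast<uint8_t>(succ - self);
    uint8_t off = static_cast<uint8_t>(msg.joiner_id - self);
    msg.atloc = (off != 0 && off <= span);
    return true;
}
```

A caller would send the message to the successor only when the function returns true, and otherwise report the ttl expiration.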
Testing Your Code
The provided dhtn is linked with the imgdb class so
you can use netimg from Lab 3 to query it for an image. To
simplify experimentation, you can let all instances of your
dhtn share the same images folder, but each
dhtn instance should load to its image database only those
images whose IDs fall within its purview. We are not implementing
search on the distributed hash table in this lab, so only queries for
images within a dhtn's ID range may return an image. In your
test, you can run netimg multiple times, each time connecting
to a different dhtn, requesting images within and outside the
node's ID range. As in PA1, each dhtn has two sockets: one
to communicate with other DHT nodes, the other for netimg
clients to query the node's imgdb.
After each node addition, the successor information of previously
added nodes may have become inconsistent such that when you hit
'p' you will see inconsistent successor information. This is
all right. However, for each new node you add, entering 'p' at
that newly joined node should show you the correct successor node.
Only after additional nodes have joined the network is the successor
information of existing nodes allowed to be inconsistent. While the
successor information is allowed to become inconsistent, the
predecessor information must stay consistent at all times. So if you
hit 'p' on each node, you should be able to reconstruct the
correct identifier ring by following the predecessor node information
at each node.
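One way to check your notes from the 'p' output is to walk the predecessor links and see whether they close into a single ring. The sketch below assumes you have copied each node's ID and its predecessor's ID into a map by hand; the function name walk_predecessors and the map layout are illustrative only and are not part of the support code.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// pred_of maps each node ID to the predecessor ID that node printed.
// Starting at start, follows predecessor links until the walk returns to
// start. Returns the visited IDs (in reverse ring order), or an empty
// vector if a link is missing or the links do not form one ring that
// covers every node in the map.
std::vector<uint8_t> walk_predecessors(const std::map<uint8_t, uint8_t>& pred_of,
                                       uint8_t start) {
    std::vector<uint8_t> ring;
    uint8_t cur = start;
    do {
        auto it = pred_of.find(cur);
        // Unknown node, or more steps than nodes: the links are inconsistent.
        if (it == pred_of.end() || ring.size() >= pred_of.size()) return {};
        ring.push_back(cur);
        cur = it->second;
    } while (cur != start);
    if (ring.size() != pred_of.size()) return {};  // some node was never reached
    return ring;
}
```

With nodes 10, 80, and 200 whose predecessors are 200, 10, and 80 respectively, a walk starting at 80 visits 80, 10, and 200 before closing the ring.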
Support Code
The support code is available as lab4.tgz.
It contains only three files: a Makefile, dhtn.h,
and dhtn.cpp. To build the dhtn program, you need
your files from Lab 3. You could simply copy over these new files
from Lab 4 to your Lab 3 folder. If you want to save the
Makefile from Lab 3, please do so before you copy over and
overwrite it with the new Makefile. We assume you have a
working imgdb and ID_inrange() function from Lab 3.
If you didn't manage to get these functions to work in Lab 3, you can
get the solutions in exchange for 10 of your PA2 points. As with Lab 3, only
those who have completed PA1, or inform us that they are not going to
complete PA1, will have access to the support code because the support
code reveals the solution to parts of PA1.
You can also find the reference implementation refdhtn
in /afs/umich.edu/class/eecs489/w16/lab4.
The reference implementation is,
as usual, compiled on CAEN eecs489 hosts running Red Hat 7, so
don't try to run it on Mac OS X or Windows machines.
The support code has been compiled and tested on Linux, Mac OS X,
and Windows. As with Lab 3, on Ubuntu and Windows, you'd need to
install the OpenSSL library (see Lab 3 specs for instructions).
Submission Instructions
As with Lab 1, incorporating publicly available code in your
solution and passing off the implementation of one algorithm as that of
another are both considered cheating in this course. If you cannot
implement a required algorithm, you must inform the teaching staff
when turning in your assignment.
Your submission must compile and run without errors on CAEN
eecs489 hosts using the provided Makefile, unmodified, without any additional libraries or
compiler options.
Your "Lab4 files" comprise your dhtn.cpp file.
To turn in your Lab4, upload a zipped or gzipped
tarball of your Lab4 files to the CTools Drop Box. Keep your own backup copy! The timestamp on your
uploaded file is your time of submission. If this is past the
deadline, your submission will be considered late. You are allowed
multiple "submissions" without late-policy implications as
long as you respect the deadline. We highly recommend that you use a
private third-party repository such as github
or M+Box or Dropbox or Google Drive to keep the backup copy of your
submission. Local timestamps can be easily altered and cannot be used
to establish your files' last modification times (-10 points). Be
careful to use only a third-party repository that allows for
private access. Putting your code in a publicly accessible
third-party repository is an Honor Code
violation.
Turn in ONLY the files you have modified. Do
not turn in support code we provided that you haven't modified (-4 points).
Do not turn in any binary files (object, executable, dll,
library, or image files) with your assignment (-4 points). Your code
must not require any libraries or header files other
than the ones listed in the Makefile (-10 points).
Do remove all printf()'s or
cout's and cerr's and any other logging statements
you've added for debugging purposes. You should debug using a
debugger, not with printf()'s. If we can't understand the
output of your code, you will get zero points.