EECS 489 Lab 4: DHT O(N) Case
This assignment is due on Wednesday, 12 Feb 2016, 6 pm.
Introduction
In this lab, we will implement a simplified, Chord-like distributed hash table (DHT) that takes O(N) time to add a new node to the DHT. Our DHT relies on on-demand correction of inconsistencies arising from DHT node additions. You can review the DHT algorithm in the lecture on DHT and the PA2 walk-through. You can also, optionally, read the paper on Chord (the algorithm in the paper relies on a periodic, instead of an on-demand, process to fix inconsistencies).
The dhtn (distributed hash table node) program built from the support code takes the following command line options:
Assumptions
To make the lab more manageable, we make the following simplifying assumptions:
Join Protocol
When the first node runs, it starts a listening socket, prints out its <hostname>:<port> on the screen, and waits for join requests. Subsequent nodes should be started with an existing node's <hostname>:<port> provided on the command line. Each subsequent node first creates its own listening socket and then connects to the provided node, sending over a join request with its own ID, IPv4 address, and listening port number. The packet format defined as dhtmsg_t in dhtn.h is as follows:
Task 1
Your first task is to write the function dhtn::handlejoin(). There are four cases you need to consider in writing this function:
Task 2
The function dhtn::forward() handles the forwarding of DHTM_JOIN messages. It is also where we implement the "on-demand" repair of the DHT identifier ring inconsistencies arising from node additions. Before forwarding a join message, check whether the ID of the joining node falls within the expected range of the node to which the message is to be forwarded. If so, set the DHTM_ATLOC bit of the message's dhtm_type. As detailed in the description for Task 1 above, if this expectation is misplaced, the node to which the join message is forwarded will return a DHTM_RDRT message.
A DHTM_RDRT message has the same format as a DHTM_JOIN message, except that the dhtnode_t field contains the address and port number of a suggested replacement successor node. We always accept the suggestion, make it our new successor node, and re-send the join message to it, with DHTM_ATLOC set. If there have been multiple additions to the DHT, the suggested successor may itself turn out to be inconsistent information, in which case we may have to change our successor node and retransmit the join request multiple times.
Every time a DHTM_JOIN packet is forwarded, including re-forwarding in response to the receipt of a DHTM_RDRT, its dhtm_ttl field is decremented by one. When the ttl reaches 0, the packet is dropped. Since the join process is assumed not to fail, and since we don't have the equivalent of ICMP implemented for this assignment, if a join packet is dropped due to ttl expiration, you'll simply be prompted to re-run your test case with a larger ttl (specified using the -t command line option).
The comments in dhtn.cpp:dhtn::forward() contain further details that should help in implementing this function. This second task should take about 20 lines of code. In completing both tasks 1 and 2, you may use the functions in socks.cpp provided as part of the support code (or use your own implementation from PA1).
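The bookkeeping described for Task 2 (range check, DHTM_ATLOC bit, ttl decrement) can be illustrated with the small standalone sketch below. It is not the support code: JoinMsg, kJoin, kAtLoc, in_range(), and prepare_forward() are hypothetical names and values invented for illustration, the 8-bit ID space and the (begin, end] range convention are assumptions, and "decrement, then drop if the ttl has reached 0" is only one reading of the ttl rule. The real definitions live in dhtn.h, and ID_inrange() comes from Lab 3.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values; the real ones are defined in dhtn.h.
constexpr uint8_t kJoin  = 0x02;
constexpr uint8_t kAtLoc = 0x80;

// Hypothetical, trimmed-down stand-in for dhtmsg_t.
struct JoinMsg {
  uint8_t  type;       // message type bits
  uint16_t ttl;        // remaining forwarding budget
  uint8_t  joiner_id;  // ID of the node that wants to join
};

// True if id lies in the circular interval (begin, end] on a 256-ID ring.
// begin == end is treated as the whole ring (assumed single-node case).
bool in_range(uint8_t id, uint8_t begin, uint8_t end) {
  if (begin == end) return true;
  if (begin < end) return id > begin && id <= end;
  return id > begin || id <= end;  // interval wraps past 255 back to 0
}

// Updates msg for one forwarding step from self_id to its successor.
// Returns false if the message must be dropped because its ttl ran out.
bool prepare_forward(JoinMsg& msg, uint8_t self_id, uint8_t succ_id) {
  assert((msg.type & kJoin) != 0);  // only join messages are forwarded here
  if (msg.ttl == 0) return false;
  msg.ttl -= 1;
  if (msg.ttl == 0) return false;   // ttl reached 0: drop
  // If the joiner's ID falls in the range we expect the successor to
  // own, tell the successor so by setting the at-location bit.
  if (in_range(msg.joiner_id, self_id, succ_id)) msg.type |= kAtLoc;
  return true;
}
```

For example, under these assumptions a node with ID 40 and successor 60 forwarding a join for ID 50 would set the at-location bit, whereas a join for ID 70 would be forwarded without it.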
Testing Your Code
The provided dhtn is linked with the imgdb class so you can use netimg from Lab 3 to query it for an image. To simplify experimentation, you can let all instances of your dhtn share the same images folder, but each dhtn instance should load into its image database only those images whose IDs fall within its purview. We are not implementing search on the distributed hash table in this lab, so only queries for images within a dhtn's ID range may return an image. In your test, you can run netimg multiple times, each time connecting to a different dhtn, requesting images within and outside the node's ID range. As in PA1, each dhtn has two sockets: one to communicate with other DHT nodes, the other for netimg clients to query the node's imgdb.
After each node addition, the successor information of previously added nodes may have become inconsistent, so that when you hit 'p' on those nodes you will see inconsistent successor information. This is all right. However, for each new node you add, entering 'p' at that newly joined node should show you the correct successor node. Only after additional nodes have joined the network is the successor information of existing nodes allowed to be inconsistent.
While the successor information is allowed to become inconsistent, the predecessor information must stay consistent at all times. So if you hit 'p' on each node, you should be able to reconstruct the correct identifier ring by following the predecessor node information at each node.
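The predecessor check described above can be done by hand, but a small helper makes the idea concrete. The sketch below assumes you have copied each node's ID and its predecessor's ID from the 'p' output into a map; walk_ring() is a hypothetical helper, not part of the support code, and the 8-bit ID type is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Follows predecessor pointers starting from the smallest node ID.
// pred_of maps each node's ID to the predecessor ID it reports.
// Returns the IDs in the order visited, or an empty vector if the
// pointers do not form a single ring covering every node.
std::vector<uint8_t> walk_ring(const std::map<uint8_t, uint8_t>& pred_of) {
  std::vector<uint8_t> ring;
  if (pred_of.empty()) return ring;

  std::set<uint8_t> seen;
  uint8_t cur = pred_of.begin()->first;
  for (std::size_t i = 0; i < pred_of.size(); ++i) {
    auto it = pred_of.find(cur);
    if (it == pred_of.end()) return {};       // points at an unknown node
    if (!seen.insert(cur).second) return {};  // revisited a node too early
    ring.push_back(cur);
    cur = it->second;
  }
  if (cur != ring.front()) return {};         // walk did not close the ring
  assert(ring.size() == pred_of.size());
  return ring;
}
```

With nodes 10, 90, and 200 reporting predecessors 200, 10, and 90 respectively, the walk visits 10, 200, 90 and closes the ring. If node 10 instead reported 90 as its predecessor, the walk would revisit 10 before reaching 200 and return an empty vector.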
Support Code
The support code is available as lab4.tgz. It contains only three files: a Makefile, dhtn.h, and dhtn.cpp. To build the dhtn program, you need your files from Lab 3. You could simply copy these new Lab 4 files over to your Lab 3 folder. If you want to save the Makefile from Lab 3, please do so before you overwrite it with the new Makefile. We assume you have a working imgdb and ID_inrange() function from Lab 3. If you didn't manage to get these functions to work in Lab 3, you can get the solutions for 10 of your PA2 points. As with Lab 3, only those who have completed PA1, or have informed us that they are not going to complete PA1, will have access to the support code because the support code reveals the solution to parts of PA1.
You can also find the reference implementation refdhtn in /afs/umich.edu/class/eecs489/w16/lab4. The reference implementation is, as usual, compiled on CAEN eecs489 hosts running Red Hat 7, so don't try to run it on Mac OS X or Windows machines. The support code has been compiled and tested on Linux, Mac OS X, and Windows. As with Lab 3, on Ubuntu and Windows, you'd need to install the OpenSSL library (see Lab 3 specs for instructions).
Submission Instructions
As with Lab 1, incorporating publicly available code in your solution and passing off someone else's implementation of an algorithm as your own are both considered cheating in this course. If you cannot implement a required algorithm, you must inform the teaching staff when turning in your assignment. Your submission must compile and run without errors on CAEN eecs489 hosts using the provided Makefile, unmodified, without any additional libraries or compiler options. Your "Lab4 files" comprise your dhtn.cpp file.
To turn in your Lab4, upload a zipped or gzipped tarball of your Lab4 files to the CTools Drop Box. Keep your own backup copy! The timestamp on your uploaded file is your time of submission. If this is past the deadline, your submission will be considered late. You are allowed multiple "submissions" without late-policy implications as long as you respect the deadline. We highly recommend that you use a private third-party repository such as GitHub, M+Box, Dropbox, or Google Drive to keep the backup copy of your submission. Local timestamps can be easily altered and cannot be used to establish your files' last modification times (-10 points). Be careful to use only a third-party repository that allows for private access. Putting your code in a publicly accessible third-party repository is an Honor Code violation.
Turn in ONLY the files you have modified. Do not turn in support code we provided that you haven't modified (-4 points). Do not turn in any binary files (object, executable, dll, library, or image files) with your assignment (-4 points). Your code must not require any libraries or header files other than the ones listed in the Makefile (-10 points).
Do remove all printf()'s, cout's, cerr's, and any other logging statements you've added for debugging purposes. You should debug using a debugger, not with printf()'s. If we can't understand the output of your code, you will get zero points.