EECS 489 Lab 2: A Peer Node

This assignment is due on Friday, 22 Jan 2016, 6 pm.

Introduction

The majority of socket programs, including netimg of Lab 1, follow the client-server paradigm, where a server waits on a well-known port for clients' connections. In this lab, we'll explore peer-to-peer programming. A peer is basically both a server and a client. It accepts connections from other peers and also connects to one or more peers.

You're provided with skeleton code, a source file peer.cpp and its accompanying header file peer.h, as part of this lab. You can download the support code from the Course Folder. The support code contains only three files: a Makefile, peer.h, and peer.cpp. The provided Makefile builds a program called peer. It requires netimg.h, socks.h, and socks.cpp from Lab 1. You could simply copy over the files from Lab 2 to your Lab 1 folder. If you want to save the Makefile from Lab 1, please do so before you copy over the Lab 2 version. The peer program takes three optional arguments on the command line:


% peer [-p <hostname>:<port> -n <maxpeers> -v <version>]

The -p option tells the peer program which peer to connect to initially. If this option is not provided, the peer starts as a server listening on a random, ephemeral port. The -n option allows the user to set a peer's maximum peering relationships (used only in PA1). The -v option works similarly to the same option for netimg.

To bootstrap the peer-to-peer (p2p) network, we first start a peer by itself. Every time a peer runs, it prints to screen/console its fully qualified domain name (FQDN) and the port number it is listening on. When a peer is run with the hostname:port of another peer as its command line argument, the new peer tries to join the provided peer in the p2p network by creating a socket and connecting to that peer.

A peer that receives a join request will accept the peer if and only if its peer table is not full. Whether a join request is accepted or not, the peer sends back to the requesting peer the hostname:port of a peer in its peer table, if the table is not empty, to help the newly joined peer find more peers to join.

We will be re-using most of the functions in socks.cpp that you wrote for Lab 1. If you didn't manage to get these functions to work in Lab 1, you can get the solutions for 20 of your PA1 points.

Take a step back and look at the big picture: see how code implemented in two different processes in Lab 1 now resides in the same process and how this process serves the role of both a client and a server. Pay particular attention to how this is done using multiple sockets monitored by a single thread. Another goal of this lab is for you to gain early experience with protocol design. In this case, we're designing a simple peer-to-peer join protocol, with redirection.

Task 1: Server Side

Your first task is to implement the server side of a peer. You can search for the string "Task 1" in the code to find places where "Task 1" related code must be filled in. You can search for the string "YOUR CODE HERE" in the code to find places where your code must go.

If peer is run without any option on the command line, its default constructor calls socks_servinit(server, sname, reuse) with server->sin_port = 0 and reuse set to 1. Since we will be re-using the same port number for both listen and connect sockets, modify your socks_servinit() to set the address reuse socket option before calling bind(). To bind the same address and port number to multiple sockets, on Mac OS X and Windows, it is usually sufficient to set socket option SO_REUSEADDR. But on Linux, in addition to SO_REUSEADDR, you would need to set socket option SO_REUSEPORT. Furthermore, SO_REUSEPORT is not implemented in Winsock. For portability across all three platforms, you should set both socket options, but guard SO_REUSEPORT with #ifndef _WIN32 (to be followed by the appropriate #endif). The OS will assign a random, ephemeral port to the socket. Finally, return the socket descriptor to the caller. Search for "Lab 2 Task 1" in socks_servinit() for where your code should go. This should take no more than 4 to 6 lines of code. Recall from Lab 1 that upon return from socks_servinit(), the provided self argument contains the IPv4 address of the current host and the port number bound to the returned socket. The name of the host is further stored in the provided sname argument.
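As a rough illustration of the ordering described above (set the reuse options, then bind, then ask the OS which port it picked), here is a minimal POSIX-only sketch. The helper name make_reusable_socket() and its signature are hypothetical and are not part of socks.cpp; your socks_servinit() has its own arguments and error handling, and on Windows the setsockopt() call takes a differently typed option pointer.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstring>

// Hypothetical helper (not in socks.cpp): create a TCP socket whose
// address and port may be shared with another socket, bind it to `port`
// (network byte order; 0 lets the OS choose), and report the bound port,
// also in network byte order, through *bound_port.
// Returns the socket descriptor, or -1 on any error.
int make_reusable_socket(unsigned short port, unsigned short *bound_port)
{
  int sd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
  if (sd < 0) return -1;

  int on = 1;
  // The reuse options only influence bind() if set before it is called.
  if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
    close(sd);
    return -1;
  }
#ifndef _WIN32
  // Needed on Linux in addition to SO_REUSEADDR; not available in Winsock.
  if (setsockopt(sd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on)) < 0) {
    close(sd);
    return -1;
  }
#endif

  struct sockaddr_in self;
  memset(&self, 0, sizeof(self));
  self.sin_family = AF_INET;
  self.sin_addr.s_addr = htonl(INADDR_ANY);
  self.sin_port = port;
  if (bind(sd, (struct sockaddr *) &self, sizeof(self)) < 0) {
    close(sd);
    return -1;
  }

  // Ask the OS which port the socket actually ended up with.
  socklen_t len = sizeof(self);
  if (getsockname(sd, (struct sockaddr *) &self, &len) < 0) {
    close(sd);
    return -1;
  }
  *bound_port = self.sin_port;
  return sd;
}
```

Calling the helper once with port 0 and a second time with the port it reported should give two sockets bound to the same port, which is the situation the listen and connect sockets of a peer end up in.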

After the peer object is set up, in main() call select() to wait for connection on the listening socket (1 to 2 lines of code). When select() returns, we first call peer::handlejoin() to check if a new peer is trying to join the p2p network. If a new peer is trying to connect to this peer and this peer's peer table is not full, handlejoin() calls socks_accept() to accept the connection and then calls peer::ack() to send back a pmsg_t message with the pm_type field set to PM_WLCM. The new peer is then stored in the peer table. On the other hand, if the peer table is full, handlejoin() calls socks_accept() and peer::ack() as before, but in the call to peer::ack(), it sends back a redirect (pm_type = PM_RDRT) message.

The function peer::ack(td, type) marshals a message of type pmsg_t, defined in peer.h. It fills in the fields of the message: pm_vers must be set to PM_VERS and pm_type to the type argument passed into peer::ack(). The pm_param field holds different parameter values depending on the pm_type field. For pm_type PM_WLCM and PM_RDRT, the pm_param field holds the number of peers attached to the pmsg_t packet. If the peer table is empty, the pm_param field is thus set to 0. If the peer table is not empty, the pm_param field is set to 1 (since in this lab we allow each peer a maximum of 2 partners) and the peer's struct sockaddr_in is sent to the joining peer, through the provided socket td. The figure below shows the pmsg_t sent.
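To make the marshalling step concrete, the following sketch packs an acknowledgement into a byte buffer. Everything prefixed with sk_ or SK_ is a stand-in invented for this illustration: the real pmsg_t, the real constants, and the exact form in which the attached peer is sent are defined in peer.h and may differ. The sketch assumes a 4-byte header, a 16-bit pm_param in network byte order, and an 8-byte peer record holding an IPv4 address, a port, and a reserved field.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <cstring>
#include <vector>

// ASSUMED values and layouts, for illustration only; see peer.h for the
// real PM_VERS, PM_WLCM, PM_RDRT, and pmsg_t.
const unsigned char SK_VERS = 0x2;
const unsigned char SK_WLCM = 0x1;
const unsigned char SK_RDRT = 0x2;

struct sk_pmsg_t {            // stand-in for pmsg_t
  unsigned char  pm_vers;     // protocol version
  unsigned char  pm_type;     // welcome or redirect
  unsigned short pm_param;    // number of attached peers (assumed network order)
};

struct sk_peer_t {            // stand-in for the attached peer record
  struct in_addr addr;        // IPv4 address, network byte order
  unsigned short port;        // port, network byte order
  unsigned short rsvd;        // reserved, set to 0
};

// Build the bytes of an ack of the given type. `known` points at one peer
// from the peer table, or is nullptr when the table is empty.
std::vector<unsigned char> build_ack(unsigned char type,
                                     const struct sockaddr_in *known)
{
  sk_pmsg_t hdr;
  hdr.pm_vers = SK_VERS;
  hdr.pm_type = type;
  hdr.pm_param = htons(known ? 1 : 0);   // 0 peers or 1 peer attached

  std::vector<unsigned char> out(sizeof(hdr));
  memcpy(out.data(), &hdr, sizeof(hdr));

  if (known) {
    sk_peer_t rec;
    rec.addr = known->sin_addr;
    rec.port = known->sin_port;
    rec.rsvd = 0;
    out.resize(sizeof(hdr) + sizeof(rec));
    memcpy(out.data() + sizeof(hdr), &rec, sizeof(rec));
  }
  return out;
}
```

The resulting buffer could then be handed to a single send() on td. Building the whole message first keeps the header and the attached peer consistent with each other, since pm_param is derived from the same pointer that decides whether a record is appended.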

If there's any error in sending, for example, if the other side of the connection has been closed by the peer, close the connection. This part takes less than 8 lines of code.

That's all for Task 1. It should take about 15 lines of code in total. After completing Task 1, you should test your code before continuing to Task 2. See the Testing section below for some guidelines on testing your code using the reference implementation of peer.

Task 2: Client Side

You can search for the string "Task 2" in the code to find places where "Task 2" related code must be filled in.

If a peer is run with the -p option, the user must provide a known peer's hostname and port number to connect to, with the port number separated from the hostname by a colon. The provided function peer_args() handles parsing of the command line. Upon return from the call to peer_args(), the peer's hostname will be stored in the provided *pname and the port number will be stored, in network byte order, in *port. The peer object is then constructed with the known peer's hostname and port number.

The peer default constructor connects to the known peer by calling socks_clntinit(). Since we will be re-using the same port number both for connecting to other peers and for listening for connections from other peers, you need to extend your socks_clntinit() from Lab 1 to set the address reuse socket option. Search for "Lab 2 Task 2" in socks_clntinit() for where your code should go. You can basically cut and paste the same 5 lines of code for "Lab 2 Task 1" above. Upon return from socks_clntinit(), the known peer's address and port number are stored in the first element of peer::ptable. In addition, the OS would have assigned a random, ephemeral source port to the connected socket. Find out the assigned ephemeral source port number and store it in the self variable, along with the IPv4 address of the current host, as you had done in socks_servinit() for Lab 1. This should also be about 2 lines of code.

At this point in the default constructor, we'll be calling the socks_servinit() function as part of Task 1 above. However, instead of calling the function with self.sin_port = 0, we'll be calling it with the random, ephemeral port number assigned by the OS when you connected with the known peer. Back in main(), select() will be waiting for activity on both the socket connected to the known peer and the socket on which you're listening for connections from other peers.
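The single-threaded monitoring of several sockets mentioned above can be sketched as follows. The helper wait_readable() is hypothetical and is not part of the skeleton code; it also uses a timeout purely so that it cannot block forever in a demonstration, whereas a peer's main loop would more likely pass a null timeout and block until something happens.

```cpp
#include <sys/select.h>
#include <sys/socket.h>
#include <vector>

// Hypothetical helper: wait up to timeout_sec seconds for any descriptor
// in `fds` to become readable, and return the ones that are. For a
// listening socket, "readable" means a connection is waiting to be
// accepted; for a connected socket, it means data (or EOF) has arrived.
std::vector<int> wait_readable(const std::vector<int> &fds, int timeout_sec)
{
  fd_set rset;
  FD_ZERO(&rset);
  int maxfd = -1;
  for (int fd : fds) {
    FD_SET(fd, &rset);
    if (fd > maxfd) maxfd = fd;
  }

  struct timeval tv;
  tv.tv_sec = timeout_sec;
  tv.tv_usec = 0;

  std::vector<int> ready;
  // select() wants one more than the highest descriptor in the set.
  if (select(maxfd + 1, &rset, nullptr, nullptr, &tv) <= 0) {
    return ready;               // timeout or error: nothing is ready
  }
  for (int fd : fds) {
    if (FD_ISSET(fd, &rset)) ready.push_back(fd);
  }
  return ready;
}
```

Note that select() overwrites the descriptor set it is given, so the set has to be rebuilt before every call, which is why the helper constructs it from scratch each time.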

The function peer::handlemsg(td, msg) checks for activity on each connected peer's socket. If there's an incoming packet, it calls peer::recvmsg(td, msg, peer), which receives a pmsg_t message from the provided socket td. You need to first check the version number of the received packet. If its pm_vers is not PM_VERS, we really don't know what's in the receive queue of the socket. In that case, we need to clear the queue of all data currently in it by calling socks_clear() (see below). Assuming the version number checks out, if the pm_param field of the packet is not 0, since in this lab we assume at most one peer would be returned, we simply receive the peer's peer_t into the provided peer argument. If there's an error in receiving the packet, the function closes the socket td and returns the error code returned by the socket receive API. Otherwise, it returns the total number of bytes received. You are to write the peer::recvmsg() function, which takes about 30 lines of code.
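One possible shape for this receive path is sketched here. The names recv_all(), recv_join_reply(), sk_pmsg_t, sk_peer_t, and SK_VERS are all invented for the illustration, and the struct layouts are assumptions; the real definitions live in peer.h. The sketch reports a version mismatch through an output flag and leaves the call to socks_clear() to the caller, which is a simplification of what peer::recvmsg() is asked to do.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <cstddef>

const unsigned char SK_VERS = 0x2;     // ASSUMED; the real PM_VERS is in peer.h

struct sk_pmsg_t {                      // ASSUMED stand-in for pmsg_t
  unsigned char  pm_vers;
  unsigned char  pm_type;
  unsigned short pm_param;              // assumed network byte order
};

struct sk_peer_t {                      // ASSUMED stand-in for peer_t
  struct in_addr addr;
  unsigned short port;                  // network byte order
  unsigned short rsvd;
};

// TCP is a byte stream, so one recv() may return fewer bytes than asked
// for. Loop until `len` bytes have arrived; return len on success, or
// recv()'s own return value (0 or -1) if the stream ends or fails first.
static ssize_t recv_all(int sd, void *buf, size_t len)
{
  size_t got = 0;
  while (got < len) {
    ssize_t n = recv(sd, (char *) buf + got, len - got, 0);
    if (n <= 0) return n;
    got += (size_t) n;
  }
  return (ssize_t) got;
}

// Receive a header and, if pm_param is nonzero, the one attached peer.
// Returns the total bytes received, or recv()'s error value after closing
// sd. *vers_ok is false when the version did not match, in which case the
// caller is expected to clear the receive queue.
ssize_t recv_join_reply(int sd, sk_pmsg_t *msg, sk_peer_t *peer, bool *vers_ok)
{
  *vers_ok = false;
  ssize_t total = recv_all(sd, msg, sizeof(*msg));
  if (total <= 0) { close(sd); return total; }

  if (msg->pm_vers != SK_VERS) return total;   // unknown data follows
  *vers_ok = true;

  if (msg->pm_param != 0) {                    // nonzero in any byte order
    ssize_t n = recv_all(sd, peer, sizeof(*peer));
    if (n <= 0) { close(sd); return n; }
    total += n;
  }
  return total;
}
```

The sketch does not retry on EINTR; whether that matters depends on whether your program installs signal handlers.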

When unknown data shows up in a socket's receive queue, we really don't know what it contains nor how to handle it, so the best we can do is to simply clear the receive queue of all data and wait for new data to arrive. If the socket is a blocking socket, we first set it to non-blocking, then we continue to receive and "drop" all data presently resident in the receive queue until the non-blocking receive tells us that there's no more data in the queue. If the socket was a blocking socket, we then need to restore it to its blocking state. You are to implement this process of clearing the receive queue in the socks.cpp:socks_clear() function. It should take no more than 7 to 10 lines of code. Note that a peer joining a p2p network simply connects to another peer, without sending any messages, so using a different version number with the -v command line option will not affect the join process.
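The draining procedure just described might look roughly like this on a POSIX system. The function name drain_socket() and its return value (the number of bytes discarded) are choices made for this sketch, not the required interface of socks_clear(), and Winsock would need ioctlsocket() rather than fcntl() to toggle blocking mode.

```cpp
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical sketch of clearing a socket's receive queue. Returns the
// number of bytes discarded, or -1 if the socket's mode could not be
// read or changed.
long drain_socket(int sd)
{
  int flags = fcntl(sd, F_GETFL, 0);
  if (flags < 0) return -1;

  // Switch to non-blocking only if the socket is currently blocking.
  bool was_blocking = (flags & O_NONBLOCK) == 0;
  if (was_blocking && fcntl(sd, F_SETFL, flags | O_NONBLOCK) < 0) return -1;

  // Read and discard until recv() reports that nothing is left: it
  // returns -1 (EAGAIN/EWOULDBLOCK) on an empty queue, or 0 if the other
  // side has closed the connection.
  char junk[256];
  long dropped = 0;
  ssize_t n;
  while ((n = recv(sd, junk, sizeof(junk), 0)) > 0) {
    dropped += n;
  }

  // Put the socket back the way we found it.
  if (was_blocking) fcntl(sd, F_SETFL, flags);
  return dropped;
}
```

Saving the original flags and writing them back, rather than clearing O_NONBLOCK unconditionally, means a socket that was already non-blocking stays that way.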

Back in peer::handlemsg(), for this lab, a message with a wrong version number simply causes an error message to be printed out (we'll see better use of this feature with p2p search in PA1). If the version number checks out, receipt of a packet carrying another peer causes the third peer's address and port number to be printed out. If the received packet is of type PM_RDRT, handlemsg() informs the user that the join has been declined (redirected) and exits the process. The user can then manually try to connect to the third peer returned in the redirect packet by running the peer program again.

That's all for Task 2. Task 2 should take fewer than 50 lines of code, and both tasks together should take fewer than 70 lines.

You're not required to handle a peer leaving the p2p network: once a peer departs, its partner peer is not required to clean up its peer table and be ready to accept another peer. You can assume that a peer is only torn down when the whole p2p network is being torn down. It is required, however, that when a peer leaves, its partner does not crash.

Testing Your Code

We will use the same four hosts CAEN has set up for this course. Again, don't use CAEN's login server (login.engin.umich.edu), which will redirect you to one of the caen-vnc* hosts; these hosts do not allow connections to their random ports. You can also run multiple peers on a single host and form p2p connections between them. When multiple peers are running on the same host, you can use localhost in place of the peer's hostname in the command line to peer.

In addition to the skeleton code and Makefile, we've also provided an executable binary of peer, called refpeer, that runs on CAEN eecs489 hosts. It is available on /afs/umich.edu/class/eecs489/w16/lab2/. As in Lab 1, this is a Red Hat 7 executable, not to be downloaded or run on your Mac OS X, Ubuntu, or Windows machines. Remember that you can connect to the CAEN eecs489 hosts only through UMVPN, MWireless, or from CAEN Lab desktops. You should test your code as soon as you have completed Task 1. Use refpeer to connect to your peer. Similarly, after completing Task 2, connect your peer to refpeer. To see the expected behavior of the code, run multiple refpeers and have them connect to each other.

Here is an example test scenario, assuming that you have built the program peer and it is residing in your working directory/folder for this lab. Create four windows on your local host.

On the first window, ssh to eecs489p1.engin.umich.edu, change to your working directory for this lab, run peer without any command line argument:
p1% ./peer
It should print to screen (with a different port number):
This peer address is caen-eecs489p1.engin.umich.edu:43945

On the second window, ssh to eecs489p2.engin.umich.edu, change to your working directory for this lab, run peer with the following command line argument (replacing the port number with the one printed out on the first item above):
p2% ./peer -p p1:43945
It should print to screen (with different port numbers):
Connected to peer p1:43945
This peer address is eecs489p2.engin.umich.edu:56535
Received ack from p1:43945
Meanwhile, on the first window, you should see the following additional line printed to screen:
Connected from peer p2:56535

On the third window, ssh to eecs489p3.engin.umich.edu, change to your working directory for this lab, run peer with the following command line argument (replacing the port number with the one from the first item above):
p3% ./peer -p p1:43945
It should print to screen (with different port numbers):
Connected to peer p1:43945
This peer address is eecs489p3.engin.umich.edu:48141
Received ack from p1:43945
which is peered with: p2:56535
Meanwhile, on the first window, you should see the following additional line printed to screen:
Connected from peer p3:48141

On the fourth window, ssh to eecs489p4.engin.umich.edu, change to your working directory for this lab, run peer with the following command line argument (replacing the port number with the one from the first item above):
p4% ./peer -p p1:43945
It should print to screen (with different port numbers):
Connected to peer p1:43945
This peer address is eecs489p4.engin.umich.edu:40231
Received ack from p1:43945
which is peered with: p2:56535
Join redirected, try to connect to the peer above.
Meanwhile, on the first window, you should see the following additional line printed to screen:
Peer table full: p4:40231 redirected

Staying on the fourth window, run peer again with the following command line argument (replacing the port number with the one from the fourth item above):
p4% ./peer -p p2:56535
It should print to screen (with different port numbers):
Connected to peer p2:56535
This peer address is eecs489p4.engin.umich.edu:50095
Received ack from p2:56535
which is peered with: p1:43945
Meanwhile, on the second window, on eecs489p2, you should see the following additional line printed to screen:
Connected from peer p4:50095

That ends our sample test scenario and you can quit all four peers.

The above is a very simple test case to check that your peers are communicating with each other. You should further test your p2p network with other test cases of your own. Recall that a peer is not required to accept another peer if its partner departs.

Submission Instructions

As with Lab 1, incorporating publicly available code in your solution is considered cheating in this course. To pass off the implementation of an algorithm as that of another is also considered cheating. If you cannot implement a required algorithm, you must inform the teaching staff when turning in your assignment.

Do NOT use any libraries or compiler options that are not already used in the provided Makefile. Doing so would likely make your code not portable and if we can't compile your code, you will be heavily penalized. Test your compilation on CAEN eecs489 hosts! Your submission must compile and run without errors on CAEN eecs489 hosts using the provided Makefile, unmodified.

Your "Lab2 files" comprise your peer.cpp and socks.cpp files and, if modified, peer.h.

To turn in your Lab2, upload a zipped or gzipped tarball of your Lab2 files to the CTools Drop Box. Keep your own backup copy! The timestamp on your uploaded file is your time of submission. If this is past the deadline, your submission will be considered late. You are allowed multiple "submissions" without late-policy implications as long as you respect the deadline. We highly recommend that you use a private third-party repository such as GitHub, M+Box, or Dropbox to keep the backup copy of your submission. Local timestamps can be easily altered and cannot be used to establish your files' last modification times (-10 points). Be careful to use only a third-party repository that allows for private access. Putting your code in a publicly accessible third-party repository is an Honor Code violation.

Turn in ONLY the files you have modified. Do not turn in support code we provided that you haven't modified (-4 points). Do not turn in any binary files (object, executable, dll, library, or image files) with your assignment (-4 points). Your code must not require external libraries or header files other than the ones listed in the Makefile (-10 points).

Do remove all printf()'s, cout's, cerr's, and any other logging statements you've added for debugging purposes. You should debug using a debugger, not with printf()'s. If we can't understand the output of your code, you will get zero points.