USER’S MANUAL
Introduction
The server source file is aserver.c, and the server runs at http://king.mcs.drexel.edu:8051/ . This link consists of the address of the UNIX machine and the port number, 8051, on which the server listens for requests. The server was compiled with the following command:
    cc -o aserver aserver.c -lnsl -lsocket -lresolv
Once compiled, it was started in the background on port 8051 with the following command:
    aserver 8051 &
The main page is located at http://king.mcs.drexel.edu:8051/root/index.html , from which all other directories and files can be reached. The six most important links appear directly on the index page:
1). C Source Code: The server code written in C language
2). Documentation: Technical Documentation (this document)
3). Log file: The log file contains the log of all the accessed URLs, their access time and the IP address of the caller
4). NetStat: The output of netstat which contains the active network connections
5). Launch Process: Launches the sleep process and updates the output of ps
6). Refresh: Updates the output of ps
Although all the important files are linked from the main page, knowing the directory structure makes browsing easier. Refer to Table 1 to familiarize yourself with the directory structure of my server.
root private test greeces userins launchd nethtml temp
SOURCE_CODE index.html aserver.c
Interest.html
Projects.html
Links.html home.html
Download.html
Contact.html auth.html glossary.html ulcm.html resume.doc info call_sleep.c
Swadh_paper.doc aserver.c
Table 1: Directory Structure of my server on king.mcs.drexel.edu
What happens if I try to access something outside the “root” directory?
In the table above, “root” is the parent directory of my server, and nothing outside it can be accessed through a browser. For example, try accessing “info”, which is in the same directory as the “root” directory. You will get a page titled Default Page (“home.html”), which instructs the user to include “root” in the path and explains that only files and directories underneath the “root” directory can be accessed.
Authentication
The “private” directory under the “root” directory requires authentication before any of its files can be viewed. If a user tries to access the “private” directory or a file underneath it, for instance “greeces”, “userins”, “launchd” or “nethtml”, the server returns the Authentication Required page (“auth.html”), which instructs the user to supply a username and password in the link. How exactly do you do that? It is quite simple: append the username and then the password to the link as two extra path segments. For example, to view the “launchd” file, the link should look like this: http://king.mcs.drexel.edu:8051/root/private/launchd/cs721/success
The username is cs721 and the password is success . These instructions can also be found on the “auth.html” page that you get whenever authentication is required.
Actual Assignment 2
There were two functions that needed to be implemented: ps and netstat . Both these functions were implemented under the restricted access directory (“private”). To simply see the output of ps command, the user can follow the ‘Refresh’ link on the “index.html” page and then authenticate himself. The actual link after the authentication part looks as follows: http://king.mcs.drexel.edu:8051/root/private/refresh/cs721/success
The above link can also be found on the “auth.html”, “refresh”, “launchd” and “nethtml” pages under the title ‘Refresh’. No matter where the ‘Refresh’ link is followed from, it simply shows the most recent output of the ps command. It can also be used to verify that a launched process has died after some time.
Additionally, a user who wants to launch a process such as sleep 90 can follow the ‘Launch Process’ link, which is also found on “index.html”. Next, the user will need to authenticate in order to view the output of the ps command after the sleep 90 process has been launched. The actual link after the authentication part looks as follows: http://king.mcs.drexel.edu:8051/root/private/launchd/cs721/success . The ‘Launch Process’ link, with the authentication included, can be found on the “auth.html”, “launchd”, “refresh” and “nethtml” pages. Again, no matter where the ‘Launch Process’ link is followed from, it launches sleep 90 and returns the updated output of the ps command, which now includes the PID of the sleep process.
If the user wants to kill a certain process, he can click on the link of the desired process. This link can only be found on the “launchd” or “refresh” page. Clicking on the link kills that process and returns the updated ps output, which no longer lists the PID of the killed process. Killing a process requires authentication, but to make the user's life easier, the authentication is included in the link used for killing. The same holds true for every ‘Refresh’ and every ‘Launch Process’ request followed from the “auth.html”, “launchd”, “refresh” or “nethtml” page. The actual link with the authentication part has the following form: http://king.mcs.drexel.edu:8051/root/private/killpro/<PIDofProcesstobeKilled>/cs721/success .
If the user tries to kill a process that has already expired and therefore no longer exists, he will get the default error page of the browser. Note that the PID of the server itself is deliberately not shown, since killing the server through its own pages would not be a good idea.
Finally, if the user wants to check the active network connections, he can follow the ‘NetStat’ link found on the “index.html” page and then authenticate himself to view the output of the netstat command. The actual link after the authentication part looks as follows: http://king.mcs.drexel.edu:8051/root/private/nethtml/cs721/success . The netstat command takes a few seconds to execute on the UNIX machine, so there is a short delay in loading the “nethtml” page. To verify this delay, execute the netstat command on the machine where my server is located and compare its execution time with the time it takes to open the “nethtml” page (for accurate results, run the netstat command and open the “nethtml” page at the same time). The ‘NetStat’ link, with the authentication included, can be found on the “auth.html”, “launchd” and “refresh” pages.
To make the source code of the server (“aserver.c”) and the technical documentation (“documentation.doc”) easier to read, they have been moved to my personal webpage, but there is a link to both items on the “index.html” page. The log file, which records each URL, the PID of the process that served it, the time and date it was visited and the IP address of the caller, is located at the following URL: http://king.mcs.drexel.edu:8051/root/loghtml . There is also a link for ‘Log file’ on the “index.html” page.
Miscellaneous
A few files are there for testing purposes only. For example, the “userins” and “greeces” files are one-line files used to check that authentication works. The “temp” file under the “test” directory is also there to test the functionality of the server. There is also a directory called “SOURCE_CODE”, which contains the source code (“aserver.c”) of the server. It can be accessed easily, but the code is not displayed in a user-friendly format. That is why the server code is posted on a different webpage. The same is true for this Technical Documentation document.
A few other documents, such as “resume.doc” and “Swadh_paper.doc”, are in the “root” directory. The “resume.doc”, as the name suggests, is my resume, and “Swadh_paper.doc” is an interesting thesis paper on religious philosophy. Both documents can also be found on my personal webpage (www.pages.drexel.edu/~kas26).
DESIGNER’S MANUAL
While designing the server code, a few assumptions and choices were made. Let's start with creating the sockets used for listening and for handling requests. First, the server family, server address and server port were determined and bound to a socket. Getting the port was the first design decision. Should the port number be hard-coded or obtained from an argument of the server run command? Since supplying the port number as an argument is much more flexible, it was decided to take the port number from the argument of the server run command.
Once the socket was bound, it was ready for listening. To handle continuous requests, a child was forked off for each incoming request while the parent continued listening. The listening and request handling were put into an infinite loop, which meant that the server continued to listen and handle requests until it was killed. To avoid accumulating zombie processes, the SIGCHLD signal was simply ignored, so that terminated children are cleaned up automatically.
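The listening loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in aserver.c: the names parse_port, make_listener and serve_forever, the backlog of 5 and the SO_REUSEADDR option are all hypothetical choices made for the example.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Convert a command-line argument to a port number; -1 if it is not 1..65535. */
int parse_port(const char *arg)
{
    char *end;
    long value;

    assert(arg != NULL);
    value = strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || value < 1 || value > 65535)
        return -1;
    return (int)value;
}

/* Create a TCP socket bound to `port` on all interfaces and start listening.
 * Returns the descriptor, or -1 on failure. */
int make_listener(int port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    /* Assumption: allow quick restarts on the same port. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons((unsigned short)port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 || listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Accept connections forever; each one is handled in a forked child. */
void serve_forever(int listener, void (*handle)(int sock))
{
    assert(handle != NULL);
    signal(SIGCHLD, SIG_IGN);      /* exited children do not become zombies */
    for (;;) {
        pid_t pid;
        int sock = accept(listener, NULL, NULL);

        if (sock < 0)
            continue;              /* transient error: keep listening */
        pid = fork();
        if (pid == 0) {            /* child: serve one request, then exit */
            close(listener);
            handle(sock);
            close(sock);
            _exit(0);
        }
        close(sock);               /* parent (or failed fork): drop its copy */
    }
}
```

A main function would pass argv[1] to parse_port, give the result to make_listener, and hand the returned descriptor to serve_forever together with a request handler, assuming the handler takes the connected socket as its only argument.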
When a child was created by a fork to handle a request, it called a function named req_handle to handle the request as per the specifications. This function made a few major assumptions:
1). The link is shorter than 4KB
2). The file is smaller than 4MB
3). The username and the password are each less than 128 characters long
4). Each output line of netstat and ps is less than 199 characters long
The function started out by reading the request from the socket and then parsing the link. The very first check was to make sure that the request was for a file or directory inside the “root” directory. If it was not, the “home.html” page was displayed. Since the “private” directory had restricted access, the next check was to find out whether the request was for a protected file/directory or for a regular access file/directory. If it was for a regular access file, the file was simply read and written to the socket for output. If the request was for a protected file/directory, one of two things happened: either the user was asked via the “auth.html” page to authenticate himself by supplying the username and password in the link, or, if they were already supplied in the link, the link was parsed and checked for the correct username/password combination. Only upon a successful check was the user allowed to access the protected file/directory. If the check failed, the “auth.html” page was displayed. This could have happened for two reasons: 1). The username/password combination was wrong or not supplied or 2). There was no such file/directory under the “private” directory.
The “auth.html” page had one more purpose. It not only gave instructions on how to authenticate oneself but also explained how to launch a process, kill a process, refresh the ps output and view the active network connections ( netstat ) via the “nethtml” page. The “auth.html” page also provides authentication-included links to all the protected files. Note that this is not meant to skip the authentication process but to save typing on the user's part.
Having the username/password combination in the link meant that authentication was done without the user being aware of it. These links could be removed without any loss in the functionality of the server. The users would then simply have to go to the link field and type the username and password at the end of the existing link.
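The sequence of checks described above can be sketched as a single function. This is an illustration only, not the parsing code in aserver.c: the function and enum names are hypothetical, the credentials are the cs721/success pair given in the user's manual, and the rejection of ".." is an extra safeguard that the manual does not describe. The sketch does not check whether the requested file exists.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum action { SHOW_HOME, SHOW_AUTH, SERVE_FILE };

/* True if `path` is `dir` itself or something underneath it. */
static int is_under(const char *path, const char *dir)
{
    size_t n = strlen(dir);
    return strncmp(path, dir, n) == 0 && (path[n] == '/' || path[n] == '\0');
}

/* Decide what to do with a requested URL path such as
 * "/root/private/launchd/cs721/success". On SERVE_FILE, `file` receives
 * the path to open, with any credentials removed. */
enum action check_request(const char *url_path, char *file, size_t file_size)
{
    char copy[4096];               /* the manual assumes links shorter than 4KB */
    char *user, *pass;

    assert(url_path != NULL && file != NULL && file_size > 0);
    while (*url_path == '/')
        url_path++;
    if (strlen(url_path) >= sizeof copy ||
        strstr(url_path, "..") != NULL ||      /* extra safeguard (assumption) */
        !is_under(url_path, "root"))
        return SHOW_HOME;
    strcpy(copy, url_path);

    if (is_under(copy, "root/private")) {
        /* The last two path segments must be the username and the password. */
        pass = strrchr(copy, '/');
        if (pass == NULL)
            return SHOW_AUTH;
        *pass++ = '\0';
        user = strrchr(copy, '/');
        if (user == NULL)
            return SHOW_AUTH;
        *user++ = '\0';
        if (strcmp(user, "cs721") != 0 || strcmp(pass, "success") != 0)
            return SHOW_AUTH;
        if (!is_under(copy, "root/private"))
            return SHOW_AUTH;
    }
    snprintf(file, file_size, "%s", copy);
    return SERVE_FILE;
}
```

Because the credentials are split off at the last two slashes, this approach does not depend on the length of the file name in the “private” directory.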
Furthermore, if the user requested the “launchd”, “refresh”, “nethtml” or “killpro” page after successful authentication, the server performed a considerable amount of file manipulation. Following the flow of the code, if the output of netstat was requested via the “nethtml” page, the output of the actual netstat command (as if it were run on the machine directly) was captured into a text file called “net.txt”:
    system("/usr/bin/netstat > ./root/private/net.txt");
In short, the system() function takes an actual UNIX command (same syntax) as its argument and executes it as if it had been typed on a UNIX terminal. For instance, to capture the output of the ls -l command into a file called LS, a user would type the following on a UNIX terminal:
    ls -l > LS
The same operation can be performed from a C program with:
    system("ls -l > LS");
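The capture step above can be wrapped in a small helper that also checks whether the command succeeded. This is a sketch, not code from aserver.c: the helper name run_and_capture and the 512-byte command buffer are assumptions, and the command string must come from the program itself, never from the request, because it is passed to a shell.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Run `command` through the shell with its standard output redirected to
 * `out_path`. Returns the command's exit status, or -1 if it could not be
 * run or did not exit normally. */
int run_and_capture(const char *command, const char *out_path)
{
    char cmdline[512];
    int n, status;

    assert(command != NULL && out_path != NULL);
    n = snprintf(cmdline, sizeof cmdline, "%s > %s", command, out_path);
    if (n < 0 || (size_t)n >= sizeof cmdline)
        return -1;                 /* command line would be truncated */
    status = system(cmdline);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For the netstat page, the call would be run_and_capture("/usr/bin/netstat", "./root/private/net.txt"). One caveat: in a process that ignores SIGCHLD, system() may report -1 even though the command ran, because the child's exit status is no longer available to it.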
This “net.txt” file could have been used directly to view the output of the netstat command, but because of formatting issues the whole output was displayed as a single line, which made it hard to read. As a result, I was forced to convert this text file into an html file for better viewing. The text file was read line by line with the fgets() function and written out to an html file. The fgets() function takes three arguments: the buffer that will hold the characters read, the size of that buffer in bytes, and the stream to read from:
    char *fgets(char *s, int n, FILE *stream);
The fgets() function reads at most n-1 characters from stream into the buffer pointed to by s . No additional characters are read after fgets() has read and transferred a newline character to the buffer. A null character is written immediately after the last character that fgets() reads into the buffer. A null pointer is returned when end-of-file is reached before any characters have been read, or when a read error occurs.
This behavior of fgets() made it straightforward to build the html file:
    while (fgets(line, 199, nettemp))
    {
        line[strcspn(line, "\n")] = '\0';   /* strip the trailing newline, if there is one */
        fprintf(netstat, "<p>%s</p>", line);
    }
Since the netstat output is different every time the command is run, the “net.txt” and “nethtml” files must be overwritten whenever there is a request for the netstat output. Links for launching a process, refreshing the output of the ps command or going back to the “index.html” page were written at the top of the “nethtml” page for easy browsing.
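The text-to-html conversion described above could be collected into one reusable function. The following is a sketch under stated assumptions, not the code in aserver.c: the function name text_to_html is hypothetical, and the escaping of <, > and & is an addition that the original loop does not perform.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy `in` to `out`, wrapping every input line in <p>...</p>.
 * Returns the number of paragraphs written. Lines longer than the
 * buffer are split across several paragraphs. */
long text_to_html(FILE *in, FILE *out)
{
    char line[199];                /* matches the manual's line-length assumption */
    long paragraphs = 0;

    assert(in != NULL && out != NULL);
    while (fgets(line, sizeof line, in) != NULL) {
        const char *p;

        line[strcspn(line, "\n")] = '\0';   /* drop the newline, if present */
        fputs("<p>", out);
        for (p = line; *p != '\0'; p++) {
            switch (*p) {          /* escape characters that are special in html */
            case '<': fputs("&lt;", out);  break;
            case '>': fputs("&gt;", out);  break;
            case '&': fputs("&amp;", out); break;
            default:  fputc(*p, out);      break;
            }
        }
        fputs("</p>\n", out);
        paragraphs++;
    }
    return paragraphs;
}
```

The same function could then serve the netstat, ps and log pages, with the caller writing the navigation links to the output stream before calling it.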
The next file manipulation occurred when the request was for the “launchd” page. For this request, call_sleep was invoked. call_sleep was a separate compiled C program that closed all file pointers, forked and started the sleep 90 process in the background. In a manner similar to the handling of the netstat request, the output of the ps command was captured in a file called “ps.txt”. Again, the text file was not easy to read, so it was converted to an html file called “launchd” using fgets() as described above. There is another reason for having an html file, which is discussed in the paragraph on killing a process. Since the ps output is different every time the command is run, the “ps.txt” and “launchd” files must be overwritten whenever there is a request to launch a process. Links for launching another new process, refreshing the output of the ps command, obtaining the netstat output or going back to the “index.html” page were written at the top of the “launchd” page for easy browsing.
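The launch step can be sketched as a fork followed by an exec of sleep. This is an illustration, not the call_sleep.c program itself: the function name launch_sleep is hypothetical, and the sketch closes only the three standard descriptors, whereas a server child would also need to close its copy of the client socket so that the browser is not kept waiting.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Start `sleep <seconds>` in the background.
 * Returns the PID of the new process, or -1 if fork() failed. */
pid_t launch_sleep(const char *seconds)
{
    pid_t pid;

    assert(seconds != NULL);
    pid = fork();
    if (pid == 0) {
        /* Child: give up the inherited standard descriptors, then become sleep. */
        close(STDIN_FILENO);
        close(STDOUT_FILENO);
        close(STDERR_FILENO);
        execlp("sleep", "sleep", seconds, (char *)NULL);
        _exit(127);                /* reached only if exec failed */
    }
    return pid;                    /* parent: returns at once, without waiting */
}
```

Because the parent returns immediately, the PID is available for the ps page while sleep is still running, for example after launch_sleep("90").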
When there is a request to refresh (‘Refresh’) the ps output, exactly the same procedure is followed as for a launch process request, except that call_sleep is not invoked and so no sleep process is started. The purpose of ‘Refresh’ is to confirm that the sleep process, or any other process, has died after a certain time. As stated above, the file must be overwritten every time in the case of ‘Refresh’ as well. The “refresh” page carries the same links as the “launchd” page.
If there is a method to launch a process, there must also be a method to kill a process. This is where the kill process request comes into the picture. To kill a process, its PID must be known. The PID can be obtained from the ps output on either the “refresh” or the “launchd” page. But how exactly does the server remember the PID of each and every process? The solution was to turn each output line of the ps command into a link and then parse that link to obtain the PID. This was the only solution that I could come up with. Once the PID of a process is obtained, killing it is quite simple. The server does not need to run the kill command through system(); it calls the kill() system call directly and sends the SIGKILL signal:
    kill(PID, SIGKILL);
After the kill call shown above was issued, the output of ps was recaptured to reflect the killing of the process. A process is considered non-existent or dead if its PID is no longer shown in the ps output. The ps output was recaptured in exactly the same way as for the ‘Refresh’ request. In other words, a ‘Launch Process’ request launched a new process and then performed a ‘Refresh’, and a ‘kill process’ request killed a process and then performed a ‘Refresh’.
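Extracting the PID from the kill link and guarding the kill() call can be sketched as follows. This is a hedged example, not the server's own parsing: the function names are hypothetical, the path layout follows the killpro link shown in the user's manual with the credentials already removed, and the refusal of PIDs 0, 1 and negative values is an added safeguard, since kill() treats those values as process groups or targets that should never be signalled from a web page.

```c
#include <assert.h>
#include <limits.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Extract the PID from a path such as "root/private/killpro/1234".
 * Returns -1 unless the last segment is a decimal number greater than 1. */
long pid_from_kill_path(const char *path)
{
    const char *segment;
    char *end;
    long pid;

    assert(path != NULL);
    segment = strrchr(path, '/');
    segment = (segment != NULL) ? segment + 1 : path;
    pid = strtol(segment, &end, 10);
    if (end == segment || *end != '\0' || pid <= 1 || pid > INT_MAX)
        return -1;
    return pid;
}

/* Send SIGKILL to the process named in `path`, unless it is the server
 * (`server_pid`) or the calling process. Returns 0 on success, -1 otherwise. */
int kill_requested(const char *path, pid_t server_pid)
{
    long pid = pid_from_kill_path(path);

    if (pid < 0 || pid == (long)server_pid || pid == (long)getpid())
        return -1;
    return kill((pid_t)pid, SIGKILL);
}
```

If the process has already exited, kill() fails and the function returns -1, so the caller can respond with the refreshed ps page instead of leaving the browser with an error.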
After all these file manipulations (if any) and authentication checks (if any), the file is eventually read and written to the socket:
    while ((size = read(File, image, sizeof(image))) > 0)
    {
        write(sock, image, size);
    }
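The loop above assumes that every write() call sends the whole buffer. On a socket, write() may accept fewer bytes than requested, so a more defensive version retries until everything has been sent. The sketch below is an illustration with hypothetical names (write_all, copy_fd), not the code in aserver.c.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Write all `len` bytes of `buf`, retrying after short writes and EINTR.
 * Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: try again */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copy everything readable from `in` to `out`.
 * Returns the number of bytes copied, or -1 on error. */
long copy_fd(int in, int out)
{
    char image[4096];
    long total = 0;

    assert(in >= 0 && out >= 0);
    for (;;) {
        ssize_t size = read(in, image, sizeof image);

        if (size < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (size == 0)
            break;                 /* end of file */
        if (write_all(out, image, (size_t)size) < 0)
            return -1;
        total += size;
    }
    return total;
}
```

In the server, `in` would be the descriptor of the opened file and `out` the connected socket.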
Finally, a log file is written to keep a record of all the activity. The log file includes the URL, the PID of the process that served it, the time and date it was visited and the remote host (caller). The URL was obtained by parsing the requested link, the PID was obtained with the getpid() function, the time and date were obtained with the built-in time functions, and the remote host name was obtained with the getpeername() and gethostbyaddr() functions:
    /* Get remote host information */
    len = sizeof sin;
    if (getpeername(sock, (struct sockaddr *) &sin, &len) < 0)
        perror("getpeername");
    else
    {
        if ((host = gethostbyaddr((char *) &sin.sin_addr, sizeof(sin.sin_addr), AF_INET)) == NULL)
        {
            perror("gethostbyaddr");
        }
        else
        {
            fprintf(log, "Remote host '%s' ", host->h_name);
            printf("Remote host '%s' ", host->h_name);
        }
    }
    timevar = time(NULL);
    lt = localtime(&timevar);
    //get time related things; noon and later count as p.m.
    if (lt->tm_hour >= 12)
    {
        if (lt->tm_hour > 12)
            lt->tm_hour = lt->tm_hour - 12;
        amORpm = "p.m.";
    }
    else
        amORpm = "a.m.";
    fprintf(log, "on %.2d/", lt->tm_mon + 1);     //get month and write it into the log file
    fprintf(log, "%.2d/", lt->tm_mday);           //get day and write it into the log file
    fprintf(log, "%i ", lt->tm_year + 1900);      //get year and write it into the log file
    fprintf(log, "at %.2d:", lt->tm_hour);        //get hour and write it into the log file
    fprintf(log, "%.2d:", lt->tm_min);            //get minutes and write it into the log file
    fprintf(log, "%.2d", lt->tm_sec);             //get seconds and write it into the log file
    fprintf(log, " %s", amORpm);                  //get am or pm and write it into the log file
    fprintf(log, " opened the URL http://king.mcs.drexel.edu:8051/%s", file); //get the URL and write it into the log file
    fprintf(log, " with pid = %d\n", getpid());   //get process id and write it into the log file
The log file was a text file, so the same problem of unreadable formatting made me convert the “logs.txt” file into a “loghtml” file. The conversion process is the same as for the other files mentioned above.
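As an alternative to printing each time field separately, a log entry can be built in one step with strftime(), whose %I and %p conversions produce the 12-hour clock and the AM/PM marker. This sketch is not the code in aserver.c: the function name format_log_entry and its parameters are assumptions, and in the default C locale %p yields AM or PM rather than the a.m. and p.m. strings used by the server.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Build one log line in `buf`. Returns the number of characters that the
 * full line needs (as snprintf does), or -1 if the time cannot be formatted. */
int format_log_entry(char *buf, size_t size, const char *host,
                     const struct tm *when, const char *file, long pid)
{
    char stamp[64];

    assert(buf != NULL && host != NULL && when != NULL && file != NULL);
    /* %I is the hour on a 12-hour clock (01-12); %p is the AM/PM marker. */
    if (strftime(stamp, sizeof stamp, "%m/%d/%Y at %I:%M:%S %p", when) == 0)
        return -1;
    return snprintf(buf, size,
                    "Remote host '%s' on %s opened the URL "
                    "http://king.mcs.drexel.edu:8051/%s with pid = %ld\n",
                    host, stamp, file, pid);
}
```

A caller would pass the result of localtime() and getpid(), then write the buffer to the log file with fputs(). With this approach noon is formatted as 12 PM and midnight as 12 AM without any special-case code.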
The hardest remaining problem is the time it takes for the “nethtml” page to load, and there is no way around it. Forking is not an option because the output is required before the server can go any further. This does not prevent the server from handling other simultaneous requests, but the “nethtml” request itself has to wait until the output is available. The delay arises because the netstat command itself takes time to execute on the UNIX machine. To verify this delay, execute the netstat command on the machine where my server is located and compare its execution time with the time it takes to open the “nethtml” page (for accurate results, run the netstat command and open the “nethtml” page at the same time). Forking was an option for the sleep command because no output from sleep was required and the PID of the sleep process can be obtained before it has finished executing. This is not the case for the netstat output.
Future Work
There are certain things that can be improved for efficient server functionality. Just to list a few:
1). Currently, the file names in the “private” directory must be exactly 7 characters long. Anything shorter or longer will not work. This can certainly be improved to allow a variety of file names.
2). Currently, there is only one username/password combination, but more combinations could be implemented, each with its own access rights. Also, authentication is currently done via the links, but CGI scripts could make this more practical.
3). A separate function should be written for converting a text file to an html file, instead of copying the same code over and over again and thereby increasing the length of the program.
4). Relative paths do not work in Netscape Navigator 4.06 but work correctly in Internet Explorer.