COMP 3331/9331: Computer Networks and Applications
Lab Exercise 2: HTTP, Web Proxies and DNS (Solutions)

AIM

HTTP: To use Wireshark to investigate several aspects of HTTP, including the CONDITIONAL GET and basic HTTP authorization.
Web Proxy: To understand the working of a web proxy.
DNS: To understand how DNS works by using the dig tool and Wireshark.

For the first two experiments we will use the Wireshark packet analyser that we used in the previous lab. Before you begin, go to the “Trace Files” link on the course webpage and download the traces for HTTP (under Lab2).

Experiment 1: Using Wireshark to understand the HTTP CONDITIONAL GET/response interaction

The following are the steps for this experiment:

Start Wireshark by typing wireshark at the command prompt.

Load the trace file http-ethereal-trace-2 by using the File pull down menu, choosing Open and selecting the appropriate trace file. This trace file captures a request/response exchange between a client browser and a web server in which the client requests the same file from the server twice within a span of a few seconds.

Filter out all the non-HTTP packets and focus on the HTTP header information in the packet-header detail window.

By looking at the information in the HTTP GET and response messages (the first two messages), answer the following questions:

1) Inspect the contents of the first HTTP GET request from the browser to the server. Do you see an “IF-MODIFIED-SINCE” line in the HTTP GET?

Answer: No, this header is not present in the request. This is because this is the first time that the browser is requesting this particular file, or the local browser cache was emptied just before this file was requested.

2) Inspect the contents of the server response. Did the server explicitly return the contents of the file? How can you tell?

Answer: Yes, the server has responded with the contents of the file. This is evident on inspecting the body of the response message.

3) Does the response indicate the last time that the requested file was modified?

Answer: The response does indicate that the file was last modified on Tuesday, 23rd September 2003 at 05:35 GMT.

4) Now inspect the contents of the second HTTP GET request from the browser to the server. Do you see an “IF-MODIFIED-SINCE:” line in the HTTP GET? If so, what information is contained in the “IF-MODIFIED-SINCE:” header?

Answer: The second GET request does indeed contain the “IF-MODIFIED-SINCE:” line, and the time specified here is Tuesday, 23rd September 2003 at 05:35 GMT, which was obtained from the previous response message. In other words, the browser has cached the “LAST-MODIFIED” time from the previous response for the same page and is including that time in the subsequent request message.

5) What is the HTTP status code and phrase returned from the server in response to this second HTTP GET? Did the server explicitly return the contents of the file? Explain.

Answer: The status code and phrase in the response are 304 Not Modified. This indicates to the browser that the page has not been modified on the server since the time specified in the “IF-MODIFIED-SINCE” header of the request. Hence, the server does not respond with the requested file. The browser can simply display the locally cached version of this file.

Experiment 2: Using Wireshark to understand HTTP Authentication

Before you begin this experiment please read through the following link: http://frontier.userland.com/stories/storyReader$2159 titled HTTP Access Authentication Framework, which provides an easy-to-read overview of basic HTTP authentication.

The following are the steps for this experiment:

Start Wireshark by typing wireshark at the command prompt.

Load the trace file http-ethereal-trace-5 by using the File pull down menu, choosing Open and selecting the appropriate trace file.
This file captures the sequence of messages exchanged between a browser and a password-protected website (http://gaia.cs.umass.edu/wireshark-labs/protected_pages/HTTP-wiresharkfile5.html). The user name is “wireshark-students” (without the quotes) and the password is “network” (again without the quotes).

Filter out all the non-HTTP packets and focus on the HTTP header information in the packet-header detail window.

By looking at the information in the HTTP GET and response messages, answer the following questions:

1) What is the server’s response (status code and phrase) to the initial HTTP GET message from the browser?

Answer: The response status code and phrase is 401 Authorization Required. This indicates to the browser that the requested page is password protected. The browser will then display a pop-up window prompting the user to enter the appropriate user name and password.

2) When the browser sends the HTTP GET message for the second time (after the user has entered the user name and password), what new field is included in the HTTP GET message?

Answer: The second GET request is sent after the user has entered the login name and password. This message carries an additional header field called “Authorization: Basic”. The base64-encoded user name and password are carried in this field. Notice that the subsequent response from the server includes the desired webpage, indicating that the user is authorized to view this page.

3) Go to the following webpage: http://www.base64decode.org/ and enter the entire text following the “Authorization: Basic” header in the above request (excluding the \r\n at the end) into the form field and hit “decode”. What is the result? Note that the text that you have just entered is encoded in base64 format by HTTP. The webpage allows you to decode this.
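
The same decoding can also be done locally rather than on a webpage. The following is a minimal sketch, assuming Python 3 and its standard base64 module; the helper name decode_basic_credentials is made up for illustration, and the input is the string that follows “Authorization: Basic” in the trace.

```python
import base64


def decode_basic_credentials(encoded: str) -> tuple:
    """Decode the value following "Authorization: Basic" into (user, password).

    Hypothetical helper for illustration; it assumes the decoded text is
    ASCII and has the form "user:password".
    """
    decoded = base64.b64decode(encoded).decode("ascii")
    # Split on the first colon only, since a password may itself contain colons.
    user, _, password = decoded.partition(":")
    return user, password


# String observed after "Authorization: Basic" in http-ethereal-trace-5.
user, password = decode_basic_credentials("ZXRoLXN0dWRlbnRzOm5ldHdvcmtz")
print(user, password)
```

No key or secret is needed for this step, which is the point of the exercise: base64 is a reversible encoding, not encryption.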

Answer: The following string of characters: “ZXRoLXN0dWRlbnRzOm5ldHdvcmtz” translates to “eth-students:networks”, which is the user name and password sent to the secure website in this trace, i.e. the user name is “eth-students” and the password is “networks”.

4) What can you comment on the level of security provided by the basic authorization mechanism of HTTP?

Answer: The above indicates that HTTP basic authorization merely encodes the user name and password in base64; it does not encrypt them. Anyone can download a tool like Wireshark and sniff packets belonging to other users that pass by their network adaptor (for example, on a shared Ethernet LAN), and anyone can translate from base64 to ASCII (you just did it!). This shows you that simple passwords on WWW sites are not secure unless additional measures are taken. There are ways of making WWW access more secure. We shall discuss this when we cover network security.

Experiment 3: Web Proxy

Tools

For this experiment, we will use the netstat utility available in Linux. It can print information about network connections, routing tables, interface statistics, etc. However, for this experiment we are only interested in its ability to print information about active network connections on our computer.

Exercise

Follow the steps described below. You will notice certain questions as you attempt the exercise. Write down the answers for your own reference. The solutions will be put up on the webpage at the end of the week. If you have any questions or are experiencing difficulty with executing the lab, please consult your lab instructor.

Step 1: Open a web browser window and open the main UNSW webpage, www.unsw.edu.au.

Step 2: Open an xterm window and increase the size of the window to a reasonably large size.

Step 3: We will now look at the end-points (sockets) of the TCP connection established between your computer and the UNSW web server.
In the xterm type,

    netstat -t -n

If you encounter a large number of connections and your screen scrolls down, then follow the instructions in Step 3b. Otherwise, continue to Step 4.

Step 3b (Optional): An approach similar to the one used in the first lab experiment on FTP can filter out all connections not of interest. In the xterm window, locate the process id (pid) of your browser process by typing,

    ps -U yourloginname

Now note the pid of the browser process and type the following at the prompt,

    netstat -t -n -p | grep processid

Step 4: Note down the IP address of the remote end of the connection (i.e. the non-local socket).

Step 5: Now change the URL in your browser by typing in: www.ee.unsw.edu.au. Immediately run netstat as above and note down the IP address of the remote socket of the connection.

Question 1. Is the IP address of the remote end of the TCP connection the same?

Answer: The foreign address reflected in netstat will change as the user moves from site to site within the UNSW network. This is really a matter of policy: internal connections (UNSW sites) can bypass the proxy.

Step 6: Now change the URL in your browser by typing in: www.microsoft.com (a site external to the UNSW network). Note the IP address of the remote socket by running netstat as before.

Step 7: Now change the URL to: www.cnn.com. Note the IP address of the remote socket.

Question 2. Is the IP address of the remote end of the socket different from the one recorded in Step 6? Is the pattern you observe with internal sites similar to the pattern with external sites? Suggest why the patterns are similar/dissimilar.

Answer: With external sites, expect all connections to be first directed to the proxy. As a result, the foreign address reflected in netstat will always point to the proxy. The proxy will first check for the requested file in its cache. It will contact the origin server for the file if no match is found within the cache.
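
The cache-then-origin behaviour just described can be sketched in a few lines. This is only a toy model under stated assumptions, not how the CSE proxy is implemented: the class CachingProxy and the function fake_origin are hypothetical names, the origin server is simulated by a local function so that no network access is needed, and cache expiry and revalidation (which real proxies perform) are ignored.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy model of a web proxy cache (hypothetical, for illustration only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.origin_requests = 0  # counts how often the origin was contacted

    def get(self, url: str) -> bytes:
        # Cache hit: answer from the local copy without contacting the origin.
        if url in self._cache:
            return self._cache[url]
        # Cache miss: fetch from the origin server and remember the result.
        self.origin_requests += 1
        body = self._fetch(url)
        self._cache[url] = body
        return body


def fake_origin(url: str) -> bytes:
    """Stand-in for the origin server; returns a made-up page body."""
    return f"<html>content of {url}</html>".encode("utf-8")


proxy = CachingProxy(fake_origin)
first = proxy.get("http://www.cnn.com/")   # miss: goes to the origin
second = proxy.get("http://www.cnn.com/")  # hit: served from the cache
print(proxy.origin_requests)
```

In both calls the client talks only to the proxy, which is why netstat on the client always shows the proxy as the foreign address for external sites.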
Proxies also provide a way of monitoring user traffic, for example to enforce IP quotas, which is a requirement for external traffic at CSE. Check the following link, http://www.cse.unsw.edu.au/faq/questions/wwwproxysetup.html, for more information.

EXPERIMENT 4: Tracing DNS using dig

As described in Section 2.5 of the textbook (and in the Week 3 lecture), the Domain Name System (DNS) translates hostnames to IP addresses, fulfilling a critical role in the Internet infrastructure. In this lab, we’ll take a closer look at the client side of DNS. Recall that the client’s role in the DNS is relatively simple – a client sends a query to its local DNS server, and receives a response back. As shown in Figures 2.26 and 2.18 in the textbook, much can go on “under the covers”, invisible to the DNS clients, as the hierarchical DNS servers communicate with each other to either recursively or iteratively resolve the client’s DNS query. From the DNS client’s standpoint, however, the protocol is quite simple – a query is formulated to the local DNS server and a response is received from that server.

Before beginning this lab, you’ll probably want to review DNS by reading Section 2.5 of the text. In particular, you may want to review the material on local DNS servers, DNS caching, DNS records and messages, and the TYPE field in the DNS record.

Tools

dig

In this experiment, we’ll make use of the dig tool, which is available on our lab computers (and also on most Linux/Unix and Microsoft platforms). To run dig in Linux/Unix, you should open an xterm window and type the dig command on the command line. The dig tool is often used by network administrators for verifying and troubleshooting DNS problems. In its most basic operation, dig allows the host running the tool to query any specified DNS server for a DNS record.
The queried DNS server can be a root DNS server, a top-level-domain DNS server, an authoritative DNS server, or an intermediate DNS server (see the textbook for definitions of these terms). To accomplish this task, dig sends a DNS query to the specified DNS server, receives a DNS reply from that same DNS server, and displays the result. You can find out more about dig by typing “man dig” at the prompt.

Note: Some of you may have heard of the nslookup tool. This has now been deprecated and replaced by dig.

Exercise

Follow the steps described below. You will notice certain questions as you attempt the exercise. Write down the answers for your own reference. The solutions will be put up on the webpage at the end of the week. If you have any questions or are experiencing difficulty with executing the lab, please consult your lab instructor.

Step 1: Open an xterm window and increase the size of the window to a reasonably large size.

Step 2: Use dig to find out details about the CSE web server. Type,

    dig www.cse.unsw.edu.au

Question 1. What is the canonical name for the CSE web server? What is its IP address? Suggest a reason for having an alias for this server.

Answer: The following is the output of the above dig command:

    ; <<>> DiG 9.3.2-P1 <<>> www.cse.unsw.edu.au
    ;; global options: printcmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43449
    ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 4

    ;; QUESTION SECTION:
    ;www.cse.unsw.edu.au. IN A

    ;; ANSWER SECTION:
    www.cse.unsw.edu.au. 300 IN CNAME albeniz.orchestra.cse.unsw.edu.au.
    albeniz.orchestra.cse.unsw.edu.au. 86400 IN A 129.94.242.51

    ;; AUTHORITY SECTION:
    orchestra.cse.unsw.edu.au. 86400 IN NS beethoven.orchestra.cse.unsw.edu.au.
    orchestra.cse.unsw.edu.au. 86400 IN NS maestro.orchestra.cse.unsw.edu.au.

    ;; ADDITIONAL SECTION:
    maestro.orchestra.cse.unsw.edu.au. 86400 IN A 129.94.242.33
    beethoven.orchestra.cse.unsw.edu.au. 86400 IN A 129.94.172.11
    beethoven.orchestra.cse.unsw.edu.au. 86400 IN A 129.94.208.3
    beethoven.orchestra.cse.unsw.edu.au. 86400 IN A 129.94.242.2

    ;; Query time: 0 msec
    ;; SERVER: 129.94.242.2#53(129.94.242.2)
    ;; WHEN: Wed Aug 1 18:25:45 2007
    ;; MSG SIZE rcvd: 195

The headers at the top indicate that this DNS message contains 1 query (the original query), 2 records in the Answer section, 2 records in the Authority section and 4 records in the Additional section. After examining the Answer section it is apparent that the canonical name for www.cse.unsw.edu.au is albeniz.orchestra.cse.unsw.edu.au and the IP address of the server is 129.94.242.51. Canonical names are usually very long and too hard to remember. An alias such as “www.cse.unsw.edu.au” is more mnemonic and easy to recall.

Question 2. What can you make of the rest of the response (i.e. the details available in the Authority and Additional sections)?

Answer: The Authority section contains NS resource records for the orchestra.cse.unsw.edu.au domain name. In other words, it indicates the two authoritative name servers for this particular domain name, which are beethoven.orchestra.cse.unsw.edu.au and maestro.orchestra.cse.unsw.edu.au. The Additional section contains the IP addresses of these two authoritative name servers (i.e. the Type A resource records for the name servers). As is apparent, the domain is served by more than one name server, and beethoven is reachable at three IP addresses. This replication increases reliability and also allows for some load distribution.

Question 3. What is the IP address of the server that replied to the DNS query that you initiated in Step 2 using dig? (Hint: The answer is available towards the bottom of the response.) Which server is this?

Answer: The SERVER line at the end of the response (third-last line in the response) provides the IP address of the server that replied to the DNS query issued by dig. The IP address of this server is 129.94.242.2.
This represents the local DNS server (or default DNS server), which is the server to which any DNS query from your host will be directed. Although the response comes from this local DNS server, it is quite possible that this local DNS server iteratively contacted several other DNS servers to get the answer, as described in Section 2.5 of the textbook.

Step 3: We will now query for the NS record for the “cse.unsw.edu.au” domain. Type,

    dig cse.unsw.edu.au NS

This causes dig to send a query for a type-NS record to the default local DNS server. In other words, the query is saying, “please send me the host names of the authoritative name servers for cse.unsw.edu.au”. (When the type option is not used, dig uses the default, which is to query for type A records, as we did in Step 2.)

Question 4. What are the DNS name servers for the “cse.unsw.edu.au” domain? Find out their IP addresses.

Answer: The name servers for cse.unsw.edu.au and their IP addresses are:

    maestro.orchestra.cse.unsw.edu.au: 129.94.242.33
    beethoven.orchestra.cse.unsw.edu.au: 129.94.208.3, 129.94.242.2, 129.94.172.11

The Answer section of the response to the above dig command (Step 3) provides the NS resource records of the name servers, and the Additional section contains their IP addresses (Type A resource records).

Step 4 (Do-It-Yourself): Try running dig with the trace option (e.g., dig +trace www.yahoo.com). Observe the output. When the trace option is set, dig uses iterative queries to resolve the hostname. The output includes the answers received from each server along the query path.

Experiment 5: Tracing DNS with Wireshark

Tools

For this experiment, we will use the Wireshark packet analyser that we used extensively in the previous lab. Before you begin, go to the “Lab Traces” link on the course webpage and download the DNS traces for Lab 2.

Exercise

Follow the steps described below. You will notice certain questions as you attempt the exercise. Write down the answers for your own reference.
The solutions will be put up on the webpage at the end of the week. If you have any questions or are experiencing difficulty with executing the lab, please consult your lab instructor.

Step 1: Open an xterm and run Wireshark.

Step 2: Load the trace file dns-ethereal-trace-2 by using the File pull down menu, choosing Open and selecting the appropriate trace file. This file captures the sequence of messages exchanged between a host and its default DNS server while using the nslookup utility to obtain the IP address (type A record) of www.mit.edu. The IP address of the default DNS server for the host is 128.238.29.22. Now filter out all non-DNS packets by typing “dns” (without quotes) in the filter field. Also click the right arrow for DNS in the packet-header detail window. Now focus on the last two DNS messages of the 6 listed and answer the following questions:

1) What transport layer protocol is being used by the DNS messages?

Answer: DNS uses UDP.

2) What are the source and destination ports for the DNS query message and the corresponding response?

Answer: For the query, the source port is 3742 and the destination port is 53. For the response they are reversed, i.e. the source port is 53 and the destination port is 3742.

3) To what IP address is the DNS query message sent? Is this the same as the default local DNS server?

Answer: The DNS query is sent to the IP address 128.238.29.22, which is the default local DNS server.

4) How many “questions” are contained in the DNS query message? What “Type” of DNS queries are they? Does the query message also contain any “answers”?

Answer: There is only one “question” in the query message. It is of type A and requests the IP address of www.mit.edu. The query message does not contain any “answers”.

5) Examine the DNS response message. Provide details of the contents of the “Answers”, “Authority” and “Additional Information” fields. What can you infer from these?

Answer: The response message contains one “Answer”, which is the RR (resource record) for www.mit.edu.
The RR is as follows:

    www.mit.edu: type A, class inet, addr 18.7.22.83

In addition there are three authority records, which are the RRs for the authoritative name servers of the “mit.edu” domain. These RRs are as follows:

    mit.edu: type NS, class inet, ns BITSY.mit.edu
    mit.edu: type NS, class inet, ns STRAWB.mit.edu
    mit.edu: type NS, class inet, ns W20NS.mit.edu

Finally there are also three additional RRs, which contain the type A records for the above three name servers. They are as follows:

    BITSY.mit.edu: type A, class inet, addr 18.72.0.3
    STRAWB.mit.edu: type A, class inet, addr 18.71.0.151
    W20NS.mit.edu: type A, class inet, addr 18.70.0.160

Step 3: Load the trace file dns-ethereal-trace-3 by using the File pull down menu, choosing Open and selecting the appropriate trace file. This file captures the sequence of messages exchanged between a host and its default DNS server while using the nslookup utility (this tool is similar to the dig tool that we used earlier in the lab) to obtain the name servers (type NS record) of the “mit.edu” domain. The IP address of the default DNS server for the host is 128.238.29.22. Now filter out all non-DNS packets by typing “dns” (without quotes) in the filter field. Also click the right arrow for DNS in the packet-header detail window. Now focus on the last two DNS messages of the 6 listed and answer the following questions:

1) To what IP address is the DNS query message sent? Is this the same as the default local DNS server?

Answer: The DNS query is sent to the IP address 128.238.29.22, which is the default local DNS server.

2) Examine the DNS query message. What “Type” of DNS query is it? Does the query message also contain any “answers”?

Answer: There is only one “question” in the query message. It is of type NS and requests the name servers for the mit.edu domain. The query message does not contain any “answers”.

3) Examine the DNS response message. Provide details of the contents of the “Answers”, “Authority” and “Additional Information” fields.
What can you infer from these?

Answer: The response message contains three “Answer” RRs (resource records) for the name servers of the mit.edu domain. These RRs are as follows:

    mit.edu: type NS, class inet, ns BITSY.mit.edu
    mit.edu: type NS, class inet, ns STRAWB.mit.edu
    mit.edu: type NS, class inet, ns W20NS.mit.edu

There are three additional RRs, which contain the type A records for the above three name servers. They are as follows:

    BITSY.mit.edu: type A, class inet, addr 18.72.0.3
    STRAWB.mit.edu: type A, class inet, addr 18.71.0.151
    W20NS.mit.edu: type A, class inet, addr 18.70.0.160

Step 4: You can confirm the above records by using nslookup yourself. Type the following two commands, observe the output and compare it to your answers for Steps 2 and 3,

    nslookup -sil www.mit.edu
    nslookup -sil -type=NS mit.edu

It is recommended that you try out these two commands, since the format of the output with nslookup is slightly different from that of dig.

Note: The “-sil” flag suppresses the message that indicates that nslookup is deprecated.

Answer: This step will confirm the above information.

END OF LAB