
7
Basic Case Scenarios
Beginning with this chapter we'll dig into the real meat of packet analysis as we use Wireshark
to analyze real-world network problems. We'll begin by analyzing scenarios that you might
encounter day-to-day as a network engineer, help desk technician, or application developer, all
derived from my real-world experience and that of my colleagues.
Rather than start out with problematic traffic we'll explore scenarios that involve normal
traffic that you probably already generate daily. We'll use Wireshark to examine traffic from
Twitter, Facebook, and ESPN.com to see how these common services work.
The second part of this chapter introduces a series of real-world problems. For each, I
describe the situation surrounding the problem and offer the same information that was
available to the analyst at the time. Having laid the groundwork, we'll turn to analysis as I
describe the method used to capture the appropriate packets and step you through the analysis
process. Once the analysis is complete, I offer a solution to the problem or point you to
potential solutions, along with an overview of lessons learned.
Throughout, remember that analysis is a dynamic process, and the methods I use to
analyze each scenario may not be the same ones you would use. Everybody analyzes in
different ways; the most important thing is that the end result is positive.
Social Networking at the Packet Level
The first example we will look at is the traffic of two popular social networking websites,
Twitter and Facebook. We will examine the authentication process associated with each service
and see how two very similar services use two separate methods for performing the same task.
We will also look at how some of the primary functions of each service work in order to gain a
better understanding of the traffic we generate in our normal daily activities.
Twitter Authentication
One of the first things I do when I get to the office is check my Twitter feed. Whether you use
Twitter to stay up to date on news in the tech community or just to complain about how mad you
are at your girlfriend, it’s one of the more commonly used services on the Internet. For this
scenario, you’ll find a capture of Twitter traffic in the file twitter_login.pcap.
The Login Process
When I teach packet analysis, one of the first things I have my students do is log into a website
they normally use and capture the traffic from the login process. This serves a dual purpose: it
exposes students to more packets in general and allows them to discover insecurities in
their daily activity by looking for clear text passwords traversing the wire.
Fortunately, the Twitter authentication process is not completely insecure, as we can see in
the first few packets in the capture. As you can see in Figure 7-1, these first three packets
constitute the TCP handshake between our local device (172.16.16.128) and a remote server
(168.143.162.68). The remote server is listening for our connection on port 443, which is
typically associated with SSL over HTTP, commonly referred to as HTTPS, a secure form of
data transfer. Based upon this alone we can assume that this is SSL traffic.
Figure 7-1: Handshake connecting to port 443
The packets that follow the handshake are part of the SSL encrypted handshake. SSL relies
upon keys, which are strings of characters used to encrypt and decrypt communication between
two parties. The handshake process is the formal transmission of these keys between hosts as
well as the negotiation of various connection and encryption characteristics. Once this handshake
is completed, secure data transfer begins. To find the encrypted packets that handle
the exchange of data, look for the packets that say "Application Data" in the Info column of the
packet list window. Expanding the SSL portion of any of these packets will display the
Encrypted Application Data field, containing the unreadable encrypted data, as shown in
Figure 7-2. This shows the transfer of the username and password during login.
Figure 7-2: Encrypted credentials being transmitted
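If you prefer to confirm this outside of the Wireshark GUI, a short script can pull the same packets out of the capture. The sketch below assumes the Python scapy library is available and uses the capture file name from this section; it treats any TCP segment to or from port 443 whose payload begins with the TLS record type 0x17 (Application Data) as encrypted application data, which is a rough heuristic rather than a full TLS dissection.

from scapy.all import rdpcap, TCP, Raw

packets = rdpcap("twitter_login.pcap")

# TLS record type 23 (0x17) marks Application Data, the encrypted payload.
# Checking only the first byte of each segment is a heuristic, since a record
# can begin mid-segment, but it is close enough for a quick look.
app_data = [p for p in packets
            if TCP in p and Raw in p
            and 443 in (p[TCP].sport, p[TCP].dport)
            and p[Raw].load[0] == 0x17]

print(len(app_data), "segments carrying encrypted Application Data")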
The authentication continues briefly until the connection begins its teardown process with a
FIN/ACK at packet 16. Following authentication we would expect our browser to be redirected
to our Twitter homepage, which is exactly what happens. As you can see in Figure 7-3, packets
19, 21, and 22 are part of the handshake process that sets up a new connection to the same
remote server (168.143.162.68) but on port 80 instead of 443. Following the completed
handshake we see the HTTP GET request in packet 23 for the root directory of the web server (/)
and the delivery of the homepage contents to 172.16.16.128 beginning in packet 24.
Figure 7-3: The GET request for the root directory of our Twitter homepage (/)
once authentication has completed
Sending Data with a Tweet
Having logged in the next step is to tell the world what’s on our mind. Because I’m in the middle
of writing a book I’ll tweet “I’m writing a book right now” and capture the traffic from posting
that tweet in the file twitter_tweet.pcap.
This capture file starts immediately after the tweet has been submitted. It begins with a
handshake between our local workstation 172.16.16.134 and the remote address
168.143.162.100. The fourth and fifth packets in the capture comprise an HTTP packet sent from
client to server.
To examine this HTTP header, expand the HTTP section in the packet details window of the
fifth packet, as shown in Figure 7-4.
When you do, you will immediately see that the POST method is used with the URL
/status/update. We know that this is indeed a packet from the tweet
because the Host field contains the value twitter.com.
Figure 7-4: The HTTP POST for a Twitter update
Notice the information contained in the packet's line-based text data section, shown in
Figure 7-5. When you analyze this data you will see a field named authenticity token followed
by a status field in the URL-encoded form data containing the value:
This+is+a+tweet+for+practical+packet+analysis%2c+second+edition
The value of the status field is the tweet I've submitted, in unencrypted clear text.
Figure 7-5: The tweet in clear text
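The plus signs and percent escapes are just URL encoding, not encryption. A couple of lines of Python (shown here only as an illustration) recover the original text:

from urllib.parse import unquote_plus

status = "This+is+a+tweet+for+practical+packet+analysis%2c+second+edition"
print(unquote_plus(status))
# Prints: This is a tweet for practical packet analysis, second edition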
There is some immediate security concern here because some people protect their tweets and
don’t intend for them to be seen by just anybody.
Twitter Direct Messaging
Let’s consider a scenario which has even greater security implications with the Twitter direct
messaging feature (which allows one user to send presumably private direct messages to another.
The file twitter_dm.pcap is a packet capture of a Twitter direct message. As you can see in
Figure 7-6, direct messages aren't exactly private: packet 7 shows
that the message content is still sent in clear text. This can be seen by examining the same
line-based text data field that we viewed in the previous capture.
Figure 7-6: A direct message in the clear
The knowledge that we gain here about Twitter isn’t necessarily earth shattering, but it may
make you reconsider sending sensitive data via private Twitter messages over untrusted
networks.
Facebook Authentication
Once I’ve finished reading my tweets I like to login to Facebook to see what my friends are up to
so that I can live vicariously through them. As with our last example we’ll use Wireshark to
capture Facebook traffic.
We’ll begin with the login process captured in facebook_login.pcap. The capture begins as
soon as credentials are submitted, and much like the capture of the Twitter login process, this
capture begins with a TCP handshake over port 443 . Our workstation at 172.16.0.122  is
initiating communication with 69.63.180.173 , the server handling the Facebook authentication
process. Once the handshake completes the SSL handshake occurs  and login credentials are
submitted as shown in Figure 7-7.
Figure 7-7: Login credentials are transmitted securely with HTTPS
One difference between the Facebook authentication process and the Twitter one is that we
don't immediately see the authentication connection teardown following the transmission of
login credentials. Instead, we see a GET request for /home.php in the HTTP header of packet 12,
as highlighted in Figure 7-8.
Figure 7-8: After authentication, the GET request for /home.php takes place
But when is the authentication connection torn down? Not until the content of home.php
has been delivered, as you can see with packet 64 at the end of the capture file. First the HTTP
connection over port 80 is torn down (packet 62), and then the HTTPS connection over port
443 is torn down, as shown in Figure 7-9.
Figure 7-9: The HTTP connection is torn down followed by the HTTPS
connection
Facebook Private Messaging
Now that we’ve examined its login authentication process let’s examine how Facebook handles
private messaging. The file facebook_message.pcap contains the packets captured while sending
a message from my account to another Facebook account. When you open this file you may be
surprised by the few packets it contains.
The first two packets comprise the HTTP traffic responsible for sending the message itself.
When you expand the HTTP header of packet 2 (as shown in Figure 7-10) you will see that the
POST method is used with a rather long URL string, as highlighted in the figure. As you can
see, the string includes a reference to AJAX.
Figure 7-10: This HTTP POST references AJAX which may explain the low
number of packets seen here
AJAX stands for Asynchronous JavaScript and XML; it is a client-side approach to creating
interactive web applications that retrieve information from a server in the background. You
might expect that after the private message is sent the client's browser would be redirected
to another page (as with the Twitter direct message), but that's not the case. Here, the use of
AJAX probably means that the message is sent from some type of interactive
pop-up rather than from an individual page, which means that no redirection or reloading of
content is necessary (one of the features of AJAX).
You can examine the content of this private message by expanding the line-based text data
portion of packet 2, as shown in Figure 7-11. Just like with Twitter, it appears as though
Facebook’s private messages are also sent unencrypted.
Figure 7-11: The content of this Facebook message is seen in clear text
We’ve now seen the authentication and messaging methods of two different web services,
Twitter and Facebook, each of which takes a different approach. The choice of method can be a
matter of opinion depending on your perspective. As for the authentication methods chosen,
programmers might argue that the Twitter method is better because it can be faster and more
efficient whereas security researchers may argue the Facebook method is more secure because it
ensures that all content has been delivered and no additional authentication is required before the
authentication connection closes, which may in turn make man-in-the-middle attacks (attacks
where malicious users intercept traffic between two communicating parties) more difficult to
achieve.
We’ve examined two similar web services which operate in a much different manner with
varying techniques and technologies used for their authentication methods and messaging
features. The point of this analysis was not to find out exactly how Twitter and Facebook work,
but simply to expose you to traffic that you can compare and contrast. This baseline should
provide a good framework should you need to examine why similar services aren’t operating as
they should, or are just operating slowly.
ESPN
Having completed my social network stalking for the morning my next task is to check up on the
latest news headlines and sports scores from last night. Certain sites always make for interesting
analysis and http://espn.com is one of those. I’ve captured the traffic of a computer browsing to
the ESPN website in the file http_espn.pcap.
This capture file contains many packets, 956 to be exact. This is simply too much data for us
to manually scroll through the entire file in an attempt to discern individual connections and
anomalies, so we will use some of Wireshark's analysis features to make the process easier.
The ESPN home page includes a lot of bells and whistles, so it's easy to understand why it
would take nearly 1000 packets to transfer that data to us. Whenever you have a large data
transfer occurring, it's useful to know the source of that data, and more importantly, whether it's
from one source or multiple sources. We can find out by using Wireshark's Conversations window
(Statistics > Conversations), as shown in Figure 7-12. As you can see in the top row of this
window, there are 14 unique IP conversations, 25 unique TCP connections, and 14 unique UDP
conversations, all of which are displayed in detail in the main Conversations window. There's a
lot going on here, especially for just one site.
Figure 7-12: The conversations window shows several unique connections
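The same numbers can be reproduced programmatically, which is handy when you want to compare many captures. The sketch below assumes the Python scapy library and the capture file name used here; it normalizes each packet's endpoints so that both directions of a conversation count only once.

from scapy.all import rdpcap, IP, TCP, UDP

packets = rdpcap("http_espn.pcap")
ip_convs, tcp_convs, udp_convs = set(), set(), set()

for pkt in packets:
    if IP not in pkt:
        continue
    # Sort the endpoints so A->B and B->A map to the same conversation.
    ip_convs.add(tuple(sorted((pkt[IP].src, pkt[IP].dst))))
    if TCP in pkt:
        tcp_convs.add(tuple(sorted([(pkt[IP].src, pkt[TCP].sport),
                                    (pkt[IP].dst, pkt[TCP].dport)])))
    elif UDP in pkt:
        udp_convs.add(tuple(sorted([(pkt[IP].src, pkt[UDP].sport),
                                    (pkt[IP].dst, pkt[UDP].dport)])))

print(len(ip_convs), "IP conversations")
print(len(tcp_convs), "TCP conversations")
print(len(udp_convs), "UDP conversations")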
For a better view of the situation, we can look at the application-layer protocols used within these
TCP and UDP conversations. Open the Protocol Hierarchy window (Statistics > Protocol
Hierarchy), as shown in Figure 7-13.
Figure 7-13: The protocol hierarchy window shows us the distribution of protocols
in this capture
As you can see, TCP accounts for 97.07 percent of the packets in the capture, and
UDP accounts for the remaining 2.93 percent. As expected, the TCP traffic is all HTTP,
which is broken down even further into the file types transferred over HTTP. It may seem
confusing to say that all of the TCP traffic is HTTP when only 12.66 percent of the packets are
listed under HTTP, but that's because the other 87.34 percent are plain TCP packets (data
transfer and control packets such as ACKs) belonging to those HTTP connections. All
of the UDP traffic shown is DNS, based upon the entry under the UDP heading.
Based upon this information alone we can draw a few conclusions. For one, we know from
previous examples that DNS transactions are quite small so the fact there are 28 DNS packets (as
listed in the Packets column next to the Domain Name Service entry in Figure 7-13) means that
we could have as many as 14 DNS transactions (divide the total number of packets by two,
representing pairs of requests and responses). DNS queries don’t happen on their own though,
and the only other traffic in the capture is HTTP traffic, so this tells us that it’s likely that the
HTML code within the ESPN website references other domains or subdomains by DNS name,
thus forcing multiple queries to be executed.
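As a quick sanity check of that arithmetic, you could count the DNS queries directly. This sketch again assumes scapy and the same capture file; it separates queries from responses using the qr bit in the DNS header.

from scapy.all import rdpcap, DNS

packets = rdpcap("http_espn.pcap")
dns_packets = [p for p in packets if p.haslayer(DNS)]
queries = [p for p in dns_packets if p[DNS].qr == 0]   # qr == 0 means query

print(len(dns_packets), "DNS packets in total")
print(len(queries), "of them are queries")
for q in queries:
    print("  query for", q[DNS].qd.qname.decode())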
Let’s see if we can find some evidence to substantiate our theories. One simple way to view
DNS traffic is to create a filter. Entering dns into the filter section of the Wireshark window
shows all of the DNS traffic, displayed in Figure 7-14.
Figure 7-14: The DNS traffic appears to be standard queries and responses
The DNS traffic shown in Figure 7-14 appears to consist entirely of queries and responses. For a better
view of the DNS names being queried we can create a filter that displays only the queries. To do
this:
Select a query in the packet list window
Expand the packet’s DNS header in the packet details window
Right-click the Flags: 0x0100 (Standard query) field
Hover over Apply as Filter, and choose Selected.
This should activate the filter dns.flags == 0x0100, which shows only the queries and makes
it much easier to read the records we’re analyzing. And, as we can see in Figure 7-14, there are
indeed 14 individual queries (each packet represents a query), and all of the domain names seem
to be associated with ESPN or the content displayed on its home page.
Finally, we can verify the source of these queries by examining the HTTP requests. To do so,
Select Statistics from the main drop-down menu
Go to HTTP, select Requests, and click Create Stat (Make sure the filter you just
created is cleared before doing this.)
Figure 7-15: All HTTP requests are summarized in this window showing domains
accessed
The HTTP Requests window is seen in Figure 7-15. There are a variety of requests here, and
these account for the DNS queries we saw a few moments ago. There are exactly 14 connections
(each line represents a connection to a particular domain), so this accounts for all of the
domains represented by the DNS queries.
With this many connections occurring, it may be in our best interest to make sure that this
highly involved process is occurring in a timely manner. The easiest way to do this is to view a
summary of the traffic. In order to do this:
Choose Statistics from the drop-down menu, and select Summary.
This summary is seen in Figure 7-16 and points out that the entire process occurs in about two
seconds, which is perfectly acceptable.
Figure 7-16: The Summary of the file tells us that this entire process occurs in
two seconds
It’s odd to think that our simple request to view a webpage broke into requests for fourteen
separate domains and subdomains touching a variety of different servers, and that this whole
process took place in only two seconds.
No Internet Access 1
The first problem scenario is rather simple; a user cannot access the Internet. You have had the
user verify that they cannot access any website on the Internet. The user can access all of the
internal resources of the network including shares on other workstations and applications hosted
on local servers. The network architecture is fairly simple as all clients and servers connect to a
series of simple switches. Internet access is handled through a single router serving as the default
gateway, and IP addressing information is provided by DHCP.
Tapping into the Wire
In order to determine the cause of the issue we can have the user attempt to browse the Internet
while our sniffer is listening on the wire. The network architecture here is very simple. Using
Figure 2-15, introduced in chapter two, as a reference, we can determine the most appropriate
method for placing our sniffer. The switches on our network do not support port mirroring.
Because we are requiring the user to perform a manual process in order to recreate the problem,
we can assume that it is okay to take the user offline briefly. Since we have access to a tap, that is
the most appropriate method for tapping into the wire. The resulting file is nowebaccess1.pcap.
Analysis
The traffic capture begins with an ARP request and reply, seen in Figure 7-17. In packet 1 the
user's computer, having a MAC address of 00:25:b3:bf:91:ee and an IP address of 172.16.0.8,
sends an ARP broadcast packet to all computers on the network segment in an attempt to find the
MAC address associated with the IP address of its default gateway, 172.16.0.10.
Figure 7-17: ARP request and reply for the computer's default gateway
A response is received in packet 2 and the user’s computer learns that 172.16.0.10 is at
00:24:81:a1:f6:79. Once this reply is received the computer now has a route to a gateway that
should be able to direct it to the Internet.
Following the ARP reply, the computer attempts to resolve the DNS name of the website
to an IP address in packet 3. It does this by sending a DNS query
to its primary DNS server, 4.2.2.2 (Figure 7-18).
Figure 7-18: A DNS query sent to 4.2.2.2
Under normal circumstances a DNS server answers a query very quickly. In this case
that doesn't happen. Rather than a response, we see
the same DNS query sent out a second time to a different destination address. In packet 4 the
second DNS query is sent to the secondary DNS server configured on the computer, which is
4.2.2.1 (Figure 7-19).
Figure 7-19: A second DNS query sent to 4.2.2.1
Again, no reply is received from the DNS server, and the query is sent again a second later to
4.2.2.2. This process repeats itself, alternating between the primary
and secondary configured DNS servers over the next several seconds. The entire process takes
around eight seconds, which is how long it takes before the Internet browser on the user's
computer reports that the website is inaccessible (Figure 7-20).
Figure 7-20: DNS queries repeated until communication stops
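When you suspect that a DNS server simply isn't answering, you can reproduce the behavior seen in this capture with a single hand-crafted query. This is only a sketch, assuming scapy and root privileges; the server address and query name are placeholders drawn from this scenario and are easily swapped out.

from scapy.all import IP, UDP, DNS, DNSQR, sr1

# Craft a standard recursive query and wait up to two seconds for an answer.
query = IP(dst="4.2.2.2") / UDP(dport=53) / DNS(rd=1, qd=DNSQR(qname="www.example.com"))
reply = sr1(query, timeout=2, verbose=0)

if reply is None:
    print("No response -- the query never came back, just as in this capture")
else:
    print("Response received with", reply[DNS].ancount, "answer record(s)")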
Based upon the packets we've seen we can begin pinpointing the source of the problem. We
see a successful ARP request to what we believe is the default gateway router for the network, so
we know that device is online and communicating. We also know that the user's computer is actually
transmitting packets on the network, so we can assume there isn't an issue with the protocol stack
on the computer itself. The problem clearly begins to occur when the DNS request is made.
In the case of this network, DNS queries are resolved by an external server on the Internet
(4.2.2.2 or 4.2.2.1). This means that in order for resolution to take place correctly our router
responsible for routing packets to the Internet must successfully forward the DNS queries to the
server, and the server must respond. This all has to happen before the HTTP protocol can
be used to request the webpage itself. Based upon the context of our scenario we know that no
other users are having issues connecting to the Internet, so the network router and remote
DNS server don't appear to be the source of the problem. The only thing remaining is the
potential for a problem on the user’s computer itself.
Upon deeper examination of the computer it is eventually found that rather than receiving a
DHCP-assigned address, the computer has been manually assigned its addressing information, and
the default gateway is set to the wrong address. The address set as the default
gateway is not a router and is incapable of forwarding the DNS query packets outside of the
network.
Lessons Learned
The problem presented in this scenario was the result of a misconfigured client. The issue itself is
quite simple but had a significant impact on the user's workflow. Although the misconfiguration
is simple, in a troubleshooting scenario it could take quite some time to track down without
the ability to perform a quick analysis like the one we've done here. It's important to keep in mind
that the use of packet analysis is not limited to large, complex problems.
We didn’t enter the scenario knowing the IP address of the networks gateway router, so
Wireshark didn’t tell us exactly what the problem was, but it did tell us where to look which
saved valuable time. Rather than examining the gateway router, contacting our ISP, or trying to
find the resources to troubleshoot the remote DNS server we were able to focus troubleshooting
on the computer itself.
Had we been more familiar with the IP addressing scheme of the network, the
analysis would have been even faster: the problem could have been identified immediately once
it was noticed that the ARP request was sent to an IP address different from that of the
gateway router. Simple misconfigurations like this are often the source of network problems
and can often be resolved more quickly with a bit of analysis at the packet level.
No Internet Access 2
Once again we have a user who is unable to access the Internet from their workstation. We have
had the user test the issue further, and unlike the user in the last scenario, this user is actually
able to access the Internet; they are just unable to access http://www.google.com, which is
currently set as their homepage. Any other website is accessible, but when the user attempts to go
to any domain hosted by Google they are directed to a browser page that says “Internet Explorer
cannot display the webpage.” This issue is only affecting this individual user. Once again, this is
a simple network with a few simple switches and a single router serving as the default gateway.
Tapping into the Wire
This scenario will require us to recreate the problem situation by having the user attempt to
browse to http://www.google.com while we listen to the traffic that is generated. The network
architecture presents the same situation as the previous scenario so we will once again connect a
tap to the device in question in order to capture the traffic we need. The file that is created is
called nowebaccess2.pcap.
Analysis
The capture begins with an ARP request and reply, seen in Figure 7-21. In packet 1 the user's
computer, having a MAC address of 00:25:b3:bf:91:ee and an IP address of 172.16.0.8, sends an
ARP broadcast packet to all computers on the network segment in an attempt to find the MAC
address associated with the IP address of the host 172.16.0.102. This IP address is not the
gateway router for the network segment and as of now is unknown to us.
Figure 7-21: ARP request and reply for another device on the network
A response is received in packet 2 and the user’s computer learns that 172.16.0.102 is at
00:21:70:c0:56:f0. Based on the last scenario, we would assume that this is the address of the
gateway router so that packets can once again be forwarded to the external DNS server.
In this case, however, the next packet that is transmitted is not a DNS request. The first packet
immediately following the ARP transaction is a TCP packet from 172.16.0.8 to 172.16.0.102
with the SYN flag set (Figure 7-22).
Figure 7-22: TCP SYN packet sent from one internal host to another
The SYN flag set in this TCP packet indicates that this is the first packet in the handshake
process for a new TCP-based connection between the two hosts. Notably, the TCP connection
being initiated is to port 80 on 172.16.0.102. We know that port 80 is
typically associated with HTTP traffic. This connection attempt to port 80 is abruptly halted
when the host 172.16.0.102 sends a TCP packet in response (packet 4) with the RST and ACK
flags set (Figure 7-23).
Figure 7-23: TCP RST packet sent in response to the TCP SYN
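The SYN-followed-by-RST exchange is easy to reproduce against a host you control. The sketch below assumes scapy and root privileges and uses the addresses from this scenario only as placeholders; a SYN/ACK reply means something is listening on the port, while a RST means the host is up but nothing is listening there.

from scapy.all import IP, TCP, sr1

syn = IP(dst="172.16.0.102") / TCP(dport=80, flags="S")
reply = sr1(syn, timeout=2, verbose=0)

if reply is None:
    print("No response at all")
elif reply.haslayer(TCP):
    flags = int(reply[TCP].flags)
    if flags & 0x04:                    # RST bit set
        print("RST received -- host is up, but port 80 is closed")
    elif flags & 0x12 == 0x12:          # SYN and ACK bits set
        print("SYN/ACK received -- port 80 is open")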
As we learned in chapter six, a packet with the RST flag set is used to terminate a TCP
connection. In this case, the host at 172.16.0.8 attempted to establish a TCP connection to the
host at 172.16.0.102 on port 80, but that host does not have any services configured to listen to
requests on port 80. As a result, a TCP RST packet was sent to refuse the
connection attempt. This process repeats twice more, with a SYN being sent from the user's computer
and being responded to with a RST before communication finally ends (Figure 7-24). It is at this
point that the user receives a message in their browser saying that the page cannot be displayed.
Figure 7-24: The TCP SYN and RST packets are seen three times in total
In troubleshooting this scenario we will once again try to narrow down where we think the
problem might lie. The ARP request and reply seen in packets 1 and 2 do present a
bit of a concern, since the ARP request is not for the MAC address of the gateway router but
rather for some other device that is unknown to us. This is peculiar, but it doesn't give us
enough information to come to a conclusion just yet.
After the ARP request and reply we would expect to see a DNS query
sent to the configured DNS server in order to find the IP address associated with
www.google.com, but we don't. Whenever you expect to see a particular sequence of events in a
packet capture and you don't, you should examine the conditions that would cause that event to
happen and ensure that your situation meets those criteria. Conversely, you should also examine
the conditions that would prevent that event from happening to ensure your situation does not
meet any of those criteria.
In this case, a DNS query would normally happen when an application attempts to
communicate with a device based upon its DNS name. The DNS name must be resolved to an IP
address so that lower-level protocols can address the data properly and get it to its destination.
Our situation meets those criteria, so based upon that information alone there should be a DNS
query present.
Examining things from the other side, we consider the conditions that would prevent a DNS
query from being made. There are two of these conditions. First, a DNS request will not be made
if the device initiating the connection already has the DNS name-to-IP address mapping in its
DNS cache. Second, a DNS query will not be made if the device connecting to the DNS name
already has that mapping explicitly specified in its hosts file.
In this case, the problem does indeed lie in the fact that a DNS query is never sent. Both
conditions that would prevent the DNS query from being sent point to an issue
with the computer initiating the communication. Upon further examination of the client
computer it is found that the hosts file on the computer has an entry for www.google.com,
associating it to the internal IP address 172.16.0.102.
This erroneous entry is the source of the user’s problems. A computer will typically refer to
its hosts file as an authoritative source for DNS name to IP address mappings and will check it
first before querying an outside source. In this case the user's computer checked its hosts file,
found the entry for www.google.com in that file, and presumed that
www.google.com was actually on its own local network segment. It then sent an ARP request for
that host, received a response, and attempted to initiate a TCP connection to it on port 80. The
remote system was not configured as a web server and would not accept the connection attempts.
Once the hosts file entry was removed, the user’s computer began communicating correctly and
was able to access www.google.com.
You can examine your hosts file manually on a Windows system by opening
C:\Windows\System32\drivers\etc\hosts, or on a Linux system by viewing /etc/hosts.
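For illustration, an entry like the one found on this computer would look something like the hypothetical example below (the address and name mirror this scenario), and a one-line Python check shows what the local resolver actually returns:

# Example hosts file entry (hypothetical, mirroring this scenario):
#     172.16.0.102    www.google.com

import socket

# gethostbyname() consults the hosts file (and local cache) before DNS,
# so an internal address here points to a hosts file or cache override.
print(socket.gethostbyname("www.google.com"))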
This scenario is actually very common, and it is a technique that malware has been using for
years to redirect users to websites hosting malicious code. Imagine if an attacker were to modify
your hosts file so that every time you went to do your online banking you were actually redirected to
a fake site set up to steal your account credentials!
Lessons Learned
As you analyze more traffic you will begin to learn the nuances of protocols. Not only
will you learn what makes them work, but in cases like this one, you will learn what can keep
them from working. Here, the DNS query not being sent was no fault of the network router, the
remote web server, or even the DNS server. Rather, the misconfigured client was to blame.
Examining this issue at the packet level allowed us to recognize some key items that
would not otherwise have been apparent. We were able to quickly spot an IP address that was
unknown to us and to determine that DNS, a key component of this
communication process, was missing from the communication sequence. Using those pieces of
information, the client was identified as the source of the problem, and a thorough check of its
configuration could follow.
No Internet Access 3
One last time, we’ve encountered a user complaining of not having Internet access from their
workstation. This user was able to narrow this issue down to a single website rather than the
entire Internet and it appears that the user is unable to access http://www.google.com. Upon
further investigation you’ve found that this issue is not limited to this individual user, but rather,
nobody in the organization can access Google domains. The network has the same characteristics
as the last two scenarios we’ve looked at.
Tapping into the Wire
This issue will require us to browse to http://www.google.com in order to generate traffic so we
can troubleshoot it. The issue is network-wide, which means it is also affecting your
computer as the IT administrator. At this point you can't necessarily rule out
something like a massive malware infection, so it isn't a good idea to sniff directly from your own
device. In a situation like this, where you cannot trust any device connected to the network, a tap
is once again the best solution because it allows us to be completely passive after a brief interruption
of service. The file created from this capture is nowebaccess3.pcap.
Analysis
This packet capture begins with DNS traffic instead of the ARP traffic we are used to seeing.
The first packet in the capture is to an external address and there is a reply from that address in
packet two, so we can assume that the ARP process has already happened and the MAC to IP
address mapping for our gateway router already exists in the ARP cache of the host at
172.16.0.8.
The first packet in the capture is from the host 172.16.0.8 to the destination address 4.2.2.1
and is a DNS packet. Examining the contents of the packet, you will see that it is a query for
the A record for www.google.com, as seen in Figure 7-25.
Figure 7-25: DNS query for www.google.com A record
The response to the query is received from 4.2.2.1 and is the second packet in the capture
file, seen in Figure 7-26. If you examine the packet details window you will see that the name
server that responded to this request provided multiple answers to the query. At this point, all
is well and communication is occurring as it should.
Figure 7-26: DNS reply with multiple A records
Now that the user’s computer has determined the IP address of the web server it can attempt
to communicate with it. This process is initiated in packet three (Figure 7-27), which is a TCP
packet sent from 172.16.0.8 to 74.125.95.105 . This destination address comes from the first A
record provided in the DNS query response seen in packet two. The TCP packet that is sent has
the SYN flag set , and it attempting to communicate to the remote server on port 80 .
Figure 7-27: SYN packet attempting to initiate a connection on port 80
Based upon our knowledge of what the TCP handshake process should look like, we would
expect a TCP SYN/ACK packet to be sent in response. Instead, a short
time elapses and another SYN packet is sent from source to destination. This process
occurs one more time after approximately a second (Figure 7-28); then communication stops, and
the browser presents a message stating that the website could not be found.
Figure 7-28: The TCP SYN packet is attempted three times in total with no
response received
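A quick way to distinguish this failure mode from the one in the previous scenario is a plain TCP connection attempt with a timeout. This is just a sketch using Python's standard socket module, with the address taken from the DNS response in this capture.

import socket

try:
    socket.create_connection(("74.125.95.105", 80), timeout=3).close()
    print("Connected -- the web server is answering")
except socket.timeout:
    print("Timed out -- no SYN/ACK and no RST, as seen in this capture")
except ConnectionRefusedError:
    print("Refused -- the host sent a RST, as in the previous scenario")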
In troubleshooting this scenario we know that workstations within our network are able to
connect to the outside world because the DNS query to our external DNS server at 4.2.2.1 is
successful. The DNS server responds with what appear to be valid addresses, and our host
attempts to connect to one of those addresses. Based upon this information, the local workstation
we are attempting the connection from appears to be functioning as it should.
The remote server simply isn't responding to our connection requests. In this case you would
typically expect a TCP RST packet to be sent in reply, but that doesn't happen here. There are a
multitude of reasons this might occur, such as a misconfigured web server, a corrupted protocol
stack on the web server, or a packet filtering device on the remote network (such as a firewall).
These are all things on the remote network that are out of our control. As it turned out, the
web server was not functioning correctly, and nobody attempting to access it was able to do so.
When the problem was corrected on Google's end, communication was able to proceed as it
should.
Lessons Learned
In this scenario the problem was not something that was in our power to correct. Our analysis
determined that the issue was not with any of the hosts on our network, our router, or the external
DNS server providing name resolution services to us. The issue lay outside of our network
infrastructure.
The cause of the problem wasn't immediately apparent, but sometimes finding out that the problem
isn't really our problem can not only relieve some stress but also save face when management
comes knocking. I have fought with many ISPs, vendors, and software companies that
claim an issue is not their fault, but packets never lie, and this scenario is a great example of
that.
Inconsistent Printer
This scenario begins with a call from your IT help desk administrator, who is having trouble
resolving a printing issue. He has received multiple calls from users in the sales department who
are reporting that the high-volume sales printer is functioning inconsistently. Whenever a user
sends a large print job to the printer, it prints several pages and then unexpectedly stops
printing before the job is done. Multiple driver configuration changes have been attempted with
no positive result. The help desk staff is requesting that you take a look at the issue to ensure that
it isn't a network problem.
Tapping into the Wire
The common thread in this problem is the printer, so we want to place our sniffer as
close to the printer as we can. Obviously, we cannot install Wireshark on the printer itself. In this
scenario, the switches used in the network are advanced layer 3 switches, so we can use port
mirroring. The port that the printer is connected to will be mirrored to an empty port, into which
we will plug a laptop with Wireshark installed. Once this is done, a user will send a large
print job to the printer and the output will be monitored. The capture file that results is called
inconsistent_printer.pcap.
Analysis
A TCP handshake between the network workstation sending the print job (172.16.0.8) and the
printer (172.16.0.253) initiates the connection at the start of the capture file. Following the
handshake, a TCP data packet is sent to the printer (Figure 7-29) containing
1460 bytes of data. The amount of data can be seen at the far right side of the Info column in
the packet list window, or at the bottom of the TCP header information in the packet details
window.
Figure 7-29: Data being transmitted to the printer over TCP
Following packet 4, another data packet is sent containing 1460 bytes of data (Figure 7-30).
This data is acknowledged by the printer, and the process continues with data being sent
and acknowledgements sent in reply.
Figure 7-30: Normal data transmission and TCP acknowledgements
The flow of data continues normally as you proceed through the packet capture until you get
to the last two packets in the capture. Packet 121 is a TCP retransmission packet, which is the
first sign of trouble, seen in Figure 7-31.
Figure 7-31: These TCP retransmission packets are a sign of a potential problem
A TCP retransmission packet is sent whenever a device sends a TCP packet to another device
and the remote device never sends an acknowledgement back to indicate that the data was received.
Once the retransmission timer expires, the sending device assumes that the remote device never
received the data and retransmits the packet. This occurs a few times before the communication
process effectively stops. The mechanics and calculation of this timeout period will be discussed
in more detail in the next chapter.
The retransmission here is sent from the client workstation to the printer, indicating that the
printer never sent an acknowledgement back for data that was transmitted to it. If you expand the
SEQ/ACK analysis portion of the TCP header, along with everything listed under it, you can
view the details of why this is a retransmission.
According to the SEQ/ACK analysis details processed by Wireshark, packet 121 is actually a
retransmission of packet 120. Additionally, the retransmission timeout (RTO) for the packet
being retransmitted was around 5.5 seconds.
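Wireshark flags these for us, but the underlying logic is simple enough to sketch yourself: a data-carrying segment that reuses a sequence number already seen in the same direction is a likely retransmission. The snippet below assumes scapy and the capture file name used in this section, and it is a heuristic rather than a reimplementation of Wireshark's analysis.

from scapy.all import rdpcap, IP, TCP, Raw

packets = rdpcap("inconsistent_printer.pcap")
seen = set()

for number, pkt in enumerate(packets, start=1):   # number packets from 1, like Wireshark
    if IP in pkt and TCP in pkt and Raw in pkt:
        key = (pkt[IP].src, pkt[IP].dst, pkt[TCP].sport, pkt[TCP].dport, pkt[TCP].seq)
        if key in seen:
            print("Packet", number, "re-sends seq", pkt[TCP].seq, "-- likely a retransmission")
        seen.add(key)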
When analyzing the delay between packets you will become very familiar with changing the
time display format to make it more appropriate for your situation. In this
case, we are curious to see how long after the previous packet each retransmission
occurred, so we will change this option.
Select View from the main drop-down menu
Go to Time Display Format
Select Seconds Since Previous Captured Packet.
Once you do this you should be able to see clearly that the retransmission in packet 121
occurs 5.5 seconds after the original packet (packet 120) was sent (Figure 7-32).
Figure 7-32: Viewing the time between packets is useful for troubleshooting
The next packet is another retransmission of packet 120. The RTO of this packet is 11.10
seconds, which includes the 5.5 seconds from the RTO of the previous packet. Viewing the Time
column of the packet list window, it appears that this retransmission was sent 5.6 seconds
after the previous retransmission. This is the last packet in the capture file. The
printer stops printing at approximately this time.
In this analysis scenario we have the benefit of dealing with only two devices inside of our
own network, so we just have to determine whether the client workstation or the printer is to
blame. We can see that data flows correctly for quite some time, and then at some point the
printer simply stops responding to the workstation. The workstation gives its best effort to get the
data to its destination, as is evident from the retransmissions, but the printer never answers. This
issue is reproducible and happens regardless of which computer sends a print
job, so we have to assume the printer is the source of the problem.
After further analysis of the printer, it was eventually found that the RAM in the printer was
malfunctioning. Whenever a large print job was sent to the printer, it would print a random
number of pages, likely until certain regions of memory were accessed; the memory
issue would then leave the printer unable to accept any new data, at which point it would stop
communicating with the host transmitting the print job.
Lessons Learned
Although this printer issue was by no means the result of a network problem, we were able to use
Wireshark to analyze the TCP data stream and pinpoint the location of the problem. This
scenario centered solely on TCP traffic rather than DNS or HTTP like previous scenarios, which
means we had to rely solely upon the troubleshooting functions of the TCP protocol. Luckily for
us, TCP is a very well designed protocol and will often leave us some useful information when
two devices simply stop communicating. In this case, when communication abruptly stopped, we
were able to pinpoint the exact location of the problem based on nothing more than TCP's
built-in retransmission functionality. As we continue through other scenarios, we will rely on
functionality just like this to troubleshoot other, more complex issues.
Stranded in a Branch Office
In the past it was common for larger organizations to treat remote branch office locations as
separate entities from an IT standpoint due to the complexities and cost of setting up a large wide
area network (WAN). Now that deploying a WAN has become much more cost-effective,
it is in the best interest of most companies to roll branch offices into the same network
deployment that encompasses central office locations. Although this can save money,
it does bring additional complexities into the configuration and management of the
network's different components, which is what this scenario is centered on.
Our example company consists of a central headquarters (HQ) office and a newly deployed
remote branch office. The company's IT infrastructure is mostly contained within the
central office. The company operates using a Windows Server-based domain, which will be
extended to the branch office by using a secondary domain controller. That domain controller is
responsible for handling DNS and authentication requests for users at the branch office. It is
also a secondary DNS server and should receive its resource record information
from the upstream DNS servers at the corporate headquarters.
Your deployment team is rolling out the new infrastructure to the branch office
when you find out that nobody there is able to access any of the intranet web application servers
on the network. These servers are located at the main office and are accessed through the WAN
connection. The issue is affecting all of the users at the branch office and is limited to just these
internal servers; all users are able to access the Internet and other resources within the branch.
Tapping into the Wire
The problem lies in communication between the main and branch offices, so there are a couple of
places we could collect data from to start tracking down the problem. The problem could be with
the clients inside the branch office, so we will start by port mirroring one of those computers to
see what it sees on the wire. Once we've collected information, we can use it to point us toward
other collection points that might help us solve the problem. The initial capture file obtained
from one of the clients is called stranded_clientside.pcap. This is a complicated scenario
involving multiple sites, so a simplified network map has been made available to us in Figure 7-33.
Figure 7-33: The network map shows all of the relevant components for this
issue
Analysis
The first capture file we are examining begins when the user at the workstation address
172.16.16.101 attempts to access an application hosted on the HQ App Server, 172.16.16.200.
This capture is very small and only contains two packets. It appears as though a DNS request is
sent to 172.16.16.251 for the A record for "AppServer" in the first packet (Figure 7-34).
This is the DNS name for the server at 172.16.16.200 in the central office.
Figure 7-34: Communication begins with a DNS query for the AppServer ‘A’
record
The response to this packet is not the typical DNS response we would expect; instead, it is a
server failure, which indicates that something is preventing the DNS query from
completing successfully (Figure 7-35). Notice also that this packet does not contain any answers
to the query, since it resulted in an error.
Figure 7-35: The query response indicates a problem upstream
We now know that communication between the users at the branch office and HQ is not
occurring because of some DNS-related issue. The DNS queries at the branch office are resolved
by the DNS server at the address 172.16.16.251, so that’s our next stop.
In order to capture the appropriate traffic from the branch DNS server we will leave our
sniffer in place and simply change the port mirroring assignment so that the server traffic is
mirrored to our sniffer instead of the workstation, resulting in the file stranded_branchdns.pcap.
This file begins with the query and response we saw earlier, along with one additional packet.
This packet looks a bit odd because it is attempting to communicate with the primary DNS server at
the central office (172.16.16.250) on the standard DNS server port 53, but it is not UDP
like we are used to seeing (Figure 7-36).
Figure 7-36: This SYN packet uses port 53 but is not UDP
In order to figure out the purpose of this packet we must think back to our discussion of DNS
in chapter six. DNS uses UDP almost exclusively, but it does use TCP in a couple of cases. TCP
is used when the response to a query exceeds a certain size, but in those cases we will see some
initial UDP traffic that is the stimulus for the TCP traffic. The other instance when TCP is used
for DNS is a zone transfer, in which resource records are transferred between DNS servers. That is
likely the case here, as it fits our scenario.
The DNS server at the branch office location serves as a slave to the DNS server at the
central office, meaning that it relies upon it in order to receive resource records. The application
server that users in the branch office are trying to access is located inside the central office,
which means that the central office DNS server is authoritative for that server. In order for the
branch office server to be able to resolve a DNS request for the application server, the DNS
resource record for that server must be transferred from the central office DNS server to the
branch office DNS server. This is likely the source of the SYN packet in this capture file.
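If you want to test the zone transfer path directly rather than waiting for the secondary server to try it, the dnspython library can request one on demand. This is a hedged sketch: it assumes dnspython is installed, and the server address and zone name shown here are placeholders based on this scenario.

import dns.query
import dns.zone

HQ_DNS_SERVER = "172.16.16.250"      # primary DNS server at the central office
ZONE_NAME = "corp.example.com"       # hypothetical internal zone name

try:
    transfer = dns.query.xfr(HQ_DNS_SERVER, ZONE_NAME, timeout=5)
    zone = dns.zone.from_xfr(transfer)
    print("Zone transfer succeeded;", len(zone.nodes), "names received")
except Exception as exc:
    print("Zone transfer failed:", exc)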
Now we know that this DNS problem is a product of a failed zone transfer between the
branch and central office DNS servers. This is quite a bit more information than we had before,
but we can go one step further by figuring out why the zone transfer is failing. The possible
culprits for the issue can be narrowed down to the routers between the offices or the central
office DNS server itself. In order to figure this out we can sniff the traffic of the central office
DNS server to see if the SYN packet is even making it to the server.
I’ve not included a capture file for the central office DNS server traffic because there was
none. The SYN packet never reached the server. Upon dispatching technicians to review the
configuration of the routers connecting the two offices it was found that the central office router
was configured to only allow UDP traffic inbound on port 53 and block TCP traffic inbound on
port 53. This simple misconfiguration prevented zone transfers from occurring between servers
which prevented clients within the branch office from resolving queries for devices in the central
office.
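The asymmetry the technicians found can also be demonstrated from the branch office with two quick probes: a normal UDP query on port 53 and a bare TCP SYN to the same port. This sketch assumes scapy, root privileges, and the same placeholder server address as above.

from scapy.all import IP, UDP, TCP, DNS, DNSQR, sr1

target = "172.16.16.250"   # hypothetical central office DNS server

# UDP query on port 53 -- the traffic the router permits.
udp_reply = sr1(IP(dst=target) / UDP(dport=53) /
                DNS(rd=1, qd=DNSQR(qname="example.com")),
                timeout=2, verbose=0)

# TCP SYN to port 53 -- the traffic the router was silently dropping.
tcp_reply = sr1(IP(dst=target) / TCP(dport=53, flags="S"), timeout=2, verbose=0)

print("UDP/53:", "answered" if udp_reply else "no response")
print("TCP/53:", "answered" if tcp_reply else "no response")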
Lessons Learned
You can learn a lot about investigating network communication issues by watching crime
dramas on TV. Whenever a crime occurs, detectives start directly at the source and
interview those most affected by the situation. Based upon that examination, leads are developed
that further focus the efforts of the investigation. This process continues until a culprit is
hopefully found. This scenario was a great example of that. We started by examining the victim
(the workstation) and established leads by finding the DNS communication issue. Those leads took
us to the branch DNS server, then to the central DNS server, and finally to the router that was the
source of the problem. When you are doing analysis, try thinking of packets as potential clues.
The clues don’t always tell you who committed the crime, but through a few degrees of
separation you can get there eventually.
Ticked off Developer
Some of the most common arguments that occur in information technology are those between
developers and system administrators. Developers always blame shoddy network setup and
malfunctioning equipment for program malfunctions, while system administrators tend to blame
bad code for network errors and slow communication. This scenario is no different and is
focused on a dispute between a programmer and a sysadmin.
The programmer in our scenario has developed an application that is responsible for tracking
the sales of multiple stores and reporting back to a central database. In an effort to save
bandwidth during normal business hours this is not a real-time application. Reporting data is
accumulated throughout the day and is transmitted at night in the form of a comma-separated
value (CSV) file that can be received by the application and processed for insertion into the
database.
This newly developed application is not functioning as it should. The files being
sent from the stores are being received by the server, but the data being inserted into the database
is not correct: sections of data are missing, and other data ends up in the wrong place.
Much to the dismay of the sysadmin, the programmer has blamed the network
for the issue. He is convinced that the files are becoming corrupted while in transit from the
stores to the central data repository. Our goal is to prove him wrong.
Tapping into the Wire
There are two options for collecting this data: we can capture packets at one of the individual
stores or at the central office. The issue affects all of the stores, so if the
issue were indeed network related, it would most likely lie at the central office, since that is the only
common thread among all the stores. The network switches support port mirroring, so we will mirror
the port the server is plugged into and sniff its traffic. The traffic capture will be isolated to a
single instance of a store uploading its CSV file to the collection server. This instance can be
found in the capture file tickedoffdeveloper.pcap.
Analysis
We don’t know anything about the application the programmer has developed other than the
basic flow of information on the network. When the file is initially opened it appears to start with
some FTP traffic so we will want to investigate that to see if that is indeed the mechanism that is
transporting this file. This is a good place to examine the communication flow graph to get a nice
clean summary of the communication that is occurring. In order to do this:
Select Statistics from the main drop-down menu
Select Flow Graph
Click OK
The flow graph is shown in Figure 7-37.
Figure 7-37: The flow graph gives a quick depiction of the FTP communication
Based upon this flow graph we see that a basic FTP connection is set up between
172.16.16.128 and 172.16.16.121. Since it is 172.16.16.128 that initiates the connection,
we can assume that it is the client and that 172.16.16.121 is the server that compiles and
processes the data. Perusing the flow graph confirms that this traffic uses the FTP
protocol exclusively.
We know that some transfer of data should be happening here, so we can use our knowledge
of FTP to locate the packet where this transfer begins. The FTP connection and data transfer are
initiated by the client, so we should be looking for the FTP STOR command, which is used to
upload data to an FTP server. The easiest way to find this is to build a filter.
There are a couple of ways to build the filter we need including building a filter with the
expression builder, but there is a quicker method. This capture file is littered with FTP request
commands so rather than sorting through the hundreds of protocols and options in the expression
builder we can build the filter we need straight from the packet list window.
In order to do this we first need to select a packet with an FTP request command present.
There are a lot to choose from including packets 5, 7, 11, 13, 15, and more. We will choose
packet 5 since it’s near the top of our list.
Select packet 5 in the packet list window
Once selected, expand the FTP section in the packet details window and expand the
USER section
Right-click the Request Command: USER field
Select Prepare a Filter
Select Selected
This will prepare a filter for all packets that contain the FTP USER request command
and place it in the filter dialog. Of course, this isn't the exact filter we want, but all that has to be
done is to edit the filter by replacing the word USER with the word STOR (Figure 7-38).
Figure 7-38: This filter helps identify where data transfer begins
Once this filter is applied by pressing the ENTER key, we see that only one instance of the
STOR command exists in the capture file, and it occurs at packet 64. Now that we know where the
data transfer begins, the filter can be cleared by clicking the Clear button above the packet list
window.
Examining the capture file beginning with packet 64, we can clearly see that this packet
specifies that the file stre4829-03222010.csv is being transferred (Figure 7-39).
Figure 7-39: The CSV file is being transferred using FTP
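The same search can be done outside of Wireshark by scanning the packet payloads for the STOR command. This sketch assumes scapy and the capture file name used here; FTP commands travel as plain text, so a simple prefix match on the TCP payload is enough.

from scapy.all import rdpcap, TCP, Raw

packets = rdpcap("tickedoffdeveloper.pcap")

for number, pkt in enumerate(packets, start=1):   # number packets from 1, like Wireshark
    if TCP in pkt and Raw in pkt and pkt[Raw].load.startswith(b"STOR"):
        print("Packet", number, ":", pkt[Raw].load.decode(errors="replace").strip())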
The packets following the STOR command use a different port but are identified as
being part of an FTP-DATA transmission. Thus far we have verified that data is actually being
transferred, but that doesn't help us reach our goal of proving the programmer wrong. In order to
do that, we have to verify that the contents of the file are sound after traversing the network.
In order to do this, we are going to extract the transferred file in its entirety from the captured
packets. Although this may seem a bit odd, remember that when a file is transferred across the
network in an unencrypted format, it is broken down into segments and
reassembled when it reaches its destination. In this scenario we captured the packets as they were
reaching their destination but before they were processed by the application responsible for
reassembling the segments. This means that the data is all there; all we have to do is reassemble
it.
This is achieved reasonably easily by extracting the file as a data stream. In order
to do this:
Select any of the packets in the FTP-DATA stream (such as packet 66)
Click Follow TCP Stream
The results of this are displayed in the TCP stream in Figure 7-40.
Figure 7-40: The TCP stream shows what appears to be the data being
transferred
The data appears in clear text because it is being transferred over FTP and is not in a binary
format, but we can’t ensure that the file is intact based upon viewing the stream alone. In order to
extract it to its original format:
Click the Save As button
Specify the name of the file as it was displayed in packet 64 (Figure 7-41)
Click Save
Figure 7-41: Saving the stream as the original file name
The result of this save operation should be a CSV file that is an exact byte-level copy of the
file as it was transferred from the store system. The file can be verified by comparing the
MD5 hash of the original file with the hash of the extracted file; the two should be the same, as
they are in Figure 7-42.
Figure 7-42: The MD5 hashes of the original file and the extracted file are
equivalent
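If you prefer to do the comparison programmatically, a few lines of Python using the standard hashlib module will do. The file names below are placeholders: the original CSV as it exists on the store system and the copy saved from the Follow TCP Stream window.

import hashlib

def md5sum(path):
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

original = md5sum("stre4829-03222010.csv")   # file taken from the store system
extracted = md5sum("extracted.csv")          # file saved from the TCP stream
print(original)
print(extracted)
print("Files match" if original == extracted else "Files differ")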
Once the files are compared we have the information needed to prove that the network is not
to blame for the database corruption occurring within the application. The file is being
transferred from the store to the collection server and is intact when it reaches the server, so any
corruption must be occurring during the processing of the file.
Lessons Learned
The great thing about examining things at the packet level is that you don't have to deal with the
clutter of applications. The number of poorly coded applications in the wild greatly outnumbers the
number of good ones, but at the packet level none of that matters. In this case, the
programmer was concerned about all of the mysterious components his application was
dependent upon, but at the end of the day his complicated data transfer, built with hundreds of lines
of code, is still no more than FTP, TCP, and IP. Using what we know about these basic protocols,
we were able to ensure the communication process flowed correctly and even extract files to
prove that the network is sound. It's crucial to remember that no matter
how complex the issue at hand, it's still just packets.
Final Thoughts
In this chapter we’ve covered several basic scenarios where packet analysis has allowed us to
gain a better understanding of the communication we are participating in. Using basic analysis of
common protocols we were able to track down and solve network problems in a timely manner.
Most of these issues could have been solved by other means, but not nearly as efficiently. You
will likely never encounter these exact scenarios on your own network, but hopefully some of the
analysis techniques seen here can be applied to future problems you experience.