ENIC4023 - Networked Embedded Systems Development

Network Utilities Lab

Note: the examples in this lab must be run on a networked system with an IP address. Standalone PCs and some laptop systems without a permanent IP address are not suitable.

Every PC on the lab network has two addresses. These are the physical address (the lab uses Ethernet) and the internet (IP) address. These, together with the Domain Name System (DNS) and the Address Resolution Protocol (ARP), are the basis for communication on an IP network.

Physical (Ethernet) Address

The Ethernet address (also known as a Media Access Control (MAC) address) is a 48-bit address that identifies the Ethernet adapter. Adapter manufacturers are assigned blocks of Ethernet addresses so that every adapter has a unique address that is never reassigned. Depending on the adapter design it may not even be possible to change the Ethernet address on it. More generally the Ethernet address is the "physical address" of the PC. Other technologies are possible for the physical layer networking (eg FDDI, frame relay) but Ethernet is by far the most used in an office/lab environment.

Ethernet addresses are usually shown as bytes separated by a hyphen or a colon, eg:

    00-40-95-44-27-20 (all numbers are HEX)

Check out the Ethernet address of the PC you are working on by entering ipconfig /all at a command prompt. This will give you quite a lot of information about the network settings of your PC. Record the Ethernet (physical) address here:

    __ - __ - __ - __ - __ - __

Compare this Ethernet address with those found by the other students in the lab and confirm its uniqueness (if it isn't unique tell a technician fast).

Internet Address

Note that IP addresses may be changed from 32-bit (IPv4) to 128-bit (IPv6) soon (or may not be). This should not have any effect on the Java networking programs we write here.

The Internet Address is also called an IP address (IP stands for Internet Protocol) or host address.
It is a logical address assigned to a network interface (eg the one on your PC). An IP address is how one computer finds another on a large network. Each node must know its own IP address and that of any other computer with which it will communicate.

An IP address is a unique 32-bit number, which takes up exactly 4 bytes in memory. An IP address is normally written as 4 unsigned bytes, each ranging from 0 to 255, with the most significant byte first. Bytes are separated by periods for the convenience of the human reader (dotted decimal format), eg:

    146.191.100.247

Check out the IP address on your PC by again using ipconfig /all. Record your IP address and subnet mask here:

    IP Address:  ___ . ___ . ___ . ___
    Subnet Mask: ___ . ___ . ___ . ___

IP addresses aren't allocated randomly but rather in blocks allocated to organisations such as a University. Since the size of organisations varies, so does the number of IP addresses needed. Three classes of network (A, B and C) are defined by the range of addresses allocated to them; classes D and E are special cases:

    Address Class                           A          B            C            D*           E*
    binary address begins with              0          10           110          1110         1111
    first term of dotted decimal address    0 to 127   128 to 191   192 to 223   224 to 239   240 to 255

    * Note that class D is used only for multicast addressing and that class E is reserved for experimentation.

Each class of network has a different number of bits of the address allocated to the network ID (the part that defines the block of addresses) and the host ID (the part that distinguishes between addresses within a block):

    Address Class         A     B     C
    bits in network ID    8     16    24
    bits in host ID       24    16    8

The class A format provides a small number of possible network IDs and a huge number of possible host IDs for each network. By contrast a class C network can support only a small number of hosts (256 - 2 = 254), but there are many more class C networks available. Note that certain IP address ranges are not assigned to networks because they are reserved for special uses (see below).

Check out the IP address of your PC, is it class A, B, or C?
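The class rules in the table above can be checked mechanically. The following is a minimal sketch and not part of the original lab: the class name AddressClass and the method classOf are made up for illustration, and it assumes the argument is a well-formed dotted decimal IPv4 address.

```java
public class AddressClass {

    // Returns the address class letter for a dotted decimal IPv4 address,
    // decided purely by the first term (the most significant byte).
    // Hypothetical helper for illustration; the input is assumed well formed.
    static char classOf(String dottedDecimal) {
        int first = Integer.parseInt(dottedDecimal.split("\\.")[0]);
        if (first <= 127) return 'A';   // binary address begins with 0
        if (first <= 191) return 'B';   // begins with 10
        if (first <= 223) return 'C';   // begins with 110
        if (first <= 239) return 'D';   // begins with 1110 (multicast)
        return 'E';                     // begins with 1111 (experimental)
    }

    public static void main(String[] args) {
        // Use the address given on the command line, or a sample address otherwise
        String address = args.length > 0 ? args[0] : "146.191.100.247";
        System.out.println(address + " is class " + classOf(address));
    }
}
```

Run it with the IP address you recorded as the command line argument and compare the result with your own answer.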
My office PC is 146.191.100.247, ie class B. A class B network can have approximately 2^16 = 65536 hosts. This is far too many to be easily managed on a single network segment. It is therefore usual to split this into a number of subnets interconnected by routers.

The subnet to which a host belongs is determined by the IP address in conjunction with the subnet mask. The subnet mask is a 32-bit number which has 1's at the left end and 0's at the right end. The bit position at which the 1's change to 0's is what is significant. This defines a split in the host ID field defined by the network class: those bits of the host ID field which have 1's in the corresponding bits of the subnet mask are "stolen" to act as a subnet ID. As an example consider my office PC again:

    IP address:   146.191.100.247 = 1001 0010 1011 1111 0110 0100 1111 0111
    subnet mask:  255.255.240.0   = 1111 1111 1111 1111 1111 0000 0000 0000
                                  = nnnn nnnn nnnn nnnn ssss hhhh hhhh hhhh

    (where n = network ID, s = subnet ID, h = host ID)

The IP address starts with 10, therefore it is class B. As a class B network the leftmost 16 bits are the network ID, ie 1001 0010 1011 1111. The remaining bits where the corresponding bits of the subnet mask are 1's are the subnet ID, ie 0110. The bits where the corresponding bits of the subnet mask are 0's are the host ID, ie 0100 1111 0111.

(There are "IP Address Calculators" available from many internet sites that purport to make understanding and manipulating IP addresses easier. Put "IP Address Calculator" into altavista.co.uk or another good internet search engine to find one.)

As well as the class D and E addresses there are other addresses that are not available for assigning to host systems:

(1) The address in which the host ID is all zeros is used to identify the subnet. Such an address is called the "subnet number". The subnet number for my office PC would thus be: 146.191.96.0

    Work out the subnet number of your PC and record it here:

    Subnet Number: ___ . ___ . ___ . ___

(2) The address in which the host ID is all ones is used as the "directed broadcast address" for the subnet. This address can be used to send a packet to every host on the subnet. The subnet broadcast address for my office PC would thus be: 146.191.111.255

    Work out the subnet broadcast address of your PC and record it here:

    Broadcast Address: ___ . ___ . ___ . ___

(3) All addresses where the subnet ID is all zeros are called zero subnets; their use is discouraged.

(4) All addresses where the subnet ID is all ones are called broadcast subnets; their use is discouraged.

(5) The first and last networks in each class are reserved (the last class A network address, 127.0.0.0, is used as the loopback address).

(6) Some addresses are reserved for use by "private internets". These are networks that are not connected directly to the internet and so do not need globally unique addresses. These are:

    Address Range                      Class    Number of Networks
    10.0.0.0 to 10.255.255.255         A        1
    172.16.0.0 to 172.31.255.255       B        16
    192.168.0.0 to 192.168.255.255     C        256

Exercise

Fill in the blanks in the following table (check your answers with an IP address calculator):

    IP address                  200.1.1.130
    subnet mask                 255.255.255.224
    network class
    subnet number
    broadcast address
    assignable address range
    no. of subnet bits
    no. of host bits

Domain Name System

IP addresses are difficult for humans to remember. However the Domain Name System (DNS) associates hostnames that humans can remember (like my office PC: ICT-EEP-051.msroot.staff.paisley.ac.uk) with the IP addresses that computers use (146.191.100.247). The DNS is a bit like a telephone directory, but storing the IP address - hostname mapping for all the computers in the world in one place would require a lot of storage! The DNS is instead a distributed system, which also makes it more reliable. The task of keeping track of the address mapping is delegated to DNS servers. To get the IP address from the hostname (or vice-versa) a computer sends a query to its local DNS server which will supply the information.
The local DNS server keeps a record of the mappings for all the local computers, so if the request concerns another local computer, it can reply straight away. If the request concerns a computer that is not local it sends a request to another DNS server, and so on. Strictly speaking DNS servers work with fully qualified domain names, for example my office PC:

    ICT-EEP-051.msroot.staff.paisley.ac.uk

These names are hierarchical, and DNS servers are arranged similarly, so a request can be passed up the hierarchy then back down again to get to the required DNS server.

The DNS server can be queried manually in Windows 2000 with NSLookup. To find out the hostname of your PC enter NSLookup at a command prompt. You will now get a ">" prompt. At this enter the IP address of your PC (as found above) and it should respond with your hostname. Try entering the hostname you have just got (just the bit before the first dot will do) and it should respond with your IP address again. Enter exit to get back to the command prompt. The following shows what happened when I did this on my office PC:

Note that the DNS server (in this case cis13.msroot.staff.paisley.ac.uk) responded with a different hostname from the one ipconfig gave when I ran it on my PC. This just means that my office PC has more than one hostname. When I entered the hostname that ipconfig gave me into NSLookup I got my correct IP address back.

Make a note of the hostname (fully qualified domain name) of your PC below:

You can also use the network utility tracert to check the hostname - IP address mapping. This utility is used for tracing the route a packet takes in getting to a specified destination. Try it out with your favourite website (one nearby is preferable), eg for Glasgow University:

From the above you can see that the IP address of www.glasgow.ac.uk is 130.209.34.12. Type this IP address into an internet browser and check.
You can also use the network utility ping to get an IP address from a hostname (but not the other way round). This utility is used to check that a server is responding. Again try it out with one of your favourite websites, eg for the University of Central Lancashire:

Note: University of Paisley network security policy may not allow tracert or ping to be used with external hosts. These policies are applied in an arbitrary way and change from time to time without notice or explanation. You may not be able to replicate the examples in this labsheet if the policies have been changed since the time of writing. If pinging external hosts is not allowed, try pinging other PCs in the same lab.

Address Resolution Protocol (ARP)

When two IP-enabled devices on the same network segment (in practice the same subnet) want to communicate, they do so using the low-level protocols and addressing mechanisms defined for the medium in use. In the lab this is Ethernet, so Ethernet addresses and protocols are used. In order for IP systems to communicate with each other, they must first be able to identify the physical addresses of the other devices on the same network segment. This service is provided by the Address Resolution Protocol (ARP).

The Java networking classes work at the IP level and so effectively hide the lower level protocols associated with the medium in use. Although we will not explicitly use ARP in our programs, it is still worth understanding as it works in the background.

Each host on a network segment maintains a table in memory called the ARP cache. This associates the IP addresses of other hosts on the same network segment with physical addresses (on the lab network these are Ethernet addresses). When a host needs to send data to another host on the segment, the host checks the ARP cache to determine the physical address of the recipient. You can check the contents of the ARP cache on your PC by entering arp -a at a command prompt.
You should get something like this:

At first sight this might seem to be a rather short list. The reason is that the ARP cache is assembled dynamically, and typically the entries expire after a pre-determined time and are removed.

If the IP address that is to receive the data is not listed in the ARP cache, the host will send a broadcast (ie a frame sent to a special address to which all hosts listen) called an ARP request frame. This contains the unresolved IP address and also the IP and physical addresses of the requester. The host that owns the unresolved IP address responds by sending its physical address to the host that sent the request. The target host may also add the requester to its own ARP cache.

When a host wants to communicate with another host that is not on the same subnet, it simply sends the data to whatever is listed as its default gateway, which will be an IP router that forwards it on in a direction based on the destination IP address.

Packet Sniffing

A Packet Sniffer (Network Protocol Analyser) is a tool used to monitor the traffic on a network segment. In its simplest form it can be a software application run on a PC with a network connection. The packet sniffer captures the packets seen by the PC's network card (whether addressed to the PC or not). The captured packets can then be displayed and analysed.

Hall's book shows screen captures from a packet sniffer called Shomiti Surveyor, a cut-down version of which is on the CD-ROM included with the book. Unfortunately the capture function is disabled in the cut-down version and the full version costs money, hence we will not be using it. However there is a good open-source packet sniffer available called Ethereal and this is installed on the lab PCs (see www.ethereal.com for information and downloads).

The next page shows what packets were captured when I pinged host 146.191.100.246 from my office PC (146.191.100.247). The first is a broadcast ARP packet sent by my office PC.
It knows that 146.191.100.246 is on the same network as itself so it needs to find out 146.191.100.246's physical address. 146.191.100.246 sees the ARP broadcast, recognises its IP address, and replies with a unicast packet back to my office PC with the information that its physical address is 00:60:b0:ed:3b:84. This exchange of ARP packets is only needed because 146.191.100.246 is not in the ARP cache of my office PC. If I pinged 146.191.100.246 a second time shortly after the first there would be no ARP packets sent, as the information needed would already be in the ARP cache.

The remaining 8 packets consist of 4 pairs of ICMP (Internet Control Message Protocol) echo request and echo reply packets. Whenever a host running the TCP/IP protocol suite receives an ICMP echo request packet it replies with an ICMP echo reply packet. The various messages that are supported by ICMP will be explained in the lecture program.

To get a similar capture yourself, start up Ethereal, pull down the Capture menu and select Start. In the dialog that appears type:

    host your_PCs_IP_address

into the Capture Filter field, and click OK. Ethereal will now capture all packets sent to or from your PC. Ping a neighbouring lab PC (make sure it is switched on first), click the Stop button in the Ethereal dialog and the captured packets should be displayed. You may find extra packets have been captured which are not caused by the ping; ignore these.

Example - Internet Addresses

The InetAddress class is Java's encapsulation of an IP address; it is used by most of the other networking classes including those associated with sockets (which we will be looking at in subsequent labs). You can check out the documentation for this class in the usual way. The class has two private attributes: a String representing the host name (eg media.paisley.ac.uk), and a byte array representing the IP address.
Since these attributes are not public they cannot be accessed directly; instead various methods of the class are used. A simple program that uses the InetAddress class to query the Domain Name System (DNS) for the IP address of an internet host is shown:

    import java.net.*;

    public class HostByName {
        public static void main(String[] arg) {
            // Default to the local machine if no host is given on the command line
            String hostname = "localhost";
            if (arg.length > 0) hostname = arg[0];
            try {
                // Resolve the host name (or dotted decimal string) to an InetAddress
                InetAddress host = InetAddress.getByName(hostname);
                System.out.println("host name: " + host.getCanonicalHostName());
                System.out.println("host address: " + host.getHostAddress());
            } catch (UnknownHostException e) {
                System.out.println("Could not find " + hostname);
            }
        }
    }

The InetAddress class does not have any public constructors, however calling the getByName method returns a suitably initialised InetAddress object. The getByName method can throw an UnknownHostException so it needs to be called within a try block.

Similarly, the following shows a program to get the name and IP address of the local host. In this case the getLocalHost method is used to return the suitably initialised InetAddress object:

    import java.net.*;

    public class LocalHost {
        public static void main(String[] arg) {
            try {
                // Ask the system for the address of the machine the program runs on
                InetAddress host = InetAddress.getLocalHost();
                System.out.println("local host name: " + host.getCanonicalHostName());
                System.out.println("local host address: " + host.getHostAddress());
            } catch (UnknownHostException e) {
                System.out.println("Could not find local host");
            }
        }
    }

Exercise

Write a simple Java application that takes a hostname or IP address as a command line argument and outputs the next ten consecutive IP addresses together with their host names if allocated.

Hint: the getAddress method of the InetAddress class returns a byte array representing the IP address. Byte [3] is the least significant and byte [0] the most significant.
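One point the hint leaves implicit is that Java's byte type is signed, so a term such as 146 comes back from getAddress as a negative value. The sketch below is an illustration of that point rather than a solution to the exercise. The class name UnsignedBytes and the method toDottedDecimal are made up for this example, and the address is built with getByAddress so that no DNS lookup is needed.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class UnsignedBytes {

    // Converts the byte array returned by getAddress into dotted decimal text.
    // Masking with 0xFF turns each signed byte into its unsigned value (0 to 255).
    static String toDottedDecimal(byte[] raw) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < raw.length; i++) {
            if (i > 0) sb.append('.');
            sb.append(raw[i] & 0xFF);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnknownHostException {
        // getByAddress builds an InetAddress from raw bytes without a name lookup
        byte[] raw = { (byte) 146, (byte) 191, (byte) 100, (byte) 247 };
        InetAddress host = InetAddress.getByAddress(raw);

        byte[] bytes = host.getAddress();
        System.out.println("byte [0] as a signed value:    " + bytes[0]);
        System.out.println("byte [0] as an unsigned value: " + (bytes[0] & 0xFF));
        System.out.println("address: " + toDottedDecimal(bytes));
    }
}
```

The same masking is needed before doing any arithmetic on the terms of an address, otherwise terms of 128 and above behave as negative numbers.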
Uniform Resource Locator (URL)

An IP address just specifies the address of a host on a network; a URL also specifies the location of a resource (file) on that host and the scheme (protocol) to be used to access it. An example of a URL is:

    http://media.paisley.ac.uk/~johnstone/em_sys/index.html

This can be broken into its component parts as follows:

    http://                 the scheme (or protocol)
    media.paisley.ac.uk     the server name
    /~johnstone/em_sys/     path that leads to the file
    index.html              file name

This specifies the file index.html, accessed via the path /~johnstone/em_sys/ from the server root, the host being media.paisley.ac.uk, and the application protocol to be used to access the resource being HTTP.

URLs are useful for accessing resources on a network. Internet browsers use URLs to access internet sites. Java has a class called URL (in the java.net package) which is an abstraction of a URL. The URL class provides a simple way for a Java program to locate and retrieve data from a network. The class declaration is:

    public final class URL extends Object implements Serializable

The simplest URL class constructor takes a URL in string form as its single argument:

    public URL(String url) throws MalformedURLException

Example - URL

The following example (SourceViewer.java) reads a URL from the command line, opens an InputStream from that URL, buffers it, chains the resulting InputStream to an InputStreamReader using the default encoding, and then uses InputStreamReader's read() method to read successive characters from the file, each of which is printed on System.out. That is, it prints the raw data located at the URL: eg if the URL references an html file, the program's output is raw html. Check that it performs as expected.
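    import java.net.*;
    import java.io.*;

    public class SourceViewer {
        public static void main(String[] args) {
            if (args.length > 0) {
                try {
                    // Parse the URL and open a raw byte stream to the resource
                    URL u = new URL(args[0]);
                    InputStream in = u.openStream();
                    // Buffer the input to improve performance
                    in = new BufferedInputStream(in);
                    // Chain to a Reader that decodes bytes using the default encoding
                    Reader r = new InputStreamReader(in);
                    int c;
                    // read() returns -1 when the end of the stream is reached
                    while ((c = r.read()) != -1) {
                        System.out.print((char) c);
                    }
                    // Closing the Reader also closes the underlying streams
                    r.close();
                } catch (MalformedURLException e) {
                    System.err.println(args[0] + " is not a parseable URL");
                } catch (IOException e) {
                    System.err.println(e);
                }
            }
        }
    }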
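The component breakdown given earlier for the example URL can also be obtained from the URL class itself. The following is a minimal sketch rather than part of the original lab: the class name URLParts and its two helper methods are made up for illustration, and splitting the path at the last '/' to separate the directory from the file name is an assumption of this example, not something the URL class does for you. Constructing a URL object only parses the string, so no network connection is made.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class URLParts {

    // Returns the directory part of a URL path, up to and including the last '/'.
    static String directoryOf(String path) {
        return path.substring(0, path.lastIndexOf('/') + 1);
    }

    // Returns the file name part of a URL path, ie everything after the last '/'.
    static String fileNameOf(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) throws MalformedURLException {
        // Use the URL given on the command line, or the example URL otherwise
        String spec = args.length > 0 ? args[0]
                : "http://media.paisley.ac.uk/~johnstone/em_sys/index.html";
        URL u = new URL(spec);

        System.out.println("scheme:      " + u.getProtocol());
        System.out.println("server name: " + u.getHost());
        System.out.println("path:        " + directoryOf(u.getPath()));
        System.out.println("file name:   " + fileNameOf(u.getPath()));
    }
}
```

Try it with a few URLs of your own and compare the output with a breakdown done by hand.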