ENIC4023 – Networked Embedded Systems Development

TCP Sockets Lab

Note: the examples in this lab must be run on networked systems with IP addresses. Standalone PCs and some laptop systems without permanent IP addresses are not suitable.

Overview of TCP Sockets

Data is transmitted across the internet in packets of finite size called datagrams. Each datagram contains a header and a payload. The header contains the address and port to which the packet is going, the address and port from which the packet came, and various other housekeeping information used to ensure reliable transmission. The payload contains the data itself. However, since datagrams have finite length, it is often necessary to split the payload across multiple packets and reassemble it at the destination. It is also possible that one or more packets may be lost or corrupted in transit and need to be retransmitted, or that packets will arrive out of order. Keeping track of this (splitting the payload into packets, generating headers, parsing the headers of incoming packets, keeping track of what packets have and haven't been received, etc) is a lot of work and requires a lot of intricate code.

TCP sockets do all this and make it transparent to the programmer. TCP sockets allow a network connection to be treated as just another stream onto which bytes can be written and from which bytes can be read.

Java has classes to support both TCP and UDP sockets. This lab deals exclusively with TCP sockets. UDP sockets work rather differently and are covered in another lab.

The Java Socket Class

A socket is a connection between two hosts. One host will be a client, the other a server; once the connection is established the difference is irrelevant. A connection is initiated by a client making a connection request to a server. The server does not know what clients will attempt to connect to it or when they will do so.
A server therefore has additional complexity in that it must continuously "listen" to the network interface in case a client tries to connect to it. To begin with we will concentrate on writing client applications and save server applications until later in the lab.

The Java Socket class uses TCP to perform the following basic operations:

(1) Connect to a remote machine
(2) Send data
(3) Receive data
(4) Close a connection

An abridged class diagram for the Socket class is shown (see Java API docs for full listing):

Socket
<<constructors>>
+ Socket(String host, int port)
+ Socket(InetAddress host, int port)
....
<<query>>
+ InetAddress getInetAddress()
+ int getPort()
+ InputStream getInputStream()
+ OutputStream getOutputStream()
+ boolean isClosed()
....
<<update>>
+ void close()
....

Connection to the remote machine is usually (but not always) made when the constructor is executed. The two constructors shown differ in how the host address is specified: the first takes a String object and looks up the IP address via the DNS server, the second takes an InetAddress object directly.

The getOutputStream() method returns an OutputStream object for writing data from your application to the other end of the socket. You usually chain this stream to a more convenient class like DataOutputStream or OutputStreamWriter before using it. For performance reasons the stream is also often buffered.

The getInputStream() method returns an input stream that can read data from the socket into a program. This is usually chained to a filter stream or reader that offers more functionality, eg DataInputStream or InputStreamReader. It can also be an aid to performance to buffer the stream by chaining a BufferedInputStream or BufferedReader.

A socket will be closed automatically when a program ends, or when it is garbage collected.
However it is bad practice to rely on this, especially for programs that may run for an indefinite period of time, or are likely to breach the system's maximum number of open sockets. The close() method can be called to disconnect a socket; ideally this should be put in a finally block so that the socket is closed whether or not an exception is thrown.

Example – Port Scanner

This example illustrates the use of client sockets. The purpose of the program is to scan through the first 99 ports of a host to find those that have a server running on them. This is achieved by having a loop in which a client socket is instantiated that tries to connect to a server on the specific host port. If the connection is established successfully, it shows there is a server running on that port. If however an exception occurs, then there is no server running on that port.

The PortScanner class only has a main function. This takes, as a command line argument, the name of the host system to be scanned (eg media.paisley.ac.uk); if no argument is passed the localhost (ie the system the program is running on) is scanned.

The hostname entered by the user (or localhost if not) is a string, not an IP address; it has to be looked up in the Domain Name System (DNS) server. There is a socket constructor that takes the hostname as a string and looks up the IP address, but since we are instantiating 99 sockets each connecting to the same host (or at least trying to), it makes sense to do the lookup just once. The InetAddress class is Java's way of storing IP addresses; it doesn't have an explicit constructor but calling the getByName() method will cause the lookup to be done and an InetAddress object initialised. This has been done in a try clause as it can throw an UnknownHostException.

Within the loop that goes round 99 times an attempt is made to instantiate a socket connected to a specific port of the host.
A constructor which takes the InetAddress object is used so that the IP address already looked up is used. If the socket successfully connects an appropriate message is output; if not, ie if an IOException occurs, no message is output; instead a dot is output to show progress as this program will typically run quite slowly.

    import java.net.*;
    import java.io.*;

    public class PortScanner {
        public static void main(String[] args) {
            String host = "localhost";
            if (args.length > 0) host = args[0];
            try {
                // look the address up once and reuse it for every port
                InetAddress theAddress = InetAddress.getByName(host);
                for (int i = 1; i < 100; i++) {
                    Socket s = null;
                    try {
                        s = new Socket(theAddress, i);
                        System.out.println(" ");
                        System.out.println("There is a server on port " + i + " of " + host);
                    } catch (IOException e) {
                        System.out.print("."); // no server on port
                    } finally {
                        try {
                            if (s != null) s.close();
                        } catch (IOException e) {}
                    }
                }
            } catch (UnknownHostException e) {
                System.err.println(e);
            }
        }
    }

Edit, compile, and run this example, checking that it behaves as expected. Use it to scan the media server (media.paisley.ac.uk) to see on which ports it has servers running. Use Telnet to try to access these servers and see if you can find out what kind of servers they are.

Example – Daytime Protocol Client

The Daytime Protocol is defined in RFC 867 (see www.faqs.org/rfc/rfc867.txt). It is very simple: the client opens a socket to port 13 on the daytime server; in response the server sends the time in a human readable format and closes the connection.

Before writing a client application it’s a good idea to check that there is a suitable server available to access (in another lab you will see how to write a Daytime Protocol server application). The server can actually be accessed using telnet: simply type in “telnet”, followed by the hostname, followed by the port number (13), eg:

    telnet ICT-L129-10.msroot.student.paisley.ac.uk 13

You will be advised at the lab of the hostname of the PC on which the server is actually running.
You should get a response something like:

    Tue Jan 14 10:33:01 GMT 2003
    Connection to host lost.

If you cannot get a response from a server with telnet then your client program will not get one either. In this case check with your lecturer.

The code for a Daytime Protocol client is shown (taken from Harold's book p314, slightly modified):

    import java.net.*;
    import java.io.*;

    public class DaytimeClient {
        public static void main(String[] args) {
            String hostname;
            if (args.length > 0) {
                hostname = args[0];
            } else {
                hostname = "localhost";
            }
            try {
                Socket theSocket = new Socket(hostname, 13);
                Reader timeStream = new InputStreamReader(theSocket.getInputStream());
                String timeString = "";
                int c;
                // read() returns -1 once the server has closed the connection
                while ((c = timeStream.read()) != -1) timeString += (char) c;
                System.out.println("It is " + timeString.trim() + " at " + hostname);
            } catch (UnknownHostException e) {
                System.err.println(e);
            } catch (IOException e) {
                System.err.println(e);
            }
        }
    }

All the code is in the main method. The hostname should be supplied as a command line argument; if it isn't it will default to the localhost (ie assumes the server is running on your PC). Within the try block a new socket that connects to port 13 on the server is constructed. This will cause an UnknownHostException if the hostname cannot be resolved, hence the catch block for this; if the host exists but no server is listening on port 13, an IOException is thrown instead and is handled by the second catch block. The client then calls the getInputStream() method of the socket to get its input stream, a reader for which is stored in the variable timeStream. Since the daytime protocol specifies ASCII, the values read from the stream are just cast to char and concatenated onto the String timeString. When the server closes the connection (as the protocol requires it to do) the value -1 will be read from the input stream and the accumulated string is then output on the console.

Enter, compile, and run the above code and check that it behaves as it should.

It is interesting to look at how the Java socket API relates to the packets sent on the network connection.
A packet sniffer such as Ethereal can be used to capture the packets on the network. The capture can be done on either the client host or the server host, or indeed on any host that is connected to the same network broadcast domain (eg connected to the same hub). The only difference in the packets captured on the different hosts will be the timings. Note that the client and server applications need to be running on separate hosts in order that the packets will appear on the network media and hence can be seen by Ethereal. To do a capture simply start up Ethereal, select Start... from the Capture menu, enter host IP_addr in the Capture Filter field of the Capture Options dialog that appears (where IP_addr is the IP address of the client host), and click the OK button. Now run the daytime client application, stop the Ethereal capture and you should see the sequence of packets exchanged between the daytime client and server. The packet display you get should be similar to that shown on the following page. There is some other occasional activity on the network which may also appear on your capture, this is for "network housekeeping" and is not of interest to us at the moment. To minimise the chances of this try to have Ethereal capturing packets for as short a period as possible, eg have the daytime client ready to run before starting the Ethereal capture. On the capture shown, the first two packets are an ARP lookup by the client host (146.191.165.23) to get the physical address of the server host (146.191.165.22). This will only occur if the server is not in the client's ARP cache. The next three packets are the three-way handshake that establishes a TCP virtual circuit (connection). Note that all have a length of zero, ie they carry no data. The first is from client to server and has the SYN flag set. The second from server to client has both the SYN and ACK flags set. The third is from client to server with the ACK flag set. 
The sixth packet in the capture is from the server to the client. Being quite a primitive application, the server simply sends out the time data whenever a client connects; it doesn't wait for the client to send some sort of message asking for it. Notice the length is 30, indicating how much data has been sent. Notice also the PSH and ACK control flags are set. The PSH flag is set as this segment contains all the data in this message.

Frame 6 has been selected in the top part of the Ethereal display. This causes the structure of this packet to be shown in the middle part of the display; note the levels of encapsulation (Ethernet frame - IP datagram - TCP segment - data). Selecting the data in the middle part of the display causes the data bytes to be highlighted in the bottom part of the display. It is shown in both hex and ASCII, and the information (in this case the date and time) is clearly visible: packet sniffers can be used for spying!

The last three packets show a disorderly closure of the connection. The server application is quite primitive: once it has sent the date/time information it closes the connection. This causes frame 7 to be sent with the FIN control flag set (note also the sequence number has increased by 30, the quantity of data sent). The client acknowledges with frame 8. However as the server has closed the connection without negotiation with the client (eg the client can't get the data re-sent if it didn't receive it correctly), the client then sends frame 9 with the RST control flag set. In a connection that is opened and closed in an orderly way the RST flag would never be seen.

Exercise

RFC 865 defines the Quote of the Day Protocol (see www.faqs.org/rfc/rfc865.txt). Write a client application for this protocol. A quote of the day server application will be running on the same host as the Daytime server advised earlier. Use Ethereal to capture the packets exchanged by the client and server.
Highlight the data that the server sends (ie the quote of the day text), and do a screen capture (Ctrl - Print Screen). Open an MS Word document and paste this in; it will fit the page better if you first change the Page Setup to landscape. Print this out and keep for reference.

Server Applications

Writing a server application is more difficult than writing a client application. A server doesn't know in advance who will contact it or when they will contact it. A server must be "listening" all the time, which means that it will often have to be run in a thread so that the system can do other tasks as well. Windows 2000 allows you to run many command prompts concurrently so we can run a server in one and other applications in others; this will allow us to write some simple servers to illustrate the principles but more realistic applications will require threads.

The Java ServerSocket Class

A ServerSocket runs on a server and listens for incoming TCP connections. Each ServerSocket listens on a particular port on the server machine. When a client Socket on a remote host attempts to connect to that port, the server wakes up, negotiates the connection between the client and the server, and opens a regular Socket between the two hosts, ie server sockets wait for connections while client sockets initiate connections. Once the server socket has set up the connection, the server uses a regular Socket object to send data to the client. Data always travels over the regular socket.

An abridged class diagram for the ServerSocket class is shown (see Java API docs for full listing):

ServerSocket
....
<<constructor>>
+ ServerSocket(int port)
....
<<query>>
+ boolean isClosed()
....
<<update>>
+ Socket accept()
+ void close()
....
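The division of labour between ServerSocket and Socket described above can be sketched in a few lines. This is only an illustrative sketch, not part of the lab code: the class name AcceptSketch and the method roundTrip are hypothetical, port 0 is used so that the operating system picks a free port, and the client is created in the same program (over the loopback address) purely so the sketch can be run on one machine. It relies on the operating system queuing the pending connection until accept() is called.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class AcceptSketch {

    // Hypothetical helper: the server side sends one line, the client side reads it back.
    static String roundTrip(String message) throws IOException {
        // Port 0 is an assumption for illustration: the OS chooses any free port.
        try (ServerSocket server = new ServerSocket(0)) {
            int port = server.getLocalPort();

            // The client initiates the connection; the OS holds it in the
            // server's queue until accept() is called.
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), port);
                 // accept() hands back a regular Socket for this one connection.
                 Socket connection = server.accept()) {

                // All data travels over the regular Socket, never the ServerSocket.
                Writer out = new OutputStreamWriter(
                        connection.getOutputStream(), StandardCharsets.US_ASCII);
                out.write(message + "\r\n");
                out.flush();

                BufferedReader in = new BufferedReader(new InputStreamReader(
                        client.getInputStream(), StandardCharsets.US_ASCII));
                return in.readLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello from the server"));
    }
}
```

In a real server the client would of course be a separate program on another host, and the ServerSocket would stay open and call accept() repeatedly.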
Example - Daytime Protocol Server

The code for a Daytime Protocol server is shown (taken from Harold's book p357, slightly modified):

    import java.net.*;
    import java.io.*;
    import java.util.Date;

    public class DaytimeServer {
        public static void main(String[] args) {
            try {
                ServerSocket server = new ServerSocket(13);
                Socket connection = null;
                while (true) {
                    try {
                        connection = server.accept();
                        OutputStreamWriter out =
                            new OutputStreamWriter(connection.getOutputStream());
                        Date now = new Date();
                        out.write(now.toString() + "\r\n");
                        out.flush();
                        connection.close();
                    } catch (IOException e) {
                    } finally {
                        try {
                            if (connection != null) connection.close();
                        } catch (IOException e) {}
                    }
                }
            } catch (IOException e) {
                System.err.println(e);
            }
        }
    }

A ServerSocket object called server is instantiated to listen on port 13 of the local host. This is done inside a try block as it will throw an exception if the port is already in use. This ServerSocket object will stay in existence for the lifetime of the program.

This is a very simple server application and so is limited to having only one connection open at a time. The identifier connection is declared to be of type Socket (ie regular Socket as opposed to ServerSocket). The accept method of the ServerSocket object is called and will block until a client initiates a connection. Often the accept method would be called from within a thread so that if the code that communicates between the hosts blocks (eg server waiting for input from client) other connections can be serviced concurrently. This has been dispensed with here as the communication is just one-way and so can't block. This leads to a considerable simplification.

Once a client has connected to the server the accept method will cease to block and communication can now take place. An OutputStreamWriter is opened and a date object obtained. The date is then converted to a string and written to the stream.
To make sure all the data has been completely sent the stream is flushed before the server closes the connection. A finally block is included to ensure that the connection is closed under all circumstances.

Edit, compile, and run this example, checking that it behaves as expected. Use the DaytimeClient application from a previous lab session to access the server. The server application can be closed at the command prompt with Ctrl-C.

Exercise

Modify the Daytime server code above to produce a Quote of the Day server which you should compile and run, checking that it behaves as expected. Use the QuoteClient application from the previous lab session to access the server.

Example - Echo Server

The Echo Protocol is defined in RFC 862 (see www.faqs.org/rfc/rfc862.txt). Sample code for an echo client is shown (similar to Harold p320):

    import java.net.*;
    import java.io.*;

    public class EchoClient {
        public static void main(String[] args) {
            String hostname = "localhost";
            String line;
            if (args.length > 0) {
                hostname = args[0];
            }
            try {
                Socket theSocket = new Socket(hostname, 7);
                BufferedWriter netOut = new BufferedWriter(
                    new OutputStreamWriter(theSocket.getOutputStream()));
                BufferedReader netIn = new BufferedReader(
                    new InputStreamReader(theSocket.getInputStream()));
                System.out.println("connected to echo server");
                BufferedReader userIn = new BufferedReader(
                    new InputStreamReader(System.in));
                // a line containing only "." (or end of input) ends the session
                while ((line = userIn.readLine()) != null && !line.equals(".")) {
                    netOut.write(line);
                    netOut.newLine();
                    netOut.flush();
                    System.out.println(netIn.readLine());
                }
            } catch (IOException e) {
                System.err.println(e);
            }
        }
    }

Notes:

(1) Reader and Writer streams are used for network I/O; these are character based. No character set is specified when they are constructed so the platform's default character set will be used. Other character sets could have been used as long as both the input and output were the same.
The echo server should just echo back what it is sent so should be able to cope with any character set encoding.

(2) Depending on the operating system in use (or in use when a file being used was created), the end of line separator may be different. In Unix the end of line separator is a linefeed, in the Macintosh OS a carriage return, and in Windows a carriage return / linefeed pair. Be aware that this is a real-world problem that must be addressed when writing real-world network applications; for now let's just worry about making it work in Windows.

Sample code for an echo server is shown:

    import java.net.*;
    import java.io.*;

    public class EchoServer {
        public static void main(String[] args) {
            try {
                ServerSocket server = new ServerSocket(7);
                Socket connection = null;
                while (true) {
                    try {
                        connection = server.accept();
                        BufferedInputStream in =
                            new BufferedInputStream(connection.getInputStream());
                        BufferedOutputStream out =
                            new BufferedOutputStream(connection.getOutputStream());
                        int b;
                        // read() returns -1 when the client closes the connection
                        while ((b = in.read()) != -1) {
                            out.write(b);
                            out.flush();
                        }
                    } catch (IOException e) {
                    } finally {
                        try {
                            if (connection != null) connection.close();
                        } catch (IOException e) {}
                    }
                }
            } catch (IOException e) {
                System.err.println(e);
            }
        }
    }

Notes:

(1) Input and output streams (buffered) are used for the network I/O; this means that the server is independent of what character set the client is using; the server just echoes back the numeric values it is sent.

(2) Integers (representing the characters being echoed to the client) are written to the BufferedOutputStream one at a time but will be held there until the flush() method is called.

(3) The server will keep a connection open until the client closes it. The read() method then returns -1 (or an IOException occurs if the connection is broken), the inner loop ends, and the server closes its end of the connection in the finally block (for safety) and goes back to waiting for a new connection.

(4) Only one client can be connected to this server at any one time.
To allow many clients to be connected simultaneously the server would have to spawn a separate thread for each connection; this obviously complicates the code (see below).

Edit, compile, and run these two applications (in separate command prompts), and check that they behave as expected. Try using your client application to access someone else's server application over the lab network.

The code for a threaded version of the above echo server is shown below. This allows more than one client to connect without one blocking the others out. This is done by starting a new thread for each connection:

    import java.net.*;
    import java.io.*;

    public class TEchoServer {
        public static void main(String[] args) {
            try {
                ServerSocket server = new ServerSocket(7);
                while (true) {
                    Socket connect = server.accept();
                    // hand the new connection to its own thread
                    new ThreadedEchoHandler(connect).start();
                }
            } catch (IOException e) {
                System.err.println(e);
            }
        }
    }

    class ThreadedEchoHandler extends Thread {
        private Socket connection;

        public ThreadedEchoHandler(Socket sock) {
            connection = sock;
        }

        public void run() {
            try {
                BufferedInputStream in =
                    new BufferedInputStream(connection.getInputStream());
                BufferedOutputStream out =
                    new BufferedOutputStream(connection.getOutputStream());
                int b;
                // read() returns -1 when the client closes the connection
                while ((b = in.read()) != -1) {
                    out.write(b);
                    out.flush();
                }
            } catch (IOException e) {
            } finally {
                try {
                    if (connection != null) connection.close();
                } catch (IOException e) {}
            }
        }
    }

In the above code, every time a new socket connection is established (ie when the call to accept is successful) a new thread is launched to take care of the connection between the server and that particular client. The main program just goes back and waits for the next connection. Each connection starts a new thread, thus multiple clients can connect to the server at the same time.

Edit, compile and run this server. Start several clients and check that you can communicate through all of them simultaneously.

Network Working Group Request for Comments: 867 J.
Postel ISI May 1983 Daytime Protocol This RFC specifies a standard for the ARPA Internet community. Hosts on the ARPA Internet that choose to implement a Daytime Protocol are expected to adopt and implement this standard. A useful debugging and measurement tool is a daytime service. A daytime service simply sends a the current date and time as a character string without regard to the input. TCP Based Daytime Service One daytime service is defined as a connection based application on TCP. A server listens for TCP connections on TCP port 13. Once a connection is established the current date and time is sent out the connection as a ascii character string (and any data received is thrown away). The service closes the connection after sending the quote. UDP Based Daytime Service Another daytime service service is defined as a datagram based application on UDP. A server listens for UDP datagrams on UDP port 13. When a datagram is received, an answering datagram is sent containing the current date and time as a ASCII character string (the data in the received datagram is ignored). Daytime Syntax There is no specific syntax for the daytime. It is recommended that it be limited to the ASCII printing characters, space, carriage return, and line feed. The daytime should be just one line. One popular syntax is: Weekday, Month Day, Year Time-Zone Example: Tuesday, February 22, 1982 17:37:43-PST Another popular syntax is that used in SMTP: dd mmm yy hh:mm:ss zzz Example: 02 FEB 82 07:59:01 PST NOTE: For machine useful time use the Time Protocol (RFC-868). Network Working Group Request for Comments: 865 J. Postel ISI May 1983 Quote of the Day Protocol This RFC specifies a standard for the ARPA Internet community. Hosts on the ARPA Internet that choose to implement a Quote of the Day Protocol are expected to adopt and implement this standard. A useful debugging and measurement tool is a quote of the day service. 
A quote of the day service simply sends a short message without regard to the input.

TCP Based Character Generator Service

One quote of the day service is defined as a connection based application on TCP. A server listens for TCP connections on TCP port 17. Once a connection is established a short message is sent out the connection (and any data received is thrown away). The service closes the connection after sending the quote.

UDP Based Character Generator Service

Another quote of the day service is defined as a datagram based application on UDP. A server listens for UDP datagrams on UDP port 17. When a datagram is received, an answering datagram is sent containing a quote (the data in the received datagram is ignored).

Quote Syntax

There is no specific syntax for the quote. It is recommended that it be limited to the ASCII printing characters, space, carriage return, and line feed. The quote may be just one or up to several lines, but it should be less than 512 characters.

Network Working Group Request for Comments: 862 J. Postel ISI May 1983

Echo Protocol

This RFC specifies a standard for the ARPA Internet community. Hosts on the ARPA Internet that choose to implement an Echo Protocol are expected to adopt and implement this standard.

A very useful debugging and measurement tool is an echo service. An echo service simply sends back to the originating source any data it receives.

TCP Based Echo Service

One echo service is defined as a connection based application on TCP. A server listens for TCP connections on TCP port 7. Once a connection is established any data received is sent back. This continues until the calling user terminates the connection.

UDP Based Echo Service

Another echo service is defined as a datagram based application on UDP. A server listens for UDP datagrams on UDP port 7. When a datagram is received, the data from it is sent back in an answering datagram.
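The TCP echo behaviour described in RFC 862 above ("any data received is sent back ... until the calling user terminates the connection") comes down to a copy loop that stops at end of stream. The following is a minimal sketch of that loop, not the lab's server code: the class name EchoLoop and the method echo are hypothetical, and main feeds it in-memory streams (an assumption made so that it can be run without a network or a listening server).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class EchoLoop {

    // Copies every byte from in to out until read() reports end of stream (-1).
    // On a socket, end of stream means the peer has closed its side of the connection.
    static void echo(InputStream in, OutputStream out) throws IOException {
        int b;
        while ((b = in.read()) != -1) {
            out.write(b);
            out.flush(); // push each byte out promptly, even if out is buffered
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for the socket's input and output streams.
        InputStream in = new ByteArrayInputStream(
                "hello\r\n".getBytes(StandardCharsets.US_ASCII));
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        echo(in, out);
        System.out.print(new String(out.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

With a real connection the same method could be called as echo(connection.getInputStream(), connection.getOutputStream()). Checking for -1 matters: writing the value -1 to an OutputStream sends the byte 0xFF rather than signalling the end of the data.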