Date: March 27, 2012 Management Overview

Networking in Operating Systems

Author: Vaughn Friesen
Date: March 27, 2012

Management Overview

The operating system has many tasks to perform in order to connect to a network or the internet. Generally it provides parts of the transport and network layers, as well as drivers for network cards and other hardware.

The transport layer is responsible for end-to-end communication. It may provide several features, including:

- Reliable transmission
- Flow control
- Congestion control

The protocol used dictates which of these features are provided, as some applications do not need them all or want to implement certain features themselves.

Usually the operating system makes the internet transport layer available to applications through an API. A popular one is the sockets interface, originally developed for Unix. The operating system takes data from the application and hands it over to the network, where it is sent to the receiving host. That host hands it to the port to which it is addressed.

Protocols that support reliable transmission need a way to guarantee that each packet of data reaches the receiving host, even though the network is unreliable. TCP uses acknowledgements so that the sender knows when a packet has been received correctly or when it needs to be resent. It also guarantees that the packets are received in order.

The transport layer, what it does, and how it works will be described in more detail here.

Table of Contents

1. Introduction
2. Sockets
   IP addresses and ports
   Example (C# sockets)
   Multiplexing and demultiplexing
3. Protocols
   Why different protocols?
   UDP: Simple protocol
   TCP: Reliable transmission
   Connection establishment
   Packet errors
   Lost packets
   Pipelined data transfer
4. Conclusion
5. Index
6. References

Introduction

Networks have many components working together, and in order for everything to run smoothly they have been separated into several levels, called layers.
Each layer has a different objective, and builds on the features provided by the layer below it. There are five layers used in the internet. The International Organization for Standardization (ISO) proposed a seven-layer model in the 1970s (called Open Systems Interconnection, or OSI), but the two extra layers were never very popular in practice, and they may or may not be used depending on the application developer (Kurose and Ross, 50).

The first layer in the internet model is the application layer. This is where applications on the end systems run, using protocols such as HTTP or FTP (Kurose and Ross, 49).

The transport layer is next, and its job is to transport data from one end system to the other, over many different types of links. The transport layer will be the focus of this paper. The two main transport layer protocols are TCP and UDP (Nutt, 624).

The network layer provides a facility for moving pieces of data (called packets) from one host to the next. From one end system to the other, there are usually several hosts that implement the network layer (hundreds when using the internet). IP is the network layer protocol used by the internet (Kurose and Ross, 49).

The data link (or link) layer moves packets across a link between two hosts. This link may be an Ethernet link, WiFi, or another type of connection.

The physical layer's job is to move the packet across the physical medium; this could be twisted pair cable, fibre optic cable, or a wireless signal (Kurose and Ross, 50).

Generally the operating system provides support for parts of the transport and network layers (Nutt, 624). The transport layer is usually made available to application developers through an API; one such standard is the socket interface (Peterson and Davie, 42).

Since many applications on a host can be accessing the network at the same time, each is assigned a port number. An IP address/port number combination is written in the form 200.200.200.200:5000.
To start a connection, the client sends a message to a specific port on a waiting server. The server responds and they start a connection. Several connections can be open at once on different ports, as in the figure.

The transport layer has a few issues to deal with, mainly because the lower layers are not always perfect. Sometimes a layer will drop a packet, and sometimes packets will be received out of order. Different protocols handle these issues differently, or may not fix them at all. UDP, one of the popular transport layer protocols, is a basic service that does not guarantee delivery. TCP does, and it needs a way to make sure the packets reach the destination intact and in the right order (Kurose and Ross, 200).

How does it do that? And what does the operating system do at the end systems? How does it make sure the packets get to the right place? The answers to these questions will be the focus of this paper.

Sockets

IP addresses and ports

One popular transport layer interface is the sockets API. This interface is popular on many platforms besides Unix, the platform for which it was originally developed (Peterson and Davie, 43). Implementations of the sockets interface are available in the native Windows C libraries (Win32), as well as in many other popular languages on other operating systems.

In establishing a socket connection, the client (such as a computer running a web browser) sends a message to a server, which is simply waiting for messages. The client needs two things in order to find the server: an IP address and a port number. It does not need to know how to move the messages across the network or internet; that service is provided by the lower layers. So to the client it does not matter if the server is across the street or around the world; the transport layer knows the message will be directed to the right host (Nutt, 645).

The server has, all this time, been listening to that port.
The port number could be a standard port number, such as 80 for HTTP or 21 for FTP. Ports from 0 to 1023 are reserved for certain protocols (Kurose and Ross, 202). Once the server receives the connection request, it replies by swapping the source and destination address/port combinations and sending a confirmation packet back.

Sometimes a server needs several connections open at once; for example, a web server might receive a connection request from one client while another client is downloading a file. In that case, it can create a child connection which will communicate with the client.

Example (C# sockets)

C# provides a Socket class in the namespace System.Net.Sockets, which allows for relatively simple creation of sockets. C++ and Java also provide socket libraries, but a C++ implementation is long and complicated, and Java ones are readily available elsewhere (Kurose and Ross, 159).

This example will create two different programs: NetSocket.cs and NetSocketClient.cs. They will use the TcpClient and TcpListener classes to communicate using TCP, and by using the IP address 127.0.0.1 they can be run on the same computer.

NetSocket.cs

    using System;
    using System.Net;
    using System.Net.Sockets;
    using System.Text;

    namespace NetSocket
    {
        class NetSocket
        {
            static int DefaultPort = 5000;
            static char ClosingChar = '@';

            /// <summary>
            /// Main part of the NetSocket application.
            /// </summary>
            /// <param name="args">Command line parameters.</param>
            static void Main(string[] args)
            {
                // Create the TCP listener to listen to the default port.
                TcpListener ServerSocket = new TcpListener(IPAddress.Any, DefaultPort);

                // Start the server socket.
                ServerSocket.Start();
                Console.WriteLine("Server listening...");

                // Wait for and accept a connection from the client.
                TcpClient ClientSocket = ServerSocket.AcceptTcpClient();
                Console.WriteLine("Client accepted.");

                // Create a network stream and read from it up to the closing character.
                NetworkStream Stream = ClientSocket.GetStream();
                if (Stream.CanRead)
                {
                    byte[] BytesRead = new byte[1024];
                    Stream.Read(BytesRead, 0, BytesRead.Length);
                    string StringRead = Encoding.ASCII.GetString(BytesRead);
                    StringRead = StringRead.Substring(0, StringRead.IndexOf(ClosingChar));
                    Console.WriteLine("Data read:\n{0}", StringRead);
                }
                else
                {
                    Console.WriteLine("Can not read from client stream.");
                }

                // Close connections.
                ClientSocket.Close();
                ServerSocket.Stop();

                Console.WriteLine("Finished. Press Enter to quit.");
                Console.ReadLine();
            }
        }
    }

NetSocketClient.cs

    using System;
    using System.Net;
    using System.Net.Sockets;
    using System.Text;

    namespace NetSocketClient
    {
        class NetSocketClient
        {
            static int ServerPort = 5000;
            static string ServerIP = "127.0.0.1";
            static char ClosingChar = '@';

            static void Main(string[] args)
            {
                // Create a new TCP client.
                TcpClient ClientSocket = new TcpClient();

                // Connect to the server IP and port.
                ClientSocket.Connect(IPAddress.Parse(ServerIP), ServerPort);
                Console.WriteLine("Connected to server");

                // Get the network stream and write to it, ending with the closing character.
                NetworkStream Stream = ClientSocket.GetStream();
                byte[] BytesWrite = Encoding.ASCII.GetBytes("Greetings!" + ClosingChar);
                Stream.Write(BytesWrite, 0, BytesWrite.Length);

                Console.WriteLine("Finished. Press Enter to quit.");
                Console.ReadLine();
            }
        }
    }

We will start by looking at the server code.

    TcpListener ServerSocket = new TcpListener(IPAddress.Any, DefaultPort);
    ServerSocket.Start();

This creates a listener that accepts connections on any of the host's IP addresses and listens on port 5000.

    TcpClient ClientSocket = ServerSocket.AcceptTcpClient();

This waits for a client to connect, and accepts the connection when it does. ClientSocket is where most of the work is done.
    NetworkStream Stream = ClientSocket.GetStream();
    if (Stream.CanRead)
    {
        byte[] BytesRead = new byte[1024];
        Stream.Read(BytesRead, 0, BytesRead.Length);
        string StringRead = Encoding.ASCII.GetString(BytesRead);
        StringRead = StringRead.Substring(0, StringRead.IndexOf(ClosingChar));
        Console.WriteLine("Data read:\n{0}", StringRead);
    }

GetStream returns an object that allows reading from and writing to the socket. The block of code inside the if statement reads up to 1 KB of data and keeps everything before the "@" character, which signals that the data ends. This, of course, is application dependent, and the client and server have to agree on a protocol; our applications use a protocol that says "@" signals the end of data.

    ClientSocket.Close();
    ServerSocket.Stop();

This terminates the connection.

If you compare this example to a C++ program, you will notice that the C++ program is much more complicated. That is because C# hides a lot of the details, which is exactly why I chose to make a C# example instead of a C++ one.

Now we will look at the client program.

    TcpClient ClientSocket = new TcpClient();
    ClientSocket.Connect(IPAddress.Parse(ServerIP), ServerPort);

This opens a connection to the server at 127.0.0.1 on port 5000.

    NetworkStream Stream = ClientSocket.GetStream();
    byte[] BytesWrite = Encoding.ASCII.GetBytes("Greetings!" + ClosingChar);
    Stream.Write(BytesWrite, 0, BytesWrite.Length);

This sends a message to the server through the network stream.

The two programs can run together on the same computer. The server needs to be started first, and it will wait for a connection. When the client connects, it sends a message to the server, and the server closes the connection. Typically the client would have more data to send and would close the connection when it is finished.

This example shows that communicating through sockets can be relatively simple, and it should give a basic idea of how sockets work (System.Net.Sockets Namespace).
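
The server in the example issues a single Read call and assumes that the whole message, including the closing character, arrives at once. TCP delivers a stream of bytes, so a longer message may arrive in several pieces, and a message without the closing character would make the Substring call fail. The following is a minimal sketch of a more defensive read loop, not part of the original programs. The names DelimitedReader and ReadUntil are hypothetical, and a MemoryStream stands in for the NetworkStream so that the sketch runs without a network.

```csharp
using System;
using System.IO;
using System.Text;

static class DelimitedReader
{
    // Reads ASCII text from the stream until closingChar is seen and returns
    // the text before it. Returns null if the stream ends first.
    public static string ReadUntil(Stream stream, char closingChar)
    {
        StringBuilder text = new StringBuilder();
        byte[] buffer = new byte[1024];

        while (true)
        {
            // Read returns the number of bytes actually received, which may be
            // fewer than requested; 0 means the other side closed the stream.
            int count = stream.Read(buffer, 0, buffer.Length);
            if (count == 0)
            {
                return null;
            }

            // Decode only the bytes that were received in this call.
            text.Append(Encoding.ASCII.GetString(buffer, 0, count));

            int end = text.ToString().IndexOf(closingChar);
            if (end >= 0)
            {
                return text.ToString(0, end);
            }
        }
    }

    static void Main()
    {
        // Stand-in for ClientSocket.GetStream() in the server program.
        Stream stream = new MemoryStream(Encoding.ASCII.GetBytes("Greetings!@"));
        Console.WriteLine(ReadUntil(stream, '@')); // prints "Greetings!"
    }
}
```

In the server, the same method could be called with the NetworkStream returned by GetStream, since NetworkStream derives from Stream.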

Multiplexing and demultiplexing

Multiplexing is a way for several devices to use the same connection. There are two ways of multiplexing: Time-Division Multiplexing (TDM) and Frequency-Division Multiplexing (FDM). TDM splits a connection into frames of equal length, and each connection gets only certain frames. FDM divides the channel into different frequencies, with each connection getting a certain range of frequencies. This is similar to the different frequencies radio stations use. The receiving host has the job of demultiplexing, where the proper signals are sent to the proper ports.

Protocols

Why different protocols?

"The ultimate goal of the transport layer is to provide efficient, reliable, and cost-effective services to its users, normally processes in the application layer." (Tanenbaum, 481)

The transport layer may provide several features that the network layer does not: reliable transmission (error checking/correcting and packet ordering), and congestion control. Different protocols may support different features.

There are two categories of transport protocols: connection-oriented and connectionless. A connection-oriented protocol sends a few messages back and forth to start a connection. TCP is an example of a connection-oriented protocol; it uses a "three-way handshake" which we will look at in more detail later (Kurose and Ross, 242). A connectionless protocol does not send any handshaking messages. The very basic protocol UDP is a connectionless protocol (Kurose and Ross, 209).

The following table lists some of the differences between TCP and UDP.

    Feature                     UDP                             TCP
    Connection                  Connectionless                  Three-way handshake
    Error detection             Sometimes (checksum optional)   Required
    Dropped packets             No action                       Resend
    When an error is detected   Drop packet                     Resend
    Congestion control          No                              Yes
    Flow control                No                              Yes
    Security                    No                              Optional with extensions (SSL)

Everything about TCP looks better at first glance, but there are still times when UDP is more useful.
DNS, since it requires several messages to different servers, uses UDP. The connectionless protocol is much more efficient than the connection-oriented TCP, and the packet header is much smaller. Multimedia applications also use UDP sometimes, since an occasional dropped packet matters less than the performance that would be sacrificed to recover it. And, since UDP does not have congestion control to slow it down, performance is much higher.

TCP is used by many popular application-layer protocols, such as HTTP, SMTP, and FTP. There are also many other transport layer protocols (such as RPC), but none are nearly as popular as TCP and UDP.

UDP: Simple protocol

UDP barely does anything in addition to the network layer below it. It performs multiplexing and demultiplexing but not much else. If additional features are needed, the application needs to provide them, because UDP was designed to do the minimum possible.

A UDP packet is created with the following format (Zheng and Akhtar, 219).

[Figure: UDP packet format]

Obviously, sending a packet across the internet requires an IP address. So where is the IP address field? The transport layer does not store it in its header; as was said before, the network layer takes care of getting the packet from one end to the other. The network layer has its own packet format, so the whole UDP packet is placed into a network layer packet which has a field for the IP address.

The UDP packet format is fairly self-explanatory. Each field is 16 bits, and the source and destination ports are stored first. The packet length includes the 64-bit header, and the maximum length is 64 KB. If an application has more data than fits in one packet, the data has to be split up across several packets, and UDP will not guarantee that they are received in the right order. The checksum stores an error detection code, which can optionally be left unused and set to 0. If a received packet is corrupt, it is dropped.

UDP is quite basic, but very useful in cases where the features of TCP are not all needed.
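
The header layout described above (source port, destination port, length, and checksum, each 16 bits) can be illustrated with a short sketch. This is only an illustration of the field layout under stated assumptions, not how an operating system implements UDP: the class and method names (UdpHeaderSketch, Build, Get) are hypothetical, the example port numbers and payload size are arbitrary, and the checksum is simply left at 0 (unused) rather than computed. Fields are written in network byte order, most significant byte first.

```csharp
using System;

static class UdpHeaderSketch
{
    // Builds the 8-byte (64-bit) UDP header from its four 16-bit fields.
    public static byte[] Build(ushort sourcePort, ushort destPort, ushort length, ushort checksum)
    {
        byte[] header = new byte[8];
        Put(header, 0, sourcePort);
        Put(header, 2, destPort);
        Put(header, 4, length);    // header plus data, in bytes
        Put(header, 6, checksum);  // 0 means the checksum is not used
        return header;
    }

    // Reads one 16-bit field starting at the given byte offset.
    public static ushort Get(byte[] header, int offset)
    {
        return (ushort)((header[offset] << 8) | header[offset + 1]);
    }

    // Writes one 16-bit field, most significant byte first.
    static void Put(byte[] header, int offset, ushort value)
    {
        header[offset] = (byte)(value >> 8);
        header[offset + 1] = (byte)(value & 0xFF);
    }

    static void Main()
    {
        // Hypothetical packet: 20 bytes of data plus the 8-byte header.
        byte[] header = Build(5000, 53, 28, 0);
        Console.WriteLine("{0} -> {1}, length {2}, checksum {3}",
            Get(header, 0), Get(header, 2), Get(header, 4), Get(header, 6));
        // prints "5000 -> 53, length 28, checksum 0"
    }
}
```

Note that no IP address appears anywhere in these eight bytes; as described above, that information belongs to the network layer packet that carries the UDP packet.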
Implementing only the features that are needed could increase efficiency considerably.

TCP: Reliable transmission protocol

TCP supports reliable transmission, in that the packets handed up to the receiving application will be the same as those sent, in the right order and with none missing. The TCP packet format looks like this (Zheng and Akhtar, 220).

[Figure: TCP packet format]

Once again, the source and destination ports are first. Next are the sequence and acknowledgement numbers, which will be described later. The 4-bit header length is required because the options field at the bottom is variable length, so the header size is not always the same. Next is a 6-bit field that was reserved for future use, and is currently still unused (Tanenbaum, 537). Next are 6 one-bit flags:

- URG is set if the urgent pointer is used.
- ACK indicates an acknowledgement, and is described later.
- PSH indicates that a push is requested.
- RST resets a connection.
- SYN is used in establishing connections, and is described later.
- FIN indicates there is no more data.

Next are the window size (discussed later), the checksum for error checking, the urgent pointer, and the options (0 bytes or more). The data field is optional, so status messages with no data can be sent back and forth.

The TCP header is much larger and more complicated than the UDP one, which makes sense because TCP has so much more functionality.

Connection establishment

A connection in TCP is established by a three-way handshake, or three messages passed back and forth (Kurose and Ross, 264).

The client sends a packet with the SYN bit on and the sequence number set to a value it chooses. This is called a SYN segment. The server replies with a segment that has both the SYN and ACK bits on, the acknowledgement number set to the client's sequence number plus one, and a sequence number of its own choosing. This packet is called the SYNACK segment.
Finally the client replies with the SYN bit off, the acknowledgement number set to the server's sequence number plus one, and the sequence number set to the client's sequence number plus one. At this point the connection is established and the hosts can start sending data.

To terminate the connection, one of the hosts sends a packet with the FIN bit set. Usually each host sends one FIN and one ACK before the connection terminates (Tanenbaum, 541).

Packet errors

Networks are never perfect and errors occur quite often. They vary widely in scope, and different methods are used to deal with different types.

In a perfect world, every packet would be received, in order, and without errors, and could simply be passed up to the application at the receiving end. Unfortunately, this is rarely the case. Consider the situation: the two hosts are separate, so the only way they can communicate is through a lossy channel. Two packets could be received out of order, or a packet could be lost completely. If we send acknowledgements, the acknowledgements could be lost. Even if an acknowledgement makes it back, we would spend a lot of time waiting for it, since the round-trip time can be quite long for internet applications. Obviously there are many issues that can arise and have to be dealt with, and some of the fixes will be described here.

For brevity, the host sending packets will be called the server and the receiver will be called the client.

TCP numbers each packet using a sequence number. The sequence number for a packet is calculated using the number of bytes in the packet, so it does not necessarily go up by 1. Using the sequence number, when a packet is received without error, the client sends an ACK with the same number as the correctly received packet. When the server receives an ACK with sequence number n, it knows that packet n was received correctly (Kurose and Ross, 228).

But what if a packet is corrupted? The client could send a special packet that says so, basically the opposite of an ACK.
But that causes other problems; it could be lost or corrupted as well. Instead, it was decided that TCP would not send separate feedback when a corrupt packet is received. It simply ACKs the last good packet.

In illustration 1, packet 2 is corrupted on the way, so the client ACKs packet 1 (the last good packet received). In illustration 2, ACK 1 is lost, but that is not a problem, since the server receives ACK 2 and knows that all packets up to packet 2 were received properly.

Lost packets

Sometimes a packet is lost on the way to the client, so it needs to be resent. Fortunately, because of the sequence numbers, the client can detect this. Remember we said the client ACKs the last properly received packet; more precisely, the ACK corresponds to the last packet up to which all earlier packets have been received correctly. So if packet 1 and packet 3 are received but not packet 2, the client will send an ACK 1. Usually the client will buffer the other packets received (in this case just packet 3) so that they do not have to be resent (Kurose and Ross, 232).

TCP has a rule that when three ACKs with the same sequence number are received in a row, the server should resend the packet after that sequence number.

In illustration 1, packet 2 is lost, so the client keeps sending ACK 1. In illustration 2, when the server receives three ACK 1s, it concludes that packet 2 has been lost and resends it. Note that once the client receives packet 2, it will ACK the last in-order packet that was received, which may not be packet 2 if some other packets have been buffered.

Pipelined data transfer

Pipelined data transfer means that the server can send several packets at once. It makes sense to do this, because otherwise sending data would take the round-trip time (RTT, including the many delays) multiplied by the number of packets (n). The RTT can be quite long, and an RTT of hundreds of milliseconds to even seconds is not uncommon.
So, in the best case, the time to move 1 MB of data (about 700 packets at a typical maximum packet size of 1,500 bytes) with a conservative 50 ms RTT would be:

    50 ms × 700 = 35,000 ms

The total time would be 35 seconds, just for 1 MB of data. This does not even count lost or corrupted packets (which could add a lot more time), or the fact that the RTT could easily be over 50 ms. Since even a slow DSL connection should be able to download 1 MB in less than 10 seconds, it is obvious that pipelined data transfer is a necessity.

TCP uses an algorithm called sliding window. It works by setting a maximum "window size", which is the number of packets that can be outstanding (sent but not yet acknowledged) at once. If a packet is in the window, it can be sent immediately. Otherwise it cannot. When the first packet in the window has been ACKed, the window slides over and starts at the next packet (and another packet can be sent).

In the figure, once the first packet inside the window has been acknowledged, the window will slide to the right three packets, and there will be three more packets that can be sent.

Conclusion

We described how the applications running on an operating system communicate through ports, so that several connections can be open at once on a single host. The data is multiplexed using either TDM or FDM, allowing several connections to share a single link.

We also described the features which the transport layer makes available; these features depend on the protocol used. An application can choose a basic protocol such as UDP if it needs better performance at the expense of reliability. Often more reliability is needed, and TCP is used. TCP guarantees delivery of packets uncorrupted and in order. Many internet application layer protocols are built on TCP; these include HTTP, FTP, SMTP, and nearly anything that requires consistent, reliable delivery.

UDP is a connectionless protocol, meaning it does not perform any handshaking before it starts sending data.
This makes it ideal in situations where many servers each have to be sent a short piece of data, such as DNS. UDP also has a small, simple packet format.

TCP is a connection-oriented protocol, and connections are established by a three-way handshake. The packet format is quite complicated, as it needs to support acknowledgements and other features required for reliable transmission, as well as for handshaking.

The transport layer in the internet model is quite complicated and includes many protocols not mentioned here. But it is essential that the transport layer be implemented properly, as it is the link between the applications and the network. Much research is still being done to provide better efficiency and reliability, especially since smartphones have become quite popular and people access the internet in many different ways.

We have barely scratched the surface of the features that are needed to keep the internet running smoothly, but I hope this paper will spark an interest in the technology of operating systems and networks.

Index

Application layer
Child connection
Data link layer
Demultiplexing
IP addresses
Multiplexing
Network
Network layers
Packet errors
Physical layer
Pipelined data transfer
Ports
Sequence number
Sliding window algorithm
Sockets programming
TCP
Three-way handshake
Transport layer
Transport layer protocols
UDP

References

Kurose, James, and Keith Ross. Computer Networking: A Top-Down Approach 4th Edition. Boston: Pearson, 2008. Print.
Nutt, Gary. Operating Systems 3rd Edition. Boston: Pearson, 2003. Print.
Peterson, Larry, and Bruce Davie. Computer Networks 2nd Edition. San Francisco: Morgan Kaufmann, 2000. Print.
"System.Net.Sockets Namespace." Microsoft. Web. 21 Mar. 2012.
Tanenbaum, Andrew. Computer Networks 4th Edition. Upper Saddle River: Pearson, 2003. Print.
Zheng, Youlu, and Shakil Akhtar. Networks for Computer Scientists and Engineers. New York: Oxford, 2002. Print.
