CSE 7348 - class 10
• Timeouts on sockets.
  – Call alarm, which generates the SIGALRM signal when the specified time has expired.
  – Block waiting for I/O in select (time limit built in).
  – Use the SO_RCVTIMEO or SO_SNDTIMEO socket options.
  – All three work with I/O operations (read or write, recvfrom, sendto).
  – Need a technique to work with connect.
  – Establish the signal handler for SIGALRM, set the alarm clock for the process, then call connect:
      sigfunc = Signal(SIGALRM, connect_alarm);
      alarm(nsec);      // sets the alarm clock for the process; a nonzero return means an alarm was already pending
      connect(…..)

• The point of this function is that if the connection hangs, the alarm fires and the handler takes over.
• Note that the function restores the previous SIGALRM handler before it returns.
• NB: this technique can reduce the connect timeout, but it cannot extend the timeout built into a BSD kernel (normally 75 seconds).
• Comments on data queuing (page 365).
• How does a user determine how much data is queued without reading the data?
• Use the recv function with the MSG_PEEK flag. This is one of the flags discussed earlier. The return value is the number of bytes available in the buffer (up to the length requested).

• Be wary that the number of bytes available in the buffer can change rapidly.
• Some implementations support the FIONREAD command of ioctl. This uses a value-result pointer to return the number of bytes in the socket's receive queue.
• Sockets and standard I/O.
• The reads and writes use descriptors and are effectively system calls to the Unix kernel.
• It is possible to use the standard I/O library of ANSI C. This is handy in that buffering of the stream is done automatically.
• The standard I/O library can be used with sockets, with some caveats.

• A standard I/O stream can be created from any descriptor by calling the fdopen function.
• Or, given a standard I/O stream, the corresponding descriptor can be determined by calling fileno (to use select, the standard I/O stream must be converted to a descriptor).
• Standard I/O streams can be made full duplex by opening the stream with type r+.
  – One caveat: an output function cannot be followed by an input function without an intervening call to fflush, fseek, fsetpos, or rewind.
  – Neither can an input function be followed by an output function without an intervening call to fseek, fsetpos, or rewind.
  – The problem with these last three is that they all call lseek, which fails on a socket.

• There are three types of buffering performed by the standard I/O library:
  – Fully buffered: I/O takes place only when the buffer is full (typically about 8192 bytes).
  – Line buffered (most common): I/O takes place when a newline (LF, 0Ah) is encountered.
  – Unbuffered: I/O takes place whenever a call is made.
  – Most Unix implementations employ the following rules.
  – stderr is always unbuffered.
  – stdin and stdout are fully buffered, unless they refer to a terminal, in which case they are line buffered.
  – All other streams are fully buffered, unless they refer to a terminal, in which case they are line buffered.

• Unix domain protocols: client-server communication on a single host using the same API as TCP (the socket API).
• This comes from the need for inter-process communication on a single host; why not use the same API as TCP?
• Other schemes are 'shared memory', proprietary comm protocols, polling schemes and a host of other difficult to understand and pointless to reproduce techniques.
• This leads of course to the basis of distributed computing; the distribution has to begin with processes on a single platform able to communicate.

• Two types of sockets in the Unix domain: stream sockets (similar to TCP) and datagram sockets (similar to UDP).
  – Raw sockets are available but are undocumented and discouraged.
• Unix domain sockets are an alternative to IPC methods (volume 2, starting next week).
• Unix domain sockets are often twice as fast as a TCP socket for intra-host communication. (When an X Window client opens a connection it checks the server's host name, window and screen; if the server is on the same host, the client opens a Unix domain stream connection to the server.)
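A minimal sketch of a Unix domain stream socket in use, assuming a system where a connect on a Unix domain stream socket completes as soon as the connection is queued on the listener (so one process can play both client and server). The function name unix_stream_roundtrip and the pathname passed to it are hypothetical, chosen only for illustration; AF_UNIX is used here as the widely available synonym for AF_LOCAL.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: send msg through a Unix domain stream socket bound
 * to path and read it back on the accepted side.
 * Returns the number of bytes received, or -1 on error. */
ssize_t unix_stream_roundtrip(const char *path, const char *msg,
                              char *out, size_t outlen)
{
    struct sockaddr_un addr;
    int listenfd = -1, clifd = -1, connfd = -1;
    ssize_t n = -1;

    if (outlen == 0 || strlen(path) >= sizeof(addr.sun_path))
        return -1;                      /* pathname must fit with its NUL */

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;          /* same family as AF_LOCAL */
    strcpy(addr.sun_path, path);

    unlink(path);                       /* bind fails if the pathname exists */
    if ((listenfd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
        goto done;
    if (bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;
    if (listen(listenfd, 5) < 0)
        goto done;

    if ((clifd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
        goto done;
    /* The pathname given to connect must be bound to an open socket. */
    if (connect(clifd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;
    if ((connfd = accept(listenfd, NULL, NULL)) < 0)
        goto done;

    if (write(clifd, msg, strlen(msg)) != (ssize_t)strlen(msg))
        goto done;
    n = read(connfd, out, outlen - 1);  /* byte stream: no record boundaries */
    if (n >= 0)
        out[n] = '\0';

done:
    if (connfd >= 0)   close(connfd);
    if (clifd >= 0)    close(clifd);
    if (listenfd >= 0) close(listenfd);
    unlink(path);                       /* remove the pathname left by bind */
    return n;
}
```

The calls are the same socket, bind, listen, connect and accept used with TCP; only the address structure differs. A real client and server would of course live in separate processes.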
• There is enhanced security on the newer Unix implementations to prevent inter-process hacking on a single host.

• Unix domain sockets use a socket address structure.
  – The protocol family is AF_LOCAL.
  – The pathname is stored in sun_path:
      struct sockaddr_un {
          uint8_t     sun_len;
          sa_family_t sun_family;     // AF_LOCAL
          char        sun_path[104];  // must be null terminated
      };
  The pathname is of the form /home/surdenau/fake.
  The example on page 375 uses the command line argument to get a pathname to be used in the bind.

• The pathname used must have appropriate file access permissions set (turn off the world-write bit, i.e. o-w, or employ umask as in the example).
  – The default file access should be ugo+rwx, then modified by the umask value.
  – The pathname must be absolute, not relative. No links.
  – The pathname specified in a call to connect must be a pathname that is currently bound to an open Unix domain socket of the same type (stream or datagram).
  – The permission testing for connect is the same as that for opening the pathname for writing.

• Unix domain stream sockets are similar to TCP sockets in that they provide a stream interface to the process with no record boundaries.
• If a call to connect for a Unix domain stream socket finds the listening socket's queue full, ECONNREFUSED is returned immediately.
• Unix domain datagram sockets are, like UDP, effectively unreliable.
• Sending a datagram on an unbound Unix domain datagram socket does NOT bind a pathname to the socket.
  – This means that the receiver will be unable to send a reply, since it has no way of knowing where the incoming data came from.

CSE 7348 - class 9
• Daemon processes and the inetd superserver (ch 12).
• A daemon (pronounced demon) is a process that runs in the background and is independent of control from all terminals.
• Daemons are typically used for various and sundry administrative tasks in the Unix world.
• Daemons are typically started by the system initialization scripts (/etc/rc). These daemons have superuser privileges.
  – The inetd superserver, a Web server and a mail server will typically be started by these initialization scripts.
  – The word daemon comes from the Latin daemon, itself taken from the Greek daimon: a tutelary deity or spirit.

• Many network servers are started by the inetd superserver.
• inetd listens for network requests (HTTP, etc.) and invokes the actual server.
• cron is a daemon; programs it invokes do NOT run as daemons (page 332).
• Daemons can be started from user terminals, but they will not display error messages there or be interrupted by signals generated from that terminal.
• Since there must be a way for daemons to output messages, daemons call the syslog function, which sends the messages to the syslogd daemon.

• syslogd daemon
• Usually started from the initialization scripts, this daemon reads the /etc/syslog.conf file, which specifies what to do with each type of log message received.
• In addition, a Unix domain socket is created and bound to the path /var/run/log.
• A UDP socket is created and bound to port 514.
• The path /dev/klog is opened, and any error messages from the kernel appear as input on this device.
• syslogd runs in an infinite loop, calling select to wait for any of the descriptors from the previous steps to become readable.

• Now for something interesting:
• The individual whom Alan Turing came to visit in 1942, on orders from the British Foreign Office, where Mr. Turing was employed as a code breaker, was John von Neumann.
• Mr. von Neumann was a legendary gambler, raconteur and world class mathematician at the Institute for Advanced Study in Princeton. Mr. von Neumann's wit and humor were so special as to be noted in all remembrances of him.
• Mr. von Neumann, born in Hungary, is widely credited as the inventor of the stored-program architecture used by most modern computing machines.

• Exactly what Mr. von Neumann and Mr. Turing spent their six months on is not certain.
• It is known that shortly after their visit, Mr. Turing returned to Bletchley Park, where he devised systems of computation which allowed the breaking of the German 'Enigma' code.
• After the war, both Mr. Turing and Mr. von Neumann designed computational devices, both of which were far beyond the ability of the electronics industry to realize.
• Turing eventually committed suicide; hence, according to popular legend, the bite (byte) out of the apple.

• syslog function
• The common technique for logging messages from a daemon is to call the syslog function:
      void syslog(int priority, const char *message, ...);
• The priority argument is a combination of a level and a facility.
• The message is similar to a format string for printf, with the addition of the %m specification, which is replaced by the error message corresponding to the current value of errno.
• Levels vary from 0 to 7, with 0 being the highest priority.

• The facility of log messages is used to identify the type of process sending the message (LOG_USER is the default).
• An example of a facility is an FTP daemon, or the mail system, or the cron daemon.
• The purpose of the facility and level is to allow all messages from a given facility or level to be handled the same.
• As an example, a config file will contain the following:
      kern.*          /dev/console
      local7.debug    /var/log/cisco.log
  This specifies that all kernel messages get logged to the console and all debug messages from the local7 facility get appended to the file /var/log/cisco.log.

• When an application calls syslog the first time, a Unix domain datagram socket is created. Then connect is invoked to the well known pathname of the socket created by the syslogd daemon (/var/run/log).
• This socket remains open until the application terminates.
• As an alternative the openlog and closelog functions can be used.
      void openlog(const char *ident, int options, int facility);
      void closelog(void);
• ident is a string that is prepended to each log message by syslog.
• The options argument is the bitwise OR (|) of one or more constants (figure 12.3).
• Normally the Unix domain socket is NOT created on a call to openlog; it is opened on the first call to syslog.

• The facility argument of openlog specifies a default facility for any subsequent calls to syslog that do not specify a facility.
• Most daemons call openlog with their facility and then specify only a level in each later call to syslog.
• daemon_init function: an example of how to daemonize a process.
• Call fork; the parent terminates while the child continues. The child is therefore in the background, inheriting the parent's process group ID but having its own process ID.
• setsid (a Posix.1 function) creates a new session; the process becomes the session leader and the group leader of a new process group with no controlling terminal.
• Ignore the SIGHUP signal and fork again. SIGHUP must be ignored because the termination of the session leader (the first child) sends SIGHUP to the processes in the session. Now the second child is running. This guarantees that the second child is not a session leader, so it cannot acquire a controlling terminal.

• daemon_init (continued)
• Set the global daemon_proc nonzero ( daemon_proc = 1; ). When this value is nonzero the error functions call syslog instead of writing to stderr.
• Change the working directory to root (some daemons choose another directory, particularly one where core files should land). If a daemon's working directory is in a file system, that file system cannot be unmounted.
• Close all inherited open descriptors. Since the limit on the number of descriptors may be absent or very large (32k or 64k), the standard approach is to close the first 64 descriptors.
• Some daemons open /dev/null for reading and writing and duplicate the descriptor to stdin, stdout and stderr, so that any library function can read or write these descriptors without failure.

• daemon_init (continued)
• Call openlog. Specify that the process ID should be added to all log messages.
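The daemon_init steps listed above can be sketched roughly as follows. This is an illustration written from the bullet list, not the textbook's code: the name daemon_init_sketch, the MAXFD constant, and the choice of umask(0) are assumptions, and error handling is kept to a minimum.

```c
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <syslog.h>
#include <unistd.h>

#define MAXFD 64        /* assumption from the slides: close the first 64 descriptors */

int daemon_proc;        /* nonzero tells error routines to use syslog, not stderr */

/* Returns 0 in the daemon (second child) and -1 on error.
 * The original parent and the first child never return; they _exit. */
int daemon_init_sketch(const char *ident, int facility)
{
    pid_t pid;
    int   fd;

    if ((pid = fork()) < 0)
        return -1;
    if (pid > 0)
        _exit(0);                   /* parent terminates; child is in background */

    if (setsid() < 0)               /* new session, no controlling terminal */
        return -1;

    signal(SIGHUP, SIG_IGN);        /* session leader's exit would send SIGHUP */
    if ((pid = fork()) < 0)
        return -1;
    if (pid > 0)
        _exit(0);                   /* first child (session leader) terminates */

    daemon_proc = 1;
    if (chdir("/") < 0)             /* do not pin a mounted file system */
        return -1;
    umask(0);                       /* clear the file mode creation mask */

    for (fd = 0; fd < MAXFD; fd++)  /* close inherited descriptors */
        close(fd);

    /* Optional step: point stdin, stdout and stderr at /dev/null. */
    if (open("/dev/null", O_RDWR) != 0)     /* lowest free descriptor is 0 */
        return -1;
    if (dup2(0, 1) < 0 || dup2(0, 2) < 0)
        return -1;

    openlog(ident, LOG_PID, facility);      /* LOG_PID adds the process ID */
    return 0;
}
```

After the call returns 0, the daemon would log with calls such as syslog(LOG_INFO, "started") rather than printing to a terminal.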
• NB: Because it has no controlling terminal, a daemon should never receive the SIGHUP signal from the kernel; the same holds for the SIGINT and SIGWINCH signals. Therefore these signals can be used to notify the daemon that something has changed (such as the config file).
• Page 338: Stevens sets up the daytime server as a daemon. An excellent exercise and a really great test question.

• inetd daemon
• In a typical Unix system the servers are all daemonized and started at boot.
• All daemons contain nearly the same startup code.
• Each daemon takes a slot in the process table although it is asleep most of the time.
• 4.3BSD solved this problem by providing an Internet superserver, the inetd daemon.
• This daemon can be used by any server that employs TCP or UDP. It does not handle Unix domain sockets.
• Simplifies writing daemon processes: most startup details are handled by inetd (no longer need to call daemon_init).
• Allows a single process to be waiting for incoming client requests for multiple services, instead of a process for each service.

• inetd uses the same techniques to establish itself as daemon_init.
• The configuration file is typically /etc/inetd.conf.
• The config file specifies the services that the superserver is to handle and how to handle a service request.
• Sample line:
      ftp stream tcp nowait root /usr/bin/ftpd ftpd -l
• The name of the server is always passed as the first argument to a program when it is exec'ed.
• The names of the fields are service name, socket type, protocol, wait-flag, login-name, server-program, server-program-arguments.

[/etc]%cat inetd.conf | more
#
#ident "@(#)inetd.conf 1.22 95/07/14 SMI" /* SVr4.0 1.5 */
#
#
# Configuration file for inetd(1M). See inetd.conf(4).
#
# To re-configure the running inetd process, edit this file, then
# send the inetd process a SIGHUP.
#
# Syntax for socket-based Internet services:
# <service_name> <socket_type> <proto> <flags> <user> <server_pathname> <args>
#
# Syntax for TLI-based Internet services:
#
# <service_name> tli <proto> <flags> <user> <server_pathname> <args>
#
# Ftp and telnet are standard Internet services.
#
ftp     stream  tcp     nowait  root    /usr/sbin/in.ftpd       in.ftpd
telnet  stream  tcp     nowait  root    /usr/sbin/in.telnetd    in.telnetd
#
# Tnamed serves the obsolete IEN-116 name server protocol.
name    dgram   udp     wait    root    /usr/sbin/in.tnamed     in.tnamed
# Shell, login, exec, comsat and talk are BSD protocols.
#
shell   stream  tcp     nowait  root    /usr/sbin/in.rshd       in.rshd
login   stream  tcp     nowait  root    /usr/sbin/in.rlogind    in.rlogind
exec    stream  tcp     nowait  root    /usr/sbin/in.rexecd     in.rexecd
comsat  dgram   udp     wait    root    /usr/sbin/in.comsat     in.comsat
talk    dgram   udp     wait    root    /usr/sbin/in.talkd      in.talkd
#
# Must run as root (to read /etc/shadow); "-n" turns off logging in utmp/wtmp.
#
uucp    stream  tcp     nowait  root    /usr/sbin/in.uucpd      in.uucpd
#
# Tftp service is provided primarily for booting. Most sites run this
# only on machines acting as "boot servers."

• For each service listed in the inetd.conf file, the inetd daemon calls socket and bind, and for a TCP socket also listen.
• The maximum number of servers that inetd can handle depends on the maximum number of descriptors that inetd can create. Each new socket is added to the select descriptor set used by inetd.
• For TCP sockets, listen is called (not done for datagram sockets).
• After all sockets are created, select is used to wait for a socket to become readable. inetd spends most of its time blocked on this call.
• When select returns (a socket is readable), accept is called to make the new connection.

• The inetd daemon forks and the child process handles the service request (concurrent server model).
• The child closes all descriptors other than the socket descriptor that it is handling; for TCP this is the new connected socket.
• The child calls dup2 three times, thereby assigning the socket to descriptors 0, 1, and 2 (stdin, stdout, stderr).
• The original socket descriptor is then closed; this means that the child is reading from stdin and writing to stdout or stderr.
• The child calls getpwnam to get the password entry for the login-name specified in the config file.
• If this field is NOT root, then the child becomes the specified user by executing the setgid and setuid function calls. Since inetd is executing with a user ID of 0, the child process inherits that user ID across the fork, so it is able to become any user it desires.

• The child now does an exec of the appropriate server-program to handle the request, passing all args specified in the configuration file.
• If the socket is a stream socket, the parent must close the connected socket. The parent loops back to select, waiting for another socket to become readable.
• The descriptor handling in inetd deserves some attention.
• If inetd directs a connection request to port 21, the child will have a descriptor for port 21.
• Once the child dups the connected socket to descriptors 0, 1, and 2, the child closes the connected socket.
• This means the client is now connected to fd 0, fd 1, fd 2.

• Once the child calls exec, all descriptors remain open across the call, so the server uses descriptors 0, 1, and 2 to communicate with the client.
• On page 344 there is a very fine example of the daemon_inetd function. Review in detail along with Figure 12.12.
• In the daytime server of Figure 12.12, all of the socket creation code has been removed; instead we have
      daemon_inetd(argv[0], 0);
  The TCP connection uses descriptor 0.
  The infinite for loop is gone (the server is invoked once per client connection).
  Call getpeername with descriptor 0 as a way to determine the client's protocol address.
  Use MAXSOCKADDR since we do not know the size of the socket address structure.
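A minimal sketch of what the body of an inetd-launched daytime server might look like, assuming the connected socket has already been placed on descriptor 0 by inetd. The helper names peer_family and daytime_reply are hypothetical, and struct sockaddr_storage is used here in place of a MAXSOCKADDR-sized buffer; this is not the code of Figure 12.12.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Return the address family of the peer connected to fd, or -1 on error.
 * sockaddr_storage is large enough for any socket address structure. */
int peer_family(int fd)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof(ss);

    memset(&ss, 0, sizeof(ss));
    if (getpeername(fd, (struct sockaddr *)&ss, &len) < 0)
        return -1;
    return ss.ss_family;
}

/* Write one line holding the current time to fd.
 * Returns the number of bytes written, or -1 on error. */
ssize_t daytime_reply(int fd)
{
    char   buf[64];
    time_t now = time(NULL);
    /* ctime yields 24 characters plus a newline; replace it with CR LF */
    int    len = snprintf(buf, sizeof(buf), "%.24s\r\n", ctime(&now));

    if (len < 0 || (size_t)len >= sizeof(buf))
        return -1;
    return write(fd, buf, (size_t)len);
}
```

Under inetd, main would reduce to something like calling peer_family(0) to identify the client, daytime_reply(0) to answer it, and close(0); there is no socket, bind, listen, accept, or loop, because inetd has done all of that before the exec.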

• What has been done with inetd is actually rather simple; inetd has become the concurrent server for any and all TCP connections.
• The same concurrent parent/fork child model is still being used (design pattern??). The model has simply been moved to a higher level of abstraction, and all of the various servers are now run as exec's within its child processes.
• Most network services are run as daemons (in the background, independent of terminal control).
• The output from a daemon is usually handled by the syslogd daemon via the syslog function.

• To daemonize a program:
  • fork to get in the background
  • call setsid to create a new session and become session leader
  • fork again to avoid obtaining a new controlling terminal
  • change the working directory appropriately
  • change the file mode creation mask
  • close all unneeded files
• Assignment: Problem 12.1. Turn in by next class in hard format.

• Some thoughts on software process.
• In 1913, when Igor Stravinsky debuted 'le Sacre du Printemps', rioting broke out in the Paris theatre.
• Classical music fans are usually not given to fisticuffs.
• The problem is that music and musical beliefs are passionately held. Stravinsky, with his dissonance and revolutionary approach to formal music, upset a lot of people while wildly pleasing others. He of course changed the course of music in the 20th century. Stravinsky is why we have Jazz and R&R.

• Similarly with software processes we have zealots and religiously held beliefs.
• Like all deeply held beliefs they are held on the basis of faith; faith and faith alone (not unlike anything believed by anybody).
• There is NO demonstrable, repeatable proof or data that any software process, methodology, language or approach is any more beneficial than any other (at least in economic terms).
• Which is NOT to say that process, methodology or language are all interchangeable and without value.
  Quite to the contrary, process and methodology will have an economic impact on the viability of any product; but sales and marketing will have a much larger impact in many instances.

• So what is the point?
• Do not temper your beliefs on the presentations of zealots or other believers in process X or methodology Y.
• Instead perform as a scientist and engineer and look for the data which will allow you to accurately quantify the changes wrought by the adoption of a method, a process or a language.
• Remember that the software industry is most like the fashion industry: a large number of customers easily led by whatever the couture houses of Paris (or Silicon Valley) are currently espousing.
• The situation is made frenzied by the billions of dollars at stake.

• And in one final bash of the so-called 'cleanroom'.
• Even a Level 1 clean room (less than 1 particle per cubic yard) is NOT defect free.
• No non-trivial system can EVER be fully or accurately specified.
• Given the provability of the above, and the outlandish vagaries of the English language, how can a set of requirements written in English EVER be the basis for an accurate design?
• The problem is: how do we successfully take a set of flawed requirements and translate them into a meaningful and accurate notation in the solution space? This translation must incorporate an ability to accommodate CHANGE, which cleanroom really does not provide.

Class 10
• Frequently, services are performed by so-called daemons. A daemon is a program that opens a certain port and waits for incoming connections. If one occurs, it creates a child process which accepts the connection, while the parent continues to listen for further requests.
• This concept has the drawback that for every service offered, a daemon has to run that listens on the port for a connection to occur, which generally means a waste of system resources like swap space.
• Thus, almost all installations run a super-server that creates sockets for a number of services, and listens on all of them simultaneously using the select(2) system call. When a remote host requests one of the services, the super-server notices this and spawns the server specified for this port.

• The super-server commonly used is inetd, the Internet Daemon. It is started at system boot time, and takes the list of services it is to manage from a startup file named /etc/inetd.conf. In addition to those servers invoked, there are a number of trivial services, called internal services, which are performed by inetd itself. They include chargen, which simply generates a string of characters, and daytime, which returns the system's idea of the time of day.

• Why should you run the HTTP server standalone as opposed to running it from inetd? If your server has a lot of CPU cycles to burn, then there is nothing wrong with running HTTPd from inetd.
• Most servers don't. If you run the server from inetd, that means that whenever a request comes in, inetd must first fork a new process, then load the HTTPd binary. Once the HTTPd binary is loaded, HTTPd must then load and parse all of its configuration files (including httpd.conf, access.conf, and mime.types), which is quite a task to be done for each and every request that comes in.
• Now, contrast this with running standalone. When HTTPd gets a request, it makes a copy of itself (which requires no loading of a binary since shared text pages are used), and the copy handles the request. The configuration files have already been loaded on startup, and so we don't reload them every time.
• In fact, with NCSA HTTPd 1.4 and 1.5, it doesn't even have to make another copy of itself if there is an extra server already running. The parent will just pass off the request to the free child.

• Common Services
• telnet: allows remote users to connect to a host via telnet.
  Since user passwords are transmitted over the wire in plain text and can therefore be easily sniffed, we recommend disabling the service and using instead a secure terminal access mechanism which encrypts the entire session (such as kerberized telnet or SSH). If telnet access is a must, the service must be wrapped.
• ftp: allows remote users to transfer files to/from a host using ftp. Since user passwords are transmitted over the wire in plain text and can therefore be easily sniffed, we recommend disabling the service and using instead a secure file transfer mechanism which encrypts the entire session (SSH). Another solution is to install AFS, thus eliminating the need for an ftp daemon. If ftpd access is a must, the service must be wrapped.

• Common Services (cont)
• shell: allows remote users to run arbitrary commands on a host via the Berkeley rsh utility (via the trusted hosts mechanism using the .rhosts file). Highly insecure. Instead, run the Leland kerberized version of rsh.
• login: allows remote users to use the Berkeley rlogin utility to log in to a host without supplying a password (via the trusted hosts mechanism using the .rhosts file). Highly insecure. Instead, use the Leland kerberized version of rlogin.
• exec: allows remote users to execute commands on a host without logging in. Exposes remote user passwords on the network, thus highly insecure. Instead, use the Leland kerberized version of rexec.

• popper: allows remote users to use POP3 to retrieve their email from the server. Highly insecure. We recommend that you use the Leland kerberized version of popper if you have to run a popper service. Note: If you want to run the kerberized popper server, you will need another srvtab specifically for that service in order for certain mail readers to work. Please contact srvtab-request@leland for the srvtab.pop srvtab.
• comsat: used for incoming mail notification via biff and is largely unnecessary.
  We recommend disabling the service.
• finger: allows remote users to use the finger utility to obtain information about arbitrary users on a host. Highly insecure. We recommend disabling the service or using a more secure version such as cfinger.
• talk: allows remote users to use talk to have a real time conversation with a user on a host. Considered a light security hazard.

• tftp: allows remote users to transfer files from a host without requiring login. Used primarily by X-terminals and routers. Considered highly insecure. We recommend disabling the service. If tftp access is needed (for bootp clients), we recommend that the "-s" option be used and that the service be wrapped.
• uucp: allows remote users to transfer files to/from a host using the UUCP protocol. I can't honestly remember anyone needing this service who had it enabled. We recommend disabling it and disabling setuid on any uucp binaries.
• time: used for clock synchronization. We recommend disabling the service and using xntp to synchronize your system clock to Stanford.
• echo, daytime, discard, chargen: these services are used largely for testing and are largely unnecessary. If they are enabled to use UDP, they can be used for a Denial of Service attack. We recommend disabling at least the UDP version of them.

• bootp, bootps: used for bootp services. We recommend disabling it unless you are running a bootp server.
• systat: designed to provide status information about a host. Considered a potential security hazard. We recommend disabling the service.
• netstat: designed to provide network status information about a host to remote hosts. Considered a potential security hazard. We recommend disabling the service.
• Services based on RPC: RPC based services are used by protocols such as NFS and NIS. These protocols are a known security hazard. Our recommendation is against using NIS or NFS. Kerberos and AFS provide solutions to all the same problems without the security hazard.

• Configuring inetd at home
• Start with an inetd.conf file that is completely commented out. Then add only the services that you need.
• Services that you probably want are (taken from a stock Red Hat 4.0 machine):
      telnet  stream  tcp  nowait  root  /usr/sbin/tcpd  in.telnetd
      ftp     stream  tcp  nowait  root  /usr/sbin/tcpd  in.ftpd -l -a
• Things that open your machine up for .rhosts attacks:
      shell   stream  tcp  nowait  root  /usr/sbin/tcpd  in.rshd
      login   stream  tcp  nowait  root  /usr/sbin/tcpd  in.rlogind
• Things that have no reason running on your machine:
      gopher  stream  tcp  nowait  root  /usr/sbin/tcpd  gn

• First, inetd is dying because it receives a SIGPIPE when it tries to write to the socket returned by accept, since it does not install a signal handler for it. To fix this, install a signal handler for SIGPIPE.
• Now you may be wondering why a write to the socket returned by accept() generates a SIGPIPE. This brings us to the second issue. It seems that, at least under Linux 2.0.X, accept() will return a socket in the received queue if it is not in the SYN_SENT or SYN_RECV state, even when it has not gone through the ESTABLISHED state. By doing a stealth scan on the port, the socket goes from the SYN_RECV state to the CLOSED state. When you try to write to such a socket you get a SIGPIPE. The semantics of Linux's accept() seem to be non-standard. I wonder what else breaks by not handling SIGPIPE.

Class 11 – ICMP Protocol Overview
• Internet Control Message Protocol (ICMP), documented in RFC 792, is a required protocol tightly integrated with IP. ICMP messages, delivered in IP packets, are used for out-of-band messages related to network operation or mis-operation. Of course, since ICMP uses IP, ICMP packet delivery is unreliable, so hosts can't count on receiving ICMP packets for any network problem. Some of ICMP's functions are to:
  – Announce network errors, such as a host or entire portion of the network being unreachable, due to some type of failure.
A TCP or UDP packet directed at a port number with no receiver attached is also reported via ICMP. Announce network congestion. When a router begins buffering too many packets, due to an inability to transmit them as fast as they are being received, it will generate ICMP Source Quench messages. Directed at the sender, these messages should cause the rate of packet transmission to be slowed. Of course, generating too many Source Quench messages would cause even more network congestion, so they are used sparingly. Class 11 ICMP Overview Assist Troubleshooting. ICMP supports an Echo function, which just sends a packet on a round--trip between two hosts. Ping, a common network management tool, is based on this feature. Ping will transmit a series of packets, measuring average round--trip times and computing loss percentages. Announce Timeouts. If an IP packet's TTL field drops to zero, the router discarding the packet will often generate an ICMP packet announcing this fact. TraceRoute is a tool which maps network routes by sending packets with small TTL values and watching the ICMP timeout announcements. Class 11 – IGMP Protocol Overview • Internet Group Management Protocol (IGMP), documented in Appendix I of RFC 1112, allows Internet hosts to participate in multicasting. RFC 1112 describes the basic of multicasting IP traffic, including the format of multicast IP addresses, multicast Ethernet encapsulation, and the concept of a host group. • A host group is the set of hosts interested in traffic for a particular multicast address. Important multicast addresses are documented in the most recent Assigned Numbers RFC, currently RFC 1700. IGMP allows a router to determine which host groups have members on a given network segment. • The exchange of multicast packets between routers is not addressed by IGMP. Class 11 Network Working Group Smoot Carl-Mitchell Request for Comments: 1027 Texas Internet Consulting John S. 
Quarterman Texas Internet Consulting October 1987 Using ARP to Implement Transparent Subnet Gateways Status of this Memo • This RFC describes the use of the Ethernet Address Resolution Protocol (ARP) by subnet gateways to permit hosts on the connected subnets to communicate without being aware of the existence of subnets, using the technique of "Proxy ARP" [6]. It is based on RFC-950 [1], RFC-922 [2], and RFC-826 [3] and is a restricted subset of the mechanism of RFC-925 [4]. Distribution of this memo is unlimited. Acknowledgment The work described in this memo was performed while the authors were employed by the Computer Sciences Department of the University of Texas at Austin. Class 11 • Introduction The purpose of this memo is to describe in detail the implementation of transparent subnet ARP gateways using the technique of Proxy ARP. The intent is to document this widely used technique. • 1. Motivation • The Ethernet at the University of Texas at Austin is a large installation connecting over ten buildings. It currently has more than one hundred hosts connected to it [5]. The size of the Ethernet and the amount of traffic it handles prohibit tying it together by use of repeaters. The use of subnets provided an attractive alternative for separating the network into smaller distinct units. This is exactly the situation for which Internet subnets as described in RFC-950 are intended. Unfortunately, many vendors had not yet implemented subnets, and it was not practical to modify the more than half a dozen different operating systems running on hosts on the local networks. Class 11 • Therefore a method for hiding the existence of subnets from hosts was highly desirable. Since all the local area networks supported ARP, an ARP-based method (commonly known as "Proxy ARP" or the "ARP hack") was chosen. In this memo, whenever the term "subnet" occurs the "RFC-950 subnet method" is assumed. • 2. 
Design
• 2.1 Basic method
• On a network that supports ARP, when host A (the source) broadcasts an ARP request for the network address corresponding to the IP address of host B (the target), host B will recognize the IP address as its own and will send a point-to-point ARP reply. Host A keeps the IP-to-network-address mapping found in the reply in a local cache and uses it for later communication with host B.
• If hosts A and B are on different physical networks, host B will not receive the ARP broadcast request from host A and cannot respond to it. However, if the physical network of host A is connected by a gateway to the physical network of host B, the gateway will see the ARP request from host A.
• Assuming that subnet numbers are made to correspond to physical networks, the gateway can also tell that the request is for a host that is on a different physical network from the requesting host. The gateway can then respond for host B, saying that the network address for host B is that of the gateway itself. Host A will see this reply, cache it, and send future IP packets for host B to the gateway. The gateway will forward such packets to host B by the usual IP routing mechanisms. The gateway is acting as an agent for host B, which is why this technique is called "Proxy ARP"; we will refer to this as a transparent subnet gateway or ARP subnet gateway.
• When host B replies to traffic from host A, the same algorithm happens in reverse: the gateway connected to the network of host B answers the request for the network address of host A, and host B then sends IP packets for host A to the gateway. The physical networks of host A and B need not be connected to the same gateway. All that is necessary is that the networks be reachable from the gateway.
• With this approach, all ARP subnet handling is done in the ARP subnet gateways. No changes to the normal ARP protocol or routing need to be made to the source and target hosts.
From the host point of view, there are no subnets, and their physical networks are simply one big IP network. If a host has an implementation of subnets, its network masks must be set to cover only the IP network number, excluding the subnet bits, for the system to work properly.

• 2.2 Routing
As part of the implementation of subnets, it is expected that the elements of routing tables will include network numbers including both the IP network number and the subnet bits, as specified by the subnet mask, where appropriate.
• When an ARP request is seen, the ARP subnet gateway can determine whether it knows a route to the target host by looking in the ordinary routing table.
• If attempts to reach foreign IP networks are eliminated early (see Sanity Checks below), only a request for an address on the local IP network will reach this point. We will assume that the same network mask applies to every subnet of the same IP network. The network mask of the network interface on which the ARP request arrived can then be applied to the target IP address to produce the network part to be looked up in the routing table.
• In 4.3BSD (and probably in other operating systems), a default route is possible. This default route specifies an address to forward a packet to when no other route is found. The default route must not be used when checking for a route to the target host of an ARP request. If the default route were used, the check would always succeed. But the host specified by the default route is unlikely to know about subnet routing (since it is usually an Internet gateway), and thus packets sent to it will probably be lost. This special case in the routing lookup method is the only implementation change needed to the routing mechanism.
• If the network interfaces on which the request was received and through which the route to the target passes are the same, the gateway must not reply.
• Why?
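The routing check described in section 2.2 can be sketched in C. This is only an illustration under stated assumptions: `struct route`, its fields, and `proxy_arp_should_reply` are hypothetical names invented here, not the 4.3BSD routing structures, and addresses are plain 32-bit values in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routing-table entry (an assumption, not the 4.3BSD layout). */
struct route {
    uint32_t dest;        /* destination network number, host byte order */
    int      ifindex;     /* interface this route leaves through */
    int      is_default;  /* nonzero marks the default route */
};

/*
 * Decide whether an ARP subnet gateway should answer an ARP request for
 * target_ip that arrived on interface in_if, whose network mask is in_mask.
 * Returns 1 to send a proxy reply, 0 to stay silent.
 */
int proxy_arp_should_reply(const struct route *tbl, size_t n,
                           uint32_t target_ip, uint32_t in_mask, int in_if)
{
    /* Apply the arrival interface's mask to get the network part to look up. */
    uint32_t net = target_ip & in_mask;

    for (size_t i = 0; i < n; i++) {
        if (tbl[i].is_default)
            continue;               /* the default route must never match */
        if (tbl[i].dest != net)
            continue;
        /*
         * Route found. If it leaves through the interface the request
         * arrived on, do not reply: the target is on that same physical
         * network, or another gateway there is better placed to answer.
         */
        return tbl[i].ifindex != in_if;
    }
    return 0;                       /* no specific route: do not reply */
}
```

With a made-up table holding 128.83.1.0 on interface 0 and 128.83.2.0 on interface 1, a request arriving on interface 0 for 128.83.2.5 gets a proxy reply, while a request for 128.83.1.5 (same interface) or 128.83.3.5 (default route only) does not.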
• 2.3 Multiple gateways
• The simplest subnet organization to administer is a tree structure, which cannot have loops. However, it may be desirable for reliability or traffic accommodation to have more than one gateway (or path) between two physical networks. ARP subnet gateways may be used in such a situation: a requesting host will use the first ARP response it receives, even if more than one gateway supplies one.
• This may even provide a rudimentary load balancing service, since if two gateways are otherwise similar, the one most lightly loaded is the more likely to reply first.

• 2.4 Sanity checks
Care must be taken by the network and gateway administrators to keep the network masks the same on all the subnet gateway machines. The most common error is to set the network mask on a host without a subnet implementation to include the subnet number. This causes the host to fail to attempt to send packets to hosts not on its local subnet. Adjusting its routing tables will not help, since it will not know how to route to subnets.
• If the IP networks of the source and target hosts of an ARP request are different, an ARP subnet gateway implementation should not reply. This is to prevent the ARP subnet gateway from being used to reach foreign IP networks and thus possibly bypass security checks provided by IP gateways.
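The last sanity check, refusing to reply when source and target are on different IP networks, might look like the following C sketch. It assumes that "IP network" means the classful (class A/B/C) network number in use when RFC-950 subnetting was introduced; `classful_mask` and `same_ip_network` are hypothetical helper names, and addresses are 32-bit values in host byte order.

```c
#include <stdint.h>

/* Classful network mask for an IPv4 address (assumption: classful addressing). */
static uint32_t classful_mask(uint32_t ip)
{
    if ((ip & 0x80000000u) == 0)           return 0xFF000000u; /* class A: 0xxx */
    if ((ip & 0xC0000000u) == 0x80000000u) return 0xFFFF0000u; /* class B: 10xx */
    if ((ip & 0xE0000000u) == 0xC0000000u) return 0xFFFFFF00u; /* class C: 110x */
    return 0xFFFFFFFFu;   /* class D/E: treat each address as its own network */
}

/*
 * Returns 1 if the source and target of an ARP request share the same IP
 * network number, 0 otherwise. A gateway would stay silent on 0.
 */
int same_ip_network(uint32_t src_ip, uint32_t dst_ip)
{
    uint32_t m = classful_mask(src_ip);
    return m == classful_mask(dst_ip) && (src_ip & m) == (dst_ip & m);
}
```

In a gateway this test would run before any routing lookup, so that requests for foreign networks are dropped early.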