Raw Sockets (+ other)

UNPv1 - Stevens, Fenner, Rudoff
Linux Socket Programming - Walton

Other

• readv( ) and writev( )
  – Read or write data into multiple buffers
  – Work on connected descriptors (no address argument)
• recvmsg( ) and sendmsg( )
  – Most general form of send and receive. Supports multiple buffers and flags.
  – Carry a peer address in msg_name, so they also work on connectionless sockets

cs423 - cotter

readv( ) and writev( )

• ssize_t writev (int filedes, const struct iovec *iov, int iovcnt);
• ssize_t readv (int filedes, const struct iovec *iov, int iovcnt);
  – filedes – socket identifier
  – iov – pointer to an array of iovec structures
  – iovcnt – number of iovec structures in the array (limit is IOV_MAX: at least 16; 1024 on Linux)
  – Return value is # of bytes transferred, or -1 on error

    struct iovec {
        void   *iov_base;   // starting address of buffer
        size_t  iov_len;    // size of buffer
    };

writev( ) example:

• Write a packet that contains multiple separate data elements
  – 2 character opcode
  – 2 character block count
  – 512 character data block

    char opcode[2] = "03";
    char blkcnt[2] = "17";
    char data[512] = "This RFC specifies a standard …";
    struct iovec iov[3];

    iov[0].iov_base = opcode;
    iov[0].iov_len = 2;
    iov[1].iov_base = blkcnt;
    iov[1].iov_len = 2;
    iov[2].iov_base = data;
    iov[2].iov_len = 512;
    writev (sock, iov, 3);

writev( ) example (diagram):

    iov[0].iov_base -> "03"                                  iov[0].iov_len = 2
    iov[1].iov_base -> "17"                                  iov[1].iov_len = 2
    iov[2].iov_base -> "This RFC specifies a standard .."    iov[2].iov_len = 512

    writev (sock, iov, 3);

recvmsg( ) and sendmsg( )

• ssize_t recvmsg (int sock, struct msghdr *msg, int flags);
• ssize_t sendmsg (int sock, const struct msghdr *msg, int flags);
  – sock – socket identifier
  – msg – struct msghdr that includes the message to be sent as well as the address and msg_flags
  – flags – sockets level flags
  – Return value is # of bytes transferred, or -1 on error

Sockets level flags

    Flag            Description                           RECV   SEND
    MSG_DONTROUTE   Bypass routing table lookup                   *
    MSG_DONTWAIT    Only this operation is nonblocking     *      *
    MSG_PEEK        Peek at incoming message               *
    MSG_WAITALL     Wait for all the data                  *
    MSG_OOB         Send or receive out-of-band data       *      *

struct msghdr

    struct msghdr {
        void         *msg_name;         // protocol address
        socklen_t     msg_namelen;      // size of protocol address
        struct iovec *msg_iov;          // scatter / gather array
        int           msg_iovlen;       // # of elements in msg_iov
        void         *msg_control;      // ancillary data (cmsghdr struct)
        socklen_t     msg_controllen;   // length of msg_control
        int           msg_flags;        // flags returned by recvmsg
    };

msg_flags

    Flag               Returned by recvmsg in msg_flags
    MSG_EOR            *
    MSG_OOB            *
    MSG_BCAST          *
    MSG_MCAST          *
    MSG_TRUNC          *
    MSG_CTRUNC         *
    MSG_NOTIFICATION   *

recvmsg example (ICMP)

    char recvbuf[BUFSIZE];
    char controlbuf[BUFSIZE];
    struct msghdr msg;
    struct iovec iov;

    sockfd = socket (PF_INET, SOCK_RAW, pr->icmpproto);
    iov.iov_base = recvbuf;
    iov.iov_len = sizeof(recvbuf);
    msg.msg_name = &sin;     // sockaddr struct (declared elsewhere), for sender's IP & port
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = controlbuf;
    for ( ; ; ) {
        msg.msg_namelen = sizeof(sin);
        msg.msg_controllen = sizeof(controlbuf);
        n = recvmsg(sockfd, &msg, 0);
        :
        :

What are (standard) sockets?
1. ?
2. ?
3. ?
Limitations
1. ?
2. ?
3. ?

What are Raw Sockets?
1. A way to pass information to network protocols other than TCP or UDP (e.g. ICMP and IGMP)
2. A way to implement new IPv4 protocols
3. A way to build our own packets (be careful here)

Why Would We Use Them?

• Allow us to access packets sent over protocols other than TCP / UDP
• Allow us to process IPv4 protocols in user space
  – Control, speed, troubleshooting
• Allow us to implement new IPv4 protocols
• Allow us to control the IP header
  – Control option fields (beyond setsockopt() )
  – Test / control packet fragmentation

Limitations?

• Reliability Loss
• No Ports
• Nonstandard communication
• No Automatic ICMP
• Raw TCP / UDP unlikely
• Requires root / admin

OS Involvement in Sockets

    User Space (Socket App)                          Kernel Space (Linux TCP/IP Stack)
    socket (AF_INET, SOCK_STREAM, IPPROTO_TCP)       Identify Socket Type -> TCP
    socket (AF_INET, SOCK_RAW, IPPROTO_ICMP)         Identify Socket Type -> IP
    socket (AF_PACKET, SOCK_RAW, htons(ETH_P_IP))    Identify Socket Type -> Ethernet

Normal Socket Operation (TCP)

• Create a socket
  – s = socket (PF_INET, SOCK_STREAM, IPPROTO_TCP)
• Bind to a port (optional)
  – Identify local IP and port desired and create data structure
  – bind (s, (struct sockaddr *) &sin, sizeof(sin))
• Establish a connection to server
  – Identify server IP and port
  – connect (s, (struct sockaddr *) &sin, sizeof(sin))
• Send / Receive data
  – Place data to be sent into buffer
  – send (s, buf, strlen(buf), 0);

Normal Socket Operation (TCP)

    User Space (Socket App)                                    Kernel Space (Linux)
    Create socket: socket( )                                   Protocol: TCP -> OK
    Bind to local port, connect to remote port: connect( )     TCP, IP, Internet -> OK
    Pass data thru local stack to remote port: send( )         TCP, IP, Internet -> OK

Raw Sockets Operation (ICMP)

• Create a socket
  – s = socket (PF_INET, SOCK_RAW, IPPROTO_ICMP)
• Since there is no port, there is no bind *
• There is no TCP, so no connection *
• Send / Receive data
  – Place data to be sent into buffer
  – sendto (s, buf, strlen(buf), 0, addr, len);

* More later

Raw Sockets Operation (ICMP)

    User Space (Socket App)                                    Kernel Space (Linux)
    Create socket: socket( )                                   Protocol: ICMP -> OK
    Pass data thru local stack to remote host: sendto( )       IP, Internet -> OK

Create a Raw Socket

• s = socket (AF_INET, SOCK_RAW, protocol)
  – protocol: IPPROTO_ICMP, IPPROTO_IP, etc.
• Can create our own IP header if we wish
  – const int on = 1;
  – setsockopt (s, IPPROTO_IP, IP_HDRINCL, &on, sizeof (on));
• Can "bind"
  – Since we have no port, the only effect is to associate a local IP address with the raw socket (useful if there are multiple local IP addrs and we want to use only 1).
• Can "connect"
  – Again, since we have no TCP, we have no connection. The only effect is to associate a remote IP address with this socket.

Raw Socket Output

• Normal output is performed using sendto( ) or sendmsg( ).
  – write( ) or send( ) can be used if the socket has been connected
• If IP_HDRINCL is not set, the starting addr of the data (buf) specifies the first byte following the IP header that the kernel will build.
  – Size only includes the data above the IP header.
• If IP_HDRINCL is set, the starting addr of the data identifies the first byte of the IP header.
  – Size includes the IP header
  – Set IP id field to 0 (tells kernel to set this field)
  – Kernel will calculate IP checksum
• Kernel can fragment raw packets exceeding the outgoing MTU

Raw Socket Input

• Received TCP / UDP packets are NEVER passed to a raw socket.
• Most ICMP packets are passed to a raw socket
  – (Some exceptions for Berkeley-derived implementations)
• All IGMP packets are passed to a raw socket
• All IP datagrams with a protocol field that the kernel does not understand (process) are passed to a raw socket.
• If a packet has been fragmented, it is reassembled before being passed to a raw socket

Conditions that include / exclude passing to specific raw sockets

• If a nonzero protocol is specified when the raw socket is created, the datagram protocol must match
• If the raw socket is bound to a specific local IP, then the destination IP must match
• If the raw socket is "connected" to a foreign IP address, then the source IP address must match

Ping – Overview

• This example is modified from code by Walton (Ch 18)
• Very simple program that uses ICMP to send a ping to another machine over the Internet.
• Provides the option to send a defined number of packets (or will send a default 25).
• We will build an ICMP packet (with a proper header, including checksum) that will be updated each time we send a new packet.
• We will display the raw packet that is received back from our destination host and will interpret some of the data.
  – (Output format is different from standard ping)

ICMP Packet header

    struct icmphdr {
        u_int8_t  type;       // ICMP message type (0)
        u_int8_t  code;       // ICMP type sub-code (0)
        u_int16_t checksum;   // E306, etc.
        u_int16_t id;         // echo datagram id (use pid)
        u_int16_t sequence;   // echo seq # 1, 2, 3, etc.
    };

    Packet body: 0123456789:;<=>?…B

myNuPing.c (overview)

• Global Declarations
  – struct packet { }, some variables
• unsigned short checksum (void *b, int len)
  – Calculate checksum for ICMP packet (header and data)
• void display (void *buf, int bytes)
  – Format a received packet for display.
• void listener (void)
  – Separate process to capture responses to pings
• void ping (struct sockaddr_in *addr)
  – Create socket and send out pings 1/sec to specified IP addr
• int main (int count, char *strings[ ])
  – Test for valid instantiation, create addr structure
  – Fork a separate process (listener) and use existing process for ping

#defines and checksum calc

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <strings.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <netdb.h>
    #include <sys/socket.h>
    #include <sys/wait.h>
    #include <netinet/in.h>
    #include <netinet/ip.h>
    #include <netinet/ip_icmp.h>
    #include <arpa/inet.h>

    #define PACKETSIZE 64
    struct packet {
        struct icmphdr hdr;
        char msg[PACKETSIZE-sizeof(struct icmphdr)];
    };

    int pid=-1;
    int loops = 25;
    struct protoent *proto=NULL;

    unsigned short checksum(void *b, int len)
    {
        unsigned short *buf = b;
        unsigned int sum=0;
        unsigned short result;

        for ( sum = 0; len > 1; len -= 2 )     // add up 16-bit words
            sum += *buf++;
        if ( len == 1 )                        // odd trailing byte
            sum += *(unsigned char*)buf;
        sum = (sum >> 16) + (sum & 0xFFFF);    // fold carries back in
        sum += (sum >> 16);
        result = ~sum;                         // one's complement
        return result;
    }

display - present echo info

    void display(void *buf, int bytes)
    {
        int i;
        struct iphdr *ip = buf;
        struct icmphdr *icmp = (struct icmphdr *)((char *)buf + ip->ihl*4);
        struct in_addr src, dst;

        src.s_addr = ip->saddr;                // inet_ntoa() takes a struct in_addr
        dst.s_addr = ip->daddr;
        printf("----------------\n");
        for ( i = 0; i < bytes; i++ ) {        // hex dump, 16 bytes per row
            if ( !(i & 15) )
                printf("\n%04X: ", i);
            printf("%02X ", ((unsigned char*)buf)[i]);
        }
        printf("\n");
        printf("IPv%d: hdr-size=%d pkt-size=%d protocol=%d TTL=%d src=%s ",
            ip->version, ip->ihl*4, ntohs(ip->tot_len), ip->protocol,
            ip->ttl, inet_ntoa(src));
        printf("dst=%s\n", inet_ntoa(dst));
        if ( icmp->un.echo.id == (pid & 0xFFFF) ) {   // id field holds only 16 bits of the pid
            printf("ICMP: type[%d/%d] checksum[%d] id[%d] seq[%d]\n",
                icmp->type, icmp->code, ntohs(icmp->checksum),
                icmp->un.echo.id, icmp->un.echo.sequence);
        }
    }

Listener - separate process to listen for and collect messages

    void listener(void)
    {
        int sd, i;
        struct sockaddr_in addr;
        unsigned char buf[1024];

        sd = socket(PF_INET, SOCK_RAW, proto->p_proto);
        if ( sd < 0 ) {
            perror("socket");
            exit(0);
        }
        for (i = 0; i < loops; i++) {
            int bytes;
            socklen_t len = sizeof(addr);

            bzero(buf, sizeof(buf));
            bytes = recvfrom(sd, buf, sizeof(buf), 0, (struct sockaddr *) &addr, &len);
            if ( bytes > 0 )
                display(buf, bytes);
            else
                perror("recvfrom");
        }
        exit(0);
    }

ping - Create message and send it

    void ping(struct sockaddr_in *addr)
    {
        const int val=255;
        int i, j, sd, cnt=1;
        struct packet pckt;
        struct sockaddr_in r_addr;

        sd = socket(PF_INET, SOCK_RAW, proto->p_proto);
        if ( sd < 0 ) {
            perror("socket");
            return;
        }
        if ( setsockopt(sd, SOL_IP, IP_TTL, &val, sizeof(val)) != 0)
            perror("Set TTL option");
        if ( fcntl(sd, F_SETFL, O_NONBLOCK) != 0 )
            perror("Request nonblocking I/O");

ping (cont)

        for (j = 0; j < loops; j++) {          // send pings 1 per second
            socklen_t len = sizeof(r_addr);

            printf("Msg #%d\n", cnt);
            if ( recvfrom(sd, &pckt, sizeof(pckt), 0, (struct sockaddr *)&r_addr, &len) > 0 )
                printf("***Got message!***\n");
            bzero(&pckt, sizeof(pckt));
            pckt.hdr.type = ICMP_ECHO;
            pckt.hdr.un.echo.id = pid;
            for ( i = 0; i < sizeof(pckt.msg)-1; i++ )
                pckt.msg[i] = i+'0';
            pckt.msg[i] = 0;
            pckt.hdr.un.echo.sequence = cnt++;
            pckt.hdr.checksum = checksum(&pckt, sizeof(pckt));
            if (sendto(sd, &pckt, sizeof(pckt), 0, (struct sockaddr *) addr, sizeof(*addr)) <= 0)
                perror("sendto");
            sleep(1);
        }
    }

myNuPing.c – main()

    int main(int count, char *argv[])
    {
        struct hostent *hname;
        struct sockaddr_in addr;

        if ( count < 2 || count > 3 ) {
            printf("usage: %s <addr> [loops]\n", argv[0]);
            exit(0);
        }
        if ( count == 3 )                      // we have specified a message count
            loops = atoi(argv[2]);             // otherwise keep the default of 25

        pid = getpid();
        proto = getprotobyname("ICMP");
        hname = gethostbyname(argv[1]);
        if ( proto == NULL || hname == NULL ) {
            fprintf(stderr, "cannot resolve protocol or host\n");
            exit(1);
        }
        bzero(&addr, sizeof(addr));
        addr.sin_family = hname->h_addrtype;
        addr.sin_port = 0;
        memcpy(&addr.sin_addr, hname->h_addr, hname->h_length);
        if ( fork() == 0 )
            listener();
        else
            ping(&addr);
        wait(0);
        return 0;
    }

“Ping” Output

    [root]# ./myNuPing 134.193.12.34 2
    Msg #1
    ----------------
    0000: 45 00 00 54 CC 38 40 00 80 01 1F BE 86 12 34 56
    0010: 86 12 34 57 00 00 E4 06 DF 07 01 00 30 31 32 33
    0020: 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 40 41 42 43
    0030: 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F 50 51 52 53
    0040: 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F 60 61 62 63
    0050: 64 65 66 00
    IPv4: hdr-size=20 pkt-size=84 protocol=1 TTL=128 src=134.193.12.35 dst=134.193.12.34
    ICMP: type[0/0] checksum[58374] id[2015] seq[1]
    Msg #2
    ***Got message!***
    ----------------
    0000: 45 00 00 54 CC 39 40 00 80 01 1F BD 86 12 34 56
    0010: 86 12 34 57 00 00 E3 06 DF 07 02 00 30 31 32 33
    0020: 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 40 41 42 43
    0030: 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F 50 51 52 53
    0040: 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F 60 61 62 63
    0050: 64 65 66 00
    IPv4: hdr-size=20 pkt-size=84 protocol=1 TTL=128 src=134.193.12.35 dst=134.193.12.34
    ICMP: type[0/0] checksum[58118] id[2015] seq[2]
    [root]#

Summary

• Raw sockets allow access to protocols other than the standard TCP and UDP
• Performance and capabilities may be OS dependent.
  – Some OSs block the ability to send packets that originate from raw sockets (although reception may be permitted).
• Raw sockets remove the burden of the complex TCP/IP protocol stack, but they also remove the safeguards and support that those protocols provide