
Network programs in C/C++

We begin to expose lower-level coding issues which languages like Python hide from our view

‘arpwatch’

• We can quickly write a C++ utility-program giving a ‘dynamic’ view of the ARP cache

• It uses UNIX’s ‘system()’ library-function to repeatedly execute Linux’s ‘arp’ command

• It uses UNIX’s ‘sleep()’ library-function to create a timed delay before any new view

• It uses ANSI terminal-control strings for specifying cursor-movements onscreen

Just an infinite loop

#include <stdio.h>      // for 'printf()'
#include <stdlib.h>     // for 'system()'
#include <unistd.h>     // for 'sleep()'

char legend[] = "Current contents of the ARP cache";

int main( int argc, char *argv[] )
{
    do  {
        printf( "\e[H\e[J" );           // erase entire screen
        printf( "\e[1;1H" );            // cursor to row 1, column 1
        printf( "%55s\n", legend );     // draw our title
        system( "/sbin/arp" );          // execute 'arp'
        printf( "\n" );                 // newline (flushes line-buffered 'stdout')
        sleep( 1 );                     // delay for 1-second
        }
    while ( 1 );    // an infinite loop (until user hits <CONTROL>-C)
}

From Python to C++

• Both of these are modern ‘high-level’ and ‘object-oriented’ programming languages

• But Python is intended to hide complexity, whereas C++ includes the features of the older C programming language, in which most of the original networking system code and applications were written, so C++ reveals more of what’s really going on

Recall ‘iplookup’ (in Python)

#!/usr/bin/python

import sys
try:    hostname = sys.argv[1]
except: hostname = "localhost"

import socket
try:    hostip = socket.gethostbyname( hostname )
except: hostip = "unknown"

print( "The IP-address for \'" + hostname + "\' is " + hostip )

We can redo ‘iplookup’ in C++

#include <netdb.h>      // for 'gethostbyname()'
#include <string.h>     // for 'strcpy()', 'strncpy()'
#include <stdio.h>      // for 'printf()'
#include <arpa/inet.h>  // for 'inet_ntop()'

#define BUFLEN  INET_ADDRSTRLEN

int main( int argc, char *argv[ ] )
{
    // use the command-line argument as hostname, else default to 'localhost'
    char hostname[ 64 ] = { 0 };
    if ( argc == 1 ) strcpy( hostname, "localhost" );
    else strncpy( hostname, argv[ 1 ], 63 );

    // look up the host's address and convert it to a printable string
    char hostip[ BUFLEN ] = { 0 };
    struct hostent *hp = gethostbyname( hostname );
    if ( !hp ) strcpy( hostip, "unknown" );
    else inet_ntop( AF_INET, hp->h_addr, hostip, BUFLEN );

    printf( "The IP-address for \'%s\' is %s \n", hostname, hostip );
}

In-class exercise #1

• Make a copy of the ‘iplookup.cpp’ sourcefile from our class website, but rename it ‘ipv6lookup.cpp’, like this:

$ cp /home/web/cruse/cs336/iplookup.cpp .

$ mv iplookup.cpp ipv6lookup.cpp

• Use your editor to make these changes:

– Change ‘AF_INET’ to ‘AF_INET6’, and

– Use ‘INET6_ADDRSTRLEN’ for BUFLEN
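A side note on this exercise: on typical Linux systems ‘gethostbyname()’ returns only IPv4 addresses by default, so a program that needs real IPv6 results may have to use a different lookup call. The sketch below uses the standard ‘getaddrinfo()’ function for that purpose. The helper name ‘lookup()’ is hypothetical and is not part of the class files; it is one possible approach, not the exercise’s intended answer.

```cpp
#include <arpa/inet.h>   // inet_ntop()
#include <netdb.h>       // getaddrinfo(), freeaddrinfo()
#include <netinet/in.h>  // struct sockaddr_in, struct sockaddr_in6
#include <sys/socket.h>  // AF_INET, AF_INET6
#include <cassert>       // assert()
#include <cstring>       // memset()
#include <string>

// Hypothetical helper: returns the first address of the requested family
// (AF_INET or AF_INET6) as text, or "unknown" if the lookup fails.
std::string lookup( const char *hostname, int family )
{
    assert( family == AF_INET || family == AF_INET6 );

    struct addrinfo hints;
    memset( &hints, 0, sizeof hints );
    hints.ai_family = family;           // restrict results to one family
    hints.ai_socktype = SOCK_DGRAM;     // avoid one result per socket type

    struct addrinfo *res = NULL;
    if ( getaddrinfo( hostname, NULL, &hints, &res ) != 0 )
        return std::string( "unknown" );

    // pick out the raw address field that matches the family
    const void *addr;
    if ( family == AF_INET )
        addr = &( (struct sockaddr_in *)res->ai_addr )->sin_addr;
    else
        addr = &( (struct sockaddr_in6 *)res->ai_addr )->sin6_addr;

    char text[ INET6_ADDRSTRLEN ] = { 0 };  // large enough for either family
    const char *ok = inet_ntop( family, addr, text, sizeof text );
    freeaddrinfo( res );
    return ok ? std::string( text ) : std::string( "unknown" );
}
```

A call such as ‘lookup( "localhost", AF_INET6 )’ would return whatever IPv6 address the local resolver reports for that name, or "unknown" if there is none.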

Intro to ‘sockets’ API

• To rewrite our Python demo (‘getquote’) in the C++ language, we use standard library functions which support the sockets API (i.e., ‘socket()’, ‘connect()’, ‘send()’, ‘recv()’)

• We will also use a special data-structure named ‘saddr’ of type ‘struct sockaddr_in’ and some predefined symbolic constants (e.g., AF_INET, SOL_SOCKET, etc.)

General overview

• The so-called ‘sockets’ API is intended to allow C programmers to write networking applications using familiar kinds of library functions and data-objects from the UNIX filesystem toolkit (e.g., ‘read()’, ‘write()’, ‘open()’, ‘close()’, acting upon ‘handles’)

• For a bigger picture of all this, we look at the network software Reference Models

The OSI Reference Model

It’s based on a logical separation of concerns…

Level 1: The Physical Layer

Level 2: The Link Layer

Level 3: The Network Layer

Level 4: The Transport Layer

Level 5: The Session Layer

Level 6: The Presentation Layer

Level 7: The Application Layer

The Internet Protocol Stack

It’s based on the more practical goals of economy and efficiency…

The Link Layer

The Network Layer

The Transport Layer

The Application Layer

Only four layers, instead of seven, and focus is on the software only

Comparison of models

OSI 7-Layer Model       TCP/IP 4-Layer Model

Application             Application
Presentation            (part of Application)
Session                 (part of Application)
Transport               Transport
Network                 Network
Link                    Link
Physical                Physical (hardware, not counted among the four software layers)

Terminology

At all these software layers the generic term is ‘packet’

Application:  packets at this layer are called ‘messages’

Transport:    packets at this layer are called ‘segments’

Network:      packets at this layer are called ‘datagrams’

Link:         packets at this layer are called ‘frames’

Physical:     at this layer is just a raw stream of bits

Most network discussions refer to 8-bit groups of binary digits as ‘octets’ rather than ‘bytes’ (although our textbook doesn’t adhere to this custom)

Physical bit-stream

• At its lowest level, network communication is achieved via a continuous stream of bits

• The NIC is able to understand this stream as having a logical structure consisting of finite packets of data (known as ‘ frames ’)

Frame Preamble (64 bits): AA AA AA AA AA AA AA AB (the final octet, AB, is the Start-of-Frame Delimiter)

Frame Contents (from 512 to 12144 bits, i.e., 64 to 1518 bytes): size will vary

Inter-Frame Gap (96 bits): 00 00 00 00 00 00 00 00 00 00 00 00 (silence for at least 12 bytes)

‘Manchester’ encoding

• To allow the Physical Layer hardware at the two ends of a physical connection to synchronize their separate internal clocks (so they can recognize distinct bits within the bit-stream), the bits can be encoded in a manner that conveys timing-information

• Example: the bit-sequence 1 0 0 1 1 0 1 1

[Figure: waveform for this bit-sequence, switching between the ‘high’ and ‘low’ signal levels]

There’s a transition (high-to-low=1, low-to-high=0) in the middle of each time-interval
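The encoding rule stated above can be sketched in a few lines of C++. This is only an illustration using the convention given on this slide (1 is high-to-low, 0 is low-to-high); the function name ‘manchester_encode()’ and the use of the letters ‘H’ and ‘L’ for the two signal levels are assumptions made for the example.

```cpp
#include <cassert>   // assert()
#include <string>

// Hypothetical helper: converts a string of '0'/'1' characters into a string
// of half-interval signal levels, 'H' (high) or 'L' (low), two per bit.
std::string manchester_encode( const std::string &bits )
{
    std::string levels;
    for ( std::string::size_type i = 0; i < bits.size(); i++ )
        {
        assert( bits[ i ] == '0' || bits[ i ] == '1' );
        // a 1-bit is a high-to-low transition, a 0-bit is low-to-high
        levels += ( bits[ i ] == '1' ) ? "HL" : "LH";
        }
    return levels;
}
```

For the slide’s example, ‘manchester_encode( "10011011" )’ returns "HLLHLHHLHLLHHLHL": every bit contributes a mid-interval transition, which is what lets the receiver recover the sender’s clock.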

Application-level concerns

• What will this program do for users?

– Weather info, File transfer, Stock quote

• Some examples of applications

– ‘telnet’, ‘ftp’, ‘mail’, ‘web browser’, ‘ssh’

[Figure: two hosts, each with application, transport, network, link, and physical layers, communicate through a switch/hub (link and physical layers) and a router (network, link, and physical layers); the two host applications share a ‘logical’ point-to-point connection]

We write code for this layer

Linux kernel developers write shared libraries for these lower layers

Linux systems programmers write these device-drivers

The ‘sockets’ API

Application Layer

Interface

Transport Layer

Interface

Network Layer

Interface

Link Layer

‘socket()’

• This function is like ‘open()’ for files, but it creates a data-structure in the kernel that is designed to support network messages rather than file accesses

• It’s designed to be quite ‘generic’ (i.e., can be used for various network technologies)

• It returns a non-negative number (like a file handle), but called a socket handle
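As a minimal sketch of the points above, the same ‘socket()’ call can create handles for different kinds of communication just by changing its arguments. The helper name ‘open_socket()’ is hypothetical, and the choice of the Internet address-family (AF_INET) is an assumption made for the example.

```cpp
#include <stdio.h>       // perror()
#include <sys/socket.h>  // socket(), AF_INET, SOCK_STREAM, SOCK_DGRAM
#include <cassert>       // assert()

// Hypothetical helper: asks the kernel for an Internet socket of the given
// type and returns its handle (non-negative), or -1 on failure.
int open_socket( int type )
{
    assert( type == SOCK_STREAM || type == SOCK_DGRAM );

    // third argument 0 lets the kernel pick the default protocol
    // (TCP for SOCK_STREAM, UDP for SOCK_DGRAM)
    int sd = socket( AF_INET, type, 0 );
    if ( sd < 0 ) perror( "socket" );
    return sd;
}
```

Like a file handle, the returned socket handle should be released with ‘close()’ when the program is done with it.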

‘connect()’

• We can use the ‘connect()’ function to let the kernel know which network host we want to send messages to and receive messages from (then we can use ‘write()’ to send, and can use ‘read()’ to receive)

• But it’s more usual in network programs to use ‘recv()’ and ‘send()’ (in place of ‘read()’ and ‘write()’) as extra options are possible

Function prototypes

From <unistd.h>

ssize_t write ( int fd, const void *buf, size_t len );
ssize_t read ( int fd, void *buf, size_t len );

From <sys/socket.h>

ssize_t send ( int sd, const void *buf, size_t len, int flags );
ssize_t recv ( int sd, void *buf, size_t len, int flags );

The ‘flags’ field lets an application request special handling of its network message (such as MSG_CONFIRM or MSG_DONTWAIT or MSG_OOB)
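To illustrate what the ‘flags’ argument adds, here is a small sketch that uses MSG_DONTWAIT to check a socket for data without blocking, something a plain ‘read()’ call cannot request for a single call. The helper name ‘poll_once()’ is hypothetical and is not part of our demo programs.

```cpp
#include <errno.h>       // errno, EAGAIN, EWOULDBLOCK
#include <sys/socket.h>  // recv(), MSG_DONTWAIT
#include <sys/types.h>   // ssize_t
#include <cassert>       // assert()
#include <cstddef>       // size_t, NULL

// Hypothetical helper: tries once to receive a message without waiting.
// Returns the number of bytes received, 0 if nothing was waiting,
// or -1 if a genuine error occurred.
// (A zero-length message would also yield 0; this sketch ignores that case.)
ssize_t poll_once( int sd, void *buf, size_t len )
{
    assert( sd >= 0 && buf != NULL );

    ssize_t n = recv( sd, buf, len, MSG_DONTWAIT );
    if ( n < 0 && ( errno == EAGAIN || errno == EWOULDBLOCK ) )
        return 0;       // no message available right now
    return n;
}
```

Passing 0 for ‘flags’ instead would make ‘recv()’ behave like ‘read()’ and wait until a message arrives.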

‘struct sockaddr_in’

• The ‘connect()’ function requires supplying a socket-address data-structure, designed for the particular type of socket being used

• For our ‘getquote’ example, we’ll need the socket to support Internet communication

Field                       Size       Purpose

sin_family                  2 bytes    This is where the AF_INET address-family identifier goes

sin_port                    2 bytes    This is where the application program’s port-number goes

sin_addr                    4 bytes    This is where the host’s IPv4 address goes

extra padding with zeros    8 bytes
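The layout above might be filled in as follows. This is a sketch only: the helper name ‘make_saddr()’ is hypothetical, and the address and port-number passed to it would be whatever the application requires. Note that the port-number is stored in network byte-order, which is why ‘htons()’ is applied.

```cpp
#include <arpa/inet.h>   // inet_pton(), htons()
#include <netinet/in.h>  // struct sockaddr_in
#include <stdint.h>      // uint16_t
#include <sys/socket.h>  // AF_INET
#include <cassert>       // assert()
#include <cstring>       // memset()

// Hypothetical helper: builds an IPv4 socket-address from a dotted-quad
// string (e.g., "127.0.0.1") and a port-number given in host byte-order.
struct sockaddr_in make_saddr( const char *dotted_quad, uint16_t port )
{
    struct sockaddr_in saddr;
    memset( &saddr, 0, sizeof saddr );      // zeros the 8 padding bytes too
    saddr.sin_family = AF_INET;             // address-family identifier
    saddr.sin_port = htons( port );         // port, in network byte-order

    // convert the text address to its 4-byte binary form
    int ok = inet_pton( AF_INET, dotted_quad, &saddr.sin_addr );
    assert( ok == 1 );                      // caller must pass a valid address
    (void)ok;                               // silences a warning if NDEBUG is set
    return saddr;
}
```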

‘connect()’ prototype

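int connect ( int sd, const struct sockaddr *saddr, socklen_t salen );

Here the ‘sd’ argument is the ‘socket descriptor’ (i.e., the ‘handle’ that was returned by the kernel when you called the ‘socket()’ function). A pointer to a ‘struct sockaddr_in’ must be cast to type ‘(struct sockaddr *)’ when it is passed as the second argument.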

The ‘saddr’ argument is a pointer to the ‘socket address’ data-structure you’ve created and initialized for the type of communication specified by the parameters you used when you asked the kernel to create the socket (e.g., the address-family and the desired transport-protocol)

The precise action taken by the kernel’s networking subsystem will be somewhat different, depending on the type of socket involved; for the connectionless datagram socket, no actual connection-messages are generated, but the socket is marked as one that does exchanges of messages only with the host-address and port-number specified.
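The datagram case described above can be observed with a short sketch: ‘connect()’ on a UDP socket succeeds even when no program is listening at the destination, because the kernel merely records the peer’s address. The helper names and the port-number used in the usage note are hypothetical.

```cpp
#include <arpa/inet.h>   // inet_pton(), htons(), ntohs()
#include <netinet/in.h>  // struct sockaddr_in
#include <stdint.h>      // uint16_t
#include <sys/socket.h>  // socket(), connect(), getpeername()
#include <unistd.h>      // close()
#include <cassert>       // assert()
#include <cstring>       // memset()

// Hypothetical helper: creates a datagram socket and 'connects' it to the
// given IPv4 address and port. Returns the socket handle, or -1 on failure.
int connected_udp_socket( const char *ip, uint16_t port )
{
    struct sockaddr_in saddr;
    memset( &saddr, 0, sizeof saddr );
    saddr.sin_family = AF_INET;
    saddr.sin_port = htons( port );
    if ( inet_pton( AF_INET, ip, &saddr.sin_addr ) != 1 ) return -1;

    int sd = socket( AF_INET, SOCK_DGRAM, 0 );
    if ( sd < 0 ) return -1;

    // for a datagram socket this only records the peer; nothing is sent
    if ( connect( sd, (struct sockaddr *)&saddr, sizeof saddr ) < 0 )
        {
        close( sd );
        return -1;
        }
    return sd;
}

// Hypothetical helper: returns the peer port-number the kernel has recorded
// for this socket, or 0 if the socket has no recorded peer.
uint16_t peer_port( int sd )
{
    assert( sd >= 0 );
    struct sockaddr_in peer;
    socklen_t len = sizeof peer;
    if ( getpeername( sd, (struct sockaddr *)&peer, &len ) < 0 ) return 0;
    return ntohs( peer.sin_port );
}
```

For instance, after ‘int sd = connected_udp_socket( "127.0.0.1", 54321 );’ the call ‘peer_port( sd )’ reports 54321, although no message has been exchanged.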

‘sendto()’ and ‘recvfrom()’

• In our ‘getquote.cpp’ demo, you could skip the ‘connect()’ step (used in our Python version) if you call ‘sendto()’ instead of ‘send()’, specifying your destination for the message via an extra parameter-pair

ssize_t sendto ( int sd, const void *msg, size_t len, int flags, const struct sockaddr *saddr, socklen_t salen );

ssize_t recvfrom ( int sd, void *buf, size_t len, int flags, struct sockaddr *saddr, socklen_t *salen );
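Here is a self-contained sketch of the ‘sendto()’ and ‘recvfrom()’ calls in action, with no ‘connect()’ step. It is not the ‘getquote.cpp’ demo: to avoid depending on any remote server, it sends a datagram between two sockets on the local loopback address. The helper name ‘loopback_sendto()’ and the 512-byte buffer size are assumptions made for the example.

```cpp
#include <arpa/inet.h>   // htonl(), htons()
#include <netinet/in.h>  // struct sockaddr_in, INADDR_LOOPBACK
#include <sys/socket.h>  // socket(), bind(), sendto(), recvfrom()
#include <unistd.h>      // close()
#include <cassert>       // assert()
#include <cstring>       // memset()
#include <string>

// Hypothetical helper: sends 'msg' from one UDP socket to another over the
// loopback interface and returns the text received ("" on any failure).
std::string loopback_sendto( const std::string &msg )
{
    char buf[ 512 ];
    assert( msg.size() <= sizeof buf );

    int rx = socket( AF_INET, SOCK_DGRAM, 0 );  // receiving socket
    int tx = socket( AF_INET, SOCK_DGRAM, 0 );  // sending socket
    std::string reply;

    // receiver's address: loopback, with port 0 so the kernel picks one
    struct sockaddr_in raddr;
    memset( &raddr, 0, sizeof raddr );
    raddr.sin_family = AF_INET;
    raddr.sin_addr.s_addr = htonl( INADDR_LOOPBACK );
    raddr.sin_port = htons( 0 );
    socklen_t rlen = sizeof raddr;

    if ( rx >= 0 && tx >= 0
        && bind( rx, (struct sockaddr *)&raddr, sizeof raddr ) == 0
        // learn which port the kernel actually assigned to the receiver
        && getsockname( rx, (struct sockaddr *)&raddr, &rlen ) == 0
        // the destination is named here, in the extra parameter-pair
        && sendto( tx, msg.data(), msg.size(), 0,
                   (struct sockaddr *)&raddr, sizeof raddr ) >= 0 )
        {
        // 'from' is filled in with the sender's address and port
        struct sockaddr_in from;
        socklen_t flen = sizeof from;
        ssize_t n = recvfrom( rx, buf, sizeof buf, 0,
                              (struct sockaddr *)&from, &flen );
        if ( n >= 0 ) reply.assign( buf, n );
        }

    if ( rx >= 0 ) close( rx );
    if ( tx >= 0 ) close( tx );
    return reply;
}
```

Note that ‘recvfrom()’ takes a pointer to the length, since the kernel writes back the actual size of the sender’s address.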

In-class exercise #2

• Try modifying our ‘getquote.cpp’ demo by eliminating use of the ‘connect()’ function and using ‘sendto()’ in place of ‘send()’

• See if you can keep the ‘recv()’ function, but if not, try using ‘recvfrom()’ instead
