Powerpoint version of the TCP/IP slide show

¦ The TCP/IP Protocol Suite
This talk will outline parts of TCP/IP
Focus on aspects relevant to network diagnosis
IP and DNS, but not FTP and telnet
For more information, see
ftp://ftp.cs.rutgers.edu/runet/tcp-ip-intro.txt & tcp-ip-admin.txt
Official source of info is the “RFC’s”
At Rutgers, on /rutgers/ref/rfcs. See rfc-index.txt
Or look at www.internic.net
Comer’s book on TCP/IP
¦ TCP/IP: The Protocol Stack
TCP/IP: a family of protocols
FTP, telnet, SMTP, etc: application layer
TCP and UDP: transport layer
gets packets to the right application, maintains a connection
IP: network layer
gets packets to the right machine, routing
Ethernet, etc: encapsulation
Each layer has its own addressing and tables.
A packet has headers from each layer:
Ethernet, IP, TCP, application (outermost first)
¦ TCP/IP: The Protocol Stack
All communications are based on packets
A packet has up to 1500 bytes of data
If you’re sending a big file, it gets broken up into packets
Packet addressed to a specific application on a specific
system.
Each packet has a “header”, which looks something like this:
IP:  From IP address 128.6.134.22
     To IP address 165.230.180.144
TCP: From FTP server
     To FTP program
FTP: Bytes number 22,123 through 23,620 of the file
<actual data>
¦ IP: protocol definition
IP: Internet Protocol
Gets packets from machine to machine
This is the level normally involved in network diagnosis
Addressing
32-bit address, displayed as four decimal octets, e.g.
128.6.134.22, or in hex, 80068616
Usually divided in three parts: network, subnet, host
128.6 is Rutgers [we have more than one network]
134 is subnet, in this case one of several NBCS staff nets
22 is the particular machine
Division not always on octet boundary. Subnet mask
255.255.255.0, or FFFFFF00. 1’s mark the network part
More recently, just show the number of bits, e.g. 128.6.0.0/16
¦ IP: Addressing
Address: 128.6.134.22 = 80068616
Mask: 255.255.255.0 = FFFFFF00
1’s mark the network: 80068600, i.e. 128.6.134.0 to 128.6.134.255
0’s mark the host: 16 hex, i.e. 22 decimal
Newer terminology: net is 128.6.134.0/24
Address: 165.230.180.141 = A5E6B48D
Mask: 255.255.255.192 = FFFFFFC0
1’s mark the network: A5E6B480, i.e. 165.230.180.128 to …180.191
0’s mark the host: 0D hex, i.e. 13 decimal
Newer terminology: net is 165.230.180.128/26
The main thing is to know what range of addresses are included in the net
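The mask arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch using the standard ipaddress module; the helper name describe is made up for illustration.

```python
import ipaddress

def describe(address, mask):
    """Split an address into network and host parts using a subnet mask."""
    iface = ipaddress.ip_interface(f"{address}/{mask}")
    net = iface.network
    # The host part is whatever is left where the mask has 0's.
    host = int(iface.ip) & int(net.hostmask)
    return {
        "hex": f"{int(iface.ip):08X}",
        "network": str(net),                  # newer "/bits" notation
        "first": str(net.network_address),    # start of the range
        "last": str(net.broadcast_address),   # end of the range
        "host": host,
    }

print(describe("128.6.134.22", "255.255.255.0"))
print(describe("165.230.180.141", "255.255.255.192"))
```

The "first" and "last" entries give the range of addresses included in the net, which is the thing the slide says you most need to know.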
¦ IP: Addressing
The Internet authorities assign Rutgers one or more nets
Currently it is 128.6.0.0/16 and 165.230.0.0/16
Normally use one or more /16 nets for large institutions
Possible to assign a range, e.g. 7.1.0.0/16 - 7.3.0.0/16
Class A: 1.0.0.0 to 126.255.255.255 used /8 mask
Class B: 128.0.0.0 to 191.255.255.255 used /16 mask
Class C: 192.0.0.0 to 223.255.255.255 used /24 mask
No longer true: now assigning “class B” nets everywhere
127.0.0.1 is “loopback”. (whole 127 is “reserved”)
¦ IP: Addressing
So Rutgers gets 128.6.0.0/16 and 165.230.0.0/16
We then allocate subnets to departments.
Currently 128.6 uses /24 mask and 165.230 uses /26
A department that needs up to 254 hosts on a subnet is allocated from
128.6, e.g. 128.6.4.0/24
A department that needs up to 62 hosts on a subnet is allocated from
165.230, e.g. 165.230.180.192/26
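A short sketch of the subnet sizes involved, again with Python's ipaddress module. The helper names are hypothetical. It assumes the usual convention that the first address (the network itself) and the last address (the broadcast) of each subnet are not given to hosts, which leaves 254 usable addresses in a /24 and 62 in a /26.

```python
import ipaddress

def usable_hosts(network):
    """Addresses left for hosts after removing the network and broadcast addresses."""
    return ipaddress.ip_network(network).num_addresses - 2

def split(network, new_prefix):
    """List the equal-sized subnets that fit inside a larger block."""
    block = ipaddress.ip_network(network)
    return [str(n) for n in block.subnets(new_prefix=new_prefix)]

print(usable_hosts("128.6.4.0/24"))
print(usable_hosts("165.230.180.192/26"))
print(split("165.230.180.0/24", 26))
```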
¦ IP: Routing
So how do you get from your machine to 165.230.180.22?
The network is a mesh of networks, routers, and point-to-point links
[Diagram of a sample network. Labels: Ethernets for Phys, Chem, GSB and Sociol; Busch fiber ring; Livingston fiber ring; Busch-Livingston fiber trunk. Any resemblance between this and Runet is pure coincidence.]
There are only two real choices:
Send it directly over the Ethernet
Send it to a router
¦ IP: Routing
You want to send from 128.6.134.22 to 165.230.180.144
The IP system will break the addresses into subnet and host
So you want to get from host 22 on 128.6.134.0
to host 16 on 165.230.180.128
These are obviously on different networks. So you want to send it to
the router.
The routers keep track of each other. The router on 128.6.134 knows
which router handles 165.230.180.128.
So 128.6.134.22 sends it to a router, e.g. 128.6.134.1.
The routers get it to the router for 165.230.180.128.
The last router sends it to 165.230.180.144.
This talk does not discuss the protocols used among the
routers
They keep track of each other, and compute a (nearly) optimal route
This is complex, and can go wrong: unreachable nets or looping
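The first decision on this slide (send directly, or hand the packet to a router) can be sketched as follows. The function name next_hop is made up, and the router address 128.6.134.1 is the example used above.

```python
import ipaddress

def next_hop(my_address, mask, destination, default_router):
    """Return the address the packet is handed to on the local Ethernet."""
    my_net = ipaddress.ip_interface(f"{my_address}/{mask}").network
    if ipaddress.ip_address(destination) in my_net:
        return destination       # same subnet: send it directly
    return default_router        # different subnet: send it to a router

# Different subnet, so the packet goes to the router.
print(next_hop("128.6.134.22", "255.255.255.0", "165.230.180.144", "128.6.134.1"))
# Same subnet, so the packet goes straight to the destination.
print(next_hop("128.6.134.22", "255.255.255.0", "128.6.134.2", "128.6.134.1"))
```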
¦ IP: Routing
Each system has a “routing table”. Show with netstat -r or route
Network Address   Netmask           Gateway Address   If    Metric
0.0.0.0           0.0.0.0           165.230.180.1     le0   1
127.0.0.0         255.0.0.0         127.0.0.1         lo0   1
165.230.180.0     255.255.255.192   165.230.180.4     le0   0
255.255.255.255   255.255.255.255   165.230.180.1     le0   1
Type of destination:
This one host (mask 255.255.255.255)
All hosts on network (mask 255.255.255.192)
Default route (address 0.0.0.0)
How to send it
Send directly on this Ethernet (metric 0)
Send through a router (metric 1): router is “gateway address”
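A rough sketch of how a system consults this table. Each entry matches if the destination, ANDed with the entry's netmask, equals the entry's network address. When several entries match, the one with the longest mask wins, so a host route beats a network route, which beats the default route. The slide does not spell out that tie-breaking rule, but it is how IP routing tables generally behave. The names ROUTES, to_int and lookup are made up for illustration.

```python
import ipaddress

# (network address, netmask, gateway, interface, metric), copied from the table above
ROUTES = [
    ("0.0.0.0", "0.0.0.0", "165.230.180.1", "le0", 1),
    ("127.0.0.0", "255.0.0.0", "127.0.0.1", "lo0", 1),
    ("165.230.180.0", "255.255.255.192", "165.230.180.4", "le0", 0),
    ("255.255.255.255", "255.255.255.255", "165.230.180.1", "le0", 1),
]

def to_int(dotted):
    """Convert a dotted-decimal address or mask to a 32-bit integer."""
    return int(ipaddress.ip_address(dotted))

def lookup(destination):
    """Return (gateway, interface, metric) of the most specific matching route."""
    dest = to_int(destination)
    best = None
    for addr, mask, gateway, ifname, metric in ROUTES:
        mask_int = to_int(mask)
        if (dest & mask_int) == to_int(addr):
            ones = bin(mask_int).count("1")   # length of the mask in bits
            if best is None or ones > best[0]:
                best = (ones, gateway, ifname, metric)
    # The default route (mask 0.0.0.0) matches everything, so best is always set.
    return best[1:]

print(lookup("165.230.180.22"))   # on the local subnet: metric 0, sent directly
print(lookup("128.6.4.4"))        # anywhere else: default route, via the router
```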
¦ IP: Routing
The main things you have to worry about in routing are:
Make sure your system has the right address
Make sure you have the correct subnet mask
Make sure your system knows how to find a router
Options for the address: configuration or DHCP/PPP
At RU we always configure the address, except dialups
Options for the subnet mask: configuration or ICMP/PPP
At RU we normally configure the subnet mask, except for dialups
Finding the router:
Configure a default router
Find a router from router discovery
Use DHCP/PPP
Use proxy ARP
¦ IP: Routing
Router Discovery
A special protocol that allows systems to find a router automatically
Your router must have router discovery enabled
Solaris uses router discovery by default, modified rdisc recommended
DHCP/PPP
Most dialup software automatically sets up the dialup as the default
We don’t currently support DHCP
Proxy ARP
Use “route add default XXXX 0” where XXXX is your own IP or name
Causes the system to think that the whole world is on your Ethernet
Routers will respond to the ARP request with their own address
Sort of a hack. Has only been used at Rutgers with older Suns
We encourage you to configure in the default router if your
system doesn’t do router discovery
¦ IP: Broadcasting
Ethernet and most other local media allow broadcasts
Send to Ethernet address of all ones
All stations will accept it as for them
IP allows you to use address of 255.255.255.255 to broadcast
Normally used to find things
E.g. ARP sends an Ethernet broadcast to find machine with a given IP
Routers send broadcast so you can find them
These days applications tend to use multicast instead
Problem: everyone sees broadcasts. Often they aren’t interested.
Multicast is based on a whole range of Ethernet addresses that are
similar to broadcast, but have to be enabled
Multicast IP addresses run from 224.0.0.0 through 239.255.255.255
Most routing protocols now use multicast
The MBONE (video conferencing) uses multicast
¦ IP: Broadcasting
It is possible to broadcast on other subnets
To send a broadcast on 128.6.4, send packet to 128.6.4.255
This can become a security issue (“smurf”)
Thus we often disable this feature
Multicasting has a whole protocol to distribute it
The routers keep track of which hosts are interested in getting specific
multicasts
Send traffic only to routers where someone wants it
This protocol runs over the entire Internet for audio and video
Called the “MBONE”
Not enabled by default; you’ll need to negotiate with TD
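For reference, the broadcast address of a remote subnet and the multicast test can both be computed with Python's ipaddress module. This is only an illustrative sketch; the helper names are made up.

```python
import ipaddress

def directed_broadcast(network):
    """Address that reaches every host on the given subnet (host bits all ones)."""
    return str(ipaddress.ip_network(network).broadcast_address)

def is_multicast(address):
    """True for addresses in the IP multicast range."""
    return ipaddress.ip_address(address).is_multicast

print(directed_broadcast("128.6.4.0/24"))
print(is_multicast("224.0.0.2"))
print(is_multicast("128.6.4.255"))
```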
¦ IP: testing
Single most useful test is ping
Relies on having another system you know is up and supports ping
Normally do several tests: hosts on local subnets, then on other
subnets
“ping 128.6.4.4” [recommend using IP address, not name]
128.6.4.4 is alive
or
no response from 128.6.4.4
If you can’t get anything from local addresses, suspect either a gross
error in your machine’s TCP/IP setup or a wiring/network hardware
problem.
If other machines on your net are OK, probably it’s your setup or your
machine’s cabling
If you can get to machine on your subnet but not to any other address,
probably your router is not working or there’s a problem in the wiring
going to it.
If you can get to machines on some subnets but not others, probably
the routers are having problems.
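The rules of thumb on this slide amount to a small decision table. The sketch below just encodes them. The function triage and its three inputs are hypothetical, and you still have to run the pings yourself and feed in what you saw.

```python
def triage(local_ok, others_on_net_ok, remote_ok):
    """
    local_ok:          can you ping hosts on your own subnet?
    others_on_net_ok:  are other machines on your net working normally?
    remote_ok:         "all", "some" or "none" of the other subnets answer.
    """
    if not local_ok:
        if others_on_net_ok:
            return "your machine's setup or cabling"
        return "gross TCP/IP setup error or wiring/network hardware problem"
    if remote_ok == "none":
        return "router not working, or a problem in the wiring going to it"
    if remote_ok == "some":
        return "the routers are having problems"
    return "no problem found at the IP level"

print(triage(local_ok=True, others_on_net_ok=True, remote_ok="none"))
```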
¦ IP: testing
For network delays or inconsistent results, use ping -s or -t
ping -s 128.6.4.4
PING 128.6.4.4: 56 data bytes
64 bytes from ns-lcsr.rutgers.edu (128.6.4.4): icmp_seq=0. time=5. ms
64 bytes from ns-lcsr.rutgers.edu (128.6.4.4): icmp_seq=1. time=3. ms
----128.6.4.4 PING Statistics----
2 packets transmitted, 2 packets received, 0% packet loss
round-trip (ms) min/avg/max = 2/3/5
If some sequence numbers are missing, packets are being
dropped.
Users will report this as “the network is slow”.
It is normal to drop the first packet.
This can be almost anything: bad wiring, bad hub, Runet.
If hosts on your subnet are OK, probably Runet problem.
If some or all hosts on your subnet drop packets, either badly
overloaded network or wiring, hub, etc
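Spotting missing sequence numbers by eye gets tedious on a long run. A minimal sketch that scans saved ping -s output for gaps in icmp_seq follows. The sample text is made up (sequence number 2 was removed on purpose) and the function name is hypothetical. Packets lost at the very end of a run will not show up as gaps; compare the transmitted and received counts for those.

```python
import re

def missing_sequence_numbers(ping_output):
    """Return the icmp_seq values that never came back, in ascending order."""
    seen = {int(n) for n in re.findall(r"icmp_seq=(\d+)", ping_output)}
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]

# Made-up sample in the same format as the output above.
SAMPLE = """\
64 bytes from ns-lcsr.rutgers.edu (128.6.4.4): icmp_seq=0. time=5. ms
64 bytes from ns-lcsr.rutgers.edu (128.6.4.4): icmp_seq=1. time=3. ms
64 bytes from ns-lcsr.rutgers.edu (128.6.4.4): icmp_seq=3. time=4. ms
"""

print(missing_sequence_numbers(SAMPLE))
```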
¦ IP: testing
Another good tool for testing connectivity is traceroute (tracert on some
PC’s)
C:\USERS\HEDRICK>tracert athos
1 130 ms 131 ms 120 ms calloway-a.rutgers.edu [165.230.80.66]
2 * 120 ms 121 ms busch-gw.rutgers.edu [165.230.80.65]
3 120 ms 121 ms 120 ms rucs-gw.rutgers.edu [165.230.96.130]
4 121 ms 120 ms 130 ms lcsr-gw.rutgers.edu [165.230.212.130]
5 130 ms 130 ms 130 ms athos.rutgers.edu [128.6.25.4]
Look for
Dropped packets (*), except for the first
Unreasonable times. 2 to 5 ms typical for internal Runet, 120 ms for
dialup, as here. Outside Rutgers, several hundred ms is typical.
Looping
Note that you can’t do much about any of this. However the output may
be useful to Telecom people if they can’t duplicate the problem
¦ IP: testing
Ifconfig (ipconfig for NT) and netstat (-i for Unix -e for NT) can be
useful for detecting problems with your network card, wiring, hubs,
etc.
Ifconfig and ipconfig can be useful for checking configuration options.
Netstat has many options, which let you look at open connections, the
routing table, etc. Netstat -i or -e gives packet counts. Look
specifically at output errors and collisions.
Output errors should be very small. They indicate that the system was
unable to send a packet, even after retrying 16 times. This indicates
a badly overloaded or broken network. A few per day are normal.
Collisions are a way to judge network load. Compare output packets
with collisions. More than 5% indicates some load, 10-20% is cause
for concern. However we’ve seen Ethernets work with up to 50%
collisions!
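The collision check is simple arithmetic: collisions as a percentage of output packets. A small sketch using the thresholds quoted above follows. The function names are made up, and so are the counts in the example.

```python
def collision_rate(output_packets, collisions):
    """Collisions as a percentage of output packets."""
    if output_packets == 0:
        return 0.0
    return 100.0 * collisions / output_packets

def judge(rate):
    """Apply the rules of thumb from this slide to a collision percentage."""
    if rate >= 10:
        return "cause for concern"
    if rate > 5:
        return "some load"
    return "light load"

# Example counts, as you might read them from netstat -i (made up).
rate = collision_rate(output_packets=20000, collisions=3000)
print(rate, judge(rate))
```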
¦ TCP: Protocol
IP is “unreliable”
Best effort at delivering packets, but can fail
Temporary equipment failure, routing in flux, etc
If you can’t deliver a packet, drop it
If you’re too badly overloaded, drop packets
TCP creates a reliable stream using IP packets
Breaks the conversation into packets, reassembles at other end
Assigns “sequence numbers” to each packet
Acknowledge arrival using “acknowledge sequence number”
If your packets haven’t been acknowledged, retransmit
Verifies checksum, ignore if bad. TCP is good in bad networks
Does “window management”, i.e. flow control
If network is “slow”, probably packets are being retransmitted
Some TCP implementations are better than others in bad conditions
¦ TCP: Protocol
 1 0.00000 geneva -> athos  TCP D=23 S=60756 Syn Seq=00 Len=0
 2 0.00454 athos -> geneva  TCP D=60756 S=23 Syn Ack=01 Seq=287 Len=0
 3 0.00008 geneva -> athos  TCP D=23 S=60756 Ack=288 Seq=01 Len=0
 4 0.00189 geneva -> athos  TCP D=23 S=60756 Ack=288 Seq=01 Len=24
 5 0.04711 athos -> geneva  TCP D=60756 S=23 Ack=25 Seq=288 Len=0
 8 0.12893 athos -> geneva  TCP D=60756 S=23 Ack=25 Seq=288 Len=12
 9 0.00009 geneva -> athos  TCP D=23 S=60756 Ack=300 Seq=25 Len=0
10 0.00068 geneva -> athos  TCP D=23 S=60756 Ack=300 Seq=25 Len=6
11 0.00160 athos -> geneva  TCP D=60756 S=23 Ack=25 Seq=300 Len=18
12 5.00023 geneva -> athos  TCP D=23 S=60756 Ack=300 Seq=25 Len=6
13 0.00010 athos -> geneva  TCP D=60756 S=23 Ack=31 Seq=300 Len=18
14 0.00009 geneva -> athos  TCP D=23 S=60756 Ack=318 Seq=31 Len=0
15 0.00008 geneva -> athos  TCP D=23 S=60756 Ack=318 Seq=31 Len=9
...
28 0.01424 athos -> geneva  TCP D=60756 S=23 Fin Ack=93 Seq=409 Len=0
29 0.00008 geneva -> athos  TCP D=23 S=60756 Ack=410 Seq=93 Len=0
30 0.00059 geneva -> athos  TCP D=23 S=60756 Fin Ack=410 Seq=93 Len=0
31 0.00221 athos -> geneva  TCP D=60756 S=23 Ack=94 Seq=410 Len=0
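One way to read a trace like this: a data packet with sequence number Seq and length Len is acknowledged by Ack = Seq + Len, and a data packet that repeats an earlier (sender, Seq, Len) is a retransmission. The sketch below applies that to rows 4 to 15 of the trace. The tuple layout and function names are made up, and the second column is assumed to be the time in seconds since the previous packet. It flags packet 12, which arrives after a 5 second pause, and packet 13. A plausible reading is that packet 10 was lost, but the trace alone does not prove that.

```python
# (packet number, seconds since previous packet, sender, Seq, Ack, Len)
TRACE = [
    (4, 0.00189, "geneva", 1, 288, 24),
    (5, 0.04711, "athos", 288, 25, 0),
    (8, 0.12893, "athos", 288, 25, 12),
    (9, 0.00009, "geneva", 25, 300, 0),
    (10, 0.00068, "geneva", 25, 300, 6),
    (11, 0.00160, "athos", 300, 25, 18),
    (12, 5.00023, "geneva", 25, 300, 6),
    (13, 0.00010, "athos", 300, 31, 18),
    (14, 0.00009, "geneva", 31, 318, 0),
    (15, 0.00008, "geneva", 31, 318, 9),
]

def expected_ack(seq, length):
    """Acknowledgement number the other end sends once this data has arrived."""
    return seq + length

def retransmissions(packets):
    """Return (packet number, delay) for data packets that repeat earlier data."""
    seen = set()
    repeats = []
    for num, delta, sender, seq, ack, length in packets:
        if length == 0:
            continue                      # pure acknowledgements carry no data
        key = (sender, seq, length)
        if key in seen:
            repeats.append((num, delta))
        seen.add(key)
    return repeats

print(expected_ack(1, 24))      # packet 4 is acknowledged by Ack=25 in packet 5
print(retransmissions(TRACE))
```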
¦ TCP: Protocol
TCP manages “ports” (or “sockets”)
Ports are 16-bit numbers. They are the way you get to a specific application
Telnet is port 23, mail is port 25, etc
These are “well known ports”, documented in /etc/services
(source IP, dest IP, source TCP port, dest TCP port) must be unique
All ports under 1024 are “privileged”
Users can’t run their own telnetd
Rsh, rlogin, etc, use this to avoid needing passwords
Rsh starts by sending the user name
Can you trust it?
If it’s coming from a port number below 1024, in theory you can
But it must be coming from a machine you trust
Reverse lookup is not secure by itself; need to do a forward lookup to check
Even so, faking addresses is too easy; only do this for your own
network
¦ TCP: Diagnosis
Diagnosis
Normally you don’t do any explicit diagnosis or testing of TCP
The main issue is applications: is the right daemon running?
The only problems I know of with TCP have been on large Unix
servers, when the kernel gets confused about memory. It may
become impossible to start a TCP connection.
“netstat -m” will show memory allocation failures under Unix
“netstat -s” [Unix and NT] will show all kinds of neat TCP stats
¦ Other Protocols
UDP: User Datagram Protocol (unreliable, like IP)
Used for simple question and answer, where overhead for a
connection is not justified: it’s easier just to ask again
Implements ports just like TCP, so you can find your application
Primary use is DNS
NFS used to use it, but found that TCP was better
The general trend is to use TCP for everything
ICMP: Internet Control Message Protocol
Used for infrastructure messages
Currently not secure
Ping (ICMP echo request; ICMP echo reply)
Routers send back: ICMP host or network unreachable
Default router sends: ICMP redirect (use this other router instead)
Subnet mask: ICMP subnet mask request and reply
¦ DNS: Protocol Description
The Domain Name System maps names to IP addresses, and
back
A distributed database
Central information distributed via “root name servers”
Campus information distributed via campus servers
Many campuses have departmental servers.
How do we handle ftp.athena.mit.edu?
Ask root name server who knows about edu
Ask the server for edu who knows about mit.edu
Ask the MIT campus server who knows about athena.mit.edu
Ask the Athena project server for the address for ftp.athena.mit.edu
¦ DNS: Protocol Description
Sometimes one server can handle several levels.
You actually ask each level the whole question
Root servers are listed in a configuration file
Ask root servers: ftp.ai.mit.edu
Response: for mit.edu, see MIT (names and IP addresses given)
Ask MIT server: ftp.ai.mit.edu
Response: for ai.mit.edu, see the AI servers (names and IP given)
Ask AI server: ftp.ai.mit.edu
Response: ftp.ai.mit.edu is an alias for mini-wheats.ai.mit.edu
We now go through the same thing for mini-wheats.
To avoid generating lots of traffic, this information is cached
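The walk from the root servers down, including the alias and the cache, can be imitated with a toy in-memory model. Nothing here talks to a real server. The SERVERS table, the resolve function and the IP address (taken from the 192.0.2.0/24 documentation range) are all made up for illustration, and the toy caches only final answers.

```python
# Toy data: what each server knows. Names follow the example above;
# the address is invented.
SERVERS = {
    "root": {"referrals": {"mit.edu": "mit"}},
    "mit": {"referrals": {"ai.mit.edu": "ai"}},
    "ai": {
        "aliases": {"ftp.ai.mit.edu": "mini-wheats.ai.mit.edu"},
        "addresses": {"mini-wheats.ai.mit.edu": "192.0.2.10"},
    },
}

cache = {}  # name -> address, filled in as answers arrive

def resolve(name, log=None):
    """Ask each level the whole question, starting at the root."""
    if name in cache:
        return cache[name]
    original, server = name, "root"
    for _ in range(20):                       # guard against referral loops
        data = SERVERS[server]
        if log is not None:
            log.append((server, name))        # record who was asked what
        if name in data.get("addresses", {}):
            cache[original] = data["addresses"][name]
            return cache[original]
        if name in data.get("aliases", {}):
            name = data["aliases"][name]      # alias: start over with the real name
            server = "root"
            continue
        for zone, next_server in data.get("referrals", {}).items():
            if name == zone or name.endswith("." + zone):
                server = next_server          # "for this zone, see that server"
                break
        else:
            raise LookupError(f"{server} cannot help with {name}")
    raise LookupError(f"too many steps resolving {original}")

queries = []
print(resolve("ftp.ai.mit.edu", queries))
print(queries)
```

A second call for the same name is answered from the cache without asking any server.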
¦ DNS: Protocol Description
You don’t want every PC to have to do this
So PC’s normally send their queries to a local DNS server, which will
handle queries for them and cache the information
So on a typical PC or workstation, all you have to do is configure a list
of DNS servers for it to talk to. [2 or 3, for safety]
The system administrator’s web page lists recommended servers
We recommend that departments run caching servers locally
They can be configured just to point to one of the servers we maintain.
We currently support DNS servers only for Suns.
There are Track packages to help you set this up.
Security: an issue if people do access control by hostname
Forward lookups go to authority for the domain; probably OK
Reverse lookups go to the authority for the IP address range, which can claim any
name
So normally do reverse and then look up name forward to check
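The reverse-then-forward check can be sketched like this. The function names are made up. The two system_* helpers use the standard socket module, and verified_name takes the lookups as parameters so the logic can be tried with fake data and no network.

```python
import socket

def system_reverse(ip):
    """Reverse lookup via the system resolver: IP address -> name."""
    return socket.gethostbyaddr(ip)[0]

def system_forward(name):
    """Forward lookup via the system resolver: name -> list of IP addresses."""
    return socket.gethostbyname_ex(name)[2]

def verified_name(ip, reverse=system_reverse, forward=system_forward):
    """Return the name for ip only if a forward lookup of that name includes ip."""
    try:
        name = reverse(ip)
        addresses = forward(name)
    except OSError:              # covers socket.herror and socket.gaierror
        return None
    return name if ip in addresses else None

# Fake lookups: the owner of 10.9.9.9's reverse zone falsely claims to be athos.
fake_reverse = {"128.6.25.4": "athos.rutgers.edu", "10.9.9.9": "athos.rutgers.edu"}
fake_forward = {"athos.rutgers.edu": ["128.6.25.4"]}

print(verified_name("128.6.25.4", fake_reverse.__getitem__, fake_forward.__getitem__))
print(verified_name("10.9.9.9", fake_reverse.__getitem__, fake_forward.__getitem__))
```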
¦ DNS: Diagnosis
For PC/Workstation, there isn’t much to do.
Verify that it has a correct list of servers configured
Verify that it has the right default domain or list of domains, to allow
people to type eden, rather than eden.rutgers.edu
To check servers, use “host” on Unix.
“host athos 128.6.4.4” will ask the server 128.6.4.4 about athos.
First check that you can get to the server using ping
Most supposed DNS problems are network problems
Particularly problems looking up names outside Rutgers
Diagnosing problems outside Rutgers is fairly complex
Do “host -r -v target server” several times, starting with root servers.
Eventually you’ll find that some server isn’t responding
¦ ARP: protocol definition
ARP: Address Resolution Protocol
Used for Ethernet and similar LAN technologies
Ethernet cards have 48-bit addresses.
All transmissions on the Ethernet must use those addresses.
ARP lets a system go from IP address to Ethernet address:
> ping 128.6.134.2
geneva -> (broadcast) ARP C Who is snagglepuss, 128.6.134.2
snagglepuss -> geneva ARP R 128.6.134.2 is 0:5:2:fa:dd:24
The result goes into an ARP table:
le0 128.6.134.1 255.255.255.255 00:60:70:2f:a0:29
le0 128.6.134.2 255.255.255.255 00:05:02:fa:dd:24
If you ping an IP address on your subnet, you should get an ARP
entry
It is very uncommon for there to be a problem with ARP itself
¦ ARP: displaying
arp -a [Unix, NT]
Net to Media Table
Device IP Address            Mask            Flags Phys Addr
------ --------------------- --------------- ----- -----------------
le0    nb-gw                 255.255.255.255       00:60:70:2f:a0:29
le0    toolbox               255.255.255.255       08:00:20:1a:e4:3e
le0    farside               255.255.255.255       00:60:70:2f:a0:29
le0    ALL-ROUTERS.MCAST.NET 255.255.255.255       01:00:5e:00:00:02
le0    geneva                255.255.255.255 SP    08:00:20:7e:a4:91
le0    128.6.134.222         255.255.255.255 U
Nb-gw is the default router.
Note how farside, on a different subnet, shows nb-gw’s Ethernet address
Toolbox is another machine on the same subnet
128.6.134.222 shows a request that hasn’t gotten a response
¦ Putting it Together
When I’m having trouble, I try to do testing in a systematic order,
basically following the order in which various protocols are used.
First, I try doing all tests using IP addresses rather than hostnames.
That tells immediately if the problem is DNS or something else.
If it is DNS, the main diagnostic tool is “host”
Otherwise, try pings for both local hosts and hosts on other subnets. If
no pings work, it’s likely to be a serious setup problem or hardware.
If you can’t get off the local subnet, ping the router.
To check your configuration, look at the routing table with netstat -r or
route, and verify that you have the right IP address, net mask, etc.
Ifconfig or ipconfig and various options of netstat can help.
For network slowness or inconsistency, ping -s (or -t) and traceroute
(tracert) are normally the most useful.
Snoop on a Sun is really useful. (On Linux: tcpdump) It lets you check
to see whether the packets you expect are being sent and
responses are arriving. This is how I debug most complex problems.
¦ Putting it Together
Use the various tools to check out each step. Suppose you want to do
“telnet athos”. Here are the steps you need to check:
DNS lookup of athos [DNS setup, are DNS servers working? Try it with
IP address first. Most supposed DNS failures are network failures.]
Look up IP address in routing table [are IP address, net mask right? Do
you have a default router set up?]
Look up next hop IP address (destination or router) in ARP table. If not
there, send ARP request [failures of ARP are unusual]
Send packet [use ping, netstat, ifconfig, etc. to check for overloaded
network, router down, wiring problems, etc.]
Is other end OK? [this requires cooperation from someone at the other
end. Try the same command to a number of different hosts. If all fail,
it’s probably your problem. If only some do, it may be theirs]
A lot depends upon care in trying several different tests. Ping to a
variety of machines. But you need to think carefully about the
results. E.g. if pings work on your side of the Raritan only, suspect
RUnet