Gray.ppt

title:
What info can you
get from log files?
How do you get it?
October 11, 2010
Examples
How many dynamic address clients do we have in a day?
How many address leases have been used?
Are any of our clients taking more than their fair share of addresses (i.e. one)?
How many iPods are on cruznet?
How many unique names logged into the authentication system?
What DNS servers are the kids using? Latvia?
ANSWER
Find the right log file and look through it.
I don't want to talk about graphing today,
except to say that regardless of how I graph,
I will need to produce “time series” data.
DHCP transaction logs
But first we need to review the DHCP protocol...
DHCP protocol
D.O.R.K.
And other stuff . . .
RA RA RA !!
If you believe DORK, this should be RK.
Clients renew (or try, anyway) by sending a
request to the server that last gave them a
lease. If they get an ack, then they get to
keep that address. If they get a nak, the
address is not available. Or the client might
get no response at all.
Sample log file
Example: How many MAC addresses?
Make a list of all the ethernet addresses in the
transaction log. Eliminate the duplicates. Count what
is left. Pretty simple.
We need some tools . . .
Unix tool: grep
Grep applies a pattern to each line of a file.
If the pattern exists in the line, the line is
printed. If the pattern doesn't match, the
line is skipped and the next line is
considered.
Print can mean print or it can mean pass to
the next program in a chain.
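A small sketch of both meanings of "print". The first two sample lines are the DHCPREQUEST lines quoted in these slides; the third is a made-up line (hypothetical) added only so that something fails to match.

```shell
# Two real DHCPREQUEST lines from the slides, plus one made-up
# line (hypothetical) that the pattern should skip.
sample_log() {
cat <<'EOF'
Oct 10 00:01:34 dhcpa dhcpd: [ID 702911 local7.info] DHCPREQUEST for 169.233.121.110 from d4:9a:20:a5:32:ab (iPod-touch-3) via 169.233.127.254
Oct 10 00:01:34 dhcpa dhcpd: [ID 702911 local7.info] DHCPREQUEST for 128.114.143.179 (128.114.142.212) from 00:13:72:bc:f5:81 (se81) via 128.114.143.240
Oct 10 00:01:35 dhcpa example: a made-up line about something else
EOF
}

# "Print" as in print: matching lines go to the terminal.
sample_log | grep DHCPREQUEST

# "Print" as in pass along: matching lines go to the next program,
# here wc -l, which counts them. tr strips any padding wc adds.
matches=$(sample_log | grep DHCPREQUEST | wc -l | tr -d ' ')
echo "matching lines: $matches"
```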
Unix tool: awk
Awk also operates on lines. It can reference
the words within the line, testing them for a
match, doing arithmetic on numbers, or just
printing them.
Awk uses curly braces { } to surround its actions, the
commands it runs on each line.
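A minimal sketch of the three things awk can do with words, run on one of the log lines quoted in these slides. The variable names are only for illustration.

```shell
line='Oct 10 00:01:34 dhcpa dhcpd: [ID 702911 local7.info] DHCPREQUEST for 169.233.121.110 from d4:9a:20:a5:32:ab (iPod-touch-3) via 169.233.127.254'

# Just printing: word 9 is the message type, word 13 the MAC address.
words=$(echo "$line" | awk '{print $9, $13}')
echo "$words"

# Testing a word for a match before printing anything.
server=$(echo "$line" | awk '{if ($4 == "dhcpa") print "seen on", $4}')
echo "$server"

# Arithmetic on a number: word 2 is the day of the month.
next_day=$(echo "$line" | awk '{print $2 + 1}')
echo "$next_day"
```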
Finding the MAC address
Oct 10 00:01:34 dhcpa dhcpd: [ID 702911
local7.info] DHCPREQUEST for 169.233.121.110
from d4:9a:20:a5:32:ab (iPod-touch-3) via
169.233.127.254
Oct 10 00:01:34 dhcpa dhcpd: [ID 702911
local7.info] DHCPREQUEST for 128.114.143.179
(128.114.142.212) from 00:13:72:bc:f5:81 (se81) via
128.114.143.240
In both cases, the address follows the word 'from'
Example with grep and awk
grep DHCPREQUEST dhcploga | awk '{if ($12 ==
"from") print $13; if ($13 == "from") print $14}'
Can also be written as:
grep DHCPREQUEST dhcploga | \
awk '{if ($12 == "from") print $13; \
if ($13 == "from") print $14}'
Output looks like:
7c:c5:37:72:49:cd
7c:c5:37:a2:58:ee
00:13:72:bd:c4:6d
60:33:4b:1c:d4:f9
c4:2c:03:8a:ea:d8
d4:9a:20:1e:fc:8f
70:f1:a1:ad:51:b5
00:13:02:b6:13:70
On Friday, there were 280K lines on dhcpa
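The two fixed word positions cover the two line shapes shown earlier. A variation, not the command used in these slides, is to let awk look for the word 'from' wherever it falls and print the word after it. This is only a sketch, tried on the two sample lines.

```shell
sample_log() {
cat <<'EOF'
Oct 10 00:01:34 dhcpa dhcpd: [ID 702911 local7.info] DHCPREQUEST for 169.233.121.110 from d4:9a:20:a5:32:ab (iPod-touch-3) via 169.233.127.254
Oct 10 00:01:34 dhcpa dhcpd: [ID 702911 local7.info] DHCPREQUEST for 128.114.143.179 (128.114.142.212) from 00:13:72:bc:f5:81 (se81) via 128.114.143.240
EOF
}

# For each DHCPREQUEST line, walk the words left to right and
# print the word that follows the first "from".
macs=$(sample_log | awk '/DHCPREQUEST/ {
    for (i = 1; i < NF; i++)
        if ($i == "from") { print $(i + 1); break }
}')
echo "$macs"
```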
Unix tool: uniq
Think unique
This tool compares lines in a file. When two or more
lines are identical, it replaces them with a single line.
But for this to be useful, all the addresses that are the
same need to be adjacent in the file. Said another way,
the file needs to be sorted.
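A quick sketch of why the sorting matters. The addresses come from the output list above, arranged by hand so that the duplicate is not adjacent.

```shell
# Three lines, two distinct addresses; the repeats are NOT adjacent.
macs='7c:c5:37:72:49:cd
00:13:72:bd:c4:6d
7c:c5:37:72:49:cd'

# uniq alone only collapses neighbouring duplicates, so nothing merges.
unsorted=$(printf '%s\n' "$macs" | uniq | wc -l | tr -d ' ')

# After sort, the duplicates sit next to each other and uniq merges them.
sorted=$(printf '%s\n' "$macs" | sort | uniq | wc -l | tr -d ' ')

echo "without sort: $unsorted lines, with sort: $sorted lines"
```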
Unix tool: sort
sort has lots of options. It can sort numerically or
alphabetically. It can start sorting on a column other
than the first word. It can reverse sort. Here, it doesn't
make any difference because we only need all
occurrences of the same MAC address to be on
adjacent lines.
Putting this together, we get . . .
grep DHCPREQUEST dhcploga | \
awk '{if ($12 == "from") print $13; \
if ($13 == "from") print $14}' | \
sort | uniq | wc
wc counts the lines and words in the file. The number
of lines is our answer: the number of different MAC
addresses seen by the system.
Answers for last week
Friday     17391
Thursday   19451
Wednesday  19435
Tuesday    19668
Monday     19040
Anything wrong here?
There are two servers – dhcpa and dhcpb. I should
have combined their logs.
DHCP assigns addresses to interfaces, not
computers. One computer that is on wireless and
wired at the same time will have two addresses.
A computer that is used on resnet (wired) in the
dorm and cruznet (wireless) on campus will show us
both MAC addresses, even if they're not used at the
same time.
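The first problem has a simple fix: feed both logs into the same pipeline so a client seen by both servers is counted once. A sketch, assuming the second log is a file named dhcplogb (a hypothetical name; only dhcploga appears in these slides). The dhcpb line below is made up for the demonstration.

```shell
workdir=$(mktemp -d)

# dhcploga: the two sample lines from the slides.
cat > "$workdir/dhcploga" <<'EOF'
Oct 10 00:01:34 dhcpa dhcpd: [ID 702911 local7.info] DHCPREQUEST for 169.233.121.110 from d4:9a:20:a5:32:ab (iPod-touch-3) via 169.233.127.254
Oct 10 00:01:34 dhcpa dhcpd: [ID 702911 local7.info] DHCPREQUEST for 128.114.143.179 (128.114.142.212) from 00:13:72:bc:f5:81 (se81) via 128.114.143.240
EOF

# dhcplogb: hypothetical line, the first client seen again by dhcpb.
cat > "$workdir/dhcplogb" <<'EOF'
Oct 10 00:02:10 dhcpb dhcpd: [ID 702911 local7.info] DHCPREQUEST for 169.233.121.110 from d4:9a:20:a5:32:ab (iPod-touch-3) via 169.233.127.254
EOF

# cat joins the two logs before any counting happens.
# wc -l prints only the line count; tr strips any padding.
count=$(cat "$workdir/dhcploga" "$workdir/dhcplogb" | grep DHCPREQUEST | \
    awk '{if ($12 == "from") print $13; if ($13 == "from") print $14}' | \
    sort | uniq | wc -l | tr -d ' ')
echo "distinct MAC addresses: $count"

rm -r "$workdir"
```

Counting each log separately and adding the results would give 3 here, because the first client would be counted once per server.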
Another example: iPhone count
There is a hint about how this might be done in the log file
slide.
If we collect all the Client-hostnames, sort, and run
uniq, we will not get the right answer.
Ethernet MAC addresses are unique, but there is no
reason to assume Client-hostnames are unique.
Monicas-iPhone? Could be lots of Monicas.
How could we fix this?
Client-hostname
For MAC address we had:
grep DHCPREQUEST dhcploga | \
awk '{if ($12 == "from") print $13; \
if ($13 == "from") print $14}'
Change this to:
grep DHCPREQUEST dhcploga | \
awk '{if ($12 == "from") print $14, $13; \
if ($13 == "from") print $15, $14}'
This looks like . . .
(Austin1) 00:26:bb:0d:26:54
(Austins-iPhone) cc:08:e0:1e:3b:43
(Austins-phone) 00:26:08:7c:87:fa
(Autumn-Saga) 00:25:bc:e1:5e:98
(Avery) 00:26:4a:18:da:7a
(Aviel-PC) 00:1b:24:f6:9d:0a
(Aviv-PC) f0:7b:cb:0c:04:fb
(AwesomeIV) f8:1e:df:d8:90:95
(AwesomePossum) 00:16:d4:a5:af:6c
(Awesomeness-PC) 00:23:4d:73:72:8c
Phone strategy
Select client names that contain phone with grep -i
The -i option ignores case, so we get both phone and
Phone
Retain the MAC addresses of the phones, and look
for unique ones as before, so
[stuff from previous page] | grep -i phone | \
awk '{print $2}' | sort | uniq | wc
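A runnable sketch of the same steps, with a few of the name and address pairs listed earlier standing in for "[stuff from previous page]".

```shell
# Name / MAC pairs taken from the sample output in these slides.
pairs() {
cat <<'EOF'
(Austin1) 00:26:bb:0d:26:54
(Austins-iPhone) cc:08:e0:1e:3b:43
(Austins-phone) 00:26:08:7c:87:fa
(Autumn-Saga) 00:25:bc:e1:5e:98
(Avery) 00:26:4a:18:da:7a
EOF
}

# With -i, both "iPhone" and "phone" match; keep the MAC (word 2),
# drop duplicates, and count. wc -l prints only the line count.
phones=$(pairs | grep -i phone | awk '{print $2}' | sort | uniq | wc -l | tr -d ' ')

# Without -i, the capital P in "iPhone" is missed.
lowercase_only=$(pairs | grep phone | awk '{print $2}' | sort | uniq | wc -l | tr -d ' ')

echo "with -i: $phones, without -i: $lowercase_only"
```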
Apple counts:
399 iPhones
552 iPods
42 iPads
How do we know this is right?
We can check the MAC vendor code to make sure it is
Apple
But really, we don't know.
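The vendor code is the first three bytes of the MAC address. A sketch of how those prefixes could be pulled out and tallied, using two of the pairs shown earlier. Deciding which prefixes belong to Apple is left to a lookup in the IEEE vendor registry; nothing here assumes an answer.

```shell
# Two name / MAC pairs from the sample output in these slides.
phones() {
cat <<'EOF'
(Austins-iPhone) cc:08:e0:1e:3b:43
(Austins-phone) 00:26:08:7c:87:fa
EOF
}

# The first 8 characters of the MAC (xx:xx:xx) are the vendor code.
# uniq -c puts a count in front of each distinct prefix.
phones | awk '{print substr($2, 1, 8)}' | sort | uniq -c | sort -rn

# The same list without the counts, one distinct prefix per line.
prefixes=$(phones | awk '{print substr($2, 1, 8)}' | sort -u)
echo "$prefixes"
```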
DHCP leases with awk
The DHCP server keeps a record of current
leases in a file on the server disk. If the
server needs to restart, the leases file tells it
about leases already granted. There is one
record for each managed IP address. Each
address is in one of several states:
DHCP lease states
An IP address can be:
FREE – address is available for assignment
BACKUP – address is free but on other server
ACTIVE – unexpired lease held by client
RELEASED – halfway toward FREE
ABANDONED – address conflict
? ABANDONED ?
An address is put into abandoned status when a
client tries to use it and finds that it is already in
use. ABANDONED is the same as FREE except
that the address will not be offered if untainted
free addresses are available.
A leases entry looks like:
lease 169.233.58.234 {
  starts 3 2010/10/06 02:29:38;
  ends 3 2010/10/06 05:15:00;
  tstp 3 2010/10/06 05:15:00;
  tsfp 3 2010/10/06 17:32:08;
  cltt 3 2010/10/06 02:29:28;
  binding state released;
  next binding state free;
  hardware ethernet a4:ba:db:b3:5c:cc;
  client-hostname "Catalina-PC";
}
Lease entry (cont'd)
There may be other lines in the file,
depending on the client. For example, DHCP
option 82 might give switch/port
information.
There is a block like this for every dynamic
address the server is managing.
Writing to the leases file
When an address changes state, a new entry
for that address is written to the end of the
leases file. For any address, the only entry
that is current is the LAST one. The server
never changes entries in the middle of the
file.
Problems
Each address lease block is 10+ lines. Since grep
acts on lines individually, it won't work to select
whole blocks.
It would sure be nice if each address was one line.
awk commands could get to be pretty long and hard
to type. Commands can be placed in a file. This file
might be called an awk script. Then it becomes:
awk -f script.awk
Script to fix the multi-line problem
{
    #
    # Input is dhcp.leases. Output is one line per addr.
    #
    if ($1 == "lease")           printf "\n%15s ", $2
    if ($1 == "starts")          printf "%s %s ", $3, $4
    if ($1 == "ends")            printf "%s %s ", $3, $4
    if ($1 == "binding")         printf "%8s ", $3
    if ($1 == "hardware")        printf "%s ", $3
    if ($1 == "uid")             printf "%s ", $2
    if ($1 == "client-hostname") printf "%s ", $2
}
This produces long lines like:
169.233.146.200 2010/10/11 02:32:42; 2010/10/11 14:32:42; active; 00:22:fa:df:6d:c8; "\001\000\"\372\337m\310"; "Kunal-PC";
169.233.144.244 2009/07/21 12:16:06; 2009/04/09 15:54:05; backup; 00:23:6c:05:8a:4f; "\001\000#l\005\212O";
The UID is weird binary. Time is GMT. Client-ID is
optional for the client.
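The trailing semicolons in that output come straight from the leases file. A variation on the script, not the one used in these slides, strips them and writes each record when it reaches the closing brace. This is a sketch only, tried on an abridged copy of the lease entry shown earlier (which has no uid line, so uid is left out).

```shell
# Abridged copy of the sample lease entry from the slides.
lease_entry() {
cat <<'EOF'
lease 169.233.58.234 {
  starts 3 2010/10/06 02:29:38;
  ends 3 2010/10/06 05:15:00;
  binding state released;
  next binding state free;
  hardware ethernet a4:ba:db:b3:5c:cc;
  client-hostname "Catalina-PC";
}
EOF
}

oneline=$(lease_entry | awk '
# Remove a semicolon at the end of the line, if there is one.
{ sub(/;$/, "") }
# Start a new record at each lease line, then append the parts we want.
$1 == "lease"                       { rec = $2 }
$1 == "starts" || $1 == "ends"      { rec = rec " " $3 " " $4 }
$1 == "binding" || $1 == "hardware" { rec = rec " " $3 }
$1 == "client-hostname"             { rec = rec " " $2 }
# The closing brace ends the block: print the finished record.
$1 == "}"                           { print rec }
')
echo "$oneline"
```

The "next binding state" line is ignored because its first word is "next", not "binding".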
Usually, interest in one subnet
We'll use the 2-net as an example
awk -f script.awk dhcp.leases | \
awk -F. '{if ($3 == 2) print}'
-F. means to use the period as the word separator
instead of white space. The line begins with an address
like 128.114.2.10 so the third word is the number 2.
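A small sketch of what -F. does. The first line pairs the 128.114.2.10 address from the text with made-up timestamps (hypothetical); the second starts like the sample output shown earlier.

```shell
# Leading spaces imitate the %15s padding the script puts before the address.
on_2net='   128.114.2.10 2010/10/11 02:32:42; 2010/10/11 14:32:42; active;'
elsewhere='169.233.146.200 2010/10/11 02:32:42; 2010/10/11 14:32:42; active;'

# With the period as separator, the words are "   128", "114", "2", ...
third=$(echo "$on_2net" | awk -F. '{print $3}')
echo "third word: $third"

# Only lines whose third word is 2 get through the filter.
kept=$(printf '%s\n%s\n' "$on_2net" "$elsewhere" | \
    awk -F. '{if ($3 == 2) print}' | wc -l | tr -d ' ')
echo "lines kept: $kept"
```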
For each IP, keep only the last line
# script name is lastline.awk
{
line[$1]= $0
} END {
for (item in line) print line[item]
}
$0 is the whole line. $1 is the IP address.
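A sketch of lastline.awk at work on three one-line records. The first record is made up (hypothetical): an earlier "active" entry for an address that is later written again as "released". Awk does not promise any order for "for (item in line)", so the output is sorted here to make it predictable.

```shell
records() {
cat <<'EOF'
169.233.58.234 2010/10/06 02:29:38; 2010/10/06 05:15:00; active;
169.233.146.200 2010/10/11 02:32:42; 2010/10/11 14:32:42; active;
169.233.58.234 2010/10/06 02:29:38; 2010/10/06 05:15:00; released;
EOF
}

# Each new line for an address overwrites the one stored before it,
# so only the last line per address survives to the END block.
current=$(records | \
    awk '{ line[$1] = $0 } END { for (item in line) print line[item] }' | sort)
echo "$current"
```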
Count of leases in use on 2-net
awk -f script.awk dhcp.leases | grep 128.114 | \
awk -F. '{if ($3 == 2) print}' | \
awk -f lastline.awk | grep active | wc
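Instead of counting only the active leases, the same one-line records can be tallied by state. A sketch, assuming the input has already been through lastline.awk so each address appears once. The three records below are made up (hypothetical addresses on the 2-net).

```shell
# Hypothetical one-line records in the layout script.awk produces:
# address, start date and time, end date and time, state.
records() {
cat <<'EOF'
   128.114.2.10 2010/10/11 02:32:42; 2010/10/11 14:32:42;   active;
   128.114.2.11 2010/10/10 08:00:00; 2010/10/10 20:00:00;   backup;
   128.114.2.12 2010/10/11 03:10:05; 2010/10/11 15:10:05;   active;
EOF
}

# Word 6 is the binding state. Strip its semicolon, then count
# how many times each state appears.
tally=$(records | awk '
{ sub(/;$/, "", $6); n[$6]++ }
END { for (s in n) print s, n[s] }
' | sort)
echo "$tally"
```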