Log files - gozips.uakron.edu

2440: 141
Web Site Administration
Web Server Monitoring and Analysis
Instructor: Enoch E. Damson
Monitoring and Analyzing Systems
 Monitoring operating systems, Web servers,
applications, etc. typically involves analyzing log files
 Log files – contain information recorded by the
operating system in response to certain events
Monitoring and Analyzing the Web Server
Environment
Monitoring Operating Systems
 Logs are used to detect problems
 OS, application, or security problems
 Various tools can monitor performance
Monitoring Windows
 Performance monitoring allows you to compare system
performance over time
 Windows Task Manager highlights CPU and memory
usage
 You can modify services to notify you if a service fails
Windows Event Viewer
 The event viewer contains six event types shown in the left pane
Windows Event Logs
 System and application events display three levels
of messages
 Information
 Warning
 Error
 Because many messages can be generated, a filter
focuses on what you want to see
 Over time, the logs fill up so you should clear them
or save them
Monitoring Linux
 Logging is controlled by the syslogd daemon
 Below are some facilities which represent daemons
using syslogd
Eight Levels of Message Priorities
in syslogd
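The table for this slide did not survive in the text. As a hedged sketch, the block below lists the eight severity levels and a few facility codes as defined by the standard syslog protocol (not copied from the slide), and shows how a facility and a priority combine into a single numeric value. The helper names `pri_value` and `should_log` are hypothetical, chosen only for illustration.

```python
from enum import IntEnum


class Severity(IntEnum):
    """The eight standard syslog priorities; lower number = more urgent."""
    EMERG = 0    # system is unusable
    ALERT = 1    # action must be taken immediately
    CRIT = 2     # critical conditions
    ERR = 3      # error conditions
    WARNING = 4  # warning conditions
    NOTICE = 5   # normal but significant condition
    INFO = 6     # informational messages
    DEBUG = 7    # debug-level messages


# A subset of the standard facility codes (assumption: only a sample is shown).
FACILITIES = {
    "kern": 0, "user": 1, "mail": 2, "daemon": 3,
    "auth": 4, "syslog": 5, "lpr": 6, "news": 7,
    "cron": 9, "local0": 16,
}


def pri_value(facility: str, severity: Severity) -> int:
    """Combine facility and severity the way the syslog protocol encodes them."""
    return FACILITIES[facility] * 8 + severity


def should_log(message_level: Severity, configured_level: Severity) -> bool:
    """A selector such as mail.err matches err and anything more urgent."""
    return message_level <= configured_level
```

For example, `pri_value("mail", Severity.ERR)` combines facility 2 with priority 3, and `should_log(Severity.CRIT, Severity.ERR)` is true because a critical message is more urgent than the configured error threshold.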
Web Server Log Files
 Files that keep track of Web server transactions
 Most Web servers write two log files to disk:
 Access log – contains a line for each Web server request
 Error log – contains a line for each generated error
response
 When log files grow:
 A common practice is to put the log files on a separate
drive or partition
 A better solution is to rotate the log files

Rename or remove the log files at regular intervals (weekly,
monthly, etc.)
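Rotation by renaming can be sketched as follows. This is a minimal illustration, not how any particular server does it: the function name `rotate` and the `keep` parameter are hypothetical, and a real server usually has to be told to reopen its log file after the rename (dedicated tools such as logrotate handle that step).

```python
from pathlib import Path


def rotate(log_path: Path, keep: int = 4) -> None:
    """Shift log -> log.1 -> log.2 ..., discarding anything older than `keep`."""
    oldest = log_path.with_name(f"{log_path.name}.{keep}")
    if oldest.exists():
        oldest.unlink()  # remove the oldest generation
    # Rename from the highest number down so no file is overwritten.
    for i in range(keep - 1, 0, -1):
        src = log_path.with_name(f"{log_path.name}.{i}")
        if src.exists():
            src.rename(log_path.with_name(f"{log_path.name}.{i + 1}"))
    if log_path.exists():
        log_path.rename(log_path.with_name(f"{log_path.name}.1"))
    log_path.touch()  # start a fresh, empty log file
```

Calling `rotate(Path("access_log"), keep=4)` once a week would keep the current log plus four older generations.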
Web Server Log File Formats
 Most Web servers support at least two logging
formats:
 Common Logfile Format (CLF)
 Extended Logfile Format (ELF)
 Most Web servers also allow the administrator to
specify a custom format, along with the above
formats
 A standard logfile format makes it easier for users
to understand files from different servers
 Allows third-party logfile analysis tools to support many
different Web servers
Common Logfile Format (CLF)
 The NCSA and CERN Web servers first used this file format
 Many Web servers now support this format (IIS, Apache,
Netscape Enterprise, etc.)
 Each line in the file represents a unique request
 Has a fixed format with seven fields to be logged:







remotehost
rfc1413
authuser
[date]
“request”
status
bytes
Common Logfile Format…
 remotehost – remote (client) hostname or IP number
 rfc1413 – remote username
 rfc1413 defines a protocol used to determine the identity of a client that requests a
resource from the server
 Seldom used on Internet servers because it slows the server’s response
 A “-” is entered into the log if the server is unable to determine a userid
 authuser – when required, the username by which the user has authenticated
is provided
 A “-” is used for normal unrestricted requests
 [date] – date and time of the request
 Enclosed in brackets for potential spaces
 “request” – HTTP request line exactly as it came from the client
 Enclosed in quotes for potential spaces
 status – HTTP status code returned to the client
 bytes – content length of document transferred
 Example:
127.0.0.1 - - [24/Oct/2006:09:11:55 -0500] "GET /test.asp HTTP/1.1" 200 626
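Because the seven fields always appear in the same order, a CLF line can be split with one regular expression. The sketch below parses the example line above; the names `CLF_PATTERN` and `parse_clf` are hypothetical, and the date parsing assumes English month abbreviations.

```python
import re
from datetime import datetime

# One named group per CLF field, in the fixed order the format defines.
CLF_PATTERN = re.compile(
    r'(?P<remotehost>\S+) (?P<rfc1413>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_clf(line):
    """Return a dict of the seven CLF fields, or None if the line does not match."""
    m = CLF_PATTERN.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    # %b relies on English month names (assumption: default C/English locale).
    entry["date"] = datetime.strptime(entry["date"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # Some servers log "-" when no body was sent; treat that as zero bytes.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry


entry = parse_clf(
    '127.0.0.1 - - [24/Oct/2006:09:11:55 -0500] "GET /test.asp HTTP/1.1" 200 626'
)
```

Here `entry["rfc1413"]` and `entry["authuser"]` both hold "-", which matches the unrestricted, unidentified request described above.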
Extended Logfile Format (ELF)
 Used to log more information or omit certain
fields
 Allows the administrator to specify exactly which
fields to log and in what order
 Each line represents a request, as in CLF, but the
beginning of the file also contains some
configuration directives
 Each directive line begins with a #
 Two directives are required and must precede all entries
in the log file:


Version – specifies the version of the ELF to use
Fields – specifies what data to record in the logfile
Extended Logfile Format…
Example:
#Software: Microsoft Internet Information Services 5.1
#Version: 1.0
#Date: 2006-10-27 03:04:57
#Fields: date time c-ip cs-method cs-uri-stem sc-status sc-bytes cs-version
2006-10-27 03:04:57 127.0.0.1 GET /test.asp 200 626 HTTP/1.1
 The fields directive here specifies 8 out of several available fields:
 date – client request date
 time – client request time
 c-ip – client IP address
 cs-method – HTTP request method
 cs-uri-stem – file requested by client
 sc-status – HTTP status code returned to the client
 sc-bytes – number of bytes sent from server to client
 cs-version – version of HTTP used by client to connect to the server
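Since an ELF file describes its own columns, a reader can take the field names from the #Fields directive instead of hard-coding them. The sketch below applies that idea to the example from the Extended Logfile Format slide. `parse_elf` is a hypothetical helper, and it assumes values contain no spaces (quoted strings are not handled).

```python
def parse_elf(lines):
    """Parse ELF text, using the most recent #Fields directive as column names."""
    fields = []
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive lines look like "#Name: value".
            name, _, value = line[1:].partition(":")
            if name.strip() == "Fields":
                fields = value.split()
            continue
        # Entry line: pair each value with its field name, in order.
        records.append(dict(zip(fields, line.split())))
    return records


sample = [
    "#Software: Microsoft Internet Information Services 5.1",
    "#Version: 1.0",
    "#Date: 2006-10-27 03:04:57",
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status sc-bytes cs-version",
    "2006-10-27 03:04:57 127.0.0.1 GET /test.asp 200 626 HTTP/1.1",
]
records = parse_elf(sample)
```

All values are kept as strings here; a fuller reader would convert fields such as sc-status and sc-bytes to integers.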
Error Logs
 Contains informational messages and debugging
information
 Useful for:
 Finding problems with the server
 Debugging server-side programs and new configurations
 Most server packages allow the administrator to control
what types of messages are logged to the error log file
 The format is usually not configurable like ELFs but allows some
flexibility in choosing the severity and type of messages to log
 E.g., only critical messages may be logged if a server is running
smoothly
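The effect of a severity threshold can be illustrated by filtering an existing error log. This sketch assumes Apache-style lines in which the level appears in square brackets, and uses Apache's level names from most to least severe. The helper `at_least` is hypothetical, and the sample lines are illustrative only (the third is made up).

```python
import re

# Apache error-log levels, most severe first.
LEVELS = ["emerg", "alert", "crit", "error", "warn", "notice", "info", "debug"]

# Matches "[error]" as well as the "[module:error]" form used by newer versions.
LEVEL_RE = re.compile(r"\[(?:[\w.]+:)?(" + "|".join(LEVELS) + r")\]")


def at_least(lines, threshold):
    """Keep only lines whose level is `threshold` or more severe."""
    limit = LEVELS.index(threshold)
    kept = []
    for line in lines:
        m = LEVEL_RE.search(line)
        if m and LEVELS.index(m.group(1)) <= limit:
            kept.append(line)
    return kept


sample_errors = [
    "[Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied "
    "by server configuration: /export/home/live/ap/htdocs/test",
    "[Wed Oct 11 14:33:10 2000] [notice] caught SIGTERM, shutting down",
    "[Wed Oct 11 14:34:01 2000] [crit] hypothetical critical message",
]
critical_only = at_least(sample_errors, "crit")
```

With the threshold set to "crit", only the last sample line is kept, which mirrors a server configured to log critical messages only.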
Referrers
 Determines what Web page was used by the client
to access a server
 May be the URL of a search engine or any Web site with
a link to the Web server
 A “-” is used if there was no Referrer header sent
 The Referrer header is not sent in the following
circumstances:





The user enters the URL by hand
The user clicked on a link to a regular file and not a Web page on a
public site
The user loaded the URL from a bookmark file
The Referrer URL is on a private (internal) Web site
The user or browser has disabled sending the Referrer header
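Referring sites can be tallied from an access log that records the referrer. The sketch below assumes the widely used combined format, which appends a quoted referrer and a quoted user agent to each CLF line (HTTP itself spells the header name "Referer"). The helper `referring_sites` and the sample lines are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# The combined format ends with: "referer" "user-agent"
COMBINED_TAIL = re.compile(r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$')


def referring_sites(lines):
    """Count requests per referring host; "-" entries are grouped as (none)."""
    counts = Counter()
    for line in lines:
        m = COMBINED_TAIL.search(line)
        if not m:
            continue  # not a combined-format line
        ref = m.group("referer")
        if ref == "-":
            counts["(none)"] += 1
        else:
            counts[urlsplit(ref).netloc or ref] += 1
    return counts


# Illustrative lines only; the referrer URLs are made up.
sample_combined = [
    '127.0.0.1 - - [24/Oct/2006:09:11:55 -0500] "GET /test.asp HTTP/1.1" 200 626 '
    '"http://www.example.com/links.html" "Mozilla/4.0"',
    '127.0.0.1 - - [24/Oct/2006:09:12:03 -0500] "GET /test.asp HTTP/1.1" 200 626 '
    '"-" "Mozilla/4.0"',
    '127.0.0.1 - - [24/Oct/2006:09:12:40 -0500] "GET /test.asp HTTP/1.1" 200 626 '
    '"http://www.example.com/index.html" "Mozilla/4.0"',
]
sites = referring_sites(sample_combined)
```

The "(none)" bucket collects the requests that arrived without a referrer, for any of the reasons listed above.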
Monitoring IIS
 IIS has specific counters for use in the Performance
Monitor
 The System event viewer provides specific
information
 IIS has extensive logging capabilities
 There are default log formats used by various third-party
applications that analyze logs
Monitoring Apache
Error Logs
 By default, syslogd sends Apache messages to
/var/log/boot.log
 Location of the error log

ErrorLog logs/error_log
 Logs refers to /var/log/httpd
 You can create a different error log for each virtual host
Monitoring Apache
Transfer Logs
 Transfer logs tell you about the use of your Web
site
 The default log is based on a combined format
 Determined by the CustomLog directive in the
configuration file (httpd.conf)
 There are a number of sample formats
 By default, logs are stored in
/var/log/httpd/access_log
Monitoring DNS
 BIND uses a logging statement that you configure
in named.conf
 BIND defines logging in two parts:
 Channel defines where logging is sent
 Category defines what will be sent
 If the channel is going to a file, use the versions
option to define the number of backups
 Size option sets maximum size of the file
 print-time adds the date and time to the file
BIND Categories
Monitoring Exchange Server
 Exchange server uses the application portion of
Event viewer
 You can enable four types of logs
 audit – access to mailboxes
 protocol – commands used for SMTP, etc
 message tracking – senders and receivers
 diagnostic – analyze detailed problems
Analysis Tools for the Web Server
 Analysis tools extract system data from logs and
format the data
 For IIS, one of the popular tools is WebTrends
 Helps you determine the source of Web traffic
 Determines which pages are most popular
 Several different reports
 123LogAnalyzer is available for both IIS and
Apache
 Many reports are similar to WebTrends
Log File Analysis
 Simply looking at log files can provide a lot of information
about activities or requests on a server
 Simply counting the number of lines in an access log file
can help determine the number of hits
 Log files may be reviewed regularly to find the common
errors logged
 Some of the common errors include:




Dead links
Requests for non-existing files
CGI scripts not working properly
Permissions problems
 Some of the open-source log analyzers are:



Analog (http://www.analog.cx)
Webalizer (http://www.mrunix.net/webalizer)
Report Magic (http://www.reportmagic.org)
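Even without a dedicated analyzer, counting lines and status codes in an access log is straightforward. In the sketch below, `summarize` is a hypothetical helper and all sample lines except the first are made up. Status 404 typically points to dead links or requests for non-existing files, 403 to permissions problems, and 500 to failing server-side programs such as CGI scripts.

```python
from collections import Counter


def summarize(lines):
    """Return (hit count, status-code counts, most-requested missing paths)."""
    hits = 0
    statuses = Counter()
    missing = Counter()
    for line in lines:
        # The request is the first quoted field; status and bytes follow it.
        parts = line.split('"')
        if len(parts) < 3:
            continue  # skip lines that are not access-log entries
        hits += 1
        request = parts[1].split()
        tail = parts[2].split()
        status = tail[0] if tail else "?"
        statuses[status] += 1
        if status == "404" and len(request) >= 2:
            missing[request[1]] += 1
    return hits, statuses, missing


sample_log = [
    '127.0.0.1 - - [24/Oct/2006:09:11:55 -0500] "GET /test.asp HTTP/1.1" 200 626',
    '127.0.0.1 - - [24/Oct/2006:09:12:10 -0500] "GET /old.html HTTP/1.1" 404 209',
    '127.0.0.1 - - [24/Oct/2006:09:12:15 -0500] "GET /old.html HTTP/1.1" 404 209',
    '127.0.0.1 - - [24/Oct/2006:09:13:02 -0500] "GET /private/ HTTP/1.1" 403 199',
    '127.0.0.1 - - [24/Oct/2006:09:14:40 -0500] "GET /cgi-bin/form.cgi HTTP/1.1" 500 530',
]
hits, statuses, missing = summarize(sample_log)
```

A path that appears repeatedly in `missing` is a likely dead link worth fixing or redirecting.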
Statistics
 With the help of several log analyzer programs, some of the statistical
information that can be extracted includes:








Most requested pages
Top entry pages (the first page clients enter a site through)
Most used browsers
Bandwidth utilization
Most active domains
Top referring sites and URLs
Error counts
Information about search engines (most common search engines, common queries,
etc.)
 Some of the widely used commercial log analyzer products include:
 WebTrends (http://www.webtrends.com)
 Wusage (http://www.boutell.com/wusage)
 A database could also be used to store log information to increase efficiency of
logging and report generation
 Not all Web servers support logging to a database
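Once log entries are in a database, statistics such as those on the Statistics slide become simple queries. The sketch below uses an in-memory SQLite database as a stand-in for whatever database a site might actually use. The table layout, the `load` helper, and the sample rows are all hypothetical.

```python
import sqlite3


def load(entries):
    """Create an in-memory table and insert (remotehost, path, status, bytes) rows."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE access "
        "(remotehost TEXT, path TEXT, status INTEGER, bytes INTEGER)"
    )
    db.executemany("INSERT INTO access VALUES (?, ?, ?, ?)", entries)
    return db


# Illustrative rows only, shaped like already-parsed access-log entries.
rows = [
    ("127.0.0.1", "/test.asp", 200, 626),
    ("127.0.0.1", "/index.html", 200, 1024),
    ("192.0.2.7", "/test.asp", 200, 626),
    ("192.0.2.7", "/missing.html", 404, 0),
]
db = load(rows)

# Most requested pages
top_pages = db.execute(
    "SELECT path, COUNT(*) AS hits FROM access "
    "GROUP BY path ORDER BY hits DESC, path LIMIT 3"
).fetchall()

# Bandwidth utilization (total bytes sent)
total_bytes = db.execute("SELECT SUM(bytes) FROM access").fetchone()[0]

# Error count (client and server errors)
error_count = db.execute(
    "SELECT COUNT(*) FROM access WHERE status >= 400"
).fetchone()[0]
```

Each report is a single query over the same table, so new statistics can be added without re-reading the log files.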