Thesis Documentation

ERROR REPORTING MODEL FOR PINGER END-TO-END REPORTING
By
Muhammad Imran Alam
(2001 – NUST – BIT – 801)
A project report submitted in partial fulfillment of
the requirements for the degree of
Bachelors in Information Technology
In
NUST Institute of Information Technology
National University of Sciences & Technology
Rawalpindi, Pakistan
(2005)
CERTIFICATE
It is to certify that the contents and form of the project proposal entitled
"Error Reporting Model for PingER End-to-End Reporting" submitted by Muhammad
Imran Alam have been found satisfactory for the requirement of the degree.
Advisor: ____________________________
Professor (Dr. Arshad Ali)
Co-Advisor: ______________________________
Assistant Professor (Dr. Waqar Mahmood)
Member: _________________________________
Lecturer (Mr. Tashfeen)
Member: _________________________________
Lecturer (Mr. Mohammad Bilal)
DEDICATION
In the name of Allah the Most Gracious, the Most Merciful
To my dear Family
Especially to my parents
ACKNOWLEDGEMENTS
First, I thank Almighty Allah for giving me the power, ability and
spirit to complete my Final Year Project successfully. I am thankful to my family,
especially my parents and siblings, for they always encouraged me through their
devotion towards my studies and were there to pray for me whenever I
really needed it.
I am extremely thankful to Prof. Dr. Arshad Ali, my advisor during the
course of the project, for being so kind and for his excellent supervision. I am
thankful to him for his support, which was always there during my project despite
his busy schedule. I am also thankful to Dr. Waqar Mahmood for his kind help and for
always being there to help me out of my problems. I pay special thanks to Mr. Ejaz
Ahmed for his help, guidance, encouragement and continuous involvement in my
project, which helped me complete it successfully.
I would like to express my special gratitude to all of my committee, Dr.
Arshad Ali, Dr. Waqar Mahmood, Mr. Ejaz, Mr. Tashfeen and Mr. Mohammad Bilal,
for closely supervising the project and providing me with technical assistance.
Working with them gave me a good opportunity to learn from them and to gain
experience that will always help me in my practical life.
Muhammad Imran Alam
ABSTRACT
An administrator has to check every day whether all sites and nodes are
up. Sure, he could assign this task to others, but something more unobtrusive is
needed, something he grows accustomed to seeing.
This system would automatically email the problems to the
Administrator, publish the latest results regarding down sites and down nodes, find
anomalous sites using different statistical techniques, find invalid data and TLDs
in the PingER database, and make various histograms and pie charts that give the
Administrator a bird's-eye view of the PingER system. In short, this project
would make the Administrator's life less forgetful and the computer more useful, so
that the Administrator can spend the same time on more productive tasks.
TABLE OF CONTENTS
DEDICATION………………………………………………………………………..iii
ACKNOWLEDGEMENTS………………………………………………………….iv
ABSTRACT………………………………………………………………..…….……v
CHAPTER 1:PROJECT SPECIFICATION…………………………..1
1.1 INTRODUCTION .................................................................................................................. 1
1.2 PROBLEM STATEMENT ..................................................................................................... 1
1.3 PROJECT DOMAIN .............................................................................................................. 2
1.4 MOTIVATION ...................................................................................................................... 2
1.5 PROPOSED SOLUTION ....................................................................................................... 3
1.6 DELIVERABLES .................................................................................................................. 4
CHAPTER 2:LITERATURE REVIEW……………………………….7
2.1 PINGER INTRODUCTION................................................................................................... 7
2.2 MECHANISM ....................................................................................................................... 7
2.3 PINGER ARCHITECTURE .................................................................................................. 7
2.4 PINGER IMPORTANCE ....................................................................................................... 8
2.5 UNPREDICTABILITY ......................................................................................................... 9
2.6 QUALITY .............................................................................................................................. 9
2.7 RELATED WORK ............................................................................................................... 10
CHAPTER 3:REQUIREMENTS ANALYSIS……………………….11
3.1 ANATOMY OF THE PROBLEM........................................................................................ 11
3.2 SPECIFIC OBJECTIVES OF THE PROJECT ..................................................................... 12
3.3 MAKING DOWN SITES AND NODES FILTER ............................................................................ 12
3.4 MAKING DOWN NODE FILTER ...................................................................................... 13
3.5 MAKING ANOMALY DETECTION ENGINE .................................................................. 14
3.6 AUTOMATIC REPORT GENERATION TOOL ................................................................ 14
3.7 DRAMATIC-EVENTS ANOMALY DETECTION ENGINE ............................................. 15
3.8 DATA DISCREPANCY FINDER ....................................................................................... 16
CHAPTER 4:NON-FUNCTIONAL REQUIREMENTS AND
SPIDERING…………………………………………………………….17
4.1 SPIDER FOR PINGER MANAGEMENT ........................................................................... 18
4.2 BEST PRACTICES FOR PINGER MANAGEMENT SPIDER .......................................... 19
CHAPTER 5:IMPLEMENTATION DETAILS AND
ARCHITECTURE……………………………………………………...21
5.1 ANTICIPATED USERS ...................................................................................................... 21
5.2 DESCRIPTION .................................................................................................................... 21
5.3 ARCHITECTURE DIAGRAM ............................................................................................ 22
5.4 PINGER MANAGEMENT ARCHITECTURE.................................................................. 22
5.5 PERL SCRIPTS DESCRIPTION ......................................................................................... 23
5.6 DATABASE DESIGN ......................................................................................................... 39
CHAPTER 6:INTERFACE DESIGN………………………………...48
6.1 INTERFACE ::MAIN PAGE ............................................................................................... 48
6.2 INTERFACE::INTERNAL PAGES ..................................................................................... 48
6.3 CONCEPTUAL DIAGRAM ................................................................................................ 52
6.4 TECHNICAL DESIGN ........................................................................................................ 53
CHAPTER 7:RECOMMENDATION FOR FUTURE WORK…….54
7.1 STATISTICAL ANALYSIS OF PINGER DATA ............................................................... 54
7.2 OTHER TECHNIQUES ....................................................................................................... 56
CHAPTER 8:CONCLUSION…………………………………………58
REFERENCES…………………………………………………………...59
APPENDIX A:ADMINISTRATOR’S GUIDE………………………..61
A.1 PROJECT REQUIREMENTS ............................................................................................. 62
A.2 SOFTWARE INSTALLATION DETAILS ......................................................................... 62
A.3 PROJECT CONFIGURATION ........................................................................................... 66
LIST OF FIGURES
Figure 2.1 : PingER Architecture ....................................................................................... 8
Figure 3.1: Anatomy of the Problem ................................................................................ 12
Figure 5.1: Architecture Diagram .................................................................................... 22
Figure 5.2: PingER Management Architecture ................................................................. 22
Figure 5.3: Insert Node Status .......................................................................................... 24
Figure 5.4: Down Sites ..................................................................................................... 25
Figure 5.5: Down nodes .................................................................................................... 26
Figure 5.6: Site Status for Down Sites .............................................................................. 27
Figure 5.7 : Node Status for Down Nodes ........................................................................ 28
Figure 5.8: Update mail status down sites ........................................................................ 29
Figure 5.9: Update mail status down nodes ...................................................................... 30
Figure 5.10: Down sites graphs and charts ....................................................................... 31
Figure 5.11: Down nodes graphs and charts ..................................................................... 32
Figure 5.12: Min-RTT Anomaly....................................................................................... 33
Figure 5.13: Min-RTT graphs and charts ......................................................................... 34
Figure 5.14 : Dramatic Anomaly ...................................................................................... 36
Figure 5.16: Data Triggers ................................................................................................ 38
Figure 5.17: Mismatch TLD’s .......................................................................................... 39
LIST OF TABLES
Table 5.1: Insert node status ............................................................................................. 23
Table 5.2: Down Sites ....................................................................................................... 24
Table 5.3:Down nodes ...................................................................................................... 25
Table 5.4: Site Status for Down Sites ............................................................................... 26
Table 5.5: Node Status for Down Nodes .......................................................................... 27
Table 5.6: Update mail status down sites.......................................................................... 29
Table 5.7: Update mail status down nodes ....................................................................... 30
Table 5.8: Down sites graphs charts ................................................................................ 31
Table 5.9: Down nodes graphs and charts ........................................................................ 31
Table 5.10: Min-RTT Anomaly ........................................................................................ 32
Table 5.11: Min-RTT graphs and charts ........................................................................... 34
Table 5.12: Dramatic Anomaly......................................................................................... 35
Table 5.13: Down nodes graphs and charts ...................................................................... 36
Table 5.14: NOT-SET ....................................................................................................... 37
Table 5.15: Data Triggers ................................................................................................. 37
Table 5.16: Mismatch TLD’s ............................................................................................ 39
Table 5.17: Mailstatus for downsites ................................................................................ 39
Table 5.18: Status lookup for down sites .......................................................................... 40
Table 5.19: Node region for down sites ............................................................................ 40
Table 5.20: Site region for down sites ............................................................................. 40
Table 5.21: Dateid for down sites ..................................................................................... 40
Table 5.22: Monitors for down sites ................................................................................. 41
Table 5.23: Dssmonitors for down sites ........................................................................... 41
Table 5.24: Mail for down sites ........................................................................................ 41
Table 5.25: Node count for down node ............................................................................ 41
Table 5.26: Mail for down node........................................................................................ 42
Table 5.27: Node lookup for down node .......................................................................... 42
Table 5.28: Monitors for down node ................................................................................ 42
Table 5.29: Dateid for down node .................................................................................... 42
Table 5.30: Nodes for down node ..................................................................................... 43
Table 5.31: Dssnodes for down node ................................................................................ 43
Table 5.32: Mail for down node........................................................................................ 43
Table 5.33: Dtrtt for anomaly ........................................................................................... 44
Table 5.34: Minrtt for anomaly ......................................................................................... 44
Table 5.35: Dbminrtt for anomaly .................................................................................... 44
Table 5.36: Dssminrtt for anomaly ................................................................................... 44
Table 5.37: Dateid for anomaly ........................................................................................ 45
Table 5.38: Dtrtt for dramatic anomaly ............................................................................ 45
Table 5.39: Minrtt for dramatic anomaly .......................................................................... 45
Table 5.40: Dbminrtt for dramatic anomaly ..................................................................... 46
Table 5.41: Dssminrtt for dramatic anomaly .................................................................... 46
Table 5.42: Dateid for dramatic anomaly ......................................................................... 46
Table 5.43: Notset ............................................................................................................. 47
Table 5.44: TLD ................................................................................................................ 47
Table 5.45: DT .................................................................................................................. 47
Chapter 1
1 PROJECT SPECIFICATION
This chapter gives an overview of the whole project. The motivation of the project is
to find a good solution. The chapter also includes the technology overview, the
project scope and the challenges faced during the project.
1.1 INTRODUCTION
This system would automatically email the problems to the
Administrator, publish the latest results regarding down sites and down nodes, find
anomalous sites using different statistical techniques, find invalid data and TLDs
in the PingER database, and make various histograms and pie charts that give the
Administrator a bird's-eye view of the PingER system. In short, this project
would make the Administrator's life less forgetful and the computer more useful, so
that the Administrator can spend the same time on more productive tasks.
1.2 PROBLEM STATEMENT
To create an application that can automatically test the PingER
software and report errors to the administrators.
1.3 PROJECT DOMAIN

Network Management.

Network Anomaly Detection

Website Development
1.4 MOTIVATION
Presently, the Administrators at PingER (typically more than one person
administers PingER) face many difficulties in handling the system.
They have to do many repetitive tasks by hand, and most of the time they are fixing
problems in PingER, which is quite time consuming. Software that could do this
automatically would save a lot of time, and this project could be of great help to
them.
Typical problems faced by the network Administrators at PingER are as
follows:

The “Analyzer” (in this context, any “end user” such as a networker, physicist,
administrator or someone preparing a report, who looks at the data and finds
problems with it) may come across values that differ from the accepted values,
e.g., the packet loss for a particular site is not within the usual range for
that region. The Analyzer reports the problem to the Administrator, who verifies
the Analyzer's claim, then looks for the reason for the anomaly and tries to
solve the problem manually. These are the steps we wish to automate.

Some monitoring site or monitored node may be down for more than a month
or even permanently. Currently the Analyzer sees the problem while requesting
and viewing data from a particular monitoring site to a particular monitored
node and informs the Administrator about it. The Administrator then has to
figure out how to solve the problem. The Administrator may also come to know
that some nodes or sites are down while checking the system health on a regular
basis.

A monitored node moves from one geographical region to another without any
prior notice to the Administrator. E.g., at one instant the host may be in Africa
and later it may be in America (an example of this would be a proxy server
replacing the real web server). After being notified, the Administrator has to
find a new monitoring site in the same region where the old monitoring site went
down and update the database of monitored sites accordingly.
1.5 PROPOSED SOLUTION

Create filters that indicate which monitored sites and monitored nodes are
not available, e.g., monitored sites that have not responded for more than
60 days. Automate the detection and notification of the sites that are down.
To assist in the detection and alerting, knowledge of hosts already known to
be down needs to be kept and either included in the alert information or used
to avoid an alert being sent.

To check for anomalous behavior in data such as RTT (Round Trip Time),
packet loss and derived throughput, create a module that looks at the minimum
RTT for a host that is supposedly in some country or region and compares
this RTT with that of other hosts in the same region. If it is well below the
RTT of the other hosts in the region, then flag it. Another way would be to keep
a database of minimum RTTs for a region, or even for a host, and compare
the actual minimum RTT against it to see if it is well below the specified
minimum (well below might be 50%).

To report the anomalies in the network data and provide the PingER
management with reports that make the management task easy, use
various types of summary reports and graphs that show the health of the
network nodes and sites in a compact and user-friendly manner.
Using passive data as input, this project would be coded in Java, Perl
and CGI. Briefly, the goal of the project is to reduce human intervention and
increase automation, while at the same time minimizing false alerts and missing few
of the important anomalous events.
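The second way of checking RTT described above, comparing an observed minimum RTT against a stored baseline, can be sketched as follows. This is a minimal illustration in Python rather than the project's own Perl; the table name BASELINE_MIN_RTT_MS, its values and the function name are hypothetical, and "well below" is read here as "less than 50% of the stored minimum", which is only one possible interpretation.

```python
# Hypothetical baseline table: expected minimum RTT (ms) per region.
# The numbers are illustrative only, not measured PingER values.
BASELINE_MIN_RTT_MS = {"Africa": 180.0, "North America": 40.0}


def below_baseline(region, observed_min_rtt_ms, fraction=0.5):
    """Return True when the observed minimum RTT is below `fraction`
    of the stored baseline for the region (0.5 = the 50% example)."""
    baseline = BASELINE_MIN_RTT_MS.get(region)
    if baseline is None:
        return False  # no baseline stored, nothing to compare against
    return observed_min_rtt_ms < fraction * baseline
```

A host registered in a region with a 180 ms baseline but showing a 60 ms minimum would be flagged, which is the kind of signal expected when a proxy in another region answers for the real server.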
1.6 DELIVERABLES

Create filters that indicate which monitoring sites are not available.

Create filters that indicate which monitored sites are not available, i.e.,
which monitored sites have not responded for 60 days.

Make an Anomaly Detection Engine that detects various anomalies in
network parameters such as RTT and packet loss and reports them to the
concerned person, e.g., identifying when a host physically moves to a new
location (a named web server that is actually a proxy and is not where it used
to be).

Make an automatic report generation tool for the management that generates
daily, monthly and yearly reports regarding problems in the monitoring data,
e.g., a graph showing which sites were down at what time and reports showing
anomalies in RTT.

Create a module that checks whether there has been no data from a monitoring
or a monitored site for more than, say, 30 days (this number should be
configurable) and, if so, sends email to the specified administrators.

Create a module that checks for anomalous events in the RTT information and,
if there are any, sends reports to the Administrators. This is separate from
the anomalous event detection for the bandwidth time series. In this case,
the events would be dramatic, e.g., the min RTT dropped by a factor of two, or
there was no response for a long period of time (the period should be
specifiable as an option, but might be days or even months).

A module that finds discrepancies in the input data, e.g., at the time of
registration of a monitored host the data entered might be incorrect or
incomplete, and as a result the monitored host remains “not-Set”.
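As an illustration of the deliverable that mails the administrators when a site has produced no data for a configurable number of days, the following Python sketch separates the check from the construction of the message. It is not the project's Perl code; the function names, host names and addresses are hypothetical, and the message is only built, not sent.

```python
from datetime import date
from email.message import EmailMessage


def stale_sites(last_seen, today, max_days=30):
    """Return the sites whose most recent data is older than max_days."""
    return sorted(s for s, d in last_seen.items() if (today - d).days > max_days)


def build_alert(sites, recipients, max_days=30):
    """Build (but do not send) a plain-text alert listing the stale sites."""
    msg = EmailMessage()
    msg["Subject"] = f"PingER: {len(sites)} site(s) without data for over {max_days} days"
    msg["To"] = ", ".join(recipients)
    msg.set_content("\n".join(sites))
    return msg
```

Keeping max_days as a parameter reflects the requirement that the number of days be configurable.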
Chapter 2
2 LITERATURE REVIEW
This chapter discusses the literature studied in relation to the project.
2.1 PINGER INTRODUCTION
PingER is a tool for measuring the parameters of a network. These
parameters can then be used to determine the situation on a link, which can
help ISPs and other service providers in planning and policy making.
[14]
2.2 MECHANISM
PingER uses the Internet Control Message Protocol (ICMP) Echo
mechanism. The user can select the length of the packet that is to be sent to a
remote node. This packet is echoed back. Based on the values collected by pinging,
packet loss, throughput and various other parameters are calculated. [13]
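To make the step from raw echo replies to derived parameters concrete, here is a small Python sketch that summarizes one batch of pings. The input format (one RTT in milliseconds per echo request, with None for a lost packet) and the function name are assumptions made for illustration; they are not PingER's actual data format.

```python
def summarize_pings(rtts_ms):
    """Summarize one batch of echo requests.

    rtts_ms holds one entry per request; None means no reply came back.
    """
    sent = len(rtts_ms)
    replies = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (sent - len(replies)) / sent if sent else 0.0
    return {
        "sent": sent,
        "received": len(replies),
        "loss_pct": loss_pct,
        "min_rtt_ms": min(replies) if replies else None,
        "avg_rtt_ms": sum(replies) / len(replies) if replies else None,
    }
```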
2.3 PINGER ARCHITECTURE
The PingER architecture is illustrated below.
Figure 2.1 : PingER Architecture [14]
2.4 PINGER IMPORTANCE

PingER provides universal coverage.

Administrators at remote hosts don’t have to install any software.

It has low network impact.

It provides useful short and long-term measures of :
o Bottleneck bandwidth,
o Available bandwidth,
o Response time,
o Packet loss,
o Reachability
o Predictability
2.5 UNPREDICTABILITY
“One can also calculate the distance of each predictability point from
the coordinate (1,1). It is normalized to a maximum value of 1 by dividing the distance
by sqrt(2). I refer to this as the ping unpredictability, since it gives a percentage
indicator of the unpredictability of the ping performance.” [6]
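The calculation in the quotation can be written directly as a function. In this Python sketch the two coordinates of a predictability point are assumed to lie between 0 and 1; the function name is hypothetical.

```python
import math


def ping_unpredictability(x, y):
    """Distance of the predictability point (x, y) from (1, 1), divided by sqrt(2).

    With both coordinates in [0, 1] the result also lies in [0, 1]:
    0 for a perfectly predictable point at (1, 1), 1 for a point at (0, 0).
    """
    return math.hypot(1.0 - x, 1.0 - y) / math.sqrt(2.0)
```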
2.6 QUALITY
In order to summarize the data so that its significance can be
quickly grasped, the quality of performance of the links is characterized as
follows:
(a) Delay
“ In data communications, the time between transmission and reception
of a signal. Usually expressed in nanoseconds”
(b) Loss
“The losses of data in a packet based network, usually due to
congestion and consequent buffer overflow.” The quality of the network is mainly
determined by the packet loss.
(c) Jitter
“Jitter is a measure of the variability over time of the latency across a
network”. There are four categories of network degradation based on one-way jitter.
[7]
(d) Utilization
Utilization of the link can be read from routers via SNMP. At 90%
utilization, a typical network would usually discard 2% of the packets. [7]
2.7 RELATED WORK
The Cooperative Association for Internet Data Analysis (CAIDA) has
developed measurement and analysis tools that can be used to understand Internet
traffic. [1]
Chapter 3
3 REQUIREMENTS ANALYSIS
Presently, the Administrators at PingER face many problems in handling the system,
such as finding the sites and nodes that are down and finding anomalies in the data
collected at the archive site. They have to do many manual tasks on a daily basis.
This project would solve these problems and, as a result, would be a great
time saver for the Administrators at PingER.
3.1 ANATOMY OF THE PROBLEM

“End user” such as a networker, physicist, administrator or someone preparing a
report, who looks at the data and finds problems with it.

The end user tells the Administrator of PingER that the packet loss for a
particular site is not within the usual range for that particular region.

The Administrator verifies the claim of the Analyzer.
[Figure: flow from the Analyzer to the Administrator. (1) An “end user” such as a
networker, physicist, administrator or someone preparing a report looks at the data
and finds problems with it. (2) The Analyzer reports that “the packet loss for a
particular site is not within the usual range for that particular region”. (3) The
Administrator verifies the claim of the Analyzer, finds the reason for this anomaly
and solves the problem.]
Figure 3.1: Anatomy of the Problem
3.2 SPECIFIC OBJECTIVES OF THE PROJECT
Keeping the above scenario in mind, the functional requirements
gathered through interaction with the people at SLAC are as follows:
3.3 MAKING DOWN SITES FILTER
Create filters that indicate which monitoring sites and monitored nodes
were not available or were down during the last week. This requirement can be
divided into two main parts; the first is the down sites filter.
To find the down sites, each site's data is monitored and parsed
to see whether the site is down; if so, this is saved in the database. The next
week (or after any number of days) the site is checked again. If the site has
remained down for the specified number of days, a mail is generated; the
Administrator can specify this number of days. Moreover, the Administrator can
record the fault that caused the site to go down, so that in the future he can see
the trends or reasons for a particular site going down. The database must keep the
history data available for a decision support system; for this reason it has been
designed as a star schema.
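The bookkeeping described in this section (remember when a site was first seen down, mail once it has stayed down long enough, keep the fault and the history) could look roughly like the Python sketch below. It uses in-memory dictionaries in place of the project's MySQL tables, and every name in it (record_check, open_incidents, archive, the incident fields) is hypothetical.

```python
from datetime import date


def record_check(open_incidents, archive, site, is_down, today, mail_after_days=7):
    """Record one check of `site`; return True when a mail should be generated."""
    incident = open_incidents.get(site)
    if not is_down:
        if incident is not None:
            # Site is back up: close the incident but keep it as history.
            incident["up_on"] = today
            archive.append(open_incidents.pop(site))
        return False
    if incident is None:
        incident = {"site": site, "down_since": today, "mailed": False, "fault": None}
        open_incidents[site] = incident
    days_down = (today - incident["down_since"]).days
    if not incident["mailed"] and days_down >= mail_after_days:
        incident["mailed"] = True  # mail only once per incident
        return True
    return False
```

Closed incidents are moved to the archive rather than deleted, in keeping with the requirement that history remain available for later analysis; the "fault" field is where an administrator's explanation would be stored.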
3.4 MAKING DOWN NODES FILTER
This requirement is entirely different from the down sites filter, because
a node is determined to be down only if it shows no data for all of the
monitoring pairs through which it is being monitored.
To find the down nodes, each site's data is monitored and parsed
to see whether the node is down; if so, this is saved in the database. The next
week (or after any number of days) the node is checked again. If the node has
remained down for the specified number of days, a mail is generated; the
Administrator can specify this number of days. Moreover, the Administrator can
record the fault that caused the node to go down, so that in the future he can see
the trends or reasons for a particular node going down. The database must keep the
history data available for a decision support system; for this reason it has been
designed as a star schema.
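The rule that distinguishes a down node from a down site, namely that the node shows no data for any of the pairs through which it is monitored, can be sketched in Python as follows. The input mapping from (monitoring site, monitored node) pairs to a has-data flag is an assumed representation, not the project's actual table layout.

```python
def down_nodes(pair_has_data):
    """Return the nodes for which no monitoring site has any data.

    pair_has_data maps (monitoring_site, monitored_node) -> bool.
    """
    seen = {}
    for (_site, node), has_data in pair_has_data.items():
        # A single pair with data is enough to consider the node alive.
        seen[node] = seen.get(node, False) or has_data
    return sorted(node for node, any_data in seen.items() if not any_data)
```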
3.5 MAKING ANOMALY DETECTION ENGINE
This engine checks for anomalous behavior in data such as RTT (Round Trip
Time), packet loss and derived throughput and reports it to the Administrators.
As per the agreed requirements, the following algorithm is to be followed:

Look at the minimum RTT for a host that is supposedly in some country or
region.

Compare this RTT with that of other hosts in the same region.

If it is well below the RTT of the other hosts in the region, then flag it.
The data collected by PingER is grouped by region. By working within
these groups we satisfy the constraint that the data follows a normal
distribution, since all the sites in one region show nearly the same RTT when
measured from another region. We can therefore apply this technique in our
scenario to find the anomalous sites.
Sites showing anomalous behavior were found in this way. The approach is
based on the fact that sites in the same region show similar RTT when measured from
similar monitoring sites. So, we can say that our samples are from the same group,
and we can apply statistical analysis to them.
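One simple statistical test consistent with the normality assumption above is to flag a host whose minimum RTT lies several standard deviations below the mean of the other hosts in its region. The Python sketch below is an assumed illustration of that idea, not the technique actually implemented in the project; the threshold k = 2 and the function name are hypothetical.

```python
from statistics import mean, stdev


def low_rtt_outliers(min_rtt_by_host, k=2.0):
    """Flag hosts whose minimum RTT is more than k standard deviations
    below the mean of the OTHER hosts in the same region."""
    flagged = []
    for host, rtt in min_rtt_by_host.items():
        others = [v for h, v in min_rtt_by_host.items() if h != host]
        if len(others) < 2:
            continue  # stdev needs at least two comparison hosts
        mu, sigma = mean(others), stdev(others)
        if rtt < mu - k * sigma:
            flagged.append(host)
    return sorted(flagged)
```

Leaving the host under test out of the mean and standard deviation keeps a single extreme value from masking itself.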
3.6 AUTOMATIC REPORT GENERATION TOOL
One of the foremost requirements is an automatic report generation
tool for the management that generates reports regarding problems in the
monitoring data. This includes basically two types of charts and graphs, which
are as follows:
(a) Graph showing which sites were down at what time
A graph showing which sites and nodes were down most of the time: a
histogram showing the number of days each site was down and for how long. This
graph helps the Administrators get a bird's-eye view of their network of
monitoring sites and monitored nodes.
(b) Report showing anomalies in packet loss and RTT
The second type of chart is the pie chart, which shows the
number of nodes in a particular region that are down. This module would generate
these types of pie charts for the Administrators.
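Before any chart is drawn, the data has to be reduced to the numbers the chart displays. The Python sketch below shows that aggregation step only (days down per site for a histogram, each region's share of down nodes for a pie chart); the input shapes and function names are assumptions, and the drawing itself is left out.

```python
from collections import Counter


def down_day_counts(down_log):
    """Count, per site, the days it was down.

    down_log is an iterable of (day, site) pairs, one per day a site was down.
    """
    return Counter(site for _day, site in down_log)


def region_shares(down_nodes_by_region):
    """Return each region's percentage share of the down nodes."""
    total = sum(down_nodes_by_region.values())
    if not total:
        return {}
    return {r: 100.0 * n / total for r, n in down_nodes_by_region.items()}
```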
3.7 DRAMATIC-EVENTS ANOMALY DETECTION ENGINE
This module would check for dramatic anomalous events in the RTT
information and, if there are any, send reports to the Administrators. This is
separate from the anomalous event detection for the bandwidth time series. In
this case, the events would be dramatic, e.g., the min RTT dropped by a
factor of two, or there was no response for a long period of time (the period
should be specifiable as an option, but might be days or even months).
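The two example events named above, a drop in minimum RTT by a factor of two and a long period without any response, can be detected in one pass over a time series. The following Python sketch assumes one minimum-RTT value per day with None for a day without a response; the function name, the event labels and the default of 30 days are hypothetical.

```python
def dramatic_events(series, drop_factor=2.0, max_gap=30):
    """Scan daily minimum RTTs (ms, oldest first; None = no response).

    Returns a list of (index, kind) tuples.
    """
    events = []
    previous = None  # last minimum RTT that was actually observed
    gap = 0          # current run of consecutive days without a response
    for i, rtt in enumerate(series):
        if rtt is None:
            gap += 1
            if gap == max_gap:
                events.append((i, "no_response"))  # reported once per outage
            continue
        gap = 0
        if previous is not None and rtt * drop_factor <= previous:
            events.append((i, "min_rtt_drop"))
        previous = rtt
    return events
```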
3.8 DATA DISCREPANCY FINDER
A module that finds discrepancies in the input data, e.g., data entered at
the time of registration of monitored sites might be incorrect or incomplete, and
as a result data is not available or the monitored host remains not-Set. This
module would also include data triggers, such as a minrtt greater than 5000 or with
a negative value. It would also find mismatched TLDs and inform the
administrators, so that they can inform the person who entered the invalid data.
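The checks listed in this section can be combined into a single validation routine per registered host. In the Python sketch below the record keys ('host', 'region', 'tld', 'min_rtt_ms') are hypothetical, and a "mismatched TLD" is assumed to mean that the registered TLD differs from the last label of the host name; the source does not define the term more precisely.

```python
def check_record(record):
    """Return a list of problem labels for one registered host.

    record is a dict with the hypothetical keys
    'host', 'region', 'tld' and 'min_rtt_ms'.
    """
    problems = []
    region = record.get("region")
    if not region or region.lower() == "not-set":
        problems.append("not-set")
    rtt = record.get("min_rtt_ms")
    # Data trigger from the text: minrtt greater than 5000 or negative.
    if rtt is not None and (rtt < 0 or rtt > 5000):
        problems.append("min-rtt-out-of-range")
    host_tld = record.get("host", "").rsplit(".", 1)[-1].lower()
    if record.get("tld") and host_tld != record["tld"].lower():
        problems.append("tld-mismatch")
    return problems
```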
Chapter 4
4 NON-FUNCTIONAL REQUIREMENTS AND
SPIDERING
This project is to be installed at PingER at SLAC. Their systems run on the
Linux platform and their applications are built with Perl and CGI. This system
therefore has to be integrated with the system already installed at SLAC.
There were many non-functional requirements. Some of these are as follows:

Using Perl as the language

Using MySQL as a database Management System

The system should not mail too often about the anomalous sites.

The system should not mail too often about the down sites or down nodes.

The system should not consume too many resources.

The scripts should not consume too much bandwidth.

The system should be easy to install and follow the packaging principles.

System should be easy to upgrade if the need arises
In order to fulfill the above requirements some special measures were
taken. These measures would be elaborated through out this chapter.
17
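The requirement that the system should not mail too often can be met with a simple per-subject throttle. The sketch below is a Python illustration under assumptions: the class name, the idea of keying on a site or node name, and the interval are all hypothetical and are not taken from the project's Perl scripts.

```python
from datetime import datetime, timedelta
from typing import Dict

class MailThrottle:
    """Allow at most one mail per subject within a minimum interval."""

    def __init__(self, min_interval: timedelta):
        self.min_interval = min_interval
        self.last_sent: Dict[str, datetime] = {}

    def should_send(self, subject: str, now: datetime) -> bool:
        last = self.last_sent.get(subject)
        if last is not None and now - last < self.min_interval:
            return False          # mailed too recently, stay quiet
        self.last_sent[subject] = now
        return True
```

Passing the current time in as an argument keeps the class easy to test and lets a cron job decide what "now" means.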
4.1 SPIDER FOR PINGER MANAGEMENT
(a) Gain Automated Access to Resources
One way is to gather PingER data and export it into Microsoft
Excel for use in presentations or for tracking over time. Another is to grab a copy of
the PingER data, store it in a MySQL database, and then run whatever queries and
interpretations are needed on it. The project was more like making an interpreter that
interprets the input data and converts it into information that can be used by the
administrators at SLAC. Once the raw data was at my disposal, it could be repurposed,
repackaged, and reformatted to my heart's content.
(b) Aggregate otherwise disparate data sources
The PingER management spider collects RTT and packet loss data from
multiple sites and aggregates it into one database. The spiders were built to aggregate
data both across sources and over time.
(c) Combine the functionalities of sites
The spider bridges the gap between the PingER data and the PingER
management site by querying Pingtable.pl, providing that information to the PingER
management scripts, and finally publishing the results.
(d) Perform regular webmaster functions
The PingER spider takes care of the drudgery of the SLAC administrators'
daily tasks. It checks the data to be sure that it is ‘standards-compliant’ and ‘tidy’,
and checks whether the sites and nodes are working fine and data is being
collected from them.
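The step of grabbing the data and storing it in a database can be sketched as below. This is a Python illustration under stated assumptions: sqlite3 stands in for MySQL so that the example is self-contained, the table name rtt and its columns are hypothetical, and the layout of the tab-separated input (two site columns followed by one column per day, with "." or an empty cell for a missing sample) is assumed for the example rather than taken from the actual PingER files.

```python
import csv
import io
import sqlite3

def store_tsv(tsv_text: str, conn: sqlite3.Connection) -> int:
    """Parse tab-separated rows and store them; returns the number of rows stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rtt "
        "(mon_site TEXT, rem_site TEXT, day TEXT, minrtt REAL)"
    )
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)              # first two columns name the pair
    days = header[2:]                  # remaining columns are one per day
    stored = 0
    for row in reader:
        if len(row) < 2:               # skip blank or malformed lines
            continue
        mon_site, rem_site = row[0], row[1]
        for day, value in zip(days, row[2:]):
            if value in ("", "."):     # assumed markers for a missing sample
                continue
            conn.execute("INSERT INTO rtt VALUES (?, ?, ?, ?)",
                         (mon_site, rem_site, day, float(value)))
            stored += 1
    conn.commit()
    return stored
```

Once the rows are in a table, the queries behind the down-site and anomaly pages become ordinary SQL.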
4.2 BEST PRACTICES FOR PINGER MANAGEMENT SPIDER
There are some rules of the road for writing a well-behaved spider. The
following general practices were followed to make my spider as effective, polite, and
useful as possible.
(a) Being Liberal for PingER Management
To minimize the fragility of scraping, as little boundary data as possible
is used. For example, the title row of an average data page looks something like this:
Monitoring-Site Remote-Site
May31 May30 May29 May28 May27 May26 May25 May24 May23 May22 May21 May20
May19 May18 May17 May16 May15 May14 May13 May12 May11 May10 May09 May08
May07 May06 May05 May04 May03 May02 May01 Apr30 Apr29 Apr28 Apr27 Apr26
Apr25 Apr24 Apr23 Apr22 Apr21 Apr20 Apr19 Apr18 Apr17 Apr16 Apr15 Apr14
Apr13 Apr12 Apr11 Apr10 Apr09 Apr08 Apr07 Apr06 Apr05 Apr04 Apr03 Apr02
Monitoring-Node Mon-TLD Mon-Region Remote-Node Rem-TLD Rem-Region
If you're after the title row, the boundary data is the Monitoring-Site and
Rem-Region tags. The PingER spider is made to be as adaptive to site redesigns as
possible.
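Extracting the title row between its two boundary tags can be sketched with a regular expression. The example is in Python for illustration (the project's spider is written in Perl); the function name and the choice to split the columns on whitespace are assumptions.

```python
import re

# Match everything between the two boundary tokens, whatever lies in between.
TITLE_ROW = re.compile(r"Monitoring-Site\s+(.*?)\s+Rem-Region", re.DOTALL)

def title_columns(page_text: str) -> list:
    """Return the column names of the title row, including both boundaries."""
    match = TITLE_ROW.search(page_text)
    if match is None:
        return []                      # boundary tags not found on this page
    return ["Monitoring-Site", *match.group(1).split(), "Rem-Region"]
```

Because only the first and last column names are fixed, the number of date columns can change without breaking the scraper.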
(b) Don't Limit the Dataset for PingER Management
Once the people at SLAC make a web service for the data available, it
would be better to shift to web services instead of the present data collection spiders.
At present, tools such as regular expressions are used, and in some places simple
string comparisons.
(c) Choose a good identifier
The identifier for the spider is still to be written. It should clearly
specify what the spider does: what information it is intended to scrape and what it is
used for, namely collecting the PingER data and analysing it. One reason for not yet
including the identifier is that one of its prerequisites is that it should not
impersonate an existing spider, so this needs to be checked with the people at SLAC.
The next time someone needs a spider like this, he can use mine.
(d) Not demanding unlimited site access or support
This may be the greatest application since Google's PageRank, but it is
up to the administrators at SLAC to decide whether to entitle me to access site content
or restricted areas. Access was asked for nicely, and direct access to the data was not
demanded. What was done was shared with them, and the code will be given to them at
the end of this project. After all, it was scraping information from their web site. It is
only fair that a project that makes use of SLAC's information is shared with SLAC.
(e) Take just enough, and don't take too often
The program will run and scrape data from the site once a week for some
scripts and once a day for others.
Chapter 5
5 IMPLEMENTATION DETAILS AND ARCHITECTURE
In this chapter the implementation details of the project are discussed along with its
architecture.
5.1 ANTICIPATED USERS
The anticipated users of the website include faculty network
administrators, PingER users, scientists, administrators, PingER monitoring site
administrators, and ordinary users who are interested in seeing the anomalous sites or
nodes.
5.2 DESCRIPTION
The website's home page can be accessed by anyone. It can be used to
see the monitoring sites and monitoring nodes that are down, and to find sites and
nodes that are showing anomalous behavior. The administrator can change the status
of a site, i.e., the reason for the site going down. The administrator can also specify
the number of days after which, if the site remains down, the administrator is informed
of the current status of the site.
5.3 ARCHITECTURE DIAGRAM
The website will be configured on the Apache web server, and Perl will be
used for the gateway interface. [10]
[Diagram labels: Apache Web Server (Perl Interpreter Module); HTTP; Perl DBI; Database; Web Server]
Figure 5.1: Architecture Diagram [11]
5.4 PINGER MANAGEMENT ARCHITECTURE
[Diagram labels: TSV; 1. Downloads file; 2. Save file; 3. Processing; 4.1 Main user page; 4.2 Generate page; 4.3 Links; Down Sites and Nodes Filter; Anomaly Detection Engine; Automatic report generation tool; Dramatic-Events Anomaly Detection Engine; Data Discrepancy finder and solver; PingER Management]
Figure 5.2: PingER Management Architecture
5.5 PERL SCRIPTS DESCRIPTION
The major scripts are described below.
(a) Insert Node Status
This form is used to enter a new type of status for the down nodes.
Table 5.1: Insert node status
Action: enterstat.pl
Method: Get
Element Name   Type         Value
sitefield      Text field   Enter a reason for Node going down
submit         button       Send the text in the text field ‘site field’ to the Apache web server
reset          button       Make the text field go blank
Description: User clicked the Insert site status link; get data from database ‘downsites’ and table ‘nodestatus’; retrieved data; data displayed on server generated page.
Figure 5.3: Insert Node Status
(b) Down Sites
This page shows the sites that have been down during the last week.
Table 5.2: Down Sites
Action: downsites.pl
Method: Get
Element Name       Type           Value
Monitoring Sites   Table column   Monitoring Sites
Remote Sites       Table column   Remote Sites
status             Table column   Current reason of the site being down
Downtime in days   Table column   Number of days the site has been down in succession
Description: User clicked the Down sites link; get data from database ‘downsites’ and table ‘mails’; retrieved data; data displayed on server generated page.
Figure 5.4: Down Sites
(c) Down nodes
This page shows the nodes that have been down during the last week.
Table 5.3: Down nodes
Action: downnodes.pl
Method: Get
Element Name       Type           Value
Remote node        Table column   Remote node
status             Table column   Current reason of the site being down
Downtime in days   Table column   Number of days the site has been down in succession
Description: User clicked the Down nodes link; get data from table ‘mails’ of the down nodes database; retrieved data; data displayed on server generated page.
Figure 5.5: Down nodes
(d) Site Status for Down Sites
This page shows the latest status of each site.
Table 5.4: Site Status for Down Sites
Action: sitestatus.pl
Method: Get
Element Name      Type           Value
Monitoring site   Table column   Monitoring site
Monitored node    Table column   Monitored node
Node Status       Table column   Current reason of the site being down
Down Time         Table column   Number of days the site has been down in succession
Description: Select the type of status; update status in the database ‘downsites’ and table ‘monitors’.
Figure 5.6: Site Status for Down Sites
(e) Node Status for Down Nodes
This page shows the latest status of each node.
Table 5.5: Node Status for Down Nodes
Action: nodestatus.pl
Method: Get
Element Name   Type           Value
mon_node       Table column   Monitored node
obsrved_site   Table column   Number of sites observed by this node
node_status    Table column   Current reason of the site being down
mailstatus     Table column   Number of days the site has been down in succession
Description: Select the type of status; update database ‘downnodes’, table ‘monitors’.
Figure 5.7: Node Status for Down Nodes
(f) Update mail status Down sites
This page shows the latest status of each node for the user to update.
Table 5.6: Update mail status down sites
Action: showmailstatus.pl
Method: Get
Element Name      Type           Value
Monitoring site   Table column   Monitoring node
Time delay        Table column   Number of days after which, if the site remains down, an email is sent
Description: Update mail status; update mail status for a particular monitoring site in the database ‘downsites’ and table ‘monitors’.
Figure 5.8: Update mail status down sites
(g) Update mail status Down nodes
This page shows the latest status of each node for the user to update.
Table 5.7: Update mail status down nodes
Action: showmailstatus.pl
Method: Get
Element Name      Type           Value
Monitored nodes   Table column   Monitored node
Time delay        Table column   Number of days after which, if the node remains down, an email is sent
Description: Change mail status; update mail status for a particular monitored node in database ‘downnodes’ and table nodes.
Figure 5.9: Update mail status down nodes
(h) Down sites graphs and charts
This Perl script makes the graph for the down sites.
Table 5.8: Down sites graphs and charts
Name:    downsites.pl
Inputs:  This week's data of down nodes along with the number of days for which they have been down
Output:  Histogram of this week's data
Description: Cron job or the administrator runs this script; access the database ‘downsites’ and table ‘mail’; retrieved data; generate the histogram and store it in the icons folder of the web server.
Figure 5.10: Down sites graphs and charts
(i) Down nodes graphs and charts
This Perl script makes the graph for the down nodes.
Table 5.9: Down nodes graphs and charts
Name:    downsodes.pl
Inputs:  This week's data of down nodes
Output:  Histogram of this week's down nodes along with the number of days for which they have been down
Figure 5.11: Down nodes graphs and charts
(j) Min-RTT Anomaly
This page shows the anomalous sites based on minrtt.
Table 5.10: Min-RTT Anomaly
Action: avgrtt.pl
Element Name      Type           Value
Mon region        Table column   Monitored node
Rem region        Table column   Number of sites observed by this node
Monitoring site   Table column   Current reason of the site being down
Monitored node    Table column   Number of days the site has been down in succession
rtt               Table column   Actual rtt from monitoring site to monitored region
Avg rtt           Table column   Avg rtt for the particular region in which monitored node lies
Std rtt           Table column   Standard deviation rtt for the particular region in which monitored node lies
Description: User clicked the Min-RTT Anomaly link; database of ‘minrtt’ is accessed and the data is read from table ‘dtrtt’; retrieved data; data displayed on server generated page.
Figure 5.12: Min-RTT Anomaly
(k) Min-RTT graphs and charts
This Perl script generates the graph of anomalous sites based on min-RTT.
Table 5.11: Min-RTT graphs and charts
Name:    minrtt.pl
Inputs:  Data of anomalous sites
Output:  Pie chart based on the count of monitoring sites in the Min-RTT table
Description: Cron job or the administrator runs this script; access the database ‘minimumrtt’ and table ‘dtrtt’; retrieved data; generate the pie chart and store it in the icons folder of the web server.
Figure 5.13: Min-RTT graphs and charts
(l) Dramatic Anomaly
This page shows sites based on the dramatic anomaly.
Table 5.12: Dramatic Anomaly
Action: avgrtt.pl
Element Name      Type           Value
Mon region        Table column   Monitored node
Rem region        Table column   Number of sites observed by this node
Monitoring site   Table column   Current reason of the site being down
Monitored node    Table column   Number of days the site has been down in succession
rtt               Table column   Actual rtt from monitoring site to monitored region
min rtt           Table column   min rtt for the particular region in which monitored node lies
Std rtt           Table column   Standard deviation rtt for the particular region in which monitored node lies
Description: User clicked the dramatic Anomaly link; database of ‘dtminrtt’ is accessed and the data is read from table ‘dtrtt’; retrieved data; data displayed on server generated page.
Figure 5.14: Dramatic Anomaly
(m) Down nodes graphs and charts
This script shows graphs based on the down nodes of the last week.
Table 5.13: Down nodes graphs and charts
Name:    downsodes.pl
Inputs:  This week's data of down nodes
Output:  Histogram of this week's down nodes along with the number of days for which they have been down
Description: Cron job or the administrator runs this script; access the database ‘Downnodes’ and table ‘mail’; retrieved data; generate the pie-chart and store it in the icons folder of the web server.
Figure 5.15: Down nodes graphs and charts
(n) NOT-SET
Table 5.14: NOT-SET
Action: notsetdb.pl
Element Name   Type           Value
NOT-SET        Table column   Remote node
(o) Data Triggers
This page shows the sites that show invalid values.
Table 5.15: Data Triggers
Action: datatriggerdb.pl
Element Name    Type           Value
Data Triggers   Table column   This shows those site and node pairs that are showing invalid data.
Description: User clicked the NOT-SET link; access the database ‘notset’ and table ‘DT’; retrieved data; data displayed on server generated page.
Figure 5.16: Data Triggers
(p) Mismatch TLD’s
This page shows the sites that have invalid TLD’s.
Table 5.16: Mismatch TLD’s
Action: tlddb.pl
Element Name   Type           Value
TLD’s          Table column   TLD’s that mismatch
Description: User clicked the mismatch TLD’s link; access the database ‘notset’ and the table ‘TLD’; retrieved data; data displayed on server generated page.
Figure 5.17: Mismatch TLD’s
5.6 DATABASE DESIGN
Database tables are as follows:
5.6.1 DownSites
(a) Mailstatus
Table 5.17: Mailstatus for downsites
Attribute       Data Type   Length/signed/unsigned   Null
mon_node        varchar     50                       No
mail_time       int         UNSIGNED                 No
(b) Statuslookup
Table 5.18: Status lookup for down sites
Attribute       Data Type   Length/signed/unsigned   Null
status          varchar     50                       No
(c) Noderegion
Table 5.19: Node region for down sites
Attribute       Data Type   Length/signed/unsigned   Null
mon_node        varchar     50                       No
node_region     varchar     50                       No
(d) Siteregion
Table 5.20: Site region for down sites
Attribute       Data Type   Length/signed/unsigned   Null
mon_site        varchar     50                       No
site_region     varchar     50                       No
(e) Dateid
Table 5.21: Dateid for down sites
Attribute       Data Type   Length/signed/unsigned   Null
dateid          varchar     50                       No
year            int         UNSIGNED                 No
month           int         UNSIGNED                 No
day             int         UNSIGNED                 No
quater          int         UNSIGNED                 No
week            int         UNSIGNED                 No
dow             int         UNSIGNED                 No
(f) Monitors
Table 5.22: Monitors for down sites
Attribute       Data Type   Length/signed/unsigned   Null
dateid          int         UNSIGNED                 No
node_region     varchar     50                       No
mon_node        varchar     50                       No
mon_site        varchar     50                       No
node_status     varchar     50                       No
mailstatus      varchar     UNSIGNED                 No
(g) Dssmonitors
Table 5.23: Dssmonitors for down sites
Attribute       Data Type   Length/signed/unsigned   Null
dateid          integer     UNSIGNED                 No
node_region     varchar     50                       No
mon_node        varchar     50                       No
mon_site        varchar     50                       No
node_status     varchar     50                       No
mailstatus      varchar     UNSIGNED                 No
(h) Mail
Table 5.24: Mail for down sites
Attribute       Data Type   Length/signed/unsigned   Null
node_region     varchar     50                       No
mon_node        varchar     50                       No
mon_site        varchar     50                       No
node_status     varchar     50                       No
mailstatus      varchar     UNSIGNED                 No
5.6.2 downnode
(a) nodecount
Table 5.25: Node count for down node
Attribute       Data Type   Length/signed/unsigned   Null
mon_node        varchar     50                       No
Down_site       int         UNSIGNED                 No
(b) Mail
Table 5.26: Mail for down node
Attribute       Data Type   Length/signed/unsigned   Null
mon_node        varchar     50                       No
Node_status     varchar     50                       No
mailstatus      int         UNSIGNED                 No
(c) Nodelookup
Table 5.27: Node lookup for down node
Attribute       Data Type   Length/signed/unsigned   Null
mon_node        varchar     50                       No
mon_site        varchar     50                       No
(d) Monitors
Table 5.28: Monitors for down node
Attribute       Data Type   Length/signed/unsigned   Null
mon_node        varchar     50                       No
mon_site        varchar     50                       No
(e) Dateid
Table 5.29: Dateid for down node
Attribute       Data Type   Length/signed/unsigned   Null
dateid          char        50                       No
dateid          int         UNSIGNED                 No
year            int         UNSIGNED                 No
month           int         UNSIGNED                 No
day             int         UNSIGNED                 No
quater          int         UNSIGNED                 No
week            int         UNSIGNED                 No
dow             int         UNSIGNED                 No
(f) Nodes
Table 5.30: Nodes for down node
Attribute       Data Type   Length/signed/unsigned   Null
dateid          int         UNSIGNED                 No
node_region     varchar     50                       No
mon_node        varchar     50                       No
obsrved_site    integer     UNSIGNED                 No
down_site       integer     UNSIGNED                 No
node_status     varchar     50                       No
mailstatus      varchar     UNSIGNED                 No
(g) Dssnodes
Table 5.31: Dssnodes for down node
Attribute       Data Type   Length/signed/unsigned   Null
dateid          int         UNSIGNED                 No
node_region     varchar     50                       No
mon_node        varchar     50                       No
obsrved_site    int         UNSIGNED                 No
down_site       int         UNSIGNED                 No
node_status     varchar     50                       No
mailstatus      varchar     UNSIGNED                 No
(h) Mail
Table 5.32: Mail for down node
Attribute       Data Type   Length/signed/unsigned   Null
node_region     varchar     50                       No
mon_node        varchar     50                       No
mon_site        varchar     50                       No
node_status     varchar     50                       No
mailstatus      varchar     UNSIGNED                 No
5.6.3 minrtt
(a) dtrtt
Table 5.33: Dtrtt for anomaly
Attribute        Data Type   Length/signed/unsigned   Null
pair             varchar     50                       No
monitoringsite   varchar     50                       No
RemoteSite       varchar     50                       No
avgrtt           varchar     50                       No
minrtt           int         UNSIGNED                 No
(b) minrtt
Table 5.34: Minrtt for anomaly
Attribute        Data Type   Length/signed/unsigned   Null
minrtt           int         UNSIGNED                 No
sdrtt            int         UNSIGNED                 No
(c) dbminrtt
Table 5.35: Dbminrtt for anomaly
Attribute        Data Type   Length/signed/unsigned   Null
pair             varchar     50                       No
monitoringsite   varchar     50                       No
RemoteSite       varchar     50                       No
avgrtt           varchar     50                       No
minrtt           int         UNSIGNED                 No
sdrtt            int         UNSIGNED                 No
(d) dssminrtt
Table 5.36: Dssminrtt for anomaly
Attribute        Data Type   Length/signed/unsigned   Null
dateid           int         UNSIGNED                 No
mon_region       varchar     50                       No
mon_region       varchar     50                       No
monitoringsite   varchar     50                       No
RemoteSite       varchar     50                       No
minrtt           int         UNSIGNED                 No
sdrtt            int         UNSIGNED                 No
(e) Dateid
Table 5.37: Dateid for anomaly
Attribute        Data Type   Length/signed/unsigned   Null
dateid           varchar     50                       No
dateid           int         UNSIGNED                 No
year             int         UNSIGNED                 No
month            int         UNSIGNED                 No
day              int         UNSIGNED                 No
quater           int         UNSIGNED                 No
week             int         UNSIGNED                 No
dow              int         UNSIGNED                 No
5.6.4 Dramaticrtt
(a) Dtrtt
Table 5.38: Dtrtt for dramatic anomaly
Attribute        Data Type   Length/signed/unsigned   Null
pair             varchar     50                       No
monitoringsite   varchar     50                       No
RemoteSite       varchar     50                       No
avgrtt           varchar     50                       No
minrtt           int         UNSIGNED                 No
(b) minrtt
Table 5.39: Minrtt for dramatic anomaly
Attribute        Data Type   Length/signed/unsigned   Null
minrtt           int         UNSIGNED                 No
sdrtt            int         UNSIGNED                 No
(c) dbminrtt
Table 5.40: Dbminrtt for dramatic anomaly
Attribute        Data Type   Length/signed/unsigned   Null
pair             varchar     50                       No
monitoringsite   varchar     50                       No
RemoteSite       varchar     50                       No
avgrtt           varchar     50                       No
minrtt           int         UNSIGNED                 No
sdrtt            int         UNSIGNED                 No
(d) dssminrtt
Table 5.41: Dssminrtt for dramatic anomaly
Attribute        Data Type   Length/signed/unsigned   Null
dateid           int         UNSIGNED                 No
mon_region       varchar     50                       No
mon_region       varchar     50                       No
monitoringsite   varchar     50                       No
RemoteSite       varchar     50                       No
minrtt           int         UNSIGNED                 No
sdrtt            int         UNSIGNED                 No
(e) Dateid
Table 5.42: Dateid for dramatic anomaly
Attribute        Data Type   Length/signed/unsigned   Null
dateid           varchar     50                       No
dateid           int         UNSIGNED                 No
year             int         UNSIGNED                 No
month            int         UNSIGNED                 No
day              int         UNSIGNED                 No
quater           int         UNSIGNED                 No
week             int         UNSIGNED                 No
dow              int         UNSIGNED                 No
5.6.5 notset
(a) notset
Table 5.43: Notset
Attribute       Data Type   Length/signed/unsigned   Null
mon_node        varchar     50                       No
(b) TLD
Table 5.44: TLD
Attribute       Data Type   Length/signed/unsigned   Null
mon_node        varchar     50                       No
(c) DT
Table 5.45: DT
Attribute       Data Type   Length/signed/unsigned   Null
mon_node        varchar     50                       No
rtt             int         UNSIGNED                 No
Chapter 6
6 INTERFACE DESIGN
In this chapter the interfaces of the PingER management system are shown, and a
description is given along with each interface.
6.1 INTERFACE::MAIN PAGE
All users enter through the main page shown below. Links to all the
other web pages are available on this page.
Figure 6.1: Main Interface
6.2 INTERFACE::INTERNAL PAGES
(a) Down Sites
This page shows the sites that were down during the last week.
Figure 6.2: Down Sites
(b) Site Status
This page shows the sites that can go down during the next week.
Figure 6.3: Site Status
(c) Update Down Time
This page updates the number of weeks after which, if the site remains
down, a mail is sent to the administrator.
Figure 6.4: Update Down Time
(d) Insert Site Status
This page inserts a new type of status for the down sites.
Figure 6.5: Insert Site Status
(e) Graphs and Charts
This graph shows which sites were down most of the time.
Figure 6.6: Graphs and Charts
(f) Down Nodes
This page shows the nodes that were down during the last week.
Figure 6.7: Down Nodes
(g) Node Status
This page shows the nodes that can go down during the next week.
Figure 6.8: Node Status
(h) Graphs and Charts of Anomalous Sites
This page shows the status of anomalous sites based on min-RTT and
standard deviation for pairs of regions. Graphical methods are used to summarize the
data. [1]
Figure 6.9: Graphs and Charts of Anomalous Sites
(i) NOT-SET
This page shows the site and node pairs that are NOT-SET.
Figure 6.10: NOT-SET
6.3 CONCEPTUAL DIAGRAM
Figure 6.11: Conceptual Diagram
6.4 TECHNICAL DESIGN
Figure 6.12: Technical Diagram
Chapter 7
7 RECOMMENDATIONS FOR FUTURE WORK
In this chapter future directions and recommendations are given.
7.1 STATISTICAL ANALYSIS OF PINGER DATA
The following statistical techniques can be used to analyse the PingER
data, both for finding anomalies in the data and for finding anomalous sites.
There are two types of control charts, univariate and multivariate. These
charts are used to find anomalies in the data. Univariate charts consist of the mean
and R chart and the mean and S chart. The mean chart monitors the mean of the
process; the R and S charts monitor the variability of the process.
Another type is the individual control chart. It is suited to the situation
where the sample consists of a single observation (n = 1), and can then be used to see
whether the process is in control.
If the data can be classified (defective or not defective, down or not
down, scrap or dump or garbage, anomalous or not anomalous), the control charts
are called attribute control charts.
Another frequently used type is the fraction nonconforming control
chart. There are two such charts: the p chart and the np chart. The goal of these charts
is to monitor the nonconforming items for a process of interest, using data collected
over m samples each of size n. The p chart is used to monitor the proportion of
nonconforming items; the np chart is used to monitor the number of nonconforming
items.
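As an illustration of the p chart, the sketch below computes the usual three-sigma control limits, with centre line p̄ (the total number of nonconforming items divided by m·n) and limits p̄ ± 3·sqrt(p̄(1 − p̄)/n). It is written in Python for illustration only; the function names and the reading of a "sample" as one week of observations over n monitored nodes are assumptions, not part of the implemented system.

```python
from math import sqrt
from typing import List, Tuple

def p_chart_limits(nonconforming: List[int], n: int) -> Tuple[float, float, float]:
    """Return (LCL, centre line, UCL) for a p chart with m samples of size n."""
    m = len(nonconforming)
    p_bar = sum(nonconforming) / (m * n)          # overall fraction nonconforming
    sigma = sqrt(p_bar * (1 - p_bar) / n)
    lcl = max(0.0, p_bar - 3 * sigma)             # a proportion cannot be negative
    ucl = min(1.0, p_bar + 3 * sigma)
    return lcl, p_bar, ucl

def out_of_control(nonconforming: List[int], n: int) -> List[int]:
    """Indices of samples whose fraction nonconforming falls outside the limits."""
    lcl, _, ucl = p_chart_limits(nonconforming, n)
    return [i for i, x in enumerate(nonconforming) if not lcl <= x / n <= ucl]
```

For an np chart the same data would be plotted as counts, with the limits multiplied by n.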
The limitation of the fraction nonconforming control chart is that the
number of nonconforming items cannot exceed the number of items in each sample
investigated, that is, X <= n. For monitoring nonconformities no such restriction
exists, because nonconformities are counted per unit; e.g., in a sample of ten bottles,
17 defects were found.
To remove this restriction, another type of chart is used, known as the
control chart for nonconformities. There are two such charts, chosen according to
whether the sample size is constant from subgroup to subgroup: the c control chart and
the u control chart. The c control chart monitors the number of nonconformities. The u
control chart monitors the average number of nonconformities per unit.
The above techniques are valid if the data is normally distributed. If it
is not, or if small shifts in the process mean are important, the cumulative sum
(CUSUM) control chart or the exponentially weighted moving average (EWMA)
control chart can be used. The CUSUM chart is based not only on the current
observation but also on recent past observations. The EWMA chart is similar to the
CUSUM chart, but it includes a weight for the observations and a width for the
control limits.
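The EWMA chart mentioned above can be sketched as follows. The statistic is z_t = λ·x_t + (1 − λ)·z_(t−1), started at the target mean, and a signal is raised when z_t is further than L·σ·sqrt(λ/(2 − λ)·(1 − (1 − λ)^(2t))) from the target. This is a Python illustration; the function name and the default weight (0.2) and width (3) are assumptions chosen for the example, and the target mean and standard deviation are assumed to be known in advance.

```python
from math import sqrt
from typing import List

def ewma_signals(samples: List[float], target: float, sigma: float,
                 lam: float = 0.2, width: float = 3.0) -> List[int]:
    """Return indices where the EWMA statistic leaves its control limits."""
    signals = []
    z = target                                   # z_0 starts at the target mean
    for t, x in enumerate(samples, start=1):
        z = lam * x + (1 - lam) * z              # weighted blend of new and past
        spread = sigma * sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
        if abs(z - target) > width * spread:
            signals.append(t - 1)                # report a 0-based sample index
    return signals
```

Because each value of z carries the weighted history of the series, a sustained shift of only two standard deviations is eventually flagged even though no single observation lies outside three-sigma limits.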
Multivariate control charts are used when two or more variables are
correlated, i.e., the chart monitors the variables together and determines whether the
process is in control. The two types are the Hotelling T-square control chart and the
multivariate EWMA control chart.
The drawback of the Hotelling T-square control chart is that only the
current observation or subgroup is used. The multivariate EWMA control chart is an
extension of the univariate EWMA control chart.
In the case of PingER, the ‘processes’ are the ‘sites’ collecting data, the
‘discriminates’ are ‘anomalies in the data’ or ‘anomalous sites’, and the ‘groups’ are
the ‘sites in each region’.
7.2 OTHER TECHNIQUES
Some other techniques that can be useful are:
• Finding clusters of monitoring and monitored sites based on k-means.
• Finding clusters of monitoring and monitored sites based on k-median.
• Finding clusters with SOM (self-organizing maps), i.e., similar to k-means
but based on neural networks.
• Building hierarchical clusters and showing them to the administrator in a
tree view.
Another goal is to find the sites that behave in the same way, based on
the association between the RTT of the sites, or through time series analysis that finds
the sites that behaved similarly during the last month or during the same span of
time.
This can be done by comparing the time series data of two sites for the
last month, simply by finding the correlation between the data of the two sites.
Sites can also be categorized as anomalous, i.e., a probability is
assigned to the sample data indicating whether the site is showing an anomaly and to
what level. This would be done using supervised-learning neural networks, for
example a multi-layer perceptron:
Input: minrtt=200, avgrtt=250, packetloss=0.000, throughput=5
Output: .80
The above output suggests that, with probability .80, the site is not
showing anomalous behavior.
Input: minrtt=700, avgrtt=650, packetloss=0.300, throughput=6
Output: .40
The above output suggests that the site is showing anomalous behavior.
Fuzzy logic can be applied to the back-propagation algorithm to distinguish the sites
that have satellite connections, and other types of events that need to be found.
The Kohonen algorithm (self-organizing map) can be applied for
unsupervised learning, in which the network learns from the data based on the input
patterns.
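The comparison of two sites' time series by correlation, suggested in Section 7.2, can be sketched with the Pearson correlation coefficient. This is a Python illustration; the function name is hypothetical, and the choice of the Pearson coefficient is an assumption, since the text only speaks of finding the correlation between the two sites' data.

```python
from math import sqrt
from typing import List

def correlation(a: List[float], b: List[float]) -> float:
    """Pearson correlation coefficient of two equally long RTT series."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_a * var_b)
```

A value close to 1 would suggest that two sites behaved similarly over the period; the cut-off for "similar" would have to be chosen by the administrator.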
Chapter 8
8 CONCLUSION
In this chapter, the results after the implementation of each requirement are
discussed.
One node is connected to many sites, so many sites are monitored by
each node. The number of down sites and down nodes varies every week. However, a
node going down has a more severe effect, since data from many sites is then not
collected. So the first priority of the administrator should be to bring the down nodes
up, and then to try to bring the down sites up.
The number of anomalous sites varies with the number of standard
deviations from the average RTT or min RTT that is used as the threshold. The
administrator uses this website to find anomalous sites. Presently, sites that are
fifteen standard deviations from the mean or min RTT are shown as anomalous.
However, this threshold can be decreased once there are no sites at fifteen standard
deviations.
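The threshold rule described above can be sketched as follows. This is a Python illustration, not the project's Perl code; the function name, the dictionary input, and the use of a separate baseline list of RTT values (for example, the values of one pair of regions) to compute the mean and standard deviation are assumptions made for the example.

```python
from statistics import mean, pstdev
from typing import Dict, List

def anomalous_sites(rtt_by_site: Dict[str, float], baseline: List[float],
                    k: float = 15.0) -> List[str]:
    """Sites whose RTT lies more than k standard deviations from the baseline mean."""
    avg = mean(baseline)
    std = pstdev(baseline)
    if std == 0:
        return []                      # no spread, so no meaningful threshold
    return sorted(site for site, rtt in rtt_by_site.items()
                  if abs(rtt - avg) > k * std)
```

Keeping k as a parameter matches the remark that the threshold can be lowered once no sites remain at fifteen standard deviations.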
Graphs and charts show the present condition of the sites and nodes.
The administrator can get a bird's-eye view of all the PingER sites and nodes and plan
a strategy. After implementing the strategy and fixing the errors, he can verify whether
the system is doing fine. He can also see which pair of regions is showing more
anomalous sites than others.
Lastly, the administrator can use this system to find NOT-SET entries
and mismatched TLDs, and to find sites and nodes that are showing invalid data.
In short, this project would help PingER administrators make their
system more manageable and automated by detecting down sites and down nodes,
detecting anomalous sites based on statistical analysis and dramatic events, giving the
administrator an overview of the system through graphs and charts, and helping them
find sites and nodes that are NOT-SET, show mismatched TLDs, or show invalid data.
REFERENCES
[1] Chambers, J. M., W. S. Cleveland, B. Kleiner, and P. A. Tukey. 1983. Graphical
Methods for Data Analysis. Boston: Duxbury Press.
[2] Cooperative Association for Internet Data Analysis (CAIDA). 2000.
http://www.caida.org/home/background.xml
[3] Cross-Industry Working Team (XIWT). 1998. Customer View of Internet
Service Performance: Measurement, Methodology, and Metrics.
http://www.xiwt.org
[4] Keynote. 2000.
http://www.keynote.com
[5] Stanford Linear Accelerator Center (SLAC). 1999.
http://www-iepm.slac.stanford.edu/pinger
[6] Stevens, W. Richard. 1994. TCP/IP Illustrated, Volume 1: The Protocols. Reading,
MA: Addison-Wesley.
[7] U.S. Department of Commerce (U.S. DOC). 1999. The Emerging Digital Economy.
http://www.ecommerce.gov/ede/report.html
[8] Visual Networks. 1999.
http://www.visualnetworks.com/corp/corpslaform.htm
[9] Spidering Hacks. O'Reilly.
[10] Randy J. Ray and Pavel Kulchenko. Programming Web Services with Perl.
[11] Deitel and Deitel. E-Business and e-Commerce.
[12] Chris Bates. Web Programming: Building Internet Applications.
[13] Retrieving PingER Historical Data.
http://www-iepm.slac.stanford.edu/pinger/tools/retrievedata.html
[14] Tutorial on Internet Monitoring & PingER at SLAC.
http://www.slac.stanford.edu/comp/net/wan-mon/tutorial.html#pinger
[15] Network Measurement.
http://www.caida.org/outreach/metricswg/faq.xml#s2_2
[16] Les Cottrell and Warren Matthews. Internet End-to-end Performance Monitoring
(IEPM) and the PingER project.
http://www.slac.stanford.edu/grp/scs/net/talk03/hpn_aug03/pinger_report.pdf
Appendix A
ADMINISTRATOR’S GUIDE
In this appendix, the installation procedure for moving the project to another machine
is discussed.
A.1 PROJECT REQUIREMENTS
The following are the prerequisites for running this project:
a) Linux
b) Perl 5.8
c) CGI
d) GD::Graph
e) MySQL
f) Apache web server
A.2 SOFTWARE INSTALLATION DETAILS
• For the creation of the databases you should already have an account on your
Linux machine with the following credentials:
user name : root
Password : nustniit
• Make sure that the MySQL database daemon is running.
• Check that you don't have the following databases under the user account root,
since these databases will be created:
    • Minrtt
    • Downsites
    • Downnode
    • Notset
    • Dramaticrtt
Follow this procedure to confirm that the above-mentioned databases
are not present in the MySQL database management system:
• Open a terminal.
• Log in to the MySQL server as the root user.
• Type ‘show databases;’
A list of databases will be shown. Please verify that it doesn't contain
any of the above-mentioned databases.
[Directory tree labels: /; root; var; bin; www; Cgi-bin; html; icons]
Figure A.1: File structure
• Put the ‘pingermanagement’ directory that is inside the cgi-bin folder into the
‘cgi-bin’ directory of your server, as shown above.
• Put the ‘pingermanagement’ directory that is inside the html folder into the html
directory of your server, as shown above.
• Put the provided ‘pinger_management’ directory in any folder you like. It
contains a folder named ‘cronjobs’, which holds various types of scripts.
[Directory tree labels: Pingermanagement; Cgi-bin; html; Pinger_management; cronjobs; useragent; data; ……; uninstall; cronall]
Figure A.2: Project file structure
• On the terminal type
cd <path where cronjobs is located on your PC>
• This changes to the directory where the cronall script is present. Now run the
script using the following command:
perl cronall
Now, if your web server is configured correctly, you can access the
web pages through the address http://ipaddress/Pingermanagement/index.html. You
can also put the cronall script in the crontab and make it run whenever you like.
A.3 PROJECT CONFIGURATION
Presently, this project supports analysis of RTT only. The user can
easily extend it to packet loss or throughput by following the steps listed below and
then repeating the whole installation process from the start. If you do this on the same
Linux machine on which the application was previously installed, don't forget to first
run ‘dropdatabases’ in the directory ‘uninstalldatabases’, which is inside the
‘uninstall’ directory of the ‘pinger_management’ directory.
• In the ‘pinger_management’ directory, go to the ‘user_agent’ directory, open
‘minimumrtt’ inside it, and change the ‘my $url= ’ to the URL of the required
TSV file.
• Follow all the installation instructions above.