Building the e-Science Grid in the UK:

GridMon - Grid Network Performance Monitoring
Mark Leese (m.j.leese@dl.ac.uk) and Robin Tasker (r.tasker@dl.ac.uk)
CCLRC Daresbury Laboratory, Warrington, Cheshire WA4 4AD
http://gridmon.dl.ac.uk/
Abstract:
At last year’s inaugural All Hands meeting, our paper outlined the proposed development of a comprehensive
and extensible network monitoring infrastructure for UK e-Science. This paper initially serves as an update,
outlining the last year’s good progress in establishing an infrastructure which is already supplying tangible
benefits.
The paper then introduces the project’s second phase, which will see GridMon’s integration into Grid
technology via compliance with the Open Grid Services Architecture. The starting point for this journey has
been the development of GridMon as a web service. This, and future stages of the journey, are outlined.
GridMon is not alone in developing a network monitoring system as a web and/or Grid service. Also
described, as part of GridMon’s aim to be a “best of breed” network monitoring system for UK e-Science, are
ongoing collaborations such as those with the Internet2 piPEs initiative.
Finally, consideration is given to new work which seeks to redress some of the widely observed imbalance
between the achieved and expected network performance of end users. By building on relevant research,
GridMon hopes to provide “best practice” examples of TCP configuration, with our monitoring results
showing these in a real world rather than ‘laboratory’ context.
Glossary:
API Application Programming Interface
BAR Backbone Access Router
BW Bandwidth to the World
CCLRC Council for the Central Laboratories of the Research Councils
CIM Common Information Model
EDG European Data Grid
GGF Global Grid Forum
HEP High Energy Physics
IEC International Electrotechnical Commission
IEPM Internet End-to-end Performance Monitoring
JANET Joint Academic NETwork
LFN Long Fat Network
MCC Manchester Computing Centre
NMWG Network Monitoring Working Group
OGSA Open Grid Services Architecture
piPEs performance initiative Performance Environment system
QoS Quality of Service
R-GMA Relational Grid Monitoring Architecture
RPC Remote Procedure Call
RTT Round Trip Time
SJ4 Super JANET 4
SLAC Stanford Linear Accelerator Centre
SOAP Simple Object Access Protocol
TCP Transmission Control Protocol
UCL University College London
UDDI Universal Description Discovery Integration
UML Unified Modelling Language
URL Uniform Resource Locator
WP Work Package
WSDL Web Service Description Language
WSP Web Service Provider
XML eXtensible Mark-up Language
Introduction
The concepts and practice of network monitoring
are well understood and are widely used to
identify problems, quantify performance and set
expected levels of service.
Monitoring for the Grid builds on these
established concepts and practices; however, it is
different in intent and purpose. Firstly, Grid
monitoring deals with end-to-end performance.
Secondly, it is closely coupled with real Grid
applications and may allow those applications to
vary their transport strategies for optimal
performance, for example by tuning TCP
parameters. To facilitate this, the products of
monitoring, the network metrics, are made
available to the Grid middleware via a publication
service. In addition, the data can also be made
available to end users and network personnel.
To this end, in June 2002, the UK e-Science Core
Programme began funding work to “...design and
deploy an infrastructure for network performance
monitoring within the UK e-Science community.”
This paper describes the first 12 months of the
project, and outlines the new work being
undertaken: web and Grid services, monitoring
collaborations, and TCP tuning.
The Last Year
Before reviewing the last year’s progress, it may
be helpful to provide a brief reminder of the
architecture GridMon set out to establish 12
months ago.
Every 30 minutes (90 minutes for bbcp/ftp and
GridFTP) each machine performs monitoring
between itself and all other e-Science Centres. In
this way a mesh of monitoring is created, allowing
each centre to build up a picture of the quality of
its links to all other centres. The mesh approach is
feasible given the relatively low number of sites
involved (12-15 in this case).
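As a rough illustration of why the mesh remains feasible at this scale, the sketch below enumerates every directed site pair and counts the test runs a 30 minute cycle would generate per day. The site names are placeholders chosen for illustration and the functions are hypothetical; this is not the GridMon scheduler itself.

```python
from itertools import permutations


def monitoring_mesh(sites):
    """Return every ordered (source, destination) pair for a full mesh."""
    return list(permutations(sites, 2))


def runs_per_day(n_sites, interval_minutes=30):
    """Count directed test runs per day for a full mesh of n_sites."""
    cycles_per_day = (24 * 60) // interval_minutes
    return n_sites * (n_sites - 1) * cycles_per_day


# Illustrative site list only; the real set of centres is larger.
sites = ["Daresbury", "Manchester", "Newcastle", "Rutherford Appleton"]
mesh = monitoring_mesh(sites)

for n in (12, 15):
    print(n, "sites:", n * (n - 1), "directed pairs,",
          runs_per_day(n), "runs per day")
```

The number of directed pairs grows as n(n-1), so the approach suits a dozen or so sites but would need rethinking for a much larger deployment.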
IperfER, PingER and UDPmon are tools used by
the EDG WP7[1-4] group. Bbcp/ftp are end user
tools used for network monitoring in an approach
pioneered by the IEPM-BW work at SLAC[5].
miperfer[6], a multicast version of IperfER, is a
new tool, created in the last 12 months at MCC. It
is currently on extended beta trial.
The toolkit currently deploys just PingER,
IperfER and UDPmon, mirroring the original
EDG WP7 approach. The remaining tools will be
rolled out in due course.
A presence has been established at all e-Science
centres. Some problems exist, but these are being
debugged. The rollout has required a great deal of
effort; however, good foundations have been laid,
which can now be built upon.
Feedback from sites that, as intended, use the
tools themselves is favourable. GridMon’s success
is further demonstrated by the fact that other work
groups (e.g. UK HEP) are requesting to become
monitoring hosts.
People are also recognising GridMon as a useful
vehicle for deploying testing tools that are of
interest to them (e.g. miperfer from MCC). In
addition, the project is gaining experience that is
feeding into other e-Science monitoring projects,
such as those being run at Cambridge, UCL and
UKERNA.
Fig 1: GridMon architecture
Monitoring is performed by a kit of tools installed
on a suitable machine at each e-Science Centre.
Performance data is stored locally on that
machine, and is published to interested people via
a web interface, and will be made available to the
Grid middleware via a publication service. At
inception, the publication service could have been
implemented using LDAP, R-GMA or as a web or
Grid service. This will be discussed in a later
section.
The remainder of this section highlights the
features of what most GridMon users see: the user
interface, whose consistent view allows users to
navigate with ease across the infrastructure.
The start page for the GridMon installation at each
site will feature a UK map, as shown in figure 2.
Colour coded ‘blobs’ show the site’s connectivity
to other UK sites within the last 30 minutes.
Unsurprisingly, the blobs are red, green or amber.
Floating text will display the level of packet loss
that was last experienced.
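The colour coding can be thought of as a simple threshold test on the most recent packet loss figure. The sketch below is illustrative only: the paper does not state the thresholds GridMon uses, so the 1% and 5% values and the function name are assumptions.

```python
def blob_colour(loss_percent, amber_threshold=1.0, red_threshold=5.0):
    """Map a packet loss percentage to a traffic-light colour.

    The default thresholds are assumed values for illustration,
    not those used by the GridMon interface.
    """
    if not 0.0 <= loss_percent <= 100.0:
        raise ValueError("packet loss must be between 0 and 100 percent")
    if loss_percent >= red_threshold:
        return "red"
    if loss_percent >= amber_threshold:
        return "amber"
    return "green"


for loss in (0.0, 2.5, 20.0):
    print(loss, "->", blob_colour(loss))
```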
Fig 2: active UK map
Mouse clicking on a site (blob) takes the user to
the GridMon interface for that site, where they can query the
site’s performance data using a form as shown in
figure 3.
Fig 3: selection form
The form allows the user to select the remote
hosts/sites, metrics and date range that they are
interested in. Clicking the View Plot button
produces the corresponding data plot, as shown in
figure 4.
Fig 4: data plot
Clicking the View reverse direction button will
show the same metric for the same period but in
the opposite direction, i.e. load the equivalent web
page from the remote end.
We finish the section by looking at an
example of where GridMon has proved useful.
Fig 5: TCP performance
Figure 5 shows a plot of TCP performance from
Daresbury to Manchester (upper plot) and
Newcastle (lower plot) for a period in December
2002. In this case the level of the graphs is
unimportant; we are only interested in their shape.
Note that the performance to Manchester is fairly
flat, whilst Newcastle suffers a severe daily drop
off. The performance to Newcastle was
representative of Daresbury’s performance to all
sites, except Manchester, and since Daresbury is
connected into JANET via Manchester, this
suggested the existence of a problem between
Manchester and the SJ4 core.
When prompted, the network staff at Manchester
BAR discovered that a router had been
misconfigured, causing it to underperform under high
loading. Changes resulted in the improvements
seen toward the right of the plot.
Web Services
During the lifetime of the project, various
methods of publishing data to the Grid
middleware have been mentioned, including
LDAP and R-GMA. The popularity of these
technologies is fading, however, and there is a
growing movement towards the use of web and
(OGSA) Grid services (a Grid service is
essentially a web service with some Grid specific
add-ons/pre-requisites).
When new technologies are developed, there is
the inevitable temptation to quickly adopt them
without considering their ‘true’ value, either to
maintain your cutting-edge status or simply
because everyone else is doing the same. In this
case, however, web and Grid services do offer real
benefits.
Use of web and Grid services will lead to much
easier integration of differing monitoring
architectures, allowing systems to use
functionality and data provided by others. In the
UK for example, this would allow simpler
integration of the GridMon and new UKERNA
monitoring efforts, both e-Science projects.
To fulfil its role as a “best of breed” monitoring
solution, GridMon will need to take account of
work going on elsewhere, and where possible, get
involved. Web and Grid services will make this
task easier and improve the chances of success.
This will especially be the case if a web or Grid
service is combined with a classification system
such as that proposed by the GGF NMWG
hierarchy document [7]. This document describes
a set of network characteristics and a
classification hierarchy for those characteristics,
aimed at Grid applications and services. The
application of the hierarchy will facilitate the
creation of common schemata for describing
network monitoring data, the idea being that using
a standard classification for the measurements you
take maximises the portability of your data.
From a GridMon perspective, a web service will
be the first to be developed, which can then be
extended to a Grid service. The aim is to run them
in parallel, so that GridMon can be interrogated
by Grid and non-Grid applications alike.
The basic web service architecture is
shown in figure 6.
[Figure 6 steps, between a Client, a UDDI registry and a WSP:
1. WSP registers service with registry
2. Client locates suitable service using registry
3. Client requests WSDL doc
4. WSDL tells client how to interact
5. Service & client communicate using XML messages, sent via SOAP]
Fig 6: web service architecture
A client will search a UDDI registry for a service
that is of interest. Searches can be performed
based on business name, service name or a service
category. To make initial contact with a service,
the client is given the URL of the service’s WSDL
document. This XML document describes the
methods (functions) that the service has made
available, and how the client should interact with
them. Once the client has retrieved the WSDL
document it can start using the service, via XML
RPCs and XML messages encapsulated in SOAP
messages. Although beyond the scope of this
paper, authorisation and authentication may also
be an issue.
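To make the final step more concrete, the sketch below builds and then unpacks an RPC-style request wrapped in a SOAP 1.1 envelope, using only Python's standard library. The service namespace, the method name getMetric and its parameters are hypothetical; in practice they would be dictated by the service's WSDL document, and a SOAP toolkit would normally generate this plumbing.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:gridmon"  # hypothetical service namespace


def build_request(method, params):
    """Wrap an RPC-style call in a SOAP envelope and return it as text."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Recover the method name and parameters from a SOAP envelope."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]
    method = call.tag.split("}", 1)[1]
    params = {child.tag: child.text for child in call}
    return method, params


# Hypothetical method and parameter names, for illustration only.
request = build_request("getMetric", {"remoteHost": "host.example", "metric": "rtt"})
print(request)
print(parse_request(request))
```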
In the absence of a suitable UDDI registry [8],
each GridMon web service can instead be
configured (soft-coded) with the locations of the
GridMon web services at other sites.
For simple implementations, the results of using
services can be returned as simple data types, such
as strings, as they would with other RPC
implementations. The only difference here is that
results are encapsulated in SOAP. This isn’t very
useful however, when dealing with large and
complex datasets, and situations where the service
could return differing amounts and types of data.
Enter the schema, a self-describing method of
representing data. This self-describing nature
makes it easier to share data between clients and
services that are capable of parsing schemas
(being flexible about what data they can send and
receive).
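A minimal sketch of the idea follows: each measurement in the XML document names its own characteristic, unit and type, so a client can cope with differing amounts and types of data without prior agreement. The element and attribute names here are invented for illustration and do not follow any NMWG schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical self-describing document; names are illustrative only.
SAMPLE = """<measurements source="siteA.example" destination="siteB.example">
  <measurement characteristic="delay.roundTrip" unit="ms" type="float">12.5</measurement>
  <measurement characteristic="loss.roundTrip" unit="percent" type="float">0.0</measurement>
  <measurement characteristic="hop.count" unit="hops" type="int">9</measurement>
</measurements>"""

# Map the declared type of each value to a Python conversion function.
CASTS = {"float": float, "int": int, "string": str}


def parse_measurements(xml_text):
    """Return one dict per measurement, converted to its declared type."""
    root = ET.fromstring(xml_text)
    results = []
    for element in root.findall("measurement"):
        cast = CASTS.get(element.get("type", "string"), str)
        results.append({
            "characteristic": element.get("characteristic"),
            "unit": element.get("unit"),
            "value": cast(element.text),
        })
    return results


for record in parse_measurements(SAMPLE):
    print(record)
```

Because the client reads the type and unit from the data itself, adding a new characteristic to the service does not require changing the client.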
Work has begun, spearheaded by the NMWG, on
producing CIM, UML and XML[9] based
network monitoring schemas, all based on the
group’s hierarchy document. Until these are
evaluated, no firm decision can be made over
which technology to use. As a proof of concept
however, later iterations of the GridMon web
services interface will use an XML schema based
on work at UCL and the previously mentioned
NMWG schema.
Implementation of an initial web service is in
progress, using Apache Tomcat to host the web
application, and Apache Axis to provide the
SOAP support required to turn the application into
a service. This and subsequent versions will be
used as a testbed in work conducted by UCL’s
e-Science Networking Centre of Excellence[10].
This is in addition to ‘proving’ the XML schema,
and is yet another example of GridMon adding
value.
Collaborations
The piPEs project [11], being run by Internet2’s
E2Epi, seeks to reach a network monitoring
utopia. In this utopia, when users experience
network problems, they have access to a tool
which can tell them what the problem is, where it
is located, and perhaps most importantly, who
should be contacted for its resolution.
In its final form, the piPEs infrastructure will be
able to determine complete path (end-to-end)
performance by aggregating information relating
to the various segments that make up the path,
whether these segments are in the same domain or
not.
The basic topology is produced by inserting
Performance Monitoring Points (PMPs) at
selected stages in a network (nominally alongside
routers) as shown in figure 7.
Fig 7: sample piPEs topology (Host A, Campus X, GigaPoP 1, backbone (e.g. US Abilene network), GigaPoP 2, Campus Y, Host B, with PMPs along the path)
A full description of the architecture is beyond the
scope of this paper, but it is worth outlining the
salient features:
• A battery of tests is periodically performed,
providing a minimum set of measurements of
loss, jitter, throughput and one way delay.
• The resulting performance data is stored
locally (within that domain) in a database.
• When users or network administrators request
information about the state of the network,
on-demand tests can be scheduled if the
relevant data does not already exist in a local
or remote results database.
• Users require authorisation to perform tests.
• Users have two ways of using the system: the
human analysis engine and associated web
display for dealing with historic performance,
and the testing/analysis engine with
associated interface for dealing with the “here
and now”.
• A “culprit database” exists to relate support
personnel to network domains.
An important point perhaps is that there is nonhuman access to data, other than from other piPEs
domains.
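The idea of deriving end-to-end performance from per-segment results can be sketched as follows. The combination rules used here are common approximations rather than anything taken from piPEs: one way delays are summed, loss is combined assuming independent segments, and throughput is limited by the slowest segment. The segment names and figures are invented for illustration.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Segment:
    name: str
    one_way_delay_ms: float
    loss_fraction: float      # 0.0 to 1.0
    throughput_mbps: float


def aggregate_path(segments):
    """Estimate end-to-end metrics from per-segment measurements.

    Assumes segment losses are independent; this is a simplification.
    """
    return {
        "one_way_delay_ms": sum(s.one_way_delay_ms for s in segments),
        "loss_fraction": 1.0 - prod(1.0 - s.loss_fraction for s in segments),
        "throughput_mbps": min(s.throughput_mbps for s in segments),
    }


# Invented example values for a campus-to-campus path.
path = [
    Segment("Campus X to GigaPoP 1", 1.0, 0.0, 1000.0),
    Segment("Backbone", 20.0, 0.01, 622.0),
    Segment("GigaPoP 2 to Campus Y", 2.0, 0.0, 100.0),
]
print(aggregate_path(path))
```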
The piPEs initiative also has overlap with Dante’s
multi-domain monitoring[12]. This will impact on
GridMon via its work with piPEs and UCL. And
while this work may sound ambitious, with
experience suggesting that it may also be difficult
to get all parties (domains) to sign up, the obvious
benefits make it a worthwhile cause to champion.
As previously mentioned, the SLAC IEPM–BW
tools (bbcp/ftp…) will be integrated into
GridMon. The tools will first be trialled between
CCLRC’s laboratories at Daresbury and
Rutherford Appleton.
Some collaboration will also take place with
DataTAG WP2[13] regarding the work outlined in
the next section: TCP tuning.
This section has hopefully highlighted the range of
monitoring initiatives to which the GridMon team has
exposure. GridMon is a UK e-Science project,
but it doesn’t exist in a vacuum, and is evolving to
show the best way to carry out monitoring, based
on the best techniques and technologies from
around the world.
TCP Tuning
Given the success of GridMon’s first stage in
establishing a monitoring infrastructure, it is now
possible to carry out work relating to end-to-end
TCP performance, using the installed base of
GridMon machines as a testbed.
LFNs can be described as network connections
that have high RTTs and high bandwidths, so that
they resemble long and fat pipes. Problems with
TCP’s inability to scale to work with LFNs were
discovered as early as the late 1980s[14]. Fixes
implemented since are now coming to the limit of
their application, as the current definition of an
LFN reaches a new order of magnitude. TCP’s
current problems with LFNs, and other typical
e-Science applications, are well documented [15]
[16] [mathematical treatment 17]. Matters are not
helped by known implementation problems [18].
This has given rise to new TCP implementations
such as Fast[19] and Scalable[20] TCP, but with
these technologies still at the experimental stage, a
clear requirement exists for showing how to
achieve optimum performance from existing
“standard” TCP implementations, such as Reno.
Much work is going into this topic and it is
GridMon’s intention to use the available research
to demonstrate real-world TCP best practice to
UK e-Science using the installed base of
monitoring machines.
A full discussion of TCP tuning issues is well
beyond the scope of this paper, but an interesting
if less frequently used example is interrupt
handling. Many NIC drivers offer features to limit
or queue the number of interrupt requests sent to
the machine’s CPU. This throttling makes the NIC
disturb the CPU as little as possible, leaving it free
for other tasks. Relaxing these limitations can
considerably increase NIC throughput, but at the
expense of CPU utilisation, since it is disturbed
more frequently. For a typical e-Science Grid
application (which is likely to be computationally
intensive) there must be a trade off between the
requirements for network bandwidth and CPU
usage.
Work has begun in this area, initially using
Gigabit Ethernet enabled machines at Daresbury
and Rutherford Appleton. Figure 8 highlights the
dangers of disabling various options!
Figure 8: initial TCP tuning
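To put numbers on the LFN problem described in this section, the sketch below computes the bandwidth-delay product of a path, which is the amount of data TCP must keep in flight to fill it, and the throughput ceiling imposed by a fixed window. The 1 Gbit/s and 20 ms figures are illustrative values, not GridMon measurements; 65535 bytes is the largest window TCP can advertise without window scaling.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return bandwidth_bps * rtt_s / 8


def max_throughput_bps(window_bytes, rtt_s):
    """Upper bound on throughput for a given window size and RTT."""
    return window_bytes * 8 / rtt_s


# Illustrative path: Gigabit Ethernet end hosts with a 20 ms round trip time.
bandwidth_bps = 1e9
rtt_s = 0.020

window_needed = bdp_bytes(bandwidth_bps, rtt_s)
ceiling = max_throughput_bps(65535, rtt_s)

print(f"window needed to fill the path: {window_needed / 1e6:.1f} MB")
print(f"ceiling with a 65535 byte window: {ceiling / 1e6:.1f} Mbit/s")
```

With these example values the path needs a window of about 2.5 MB, while an unscaled window caps throughput at roughly 26 Mbit/s, which is why socket buffer sizes are usually the first parameters examined when tuning.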
Acknowledgements
The work described here is closely coordinated
with work underway within the EDG, and benefits
from collaboration with the IEPM work at SLAC,
and multicast work at MCC.
Conclusion
The first year of the GridMon project has gone
well, with an initial presence established at each
of the 12 e-Science Centres. There have been, and
continue to be, some technical problems, but this is
to be expected with a varied set of installed
machines. This does not appear to be an
off-putting factor however, and the success of
GridMon is being demonstrated by the fact that
non e-Science groups are requesting to become
involved. Indeed, as GridMon grows in scope and
functionality, its use is expected to widen further.
As we move into the second phase of work,
GridMon is well poised to evolve into a “best of
breed” monitoring solution, building on work of
the GGF, Internet2, SLAC and others,
acknowledged leaders in their respective fields.
Providing web and Grid services interfaces will
increase GridMon’s user base by attracting users
who were uninterested in the human interface, and
by generating interest from other network
monitoring groups who can now use GridMon
with their own developments. TCP tuning can be
considered as a value added service, providing a
‘real world’ networking best practice
demonstrator using an already available
infrastructure. Both these strands of work are
being carried out because they will prove to be
genuinely useful, rather than being the proving of
a technology. The future therefore, is bright.
Everyone is now familiar with Moore’s law,
which sums up the rapid growth of semiconductor
devices. Networking also moves at a fast pace,
and whereas work beyond web/Grid services and
TCP tuning may be difficult to predict, evaluating
alternative TCP stacks such as Fast and Scalable
TCP looks a likely contender. The arrival of
SuperJANET5 also raises new possibilities, such
as a permanent UK implementation of QoS.
Whatever the direction of future UK networking,
there is still much to do, and much that is possible.
GridMon is funded until June 2004, and hopefully
it will be given the opportunity to reach its full
potential.
References
1. EDG WP7, Network Services:
http://ccwp7.in2p3.fr/
2. IperfER: http://www.hep.ucl.ac.uk/~ytl
3. Pinger: http://www-iepm.slac.stanford.edu/pinger/
4. UDPmon: http://www.hep.man.ac.uk/u/rich/
5. IEPM-BW: http://www-iepm.slac.stanford.edu/bw/
6. miperfer: http://www.csar.cfs.ac.uk/staff/daw/
7. B. Lowekamp, B. Tierney, L. Cottrell, R.
Hughes-Jones, T. Kielmann, and T. Swany. A
Hierarchy of Network Performance
Characteristics for Grid Applications and
Services, Global Grid Forum, 19 June 2003:
http://www-didc.lbl.gov/NMWG/docs/draft-ggf-nmwg-hierarchy-00.pdf
8. R.J. Allan, D. Chohan, X.D. Wang, M.
McKeown, J. Colgrave, and M. Dovey. UDDI
and WSIL for e-Science, Grid Support Centre,
2002.
http://esc.dl.ac.uk/Papers/UDDI/uddi/uddi.html
9. D. Gunter. Schemas for Exchanging Network
Measurements with OGSI. NMWG, 19 June
2003: http://www-didc.lbl.gov/NMWG/schemas/NMWG_Schemas_for_OGSI.html
10. Y. Li, P.D. Mealor, M.J. Leese and P. Clarke.
Plug ‘n’ Play (Network) Performance
Monitoring. To be presented at UK e-Science
All Hands Meeting, September 2003.
11. piPEs:
http://e2epi.internet2.edu/E2EpiPEs/e2epipe_index.html
12. Dante inter-domain performance monitoring:
http://www.dante.net/tf-ngn/perfmonit/
13. DataTAG WP2, High Performance Networks:
http://icfamon.dl.ac.uk/DataTAG-WP2/
14. V. Jacobson, R. Braden. RFC1072: TCP
Extensions for Long-Delay Paths. IETF,
October 1988:
http://www.ietf.org/rfc/rfc1072.txt
15. D. Katabi. Congestion Control for High
Bandwidth-Delay Product Networks
(extended abstract). MIT, February 2003:
http://datatag.web.cern.ch/datatag/pfldnet2003/papers/katabi.pdf
16. W. Feng and P. Tinnakornsrisuphap. The
Failure of TCP in High-Performance
Computational Grids. Proceedings of 2000
Supercomputing Conference (SC ’00):
http://csdl.computer.org/dl/proceedings/sc/2000/9802/00/98020037.pdf
17. T.V. Lakshman and U. Madhow. The
performance of TCP/IP for networks with
high bandwidth-delay products and random
loss. IEEE/ACM Trans. Networking, vol. 5,
no. 3, pp. 336-350, June 1997:
http://www.ece.ucsb.edu/Faculty/Madhow/Publications/ton97.ps
18. V. Paxson, M. Allman, S. Dawson, W.
Fenner, J. Griner, I. Heavens, K. Lahey, J.
Semke, and B. Volz, RFC2525: Known TCP
Implementation Problems. IETF, March
1999: http://www.ietf.org/rfc/rfc2525.txt
19. C. Jin, D. Wei, S. H. Low, G. Buhrmaster, J.
Bunn, D. H. Choe, R. L. A. Cottrell, J. C.
Doyle, W. Feng, O. Martin, H. Newman, F.
Paganini, S. Ravot and S. Singh. FAST TCP:
From Theory to Experiment. Caltech, 30
March 2003: http://netlab.caltech.edu/FAST/
20. T. Kelly. Scalable TCP: Improving
Performance in Highspeed Wide Area
Networks. CERN / University of Cambridge,
21 December 2002:
http://datatag.web.cern.ch/datatag/pfldnet2003/papers/kelly.pdf