An Overview of the External Security Review and Performance of the GT4 Components

William (Bill) Allcock
07 April 2005

With slides from:
Stu Martin, Lisa Childers, Jarek Gawor, Sam Meder,
Sam Lang, Ravi Madduri, Jen Schopf
Security Review
GT4 Security Architecture External Review

Reviewers

- Marty Humphrey (University of Virginia)
- Jim Basney (NCSA)
- Matt Crawford (Fermilab)
- Zach Miller (Condor)
- Jamie Frey (Condor)
- Carey Kireyev (Condor)
The Goals

- Review of the *architecture*; no code was looked at.
- The plan was to have a written report from the review team, which we could respond to and then publish.
- However, due to problems with ANL accounting, Russ ?, the head of the IETF security group and the one person who was going to be paid, did not participate. He was also supposed to write the report.
- At this point we are at the mercy of our volunteer reviewers.
- We do have an email summary of the issues and recommended changes, discussed here.
Web Services Security

There were two main concerns:

- The Delegation Service should turn over its key pair regularly, to prevent theft of the key and its surreptitious use with future proxy certificates delegated to the service. This is now in place (see the sketch below).
- The entire container uses the same key pair, so there is no protection from misbehaving services within the container.
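The rotation requirement is conceptually simple: on a fixed schedule, generate a fresh key pair so that a stolen private key stops being useful for future delegations. The sketch below illustrates the idea in Python with the `cryptography` package; it is not the Delegation Service code, and the rotation interval and file name are made up for illustration.

```python
import time
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

ROTATION_INTERVAL = 24 * 3600  # hypothetical: rotate once a day

def generate_key_pair():
    """Create a fresh RSA key pair; a stolen old key is useless for new delegations."""
    key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    pem = key.private_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PrivateFormat.TraditionalOpenSSL,
        encryption_algorithm=serialization.NoEncryption(),
    )
    return key, pem

def rotation_loop(store_path="delegation-key.pem"):
    while True:
        _, pem = generate_key_pair()
        with open(store_path, "wb") as f:   # replace the service key on disk
            f.write(pem)
        time.sleep(ROTATION_INTERVAL)       # future delegations use the new key
```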
GridFTP concerns

- Large body of C setuid code.
- Recommendation: run the control channel process (the client connection) as a non-privileged user; run the data channel process (which moves the data; the control channel connects to it, not the external client) as root setuid, and lock it down so that it only accepts connections from the control channel.
- We now support anonymous and user/pass authentication:
  - GSI authentication is no longer guaranteed on port 2811.
  - Make sure the [de]activation is very explicit, not just an empty anonymous name.
Other issues

- Executables, configs, etc. should be owned by someone other than the user that runs the jobs (e.g. globus-exec vs. globus-admin); see the sketch below.
- Use of sudo: the policy needs to be cut-and-paste, putting a generic tool to a specific use.
- Support the X.509 NameConstraints extension.
- A number of other things have already been fixed, such as explicit destroy in the Delegation Service.
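The point of the ownership recommendation is that the account that runs jobs should not be able to modify the binaries and configuration it runs. The snippet below is a hypothetical audit helper in Python (the account name and paths are illustrative, not part of GT4) that flags files owned or writable by the job-execution user.

```python
import os
import pwd
import stat

JOB_USER = "globus-exec"   # hypothetical account that runs jobs
PATHS = ["/usr/local/globus/etc", "/usr/local/globus/libexec"]  # illustrative paths

job_uid = pwd.getpwnam(JOB_USER).pw_uid

def audit(path):
    for root, _, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            st = os.stat(full)
            if st.st_uid == job_uid:
                print(f"WARNING: {full} is owned by {JOB_USER}")
            if st.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
                print(f"WARNING: {full} is group/world writable")

for p in PATHS:
    if os.path.isdir(p):
        audit(p)
```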
Performance Issues

What do we mean by Performance?

- Performance in the broadest sense of the word:
  - How fast
  - How many
  - How stable
  - How easy
- We keep the URL below (fairly) up to date:
  http://www-unix.globus.org/toolkit/docs/development/4.0-drafts/perf_overview.html
GridFTP

New GT4 GridFTP Implementation

- NOT web services based.
- NOT based on wuftpd.
- 100% Globus code. No licensing issues.
- Absolutely no protocol change. The new server should work with old servers and custom client code.
- Extremely modular, to allow integration with a variety of data sources (files, mass stores, etc.).
- Striping support is present.
- IPv6 support is included (EPRT, EPSV), but we have a limited environment for testing.
- Based on XIO.
- wuftpd-specific functionality, such as virtual domains, will NOT be present.
New Server Architecture

- The Data Transport Process (data channel) is, architecturally, 3 distinct pieces:
  - The protocol handler. This part talks to the network and understands the data channel protocol.
  - The Data Storage Interface (DSI). A well-defined API that may be re-implemented to access things other than POSIX filesystems (see the sketch below).
  - ERET/ESTO processing. The ability to manipulate the data prior to transmission; currently handled via the DSI. In v4.2 we plan to support XIO drivers as modules and chaining.
- Working with several groups on custom DSIs:
  - LANL / IBM for HPSS
  - UWis / Condor for NeST
  - SDSC for SRB
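The DSI is a C plug-in interface in the real server; the Python sketch below only illustrates the design idea, a narrow storage interface that different backends (POSIX, HPSS, SRB, NeST, ...) can implement without touching the protocol handler. The class and method names are invented for illustration and do not correspond to the actual GridFTP DSI API.

```python
import os
from abc import ABC, abstractmethod

class StorageInterface(ABC):
    """Narrow contract the transfer engine calls; backends fill it in."""

    @abstractmethod
    def stat(self, path: str) -> dict: ...

    @abstractmethod
    def recv(self, path: str, data: bytes) -> None:   # store incoming data
        ...

    @abstractmethod
    def send(self, path: str) -> bytes:                # produce outgoing data
        ...

class PosixStorage(StorageInterface):
    """Default backend: ordinary files on a POSIX filesystem."""
    def stat(self, path):
        st = os.stat(path)
        return {"size": st.st_size, "mtime": st.st_mtime}

    def recv(self, path, data):
        with open(path, "wb") as f:
            f.write(data)

    def send(self, path):
        with open(path, "rb") as f:
            return f.read()

# A mass-store backend would subclass StorageInterface the same way,
# and the protocol handler would not need to change.
```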
Current Development Status

- GT 3.9.5 is a beta release (interfaces won't change). This code base has been in use for over a year; there are bug fixes in CVS.
- We reused the data channel code from wuftpd, so it has been running for several years.
- Initial bandwidth testing is outstanding.
- Memory leaks are approximately 30 KB per 24 hours of transfer.
- One host was supporting 1800 clients.
- One workflow had O(1000) errors with the 3.2.1 server, and had none with the new server.
- A statically linked version is in VDT 1.3.3.
- Bottom line: you REALLY want to use the GT4 version.
Deployment Scenario under Consideration

- All deployments are striped, i.e. separate processes for the control and data channels.
- The control channel runs as a user who can only read and execute the executables, config, etc. It can write delegated credentials.
- The data channel is a root setuid process (see the sketch below):
  - The outside user never connects to it.
  - If anything other than a valid authentication occurs, it drops the connection.
  - It can be locked down to accept connections only from the control channel machine's IP.
  - The first action after successful authentication is setuid.
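A minimal sketch of the lockdown idea, in Python rather than the server's C code: a privileged listener that accepts connections only from the control-channel host, drops anything that fails authentication, and changes to the target user's uid as its first action after authentication succeeds. The IP, port, and the `authenticate`/`lookup_uid` helpers are invented for illustration.

```python
import os
import pwd
import socket

CONTROL_HOST_IP = "10.0.0.2"   # hypothetical control-channel machine
LISTEN_PORT = 50000            # hypothetical internal data-channel port

def authenticate(conn):
    """Placeholder for the real authentication exchange; returns a username or None."""
    raise NotImplementedError

def lookup_uid(username):
    return pwd.getpwnam(username).pw_uid

def serve():
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("0.0.0.0", LISTEN_PORT))
    srv.listen(5)
    while True:
        conn, (peer_ip, _) = srv.accept()
        if peer_ip != CONTROL_HOST_IP:     # only the control channel may connect
            conn.close()
            continue
        user = authenticate(conn)
        if user is None:                   # anything but valid auth: drop it
            conn.close()
            continue
        os.setuid(lookup_uid(user))        # first action after auth: shed root
        # ... move data on behalf of `user` ...
```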
Possible Configurations

[Diagram: four deployment layouts, Typical Installation, Striped (n=1), Striped (n>1), and Striped Server (future), each showing the control process running as a non-privileged user and the data process running as root.]
TeraGrid Striping results

- Ran a varying number of stripes.
- Ran both memory-to-memory and disk-to-disk.
- Memory-to-memory gave extremely high linear scalability (slope near 1).
- We achieved 27 Gb/s on a 30 Gb/s link (90% utilization) with 32 nodes; see the arithmetic sketch below.
- Disk-to-disk we were limited by the storage system, but still achieved 17.5 Gb/s.
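A quick sanity check of those numbers as a worked example (the per-node figure is derived here, not taken from the slides):

```python
link_gbps = 30.0
achieved_gbps = 27.0
nodes = 32

utilization = achieved_gbps / link_gbps    # 0.90 -> the quoted 90%
per_node_gbps = achieved_gbps / nodes      # ~0.84 Gb/s contributed per node

print(f"utilization: {utilization:.0%}")
print(f"per-node rate: {per_node_gbps:.2f} Gb/s")
```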
Memory to Memory Striping Performance

[Chart: bandwidth (Mbps, 0-30000) versus degree of striping (0-70), with one curve per parallel-stream count (1, 2, 4, 8, 16, 32).]
Disk to Disk Striping Performance

[Chart: bandwidth (Mbps, 0-20000) versus degree of striping (0-70), with one curve per parallel-stream count (1, 2, 4, 8, 16, 32).]
Scalability Results

[Chart: GridFTP server performance while uploading a 10 MB file from 1800 clients to ned-6.isi.edu:/dev/null. Load, memory used, CPU %, throughput (MB/s), response time (sec), and number of concurrent machines plotted against time (0-3500 sec).]
RFT

So, what about Web Services…

- Web Services access to data movement is available via the Reliable File Transfer (RFT) Service.
- It is WSRF, WS-Addressing, WSN, and WS-I compliant.
- It is reliable: state is persisted in a database, and it will retry until it either succeeds or meets whatever you defined as the ultimate failure criteria.
- It is a service, similar to a job scheduler: you can submit your data transfer job and go away.
Important Points

- Container-wide database connection pool
  - Can either wait forever or throw an exception.
- Container-wide RFT thread max (see the sketch below)
  - The total number of transfer threads is limited.
  - Each request gets a thread pool equal to its concurrency.
- Resource lifetime is independent of transfers
  - It needs to exceed the transfer lifetime if you want state around to check status.
  - One resource per request.
- URL expansion can be time consuming
  - Currently, transfers do not start until expansion is complete.
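The container-wide thread cap is a standard resource-governance pattern: one shared pool with a hard maximum, with each request additionally throttled to its own requested concurrency. The sketch below shows the shape of that idea in Python; the cap and names are illustrative, not RFT's actual configuration.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Semaphore

CONTAINER_MAX_THREADS = 25          # hypothetical container-wide cap
pool = ThreadPoolExecutor(max_workers=CONTAINER_MAX_THREADS)

def submit_request(transfers, concurrency):
    """Run one request's transfers, at most `concurrency` at a time,
    while the shared pool caps total threads across all requests."""
    gate = Semaphore(concurrency)
    futures = []
    for transfer in transfers:
        gate.acquire()                           # per-request concurrency limit
        fut = pool.submit(transfer)              # global limit enforced by the shared pool
        fut.add_done_callback(lambda _f: gate.release())
        futures.append(fut)
    return futures
```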
RFT Architecture

[Architecture diagram.]
RFT Testing

Current testing:

- We are in the process of moving the Sloan Digital Sky Survey DR3 archive: 900K+ files, 6 TB. We have killed the transfer several times for recoverability testing; no human intervention has been required.
- The current maximum request size is approximately 20,000 entries with the default 64 MB heap size.
- Since GRAM uses RFT for staging, all GRAM tests that include staging also test RFT (and GridFTP and the Delegation Service and core…).
- Infinite transfer, LAN: killed the container after ~120,000 transfers (the servers were killed by mistake).
  - It was a good test: it found a corner case where postgres could not keep up with ~3 update queries/sec and was using up CPU.
- Infinite transfer, WAN: ~67,000 transfers, killed for the same reason as above.
- Infinite transfer: 3 scripts creating transfer resources of one file with a lifetime of 5 minutes. Found a synchronization bug and fixed it.
- Active nightly tests and a script to randomly kill the container and database daemon are in progress.
MDS

MDS Query results

Only one set of data so far; no data yet for the Trigger Service. Ran at this load for 10 minutes without failure.

Service                    Message size   Requests processed   Elapsed time   Avg round-trip time
DefaultIndexService        7.5 KB         11,262               181 s          16 ms
ContainerRegistryService   32 KB          6,232                181 s          29 ms
Long Running Test

- Ran for 14 days (killed by accident during other testing).
- Responded to over 94 million requests.
- 13 millisecond average query RTT.
- 76 requests per second (see the check below).
- DiPerF tests have also been run against it (next slide).
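The request rate follows directly from the totals quoted above; a quick consistency check (the totals are rounded on the slide, so the result only needs to land near 76/s):

```python
requests = 94_000_000        # "over 94 million requests"
days = 14

seconds = days * 24 * 3600
rate = requests / seconds
print(f"{rate:.0f} requests/second")   # ~78/s, consistent with the quoted 76/s
```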
[DiPerF test results graph.]
Java Core

Core Performance

- We've been working hard to increase basic messaging performance:
  - A factor of 4 improvement so far.
  - We're testing reliability.
- We've shown that core can scale to a very large number of resources (>>10,000).
Core Messaging Performance

[Chart: time (ms, 0-700) versus message size (number of GRAM subjob messages, 1-57) for the Axis update branch (1/10/05) and CVS head snapshots from 1/10/05, 11/05/04, and 11/01/04.]
Security Performance

- We've measured performance for both WS and transport security mechanisms (see the next slide for the graph).
- Transport security is significantly faster than WS security:
  - We made transport security (i.e. https) our default.
  - We're working on making it even faster by using connection caching (see the sketch below).
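Connection caching pays off because the expensive part of transport security is the handshake, not the individual request. The sketch below illustrates the effect with Python's standard `http.client`; the host name is a placeholder, whether the second loop actually reuses the connection depends on the server honoring HTTP keep-alive, and this is an illustration of the idea rather than GT4 client code.

```python
import http.client
import time

HOST = "example.org"   # placeholder HTTPS endpoint

def no_reuse(n=5):
    """New connection per request: TCP + TLS handshake every time."""
    start = time.time()
    for _ in range(n):
        conn = http.client.HTTPSConnection(HOST)
        conn.request("GET", "/")
        conn.getresponse().read()
        conn.close()
    return time.time() - start

def with_reuse(n=5):
    """One cached connection: handshake once, then plain requests."""
    start = time.time()
    conn = http.client.HTTPSConnection(HOST)
    for _ in range(n):
        conn.request("GET", "/")
        conn.getresponse().read()
    conn.close()
    return time.time() - start

print("no reuse:  ", no_reuse())
print("with reuse:", with_reuse())
```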
Security Performance

[Chart: time (ms, 0-1800) versus message size (number of GRAM subjob messages, 1-58) for Transport Security, Message Security, and Conversation Security, CVS head as of 1/19/2005.]
C WS Core

What is implemented

- Implements serialization / deserialization.
- Implements WS-Addressing.
- Implements Secure Conversation, Message Security, and Transport Security (HTTPS, the default).
- Implements Notification Consumer (client side), but not Notification Source (server side).
C WS Core Clients: Java vs. C

- Java VM startup: large initial overhead.
- Simple Java client request/response: ~5 seconds.
- Simple C client request/response: ~0.5 seconds.
C WS Core Performance: Service Container

Average request/response time:

                   Java container   C container
Without security   0.36 s           0.015 s
With security      0.66 s           0.12 s
C Performance Improvements

HTTP persistence (average request/response time):

- No security, no caching: 0.25 s
- No security, with caching: 0.17 s
- With security, no caching: 2.6 s
- With security, with caching: 0.52 s
C Performance Improvements (Planned)

- Improved deserialization performance for optional schema elements.
- WS-Security performance: inlined canonicalization.
C globusrun-ws Performance

- Query Delegation Factories: 0.046 s
- Query Certificate Chain: 0.058 s
- CreateManagedJob: 0.12 s
- Active Notification: 5.11 s
- Cleanup Notification: 0.73 s
- Done Notification: 2.29 s
- C client total processing time: 1.12 s
GRAM

Some of our Goals

"GRAM should add little to no overhead compared to an underlying batch system"

- Submit as many jobs to GRAM as is possible to the underlying scheduler:
  - Goal: 10,000 jobs to a batch scheduler.
  - Goal: efficiently fill the process table for the fork scheduler.
- Submit/process jobs as fast through GRAM as is possible to the underlying scheduler:
  - Goal: 1 job per second.
- We are not there yet…
  - A range of limiting factors is at play.
Design Decisions

- Efforts and features towards the goal:
  - Allow job brokers the freedom to optimize.
    - E.g. Condor-G is smarter than globusrun.
    - Protocol steps made optional and shareable.
  - Reduced cost of the GRAM service on the host.
    - Single WSRF host environment.
    - Better job status monitoring mechanisms.
  - More scalable/reliable file handling.
    - GridFTP and RFT instead of globus-url-copy.
    - Removal of non-scalable GASS caching.
- GT4 tests are performing better than GT3 did, but there is more work to do.
GRAM / GridFTP file system mapping

- Associates compute resources and GridFTP servers.
- Maps shared filesystems of the GRAM and GridFTP hosts, e.g.:
  - The GRAM host mounts homes at /pvfs/home.
  - The GridFTP host mounts the same filesystem at /pvfs/users/home.
- GRAM resolves file:/// staging paths to local GridFTP URLs (see the sketch below):
  - file:///pvfs/home/smartin/file1... resolves to gsiftp://host.domain:2811/pvfs/users/home/smartin/file1
- Configured in $GL/etc/gram-service/globus_gram_fs_map_config.xml.
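The resolution step is essentially a prefix substitution from a GRAM-host mount point to the corresponding GridFTP-host path. The Python sketch below mirrors that idea; the mapping table and helper name are illustrative, not the actual GRAM code or config format.

```python
# Hypothetical mapping: GRAM-host mount point -> (GridFTP server, GridFTP-host path)
FS_MAP = {
    "/pvfs/home": ("gsiftp://host.domain:2811", "/pvfs/users/home"),
}

def resolve_staging_url(file_url: str) -> str:
    """Rewrite a file:/// staging path into a GridFTP URL using the filesystem map."""
    assert file_url.startswith("file://")
    path = file_url[len("file://"):]          # strip the scheme, keep the absolute path
    # Pick the longest matching mount-point prefix.
    for prefix in sorted(FS_MAP, key=len, reverse=True):
        if path.startswith(prefix):
            server, remote_prefix = FS_MAP[prefix]
            return server + remote_prefix + path[len(prefix):]
    raise ValueError(f"no filesystem mapping for {path}")

print(resolve_staging_url("file:///pvfs/home/smartin/file1"))
# -> gsiftp://host.domain:2811/pvfs/users/home/smartin/file1
```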
GRAM 3.9.4 performance

Service performance & stability:

- Throughput
  - GRAM can process ~70 /bin/date jobs per minute.
  - ~60 jobs/minute for jobs that require delegation.
- Job burst
  - Many simultaneous job submissions: are the error conditions acceptable?
- Max concurrency
  - How many total jobs can a GRAM service manage at one time without failure?
- Service uptime
  - Under a moderate load, how long can the GRAM service process jobs without failure / reboot?
Long Running Test

- Ran 500,000+ sequential jobs over 23 days.
- Staging, delegation, fork job manager.
- http://bugzilla.globus.org/bugzilla/show_bug.cgi?id=2582
- As an experiment, we have been tracking some of our work in bugzilla.
Max Concurrency Test

- The current limit is 32,000 jobs, due to a Linux directory limit.
  - Using multiple sub-directories will resolve this (see the sketch below), but it is not likely to make it for 4.0.
- Simple job to the Condor scheduler: a long-running sleep job. No staging, no streaming, no delegation, no cleanup.
- http://bugzilla.globus.org/bugzilla/show_bug.cgi?id=3090
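The limit is a filesystem constraint (older Linux filesystems such as ext3 cap a directory at roughly 32,000 subdirectory entries), and the usual fix is to spread per-job directories across a level of hashed sub-directories. A hypothetical sketch of that layout in Python, not the GRAM implementation:

```python
import hashlib
import os

STATE_ROOT = "/var/lib/gram-jobs"   # hypothetical job-state root

def job_dir(job_id: str, fanout: int = 256) -> str:
    """Spread job directories over `fanout` hashed buckets so no single
    directory comes anywhere near the ~32,000-entry limit."""
    digest = hashlib.sha1(job_id.encode()).hexdigest()
    bucket = int(digest[:4], 16) % fanout
    return os.path.join(STATE_ROOT, f"{bucket:03d}", job_id)

path = job_dir("job-abc123")
os.makedirs(path, exist_ok=True)
```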
Max Throughput Tests

- The current limit is approximately 77 jobs per minute.
  - Simple job to the fork scheduler: a /bin/date job. No staging, no streaming, no delegation, no cleanup.
  - Bottleneck investigation: http://bugzilla.globus.org/bugzilla/show_bug.cgi?id=2521
- With delegation, the limit was 60 jobs per minute.
  - http://bugzilla.globus.org/bugzilla/show_bug.cgi?id=2557
Overall Summary

- Have we done all the testing we want to? Absolutely not.
- Have we done far more testing on GT4 than any other GT release? You bet.
- Have we done enough testing? We think so.
RLS

- I don't really have results for RLS; it has not changed much and its testing was done a while ago.
- However, I will point out two projects that are heavily using it and GridFTP in production:
  - LIGO - Laser Interferometer Gravitational Wave Observatory
  - UK QCD - Quantum Chromodynamics
LIGO

- They don't have such a pretty web site :-)
- However, the numbers are impressive:
  - Produce 1 TB per day
  - 8 sites
  - > 3 million entries in the RLS
  - > 30 million files
- "This replication of data using RLS and GridFTP is enabling more gravitational wave data analysts across the world to do more science more efficiently than ever before. Globus RLS and GridFTP are in the critical path for LIGO data analysis."
Installation

- configure / make
- Lots of binaries planned.
- Much better platform support:
  - Nightly testing on a wide variety of platforms (will probably install for you).
- Tools (need more work here):
  - Check your security config
  - grid-mapfile-check-consistency
Download