Applications and the Grid
The European DataGrid Project Team
http://www.eu-datagrid.org
DataGrid is a project funded by the European Union
Grid Tutorial - 4/28/2003 - n° 1
Overview
An applications view of the Grid - Use Cases and requirements
High Energy Physics
  Why we need to use GRIDs in HEP?
  Brief mention of the MONARC Model and its evolutions towards LCG
  HEP data management and HEP requirements. Processing patterns
  Testbeds 1 and 2 validation: what has already been done on the testbed?
  Current GRID-based distributed computing model of the HEP experiments
Earth Observation
  Mission and plans
  What do typical Earth Obs. applications do?
Biology
  dgBLAST
What all applications want from the Grid
A homogeneous way of looking at a ‘virtual computing lab’ made up of heterogeneous resources as part of a VO (Virtual Organisation) which manages the allocation of resources to authenticated and authorised users
A uniform way of ‘logging on’ to the Grid
Basic functions for job submission, data management and monitoring
Ability to obtain resources (services) satisfying user requirements for data, CPU, software, turnaround…
Common Applications Issues
Applications are the end-users of the GRID: they are the ones finally making the difference
All applications started modeling their usage of the GRID through USE CASES: a standard technique for gathering requirements in software development methodologies
Use Cases are narrative documents that describe the sequence of events of an actor using a system [...] to complete processes
What Use Cases are NOT:
  the description of an architecture
  the representation of an implementation
Why Use Cases?
[Layered diagram, top to bottom]
Applications domain
  ALICE, ATLAS, CMS, LHCb, Other HEP, Other Apps…
HEP Common Application Layer
Domains interface
DataGRID middleware: PPDG, GriPhyN, EU-DataGRID
Bag of Services (GLOBUS, Condor-G, …)
OS & Net services
Computer Scientist domain
Use Cases (HEPCAL)
http://lhcgrid.web.cern.ch/LHCgrid/SC2/RTAG4/finalreport.doc
Obtain Grid authorisation
Ask for revocation of Grid authorisation
Grid login
Browse Grid resources
DS metadata update
DS metadata access
Dataset registration
Virtual dataset declaration
Virtual dataset materialization
Dataset upload
User-defined catalogue creation
Data set access
Dataset transfer to non-Grid storage
Dataset replica upload to the Grid
Data set access cost evaluation
Data set replication
Physical data set instance deletion
Data set deletion (complete)
User defined catalogue deletion
Data retrieval from remote Datasets
Data set verification
Data set browsing
Browse condition database
Job catalogue update
Job catalogue query
Job submission
Job Output Access or Retrieval
Error Recovery for Aborted or Failing Production Jobs
Job Control
Steer job submission
Job resource estimation
Job environment modification
Job splitting
Production job
Analysis 1
Dataset transformation
Job monitoring
Simulation Job
Experiment software development for the Grid
VO wide resource reservation
VO wide resource allocation to users
Condition publishing
Software publishing
High Energy Physics applications
The LHC challenge
HEP is carried out by a community of more than 10,000 users spread all over the world
The Large Hadron Collider (LHC) at CERN is the most challenging goal for the whole HEP community in the next years
  Test the standard model and the models beyond it (SUSY, GUTs) at an energy scale (7+7 TeV p-p) corresponding to the very first instants of the universe after the big bang (<10**-13 s), allowing the study of quark-gluon plasma
LHC experiments will produce an unprecedented amount of data to be acquired, stored, analysed:
  10**10 collision events / year (+ same from simulation)
  This corresponds to 3-4 PB data / year / experiment (ALICE, ATLAS, CMS, LHCb)
  Data rate (input to data storage center): up to 1.2 GB/s per experiment
  Collision event records are large: up to 25 MB (real data) and 2 GB (simulation)
The LHC detectors
CMS
ATLAS
~6-8 PetaBytes / year
~10**10 events/year
~10**3 batch and interactive users
LHCb
multi-level trigger
filter out background
reduce data volume
online system
40 MHz (40 TB/sec)
level 1 - special hardware
75 KHz (75 GB/sec)
level 2 - embedded processors
5 KHz (5 GB/sec)
level 3 - PCs
100 Hz (100 MB/sec)
data recording & offline analysis
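The reduction achieved by each trigger level follows from the rates in the figure (40 MHz at 40 TB/sec, 75 KHz at 75 GB/sec, 5 KHz at 5 GB/sec, 100 Hz at 100 MB/sec). The following is only an illustrative sketch of that arithmetic; the names `STAGES`, `rejection_factors` and `event_size_bytes` are made up for this example.

```python
# (stage name, event rate in Hz, data rate in bytes/sec) as read from the figure.
STAGES = [
    ("collisions", 40e6, 40e12),   # 40 MHz, 40 TB/sec
    ("level 1",    75e3, 75e9),    # 75 KHz, 75 GB/sec
    ("level 2",     5e3,  5e9),    # 5 KHz, 5 GB/sec
    ("level 3",   100.0, 100e6),   # 100 Hz, 100 MB/sec
]

def rejection_factors(stages):
    """Return (name, factor) pairs: by how much each level reduces the event rate."""
    return [
        (name, prev_rate / rate)
        for (_, prev_rate, _), (name, rate, _) in zip(stages, stages[1:])
    ]

def event_size_bytes(stage):
    """Average event size implied by a stage: data rate divided by event rate."""
    _, rate, bandwidth = stage
    return bandwidth / rate

for name, factor in rejection_factors(STAGES):
    print(f"{name}: keeps about 1 event in {factor:.0f}")
print(f"overall: about 1 event in {STAGES[0][1] / STAGES[-1][1]:.0f}")
```

With these figures every stage implies the same average event size, so the data volume falls only because events are rejected, not because they shrink.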
Data Handling and Computation for Physics Analysis
[Diagram labels: detector; event filter (selection & reconstruction); raw data; reconstruction; event summary data; processed data; event reprocessing; event simulation; batch physics analysis; analysis objects (extracted by physics topic); interactive physics analysis]
les.robertson@cern.ch
Deploying the LHC Global Grid Service
[Diagram: The LHC Computing Centre. CERN Tier 0; Tier 1 centres (UK, USA, France, Italy, Japan, Germany, …); Tier 2 centres (Lab a, Lab b, Lab c, Lab m, Uni a, Uni b, Uni n, Uni x, Uni y); Tier 3 physics department; Desktop (α, β, γ). Overlays: grid for a regional group; grid for a physics study group]
les.robertson@cern.ch
HEP Data Analysis and Datasets
Raw data (RAW) ~ 1 MByte
  hits, pulse heights
Reconstructed data (ESD) ~ 100 kByte
  tracks, clusters…
Analysis Objects (AOD) ~ 10 kByte
  Physics Objects
  Summarized
  Organized by physics topic
Reduced AODs (TAGs) ~ 1 kByte
  histograms, statistical data on collections of selected events
[Figure: ATLAS Barrel Inner Detector, H→bb]
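To get a feel for the volumes these per-event sizes imply, the sketch below multiplies them by an event count. The count of 10**9 events per year is an assumption chosen for illustration, and the names `EVENT_SIZE` and `yearly_volume_tb` are hypothetical.

```python
# Approximate per-event sizes in bytes for each dataset type.
EVENT_SIZE = {
    "RAW": 1_000_000,  # ~1 MByte
    "ESD":   100_000,  # ~100 kByte
    "AOD":    10_000,  # ~10 kByte
    "TAG":     1_000,  # ~1 kByte
}

def yearly_volume_tb(events_per_year, sizes=EVENT_SIZE):
    """Return {dataset: volume in terabytes (10**12 bytes)} for one year of events."""
    return {name: events_per_year * size / 1e12 for name, size in sizes.items()}

volumes = yearly_volume_tb(10**9)  # assumed event count, for illustration only
for name, tb in volumes.items():
    print(f"{name}: {tb:g} TB/year")
```

Each step down the chain is a factor of ten smaller, which is why the reduced datasets can be replicated far more widely than the raw data.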
HEP Data Analysis - Processing Patterns
Processing fundamentally independent (embarrassingly parallel) due to independent nature of ‘events’
  So have concepts of splitting and merging
  Processing organised into ‘jobs’ which process N events
  (e.g. simulation job organised in groups of ~500 events which takes ~ day to complete on one node)
  A processing for 10**6 events would then involve 2,000 jobs merging into total set of 2 Tbyte
Production processing is planned by experiment and physics group data managers (this will vary from expt to expt)
  Reconstruction processing (1-3 times a year of 10**9 events)
  Physics group processing (? 1/month). Produce ~10**7 AOD+TAG
  This may be distributed in several centres
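The splitting and merging arithmetic above (10**6 events in groups of ~500 giving 2,000 jobs) can be sketched as follows. This is a toy illustration, not any experiment's production tool; `split_into_jobs` and `merge` are hypothetical names, and each job's "output" is reduced to a count of processed events.

```python
def split_into_jobs(n_events, events_per_job=500):
    """Split event indices [0, n_events) into half-open (first, end) ranges, one per job."""
    return [
        (first, min(first + events_per_job, n_events))
        for first in range(0, n_events, events_per_job)
    ]

def merge(results):
    """Merge per-job outputs; here each output is just the number of events processed."""
    return sum(results)

jobs = split_into_jobs(10**6)
# Events are independent, so each range could run on a different node or site.
outputs = [end - first for first, end in jobs]
print(len(jobs), merge(outputs))
```

Because no job depends on another, the only coordination needed is bookkeeping of which ranges have completed before the merge.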
Processing Patterns (2)
Individual physics analysis - by definition ‘chaotic’ (according to work patterns of individuals)
  Hundreds of physicists distributed in expt may each want access to central AOD+TAG and run their own selections. Will need very selective access to ESD+RAW data (for tuning algorithms, checking occasional events)
Will need replication of AOD+TAG in experiment, and selective replication of RAW+ESD
  This will be a function of processing and physics group organisation in the experiment
Alice: AliEn-EDG integration
[Diagram labels: Server; EDG UI; EDG RB; EDG CE; EDG SE; AliEn CE; AliEn SE; JDL translation; Certificates; Installation; Alice SE on EDG nodes; Alice Data Catalogue; Data Catalogue access by EDG nodes; EDG nodes]
(Cerello, Barbera, Buncic, Saiz, et al.)
What have HEP experiments already done on the EDG testbeds 1.0 and 2.0
The EDG User Community has actively contributed to the validation of the first and second EDG testbeds (feb 2002 - feb 2003)
  All four LHC experiments have run their software (firstly in some preliminary version) to perform the basic operations supported by the testbed 1 features provided by the EDG middleware
  Validation included job submission (JDL), output retrieval, job status query, basic data management operations (file replication, register into replica catalogs), check of possible s/w dependencies or incompatibility (e.g. missing libs, rpms) problems
ATLAS, CMS, Alice have run intense production data challenges and stress tests during 2002 and 2003
The CMS Stress Test
CMS Monte Carlo Production using BOSS and Impala tools.
  Originally designed for submitting and monitoring jobs on a ‘local’ farm (eg. PBS)
  Modified to treat Grid as ‘local farm’
December 2002 to January 2003
  250,000 events generated by job submission at 4 separate UI’s
  2,147 event files produced
  500 Gb data transferred using automated grid tools during production, including transfer to and from mass storage systems at CERN and Lyon
  Efficiency of 83% for (small) CMKIN jobs, 70% for (large) CMSIM jobs
The CMS Stress Test
Site                 CE           Number of CPUs  SE           Disk Space (GB)
CERN                 lxshare0227  122             lxshare0393  100
                                                  lxshare0384  1000(=100*10)*
CNAF                 testbed008   40+             grid007g     1000*
RAL                  gppce05      16              gppse05      360
NIKHEF               tbn09        22              tbn03        35
LYON                 ccgridli03   120             ccgridli08   400
                                                  ccgridli07   200
Legnaro              cmsgrid001   50              cmsgrid002   513(+513)
Padova               grid001      12              grid005      680
Ecole Polytechnique  polgrid1     4               polgrid2     220
Imperial College     gw39         16              fb00         450
CMS Stress Test : Architecture of the system
[Diagram labels: CMS RefDB (parameters); UI with IMPALA/BOSS; BOSS DB; JDL; EDG Workload Management System; Replica Manager; CE and SE at each site; WN running CMS software. Arrows: job output filtering; runtime monitoring; push data or info; pull info; data registration]
Main results and observations from CMS work
RESULTS
  Could distribute and run CMS s/w in EDG environment
  Generated ~250K events for physics with ~10,000 jobs in 3 week period
OBSERVATIONS
  Were able to quickly add new sites to provide extra resources
  Fast turnaround in bug fixing and installing new software
  Test was labour intensive (since the software was developing and the overall system was initially fragile)
  New release EDG 2.0 should fix the major problems, providing a system suitable for full integration in distributed production
Earth Observation applications (WP9)
Global Ozone (GOME) Satellite Data Processing and Validation by KNMI, IPSL and ESA
The DataGrid testbed provides a collaborative processing environment for 3 geographically distributed EO sites (Holland, France, Italy)
Earth Observation
ESA missions:
• about 100 Gbytes of data per day (ERS 1/2)
• 500 Gbytes, for the next ENVISAT mission (2002).
DataGrid contributes to EO:
• enhance the ability to access high level products
• allow reprocessing of large historical archives
• improve Earth science complex applications (data fusion, data mining, modelling …)
Source: L. Fusco, June 2001
Federico.Carminati, EU review presentation, 1 March 2002
ENVISAT
• 3500 MEuro programme cost
• Launched on February 28th, 2002
• 10 instruments on board
• 200 Mbps data rate to ground
• 400 Tbytes data archived/year
• ~100 “standard” products
• 10+ dedicated facilities in Europe
• ~700 approved science user projects
Earth Observation
Two different GOME processing techniques will be investigated
  OPERA (Holland) - Tightly coupled - using MPI
  NOPREGO (Italy) - Loosely coupled - using Neural Networks
The results are checked by VALIDATION (France). Satellite Observations are compared against ground-based LIDAR measurements coincident in area and time.
GOME OZONE Data Processing Model
Level-1 data (raw satellite measurements) are analysed to retrieve actual physical quantities: Level-2 data
Level-2 data provides measurements of OZONE within a vertical column of atmosphere at a given lat/lon location above the Earth’s surface
Coincident data consists of Level-2 data co-registered with LIDAR data (ground-based observations) and compared using statistical methods
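Co-registration of satellite Level-2 data with ground-based LIDAR data amounts to pairing observations that are close in area and time and then comparing them. The sketch below illustrates the idea only: the thresholds `max_deg` and `max_hours`, the `Obs` record and the mean-difference statistic are all assumptions for this example, not the validation procedure actually used by the project.

```python
from dataclasses import dataclass

@dataclass
class Obs:
    lat: float    # degrees
    lon: float    # degrees
    time: float   # hours since a common reference
    ozone: float  # measured quantity, arbitrary units in this sketch

def coincident_pairs(satellite, lidar, max_deg=1.0, max_hours=12.0):
    """Pair satellite and LIDAR observations that coincide in area and time.

    Longitude wrap-around at +/-180 degrees is ignored to keep the sketch short.
    """
    pairs = []
    for s in satellite:
        for g in lidar:
            close_in_area = (abs(s.lat - g.lat) <= max_deg
                             and abs(s.lon - g.lon) <= max_deg)
            close_in_time = abs(s.time - g.time) <= max_hours
            if close_in_area and close_in_time:
                pairs.append((s, g))
    return pairs

def mean_difference(pairs):
    """A very simple comparison statistic: mean satellite-minus-LIDAR difference."""
    if not pairs:
        raise ValueError("no coincident observations")
    return sum(s.ozone - g.ozone for s, g in pairs) / len(pairs)
```

For example, a satellite observation at (52.0, 5.0) and a LIDAR observation at (52.5, 5.5) six hours later would be paired under these thresholds, while one at (10.0, 10.0) would not.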
The EO Data challenge: Processing and validation of 1y of GOME data
[Diagram labels: Raw satellite data from the GOME instrument (Level 1); ESA - KNMI: Processing of raw GOME data to ozone profiles with OPERA and NNO (Level 2); LIDAR data; IPSL: Validate GOME ozone profiles with ground-based measurements; Visualization; DataGrid]
GOME Processing Steps (1-2)
Step 1: Transfer Level1 data to the Grid Storage Element
Step 2: Register Level1 data with the ReplicaManager
  Replicate to other SEs if necessary
[Diagram labels: User; User Interface; Replica Manager; Replica Catalog; Sites B to H, each with a CE and an SE; submit job; data; replicate; input data]
GOME Processing Steps (3-4)
Step 3: Submit jobs to process Level1 data, produce Level2 data
Step 4: Transfer Level2 data products to the Storage Element
[Diagram labels: User; User Interface; JDL script; Executable; Resource Broker; Information Index; Replica Catalog (LFN :: PFN); Certificate Authorities; Sites B to H, each with a CE and an SE; submit job; check certificate; search; request status; retrieve result; input data]
LFN: Logical file name
PFN: Physical file name
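The relationship between logical file names (LFN) and physical file names (PFN) used in these steps can be pictured as a one-to-many mapping: registering a file records its first PFN, and each replication to another Storage Element adds one more. The class below is a toy model written for this explanation; it is not the EDG Replica Manager API, and the PFN naming scheme and host names are invented.

```python
class ReplicaCatalog:
    """Toy catalogue mapping one logical file name (LFN) to many physical file names (PFNs)."""

    def __init__(self):
        self._replicas = {}

    def register(self, lfn, pfn):
        """Record that the file known as `lfn` has a physical copy at `pfn`."""
        self._replicas.setdefault(lfn, set()).add(pfn)

    def replicate(self, lfn, target_se):
        """Record a copy of an already registered file on another Storage Element."""
        if lfn not in self._replicas:
            raise KeyError(f"unknown LFN: {lfn}")
        pfn = f"{target_se}/{lfn}"  # hypothetical PFN naming scheme
        self.register(lfn, pfn)
        return pfn

    def lookup(self, lfn):
        """Return all known physical locations of a logical file, sorted."""
        return sorted(self._replicas.get(lfn, ()))


catalog = ReplicaCatalog()
catalog.register("gome/level1/orbit_0001", "se.site-b.example/gome/level1/orbit_0001")
catalog.replicate("gome/level1/orbit_0001", "se.site-c.example")
print(catalog.lookup("gome/level1/orbit_0001"))
```

A job only needs to name the LFN; whichever component schedules it can then use the lookup to pick a site that already holds a physical copy.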
GOME Processing Steps (5-6)
Step 5: Produce Level-2 / LIDAR Coincident data, perform VALIDATION
Step 6: Visualize Results
[Diagram labels: Level 2; LIDAR DATA; COINCIDENT DATA; Validation]
Biomedical Applications
Genomics, post-genomics, and proteomics
  Explore strategies that facilitate the sharing of genomic databases and test grid-aware algorithms for comparative genomics
Medical images analysis
  Process the huge amount of data produced by digital imagers in hospitals.
Federico.Carminati, EU review presentation, 1 March 2002
Biology and Bio-informatics applications
The international community of Biologists has a keen interest in using bio-informatic algorithms to perform research on the mapping of the human genomic code
  Biologists make use of large, geographically distributed databases with already mapped, identified sequences of proteins belonging to sequences of human genetic code (DNA sequences)
  Typical goal of these algorithms is to analyse different databases, related to different samplings, to identify similarities or common parts
dgBLAST (Basic Local Alignment Search Tool) is an example of such an application seeking particular sequences of proteins or DNA in the genomic code
Grid technology opens the perspective of large computational power and easy access to heterogeneous data sources.
A grid for health would provide a framework for sharing disk and computing resources, for promoting standards and fostering synergy between bio-informatics and medical informatics
A first biomedical grid is being deployed by the DataGrid IST project
http://dbs.cordis.lu/fep-cgi/srchidadb?ACTION=D&SESSION=221592002-10-18&DOC=27&TBL=EN_PROJ&RCN=EP_RCN_A:63345&CALLER=PROJ_IST
Biomedical requirements
Large user community (thousands of users)
  anonymous/group login
Data management
  data updates and data versioning
  Large volume management (a hospital can accumulate TBs of images in a year)
Security
  disk / network encryption
Limited response time
  fast queues
High priority jobs
  privileged users
Interactivity
  communication between user interface and computation
Parallelization
  MPI site-wide / grid-wide
  Thousands of images
  Operated on by 10’s of algorithms
Pipeline processing
  pipeline description language / scheduling
Diverse Users…
Patient
  has free access to own medical data
Physician
  has complete read access to patients data. Few persons have read/write access.
Researchers
  may obtain read access to anonymous medical data for research purposes. Nominative data should be blanked before transmission to these users
Biologist
  has free access to public databases. Use web portal to access biology server services.
Chemical/Pharmacological manufacturer
  owns private data. Need to control the possible targets for data storage.
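The access rules for medical data listed above (patients read their own data, physicians read patient data, researchers read only anonymised data) can be expressed as a small role-based filter. This is a sketch under stated assumptions: the role names, the choice of `name` and `address` as nominative fields, and the `read_record` function are all invented for illustration.

```python
from dataclasses import dataclass

NOMINATIVE_FIELDS = {"name", "address"}  # assumed examples of nominative data

@dataclass(frozen=True)
class User:
    name: str
    role: str  # "patient", "physician" or "researcher" in this sketch

def read_record(user, record):
    """Return the view of a medical record that this user is allowed to read."""
    if user.role == "patient":
        if record["name"] != user.name:
            raise PermissionError("patients may only read their own data")
        return dict(record)
    if user.role == "physician":
        return dict(record)
    if user.role == "researcher":
        # Blank nominative data before transmission to researchers.
        return {k: ("" if k in NOMINATIVE_FIELDS else v) for k, v in record.items()}
    raise PermissionError(f"role {user.role!r} has no access to medical data")

record = {"name": "Alice", "address": "1 Main St", "diagnosis": "example"}
print(read_record(User("Dr Who", "researcher"), record))
```

Blanking is applied at read time here; a real deployment would also have to control where copies of the data may be stored, as the last bullet notes.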
…and data
Biological Data
  Public and private databases
  Very fast growth (doubles every 8-12 months)
  Frequent updates (versioning)
  Heterogeneous formats
Medical data
  Strong semantic
  Distributed over imaging sites
  Images and metadata
Web portals for biologists
Biologist enters sequences through web interface
Pipelined execution of bio-informatics algorithms
  Genomics comparative analysis (thousands of files of ~Gbyte)
    Genome comparison takes days of CPU (~n**2)
  Phylogenetics
  2D, 3D molecular structure of proteins…
The algorithms are currently executed on a local cluster
  Big labs have big clusters …
  But growing pressure on resources - Grid will help
    More and more biologists
    compare larger and larger sequences (whole genomes)…
    to more and more genomes…
    with fancier and fancier algorithms !!
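One reading of the ~n**2 cost quoted for genome comparison is that n genomes compared against each other give n*(n-1)/2 pairs, each of which is an independent job that a grid could schedule separately. The sketch below assumes that reading (the slide does not say what n stands for); `pairwise_jobs` and `n_comparisons` are hypothetical names.

```python
from itertools import combinations

def pairwise_jobs(genomes):
    """All unordered pairs of genomes; each pair is one independent comparison job."""
    return list(combinations(genomes, 2))

def n_comparisons(n):
    """Number of pairs among n genomes: n*(n-1)/2, which grows as ~n**2."""
    return n * (n - 1) // 2

jobs = pairwise_jobs(["genome_a", "genome_b", "genome_c", "genome_d"])
print(len(jobs), n_comparisons(100), n_comparisons(200))
```

Doubling the number of genomes roughly quadruples the work, which is the pressure on local clusters that the slide describes.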
Example GRID application for Biology: dgBLAST
dgBLAST requires as input a given sequence (protein or DNA) to be searched and a pointer to the set of databases to be queried.
Designed for high speed (trade off vs sensitivity)
A score is assigned to every candidate sequence found and the results graphically presented
  uses a heuristic algorithm
  detects relationships among sequences which share only isolated regions of similarity.
Blastn: compares a nucleotide query sequence against a nucleotide sequence database
Blastp: compares an amino acids query sequence against a protein sequence database
The Visual DataGrid Blast, a first genomics application on DataGrid
A graphical interface to enter query sequences and select the reference database
A script to execute the BLAST algorithm on the grid
A graphical interface to analyze result
Accessible from the web portal genius.ct.infn.it
Other Medical Applications
Complex modelling of anatomical structures
  Anatomical and functional models, parallelization
Surgery simulation
  Realistic models, real-time constraints
Simulation of MRIs
  MRI modelling, artifacts modeling, parallel simulation
Mammographies analysis
  Automatic pathologies detection
Shared and distributed data management
  Data hierarchy, dynamic indices, optimization, caching
Summary (1/2)
HEP, EO and Biology users have deep interest in the deployment and the actual availability of the GRID, boosting their computer power and data storage capacities in an unprecedented way.
Currently evaluating the basic functionality of the tools and their integration into data processing schemes. Will move onto areas of interactive analysis, and more detailed interfacing via APIs
Hopefully experiments will do common work on interfacing applications to GRID under the umbrella of LCG
HEPCAL (Common Use Cases for a HEP Common Application Layer) work will be used as a basis for the integration of Grid tools into the LHC prototype
  http://lcg.web.cern.ch/LCG/SC2/RTAG4
There are many grid projects in the world and we must work together with them
  e.g. in HEP we have DataGrid, Crossgrid, Nordugrid + US projects (GriPhyN, PPDG, iVDGL)
Summary (2/2)
Many challenging issues are facing us:
  strengthen effective massive productions on the EDG testbed
  keep up the pace with next generation grid computing developments and solutions, implementing or interfacing them to EDG
  further develop middleware components for all EDG workpackages to address growing user’s demands. EDG 2.0 will implement much new functionality.
Acknowledgements and references
Thanks to the following who provided material and advice
  J Linford (WP9), V Breton (WP10), J Montagnat (WP10), F Carminati (Alice), JJ Blaising (Atlas), C Grandi, M Frank, L Robertson (LCG), D Duellmann (LCG/POOL), T Doyle (UK GridPP), O Maroney
  F Harris (WP8), I Augustin (WP8), N Brook (LHCb), P Hobson (CMS)
Some interesting WEB sites and documents
  http://lhc-computing-review-public.web.cern.ch/lhc-computing-review-public/Public/Report_final.PDF (LHC Computing Review)
  http://lcg.web.cern.ch/LCG/SC2/RTAG6 (model for regional centres)
  http://lcg.web.cern.ch/LCG/SC2/RTAG4 (HEPCAL Grid use cases)
  http://www.dante.net/geant/ (European Research Networks)
  http://lcgapp.cern.ch/project/persist/
  http://datagrid-wp8.web.cern.ch/DataGrid-WP8/
  http://edmsoraweb.cern.ch:8001/cedar/doc.info?document_id=332409 (Requirements)
  http://styx.srin.esa.it/grid
  http://edmsoraweb.cern.ch:8001/cedar/doc.info?document_id=332411 (Reqts)
  http://marianne.in2p3.fr/datagrid/wp10
  http://www.healthgrid.org
  http://www.creatis.insa-lyon.fr/MEDIGRID/
  http://edmsoraweb.cern.ch:8001/cedar/doc.info?document_id=332412 (Reqts)