ICRI Manifesto - Call for Help
Peter Wittenburg
CLARIN Research Infrastructure
DASISH Cluster Project
EUDAT Data Infrastructure
The Language Archive – Max Planck Institute for Psycholinguistics
Nijmegen, The Netherlands
our vision and problem
• our (CLARIN) problem was stated in 2005 at
ECRI in Nottingham:
building virtual collections
as a coming “global” scenario
• started to implement
a FIM test domain and
to talk to Terena and
eduGain in the hope
of solutions
• that was 6 years ago!!!
• since then we tried to
push things, but
• do we have an
operational
solution?
• who is responsible
for doing what?
• how much do we
need to know/do?
my changed role
• until 2011 I was active and part of the pushing team in CLARIN
together with Daan & Dieter (from MPI) and others
• I was/am a data practitioner
• now I have changed to another role
• these old people are dangerous
• they tend to become impatient
and get a biting voice
• now I am allowed to be blunt from a community perspective:
the situation is still a mess
embedding of FIM
(slide from Larry Lannom)
[Figure: diagram with the labels: Datasets (each with an ID), Accessed via Repositories; Enabling Technologies: Discovery, Access (ref. resolution, protocols, AAI); Scientists, Data Curators, End Users, Applications: Interpretation, Reuse]
where are we in CLARIN (D)
• current functions in CLARIN D relevant for AAI
• MD component registry access
• virtual collection registry (write access)
• WebLicht web application access
(chaining without delegation; a certificate-based workaround is in progress)
• BBAW web application access
• MPI web application based data access
• tested/planned are:
• access to BBAW and IDS resources
• monitoring of centers (Nagios) by Jülich CC
• center registry (write access) at Garching CC
• service hosting/deployment, access to workspaces at CC
where are we in CLARIN (D)
• what are we facing in CLARIN D
• 4 of 9 centers with SP (in CLARIN SPF)
• 7 of 9 centers have IdPs in DFN AAI
• centers act as IdPs and SPs - different agreements
• web services check at application access level
• i.e. no delegation - only ok for a closed domain
• big problem:
• some IdPs deliver NO attributes
• some deliver just the EPTID (eduPersonTargetedID: an opaque number that differs per SP)
i.e. no useful credentials for AAI
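Why an EPTID alone is of so little use across services can be shown with a minimal sketch. It assumes the IdP derives the identifier by hashing the user together with the SP's entity ID and an IdP-side secret. The function name, the HMAC-SHA256 construction, the entity IDs and the secret are all hypothetical choices for illustration, not the algorithm of any particular IdP software.

```python
import hashlib
import hmac


def targeted_id(user_id: str, sp_entity_id: str, idp_secret: bytes) -> str:
    """Derive an opaque, SP-specific pseudonym for a user (illustrative only)."""
    # Assumption: the pseudonym depends on both the user and the requesting SP.
    message = f"{sp_entity_id}!{user_id}".encode("utf-8")
    return hmac.new(idp_secret, message, hashlib.sha256).hexdigest()


secret = b"hypothetical-idp-secret"  # made-up value
sp_a = "https://sp-a.example.org/shibboleth"  # made-up entity IDs
sp_b = "https://sp-b.example.org/shibboleth"

id_at_sp_a = targeted_id("jdoe", sp_a, secret)
id_at_sp_b = targeted_id("jdoe", sp_b, secret)
id_at_sp_a_again = targeted_id("jdoe", sp_a, secret)

# Stable for the same user at the same SP ...
print(id_at_sp_a == id_at_sp_a_again)
# ... but different at every other SP, so centers cannot match their users.
print(id_at_sp_a == id_at_sp_b)
```

Under these assumptions each center sees a different opaque value for the same researcher, and none of them learns a name, an e-mail address or an affiliation. This is good for privacy, but it cannot carry authorization across a chain of services.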
where are we in CLARIN EU
• same functions intended but now across borders
• until now MPI has been the SP contract partner - this will go over to CLARIN ERIC
• MPI has contracts with all centers that act as SP
(MPI (NL), INL (NL), Meertens (NL), IDS (DE), BBAW (DE), UTU (DE), CSC (FI),
ATILF-CNRS (FR), UFAL (CZ))
• MPI has contracts with 7 NRENs (SurfFederatie (SP metadata, IdP
metadata), DFN (metadata), HAKA (metadata, pem), Kalmar Union (via
HAKA - metadata): FEIDE (Norway), WAYF (Denmark + Iceland), SWAMID
(Sweden))
• but this does not scale (negotiations with CZ and UK (JANET) take
months/years)
(this needs to become almost automatic)
what happens in EUDAT
• in EUDAT we are working across disciplines, all with different AAI
• CLARIN as shown
• ENES (climate modeling) have own federation
• LifeWatch, VPH, EPOS in pre-FIM state (as many others)
• different requirements for services
• Safe replication: only between “trusted centers”, but access to copies
• Staging to HPC pipes: by users using GridFTP
• Metadata aggregation: all is open - no AAI required
• SimpleStore: many researchers
• Hosting
• general trends
[Figure: chart with axes “nr. of centers” and “nr. of users”; Q4: CLARIN, DARIAH, CESSDA, LifeWatch, etc.; Q1: ENES, etc.]
• don’t dare to explain the intended solution
how to characterize the situation
• AAI is a complex field with different types of players
• we lack a proper technical infrastructure
• lack a fully operational AAI infrastructure for Q4 domains
• lack a gateway between Shib & Certificate domains
• lack an attempt to solve the delegation issue
• we lack agreements on trust establishment
• the whole “game” is not understood by some players
(SPs are thought to come from big commerce only)
• IdPs do what they like (often restrictive interpretations)
(administrators decide and not researchers)
• NRENs don’t sign “official” agreements
• we have a new CoC - will it help to overcome barriers?
• what are the consequences
• disciplines are going their own way
reinventing the wheel, creating sub-scenes, etc.
• a lot of hobbyism takes place creating islands
who is responsible
• who feels responsible now and will lead a concerted action?
• what to expect from eduGain
• eduGain seems to be ready for European MD exchange
• currently only few (?) agreements, opt-in for IdPs
• standard MD profile not yet broadly used
• CLARIN pilot with eduGain: we as SPs sign special CoC
• who takes responsibility for a proper technical solution
ESFRI projects - GEANT - EUDAT - Grid - ???
isn’t it the terrain of GEANT? - but I have lost hope
• who takes responsibility to tackle policy problems
ESFRI - ESFRI projects - EC - ministries - ??
• who will get all players focused to make real steps
I can see a clear role for FIM - but it needs to be broad and massive
role of Manifesto
• it’s like an SOS call from communities
• we cannot agree with the situation and
the way things are dealt with
• so much time and money invested with
little output for communities
• we need a concerted action in Europe,
since research is about competition
and shared access to distributed data
and services will be key
• why does it seem again that we need
the US to achieve breakthroughs?
FIM is a very good initiative. But we need more.
Thanks for your attention.