Peter Wittenburg
CLARIN Research Infrastructure
DASISH Cluster Project
EUDAT Data Infrastructure
The Language Archive – Max Planck Institute for Psycholinguistics
Nijmegen, The Netherlands
• our (CLARIN) problem was stated in 2005 at ECRI in Nottingham: building virtual collections as a coming “global” scenario
• started to implement a FIM test domain and to talk to Terena and eduGain in the hope of solutions
• that was 6 years ago!
• since then we tried to push things, but
• do we have an operational solution?
• who is responsible for doing what?
• how much do we need to know/do?
• until 2011 I was active and part of the pushing team in CLARIN together with Daan & Dieter (from MPI) and others
• I was/am a data practitioner
• now I have changed to another role
• these old people are dangerous
• they tend to become impatient and get a biting voice
• now I am allowed to be blunt from a community perspective: the situation is still a mess
(slide from Larry Lannom: diagram of datasets carrying IDs, accessed via repositories; enabling technologies support discovery, access (ref. resolution, protocols, AAI), interpretation and reuse by scientists, data curators, end users and applications)
• current functions in CLARIN D relevant for AAI
• MD component registry access
• virtual collection registry (write access)
• WebLicht web application access
(chaining without delegation, busy with a workaround based on certificates)
• BBAW web application access
• MPI web application based data access
• tested/planned are:
• access to BBAW and IDS resources
• monitoring of centers (nagios) by Jülich CC
• center registry (write access) at Garching CC
• service hosting/deployment, access to workspaces at CC
• what are we facing in CLARIN D
• 4 of 9 centers with SP (in CLARIN SPF)
• 7 of 9 centers have IdPs in DFN AAI
• centers act as IdPs and SPs - different agreements
• web services check at application access level
• i.e. no delegation - only ok for a closed domain
• big problem:
• some IdPs deliver NO attributes
• some deliver just EPTID (a number that differs per SP), i.e. no useful credentials for AAI
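To make the attribute problem concrete, here is a minimal sketch of how an SP might classify what an IdP releases. The set of attributes treated as "identifying" and the category names are assumptions for illustration only; the slides do not state which attributes the CLARIN SPs actually require.

```python
# Hypothetical policy: attributes this sketch treats as sufficient to
# identify a user across SPs. The real CLARIN requirements are not
# given in the slides.
IDENTIFYING = {"eduPersonPrincipalName", "mail"}


def classify_release(attributes: dict) -> str:
    """Classify the attributes an IdP released to this SP.

    `attributes` maps attribute names to values; empty values are
    treated as not released.
    """
    present = {name for name, value in attributes.items() if value}
    if not present:
        return "no-attributes"      # IdP delivered nothing
    if present & IDENTIFYING:
        return "usable"             # user can be recognised across SPs
    if present == {"eduPersonTargetedID"}:
        return "eptid-only"         # opaque ID, different for every SP
    return "insufficient"


if __name__ == "__main__":
    print(classify_release({}))
    print(classify_release({"eduPersonTargetedID": "a1b2c3"}))
    print(classify_release({"eduPersonPrincipalName": "user@example.org"}))
```

Because an EPTID is different for each SP, two centers cannot tell from it that they are dealing with the same person, which is why the sketch keeps that case separate from "usable".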
• same functions intended but now across borders
• until now MPI SP contract partner - will go over to CLARIN ERIC
• MPI has contracts with all centers that act as SP
(MPI (NL), INL (NL), Meertens (NL), IDS (DE), BBAW (DE), UTU (DE), CSC (FI), ATILF-CNRS (FR), UFAL (CZ))
• MPI has contracts with 7 NRENs (SurfFederatie (SP metadata, IdP metadata), DFN (metadata), HAKA (metadata, pem), Kalmar Union (via HAKA metadata): FEIDE (Norway), WAYF (Denmark + Iceland), SWAMID (Sweden))
• but this does not scale (negotiations with CZ and UK (JANET) take months/years)
(this needs to be almost automatic)
• in EUDAT we are working across disciplines all with different AAI
• CLARIN as shown
• ENES (climate modeling) have own federation
• LifeWatch, VPH, EPOS in pre FIM state (as many others)
• different requirements for services
• Safe replication only between “trusted centers”, but access to copies
• Staging to HPC pipes by users using GridFTP
• Metadata aggregation: all is open - no AAI required
• SimpleStore: hosting many researchers
• general trends
(chart: nr. of centers vs. nr. of users; Q4: CLARIN, DARIAH, CESSDA, LifeWatch, etc.; Q1: ENES, etc.)
• don’t dare to explain the intended solution
• AAI is a complex field with different types of players
• we lack a proper technical infrastructure
• lack a fully operational AAI infrastructure for Q4 domains
• lack a gateway between Shib & Certificate domains
• lack an attempt to solve the delegation issue
• we lack agreements on trust establishment
• the whole “game” is not understood by some players
(SPs are thought to come from big commerce only)
• IdPs do what they like (often restrictive interpretations)
(administrators decide and not researchers)
• NRENs don’t sign “official” agreements
• we have a new CoC - will it help to overcome barriers?
• what are the consequences
• disciplines are going their own way reinventing the wheel, creating sub-scenes, etc.
• a lot of hobbyism takes place creating islands
• who now feels responsible and will lead a concerted action?
• what to expect from eduGain
• eduGain seems to be ready for European MD exchange
• currently only few (?) agreements, opt-in for IdPs
• standard MD profile not yet broadly used
• CLARIN pilot with eduGain: we as SPs sign special CoC
• who takes responsibility for a proper technical solution
ESFRI projects - GEANT - EUDAT - Grid - ???
isn’t this the terrain of GEANT? - but I have lost hope
• who takes responsibility to tackle policy problems
ESFRI - ESFRI projects - EC - ministries - ??
• who will get all players focused on taking real steps
I can see a clear role for FIM - but it needs to be broad and massive
• it’s like an SOS call from communities
• we cannot agree with the situation and the way things are dealt with
• so much time and money invested with little output for communities
• we need a concerted action in Europe, since research is about competition and shared access to distributed data and services will be key
• why does it seem, once again, that we need the US to achieve breakthroughs?
Thanks for your attention.