Ibis: A Programming System for Real-World Distributed Computing
Henri Bal, Frank J. Seinstra, Niels Drost, Jason Maassen
bal@cs.vu.nl
High Performance Distributed Computing group, Vrije Universiteit Amsterdam

Introduction
● Distributed systems continue to change
  ● Clusters, grids, clouds, mobile devices
● Distributed applications continue to change
  ● e-Science, web, pervasive applications
● Distributed programming continues to be notoriously difficult

Our approach
● Study fundamental underlying problems
● … hand-in-hand with realistic applications
● … integrate solutions in one system: Ibis
[Figure labels: User, Distributed Systems]
Funding from NWO (2002), VL-e (2003-2009), EU (JavaGAT, XtreemOS, Contrail), VU, COMMIT

Example: Supernova Detection
● DACH 2008 challenge, Japan
● Distributed multi-cluster system
  ● Heterogeneous
● Distributed database (image pairs)
  ● Large vs small databases/images
  ● Partial replication
● Image-pair comparison given (in C)
● Find all supernova candidates
  ● Task 1: As fast as possible
  ● Task 2: Idem, under system crashes

DACH 2008: Result
● All participating teams struggled (1 month)
  ● Middleware instabilities…
  ● Connectivity problems…
  ● Load balancing…
● But not the Ibis team
  ● Winner (by far) in both categories
  ● Note: many Japanese teams with years of experience
    ● Hardware, middleware, network, C-code, image data…
● Focus on ‘problem solving’, not ‘system fighting’
  ● incl. ‘opening’ of black-box C-code

Ibis Results: Awards & Prizes
● AAAI-VC 2007 Most Visionary Research Award
● 1st Prize: DACH 2008 - BS
● 1st Prize: DACH 2008 - FT
● WebPie: A Web-Scale Parallel Inference Engine (J. Urbani, S. Kotoulas, J. Maassen, N. Drost, F.J. Seinstra, F. van Harmelen, and H.E. Bal)
● 3rd Prize: ISWC 2008
● 1st Prize: SCALE 2008
● 1st Prize: SCALE 2010
● Finalist: Enlighten Your Research (EYR) 2011
● Many domains; data/compute intensive, real-time...

Jungle Computing (Frank Seinstra)
● ‘Worst case’ computing as required by end-users
  ● Distributed
  ● Heterogeneous
  ● Hierarchical (incl.
multi-/many-cores)

Why Jungle Computing?
● Scientists are often forced to use a wide variety of resources simultaneously to solve computational problems
● Prominent causes:
  ● Desire for scalability
  ● Distributed nature of (input) data
  ● Software heterogeneity (e.g. mix of C/MPI and CUDA)
  ● Ad hoc hardware availability
  ● …

Computational Astrophysics
● The AMUSE system (Leiden University)
● Early Star Cluster Evolution, including gas
[Figure: AMUSE coupling gravitational dynamics, stellar evolution, hydrodynamics, radiative transport]
● Gravitational dynamics (N-body): GPU / GPU-cluster
● Stellar evolution: Beowulf cluster / Cloud
● Hydrodynamics, radiative transport: Supercomputer

Computational Astrophysics
Demonstrated live at SC’11, Nov 12-18, 2011, Seattle, USA (next week)

Ibis Design
● Applications need functionality for
  ● Programming (as in programming languages)
  ● Deployment (as in operating systems)

  Programming    Deployment
  Logical        Practical
  Likes math     Visual (GUI)

Ibis Software Stack
[Figure: Ibis software stack]

Ibis as ‘Master Key’
● JavaGAT
  ● High-level API for developing/running applications independently of the underlying middleware
  ● Used for, among other things, user authentication, file copying, job submission, job monitoring, …
● Example: file copy
  ● JavaGAT vs. Globus

[Listing: Globus version of the file copy. Package org.gridlab.gat.io.cpi.rftgt4, classes RFTGT4NotifyCallback and RFTGT4FileAdaptor (extends FileCpi). It opens with a long run of imports from org.globus.*, org.oasis.*, org.apache.axis.* and org.gridlab.gat.*, followed by code for credential delegation, security descriptors, notification subscription, and polling the transfer status once per second until the copy is done.]

JavaGAT version of the file copy:

    package tutorial;

    import org.gridlab.gat.GAT;
    import org.gridlab.gat.GATContext;
    import org.gridlab.gat.URI;
    import org.gridlab.gat.io.File;

    public class RemoteCopy {
        public static void main(String[] args) throws Exception {
            GATContext context = new GATContext();
            URI src = new URI(args[0]);
            URI dest = new URI(args[1]);
            File file = GAT.createFile(context, src);
            file.copy(dest);
            GAT.end();
        }
    }

● Simple, portable, …
● Standardized (in SAGA)

Ibis as ‘Glue’
● IPL + SmartSockets
  ● Stable, fault-tolerant, malleable communication system
  ● Used for, among other things, linking separate ‘activities’ of an application
    ● Activities: often largely ‘independent’ tasks implemented in any popular language or model (e.g. C/MPI, CUDA, Fortran, Java…)
    ● Each typically running on a single GPU/node/Cluster/Cloud/…
● Example: connectivity (e.g. behind firewalls)
[Figures: With SmartSockets / No SmartSockets]

Ibis as ‘HPC Solution’
● Ibis as replacement for e.g. C++/MPI code
● Benefits:
  ● (better) portability
  ● malleability (open world)
  ● fault-tolerance
  ● (run-time) task migration
● Comparable speed:
[Figure: speed comparison]

Example Application #3: Color-based Object Recognition by a Grid-connected Robot Dog
Seinstra et al.
(IEEE Multimedia, Oct-Dec 2007)

‘Master Key’ + ‘Glue’ + ‘HPC’
● Original code (2004/2005):
  ● ‘Multimedia services’ in C++/MPI (pre-installed at each cluster)
  ● Wide-area communication using TCP sockets (unstable)
  ● SSH tunneling where necessary (firewalls, …)
  ● Execution on each cluster ‘by hand’
● Step-wise conversion to 100% Ibis / Java
  ● Phase 1: JavaGAT as ‘Master Key’
  ● Phase 2: IPL + SmartSockets as ‘Glue’
  ● Phase 3: Ibis as ‘HPC Solution’
● Eventual result:
  ● ‘wall-socket computing from a memory stick’
  ● Awards at AAAI 2007 and CCGrid 2008

Conclusions
● Ibis enables problem solving (avoids system fighting)
● Successfully applied in many domains
  ● Astronomy, multimedia analysis, climate modeling, remote sensing, semantic web, model checking, medical imaging, smart phones, …
  ● Data intensive, compute intensive, real-time…
● Open source, download: www.cs.vu.nl/ibis/
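
Appendix sketch: the ‘Master Key’ idea
The ‘Master Key’ slides describe one high-level call (a file copy) that works independently of the underlying middleware. A minimal sketch of that idea, assuming a simple "try each adaptor in turn" strategy, is given below. Every name in it (MasterKeySketch, CopyAdaptor, UnavailableAdaptor, LocalAdaptor) is hypothetical and chosen for illustration only; none of it is JavaGAT API, and JavaGAT's actual adaptor selection may work differently.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

public class MasterKeySketch {

    /** Hypothetical adaptor interface: one implementation per middleware. */
    interface CopyAdaptor {
        String name();
        void copy(Path src, Path dest) throws IOException;
    }

    /** Stand-in for a middleware that is not usable on this machine. */
    static final class UnavailableAdaptor implements CopyAdaptor {
        public String name() { return "grid-middleware"; }
        public void copy(Path src, Path dest) throws IOException {
            throw new IOException("middleware not installed");
        }
    }

    /** Adaptor backed by the local file system. */
    static final class LocalAdaptor implements CopyAdaptor {
        public String name() { return "local"; }
        public void copy(Path src, Path dest) throws IOException {
            Files.copy(src, dest, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    /**
     * Tries each adaptor in order and returns the name of the first one
     * that succeeds. The caller never sees which middleware was used
     * unless it asks for the name.
     */
    static String copy(List<CopyAdaptor> adaptors, Path src, Path dest)
            throws IOException {
        List<String> failures = new ArrayList<>();
        for (CopyAdaptor adaptor : adaptors) {
            try {
                adaptor.copy(src, dest);
                return adaptor.name();
            } catch (IOException e) {
                // Remember why this adaptor failed, then fall through to the next.
                failures.add(adaptor.name() + ": " + e.getMessage());
            }
        }
        throw new IOException("all adaptors failed: " + failures);
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("masterkey-src", ".txt");
        Path dest = Files.createTempFile("masterkey-dest", ".txt");
        Files.writeString(src, "image-pair-0001");

        List<CopyAdaptor> adaptors =
                List.of(new UnavailableAdaptor(), new LocalAdaptor());
        String used = copy(adaptors, src, dest);

        System.out.println("copied via " + used + ": " + Files.readString(dest));
    }
}
```

In this sketch the application code in main contains no middleware-specific logic. The first adaptor fails, the second succeeds, and the caller only sees a completed copy, which is the contrast the RemoteCopy example draws against the Globus listing.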