Paul Nilsson
PILOT 2.0

Introduction
• New PanDA Pilot Project launched in April 2016
  – Project to span the next few years
  – Presented at PanDA workshop in April, WFMS review in May, TIM in June (plus other presentations in ADC and Pilot Dev meetings) [For more info..]
  – Development and support of the old pilot (“Pilot 1.0”) will continue, as it remains the production pilot until the new pilot (“Pilot 2.0”) is ready
• Motivation
  – Some of the Pilot 1.0 code base is getting a bit too old and is difficult to maintain
  – Refactoring is a slow process that has already been going on for years and does not always have highest priority
  – More manpower made available to alleviate a steady increase of feature requests
  – New features/workflows are often challenging to implement/support
    • E.g. experiment as a plug-in, glExec, failovers, object stores, event service, HPCs, ..
• “Complete” rewrite
  – Keeping some recent new developments (not cut-and-paste)
  – Getting rid of all legacy code and outdated mechanisms
  – Rethink of basic pilot flow

Documentation and Code Repository
• Goal: Pilot 2.0 should be “completely” documented
  – Auto-documentation of code
    • Documentation tools to be evaluated (e.g. sphinx, pydoc, doxygen)
  – General overview, major workflows and algorithms on wikis
    • General Pilot 2.0 wiki: https://twiki.cern.ch/twiki/bin/view/PanDA/Pilot2 (in progress)
    • Site movers wiki: https://twiki.cern.ch/twiki/bin/view/PanDA/SiteMovers (also covering Pilot 1.0)
• Pilot 2.0 repository already located in GitHub
  – Dev branch: https://github.com/PanDAWMS/pilot-2.0

Testing Framework
• A pull request into Pilot GitHub will trigger a series of tests that need to succeed before release can be approved
• Use pattern validation to..
  – Avoid typing errors
  – Keep code clean
  – Follow coding convention
• Unit tests are added to pilot for new functionalities
• Automatic test workflow
  – Full test of all test functions
  – Pull requests are labelled “OK” or “FAIL” automatically with API
  – For “FAIL”, add failure message to GitHub with API
• Manager merges successfully tested pull requests
• Dev and RC pilot will automatically be created (by cron job; currently manually created)

MiniPilot
• A minimal pilot has been developed by Daniil Drizhuk (Kurchatov Inst.)
  – First Pilot 2.0 code to have been developed
  – To be used by the developers primarily during the initial development and testing stage
  – For module and component testing
  – Can eventually result in a Simple Pilot for external use/starting point for new PanDA users
• Documentation/instructions in GitHub
  – https://github.com/PanDAWMS/pilot2.0/tree/dev/lib/minipilot
• Easy to use by design
• Code is clean and simple
• Using proper/standard [python] logging
• Following coding conventions
  – Enforced by testing framework

Site Movers
• A new site mover architecture in development since summer of 2015
  – Implemented in “Pilot 1.0” but will largely survive in Pilot 2.0
  – Major development effort and a great simplification (using new AGIS functionality leading to removal of several complex schedconfig fields [to be removed after site migration])
  – Essentially ready, major copy commands supported (rucio, xrdcp, lcg-cp, lsm, dccp)
    • Standing by with GFAL2 and OS site movers for experiments not using Rucio
  – Some functionality not implemented yet (e.g. alt. stage-out to nucleus site) or not tested yet (object store transfers, to be handled by Rucio)
    • Currently being discussed
• Migration plan for sites to use new site movers is in progress
  – Validation of DDM endpoints for support of particular protocols
    • Most endpoints already tested and verified with stand-alone script
  – Validation of content of information system
  – Validation of mover implementation
  – Corner case study
    • Validation of opportunistic resources and specific storage implementations
  – Post-migration clean-up
    • AGIS and schedconfig cleanup and optimization, retirement of outdated pilot code
  – Many site admins have already been contacted
  – Site migration responsible: Danila Oleynik (UTA)

Component Model
• Recent F2F discussion based on Pilot 1.0 workflows resulted in first draft of Pilot 2.0 component model
• Comparisons with other models to be done
  – E.g. Radical Pilot (S. Jha et al)
• First version of Pilot 2.0 code structure based on component model already created and available in GitHub
  – For initial testing of workflows, testing system
  – Currently only skeleton code
• Component Model and Architecture document to be written

Harvester Integration
• Harvester can simplify some of the workflows in the pilot
  – Logic can be moved away from pilot to Harvester
    • Will simplify pilot code
  – Harvester can decide which pilot modules to use
    • Especially useful on HPCs where only a reduced set of pilot components is needed
      – E.g. multi-jobs on HPCs; getJob functionality not needed
• The modular approach of Pilot 2.0 will facilitate the integration with Harvester
  – Note: Pilot 1.0 is also modular and Harvester will initially launch components (HPC code) from ‘old’ pilot
• Details are being discussed
• See slides from Tadashi for many more details

Timeline
• MiniPilot ready for use by pilot developers (incl. documentation)
  – Presented at Grid2016 conference (D. Drizhuk)
• Testing system currently being designed (implementation being tested, designed by Wen Guan, University of Wisconsin)
  – Presented at Grid2016 conference (D. Drizhuk)
• Functional requirements have been written down for most workflows
• Component model discussion done in July 2016
  – F2F meeting with five of the developers present
• Design phase initiated in July 2016
  – Skeleton Pilot 2.0 components created for first design iteration
  – Component model and architecture document in ~six months (from now)
• Implementation and testing phase for basic functionalities
  – ~Six months (i.e. in ~one year from now)
• Final product for original requirements
  – Including additional implementations and extensions
  – ~Two years

Core Pilot Team
Name               Time allocation                                  Tasks (to be further defined)
Paul Nilsson       25% Pilot 2.0, 75% Pilot 1.0 -> 100% Pilot 2.0   Project leader; general overview and management
Alexey Anisenkov   20-50%                                           Site movers, RunJob* classes
Daniil Drizhuk     Up to 50%                                        MiniPilot, RunJob* classes
Mario Lassnig      25%                                              Site movers (Rucio optimization; ADC Data Management)
Danila Oleynik     >=50%                                            HPC pilot, event service, general pilot workflow
Wen Guan           >=50%                                            Event service, HPC pilot, job manager
Pavlo Svirin       (100% ALICE Pilot)                               HPC for ALICE integration
+ Collaboration with Shantenu Jha (and two of his PostDocs/Rutgers, Radical/SAGA), Shao-Ting Cheng (AMS); there will be others as well

Organization
• Pilot team is spread out over the world, sometimes present at CERN
  – Paul, Wen, Mario, Pavlo stationed at CERN (would be good to have Danila here as well); Alexey and Daniil are in Russia (occasionally at CERN)
  – Vidyo conferences
  – Face-to-face meetings
• Project transparency
  – Report development status frequently (in various meetings)
  – To do/wish list wiki (‘major’ Pilot 1.0 features will also be added here)
  – Meeting minutes [at least after Vidyo conferences]
    • Distributed to PanDA mailing list
    • Archived on Indico pages (PanDA Pilot Development Meetings)
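
The automatic test workflow on the Testing Framework slide (run all test functions, label the pull request “OK” or “FAIL”, attach a failure message on “FAIL”) could look roughly like the sketch below. It is only an illustration under stated assumptions, not the framework designed for Pilot 2.0: the names evaluate_pull_request, build_suite, check_addition and check_broken are hypothetical, the standard Python unittest module stands in for whatever test runner is actually used, and the step that sends the label and message to GitHub through its API is deliberately left out.

```python
import io
import unittest


def evaluate_pull_request(suite):
    """Run every test in `suite`; return ("OK" or "FAIL", list of failure messages)."""
    # Send the runner's own report to an in-memory buffer to keep output quiet.
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    result = runner.run(suite)

    messages = []
    for test, traceback_text in result.failures + result.errors:
        # The last traceback line carries the exception type and message.
        reason = traceback_text.strip().splitlines()[-1]
        messages.append(f"{test.id()}: {reason}")

    label = "OK" if result.wasSuccessful() else "FAIL"
    return label, messages


# Hypothetical checks standing in for the pilot's real test functions.
def check_addition():
    assert 1 + 1 == 2


def check_broken():
    assert 1 + 1 == 3, "arithmetic is broken"


def build_suite(*checks):
    """Wrap plain functions as unittest test cases."""
    return unittest.TestSuite(unittest.FunctionTestCase(c) for c in checks)


if __name__ == "__main__":
    label, messages = evaluate_pull_request(build_suite(check_addition, check_broken))
    print(label)
    for message in messages:
        print(message)
```

In this sketch the label and the messages are returned rather than published, so the part that talks to GitHub can be added separately once the API calls to use have been decided.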