Pilot 2.0, US ATLAS Planning Meeting 2016 - Indico

Paul Nilsson
PILOT 2.0
Introduction
• New PanDA Pilot Project launched in April 2016
  – Project to span the next few years
  – Presented at PanDA workshop in April, WFMS review in May, TIM in June (plus other presentations in ADC and Pilot Dev meetings) [For more info..]
  – Development and support of the old pilot (“Pilot 1.0”) will continue, as it remains the production pilot until the new pilot (“Pilot 2.0”) is ready
• Motivation
  – Some of the Pilot 1.0 code base is getting a bit too old and is difficult to maintain
  – Refactoring is a slow process that has already been going on for years and does not always have highest priority
  – More manpower made available to alleviate a steady increase of feature requests
  – New features/workflows are often challenging to implement/support
    • E.g. experiment as a plug-in, glExec, failovers, object stores, event service, HPCs, ..
• “Complete” rewrite
  – Keeping some recent new developments (not cut-and-paste)
  – Getting rid of all legacy code and outdated mechanisms
  – Rethink of basic pilot flow
Documentation and Code Repository
• Goal: Pilot 2.0 should be “completely” documented
  – Auto-documentation of code
    • Documentation tools to be evaluated (e.g. sphinx, pydoc, doxygen)
  – General overview, major workflows and algorithms on wikis
    • General Pilot 2.0 wiki: https://twiki.cern.ch/twiki/bin/view/PanDA/Pilot2 (in progress)
    • Site movers wiki: https://twiki.cern.ch/twiki/bin/view/PanDA/SiteMovers (also covering Pilot 1.0)
• Pilot 2.0 repository already located in GitHub
  – Dev branch: https://github.com/PanDAWMS/pilot-2.0
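Auto-documentation depends on docstrings being present in the code. As a minimal sketch, assuming a hypothetical function `stage_in` that is not taken from the Pilot 2.0 code base, a docstring written in the reStructuredText field style could look like this:

```python
import inspect


def stage_in(lfn, destination):
    """Copy an input file to a local directory (hypothetical example).

    :param lfn: logical file name of the input file.
    :param destination: local directory that receives the copy.
    :return: the local path of the copied file.
    """
    # Placeholder body: a real implementation would call a site mover here.
    return "%s/%s" % (destination.rstrip("/"), lfn)


# inspect.getdoc returns the cleaned docstring, i.e. the text that
# documentation generators work from.
summary = inspect.getdoc(stage_in).splitlines()[0]
```

The first docstring line doubles as the one-line summary, so it is worth keeping short and self-contained.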
Testing Framework
• A pull request into Pilot GitHub will trigger a series of tests that need to succeed before release can be approved
• Use pattern validation to..
  – Avoid typing errors
  – Keep code clean
  – Follow coding convention
• Unit tests are added to pilot for new functionalities
• Automatic test workflow
  – Full test of all test functions
  – Pull requests are labelled “OK” or “FAIL” automatically with API
  – For “FAIL”, add failure message to GitHub with API
• Manager merges successfully tested pull requests
• Dev and RC pilot will automatically be created (by cron job - currently manually created)
MiniPilot
• A minimal pilot has been developed by Daniil Drizhuk (Kurchatov Inst.)
  – First Pilot 2.0 code to have been developed
  – To be used by the developers primarily during the initial development and testing stage
  – For module and component testing
  – Can eventually result in a SimplePilot for external use/starting point for new PanDA users
• Documentation/instructions in GitHub
  – https://github.com/PanDAWMS/pilot2.0/tree/dev/lib/minipilot
• Easy to use by design
• Code is clean and simple
• Using proper/standard [python] logging
• Following coding conventions
  – Enforced by testing framework
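As an illustration of what standard Python logging means in practice, here is a minimal sketch. The logger name `minipilot.example` and the message format are assumptions and are not taken from the MiniPilot code.

```python
import logging


def configure_logging(level=logging.INFO):
    """Configure the root logger once, at pilot start-up."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s | %(levelname)-8s | %(name)s | %(message)s",
    )


# Each module asks for its own named logger instead of using print().
logger = logging.getLogger("minipilot.example")  # hypothetical logger name

configure_logging()
logger.info("pilot started")
logger.debug("only shown when the level is set to DEBUG")
```

Named loggers let a developer raise or lower verbosity per component without touching the code that emits the messages.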
Site Movers
• A new site mover architecture in development since summer of 2015
  – Implemented in “Pilot 1.0” but will largely survive in Pilot 2.0
  – Major development effort and a great simplification (using new AGIS functionality leading to removal of several complex schedconfig fields [to be removed after site migration])
  – Essentially ready, major copy commands supported (rucio, xrdcp, lcg-cp, lsm, dccp)
    • Standing by with GFAL2 and OS site movers for non-Rucio-using experiments
  – Some functionality not implemented yet (e.g. alt. stage-out to nucleus site) or tested (object store transfers, to be handled by rucio)
    • Currently being discussed
• Migration plan for sites to use new site movers is in progress
  – Validation of DDM endpoints for support of particular protocols
    • Most endpoints already tested and verified with stand-alone script
  – Validation of content of information system
  – Validation of mover implementation
  – Corner case study
    • Validation of opportunistic resources and specific storage implementations
  – Post-migration clean-up
    • AGIS and schedconfig cleanup and optimization, retirement of outdated pilot code
  – Many site admins have already been contacted
  – Site migration responsible: Danila Oleynik (UTA)
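One way to picture a plug-in style site mover layer is a registry keyed by copy tool name, with a fallback to the next preferred tool. The sketch below is purely illustrative: the class names, the `is_available` check and the `select_mover` function are assumptions and do not reflect the actual site mover implementation.

```python
class SiteMover:
    """Base class for the hypothetical movers in this sketch."""

    name = "base"

    def is_available(self, installed_tools):
        # A mover is usable only if its copy tool is present on the worker node.
        return self.name in installed_tools


class RucioMover(SiteMover):
    name = "rucio"


class XrdcpMover(SiteMover):
    name = "xrdcp"


class LsmMover(SiteMover):
    name = "lsm"


# Registry keyed by copy tool name.
MOVERS = {cls.name: cls for cls in (RucioMover, XrdcpMover, LsmMover)}


def select_mover(preferred, installed_tools):
    """Return the first mover in `preferred` that is registered and available."""
    for name in preferred:
        mover_cls = MOVERS.get(name)
        if mover_cls is not None and mover_cls().is_available(installed_tools):
            return mover_cls()
    raise LookupError("no usable site mover among: %s" % ", ".join(preferred))


# rucio is preferred but not installed here, so the selection falls back to xrdcp.
mover = select_mover(["rucio", "xrdcp"], installed_tools={"xrdcp", "lsm"})
```

With this layout, supporting another copy tool means adding one class to the registry, with no change to the selection logic.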
Component Model
• Recent F2F discussion based on Pilot 1.0 workflows resulted in first draft of Pilot 2.0 component model
• Comparisons with other models to be done
  – E.g. Radical Pilot (S. Jha et al)
• First version of Pilot 2.0 code structure based on component model already created and available in GitHub
  – For initial testing of workflows, testing system
  – Currently only skeleton code
• Component Model and Architecture document to be written
Harvester Integration
• Harvester can simplify some of the workflows in the pilot
  – Logic can be moved away from pilot to Harvester
    • Will simplify pilot code
  – Harvester can decide which pilot modules to use
    • Especially useful on HPCs where only a reduced set of pilot components are needed
  – E.g. multi-jobs on HPCs; getJob functionality not needed
• The modular approach of Pilot 2.0 will facilitate the integration with Harvester
  – Note: Pilot 1.0 is also modular and Harvester will initially launch components (HPC code) from ‘old’ pilot
• Details are being discussed
• See slides from Tadashi for many more details
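The idea that an external service decides which pilot modules run can be sketched as below. All component names and the `run_pilot` function are hypothetical, since the Pilot 2.0 component model was still a draft and the Harvester interface was under discussion.

```python
# Hypothetical components, each reduced to a function returning a status string.
COMPONENTS = {
    "get_job": lambda: "job fetched from server",
    "stage_in": lambda: "input staged in",
    "payload": lambda: "payload executed",
    "stage_out": lambda: "output staged out",
    "report": lambda: "status reported",
}

# Canonical execution order of the components.
ORDER = ["get_job", "stage_in", "payload", "stage_out", "report"]


def run_pilot(selected):
    """Run only the selected components, always in the canonical order."""
    unknown = set(selected) - set(COMPONENTS)
    if unknown:
        raise ValueError("unknown components: %s" % sorted(unknown))
    return [COMPONENTS[name]() for name in ORDER if name in selected]


# Assumed HPC case: jobs are handed over by the launcher, so get_job is omitted.
hpc_steps = run_pilot(["payload", "report"])
```

Because the caller passes in the selection, the same pilot code can serve both a full grid workflow and a reduced HPC workflow.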
Timeline
• MiniPilot ready for use by pilot developers (incl. documentation)
  – Presented at Grid 2016 conference (D. Drizhuk)
• Testing system currently being designed (implementation being tested, designed by Wen Guan, University of Wisconsin)
  – Presented at Grid 2016 conference (D. Drizhuk)
• Functional requirements have been written down for most workflows
• Component model discussion done in July 2016
  – F2F meeting with five of the developers present
• Design phase initiated in July 2016
  – Skeleton Pilot 2.0 components created for first design iteration
  – Component model and architecture document in ~six months (from now)
• Implementation and testing phase for basic functionalities
  – ~Six months (i.e. in ~one year from now)
• Final product for original requirements
  – Including additional implementations and extensions
  – ~Two years
Core Pilot Team
Name             | Time allocation                                 | Tasks (to be further defined)
Paul Nilsson     | 25% Pilot 2.0, 75% Pilot 1.0 -> 100% Pilot 2.0  | Project leader; general overview and management
Alexey Anisenkov | 20-50%                                          | Site movers, RunJob* classes
Daniil Drizhuk   | Up to 50%                                       | MiniPilot, RunJob* classes
Mario Lassnig    | 25%                                             | Site movers (Rucio optimization; ADC Data Management)
Danila Oleynik   | >=50%                                           | HPC pilot, event service, general pilot workflow
Wen Guan         | >=50%                                           | Event service, HPC pilot, job manager
Pavlo Svirin     | (100% ALICE Pilot)                              | HPC for ALICE integration
+ Collaboration with Shantenu Jha (and two of his PostDocs/Rutgers, Radical/SAGA), Shao-Ting Cheng (AMS), and there will be others as well
Organization
• Pilot team is spread out over the world, sometimes present at CERN
  – Paul, Wen, Mario and Pavlo are stationed at CERN; it would be good to have Danila here as well. Alexey and Daniil are in Russia (occasionally at CERN)
  – Vidyo conferences
  – Face-to-face meetings
• Project transparency
  – Report development status frequently (in various meetings)
  – To-do/wish list wiki (‘major’ Pilot 1.0 features will also be added here)
  – Meeting minutes [at least after Vidyo conferences]
    • Distributed to PanDA mailing list
    • Archived on Indico pages (PanDA Pilot Development Meetings)