UTA HEP/Computing/0026: McFarmGraph

McFarmGraph: a web-based monitoring tool for McFarm jobs

Sankalp Jain, Aditya Nishandar, Drew Meyer, Jae Yu, Mark Sosebee, Prashant Bhamidipati, Heunu Kim, Karthik Gopalratnam, Vijay Murthi, Parag Mhashilkar

Abstract

McFarmGraph is a web-based graphical user interface used to monitor McFarm jobs on Linux farms. This document is intended to be a comprehensive description of the design of McFarmGraph. For installation and administration, please refer to the McFarmGraph administration guide.

1. INTRODUCTION:

McFarmGraph consists of two parts: the front-end CGI scripts, which display the status of the various jobs in a graphical format, and the back-end daemon, which brings the status files over to the web server from the various farms.

Status Daemon (McFarmGraph_daemon): McFarm periodically (e.g. every four hours) writes a status file on each farm where it is running. This flat file summarizes the status of the jobs running on that farm. The update period is controlled by a parameter that the farmer can set (see the McFarm documentation). The purpose of the daemon is to fetch the status file from each remote farm that is running productions with McFarm. The daemon periodically uses Globus services (gsiftp) to transfer the status files and stores them locally. It then triggers the XML generator, which converts the flat files to XML format. The flowchart (Fig 1a) illustrates the control flow of the McFarmGraph daemon.
Graphical User Interfaces: These are a set of CGI scripts written in Perl and some applets written in Java. The scripts interpret the XML representation of the status files transferred from the remote farms and present the data to the user in a graphical format that can be accessed from the web.
Request Structure: Each McFarm request consists of a set of jobs that are grouped together under a request id of the form ReqXXXX. Individual job names consist of the request id followed by a descriptor string and a number that is unique within the group. For example, “Req6279-zh-zmumu+hbb-03219094626”, “Req6279-zh-zmumu+hbb-03219094710” and “Req6279-zh-zmumu+hbb-03219094921” all belong to the group with request id Req6279. The “%Done” attribute displayed on the web page for a particular request id is the average of the %Done attributes of the individual jobs within that group. The figures displayed in the PieChart represent the percentage of jobs that are in each phase.

2. STATUS DAEMON DESIGN:

Scalability and simplicity are the pivotal concerns that influence the design of the daemon as well as the CGI scripts. Fig 1 shows the directory structure in which the job status files from the remote farms are stored.

[Figure: directory tree rooted at /home/mcfarm/McFarmGraph_New/. It contains the farm directories /SWIFT-HEP, /CSE-HEP, /OU-HEP and /LTU-HEP, the /conf, /log and /tmp directories, and a README file. The figure also shows mcpxx subdirectories (/mcp10, /mcp11, /mcp14), each holding a status file of the same name, together with the files daemon.conf and daemon.log.]

Fig 1. McFarmGraph Directory Structure on hepfm007.uta.edu

The McFarmGraph job information, together with the configuration and log files, is placed in the directory structure shown above. Whenever a new farm is added, the daemon automatically creates a directory corresponding to that farm (e.g. SWIFT-HEP for the Swift farm). The mcpxx subdirectories are created according to the mcp versions present on a particular farm (e.g. CSE-HEP has mcp13 and mcp14, whereas OU-HEP has only mcp14). The status files and their XML representations are stored in these mcpxx directories. Each mcpxx directory will typically contain mcpxx (the flat file), mcpxx_arch (the XML representation of the archived job information) and mcpxx.xml (the XML representation of queued and live jobs).
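The layout above can be sketched as a small path helper. This is a minimal illustration in Python, not the daemon's actual code: the function name status_paths and the dictionary keys are hypothetical, while the base directory and the three file names follow Fig 1 and the naming described in the text.

```python
from pathlib import PurePosixPath

# Base directory as shown in Fig 1; the helper itself is only illustrative.
BASE_DIR = PurePosixPath("/home/mcfarm/McFarmGraph_New")


def status_paths(farm, mcp_version, base_dir=BASE_DIR):
    """Return the three files expected under <base>/<farm>/<mcpxx>/.

    The function name and the dictionary keys are hypothetical; the
    file names follow the mcpxx, mcpxx_arch and mcpxx.xml convention.
    """
    mcp_dir = base_dir / farm / mcp_version
    return {
        # Flat status file pulled over from the farm.
        "flat": mcp_dir / mcp_version,
        # XML representation of the archived job information.
        "archived_xml": mcp_dir / f"{mcp_version}_arch",
        # XML representation of queued and live jobs.
        "live_xml": mcp_dir / f"{mcp_version}.xml",
    }


paths = status_paths("OU-HEP", "mcp14")
for kind, path in paths.items():
    print(kind, path)
```

Keeping the farm name and the mcp version as the only inputs mirrors the way the directories are created: one directory per farm, one subdirectory per mcp version.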
The conf and log directories contain the McFarmGraph_daemon configuration file and log files respectively. Farms are added through the configuration file. The tmp directory is used as scratch space while the daemon is running; it stores the process id of the running daemon as well as some temporary files (e.g. ls.txt). Fig 1a and 1b illustrate the control flow in the McFarmGraph_daemon.

[Figure: daemon flowchart. The steps below are recovered from the figure; the branch layout is approximate.]
1. Start: read the configuration file and branch on the argument (start / stop / check / no args).
2. No args: print usage and exit.
3. check: check if the daemon is up and print “daemon is running” if it is.
4. start: check if the daemon is up. If it is running, print “daemon is running” and exit. If not, fork a child process and separate it from the parent, invoke the main() subroutine, sleep for the specified time (UPDATE_INTERVAL), and invoke main() again once the sleep interval is over.
5. stop: check if the daemon is up. If it is running, read the daemon’s process id, flush the logs, issue a kill with the daemon’s pid, and exit.

Fig 1a. McFarmGraph Daemon Flowchart

[Figure: flowchart of the main subroutine, main().]
1. Redirect the output stream to the log file.
2. Read the configuration file for the farm variables by invoking the initialize() subroutine.
3. Check for configuration errors. If there are errors, print “Configuration Errors” and exit.
4. Create Farm objects corresponding to NUMBER_OF_FARMS.
5. For each farm, call the farm_mkdir() method to create the farm-specific directories if they are not already present.
6. For each farm, call the getFiles() method to retrieve the job status files.
7. When all the files have been retrieved, start the XML generator.

Fig 1b. main() subroutine in the McFarmGraph Daemon

3. XML GENERATOR:

In earlier versions of McFarmGraph, a lot of computation was done while the client (browser) was waiting.
Although this processing was not a bottleneck, its cost would have grown as the flat files grew in size. To avoid this, the bulk of the processing is now done offline, with the results stored in an XML file. Generating a page is now simply a matter of reading the data from the XML file and producing the HTML code.

The XML data is generated by two scripts: a wrapper which, for each file pulled over from the various farms, calls a subroutine (in xmlgen.pm) that generates the XML data for that status file. The flow chart for the subroutine is summarized below.

[Figure: flow chart of the XML generator. The steps below are recovered from the figure; the branch layout is approximate.]
1. START: read the file path of the file to operate on and create the file paths for both XML files.
2. Create a temporary sorted file from the status file.
3. Read a line from the sorted file.
4. EOF? If yes, write the archived job info to the arch XML file, delete the sorted file and EXIT.
5. RequestId changes? If no, accumulate the job info and read the next line. If yes, calculate %Done.
6. %Done = 100? If yes, accumulate the archived job info. If no, write the info to the live jobs XML file.

4. CGI and PERL Scripts

The following scripts generate the various web pages:
1. filter.cgi
2. applet.pm
3. filemani.pm
4. generalpage.pm
5. html.pm
6. jobpage.pm

All of these scripts are written in Perl and are located under /usr/local/apache2/cgi-bin on hepfm000.uta.edu. In addition, the Java applet code is in the file PieChart.java under /usr/public_html/job_status/applet on hepfm000.uta.edu. All the images used in the web pages and the “style.css” file are also under /usr/public_html/job_status on hepfm000.uta.edu.

Functions of various scripts

filter.cgi: All requests from the browser are directed to filter.cgi along with a set of parameters. This script then invokes subroutines in the other files depending on the parameters.
applet.pm: This script generates all the applet-specific HTML code.
filemani.pm: This script consists of a single subroutine whose function is explained below.
generalpage.pm: This script generates the bulk of the Req. Desc page.
jobpage.pm: This script generates the Job Desc page.
html.pm: This script prints most of the HTML code for all scripts.

Generation of Web Pages

McFarmGraph generates most of the pages dynamically using CGI. The only static page is the “index.html” page, which is stored under /usr/public_html/job_status on hepfm000.uta.edu. This page has to be modified when a new farm is added (refer to the installation guide for more details). For the other pages there are three cases:

1. “Farm Request Ids” Page Request

[Figure: Generation of “Farm Request Ids” page. Elements shown: Browser (Job Status index.html page), filter.cgi, html.pm (printHTMLHeader, printHTMLfooter, printHTMLCell, printCellLink) and filemani.pm (readDir). Request parameters: main, farm name. Reply: HTML code of the web page. Legend: request and reply between browser and CGI script, call to functions, function return.]

In this case the parameters passed to filter.cgi include “main” and the farm name. The filter.cgi script calls a function in html.pm to print the header and then calls the readDir function, which reads the directory for the requested farm and creates a link for each mcp version available on that farm.

2. “Farm Request Desc.” Page Request

[Figure: Generation of “Farm Request Desc.” page. Elements shown: Browser (Farm request ids page), filter.cgi, generalpage.pm (generalPage), html.pm (printHTMLHeader, printHTMLCell, printCellLink, printHTMLfooter) and applet.pm (printApplet). Request parameters: genpage, mcp ver., farm name, arch?. Reply: HTML code of the web page. Legend: request and reply between browser and CGI script, call to functions, function return.]

The page generated here contains either all the “live jobs” or all the “archived jobs” on this farm for the requested mcp version, depending on whether the last parameter (arch) is present. The filter.cgi script calls the generalPage subroutine in generalpage.pm, which does the rest of the processing. generalPage calls various subroutines in html.pm and also printApplet in applet.pm, which embeds the applet into the generated HTML code.

The PHASES column in the archived page lists all the phases that the request has gone through.

3. “Job Desc.” Page Request

[Figure: Generation of “Job Desc.” page. Elements shown: Browser (Farm request desc. page), filter.cgi, jobpage.pm (jobPage) and html.pm (printHTMLHeader, printHTMLfooter, printHTMLCell). Request parameters: jobpage, mcp ver., Req desc., farm name, status. Reply: HTML code of the web page. Legend: request and reply between browser and CGI script, call to functions, function return.]

The page generated here lists all the jobs in a particular group, identified by the request id passed by the browser. If a status parameter is present in the request (as is the case when the applet link is clicked), the page lists the details of only those jobs in the group whose current status matches the requested one. For example, it might contain all the jobs whose status is “D0GSTAR”.

5. FUTURE WORK:

The performance of the McFarmGraph tool can certainly be improved. Some possible future work is outlined below. We would like to reiterate that these are only suggestions; no study of their feasibility or likely success has been done.

Exploring alternatives to CGI: As the number of farms being monitored increases, the CGI scripts could become a performance bottleneck. Java servlets might be able to solve this issue.
Expiration of proxies: During the development of McFarmGraph it was observed that the proxies expire, which disables retrieval of the status files from the remote sites. A modification to the daemon could solve this problem.
Java applets load slowly: For every row in the job status page a new applet is loaded and executed on the client-side Java Virtual Machine. Mechanisms for caching the byte code and for sharing a single instance of the applet would reduce the loading time.
McFarmGraph status updates: Currently McFarmGraph pulls complete status files, and the majority of the information they contain has already been pulled over before. It would be more efficient to fetch information only about those requests that either still have jobs running on the farm or have finished since the last update.
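The last suggestion can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a studied design: it assumes that the request id is the part of a job name before the first hyphen, that a request's %Done is the average over its jobs (as described in the introduction), that a request counts as finished when its %Done reaches 100, and that the set of requests already archived at the previous update is available. The function names and the Req0001 and Req0002 job names are hypothetical, and the %Done values are made up for the example.

```python
from collections import defaultdict


def request_id(job_name):
    """Return the request id prefix of a job name, e.g. 'Req6279'."""
    return job_name.split("-", 1)[0]


def percent_done_by_request(jobs):
    """Average the %Done of the jobs in each request.

    jobs is an iterable of (job_name, percent_done) pairs.
    """
    per_request = defaultdict(list)
    for name, done in jobs:
        per_request[request_id(name)].append(done)
    return {req: sum(vals) / len(vals) for req, vals in per_request.items()}


def requests_to_refresh(jobs, archived_before):
    """Select requests that are still running or have newly finished.

    archived_before is the (assumed) set of request ids that were
    already complete at the previous update.
    """
    selected = []
    for req, done in sorted(percent_done_by_request(jobs).items()):
        still_running = done < 100
        newly_finished = done == 100 and req not in archived_before
        if still_running or newly_finished:
            selected.append(req)
    return selected


# Illustrative data only: the %Done values are invented.
jobs = [
    ("Req6279-zh-zmumu+hbb-03219094626", 100),
    ("Req6279-zh-zmumu+hbb-03219094710", 40),
    ("Req6279-zh-zmumu+hbb-03219094921", 0),
    ("Req0001-example-0001", 100),  # hypothetical, finished earlier
    ("Req0002-example-0001", 100),  # hypothetical, finished since last update
]
archived_before = {"Req0001"}
selected = requests_to_refresh(jobs, archived_before)
print(selected)
```

In this sketch Req0001 is skipped because it was already archived at the previous update, while Req6279 (still running) and Req0002 (newly finished) are selected for transfer.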