Grid Computing at the Undergraduate Level: Can We Do It?
Panel

Jens Mache, Lewis & Clark College, Portland, Oregon
Amy Apon, University of Arkansas, Fayetteville
Thomas Feilhauer, University of Applied Sciences, Dornbirn, Austria
Barry Wilkinson (Moderator), University of North Carolina Charlotte

SIGCSE 2008 Technical Symposium on Computer Science Education
Friday, March 14, 2008


Thomas Feilhauer
University of Applied Sciences, Dornbirn, Austria

Grid Computing Course at FHV
Course Web page: http://www2.staff.fh-vorarlberg.ac.at/~tf/grid/
• Senior-level course taught in the last (6th) semester of the computer science bachelor program
• Students have to work in a Linux environment, so they should not be afraid of Linux
• Prerequisites:
  - all students need to have:
    • knowledge of network protocols
    • experience with object-oriented programming
    • good working knowledge of Java
    • basics of client/server programming (Web apps)
    • fundamental knowledge of XML
  - most students have (in addition to the above) knowledge of:
    • RPC/RMI
    • JNDI (naming & directory service)
    • CORBA
    • JavaEE
    • Java Web services (Apache Axis)

How did we proceed?
• Web services
  - standards: WSDL, SOAP
  - tools: Apache Axis
• State in Web services
  - define "resource"
  - standards: WSRF, WS-Addressing, WS-ResourceProperties, WS-ResourceLifetime, WS-Notification
• Frameworks and tools for Grid applications
  - GT4 Java WS core
  - scheduler: Condor
  - database access: OGSA-DAI
  - gLite (EGEE)

Problems faced
• Lots of specifications & standards for underlying technologies
• Lots of (mechanical) steps need to be performed to get a first program running
  - dependencies between the steps
  - lots of different command line tools for code generation & deployment
  - lots of different files to maintain and keep consistent
    • WSDL file, Java files, WSDD file, JNDI deployment file, Ant file
    • dependencies & redundancies make this error prone
• Existing tutorials on GT4
  - explanations often oversimplified
  - students tend to rush through the examples
  - students miss out on understanding code and tool interaction
  - positive experiences with the tutorials on Condor and OGSA-DAI
• Most of the problems are not specific to teaching Grid computing, but apply to developing apps within the Grid environment in general
• Need appropriate IDEs
  - e.g. Introduce, gEclipse
  - under development

Steps in developing a GT4 App.
1. Define service's interface in WSDL (adapt template WSDL file for resources and services)
2. Optionally: Use WSDL2Java to generate framework classes for service implementation
3. Resource implementation
4. Resource Home implementation
5. Service implementation
6. Provide WSDD Deployment Descriptor
7. Provide JNDI deployment file
8. Implement client
9. Adapt Ant build file build.xml
10. Build service using Ant
11. Deploy service
12. Invoke service using the client

Experiences
• Very high motivation among students
  - elective course
  - attractive and relevant topic
• Good mixture of theory and practice
  - lots of examples
  - start with simple examples, extended stepwise to more complex ones
    • singleton resource
    • multiple resources
    • finding a resource by querying resource properties
    • destroying resources
• Concentrate on the relevant parts of the specifications
• Provide templates for configuration and build files
  - available from tutorials, e.g. by Sotomayor
• Use communication (collaboration) or sequence diagrams to explain relationships and message flow between different objects
• Explain and discuss the code step-by-step
• All students passed the (first) exam


Jens Mache
Lewis & Clark College, Portland, Oregon

200/300-level course that covers grid and network programming
  Assignment 1      web concepts (http)
  Assignment 2-4    sockets (Java)
  Assignment 5+6    RMI
  Assignment 7      web services (Apache Axis)
  Assignment 8      grid “math” service
  Assignment 9      grid “sticky note” service
  Mini-project      bigger example, e.g. “File buy”

Steps in the “math” assignment
0. Setting up the environment
1. Defining the interface in WSDL
2. Implementing the service in Java
3. Configuring the deployment in WSDD
4. Build the Math service (create a GAR file)
5. Deploy the Math service
6. Write and compile the client
7. Start the container and execute the client
All of the above steps are mostly done for you!
8. Add functionality to the service

“Math” assignment
[Workflow diagram: Write .wsdl & .java, then Compile & deploy, then Re-start container, then Write client, then Compile & execute]

“Sticky note” assignment
1. Getting Started: Deploy a Service
2. State Management Part I: Create Resources
3. Lifetime Management Part I: Destroy Resources
4. State Management Part II: Add a Resource Property
5. Aggregating Resources: Register with a Local Index
6. Building a VO: Register with a Community Index
7. Lifetime Management Part II: Lease-based Model
8. Notification: Resource as Notification Producer
9. Discovery: Find a Resource

Recommendations
• Cover network programming in Java and RMI
  - introduces important concepts (stub compilation and interfaces versus implementations)
• Cover web services, XML and WSDL.
• Cover the basics of certificates
  - at least step-by-step, and with theoretical background if possible
  - typically, one cannot even start a grid service without certificates
• Do not underestimate the time and effort required to set up the required software.
• A viable alternative to one server shared by all students is installing a stand-alone container on individual student computers.
• Follow a basic grid service exercise with a second, more advanced grid exercise.

Prerequisites
• the client/server paradigm
• XML
• web services
• network security?
• network programming in Java
Unlike the prerequisites for cluster computing: algorithms, message passing in C or Fortran


Amy Apon
University of Arkansas, Fayetteville

University of Arkansas: Teaching Grid Computing to Beginning Programmers
• Our beginning programming class is taken by both computer science majors and advanced students in science and engineering courses
• Course is taught in C and includes a weekly lab
• We wanted to introduce grid computing as a research tool to the students in this class
• This meant teaching grid computing to freshmen computer science students

The Ultimate Target Grid Platform: GPN Grid
• GPNGrid was developed as a virtual organization within the Open Science Grid
• Open Science Grid uses Condor for workload management

The Actual Student Platform: a Condor pool on our local cluster
• We configured Condor on a small cluster of about 30 computers, with a single submit node that the students logged in to
• Condor is based on the idea of a ClassAd

    universe   = vanilla
    executable = fire
    arguments  = $(PROCESS)
    output     = fire_$(PROCESS).out
    error      = fire_$(PROCESS).error
    log        = fire.log
    queue 5

First Attempt: Fall 2005
• First, a one-hour lecture was given on Condor concepts, including how to write a ClassAd
• Then, Condor was used by the students in one hour of the last lab of the semester
• Students were given substantial code for an application they could run in Condor: the Game of Life
• Students completed the implementation

First Attempt: Fall 2005 (continued)
• Then, a scientific question was posed: “Given a set of input configuration files, which of these will still have living cells after 20 generations of the simulation?”
• Answering the question required running the program many times, once per input configuration: a great application for grid computing!
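
The Game of Life question above can be made concrete with a small sketch. It is written in Python for brevity (the course itself used C) and is not the code handed to the students. It assumes Conway's standard rules on a bounded grid where cells outside the grid count as dead, and a hypothetical text format in which '#' marks a living cell; the slides specify none of these details.

```python
from typing import Dict, List

Grid = List[List[int]]  # 1 = living cell, 0 = dead cell


def step(grid: Grid) -> Grid:
    """Compute the next generation (assumed: Conway's rules, bounded grid)."""
    rows, cols = len(grid), len(grid[0])
    nxt = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count living neighbors among the up-to-8 surrounding cells.
            neighbors = sum(
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            )
            if grid[r][c] == 1 and neighbors in (2, 3):
                nxt[r][c] = 1  # survival
            elif grid[r][c] == 0 and neighbors == 3:
                nxt[r][c] = 1  # birth
    return nxt


def alive_after(grid: Grid, generations: int = 20) -> bool:
    """Answer the posed question for one input configuration."""
    for _ in range(generations):
        grid = step(grid)
    return any(any(row) for row in grid)


def parse(text: str) -> Grid:
    """Parse the hypothetical format: '#' is alive, any other character is dead."""
    return [[1 if ch == "#" else 0 for ch in line] for line in text.split()]


# Three hypothetical input configurations standing in for the input files.
CONFIGS: Dict[str, Grid] = {
    "block": parse("....\n.##.\n.##.\n...."),
    "blinker": parse(".....\n..#..\n..#..\n..#..\n....."),
    "lonely": parse("...\n.#.\n..."),
}

# One run per configuration; on the grid, each run would be its own job.
results = {name: alive_after(grid) for name, grid in CONFIGS.items()}
```

Here a loop stands in for the Condor pool. In the lab setting, each input configuration would be submitted as a separate job, which is what makes the exercise a natural fit for batch submission.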

First Attempt: Mostly failure
• Several concepts were more difficult than we expected:
  - The batch submission process
  - Using the computer to solve a scientific problem
  - Understanding the distributed nature of the application
• A failure of the submit machine caused a lot of frustration and many students did not complete the exercise!

Second Attempt: Spring 2006
• A new application, a fire simulation, was developed that did not require input files
• Again, a scientific question was posed: “What percentage of the forest will burn with a given probability of a neighbor tree catching on fire?” [http://www.shodor.org]
• Students were asked to use the grid to run the application many times and graph the results

Second Attempt: Only partial success
• Again, the results were not completely satisfactory
• Students could perform the mechanics of submitting a Condor application, and use Excel to graph the results
• They still did not seem to understand the distributed nature of the application
• Grid computing seemed to get in the way of understanding the science

Third Attempt: In two parts
• Fall 2006: We had students do a homework assignment to learn the computational science concepts only: write a program to calculate the heat distribution in a room
• This was the last homework assignment of the semester
• Spring 2007: In a special studies course, build on the computational concepts
• Several assignments were given:
  - The use of Unix tools such as cat, sort, and gnuplot
  - Complete the fire simulation from Spring 2006
  - Study Condor and ClassAds
  - Finally, pose a scientific question: “What percentage of the forest will burn with a given probability of a neighbor tree catching on fire?”

Third Attempt: Success
• Use Condor to run over 10,000 simulations, graph the results

University of Arkansas Conclusions
• Grid computing can be taught to beginning students, but not in the first semester
• The infrastructure must be absolutely flawless for this to succeed
• Prerequisites to teaching Grid computing include:
  - Background in computational concepts and the idea of using the computer to answer a scientific question
  - The concept of batch submission
  - Basic use of command line Unix tools if command line tools are used, or a portal
• Grid computing can be useful to undergraduate science and engineering majors
• Curriculum at this level needs to focus on running applications, accessing data, and synthesizing results from the grid computation


Barry Wilkinson (Moderator)
University of North Carolina Charlotte

North Carolina State-wide undergraduate course
• Taught jointly: UNC-Charlotte and UNC Wilmington.
• First taught 2004. Again in 2005 and 2007.
• Uses North Carolina’s televideo network NCREN, which connects universities and colleges across the state.
• Distributed computing resources at several universities form the Grid computing platform.
• 14 universities and colleges participated in total.

Participating Sites
[Map of North Carolina showing the participating sites; map © World Sites Atlas (sitesatlas.com)]
• Appalachian State University
• UNC Asheville
• Western Carolina University
• NC Central University
• UNC Greensboro
• Lenoir Rhyne College
• NC State University
• Winston-Salem State University
• Wake Tech. Comm. College
• Elon University
• UNC Chapel Hill
• UNC Charlotte
• UNC Pembroke
• UNC Wilmington

Undergraduate Grid computing courses
• Often take bottom-up approach
  - Starting with client-server concepts, creating Web and Grid services, and then progressing through underlying Globus middleware, security mechanisms, and job submission, all using a Linux command-line interface.
• Need to raise level to top-down approach
  - Introduce students to production Grid tools such as portals, application portlets, workflow tools, and how to Grid-enable applications.

Grid Computing platform
• A Grid computing platform is needed to teach Grid computing in a realistic setting
• Problems arise when many students try to do Grid computing assignments on a Grid or centralized server.

Aspects of new North Carolina Grid Course
• Now starts with a GridSphere Grid portal to access resources.
• Moves to command line assignments later.
• Leads to an assignment for developing portlets within the Grid portal.
• Students use their own computers for some assignments.
• Student final projects

Programming Assignments (Spring 2007)
  Assignment 1    Using grid computing portal
  Assignment 2    Using the grid through a command line
  Assignment 3    Using a scheduler (Condor-G)
  Assignment 4    Installing GT4 core. Creating, deploying, and testing a GT4 Grid service
  Assignment 5    Installing and using GridNexus workflow editor to create and execute workflows
  Assignment 6    Installing GridSphere and implementing a portlet within the GridSphere portal
  Assignment 7    MPI assignment on grid
  Mini-project    Developing grid computing assignment
Assignments 4, 5, and 6 require students to install significant software packages on their computer.

Avoiding problems
• It requires immense work to prepare for a hands-on distributed Grid computing course.
• It is critical that all assignments are fully tested prior to the start of class, and that all computer systems are reliable and software is maintained.
• Assignments went much more smoothly when students were required to use personal computers where possible.