SIGCSE 2008 panel presentation, March 14, 2008

Grid Computing at the Undergraduate Level:
Can We Do It?
Panel
Jens Mache, Lewis & Clark College, Portland, Oregon
Amy Apon, University of Arkansas, Fayetteville
Thomas Feilhauer, University of Applied Sciences, Dornbirn, Austria
Barry Wilkinson, University of North Carolina Charlotte (Moderator)
SIGCSE 2008
Technical Symposium on Computer Science Education
Friday, March 14, 2008
1
Thomas Feilhauer
University of Applied Sciences
Dornbirn, Austria
2
Grid Computing Course at FHV
Course Web page: http://www2.staff.fh-vorarlberg.ac.at/~tf/grid/
Senior-level course taught in the last (6th) semester of the computer science bachelor program
The students have to work in a Linux environment, so they shouldn't be afraid of Linux
Prerequisites:
– all students need to have:
  • knowledge of network protocols
  • experience with object-oriented programming
  • good working knowledge of Java
  • basics of client/server programming (Web apps)
  • fundamental knowledge of XML
– most students have (in addition to the above) knowledge of:
  • RPC/RMI
  • JNDI (naming & directory service)
  • CORBA
  • JavaEE
  • Java Web services (Apache Axis)
3
How did we proceed?
Web services
– standards: WSDL, SOAP
– tools: Apache Axis
State in Web services
– define "resource"
– standards: WSRF, WS-Addressing, WS-ResourceProperties, WS-ResourceLifetime, WS-Notification
Frameworks and tools for Grid applications
– GT4 Java WS Core
– scheduler: Condor
– database access: OGSA-DAI
– gLite (EGEE)
4
Problems faced
Lots of specifications & standards for underlying technologies
Lots of (mechanical) steps need to be performed to get the first program running
– dependencies between the steps
– lots of different command line tools for code generation & deployment
– lots of different files to maintain and keep consistent
  • WSDL file, Java files, WSDD file, JNDI deployment file, Ant file
  • dependencies & redundancies → error prone
Existing tutorials on GT4
– explanations often oversimplified
– students tend to rush through the examples
– students miss out on understanding code and tool interaction
– positive experiences with the tutorials on Condor and OGSA-DAI
Need appropriate IDEs
– e.g. Introduce, gEclipse
– under development
Most of the problems are not specific to teaching Grid computing; they apply to developing apps within the Grid environment in general
5
Steps in developing a GT4 App.
1. Define service's interface in WSDL
   – adapt template WSDL file for resources and services
2. Optionally: Use WSDL2Java to generate framework classes for service implementation
3. Resource implementation
4. Resource Home implementation
5. Service implementation
6. Provide WSDD Deployment Descriptor
7. Provide JNDI deployment file
8. Implement client
9. Adapt Ant build file build.xml
10. Build service using Ant
11. Deploy service
12. Invoke service using the client
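The split between resource, resource home, and service in steps 3 to 5 can be illustrated outside GT4. The following is a minimal Python sketch of the pattern only, not the GT4 Java WS Core API: the names `MathResource`, `ResourceHome`, and `MathService` are hypothetical, and the WSDL, WSDD, and JNDI files from the other steps are left out entirely.

```python
import itertools


class MathResource:
    """Holds the state (step 3): one value per resource instance."""

    def __init__(self, key):
        self.key = key
        self.value = 0


class ResourceHome:
    """Creates, finds, queries, and destroys resources by key (step 4)."""

    def __init__(self):
        self._resources = {}
        self._next_key = itertools.count(1)

    def create(self):
        key = next(self._next_key)
        self._resources[key] = MathResource(key)
        return key

    def find(self, key):
        return self._resources[key]  # raises KeyError for unknown keys

    def destroy(self, key):
        del self._resources[key]

    def query(self, predicate):
        """Return the keys of resources whose state satisfies the predicate."""
        return [k for k, r in self._resources.items() if predicate(r)]


class MathService:
    """Stateless operations (step 5): every call names the resource it acts on."""

    def __init__(self, home):
        self.home = home

    def add(self, key, amount):
        resource = self.home.find(key)
        resource.value += amount
        return resource.value


home = ResourceHome()
service = MathService(home)
first = home.create()
second = home.create()
service.add(first, 5)
service.add(second, 7)
```

The service object keeps no state of its own, which is why a client (step 8) has to pass a resource key with every call.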
6
Experiences
Very high motivation among students
– elective course
– attractive and relevant topic
Good mixture of theory and practice
– lots of examples
– start with simple examples, stepwise extended to more complex ones
  • singleton resource
  • multiple resources
  • finding a resource by querying resource properties
  • destroying resources
Concentrate on the relevant parts of the specifications
Provide templates for configuration and build files
– available from tutorials, e.g. by Sotomayor
Use communication (collaboration) or sequence diagrams to explain relationships and message flow between different objects
Explain and discuss the code step-by-step
All students passed the (first) exam

7
Jens Mache
Lewis & Clark College
Portland, Oregon
8
200/300-level course that covers grid and network programming
Assignment 1      web concepts (http)
Assignment 2-4    sockets (Java)
Assignment 5+6    RMI
Assignment 7      web services (Apache Axis)
Assignment 8      grid “math” service
Assignment 9      grid “sticky note” service
Mini-project      bigger example, e.g. “File buy”
9
Steps in the “math” assignment
0. Setting up the environment
1. Defining the interface in WSDL
2. Implementing the service in Java
3. Configuring the deployment in WSDD
4. Build the Math service (Create a GAR file)
5. Deploy the Math service
6. Write and compile the client
7. Start the container and execute the client
All of the above steps are mostly done for you!
8. Add functionality to the service
10
“Math” assignment
Write .wsdl & .java
Compile & deploy
Re-start container
Write client
Compile & execute
11
“Sticky note” assignment
1. Getting Started: Deploy a Service
2. State Management Part I: Create Resources
3. Lifetime Management Part I: Destroy Resources
4. State Management Part II: Add a Resource Property
5. Aggregating Resources: Register with a Local Index
6. Building a VO: Register with a Community Index
7. Lifetime Management Part II: Lease-based Model
8. Notification: Resource as Notification Producer
9. Discovery: Find a Resource
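The lease-based model in item 7 differs from the explicit destroy in item 3: a resource disappears on its own unless its lease is renewed in time. The following is a minimal Python sketch of that idea only; it is not the WS-ResourceLifetime or GT4 API. `LeasedResource` and `sweep` are hypothetical names, and times are plain numbers passed in by the caller so the example stays deterministic.

```python
class LeasedResource:
    """A resource that expires unless its lease is renewed."""

    def __init__(self, name, now, lease_seconds):
        self.name = name
        self.termination_time = now + lease_seconds

    def renew(self, now, lease_seconds):
        # Renewing replaces the termination time with a new one.
        self.termination_time = now + lease_seconds


def sweep(resources, now):
    """Return only the resources whose lease has not yet expired."""
    return [r for r in resources if r.termination_time > now]


notes = [
    LeasedResource("note-a", now=0, lease_seconds=60),
    LeasedResource("note-b", now=0, lease_seconds=60),
]
notes[0].renew(now=50, lease_seconds=60)  # note-a now expires at t=110
alive = [r.name for r in sweep(notes, now=90)]
print(alive)  # ['note-a']
```

A client that crashes simply stops renewing, so its resources are cleaned up without anyone calling destroy.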
12
Recommendations
Cover network programming in Java and RMI
– introduces important concepts (stub compilation and interfaces versus implementations)
Cover web services, XML and WSDL.
Cover the basics of certificates
– at least step-by-step, and with theoretical background if possible
– typically, one cannot even start a grid service without certificates
Do not underestimate the time and effort required to set up the required software.
– A viable alternative to one server shared by all students is installing a stand-alone container on individual student computers.
Follow a basic grid service exercise with a second, more advanced grid exercise.
13
Prerequisites
– the client/server paradigm
– XML
– web services
– network security ?
– network programming in Java
Unlike the prerequisites for cluster computing
– algorithms, message passing in C or Fortran

14
Amy Apon
University of Arkansas
Fayetteville
15
University of Arkansas: Teaching Grid Computing to Beginning Programmers
Our beginning programming class is taken by both computer science majors and advanced students in science and engineering courses
Course is taught in C and includes a weekly lab
We wanted to introduce grid computing as a research tool to the students in this class
This meant teaching grid computing to freshmen computer science students
16
The Ultimate Target Grid Platform: GPN Grid
GPNGrid was developed as a virtual organization within the Open Science Grid
Open Science Grid uses Condor for workload management
17
The Actual Student Platform – a Condor pool on our local cluster
We configured Condor on a small cluster of about 30 computers, with a single submit node that the students logged in to
Condor is based on the idea of a ClassAd
universe = vanilla
executable = fire
arguments = $(PROCESS)
output = fire_$(PROCESS).out
error = fire_$(PROCESS).error
log = fire.log
queue 5
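In the submit description above, `queue 5` asks Condor for five jobs, and `$(PROCESS)` is replaced by each job's index, 0 through 4. The following Python sketch only mimics that substitution to show which file names result; it does not call Condor, and `expand_submit` is a hypothetical helper written for this illustration.

```python
def expand_submit(template, count):
    """Expand $(PROCESS) in each submit-file value for jobs 0..count-1."""
    jobs = []
    for process in range(count):
        jobs.append({key: value.replace("$(PROCESS)", str(process))
                     for key, value in template.items()})
    return jobs


# The same keys and values as the submit description shown above.
template = {
    "executable": "fire",
    "arguments": "$(PROCESS)",
    "output": "fire_$(PROCESS).out",
    "error": "fire_$(PROCESS).error",
    "log": "fire.log",
}

jobs = expand_submit(template, 5)
for job in jobs:
    print(job["arguments"], job["output"], job["error"])
```

Each job gets its own output and error file, while all five share the single log file `fire.log`.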
18
First Attempt: Fall 2005
First, a one hour lecture was given on Condor concepts, including how to write a ClassAd
Then, Condor was used by the students in one hour of the last lab of the semester
Students were given substantial code for an application they could run in Condor: the Game of Life
Students completed the implementation
19
First Attempt: Fall 2005
Then, a scientific question was posed:
“Given a set of input configuration files, which of these will still have living cells after 20 generations of the simulation?”
Answering the question required running the program a lot of times – a great application for grid computing!
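The question reduces to one independent check per input configuration, which is why it suits a batch system. The following is a minimal Python sketch of that check, assuming the standard Game of Life rules on an unbounded board; the course version was written in C and may have used a fixed-size board, and the names `step` and `alive_after` are hypothetical.

```python
from collections import Counter


def step(live):
    """Advance one generation; `live` is a set of (row, col) cells."""
    neighbour_counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # A cell lives next generation with exactly 3 neighbours,
    # or with 2 neighbours if it is already alive.
    return {cell for cell, n in neighbour_counts.items()
            if n == 3 or (n == 2 and cell in live)}


def alive_after(live, generations=20):
    """Report whether any cell is still alive after the given generations."""
    for _ in range(generations):
        live = step(live)
    return len(live) > 0


blinker = {(1, 0), (1, 1), (1, 2)}  # oscillates forever
lonely = {(0, 0)}                   # dies in the first generation
print(alive_after(blinker), alive_after(lonely))  # True False
```

In a Condor run, each job would perform this check for one configuration file.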
20
First Attempt: Mostly failure
Several concepts were more difficult than we expected:
– The batch submission process
– Using the computer to solve a scientific problem
– Understanding the distributed nature of the application – a failure of the submit machine caused a lot of frustration and many students did not complete the exercise!
21
Second Attempt: Spring 2006

A new application, a fire simulation, was developed that did not require input files
22
Second Attempt: Spring 2006

Again, a scientific question was posed:
“What percentage of the forest will burn with a given probability of a neighbor tree catching on fire?” [http://www.shodor.org]
Students were asked to use the grid to run the application many times and graph the results
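Because each run is random, one run says little and many runs per probability are needed. The following is a minimal Python sketch of one such run under an assumed model: a fully forested square grid, fire starting at the centre, and each burning tree igniting each unburned neighbour (up, down, left, right) with the given probability. The actual course application may differ, and `burned_fraction` is a hypothetical name.

```python
import random


def burned_fraction(size, prob, rng):
    """Burn a size x size forest from the centre; return the fraction burned."""
    burned = [[False] * size for _ in range(size)]
    start = (size // 2, size // 2)
    burned[start[0]][start[1]] = True
    frontier = [start]
    while frontier:
        r, c = frontier.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < size and 0 <= nc < size
            if inside and not burned[nr][nc] and rng.random() < prob:
                burned[nr][nc] = True
                frontier.append((nr, nc))
    return sum(map(sum, burned)) / (size * size)


rng = random.Random(2006)  # fixed seed so runs are repeatable
for prob in (0.0, 0.5, 1.0):
    print(prob, burned_fraction(21, prob, rng))
```

With probability 0 only the starting tree burns, and with probability 1 the whole forest burns; the interesting behaviour lies in between.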
23
Second Attempt: Only partial success
Again, the results were not completely satisfactory
Students could perform the mechanics of submitting a Condor application, and use Excel to graph the results
They still did not seem to understand the distributed nature of the application
Grid computing seemed to get in the way of understanding the science
24
Third Attempt: In two parts
Fall 2006: We had students do a homework assignment to learn the computational science concepts only – write a program to calculate the heat distribution in a room
This was the last homework assignment of the semester
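The slide does not say how the heat distribution was computed. One common approach, shown here as a minimal Python sketch under that assumption, is Jacobi iteration: the wall temperatures stay fixed and each interior cell is repeatedly replaced by the mean of its four neighbours. The room size, temperatures, and the name `relax` are hypothetical.

```python
def relax(grid, sweeps):
    """Jacobi iteration: each interior cell becomes the mean of its 4 neighbours."""
    rows, cols = len(grid), len(grid[0])
    for _ in range(sweeps):
        new = [row[:] for row in grid]  # boundary cells are copied unchanged
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                new[r][c] = (grid[r - 1][c] + grid[r + 1][c]
                             + grid[r][c - 1] + grid[r][c + 1]) / 4.0
        grid = new
    return grid


# Hypothetical room: 6 x 6 cells, walls at 20 degrees, one hot wall at 100.
room = [[20.0] * 6 for _ in range(6)]
room[0] = [100.0] * 6
result = relax(room, 200)
print(round(result[1][2], 1), round(result[4][2], 1))
```

Cells near the hot wall end up warmer than cells near the opposite wall, which gives students a result they can check by intuition.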
25
Third Attempt: In two parts
Spring 2007: In a special studies course, build on the computational concepts
Several assignments were given:
– The use of Unix tools such as cat, sort, and gnuplot
– Complete the fire simulation from Spring 2006
– Study Condor and ClassAds
– Finally, pose a scientific question: “What percentage of the forest will burn with a given probability of a neighbor tree catching on fire?”
26
Third Attempt: Success

Use Condor to run over 10,000 simulations and graph the results
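Graphing thousands of runs first requires reducing them to one point per probability. The following is a minimal Python sketch of that reduction. It assumes, hypothetically, that each job wrote one line of the form `probability percent_burned`; the real output format is not given on the slides, and `mean_burn_by_probability` is an illustrative name.

```python
from collections import defaultdict


def mean_burn_by_probability(lines):
    """Average the burned percentage per probability from 'prob percent' lines."""
    totals = defaultdict(lambda: [0.0, 0])  # prob -> [sum of percents, run count]
    for line in lines:
        prob, percent = map(float, line.split())
        totals[prob][0] += percent
        totals[prob][1] += 1
    return {prob: total / count
            for prob, (total, count) in sorted(totals.items())}


# Hypothetical lines collected from the per-job output files.
sample = ["0.3 4.0", "0.3 6.0", "0.6 70.0", "0.6 90.0"]
for prob, mean in mean_burn_by_probability(sample).items():
    print(prob, mean)
```

The resulting pairs can be written to a file and plotted with gnuplot or Excel, the tools named in the earlier assignments.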
27
University of Arkansas Conclusions
Grid computing can be taught to beginning students, but not in the first semester
The infrastructure must be absolutely flawless for this to succeed
28
University of Arkansas Conclusions
Prerequisites to teaching Grid computing include:
– Background in computational concepts and the idea of using the computer to answer a scientific question
– The concept of batch submission
– Basic use of command line Unix tools if command line tools are used, or a portal
29
University of Arkansas Conclusions
Grid computing can be useful to undergraduate science and engineering majors
Curriculum at this level needs to focus on running applications, accessing data, and synthesizing results from the grid computation
30
Barry Wilkinson
University of North Carolina Charlotte
(Moderator)
31
North Carolina State-wide undergraduate course
Taught jointly: UNC-Charlotte and UNC Wilmington.
First taught 2004. Again in 2005 and 2007.
Uses North Carolina’s televideo network NCREN, which connects universities and colleges across the state.
Distributed computing resources at several universities form the Grid computing platform.
14 universities and colleges participated in total.
32
Participating Sites
[Map of North Carolina showing the participating sites; map © World Sites Atlas (sitesatlas.com)]
Appalachian State University
UNC Asheville
Western Carolina University
NC Central University
UNC Greensboro
Lenoir Rhyne College
NC State University
Winston-Salem State University
Wake Tech. Comm. College
Elon University
UNC Chapel Hill
UNC Charlotte
UNC Pembroke
UNC Wilmington
33
Undergraduate Grid computing courses
Often take bottom-up approach
– Starting with client-server concepts, creating Web and Grid services, and then progressing through underlying Globus middleware, security mechanisms, and job submission, all using a Linux command-line interface.
Need to raise level to top-down approach
– Introduce students to production Grid tools such as portals, application portlets, workflow tools, and how to Grid-enable applications.
34
Grid Computing platform
A Grid computing platform is needed to teach Grid computing in a realistic setting
Problems with many students trying to do Grid computing assignments on a Grid or centralized server.
35
Aspects of new North Carolina Grid Course
Now starts with a GridSphere Grid portal to access resources.
Moves to command line assignments later.
Leads to assignment for developing portlets within Grid portal.
Students use their own computers for some assignments.
Student final projects

36
Programming Assignments (Spring 2007)
Assignment 1      Using grid computing portal
Assignment 2      Using the grid through a command line.
Assignment 3      Using a scheduler (Condor-G)
Assignment 4      Installing GT4 core. Creating, deploying, and testing a GT4 Grid service.
Assignment 5      Installing and using GridNexus workflow editor to create and execute workflows.
Assignment 6      Install GridSphere and implement a portlet within GridSphere portal.
Assignment 7      MPI assignment on grid
Mini-project      Developing grid computing assignment
Assignments 4, 5, and 6 require students to install significant software packages on their computer.
37
Avoiding problems
It requires immense work to prepare for a hands-on distributed Grid computing course.
It is critical that all assignments are fully tested prior to the start of class, and that all computer systems are reliable and software is maintained.
Assignments went much more smoothly when students were required to use personal computers when possible.
38