Teaching Advanced Cyberinfrastructure for Computational Science, Gregor von Laszewski, June, 2008
Abstract Form

Author Information
Name: Gregor von Laszewski
Department/Discipline: Computer Science and PhD Program, College of Computing and Information Science
Institution: Rochester Institute of Technology
Mailing Address: Gregor von Laszewski, Rochester Institute of Technology, Bldg 74, Lomb Memorial Drive, Rochester, NY 14623
E-mail: laszewski@gmail.com
Paper Information
Title: Advanced Cyberinfrastructure for Computational Science
Disciplines (form checklist): Liberal Arts, Social Science, Earth Science, Chemistry, Healthcare, Life Science, Biomedicine, Biology, Physics, Astronomy, (others)
Topics (form checklist): Database Design, Data Mining, Digital Signal Processing, Embedded System, HPC, Machine Learning, Modeling/Simulation, Parallel Programming, Software Security, Visualization, Workflow Management, CS Pedagogy, CEfS Investigation, (others)

Abstract:
Computational science requires the use of advanced cyberinfrastructure in order to provide an environment that is capable of handling the most advanced scientific problems. Therefore, we propose to include in a computational science curriculum a number of correlated theoretical and practical activities that expose students to these challenging environments and ignite a new way of thinking. It is important not only to deal with the technical challenges that enable collaboration and use of the infrastructure, but also to explore the sociological challenges that encourage collaboration, and to identify the potential risks. We expose the students through practical experiences, integrating parallel programming, security, and workflow management to develop this new way of thinking. Our goal is to create an interdisciplinary curriculum that is composed of courses developed to address cyberinfrastructure challenges.
ADVANCED CYBERINFRASTRUCTURE FOR COMPUTATIONAL SCIENCE
Gregor von Laszewski
Rochester Institute of Technology, Bldg 74, Lomb Memorial Drive, Rochester, NY 14623
E-mail: laszewski@gmail.com, Phone: 585 298 5285
1 Abstract
Computational science requires the use of advanced cyberinfrastructure in order to provide an environment that is capable of handling the most advanced scientific problems. Therefore, we propose to include in a computational science curriculum a number of correlated theoretical and practical activities that expose students to these challenging environments and ignite a new way of thinking. It is important not only to deal with the technical challenges that enable collaboration and use of the infrastructure, but also to explore the sociological challenges that encourage collaboration, and to identify the potential risks. We expose the students through practical experiences, integrating parallel programming, security, and workflow management to develop this new way of thinking. Our goal is to create an interdisciplinary curriculum that is composed of courses developed to address cyberinfrastructure challenges.
2 Introduction
In order to enable progress in the scientific discovery process, it is important to note that we are entering an era of increasingly interdisciplinary science that utilizes, or can benefit from, advanced cyberinfrastructure. This not only includes the use of sensor networks, supercomputers, data centers, and advanced instruments by a dispersed set of scientists, but also the capability of integrating large groups of collaborating researchers, allowing them to access the facilities and to share their data and results. Moreover, there is a vision of frequent collaboration within and between different disciplines on multiple scales to achieve innovative solutions to new problems (Figure 1).
[Figure 1 diagram: sensor networks, supercomputers, data centers, scientific instruments, interdisciplinary domain experts, and collaboration tools & frameworks connected through a shared cyberinfrastructure.]
Figure 1: Advanced cyberinfrastructure allows the creation of, and access to, state-of-the-art resources.
Thus, we have reached a phase where knowledge discovery is heavily augmented by utilizing and correlating information from many researchers across multiple scales. One ingredient in making advanced discoveries possible is access to sophisticated cyberinfrastructure. Another ingredient is the integration of commodity tools and services that we are already familiar with in our daily lives. However, a challenging problem is to build an infrastructure in such a way that it is transparent to the majority of scientists, so they can focus on their research rather than dealing with infrastructure-related problems. In order to achieve this goal, an intermediate step is needed while building and deploying infrastructure based on requirements posed by the scientists. There is a growing availability of infrastructure-oriented scientists who can collaborate with the domain experts, and we are in need of domain experts who can collaborate with the infrastructure experts to create and deploy an immersive scientific infrastructure. This will allow scientific explorations to take place anywhere, with anyone, at any time.
Enabling this vision has been a research focus of many scientists around the world, leading to research efforts in Grid computing and Service Oriented Science. However, it is important to realize that the commercial market has at the same time promoted and developed solutions for its own problems, leading to commodity solutions that can be reused as part of scientific infrastructure assets today.
The playing field has not become easier, but rather more complex. New standards, hardware, commodity technologies, scientific data, and knowledge can quickly overwhelm any researcher. Moreover, a scientist may not even know which solutions to pick from in order to address a problem he or she may have. Hence, to utilize such infrastructure it is important that experts exist who are familiar with it and can assist scientists in utilizing it efficiently. By integrating these possibilities and outlooks into a regular scientific curriculum, we hope to raise awareness of new possibilities and ignite a new way of thinking. Besides the technological challenges, scientists also need to be aware of the sociological challenges of collaboration and cooperation. The following proposal is to enhance the curriculum by integrating an interdisciplinary focus, in addition to creating the necessary educational infrastructure to build and deploy advanced cyberinfrastructure.
The paper is structured as follows. First, the progress of current infrastructure development is reviewed. Second, lessons from Grid computing are discussed and summarized. Third, new activities are proposed that can raise awareness amongst interdisciplinary scientists, augmented with recommendations to be implemented as part of a curriculum.
2.1 Advanced Scientific Infrastructure Evolution
A number of projects have demonstrated that the use of an integrative cyberinfrastructure, including sensor networks, supercomputers, data centers, scientists, and collaborative technologies, is beneficial to the quest of advancing science (LEAD).
2.1.1 Computer Hardware Infrastructure
An important aspect of scientific computing lies in the evolution of the available hardware, driven and motivated by various factors. These factors include building and designing hardware suitable to address our most challenging scientific problems, and the ability to build and operate such systems under budget constraints, often with commodity technologies. It is obvious that not everyone will be able to afford a supercomputer, so there is a need to recognize that a variety of hardware platforms, scaling from scientific workstations and desktops all the way up to the world's most capable supercomputing resources, are included in the tools used by scientists. We are all aware that due to Moore's law (the observation that the number of transistors that can be placed on a chip doubles roughly every two years) the supercomputer of today will be replaced by the desktop of tomorrow. Thus, it is important to review several current trends in supercomputing, as well as in the commodity hardware market, in order to assess what will be available to the scientist tomorrow.
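As a back-of-the-envelope illustration (assuming, for concreteness, a doubling period T of about two years), the transistor budget N available t years after a baseline N_0 grows as

    N(t) = N_0 \cdot 2^{t/T}, \qquad T \approx 2 \text{ years},
    \qquad \frac{N(10)}{N_0} = 2^{10/2} = 32,

so a single decade yields roughly a factor of 32, which is why the capability of today's supercomputers migrates to tomorrow's desktops.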
When looking at the current Supercomputing TOP500 performance list, we notice that as of June 2008 a machine called IBM Roadrunner is at the top of the list (see Figure 2).
[Figure 2 plot: peak performance in flop/s on a logarithmic scale from 1.0E+00 to 1.0E+16, versus year of introduction from 1940 to 2010; the most recent point is the IBM Roadrunner.]
Figure 2: Peak performance of supercomputers at time of introduction.
Roadrunner is the first "hybrid" supercomputer that includes compute nodes in which the Cell Broadband Engine, originally built for video game hardware such as the Sony Playstation 3, works in concert with traditional x86 processors. In total, 6562 dual-core AMD Opteron chips and 12,240 Cell chips are used (CluE). The machine has 98 Terabytes of memory and is enclosed in 278 compute racks. It achieves a performance of 1.026 petaflop/s (TOP500)(LANL)(IBM).
The development of this machine is motivated by a change in how we can achieve higher performance, resulting from a change in technology in accordance with Moore's law. We observe that frequency, power, and instruction-level parallelism have had no significant influence on increased performance in recent years. To counter this trend, multi-core systems have been developed, and these will shortly be followed by many-core systems. As a result, memory bandwidth and capacity per core decrease due to the increased number of processing cores.
2.1.2 Compute Clusters
When looking further at the current TOP500 list, two additional trends are noticeable. First, the number of processors per system has increased over time. Second, Linux has become the preferred operating system for clusters. Efforts such as the open-sourcing of Solaris and the introduction of Microsoft's HPC software have not yet made a substantial difference.
Figure 3: Trend of increasing numbers of processors in the TOP500 and the dominance of Linux as the operating system of choice for cluster computing. Source: (TOP500).
2.1.3 Desktop Supercomputing
Due to the availability of graphics processors alongside powerful multi-core systems, it is now possible to build systems that leapfrog performance improvements to the scientific user community at an unprecedented rate. For example, at the Service Oriented Cyberinfrastructure Laboratory at Rochester Institute of Technology we have built a small three-node Cell-based cluster for less than $1500USD. This cluster has been used to work on a mathematical problem on Folkman numbers and has outpaced a high-throughput cluster with 120 state-of-the-art nodes by a factor of two. Others in the community have had similar experiences while using Playstation clusters in areas such as astrophysics (ASTROPS3) and computational biology (FOLDING@HOME).
Most recently, with the introduction of the CUDA framework, other hardware-based accelerations have become possible. As we have demonstrated in our laboratory, it is possible to build such a high-powered system for less than $2500USD. Others have chosen a similar approach, and many applications have demonstrated the viability of such hardware configurations, including bioinformatics (NAMD) and tomography (FASTRA). In such cases, performance improvements of a factor of 100 have been observed.
One problem with these new systems is that the relatively new hardware is difficult to program, as only a few tools exist at this time. Consequently, we have to deal with new ways of optimally utilizing these parallel computing platforms. Furthermore, it is important to pay increased attention to data flow and processing in order to avoid bandwidth limitations.
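To make the programming-model difficulty concrete, the following minimal CUDA sketch (illustrative only; the kernel name saxpy and the sizes are our own choices, not course material) computes y = a*x + y with one thread per vector element:

    #include <stdio.h>
    #include <stdlib.h>
    #include <cuda_runtime.h>

    /* Each thread updates exactly one vector element. */
    __global__ void saxpy(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main(void) {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);
        float *hx = (float *)malloc(bytes), *hy = (float *)malloc(bytes);
        for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

        float *dx, *dy;                          /* device copies      */
        cudaMalloc((void **)&dx, bytes);
        cudaMalloc((void **)&dy, bytes);
        cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

        int threads = 256;                       /* threads per block  */
        int blocks = (n + threads - 1) / threads;
        saxpy<<<blocks, threads>>>(n, 2.0f, dx, dy);
        cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);

        printf("y[0] = %f (expected 4.0)\n", hy[0]);
        cudaFree(dx); cudaFree(dy); free(hx); free(hy);
        return 0;
    }

Even this small example makes the explicit host/device memory management, thread indexing, and data movement visible; these are exactly the aspects that demand new tools and teaching material.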
The availability of such systems makes it possible to shift, for smaller-scale problems, the methodology of how science is performed from a batch-oriented process to a real-time one.
2.1.4 Grids
The Grid computing approach (Laszewski & Wagstrom, 2004) has been instrumental in pushing the limits on the high-end infrastructure side while enabling an elementary minimal set of software middleware to construct environments, thereby allowing resource sharing amongst many of the available supercomputing resources. However, it is a well-known fact that it is not sufficient to just develop the middleware; one must also create suitable deployments of Grids for the scientific community. In the US, two important efforts, the Open Science Grid and the TeraGrid (see Table 1), make it possible to give scientists access to a distributed set of resources integrated within a single Grid infrastructure.
Table 1: Overview of the TeraGrid resources (June 2008) not including high throughput resources.
Name               Institution  System                          CPUs    Peak      Memory    Disk
                                                                        (Tflops)  (Tbytes)  (Tbytes)
Ranger             TACC         Sun Constellation               62976   579.4     123.0     1730.0
Abe                NCSA         Dell Intel 64 Linux Cluster      9600    89.5       9.4      100.0
Lonestar           TACC         Dell PowerEdge Linux Cluster     5840    62.2      11.6      106.5
Queen Bee          LONI         Dell Intel 64 Linux Cluster      5440    50.7       5.3      100.0
Steele             Purdue       Dell Intel 64 Linux Cluster      6496    40.0      12.4      170.0
Big Red            IU           IBM e1350                        3072    30.6       6.0      266.0
BigBen             PSC          Cray XT3                         4136    21.5       4.0      100.0
Blue Gene          SDSC         IBM Blue Gene                    6144    17.1       1.5       19.5
Tungsten           NCSA         Dell Xeon IA-32 Linux Cluster    2560    16.4       3.8      109.0
DataStar p655      SDSC         IBM Power4+ p655                 2176    14.3       5.8      115.0
TeraGrid Cluster   NCSA         IBM Itanium2 Cluster             1744    10.2       4.5       60.0
Cobalt             NCSA         SGI Altix                        1024     6.6       3.0      100.0
Frost              NCAR         IBM BlueGene/L                   2048     5.7       0.5        6.0
TeraGrid Cluster   SDSC         IBM Itanium2 Cluster              524     3.1       1.0       48.8
DataStar p690      SDSC         IBM Power4+ p690                  192     1.3       0.9      115.0
TeraGrid Cluster   UC/ANL       IBM Itanium2 Cluster              128     0.6       0.2        4.0
NSTG               ORNL         IBM IA-32 Cluster                  56     0.3       0.1        2.1
Rachel             PSC          HP Alpha SMP                      128     0.3       0.5        6.0
2.1.5 Service Centers
As indicated in (Laszewski, 2005), Grid middleware has evolved over time to include a service-oriented architectural framework. This is an important development that is also reflected in activities in the business community that provide service centers. Most recently, this has resulted in collaborations between national funding agencies and companies such as IBM and Google (CluE). One of the first companies to realize this potential was Amazon, with the Elastic Compute Cloud (EC2) and the Simple Storage Service (S3). Recently, there has been an increased effort by such companies to develop tools that allow for the easy creation of services that can be developed on the local desktop and uploaded to a service
infrastructure (GOOGLE). An analysis of these commercial offerings has been conducted as part of an
investigation within the EDGEE project (EDGEE).
2.1.6 Virtualization
Over the last couple of years, virtualization technologies have had an enormous impact on the commodity market. Virtualization technologies date back to the early days of IBM in the 1960s; however, it is only recently that these technologies have been developed for modern operating systems at large. Through virtualization, it is possible to abstract the computer's resources and physical characteristics from the applications running on the resources. Two modes are possible through various virtualization technologies: first, making a single resource appear like multiple resources; second, making multiple resources appear as one resource. Naturally, virtualization is one of the key concepts of Grid and Cloud computing.
Virtualization that allows an entire operating system to be run in isolation is an important ingredient in making large-scale cyberinfrastructure more accessible. It will reduce the dependency on hardware for tasks that may not require the highest performance and allow high-performance tasks to best utilize all available hardware. Such virtualization technologies will enable the next generation of Grid and Cloud computing infrastructure to be part of infrastructure service centers, such as the TeraGrid.
2.2 Scientific Portals
Even with virtualization technologies, it is obvious that the available hardware infrastructure will be, in many cases, too difficult to be used by an individual scientist. A natural way to deal with this issue is the creation of scientific portals that hide much of the underlying infrastructure. Projects such as (OGCE) have been instrumental in exposing communities to Grid infrastructures on the TeraGrid (TERAGRID). Hence, a scientist can accomplish key tasks, including one-time authentication, configuring and launching a set of applications, and managing the input and output data sets, from a desktop environment. Furthermore, through Web 1.0 and 2.0 technologies, domain-specific portals can be developed that further abstract this mechanism and let research explorations be formulated as sophisticated queries to backend cyberinfrastructure services.
3 Consequences for Teaching
3.1 Knowledge Discovery Cycle
Due to the challenges involved in state-of-the-art research problems, there is a need to revisit the knowledge discovery cycle and be reminded that any improvement in either method or performance will increase our chances of solving the problems in which we are interested. As depicted in Figure 4, one can derive models that can be implemented through computer simulations and deployed either in an existing or a new cyberinfrastructure environment. After verification of the results, new knowledge and concepts are obtained that are related to the initial problem. This new knowledge may lead to an iterative process in case the problem warrants better solutions. Based on interactions with many scientists, a lot of time is spent in the implementation and deployment phase. Any reduction of this time
will benefit the scientists, and will be considered an improvement. This improvement is not only
achieved by increased performance provided through advanced cyberinfrastructure, such as
supercomputers, but also through more efficient use of the tools and services available to the scientists.
[Figure 4 diagram: a cycle from Problem to Model to Implementation to Deployment to Verification to Concept & Knowledge Discovery, which feeds back into the Problem.]
Figure 4: Knowledge discovery cycle to solve a problem.
When taking into account some of the most recent hardware developments, a few key areas can be identified that would be beneficial as part of the development of a curriculum for the next generation of scientists. However, this paper addresses just a subset of issues that are related to software development as part of the scientific discovery process. The areas of focus are listed in Figure 5 and contribute to the cyberinfrastructure expertise.
Figure 5: Cyberinfrastructure expertise to address some of the newest hardware infrastructure.
3.2 Concepts
In order to develop new algorithms appropriate for advanced cyberinfrastructure, students need to understand concepts from parallel and distributed computing. This includes an understanding of message passing, data and task parallelism, and other concepts from Grid and Cloud computing. Additionally, concepts introducing scientific portals and their purpose need to be addressed. Special attention must be placed on security issues related to authentication and authorization for the use of distributed resources. In addition, concepts related to fostering collaboration, not only between scientists from the same domain but also between different domains, are vital to the learning experience. These concepts can be introduced through various teaching materials including lecture-based materials and applied projects.
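As a concrete illustration of the message-passing concept, a minimal MPI program in C (a teaching sketch, not course material; the work values are arbitrary) might have rank 0 send one unit of work to every other rank:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* who am I?       */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many of us? */

        if (rank == 0) {
            /* Rank 0 acts as the master and distributes work. */
            for (int dest = 1; dest < size; ++dest) {
                int work = dest * 100;          /* arbitrary payload */
                MPI_Send(&work, 1, MPI_INT, dest, 0, MPI_COMM_WORLD);
            }
        } else {
            int work;
            MPI_Recv(&work, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank %d of %d received %d\n", rank, size, work);
        }
        MPI_Finalize();
        return 0;
    }

Even a program this small exposes students to explicit communication, process ranks, and collective startup and shutdown, ideas that recur throughout distributed-memory programming.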
3.3 Tools
One of the important issues we have faced in the past, as part of a large collaborative project, has been dealing with the deficit in basic knowledge of software engineering practices, not only among the scientists but also among those getting involved in the cyberinfrastructure development. Data sharing in a variety of forms has become an essential tool for everyone who participates in large-scale collaborative projects. This not only includes the storing of scientific data but also collaborative code development with the help of open source repositories. Hence, we recommend that a number of effective educational activities address these two issues. Examples of simple tools that can be part of this learning experience include the Subversion source code repository and a database such as MySQL. Naturally, we assume the student possesses knowledge of at least one programming and one scripting language. This addresses the issues of performance and the development of scriptable components that are used to bridge between different infrastructure components.
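A first exposure to collaborative development with Subversion can be as small as the following session (the repository URL and file names are hypothetical):

    svn checkout https://svn.example.edu/project/trunk project
    cd project
    svn add src/model.c                      # place a new file under version control
    svn commit -m "add initial model code"   # publish the change to the team
    svn update                               # pick up changes made by others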
3.4 Infrastructure
Due to the recent development of hardware that allows for performance increases based on parallel paradigms, we believe the curriculum for computational science students must include extensive education in this area. This includes introductions to parallel computing, task parallelism and workflows, Grid and Cloud computing, as well as the effective utilization of CUDA, Cell, and other close-to-metal technologies. As the material covering this area is rather large, the introductory course needs to be followed up by additional learning experiences that may focus on one or more of these subjects in more detail. Furthermore, we believe that courses need to be offered year-round in order to allow the students easy access to such learning opportunities. Rochester Institute of Technology has established, at the Masters and PhD graduate student level (undergraduates with sufficient qualifications can also take the classes), a strong curriculum in parallel and distributed computing. The intention is to evolve the curriculum in order to address the areas depicted in Figure 5. A first step in this direction has been achieved by offering a year-round course in advanced cyberinfrastructure that covers Grid and Cloud computing with an emphasis on security and workflows. Because this course is project-oriented, there was the chance to introduce projects that are in the close-to-metal category. One special feature of this course is that each student (as part of a team with no more than four students) is required, within the first three weeks of the class, to build their own cluster and Grid based on commodity hardware given to them. The groups assembled a Playstation cluster, a Rocks cluster (ROCKS), a cluster running
entirely on virtual machines, an OS X Xgrid, a Globus 4 Grid, and a Microsoft HPC cluster. After this exercise, the different groups had to report on the ease of installation and maintenance. As none of the students had done this activity before, they gained insight into the complexity of what it takes to deploy and maintain such systems throughout the quarter. In parallel, the students had to form groups and identify a project that utilizes this infrastructure and explain how it is related to Grid and Cloud computing. All projects were maintained through a Subversion source code repository, so students were able to experience how to develop code in a large collaborative group.
Due to the relative maturity of cluster computing, a number of useful tools and libraries exist. It is beneficial to be aware of these resources in order to utilize this hardware platform. Examples include introductions to batch processing, queuing systems, and MPI programming. We intend to collect links to such material and make it available through our web pages as part of upcoming courses.
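For batch processing, a minimal job script, assuming a PBS/Torque-style queuing system (the script contents and resource values are illustrative, not taken from a specific course), requests two nodes with four processors each and launches an MPI program; it would be submitted with "qsub job.pbs":

    #!/bin/bash
    #PBS -N mpi_example             # job name shown in the queue
    #PBS -l nodes=2:ppn=4           # 2 nodes, 4 processors per node
    #PBS -l walltime=00:10:00       # maximum run time
    cd $PBS_O_WORKDIR               # run from the submission directory
    mpirun -np 8 ./mpi_example      # start 8 MPI processes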
In addition, we have found some good tutorials on Grid computing, specifically in regards to the Open Science Grid (OSG). These tutorials were easily executable by the students. In one instance we initially had difficulties getting student accounts. However, this was a welcome issue, as it pointed out important aspects of using resource infrastructure hosted at a service center. The students learned that procrastination is not advisable due to the time it takes to obtain such accounts. This was mostly motivated by the recognition that verifying an identity and obtaining accounts is a complex process that is prone to human error. Walking students through the process in detail showed them what it takes to make this infrastructure happen. It also proved worthwhile to contrast this learning experience with methods on how to do it differently, as this provided a valuable learning experience in applied security.
Many of the activities proposed can be conducted without great financial commitment. Grid resources, such as the OSG and TeraGrid, provide small allocations for teaching purposes; commodity clusters can be built from discarded equipment; and the creation of powerful machines including a CUDA card is relatively inexpensive. One advantage that we saw while introducing infrastructure deployment aspects into the lectures was that the students appreciated hands-on experiences and found the class more attractive. This is an important area of concern, because parallel and distributed computing has to deal with significant competition within a university, such as classes involving robots or computer graphics.
4 Teaching Methodology
4.1 Course Structure
The proposed teaching methodology varies based on the maturity and complexity of the material introduced in the class. A class in cluster computing or message passing can be driven by lectures and homework assignments only. Today, many people believe that experience through applied projects must be part of the teaching methodology. These projects can be conducted either through class assignments or through a special lab class that is mandatory to take in conjunction with a lecture. At RIT, only project-driven homework assignments based on one single big project are used. This has the advantage that the students not only need to focus their attention on the technological challenges, but also need to
manage a collaborative project. Over time, the hope is to attract students from other scientific domains, which will encourage interdisciplinary teams. All teaching material will be placed in a repository in order to allow for successive improvements. In parallel to the projects, seminar-style lectures are conducted in which a student presents on a topic for a large part of a class. This allows the students to learn the communication skills necessary to interact with each other, similar to how we expect communication would take place in a large-scale project. All presentation material needs to be handed in the day before the presentation. If it is deemed insufficient, it can be replaced or augmented by material provided by the course instructor. One additional learning outcome for all students is the ability to communicate their results in a written, professional paper.
4.2 Learning in a Global Society
The concepts and methods used at any high school and university prepare students for a world with life-long learning opportunities that may be continued after graduation, not only as part of research activities but also as part of the involvement in building a knowledge society. While teaching material of this kind is plentiful at a high academic level, there is a need to disseminate material that is suitable for introducing many of the concepts at an earlier rung of the educational ladder (Figure 6). It is important to recognize that some students are already accustomed to a variety of concepts as users of advanced infrastructure. Video games and consoles, cell phones, social networks, Web portals, and PCs are already common among K-12 students. However, this experience is primarily based on using the technology. It is important to work towards lowering the entry level at which this and more advanced infrastructure is used by potential future students. The OLPC project (OLPC), for example, demonstrated that the world will soon be facing a population with an increased technological skill level. In order to stay competitive, curricula that allow for the introduction of advanced scientific and cyberinfrastructure-based topics need to be encouraged earlier. This can already be seen, as some universities have successfully introduced parallel computing concepts at the undergraduate level.
[Figure 6 diagram: an educational ladder showing parallel computing concepts and the reduction of the entry level. Researchers and PhD graduate students: all of the below plus large supercomputers, Grid, Cloud, CUDA, Cell. Graduate students: commodity clusters, CUDA, Cell. Undergraduate students: programming, PCs. K-12: games, PCs.]
Figure 6: Exemplary educational ladder and the introduction of topics by age group.
5 Conclusion
In order to address the changes in the hardware made available to most scientists, we need to integrate activities that foster the understanding of parallel computing, as well as of the underlying infrastructure that requires optimized algorithm design. With the help of commodity technologies and toolkits that utilize them, we intend to integrate learning activities not only for students majoring in computer science but also for students in domain and interdisciplinary areas of science. This not only includes the use of compute clusters, but also the use of desktop supercomputers enabled through GPUs, the Cell Broadband Engine, and Grid and Cloud computing. Besides preparing for these infrastructure-related resources, we will also focus on the methodologies for how projects can be conducted in large and small teams, including interdisciplinary team members.
6 Bibliography
ASTROPS3. (n.d.). Retrieved from http://www.sciencedaily.com/releases/2007/03/070319205733.htm
CluE. (n.d.). Cluster Exploratory (CluE) relationship. Retrieved from http://www.nsf.gov/news/news_summ.jsp?cntn_id=111186
EC2. (n.d.). Amazon Elastic Compute Cloud (Amazon EC2) - Beta. Retrieved from http://www.amazon.com/EC2-AWS-ServicePricing/b/ref=sc_fe_l_2?ie=UTF8&node=201590011&no=3435361&me=A36L942TSJ2AJA
EDGEE. (n.d.). An EGEE comparative study: Grids and clouds - evolution or revolution? Retrieved from https://edms.cern.ch/file/925013/3/EGEE-Grid-Cloud.pdf
FASTRA. (n.d.). Retrieved from http://fastra.ua.ac.be/en/index.html
FOLDING@HOME. (n.d.). Retrieved from http://www.scei.co.jp/folding/en/
GOOGLE. (n.d.). App Engine. Retrieved from http://code.google.com/appengine/
IBM. (n.d.). Roadrunner press release. Retrieved June 30, 2008, from http://www-03.ibm.com/press/us/en/pressrelease/24405.wss
LANL. (n.d.). Roadrunner home page. Retrieved June 30, 2008, from http://www.lanl.gov/orgs/hpc/roadrunner/index.shtml
Laszewski, G. v. (2005). The Grid-Idea and Its Evolution. Journal of Information Technology, 47(6), 319-329. doi: 10.1524/itit.2005. http://www.mcs.anl.gov/~gregor/papers/vonLaszewski-grid-idea.pdf
Laszewski, G. v., & Wagstrom, P. (2004). Gestalt of the Grid. Wiley. http://www.mcs.anl.gov/~gregor/papers/vonLaszewski--gestalt.pdf
LEAD. (n.d.). Retrieved from https://portal.leadproject.org
NAMD. (n.d.). Retrieved from http://www.ks.uiuc.edu/Research/vmd/cuda/
OGCE. (n.d.). Open Grid Computing Environment. Retrieved from http://www.ogce.org
S3. (n.d.). Amazon Simple Storage Service (Amazon S3). Retrieved from
http://www.amazon.com/gp/browse.html?node=16427261
TERAGRID. (n.d.). Teragrid Portal. Retrieved from https://portal.teragrid.org
TOP500. (n.d.). Retrieved from http://www.top500.org/lists/2008/06
7 Appendix
At Rochester Institute of Technology, students can major in a course cluster as part of the College of Computing and Information Science. The current official course offerings in the cluster are listed next; the first two are required and the others are optional.
7.1 Distributed Systems Cluster
The Distributed Systems Cluster (RIT-DC) focuses on systems formed from multiple cooperating computers. Courses cover the analysis, design, and implementation of distributed systems, distributed middleware, and computer networking protocols, including security. By choosing different combinations of courses, students can gain a broad understanding of the distributed systems field, concentrate on high-level distributed application aspects, or concentrate on low-level networking aspects.
Course name                              Short Description
Distributed Systems I                    introduction to the study of the hardware and software issues affecting the design of a distributed operating system
Data Communications and Networks I       introduction to the concepts and principles of computer networks
Cryptography                             review of basic cryptographic algorithms, their implementation and usage
Cryptography II                          investigates advanced topics in cryptography
Distributed Systems II                   current topics in the field
Parallel Computing I                     study of the hardware and software issues in parallel computing
Parallel Computing II                    study of selected topics in parallel algorithm design through the analysis of algorithms used in various areas of application
Topics in Distributed Systems            current topics in the field
Data Communications and Networks II      continues the study of computer networks begun in Data Communications and Networks I, emphasizing design principles
Ad Hoc Networks                          explores serverless ad-hoc networks
Secure Operating Systems and Networks    introduction to the issues surrounding security aspects in operating systems and networks
Topics in Data Communications            current topics in the field
Privacy and Security                     introduction to the issues surrounding security of computer systems and privacy concerns
Advanced Computer Networks               current topics in the field
Intelligent Security Systems (Seminar)   current topics in the field
Sensor Networks Data Management          explores policies, methods and mechanisms for protecting enterprise data
7.2 Advanced Cyberinfrastructure
7.2.1 Grid Computing I
Grids are a type of parallel and distributed system that enable the sharing, selection, and aggregation of
resources distributed across multiple administrative domains based on their resource availability,
capacity, performance, cost and the users' quality-of-service requirements. Students will become
familiar with various aspects of Grid computing and be introduced to technologies that are instrumental
in practical deployments of Grids.
Prerequisites:
- Approval from the instructor
- Knowledge of Java, C, or C++ is required. Knowledge of WS technologies and parallel programming is beneficial. Your background knowledge may determine the project you will conduct as part of this seminar.
Course Goals and Rationale
The goals of this course are to provide students with the fundamentals of Grid computing and to introduce them to the various ways in which Grid computing can be utilized and applied. According to Gartner Research, Grid computing is a multibillion-dollar-per-year effort. This course provides the student with elementary knowledge offering the potential for excellent career opportunities in research and industry. This course is part of an educational pipeline that can span K-12, undergraduate, graduate, and postgraduate levels. The topic of Grid computing is a state-of-the-art and advanced topic in distributed systems. It exposes the students to in-depth topics from different fields such as security, parallel computing, service-oriented architectures, and scientific computing. It enhances the already excellent opportunities in the distributed computing cluster. The learning experience will be enhanced by literature reviews, student presentations, and practical experiences to build, use, and program Grid environments. A follow-up course, "Seminar Grid Computing II: e-Science," is planned for the fall.
Resources
- This course will not be using any textbook. Instead, research papers and the internet will be used.
Topic Outline
What follows is an example of potential topics offered in the first installment of this course. Because this is a fast-paced research area, appropriate adjustments to the course outline will be integrated in order to stay state-of-the-art. Each main topic will be covered for one week in the class.
- Introduction (Week 1)
  o Topics: Motivation of Grid Computing, Grid Computing History, Use Cases
- Lab (Week 2)
- Grid Programming Tools (Week 3)
  o Topics: WS Technologies and Grids, Globus, Condor, Unicore, BOINC, Amazon
  o Assignments:
    - Project proposals due
    - Note: I may initially assign some smaller tasks in class; please add them to the syllabus under Assignments for the week, including each task's name and description
    - Bib files are from now on maintained through svn and JabRef
- Grid Security (Week 4)
  o Topics: Certificate Authorities, GSI, VOMS, Shibboleth, MyProxy
  o Assignments:
    - Grid lab from OSG is due (demo possible)
- Job Submission (Week 5)
  o Topics: GRAM, EDGEE, Glideins, Workflows
- Information Services (Week 6)
  o Topics: Ganglia, MDS, GLUE
  o Assignments:
    - Progress report on the project in written form
    - Draft of the final report
    - Code must be in svn
- Data Grids (Week 7)
  o Topics: Data vs. Computing, Pedigrees
- Grid Portals & Knowledge Grids (Week 8)
  o Topics: Portals, Gateways, Mashups
- Deployments (Week 9)
  o Topics: TeraGrid, EDGEE, Condor, BOINC, Playstation
- Final Presentations (Week 10)
  o Assignments:
    - Final report due
    - Final project due
    - Update your web page to include links to your final deliverables
- Future of Grid Computing
  o Topics: Virtual Machines, Cloud Computing, Knowledge Grids, Research Issues
7.2.2 Grid Computing II
This course is a continuation of Grid Computing I. The assignment due dates are structured similarly to those in the first course. The project from the first course can be continued.
- Introduction (Week 1)
  o Grid Computing and scientific applications
  o Overview of use cases
- Grid Security (Week 2)
  o VOMS
  o Shibboleth
- Job Management (Week 3)
  o Scheduling
  o Glideins
  o Workflows
- Information Services (Week 4)
  o Scalable information services
  o Information mashups
  o Information databases
- Data Grids (Week 5)
  o Advanced pedigrees
  o Information protection
- Knowledge Grids (Week 6)
  o Cyberinfrastructure for the Deaf
- Applications (Weeks 7 & 8)
  o Astrophysics
  o Storm prediction and climatology
  o Advanced instruments
  o Bioinformatics
  o Others
- Grid Programming Tools (Week 9)
  o WS technologies and Grids
  o Workflow tools
  o Social networks
  o Unicore
  o BOINC
  o Amazon
- Final Presentations (Week 10)
- Reflection, what is next? (Week 11)
7.2.3 Proposed outline for a CUDA class
- Introduction (Week 1)
  o History
- Basics of GPU and CUDA technologies (Week 2)
  o Running examples, compiling
- Programming CUDA (Week 3)
  o Matrix multiplication (see the sketch after this outline)
- CUDA memory model (Week 4)
  o Tiling
- CUDA hardware (Week 5)
- CUDA compute core (Week 6)
- CUDA performance optimization (Week 7)
- CUDA control flow (Week 8)
- Floating point operations and their impact on accuracy, BLAS, sorting, ... (Week 9)
- CUDA applications (Week 10)
- Project presentations (Week 11)
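As a sketch of the Week 3 material, a naive CUDA matrix-multiplication kernel (illustrative only, not the official course code) assigns one thread per output element; the tiling topic of Week 4 then improves its memory behavior:

    /* C = A * B for n x n row-major matrices, one thread per C element. */
    __global__ void matmul(const float *A, const float *B, float *C, int n) {
        int row = blockIdx.y * blockDim.y + threadIdx.y;
        int col = blockIdx.x * blockDim.x + threadIdx.x;
        if (row < n && col < n) {
            float sum = 0.0f;
            for (int k = 0; k < n; ++k)
                sum += A[row * n + k] * B[k * n + col];
            C[row * n + col] = sum;
        }
    }

    /* Example launch for matrices already resident on the device:
       dim3 threads(16, 16);
       dim3 blocks((n + 15) / 16, (n + 15) / 16);
       matmul<<<blocks, threads>>>(dA, dB, dC, n);                  */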