Report to the Lilly Endowment, Inc.
Grant Number 2008 1639-000
18 Month Program Report
January 1-May 31, 2010

Submitted by:
Michael A. McRobbie, Indiana University President
Bradley C. Wheeler, Vice President for Information Technology, CIO, and Dean of Information Technology
Craig A. Stewart, Executive Director, Pervasive Technology Institute, Associate Dean of Research Technologies

Table of Contents

I. Introduction and Executive Summary
II. Digital Science Center
III. Data to Insight Center
IV. Center for Applied Cybersecurity Research
V. Research Technologies
VI. Bringing Distinction to the State of Indiana
VII. Institute Coordination and Support
VIII. Management and Operations
IX. Economic Development
X. External Relations and Strategic Initiatives
XI. Educating the Residents of Indiana and Beyond
Appendix 1: Technology Disclosures during the Reporting Period
Appendix 2: Open Source Software
Appendix 3: Online Services
Appendix 4: Publications (January 1-May 31, 2010)
Appendix 5: Presentations (January 1-May 31, 2010)
Appendix 6: Active and Pending Grants
Appendix 7: Interim Financial Report
Appendix 8: Education, Outreach and Training Events
Appendix 9: Public and Governmental Service Activities
Appendix 10: News and Media Placements
Appendix 11: Glossary of Technical Terms Used in this Report

I. Introduction and Executive Summary

Pervasive Technology Institute has enjoyed another successful period, both in the receipt of more external grants and in its participation in several projects of national and international significance.
PTI has gained a solid national reputation for Indiana University and the state of Indiana in the areas of high performance computing, data management and preservation, computational support of scientific research, and security and privacy policymaking. Although the announcement came shortly after the close of the current period, it is notable that Indiana University was recently named by Computerworld magazine as one of the 100 best places to work in IT (http://www.computerworld.com/spring/bp/detail/767). Although this honor is shared by all IT organizations within the IU system, the recognition specifically pointed to factors such as the opportunity for employees to participate in leading research, publish in scholarly journals, and present at international conferences as contributing to IU receiving this distinction. In these ways, PTI has contributed significantly to meeting the award criteria and is an important component in IU’s success in attracting and retaining top intellectual talent in the state of Indiana.

Other major highlights for the reporting period are summarized below in non-technical language. More detailed and technical descriptions can be found in the body of the report. This report has been structured to provide non-technical bullet lists in this section and at the start of each of the more technical sections.

Another important change to the standard format of this report is the inclusion of IU’s Research Technologies (RT) division as a separate entity within the PTI report, rather than including its contributions within each center. RT contributes so substantially to the success of PTI, and its activities are so cross-cutting, that it was important to give RT its own section in order to reflect its contribution accurately and completely.

PTI highlights for the reporting period:

In a time of unprecedented national financial hardship, PTI has enjoyed continued grant success during the period.
PTI received additional grant awards totaling nearly $4 million (bringing its external funding total to more than $22 million) and submitted a remarkable 54 grant proposals, currently pending, totaling more than $82.5 million. According to a recent report by the US Science Coalition, “When public money is invested in university-based basic research there is tremendous return on investment. Research creates jobs directly for those involved and indirectly for many others, through innovations that lead to new technologies, new industries and new companies.” (http://www.sciencecoalition.org/successstories/index.cfm)

Fred H. Cate, director of the PTI Center for Applied Cybersecurity Research, served as a policy advisor on technology privacy and security, making presentations to the US Department of Commerce, the Subcommittee on Crime and Drugs of the US Senate Committee on the Judiciary, and the US Federal Trade Commission. Testimony provided by Cate was cited in numerous national media outlets, including the New York Times.

During the reporting period, the Digital Science Center, along with the Research Technologies Systems and Applications groups, completed the initial stage of hardware installation and testing for the FutureGrid project. This is a complicated and delicate process that required significant technical expertise and effort. The FutureGrid project places Indiana and Indiana University at the helm of one of the most important national efforts related to the future of technology and scientific research. Supported by a $10 million grant from the National Science Foundation and led by PTI’s own Geoffrey Fox, FutureGrid provides a testbed for the most significant emerging grid and cloud technologies. These are the technologies expected to drive global business and scientific research in the coming decades. The project will be used to define the future of US national computing infrastructure and contributes significantly to US competitiveness in the sciences.
The DSC, along with Research Technologies Systems, completed installation of more than 21 teraflops of computing power to support the Polar Grid project during the reporting period. The Polar Grid project, which is funded by a series of grants from the National Science Foundation, has been enormously successful and has continued to grow. The project is creating a computational grid in the polar regions to support research by NASA and the Center for Remote Sensing of Ice Sheets (CReSIS) on the earth’s rapidly melting polar ice sheets. Rising sea levels caused by melting ice sheets threaten coastal areas with flooding and endanger wildlife. The computational power provided by Polar Grid allows scientists to begin processing data while still in the field and has greatly increased the speed at which discoveries can be made in this critical race against time. During the period, RT Systems employees traveled to Greenland, Antarctica, and Chile to install and upgrade Polar Grid systems.

The Data to Insight Center (D2I) organized and led Indiana University’s participation in a $20M proposal to the National Science Foundation Sustainable Digital Data Preservation and Access Network Partners (DataNet) program. The project would help to develop techniques to preserve valuable data related to meteorological science. A decision on the proposal, which received a successful NSF site visit in February 2010, is expected in summer 2010.

D2I made significant contributions to the Vortex2 project, the largest national effort to date to understand the formation and behavior of tornadoes. D2I’s LEAD II technology provided real-time forecasts to handheld devices used by storm chasers in the field.
Scholarly Accomplishment

Group | Publications | Technical Presentations | Inventions Disclosed | Open Source Software Distributed | Online Services Provided | Public and Governmental Service Activities
Digital Science Center | 63 | 58 | 1 | 9 | 17 | 1
Data to Insight Center | 5 | 14 | 0 | 3 | 1 | 0
Center for Applied Cybersecurity Research | 26 | 18 | 0 | 0 | 0 | 0
UITS Research Technologies | 1 | 11 | 1 | 0 | 0 | 0
Pervasive Technology Institute Total | 95 | 101 | 1 | 12 | 18 | 1

Educating the 21st Century Workforce

Group | Undergraduate Student Employees/Interns | M.S. Students Employed or Supported | Ph.D. Students Employed or Supported | Undergraduate Degrees Awarded | M.S. Degrees Awarded | Ph.D. Degrees Awarded | Education, Outreach and Training Events
Digital Science Center | 0 | 3 | 18 | 0 | 4 | 3 | 11
Data to Insight Center | 5 | 26 | 13 | 3 | 3 | 0 | 8
Center for Applied Cybersecurity Research | 0 | 0 | 0 | 0 | 0 | 0 | 13
Pervasive Technology Institute Total | 5 | 29 | 31 | 3 | 7 | 3 | 32

Grant-related Activity

Group | Number of Grant and Contract Proposals Submitted | Total Dollar Amount of Proposals Submitted | Number of Grant and Contract Proposals Awarded | Total Dollar Amount of Proposals Awarded
Digital Science Center | 29 | $15,127,601 | 5 | $756,662
Data to Insight Center | 10 | $27,074,510 | 3 | $1,242,066
Center for Applied Cybersecurity Research | 8 | $14,597,012 | 3 | $149,786
UITS Research Technologies (PTI Related) | 7 | $25,738,276 | 1 | $1,839,949
Pervasive Technology Institute Total | 54 | $82,537,399 | 12 | $3,988,463

II. Digital Science Center
Geoffrey C. Fox, Director

II.1 Digital Science Center Mission and Activity Summary

The Digital Science Center (DSC) focuses on creating an intuitively usable cyberinfrastructure with tremendous capabilities for supporting collaboration and computation. Easy-to-use, human-centered interfaces to cyberinfrastructure created by the Digital Science Center will enable the many thousands of researchers in the public and private sectors to use the capabilities of cyberinfrastructure and accelerate innovation and discovery.
The DSC includes the following labs and support units:

Community Grids Lab - Geoffrey Fox, Director; Marlon Pierce, Gregor Von Laszewski and Judy Qiu, Assistant Directors
Complex Networks and Systems Group - Alex Vespignani, Director
Open Systems Lab - Andrew Lumsdaine, Director
University Information Technology Services (UITS) Research Technologies (RT) Applications Division - D. Scott McCaulay, Director
UITS/RT Systems Division - Matt Link, Director

Center Highlights January 1-May 31, 2010

The following bullet list provides a non-technical overview of accomplishments for the period. More detailed and technical descriptions appear in the section that follows.

With the Research Technologies Systems group, the Digital Science Center made significant progress on its FutureGrid Project. FutureGrid is funded by a $10 million grant from the National Science Foundation and puts IU in a leadership role in one of the largest and most important research efforts in U.S. computational science. FutureGrid is a national testbed for emerging grid and cloud computing technologies that hold tremendous potential for business and scientific research. FutureGrid will help to define the way the NSF provides computing power to scientists in the coming decades and will have a significant impact on U.S. competitiveness in scientific research. During the reporting period, the main infrastructure for FutureGrid was completed. A site visit by the NSF is scheduled for July to approve the hardware and open the system for use by scientists.

The DSC, along with Research Technologies Systems, completed installation of more than 21 teraflops of computing power to support the Polar Grid project during the reporting period. The Polar Grid project, which is funded by a series of grants from the National Science Foundation, has been enormously successful and has continued to grow.
The project is creating a computational grid in the polar regions to support research by NASA and the Center for Remote Sensing of Ice Sheets (CReSIS) on the earth’s rapidly melting polar ice sheets. Rising sea levels caused by melting ice sheets threaten coastal areas with flooding and endanger wildlife. The computational power provided by Polar Grid allows scientists to begin processing data while still in the field and has greatly increased the speed at which discoveries can be made in this critical race against time. During the period, RT Systems employees traveled to Greenland, Antarctica, and Chile to install and upgrade Polar Grid systems.

The Digital Science Center continues as a leader in the development of portals and gateways. Portals and gateways are online services that help scientists gain easy access to the supercomputers they need to perform their research. Using supercomputers can be a significant challenge because there has traditionally been a steep learning curve. Portals and gateways allow scientists to more easily access advanced technology without requiring a deep understanding of how that technology operates. During the reporting period, the DSC’s QuakeSim earthquake modeling portal played a crucial role in research conducted by NASA on the Baja California earthquake.

During the reporting period, the Community Grids Lab had an important software release of its “Twister” program, a powerful tool that helps scientists find meaning in very large data sets. Twister improves upon Google’s popular MapReduce software tool, allowing it to achieve higher performance, perform faster data transfers, and reduce the time it takes to process vast sets of data for data mining and machine learning applications. Twister has a great deal of potential to increase the speed of scientific discovery, especially in the areas of biomedical research.
The Open Systems Lab had an active period, completing one major project and reaching milestones in several others. The OSL contributes to scientific and business competitiveness in Indiana and the U.S. by providing valuable open source software that optimizes high performance computers and is freely available to the scientific research and business communities.

The Complex Networks and Systems (CNeTS) group had another successful period. A primary research focus involves the modeling of the spread of infectious disease, including H1N1 and HIV in parts of Africa, in order to help health officials make decisions about how to prevent or slow the spread of illness in populations. During the period, CNeTS director Alex Vespignani was featured in the prestigious international journals Science and Nature describing his work on modeling the H1N1 pandemic. Vespignani’s predictions about the spread of H1N1 were found to be exceptionally accurate, bringing him international recognition for the modeling techniques developed in his lab.

Scholarly Accomplishment

A summary of the scholarly accomplishments of the Digital Science Center during this reporting period is provided below:

Group | Publications | Technical Presentations | Inventions Disclosed | Open Source Software Distributed | Online Services Provided | Public and Governmental Service Activities
Community Grids Lab | 45 | 37 | 1 | 4 | 13 | 1
Open Systems Lab | 6 | 5 | 0 | 3 | 1 | 0
Complex Networks and Systems Group | 12 | 16 | 0 | 2 | 3 | 0
Digital Science Center Total | 63 | 58 | 1 | 9 | 17 | 1

Educational Activities

The following table provides a summary of the educational activities of the DSC during this reporting period:

Group | Undergraduate Student Employees/Interns | M.S. Students Employed or Supported by DSC | Ph.D. Students Employed or Supported by DSC | Undergraduate Degrees Awarded | M.S. Degrees Awarded | Ph.D. Degrees Awarded | Education, Outreach and Training Events
Community Grids Lab | 0 | 3 | 0 | 0 | 3 | 0 | 6
Open Systems Lab | 0 | 0 | 10 | 0 | 0 | 1 | 3
Complex Networks and Systems Group | 0 | 0 | 8 | 0 | 1 | 2 | 2
Digital Science Center Total | 0 | 3 | 18 | 0 | 4 | 3 | 11

Funded Research

The following table provides a summary of grants submitted and grants received by the DSC during the current reporting period:

Group | Number of Grant and Contract Proposals Submitted | Total Dollar Amount of Proposals Submitted | Number of Grant and Contract Proposals Awarded | Total Dollar Amount of Proposals Awarded
Community Grids Lab | 10 | $8,476,235 | 2 | $332,792
Open Systems Lab | 6 | $3,643,489 | 1 | $50,321
Complex Networks and Systems Group | 13 | $3,007,877 | 2 | $373,549
Digital Science Center Total | 29 | $15,127,601 | 5 | $756,662

II.2 Digital Science Center Research

The following section includes research highlights for the Digital Science Center as a whole and for each lab and group within the DSC.

II.2.1 Center-wide Research

Projects Achieving Major Milestones

FutureGrid (With Research Technologies Systems and Applications Groups)

The FutureGrid project, which was announced in the previous report and started in fall of 2009, is a testbed for emerging technology related to grid and cloud computing. The project places Indiana University at the helm of one of the most important, leading edge projects in the field of computational science today. It is a national collaboration supported by a $10 million grant from the National Science Foundation and led by PTI’s Geoffrey Fox. The goal of FutureGrid is to allow the U.S. science and business communities to test the most promising new supercomputing technologies in order to plan the next generation of national computational infrastructure that will be provided by the National Science Foundation. The NSF currently provides national computational resources to the U.S. scientific community through supercomputing centers and networks such as the TeraGrid. FutureGrid will help NSF to establish future scientific research networks in order to preserve U.S. competitiveness in science and business.
FutureGrid focuses on cloud technologies as the emerging computational paradigm in the coming decades. Wikipedia defines cloud computing as “Internet-based computing, whereby shared resources, software, and information are provided to computers and other devices on demand, like the electricity grid.” Cloud computing supports research and business by providing a single access point to numerous computational resources that lie “in the cloud” without requiring that the user know or understand the complex technology that is supporting them. Businesses such as Google and Amazon already rely heavily upon cloud computing to support their business and are proving it to be a critical emerging technology.

The current reporting period has been highly active for FutureGrid, as the main framework of the testbed was established and tested during this time. One of the major milestones is the acceptance by the NSF of the computing hardware that makes up the grid testbed. This is still in progress but should be completed by the end of June 2010. The hardware that will become available in the last week of June is listed in the table below. At present, the FutureGrid team is working on a software architecture that allows different software stacks to be dynamically provisioned onto the FutureGrid hardware at the request of users. FutureGrid uses a concept called “raining” that supports virtual environments, helping to minimize overhead and maximize performance for the research scientists (see figure below).

System type | # CPUs | # Cores | TFLOPS | Total RAM (GB) | Secondary storage (TB) | Site
IBM iDataPlex (sierra) | 168 | 672 | 7 | 2688 | 72 | SDSC
Cray XT5m (xray) | 168 | 672 | 6 | 1344 | 335 | IU

(Above) High-level hardware specifications for systems to be included as part of the FutureGrid. The testbed will help to define the next national computing grid in the U.S., supporting U.S. competitiveness in science and industry.
Figure: FutureGrid “rains” an environment on the testbed suitable for the user to conduct experiments. It reduces overhead and maximizes performance for researchers.

The coming period will be an exciting time for the FutureGrid project as the hardware officially becomes available for use by the community and large-scale testing of these cutting edge cloud technologies begins.

Portals and Gateways (with Research Technologies Applications Group)

The Research Technologies Applications (RT-A) group at Indiana University continues to partner with the PTI Digital Science Center to address usability issues in scientific computing through the development of portals and gateways. Today’s multicore computers offer unparalleled computing power for scientific research, but the barriers to entry can be quite high. Portals and gateways provide easy to use single entry points for scientists to access high performance computers and other advanced technology essential to their research without requiring that they have an in-depth understanding of the computers themselves. During the reporting period, DSC and RT-A staff have worked together to release tools and provide support for science gateways as part of the Open Science Grid and Linked Environments for Atmospheric Discovery (LEAD). Portals and gateways developed by PTI are helping U.S. scientists be more competitive by helping them gain easy access to some of the most powerful scientific computing available.

PolarGrid (With Research Technologies Systems Group)

The IU-led Polar Grid project is creating a high performance computational grid in the Arctic and Antarctic regions in order to process data collected about the rapidly melting ice sheets. Funded by a grant from the National Science Foundation, Polar Grid allows scientists to process ice sheet data while still in the field, shortening the time between data collection and discovery.
Because the melting sea ice has potentially serious environmental consequences for low-lying and coastal areas, this is a problem that must be understood and mitigated as quickly as possible. Polar Grid is significantly improving the speed at which discoveries about polar ice can be made.

During the reporting period, the Research Technologies Systems (RT-S) group installed 21.9 TFLOPS of HPC systems for the Polar Grid project. A portion of that system, 64 nodes, has been installed separately and will be relocated to Elizabeth City State University (ECSU) later in 2010. ECSU is a partner on the PolarGrid award and is a minority serving institution in North Carolina.

In addition to installing this new system at IU in Bloomington, two RT-S staff members traveled to Thule in Greenland for fieldwork related to the PolarGrid project. The expedition was led as part of NASA’s IceBridge mission, the largest airborne survey ever flown of Earth's polar ice. RT-S also worked closely with the Center for Remote Sensing of Ice Sheets (CReSIS), which is headquartered at the University of Kansas, for the IceBridge mission. RT-S is working closely with CReSIS and RT-A to port the current analysis code from using runtime Matlab libraries to a compiled environment.

II.2.2 Community Grids Lab

The mission of the Community Grids Lab (CGL) is to create the technology that will enable grid computing to help solve important scientific problems. In creating new global communities, grid computing will open the way to new possibilities for e-Business and e-Science. The CGL focuses on creating new technology infrastructure and applications that will enable distributed business enterprises and cyberinfrastructure for distributed science and engineering. Computers and networks are getting faster; the distinction between computers and the network is blurring.
This points to a future where individuals and corporations interact with grid-based applications without needing to explicitly manage the underlying technology details. CGL's focus on applications has spawned much cross-disciplinary collaboration in research and development of scientific and business applications. A current major emphasis is in earth science and particle physics, with other projects in education, biocomplexity, chemistry, apparel design, digital film production, and sports informatics.

Projects Achieving Major Milestones

Open Grid Computing Environments
Funding Agency: National Science Foundation

The Open Grid Computing Environments (OGCE) project creates open source software for building science gateways and consults with many major participants in the TeraGrid Science Gateway program. Science gateways are web-based access points and tools that make it easier for scientists to use advanced computing technology by greatly reducing the amount of computational expertise required to run experiments using supercomputers and other advanced technology.

During this period the OGCE completed its preliminary integration of several major components into a single build environment: the OGCE Google-compatible Gadget container, the XRegistry service registry, the GFAC application factory tool, the XBaya workflow composer, the Registry gadget, and the Experiment Builder gadget. All of these tools can now be compiled and deployed together or on separate servers using a single build command.
The OGCE also provided integration and consulting support for the following science gateways: GridChem (NCSA), a computational chemistry gateway, is now using the XBaya workflow composer; UltraScan (UTHSCSA), a biophysics gateway, is using OGCE's GFAC and supporting tools to prototype its new job submission infrastructure; the Expressed Sequence Tag (EST) Pipeline Portal is using the OGCE's advanced job submission tools to run tens of thousands of jobs on both local Indiana University and TeraGrid resources. The completion of this milestone should greatly improve the ability of scientists to use TeraGrid computing resources.

QuakeSim
Funding Agency: NASA

QuakeSim is a NASA-funded project to build a science gateway and supporting Web services for the earthquake science community. QuakeSim includes both earthquake fault spatial deformation and GPS time series analysis tools. During the current reporting period, we made several major upgrades to the deployed infrastructure, including the ability to create synthetic InSAR fringe diagrams that can be compared to direct observation. These tools were used prominently by PI Andrea Donnellan in studies of the aftermath of the Baja California, Mexico earthquake (see figure).

Figure: OGCE's workflow composer tool integrated with the GridChem science gateway's middleware. The figure shows a computational chemistry workflow chain of services (represented as boxes) that combine the CHARMM and Gaussian applications to calculate molecular structures.

Figure: Screenshot of displacement vectors (arrows) and InSAR deformation plot from a simulation of the April 2010 Baja earthquake, produced using the QuakeSim portal's online services.

OREChem
Funding Agency: Microsoft Research

OREChem is a collaboration between crystallographers, digital librarians, and cyberinfrastructure researchers to extend the Object Reuse and Exchange (ORE) specification to crystallography and more generally to the problem of integrated scientific information management.
The Community Grids Lab's role is to provide expertise in service-oriented computing and Grid computing. During the current reporting period, we developed a collection of REST services for processing OREChem Atom/XML feeds, converting them into RDF triples and storing them in our RDF triple store for later search and retrieval. We also implemented services for constructing and executing computational chemistry jobs on the TeraGrid using OREChem feed information. We used OGCE tools for this service composition and execution.

Figure: A subset of IU's OREChem services shown in the OGCE's XBaya workflow composer tool. These online services (boxes in the main canvas) are integrated to extract crystallographic information (molecular structures), create Gaussian computational chemistry input files from them, and run Gaussian jobs on the TeraGrid.

Multicore Project
Funding Agency: Microsoft, Inc.

This project is focused on programming models and runtimes that will be used on systems of multicore computers. These programming models are useful for scientific research in a variety of biomedical areas, including genetic and drug research, as well as other data intensive research. Initial work focused on the performance of threading versus Message Passing Interface (MPI) in both kernels and data mining. Current major areas are biomedical applications and data intensive technologies using Hadoop and Dryad.

MapReduce and its generalizations offer an attractive programming model for data intensive computing. In particular, our research is using, extending and evaluating Iterative MapReduce, which adds support for iterative problems to the core MapReduce capabilities of “map” followed by “reduce”. During the period, CGL had a major open source software release of “Twister” (http://www.iterativemapreduce.org/), developed as a novel prototype of i-MapReduce.
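The pattern of “map” followed by “reduce”, repeated until the result stops changing, can be sketched in a few lines. The following is a minimal single-process Python illustration of the iterative idea only. It is not Twister's API, and the function names (`map_phase`, `reduce_phase`, `iterative_mapreduce`) and the one-dimensional clustering example are assumptions chosen for illustration.

```python
from collections import defaultdict


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        yield nearest, p


def reduce_phase(pairs, centroids):
    """Reduce: average the points that were assigned to each centroid."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # A centroid with no assigned points keeps its previous value.
    return [
        sum(groups[i]) / len(groups[i]) if groups[i] else c
        for i, c in enumerate(centroids)
    ]


def iterative_mapreduce(points, centroids, max_iters=100, tol=1e-9):
    """Repeat map followed by reduce until the centroids stop moving."""
    for _ in range(max_iters):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if all(abs(a - b) <= tol for a, b in zip(updated, centroids)):
            return updated
        centroids = updated
    return centroids


# Hypothetical data: two obvious clusters on a line, two starting guesses.
result = iterative_mapreduce([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])
print(result)
```

A plain MapReduce job would stop after one map and one reduce. The outer loop is what makes the computation iterative, and in a distributed runtime it is this loop that forces the tighter synchronization between rounds.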
We will continue to look at the community and commercial MapReduce systems Hadoop and Dryad and feed back our lessons to their developers directly and through our papers. We have identified support of inhomogeneous problems (where currently dynamic scheduling in Hadoop sometimes outperforms the static task definition in Dryad) as one key issue. A challenge for Iterative MapReduce is maintaining the dynamic fault tolerance of current systems while extending support to iterative problems with tighter synchronization constraints.

II.2.3 Open Systems Lab (Andrew Lumsdaine, Director)

The OSL mission is to develop science and technology for computing with large-scale and pervasive hardware and software systems, to enable more productive computing and software development, and to foster economic development in the State of Indiana. Work in the Open Systems Laboratory (OSL) is motivated by the changing nature of modern information technology systems.

Projects Completed During Current Reporting Period

ST-CRTS: Collaborative Research: Lifting Compiler Optimizations via Generic Programming
Principal Investigator, Co-PIs: Andrew Lumsdaine (IU), Jaakko Järvi (Texas A&M)
Funding Agency: National Science Foundation
Award Number: CCF-0541335
Award Amount: $279,233
Effective dates: 2/15/2006 - 1/31/2010

Project Summary: The NSF-funded research team, which includes Andrew Lumsdaine at Indiana University and Jaakko Järvi at Texas A&M University along with their collaborators, has applied the principles of generic programming to improve optimization of computer software by compilers. Generic programming is a type of computer programming that uses non-specific basic instructions that can be tailored later to specific projects, saving time and reducing redundancy when writing code. Compilers are programs that convert source code written in one programming language into another language, often optimizing it along the way to improve software performance.
The optimizer in a compiler attempts to transform a given program into one that performs faster than the original program but still produces equivalent results. The goal of this project was to develop new programming techniques that improve software generally, for the benefit of science, business, education and society. Potential transformations often arise from general (algebraic) rules. For example, an elementary-school student learns that adding zero to any number x is an unnecessary computation; optimizers today routinely use such a rule to eliminate unnecessary computations. A high-school or perhaps college student learns that the same rule applies to adding the zero matrix to any (compatible) matrix, and indeed to the binary operation and the identity element of any monoid (a mathematical structure having such a law). Optimizers today are very unlikely to take advantage of these general rules, however. The view that compilers' optimizers have of programs is very "low-level," and, as a result, many optimization opportunities remain unrealized. To achieve the best performance, programmers must adapt how they write their programs, and often complicate them. The research team has demonstrated how "high-level" general rules (algebraic laws about operators and functions) can be represented, organized, and used by compilers' optimizers. As a result, programmers can use the abstractions best suited to producing correct programs and still obtain efficient programs. The team's approach is applied to C++, a commonly used programming language, and targets generic simplification rules, removal of redundant computations and data transfers, and similar optimizations.

Intellectual Merit: Compiler research has almost exhausted the optimization opportunities for generally applicable compiler optimizations based on properties of low-level operations.
High-level domain- or library-specific optimizations, on the other hand, are costly to implement and integrate into a compiler infrastructure, and thus seldom justified. This project strikes a balance between these two approaches, and offers an economical approach to high-level optimizations. The research team's work on structuring high-level optimizations leverages the principles of generic programming, in particular the categorization of types into concepts according to their capabilities. Defining algorithms in terms of concepts gives rise to generic algorithms that can operate on objects of many different types. Defining optimizations in terms of concepts similarly gives rise to generic optimizations that apply to operations over many different types. To enable the expression of generic, concept-based optimizations, an NSF-funded research team had worked toward direct language support for generic programming in C++. This work resulted in the "ConceptC++" extensions to C++, designed together with many collaborators. Using ConceptC++ and the ConceptGCC compiler developed by Doug Gregor within this project, and building on these achievements, the research team developed a generic simplifier whose transformations are guided by concepts and the "axioms" contained within them. The team also devised two prototype languages for writing compiler optimizations (and analyses) generically and thus reusing them across different types.

Broader Impact: Software is important in almost all aspects of modern life; improvements in software development methods and tools that make it easier to develop more efficient software thus translate into benefits to society, from scientific research to business and economics. The work in this project directly impacts the future development of mainstream programming languages that support generic programming, C++ in particular, and of their standard libraries.
The research team members participate in programming language standards bodies, collaborate with language and compiler implementers, and work to introduce language features that better support generic programming. The project has directly trained graduate students and post-doctoral researchers in the emerging field of generic programming, and will continue to do so in the future. Results from this research are integrated into graduate programming courses. Overall, the project advances discovery and understanding by freeing researchers and software professionals to focus more on the solutions to their scientific and development problems. This work also created infrastructure, namely concept-enabled compilers, that will enable future research in generic programming. The research team has published several papers on work performed within the project, given various presentations at academic conferences, and made software artifacts available for others to use.

Transformative Nature of Research: Today's mainstream compilers are impenetrable "black boxes" to programmers. Producing programs that perform efficiently often requires iteratively modifying the program in small ways to coerce the compiler into producing a fast executable. Optimizations are not under the control of the programmer. With the approach researched, developed, and advocated in this project, programmers are given this control. The end result is that the iterative tuning process can be drastically shortened, translating into greatly increased programming productivity. Moreover, when programmers no longer need to abandon their high-level abstractions to obtain performance, it becomes possible to express more complex problems. Ultimately, this can transform the way in which programmers approach optimizing their programs and their libraries.
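The idea of a simplifier guided by concepts and their axioms, described in the generic programming project above, can be sketched in a few lines. The project itself targets C++ with ConceptC++ and ConceptGCC; the Python below is only an illustration under stated assumptions. The `Monoid` class, the `MONOIDS` registry, the tuple-based expression format, and the symbolic constant "ZeroMatrix" are all hypothetical and are not taken from the project's software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Monoid:
    """Hypothetical 'concept': a binary operation and its identity element."""
    op: str
    identity: object


# Each entry declares that (type, operator) models the Monoid concept.
# The single axiom op(x, e) == op(e, x) == x is then shared by all of them.
MONOIDS = {
    ("int", "+"): Monoid("+", 0),
    ("int", "*"): Monoid("*", 1),
    ("matrix", "+"): Monoid("+", "ZeroMatrix"),
}


def simplify(expr, type_name):
    """Remove identity operands from nested (op, lhs, rhs) expression tuples.

    Leaves are variable names (strings) or constants. The rewrite rule is
    written once against the Monoid concept, not once per concrete type.
    """
    if not isinstance(expr, tuple):
        return expr
    op, lhs, rhs = expr
    lhs = simplify(lhs, type_name)
    rhs = simplify(rhs, type_name)
    monoid = MONOIDS.get((type_name, op))
    if monoid is not None:
        if rhs == monoid.identity:
            return lhs
        if lhs == monoid.identity:
            return rhs
    return (op, lhs, rhs)


# (x * 1) + 0 over integers, and ZeroMatrix + (A + B) over matrices.
int_result = simplify(("+", ("*", "x", 1), 0), "int")
matrix_result = simplify(("+", "ZeroMatrix", ("+", "A", "B")), "matrix")
```

The design point is that adding a new type that models the concept only requires a new registry entry; the simplification rule itself is never rewritten.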
Projects Achieving Major Milestones during the Reporting Period

Coordinated Fault Tolerance for High Performance Computing
Funding Agency: U.S. Department of Energy

We have focused our efforts in Open MPI on reliability improvements and on expanding support for the CIFTS Fault Tolerance Backplane (FTB). As part of the reliability improvements, we matured the process fault recovery operations to support run-through stabilization, reactive automatic recovery, and proactive process migration. Run-through stabilization supports continuing research into fault tolerant MPI semantics and applications that can continue processing even though some processes may have failed. At Supercomputing 2009, we demonstrated a fault tolerant version of POV-Ray using Open MPI's stabilization feature and the CIFTS FTB. The proactive process migration feature allows end users to move processes away from predicted failures and planned system outages. The reactive automatic recovery feature provides end users with a transparent, automatic recovery mechanism when an unexpected process failure occurs. As part of our expanding support for the CIFTS FTB, we have improved the internal error reporting mechanisms by adding a stable reporting interface, called OPAL SOS, which can report directly to the FTB. Additionally, we have been collaborating with CIFTS FTB partners to standardize fault events and workflows, enhancing the overall resiliency of HPC systems by encouraging adoption of the FTB. Alongside this work, we added support for checkpoint/restart-based parallel debugging in Open MPI, which can dramatically shorten the debugging cycle and save software developers hours or days of debugging time.

Open Source Cluster Application Resources (OSCAR)

Since version 6.0.x we have restructured the main framework of OSCAR to make the system more reliable and to lower the learning curve for developers who want to contribute code.
The restructuring of the OSCAR main framework is complete and the trunk of the OSCAR SVN repository has stabilized. As promised for OSCAR 6.0.x, 'yum install oscar' works with the OSCAR-specific repository setup on OSCAR 6.0.5. We still have to test all the features of the new release, which currently depends on help from the OSCAR community, and we need to find a way to test OSCAR systematically. We believe systematic testing should be part of the new release even though it is unrelated to OSCAR's new features. With systematic testing in place, we will be able to focus, as in previous OSCAR releases, on supporting more distributions and platforms. So far we support RHEL5 (X86, X86-64), Debian (X86, X86-64), and Ubuntu (X86, X86-64).

Development and Improvement of a Tissue-Simulation
Funding Agency: National Institutes of Health

The OSL is collaborating with the Biocomplexity Institute at Indiana University to provide the Tissue-Simulation Environment (TSE), an open-source, multiscale modeling environment for cell-based modeling of the development, structure, behavior and pathologies of tissues and organs. The TSE will build upon the current Cellular Potts Model-based modeling environment, CompuCell3D, and the Systems Biology Workbench to allow simple model development by both modelers and experimentalists, provide a framework for model sharing, support SBML and CellML, and allow transparent selection of the level of modeling detail. The software will include graphical user interfaces and support for parallel computing. During this reporting period, we developed and demonstrated a solution for performing parameter studies of CompuCell3D models that combined workflows with IU’s Big Red supercomputer to perform the simulations. The open source software VisTrails (vistrails.org) was used to construct the workflows and handle data provenance.
One workflow automatically constructed multiple sets of parameter values and remotely invoked (via Globus Toolkit) simultaneous CompuCell3D jobs on Big Red. Another workflow retrieved the resulting output data and rendered images in VisTrails. This project offered a valuable alternative to the traditional, workstation-based CompuCell3D application.

Caption: Portion of VisTrails workflow (left) and resulting spreadsheet of cell sorting simulations (right) from a parameter study run on Big Red.

Causal Connectivity and Computations in Hundreds of Neurons in Cortex
Funding Agency: National Science Foundation

The OSL is collaborating with John Beggs (Physics, IU) to determine “causal” connectivity between hundreds of neurons in cortical networks and to determine computational operations in neurons where causal connections converge. Causal connectivity has been conceptualized in many ways, but this project adopts the definition given by Norbert Wiener: “For two simultaneously measured signals, if we can predict the first signal better by using the past information from the second one than by using the information without it, then we call the second signal causal to the first one.” In this sense, the field does not use the term “causal” literally, but to indicate predictive value. As a directional measure, causal connectivity cannot be deduced merely from non-directional measures like correlations or firing rates. During this reporting period, we have obtained some initial data from the experimentalists and have begun developing software applications for visualization and analysis. Our goal is to provide open source tools that neuroscience researchers are able to freely download, use, and extend. As the datasets grow in size, we will want to apply high-performance computing to the analysis.

Caption: The open source ParaView application, used to visualize neuron firing data.
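Wiener's definition quoted in the neuron connectivity project above can be made concrete with a small worked example. The sketch below is an assumption-laden illustration, not the project's analysis software: it uses synthetic zero-mean signals instead of recorded spike data, a single time lag, and ordinary least squares without an intercept. The function names (`rss_own_past`, `rss_with_other`, `predictive_gain`) are hypothetical.

```python
import random


def rss_own_past(x):
    """Residual sum of squares when x[t] is predicted from x[t-1] alone."""
    past, now = x[:-1], x[1:]
    a = sum(p * n for p, n in zip(past, now)) / sum(p * p for p in past)
    return sum((n - a * p) ** 2 for p, n in zip(past, now))


def rss_with_other(x, y):
    """Residual sum of squares when x[t] is predicted from x[t-1] and y[t-1]."""
    px, py, now = x[:-1], y[:-1], x[1:]
    sxx = sum(p * p for p in px)
    syy = sum(q * q for q in py)
    sxy = sum(p * q for p, q in zip(px, py))
    sxn = sum(p * n for p, n in zip(px, now))
    syn = sum(q * n for q, n in zip(py, now))
    # Solve the 2x2 least-squares normal equations with Cramer's rule.
    det = sxx * syy - sxy * sxy
    a = (sxn * syy - syn * sxy) / det
    b = (syn * sxx - sxn * sxy) / det
    return sum((n - a * p - b * q) ** 2 for p, q, n in zip(px, py, now))


def predictive_gain(x, y):
    """Fraction of x's prediction error removed by also using y's past."""
    return 1.0 - rss_with_other(x, y) / rss_own_past(x)


# Synthetic signals (assumption): y is noise, and x follows y with a lag of one.
rng = random.Random(0)
y = [rng.gauss(0.0, 1.0) for _ in range(500)]
x = [0.0] + [0.8 * y[t - 1] + 0.2 * rng.gauss(0.0, 1.0) for t in range(1, 500)]

gain_y_to_x = predictive_gain(x, y)  # large: the past of y helps predict x
gain_x_to_y = predictive_gain(y, x)  # small: the past of x does not help predict y
```

The two gains differ sharply even though the measure is built from the same pair of signals, which is the directional property that correlations and firing rates alone cannot provide.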
Caption: Movie frames of neuron firing: frames N and N+3 (3 ms later), depicting possible correlation between one neuron and adjacent neurons (yellow circle).

A Declarative Approach to Managing the Complexity of Massively Parallel Programs
Funding Agency: National Science Foundation

Our current focus is on identifying and experimenting with declarative abstractions that make it easier to write parallel codes, especially when programming in the Bulk Synchronous Parallel (BSP) style. We have continued to explore the parallel algorithms exemplified by the thirteen Berkeley Dwarfs, and have examined how their communication patterns might be expressed declaratively. We recently finished the preliminary design of Kanor, our declarative parallel programming language; Kanor's declarative communication constructs are based on list comprehensions and array slices. We have implemented a prototype of Kanor as a C++ template library, and have begun porting the Berkeley Dwarfs from MPI to Kanor. Our initial experience is that Kanor communication code is shorter and simpler than the MPI equivalent, at least for codes written in the BSP style.

II.2.4 Complex Networks and Systems Group (Alex Vespignani, Director)

The Complex Networks and Systems Group (CNetS) is hosted at the IU School of Informatics and brings together faculty from different units across campus working in the broad areas of complex networks and systems. The center's activities include modeling and mining of complex information, technological and social networks, agent-based systems, computational social sciences, artificial life, and computational epidemiology. The center receives funds from the Lilly Foundation through the PTI, and from NSF, NIH and a number of private foundations and corporations.
Projects Completed During Current Reporting Period

Designing an Effective HIV Prevention Plan for Botswana by Coupling an Information Network Model with a Meta-population Transmission Model
Principal Investigator: Alessandro Vespignani
Funding Agency: University of California, Los Angeles (UCLA)
Award Number: Subaward 2000 G MF 329
Award Amount: $37,500
Effective Dates: May 01, 2009 – April 30, 2010

Project Summary: We have used an interdisciplinary approach to design a novel theoretical framework, based on network science, that will aid in developing effective health policies for controlling the HIV epidemic in Botswana, a resource-constrained country in Sub-Saharan Africa. Our research links mathematics, physics, epidemiology, public policy and public health. We decided, as the initial stage, to take a complex biological model that the collaborating group of Professor Blower published in Science in January 2010 (Science. 2010 Feb 5;327(5966):697-701.) and to add a network structure linking individuals in the model. We then plan to expand this network model so that it reflects heterosexual transmission and to apply the model to Botswana. The funding that we were awarded from the National Academies Keck Futures Initiative (NAKFI) is helping us collect preliminary results and identify new research directions. Once we obtain preliminary results we will seek future support from the National Institute of Allergy and Infectious Diseases (NIAID) at the National Institutes of Health (NIH).

Intellectual Merit: The project is working on the development of a new class of models to examine the effect of network dynamics on the spread of drug-resistant strains of HIV. The intention is to understand how and where the virus is spreading and how it is likely to spread in the future, in order to create a targeted approach to prevention and treatment of HIV.
Broader Impact: Our research goals are only achievable through an interdisciplinary collaboration between specialists in very different fields. The collaboration that we have been able to form, through NAKFI funding, is truly interdisciplinary and synergistic.

Transformative Nature of Research: This is the first approach to the problem that will contain insights based on our new interdisciplinary methodology using network science. It can potentially change how HIV is managed in Botswana.

Societal Benefits: According to the international AIDS charity, AVERT, Botswana is among the hardest hit places on earth, with an estimated one-in-four adults living with HIV. Average life expectancy in Botswana is currently less than forty years. Modeling the disease in Botswana will lead to health policies that will save lives. Improving the techniques used in modeling and predicting the spread of infectious disease can also improve treatment and prevention of disease worldwide.

How Network Structure Gives Rise to Dynamical Complexity
Principal Investigator, Co-PIs: Larry Yaeger (PI), Olaf Sporns (Co-PI)
Funding Agency: National Academies, Keck Futures Initiative
Award Number: NAKFI CS22
Award Amount: $50,000
Effective Dates: May 1, 2009 – May 31, 2010

Project Summary: We are applying a combination of an information theoretic measure of neural complexity and network science / graph theoretical tools to the neural dynamics and network topologies of artificial neural networks evolved to control agents in a computational ecosystem. This research will be useful in developing new and better types of artificial intelligence. The specific goal is to understand the relationship between network structure and network function, in general, and, specifically, to determine which structural characteristics are most predictive of and most likely to confer dynamical complexity in artificial neural networks.
We have demonstrated a relationship between increasing clustering coefficient, decreasing path length, and a bias towards small-world networks that is directly correlated with increasing neural complexity during a period of behavioral adaptation to the environment. This suggests a convergence between evolution for network functionality and previously elucidated evolution for physical constraints, such as wiring length and brain volume.

Intellectual Merit: We combine a sophisticated artificial life model (Polyworld) with a powerful collection of graph theoretical tools (the Brain Connectivity Toolbox) and the gold standard information theoretic complexity metric (Tononi, Sporns, Edelman). The computational model has been designed to force natural selection to evolve the statistics of network connectivity rather than specific network designs, and it records all network topologies and neural dynamics. By evolving the agents in an environment with heterogeneous resources, we are able to identify periods of behavioral adaptation to the environment as the population approaches an Ideal Free Distribution, and to focus our attention on complexity growth and changes in network topology during these periods.

Broader Impact: We have developed a C++ version of the (MATLAB) Brain Connectivity Toolbox (BCT), speeding it up by approximately a factor of 30. It is available at http://code.google.com/p/bct-cpp/. We have also provided wrappers for Python calls into this library, and expect to provide wrappers for other languages in the future. There are a substantial number of users of the original BCT, with uses ranging from its intended purpose of neural network analysis to the design of a lens for a space telescope, so we expect to have a significant impact on the broader scientific community.

Transformative Nature of Research: Improvements in our understanding of the relationship between network structure and network function may impact many fields of science.
Knowledge of the specific topological features associated with high dynamical complexity may allow us to shape the search space for evolution in such a way as to promote higher levels of artificial intelligence in shorter timeframes.

Societal Benefits: The dynamics of social networks are being used to illuminate everything from online communication to file sharing to the spread of disease. Though our work is not currently targeted at diagnosis, it is possible that structural breakdowns resulting in neurological disorder could be better discovered and understood given the insights we are generating.

Projects Achieving Major Milestones

Global Epidemic and Mobility Model
Funding Agency: National Institutes of Health, U.S. Defense Threat Reduction Agency, Abbott, ISI Foundation

The Global Epidemic and Mobility (GLEaM) model provided real-time forecasts of the unfolding of the H1N1 epidemic worldwide. This modeling effort has been unique in that it has been the only one attempting to obtain quantitative results worldwide. The need for a new way to obtain real estimates of the disease parameters pushed the team to develop a new methodology that performs a likelihood analysis of the model with respect to chronological data on the diffusion process. This methodology allowed us to obtain early estimates of the transmission potential of the H1N1 virus by taking advantage of the multi-scale diffusion processes defined by the population mobility networks. This is the only model coupling countries worldwide, and this feature is extremely relevant in evaluating the time pattern of emerging infectious diseases. The early results have been validated by a posteriori analysis against the real data collected by the CDC in the months of May and June. The agreement between the predictions and the actual unfolding of the pandemic has proven to be remarkable.
The GLEaM approach was then used in June and July to provide long-term predictions of the epidemic activity peak in Northern Hemisphere countries in the winter. The method anticipated an early peak occurring in October/November in most of the countries. The predictions, of a quantitative nature (peak week and relative 95% reference range), were published in early September in BMC Medicine. This is the only paper so far that has attempted a quantitative forecast of the activity peaks. Since January 2010, the predictions contained in the paper have been validated against the real data provided by agencies of more than 40 countries. The results show very good agreement between predictions and real data, with an offset of at most two weeks. These findings provide a strong and remarkable test of the quantitative level of prediction offered by computational methods.

Caption: Epidemic activity worldwide on Oct 26, 2009 according to the GLEaM computational platform. The color scale indicates the number of infected people.

II.3 Educational Activities and Workforce Development

The following students from the Digital Science Center completed degrees during the reporting period.

Student Name | Degree Type | Lab
Sashikiran Challa | MS in Cheminformatics | Community Grids Lab
Jun Ji | MS in Computer Science | Community Grids Lab
Karthik Muthuraman | MS in Bioinformatics | Community Grids Lab
Jaliya Ekanayake | PhD in Computer Science | Community Grids Lab
Tak-Lon Wu | MS in Computer Science | Community Grids Lab
Mark Meiss | PhD in Computer Science | CNetS
Diep Hoang | MS in Computer Science | CNetS
Matthew Whitehead | PhD in Computer Science | CNetS
Prabhanjan Kambadur | PhD in Computer Science | Open Systems Lab

The following chart shows employees who were hired by or left the Digital Science Center during the reporting period.
Name | Lab | Status
Fugang Wang | Community Grids Lab | Hired
Andrew Younge | Community Grids Lab | Hired
Quenrui Cai | Community Grids Lab | Hired
Scott Beason | Community Grids Lab | Terminated
Adam Hughes | Community Grids Lab | Hired
John McCurley | CNetS | Hired
Snehal Patil | CNetS | Hired
Ying Wang | CNetS | Hired
Torsten Hoefler | Open Systems Lab | Left OSL to work for Blue Waters Directorate, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
William Byrd | Open Systems Lab | Hired as Post Doc Associate

III. Data to Insight Center
Beth Plale, Director

III.1 Data to Insight Center Mission and Activity Summary

The mission of the Data to Insight Center (D2I) is to create tools and guidelines that allow scientists and companies to harness the vast stores of digital data now being produced, and to turn these data into insight that effectively guides human decisions and advances human knowledge. This includes:

• Creation of tools and guidelines for archiving large-scale and complex data and information. Data being collected today may be valuable for decades or centuries in the future; in some cases, data will be of value in perpetuity. The Data to Insight Center will create tools for storing data in ways that are reliable. Decades from now a person will be able to ask, “Is this data set really what it claims to be?” and know that that data set can be used with confidence. Related to this, the Data to Insight Center will develop tools for the maintenance and expansion of digital data sets over time so that one can not only find the most recent data but also ask the question, “What was the data set as of a certain date in the past?” and get a definitive answer.

• Creation of tools for listing and discovering data sets. The great library of Alexandria, founded about 300 BC, had as its simple goal the collection of copies of all books ever written in the world.
Today’s collections of data are too vast, and in many cases too sensitive, to be held in any one place, and the information technology challenges of managing libraries of data are much different from those of managing libraries of books. What is needed today is not a universal library of data, but rather a universal library catalog of data. The Data to Insight Center will work to develop such a catalog, including the tools to create it and the tools to look up data referenced in the catalog.

• Creation of tools for using and understanding large data sets. This will include continued development of tools that allow analysis of weather data in real time to better predict hurricanes and tornadoes. It will also include development of tools for automated inspection of data. Such tools will analyze data and present to a human visualizations of a data set that include potentially interesting trends or new discoveries.

D2I includes the following labs and support units:

Center for Data and Search Informatics - Beth Plale, Director
Visualization and Interactive Spaces Lab - M. Pauline Baker, Director
IU Digital Library Program - Robert McDonald, Director
University Information Technology Services (UITS) Research Technologies (RT) Visualization and Futures Division - Eric Wernert, Director
UITS/RT Systems Division - Matt Link, Director (Statistics for this group are included with DSC in previous section.)

Center Highlights January 1 – May 31, 2010

The Data to Insight Center has seen impressive success this spring. The Center continues to pursue and develop opportunities along its three thrusts: Scientific Data Preservation; Sustainability, Climate and the Environment; and Data at Scale. Highlights for the period are presented in non-technical language in the bullets below. The sections that follow provide more detail in technical terms.
• Robert McDonald was appointed as Executive Director of the newly funded KUALI Open Library Environment, which received its initial match funding of close to a million dollars from the Andrew W. Mellon Foundation.

• The Center has successfully participated in the NSF-funded Vortex2 field effort to better understand tornadoes. By leveraging IU’s extensive investment in high performance computing, Data to Insight produces a handful of short-term, highly accurate weather forecasts every morning that are instantly made available to researchers in the field through their mobile phones. The LEAD II/Vortex2 effort is supported in part by Microsoft.

• Contributing to the advancement of Science, Technology, Engineering and Math (STEM) education, D2I’s Beth Plale collaborated with the Research Technologies Advanced Visualization Lab on a stereo movie, funded by the NSF TeraGrid, aimed at middle-school children and concerning the value of computational science and its impact on the lives of every one of us.

• In an effort that overlaps sustainability, climate and the environment, and scientific data preservation, D2I organized and led Indiana University’s participation in a $20M proposal to the National Science Foundation Sustainable Digital Data Preservation and Access Network Partners (DataNet) program. The project would help to develop techniques to preserve valuable data related to meteorological science. Word on the proposal, which received a successful NSF site visit in February 2010, is expected in summer 2010. The proposal is in partnership with the University of Michigan and the University of Illinois at Urbana-Champaign, with the University of Michigan as the lead.

• In the area of Data at Scale, or managing massive sets of scientific data, D2I is pursuing the establishment of a research center that will allow research access to the HathiTrust digital repository. HathiTrust is a digital repository used for digitally storing shared university library content.
This effort is related to but not completely dependent upon the outcome of the Google Books Settlement Agreement.

• The Data to Insight Center has hired postdoctoral researchers into all of its open positions. Stacy Kowalczyk from the IU Digital Library Program joined in May 2010 and will fill a gap in the Scientific Data Preservation thrust. Mehmet Aktas, who has a strong background in the computer science area of systems, will join in June and will add strength to the Data at Scale thrust. Gamal El Afandi, an atmospheric scientist with collaborative ties to Purdue University and the National Center for Atmospheric Research (NCAR), will join in August and add strength to the Sustainability, Climate and Environment thrust.

• The Data to Insight Center is launching a new fellows program. The call for proposals is expected to be announced in late summer 2010.

Scholarly Accomplishment

A summary of the scholarly accomplishments of the Data to Insight Center during this reporting period is provided below:

Group | Publications | Technical Presentations | Inventions Disclosed | Open Source Software Distributed | Online Services Provided | Public and Governmental Service Activities
Center for Data and Search Informatics | 4 | 7 | 0 | 2 | 1 | 1
Digital Library Program | 1 | 7 | 0 | 0 | 0 | 0
Visualization and Interactive Spaces Lab | 0 | 0 | 0 | 1 | 0 | 0
Data to Insight Center Total | 5 | 14 | 0 | 3 | 1 | 1

Educational Activities

The following table provides a summary of the educational activities of the Data to Insight Center during this reporting period:

Group | Undergraduate Student Employees/Interns | M.S. Students Employed by D2I | Ph.D. Students Employed by D2I | Undergraduate Degrees Awarded | M.S. Degrees Awarded | Ph.D. Degrees Awarded | Education, Outreach and Training Events
Center for Data and Search Informatics | 0 | 15 | 10 | 0 | 3 | 0 | 0
Digital Library Program | 2 | 10 | 3 | N/A | N/A | N/A | 7
Visualization and Interactive Spaces Lab | 3 | 1 | 0 | 3 | 0 | 0 | 1
Data to Insight Center Total | 5 | 26 | 13 | 3 | 3 | 0 | 8

Funded Research

The following table provides a summary of grants submitted and grants received by the Data to Insight Center during the current reporting period:

Group | Number of Grant and Contract Proposals Submitted | Total Dollar Amount of Proposals Submitted | Number of Grant and Contract Proposals Awarded | Total Dollar Amount of Proposals Awarded
Center for Data and Search Informatics | 6 | $25,366,692 | 2 | $310,066
Digital Library Program | 3 | $920,003 | 1 | $932,000
Visualization and Interactive Spaces Lab | 1 | $787,815 | 0 | 0
Data to Insight Center Total | 10 | $27,074,510 | 3 | $1,242,066

III.2 Data to Insight Center Research Highlights

Projects achieving major milestones during reporting period

LEAD II/Vortex2
Funded in part by Microsoft

LEAD II is a continuation and extension of the original Linked Environments for Atmospheric Discovery (LEAD) project, funded by the National Science Foundation (NSF), which makes meteorological data, forecast models, and analysis and visualization tools available for research and education in problems related to climate, environment and the atmosphere. LEAD II contributes to Vortex2 by providing timely, customized daily weather forecasts for the Vortex2 field team's specific location each day during their nomadic field effort to understand tornado behavior. The resulting images can be accessed through a smart phone or on the Web. The LEAD II/VORTEX2 partnership includes Plale and other scientists from IU’s Pervasive Technology Institute in addition to atmospheric scientists Keith Brewster of the University of Oklahoma (OU) and Craig Mattocks of the University of North Carolina, Chapel Hill. D2I’s involvement in VORTEX2 also provides an invaluable educational experience for IU graduate students. The students working on LEAD II are solving real-world problems with strict time constraints.
It’s the kind of experience that only comes from hands-on research. We can’t control the weather, but we can help to predict it.

LEAD II is funded in part by Microsoft Corporation and the NSF. It utilizes the Microsoft Trident Scientific Workflow Workbench, the ARPS Data Analysis System (ADAS) services from OU, and the Weather Research and Forecasting (WRF) model, and employs the IU Big Red supercomputer and the TeraGrid, NSF’s national network of high performance computing and data storage resources.

Major milestones:
• For the month of May, produced 186 weather forecasts and 1,300 weather images
• 11 GB of data downloaded from the web site by active users

To view the latest forecasts on the project website or to use the Mobile Viewer link, go to http://dataandsearch.org/dsi/vortex2.

Karma Provenance collection and representation toolkit (http://www.dataandsearch.org/provenance)

The Karma project announced a new release of the core Karma provenance collection system, Karma v3.0, which contains instrumentation using Axis2 handlers, more extensive test clients, and better documentation. Karma v3.0 supports provenance activities published from services, workflows and nested workflows. The provenance data is stored efficiently in a relational database, and Karma supports the Open Provenance Model (OPM) v1.0 standard for interfacing with the tool. Karma has two active subprojects, NetKarma and Instant Karma, both of which have significant milestones to report this quarter.

NetKarma
Funding agency: National Science Foundation through BBN, Corp.

NetKarma is a three-year effort to demonstrate provenance collection in the Global Environment for Network Innovations (GENI) global network. The project reached several milestones in this period.

Significant milestones:
Demonstration of NetKarma: Demonstrated provenance collection from the experimental layer of PlanetLab.
This was accomplished by working with Jeannie Albrecht of Williams College and Amin Vahdat of UC San Diego, the team that wrote the GUSH experiment control tool. Demoed at the GENI Engineering Conference, March 2010. The prototype collection code, called “gush2netkarma-1.0”, is linked from the NetKarma GENI wiki page and available at the following location: https://globalnoc.iu.edu/grnoc-internal/file-bin/systemengineering/softwareprojects/netkarma/gush2netkarma-1.0.tar.zip

The NetKarma poster used at the demo session can be found here: http://groups.geni.net/geni/attachment/wiki/netKarma/Netkarma_poster_gec7.pdf

Instant Karma: Applying a Proven Provenance Tool to NASA’s AMSR-E Data Production Stream
Funding agency: NASA

Instant Karma is a two-year project funded to examine and provide provenance collection for the NASA AMSR-E ingest pipeline. The AMSR-E is an imaging instrument on board a NASA satellite. The products produced from the satellite include imagery of sea ice over the North Pole. In the first phase of this project (which began in March 2010), we focus on improving the usability of sea ice imagery by collecting provenance about the processes applied and providing that information along with the sea ice imagery when it is posted to a data clearinghouse, the National Snow and Ice Data Center (NSIDC). In the abbreviated period since the project began, several milestones of a startup nature were achieved:

The Instant Karma management and technical teams worked out project start-up issues. Primary activities involved sharing information about the different technologies brought to the project (Karma provenance collection tool and AMSR-E product generation environment).

The Instant Karma kickoff meeting was held April 19 at UAHuntsville. The primary purpose of the meeting was to walk through the AMSR-E Sea Ice product generation processing workflow in detail, identifying potential provenance information collection points.
The team identified several potential science use cases and outlined plans for the Karma presentation at the AMSR-E Science Team meeting to be held early in June.

The team set up a testbed machine at University of Alabama Huntsville. The testbed machine will run an experimental version of the AMSR-E processing pipeline and will serve as a testbed for experimentation. This system will be populated with sample AMSR-E L2A Brightness Temperature data, to be used as input for the daily Sea Ice data products.

AMSR-E Science Team meeting: scientists engaged in snow and ice research assembled at an AMSR-E Science Team meeting. Our team presented its goals for provenance collection of the AMSR-E pipeline. The audience was responsive and provided good feedback.

Figure: Slide used to explain a use case for Instant Karma.

XMC Cat
Funding agency: National Science Foundation

XMC Cat is a web service toolkit for capturing and storing metadata during the execution of scientific workflows to enable data discovery and reuse. Its advantages include adaptability to domain schemata through configuration instead of code changes, support for automatic capture of metadata through curation plugins, and search and browse capabilities through a web-based GUI that dynamically adjusts to the domain schema. This allows XMC Cat to be deployed in different scientific domains without requiring new code to be written. It is currently in use in the LEAD Science Gateway.

Significant milestones include:
• Current version 1.2.6 released
• Project makes the news (http://newsinfo.iu.edu/news/page/normal/14523.html), including being picked up by ACM TechNews, 26 May 2010, and PhysOrg.com (http://www.physorg.com/news193928800.html)

KUALI OLE (Open Library Environment)
Funding agencies: Kuali OLE Partners and the Andrew W.
Mellon Foundation Kuali OLE is an open source software partnership that brings together funding from the Kuali OLE Founding Partners (Indiana University, Duke University, Lehigh University, North Carolina State University, University of Chicago, University of Florida, University of Maryland, University of Pennsylvania, and the University of Michigan) and The Andrew W. Mellon Foundation to build an open and extensible library management system using a services oriented architecture approach. This software will be developed to deliver management functionality to academic libraries in new ways and with a modern technology approach. The project timeline for Kuali OLE is April 2010-July 2012. Significant milestones: Receiving the initial matching funding of $932,000 from The Andrew W. Mellon Foundation The formation of the Kuali OLE Board and the Kuali OLE Functional Council. VIVOWeb Funding agency: National Center for Research Resources/National Institutes of Health under U24 Funding Track (Enabling National Networking of Scientists and Resource Discovery) VIVOWeb is a two-year (Oct 2009-Sept 2011) large-scale research project that involves the University of Florida, Cornell University, and Indiana University including the IU School of Library and Information Science’s Cyberinfrastructure for Network Science Center, and Information Visualization Lab, the IU Digital Library Program (D2I), the IU Libraries, and UITS Identity Management Services. VIVOWeb was funded under U24 (Enabling National Networking of Scientists and Resource Discovery) funds in order to provide a social networking platform for national use in translational medicine. 
The goals of VIVOWeb are to expand upon the successful VIVO social networking software developed at the Cornell University Libraries (http://vivo.cornell.edu) and enhance its usability by building a network of institutions that use the software in a national federation (University of Florida, Cornell University, Indiana University, Washington University, Weill Cornell Medical College, Scripps Research Institute, and Ponce Medical School). The IU DLP and D2I will provide the production implementation of VIVOWeb at Indiana University and will work with UITS Identity Management Services and the Office of the Provost to promote the service as a unique resource to IU faculty that will enable wider engagement in many areas of research and in building stronger grant proposals among like disciplinary researchers.

The counterpart grant funded under this U24 proposal is a Harvard-based project called eagle-i. Eagle-i will devote its two-year period to providing a network tool for scientific resources. Eagle-i and VIVOWeb will work together to help build upon each network’s strengths in software development, visualization, and adoption and outreach within the translational medicine community.

Significant milestones achieved in the Spring 2010 period include:
• Successfully installed production-test VIVO software version 1.0 at IU <http://vivo.iu.edu/vivo/>
• Hired VIVO programmer, Brian Keese.
• Held technical roadmap meeting at IU March 3, 2010.
• Developed first VIVO National Conference to be held August 2010, New York City.

Pursuit of Visualization Collaboration and Grant Opportunities

In an effort to increase external partnerships and funding from the national visualization scene, the Data to Insight Center funded two personnel from the RT-Visualization Department to attend the Department of Energy Computer Graphics Forum in Park City, Utah. This forum provided an opportunity to network with major thought leaders and contributors involved in DoE visualization projects.
One of RT-V's strategic objectives is to define and further the national agenda on scalable immersive visualization (encompassing both high-end and low-cost systems). Funding this trip, and others aimed specifically at accomplishing this objective, relates directly to the thrusts identified by the Center. This particular trip included side trips to Desert Research Institute (leader in environmental sciences through the application of knowledge and technologies to improve people's lives) and U.C. Davis.

Objectives which this trip made possible:
• Promote IU’s agenda for sustainable and scalable immersive visualization in the national community
• Better align PTI/D2I and RT-V for participation in NSF and DoE funding opportunities
• Demonstrate IU's unified agenda between Eric Wernert (IU Veteran) and Bill Sherman
• Participate on a panel on how emerging technologies can contribute to the visualization process

III.3 Educational Activities and Workforce Development

The following students from the Data to Insight Center completed degrees during the reporting period.

Student Name | Degree Type | Lab
Sharanya Chinnusamy | Masters of Computer Science | DSI
Tejas Totade | Masters of Computer Science | DSI
Ashish Bhanga | Masters of Computer Science | DSI
Pascal Lola | Bachelors of Science | VISLab
Mansoor Siddeeg | Bachelors of Science | VISLab
Ginger White | Bachelors of Science | VISLab

The following chart shows employees hired by or leaving the Data to Insight Center during the reporting period.
Name | Lab | Status
Brian Keese | DLP | Hired April 2010
Felix Terkhorn | DSI | Hired April 2010
Stacy Kowalczyk | DSI | Post Doc Research Scientist, Hired May 2010
Prashant Sabhnani | DSI | Hired May 2010, Summer RA
Aparna Rao | DSI | Hired May 2010, Summer Hourly
Abhijeet Kodgire | DSI | Hired Spring 2010, Hourly, Summer RA
Shobana Krishnan | DSI | Hired Spring 2010, Hourly, Summer RA
Kavitha Chandrasekar | DSI | Hired May 2010, Summer RA
Prajakta Purohit | DSI | Hired May 2010, Summer RA
Zong Peng | DSI | Hired May 2010, Summer Hourly
Kalani Ekanayake | DSI | Hired May 2010, Summer Hourly
Sharanya Chinnusamy | DSI | Graduated, Hired by Microsoft
Tejas Totade | DSI | Graduated
Ashish Bhanga | DSI | Graduated, Hired by Microsoft
Pascal Lola | VISLab | Graduated
Mansoor Siddeeg | VISLab | Graduated
Ginger White | VISLab | Graduated

IV. Center for Applied Cybersecurity Research
Fred Cate, Director

IV.1 Center for Applied Cybersecurity Research Mission and Activity Summary

The Center for Applied Cybersecurity Research (CACR) works to enhance the security and integrity of information systems, technologies, and content by facilitating research and education informed by, and integrated with, the practice of information assurance. CACR helps to coordinate and integrate the research, teaching, and practice of more than 70 cybersecurity professionals at Indiana University. The key element is a highly interdisciplinary approach (the only cybersecurity program in the nation to include law and business schools) that integrates theory and practice.
The CACR includes the following labs and support units:

Previously Existing
• Center for Applied Cybersecurity Research – Fred Cate, Director; Kay Connelly, Senior Associate Director
• Advanced Network Management Lab – Steve Wallace, Director; Gregory Travis, Assistant Director
• University Information Technology Services (UITS) Research Technologies (RT) Life Sciences Division – William Barnett, Director

Scholarly Accomplishment

A summary of the scholarly accomplishments of the Center for Applied Cybersecurity Research during this reporting period is provided below:

Group | Publications | Technical Presentations | Inventions Disclosed | Open Source Software Distributed | Online Services Provided | Public and Governmental Service Activities
Applied Cybersecurity Research | 25 | 16 | 0 | 0 | 0 | 0
Advanced Network Management Lab | 1 | 2 | 0 | 0 | 0 | 0
Center for Applied Cybersecurity Research Total | 26 | 18 | 0 | 0 | 0 | 0

Educational Activities

The following table provides a summary of the educational activities of the Center for Applied Cybersecurity Research during this reporting period:

Group | Undergraduate Student Employees/Interns | M.S. Students Employed by CACR | Ph.D. Students Employed by CACR | Undergraduate Degrees Awarded | M.S. Degrees Awarded | Ph.D. Degrees Awarded | Education, Outreach and Training Events
Applied Cybersecurity Research | 0 | 0 | 0 | 0 | 0 | 0 | 13
Advanced Network Management Lab | 0 | 0 | 0 | 0 | 0 | 0 | 0
Center for Applied Cybersecurity Research Total | 0 | 0 | 0 | 0 | 0 | 0 | 13

Funded Research

The following table provides a summary of grants submitted and grants received by the Center for Applied Cybersecurity Research during the current reporting period:

Group | Number of Grant and Contract Proposals Submitted | Total Dollar Amount of Proposals Submitted | Number of Grant and Contract Proposals Awarded | Total Dollar Amount of Proposals Awarded
Applied Cybersecurity Research | 7 | $14,297,012 | 3 | $149,786
Advanced Network Management Lab | 1 | $300,000 | 0 | $0
Center for Applied Cybersecurity Research Total | 8 | $14,597,012 | 3 | $149,786

IV.2 Center for Applied Cybersecurity Research Highlights January 1-May 31, 2010

The Center for Applied Cybersecurity Research works to improve the quality of information assurance through research; public and professional outreach; collaboration with practitioners and policymakers; and bridge-building among Indiana University’s diverse resources in information assurance research, teaching, and practice. Under the Center’s leadership, Indiana University has been recognized as a National Center of Academic Excellence in both Information Assurance Education and Information Assurance Research.

With the generous support of the Lilly Endowment, during the six months covered by this report, CACR staff and fellows submitted eight grant applications seeking over $14.5 million in support. We published 19 articles and monographs and made 22 scholarly and professional presentations. We also brought 13 speakers to the Indianapolis and Bloomington campuses as part of our regular cybersecurity and health informatics speaker series.

But the achievements of the past six months are reflected not only in numbers such as these, but in the substantive work (much of it behind the scenes or part of longer-term initiatives) in which the Center engages.
CACR organizes its work around three substantive focal areas: health care, national security, and higher education.

Health Care

CACR has focused considerable attention on the security and privacy of health information. While long significant, this area has achieved new importance with the rapid growth of electronic health records and the extraordinary expansion of health-related information that is found in non-clinical settings and is critical to the delivery of health treatments.

In April, CACR hosted the first meeting of its National Institutes of Health-funded working group on how to protect privacy in health research. This group includes internationally recognized physicians, health researchers, patient advocates, ethicists, privacy experts, and regulators from the United States, United Kingdom, and Canada, who are working together on an 18-month project to enhance privacy protection in health research, while eliminating bureaucratic and unnecessary barriers to accessing research data. The group will host a stakeholders conference in Chicago in August to address specific issues, such as multi-center research and alternatives to patient consent, that pose the greatest challenge to formulating a more effective and efficient privacy protection regime for medical research.

CACR has also continued building a new Center for Strategic Health Information Provisioning (CSHIP) in partnership with the Medical School, Informatics and Computing, the Maurer School of Law, OVPIT, and the Regenstrief Institute. The new center will collaborate with industry, not-for-profit groups, and others to bring a wider range of resources to bear on questions concerning how to protect and responsibly use personally identifiable health information. Stan Crosley, former Chief Privacy Officer of Eli Lilly and Company, has been brought on board to help lead this effort.
Stan co-founded and served as Chairman of the Board of Directors of the International Pharmaceutical Privacy Consortium, and he serves on the boards of the Indiana Health Informatics Corporation, the International Association of Privacy Professionals, The Privacy Projects, and the Conference Board's Chief Privacy Officers Council.

To strengthen our collaboration among faculty and students interested in these important issues, CACR sponsors a Health Informatics Seminar, which meets twice each month to foster an environment where faculty and students can learn about each other’s research and other major research and policy developments on campus.

National Security

CACR has continued to expand its activities relating to information security and privacy in the context of national and homeland security. Much of that work has been behind the scenes; for example, participating in classified reviews of DHS cybersecurity programs. These reviews are unusual in that they depart from the government’s usual practice of providing high-level security clearances only to government employees and contractors. In addition, IU was one of only three universities asked to participate in these reviews.

CACR has been in active discussions with the Indiana National Guard about how to better address cybersecurity as part of the Indiana Complex Operations Partnership (InCOP), a congressionally funded initiative to support U.S. troops before and during deployment abroad. A series of proposals is currently under review that would enhance cybersecurity training for officers and draw on IU’s considerable network resources to provide earlier and more complete data to military commanders on emerging cyber attacks.

CACR also hosted General Renuart, the four-star commander of NORTHCOM/NORAD, for a second visit to explore possible collaborations between IU, DOD, and DHS.

Higher Education

IU has long been a national leader in cybersecurity efforts among colleges and universities.
On April 1, 2010, we hosted our sixth CACR Higher Education Cybersecurity Summit. Thanks to the Lilly Endowment’s support, it was possible for 290 people from colleges and universities throughout the Midwest to attend without charge. In addition, all of the plenary sessions were streamed live and are available online (under “Archived Comments” at http://www.indiana.edu/~uits/cacrsummit10/program.html).

The keynote speaker for the summit was Bruce Schneier, an internationally renowned security technologist and author. Described by The Economist as a "security guru," Schneier is best known as a refreshingly candid and lucid security critic and commentator. He spoke on the timely subject: Security, Privacy, and the Generation Gap.

Figure: In April, CACR hosted the Higher Education Cybersecurity Summit. The keynote speaker for the summit was Bruce Schneier, internationally renowned security technologist and author.

The summit also featured a first-of-its-kind panel on privacy issues in higher education, featuring five university chief privacy officers. While universities collect and use an extraordinary volume and variety of personal information about students, employees, alumni, donors, and visitors, and account for one-third of reported data breaches, they have been slow to adopt privacy policies and appoint privacy officers. This panel brought together five of the first university privacy officers to address the special challenges of protecting privacy and security in higher education, and techniques for achieving success.

Other Activities

Not all of our activities fit within our three substantive focal areas. CACR also engages in considerable public outreach through appearances before civic and community groups, published op-eds, and similar activities. Working with public broadcasting station WFIU, CACR has developed its first prototype “Security Matters” segments.
Each 60-second piece addresses a specific security threat or vulnerability, and then directs listeners to a website where they can watch simple how-to video tutorials on how to perform important security tasks to counter the threat or vulnerability (such as how to choose a strong password, set a password on an iPod or Blackberry, secure a wireless router at home, etc.).

We have also focused efforts internally to help strengthen and coordinate IU’s capacity in cybersecurity research and practice. With the exceptional support provided by the Lilly Endowment, we awarded three internal grants, totaling $149,786, to support early stage cybersecurity research. These grants are designed to help spur innovative research and collaboration that are likely to lead to future external support, while also encouraging imaginative approaches to practical information assurance problems. These grants are in addition to the five internal grants (totaling $232,000) that we awarded last year.

We invested more than $10,700 in travel grants and student registration fees to encourage participation in key conferences by cybersecurity faculty and students, and to support visits by leading cybersecurity researchers and practitioners to IU.

Finally, CACR appointed six new “Fellows” from four IU units to bring a greater range of expertise to our work:
• Yue Jake Chen (Informatics)
• Arjan Durresi (Computer Science)
• Minaxi Gupta (Informatics)
• David Ripley (ANML)
• Zeynep Salih (Medicine)
• Greg Travis (ANML)
• Xukai Zou (Computer Science)

The need for more rational and effective cybersecurity efforts continues to grow. Thanks to the support of the Lilly Endowment, IU is increasingly recognized as a national leader in this important effort, and we are better positioned to make an even greater difference in the future.

V. Research Technologies

V.1 Research Technologies Mission and Activity Summary

The Research Technologies (RT) division has in many ways become the fourth pillar of Pervasive Technology Institute and for this reason it warrants its own section in the report. RT directors serve in leadership roles in all three PTI centers and RT activities overlap with and contribute to all of PTI.

The mission of the Research Technologies division of UITS is to develop, deliver, and support advanced technology solutions that enable new possibilities in research, scholarly endeavors, and creative activity at Indiana University and beyond; and to complement this with education and technology translation activities to improve the quality of life of people in Indiana, the nation, and the world.

RT is composed of the following units, each providing a specialized area of support to PTI’s centers:
• Applications – Director, Scott McCaulay
• Systems – Director, Matt Link
• Visualization and Futures – Senior Manager, Eric Wernert
• Life Science – Senior Manager, William Barnett

Research Technologies Highlights January 1-May 31, 2010

Highlights of RT activities are listed below by group in relatively non-technical terms.
Scholarly Accomplishment

A summary of the PTI-related scholarly accomplishments of Research Technologies during this reporting period is provided below:

Group | Publications | Technical Presentations | Inventions Disclosed | Open Source Software Distributed | Online Services Provided | Education, Outreach and Training Events | Public and Governmental Service Activities
Applications | 0 | 0 | 0 | 0 | 0 | 0 | 0
Systems | 0 | 3 | 0 | 0 | 0 | 3 | 0
Visualization | 0 | 4 | 0 | 0 | 0 | 29 | 0
Life Science | 1 | 4 | 1 | 0 | 0 | 0 | 0
Research Technologies Total | 1 | 11 | 1 | 0 | 0 | 32 | 0

Funded Research

The following table provides a summary of PTI-related grants submitted and received by Research Technologies during the current reporting period:

Group | Number of Grant and Contract Proposals Submitted | Total Dollar Amount of Proposals Submitted | Number of Grant and Contract Proposals Awarded | Total Dollar Amount of Proposals Awarded
Applications | 1 | $3,609,878 | 0 | $0
Systems | 0 | 0 | 0 | $0
Visualization | 1 | $1,062,323 | 0 | $0
Life Science | 1 | $45,783 | 0 | $0
Research Technologies Total | 3 | $4,717,984 | 0 | $0

V.2 Research Technologies Research and Activities

V.2.1 Research Technologies Applications Division

The Research Technologies Applications group provides support for the development, licensing, deployment and optimization of applications to further the research mission of faculty and students at Indiana University, as well as for the national research community through funded cyberinfrastructure grid projects, and for commercial research in the state of Indiana through the IEDC. Supported applications include high performance parallel MPI applications, commercial and open source statistical and mathematical applications, and complexity hiding interfaces such as gateways and portals.

Activity highlights:

During the reporting period, RT Applications performed the acceptance testing of all the hardware deployed as part of the FutureGrid project.
This is an intensive and intricate operation involving measuring successful run times of many applications and algorithms at various core counts on the machines to substantiate optimal performance. Additional tests use the entire machine over periods of days and weeks to assure that all aspects of the system perform reliably and robustly. The heterogeneous and distributed nature of the FutureGrid hardware makes this process a somewhat greater challenge than the usual deployment of new supercomputing hardware.

During March 2010 the Large Hadron Collider (LHC) located at the CERN laboratory in Geneva, Switzerland began generating the highest energy particles (7 TeV) ever produced by a particle accelerator, with significant contributions being made by IU researchers and technologists. Researchers from the IU High Energy physics group participate in ATLAS, a CERN project in which a subatomic particle detector measures the fundamental building blocks of the universe. Through their work in the NSF’s Open Science Grid (OSG) Grid Operations Center, technologists from Research Technologies Applications help in the effort to process the massive amount of data produced by ATLAS.

Research Technologies Applications, with the Pervasive Technology Institute's Digital Science Center at Indiana University, produced a new software tool called “Twister” to support faster execution of many data mining applications implemented as MapReduce programs. The tool extends the functionality of MapReduce, a distributed programming technique patented by Google for large-scale data processing in datacenter environments. Twister allows MapReduce to achieve higher performance, perform faster data transfers, and reduce the time it takes to process vast sets of data for data mining and machine learning applications.
In the spring of 2010, Indiana University announced that its popular and highly utilized Big Red supercomputer would continue to be made available to a national audience of scientific researchers through the NSF TeraGrid. The Research Technologies Applications group provides expertise to users to port their applications to run in the Big Red environment, and to optimize the applications to take advantage of the machine’s unique architecture in order to maximize throughput. RT-A will continue to provide this service to TeraGrid users through at least March 31, 2011.

On April 21, 2010, RT-A along with partners at Dresden’s Technische Universität hosted a hands-on workshop on Vampir, a tool designed to conduct performance analyses and diagnose problems in serial and parallel supercomputing applications. Vampir was created by the Center for Information Services and High Performance Computing (ZIH) at the Technische Universität in Dresden, Germany, a close collaborative partner of IU's Pervasive Technology Institute (PTI) in the area of high performance computing research. The tool will be a fundamental component of FutureGrid, a collaborative grid and cloud computing test-bed funded by the National Science Foundation and developed under the leadership of the PTI Digital Science Center.

In January 2010, RT-A Director Scott McCaulay was invited to NSF headquarters in Arlington, Virginia to address the Principal Investigators of the NSF Software Development for Cyberinfrastructure program on the subject of software sustainability, and to present the results of the Software Sustainability Workshop held in Indianapolis in 2009.

V.2.2 Research Technologies Systems Division

The Research Technologies Systems group provides robust and reliable systems and services that enable computing research experimentation and implementation, and which amplify the talents and visions of local and national researchers in a wide range of scientific domains.
The RT Systems group designs, deploys and administers the world-class supercomputing and storage systems that make up the hardware component of Indiana University’s advanced cyberinfrastructure, as well as the core services which support the effective use of these systems. The goal of this research computing environment is to enable new types of research, pedagogy, creative activity and community impact. This environment combines deep human expertise, robust systems and services, and advances in computer science and informatics to address the needs of researchers and their collaborators on the local, national, and international stage.

Activity highlights:

During the reporting period, RT Applications performed the acceptance testing of all the hardware deployed as part of the FutureGrid project and the RT Systems group supported and installed the systems for acceptance. This is an intensive and intricate operation involving measuring successful run times of many applications and algorithms at various core counts on the machines to substantiate optimal performance. Additional tests use the entire machine over periods of days and weeks to assure that all aspects of the system perform reliably and robustly. The heterogeneous and distributed nature of the FutureGrid hardware makes this process a somewhat greater challenge than the usual deployment of new supercomputing hardware.

During March 2010 the Large Hadron Collider (LHC) located at the CERN laboratory in Geneva, Switzerland began generating the highest energy particles (7 TeV) ever produced by a particle accelerator, with significant contributions being made by IU researchers and technologists. Researchers from the IU High Energy physics group participate in ATLAS, a CERN project in which a subatomic particle detector measures the fundamental building blocks of the universe.
Through their work in the NSF’s Open Science Grid (OSG) Grid Operations Center, technologists from Research Technologies Applications help in the effort to process the massive amount of data produced by ATLAS.

Research Technologies Applications, with the Pervasive Technology Institute's Digital Science Center at Indiana University, produced a new software tool called “Twister” to support faster execution of many data mining applications implemented as MapReduce programs. The tool extends the functionality of MapReduce, a distributed programming technique patented by Google for large-scale data processing in datacenter environments. Twister allows MapReduce to achieve higher performance, perform faster data transfers, and reduce the time it takes to process vast sets of data for data mining and machine learning applications. The work on Twister and other MapReduce technologies was supported by the Research Technologies Systems group, as analysis was performed on Quarry and on a test cluster from the FutureGrid project.

In the spring of 2010, Indiana University announced that its popular and highly utilized Big Red supercomputer would continue to be made available to a national audience of scientific researchers through the NSF TeraGrid. The Research Technologies Applications group provides expertise to users to port their applications to run in the Big Red environment, and to optimize the applications to take advantage of the machine’s unique architecture in order to maximize throughput. The Research Technologies Systems (RT-S) group maintains the services on Big Red. RT-A and RT-S will continue to provide this service to TeraGrid users through at least March 31, 2011.

On April 21, 2010, RT-A along with partners at Dresden’s Technische Universität hosted a hands-on workshop on Vampir, a tool designed to conduct performance analyses and diagnose problems in serial and parallel supercomputing applications.
Vampir was created by the Center for Information Services and High Performance Computing (ZIH) at the Technische Universität in Dresden, Germany, a close collaborative partner of IU's Pervasive Technology Institute (PTI) in the area of high performance computing research. The tool will be a fundamental component of FutureGrid, a collaborative grid and cloud computing test-bed funded by the National Science Foundation and developed under the leadership of the PTI Digital Science Center.

In January 2010, RT-A Director Scott McCaulay was invited to NSF headquarters in Arlington, Virginia to address the Principal Investigators of the NSF Software Development for Cyberinfrastructure program on the subject of software sustainability, and to present the results of the Software Sustainability Workshop held in Indianapolis in 2009.

In the first quarter of 2010, Research Technologies Systems installed 21.9 TFLOPS of HPC systems for the NSF PolarGrid project, which is led by PTI’s Digital Science Center. A portion of that system, 64 nodes, has been installed separately and will be relocated to Elizabeth City State University (ECSU) in North Carolina later in 2010. ECSU is a partner on the PolarGrid award and is a minority-serving institution. In addition to the installation of this new system at IU in Bloomington, two RT-S staff members traveled to Thule, Greenland for fieldwork related to the PolarGrid project. The expedition was conducted as part of NASA’s IceBridge mission, the largest airborne survey ever flown of Earth's polar ice. RT-S also worked closely with the Center for Remote Sensing of Ice Sheets (CReSIS), headquartered at the University of Kansas, for the IceBridge mission. RT-S is working closely with CReSIS and RT-A to port the current analysis code from using runtime Matlab libraries to a compiled environment.
V.2.3 Research Technologies Visualization and Futures Division

The mission of the Research Technologies Visualization & Futures division is to support the research, creative activities, education, and engagement missions of Indiana University through innovative applications of visual technologies. The main support unit of RT Visualization & Futures is the Advanced Visualization Laboratory (AVL), which provides advanced facilities, resources, and expertise in the areas of scientific and information visualization, virtual reality, computer graphics techniques, and innovative input/output technologies. RT Visualization & Futures also provides support for digital arts activities through the Institute for Digital Arts and Humanities (IDAH), coordinates IU’s participation in TeraGrid visualization initiatives, and serves as the point of contact between Research Technologies and the IU Digital Library Program.

Activity highlights:

AVL continued to invest in the development and deployment of stereo video. The major new hardware development was a split-beam stereoscopic rig for the existing camera pair. Major new stereoscopic content acquisitions include a prescribed burn in Brown County State Park, an IU Opera dress rehearsal, IU women’s basketball, and the men’s and women’s Little 500. Major outreach efforts include teaching segments of a telecommunications course, delivering an IDAH brownbag on stereo video, a presentation at the WonderLab in Bloomington, and Chris Eller’s appearance on the WTIU Friday Zone teaching kids about stereo video. Stereo video holds a great deal of promise in a variety of educational and entertainment settings.

Bill Sherman (PI) and Eric Wernert (co-PI) submitted a grant proposal to the NSF Software Development for Cyberinfrastructure (10508) solicitation.
In conjunction with Hank Childs (co-PI, University of California Davis) and Jim Stone (co-PI, Princeton), the project seeks to integrate the benefits of immersive interfaces and displays with scalable visualization tools (especially VisIt and VTK) in a systematic and sustainable manner.

AVL staff were heavily involved in the support of the Intermedia Festival held in Indianapolis in late April 2010. The Intermedia Festival showcases live performances that synthesize traditional performing arts with digital content and telecommunications. AVL provided technical assistance and showcased guest environments in its Virtual Reality Theater.

AVL staff collaborated with the Spring 2010 virtual reality class at the Indiana Academy. AVL staff mentored high school students throughout the spring semester and then hosted students in the Virtual Reality Theater for their final presentations.

AVL contributed to the Theoretical Approach to Coordinated Behavior grant. Dr. Hui Zhang, our newest staff member, began working with Dr. Chen Yu of Psychology. This project has been underway for quite some time, but the AVL only recently began contributing. It explores social and interpersonal behavior through the use of advanced technologies such as robotics and virtual reality.

AVL has nearly completed the first of two stereo movies aimed at middle-school children concerning the value of computational science and its impact on our lives as lay citizens. These will be part of the TeraGrid Education, Outreach and Training stereo video series.

V.2.4 Research Technologies Life Science Division

RTLS is the only area within RT that is dedicated to supporting domain-focused research. Its primary mission is to empower researchers with cutting edge, enabling information technologies in both basic and biomedical life sciences within IU, the state of Indiana, and beyond.
RTLS consists of three subunits: the Advanced Information Technology Core, which acts as a gateway for IU School of Medicine researchers to RT systems and services; Biomedical Applications, which provides custom applications and services to support life sciences projects; and the Center for Computational Cytomics, which supports basic biology researchers.

Activity highlights:

Indiana Clinical and Translational Sciences Institute (CTSI) HUB: RT Life Sciences launched two new applications in support of translational research, both funded by an NCRR supplement to the original CTSI award. The first, i2iconnect, is an online service that matches inventors and their discoveries with potential industry licensing partners. Developed in conjunction with Cook Medical, it provides a listing of potential licensing partners by product and disease specializations and has over 1000 participants. As part of this project, an MBA team from the Kelley School of Business developed a business and marketing plan for i2iconnect. The second is a secure collaboration service based on Alfresco Share that allows researchers from multiple institutions to quickly and securely establish online collaboration sites to develop grant proposals, conduct research on shared datasets, and prepare publications. Although still in testing, it has already attracted 300 users across multiple institutions.

The Informatics Core for The Collaborative Initiative on Fetal Alcohol Spectrum Disorders: The Informatics Core, part of the Biomedical Applications Group in RTLS, has now collected over 6,900 examinations for 4 clinical research programs at 15 sites in the US, South Africa, Russia, Ukraine, and Finland. The group is now assisting in 5 cross-site studies on the diagnosis of fetal alcohol spectrum disorders based on facial dysmorphology, brain scans, and behavior.
Bioinformatics Support for the National NeuroAIDS Tissue Consortium (NNTC): The purpose of the project is to provide bioinformatics support to the members of the National NeuroAIDS Tissue Consortium in the pursuit of their research. One of the Consortium members, Dr. Gelman, conducted a microarray-based transcriptional study on the effects of HIV-associated encephalitis on three different parts of the brain. We were responsible for providing the software platform that would allow for the organization and sharing of these data. We met this milestone by deploying caArray, a microarray data management system produced by the Cancer Biomedical Informatics Grid (caBIG) effort at the NCI. We also provided assistance to another NNTC researcher, Dr. Morgello, who is constructing a series of tissue microarrays from the tissues stored at the NNTC. We supported this effort by leveraging an existing software package called TMAj from Johns Hopkins University. Additionally, we integrated this software with caTissue in order to provide much-needed clinical context to the pathologist at a crucial point in their workflow. We have presented our progress to the Scientific Leadership Group of the NNTC and have been encouraged by their positive response.

Cancer Biomedical Informatics Grid (caBIG) Deployment at the Indiana University Simon Cancer Center (IUSCC): The purpose of this grant is to deploy caBIG applications appropriately and meaningfully for the IUSCC. To this end, we deployed two applications, caArray and caIntegrator, at the Translational Genomics Core of the IUSCC in April 2010. This facility is responsible for performing a number of Illumina microarray studies for IUSCC researchers, and as such requires software to organize and distribute microarray results from the core facility.
The two software packages, which work together, will provide both an organizational platform (caArray) and an integrative platform (caIntegrator) where clinical data can be associated with microarray results. This milestone was set out in the Deployment Plan in the fall of 2009 and was achieved on schedule. Additionally, we presented our future plans to the Leadership Council on April 5th; those activities were approved and are ongoing.

Drosophila Stock Center Database Upgrade: The CCC is helping the Drosophila stock center in the IU Department of Biology upgrade the database that it uses for its operations from a small, single-user database system running on an under-desk computer to a more robust and capable system running on a dedicated server. In the last 18 months, the project was conceived, requirements were gathered, a database engine was chosen, and the conversion work was put under way.

Image segmentation for the identification and tracking of fluorescent blobs: Le-Shin Wu of the CCC has been collaborating with Sidney Shaw of the IU Department of Biology to develop algorithms to identify blobs of green fluorescent protein in microscope images and to track those blobs through time and space in 3-D movies. Typically, green fluorescent protein segments are added to proteins using genetic engineering techniques that insert them into genes, which are then transcribed and translated to produce protein. Tracking allows researchers to study the location and movements of proteins in living cells. A goal of the project was to produce an algorithm that requires no assumptions about the biology of proteins and cells. In the last 18 months, the project produced its first accepted paper, which will be presented in August 2010 at the 20th International Conference on Pattern Recognition in Istanbul, Turkey.
Image management for the Light Microscopy Imaging Center: The CCC and the PTI Community Grids Lab have been working with the Light Microscopy Imaging Center in the Department of Biology at IUB to develop a database system that will manage images collected at microscopes at the center. In the last 18 months, the project was conceived and planned, and work began. Basic functionality has been achieved, and a server has been put into operation at the center. Funding for the project is being sought from the National Science Foundation.

V.3 Educational Activities and Workforce Development

The following chart shows employees hired by or leaving Research Technologies during the reporting period.

Name | Group | Status
Dr. Hui Zhang | Visualization | Hired as Research Programmer/Analyst in Jan. 2010
Albert William | Visualization | Starting as 50% FTE in Jan. 2010 (as part of small TG EOT grant)
Xin Hong | Life Science | Left our employ in December
Ganesh Shankar | Life Science | Hired as new Manager of the Advanced IT Core

VI. Bringing Distinction to the State of Indiana

PTI scientists and leadership continue to earn national and international prestige for the state of Indiana by assuming leadership roles, serving as policy advisors and being published in respected international publications. Some notable accomplishments during the current reporting period are as follows:

During the period, CACR director Fred H. Cate served as a policy advisor on privacy and security to the US Department of Commerce, the Subcommittee on Crime and Drugs of the US Senate Committee on the Judiciary, and the US Federal Trade Commission. Testimony provided by Cate was cited in numerous national media outlets, including the New York Times.

[Figure caption: CACR’s Fred Cate testified before numerous US government panels during the reporting period. Cate is an internationally recognized expert on technology security and privacy.]

The Data to Insight Center contributed key weather prediction technology to Vortex2, the largest effort to study and understand the nature of tornadoes undertaken in the US to date. D2I’s LEAD II technology provided real-time weather updates to scientists tracking storms in the field.

During the reporting period, the Digital Science Center, along with the Research Technologies Systems and Applications groups, completed the initial stage of hardware installation and testing for the FutureGrid project. This is a complicated and delicate process that required significant technical expertise and effort to achieve. The FutureGrid project places Indiana and Indiana University at the helm of one of the most important national efforts related to the future of technology and scientific research. Supported by a $10 million grant from the National Science Foundation and led by PTI’s own Geoffrey Fox, FutureGrid provides a testbed for the most significant emerging grid and cloud technologies. These are the technologies expected to drive global business and scientific research in the coming decades. The project will be used to define the future of US national computing infrastructure and contributes significantly to US competitiveness in the sciences.

[Figure caption: This model, created by the CNeTS group of the Digital Science Center, shows human mobility in North America. It was recently featured as part of an article by Alessandro Vespignani on the “Fragility of interdependency” in the April 15 edition of the journal Nature.]

Alessandro Vespignani of the Digital Science Center was published in the prestigious international journal Nature, receiving placement as the coveted cover story. The article featured his highly successful work predicting and modeling the spread of the H1N1 pandemic.

Professor Lizhe Wang of the DSC Community Grids Lab was named Coordinator of the IEEE TCSC technical area of Green Computing, a significant national research effort.
The Digital Science Center in PTI was selected to host the IEEE international conference on cloud computing, CloudCom 2010. DSC director Geoffrey Fox serves as conference general chair, the Community Grids Lab’s Judy Qiu serves as program chair, and Daphne Siefert-Herron of the Strategic Initiatives group is the organizing and communications chair of that conference, which will bring top researchers from around the world to Indianapolis.

Alessandro Vespignani was a finalist in the 2010 Mira Awards for excellence in Indiana technology, recognizing his work on the modeling of the H1N1 pandemic.

In January 2010, Research Technologies Applications Director Scott McCaulay was invited to NSF headquarters in Arlington, Virginia to address the Principal Investigators of the NSF Software Development for Cyberinfrastructure program on the subject of software sustainability, and to present the results of the Software Sustainability Workshop held in Indianapolis in 2009.

During March 2010 the Large Hadron Collider (LHC), located at the CERN laboratory in Geneva, Switzerland, began generating the highest energy particles (7 TeV) ever produced by a particle accelerator, with significant contributions being made by IU researchers and technologists. Researchers from the IU High Energy Physics group participate in ATLAS, a CERN project in which a subatomic particle detector measures the fundamental building blocks of the universe. Through their work in the NSF’s Open Science Grid (OSG) Grid Operations Center, technologists from Research Technology Applications help in the effort to process the massive amount of data produced by ATLAS.

William Barnett of the RT Life Science group was elected Vice Chair and Chair-Elect of the Communications Key Function Committee for the national Clinical and Translational Sciences Award (CTSA) Consortium. This is a national consortium focusing on bringing advances in medicine from the research labs to the bedside.

VII. Institute Coordination and Support
Craig A. Stewart, Executive Director

During the period, there were few major changes in PTI Coordination and Support functions. This is a positive situation in that PTI has reached a period of stability in all the core areas of coordination.

PTI COO Therese Miller has assembled an exceptional team of administrators who are highly skilled in the preparation and submission of successful funding proposals. This team has become recognized across the university as among the top experts in grant management and preparation. Despite the group’s solid competency and impeccable work ethic, the sheer volume of grants being submitted and received by PTI necessitates an expansion of this group during the coming period.

The Strategic Initiatives team had only one change during the period, with management of the PTI Web site and CI Newsletter transferring from Barbara O’Leary to Peg Lindenlaub. Team manager Daphne Siefert-Herron has assembled a full team with in-house expertise to produce Web content, written reports, printed materials, and video and multimedia presentations for use in external relations, education and outreach to the academic community and general public. The quality of this team has been evident in private (in site visits with the NSF) and in public (public communications with video regarding the Data to Insight Center’s XMCCat project). Adding video to our communications offers a way to better convey both the content of our work and a sense of excitement about it.

Each of the PTI centers now enjoys the support and coordination of a full-time project manager who oversees all aspects of administrative management within the centers. With all of these pieces now solidly in place, PTI has entered a period of reliable organization and exceptional productivity.

VIII. Management and Operations
Therese Miller, Chief Operating Officer

Pervasive Technology Institute continues to move down a path toward sustainability.
Grant opportunities are at an all-time high, as PTI has gained national recognition in numerous areas of research. The overall management workload for grant projects has increased tremendously, including monthly progress and financial reporting. In particular, we are focusing on compliance aspects of these grants, with special attention paid to personnel effort reporting. Total active funding now exceeds $22 million, with an additional $71.4 million in pending proposals.

PTI is fortunate to be located in the new Innovation Center (IC) at Indiana University. This area has been targeted as a future technology park, and the IC continues to be a hub for economic development. The future Cyberinfrastructure Building (CIB) is now under construction adjacent to the IC and will be home to our Research Technologies collaborators in the fall of 2011. Also adjacent to these buildings is the new IU Data Center, housing supercomputers and file storage systems, cyberinfrastructure that is critical to many research projects. Collaboration and innovation are improved by the physical proximity and convenience of these assets.

The Strategic Initiatives group continues to focus on enhancing the identity of PTI within the state, as well as the nation. The group strengthens our overall mission by making the public aware of the work of PTI researchers, which supports and contributes to many basic science research areas. This is research that makes a difference to the nation and the world, as it attempts to solve some of the basic environmental and societal problems of today.

IX. Economic Development

In the 18 months since the inception of PTI, the Institute has brought in a total of more than $22 million in additional grants beyond the initial award from the Lilly Endowment, with $4 million received in the current period and an additional $58 million in pending grants.
According to a new report by the US Science Coalition, “When public money is invested in university-based basic research there is tremendous return on investment. Research creates jobs directly for those involved and indirectly for many others, through innovations that lead to new technologies, new industries and new companies.” The report cites institutions such as PTI that bring large sums of federal grants into the area as contributing positively to an area’s overall economy. It also explains how investment in basic research pays off in economic terms and contributes to a region’s overall success in business and industry as well as scientific competitiveness. We encourage individuals to read the complete findings of the report at http://www.sciencecoalition.org/successstories/index.cfm; key excerpts are shown in the box below.

Facts about Basic Research and the Innovation Process*
Universities conduct the majority of basic research in the United States, 55 percent in 2008. Business and industry conduct less than 20 percent of basic research in the United States.
The federal government is the primary source of funding for basic research conducted in the United States, providing some 60 percent of funding. The second largest source of basic research funding is the academic institutions themselves.
Basic research is conducted for the sake of knowledge and is essential to scientific discovery and understanding. Basic research is the first step in the innovation process.
Innovations that flow from university-based basic research are at the root of countless companies. Companies spun out of research universities have a far greater success rate than other companies, creating good jobs and spurring economic activity.
The US continues to lead in global research and development expenditures from all sources. However, China and other nations are investing aggressively in R&D in order to enhance their innovation capabilities.
America’s global competitiveness and long-term economic health depends on significant and consistent investment in basic research.
*From Sparking Economic Growth, a report from the US Science Coalition (http://www.sciencecoalition.org/successstories/index.cfm)

Some of this effect is evident in the current status of the Pervasive Technology Institute. More than 60 full-time employees of Indiana University are funded directly or indirectly through external funding that comes to PTI. This puts cash into the central and south central economy, adds tax revenue, and aids the creation of new startup businesses. For example, former students of Pervasive Technology Labs fellow Katy Boerner are now operating a successful IT business in Bloomington, IN. Prior to the advent of PTL and PTI, such students would have moved elsewhere, to Boston or Silicon Valley, to start up companies.

Direct investment in the economy continues through the now renamed PTI Capital Development Fund. The IURTC holds the investments in six companies for the benefit of the PTI Capital Investment Fund. Investments have been made at various times since 2003 per recommendations by an advisory board. Investments have included promissory notes, common stock, preferred stock and LLC units. All investments are in private, non-publicly traded companies. Investments have been made in six companies:

Anabas, Inc.
CareGuide (formerly Haelan)
ChartLogic (formerly DynoMed)
Precise Path Robotics, Inc. (formerly Indy Robotics LLC)
SGC Technologies LLC
Veteris/Veterisoft

The companies in which we have invested, and the PTI Capital Investment Fund, have felt the effects of the economy. Veteris/Veterisoft has officially moved from being inactive to being closed and out of business. CareGuide (formerly Haelan) declared insolvency during this reporting period.
However, the value of the PTI holdings in Precise Path Robotics has increased from $50,000 to a current valuation of $147,000. The value of the stock in Anabas, Inc. held by IU has risen from $200,000 to $250,000, and IU’s collaboration with Anabas has put it in a position to win a subcontract on a project with the Air Force Research Lab that is expected to be worth more than $1,000,000.

Direct engagement in the private sector carries risks and benefits. During this reporting period we have experienced both the downside of risks (insolvency and closing of two businesses in which we invested) and the benefits (increased activity and value related to two other businesses). Action and engagement offer the opportunity to engage, influence, and possibly succeed in economic development. Inaction holds only the guarantee of failure. Through the support of the Lilly Endowment, PTI has been able to engage directly, experience some success, and aid businesses operating in Indiana and the Indiana economy overall.

Detailed reports on PTI Capital Fund investments follow:

Anabas:
Original investment date: June 2003
Amount initially invested: $400,000
Format: convertible subordinated promissory note and preferred stock
Activity since investment: The company continues to use grant-funded activities to develop sensor-centric grid applications and web conferencing services. Of the $400,000 investment by the PTI Capital Investment Fund, $200,000 was in the form of direct investment in ownership of preferred stock, and $200,000 was in the form of a note, which has now been settled via a transfer of software licenses and intellectual property rights to IU. This transfer has put IU in a position to negotiate a major contract with the Air Force Research Labs that will be worth at least five times the value of the note.
Current status: Active
Current valuation of preferred stock investment in Anabas: $250,000
Contact/source of information: Alex Ho, CEO

Haelan/CareGuide:
Original investment date: June 2003
Amount: $450,000
Format: convertible subordinated promissory note with earn-out provisions
Activity since investment: CareGuide acquired Haelan in December of 2006 by merger. Convertible subordinated notes with an earn-out provision were issued and have been past due since December 8, 2009. CareGuide notified all note holders that it was insolvent and liquidating all of its assets as of March 30, 2010, as initiated by its bank to satisfy the bank debt outstanding. No funds are available to satisfy any note obligations to the Haelan noteholders, and no legal action is being considered due to the low probability of recovery and the high cost of legal expenses.
Current status: Insolvent
Current valuation: $0
Contact/source of information: Richard Westheimer, Haelan note rep, phone 513-651-1110 or rlwesty@fuse.net

DynoMed/ChartLogic Inc.:
Original investment date: June 2004
Amount: $300,000
Format: common and preferred stock
Activity since investment: ChartLogic acquired DynoMed in December of 2004 by merger. ChartLogic is a Utah-based company that provides proprietary electronic medical records solutions for physicians.
Current status: Active
Current valuation: $300,000. Private company, no market value established for stock.
Contact/source of information: ChartLogic web site; repeated requests for information received no company response.

Indy Robotics/Precise Path Robotics, Inc.:
Original investment date: September 2005
Amount: $50,000
Format: common stock
Activity since investment: Precise Path Robotics acquired Indy Robotics in December of 2006. Scott Jones is Co-Founder and Chairman of Precise Path. The company has developed a commercially viable robotic greens mower for use at golf courses.
Current status: Active
Current valuation: $147,000
Contact/source of information: Jason Zielke, COO, phone 317-818-8185 ext 7005 or Jason.Zielke@precisepath.com

SGC Technologies LLC:
Original investment date: April 2005
Amount: $100,000
Format: LLC unit
Activity since investment: The company initially developed an enterprise and data sharing software product, FileShare, and continues to maintain that product. Recent development has focused on a version of the software for e-health care that would be HIPAA compliant.
Current status: Active
Current valuation: Private company, no market value established for LLC units.
Contact/source of information: Greg Travis, founder

Veteris/Veterisoft:
Original investment date: July 2004 and 2005
Amount: $400,000
Format: LLC unit
Current status: Company is closed
Current valuation: $0

X. External Relations and Strategic Initiatives
Daphne Siefert-Herron, Manager of Strategic Initiatives

During the period, the Strategic Initiatives Team has had continued success in bringing news about PTI to the national and regional scientific and lay communities. Articles and news items about PTI appeared more than 70 times in the popular and technical press during the period (a detailed list can be seen in Appendix 10).

One notable achievement during the period was the launch of our new Inside PTI newsletter early in 2010. The newsletter is produced every two weeks and includes short blurbs about PTI employees, research, and other information such as conferences and funding opportunities that are interesting or useful to the PTI community. The publication has been very popular and has been requested by individuals not employed by PTI who also find the information useful. The goal of the newsletter is to foster and increase collaboration and share institutional knowledge by keeping the entire PTI community informed about what each of our leading edge research teams is working on and accomplishing. Inside PTI was compiled and edited by our communications intern, Doug Hungerford, prior to his graduation in May.
The Strategic Initiatives group has since hired a new intern, Helen Russick, a junior majoring in Journalism at IU Bloomington. Helen is an accomplished writer, bringing experience from prior internships with the Indiana State Legislature and the Indianapolis Museum of Art. She is now serving as the new editor of Inside PTI. To see the current and archived issues of Inside PTI, visit http://pti.iu.edu/inside.

Also during the reporting period, the Strategic Initiatives team saw a change in leadership for the PTI Web site. Barbara O’Leary left PTI to pursue other interests in May, and her role as PTI Web Coordinator was taken over by Peg Lindenlaub. Peg is a longtime employee of Indiana University in the Research Technologies division. Her technical and institutional knowledge has proven invaluable, both for the PTI Web site and in her role as editor for our internationally distributed publication on high performance computing, the CI Newsletter.

The Strategic Initiatives team, with the support of our videographer, Jonathan Morrison, has brought to life several more video productions during the reporting period. They present a lay-audience friendly overview of many of our projects, including the LEAD II/Vortex2 tornado research project at http://pti.iu.edu/video/vortex2, and the important software releases XMC Cat (http://pti.iu.edu/video/xmccat) and Twister (http://pti.iu.edu/video/twister). One of the most impressive videos of the period is on the CACR project Ethical Technologies in the Homes of Seniors (http://newsinfo.iu.edu/asset/page/normal/8472.html). The videos have been well received and run by several of the media outlets we commonly work with, as they are increasing video features in their publications and Web sites. Video has proven to be a cost-effective way to reach a broader audience and educate the public about technology. We intend to produce an even greater volume of video content in the coming period.
The Strategic Initiatives team is currently working on several projects that will come to fruition during the coming reporting period. One is planning for our second international academic conference, CloudCom 2010, which will take place in December in Indianapolis. The conference will gather many of the top research scientists working in advanced computing today, and it is a notable honor that IEEE selected Indiana University to host this prestigious event. Also during the period, we completed work on a new booklet about the Center for Applied Cybersecurity Research and began work on similar booklets for the other two PTI centers, which will be completed during the coming period. Our team is also planning the PTI/Indiana University display at the annual Supercomputing Conference, taking place in November in New Orleans.

XI. Educating the Residents of Indiana and Beyond
Daphne Siefert-Herron, Manager of Strategic Initiatives

Pervasive Technology Institute remains committed to educating the residents of Indiana and beyond about the value of advanced technology to a healthy, educated and productive society. During the reporting period, PTI hosted or participated in numerous events and activities that bring technology concepts to a broader audience. We also spent the period planning for our busy summer outreach season, which includes numerous technology summer camps and other educational activities for Hoosier children and teens.

Also during the reporting period, the Center for Applied Cybersecurity Research began planning for an early fall launch of a new educational program, Security Matters, to air on the Bloomington National Public Radio affiliate. The program will help to educate the public on how to use technology safely and securely in their homes and businesses. CACR, along with members of the PTI Strategic Initiatives team, began producing related video content for the series during the period.
Highlights of educational outreach during the reporting period are:

In March, members of PTI organized a very popular workshop held at the Wonderlab children’s science museum in Bloomington entitled “Be an Interstellar Commuter with a Supercomputer”. The workshop introduced how supercomputers are used in scientific research such as astronomy and how these computers differ from ordinary desktop computers. The workshop was designed by Kurt Seiffert of the Research Technologies Systems group.

Figure: Members of the PTI Advanced Visualization Lab teach Hoosier kids and families how 3D movies are made during a special event at Wonderlab children’s science and technology museum in Bloomington.

In April, PTI offered another very popular event at Wonderlab, developed by members of the Research Technologies Advanced Visualization Lab (AVL). Members of the AVL produced a video showing various 3D animation techniques and explaining how 3D video is produced. During the program, the group made contacts with several area science teachers who are interested in having the program repeated in their classrooms in the fall.

The Open Systems Lab (OSL) helped coach a middle-school team that participated in the Game On event, one of several competitions held during the Indiana State Science Olympiad on the campus of IU-Bloomington in March. Game On had students compete to create a computer game in a fixed length of time, using the open source Scratch software (http://scratch.mit.edu/). The students learned the basics of modern, object-oriented computer programming and even simple uses of concurrent programming with message-passing.

Figure: Students participate in the Game On event at the Indiana State Science Olympiad.
Appendix 1: Technology Disclosures during the Reporting Period

PTI Center & Lab | Disclosure Name | Description | Disclosure Date | Status of Disclosure | Related URL
Digital Science Center, Community Grids Lab | Scientific Workflows | Tools developed by the LEAD and OGCE projects to support scientific workflows on Grids and Clouds. | 3/17/10 | Approved | http://www.collab-ogce.org

Appendix 2: Open Source Software

PTI Center & Lab | Software Name | Description | Related URL
Digital Science Center
Community Grids Lab | FutureGrid | Initial software repository, Apache License | http://futuregrid.org
Community Grids Lab | OGCE Science Gateway Software Suite | Tools for building Web-based science gateways | http://www.collab-ogce.org
Community Grids Lab | Twister Iterative Map Reduce | Software supporting extensions of the MapReduce programming model for Cloud computing. | http://www.iterativemapreduce.org/
Community Grids Lab | Twister | Supports faster execution of many data mining applications implemented as MapReduce programs. | http://www.iterativemapreduce.org/
CNetS | Scholarometer (Fil Menczer) | Browser extension | http://scholarometer/indiana.edu/download.html
CNetS | Bct-cpp (Larry Yaeger) | C++ implementation of the Brain Connectivity Toolbox | http://code.google.com/p/bct-cpp/
Open Systems Lab | Open MPI 1.4.1 (Jan. 15, 2010) | Stable release of the Open MPI project includes bug fixes since the 1.4.0 release. | http://www.open-mpi.org/software/ompi/v1.4/
Open Systems Lab | OSCAR 6.0.5 (April 7, 2010) | | http://svn.oscar.openclustergroup.org/trac/oscar/blog/oscar6.0.5
Open Systems Lab | Open MPI 1.4.2 (May 4, 2010) | Stable release of the Open MPI project includes bug fixes since the 1.4.1 release. | http://www.open-mpi.org/software/ompi/v1.4/
Data to Insight Center
DSI | Karma 3.0 | Contains instrumentation using Axis2 handlers, more extensive test clients, and better documentation. | http://www.dataandsearch.org/provenance/?q=taxonomy/term/3
DSI | XMC Cat 1.2.6 | XMC Cat is a web service toolkit for capturing and storing metadata during the execution of scientific workflows to enable data discovery and reuse. | http://www.dataandsearch.org/dsi/xmccat
VISLab | TUIOZone 0.1.8 | TUIOZone is a library designed to support interaction on multi-touch surfaces. | http://jlyst.com/tz/

Appendix 3: Online Services

PTI Center & Lab | Name of Service | Description | Related URL
Digital Science Center
Community Grids Lab | Future Grid Portal | Drupal-based portal to support the NSF-funded Future Grid project and provide collaborative content management. | http://futuregrid.org/
Community Grids Lab | Future Grid Gadgets | iGoogle-compatible gadgets for interacting with Future Grid information service. | 1) FG Hardware/Software List Gadget: http://gw19.quarry.iu.teragrid.org/Gadgets/FGKB/fgkb.xml 2) FG Core Services Gadget: http://gw19.quarry.iu.teragrid.org/Gadgets/Google/Inca_Services.xml 3) FG Knowledge Base (KB): http://gw19.quarry.iu.teragrid.org/Gadgets/Futuregrid/hardware/hardware.xml
Community Grids Lab | EST Pipeline Portal | Science gateway for processing expressed sequence tags (ESTs) and reconstructing genomes. Major upgrades from previous report to support multiple TeraGrid sites, multiple pipeline tools. | http://swarm.cgb.indiana.edu/
Community Grids Lab | BioDrugScreen portal | Science gateway for drug screening, docking, and discovery. Major upgrades include support for off-target docking. | http://biodrugscreen.org/
Community Grids Lab | UltraScan gateway hosting | Prototype job management service for new collaboration with the NIH-funded UltraScan project (biophysics). | Not yet public. UltraScan website is http://www.ultrascan.uthscsa.edu/
Community Grids Lab | MyOSG Gadget Portal | A prototype version of the MyOSG portal running in the OGCE gadget container software. | https://gadget.grid.iu.edu/ishindig-webapp
Community Grids Lab | Future Grid virtual machine image and instance browsers | Client interfaces for Amazon EC2-style virtual machine management. | http://gw19.quarry.iu.teragrid.org/Gadgets/EC2/EC2_Web_App.html
Community Grids Lab | OREChem services | | Workflow User Interface: http://ogceportal.iu.teragrid.org/xbaya/orechemxbaya.jnlp
Community Grids Lab | GridChem Gateway Advanced Support | | https://gw26.quarry.iu.teragrid.org:19443/ https://gw26.quarry.iu.teragrid.org:19442 http://gw26.quarry.iu.teragrid.org/xbaya/xbayagridchem.jnlp
Community Grids Lab | OGCE Gateway | Gadget Container, Application Registry, Generic Application Factory Service, Eventing System | https://ogceportal.iu.teragrid.org/ishindig-webapp https://ogceportal.iu.teragrid.org:19442/ https://ogceportal.iu.teragrid.org:19443/xregistry http://ogceportal.iu.teragrid.org/monitor/ogce_monitoring_dashboard.html http://ogceportal.iu.teragrid.org:12346/ http://ogceportal.iu.teragrid.org:13333/MsgBox http://ogceportal.iu.teragrid.org:19440/XWorkflows/
Community Grids Lab | OREChem Gateway Advanced Support | PubChem REST Services, Application Registry, Generic Application Factory Service, XBaya Workflow Interface | http://gridfarm018.ucs.indiana.edu:8146/ https://gw26.quarry.iu.teragrid.org:18442/ https://gw26.quarry.iu.teragrid.org:7443/xregistryinterface/index.jsp https://gw26.quarry.iu.teragrid.org:18443/xregistry?wsdl http://ogceportal.iu.teragrid.org/xbaya/orechemxbaya.jnlp
Community Grids Lab | OLAS Gateway Advanced Support | OLAS Portal, Application Registry, Generic Application Factory Service, XBaya Workflow Interface | https://gw42.quarry.iu.teragrid.org/ishindig-webapp https://gw42.quarry.iu.teragrid.org:19442/ https://gw42.quarry.iu.teragrid.org:19443/xregistry http://gw42.quarry.iu.teragrid.org/monitor/olas_monitoring_dashboard.html http://gw42.quarry.iu.teragrid.org:12346/ http://gw42.quarry.iu.teragrid.org:13333/MsgBox http://gw42.quarry.iu.teragrid.org:19440/XWorkflows/
Community Grids Lab | ODI Gateway Development | ODI Portal, Application Registry, Generic Application Factory Service, XBaya Workflow Interface | https://gw8.quarry.iu.teragrid.org/ishindig-webapp https://gw8.quarry.iu.teragrid.org:19442/ https://gw8.quarry.iu.teragrid.org:19443/xregistry http://gw8.quarry.iu.teragrid.org/monitor/odi_monitoring_dashboard.html http://gw8.quarry.iu.teragrid.org:12346/ http://gw8.quarry.iu.teragrid.org:13333/MsgBox http://gw8.quarry.iu.teragrid.org:19440/XWorkflows/
Data to Insight Center
Center for Data and Search Informatics | LEAD Portal - Updates | Cyberinfrastructure system in support of scientific discovery related to the atmosphere. | http://portal.leadproject.org

Appendix 4: Publications (January 1-May 31, 2010)

Digital Science Center Publications

Community Grids Lab
Aktas, Mehmet S., Geoffrey C. Fox, and Marlon Pierce. "A Federated Approach to Information Management in Grids." International Journal of Web Services Research. 7 (2010): 65-98.
Aló, Richard A., et al. A Model for LACCEI: Minority Serving Institutions and CyberInfrastructure Research/Education Minority Serving Institutions-CyberInfrastructure Empowerment Coalition - MSI-CIEC. The Eighth LACCEI Annual Conference for Engineering and Technology Innovation and Development for the Americas. Arequipa, Peru, 2010.
Aló, Richard A., et al. Advancing Computational Science, Visualization and Homeland Security Research/Education at Minority Serving Institutions National Model Promoted/Implemented by MSI-CIEC (Minority Serving Institutions-CyberInfrastructure Empowerment Coalition). Vol. 1. The 10th International Conference on Computational Science (ICCS 2010), 1. Amsterdam, the Netherlands: Elsevier B.V., 2010.
Aló, Richard A., et al.
CyberInfrastructure Research/Education Development at Minority Serving Institutions National Model Promoted/Implemented by MSI-CIEC (Minority Serving Institutions-CyberInfrastructure Empowerment Coalition). 2010.
Bae, Seung-Hee, et al. Dimension Reduction and Visualization of Large High-dimensional Data via Interpolation. The ACM International Symposium on High Performance Distributed Computing (HPDC). Chicago, IL: ACM Press, 2010.
Bae, Seung-Hee, et al. Dimension Reduction and Visualization of Large High-dimensional Data via Interpolation. Bloomington, IN: Indiana University, 2010.
Bae, Seung-Hee, Judy Qiu, and Geoffrey C. Fox. Multidimensional Scaling by Deterministic Annealing with Iterative Majorization Algorithm. Bloomington, IN: Indiana University, 2010.
Bollen, Johan, Geoffrey Fox, and Prashant Raj Singhal. How and where the TeraGrid supercomputing infrastructure benefits science. Bloomington, IN: Indiana University, 2010.
Choi, J. Y., et al. Browsing Large Scale Cheminformatics Data with Dimension Reduction. 2010.
Choi, Jong Youl, et al. Browsing Large Scale Cheminformatics Data with Dimension Reduction. Bloomington, IN: Indiana University, 2010.
Choi, Jong Youl, et al. Browsing Large Scale Cheminformatics Data with Dimension Reduction. The ACM International Symposium on High Performance Distributed Computing (HPDC). Chicago: ACM Press, 2010.
Choi, Jong Youl, et al. Generative Topographic Mapping by Deterministic Annealing. Bloomington, IN: Indiana University, 2010.
Choi, Jong Youl, et al. Generative Topographic Mapping by Deterministic Annealing. Vol. 1. The 10th International Conference on Computational Science (ICCS 2010), 1. Amsterdam, the Netherlands: Elsevier B.V., 2010.
Choi, Jong Youl, et al. High Performance Dimension Reduction and Visualization for Large High-dimensional Data Analysis. Eds. Manish Parashar, and Rajkumar Buyya. The 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid2010).
Melbourne, Australia: IEEE, 2010.
Ekanayake, Jaliya, et al. Applicability of DryadLINQ to Scientific Applications. Bloomington, IN: Indiana University, 2010.
Ekanayake, Jaliya, et al. Twister: A Runtime for Iterative MapReduce. Bloomington, IN: Indiana University, 2010.
Ekanayake, Jaliya, et al. Twister: A Runtime for Iterative MapReduce. The ACM International Symposium on High Performance Distributed Computing (HPDC). Chicago, 2010.
Ekanayake, Jaliya, Thilina Gunarathne, and Judy Qiu. Cloud Technologies for Bioinformatics Applications. Bloomington, IN: Indiana University, 2010.
Fox, Geoffrey. Algorithms and Application for Grids and Clouds. The 22nd ACM Symposium on Parallelism in Algorithms and Architectures. Santorini, Greece: ACM, 2010.
Fox, Geoffrey. Cloud Technologies and Data Intensive Application. 2010.
Fox, Geoffrey. Clouds and MapReduce for Scientific Applications. Bloomington, IN: Indiana University, 2010.
Fox, Geoffrey. FutureGrid Platform FGPlatform: Rationale and Possible Directions. Bloomington, IN: Indiana University, 2010.
Fox, Geoffrey. Hybrid Computational Infrastructure Supporting eResearch. Bloomington, IN: Indiana University, 2010.
Guha, Rajarshi, et al. "Advances in Cheminformatics Methodologies and Infrastructure to Support the Data Mining of Large, Heterogeneous Chemical Datasets." Current Computer-Aided Drug Design. 6 (2010): 50-67.
Gunarathne, Thilina, et al. Cloud Computing Paradigms for Pleasingly Parallel Biomedical Applications. Bloomington, IN: Indiana University, 2010.
Gunarathne, Thilina, et al. Cloud Computing Paradigms for Pleasingly Parallel Biomedical Applications. The ACM International Symposium on High Performance Distributed Computing. Chicago, IL: ACM, 2010.
Kapadia, Apu, et al. Secure Cloud Computing with Brokered Trusted Sensor Networks. The 2010 International Symposium on Collaborative Technologies and Systems (CTS 2010). Chicago, IL USA: IEEE, 2010.
Mustacoglu, Ahmet Fatih, and Geoffrey Fox. Performance of a Collaborative Framework for Federating Distributed Digital Entities. The 2010 International Symposium on Collaborative Technologies and Systems (CTS 2010). Chicago, IL USA, 2010.
Muthuraman, Karthik Narayan. Using SWARM service for a GRID based Sequence Assembly. Bloomington, IN: Indiana University, 2010.
Oh, Sangyoon, Jai-Hoon Kim, and Geoffrey Fox. "Real-time performance analysis for publish/subscribe systems." Future Generation Computer Systems. 26 (2010): 318-323.
Pace, Tyler, Shaowen Bardzell, and Geoffrey Fox. Human-Centered e-Science: A Group-Theoretic Perspective on Cyberinfrastructure Design. The 2010 International Symposium on Collaborative Technologies and Systems (CTS 2010). Chicago, IL USA, 2010.
Qiu, Judy, et al. "Hybrid Cloud and Cluster Computing Paradigms for Life Science Applications." The 11th Annual Bioinformatics Open Source Conference. Boston, MA, 2010.
Qiu, Judy, et al. Hybrid Cloud and Cluster Computing Paradigms for Life Science Applications. Bloomington, IN: Indiana University, 2010.
Qiu, Judy, et al. Performance of Windows Multicore Systems on Threading and MPI. Bloomington, IN: Indiana University, 2010.
Qiu, Judy, et al. Performance of Windows Multicore Systems on Threading and MPI. Eds. Manish Parashar, and Rajkumar Buyya. The 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid2010). Melbourne, Australia: IEEE, 2010.
Schulze, Bruno, and Geoffrey C. Fox. "Advanced Scheduling Strategies and Grid Programming Environments." Concurrency & Computation: Practice & Experience. 22 (2010): 233-240.
Wang, Fugang. Cyberaide JavaScript: A Web Application Development Framework for Cyberinfrastructure. Vol. Masters. 2010.
Wang, Lizhe, et al. "Cloud computing: a perspective study." New Generation Computing. 28 (2010): 137-146.
Wang, Lizhe, et al. "Provide Virtual Distributed Environments for Grid Computing on Demand." Journal of Advances in Engineering Software.
41 (2010): 213-219.
Wang, Lizhe, et al. Schedule Distributed Virtual Machines in a Service Oriented Environment. The 24th IEEE International Conference on Advanced Information Networking and Applications (AINA’10). Perth, Australia: IEEE, 2010.
Wang, Lizhe, et al. Towards Energy Aware Scheduling for Precedence Constrained Parallel Tasks in a Cluster with DVFS. Eds. Manish Parashar, and Rajkumar Buyya. The 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid2010). Melbourne, Australia: IEEE, 2010.
Wang, Lizhe, Jie Tao, and Gregor von Laszewski. "Multicores in Cloud Computing: Research Challenges for Applications." Journal of Computers. 5 (2010).
Yildiz, Beytullah, and Geoffrey C. Fox. Distributed Handler Architecture. Bloomington, IN: Indiana University, 2010.
Younge, Andrew. Towards a Green Framework for Cloud Data Centers. Vol. Masters. Computer Science, Masters. Rochester, NY: Rochester Institute of Technology, 2010.

Open Systems Lab
Cottam, J., S. Foley, and S. Menzel. Do Roadshows Work?: Examining the Effectiveness of Just Be. Technical Symposium on Computer Science Education. Milwaukee, WI: ACM Press, 2010.
Georgiev, T., and A. Lumsdaine. "Rich Image Capture with Plenoptic Cameras." International Conference on Computational Photography. Cambridge, MA, 2010.
Hoefler, T., C. Siebert, and A. Lumsdaine. Scalable Communication Protocols for Dynamic Sparse Data Exchange. The 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP 2010). Bangalore, India: ACM, 2010.
Hoefler, Torsten, et al. The Case for Collective Pattern Specification. The First Workshop on Advances in Message Passing. Toronto, Canada, 2010.
Georgiev, Todor, and Andrew Lumsdaine. Theory and Methods of Lightfield Photography. The 31st Annual Conference of the European Association of Computer Graphics (EUROGRAPHICS 2010). Norrköping, Sweden, 2010.
Heiland, Randy, et al. Workflows for Parameter Studies of Multi-Cell Modeling.
2010 Spring Simulation Multiconference (SpringSim'10). Orlando, FL, 2010.

Complex Networks and Systems
Abi-Haidar, A. and Rocha, L.M., Biomedical Article Classification using an Agent-Based Model of T-Cell Cross-Regulation. in 9th International Conference on Artificial Immune Systems, (Edinburgh, U.K., In Press), Springer-Verlag.
Abi-Haidar, A. and Rocha, L.M., Collective Classification of Biomedical Articles using T-Cell Cross-regulation. in 12th International Conference on the Synthesis and Simulation of Living Systems (Alife XII), (Odense, Denmark, In Press), MIT Press.
Bollen, J., Mao, H. and Pepe, A., Determining the public mood state by analysis of microblogging posts. in 12th International Conference on the Synthesis and Simulation of Living Systems (Artificial Life XII), (Odense, Denmark, Submitted).
Colizza, V. and Vespignani, A. The Flu Fighters. Physics World, 23 (2). 26-30.
Hoang, D.T., Kaur, J. and Menczer, F., Crowdsourcing Scholarly Data. in WebSci10: Extending the Frontiers of Society On-Line, (Raleigh, NC, 2010).
Kolchinsky, A., Abi-Haidar, A., Kaur, J., Hamed, A.A. and Rocha, L.M. Classification of Protein-protein Interaction Full-text Documents Using Text and Citation Network Features. IEEE/ACM Transactions On Computational Biology And Bioinformatics.
Kurtz, M. and Bollen, J. Usage Bibliometrics. in Cronin, B. ed. Annual Review of Information Science and Technology, Information Today, Inc., 2010.
Lourenço, A., Carreira, R.C., Glez-Peña, D., Méndez, J.R., Carneiro, S.A., Rocha, L.M., Díaz, F., Ferreira, E.C., Rocha, I.P., Fdez-Riverola, F. and Rocha, M. BioDR: Semantic indexing networks for biomedical document retrieval. Expert Systems with Applications, 37 (4). 3444-3453.
Meiss, M., Goncalves, B., Ramasco, J., Flammini, A. and Menczer, F., Agents, Bookmarks, and Clicks: A topical model of Web navigation. in The 21st ACM International Conference on Hypertext and Hypermedia (Hypertext2010), (Toronto, Canada, 2010), ACM.
Schifanella, R., Barrat, A., Cattuto, C., Markines, B. and Menczer, F., Folks in Folksonomies: Social Link Prediction from Shared Metadata. in The 3rd ACM International Conference on Web Search and Data Mining (WSDM), (New York, NY, 2010), ACM, 271-280.
Vespignani, A. Complex networks: The fragility of interdependency. Nature, 464. 984-985.
Yaeger, L., Sporns, O., Williams, S., Shuai, X. and Dougherty, S., Evolutionary Selection of Network Structure and Function. in 12th International Conference on the Synthesis and Simulation of Living Systems (Artificial Life XII), (Odense, Denmark, Submitted).

Data to Insight Center Publications
Dalmau, M. and Schlosser, M. Challenges of Serials Text Encoding in the Spirit of Scholarly Communication. Library Hi Tech, 28 (3).
Herath, C. and Plale, B., Streamflow - Programming Model for Data Streaming in Scientific Workflows. in The 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid2010), (Melbourne, Australia, 2010), IEEE, 302.
Katz, D.S., Callaghan, S., Harkness, R., Jha, S., Kurowski, K., Manos, S., Pamidighantam, S., Pierce, M., Plale, B., Song, C. and Towns, J. Science on the TeraGrid. 04/2010, 2010.
Ramakrishnan, L. and Plale, B. Multidimensional Classification Model for Scientific Workflow Characteristics. 1st International Workshop on Workflow Approaches to New Data-centric Science (WANDS'10) co-located with ACM SIGMOD International Conference on Management of Data, Indianapolis, IN. 06/2010, 2010.
Ramakrishnan, L., Plale, B. and Gannon, D., WORKEM: Representing and Emulating Distributed Scientific Workflow Execution State. in The 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid2010), (Melbourne, Australia, 2010), IEEE, 283.

Center for Applied Cybersecurity Research Publications
Antolovic, D. Radiolocation in Ubiquitous Wireless Communication. Springer, New York, 2010.
Arenson, A.D., Bakhireva, L.N., Chambers, C.D., Deximo, C.A., Foroud, T., Jacobson, J.L., Jacobson, S.W., Jones, K.L., Mattson, S.N., May, P.A., Moore, E.S., Ogle, K., Riley, E.P., Robinson, L.K., Rogers, J., Streissguth, A.P., Tavares, M.C., Urbanski, J., Yezerets, Y., Surya, R., Stewart, C.A. and Barnett, W.K. Implementation of a Shared Data Repository and Common Data Dictionary for Fetal Alcohol Spectrum Disorders Research. Alcohol.
Cate, F.H. The Limits of Notice and Choice. IEEE Security & Privacy, 8 (2). 59-62.
Cate, F.H. Playing charades with terrorists. Washington Times, Washington, D.C. 02/03/2010, 2010.
Chang, C.-I., Jiao, X., Wu, C.-C., Du, Y. and Chang, M.-L. A Review of Unsupervised Spectral Target Analysis for Hyperspectral Imagery. EURASIP Journal on Advances in Signal Processing. 26.
Chen, S., Wang, R., Wang, X. and Zhang, K., Side-Channel Leaks in Web Applications: a Reality Today, a Challenge Tomorrow. in The 31st IEEE Symposium on Security and Privacy, (Oakland, CA, 2010), IEEE.
Du, Y., Arslanturk, E., Zhou, Z. and Belcher, C. Video based non-cooperative iris image segmentation. IEEE Transactions on Systems, Man, and Cybernetics, Part B (99). 1-11.
Du, Y., Belcher, C. and Zhou, Z. Scale Invariant Gabor Descriptor-based Noncooperative Iris Recognition. EURASIP Journal on Advances in Signal Processing 13.
Du, Y., Ives, R.W. and Etter, D.M. New approaches to iris recognition: 1-dimensional algorithm. in Voeller, J.G. ed. Wiley Handbook of Science and Technology for Homeland Security, Wiley, Hoboken, NJ, 2010.
Ives, R., Bishop, D., Du, Y. and Belcher, C. Iris Recognition: The Consequences of Image Compression. EURASIP Journal on Advances in Signal Processing 9.
Jiao, X., Chang, C.-I. and Du, Y. Orthogonal subspace projection approach to finding signal sources in hyperspectral imagery. Proceedings of SPIE, SPIE 04/2010, 2010.7695
Kalafut, A.J., Shue, C.A. and Gupta, M., Malicious Hubs: Detecting Abnormally Malicious Autonomous Systems.
in The 29th IEEE Conference on Computer Communications, (San Diego, CA, 2010), IEEE, 1-5.
Kapadia, A., Myers, S., Wang, X. and Fox, G., Secure Cloud Computing with Brokered Trusted Sensor Networks. in The 2010 International Symposium on Collaborative Technologies and Systems (CTS 2010) (Chicago, IL USA, 2010), IEEE, 581-592.
Li, F., Yang, Y., Wu, J. and Zou, X., Fuzzy Closeness-based Delegation Forwarding in Delay Tolerant Networks. in IEEE International Conference on Networking, Architecture, and Storage (NAS), (Macau, China, In Press), IEEE.
Moyle, L.C., Muir, C.D., Han, M.V. and Hahn, M.W. The contribution of gene movement to the "Two Rules of Speciation". Evolution.
Nuzhdin, S.V., Rachkova, A. and Hahn, M.W. The strength of transcription-factor binding modulates co-variation in transcriptional networks. Trends in Genetics, 26 (2). 51-53.
Prakash, P., Kumar, M., Kompella, R.R. and Gupta, M., PhishNet: Predictive Blacklisting to Detect Phishing Attacks. in The 29th IEEE Conference on Computer Communications, (San Diego, CA, 2010), IEEE, 1-5.
Schrider, D.R. and Hahn, M.W. Lower linkage disequilibrium at CNVs is due to both recurrent mutation and transposing duplications. Molecular Biology and Evolution, 27 (1). 103-111.
Shue, C.A. and Gupta, M., Hiding in Plain Sight: Exploiting Broadcast for Practical Host Anonymity. in Hawaii International Conference on System Sciences (HICSS), 2010, (Koloa, Kauai, HI, 2010), Computer Society Press, (9 pages).
Sui, Y., Yang, K., Du, Y., Orr, S. and Zou, X., A novel key management scheme using biometrics. in Mobile Multimedia/Image Processing, Security, and Applications 2010 (Orlando, FL, 2010), SPIE, 77080C-77080C-77010.
Thomas, N.L., Zhou, Z. and Du, Y., A new approach for sclera vein recognition. in Mobile Multimedia/Image Processing, Security, and Applications 2010, (Orlando, FL, 2010), SPIE.
Turner, T.L. and Hahn, M.W. Genomic islands of speciation or genomic islands and speciation? Molecular Ecology, 19 (5). 848-850.
Du, Y., Belcher, C., Zhou, Z. and Ives, R.W. Feature Correlation Evaluation Approach for Iris Image Quality Measure. Signal Processing, 90 (4). 1176-1187.
Yang, K., Sui, Y., Zhou, Z., Du, Y. and Zou, X., A new approach for cancelable iris recognition. in Mobile Multimedia/Image Processing, Security, and Applications 2010 (Orlando, FL, 2010), SPIE.
Zhou, Z., Du, Y. and Delp III, E.J., A new approach for multiple wavelength-based iris recognition. in Mobile Multimedia/Image Processing, Security, and Applications 2010, (Orlando, FL, 2010), SPIE.

Advanced Network Management Lab
Meiss, M., Goncalves, B., Ramasco, J., Flammini, A. and Menczer, F., Agents, Bookmarks, and Clicks: A topical model of Web navigation. in The 21st ACM International Conference on Hypertext and Hypermedia (Hypertext2010), (Toronto, Canada, 2010), ACM.

Research Technologies Publications

Life Science Group
Wu, L., Sidney L. Shaw: A Hypothesis Testing Approach For Fluorescent Blob Identification. Proceedings of 20th International Conference on Pattern Recognition (ICPR 2010).

Appendix 5: Presentations (January 1 – May 31, 2010)

Digital Science Center Presentations

Community Grids Lab
S.E. Bae, “Scalable High Performance Dimension Reduction with Text Version,” April 15, 2010.
J. Y. Choi, “Generative Topographic Mapping in Life Science,” April 12, 2010.
J. Y. Choi, S.H. Bae, X. Qiu and G. Fox, “High Performance Dimension Reduction and Visualization for Large High-dimensional Data Analysis,” The 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Melbourne, Australia, May 17-20, 2010.
J.Y. Choi (speaker), L. Wang, G. von Laszewski, J. Dayal, “Towards Energy Aware Scheduling for Precedence Constrained Parallel Tasks in a Cluster with DVFS,” 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Melbourne, Victoria, Australia, May 17-20, 2010.
G. Fox, “FutureGrid in a Nutshell,” Status report, May 12, 2010.
G.
Fox, “Clouds Cyberinfrastructure and Collaboration,” The 2010 International Symposium on Collaborative Technologies and Systems, The Westin Lombard Yorktown Center, Chicago, Illinois, USA, May 20, 2010.
G. Fox, “Clouds and FutureGrid,” MSI-CIEC All Hands Meeting, San Diego Supercomputer Center, January 27, 2010.
G. Fox, “Cloud Technologies and Data Intensive Applications,” INGRID 2010 Instrumenting the Grid, Poznan, Poland, May 13, 2010.
G. Fox and S. Jha, “Introduction to Programming Paradigms Activity at Data Intensive Workshop,” Data Intensive Research Workshop, National e-Science Center, Edinburgh, UK, March 15, 2010.
G. Fox, J. Qiu and SALSA Group, “MapReduce and Clouds for Science,” Microsoft External Research Review, Redmond, WA, April 6-7, 2010.
G. Fox, “Overview of Cyberinfrastructure and the Breadth of Its Application,” Howard University, Cyberinfrastructure Day, April 16, 2010.
G. Fox, “PolarGrid,” Meadowood, Bloomington, Indiana, January 12, 2010.
G. Fox, “PolarGrid NSF CReSIS Review,” National Science Foundation, April 13, 2010.
G. Fox, J. Qiu, S. Beason, J. Ekanayake, T. Gunarathne, J. Y. Choi, S.H. Bae, Y. Ruan, H. Li, B. Zhang, S. Ekanayake and S. Wu, “SALSA Group’s Collaborations with Microsoft,” February 5, 2010.
G. Fox, “SALSA Group Poster, Spread Out Version,” Data Intensive Research Workshop, National e-Science Center, Edinburgh, UK, March 17, 2010.
S. Marru, “OGCE Workflow Toolkit for Multi-Scale Science Applications,” One Degree Imager Gateway/Pipeline, National Optical Astronomy Observatory, Tucson, AZ, January 21, 2010.
S. Marru, R. Singh, M. Pierce, “UltraScan Gateway Advanced Support,” TeraGrid Science Gateway Teleconference, April 16, 2010.
S. Myers, A. Kapadia, X. F. Wang and G. Fox, “Secure Cloud Computing With Brokered Trusted Sensor Networks” and “Side-Channel Threats to Web Applications,” Innovation Center, Indiana University, Bloomington, IN, March 29, 2010.
K.
Narayan, “Using SWARM service to run a Grid based EST Sequence Assembly,” School of Informatics and Computing Masters Capstone, May 17, 2010.
T. Pace, “Human-Centered e-Science: A Group-Theoretic Perspective on Cyberinfrastructure Design,” The 2010 International Symposium on Collaborative Technologies and Systems, The Westin Lombard Yorktown Center, Chicago, Illinois, USA, May 19, 2010.
M. Pierce and G. Fox, “Building Effective CyberGIS: FutureGrid,” National Science Foundation TeraGrid Workshop on Cyber-GIS, Washington, DC, February 2-3, 2010.
M. Pierce, “Building and Testing OGCE Software on the NMI Build and Test Facility,” SDCI/STCI Build and Test Workshop, National Science Foundation, January 27, 2010.
M. Pierce, G. Fox, X. Gao, J. Ji and C. Sun, “Indiana University QuakeSim Activities,” January 14, 2010.
M. Pierce, G. Fox, S. Challa, “IU OREChem Summary Slides,” ACS, March, 2010, and Microsoft External Research Meeting, April, 2010.
M. Pierce, “Science Gateway Advanced Support Activities in PTI,” Innovation Center, Indiana University, Bloomington, IN, March 4, 2010.
J. Qiu, “Cloud Technologies and Their Applications,” Indiana University, Bloomington, IN, February 12, 2010.
J. Qiu, “Cloud Technologies and Their Applications,” Indiana University, Bloomington, IN, March 26, 2010.
J. Qiu, “Cloud Technologies and Their Applications,” Keynote at 5th International Workshop on Content Delivery Networks, The 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Melbourne, Australia, May 17, 2010.
J. Qiu and G. Fox, “Digital Science Center,” Innovation Center, Indiana University, Bloomington, IN, February 12, 2010.
J. Qiu, “Performance of Windows Multicore Systems on Threading and MPI,” Proceeding Frontiers of GPU, Multi- and Many-Core Systems Workshop, The 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Melbourne, Australia, May 18, 2010.
Qiu, J.
and SALSA Group, “SALSA and Cheminformatics,” Innovation Center, Indiana University, Bloomington, IN, February 12, 2010.
J. Qiu and SALSA Group, “SALSA Group’s Collaborations with Microsoft,” Innovation Center, Indiana University, Bloomington, IN, February 2, 2010.
R. Schlegel, K. Zhang, Z. Li, A. Kapadia, X. F. Wang, “Soundminer: A Stealthy and Context-Aware Sound Trojan for Smartphones,” Innovation Center, Indiana University, Bloomington, IN, March 29, 2010.
C. Stewart, “FutureGrid: An Experimental, High-Performance Grid Testbed,” TeraGrid meeting, March 3, 2010.
G. von Laszewski, “Future Grid Introduction,” Open Science Grid All Hands Meeting, Fermilab, Batavia, IL, March 8, 2010.
G. von Laszewski, “Future Grid Introduction,” Department of Energy, April 7, 2010.
L. Wang, G. von Laszewski, J. Tao and M. Kunze, “Schedule Distributed Virtual Machines in a Service Oriented Environment,” 24th IEEE International Conference on Advanced Information Networking and Applications, Perth, Australia, April 20-23, 2010 (talk not presented due to visa issues).

Complex Networks and Systems
J. Bollen, “The MESUR Project: Studying Science from Usage Data – Implications for Scholarly Impact Metrics,” 12th Fiesole Collection Meeting, Leuven, Belgium, April 8-10, 2010.
J. Bollen, “Tracking Science in Real-Time from Large-Scale Usage Data,” Computation Institute, University of Chicago, April 2, 2010.
J. Bollen, “Tracking Science in Real-Time from Large-Scale Usage Data,” Annual Spring conference of the German Physical Society, Regensburg, Germany, March 21-26, 2010.
F. Menczer, R. Schifanella, A. Barrat, C. Cattuto and B. Markines, “Folks in Folksonomies: Social Link Prediction from Shared Metadata,” Third ACM International Conference on Web Search and Data Mining 2010, New York, NY, February 3 – 6, 2010.
F. Menczer, Presentation to 2010 National Science Foundation Information Integration and Informatics PI Workshop, Rosslyn, VA, April 22 – 23, 2010.
F.
Menczer, National Science Foundation Workshop on Examining Web-Scale Research Collaboration, Rensselaer Polytechnic Institute, Troy, NY, April 7 – 8, 2010.
F. Menczer, NSF III PI Workshop 2010, poster and demo: http://cs.georgetown.edu/NSF-III-2010/.
F. Menczer, Web Science 2010, Raleigh, NC, April 26 – 27, 2010. http://www.websci10.org/home.html.
A. Vespignani, “Reaction-Diffusion Processes in Multiscale Networks and the Spatial Spread of Infectious Diseases,” Mini Stat Mech Meeting, University of California, Berkeley, CA, January 8 – 10, 2010.
A. Vespignani, “Predicting the Behavior of Techno-Social Systems,” BIFI 2010 International Conference, Zaragoza, Spain, February 3 – 6, 2010.
A. Vespignani, “Predicting the Behavior of Techno-Social Systems: How Informatics and Computing Help to Fight off Global Pandemics,” James Martin 21st Century School, Oxford University, UK, February 25, 2010.
A. Vespignani, “Multiscale Mobility Networks and the Large Scale Spreading of Infectious Diseases,” American Physical Society March Meeting, Portland, Oregon, March 14 – 17, 2010.
A. Vespignani, “Forecasting Techno-Social Systems: How Physics and Computing Help to Fight Off Global Pandemics,” American Physical Society March Meeting, Portland, OR, March 14 – 17, 2010.
A. Vespignani, “Predicting the Behavior of Techno-Social Systems: How Complex Networks, Physics and Computing Help to Fight Off Global Pandemics,” Center for Scientific Computation & Mathematical Modeling Nonlinear Dynamics of Networks, University of Maryland, College Park, April 7, 2010.
A. Vespignani, “Predicting the Behavior of Socio-Technical Systems: A Network Approach,” Social and Cognitive Networks Academic Research Center, Rensselaer Polytechnic Institute, Troy, NY, May 4, 2010.
A. Vespignani, “Theory of Cascading Events in Complex Networks,” NetSci 2010 Satellite Event: The Fidelity Center for Applied Complexity presents: Cascading Events in Complex Financial Networks, Boston, MA, May 11, 2010.
L.
Yaeger, Norwegian University of Science and Technology, May 18, 2010.

Open Systems Lab
N. Edmonds, A. Lumsdaine and J. Willcock, “Active Messages for Parallel Graph Computations,” Society for Industrial and Applied Mathematics Conference on Parallel Processing for Scientific Computing, Seattle, WA, February 24-26, 2010.
R. Heiland, “Workflows for Parameter Studies of Multi-Cell Modeling,” High Performance Computing Symposium 2010, Symposium of the Spring Simulation Multiconference, Orlando, FL, April 12-14, 2010.
A. Lumsdaine, “On Informatics Computations (and the Parallel Boost Graph Library),” Intel workshop: Academic Research on Parallel Algorithms for Non-Numeric Computing, Santa Clara, CA, March 9, 2010.
A. Lumsdaine, “Rich Image Capture with Plenoptic Cameras,” International Conference on Computational Photography, MIT, Cambridge, MA, March 28-30, 2010.
A. Lumsdaine, “Theory and Methods of Lightfield Photography,” Eurographics 2010, Norrköping, Sweden, May 3-7, 2010.

Data to Insight Center Presentations
C. Eller, “Applying Stereoscopic Video and Animation to the Digital Arts and Humanities,” Institute for Digital Arts and Humanities, Bloomington, IN, March 4, 2010.
S.T. Kowalczyk, “Data Publishing,” American Society for Information Science and Technology Research Data Access and Preservation Summit, Phoenix, AZ, April 9-10, 2010.
R.H. McDonald et al., “Beyond the Silos of the LAMs: Time to Speak Up. Collaborative and Open Software Development Directions for Libraries, Archives and Museums,” CNI Spring Membership Meeting, Baltimore, MD, April 12, 2010.
R.H. McDonald and M. Winkler, “Building A Sustainable Software Community for Academic Libraries through Kuali OLE,” Java Application Special Interest Group Spring 2010 Conference, San Diego, CA, March 10, 2010.
R.H. McDonald et al., “The Future of Higher Education: What Is the IT Professional’s Role?,” Educause Midwest Regional Conference, Chicago, IL, March 17, 2010.
B.
Plale, “Earth Systems Data in Real Time Applications: Low Latency, Metadata, and Preservation,” Data-Intensive Research: How Should We Improve our Ability to Use Data, e-Science Institute, University of Edinburgh, Edinburgh, Scotland, March 2010.
B. Plale, “LEAD II / Trident Workflows for Timely Weather Products: the Challenge of Vortex2,” Microsoft External Research Symposium, April 2010.
J. Riley, “FRBRized Discovery and Cataloging,” Music Library Association Annual Meeting, San Diego, CA, March 23, 2010.
J. Riley, “How Much to Semanticize? Looking at the future of Library Data and the Semantic Web,” University of Illinois at Urbana-Champaign Library Colloquium Series, April 21, 2010.
J. Riley, “Shareable Metadata for Visual Resources,” Visual Resources Association Annual Conference, Atlanta, GA, March 18, 2010.
B. Sherman and E. Wernert, “3D on the Expensive -- Opportunities, Challenges, and Why it Can Still Make Sense/Cents,” Counterpoint to “3D on the Cheap,” Department of Energy Computer Graphics Forum, Park City, Utah, April 12, 2010.
B. Sherman, “You should be using… immersive visualization,” 2010 DOE Computer Graphics Forum, Park City, Utah, April 12, 2010.
E. Wernert, “Indiana University Introduction & Site Report,” Department of Energy Computer Graphics Forum, Park City, Utah, April 12, 2010.

Center for Applied Cybersecurity Research Presentations
F.H. Cate, “Data Can Be Good: Exploring Alternatives to Data Minimization for Protecting Privacy,” 2010 International Association of Privacy Professionals Global Privacy Summit, Washington, DC, April 21, 2010.
F.H. Cate, “Internet, Blogs, and Social Networking: Legal and Ethical Issues,” Spring Judicial College Program, Indiana Judicial Center, Indianapolis, IN, April 15, 2010.
F.H. Cate, “The History and Purpose of Phi Beta Kappa,” Academic Convocation, Elon University, Elon, NC, April 13, 2010.
F.H.
Cate, “A Nation Under Siege: Information Security and the Attack on American Interests,” Center for Applied Cybersecurity Research, Higher Education Cybersecurity Summit, Indianapolis, IN, April 1, 2010.
F.H. Cate, “Liberty and Law,” Phi Beta Kappa Address, Purdue University, West Lafayette, IN, March 25, 2010.
F.H. Cate, “Legal Issues in Pervasive and Autonomous Information Technology,” Ethical Guidance for Research and Application of Pervasive and Autonomous Information Technology, Association for Practical and Professional Ethics and Indiana University Poynter Center for the Study of Ethics and American Institutions, Cincinnati, OH, March 3-4, 2010.
F.H. Cate, “Private Data in Public Hands,” “Navigating the Digital Ocean: Riding the Waves of Change,” 11th Annual Privacy and Security Conference and Exposition, Victoria, BC, February 9-10, 2010.
F.H. Cate, “Protecting Privacy in Health Research: The Limits of Notice and Choice,” Symposium Celebrating the 50th Anniversary of Dean William L. Prosser’s Privacy, California Law Review, University of California Berkeley, Berkeley, CA, January 29, 2010.
M. Gupta, “Project Bloom: Empowering the Security Research Community through Data Projects and Computing,” Midwest University Industry Summit, Purdue University, West Lafayette, IN, March 31, 2010.
J. Harris and R.L. Hill, “Building a Trusted Image for Embedded Communications Systems,” 6th Annual Cyber Security and Information Intelligence Workshop, Oak Ridge, TN, April 21-23, 2010.
R.L. Hill, “PlugNPlay Trust for Embedded Communications Systems,” The Symposium on Computing at Minority Institutions, Jackson State University, Jackson, MS, April 8-10, 2010.
R.L. Hill, “Characterizing Trustworthy Behavior of Email Servers,” The Symposium on Computing at Minority Institutions, Jackson State University, Jackson, MS, April 8-10, 2010.
S.A.
Myers, “One Bit Encryption is Complete,” 2010 International Symposium on Collaborative Technologies and Systems, Chicago, IL, May 17-21, 2010.
S.A. Myers, “One Bit Encryption is Complete,” Securing Information Technology in Healthcare Workshop, Dartmouth College, Hanover, NH, May 17, 2010.
S.A. Myers, “One Bit Encryption is Complete,” 2010 IBM Research Cryptography Seminar, Columbia University/City University of New York/New York University, March 2010.
S.A. Myers, “One Bit Encryption is Complete,” Princeton University Theory Seminar, March 12, 2010.
D. Ripley, A. Grubesi and T. Matisziw, “Geography of Internet2 Netflow,” NetFlo Conference 2010, New Orleans, LA, January 12, 2010.
D. Ripley, “Shared Darknets,” REN-ISAC Member Meeting, Educause Security Professionals Conference 2010, Atlanta, GA, April 14, 2010.

Research Technologies Presentations

Systems Group
D. Hancock, “FutureGrid: Design and Implementation of a National Grid Test-Bed,” Cray User Group, Edinburgh, Scotland, UK, May 25, 2010.
D. Hancock, “FutureGrid Overview,” SPXXL, IBM Large User Group, Maui, HI, January 14, 2010.
D. Hancock, “High Performance Computing monitoring at Indiana University,” presented by Corey Shields, International Group of System Administrators, Argonne National Labs, Argonne, IL, May 24, 2010.

Visualization Group
C. Eller, “Applying Stereoscopic Video and Animation to the Digital Arts and Humanities,” Institute for Digital Arts and Humanities, Bloomington, IN, March 4, 2010.
B. Sherman, “You should be using… immersive visualization,” panel presentation at 2010 DOE Computer Graphics Forum, Park City, Utah, April 12, 2010.
B. Sherman and E. Wernert, “3D on the Expensive -- Opportunities, Challenges, and Why it Can Still Make Sense/Cents,” Counterpoint to “3D on the Cheap,” DOE Computer Graphics Forum, Park City, Utah, April 12, 2010.
E. Wernert, “Indiana University Introduction & Site Report,” DOE Computer Graphics Forum, Park City, Utah, April 12, 2010.

Life Science Group
W.K.
Barnett, G. Elmore and V. Sheehan, “Cyberinfrastructure for Research and Healthcare,” Marshall University, Huntington, WV, January 14, 2010.
W.K. Barnett, “i2iconnect: Bridging Inventors and Industry,” CTSA Industry Forum, Bethesda, MD, February 17, 2010.
W.K. Barnett, “The Federated ID and How it Benefits Collaboration,” CTSA Communications Key Function Committee Annual Meeting, Bethesda, MD, March 4, 2010.
W.K. Barnett, “The Indiana CTSI HUB,” Hubbub 2010, Indianapolis, IN, April 13, 2010.

Appendix 6: Active and Pending Grants

Appendix 7: Interim Financial Report

Appendix 8: Education, Outreach and Training Events
Group Digital Science Center Community Grids Lab Community Grids Lab Community Grids Lab Community Grids Lab Community Grids Lab Event Title Open Science Grid Meeting TeraGrid Meeting DOE MAGIC Meeting CCGrid2010 Demo OGCE Science Gateway Tutorial Description Conference Name/Location Date(s) Approximate # of Attendees Introduction to FG ? OSG Meeting >50 Technical unknown ? Technical unknown Introduction to FG Demonstrating features of FG Tutorial on how to use OGCE tools to build science gateways REU Interns Research Class MAGIC, virtual meeting CCGrid 2010 March 9, 2010 March 2010 7 April 2010 May 2010 >15 Technical unknown ? Technical unknown Indiana University Innovation Center March 31, 2010 10-15 Technical, scientific 0 Bloomington, Indiana Spring Semester, 2010 January 2010 7 0 N/A Undergraduate Technical Lay unknown TG2010 Audience Type Community Grids Lab Indiana University CNetS Interview to Portuguese network TV channel RTP2 (Bairro Alto program) Video available at: http://www.youtube.com/watch?
v=iZ8D_NzFpJw CNetS “Somos todos Cyborgs’, Interview to Jornal de Letras, Tercafeira Game On http://aeiou.visa o.pt/somostodoscyborgs=f554308 April 6, 2010 N/A Technical State Science Olympiad/Bloomi ngton, IN 3/20/10 40 K-12 Industry Collaboration Workshop on Life Sciences Informatics/IUB 5/6/10 50 Technical, Business Open Systems Lab Open Systems Lab Life Sciences Workshop Middle school students compete to develop a computer game, teaching programming skills. Share research from SOIC faculty and Indiana Life Sciences industries. Attendees from Traditionally Underrepres ented Groups* unknown 97 Group Open Systems Lab Data to Insight Center Digital Library Program Event Title Description Conference Name/Location Date(s) Approximat e # of Attendees NSF/TCPP Curriculum Planning workshop on Parallel and Distributed Computing http://www.c s.gsu.edu/~tc pp/curriculu m/?q=worksh op Washington, DC 2/52/6/2010 20 Business The New Digital Library of the Commons Talk on the DLC service that provides free and open access to fulltext articles, papers, and dissertations. Talk on results of the Variations3 project's development and evaluation activities. Talk on the recentlyadopted IU Web Accessibility Administrativ e Practice. Talk on IUScholarWor ks Journal Service, an open access publishing option for IU scholars who desire local control over their journals. DLP Brown Bag 1.27.10 28 Technical DLP Brown Bag 2.10.10 26 technical DLP Brownbag 3.10.10 30 technical DLP Brownbag 3.24.10 35 general Digital Library Program Variations: Building a Digital Music Library Community Digital Library Program Web Accessibility at IU Digital Library Program IUScholarWorks Journals Service Panel Discussion Audience Type Attendees from Traditionally Underrepres ented Groups* 98 Group/Lab Data to Insight Center (Cntd.) 
Digital Library Program Event Title Streamlining the Digitized Image Workflow Digital Library Program Building better metadata creation tools Digital Library Program Omeka at the Lilly Visualization and Interactive Spaces Lab Down By The Water Description Conference Name/Locatio n Date(s) Approximate # of Attendees Discussions of the ways in which DLP has been able to streamline digitization, metadata entry, archival, discovery and delivery for digital image collections. Examine research on data entry interfaces, look at the state-ofthe-art in metadata creation tools, demonstrate some features that make metadata creation tools work well. An overview of Omeka's features and demo of local pilot project using Omeka to showcase digitized content from the collections of the Lilly Library. Field exercise conducted at Indiana schools, supported by VIS Lab software and personnel DLP Brownbag 4.21.10 25 technical DLP Brownbag 5.5.10 28 technical DLP Brownbag 4.7.10 26 general Forest Glen Elementary Various dates in April 36 Audience Type (Technical , General, Business, k-12, etc.) K-12 Attendees from Traditionall y Underrepre sented Groups* Varies by school 99 Group/Lab Center for Applied Cybersecurity Research CACR Event Title Health Informatics Seminar CACR Security Seminar CACR Security Seminar CACR Health Informatics Seminar CACR Security Seminar CACR Health Informatics Seminar Description Conference Name/Location Date(s) Approximat e # of Attendees Audience Type (Technical , General, Business, k-12, etc.) 
Attendees from Traditionally Underrepresented Groups*
Denny Morrison – Centerstone Research Institute; Department of Computer Science, Lindley Hall; January 14, 2010; 20; General; 10%
Eliza Du – Biometrics and Cancellable Biometrics; Maurer School of Law; January 21, 2010; 23; Technical; 20%
Carl Gunter – Cybersecurity Architectures for Control Systems; Maurer School of Law; February 4, 2010; 27; Technical; 10%
Michael Reece and Debby Herbenick – Integrating Methodological Advances and Technology to Understand and Improve Sexual Health; Department of Computer Science, Lindley Hall; February 11, 2010; 20; General; 5%
Steve Myers – One Bit Encryption is Complete; Maurer School of Law; February 18, 2010; 35; Technical; 5%
David Hakken – Creating an Academic Program in Health Informatics; Department of Computer Science, Lindley Hall; February 25, 2010; 18; General; 2%
Group/Lab Center for Applied Cybersecurity Research (Contd.) CACR Event Title Security Seminar CACR Health Informatics Seminar CACR Health Informatics Seminar CACR CACR Higher Education Cybersecurity Summit CACR Security Seminar CACR Health Informatics Seminar Description Conference Name/Location Date(s) Approximate # of Attendees Audience Type (Technical, General, Business, k-12, etc.)
Attendees from Traditionally Underrepresented Groups*
Susan Hohenberger – New Developments in Digital Signatures; Maurer School of Law; March 4, 2010; 29; Technical; 5%
Peter Todd – Delivering Information to Help Shoppers Make Healthier Choices; Department of Computer Science, Lindley Hall; March 11, 2010; 30; General; 5%
Stephen Downs – Child Health Improvement through Computer Automation; Department of Computer Science, Lindley Hall; March 25, 2010; 12; General
Conference on Cybersecurity; University Place Conference Center, Indianapolis, IN; April 1, 2010; 290; Technical, Professional, Administration; 10%
Yvo Desmedt – 60 Years of Scientific Research in Cryptography: A Reflection; IUPUI – Taylor Hall; April 7, 2010; 20; Technical; 5%
Hamid Ekbia – Dubious Partners: Serious Games and Personal Health; Department of Computer Science, Lindley Hall; April 8, 2010; 22; General; 5%
Group/Lab Event Title Center for Applied Cybersecurity Research (Contd.) CACR Security Seminar CACR Security Seminar Research Technologies Systems Systems Systems Visualization Visualization Visualization Visualization Visualization Lunch with the Sysadmins Lunch with the Sysadmins Lunch with the Sysadmins STC Tour IT Student Ambassadors Tour Margaret Single Student Tour Society for Hispanic Professional Engineers Tour Arenson Center for Research Tour Description Conference Name/Location Date(s) Approximate # of Attendees Susan Hohenberger – New Developments in Digital Signatures Shishir Nagaraja – Slaying the Snooping Dragon: Detecting Botnets via Structured Graph Analysis Maurer School of Law March 4, 2010 29 Technical 5% Maurer School of Law April 15, 2010 29 Technical 5% Outreach Simon Hall, IUB 2 Technical N/A Outreach Cyclotron Facility, IUB Science Building, IUB Advanced Visualization Lab, IUPUI January 11, 2010 February 8, 2010 March 8, 2010 January 2010 7 Technical N/A 2 Technical N/A 4 Professional 1 Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI
January 12, 2010 6 Undergradu ate Students 0 January 21, 2010 1 Student 0 February 2, 2010 15 Technical 15 Advanced Visualization Lab, IUPUI February 2010 6 Technical 3 Outreach Tour of group from society of technical communicati on Tour of AVL Tour of AVL Tour of AVL Tour of AVL Audience Type (Technical , General, Business, k-12, etc.) Attendees from Traditionally Underrepres ented Groups* 102 Group/Lab Description Conference Name/Location Date(s) Approximate # of Attendees Bailey tour for Chinese visitors Tour of AVL February 2010 4 Technical 0 Visualization Tour for Hoosier Mamas Group Tour of AVL Bloomington February 2010 20 Lay 0 Visualization Tour for Swinford and Colleagues Tour of AVL February 2010 4 Technical 0 Visualization Tour for Kristy Kallback-Rose Tour of AVL February 2010 1 Technical 0 Visualization Tour for Jason Moore School of Medicine Faculty Candidate Informatics Visualization Class Tour Tour of AVL Advanced Visualization Lab, IUPUI Lindley Hall 120 Bloomington Campus Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI March 2010 1 Technical 0 Lindley Hall 120 Bloomington Campus Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI March 2010 16 Undergradu ate Students 1 March 2010 14 Technical 0 March, 2010 16 Technical 0 April, 2010 7 Students 0 Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI April, 2010 21 Students 4 April 30, 2010 28 Technical 1 April 2010 5 Technical 3 April 2010 21 Technical 0 Research Technologies (Cntd.) 
Visualization Visualization Event Title Visualization Tour of AVL Tour of AVL D510 Critique Visualization Visualization Tour of AVL Technology Partnership Wabash Twenty First Century Scholars Tour April 19th Visualization Visualization Visualization Visualization Tour of AVL Tour of AVL Acheson Class Tour Women in computing tour April 30th Stereo video presentation to Telecom T351 stereo video presentation to Telecom T284 Tour of AVL Tour of AVL Tour of AVL Audience Type (Technical , General, Business, k-12, etc.) Attendees from Traditionally Underrepres ented Groups* 103 Group/Lab Event Title Research Technologies (Cntd.) Visualization Visualization D510 Critique 3D opera sneak peek for Travis Gregg Visualization Visualization Visualization Description Conference Name/Location Date(s) Approximate # of Attendees Tour of AVL Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI April, 2010 16 Technical 0 April, 2010 1 Lay 0 May, 2010 15 Technical 0 May, 2010 4 Technical 0 Tour of AVL Advanced Visualization Lab, IUPUI May, 2010 1 Technical 0 Tour of AVL Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI Advanced Visualization Lab, IUPUI May, 2010 6 Technical 1 May, 2010 8 Technical 0 May, 2010 1 Technical 0 May, 2010 3 Technical 0 Tour of AVL Tour of AVL #31784: D510 Final Exhibition #31953 Quick Tour for Laurie Antolovic and commercial group #33036: Host Jon Vickers in LH 120 to show stereo HD and mono 4K content Visualization Tour of AVL #31802: Dentistry Visualization Tour of AVL #31520: IDAH Visualization Tour of AVL #31892: MIT faculty tour Visualization Tour of AVL #32813: Engineering tour Audience Type (Technica l, General, Business, k-12, etc.) 
Attendees from Traditionally Underrepresented Groups*

*Includes groups traditionally underrepresented in the science and technology fields including: African Americans, Native Americans, Native Pacific Islanders, Hispanic Americans.

Appendix 9: Public and Governmental Service Activities
Group/Lab: Digital Science Center, Community Grids Lab
Event Title: US Department of Energy MAGIC
Description: Introduction to Future Grid Project
Conference Name/Location: MAGIC, virtual meeting
Date(s): April 7, 2010

Appendix 10: News and Media Placements
PTI leadership, researchers or projects were featured more than 70 times in news releases and items in the online, print, and broadcast media during the reporting period. Summaries of the main news items for the reporting period are as follows:

January 5, 2010: Fearless Flying with Fred H. Cate. Privacy and security expert Fred Cate, Director of PTI's Center for Applied Cybersecurity Research, shares his thoughts on airport security in recent issues of Miller-McCune, Indianapolis Star, Newark Star Ledger, and the IU News Room.

January 30, 2010: Call for participation: NSF Campus Bridging Technologies workshop. IU invites position papers for the NSF-sponsored workshop to be held April 7-8, 2010 at University Place Conference Center on the IUPUI campus in Indianapolis. Deadline: March 1, 2010.

February 2, 2010: Tony Hey, VP of the External Research Division of Microsoft, to keynote DSC’s inaugural Seminar Series. Tony Hey, VP of the External Research Division of Microsoft Research, speaks at DSC's inaugural 'Seminar Series' in the IU Innovation Center.

February 11, 2010: Bruce Schneier to keynote sixth annual Cybersecurity Summit. On April 1, renowned security guru Bruce Schneier will headline CACR's 2010 Higher Education Cybersecurity Summit.

February 17, 2010: Vespignani invited to present complex networks seminar as part of Oxford seminar series. DSC's Alex Vespignani gives an invited talk Feb.
25 at the Old Indian Institute, Oxford, examining the H1N1 pandemic as a way of anticipating trends and evaluating risks in real time.

February 26, 2010: CACR offers security research grants to IU faculty and staff. CACR is soliciting grant proposals for research in information security at Indiana University. Applications should be submitted by 5:00 p.m. on Wednesday, March 31.

February 26, 2010: (New York Times Article) When American and European Ideas of Privacy Collide. “On the Internet, the First Amendment is a local ordinance,” said Fred Cate in an article for the New York Times.

March 15, 2010: IU “Twister” software improves Google’s MapReduce for large-scale scientific data analysis. PTI researchers introduce new tool to support faster execution of data mining applications implemented as MapReduce programs.

March 22, 2010: IU to host workshop on Vampir performance analysis tool for supercomputers. Free hands-on workshop to be held April 21 in Bloomington, hosted by the PTI/UITS High Performance Applications Group.

March 29, 2010: Specter pushes for stronger federal privacy laws. CACR Director Fred H. Cate testified during a Senate hearing in Philadelphia. He and other experts were invited to debate whether secret video recordings should fall under the federal wiretap statute.

April 5, 2010: Call for participation, 2nd International Conference on Cloud Computing Technology and Science by Indiana University. CloudCom 2010 is now accepting papers and workshop proposals until July 1, 2010. It will take place Nov. 30-Dec. 3, 2010, at University Place Conference Center on the IUPUI campus.

April 14, 2010: Vespignani featured in the journal ‘Nature’. Alessandro Vespignani's paper “Complex Networks: The fragility of interdependency” appears in the April 14 issue of Nature.

April 15, 2010: Informatics honors outstanding alumni at annual awards banquet. The IU School of Informatics hosted its annual Alumni Awards Banquet on April 15 in Indianapolis.
Kay Connelly, senior associate director of the CACR, was one of five winners, earning the “Young Alumni Award.”

May 4, 2010: IU weather prediction technology supports national tornado research project. Storm chasers from the VORTEX2 national tornado research project will spend six weeks getting up close and personal with tornadoes in an effort to better understand how they form and behave, and D2I's LEAD II technology will help guide their way.

May 5, 2010: IU gears up for fourth annual Summer Technology Workshop for Teens. Members of the CACR Advanced Network Management Lab host their fourth annual Summer Technology Workshop for teens in Bloomington June 15 and 17.

May 24, 2010: IU-developed software helps researchers find meaning in massive scientific data sets. Data to Insight Center’s new XMC Cat helps find scientific needles in massive digital haystacks. XMC Cat will drastically reduce the time between data collection and possible scientific breakthrough by making it easier for researchers to sort through large amounts of data and locate the most relevant information for the study.

PTI videos released during the reporting period (visit the URLs below to watch the videos):
LEAD II/Vortex2 tornado research project: http://pti.iu.edu/video/vortex2
XMC Cat Software: http://pti.iu.edu/video/xmccat
Twister Software: http://pti.iu.edu/video/twister
Ethical Technologies in the Homes of Seniors project: http://newsinfo.iu.edu/asset/page/normal/8472.html

Appendix 11: Glossary of Technical Terms Used in this Report
Cloud computing/technologies: (from Wikipedia) “Internet-based computing, whereby shared resources, software, and information are provided to computers and other devices on demand, like the electricity grid.” Cloud computing supports research and business by providing a single access point to numerous computational resources that lie “in the cloud” without requiring that the user know or understand the complex technology that is supporting them.
Businesses such as Google and Amazon already rely heavily on cloud computing to support their business and are proving it to be a critical emerging technology. Cloud computing is widely believed to be the dominant emerging computational paradigm for the coming decades.
Compilers: programs that translate source code written in one programming language into another, typically lower-level, language so that it can be run efficiently on a computer.
Data at Scale: massive data sets that require specialized tools and software to manage them effectively and find meaning within them.
Generic programming: a type of computer programming that uses non-specific basic instructions that can be tailored later to specific projects, saving time and reducing redundancy when writing code.
Multicore computers: computers with two or more central processing units (cores), usually on a single chip. Many computers produced today are multicore, to allow for increased performance and computational speed when paired with effective software programs.
Open Source Software: software with source code that is free and open to the public and that may be adapted for individual use.
Science Gateways and Portals: web-based access points and tools that make it easier for scientists to use advanced computing technology by greatly reducing the amount of computational expertise required to run experiments using supercomputers and other advanced technology.

Pervasive Technology Institute Contact Information
501 N. Morton St. Ste 224
Bloomington IN 47404
pti@indiana.edu
(812)856-1537
Executive Director: Craig A. Stewart, stewart@indiana.edu
Chief Operating Officer: Therese Miller, millertm@indiana.edu
Digital Science Center Director: Geoffrey C. Fox, gcf@indiana.edu
Data to Insight Center Director: Beth Plale, plale@indiana.edu
Center for Applied Cybersecurity Research Director: Fred Cate, fcate@indiana.edu