Project Description
Introduction
Next-generation scientific applications will require access to Petabytes of observational imagery, measurements from remote sensing instruments (e.g. radio or infrared telescopes, planetary rovers, undersea drones), data from biological and/or chemical experiments, and other types of data archives. These data sets will reside on geographically distributed data sources with very heterogeneous schemas, data processing capabilities, and usage policies. To be useful to scientists, these applications must correlate these data products to find relationships and trends that lead to new knowledge. Moreover, scientists must be able to incorporate new processing capabilities to custom-tailor the data products generated by sensors and the data already stored in databases. Thus, these new applications must be built on top of systems designed for wide-area networks that support, among other features: 1) reliable 24/7 access to distributed data sources, 2) access to distributed computational resources (e.g. CPU cycles, disk space), 3) efficient distributed query processing, and 4) customization via distributed software deployment.
Typically, database middleware systems have been used as a solution to integrate heterogeneous data sources and support applications that require access to data in these sources. The term Federated System is used to depict a group of data sources integrated via database middleware. Unfortunately, the existing solutions are based on a centralized architecture that cannot scale to the wide-area environments that are becoming commonplace for scientific applications. This architecture, shown in Figure 1, relies on a central integration server to provide client applications with a uniform view of the data, and a single point of access to the data sources. The integration server relies on the capabilities of translators to extract the data from the sources, and perform schema mapping operations to convert data from the local schemas into the schema specified by the client to the integration server. Once the data items have been translated, they are sent back to the integration server for further processing. Most of the query processing occurs at the integration server site, and the data sources often act as mere I/O nodes. A catalog associated with the integration server provides the metadata necessary to guide the process of finding data sources, schema mapping rules, and query processing strategies. There have been two approaches to realizing database middleware systems. The first approach is to use a relational database engine as the integration server, and database gateways as the translators that allow the integration DBMS to access distributed data. In the second approach, a Mediator System specifically tailored for distributed processing is employed. This solution features an integration server called the mediator, which acts as the data integrator, and a group of wrappers that act as the translators.
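As a point of reference, the mediator/wrapper pattern described above can be summarized schematically: the mediator fans a request out to wrappers, each wrapper extracts data from its source and maps it into the global schema, and the mediator finishes all remaining processing centrally. The sketch below is our own illustration of this pattern, not code from any of the cited systems; the class and parameter names are hypothetical.

```python
# Schematic of the centralized mediator/wrapper (integration server/translator) pattern.
from typing import Callable, Dict, List


class Wrapper:
    """Translator that extracts data from one source and maps it to the global schema."""
    def __init__(self, extract: Callable[[str], List[dict]], map_row: Callable[[dict], dict]):
        self.extract = extract            # source-specific data extraction
        self.map_row = map_row            # local-schema -> global-schema mapping

    def query(self, request: str) -> List[dict]:
        return [self.map_row(row) for row in self.extract(request)]


class Mediator:
    """Central integration server: all further query processing happens here."""
    def __init__(self, wrappers: Dict[str, Wrapper]):
        self.wrappers = wrappers

    def run(self, request: str, predicate: Callable[[dict], bool]) -> List[dict]:
        rows: List[dict] = []
        for wrapper in self.wrappers.values():    # data from every source is shipped here
            rows.extend(wrapper.query(request))
        return [r for r in rows if predicate(r)]  # centralized post-processing
```

The centralization is exactly the scalability problem discussed next: every row crosses the network to the integration server before any useful work is done on it.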
By the nature of their architecture, scaling these database middleware systems to very dynamic wide-area environments is extremely difficult. Since most of the processing is done at the integration server, very large amounts of data must be moved over the network in order to deliver the data from the sources to the translators. Support for extending the system with code that implements customized operations such as image subsetting, feature extraction, and spatial analysis is difficult since it is usually done manually. As the number of sites in the system increases, it becomes very difficult to keep track of the sites where the necessary software has been deployed. During query processing, failures at one or more of the sources being used to solve a query often result in an unrecoverable error that forces the query to be restarted, and all the previous work is lost. Access to computational resources, such as clusters and disk farms, is rarely included as a feature that can be accessed via the software toolkits included with the middleware system. Hence, developers need to work with several system toolkits, which requires a team with expertise in each of these systems. Also, the catalog used in the system is assumed to be an “oracle” that knows the location of all the data sources, schema mapping rules, and query operators needed to solve a given query.
This proposal represents our effort to develop a new framework for building database middleware systems that are more aligned with the nature of wide-area environments and with the requirements of scientific applications being deployed in these environments. We propose to realize this framework by building the GaiaNET database middleware system to integrate databases and computational resources on large-scale wide-area networks. GaiaNET will provide applications with the abstractions of three types of services: data services, computational services, and software library services that can be seamlessly combined to build applications that support complex scientific queries and data analysis. These services will be combined using an approach that we call dynamic service composition, which is based on a Peer-to-Peer (P2P) architecture. The most salient novel features of GaiaNET are: 1) self-organization of federated sites, 2) dynamic selection and eviction of sources that participate in solving a query, 3) decentralized query processing with elastically redundant information and computation to meet reliability requirements, 4) decentralized control and coordination of query processing, 5) automatic deployment of application-specific code to remote federated sites, 6) the ability to satisfy Quality of Data (QoD) and Quality of Service (QoS) requirements for applications, and 7) seamless integration with Web technology and standards. GaiaNET will enable its users to federate sensors, databases, clusters, and other scientific equipment to create applications with value-added features that simplify the tasks of scientists and engineers working to analyze the vast amounts of data collected on a daily basis. GaiaNET will be distributed as an open source system. GaiaNET differs from Network Middleware such as CORBA, RMI, .NET and RPC since the latter are used as an infrastructure layer to provide applications with access to the network. GaiaNET (as well as database middleware) is at a higher layer, and can leverage the Network Middleware for connectivity purposes. But neither CORBA, RMI, nor .NET provides services such as distributed query execution, schema mapping, and caching, as GaiaNET will.
Figure 1: Centralized Database Middleware Architecture. (Clients reach a central integration server over the Internet; translators connect the integration server to sources such as Oracle 9i, IBM DB2, XML data, and text data.)
The GaiaNET project will be carried out by a multi-disciplinary team of Computer Scientists, Electrical Engineers and Earth Scientists from the University of Maryland, College Park (UMCP), and the University of Puerto Rico, Mayagüez (UPRM). The Computer Science team will lead the effort to develop GaiaNET and will consist of researchers from UMCP and UPRM. This team will collaborate with Electrical Engineers and Earth Scientists from two research centers at UPRM: 1) the NASA Tropical Center for Earth and Space Studies (TCESS), and 2) the NSF Center for Subsurface Sensing and Imaging Systems (CenSSIS). TCESS is a NASA University Research Center devoted to satellite data acquisition, image processing and analysis. TCESS operates Synthetic Aperture Radar (SAR) and HRPT tracking stations that receive over 70GB of satellite imagery per week. CenSSIS is a multi-institution NSF Engineering Research Center¹ that seeks to revolutionize our ability to detect and image biomedical and environmental-civil objects or conditions that are underground, underwater, or embedded within cells or inside the human body. Researchers from these two centers will contribute their expertise to deploy and use the system, develop domain-specific applications and processing code, and serve as a testbed that provides feedback on the features of the system.
The proposed project has great potential to achieve significant broader impacts with particular emphasis on the
following areas: 1) promote collaborative teaching, training, and learning among different academic institutions, 2) broaden the participation of Hispanic students, faculty, and scientists in cutting-edge database and Internet research (approximately 50% of the undergraduate engineering students at UPRM are women), 3) build much
needed research and education partnerships between an internationally recognized university (UMCP) and a
minority serving university (UPRM), 4) promote graduate and undergraduate research experiences to help increase
the number of Ph.D. and M.S. degrees in the U.S. workforce, 5) provide tools for scientific research to help reduce
costs and turnaround time, and 6) provide a new framework for developing more affordable and accessible
distributed database technology. The project will foster direct interaction among people with widely diverse
educational levels and backgrounds.
¹ Participating institutions in CenSSIS are Northeastern University, Rensselaer Polytechnic Institute, Boston University, and the University of Puerto Rico, Mayagüez.
The remainder of this proposal is organized as follows. Section 2 provides overview material and defines the
problem. Section 3 provides a technical description of the GaiaNET system and the research issues that we must
tackle to build such a system. Section 4 discusses related work. Section 5 discusses the broader impacts that this
project can have on research, education, and our society. Section 6 presents our expertise, management approach
and milestones to be completed over a five-year period. Finally, Section 7 discusses the results generated by our team
members from prior NSF support.
Overview of Service Composition
Consider an Earth Science application that needs to correlate surface satellite images with the land regions near the coast of San Juan, Puerto Rico. The goal of the analysis is to study the effect of urban development on coastal erosion. Various types of satellite images (e.g. AVHRR, MODIS, ETM+) are kept in databases located in Maryland, New York, and Texas. Suppose that Maryland and New York are replicated sites. Schematic maps of coastal regions are kept in San Juan. The application must find the appropriate satellite images from these types, and draw the coastal map on top of the image for a given region. A color code must be used to indicate zones of sand, vegetation, or water depth (just like weather forecasters show bands of showers on city maps). We can model the satellite images with a relational view Images(taken:Date, band:Integer, location:Rectangle, data:BLOB) that indicates the date taken, radiation energy band measured, location on the Earth, and the actual image data bytes. Likewise, the maps can be modeled with a view Maps(taken:Date, agency:String, landtype:Integer, location:Polygon) that indicates the date of map creation, agency that made the map, type of land (e.g. coast, mountain, city), and a polygon with the lines that form the map. To support our application, we must have a database middleware system that can expose these views to the application, find the data to populate the views, and support complex queries to analyze the data and extract new information.
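To make the data model concrete, the sketch below renders the two views as Python records. The field names mirror the Images(...) and Maps(...) views above; the Rectangle/Polygon encodings and the overlaps() bounding-box test are our own simplifying assumptions rather than anything defined in this proposal.

```python
# A minimal sketch of the two relational views used in the Earth Science example.
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple

Rectangle = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)
Polygon = List[Tuple[float, float]]            # list of (lon, lat) vertices

@dataclass
class ImageRow:
    taken: date          # date the image was taken
    band: int            # radiation energy band measured
    location: Rectangle  # region of the Earth covered by the image
    data: bytes          # the actual image bytes (BLOB)

@dataclass
class MapRow:
    taken: date          # date the map was created
    agency: str          # agency that produced the map
    landtype: int        # type of land (e.g. coast, mountain, city)
    location: Polygon    # polygon with the lines that form the map

def overlaps(img: Rectangle, map_poly: Polygon) -> bool:
    """Crude bounding-box test standing in for a real spatial-overlap operator."""
    lons, lats = [p[0] for p in map_poly], [p[1] for p in map_poly]
    return not (max(lons) < img[0] or min(lons) > img[2] or
                max(lats) < img[1] or min(lats) > img[3])
```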
We propose a novel framework for building database middleware systems based on services that supply data, software and computing power, and which can be composed to form an execution pipeline that extracts the data, processes data items based on a query specification, and returns the results to the client application. We define a service as a server application that provides some type of functionality and which is reachable over a network. A data service provides access to a collection of data and metadata for a given application domain. These include database engines, web servers, file systems, or any other customized servers. A software service provides access to code and associated metadata; this code performs a given computational task. Examples include sorting routines, spatial indexing code, and feature extraction functions. Finally, a computing service provides access to computational resources required to process a collection of data with a specific set of software routines. Notice that some services might have dual roles, such as a database engine that provides both data and the software to process them. Following the World Wide Web Consortium (W3C) convention, we define a Web service as a service identified by a Uniform Resource Identifier (URI), whose interface is described by XML, and which can be accessed over the Web. In our framework, all data services, software services, and computing services are exposed to applications as Web services. We shall use the words “service” and “Web service” interchangeably, but bear in mind that all the services we mention are actually Web services.
We can model the data sources for our Earth Science application as Web services. Let us assume that the sites at Maryland, New York, and Texas are running data services and computing services, while San Juan is running all three types of services. To solve a query such as “Correlate all images and maps for region R, where the images were taken between June 1999 and April 2002”, we can use the following strategy: 1) invoke the Web data services in New York, Texas, and San Juan to extract the images and maps, 2) route the images and maps to computing services willing to correlate them, 3) route the necessary correlation code from a software service to the computing services found in step 2, and 4) send the results back to the client application. Figure 2 depicts this approach, assuming that data is brought from New York, Texas, and San Juan. The images from Texas and New York are first filtered using the date predicate to remove unwanted ones. All images are then sent to San Juan for correlation, and the results are sent to the client application, which is assumed to be in San Juan.
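As an illustration of this strategy (and only that: the service interfaces below are hypothetical stand-ins, not GaiaNET or W3C APIs), the following sketch models the three kinds of services as simple Python objects and annotates where each of the four steps happens.

```python
# Hypothetical service stubs illustrating the four-step strategy above.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class DataService:                     # e.g. the image databases at NY and TX
    site: str
    rows: List[dict] = field(default_factory=list)
    def fetch(self, predicate: Callable[[dict], bool]) -> List[dict]:
        # Step 1: extract data, applying the date/region predicate at the source.
        return [r for r in self.rows if predicate(r)]

@dataclass
class SoftwareService:                 # holds named routines such as the correlation code
    routines: Dict[str, Callable] = field(default_factory=dict)
    def fetch_routine(self, name: str) -> Callable:
        # Step 3: supply the code that a computing service needs to run.
        return self.routines[name]

@dataclass
class ComputingService:                # e.g. the computing service at San Juan
    site: str
    def run(self, routine: Callable, *inputs):
        # Steps 2 and 4: execute the shipped routine and return results to the client.
        return routine(*inputs)

# Usage (names are illustrative): images = ny.fetch(in_range) + tx.fetch(in_range)
#   result = sju.run(sw.fetch_routine("correlate"), images, maps)
```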
In general, we can solve the problem of executing a query Q by building a service composition graph. The graph formed by the interaction of services shown in Figure 2 is an example of a service composition graph. Given a query Q, a service composition graph G is a directed acyclic graph (DAG) G = (V, E) with the following properties:
1. V represents a set of Web services acquired to solve Q.
2. Each edge (u,v) ∈ E represents the fact that service u is being used by service v. This is called a service composition between services u and v. These edges are called composition edges.
3. Each edge (u,v) ∈ E has an associated cost C(u,v). For all composition edges (u,v), the cost of the edge is defined as the cost incurred using service u, plus the cost incurred to deliver its results to service v.
4. By executing the services in G = (V, E) as indicated by E we can find a solution to query Q.
The relationships in E form the heart of our framework since they represent the flow of data and computation that enables data to be extracted and processed according to the instructions and code issued in the query Q. In the context of query processing, the composition graph G represents a plan to solve the query. The set of services V takes care of executing one or more of the query operators in this plan, providing the necessary data, or providing the code required for one or more operators. The composition edges indicate how data and code move between the services. The cost of composition edges is application-specific; possible cost metrics include resource usage, response time (wall clock time), volume of data transferred, or monetary cost (assuming some services charge in dollars). Hence, the cost of the computation represented by the composition graphs is also application-specific.
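The definition above translates almost directly into a small data structure. The sketch below is our own modeling of a composition graph with per-edge costs; the cost figures in the example are made up, and response time in seconds is only one of the application-specific metrics just mentioned.

```python
# A minimal sketch of a service composition graph G = (V, E) with composition-edge costs.
from collections import defaultdict

class ServiceCompositionGraph:
    def __init__(self):
        self.services = set()           # V: Web services acquired to solve the query Q
        self.edges = defaultdict(dict)  # E: edges[u][v] = cost of the composition (u, v)

    def add_composition(self, u: str, v: str, usage_cost: float, delivery_cost: float):
        """Service u is used by service v; the edge cost is usage plus delivery."""
        self.services.update((u, v))
        self.edges[u][v] = usage_cost + delivery_cost

    def total_cost(self) -> float:
        """One possible application-specific total: the sum over all composition edges."""
        return sum(cost for targets in self.edges.values() for cost in targets.values())

# The plan of Figure 2, with invented costs (response time, in seconds).
g = ServiceCompositionGraph()
g.add_composition("ImageService@TX", "ComputingService@SJU", 10.0, 25.0)
g.add_composition("ImageService@NY", "ComputingService@SJU", 12.0, 30.0)
g.add_composition("MapService@SJU", "ComputingService@SJU", 2.0, 1.0)
g.add_composition("SoftwareService", "ComputingService@SJU", 1.0, 0.5)
g.add_composition("ComputingService@SJU", "Client@SJU", 40.0, 3.0)
```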
Figure 2: Service Composition Graph for Web Services. (Image data services at TX, NY, and MD, a map service at SJU, a software service, and computing services at several sites are composed to deliver the correlated results to the client at SJU.)
At first glance, it might appear that a service composition graph for a given query Q might be computed once at query time and used throughout the computation of the query. However, this approach will not be suitable for wide-area environments where data sources come and go, network speeds change depending on current traffic, and newly available computer cluster usage might be granted for just a limited amount of time. For example, in our sample scenario it might be the case that the New York database fails, so the system must switch to the data service at Maryland to get the remaining images. Thus, rather than building just one service composition graph G, the middleware system should monitor query execution and transform the current G into a new composition graph G′ whenever it is needed, as shown in Figure 3. This new graph will have services or composition edges not present in graph G, but which are now necessary to complete the execution of query Q. Likewise, unused services and composition edges are removed from G′. Clearly, for G′ to be useful it ought to be the best candidate replacement, meaning that it minimizes the total cost to solve the remainder of query Q. Notice that this process might occur several times during the execution of query Q. Therefore, to solve query Q we actually need a sequence of service composition graphs S = {G0, G1, ..., Gn}, where G0 is the initial service composition graph and Gn is the final composition graph. In this context, service composition graph Gi is produced by running an optimization algorithm to modify the services or compositions in Gi-1, for i ≥ 1. To generate Gi, this modification algorithm shall consider the current system status and the progress achieved so far to solve the query. To the best of our knowledge, no other work in the research literature has modeled query processing and service composition for database middleware in this fashion.
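A rough outline of this idea, under our own assumptions about the monitoring and optimization interfaces (they are passed in as parameters here, not GaiaNET components), is the control loop below, which produces the sequence G0, G1, ..., Gn while the query runs.

```python
# Sketch of adaptive re-optimization: the current composition graph is replaced
# whenever the monitor detects a relevant change (e.g. the NY data service fails).
def execute_with_reoptimization(query, build_initial_graph, optimize, monitor, run_step):
    graphs = [build_initial_graph(query)]          # G0
    while not monitor.query_complete():
        current = graphs[-1]
        if monitor.detects_change(current):
            # Produce Gi from Gi-1, given current system status and progress so far.
            new_graph = optimize(current, monitor.system_status(), monitor.progress())
            graphs.append(new_graph)
        run_step(graphs[-1])                       # advance the query on the current graph
    return graphs                                  # the sequence {G0, G1, ..., Gn}
```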
This proposal presents a five-year plan focused on the research necessary to fully develop this database middleware
and query processing framework, to be known as the GaiaNET database middleware system. We also plan to
collaborate with our partners from TCESS and CenSSIS to develop a realistic scenario that will help us deploy, test
and characterize the system with rigorous performance experiments. In order to realize the GaiaNET system, our research effort must overcome a series of barriers that block any attempt at a straightforward implementation of an efficient database middleware system for wide-area environments. These barriers are:
1. Barrier #1 – Poor understanding of an adaptive query model for wide-area system dynamics: Most query processing models assume a stable environment where system dynamics do not change during query execution. There are a few notable exceptions, such as Query Scrambling (REF) and Eddies (REF), which attempt to adapt to changes in the execution environment. However, these solutions mostly deal with reordering of operators. We need a general framework that takes into consideration opportunities to bring data from alternate sources (perhaps concurrently), send query computation to other sites, or partition the query computation by leveraging the redundancy of data and computing resources on a wide-area network. Quality of Service (QoS) and Quality of Data (QoD) guarantees must also be present in this framework.
2. Barrier #2 – Lack of a decentralized middleware architecture that can adapt to system conditions:
Currently, database middleware systems follow a centralized architecture. All configuration information and
relationships must be present in the catalog. Federations are built by hand, requiring interaction between system
administrators. The integration server becomes a single point of failure in the system. There is a need for a P2P
database middleware architecture that provides redundancy, more opportunities to find sites for processing, and
is based on Web services.
Figure 3: Reconfiguration of Service Composition Graph for Web Services. (The image service at NY has become unavailable, so the composition is reconfigured to obtain the remaining images through the services at MD, while the rest of the graph from Figure 2 is retained.)
3. Barrier #3 – Lack of a well-established framework for Web service compositions: Web services are still an evolving technology. Vendors are focusing on marketing, offering few concrete scenarios and virtually no documentation. At the time of this writing, the W3C Working Group on Web Services is just beginning a standardization process. Composition of services is a problem that is not well understood, and there are almost no reference implementations. Performance metrics needed to compare different composition schemes are yet to be identified. Algorithms to find a composition graph to solve a given query are not known.
4. Barrier #4 – Inadequate APIs to write applications that use Web service compositions: There is no language to specify the composition of Web services in a declarative fashion. Thus, right now compositions must be specified implicitly in the structure of the applications. WSDL (REF) is a language used to define a Web service, but it does not specify compositions. Moreover, most APIs only provide support to register services, and to specify XML messages with SOAP to exchange requests. How all these metadata elements get disseminated and managed efficiently throughout the system is still unclear.
5. Barrier #5 – Inadequate query execution engines for processing based on Web Service Composition: Typically, query optimizers and query execution engines are separate modules in a database middleware engine. While some systems perform query execution in a distributed fashion, optimization is centralized. This arrangement precludes a more adaptive operation that enables the system to monitor current execution performance and quickly adapt to changes by modifying the query plan with more efficient query sub-plans. There is little support for discovering new processing strategies once the query is running. As a result, the middleware cannot capitalize on new resources (e.g. computer clusters or replicated collections) that become available after query execution starts.
6. Barrier #6 – Limited understanding of how to build systems that feature some form of Self-Administration: Database middleware systems rely heavily on system administrators to define schemas, available data sources, available software for query processing, and policies for system usage. This scheme is not scalable for a wide-area environment. In (REF) we developed the concept of self-extensible middleware, which means that during query processing the middleware system is capable of shipping code needed for query processing to remote sites. This allows the system to dynamically extend its functionality and adapt to the needs of the query at hand. This framework must be extended to a P2P setting, and new features such as computing site discovery, data source discovery, dynamic federation formation, and application-code discovery should be developed.
7. Barrier #7 – Limited tools to automate or semi-automate the process of schema mapping: Schema mapping is a very difficult task, and there are few tools to help administrators build the infrastructure to simplify the generation of schema mapping rules via automation. Very often, developers must either hard-code schema mapping rules into the translators, or write stored procedures that perform the schema mapping during data extraction. Schema mapping ought to be automated as much as possible, and should follow a more declarative process.
8. Barrier #8 – Limited tools to protect federated systems and Web Services from attacks: Like most networked systems, database middleware systems can be vulnerable to attacks. Often, these systems delegate the role of system security to the network and operating system infrastructure that hosts each data source. Issues such as intrusion detection, and data security to prevent data misuse, are currently being explored, but relatively little technology transfer and adoption has occurred.
Our research plan will focus on dealing with barriers 1-6 because we find this is the niche that best fits our expertise and interests. There are several on-going projects (REFS) that are working on barrier #7. Likewise, the volume of research activity geared towards dealing with barrier #8 is impressive, to say the least. We shall keep track of advances in schema mapping and security to incorporate them into GaiaNET as they become available, and whenever possible, we will attempt to bring our own contributions to the system. But the focus of this research will be the development of a framework for Web service composition and query processing in wide-area environments, with emphasis on their realization in the GaiaNET middleware system.
GaiaNET System Architecture
We have designed GaiaNET to be an open source system tightly coupled with the Web. The rationale for this choice is to leverage the wealth of experience in deploying Web-based applications that has been acquired by many enterprises. By leveraging Web technology, which has proven to be scalable and very reliable, we can maximize the acceptance of GaiaNET, and reduce the cost, deployment time, and risk incurred by an enterprise willing to use it.
[Figure: The GaiaNET P2P environment, with clients and client workstations connecting to CSP, QSB, and DAP nodes that front databases and sensors, together with P2P Cubetree Caching and Storage Services.]
NICK???
Adaptive Replication Services
Bienvenido???
Smart Mirror Services
Pedro???
Sensor Management Services
Isidoro???
Related Work
A Distributed Database System is composed of a collection of Database Management Systems (DBMS), connected
together via a computer network, that agree to work together in a federation to manipulate their distributed data
collection and share their query processing capabilities. Among the most interesting prototypes built over the past
two decades we have R*[50], Distributed Ingres [44], ADMS+/- [40] and Mariposa [42, 45]. These systems
proposed to federate groups of autonomous DBMS located on different hosts, and with no central authority
imposing restrictions on the local operations at each site. In fact, these were early examples of Peer-to-Peer (P2P)
systems, which are now again gaining broader acceptance. In most Distributed Databases it was assumed that all sites were relational systems, and that they either ran the same DBMS or adhered to a common communications protocol. Distributed join processing [9, 24, 32, 48], transaction processing [35] and data caching [12] have been the major focus of research on these systems. The work in [20, 27] studied the structure of the DBMS
to be run at the federated sites, and argued that a client-server DBMS model was a more appropriate and efficient
arrangement than P2P. The work in [12] explores different strategies to cache frequently used data at the client
DBMS and reduce the latency incurred when data must be requested from the server DBMS running on a
mainframe.
The basic assumption in a Heterogeneous Database System is that the sites being federated have very different
characteristics in terms of data models, database servers, and query execution capabilities. Often, Heterogeneous
Database Systems are referred to as Database Middleware Systems since they are a middle layer of software that
interconnects the client applications with the data sources. Still, most of these systems are based on the assumption
that data sources reside in mainframes or enterprise servers. Typically, the server application used at the
middleware layer to service client requests is called the integration server. In its most basic form, the middleware
layer can be made out of database gateways, which are software modules that allow a single-site database server to
gain access to the data managed by another database server (possibly from a different vendor). The gateway gives
the local DBMS an access method mechanism to the data stored in the foreign database server, and this local DBMS
becomes an integration server (also called application server) for its clients. Database gateways are provided by
major database vendors such as Oracle [16], Informix [15] and Sybase [17].
By far, the most complex and interesting type of middleware system is the Mediator system, in which a server
application called the mediator acts as the integration server for the client applications. The mediator is specially
tailored for data translation [14], schema mapping [31], and distributed query processing [26, 38]. The mediator
provides services such as query parsing, query optimization, query execution and transaction processing. Moreover,
the mediator provides a common data model used to resolve the conflicts that might arise as result of the differences
in the schemas of the data sources. The mediator relies on wrappers to gain access to the information contained in
the data sources. The wrappers receive requests from the mediator to query data in the data sources, and then they
generate queries or procedure calls specific to the data source in order to fetch the data of interest. All the data
values retrieved from a data source are translated from the local schema into the schema specified by the mediator,
and then sent to the mediator, where they are further processed to produce the final result of the query. Some of
the better known mediator-type middleware systems are Pegasus [4], TSIMMIS [13], DISCO [46], METU [22],
Garlic [39] and MOCHA [38].
Agent Systems are based on the idea that applications can be built in terms of groups of intelligent agents that work
as a group to solve a given task. Agents exhibit intelligence, mobility, and autonomy to carry out tasks and make decisions on behalf of the users. Thus, agents can be used to monitor stocks, buy goods on-line, participate in auctions, and perform data integration operations. The precise definition of what an agent is (or should be) and how to implement it is somewhat of a controversial issue. Some researchers view agents as small programs [8, 28], others think of them in terms of logic semantics [5], while some researchers [41] dismiss them as a bad idea that isolates users from the experience of interacting with networked applications. The body of literature on Agent Systems is extensive, but it is mostly focused on topics more relevant to Artificial Intelligence than Database Systems, so we shall not go any further in this discussion of agents.
Sensor Networks [11, 23] are networks formed when a set of small, unattended sensor devices are deployed over a given area. The sensors form ad-hoc relationships among themselves to cooperate in sensing physical phenomena. Some of the target applications for sensor networks are surveillance of inhospitable terrain, monitoring and detection of forest fires, study of traffic patterns in metropolitan areas, and monitoring of equipment in manufacturing plants. Data Streams emerge in this context as an abstraction representing a continuous flow of information produced by the sensors, which can seldom be stored in raw form. Recent efforts have proposed mechanisms to efficiently optimize
[49], process [34], and aggregate [21] data emanating from Data Streams.
Broader Impacts
Personnel, Management Approach and Milestones
The project will be organized into four teams, each one led by a faculty member. Dr. Manuel Rodríguez-Martínez
will be the project PI and leader of the Distributed Data Management and Integration Team. Dr. Bienvenido Vélez-Rivera will be the leader of the Programmatic Interfaces and Visualization Team. The Parallel Processing and
Storage Team will be led by Dr. Pedro I. Rivera-Vega. Finally, Dr. Miguel Vélez-Reyes will be the leader of the
Image Processing Team. Each team will have at least one graduate student and one undergraduate student from our
programs in Electrical Engineering, Computer Engineering, and Computer and Information Science and Engineering
(CISE). In addition, we will provide two assistantships for Earth Science students to be integrated into the project.
We will have monthly group meetings, and one project review day per semester. During the UPRM Industrial
Affiliates Week we will have a NASA TerraScope Workshop to present project status to the public and bring NASA
personnel for assessment of the project. We will form a committee to manage the interaction with our TCESS and
CenSSIS partners, and incorporate their user perspective throughout the development of TerraScope. Administrative
operations will be managed by the Center for Computing Research and Development (CECORD). These activities
include equipment purchase, travel arrangements, and organization of the TerraScope Workshop. The UPRM Ph.D.
program in CISE will also contribute with the organization of seminars and student activities.
The following list presents a brief description of the expertise of the personnel associated with this project (for details, see the resumes at the end of this proposal):
• Dr. Manuel Rodriguez-Martinez, PI, Computer Science
Experience: Database Middleware Systems, Distributed Query Processing, and Computer Networks. Member of the ADM Group. Developer of the MOCHA System for the NASA ESIP Federation.
• Dr. Nick Roussopoulos, PI, Computer Science
Experience:
• Dr. Bienvenido Velez-Rivera, Co-PI, Computer Science
Experience: Information Retrieval, Distributed Systems, Cluster Computing, and Human-Computer Interaction. Developer of the Info-Radar system at MIT. Coordinator of Computer Science Programs at UPRM.
• Dr. Isidoro Couvertier, Co-PI, Electrical Engineering
• Dr. Pedro I. Rivera-Vega, Co-PI, Computer Science
Experience: Parallel Algorithms, Data Structures Analysis, High Performance Computing, and Computer Science Education. Coordinator of the Computer Science B.S. Program at UPR-Rio Piedras.
Prior NSF Support