>>: Let's just get started, please, with our first speaker, German Molto from the
University of Valencia, please.
>> German Molto: My talk is about management and contextualization of scientific virtual appliances. Here is the outline of the talk.
I will start with a brief introduction and overview of our research group. Then I'm going to focus specifically on scientific cloud computing. In particular I'm going to talk about contextualizing virtual machines and how to manage them using repositories and catalogs. Then I will talk about scientific applications to which we are applying these kinds of techniques. Finally I will end my talk with some conclusions and future challenges.
Okay. So our research group focuses on applying different computational techniques, such as parallel, distributed, grid and cloud computing technologies, to different scientific fields such as engineering simulation, photonics, proteomics and biomedical computation. We typically establish collaborations with research groups so that we can apply these kinds of techniques.
If we have learned something through all these years, it is that scientific applications typically require large computational power and also a large amount of data. So this is why we typically combine several techniques, for example high performance computing and grid computing, in order to solve large-dimension problems.
And speaking specifically about grid computing, this is a technique that has been successfully employed in many areas. Perhaps one of the most important contributions of grid computing is that it has leveraged scientific collaboration in the shape of virtual organizations and has allowed access to a large pool of computing power.
But it also has some drawbacks. If I had to mention just one, the main problem with grid computing is that it is the resource providers who define the execution environments. And this is one of the key areas that virtualization and cloud computing try to solve, because with cloud computing it is resource consumers, not resource providers, who get to define the execution environments in the shape of virtual machines.
And having a controlled environment is especially important for scientific applications. But cloud computing also has other advantages: it allows dynamic scaling of infrastructures for resource providers, a user can have fast and easy access to a large amount of resources, and virtualization, which leads to server consolidation, also leads to reduced energy consumption.
Now let us focus on the point of view of scientists and engineers. They don't want to bother with technology; they just want to run their applications as fast as possible and to solve large-dimension problems. And when we talk about grid technologies, we have lots of concepts and tools that complicate life for users and developers.
And when we talk about cloud computing, we also have additional questions: which hypervisor am I going to use, how am I going to configure, deploy and monitor my application, which APIs am I going to use. So I think it's time to focus on abstracting the details of porting applications to the cloud.
In particular, in this talk I'm going to focus on the lowest level of the typical classification of cloud systems. At the top level we have software as a service and the user applications. In the middle we have platform as a service, where application developers are provided with APIs to develop cloud applications. And at the lowest level we have virtual machine managers, which provide the enactment and management of virtual machines.
We can currently find lots of virtual machine managers in the literature, but we considered these key factors in order to decide which ones to focus on. We want them to be open source; to have access to public clouds such as Amazon EC2; to have a wide variety of APIs, which is very important for scientific applications; to support different hypervisor technologies; to support contextualization; to offer network management; and also to have an ecosystem, with a wide variety of users and a large community.
With these key factors in mind, we focus on OpenNebula and Eucalyptus, and we also currently keep an eye on what the Nimbus project is doing.
Well, virtual machine managers focus on supporting the lifecycle of virtual machines. But for scientific cloud computing we also require automated contextualization of virtual machines in order to get scientific virtual appliances, and we need to reuse virtual machines among different experiments and also between different researchers. So this is why we focus on application contextualization and also on management using repositories and catalogs.
Okay. Virtual appliance is a very well known term. It's just an encapsulation of an application, the application requirements and the operating system, as a minimal execution unit in a cloud.
But if we're talking about scientific virtual appliances, then things get a little bit more complicated, because these kinds of applications might require a certain operating system and certain services from the operating system, such as file transfer services; a persistence layer, either databases or files; special middleware, for example grid applications might require the [inaudible]; computational libraries, since you might have scientific applications that require numerical kernels such as [inaudible]PACK; and finally you have the application and the application data. So creating a scientific virtual appliance is not a trivial task.
So going from a virtual machine where you have the plain OS to a scientific virtual appliance where you finally have the scientific application running typically requires a process called contextualization. Contextualization means creating the appropriate hardware and software environment for the successful execution of an application. And this typically happens at two levels.
First of all, the virtual machines have to be contextualized. When the virtual machine boots, for example, it needs outbound connectivity, and support for this contextualization process is typically provided by the virtual machine managers.
But then you have to contextualize your applications, to define the appropriate environment. So the applications need to be deployed, configured, built and executed.
We can currently find lots of tools out there for machine configuration, which allow dealing with ordinary machine configuration and also the installation of commonly used packages; very comprehensive tools, many of them in the shape of client-server tools. But we wanted to focus on the specific workflow for deploying scientific applications. And things typically go like this. First of all, you have to resolve dependencies: applications might require other related packages or system packages, so the dependencies have to be installed first.
Then you have to configure the application, which means a set of actions: copying files, changing properties, declaring environment variables.
Then you have to build the application using different build systems.
And you finally have to start the application, which depending on the application might require invoking a script, starting an application, a parallel execution or whatever.
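To make that workflow concrete, here is a minimal sketch in Python of what executing such a contextualization plan could look like; the plan format, package names and helper commands below are invented for illustration and are not the contextualizer's actual interface.

```python
import os
import subprocess

# Hypothetical contextualization plan; all field names and commands are illustrative only.
plan = {
    "dependencies": ["gcc", "make", "libblas-dev"],            # system packages needed first
    "configure": [("cp", "app.conf", "/etc/myapp/"),            # copy files, set variables, ...
                  ("export", "APP_HOME", "/opt/myapp")],
    "build": ["./configure --prefix=/opt/myapp", "make", "make install"],
    "start": ["/opt/myapp/bin/run_simulation --input /data/input.dat"],
}

def run(cmd):
    """Run a shell command and fail loudly if the deployment step breaks."""
    subprocess.run(cmd, shell=True, check=True)

def contextualize(plan):
    # 1. Resolve dependencies: install the required system packages.
    for pkg in plan["dependencies"]:
        run(f"apt-get install -y {pkg}")
    # 2. Configure: copy files and declare environment variables.
    for action in plan["configure"]:
        if action[0] == "cp":
            run(f"cp {action[1]} {action[2]}")
        elif action[0] == "export":
            os.environ[action[1]] = action[2]
    # 3. Build the application with its own build system.
    for step in plan["build"]:
        run(step)
    # 4. Start the application (script, service, parallel launcher, ...).
    for step in plan["start"]:
        run(step)

if __name__ == "__main__":
    contextualize(plan)
```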
So we are currently working on software for application contextualization. The idea is to inject the scripts and configuration into the virtual machine with minimum user intervention.
We currently have this approach. The user or developer who has the application writes the application deployment description in a high-level XML language; no programming skills are needed. Together with the software dependencies, this goes to the contextualizer tool, which creates a contextualization plan so that in the virtual machine the packages are installed and the application is configured, built and run, and we can perform the deployment of the application at boot time in the virtual machine.
We currently have a proof-of-concept tool for this, coded in Python to ensure good portability. We have a plugin-based mechanism so that users can write just XML deployment descriptions; there is no need for programming skills.
And we currently stage the tool, the application and the application requirements into the virtual machine at boot time by creating a special disk image, so that when the virtual machine boots it can start the contextualization process and we can go from a virtual machine to a virtual appliance ready to execute the application.
Now, cataloging is an important feature we want in order to enhance collaboration and virtual machine sharing. It is true that VM catalogs already exist out there, but they focus specifically on human consumption: they don't provide APIs, and their metadata is unstructured.
We wanted to work on a catalog that includes virtual machine metadata, a description of the operating system and the software environment, which is very important to execute applications. We use the Open Virtualization Format, which is XML-based, to describe the features of the virtual machine. We also want to provide links to other repositories, which are either local or remote. And we are currently working on matchmaking algorithms to retrieve the most appropriate virtual machines according to application requirements. This is one area in which we are working right now.
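As a rough illustration of the matchmaking idea -- not the actual algorithm, which the talk describes as work in progress -- one could score catalogued virtual machines against requirements derived from the OVF metadata; all field names here are hypothetical.

```python
# Hypothetical catalog entries: metadata that might be extracted from OVF descriptions.
catalog = [
    {"id": "vm-ubuntu-jeos",  "os": "ubuntu", "packages": {"python"},                "memory_mb": 512},
    {"id": "vm-grid-globus4", "os": "ubuntu", "packages": {"python", "globus-4"},    "memory_mb": 1024},
    {"id": "vm-hpc-blas",     "os": "centos", "packages": {"python", "blas", "mpi"}, "memory_mb": 4096},
]

def score(vm, requirements):
    """Return a simple suitability score, or None if the VM cannot run the application."""
    if vm["memory_mb"] < requirements["min_memory_mb"]:
        return None                      # hard constraint: not enough memory
    if requirements["os"] and vm["os"] != requirements["os"]:
        return None                      # hard constraint: wrong operating system
    missing = requirements["packages"] - vm["packages"]
    # Fewer missing packages means less contextualization work at boot time.
    return -len(missing)

def matchmake(catalog, requirements):
    candidates = [(score(vm, requirements), vm) for vm in catalog]
    candidates = [(s, vm) for s, vm in candidates if s is not None]
    return max(candidates, key=lambda pair: pair[0])[1] if candidates else None

requirements = {"os": "ubuntu", "packages": {"python", "globus-4"}, "min_memory_mb": 768}
print(matchmake(catalog, requirements)["id"])   # -> vm-grid-globus4
```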
So we currently have this approach. We have the user, who provides the OVF description of the application and talks to the catalog to register a virtual machine. The catalog creates an instance of a transfer manager, which creates temporary credentials that are delegated to the user in order to upload the disk images to the repository; these images are registered in the catalog and conveniently tagged so they can later be accessed.
Now, concerning the repository, it includes the storage of VMs and provides data access mechanisms which currently include HTTP and FTP, very well known protocols. But we're also considering including GridFTP, a protocol for transferring files which would provide enhanced certificate-based security.
The virtual machines that we are currently considering are golden virtual machines based on Ubuntu JeOS, which provides a full operating system in just 380 megabytes -- a very low footprint, which is very important -- and also pre-contextualized virtual machines. For example, for grid applications we have virtual machines where Globus Toolkit 4 is already deployed, so we only have to contextualize the Grid Services that need to be deployed.
Well, this is the big picture that we currently have. We have the user with the application requirements, and these application requirements are submitted to a cloud enactor component. This cloud enactor component talks to the VM catalog in order to retrieve the most appropriate virtual machines, which are fetched from the repository. Then it talks to the contextualization software, which must compute the deviation between the virtual machine and the application requirements and create the contextualization plan, which is submitted to the virtual machine manager. The virtual machine manager starts the virtual machine, and when the virtual machine boots, it starts the contextualization process, so that you finally have your scientific application running on this virtual infrastructure.
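The deviation the contextualization software computes is essentially the difference between what the selected virtual machine already provides and what the application requires, which then seeds the contextualization plan. A toy sketch, again with invented field names:

```python
def deviation(vm_metadata, app_requirements):
    """What the chosen VM still lacks; this difference drives the contextualization plan."""
    missing_packages = app_requirements["packages"] - vm_metadata["packages"]
    return {"dependencies": sorted(missing_packages)}

vm = {"packages": {"python", "globus-4"}}
app = {"packages": {"python", "globus-4", "blas"}}
print(deviation(vm, app))   # {'dependencies': ['blas']} -> only blas must be installed at boot
```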
Well, there's a missing point in this figure, and the point is how I'm going to control the application and access the output files inside the virtual appliance. We currently rely on the Opal 2 toolkit, which is a tool developed at NBCR that provides a Web Service wrapper for applications. It was initially developed to provide a wrapper for legacy applications, but it is very, very useful in this kind of environment because it offers operations for remotely starting, monitoring and terminating the application. And, very important, it also allows access to the output files while they are being produced, because they are exposed through a Tomcat service. This enables computational steering, I mean, seeing how your simulations or your executions are progressing while the output data are being written to files. This is a very important feature.
So once we have these Web-Service-wrapped applications, we have the hardware and the hypervisor, and then the virtual appliance which runs the scientific application and a Web Service wrapper that allows the cloud enactor component to control these applications, to start and monitor them and to access the output files. This is currently the approach that we have.
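A sketch of how the cloud enactor could drive such a wrapped application over HTTP; the endpoints and paths below are invented stand-ins, since the real Opal 2 toolkit exposes its own SOAP Web Service operations rather than this REST layout.

```python
import time
import urllib.request

# Hypothetical endpoints standing in for the Web Service wrapper around the application.
BASE = "http://appliance.example.org:8080/app"

def http(method, path, data=None):
    req = urllib.request.Request(BASE + path, data=data, method=method)
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()

def run_and_steer(arguments):
    job_id = http("POST", "/launch", data=arguments.encode())    # start the application remotely
    while http("GET", f"/status/{job_id}") == "RUNNING":
        # Computational steering: peek at output files while they are still being written.
        partial = http("GET", f"/outputs/{job_id}/progress.log")
        print("partial output so far:", partial[-200:])
        time.sleep(30)
    return http("GET", f"/outputs/{job_id}/result.dat")          # fetch the final results

if __name__ == "__main__":
    print(run_and_steer("--steps 1000 --mesh heart.msh"))
```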
Now, to what kind of applications are we applying these techniques? We are currently considering three different applications: simulation of cardiac electrical activity; simulation of guided light in photonic crystal fibers; and optimization of protein design with target properties.
These are applications that we have been working on, in collaboration with other research groups, for the last five years. They are applications to which we have previously applied high performance computing and grid computing, and we are currently investigating how these applications can benefit from these cloud infrastructures.
Okay. So just to wrap up, I would like to draw some conclusions. Scientific cloud computing requires tools to abstract the interaction with cloud infrastructures, so going from applications to scientific virtual appliances is a key point.
We are working on application contextualization and virtual appliance management.
And the cloud looks like an alternative approach for executing scientific applications. The main benefit over a grid infrastructure is that it allows users to define their specific execution environments. This is one of the crucial points.
We also see some challenges in the near future. One of the most important is that infrastructure providers are currently different silos, so software gateways should probably be developed to aggregate these kinds of infrastructures. We also see a large ecosystem of virtual machine managers; many of them share the same functionalities and goals, so we'll probably see the rise and fall of some of them in the near future.
And it is true that there exist common APIs for cloud computing, so that you can develop your application against one single API and then access different cloud infrastructures, but we'll have to see how this evolves over time. A final thought is that clouds and grids will both provide computational support to scientific applications. Okay. This is all I wanted to tell you; if you have any questions I will be glad to answer them. Thank you very much.
[applause].
>>: Questions for our speaker? Bill.
>>: So you mentioned this cause of the [inaudible] VMs. [inaudible] people sort of develop [inaudible] get it running and save it all as [inaudible] so the automation step that you are offering, where someone expresses application requirements at a high level and automatically [inaudible] VM, I guess I'm not quite clear why that automation needs to exist. I mean, at some point somebody has to manually construct the application once [inaudible].
>> German Molto: I mean, the first time you have to manually configure the virtual machine in order to install the application, but you can replace this manual installation process with an automated process. And this automated process is [inaudible].
>>: You've already done the manual process once [inaudible].
>> German Molto: Yes. This is true. This is true. But this is just one single infrastructure. For example, if you are using Amazon EC2, you just create a specific virtual machine for this infrastructure. But imagine that you might access different infrastructures simultaneously. For example, Amazon EC2 uses one hypervisor, and then you can use another infrastructure that uses a different hypervisor, so you might find yourself in a situation where you have to deploy the --
>>: [inaudible].
>> German Molto: But once you have configured the application, you can save it as a pre-contextualized virtual machine so that it can later be reused.
>>: Any other questions?
>>: I have one more.
>> German Molto: Okay. [laughter].
>>: [inaudible] unit in the sort of dataflow diagram you showed at the first step,
the user expresses application requirements. How are they expressed?
>> German Molto: Currently we're working with OVF, but we think that OVF -- it is an XML document, but it is very expressive. I mean, we're currently thinking about how to transition to another, probably more appropriate format that is not so expressive, because many times the requirements of applications don't need to be expressed in so much detail. So we currently have a proof of concept using OVF, but we will probably transition to another approach.
>>: Let's thank the speaker.
[applause].
>> Gregor Srdic: Hi and welcome. Well, I see that my presentation is very different from all the previous ones, as I'm going to present our project, an application that was built to demonstrate a proof of concept and technology and to use the tools that were available to us.
I come from the University of Maribor in Slovenia, and I'm part of the Cloud Computing Centre there that was established in cooperation with leading business partners in our field such as Microsoft, IBM and Oracle. The aim of this Cloud Computing Centre, which is actually the first in our country and also the first in our region, is to transfer technology and knowledge between the academic sphere and business. Besides research, my colleagues and I also assist in undergraduate study programs, and we try to motivate students additionally by including the newest concepts of computer science in our classes. So as a practical part of one of our courses we decided to give students an assignment to write down a few ideas about practical uses of cloud computing, and then we collected those ideas and selected a few of the best ones.
We built a high-level architecture and divided the students into several groups. Each group then built a partial solution individually, and at the end of the semester we integrated these solutions into final applications. This project, called SkyInfo, was one of those applications, and it showed enough potential that we decided to carry on developing it even after the end of the course.
The problem that this project addresses is, I'm sure, well known to everybody: in today's information society every individual is overwhelmed with a large amount of data. It surrounds us, and most of this data is unimportant and only makes it harder for us to get to the information we really want. A lot of time is also wasted processing this unnecessary data, and during this time a lot of the information that we were looking for can already become obsolete.
Well, to overcome these problems, we decided to filter all available data in relation to an individual's current location and expressed interests. Therefore SkyInfo is designed to offer an intelligent, anytime, anywhere available service which will provide users with relevant, localized and personalized information. The main idea is that on one side users submit messages about events they witnessed, either by writing a text message or by submitting a picture, and on the other side they receive events from other users based on their current position and subscriptions. Each subscription defines a category of messages to be retrieved and also a maximum distance and time validity of a message.
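A minimal sketch of that delivery check, assuming a message and a subscription are simple records with the fields named below (illustrative only, not SkyInfo's actual data model):

```python
import math
import time

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    to_rad = math.radians
    dlat, dlon = to_rad(lat2 - lat1), to_rad(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(to_rad(lat1)) * math.cos(to_rad(lat2)) * math.sin(dlon / 2) ** 2
    return 6371.0 * 2 * math.asin(math.sqrt(a))

def relevant(message, subscription, user_lat, user_lon, now=None):
    """A message is delivered if it matches the category, is close enough and recent enough."""
    now = now or time.time()
    return (message["category"] == subscription["category"]
            and distance_km(message["lat"], message["lon"], user_lat, user_lon) <= subscription["max_km"]
            and now - message["timestamp"] <= subscription["max_age_s"])

msg = {"category": "traffic", "lat": 46.55, "lon": 15.64, "timestamp": time.time() - 600}
sub = {"category": "traffic", "max_km": 5.0, "max_age_s": 3600}
print(relevant(msg, sub, user_lat=46.56, user_lon=15.65))   # True: nearby and 10 minutes old
```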
Of course, considering this solution, we also encountered a few problems. The first problem is that multiple people can witness the same event and send a message, so we had to develop an algorithm to identify those duplicates and treat them properly. Another problem is that after a user sends a few false messages, his integrity can certainly be questioned. Therefore, we designed a rating system that works on feedback: every user who receives a certain message can respond to it, either confirming it or denying it, and based on this feedback the sending user's rating is updated.
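One simple way such a feedback-driven rating could be maintained -- a sketch only, since the talk does not give the actual formula:

```python
def update_rating(rating, confirmations, denials, weight=0.1):
    """Nudge a sender's rating in [0, 1] toward the fraction of positive feedback received."""
    total = confirmations + denials
    if total == 0:
        return rating
    observed = confirmations / total
    return (1 - weight) * rating + weight * observed

rating = 0.8
rating = update_rating(rating, confirmations=1, denials=4)   # mostly denied -> rating drops
print(round(rating, 3))   # 0.74
```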
We have considered a lot of use cases for our application. These include
reporting of traffic accidents or reporting of traffic jams and other traffic
information, warning about approaching weather conditions, reporting lost and
found objects. Another interesting idea is localized advertising and so on.
Here we will take a closer look at one of the scenarios. Imagine Mary, who is walking in a park on a sunny afternoon. She finds a sad puppy who obviously got lost, takes a picture and submits it to our system. Luckily the puppy's owner is already looking for the puppy nearby, and he also uses our system, so he instantly receives a message with the position, and thanks to our system the puppy and his master are quickly reunited and happy once again.
Here we have a presentation of the high-level architecture, with the application decomposed into several modules. With this we have enabled loose coupling of components, and later I will mention that we decided to implement each of these components using Web Service technology.
The main part of the SkyInfo application is distributing messages. Therefore, we tried to develop an innovative approach to exchanging data about current events. Besides considering location and timeframe, we also tried to simplify the process of submitting messages. To submit a message the user only has to fill out a description and select a category, and then our client automatically gathers the location, either from an internal or external GPS or from other available methods. These messages are then distributed back to clients, which continuously report their location to our system and retrieve new messages.
There's also another way, for urgent delivery of messages: at the request of the user, messages can be delivered directly and instantly over the short messaging service.
The way of sending text messages which I just presented seems to me to be as simple as it could possibly be. But we have gone even further and developed another innovative approach to exchanging data about events. Here users only submit a picture, a plain picture, which can be submitted either over a Web Service or by e-mail or by multimedia messaging, and thanks to the use of Amazon's Mechanical Turk service we can classify this picture into one of the available categories. I forgot to mention that the picture has to include geolocation, but many devices nowadays already support that, so this is not a problem.
Here I have an example of a human task that is built by our system. Mechanical Turk provides a programming interface for building human tasks which are available online, and every end user on the Web can work on them. So I think it's a very interesting alternative to computational methods and algorithms, which are, at least for image recognition, very complex.
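For illustration, creating such an image-classification human task programmatically might look like the following sketch, using boto3's Mechanical Turk client against the requester sandbox; the categories, reward and question HTML are invented, and a real HIT form would also need a proper submit action.

```python
import boto3

# Sandbox endpoint so no real money is spent; swap for the production endpoint when ready.
mturk = boto3.client("mturk",
                     endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com")

CATEGORIES = ["traffic accident", "traffic jam", "weather", "lost and found", "other"]

def build_question(image_url):
    """Build an HTMLQuestion asking a worker to pick the category that fits the picture."""
    options = "".join(f'<p><input type="radio" name="category" value="{c}">{c}</p>'
                      for c in CATEGORIES)
    # Abbreviated form: a production HIT would also include a submit action for the answer.
    html = f'<img src="{image_url}" width="400"/><p>Which category fits this picture?</p>{options}'
    return ('<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/'
            'AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">'
            f'<HTMLContent><![CDATA[<html><body><form>{html}</form></body></html>]]></HTMLContent>'
            '<FrameHeight>600</FrameHeight></HTMLQuestion>')

hit = mturk.create_hit(
    Title="Classify a geotagged picture",
    Description="Pick the category that best describes what the picture shows.",
    Reward="0.05",
    MaxAssignments=3,                      # several workers, so answers can be cross-checked
    AssignmentDurationInSeconds=120,
    LifetimeInSeconds=3600,
    Question=build_question("https://example.org/events/1234.jpg"),
)
print(hit["HIT"]["HITId"])
```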
Well, our project is still a work in progress, but so far we've built a first version of the core system and two clients. On this picture is a mobile client; the first one is built on the Android platform, and we intend to build another one for Windows Mobile. We had some problems calling Web Services from version 6.5, so we decided to wait for version 7.
Here on the left is a presentation of new messages and on the right is a form for submitting new messages. This is a Web application which, besides displaying messages, enables users to subscribe to categories. It's built in Microsoft Silverlight technology.
Another challenge that we faced building this project was integrating many technologies. For this purpose we employed Web Services, which gave us kind of a bridge between different platforms. The core services of SkyInfo are built in Java and run on IBM WebSphere application server. The supporting services are built in Microsoft .NET and run on an IIS server. We have already moved the first of our services to Windows Azure, and the ones that are built in .NET could all easily be transferred there as well.
For human tasks, as I already mentioned, we used Amazon Mechanical Turk. For the Web interface we used Microsoft Silverlight. For mobile clients we used Android and Windows Mobile. And of course a big part of our service is using maps; we used Bing and Google Maps.
Well, as for the key contributions of our project, I believe that linking relevant information and events with location and user interest leads to the quickest access to desired information. And I think that our approach to exchanging data about occurred events by submitting only pictures is very innovative.
So to wrap up, I would like to say again what our project is all about. It's basically a framework for sharing information in a structured and personalized manner. We would like it to be useful for the general public, and SkyInfo could also be distributed as software as a service, we believe, to organizations that require a private information service. That's it.
[applause].
>>: Questions for our speaker?
>>: I have one. So you use Mechanical Turk for application in the cloud, right,
but you also have the users when they submit events into your cloud based app,
they classify them as well, right?
>> Gregor Srdic: Yes. They are two different ways. You can submit the
message. There you have to select a category but if you submit only a picture,
the picture is classified using Mechanical Turk.
>>: Okay. And how is your turn-around time on Mechanical Turk classification?
>> Gregor Srdic: Yeah, we didn't test that in reality; we have only tested it in the sandbox until now. So we did the human tasks ourselves. It was quick.
>>: I have a question. I live in the [inaudible] Brazil and so [inaudible] terrible traffic jams. And I would like to present a scenario to you and see how SkyInfo would respond to it. For instance, you have a traffic jam and hundreds of users start using their cell phones, sending pictures of the stopped cars in front of them, and these images are geotagged. But if I'm there on Snoopy Avenue sending pictures, can you decide based on the geotag whether I'm on the northbound or the southbound side? And if you're receiving hundreds of images of stopped cars, could you infer the reason for the traffic jam, or would that only be useful if a few cars were close enough to the reason why the traffic stops?
>> Gregor Srdic: It's an interesting point. I'm sure that it could be useful to know that there is an accident. With what we've done so far, it's impossible to tell on which side of the road it is happening. But you get some basic information, and maybe users can write it in the description. That's another way.
>>: Good point. I realize that you didn't mention what you do with the content of the SMS text messages you receive. Do you use human users to process the content?
>> Gregor Srdic: No. We just distribute it to the users that are receiving messages. We made some algorithms for identifying duplicate messages; these do read the content and try to find some patterns.
>>: How do you distinguish [inaudible] system in the cloud where people can actually subscribe interest in certain events and [inaudible]. So you're actually generating these messages from the devices out in the field, routing them up to the cloud and then to the subscribers, the people who are interested in them. So my question is, that's a very similar pattern to pub-sub systems, publish-subscribe, where people subscribe to [inaudible] people publishing messages. I'm curious if you're taking a different approach, if your platform is different from a pub-sub system in terms of your processing or scalability, or have you compared it against pub-sub systems?
>> Gregor Srdic: I'm not familiar with them.
>>: Okay. [inaudible]. Have you done any scalability testing in terms of how it scales with event loads?
>> Gregor Srdic: Well, we actually are running this in a private cloud so -- but we
haven't done extensive testing.
>>: Any additional questions? Let's thank our speaker one more time.
[applause].
>> Marco Parenzan: Thanks to everyone. What I want to present is, for us too, a proof of concept. I am a computer science engineer working for a chemical engineering laboratory called MOSE. It works on a technique that will revolutionize the world of research, called multiscale molecular modeling. This approach is applied in three main fields: material science, life science, and process simulation.
What is multiscale molecular modeling? Multiscale molecular modeling is an approach that allows, in the end, simulating a process at the engineering level starting from a quantum mechanics level of simulation. But this has to be done in multiple scales, because there are no computational resources available now, or in the near to medium to far future, to simulate all of this directly. So multiscale means that we have many stages, and the [inaudible] is to simulate the process starting from the quantum mechanics level and arriving at the engineering level.
How do we solve the problem of having many steps? This kind of modeling is message-passing multiscale. In general we have many software packages that simulate the same process, each modeling it at the proper level. Each application has an input, it runs its simulation, and then the output is passed to the next level.
This is mainly a sequential process, but the sequence of activities can obviously be modelled in a cloud. The sequence of activities can be distributed worldwide, because the laboratory collaborates with many other groups, laboratories and companies, so the laboratory is able to simulate in house but can also distribute the work around the world.
At the same time, the customers are worldwide. But at this moment the laboratory is not on the cloud, because we are based on software that is not on the cloud; the simulation software is not on the cloud. So in house we have a pile of servers on which we run simulations. Specifically, this proof of concept is for us a way to talk with the people who give us the software, or internally, about how we can do computation on the cloud.
We also need the cloud because the cloud is a useful collaboration platform, so it can allow us to distribute the work around the world.
Can MOSE access the cloud alone, in terms of its focus? It is a chemical engineering laboratory, and chemical engineers are the typical users. You have to understand that I am the only computer engineer inside the laboratory. So often they also need to write some code for simulation, or some code that can parameterize this software.
The idea is to help these people write some code, code that can execute on the cloud.
So the objective of this research is to move the laboratory onto the cloud. We cannot wait for the software companies to do this, so we tried to do something to make this work. As I told you before, we as computer engineers can write this code, but there is always one big problem, which is the main problem: what does a molecule, or an engineering process, mean for a computer, for a computer science engineer?
So the idea is: why don't we enable non-computer scientists to write their own code, by simplifying how they write their own code?
One aspect we have to understand is what a non-computer-science engineer can do at the moment: what does simulation software, for example, ask of these people? Today it is normal for this kind of software to be accessible only through C++ code, which is quite difficult.
So, for example, one activity we have done with another group is moving the code to the CLR, writing in C# or VB.NET instead of VB6 or, worse, C++. The next steps are the usage of dynamic languages like Python or Ruby, and then the usage of domain specific languages, DSLs, which can abstract at another level the way we can express code and data.
Just a few slides. I'm not from Microsoft, but I like to share with you technologies that are from Microsoft. IronPython has been around for about three or four years but is not so widespread; few people use it. Dynamic languages are in this sector, and someone who comes from the open source community and is now part of Microsoft, Jim Hugunin, developed the first version of IronPython. And the nice thing is that with version 2 IronPython was split into two parts: one specific part that is the IronPython language, and the other part that is the Dynamic Language Runtime, the common part for dynamic languages. In fact, in these days we are probably waiting for version 1.0 of IronRuby, which is the other main language in the dynamic community.
One other big thing is that these languages run natively in .NET, so they use completely CLR types, completely managed code. They run under Azure, as we will see in a few minutes. They are quite advanced; for example, IronPython already handles Unicode, avoiding various problems of the original Python implementation, which is not yet fully Unicode.
And the nice thing is that the Dynamic Language Runtime allows other languages; for example, from the community we are getting a native JavaScript implementation running on the runtime.
And if you don't know, the DLR is also under C# 4.0 and .NET 4.0, of which we will have a final version in a few days, because the dynamic keyword in C# is in effect built on the DLR. In fact, Jim Hugunin has moved to the C# team.
This is one technology that we are using to simplify programming for non-computer people. Another tool that we are starting to use is Oslo. Oslo has already been renamed SQL Server Modeling; we could call it a research project, but it was not a research project -- it is becoming a product that will be out in some time, one or two years. What is Oslo? Oslo is part of a data platform from Microsoft; it has a language, the Quadrant tool and the repository.
Oslo is a framework of tools that allows us to create domain specific languages. We have a tool with which we can create languages in a simple, simple way, mainly textual languages, while Quadrant is a visual tool to interact with these domain languages.
And this is also useful because we use storage under Windows Azure -- not SQL Azure but Azure storage: blobs, tables, or queues, but mainly blobs. So what is Oslo? Oslo is a tool that allows us to make a parser for unstructured data. In fact, when Ed Lazowska said that the scientific world lives in a big amount of unstructured data, well, Oslo is a nice tool to structure that data. In particular, Oslo includes a schema language to give a schema to any data we want. In fact, this is why Oslo went into SQL Server; it is an SQL Server platform, or mainly a data platform.
So which are the simplification steps? What is our proof of concept? We have a Windows Azure application, specifically a Web application -- a Web Role in an Azure application with MVC2 -- and a worker role for background processing. With this Web application we load Python scripts, for example, that are executed in the worker role. How do we interact with these dynamic-language jobs that run on a worker role? We do input and output with textual messages that are parsed with Oslo. So we will see a demo to make the messages concrete.
I have something just to show what Oslo is: a file that does not relate to the example, but that shows what structured data looks like. I have downloaded a comma-separated file from a business website with currency exchange rates. This tool is called [inaudible]. In the center you have a grammar, which is an evolution of a typical [inaudible] grammar. The nice thing about this tool is that you write the grammar, on the left you have the source, and on the right you have the M graph, which in this case is the serialization of the abstract syntax tree that is generated by this tool. So this is to show you how to structure unstructured data, specifically for this example. Okay. My example is quite simple: matrix multiplication. But suppose you had an input made in this way, not in a coding way but in a way that is probably similar to MATLAB, for example.
Okay. You have two variables, E and M, a vector and a matrix. You see that there is a grammar and then you have the result of the parsing, which is accessible to a programming language to [inaudible] the data. This is the input. For the output, part of the output is a template in which the results are written to a text file. So on the opposite side I have written a grammar with which I parse a template, and you see text and expand blocks: text is just written as text, while expand reads a value and writes that value. Okay. With Oslo it's quite simple to write these kinds of things. So if we look at the application -- I have done a simple application; I would like to call it the Facebook of code, but probably it's -- oh, okay.
Here, for example, as a publisher I can write some Python code, and you see how simple it can be. You don't have an environment, you don't have Visual Studio, you have just a Web page in which you write the code, just the code you need. Okay: a simple matrix multiplication in the Python language. And sorry that the zoom is not good, but for example here you see a dictionary of math, which is the schema resulting from the parsing of an input. I need a dictionary, and I have an output, M report, which is the template we were seeing, defined in an administrator part. I have a map of the dictionary and [inaudible].
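For readers who cannot see the slide, a user-submitted script in roughly that spirit might look like the following; the data["M"] and data["V"] names stand in for the dictionary produced by the Oslo parsing step and are not the demo's exact code.

```python
# A user-submitted script in the spirit of the demo: plain Python, no IDE required.
# 'data' stands for the dictionary produced by parsing the textual input with the grammar;
# the returned value is handed back to a template that renders it as a text report.

def matmul(a, b):
    """Multiply matrix a (list of rows) by matrix b (list of rows)."""
    cols_b = list(zip(*b))                       # columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def run(data):
    m = data["M"]                                # matrix parsed from the input message
    v = data["V"]                                # vector parsed from the input message
    result = matmul(m, [[x] for x in v])         # treat the vector as a one-column matrix
    return {"result": [row[0] for row in result]}

if __name__ == "__main__":
    example = {"M": [[1, 2], [3, 4]], "V": [1, 1]}
    print(run(example))                          # {'result': [3, 7]}
```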
So you see that this Web application can organize the code and the project for a person that has no programming skills with Visual Studio. To interact with this application you have to send messages. For example, you have a request in which you define: I want batch processing for this component, and here is the input. Simple. Okay, a simple message: a matrix multiplication that computes the square of the values.
Save. At this moment, when I save to a table, I have already sent a message that is written on a queue. This triggers the worker role, which reads the message, finds in which language the code is written, and then you see on refresh that a message is posted.
Okay, I have made a mistake because I needed a vertical vector, sorry, so I have to put the comma -- okay. This obviously is a Web interface that we plan to replace with a WCF Web role, a REST application in which all the information is serialized and then parsed. Okay, you see it has 1, 4, 9, 16, which is the [inaudible] multiplication.
That is the demo. So back to my presentation. Just to say that the example is simple, but the approach is important. This is the result of one of the simulations we do with our software, but we think of it as just a little more complicated matrix multiplication, with a matrix that is quite large. So the cloud can be useful here, because we can divide that computation over multiple worker roles, multiple instances of a worker role, that can carry out that multiplication.
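A sketch of that splitting idea: each block of rows of the matrix can be multiplied independently, so each block could become one queue message handled by one worker role instance. The loop below runs the blocks sequentially just to show the decomposition; queue names and message formats are left out.

```python
# Row-block decomposition of C = A * B: each block of rows of A can be multiplied
# by B independently, so each block can become one queue message for one worker.

def row_blocks(matrix, block_size):
    for start in range(0, len(matrix), block_size):
        yield start, matrix[start:start + block_size]

def multiply_block(block, b):
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in block]

def parallel_matmul(a, b, block_size=2):
    # In the real system each call to multiply_block would run in a separate
    # worker role instance, triggered by a message on an Azure queue.
    results = {}
    for start, block in row_blocks(a, block_size):
        results[start] = multiply_block(block, b)
    return [row for start in sorted(results) for row in results[start]]

a = [[1, 0], [0, 1], [2, 2], [3, 1]]
b = [[4, 1], [2, 2]]
print(parallel_matmul(a, b))   # same result as multiplying a * b directly
```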
So the result -- well, the proof of concept is that all these pieces work together. This is working under -- I have not shown you, but this is the Azure development environment, so this is code that can be deployed to a real Azure account.
The conclusion. Why does MOSE need the cloud? Because we need a platform like Azure that allows us to create messaging-based applications. In the demo we have seen the creation and the execution of a simple step in the process, with input and output that are closer to the scientific world than to the information world. You have not seen XML, which was quite typical for writing parsable text but was quite awful to be used by scientific people.
In the code we have libraries that define these components and invoke, for example, IronPython and Oslo.
What's next? The next step is obviously to continue with the project. One idea is the definition of a process. I have included a representation just to relate our work to the presentation yesterday by Paul Watson, where a person made a workflow of components. Well, our work, this proof of concept, is exactly the single workflow step, and now we also need the entire platform.
Collaboration in the process: how do we share these input/output messages? A blogging tool can be useful, and Twitter integrated with the platform can also be useful.
But the next step is verticalization on the domain. Python is useful, but we can abstract again by writing a DSL, a real language. I made one -- here I cannot do the demo because I have written the code with [inaudible] that is .NET 4, while Windows Azure is on 3.5; this is why yesterday I asked when the cloud will be on 4.0. This is the grammar for a custom language that has the same output. The idea is that we can have a standard tool that allows us to express the normal operations, probably in a more verbose way, to be readable and more understandable for scientific people. But it would be interesting, for example, to have a mix of Python syntax and LINQ-like syntax to interact with data sources while avoiding connection strings. And this is nice because, living in a cloud in Windows Azure, we can abstract away the need for a connection string and say: we are on an Azure account, so give me that particular blob or table or queue object.
So this is it. This is, as I told you, a work in progress. And I think it's also an invitation to use these tools, because they are very, very useful. Thank you.
[applause].
>>: Questions for the speaker?
>>: I have one. So is your bottom-line goal with all this to make it easy to define specific languages to pass standardized formats around between the multiple simulation codes covering each size scale? Or is it to rewrite simulation codes into something that runs in the cloud environment?
>> Marco Parenzan: The first step is the idea that we can abstract the programming environment as much as we can for the chemical engineer. Then, if he can write code, he can try to write the code he needs, so also the simulation code. There are many examples; one example is process engineering, in which you have mixers or splitters or reactors, which can be written in some code that can be expressed in just a few equations. So what we want is not to make the chemical engineer write a class or a function, but just write the equations, declare which are the inputs and which are the outputs, and then execute it.
These techniques can obviously be applied outside the cloud. The nice thing about the cloud is the storage, because the chemical engineer in front of a computer does not know where to save the information; he only writes text files. So Oslo can define the schema, and Azure can save these files in a readable way, which is quite important.
>>: So unless it's defined and saved, how would a group of chemical engineers which is distributed across several locations -- how could such a group cooperate, combining and reusing [inaudible]?
>> Marco Parenzan: In this demo you have not seen a collaboration platform. In fact, as I've written, we need to define a process.
>>: Okay.
>> Marco Parenzan: This proof of concept works on expression: can we abstract away the programming skills required of the chemical engineer and so simplify the work of the chemical engineer? Can a chemical engineer write some code? This is the first question that this work answers.
The next step is to build the real platform that allows the collaboration. As I said, the work presented yesterday by Paul Watson showed that other people are working on the same topic.
>>: Any more questions for the speaker?
So I have one for you. First, I was really surprised to see Oslo in action [inaudible]; I really liked that. But can you talk about the [inaudible] of Azure for some of the simulations -- what the simulations [inaudible] one worker, or the class of simulations you're targeting right now?
>> Marco Parenzan: This work is a ticket for us to go to the simulation software companies and say: well, bring your software to the cloud. In Trieste there is a company that does this kind of work, and there is a project on this. And this can be a sort of another runtime: we could say that Python, or a DSL, can be another specific runtime, the runtime for this application, for this platform, for this software company to bring their software to the cloud. We know that we cannot write an entire simulator, but it's an invitation to port the code, an invitation for this company to bring their code to the cloud.
>>: [inaudible].
>> Marco Parenzan: What I see is that chemical engineers have many difficulties [inaudible], so we have to abstract, and Oslo -- Oslo is not a revolution in terms of languages, because we probably have more powerful tools, for example ANTLR, which is a component that can generate parser source for us. But Oslo is a typical Microsoft tool, a very simple tool to use. We can relate to Oslo like we relate to an XML DOM; it's the same. In fact yesterday -- yeah, I'll show you -- I was with [inaudible] and we asked what Oslo would become, and the answer was: a schema, a schema for text files and for generating text files. This is the objective.
>>: Great.
>>: Just real quick, I want to re-ask one part of Roger's question, which is that there are at least two classes of simulations: ones that work on a single node and ones that don't work on a single node. And one part of Roger's question was: do you intend to support the ones that do not work [inaudible] and require orchestration of [inaudible] codes? And if so, how?
>> Marco Parenzan: As I told you before, we know that we cannot rewrite an entire simulator.
>>: It's not a rewrite, I guess it's just -- there's a scale problem, I suppose, in that even if someone else comes in and hands you a CFD simulation, a computational fluid dynamics simulation, that necessarily runs in parallel because of its performance requirements.
>> Marco Parenzan: Okay. Okay.
>>: Azure may not be the best fit for that, for some of the reasons Roger mentioned. Do you intend to just ignore -- which is fine --
>> Marco Parenzan: No, no, no, no, I understand. The idea with the DSL is to model -- to simplify in the DSL the way we can express the distribution of a computation. So we try to abstract; our objective is always to write for a vertical domain. So, knowing that there are difficulties in dividing the process into parallel parts, the idea of the DSL is to abstract the parallelism into the language as well, to try to make it accessible to both parallel and non-parallel programmers.
>>: Let's thank our speaker one last time.
[applause].
>> Marco Parenzan: Thank you.
>>: Our final speaker of the session is Domenico Talia from the University of Calabria, on Towards an Open Service Framework for Cloud-based Knowledge Discovery.
>> Domenico Talia: Thank you. So my talk is a discussion about a strategy for the implementation of a service-oriented framework for running knowledge discovery applications on cloud-based systems. The presentation is mainly divided in two parts. In the first part I will try to discuss one approach for the composition of service-oriented distributed knowledge discovery tasks and applications on -- yeah, okay, today we must say cloud because we are at a cloud-oriented conference, but I think we can also speak about large scale distributed systems or large scale high performance computing systems; we understand what we mean.
So the idea is to try to outline an approach for implementing large scale knowledge discovery tasks as services on this kind of platform, and to investigate how these knowledge discovery and data mining services can be used to implement distributed analysis applications according to the service oriented architecture model.
So in the first part I present the approach, and in the second part of the talk I will give you some references to real projects we are running, where we developed some software according to this approach, which shows the feasibility of the approach and the way in which these services can be used for real implementations of distributed knowledge discovery applications.
So, you know, we have to deal with complex problems, with bigger and bigger problems today, and most of them come from the data. Okay, obviously, it's life. But the main issue here is that we have to deal with very huge data sources. Data sources today are larger and larger, and often they are distributed, okay? So starting from this scenario, we have to deal with this huge amount of data, and we have to use this data. And obviously we have one problem which is storing data, okay? It's a problem. But from my point of view, the main problem is not storing data; the main problem is to analyze data, to mine data, to process data trying to understand it.
So obviously one problem could be having no data, having no information. But the other side of the coin is having so much data that you cannot understand anything: you cannot manage it with traditional tools, and it cannot be read by humans. So having too much data is more or less like not having data, okay?
Now, just to mention some estimates -- I'm not sure how accurate this estimate is, but it's an estimate, okay? Sorry, it's not 2006 but 2009; the 9 changed its position. In 2009 it seems that we produced 750 billion gigabytes, and the problem is that in 2010 we are going to produce one zettabyte of data, which is an impressive amount. And just to show you -- I'm not sure if this is the point or not -- this is the forecast produced by IDC, and you see we have a gap between the available storage and the information created. So this seems to be the trend, a very impressive trend. So starting from this scenario, we must be able to face this challenge: to handle this very large amount of information and the associated complexity, which is related to the processing of this large amount of data.
So the idea here is to use, as I said before, large scale distributed systems like HPC systems, cloud systems, grid systems and P2P systems to provide sufficient computational support for this kind of application, for analyzing this data and for discovering the interesting parts of this data. We had some talks today and yesterday on this topic -- very, very interesting talks -- that seem to be in accordance with this approach, and part of that work has been done by well known and very expert people.
So the idea here is to try to use this kind of platform to support the composition of an integrated framework for doing analysis of data through service interfaces. Okay? The basic idea is to try to identify the single steps of this big, complex process and implement each step -- I will show you the model later -- as a single service, and then compose the services into a sort of ecosystem of services that may run on this kind of large scale infrastructure.
The result is a sort of data analytics cloud, you know. We have big data centers, and data centers are used to store data, but not only to store data: also to process it and to query it. And typically the interface to the data is a service oriented interface. So in this case, the idea is to have data analysis or data analytics services for handling this large amount of data.
And obviously in this kind of infrastructure we have security facilities; resource information services to identify the resources we have in terms of data, software, algorithms and so on; communication mechanisms; scheduling; fault detection. So all this is the sort of basic infrastructure that could be used for implementing data analytics clouds.
Now, when we speak about data analysis, we have to deal with data mining algorithms, so with the distributed and parallel implementation of data mining algorithms. That means adopting a data parallel approach or a task parallelism approach, managing data dependencies, having the possibility to define workflows -- so dynamic task graphs that obviously are based on the data dependencies themselves -- and dealing with dynamic data access. All these aspects are part of the problem of implementing parallel data mining and distributed mining approaches. Actually we may say that parallel data mining algorithms could be part of distributed mining applications, especially in a cloud: I may have a complex data analytics application that runs parallel data mining algorithms in some part of the cloud, in some virtual machines, and these are part of a larger scenario, a larger application, which is distributed for example over a larger cloud or some intercloud, okay?
And obviously we need to program these data mining operations, tasks and patterns. This is one key point for me, because my idea is to implement these data mining operations as services, okay? Services that can be integrated with one another. Obviously we may address the problem at different levels, and these different levels are not alternatives to one another, okay? So obviously we need to use traditional approaches -- libraries, languages, concurrent languages and so on, as we know. But this different approach based on components could be an alternative, or could be part of the solution, could be a way to integrate both approaches.
And on top of that, we should think about Web services, grid services, cloud services, workflows, mashups and so on.
Okay. If we go up, we increase the grain size of applications, the grain size of tasks. And going down we increase the number of processes, so the degree of concurrency we have: the grain of the tasks decreases and the number of tasks increases. Okay.
So, to do this we suggest exploiting the service oriented architecture to define basic services for supporting distributed mining applications in large scale distributed systems, as I mentioned before -- for example in private clouds, but also in larger clouds like interclouds. Okay. So the idea is having services for data selection, for data transport, for data analysis, for knowledge model representation and also for visualization.
More in detail, the idea is this one. If you need to run a data mining application or a knowledge discovery application, you need to identify the steps of your task, of your application. Obviously you start from the data, okay? You have to identify the data sources on which to run the analysis. And then, for example, you can pre-process the data, filter the data and so on; you need to prepare the data for the mining. Okay?
So in general, in a KDD process, this is the first part of the process, and we think that each of these operations, each of these single KDD steps, can be implemented as a service, okay? So you have services, for example, for pre-processing data, for filtering data, for transforming data, getting it into the right format for the analysis, okay?
Going ahead each single data mining task can be completed as a service. For
example, we may have classification algorithms, clustering algorithm or a priori
algorithm, association rule discovery and so on. Each one of these can be
implemented as a service. Okay? You may I have the code of this algorithm in
Java or C# or whatever, but you may offer them as a single service, okay?
So in each layer we have a collection of algorithms that are offered as services. Then, in a distributed setting, we can compose single data mining tasks or single KDD steps to implement distributed mining patterns, okay? So if I want to run, for example, a parallel classification or a meta-learning application or a collective learning application that runs on a large number of machines, okay, I can take the single services for analysis on single machines and compose a distributed mining pattern, a distributed mining application.
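A minimal sketch of one such pattern, meta-learning by majority vote, assuming the base learners are reachable as services; the stub below runs in-process, where a real deployment would make remote service calls, and all names are invented for illustration.

```python
# Hypothetical sketch of a meta-learning pattern: several base "classification
# services" are invoked in parallel on different data partitions and their
# predictions are combined by majority vote.
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def classification_service(partition, sample):
    """Stub for a remote base-learner service trained on one partition."""
    threshold = sum(partition) / len(partition)
    return "high" if sample > threshold else "low"


def meta_learning_predict(partitions, sample):
    # Fan out one service invocation per partition, then vote over the results.
    with ThreadPoolExecutor() as pool:
        votes = list(pool.map(lambda p: classification_service(p, sample), partitions))
    return Counter(votes).most_common(1)[0][0]


partitions = [[1, 2, 3], [10, 12, 14], [4, 5, 6]]
print(meta_learning_predict(partitions, sample=7))  # combined prediction
```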
What is important is that this, too, could be a single service which is composed of a set of services that come from the lower layer. On top of that, we can have a complete data mining application or a complete KDD process implemented as a single service, built as a collection of the previous tasks and patterns. Typically this could be done as a sort of multi-step workflow, okay? Where each node of the workflow can be a single service or a composite service, so it can be composed of several elementary, small-grain services. Okay?
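A sketch of such a multi-step workflow as a plain dependency graph of named services, with a toy executor that runs nodes in dependency order; the node and service names are invented for illustration, and a real engine would dispatch each node to a remote service rather than to a local callback.

```python
# Hypothetical sketch: a KDD process expressed as a multi-step workflow,
# where each node names a service and lists the nodes it depends on.
workflow = {
    "select":    {"service": "data-access",    "depends_on": []},
    "filter":    {"service": "pre-processing", "depends_on": ["select"]},
    "cluster":   {"service": "clustering",     "depends_on": ["filter"]},
    "visualize": {"service": "model-viewer",   "depends_on": ["cluster"]},
}


def run_workflow(workflow, invoke):
    """Run nodes in dependency order; invoke(service, inputs) would call
    the corresponding remote service in a real system."""
    results, pending = {}, dict(workflow)
    while pending:
        ready = [n for n, spec in pending.items()
                 if all(d in results for d in spec["depends_on"])]
        for node in ready:
            spec = pending.pop(node)
            inputs = [results[d] for d in spec["depends_on"]]
            results[node] = invoke(spec["service"], inputs)
    return results


# Toy invocation that just records which service would be called with what.
print(run_workflow(workflow, lambda svc, inputs: f"{svc}({', '.join(inputs)})"))
```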
So we have a way to compose data mining applications in which we reuse the services in the lower levels to implement applications in the top layer. Okay? So at the end we have a sort of open service framework for cloud-based data mining. In this way, having single services and having a way to compose the services into an application, we allow developers to program distributed data analytics processes or applications as a composition of single or aggregated services available over a cloud. Okay?
So these services obviously should exploit other basic cloud services, for
example, for data transfer, replica management, data integration and querying
that perhaps are already available in many clouds, okay?
And all this is a sort of ecosystem, a service oriented ecosystem for data mining. Okay. So at the end, this kind of approach may result in service-based distributed mining applications for communities -- as we discussed before, the community of chemistry people or other scientific communities that need a way to run data analysis applications -- or for virtual organizations, so communities spanning different physical organizations.
And we may also have distributed data analysis services on demand, because people may access the cloud and ask to analyze some data on demand, okay? So at the end we have a sort of knowledge discovery ecosystem.
Okay. So I hope I have 5 to 10 minutes just to show some examples of this kind of approach, because I tried to outline the general way of implementing it and now I will show you some examples. Just to finish this first part, obviously we may wonder whether data mining services are or are not programming abstractions, in terms of, I would say, a programming language approach. Viewed in the traditional way, this is apparently not a programming abstraction proposal. But I think that we should consider data mining services as programming abstractions, because if each single data mining algorithm or single data mining application can be used as a small element of a more complex application, we should have mechanisms to compose them, to program them, to build more and more complex applications. So we have the basic services as simple operations. Okay?
And using service programming languages for composing them, like workflow-based applications or workflow-based parallelism, we can compose the basic services to program data analytics applications. And obviously we have complex services and their composition towards the implementation of distributed programming patterns for data analytics services.
Okay. So just to show you some examples of the approach I presented here, I will give you a very quick overview of four systems. I have no time to go into details, but we have a lot of material in case you are interested in any of the systems. And some of them are open source and available on the Web. Okay.
Weka4WS is one approach to implementing the Weka toolkit, which is a well known open source tool for data mining. The only limit of Weka is that it runs on a single machine, so it's a sequential application. What we did is provide an implementation based on Web Services so we may run it at a larger scale, so typically on the Web, on larger scale infrastructure.
The Knowledge Grid is another data mining framework, a software framework for implementing the approach I have proposed.
And then we did some experiments in providing Mobile Data Mining Services. And this is very interesting because it is a way to couple mobile devices with the cloud. So you may use the cloud also by requesting the execution of a data mining application from a mobile device, okay.
And then the last one is Mining@Home, which is an approach based on a peer-to-peer architecture.
Okay, just to show you some detail. This is the interface of Weka4WS, in which you have a way to compose very complex workflows of data analysis, an entire knowledge discovery process that typically starts from a data set that can be divided, partitioned into several data sets. And all of this can be run in parallel on a cloud or on a set of Web servers, so each node can be run on a different machine. Okay? The interface allows the user to program the application at a very high level, and then the system will run the application in parallel, move data where it is needed, and move results back to the user when they are available. Okay? So in this way we program a data mining workflow and run this workflow in parallel on a large scale distributed infrastructure.
The same approach has been used in the service-oriented Knowledge Grid. The idea is, as I mentioned before, that we have a set of services, a way to compose the services, and a way to run them on a distributed infrastructure. So what are the services here? The services are data access services, data mining algorithms offered as services, distributed mining patterns offered as services, and so on.
So you have a catalog of all the software and data you have available to compose your application. Then you look at the catalog of all the resources, all the services available, and compose your workflow. Okay? The workflow composition is very abstract; I will show you a slide of the interface. And then the system takes care of running these in parallel on a distributed infrastructure, so the user doesn't have to worry about this transformation, this passage, okay? In this way the user focuses on the services available and on the application, not on the architectural details of the distributed infrastructure. Okay?
Yeah. The idea is this: this is just an extended UML approach in which you compose your workflow and then annotate each node with some information about the algorithm or the tools, et cetera. But perhaps it's better to show this interface, okay, because you compose your application as a workflow. For example, in this case we take a data set, we split it, because we have a splitter, into different partitions and we run them on four machines, for example, okay? And then we collect the results and produce the model. Okay? So the user has this kind of interface, okay? This interface corresponds to this layer and this layer. Then, when the application is composed and all the tasks are green, it means everything is available. You may say run, and it is run on a distributed platform.
Okay. I'm going to conclude the talk by showing what we did with the implementation of a service oriented mobile data mining framework. In this case we have the cloud, we have a set of machines, and we have clients that are mobile, okay? So here the idea is that we have data providers and mining services that typically are on the cloud, and mobile clients that are outside. Okay. So in this case, the idea is that you can access the resources, select the data set you need to analyze, select the algorithm from your mobile phone, and then say run. The application is run on the distributed infrastructure and the results are sent back at the end. All this is done through a service oriented infrastructure, through a service oriented interface, because on the phone there is just a client for service invocation. Okay? So it's very simple.
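A hedged sketch of what such a thin, invocation-only client might look like: it submits a mining task to a cloud-side service and polls for the result. The endpoint URL, payload fields and algorithm name are placeholders for illustration, not the actual interface of the framework presented in the talk.

```python
# Hypothetical sketch of a thin mobile-style client: it only invokes remote
# mining services and fetches results when they are ready.
import time

import requests

BASE = "http://example.org/mining"  # placeholder service endpoint


def submit_mining_task(dataset_id, algorithm, params):
    # The client just asks the cloud-side service to run the analysis.
    resp = requests.post(f"{BASE}/tasks",
                         json={"dataset": dataset_id,
                               "algorithm": algorithm,
                               "params": params})
    resp.raise_for_status()
    return resp.json()["task_id"]


def wait_for_result(task_id, poll_seconds=5):
    # Poll until the server reports the task as finished, then return the result.
    while True:
        status = requests.get(f"{BASE}/tasks/{task_id}").json()
        if status["state"] == "done":
            return status["result"]
        time.sleep(poll_seconds)


if __name__ == "__main__":
    task = submit_mining_task("census-2010", "j48-classifier", {"min_leaf": 5})
    print(wait_for_result(task))  # e.g. a summary of the induced model
```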
Okay. So what happens is that you can choose some parameters and then you see the result. Obviously here we have some problems with the representation of results, because the screen is very small, but we provide an interface that lets the user select which part of the result he wants to visualize at which time. Okay. The last one tries to exploit this approach on a sort of very large scale infrastructure using the public resource computing paradigm. So the idea is to emulate the approach of SETI@home or similar systems, having Mining@Home. In this case, obviously, the idea is that, for example, a community of scientists that share data that are public -- typically if you think about biology data -- okay, I'm going to finish -- they may cooperate by sharing data, sharing machines and obviously running applications on these machines. So this is a sort of decentralized data analysis application that could be programmed as a large collection of tasks and services. Okay. This is just a snapshot.
Okay. So, yeah, obviously we evaluated the impact, I mean the overhead, of the services, and typically the overhead is not that much; it's a very small percentage. So offering the application through a service oriented interface does not complicate things, does not add a very large overhead. This means that it's feasible. Okay. We also did some speedup evaluation. Okay. This is just the last slide, just to conclude. So the idea is that, okay, I think we all agree that high performance computing infrastructures may allow us to attack new problems, but they are required to solve more challenging problems, and perhaps new programming models and new programming environments are required.
So these programming frameworks should be devoted to managing data and analyzing data, because data is becoming a very big player. So programming data analysis applications and services is a must. Okay. In the long-term vision, this approach may lead to a sort of pervasive collection of data analysis services and applications that can be accessed and used as public utilities.
And I think that this approach is in accordance with the approach that the cloud community is pursuing at this moment. Obviously we must be ready to manage this very complex scenario. Okay. Thank you very much.
[applause].
>>: Unfortunately we only have time for one question, but --
>> Domenico Talia: Okay. We can discuss later.
>>: So if there's any questions.
>>: I'd just like to give a comment, in the most humble sense: I think you should reconsider your use of the red color in the slides, because sometimes it attracts attention to what's not the essential part, okay. That's cultural [inaudible].
>> Domenico Talia: Thank you.
>>: I just wondered whether you have -- it would be possible, maybe you both
can answer the question.
>> Domenico Talia: You may start. [laughter].
>>: ... of using workflows to deploy the combination, the composition, of the mining.
>>: I can only say that's a great idea [inaudible] abstracting a way [inaudible].
>>: Exactly.
>>: I love the hierarchal [inaudible].
>>: Yeah.
>> Domenico Talia: Yeah, if I can add something: users of this kind of system often are not computer scientists. So people -- scientists, or business people, I don't know, those who give support to businessmen -- should focus on the application, not on the architecture of the task. And perhaps a workflow, service oriented approach could help.
>>: [inaudible] when we talk to financial analysts about their pipelines, they're doing different computations, but if you look at the core level they're using the same approach with different parameters in different ways. So it scales [inaudible] support 100 pipelines through workflow [inaudible] very nice.
>> Domenico Talia: Okay. Thank you very much.
[applause]