The Distributed Computing Environments Team

Environments for eScience
on Distributed Infrastructures
Marian Bubak
Department of Computer Science and Cyfronet
AGH University of Science and Technology
Krakow, Poland
http://dice.cyfronet.pl
Informatics Institute, System and Network Engineering
University of Amsterdam
www.science.uva.nl/~gvlam/wsvlam/
Coauthors
• Bartosz Balis
• Tomasz Bartynski
• Eryk Ciepiela
• Wlodek Funika
• Tomasz Gubala
• Daniel Harezlak
• Marek Kasztelnik
• Maciej Malawski
• Jan Meizner
• Piotr Nowakowski
• Katarzyna Rycerz
• Bartosz Wilk
dice.cyfronet.pl
• Adam Belloum
• Mikolaj Baranowski
• Reggie Cushing
• Spiros Koulouzis
• Michael Gerhards
• Jakub Moscicki
www.science.uva.nl/~gvlam/wsvlam
Motivation and main goal
• Recent trends
– Enhanced scientific discovery is becoming collaborative and analysis-focused; in-silico experiments are increasingly complex
– Available compute and data resources are distributed and heterogeneous
• Main goal
– Optimal usage of distributed resources (e-infrastructures, ubiquitous) for
complex collaborative scientific applications
Collaborative eScience experiments
(1) Problem investigation:
• Look for relevant problems
• Browse available tools
• Define the goal
(2) Experiment prototyping:
• Decompose into steps
• Design experiment workflows
• Develop necessary components
(3) Experiment execution:
• Execute experiment processes
• Control the execution
• Collect and analyse data
(4) Results publication:
• Annotate data
• Publish data
All phases are supported by shared repositories.
A. Belloum, M.A. Inda, D. Vasunin, V. Korkhov, Z. Zhao, H. Rauwerda, T. M. Breit, M. Bubak, L.O. Hertzberger: Collaborative e-Science
Experiments and Scientific Workflows, Internet Computing, July/August 2011 (Vol. 15, No. 4), pp. 39-47
System under research
• Applications
  – Stream-oriented applications
  – Data-parallel applications
  – Parameter sweep applications
• Infrastructure
  – Desktops
  – Clusters
  – Grids
  – Clouds
• Scaling
  – Automatic task farming for grid jobs and web services
  – MapReduce
• Storage
  – Federated cloud storage
  – HBase
• Provenance
  – Open Provenance Model
  – XML history tracing
(Figure: a workflow connecting the provenance repository, federated cloud storage and cloud resources.)
www.science.uva.nl/~gvlam/wsvlam/
Research objectives
• Investigating applicability of distributed computing infrastructures (DCI;
clusters, grids, clouds) for complex scientific applications
• Optimization of resource allocation for applications on DCI
• Resource management for services on heterogeneous resources
• Urgent computing scenarios on distributed infrastructures
• Billing and accounting models
• Procedural and technical aspects of ensuring efficient yet secure data
storage, transfer and processing
• Methods for component dependency management, composition and
deployment
• Information representation model for DCI federation platforms, their
components and operating procedures
Spatial and temporal dynamics in grids
• Grids increase research capabilities for science
• Large-scale federation of computing and storage resources
  – 300 sites, 60 countries, 200 Virtual Organizations
  – 10^5 CPUs, 20 PB data storage, 10^5 jobs daily
• However, operational and runtime dynamics have a negative impact on reliability and efficiency
  – asynchronous and frequent failures and hardware/software upgrades
  – long and unpredictable job waiting times (from seconds to ~3 hours)
  – reliability drops from ~95% for a single job to below 10% for a batch of 100 jobs
J. T. Moscicki: Understanding and mastering dynamics in Computing Grids, UvA PhD thesis, promoter: M. Bubak, co-promoter: P. Sloot;
12.04.2011
User-level overlay with late binding scheduling
• Improved job execution characteristics
• HTC-HPC interoperability
• Heuristic resource selection
• Application-aware task scheduling
(Figure: completion time drops from 40 hours with early binding to 1.5 hours with late binding.)
J. T. Moscicki, M. Lamanna, M. Bubak, P. M. A.Sloot: Processing moldable tasks on the Grid: late job binding with lightweight user-level
overlay, FGCS 27(6) pp 725-736, 2011
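To make the late-binding effect concrete, here is a minimal simulation sketch (illustrative numbers, not the implementation from the cited work): with early binding every task is pre-assigned to a worker and inherits that worker's queue wait, while with late binding pilot jobs pull tasks from a central queue only once they are actually running.

```python
import heapq
import random

random.seed(42)

N_TASKS, N_WORKERS, TASK_TIME = 400, 50, 0.1   # hours; illustrative values only
waits = [random.expovariate(1 / 2.0) for _ in range(N_WORKERS)]  # queue waits, mean 2 h

# Early binding: tasks are pre-assigned round-robin, so every task inherits
# its worker's queue wait; the slowest worker determines the makespan.
per_worker = [N_TASKS // N_WORKERS] * N_WORKERS
early = max(w + n * TASK_TIME for w, n in zip(waits, per_worker))

# Late binding: pilot jobs pull tasks as soon as they start running,
# so slow-starting workers simply end up processing fewer tasks.
ready = list(waits)
heapq.heapify(ready)
for _ in range(N_TASKS):
    t = heapq.heappop(ready)              # earliest available worker
    heapq.heappush(ready, t + TASK_TIME)  # it takes the next task from the queue
late = max(ready)

print(f"early binding makespan ~ {early:.1f} h, late binding ~ {late:.1f} h")
```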
Cloud performance evaluation
• Performance of VM deployment times
• Virtualization overhead
• Evaluation of open source cloud stacks (Eucalyptus, OpenNebula, OpenStack)
• Survey of European public cloud providers
• Performance evaluation of top cloud providers (EC2, RackSpace, SoftLayer)
• A grant from Amazon has been obtained
Weighted evaluation criteria: EEA zoning (weight 20), jClouds API support (20), BLOB storage support (10), per-hour instance billing (5), API access (5), published price (5), VM image import/export (3), relational DB support (2).

Top-scoring providers (1 = criterion met, 0 = not met):

IaaS Provider    | EEA | jClouds | BLOB | Per-hour | API | Price | VM imp/exp | Rel. DB | Score
Amazon AWS       |  1  |    1    |  1   |    1     |  1  |   1   |     0      |    1    |  27
Rackspace        |  1  |    1    |  1   |    1     |  1  |   1   |     0      |    1    |  27
SoftLayer        |  1  |    1    |  1   |    1     |  1  |   1   |     0      |    0    |  25
CloudSigma       |  1  |    1    |  0   |    1     |  1  |   1   |     1      |    0    |  18
ElasticHosts     |  1  |    1    |  0   |    1     |  1  |   1   |     1      |    0    |  18
Serverlove       |  1  |    1    |  0   |    1     |  1  |   1   |     1      |    0    |  18
GoGrid           |  1  |    1    |  0   |    1     |  1  |   1   |     0      |    0    |  15
Terremark ecloud |  1  |    1    |  0   |    1     |  1  |   0   |     1      |    0    |  13
RimuHosting      |  1  |    1    |  0   |    0     |  1  |   1   |     0      |    1    |  12
Stratogen        |  1  |    1    |  0   |    0     |  1  |   0   |     1      |    0    |   8
Bluelock         |  1  |    1    |  0   |    0     |  1  |   0   |     0      |    0    |   5
Fujitsu GCP      |  1  |    1    |  0   |    0     |  1  |   0   |     0      |    0    |   5

The remaining providers evaluated (49 in total) scored 0: BitRefinery, BrightBox, BT Global Services, Carpathia Hosting, City Cloud, Claris Networks, Codero, CSC, Datapipe, e24cloud, eApps, FlexiScale, Google GCE, Green House Data, Hosting.com, HP Cloud, IBM SmartCloud, IIJ GIO, iland cloud, Internap, Joyent, LunaCloud, Oktawave, Openhosting.co.uk, Openhosting.com, OpSource, ProfitBricks, Qube, ReliaCloud, SaavisDirect, SkaliCloud, Teklinks, Terremark vcloud, Tier 3, Umbee, VPS.net, Windows Azure.
M. Bubak, M. Kasztelnik, M. Malawski, J. Meizner, P. Nowakowski and S. Varma: Evaluation of Cloud Providers for VPH Applications, poster
at CCGrid2013 - 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Delft, the Netherlands, May 13-16, 2013
Resource allocation management
The Atmosphere Cloud Platform is a one-stop management service for hybrid cloud resources, ensuring optimal deployment of application services on the underlying hardware. It serves developers, administrators and scientists.
(Architecture: users and external applications reach Atmosphere on the VPH-Share Core Services Host / Master Interface through the Cloud Facade, a secure RESTful API, and its client. The Cloud Manager comprises the Atmosphere Management Service (AMS), cloud stack plugins (Fog) and the Atmosphere Internal Registry (AIR), and supports the Generic Invoker, Development Mode and workflow management use cases. Resources are provisioned as head and worker nodes on an OpenStack/Nova computational cloud site with a Glance image store, on Amazon EC2 and on other cloud sites.)
Customized applications may directly interface Atmosphere via its RESTful API, the Cloud Facade.
P. Nowakowski, T. Bartynski, T. Gubala, D. Harezlak, M. Kasztelnik, M. Malawski, J. Meizner, M. Bubak: Cloud Platform for Medical
Applications, eScience 2012 (2012)
Cost optimization of applications on clouds
• Infrastructure model
  – Multiple compute and storage clouds
  – Heterogeneous instance types
• Application model
  – Bag of tasks
  – Layered workflows
• Modeling with AMPL (A Modeling Language for Mathematical Programming)
• Cost optimization under deadline constraints
• Mixed integer programming
• Bonmin, CPLEX solvers
(Figure: an example layered application with tasks A-F and layer runtimes from 0.3 h to 6 h, deployed across a private cloud, Amazon compute and storage (t1.micro, m1.small, m1.large, m2.xlarge) and Rackspace compute and storage (rs.1gb, rs.2gb, rs.4gb, rs.16gb).)
(Plot: cost ($) versus time limit (0-100 hours) for 20000 tasks with 512 MiB input and 512 MiB output each and a task execution time of 0.1 h on a 1 ccu machine; the series compare the optimal plan, multiple providers, Amazon's and private instances, Rackspace and private instances, Rackspace instances only, and storage on Amazon S3 vs. Rackspace Cloud Files.)
M. Malawski, K. Figiela, J. Nabrzyski: Cost minimization for computational applications on hybrid cloud infrastructures, Future Generation
Computer Systems, Volume 29, Issue 7, September 2013, Pages 1786-1794, ISSN 0167-739X, http://dx.doi.org/10.1016/j.future.2013.01.004
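The paper formulates the problem as a mixed integer program in AMPL solved with Bonmin/CPLEX. Purely as an illustration of the idea, the following toy sketch (hypothetical instance catalogue, prices and limits; brute force instead of a solver; fractional-hour billing simplification) picks the cheapest instance mix that finishes a bag of tasks before a deadline.

```python
from itertools import product

# Hypothetical instance catalogue: (name, cost per hour in $, tasks per hour)
INSTANCES = [("private", 0.0, 5), ("m1.small", 0.06, 8), ("m1.large", 0.24, 30)]
MAX_COUNT = {"private": 4, "m1.small": 20, "m1.large": 20}  # availability limits

def cheapest_plan(n_tasks, deadline_h):
    best = None
    # Brute force over instance counts; a real MIP solver replaces this loop.
    for counts in product(*(range(MAX_COUNT[name] + 1) for name, _, _ in INSTANCES)):
        throughput = sum(c * rate for c, (_, _, rate) in zip(counts, INSTANCES))
        if throughput == 0:
            continue
        makespan = n_tasks / throughput
        if makespan > deadline_h:
            continue  # deadline constraint violated
        # Simplification: instances are billed for the exact (fractional) runtime.
        cost = sum(c * price * makespan for c, (_, price, _) in zip(counts, INSTANCES))
        if best is None or cost < best[0]:
            best = (cost, counts, makespan)
    return best

cost, counts, makespan = cheapest_plan(n_tasks=2000, deadline_h=10)
print(f"cost ${cost:.2f}, instance counts {counts}, makespan {makespan:.1f} h")
```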
Workflow management systems in eScience
“are key technology to integrate computing and data analysis components, and to
control the execution and logical sequences among them. By hiding the complexity
in an underlying infrastructure, SWMSs allow scientists to design complex scientific
experiments, access geographically distributed data files, and execute the
experiments using computing resources at multiple organizations.“
Report of the NSF/Mellon Workshop on Scientific and Scholarly Workflow. Oct 3-5,
2007, Baltimore, MD
Auto-scaling workflows
• Automatic scaling of workflow components based on
  – resource load
  – application load
  – provenance data
• Scaling across various infrastructures
  – desktops
  – grids
  – clouds
R. Cushing, S. Koulouzis, A. S. Z. Belloum, M. Bubak: Dynamic Handling for Cooperating Scientific Web Services, 7th IEEE International
Conference on e-Science, December 2011, Stockholm, Sweden
Auto-scaling workflows
(Plot: service load and the number of running service instances over time.)
R. Cushing, S. Koulouzis, A. S. Z. Belloum, M. Bubak: Dynamic Handling for Cooperating Scientific Web Services, 7th IEEE International
Conference on e-Science, December 2011, Stockholm, Sweden
Auto-scaling workflows
R. Cushing, S. Koulouzis, A. S. Z. Belloum, M. Bubak: Prediction-based Auto-scaling of Scientific Workflows, Proceedings of the 9th
International Workshop on Middleware for Grids, Clouds and e-Science, ACM/IFIP/USENIX December 12th, 2011, Lisbon, Portugal
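A minimal sketch of the scaling decision (illustrative thresholds, not the prediction model from the cited papers): compare the observed application load against the capacity of the running service instances and grow or shrink the instance pool accordingly.

```python
def scale(queued_tasks, tasks_per_instance_per_min=10, target_wait_min=5, min_instances=1):
    """Return the number of service instances needed for the observed load."""
    # Capacity required to drain the current queue within the target wait time
    # (ceiling division written as -(-a // b)).
    needed = -(-queued_tasks // (tasks_per_instance_per_min * target_wait_min))
    return max(min_instances, needed)

# Example: the instance count follows the load up and back down.
for load in [40, 400, 1200, 300, 20]:
    print(f"queued={load:5d} -> instances={scale(load)}")
```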
Workflow as a Service
• Once a workflow is initiated on the resources, it stays alive and processes data/jobs continuously
• Reduces the scheduling overhead
R. Cushing, Adam S. Z. Belloum, V. Korkhov, D. Vasyunin, M.T. Bubak, C. Leguy: Workflow as a Service: An Approach to Workflow Farming,
ECMLS’12, June 18, 2012, Delft, The Netherlands
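A minimal sketch of the farming idea (hypothetical in-memory queue standing in for the real task queue): the workflow is submitted once and then keeps pulling data items, so the grid scheduling overhead is paid only once rather than per dataset.

```python
import queue
import time

work = queue.Queue()   # stand-in for the workflow farming task queue (assumption)
for item in ["ds-001", "ds-002", "ds-003"]:
    work.put(item)

def workflow_instance():
    """Submitted once; stays alive and processes data items continuously."""
    while True:
        try:
            item = work.get(timeout=1)   # poll for the next data set
        except queue.Empty:
            break                        # a real service would keep waiting
        print("processing", item)
        time.sleep(0.01)                 # placeholder for the actual workflow run
        work.task_done()

workflow_instance()
```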
Provenance in Practice: Blast Application
[Department of Clinical
Epidemiology, Biostatistics and
Bioinformatics (KEBB), AMC ]
The aim of the application is the alignment of DNA
sequence data with a given reference database.
A workflow approach is used to run this application on
distributed computing resources.
For each workflow run:
• the provenance data is collected and stored following the XML-tracing system
• the user interface allows events that occurred at runtime to be reproduced (replay mode)
• the user interface can be customized (the user can select the events to track)
• the user interface shows resource usage
Ongoing work: UvA, AMC and FH Aachen.
Semantic workflow composition
• GworkflowDL language (with A.
Hoheisel)
• Dynamic, ad-hoc refinement of
workflows based on semantic
description in ontologies
• Novelty
– Abstract, functional blocks translated
automatically into computation unit
candidates (services)
– Expansion of a single block into a
subworkflow with proper concurrency
and parallelism constructs (based on
Petri Nets)
– Runtime refinement: unknown or failed
branches are re-constructed with
different computation unit candidates
T. Gubala, D. Harezlak, M. Bubak, M. Malawski: Semantic Composition of Scientific Workflows Based on the Petri Nets Formalism. In: "The
2nd IEEE International Conference on e-Science and Grid Computing", IEEE Computer Society Press,
http://doi.ieeecomputersociety.org/10.1109/E-SCIENCE.2006.127, 2006
Semantic integration for science domains
• Concept of describing scientific domains for in-silico experimentation and collaboration within laboratories
• Based on separation of the domain model, containing concepts of the subject of experimentation, from the integration model, regarding the method of (virtual) experimentation (tools, processes, computations)
• Facets defined in the integration model are automatically mixed into concepts from the domain model: any piece of data may show any desired behavior
• The method was proposed, designed and deployed for several domains of science:
  – Computational chemistry inside the InSilicoLab chemistry portal
  – Sensor processing for early warning and crisis simulation in the UrbanFlood EWS
  – Processing of results of massive bioinformatics computations for protein folding method comparison
  – Composition and execution of multiscale simulations
  – Setup and management of VPH applications
T. Gubala, K. Prymula, P. Nowakowski, M. Bubak: Semantic Integration for Model-based Life Science Applications. In: SIMULTECH 2013
Proceedings of the 3rd International Conference on Simulation and Modeling Methodologies, Technologies and Applications, Reykjavik, Iceland
29 - 31 July, 2013, pp. 74-81
Cooperative virtual laboratory for e-Science
• Design of a laboratory for virologists, epidemiologists and clinicians
investigating the HIV virus and the possibilities of treating HIV-positive
patients
• Based on notion of in-silico experiments built and refined by cooperating
teams of programmers, scientists and clinicians
• Novelty
  – Employed the full concept-prototype-refinement-production cycle for virology tools
  – A set of dedicated yet interoperable tools binds programmers and scientists together for a single task
  – Support for system-level science with the concept of result reuse between different experiments
T. Gubala, M. Bubak, P. M. A. Sloot: Semantic Integration of Collaborative Research Environments, chapter XXVI in “Handbook of Research
on Computational Grid Technologies for Life Sciences, Biomedicine and Healthcare”, Information Science Reference IGI Global 2009, ISBN:
978-1-60566-374-6, pages 514-530
GridSpace - platform for e-Science applications
• Experiment: an e-science application composed of code fragments (snippets), expressed in general-purpose scripting languages, domain-specific languages or purpose-specific notations. Each snippet is evaluated by a corresponding interpreter.
• GridSpace2 Experiment Workbench: a web application serving as the entry point to GridSpace2. It facilitates exploratory development, execution and management of e-science experiments.
• Embedded Experiment: a published experiment embedded in a web site.
• GridSpace2 Core: a Java library providing an API for development, storage, management and execution of experiments. It records all available interpreters and their installations on the underlying computational resources.
• Computational Resources: servers, clusters, grids, clouds and e-infrastructures where the experiments are computed.
E. Ciepiela, D. Harezlak, J. Kocot, T. Bartynski, M. Kasztelnik, P. Nowakowski, T. Gubała, M. Malawski, M. Bubak: Exploratory Programming
in the Virtual Laboratory. In: Proceedings of the International Multiconference on Computer Science and Information Technology, pp. 621-628,
October 2010, the best paper award.
Collage - executable e-Science publications
Goal:
Extending the traditional
scientific publishing model with
computational access and
interactivity mechanisms;
enabling readers (including
reviewers) to replicate and
verify experimentation results
and browse large-scale result
spaces.
Challenges:
Scientific: A common description schema for primary data (experimental data, algorithms, software,
workflows, scripts) as part of publications; deployment mechanisms for on-demand reenactment of
experiments in e-Science.
Technological: An integrated architecture for storing, annotating, publishing, referencing and reusing
primary data sources.
Organizational: Provisioning of executable paper services to a large community of users representing
various branches of computational science; fostering further uptake through involvement of major
players in the field of scientific publishing.
P. Nowakowski, E. Ciepiela, D. Harężlak, J. Kocot, M. Kasztelnik, T. Bartyński, J. Meizner, G. Dyk, M. Malawski: The Collage Authoring Environment. In: Proceedings of the International Conference on Computational Science, ICCS 2011 (2011), winner of the Elsevier/ICCS Executable Paper Grand Challenge
E. Ciepiela, D. Harężlak, M. Kasztelnik, J. Meizner, G. Dyk, P. Nowakowski, M. Bubak: The Collage Authoring Environment: From Proof-of-Concept Prototype to Pilot Service. In: Procedia Computer Science, vol. 18, 2013
GridSpace2 / Collage - Executable
e-Science Publications
• Goal: extend the traditional way of authoring and publishing scientific methods with computational access and interactivity mechanisms, thus bringing reproducibility to scientific computational workflows and publications
• Scientific challenge: conceive a model and methodology to embrace reproducibility in scientific workflows and publications
• Technological challenge: support these with modern Internet technologies and available computing infrastructures
• Solution proposed:
  – GridSpace2: a web-oriented distributed computing platform
  – Collage: an authoring environment for executable publications
(Timeline milestones: Jun 2011, Dec 2011, Jun 2012)
GridSpace2 / Collage - Executable e-Science
Publications
Results:
• GridSpace2/Collage won the Executable Paper Grand Challenge in 2011
• Collage was integrated with the Elsevier ScienceDirect portal, so papers can be linked and presented with corresponding computational experiments
• A special issue of the Computers & Graphics journal featuring Collage-based executable papers was released in May 2013
• GridSpace2/Collage has been applied to multiple computational workflows in the scope of the PL-Grid, PL-Grid Plus and Mapper projects
E. Ciepiela, D. Harężlak, M. Kasztelnik, J. Meizner, G. Dyk, P. Nowakowski, M. Bubak: The Collage Authoring Environment: From Proof-of-Concept Prototype to Pilot Service. In: Procedia Computer Science, vol. 18, 2013
E. Ciepiela, P. Nowakowski, J. Kocot, D. Harężlak, T. Gubała, J. Meizner, M. Kasztelnik, T. Bartyński, M. Malawski, M. Bubak: Managing
entire lifecycles of e-science applications in the GridSpace2 virtual laboratory–from motivation through idea to operable web-accessible
environment built on top of PL-grid e-infrastructure. In: Building a National Distributed e-Infrastructure–PL-Grid, 2012
P. Nowakowski, E. Ciepiela, D. Harężlak, J. Kocot, M. Kasztelnik, T. Bartyński, J. Meizner, G. Dyk, M. Malawski: The Collage Authoring
Environment. In: Procedia Computer Science, vol. 4, 2011
Cookery – framework for building DSLs
• Workflows based on graph representations are widely used to develop scientific applications. However, they suffer from certain issues: they are not easy to share, to track changes in, or to test.
• Applications developed using general-purpose programming languages do not share these issues: a wide range of software development tools exists for code sharing and change tracking (version control, code reviews).
• We propose a solution based on the Ruby programming language that combines advantages from both worlds: it is no more complex for the end user than solutions based on graphical representations, and it opens up the wide range of software development tools.
• Applications can be written in a DSL that is close to English:
Read file /tmp/test_data.gzip.
Count words.
Print result.
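Cookery itself is built on Ruby; purely to illustrate how such an English-like DSL can map sentences onto actions, here is a tiny hypothetical interpreter sketch in Python (the step names are made up, and the input file path is the example from the slide).

```python
import gzip

PROGRAM = """
Read file /tmp/test_data.gzip.
Count words.
Print result.
"""

def read_file(state, path):
    # gzipped or plain text input (the example path comes from the slide)
    opener = gzip.open if path.endswith((".gzip", ".gz")) else open
    with opener(path, "rt") as f:
        state["data"] = f.read()

def count_words(state):
    state["result"] = len(state["data"].split())

def print_result(state):
    print(state["result"])

# Each English sentence is dispatched to a registered step.
STEPS = {"read file": read_file, "count words": count_words, "print result": print_result}

def run(program):
    state = {}
    for line in program.strip().splitlines():
        sentence = line.strip().rstrip(".")
        for name, step in STEPS.items():
            if sentence.lower().startswith(name):
                args = sentence[len(name):].split()
                step(state, *args)
                break
    return state

run(PROGRAM)   # prints the word count of /tmp/test_data.gzip (the file must exist)
```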
Transforming scripts into workflows
• Scientific workflows are considered to be a convenient high-level alternative to solutions based on programming languages
• We investigate the GridSpace collaborative and execution environment, based on the Ruby language, which enables access to the grid infrastructure through APIs
• We describe how to address the issues of analysing Ruby source code to build workflow representations
a = GObj.create
b = a.async_do_sth("")
c = b.get_result
d = a.async_do_sth(c)
e = d.get_result
M. Baranowski, A. Belloum, M. Bubak and M. Malawski: Constructing workflows from script applications, Scientific Programming, 2012,
doi:10.3233/SPR-120358
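A rough sketch of the underlying idea (not the actual GridSpace analyser, which works on Ruby; here a simplified function-call variant in Python): walk the abstract syntax tree of an assignment-style script and record which earlier results each call consumes, yielding the data-dependency edges of a workflow graph.

```python
import ast

SCRIPT = """
a = create()
b = async_do_sth(a, "")
c = get_result(b)
d = async_do_sth(a, c)
e = get_result(d)
"""

def dependencies(source):
    edges = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Call):
            target = node.targets[0].id
            # every variable used as a call argument is a data dependency
            for arg in node.value.args:
                for name in ast.walk(arg):
                    if isinstance(name, ast.Name):
                        edges.append((name.id, target))
    return edges

print(dependencies(SCRIPT))
# [('a', 'b'), ('b', 'c'), ('a', 'd'), ('c', 'd'), ('d', 'e')]
```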
HyperFlow: model & execution engine
• Simple yet expressive model for complex scientific apps
• App = set of processes performing well-defined functions and exchanging signals
• Supports a rich set of workflow patterns
• Suitable for various application classes

HyperFlow model JSON serialization:
{
  "name":      "...",      ← name of the app
  "processes": [ ... ],    ← processes of the app
  "functions": [ ... ],    ← functions used by processes
  "signals":   [ ... ],    ← exchanged signals info
  "ins":       [ ... ],    ← inputs of the app
  "outs":      [ ... ]     ← outputs of the app
}
• Abstracts from other distributed app aspects (service model,
data exchange model, communication protocols, etc.)
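A minimal, hypothetical sketch of the process-and-signal idea (not the HyperFlow engine itself, which is a Node.js library): a process fires its function as soon as all of its input signals have arrived, and the result is published as a new signal.

```python
class App:
    """Toy signal-driven model: a process fires once all of its input signals exist."""
    def __init__(self):
        self.processes = []   # (name, function, ins, outs)
        self.signals = {}     # signal name -> value

    def add_process(self, name, function, ins, outs):
        self.processes.append((name, function, ins, outs))

    def emit(self, signal, value):
        self.signals[signal] = value
        for _name, function, ins, outs in self.processes:
            # fire when every input is available and the output was not produced yet
            if all(s in self.signals for s in ins) and outs[0] not in self.signals:
                result = function(*(self.signals[s] for s in ins))
                self.emit(outs[0], result)

app = App()
app.add_process("square", lambda x: x * x, ins=["in"], outs=["sq"])
app.add_process("add", lambda a, b: a + b, ins=["in", "sq"], outs=["out"])
app.emit("in", 3)
print(app.signals["out"])   # 12
```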
Scalable data access
• Storage federation
• In service orchestration, all data is
passed to the workflow engine
• Data transfers are made through SOAP,
which is unfit for large data transfers
S. Koulouzis, R. Cushing, K. Karasavvas, A. Belloum, M. Bubak: Enabling web services to consume and produce large distributed datasets, to be
published JAN/FEB, IEEE Internet Computing, 2012
Data reliability and integrity
DRI is a tool which keeps track of binary data stored in a cloud infrastructure, monitors data availability and facilitates optimal deployment of application services in a hybrid cloud (bringing computations to data or the other way around).
It is a standalone application service, capable of autonomous operation. It periodically verifies access to any datasets submitted for validation and is capable of issuing alerts to dataset owners and system administrators in case of irregularities.
(Architecture: the DRI Service runs a configurable, registry-driven validation runtime on top of a binary data registry and validation policy kept as metadata extensions in the VPH Master Interface; it exposes operations such as register files, get metadata, migrate LOBs and get usage stats; end-user features include browsing, querying, direct access to data and checksumming via a data management portlet with DRI management extensions; an extensible resource client layer connects LOBCDER and distributed cloud storage back-ends such as Amazon S3, OpenStack Swift and Cumulus.)
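A minimal sketch of the validation idea behind DRI (hypothetical registry structure and alerting, not the VPH-Share implementation): periodically recompute checksums of registered files and raise an alert when a replica is missing or corrupted.

```python
import hashlib
import os

# Hypothetical binary-data registry: logical name -> (path, expected SHA-256)
REGISTRY = {
    "dataset-001": ("/data/dataset-001.bin",
                    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"),
}

def sha256(path, chunk=1 << 20):
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()

def validate(registry):
    """One validation pass; a scheduler would repeat it according to the policy."""
    for name, (path, expected) in registry.items():
        if not os.path.exists(path):
            print(f"ALERT {name}: replica missing at {path}")
        elif sha256(path) != expected:
            print(f"ALERT {name}: checksum mismatch, data may be corrupted")
        else:
            print(f"OK {name}")

validate(REGISTRY)
```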
Data security in clouds
• To ensure security of data in transit, modern applications use secure transport protocols (e.g. TLS)
• For legacy unencrypted protocols, if absolutely needed, or as an additional security measure:
  – site-to-site VPN, e.g. between cloud sites, handled outside of the instance
  – remote access VPN for individual users connecting e.g. from their laptops
• Data should be stored securely and reliably deleted when no longer needed
• Clouds alone are not secure enough: storage optimisations prevent ensuring that data were actually deleted
• A solution:
  – end-to-end encryption (the decryption key stays in a protected/private zone)
  – data dispersal (portions of data dispersed between nodes so that it is non-trivial/impossible to recover the whole message)
Jan Meizner, Marian Bubak, Maciej Malawski, and Piotr Nowakowski: Secure storage and processing of confidential data on public clouds.
In: Proceedings of the International Conference On Parallel Processing and Applied Mathematics (PPAM) 2013, Springer LNCS
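A minimal sketch of the dispersal idea using a one-time-pad style split (standard library only; a production system would combine proper encryption, e.g. AES, with an information-dispersal or secret-sharing scheme): each storage node receives a share that on its own reveals nothing about the data.

```python
import functools
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def disperse(data: bytes, n_nodes: int):
    """Split data into n_nodes shares; every share is needed for reconstruction."""
    random_shares = [os.urandom(len(data)) for _ in range(n_nodes - 1)]
    last_share = functools.reduce(xor, random_shares, data)  # data XOR all random shares
    return random_shares + [last_share]

def recover(shares):
    return functools.reduce(xor, shares)

shares = disperse(b"confidential result", n_nodes=3)
assert recover(shares) == b"confidential result"
assert recover(shares[:2]) != b"confidential result"   # a subset looks like random noise
print("recovered:", recover(shares))
```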
Collaborative metadata management
Objectives
• Provide means for ad-hoc metadata model creation and deployment of corresponding storage facilities
• Create a research space for metadata model exchange and discovery, with associated data repositories and access restrictions in place
• Support different types of storage sites and data transfer protocols
• Support the exploratory paradigm by making the models evolve together with data
Architecture
• A Web Interface is used by users to create, extend and discover metadata models
• Model repositories are deployed in the PaaS Cloud layer for scalable and reliable access from computing nodes through REST interfaces
• Data items from Storage Sites are linked from the model repositories
MapReduce specific language
• We provide a domain-specific language for defining MapReduce operations
• It allows queries specified once to be executed on many MapReduce engines
• Applications can switch data sources more easily
• Applications can have separate environments for different stages of development (development, testing, production), leading to more robust code
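A toy sketch of the portability idea (hypothetical API, not the DSL described here): the query is specified once as plain map and reduce functions and then handed to whichever engine backs the current environment; a local stand-in engine is used below.

```python
from collections import defaultdict

# Query specified once: word count expressed as a (map, reduce) pair.
def map_words(_, line):
    for word in line.split():
        yield word, 1

def reduce_counts(word, counts):
    return word, sum(counts)

def local_engine(records, mapper, reducer):
    """Local stand-in for a MapReduce engine; a Hadoop/Spark back-end would replace it."""
    groups = defaultdict(list)
    for key, value in records:
        for k, v in mapper(key, value):
            groups[k].append(v)
    return dict(reducer(k, vs) for k, vs in groups.items())

data = [(0, "to be or not to be"), (1, "be quick")]
print(local_engine(data, map_words, reduce_counts))
# {'to': 2, 'be': 3, 'or': 1, 'not': 1, 'quick': 1}
```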
Separation of concerns
• Scientific applications are constructed from 3 types of components
• We strictly define their concerns
  – Tasks are where we define computations
  – Resources are where we define the resources used
  – Mapping is where we join Resources with Tasks
• We limit interactions by defining relations
  – Tasks use constructs determined by Resources (e.g. MapReduce constructs)
  – Mapping maps corresponding Tasks to Resources
Towards ecosystem of data and processes
Is it possible to create an ecosystem where scientific data and processes can be linked through semantics and used as an alternative to the current manual composition of eScience applications?
• How to implement adaptive scheduling needed for workflow enactment
across multiple domains?
• How to achieve QoS for data centric application workflows that have
special requirements on network connections?
• How to achieve robustness and fault tolerance for workflow running
across distributed resources?
• How to increase re-usability of workflows, workflow components, and
refine workflow execution?
Workflowless eScience
(from manually composed workflows, 2004-2012, towards 2013)
Self-organizing linked process ecosystem
• A network of open processes built from an RDF store describing SADI services
• Vertices are operations described in BioMoby semantics
• Edges show a semantic match between output and input
Computing on browsers
(Architecture: a master, exposed as a REST service behind a hosted website, enqueues parceled jobs; web browsers acting as slaves dequeue the jobs, execute them, and return the results, which the master collects as output.)
R. Cushing, G.a Putra, S. Koulouzis, A.S.Z Belloum, M.T. Bubak, C. de Laat: Distributed computing on an Ensemble of Browsers, IEEE
Internet Computing, PrePress 10.1109/MIC.2013.3, January 2013
Automata-based dynamic data processing
• Data processing schema can be
considered as a state
transformation graph
• The graph facilitates data
processing in many ways
– Data state can be easily tracked
– Using the graph as a protocol
header, a virtual data processing
network layer is achieved
– Data becomes self routable to
processing nodes
– Collaboration can be achieved by
joining the virtual network
State Graph describing a filtering state machine
for tweets which is mapped to 11 VMs
R.Cushing, A.Belloum, M.Bubak et al.: Automata-based Dynamic Data Processing for Clouds, BigDataClouds 2014
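A minimal sketch of the automata idea (hypothetical states and filter steps): the data item carries its position in the state graph, so any node can look up the next processing step and the data effectively routes itself.

```python
# State-transition graph: current state -> (processing step, next state)
GRAPH = {
    "raw":        (lambda t: t.lower(),                     "normalized"),
    "normalized": (lambda t: t if "flood" in t else None,   "filtered"),
    "filtered":   (lambda t: {"text": t, "alert": True},    "done"),
}

def route(item, state="raw"):
    """Each hop applies the step for the current state and forwards the item."""
    while state != "done" and item is not None:
        step, state = GRAPH[state]
        item = step(item)   # in the real system this runs on the node owning the state
    return item, state

print(route("Flood warning issued"))  # ({'text': 'flood warning issued', 'alert': True}, 'done')
print(route("sunny day"))             # (None, 'filtered') -- dropped by the filter step
```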
Building scientific software based on Feature Model
Research on Feature Modeling:
• modelling eScience applications family
component hierarchy
• modelling requirements
• methods of mapping Feature Models to
Software Product Line architectures
Research on adapting Software Product Line
principles in scientific software projects:
• automatic composition of distributed
eScience applications based on Feature
Model configuration
• architectural design of Software Product
Line engine framework
B. Wilk, M. Bubak, M. Kasztelnik: Software for eScience: from feature modeling to automatic setup of environments, Advances in Software Development, Scientific Papers of the Polish Information Processing Society Scientific Council, 2013, pp. 83-96
Common Information Space (CIS)
• Facilitates creation, deployment and robust operation of Early Warning Systems in a virtualized cloud environment
• Early Warning System (EWS): any system working according to four steps: monitoring, analysis, judgment, action (e.g. environmental monitoring)
• Common Information Space
  – connects distributed components into an EWS and deploys it on the cloud
  – optimizes resource usage taking into account the EWS importance level
  – provides EWS and self monitoring
  – is equipped with auto-healing
B. Balis, M. Kasztelnik, M. Bubak, T. Bartynski, T. Gubala, P. Nowakowski, J. Broekhuijsen: The UrbanFlood Common Information Space for
Early Warning Systems. In: Elsevier Procedia Computer Science, vol 4, pp 96-105, ICCS 2011.
Multiscale programming and execution tools
• MAPPER Memory (MaMe): a semantics-aware persistence store to record metadata about models and scales
• Multiscale Application Designer (MAD): a visual composition tool transforming high-level descriptions into executable experiments
• GridSpace Experiment Workbench (GridSpace): execution and result management of experiments
• A method and an environment for composing multiscale applications from single-scale models
• Validation of the method against real applications structured using the tools
• Extension of application composition techniques to multiscale simulations
• Support for multisite execution of multiscale simulations
• Proof-of-concept transformation of high-level formal descriptions into actual execution using e-infrastructures
(Figure: submodules and mappers are chosen, added or deleted in MaMe, composed in MAD and executed in GridSpace.)
K. Rycerz, E. Ciepiela, G. Dyk, D. Groen, T. Gubala, D. Harezlak, M. Pawlik, J. Suter, S. Zasada, P. Coveney, M. Bubak: Support for Multiscale
Simulations with Molecular Dynamics, Procedia Computer Science, Volume 18, 2013, pp. 1116-1125, ISSN 1877-0509
K. Rycerz, M. Bubak, E. Ciepiela, D. Harezlak, T. Gubala, J. Meizner, M. Pawlik, B.Wilk: Composing, Execution and Sharing of Multiscale
Applications, submitted to Future Generation Computer Systems, after 1st review (2013)
K. Rycerz, M. Bubak, E. Ciepiela, M. Pawlik, O. Hoenen, D. Harezlak, B. Wilk, T. Gubala, J. Meizner, and D. Coster: Enabling Multiscale Fusion
Simulations on Distributed Computing Resources, submitted to PLGrid PLUS book 2014
PL-Grid Project Results
• First working NGI in Europe in the framework of EGI.eu (since March 31, 2010)
• Number of users (March 2012): 900+
• Number of jobs per month: 750,000 - 1,500,000
• Resources available:
  – computing power: ca. 230 TFlops
  – storage: ca. 3600 TBytes
• High level of availability and reliability of the resources
• Facilitating effective use of these resources by providing:
  – innovative grid services and end-user tools like Efficient Resource Allocation, Experimental Workbench and Grid Middleware
  – Scientific Software Packages
  – user support: helpdesk system, broad training offer
• Various well-performed dissemination activities, carried out at national and international levels, which contributed significantly to increasing awareness and knowledge about the Project and grid technology in Poland
PLGrid Plus Project Results
• New domain-specific services for 13 identified scientific domains
• Extension of the resources available in the PL-Grid Infrastructure by ca. 500 TFlops of computing power and ca. 4.4 PBytes of storage capacity
• Design and start-up of support for new domain grids
• Deployment of a Quality of Service system for users by introducing SLA agreements
• Deployment of new infrastructure services
• Deployment of a Cloud infrastructure for users
• Broad consultancy, training and dissemination offer
Summary
• Modelling of complex collaborative scientific applications
– domain-oriented semantic descriptions of modules, patterns, and data to
automate composition of applications
• Studying the dynamics of distributed resources
– investigating temporal characteristics, dynamics, and performance variations to
run applications with a given quality
• Modelling and designing a software layer to access and orchestrate
distributed resources
– mechanisms for aggregating multi-format/multi-source data into a single
coherent schema
– semantic integration of compute/data resources
– data aware mechanisms for resource orchestration
– enabling reusability based on provenance data
Topics for collaboration
• Optimization of service
deployment on clouds
– Constraint satisfaction and
optimization of multiple
criteria (cost, performance)
– Static deployment planning
and dynamic auto-scaling
• Billing and accounting
model
– Adapted for the federated
cloud infrastructure
– Handle multiple billing
models
• Supporting system-level
(e)Science
– tools for effective scientific
research and collaboration
– advanced scientific analyses
using HPC/HTC resources
• Cloud security
– security of data transfer
– reliable storage and removal
of the data
• Cross-cloud service
deployment based on
container model
dice.cyfronet.pl
www.science.uva.nl/~gvlam/wsvlam