DEVS PhD Dissertation Awards
Hybrid Modeling and Simulation of Complex Data Networks
Matias Bonaventura
mbonaventura@dc.uba.ar
Supervisor: Prof. Rodrigo Castro
Computer Science Department and ICC-CONICET
School of Exact and Natural Sciences, University of Buenos Aires
Argentina
May 18th, 2020
Outline
1. Challenges, motivation, and research hypothesis
2. Background and theoretical framework
3. Network packet-level simulation
4. Network fluid-flow simulation
5. Network hybrid simulation
6. Conclusions and future work
1. Challenges, motivation, and research hypothesis
Large Hadron Collider (LHC) at CERN
● World's largest particle accelerator (27 km of superconducting magnets)
● Collides bunches of particles every 25 ns at an energy of 13 TeV
● Hosts 4 main detectors at the collision points
● The data center stores >30 PB of data per year
Data Network Simulation - Why?
Real case study: The ATLAS detector @ CERN
● Detector: 40 MHz proton collisions generate ~64 TB/s
● Level 1: hardware filters down to ~160 GB/s
● High Level Trigger (HLT): software filters down to ~1.6 GB/s
● Simulation supports: network design, sizing, fine-tuning, predicting changes, etc.
(Figure: the TDAQ filtering farm, with ~2000 multicore servers and 1-10 Gbps links.)
Network Simulation Approaches
● Packet-level simulation: individual packet-by-packet
  ○ Discrete-event (continuous-time) simulations
  ○ Detailed results & vast literature and tools
  ○ Limitation: execution time is proportional to the number of packets → not adequate for large network simulations
● Fluid-flow simulation: averaged dynamics (ODEs), e.g. of the TCP congestion window
  ○ Discrete time (Euler, Runge-Kutta)
  ○ Better execution time & scarce literature
  ○ Limitation: ODE modeling is hard to adopt
● Hybrid simulation: combines the best of each world (detailed results & performance)
  ○ Limited literature available; ad-hoc solutions
  ○ Incompatibility between the time-handling schemes: discrete events (packets) vs. discrete time (fluid)
  ○ Different sets of tools and knowledge (Matlab, ODEs vs. protocols, topologies)
Main Research Hypothesis
"The packet-level, fluid-flow and hybrid models
can be represented within a unified M&S formalism (DEVS)
providing advantages of
modeling expressiveness and simulation performance"
9
2. Background and theoretical framework
DEVS Formalism and Hybrid Systems
● DEVS (Discrete EVent Systems specification), Bernard Zeigler, '76
● DEVS makes it possible to (see the minimal sketch below):
  ○ Exactly represent any discrete system
  ○ Approximate continuous systems with any desired accuracy
(Diagram: fluid-flow models (ODEs) can be simulated either with classic discrete-time methods (Euler, Runge-Kutta) or with DEVS-based QSS; combining DEVS fluid-flow simulation with packet-level network simulation yields the new hybrid network simulation.)
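As an illustration of the formalism, here is a minimal sketch (not the PowerDEVS API) of a classic atomic DEVS model M = <X, Y, S, δint, δext, λ, ta>, shown as a simple periodic packet generator driven by a toy event loop; all names are illustrative.

```python
# Minimal sketch of an atomic DEVS model (illustrative, not PowerDEVS):
# a periodic "packet" generator plus a tiny root simulator loop.

class PeriodicGenerator:
    """Atomic DEVS: emits one packet every `period` seconds."""
    def __init__(self, period):
        self.period = period
        self.sigma = period            # time until the next internal event

    def ta(self):                      # time-advance function
        return self.sigma

    def output(self):                  # output function (lambda)
        return "packet"

    def delta_int(self):               # internal transition
        self.sigma = self.period

    def delta_ext(self, e, x):         # external transition (unused here)
        self.sigma = max(self.sigma - e, 0.0)

def simulate(model, t_end):
    """Toy root coordinator: advance from event to event until t_end."""
    t = 0.0
    while t + model.ta() <= t_end:
        t += model.ta()
        print(f"t={t:.2f}s  output={model.output()}")
        model.delta_int()

simulate(PeriodicGenerator(period=0.5), t_end=2.0)
```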
Modeling and Simulation-Driven Methodology
● Iterative cycles and incremental phases
Solution Proposal
● Goal: "Prove the theoretical and practical feasibility of unifying the packet-level, fluid-flow, and hybrid approaches within a unifying formal framework"
  ○ Unifying formalism: DEVS
  ○ Unifying tool: PowerDEVS (DEVS+QSS, some existing network models)
  ○ Real use case: the TDAQ system at CERN
● Phases: (A) DEVS packet-level models, (B) fluid-flow approximations, (C) hybrid models
3. Packet-level simulation under the DEVS formalism
Packet-level Simulation Approach under DEVS
● DEVS-based iterative methodology: bottom-up, emergent behaviour, OSI layers 3-7
  ○ Minimum complexity for the desired precision [Robinson]
● Formal implementation (DEVS) of TCP Reno as modular atomic DEVS models (a schematic sketch of the congestion-window logic follows below)
  ○ Validated against OMNeT++ and real TDAQ traffic (tcpdump)
(Diagram: TCP sender and receiver atomics exchanging req/pkt/ack messages, with sliding window, congestion control, and the TCP state machine (SS, CA, EB, FR states).)
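For orientation, a schematic sketch of TCP Reno's congestion-window reactions to ACKs, triple duplicate ACKs, and timeouts; this is textbook Reno behaviour, not the thesis' DEVS atomic, and all names are illustrative.

```python
# Illustrative TCP Reno congestion-window logic (schematic, not the
# DEVS atomic from the thesis). cwnd and ssthresh are in segments.

class RenoWindow:
    def __init__(self, ssthresh=64):
        self.cwnd = 1.0                   # congestion window
        self.ssthresh = ssthresh
        self.state = "SS"                 # SS = slow start, CA = congestion avoidance

    def on_ack(self):
        if self.state == "SS":
            self.cwnd += 1.0              # exponential growth: +1 per ACK
            if self.cwnd >= self.ssthresh:
                self.state = "CA"
        else:
            self.cwnd += 1.0 / self.cwnd  # ~ +1 segment per RTT

    def on_triple_dupack(self):           # fast retransmit / fast recovery
        self.ssthresh = max(self.cwnd / 2.0, 2.0)
        self.cwnd = self.ssthresh
        self.state = "CA"

    def on_timeout(self):                 # severe loss: back to slow start
        self.ssthresh = max(self.cwnd / 2.0, 2.0)
        self.cwnd = 1.0
        self.state = "SS"
```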
Modeling Complex Network Topologies
● Library with >30 new network models (DEVS: hierarchical & modular)
● PowerDEVS GUI: build and design network topologies graphically
● New modeling tools to ease the design of bigger, complex topologies (Laurito, Bonaventura, Pozo Astigarraga, Castro, WSC 2017):
  ○ TopoGen: automatic SDN topology generation
  ○ Py2PDEVS: Python binding to define PowerDEVS models programmatically (see the sketch below)
(Figures: the low-level model library and a PowerDEVS graphical topology.)
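A hypothetical Py2PDEVS-style script showing the intent of programmatic topology definition; the real Py2PDEVS API almost certainly uses different names, so every call below (coupled, add, connect) is an illustrative assumption.

```python
# Hypothetical Py2PDEVS-style usage (API names are assumptions, not the
# real binding): build a dumbbell topology programmatically instead of
# drawing it in the PowerDEVS GUI.

def build_dumbbell(pd, n_hosts, bottleneck_bps):
    """Two switches joined by a bottleneck link, with n_hosts sender/receiver
    pairs around it."""
    top = pd.coupled("dumbbell")
    sw1 = top.add("Switch", "sw1")
    sw2 = top.add("Switch", "sw2")
    top.connect(sw1, sw2, bandwidth=bottleneck_bps)          # bottleneck link
    for i in range(n_hosts):
        tx = top.add("TcpSender", f"tx{i}")
        rx = top.add("TcpReceiver", f"rx{i}")
        top.connect(tx, sw1, bandwidth=10 * bottleneck_bps)  # access links
        top.connect(sw2, rx, bandwidth=10 * bottleneck_bps)
    return top
```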
Modeling TDAQ Applications
TDAQ/HLT model in PowerDEVS (Bonaventura, Foguelman, Castro, CiSE 2016)
● 100 kHz of event IDs
● ~2000 servers
● ~50,000 applications
● ~50 Gb/s of data
● ~50 switches, 2 routers
● ~450 links (1-10 Gbps)
(Figure: the TDAQ network topology, with racks of 100-200 servers, ~50 K applications in total, and ~50 Gbps of aggregate traffic.)
Empirical Validation with the Real TDAQ System
● Emergent dynamics and their impact on the event filtering time:
  1. Reproduce the "TCP Incast" pathology present in TDAQ
  2. Reproduce event filtering times for topology changes
  3. Fine-tune traffic control applications
  4. Predictive load-balancing simulation (Bonaventura, Jonckheere, Castro, WSC 2018)
     i. Initial study based on queueing theory
     ii. Evaluation in the model predicts improvements
     iii. Implementation in the real system: improvements confirmed
(Figure: the "FFFA" load-balancing policy.)
Hypothesis on the Model and Empirical Results
● Experimental framework: 9 racks (267 DCMs, 6408 PUs), 1.7 MB events (trigger limited by network capacity)
● The simulation accurately reproduces the real system metrics (RMSE = 3.24 and RMSE = 15.97 for the measured curves)
● The new policy does reduce latency under controlled conditions: average latency reduction of ~25% (10.84 ms)
● Downside: the real load balancer could not cope with the 100 kHz rate when using the newly proposed algorithm
Packet-level Simulation Conclusions
Packet-level main contributions
● New library of formal (DEVS) network models
● New tools for automatic and programmatic topologies (large scale)
● Simulations: acceptable precision (network and application metrics)
  ○ Validation: against other simulators, analytical models, and real hardware
  ○ Successful in the design and fine-tuning of the real-world network at CERN
● Execution times grow linearly with the number of transmitted packets
  ○ Full TDAQ system (50 racks): 60 s simulated → 1 day and 9 hours of execution
  ○ By 2026, TDAQ will grow ~50 times in system size
  ○ Parallel simulation offers gains only when inter-subnetwork traffic is light
4. Fluid-flow simulation
Fluid-flow Simulation Approach
Same formalism (DEVS) + same tool (PowerDEVS) as in packet-level simulation
● Approach: decouple the 3 areas of knowledge required for fluid approximations
  1. Simulation of ODEs with DEVS: numerical solutions with QSS
  2. Mathematical model: based on the set of equations proposed by MGT[*]
  3. Network modeling: topological description similar to packet-level
[*] V. Misra, W.-B. Gong, and D. Towsley. "Fluid-based Analysis of a Network of AQM Routers Supporting TCP Flows with an Application to RED". ACM/SIGCOMM, 2000.
Numerical Integration with DEVS: Quantized State System[*]
● QSS was originally proposed by Zeigler '98 and later formalized by Kofman '01
● Basic idea in QSS: quantize the state variables while preserving the continuous time domain
● Given an ODE system $\dot{x}(t) = f(x(t), t)$, QSS quantizes the state variables: each $x_i(t)$ is replaced by a quantized version $q_i(t)$ that changes only when $|x_i(t) - q_i(t)|$ reaches the quantum $\Delta Q_i$
● The result is a discrete-event system with asynchronous state updates (see the sketch below)
● PowerDEVS provides a set of QSS methods (QSS1-3, LIQSS, etc.)
(Plot: quantized trajectory $q_i(t)$ following the continuous state $x_i(t)$ over time, with quantum $\Delta Q_i$.)
[*] Kofman, E., and S. Junco. 2001. "Quantized-State Systems: A DEVS Approach for Continuous System Simulation". Simulation.
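The event-driven nature of QSS can be seen in a few lines. Below is a minimal sketch of first-order QSS (QSS1) for a scalar ODE $\dot{x} = f(x)$ with a uniform quantum; the hysteresis and the multi-variable bookkeeping of the full method are omitted, so this is only a didactic approximation.

```python
# Minimal QSS1 sketch for a scalar ODE dx/dt = f(x): the state advances
# piecewise-linearly, and the quantized state q changes only when x has
# drifted a full quantum dq away from it, i.e. steps are events in
# continuous time rather than fixed time increments.

def qss1_scalar(f, x0, dq, t_end):
    t, x, q = 0.0, x0, x0
    trajectory = [(t, q)]
    while t < t_end:
        slope = f(q)                  # derivative held until the next event
        if slope == 0.0:
            break                     # equilibrium: next event at t = infinity
        dt = dq / abs(slope)          # time for |x - q| to reach dq
        t, x = t + dt, x + slope * dt
        q = x                         # re-quantize the state
        trajectory.append((t, q))
    return trajectory

# Example: dx/dt = -x, x(0) = 1, quantum 0.1
for t, q in qss1_scalar(lambda x: -x, x0=1.0, dq=0.1, t_end=3.0):
    print(f"t = {t:.3f}   q = {q:.2f}")
```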
Mathematical Model: Fluid Queue+Server Subsystem
● Based on the set of equations proposed by Towsley '01-'06
● Fluid finite buffer: intuitive analogy with a water reservoir of finite capacity (dynamics sketched below)
(Figure: fluid buffer with arrival rates $a_i(t)$, queue level $q(t) \le Q_{max}$, service capacity $C$, and departure rates $d_i(t)$.)
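In this notation, the buffer dynamics take the standard fluid-queue form (shown here up to the exact formulation used in the thesis): the queue grows with the aggregate arrival rate, drains at capacity $C$, and saturates at $0$ and at $Q_{\max}$.

```latex
\frac{dq(t)}{dt} =
\begin{cases}
\sum_i a_i(t) - C,                                  & 0 < q(t) < Q_{\max},\\[4pt]
\max\!\bigl(\textstyle\sum_i a_i(t) - C,\; 0\bigr), & q(t) = 0,\\[4pt]
\min\!\bigl(\textstyle\sum_i a_i(t) - C,\; 0\bigr), & q(t) = Q_{\max}.
\end{cases}
```

When the buffer is full, the positive excess $\bigl(\sum_i a_i(t) - C\bigr)^{+}$ is the fluid discard rate; the capacity $C$ is shared among the departure rates $d_i(t)$, typically in proportion to the arrivals (the thesis' exact sharing rule may differ).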
Mathematical Model: Fluid Queue+Server Subsystem
● Challenges for the numerical solution of the differential system:
  ○ Asynchronous discontinuities
  ○ Delayed dynamics → in QSS: DQSS (Castro et al., 2011)
  ○ Implicit expressions → new QSS extension: FDQSS (convergence theorem in this thesis)
Modeling of Fluid-Flow Networks: Modular Topology
(Figures: modular PowerDEVS topologies for the fluid TCP model and for the fluid queue+server model.)
Fluid-Flow Networks: Experimental Results
● Experiment:
  ○ Random Early Discard (RED) buffers (drop rule sketched below)
  ○ Multiple ON/OFF competing TCP sessions
  ○ Packet-level: stochastic packet sizes and generation times
(Figures: the fluid-flow model and the equivalent packet-level model topologies.)
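For reference, a minimal sketch of the classic RED drop-probability rule used by such AQM buffers; the thresholds and maximum probability below are illustrative values, not the ones used in the thesis experiments.

```python
# Classic RED drop probability as a function of the (averaged) queue
# length: no drops below min_th, a linear ramp up to max_p at max_th,
# and certain drop above max_th.  Parameter values are illustrative.

def red_drop_probability(avg_q, min_th, max_th, max_p):
    if avg_q < min_th:
        return 0.0
    if avg_q >= max_th:
        return 1.0
    return max_p * (avg_q - min_th) / (max_th - min_th)

print(red_drop_probability(avg_q=75, min_th=50, max_th=100, max_p=0.1))  # 0.05
```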
Fluid-Flow Networks: Experimental Results (Accuracy)
1. The fluid-flow approximation captures the averaged behaviour
2. Same dynamic profile: congestion window, resource sharing
(Plots: congestion-window and sharing dynamics across phases with 1 host, then 2 hosts, then 1 host on the bottleneck.)
Fluid-Flow Networks: Experimental Results (Performance)
● Scalability comparison against packet-level simulations
  ○ Increasing (by a factor K): bandwidth, buffer size, RED parameters, #TCP sessions
● Scaling link speed
  ○ Packet-level: ~linear in #packets
  ○ Fluid-flow: ~constant
● Speedup
  ○ e.g. 1 Gbps links => speedup of ~x200
(Plot: execution time of the packet-level model vs. the fluid-flow model as link speed grows.)
Fluid-Flow Simulation Conclusions
Fluid-flow contributions (Bonaventura, Castro, WSC 2018)
Same formalism (DEVS) + same tool (PowerDEVS) as in packet-level simulation
● Simulation:
  ○ Acceptable approximation when compared with packet-level simulation
  ○ Performance advantages for large networks (independence of link speed)
  ○ New numerical method (FDQSS)
● Modeling:
  1. New fluid network model libraries (DEVS): generic, reusable, and modular
  2. Network modeling: topological description similar to packet-level (same tools)
  3. Topology design: no knowledge of ODEs or numerical solvers required
5. Hybrid simulation
Hybrid Simulation: Packet → Fluid
● Combine packet and fluid simulations affecting each other
● Idea: augment each packet arriving at a link with a fluid signal (ON: sending, OFF: passive)
● DEVS algorithm, for each discrete packet (see the sketch below):
  ○ Generate a signal with value C (the link speed) together with the packet
  ○ After time t = packet.size/C: generate a signal with value 0
● Optional: smoothing by taking an averaged window
(Figure: packets on the link and the resulting ON/OFF rate signal of amplitude C.)
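A compact sketch of this packet → fluid hybridization (illustrative, not the exact PowerDEVS atomic): each packet contributes an ON event of value C at its arrival and an OFF event of value 0 after its transmission time size/C; overlapping packets and the optional averaging window are not handled here.

```python
# Packet -> fluid "ON/OFF hybridization" sketch: turn discrete packets
# into a piecewise-constant rate signal of amplitude C (link speed).

def hybridize(packets, C):
    """packets: list of (arrival_time_s, size_bits); C: link speed in bit/s.
    Returns the rate signal as a sorted list of (time, rate) events."""
    events = []
    for t_arrival, size_bits in packets:
        events.append((t_arrival, C))                    # ON: link transmitting
        events.append((t_arrival + size_bits / C, 0.0))  # OFF after size/C seconds
    return sorted(events)

# Two 12 kbit packets on a 1 Mbps link
print(hybridize([(0.0, 12_000), (0.5, 12_000)], C=1e6))
# [(0.0, 1000000.0), (0.012, 0.0), (0.5, 1000000.0), (0.512, 0.0)]
```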
Hybrid Simulation: Hybrid Queue (Fluid ↔ Packet)
● Interaction between the models occurs at hybrid queues
  ○ Packet→Fluid: the augmented packet (ON/OFF signal) is fed into a fluid queue
  ○ Fluid→Packet: fluid queue metrics (delay and discards) affect each packet
    ■ The fluid signals yield a discard probability and a delay, applied to each packet (see the sketch below)
● QSS numerical method
  ○ Does not require any synchronization
● Dense fluid signals
  ○ Known ∀t
  ○ Bounded error < ΔQi
(Diagram: the hybrid queue+server, built from a fluid queue+server plus a Fluid→Packet stage between the packet and fluid submodels.)
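The following sketch shows how fluid metrics could be applied to individual packets at the hybrid queue. The exact formulas are given in the thesis; here the delay is assumed to be q(t)/C plus the transmission time, and the discard decision is drawn from a fluid drop probability p(t), both of which are assumptions for illustration.

```python
# Fluid -> packet coupling sketch at a hybrid queue (illustrative only;
# delay = q/C + size/C and the Bernoulli discard are assumptions, not
# necessarily the thesis' exact expressions).

import random

def forward_packet(pkt_size_bits, t_now, q_bits, drop_prob, C):
    """Return (delivery_time, dropped) for one packet crossing the hybrid
    queue, given the fluid queue level q_bits and drop probability."""
    if random.random() < drop_prob:
        return None, True                      # packet discarded
    delay = q_bits / C + pkt_size_bits / C     # queueing + transmission
    return t_now + delay, False

print(forward_packet(pkt_size_bits=12_000, t_now=1.0,
                     q_bits=50_000, drop_prob=0.01, C=1e6))
```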
Hybrid Simulation: Topologies
(Diagram: an example hybrid topology where a fluid-flow submodel and a packet-level submodel meet at a hybrid router, built from the packet ON/OFF hybridization and the hybrid queue+server components.)
Hybrid Simulation: Experimental Results
● Topology for experiments:
  ○ Nf fluid flows and Nc packet flows sharing the same bottleneck link
● Hybrid simulation scenarios:
  ○ Hybrid router with packet flows only (Nf = 0)
  ○ Background/foreground traffic: Nc = 1; Nf >> Nc
  ○ Adjusting precision vs. performance: Nc + Nf = 40
(Figure: dumbbell topology with the shared bottleneck link.)
Hybrid Simulation: Experimental Results
● Adjusting the trade-off between precision and performance
  ○ 40 TCP sessions in total: Npacket (precision) + Nfluid (performance) = 40
  ○ Precision increases with the packet/fluid ratio
  ○ Bottleneck = 100 Mbps; bandwidth of packet = fluid = 200 Mbps
(Plots: total Gbits sent in 60 s of simulated time, split into packet and fluid traffic, and execution time for 70 s of simulated time of the 100% packet-level model vs. the hybrid model with 10 ms smoothing, both as a function of the number of packet TCP sessions.)
Hybrid Simulation: Experimental Results
TCP sessions: 20 fluid + 20 packet (1 probe)
Conclusions
Selected contributions
● DEVS as a unifying framework for packet, fluid, and hybrid network models
● Modeling advantages: simplifies adoption (no ODE knowledge required) and the modeling process (unified tool)
● Hybrid models allow adjusting the trade-off between precision and performance
● Real-world application and validation in CERN networks

Scientific production
● 3 posters
● 4 full conference papers
● 1 journal article accepted (plus 1 in preparation)
Future Work and Open Problems
● Hybrid simulations in the TDAQ context
  ○ Design of the TDAQ architecture for Phase II (2026)
  ○ Mathematical models for trigger data flows (e.g. ROS -> DCM)
● Extend the packet and fluid models
  ○ Huge range of packet features/protocols; challenging: wireless networks, SDNs
  ○ Other (discontinuous) mathematical models that better exploit QSS features (e.g. different ODEs for the different TCP states)
● Extend the study of hybrid network models
  ○ Parameter sensitivity (e.g. ΔQrel, ΔQmin) and analysis of variance (e.g. jitter)
  ○ Different smoothing techniques (e.g. higher orders in QSS)
  ○ Study the "ripple effect" according to the network topology
Current last-minute application of the hybrid network framework:
Modeling COVID-19 spread on a "network" of urban conglomerates in Argentina
● Dynamic model: a Susceptible-Infected-Removed (SIR) model, represented as a network of ODEs solved with QSS in each urban cluster, plus individual persons moving among clusters (the textbook per-cluster form is shown below)
● Georeferenced SIR models; interaction graph defined with Python (Py2PDEVS)
(Plot: simulated epidemic curves over time in days.)
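For reference, the standard per-cluster SIR dynamics; the thesis model adds coupling terms for people moving among clusters, so this is only the textbook form, with infection rate $\beta$ and removal rate $\gamma$ (assumed notation).

```latex
\frac{dS}{dt} = -\beta \frac{S I}{N}, \qquad
\frac{dI}{dt} = \beta \frac{S I}{N} - \gamma I, \qquad
\frac{dR}{dt} = \gamma I, \qquad N = S + I + R.
```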
Thanks!
Questions?
Main references
- Bonaventura, M., D. Foguelman, and R. Castro. 2016. "Discrete Event Modeling and Simulation-Driven Engineering for the ATLAS Data Acquisition Network". Computing in Science & Engineering 18:70–83.
- Foguelman, D. J., M. Bonaventura, and R. D. Castro. 2016. "MASADA: A Modeling and Simulation Automated Data Analysis Framework for Continuous Data-Intensive Validation of Simulation Models".
- Laurito, A., M. Bonaventura, M. E. Pozo Astigarraga, and R. Castro. 2017. "TopoGen: A Network Topology Generation Architecture with Application to Automating Simulations of Software Defined Networks". In Proceedings of the 2017 Winter Simulation Conference, Volume 50, 1049–1060.
- Bonaventura, M., and R. Castro. 2018. "Fluid-Flow and Packet-Level Models of Data Networks Unified Under a Modular/Hierarchical Framework: Speedups and Simplicity, Combined". In Proceedings of the 2018 Winter Simulation Conference.
- Bonaventura, M., M. Jonckheere, and R. Castro. 2018. "Simulation Study of Dynamic Load Balancing for Processor Sharing Servers with Finite Capacity Under Generalized Halfin-Whitt Regimes". In Proceedings of the 2018 Winter Simulation Conference.
- Zeigler, B. P., A. Muzy, and E. Kofman. 2018. Theory of Modeling and Simulation, 3rd Edition: Discrete Event and Iterative System Computational Foundations. Elsevier.
- Wainer, G. A., and P. J. Mosterman. 2010. Discrete-Event Modeling and Simulation: Theory and Applications. CRC Press.
- Misra, V., W.-B. Gong, and D. Towsley. 2000. "Fluid-based Analysis of a Network of AQM Routers Supporting TCP Flows with an Application to RED". ACM/SIGCOMM.