Cross-Layer Scheduling in Cloud Computing Systems
Authors: Hilfi Alkaff, Indranil Gupta

Motivation
• Many cloud computing frameworks out there
  – Batch Processing Framework: Hadoop
  – Stream Processing Framework: Storm
• Current applications are not aware of the underlying network topology
  – Might schedule tasks on machines with low bandwidth

Challenges
• Need to expose the underlying network topology efficiently to applications
• Huge state space to search
  – Thousands of machines in a cluster
  – Users demand more interactive jobs
• Multiple possible data-path representations
  – Want to have generic schedulers

Data-Path: Map-Reduce

Data-Path: Stream

Proposed Solution
• Cross-Layer Scheduling Framework
  – First-level scheduler at the application level
  – Second-level scheduler at the routing level
• Use Simulated Annealing at each level
  – Probabilistic framework
  – Idea: If a neighboring state is better, always move there; if it is not, move there with probability P(T) that decreases with time

Proposed Architecture
[Figure labels: Application Master, SDN Controller; M1, M2, M3; R1, R2, R3]

Algorithm: Pre-computation
• Pre-compute all-pairs (M1, M2) k-shortest paths
  – Stored in Topology-Map hash-table with key=(M1, M2), value=array of k-shortest paths
• Too many duplicates
  – Intelligently merge similar sub-paths
  – Hash-table's value is now a tree instead of an array

Algorithm: Main

Algorithm: genState() Heuristic
• Too many neighboring states
  – Not possible to traverse all of them
• Application Level
  – Prefer node that has higher # of sink vertices
  – Prefer node that has higher # of source vertices
• Routing Level
  – Prefer paths that have lower number of hops
  – Prefer paths that have higher amount of available bandwidth

Emulab Result: Throughput

Simulation Result: Computation Time

Simulation Results: CDF

Questions?

Algorithm: Failures
• Link-Failures
  – Need to re-allocate flows using that link
  – Keep a separate hash-table where key=edge, value=flows
  – Get another path from Topology-Map
• Machine-failures
  – Re-run main algorithm on
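
Sketch: Simulated Annealing Step
The "Proposed Solution" slide states only the acceptance idea: always move to a better neighboring state, and move to a worse one with a probability P(T) that decreases with time. The Python sketch below is one possible reading of that idea, not the authors' algorithm. The exponential acceptance rule, the geometric cooling schedule, the task and machine names, the bandwidth values, the slot limit, and the helper names (anneal, cost, gen_state) are all assumptions made for illustration.

```python
import math
import random

# Hypothetical toy cluster: pairwise bandwidth between machines (arbitrary units).
BANDWIDTH = {("M1", "M2"): 10.0, ("M1", "M3"): 1.0, ("M2", "M3"): 5.0}
MACHINES = ["M1", "M2", "M3"]
EDGES = [("a", "b"), ("b", "c"), ("c", "d")]  # hypothetical task dataflow
SLOTS = 2                                     # assumed per-machine task limit
INITIAL = {"a": "M1", "b": "M3", "c": "M1", "d": "M3"}


def bandwidth(m1, m2):
    """Look up the (symmetric) bandwidth between two distinct machines."""
    key = (m1, m2) if (m1, m2) in BANDWIDTH else (m2, m1)
    return BANDWIDTH[key]


def cost(placement):
    """Sum of 1/bandwidth over dataflow edges that cross machines."""
    total = 0.0
    for src, dst in EDGES:
        m1, m2 = placement[src], placement[dst]
        if m1 != m2:
            total += 1.0 / bandwidth(m1, m2)
    return total


def gen_state(placement, rng):
    """Neighbor: move one random task to a random machine with a free slot."""
    task = rng.choice(sorted(placement))
    machine = rng.choice(MACHINES)
    used = sum(1 for t, m in placement.items() if m == machine and t != task)
    if used >= SLOTS:
        return placement  # move would exceed the slot limit; stay put
    neighbor = dict(placement)
    neighbor[task] = machine
    return neighbor


def anneal(initial, cost_fn, neighbor_fn, t0=1.0, cooling=0.95, steps=500, seed=0):
    """Accept better states always; accept worse ones with probability
    exp(-delta / T), where T shrinks geometrically (assumed schedule)."""
    rng = random.Random(seed)
    current, current_cost = initial, cost_fn(initial)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(steps):
        candidate = neighbor_fn(current, rng)
        candidate_cost = cost_fn(candidate)
        delta = candidate_cost - current_cost
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature = max(temperature * cooling, 1e-12)  # avoid dividing by zero
    return best, best_cost


if __name__ == "__main__":
    placement, value = anneal(INITIAL, cost, gen_state)
    print(placement, value)
```

In the deck's two-level design, a search of this kind would run once at the application level (task placement) and once at the routing level (path selection), with the genState() heuristics biasing which neighbor is proposed. Here the neighbor is chosen uniformly at random to keep the sketch short.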