DISTRIBUTED ALGORITHMS AND BIOLOGICAL SYSTEMS
Nancy Lynch, Saket Navlakha
BDA-2014, October 2014, Austin, Texas

Distributed Algorithms + Biological Systems
• Distributed algorithms researchers have been considering biological systems, looking for:
  • Biological problems and behaviors that they can model and study using distributed algorithms methods, and
  • Biological strategies that can be adapted for use in computer networks.
• Yields interesting distributed algorithms results.
• Q: But what can distributed algorithms contribute to the study of biological systems?
• This talk:
  • Overview fundamental ideas from the distributed algorithms research area, for biology researchers.
  • Consider how these might contribute to biology research.

What are distributed algorithms?
• Abstract models for systems consisting of many interacting components, working toward a common goal.
• Examples:
  • Wired or wireless network of computers, communicating or managing data.
  • Robot swarm, searching an unknown terrain, cleaning up, gathering resources, …
  • Social insect colony, foraging, feeding, finding new nests, resisting predators, …
• Components generally interact directly with nearby components only, using local communication.

Distributed algorithms research
• Models for distributed systems.
• Problems to be solved.
• Algorithms, analysis.
• Lower bounds, impossibility results.
• Problems: communication, consensus, data management, resource allocation, synchronization, failure detection, …
• Models:
  • Interacting automata.
  • Local communication: individual message-passing, local broadcast, or shared memory.
  • Metrics: time, amount of communication, local storage.
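The "interacting automata, synchronous rounds, local communication" model above can be made concrete with a tiny simulation harness. This is an illustrative sketch, not from the talk; the graph and the "propagate the maximum" update rule are made-up examples of a local rule.

```python
# Minimal synchronous message-passing simulator: each round, every process
# sends one message to each neighbor, then updates its state from the
# messages it received. The "max propagation" rule here is a made-up example.

def run_rounds(adj, states, rounds):
    """adj: dict node -> list of neighbors; states: dict node -> int."""
    for _ in range(rounds):
        # Every process computes its outgoing message from its current state.
        msgs = {v: states[v] for v in adj}
        # Every process receives from its neighbors and updates locally.
        states = {v: max([states[v]] + [msgs[u] for u in adj[v]]) for v in adj}
    return states

# A 4-node line network: after enough rounds, the maximum spreads everywhere.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
final = run_rounds(adj, {0: 5, 1: 1, 2: 9, 3: 2}, rounds=3)
print(final)  # every node learns the maximum, 9
```

Time here is measured in rounds and communication in messages per round, matching the cost metrics listed above.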
Distributed algorithms research
• Algorithms:
  • Some use simple rules, some use complex logical constructions.
  • Local communication.
  • Designed to minimize costs, according to the cost metrics.
  • Often designed to tolerate limited failures.
• Analyze correctness, costs, and fault-tolerance mathematically.
• Try to “optimize”, or at least obtain algorithms that perform “very well” according to the metrics.

Distributed algorithms research
• Lower bounds, impossibility results:
  • Theorems that say that you can’t solve a problem in a particular system model, or you can’t solve it with a certain cost.
  • Distributed computing theory includes hundreds of impossibility results.
  • Unlike theory for traditional sequential algorithms.
• Why?
  • Distributed models are hard to cope with.
  • Locality of knowledge and action imposes strong limitations.

Formal modeling and analysis
• Distributed algorithms can get complicated:
  • Many components act concurrently.
  • May have different speeds, failures.
  • Local knowledge and action.
• In order to reason about them carefully, we need clear, rigorous mathematical foundations.
• Impossibility results are meaningless without rigorous foundations.
• Need formal modeling frameworks.

Key ideas
• Model systems using interacting automata.
  • Not just finite-state automata, but more elaborate automata that may include complex states, timing information, probabilistic behavior, and both discrete and continuous state changes.
  • Timed I/O Automata, Probabilistic I/O Automata.
• Formal notions of composition and abstraction.
• Support rigorous correctness proofs and cost analysis.

Key ideas
• Distinguish among:
  • The problems to be solved,
  • The physical platforms on which the problems should be solved, and
  • The algorithms that solve the problems on the platforms.
• Define cost metrics, such as time, local storage space, and amount of communication, and use them to analyze and compare algorithms and prove lower bounds.
• Many kinds of algorithms.
• Many kinds of analysis.
• Many lower bound methods.
• Q: How can this help biology research?

Biology research
• Model a system of insects, or cells, using interacting automata.
• Define formally:
  • The problems the systems are solving (distinguishing cells, building structures, foraging, reaching consensus, task allocation, …),
  • The physical capabilities of the systems, and
  • The strategies (algorithms) that are used by the systems.
• Identify cost metrics (time, energy, …).
• Analyze and compare strategies.
• Prove lower bounds expressing inherent limitations.
• Use this to:
  • Predict system behavior.
  • Explain why a biological system has evolved to have the structure it has, and has evolved or learned to behave as it does.

Example 1: Leader election
• Ring of processes.
  • Computers, programs, robots, insects, cells, …
• Communicate with neighbors by sending “messages”, in synchronous rounds.
• Problem: Exactly one process should (eventually) announce that it is the leader.
• Motivation: A leader in a computer network or robot swarm could take charge of a computation, telling everyone else what to do. A leader ant could choose a new nest.

Leader election
• Problem: Exactly one process should announce that it is the leader.
• Suppose:
  • Processes start out identical.
  • Behavior of each process at each round is determined by its current state and incoming messages.
• Theorem: In this case, it’s impossible to elect a leader.
  • No distributed algorithm could possibly work.
• Proof: By contradiction. Suppose we have an algorithm that works, and reach a contradictory conclusion:
  • All processes start out identical.
  • At round 1, they all do the same thing (send the same messages, make the same changes to their “states”). So they are again identical.
  • Same for round 2, etc.
  • Since the algorithm solves the problem, some process must eventually announce that it is the leader. But then everyone does, a contradiction.

So what to do?
• Suppose the processes aren’t identical: they have unique ID numbers.
• Algorithm:
  • Send a message containing your ID clockwise.
  • When you receive an ID, compare it with your own ID.
  • If the incoming ID is:
    • Bigger, pass it on.
    • Smaller, discard it.
    • Equal, announce that you are the leader.
• Elects the process with the largest identifier.
• Takes O(n) communication rounds and O(n²b) bits of communication, where b bits are used to represent an identifier.

What if we don’t have IDs?
• Suppose processes are identical, with no IDs.
• Allow them to make random choices.
• Assume the processes know n, the total number of processes in the ring.
• Algorithm:
  • Toss an unfair coin, with probability 1/n of landing on heads.
  • If you toss heads, become a “leader candidate”.
  • It’s “pretty likely” that there is exactly one candidate.
  • The processes can verify this by passing messages around and seeing what they receive.
  • If they did not succeed, try again.

Example 2: Maximal Independent Set
• Assume a general graph network, with processes at the nodes.
• Problem: Select some of the processes, so that they form a Maximal Independent Set.
  • Independent: No two neighbors are both in the set.
  • Maximal: We can’t add any more nodes without violating independence.
• Motivation:
  • Communication networks: Selected processes can take charge of communication, convey information to all the other processes.
  • Developmental biology: Distinguish cells in the fruit fly’s nervous system to become “Sensory Organ Precursor” cells.

Maximal Independent Set
• Problem: Select some of the processes, so that they form a Maximal Independent Set.
  • Independent: No two neighbors are both in the set.
  • Maximal: We can’t add any more nodes without violating independence.

Distributed MIS problem
• Assume processes know a “good” upper bound on n.
• No IDs.
• Problem: Processes should cooperate in a distributed (synchronous, message-passing) algorithm to compute an MIS of the graph.
  • Processes in the MIS should output in, and the others should output out.
• Unsolvable by deterministic algorithms, in some graphs.
• Probabilistic algorithm:

Probabilistic MIS algorithm
• Algorithm idea:
  • Each process chooses a random ID from 1 to N.
    • N should be large enough so it’s likely that all IDs are distinct.
  • Neighbors exchange IDs.
  • If a process’s ID is greater than all its neighbors’ IDs, then the process declares itself in, and notifies its neighbors.
  • Anyone who hears that a neighbor is in declares itself out, and notifies its neighbors.
  • Processes reconstruct the remaining graph, omitting those who have already decided.
  • Continue with the reduced graph, until no nodes remain.

Example (figures omitted; the labels are the chosen IDs)
• All nodes start out identical; everyone chooses an ID (here 7, 8, 1, 10, 16, 11, 2, 9, 5, 13).
• Processes that chose 16 and 13 are in; processes that chose 11, 5, 2, and 10 are out.
• Undecided (gray) processes choose new IDs (4, 12, 18, 7).
• Processes that chose 12 and 18 are in; the process that chose 7 is out.
• The remaining undecided (gray) process chooses a new ID (12); it’s in.

Properties of the algorithm
• If it ever finishes, it produces a Maximal Independent Set.
• It eventually finishes (with probability 1).
• The expected number of rounds until it finishes is O(log n).
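The rounds described above can be sketched as a centralized simulation of the distributed process. This is an illustrative sketch: the graph, the ID range N, and the seed are made up, and the simulation plays all processes from one loop rather than truly in parallel.

```python
import random

def randomized_mis(adj, N=10**6, seed=0):
    """adj: dict node -> set of neighbors. Each round, every undecided
    node draws a random ID; local maxima join the MIS, their neighbors
    drop out, and the reduced graph continues."""
    rng = random.Random(seed)
    undecided = set(adj)
    mis = set()
    while undecided:
        ids = {v: rng.randrange(N) for v in undecided}
        # A node is "in" if its ID beats every undecided neighbor's ID.
        winners = {v for v in undecided
                   if all(ids[v] > ids[u] for u in adj[v] if u in undecided)}
        mis |= winners
        # Neighbors of winners are "out"; both leave the reduced graph.
        losers = {u for w in winners for u in adj[w]}
        undecided -= winners | losers
    return mis

# Path graph 0-1-2-3.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
mis = randomized_mis(adj)
print(sorted(mis))
```

Any output is a valid MIS of the path: no two chosen nodes are adjacent, and every unchosen node has a chosen neighbor.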
More examples
• Building spanning trees that minimize various network cost measures.
• Motivation:
  • Communication networks: Use the tree for sending messages from the leader to everyone else.
  • Slime molds: Build a system of tubes that can convey food efficiently from a source to all the mold cells.
• Building other network structures:
  • Routes
  • Clusters with leaders.
  • …

More examples
• Reaching consensus, in the presence of faulty components (stopping, Byzantine).
• Motivation:
  • Agree on aircraft altimeter readings.
  • Agree on processing of data transactions.
  • Ants: Agree on a new nest location.
• Lower bounds:
  • Byzantine-failure consensus requires at least 3f + 1 processes (and 2f + 1 connectivity) to tolerate f faulty processes.
  • Even stopping-failure consensus requires at least f + 1 rounds.
• Algorithms:
  • Multi-round exchange of values, looking for a dominant preference [Lamport, Pease, Shostak]; achieves optimal bounds.

Other examples
• Communication
• Resource allocation
• Task allocation
• Synchronization
• Data management
• Failure detection
• …

Recent Work: Dynamic Networks
• Most of distributed computing theory deals with fixed, wired networks.
• Now we study dynamic networks, which change while they are operating.
  • E.g., wireless networks.
  • Participants may join, leave, fail, and recover.
  • May move around (mobile systems).

Computing in Dynamic Graph Networks
• Network is a graph that changes arbitrarily from round to round (but it’s always connected).
• At each round, each process sends a message, which is received by all of its neighbors at that round.
• Q: What problems are solvable, and at what cost?
  • Global message broadcast
  • Determining the minimum input
  • Counting the total number of nodes
  • Consensus
  • Clock synchronization
  • …

Robot Coordination Algorithms
• A swarm of cooperating robots, engaged in:
  • Search and rescue
  • Exploration
• Robots communicate, learn about their environment, perform coordinated activities.
• We developed distributed algorithms to:
  • Keep the swarm connected for communication.
  • Achieve “flocking” behavior.
  • Map an unknown environment.
  • Determine global coordinates, working from local sensor readings.

Biological Systems as Distributed Algorithms
• Biological systems consist of many components, interacting with nearby components to achieve common goals.
  • Colonies of bacteria, bugs, birds, fish, …
  • Cells within a developing organism.
  • Brain networks.
• They are special kinds of distributed algorithms:
  • Use simple chemical “messages”.
  • Components have simple “state”, follow simple rules.
  • Flexible, robust, adaptive.

Problems
• Leader election: Ants choose a queen.
• Maximal Independent Set: In fruit fly development, some cells become sensory organs.
• Building communication structures:
  • Slime molds build tubes to connect to food.
  • Brain cells form circuits to represent memories.

More problems
• Consensus: Bees agree on the location of a new hive.
• Reliable local communication: Cells use chemical signals.
• Robot swarm coordination: Birds, fish, and bacteria travel in flocks / schools / colonies.

Biological Systems as Distributed Algorithms
• So, study biological systems as distributed algorithms.
  • Models, problem statements, algorithms, impossibility results.
• Goals:
  • Use distributed algorithms to understand biological system behavior.
  • Use biological systems to inspire new distributed algorithms.

Example 3: Ant (Agent) Exploration
• [Lenzen, Lynch, Newport, Radeva, PODC 2014]
• n ants explore a 2-dimensional grid for food, which is hidden at distance at most D from the nest.
• No communication.
• Ants can return to the nest (origin) at any time, for free.
• In terms of D and n, we get an upper bound on the expected time for some ant to find the food.
• Similar to [Feinerman, Korman].
• But now, assume bounds on:
  • The size of an ant’s memory.
  • The fineness of probabilities used in an ant’s random choices.

A(ge)nt Exploration
• Algorithm (for each ant):
  • Multi-phase algorithm.
  • In successive phases, search to successively greater distances.
  • Distances are determined by step-by-step random choices, using successively smaller probabilities at successive phases.
• Expected time for some ant to find the food is (roughly) O(D²/n + D).
• Works even in the “nonuniform” case, where the ants don’t have a good estimate of D.
• Analysis: Basic conditional probability methods, Chernoff bounds.

A(ge)nt Exploration
• Lower bound: Assume n is polynomially bounded in D. Then there is some placement of the food such that, with high probability, no ant finds it in time much less than D².
• Lower bound proof ideas:
  • Model the movement of an ant through its state space as a Markov chain.
  • Enhance this Markov chain with information about the ant’s movement in the grid.
  • Use properties of the enhanced Markov chain to analyze probabilities of reaching various locations in the grid within certain amounts of time.

Example 4: Ant Task Allocation
• [Cornejo, Dornhaus, Lynch, Nagpal 2014]
• n ants allocate themselves among a fixed, known set of tasks, such as foraging for food, feeding larvae, and cleaning the nest.
• No communication among ants.
• Ants obtain information from the environment (somehow) about tasks’ current energy requirements.
• Try to minimize the sum of squares of the energy deficits over all tasks.
• Special case we’ve studied so far:
  • All ants are identical.
  • Synchronous rounds of decision-making.
  • Simple binary information (deficit/surplus for each task).

Ant Task Allocation
• Algorithm (for each ant):
  • Probabilistic decisions, based on binary deficit/surplus information.
  • Five states: Resting, FirstReserve, SecondReserve, TempWorker, CoreWorker, plus “potentials” for all tasks.
  • Ants in Worker states work on tasks; others are idle.
  • TempWorkers are more biased towards becoming idle.
  • Reserve workers are biased towards the task they last worked on.
  • Detailed transition rules and task assignment rules.
• Theorem: Assume “sufficiently many” ants. In a period of length Θ(log n) with unchanging demand, whp, the ants converge to a fixed allocation that satisfies all demands.
• Proof: Analyzes how oscillations get dampened.
• Limited ant memory size…

Operating principles: low cost, efficiency, robustness, adaptivity; distributed network design; modularity; stochastic optimization.
Four examples:
• Slime mold foraging & MST construction [Tero et al., Science 2010]
• Fly brain development & MIS [Afek et al., Science 2011]
• E. coli foraging & consensus navigation [Shklarsh et al., PLoS Comput. Biol. 2013]
• Synaptic pruning & network design

SOP selection in fruit flies
• During nervous system development, some cells are selected as sensory organ precursors (SOPs).
• SOPs are later attached to the fly’s sensory bristles.
• Like MIS, each cell is either:
  • Selected as a SOP, or
  • Laterally inhibited by a neighboring SOP, so it cannot become a SOP.

Trans vs. cis inhibition (Notch-Delta signaling)
• Trans model: Delta from a cell inhibits Notch in its neighbors.
• Cis+Trans model: Recent findings suggest that Notch is also suppressed in cis, by Delta from the same cell.
• Only when a cell is “elected” does it communicate its decision to the other cells.
• [Miller et al., Current Biology 2009; Sprinzak et al., Nature 2010; Barad et al., Science Signaling 2010]

MIS vs. SOP
• Both stochastic:
  • Proven for MIS; experimentally validated for SOP.
• Compared to previous algorithms:
  • Unlike Luby’s, SOP cells do not know their number of neighbors (nor the network topology).
• Constrained by time:
  • An uninhibited cell eventually becomes a SOP.
• Reduced communication:
  • For SOP, messages are binary.
  • A node (cell) only sends messages if it joins the MIS.
• Can we improve MIS algorithms by understanding how the biological process is performed?
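As a preview of the fly-inspired algorithm described next, here is a hedged sketch of an MIS variant in that spirit: nodes send only 1-bit messages, do not know their neighbor counts, and raise their activation probability over phases. This is an illustrative simulation, not the exact published algorithm; the graph, seed, probability cap, and phase schedule are made-up choices.

```python
import math
import random

def binary_message_mis(adj, n, max_deg, seed=7):
    """Nodes flip coins; a node that 'beeps' with no colliding neighbor
    joins the MIS, and its neighbors drop out. The coin probability p
    doubles every ~log2(n) rounds, starting at 1/max_deg."""
    rng = random.Random(seed)
    undecided, mis = set(adj), set()
    p, phase_len, r = 1.0 / max_deg, max(1, math.ceil(math.log2(n))), 0
    while undecided:
        r += 1
        if r > phase_len:
            p, r = min(2 * p, 0.5), 0   # new phase: beep more aggressively
        beeps = {v for v in undecided if rng.random() < p}
        for v in beeps:
            if not (beeps & adj[v]):    # no colliding neighbor beeped
                mis.add(v)
                undecided.discard(v)
                undecided -= adj[v]     # inhibited neighbors exit as "out"
    return mis

# Triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
result = binary_message_mis(adj, n=4, max_deg=3)
```

Note that each round needs only one bit per node ("I beeped" / silence) plus the "I'm in" notification, mirroring the reduced-communication point above.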
Movie Simulations
• 2-by-6 grid of cells.
• Each cell touches all adjacent and diagonal neighbors.
• All models assume a cell becomes a SOP by accumulating the protein Delta until it passes some threshold.
• Four different models:
  1. Accumulation: accumulate Delta based on a Gaussian distribution.
  2. Fixed Accumulation: randomly select an accumulation rate only once.
  3. Rate Change: increase the accumulation probability as time goes by, using a feedback loop.
  4. Fixed Rate: fix an accumulation probability, and use the same probability in all rounds.
• Observation: compare the times of experimental and simulated selection.

New MIS Algorithm + Demo
MIS Algorithm (n, D)   // n: upper bound on the number of nodes; D: upper bound on the number of neighbors
• p = 1/D; round = 0
• In each round:
  • round = round + 1
  • If round > log(n): p = p * 2; round = 0   // start a new phase
  • Flip a coin with probability p of heads.
  • If the result is 0, do nothing.
  • If the result is 1, send to all neighbors.
  • If there were no collisions (no neighbor also sent), join the MIS; the neighbors exit.
• All messages are 1 bit.
• W.h.p., the algorithm computes an MIS in O(log² n) rounds.
• [Afek et al., Science 2011; Afek et al., DISC 2011]

Slime mold network vs. Tokyo rail network
• Very similar transport efficiency and resilience.

Slime-mold model
• Slime mold operates without centralized control or global information processing.
• Simple feedback loop between the thickness of a tube (i.e., edge weight) and the internal protoplasmic flow (i.e.
the amount it’s used).
• Idea: reinforce preferred routes; remove unused or overly redundant connections.
• Edges = tubes; nodes = junctions between tubes.

Slime-mold model
• Start with a randomly meshed lattice with fine spacing (t = 0).
• At each time step, choose a random source (node1) and sink (node2).
• The flux through a tube (edge) is calculated as Q_ij = D_ij (p_i − p_j) / L_ij, where D_ij is the conductance of the tube, p_i − p_j is the pressure difference between the ends of the tube, and L_ij is the length of the tube.
• Conservation at each node: the net flux is +I₀ at the source, −I₀ at the sink, and 0 otherwise.
• Think “network flow”: the source pumps flow out, the sink consumes it; everyone else just passes the flow along (conservation: what enters must exit).
• Updated tube/edge weights: d(D_ij)/dt = f(|Q_ij|) − r·D_ij.
  • The first term is the expansion of tubes in response to the flux; f(|Q|) is a sigmoidal curve.
  • The second term is the rate of tube constriction, so that if there is no flow, the tube gradually disappears.

Measuring network quality
• TL = wiring length.
• MD = average minimum distance between any pair of food sources.
• FT = tolerance to disconnection after a single link failure.
• Each normalized to the MST baseline.
• Networks compared (figure legend): minimum spanning tree, MST + added links, the actual Tokyo railway network, slime-mold-designed networks (with and without light), and simulated networks.
• Cost: TL/TL_MST = 1.80 for the railway and 1.75 ± 0.30 for the slime mold: similar performance, lower cost for the slime mold.
• Efficiency: MD/MD_MST = 0.85 for the railway and 0.85 ± 0.04 for the slime mold: similar.
• Fault tolerance: 4% of single-link failures disconnect the real network, but 14-20% disconnect the slime mold networks: overall better FT for the real network.
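The feedback dynamics above can be sketched on a toy network. This is an illustrative sketch, with f(|Q|) simplified to the identity rather than a sigmoid; the three-edge graph, decay rate r = 1, and time step are made up. A direct source-sink tube competes with a two-hop detour, and the update dD/dt = |Q| − D reinforces the shorter route until the detour disappears.

```python
# Toy Physarum-style dynamics: source S, sink T, a direct tube S-T and a
# detour S-A-T (all tube lengths 1, so the detour is twice as long). Unit
# flow is pumped from S to T; tubes grow with the flux they carry.

def simulate(steps=2000, dt=0.1):
    D = {"ST": 0.5, "SA": 0.5, "AT": 0.5}   # tube conductances
    for _ in range(steps):
        # Solve the 2-unknown pressure system (p_T = 0, unit inflow at S).
        D_ser = D["SA"] * D["AT"] / (D["SA"] + D["AT"])  # detour in series
        p_S = 1.0 / (D["ST"] + D_ser)
        p_A = D["SA"] * p_S / (D["SA"] + D["AT"])
        Q = {"ST": D["ST"] * p_S,
             "SA": D["SA"] * (p_S - p_A),
             "AT": D["AT"] * p_A}
        # dD/dt = f(|Q|) - r*D with f = identity and r = 1 (simplification).
        for e in D:
            D[e] = max(D[e] + dt * (abs(Q[e]) - D[e]), 1e-9)
    return D

D = simulate()
print(D["ST"] > 0.9 and D["SA"] < 0.1)  # the direct tube wins
```

The positive feedback is visible in the update: the direct tube carries more than half the flux, so it thickens while the detour withers, exactly the reinforce-and-prune behavior described above.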
Bacteria foraging
• Imagine a complicated terrain with food in some local minima, and a collection of bacteria that want to find the food. How can they quickly succeed?
• They need to agree on a movement trajectory to collectively find food sources, based on:
  • Individual knowledge / sensory information.
  • Knowledge from others.
• They need to account for dynamic and varied environments with little or no central coordination.
• May also be helpful for robot coordination problems.

Bacterial chemotaxis
• Bacteria navigate via chemotaxis, i.e., they move according to gradients in the chemical (food) concentration.
• In areas of low concentration, bacteria tumble more (i.e., they move randomly).
• Bacteria acquire cues from neighbors:
  • Repulsion, to avoid collision.
  • Orientation.
  • Attraction, to avoid fragmentation.

Bacteria as automata
• Treat bacteria as individual automata with two sources of information:
  • Individual belief about the food source.
  • Interaction with neighbors’ beliefs about where the food source is.
• A parameter w_i(t) controls how much bacterium i listens to its neighbors at time t.

Independent vs. fixed weights
• Problem: erroneous positive feedback leads bacteria astray: a subgroup gets “bad” information and convinces the others along an incorrect trajectory.

Solution: adaptive weights
• Bacteria adjust their communication weights based on their own internal confidence:
  • When a bacterium finds a beneficial path (strong gradient), it downweights: it listens less to its neighbors.
  • When it is unsure, it increases its interaction with its neighbors.
• This simple idea requires only short-term memory and a very simple rule.
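A hedged, one-dimensional caricature of the adaptive-weight rule (not the actual published model; the concentration field, the confidence threshold, and the step sizes are all made up for illustration): each agent blends its own gradient direction with the group's mean direction, weighting neighbors less when its own gradient signal is strong.

```python
# Agents on a line seek the peak of a food field f(x) = -(x - 10)^2.
# Each step combines an agent's own gradient direction with the mean
# group direction; the neighbor weight w is low when the agent's own
# gradient is strong (confident) and high when it is weak (unsure).

def gradient(x):
    return -2.0 * (x - 10.0)  # derivative of the food field

def step(positions):
    dirs = [(g > 0) - (g < 0) for g in (gradient(x) for x in positions)]
    mean_dir = sum(dirs) / len(dirs)
    new_positions = []
    for x, d in zip(positions, dirs):
        w = 0.1 if abs(gradient(x)) > 1.0 else 0.9  # adaptive weight
        new_positions.append(x + 0.1 * ((1 - w) * d + w * mean_dir))
    return new_positions

agents = [0.0, 2.0, 25.0]
for _ in range(3000):
    agents = step(agents)
print(all(abs(x - 10.0) < 1.0 for x in agents))
```

With a fixed high weight instead, a confident agent on a strong gradient could be dragged off course by the majority; the adaptive rule is what lets it ignore the group while its own signal is good.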
Synaptic pruning & brain development
• Up to human birth, thousands of neurons & synapses are generated per minute; from age 2 through adolescence, the density of synapses decreases by 50-60%.
• Pruning occurs in almost every brain region and organism studied.
• But it is unlike previous models of network development… so why does it happen?
• Idea:
  • Source-target pairs are drawn from a distribution D that represents a prior/likelihood on signals.
  • Dtrain is fixed beforehand (but unknown).
  • Dtest is derived from D, but with different s-t pairs.
• Constraints:
  • Distributed: no centralized controller.
  • Streaming: process s-t pairs online.

Synaptic pruning as an algorithm (“use it or lose it”)
1. Start with all-equivalent nodes and blanket the space with connections.
2. Receive a source-target request.
3. Route the request and keep track of edge usage (e.g., edges 1-10 and 10-15 each get usage 1; all other edges stay at 0).
4. Process the next pair, incrementing the usage counts of the edges it traverses.
5. Repeat (in the demo, usage counts grow to, e.g., 14 for edge 1-10, 12 for 6-14, and 1 for 6-13).
6. Prune at some rate: high-usage edges are important [keep]; low-usage edges are not [prune].

Pruning rates
• Pruning rates have been ignored in the literature (human frontal cortex [Huttenlocher 1979]; mouse somatosensory cortex [White et al. 1997]).
• New data: 16 time-points, 41 animals, 9754 images, 42,709 synapses; the number of synapses per image over postnatal days shows decreasing pruning rates in the cortex: rapid elimination early, then taper-off.
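The "use it or lose it" loop above can be sketched as follows. This is an illustrative sketch: the small complete graph, the shortest-path router, and the prune-everything-unused rule are stand-ins for the real streaming model.

```python
from collections import deque

def bfs_path(edges, s, t):
    """Shortest path by BFS over an undirected edge set."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    parent, frontier = {s: None}, deque([s])
    while frontier:
        u = frontier.popleft()
        if u == t:
            path = [u]
            while parent[u] is not None:
                u = parent[u]
                path.append(u)
            return path[::-1]
        for v in sorted(adj.get(u, [])):
            if v not in parent:
                parent[v] = u
                frontier.append(v)
    return None

# 1. Blanket the space with connections (complete graph on 5 nodes).
nodes = range(5)
edges = {(u, v) for u in nodes for v in nodes if u < v}
usage = {e: 0 for e in edges}

# 2-5. Route a stream of source-target requests, counting edge usage.
for s, t in [(0, 3), (1, 4), (0, 4), (0, 3)]:
    path = bfs_path(edges, s, t)
    for u, v in zip(path, path[1:]):
        usage[(min(u, v), max(u, v))] += 1

# 6. Prune: keep only the edges that were ever used.
pruned = {e for e in edges if usage[e] > 0}
print(sorted(pruned))  # [(0, 3), (0, 4), (1, 4)]
```

After pruning, the training pairs still route, at a fraction of the original edge cost; routing pairs the network never saw may of course suffer, which is exactly the efficiency/robustness trade-off measured next.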
Decreasing rates optimize network function
• Measures, plotted against cost (# of edges): efficiency (average routing distance) and robustness (# of alternate paths).
• Decreasing pruning rates are > 20% more efficient than other rates.
• Decreasing rates also have slightly better fault tolerance.

Take-aways
• Biology-inspired algorithms:
  • Evaluated using standard cost metrics, based on formal models.
  • Sometimes worse performance in some aspects, but often simpler algorithms: e.g., the fly MIS algorithm had a higher run-time, but required fewer assumptions; synaptic pruning is initially wasteful, but is more adaptive than growth-based models.
  • Often no UIDs.
  • Dynamic and variable network topologies, similar to new wireless devices & networks.
• Formal models can be used to evaluate performance and predict future system behavior.
• Contributions to biology:
  • Global understanding of local processes (Notch-Delta signaling, slime mold optimization criteria).
  • Raised new testable hypotheses (e.g., the role of pruning rates in learning).
• Common algorithmic strategies we’ve seen:
  • Stochasticity, to overcome noise and adversaries.
  • Feedback processes, to reinforce good solutions/edges/paths.
  • Rates of communication/contact are vital to inform decision-making (MIS, brain, ants, etc.).

Conclusion