Revisiting traditional time modeling and analysis techniques in the light of the "timing dimensions" (and not only)

Goals:
• To establish a homogeneous background
• To build an "attitude" in the evaluation of time models

Dynamical systems
• Discrete systems
• Continuous systems
• The state-space representation
• Dynamical systems as a model of computation
• From continuous to discrete (and back)
• Dynamical systems and the dimensions of temporal modeling

Discrete dynamical systems: the (well-known) Fibonacci's rabbits (1)
a) A rabbit's pregnancy lasts exactly one month;
b) a birth produces exactly two siblings, a male and a female;
c) a newborn rabbit becomes fertile when it turns one month old, and it remains fertile for its whole life;
d) a rabbit's life is indefinitely long (within the time scales considered).

Discrete dynamical systems: the (well-known) Fibonacci's rabbits (2)
• R(t) counts the number of rabbit couples; newR(t) counts the newly born couples
• R(t) = R(t − 1) + newR(t)   (*)
• newR(t) = R(t − 1) − newR(t − 1)
• newR(t − 1) = R(t − 1) − R(t − 2)   (from (*))
• Hence R(t) = R(t − 1) + R(t − 1) − newR(t − 1)
  = R(t − 1) + R(t − 1) − (R(t − 1) − R(t − 2))
  = R(t − 1) + R(t − 2)
• If R(0) = 1, … If R(0) = 0, …
• Incidentally: R(t) = (1/√5)·(((1 + √5)/2)^t − ((1 − √5)/2)^t)

Continuous dynamical systems: the (well-known) capacitor
(Figure: a circuit with current source i(t) driving a resistor R and a capacitor C in parallel, with voltage V(t) and capacitor current iC(t).)
• Q = C·V
• iC(t) = C·dV(t)/dt
• i(t) = iR(t) + iC(t); V(t) = R·iR(t)
• With V(0) = 1 and i(t) = 0:
• V(t) = exp(−t/(R·C))

The state-space representation of dynamical systems
• State x = x1, x2, …, xn
• Input u = u1, u2, …, um
• Output y = y1, y2, …, yl
• Discrete time: x(t + 1) = f(x(t), u(t), t)   ($)
• Continuous time: ẋ(t) = f(x(t), u(t), t)   ($$)
• y = g(x, [u])
• Time-invariant versions:
  x(t + 1) = f(x(t), u(t))   ($-ti)
  ẋ(t) = f(x(t), u(t))   ($$-ti)

Dynamical systems as models of computation
• State ≈ memory (register, array, …)
• Next state: x(t+1) = f(memory, read input)
• Output: write …
• Example: cellular automata
  – the next state of a cell depends only on the states of the neighboring cells, e.g. si(t + 1) = f(si−2(t), si−1(t), si(t), si+1(t), si+2(t))
  – Rule 110: si(t + 1) = 1 if si−1(t) si(t) si+1(t) ∈ {110, 101, 011, 010, 001}, and 0 otherwise
• Cellular automata have the computational power of Turing machines

From continuous to discrete … and back (1)
• Discretization for numerical computation (of continuous models):
• Fixed point vs. floating point …
• dV(t)/dt = −(1/(R·C))·V(t)
• Forward difference with unit step: V̂(k + 1) = (1 − 1/(R·C))·V̂(k), with V̂(0) = V̂0
• V̂(1) = (1 − (R·C)^(−1))·V̂(0) = (1 − (R·C)^(−1))·V̂0
• V̂(2) = (1 − (R·C)^(−1))·V̂(1) = (1 − (R·C)^(−1))²·V̂0
• …
• V̂(K) = (1 − (R·C)^(−1))^K·V̂0

From continuous to discrete … and back (2)
(Figure: a continuous process in a closed loop with a computer-based controller.)
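A minimal numerical sketch of the discretization above, iterating the difference equation V̂(k + 1) = (1 − 1/(R·C))·V̂(k) and comparing it with the exact solution exp(−t/(R·C)); the time constant R·C = 10 and the unit step are illustrative assumptions, not values from the slides:

```python
import math

# Forward-difference discretization of the RC discharge equation
#   dV/dt = -V(t)/(R*C),  V(0) = 1
# with unit step, as in the slides: V_hat(k+1) = (1 - 1/(R*C)) * V_hat(k).

RC = 10.0      # time constant R*C (arbitrary illustrative value)
h = 1.0        # discretization step (the slides use one time unit)
V_hat = 1.0    # V_hat(0) = V_hat_0 = 1

for k in range(5):
    exact = math.exp(-k * h / RC)       # continuous solution V(t) = exp(-t/(R*C))
    print(f"k={k}  V_hat={V_hat:.4f}  exact={exact:.4f}")
    V_hat = (1 - h / RC) * V_hat        # V_hat(k+1) = (1 - 1/(R*C)) * V_hat(k)
```

The discrepancy between the two columns is the discretization error; shrinking the step h reduces it, at the cost of more iterations.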
Dynamical systems and the dimensions of temporal modeling
• Discrete and continuous (time) domains (already discussed)
• Next state vs. continuous evolution
• Zeno (and other "pathological") behaviors:
  – x(t) = 1 if t is rational, 0 if t is irrational
  – x(t) = tan(t)
  – x(t + 1) = r·x(t)·(1 − x(t)), for constant r > 0, which (for suitable values of r) defines highly irregular, chaotic behavior that is difficult to predict
• Sometimes boundedness and/or continuity are required as a "guarantee" of regularity, but …
  – b(t) = exp(−1/t²)·sin(1/t) if t ≠ 0, and b(0) = 0, …

Avoiding pathological behaviors
• Writing a differential/difference equation in time is no guarantee of formalizing a "good" dynamical system
• Avoiding pathological behaviors "a priori", by means of suitable (sufficient) conditions:
  – analytic functions have good regularity properties
• Verifying "a posteriori" whether the behavior implied by the given equations is "good" or not

A real-time exercise …
Consider the following types of irregular behavior: discontinuous, continuous with discontinuous derivative, Zeno, Berkeley, unbounded. For each of the following choices of state and time domain, which types of irregular behavior may occur?
• Continuous and unbounded state space and time (say, R);
• continuous and bounded state space and time (say, the real interval [0,1]);
• dense state space and time (say, Q);
• continuous time (say, R) and discrete state space (say, Z);
• discrete time (say, Z) and continuous state space (say, R);
• discrete state space and time (say, Z).
• Can you think of real systems where such irregular behaviors can arise?

... and a non-real-time one
A dynamical system has chaotic behavior when its dynamics is difficult to predict because of certain characteristics, including in particular sensitivity to initial conditions. Informally, sensitivity to initial conditions is a form of instability, where a tiny change in the initial value of the state variables may result in a conspicuous change in the evolution of the state. In terms of predictability, this means that if the initial state is known only with finite precision, the state dynamics becomes completely unpredictable after some time. The logistic map above (x(t + 1) = r·x(t)·(1 − x(t))) is an example of a discrete-time system with chaotic behavior; a small simulation sketch follows below.
Consider now the notion of undecidability applied to the dynamics of a class of discrete-time systems C: a property of C's dynamics (e.g., "Does the state of every system in C ever become positive?", "Does the state reach equilibrium?") is undecidable if its yes/no answer cannot be computed by any algorithmic procedure.
• Is the dynamics of dynamical systems with chaotic behavior always undecidable?
• Conversely, are dynamical systems whose dynamics is undecidable always chaotic?
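The sketch announced above: iterating the logistic map from two almost identical initial states to exhibit sensitivity to initial conditions. The value r = 4 (for which the map is well known to be chaotic) and the initial states are illustrative choices of ours; the slides only require a constant r > 0:

```python
# Two trajectories of the logistic map x(t+1) = r*x(t)*(1 - x(t)),
# started from initial states that differ by only 1e-10.

r = 4.0                    # classic chaotic regime (illustrative choice)
x, y = 0.3, 0.3 + 1e-10    # almost identical initial conditions

for t in range(60):
    if t % 10 == 0:
        print(f"t={t:2d}  x={x:.6f}  y={y:.6f}  |x-y|={abs(x - y):.2e}")
    x = r * x * (1 - x)
    y = r * y * (1 - y)
```

After a few dozen steps the two trajectories are completely uncorrelated: finite precision on the initial state destroys long-term predictability.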
Determinism in dynamical systems
• Traditional dynamical systems (those studied in classic control theory) are deterministic
• Stochastic systems too are deeply studied (e.g., in information and communication theory)
• There is no conceptual reason not to exploit nondeterministic ones as well in modern control and automation theory (Petri nets, …; but why not nondeterministic continuous models?)

Implicit vs. explicit time reference
• Time-invariant system:
  – if the system reaches the same state at two different times t1, t2 (that is, x(t1) = x(t2)) and it is subject to the same input function in the future (that is, u(d + t1) = u(d + t2) for all positive d), then the future values of the state are the same at corresponding instants of time (x(d + t1) = x(d + t2) for all positive d).
• Time is still explicit in x(t), but it is implicit in dx/dt = f(x, u)

Concurrency and composition

Notations and tools for dynamical system analysis
• Inspired by numerical methods:
  – Matlab/Simulink
  – Modelica
  – …
• Towards models and tools that integrate equation-based formalisms with automata- and logic-based ones:
  – … next part of the course

Modeling time in hardware design (from continuous to discrete)
• From transistors to sequential logic circuits
• Raising the level of abstraction: finite state machines
• From asynchronous to synchronous logic circuits
• Raising the level of abstraction again: hardware description languages
• Methods and tools for hardware analysis

From transistors to sequential logic circuits (1)
(Figures: an NMOS transistor; a transistor implementing a NOT logical operation; a NAND circuit; a NOR circuit.)

From transistors to sequential logic circuits (2)
But the table:
  Vin  Vout
   0    1
   1    0
and the NOT-gate icon are more abstract … from a functional point of view. What about timing?

From transistors to sequential logic circuits (3)
What about timing?
(Figure: two NOT gates in series, and an input-output graph where t2 − t1 = 2δ, twice the delay δ of a single gate.)
• Time is still continuous, but trajectories are "rectified"
• Combinational circuits are memoryless (stateless) devices with a delay between input and output

From transistors to sequential logic circuits (4)
• From combinational to sequential logic circuits
• Adding memory to logic circuitry by introducing feedback
(Figure: an SR NAND latch.)
  S  R  |  Q
  0  0  |  forbidden
  0  1  |  1
  1  0  |  0
  1  1  |  no change

From two to many states: sequential machines
• 1 bit: 2 states; n bits: 2^n states
(Figure: the general structure of a sequential machine.)
Question: it is clearly an example of a dynamical system. Is time discrete or continuous? What about the state?

Sequential machines
A behavior of the machine: initial output O1 = O2 = 1; the two inputs R1 = R2 hold the value 1 while S1 switches to the value 0 for ε time units, with ε greater than the switching delay of the latch.

Abstracting the "ramps": 0-time transitions
(Figure: the state Q of an SR latch set to 1 with zero-time transitions; δ is the switching delay of the latch.)
But … what is the state during the transition? Problems may arise …

Abstracting the logic circuitry: finite state machines (FSM)
(Figure: the (well-known) graphical representation of the finite state machine for an SR latch.)
The state space is now obviously discrete; what about time?

FSMs with output: Moore and Mealy machines
(Figure: a Mealy machine modeling an SR latch, with states 0 and 1 and transitions labeled s/o, r/ε, s/ε, r/ε; ε (the empty string) = no output.)
• Mealy: output function of type Q × I → O
• Moore: output function of type Q → O
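A minimal simulation sketch of the Mealy machine just described; the state encoding and the reading of the output labels (an 'o' emitted only when the latch is set from state 0, ε otherwise) are our interpretation of the figure:

```python
# A Mealy machine for the SR latch: states 0 and 1 are the stored bit,
# inputs are 's' (set) and 'r' (reset), '' plays the role of epsilon.

delta = {(0, 's'): 1, (0, 'r'): 0, (1, 's'): 1, (1, 'r'): 0}       # Q x I -> Q
omega = {(0, 's'): 'o', (0, 'r'): '', (1, 's'): '', (1, 'r'): ''}  # Q x I -> O

def run(state, inputs):
    outputs = []
    for i in inputs:
        outputs.append(omega[(state, i)])  # Mealy: output depends on state AND input
        state = delta[(state, i)]
    return state, ''.join(outputs)

print(run(0, ['s', 's', 'r', 's']))   # -> (1, 'oo'): 'o' on each set from state 0
```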
From asynchronous to synchronous logic circuits: the clock
(Figure: the clock as a square wave (dotted line) or as a sequence of impulses (solid line).)

The latch synchronized with the clock
(Figure: a synchronized reset transition of the flip-flop.)
Now time too is discrete.

The "same" FSM with discrete time
(Figure: an FSM for an SR flip-flop with "no change" events.)
The model now has a metric on time: one transition = one time unit = one clock period.

From few to many states: modular abstractions in hardware design
• The clock – and therefore time – is again implicit: one transition = one time unit
• However, as the abstraction level increases … the clock tends to "disappear" …
• Towards the SW "purely functional" view

Methods and tools for hardware analysis
• Testing
• Simulation
• Formal verification
• Synthesis
• Tools: SPICE, VHDL, …

Time in the analysis of algorithms
• Introductory concepts
• Computational complexity and automata models
  – A brief but necessary digression on computability
  – Back to (deterministic) automata-theoretic complexity
• The RAM (random access machine) and its complexity
• The complexity relations between different computational models
• The complexity of nondeterministic computations
• Complexity hierarchies
• Probabilistic computations and their complexity (hints)

The traditional way of modeling software
• The functional abstraction:
  – An algorithm computes a function f: I → O
  – The computation process is completely abstracted away
• But the algorithm is executed by an (abstract or physical) machine (operational model)
• The abstract machine uses resources – memory and time –
• Complexity as a measure of the cost of achieving the desired result: typically kept separate from functional analysis (unlike the typical analysis of dynamical systems)
• Complexity analysis is based on some (abstract) operational model:
  – Given the abstraction introduced by hardware: discrete state and time domains
  – Normally deterministic

Measures of computational complexity
• (Most general remarks apply to both space and time complexity, but we are interested in the latter)
• Given a computational model: one – abstract – clock tick (execution of an elementary operation) = one time unit
• Complexity = number of time units elapsing from the beginning to the end of a computation
• Complexity depends on the input data x: f(x)
• A first basic abstraction: make it a function of the size of the input data: n = |x|
• But in general |x1| = |x2| does not imply f(x1) = f(x2)
• Worst-case and average-case analysis. For an abstract machine A, worst case:
  T_A(n) = max over |x| = n of T_A(x)
• Average case:
  T_A(n) = (Σ over |x| = n of T_A(x)) / |I|^n   (where I is the input alphabet)

(The very simple) complexity of FSMs
• Finite memory (does not depend on the input)
• For every input symbol, one move = one time unit:
• T_A(n) = n (in such cases we say the machine is a real-time machine)

Brief digression on computational power
• FSMs are rather limited in problem solving
• E.g., problem = language recognition: L ⊆ I*; is x ∈ L?
• FSMs cannot recognize {a^n b^n | n > 0} (they have a finite memory; see the sketch below)
• If we want to formalize and analyze any generic computation (algorithm), we need much more computational power
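To see why finite memory is the obstacle, here is a sketch of a recognizer for {a^n b^n | n > 0} that uses an unbounded counter, and hence is not a finite-state machine; the coding details are illustrative:

```python
# {a^n b^n | n > 0} requires counting the a's, and the count is unbounded:
# no fixed number of states can remember arbitrarily large n.

def in_anbn(w):
    n = 0
    i = 0
    while i < len(w) and w[i] == 'a':   # count the leading a's (unbounded memory!)
        n += 1
        i += 1
    m = 0
    while i < len(w) and w[i] == 'b':   # count the trailing b's
        m += 1
        i += 1
    return i == len(w) and n == m and n > 0

for w in ['ab', 'aabb', 'aab', 'ba', '']:
    print(repr(w), in_anbn(w))
```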
The famous and fundamental Turing machine (TM)
(Figure: a Turing machine with input tape, memory tape(s), and output tape.)
The model can formalize language recognition (without the output tape) as well as language translation (≈ function computation): any formalization of the generic notion of problem.

A little example of TM
(Figure: a Turing machine Msucc, with one memory tape plus input and output tapes, that computes the successor of a binary number. Its states are q0, q1, q2, q3; its transitions, written in the form 0, _ / _, 0 / L,R,S, specify the symbols read, the symbols written, and the moves of the heads. A simulation sketch of a simplified variant appears at the end of this subsection.)
The notation is imported and extended from that of FSMs (not by chance).

The computational power of TMs
• The fundamental Church-Turing thesis:
  – "every implementable computational process can be computed by a Turing machine"
• There are problems (languages, functions, …) that cannot be solved by any TM
  – Therefore they are not "algorithmically solvable"
• The most classical and fundamental case:
  – the (computation) termination problem (the time complexity of a TM computation can be ∞)

Let's go back to complexity issues in the context of the general (?) and powerful TM
• Even if a problem can be solved for any input datum (T_M(x) < ∞ for every x), T_M(x) (and T_M(n)) can be a highly variable (and high) function: what about T_M(n) = 2^(2^(2^(n!)))?
• The linear speed-up theorem:
  – Given any Turing machine M solving a problem with time complexity T_M(n) and any rational number c > 0, it is possible to build another Turing machine M' that, after reading the whole input string, solves the same problem as M with time complexity c·T_M(n), i.e., T_M'(n) = max(n, c·T_M(n)).
• The theorem is general and does not really depend on the chosen model: it deals with the issue of spending more resources to solve problems.

• The linear speed-up theorem suggests focusing complexity analysis – at least as a first approximation – on the order of magnitude of the (complexity) function:
  – f is O(g) ("big oh of g") if there exist positive constants c, k (k integer) such that f(n) ≤ c·g(n) for all n > k;
  – f is Ω(g) ("big omega of g") if there exist positive constants c, k (k integer) such that f(n) ≥ c·g(n) for all n > k;
  – f is Θ(g) ("big theta of g") if f is O(g) and Ω(g).
• By changing the algorithm that solves a problem we may change the order of magnitude of the complexity of the problem's solution; by changing the machine (instance) we can only change complexity linearly, i.e., without affecting its order of magnitude:
  – A Θ(n·log(n)) sorting algorithm will run on a modest PC faster than a Θ(n²) one running on a very expensive supercomputer (for large values of n).

Complexity classes
• It is quite natural to classify problems according to the complexity of their solution:
  – The more effort I put into solving problems, the more problems I will be able to solve
  – TIME(f(n)): the class of problems that can be solved (with the "best" algorithm) with time complexity O(f(n))
  – f ∈ O(g) ⟹ TIME(f(n)) ⊆ TIME(g(n))
  – Also, normally, Θ(f) > Θ(g) ⟹ TIME(f(n)) ⊃ TIME(g(n))
  – E.g., sorting (through comparisons) ∉ TIME(n)
  – ⟹ complexity lower bounds
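In the spirit of the Msucc example a few slides back, here is a minimal sketch of a Turing machine computing the successor of a binary number; the slide's machine also has separate input and output tapes, whereas this simplified single-tape variant (states, transitions, and encoding) is ours:

```python
# A tiny single-tape Turing machine for binary successor.
# Transitions: (state, read symbol) -> (written symbol, head move, next state).

delta = {
    ('right', '0'): ('0', +1, 'right'),   # scan right to the least significant digit
    ('right', '1'): ('1', +1, 'right'),
    ('right', '_'): ('_', -1, 'carry'),
    ('carry', '1'): ('0', -1, 'carry'),   # 1 + carry = 0, propagate the carry left
    ('carry', '0'): ('1',  0, 'halt'),    # absorb the carry
    ('carry', '_'): ('1',  0, 'halt'),    # overflow: write a new leading 1
}

def successor(bits):
    tape = dict(enumerate(bits))          # the rest of the tape is blank ('_')
    pos, state, steps = 0, 'right', 0
    while state != 'halt':
        write, move, state = delta[(state, tape.get(pos, '_'))]
        tape[pos] = write
        pos += move
        steps += 1
    lo, hi = min(tape), max(tape)
    word = ''.join(tape.get(i, '_') for i in range(lo, hi + 1)).strip('_')
    return word, steps

print(successor('1011'))   # -> ('1100', 8): result plus the number of TM steps
```

Even here, one "elementary" addition costs a number of TM steps proportional to the length of the operand, a point that becomes central when comparing models below.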
However …
• The definition of computational complexity is bound to the computational model: what if we change model?
• The Church-Turing thesis guarantees that if a problem can be solved by any "machine", it can be solved by a TM
• Does the same hold for complexity evaluation?
  – Intuitively, the set of "elementary operations" of a real computer is quite different from TM transitions
  – Maybe every hw operation can be simulated by a bounded sequence of TM operations (linear speed-up)?

… as a matter of fact …
• Even without comparing (the complexity of) TMs with that of real computers (to be done next),
• just moving from the k-tape TM to the (original) single-tape one (not 1-tape!):
(Figure: a single-tape Turing machine with its control unit.)
• … the "mirror" or palindrome language L = {w·w^R | w ∈ I*} cannot be recognized by any single-tape TM in less than Ω(n²) time

• It is time to consider a more realistic (?) computational model: the RAM, clearly inspired by the von Neumann architecture.
(Figure: the RAM architecture; every cell contains a character or an integer.)

The RAM instruction repertoire (1)
Instruction | Semantics | Comments
READ x | M[x] ← current input value; the input head advances by one position. | Copy the value in the current input cell to the memory at address x.
READ@ x | M[M[x]] ← current input value; the input head advances by one position. | @ denotes indirect addressing: copy the input to the memory cell whose address is stored in memory at address x.
WRITE x | M[x] → current output cell; the output head advances by one position. |
WRITE@ x | M[M[x]] → current output cell; the output head advances by one position. |
WRITE= x | x → current output cell; the output head advances by one position. | = denotes immediate addressing: write the operand's value x.
LOAD x | ACC ← M[x] | Copy the content of memory at address x to the accumulator.
LOAD@ x | ACC ← M[M[x]] |
LOAD= x | ACC ← x |
STORE x | M[x] ← ACC |
STORE@ x | M[M[x]] ← ACC |

The RAM instruction repertoire (2)
Instruction | Semantics | Comments
ADD x | ACC ← [ACC] + M[x] | Add the content of memory at address x to the accumulator and store the result back in the accumulator.
ADD@ x | ACC ← [ACC] + M[M[x]] |
ADD= x | ACC ← [ACC] + x |
SUB, MULT, DIV [...] | | Subtraction, multiplication, and division are defined similarly to ADD.
JUMP lab | PC ← instruction(lab) | Set the program counter to the address of the instruction with label 'lab'; that instruction is executed next.
JZ lab | if [ACC] = 0 then PC ← instruction(lab) else PC ← [PC] + 1 | Conditional jump: jump to 'lab' if the accumulator stores 0, otherwise continue sequentially.
HALT | Execution stops. |

Then, complexity analysis proceeds as usual
• One elementary operation = one time unit
• Apparently striking differences w.r.t. TM-based analysis:
  – ADD x in O(1) time vs. O(log(|x|)) on a TM
  – binary search in O(log(n)) time vs. O(n) – or O(n·log(n)) – time on a TM
  – …
• BUT …

• The pseudocode
    x = 2
    for i = 1 to n
      x = x·x
  computes 2^(2^n) in O(n). As a RAM program (see the interpreter sketch below):
          READ 1      Store the input n into M[1].
          LOAD= 2
          STORE 2     Initialize M[2] to 2.
          LOAD= 1
          STORE 3     M[3] is used as a counter, initialized to the value 1.
  LOOP:   LOAD 1
          SUB 3
          JZ RESULT   When the counter reaches n, M[2] contains the result.
          LOAD 2
          MULT 2
          STORE 2     Square M[2], that is, M[2] receives M[2]·M[2].
          LOAD 3
          ADD= 1
          STORE 3     Increment the counter.
          JUMP LOOP
  RESULT: WRITE 2     Print the result and stop.
          HALT
• Is this realistic?
• It needs 2^n bits just to store the result
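A minimal interpreter sketch for the fragment of the RAM language used by the program above, under the constant cost criterion (one instruction = one time unit); the encoding of instructions as (opcode, argument) pairs and the replacement of labels by numeric addresses are our own choices:

```python
def run_ram(prog, inputs):
    """Execute a RAM program; return the output tape and the time (instruction count)."""
    M, acc, pc, time, out = {}, 0, 0, 0, []
    inp = iter(inputs)
    while True:
        op, arg = prog[pc]
        pc, time = pc + 1, time + 1
        if   op == 'READ':   M[arg] = next(inp)
        elif op == 'LOAD=':  acc = arg
        elif op == 'LOAD':   acc = M[arg]
        elif op == 'STORE':  M[arg] = acc
        elif op == 'SUB':    acc -= M[arg]
        elif op == 'MULT':   acc *= M[arg]
        elif op == 'ADD=':   acc += arg
        elif op == 'JZ':     pc = arg if acc == 0 else pc
        elif op == 'JUMP':   pc = arg
        elif op == 'WRITE':  out.append(M[arg])
        elif op == 'HALT':   return out, time

# The squaring program above; labels LOOP = 5 and RESULT = 15 become addresses.
prog = [('READ', 1), ('LOAD=', 2), ('STORE', 2), ('LOAD=', 1), ('STORE', 3),
        ('LOAD', 1), ('SUB', 3), ('JZ', 15),                   # LOOP
        ('LOAD', 2), ('MULT', 2), ('STORE', 2),
        ('LOAD', 3), ('ADD=', 1), ('STORE', 3), ('JUMP', 5),
        ('WRITE', 2), ('HALT', None)]                          # RESULT

out, time = run_ram(prog, [5])
print(out, time)   # -> [65536] in 50 instructions
```

Running it makes the "Is this realistic?" question concrete: as transcribed the loop squares n − 1 times (with input 5 it prints 65536 = 2^(2^4)), so the instruction count stays linear in n while the result already needs an exponential number of bits.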
• Are RAM operations really elementary (i.e., independent of the data)?
  – Do ADD, LOAD, STORE require O(1) time?
• The RAM is a little "too abstract" to properly capture the notion of elementary datum and operation, so that they can be associated with the (memory and) time unit:
  – a RAM cell stores an arbitrary integer, but a real computer cell has k (32, 64, 128, …) bits …

The logarithmic cost criterion
(l(v) denotes the length in bits of the value v.)
Instruction | Cost
READ x | l(current input value) + l(x)
READ@ x | l(current input value) + l(x) + l(M[x])
WRITE x | l(x) + l(M[x])
WRITE@ x | l(x) + l(M[x]) + l(M[M[x]])
WRITE= x | l(x)
LOAD x | l(x) + l(M[x])
LOAD@ x | l(x) + l(M[x]) + l(M[M[x]])
LOAD= x | l(x)
STORE x | l([ACC]) + l(x)
STORE@ x | l([ACC]) + l(x) + l(M[x])
ADD x | l([ACC]) + l(x) + l(M[x])
ADD@ x | l([ACC]) + l(x) + l(M[x]) + l(M[M[x]])
ADD= x | l([ACC]) + l(x)
JUMP lab | 1
JZ lab | l([ACC])
HALT | 1
Minor differences w.r.t. the RASP (random access stored program machine).

• Now that we have a more realistic cost evaluation (for large values of the data):
  – There are still important differences between the RAM's and the TM's complexity (e.g., bubble sort), but not as striking
  – But is the RAM always more efficient than the TM?
• L = {w·c·w^R | w ∈ {a,b}*} can be recognized by a TM in O(n), but a RAM takes at least O(n·log(n)) under the logarithmic cost criterion
• The RAM's strength is direct access to memory addresses, but direct access is not always as efficient as sequential access
• More generally:

• An algorithm coded in the RAM language with time complexity T_R(n) can be simulated by a TM with O(T_R(n)²) complexity
• An algorithm coded as a TM with time complexity T_M(n) can be simulated by a RAM with O(T_M(n)·log(T_M(n))) complexity
• More generally:
• The polynomial correlation thesis (a strong version of the Church-Turing thesis):
  – Under "reasonable" cost criteria, any "realistic" computational model can be simulated by any other one in such a way that their complexity functions are polynomially related: T_M1(n) = P1(T_M2(n)) and T_M2(n) = P2(T_M1(n))

• We can now summarize a few typical complexity classes:
  – O(1)
  – Linear (or sublinear) time
  – Up to O(n·log(n)): data-intensive applications (these depend on the computational model: normally the RAM)
  – Polynomial complexity (fairly typical of numerical applications)
    • P: the class of problems that can be solved with polynomial complexity
    • P does not depend on the computational model
  – Above polynomial: typical of combinatorics; considered intractable, but …
• All of the above concerns deterministic computation: what about …

… nondeterministic and probabilistic computation
• (Almost) all operational computation models have their nondeterministic counterpart: essentially
  δ: Q × I → ℘(Q)
(Figure: a nondeterministic choice: two transitions with the same input label leaving the same state.)
• This practically holds for every family of automata:
  – FSM, TM, pushdown, …
  – with different consequences for their formal properties (computational power, closure properties, …)
• In some cases abstract machines are originally defined as nondeterministic devices (Petri nets, …)
• In some cases even programming languages have nondeterministic constructs
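A sketch of the "parallel search" reading of nondeterminism: simulating δ: Q × I → ℘(Q) deterministically by tracking the set of states the automaton could be in. The example automaton (accepting strings over {a, b} that end in 'ab') is an illustration of ours:

```python
# delta maps (state, symbol) to a SET of possible next states.
delta = {
    (0, 'a'): {0, 1},   # nondeterministic guess: is this 'a' the start of the final 'ab'?
    (0, 'b'): {0},
    (1, 'b'): {2},      # the guess pays off
}
start, accepting = 0, {2}

def accepts(word):
    current = {start}                      # all states the machine could be in
    for c in word:
        current = set().union(*(delta.get((q, c), set()) for q in current))
    return bool(current & accepting)       # existential: SOME computation succeeds

for w in ['ab', 'aab', 'abb', 'bab']:
    print(w, accepts(w))
```

This is exactly the existential interpretation discussed next: the word is accepted if at least one sequence of choices leads to an accepting state.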
• Besides the impact on computational power and other properties: what is the impact of nondeterminism on complexity?
• First of all: what do we mean by nondeterministic complexity?
  – The longest computation among all possible ones?
  – The shortest one?
  – According to the normal interpretation of nondeterminism as blind or parallel search (but there is also a symmetric interpretation as whichever choice):
    • existential vs. universal nondeterminism
    • here we focus on the existential one
  – The shortest computation that leads to success, if any:

• In the traditional literature on abstract complexity/formal languages:
  – A nondeterministic (ND) Turing machine N runs in #N steps for an input x if the shortest sequence of steps that is allowed by the nondeterministic choices and correctly computes the result for input x has length #N.
  – A nondeterministic Turing machine has time complexity T(n) if the longest computation with input of size n runs in T(n) steps.
  – A time complexity measure T(n) defines the complexity class NTIME(T(n)) of all problems that can be solved by some nondeterministic Turing machine with time complexity in O(T(n)).

• Plenty of problems admit an "obvious" ND solution, typically:
  – "guess" a potential solution;
  – verify whether it is indeed a solution;
  – both actions can be done in a "short" time (linear or low-degree polynomial), but the number of nondeterministically generated guesses can often grow exponentially or more (see the SAT sketch below).
• Classical examples:
  – SAT
  – HC (the Hamiltonian circuit problem in graph theory)
  – Clique
  – …
• This leads to the definition of the fundamental class
  – NP (or NPTIME): ∪_{k∈N} NTIME(n^k)
• and to the extremely challenging hierarchy:

• LOGSPACE ⊆ PTIME ⊆ NPTIME ⊆ PSPACE = NPSPACE ⊆ EXPTIME ⊆ NEXPTIME ⊆ EXPSPACE = NEXPSPACE
• Some inclusions must be strict, but … which ones?
  – Conjecture: (almost) all of them
• The fundamental notion of NP-completeness:
  – For a problem p, P(x) denotes the solution of p for input x; M(x) denotes the (unique) output of the deterministic Turing machine M with input x. Then a problem c in NP is NP-complete if, for any other problem p in NP, there exist two deterministic Turing machines Rpc, Rcp with polynomial time complexities, such that Rcp(C(Rpc(x))) = P(x) holds for every input x.
  – All the above examples of problems in NP – and many more – are also NP-complete

• Traditionally and "reasonably", the P/NP frontier has been considered the borderline between tractable and intractable problems
• However, modern tools are able to manage in practice "most" instances of NP-complete problems in a satisfactory way, even though worst-case theory states that, with our present knowledge, simulating ND computations by a deterministic device should take at least exponential time (this is considered a great challenge for the theory of abstract complexity)
• Thus, much interest is now focused on the hierarchy "above NP-completeness"

Randomized models of computation
• In principle, all (deterministic) operational models can be "randomized" in much the same way as they are made ND
• The mathematical machinery needed to manage them, however, is quite different (but well established in the literature)
• The "applications" of the models (analysis) are also rather different in nature:
  – Det & ND:
    • Can we guarantee that we will achieve our goal (within a given time)?
    • Is there a way to achieve our goal?
  – Probabilistic:
    • What are the chances (probability) of achieving our goal within a given time?
    • How long should I wait, on average, to achieve my goal? With what standard deviation from the average?
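The SAT sketch announced above, making the guess-and-verify pattern concrete: verifying one assignment takes time linear in the formula, while a deterministic simulation of the existential guess may sweep all 2^n assignments. The tiny CNF instance is an illustrative choice of ours:

```python
from itertools import product

# CNF encoding: a clause is a list of literals, a literal is (variable, polarity).
# This instance is (x0 or not x1) and (x1 or x2) and (not x0 or not x2).
cnf = [[(0, True), (1, False)], [(1, True), (2, True)], [(0, False), (2, False)]]
n = 3

def verify(assignment, cnf):
    """Polynomial-time check of a single guessed assignment."""
    return all(any(assignment[v] == pol for v, pol in clause) for clause in cnf)

# Deterministic simulation of the existential ND guess: exponential sweep.
models = [a for a in product([False, True], repeat=n) if verify(a, cnf)]
print(f"{len(models)} satisfying assignments out of {2 ** n}, e.g. {models[0]}")
```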
Randomized models of computation:
• Probabilistic finite-state automata (Markov chains)
• Probabilistic Turing machines and complexity classes

Probabilistic finite-state automata
• Discrete-time Markov chains
• Markov decision processes
• Continuous-time Markov chains

Discrete-time Markov chains
• A probabilistic finite-state automaton is a finite-state automaton without input alphabet, extended with a probability function π: Q × Q → [0,1] which determines which transitions are taken. The probability function is normalized: for every q ∈ Q, Σ_{q'∈Q} π(q, q') = 1
• Discrete-time Markov chains generalize discrete-time probabilistic finite-state automata to any countable set Q of states

Example
• A student takes a series of exams: when the student passes an exam (state P), there is a 90% chance that she will pass the next exam as well (transition to P); when the student fails an exam (state F), there is a 60% chance that she will fail the next exam too (transition to F).
(Figure: a two-state chain with transitions P→P with probability 0.9, P→F with 0.1, F→F with 0.6, F→P with 0.4.)

A few "natural" consequences of probability theory
• The probability of every specific behavior decreases as the length of the behavior increases: whereas in nondeterministic models every nondeterministic choice is possible and must be considered, assigning probabilities to the different choices entails that the long-term behavior is more and more likely to asymptotically approach the average behavior.
• The steady-state probability gives the likelihood that the automaton will be in a certain state after an arbitrarily long number of steps:
  – The probability function is a |Q| × |Q| probability matrix M, whose element [i, j] is the probability of transitioning from the i-th to the j-th state.
  – The steady-state probability is independent of the initial and current state: a row vector p = [p1 … p|Q|] of nonnegative elements, whose i-th element denotes the probability of being in the i-th state, is the steady-state probability if it satisfies p·M = p (it does not change after one iteration) and Σ_{1≤i≤|Q|} p_i = 1

• In the previous example:
  – probability matrix M =
      0.9  0.1
      0.4  0.6
  – [p1 p2] = [0.8 0.2]:
  – in the long term, the student passes 80% of the exams she attempts (see the sketch below)

Adding input: Markov decision processes
• A probabilistic finite-state automaton with input is a probabilistic finite-state automaton with a probability function π: Q × I × Q → [0,1] over transitions. The probability function is normalized with respect to the next states. When the automaton is in state q ∈ Q and inputs an event i ∈ I, it can make a transition to any state q' ∈ Q with probability π(q, i, q').
• Discrete-time Markov decision processes generalize discrete-time probabilistic finite-state automata with input to any countable sets Q and I of states and input events.

• The main difference between discrete-time Markov chains and discrete-time Markov decision processes (and the corresponding finite-state versions) is the input, which models the external environment, also named in the literature scheduler, controller, adversary, or policy, depending on the application context.
• Remark (holding in general for any operational model):
  – Decision processes are described as OPEN systems, to which an external abstract entity supplies input. The composition of a decision process with its environment is a CLOSED system that characterizes an embedded process operating in specific conditions.
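The sketch announced above: computing the steady-state vector of the student chain by power iteration, i.e., repeatedly applying p ← p·M until the distribution stabilizes. The initial distribution and the number of iterations are illustrative choices:

```python
# The student example as a discrete-time Markov chain.
M = [[0.9, 0.1],    # from P: 90% pass again, 10% fail
     [0.4, 0.6]]    # from F: 40% pass,       60% fail again

p = [1.0, 0.0]      # arbitrary initial distribution (start in P)
for _ in range(100):
    # one step of p <- p * M (row vector times matrix)
    p = [sum(p[i] * M[i][j] for i in range(2)) for j in range(2)]

print(p)            # -> approximately [0.8, 0.2], as on the slide
```

One can check the fixed point by hand: 0.8·0.9 + 0.2·0.4 = 0.8 and 0.8·0.1 + 0.2·0.6 = 0.2, so p·M = p indeed.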
• (Summarizing) remark:
  – Time is DISCRETE, SYNCHRONOUS, and METRIC in finite-state automata, regardless of whether they are considered in their deterministic, nondeterministic, or probabilistic version.

• (Simple) exercise:
• Consider the following generalization of the previous example:
  – While preparing for an exam, the student may attend classes on the exam's topic (event a) or skip them (event s). While attending classes is no guarantee of passing, it significantly affects the probability of success: after the student has passed an exam (state P), there is a 90% chance that she will also pass the next one if she attends classes; if she does not, the probability shrinks to only 20%. Conversely, after the student has failed an exam (state F), there is a 70% chance that she will pass the next one if she attends classes, but only a 10% chance if she skips them.
  – Model the new system specification by means of a decision process.
  – Compute:
    • the probability that the student passes the first k consecutive exams when she always attends classes;
    • the probability that she passes the first 2k consecutive exams when she attends the classes of every other exam;
    • the probability pass_k that the student passes the first k consecutive exams, as a function of the input sequence i1, i2, …, ik.

Continuous-time Markov chains
• Now:
  – Time is continuous
  – Behavior is asynchronous, since a probability distribution governs the residence (also: sojourn) time in every state
  – Main constraint:
    • the probability of remaining in the current state for the next t time units does not depend on the previous states traversed by the automaton, but only on the current one, in the same way as the probability of making a transition depends only on the current state in discrete-time probabilistic automata (that is, the Markov property)
  – The only probability distribution that satisfies the Markov property is the exponential one: the probability that the automaton keeps waiting for t time units decreases exponentially with t

• Formally:
• A continuous-time probabilistic finite-state automaton extends a (discrete-time) probabilistic finite-state automaton with a rate function ρ: Q → R>0. Whenever the automaton enters state q ∈ Q, it waits there for a time given by an exponential distribution with parameter ρ(q) and probability density function
  p(t) = ρ(q)·exp(−ρ(q)·t) for t ≥ 0, and p(t) = 0 for t < 0
• Correspondingly, the distribution function P(t) is
  P(t) = ∫₀ᵗ p(x) dx = 1 − exp(−ρ(q)·t) for t ≥ 0, and P(t) = 0 for t < 0
• When it leaves q, the next state is determined as in the underlying discrete-time probabilistic finite-state automaton.
• Continuous-time Markov chains generalize continuous-time probabilistic finite-state automata to any countable set Q of states.

• Example: a lamp with a lightbulb can be in one of three states: on, off, and broken. When it is off, it is turned on after 100 seconds on average; while turning on, the lightbulb breaks in 1% of the cases. When the lamp is on, it is turned off after 60 seconds on average; while turning off, the lightbulb breaks in 5% of the cases. It takes 500 seconds on average before a broken lightbulb is replaced with a new one.
(Figure: the continuous-time probabilistic finite-state automaton of the lamp, with its transition rates.)
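A simulation sketch of the lamp example, assuming exponential sojourn times with rates equal to the reciprocals of the average durations, and assuming (our choice, not stated in the slides) that a replaced bulb starts in the off state:

```python
import random

# Sojourn times are exponential with rate 1/(mean time in the state);
# on leaving a state, the next state follows the slide's branching probabilities.
rate   = {'off': 1 / 100, 'on': 1 / 60, 'broken': 1 / 500}
branch = {'off':    [('on', 0.99), ('broken', 0.01)],
          'on':     [('off', 0.95), ('broken', 0.05)],
          'broken': [('off', 1.0)]}          # assumption: a new bulb starts off

def simulate(horizon, seed=0):
    rng = random.Random(seed)
    state, t = 'off', 0.0
    time_in = {'off': 0.0, 'on': 0.0, 'broken': 0.0}
    while t < horizon:
        stay = rng.expovariate(rate[state])  # exponential sojourn time
        time_in[state] += min(stay, horizon - t)
        t += stay
        targets, weights = zip(*branch[state])
        state = rng.choices(targets, weights)[0]
    return {s: time_in[s] / horizon for s in time_in}

print(simulate(1_000_000))   # long-run fraction of time spent in each state
```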
• Continuous-time probabilistic automata augmented with input are the continuous-time counterpart of Markov decision processes, where the input determines the transition rates of states or transitions.
• Exercise

Probabilistic Turing machines and complexity classes
• (Similarly to the various types of Markov chains:)
• A probabilistic Turing machine can randomly choose which transition to take among those offered by the transition function.
• Randomness does not increase expressive power, just as nondeterminism does not.
• The basic idea/hope is to obtain a trade-off between a – small – error probability and a – robust – complexity improvement.

• A bounded-error probabilistic Turing machine is a probabilistic Turing machine M that computes a function F(x) of the input. For all inputs x, M halts; upon termination, it outputs the correct value F(x) with probability greater than or equal to 2/3 (or any fraction > 1/2).
• Thus, after n runs on the same input data, the probability that the majority of the outputs is incorrect decreases exponentially with n (a small Monte Carlo sketch closes the section).
• The average running time over input x:
  avg(T(x)) = (T1(x) + T2(x) + … + Tj(x)) / j
• T(n) is defined as usual.

Probabilistic complexity classes
• A time complexity measure T(n) defines the complexity class BPTIME(T(n)) of all problems that can be solved by some bounded-error probabilistic Turing machine with time complexity in O(T(n)).
• BPP (Bounded-error Probabilistic Polynomial time) is the class of problems that can be solved in polynomial time (and unlimited space) by a bounded-error probabilistic Turing machine: BPP = ∪_{k∈N} BPTIME(n^k)
• The traditional question: P = BPP?
• Unlike P = NP, the prevailing conjecture is now YES: in this case, using randomness would not produce a breakthrough in tractable problems.
• However:
  – Polynomial-time probabilistic algorithms for PRIME have been known since the 1970s, but only about thirty years later was the first deterministic polynomial-time algorithm developed.

Conclusion and hints at future developments
• Computational complexity will be used:
  – when the systems to be analyzed are modeled by abstract machines (derived from) the basic ones considered in this chapter;
  – but also when we need to evaluate the analysis tools associated with the various models: we will see that most (decidable) analysis problems for non-trivial models/systems are NP-complete or worse.
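The Monte Carlo sketch announced above for the bounded-error majority vote: estimating the probability that the majority of n independent runs is wrong when each run is correct with probability 2/3. The trial counts and the values of n are illustrative:

```python
import random

# If one run of a bounded-error probabilistic TM is correct with probability 2/3,
# the majority of n independent runs is wrong with probability vanishing in n.
rng = random.Random(1)

def majority_wrong(n, trials=20_000, p_correct=2 / 3):
    wrong = 0
    for _ in range(trials):
        correct_runs = sum(rng.random() < p_correct for _ in range(n))
        wrong += correct_runs <= n // 2      # majority of the n runs is incorrect
    return wrong / trials

for n in [1, 5, 15, 35]:                     # odd n, so the majority is well defined
    print(f"n={n:2d}  P(majority wrong) ~ {majority_wrong(n):.4f}")
```

The estimate drops from about 1/3 at n = 1 to a negligible value at n = 35, which is the amplification that makes the 2/3 threshold in the definition inessential.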