The Mathematics and Algorithmics of Process Detection
George Cybenko
Dartmouth College, Hanover NH 03755 USA
gvc@dartmouth.edu
Cybenko IPAM 27-7-2005

Acknowledgements
Active Members: George Bakos, Alex Barsamian, Marion Bates, Vincent Berk, Wayne Chung, Valentino Crespi (Cal State LA), George Cybenko, Ian deSouza, Annarita Giani, Doug Madory, Glenn Nofsinger, Yong Sheng, William Stearns
Alumni: Naomi Fox (UMass, Ph.D. student), Hrithik Govardhan (Rocket), Robert Gray (BAE Systems), Diego Hernando (UIUC, Ph.D. student), Guofei Jiang (NEC Research), Alex Jordan (BAE Systems), Han Li (China), Josh Peteet (Greylock Partners), Chris Roblee (LLNL)
Research Support: DARPA, DHS, ARDA, ISTS, I3P, AFOSR, Microsoft

Outline
• Background and basics
• Software and Applications
• Theory
• Summary

An Example of a Process
A “Process” Model
[Figure: transition diagram with two states, 1 and 2, and observables a and b]
• Two states: { 1 , 2 }
• Two observables: { a , b }
• Legal transitions between states are depicted by arrows.
• When occupying a state, the process emits an observable.
• All states are initial/start states and there are no terminal states.
• Some legal sequences of observables: abbab, bababbb, abbb
• Some illegal sequences of observables: aa, baab
Further reading: Automata Theory, Regular Languages, etc.

A More Complex Process
Another “Process” Model
[Figure: transition diagram with three states, 1, 2 and 3, labeled with observables a,c / b / a,c]
• Three states: { 1 , 2 , 3 }
• Three observables: { a , b , c }
• Some legal sequences of observables: abab, babaccab, ab
• Some illegal sequences of observables: bb, baabb
Problem: Given a sequence of possible observations, is it legal? What states?
Solution:
1. Read the first observable, mark states that emit that observable.
2. Read an observable, z.
3. New marked states = (states reachable from old marked states) intersected with (states that could have emitted z).
4. If no new marked states, illegal sequence; else go to 2.

Two Simple Processes
[Figure: Model Instance A (states A1, A2; observables a, b) and Model Instance B (states B1, B2; observables a, b); annotations “a track” and “a hypothesis”]
• aabb is a legal observation sequence.
• A1 B1 A2 A2, A1 B1 A2 B2, B1 A1 B2 B2, ... are all legal state sequences.
• The same sequences grouped by model instance: A1 A2 A2 B1, A1 A2 B1 B2, A1 B1 B2 B2
• We can reduce this to a single process.

Multiple Process Representation
[Figure: Model Instance A and Model Instance B (states A1, A2 and B1, B2; observables a, b) combined into a product model]
$M = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}$, $M \times M = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}$
If the observation sequence is aaaaaa and multiple copies of the model are allowed, then we get a product model of size $2^n$.

Multistage Process Model
[Figure: state diagram with states Start/Normal, Scanned, Infected, Data Access and Exfiltration; transitions are marked as potential malicious activity or potential normal activity]

Extensions: Hidden Markov Model (HMM)
• Add probabilities: p(a|1) = 0.8, p(c|1) = 0.2; p(b|2) = 1; p(a|3) = 0.8, p(c|3) = 0.2
• Copies of states: k copies, one for each time step t=0, t=1, t=2, t=3, ..., t=k-2, t=k-1
• Take logs of probabilities so this is a shortest path problem and can use dynamic programming (Viterbi algorithm).
[Figure: three-state diagram with transition probabilities 0.8, 0.2, 1, 0.5, 0.5, unrolled into a trellis of k copies of the states]

Hidden Process Models
[Figure, from top to bottom:]
• Observations missed, noise added, unlabelled (this is what we see): aba cfkhdcbgdbkhagda
• Observations are interleaved: a b c c f h d cc a b g d b a g d a
• Observations related to state sequences: abcdabbada, cfhccgdg
• Underlying (hidden) state spaces: Model 1 through Model n, with states emitting sets such as {f, g}, {a, c}, {a, b}, {c, d}, {e}, {f, c}, {c, d}, {h}

Terminology and Summary
• Processes have states.
• The states are hidden.
• States emit observables that are possibly not unique to a state.
• Observables are not labeled, can be noisy and might be dropped.
• Multiple processes might be instantiated.
• The problem is to determine which processes are possible and which states those processes can be in.
• Multiple process detection can be reduced to single process detection at the expense of exponential growth.
• Tracks are associations of observations to processes.
• Hypotheses are consistent tracks that explain all the observables.

Discrete Source Separation Problem (viz. Blind Source Separation, “Cocktail Party” Problem)
• Process/Model example: 3 states + transition probabilities; n observable events: a,b,c,d,e,…; Pr( state | observable event ) given/known
• Observed event sequence: ….abcbbbaaaababbabcccbdddbebdbabcbabe….
• Which combination of which process models “best” accounts for the observations? This is what we want to compute.
• Events not associated with a known process are “anomalies”.
[Figure labels: Catalog of Processes/Models, A Track, A Hypothesis]

A Simple Example of Process Detection
NETWORK WORM MODEL (NW): states A, B, C, D emitting {a}, {b}, {b,c}, {c,d} (a,b,c,d ICMP traffic levels)
ROUTER FAILURE MODEL (RF): states E, F emitting {a}, {b}
• a,b,c,d are events that can be observed
• states A, B, C, D, E, F are hidden
• observe a sequence of events
• Which process or combination of processes explains the observed events?
Sequence: Hypotheses
• ab: NW | RF
• abab: (NW & NW) | (RF & NW) ...
• ababc: (NW & RF) | (NW & NW)
• ababcc: NW & NW
Two models; states have different semantics; sets of observables intersect. What is the “diagnosis”?

Detecting a Process Using Rules
WORM MODEL: states A, B, C, D emitting {a}, {b}, {b,c}, {c,d} (a,b,c,d ICMP traffic levels)
    A,B,C,D = 0
    repeat
        read event e
        if e==a then A
        if A and e==b then B
        if B and (e==b or e==c) then C
        if C and (e==c or e==d) then D
    until D
ROUTER FAILURE MODEL: states E, F emitting {a}, {b}
    E,F = 0
    repeat
        read event e
        if e==a then E
        if E and e==b then F
    until F
What does “ab” mean?
(Process ambiguity)
What does “ac” mean? (Missed detections)

Rules for Process Disambiguation
WORM MODEL: states A, B, C, D emitting {a}, {b}, {b,c}, {c,d} (a,b,c,d ICMP traffic levels)
    A,B,C,D = 0
    repeat
        read event e
        if e==a then A
        if A and e==b then B
        if B and (e==b or e==c) then C
        if C then (E=0, F=0)
        if C and (e==c or e==d) then D
    until D
ROUTER FAILURE MODEL: states E, F emitting {a}, {b}
    E,F = 0
    repeat
        read event e
        if e==a then E
        if E and e==b then F
    until F
Cannot decide which process is instantiated until more data arrives.

Rules for Missed Detections
WORM MODEL: states A, B, C, D emitting {a}, {b}, {b,c}, {c,d} (a,b,c,d ICMP traffic levels)
    A,B,C,D = 0
    repeat
        read event e
        if e==a then A
        if A and e==b then B
        if A and e==c then C,D
        if A and e==d then D
        if B and (e==b or e==c) then C
        if C then (E=0, F=0)
        if C and (e==c or e==d) then D
        if D then (E=0, F=0)
    until D
This clearly does not scale and does not lead to manageable sets/systems of rules.

Complexity of Rule-Based Systems for Multiple Process Detection
• m process models, each with n states
• Potentially as few as mn state transitions in the original models
• Potentially need to add:
    O(m^2 n^2) rules for disambiguation
    O(m n^2) rules for missed detections
• These are “overhead” processing steps that can be done generically, not by the decision tree or rule set.
• Process Query System software handles this overhead processing.

Approaches to Detecting Processes
• Aristotelian: Traditional information retrieval is based on specification of a query in terms of Boolean expressions based on record fields, e.g. SQL ( name = “smith” & age > 20 & age < 40 ), rule-based logics, decision trees, etc.
• Newtonian: Next generation process detection requires retrieval based on specification of a set of discrete, dynamic processes, e.g. descriptions of a Hidden Markov Model, Hidden Petri Net, weak models, FSMs, attack trees, etc.
Main Concept: Move from an Aristotelian to a Newtonian Paradigm.
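To make the model-based ("Newtonian") style concrete, the following is a minimal Python sketch, not the PQS implementation. It writes the worm (NW) and router failure (RF) models from the earlier slides as data and runs the generic state-marking procedure on them, in place of hand-written rules. The names `WeakModel`, `mark`, `explain` and `CATALOG` are hypothetical, and the transition structure is an assumption: each model is treated as a simple chain (A to B to C to D, and E to F) with no self-loops, which the slides do not state explicitly.

```python
from typing import Dict, Iterable, List, Optional, Set


class WeakModel:
    """A process model given as data: emissions per state and one-step reachability."""

    def __init__(self, emits: Dict[str, Set[str]], next_states: Dict[str, Set[str]]):
        self.emits = emits              # state -> observables it can emit
        self.next_states = next_states  # state -> states reachable in one step

    def mark(self, observations: Iterable[str]) -> Set[str]:
        """State marking: return the states the process could occupy after the
        whole sequence, or an empty set if the sequence is illegal."""
        marked: Optional[Set[str]] = None
        for z in observations:
            can_emit = {s for s, obs in self.emits.items() if z in obs}
            if marked is None:
                # First observable: every state is a start state.
                marked = can_emit
            else:
                reachable: Set[str] = set()
                for s in marked:
                    reachable |= self.next_states.get(s, set())
                marked = reachable & can_emit
            if not marked:
                return set()
        # An empty observation sequence leaves every state possible.
        return marked if marked is not None else set(self.emits)


# ASSUMPTION: linear chains without self-loops (not stated on the slides).
NW = WeakModel(
    emits={"A": {"a"}, "B": {"b"}, "C": {"b", "c"}, "D": {"c", "d"}},
    next_states={"A": {"B"}, "B": {"C"}, "C": {"D"}, "D": set()},
)
RF = WeakModel(
    emits={"E": {"a"}, "F": {"b"}},
    next_states={"E": {"F"}, "F": set()},
)
CATALOG = {"NW": NW, "RF": RF}


def explain(sequence: str, catalog: Dict[str, WeakModel]) -> List[str]:
    """Names of the single models that can account for the whole sequence."""
    return [name for name, model in catalog.items() if model.mark(sequence)]
```

Under these assumptions, `explain("ab", CATALOG)` names both models (the process ambiguity case), while `explain("ac", CATALOG)` names neither, because the sketch does no missed-detection handling. Interleaving several process instances, as in the "abab" hypotheses, is deliberately out of scope here.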

Examples of Process Detection Problems
• Is there an unusual pattern of computer network events, host activities, system calls, etc.? (Network and computer security)
• Is a complex infrastructure (telecom, electricity, financial networks) operating normally or in a failure mode? (Critical Infrastructures)
• Is my software operating normally? (Autonomic computing)
• What biological pathways/processes are engaged? (Molecular Biology)
• Is there an unusual pattern of document accesses within an enterprise document control system? (Insider Threat Detection)
• Does a group of unusual transactions constitute a threat? (Homeland Security)
• Has the physical border/perimeter been breached? (National and industrial physical intrusion detection)
• Is there a large ground vehicle convoy moving towards our position? (Tactical military)
• What’s going on around me? (Human Cognitive Processing)
IMPORTANT: All are “adversarial” situations, not cooperative, so the observations are not necessarily labeled for easy identification and association with a process!

Related Disciplines
• “Weak” Models. Underlying model: Finite State Machines. Algorithms for single processes: State marking. Multiple processes: Process Query Systems.
• Hidden Markov Models. Underlying model: Markov Chains, Shannon Channel. Algorithms for single processes: e.g. Viterbi algorithms. Multiple processes: Process Query Systems.
• Linear State Space Systems. Underlying model: x’ = Ax + Bu, y = Cx + Dv, with u, v Gaussian noise. Algorithms for single processes: Kalman Filtering. Multiple processes: Process Query Systems.
• Multiple Target Tracking. Underlying model: Any. Algorithms for single processes: Not applicable. Multiple processes: Multiple Hypothesis Tracking (MHT).

Software and Applications
• Sensor networks
• Airborne plume detection
• Cyber security
• Server pool management
• Dynamics of social networks*
• Genomics and biological pathways*
• Human situation awareness*
*In process or planned.
[Figure: plot of Total Successful Requests versus Time (s)]

Process Query Systems (PQS)
Process Query Systems solve the Discrete Source Separation Problem in a generic way:
• Inputs: a sequence of unlabelled observations (stream, logfiles, etc.); a collection of process models
• Outputs: estimates of which processes produced those observations; estimates of which states those processes are in
Basic theory and technology have been developed by the PQS team at Dartmouth and are now being applied to a variety of applications.

Algorithms/Operations of PQS
[Figure: PQS processing loop, recursive in time, with numbered steps 1 to 5. Labels: Build or Learn Models (1); Subscribed Data Arrives; Update Tracks Within Hypotheses (Viterbi / Kalman / NDFA, etc.) and Create New Hypotheses (3); Manage Hypotheses (MHT); Evaluate Solutions and Process Outputs (5); a Hypothesis Pool holding Hypothesis 1 through Hypothesis n, each a set of tracks]

Software: Process Query System
One platform, many applications
[Figure: Generic Process Query System surrounded by application labels (DISCUS, Cyberlog Analysis, Vehicle Tracking, Attacks on utilities, PQSnet.net, Plume detection, Computer Security, Robust Server Pooling, Sensor networks) and sponsor labels (DHS, DARPA, ARDA)]

The COBOL and pre-PQS Analogy
Pre-SQL Programs (interwoven logic):
    … application logic statement 1; application logic statement 2; file management statement 1; record management statement 1; file management statement 2; record management statement 2; application logic statement 3; record management statement 3; file management statement 3; application logic statement 4; …
Post-SQL Programs (application logic + database management system):
    User responsibility (application logic): … application logic statement 1; application logic statement 2; SQL statement 1; application logic statement 3; SQL statement 2; application logic statement 4; …
    System responsibility (database management system): … file management operation 1; record management operation 1; file management operation 2; record management operation 2; record management operation 3; file management operation 3; …
Current Process Detection Programs (interwoven logic):
    … model logic statement 1; model logic statement 2; sensor access statement 1; state estimate statement 1; sensor access statement 2; state estimate statement 2; model logic statement 3; sensor access statement 3; state estimate statement 3; model logic statement 4; …
PQS-based Programs (model description + process query system):
    User responsibility (model description): … model description statement 1; model description statement 2; model description statement 3; model description statement 4; …
    System responsibility (process query system): … sensor access statement 1; state estimate statement 1; sensor access statement 2; state estimate statement 2; sensor access statement 3; state estimate statement 3; …

Computer Security Example
(V. Berk and N. Fox)
Funded by ARDA and DHS

Network Security
Objective: Detect, disambiguate, and predict the course of concerted network attacks in an enterprise class network.
Why: The problem domain demands the power of PQS
• Hundreds of “processes” occurring at once
• Lots of missed observations and noise
• All commercial technology focuses on collection and presentation of data
• Existing correlation efforts are very weak at best

Goal of PQS in network monitoring
Create a system that quickly and accurately correlates related activity.
Assist a security analyst in deciding:
• What activity is irrelevant.
• What activity needs attention and further investigation.

SENSORS
INTEGRATED SENSOR: DESCRIPTION
• DIB:s: Dartmouth ICMP-T3 Bcc: System
• CovChan: Timing Covert Channel Detection
• Snort: Signature Matching IDS
• IPtables: Linux Netfilter firewall, log based
• Samba: SMB server, file access reporting
• Weblog: IIS, Apache, SSL error logs, …
• US-agent: Userspace host monitoring agent
• Tripwire: Host filesystem integrity checker
SCOPE (column values): Global, Network, Host

Multistage Process Model
[Figure: state diagram with states Start/Normal, Scanned, Infected, Data Access and Exfiltration; transitions are marked as potential malicious activity or potential normal activity]

PQS-Net Testbed at Dartmouth
[Figure: testbed network diagram. Labels: DIB:s, Dartmouth, Internet, CovChan, PQS-Net, IPTables, Snort, SaMBa, US-Agent, ISTS, DMZ, WWW, Mail, WS, 172.18.12.32-38, PQS, WinXP/LINUX targets, 192.168.24.192/26, www.pqsnet.net]
Attack Hosts:
• Skaion
• Custom Exploits
• Core Impact™
• Normal Traffic
• Covert Channels
• Worms

PQS-Net supply chain
Tier 1 Models: Focus on individual host status. Report on status changes.
Tier 2 Models: Focus on correlating host activity. Report chains of events.
Tier 1 Output:
    Mon Feb 21 20:06:17 2005 000000 131.58.63.160 (hostile) recon on 100.10.20.4 SNORT 469 proto: 1
    Mon Feb 21 20:30:24 2005 000000 138.158.170.45 (hostile) attacked 100.10.20.4 ERRORLOG 400 proto: 6 dport: 443
Tier 2 Output: Hypothesis 1 Score: 0.8; Hypothesis 2 Score: 0.2 (entries shown: A scans B, A scans B, B scans E, B attacks E)
Pipeline: sensors → sensor data → Tier 1 Tracker → attack steps → Tier 2 Tracker → attack sequences and scores → analyst’s front-end

Example Scenario
[Figure: network diagram with the Internet and hosts A, B, C, D, E]
Tier 1 Alerts and Indicators:
• A scans B
    Snort: 02/21-20:06:17.904500 [**] [1:469:1] ICMP PING NMAP [**] [Classification: Attempted Information Leak] [Priority: 2] {ICMP} 131.58.63.160 -> 100.10.20.4
• C attacks B (success)
    SSL error log (host 100.10.20.4):
    [Mon Feb 21 20:30:24 2005] [error] mod_ssl: SSL handshake failed (server www.osis.gov:443, client 138.185.170.45) (OpenSSL library error follows)
    [Mon Feb 21 20:30:24 2005] [error] OpenSSL: error:1406908F:lib(20):func(105):reason(143)

Example Cont’d
[Figure: network diagram with hosts B, D, E]
Tier 1 Alerts and Indicators:
• B scans D
    02/21-20:31:17.528602 [**] [1:1807:2] WEB-MISC Chunked-Encoding transfer attempt [**] [Classification: Web Application Attack] [Priority: 1] {TCP} 100.10.20.4:34074 -> 100.10.20.169:80
• B attacks D (fails)
    100.20.1.169 - - [21/Feb/2005:08:31:22 -0500] "GET /default.idq?AAAAAAAAAAA………..AAAAAAA HTTP/1.1" 404 1287 "-" "-"
• B scans E
    02/21-20:32:01.622465 [**] [1:1807:2] WEB-MISC Chunked-Encoding transfer attempt [**] [Classification: Web Application Attack] [Priority: 1] {TCP} 100.10.20.4:34076 -> 100.10.20.170:80
• B attacks E (succeeds)
    100.20.1.170 - - [21/Feb/2005:08:32:06 -0500] "GET /default.idq?AAAAAAAAAAA………..AAAAAAA HTTP/1.1" 200 1287 "-" "-"

Fish Tracking (Kinematic Tracking)
A. Jordan, W. Chung, V. Crespi
Funded by DARPA and DHS

Real time Fish Tracking
Objective: Track the fish in the fish tank
Why: Very strong example of the power of PQS
• Fish swim very quickly and erratically
• Lots of missed observations
• Lots of noise
• Classical Kalman filters don’t work (non-linear movement and acceleration)
• “Easier” than getting permission to track people (we mistakenly thought)

Fish Tracking Details
• 5 gallon tank with 2 red Platys named Bubble and Squeak
• Camera generates a stream of “centroids”: for each frame a series of (X,Y) pairs is generated.
• Model describes the kinematics of a fish: the model evaluates whether new (X,Y) pairs could belong to the same fish, based on measured position, momentum, and predicted next position.
• This way, multiple “tracks” are formed, one for each object.
• Model was built in under 3 days!!!

Autonomic Server Monitoring
(C. Roblee, V. Berk)
Funded by DHS, ARDA

Autonomic Server Monitoring
Objective: Detect and predict deteriorating service situations
Why: Another strong example of the power of PQS
• Software and hardware are buggy and vulnerable
• Hot market, large profits for “The ONE” application
• Very ambiguous observations
• Sys-admins also want vacation

The Environment
• Hundreds of servers and services
• Various non-intrusive sensors check for: CPU load; memory footprint; process table (forking behavior); disk I/O; network I/O; service query response times; suspicious network activities (e.g. Snort)
• Models describe the kinematics of failures and attacks: the model evaluates load balancing problems, memory leaks, suspicious forking behavior (like /bin/sh), service hiccups correlated with network attacks…

Server Compromise Model: Generic Attack Scenario
1. Snort NIDS sensor output
    Nov 21 20:57:16 [10.0.0.6] snort: [1:613:7] SCAN myscan [Classification: attempted-recon] [Priority: 2]: {TCP} 212.175.64.248-> 10.0.0.24
2. Monitored host sensor output (system level)
    Current system record for host 10.0.0.24 (10 records):
    Average memory over previous 10 samples: 251.000
    Average CPU over previous 10 samples: 0.970
    | time       | mem used | CPU load | num procs | flag |
    | 1101094903 | 251      | 0.970    | 64        |      |
    | 1101094911 | 252      | 0.820    | 64        |      |
    | 1101094920 | 251      | 0.920    | 64        |      |
    | 1101094928 | 251      | 0.930    | 64        |      |
    | 1101094937 | 251      | 0.870    | 65        |      |
    | 1101094946 | 251      | 0.970    | 65        |      |
    | 1101094955 | 251      | 0.820    | 65        |      |
    | 1101094964 | 253      | 1.220    | 65        | !    |
    | 1101094973 | 255      | 1.810    | 65        | !    |
    | 1101094982 | 258      | 2.470    | 65        | !    |
3. PQS Tracker Output
    Last Modified: Mon Nov 21 21:01:03
    Model Name: server_compromise1
    Likelihood: 0.9182
    Target: 10.0.0.24
    Optimal Response: SIGKILL proc 6992
[Figure labels: observations o1, o2, o3]
[Figure: timeline with times t0, t1, t2, t3, t4; labels: o1, Observations, Response, SIGKILL]

Experimental Results: No Tracking versus Tracking
[Figure: four plots comparing No Tracking with Tracking. Successful Requests: Total Successful Requests (0 to 400000) versus Time (s) (0 to 500). System Memory Consumed: % System Memory Used (0 to 100) versus Time (s) (0 to 500). Captions: 210,000 requests serviced; 380,000 requests serviced.]

Theory
Process Query System frameworks offer a principled approach that enables:
• understanding how distinguishable models (attack and failure) are
• developing a notion of processes that are “trackable,” given models and sensing infrastructure (i.e. a “sampling theory”)

Hypothesis Growth
• A “hypothesis” is a consistent assignment of events to processes and/or states (i.e. each event is assigned to only one process instance).
• Given a set of “hypotheses” for an event stream of length k-1, update the hypotheses to length k to explain the new event.
• NP-Complete in general.
• Need to prune the pool of hypotheses, keeping the most suitable.
[Figure: paths plotted against time. An individual path is a “track”, i.e. one process instance. Consistent tracks form a “hypothesis”.]

Models and Hypothesis Growth
• “Weak” model: FSM with “emission” vectors
• Emission for state i = 0/1 vector of sensor reports, e.g. obs(i) = ( 0 , 1 , 1 , 0 , 0 , 1 , 1 )
• Observation vector at time t collected by sensors, e.g. sensors(t) = ( 0 , 1 , 1 , 1 , 1 , 1 , 0 )
• Possible states at time t are determined by:
    P = { i | Hamming_distance( obs(i) , sensors(t) ) <= HD }
    R = { i | j is possible at time t - 1 and i is reachable from j }
    $P \cap R$ is the set of possible states at time t
• The number of hypotheses at time t is recursively computed as above.
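The update rule for possible states can be sketched in a few lines of Python. This is an illustrative sketch only, assuming that the two sets P and R are combined by intersection (as in the state-marking procedure from the Background slides). The function names `hamming` and `possible_states`, the second emission vector, and the reachability map are hypothetical; only obs(1) and sensors(t) are the example vectors from the slide.

```python
from typing import Dict, Set, Tuple

Vector = Tuple[int, ...]


def hamming(u: Vector, v: Vector) -> int:
    """Number of positions at which two equal-length 0/1 vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(u, v))


def possible_states(
    prev: Set[int],
    obs: Dict[int, Vector],
    reach: Dict[int, Set[int]],
    sensors_t: Vector,
    hd: int,
) -> Set[int]:
    """One update step of a weak model with a noise threshold hd."""
    # P: states whose emission vector is within hd of the sensor reading.
    p = {i for i, emission in obs.items() if hamming(emission, sensors_t) <= hd}
    # R: states reachable in one step from a state possible at time t - 1.
    r: Set[int] = set()
    for j in prev:
        r |= reach.get(j, set())
    # ASSUMPTION: possible states at time t are the intersection of P and R.
    return p & r


# obs(1) and SENSORS_T are the slide's example vectors; the rest is made up.
OBS = {1: (0, 1, 1, 0, 0, 1, 1), 2: (1, 0, 0, 1, 1, 0, 0)}
REACH = {1: {1, 2}, 2: {1}}
SENSORS_T = (0, 1, 1, 1, 1, 1, 0)
```

With the slide's vectors, the Hamming distance between obs(1) and sensors(t) is 3, so state 1 stays possible only when HD is at least 3. Raising HD admits more states at each step, which is the mechanism behind the growth in the number of hypotheses.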
Theorem: For a fixed value of HD, the worst-case number of hypotheses at time t is either polynomial or exponential in t. (Crespi, Cybenko, Jiang 2004)

[Figure: two plots of hypothesis growth with axes “Longer tracking time” and “More noise (worse model)”; one is annotated “Ouch!!!”, the other “Nice Demo!!”]

[Figure: regions labeled “Poor Models and Sensor Coverage”, “Acceptable Models and Sensor Coverage” and “Excellent Models and Sensor Coverage”, plotted against “Longer tracking time” and “More noise (worse model)”]

Basic Idea Behind the Proof
[Figure: trellis of N states at times t, t+1, t+2, ..., k]
If there are never two distinct paths from any node to itself over any period of observation, there is a simple injective mapping (i.e. unique labeling) of the paths into {0, 1, ... , k} x {0, 1, ... , k} x ... x {0, 1, ... , k} (2N times). So the number of paths is at most $(k+1)^{2N}$. The label for each path records, for each of the N states, the time the path first occupies that state and the time it last occupies that state.

Basic Idea Behind the Proof
[Figure: trellis of N states at times t, t+1, t+2, ..., k]
Process dynamics (i.e. what is reachable from each state in a time step) + observations + noise threshold determine a “trellis”. If there are two distinct paths from one node to itself over some period of time, the number of distinct paths grows exponentially by repeating the construct.

Relationship to Spectral Radius
• Classical spectral radius: $\rho(A) = |\lambda_{\max}|$
• Joint spectral radius of a set $S = \{A_1, \ldots, A_n\}$ of matrices: $\rho(S) = \lim_{t \to \infty} \max_{B_k \in S} \rho\left(\prod_{k=1}^{t} B_k\right)^{1/t}$
• Hypothesis growth is polynomial iff $\rho(S) \le 1$.
• Deciding whether $\rho(S) \le 1$ for real or rational matrices is undecidable (Tsitsiklis and Blondel, 2000).
• If S consists of 0-1 matrices, the question is decidable but NP-hard.

Distinguishability of models
Given two “models”, how distinguishable are they?
Example: How different are these two models?
Model 1: p(a|1) = 0.8, p(c|1) = 0.2; p(b|2) = 1; p(a|3) = 0.8, p(c|3) = 0.2
Model 2: p(a|1) = 0.9, p(d|1) = 0.1; p(b|2) = 1; p(a|3) = 0.8, p(c|3) = 0.2
[Figure: transition diagrams of the two three-state models, with transition probabilities on the edges]

Distinguishability of models
The goal is to answer questions such as: “Do we need to build more refined models or do we need to add additional sensors/data sources or improve tracking/hypothesis management?”

Different degrees of distinguishability between models given their sensing capabilities: 1
[Figure: red curve: probability of deciding model 2 given model 1; blue curve: probability of deciding model 1 given model 2]
• The entropies of the two ergodic models are different.
• The decision rule is based on maximum likelihood (ML) as determined by the Viterbi algorithm.
• The Shannon-McMillan-Breiman Ergodic Theorem states that “most” observation sequences are “typical” and have probability related to the entropy.

Different degrees of distinguishability between models given their sensing capabilities: 2
However, nonmonotonic behaviors are possible (in general), and without convergence to zero (if the entropies are the same).

Different degrees of distinguishability between models given their sensing capabilities: 3
However, nonmonotonic behaviors are possible (in general), and without convergence to zero (if the entropies are the same).

Unifilar models
Definition: for any state $s_i$ and input $y_j$, there is at most one successor state.
• One state sequence, one observation sequence.
• One observation sequence, at most one state sequence: if acceptable, there is exactly one state sequence; if unacceptable, there is none.
• A WM can be reduced to a DFA. Every DFA has a unique minimum-state unifilar WM: WM -> DFA -> Minimization -> WM
[Figure: example weak model with transitions T1 {0}, T2 {1}, T3 {1}, its DFA with states A and B, and the associated 0-1 matrices]
For a unifilar WM with transition matrix A, the number of acceptable strings of length n, $|L(M) \cap \Sigma^n|$, is determined by $A^{n-1}$ and, for n sufficiently large, grows like $\lambda_1^{n-1}$, where $\lambda_1$ is the maximum eigenvalue of A.
Y. Sheng's thesis gives efficient estimates of $\lambda_1$.

Summary
• Multiple process detection is a ubiquitous problem with many applications but it has not been systematically studied.
• Existing approaches are either very ad hoc, very specialized or very unscalable.
• There is a promising generic software system for solving multiple process detection.
• The theory is rich and largely unexplored.

Questions
See www.pqsnet.net for papers.