Looking Backwards to the Future
Tony Lawrance
Department of Statistics, University of Warwick

First of all, sincere thanks for making this such a great day for me (provisional remark…)
Especially – John Theodore, and thanks to the Statistics Department for 'sponsoring' the event

Looking backwards to the future – what does it mean?
An excuse to briefly look back on an enjoyable time in statistics, with a wish to also look forward to some more time in statistics…
Will try to pin the talk on some significant and not so significant events in my statistics life
Nearly 40 years of statistics before Warwick – so some reminiscing here for the first time may be acceptable…
In Warwick for just less than 10 years – but very enjoyable ones
Most of my publications are now on the site 'researchgate.net'

Diary of Life
Maths undergraduate in Leicester – graduated 1963
'Intimidated' into statistics by Nageeb Rahman, a Cambridge PhD student of Henry Daniels – in that, I am the two-year elder 'statistical brother' of Phil Brown
Nageeb sent me in 1963 to Aberystwyth for an MSc (and then Phil Brown in 1965) because Dennis Lindley from Cambridge had started a Stats Department there in 1960, with David Bartholomew, Mervyn Stone and Ann Mitchell
(Dennis was in Harvard for half my year, but taught frequentist inference in the second term)

Department of Statistics, Aberystwyth 1963-64
[Photograph, names as captioned: Gwyn Jones, Mike Samworth, Pgsly Gwynne, Graham Phipp, Jeff Wood, Clive Payne, ? Bambegye, Basil Springer, Eryl Basset, Richd Morton, Carol ?, Donald East, Sylvia Lutkins, David Bartholomew, Dennis Lindley, Mervyn Stone, Ann Mitchell, Peter King, Eileen ?]

The IBM 1620 Electronic Computer, Aberystwyth Stats Dept 1963
Out of bounds to MSc students

Diary of Life After MSc
Leicester October 1964 – started as a tutorial assistant, 1 year -> assistant lecturer
Frank Downton, d 1986 ?
Nageeb Rahman, d 90's ?
Mike Phillips – 1968-…
Brian English – 1969-70?
Took 4 'summers' to get a PhD, 'Stochastic Point Processes', awarded in 1969.
Started by Frank Downton giving me a sheet with a few references …
Lightly supervised by Frank Downton, who almost immediately after my arrival back in Leicester moved to Birmingham, enticed by Henry Daniels
Nevertheless, Frank Downton had a big influence in encouraging me and building research confidence…
Another big influence in supporting my career was my external examiner David Cox
So this seems a good point to get a bit more technical

PhD and Point Processes…
Time series of point events on the line – mainly Poisson and renewal processes at the time – spatial or dependent interval versions had not been much considered
I went for dependent interval versions with stationarity and first studied Cox's 1954 Biometrika paper on 'superposition of renewal processes' or 'pooled processes'
[Diagram: Process 1, Process 2 and their Superposition on a time axis]
What was the inter-point distribution and dependency of this process?
My first issue was what was meant by a 'typical event' to start an interval in a stationary point process?
I wrote to David Cox – good question, he said! "We have avoided it in my just completed Methuen monograph with Peter Lewis" on 'Series of Events' – 1966 (I hope my memory is correct!)
So after a while I investigated two ideas…

An Average Event – an interval beginning with an 'average event' in the stationary PP with intervals $X_1, X_2, \ldots$ has distribution
$P(X \le x) = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} P(X_i \le x)$
…a bit clunky
An Arbitrary Event – a more elegant approach follows from Khintchine's (1955) work on stationary input processes for queues.
This developed from 'Palm distributions', referencing Palm (1943), who introduced the idea of an interval beginning with 'at least one point' in a telephone queuing context
Thus, with $N(t, t+\tau)$ the counting variable in a stationary point process, the definition of the distribution of an interval beginning with an arbitrary event is
$P(X > x) = \lim_{\delta \to 0} P\{N(\delta, \delta + x) = 0 \mid N(0, \delta) \ge 1\}$
It turned out that this definition mathematically connected the idea of an arbitrary event with that of an arbitrary time, and involved length-biased sampling and forward and backward recurrence times – previously informal concepts for a general stationary point process
My thesis also contained work on this arbitrary event approach and on particular point processes…

Khintchine (1894-1959). Mathematical Methods in the Theory of Queueing, 1955, English Eds, 1960, 1969, Griffin
From the introduction…

Diary of Life
My First Seminar was 25 Feb 1970 at UMIST, Manchester, on 'selective interaction of point processes', one of my PhD point processes
My Most Recent Seminar reconstructed part of my first seminar at the Maurice Priestley memorial meeting, 18 December 2013…
The selective interaction model was introduced by the Dutch neurophysiologists Ten Hoopen and Reuver (1965, 1967) to explain multi-modal inter-spike distributions for dark firing of lateral geniculate neurons, observed by Bishop et al (1964)
The process can be explained as follows – you can see that I was rather keen on graphics even in those distant days… (from my thesis)
I explored it as an applied probability model.
I really wish now that I had followed up on the statistical aspects, contacting the experimenters, analysing their data, attempting to collaborate, etc, and doing simulations – but there was little electronic computing and no internet, and Holland was a long way away

(from Priestley meeting talk)
The Selective Interaction Neuron Firing Point Process Model
[Diagram: excitatory stationary stochastic point count process $N_E(t)$; inhibitory stationary interval process with intervals $I_i$ and interval density $f_I(y)$; observed response $N_R(t)$, the selective interaction process]
The model was justified empirically by a multi-modal distribution of 'times between the responses' in the 'spike trains' of observed neuron firings – convolutions of excitatory intervals
Poisson excitatory results by very detailed calculation – in my thesis
General results by appealing to the compound distribution structure of the observed response count, resulting in
$N_R(t) = N_E(t) - \sum_{i=1}^{N_I(t)} \varepsilon_i^{E,I}$, where $\varepsilon_i^{E,I} = 1$ with prob $P\{N_E(I_i) > 0\} = 1 - P\{N_E(I_i) = 0\}$, and $0$ otherwise

Continued (J Appl Prob papers 1970-71 & 1979)
It follows that, with $\lambda_E$, $\lambda_I$ the rates of the excitatory and inhibitory processes,
$E\{N_R(t)\} = \left[\lambda_E - \lambda_I \int_{y=0}^{\infty} \Pr\{N_E(y) \ge 1\}\, f_I(y)\, dy\right] t$
and approximately (?) via compound distribution results
$\operatorname{var}\{N_R(t)\} \approx [\,\cdots\,]\, t$, in terms of $E(\varepsilon^{E,I})$, $\operatorname{var}(\varepsilon^{E,I})$ and the sdevs $\sigma_E$, $\sigma_I$
Compounding the exciting process intervals using the inhibitory process to get the inter-response distribution is more difficult… but I used arbitrary events
For more detailed results when the excitatory process is Poisson, see my 4 JAP papers in the 70's.
No model fitting, no simulations – what a pity!
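A simulation of the selective interaction model is now only a few lines. What follows is a minimal sketch, not the thesis calculations, under stated assumptions: both the excitatory and inhibitory processes are taken to be Poisson (the model itself allows a general stationary inhibitory interval process), the names `poisson_times`, `selective_interaction`, `rate_e`, `rate_i` and the numerical values are hypothetical, and the deletion rule is the one implied by the compound-count formula: each inhibitory interval removes at most one excitatory event, the first one falling in it.

```python
import numpy as np

def poisson_times(rate, horizon, rng):
    """Event times of a homogeneous Poisson process on (0, horizon)."""
    n = rng.poisson(rate * horizon)
    return np.sort(rng.uniform(0.0, horizon, size=n))

def selective_interaction(exc, inh):
    """Return response times: excitatory events that survive deletion.

    Assumed rule: each inhibitory event deletes only the next excitatory
    event, so several inhibitory events with no excitatory event between
    them delete just one.
    """
    # Number of inhibitory events strictly before each excitatory event.
    k = np.searchsorted(inh, exc, side="left")
    # An excitatory event is deleted when at least one inhibitory event
    # has occurred since the previous excitatory event (or since time 0).
    deleted = np.diff(k, prepend=0) > 0
    return exc[~deleted]

rng = np.random.default_rng(1)
rate_e, rate_i, horizon = 2.0, 1.0, 20_000.0  # hypothetical values
exc = poisson_times(rate_e, horizon, rng)
inh = poisson_times(rate_i, horizon, rng)
resp = selective_interaction(exc, inh)

# Mean-count formula with Poisson E and exponential I intervals:
# P{N_E(I) > 0} = rate_e / (rate_e + rate_i).
theory_rate = rate_e - rate_i * rate_e / (rate_e + rate_i)
print(len(resp) / horizon, theory_rate)

intervals = np.diff(resp)  # inter-response times, e.g. for a histogram
```

The empirical response rate can be compared with the mean-count formula, and a histogram of `intervals` gives an empirical inter-response distribution; replacing `poisson_times` for the inhibitory process by any other stationary interval generator would be the natural next step.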
Diary Life
1970 – After PhD exam joined David Cox's weekly PP journal club at IC from Leicester; met Valery Isham, Anthony Atkinson at IC
1970 – Next move – the year 1970/71 at the 'IBM Thomas J Watson Research Center', New York, invited by Peter Lewis
Extended and consolidated PhD work by investigating branching Poisson process point models for computer failures, and co-organizing big point process conference
1972 – Returned to Leicester for 1 year – moved to Birmingham for 25 years

1973-2004 My Birmingham Years
Henry Daniels, David Wishart, Paul Davies, Phil Bertram, Roger Holder, Frank Downton, Malcolm Faddy, Alan Girling, John Copas, Chris Jones, Richard Atkinson, Frank Critchley, Prakash Patil
[Photograph: Christmas Meal 1981/82 – PhilB ?, FrankD ?, Chris Gray, AJL, AnnieM, ChrisJ, TriciaC]

Birmingham Group (when MalcolmF moved back to NZ for second time, 2003)
[Photograph: KamilaZ, WolfgangB, AlanG, PrakashP, SaidS, MalcolmF, RichardA]

1973 – Farewell Point Processes
Found research opportunities in hydrology (from teaching with Nath Kottegoda in Civil Engineering) after devising a course in hydrological time series for Bham MSc in Hydrology
RSS Read Paper on the topic with Nath Kottegoda (Stochastic Modelling of Riverflow Time Series)
Examined Jane's PhD on 'dry' rivers…
Teaching has influenced my 'choice' of research areas quite a bit, but not the reverse

1973 – Hello Time Series – as it was moving into the nonlinear era
Time series started to move away in several directions from 'Box-Jenkins' linear Gaussian models, to be able to capture more statistically varied and complex behaviour
Maurice Priestley, with non-stationary processes and spectra
Howell Tong, with dynamical-statistical thresholds
Robert Engle, Clive Granger, with volatility, co-integration
Peter Lewis et al, with specified nonGaussian models, including discrete distribution models, simulation in operations research
1980-1990 Worked on non-Gaussian time series models with Peter Lewis, by then at Naval Postgraduate School, Monterey, California
(nice summers)

Peter Lewis, 1932-2011

1978-80 – Work started with nonGaussian solutions to linear time series models: exponential, mixed exponential, gamma
1980-87 – Then ways to formulate autoregression operation with nonGaussian variables – in ways natural to the particular distribution, e.g. convolution and multiplication, minimization
1989-90 – Non-reversibility, directionality, in nonGaussian linear time series
An early linear problem – it's easy to set up … (so I describe here)

The AR(1) Innovation Problem
How to specify the error distribution for an AR(1) process with specified marginal distribution
$X_t = \rho X_{t-1} + \varepsilon_t$
Gaver & Lewis made a start with the gamma distribution but could not explicitly obtain the innovation distribution…

The AR(1) Innovation Problem – 'epsilon for given X'
$X_t = \rho X_{t-1} + \varepsilon_t, \quad 0 \le \rho < 1, \quad X_t \sim D, \quad \varepsilon_t \sim\ ??\ \text{distbn}$
Solution easy in terms of Laplace transforms – Gaver & Lewis, from
$\phi_X(z) = \phi_X(\rho z)\,\phi_\varepsilon(z), \quad \phi_\varepsilon(z) = \phi_X(z)/\phi_X(\rho z)$
Exponential($\lambda$) solution clear: $\varepsilon_t = 0$ with proby $\rho$, $\varepsilon_t = E_t(\lambda)$ with proby $1-\rho$
Gamma solution –> $\phi_X(z) = \left(\frac{\lambda}{\lambda+z}\right)^k$, so $\phi_\varepsilon(z) = \left(\frac{\lambda+\rho z}{\lambda+z}\right)^k$
Can you invert this Laplace transform without serendipity?
'Consider a shot noise process in continuous time', of course…
$\varepsilon_t = \sum_{i=1}^{N} Y_i\, \rho^{U_i}$, $N \sim \text{Poisson}(k \log \rho^{-1})$, $U_i \sim \text{uniform}(0,1)$, $Y_i \sim \text{exponential}(\lambda)$
A compound Poisson distribution

Diary Life
1985 – RSS 'read paper' on nonlinear AR exponential variables, with Peter Lewis
1986 – ISI Tashkent – Very Sick!
(Time series directionality)
1986 – Began teaching inference in Bham – beginning of regression diagnostics
1986 – Seconded RSS vote of thanks at Cook's 1986 local influence read paper, and showed how it applied to regression transformation diagnostics
1988 – JASA paper on regression transformation local influence diagnostics
1988 – Got chair in Bham (poster of inaugural lecture)
1989 – Papers on regression transformation score statistics; Biometrika papers 1987, 1989-ACA
1991 – IMA Minnesota Robustness & Diagnostics workshop (photos Anthony, Frank)
1981-1991 – Tim Davis PhD collaboration 'Survival of Tyres', Dunlop-Sumitomo-Ford; Tim Davis PhD 1991
1995 – Regression diagnostics – Cook's bivariate & conditional distance; Gary Brown PhD 1995
1995-98 – Engine mapping, with Tim Davis; Tim Holiday PhD 1996; Technometrics paper 1998
1992- Statistical aspects of chaos
1998- Chaos-based communications took over my research & publication

A trip across Minnesota and Iowa with Anthony Atkinson and Frank Critchley to Spillville, Iowa, to visit Dvorak connections, 1991, on the workshop rest day…
Dvorak's 'American Quartet' (String Quartet in F Major, op 96) composed here in 1893; also the String Quintet in E Flat Major, op 97 (sometimes called the 'Spillville Quintet'), and, after returning to NY, his Humoresque No 7 in G Flat Major
[Photograph: Spillville, Iowa 1991]

1992-2010 Statistical aspects of chaos, leading to Chaos-based communications
Chaos – instabilities produced by a deterministic rule
Collaborators: Bala Balakrishna, Alexander Baranovsky, Tohru Kohda, Gan Ohama, Rodney Wolf, Theodore Papamarkou, Nancy Spencer, Atsushi Uchida, Chibisi Chima-Okereke
'What got me started'… the Uniform Distribution Solution to the AR(1) Process – Bartlett's last paper, probably (another case of the AR(1) innovation problem)
$U_t = \frac{1}{k} U_{t-1} + \varepsilon_t, \quad \varepsilon_t = \frac{i-1}{k} \text{ wp } \frac{1}{k}, \quad i = 1, 2, \ldots, k$
Where is
the chaos from this model?
The reverse of this model is the following chaotic and deterministic model
$U_{t-1} = (k U_t) \bmod 1$
a deterministic rule called a chaotic shift map ~ like a continuous congruential random number generator
And, incidentally, there is a negatively correlated version reversing to
$U_{t-1} = \{k(1 - U_t)\} \bmod 1$
It follows, and more generally, that deterministic chaotic processes have statistical properties, i.e., there are statistical properties of chaos
Such ideas prompted some electronic engineers to have the idea of 'communicating with chaos' – instead of communicating with sinusoidal radio waves

A particular chaos communication system using a chaotic map is
Chaos Shift Keying (CSK) – 'Coherent' Case – simplest
Transmit one bit $b = \pm 1$
Chaotic spreading: $X_i = \tau(X_{i-1})$, giving $\{X_i\}_{i=1}^{n}$ ($\tau$ the chaotic map)
Signal: $b(X_i - \mu)$, $i = 1, 2, \ldots, n$ ($\mu$ the mean of the spreading sequence)
Channel noise: $\{\varepsilon_i\}_{i=1}^{n}$
Received signal: $R_i = b(X_i - \mu) + \varepsilon_i$, $i = 1, 2, \ldots, n$
Also available in coherent case: $\{X_i\}_{i=1}^{n}$
Decoder bit = $\hat{b}$ (estimate of $b$)
Exact theory for bit error rate of such a system, Lawrance & Ohama (IEEE, 2002): BER is an integral of the form $\int_c^d \Phi(\cdot)\, f(x)\, dx$, with argument involving $\sum_{i=1}^{n} \{\tau^{(i-1)}(x) - \mu\}^2$

Performance of CSK
Assessed by bit error rate (BER)
Depends on statistical aspects of the system as well as the dynamics, according to previous formula
[Figure: different types of chaotic spreading compared to IID Gaussian. Worst: IID Gaussian; then shift map, logistic map; best: circular map and theoretical lower bound]
Optimum circular map spreading: Ji Yao, T Papamarkou
Area has moved on from chaotic-map and electronic circuitry chaos to laser-chaos communication; this is still a research area, but with several experimental demonstrations and US military applications
Current work with Atsushi Uchida and Chibisi Chima-Okereke

A Brief Diversion – In the Press…
Police Mergers 2006 – the misuse of statistics
[Figure: Total Score by Force (excluding London); Total Score against Force Size (Officer Strength), with a fitted line, a horizontal line at 63 ('Line equates to an average score of 3'), a marker at force size 4000, and the annotations 'p is significant to the 0.01 level' and R2 = 0.5809]
Charles Clarke, Home Secretary
Government O'Connor Report said: 'This strongly suggests that forces with over 4,000 officers (or 6,000 total staff) tend to meet the standard across the range of services measures in that they demonstrate good reactive capability with a clear measure of proactive capacity…'.

What I said about the O'Connor Report:
[Figure: the same plot of Total Score by Force (excluding London) against Force Size (Officer Strength)]
What this plot shows to me is:
Rather rough upward scatter of points
Least squares line is misleading because of extremes
Large variability at each force size – very important
Line at 63 shows most forces 'fail' – artifact of scoring and choice of '3'
Meaningless statistical elaborations of p-value and R-squared due to automatic use of software
No justification of 4,000 figure

Another example of rubbish in the O'Connor report
What I said about this plot:
'This is an almost perfect example of how not to present a graph – no scales on either axis, no data plotted to justify the lines drawn.
It is almost impossible to obtain any critical understanding from it, except that it is intended to prove that score for protective capability increases with force size'
[Figure: Overall Trend for Protective Services; Score against Force Size (smallest from left), with lines for Serious & Organised, Public Order, Critical Incidents, Civil Contingencies, Roads Policing, Major Crime, CT & DE]

What was said in the House of Commons:
MP David Davis: …Frankly, the best that I can do is to repeat to the House the coruscating opinion of Professor Lawrance, a professor of statistics at Warwick University…
MP Adrian Bailey: …I rather regret the attempt by the University of Warwick to rubbish the statistical basis and the credibility of that report. It has a good pedigree and I shall make my judgement on the balance of professional police opinion, rather than on the opinion of university professors in Warwick…
Another newspaper appearance…

A Publication in 'The Sun'… – 14th October 2013
A MATHS professor has told The Sun bills are so complicated even he can't understand them. Tony Lawrance, right, of Warwick University said: "They're absurdly over-complicated. Most professors would find them difficult to understand – the public doesn't stand a chance."

Chaos-based Communications 2001 – 2014 - ??
Collaborators: Bala Balakrishna, Gan Ohama, Rachel Hilliam, Ji Yao, Theodore Papamarkou, Chibisi Chima-Okereke, Atsushi Uchida
[Photograph: With Bala Balakrishna, Cochin University, Kerala]
Current work: laser-chaos-based communications (laser = light amplification by stimulated emission of radiation)
Key laser features of laser-based communication
1. Lasers can produce chaotic waves which look stochastic (use semiconductor laser with optical feedback)
2. Lasers producing chaotic behaviour can be synchronized by a trigger signal
A message is hidden in a segment of the chaotic laser sequence – steganography, rather than cryptography, when a message is visible but has to be decoded

Current Work-1: Laser-based Chaos Communication
Experimental data via collaboration with Atsushi Uchida, Saitama University, Tokyo, and analysis collaboration with Chibisi Chima-Okereke of ActiveAnalytics, Bristol
Experiment set up to probe chaos shift-keying system of communication using semiconductor lasers with optical feedback and transmission through 60m fibre optic cable
Each set of data consists of three time series of 10m values
[Diagram labels: binary message b and binary message]
Experimental setup not quite so simple as it may have seemed…

Some Experimental Results
[Figure: Adjusted Received and Synchronized Laser Signals (5,000,001:5,000,500) against Time Index - 5m; example of laser synchronization]
[Figure: density plots against Intensity – Intensity of Drive Laser, and Adjusted Optical Noise]
Is Optical Noise Independent?
Optical Noise based on post-processing for instrument effects – Noise not Gaussian

Distribution of Optical Noise Conditional on Driver Signal Strength
[Figure: Boxplots of Optical Noise versus Drive Signal Strength]
N.B.
BER v SNR plot under development, but initial work indicates acceptable values can be obtained using range of SNR controlled by range of spreading

Current Work-2: Volatility Modelling and Exploratory Graphics
Topic comes from teaching financial time series in the Financial Mathematics masters program
Financial time series 'means' volatility modelling
Volatility is changing conditional variance $\operatorname{var}(X_t \mid X_{t-1})$ in a time series
Motivation – volatility models are routinely used without justification of the type of volatility structure existing in the data series
But it has not been clear how to reveal volatility structure
Attitude has been 'fit the model you think will be ok and undertake some general tests of its fit' – but never obtain the empirical volatility and compare it with the model volatility
My attitude is 'get an empirical version of the volatility function and choose a model which gives a good volatility fit, i.e. get the volatility right first'
Maybe not the purest of likelihood approaches – but surely volatility is the most important aspect of volatility models!
The General Volatility Model to be used
$X_t = \mu(X_{t-1}) + \sigma(X_{t-1})\,\varepsilon_t, \quad \varepsilon_t \sim \text{IID}(0,1)$

FTSE100 Daily Data, 4th Jan 2005 – 10th Feb 2011
[Figure: Daily Adjusted Closing Values (FTSE Values) and Daily Returns (%) against Daily Date]

Journal of the Royal Statistical Society, Series C, Applied Statistics (2013) 62, Part 5, pp. 669-686
Volatility Graphics
Based on the general volatility model for returns
$X_t = \mu(X_{t-1}) + \sigma(X_{t-1})\,\varepsilon_t, \quad \varepsilon_t \sim \text{IID}(0,1)$, with $\sigma(X_{t-1})$ the volatility function
Graphics Steps
calculate $\nu_t = |x_t - \hat{\mu}(x_{t-1})|$ (unscaled individual volatilities; $\hat{\mu}$ nearly constant with returns)
$\text{smo}(\nu \mid x_{t-1})$ (smoothed unscaled individual volatilities)
Smoothed & scaled i-volatilities give empirical version of volatility function
$\hat{\sigma}(x_{t-1}) = \kappa\, \text{smo}(\nu \mid x_{t-1})$
with scaling $\kappa$ chosen to give standardized innovations
$\kappa = \left\{ (n-1)^{-1} \sum_{t=2}^{n} \left[ (x_t - \hat{\mu}(x_{t-1})) / \text{smo}(\nu \mid x_{t-1}) \right]^2 \right\}^{1/2}$

Scaled Individual Volatilities and Their Smooth
[Figure: Volatility against Previous Return, with the empirical volatility function marked]
(see 2013 JRSS 'C' paper for more details)

Bootstrapping the Volatility Function
[Figure: Volatility against Previous Return]

That's Enough, except…
The one nice thing about getting older is that younger people follow you…

Many thanks