Nonlinear Time Series Analysis Applied to Resting State MEG

Alexander Kovrig

September 13, 2015
Abstract

Entropy in the context of ergodic theory is the rate of information creation in a dynamical system. Neuroscience research suggests that schizophrenics have abnormal interhemispheric function. This research attempts to characterise abnormal interhemispheric function in schizophrenics via entropy. Whereas previous research on entropy in schizophrenia has focused on whole brain entropy, this research distinguishes between entropy in the left hemisphere and entropy in the right hemisphere. The data consists of four minute resting state MEG recordings. Transforming the time series into a path in an abstract embedding space, the topological entropy is estimated from an incidence matrix. Comparing with controls, it is found that entropy does not distinguish interhemispheric function in schizophrenics from controls, and that right hemisphere entropy is higher across the whole population. This approach shows that topological entropy is not the same in the two hemispheres across the whole population.
Contents

1 Introduction
2 Theoretical Foundations of Attractor Reconstruction
  2.1 Whitney's Theorem and Takens' Theorem
  2.2 Singular Spectrum Analysis
3 Applications of Ergodic Theory to the Life Sciences
  3.1 Dynamical complexity and pathological order in the cardiac monitoring problem (1987)
  3.2 Application of entropy measures derived from the ergodic theory of dynamical systems to rat locomotor behavior (1990)
  3.3 Dynamical entropy is conserved during cocaine-induced changes in fetal rat motor patterns (1996)
  3.4 Intermittent Vorticity, Power Spectral Scaling, and Dynamical Measures on Resting Brain Magnetic Field Fluctuations (2011)
4 MEG Time Series Analysis
  4.1 Viewing the data in MATLAB with the FieldTrip toolbox
  4.2 Topological entropy and measure entropy
  4.3 Results
5 Conclusion
1 Introduction
The mathematical background to this work is covered in my essay, An Intuitive Guide to the Ideas
and Methods of Ergodic Theory for the Life Sciences. Professor Mark Pollicott of the University
of Warwick has been my mathematical advisor. This work is part of an ongoing research project
with Professor Arnold Mandell at UCSD.
The purpose of this thesis is to apply the methods of ergodic theory and nonlinear time series
analysis to MEG brain scan data. In particular, I seek to assess whether entropy can distinguish
functional interhemispheric differences between medicated schizophrenics and controls.
In the context of ergodic theory, entropy is the rate at which information is produced as
time passes. The word ‘ergodic’ was coined in the context of statistical mechanics by Boltzmann
from the Greek ergon, ‘work’ and odos, ‘path’. Here, the thermodynamical concept of ‘work’
is replaced by the concept of ‘information’, and we study the paths of information creation
within a system’s space of possible states. Intuitively, an ergodic system is one which cannot
be decomposed into two independent subsystems. Ergodicity is an expression of considering a
holistic set of phenomena such as the brain as a single system.
First I describe the theoretical foundation of attractor reconstruction. Then, I review current
applications of ergodic theory to the life sciences. Finally, I describe the methods and results
of MEG analysis in MATLAB, with a focus on how to calculate the topological entropy. The
methodology is adapted from Mandell’s work [17], the innovations here being a focus on interhemispheric rather than whole brain activity and a sparse data representation to improve
MATLAB memory usage. I also point to the potential of measure entropy (a.k.a. metric entropy,
measure-theoretic entropy) to give more accurate results.
2 Theoretical Foundations of Attractor Reconstruction
The main quantities I seek to apply to brain scan data are entropy, the leading Lyapunov exponent,
and the capacity dimension. The entropy is the rate of information production. The Lyapunov
exponent estimates the rate of expansion along the unstable manifold of the dynamical system - in
other words, the rate at which initially close points may become distant. Both of these quantities
have units of [time⁻¹]. The entropy could be considered as a rate in bits per second, where bits
are a unitless measure of information. The capacity dimension estimates the size of the attractor
of the dynamical system in an embedding space. Here I focus on the entropy.
The ability to estimate such quantities on time series data is predicated on the theoretical
possibility of reconstructing the attractor from a delayed time series. In the section on MEG time
series analysis, this method of delays is implemented in MATLAB code to construct an incidence
matrix from which the entropy is estimated. The notes on Whitney’s and Takens’ theorems
attempt to give some theoretical background.
Ancillary quantities I seek to apply are the series of leading eigenfunctions and their Morlet
wavelet transformation. Some background on this is given in the notes on singular spectrum
analysis.
For an eloquent discussion of these subjects, see Holger Kantz’s book [10].
2.1 Whitney's Theorem and Takens' Theorem
An embedding is a map that places one mathematical structure inside another.
Whitney’s embedding theorem states that a smooth finite m-dimensional manifold M can be
embedded in a Euclidean space R^n where n ≥ 2m + 1. Takens' delay embedding theorem describes how a dynamical system can be reconstructed from its time series. It effectively says that Whitney's theorem has practical relevance for the analysis of real world data. Takens' theorem states that the delays of a time series provide an embedding for the dynamical system that is generating the time series.

Φ_{(φ,y)}(x) = (y(φ(x)), ..., y(φ^{2m+1}(x)))

φ : M → M,   y : M → R,   Φ_{(φ,y)} : M → R^{2m+1}
Here, φ is the time evolution of the dynamical system; φ is what we don’t know and would
like to reconstruct. Our time series is y, a projection of the dynamics onto one axis. The function
Φ(φ,y) (x) is a correspondence between points on the manifold and vectors composed of time series
points.
For example, consider a dynamical system whose attractor is on a two-dimensional torus in phase space. According to Takens' theorem, this can be reconstructed in a five-dimensional Euclidean space. A point in R^5, i.e. a five-component vector whose components are points of the time series, identifies a point on the torus in the underlying phase space. If our time series has a million points, then for every block of five points we'll get a point of the torus. Since we can time-shift a five-point window along the time series, this will give (10^6 − 5) points on the torus.

George Sugihara's videos illustrate how Takens' theorem works: they are available as a supplement to his paper 'Detecting Causality in Complex Ecosystems' at http://www.sciencemag.org/content/338/6106/496/suppl/DC1
Figure 1: An illustration of Takens’ embedding theorem. The Lorenz attractor is reconstructed
in three dimensions from three delayed copies of a single time series. From George Sugihara’s
aforementioned video supplement.
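To make the method of delays concrete, here is a minimal MATLAB sketch (my own illustration, not code from this project). A scalar series is generated from the logistic map purely so that the script is self-contained; the delay tau = 1 and the dimension m = 3 are assumed values for illustration.

% Minimal sketch of the method of delays (illustrative only).
N = 5000;
y = zeros(N, 1);
y(1) = 0.4;
for t = 1:(N - 1)
    y(t + 1) = 4*y(t)*(1 - y(t));    % logistic map at r = 4, a chaotic stand-in
end

tau = 1;    % delay; in practice chosen e.g. from the first zero of the autocorrelation
m = 3;      % embedding dimension, as in Figure 1

nRows = N - (m - 1)*tau;
X = zeros(nRows, m);    % each row is a delay vector (y(t), y(t+tau), y(t+2*tau))
for k = 1:m
    X(:, k) = y((1 + (k - 1)*tau):(nRows + (k - 1)*tau));
end

plot3(X(:, 1), X(:, 2), X(:, 3), '.');    % the reconstructed attractor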
For practical applications, Takens’ theorem requires an estimate of the dimensionality m of
3
the dynamical system being studied. For example, what is the dimensionality1 of the human
brain’s activity as measured by an EEG recording? The EEG is measuring electrical activity
which mostly comes from neurons, and there are on the order of 2 × 10^10 of them. Even with the
simplifying assumption that each neuron is a one-dimensional system, this still gives a dynamical
system with several billion dimensions. At the same time, the EEG can be such a coarse-grained
measurement apparatus that it might be completely insensitive to detail at the level of individual
cortical columns, let alone particular neurons, and the overall activity may be constrained to
lie on a much lower-dimensional manifold. Since there are four lobes in each hemisphere, and
functional specialisation goes much smaller than the whole lobes, it would be surprising if the
actual dimensionality were much below eight.
2.2 Singular Spectrum Analysis
An estimation of m as somewhere between eight and several billion is not very encouraging. Fortunately, David Broomhead and Gregory King showed how to estimate m from time series data using singular spectrum analysis,² which is in the same circle of ideas as principal component analysis. It is a principal component analysis in the context of signal processing, where the rows of a covariance matrix are delays of a single time series and where the method is applied locally to point clouds.³
Takens’ theorem does not specify a time scale or embedding dimension. The assumption
is that successive measurements contain new information whatever the time interval between
them, which is not true for finite precision measurements. Requiring 2m + 1 measurements is not sufficient to specify an embedding: a time scale⁴ is also required. For example, one criterion for a sampling interval is the first zero of the autocorrelation function of the time series, the time at which two successive samples are uncorrelated.
First, a sequence of delayed vectors is made from the time series. These vectors form the
rows of the trajectory matrix X, whose singular vectors provide a basis for the embedding space. This is
Takens’ theorem, and does not distinguish between deterministic components and components
dominated by noise. We would like to run Takens’ theorem in a way that eliminates as many of
the latter as possible. To do so, the effects of curvature are eliminated by going from this global
view to a local view. This means looking at a local ball Bε with radius ε in the vector space and centered on one of the delay vectors. The rows of Bε are those delay vectors which are within ε of the vector on which the ball is centered. The smaller ε, the less the dimension estimate will be affected by curvature, which is good, but the fewer data points it will contain, which is bad, so there's a tradeoff when changing the size of the ball. The local dimension is an estimate of the dimension of the manifold which strives to only take the deterministic components into account. Choosing a good ε is related to choosing a good time scale. Estimating the dimension involves having as many data points as possible in the local analysis while remaining unaffected by the curvature. It would be helpful for example if the data points happened to be from a particularly flat part of the manifold.
The local covariance matrix is Bε^T Bε, and its eigenvalues are variances. The diagonalised local covariance matrix is used in calculating the eigenvalues of Bε. The corresponding eigenvectors span the Euclidean tangent space at the point where the ball is centered. Looking at the local covariance matrix for estimating dimension, rather than just counting the rows in the matrix Bε, enables seeing which eigenvectors, i.e. which deterministic components, are significant. Each eigenvector represents a dimension.

¹ The following comments in this paragraph are from correspondence with Cosma Shalizi.
² Geometric time series analysis is also nowadays referred to in machine learning as 'manifold learning.'
³ Thanks to Mark Muldoon of the University of Manchester for some of the comments that follow, and apologies to the reader for the following technicality.
⁴ A time scale can also be thought of as a window length, i.e. how long the different delayed vectors should be.
As ε is increased, the number of detected dimensions will grow until a plateau or until the effects of curvature become noticeable. As the ball expands, it starts to hit quite distant pieces of the attractor as measured by a metric intrinsic to the attractor. Then you're only seeing global effects rather than learning about the attractor.
For an independent and identically distributed process, singular spectrum analysis reduces
to Fourier analysis, where the eigenvectors are expressed in terms of sine and cosine functions.
Singular spectrum analysis is particularly useful if you know that your dynamical system is not
such a process, i.e. if it is described by a non-normal stable distribution, and want to learn about
its correlation structure.
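As a sketch of the global Broomhead-King step (my illustration; the window length here is an assumed value, not a recommendation): form the trajectory matrix of delay vectors, diagonalise its covariance, and count how many eigenvalues stand above the noise floor. The local version repeats this on the rows of the trajectory matrix that fall within an ε-ball.

% Sketch of (global) singular spectrum analysis on a scalar series y.
y = sin(2*pi*(1:2000)'/50) + 0.1*randn(2000, 1);    % placeholder: sinusoid plus noise

w = 20;    % window length (an assumed value; see the discussion of time scales above)
nRows = length(y) - w + 1;
X = zeros(nRows, w);    % trajectory matrix: rows are delay vectors
for i = 1:nRows
    X(i, :) = y(i:(i + w - 1));
end

C = (X'*X)/nRows;                  % covariance matrix of the delay vectors
ev = sort(eig(C), 'descend');      % eigenvalues are variances
semilogy(ev, 'o-');                % for this signal, two eigenvalues (a sine-cosine
                                   % pair) stand above the noise floor, so the
                                   % estimated number of significant components is 2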
Reconstruction of an attractor of dimension D is estimated to need 10^{2+0.4D} data points.⁵ The highest dimension attractor that can be discerned in a time series with N points is:

D = 2.5 log₁₀ N − 5

For an eight dimensional system, this is 150,000 data points. With a window length of 400 milliseconds,⁶ this would require an EEG recording time of 17 hours (150,000 windows of 400 ms each is 6 × 10⁴ seconds). If the dimensionality were 16, the EEG recording time is 3 years. Crucially, there needs to be new information at each data point - simply increasing the sampling rate will not help in reconstructing the attractor.
3 Applications of Ergodic Theory to the Life Sciences
There are a variety of ways to obtain time series of biological systems. Some of the papers
reviewed in this section attempt to characterise mental disorders or the effects of psychoactives
using a time series of the individual’s movements.
Another source of time series is the heart, via the time series provided by an electrocardiogram
(EKG). Heart rate variability has been used to characterise mental disorders as well as variations
in how relaxed a person feels.
Lastly, time series can be obtained from the brain via imaging tools such as electroencephalogram (EEG) or magnetoencephalogram (MEG) scans. A popular tool for brain imaging is
functional magnetic resonance imaging (fMRI), but this does not easily provide time series for
ergodic analysis. The advantage of fMRI is a high spatial resolution of one millimeter. The
disadvantage of fMRI is low temporal resolution of one second. Studies using fMRI tend to
emphasise localised activity and an anatomical view of brain function. An example is studies
of resting-state fMRI activity, also known as the default mode network (DMN). While fMRI
can provide increasingly refined definitions of brain areas, a holistic understanding requires an
investigation of the temporal dynamics of brain activity. Both EEG and MEG have a high
temporal resolution of one millisecond, which is on the order of neuron dynamics. The brain
imaging papers reviewed here use MEG, which also has high spatial resolution of one millimeter.
⁵ Sprott, Chaos and Time-Series Analysis, quoted by Cosma Shalizi in Methods and Techniques of Complex Systems Science: An Overview (Complex Systems Science in Biomedicine).
⁶ Not sure what a good window length is, indeed not sure if anyone knows. I took this number from http://sccn.ucsd.edu/wiki/Chapter_6.6._Model_Fitting_and_Validation
3.1 Dynamical complexity and pathological order in the cardiac monitoring problem (1987)
This paper [11] is an attempt to establish an analogy between healthy and unhealthy cardiac
rhythms and ergodic theory. It makes the clinically relevant observation that, as death may result
within minutes of cardiac dysfunction, there is no time to wait for the asymptotic statistics of the
patient's heartbeat. The ergodic theorems are of no use at such short time scales, as the ergodic quantities will not converge to a stable value. Rather than finding what the dynamics converge
to, one might look at the rate of convergence. The paper refers to this as the pre-asymptotic
diagnosis of the mixing conditions. Four idealised states of cardiac rhythm are given, each with a
faster mixing rate than the one before:
1. ergodic (cardiac bigeminy)
2. weak mixing
3. strong mixing with finite correlations
4. strong mixing with infinite correlations (ventricular tachycardia / fibrillation)
Both 1. and 4. can result in sudden death. In an idealised model of these four states, the second
and third have positive topological entropy, whereas the first and fourth have zero topological
entropy. This is designed to illustrate that positive topological entropy may be associated with
cardiac health.
The paper ends by saying that the topological entropy of a receiving channel must be
greater than that of the source, and that the two zero topological entropy states leave the heart
informationally isolated from the time-dependent regulatory signals of the body’s autonomic
nervous system.
3.2 Application of entropy measures derived from the ergodic theory of dynamical systems to rat locomotor behavior (1990)
In this paper [22], rats are given different psychoactives: MDMA and amphetamine. The movement
of the rats in a bounded space is converted into symbolic sequences, and the topological entropy
and measure entropy are calculated for the sequence. The measurable dynamical system consists
as always of a space, a σ-algebra, a measure, and a transformation. The space is the set of infinite
sequences of symbols. The σ-algebra is that generated by cylinder sets on the space; i.e., for
each finite symbol sequence, the cylinder set is the set of infinite sequences that agree with the
finite one on its set of indices. The sequences must be taken to be infinite for the entropy to be
non-zero, even though in laboratory conditions the movements of the rats are only observed for
finite time. The finite observation is part of one of the infinite cylinders in the mathematical
space. The transformation is the shift operator, which gives the time evolution. The attractor
onto which the shift operator eventually maps the sequences, is the characteristic movement
pattern induced by the psychoactives.
Figure 2: Rat movement patterns. The rats on amphetamine are hyperactive. The rats on
MDMA become more chaotic in their movements at the low dose, and display a primarily circling
movement at the high dose. From [22]
The assumption in this paper is that there is an unknown underlying dynamical system, whose
dynamics can be approximated by a shift map. The underlying system is the rat itself or the
rat’s brain, and the shift map is the recordings of the rat’s movement. The transition from the
underlying dynamical system to the shift map represents the finite precision of the measuring
instruments. This finite precision defines a partition on the space of the underlying dynamical
system. The number of partition elements is denoted m. As the number of partition elements increases, the accuracy with which the partition represents the space also increases.
The topological entropy of the shift operator with respect to a partition is defined in this paper as

hT(σ, L) = lim_{m→∞} log N(ω^m) / m

where ω is a word⁷ of length m, σ is the usual notation for the shift operator, N is a counting function, and L is a partition of the space. The topological entropy of the shift operator is the supremum over partitions L of hT(σ, L). This describes the number of new sequences occurring with increasing sequence length. The topological entropy is the growth rate of the number of possible words with increasing word length, considering all possible partitions⁸ of the measure space. A measure entropy is also defined as the limiting average of the measure entropy with respect to a partition L, where the measure gives a weighting of which words are more probable:

hm(σ, L) = lim_{m→∞} H(ω^m) / m

⁷ Cf. the section on shift maps - words are finite sequences.
⁸ In the context of a shift map, a partition may also be referred to as a coding.
The partition could have been defined in terms of movements easily expressed in language: a
poking of the head could have been one partition element, a decrease in speed could have been
another. Instead, the authors define partition elements that are inversely proportional to the
density distribution of points. They call this a relative generator, as opposed to a generating
partition. The idea here is that the partition should not be specified a priori, and should be chosen
relative to its significance with respect to the data. The consequence here is that a single partition
element may consist of a combination of poking, rearing, or acceleration movements. Subsets of
the measure space in which the rat is observed more frequently are resolved into more distinct
behavioral events than in subsets observed less frequently. The number of partition elements is
set to 32 as this seemed to saturate both the entropy creation and the largest Lyapunov exponent.
The actual probability for the different movement sequences is estimated by observing the
actual rat movements. These probabilities retroactively assign a measure to the system: the
measure of a sequence is defined to be its probability. Transitions between words are written as
an incidence matrix, and the probabilities transform this into a transition matrix. The Ruelle-Perron-Frobenius theorem is used to estimate the largest Lyapunov exponent. The incidence matrix is used to calculate the largest Lyapunov exponent as well as the topological entropy. The
measure entropy with respect to a partition is estimated as a conditional probability of one word
given another word with the same length:
H(ω^m) ≈ −Σ_{i,j} P(ωi^m) [ P(ωi^m | ωj^m) log P(ωi^m | ωj^m) ]
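As a minimal illustration of this estimate (my sketch, not the paper's code, and restricted to words of length one, i.e. single symbols), the conditional probabilities can be counted directly from an observed symbolic sequence:

% Sketch: conditional-entropy estimate from an observed symbol sequence s,
% using words of length one for simplicity.
s = randi(4, 10000, 1);    % placeholder symbolic sequence over 4 symbols
nSym = max(s);
T = accumarray([s(1:end-1), s(2:end)], 1, [nSym, nSym]);    % transition counts
P = bsxfun(@rdivide, T, sum(T, 2));    % conditional probabilities P(j | i)
p = sum(T, 2)/sum(T(:));               % marginal probability of each symbol
H = 0;
for i = 1:nSym
    row = P(i, P(i, :) > 0);           % keep only non-zero entries to avoid log(0)
    H = H - p(i)*sum(row.*log(row));   % probability-weighted conditional entropy
end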
Amphetamine was observed to increase the amount of activity, leading Lyapunov exponent,
topological entropy, and measure entropy in a dose-dependent fashion. The increase in transitions
was both due to an increase in spatial activity (variety of paths) as well as temporal activity
(slowing down and speeding up).
The MDMA results were more complicated, as they were not dose-dependent. As the dose of
MDMA was increased, the leading Lyapunov exponent, topological entropy, and measure entropy
first increased, and then decreased. In other words, these ergodic quantities have a biphasic,
dose-dependent response to MDMA.
On closer inspection, it was observed that individual animals responded differently to high
dose MDMA. At high doses, some individuals experience a decrease of ergodic quantities to the
level of saline controls, whereas for others the ergodic quantities continue to increase. In the low
entropy response animals, there was greater topological entropy relative to measure entropy: this
indicates a decrease in the number of likely paths, in addition to a decrease in the number of
possible paths.
The amphetamine results are compared to the Lyon-Robbins hypothesis, which states that
the stimulant action of amphetamine causes an increase in the initiation of behavioural sequences
as well as a disruption in the completion of the sequences, eventually resulting in stereotypy.
In the experiment, an increased initiation of behavioural responses corresponds to an increase
of transitions between different sequences of the measure space, resulting in an increase in the
leading Lyapunov exponent: an animal starts specific sequences of behavioural events and shortly
thereafter initiates a new sequence. This decreased correlation of consecutive events is consistent
with the Lyon-Robbins hypothesis.
With regards to MDMA, convergence at sufficiently high doses of all animals to the low
topological entropy and still lower measure entropy state indicates a perturbation of the central
nervous system that yields very constrained sequences of behaviour.
The paper concludes with the speculation that healthy functioning may consist in constrained
randomness, characterised by having many possible response options available (hT ) while choosing
only a limited subset of these options (hm ).
3.3 Dynamical entropy is conserved during cocaine-induced changes in fetal rat motor patterns (1996)
This paper [26] proposes that entropy is a conserved property in biological systems such as the
brain and heart. It describes an experiment suggesting that cocaine redistributes entropy.
A lost variety theory of stimulant drug action is that drugs such as cocaine induce a pathological
simplification of the system’s dynamics via the loss of entropy. This paper challenges this view,
stating that entropy is in fact conserved, and that its redistribution is what causes damaging effects.
This redistribution consists in an increase in the amount of activity, associated with a decrease in
the variety of behaviour. The authors relate this to a simplified version of Manning’s formula.
The measure-theoretical aspects are dropped: the measure entropy is replaced by the topological
entropy, and the unique positive Lyapunov exponent of a two-dimensional hyperbolic system
is replaced by the leading Lyapunov exponent. The measure-dependent Hausdorff dimension is
replaced by the correlation dimension. This gives:
hT ≈ λ1 DR
The original theorems of Pesin and Manning are proved with mathematical conditions, such
as uniform expansivity, that are unrealistic for biological systems. Manning’s formula is only
valid for a two-dimensional system, and the substitution of the correlation dimension for the
Hausdorff dimension is not mathematically clear. The authors of this paper nevertheless derive
experimental results from the approximate formula that seem meaningful.
The substance of the paper is an experiment to determine the topological entropy (dynamical
complexity) of fetal rats injected with cocaine. The rats are visually observed for 20 minutes,
during which motor activity is verbally reported for entry into a computer. The events are then
summed and averaged into five second bins, giving 240 data points per subject.
The paper notes that a finite length biological time series is typically never long enough to give a stable estimate of the quantities⁹ hT, λ1, or DR. In other words, the asymptotic stability of these quantities cannot in practice be reached from individual time series. The number of data points needed to correctly estimate DR in a d-dimensional system is between 10^{d/2+1} and 10^d. If the dimension is for example six, the observation would have to last for weeks or months, far longer than the duration of action of an injection of cocaine. Eckmann and Ruelle¹⁰ emphasise that beyond having a large number of measurements, what matters is to have a long recording time - increasing the resolution of one's measurements at fixed recording time does not help much in capturing the dynamics. Increasing the resolution merely gives more and more information on
smaller and smaller pieces of the attractor, whereas one would like to let the recording time tend to infinity to reconstruct all of the attractor. The authors assert that their recording time is long enough.

⁹ Here and elsewhere, researchers in the applied sciences refer to quantities which characterise the dynamics as measures. This is a different use of terminology from measures in the sense of measure theory. I keep with the term quantity to avoid confusion.
¹⁰ Lyapunov exponents from time series, page three.
To get a good estimate of hT , λ1 , and DR , a spatial average over individuals is taken in place
of a single long time series of a single individual. Doing this assumes that the system of fetal rats
under the influence of cocaine is ergodic.
A partition of the space is made by defining six partition elements as being from one to three
standard deviations above or below the mean. Each partition element corresponds to a type of
rat motor activity. A six-by-six transition matrix follows the orbits of the data points from
one partition to the next, each entry representing the probability of transition as a real number.
The transition matrix is converted to an incidence matrix by replacing each entry by a 0 or 1
according to the rule that a transition matrix entry of less than 0.0375 gives a 0, i.e. if the
cell was visited nine times or less (9/240 = 0.0375). The asymptotic growth rate of the trace
of the incidence matrix estimates its largest eigenvalue. The Ruelle zeta function¹¹ makes an
appearance, but I am not sure in what capacity.
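A sketch of this conversion and of the trace-based eigenvalue estimate (my illustration; the transition matrix below is a random placeholder, while the 0.0375 threshold is the paper's):

% Sketch: threshold a 6x6 transition matrix into an incidence matrix, then
% estimate its largest eigenvalue from the growth rate of the trace.
P = rand(6);
P = bsxfun(@rdivide, P, sum(P, 2));    % placeholder row-stochastic transition matrix
A = double(P >= 0.0375);               % incidence matrix: keep transitions seen
                                       % at least nine times out of 240

% trace(A^n) grows like lambda_max^n, so log(trace(A^n))/n estimates
% log(lambda_max), which is itself the topological entropy estimate
n = 50;
hT_trace = log(trace(A^n))/n;
hT_direct = log(max(abs(eig(A))));     % direct computation, for comparison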
The evolution of the separation between two neighbouring data points after five time steps
was calculated for various neighbouring pairs, the greatest rate of separation giving λ1 . This
gives a logarithmic estimate of the largest rate of expansion of new motion patterns.
The correlation dimension,¹² DR, is a measure of the dimensionality of the space occupied by a set of points. In statistical mechanics, the correlation function of a time series measures the proximity of pairs of points xi and xj:

c(l) = lim_{N→∞} (1/N²) Σ_{i,j=1; i≠j}^{N} Θ(l − ||x(i) − x(j)||),   x(i) ∈ R^m

This gives the fraction of pairs of data points whose distance is less than l. The correlation integral is the integral from 0 to l of the correlation function with m degrees of freedom, and represents the mean probability that the states at two different times are close.

C(l) = ∫₀^l d^m r c(r)

C(l) is proportional to a power of l, l^ν. ν is the correlation dimension, and is a lower bound of the Hausdorff dimension. In this paper, the authors choose m = 5 and graphically estimate ν as l goes to zero.¹³
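A minimal Grassberger-Procaccia sketch (my illustration; the point cloud is a uniform planar placeholder, for which the slope should approach 2):

% Sketch: correlation-sum estimate of the correlation dimension for a point
% cloud X whose rows are m-dimensional (delay) vectors.
N = 500;
X = rand(N, 2);    % placeholder: uniform points in the plane
D = zeros(N);
for i = 1:N
    D(i, :) = sqrt(sum(bsxfun(@minus, X, X(i, :)).^2, 2))';    % pairwise distances
end
d = D(triu(true(N), 1));    % keep each distinct pair i < j once

ls = logspace(-2, 0, 20);                 % range of scales l
C = arrayfun(@(l) mean(d < l), ls);       % fraction of pairs closer than l
loglog(ls, C, 'o-');
coeffs = polyfit(log(ls(2:8)), log(C(2:8)), 1);    % fit in the small-l scaling region
nu = coeffs(1);    % the slope estimates the correlation dimension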
The experimental results are that hT is not correlated with λ1 or DR , and that there is
an inverse correlation between λ1 and DR . With administration of cocaine, the topological
entropy remained stable, the leading Lyapunov exponent increased, and the correlation dimension
decreased. This is given as evidence that topological entropy is conserved.
The paper states that the frequently used applied dynamical systems procedure of comparing
to a random data set is irrelevant to the statistical discrimination of quantities from experimentally
defined states.
¹¹ A zeta function is a complex function that's like a generating function. You've got a bunch of numbers, and rather than writing down all these numbers you can just encode them into a single function. Complex functions have infinitely many coefficients, and all this information can be collected together in a single function. If you knew everything about the complex function you could read off all the numbers. Knowing some information about the complex function can tell you some broad properties. It's a convenient device. Zeta functions typically count periodic behaviour.
¹² See Grassberger and Procaccia's paper Measuring the strangeness of strange attractors.
¹³ They denote ν as DR and l as r.
Extrapolating from the experimental results to the human psychological level, an increase
in the leading Lyapunov exponent corresponds to increased busyness, while the concomitant
decrease in the correlation dimension corresponds to reduced degrees of freedom in thought and
behaviour. This is the profile of the complexity-conserving obsessive-compulsive or workaholic
personality. The paper suggests that this loss of complexity can be just as damaging as the
supposed entropy reduction of the alternative lost variety theory.
3.4 Intermittent Vorticity, Power Spectral Scaling, and Dynamical Measures on Resting Brain Magnetic Field Fluctuations (2011)
This [17] is a pilot study on resting state MEG data from 10 schizophrenics and 10 controls. One
view of resting state MEG data is that it is background noise. This view is more typical of source
localisation studies of task-related MEG data. The authors take the alternative view that resting
state MEG data is physiologically and psychologically relevant.
In studies of functional networks of brain regions, resting state activation is sometimes referred
to as the Default Mode Network (DMN). The DMN is a spatial characterisation of resting state
activity, and is observed via fMRI scans which have high spatial resolution and low temporal
resolution. The authors of this pilot study use MEG scans which have a much higher temporal
resolution, with a view to a temporal characterisation of resting state activity.
The authors mention how neuroscientists such as Michael Greicius have suggested that resting
state activity reflects task-unrelated images and thoughts. These task unrelated thoughts (TUT)
have also been referred to as daydreaming or stimulus independent thoughts (SIT), and I will
refer to them as the thinking mind, as opposed to the task-oriented working mind. The authors
mention evidence that the thinking mind persists under anesthesia.
The data examined is 12.5, 54, 180 or 240 seconds of eyes closed, resting spontaneous magnetic
field activity in ten resting controls and ten medicated schizophrenics. The measurable entropy
manifold volume (MEMV) is defined as the product of the topological entropy, leading Lyapunov
exponent, and capacity dimension. The authors state that this is a three-dimensional entropy
volume measure, but I am unclear on how this can represent a volume. Capacity dimension is
unitless, and entropy and Lyapunov exponents have units of inverse time, suggesting that MEMV has units of [time⁻²], like an acceleration.
Prominent magnetic field fluctuations, which the paper title refers to as vorticity, are referred
to in the paper as strudels. The paper speculates that strudels are the thinking mind and that
MEMV represents what might be called psychic energy or psychic entropy. The hypothesis is
that MEMV is used up in the generation of strudels.
A common paradigm for MEG is the inverse problem: reconstructing the orientation and
location of magnetic dipoles needed to produce a given MEG. The inverse problem is underdetermined, in that many dipole configurations may produce the same MEG. This paper instead
attempts to analyse the MEG globally, by analysing the sequences of differences between two
bilaterally symmetric sensor pairs, and refers to this as the symmetric sensor difference sequence
(SSDS). Seeking to disprove the assumption that local polarities of the magnetic field cancel out,
the SSDS is designed to show that a seed magnetic fluctuation can diffuse across spatiotemporal
scales.
A three minute SSDS signal has 144,000 data points. Some unknown function Φ acting on
the SSDS is the time evolution of the underlying dynamical system. Singular spectrum analysis
of the signal is used to estimate the leading eigenfunction¹⁴ of the SSDS, written Ψ1. This is done by using the method of delays to create a covariance matrix where each row is a delay of the SSDS time series. The leading eigenvector given by singular spectrum analysis is calculated at each point of the SSDS to give the leading eigenfunction, which the authors call the leading Broomhead-King eigenfunction.

¹⁴ An eigenfunction is an eigenvector that is also a function.
When analysing something, it can be useful to break it up into its component frequencies, just
as white light is made up of colours which each have their own frequency. A Fourier transform
analyses frequency from the perspective of eternity, and misses out on how the frequency changes
with time. The short time Fourier transform uses a window function to catch the frequency
component in a time interval, but it can miss out on some information by having a window that’s
too long or too short, like glasses that are not adapted to one’s eyesight. By a kind of uncertainty
principle, the product of the time resolution and the frequency resolution is constant. A wavelet
transform takes advantage of this by having a window with varying width, allowing it to see both
short duration high frequency information as well as long duration low frequency information.
More precisely, there are three steps to using wavelets. First, choose a mother wavelet.
Wavelets are functions that are concentrated in time as well as in frequency around a certain
point. Here, the choice is of the Morlet wavelet, as this has been found to be a good match for
human perceptual processes. Second, convolve the wavelet with the signal. Third, change the
scale of the wavelet via dilation and compression. With dilation, the wavelet captures a low rate
of change and a low frequency. With compression, the wavelet captures a high rate of change
and a high frequency. This process is related to a time-frequency tradeoff which Dennis Gabor
first described by analogy with the Heisenberg uncertainty principle.
As an equation, this looks like:

T(a, τ) = (1/√a) ∫_{−∞}^{+∞} f(t) ψ((t − τ)/a) dt
where f (t) is the signal, a is a scale parameter, τ is translation in time, and ψ is the wavelet.
This can be given as a 3D result, and is usually represented in 2D with colours representing
the amplitude.
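A sketch of this transform by direct convolution (an illustration under stated assumptions; the pilot study's own implementation is not given in the paper, and the standard complex Morlet with centre frequency ω0 = 6 is assumed):

% Sketch: Morlet wavelet transform of a signal f by direct convolution.
fs = 600;                                  % sampling rate in Hz, as in the MEG data
t = (0:1/fs:4)';                           % four seconds of signal
f = sin(2*pi*2*t) + 0.5*sin(2*pi*40*t);    % placeholder two-frequency signal

w0 = 6;                                    % Morlet centre frequency (a standard choice)
scales = 2.^(1:0.25:7);                    % range of scales a, in samples
T = zeros(numel(scales), numel(f));        % coefficients T(a, tau)
for k = 1:numel(scales)
    a = scales(k);
    n = (-round(4*a):round(4*a))';         % support of the dilated wavelet
    psi = pi^(-1/4)*exp(1i*w0*n/a).*exp(-(n/a).^2/2);    % dilated Morlet wavelet
    T(k, :) = conv(f, psi, 'same').'/sqrt(a);            % 1/sqrt(a) normalisation
end

imagesc(t, log2(scales), abs(T));          % 2D picture, colour giving the amplitude
axis xy; xlabel('time (s)'); ylabel('log2 scale');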
Figure 3: The Morlet wavelet, also known as the Gabor wavelet. Its lateral inhibition is a good
model for perception, and is reminiscent of centre-surround neurons in vision.
Returning to the MEG paper, a Morlet wavelet transformation WM is applied to the
Broomhead-King eigenfunction, Ψ1 . This can be written as WM (Ψ1 (SSDS(i))), or W for short,
where i indicates a point in the SSDS time series. W is a function from the eigenfunction to a
time-frequency rescaling. This brings the data’s own time scaling structure into view, and the
authors refer to W as an eigenzeit.
A graph of W appears to show intermittent vorticity in the fluctuations of Ψ1 , which the
paper refers to as strudels. The authors give data from an epileptic staring spell and from a schizophrenic thought blocking¹⁵ episode. In both cases the subjective experience is of being
unable to form thoughts, and the W graphs are of a sudden absence of strudels. MEMV also
appears to be reduced by 40-50% in schizophrenics versus controls.
Figure 4: Morlet wavelet transformation of the leading eigenfunction of the SSDS of left and right
C16 sensors. From bottom to top, there seem to be small scale fast driving events, intermediate
scale 1-3 Hz waves, and the intermittent emergence of longer strudels from some but not all fast
and intermediate scale events. From [17]
¹⁵ The schizophrenic thought blocking data appears in their later paper Daydreaming, Thought Blocking and Strudels in the Taskless, Resting Human Brain's Magnetic Fields. Thought blocking occurs when a person's speech is suddenly interrupted by silence that lasts for a few seconds or minutes. It is often brought on in schizophrenics by discussing something emotionally heavy, and is described as a quick and total emptying of the mind.
4 MEG Time Series Analysis
Seventy-nine resting state MEG recordings, lasting four minutes each, are studied to assess the entropy level
in the left and right hemispheres. The recordings are from controls, medicated schizophrenics,
and unaffected siblings of schizophrenics.
This analysis assumes that the MEG is deterministic rather than stochastic. It considers the
MEG time series as representative of an underlying deterministic dynamical system.
First, the data is imported into MATLAB using the EEG/MEG FieldTrip toolbox. Then, I
select pairs of sensor channels from each hemisphere from the imported dataset. The channels I
study are left and right C16 (central), left and right P57 (parietal), and left and right F14 (frontal).
The sensor map is that of CTF's 275-lead MEG scanner. FieldTrip's CTF275.lay file provides the correspondence between label and layout. To cancel noise, the time series of two sensors on the same hemisphere are subtracted: for example, one time series is formed from the right C16 time
series from which the right P57 time series has been subtracted. This is similar to the SSDS in
the previous section, except that the sensor difference sequence is no longer symmetric, since the
two sensors are now from the same hemisphere. In this way six pairs of channels form six time
series, three for each hemisphere. The primary purpose of taking more than one pair is to guard
against possible noise at a particular channel location, rather than to distinguish between regions
within a given hemisphere. Finally, I run custom MATLAB functions on the new time series and
assess the hemisphere-specific differences. The functions are adapted from the Simple Aggregate
for Nonlinear Time-series Analysis project¹⁶ (SANTA), which I am currently helping to renovate.

¹⁶ http://vlsi-cuda.ucsd.edu/~braindyn/index.php
Figure 5: Sensor locations. From [17]
4.1 Viewing the data in MATLAB with the FieldTrip toolbox
This example MEG file was recorded using the CTF MEG System.¹⁷ The dataset is stored in a .ds folder, in this case one for each subject. FieldTrip functions are used to read the header information and to read the data into a matrix.

cfg.dataset = 'Subject_01.ds';
hdr = ft_read_header('Subject_01.ds')

hdr =
             Fs: 600            % sampling frequency
         nChans: 332            % number of channels
       nSamples: 144000         % number of samples per trial
    nSamplesPre: 0              % number of pre-trigger samples in each trial
        nTrials: 1              % number of trials
          label: {332x1 cell}   % cell-array with labels of each channel
           grad: [1x1 struct]   % gradiometer structure
           orig: [1x1 struct]   % additional header information
       chantype: {332x1 cell}   % type of data of each individual channel
       chanunit: {332x1 cell}   % physical units of each channel

dat = ft_read_data('Subject_01.ds');
format long e
unit = ft_chanunit(hdr)

¹⁷ www.ctfmeg.com/index.html
The header data structure contains a vector, hdr.label, which associates each line index with the corresponding sensor of the MEG scanner. The FieldTrip function ft_chanunit gives the units of the MEG data. The first row is time in seconds, and the remaining rows are magnetic field
strength in tesla. The format command changes the display precision so that the tesla values do
not all show as zeros.
Here is the beginning of the first two columns of MEG data.

>> dat(:,1:2)

ans =

   8.337461750000000e+03   8.337463416666666e+03
  -1.385906494693436e-09  -1.387690267455668e-09
   2.241682167274993e-09   2.240907212037669e-09
  -4.404843752375973e-09  -4.401568073661994e-09
   1.778295613667668e-09   1.776501210481651e-09
  -1.662896100804182e-10  -1.681567006944026e-10
  -3.829996110958704e-09  -3.826533741538616e-09
   3.635103659536867e-10   3.628395726104604e-10
   3.055356748903700e-09   3.053415729348467e-09
Both the header data and the first row of the MEG data indicate that the MEG is sampled at the millisecond scale. In the MEG data, the digits 8.33746e+03 remain the same in the first and second timestep, corresponding to 8337.46 seconds; the two timestamps differ by 1/600 of a second, about 1.7 milliseconds.

MEG magnetic field strength is measured in thousands of femtotesla, 10⁻¹²-10⁻¹¹ T. A femtotesla is 10⁻¹⁵ T, and the magnetic field generated by the heart is on the order of a nanotesla, 10⁻⁹ T. Taking the difference of two sensors allows for cancellation of noise.
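As a sketch of this step (the channel label prefixes below are assumptions about the CTF naming convention and should be checked against hdr.label for the dataset at hand):

% Sketch: form within-hemisphere sensor difference sequences from the
% FieldTrip data matrix loaded above. Label prefixes are assumed; check
% hdr.label for the exact channel names in a given CTF275 recording.
iRC = find(strncmp('MRC16', hdr.label, 5));    % right central C16
iRP = find(strncmp('MRP57', hdr.label, 5));    % right parietal P57
rightDiff = dat(iRC, :) - dat(iRP, :);         % right-hemisphere difference series

iLC = find(strncmp('MLC16', hdr.label, 5));    % left central C16
iLP = find(strncmp('MLP57', hdr.label, 5));    % left parietal P57
leftDiff = dat(iLC, :) - dat(iLP, :);          % left-hemisphere difference series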
4.2 Topological entropy and measure entropy
For a description of topological and measure entropy, see my essay mentioned in the introduction.
Recall that the topological entropy is an upper bound to the entropy with respect to any measure.
hµ ≤ hT
An incidence matrix represents which transitions occur in a partition space of the time series.
The topological entropy is estimated as the logarithm of the maximum eigenvalue of the incidence
matrix.
A transition matrix estimates the probability of going from one partition to another within
the time series. If the system is ergodic, one can find the asymptotic probability distribution -
in other words, one can find the natural measure corresponding to the underlying measurable
dynamical system. This allows for calculation of the measure entropy.
Below is the code used to calculate the topological entropy and measure entropy. I have
included a lot of commentary directly in the code to explain how the calculation works.
Note that the implementation of the measure entropy is not complete - at the moment this code only calculates the topological entropy, as well as the average information,¹⁸ which is a step to calculating the measure entropy.

¹⁸ This is H, capital H, in the discussion of measure entropy in my aforementioned essay.
The time series points are allocated box indices in a high-dimensional embedding space. I
used an 11 dimensional embedding space with 4 partitions per dimension. The high embedding
space dimension is made possible by the use of sparse matrices. Once the partition indices are
chosen, a transition matrix is formed, and simplified into an incidence matrix. The topological
entropy is estimated as the logarithm of the maximum eigenvalue of the incidence matrix.
function [topologicalEntropy, measureEntropy] = CalcEntropySparse(timeSeries, ...
    embeddingDimension, partitionsPerDimension)
% CalcEntropySparse(timeSeries, embeddingDimension, partitionsPerDimension)
% Calculates the Topological and Measure Entropies.
%
% Parameters: timeSeries - the time slice of data to be analyzed, as a row vector
%             embeddingDimension - the embedding dimension
%             partitionsPerDimension - the number of partitions in each dimension
% Return:     topologicalEntropy - the Topological Entropy
%             measureEntropy - the Measure Entropy (not yet implemented; returned as NaN)
%
% Description: Calculates the Topological and Measure entropies, first
% finding the sparsity pattern so as to optimise memory usage. An embedding
% dimension of 2d+1 can reconstruct an attractor of dimension d. For
% analysing the EEG or MEG, if the attractor dimension is assumed to be
% eight, this requires a seventeen dimensional embedding space, which
% requires a lot of memory. Topological entropy is found as the logarithm
% of the maximum eigenvalue of the incidence matrix formed from the time
% series. The incidence matrix is composed of 1s and 0s, and indicates
% whether a transition from one partition to another occurs. Measure
% entropy is found with respect to the measure induced by the probabilities
% of the transition matrix formed from the time series. The transition
% matrix, or Markov matrix, gives the probabilities for going from one
% partition to another. This allows the data to produce its own measure,
% the natural measure (also known as the Sinai-Ruelle-Bowen measure or
% physical measure).
%
% The embedding space is covered with partitions, with the number of
% partitions per dimension and the number of dimensions of the embedding
% space given by the input parameters. The partitions are then labeled.
% Imagine a single dimension partitioned into 4 partitions, with values
% normalized between 0 and 1. This means that anything between 0 and 0.25
% is partition 1, anything between 0.25 and 0.5 is partition 2, and so on.
% But then imagine pulling it out into 2 dimensions and now the partitions
% are squares. Then the first row of squares is 1 through 4 and the next
% is 5 through 8 and so on. The box number that the point lies in depends
% on its value in the first and second dimensions. Moving it along the
% second dimension changes its box number by 4 each time and moving it
% along the first dimension changes the box number by 1. In three
% dimensions the partition moves up a plane (16 boxes) so the box number
% changes by 16. In this way a number is assigned to every partition.
%
% Topological entropy is an upper bound to measure entropy.
%
% WARNING: there is a known bug with this code where the entropies given
% are complex values. Re-run the code with the same inputs until you get
% real outputs. This problem may be caused by the instability of Matlab's
% eigenvalue calculation algorithm.

data = timeSeries'; % has to be a column vector for accumarray later on

% nBox is the total number of partitions (also known as boxes or
% hypercubes) covering the embedding space
nBox = partitionsPerDimension^embeddingDimension;

mn = min(data); % minimum value of the time series
mx = max(data); % maximum value of the time series
mx = mx + 1e-6*(mx - mn);
% small offset to the boundary of the rightmost partition to make sure that
% the maximum data point is included in the last partition

% restructure the data array to convert it to linear partition indices; the
% idea is to reshape the data into a table, where each successive column is
% the data sequence delayed by a successive delay; there is a Matlab
% command, delayseq, which could be used for this, but it doesn't come with
% the default Matlab license
lags = (0:(embeddingDimension - 1)) + 1; % index (into data array) of the first entry in each column
flp_lags = fliplr(lags) - 1; % offset from end of data array for each column
dataToPartitionIndex = zeros(length(data) - flp_lags(1), embeddingDimension); % preallocate memory
for i = 1:embeddingDimension
    dataToPartitionIndex(:, i) = data(lags(i):(end - flp_lags(i)));
end
% each column of dataToPartitionIndex is a delayed copy of the time series;
% the first column has no delay, the second column has a delay of 1, and so
% on up to (embeddingDimension - 1)

% preallocating memory by initialising the incidence and transition
% matrices to zeros uses too much memory when nBox is very large; instead
% of working with matrices straight away, first find the sparsity pattern,
% i.e. the indices of the non-zero entries in the matrices and the values
% that go into these entries; the bsxfun command can subtract and multiply
% matrices by vectors, i.e. each row of a matrix gets combined with a row
% vector
partitionRange = (mx - mn)/partitionsPerDimension;
% mx - mn is the range of the data, and dividing this by
% partitionsPerDimension gives the range of a partition along any given
% dimension
dataToPartitionIndex = fix((dataToPartitionIndex - mn)/partitionRange);
% subtracting the minimum gives the distance of the data point from the
% minimum; normalizing by partitionRange contains the values between 0 and
% the number of partitions; normalizing by the range only would contain the
% values between 0 and 1

% at this stage dataToPartitionIndex is an array of partition subscripts;
% each column contains values in [0, partitionsPerDimension - 1]; since we
% want to represent the transitions between partitions in a square matrix
% form (the transition and incidence matrices), we want to convert the
% partition subscripts into a single number: the linear index
dataToPartitionIndex = bsxfun(@times, dataToPartitionIndex, ...
    partitionsPerDimension.^(0:(embeddingDimension - 1)));
% first step towards linear indices; see the description of partition
% numbering in the introduction
dataToPartitionIndex = sum(dataToPartitionIndex, 2) + 1;
% second step towards linear indices; now dataToPartitionIndex is a single
% column of linear partition indices for the time series points

% look at the transitions between partitions: create an array of
% transitions from one partition to the next partition
dataToPartitionIndex = [dataToPartitionIndex(1:(end - 1)), dataToPartitionIndex(2:end)];

% at this point we pretty much have the sparsity pattern, except for the
% following possibilities: we may have partitions which are never visited;
% we may also have partitions which can be entered, but never exited; and we
% may have partitions which are exited, but never entered. These conditions
% would imply that the Markov chain representation is not irreducible, which
% means it's not going to have a stationary distribution. We have to
% eliminate such possibilities for the below code to work.

% first let's ignore the diagonal matrix entries, i.e. the transitions from
% every state i to itself
off_diag = dataToPartitionIndex; % copy the transition list
off_diag(off_diag(:, 1) == off_diag(:, 2), :) = []; % remove entries where both columns hold the same index
% if a state index is present in both columns of off_diag, then it must be
% both enterable and exitable
enterable = false(nBox, 1);
exitable = enterable;
exitable(off_diag(:, 1)) = true; % marks partitions with at least one outgoing transition
enterable(off_diag(:, 2)) = true; % marks partitions with at least one incoming transition
delete_partitions = ~(enterable & exitable); % if not both enterable and exitable, the partition must be removed
% delete_partitions also marks partitions which are never visited

% for the transition matrix, we want to count the number of transitions and
% put the number into the correct matrix entry; notice how the issparse
% argument to accumarray is set to true, so that the output is a sparse matrix
transitionMatrix = accumarray(dataToPartitionIndex, 1, [nBox, nBox], [], [], true); % transition count
% remove the marked partitions from the matrix
transitionMatrix(delete_partitions, :) = []; % delete the corresponding rows
transitionMatrix(:, delete_partitions) = []; % delete the corresponding columns
% normalise the rows
transitionMatrix = spdiags(sum(transitionMatrix, 2), 0, ...
    size(transitionMatrix, 1), size(transitionMatrix, 1))\transitionMatrix;

% the stationary distribution is the principal left eigenvector of the
% transition matrix; for sparse matrices we use the command eigs() to get
% the right eigenvectors; since we want left, not right eigenvectors, we
% have to transpose the matrix before taking eigenvectors
[right_eig_vectors, ~] = eigs(transitionMatrix'); % right eigenvectors of the transposed matrix
rw = right_eig_vectors(:, 1); % the first right eigenvector corresponds to the stationary distribution
% we now normalise the eigenvector and calculate the entropy of the
% stationary distribution
nzr = rw ~= 0; % ignore states with zero probability; this should never occur since we've made sure our Markov chain has a stationary distribution
rw = rw(nzr)./sum(rw(nzr)); % normalise
averageInformation = sum(-rw.*log(rw)); % this is the quantity H discussed in the text
% the measure entropy of the stationary distribution would be obtained by
% iterating the average information over sequence length - not implemented yet!
measureEntropy = NaN; % returned as NaN until the measure entropy is implemented

% for the topological entropy we simply want a non-zero entry for every
% transition which has occurred in the time series, so we take the non-zero
% entries of the transition matrix
incidenceMatrix = double(transitionMatrix > 0);
% the logarithm of the maximum eigenvalue of the incidence matrix estimates
% the topological entropy
eigtemp = eigs(incidenceMatrix);
topologicalEntropy = log(max(abs(eigtemp)));
end
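For instance, with the difference series formed in the previous subsection, the entropies reported below come from calls of the following form (a sketch; the variable names are mine):

% Example call, with the parameters described above
% (11 embedding dimensions, 4 partitions per dimension).
hTright = CalcEntropySparse(rightDiff, 11, 4);
hTleft = CalcEntropySparse(leftDiff, 11, 4);
entropyRatio = hTright/hTleft;    % a ratio above 1 means more right-hemisphere entropy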
4.3 Results
All subjects except three had higher topological entropy in the right hemisphere of the resting
state MEG. With a higher ratio indicating more entropy in the right hemisphere, the average
entropy ratio for medicated schizophrenics was 1.5584, 1.5441 for unaffected siblings, and 1.6497
for control subjects. Of the three subjects with dominant left hemisphere topological entropy,
one was a medicated schizophrenic, one a control, and one an unaffected sibling.
Performing a simple one-sample t-test on the null hypothesis that entropy is equally distributed in left and right hemispheres, i.e. that the ratio is 1, without differentiating between schizophrenics, siblings and controls, it is found that the average ratio differs significantly from 1 (with a p-value lower than 0.00001). This is therefore evidence that entropy is not equally distributed between the right and left hemispheres in humans.
Differentiating between the groups, I have tested the hypothesis that schizophrenics have
a different entropy ratio, compared to non-schizophrenics. To test this, I have taken out the
observations of the siblings: they are not statistically independent of the schizophrenics. I have
assumed that the entropy ratio is a normally distributed variable, like other human physical
characteristics such as height and weight. This allows for testing even though one of the groups - the medicated schizophrenics - is small (fewer than 30). The hypothesis is tested by means of a two-sample t-test as follows. First I compute the standard error of the test statistic.
SE = √(S1²/n1 + S2²/n2)
S1² and S2² are the variances of the sample of medicated schizophrenic entropies and control entropies, respectively. Here S1² = 0.0560 and S2² = 0.0554. n1 and n2 are the sizes of the medicated schizophrenic group and control group, respectively. Here n1 = 21 and n2 = 44.
Then I determine the degrees of freedom, assuming that the variances of the two populations
(schizophrenics and controls) are different.
df = (S1²/n1 + S2²/n2)² / [ (1/(n1 − 1)) (S1²/n1)² + (1/(n2 − 1)) (S2²/n2)² ]
This gives a p-value of 0.0752. This means that the probability of obtaining these results from the sample, or more extreme results, if the null hypothesis were true - i.e. an even greater difference in the interhemispheric entropy ratio of schizophrenics compared to controls - is 7.5%. I have also calculated the p-value assuming the two populations have the same variances, giving a value of 0.0767.
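For reference, the same test can be run directly in MATLAB (a sketch assuming the Statistics Toolbox; ratioSchiz and ratioControl stand for the per-subject ratio vectors, which are not listed here, so placeholder data is generated):

% Sketch: Welch's two-sample t-test on per-subject entropy ratios.
% ratioSchiz and ratioControl are hypothetical stand-ins for the two groups.
ratioSchiz = 1.5584 + sqrt(0.0560)*randn(21, 1);      % placeholder group data
ratioControl = 1.6497 + sqrt(0.0554)*randn(44, 1);    % placeholder group data

% unequal variances (Welch), matching the degrees-of-freedom formula above
[h, p] = ttest2(ratioSchiz, ratioControl, 'Vartype', 'unequal');

% equal-variance version, corresponding to the second p-value quoted
[h2, p2] = ttest2(ratioSchiz, ratioControl);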
These p-values are small but do not reach the conventional significance threshold of 0.05. This means that the null hypothesis that the average topological entropy ratios of the two populations are the same cannot be rejected. I thus find no evidence that schizophrenics have a different interhemispheric entropy ratio.
5 Conclusion
This project has shown that topological entropy is almost always higher in the right hemisphere than in the left hemisphere, and suggests that topological entropy cannot distinguish
interhemispheric imbalances in medicated schizophrenics compared to controls.
Regarding the first point, I note that the proportion of subjects with more left hemisphere entropy is 4%, which is reminiscent of the percentage of the population with inversion of left and right hemisphere activities, 5%. The latter are people whose language center, for example, is in the right hemisphere rather than the left, in contrast to the other 95% of people. With an EEG or MEG dataset of subjects whose lateralisation is known, it would be straightforward to check if the topological entropy ratio does indeed match language center location.
Regarding the second point, two options for further research are available. One option is to
abandon the idea that entropy can characterise interhemispheric differences, and to instead look
to characterise interhemispheric differences with other quantities, such as the leading Lyapunov
exponent. The other option is to make the analysis more precise, by looking at time series of
hours instead of minutes and by looking at measure entropy instead of topological entropy.
References
[1] Ralph Abraham. Dynamics: The Geometry of Behavior. 1992.
[2] Vladimir Alekseev and Mikhail Yakobson. Symbolic dynamics and hyperbolic dynamic systems. Physics Reports, 75(5):290–325, 1981.
[3] Joseph Berkovitz, Roman Frigg, and Fred Kronz. The ergodic hierarchy, randomness and
Hamiltonian chaos. Studies in History and Philosophy of Modern Physics, 2006.
[4] David Broomhead and Gregory King. Topological dimension and local coordinates from
time series data. Journal of Physics A, 20, 1987.
[5] Jean-Pierre Eckmann and David Ruelle. Ergodic theory of chaos and strange attractors.
Reviews of Modern Physics, 57:617, 1985.
[6] Jean-Pierre Eckmann, David Ruelle, Sergio Ciliberto, and Sylvie Oliffson Kamphorst. Lyapunov exponents from time series. Physical Review A, 1986.
[7] Roman Frigg, Joseph Berkovitz, and Fred Kronz. The Ergodic Hierarchy. http://plato.stanford.edu/entries/ergodic-hierarchy/. 2011.
[8] Peter Grassberger and Itamar Procaccia. Measuring the strangeness of strange attractors. Physica D, 9:189–208, 1983.
[9] Brook Henry, Arpi Minassian, Martin Paulus, Mark Geyer, and William Perry. Heart rate
variability in bipolar mania and schizophrenia. Journal of Psychiatric Research, 44:168–176,
2010.
[10] Holger Kantz. Nonlinear Time Series Analysis. 2004.
[11] Arnold Mandell. Dynamical complexity and pathological order in the cardiac monitoring
problem. Physica A: Statistical Mechanics and its Applications, 27D:235–242, 1987.
[12] Arnold Mandell. Can a metaphor of physics contribute to MEG neuroscience research?
Intermittent turbulent eddies in brain magnetic fields. Chaos, Solitons & Fractals, 55:95–101,
2013.
[13] Arnold Mandell, Stephen Robinson, Karen Selz, Constance Schrader, Tom Holroyd, and
Richard Coppola. The turbulent human brain: An MHD approach to the MEG. 2014.
[14] Arnold Mandell and Karen Selz. Entropy conservation as hTµ ≈ λ̄µ⁺ dµ in neurobiological dynamical systems. Chaos: An Interdisciplinary Journal of Nonlinear Science, 7:67–81, 1997.
[15] Arnold Mandell and Karen Selz. An intuitive guide to the ideas and methods of dynamical
systems for the life sciences. 1998.
[16] Arnold Mandell, Karen Selz, John Aven, Tom Holroyd, and Richard Coppola. Daydreaming,
thought blocking and strudels in the taskless, resting human brain’s magnetic fields. American
Institute of Physics Proceedings, 2011.
[17] Arnold Mandell, Karen Selz, Lindsay Rutter, Tom Holroyd, and Richard Coppola. Intermittent vorticity, power spectral scaling, and dynamical measures on resting brain magnetic
field fluctuations. The Dynamic Brain, 2011.
[18] Anthony Manning. A relation between Lyapunov exponents, Hausdorff dimension and
entropy. Ergodic theory and dynamical systems, 1:451–459, 1981.
[19] Martin Paulus, Mark Geyer, and David Braff. Use of methods from chaos theory to quantify
a fundamental dysfunction in the behavioral organization of schizophrenic patients. American
Journal of Psychiatry, 1996.
[20] Martin Paulus, Mark Geyer, and David Braff. Long-range correlations in choice sequences of
schizophrenic patients. Schizophrenia Research, 53:69–75, 1999.
[21] Martin Paulus, Mark Geyer, and Arnold Mandell. Statistical mechanics of a neurobiological
dynamical system: The spectrum of local entropies applied to cocaine-perturbed behavior.
Physica A: Statistical Mechanics and its Applications, 1991.
[22] Martin Paulus, Arnold Mandell, Mark Geyer, and Lisa Gold. Application of entropy measures
derived from the ergodic theory of dynamical systems to rat locomotor behavior. Proceedings
of the National Academy of Sciences, 87:723–727, 1990.
[23] William Perry, Arpi Minassian, Martin Paulus, Jared Young, Meegin Kincaid, Eliza Ferguson, Brook Henry, Xiaoxi Zhuang, Virginia Masten, Richard Sharp, and Mark Geyer. A reverse-translational study of dysfunctional exploration in psychiatric disorders. Archives of General Psychiatry, 2009.
[24] Kevin Short. Direct calculation of metric entropy from time series. Journal of Computational
Physics, 104:162–172, 1993.
[25] Yakov Sinai. Introduction to ergodic theory. Princeton University Press, 1976.
[26] William Smotherman, Karen Selz, and Arnold Mandell. Dynamical entropy is conserved
during cocaine-induced changes in fetal rat motor patterns. Psychoneuroendocrinology,
21:173–187, 1996.
[27] Julien Sprott. Chaos and Time-Series Analysis. 2003.
[28] Floris Takens. Detecting strange attractors in turbulence. Dynamical Systems and Turbulence,
Lecture Notes in Mathematics, 898:366–381, 1981.
[29] Lai-Sang Young. What are SRB measures, and which dynamical systems have them? 2002.