Lecture 2: Characterization of entropies of
multidimensional shifts of finite type
Michael Hochman
The Hebrew University of Jerusalem
Combinatorics, Automata and Number Theory
CIRM, May 2012
Recall that for an SFT X = SFT_d(A, F),

h(X) = lim_{n→∞} (1/n^d) log N_n(X)

where

N_n(X) = #{globally admissible patterns on [1, n]^d}
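For instance, for the full shift, where F = ∅ and every pattern is admissible, N_n(X) = |A|^{n^d} and hence h(X) = log |A|.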
Theorem
A number α ≥ 0 is the entropy of a 2-dimensional shift of finite type (SFT) if and only if α = inf{αn} for some computable sequence (αn)_{n∈N}.
The results for higher dimensions can either be proved in the
same way, or derived from this result.
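For example, every computable number α ≥ 0 is such an infimum (take αn = α for all n), but the class is strictly larger: if Ω denotes Chaitin's halting probability, then 1 − Ω is the infimum of a computable sequence (approximate Ω from below and subtract from 1) but is not itself computable.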
Proposition
Let F be a finite set of A-patterns and define

Ñ_n(A, F) = #{a ∈ A^{[1,n]²} : a is locally admissible for F}

Then the entropy of X = SFT(A, F) is

h(X) = inf_n (1/n²) log Ñ_n(A, F)
Since the sequence Ñ_n(A, F) is computable from (A, F), this shows one direction of the theorem, that is, that h(X) ∈ Π1 whenever X is an SFT.
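To illustrate this computability, here is a minimal brute-force sketch (my own code, not from the lecture). For simplicity it assumes every forbidden pattern has shape 2 × 2; the hard-square-like example at the end is also an illustrative choice.

from itertools import product
from math import log

def locally_admissible_count(A, forbidden, n):
    """Ñ_n: count the n x n arrays over A containing no forbidden 2x2 sub-pattern."""
    count = 0
    for cells in product(A, repeat=n * n):
        grid = [cells[i * n:(i + 1) * n] for i in range(n)]
        if all((grid[i][j], grid[i][j + 1], grid[i + 1][j], grid[i + 1][j + 1])
               not in forbidden
               for i in range(n - 1) for j in range(n - 1)):
            count += 1
    return count

# Example: forbid any 2x2 block containing two orthogonally adjacent 1s
# (a hard-square-like SFT, chosen only for illustration).
A = (0, 1)
forbidden = {b for b in product(A, repeat=4)
             if (b[0] == b[1] == 1) or (b[2] == b[3] == 1)
             or (b[0] == b[2] == 1) or (b[1] == b[3] == 1)}
for n in (2, 3, 4):
    print(n, log(locally_admissible_count(A, forbidden, n)) / n**2)
    # each printed value is an upper bound on h(X); by the proposition,
    # their infimum over n is exactly h(X)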
Proof of proposition. Every globally admissible pattern is locally admissible, so

Ñ_n(A, F) ≥ N_n(X)

therefore

inf_n (1/n²) log Ñ_n(A, F) ≥ inf_n (1/n²) log N_n(X) = h(X)
Since h(X) = inf_n (1/n²) log N_n(X), for the reverse inequality it is enough to show that for each n0,

inf_n (1/n²) log Ñ_n(A, F) ≤ (1/n0²) log N_{n0}(X)
Fix n0. For every locally but not globally admissible a ∈ A^{[1,n0]²} there is an N(a) such that a cannot be extended to a locally admissible pattern on [−N(a), N(a)]² (such an N(a) exists by compactness: if a extended to locally admissible patterns on arbitrarily large squares, it would be globally admissible). Let

N = max{N(a) : a ∈ A^{[1,n0]²} is locally but not globally admissible}
Now fix a large n and partition [1, n]² as follows: divide [1, n] × [1, n] into a maximal array of n0 × n0 squares, all at distance at least N from the complement of [1, n]².
The area not covered by the array is at most n² − (n − 4N)², and the number of squares in the array is at least ⌊(n − 4N)²/n0²⌋.
Also note: every square R in the array lies at the center of a square R′ of side 2N + 1 contained in [1, n]².
Suppose b ∈ A^{[1,n]²} is locally admissible. For each R, R′ as above, b|_{R′} is locally admissible, so b|_R is globally admissible (if it were not, then by the choice of N it could not be extended to a locally admissible pattern on R′). Therefore the number of possible b is at most

#{colorings outside the array} × #{colorings of the array}
≤ |A|^{n² − (n−4N)²} × N_{n0}(X)^{⌊(n−4N)²/n0²⌋}
This is an upper bound on Ñ_n(A, F). Taking logarithms,

(1/n²) log Ñ_n(A, F) ≤ (O(nN + N²)/n²) log|A| + ((n − 4N)²/n²) · (1/n0²) log N_{n0}(X)
Taking n → ∞, we have the desired inequality

inf_n (1/n²) log Ñ_n(A, F) ≤ (1/n0²) log N_{n0}(X)

Since this holds for every n0, taking the infimum over n0 gives inf_n (1/n²) log Ñ_n(A, F) ≤ h(X), which completes the proof.
Now let α ∈ Π1 .
Our goal is now to construct a 2-dimensional SFT with entropy
equal to α.
We first perform a reduction.
Definition
For A_0 ⊆ A and a ∈ A^{[1,n]²} define

#(A_0, a) = #{u in the domain of a : a_u ∈ A_0}

For X ⊆ A^{Z²}, let

δ_n(A_0|X) = (1/n²) max{#(A_0, a) : a = x|_{[1,n]²} for some x ∈ X}
and define the upper density of A_0 in X by

δ(A_0|X) = lim sup_{n→∞} δ_n(A_0|X)
Note that for an SFT X = SFT(A, F),

δ_n(A_0|X) = (1/n²) max{#(A_0, a) : a is globally admissible}
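As a small illustration of these definitions, here is a sketch of my own (patterns supplied explicitly as lists of rows); for an actual SFT one cannot in general enumerate the globally admissible patterns, which is precisely the point of the construction below.

def symbol_count(a, A0):
    """#(A0, a): the number of positions u with a_u in A0."""
    return sum(1 for row in a for s in row if s in A0)

def delta_n(patterns, A0, n):
    """(1/n^2) * max #(A0, a) over a given collection of n x n patterns."""
    return max(symbol_count(a, A0) for a in patterns) / n**2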
Proposition
Suppose that X = SFT(A, F) and h(X) = 0. Let A_0 ⊆ A. Then there is an SFT Y = SFT(A′, F′) of the same dimension as X such that h(Y) = δ(A_0|X).
Proof.
Let A′ = A × {0, 1} and identify A′-patterns a′ ∈ (A′)^E with pairs (a, b), where a ∈ A^E is an A-pattern and b a {0, 1}-pattern. Let

F′ = {(a, b) : a ∈ F} ∪ {(a, b) : a_u ∈ A \ A_0 and b_u = 1 for some u}
If a ∈ A^{[1,n]^d} is globally admissible for F, then the number of globally admissible patterns a′ = (a, b) for F′ is 2^{#(A_0, a)}. Hence

2^{n^d δ_n(A_0|X)} ≤ N_n(A′, F′) ≤ N_n(A, F) · 2^{n^d δ_n(A_0|X)}

Taking logarithms, dividing by n^d, and using the fact that (1/n^d) log N_n(A, F) → h(X) = 0, we find that, writing Y = SFT(A′, F′), h(Y) = δ(A_0|X), as claimed.
Notice how the construction worked:
1. We started with X = SFT(A, F).
2. We formed the product alphabet A′ = A × B.
3. We added rules in F′ to ensure that the first layer obeys the constraints defined by F.
4. We added additional constraints involving the second and possibly the first layer.
Every y ∈ SFT(A′, F′) now has a first layer x ∈ X, and the second layer (the B-symbols) obeys local constraints that may depend locally on the x layer (in the example, the symbol 1 could only appear over symbols from A_0).
This process is called superposition.
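A small sketch of the superposition step, under my own encoding of forbidden patterns as dictionaries mapping coordinates to symbols (none of the names below come from the lecture):

from itertools import product

def superpose(A, F, A0):
    """Build the product alphabet A' = A x {0,1} and the lifted forbidden set F'."""
    A_prime = [(a, b) for a in A for b in (0, 1)]
    F_prime = []
    # (i) the first layer must obey F: forbid every {0,1}-decoration
    #     of each forbidden A-pattern
    for pat in F:
        coords = list(pat)
        for bits in product((0, 1), repeat=len(coords)):
            F_prime.append({u: (pat[u], b) for u, b in zip(coords, bits)})
    # (ii) the symbol 1 may only sit over symbols of A0
    for a in A:
        if a not in A0:
            F_prime.append({(0, 0): (a, 1)})
    return A_prime, F_prime

The single-cell patterns in step (ii) are what force the 1s of the second layer to sit only over A_0-symbols, so that they count occurrences of A_0 in the first layer.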
By a Turing machine we mean a finite automaton moving on a 1-dimensional array of cells. Each cell contains one read-only bit and one re-writable bit. At each time step
- It reads both bits from the current cell.
- Based on the input and its internal state, it
  - Updates the re-writable bit in the current cell.
  - Moves one cell left or right.
  - Enters a new internal state.
Formally, if S is the set of internal states,

T : {0, 1} × {0, 1} × S → {0, 1} × {0, 1} × {←, →} × S

There is also
- A special initial state s0 ∈ S, which we assume is never re-entered;
- A set of halting states H ⊆ S on which T is not defined.
So in fact

T : {0, 1} × {0, 1} × (S \ H) → {0, 1} × {0, 1} × {←, →} × (S \ {s0})
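As a sanity check of the model, here is a minimal simulator of my own (the interface is an assumption: the tape is a dict from positions to bit pairs, and T returns the unchanged read-only bit, the new re-writable bit, a move in {"L", "R"} standing for {←, →}, and the new state):

def run(T, tape, s0, halting, max_steps=1000):
    """Run the machine from position 0 in state s0; return (halted, tape)."""
    pos, state = 0, s0
    for _ in range(max_steps):
        if state in halting:
            return True, tape
        r, w = tape.get(pos, (0, 0))         # read both bits of the current cell
        r, w2, move, state = T(r, w, state)  # the read-only bit comes back unchanged
        tape[pos] = (r, w2)                  # only the re-writable bit is updated
        pos += -1 if move == "L" else 1      # move one cell left or right
    return False, tape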
Let T be a given Turing machine. We can encode a machine at a cell in a single symbol in {0, 1}² × S. When there is no machine at a cell, we encode the data pair and a symbol ⇐ or ⇒ to indicate in which direction the machine is located. We call the resulting alphabet A_T. A typical pair of rows of the computation looks as follows.
If a cell is to the right or left of a cell with a machine, the cell contains an arrow pointing to the machine. Otherwise, an arrow must point to an identical arrow.
This forces every row with a machine to have all arrows
pointing to it. In particular, there is at most one machine in the
row, though there may be rows without any machine – only
arrows pointing in a common direction.
The transition from row to row can be prescribed locally because the content of a cell is determined by T and the three cells below it.
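To make this locality concrete, here is a rough sketch under my own encoding of A_T (not the lecture's): a machine cell is ("M", r, w, s) and a machine-free cell is ("<", r, w) or (">", r, w), with left/right possibly None at the ends of the row.

def next_cell(left, centre, right, T):
    """Contents of the cell above `centre`, given the three cells below it."""
    if centre[0] == "M":                           # the machine leaves this cell...
        _, r, w, s = centre
        _, w2, move, _ = T(r, w, s)
        return ("<" if move == "L" else ">", r, w2)  # ...leaving an arrow behind
    for nbr, arrives_if in ((right, "L"), (left, "R")):
        if nbr is not None and nbr[0] == "M":      # the machine arrives from a neighbour
            _, r, w, s = nbr
            _, _, move, s2 = T(r, w, s)
            if move == arrives_if:
                return ("M", centre[1], centre[2], s2)
    return centre                                  # no machine nearby: nothing changes

In SFT terms, one can encode this by forbidding every 2 × 3 pattern whose top-middle cell disagrees with this function.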
One question is: how does the machine initialize, if no transition leads into the initial state? By our rules we have not forbidden patterns in which a machine in its initial state s0 appears directly above a machine-free row, nor rows that contain no machine at all.
Thus in an admissible configuration, either all rows do not
contain a machine, the data never changes, and the arrows can
switch direction arbitrarily; or there is a row i0 with the machine
in its initial state s0 . All rows below i0 do not contain a machine
and behave as in the first alternative; all rows above i0 are
obtained from the previous row by the transition rule of T .
Note that if in some row the machine is in a halting state, there
is no admissible row above it.
In particular, let a be a symbol encoding the initial state of the
machine. Then a can be extended to an admissible
configuration if and only if there is a choice of initial data such
that the machine does not halt.
This shows that it is undecidable whether a is globally
admissible (Wang’s theorem).
Also note that the SFT X we have constructed has entropy 0. This is because the pattern on the boundary of [1, n]² determines the interior unless the machine "appears" inside the square, in which case one must also know in which column (the boundary arrows determine which row). This gives a bound of the form N_n(X) ≤ c^n for some constant c, so log(N_n(X))/n² → 0 = h(X).
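Concretely, there are at most |A_T|^{4n} patterns on the boundary of [1, n]² and at most n possible columns for the machine, so N_n(X) ≤ n · |A_T|^{4n} ≤ c^n for a suitable c, and therefore (1/n²) log N_n(X) ≤ (log n + 4n log|A_T|)/n² → 0.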
Controlling frequencies.
Given the alphabet A_T associated to the Turing machine T, consider the set of symbols

A⁰_T = {a ∈ A_T : the read-only data bit of a is 1}

Suppose x is an admissible configuration containing a machine. This means that the machine did not halt. Also, notice that the read-only data symbol is constant on columns. Therefore

#(A⁰_T, x|_{[1,n]}) = #{columns 1, ..., n with read-only data symbol 1}
Let T0 be a Turing machine computing a sequence αn ≥ 0, and let α = inf αn.
Assume αn > α + 1/ log(n) (if not, replace αn by αn + 1/ log(n))
Let T be the Turing machine that, for each n in turn, computes αn using T0, reads the read-only data word a_n on the cells from −n to n relative to the machine's initial position, and halts if the density of A⁰_T-symbols in a_n, that is #(A⁰_T, a_n)/(2n + 1), exceeds αn.
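Schematically (with alpha(n) standing for the computation of αn by T0 and read_only(i) for reading the read-only bit i cells from the starting position; both interfaces are assumptions of mine), T behaves like:

def checker(alpha, read_only):
    n = 0
    while True:                              # runs forever unless the check below fires
        n += 1
        ones = sum(read_only(i) for i in range(-n, n + 1))
        if ones / (2 * n + 1) > alpha(n):    # density of 1s on [-n, n] exceeds alpha_n
            return "HALT"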
Construct the SFT X associated to T. The result: if x is an admissible configuration with a machine, then the density of A⁰_T-symbols is ≤ α. One can show that there are initial configurations of read-only data that give density exactly α.
We have almost ensured that δ(A⁰_T|X) = α. But we do not control densities in configurations with no machine.
We must force computation to occur.
Definition
A board is a set B ⊆ Z² of the following form: for some intervals I = [a, b] and J = [a′, b′], and subsets I_0 ⊆ I and J_0 ⊆ J that include the endpoints but contain no two consecutive integers,

B = (I_0 × J) ∪ (I × J_0)

|I_0| is the width of B and |J_0| is its height.
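For concreteness, a tiny helper of my own (illustrative only) that assembles a board from the data in the definition:

def board(I, J, I0, J0):
    """Return the board B = (I0 x J) | (I x J0) together with its width and height."""
    B = {(x, y) for x in I0 for y in J} | {(x, y) for x in I for y in J0}
    return B, len(I0), len(J0)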
Definition
Let A_0 ⊆ A. A configuration x ∈ A^{Z²} has arbitrarily large boards (with respect to A_0) if

{u ∈ Z² : x_u ∈ A_0} = ⋃ B_i

is a union of boards B_i without common or adjacent sites, and for each n some of the B_i have width and height ≥ n.
Proposition (Robinson)
There exists a 2-dimensional SFT X ≠ ∅ on an alphabet A_X, and a symbol b ∈ A_X, such that every x ∈ X has arbitrarily large boards (with respect to {b}), and h(X) = 0.
Let X be as in the proposition and T a Turing machine.
We next describe how to construct an SFT Y ⊆ X × A^{Z²} for some alphabet A, such that in every configuration (x, x′) ∈ Y, if B is a board appearing in x, the pattern x′|_B records the operation of T.
We use the “junctions” of the board as cells of the machine’s
tape (each row of junctions is a row of cells). Junctions can be
identified locally so we can require that only they carry the
symbols used to simulate T .
Each junction needs to know what symbols are in the junctions to its left and right (if such exist). We allow symbols on the horizontal edges consisting of pairs of symbols from A_T:
- On an edge adjacent to a junction, the symbol in the junction appears adjacent to the junction.
- Two adjacent cells on an edge carry the same symbol.
Now at each junction we have enough information (using the immediate neighbors right and left) to determine what the symbol in the junction above should be.
But we still need to get the information there. We allow symbols from A_T on the vertical edges, with the rules that
- In the vertical edge immediately above a junction, there appears the symbol from A_T that should appear in the next junction above.
- Two adjacent cells on a vertical edge carry the same symbol.
- A junction must carry the same symbol as the vertical edge below it, if one exists.
We require that in the bottom row of a board there is a machine in its initial state in the lower left corner.
We also must introduce rules to deal with junctions on the left and right boundaries (they do not have adjacent cells on the right and left, respectively).
For example, we can interpret this as a special input symbol to the machine, indicating that the tape ends, and assume that the machine knows how to deal with this.
We have associated to T an SFT such that each configuration
in the SFT contains arbitrarily large boards, and each board of
size n × n represents the computation of T for n steps (without
halting).
This SFT is empty if and only if T halts on every input.
This is the proof of Berger’s theorem.
Returning to densities...
Now suppose that T0 is a Turing machine computing a
sequence αn ↓ α and we define T as before to check densities
of the input using T0 .
Then for every ε > 0, in every large enough board the density
of junctions with data symbol 1 cannot exceed α + ε. Also there
are arbitrarily large boards with densities arbitrarily close to α.
(Also, the entropy of the SFT is still 0).
But what about outside of the boards?
We can easily synchronize the read-only symbols in the boards
so that they are constant on each column (they are currently
constant on the columns inside a given board).
To complete the proof there is one more stage that we will not carry out in detail: we force every sufficiently large board to "sample" enough columns so that the bounds on densities in boards apply everywhere.
This forces δ(A⁰_T|X) = α and completes the proof of the characterization of entropies of 2-dimensional SFTs.
For more details, see [Hochman-Meyerovitch 2010].