Reflections on Langlet’s APL Theory of Human Vision
Stuart Smith
stu@cs.uml.edu
Abstract
In the APL Theory of Human Vision[1,2], Langlet introduced the “neurobit network.” This network, it is
claimed, is able to (1) detect periodic replications in data and (2) compress boolean vectors, reducing their
lengths without loss of information. The neurobit network is shown here to be capable of detecting certain
kinds of replicated patterns, but this capability depends critically on the data to be processed having
dimensions that are powers of two. The success of data compression is also shown to depend similarly on
the dimensions of the data to be compressed; however, the key functions of the model give surprisingly
simple spectral analyses of certain visual patterns. The Appendix contains the APL functions needed to
allow experimentation with the ideas presented in this paper.
Introduction
One of the more intriguing figures in the APL Theory of Human Vision presentation [1] is this one:
[Figure: a fanion whose top row is the 16-bit boolean encoding of the character string 'AA']
What this figure seems to promise is almost too good to be true: an extremely simple system that
simultaneously detects periodic replications in data and compresses them. The triangular structure shown
above is a “fanion.” [2,8] Given a length n boolean vector, S, the fanion function generates successive
rows of a fanion with n-1 iterations of S←(1↓S)≠ ¯1↓S, the initial value of S being retained as the top
row of the fanion. The top row of the fanion in this case is the character string 'AA' rendered as the
corresponding binary character codes. Reading down the left edge of the fanion we have Langlet’s Helical
transform (or “helicon”) of the sequence. Reading up the right edge of the fanion, we have Langlet’s
Cognitive transform (or “cogniton”) of the sequence [3,4,5]. The value of each element in a given row is
the exclusive-or (i.e., APL ≠) of the two elements immediately above it. As Langlet pointed out [2], all of
these relationships are retained when the fanion is rotated 120° in either direction, that is, the left edge is the
Helical transform of the top row, the right edge is the Cognitive transform of the top row, and each element
in a given row is the exclusive-or of the two elements immediately above it.
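The fanion in the figure can be reproduced with the fanion and fandisp functions listed in the Appendix. Here is a minimal sketch, assuming Dyalog APL's ⎕UCS to obtain the character codes (any other character-to-bits conversion would serve equally well):
fandisp fanion ,⍉(8⍴2)⊤⎕UCS'AA'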
Data Compression
We will consider the data compression claim first. The given sequence of 16 items representing the binary
codes for the string 'AA' is compressed to just the six topmost elements of both the left and right edges of
the fanion. All the 1’s are at the upper end of the result, followed by a tail consisting solely of 0’s. To
uncompress the result, the 0’s tail must be concatenated back onto one of the compressed sequences—left
or right—and then the appropriate transform must be applied (Helical if the left edge of the fanion was
used, Cognitive if the right edge was used). It is therefore necessary to remember how many zeroes there
are below the compressed segment along each edge. The original string 'AA' requires two bytes, but the
compressed string also requires two bytes: one to hold the (encoded) replicated element 'A' and another
to hold a count of the zeroes that need to be tacked onto it in order to uncompress the string. Thus, no
compression is achieved; however, the transforms of longer strings of identical characters such as 'AAAA'
and 'AAAAAAAA' still require only two bytes, thus achieving a degree of compression that increases with
the length of the string. The Helical and Cognitive transforms will also compress binary representations of
character strings such as 'ABAB', in which groups of characters are repeated.
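This behavior is easy to check with the fanion function from the Appendix. In the sketch below, ⊃¨ extracts the first element of each row, that is, the left edge read downward (⎕UCS is again a Dyalog assumption):
v←,⍉(8⍴2)⊤⎕UCS'AA'
⊃¨fanion v
0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0
The six topmost elements carry all of the information; the remaining ten form the all-0’s tail.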
More generally, when the input to fanion consists of a length n sequence (n a power of 2), and the
sequence contains 2*k repetitions of a vector of length n÷2*k, an all-1’s row will be generated before
row n÷2*k (in 0-origin indexing). Because the value of each item in a given row is the exclusive-or of the
two items immediately above it, the row following the all-1’s row, and the remaining rows, will be all
zeroes. This accounts for the increasing degree of compression as the number of replicated units is
increased. For example, fanion applied to a length 8 vector containing four repetitions of 1 0 produces
the following result:
fanion 1 0 1 0 1 0 1 0
1 0 1 0 1 0 1 0
1 1 1 1 1 1 1
0 0 0 0 0 0
0 0 0 0 0
0 0 0 0
0 0 0
0 0
0
As expected, row 1 is all 1’s, and row 2 and the remaining rows consist entirely of zeroes.
This is good, but a limitation of this method is that the compression observed in Langlet’s fanion picture
disappears if only the seven low-order bits of the ASCII code for 'A' are used to form the initial sequence,
seven bits being all that are necessary to select any one of the 128 ASCII characters. The input to fanion
in this case is a length 14 sequence consisting of two 7-bit codes, and the output has no long, all-0’s tail:
fanion 1 0 0 0 0 0 1 1 0 0 0 0 0 1
1 0 0 0 0 0 1 1 0 0 0 0 0 1
1 0 0 0 0 1 0 1 0 0 0 0 1
1 0 0 0 1 1 1 1 0 0 0 1
1 0 0 1 0 0 0 1 0 0 1
1 0 1 1 0 0 1 1 0 1
1 1 0 1 0 1 0 1 1
0 1 1 1 1 1 1 0
1 0 0 0 0 0 1
1 0 0 0 0 1
1 0 0 0 1
1 0 0 1
1 0 1
1 1
0
The success of data compression with fanion depends on the input being a sequence that consists solely
of repetitions of a unit whose length is a power of two. Compression cannot be guaranteed if this condition
is not met. It is therefore something of an overstatement of the capabilities of fanion and the Helical and
Cognitive transforms to say
signals with redundant, repetitious, or palindromic structures appear compressed in the helicon H and the cogniton C, i.e.,
both the helical and the cognitive transform compress information. [7]
Redundant or palindromic signals may be compressed if they are composed of units whose length is some
power of two. Thus fanion is not a general method for data compression.
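For example, three repetitions of a length-3 unit yield a helicon with no all-0’s tail at all (again extracting the left edge of the fanion with ⊃¨):
⊃¨fanion 1 0 0 1 0 0 1 0 0
1 1 1 0 1 1 0 1 1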
Although the fanion is an interesting object in its own right—see, e.g., [2,8]— it does not play an essential
role in the discussion to follow. The important results of fanion are rather attributed to the Helical and
Cognitive transforms, which can be computed in several different ways. What we want to see are the useful
results these transforms can deliver, regardless of how they are computed.
Detection of Periodic Replication
For any theory of human vision, we will certainly want to know how effective the operations of the theory
are in detecting important features of two-dimensional scenes. The ability of the Helical and Cognitive
transforms to detect periodic replications is particularly well demonstrated when they are applied to highly
structured binary images. A very simple 2D test case is a square image consisting of equal-width black and
white stripes. Fig. 1 left, for example, is a 16×16 image consisting of sixteen stripes, each one unit wide,
and fig. 1 right is its Helical transform.
Fig. 1 Stripes pattern and its Helical transform
The 16×16 Helical transform of this image consists of a single black element (1) in an otherwise white
(0’s) matrix. The black element is located at (0,1), that is, in position 1 in row 0. If the striped pattern of
Fig. 1 left is transposed, as in fig. 2 left, the Helical transform is also transposed, as in fig. 2 right:
Fig. 2 Horizontal stripes pattern and its Helical transform
Combining the vertical and horizontal stripes patterns element-wise with exclusive-or produces the
checkerboard pattern shown in fig. 3 left. Because the combining operation is exclusive-or, the
intersections of black (1) stripes form white (0) squares:
Fig. 3 Checkerboard pattern and its Helical transform
Fig. 3 right is the Helical transform of the checkerboard pattern. The black square at (0,1) represents the
contribution of the vertical stripes, the black square at (1,0) the contribution of the horizontal stripes (the
black square at (0,0) is a kind of “sign bit” that identifies whether the image to be transformed was, in
photographic terms, a “negative”).
Because the Helical transform is its own inverse, applying it to the transformed image will reproduce the
original striped or checked image. Further investigation shows that the Helical transform of any binary
image that (a) is square, (b) has a side whose length is a power of two, and (c) is built only from alternating
equal-width black and white stripes will have a single 1 in the first row if the stripes are vertical, in the first
column if the stripes are horizontal, and in both the first row and first column for the checkerboard pattern.
Each 1 will be located in a row or column position whose index is a power of two, which will also be the
relative width of each stripe. We conclude, then, that the 2D Helical transform can detect simple cases of
periodic replication.
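The one-dimensional analogue is easy to verify with the Appendix functions. For a width-2 striped vector beginning with a white (0) stripe, the helicon has its single 1 at index 2, the stripe width:
⊃¨fanion 16⍴0 0 1 1
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0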
A more searching test of the Helical transform’s ability to detect periodic replication is provided by 2×2
block-replacement L-system patterns. These patterns, some quite pretty, are all built in the same way: a 2×2
matrix of 1’s and 0’s is given as a start pattern, and then successively larger square matrices are built by
replacing each 1 with a copy of the start pattern and each 0 with a copy of another, usually different, 2×2
matrix. Successively higher orders of Langlet’s G (“geniton”) matrix can be built in exactly this way (see
the G function in the Appendix). The start pattern is
1 1
1 0
To generate the next higher order geniton, replace each 1 with the start pattern and each 0 with a 2×2 all-zeroes matrix. The result is this 4×4 matrix:
1 1 1 1
1 0 1 0
1 1 0 0
1 0 0 0
This process can be continued indefinitely, producing larger and larger matrices, all of which will be square
with side a power of two. All 256 possible combinations of two 2×2 boolean matrices can be generated
with the tile function. Thus we can know in every case exactly how a given tiling pattern was generated.
The task of the Langlet transforms here is to detect the underlying replications. The transforms reveal
replication in several different ways depending upon the particular combination of 2×2 matrices selected to
create an image.
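As a concrete check with the Appendix functions, the order-4 geniton generated directly by G matches the result of one round of block replacement (the sierpinski rules in the Appendix are exactly the start pattern and all-zeroes matrix above):
G 4
1 1 1 1
1 0 1 0
1 1 0 0
1 0 0 0
(G 4)≡sierpinski 2
1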
Thue-Morse Plane
The next test case is an image that has a relatively complex visual appearance but, as it turns out, a very
simple Helical transform: the Thue-Morse plane, a familiar fractal image. Fig. 4 left, for example, is the
16×16 Thue-Morse plane, and fig. 4 right is its Helical transform.
Fig. 4 Thue-Morse Plane and its Helical transform
As can be seen in the transformed image, 1’s are at positions 0, 1, 2, 4, and 8 in both the first row and first
column. This analysis of the plane yields a set of nine 16×16 “basic forms”, each corresponding to a
specific 1 in the transformed image. A particular basic form is obtained simply by applying the Helical
transform to a matrix containing 1 in the selected position and 0’s everywhere else. When the basic forms
are combined element-wise with ≠ the result is the original Thue-Morse plane. The first basic form,
corresponding to the 1 at the “sign-bit” position (0,0) in the transformed matrix, is an all-1’s (i.e.,
completely black) square. The only effect of the all-1’s basic form is to complement an image, that is,
reverse black and white. The remaining basic forms correspond to 1’s in the first row and first column. As
expected, these images consist of black and white stripes whose relative widths are 1, 2, 4, and 8. The
relative width of the stripes in a given basic form is the same as the row or column index of that image’s 1
in the transform. Fig. 5 shows the set of basic forms of the 16×16 Thue-Morse plane:
Fig. 5 Decomposition of the 16×16 Thue-Morse plane into nine basic forms
The Helical transform reveals replication of both horizontal and vertical stripes at four different scales. This
is a global picture of the large-scale replicated elements from which the original image is built. This may be
a somewhat surprising result given that the construction of the Thue-Morse plane is usually described as a
process that gradually builds the plane out of smaller pieces.
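For reference, the plane itself is generated by the TMplane function in the Appendix; the smallest nontrivial case already shows the characteristic structure:
TMplane 2
1 0 0 1
0 1 1 0
0 1 1 0
1 0 0 1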
Hadamard matrix
The last simple test case, the Hadamard matrix, is interesting because it has a relatively complex visual
appearance but a very simple Helical transform. It is also of interest because it has applications in image
and signal processing and in cryptography, among other areas. Fig. 6 left is a 16×16 Hadamard matrix
represented as a binary image, and fig. 6 right is its Helical transform.
Fig. 6 Hadamard matrix and its Helical transform
In the transformed matrix, the 1’s lie on the main diagonal at positions (0,0), (1,1), (2,2), (4,4), and (8,8).
This analysis yields a set of five basic forms, each corresponding to a 1 in the transformed image. The first
basic form, corresponding to the 1 in the “sign bit” position (0,0) in the transformed matrix, is again an all-1’s (i.e., completely black) square. The remaining 1’s in the transformed image are in positions (1,1), (2,2),
(4,4), and (8,8) along the main diagonal. Fig. 7 shows the basic forms.
Fig. 7 Decomposition of the 16×16 Hadamard matrix into five basic forms
The transform reveals replication of a squares-within-a-square pattern at four different scales.
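The pattern itself is generated by the hadamard function in the Appendix, with the usual ±1 entries encoded as 1 and 0. The smallest nontrivial case:
hadamard 2
1 1 1 1
1 0 1 0
1 1 0 0
1 0 0 1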
Most of the other 2×2 tilings have Helical transforms in which the replication patterns are obvious but
involve a large number of basic forms, far too many to permit showing all of them as was done above with
the Thue-Morse plane and the Hadamard matrix. These patterns can be generated with the tile function
listed in the Appendix.
Beyond 2×2 Block Replacement Images
The Helical and Cognitive transforms appear to be exquisitely designed to detect multiple levels of
replication in the kinds of binary images represented by 2×2 block replacement images. This is not
surprising because both transforms were originally based on Langlet’s G matrix, which is itself a member
of that collection. However, we immediately encounter difficulties when we apply the transforms to images
that are not square with a side whose length is a power of two, or to images in which the period of
replication is not a power of two.
With appropriate zero-padding, any matrix can be embedded in a larger, square matrix whose side is a
power of two. But the transformed images produced from padded matrices are generally disappointing
visually as compared with those above. The case pictured in fig. 8 is typical. On the left is a 27×27
Sierpinski “Carpet,” on the right its Helical-transformed image. The Carpet pattern here was padded out to
32×32 before it was given as the argument to the Helical transform. As can be seen, the transformed image
is larger as a result of the padding.
Fig. 8 27×27 Sierpinski Carpet and the Helical transform of the Carpet padded to 32×32
The replication pattern is easily visible in the Carpet image, but the transformed image appears to reveal
nothing beyond mirror symmetry around the main diagonal. This situation arises out of the mismatch
between the Langlet transform matrices, which consist of elements that are replicated at scales that are
powers of two, and the 3×3 tiling patterns, which consist of elements that are replicated at scales that are
powers of three. While the rows and columns of a 2×2 tiling pattern are precisely aligned with the rows and
columns of Langlet transform matrices (which are themselves 2×2 tiling patterns), this can never occur
with 3×3 tiling patterns.
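The zero-padding used above is simple to express in the style of the Appendix functions. The helper below, pad, is a name introduced here, not part of Langlet’s system: ⍟ is logarithm, so n becomes the smallest power of two not less than the side of x, and ↑ pads the matrix with 0’s on the bottom and right.
y←pad x;n
n←2*⌈2⍟1↑⍴x
y←(n,n)↑x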
Langlet offers another way to deal with images like the Carpet. In [4] he observes that the relevant
transformational properties of the transform matrix are preserved even if both the last row and the first
column are repeatedly removed. The matrix can therefore be pared down in this way until the length of its
side is the same as the length of the vector to be transformed. The resulting matrix can then be used to
compute the Helical or Cognitive transform of a matrix the length of whose side is not a power of two. The
Helical transform of the Carpet computed in this way is shown in fig. 9.
Fig. 9 Helical transform of the Carpet, computed with a truncated transform matrix
Little or nothing indicating replication leaps off the screen or page, not even the mirror symmetry around
the main diagonal found in fig. 8 right; however, the transforms of 3×3 block replacement tiling patterns
computed with either of the above methods still have exact inverses. The original image can be recovered
intact from the transformed image simply by applying the transform to its own result (and trimming off the
padding, if any).
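A sketch of this paring, again in the Appendix’s style (truncG is a name introduced here): each pass of ¯1 1↓ removes the last row and the first column until the matrix reaches the required order n. The truncated matrix can then stand in for Gv n (or Gh n) when computing the transform.
y←n truncG m
L1:→(n=1↑⍴m)/L2
m←¯1 1↓m
→L1
L2:y←m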
Discussion
As we already saw in the data compression example at the beginning of this paper, many of the nice
properties of the Helical and Cognitive transforms depend on the dimensions and periodicities of their
arguments being powers of two. In Langlet’s system the problems that arise when the dimensions and/or
periodicities are not powers of two cannot be dealt with through conventional image or signal processing
techniques (e.g., windowing, smoothing, filtering, etc.). The reason is that Langlet insisted that essentially
all operations in his system have exact inverses so that no information could ever be lost in the course of a
computation. To achieve the goal of completely reversible computation, it is necessary to follow Langlet’s
droll mini-manifesto and scrupulously respect the “rights of a bit” [6]:
Déclaration des Droits du Bit (Antwerp, 1994)
0. All bits contain information, so they deserve respect.
1. Every small bit is a quantum of information.
Nobody has the right to crunch it.
Nobody has the right to smooth it.
Nobody has the right to truncate it.
Nobody has the right to average it with it(s) neighbour(s).
Nobody has the right to sum it, except modulo 2.
Given the severity of these requirements and the nature of the problems encountered, it is not clear how one
would proceed with the work described here. One of the practical problems Langlet sought to solve was to
somehow connect his system to the domains of image and signal processing. I had hoped that he had gone
further in this direction than what I knew from his publications; however, in conversations after his passing
with people who knew him, I learned that he had never made the connection. Wolfram’s magisterial A New
Kind of Science [9] has pointed in directions in which progress might be made in a style of computing
reminiscent of Langlet’s. One will recognize in this book many of the themes that Langlet was developing,
although there is no mention of Langlet in it. Meanwhile, however, the game has changed significantly.
The appearance of CNN (Cellular Neural Network) chips [10], which combine large numbers of digital and
analog processors on a single chip, has enabled real-time processing of large quantities of optical sensor
data. The main computational burden is carried by the analog units operating in parallel, while the digital
units are used principally for control and memory. While the two-dimensional neurobit network envisioned
by Langlet resembles a CNN chip in some respects, CNNs really constitute an entirely new approach to
machine vision, pattern recognition, and image processing.
References
[1] Gérard Langlet. The APL Theory of Human Vision (foils). Vector, 11:3.
[2] Gérard Langlet. The APL Theory of Human Vision. APL ’94, Antwerp, Belgium, September 1994.
ACM 0-89791-675-1/94.
[3] Gérard Langlet. Towards the Ultimate APL-TOE. ACM SIGAPL Quote Quad, 23:1, July 1992.
[4] Gérard Langlet. The Axiom Waltz - or When 1+1 make Zero. VECTOR, 11:3.
[5] Gérard Langlet. Paritons and Cognitons: Towards a New Theory of Information. VECTOR, 19:3.
[6] Gérard Langlet. Binary Algebra Workshop. Vector, 11:2.
[7] Michael Zaus. Crisp and Soft Computing with Hypercubical Calculus: New Approaches to Modeling in
Cognitive Science and Technology with Parity Logic, Fuzzy Logic, and Evolutionary Computing: Volume
27 of Studies in Fuzziness and Soft Computing. Physica-Verlag HD, 1999.
[8] Gérard Langlet. The Fractal MAGIC Universe. Vector, 10:1, p.137, 1993.
[9] Stephen Wolfram. A New Kind of Science. Wolfram Media, 2002.
[10] Leon Chua and Tamás Roska. Cellular Neural Networks and Visual Computing. Cambridge
University Press, 2002.
Appendix
This section contains the basic functions of Langlet’s system as well as functions needed to explore the
ideas presented in the body of the paper. Below are also functions to create figs. 1-8 above. The highlighted
text is the APL definition of each function.
bmp
Boolean inner product
Inputs:
x , y = conformable boolean arrays
Output:
z = boolean inner product of x and y
z←x bmp y
z←x≠.∧y
bpp
Raise a boolean matrix to a given power.
Inputs:
m = square boolean matrix
n = power
Output:
y = m raised to the nth power
y←m bpp n
y←⊃≠.⌊/n⍴⊂m
bsi
Scalar integral. The modulo-2 sum of a boolean vector
Input:
x = vector of 1's and 0's
Output:
y = modulo-2 sum of x (can also be interpreted as the parity of x)
y←bsi x
y←≠/x
bvd
Parity differential. This is the same as the binary-to-Gray-code
conversion of a vector of 1's and 0's. bvd is the inverse
of bvi.
Input:
x = vector of 1's and 0's
Output:
y = boolean vector differential of x
y←bvd x
y←x≠¯1↓0,x
bvi
Parity integral. bvi is the same as the Gray-code-to-binary conversion
of a vector of 1's and 0's. bvi is the inverse of bvd.
Input:
x = vector of 1's and 0's
Output:
y = modulo-2 cumulative sum of x
y←bvi x
y←≠\x
cog
Langlet's Cognitive transform. rcog is much faster and can
accommodate much larger matrices.
Input:
x = vector or matrix of 1's and 0's
Output:
y = Cognitive transform of x
y←cog x
y←x bmp Gh (⍴x)
fanion
Constructs the "fanion" of a boolean vector. To display a fanion, use
the fandisp function.
Input:
x = vector of 1's and 0's (length is usually a power of 2)
Output:
y = the fanion of x
y←fanion x;n
n←⍴x
y←⊂x
L1:→(n=1)/0
x←(1↓x)≠¯1↓x
y←y,⊂x
n←n-1
→L1
fandisp
Display function for fanions
Input:
x = a fanion created with the fanion function
fandisp x;n;i
n←⍴x
x←⍉(1,n)⍴x
⎕←' ',x[1;1]
i←2
L1:→(i>n)/0
⎕←(i↑' '),x[i;1]
i←i+1
→L1
G
Creates Langlet's G matrix, or “geniton”
Input:
n = order of G (n is usually a power of 2)
Output:
y = G matrix of the specified order
y←G n
y←par n↑1 0
Gd
Creates Langlet's Gd matrix
Input:
n = order of Gd (n is usually a power of 2)
Output:
y = Gd matrix of the specified order
y←Gd n
y←⌽Gh n
Gh
Creates Langlet's Gh matrix
Input:
n = order of Gh (n is usually a power of 2)
Output:
y = Gh matrix of the specified order
y←Gh n
y←⊖G n
Gv
Creates Langlet's Gv matrix
Input:
n = order of Gv (n is usually a power of 2)
Output:
y = Gv matrix of the specified order
y←Gv n
y←⌽G n
hel
Langlet's Helical transform. rhel is much faster and can accommodate
much larger matrices.
Input:
x = vector of 1's and 0's
Output:
y = Helical transform of x
y←hel x
y←x bmp Gv (⍴x)
maj
“Majority” function. Determines whether a boolean matrix contains
more 1’s than 0’s
Input:
x = boolean matrix
Output:
y = 1 if 1’s outnumber 0’s in x, otherwise 0
y←maj x
y←(+/+/x=1)>+/+/x=0
par
Construct a pariton from a vector of 1's and 0's
Input:
x = vector of 1's and 0's (length is usually a power of 2)
Output:
y = the pariton of x
y←par x;k;t;n
n←⍴x
k←n-1
t←bvi x
y←t
L1:→(k=0)/L2
t←bvi t
y←y,t
k←k-1
→L1
L2: y←(n,n)⍴y
rcog
Fast divide-and-conquer algorithm to compute the Cognitive transform.
Input:
x = vector of 1’s and 0’s. Length must be a power of 2.
Output:
y = Cognitive transform of x
y←rcog x
y←⌽rhel⌽x
rhel
Fast divide-and-conquer algorithm to compute the Helical transform.
Input:
x = vector of 1’s and 0’s. Length must be a power of 2.
Output:
y = Helical transform of x
y←rhel x;half;left;right
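⍝ The helicon of x is the helicon of the left half followed by
⍝ the elementwise XOR of the helicons of the two halves.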
half←(⍴x)÷2
→(half≤2)/L1
left←rhel half↑x
right←rhel (-half)↑x
y←left,left≠right
→0
L1:left←hel half↑x
right←hel (-half)↑x
y←left,left≠right
→0
tile
Creates L-system patterns using 2×2 block replacement. The 2×2
rules, black and white, are global variables. For usage, see the
example programs sierpinski, hadamard, TMplane, stripes, and
checkerboard below.
Input:
n = number of iterations of block replacement
Output:
y = matrix representing the tiled pattern
y←tile n
y←black
L1:→(n=1)/0
y←tile_step y
n←n-1
→L1
tile_step
Computes one step of block replacement. The 2×2 block replacement
rules, black and white, are global variables.
Input:
x = the start pattern
Output:
y = matrix with 1-cells replaced by the 2×2 black pattern and 0-cells replaced by the 2×2 white pattern
y←tile_step x;x1;x2;t1;t2;t;n
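⍝ black∘.×x is a rank-4 array: axes 1-2 index the 2×2 tile, axes
⍝ 3-4 index the source cell. The dyadic transpose 2 4 1 3⍉ reorders
⍝ the axes to (source row, tile row, source column, tile column) so
⍝ that ravelling and reshaping interleave the tiles into the final
⍝ block-replacement result.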
x1←black∘.×x
x2←white∘.×~x
t1←2 4 1 3⍉x1
t2←2 4 1 3⍉x2
t←t1∨t2
n←2×1↑⍴x
y←(n,n)⍴,t
Functions to create the figures in the main text
In each definition, n is the number of iterations of block replacement
to be performed. The side of the resulting matrix will be 2*n.
Sierpinski triangle
y←sierpinski n;black;white
black←2 2⍴1 1 1 0
white←2 2⍴0 0 0 0
y←tile n
Hadamard matrix
y←hadamard n;black;white
black←2 2⍴1 1 1 0
white←2 2⍴0 0 0 1
y←tile n
Thue-Morse plane
y←TMplane n;black;white
black←2 2⍴1 0 0 1
white←2 2⍴0 1 1 0
y←tile n
stripes
y←stripes n;black;white
black←2 2⍴1 0 1 0
white←2 2⍴1 0 1 0
y←tile n
checkerboard
y←checkerboard n;x
x←stripes n
y←x≠⍉x