Reflections on Langlet's APL Theory of Human Vision

Stuart Smith
stuart.smith3@comcast.net

Abstract

In the APL Theory of Human Vision [1,2], Langlet introduced the "neurobit network." This network, it is claimed, is able to (1) detect periodic replications in data and (2) compress boolean vectors, reducing their lengths without loss of information. The neurobit network is shown here to be capable of detecting certain kinds of replicated patterns, but this capability depends critically on the data to be processed having dimensions that are powers of two. The success of data compression is shown to depend similarly on the dimensions of the data to be compressed; however, the key functions of the model give surprisingly simple spectral analyses of certain visual patterns. The Appendix contains the APL functions needed to allow experimentation with the ideas presented in this paper.

Introduction

One of the more intriguing figures in the APL Theory of Human Vision presentation [1] is the fanion diagram reproduced below. What this figure seems to promise is almost too good to be true: an extremely simple system that simultaneously detects periodic replications in data and compresses them.

[Figure: Langlet's fanion, whose top row is the 16 bits of the character string 'AA']

The triangular structure shown above is a "fanion" [2,8]. Given a length-n boolean vector S, the fanion function generates successive rows of a fanion with n-1 iterations of S←(1↓S)≠¯1↓S, the initial value of S being retained as the top row of the fanion. The top row of the fanion in this case is the character string 'AA' rendered as the corresponding binary character codes. Reading down the left edge of the fanion we have Langlet's Helical transform (or "helicon") of the sequence. Reading up the right edge of the fanion, we have Langlet's Cognitive transform (or "cogniton") of the sequence [3,4,5]. The value of each element in a given row is the exclusive-or (i.e., APL ≠) of the two elements immediately above it. As Langlet pointed out [2], all of these relationships are retained when the fanion is rotated 120° in either direction; that is, the left edge is the Helical transform of the top row, the right edge is the Cognitive transform of the top row, and each element in a given row is the exclusive-or of the two elements immediately above it.

Data Compression

We will consider the data compression claim first. The given sequence of 16 items representing the binary codes for the string 'AA' is compressed to just the six topmost elements of both the left and right edges of the fanion. All the 1's are at the upper end of the result, followed by a tail consisting solely of 0's. To uncompress the result, the 0's tail must be concatenated back onto one of the compressed sequences, left or right, and then the appropriate transform must be applied (Helical if the left edge of the fanion was used, Cognitive if the right edge was used). It is therefore necessary to remember how many zeroes there are below the compressed segment along each edge. The original string 'AA' requires two bytes, but the compressed string also requires two bytes: one to hold the (encoded) replicated element 'A' and another to hold a count of the zeroes that need to be tacked onto it in order to uncompress the string. Thus, no compression is achieved; however, the transforms of longer strings of identical characters such as 'AAAA' and 'AAAAAAAA' still require only two bytes, thus achieving a degree of compression that increases with the length of the string.
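The zero-tail behavior is easy to reproduce with the functions listed in the Appendix. Here is a minimal session sketch, assuming the hel function defined there and taking the bits of each 'A' (ASCII 65) low-order bit first, one bit ordering that reproduces the six-element result described above:

      a←1 0 0 0 0 0 1 0     ⍝ ASCII 'A' (65), low-order bit first
      hel a,a               ⍝ helicon of the 16 bits of 'AA'
1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
      hel hel a,a           ⍝ hel is its own inverse: the original bits return
1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0

Only the six topmost elements and a count of the trailing zeroes need to be kept.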
The Helical and Cognitive transforms will also compress binary representations of character strings such as 'ABAB', in which groups of characters are repeated. More generally, when the input to fanion is a length-n sequence (n a power of 2) containing 2*k repetitions of a vector of length n÷2*k, an all-1's row will be generated before row n÷2*k (in 0-origin indexing). Because the value of each item in a given row is the exclusive-or of the two items immediately above it, the row following the all-1's row, and all remaining rows, will be all zeroes. This accounts for the increasing degree of compression as the number of replicated units is increased. For example, fanion applied to a length-8 vector containing four repetitions of 1 0 produces the following result (displayed with fandisp):

      fandisp fanion 1 0 1 0 1 0 1 0
 1 0 1 0 1 0 1 0
  1 1 1 1 1 1 1
   0 0 0 0 0 0
    0 0 0 0 0
     0 0 0 0
      0 0 0
       0 0
        0

As expected, row 1 is all 1's, and row 2 and the remaining rows consist entirely of zeroes. This is good, but a limitation of this method is that the compression observed in Langlet's fanion picture disappears if only the seven low-order bits of the ASCII code for 'A' are used to form the initial sequence, seven bits being all that are necessary to select any one of the 128 ASCII characters. The input to fanion in this case is a length-14 sequence consisting of two 7-bit codes, and the output has no long, all-0's tail:

      fandisp fanion 1 0 0 0 0 0 1 1 0 0 0 0 0 1
 1 0 0 0 0 0 1 1 0 0 0 0 0 1
  1 0 0 0 0 1 0 1 0 0 0 0 1
   1 0 0 0 1 1 1 1 0 0 0 1
    1 0 0 1 0 0 0 1 0 0 1
     1 0 1 1 0 0 1 1 0 1
      1 1 0 1 0 1 0 1 1
       0 1 1 1 1 1 1 0
        1 0 0 0 0 0 1
         1 0 0 0 0 1
          1 0 0 0 1
           1 0 0 1
            1 0 1
             1 1
              0

If a sequence of any length is periodic and the period is known, the encoded element can still be extracted. In the above fanion, for example, the period is seven and the encoded element is therefore the first seven items reading down the left edge of the fanion or the first seven items reading up the right edge. Because additional processing is required to determine the period of the sequence, it is something of an overstatement of the capabilities of fanion and the Helical and Cognitive transforms to say that "signals with redundant, repetitious, or palindromic structures appear compressed in the helicon H and the cogniton C, i.e., both the helical and the cognitive transform compress information" [7]. Redundant or palindromic signals can be directly compressed if they are composed of units whose length is some power of two; otherwise, some "non-Langletian" processing is required. For example, if a non-square matrix is to be analyzed, both the horizontal and vertical periods must be determined before the repeated unit can be identified. Another, perhaps more serious, problem with the Helical and Cognitive transforms is their extreme sensitivity to noise in the data they operate on. A single corrupt bit will result in a transform from which the repeated unit can be extracted only with additional, non-Langletian processing. These issues and techniques for dealing with them are discussed in another paper [11].

Although the fanion is an interesting object in its own right (see, e.g., [2,8]), it does not play an essential role in the discussion to follow. The important products of the fanion are rather the Helical and Cognitive transforms themselves, which can be computed in several different ways.
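The equivalence of these computations can be checked directly with the Appendix functions. A sketch, using an arbitrary length-8 test vector:

      x←1 0 0 1 1 0 1 0
      hel x                    ⍝ boolean inner product with the Gv matrix
1 1 1 0 0 0 1 0
      (hel x)≡⊃¨fanion x       ⍝ same as reading down the fanion's left edge
1
      (hel x)≡rhel x           ⍝ same as the fast divide-and-conquer version
1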
What we want to see are the useful results these transforms can deliver, regardless of how they are computed.

Detection of Periodic Replication

For any theory of human vision, we will certainly want to know how effective the operations of the theory are in detecting important features of two-dimensional scenes. The ability of the Helical and Cognitive transforms to detect periodic replications is particularly well demonstrated when they are applied to highly structured binary images. A very simple 2D test case is a square image consisting of equal-width black and white stripes. Fig. 1 left, for example, is a 16×16 image consisting of sixteen vertical stripes, each one unit wide, and fig. 1 right is its Helical transform.

Fig. 1 Stripes pattern and its Helical transform

The 16×16 Helical transform of this image consists of a single black element (1) in an otherwise white (all-0's) matrix. The black element is located at (0,1), that is, in position 1 of row 0. If the striped pattern of fig. 1 left is transposed, as in fig. 2 left, the Helical transform is also transposed, as in fig. 2 right:

Fig. 2 Horizontal stripes pattern and its Helical transform

Combining the vertical and horizontal stripes patterns element-wise with exclusive-or produces the checkerboard pattern shown in fig. 3 left. Because the combining operation is exclusive-or, the intersections of black (1) stripes form white (0) squares:

Fig. 3 Checkerboard pattern and its Helical transform

Fig. 3 right is the Helical transform of the checkerboard pattern. The black square at (0,1) represents the contribution of the vertical stripes, the black square at (1,0) the contribution of the horizontal stripes. The black square at (0,0) is a kind of "sign bit" that identifies whether the image to be transformed was, in photographic terms, a "negative." Because the Helical transform is its own inverse, applying it to the transformed image will reproduce the original striped or checked image. Further investigation shows that the Helical transform of any binary image that (a) is square, (b) has a side whose length is a power of two, and (c) is built only from alternating equal-width black and white stripes will have a single 1 in the first row if the stripes are vertical, in the first column if the stripes are horizontal, and in both the first row and first column for the checkerboard pattern. Each 1 will be located in a row or column position whose index is a power of two, which will also be the relative width of each stripe. We conclude, then, that the 2D Helical transform can detect simple cases of periodic replication.
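These test patterns can be generated with the functions in the Appendix. A small sketch at 4×4 rather than 16×16:

      stripes 2           ⍝ vertical stripes, each one unit wide
1 0 1 0
1 0 1 0
1 0 1 0
1 0 1 0
      checkerboard 2      ⍝ the stripes combined with their transpose by ≠
0 1 0 1
1 0 1 0
0 1 0 1
1 0 1 0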
Fig. 3 suggests that because the Helical transform is capable of detecting periodic replication within a simple two-dimensional pattern, it should be able to do a certain amount of visual texture analysis. This is indeed the case. Suppose, for example, that we are given the 256×256 patch of texture shown in fig. 4.

Fig. 4 Texture patch, 256×256 pixels

If we apply the two-dimensional Helical transform to the image in fig. 4, the result will be as shown in fig. 5. As can be seen, there is a 16×16 pixel subimage in an otherwise blank square in this figure. Because of the behavior of the Helical transform described above, we already know that fig. 4 was created by replicating some 16×16 unit. What is that unit?

Fig. 5 Helical transform of Fig. 4

To determine what the replicated unit is, remove all the completely blank rows and columns, making sure that the cropped picture is still a square whose side is a power of two (pad with zeroes if necessary). Fig. 6 shows the cropped picture.

Fig. 6 Zoomed-in view of the upper left corner of Fig. 5

Finally, to determine what the original texture element was, simply take the two-dimensional Helical transform of the picture shown in fig. 6. The result is shown in fig. 7.

Fig. 7 Zoomed-in view of the original texture element replicated to produce Fig. 4

In general, if a (2*n)×(2*n) boolean matrix is composed exclusively of exact repetitions of a (2*m)×(2*m) boolean submatrix (m<n), the two-dimensional Helical transform will extract the transform of the submatrix. The original submatrix, i.e., the pattern which is replicated, can then be recovered by applying the Helical transform again after the all-zeroes rows and columns are stripped off.

A more searching test of the Helical transform's ability to detect periodic replication is provided by 2×2 block-replacement L-system patterns. These patterns, some quite pretty, are all built in the same way: a 2×2 matrix of 1's and 0's is given as a start pattern, and then successively larger square matrices are built by replacing each 1 with a copy of the start pattern and each 0 with a copy of another, usually different, 2×2 matrix. Successively higher orders of Langlet's G ("geniton") matrix can be built in exactly this way (see the G function in the Appendix). The start pattern is

1 1
1 0

To generate the next higher order geniton, replace each 1 with the start pattern and each 0 with a 2×2 all-zeroes matrix. The result is this 4×4 matrix:

1 1 1 1
1 0 1 0
1 1 0 0
1 0 0 0

This process can be continued indefinitely, producing larger and larger matrices, all of which will be square with side a power of two. All 256 possible combinations of two 2×2 boolean matrices can be generated with the tile function, so we can know in every case exactly how a given tiling pattern was generated.
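The geniton construction is easy to verify with the Appendix functions. A sketch, with black and white being the global 2×2 replacement rules read by tile:

      black←2 2⍴1 1 1 0     ⍝ the start pattern
      white←2 2⍴0 0 0 0     ⍝ each 0 is replaced by an all-zeroes block
      (tile 3)≡G 8          ⍝ block replacement reproduces the 8×8 geniton
1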
Thue-Morse Plane

The next test case is an image that has a relatively complex visual appearance but, as it turns out, a very simple Helical transform: the Thue-Morse plane, a familiar fractal image. Fig. 8 left is the 16×16 Thue-Morse plane, and fig. 8 right is its Helical transform.

Fig. 8 Thue-Morse plane and its Helical transform

As can be seen in the transformed image, 1's are at positions 0, 1, 2, 4, and 8 in both the first row and first column. This analysis of the plane yields a set of nine 16×16 "basic forms", each corresponding to a specific 1 in the transformed image. A particular basic form is obtained simply by applying the Helical transform to a matrix containing 1 in the selected position and 0's everywhere else. When the basic forms are combined element-wise with ≠, the result is the original Thue-Morse plane. The first basic form, corresponding to the 1 at the "sign bit" position (0,0) in the transformed matrix, is an all-1's (i.e., completely black) square. The only effect of the all-1's basic form is to complement an image, that is, to reverse black and white. The remaining basic forms correspond to 1's in the first row and first column. As expected, these images consist of black and white stripes whose relative widths are 1, 2, 4, and 8. The relative width of the stripes in a given basic form is the same as the row or column index of that image's 1 in the transform. Fig. 9 shows the set of basic forms of the 16×16 Thue-Morse plane:

Fig. 9 Decomposition of the 16×16 Thue-Morse plane into nine basic forms

The Helical transform reveals replication of both horizontal and vertical stripes at four different scales. This is a global picture of the large-scale replicated elements from which the original image is built. This may be a somewhat surprising result, given that the construction of the Thue-Morse plane is usually described as a process that gradually builds the plane out of smaller pieces.

Hadamard Matrix

The last simple test case, the Hadamard matrix, is interesting because it, too, has a relatively complex visual appearance but a very simple Helical transform. It is also of interest because it has applications in image and signal processing and in cryptography, among other areas. Fig. 10 left is a 16×16 Hadamard matrix represented as a binary image, and fig. 10 right is its Helical transform.

Fig. 10 Hadamard matrix and its Helical transform

In the transformed matrix, the 1's lie on the main diagonal at positions (0,0), (1,1), (2,2), (4,4), and (8,8). This analysis yields a set of five basic forms, each corresponding to a 1 in the transformed image. The first basic form, corresponding to the 1 in the "sign bit" position (0,0) in the transformed matrix, is again an all-1's (i.e., completely black) square. The remaining 1's in the transformed image are in positions (1,1), (2,2), (4,4), and (8,8) along the main diagonal. Fig. 11 shows the basic forms.

Fig. 11 Decomposition of the 16×16 Hadamard matrix into five basic forms

The transform reveals replication of a squares-within-a-square pattern at four different scales. Most of the other 2×2 tilings have Helical transforms in which the replication patterns are obvious but involve a large number of basic forms, far too many to permit showing all of them as was done above with the Thue-Morse plane and the Hadamard matrix. These patterns can be generated with the tile function listed in the Appendix.

Beyond 2×2 Block Replacement Images

The Helical and Cognitive transforms appear to be exquisitely designed to detect multiple levels of replication in the kinds of binary images represented by 2×2 block replacement images. This is not surprising, because both transforms were originally based on Langlet's G matrix, which is itself a member of that collection. However, we immediately encounter difficulties when we apply the transforms to images that are not square with a side whose length is a power of two, or to images in which the period of replication is not a power of two. With appropriate zero-padding, any matrix can be embedded in a larger, square matrix whose side is a power of two. But the transformed images produced from padded matrices are generally disappointing visually as compared with those above. The case pictured in fig. 12 is typical. On the left is a 27×27 Sierpinski "Carpet," on the right its Helical-transformed image. The Carpet pattern here was padded out to 32×32 before it was given as the argument to the Helical transform. As can be seen, the transformed image is larger as a result of the padding.¹

Fig. 12 27×27 Sierpinski Carpet and the Helical transform of the Carpet padded to 32×32

The replication pattern is easily visible in the Carpet image, but the transformed image appears to reveal nothing beyond mirror symmetry around the main diagonal.

¹ A version of the Helical transform that accepts an arbitrary rectangular matrix as its argument returns a result of the same dimensions as the argument, i.e., no stripping of padding is necessary. The downside is that it is much slower than the versions that require a square matrix whose side is a power of 2.
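The Carpet itself is built by 3×3 block replacement, for which the Appendix has no generator. A minimal sketch in the style of tile_step, assuming Dyalog's power operator ⍣ (the names b, step, and carpet are introduced here for illustration):

      b←3 3⍴1 1 1 1 0 1 1 1 1     ⍝ 3×3 rule: each 1 → b, each 0 → all-zeroes
      step←{n←3×1↑⍴⍵ ⋄ (n,n)⍴2 4 1 3⍉b∘.×⍵}
      carpet←step⍣3⊢1 1⍴1         ⍝ three replacement steps give the 27×27 Carpet
      padded←32 32↑carpet         ⍝ zero-padded to 32×32 for the transform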
This situation arises out of the mismatch between the Langlet transform matrices, which consist of elements that are replicated at scales that are powers of two, and the 3×3 tiling patterns, which consist of elements that are replicated at scales that are powers of three. While the rows and columns of a 2×2 tiling pattern are precisely aligned with the rows and columns of the Langlet transform matrices (which are themselves 2×2 tiling patterns), this can never occur with 3×3 tiling patterns.

Langlet offers another way to deal with images like the Carpet. In [4] he observes that the relevant transformational properties of the transform matrix are preserved even if both the last row and the first column are repeatedly removed. The matrix can therefore be pared down in this way until the length of its side is the same as the length of the vector to be transformed. The resulting matrix can then be used to compute the Helical or Cognitive transform of a matrix the length of whose side is not a power of two. The Helical transform of the Carpet computed in this way is shown in fig. 13.

Fig. 13 Helical transform of the Carpet, computed with a truncated transform matrix

Little or nothing indicating replication leaps off the screen or page, not even the mirror symmetry around the main diagonal found in fig. 12 right; however, the transforms of 3×3 block replacement tiling patterns computed with either of the above methods still have exact inverses. The original image can be recovered intact from the transformed image simply by applying the transform to its own result (and trimming off the padding, if any).
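A sketch of this paring, assuming it is the Gv matrix of the Appendix that gets truncated:

      m←(¯1 1∘↓)⍣5⊢Gv 32     ⍝ drop the last row and first column five times
      ⍴m
27 27

A length-27 vector x can then be transformed as x bmp m.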
Discussion

As we already saw in the data compression example at the beginning of this paper, many of the nice properties of the Helical and Cognitive transforms depend on the dimensions and periodicities of their arguments being powers of two. In Langlet's system the problems that arise when the dimensions and/or periodicities are not powers of two cannot be dealt with through conventional image or signal processing techniques (e.g., windowing, smoothing, filtering, etc.). The reason is that Langlet insisted that essentially all operations in his system have exact inverses, so that no information could ever be lost in the course of a computation. To achieve the goal of completely reversible computation, it is necessary to follow Langlet's droll mini-manifesto and scrupulously respect "the rights of a bit" [6]:

Déclaration des Droits du Bit (Antwerp, 1994)
0. All bits contain information, so they deserve respect.
1. Every small bit is a quantum of information.
   Nobody has the right to crunch it.
   Nobody has the right to smooth it.
   Nobody has the right to truncate it.
   Nobody has the right to average it with it(s) neighbour(s).
   Nobody has the right to sum it, except modulo 2.

Given the severity of these requirements and the nature of the problems encountered, it is not clear how one would proceed with the work described here. One of the practical problems Langlet sought to solve was to somehow connect his system to the domains of image and signal processing. I had hoped that he had gone further in this direction than what I knew from his publications; however, in conversations after his passing with people who knew him, I learned that he had never made the connection.

Wolfram's magisterial A New Kind of Science [9] has pointed in directions in which progress might be made in a style of computing reminiscent of Langlet's. One will recognize in this book many of the themes that Langlet was developing, although there is no mention of Langlet in it. Meanwhile, however, the game has changed significantly. The appearance of CNN (Cellular Neural Network) chips [10], which combine large numbers of digital and analog processors on a single chip, has enabled real-time processing of large quantities of optical sensor data. The main computational burden is carried by the analog units operating in parallel, while the digital units are used principally for control and memory. The two-dimensional neurobit network envisioned by Langlet resembles a CNN chip in some respects, but CNNs constitute a more powerful and comprehensive approach to machine vision, pattern recognition, and image processing.

References

[1] Gérard Langlet. The APL Theory of Human Vision (foils). Vector, 11:3.
[2] Gérard Langlet. The APL Theory of Human Vision. Proceedings of APL '94, Antwerp, Belgium, September 1994. ACM 0-89791-675-1/94.
[3] Gérard Langlet. Towards the Ultimate APL-TOE. ACM SIGAPL Quote Quad, 23:1, July 1992.
[4] Gérard Langlet. The Axiom Waltz, or When 1+1 Make Zero. Vector, 11:3.
[5] Gérard Langlet. Paritons and Cognitons: Towards a New Theory of Information. Vector, 19:3.
[6] Gérard Langlet. Binary Algebra Workshop. Vector, 11:2.
[7] Michael Zaus. Crisp and Soft Computing with Hypercubical Calculus: New Approaches to Modeling in Cognitive Science and Technology with Parity Logic, Fuzzy Logic, and Evolutionary Computing. Volume 27 of Studies in Fuzziness and Soft Computing. Physica-Verlag HD, 1999.
[8] Gérard Langlet. The Fractal MAGIC Universe. Vector, 10:1, p. 137, 1993.
[9] Stephen Wolfram. A New Kind of Science. Wolfram Media, 2002.
[10] Leon Chua and Tamás Roska. Cellular Neural Networks and Visual Computing. Cambridge University Press, 2002.
[11] Stuart Smith. Analyzing Counterchange Patterns with Langlet Transforms. www.cs.uml.edu/~stu

Appendix

This section contains the basic functions of Langlet's system as well as the functions needed to explore the ideas presented in the body of the paper, including the functions used to create figs. 1-13 above. Each entry gives a description of the function followed by its APL definition. Matlab versions of the functions are also available in the Langlet Toolkit at www.cs.uml.edu/~stu.

bmp - Boolean inner product
Inputs: x, y = conformable boolean arrays
Output: z = boolean inner product of x and y
z←x bmp y
z←x≠.∧y

bpp - Raises a boolean matrix to a given power
Inputs: m = square boolean matrix; n = power
Output: y = m raised to the nth power
y←m bpp n
y←⊃≠.⌊/n⍴⊂m

bsi - Scalar integral: the modulo-2 sum of a boolean vector
Input: x = vector of 1's and 0's
Output: y = modulo-2 sum of x (can also be interpreted as the parity of x)
y←bsi x
y←≠/x

bvd - Parity differential. This is the same as the binary-to-Gray-code conversion of a vector of 1's and 0's. bvd is the inverse of bvi.
Input: x = vector of 1's and 0's
Output: y = boolean vector differential of x
y←bvd x
y←x≠¯1↓0,x

bvi - Parity integral. bvi is the same as the Gray-code-to-binary conversion of a vector of 1's and 0's. bvi is the inverse of bvd.
Input: x = vector of 1's and 0's
Output: y = modulo-2 cumulative sum of x
y←bvi x
y←≠\x
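A quick check that bvd and bvi are mutually inverse (a sketch):

      x←1 0 1 1 0 0 1 0
      x≡bvi bvd x
1
      x≡bvd bvi x
1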
cog - Langlet's Cognitive transform. rcog is much faster and can accommodate much larger arguments.
Input: x = vector of 1's and 0's
Output: y = Cognitive transform of x
y←cog x
y←x bmp Gh (⍴x)

fanion - Constructs the "fanion" of a boolean vector. To display a fanion, use the fandisp function.
Input: x = vector of 1's and 0's (length is usually a power of 2)
Output: y = the fanion of x
y←fanion x;n
n←⍴x
y←⊂x
L1:→(n=1)/0
x←(1↓x)≠¯1↓x
y←y,⊂x
n←n-1
→L1

fandisp - Display function for fanions
Input: x = a fanion created with the fanion function
fandisp x;n;i
n←⍴x
x←⍉(1,n)⍴x
⎕←' ',x[1;1]
i←2
L1:→(i>n)/0
⎕←(i↑' '),x[i;1]
i←i+1
→L1

G - Creates Langlet's G matrix, or "geniton"
Input: n = order of G (n is usually a power of 2)
Output: y = G matrix of the specified order
y←G n
y←par n↑1 0

Gd - Creates Langlet's Gd matrix
Input: n = order of Gd (n is usually a power of 2)
Output: y = Gd matrix of the specified order
y←Gd n
y←⌽Gh n

Gh - Creates Langlet's Gh matrix
Input: n = order of Gh (n is usually a power of 2)
Output: y = Gh matrix of the specified order
y←Gh n
y←⊖G n

Gv - Creates Langlet's Gv matrix
Input: n = order of Gv (n is usually a power of 2)
Output: y = Gv matrix of the specified order
y←Gv n
y←⌽G n

hel - Langlet's Helical transform. rhel is much faster and can accommodate much larger arguments.
Input: x = vector of 1's and 0's
Output: y = Helical transform of x
y←hel x
y←x bmp Gv (⍴x)

maj - "Majority" function. Determines whether a boolean matrix contains more 1's than 0's.
Input: x = boolean matrix
Output: y = 1 if 1's outnumber 0's in x, otherwise 0
y←maj x
y←(+/+/x=1)>+/+/x=0

par - Constructs a pariton from a vector of 1's and 0's
Input: x = vector of 1's and 0's (length is usually a power of 2)
Output: y = the pariton of x
y←par x;k;t;n
n←⍴x
k←n-1
t←bvi x
y←t
L1:→(k=0)/L2
t←bvi t
y←y,t
k←k-1
→L1
L2:y←(n,n)⍴y

rcog - Fast divide-and-conquer algorithm to compute the Cognitive transform
Input: x = vector of 1's and 0's; length must be a power of 2
Output: y = Cognitive transform of x
y←rcog x
y←⌽rhel⌽x

rhel - Fast divide-and-conquer algorithm to compute the Helical transform
Input: x = vector of 1's and 0's; length must be a power of 2
Output: y = Helical transform of x
y←rhel x;half;left;right
half←(⍴x)÷2
→(half≤2)/L1
left←rhel half↑x
right←rhel (-half)↑x
y←left,left≠right
→0
L1:left←hel half↑x
right←hel (-half)↑x
y←left,left≠right
→0

tile - Creates L-system patterns using 2×2 block replacement. The 2×2 rules, black and white, are global variables. For usage, see the example programs sierpinski, hadamard, TMplane, stripes, and checkerboard below.
Input: n = number of iterations of block replacement
Output: y = matrix representing the tiled pattern
y←tile n
y←black
L1:→(n=1)/0
y←tile_step y
n←n-1
→L1

tile_step - Computes one step of block replacement. The 2×2 block replacement rules, black and white, are global variables.
Input: x = the start pattern
Output: y = matrix with 1-cells replaced by the 2×2 black pattern and 0-cells replaced by the white pattern
y←tile_step x;x1;x2;t1;t2;t;n
x1←black∘.×x
x2←white∘.×~x
t1←2 4 1 3⍉x1
t2←2 4 1 3⍉x2
t←t1∨t2
n←2×1↑⍴x
y←(n,n)⍴,t
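Two consistency checks relating these functions (a sketch):

      x←1 1 0 1 0 0 1 0
      (cog x)≡⌽hel⌽x      ⍝ cogniton = reversed helicon of the reversed vector
1
      (cog x)≡rcog x       ⍝ the fast version agrees with the definition
1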
Functions to create the figures in the main text

In each definition, n is the number of iterations of block replacement to be performed. The side of the resulting matrix will be 2*n.

sierpinski - Sierpinski triangle
y←sierpinski n;black;white
black←2 2⍴1 1 1 0
white←2 2⍴0 0 0 0
y←tile n

hadamard - Hadamard matrix
y←hadamard n;black;white
black←2 2⍴1 1 1 0
white←2 2⍴0 0 0 1
y←tile n

TMplane - Thue-Morse plane
y←TMplane n;black;white
black←2 2⍴1 0 0 1
white←2 2⍴0 1 1 0
y←tile n

stripes - Vertical stripes
y←stripes n;black;white
black←2 2⍴1 0 1 0
white←2 2⍴1 0 1 0
y←tile n

checkerboard - Checkerboard
y←checkerboard n;x
x←stripes n
y←x≠⍉x
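For example, the smallest nontrivial Thue-Morse plane (a sketch of the expected session output):

      TMplane 2
1 0 0 1
0 1 1 0
0 1 1 0
1 0 0 1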