Two dimensional prefix codes of pictures

Two dimensional prefix codes of pictures ? Marcella Anselmo,1 Dora Giammarresi,2 Maria Madonia3 1 2 Dipartimento di Informatica, Università di Salerno I-84084 Fisciano (SA) Italy. E-mail: anselmo@dia.unisa.it Dipartimento di Matematica. Università di Roma “Tor Vergata”, via della Ricerca Scientifica, 00133 Roma, Italy. E-mail: giammarr@mat.uniroma2.it 3 Dipartimento di Matematica e Informatica, Università di Catania, Viale Andrea Doria 6/a, 95125 Catania, Italy. E-mail: madonia@dmi.unict.it Abstract. A two-dimensional code is defined as a set X ⊆ Σ ∗∗ such that any picture over Σ is tilable in at most one way with pictures in X. The codicity problem is undecidable. The subclass of prefix codes is introduced and it is proved that it is decidable whether a finite set of pictures is a prefix code. Further a polynomial time decoding algorithm for finite prefix codes is given. Maximality and completeness of finite prefix codes are studied: differently from the onedimensional case, they are not equivalent notions. Completeness of finite prefix codes is characterized. 1 Introduction The theory of word codes is a well established subject of investigation in theoretical computer science. Results are related to combinatorics on words, formal languages, automata theory and semigroup theory. In fact the aim is to find structural properties of codes to be exploited for their construction. We refer to [7] for complete references. During the last fifty years, many researchers investigated how the formal language theory can be transferred into a two-dimensional (2D) world (e.g. [8, 11, 12, 4, 18]). Extensions of classical words to two dimensions bring in general to the definition of polyominoes, labeled polyominoes, directed polyominoes, as well as rectangular labeled polyominoes usually referred to as pictures. Some different attempts were done to generalize the notion of code to those 2D objects. A set C of polyominoes is a code if every polyomino that is tilable with (copies of) elements of C, it is so in a unique way. Most of the results show that in the 2D context we loose important properties. A major result due to D. Beauquier and M. Nivat states that the problem whether a finite set of polyominoes is a code is undecidable, and the same result holds also for dominoes ([6]). Related particular cases were studied in [1]. In [14] codes of directed polyominoes equipped with catenation operations are considered, and some special decidable cases are detected. Codes of labeled polyominoes, called bricks, are studied in [17] and further undecidability results are proved. ? Partially supported by MIUR Project “Aspetti matematici e applicazioni emergenti degli automi e dei linguaggi formali”, by 60% Projects of University of Catania, Roma “Tor Vergata”, Salerno. As major observation, remark that all mentioned results consider 2D codes independently from a 2D language theory. In this paper we consider codes of pictures, i.e. rectangular arrays of symbols. Two partially operations are generally considered on pictures: the row/column concatenations. Using these operations, in [10] doubly-ranked monoids are introduced and picture codes are studied in order to extend syntactic properties to two dimensions. Unfortunately most of the results are again negative. Even the definition of prefix picture codes in [13] does not lead to any wide enough class. We study picture codes in relation to the family REC of picture languages recognized by tiling systems. REC is defined by naturally extending to two-dimensions a characterization of finite automata by means of local sets and alphabetic projections ([12]). But pictures are much more difficult to deal with than strings and in fact a crucial difference is that REC is intrinsically non-deterministic and the parsing problem is NP-complete ([16]). In [2, 3] unambiguous and deterministic tiling systems, together with the corresponding subfamilies DREC and UREC, are defined; it is proved the all the inclusions are proper. Moreover the problem whether a given tiling system is unambiguous is undecidable. Despite these facts, REC has several remarkable properties that let it be the most accredited generalization to 2D of regular string languages. In this paper the definition of code is given in terms of the operation of tiling star as defined in [18]: the tiling star of a set X is the set X ∗∗ of all pictures that are tilable (in the polyominoes style) by elements of X. Then X is a code if any picture in X ∗∗ is tilable in one way. Remark that if X ∈ REC then X ∗∗ is also in REC. By analogy to the string case, we denote by flower tiling system a special tiling system that recognizes X ∗∗ and show that if X is a picture code then such flower tiling system is unambiguous and therefore X ∗∗ belongs to UREC. This result sounds like a nice connection to the word code theory, but, unfortunately, again we prove that it is undecidable whether a given set of pictures is a code. This is actually not surprising because it is coherent with the known result of undecidability for UREC. Inspired by the definition of DREC, we propose prefix codes. Pictures are then considered with a preferred scanning direction: from top-left corner to the bottom-right corner. Intuitively, we assume that if X is a prefix code, when decoding a picture p starting from top-left corner, it can be univocally decided which element in X we can start with. The formal definition of prefix codes involves polyominoes. In fact, in the middle of the decoding process, the already decoded part of p is not necessarily rectangular, i.e. it is in general a polyomino. More precisely, we get a special kind of polyominoes that are vertically connected and always contain the whole first row of their minimal bounding box: we refer to them as comb polyominoes. Remark that this makes a big difference with the word code theory where a (connected) part of a word is always a word. We define X to be a prefix set of pictures by imposing that any comb polyomino that is tilable by pictures in X cannot “start” with two different pictures of set X. We prove that it can be always verified whether a finite set of picture is a prefix set and that, as in 1D case, every prefix set of picture is a code. Moreover we present a polynomial time decoding algorithm for finite prefix codes. We extend to picture codes also the classical notions of maximal and complete codes. We present several results regarding the relations between these two notions and give a characterization for maximal complete finite prefix codes. 2 Preliminaries We introduce some definitions about two-dimensional languages (see [12]). A picture over a finite alphabet Σ is a two-dimensional rectangular array of elements of Σ. Given a picture p, |p|row and |p|col denote the number of rows and columns, respectively; |p| = (|p|row , |p|col ) denotes the picture size. The set of all pictures over Σ of fixed size (m, n) is denoted by Σ m,n , while Σ m∗ and Σ ∗n denote the set of all pictures over Σ with fixed number of rows m and columns n, respectively. The set of all pictures over Σ is denoted by Σ ∗∗ . A two-dimensional language (or picture language) over Σ is a subset of Σ ∗∗ . The domain of a picture p is the set of coordinates dom(p) = {1, 2, . . . , |p|row } × {1, 2, . . . , |p|col }. We let p(i, j) denote the symbol in p at coordinates (i, j). Positions in dom(p) are ordered following the lexicographic order: (i, j) < (i0 , j 0 ) if either i < i0 or i = i0 and j < j 0 . Moreover, to easily detect border positions of pictures, we use initials of words “top”, “bottom”, “left” and “right”: then, for example the tlcorner of p refers to position (1, 1). A subdomain of dom(p) is a set d of the form {i, i + 1, . . . , i0 } × {j, j + 1, . . . , j 0 }, where 1 ≤ i ≤ i0 ≤ |p|row , 1 ≤ j ≤ j 0 ≤ |p|col , also specified by the pair [(i, j), (i0 , j 0 )]. The subpicture of p associated to [(i, j), (i0 , j 0 )] is the portion of p corresponding to positions in the subdomain and is denoted by p[(i, j), (i0 , j 0 )]. Given pictures x, p, with |x|row ≤ |p|row and |x|col ≤ |p|col , we say that x is a prefix of p if x is a subpicture of p corresponding to its top-left portion, i.e. if x = p[(1, 1), (|x|row , |x|col )]. Let p, q ∈ Σ ∗∗ pictures of size (m, n) and (m0 , n0 ), respectively, the column concatenation of p and q (denoted by p : q) and the row concatenation of p and q (denoted by p q) are partial operations, defined only if m = m0 and if n = n0 , respectively, as: p:q = p q pq = p q . These definitions can be extended to define two-dimensional languages row- and columnconcatenations and row- and column- stars ([2, 12]). In this paper we will consider another interesting star operation for picture language introduced by D. Simplot in [18]. The idea is to compose pictures in a way to cover a rectangular area without the restriction that each single concatenation must be a or : operation. For example, the following figure sketches a possible kind of composition that is not allowed applying only or a : operations. Definition 1. The tiling star of X, denoted by X ∗∗ , is the set of pictures p whose domain can be partitioned in disjoint subdomains {d1 , d2 , . . . , dk } such that any subpicture ph of p associated with the subdomain dh belongs to X, for all h = 1, ..., k. Language X ∗∗ is called the set of all tilings by X in [18]. In the sequel, if p ∈ X ∗∗ , the partition t = {d1 , d2 , . . . , dk } of dom(p), together with the corresponding pictures {p1 , p2 , . . . , pk }, is called a tiling decomposition of p in X. In this paper, while dealing with tiling star of a set X, we will need to manage also non-rectangular “portions” of pictures composed by elements of X: those are actually labeled polyominoes, that we will call polyominoes, for the sake of simplicity. We extend notations and definitions from pictures to polyominoes by simply defining the domain of a (labeled) polyomino as the set of pairs (i, j) corresponding to all positions occupied inside its minimal bounding box, being (1, 1) the tl-corner position of the bounding box. See the examples below. a a b a b a b b a a a a a a a a (a) a b a b a a a b a a a b a a a a a b (b) (c) Then we can use notion, for a picture x, to be subpicture or prefix of a polyomino c. Observe that the notion of prefix of polyomino makes sense only if the domain of the polyomino contains (1, 1). If C is a set of polyominoes, Pref(C) denotes the (possibly empty) set of all pictures that are prefix of some element in C. Moreover, we can extend to polyominoes the notion of tiling decomposition in a set of pictures X. We can also define a sort of tiling star that, applied to a set of pictures X, produces the set of all polyominoes that have a tiling decomposition in X. If a polyomino p belongs to the polyomino star of X, we say that p is tilable in X. Let us now recall definitions and properties of recognizable picture languages. Given a finite alphabet Γ , a two-dimensional language L ⊆ Γ ∗∗ is local if L coincides with the set of pictures whose sub-blocks of size (2, 2), are all in a given finite set Θ of allowed tiles (considering also border positions). Language X ⊆ Σ ∗∗ is recognizable if it is the projection of a local language over alphabet Γ (by means of a projection π : Γ → Σ). A tiling system for X is a quadruple specifying the four ingredients necessary to recognize X: (Σ, Γ, Θ, π). The family of all recognizable picture languages is denoted by REC. A tiling system is unambiguous if any recognized picture has a unique pre-image in the local language, and UREC is the class of languages recognized by an unambiguous tiling system [3]. REC family shares several properties with the regular string languages. In particular, REC is closed under row/column concatenations, row/column stars, and tiling star ([12, 18]). Note that the notion of locality/recognizability by tiles corresponds to that of finite type/sofic subshifts in symbolic dynamical systems [15]. 3 Two-dimensional codes In the literature, many authors afforded the definition of codes in two dimensions. In different contexts, polyomino codes, picture codes, and brick codes were defined. We introduce two-dimensional codes, according to the theory of recognizable languages, and in a slight different way from all the mentioned definitions. Definition 2. X ⊆ Σ ∗∗ is a code iff any p ∈ Σ ∗∗ has at most one tiling decomposition in X. We consider some simple examples. Let Σ = {a, b} be the alphabet. a aa Example 1. Let X = a b , , . It is easy to see that X is a code. Any picture b aa p ∈ X ∗∗ can be decomposed starting at tl-corner and checking the size (2, 2) subpicture p[(1, 1), (2, 2)]: it can be univocally decomposed in X. Then, proceed similarly for the next contiguous size (2, 2) subpictures. a aba has Example 2. Let X = a b , b a , . Set X is not a code. Indeed picture a aba aba aba the two following different tiling decompositions in X: t1 = and t2 = . aba aba In the 1D setting, a string language X is a code if and only if a special flower automaton for X ∗ is unambiguous (cf. [7]). In 2D an analogous result holds for a finite picture language X, by introducing a special tiling system recognizing X ∗∗ , we call the flower tiling system of X, in analogy to the 1D case. The construction (omitted for lack of space) goes similarly to the ones in [12, 18], assigning a different “colour” to any different picture in X. In this way, one can show that, a finite language X is a code if and only if the flower tiling system of X is unambiguous. Unfortunately such result cannot be used to decide whether a finite language is a code: the problem whether a tiling system is unambiguous is undecidable [3]. As a positive consequence we obtain the following non-trivial result; recall that in 2D not any recognizable language can be recognized in a unambiguous way. Proposition 1. Let X be a finite language. If X is a code then X ∗∗ is in UREC. The unambiguity of a tiling system remains undecidable also for flower tiling systems. In fact, according to the undecidability of similar problems in two dimensions (codicity is undecidable for polyomino codes, picture codes, and brick codes), the codicity problem is undecidable also in our setting. 2D-C ODICITY P ROBLEM I NPUT: X ⊆ Σ ∗∗ , X finite O UTPUT: T RUE if X is a code, FALSE otherwise. Proposition 2. The 2D-C ODICITY P ROBLEM is undecidable. Proof. (Sketch) The Thue system word problem reduces to the 2D-C ODICITY P ROB LEM . The well-known undecidability of the Thue system word problem (see e.g. [9]) will imply the undecidability of the 2D-C ODICITY P ROBLEM. For a given Thue system (Σ, S) and two words u, v ∈ Σ ∗ we construct the set XS,u,v of square bricks over an alphabet that extends Σ, in the way defined in [17], Section 3. Each square brick can be regarded to as a picture, and hence XS,u,v can be consider as a picture language XS,u,v ⊆ Σ ∗∗ . The authors of [17] show that u and v are equivalent iff XS,u,v is not a brick code. The proof is completed by showing that XS,u,v is a brick code iff it is a picture code. t u Next step will be to consider subclasses of codes that are decidable. In 1D an important class of codes is that one of prefix codes. In the next section we consider a possible extension of the definition to two dimensions. 4 Prefix codes In Section 2 we reported the definition of prefix of a picture p, as a subpicture x corresponding to the top-left portion of p. Such definition “translates” to 2D the classic notion of prefix of a string. Starting from this, one could merely define a set of picture X to be prefix whenever it does not contain pictures that are prefixes of other pictures in X. Unfortunately this property would not guarantee the set to be a code. Consider for example the set X introduced in Example 2. No picture in X is prefix of another picture in X; nevertheless X is not a code. The basic idea in defining a prefix code is to prevent the possibility to start decoding a picture in two different ways (as it is for the prefix string codes). One major difference going from 1D to 2D case is that, while any initial part of a decomposition of a string is still a string, the initial part of a decomposition of a picture, at an intermediate step, has not necessarily a rectangular shape. Starting from the tl-corner of a picture, it is possible to reconstruct its tiling decomposition in many different ways, obtaining, as intermediate products, some (labeled) polyominoes whose domain contain always the tl-corner position (1, 1). If we proceed by processing positions (i, j) in lexicographic order, the domains of these polyominoes will have the peculiarity that if they contain a position (h, k) then they contain also all positions above it up to the first row. We name these special polyominoes as follows. Definition 3. A corner polyomino is a labeled polyomino whose domain contains position (1, 1). A comb polyomino is a corner polyomino whose domain is the following set of positions for some n, h1 , h2 , · · · , hn ≥ 1: (1, 1) (1, 2) . . . . . . (1, n) .. (2, 1) . (2, n) .. .. . (h2 , 2) . (h1 , 1) (hn , n) In other words, a comb polyomino is a column convex corner polyomino whose domain contains all positions in the first row of its minimal bounding box. See the figure in Section 2, where (b) is a corner (but not a comb) polyomino, and (c) is a comb polyomino. In the literature these corner polyominoes correspond to labeled directed polyominoes while comb polyominoes correspond to labeled skyline or Manhattan polyominoes rotated by 180 degrees. Now we can define the comb star, a sort of tiling star that, applied to a set of pictures X, produces a set of comb polyominoes. This set of comb polyominoes tiliable in X will be denoted by X ** and will be called the comb star of X. Comb polyominoes and comb star are used to define prefix sets. A set of pictures X is prefix if any decomposition of a comb polyomino in X ** can “start” (in its tlcorner) in a unique way with a picture of X. To give a formal definition, recall that a picture p is a prefix of a (comb) polyomino c if the domain of c includes all positions in dom(p) = {1, 2, . . . , |p|row } × {1, 2, . . . , |p|col }, and c(i, j) = p(i, j) for all positions (i, j) ∈ dom(p). Definition 4. A set X ⊆ Σ ∗∗ is prefix if any two different pictures in X cannot be both prefix of the same comb polyomino c ∈ X **. Example 3. Let X ⊆ Σ ∗∗ . One can easily show the following. If |Σ| = 1 then X is prefix if and only if |X| = 1. If |X| = 2 and |Σ| ≥ 2, then X is prefix if and only if the two pictures in X are not the power of a same picture (cannot be obtained by column and row concatenation of a same picture). Any set X ⊆ Σ m,n containing pictures on Σ of fixed size (m, n), is always prefix. Example 4. It is easy to verify that the set X of Example 1 is prefix. On the contrary, the set X of Example 2 is not prefix: two different elements of X are prefixes of the aba that belongs to X **. comb polyomino c = aba Example 5. Let X = { aba , abb , b aa aa aa aa ba ba b b , , , , , , , , b aa ab ba b b aa ab aa bb }. Language X is prefix: no two pictures in X can be overlapped in order to be ab both prefixes of the same comb polyomino. Let us introduce another notion related to tiling that will be useful for the proofs in the sequel: the notion of covering. For example, in the figure below, the picture with thick borders is (properly) covered by the others. Definition 5. A picture p is covered by a set of pictures X, if there exists a corner polyomino c such that p is prefix of c and the domain of c can be partitioned in rectangular subdomains {d1 , ..., dh } such that each di corresponds to a picture in X and the tl-corner of each di belongs to the domain of p. Moreover p is properly covered if the subdomain containing position (1, 1) corresponds to a picture different from p itself. Proposition 3. A set X is prefix if and only if every x ∈ X cannot be properly covered by pictures in X. Proof. Assume there exists x ∈ X properly covered by pictures in X, let c be the corresponding corner polyomino, and t its tiling decomposition. Then c can be “completed” to a comb polyomino, by filling all the empty parts of c, that contradict column convexity of c, up to the first row. Indeed it is possible to concatenate some copies of elements of X, that occurr in t, and cover the bottom and right border positions of x (the tl-corner of each subdomain in t belongs to dom(x)). For example in the figure before Definition 5, one may concatenate a copy of the first two pictures crossing the right border. Conversely, if there exist x, y ∈ X both prefixes of a comb polyomino c, with a tiling decomposition t in X then, if the subdomain of t containing position (1, 1) does not correspond to x (to y, resp.), one can properly cover x (y, resp.), by taking only the elements of X whose tl-corners belongs to the domain of x (y, resp.). t u Definition 4 seems to be a right generalization of prefix set of strings since it implies the desired property for the set to be a code. Remark that the following result holds without the hypothesis of finiteness on X, and its converse does not hold. Proposition 4. If X ⊆ Σ ∗∗ is prefix then X is a code. Proof. (Sketch) Suppose by contradiction that there exists a picture u ∈ Σ ∗∗ that admits two different tiling decompositions in X, say t1 and t2 . Now, let (i0 , j0 ) the smallest position (in lexicographic order) of u, where t1 and t2 differ. Position (i0 , j0 ) corresponds in t1 to position (1, 1) of some x1 ∈ X, and in t2 to position (1, 1) of some x2 ∈ X, with x1 6= x2 . See the figure below, where a dot indicates position (i0 , j0 ) of u in t1 (on the left) and t2 (on the right), respectively. r x1 r x 2 Consider now the size of x1 and x2 and suppose, without loss of generality, that |x1 |row > |x2 |row and |x1 |col < |x2 |col (equalities are avoided because of the prefix hypothesis). In this case, x1 together with other pictures placed to its right in the decomposition t1 would be a proper cover for x2 that, by Proposition 3, contradicts the hypothesis of X prefix set. t u Proposition 4 shows that prefix sets form a class of codes: the prefix codes. Contrarily to the case of all other known classes of 2D codes, the family of finite prefix codes has the important property to be decidable. Using results in Proposition 3 together with the hypothesis that X is finite, the following result can be proved. Proposition 5. It is decidable whether a finite set of picture X ⊆ Σ ∗∗ is a prefix code. We now present a decoding algorithm for a finite prefix picture code. Proposition 6. There exists a polynomial algorithm that, given a finite prefix code X ⊆ Σ ∗∗ and a picture p ∈ Σ ∗∗ , finds, if it exists, a tiling decomposition of p in X, otherwise it exits with negative answer. Proof. Here is a sketch of the algorithm and its correctness proof. The algorithm scans p using a “current position” (i, j) and a “current partial tiling decomposition” T that contains some of the positions of p grouped as collections of rectangular p subdomains. Notice that, at each step, the partial decomposition corresponds to a comb polyomino. 1. Sort pictures in X with respect to their size (i.e. by increasing number of rows and, for those of same number of rows, by increasing number of columns). 2. Start from tl-corner position of p ( (i, j) = (1, 1)) and T = ∅. 3. Find the smallest picture x ∈ X that is a subpicture of p when associated at subdomain with tl-corner in (i, j) and such that positions of this subdomain were not yet put in T . More technically: let (rx , cx ) indicate the size of x. We find the smallest x such that x is subpicture of p at subdomain dh = [(i, j), (i + rx − 1, j + cx − 1)] and positions in dh are not in the current T . 4. If x does not exist then exit with negative response, else add the subdomain dh to T and set (i, j) as the smallest position not yet added to (subdomains in) T . 5. If (i, j) does not exist then exit returning T , else repeat from step 3. It is easy to understand that if p ∈ / X ∗∗ then at step 3, the algorithm will not find any x, and will exit giving no decomposition in step 4. Suppose now that p ∈ X ∗∗ that is there exists a (unique) decomposition T for p. At each step, the algorithm chooses the right element x to be added to the current decomposition. This can be proved by induction using an argument similar to the one in the proof of Proposition 4. Recall that x is the smallest (as number of rows) picture that is a subpicture of p at current position (i, j). If there is another element y that is subpicture of p at position (i, j), y cannot be part of a decomposition T of p otherwise y possibly together with other pictures to its right in T would be a cover for x and this contradicts the fact that X is prefix. Regarding the time complexity, the most expensive step is the third one that it is clearly polynomial in the sum of the areas of pictures in X and in the area of p. We remark that, using a clever preprocessing on the pictures in X, it can be made linear in the area of set X and picture p. t u We conclude the section by remarking that, moving to the context of recognizable picture languages, the problem of finding a decomposition of p, corresponds to check whether picture p belongs to the REC language X ∗∗ or, equivalently, if p can be recognized by means of the flower tiling system for X ∗∗ . In the general case the parsing problem in REC is NP-complete ([16]) and there are no known (non deterministic) cases for which the parsing problem is feasible. The presented algorithm shows that, in the case of X finite prefix set, the parsing problem for language X ∗∗ can be solved in polynomial time. 5 Maximal and complete codes Maximality is a central notion in theory of (word) codes: the subset of any code is a code, and then the investigation may restrict to maximal codes. In 1D maximality coincides with completeness, for thin codes. In this section we investigate on maximal and complete 2D codes: we show that, differently from the 1D case, complete codes are a proper subset of maximal codes. Moreover we give a full characterization for complete prefix codes. For lack of space all proofs are sketched. Definition 6. A code X ⊆ Σ ∗∗ is maximal over Σ if X is not properly contained in any other code over Σ that is, X ⊆ Y ⊆ Σ ∗∗ and Y code imply X = Y . A prefix code X ⊆ Σ ∗∗ is maximal prefix over Σ if it is not properly contained in any other prefix code over Σ, that is, X ⊆ Y ⊆ Σ ∗∗ and Y prefix imply X = Y . Example 6. The set X = Σ m,n of all pictures on Σ of fixed size (m, n), is a prefix code (see Example 3). It is also maximal as a code and as a prefix code. The following lemma gives a general tool to verify whether a finite prefix set is maximal prefix. It shows that if a prefix code is not maximal prefix, there is always a “small” picture that witnesses it. As consequence we can check whether a prefix set is maximal prefix by restricting our test to a finite number of pictures. Lemma 1. Let X be a prefix finite set, rX = max{|x|row , x ∈ X}, and cX = max{|x|col , x ∈ X}. If X is not maximal prefix, then there exists p0 ∈ Σ ∗∗ , p0 ∈ / X, such that X ∪ {p0 } is still prefix and |p0 |row ≤ rX + 1, |p0 |col ≤ cX + 1. Proof. Let p ∈ Σ ∗∗ , p ∈ / X, such that X ∪ {p} is still prefix. If |p|row > rX + 1, |p|col > cX + 1, then let p0 be the prefix of p of size (rX + 1, cX + 1). Then X ∪ {p0 } is prefix. Otherwise one could properly cover a picture in X with some pictures in X ∪ {p0 }, and then, replacing p0 with p, also with pictures in X ∪ {p}, against X ∪ {p} is prefix and Proposition 3. t u Applying Lemma 1 one can show the following proposition. Proposition 7. It is decidable whether a finite prefix set X is maximal prefix. Any prefix code that is maximal, is trivially maximal prefix too. In 1D the converse holds for finite sets (thin indeed). In 2D the converse does not hold. Proposition 8 gives an example of a prefix maximal set that is not a maximal code. Let us show some preliminary result. Informally speaking, a picture p is unbordered if it cannot be self overlapped. Lemma 2. Let X ⊆ Σ ∗∗ , |Σ| ≥ 2. If there exists a picture p ∈ / P ref (X **), then there exists an unbordered picture p0 , with p prefix of p0 , and p0 ∈ / P ref (X **). Proof. Suppose |p| = (m, n), let p(1, 1) = a and let b ∈ Σ \ {a}. The picture p0 can be obtained by the following operations: surround p with a row and a column of a; then row concatenate a picture of size (m, n + 1) filled by b, unless for its bl position; finally column concatenate a picture of size (2m + 1, n + 1) filled by b, unless for its bl position. The resulting picture is a picture p0 of size (2m + 1, 2n + 2) having p as a prefix. It is possible to claim that p0 is unbordered, by focusing on the possible positions of the subpicture p[(1, n + 1), (m + 1, n + 1)] when p eventually self overlaps. t u Proposition 8. There exists a finite prefix code that is maximal prefix but not maximal code. Proof. Let X be the language of Example 5. Applying Proposition 7 one can show that X is a maximal prefix set. On the other hand, X is not a maximal code. Consider bbab picture p = b b b x , with x, y, z ∈ Σ; p ∈ / P ref (X **). By Lemma 2, there exists abyz also an unbordered picture p0 , such that p is a prefix of p0 , and p0 ∈ / P ref (X **). Let us show that X ∪ {p0 } is still a code. Indeed suppose by contradiction, that there is a picture q ∈ Σ ∗∗ with two different tiling decompositions, say t1 and t2 , in X ∪ {p0 }. By a careful analysis of possible compositions of pictures in X, one can show that if there exists a position (i, j) of q, where t1 and t2 differ, while they coincide in all their tl-positions, then this leads to a contradiction if j = 1, otherwise it implies that there exists another position to the bottom left of (i, j), where t1 and t2 differ, while they coincide in all their tl positions. Repeating the same reasoning for decreasing values of j, this leads again to a contradiction. t u In 1D, the notion of maximality of (prefix) codes is related to the one of (right) completeness. A string language X ⊆ Σ ∗ is defined right complete if the set of all prefixes of X ∗ is equal to Σ ∗ . Then a prefix code is maximal if and only if it is right complete. Let us extend the notion of completeness to 2D languages and investigate the relation of completeness with maximality for prefix codes. Definition 7. A set X ⊆ Σ ∗∗ is tl-complete if P ref (X **) = Σ ∗∗ . Differently from the 1D case, in 2D, the tl-completeness of a prefix set implies its prefix maximality, but it is not its characterization. Proposition 9. Let X ⊆ Σ ∗∗ be a prefix set. If X is tl-complete then X is maximal prefix. Moreover the converse does not hold. Proof. It is easy to prove by contradiction that a tl-complete prefix language is also maximal prefix. For the converse, let X as in Example 5. Applying Proposition 7, one can show that X is a maximal prefix set. Let us show that X is not tl-complete. Consider bbab picture p = b b b x with x, y, z ∈ Σ: there is no comb polyomino c ∈ X ** that has p abyz as a prefix. Indeed, from a careful analysis of possible compositions of pictures in X, it follows that the symbol b in position (2, 3) of p cannot be tiled by pictures in X. t u In 2D tl-complete languages are less rich than in 1D. Complete picture languages are indeed only languages that are somehow “one-dimensional” complete languages, in the sense specified in the following result. Note that the result shows in particular that completeness of finite prefix sets is decidable. Proposition 10. Let X ⊆ Σ ∗∗ be a maximal prefix set. X is tl-complete if and only if X ⊆ Σ m∗ or X ⊆ Σ ∗n . Proof. First observe that when X is a tl-complete prefix set, then no two pictures in X can overlap in a way that their tl corners match (otherwise one of them would be covered by X). Suppose that X is tl-complete, but X * Σ m∗ , for any m. Then there exists x1 , x2 ∈ X, of size (m1 , n1 ) and (m2 , n2 ), respectively, with m1 > m2 . We claim that in this case, n1 = n2 . Indeed if n1 6= n2 , then two cases are possible: n1 > n2 or n1 < n2 . Define y = x1 [(1, 1), (m2 , n1 )]. In the first case, picture p1 , in figure, with t, z ∈ Σ ∗∗ , does not belong to P ref (X **), against X tl-complete. In the second case, picture p2 , in figure, with t ∈ Σ ∗∗ , does not belong to P ref (X **), against X tl-complete. Now, for any picture x ∈ X, the same technique applies either to the pair x, x1 or x, x2 , concluding that X ⊆ Σ ∗n , for some n. x2 p1 = y x2 x1 x1 z t p2 = x1 y x2 t Conversely, suppose X ⊆ Σ m∗ . Setting Γ = Σ m,1 , X can be considered as a set of strings over Γ . Since X is prefix maximal, it is prefix maximal viewed as set of strings over Γ . Therefore, from classical theory of codes (see e.g. [7]), X is right-complete (as a subset of Γ ∗ ), and this easily implies X tl-complete (as a subset of Σ ∗∗ ). t u 6 Conclusions The aim of this paper was to investigate 2D codes in relation with the theory of recognizable picture languages (REC family), starting from the finite case. We determined a meaningful class of 2D codes, the prefix codes, that are decidable and that have very interesting properties to handle with. Our further researches will follow two main directions. First we will extend the investigation to 2D generalizations of other types of string codes, as for example bifix codes. Secondly, we will try to remove the finiteness hypothesis and consider prefix sets belonging to particular sub-families in REC, such as deterministic ones. References 1. P. Aigrain, D. Beauquier. Polyomino tilings, cellular automata and codicity. Theoretical Computer Science, Vol. 147, pp. 165-180, Elsevier 1995. 2. M. Anselmo, D. Giammarresi, M. Madonia. Deterministic and unambiguous families within recognizable two-dimensional languages. Fund. Inform., Vol. 98(2-3), pp. 143-166, 2010. 3. M. Anselmo, D. Giammarresi, M. Madonia, A. Restivo. Unambiguous Recognizable Twodimensional Languages. RAIRO: Theoretical Informatics and Applications, Vol. 40, 2, pp. 227-294, EDP Sciences 2006. 4. M. Anselmo, N. Jonoska, M. Madonia. Framed Versus Unframed Two-dimensional Languages. In M. Nielsen et al. (Eds.) Procs. SOFSEM 09, LNCS 5404, 79-92. Springer-Verlag Berlin Heidelberg 2009. 5. M. Anselmo, M. Madonia. Deterministic and unambiguous two-dimensional languages over one-letter alphabet. Theoretical Computer Science, Vol. 410-16, 1477–1485 Elsevier 2009. 6. D. Beauquier, M. Nivat, A codicity undecidable problem in the plane. Theoret. Comp. Sci. 303 (2003) 417-430 7. J. Berstel, D. Perrin, C. Reutenauer. Codes and Automata. Cambridge University Press, 2009. 8. M. Blum, C. Hewitt. Automata on a two-dimensional tape. IEEE Symposium on Switching and Automata Theory, pp. 155-160, 1967. 9. R. V. Book, F. Otto. String-rewriting Systems. Springer-Verlag, 1993. 10. S. Bozapalidis, A. Grammatikopoulou. Picture codes. ITA, vol. 40 (4), pp. 537-550, 2006. 11. D. Giammarresi, A. Restivo. Recognizable picture languages. Int. Journal Pattern Recognition and Artificial Intelligence. Vol. 6, No. 2& 3, pp. 241-256, 1992. 12. D. Giammarresi, A. Restivo. Two-dimensional languages. Handbook of Formal Languages, G.Rozenberg, et al. Eds, Vol. III, pp. 215-268. Springer Verlag, 1997. 13. A. Grammatikopoulou. Prefix Picture Sets and Picture Codes. In Procs. CAI’05, pp. 255268, 2005. 14. M. Kolarz, W. Moczurad. Multiset, Set and Numerically Decipherable Codes over Directed Figures. In Combinatorial Algorithms, LNCS, Vol. 7643, pp. 224-235, Springer 2012. 15. D. Lind and B. Marcus: An Introduction to Symbolic Dynamics and Coding. Cambridge University Press, Cambridge, 1995. 16. K. Lindgren, C. Moore, M. Nordahl. Complexity of two-dimensional patterns. Journal of Statistical Physics, 91 (5-6), pag. 909–951, 1998. 17. M. Moczurad and W. Moczurad, Some Open Problems in Decidability of Brick (labeled Polyomino) Codes In: K.-Y. Chwa and J.I. Munro (Eds.): COCOON 2004, LNCS 3106, pp. 72-81, 2004. Springer-Verlag Berlin Heidelberg 2004 18. D. Simplot. A Characterization of Recognizable Picture Languages by Tilings by Finite Sets. Theoretical Computer Science, Vol. 218-2, 297–323 Elsevier 1991.

Two dimensional prefix codes of pictures

Related documents

Products

Support

Two dimensional prefix codes of pictures

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib