Two dimensional prefix codes of pictures

advertisement
Two dimensional prefix codes of pictures ?
Marcella Anselmo,1 Dora Giammarresi,2 Maria Madonia3
1
2
Dipartimento di Informatica, Università di Salerno I-84084 Fisciano (SA) Italy. E-mail:
anselmo@dia.unisa.it
Dipartimento di Matematica. Università di Roma “Tor Vergata”, via della Ricerca Scientifica,
00133 Roma, Italy. E-mail: giammarr@mat.uniroma2.it
3
Dipartimento di Matematica e Informatica, Università di Catania, Viale Andrea Doria 6/a,
95125 Catania, Italy. E-mail: madonia@dmi.unict.it
Abstract. A two-dimensional code is defined as a set X ⊆ Σ ∗∗ such that any
picture over Σ is tilable in at most one way with pictures in X. The codicity
problem is undecidable. The subclass of prefix codes is introduced and it is proved
that it is decidable whether a finite set of pictures is a prefix code. Further a
polynomial time decoding algorithm for finite prefix codes is given. Maximality
and completeness of finite prefix codes are studied: differently from the onedimensional case, they are not equivalent notions. Completeness of finite prefix
codes is characterized.
1
Introduction
The theory of word codes is a well established subject of investigation in theoretical
computer science. Results are related to combinatorics on words, formal languages,
automata theory and semigroup theory. In fact the aim is to find structural properties of
codes to be exploited for their construction. We refer to [7] for complete references.
During the last fifty years, many researchers investigated how the formal language
theory can be transferred into a two-dimensional (2D) world (e.g. [8, 11, 12, 4, 18]).
Extensions of classical words to two dimensions bring in general to the definition of
polyominoes, labeled polyominoes, directed polyominoes, as well as rectangular labeled polyominoes usually referred to as pictures. Some different attempts were done
to generalize the notion of code to those 2D objects. A set C of polyominoes is a code if
every polyomino that is tilable with (copies of) elements of C, it is so in a unique way.
Most of the results show that in the 2D context we loose important properties. A major
result due to D. Beauquier and M. Nivat states that the problem whether a finite set
of polyominoes is a code is undecidable, and the same result holds also for dominoes
([6]). Related particular cases were studied in [1]. In [14] codes of directed polyominoes equipped with catenation operations are considered, and some special decidable
cases are detected. Codes of labeled polyominoes, called bricks, are studied in [17] and
further undecidability results are proved.
?
Partially supported by MIUR Project “Aspetti matematici e applicazioni emergenti degli automi e dei linguaggi formali”, by 60% Projects of University of Catania, Roma “Tor Vergata”,
Salerno.
As major observation, remark that all mentioned results consider 2D codes independently from a 2D language theory. In this paper we consider codes of pictures, i.e.
rectangular arrays of symbols. Two partially operations are generally considered on pictures: the row/column concatenations. Using these operations, in [10] doubly-ranked
monoids are introduced and picture codes are studied in order to extend syntactic properties to two dimensions. Unfortunately most of the results are again negative. Even the
definition of prefix picture codes in [13] does not lead to any wide enough class.
We study picture codes in relation to the family REC of picture languages recognized by tiling systems. REC is defined by naturally extending to two-dimensions a
characterization of finite automata by means of local sets and alphabetic projections
([12]). But pictures are much more difficult to deal with than strings and in fact a crucial difference is that REC is intrinsically non-deterministic and the parsing problem
is NP-complete ([16]). In [2, 3] unambiguous and deterministic tiling systems, together
with the corresponding subfamilies DREC and UREC, are defined; it is proved the all
the inclusions are proper. Moreover the problem whether a given tiling system is unambiguous is undecidable. Despite these facts, REC has several remarkable properties that
let it be the most accredited generalization to 2D of regular string languages.
In this paper the definition of code is given in terms of the operation of tiling star
as defined in [18]: the tiling star of a set X is the set X ∗∗ of all pictures that are tilable
(in the polyominoes style) by elements of X. Then X is a code if any picture in X ∗∗ is
tilable in one way. Remark that if X ∈ REC then X ∗∗ is also in REC. By analogy to
the string case, we denote by flower tiling system a special tiling system that recognizes
X ∗∗ and show that if X is a picture code then such flower tiling system is unambiguous
and therefore X ∗∗ belongs to UREC. This result sounds like a nice connection to the
word code theory, but, unfortunately, again we prove that it is undecidable whether a
given set of pictures is a code. This is actually not surprising because it is coherent with
the known result of undecidability for UREC.
Inspired by the definition of DREC, we propose prefix codes. Pictures are then considered with a preferred scanning direction: from top-left corner to the bottom-right corner. Intuitively, we assume that if X is a prefix code, when decoding a picture p starting
from top-left corner, it can be univocally decided which element in X we can start with.
The formal definition of prefix codes involves polyominoes. In fact, in the middle of the
decoding process, the already decoded part of p is not necessarily rectangular, i.e. it is
in general a polyomino. More precisely, we get a special kind of polyominoes that are
vertically connected and always contain the whole first row of their minimal bounding
box: we refer to them as comb polyominoes. Remark that this makes a big difference
with the word code theory where a (connected) part of a word is always a word.
We define X to be a prefix set of pictures by imposing that any comb polyomino
that is tilable by pictures in X cannot “start” with two different pictures of set X.
We prove that it can be always verified whether a finite set of picture is a prefix set
and that, as in 1D case, every prefix set of picture is a code. Moreover we present
a polynomial time decoding algorithm for finite prefix codes. We extend to picture
codes also the classical notions of maximal and complete codes. We present several
results regarding the relations between these two notions and give a characterization for
maximal complete finite prefix codes.
2
Preliminaries
We introduce some definitions about two-dimensional languages (see [12]).
A picture over a finite alphabet Σ is a two-dimensional rectangular array of elements of Σ. Given a picture p, |p|row and |p|col denote the number of rows and columns,
respectively; |p| = (|p|row , |p|col ) denotes the picture size. The set of all pictures over
Σ of fixed size (m, n) is denoted by Σ m,n , while Σ m∗ and Σ ∗n denote the set of all
pictures over Σ with fixed number of rows m and columns n, respectively. The set of all
pictures over Σ is denoted by Σ ∗∗ . A two-dimensional language (or picture language)
over Σ is a subset of Σ ∗∗ .
The domain of a picture p is the set of coordinates dom(p) = {1, 2, . . . , |p|row } ×
{1, 2, . . . , |p|col }. We let p(i, j) denote the symbol in p at coordinates (i, j). Positions
in dom(p) are ordered following the lexicographic order: (i, j) < (i0 , j 0 ) if either
i < i0 or i = i0 and j < j 0 . Moreover, to easily detect border positions of pictures,
we use initials of words “top”, “bottom”, “left” and “right”: then, for example the tlcorner of p refers to position (1, 1). A subdomain of dom(p) is a set d of the form
{i, i + 1, . . . , i0 } × {j, j + 1, . . . , j 0 }, where 1 ≤ i ≤ i0 ≤ |p|row , 1 ≤ j ≤ j 0 ≤ |p|col ,
also specified by the pair [(i, j), (i0 , j 0 )]. The subpicture of p associated to [(i, j), (i0 , j 0 )]
is the portion of p corresponding to positions in the subdomain and is denoted by
p[(i, j), (i0 , j 0 )]. Given pictures x, p, with |x|row ≤ |p|row and |x|col ≤ |p|col , we say
that x is a prefix of p if x is a subpicture of p corresponding to its top-left portion, i.e. if
x = p[(1, 1), (|x|row , |x|col )].
Let p, q ∈ Σ ∗∗ pictures of size (m, n) and (m0 , n0 ), respectively, the column concatenation of p and q (denoted by p : q) and the row concatenation of p and q (denoted
by p q) are partial operations, defined only if m = m0 and if n = n0 , respectively, as:
p:q =
p
q
pq =
p
q
.
These definitions can be extended to define two-dimensional languages row- and columnconcatenations and row- and column- stars ([2, 12]).
In this paper we will consider another interesting star operation for picture language
introduced by D. Simplot in [18]. The idea is to compose pictures in a way to cover a
rectangular area without the restriction that each single concatenation must be a or
: operation. For example, the following figure sketches a possible kind of composition
that is not allowed applying only or a : operations.
Definition 1. The tiling star of X, denoted by X ∗∗ , is the set of pictures p whose domain
can be partitioned in disjoint subdomains {d1 , d2 , . . . , dk } such that any subpicture ph
of p associated with the subdomain dh belongs to X, for all h = 1, ..., k.
Language X ∗∗ is called the set of all tilings by X in [18]. In the sequel, if p ∈ X ∗∗ ,
the partition t = {d1 , d2 , . . . , dk } of dom(p), together with the corresponding pictures
{p1 , p2 , . . . , pk }, is called a tiling decomposition of p in X.
In this paper, while dealing with tiling star of a set X, we will need to manage also
non-rectangular “portions” of pictures composed by elements of X: those are actually
labeled polyominoes, that we will call polyominoes, for the sake of simplicity.
We extend notations and definitions from pictures to polyominoes by simply defining the domain of a (labeled) polyomino as the set of pairs (i, j) corresponding to all
positions occupied inside its minimal bounding box, being (1, 1) the tl-corner position
of the bounding box. See the examples below.
a
a
b
a
b
a
b
b
a
a
a
a
a
a
a
a
(a)
a b
a
b a a
a b a a
a b a a
a
a a
b
(b)
(c)
Then we can use notion, for a picture x, to be subpicture or prefix of a polyomino c.
Observe that the notion of prefix of polyomino makes sense only if the domain of the
polyomino contains (1, 1). If C is a set of polyominoes, Pref(C) denotes the (possibly
empty) set of all pictures that are prefix of some element in C.
Moreover, we can extend to polyominoes the notion of tiling decomposition in a set
of pictures X. We can also define a sort of tiling star that, applied to a set of pictures
X, produces the set of all polyominoes that have a tiling decomposition in X. If a
polyomino p belongs to the polyomino star of X, we say that p is tilable in X.
Let us now recall definitions and properties of recognizable picture languages. Given
a finite alphabet Γ , a two-dimensional language L ⊆ Γ ∗∗ is local if L coincides with the
set of pictures whose sub-blocks of size (2, 2), are all in a given finite set Θ of allowed
tiles (considering also border positions). Language X ⊆ Σ ∗∗ is recognizable if it is the
projection of a local language over alphabet Γ (by means of a projection π : Γ → Σ).
A tiling system for X is a quadruple specifying the four ingredients necessary to recognize X: (Σ, Γ, Θ, π). The family of all recognizable picture languages is denoted by
REC. A tiling system is unambiguous if any recognized picture has a unique pre-image
in the local language, and UREC is the class of languages recognized by an unambiguous tiling system [3]. REC family shares several properties with the regular string
languages. In particular, REC is closed under row/column concatenations, row/column
stars, and tiling star ([12, 18]). Note that the notion of locality/recognizability by tiles
corresponds to that of finite type/sofic subshifts in symbolic dynamical systems [15].
3
Two-dimensional codes
In the literature, many authors afforded the definition of codes in two dimensions. In
different contexts, polyomino codes, picture codes, and brick codes were defined. We
introduce two-dimensional codes, according to the theory of recognizable languages,
and in a slight different way from all the mentioned definitions.
Definition 2. X ⊆ Σ ∗∗ is a code iff any p ∈ Σ ∗∗ has at most one tiling decomposition
in X.
We consider some simple examples. Let Σ = {a, b} be the alphabet.
a aa
Example 1. Let X = a b , ,
. It is easy to see that X is a code. Any picture
b aa
p ∈ X ∗∗ can be decomposed starting at tl-corner and checking the size (2, 2) subpicture
p[(1, 1), (2, 2)]: it can be univocally decomposed in X. Then, proceed similarly for the
next contiguous size (2, 2) subpictures.
a
aba
has
Example 2. Let X = a b , b a ,
. Set X is not a code. Indeed picture
a
aba
aba
aba
the two following different tiling decompositions in X: t1 =
and t2 =
.
aba
aba
In the 1D setting, a string language X is a code if and only if a special flower automaton for X ∗ is unambiguous (cf. [7]). In 2D an analogous result holds for a finite
picture language X, by introducing a special tiling system recognizing X ∗∗ , we call
the flower tiling system of X, in analogy to the 1D case. The construction (omitted for
lack of space) goes similarly to the ones in [12, 18], assigning a different “colour” to
any different picture in X. In this way, one can show that, a finite language X is a code
if and only if the flower tiling system of X is unambiguous. Unfortunately such result
cannot be used to decide whether a finite language is a code: the problem whether a
tiling system is unambiguous is undecidable [3]. As a positive consequence we obtain
the following non-trivial result; recall that in 2D not any recognizable language can be
recognized in a unambiguous way.
Proposition 1. Let X be a finite language. If X is a code then X ∗∗ is in UREC.
The unambiguity of a tiling system remains undecidable also for flower tiling systems.
In fact, according to the undecidability of similar problems in two dimensions (codicity is undecidable for polyomino codes, picture codes, and brick codes), the codicity
problem is undecidable also in our setting.
2D-C ODICITY P ROBLEM
I NPUT: X ⊆ Σ ∗∗ , X finite
O UTPUT: T RUE if X is a code, FALSE otherwise.
Proposition 2. The 2D-C ODICITY P ROBLEM is undecidable.
Proof. (Sketch) The Thue system word problem reduces to the 2D-C ODICITY P ROB LEM . The well-known undecidability of the Thue system word problem (see e.g. [9])
will imply the undecidability of the 2D-C ODICITY P ROBLEM. For a given Thue system (Σ, S) and two words u, v ∈ Σ ∗ we construct the set XS,u,v of square bricks over
an alphabet that extends Σ, in the way defined in [17], Section 3. Each square brick
can be regarded to as a picture, and hence XS,u,v can be consider as a picture language
XS,u,v ⊆ Σ ∗∗ . The authors of [17] show that u and v are equivalent iff XS,u,v is not
a brick code. The proof is completed by showing that XS,u,v is a brick code iff it is a
picture code.
t
u
Next step will be to consider subclasses of codes that are decidable. In 1D an important class of codes is that one of prefix codes. In the next section we consider a possible
extension of the definition to two dimensions.
4
Prefix codes
In Section 2 we reported the definition of prefix of a picture p, as a subpicture x corresponding to the top-left portion of p. Such definition “translates” to 2D the classic
notion of prefix of a string. Starting from this, one could merely define a set of picture
X to be prefix whenever it does not contain pictures that are prefixes of other pictures
in X. Unfortunately this property would not guarantee the set to be a code. Consider
for example the set X introduced in Example 2. No picture in X is prefix of another
picture in X; nevertheless X is not a code.
The basic idea in defining a prefix code is to prevent the possibility to start decoding
a picture in two different ways (as it is for the prefix string codes). One major difference
going from 1D to 2D case is that, while any initial part of a decomposition of a string
is still a string, the initial part of a decomposition of a picture, at an intermediate step,
has not necessarily a rectangular shape. Starting from the tl-corner of a picture, it is
possible to reconstruct its tiling decomposition in many different ways, obtaining, as
intermediate products, some (labeled) polyominoes whose domain contain always the
tl-corner position (1, 1). If we proceed by processing positions (i, j) in lexicographic
order, the domains of these polyominoes will have the peculiarity that if they contain a
position (h, k) then they contain also all positions above it up to the first row. We name
these special polyominoes as follows.
Definition 3. A corner polyomino is a labeled polyomino whose domain contains position (1, 1). A comb polyomino is a corner polyomino whose domain is the following
set of positions for some n, h1 , h2 , · · · , hn ≥ 1:
(1, 1) (1, 2) . . . . . . (1, n)
..
(2, 1)
.
(2, n)
..
..
.
(h2 , 2)
.
(h1 , 1)
(hn , n)
In other words, a comb polyomino is a column convex corner polyomino whose domain
contains all positions in the first row of its minimal bounding box. See the figure in Section 2, where (b) is a corner (but not a comb) polyomino, and (c) is a comb polyomino.
In the literature these corner polyominoes correspond to labeled directed polyominoes
while comb polyominoes correspond to labeled skyline or Manhattan polyominoes rotated by 180 degrees. Now we can define the comb star, a sort of tiling star that, applied
to a set of pictures X, produces a set of comb polyominoes. This set of comb polyominoes tiliable in X will be denoted by X ** and will be called the comb star of X.
Comb polyominoes and comb star are used to define prefix sets. A set of pictures
X is prefix if any decomposition of a comb polyomino in X ** can “start” (in its tlcorner) in a unique way with a picture of X. To give a formal definition, recall that a
picture p is a prefix of a (comb) polyomino c if the domain of c includes all positions in
dom(p) = {1, 2, . . . , |p|row } × {1, 2, . . . , |p|col }, and c(i, j) = p(i, j) for all positions
(i, j) ∈ dom(p).
Definition 4. A set X ⊆ Σ ∗∗ is prefix if any two different pictures in X cannot be both
prefix of the same comb polyomino c ∈ X **.
Example 3. Let X ⊆ Σ ∗∗ . One can easily show the following. If |Σ| = 1 then X is
prefix if and only if |X| = 1. If |X| = 2 and |Σ| ≥ 2, then X is prefix if and only if the
two pictures in X are not the power of a same picture (cannot be obtained by column
and row concatenation of a same picture). Any set X ⊆ Σ m,n containing pictures on
Σ of fixed size (m, n), is always prefix.
Example 4. It is easy to verify that the set X of Example 1 is prefix. On the contrary,
the set X of Example 2 is not prefix: two different elements of X are prefixes of the
aba
that belongs to X **.
comb polyomino c =
aba
Example 5. Let X =
{
aba , abb ,
b aa aa aa aa ba ba b b
,
,
,
,
,
,
,
,
b aa ab ba b b aa ab aa
bb
}. Language X is prefix: no two pictures in X can be overlapped in order to be
ab
both prefixes of the same comb polyomino.
Let us introduce another notion related to tiling that will be useful for the proofs in the
sequel: the notion of covering. For example, in the figure below, the picture with thick
borders is (properly) covered by the others.
Definition 5. A picture p is covered by a set of pictures X, if there exists a corner
polyomino c such that p is prefix of c and the domain of c can be partitioned in rectangular subdomains {d1 , ..., dh } such that each di corresponds to a picture in X and the
tl-corner of each di belongs to the domain of p. Moreover p is properly covered if the
subdomain containing position (1, 1) corresponds to a picture different from p itself.
Proposition 3. A set X is prefix if and only if every x ∈ X cannot be properly covered
by pictures in X.
Proof. Assume there exists x ∈ X properly covered by pictures in X, let c be the
corresponding corner polyomino, and t its tiling decomposition. Then c can be “completed” to a comb polyomino, by filling all the empty parts of c, that contradict column
convexity of c, up to the first row. Indeed it is possible to concatenate some copies of
elements of X, that occurr in t, and cover the bottom and right border positions of x
(the tl-corner of each subdomain in t belongs to dom(x)). For example in the figure
before Definition 5, one may concatenate a copy of the first two pictures crossing the
right border. Conversely, if there exist x, y ∈ X both prefixes of a comb polyomino c,
with a tiling decomposition t in X then, if the subdomain of t containing position (1, 1)
does not correspond to x (to y, resp.), one can properly cover x (y, resp.), by taking
only the elements of X whose tl-corners belongs to the domain of x (y, resp.).
t
u
Definition 4 seems to be a right generalization of prefix set of strings since it implies
the desired property for the set to be a code. Remark that the following result holds
without the hypothesis of finiteness on X, and its converse does not hold.
Proposition 4. If X ⊆ Σ ∗∗ is prefix then X is a code.
Proof. (Sketch) Suppose by contradiction that there exists a picture u ∈ Σ ∗∗ that admits two different tiling decompositions in X, say t1 and t2 . Now, let (i0 , j0 ) the smallest position (in lexicographic order) of u, where t1 and t2 differ. Position (i0 , j0 ) corresponds in t1 to position (1, 1) of some x1 ∈ X, and in t2 to position (1, 1) of some
x2 ∈ X, with x1 6= x2 . See the figure below, where a dot indicates position (i0 , j0 ) of
u in t1 (on the left) and t2 (on the right), respectively.
r
x1
r x
2
Consider now the size of x1 and x2 and suppose, without loss of generality, that
|x1 |row > |x2 |row and |x1 |col < |x2 |col (equalities are avoided because of the prefix
hypothesis). In this case, x1 together with other pictures placed to its right in the decomposition t1 would be a proper cover for x2 that, by Proposition 3, contradicts the
hypothesis of X prefix set.
t
u
Proposition 4 shows that prefix sets form a class of codes: the prefix codes. Contrarily to the case of all other known classes of 2D codes, the family of finite prefix codes
has the important property to be decidable. Using results in Proposition 3 together with
the hypothesis that X is finite, the following result can be proved.
Proposition 5. It is decidable whether a finite set of picture X ⊆ Σ ∗∗ is a prefix code.
We now present a decoding algorithm for a finite prefix picture code.
Proposition 6. There exists a polynomial algorithm that, given a finite prefix code X ⊆
Σ ∗∗ and a picture p ∈ Σ ∗∗ , finds, if it exists, a tiling decomposition of p in X, otherwise
it exits with negative answer.
Proof. Here is a sketch of the algorithm and its correctness proof. The algorithm scans
p using a “current position” (i, j) and a “current partial tiling decomposition” T that
contains some of the positions of p grouped as collections of rectangular p subdomains.
Notice that, at each step, the partial decomposition corresponds to a comb polyomino.
1. Sort pictures in X with respect to their size (i.e. by increasing number of rows and,
for those of same number of rows, by increasing number of columns).
2. Start from tl-corner position of p ( (i, j) = (1, 1)) and T = ∅.
3. Find the smallest picture x ∈ X that is a subpicture of p when associated at subdomain with tl-corner in (i, j) and such that positions of this subdomain were not yet
put in T . More technically: let (rx , cx ) indicate the size of x. We find the smallest
x such that x is subpicture of p at subdomain dh = [(i, j), (i + rx − 1, j + cx − 1)]
and positions in dh are not in the current T .
4. If x does not exist then exit with negative response, else add the subdomain dh to
T and set (i, j) as the smallest position not yet added to (subdomains in) T .
5. If (i, j) does not exist then exit returning T , else repeat from step 3.
It is easy to understand that if p ∈
/ X ∗∗ then at step 3, the algorithm will not find any x,
and will exit giving no decomposition in step 4. Suppose now that p ∈ X ∗∗ that is there
exists a (unique) decomposition T for p. At each step, the algorithm chooses the right
element x to be added to the current decomposition. This can be proved by induction
using an argument similar to the one in the proof of Proposition 4. Recall that x is the
smallest (as number of rows) picture that is a subpicture of p at current position (i, j).
If there is another element y that is subpicture of p at position (i, j), y cannot be part of
a decomposition T of p otherwise y possibly together with other pictures to its right in
T would be a cover for x and this contradicts the fact that X is prefix.
Regarding the time complexity, the most expensive step is the third one that it is
clearly polynomial in the sum of the areas of pictures in X and in the area of p. We
remark that, using a clever preprocessing on the pictures in X, it can be made linear in
the area of set X and picture p.
t
u
We conclude the section by remarking that, moving to the context of recognizable
picture languages, the problem of finding a decomposition of p, corresponds to check
whether picture p belongs to the REC language X ∗∗ or, equivalently, if p can be recognized by means of the flower tiling system for X ∗∗ . In the general case the parsing
problem in REC is NP-complete ([16]) and there are no known (non deterministic)
cases for which the parsing problem is feasible. The presented algorithm shows that, in
the case of X finite prefix set, the parsing problem for language X ∗∗ can be solved in
polynomial time.
5
Maximal and complete codes
Maximality is a central notion in theory of (word) codes: the subset of any code is
a code, and then the investigation may restrict to maximal codes. In 1D maximality
coincides with completeness, for thin codes. In this section we investigate on maximal
and complete 2D codes: we show that, differently from the 1D case, complete codes
are a proper subset of maximal codes. Moreover we give a full characterization for
complete prefix codes. For lack of space all proofs are sketched.
Definition 6. A code X ⊆ Σ ∗∗ is maximal over Σ if X is not properly contained in
any other code over Σ that is, X ⊆ Y ⊆ Σ ∗∗ and Y code imply X = Y . A prefix code
X ⊆ Σ ∗∗ is maximal prefix over Σ if it is not properly contained in any other prefix
code over Σ, that is, X ⊆ Y ⊆ Σ ∗∗ and Y prefix imply X = Y .
Example 6. The set X = Σ m,n of all pictures on Σ of fixed size (m, n), is a prefix
code (see Example 3). It is also maximal as a code and as a prefix code.
The following lemma gives a general tool to verify whether a finite prefix set is
maximal prefix. It shows that if a prefix code is not maximal prefix, there is always a
“small” picture that witnesses it. As consequence we can check whether a prefix set is
maximal prefix by restricting our test to a finite number of pictures.
Lemma 1. Let X be a prefix finite set, rX = max{|x|row , x ∈ X}, and cX =
max{|x|col , x ∈ X}. If X is not maximal prefix, then there exists p0 ∈ Σ ∗∗ , p0 ∈
/ X,
such that X ∪ {p0 } is still prefix and |p0 |row ≤ rX + 1, |p0 |col ≤ cX + 1.
Proof. Let p ∈ Σ ∗∗ , p ∈
/ X, such that X ∪ {p} is still prefix. If |p|row > rX + 1,
|p|col > cX + 1, then let p0 be the prefix of p of size (rX + 1, cX + 1). Then X ∪ {p0 }
is prefix. Otherwise one could properly cover a picture in X with some pictures in
X ∪ {p0 }, and then, replacing p0 with p, also with pictures in X ∪ {p}, against X ∪ {p}
is prefix and Proposition 3.
t
u
Applying Lemma 1 one can show the following proposition.
Proposition 7. It is decidable whether a finite prefix set X is maximal prefix.
Any prefix code that is maximal, is trivially maximal prefix too. In 1D the converse
holds for finite sets (thin indeed). In 2D the converse does not hold. Proposition 8 gives
an example of a prefix maximal set that is not a maximal code. Let us show some
preliminary result. Informally speaking, a picture p is unbordered if it cannot be self
overlapped.
Lemma 2. Let X ⊆ Σ ∗∗ , |Σ| ≥ 2. If there exists a picture p ∈
/ P ref (X **), then there
exists an unbordered picture p0 , with p prefix of p0 , and p0 ∈
/ P ref (X **).
Proof. Suppose |p| = (m, n), let p(1, 1) = a and let b ∈ Σ \ {a}. The picture p0
can be obtained by the following operations: surround p with a row and a column of a;
then row concatenate a picture of size (m, n + 1) filled by b, unless for its bl position;
finally column concatenate a picture of size (2m + 1, n + 1) filled by b, unless for its
bl position. The resulting picture is a picture p0 of size (2m + 1, 2n + 2) having p as a
prefix. It is possible to claim that p0 is unbordered, by focusing on the possible positions
of the subpicture p[(1, n + 1), (m + 1, n + 1)] when p eventually self overlaps.
t
u
Proposition 8. There exists a finite prefix code that is maximal prefix but not maximal
code.
Proof. Let X be the language of Example 5. Applying Proposition 7 one can show
that X is a maximal prefix set. On the other hand, X is not a maximal code. Consider
bbab
picture p = b b b x , with x, y, z ∈ Σ; p ∈
/ P ref (X **). By Lemma 2, there exists
abyz
also an unbordered picture p0 , such that p is a prefix of p0 , and p0 ∈
/ P ref (X **). Let
us show that X ∪ {p0 } is still a code. Indeed suppose by contradiction, that there is a
picture q ∈ Σ ∗∗ with two different tiling decompositions, say t1 and t2 , in X ∪ {p0 }.
By a careful analysis of possible compositions of pictures in X, one can show that if
there exists a position (i, j) of q, where t1 and t2 differ, while they coincide in all their
tl-positions, then this leads to a contradiction if j = 1, otherwise it implies that there
exists another position to the bottom left of (i, j), where t1 and t2 differ, while they
coincide in all their tl positions. Repeating the same reasoning for decreasing values of
j, this leads again to a contradiction.
t
u
In 1D, the notion of maximality of (prefix) codes is related to the one of (right) completeness. A string language X ⊆ Σ ∗ is defined right complete if the set of all prefixes
of X ∗ is equal to Σ ∗ . Then a prefix code is maximal if and only if it is right complete.
Let us extend the notion of completeness to 2D languages and investigate the relation
of completeness with maximality for prefix codes.
Definition 7. A set X ⊆ Σ ∗∗ is tl-complete if P ref (X **) = Σ ∗∗ .
Differently from the 1D case, in 2D, the tl-completeness of a prefix set implies its prefix
maximality, but it is not its characterization.
Proposition 9. Let X ⊆ Σ ∗∗ be a prefix set. If X is tl-complete then X is maximal
prefix. Moreover the converse does not hold.
Proof. It is easy to prove by contradiction that a tl-complete prefix language is also
maximal prefix. For the converse, let X as in Example 5. Applying Proposition 7, one
can show that X is a maximal prefix set. Let us show that X is not tl-complete. Consider
bbab
picture p = b b b x with x, y, z ∈ Σ: there is no comb polyomino c ∈ X ** that has p
abyz
as a prefix. Indeed, from a careful analysis of possible compositions of pictures in X, it
follows that the symbol b in position (2, 3) of p cannot be tiled by pictures in X.
t
u
In 2D tl-complete languages are less rich than in 1D. Complete picture languages are
indeed only languages that are somehow “one-dimensional” complete languages, in
the sense specified in the following result. Note that the result shows in particular that
completeness of finite prefix sets is decidable.
Proposition 10. Let X ⊆ Σ ∗∗ be a maximal prefix set. X is tl-complete if and only if
X ⊆ Σ m∗ or X ⊆ Σ ∗n .
Proof. First observe that when X is a tl-complete prefix set, then no two pictures in
X can overlap in a way that their tl corners match (otherwise one of them would be
covered by X). Suppose that X is tl-complete, but X * Σ m∗ , for any m. Then there
exists x1 , x2 ∈ X, of size (m1 , n1 ) and (m2 , n2 ), respectively, with m1 > m2 . We
claim that in this case, n1 = n2 . Indeed if n1 6= n2 , then two cases are possible:
n1 > n2 or n1 < n2 . Define y = x1 [(1, 1), (m2 , n1 )]. In the first case, picture p1 , in
figure, with t, z ∈ Σ ∗∗ , does not belong to P ref (X **), against X tl-complete. In the
second case, picture p2 , in figure, with t ∈ Σ ∗∗ , does not belong to P ref (X **), against
X tl-complete. Now, for any picture x ∈ X, the same technique applies either to the
pair x, x1 or x, x2 , concluding that X ⊆ Σ ∗n , for some n.
x2
p1 =
y
x2
x1
x1
z
t
p2 =
x1
y
x2
t
Conversely, suppose X ⊆ Σ m∗ . Setting Γ = Σ m,1 , X can be considered as a set of
strings over Γ . Since X is prefix maximal, it is prefix maximal viewed as set of strings
over Γ . Therefore, from classical theory of codes (see e.g. [7]), X is right-complete (as
a subset of Γ ∗ ), and this easily implies X tl-complete (as a subset of Σ ∗∗ ).
t
u
6
Conclusions
The aim of this paper was to investigate 2D codes in relation with the theory of recognizable picture languages (REC family), starting from the finite case. We determined a
meaningful class of 2D codes, the prefix codes, that are decidable and that have very
interesting properties to handle with.
Our further researches will follow two main directions. First we will extend the
investigation to 2D generalizations of other types of string codes, as for example bifix
codes. Secondly, we will try to remove the finiteness hypothesis and consider prefix sets
belonging to particular sub-families in REC, such as deterministic ones.
References
1. P. Aigrain, D. Beauquier. Polyomino tilings, cellular automata and codicity. Theoretical
Computer Science, Vol. 147, pp. 165-180, Elsevier 1995.
2. M. Anselmo, D. Giammarresi, M. Madonia. Deterministic and unambiguous families within
recognizable two-dimensional languages. Fund. Inform., Vol. 98(2-3), pp. 143-166, 2010.
3. M. Anselmo, D. Giammarresi, M. Madonia, A. Restivo. Unambiguous Recognizable Twodimensional Languages. RAIRO: Theoretical Informatics and Applications, Vol. 40, 2, pp.
227-294, EDP Sciences 2006.
4. M. Anselmo, N. Jonoska, M. Madonia. Framed Versus Unframed Two-dimensional Languages. In M. Nielsen et al. (Eds.) Procs. SOFSEM 09, LNCS 5404, 79-92. Springer-Verlag
Berlin Heidelberg 2009.
5. M. Anselmo, M. Madonia. Deterministic and unambiguous two-dimensional languages over
one-letter alphabet. Theoretical Computer Science, Vol. 410-16, 1477–1485 Elsevier 2009.
6. D. Beauquier, M. Nivat, A codicity undecidable problem in the plane. Theoret. Comp. Sci.
303 (2003) 417-430
7. J. Berstel, D. Perrin, C. Reutenauer. Codes and Automata. Cambridge University Press,
2009.
8. M. Blum, C. Hewitt. Automata on a two-dimensional tape. IEEE Symposium on Switching
and Automata Theory, pp. 155-160, 1967.
9. R. V. Book, F. Otto. String-rewriting Systems. Springer-Verlag, 1993.
10. S. Bozapalidis, A. Grammatikopoulou. Picture codes. ITA, vol. 40 (4), pp. 537-550, 2006.
11. D. Giammarresi, A. Restivo. Recognizable picture languages. Int. Journal Pattern Recognition and Artificial Intelligence. Vol. 6, No. 2& 3, pp. 241-256, 1992.
12. D. Giammarresi, A. Restivo. Two-dimensional languages. Handbook of Formal Languages,
G.Rozenberg, et al. Eds, Vol. III, pp. 215-268. Springer Verlag, 1997.
13. A. Grammatikopoulou. Prefix Picture Sets and Picture Codes. In Procs. CAI’05, pp. 255268, 2005.
14. M. Kolarz, W. Moczurad. Multiset, Set and Numerically Decipherable Codes over Directed
Figures. In Combinatorial Algorithms, LNCS, Vol. 7643, pp. 224-235, Springer 2012.
15. D. Lind and B. Marcus: An Introduction to Symbolic Dynamics and Coding. Cambridge
University Press, Cambridge, 1995.
16. K. Lindgren, C. Moore, M. Nordahl. Complexity of two-dimensional patterns. Journal of
Statistical Physics, 91 (5-6), pag. 909–951, 1998.
17. M. Moczurad and W. Moczurad, Some Open Problems in Decidability of Brick (labeled
Polyomino) Codes In: K.-Y. Chwa and J.I. Munro (Eds.): COCOON 2004, LNCS 3106, pp.
72-81, 2004. Springer-Verlag Berlin Heidelberg 2004
18. D. Simplot. A Characterization of Recognizable Picture Languages by Tilings by Finite Sets.
Theoretical Computer Science, Vol. 218-2, 297–323 Elsevier 1991.
Download