Transversals in Rectangles
G. H. J. van Rees
Dept. of Computer Science
University of Manitoba
Consider an m by n array containing symbols
with m ≤ n.
We are interested in the existence of transversals.
A section consists of m cells in the rectangular array,
exactly one from each row and at most one from each
column.
A transversal is a section which contains m distinct
symbols.
If the maximum frequency of any symbol in a
rectangular array is 1, then every section is a
transversal.
If the maximum frequency of a symbol in an m by n
array is mn, then there are no transversals.
Eg.
1 1 2 3
1 3 1 1
1 4 3 5
{(1,1,1),(2,2,3),(3,4,5)} is a transversal; each triple is (row, column, symbol).
L(m,n) is the greatest integer such that if each symbol
in a m by n array appears at most L(m,n) times, then
the array must have a transversal.
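A minimal sketch of how the definitions above can be checked by machine, assuming the array is given as a list of rows. The function name `find_transversal` and the backtracking search are illustrative choices, not something taken from the talk; the search is exponential in the worst case and only meant for small arrays.

```python
from typing import List, Optional, Tuple

Cell = Tuple[int, int, object]  # (row, column, symbol), 1-indexed as on the slides


def find_transversal(array: List[List[object]]) -> Optional[List[Cell]]:
    """Return one transversal of an m by n array (m <= n), or None if there is none."""
    m = len(array)
    n = len(array[0]) if m else 0
    used_cols = [False] * n
    used_syms = set()
    chosen: List[Cell] = []

    def extend(row: int) -> bool:
        # Try to pick a cell in this row with an unused column and an unused symbol.
        if row == m:
            return True
        for col in range(n):
            sym = array[row][col]
            if used_cols[col] or sym in used_syms:
                continue
            used_cols[col] = True
            used_syms.add(sym)
            chosen.append((row + 1, col + 1, sym))
            if extend(row + 1):
                return True
            chosen.pop()
            used_syms.discard(sym)
            used_cols[col] = False
        return False

    return chosen if extend(0) else None


example = [[1, 1, 2, 3],
           [1, 3, 1, 1],
           [1, 4, 3, 5]]
print(find_transversal(example))           # [(1, 1, 1), (2, 2, 3), (3, 4, 5)]
print(find_transversal([[1, 1], [1, 1]]))  # None: one symbol fills the array
```

Determining L(m,n) this way would still require ranging over all arrays with a given maximum frequency, which this sketch does not attempt.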
Translation: for a while, as the largest frequency of
an element in the rectangle goes up, all rectangular
arrays have transversals. At some point, we get our
first counterexample i.e. an array with no
transversals. Just before this point the maximum
frequency of a symbol is L(m,n).
To determine L(m,n), we must show that all m by n
arrays with symbols whose frequency is ≤ L(m,n)
must have a transversal and that there is an m by n
array with symbols whose frequency is ≤ L(m,n)+1
that does not have a transversal.
For squares and latin squares this problem has been
around for some time.
Let us look at Parker’s construction.
1 1 2 2 
2 2 3 3


3 3 4 4
no transversal


4 4 1 1 
L(4,4)  3
1
2

3

4
1 2 2 2
2 3 3 3
3 4 4 4

4 1 1 1
1
2

3

4
1 1 2 2 2
2 2 3 3 3
3 3 4 4 4 L(4,6)  5.

4 4 1 1 1
L(4,5)  4. (Show Proof)
But no more
1 1 1 2  2  2  2*
2 2 2 3  3  3* 3
3 3 3 4  4* 4  4
4 4 4 1* 1  1  1
Lots of transversals (one is starred).
Theorem 1: if n ≤ 2m-2, then L(m,n) ≤ n-1.
Lemma 2: L(m,n) ≤ ⌊(mn-1)/(m-1)⌋.
Proof.
If the maximum frequency could exceed (mn-1)/(m-1),
that is, be at least mn/(m-1), then m-1 symbols are
enough to fill the whole array, and an array with only
m-1 symbols can not have a transversal.
QED.
By the way L(1,n)=n. So assume m ≥ 2.
Stein hoped that if n ≤ 2m-2, then L(m,n) ≥ n-1 and if
n > 2m-2 then L(m,n) = ⌊(mn-1)/(m-1)⌋.
For m=2 this is true. It is easy to show.
Akbari, Etesami, Mahini, Mahmoody and Sharifi did
the asymptotics in an excellent paper that has just
appeared in Discrete Math.
Theorem 3: If m ≥ 2 and n ≥ 2m^3-6m^2+6m+1, then
L(m,n) = ⌊(mn-1)/(m-1)⌋.
The proof is recursive and uses mathematical
induction.
Also the result is clearly not best possible: the bound
on n can be improved.
Proof:
1) Delete the mth row. By induction the remaining
(m-1) by n array has a transversal, which is a partial
transversal of length m-1 in the original array. The
inequality is not tight here.
1
 2


2a) 


:
m 1
x
x
x
x
x






x 
6
The x’s can not be larger than m-1 or we have a
transversal and we are done. WLOG let (m,m) be 1.
1
 2


2b) 


x
x
x
x
1 x
x
x
x
;
m 1
x





x 
By counting, the x’s and 1’s can not all be 1’s. This
is not tight. WLOG let (1,m+1) be 2.
2c) [Tableau: as in 2b, with cell (1,m+1) = 2; starred entries mark a partial transversal of length m-1, and the cells now under discussion are marked x.]
By counting, the x’s, 1, and 2’s can not all be 1’s or
2’s. But now we have several distinct places that
could be a 3.
So the 3, wherever it is, causes there to be a partial
transversal of length m-1 with some desired
properties.
So they are constructing partial transversals Ti of
length m-1 with the following properties:
a) Each row, except row i has a cell in Ti.
b) For every j, i<j<m cell (j,j) is included in Ti.
c) The symbols in Ti are 1,2,…,m-1.
This the authors did very carefully and in detail so
that it is utterly convincing that you can do this.
They do this trick m-1 times.
Last tableau may look like:
[Tableau: the array after m-1 rounds, showing the repeated symbols 1, 2, 3, … and the remaining cells marked x.]
3) So they have m-1 symbols appearing twice in the
array producing m-1 partial transversals. The
2m-2 symbols can be made to appear in the first 2m-2
columns. Then consider the remaining columns.
They can not contain any symbol larger than m-1 for
if it did and it was in row i then it and Ti would form
a transversal and we are done.
4) The element m must occur so it must occur in the
first 2m-2 columns. There must be at least one
occurrence of m, say in row i. Now delete row i from
the right-most n-(2m-2) columns and call up the
induction result. This is tight on n. We get a
transversal in the (m-1) by n-(2m-2) array based on
symbols 1,2,…,m-1. In the m by n array this partial
transversal and the element m in row i form a
transversal. QED
Now, they viewed this as an asymptotic result for
large enough n but I like to sharpen things.
They did not account for the fact, when they did the
last induction, that there are 2 copies of the symbols
1,2,…,m-1 in the first 2m-2 columns. If you do, you
can sharpen the bound from
n ≥ 2m^3-6m^2+6m+1 to n ≥ 2m^3-8m^2+12m-3.
I don’t believe this is the right result either. It is
more like m^2.
Now Stein & Szabo worked on m=3 and using ideas
we have just seen and with a lot of case work they
proved:
Theorem 4 For n ≥ 5, L(3,n) = ⌊(3n-1)/2⌋.
i.e. m=3 behaves nicely.
Theorem 5 (S&S) L(m,n) ≥ n-m+1.
Proof: Induction on m. True for m = 2 or 3. Done
by example, but you can use Akbari et al. to give a
real proof.
If L(4,n) ≥ n-3 then show that L(5,n) ≥ n-4. Assume
no transversal, then you get this tableau.
[Tableau: as in the proof of Theorem 3, with the symbols 1, 2, 3, 4 placed and the remaining cells marked x; the 4th row is empty.]
You count occurrences of 1,2,3,4 and there are too
many of them. Contradiction. There is a transversal.
Note that the 4th row is empty. But it also can be
mostly filled with x’s as in Akbari et al. This gives:
Theorem 6 (JVR) For m ≥ 2,
L(m,n) ≥ (m(n-m+1)-1)/(m-1).
So the difference between the upper and lower
bounds is at most m.
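The bounds collected so far can be tabulated with a few lines of code. This is only a sketch: the function names are made up for illustration, the inequalities are read as Theorem 1 and Lemma 2 giving upper bounds and Theorems 5 and 6 giving lower bounds, and the Theorem 6 value is rounded up because L(m,n) is an integer.

```python
from fractions import Fraction
from math import ceil, floor


def upper_bound(m: int, n: int) -> int:
    """Best upper bound on L(m,n) from Theorem 1 and Lemma 2 (needs 2 <= m <= n)."""
    lemma2 = floor(Fraction(m * n - 1, m - 1))
    return min(lemma2, n - 1) if n <= 2 * m - 2 else lemma2


def lower_bound(m: int, n: int) -> int:
    """Best lower bound on L(m,n) from Theorems 5 and 6 (needs 2 <= m <= n)."""
    theorem5 = n - m + 1
    theorem6 = ceil(Fraction(m * (n - m + 1) - 1, m - 1))  # round up: L is an integer
    return max(theorem5, theorem6)


for m, n in [(3, 7), (4, 7), (5, 7)]:
    print(m, n, lower_bound(m, n), upper_bound(m, n))
# 3 7 7 10
# 4 7 5 9
# 5 7 4 6
```

The gap between the two functions never exceeds m, which is the observation made right after Theorem 6.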
For m=4, the asymptotics kick in at n=45, according to
the theorems. However, the upper bound is easy to
prove correct for n ≥ 41, even 39 but not 40, although
it can be dealt with as an ugly special case. So where
do the asymptotics kick in?
Theorem 7 (JVR) For m ≥ 3,
L(m,n) < ⌊(mn-1)/(m-1)⌋ for n = m+a(m-1)+b where
a=0,1,…,m-3 and b=0,1,…,m-3-a.
Proof: eg. L(4,7) < (4*7-1)/3 = 9.
Consider a 4 by 7 rectangle with max frequency 9.
Consider 4 symbols, 1, 2, 3 and 4 with frequencies 9,
9, 9 and 1. Any transversal must take 1 of each
symbol and in particular must take the 1 occurrence
of symbol 4.
1 1 1 1 1 1 4
. . . . . . 1
. . . . . . 1
. . . . . . 1
(The cells marked . hold the 2's and 3's.)
Clearly if you put the 4 in the transversal, you can not get a 1 in the transversal.
QED.
This is the only case where this works for m=4.
Consider a 5 by 10 rectangle with max frequency
(5*10-1)/4 = 12. Let frequencies be 12, 12, 12, 12
and 2.
1 1 1 1 1 1 1 1 1 1 1 5 
 2 2 2 2 2 2 2 2 2 2 5 2



2 1


2
1



2 1
Clearly if you put a 5 in the transversal, you either
can not put in a 1 or you can not put in a 2. No
transversal. QED
Values of n covered by Theorem 7:
at m=3: 3
at m=4: 4,5, 7
at m=5: 5,6,7, 9,10, 13
at m=6: 6,7,8,9, 11,12,13, 16,17, 21
etc.
The last result is at n = m^2-3m+3. Note the theorem says
the asymptotics kick in at approx. 2m^3. Which is
right?
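The list of exceptional n can be regenerated directly from the parametrisation n = m+a(m-1)+b in Theorem 7. A short sketch follows; the function name `theorem7_values` is hypothetical.

```python
from typing import List


def theorem7_values(m: int) -> List[int]:
    """All n = m + a(m-1) + b with a = 0..m-3 and b = 0..m-3-a (Theorem 7)."""
    return [m + a * (m - 1) + b
            for a in range(m - 2)        # a = 0, 1, ..., m-3
            for b in range(m - 2 - a)]   # b = 0, 1, ..., m-3-a


for m in range(3, 7):
    values = theorem7_values(m)
    # The largest value is reached at a = m-3, b = 0, i.e. n = m^2 - 3m + 3.
    print(m, values, max(values) == m * m - 3 * m + 3)
```

For m = 4 this gives 4, 5, 7, and for m = 6 it ends at 21, matching the values listed above.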
One other lower bound uses the Lovasz Local
Lemma. Students, if you don’t know it, take a day
and study the proof and its usage.
Theorem 8 If no symbol appears in more than
(n-1)/(4e) cells of an n by n rectangle, then there is a
transversal.
This is incredibly weak but hard to beat when m is
near n.
Theorem 9 L(n,n) ≥ 2 for all n ≥ 3 and L(n,n) ≥ 3 for
n=4,5,6,7,8.
Proof
Case L(n,n) ≥ 2:
There are n! sections and each pair of identical
elements can cause (n-2)! sections to not be
transversals. Since there are at most n^2/2
pairs of identical symbols there must be at least
n! - (n^2/2)(n-2)! > 0 transversals.
Case L(n,n) ≥ 3: Much the same, except that at n=6 you
get a tie, but a bit more work proves it.
Monotonicity Theorems
1) L(m,n) ≥ L(m+1,n)
2) L(m,n) ≤ L(m,n+1)
Conjecture (Stein)
L(n,n+1) = n.
If true, then every Latin square has a partial
transversal with n-1 distinct elements, a best-possible
result for which there is some evidence. BIG
RESULT. The Latin square statement might be true, but I
believe his conjecture is false.
My Monotonicity Conjectures
1) L(m,n)  L(m,n+1) +1
2) L(n,n)  L(n+1,n+1)
Szabo did some computing. Lots more to do.
L(m,n)
n \ m    2    3    4    5    6
2        1
3        5    2
4        7    3    3
5        9    7    4    3
6       11    8    5    5    4
7       13   10    8
Question
How does g(m,n) = ⌊(mn-1)/(m-1)⌋ - L(m,n) behave?
For programmers: determine L(5,7). Is it 5 or 6?
1 3 4 2 1
5 2 3 6 5
6 7 3 4 6
7 2 5 4 7
1 3 4 2 1
No transversals, showing L(5,5) ≤ 3.
1
2

3

7
8

6
1 1 5 4 2
2 2 5 4 1
3 3 7 8 2

4 4 4 8 7
5 5 7 5 8

3 3 7 8 1
No transversals showing that L(6,6)  4.
Generalize the examples.
Greedy Defining Sets (GDS).
A Latin square is an n by n array in which each symbol
appears exactly once in each row and column.
3 1 2
1 2 3
2 3 1
Do not want to store whole Latin Square though.
Store some of it and fill in the rest (using latin rules
when necessary).
Well studied. Called critical sets.
Still need (n^2-1)/4 elements to define the square.
Eg. [Figure: a partially filled square with entries 1 2 / 2 / 3 / 3 4; the cell positions are not recoverable.]
So Eric Mendelsohn defined GDS where you have a
fixed algorithm (Greedy) and a Defining Set to define
the Latin Square.
Algorithm
1) Start at top left and go in the rows left to right and
from top to bottom.
2) At each empty cell fill in the square with the
lowest number that does not violate the latin rules.
3) If no number left, algorithm fails.
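The three steps above can be sketched in code. This assumes the defining set is given as a dictionary from 1-indexed (row, column) to symbol, and that a given cell anywhere in the row or column, even one not yet reached, blocks its symbol. The name `greedy_fill` and that reading of the Latin rules are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple


def greedy_fill(n: int, defining_set: Dict[Tuple[int, int], int]) -> Optional[List[List[int]]]:
    """Run the greedy algorithm on an n by n square.

    Returns the completed square, or None if some cell has no legal symbol.
    """
    square = [[0] * n for _ in range(n)]           # 0 marks an empty cell
    for (r, c), sym in defining_set.items():
        square[r - 1][c - 1] = sym
    for r in range(n):                             # step 1: rows top to bottom
        for c in range(n):                         # step 1: cells left to right
            if square[r][c]:
                continue                           # cell supplied by the defining set
            taken = set(square[r]) | {square[i][c] for i in range(n)}
            free = [s for s in range(1, n + 1) if s not in taken]
            if not free:
                return None                        # step 3: the algorithm fails
            square[r][c] = free[0]                 # step 2: lowest legal symbol
    return square


print(greedy_fill(3, {(1, 1): 3}))  # [[3, 1, 2], [1, 2, 3], [2, 3, 1]]
print(greedy_fill(3, {}))           # None: the empty set does not work for n = 3
print(greedy_fill(4, {}))           # [[1, 2, 3, 4], [2, 1, 4, 3], [3, 4, 1, 2], [4, 3, 2, 1]]
```

With the single given cell (1,1) = 3 the sketch reproduces the 3 by 3 Latin square shown earlier.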
So the question is, fixing n, how small can these
greedy defining sets get?
Let g(n) be the size of the smallest greedy defining set of any Latin square of order n.
What do we know? NOT MUCH.
Conjecture (Zaker)
g(n) is O(n).
I don’t believe it.
Let us examine the GDS in a Latin Square.
[Figure: an order-5 Latin square with two starred cells (3* and 2*), followed by four descents drawn from it, with symbols 5, 2, 2; 5, 3, 3; 3, 2, 2; and 4, 2, 2.]
Zaker defined these as descents. A descent in a Latin
Square is a set of 3 cells { (a,b,x), (a,d,y), (f,b,y)}
such that x > y; a< f, b<d.
(a,b,x) is the head of the descent
(a,d,y) is the hand of the descent
(f,b,y) is the foot of the descent.
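Descents are easy to enumerate straight from this definition. A sketch, with the function name `descents` chosen for illustration and cells reported 1-indexed as (row, column, symbol):

```python
from typing import List, Tuple

Cell = Tuple[int, int, int]  # (row, column, symbol), 1-indexed


def descents(square: List[List[int]]) -> List[Tuple[Cell, Cell, Cell]]:
    """List every descent (head, hand, foot) of a Latin square."""
    n = len(square)
    found = []
    for a in range(n):
        for b in range(n):
            x = square[a][b]                       # candidate head (a, b, x)
            for d in range(b + 1, n):              # hand: same row, further right
                y = square[a][d]
                if y >= x:
                    continue                       # need x > y
                for f in range(a + 1, n):          # foot: same column, further down
                    if square[f][b] == y:
                        found.append(((a + 1, b + 1, x),
                                      (a + 1, d + 1, y),
                                      (f + 1, b + 1, y)))
    return found


for descent in descents([[3, 1, 2], [1, 2, 3], [2, 3, 1]]):
    print(descent)
# ((1, 1, 3), (1, 2, 1), (2, 1, 1))
# ((1, 1, 3), (1, 3, 2), (3, 1, 2))
```

Both descents of this order-3 square have their head at cell (1,1).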
Theorem 1(Z) Every descent must intersect some
position of any GDS.
Proof: How else did the apex get to occupy that
position.
Theorem 2(JVR) GDS are invariant under conjugacy.
ie. interchanging the role of rows, columns and
symbols leaves GDSs as GDSs.
Proof: Just check.
Theorem 3(Z) g(n) = 0 iff n is a power of 2.
Proof: Easy
There are no known lower bounds other than 1 for n
not a power of 2.
First up is a product construction.
Theorem 4(Z) If n=rs,
then g(n) ≤ r^2 g(s) + s^2 g(r) - g(s)g(r)
Proof: Longish but routine.
eg.
3* 1  2
1  2  3
2  3  1
*
1  2  3
3  1  2
2* 3  1
=
7* 8* 9* 1  2  3  4  5  6
9* 7* 8* 3  1  2  6  4  5
8* 9* 7* 2* 3  1  5* 6  4
1  2  3  4  5  6  7  8  9
3  1  2  6  4  5  9  7  8
2* 3  1  5* 6  4  8* 9  7
4  5  6  7  8  9  1  2  3
6  4  5  9  7  8  3  1  2
5* 6  4  8* 9  7  2* 3  1
We modify the product construction when one of the
squares is a 2 by 2.
Theorem 5(JVR) If there exists a GDS, S, of order n
and size f with a cell (1,1,n), then g(2n) ≤ 4f-2.
By example
1 2
2 1
*
3* 1  2
1  2  3
2  3  1
=
3* 1  2  6* 4  5
1  2  3  4  5  6
2  3  1  5  6  4
6* 4  5  3* 1  2
4  5  6  1  2  3
5  6  4  2  3  1
Turn the turn square
6* 1  2  3  4  5
1  2  3  4  5  6
2  3  1  5  6  4
3  4  5  6* 1  2
4  5  6  1  2  3
5  6  4  2  3  1
Found by Zaker.
Corollary 6(JVR) If there exists a GDS, S, of order n
and size f with a cell (1,1,n), then
g(2^k n) ≤ 4^k f - 2(4^k - 1)/3.
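The closed form in the corollary is what one gets by repeating the Theorem 5 step f → 4f-2 a total of k times, assuming (as the corollary implies) that the doubled square again satisfies the hypothesis. A small sketch with hypothetical function names checks the two against each other:

```python
def doubled_bound(f: int, k: int) -> int:
    """Apply the Theorem 5 step f -> 4f - 2 a total of k times."""
    for _ in range(k):
        f = 4 * f - 2
    return f


def closed_form(f: int, k: int) -> int:
    """Corollary 6 bound on g(2^k n): 4^k f - 2(4^k - 1)/3."""
    return 4 ** k * f - 2 * (4 ** k - 1) // 3     # 4^k - 1 is always divisible by 3


for k in range(4):
    print(k, doubled_bound(1, k), closed_form(1, k))
# 0 1 1
# 1 2 2
# 2 6 6
# 3 22 22
```

Starting from the order-3 GDS of size f = 1, this gives the bounds 2, 6 and 22 for orders 6, 12 and 24.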
If there are subsquares in a Latin square and the GDS
is just right, the subsquare can be filled in paying no
attention to the rest of the bigger square. (especially
subsquare to the left and top.)
Construction
n    g(n)  HOW
1    0     Z
2    0     Z
3    1     Z
4    0     Z
5    2     Z
6    2     Z
7    3     JVR(YJZhou)
8    0     Z
9    4     JVR(YJZ)
10   5     JVR(YJZ)

n      11  12  13  14  15  16  17  18  19  20
|GDS|   9   6  15  12  10   0  18  16  23  18
What about a general upper bound for any n?
n^2
Good Upper Bound
Theorem 8 (JVR)
The minimum |GDS| for a back circulant Latin
Square of order n is n(n-1)/6.
Proof:
Here is the construction:
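[Figure: a back circulant Latin square with three triangular blocks of GDS cells, containing the entries 9 / 8 9 / 7 8 9, 4 5 / 4, and 3 / 2 3.]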
3 equal sized triangles at the top left, right middle and
bottom left. Straightforward to show this is a GDS.
Real tricky and tedious (but lucky) to show that this
is minimum.
Zaker found an equivalent set in a conjugate of this
LS but there was no proof it was minimal.
Open Problems
1)(Z) What is the minimum or maximum number of
descents?
2) Say anything intelligent about descents.
3) Get a good general lower bound for g(n) for any n.
4) Get more constructions.
5) Do more computer work finding g(n) for small n.
6) Get a better upper bound for g(n) for any n.
7) (Z-conjecture) g(n) is O(n).
8) Prove that finding g(n) is NP-complete.
John Bate has counted the minimum number of
apexes in Latin Squares of small order.
n            3  4  5  6  7  8  9  10  11
min# apices  1  0  2  2  4  0  5   6   ?
g(n)         1  0  2  2  3  0  4   5   9
References
1) S. Akbari, O. Etesami, H. Mahini, M. Mahmoody
& A. Sharifi, Transversals in Long Rectangular
Arrays, Discrete Math. (on view)
2) S.K. Stein & S. Szabo, The number of distinct
symbols in sections of rectangular arrays, Discrete
Math., 306 (2006), no. 2, 254–261. ...
3) M. Zaker, Greedy Defining Sets in Latin Squares,
Ars Combinatoria, accepted