On Computing Schur Functions and Series Thereof

J Algebr Comb manuscript No.
(will be inserted by the editor)

Cy Chan ∙ Vesselin Drensky ∙ Alan Edelman ∙ Raymond Kan ∙ Plamen Koev

Received: date / Accepted: date
Abstract We present two new algorithms for computing all Schur functions $s_\kappa(x_1,\ldots,x_n)$ for partitions $\kappa$ such that $|\kappa|\le N$. Both algorithms have the property that for nonnegative arguments $x_1,\ldots,x_n$ the output is computed to high relative accuracy and the cost per Schur function is $O(n^2)$.

Keywords Schur function ∙ Hypergeometric function of a matrix argument ∙ Computing ∙ Accuracy

Mathematics Subject Classification 05E05 ∙ 65F50 ∙ 65T50
1 Introduction

We consider the problem of accurately and efficiently evaluating the Schur function $s_\kappa(x_1,x_2,\ldots,x_n)$ and series thereof for nonnegative arguments $x_i\ge 0$, $i=1,2,\ldots,n$.
This research was partially supported by three NSF DMS grants, the Singapore–MIT Alliance (SMA), and by Grant MI-1503/2005 of the Bulgarian National Science Fund.
C. Chan
Department of Computer Science, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, U.S.A.
E-mail: cychan@mit.edu

V. Drensky
Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, 1113 Sofia, Bulgaria.
E-mail: drensky@math.bas.bg

A. Edelman
Department of Mathematics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, U.S.A.
E-mail: edelman@math.mit.edu

R. Kan
Joseph L. Rotman School of Management, University of Toronto, 105 St. George Street, Toronto, Ontario, Canada M5S 3E6.
E-mail: kan@chass.utoronto.ca

P. Koev
Department of Mathematics, North Carolina State University, Box 8205, Raleigh, NC 27695, U.S.A.
E-mail: pskoev@ncsu.edu
Our goal is to derive efficient algorithms for computing the hypergeometric function of an $n\times n$ semidefinite matrix argument $X$ and parameter $\alpha>0$:

$$ {}_pF_q^{(\alpha)}(a_1,\ldots,a_p;\,b_1,\ldots,b_q;\,X) \;=\; \sum_{k=0}^{\infty}\,\sum_{\kappa\vdash k} \frac{1}{k!}\cdot \frac{(a_1)_\kappa^{(\alpha)}\cdots(a_p)_\kappa^{(\alpha)}}{(b_1)_\kappa^{(\alpha)}\cdots(b_q)_\kappa^{(\alpha)}}\cdot C_\kappa^{(\alpha)}(X), \tag{1} $$
where the summation is over all partitions $\kappa=(\kappa_1,\kappa_2,\ldots)$ of $k=0,1,2,\ldots$;

$$ (c)_\kappa^{(\alpha)} \;\equiv\; \prod_{(i,j)\in\kappa}\bigl(c-(i-1)/\alpha+j-1\bigr) \tag{2} $$

is the generalized Pochhammer symbol, and $C_\kappa^{(\alpha)}(X)$ is the Jack function. The latter is a generalization of the Schur function and is normalized so that $\sum_{\kappa\vdash k}C_\kappa^{(\alpha)}(X)=(\mathrm{tr}(X))^k$ [19, 25, 31]. The argument $X$ in (1) is a matrix for historical reasons only; $C_\kappa^{(\alpha)}$ and ${}_pF_q^{(\alpha)}$ are scalar-valued symmetric functions in the eigenvalues $x_i\ge 0$, $i=1,2,\ldots,n$, of $X$.
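As a concrete illustration, the generalized Pochhammer symbol defined above can be evaluated directly as a product over the boxes of the Young diagram. The following Python sketch is our own illustration (not from the paper); the use of exact rational arithmetic via the standard `fractions` module is simply a convenient implementation choice.

```python
from fractions import Fraction

def gen_pochhammer(c, kappa, alpha):
    """Generalized Pochhammer symbol (c)_kappa^(alpha):
    the product of c - (i-1)/alpha + j - 1 over boxes (i,j) of kappa."""
    result = Fraction(1)
    for i, row in enumerate(kappa, start=1):   # rows of the Young diagram
        for j in range(1, row + 1):            # boxes in row i
            result *= Fraction(c) - Fraction(i - 1, 1) / alpha + (j - 1)
    return result

# For kappa = (k) and alpha = 1 this is the raising factorial c(c+1)...(c+k-1):
print(gen_pochhammer(3, (4,), Fraction(1)))    # 3*4*5*6 = 360
# For kappa = (1,1) it is c * (c - 1/alpha):
print(gen_pochhammer(3, (1, 1), Fraction(2)))  # 3 * (3 - 1/2) = 15/2
```

Note that for a single-row partition the dependence on $\alpha$ disappears, which is the univariate Pochhammer symbol mentioned later in the text.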
The practical importance of computing the hypergeometric function of a matrix argument stems from far-reaching applications in multivariate statistics, where it provides a closed-form expression for the distributions of the eigenvalues of the classical random matrix ensembles—Wishart, Jacobi, and Laguerre [27]. These distributions, in turn, are needed in critical statistical tests in applications ranging from genomics [29] to wireless communications [10, 20, 30], finance [16], target classification [18], etc.

Despite its enormous practical importance, only limited progress has been made in the computation of this function since the 1960s. The problems with its computation come from two sources: (1) it converges extremely slowly, and (2) the straightforward evaluation of a single Jack function is exponential [6]. The frustrations of many researchers with the lack of efficient algorithms are long standing and well documented [2, 13, 15, 17, 26, 28].

The recent progress in computing the hypergeometric function of a matrix argument has focused on exploiting the combinatorial properties of the Jack function, leading to new algorithms which are exponentially faster than the previous best ones (see Section 2 for an overview).

Interestingly, the computational potential of the combinatorial properties of the Jack function had been missed for quite some time. It is thus our hope to draw the attention of the combinatorics community to this problem and its far-reaching practical applications.
Although the hypergeometric function of a matrix argument is defined for any $\alpha>0$ and there are corresponding theoretical interpretations [7], most applications focus on $\alpha=1$ and $\alpha=2$ only. This is largely because these values correspond to the distributions of the eigenvalues of complex and real random matrices, respectively.

In this paper we focus on $\alpha=1$. In this case the Jack function $C_\kappa^{(1)}(X)$ is a normalization of the Schur function $s_\kappa(x_1,\ldots,x_n)$ (see (3) in Section 2 for the exact relationship).
One way to compute the hypergeometric function of a matrix argument in practice is to truncate the series (1) for $k\le N$ for some sufficiently large $N$. Since $s_\kappa(x_1,x_2,\ldots,x_n)=0$ if $\kappa_{n+1}>0$, our goal is thus to compute, as quickly and accurately as possible, all Schur functions corresponding to partitions $\kappa$ in not more than $n$ parts and size not exceeding $N$. Denote the set of those Schur functions by $S_{N,n}$:

$$ S_{N,n} \equiv \bigl\{\, s_\kappa(x_1,\ldots,x_n) \;\bigm|\; \kappa=(\kappa_1,\ldots,\kappa_n),\; |\kappa|\equiv\kappa_1+\cdots+\kappa_n\le N \,\bigr\}. $$
The analysis in [6] suggests that computing even a single Schur function accurately and efficiently is far from trivial. We elaborate on this briefly.

There are several determinantal expressions for the Schur function (the classical definition as a quotient of alternants, the Jacobi–Trudi identity, its dual version, the Giambelli and Lascoux–Pragacz determinants). Each one would seemingly provide an efficient way to compute the Schur function. The problem with this approach is that the matrices involved quickly become ill conditioned as the sizes of the matrix argument ($n$) and the partition ($|\kappa|$) grow. This implies that conventional (Gaussian-elimination-based) algorithms will quickly lose accuracy to roundoff errors. The loss of accuracy is due to a phenomenon known as subtractive cancellation—loss of significant digits due to subtraction of intermediate (and thus approximate) quantities of similar magnitude.

According to [6], the loss of accuracy in evaluating the determinantal expressions for the Schur function can be arbitrarily large in all but the dual Jacobi–Trudi identity. In the latter the amount of subtractive cancellation can be bounded independent of the values of the input arguments $x_i$. By using extended precision one can compensate for that loss of accuracy, leading to an algorithm that is guaranteed to be accurate and costs $O\bigl((n|\kappa|+\kappa_1^3)(|\kappa|\kappa_1)^{1+\rho}\bigr)$.¹
Subtraction is the only arithmetic operation that could lead to loss of accuracy; multiplication, division, and addition of same-sign quantities always preserve relative accuracy.
In this paper we present two new algorithms for computing the Schur function. Both algorithms are subtraction-free, meaning that both are guaranteed to compute the value of the Schur function to high relative accuracy in floating point arithmetic. Both are also very efficient—the cost per Schur function, when computing all Schur functions in the set $S_{N,n}$, is $O(n^2)$.

This represents a major improvement over the previous best result in [6] in the sense that no extended precision arithmetic is required to achieve accuracy and the cost of computing a single Schur function is reduced from $O\bigl((n|\kappa|+\kappa_1^3)(|\kappa|\kappa_1)^{1+\rho}\bigr)$ to $O(n^2)$.
While both our new algorithms have the same complexity and accuracy characteristics, each is significant in its own right for the following reasons:

– The first algorithm implements the classical definition of the Schur function as a sum of monomials over all semistandard Young tableaux. Since the coefficients in this expression are positive (integers), such an approach is subtraction-free, thus guaranteed to be accurate. The full expression of the Schur function as a sum of monomials contains exponentially many terms ($O(n^{|\kappa|})$ [6]), hence the similarly exponential cost of the previous algorithms based on it [6]. We use dynamic programming and exploit various redundancies to reduce the cost to $O(n^2)$ per Schur function. This approach is only efficient if all Schur functions in the set $S_{N,n}$ are computed, but (unlike the second algorithm) the ideas may generalize beyond $\alpha=1$.
– The second algorithm represents an accurate evaluation of the expression of the Schur function as a quotient of (generalized, totally nonnegative) Vandermonde determinants. Since virtually all linear algebra with totally nonnegative matrices can be performed efficiently and in a subtraction-free fashion [22, 23], this leads to an accurate algorithm for the evaluation of individual Schur functions at the cost of only $O(n^2\kappa_1)$ each, or $O(n^2)$ each if all of $S_{N,n}$ is computed.
This paper is organized as follows. We present background information and survey existing algorithms for this problem in Section 2. Our new algorithms are presented in Sections 3 and 4. We draw conclusions and outline open problems in Section 5.

We made software implementations of both our new algorithms available online [21].

¹ Here $\rho$ is tiny and accounts for certain logarithmic functions.
2 Preliminaries

Algorithms for computing the hypergeometric function of a matrix argument for specific values of $p$, $q$, and $\alpha$ can be found in [1, 3, 4, 14, 15].

In this section we survey the approach of Koev and Edelman [24], which works for any $\alpha>0$. We also introduce a few improvements and set the stage for our new algorithms in the case $\alpha=1$.
We first recall a few definitions that are relevant. Given a partition $\kappa$ and a point $(i,j)$ in the Young diagram of $\kappa$ (i.e., $i\le\kappa'_j$ and $j\le\kappa_i$), the upper and lower hook lengths at $(i,j)\in\kappa$ are defined, respectively, as:

$$ h^*_\kappa(i,j) \equiv \kappa'_j-i+\alpha(\kappa_i-j+1); \qquad h_*^\kappa(i,j) \equiv \kappa'_j-i+1+\alpha(\kappa_i-j). $$

The products of the upper and lower hook lengths are denoted, respectively, as:

$$ H^*_\kappa \equiv \prod_{(i,j)\in\kappa} h^*_\kappa(i,j) \qquad\text{and}\qquad H_*^\kappa \equiv \prod_{(i,j)\in\kappa} h_*^\kappa(i,j). $$
We introduce the “Schur” normalization of the Jack function:

$$ S_\kappa^{(\alpha)}(X) \;=\; \frac{H_*^\kappa}{\alpha^{|\kappa|}\,|\kappa|!}\, C_\kappa^{(\alpha)}(X). \tag{3} $$

This normalization is such that $S_\kappa^{(1)}(X)=s_\kappa(x_1,\ldots,x_n)$ [31, Proposition 1.2].

The hypergeometric function of a matrix argument in terms of $S_\kappa^{(\alpha)}(X)$ is:

$$ {}_pF_q^{(\alpha)}(a_1,\ldots,a_p;\,b_1,\ldots,b_q;\,X) \;=\; \sum_{k=0}^{\infty}\,\sum_{\kappa\vdash k} \frac{(a_1)_\kappa^{(\alpha)}\cdots(a_p)_\kappa^{(\alpha)}}{(b_1)_\kappa^{(\alpha)}\cdots(b_q)_\kappa^{(\alpha)}}\cdot \frac{\alpha^k}{H_*^\kappa}\, S_\kappa^{(\alpha)}(X). \tag{4} $$
Denote the coefficient in front of $S_\kappa^{(\alpha)}(X)$ in (4) by:

$$ Q_\kappa \;\equiv\; \frac{(a_1)_\kappa^{(\alpha)}\cdots(a_p)_\kappa^{(\alpha)}}{(b_1)_\kappa^{(\alpha)}\cdots(b_q)_\kappa^{(\alpha)}}\cdot \frac{\alpha^{|\kappa|}}{H_*^\kappa}. $$

Let the partition $\kappa=(\kappa_1,\kappa_2,\ldots,\kappa_h)$ have $h=\kappa'_1$ nonzero parts. When $\kappa_i>\kappa_{i+1}$, we define the partition:

$$ \kappa_{(i)} \;\equiv\; (\kappa_1,\kappa_2,\ldots,\kappa_{i-1},\,\kappa_i-1,\,\kappa_{i+1},\ldots,\kappa_h). \tag{5} $$

The main idea in the evaluation of (4) is to update the $\kappa$ term in (4) from terms earlier in the series. In particular, we update $Q_\kappa$ from $Q_{\kappa_{(h)}}$ and $S_\kappa^{(\alpha)}(x_1,\ldots,x_n)$ from $S_\mu^{(\alpha)}(x_1,\ldots,x_{n-1})$, $\mu\le\kappa$.
In order to make the $Q_\kappa$ update as simple as possible, we first express $H_*^\kappa$ in a way that does not involve the conjugate partition, $\kappa'$:

$$ H_*^\kappa \;=\; \prod_{r=1}^{h}\prod_{c=1}^{\kappa_r}\bigl(\kappa'_c-r+1+\alpha(\kappa_r-c)\bigr) \;=\; \prod_{r=1}^{h}\prod_{j=r}^{h}\,\prod_{c=\kappa_{j+1}+1}^{\kappa_j}\bigl(j-r+1+\alpha(\kappa_r-c)\bigr) \;=\; \alpha^{|\kappa|}\prod_{j=1}^{h}\prod_{r=1}^{j}\Bigl(\tfrac{j-r+1}{\alpha}+\kappa_r-\kappa_j\Bigr)_{\kappa_j-\kappa_{j+1}}, $$

where $(c)_t=c(c+1)\cdots(c+t-1)$ is the raising factorial, the univariate version of the Pochhammer symbol defined in (2).
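The conjugate-free expression above is easy to validate numerically. The following Python sketch (our own illustration, with hypothetical function names; exact rational arithmetic is an implementation convenience) computes $H_*^\kappa$ both directly from the lower hook lengths, which requires the conjugate partition, and via the raising-factorial formula, and checks that the two agree.

```python
from fractions import Fraction

def conjugate(kappa):
    """Conjugate partition kappa' of a partition with positive parts."""
    return [sum(1 for p in kappa if p >= c) for c in range(1, kappa[0] + 1)]

def H_lower_hooks(kappa, alpha):
    """H_* as the product of lower hook lengths kappa'_j - i + 1 + alpha*(kappa_i - j)."""
    kp = conjugate(kappa)
    H = Fraction(1)
    for i, ki in enumerate(kappa, start=1):
        for j in range(1, ki + 1):
            H *= kp[j - 1] - i + 1 + alpha * (ki - j)
    return H

def raising(c, t):
    """Raising factorial (c)_t = c(c+1)...(c+t-1)."""
    r = Fraction(1)
    for s in range(t):
        r *= c + s
    return r

def H_conjugate_free(kappa, alpha):
    """H_* via alpha^|kappa| times a double product of raising factorials,
    which never touches the conjugate partition."""
    h = len(kappa)
    H = alpha ** sum(kappa)
    for j in range(1, h + 1):
        nxt = kappa[j] if j < h else 0           # kappa_{j+1}
        for r in range(1, j + 1):
            base = Fraction(j - r + 1, 1) / alpha + kappa[r - 1] - kappa[j - 1]
            H *= raising(base, kappa[j - 1] - nxt)
    return H

for kappa in [(1,), (2, 1), (3, 2, 2, 1), (4, 1, 1)]:
    for alpha in [Fraction(1), Fraction(2), Fraction(1, 2)]:
        assert H_lower_hooks(kappa, alpha) == H_conjugate_free(kappa, alpha)
```

For example, for $\kappa=(2,1)$ both routines return $2+\alpha$, the product of the three lower hook lengths $2+\alpha$, $1$, and $1$.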
Defining $\tilde\kappa_i\equiv\alpha\kappa_i-i$, we obtain:

$$ \frac{H_*^{\kappa_{(h)}}}{H_*^\kappa} \;=\; \frac{1}{\alpha\kappa_h-\alpha+1}\, \prod_{j=1}^{h-1}\frac{\tilde\kappa_j-\tilde\kappa_h}{\tilde\kappa_j-\tilde\kappa_h+1}. \tag{6} $$
Using (6), $Q_\kappa$ can be updated from $Q_{\kappa_{(h)}}$ as

$$ Q_\kappa \;=\; Q_{\kappa_{(h)}}\cdot \frac{\prod_{j=1}^{p}(a_j+\bar\kappa_h)}{\prod_{j=1}^{q}(b_j+\bar\kappa_h)}\cdot \frac{\alpha}{\alpha\kappa_h-\alpha+1}\cdot \prod_{j=1}^{h-1}\frac{\tilde\kappa_j-\tilde\kappa_h}{\tilde\kappa_j-\tilde\kappa_h+1}, \tag{7} $$

where $\bar\kappa_h\equiv\kappa_h-1-\frac{h-1}{\alpha}$.
The Jack function $S_\kappa^{(\alpha)}(X)$ can be dynamically updated using the formula of Stanley [31, Proposition 4.2] (see also [24, eq. (3.8)] and (3)):

$$ S_\kappa^{(\alpha)}(x_1,\ldots,x_n) \;=\; \sum_{\mu} S_\mu^{(\alpha)}(x_1,\ldots,x_{n-1})\; x_n^{|\kappa/\mu|}\;\sigma_{\kappa\mu}, \tag{8} $$

where the summation is over all partitions $\mu\le\kappa$ such that the skew shape $\kappa/\mu$ is a horizontal strip (i.e., $\kappa_1\ge\mu_1\ge\kappa_2\ge\mu_2\ge\cdots$ [32, p. 339]). The coefficients $\sigma_{\kappa\mu}$ are defined as

$$ \sigma_{\kappa\mu} \;=\; \prod_{(i,j)\in\kappa}\frac{h_*^\kappa(i,j)}{h^*_\kappa(i,j)}\; \prod_{(i,j)\in\mu}\frac{h^*_\mu(i,j)}{h_*^\mu(i,j)}, \tag{9} $$

where both products are over all $(i,j)$ such that $\kappa'_j=\mu'_j+1$. For $\alpha=1$, clearly $\sigma_{\kappa\mu}=1$ for all $\kappa$ and $\mu$.
Once again, instead of computing the coefficients $\sigma_{\kappa\mu}$ in (8) from scratch, it is much more efficient to start with $\sigma_{\kappa\kappa}=1$ and update the next coefficient in the sum (8) from the previous ones. To this end, let $\mu$ be a partition such that $\kappa'_j=\mu'_j$ for $j=1,2,\ldots,\mu_k-1$, and $\kappa/\mu$ be a horizontal strip. Then we update $\sigma_{\kappa\mu_{(k)}}$ from $\sigma_{\kappa\mu}$ using:

$$ \frac{\sigma_{\kappa\mu_{(k)}}}{\sigma_{\kappa\mu}} \;=\; \prod_{r=1}^{k}\frac{h^*_\kappa(r,\mu_k)}{h_*^\kappa(r,\mu_k)}\; \prod_{r=1}^{k-1}\frac{h_*^\mu(r,\mu_k)}{h^*_\mu(r,\mu_k)} \;=\; \prod_{r=1}^{k}\frac{\tilde\kappa_r-\tilde\mu_k+\alpha}{\tilde\kappa_r-\tilde\mu_k+1}\; \prod_{r=1}^{k-1}\frac{\tilde\mu_r-\tilde\mu_k+1}{\tilde\mu_r-\tilde\mu_k+\alpha}, \tag{10} $$

where $\tilde\mu_i\equiv\alpha\mu_i-i$; this is obtained directly from (9).
We use (8) to compute $S_\kappa^{(\alpha)}(x_1,\ldots,x_i)$ for $i=h+1,\ldots,n$. For $i=h$, the result of Stanley [31, Propositions 5.1 and 5.5] allows for a very efficient update:

$$ S_\kappa^{(\alpha)}(x_1,\ldots,x_h) \;=\; (x_1\cdots x_h)^{\kappa_h}\cdot S_{\kappa-\kappa_h I}^{(\alpha)}(x_1,\ldots,x_h)\cdot \prod_{j=1}^{\kappa_h}\prod_{i=1}^{h} \frac{h-i+1+\alpha(\kappa_i-j)}{h-i+\alpha(\kappa_i-j+1)}, \tag{11} $$

where $\kappa-\kappa_h I\equiv(\kappa_1-\kappa_h,\,\kappa_2-\kappa_h,\ldots,\kappa_{h-1}-\kappa_h)$.
The new results in this section comprise the updates (7) and (10), which are more efficient than the analogous ones in [24, Lemmas 3.1 and 3.2]. These new updates do not require the conjugate partition to be computed and maintained by the algorithm, and cost $2(p+q)+4h$ and $9k$ arithmetic operations, down from $2(p+q)+11\kappa_h+9h-11$ and $12k+6\mu_k-7$, respectively. Additionally, the use of (11) reduces the cost of an evaluation of a truncation of (1) by a factor of about $N/2$.
3 The first algorithm

In this section we present the first of our two new algorithms for computing all Schur functions of the set $S_{N,n}$. This algorithm is based on the classical definition of the Schur function [32, Section 7.10]:

$$ s_\kappa(x_1,\ldots,x_k) \;=\; \sum_{T\in A_\kappa} X^T, $$

where the summation is over the set $A_\kappa$ of all semistandard $\kappa$-tableaux $T$ filled with the numbers $1,2,\ldots,k$. Also, $X^T=x_1^{c_1}\cdots x_k^{c_k}$, and $(c_1,\ldots,c_k)$ is the content of $T$. Extending the notation, let:

$$ s_\kappa^m(x_1,\ldots,x_k) \;=\; \sum_{T\in A_{\kappa,m}} X^T, $$

where the summation is over the set $A_{\kappa,m}$, which equals $A_\kappa$ with the additional restriction that $k$ does not appear in the first $m$ rows.

Note that $s_\kappa(x_1,\ldots,x_k)=s_\kappa^0(x_1,\ldots,x_k)$ and $s_\kappa(x_1,\ldots,x_{k-1})=s_\kappa^k(x_1,\ldots,x_k)$.
Lemma 1 The following identity holds for all $m\ge 1$:

$$ s_\kappa^{m-1}(x_1,\ldots,x_k) \;=\; \begin{cases} s_\kappa^m(x_1,\ldots,x_k), & \text{if } \kappa_m=\kappa_{m+1};\\[2pt] s_\kappa^m(x_1,\ldots,x_k) + s_{\kappa_{(m)}}^{m-1}(x_1,\ldots,x_k)\cdot x_k, & \text{otherwise}, \end{cases} $$

where the partition $\kappa_{(m)}$ is defined as in (5).
Proof In the first case ($\kappa_m=\kappa_{m+1}$), no $k$ is allowed in the $m$th row of $T$ because of the strictly increasing property of each column of $T$. Therefore, the restriction that no $k$ appear in the first $m-1$ rows of $T$ is equivalent to the restriction that no $k$ appear in the first $m$ rows of $T$, and $A_{\kappa,m-1}=A_{\kappa,m}$.

In the second case, there are two possibilities for the $\kappa$-tableau $T\in A_{\kappa,m-1}$. If the entry in position $(m,\kappa_m)$ is not equal to $k$, then none of the entries in the $m$th row can equal $k$ due to the nondecreasing nature of each row. Thus, the tableaux fitting this description are exactly the set $A_{\kappa,m}$.

If the entry in position $(m,\kappa_m)$ is equal to $k$, then removal of that square of the tableau clearly results in an element of $A_{\kappa_{(m)},m-1}$. Further, for every tableau in $A_{\kappa_{(m)},m-1}$, the addition of a square containing $k$ to the $m$th row results in a valid semistandard tableau in $A_{\kappa,m-1}$. The tableau retains its semistandardness because every element in the $m$th row (and in the entire tableau as well) can be no larger than $k$, and every element in the $\kappa_m$th column above the new square can be no larger than $k-1$ due to the restriction that every tableau in $A_{\kappa_{(m)},m-1}$ cannot have $k$ in the first $m-1$ rows.

We have thus constructed a bijection $f$ mapping $A_{\kappa_{(m)},m-1}$ to the set (call it $B$) of tableaux in $A_{\kappa,m-1}$ where the entry in position $(m,\kappa_m)$ equals $k$. Clearly, for each $T\in A_{\kappa_{(m)},m-1}$, $X^{f(T)}=X^T\cdot x_k$, so $\sum_{T\in B}X^T=\sum_{T\in A_{\kappa_{(m)},m-1}}X^T\cdot x_k$.
Combining these two possibilities for $T\in A_{\kappa,m-1}$, we obtain

$$ s_\kappa^{m-1}(x_1,\ldots,x_k) \;=\; \sum_{T\in A_{\kappa,m-1}} X^T \;=\; \sum_{T\in A_{\kappa,m}} X^T \;+\; \sum_{T\in A_{\kappa_{(m)},m-1}} X^T\cdot x_k \;=\; s_\kappa^m(x_1,\ldots,x_k) + s_{\kappa_{(m)}}^{m-1}(x_1,\ldots,x_k)\cdot x_k, $$

concluding our proof. □
Our algorithm, based on Lemma 1, is very simple.

Algorithm 1 The following algorithm computes all Schur functions in $S_{N,n}$.

  for all $\kappa\in S_{N,n}$ initialize $s_\kappa=x_1^{\kappa_1}$ if $\kappa\in S_{N,1}$ and $s_\kappa=0$ otherwise
  for $k=2$ to $n$ (Loop 1)
    for $m=k$ down to $1$ (Loop 2)
      for all $\kappa\in S_{N,k}$ such that $\kappa_m>\kappa_{m+1}$, in reverse lexicographic order
        $s_\kappa = s_\kappa + s_{\kappa_{(m)}}\cdot x_k$
      endfor
    endfor
  endfor
After the first line of Algorithm 1, the variables $s_\kappa$ contain $s_\kappa(x_1)$. During each iteration of Loop 1, the values stored in $s_\kappa$ for $\kappa\in S_{N,k}$ are updated from $s_\kappa(x_1,\ldots,x_{k-1})=s_\kappa^k(x_1,\ldots,x_k)$ to $s_\kappa(x_1,\ldots,x_k)=s_\kappa^0(x_1,\ldots,x_k)$. During each iteration of Loop 2, the values in $s_\kappa$ for $\kappa\in S_{N,k}$ are updated from $s_\kappa^m(x_1,\ldots,x_k)$ to $s_\kappa^{m-1}(x_1,\ldots,x_k)$.

The last line of the algorithm implements Lemma 1. Since the partitions are processed in reverse lexicographic order, $s_{\kappa_{(m)}}$ will have already been updated for each $\kappa$ when this line is executed. Thus, at the time $s_\kappa$ is updated, $s_{\kappa_{(m)}}$ contains $s_{\kappa_{(m)}}^{m-1}(x_1,\ldots,x_k)$, and $s_\kappa$ is updated from $s_\kappa^m(x_1,\ldots,x_k)$ to $s_\kappa^{m-1}(x_1,\ldots,x_k)$. Our algorithm updates the Schur functions “in place” using a single memory location for each partition.
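For concreteness, here is a small Python transcription of Algorithm 1 (ours; the paper's reference implementation is in MATLAB [21]). Partitions are stored as length-$n$ tuples, and we process them in an order that guarantees $s_{\kappa_{(m)}}$ is updated before $s_\kappa$, as Lemma 1 requires. A dictionary plays the role of the array plus lookup table, so this sketch illustrates the updates but does not attain the constant-time-lookup cost discussed below.

```python
def schur_all(x, N):
    """Return {kappa: s_kappa(x_1,...,x_n)} for all partitions kappa with
    at most n parts and |kappa| <= N, via the subtraction-free updates of Lemma 1."""
    n = len(x)
    parts = []
    def gen(prefix, rem, mx):            # enumerate S_{N,n} as length-n tuples
        if len(prefix) == n:
            parts.append(tuple(prefix))
            return
        for p in range(min(mx, rem), -1, -1):
            gen(prefix + [p], rem - p, p)
    gen([], N, N)
    # Initialization: s_kappa = x_1^{kappa_1} if kappa in S_{N,1}, else 0.
    s = {k: (x[0] ** k[0] if sum(k[1:]) == 0 else 0) for k in parts}
    order = sorted(parts)                # ascending lex: kappa_(m) precedes kappa
    for k in range(2, n + 1):            # Loop 1: bring in the variable x_k
        for m in range(k, 0, -1):        # Loop 2
            for kappa in order:
                if any(kappa[k:]):       # kappa must lie in S_{N,k}
                    continue
                nxt = kappa[m] if m < n else 0
                if kappa[m - 1] > nxt:                     # kappa_m > kappa_{m+1}
                    km = kappa[:m - 1] + (kappa[m - 1] - 1,) + kappa[m:]
                    s[kappa] += s[km] * x[k - 1]           # Lemma 1 update
    return s

s = schur_all([1, 2, 3], 3)
print(s[(2, 1, 0)])   # s_{(2,1)}(1,2,3) = (1+2)(2+3)(3+1) = 60
print(s[(1, 1, 1)])   # x1*x2*x3 = 6
```

Since $\kappa_{(m)}$ precedes $\kappa$ lexicographically, the ascending sort has the same effect here as the reverse lexicographic processing order described in the text.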
In order to complete the algorithm in time $O(n^2)$ per Schur function, we must be able to look up the memory location of $s_{\kappa_{(m)}}$ in constant time for each $\kappa\in S_{N,n}$ and $1\le m\le n$. In our implementation, we keep all of the Schur variables $\{s_\kappa\}_{\kappa\in S_{N,n}}$ in an array and use a lookup table of size $|S_{N,n}|\cdot n$ to find the appropriate array index for each $s_{\kappa_{(m)}}$. Since this lookup table is invariant over all possible inputs $x_1,\ldots,x_n$, we simply precompute it (or load it from disk) and keep it in persistent memory for future calls to our algorithm.
It is also possible to compute the indexes of each $s_{\kappa_{(m)}}$ on the fly rather than having them stored in a lookup table. This modification reduces the memory requirement from $O(|S_{N,n}|\cdot n)$ to $O(|S_{N,n}|)$, but increases the time complexity by a factor of $n$ to $O(n^3)$ per Schur function. In practice, the constant hidden by the big-$O$ notation (for time complexity) is greatly increased when computing indices on the fly; we have therefore found the lookup table approach to be much more efficient. Further discussion of implementation issues and MATLAB code for both methods are available online [21].
4 The second algorithm

Our second algorithm is based on the expression of the Schur function as a quotient of totally nonnegative generalized Vandermonde alternants:

$$ s_\kappa(x_1,\ldots,x_n) \;=\; \frac{\det G}{\det V_{n,n}}, $$

where

$$ G \equiv \bigl[\,x_i^{\,j-1+\kappa_{n-j+1}}\bigr]_{i,j=1}^{n} \qquad\text{and}\qquad V_{n,n} \equiv \bigl[\,x_i^{\,j-1}\bigr]_{i,j=1}^{n} $$

are $n\times n$ generalized and ordinary Vandermonde matrices, respectively.

Since $s_\kappa(x_1,x_2,\ldots,x_n)$ is a symmetric polynomial, we can assume that the $x_i$'s are sorted in increasing order: $0\le x_1\le x_2\le\cdots\le x_n$. This choice of ordering makes $G$ and $V_{n,m}$ totally nonnegative [9, p. 76]; thus the methods of [23, Section 6] can be used to evaluate these determinants with guaranteed accuracy in $O(n^2\kappa_1)$ time. The matrices $G$ and $V_{n,m}$ are notoriously ill conditioned [11], meaning that conventional Gaussian-elimination-based algorithms will quickly lose all accuracy to roundoff [6].

The contribution of this section is to show how to eliminate the removable singularity at $x_i=x_j$, $i\ne j$, and to arrange the computations in such a way that the cost per Schur function is only $O(n^2)$ when evaluating all of $S_{N,n}$.

It is convenient to see $G$ as a submatrix of the rectangular Vandermonde matrix $V_{n,m}\equiv\bigl[\,x_i^{\,j-1}\bigr]_{i,j=1}^{n,m}$, $m=n+\kappa_1$, consisting of columns $1+\kappa_n,\ 2+\kappa_{n-1},\ \ldots,\ n+\kappa_1$.
Consider the LDU decomposition $V_{n,m}=LDU$, where $L$ is a unit lower triangular $n\times n$ matrix, $D$ is a diagonal $n\times n$ matrix, and $U$ is a unit upper triangular $n\times m$ matrix.

The critical observation here is that the value of $\det G$ is unaffected by $L$, namely

$$ \det G = \det D\cdot\det\bar U, $$

where $\bar U$ is the $n\times n$ submatrix of $U$ consisting of its columns $1+\kappa_n,\ 2+\kappa_{n-1},\ \ldots,\ n+\kappa_1$. However, $\det D=\det V_{n,n}$, thus

$$ s_\kappa(x_1,x_2,\ldots,x_n) = \det\bar U. \tag{12} $$
The explicit form of $U$ is known [8, Section 2.2], [33, eq. (2.3)], allowing us to write (12) also as:

$$ s_\kappa(x_1,\ldots,x_n) \;=\; \det\bigl[\, h_{j-i+\kappa_{n-j+1}}(x_1,\ldots,x_i)\,\bigr]_{i,j=1}^{n}, $$

where $h_k$, $k=1,2,\ldots$, are the complete symmetric polynomials and, by default, $h_k\equiv 0$ for $k<0$.
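The determinantal expression above is easy to test numerically. The sketch below is our own correctness check (hypothetical names; exact rational arithmetic via `fractions` is a convenience) and is emphatically not the subtraction-free bidiagonal evaluation of [23]: it simply builds the matrix of complete symmetric polynomials and evaluates its determinant.

```python
from fractions import Fraction

def complete_h(k, xs):
    """Complete homogeneous symmetric polynomial h_k(xs); h_k = 0 for k < 0."""
    if k < 0:
        return Fraction(0)
    dp = [Fraction(1)] + [Fraction(0)] * k   # dp[j] = h_j of the variables seen so far
    for x in xs:
        for j in range(1, k + 1):
            dp[j] += x * dp[j - 1]           # h_j(..,x) = h_j(..) + x * h_{j-1}(..,x)
    return dp[k]

def det(M):
    """Determinant by Gaussian elimination over exact Fractions."""
    M = [row[:] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for cc in range(c, n):
                M[r][cc] -= f * M[c][cc]
    return d

def schur_det(kappa, xs):
    """s_kappa(x_1,...,x_n) = det[ h_{j-i+kappa_{n-j+1}}(x_1,...,x_i) ]."""
    n = len(xs)
    M = [[complete_h(j - i + kappa[n - j], xs[:i]) for j in range(1, n + 1)]
         for i in range(1, n + 1)]
    return det(M)

print(schur_det((2, 1, 0), [1, 2, 3]))   # 60
```

Because it uses ordinary elimination, this check inherits exactly the cancellation problems discussed in Section 1 when run in floating point; its purpose here is only to validate the formula on small exact inputs.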
In order to apply the algorithms of [23] to evaluate (12), we need the bidiagonal decomposition of $U$, which has a particularly easy form:

$$ \mathcal{BD}(U) \;=\; \begin{pmatrix} 1 & x_1 & x_1 & \cdots & x_1 & x_1 & x_1\\ 0 & 1 & x_2 & \cdots & x_2 & x_2 & x_2\\ 0 & 0 & 1 & \cdots & x_3 & x_3 & x_3\\ & & & \ddots & & \vdots & \\ 0 & 0 & 0 & \cdots & 1 & x_n & x_n \end{pmatrix}. \tag{13} $$
For example, for $n=3$, $m=4$ [22, Section 3]:

$$ U = \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0 \end{pmatrix} \begin{pmatrix} 1&x_1&&\\ &1&x_2&\\ &&1&x_3\\ &&&1 \end{pmatrix} \begin{pmatrix} 1&0&&\\ &1&x_1&\\ &&1&x_2\\ &&&1 \end{pmatrix} \begin{pmatrix} 1&0&&\\ &1&0&\\ &&1&x_1\\ &&&1 \end{pmatrix}. $$
Therefore computing the Schur function consists of using Algorithm 5.6 from [23] $\kappa_1$ times to remove the appropriate $\kappa_1$ columns in $U$ and obtain the bidiagonal decomposition of $\bar U$. The determinant of $\bar U$, i.e., the Schur function, is then easily computable by multiplying out the diagonal of the diagonal factor in the bidiagonal decomposition of $\bar U$. The total cost is $O(n^2\kappa_1)$.

In this process of computing $s_\kappa(x_1,\ldots,x_n)$ a total of $\kappa_1$ (intermediate) Schur functions are computed; therefore all functions in the set $S_{N,n}$ can be computed at the cost of $O(n^2)$ each.
5 Open problems

It is natural to ask if the ideas of this paper can extend beyond $\alpha=1$ and in particular to $\alpha=2$, the other value of $\alpha$ of major practical importance [27].

None of the determinantal expressions for the Schur function are believed to have analogues for $\alpha\ne 1$, thus we are skeptical of the potential of the ideas in Section 4 to generalize.

The results of Section 3 may extend beyond $\alpha=1$, and we return to our original idea in developing Algorithm 1 to explain.
Consider the (column) vector $s^{(n)}$ consisting of all Schur functions in $S_{N,n}$ ordered in reverse lexicographic order. Let $s^{(n-1)}$ be the same set, but on $n-1$ variables $x_1,\ldots,x_{n-1}$. Then

$$ s^{(n)} = M\, s^{(n-1)}, $$

where $M$ is an $|S_{N,n}|\times|S_{N,n-1}|$ matrix whose entries are indexed by partitions and $M_{\mu\nu}=x_n^{|\mu|-|\nu|}$ if $\mu/\nu$ is a horizontal strip and $0$ otherwise. The contribution of Section 3 was to recognize that $M$ consists of blocks of the form

$$ A = \begin{pmatrix} 1 & & & \\ x & 1 & & \\ x^2 & x & 1 & \\ x^3 & x^2 & x & 1 \end{pmatrix}. $$
Since $A^{-1}$ is bidiagonal:

$$ A^{-1} = \begin{pmatrix} 1 & & & \\ -x & 1 & & \\ & -x & 1 & \\ & & -x & 1 \end{pmatrix}, $$

given a vector (call it $z$), the matrix–vector product $y=Az$ can be formed in linear (instead of quadratic) time by solving instead the bidiagonal linear system $A^{-1}y=z$ for $y$.

This was our original approach in designing Algorithm 1. Ultimately we found the much more elegant proof which we presented instead.
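The linear-time trick can be sketched in a few lines of Python (our own illustration; function names hypothetical): instead of forming $y=Az$ with the dense triangular Toeplitz block, forward-substitute the bidiagonal system $A^{-1}y=z$, which gives the recurrence $y_1=z_1$, $y_i=z_i+x\,y_{i-1}$.

```python
def apply_block_linear(x, z):
    """y = A z, where A[i][j] = x**(i-j) for i >= j (lower triangular Toeplitz),
    computed in O(k) by forward-substituting the bidiagonal system A^{-1} y = z."""
    y, prev = [], 0
    for zi in z:
        prev = zi + x * prev   # y_i = z_i + x * y_{i-1}
        y.append(prev)
    return y

def apply_block_quadratic(x, z):
    """Reference O(k^2) computation of the same product, for comparison."""
    k = len(z)
    return [sum(x ** (i - j) * z[j] for j in range(i + 1)) for i in range(k)]

z = [1, 2, 3, 4]
assert apply_block_linear(2, z) == apply_block_quadratic(2, z)
print(apply_block_linear(2, z))   # [1, 4, 11, 26]
```

The same recurrence, applied block by block across $M$, is what originally suggested the in-place updates of Algorithm 1.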
The question is whether this approach can be generalized to other values of $\alpha$. Unfortunately the matrix $A$ in general has the form:

$$ A^{(\alpha)} = \begin{pmatrix} 1 & & & \\ x & 1 & & \\ \frac{x^2}{2!}(\alpha+1) & x & 1 & \\ \frac{x^3}{3!}(1+\alpha)(1+2\alpha) & \frac{x^2}{2!}(\alpha+1) & x & 1 \end{pmatrix}, \qquad a_{ij}=\frac{x^{\,i-j}}{(i-j)!}\prod_{k=0}^{i-j-1}(k\alpha+1),\quad i>j. $$

The matrix $\bigl(A^{(\alpha)}\bigr)^{-1}$ is not bidiagonal for $\alpha\ne 1$, thus the approach of Section 3 cannot be carried over directly. One could consider exploiting the Toeplitz structure of $A^{(\alpha)}$ to form a matrix–vector product with it in $O(k\log k)$ instead of $k^2$ time (assuming $A^{(\alpha)}$ is $k\times k$) [12, p. 193]. Current computing technology, however, limits $N$ to about 200, and since $k\le N$, this approach does not appear feasible in practice at this time.
Acknowledgements We thank James Demmel for posing the problem of computing the Schur function accurately and efficiently [5, Section 9.1] as well as Iain Johnstone and Richard Stanley for many fruitful discussions during the development of this material.
References

1. Butler, R.W., Wood, A.T.A.: Laplace approximations for hypergeometric functions with matrix argument. Ann. Statist. 30(4), 1155–1177 (2002)
2. Byers, G., Takawira, F.: Spatially and temporally correlated MIMO channels: modeling and capacity analysis. IEEE Transactions on Vehicular Technology 53, 634–643 (2004)
3. Chen, W.R.: Table for upper percentage points of the largest root of a determinantal equation with five roots
4. Chen, W.R.: Some new tables of the largest root of a matrix in multivariate analysis: A computer approach from 2 to 6 (2002). Presented at the 2002 American Statistical Association
5. Demmel, J., Gu, M., Eisenstat, S., Slapničar, I., Veselić, K., Drmač, Z.: Computing the singular value decomposition with high relative accuracy. Linear Algebra Appl. 299(1–3), 21–80 (1999)
6. Demmel, J., Koev, P.: Accurate and efficient evaluation of Schur and Jack functions. Math. Comp. 75(253), 223–239 (2006)
7. Forrester, P.: Log-Gases and Random Matrices. http://www.ms.unimelb.edu.au/~matpjf/matpjf.html
8. Gander, W.: Change of basis in polynomial interpolation. Numer. Linear Algebra Appl. 12, 769–778 (2005)
9. Gantmacher, F., Krein, M.: Oscillation Matrices and Kernels and Small Vibrations of Mechanical Systems, revised edn. AMS Chelsea, Providence, RI (2002)
10. Gao, H., Smith, P., Clark, M.: Theoretical reliability of MMSE linear diversity combining in Rayleigh fading additive interference channels. IEEE Transactions on Communications 46, 666–672 (1998). DOI 10.1109/26.668742
11. Gautschi, W., Inglese, G.: Lower bounds for the condition number of Vandermonde matrices. Numer. Math. 52(3), 241–250 (1988)
12. Golub, G., Van Loan, C.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore, MD (1996)
13. Grant, A.: Performance analysis of transmit beamforming. IEEE Transactions on Communications 53, 738–744 (2005)
14. Gross, K.I., Richards, D.S.P.: Total positivity, spherical series, and hypergeometric functions of matrix argument. J. Approx. Theory 59(2), 224–246 (1989)
15. Gutiérrez, R., Rodriguez, J., Sáez, A.J.: Approximation of hypergeometric functions with matricial argument through their development in series of zonal polynomials. Electron. Trans. Numer. Anal. 11, 121–130 (2000)
16. Harding, M.: Explaining the single factor bias of arbitrage pricing models in finite samples. Economics Letters (forthcoming)
17. James, A.T.: Distributions of matrix variates and latent roots derived from normal samples. Ann. Math. Statist. 35, 475–501 (1964)
18. Jeffris, M.: Robust 3D ATR techniques based on geometric invariants. In: F.A. Sadjadi (ed.) Automatic Target Recognition XV, Proceedings of SPIE, vol. 5807, p. 190. SPIE, Bellingham, WA (2005)
19. Kaneko, J.: Selberg integrals and hypergeometric functions associated with Jack polynomials. SIAM J. Math. Anal. 24(4), 1086–1110 (1993)
20. Kang, M., Alouini, M.S.: Largest eigenvalue of complex Wishart matrices and performance analysis of MIMO MRC systems. IEEE Journal on Selected Areas in Communications 21(3), 418–426 (2003)
21. Koev, P.: http://www4.ncsu.edu/~pskoev
22. Koev, P.: Accurate eigenvalues and SVDs of totally nonnegative matrices. SIAM J. Matrix Anal. Appl. 27(1), 1–23 (2005)
23. Koev, P.: Accurate computations with totally nonnegative matrices. SIAM J. Matrix Anal. Appl. 29(3), 731–751 (2007)
24. Koev, P., Edelman, A.: The efficient evaluation of the hypergeometric function of a matrix argument. Math. Comp. 75(254), 833–846 (2006)
25. Macdonald, I.G.: Symmetric Functions and Hall Polynomials, 2nd edn. Oxford University Press, New York (1995)
26. McKay, M., Collings, I.: Capacity bounds for correlated Rician MIMO channels. In: 2005 IEEE International Conference on Communications, ICC 2005, vol. 2, pp. 772–776 (2005)
27. Muirhead, R.J.: Aspects of Multivariate Statistical Theory. John Wiley & Sons Inc., New York (1982)
28. Ozyildirim, A., Tanik, Y.: SIR statistics in antenna arrays in the presence of correlated Rayleigh fading. In: IEEE VTS 50th Vehicular Technology Conference, 1999, VTC 1999 - Fall, vol. 1, p. 67 (1999). DOI 10.1109/VETECF.1999.797042
29. Patterson, N., Price, A., Reich, D.: Population structure and eigenanalysis. PLoS Genetics 2, 2074–2093 (2006)
30. Smidl, V., Quinn, A.: Fast variational PCA for functional analysis of dynamic image sequences. In: Proceedings of the 3rd International Symposium on Image and Signal Processing and Analysis, 2003, ISPA 2003, vol. 1, pp. 555–560 (2003). DOI 10.1109/ISPA.2003.1296958
31. Stanley, R.P.: Some combinatorial properties of Jack symmetric functions. Adv. Math. 77(1), 76–115 (1989)
32. Stanley, R.P.: Enumerative Combinatorics, Vol. 2, Cambridge Studies in Advanced Mathematics, vol. 62. Cambridge University Press, Cambridge (1999)
33. Van de Vel, H.: Numerical treatment of a generalized Vandermonde system of equations. Linear Algebra Appl. 17, 149 (1977)