>> David Wilson: So I'm pleased to introduce our next speaker, Ioana Dumitriu, from the University of Washington, where she's an associate professor in the math department. She got her Ph.D. at MIT in 2003 with Alan Edelman, who we heard about in the preceding talk. After that she was a Miller Fellow at U.C. Berkeley until 2006, and then came to UW. She's worked in various fields: probability perhaps most recently, but also numerical analysis, combinatorics, and linear algebra. And she's going to tell us about a regular stochastic block model.
>> Ioana Dumitriu: Thank you for the very kind introduction. I'm sorry, but I'm going to have to correct you. It's not the same Alan Edelman. It's actually not Alan Edelman; it's somebody else. The Edelman of Edelman-Greene, is that what you were --
>> David Wilson: Oh, really.
>> Ioana Dumitriu: It's a different Edelman. It seems to be a popular name.
All right. So it is my great pleasure to be talking about an ongoing problem that we're working on at UW. This is joint work with my students Gerandy Brito and Shirshendu Ganguly, with my colleague Chris Hoffman, and for part of this we were joined by Linh Tran, who was a post-doc at UW up until this year. So we're talking about a regular stochastic block model. Whoops. I guess this one. Or not. All right. So I'm going to give you a quick overview of what I'm going to tell you, then I'm going to tell you, and then I'm going to tell you what I told you. I'll give you a quick introduction to the field. Actually it's not so quick; it's probably going to be somewhere between a third and a half of the talk, really. Then the independent-edge binary SBM, which will actually be partly described in the introduction as well, and then I will tell you how you change that definition to make it regular. And I will talk about the issues that arise: why certain things become easier, why certain things are slightly harder, but why overall the problem seems much more tractable. And I'll tell you about current and future work.
All right. So let me, without further ado, talk about the clustering problem. This is
a very well-studied problem. Some would say that it has been studied to death,
but it's still not solved completely. So it's still worth talking about.
In fact, it's interesting to people. The input is a network with clusters with some sort of properties, possibly overlapping, although I won't be talking about that today. And the idea is to be able, through some algorithm, to detect or recover these clusters accurately and efficiently. I'll talk about what it means to detect and recover in a moment. But it has applications in many places: machine learning, community detection, synchronization, channel transmission, and so on. And there are many questions that are still open and quite subtle, especially in the case when overlap is possible, which again is not going to be the case I'll be talking about today.
There's a huge body of work, as I mentioned, and it overlaps many subjects, many fields: optimization, EE, theoretical computer science, and math. There are two approaches that one can take to studying this problem. One of them is to study actual networks and try and see if your algorithm actually can detect the clusters. This is Zachary's karate club network, the network that describes the interactions between members of a karate club that suffered at some point a division: some people went with the old instructor and some people moved to the new one, and essentially this picture describes how the secession took place. This is the group that went with, or rather stayed with, the instructor, and this is the group that seceded. This is a network that's been really studied to death. It's used as a benchmark for pretty much any algorithm: look, my algorithm can actually cluster Zachary's network correctly, and so on.
The other approach is to focus on studying idealized models of networks, like, for example, the ones produced by the Erdős-Rényi model, and this is known as the problem of studying spherical cows.
That's what we're going to be doing. So here's the stochastic block model, or SBM for short, also known as the planted partition model, if you've seen that in the literature. Essentially it says the following thing; it's very easy to describe. You consider k Erdős-Rényi graphs with sizes n_i and probabilities p_i, independent and nonoverlapping. And aside from that, on the join of the graphs you put a multipartite Erdős-Rényi graph, and by multipartite I mean basically that any two vertices not in the same cluster get joined by an edge with probability q. So you have some sort of k clusters, and then on the outside of them you put a multipartite Erdős-Rényi with some given probability. The question then becomes: under what sort of conditions on all of these parameters -- this should have been a lowercase k, sorry -- can one recover or detect the presence of the partition? Recovery here is generally understood as being complete recovery, although sometimes weak recovery is considered, where you expect to recover all but sublinearly many of the nodes. Detection means that you're able to say whether you believe that there's some structure like this at work or the network is just random.
Okay. So the possibility of recovery has been studied generally via the maximum likelihood estimator and convex relaxations thereof. Recently, though, there's a new approach using matrix decomposition algorithms. You think of the adjacency matrix as being sparse plus low rank: the sparse part being the outside edges, and the low-rank part actually identifying the clusters, so essentially a matrix that corresponds to taking cliques on the same nodes instead of the clusters. One of the references is Vinayak, Oymak, and Hassibi. The most general analysis, via information-theoretic impossibility bounds plus a convex relaxation of the maximum likelihood estimator, is fairly new. It's actually very new: it appears in a paper by Chen and Xu from 2014, and it gives various order-sharp bounds, which means they don't get the constants, but the order of the thresholds is found.
The only case that has so far really been solved in terms of sharp thresholds is the two-equal-cluster binary case, and one can say that most of the work has been done by an alum of the Microsoft group and two ex-students. I'm referring to Elchanan Mossel, Joe Neeman, and Allan Sly.
So here's a dictionary of terms. I will talk about a strong recovery regime when there are algorithms that will give you the partition completely. Weak recovery regime means you don't get it exactly, but you get essentially all but sublinearly many nodes labeled correctly, and the others may be mislabeled. There's an approximation regime where you get a fraction of the nodes correctly, the fraction is bigger than 50 percent, but the rest may be mislabeled. There's a detection regime where you can guarantee to get better than 50 percent of the nodes correctly labeled, but you can't quantify how much better you can do. And then there's the impossibility regime where it is impossible to do better than guessing, essentially. And generally it's because of indistinguishability, I'm not even sure if that's a word, indistinguishability reasons, where your model is essentially indistinguishable from an Erdős-Rényi with the correspondingly adjusted probability.
So the nomenclature in the field is varied, and it seems that everybody uses their own preferred words. I'm using this, which is kind of a combination between Mossel-Neeman-Sly and Abbe, I'm not sure how to pronounce his name, and Bandeira. I should know how to pronounce it because I know him, so it's embarrassing.
This is the nomenclature; I hope you'll be able to remember it. This is the definition of the binary stochastic block model. You start with 2n nodes and you pick n of them to be labeled +1 and the others -1, uniformly and independently. Then you add an edge between vertices with the same label with probability p, which in fact is going to be a function of n. And then, sorry, you add edges between vertices with different labels with probability q. And you call the resulting graph model G(2n, p, q). I guess I'm going to have to use both of these somehow.
both of these somehow. Okay. So let me talk about briefly about the strong
recovery regime. You will need to have P and Q at least logarithmic in N
because otherwise you'll get essentially with depending on the constant with
constant probability you'll get isolated vertices, and those you cannot classify.
There's a bunch of people who have worked on this problem and have made
seminal contributions and it was solved almost completely by Abbe Bandeira and
Hall in 2014 and completely by Mossel Neeman Sly a few months later. In other
words, if you have this regime plus some tiny conditions I guess which I'm going
to show you in a moment you can do complete recovery. So this is the state of
the art. And when you see MMS, it's going to be short for Mossel Neeman Sly,
you'll see that a lot in this talk. Which is why I just felt like I could shorthand it.
So the state of the art is a rather complex characterization, but certain cases can be made more explicit. For example, if you think of np and nq as being roughly a constant times log n, although the constants can fluctuate a little bit, then strong recovery is possible if and only if this happens. So in other words, if this number here, or rather the sequence of numbers here, stays strictly positive, then you have strong recovery, and for strong recovery it is necessary that it be at least nonnegative. This matches the Abbe, Bandeira, and Hall results, but they weren't thinking in terms of letting the constants fluctuate, and therefore this is what they got; they didn't get this.
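[Editor's illustration: to make one special case concrete, with the threshold as it is usually quoted in this literature rather than read off the slide: if p = a log n / n and q = b log n / n for fixed constants a > b, strong recovery is possible exactly when (sqrt(a) - sqrt(b))^2 >= 2.]

    import math

    def strong_recovery_possible(a, b):
        # Exact-recovery threshold for p = a*log(n)/n, q = b*log(n)/n with
        # fixed constants, as usually quoted for the ABH / MNS results; the
        # fluctuating-constants statement on the slide is finer than this.
        return (math.sqrt(a) - math.sqrt(b)) ** 2 >= 2

    # Example: strong_recovery_possible(9, 1) is True, since (3 - 1)^2 = 4 >= 2.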
>> When you say "strong recovery is possible" did that include a statement
about computational complexity?
>> Ioana Dumitriu: It's a little vague. Abbe, Bandeira, and Hall were showing this is possible under slightly different conditions. It seems that the algorithm that MNS have come up with is almost linear in the number of edges, so that would suggest yes.
>> But I'm -- is that what you're focusing on or just --
>> Ioana Dumitriu: Just the existence, the possibility of strong recovery. But it seems like that's the case. I think that they will probably do a complete analysis. In fact I was going to say something about that in a little while. But it seems like that might be the case.
So there's a weak recovery regime in which you can recover all but sublinearly many of the vertices, and this is again MNS 2014. And it's a very nice result. Essentially it says that you have to have probabilities slightly bigger than 1 over n: you want np_n and nq_n to go to infinity, and you have to have this extra condition attached. So not necessarily logarithmic; you can have them sub-logarithmic, but because of that you cannot hope for full strong recovery. And it seems that what you cannot get is caused by the fact that some vertices may be mislabeled, in the sense that they might actually be labeled one but have more connections to the wrong set, to the other set.
>> Minus sign on the second --
>> Ioana Dumitriu: I'm sorry. This should be a plus sign. This should be an identity. So this is a plus. Apologies. Okay. Then there's an approximation regime, which actually was first studied by Coja-Oghlan, I hope I'm pronouncing it correctly, who showed that if you consider p and q to be a over n and b over n and you have this extra condition for some large constant, then you can detect, bear with me, I'll talk about detectability next, you can detect the presence of the partition, but the fraction of recovered vertices is bounded: you cannot do better than a constant fraction of the vertices, which fraction depends on C. And then an algorithm from 2003, but there are others that I'll talk about soon, uses belief propagation to show that this fraction is actually achievable, yielding thus an approximation regime in which you can get a certain fraction of the vertices correctly. When C approaches two, the constant approaches a half, which is why nobody expected that any kind of approximation could be done in the lower range. Finally,
detection and impossibility. Again, Mossel, Neeman, and Sly have been all over this problem, though it turns out they had a challenger: Massoulié, at the same time and independently, obtained a different proof of this. They showed that when you're in this regime of p and q being of order 1 over n, specifically a over n and b over n, then there exist polynomial-time algorithms to find a correlated partition if and only if this inequality holds. This is a threshold that was conjectured by Krzakala, Moore, and Zdeborová, and it has been reinforced several times since then, in particular in 2012, when Mossel, Neeman, and Sly showed that if you're under that threshold, then the graph is indistinguishable from an Erdős-Rényi graph that has 2n vertices and the average of the two probabilities (there should be an n there, I apologize; no, there shouldn't be an n, so it's correct), and so reconstruction is impossible. If you cannot distinguish, you cannot detect.
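[Editor's illustration of the threshold just mentioned, not the speaker's slide:]

    def detection_possible(a, b):
        # Krzakala-Moore-Zdeborova threshold for p = a/n, q = b/n: detecting
        # the planted partition is possible, and in polynomial time (shown
        # independently by Massoulie and by MNS), iff (a - b)^2 > 2(a + b).
        return (a - b) ** 2 > 2 * (a + b)

    # Example: a = 5, b = 1 gives (5 - 1)^2 = 16 > 2 * 6 = 12, so detectable.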
Okay. So complexity, as mentioned, is not written in stone, but it seems that there are polynomial-time algorithms for strong and weak recovery, belief propagation is also efficient, and detectability was shown to be polynomial time by both groups who showed detectability. So the bottom line is that there seems to be no regime where reconstruction is possible but not in polynomial time. That's a very interesting thing, because it runs contrary to the widespread belief that there are hard regimes if you have more than two clusters, perhaps clusters that are growing, or not linearly scaling. And it has a connection to the minimum bisection problem, which again is known to be hard. So it's kind of an interesting --
>> I'm sorry, but I got lost. If there's this graph that is randomly generated and somebody wants to answer all these questions, what is the information that that person has access to?
>> Ioana Dumitriu: The adjacency matrix. Thank you. Sorry. You have access to the graph, and you have to test, once you have this large graph, whether the model it's been generated from is the one described here. Okay. So let me introduce this notion of a regular stochastic block model. So basically we do the same thing, except that now, instead of taking Erdős-Rényi graphs, we take uniformly random regular graphs. So for integers d1 and d2, you take two d1-regular uniformly random graphs of size n, and you connect them by a bipartite d2-regular graph on n plus n vertices, also uniformly random. Everything is independent of the other things that you're doing. And of course the question is why, why would you do something like that, and of course the answer can be many-fold. The structure in such a model is a lot more rigid: can you say a lot more than you did before, can you do recovery in other kinds of cases, can you do recovery in the so-called lower regimes, when you have the analogues of p and q smaller? There's also edge dependence: how does that affect things, because before, edge independence was playing a pretty heavy role in the calculations; do you use the same methods or do you come up with others? And of course, lastly, because it's there; we're in the math department.
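[Editor's sketch of one rough way to sample this model; the configuration model stands in for the uniform measure, and conditioning on simplicity would make it exact. Note n*d1 must be even.]

    import networkx as nx

    def sample_regular_sbm(n, d1, d2):
        left = nx.random_regular_graph(d1, n)     # cluster +1, nodes 0..n-1
        right = nx.random_regular_graph(d1, n)    # cluster -1
        G = nx.disjoint_union(left, right)        # relabels right to n..2n-1
        # d2-regular bipartite part between the clusters; the bipartite
        # configuration model can produce multi-edges, which collapse when
        # copied into the simple graph G, so resample until simple to get
        # the exact model.
        B = nx.bipartite.configuration_model([d2] * n, [d2] * n)
        G.add_edges_from(B.edges())               # B's parts sit at 0..n-1 and n..2n-1
        return G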
Okay. So the first thing to note: if you remember, I mentioned that there's an impossibility regime for the Erdős-Rényi stochastic block model, and that is that if the relationship between the two probabilities is of a certain nature, then you essentially cannot distinguish between your model and just a bigger Erdős-Rényi model with the average of the probabilities. Here you can always distinguish between this model and just a uniformly random (d1 + d2)-regular graph, even though your model is also of that type: if you look at the fact that you have two d1-regular graphs joined by a d2-regular bipartite graph, that means the graph is actually (d1 + d2)-regular. However, it's a very different distribution, so you can tell, basically. And the reason is that if you count the number of graphs in the two sets, one is exponentially smaller. So this is where things diverge for the first time. Of course, unfortunately, distinguishability here carries no computational value. What we'd really like is to prove uniqueness of the partition. In other words, if you generate your graph like this, you would like to know that there's basically no chance it can also be expressed in the same way with a different partitioning into two classes of vertices. Then you can hope for recovery. If it turns out that you don't have uniqueness, there's no chance of recovery. Okay. So that's what one would like.
And of course we've made progress towards this, very interesting but partial progress, because we can show uniqueness when d2 is less than d1, but only for huge sizes of d2, and the interesting stuff happens when d2 is small. So that's why this is very partial progress. But we're not going to stop here. The idea is roughly to improve our results on this lemma, the overlap lemma, to which I'm going to refer again. It says that if a second partition exists, if you want, then the smaller swap set must be large. So if you have two such partitions in your graph, then the set you have to swap must be really large; in other words, you cannot just swap a few vertices and hope to get a second partition. We're working on improving this by upping the fraction from one over two d2 to one-half; one-half is a barrier, because it's the smaller of the two sets. It's very easy to explain why this happens. So suppose
that this is your left, let's say, d1-regular graph, and this is your right d1-regular graph. And suppose that a second partition is possible, which essentially swaps the set of vertices B with the set of vertices C. Then there's an actually trivial observation: if you pick a vertex v in B, the number of connections it has to vertices in A has to be the same as the number of connections it has to vertices in D, because you're about to swap things, and the swapped graph has to still be d1-regular. The problem is that d1 and d2 are different, and we know that v cannot have more than d2 connections across; in fact, v has to have precisely d2 connections to C union D, so it cannot have more than d2 connections to D. However, if the swap set is small, the chance of getting a vertex in B all of whose d1 same-side connections go to A rather than to B is significant, and at that point it would have to have the same degree into D, but d1 is bigger than d2, so you can't do it. So this is essentially a good heuristic for the argument of why you have to have uniqueness of the partition in such a case.
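[Editor's toy checker for the degree constraint just described; the graph is a dict from node to neighbor set, and A, B, D follow the picture being narrated. All names are mine.]

    def swap_is_degree_feasible(G, A, B, D):
        # For a second partition that swaps B and C to exist, every v in B
        # must have exactly as many neighbours in A as in D; otherwise the
        # swapped graph cannot remain d1-regular on its side.
        for v in B:
            deg_A = sum(1 for u in G[v] if u in A)
            deg_D = sum(1 for u in G[v] if u in D)
            if deg_A != deg_D:
                return False
        return True

The heuristic says that for a small swap set B this check fails with significant probability: some v in B will have all d1 of its same-side edges going to A, forcing deg_D = d1 > d2, which is impossible.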
Okay. So I think I just explained this. Then we have, and this is I think the only proof that I'll show, I'm going to zoom through the rest of it, an easy spectral regime. The fact that you are working with a regular graph allows you access to a whole host of very interesting, simple linear algebra properties. As a consequence of these properties we have this theorem, which says that if the difference between the two degrees is bigger than this quantity here, then the second largest eigenvalue of the adjacency matrix is d1 minus d2. The first eigenvalue is d1 plus d2, because it's a (d1 + d2)-regular graph, but the important thing is the second largest eigenvalue of d1 minus d2 with multiplicity 1, with eigenvector corresponding to the correct partition. If you have the adjacency matrix and you're in this regime, you basically find the second eigenvector and you're good to go; that's going to give you the partition.
And also a consequence is that the partition is recoverable. This is due to the fact that the multiplicity here is 1: if you were to have two distinct partitions, both of those would have corresponding eigenvectors with eigenvalue d1 minus d2; however, the multiplicity of this eigenvalue is 1. The partition is unique and recoverable, and it solves the minimum bisection problem for this graph. The proof is simple. It's linear algebra.
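[Editor's minimal sketch of the recovery step the theorem licenses, assuming a symmetric 0/1 adjacency matrix in the easy spectral regime:]

    import numpy as np

    def spectral_partition(A):
        # Eigenvalues come back in ascending order, so the eigenvector of the
        # second-largest eigenvalue (d1 - d2, with multiplicity 1) is column -2;
        # its signs read off the planted partition.
        vals, vecs = np.linalg.eigh(A)
        v2 = vecs[:, -2]
        labels = np.sign(v2)
        labels[labels == 0] = 1        # break any exact ties arbitrarily
        return labels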
These two are facts; you can check them. Then you can split the adjacency -- sorry? I'm thinking, if people are falling asleep in the audience, they could spend some time actually checking. You split the adjacency matrix like this: so now you have the two d1-regular graphs, and this is the bipartite d2 part. These are the adjacency matrices of the random regular graphs, and this is the adjacency matrix of the random bipartite regular graph. And you just do spectral analysis on them, essentially.
So you bound this quantity from above; actually I should have probably put an absolute value here, but it's a symmetric matrix. I should have put an absolute value. This is what happens: if you look at a vector that's orthogonal to the first eigenvector and the second eigenvector, then the value that this quadratic form takes is at most this, and this is strictly smaller than d1 minus d2 by the condition. How can you show that? Essentially you split the matrix like I said and then you make a gross overestimate: split it into these two parts and use the fact that, at least for the -- this should have been an A1, apologies, this should have been an A1; remember that A1 is the part corresponding just to the two d1-regular graphs. On each one of them you have, due to Friedman, a very nice result on the second eigenvalue; you bound that, and the overall bound you get is roughly two times the square root of d1. The same kind of thing holds true for the bipartite graph: a similar bound on its second eigenvalue was shown by Puder in 2013. You put it together and you get this, and it follows that the second eigenvalue is d1 minus d2 with multiplicity 1, et cetera, et cetera. So this is an easy regime. Very spectral.
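[Editor's note: the slide's exact quantity is not recoverable from the transcript, but combining Friedman's bound for the d1-regular blocks with the analogous bipartite bound additively gives one plausible reading of the condition. Treat this as an assumption, not the paper's precise statement.]

    import math

    def easy_spectral_regime(d1, d2):
        # Hedged reading: Friedman gives roughly 2*sqrt(d1 - 1) for each
        # d1-regular block, and the bipartite part contributes roughly
        # 2*sqrt(d2 - 1); the theorem needs d1 - d2 to beat the total.
        return d1 - d2 > 2 * math.sqrt(d1 - 1) + 2 * math.sqrt(d2 - 1)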
That's not the only case in which you can get strong recovery in this model; that's just an easy regime, and actually we've done it that way because it's easy to explain. But it turns out that, essentially using the methods of Massoulié, adapting I should say the methods of Massoulié, we can prove this theorem: if the square of the difference of the degrees is bigger than d1 plus d2, then the partition is strongly recoverable in polynomial time. Now mind you, Massoulié had detectability; this is much stronger, recoverability. Essentially the methods are adapted to work for the case where you have a different kind of, or almost, independence of edges, because d1 and d2 are fixed and you can use the configuration model for the uniformly random regular graph. It's not exactly trivial work; however, you get a lot more. So Massoulié used the matrix of self-avoiding long walks, in contrast with the MNS strategy of using the matrix of nonbacktracking long walks. The difference is that in the second case the entries of the matrix are bigger and the method is not spectral, whereas Massoulié's method is spectral, and we used that because it was more accessible to us.
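[Editor's toy illustration of the object being analyzed: a brute-force construction of the self-avoiding-walk matrix for a small graph. It takes exponential time, and the actual algorithm of course never builds the matrix this way.]

    import numpy as np

    def saw_matrix(adj, L):
        # Entry (i, j) counts the self-avoiding walks (no repeated vertex)
        # with exactly L edges from i to j; adj is a list of neighbour lists.
        n = len(adj)
        S = np.zeros((n, n), dtype=np.int64)

        def extend(path):
            if len(path) == L + 1:       # path now uses exactly L edges
                S[path[0], path[-1]] += 1
                return
            for w in adj[path[-1]]:
                if w not in path:        # enforce self-avoidance
                    extend(path + [w])

        for v in range(n):
            extend([v])
        return S

The spectral argument then runs on this matrix for L of order log n, where, as described below, the top two eigenvalues behave like (d1 + d2)^L and (d1 - d2)^L.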
And this is all I'm going to tell you about how to prove it. I had several lemmas that I was hoping to show, which I'm obviously not going to have time for. The idea is to do a local analysis. You have to show that no two cycles are close, and of course this is known for the d-regular graph; it takes a little bit more work to show it in the context of this model, where you put regular graphs together. And this holds: no two cycles are closer than C log n, so cycles are far apart. And then there's, of course, the connection between path structures and labels of the neighborhood, which in our case, given that we have a regular graph, is very simple to establish, because we know exactly what the neighborhoods look like. Whereas in the case of Erdős-Rényi there's still an element of randomness, here it's an exact count.
And finally the truly important ingredient is the following. You have to show separation of the first two eigenvalues of this matrix, the matrix of self-avoiding walks of length L, from the rest of the spectrum. So if you show separation of the top two eigenvalues, then essentially you get a partition that is correlated with the original partition, except that in our case it's not just correlation: in our case it allows you to recover the original partition up to n minus little o of n vertices, and then you correct the mislabeled ones, all of them, by majority rule, and you get complete recovery. This is a pretty standard technique.
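[Editor's sketch of the majority-rule clean-up: since d1 > d2, each vertex has more neighbors inside its own cluster than across, so relabeling every vertex by the majority label among its neighbors fixes the o(n) mistakes.]

    import numpy as np

    def majority_correct(adj, labels, rounds=1):
        # adj is a list of neighbour lists; labels is a +/-1 vector that is
        # already correct on all but o(n) vertices.  One synchronous round
        # of neighbourhood majority vote is the usual final step.
        labels = np.asarray(labels, dtype=int).copy()
        for _ in range(rounds):
            new = labels.copy()
            for v, nbrs in enumerate(adj):
                s = sum(labels[u] for u in nbrs)
                if s != 0:
                    new[v] = 1 if s > 0 else -1
            labels = new
        return labels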
Okay. I'm going to just show you this. So this is the core of the argument, really. We showed that the graph is tangle-free with high probability, and if that's true, then the following estimates hold for these two quantities. So e_n, remember, is the vector of all 1s, and sigma is the vector that gives you the signs. If you look at what happens here, you see that if you scale these two down by the square root of n, they become unit vectors, both of them. So this n here and here, and this n minus big O of n to the delta, where delta is small, will essentially disappear, and what you're left with tells you that this essentially goes to the first eigenvalue of S_L, which is going to be (d1 + d2) to the L, and this is almost an eigenvector. Similarly, this will go to what will turn out to be the second eigenvalue, which is (d1 - d2) to the L, and this is close to an eigenvector; you can actually show that it is polynomially close to an eigenvector, provided that you can show good separation from the rest of the spectrum. So if you can show that for any other vector orthogonal to these two guys the same estimate is much, much smaller, then you're done. And it turns out that that's true: if you look at unit vectors that are orthogonal to those two, then the estimate is much, much smaller. Notice here that you have a (d1 + d2) to the L over 2; you essentially want that to be smaller than (d1 - d2) to the L, and that's what gives you the condition.
Finally, there's the question of: okay, so we know that the second eigenvector is going to give you the partition up to sublinearly many vertices which are incorrectly labeled, but couldn't you have two partitions for both of which the second eigenvector has that property? Well, no, because in that case the two partitions would overlap very much; sorry, the swap between them would be very small, and we've shown that the swap has to be relatively big. This was the last thing. And so this is just the recap.
We showed that strong recovery is possible in polynomial time, and we believe that recovery is always possible; that's what uniqueness tells you, that recovery is always possible. This is a rigid model, so if you have the graph, you could in principle just test all possible pairings and see if one works. Of course, that's not efficient, but you could do it, and that's very different from before. And the question is: is there a threshold for the complexity? So we show that recovery is possible in polynomial time in a certain regime, and we believe, and we believe we'll actually be able to show, that recovery is always possible. This threshold is given by the method; we have no idea if it can be pushed down just yet, but we're working on it. But it's also possible that this is actually an efficiency threshold: perhaps lower than this you cannot get polynomial-time algorithms. And of course then the idea is to generalize to multiple clusters, and I'm going to stop here. Thank you.
[applause].
>> David Wilson: Questions?
>> I have a question, but it's not mathematical. I understand these results are asymptotic, so in some sense they should hold for large networks. So I'm thinking about something like Facebook. But in Facebook, when I try to think about communities, I would guess that there's actually a very large number of communities, not just two or three or some small finite number.
>> Ioana Dumitriu: And there's overlap as well.
>> So basically, what are good practical motivating examples of a large network which, on this big scale, has a small number of communities?
>> Ioana Dumitriu: That's why I talked about spherical cows.
>> What?
>> Ioana Dumitriu: That's why I talked about spherical cows. Yes, in general you're completely right. So the idea is to get results that are asymptotically true for many clusters, for overlapping clusters, for clusters of different sizes, and so on. And there's a whole body of literature on that, but no sharp thresholds; thresholds that are perhaps at best order thresholds. So we started off with this example because there's hope here that one can analyze it completely. Higher than that, probably not. However, I am actually working with Maryam Fazel and with a couple of her students on a problem just like that, so we're actually making some interesting progress. Generally the algorithms that will produce the clustering will be some sort of convex relaxation of the MLE, and it turns out that in certain regimes they perform well, and the question is what these regimes are. And generally the conditions that you get involve all the parameters in the problem. So you have to start saying, okay, if the clusters are equal, what does that mean in terms of the probabilities associated to each cluster; if the clusters are very separated, how can I play with the probabilities to get an impossibility regime; and things like that.
>> Just one relevant example: of course this random graph model is a rough approximation, but clustering into two is something you want to do all the time. For instance, just differentiating legitimate websites from spam websites; that would be two clusters. And there are natural links between the legitimate websites, and lots of artificial links created among the spam websites, and the links that go between these groups may be of a different nature. And this graph structure is the basis, one of the tools I should say, for this kind of distinction, though the real picture is very different from these beautiful models.
>> So that's nice. So actually there are some real examples.
>> There are lots of real examples, and we want to distinguish between the two. So that part is real. These random graph models are --
>> In that case you don't really have two clusters, because the spam may consist of many different spam clusters.
>> Right.
>> Ioana Dumitriu: Or not clustered points even, which is --
>> The thing is you want to make it binary.
>> You want to make a binary decision.
>> Ioana Dumitriu: Not necessarily with equal weights or anything.
>> What do the counterexamples for d2 equal to 2 look like? So I get some idea when it --
>> Ioana Dumitriu: You can find counterexamples. For example, with d2 equal to 2 you don't have connectivity, so you can have long cycles of the same length and you can essentially swap those. You can construct those.
>> But you're saying you think it's a theorem for any d-regular graph --
>> Ioana Dumitriu: Yes, yes, so there's a theorem that says the following thing: with d greater than or equal to 3, then with probability 1 asymptotically, the graph is connected.
>> No, but what do you strongly suspect is true? You suspect it to be a theorem for the d-regular graph --
>> Ioana Dumitriu: You can find examples for D2 equals 2 where the probability
of encountering a graph that has two possible partitions is not zero.
>> But it's not asking about the probability.
>> Ioana Dumitriu: I'm sorry, I guess I don't understand.
>> He's asking about the uniqueness of partitions for --
>> Ioana Dumitriu: Oh, what's the question?
>> So I found you saying that you suspect it's deterministically true that a d-regular graph --
>> Ioana Dumitriu: Not deterministically true. I think it's true with high probability. For example, if you think of a square: okay, so you have n, n is even, so you just put n over 2, n over 2, n over 2, n over 2, and within each you put, I guess, d-over-2-regular graphs or something like that, and then between them bipartite d over 2. In this case d1 is equal to d2, but you can imagine other cases where it works.
>> The idea is these would occur with multiple --
>> Ioana Dumitriu: Yes. No, there's no reason to believe this is deterministic; I think it's with high probability, though. With vanishing probability would you expect to encounter a second partition in a graph that has one. Sorry, I didn't understand.
>> David Wilson: Any other last questions? Let's thank Ioana again.
[applause]