Correlation of preferences and Honesty in matching market

Sangram Vilasrao Kadam†
March 2, 2011
I would like to thank Peter Coles and Al Roth for providing an excellent course which motivated me to investigate this issue and for providing me inputs through the progress of this paper. I would also like to thank Assaf Romm for helpful discussions.

† Department of Economics, Harvard University, Cambridge, MA 02138. G1; ID 30802853; email: svkadam@fas.harvard.edu.

Abstract

The limit convergence of the core in a large matching market is a known result. If the maximum size of the ranked list submitted by men is fixed and if their preferences are randomly generated from any distribution, the proportion of women who have incentives to manipulate their preferences goes to zero as the number of participants in the market grows to infinity. I extend this result by allowing correlation in the preferences of men, while the women's preferences are allowed to be complete and arbitrary. Furthermore, I find that for a given level of the proportion of women who can manipulate their preferences, we can allow an increase in the size of the rank order list as the correlation in preferences increases. However, I find that the bounds arrived at are not at all informative for real world settings. This leaves the question about core convergence in large matching markets still open.

Introduction

Consider a one-to-one matching market where there are $n$ men and $n$ women. Suppose they play the matching game to produce a 'stable matching,' where no pair of individuals would prefer each other to their existing matches and no individual would prefer to remain single rather than being matched to their current match. The seminal result about the existence of at least one such matching, when preferences are strict on both sides of the market, was given by Gale-Shapley (1962). It can be produced by using the Deferred Acceptance Algorithm (DAA), where one side of the market, say men, proposes. A man-proposing DAA produces a man-optimal stable matching and a woman-pessimal stable matching. Roth-Sotomayor (1990) prove that the women always have an incentive to misreport their preferences
if there is more than one stable matching, i.e. if the match generated for a given woman in the man-optimal stable matching is different from that in the woman-optimal stable matching. Thus, she does not have incentives to truthfully reveal her preferences.

The extension of the stable one-to-one matching market to the real world problem of medical residency matches has been studied quite extensively. Roth-Peranson (1999) (henceforth RP) ran some simulations on the size of the markets and found that as the number of doctors increases, the proportion of hospitals who have more than one set of stable matches increases to about 90% for $n = 1000$, if the preferences on both sides of the market are complete and randomly generated. It is important to note that completeness guarantees that every hospital is acceptable to every doctor and vice versa. In practice, however, only a small number of hospitals get listed by doctors on their preference lists. Further simulations, accounting for this observed fact, led to the Roth-Peranson conjecture that the proportion of hospitals who have incentives to manipulate their preferences goes to 0 as the size of the market increases, when the preferences for both sides of the market are generated at random from a uniform distribution.

This conjecture initiated a set of theoretical investigations to understand the underlying mechanisms. Immorlica-Mahdian (2005) (henceforth IM) presented the first explanation for the setting of one-to-one matching markets. Kojima-Pathak (2008) (henceforth KP) extended the above to many-to-one matching markets and get a similar result, that the proportion of hospitals who have incentives to misrepresent their preferences goes to zero as $n$ grows large. Kojima, Pathak and Roth (2010) (henceforth KPR) extend this even further to situations where there are couples in the market. Couples incorporate correlation among individuals over preferences. However, all these results assumed in some form that the individuals have no correlation in preferences or, as in the case of KPR, that the proportion of individuals who have correlation goes to zero as $n$ grows large.

I have been able to allow for correlation in the preferences of all the market participants. I am able to extend the above results for a particular kind of correlation. The structure of this paper is as follows. In the next section, I discuss the existing set of results from IM and KP in more detail. In section 2, I provide an overview of the literature on correlation in matching markets and then focus on the specific type of correlation that I am going to use for the rest of the discussion. Section 3 contains the main theorem and its proof is presented in section 4. The most important discussion takes place in section 5. First, I provide details about allowing a greater size of the rank ordered lists for market participants as correlation increases. Second, the lack of informativeness of the results presented, here and earlier, is shown through numerical simulations of the bound. Section 6 concludes, highlighting the shortcomings of the existing results and motivating new directions on this problem.

1 Existing results

IM generalize the RP conjecture in two directions. Firstly, they allow the men's preferences to be drawn from an arbitrary distribution and not necessarily a uniform distribution. They assume that the preference lists of men are restricted to size $k$ and that at each step a random woman is drawn from a distribution $D$ until someone who
does not already exist on his preference list is found. Thus, the preference list for all $n$ men is generated by drawing from $D^k$. Secondly, they do not impose any restriction on the preferences of women and assume them to be completely arbitrary. However, they do maintain the assumption that any man is acceptable to any woman, i.e. the women have complete preferences over the set of men. They use $c_k(n)$ to denote the expected number of women who have more than one stable husband. Their main result, stated in Theorem 3.1, claims the following.

Theorem Consider a situation where each woman has an arbitrary and complete preference list, and each man has a preference list chosen independently at random according to $D^k$. Then, for all fixed $k$,

$$\lim_{n \to \infty} \frac{c_k(n)}{n} = 0$$

KP generalize the IM result for many-to-one matching and find a similar result for the expected proportion of colleges that can manipulate the student-optimal stable match when others are truthful. KPR extend the result to the case where there are couples in the market. Couples indicate real scenarios where there is correlation in preferences among doctors. However, their results hold only as long as the proportion of couples vanishes as $n$ grows large.

2 Correlation in preferences

In the matching literature, correlation in preferences among candidates has been included in a few ways. To list a few, KP have a finite number of regions of preferences and each doctor belongs to one and only one region. The preference list for a doctor is generated by randomly generating a list of hospitals from the distribution of their region. Thus, they are able to replicate region specific preferences as one might expect in the real world. However, they do not prove or disprove the existence of a core convergence result for this type of preferences. A related work on preference correlation is Coles, Kushnir and Niederle (2010), where they have what they call block preferences. Doctors agree about the relative ranking of two hospitals from different blocks but they might have individual tastes about their ranking of hospitals within a block. Another form of correlation appears in Coles (2009), where there is a finite proportion of women who have exactly identical preferences to a given woman.

I have included correlation in preferences in a different way than any of the existing mechanisms. The motivation for this form can best be described through a picture.
On the left extreme we have the IM preferences for men and the restriction of the rank order list (ROL) being of size $k$.¹ On the right extreme we have the situation of perfect correlation in preferences across all men. In fact, it just means that all the men simply agree on the most eligible bachelorette, the second most eligible, and so on. To have the maximum number of matches and truth telling, the size of the rank order list should be equal to the number of women $n$. This will ensure that everybody just lists the ranking and then the women choose, in the order of their ranking, their favorite bachelor. Due to IM, we know that the core convergence result holds in the limit of $n \to \infty$ for the left extreme situation. We also know that the result holds with certainty for the situation on the right, where there is a unique stable match.

¹ I am denoting $|ROL|$ as the maximum permissible size of the rank order list.

Let's consider preferences where we have some degree of agreement and some degree of randomness in men's preferences. This can be modeled as $(1-\lambda)D^k + \lambda R^k$ preferences, defined as follows.

Definition Each position on a man's preference list is filled with probability $1-\lambda$ by drawing a woman from the distribution $D$ until a woman who does not already exist on the preference list is drawn, and with probability $\lambda$ by picking the highest ranked woman on the ranked list $R$ who does not already exist on the list.

The parameter $\lambda$ can range from 0, where it would reflect IM preferences, to 1, where it would be the case of perfect agreement in preferences. We are interested in the case $\lambda \in (0,1)$. An example would clarify this further. Let's say that we are simulating the preference list for Xavier and he is looking at Amy, Beatrice, Cindy, and Dorothy. Let's further assume that in this case $k = 3$ and $\lambda = 1/2$. Say we have $R$ as A, B, C, D ranked 1st, 2nd, 3rd, and 4th respectively. Let's assume that $D$ is as shown in the table below.

Woman   A     B     C     D
Prob    40%   30%   20%   10%

As $\lambda = 1/2$, we can decide whether we go to the ranked list or the distribution based on a coin toss. Say with H we generate from $R$ and with T we go to $D$. If we get a T, then a random number is generated to enable us to draw someone from the distribution $D$. Suppose the outcomes of the coin tosses and the subsequent random number generated, if T, are as shown in the table below.

Coin flip   H     T       H
Prob              0.811

This would create the preference list of size 3 for X as A, C, B: the first H picks A from $R$, the draw 0.811 falls in C's interval (0.7, 0.9] of the cumulative distribution of $D$, and the second H picks B, the highest ranked woman on $R$ not yet listed.

This preference generation process has one obvious shortcoming. The correlation in preferences would never allow a woman ranked lower than $k$ in $R$ to be generated at a given spot for any man when the spot is going to be filled from $R$. Thus, this correlation in preferences has the unique characteristic that it allows only the top $k$ women to make it to the preference list through the channel of correlation, i.e. the ranked list $R$.

3 Main Result

Let's suppose there are $n$ men and $n$ women in the matching market we are considering. Then my main result is the following theorem.

Theorem Suppose each woman has an arbitrary and complete preference list, and each man has a preference list chosen independently at random according to $(1-\lambda)D^k + \lambda R^k$, with the restriction that the preference list is of maximum size $k$. Then, for all $\lambda$ and $k$, we have

$$\lim_{n \to \infty} \frac{c_k(n)}{n} = 0$$
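The preference generation process defined in Section 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's implementation: the function names, the `lam` name for the correlation parameter, and the optional scripted coin flips and uniform draws (used only to replay the Xavier example) are all hypothetical.

```python
import random
from typing import Dict, Iterable, List, Optional


def draw_from_distribution(dist: Dict[str, float], u: float) -> str:
    """Inverse-CDF draw: first woman whose cumulative probability exceeds u."""
    cumulative = 0.0
    woman = None
    for woman, prob in dist.items():
        cumulative += prob
        if u < cumulative:
            return woman
    return woman  # guard against floating-point round-off when u is close to 1


def generate_preference_list(
    dist: Dict[str, float],
    ranked: List[str],
    k: int,
    lam: float,
    rng: Optional[random.Random] = None,
    coin_flips: Optional[Iterable[bool]] = None,  # True = fill the spot from R
    uniforms: Optional[Iterable[float]] = None,   # scripted draws for D
) -> List[str]:
    """Fill up to k spots: with probability lam from R, otherwise from D."""
    rng = rng or random.Random()
    flips = iter(coin_flips) if coin_flips is not None else None
    draws = iter(uniforms) if uniforms is not None else None
    prefs: List[str] = []
    while len(prefs) < min(k, len(ranked)):
        use_ranked = next(flips) if flips is not None else (rng.random() < lam)
        if use_ranked:
            # Correlation channel: highest ranked woman on R not yet listed.
            woman = next(w for w in ranked if w not in prefs)
        else:
            # Idiosyncratic channel: redraw from D until a new woman appears.
            # Assumes every unlisted woman has positive probability under D.
            while True:
                u = next(draws) if draws is not None else rng.random()
                woman = draw_from_distribution(dist, u)
                if woman not in prefs:
                    break
        prefs.append(woman)
    return prefs


# Replaying the example: coin flips H, T, H with the random number 0.811 on T.
dist_D = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}
ranked_R = ["A", "B", "C", "D"]
xavier = generate_preference_list(
    dist_D, ranked_R, k=3, lam=0.5,
    coin_flips=[True, False, True], uniforms=[0.811],
)
print(xavier)  # ['A', 'C', 'B']
```

Setting `lam=0` in this sketch reduces to drawing every spot from the distribution, and `lam=1` returns the top `k` entries of the ranked list, which mirrors the two extremes described in the text.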
Thus, the expected proportion of women who have more than one stable husband, $c_k(n)/n$, is bounded by a quantity that approaches 0 as $n$ grows large. Furthermore, for a fixed $n$ and a given bound on the proportion who could manipulate their preferences, $k$ can increase with $\lambda$ such that $k(1-\lambda)$ stays constant. The main contribution of this theorem is twofold. First, I extend the core convergence result in the limit of $n \to \infty$ to the case where there is correlation in preferences for all the participants on one side of the matching market. The result holds for any level of correlation characterized by the value of $\lambda$. I maintain the assumption of arbitrary preferences on the other side of the market.² Second, this result generalizes and provides some theoretical motivation as to why we could and should allow for larger rank ordered list sizes, i.e. $k$, when the correlation in preferences is expected to be higher.

² This assumption has the benefit of being extremely general in its application, but at the same time we cannot get a really tight upper bound for the proportion of women who could manipulate, given that we are not making any assumptions about their preferences. I will come back to this in the next section on simulations.

4 Proof

I provide the proof here and the details of the proof of a Lemma used in the Appendix. The proof closely follows the chassis laid out in IM. Let's assume that the women are numbered in decreasing order of popularity, i.e. probability of being drawn from $D$.

The expected number of women who could possibly manipulate their preferences can be found by taking expectations, over all possible preference lists for all the men, of the number of women having incentives to manipulate for a given preference list. Due to linearity of expectation, this quantity is exactly equal to the sum, over all women, of the probability that a given woman has a profitable deviation by misrepresenting her preferences. We know that a woman has an incentive to misrepresent her preferences if and only if she has more than one stable husband across all possible stable matchings. Say $Mult_g$ is the event that a given woman $g$ has more than one stable husband.

$$c_k(n) = \sum_{g} Prob(Mult_g)$$

To investigate these probabilities, IM use the stochastic algorithm motivated by Knuth, Motwani and Pittel (1990). I have included some details about it in the appendix. I follow the exact same approach and make the same argument that the probability of $Mult_g$ is bounded above by the probability that the woman $g$ under consideration gets a proposal in the first place in a step above. Let's call the latter event $Prop_g$.

It is important to note that even if woman $g$ receives a proposal, it is not necessarily from one of her stable matches or, in other words, from a man she prefers to her earlier match, whom she divorced when she initiated the rejection chain.

Next, we look at the number of single women more popular than $g$ and let there be $X(g)$ such women. At this step, we make an assumption and focus only on women who are not in the top $k$ of the ranked list $R$. As we noted earlier, such a woman can never get on the preference list for a given man
through the channel of correlation. Thus, the probability that she can get a spot on the preference list is limited to her probability of being drawn from the distribution, say $p_g$, given that the spot is going to be filled from the distribution $D$. Thus the unconditional probability of $Prop_g$ is $(1-\lambda)p_g$. Clearly, this is bounded by $\frac{1-\lambda}{X(g)+1}$.

If $Y(g)$ denotes the number of women more popular than $g$ but not listed on any of the men's preference lists, then essentially they would be single. Clearly we have $X(g) \ge Y(g)$. Using the next Lemma we can arrive at probability bounds for all $g \ge 4k$.

Lemma $\forall g > 4k$, we have

$$E\left[\frac{1-\lambda}{Y(g)+1}\right] \le \frac{12(1-\lambda)\,e^{8nk(1-\lambda)/g}}{g}$$

Although the above lemma is true for $g \ge 4k$, we will use it only for $g \ge g^* = \frac{16nk(1-\lambda)}{\ln(n)}$. For a given value of $k$ and $\lambda$, we can always find an $n$ large enough that the former is implied by the latter.³ Even if we completely ignore the incentives of the top $g^*$ women by saying that at most their probability of manipulating their preferences is 1, we get our final result.

³ As $\lambda$ increases and gets closer to 1, we would need a higher value of $n$ to ensure this holds. However, this is really not a concern for realistic values of $n$. For $n = 100$ the above holds $\forall \lambda < 0.9884$.

$$c_k(n) \le \frac{16nk(1-\lambda)}{\ln(n)} + \sum_{g=g^*}^{n} \frac{12(1-\lambda)\,e^{8nk(1-\lambda)/g}}{g}$$

$$c_k(n) \le \frac{16nk(1-\lambda)}{\ln(n)} + \sum_{g=g^*}^{n} \frac{3\ln(n)}{4nk}\,e^{\ln(n)/2}$$

$$\frac{c_k(n)}{n} \le \frac{16k(1-\lambda)}{\ln(n)} + \frac{3\ln(n)}{4k(1-\lambda)\sqrt{n}} = o(1)$$

Both the terms on the right hand side of the last inequality approach 0 as $n \to \infty$. QED.

5 Economic implications of the results

The old role of fixed $k$ in the IM result is taken by $k(1-\lambda)$ and, not surprisingly, for $\lambda = 0$ it boils down to the IM result. An important implication is that for a given bound on the proportion of women who we can allow to have more than one stable husband, the maximum permissible size of the rank ordered list $k$ can increase if $\lambda$ increases such that $k(1-\lambda)$ is fixed. Hence, we can and we should allow for greater sizes of rank ordered lists, from the perspective of the highest number of matches and truthful revelation by men, if the correlation is expected to be higher. The following table illustrates the growth in the permissible value of $k$ as $\lambda$ increases.

λ   0    0.25   0.5   0.75   0.9
k   10   13     20    40     100

The bound provided at the end of last section
consists of two parts. The first part covers the top $g^*$ women, for whom we did not provide any reasonable bound on the probability. The second part covers all the other women, ranked lower than $g^*$, and most of the results that were proved have been used only for assessing the incentives of these women. The second part of the bound is encouraging and approaches zero roughly at the rate of $1/\sqrt{n}$ (up to a $\ln(n)$ factor), which provides reasonable values for the values of $n$ we see in real life. This can be seen from the next figure.

The second part dies to 0 quickly as we increase $n$ from 100, where it takes the value 3.5%, to 10,000, where it takes the value 0.7%. However, the first part of the bound is very discouraging as it approaches zero at the rate of $1/\ln(n)$. The plot of the numerical value of the first part for some really large values of $n$ shows that it only tells us that at most 100%, i.e. all, of the women have incentives to manipulate their preferences in all real world scenarios.

On the extreme left we have n = 10688 and the value of the first component of the bound is about 800%. The bound becomes less than 100% only for extremely large values of $n$. This reveals the discouraging nature of our results so far. Although the result proved holds in the limit of $n \to \infty$, and this could be done without any assumptions on the women's preferences, we clearly see that this has compromised the tightness and informativeness of the bound.

My results are a generalization of the IM results and follow a similar structure of proof. Furthermore, the KP result uses similar arguments and gets a bound which looks similar to this one. Their first term is the proportion of popular colleges and it also has the exact same first component. Thus, the criticism about the lack of tightness and informativeness carries forward to the existing set of results of IM and KP. The core convergence results proved so far in the limit, without correlation in IM or KP, or with correlation in KPR or in this paper, just give us a directional sense of the reasons why we might see an extremely small proportion of colleges or women who can misrepresent their preferences, which was the original RP conjecture. Nonetheless, all these results do not tell us what the mechanism is in the real life scenarios we stumble upon (with $n$ being on the order of 1000 or at best 100,000).
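The numbers quoted in this section can be reproduced with a short calculation. The sketch below evaluates the two parts of the bound, the permissible list size, and the threshold obtained by requiring $g^* \ge 4k$. It is only an illustration: the function names and the `lam` name for the correlation parameter are hypothetical, and the choice $k(1 - \text{lam}) = 10$ is an assumption inferred from the first column of the table of permissible $k$, not a value stated for the figures.

```python
import math


def first_part(n: int, k: float, lam: float) -> float:
    """Share of top women left unbounded: 16 k (1 - lam) / ln(n)."""
    return 16 * k * (1 - lam) / math.log(n)


def second_part(n: int, k: float, lam: float) -> float:
    """Bound for the remaining women: 3 ln(n) / (4 k (1 - lam) sqrt(n))."""
    return 3 * math.log(n) / (4 * k * (1 - lam) * math.sqrt(n))


def permissible_k(k0: float, lam: float) -> float:
    """List size k that keeps k (1 - lam) equal to k0."""
    return k0 / (1 - lam)


def max_lambda(n: int) -> float:
    """Largest lam with 16 n k (1 - lam) / ln(n) >= 4k, i.e. 1 - ln(n) / (4n)."""
    return 1 - math.log(n) / (4 * n)


# Assumed setting: k (1 - lam) = 10, i.e. k = 10 with lam = 0.
for n in (100, 1_000, 10_000, 100_000):
    print(f"n={n:>7}  first part={first_part(n, 10, 0.0):8.2%}  "
          f"second part={second_part(n, 10, 0.0):6.2%}")

for lam in (0.0, 0.25, 0.5, 0.75, 0.9):
    print(f"lam={lam:<5} permissible k={permissible_k(10, lam):.1f}")

print(f"n=100: lemma applies for lam up to {max_lambda(100):.4f}")
```

Under this assumption the second part comes out at roughly 3.5% for $n = 100$ and 0.7% for $n = 10{,}000$, in line with the values quoted above, while the first part stays far above 100% for every market size in the loop.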
Thus, we are still far from resolving the puzzle of the RP conjecture and why truthful revelation of preferences is (at least an approximate) equilibrium, if it is, when others are truthful, for the matching markets we see in real life.

It is important to realize that Lemma 1 holds for all $g \ge 4k$ but we have leveraged it only for $g \ge g^* = \frac{16nk(1-\lambda)}{\ln(n)}$. The mathematical craftsmanship, initially discovered in IM, used to arrive at the results at the given level of generality forces us to take this leap in the number of top women/colleges. This is one area where the bound can be improved upon.

6 Conclusion

In this paper, I studied correlation in preferences in a one-to-one matching market and showed that for a particular type of correlation, i.e. $(1-\lambda)D^k + \lambda R^k$ preferences, the same core convergence result holds as the size of the matching market increases. An important outcome of this result was that the size of the rank ordered lists being submitted can (and should) be allowed to increase if we have more correlation in preferences. Although I have proved the result only for the one-to-one matching market, the results could be extended to the many-to-one matching market.

Lastly, the limit results that have been found here and in other earlier works reveal our gap in understanding real world markets of the sizes we actually see. The results give meaningful bounds only for extremely large values of $n$, which we never see. There is a clear direction to extend our understanding of this extremely important problem, which has puzzled us for over a decade. If we can provide a better bound for the top few colleges or women, we would get to results that could be informative. Furthermore, I have only included a form of correlation in preferences which ensures that only the top $k$ women stand a chance to benefit from the introduction of correlation. It would be an interesting extension to allow correlation to affect all or at least a nontrivial fraction of the women.

Appendix

Stochastic DAA The essence of the stochastic DA algorithm is that a given woman divorces her current mate in the man-optimal stable match and initiates a rejection chain by forcing her current mate to propose to his next best alternative. He continues proposing until his proposal is accepted. The current mate of the woman who just accepted a better proposal now starts his proposals, and this continues. We know from the lattice theorems given in Roth-Sotomayor (1990) that if there is another stable match then this rejection chain necessarily ends on the woman who initiated it.

Proof of the Lemma Let $Q = \sum_{j=1}^{k} p_j$. Let $w$ be a woman ranked better than $g$. Let $l_w(m)$ be the event that a man $m$ does not list $w$ at a given step, given that he has listed $w_1, w_2, \ldots, w_{i-1}$ as his first $i-1$ women. Its probability is given by
$$Prob(l_w(m)) = 1 - \frac{(1-\lambda)\,p_w}{1 - \sum_{j=1}^{i-1} p_{w_j}} \ge 1 - \frac{(1-\lambda)\,p_w}{1-Q}$$

There are $w-k$ women more popular than $w$, so clearly $p_w \le \frac{1-Q}{w-k}$. For $E_w \left(= \bigcap_{m=1}^{n} l_w(m)\right)$ being the event that no man has the given woman $w$ listed on his preference list of size $k$, we have

$$Prob(E_w) \ge \left(1 - \frac{(1-\lambda)\,p_w}{1-Q}\right)^{nk} \ge \left(1 - \frac{1-\lambda}{w-k}\right)^{nk}$$

We know that $1 - x - e^{-2x} \ge 0$ for $x \in [0, 1/2]$ and $w - k \ge \frac{w}{2}$ if $w > 2k$. Hence we have

$$Prob(E_w) \ge e^{-2nk(1-\lambda)/(w-k)} \ge e^{-4nk(1-\lambda)/w}$$

Furthermore, the expectation of $Y_g$ is the sum of the probabilities of each woman $w$ ranked higher than $g$ not being listed. Hence for $g \ge 4k$ we have

$$E[Y_g] = \sum_{w=1}^{g} Prob(E_w) \ge \sum_{w=2k}^{g} e^{-4nk(1-\lambda)/w} \ge \sum_{w=g/2}^{g} e^{-4nk(1-\lambda)/w} \ge \frac{g}{2}\,e^{-8nk(1-\lambda)/g}$$

Thus, we have

$$E[Y_g] \ge \frac{g}{2}\,e^{-8nk(1-\lambda)/g} \qquad (1)$$

For the variance of $Y(g)$, we can use IM Lemma 4.4 directly without any modification, as this depends on $Y_g$ being the count of negatively correlated events.⁴

$$\sigma^2(Y_g) \le E[Y_g]$$

⁴ $Prob(E_i \wedge E_j) \le Prob(E_i)\,Prob(E_j)$ and ${}^{n}C_1 + 2\,{}^{n}C_2 = n^2$.

Using the Chebyshev inequality and the same arguments as in the proof of Lemma 4.1 in IM, we have

$$E\left[\frac{1}{Y_g+1}\right] \le \frac{6}{E[Y_g]} \qquad (2)$$

Combining inequalities 1 and 2 we have the result in Lemma 1. QED.

References

[1] Coles, Peter, "Optimal Truncation in Matching Markets," Mimeo, 2009.
[2] Coles, Peter, Alexey Kushnir and Muriel Niederle, "Preference Signalling in Matching Markets," NBER Working Paper, 2010.
[3] Gale, David and Lloyd S. Shapley, "College Admissions and the Stability of Marriage," American Mathematical Monthly, 1962, 69, 9-15.
[4] Immorlica, Nicole and Mohammad Mahdian, "Marriage, Honesty and Stability," SODA, 2005, 53-62.
[5] Knuth, Donald E., Rajeev Motwani and Boris Pittel, "Stable Husbands," Random Structures and Algorithms, 1990, 1, 1-14.
[6] Kojima, Fuhito and Parag Pathak, "Incentives and Stability in Large Two-Sided Matching Markets," American Economic Review, 2009, 99, 608-627.
[7] Kojima, Fuhito, Parag Pathak and Alvin E. Roth, "Matching with Couples: Stability and Incentives in Large Markets," NBER Working Paper, 2010.
[8] Roth, Alvin E. and Elliot Peranson, "The Redesign of the Matching Market for American Physicians: Some Engineering Aspects of Economic Design," American Economic Review, 1999, 89, 748-780.
[9] Roth, Alvin E. and Marilda A. O. Sotomayor, Two-sided Matching: a study in Game-theoretic Modeling and Analysis, Cambridge: Econometric Society monographs, 1990.