Marriage and Online Mate-Search Services: Evidence From South Korea

Soohyung Lee1
Department of Economics and MPRC
University of Maryland, College Park
LeeS@econ.umd.edu
First Version: November, 2007
This Version: October, 2009
Abstract
This paper examines the implications of online mate search for marriage, using data
from a Korean online matchmaking company. Using the estimated marital preferences,
I find that customized recommendations from an online matchmaker and individuals’
own online search generate similar marital sorting, though customized recommendations
result in dates more often than individuals’ own online search. When compared to traditional offline search, online search generates different marital sorting and may account
for changes in marital sorting observed in Korea since 1991. Finally, the estimated
preferences were recently used by the company to change its recommendation system,
dramatically improving its success rate.
Keywords: Marriage, Online Search, Internet, Assortative Matching, Market Designs
JEL Classification Numbers: D02; J12; C15
1
This was the main chapter of my PhD thesis, previously called “Preferences and Choice Constraints
in Marital Sorting: Evidence from Korea”. I thank Peter Klenow, Luigi Pistaferri, John Pencavel and
Michèle Tertilt for their advice and support throughout this project. I have benefited from discussions
with seminar participants at Stanford, University of Minnesota, Cornell, Penn State, University of
Maryland, Harvard Business School, UIUC, MIT, Rand, SMU, NUS, Collegio Carlo Alberto, Bocconi,
Tokyo, Korea and KERI. I thank Ran Abramitzky, Mark Duggan, John Hatfield, John Ham, Han
Hong, Ali Hortaçsu, Jakub Kastl, Yuan Chuan Lien, Ben Malin, Sri Nagavarapu, Muriel Niederle,
Minjung Park, Alex Ponce-Rodriguez, Felix Reichling, Azeem Shaikh, and Joanne Yoong for detailed
comments; Ken Judd, Hyunok Lee and Zsolt Sándor for sharing their computational expertise; and
the B.F. Haley and E.S. Shaw Fellowship and Stanford Graduate Research Opportunity Fellowship for
financial support. I am indebted to Woong-Jin Lee, Heui-Gil Lee, Kang-Yong Ahn, and Hye-Rim Kim
for sharing the data.
1 Introduction
Rapid adoption of the Internet has influenced many aspects of people’s behavior. The
search for a mate is no exception. In many countries, people can use an online platform
to post and respond to notes to a potential dating partner (e.g., Yahoo Personals and
Match.com). Alternatively, they can use an online matchmaking service that suggests
a potential dating partner based on their personal characteristics (e.g., eHarmony and
Chemistry.com). Use of online search can change the size and composition of people’s
choice sets of potential mates. In addition, by receiving an online matchmaker’s customized recommendations for potential mates, people may date and marry certain types
of individuals more often than otherwise. Therefore, use of online mate-search services
can affect people’s decisions for marriage and result in marital sorting different from
that generated by traditional offline mate-search processes.
The goal of this paper is to examine the implications of the increasingly wide use of
online mate-search services for marriage by addressing the following two questions: Do
customized recommendations from an online matchmaker lead individuals to different
decisions about dating and marriage, as compared to their own online search? How has
the increasingly wide use of online mate-search services impacted marital sorting in the
population?
To address these questions, I study the South Korean marriage market, which is
a useful setting because of the early adoption of online mate-search services and their
widespread use. In Korea, the services emerged in the late 1980s and in 2005 eight
percent of newlyweds met their spouse through these services (Korea Marriage Culture
Institute, 2005).2 I use Korean vital records, combined with an unusually rich dataset
from a major Korean online matchmaking company. The dataset provides detailed
information on over 20,000 users, 13.4 percent of whom have gotten married through
the service. It includes information about whom each user dated and married. Moreover,
it has information about proposed dates that were turned down, which is rarely available
in datasets typically used in the literature. Users’ characteristics are mostly verified by
legal documents. The dataset from the company includes a wide spectrum of the Korean
population in terms of age, education, geographic location, and many other dimensions.
2
According to Madden and Lenhard (2006), three percent of the sample of U.S. Internet users met
their spouse through the Internet, including online mate-search services, and one percent met on a blind
date or through a dating service. In terms of service providers, in January 2006, the two most popular
companies in the U.S. were Yahoo Personals and Match.com, which were established in 1997 and 1995,
respectively. eHarmony, which provides online matchmaking services and was established in 2000, was
ranked 7th in the same survey.
The company allows users to find a dating partner from the opposite sex in two ways.
The user can directly browse other users’ profiles on the company’s online website and
request a first date, or the company can suggest a first date with another user who has
no ongoing relationship. I will use the term proposal to refer to an event in which two
users consider going on a date (or marriage) with each other and partner to refer to the
person who is asked out by a user or suggested to another user by the company.
I estimate users’ preferences for their spousal characteristics by analyzing the proposals initiated by the company, which constitute 87 percent of all proposals. The inference
of users’ preferences is possible because the company suggests a wide variety of partners
in terms of observable characteristics.3 To infer users’ preferences, I develop a model in
which an individual can have multiple dates with a partner to make a marriage decision.
Within my model, multiple dates result from a desire to learn more about one’s partner. Following Hitsch et al. (forthcoming), I assume that the marriage utility function
may depend on sex and the similarity between a husband’s and wife’s characteristics.
I estimate the model using a Laplace-type estimator suggested by Chernozhukov and
Hong (2003). The estimation results suggest that for income and physical attractiveness,
both men and women prefer someone who possesses these characteristics in abundance,
regardless of their own traits. However, they prefer marrying a person who is similar to
themselves in terms of age, height, religion, geographical location, and the industry in
which one works. For educational attainment and father’s educational attainment, men
prefer women who are similar to them, whereas women prefer men with high educational
attainment.
Next, I examine individuals’ decisions for dates and marriage when the online service
recommends a potential mate, as compared to their own search via the company’s online
website. Using the estimated preferences, I compute the probability that a proposal
initiated by a user would move to the next stage of the relationship if the proposal
were made by the company and compare it with the actual outcomes. I find that the
probability of a user accepting a first date with another user is significantly higher
if the company introduces the two to each other, as compared to the case where the
potential mate directly contacts the individual. However, conditional on having a first
date, the probability of a proposal turning into a second date or marriage remains similar
regardless of who initiates the proposal. In terms of marital sorting, I find that the
sorting patterns among users whose spouse is suggested by the company are similar to
those among users who directly contacted (or were contacted by) their spouse. These results
imply that recommendations of an online matchmaker can reduce search cost by raising
a user’s acceptance rate but do not change marital sorting, as compared to individuals’
own online search.
3
Suppose that the company initiates a proposal only if a man and a woman have the same characteristics. Since there is no variation in terms of partners’ traits, users’ responses to being asked on
a date are explained by an unobservable random shock, not by partners’ traits. Therefore, we cannot
quantify the extent to which a person values a partner’s observable trait.
Although online matchmaking services generate marital sorting similar to individuals’ own online search, it is still possible that online mate search may generate sorting
patterns different from traditional offline mate-search processes, thus changing marital
sorting in the population. According to the census of newlyweds in Korea, marital sorting
between 1991 and 2005 changed in the following ways: the probability of an individual
marrying a spouse whose trait is the same as his/her own has decreased for hometown;
increased for marital history (never-married vs. not); and remained similar for educational attainment. I undertake two exercises to examine the possibility that wider use
of online mate-search services changed marital sorting in Korea. In the first, I weight
the users of the online matchmaking service to replicate the characteristics of the average individual (for each sex) in the census of newlyweds. I then compute how likely
this average individual is to marry someone with his/her same traits. I find that if
an individual uses the online matchmaking company, he/she is less likely to marry a
spouse with the same trait in terms of hometown but more likely to marry a spouse
with the same marital history. The prediction for the probability of marrying a spouse
with the same educational attainment is ambiguous. Therefore, the wider adoption of
online mate-search services could explain the patterns observed in the census between
1991 and 2005.
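The reweighting in the first exercise can be illustrated with a minimal post-stratification sketch. It is not the estimation code behind the results above. The cell definition (education only), the toy records, the population shares, and the function names `cell_weights` and `weighted_share` are all hypothetical and chosen only to show the mechanics: each user is weighted by the ratio of the population share to the sample share of his or her cell, and the probability of marrying a same-trait spouse is then computed as a weighted average.

```python
from collections import Counter

def cell_weights(sample_cells, population_shares):
    """One weight per record: population share of its cell / sample share of its cell."""
    n = len(sample_cells)
    counts = Counter(sample_cells)
    return [population_shares[c] / (counts[c] / n) for c in sample_cells]

def weighted_share(flags, weights):
    """Weighted fraction of records whose flag is True."""
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)

# Hypothetical toy data: the cell of each married user and a flag for
# whether the spouse has the same hometown as the user.
cells = ["college", "college", "college", "high_school"]
same_hometown = [False, False, True, True]
# Hypothetical population shares of the same cells.
pop_shares = {"college": 0.5, "high_school": 0.5}

weights = cell_weights(cells, pop_shares)
unweighted = sum(same_hometown) / len(same_hometown)
reweighted = weighted_share(same_hometown, weights)
print(unweighted, reweighted)
```

In this toy example the under-represented cell receives a weight above one, so the reweighted share moves toward the outcome observed in that cell.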
In the second exercise, I use the estimated preferences to compute the male-optimal
stable matching with the Gale-Shapley algorithm (1962) and calculate the probability
of the average individual marrying a spouse with the same traits. I use the male-optimal stable matching from the Gale-Shapley algorithm because I find it generates
sorting comparable to the sorting among users of the online matchmaking service who
ultimately marry. This exercise allows me to simulate marriages where the distribution
of traits for both men and women in the matchmaking company is representative of
the population, whereas, in the first exercise, the distribution of only one sex’s traits
is representative. The results are qualitatively similar to the findings from the first
exercise, providing further evidence that the wider use of online mate-search services
may account for changes in marital sorting in Korea from 1991 to 2005.
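For readers unfamiliar with the Gale-Shapley algorithm, the following is a minimal sketch of deferred acceptance with men proposing, which yields the male-optimal stable matching. The preference lists here are hypothetical and typed in by hand; in the exercise described above they would instead be derived from the estimated preferences. The function name `male_optimal_matching` is likewise only illustrative.

```python
def male_optimal_matching(men_prefs, women_prefs):
    """Deferred acceptance with men proposing (Gale and Shapley, 1962).

    men_prefs[m]   : list of acceptable women, most preferred first
    women_prefs[w] : list of acceptable men, most preferred first
    Returns a dict mapping each matched woman to her partner.
    """
    # rank[w][m] is the position of m in w's list (lower is better).
    rank = {w: {m: i for i, m in enumerate(p)} for w, p in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}  # index of the next woman m proposes to
    engaged = {}                             # woman -> man currently held
    free = list(men_prefs)
    while free:
        m = free.pop()
        if next_choice[m] >= len(men_prefs[m]):
            continue                         # m has exhausted his list and stays single
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if m not in rank[w]:
            free.append(m)                   # w finds m unacceptable
        elif w not in engaged:
            engaged[w] = m                   # w holds her first acceptable proposal
        elif rank[w][m] < rank[w][engaged[w]]:
            free.append(engaged[w])          # w trades up; her previous partner is free
            engaged[w] = m
        else:
            free.append(m)                   # w rejects m

    return engaged

# Hypothetical preferences with two stable matchings; men proposing selects
# the one in which each man obtains his first choice.
men_prefs = {"m1": ["w1", "w2"], "m2": ["w2", "w1"]}
women_prefs = {"w1": ["m2", "m1"], "w2": ["m1", "m2"]}
matching = male_optimal_matching(men_prefs, women_prefs)
print(matching)
```

Given a matching, the probability of marrying a same-trait spouse can be computed by comparing the traits of each matched pair.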
This paper is closely related to three strands of research. The first is the literature
estimating marital preferences (e.g., Abramitzky et al., 2009; Angrist, 2002; Banerjee
et al., 2009; Bisin et al., 2004; Choo and Siow, 2006; Fernandez et al., 2005; Fisman
et al., 2006, 2008; Hitsch et al., 2006, forthcoming; Kurzban and Weeden, 2005; and
Wong, 2003). Among studies in this strand of the literature, the overall analytic
framework of this paper is most closely related to Hitsch et al. (forthcoming), who
estimate people’s preferences based on first date outcomes in a U.S. online platform
and predict marital sorting if people use the online platform using the Gale-Shapley
algorithm. This paper builds upon their original contribution and other studies in this
literature in three important ways. First, to recover marital preferences, my analysis
uses both dating and marriage decisions as well as user characteristics that have, to
a large extent, been verified by third-parties.4 Second, to the best of my knowledge,
this paper is the first to compare matching outcomes initiated by individuals with those
initiated by an online intermediary (i.e., matchmaker). Third, this paper uses the wider
adoption of online mate-search services to understand the time trend in marital sorting
in a country.
A second related literature studies online search and online labor market intermediaries (e.g., Autor 2001, 2008; Kuhn and Skuterud, 2004; Bagues and Labini, 2008). This
paper adds the marriage market to the list of search markets that have been affected by
online search. My finding that the wider use of online mate-search services may account
for the decline in marital sorting by geographical location in Korea is consistent with
that in Bagues and Labini (2008). They find that the introduction of online job search
in Italy increased workers’ geographical mobility.
4
My findings on marital preferences generally confirm the findings in the literature. For example,
my results are consistent with findings of Fisman et al. (2006) and Hitsch et al. (forthcoming) that
men value appearance more than women do. Banerjee et al. (2009) find preferences for similar social
background (i.e., caste), which is similar to my finding that men prefer women with similar family
backgrounds.
Third, this paper is related to studies of market design (e.g., Niederle and Roth,
2003, 2008; Niederle and Yariv, 2008). They highlight the possibility that a well-designed
centralized matching system can improve the welfare of market participants as compared
to a decentralized search. For example, Niederle and Roth (2003) find that
geographical mobility among U.S. gastroenterologists increased under the centralized
system. Their finding is consistent with mine in the sense that an online matchmaking
company that offers a centralized matching system generates sorting patterns different
from those generated under a decentralized traditional dating environment. Moreover,
the matchmaking company used my estimated marital preferences to update its matching
algorithm for generating a proposal, and increased the probability of a proposal turning
into an actual date by a factor of 2. This fact suggests that insights from the market
design literature can be beneficially applied to a wider range of economic environments,
such as marriage markets, in addition to the school-choice and kidney-exchange problems
that have been extensively studied in the literature.
A brief overview of the remainder of this paper is as follows. Section 2 describes the
institutional background and the data. Sections 3 and 4 present an empirical framework for estimation and the results, respectively. Section 5 provides the results of the
counterfactual analysis. I then discuss several potential issues, such as selection bias, in
Section 6. Section 7 concludes.
2 Industry and Data
2.1 Industry
Online matchmaking companies emerged in Korea in the late 1980s and rapidly expanded their market. These matchmaking companies typically provide access to an
Internet database where users can browse one another’s profiles: the companies then
use a computerized algorithm to introduce singles to each other. These users are recruited
through advertisements and pay a fixed advance fee for a pre-specified period,
usually a year.
The use of online matchmaking services is quite common in Korea (see
Table 1). According to the Korea Marriage Culture Institute (KMCI), 7.6 percent of
couples who married in 2005 met through matchmaking companies. The use of online
services is small among young people but still non-negligible. Similar results are found
in another study of young Internet users conducted by a Korean research organization,
Pollever.

Table 1: Route of Finding a Spouse
Survey Conductor                             KMCI                 Pollever
Survey year                                  2005                 2004
Sample                                       305 couples          1,941 unmarried
                                             married in 2005      internet users
Fraction of men                              50                   67.2
Age Groups
- 29 and younger                             29.3                 63.9
- 30-33                                      49.8                 25.9
- 34 and older                               20.9                 10.2
Fraction of survey participants who are
college students, graduates, or beyond*      93.8                 69.7

Route of finding a spouse/dating partner     KMCI, by age groups**          Pollever
                                             all     (1)     (2)     (3)    all
Online matchmaking companies                 7.6     3.7     4.3     28.3   2.5
Internet/Club                                7.9     8.0     10.8    2.2    2.7
Friends, College, or Work Place              61.3    62.5    63.4    52.1   68.6
Family/Relatives/Matchmakers                 12.6    11.7    11.8    17.4   8.0
Others                                       10.6    14.1    9.7     0.0    18.2
* In the 2005 marriage register, the fraction of people with tertiary education was 52.28 percent.
** Definition of age groups: (1) younger than 30, (2) between 30 and 33, and (3) older than 34.
The survey by Pollever does not provide statistics broken down by age group.
Source: see Section 2.1
2.2 Data
The main dataset for this study comes from a major Korean online matchmaking company, which helps its users find a spouse among other users of the opposite sex. I have
detailed information about 20,689 individuals who started using the company’s services
between January 2002 and June 2006, including their individual characteristics, stated
marital preferences, and the history of dating outcomes.
Figure 1: Regions of South Korea
2.2.1 Motivation of Users and Reliability of Information
The annual membership of the company cost 900,000 won in 2007 (approximately 900
US dollars), which is about 3.5 percent of the average annual income in Korea.5 The
fraction of users who have married as a result of using the matchmaking service is 13.4
percent. Because of the high membership cost and significant fraction of users getting
married, I think it is reasonable to assume users are primarily motivated to seek marriage
rather than casual dating. The information users provide about their characteristics is
subject to several checks by the company. As much as possible, key information is legally
verified (e.g., age, education, employment, marital status) or independently evaluated by
the company (e.g., a facial grade). For some characteristics for which the company does
not require a third-party verification (e.g., income and height), the company monitors
the accuracy of the information via user feedback. The company routinely surveys its
users about their experiences and asks them to verify the correctness of other users’
information. The company’s contract specifies that the service will be terminated if a
user is found to provide incorrect information.
Table 2: Users’ Characteristics 1
This table compares characteristics of users in the matchmaking data set (MM) with the official marriage
register (MR).
                                 Matchmaking dataset              MR
Year                             January, 2002 ~ June, 2006       2002~2005
                                 All           Married
Number of individuals            20,689        1,594              2,477,648
Composition (percentage)
  Women                          53.90         50.00              50.00
  Divorced                       10.70         12.57              18.82
  Non-Korean                     0.00          0.00               4.87
Age
  26 and younger                 9.01          5.83               28.79
  27-29                          25.28         24.76              28.08
  30-33                          40.05         43.61              21.84
  34 and older                   25.66         25.8               21.31
Educational attainment
  Middle School or less          0.87          0.09               5.14
  High School                    6.63          8.06               38.27
  College or more                92.50         91.86              56.59
    Technical College            13.65         12.70              -
    University                   61.25         64.83              -
    Master’s and Ph.D.           17.60         14.33              -
Region
  Seoul or Gyeonggi              75.92         77.65              51.44
  Gangwon                        0.55          0.57               2.79
  Chungcheong                    4.44          5.00               9.59
  Jeolla                         3.34          3.46               9.63
  Gyeongsang                     11.39         13.25              25.15
  Jeju and others                4.35          0.06               1.40
Hometown
  Seoul or Gyeonggi              45.12         42.48              27.36
  Gangwon                        3.26          3.79               4.86
  Chungcheong                    10.65         11.76              15.47
  Jeolla                         13.60         14.58              19.32
  Gyeongsang                     25.86         26.11              31.61
  Jeju and others                1.51          1.29               1.38
Table 3: Users’ Characteristics 2
This table compares users of the matchmaking service with the general population. For population data, the top
panel uses the WS (2002-2006) and the bottom panel uses the PT (2004).
Matchmaking
dataset
Year
Distribution across industries (Percentage)
Agriculture, forestry, fishing, Mining
Manufacturing
Public, electric power, gas, water supply
Construction
Wholesales & retail trade,
consumer goods, restaurants & hotels
Transportation, storage, communication
Finance & insurance
Real estate rental & business services
Education services
Health & social welfare
Entertainment, housekeeping, personal service
International & other foreign institution
Others or unemployed
Annual income (10,000 won)
Mean
Mean between 5th and 95th percentiles
Median
Jan. 2002~June, 2006
General
Population
WS(2002-2006)
0.04
20.37
9.23
4.26
4.74
7.92
16.36
6.27
10.54
19.32
9.41
10.19
0.76
20.32
9.55
5.6
2.41
3.12
5.49
5.17
12.69
11.01
3.02
2.2
-
4054.63
3138.08
3137.05
3046.49
N.A
N.A.
Gender-specific Physical Traits
                                   Matchmaking dataset     PT (2004)
Height (feet, inches)
  34 and younger    Men            5’ 9”                   5’ 8”
                    Women          5’ 4”                   5’ 3”
  35 and older      Men            5’ 8”                   [5’ 4”, 5’ 7”]**
                    Women          5’ 4”                   5’ 2”
Weight (lb)
  34 and younger    Men            153.7                   [153.2 , 157.0]
                    Women          111.4                   [116.0 , 120.4]
  35 and older      Men            153.2                   [151.9 , 158.3]
                    Women          112.0                   [123.9 , 131.0]
Body Mass Index*
  34 and younger    Men            22.8                    [22.6 , 24.0]
                    Women          19.0                    [20.3 , 21.7]
  35 and older      Men            23.0                    [24.7 , 25.0]
                    Women          19.4                    [22.8 , 25.1]
* BMI = 703 * weight (pounds) / (height (inches))^2
** [a,b] denotes the case where the corresponding statistic ranges from a to b.
* obs with no income: very initial – men 8.1766%, women 20.03 percent.
2.2.2 Comparison between Users and the General Population
I use four separate nationally representative datasets because no single population-based
dataset captures all the features observed in my data. The closest analog to the matchmaking dataset is the marriage register (MR). The MR, the population of newlyweds
in South Korea in a given year, provides information about husband and wife’s age,
education, residence, hometown, and marital history (never-married vs. not married).
I use the MR as a baseline for drawing comparisons to the general population, and I
supplement the analysis with three other datasets: the Basic Statistics Survey of Wage
Structure (WS) for industries and income, the National Household Income and Expenditure Survey (HIS) for income of husbands and wives, and the Survey of Physical Traits
of Koreans (PT) for height and weight.
I find that, in terms of observable traits, the users of the company represent a wide
spectrum of Koreans. As shown in Tables 2 and 3, the users include all types of Koreans,
in terms of marital status, educational attainment, geographical location, and industry.
However, the users overrepresent people who are older, more educated, and currently
live in, or are originally from, Seoul and its surroundings (i.e., Gyeonggi province). As
discussed earlier, the company does not request legal documents for a user to verify
his/her reported income and physical traits. To gauge the reliability of the information,
I compare the average income and BMI of users with those in the population whose characteristics are the same as users’ in terms of age, gender, and educational attainment (for
income); see Appendix C.1.6 As shown in the middle of Table 3, the average income
among users is over 40 million won (about 40,000 dollars), which is larger than the average annual
income of 30 million won in the population with the same characteristics as users.
However, once users whose reported income is below the 5th percentile or above
the 95th percentile are excluded, the 10 percent trimmed mean income is comparable
to that of the population. The average height and weight of the matchmaking company’s
users are remarkably similar to those in the PT.
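The effect of trimming on the mean can be illustrated with a short sketch. The income values below are hypothetical and are not drawn from the matchmaking dataset. The nearest-rank percentile rule and the function names `percentile` and `mean_between_percentiles` are assumptions made for the example, and the exact percentile convention used for Table 3 may differ.

```python
def percentile(sorted_vals, q):
    """Nearest-rank percentile of an ascending list; q is an integer in 1..100."""
    n = len(sorted_vals)
    k = max(1, (q * n + 99) // 100)  # integer ceiling of q * n / 100
    return sorted_vals[k - 1]

def mean_between_percentiles(values, lo=5, hi=95):
    """Mean of the values lying between the lo-th and hi-th percentiles (inclusive)."""
    s = sorted(values)
    lo_cut, hi_cut = percentile(s, lo), percentile(s, hi)
    kept = [v for v in s if lo_cut <= v <= hi_cut]
    return sum(kept) / len(kept)

# Hypothetical incomes in units of 10,000 won, including one very large report.
incomes = [2000 + 100 * i for i in range(19)] + [40000]
raw_mean = sum(incomes) / len(incomes)
trimmed_mean = mean_between_percentiles(incomes)
print(raw_mean, trimmed_mean)
```

A single extreme report pulls the raw mean well above the trimmed mean, which is why the trimmed statistic is the more informative comparison with the population figure.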
5
In contrast, online dating services in the United States, such as Yahoo Personals and eHarmony,
cost about 160 to 250 dollars for a comparable one-year contract.
6
Appendix was separately submitted.
2.2.3 Stated Marital Preferences
The company surveys users, asking them to rank the three most important traits for
their prospective spouse, as well as any religion or geographic location that they wish to
avoid (see Appendix Table A.1). Male users most often name appearance as their top priority (44.6 percent),
followed by personality (33.7 percent) and occupation and
income (11.0 percent). In contrast, female users choose occupation and income (55.6
percent) most often, followed by personality (26.8 percent), and appearance (5.1 percent). A Kolmogorov-Smirnov test shows that the distribution of female users’ top
priority is statistically different from that of male users. This gender difference in stated
marital preferences is consistent with the findings in Fisman et al. (2006) and Hitsch et
al. (forthcoming), both of whom find that women put greater weight on income while
men respond more to physical attractiveness. Most users are open to all religions and
geographic locations.
2.2.4 Search System and Dating Outcomes
Each user can find a partner for a date in two ways: he/she can search the company’s
database independently or have the company suggest a partner. In the first case, the
user accesses the company’s database via a website. The database contains users’ profiles
with the users’ photograph, education level, names of schools attended, occupation,
geographic location, birth order, and number of siblings. For online security and privacy
reasons, the company does not immediately reveal income, weight, parental marital
status, or parental wealth, but this information can be obtained prior to a first date
by asking a staff member. Having found a suitable profile, the user then can send an
electronic note to propose a first date (a user-initiated proposal). Users cannot initiate
a proposal to other users if they have an ongoing relationship with any user of the
company.
In the second case, the company may introduce two users based on its algorithm (a
company-initiated proposal). First, the company assigns each user a single-dimensional
index (called OSI) based on all of the user’s observable characteristics except geographical location, marital status, religion, and age. The OSI is intended to measure the
extent to which a user should be attractive to the opposite sex as a spouse. The OSI
ranges from 25 to 98, and the higher the OSI gets, the more a user is expected to be
attractive. A weight assigned to each characteristic is based on surveys of the company’s
staff members, who are experienced in assisting users. Note that how the weights are
assigned remains the same throughout the period covered by my dataset. Next, the
company selects a male and a female user whose OSI are, on average, similar to each
other among users who have no ongoing relationship; it then sends an electronic note
to the two users, along with each other’s profile. This means that the company’s algorithm
generates a proposal to two users whose observables are similar to each other,
more often than not. However, there are still large variations in terms of partners’ index
(thus observables) among company-initiated proposals. The large variation among
partners’ observables is important for us to identify users’ preferences for spousal traits.
To see this point, consider a case in which users are the same except educational attainment and the company generates a proposal to two users only if the two have the
same educational attainment. Then, the probability of a user accepting a date (or marriage) depends only on unobservable shock; thus, we cannot know people’s preferences
on spousal educational attainment by analyzing the dataset.

Table 4: Search System
The top panel shows the distribution of partners’ overall attractiveness index depending on users’ index
quintiles. Q1 is the lowest quintile and Q5 is the top quintile. Statistics in parentheses present the cumulative
density of the corresponding statistics. The middle and bottom panels show the distribution of partners’
educational attainment and facial grade, respectively, given the men’s own characteristics. Statistics in
parentheses show the fraction of females who have the corresponding characteristics.

                              Quintile of Users’ Own Overall Index (OSI)
                              Q1*        Q2         Q3         Q4         Q5
Men
Partner’s Index
- Mean                        66.26      70.04      73.29      76.70      80.75
  (cumulative density, %)     (18.34)    (30.98)    (44.56)    (59.95)    (76.01)
- SD                          7.76       7.38       7.12       6.69       7.16
- MIN                         25.35      25.35      38.60      25.35      38.60
  (cumulative density, %)     (0.01)     (0.01)     (0.04)     (0.01)     (0.04)
- MAX                         94.42      96.10      97.18      97.18      98.26
  (cumulative density, %)     (99.71)    (99.92)    (99.99)    (99.99)    (100.00)
Women
Partner’s Index
- Mean                        66.46      70.56      73.74      77.22      81.20
  (cumulative density, %)     (30.34)    (45.70)    (58.11)    (70.78)    (83.95)
- SD                          8.11       7.98       7.58       7.39       7.22
- MIN                         26.52      26.52      26.52      26.52      27.92
  (cumulative density, %)     (0.01)     (0.01)     (0.01)     (0.01)     (0.03)
- MAX                         96.03      97.87      97.87      97.87      98.28
  (cumulative density, %)     (99.88)    (99.99)    (99.99)    (99.99)    (100.00)
* Q1 refers to the lowest quintile and Q5 to the highest.

                                            Men’s Educational Attainment
Women’s Educ. Attainment                    High School    Tech./Univ.    Master/Ph.D.
- High School                    (8.59)     34.03          9.22           2.49
- Tech. College or University    (75.89)    60.95          78.21          73.57
- Master’s or Ph.D.              (15.52)    5.02           12.58          23.94

                                            Men’s Facial Grade
Women’s Facial Grade                        A              B ~ C          D ~ F
- A                              (8.74)     19.09          9.03           4.87
- B ~ C                          (79.90)    73.74          78.97          72.21
- D ~ F                          (11.36)    7.17           12.00          22.92

Table 5: Description of Search Outcomes
                                   Proposals    First Date    Second Date    Marriage**
Men
No. of users with obs.>0 *         9,538        8,911         6,690          1,370
[Percentage out of all users]      [100]        [93.43]       [70.14]        [14.37]
Median                             28           5             2
Mean                               42.94        6.06          2.56
Standard Deviation                 45.81        5.24          2.15
Women
No. of users with obs.>0 *         11,151       10,006        7,351          1,409
[Percentage out of all users]      [100]        [89.73]       [65.92]        [12.64]
Median                             27           4             2
Mean                               38.28        4.97          2.4
Standard Deviation                 36.72        4.34          1.89
Proposals
No. of all proposals               360,509      58,845        14,886         1,537
[conditional survival rate]                     [16.32%]      [25.30%]       [10.32%]
No. of user-initiated proposals    44,986       4,547         1,211          128
[conditional survival rate]                     [10.11%]      [26.63%]       [10.57%]
* The unit of observation is a proposal which reaches each stage. For example, users with
obs.>0 for a second date means the number of users who have at least one proposal that
reaches the second date.
** There is a discrepancy between the number of male and female users who eventually
married because 185 male users and 224 female users married persons who joined the
matchmaking company prior to 2002.
The distribution of OSIs for partners suggested by the company is shown in Table 4.
I classify the members into ten groups based on gender and quintile of their own OSI. For
each of the groups, I calculate the mean, standard deviations, minimum, and maximum
value of the partners’ index. To gauge the magnitude of the statistics, I also include the
cumulative density of the corresponding statistics in parentheses. For example, the first
row shows that the average OSI of women suggested to men in the first quintile (Q1)
is 66.26 and 18.34 percent of female users have an OSI lower than 66.26. Regardless
of a user’s own OSI quintile, the minimum of OSI among partners belongs to the first
percentile and the maximum of OSI among partners is in the 99th percentile. Similarly,
I find that users receive suggestions to meet all types of partners in terms of education,
facial grade, marital history, and other characteristics.
Once a proposal is made, either by the company or by a user, the company contacts
the users to check whether they would like to have a first date. If two users agree to
have a first date, then the company contacts each of them after the first date and asks
whether they would like to meet again for a second date. This response is recorded.
Although the company does not examine the results of any subsequent dates in the
same automatic fashion, a staff member assigned to each user regularly contacts his/her
user and follows up on whether the proposal eventually resulted in marriage.
Table 5 shows that there are 9,538 male users and 11,151 female users. All users in
the dataset have at least one proposal; about 91 (68) percent of users have at least one
actual first (second) date; about 13 percent of users get married to a person they found
through the matchmaking company. The median user has about 27 proposals,
4 first dates, and 2 second dates. There are 360,509 proposals in the dataset, 16.3 percent
of which (58,845 proposals) reach a first date. Among the proposals reaching a first date,
25.3 percent reach a second date, and among the proposals reaching a second date, 10.3
percent result in a marriage. As shown in the bottom panel of Table 5, user-initiated
proposals constitute only 12.5 percent of the total7 and their probability of reaching a
first date is 10 percent, much lower than the average of company-initiated proposals (17
percent). This fact suggests the possibility that the company’s recommendation of a
dating partner may affect a user’s decision, especially for a first date.
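The headline rates in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative. Because the published shares are rounded, the blended first-date rate it computes (about 16.1 percent) matches the reported 16.3 percent only approximately.

```python
# Figures quoted from the text (Table 5); variable names are illustrative.
total_proposals = 360_509
first_date_proposals = 58_845

# Share of proposals that reach a first date (reported as 16.3 percent).
first_date_rate = first_date_proposals / total_proposals

# Blend of the two reported first-date rates, weighted by proposal source.
user_share = 0.125     # user-initiated share of all proposals
user_rate = 0.10       # first-date rate, user-initiated proposals
company_rate = 0.17    # first-date rate, company-initiated proposals
blended_rate = user_share * user_rate + (1 - user_share) * company_rate

# Share of all proposals ending in marriage, implied by the rounded stage rates.
marriage_rate = first_date_rate * 0.253 * 0.103

print(f"first-date rate: {first_date_rate:.3f}")
print(f"blended rate:    {blended_rate:.3f}")
print(f"marriage rate:   {marriage_rate:.4f}")
```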
2.2.5 Patterns of Sorting
Table 6 presents the degree of sorting among users over the stages of the relationship.
I calculate statistics measuring the degree of sorting for three groups: pairs who both
wanted to have a first date, pairs who both wanted to have a second date, and couples
who married. Column (1) presents the corresponding statistics among pairs formed by
randomly drawing a man and a woman among users (random matching). The difference
in sorting between the actual outcomes and random matching reveals the degree of
sorting. Table 6 shows that users positively sort on all dimensions, and the degree of
sorting across various dimensions is generally similar at different relationship stages.
7
The high ratio of company-initiated proposals to user-initiated proposals can be attributed to two
factors: First, we can observe a user-initiated proposal only if at least one user wants to have a first
date. Thus, the total number of profiles users reviewed for a first date is not necessarily smaller than
the number of company-initiated proposals. Second, the company frequently sends a proposal to each
user when the user is not in an ongoing relationship. For example, when a user declines a proposal, the
company proposes a first date with another user within four days (median value).
population data, whereas those in column (7) are computed using weights based on women. In column (8),
measures of sorting along industry and income are computed using weights based on men and women as shown
in the Basic Statistical Survey of Wage Structure, because the HIS is not a representative sample of workers.
When the HIS is used, the statistics using weights based on husbands are presented first followed by those
based on wives’ weights. Def. 1 classifies education into 4 categories (high school or less, technical college,
university, and master’s or Ph.D.), and Def. 2 classifies education as either “high school or less” or “college or
more”.
Table 6: Sorting Pattern

                             Random    All Proposals                    User-Initiated Proposals
                                       1st date   2nd date   Married    1st date   2nd date   Married
                             (1)       (2)        (3)        (4)        (5)        (6)        (7)
Mean difference of age       5.040     3.374      3.331      3.343      3.295      3.224      3.067
Fraction of couples with
 - Same education            0.361     0.529      0.535      0.549      0.484      0.510      0.517
 - Same marital history      0.710     0.985      0.985      0.989      0.994      0.991      0.993
 - Same region               0.352     0.927      0.928      0.932      0.885      0.867      0.860
 - Same hometown             0.220     0.475      0.484      0.454      0.392      0.413      0.373
 - Same industry             0.108     0.130      0.129      0.127      0.111      0.131      0.123
Income correlation           0.000     0.212      0.214      0.200      0.121      0.190      0.239
Generally, the sorting patterns in the user-initiated proposals are comparable to the
overall sorting (columns (5) to (7)), although the degree of sorting along region and
hometown is lower.
3 Empirical Framework
In analyzing a user’s decision problem, one faces the choice of using a fully specified
dynamic model or using a model approximating the user’s optimization problem. I
choose the second option for the following reason. Although my dataset is exceptionally rich compared to standard datasets on dating and marriage, three important data
limitations related to measuring opportunity cost are impediments to the first option.
First, the dataset does not have information on the date when a user decided to continue/discontinue a relationship. This information is key for determining how long a user
needs to wait for a partner’s response, which is essential to quantify the opportunity cost
of accepting a proposal. Second, there is no information about the number of dates between the second date and marriage or, if the couple did not eventually marry, about
which partner ended the dating relationship. Third, the dataset has little information on
a user’s own search process, such as what other profiles he/she browses in the database
or whom he/she meets outside the matchmaking service. Thus, I employ a model in
which a user’s opportunity cost of accepting a date is approximated as a function of the
user’s observable characteristics, instead of introducing the additional assumptions that
would be necessary for estimating a fully specified dynamic model with my data.
It is important to note that my model, which is described in the sections below,
is consistent with the empirical facts shown in Section 2: I allow for the possibility
of gender-specific preferences because the distribution of stated importance of spousal
traits differs between men and women. I introduce learning about a partner’s traits
over the relationship to generate multiple dates with the same partner. In addition, my
model generates some qualitative predictions similar to those that would be predicted
by a fully specified dynamic model (see Section 3.1). The remainder of this section
presents an empirical model, then discusses issues that arise in estimating the model,
and presents identification and estimation methods.
3.1 Individual’s Problem
I begin by introducing some terminology and notation. A pair $(m, w)$ refers to a specific
combination of man m and woman w and also to the proposal between m and w, since the company
will introduce m and w only once. Subscript $s \in \{1, 2, 3\}$ indicates the stage of the
relationship for two users. Stage 1 represents the decision to have a first date, stage 2
represents the decision to have a second date, and stage 3 contains the marriage decision.
Superscript M or W indicates the gender of the decision maker in the pair. A binary variable
$Y_s^M(m, w)$ is one if man m wants to continue a relationship with w at stage s and zero
otherwise. Likewise, $Y_s^W(m, w)$ is one if woman w wants to do so. I define the outcome of a
proposal between m and w as a sequence
$\{Y_1^M(m, w), Y_1^W(m, w), Y_2^M(m, w), Y_2^W(m, w), Y_3(m, w)\}$, where $Y_3(m, w)$ is
the product of the two users’ responses at $s = 3$ (i.e., $Y_3(m, w) = Y_3^M(m, w) \times Y_3^W(m, w)$).
Note that $Y_2^M(m, w)$ and $Y_2^W(m, w)$ are observable only if $Y_1^M(m, w) = Y_1^W(m, w) = 1$,
and $Y_3(m, w)$ is observable only if $Y_2^M(m, w) = Y_2^W(m, w) = 1$.
Because the notation is symmetric, from now on I describe the model considering the
case where m receives a proposal from the company to date w. Let $U^M(m, w)$ denote
m’s utility from marrying w, and $U_s^M(m, w)$ denote the corresponding expected utility
given the information available at stage s. $R_s^M(m)$ denotes the utility from ending a
relationship at stage s, which is the expected utility from waiting for a new proposal
in the next period. I assume $R_s^M(m)$ depends on the stage because the number of days
a user waits for a partner’s response may vary by stage. $p_s^M(m, w)$ is m’s expected
probability, at stage s, that m and w eventually get married. If m wants to continue a
relationship with w at stage s but m and w eventually do not marry each other, then
m receives the utility $d_s^M(m, w)$. $d_s^M(m, w)$ can be interpreted as the utility from just
having a first (second) date if it is positive, or the disutility from being rejected at each stage
of the relationship if it is negative.
If man m wants to continue a relationship with woman w at stage s, then he will
get $U_s^M(m, w)$ with probability $p_s^M(m, w)$ and $R_s^M(m) + d_s^M(m, w)$ with probability
$1 - p_s^M(m, w)$. If he does not want to continue, he receives the utility $R_s^M(m)$. Therefore, if
$p_s^M(m, w) > 0$, the condition that holds if m wants to continue a relationship with w at
stage s is
$$Y_s^{M*}(m, w) = U_s^M(m, w) + \frac{1 - p_s^M(m, w)}{p_s^M(m, w)}\, d_s^M(m, w) - R_s^M(m) > 0. \qquad (1)$$
In the rest of this paper, I use the term reservation utility to refer to the sum of the last
two terms in Eq. (1).
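The inequality in Eq. (1) follows from comparing the two expected payoffs described above. A short derivation, using only the symbols already defined and suppressing the arguments $(m, w)$ for brevity:

```latex
\begin{align*}
p_s^M U_s^M + (1 - p_s^M)\,(R_s^M + d_s^M) &> R_s^M
  && \text{(continuing beats ending)} \\
p_s^M U_s^M + (1 - p_s^M)\, d_s^M - p_s^M R_s^M &> 0
  && \text{(subtract } R_s^M \text{ from both sides)} \\
U_s^M + \frac{1 - p_s^M}{p_s^M}\, d_s^M - R_s^M &> 0
  && \text{(divide by } p_s^M > 0\text{)}
\end{align*}
```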
This model generates two qualitative predictions that would arise from a fully specified
dynamic model. First, in a fully specified dynamic model, m can accept a date
with w because he has an option value to reject her at stage 2 or 3. My model can
generate this prediction if m gets sufficiently large utility from just having a date and
the expected probability of eventually marrying w is not one (e.g., $d_1^M(m, w) > 0$ and
$p_1^M(m, w) < 1$). Second, in a fully specified dynamic model, m can reject a date with
w because m expects the probability that w wants to marry him to be low and he will
suffer if he wants a date with w but w rejects him. My model also allows for this
possibility, since if m has large disutility from rejection and the probability of eventually
marrying w is not one (i.e., $d_s^M(m, w) < 0$ and $p_s^M(m, w) < 1$), then m will reject a date
(or marriage) with w.
3.2 Preferences
Let $X^m$ and $X^w$ be the vectors of characteristics of man m and woman w, respectively.
$X^m(i)$ ($X^w(i)$) denotes the ith element of the corresponding vector. The utility that m
receives from marrying w is a function of the observable attributes of m and w and a
pair-specific random utility $\epsilon^M_{m,w}$:
$$U^M(m, w) = \sum_i \left[ \alpha_i^M X^m(i) + \beta_i^M X^w(i) + \gamma_i^M h\left(X^m(i), X^w(i)\right) \right] + \epsilon^M_{m,w} \qquad (2)$$
where $h(x, y) = (x - y)^2$ if x and y are continuous, and $h(x, y) = 1(x = y)$ otherwise.
The variable $\epsilon^M_{m,w}$ summarizes the characteristics of w that m cares about but that
are unobservable to researchers (e.g., personality). It is drawn from a $N(0, (\sigma_\epsilon^M)^2)$
distribution and is independent from $\epsilon^W_{m,w}$, $\epsilon^M_{m',w'}$ and $\epsilon^W_{m',w'}$ for all
$(m', w') \neq (m, w)$.
This utility function has two key features. First, it allows men and women to have
different utility functions because $\{\alpha^M, \beta^M, \gamma^M\}$ can differ from
$\{\alpha^W, \beta^W, \gamma^W\}$. Second, the utility function depends on the interaction between
a husband’s and a wife’s characteristics because of the function $h(X^m(i), X^w(i))$ in Eq. (2).
If either of the parameters $\{\gamma^M, \gamma^W\}$ is not zero, then two men may rank potential
mates differently depending on their own characteristics, and thus the estimated utility
function may imply complementarity between a husband’s and a wife’s characteristics.
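To make the role of the interaction term concrete, the following is a minimal sketch of Eq. (2), not the author’s estimation code. The function names, the one-trait setup, and all coefficient values are hypothetical and chosen only for illustration. It shows that with a nonzero coefficient on the interaction term, two men of different ages rank the same two women differently.

```python
def h(x, y, continuous):
    """Interaction term of Eq. (2): squared gap for continuous traits, match indicator otherwise."""
    return (x - y) ** 2 if continuous else float(x == y)


def marriage_utility(x_m, x_w, alpha, beta, gamma, continuous, eps=0.0):
    """Utility a man with traits x_m gets from marrying a woman with traits x_w."""
    total = eps
    for i in range(len(x_m)):
        total += alpha[i] * x_m[i]                              # own-trait term
        total += beta[i] * x_w[i]                               # spouse-trait term
        total += gamma[i] * h(x_m[i], x_w[i], continuous[i])    # interaction term
    return total


# Hypothetical one-trait example: age in decades, made-up coefficients.
alpha, beta, gamma, continuous = [0.0], [-1.0], [-2.0], [True]
young_man, older_man = [3.0], [4.0]
women = {"young": [2.8], "older": [3.8]}

u_young_man = {name: marriage_utility(young_man, x, alpha, beta, gamma, continuous)
               for name, x in women.items()}
u_older_man = {name: marriage_utility(older_man, x, alpha, beta, gamma, continuous)
               for name, x in women.items()}

print(u_young_man)
print(u_older_man)
```

With the interaction coefficient set to zero, both men would rank the two women identically, since only the spouse-trait term would vary across partners.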
Table 8 presents the attributes that may affect a user’s utility from marriage. Some
of the attributes require additional explanation. First, the variable “facial grade” ranges
from A to F where a facial grade A is the most attractive and F is the least attractive.8
Second, the variable “hours worked” is the average of the number of hours worked per
year given a worker’s gender, age group, educational attainment, and industry, constructed from the population wage surveys. I assume that after controlling for income
and hours worked, individuals are indifferent about their spouse’s industry. I take this
approach for reasons of parsimony in order to reduce the computational burden of estimation. Third, Body Mass Index (BMI) is a height-adjusted measure of weight and
ranges between 18.5 and 24.9 for normal-weight adults 20 years old and older.9 Fourth,
primary care-provider is a binary variable that is one if a man is the eldest son or if a
woman is the eldest daughter and has no male siblings. This indicates whether a user is
likely to be the primary care provider for his or her parents and thus the user may need
to share the burden with his/her spouse. Marital status of parents is a binary variable
8
In the data, the distribution of facial grades is as follows: A(7.1 percent), B(38.3 percent), C(42.7
percent) and D∼F(9.6 percent).
9
Source: U.S. Centers for Disease Control and Prevention, Department of Health and Human Services
that is zero if the biological parents of a user are alive and still married to each other.
Finally, I define a binary variable “hometown conflict” that is one if a user from Jeolla
meets a partner from Gyeongsang because substantial political tensions exist between
these two regions.
3.3 Expected Utility from Marriage and Learning Processes
At each stage of decision, m forms an expectation of the utility from marrying w based
on the available information set $\Omega^M_{m,w,s}$ (i.e., $U_s^M(m, w) = E(U^M(m, w) \mid \Omega^M_{m,w,s})$). This
section presents two types of learning processes that govern m’s expectation. In Type
1, m acquires additional information about w’s characteristics not revealed in the online
database but observable to researchers, discussed in Section 2.2.4. In Type 2, m acquires
additional information about w’s characteristics unobservable to researchers (i.e., $\epsilon^M_{m,w}$
in Eq. (2)).
3.3.1 Type 1 Learning Process: Linear Projection
Let $X_1^w$ and $X_2^w$ denote w’s characteristics observable to m at stage 1 and at stage 2,
respectively.10 Because the dataset does not provide information about the exact range
of a partner’s characteristics obtained by a user prior to a first date, I make the following
assumption: $X_1^w$ is all the characteristics included in the utility function, except for the
four variables (denoted by $X_2^w$) that are not presented in the online database (i.e.,
income, parental wealth, BMI, parental marital status). I assume that at stage 1 m
predicts w’s income as a linear function of her education and hours worked and predicts w’s
parental wealth as a linear function of her father’s educational attainment. I assume
that $X_1^w$ is not correlated with w’s BMI and parental marital status.11
10
In theory, I can assume that some observable traits can be observable after a second date. However,
in that case, estimation is more difficult because, after a second date, only the joint marriage decision
is observable, not each user’s response for marriage.
11
Although I introduce these assumptions to reduce computational burden, they seem plausible.
For example, I find that a user’s income is mainly accounted for by education and hours worked, and
parental wealth is accounted for by father’s education. In an OLS regression of income on the entire
set of characteristics, education and hours worked account for over 93 percent of R-squared. In an OLS
regression of parental wealth on the entire set of characteristics, father’s education accounts for over 50
percent of R-squared. For BMI, over 92 percent of users have a normal weight.
3.3.2 Type 2 Learning Process: Bayesian Updating
I assume that man m receives a noisy signal $\zeta^M_{m,w,s}$ of woman w’s true type $\epsilon^M_{m,w}$ when the
two actually meet in person (i.e., stage s with $s \geq 2$). I assume that a signal $\zeta^M_{m,w,s}$ is the
sum of the true type $\epsilon^M_{m,w}$ and noise $\nu^M_{m,w,s}$. The noise is assumed normally distributed
with mean zero and variance $(\sigma_\nu^M)^2$. Man m uses Bayes’ rule to update the expectation
of $\epsilon^M_{m,w}$ from the observed signals.12 The assumption of no Type 2 learning at s = 1 is
used for identification and discussed further in Section 3.6.
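Given the information set at stage s, the distribution of $\epsilon^M_{m,w}$ can be written as:
$$\epsilon^M_{m,w} \mid \Omega^M_{m,w,1} \sim N\left(0, (\sigma_\epsilon^M)^2\right) \qquad (3)$$
$$\epsilon^M_{m,w} \mid \Omega^M_{m,w,s} \sim N\left( \frac{(\sigma_\nu^M)^{-2} \sum_{i=2}^{s} \zeta^M_{m,w,i}}{(\sigma_\epsilon^M)^{-2} + (s-1)(\sigma_\nu^M)^{-2}},\ \frac{1}{(\sigma_\epsilon^M)^{-2} + (s-1)(\sigma_\nu^M)^{-2}} \right) \quad \text{for } s \geq 2$$
where
$$\Omega^M_{m,w,1} = \{X^m, X_1^w\}$$
$$\Omega^M_{m,w,2} = \{X^m, X_1^w, X_2^w, \zeta^M_{m,w,2}\}$$
$$\Omega^M_{m,w,3} = \{X^m, X_1^w, X_2^w, \zeta^M_{m,w,2}, \zeta^M_{m,w,3}\}.$$
Having multiple dates with w improves the precision of m’s prediction of $\epsilon^M_{m,w}$ since the
conditional variance of w’s unobserved attributes ($Var(\epsilon^M_{m,w} \mid \Omega^M_{m,w,s})$) decreases in s.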
3.4 Reservation Utility
I assume that $R_s^M(m)$ depends on four components. The first component, $c_s^M$, is a
gender-stage-specific common component. The second component, $L^m$, is the number of singles
of the opposite sex per km$^2$ in the region where m lives. This component captures the
option value of finding a spouse outside the matchmaking service.13 The third component,
a user-specific random utility $\eta_m$, incorporates unobserved users’ characteristics,
such as willingness to marry. The fourth component, $\omega^M_{m,w,s}$, is a pure idiosyncratic
shock which is correlated with neither observables nor other random variables. It is
12
Examples of papers that employ a Bayesian learning process include Parent (2002), Gibbons et al.
(2005), and Brien et al. (2006).
13
I examined an alternative specification using both Lm and the sex-ratio. I find that the sex-ratio
is not statistically significant at a conventional level, after controlling for Lm .
normally distributed and its variance at stage 1 is assumed to be one. Thus, I have:
$$R_s^M(m) = c_s^M + \chi^M L^m + \eta_m + \omega^M_{m,w,s} \qquad (4)$$
with $\eta_m \sim N(0, (\sigma_\eta^M)^2)$, $\omega^M_{m,w,s} \sim N(0, (\sigma^M_{\omega,s})^2)$, and $(\sigma^M_{\omega,1})^2 = 1$.
Next, I assume $p_s^M(m, w)$ is the likelihood that a man whose type is the same as m would
marry a woman whose type is the same as w, if the man were to get married. Using the
marriage registers (MR), I define the type of a person based on age group, education, and
location. I then compute $p_s^M(m, w)$ by dividing the number of new marriages between
men whose type is the same as m and women whose type is the same as w by the number
of new marriages by men whose type is the same as m. Lastly, I assume $d_s^M(m, w)$ to
be constant given gender and stage ($d_s^M$).
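The ratio described above is a conditional frequency, which can be sketched as follows. The register format, the type labels, and the function name are assumptions made for illustration; the actual marriage-register data and type definitions are not reproduced here.

```python
from collections import Counter


def marriage_probabilities(marriages):
    """Conditional marriage frequencies by type.

    `marriages` is a list of (husband_type, wife_type) pairs, one per new marriage.
    Returns p[(tm, tw)] = (# marriages between tm men and tw women) / (# marriages by tm men).
    """
    pair_counts = Counter(marriages)
    husband_counts = Counter(husband_type for husband_type, _ in marriages)
    return {(tm, tw): n / husband_counts[tm] for (tm, tw), n in pair_counts.items()}


# Hypothetical register: a type is (age group, education, location).
man_a = ("30-34", "university", "Seoul")
man_b = ("35-39", "technical college", "Busan")
woman_x = ("25-29", "university", "Seoul")
woman_y = ("30-34", "technical college", "Busan")

register = [(man_a, woman_x), (man_a, woman_x), (man_a, woman_y), (man_b, woman_y)]
p = marriage_probabilities(register)

print(p[(man_a, woman_x)])
print(p[(man_b, woman_y)])
```

Pairs of types with no recorded marriage are simply absent from the returned dictionary, so a caller would need to decide how to treat them.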
3.5 Issues Regarding Estimation
3.5.1 Non-randomness of Proposals
Proposals in the data are not randomly generated. With the company-initiated proposals, the non-randomness generated by the company’s algorithm results in over-sampling
of observations when two users’ OSIs (thus observables) are similar. The user-initiated
proposals are observed only if at least one user wants a first date with the other user.
When estimating the model, I use only company-initiated proposals for two reasons:
First, the majority of proposals (87 percent) are company-initiated; Second, I do not
have information about who the user browsed without asking out. Without the information on a user’s own search process, we cannot weigh the importance of one user-initiated
proposal relative to one company-initiated proposal.
To address the non-randomness due to the company’s algorithm, I construct weights
as described below. I first classify users into 1,136 groups based on gender, OSI decile,
age group, geographical location, and marital history. I then compute the probability of
a user in group i of getting a proposal with a user in a group j. For proposals with i and
j, I use the probability of observing a proposal between i and j under random matching
divided by the observed probability as weights.14
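A minimal sketch of such weights follows. It assumes, as a simplification not stated in the text, that under random matching the probability of a proposal between groups i and j is the product of the two groups’ shares among men and among women. The group labels, shares, and function name are hypothetical.

```python
from collections import Counter


def proposal_weights(proposals, male_group_share, female_group_share):
    """Weight for each (male group, female group) cell: P_random / P_observed.

    `proposals` is a list of (male_group, female_group) pairs, one per observed proposal.
    """
    n = len(proposals)
    observed = Counter(proposals)
    weights = {}
    for (i, j), count in observed.items():
        p_observed = count / n
        # Assumption: random matching draws the man and the woman independently.
        p_random = male_group_share[i] * female_group_share[j]
        weights[(i, j)] = p_random / p_observed
    return weights


# Hypothetical data: two equally sized groups, with proposals concentrated on similar pairs.
proposals = ([("a", "a")] * 40 + [("a", "b")] * 10 +
             [("b", "a")] * 10 + [("b", "b")] * 40)
shares = {"a": 0.5, "b": 0.5}
w = proposal_weights(proposals, shares, shares)

print(w)
```

In this toy case the over-sampled similar pairs are weighted down and the under-sampled dissimilar pairs are weighted up, so that each cell carries the same total weight it would have under random matching.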
14
To see the need to use weights, consider a simple example as follows: $Y_m^* = \alpha_1 + \alpha_2 \times 1(X_m = X_w) + \epsilon$
Table 7: Distribution of User’s Tenure: Year of Membership Purchase and Representation in the Sample

First year of              Men                    Women
membership purchase        Users    Proposals     Users    Proposals
                           (1)      (2)           (3)      (4)
2002                       17.56    13.18         16.51    12.65
2003                       21.72    21.92         20.77    20.88
2004                       28.22    31.68         28.22    32.22
2005                       25.15    28.62         26.20    29.31
2006 (Jan ~ June)           7.36     4.61          8.30     4.94
Sum                       100      100           100      100

3.5.2 Censoring
Of the company-initiated proposals, 2.6 percent are censored, either because two users
had a first date but the data does not have information about their second date or
marriage, or because the two users had the first two dates but the data does not have
information about whether or not they married. To estimate the model, I assume that
the censoring occurs at random. However, because only a small fraction
of proposals are censored, the estimation results change little even if I alternatively
assume that all censored proposals eventually did not result in marriage.
3.5.3 Sample Distribution of User-Specific Unobserved Reservation Utility
I assume that the distribution of user-specific unobserved reservation utility ($\eta_m$ in Eq.
(4)) in my data is the same as the population distribution, assumed to be $N(0, (\sigma_\eta^M)^2)$
for men and $N(0, (\sigma_\eta^W)^2)$ for women. However, it is possible that those who have a high
value of $\eta_m$ may remain at the service and thus be over-represented in the proposals.
Alternatively, it is also possible that those who have a high value of $\eta_m$ may get
disappointed by the quality of other users and stop using the service, thus making them
under-represented.15 Table 7 shows that except for people joining in the first half of
2006, the number of new users and their share in the proposals remain similar across
where $Y_m = 1(Y_m^* > 0)$, $X_m, X_w \in \{0, 1\}$ and $\epsilon \sim N(0, 1)$. Then, the estimate of $\alpha_2$ is
$\Pr(Y_m = 1 \mid X_m = X_w) - \Pr(Y_m = 1) = (1 - \Pr(X_m = X_w))(\Pr(Y_m = 1 \mid X_m = X_w) - \Pr(Y_m = 1 \mid X_m \neq X_w))$.
If the company’s algorithm generates proposals for people whose types are similar to each other, then
$(1 - \Pr(X_m = X_w))$ among the observed proposals is lower than $(1 - \Pr(X_m = X_w))$ under random
matching.
15
For example, about 13 percent of users who started to use the service between 2002 and 2005
stopped using the service after three months although they did not get married during this period and
could have used the service for nine more months.
users’ year of membership purchase. Thus, this assumption can be plausible if, among
users who join the company in the same year, the distribution of ηm for men who left
the service early is balanced out by those who used the service longer. I discuss the
robustness check regarding this issue in Section 6.
3.6 Identification and Estimation Methods
Parameters can be identified up to scale due to the feature that a user’s response at a
given stage is binary (i.e., whether to continue the relationship with a partner). Thus,
I normalize the total variance at the first stage as one by assuming that the variance
of the random shocks $\omega^M_{m,w,1}$ and $\omega^W_{m,w,1}$ is one and users do not receive a noisy signal of a
partner’s type at the first stage. Because of the restrictions across stages, the variances
of the composite random variables at stages 2 and 3 are identified (see Appendix A for
the details).
I use a Laplace type estimator (LTE) as suggested by Chernozhukov and Hong (2003),
who show that the LTE performs well in applications where the parameter dimension is
high and many local optima exist. The LTEs are defined similarly to Bayesian estimators, but use more general objective functions such as method of moments in place of
the likelihood function in Bayesian estimators. For the LTE, I define the objective function to minimize the distance between actual moments and simulated moments from the
model. The moments of interest consist of five categories: the probability of accepting
a first date for a man and that for a woman, the probability of accepting a second date
for a man and that for a woman, and finally the probability of a pair of a man and a
woman getting married to each other. Each probability is multiplied by instrumental
variables that are the users’ observable characteristics. In total, I have 172 identifiable
parameters and 245 moments (see Appendix B).
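To illustrate the idea of a Laplace type estimator on a problem small enough to read, the sketch below applies it to a one-parameter toy model, Y = 1(θ + e > 0) with e standard normal, matching a single actual moment (the acceptance share) to its simulated counterpart. Everything here is an assumption made for illustration: the toy model, the flat prior on [-3, 3], the identity weighting, the random-walk Metropolis sampler and its tuning values, and the made-up data. It is not the 172-parameter implementation used in the paper.

```python
import math
import random


def lte_estimate(y, n_draws=1000, n_iter=3000, burn_in=1000, step=0.2, seed=0):
    """Quasi-posterior mean of theta in the toy model Y = 1(theta + e > 0), e ~ N(0, 1)."""
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]  # simulation draws, held fixed
    n = len(y)
    actual_moment = sum(y) / n

    def log_quasi_posterior(theta):
        if not -3.0 <= theta <= 3.0:          # flat prior on [-3, 3] (an assumption)
            return -math.inf
        simulated_moment = sum(1 for e in shocks if theta + e > 0) / n_draws
        gap = actual_moment - simulated_moment
        return -0.5 * n * gap * gap           # moment-distance objective, identity weight

    theta = 0.0
    current = log_quasi_posterior(theta)
    kept = []
    for it in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)            # random-walk Metropolis step
        candidate = log_quasi_posterior(proposal)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        if it >= burn_in:
            kept.append(theta)
    return sum(kept) / len(kept)              # mean of the retained draws


# Made-up data: 70 percent of 1,000 hypothetical users accept.
y = [1] * 700 + [0] * 300
theta_hat = lte_estimate(y)
print(round(theta_hat, 3))
```

The simulated moment is a step function of θ, so the objective is not differentiable. Averaging draws from the quasi-posterior sidesteps the need for gradient-based optimization, which is one practical reason such estimators are attractive for simulated-moment problems.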
4 Estimation Results
This section presents the estimated parameters in the model, trade-offs between spousal
income and other traits, and the model fit.
Table 8: Estimation Results (Baseline Model)
This table presents the estimation results for users’ surplus from marriage. Details of the regressors are in
Appendix Table A.2.

                                                                          Men                  Women
Variables                                  Source and unit                Estimate   SE        Estimate   SE
Age: own                                   Birth certificate, 10 yrs        1.685    0.503      -1.761    0.495
Age: spouse                                Birth certificate, 10 yrs       -2.232    0.450       1.406    0.551
Age: sq. diff.                             Birth certificate, 10 yrs       -1.906    0.025      -2.619    0.025
Edu: own ≤ high school                     Diploma                          0.151    0.011      -0.501    0.005
Edu: own = tech. college                   Diploma                          0.049    0.025      -0.444    0.007
Edu: own = master’s or Ph.D.               Diploma                          0.034    0.057       0.077    0.005
Edu: spouse ≤ high school                  Diploma                          0.099    0.006       0.218    0.016
Edu: spouse = tech. college                Diploma                          0.043    0.009      -0.304    0.027
Edu: spouse = master’s or Ph.D.            Diploma                         -0.120    0.007       0.215    0.064
Edu: own = spouse                          Diploma                          0.151    0.011       0.098    0.104
Industry: own = spouse                     Proof of employment              0.047    0.039       0.048    0.046
Hours worked: own                          Author’s calculation             0.224    0.294      -0.438    0.036
Hours worked: spouse                       Author’s calculation            -1.129    0.051      -1.439    0.323
Hours worked: sq. diff.                    Author’s calculation            -0.139    0.003      -0.061    0.003
Log income: own                            Reported*                       -2.926    0.022       0.037    0.015
Log income: spouse                         Reported*                        2.443    0.021       3.307    0.016
Log income: sq. diff.                      Reported*                        0.989    0.104      -0.142    0.101
Dad’s edu: own ≤ high school               Reported                         0.105    0.111      -0.177    0.013
Dad’s edu: own = tech. college             Reported                         0.123    0.012      -0.137    0.002
Dad’s edu: own = master’s or Ph.D.         Reported                        -0.035    0.027       0.469    0.004
Dad’s edu: spouse ≤ high school            Reported                        -0.143    0.019      -0.029    0.120
Dad’s edu: spouse = tech. college          Reported                         0.033    0.002       0.049    0.014
Dad’s edu: spouse = master’s or Ph.D.      Reported                         0.000    0.006       0.099    0.035
Dad’s edu: own = spouse                    Reported                         0.052    0.094       0.028    0.105
Log parental wealth: own                   Reported*                       -0.005    0.027       0.429    0.019
Log parental wealth: spouse                Reported*                        0.655    0.027       0.722    0.019
Log parental wealth: sq. diff.             Reported*                       -0.531    0.163      -0.175    0.186
Facial grade: own = A                      Company’s evaluation            -0.204    0.025      -0.085    0.036
Facial grade: own = B                      Company’s evaluation            -0.206    0.073      -0.158    0.087
Facial grade: own = D~F                    Company’s evaluation             0.203    0.046       0.100    0.050
Facial grade: spouse = A                   Company’s evaluation             0.505    0.029       0.297    0.029
Facial grade: spouse = B                   Company’s evaluation             0.276    0.079       0.104    0.087
Facial grade: spouse = D~F                 Company’s evaluation            -0.083    0.044      -0.112    0.050
Facial grade: own = spouse                 Company’s evaluation            -0.005    0.073       0.049    0.083
Height: own                                Reported, 1 meter                1.605    0.263      -4.377    0.270
Height: spouse                             Reported, 1 meter               -1.873    0.245       5.466    0.289
Height: sq. diff.                          Reported, 1 meter              -15.395    0.003     -17.508    0.003
Body Mass Index: own                       Reported, 10                     0.429    0.061       0.052    0.036
Body Mass Index: spouse                    Reported, 10                    -2.167    0.051       1.155    0.044
Body Mass Index: sq. diff.                 Reported, 10                    -0.020    0.789       0.082    0.607
Marital history: own = ever divorced       Legal documents                  0.456    0.044       0.030    0.045
Marital history: spouse = ever divorced    Legal documents                  0.140    0.043       0.698    0.046
Marital history: own = spouse              Legal documents                  0.540    0.150       0.687    0.165
Primary care provider: own = yes           Legal documents                 -0.024    0.073       0.001    0.054
Primary care provider: spouse = yes        Legal documents                 -0.020    0.049      -0.034    0.084
Primary care provider: own = spouse        Legal documents                 -0.014    0.104      -0.011    0.114
Table 8: Estimation Results (cont.)

                                                                          Men                  Women
Variables                                  Source and unit                Estimate   SE        Estimate   SE
Region: own = spouse                       Legal documents                  0.134    0.144       0.223    0.160
Religion: own = spouse                     Reported                         0.052    0.100       0.075    0.113
Hometown: own = spouse                     Legal documents                  0.112    0.082       0.053    0.090
Hometown conflict: yes                     Legal documents                 -0.261    0.017      -0.137    0.023
Parental marital status: own               Legal documents                  0.335    0.009       0.012    0.005
Parental marital status: spouse            Legal documents                  0.041    0.007      -0.051    0.006
Parental marital status: own = spouse      Legal documents                 -0.118    0.021       0.402    0.015
Density: own                               Author’s calculation            -1.400    0.030      -1.849   38.955
Inverse of the success rate at s=1         Author’s calculation            -0.475    0.001      -0.057    0.001
Inverse of the success rate at s=2         Author’s calculation             0.067    0.001       0.124    0.000
Inverse of the success rate at s=3         Author’s calculation            -0.112    0.000       0.559    0.000
s.d. of composite shocks at s=2 ($\sigma^M_2$)                              3.127    0.006       4.275    0.003
s.d. of composite shocks at s=3 ($\sigma^M_3$)                              2.729    0.014       5.801    0.000
cov. btw. shocks at s=2 and s=3 ($k^M$)                                     0.729    0.002       0.901    0.001
s.d. of random reservation util. ($\sigma^M_\eta$)                          0.490    0.031       0.315    0.032
No. of proposals                           165,896
No. of users                               14,818
* The unit of income (parental wealth) is 10,000 won. The log of the variable is divided by 10 for scaling.
4.1 Net Utility from Marriage
The estimated parameters described in Section 3 are presented in Table 8 in such a way
that a positive coefficient of a variable implies that, ceteris paribus, a user is more likely
to want to continue a relationship with a partner as the value of the variable increases.
Many traits are statistically significant at a conventional level for explaining dating and marriage decisions, suggesting that people consider a large number of partner
traits when they make their decisions on dating and marriage. In particular, parental
socioeconomic status, such as father’s education and parental wealth, still affects people’s decisions, even after controlling for a large number of individual characteristics.16
This finding suggests that the impact of family background on marital sorting can be
important in studying intergenerational mobility.
The estimated parameters governing the utility from the interaction between a husband’s
and wife’s traits (i.e., $\gamma^M$ and $\gamma^W$ in Eq. (2)) are statistically different from zero
for many traits. This implies that people may have different preference rankings for
potential mates, depending on their own characteristics. For instance, consider age. On
average, male users receive higher utility by marrying a young partner; a one-unit (10-year)
increase in spousal age lowers the utility from marriage by 2.23. However, the larger the
age difference between a husband and a wife, the lower the utility. Thus, the optimal
age of a user’s spouse can vary by the user’s own age. Figure 2 (the solid line) plots
the expected utility that the median man (33-year-old) will receive from marrying a
woman, as a function of the woman’s age. The graph shows that, ceteris paribus, the
median man considers a 27-year-old woman to be ideal. Similarly, the median woman
(30-year-old) considers a 32-year-old man to be ideal (the dashed line in Figure 2).
Preferences for “similar” types are also observed for height and marital history. On the
other hand, for some characteristics, the utility from marrying a “good” type dominates
the utility from marrying a similar type. For example, regardless of a user’s own facial
grade, both men and women strictly prefer a spouse with better facial features (i.e.,
A > B > C > {D ∼ F }). For spousal income, the more a spouse earns, the higher
utility both the median man and woman enjoy in marriage (Figure 3).
16
Charles et al. (2006) also find positive marital sorting by parental wealth, even after controlling for
individual characteristics, among married couples in the United States.
[Figure 2: Utility from Spouse’s Age. Vertical axis: utility from marriage; series: Men, Women.]
[Figure 3: Utility from Spouse’s Income. Vertical axis: utility from marriage; series: Men, Women.]
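The ideal ages quoted above can be roughly reproduced from the age coefficients alone. The sketch below is an illustration, not the paper’s procedure. It assumes that the spouse-age and squared-age-difference terms of Eq. (2) are the only parts of utility that vary with the partner’s age, takes the point estimates from Table 8 (where age is measured in units of 10 years), and uses a hypothetical helper name. Under these assumptions it gives about 27.1 years for the median man and about 32.7 years for the median woman, in line with the 27-year-old and close to the 32-year-old reported in the text.

```python
def ideal_spouse_age(own_age, beta_spouse, gamma_sq_diff, unit=10.0):
    """Spouse age a that maximizes beta * (a / unit) + gamma * ((own_age - a) / unit) ** 2.

    The first-order condition gives own_age - a = unit * beta / (2 * gamma),
    which is a maximum only when gamma is negative.
    """
    if gamma_sq_diff >= 0:
        raise ValueError("need a negative coefficient on the squared difference")
    return own_age - unit * beta_spouse / (2.0 * gamma_sq_diff)


# Point estimates read from Table 8 (rows "Age: spouse" and "Age: sq. diff.").
ideal_for_median_man = ideal_spouse_age(33, -2.232, -1.906)
ideal_for_median_woman = ideal_spouse_age(30, 1.406, -2.619)

print(round(ideal_for_median_man, 1))
print(round(ideal_for_median_woman, 1))
```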
Interesting gender differences in preferences are observed for educational attainment and father’s educational attainment. Consider two men (m1 , m2 ) and two women
(w1 , w2 ). Suppose the highest educational attainment of m1 and w1 is a college degree
and that of m2 and w2 is a master’s degree. The estimation results suggest that man m1
receives higher utility from marrying w1 than marrying w2 (0.15 vs. -0.12). Similarly,
man m2 receives higher utility from marrying w2 than marrying w1 (0.03 vs. 0.00).
Therefore, the two men prefer marrying a woman whose educational attainment is the
same as theirs. In contrast, both women receive higher utility from marrying m2 than
marrying m1 . Thus, women prefer marrying a man with high educational attainment
regardless of their own.17 We observe the same pattern for father’s educational attainment. People in a region where there are many singles of the opposite sex have a higher
reservation utility. This may reflect the fact that a high density of available singles increases the opportunity of finding a spouse more attractive than the current partner. A
user’s expectation of the probability of marriage affects the user’s responses differently
across stages. For example, if there are many marriages between men and women whose
types are the same as m and w, respectively, then m is more likely to accept a first date
with w, but less likely to accept a second date with w.
The estimated covariance of the composite random shocks implies that, compared
to having only a first date, having a second date reduces uncertainty due to imperfect
information about the partner’s unobservable type by up to 47 percent for men and 20
percent for women (see Appendix A).
4.2 Trade-offs between Spouse’s Income and Traits
I compute the trade-offs between spousal income and other traits to gauge the magnitude
of these traits’ contribution to the utility from marriage. I take the median man and
woman in terms of all observable characteristics and compute how much of spouse’s
annual income they are willing to forgo to marry someone who is identical to their
spouse along all but one dimension. Columns (1) and (2) of Table 9 report the results.
For example, the median man whose facial grade is C is willing to forgo about 159 million
won to marry a person whose facial grade is two notches higher than the median woman
(i.e., facial grade C to A). The columns show that attractive facial features and
a desirable height provide a user’s spouse with sizable utility, which would have to
be offset by a large amount of income. To marry a spouse with facial grade A,
men are willing to forgo much larger spousal income than women, which is consistent
with the findings in Fisman et al. (2006) and Hitsch et al. (forthcoming). However,
interestingly, women’s willingness to pay to marry a spouse with the optimal height is
comparable to men’s. The amount of spousal income a user is willing to forgo in order
to marry someone with desirable educational attainment is similar to that for finding a
spouse with desirable “father’s educational attainment,” but less than that for having a
mate with facial grade A.
17 For high school educated men and women, preferences for spousal educational attainment depend
on the specification (see Tables A.2 and A.3 in the Appendix for further comparison).
Table 9: Trade-offs
The first row shows the preference ranking of partners varying education and facial grade but holding all other
conditions constant. The subsequent rows present the annual income of partners that the median men (or
women) are willing to forgo in order to change their partner’s characteristics.
                                         Baseline                Alternative I
                                     Median      Median       Median        Median
                                     men         women        men           women
                                     (1)         (2)          (3)           (4)
Facial grade
  C → A                              158.868     44.801       2,029.391*    103.337
  C → B                              54.579      5.981        248.095       10.481
  C → D                              -6.923      -14.895      -25.598**     -31.408
Height
  Median of the opposite sex         5' 4"       5' 8"        5' 4"         5' 8"
  Optimal height                     5' 6"       5' 10"       5' 5"         5' 11"
  Median → Optimal                   4.921       3.890        7.725         44.801
Education
  Univ → High (high school)          4.921       18.106       -13.973       -30.827
  Univ → Tech (technical college)    -1.940      -26.133      -3.733        -36.630
  Univ → Master’s/Ph.D.              -13.973     18.106       -25.598**     110.374
Father’s Education
  High → Tech (technical college)    11.845      9.718        67.949        -7.337
  High → Univ (university)           6.923       1.542        -6.293        21.432
  High → Master’s/Ph.D.              8.074       16.981       55.639        39.538
Unit: million won (roughly equivalent to 1,000 US dollars)
* Maximum value in the sample, ** Minimum value in the sample
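The trade-off calculation in this section can be illustrated with a small numerical sketch. It assumes, purely for illustration, that utility is additive in a strictly increasing function of spousal income and a trait term; the function `log_income_utility`, the trait gain, and the income bound are hypothetical and are not the estimated utility function of the paper. The sketch solves for the amount of spousal income whose loss exactly offsets a given gain in trait utility.

```python
import math


def income_forgone(income_utility, base_income, trait_gain, max_income, tol=1e-9):
    """Spousal income d a user would give up for a trait change worth `trait_gain`.

    Solves income_utility(base_income - d) + trait_gain == income_utility(base_income)
    by bisection, assuming income_utility is strictly increasing on [0, max_income].
    A negative result means the user must be compensated with extra income.
    """
    target = income_utility(base_income) - trait_gain
    lo, hi = 0.0, max_income
    if not income_utility(lo) <= target <= income_utility(hi):
        raise ValueError("no compensating income level inside [0, max_income]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if income_utility(mid) < target:
            lo = mid
        else:
            hi = mid
    return base_income - (lo + hi) / 2.0


# Hypothetical utility from spousal income (million won); NOT the paper's estimate.
def log_income_utility(income):
    return math.log(1.0 + income)


# Hypothetical trait gain, constructed so that the answer is about 10 million won.
gain = log_income_utility(30.0) - log_income_utility(20.0)
forgone = income_forgone(log_income_utility, 30.0, gain, max_income=200.0)
print(round(forgone, 6))
```

A negative `trait_gain` (a less preferred trait) yields a negative amount, which corresponds to the negative entries in Table 9: the user would need additional spousal income to accept the change.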
4.3 Goodness of Fit
To examine goodness of fit of the model, I present a subset of the 245 moments in Table
10. Columns (1) and (2) show the statistics from the data, the raw and the weighted ones,
respectively. For example, in the data, a male (female) user on average accepts a first
date with the probability of 0.325 (0.266). When we weight the sample to correct for the nonrandomness of proposals, the weighted probability of accepting a first date becomes
0.232 for men and 0.152 for women. Column (3) presents the predicted statistics based
on the estimated model. Overall, the model fits the data well, although the predicted
acceptance rate for a first date is slightly higher than the actual data.
Table 10: Model Fit
P(Y_1^M = 1) is the probability of a man accepting a first date; P(Y_2^M = 1 | Y_1^M = Y_1^W = 1) is the probability of a man
accepting a second date conditional on having a first date; P(Y_1^W = 1) and P(Y_2^W = 1 | Y_1^M = Y_1^W = 1) are similarly
defined for women. P(Y_3 = 1 | Y_1^M = Y_1^W = Y_2^M = Y_2^W = 1) is the probability of a pair getting married after two
dates. R(m,w) describes the outcome of a proposal among 8 possible events.
                                        Data (Company)            Prediction
                                    Unweighted   Weighted    Baseline   Alternative I   Alternative II
                                    (1)          (2)         (3)        (4)             (5)
P(Y_1^M = 1)                        0.325        0.232       0.204      0.204           0.208
P(Y_1^W = 1)                        0.266        0.153       0.139      0.138           0.139
P(Y_2^M = 1 | Y_1^M = Y_1^W = 1)    0.592        0.314       0.320      0.318           0.321
P(Y_2^W = 1 | Y_1^M = Y_1^W = 1)    0.515        0.286       0.287      0.290           0.293
P(Y_3 = 1 | Y_2^M = Y_2^W = 1)      0.051        0.021       0.021      0.027           0.021
5 Counterfactual Analysis
In this section, I examine the outcome of a proposal depending on whether the proposal is initiated by the matchmaking company or by another user. Then, I study the
implications of wider use of online search for marital sorting in Korea.
5.1 Role of Online Matchmakers in Mate Search
Around the world, a growing number of online mate-search services recommend potential mates to their users. Therefore, it is important to examine to what extent the
recommendation of a potential mate by an online matchmaker affects an individual’s
decisions regarding dating and marriage. In particular, if recommendations by online
matchmakers affect people’s marriage decisions, then the algorithms that the matchmakers use for recommendation can change search outcomes, including marital sorting. Ex
ante, it is not clear whether a matchmaker’s recommendations would have any impact
on people’s dating or marriage decisions because people are not obliged to follow the recommendations. On the other hand, since an individual cannot perfectly observe a potential
mate’s characteristics, the individual may form his/her expectation of the utility from
marriage with the potential mate depending on the channel through which the individual encounters the potential mate: whether a third party (e.g., a matchmaker) introduces
them or the potential mate contacts the individual.18
18 Studies on retirement saving decisions document that people are more likely to choose a saving
option selected as a default and suggest, as a possible cause of the finding, that people may perceive the
default option to be the best based on the assumption that the service provider has more information
than they do (e.g., Beshears et al., 2008).
Table 11: Role of Online Matchmakers in Acceptance Behavior
                                            Data (User)    Prediction
                                                           Baseline   Alternative I   Alternative II
                                            (1)            (2)        (3)             (4)
Decision for a first date
  - male recipients, P(Y_1^M = 1)           0~0.133        0.274      0.275           0.279
  - female recipients, P(Y_1^W = 1)         0~0.436        0.206      0.205           0.206
Decision for a second date
  - P(Y_2^M = 1 | Y_1^M = Y_1^W = 1)        0.573          0.571      0.569           0.572
  - P(Y_2^W = 1 | Y_1^M = Y_1^W = 1)        0.460          0.511      0.520           0.530
Decision for a marriage
  - P(Y_3 = 1 | Y_2^M = Y_2^W = 1)          0.087          0.038      0.049           0.036
In theory, a user in my dataset may perceive partners suggested by the company
as better because the company has more information about them, such as income and
family background, or because the user thinks a person with low pair-specific random
utility (e.g., an aggressive person) is more likely to ask another user out. Note that
in both examples, a user will be more likely to have a first date with a partner if the
partner was introduced by the company. Therefore, I empirically examine to what
extent the recommendation by an online matchmaker affects an individual’s decisions,
by investigating a user’s behavior regarding willingness to continue a relationship.
If a proposal is initiated by a user, not by the company, then the dataset does not
inform us who, of the two users, initiated the proposal. However, we can infer who did by
analyzing users’ responses. Among user-initiated proposals, 77.1 percent were accepted
for a first date by men but not by women (thus, male user-initiated); 12.8 percent were
not accepted by men but were accepted by women (thus, female user-initiated); and
the remaining 10.1 percent were accepted by both men and women (ambiguous case).
Therefore, we can compute the maximum (minimum) probability that a female user
would accept a male user-initiated proposal for a first date, by assuming that all the
ambiguous proposals are initiated by male (female) users. I report the range of the
actual acceptance rate in column (1) of Table 11 and the predicted rate if the company
introduced the two to each other in column (2). I find that a male user is at least 14
percentage points more likely to want to have a first date with a woman if the woman
is introduced by the company, as compared to the case where the woman directly asks
him out, although the results for females depend on how the ambiguous
proposals are classified. Conditional on having a first date, the probability of a proposal reaching
a second date or marriage remains similar regardless of who initiates the proposal. I
find qualitatively the same patterns regardless of whether a user and his/her partner
have the same trait in terms of education, father’s education, and industry. This finding
implies that the company’s recommendation increases the overall acceptance rate but
appears not to increase the acceptance rate for certain types of partners more than for others,
and thus does not change sorting patterns. This is consistent with the earlier finding in
Section 2.2.5 that the patterns of sorting among user-initiated proposals are similar
to those among company-initiated proposals.
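The bounding argument for user-initiated proposals can be written out as simple arithmetic. The sketch below uses only the three shares quoted in the text and assumes that every user-initiated proposal falls into exactly one of the three cases; the function name is hypothetical. The ranges reported in Table 11 may rest on a different sample or weighting, so this calculation illustrates the logic and is not meant to reproduce them.

```python
def acceptance_bounds(share_one_side_only, share_both):
    """Bounds on the rate at which recipients accept proposals started by one side.

    share_one_side_only: share of proposals accepted only by that side
                         (so that side must have initiated them)
    share_both:          share accepted by both sides (initiator unknown)
    """
    # Lower bound: every ambiguous proposal was initiated by the other side,
    # so none of this side's own proposals was accepted.
    lower = 0.0
    # Upper bound: every ambiguous proposal was initiated by this side.
    upper = share_both / (share_one_side_only + share_both)
    return lower, upper


# Shares of user-initiated proposals quoted in the text (percent).
male_only, female_only, both = 77.1, 12.8, 10.1

women_accepting = acceptance_bounds(male_only, both)   # male-initiated proposals
men_accepting = acceptance_bounds(female_only, both)   # female-initiated proposals
print(women_accepting, men_accepting)
```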
5.2 Online Mate Search and Marital Sorting
In the top panel of Table 12, I report the fraction of newlyweds in Korea who have
a spouse with the same characteristics. I find that marital sorting between 1991 and
2005 changed in the following ways: the probability of an individual marrying a spouse
whose trait is the same as his/her own has decreased for hometown, increased for region
and marital history (never-married vs. divorced), and remained similar for educational
attainment (high school or less vs. college or more). To gauge the extent to which
changes in the underlying distribution of people’s traits account for this time trend, I
perform the following exercise. For each trait, I first compute the fraction of couples who
have a common value for this trait if a man and a woman were randomly selected from
among the newlyweds. I then regress a dummy variable indicating whether a husband
and a wife have a common value for this trait on a constant, a calendar year, and this
fraction. The bottom panel of Table 12 presents the results of this regression for each
trait. The results suggest that changes in the underlying distribution of people’s traits
alone cannot account for the time trend.19
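The random-matching benchmark used in this exercise can be sketched as follows. The code computes the probability that a randomly drawn husband and a randomly drawn wife share a trait value, alongside the observed share of couples with a common value. The toy data and function names are hypothetical; in the regression described above, the benchmark would enter as a regressor next to a constant and the calendar year.

```python
from collections import Counter


def random_matching_share(husband_traits, wife_traits):
    """Probability that a randomly drawn man and woman have the same trait value."""
    men = Counter(husband_traits)
    women = Counter(wife_traits)
    n_men, n_women = len(husband_traits), len(wife_traits)
    # Sum over trait values of P(man has v) * P(woman has v).
    return sum((men[v] / n_men) * (women[v] / n_women) for v in men)


def observed_share(husband_traits, wife_traits):
    """Share of actual couples (paired by position) with a common trait value."""
    same = [h == w for h, w in zip(husband_traits, wife_traits)]
    return sum(same) / len(same)


# Hypothetical toy data: four couples, trait = education.
husbands = ["college", "college", "high", "high"]
wives = ["college", "high", "high", "high"]

print(observed_share(husbands, wives), random_matching_share(husbands, wives))
```

A gap between the two numbers indicates sorting beyond what the marginal distributions of traits alone would produce.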
19 The finding of a time trend of marital sorting is robust to probit and logit specifications.
Table 12: Aggregate Trend of Marital Sorting
Columns (1) to (4) in the top panel present the mean of a dummy variable indicating whether newlyweds in the
marriage registrar have the same education level, live in the same region, grew up in the same hometown and
have the same marital history, respectively. Numbers in brackets show the corresponding statistics under
“random matching.” Columns (1) to (4) in the bottom panel present a regression analysis, where the dependent
variable is the corresponding dummy variable in each column.
                               Education*        Region            Hometown          Marital History
                               (1)               (2)               (3)               (4)
Degree of Marital Sorting
  1991                         0.813 [0.539]     0.817 [0.303]     0.608 [0.235]     0.960 [0.899]
  1995                         0.791 [0.518]     0.852 [0.320]     0.611 [0.237]     0.953 [0.874]
  1998                         0.793 [0.503]     0.878 [0.331]     0.599 [0.238]     0.944 [0.828]
  2002                         0.786 [0.504]     0.890 [0.350]     0.567 [0.238]     0.931 [0.783]
  2005                         0.793 [0.534]     0.886 [0.356]     0.548 [0.238]     0.935 [0.777]
Regression: OLS
  Sorting (random matching)    0.768 (0.002)     0.109 (0.002)     1.085 (0.006)     0.335 (0.001)
  Year/10                      -0.008 (0.001)    0.049 (0.001)     -0.047 (0.001)    0.012 (0.000)
  Observations                 1,246,887         1,250,266         1,250,266         1,247,450
  R-squared                    0.0881            0.0079            0.0298            0.1123
* “High school or less” vs “college or more”
I perform two exercises to examine the possibility that wider use of online mate-search services accounts for the time trend in marital sorting. Note that in South Korea,
the fraction of newlyweds who met their spouse through online mate-search services
increased from nearly zero in 1991 to 8 percent in 2005. In the first exercise, I take an
average man and an average woman among the newlyweds in the population and compute their
likelihood of marrying a spouse with the same traits in the population (column (1) of Table
13). I also compute the corresponding statistics for people who have the same characteristics as the average individuals, but use the online matchmaking company (column (3)
for the average man and column (4) for the average woman). I compute the statistics
in column (3) by weighting the male users to match the distribution of newlywed men
in the population in terms of education, marital history, region, hometown, industry,
and age group. Similarly, I compute the statistics in column (4) for women. Comparing
column (1) with columns (3) and (4) shows that if the average person in the population
and the corresponding person among the users have the same marital preferences, then
using online mate-search services will change his or her likelihood of marrying a spouse
with the same trait in the following way: he/she is less likely to marry a spouse of
the same type in terms of hometown and industry, but more likely to marry a spouse
with the same marital history. The prediction for the probability of marrying a spouse
with the same educational attainment varies by sex. Therefore, the wider adoption of
online mate-search services could explain the time trend of marital sorting since 1991.
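The reweighting step in the first exercise can be sketched as a simple post-stratification. Each user receives a weight equal to the population share of his or her cell divided by the sample share of that cell, so that the weighted sample reproduces the population distribution. The cell labels, population shares, and outcome values below are hypothetical; in the paper the cells are defined by education, marital history, region, hometown, industry, and age group, and the exact weighting scheme may differ.

```python
from collections import Counter


def poststratification_weights(sample_cells, population_shares):
    """Weight for each sampled user: population share / sample share of the user's cell."""
    n = len(sample_cells)
    sample_shares = {cell: count / n for cell, count in Counter(sample_cells).items()}
    return [population_shares[cell] / sample_shares[cell] for cell in sample_cells]


def weighted_mean(values, weights):
    """Weighted average of `values`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical sample in which college graduates are over-represented.
cells = ["college", "college", "college", "high"]
population = {"college": 0.5, "high": 0.5}

# Hypothetical outcome: 1 if the user's spouse has the same trait, 0 otherwise.
same_trait = [1, 1, 0, 0]

weights = poststratification_weights(cells, population)
print(weighted_mean(same_trait, weights))
```

In this toy sample the unweighted mean is 0.5, while the weighted mean is lower because the over-represented college graduates are weighted down.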
Table 13: Adoption of Online Marriage Market Intermediaries and Marital Sorting
Columns (1) and (3) show the marital sorting in the population. Column (2) shows the predicted marital sorting in 1991, if the OLS results in Table 2 are
employed and the underlying distribution of traits is the same as in 2005. Column (4) shows calculated marital sorting in the case that nobody found their spouse
via online matchmaking services in 2005, and column (5) computes the fraction of changes accounted for by the use of online matchmaking services. Columns
(6) and (7) present marital sorting when the entire population uses online matchmaking services. The difference between the two is whether the simulation allows
users to prefer being single rather than marrying an available partner (column (6)) or not (column (7)).
                             Actual         Prediction
                                            1991 level    Weighted    Weighted    Male-opt.    Female-opt.
                                            popularity    (men)       (women)     stable       stable
                                                                                  matching     matching
                             (1)            (2)           (3)         (4)         (5)          (6)
Fraction of couples with
  - same education*          0.790          0.792         0.819       0.617       0.605        0.606
  - same marital history     0.920          0.828         0.970       0.987       0.942        0.942
  - same region              0.891          0.832         0.812       0.905       0.519        0.521
  - same hometown            0.560          0.622         0.307       0.542       0.370        0.371
  - same industry            0.365~0.400    -             0.131       0.106       0.185        0.187
* “High school or less” vs “college or more”
In the second exercise, I use the estimated preferences to compute the male-optimal
stable matching with the Gale-Shapley algorithm (1962) and calculate the probability
of the average individual marrying a spouse with the same traits. I use the male-optimal stable matching from the Gale-Shapley algorithm because I find it generates
sorting comparable to the sorting among users of the online matchmaking service who
ultimately marry (see Appendix D). This exercise also allows me to simulate marriages
where the distribution of traits for both men and women in the matchmaking company
is representative of the population, whereas in the results of the previous analysis, the
distribution of only one sex’s traits is representative. To do the second exercise, I first
sample the users of the matchmaking company with weights to match the distribution
of characteristics in the sample with their distribution in the population. I assume
that people have two dates with all possible candidates and rank them based on the
observables and random shocks. Thus, for each possible pair of a man and a woman,
I draw a set of random shocks that are a mixture of noisy signals of a partner’s type and a
pure random component in reservation utility. I then compute the expected utility from
marrying each other and construct people’s preference rankings over all potential mates.
With the computed expected utilities, I compute the male-optimal stable matching
by the Gale-Shapley algorithm (1962) and the marital sorting among the hypothetical
marriages. I iterate this process an additional nine times and report the average statistics
of marital sorting in column (5) of Table 13. Just as in the first exercise, the use of online
matchmaking services generates less sorting along hometown and industry and more
sorting along marital history, as compared to the actual sorting in the population. This
finding suggests that marital sorting in the population has become similar to the sorting
among users of the online matchmaking company. In Korea, the two largest companies,
one of which provided my dataset, account for more than 80 percent of sales in the
industry (Korea Fair Trade Commission, 2004), and the other leading company is known
to have business practices similar to the source of my dataset. Therefore, the wider
use of online mate-search services may account for the changes in marital sorting in the
population from 1991 to 2005.
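The matching step of the second exercise can be illustrated with a minimal implementation of men-proposing deferred acceptance (Gale and Shapley, 1962), which yields the male-optimal stable matching. The utilities and traits below are hypothetical toy values; in the paper, the expected utilities come from the estimated model plus simulated random shocks, and the procedure is repeated over several draws. This sketch also assumes that everyone prefers any partner to remaining single.

```python
def male_optimal_stable_matching(men_utility, women_utility):
    """Men-proposing deferred acceptance.

    men_utility[m][w]:   utility of man m from marrying woman w
    women_utility[w][m]: utility of woman w from marrying man m
    Returns a dict mapping each matched man to his wife.
    """
    n_men, n_women = len(men_utility), len(women_utility)
    # Each man's list of women, from most to least preferred.
    prefs = [sorted(range(n_women), key=lambda w: -men_utility[m][w])
             for m in range(n_men)]
    next_choice = [0] * n_men      # index of the next woman each man proposes to
    husband = [None] * n_women     # current provisional partner of each woman
    free = list(range(n_men))
    while free:
        m = free.pop()
        if next_choice[m] >= n_women:
            continue               # list exhausted: the man stays single
        w = prefs[m][next_choice[m]]
        next_choice[m] += 1
        current = husband[w]
        if current is None:
            husband[w] = m
        elif women_utility[w][m] > women_utility[w][current]:
            husband[w] = m         # the woman trades up; her old partner is free again
            free.append(current)
        else:
            free.append(m)         # rejected; he proposes to his next choice later
    return {m: w for w, m in enumerate(husband) if m is not None}


def same_trait_share(matching, men_traits, women_traits):
    """Fraction of matched couples whose trait values coincide."""
    same = sum(men_traits[m] == women_traits[w] for m, w in matching.items())
    return same / len(matching)


# Hypothetical utilities: each man prefers a different woman, while each woman
# prefers the other man, so the male- and female-optimal matchings differ.
men_utility = [[2.0, 1.0], [1.0, 2.0]]
women_utility = [[1.0, 2.0], [2.0, 1.0]]
matching = male_optimal_stable_matching(men_utility, women_utility)
print(matching, same_trait_share(matching, ["college", "high"], ["college", "college"]))
```

Swapping the roles of the two utility matrices gives the women-proposing version, which corresponds to the female-optimal stable matching in column (6) of Table 13.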
6 Robustness Checks
6.1 Range of the Type 1 Learning
To a certain extent, a user can obtain information about another user’s traits not included
in the online database if the user contacts the company’s staff member assigned to
him/her. I estimate an alternative model (referred to as Alternative I) in which users
can see all observable characteristics of other users prior to a first date. I find that, as
shown in column (3) of Table 11 and column (2) of Table 14, the estimated Alternative
I generates results for the counterfactual analysis quantitatively comparable to those of the
baseline model, although the estimated preferences for spousal educational attainment
are not the same as those in the baseline (see Appendix Table A.2).20
20 In particular, in the baseline model, a college educated woman prefers a man with a master’s degree
or Ph.D. the most, then a high school graduate, followed by a college graduate and a graduate of a
technical college. On the other hand, in the alternative model, the preference ranking between a college
graduate and a high school graduate is reversed.
Table 14: Robustness Checks
This table presents simulation results using alternative estimates. Columns (1) and (2) are observed marital sorting and the baseline model prediction when the
entire population uses the online matchmaking services. Columns (3) to (8) present results using estimates based only on first-date decisions, and columns (…)
employ a fixed-effects linear probability model. Column (4) uses the response of partners as a control variable; column (5) uses the duration of using the
matchmaking company as a control variable; column (7) allows the possibility that high school educated and college educated may have different utility
functions; and column (8) allows for the possibility that people may value the degree of dissimilarity between their own and partner’s traits differently depending
on the direction of dissimilarity.
… based on male optimal …
                              Baseline    Alternative Models
                                          I             II            III           IV
                                          (Sec. 6.1)    (Sec. 6.2)    (Sec. 6.3)    (Sec. 6.5)
                              (1)         (2)           (3)           (4)           (5)
Percentage of couples with
  - same education*           0.605       0.591         0.557         0.560         0.549
  - same marital history      0.942       0.944         0.945         0.754         0.812
  - same region               0.519       0.470         0.461         0.433         0.402
  - same hometown             0.370       0.303         0.329         0.287         0.273
  - same industry             0.185       0.163         0.174         0.157         0.158
* “High school or less” vs “college or more”
6.2 Lifetime Income
It is possible that people value the lifetime income prospects of their spouse, rather
than the spouse’s current income per se. I estimate another model (Alternative II) in
which a user’s present discounted value of lifetime income (PDV) replaces the user’s
income. To compute the PDV, I use WS, a population income survey, to estimate the
growth rate of income and the probability of being laid off, specific to sex, education,
age, and industry. Based on the estimation results, I compute the PDV as the sum of
annual expected income discounted by the average interest rate (see Appendix C.3). The
estimated model yields results for the counterfactual analysis similar to the baseline
case: see column (4) of Table 11 and column (3) of Table 14.
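One way to read the PDV construction in Section 6.2 is sketched below. The exact formula is given in Appendix C.3 and is not reproduced here, so the functional form is an assumption: income grows at a constant rate, a layoff is treated as permanent (income is zero afterwards), and the horizon is fixed. All parameter values are hypothetical.

```python
def present_discounted_value(income, growth, layoff_prob, interest, years):
    """Sum of expected annual income, discounted at a constant interest rate.

    Assumptions (illustrative, not the paper's Appendix C.3):
      - income grows at rate `growth` each year,
      - each year the person is laid off with probability `layoff_prob`
        and earns nothing from then on,
      - the horizon is `years` years, starting with the current year (t = 0).
    """
    pdv = 0.0
    for t in range(years):
        still_employed = (1.0 - layoff_prob) ** t
        expected_income = income * (1.0 + growth) ** t * still_employed
        pdv += expected_income / (1.0 + interest) ** t
    return pdv


# Hypothetical example: 30 million won today, 3% growth, 2% layoff risk,
# 5% interest rate, 25 working years remaining.
print(round(present_discounted_value(30.0, 0.03, 0.02, 0.05, 25), 3))
```

In the paper the growth rate and the layoff probability are specific to sex, education, age, and industry, so each user would be assigned his or her own parameter values.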
6.3 Selection on Willingness to Marry
It is possible that the distribution of people’s willingness to marry in my sample may
not be the same as the population’s distribution. Since people’s willingness to marry
is individual-specific, we can remove the potential source of bias by using a model with
individual-fixed effects. In particular, I use a linear-probability model with individual
fixed effects to recover marital preferences and then re-perform the counterfactual analysis. For simplicity, I use only first-date outcomes because Alternative I generates similar
results to the baseline model, and parameters in Alternative I relevant to this robustness check are identifiable by analyzing the first date decision. Column (4) of Table 14
presents the results, which are comparable to the baseline results.
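The role of individual fixed effects can be illustrated with a within estimator for a linear probability model with a single regressor. Demeaning both the outcome and the regressor within each user removes any user-specific intercept, such as an overall willingness to marry. The data and the choice of a single regressor (whether the partner has the same education) are hypothetical; the model estimated in the paper includes many more characteristics.

```python
from collections import defaultdict


def within_estimator(user_ids, x, y):
    """Slope of y on x after subtracting user-specific means (one regressor)."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])   # per user: sum of x, sum of y, count
    for u, xi, yi in zip(user_ids, x, y):
        totals[u][0] += xi
        totals[u][1] += yi
        totals[u][2] += 1
    num = den = 0.0
    for u, xi, yi in zip(user_ids, x, y):
        sum_x, sum_y, n = totals[u]
        x_dev = xi - sum_x / n                    # deviation from the user's own mean
        y_dev = yi - sum_y / n
        num += x_dev * y_dev
        den += x_dev * x_dev
    return num / den


# Hypothetical data: user "a" accepts first dates often, user "b" rarely.
users = ["a", "a", "a", "a", "b", "b", "b", "b"]
same_education = [1, 0, 1, 0, 1, 0, 0, 1]        # 1 if the partner has the same education
accepted = [1, 1, 1, 0, 1, 0, 0, 0]              # 1 if the user accepted a first date

print(within_estimator(users, same_education, accepted))
```

Because user "a" and user "b" differ in their baseline acceptance rates, the fixed effect absorbs that difference and the slope is identified only from variation within each user's own proposals.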
6.4 Selection on Marital Preferences
It is possible that the preferences for spousal traits among the users are very different from those in the population, but that marital sorting in the population has, by chance,
become more like that among the users. Although directly addressing this problem is
impossible given the available data, I attempt to measure the importance of this bias
as follows. I calibrate the parameter governing the preferences for a spouse working in the same industry, in order to match the marital sorting among marriages based on the
Gale-Shapley algorithm with the actual marital sorting in the population. I then examine
the implication of the magnitude of the calibrated parameter. I find that the
magnitude of the calibrated parameter suggests that for both men and women, working in the same industry outweighs a spouse’s appearance, educational attainment, and
many other important characteristics. This implication appears to be implausible since,
in several surveys of people’s priorities for spousal traits (e.g., Gallup Korea, 2007 and
DUO, 2007), single Koreans rarely indicate that working in the same industry is an
important factor in the marriage decision. Therefore, selection on marital preferences may
not entirely account for the difference in marital sorting between the population and the users
of online mate-search services.
6.5 Heterogeneous Marital Utility Function
I relax the assumption in the baseline model that all men (or women) have the same
marriage utility function. In particular, I use a model that allows high school graduates
and college graduates to value their spouse’s educational attainment differently, but is
otherwise the same as Alternative I in Section 6.3. The results of this counterfactual
analysis using the alternative estimates are shown in column (5) of Table 14 and are
comparable to the baseline results in column (1) of the table.
6.6 Possibility of Polygamy
One may be concerned that the estimated model does not restrict users from having
multiple marriage partners, although polygamy is illegal in Korea. To check whether
the absence of this restriction is quantitatively important, I simulate users’ responses
to the company-initiated proposals by drawing expected utility from marrying partners
using the estimates in Section 4. I find that polygamy rarely occurs: averaging across
ten simulations, only two out of over 14,000 users have multiple spouses.
7 Conclusion
The findings of this paper suggest several directions for future research. Using the
estimated users’ preferences, we can design an alternative algorithm to improve the
probability of a proposal resulting in an actual date or marriage. Recently, the company
partially adopted the estimation results to revise its algorithm for introducing two users.
The new algorithm increased the probability of a proposal turning into an actual first
date by a factor of 2, suggesting that the potential gain from designing an alternative
matching algorithm can be large. Next, my estimation results show that parental socioeconomic status directly affects an individual’s marriage decisions, even after controlling
for the individual’s socioeconomic status. Therefore, it may be useful to examine intergenerational mobility in an environment in which parental socioeconomic status partially
determines not only a child’s educational attainment but also the child’s marriage. Note
that previous studies have focused on the first mechanism but not on both. Finally,
using a full dynamic search framework to analyze a two-sided search market, such as
a marriage market, will be interesting, and I leave the task of extending my model to
incorporate the full dynamic search aspects for future research.
References
Abramitzky, Ran, Adeline Delavande, and Luı́s Vasconcelos, “Marrying Up:
the Role of Sex Ratio in Assortative Matching.” 2009, Working Paper, Stanford
University.
Angrist, Joshua, “How Do Sex Ratios Affect Marriage And Labor Markets? Evidence
From America’s Second Generation,” The Quarterly Journal of Economics, August
2002, 117 (3), 997–1038.
Autor, David, “Wiring the Labor Market,” Journal of Economic Perspectives, 2001,
15 (1), 25–40.
Autor, David, “The Economics of Labor Market Intermediation: An Analytic Framework,” in
David Autor, ed., Studies of Labor Market Intermediation, Chicago: University of
Chicago Press, 2008.
Bagues, Manuel F. and Mauro Sylos Labini, “Do On-Line Labor Market Intermediaries Matter? The Impact of AlmaLaurea on the University-to-Work Transition,”
in David Autor, ed., Studies of Labor Market Intermediation, Chicago: University of
Chicago Press, 2008.
Banerjee, Abhijit, Esther Duflo, Maitreesh Ghatak, and Jeanne Lafortune,
“Marry for What? Mate Selection in Modern India,” 2009. Working Paper, MIT.
Beshears, John, James Choi, David Laibson, and Brigitte Madrian, “The
Importance of Default Options for Retirement Saving Outcomes: Evidence from
the United States,” in Stephen J. Kay and Tapen Sinha, eds., Lessons from Pension
Reform in the Americas, Oxford: Oxford University Press, 2008, pp. 59–87.
Bisin, Alberto, Giorgio Topa, and Thierry Verdier, “Religious Intermarriage
and Socialization in the United States,” Journal of Political Economy, 2004, 112,
615–664.
Brien, Michael J., Lee A. Lillard, and Steven Stern, “Cohabitation, Marriage,
and Divorce in a Model of Match Quality,” International Economic Review, 2006,
47, 451–494.
Charles, Kerwin, Liqian Ren, and Erik Hurst, “The Nature and Consequences of
Marital Sorting by Parental Wealth.” 2006, Working Paper, University of Chicago.
Chernozhukov, Victor and Han Hong, “An MCMC Approach to Classical Estimation,” Journal of Econometrics, 2003, 115, 293–346.
Choo, Eugene and Aloysius Siow, “Who Marries Whom and Why,” The Journal of
Political Economy, February 2006, 114 (1), 175–202.
DUO, “Survey of Priorities for Spousal Traits,” Technical Report 2007.
Fernández, Raquel, Nezih Guner, and John Knowles, “Love and Money: A Theoretical and Empirical Analysis of Household Sorting and Inequality,” The Quarterly
Journal of Economics, January 2005, 120 (1), 273–344.
Fisman, Raymond, Sheena S. Iyengar, Emir Kamenica, and Itamar Simonson, “Gender Differences in Mate Selections: Evidence from a Speed Dating Experiment,” The Quarterly Journal of Economics, May 2006, 121, 673–679.
Fisman, Raymond, Sheena S. Iyengar, Emir Kamenica, and Itamar Simonson, “Racial Preferences in Dating: Evidence from a Speed
Dating Experiment,” Review of Economic Studies, 2008, 75, 117–132.
Gale, David and Lloyd S. Shapley, “College Admissions and the Stability of Marriage,” The American Mathematical Monthly, January 1962, 69 (1), 9–15.
Gallup-Korea, “Survey on Ideal Spouse among Korean,” Technical Report 2007.
Gibbons, Robert, Lawrence F. Katz, Thomas Lemieux, and Daniel Parent,
“Comparative Advantage, Learning, and Sectoral Wage Determination,” Journal
of Labor Economics, October 2005, 23 (4), 681–723.
Hitsch, Günter J., Ali Hortaçsu, and Dan Ariely, “What Makes You Click? Mate
Preferences and Matching Outcomes in Online Dating,” 2006. Working Paper,
University of Chicago.
Hitsch, Günter J., Ali Hortaçsu, and Dan Ariely, “Matching and Sorting in Online Dating Markets,” American
Economic Review, Forthcoming.
Korea Consumer Association, Survey of Matchmaking Services Providers 2000.
Korea Labor Institute, Labor Statistics, The Korean Labor Institute, 2007.
Korea Marriage Culture Institute, Survey of the Korean Marriage Culture 2005.
Korean Agency of Technology and Standards, Survey of Physical Traits of Koreans 2004.
Kuhn, Peter and Mikal Skuterud, “Internet Job Search and Unemployment Durations,” American Economic Review, March 2004, 94(1), 218–232.
Kurzban, Robert and Jason Weeden, “HurryDate: Mate Preferences in Action,”
Evolution and Human Behavior, 2005, 26 (3), 227–244.
Madden, Mary and Amanda Lenhart, “Online Dating,” Technical Report,
PEW/INTERNET March 2006.
Niederle, Muriel and Alvin E. Roth, “Unraveling Reduces Mobility in a Labor
Market: Gastroenterology with and without a Centralized Match,” The Journal of
Political Economy, December 2003, 111 (6), 1342–1352.
Niederle, Muriel and Alvin E. Roth, “The Effects of a Central Clearinghouse on Job Placement, Wages, and
Hiring Practices,” in David Autor, ed., Studies of Labor Market Intermediation, The
University of Chicago Press, 2008.
Niederle, Muriel and Leeat Yariv, “Matching Through Decentralized Markets,” 2008. Working
Paper, Stanford University.
Parent, Daniel, “Matching, Human Capital, and the Covariance Structure of Earnings,” Labour Economics, 2002, 9 (3), 375–404.
Pollever, Survey of Korean Marriage 2004.
Republic of Korea. Fair Trade Commission, Press Release March 2004.
Republic of Korea. Ministry of Labor, Basic Statistical Survey of Wage Structure
1994 – 2006.
Republic of Korea. Ministry of Labor, Labor Demand Survey 1994 – 2006.
Republic of Korea. National Statistical Office, National Population and Fertility
Survey 1991 – 2005.
Republic of Korea. National Statistical Office, National Household Income and Expenditure Survey 2002 – 2005.
Wong, Linda Y., “Structural Estimation of Marriage Models,” Journal of Labor Economics, July 2003, 21 (3), 699–727.