Clustered Institutional Holdings and Stock Comovement

ZHENG SUN1
First Draft: January, 2007
This Draft: April, 2008
Abstract
Previous literature has found that stock returns comove more than fundamentals. More
recently, researchers have also found commonalities in liquidity and trading activity. In
this paper, I document the role of institutional clienteles in comovement. To define
clienteles, I take an innovative approach based on applying hierarchical clustering
algorithms to institutional holdings. I find the majority of institutional investors can be
stably clustered into a small number of clienteles. These clienteles seem to play an
important role in explaining comovement. Stocks held by the same clientele comove
excessively in trading volume, return, and liquidity. Lastly, I provide a channel through
which clientele effects generate comovement. Funds within the same clientele seem to
suffer correlated liquidity shocks. These shocks generate correlated order flow in the
underlying stocks, inducing comovement in return and liquidity for these stocks.
1
I am highly indebted to my dissertation committee: Joel Hasbrouck, Robert Whitelaw, Robert Engle and
Lasse Pedersen. I also thank participants of the finance seminars at NYU, MIT, Ohio State University,
University of California, Irvine, University of Texas, Austin for helpful comments and suggestions. All the
remaining errors are mine.
Electronic copy available at: http://ssrn.com/abstract=1332201
I. Introduction
This paper studies what drives stock comovement. Previous literature has extensively
documented how fundamentals drive comovement. But there is also considerable
evidence that stock returns move together even when fundamentals
are not at work. The puzzling findings are not limited to prices
and returns. More recently, researchers have also found commonality in liquidity and in
trading behavior.
In trying to understand these findings, the current literature suggests
that the degree of excess comovement is related to the degree of
institutional ownership. Stocks with a higher proportion of institutional ownership seem to
comove more excessively.2 This paper also looks at excessive stock comovement from
the angle of institutional investment. But instead of focusing on the degree of institutional
ownership, I focus on the differences among institutional holdings. Using these
differences, I show institutional investors group into distinct clienteles, and the existence
of clienteles plays an important role in explaining excessive comovement.
In contrast to traditional asset pricing theories in which the marginal investor in every
stock is the same broadly diversified representative agent, clientele theories assume each
stock has a body of holders who find it attractive. Clientele theories imply limited market
participation. The marginal investor in a particular asset market is a specialized investor
rather than the diversified representative agent. 3 In such a world, the set of potential
buyers for an asset is much smaller. When one investor sells part of its portfolio due to a
liquidity shock or preference change, those who are willing to buy without a big price
concession probably also specialize in the same set of stocks. It takes longer for investors
who do not specialize in these stocks to learn about them. Thus, these “outside” investors
would ask for a bigger price concession. In this case, an asset’s price changes reflect the
2
Pindyck and Rotemberg (1993) suggest excessive return comovement is related to institutional ownership.
Kamara, Lou, and Sadka (2006) suggest the degree of institutional ownership is related to the degree of liquidity
commonality.
3
Merton (1987) presents a model in which segmentation arises endogenously, and explores the implication
of market segmentation for asset prices. Allen and Gale (1994) study an environment in which traders must
specialize in a certain asset market ex ante. This leads to limited market participation ex post. Moreover,
the wealth level of the specialized traders is critical to setting prices. Barberis and Shleifer (2003) show
how style investment can generate excessive comovement.
preferences or wealth changes of the investors who specialize in it. Defining investors
and funds holding similar stocks as a clientele, different clienteles effectively define
different marginal investors. Consequently, returns of two stocks that share the same
marginal investors tend to comove more than returns of two stocks held by different
marginal investors. Furthermore, when a liquidity shock hits a marginal investor, it likely
triggers a liquidity shock for the group of stocks in the marginal investor’s portfolio. This
mechanism simultaneously increases the trading volume and decreases the liquidity of
this group of stocks.
This paper shows that the existence of distinct institutional clienteles can partially
explain commonalities in trading volume, return, and liquidity. First, I show the majority
of institutional investors can be stably clustered into a small number of clienteles. Second,
I show stocks held by the same clientele comove excessively in trading volume, return,
and liquidity. Here, I use excess comovement to mean comovement unexplained by
factors commonly thought to generate commonality. Lastly, I provide a channel through
which clientele effects generate comovement. I show funds within the same clientele
suffer from correlated fund withdrawals, which in turn trigger reactions that induce
comovement among stocks underlying these funds’ portfolios.
The literature has generally ignored the differences among institutional investors
because it is difficult to quantify these differences. In the first part of this paper, I attack
this problem through examining the differences among these investors’ portfolio holdings.
Specifically, I apply hierarchical clustering techniques to institutional holdings data to
group together funds that hold similar portfolios. I find the majority of institutional
investors can be clustered into a few distinct groups. The largest ten groups contain
more than 50% of the institutional investors in terms of total number and more than 80%
in terms of total assets under management. This clustering is stable over time; funds that
group together in one year are likely to group together in the following years. As funds in
each stable cluster hold similar portfolios through time, I will from here on refer to the
funds that share a cluster as a clientele.
The existence of distinct clienteles is most interesting if it has asset pricing
implications. In the second part of the paper, I find the clustering of institutional investors
actually induces commonality in stock trading volume, return, and liquidity relative to the
standard factor models. Taking returns as an example, a one standard deviation increase
in the cluster level return is associated with a 20 bps increase in individual daily return.
So I essentially identify a cluster effect for stocks. Previous research has also found
excessive return comovement 4 and commonality in liquidity and trading activity 5 in
equity markets, and there is some evidence that commonality in order flow can partially
explain commonality in stock returns 6 . However, up to now, no one has tried to
synthesize all the empirical evidence of excess commonality. Through clientele effects,
this paper provides a simple channel that can partially account for all of the previous
comovement related findings.
To understand why stocks comove excessively on the clientele level, in the third and
last part of this paper, I test whether the fact that specialized institutional investors are
wealth constrained can explain the observed clientele level comovement effects.7 I will
refer to this channel as the wealth constraint channel. I focus on testing this particular
channel because of the special organizational feature of the asset management industry.
In this industry, retail or other uninformed investors delegate their investment decisions
to fund managers, but they can often put money in or pull money out of funds freely. For
example, it is well documented that mutual fund investors actively chase past
performance; 8 thus, fund managers may become wealth constrained upon bad past
performance.
The wealth constraint channel can induce excessive clientele level stock comovement
if all funds in a cluster experience liquidity shocks around the same time and if funds
outside the cluster demand a larger premium to hold stocks not usually in their portfolios.
As pointed out by Allen and Gale (1994), limited participation by itself does not explain
4
Both Karolyi and Stulz (1996) and Connolly and Wang (1998) find that macroeconomic
announcements and other public news do not affect the comovements of the Japanese and American stock
markets. King and Wadhwani (2000) also find that observable economic variables explain only a small
fraction of international stock market comovements.
5
Chordia et al. (2000) find that quoted spreads, quoted depth, and effective spreads comove with
market- and industry-wide liquidity; Huberman and Halka (2001) find evidence of commonality for quotes
and depth.
6
Hasbrouck and Seppi (2001) find that both returns and order flow are characterized by common
factors, and that commonality in order flow explains two-thirds of the commonality in returns.
7
Shleifer and Vishny (1997) are the first to emphasize inter-temporal wealth effects of financial
constraint. Gromb and Vayanos (2002), Kyle and Xiong (2001), and Yuan (2005) model a large drop in
prices caused by distressed liquidation of assets by hedge funds.
8
For early results concerning the relationship between fund flow and performance, readers can consult
Ippolito (1992), Chevalier and Ellison (1997) and Sirri and Tufano (1998).
excessive price movement. Liquidity in the market depends not only on the number of
investors who participate, but also on the amount of cash they hold. Even if the market is
thin, as long as a couple of informed investors hold enough cash, they will provide
liquidity at a small premium. Excessive comovement results when all the informed
investors are wealth constrained.
I test the wealth constraint channel in three steps. First, I show that money flows of
funds in the same clientele are highly correlated. Next, I show that funds’ investment
behaviors are affected by large fund flows. When funds suffer large outflows, they
dramatically increase the number of stocks they sell. The first two steps show that there is
correlated liquidity demand due to large money withdrawals. In the last step, I show, as the
wealth constraint story would predict, this correlated liquidity demand partially explains
clientele level comovement. Clientele level comovement significantly increases when
funds in the clientele face large outflows. Moreover, this effect is asymmetric. I do not
find increased comovement when funds receive large inflows.
The rest of this paper proceeds as follows: Section II takes a brief look at the related
literature; Section III summarizes the data used in this paper; Section IV describes my
clustering methodology and characterizes the properties of the clusters obtained; Section
V shows how fund clustering induces the aforementioned stock commonalities; Section
VI links cluster level comovement to wealth constraint effects; Section VII lays out
possible future research and concludes.
II. Literature
This paper’s findings add to the fast-growing literature that emphasizes how frictions
or market sentiment can de-link return comovement from comovement of news about
fundamentals. For example, Pindyck and Rotemberg (1993) find excess return
comovement relative to the stocks’ exposure to macroeconomic factors. They show this
excess return comovement can be explained by the stocks’ underlying proportion of
institutional ownership. Lee, Shleifer, and Thaler (1991) show that prices of stocks with
low proportions of institutional ownership and prices of small stocks move together with
closed-end fund discounts, which are held predominantly by individuals. Both these
papers stress the market segmentation between the individual and institutional investors.
In this paper, I show different clienteles exist among institutional investors; in effect, I
argue there is also segmentation among institutional investors. Barberis, Shleifer, and
Wurgler (2005) study comovement of stocks within the S&P 500 index. They show when
a stock is added to the S&P index, its beta with respect to the S&P increases. As the
authors point out, it would be hard to argue that the stock’s addition is correlated with
any change in fundamentals. They suggest one possible explanation might be that certain
investors prefer certain investment habitats. As these investors’ risk aversions, sentiments,
or liquidity needs change, they adjust their portfolios’ exposure to the securities in their
habitats, thereby inducing a common factor underlying these securities’ returns. Unlike
Barberis, Shleifer, and Wurgler (2005), who only focus on S&P stocks and index changes,
this paper looks at almost all stocks in the investment universe. That I find a few stable
fund clusters indicates that the habitat argument works for a much broader set of stocks.
The findings in this paper also contribute to the growing literature that documents and
explains liquidity commonality. The mechanism proposed here is demand-driven, which
differs from the supply-side explanations prevalent in the literature. Coughenour and Saad
(2004) and Newman and Rierson (2004) find that bid-ask spreads of stocks supported by
the same market makers move together. This spread comovement can be due to the
market maker using information across stocks or to his managing the inventory risk of the
combined portfolio. Brunnermeier and Pedersen (2005) propose a third mechanism for a
market maker to transfer liquidity shocks between stocks. Their story relies on margin
calls to the liquidity provider forcing him to rebalance his portfolio. In contrast, the
mechanism stressed in this paper generates liquidity commonality from order flow
commonality. This mechanism does not require the existence of a common market maker
for the underlying stocks. Furthermore, I find empirical evidence showing that the
commonality of order flow is not solely driven by information. At least a part of it is
induced by the same group of marginal investors liquidating their portfolios.
Another strand of literature in which this paper fits examines the relationship between
asset management industry fund flows and stock returns. Edelen (1999) finds substantial
positive cross-correlation in fund flows, possibly indicating the existence of a common
factor behind fund flows. In this paper, I provide a natural candidate for such a factor,
namely the clustering of institutional holdings. Boyer and Zheng (2004) find that the
quarterly contemporaneous relations between fund flows and returns are positive and
significant for Mutual Funds, Foreigners, Pension Funds and Insurance Companies.
Moreover, they find that the quarterly contemporaneous covariances are driven mainly by
strong contemporaneous monthly relations. Boyer and Zheng’s results suggest that these
sectors may exert price pressure on the market through their demand for stocks. The price
impact appears to be temporary and is reversed in the subsequent months. Similar effects
are found in Coval and Stafford (2005) and Frazzini and Lamont (2005). The former
paper finds that funds experiencing large outflows (inflows) tend to decrease (increase)
their existing positions. This strategy change creates price pressure in the securities held
in common by these funds. The latter paper uses mutual fund flow as a measure of
individual investor sentiment for different stocks, and finds that high investor sentiment
predicts low future returns at long horizons. Both these studies and this paper conjecture
that funds incur liquidity shocks around the same time due to their common holdings.
Unlike this paper, however, they focus on individual stocks and thus cannot directly test
the similarity of funds’ portfolios. Moreover, they do not provide evidence of correlation
of fund flows among similar funds.
Finally, from a methodological point of view, this paper documents similarity among
fund managers’ holdings and designs a new method to classify institutional investors
based on their holding information. Because I look at holdings rather than rely on self-reported information, my method can be thought of as a cleaner classification of
investment style. This classification procedure is especially useful given that a large
number of institutional investors do not report investment styles. It should prove to be
more useful as more institutional investing information becomes available.
III. Data
The main questions I ask in this paper are: (1) whether several distinct clienteles exist
among institutional investors in the equity market, and (2) whether such a clientele effect
can help to explain the commonality of market trading behavior, returns, and liquidity.
The existing literature provides little help on these questions because of data issues, as
well as the difficulty of quantifying institutional and investor differences. To tackle these
questions, I design a new approach based on an empirical definition of “clienteles.” In
this paper, I define a clientele to be a group of investors holding similar portfolios
throughout time. The natural implementation of this approach consists of clustering
together similar institutional holdings. I detail my methodology in the next section. First,
I introduce the major datasets used in this paper.
To conduct my analysis, I require institutional holding, style, and return data. In
addition, I need data on stock trading volume, returns, liquidity, and other related firm
characteristics. In total, I use four major datasets in this paper: CDA Spectrum
Institutional data, CDA Spectrum Mutual Fund data, CRSP Survivorship Bias Free
Mutual Fund Data, and The Transactions and Quotes (TAQ) data. My institutional equity
ownership data come from the CDA Spectrum 13f filings database. The SEC requires all
institutional investors with more than $100 million in equity ownership to report their
holdings via quarterly 13f filings. Little research has been done on the holding and
trading characteristics for institutions other than mutual funds.9 This is unsatisfying given
the large market shares of the other types of institutions. For example, it is possible that
institutions other than mutual funds hold a majority of the shares of certain stocks. When
mutual funds need to trade in and out of these stocks, other institutions can easily absorb
this trading need. In other words, mutual funds may not always be the marginal investors
for the stocks they trade. Therefore, in this study, I incorporate all major institutions
whose holding information is available. The CDA Spectrum dataset classifies the filing
institutions into one of five categories: bank trust departments, insurance companies,
mutual funds, independent investment advisors, and other institutional investors. Pension
funds likely fall into the last group. One major type of institution, hedge funds, is
missing from the data. This data limitation will generate problems if hedge funds have
sufficient wealth to absorb the trading needs of all other institutions. However, existing
literature documents strong evidence of limits to arbitrage by hedge funds. It is
nonetheless unfortunate that this data limitation forces me to assume hedge funds have no
impact.
9
There are a few exceptions: Bennett et al. (2003) study the characteristics of stocks that are held by
all classes of institutions. Boyer and Zheng (2004) document the price impact of trading by different types
of institutions.
Table 1 summarizes the median number of stocks a typical institutional investor holds
and trades each quarter.10 From 1980 to 2003, the median number of stocks an institutional
investor holds at the end of each quarter is 262. Considering the median number of stocks
bought (124) and sold (111) during each quarter, I find, as does the existing literature,
that institutional investors manage assets actively.
Because I also examine how correlated fund inflow and outflow affect stock
characteristics, I require institutional flow data. Unfortunately, the CDA Spectrum
Institutional database from 13f filings does not report total assets under management, nor
does it report net return information. Thus, when using these data to compute fund flow
statistics, I have to assume that the reported equity portfolio makes up the fund’s entire
investment set, and also that funds make investment choices on quarterly intervals. To
soften these assumptions and to enhance the power of my tests involving fund flow, I
adapt two datasets on mutual funds that are commonly used in the literature. From the
CDA/Spectrum holdings database, I obtain complete quarterly U.S. equity holdings for
all the U.S. mutual funds during the period of 1980-2003. I manually merge these data
with the CRSP Survivorship Bias Free Mutual Fund Database.11 The CRSP Mutual Fund
Database includes fund returns, total net assets, different types of fees, investment
objectives, and other fund characteristics. My merged final sample spans the period from
January 1980 to December 2003. I eliminate bond and international funds from the
sample. In addition, I include funds with multiple share classes only once. Finally, I
eliminate from the sample all fund-quarter observations for which fewer than ten stock
holdings are reported.
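As an illustration of how a flow statistic can be built from the merged total net assets and return data, the following is a minimal Python sketch. The formula is an assumption: this section does not state the flow definition used, so the sketch uses a definition common in the fund-flow literature, in which net flow is the growth in assets not explained by the period's return. The function name `net_flow` and the example numbers are hypothetical.

```python
def net_flow(tna_prev, tna_now, period_return):
    """Net new money over a period as a fraction of beginning-of-period assets.

    Assumed definition (not taken from the paper):
        flow = (TNA_t - TNA_{t-1} * (1 + r_t)) / TNA_{t-1}
    """
    if tna_prev <= 0:
        raise ValueError("beginning-of-period assets must be positive")
    return (tna_now - tna_prev * (1.0 + period_return)) / tna_prev


# Hypothetical fund: assets fall from 100 to 99 although it earned 10%,
# so investors withdrew roughly 11% of beginning-of-period assets.
example_flow = net_flow(tna_prev=100.0, tna_now=99.0, period_return=0.10)
```

A negative value indicates a net outflow. When only 13f equity holdings are available, the same calculation would have to rest on the assumptions stated above, namely that the reported equity portfolio is the entire investment set and that decisions are made at quarterly intervals.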
Lastly, to construct stock liquidity proxies, I use the TAQ database of the New York
Stock Exchange. This database records transaction prices and quantities of all trades, as
well as prevailing quotes beginning in 1993. I apply standard microstructure filters to
these data. Namely, using TAQ’s sale condition and correction indicator variables, I
exclude the transactions that take place under special conditions and those that are
10
I delete the first quarter and last quarter observation for each fund so I do not mistakenly count
purchases or sales due to funds entering or exiting the database. If a fund has a missing report during a
quarter, I do not count the number of trades in the subsequent quarter since doing so would use information
from more than half a year earlier.
11
Information on how to merge the two datasets can be found in Wermers (2000), where a detailed
description of the CDA/Spectrum dataset is also provided.
wrongly ordered. Also, I only use trade and quote information from NYSE, NASDAQ
and AMEX. This actually turns out to be an important filter. Using this filtered sample, I
derive several standard liquidity measures, such as proportional quoted bid-ask spreads
and proportional effective spreads. Following previous literature, I use these spreads to
proxy for transaction cost. Table 2 presents summary statistics of liquidity and trading
volume measures used in the paper.
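For concreteness, the two spread measures named above can be sketched in Python. The formulas are the textbook microstructure definitions and are an assumption here, since the text names the measures without writing them out; the function names and example quotes are hypothetical.

```python
def proportional_quoted_spread(bid, ask):
    """Quoted bid-ask spread as a fraction of the quote midpoint."""
    midpoint = (bid + ask) / 2.0
    return (ask - bid) / midpoint


def proportional_effective_spread(trade_price, bid, ask):
    """Twice the distance between trade price and midpoint, scaled by the midpoint."""
    midpoint = (bid + ask) / 2.0
    return 2.0 * abs(trade_price - midpoint) / midpoint


# Hypothetical quote of 19.00 / 21.00 with a trade executed at 20.50.
quoted = proportional_quoted_spread(bid=19.0, ask=21.0)
effective = proportional_effective_spread(trade_price=20.5, bid=19.0, ask=21.0)
```

In this example the trade executes inside the quotes, so the effective spread (0.05) is smaller than the quoted spread (0.10).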
IV. Clustered Institutional Holdings
I first ask whether there exist different clienteles among institutional investors in the
U.S. equity market. Previous research looking at institutional ownership has mainly
focused on the distinction between individual investors and institutional investors. In this
paper, I find that even among institutional investors, different types and styles emerge.
By clustering on portfolio holdings, I show that there exist different clienteles among
institutional investors in the U.S. equity market. The next section details my clustering
methodology and presents properties of the obtained clusters. Briefly, I find the majority
of institutional investors can be stably clustered into a small number of clienteles.
A. Cluster Methodology
For each quarter, I perform cluster analysis on fund portfolio holdings. 12 As an
example, in a given quarter, suppose there are a total of n stocks held by all the
institutional investors. I first construct an n by 1 vector for each fund. Each entry of this
vector corresponds to a fund’s position on a particular stock. For any particular stock, if a
fund does not hold it in its portfolio, then the entry corresponding to that stock is zero.
Otherwise, the entry corresponds to the stock’s proportion in the fund’s portfolio. With
the weight vectors created, I compute the pair-wise distance between any two funds in
their holdings. I define the distance of two funds as the sum of the absolute differences of
stock holdings. Mathematically, let $v_i = [\omega_{1i}, \omega_{2i}, \ldots, \omega_{ni}]$ and $v_k = [\omega_{1k}, \omega_{2k}, \ldots, \omega_{nk}]$ be the weight
12
One of the papers that introduced cluster analysis into finance is Elton and Gruber (1970).
Other finance papers that use clustering techniques include Carleton and McGee (1970) and Brown and
Goetzmann (1997).
vectors for funds i and k, respectively. The distance (or dissimilarity) between the two
funds is defined to be
$$d_{ik} = \sum_{j=1}^{n} \left| \omega_{ji} - \omega_{jk} \right|.$$
Clearly, d is symmetric. Because each fund's weights are non-negative and sum to one, d is bounded between
0 and 2. If the two funds hold the same positions, then d equals 0. However, if the two
funds hold totally different stocks, then d hits its maximum value of 2. For K funds, I
obtain a pair-wise distance matrix with $K(K-1)/2$ unique pair-wise distances.
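The distance measure can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the code used in the paper: holdings are stored as dictionaries from a stock identifier to a portfolio weight, a stock missing from a dictionary counts as a zero weight, and the fund names, tickers, and function names are hypothetical.

```python
from itertools import combinations


def l1_distance(weights_i, weights_k):
    """Sum of absolute differences in portfolio weights between two funds.

    Each argument maps a stock identifier to its portfolio weight; a stock
    missing from a dict is treated as a zero holding.
    """
    stocks = set(weights_i) | set(weights_k)
    return sum(abs(weights_i.get(s, 0.0) - weights_k.get(s, 0.0)) for s in stocks)


def pairwise_distances(funds):
    """Return {(name_a, name_b): distance} for every unordered pair of funds."""
    return {
        (a, b): l1_distance(funds[a], funds[b])
        for a, b in combinations(sorted(funds), 2)
    }


# Hypothetical three-fund example; weights within each fund sum to 1.
funds = {
    "fund_a": {"AAA": 0.5, "BBB": 0.5},
    "fund_b": {"AAA": 0.5, "BBB": 0.5},
    "fund_c": {"CCC": 1.0},
}
distances = pairwise_distances(funds)
```

Here `fund_a` and `fund_b` hold identical portfolios and are at distance 0, while `fund_c` shares no stock with either and is at the maximum distance of 2. Three funds yield 3(3-1)/2 = 3 unique pairs.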
I then perform a hierarchical clustering of funds based on this pair-wise distance
matrix as follows. First, I set each fund to its own cluster. Then, I find the pair of funds
that are the closest according to my distance measure and cluster them together. After this
clustering, a new cluster that contains more than one fund is formed. I define the distance
between all other funds and the new cluster as the furthest distance between any fund in
the cluster and other funds outside the cluster. This definition gives the most conservative
measure of distance. Under my definition, the distance separating any two funds within
the same cluster is always shorter than the distance between that cluster and any fund
outside the cluster. Proceeding under the same principle, I define the distance between
any two multiple-fund clusters as the furthest distance between funds from these two
clusters.
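The merging rule described above corresponds to what is usually called complete (furthest-neighbor) linkage. The following Python sketch shows one way to implement it under stated assumptions: the function name, the `dist` callback, and the toy example are hypothetical, and the `threshold` argument stands in for the stopping rule, with merging ending once no pair of clusters is closer than the threshold. It favors readability over speed and is not the paper's implementation.

```python
def complete_linkage_clusters(names, dist, threshold):
    """Agglomerative clustering with complete (furthest-neighbor) linkage.

    names: list of fund identifiers.
    dist: function (a, b) -> distance between two individual funds.
    threshold: stop merging once every between-cluster distance reaches it.
    """
    clusters = [[name] for name in names]  # every fund starts as its own cluster

    def cluster_distance(c1, c2):
        # Furthest distance between any member of c1 and any member of c2.
        return max(dist(a, b) for a in c1 for b in c2)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d >= threshold:
            break  # no remaining pair of clusters is close enough to merge
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Toy example: four hypothetical funds placed on a line, distance = gap.
positions = {"f1": 0, "f2": 1, "f3": 3, "f4": 15}
result = complete_linkage_clusters(
    list(positions), lambda a, b: abs(positions[a] - positions[b]), threshold=5
)
```

In the toy example `f1`, `f2`, and `f3` end up in one cluster and `f4` stays alone, because the furthest distance between `f4` and any member of the other cluster exceeds the threshold.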
The hierarchical cluster analysis proceeds in an orderly fashion from the weakest
level (where all funds are individual clusters) to the strongest level (where all funds are in
one cluster). To ensure all funds do not fall into the same cluster, I terminate the process
if all distances between clusters are above a certain threshold. Since I am interested in the
number of clusters that obtains and since that number depends on the imposed threshold,
I must objectively choose the threshold. I perform a simulation to find this threshold.
Specifically, to maintain certain properties of funds, such as fund size and portfolio
concentration, I first fix the funds’ portfolio weights and the number of stocks in which
funds can invest. Then, using these characteristics from the real data, I use the
permutation method to simulate a portfolio for each fund under the null hypothesis that
funds choose their portfolios independently. After simulating funds’ holding vectors, I
then calculate pair-wise distances between the simulated funds. With these, I plot the
distribution of simulated pair-wise distances and use the left 1% critical value as the
threshold for clustering. I combine two clusters together if their distance is below the
threshold. By construction, I can reject the null that funds in these two clusters invest in
independent portfolios with 99% confidence. The above procedure assures that I cluster
together funds whose holdings are close to one another. The empirical distribution of the
pair-wise distance metric from the sample simulated under the null of no holdings
correlation is much more right-skewed than the corresponding distribution from the real
data. Figure 1 plots a snapshot of the two distributions for the first quarter of 1994. That
the majority of pair-wise distance statistics lie to the left of the 1% critical threshold
suggests that the majority of institutions probably fall into a few large clusters.
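The simulation step can be sketched in Python as follows. This is a hedged illustration rather than the paper's procedure: the text does not spell out the permutation scheme, so the sketch assumes that each fund keeps its observed weights and number of positions while the identities of the stocks are drawn uniformly at random, independently across funds. The function name, parameters, and example inputs are all hypothetical.

```python
import random
from itertools import combinations


def simulate_threshold(fund_weights, n_stocks, n_rounds=200, quantile=0.01, seed=0):
    """Left-tail critical value of pairwise distances under independent holdings.

    fund_weights: one list of positive portfolio weights per fund (each sums to 1);
                  the weights and the number of positions are kept fixed.
    n_stocks: size of the stock universe the funds draw from.
    """
    rng = random.Random(seed)
    simulated = []
    for _ in range(n_rounds):
        portfolios = []
        for weights in fund_weights:
            # Assumption: each fund's weights are assigned to a uniformly
            # random set of distinct stocks, independently of other funds.
            picks = rng.sample(range(n_stocks), len(weights))
            portfolios.append(dict(zip(picks, weights)))
        for p, q in combinations(portfolios, 2):
            stocks = set(p) | set(q)
            simulated.append(sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in stocks))
    simulated.sort()
    # Empirical quantile: the value below which about `quantile` of draws fall.
    index = max(0, int(quantile * len(simulated)) - 1)
    return simulated[index]


# Hypothetical input: five funds, each holding four equally weighted stocks
# drawn from a universe of twenty.
example_weights = [[0.25, 0.25, 0.25, 0.25] for _ in range(5)]
threshold = simulate_threshold(example_weights, n_stocks=20, seed=1)
```

Two clusters whose distance falls below the returned value would then be merged, which matches the 1% left-tail rule described above. A fixed seed keeps the simulated threshold reproducible.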
My approach to classifying institutional investors has certain advantages over existing
techniques.13 First of all, this approach does not use style information self-reported by
fund managers. Thus, it does not suffer from strategic style misclassification by funds.14
Second, compared to clustering funds based on return information, looking directly at
holdings information allows one to measure fund strategy more closely since fund returns
can be obscured by manager skill and management fees. Moreover, clustering based on
return information requires long time-series of data, so it must be assumed that fund style
does not change during the clustering period. Since I cluster quarterly, my method does
not require a constant style assumption. Third, I do not need to know how many styles
actually exist, nor do I need to specify what these styles are. The existing literature
focuses mainly on broad characteristic comparisons, such as large vs. small and growth
vs. value. Although size and book-to-market are two important factors in determining
styles, they alone certainly do not capture all there is about style. For example, as pointed
out by Brown and Goetzmann (1997), one can interpret the definition of “growth funds”
in many ways. Managers who describe their funds as growth-oriented have great latitude
in picking the types of stocks they can hold, the timing of purchases and sales, the level
of fund diversification, the industry concentration of the portfolio, as well as a host of
other factors that can go into determining client investment style. Ultimately, these uncaptured features of the data translate into distinct groups of stocks that institutions hold.
13
The existing literature classifies funds using several different methods: (1) classify funds based on
the reported styles; (2) classify funds into growth, value, small- and large-cap using the loadings on the
Fama-French factors (Chan, Chen & Lakonishok (2002)); (3) classify funds by clustering fund performance
variables (Brown & Goetzmann (1997)).
14
Brown and Goetzmann (1997) find evidence consistent with the notion that mutual funds choose
to report styles that minimize their relative poor performance.
My clustering approach picks up styles not easily identified when only taking into
account known risk factors.
B. Cluster Characteristics
Table 3 summarizes the number of clusters produced each period. In the interest of
brevity, I report annual results, which are simple averages of quarterly statistics. At first
glance, the total number of clusters is around one-tenth of the number of funds in the
market. Clusters’ sizes are tremendously unbalanced. On one hand, there are several large
clusters hosting numerous funds; on the other hand, some funds form their own clusters.
The ten largest clusters cover more than 50% of institutional investors in terms of the
number of funds and more than 80% in terms of dollar volume. Thus, it seems reasonable
to focus exclusively on the largest clusters.
One important question is whether the few large clusters obtained can pick up
important fund-distinguishing characteristics. Two of the most commonly used
characteristics to identify distinct fund styles are size and book-to-market. Thus, I look at
whether stocks held by funds in different clusters differ along these two dimensions.
Figure 2 shows some supporting evidence to this effect. Here I only consider the ten
largest clusters obtained. Stocks are ranked into 5 by 5 size and book-to-market portfolios.
Each bubble in the graph represents a particular cluster’s average rank along the two
dimensions. Its center corresponds to the mean rank for the cluster, and the widths along
the two dimensions represent the standard errors around the mean. As one can see, these
bubbles generally do not overlap. Thus, clusters do seem to hold stocks that differ
significantly in these two characteristics. Figure 2 suggests my clustering scheme does
pick up important fund-distinguishing characteristics.
The asset management industry also defines several distinct styles among institutional
investors, especially for mutual funds. One commonly used style measure is the ICDI
objective code. To gain further understanding of the clusters obtained, I study whether
each cluster can be associated with a distinct fund style. Since the ICDI style measure is
reported for mutual funds only, I perform a separate clustering analysis using mutual fund
holding data only. Table 4 summarizes the relationship between the clusters obtained and
the ICDI objective. For each cluster, I identify a dominant style, which is the style most
represented within that cluster. Then for all clusters identified by the same style, I
calculate the cluster-level average of the percentage of funds of that style. Panel A reports
the results. Only the largest 10 clusters and clusters containing more than 10 funds are
analyzed. As one can see, clusters and styles do not map perfectly to each other.
Although there are some clusters hosting exclusively one certain fund style, such as
“aggressive growth”, “international equity”, “sector funds,” and “utility funds”, most of
the dominant styles cover less than 50% of the funds in the clusters. This suggests that
funds with different reporting styles can still hold similar portfolios. At the same time, it
is possible that two funds reporting the same style hold very different portfolios.
Therefore, I also examine how many clusters can be associated with a single style. Here, I
associate a cluster with its dominant style. Panel B reports the percentage of clusters
belonging to various styles. On average, the “long term growth” style is associated with the
largest number of clusters; however, its coverage seems to decline with time. For most
styles, the percentage of clusters associated with them seems to be stable over time.
However, during the past twenty years, the numbers of “aggressive growth” and “sector
funds” clusters seem to have increased.
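The dominant-style calculation described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the code behind Table 4: the function name `dominant_styles`, the toy fund identifiers, and the alphabetical tie-breaking rule are all hypothetical.

```python
from collections import Counter, defaultdict

def dominant_styles(fund_cluster, fund_style):
    """Return {cluster: (dominant style, share of the cluster's funds with it)}."""
    styles_by_cluster = defaultdict(list)
    for fund, cluster in fund_cluster.items():
        styles_by_cluster[cluster].append(fund_style[fund])
    result = {}
    for cluster, styles in styles_by_cluster.items():
        counts = Counter(styles)
        # Most frequent style wins; ties are broken alphabetically (an assumption).
        style, n = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        result[cluster] = (style, n / len(styles))
    return result

# Hypothetical toy data: six mutual funds in two clusters.
fund_cluster = {"f1": 1, "f2": 1, "f3": 1, "f4": 2, "f5": 2, "f6": 2}
fund_style = {
    "f1": "long term growth", "f2": "long term growth", "f3": "growth and income",
    "f4": "sector funds", "f5": "sector funds", "f6": "sector funds",
}
dom = dominant_styles(fund_cluster, fund_style)
for cluster, (style, share) in sorted(dom.items()):
    print(cluster, style, round(share, 3))
```

In this toy example the first cluster is dominated by a style that covers only two-thirds of its funds, while the second cluster hosts a single style exclusively, mirroring the two cases discussed in the text.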
Since I perform cluster analysis quarterly, my method as detailed does not guarantee
that funds falling into one cluster in this quarter will be grouped together the next quarter.
Moreover, funds on the margin between two groups are easily misclassified. To alleviate
concerns over the stability, I test the clustering’s consistency. If there are some economic
forces behind the clusters, then the clustering should remain stable over time. I test this
conjecture by looking at pair-wise connections between funds. The intuition behind
looking at pair-wise connections is that, ideally, funds clustered together should stay
clustered together. In each quarter, I define “connection” to be either 1 or 0 depending on
whether the two funds in question fall into the same cluster or not. I then count the
percentage of pair-wise connections that remain unchanged for the next year. Based on
intuition given earlier, a higher percentage of unchanged pair-wise connections signifies a
more stable clustering. Table 5 gives the clustering stability results. Column 2 counts the
number of pair-wise connections that stay the same, and column 3 counts the total
number of pair-wise connections for funds that are alive in both the previous and the
current quarter. Column 4 gives the percentage of connections changed from the previous
quarter. I call this percentage the transition rate. The average annual transition rate is
14.9%. To gauge the stability of the clustering through time, for each year I bootstrap the
“switching rate” under the null hypothesis of no cross-sectional structure. The null is
constructed by forming samples via random draws without replacement from actual fund
portfolios. For each round of the bootstrap procedure, I set the number of clusters and the
total number of funds equal to those statistics from the real sample. Column 5 reports the
average “switching rate” for each year. The typical rate of change under the null is
21.45%, which is considerably higher than the transition rate of 14.9% obtained from the
true sample. Column 6 reports the standard deviation of the bootstrapped distribution.
That each year’s transition rate is below the 1% critical value in the left tail of the
bootstrapped distribution allows me to reject the null of spurious classification. Thus,
clustering based on portfolio holdings gives a stable grouping of institutional investors.
This stable clustering supports the story that there are indeed clienteles among
institutional investors.
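The pair-wise connection test can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the procedure that produces Table 5: the names `transition_rate` and `null_rates` are hypothetical, and the null is approximated by randomly reassigning the current cluster labels across funds (which holds the number and sizes of clusters fixed) instead of resampling actual fund portfolios.

```python
import random
from itertools import combinations

def transition_rate(prev, curr):
    """Share of fund pairs whose same-cluster indicator changes between periods.

    prev, curr: dicts mapping fund id -> cluster label. Only funds alive in
    both periods are compared.
    """
    funds = sorted(set(prev) & set(curr))
    changed = total = 0
    for a, b in combinations(funds, 2):
        conn_prev = prev[a] == prev[b]   # "connection" equals 1 if same cluster
        conn_curr = curr[a] == curr[b]
        changed += (conn_prev != conn_curr)
        total += 1
    return changed / total if total else float("nan")

def null_rates(prev, curr, n_rounds=500, seed=0):
    """Switching rates when current labels are shuffled across funds."""
    rng = random.Random(seed)
    funds = sorted(set(prev) & set(curr))
    labels = [curr[f] for f in funds]
    rates = []
    for _ in range(n_rounds):
        rng.shuffle(labels)
        rates.append(transition_rate(prev, dict(zip(funds, labels))))
    return rates

# Hypothetical toy example: fund "d" moves from cluster 2 to cluster 3.
prev = {"a": 1, "b": 1, "c": 2, "d": 2, "e": 3}
curr = {"a": 1, "b": 1, "c": 2, "d": 3, "e": 3}
rate = transition_rate(prev, curr)
simulated = null_rates(prev, curr, n_rounds=200)
share_below = sum(r <= rate for r in simulated) / len(simulated)
print(rate, share_below)
```

Comparing `rate` with the left tail of `simulated` corresponds to the comparison between columns 4 through 6 of Table 5: a transition rate far below the simulated switching rates indicates a stable clustering.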
Although the results presented in this paper do not depend on the economic reasons
underlying why institutional holdings cluster, the existing literature does provide some
rationales that support this phenomenon. Investors may choose to invest in only a subset
of stocks due to preference, liquidity, and information reasons. For example, according to
Nieuwerburgh and Veldkamp (2006), when information is costly and its acquisition has
increasing returns to scale, it pays to focus exclusively on the single risk factor with
which the fund is most familiar.15 Thus, funds stay with their familiar risk factors as long
as their investments are successful. Conversely, a fund should learn about and rely on
other risk factors after it performs badly. I provide some evidence consistent with this
explanation. Specifically, I test whether a fund switches to another cluster after poor
performance. To do so, I run a logistic regression using a dummy variable that captures
whether a fund leaves its cluster during a certain quarter. Table 6 provides the results.
They are consistent with the information story; a fund’s probability of switching clusters
increases as the fund’s past quarter net performance and past year net performance
decrease. The coefficients on fund past performance are negative and significant,
15
Veldkamp (2005) proposes a theory that provides rationales for why investors may want to purchase the
same information that others are purchasing.
suggesting that a fund is more likely to change its style and transfer to another cluster
after its current strategy fails.
V.
Comovement and Clustered Institutional Holdings
That there exist clustered institutional holdings is most interesting if it has asset
pricing implications. In this section, I study the pricing implications of institutional
clustering detailed in the previous section. Existing theories suggest that when segmented
investor groups concentrate on different stocks, only the marginal investor who
specializes in a stock affects that particular stock’s price and the associated trading
behavior. Consequently, stocks held by the same marginal investor would comove
excessively. In the current context, since I cluster based on fund portfolio holdings, each
cluster of institutions can represent a marginal investor. I test if fund clustering can
explain the observed comovement in stock return, trading pattern, and liquidity.
Economically, this hypothesis is the same as whether the observed stock comovement is
related to stocks sharing a common marginal investor.
A. Comovement in Trading Volume, Liquidity and Return
A.1. Trading Behavior Commonality
Since clientele effects generate pricing and liquidity commonality through trading, I
first examine whether fund clustering induces comovement in stocks’ turnover. By
grouping funds into clusters, I essentially create several “pseudo portfolios.” I test for
excessive trading behavior comovement within these portfolios. There are a few ways of
doing so. One way would be to compute pair-wise correlations among all stocks and
compare the average correlation of pairs of stocks within the same cluster against the
average correlation of pairs of stocks not residing in the same cluster. This method turns
out to be too computationally intensive. Instead, I construct measures of turnover on the
stock, cluster, and market levels. I then regress stock level turnover on cluster and
market level turnover to see if cluster level turnover can explain stock level turnover.
My daily stock turnover measure, defined as daily trading volume divided by the
number of shares outstanding, comes from CRSP. Constructing cluster level turnover
turns out to be nontrivial because some stocks are held by multiple clusters. These stocks
tend to be large and liquid; most of them are in fact index stocks. Diversification and
liquidity concerns are the two main forces behind these stocks being held by a large
population of investors. If a stock is held by more than 5 clusters of investors, I associate
that stock with the five clusters holding the most of its shares. If a stock is held by 5 or
fewer clusters, I associate that stock with all of the clusters that hold it. For each stock, I
define its associated cluster level turnover variable as the average turnover among all the
other stocks held by every cluster that holds this particular stock. I construct a stock’s
associated market level turnover variable as the average turnover of all other stocks in the
market.16
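The construction of the associated cluster and market level turnover variables can be sketched in Python as follows. The sketch rests on assumptions that the text does not pin down: a stock's peers are taken to be the union of all other stocks associated with any of its (at most five) clusters, ties in shares held are broken by cluster label, and the names `associated_clusters`, `cluster_turnover`, and `market_turnover` are hypothetical.

```python
def associated_clusters(shares_by_cluster, max_clusters=5):
    """Keep at most `max_clusters` clusters, ranked by shares of the stock held."""
    ranked = sorted(shares_by_cluster, key=lambda c: (-shares_by_cluster[c], c))
    return set(ranked[:max_clusters])

def cluster_turnover(stock, holdings, turnover, max_clusters=5):
    """Average turnover of all *other* stocks that share a cluster with `stock`.

    holdings: {stock: {cluster: shares held}}; turnover: {stock: daily turnover}.
    """
    own = associated_clusters(holdings[stock], max_clusters)
    peers = [s for s in holdings
             if s != stock and own & associated_clusters(holdings[s], max_clusters)]
    return sum(turnover[p] for p in peers) / len(peers) if peers else float("nan")

def market_turnover(stock, turnover):
    """Average turnover of all other stocks in the market."""
    others = [v for s, v in turnover.items() if s != stock]
    return sum(others) / len(others)

# Hypothetical toy data for one trading day.
holdings = {"A": {1: 100, 2: 50}, "B": {1: 80}, "C": {2: 10, 3: 40}, "D": {3: 70}}
turnover = {"A": 0.010, "B": 0.020, "C": 0.040, "D": 0.060}
ct_a = cluster_turnover("A", holdings, turnover)
mt_a = market_turnover("A", turnover)
print(ct_a, mt_a)
```

Excluding the stock itself from both averages keeps the regressors from mechanically containing the dependent variable.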
To test whether clustering can explain the cross-section of individual stock turnover,
for each stock I run a time-series regression of daily17 individual stock turnover on its
associated daily cluster level turnover. I include the stock’s associated market level
turnover as control. I control for the other common factors, such as size, book to market,
past return, and industry, that seem to be correlated to risks involving public information.
The formal specification is
$$T_{j,t} = \alpha + \beta_c CT_{j,t} + \beta_m MT_{j,t} + \beta_s ST_{j,t} + \beta_{bm} BMT_{j,t} + \beta_{lr} LRT_{j,t} + \beta_I IT_{j,t} + \varepsilon_{j,t} \qquad (1)$$
where $CT_{j,t}$ is the stock’s associated cluster level turnover and $MT_{j,t}$ is the stock’s
associated market level turnover. $ST_{j,t}$, $BMT_{j,t}$, and $LRT_{j,t}$ represent the average turnover for a
portfolio matched on size, book-to-market, and past one year return, respectively. $IT_{j,t}$
refers to the average turnover for the stocks in the same industry as the target stock. Since
similar funds, such as sector funds, likely hold stocks with similar characteristics,
commonality of turnover within the same cluster may reflect common information. I
construct matching size, book-to-market, past year return, and industry portfolios to
control for common turnover due to public information. Specifically, I sort stocks into
quintile portfolios based on market capitalization, book-to-market, and past year return.
16
I also use another way to identify stocks sharing the same cluster as the target stock. I perform a
second level clustering on the universe of stocks held by institutional investors based on the similarity of
clusters that hold each stock during each period. The basic results are not changed.
17
I replicate all the analyses using weekly and bi-weekly data. The results are similar to regressions
using daily data. The weekly results are slightly stronger than their daily counterparts, and the bi-weekly
results are as strong as the weekly ones. These results imply that, for my data, clientele effects generate
comovement at a weekly frequency.
Each stock thus matches to one of the quintile portfolios. To control for industry effects, I
use the Fama-French 48 industry portfolios. Each stock matches to one of 48 industry
portfolios.
The first two columns of Table 7 report evidence that supports excessive turnover
comovement within clusters. To judge the cluster level turnover’s explanatory power of
stock level turnover, Column 1 shows baseline results of stock level turnover regressed
on market level turnover and characteristic controls. Since I standardize each variable by
its time-series standard deviation, one can readily read off the economic significance of
the estimates. Note the reported coefficients are the cross-sectional means of stock-wise
time-series regressions. Column 2 shows regression results after I add cluster level
turnover to the right-hand side of the baseline regression. Stock level turnover comoves
with market level turnover, public information, and cluster level turnover since all
coefficients are positive and significant. The coefficient on cluster turnover, $\beta_c$, is highly
statistically significant. A one standard deviation increase in the stock’s affiliated cluster
turnover is associated with a 0.0230 standard deviation increase in the stock’s turnover. In
terms of magnitude, the cluster turnover coefficient is roughly 50% as large as the
coefficient on the industry turnover variable. Turnover of the matched book-to-market
portfolio turns out to have the smallest effect. Thus, cluster turnover seems to be
associated with observed stock turnover comovement.
A.2. Return Commonality
Trading commonality discovered in the previous subsection does not directly translate
into return commonality. If the transaction is not information-based and if there is enough
liquidity for the underlying stock, then trading by itself should not impact prices greatly.
Hence I test whether stock returns also comove at the cluster level. The test specification
is similar to that for stock turnover. The only difference is that instead of using stock,
cluster, and market level turnover measures, I use in their place the corresponding return
measures. I construct these return measures in ways analogous to their turnover measure
counterparts. In the interest of brevity, I skip the construction details.
Columns 3 and 4 in Table 7 show the results from regressing stock returns on their
associated cluster level returns, their associated market level returns, as well as their
associated characteristic returns. Existing literature says the Fama-French three factors,
the momentum factor, and the industry factor explain a large part of the cross-section of
stock returns. My results concur; the coefficients on the market, size, momentum, and
industry portfolio return variables all load significantly in the return regressions reported
in the table. However, similar to the turnover results, book-to-market return does not
significantly contribute to stock return comovement. Comparing the baseline regression
that does not include cluster level return to the regression that does include cluster level
return, I find a cluster effect on top of the other factors. The coefficient on cluster level
return is both statistically and economically significant. For a typical stock, a one
standard deviation increase in cluster level average return is associated with a 0.04
standard deviation increase in the stock’s return, which translates to a 20 basis point
increase in daily return level. Thus, I find that fund holding clustering is associated with
observed stock return comovement.
A.3. Liquidity Commonality
Having looked at how clustering seems to be associated with stock trading and return
commonality, I finally examine if clustering is also connected to observed liquidity
commonality. The last four columns of Table 7 summarize regression results for the
commonality of quoted spreads and effective spreads, respectively. The baseline
regression mimics the results that have been found in the literature. Both size and
industry factors play important roles in explaining the movement of individual liquidity.
Similar to the return and trading volume results, cluster level liquidity is statistically and
economically significant. The economic significance of the cluster level liquidity is
roughly one-third of that of the industry level liquidity. Using quoted spread as an
example, a one standard deviation increase in industry level liquidity increases stock
level liquidity by 11.8% of a standard deviation. A one standard deviation increase in
cluster level liquidity increases stock level liquidity by 4.4% of a standard deviation,
which translates to a 0.1% increase in the quoted spread. The explanatory power of the
regression also increases with the addition of cluster level liquidity.
VI.
Correlated Budget Constraints and Cluster Level Comovement
In the previous sections, I document the existence of stable clusters among
institutional investors and show how clustering promotes excessive comovement in
trading behavior, return, and liquidity among stocks held by the same cluster. Current
literature provides three explanations as to why we observe excessive comovement. I
now look at how these explanations fit into my clustering story. The three explanations
can be neatly characterized by the mechanisms through which they operate. The channels
are based on (1) correlated private information,18 (2) cross-market portfolio rebalancing,19
and (3) correlated liquidity demands.20 In this paper, I mainly focus on the channel based
on correlated liquidity demands.
First, I show that funds’ investment behavior is dramatically distorted when facing a
large flow shock. When facing a large money inflow, a fund tends to purchase more;
when facing a large money outflow, it liquidates a larger proportion of its portfolio.
Whether such trading distortions have market-wide impact also depends on if other
investors are willing and able to provide liquidity to the constrained funds at reasonable
prices. In the second part of this section, I find that funds sharing the same cluster
experience large inflows and outflows around the same time. My finding concurs with
Edelen (1999). Edelen finds substantial positive cross-correlation in fund flows, possibly
indicating the existence of common factors. Here, I show that part of the flow
commonality is due to the clustering of funds’ portfolio holdings.
This high degree of flow correlation implies that funds clustered together likely hit
their budget constraints concurrently. In the last part of this section, I test whether
common liquidity shocks can partially explain why stocks held by the same cluster
18
The correlated information channel was originally introduced by King and Wadhwani (1990). It
is based on the idea that information asymmetry leads uninformed traders to incorrectly update their beliefs
on the payoffs of many assets following idiosyncratic shocks to a single asset.
19
Fleming, Kirby, and Ostdiek (1998) and Kodres and Pritsker (2002) argue that the portfolio
rebalancing activity of privately informed, price-taking investors—driven by risk aversion—may mislead
the updating process of other, uninformed investors, thus eventually inducing financial contagion.
20
The importance of financial constraint arbitrage is first emphasized by Shleifer and Vishny (1997).
Kyle and Xiong (2001) study the effect of wealth constraint on arbitrageurs and use it as a spillover
mechanism. Gromb and Vayanos (2002) develop an equilibrium model of arbitrage trading with margin
constraints to explain contagion. Yuan (2005) shows that information asymmetry amplifies the wealth
effect on price movement.
experience excessive comovement. Particularly, because funds within the same cluster
experience correlated inflows and outflows, an individual fund’s liquidity-forced
transactions aggregate to a large liquidity demand at the cluster, as well as market, level,
inducing a big price impact. Whether common liquidity shocks have a first order effect in
the real market is ultimately an empirical question I answer in the following sections.
A. Fund Trading Behavior during Large Flow Periods
As pointed out above, CDA Spectrum Institutional data from the quarterly 13f filings
database does not contain direct fund flow information. To proxy for the flow variable, I
had to make rather strong assumptions about fund investment spectrum and timing.
Fortunately, the CRSP Mutual Fund Survivorship Free database allows me to calculate
fund flow for mutual funds without these assumptions. For the analysis of this section, I
restrict my sample to mutual funds due to data limitations.
Since I am interested in trading behavior of funds when they incur large inflows or
outflows, I calculate the holding statistics restricted to sub-samples of funds with
flow>19% and flow<-7%, respectively. The 19% and -7% represent the upper and lower
twenty percentile cutoffs of the fund flow distribution. As a comparison, I also include
the corresponding statistics for funds facing regular inflow (0<flow<19%) and regular
outflow (-7%<flow<0). The summary statistics are reported in Table 8. On average,
18.18% of funds incur a large inflow, and 20.4% of funds incur a large outflow each quarter.
Panel B of Table 8 shows that funds buy more stocks and sell fewer stocks when they
incur large inflows. The percentage of stocks bought increases from 43% for the regular
inflow sample to 70% for the sub-sample of funds that incur large inflows. In the same
sub-sample, the percentage of sales decreases from 26% to 24%. In contrast, when funds
incur large outflows, they tend to sell more stocks. Fund purchase percentage decreases
to 33%, and fund sale percentage increases to 47%. These summary statistics suggest
large capital flow strongly influences a fund’s investment decisions. When a fund
experiences a large liquidity shock, it must change its portfolio to absorb the effects.
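The flow sub-samples and the trading statistics can be illustrated with a small Python sketch. It is hedged in several ways: the function names `flow_bucket` and `trade_summary` are hypothetical, boundary values and zero flows are placed in the regular buckets by assumption, and percentages are computed over the union of stocks held in either quarter, which may differ from the denominators used in Table 8.

```python
def flow_bucket(flow, upper=0.19, lower=-0.07):
    """Assign a quarterly fund flow to one of the four sub-samples."""
    if flow > upper:
        return "large inflow"
    if flow < lower:
        return "large outflow"
    # Boundary values and zero flow are not assigned in the text; placing them
    # in the regular buckets is an assumption of this sketch.
    return "regular inflow" if flow >= 0 else "regular outflow"

def trade_summary(prev, curr):
    """Compare two quarterly holdings snapshots ({stock: shares}) of one fund."""
    names = set(prev) | set(curr)
    bought = {s for s in names if curr.get(s, 0) > prev.get(s, 0)}
    sold = {s for s in names if curr.get(s, 0) < prev.get(s, 0)}
    new_buys = {s for s in bought if prev.get(s, 0) == 0}   # newly added stocks
    exits = {s for s in sold if curr.get(s, 0) == 0}        # fully dropped stocks
    return {
        "pct_bought": len(bought) / len(names),
        "pct_sold": len(sold) / len(names),
        "new_buy_share": len(new_buys) / len(bought) if bought else 0.0,
        "exit_share": len(exits) / len(sold) if sold else 0.0,
    }

# Hypothetical holdings of one fund in two consecutive quarters.
prev_q = {"A": 100, "B": 50, "C": 30}
curr_q = {"A": 150, "B": 50, "D": 20}
summary = trade_summary(prev_q, curr_q)
print(flow_bucket(0.25), summary)
```

Averaging such summaries within each flow bucket gives statistics of the kind reported in Table 8.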
When I further decompose purchases into increases on an existing stock or “new
buys,” the pattern of funds buying (selling) when they experience large inflows (outflows)
is even stronger. I label a purchase a “new buy” if the fund adds a stock to its portfolio in
the current quarter. A fund manager likely holds stronger positive opinions about “new
buys” as opposed to increases on existing positions. For similar reasons, I also count
the number of stocks that are completely dropped (labeled “exits”) by funds. Managers
should hold stronger negative opinions about these dropped stocks as compared to mere
sales. From panel B of Table 8, among the stocks that are sold by funds that incur large
inflows, 16% of them are completely dropped from fund portfolios. Only 13% are
dropped from fund portfolios for the whole universe of funds. Panel D of Table 8 shows
that 16% of purchases are “new buys” for funds incurring large outflows as compared to
14% for the whole universe. Managers need stronger signals to trade a stock in a different
direction from that of the rest of the portfolio, especially when facing tight liquidity
constraints.
B. Correlated Cluster Level Flow among Institutional Investors
For each mutual fund, I can more precisely calculate its fund flow. Thus, I can
confidently decompose mutual fund total flow into expected flow and unexpected flow
using a third order VAR model on [flow(t), return(t)]. To control for the market level
expectation, I also include on the right-hand side of the regression the lagged market
level average fund flow and lagged market level fund return. Expected flows are the
forecasts of the VAR model, and unexpected flows are the residuals of the model. I
decompose the flow variable in this manner because fund managers likely factor into
their strategies how much money is expected to flow into or out of their funds. The
impact of fund flows on funds’ investment strategy can be alleviated if they can be
predicted ahead of time. In contrast, unexpected fund flows come as shocks, and a fund
manager must deal with them in ways that may impact the flow, return, and liquidity of
stocks held by the fund.
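The decomposition into expected and unexpected flow can be sketched in Python as follows. This is a simplified illustration under stated assumptions: it fits only the flow equation of a third order VAR on [flow, return] by ordinary least squares, omits the lagged market level controls, uses simulated data, and relies on a small hypothetical helper `ols` that solves the normal equations directly.

```python
import random

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))  # partial pivoting
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def decompose_flow(flow, ret, lags=3):
    """Split flow into expected (fitted) and unexpected (residual) components."""
    X, y = [], []
    for t in range(lags, len(flow)):
        row = [1.0]                                  # intercept
        for lag in range(1, lags + 1):
            row += [flow[t - lag], ret[t - lag]]     # lagged flow and lagged return
        X.append(row)
        y.append(flow[t])
    beta = ols(X, y)
    expected = [sum(b * x for b, x in zip(beta, row)) for row in X]
    unexpected = [yi - e for yi, e in zip(y, expected)]
    return expected, unexpected

# Simulated (hypothetical) series for one fund: flows chase past returns.
rng = random.Random(0)
ret = [rng.gauss(0.02, 0.08) for _ in range(60)]
flow = [0.0] * 60
for t in range(1, 60):
    flow[t] = 0.3 * flow[t - 1] + 0.5 * ret[t - 1] + rng.gauss(0, 0.03)
expected, unexpected = decompose_flow(flow, ret)
print(len(expected), sum(unexpected))
```

Because the regression includes an intercept, the unexpected component averages to zero in sample, and the two components sum back to the observed flow by construction.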
First, I apply the same clustering procedure for the mutual fund quarterly holdings.
Because restricting to mutual funds means further results are generated from a smaller
sample, I check that statistics from the mutual fund sample conform to the statistics from
the entire sample of institutional investors. Reassuringly, properties of the clusters produced
from the mutual fund sample qualitatively match properties of the clusters obtained from
the sample consisting of all institutions. 21 Thus, cluster analysis on the mutual fund
sample should give the same inferences as analysis on the sample consisting of all
institutional investors.
Having checked that analysis of the mutual fund sample is representative, I proceed to
fund flow analysis. Before I show fund flows are correlated at the cluster level, I supply a
reason as to why one may expect this to be true. It is well documented that fund investors
actively pull money out of loser funds and put money into winner funds. Since funds
within a clientele hold similar portfolios, they should also perform similarly. If fund
investors do chase performance, then correlated performance should lead to correlated
fund flows. To verify this argument, I test whether funds sharing the same cluster have
more correlated performance than funds residing in different clusters. Results in panel A
of Table 9 support this hypothesis. Column 1 reports results from a regression of
individual fund performance on market level fund performance and reveals a strong
market level performance correlation. However, as can be seen from the second column,
once I add a cluster level performance variable to the regression, market level influence
dramatically decreases. In fact, the cluster level performance variable dominates and
absorbs the market level variable’s explanatory power. This result suggests that most of
the fund level performance correlation takes place at the cluster level.
Having found excess performance correlation at the cluster level, I move on to test
whether fund flows are also correlated at the cluster level. This is done by regressing
individual fund flow on cluster level fund flow. For each fund, I construct a cluster level
fund flow ($Flow_{c,t}$) variable. For any particular fund, I define $Flow_{c,t}$ as the average flow
for all funds sharing the same cluster, excluding the fund itself. Similarly, I construct a
market level fund flow ($Flow_{m,t}$) variable as the average flow of all funds in the market,
excluding the fund itself. I then regress individual fund flow ($Flow_{i,t}$) on its cluster level
average flow ($Flow_{c,t}$) and its associated market level average flow ($Flow_{m,t}$). If fund
flows are correlated at the cluster level, then cluster level average flow should provide
explanatory power for individual flow over and above explanatory power provided by
market level average flow alone.
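The leave-one-out construction of the cluster and market level flow regressors can be sketched in Python as follows. The function names and the toy flows are hypothetical, and the sketch covers a single period only.

```python
def leave_one_out_mean(values, key):
    """Average of all entries in `values` except the one stored under `key`."""
    others = [v for k, v in values.items() if k != key]
    return sum(others) / len(others) if others else float("nan")

def flow_regressors(fund, flows, fund_cluster):
    """Return (cluster level flow, market level flow) for one fund in one period."""
    same_cluster = {f: v for f, v in flows.items()
                    if fund_cluster[f] == fund_cluster[fund]}
    return leave_one_out_mean(same_cluster, fund), leave_one_out_mean(flows, fund)

# Hypothetical flows for five funds in one period.
flows = {"f1": 0.10, "f2": 0.06, "f3": 0.02, "f4": -0.04, "f5": -0.08}
fund_cluster = {"f1": 1, "f2": 1, "f3": 1, "f4": 2, "f5": 2}
flow_c, flow_m = flow_regressors("f1", flows, fund_cluster)
print(flow_c, flow_m)
```

A fund that forms its own cluster has no peers, so its cluster level flow is undefined (returned as NaN here); such funds would have to be dropped from the regression.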
21
The clustering results for the mutual fund datasets are available upon request.
I add several controls to the above baseline regression. As mentioned in the data
section, I do not classify institutions using self-reported styles. I give several reasons for
not doing so—among them, the fact that except for mutual funds, style is often not
reported. Since analysis in this section is limited to mutual funds and mutual funds of the
same style tend to have correlated portfolios, I add to the baseline regression a style flow
variable, $Flow_{s,t}$, to control for correlated flow generated by funds sharing the same
investment style. For each fund, I define $Flow_{s,t}$ to be the average flow of all the funds
that share the same style, excluding the fund itself. Previous literature also finds that fund
investors actively chase past performance. It also finds fund flows are positively autocorrelated. Thus, I include as control variables the current and previous month net return, as
well as the fund’s previous month flow. For each fund, I run the
following time series regression:
$$Flow_{i,t} = \alpha + \beta_c Flow_{c,t} + \beta_m Flow_{m,t} + \gamma X_t + \varepsilon_{i,t} \qquad (2)$$
where $X_t$ denotes exogenous control variables mentioned above. I normalize all
variables in this regression by their own standard deviations so one can directly compare
the economic significance of different regressors. Table 9 shows the regression results. I
report regression results with and without the cluster flow variable separately for
comparison. I also include in the regressions lagged cluster and lagged market flows to
study possible contagion effects among funds in the same cluster.
Table 9 reveals ample evidence of flow comovement within the same cluster. The
regression coefficient of contemporaneous cluster flow is both statistically and
economically significant. With raw flow as the dependent variable, the contemporaneous
cluster flow coefficient $\beta_c$ is around 0.14 and has an associated t-statistic of 16.
Approximately 70% of the time series $\beta_c$ coefficients are positive, and 34% of them
exceed the 5% one-tailed critical value. Adding to the baseline regression cluster level
flow improves the adjusted R-squared of the baseline model by 3% to 5% depending on
which component of fund flow is under study. The significance of the cluster level
variable in both the expected and unexpected flow regressions implies that they both
contribute to the commonality of overall fund flow.
The significance of the coefficient on the cluster level variable in the expected flow regression is
not surprising since funds sharing the same clusters hold similar portfolios and thus have
correlated past performance. However, the corresponding coefficient in the unexpected
flow regression is a little surprising. This result suggests that even if funds are
sophisticated enough to smooth out the impact of the expected flows, a systematic
liquidity shock will affect each cluster. Moreover, the significance of cluster level flow
after controlling for individual past performance suggests that there might be a non-linear
relationship between flow and past return. Contemporaneous cluster flow may pick up
the residual effects left out by the linear prediction. A separate explanation may come
from an externality effect. When investors decide in which funds to invest, they not only
look at each fund’s individual performance, but they may also consult the performance of
similar funds. Therefore, failures of some funds may cause investors to pull money out of
other funds of the same type even if these other funds perform relatively satisfactorily.
Finally, consistent with the existing literature, the style flow variable is significant
even after controlling for performance, suggesting style does generate flow commonality.
However, the coefficient on the cluster level flow variable is typically twice as large as
that on the style flow variable, suggesting clustering generates even more flow
commonality than style.
C. Asymmetric Comovement Induced by Fund Flow
The previous section shows that funds sharing the same cluster tend to have large
inflows and outflows around the same time, suggesting commonality in financial
constraint. Because the clustering is based on fund holdings, funds of a cluster hold a
large percentage of the shares outstanding of the stocks held by the cluster. Since no one
is there to take the other side, the correlated liquidity shocks aggregate to a market-wide
liquidity shock for stocks that the funds in the cluster decide to buy or sell. These market-wide liquidity shocks impact pricing, as well as other stock trading related characteristics.
Thus the correlated liquidity story predicts comovement among stocks held by the same
cluster should increase with the magnitude of fund flow.
Realistically, the story that common liquidity shocks to stocks held by the same marginal investor
generate comovement is not the only one that can explain the phenomena already
mentioned. A missing risk factor or characteristics common to the stocks in question can
also generate excessive return comovement. Although the missing factor mechanism
plays at least a partial role in determining excessive return comovement, it is unlikely that
the missing risk factor can generate the same observed comovement in liquidity or
trading volume if it contains only public information. Moreover, fund flow is orthogonal
to the time-series variation of the missing risk factor. Thus, if my cluster level variable
provides additional explanatory power for comovement during times of inflows or
outflows, then the budget constraint mechanism must have a first order effect.
To show that part of the explanatory power of the cluster flow variable comes from
funds changing their positions when facing liquidity shocks, I add two interaction terms
to equation (1), my baseline regression testing for comovement. Specifically, I interact
my cluster level measure with a dummy variable, large_inflow, indicating whether the
cluster incurs a large inflow and a dummy variable, large_outflow, for a large outflow.
The two variables not only capture the effects from the magnitude of the shocks but also
the asymmetric effects from the direction of the shocks. Although funds subject to
negative flow shocks have to liquidate some of their positions, funds do not necessarily
increase their positions after a positive fund inflow. Thus, fund flows should impact
prices and other stock characteristics asymmetrically. To test fund flow effects, I run the
following model,
$$T_{j,t} = \alpha + \beta_{c}\, CT_{j,t} + \beta_{Inflow}\, CT_{j,t} \times D_{Inflow,j,t} + \beta_{Outflow}\, CT_{j,t} \times D_{Outflow,j,t} + \beta' X_{j,t} + \varepsilon_{j,t} \qquad (3)$$
where $D_{Inflow} = 1$ if the average cluster flow is greater than $\bar{D}_{Inflow}$, and $D_{Inflow} = 0$
otherwise; $D_{Outflow}$ is defined likewise. I use the upper and lower 10th percentile cutoffs of
historical cluster flow as the values for $\bar{D}_{Inflow}$ and $\bar{D}_{Outflow}$, respectively.
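As a rough illustration of how the two interaction regressors in equation (3) could be assembled, the Python sketch below builds the inflow and outflow dummies from the 90th and 10th percentile cutoffs of a cluster's historical flow and multiplies them by the cluster-level measure. The function names, the nearest-rank percentile rule, and the sample numbers are assumptions made for illustration; the paper does not specify them.

```python
def percentile_cutoff(values, pct):
    """Nearest-rank percentile (an assumed rule; the paper does not state one)."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, kept at 1 or above.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def interaction_terms(cluster_measure, cluster_flow, historical_flow):
    """Return the lists CT * D_inflow and CT * D_outflow, one entry per observation."""
    upper = percentile_cutoff(historical_flow, 90)  # large-inflow threshold
    lower = percentile_cutoff(historical_flow, 10)  # large-outflow threshold
    inflow_terms, outflow_terms = [], []
    for ct, flow in zip(cluster_measure, cluster_flow):
        d_inflow = 1 if flow > upper else 0
        d_outflow = 1 if flow < lower else 0
        inflow_terms.append(ct * d_inflow)
        outflow_terms.append(ct * d_outflow)
    return inflow_terms, outflow_terms


# Hypothetical data: ten historical cluster flows and three observations.
historical_flow = [-0.12, -0.05, -0.02, 0.00, 0.01, 0.02, 0.03, 0.04, 0.06, 0.20]
cluster_flow = [0.25, -0.15, 0.01]
cluster_measure = [0.5, 0.4, 0.3]
inflow_x, outflow_x = interaction_terms(cluster_measure, cluster_flow, historical_flow)
```

The two returned lists would enter the regression alongside the cluster measure itself and the control variables; the estimation step is left to whatever regression routine is used.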
The regression results for turnover, return and liquidity are summarized in Table 10.
The signs of the two interaction coefficients are consistent throughout all specifications.
Using turnover as an example, both of the coefficients on the interaction terms are
positive, suggesting higher cluster level comovement when funds in the cluster incur
large flows. However, whereas the coefficient on the interaction term for large inflow is
generally not significant, the coefficient on the interaction term for large outflow is
always significant. This difference implies an asymmetric effect of fund flow, which is
consistent with the wealth constraint story. When funds face large inflows, they have the
flexibility to smooth out their investment over time, so that price impact can be
minimized. In contrast, when funds are subject to large outflows through redemption,
they are then forced to liquidate their assets relatively quickly, which can generate a large
price impact.
Finally, expected fund flow and unexpected fund flow may have different degrees of
price impact. However, it is unclear which one should have a stronger effect. On
the one hand, if large flows are expected, funds may prepare for them and choose
investment strategies that mitigate their impact. On the other hand, as shown by the flow
correlation test in the previous section, expected fund flows tend to be more correlated
than unexpected ones. Therefore, I perform separate analyses for expected and
unexpected flow. Comparing models that use expected flows with those that use
unexpected flows, I find that unexpected outflows seem to have stronger effects. It seems
that funds do factor next period's expected flows into their investment strategies.
VII. Conclusion
Since the asset management industry is a dominant player in the U.S. equity market,
institutional investment behavior should impact stocks in terms of their trading, pricing,
and liquidity. Proceeding from this intuition, this paper shows that the existence of
institutional clienteles can partially account for the previously documented comovement
findings. Using a novel approach based on applying standard clustering
algorithms to institutional holdings, I first find the majority of institutional investors fall
into a few distinct clienteles. This partitioning seems to be stable and seems to capture
some fundamental economic characteristics. Second, I find the existence of institutional
clienteles has asset pricing implications in that there appears to be excessive comovement
of turnover, return, and liquidity at the clientele level. Finally, as a possible explanation
for the excessive clientele-level commonalities, I present evidence that institutional
investors are wealth constrained.
This paper uncovers a stable clustering of institutional holdings. One may argue,
however, that it is equally important to uncover the economic forces behind this
clustering phenomenon. Future research on this topic may uncover why institutional
holdings cluster together and also what institutional characteristics are related to this
clustering phenomenon.
Furthermore, from a methodological point of view, this paper has important
implications for risk management. Kyle and Xiong (2001) and Stephen Ross (2001) point
out that the comovement due to the intrinsic feature of the asset management industry
implies some flaw in current risk valuation methodology. Currently, researchers evaluate
portfolio risk based on historical correlation of returns of the underlying stocks. This
paper suggests that when evaluating the diversification level of one fund, one should also
take into account which other funds hold the same stocks and how healthy their financial
conditions are.
References
Allen, Franklin, and Douglas Gale, 1994, Limited market participation and volatility of
asset prices, The American Economic Review 84, 933-955.
Ang, Andrew, and Joseph Chen, 2002, Asymmetric correlations of equity portfolios,
Journal of Financial Economics 63, 443–494.
Barberis, Nicolas, Andrei Shleifer, and Jeffrey Wurgler, 2005, Comovement, Journal of
Financial Economics 75, 283-317.
Bennett, James, Richard Sias and Laura Starks, 2003, Greener Pastures and the Impact of
Dynamic Institutional Preferences, The Review of Financial Studies 16(4),
1203-1238.
Boudoukh, Jacob, Matthew Richardson, Robert Stanton, and Robert Whitelaw, 1997,
Pricing mortgage-backed securities in a multifactor interest rate
environment: multivariate density estimation approach, Review of Financial
Studies 10, 405-446.
Boyer, Brian H., Tomomi Kumagai, and Kathy Zhichao Yuan, 2005, How Do Crises
Spread? Evidence from Accessible and Inaccessible Stock Indices, AFA
2003 Washington, DC Meetings.
Boyer, Brian, and Lu Zheng, 2004, Who moves the market? A study of stock prices and
sector cashflows, Working paper, University of Michigan.
Brown, Stephen, and William Goetzmann, 1997, Mutual fund styles, Journal of
Financial Economics 43, 373-399.
Brunnermeier, Markus, and Lasse Pedersen, 2005, Market liquidity and funding liquidity,
Working paper, New York University.
Chan, Louis, Hsiu-Lang Chen, and Josef Lakonishok, 2002, On Mutual Fund Investment
Styles, The Review of Financial Studies 15, 1407-1437.
Carleton, Willard, and Victor McGee, 1970, Piecewise regression, Journal of the
American Statistical Association, 1109-1124.
Chevalier, Judith, and Glenn Ellison, 1997, Risk taking by mutual funds as a response to
incentives, The Journal of Political Economy 105, 1167-1200.
Chordia, Tarun, Richard Roll, and Avanidhar Subrahmanyam, 2000, Commonality in
liquidity, Journal of Financial Economics 56, 3-28.
Collin-Dufresne, Pierre, Robert Goldstein, and Spencer Martin, 2001, The determinants
of credit spread changes, The Journal of Finance 56, 2177-2208.
Connolly, Robert, and Albert Wang, 1998, On stock market return comovements:
Macroeconomic news, dispersion of beliefs, and contagion, Working paper,
Rice University.
Connolly, Robert, and Albert Wang, 2003, International equity market comovements:
economic fundamentals or contagion?, Pacific-Basin Finance Journal 11,
23–43.
Coughenour, Jay, and Mohsen Saad, 2004, Common market makers and commonality in
liquidity, Journal of Financial Economics 73, 37-69.
Coval, Joshua and Erik Stafford, 2005, Asset fire sales (and purchases) in equity markets,
Working paper, Harvard University.
Da, Zhi and Pengjie Gao, 2005, Clientele change, liquidity shock, and the return on
financially distressed stocks, Working paper, Northwestern University.
Edelen, Roger, 1999, Investor flows and the assessed performance of open-end mutual
funds, Journal of Financial Economics 53, 439-466.
Elton, Edwin and Martin Gruber, 1970, Improved forecasting through the design of
homogeneous groupings, Journal of Business 44, 432-450.
Frazzini, Andrea and Owen Lamont, 2005, Dumb money: mutual fund flows and the
cross-section of stock returns, Working paper, Yale University.
Gabaix, Xavier, Arvind Krishnamurthy, and Olivier Vigneron, 2006, Limits of arbitrage:
theory and evidence from the mortgage-backed securities market, Journal of
Finance, forthcoming.
Gromb, Denis, and Dimitri Vayanos, 2002, Equilibrium and welfare in markets with
financially constrained arbitrageurs, Journal of Financial Economics 66,
361-407.
Hasbrouck, Joel, and Duane Seppi, 2001, Common factors in prices, order flows and
liquidity, Journal of Financial Economics 59, 383-411.
Hasbrouck, Joel, 2005, Trading Costs and Returns for US Equities: The Evidence from
Daily Data, Working Paper, New York University.
Huberman, Gur, and Dominika Halka, 2001, Systematic Liquidity, The Journal of
Financial Research 24, 161-178.
Ippolito, Richard, 1992, Consumer reaction to measures of poor quality: evidence from
the mutual fund industry, Journal of Law and Economics 35, 45-70.
Kacperczyk, Marcin, Clemens Sialm, and Lu Zheng, 2005, On the Industry
Concentration of Actively Managed Equity Mutual Funds, Journal of
Finance 60, 1983-2011.
Karolyi, Andrew, and Rene Stulz, 1996, Why do markets move together? An
investigation of U.S.-Japan stock return comovements, Journal of Finance
51, 951–986.
King, Mervyn, and Sushil Wadhwani, 2000, Transmission of volatility between stock
markets, Review of Financial Studies 3, 5–33.
Kodres, Laura E., and Matthew Pritsker, 2002, A rational expectations model of financial
contagion, Journal of Finance 57, 769-799.
Kyle, Albert, and Wei Xiong, 2001, Contagion as a wealth effect, The Journal of Finance
56, 1401-1440.
Lee, Charles, Andrei Shleifer, and Richard H. Thaler, 1991, Investor sentiment and the
closed-end fund puzzle, Journal of Finance 46, 75-109.
Longin, Francois, and Bruno Solnik, 2001, Extreme correlation of international equity
markets, The Journal of Finance 56, 649–676.
Merton, Robert, 1987, A simple model of capital market equilibrium with incomplete
information, The Journal of Finance 42, 483-510.
Newman, Yigal, and Michael Rierson, 2004, Illiquidity spillovers: theory and evidence
from european telecom bond issuance, Working paper, Stanford University.
Nieuwerburgh, Stijn Van, and Laura Veldkamp, 2006, Information acquisition and
portfolio under-diversification, Working paper, New York University.
Pasquariello, Paolo, 2006, Imperfect Competition, Information Heterogeneity, and
Financial Contagion, Review of Financial Studies.
Pindyck, Robert, and Julio J. Rotemberg, 1993, The comovement of stock prices, The
Quarterly Journal of Economics 108, 1073-1104.
Ross, Stephen, 2001, Discussion: Contagion as a wealth effect, Journal of Finance 56,
1440-1443.
Shleifer, Andrei, and Robert Vishny, 1997, The limits of arbitrage, The Journal of
Finance 52, 35-55.
Sirri, Erik, and Peter Tufano, 1998, Costly search and mutual fund flows, The Journal of
Finance 53, 1589-1622.
Veldkamp, Laura, 2005, Information markets and the comovement of asset prices,
Forthcoming in Review of Economic Studies.
Wermers, Russ, 2003, Mutual fund performance: An empirical decomposition into stock-picking
talent, style, transaction costs, and expenses, The Journal of Finance
55, 1655-1703.
Yuan, Kathy, 2005, Asymmetric price movements and borrowing constraints: A REE
model of crisis, contagion, and confusion, Journal of Finance 60, 379-411.
Table 1
Institutional Trading (All Institutions)
This table reports the median number of stocks held and traded by a typical institutional investor each
quarter. Institutional ownership data are obtained from CDA Spectrum. Sample period is 1980-2003. We
delete the first quarter and last quarter observation for each fund to preclude artificial counting of purchase
or sale due to funds entering or exiting the database. If a fund has a missing report during a quarter, we do
not count the number of trades in the following quarter. A fund is considered to buy/sell a stock if it
increases/decreases its shares from the last quarter. A purchase is labeled as "new buy" if it was not in the
portfolio but is newly added in the current quarter. A sale is labeled as "exit" if it is completely eliminated
from the portfolio.
Year   # of funds  Hold  Buy   Sell  New buy  Exit  Unchanged
1980   438         190   83    75    23       16    48
1981   478         191   83    78    21       19    49
1982   502         194   85    83    25       22    48
1983   523         213   99    87    31       24    51
1984   584         216   94    92    27       26    56
1985   635         227   106   93    33       27    55
1986   687         236   112   99    34       31    55
1987   739         246   114   105   34       31    59
1988   779         247   117   112   31       28    45
1989   752         263   115   103   32       29    75
1990   841         253   104   102   26       28    76
1991   872         257   112   97    31       25    73
1992   949         266   122   101   33       27    70
1993   966         280   134   108   40       30    68
1994   997         290   129   124   40       36    73
1995   1099        298   138   123   40       36    73
1996   1083        308   153   124   47       38    68
1997   1199        312   155   130   46       39    67
1998   1323        310   155   132   50       43    66
1999   1310        313   156   146   50       50    62
2000   1514        310   162   147   54       52    54
2001   1579        287   146   129   43       40    52
2002   1601        285   143   131   38       39    49
2003   1690        288   148   129   41       36    47
Total  964         262   124   111   36       32    60
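The classification rules in the caption of Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: holdings are represented as hypothetical dictionaries from a stock identifier to shares held, and a "new buy" or "exit" is counted as a subset of buys or sells, which the caption suggests but does not state outright.

```python
def classify_trades(prev_holdings, curr_holdings):
    """Count buys, sells, new buys, exits, and unchanged positions between two quarters.

    Both arguments map a stock identifier to the number of shares held.
    """
    counts = {"buy": 0, "sell": 0, "new_buy": 0, "exit": 0, "unchanged": 0}
    for stock in set(prev_holdings) | set(curr_holdings):
        before = prev_holdings.get(stock, 0)
        after = curr_holdings.get(stock, 0)
        if after > before:
            counts["buy"] += 1
            if before == 0:          # not in the portfolio last quarter
                counts["new_buy"] += 1
        elif after < before:
            counts["sell"] += 1
            if after == 0:           # position completely eliminated
                counts["exit"] += 1
        else:
            counts["unchanged"] += 1
    return counts


# Hypothetical holdings for one fund in two consecutive quarters.
previous = {"AAA": 100, "BBB": 50, "CCC": 30, "DDD": 10}
current = {"AAA": 150, "BBB": 50, "CCC": 10, "EEE": 20}
trade_counts = classify_trades(previous, current)
```

In this example the fund adds to AAA, newly buys EEE, trims CCC, exits DDD, and leaves BBB unchanged.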
Table 2
Summary Statistics for Liquidity and Turnover
Panel A summarizes the liquidity variables. The proportional quoted spread and the proportional effective
spread are used as proxies for liquidity measures. TAQ data from 1993-2003 are used to compute spread
measures. Daily spread measures are calculated as the simple average of spreads of every transaction and
quote during a day. Panel B summarizes the daily turnover measure, with sample period from 1980-2003.
All the statistics reported are the cross-sectional statistics for time-series means among the stocks. For each
measure, separate summary statistics are reported for all the stocks in the database and for the stocks having
institutional ownership.
Panel A: Daily Liquidity Measures (Cross-sectional statistics for time-series means)
                                                      N       Mean    Median  Std. Deviation
Proportional Quoted Spread (Whole Sample)             15,044  0.0437  0.0259  0.0575
Proportional Quoted Spread (Held by Institutions)     15,034  0.0424  0.0257  0.0546
Proportional Effective Spread (Whole Sample)          14,722  0.0314  0.0194  0.0392
Proportional Effective Spread (Held by Institutions)  14,722  0.0306  0.0192  0.0377
Panel B: Daily Turnover (Cross-sectional statistics for time-series means)
                                                      N       Mean    Median  Std. Deviation
Turnover (Whole Sample)                               21,352  0.0059  0.0030  0.0848
Turnover (Held by Institutions)                       20,130  0.0060  0.0031  0.0871
Table 3
Number of Clusters
Hierarchical clustering is performed each quarter based on the pair-wise distances of the portfolios. All the
statistics are simple averages of the quarterly values within a year. Column 2 is the total number of funds
per quarter. Column 3 counts the average number of clusters obtained each period. Columns 4 and 5
compute the average percentage of funds covered by the largest 10 clusters in terms of count of funds and
market value.
Year   # of funds  # of clusters  Largest 10/total count  Largest 10/total asset
1980   475         39             86.86%                  94.73%
1981   511         45             82.79%                  93.08%
1982   536         48             82.48%                  93.05%
1983   574         55             79.65%                  92.21%
1984   631         63             76.41%                  91.58%
1985   694         68             73.70%                  90.01%
1986   746         83             67.67%                  87.44%
1987   809         88             65.58%                  85.66%
1988   834         86             66.29%                  86.58%
1989   827         86             67.42%                  86.24%
1990   892         84             68.91%                  86.79%
1991   932         88             68.81%                  88.23%
1992   1018        107            61.76%                  83.40%
1993   1035        115            57.66%                  82.53%
1994   1086        124            58.55%                  83.52%
1995   1181        136            58.00%                  83.13%
1996   1191        136            57.71%                  81.82%
1997   1320        161            57.98%                  84.20%
1998   1452        163            58.70%                  86.04%
1999   1465        140            63.79%                  90.36%
2000   1646        151            63.22%                  89.82%
2001   1673        158            62.22%                  90.24%
2002   1741        175            59.42%                  88.17%
2003   1800        179            57.26%                  85.35%
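The last two columns of Table 3 report how much of the fund universe the largest clusters cover. A minimal Python sketch of that calculation follows. It assumes that "largest" means largest by number of member funds, and the fund identifiers, cluster labels, and asset values are hypothetical.

```python
from collections import Counter


def top_cluster_coverage(fund_clusters, fund_assets, top_n=10):
    """Share of funds, by count and by assets, that fall in the top_n largest clusters.

    fund_clusters maps fund id -> cluster label; fund_assets maps fund id -> market value.
    Clusters are ranked by number of member funds (an assumption).
    """
    sizes = Counter(fund_clusters.values())
    top = {label for label, _ in sizes.most_common(top_n)}
    in_top = [fund for fund, label in fund_clusters.items() if label in top]
    count_share = len(in_top) / len(fund_clusters)
    asset_share = sum(fund_assets[f] for f in in_top) / sum(fund_assets.values())
    return count_share, asset_share


# Hypothetical example with three funds, looking only at the single largest cluster.
clusters = {"f1": "A", "f2": "A", "f3": "B"}
assets = {"f1": 100.0, "f2": 300.0, "f3": 600.0}
count_share, asset_share = top_cluster_coverage(clusters, assets, top_n=1)
```

Here cluster A holds two of the three funds but only 40% of the assets, which shows why the count-based and asset-based columns can differ.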
Table 4
Mutual Fund Clusters and Styles
The table summarizes the relationship between the clusters obtained from the mutual fund holding datasets and mutual funds’ ICDI objective. For each cluster,
we identify a dominant style, which represents the highest number of funds within the cluster. Panel A reports the average percentage of funds covered by the
dominant style. Only the largest 10 clusters and clusters containing more than 10 funds are analyzed. Panel B reports the percentage of the total number of
clusters belonging to various styles. A cluster is only associated with a dominant style. "AG", "BL", "GI", "IN", "LG", "SF", "TR", "UT" and "IE" represent
aggressive growth, balanced, growth and income, income, long term growth, sector funds, total return, utility funds, and international equity, respectively.
Panel A: Average percentage of funds within a cluster covered by the dominant style
year   AG     BL     GI     IN     LG     SF     TR     UT     IE
1993   66.32  28.57  48.20  48.76  48.43  46.15  30.30  76.53  .
1994   79.23  40.00  44.52  54.89  55.55  69.95  33.33  95.83  50.00
1995   68.03  43.33  43.42  44.57  51.66  78.87  36.36  87.16  50.00
1996   67.30  34.84  40.54  54.10  50.96  85.52  42.48  89.63  .
1997   83.53  36.67  43.01  43.09  49.34  61.30  34.38  91.29  .
1998   75.16  40.00  41.65  55.84  52.65  75.74  35.42  86.36  .
1999   83.98  30.00  43.69  38.59  51.46  76.32  .      83.85  100.00
2000   88.63  .      41.78  43.81  47.12  76.90  .      81.83  100.00
2001   88.12  .      43.67  41.63  53.28  72.17  30.00  87.76  100.00
2002   87.30  .      40.50  .      54.47  83.28  .      89.25  100.00
2003   90.87  .      40.29  .      55.39  84.82  .      95.66  .
whole  79.86  36.20  42.84  47.25  51.85  73.73  34.61  87.74  83.33
Panel B: Average percentage of clusters for each style
year   AG     BL     GI     IN     LG     SF     TR     UT     IE
1993   7.5    3.8    24.3   8.8    51.8   4.3    8.7    4.6    .
1994   13.0   3.6    20.1   8.7    52.2   4.7    3.1    3.4    3.1
1995   11.7   5.7    19.5   6.3    52.5   5.0    3.0    3.1    3.6
1996   15.3   4.1    16.8   5.0    50.4   5.4    3.0    3.8    .
1997   13.7   2.6    15.8   7.5    49.1   9.4    2.5    3.1    .
1998   12.4   2.7    17.2   4.9    49.0   9.6    2.8    3.5    .
1999   21.7   3.0    17.1   3.7    40.8   10.7   .      3.0    3.0
2000   18.2   .      13.3   6.3    48.0   13.2   .      4.1    3.3
2001   17.5   .      15.9   4.4    40.1   16.6   3.3    3.3    3.3
2002   25.7   .      17.1   .      35.9   17.1   .      3.4    3.3
2003   22.3   .      19.8   .      37.8   16.1   .      4.1    .
whole  16.3   3.7    17.9   6.2    46.2   10.2   3.8    3.6    3.3
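The two panels of Table 4 rest on a simple tabulation: find each cluster's dominant style, measure the share of funds it covers (Panel A), and then count how clusters distribute across dominant styles (Panel B). The Python sketch below illustrates this under the assumption that each cluster is given as a list of its funds' style codes. The sample clusters are hypothetical, and ties are broken by first appearance, which the paper does not address.

```python
from collections import Counter


def dominant_style(styles_in_cluster):
    """Return (dominant style, percentage of the cluster's funds with that style)."""
    counts = Counter(styles_in_cluster)
    style, n = counts.most_common(1)[0]  # ties resolved by first appearance
    return style, 100.0 * n / len(styles_in_cluster)


def style_shares(clusters):
    """Percentage of clusters whose dominant style is each style."""
    dominant = Counter(dominant_style(c)[0] for c in clusters)
    total = sum(dominant.values())
    return {style: 100.0 * n / total for style, n in dominant.items()}


# Hypothetical clusters, each a list of fund style codes.
example_clusters = [
    ["AG", "AG", "LG", "AG", "SF"],
    ["LG", "LG", "GI"],
    ["LG", "LG"],
    ["AG", "AG"],
]
first_style, first_coverage = dominant_style(example_clusters[0])
shares = style_shares(example_clusters)
```

In the first cluster, aggressive growth is dominant and covers 60% of the funds; across the four clusters, half are dominated by AG and half by LG.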
Table 5
Transition Rate of Pair-Wise Connections Between Funds
In each quarter, we study the pair-wise connection between funds; connection takes the value of 1 or 0
depending on whether the two funds under study fall into the same cluster or not. We then count the
percentage of pair-wise connections remaining unchanged the next quarter. The higher the percentage,
the higher the stability of clustering. Column 2 counts the number of pair-wise connections that are the
same as the last quarter, and column 3 counts the total number of pair-wise connections for funds that exist
in both the previous and the current quarter. Column 4 is the transition rate, which computes the
percentage of connections that changed since last quarter. Column 5 reports the bootstrapped transition
rate under the null of no cross-sectional structure. The last column reports the standard deviation of the
bootstrapped null distribution.
Year   Stay     Total     Transition rate  Null transition rate  Std.
1981   64309    84993     0.2434           0.3169                0.0050
1982   71634    90354     0.2071           0.3684                0.0064
1983   69886    90112     0.2246           0.3456                0.0071
1984   86818    109039    0.2052           0.3343                0.0058
1985   104785   129784    0.1935           0.2862                0.0043
1986   128892   156826    0.1782           0.2064                0.0024
1987   139789   163463    0.1451           0.1967                0.0024
1988   172574   200615    0.1397           0.2051                0.0021
1989   180288   210005    0.1407           0.2218                0.0033
1990   180133   213918    0.1558           0.2170                0.0023
1991   239758   283542    0.1542           0.2028                0.0016
1992   266787   304078    0.1231           0.1567                0.0012
1993   303359   341548    0.1122           0.1425                0.0012
1994   273712   306200    0.1061           0.1521                0.0014
1995   366303   409872    0.1066           0.1608                0.0016
1996   376703   421103    0.1066           0.1433                0.0013
1997   357245   405887    0.1204           0.1803                0.0018
1998   464076   526623    0.1193           0.1533                0.0013
1999   448169   509174    0.1194           0.1626                0.0013
2000   593227   686772    0.1366           0.1950                0.0014
2001   751530   866017    0.1323           0.2020                0.0015
2002   761940   881229    0.1353           0.1926                0.0015
2003   914199   1040793   0.1221           0.1914                0.0016
Total                     0.1490           0.2145
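The transition rate described in the caption of Table 5 can be sketched as follows. The sketch compares, for every pair of funds present in both quarters, whether the pair's same-cluster indicator is unchanged. Because only within-quarter comparisons are made, cluster labels do not need to be matched across quarters. The function name and the sample assignments are hypothetical.

```python
from itertools import combinations


def transition_rate(prev_clusters, curr_clusters):
    """Return (stay, total, transition rate) over pairs of funds present in both quarters.

    Each argument maps fund id -> cluster label for one quarter. A pair "stays"
    if its same-cluster indicator (1 or 0) is identical in both quarters.
    Requires at least two funds common to both quarters.
    """
    common = sorted(set(prev_clusters) & set(curr_clusters))
    stay = total = 0
    for a, b in combinations(common, 2):
        linked_before = prev_clusters[a] == prev_clusters[b]
        linked_now = curr_clusters[a] == curr_clusters[b]
        total += 1
        if linked_before == linked_now:
            stay += 1
    return stay, total, 1 - stay / total


# Hypothetical cluster assignments; fund f5 appears only in the second quarter.
last_quarter = {"f1": 1, "f2": 1, "f3": 2, "f4": 2}
this_quarter = {"f1": 1, "f2": 2, "f3": 2, "f4": 2, "f5": 3}
stay, total, rate = transition_rate(last_quarter, this_quarter)
```

In the example, fund f2 moves from the first cluster to the second, which changes three of the six pair-wise connections.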
Table 6
Cluster Transition Probability
Inter-temporal links are established for each cluster. The dummy transition variable is created for each
fund-quarter observation. If a fund changes its associated cluster from one quarter to the next, then the
transition variable takes a value of 1, and 0 otherwise. Then the logistic regression of the transition
variable is estimated. The institutional type and the fund's own past net performance are used to explain
the transition rates.
                Value  Count
Transfer        1      16141
Not transfer    0      36691
p                      0.3055

Parameter                     DF  Estimate  Chi-Square  Pr>ChiSq
Intercept                     1   -0.8127   2931.2696   <.0001
Bank                          1   -1.0383   1700.0156   <.0001
Insurance                     1   0.0157    0.2619      0.6088
Mutual fund                   1   0.3910    159.7305    <.0001
Independent advisor           1   0.5120    939.1836    <.0001
Past quarter net performance  1   -2.4021   45.7134     <.0001
Past year net performance     1   -0.9236   152.3284    <.0001
Year fixed effect             22            578.8553    <.0001
Quarter fixed effect          3             13.9422     0.003

                              Odds ratio
Bank vs. Other                0.314
Insurance vs. Other           0.901
Mutual vs. Other              1.312
IA vs. Other                  1.481
Past quarter net performance  0.934
Past year net performance     0.842
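As a rough consistency check on Table 6, the reported odds ratios for the institution types (0.314, 0.901, 1.312, 1.481) can be reproduced from the reported estimates if one assumes effect (deviation) coding, in which the omitted "Other" category's coefficient equals minus the sum of the reported ones. That coding is an inference from the numbers, not something the table states. The unconditional transition probability follows directly from the two counts.

```python
import math

# Estimates for the institution-type dummies as reported in Table 6.
estimates = {
    "Bank": -1.0383,
    "Insurance": 0.0157,
    "Mutual fund": 0.3910,
    "Independent advisor": 0.5120,
}

# Assumption: effect coding, so the "Other" coefficient is minus the sum of the rest.
other = -sum(estimates.values())

# Odds ratio of each type versus "Other".
odds_ratios = {name: math.exp(b - other) for name, b in estimates.items()}

# Share of fund-quarter observations that change cluster (Transfer vs. Not transfer).
unconditional_p = 16141 / (16141 + 36691)
```

Under this assumption the computed ratios agree with the table to three decimal places, and the transition share matches the reported p of 0.3055.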
Table 7
Clusters and Comovement
Characteristics studied include turnover, return, quoted spread and effective spread. For each stock, its individual daily characteristic is regressed on the cluster
average characteristic, the contemporaneous market characteristic, average characteristics for the portfolios matched based on size, book to market and past year
return, and Fama-French 48 Industry, respectively. All the variables on the right-hand side exclude the stock under study. The reported coefficients are the cross-sectional means of the firm-by-firm time-series regressions. For the cluster variable, the percentages of positive and positive significant coefficients among individual time-series
regressions are also reported. Sample period is 1980-2003 for turnover and return regressions and 1993-2003 for quoted spread and effective spread
regressions.
                          Turnover   Turnover   Return     Return     Quoted     Quoted     Effective  Effective
                                                                      Spread     Spread     Spread     Spread
Cluster                              0.0230                0.0454                0.04418               0.0367
  std error                          0.0010                0.0008                0.00371               0.0033
  % positive                         56.54%                61.92%                56.21%                56.32%
  % positive significant             22.12%                15.67%                36.35%                31.78%
Market                    0.0136     0.0129     -0.0073    -0.0102    0.02429    0.019404   0.0455     0.0384
  std error               0.0021     0.0021     0.0014     0.0014     0.008693   0.008491   0.0073     0.0072
Size                      0.0782     0.0732     0.1211     0.1118     0.262833   0.255358   0.2267     0.2242
  std error               0.0027     0.0027     0.0025     0.0025     0.009907   0.009703   0.0082     0.0081
BM                        0.0023     0.0011     -0.0064    -0.0071    -0.01243   -0.01409   -0.0065    -0.0090
  std error               0.0025     0.0025     0.0025     0.0025     0.009537   0.009245   0.0077     0.0075
Momentum                  0.0333     0.0304     0.0407     0.0375     0.056273   0.052262   0.0580     0.0542
  std error               0.0023     0.0023     0.0022     0.0021     0.008637   0.008339   0.0072     0.0070
Industry                  0.0466     0.0459     0.0752     0.0739     0.126926   0.117616   0.1054     0.0980
  std error               0.0016     0.0016     0.0017     0.0017     0.006569   0.006283   0.0051     0.0049
N                         9253       9096       9944       9658       5510       5401       5247       5163
Mean R-square             0.0958     0.1012     0.0861     0.0883     0.4818     0.4988     0.4107     0.4238
Mean Adjusted R-square    0.0843     0.0879     0.0753     0.0762     0.4732     0.4891     0.4005     0.4120
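The caption of Table 7 notes that every right-hand-side average excludes the stock under study. A minimal Python sketch of such a leave-one-out average follows. The function name and the turnover figures are hypothetical, and an equal-weighted average is assumed because the caption does not state the weighting.

```python
def leave_one_out_means(values):
    """For each stock, average the characteristic over the other stocks in the group.

    values maps stock id -> characteristic (for example, one day's turnover)
    for all stocks held by one cluster. Equal weighting is an assumption.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two stocks to exclude one")
    total = sum(values.values())
    return {stock: (total - v) / (n - 1) for stock, v in values.items()}


# Hypothetical daily turnover for three stocks held by the same cluster.
turnover = {"AAA": 0.010, "BBB": 0.020, "CCC": 0.030}
cluster_average = leave_one_out_means(turnover)
```

Excluding the stock itself avoids a mechanical correlation between the dependent variable and the cluster average, which matters most for clusters holding few stocks.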
Table 8
Institutional Trading (Mutual Funds)
The table reports median number of stocks held and traded by a mutual fund each quarter. Mutual fund
ownership data is obtained from CDA Spectrum. Sample period is from 1980 to 2003. We delete the first
quarter and last quarter observation for each fund to preclude artificial counting of purchase or sale due to
funds entering or exiting the database. If a fund has a missing report during a quarter, we do not count the
number of trades in the immediate subsequent quarter. A fund is considered to buy/sell a stock if it
increases/decreases its shares from the last quarter. A purchase is labeled as "new buy" if it was not in the
portfolio but is newly added in the current quarter. A sale is labeled as "exit" if it is completely eliminated
from the portfolio. Panels A, B, C and D report statistics for funds with 0%<flow<19%,
flow>=19%, -7%<flow<0% and flow<=-7%, respectively.
Panel A: Institutions with 0<flow<19%
year       # of funds  flow      hold  buy   sell  new buy  exit  unchange
1980-1984  61          3.39%     53    21    12    8        7     26
1985-1989  96          4.01%     56    23    15    10       8     26
1990-1994  173         4.13%     64    27    15    9        8     30
1995-1999  195         4.18%     74    33    19    11       11    28
2000-2001  269         3.40%     76    38    23    11       11    22
whole      154         3.84%     64    28    17    10       9     26
Panel B: Institutions with flow>=19%
year       # of funds  flow      hold  buy   sell  new buy  exit  unchange
1980-1984  7           33.08%    54    32    13    14       9     21
1985-1989  16          31.77%    48    33    12    13       9     15
1990-1994  32          33.10%    57    42    14    15       10    17
1995-1999  44          35.77%    63    46    16    14       12    12
2000-2003  42          55.37%    63    45    17    14       11    12
whole      28          37.09%    57    39    14    14       10    16
Panel C: Institutions with -7%<flow<0
year       # of funds  flow      hold  buy   sell  new buy  exit  unchange
1980-1984  99          -2.54%    47    12    11    5        6     27
1985-1989  147         -2.81%    51    16    15    7        8     26
1990-1994  145         -2.20%    57    18    16    8        8     29
1995-1999  179         -2.46%    66    25    24    10       11    27
2000-2003  376         -2.76%    74    28    31    12       12    21
whole      181         -2.54%    58    20    19    8        9     26
Panel D: Institutions with flow<=-7%
year       # of funds  flow      hold  buy   sell  new buy  exit  unchange
1980-1984  9           -11.19%   62    16    17    8        10    31
1985-1989  30          -10.41%   53    18    23    10       13    20
1990-1994  22          -11.12%   58    18    27    10       12    22
1995-1999  58          -11.54%   62    23    33    11       13    16
whole      37          -11.04%   61    20    28    10       12    20
Table 9
Flow Regressions
For each fund, we construct its cluster fund flow—that is, the average flow for funds sharing the same cluster (excluding the fund itself). We perform
time-series regressions of individual fund flow on market average flow (also excluding the fund itself) and cluster average flow for each fund. The
reported coefficients are the median coefficients across all funds.
Variable          flow_cluster_std  flow_style_std  lag_cluster_std  lag_style_std  flow_mkt_std  lag_mkt_std  lag_flow_std  return_std
flow_cluster_std  1.000
flow_style_std    0.318             1.000
lag_cluster_std   0.667             0.274           1.000
lag_style_std     0.108             0.272           0.131            1.000
flow_mkt_std      0.321             0.690           0.287            0.237          1.000
lag_mkt_std       0.282             0.588           0.325            0.270          0.863         1.000
lag_flow_std      0.259             0.174           0.313            0.076          0.148         0.168        1.000
return_std        0.115             0.165           0.064            0.023          0.195         0.079        0.035         1.000
lag_return_std    0.115             0.157           0.118            0.060          0.184         0.196        0.070         0.430
Panel A: Performance Correlation
Cluster Performance (t)               0.8266
  Std. Error                          0.0079
  %positive                           98.47%
  %positive significant               93.55%
Cluster Performance (t-1)             0.0091
  Std. Error                          0.0054
  %positive                           52.52%
  %positive significant               7.85%
Market Performance (t)     0.8737     0.1217
  Std. Error               0.0026     0.0080
Market Performance (t-1)   -0.0130    -0.0062
  Std. Error               0.0041     0.0042
Self Performance (t-1)     -0.0078    -0.0040
  Std. Error               0.0043     0.0040
Log (tna)                  -0.0183    -0.0069
  Std. Error               0.0039     0.0028
N                          1629       1567
Mean R-square              0.7903     0.9008
Mean Adjusted R-square     0.7708     0.8872
Panel B: Flow Correlation
Cluster Flow (t)
Std. Error
%positive
%positive significant
Flow
0.1430
0.0087
70.51%
34.88%
Expected flow
0.1472
0.0074
71.44%
33.65%
Unexpected flow
0.1570
0.0084
74.11%
33.58%
Cluster Flow (t-1)
Std. Error
%positive
%positive significant
-0.0506
0.0063
41.82%
8.26%
0.0539
0.0053
61.47%
17.79%
0.0139
0.0054
55.25%
8.90%
Style Flow (t)
Std. Error
0.0783
0.0012
0.0665
0.0074
0.0274
0.0095
0.0759
0.0111
0.0616
0.0062
0.0727
0.0069
Style Flow (t-1)
Std. Error
-0.0068
0.0061
-0.0050
0.0060
0.0664
0.0090
0.0370
0.0102
0.0195
0.0048
0.0107
0.0056
Market Flow (t)
Std. Error
0.0993
0.0080
0.0699
0.0079
0.1472
0.0094
0.0929
0.0124
0.0789
0.0063
0.0551
0.0070
Market Flow (t-1)
Std. Error
-0.0621
0.0060
-0.0446
0.0068
0.0086
0.0076
-0.0284
0.0112
-0.0084
0.0051
-0.0130
0.0054
Flow (t-1)
Std. Error
0.3659
0.0090
0.3491
0.0091
Self Performance (t)
Std. Error
0.0069
0.0045
0.0025
0.0045
Self Performance (t-1)
Std. Error
0.0116
0.0036
0.0118
0.0035
Log(tna)
0.0182
0.0117
1802
0.4818
0.4097
0.0075
0.0119
1743
0.5175
0.4357
N
Mean R-square
Mean Adjusted R-square
0.0020
0.0101
1506
0.2958
0.2513
-0.0425
0.0097
1495
0.4049
0.3505
0.0243
0.0065
1506
0.1183
0.0629
0.0108
0.0059
1495
0.2031
0.1311
Table 10
Flows and Cluster-Level Comovement
The specification of the test is the same as in Table 7, except that two more variables are added into the regression. The first new variable, Cluster*Large_Inflow,
is an interaction term between the cluster-level characteristic and a dummy variable indicating whether the average flow for funds in the clusters that hold the stock
is a large inflow or not. The second new variable is constructed likewise, except that the dummy variable for Large_Inflow is replaced by a dummy variable
for Large_Outflow. An inflow/outflow is classified as large if it is above/below the top/bottom tenth percentile of the flow distribution.
Turnover
0.0196 0.0264
0.0050 0.0050
53.13% 55.48%
19.82% 19.09%
0.0180
0.0025
56.04%
11.07%
Return
0.0168
0.0024
56.58%
11.31%
0.0154
0.0024
56.20%
9.70%
Quoted Spread
0.0385 0.0287 0.0315
0.0061 0.0059 0.0057
56.49% 55.46% 52.72%
33.31% 31.33% 31.02%
Effective Spread
0.0367 0.0311 0.0264
0.0049 0.0048 0.0045
55.54% 55.69% 54.97%
26.35% 24.54% 24.62%
Cluster
std error
% positive
% positive significant
0.0177
0.0052
52.24%
18.53%
Cluster*large_inflow
std error
% positive
% positive significant
0.0057 0.0033 -0.0074
0.0068 0.0075 0.0071
49.32% 47.61% 45.78%
20.74% 20.30% 15.95%
0.0043 0.0048 0.0051
0.0037 0.0039 0.0036
50.77% 48.06% 50.00%
6.31% 6.87% 6.92%
0.0002 0.0010 0.0005
0.0021 0.0015 0.0015
49.37% 50.17% 51.16%
28.11% 32.17% 32.66%
0.0002
0.0018
50.22%
27.10%
0.0003 0.0006
0.0011 0.0011
50.58% 50.69%
28.97% 27.13%
Cluster*large_outflow
std error
% positive
% positive significant
0.0194 0.0102 0.0152
0.0054 0.0059 0.0052
51.99% 52.05% 51.50%
19.29% 19.73% 19.93%
0.0066 0.0063 0.0089
0.0024 0.0024 0.0036
52.97% 68.72% 53.11%
9.55% 13.40% 9.06%
0.0044 0.0041 0.0049
0.0024 0.0014 0.0017
52.12% 51.19% 54.12%
22.35% 20.68% 23.43%
0.0021
0.0014
51.41%
27.07%
0.0018 0.0148
0.0011 0.0011
53.52% 54.25%
29.03% 28.96%
Market
std error
-0.0132
0.0036
-0.0145
0.0035
-0.0169
0.0036
-0.0457
0.0026
-0.0445
0.0026
-0.0422
0.0025
-0.0322
0.0130
-0.0302
0.0126
-0.0376
0.0127
-0.0238
0.0111
-0.0240
0.0108
-0.0358
0.0101
Size
std error
0.0583
0.0053
0.0547
0.0053
0.0510
0.0053
0.0380
0.0050
0.0431
0.0049
0.0500
0.0049
0.1626
0.0162
0.1592
0.0159
0.1452
0.0157
0.1129
0.0134
0.1174
0.0134
0.1176
0.0129
BM
std error
0.0092
0.0045
0.0123
0.0046
0.0126
0.0046
0.0289
0.0047
0.0271
0.0047
0.0223
0.0047
-0.0056
0.0165
-0.0069
0.0158
-0.0016
0.0158
-0.0050
0.0126
-0.0008
0.0129
-0.0157
0.0129
Momentum
std error
0.0261
0.0043
0.0274
0.0042
0.0264
0.0042
0.0353
0.0041
0.0319
0.0040
0.0311
0.0040
0.0054
0.0164
0.0074
0.0151
0.0118
0.0159
0.0207
0.0122
0.0215
0.0115
0.0195
0.0115
Industry
std error
0.0834
0.0031
0.0801
0.0031
0.0833
0.0032
0.1272
0.0030
0.1255
0.0030
0.1274
0.0029
0.1865
0.0112
0.1925
0.0114
0.1930
0.0112
0.1888
0.0090
0.1868
0.0090
0.1908
0.0089
N
Mean R-square
Mean Adjusted R-square
2385
0.1103
0.0942
2396
0.1093
0.0932
2347
0.1085
0.0919
2250
0.0742
0.0590
2264
0.0681
0.0537
2267
0.0741
0.0599
1657
0.4052
0.3934
1749
0.3999
0.3880
1728
0.3885
0.3760
1651
0.3219
0.3074
1679
0.3179
0.3031
1661
0.3009
0.2854
[Figure 1 plot: probability density (vertical axis, 0 to 0.25) against pairwise distance (horizontal axis, 0 to 2).]
Figure 1: Empirical distribution of pair-wise distance metric for institutional holdings (Quarter one
of Year 1994). The blue histogram represents the empirical distribution of the pair-wise distance metric
from the data, and the red line plots the corresponding distribution for the sample simulated under the null
of no holdings. The pair-wise distance measure is defined as the sum of absolute deviation between two
funds’ portfolio weights. For the null distribution, each fund chooses its portfolio randomly,
while keeping fund size and portfolio concentration the same as in the data.
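The pair-wise distance defined in the caption of Figure 1, the sum of absolute deviations between two funds' portfolio weights, can be written as a short Python function. The dictionary representation of a portfolio and the sample weights are assumptions for illustration. When each fund's weights sum to one, the measure ranges from 0 for identical portfolios to 2 for portfolios with no stock in common, which matches the horizontal range of the figure.

```python
def pairwise_distance(weights_a, weights_b):
    """Sum of absolute differences between two funds' portfolio weights.

    Each argument maps stock id -> portfolio weight; a stock a fund does not
    hold is treated as having weight zero.
    """
    stocks = set(weights_a) | set(weights_b)
    return sum(abs(weights_a.get(s, 0.0) - weights_b.get(s, 0.0)) for s in stocks)


# Hypothetical portfolios.
fund_a = {"AAA": 0.5, "BBB": 0.5}
fund_b = {"BBB": 0.5, "CCC": 0.5}
fund_c = {"DDD": 1.0}

partial_overlap = pairwise_distance(fund_a, fund_b)
no_overlap = pairwise_distance(fund_a, fund_c)
identical = pairwise_distance(fund_a, fund_a)
```

A matrix of these distances over all fund pairs is the kind of input a hierarchical clustering routine would take.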
Figure 2: Size and book to market rankings for stocks held by top 10 clusters (second quarter of
sample year 1995). Stocks are ranked into 5 by 5 size and book-to-market portfolios. Each bubble in the
graph represents a particular cluster’s average rank along the two dimensions. Its center corresponds to the
mean rank for the cluster, and the widths along the two dimensions represent the standard errors around the
mean.