
The term “alpha” represents the difference between the return on an investment and the return which could have been achieved in an index with identical risk exposure, quantifying a fund manager’s skill. A recent study by Laurent Barras, Olivier Scaillet, and Russ Wermers investigates the presence of true alpha in the results of 2,076 open-end domestic equity mutual funds for the thirty-two years from January 1975 to December 2006.
The study, “False Discoveries in Mutual Fund Performance: Measuring Luck in Estimated Alphas,” uses t-statistic hypothesis tests to compare funds’ relative performance and applies a “False Discovery Test” to limit the false positive and false negative results that commonly plague statistical analysis. Unlike many previous studies of mutual fund performance, this method distinguishes fund results based on luck from those based on skill.
The conclusions of the study decisively reveal the folly of chasing alpha. Using data that avoids survivorship bias and excludes funds with less than five years of performance history, and taking into account the large effects of active management fees, the study concludes that 99.4% of all fund managers failed to demonstrate true stock-picking ability.
In a July 2008 New York Times article titled, “The Prescient Are Few”, journalist Mark Hulbert digs into the results of the landmark study and its implications as described by Prof. Russ Wermers who headed up the study. “The number of funds that have beaten the market over their entire histories is so small that the False Discovery Rate test can’t eliminate the possibility that the few that did were merely false positives,” says Prof. Wermers--or as Hulbert puts it “just lucky.”
Figure 3-6B
In a study of the Morningstar Direct database, the same conclusions were reached. Virtually no evidence of stock-picking skill was found. A multivariable regression analysis of historical returns was conducted to determine whether a fund manager has skill or, to put it in academic terms, has reliably delivered alpha. The three variables used were the three Fama-French risk factors of market, size and value. This analysis reveals the extent to which the returns can be replicated with a combination of index funds, as well as the value added or subtracted by the manager (i.e., alpha).
One way to test the claim that a manager can beat a market is to see whether we have enough years of performance data for the result to be statistically significant. The statistical test called Student’s t-test was introduced in 1908 by William Sealy Gosset, who published under the pseudonym “Student,” while he was working for the Guinness brewery in Dublin, Ireland, to evaluate the quality of the brewery’s ingredients. The t-test can be used to determine if a series of historical returns is reliably superior to a risk-equivalent benchmark. This can determine whether alpha (any return over the benchmark return) is due to luck or skill. A t-stat of 2 or higher indicates that we are at least 95% confident that the manager actually earned a return higher than his benchmark due to skill, with up to a 5% chance that it was due to luck.
In Figure 3-6B-i, the t-test is applied to U.S. equity funds in six different style classifications over a ten-year period. Out of 614 mutual funds that were compared to their risk-appropriate benchmarks, only 80 of the 614 fund managers had positive excess returns. Of those 80, only one (0.16%) had a t-stat greater than or equal to 2 (signifying skill). But when the time period of that one was extended back to the fund’s November 1991 inception, the t-stat dropped below 2, indicating that skill evaporated.
Figure 3-6B-i
Only one fund (NFJ Allianz Small Cap Value) had a statistically significant positive alpha (t-statistic greater than 2), and when this fund was analyzed over its entire period since inception, the alpha was no longer statistically significant. The chart below shows the excess return of NFJ Allianz Small Cap Value relative to the Russell 2000 Value Index (Morningstar’s designated benchmark). From the average alpha and variability of the alpha, we see that we need 170 years of similar returns to conclude the presence of skill.
Figure 3-6B-ii
Another way to view this data is to draw a line that separates statistical significance on an Alpha versus Standard Deviation of Alpha scatter plot. Funds that fall above the line show a 95% chance of being skillful. As seen above, after the period was extended for the only possibly skillful manager, the probability of skill went down the drain.
Figure 3-6C-i
Bill Miller of Legg Mason Capital Management holds the distinction of being the only manager to have ever beaten the S&P 500 index for fifteen consecutive years (1991 to 2005). Unfortunately, his returns after 2005 fell short of the S&P 500, so those of his investors who put their money in after he became well-known discovered the meaning of disappointment. The chart below shows how the Legg Mason Capital Management Value Trust fared against the Russell 1000 Index (Morningstar’s designated benchmark) on a calendar year basis from inception through 2010. From the average alpha and variability of the alpha, we see that we need 269 years of similar returns to anoint Mr. Miller with having stock-picking skill.
Figure 3-6C-ii
Two funds that have recently received attention from the financial media are the Yacktman Fund and the Yacktman Focused Fund, both managed by Donald and Stephen Yacktman. The chart below shows the excess return of Yacktman Focused relative to the Russell 1000 Value Index (Morningstar’s designated benchmark). From the average alpha and variability of the alpha, we see that we need 105 years of similar returns to conclude the presence of skill. Well, 105 is certainly better than 269.
Figure 3-6C-iii
For the Yacktman Fund vs. the Russell 1000 Value Index, the average alpha was -1.10%, so no number of years of similar returns would allow us to conclude the existence of skill.
Figure 3-6C-iv
Unfortunately for them, investors are constantly bombarded with advertisements, market commentaries, and screaming magazine covers telling them what they should do with their money. Contributing to all the clamor and din is Morningstar’s annual announcement of their awards for “Fund Manager of the Year.” As usual, investors are best served by not paying it any attention.
In order to determine whether being named “Fund Manager of the Year” engenders a valid expectation of higher returns for the fund’s investors, Index Funds Advisors ran a statistical test (the t-test) of sixteen domestic equity mutual funds that received this Morningstar recognition to determine if each fund’s outperformance was truly attributable to skill (95% or higher probability) or if it could be explained as luck. For each fund, the performance from the manager’s inception date (or the inception date of Morningstar’s benchmark in two cases) through year-end 2011 was evaluated against the benchmark designated for the fund by Morningstar. The charts below show each fund’s alpha (the difference in returns between the fund and the benchmark) on a year-by-year basis. Only one of the sixteen funds (about 6%) met the requirement of the statistical test that would suggest ruling out luck as the explanation for the outperformance based on a 95% confidence level. Before you get too excited, however, please note that this fund belongs to the small growth category, which has the lowest expected return per unit of risk of all the different equity style boxes. Among the sixteen funds, the median number of years needed to conclude the presence of skill over luck was 72 years. Five of the funds showed a high enough degree of volatility in their returns (relative to their benchmarks) as to require a minimum of 100 years.
Even when there is a statistical indication of skill in a manager’s performance, it is often confined to a single time period and does not persist beyond it. A perfect example of this is Bill Miller of the Legg Mason Value Trust who carries the distinction of being the only mutual fund manager to have beaten the S&P 500 for fifteen consecutive years. Viewing the fifteen-year winning period alone indicates over a 99% probability of true skill, but if we broaden the scope of analysis to his entire tenure, we no longer can statistically conclude the presence of skill over luck.
Figure 3-6D1
Figure 3-6D2
Figure 3-6D3
Figure 3-6D4
Figure 3-6D5
Figure 3-6D6
Figure 3-6D7
Figure 3-6D8
Figure 3-6D9
Figure 3-6D10
Figure 3-6D11
Figure 3-6D12
Figure 3-6D13
Figure 3-6D14
Figure 3-6D15
Figure 3-6D16
In calculating the t-stat, the first step is to determine the excess returns the manager earned above an appropriate benchmark. Then we determine the regularity of the excess returns by calculating the standard deviation of those returns. Based on these two numbers, we can then calculate how many years we need to support the manager’s claims.
Of the 80 fund managers who had positive excess returns, the average excess return was 0.84% and the standard deviation was 5.64%. To estimate the years needed for statistical significance, you can find the intersection of the average excess return (about 0.8%) and standard deviation (about 5.6%) in the chart below (see data box for point estimates). Then follow the line out, and you can see that 180 years of returns data are needed to establish skill as the reason for the higher returns. The calculator below the chart provides the exact number of years needed. Obviously, no manager has ever managed a fund for 180 years; therefore, we are unable to accept any of these managers’ claims. Alas, managers are mere mortals.
Three Aspects of Performance Chart
The Figure below shows the formula to calculate the number of years needed for a t-stat of 2. We first determine the excess return over a benchmark (the alpha) then determine the regularity of the excess returns by calculating the standard deviation of those returns. Based on these two numbers, we can then calculate how many years we need (sample size) to support the manager’s claim of skill.
Sample Size Calculator for Active Manager Alphas
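The arithmetic behind the sample-size calculation can be sketched in a few lines of Python. This is a minimal illustration, not IFA's actual calculator: it assumes annual observations and simply rearranges t-stat = (average × √N) / standard deviation to solve for N. The function name `years_needed` and the 5.0% standard deviation in the second call are hypothetical; the 0.84% and 5.64% inputs are the figures quoted above for the 80 funds with positive excess returns.

```python
import math

def years_needed(avg_alpha, alpha_std, target_t=2.0):
    """Years of annual returns needed for the t-stat to reach target_t.

    Rearranges t = avg_alpha * sqrt(N) / alpha_std to solve for N.
    """
    if avg_alpha <= 0:
        # A zero or negative average alpha can never produce a positive
        # t-stat of 2, no matter how many years are observed.
        return math.inf
    return (target_t * alpha_std / avg_alpha) ** 2

# Average alpha of 0.84% with a standard deviation of 5.64%.
print(round(years_needed(0.84, 5.64)))  # 180

# A negative average alpha (the 5.0 standard deviation is made up).
print(years_needed(-1.10, 5.0))         # inf
```

Because the required number of years grows with the square of the ratio of standard deviation to alpha, halving the average alpha quadruples the track record needed.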
As you see in the calculator above, the t-stat is held at 2. Understanding why a t-stat of 2 or more is considered statistically significant is important. However, it is vital to simply grasp why bigger t-stats mean the value is more “reliably” different from zero. To begin with, refer to the following equation defining a t-stat:
t-stat = (average × √N) / standard deviation, where N is the number of observations
Decomposing the elements of this equation can demonstrate what leads to bigger t-stats and help instill the intuition behind why a bigger t-stat implies that the observed value is less likely to have a true value of zero.
“Average” is the average of all observations in the sample. This parameter is in the numerator, so as the average increases, so does the t-stat. To illustrate, consider the two data series below:
Series A: 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
Series B: 9, 10, 9, 10, 9, 10, 9, 10, 9, 10
Both have the same number of observations and the same standard deviation. But series A has an average of 1.5 and series B has an average of 9.5. As the average increases, so does the t-stat, meaning it is less likely the true average from series B is actually zero.
The intuition here is that a mean further from zero makes it less likely that the true value is in fact zero.
“√N” is the square root of the number of observations. This parameter is also in the numerator, so as the number of observations increases, the t-stat does as well. Consider the two data series below:
Series A: 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
Series B: 1, 2, 1, 2
Both have the same average of 1.5 and the same standard deviation of 0.5, but series A has 20 observations and series B only has 4. As the number of observations increases, so does the t-stat, and the observed average becomes more reliable. In this example, series A has a t-stat of 13.4 and series B has a t-stat of 6 due to the difference in the number of observations. This means series A is more reliably different from zero than series B.
The intuition here is that a larger number of observations results in more reliability.
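The two t-stats quoted for this example can be reproduced with a short Python sketch. It assumes the population form of the standard deviation (dividing by N), which is what gives the 0.5 stated in the text; using the sample form (dividing by N - 1) would give slightly smaller t-stats. The function name `t_stat` is only illustrative.

```python
import math
import statistics

def t_stat(observations):
    """t-stat = average * sqrt(N) / standard deviation (population form)."""
    n = len(observations)
    average = statistics.fmean(observations)
    std_dev = statistics.pstdev(observations)
    return average * math.sqrt(n) / std_dev

series_a = [1, 2] * 10  # 20 observations
series_b = [1, 2] * 2   # 4 observations

print(round(t_stat(series_a), 1))  # 13.4
print(round(t_stat(series_b), 1))  # 6.0
```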
“Standard deviation” is a measure of how much the individual observations in the sample vary from the average. This parameter is in the denominator, so as the standard deviation decreases, the t-stat increases. Consider the two data series below:
Series A: 9, 10, 9, 10, 9, 10, 9, 10, 9, 10
Series B: 18, 0, -18, 32, 10, -20, 40, 15, 8, 10
Both have the same 9.5 average and the same number of observations, but series A has much less volatility and a lower standard deviation than series B. As the standard deviation increases, the t-stat decreases, so the average from series B is less reliably different from zero than the same average from series A. Said differently, there is a greater likelihood the 9.5 average from series B happened by chance due to the volatility of the data series.
The intuition here is that a more volatile data series results in a mean that is less reliably different from zero. Here is a calculator to determine the t-stat. Don't trust an alpha or average return without one.
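As a rough check of the volatility example, the sketch below computes the average, standard deviation and t-stat for the two series with the 9.5 average. It again assumes the population standard deviation, and the helper name `summarize` is hypothetical.

```python
import math
import statistics

def summarize(observations):
    """Return (average, population std dev, t-stat) for a data series."""
    average = statistics.fmean(observations)
    std_dev = statistics.pstdev(observations)
    t = average * math.sqrt(len(observations)) / std_dev
    return average, std_dev, t

series_a = [9, 10, 9, 10, 9, 10, 9, 10, 9, 10]
series_b = [18, 0, -18, 32, 10, -20, 40, 15, 8, 10]

for name, series in (("A", series_a), ("B", series_b)):
    average, std_dev, t = summarize(series)
    print(f"Series {name}: average={average:.1f} "
          f"std={std_dev:.2f} t-stat={t:.2f}")
```

Under these assumptions series A has a t-stat of about 60, while series B, with the same 9.5 average, comes in at about 1.7, below the threshold of 2.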
The Fama and French risk premiums are good examples of the use of the t-stat. Based on the long-term data, there has been an excess return for exposure to these risk factors, referred to as the US Equity Premium (return of the total market minus the risk-free rate, measured by the 30-day T-bill), the US Value Premium (high book-to-market minus low book-to-market), and the US Size Premium (small companies minus big companies). An important consideration for investors is the likelihood that these risk “premiums” are actually zero (i.e., there is no premium) despite a historical mean that is positive. As discussed, the starting point is calculating a t-stat for each return series as outlined in Table 1 below. The t-stats in Table 1 are all considered statistically significant (i.e., greater than 2), and we can be almost 99% sure that all three risk premiums are positive, with only the SMB (size premium) t-stat being marginally lower than the 2.6 required for that level of significance.
All three data series have the same number of observations, so differences in their t-stats will be a function of different means and standard deviations, as illustrated in Table 2 below.
As you can see, the equity premium is the most reliable (i.e., different from zero) despite having the highest volatility because it has a significantly higher mean to go with it. Conversely, the size premium is less reliable than the value premium despite having nearly the same volatility because it has a lower historical mean.
In “Challenge to Judgment,” Paul Samuelson dismisses investors who claim they can find benchmark-beating managers by saying, “They always claim that they know a man, a bank, or a fund that does do better. Alas, anecdotes are not science. And once Wharton School dissertations seek to quantify the performers, these have a tendency to evaporate into thin air—or, at least, into statistically insignificant t-statistics.”
Although a few managers will occasionally appear to have reliably delivered alpha, IFA cautions investors that the fact that there are so many managers virtually guarantees that there will be some who appear to have demonstrated true skill. Unfortunately, the number of such managers is no higher than what we would have if all of them were monkeys throwing darts at the Wall Street Journal. Two studies that elegantly address this point are:
Rob Silverblatt of U.S. News and World Report spoke with Eugene Fama about the implications of the “Luck versus Skill in the Cross Section of Mutual Fund Alpha Estimates” study conducted by Fama of the University of Chicago and Kenneth French from Dartmouth, which casts serious doubt on managers’ ability to generate alpha. Here is his interview:
Why did you decide to study luck?
[Fama] "This is the basic problem. You have several thousand mutual funds out there. When you look at the results over their whole histories, there’s a huge range of results. The winners are big winners and the losers are big losers. So the problem is to judge what the world would look like, what the cross section of performance would look like, if there were no skill in the population. That’s what this paper does, it constructs experiments that maintain the characteristics of mutual fund returns, but we set them up knowing that there is really no [skill]."
So just how lucky are fund managers?
[Fama] "If you look at the top 10 percent, they’re [comfortably] outperforming their benchmarks. …Those are the people that people would write books about. But it turns out that if you look at the distribution that you’d expect by chance, you’d expect more of them out there."
As for the ones that do get good returns, does that mean they’re good stock pickers?
[Fama] "There are always people on the top; that’s the point. People make the wrong inference. There are people that are big winners, but there are fewer of them than you’d expect than if they were just lucky."
Can any managers truly be counted on to add alpha through skill alone?
[Fama] "You can’t tell from the net returns. Now if you give them back their fees and expenses and just look at their portfolio returns, then you find some evidence that there are funds out there that might have some skill, but it’s absorbed in fees and expenses."
What do your findings mean for the role of active management?
[Fama] "Don’t be misled by past performance. There’s lots of other evidence that shows that performance doesn’t persist--that the past winners aren’t the future winners and that basically what happens after you rank them as winners is random. And this is consistent with that: It’s basically saying that the winners are just lucky."
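Fama's point about how many winners chance alone would produce can be illustrated with a back-of-the-envelope calculation. The sketch below is not the bootstrap method used in the Fama and French study; it assumes every fund has zero true alpha, that funds are independent, and that t-stats are approximately standard normal. The fund count of 614 is borrowed from the Morningstar test described earlier, and both function names are hypothetical.

```python
import math

def prob_t_at_least(threshold):
    """P(Z >= threshold) for a standard normal Z (large-sample approximation)."""
    return 0.5 * math.erfc(threshold / math.sqrt(2))

def expected_lucky_funds(n_funds, threshold=2.0):
    """Expected count of zero-skill funds whose t-stat clears the threshold."""
    return n_funds * prob_t_at_least(threshold)

print(round(prob_t_at_least(2.0), 4))       # 0.0228
print(round(expected_lucky_funds(614), 1))  # 14.0
```

Under these simplifying assumptions, roughly 14 of 614 skill-less funds would be expected to show a t-stat of 2 or more by luck alone, so a handful of apparent winners is not, by itself, evidence of skill.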
Figure 3-6C illustrates the results of this study. This article from Forbes.com also discusses this study.
Figure 3-6C
Even professional stock pickers can fall hard. Bill Miller, chief investment officer of Legg Mason Capital Management and portfolio manager of the Legg Mason Capital Management Value Trust and Value Equity Strategy, lost his Midas touch after a long stretch of beating the S&P. On November 17, 2011, the company announced that Miller would step down effective April 30, 2012. A former Morningstar “Fund Manager of the Decade,” Miller seemed to glitter throughout the 90s only to have his sparkle go dim towards the end of the following decade. His fund grew from $750 million in 1990 to more than $20 billion in 2006. As of November 16, 2011, total assets were down to $2.8 billion. His Legg Mason Value Trust Fund (LMVTX) is portrayed in Figures 3-A, 3-B and 3-C, showing the risk and return results of his fund for three different time periods, compared to various indexes and index portfolios: Figure 3-A for the decade of the 90s through 2000; Figure 3-B for the ten years from 2001 to 2010; and Figure 3-C for the 28 years and 8 months since the inception of the LMVTX fund.
As the first chart clearly shows, LMVTX did earn higher returns than the S&P 500 and the index portfolios during the 90s, but with significantly higher risk—a risk that eventually caught up with Miller. In a January 6, 2005 article in The Wall Street Journal, Miller accounted for his winning streak saying, “As for the so-called streak, that’s an accident of the calendar. If the year had ended on different months it wouldn’t be there. At some point, mathematics will hit us. We’ve been lucky. Well, maybe it’s not 100% luck—maybe 95% luck.”
Figure 3A
Figure 3B
Figure 3C
Figure 3-B shows just how hard the mathematics did hit Miller. Despite the fact that his “so-called streak” showed him to outperform the S&P 500 for a 10-year period, Miller’s subsequent 10-year returns from 2001 to 2010 pale in comparison to the indexes and index portfolios shown. Miller’s outperformance and subsequent underperformance were the result of his excessively risky bets on concentrated investments among highly correlated stocks. While equity index portfolios invest across many asset classes and invest in as many as 12,000 companies in 40 different countries, Miller’s strategy was to “place big bets on stocks other investors feared,” according to a Wall Street Journal article, “The Stock Picker’s Defeat.” According to the December 2008 article, “Mr. Miller was in his element [a year ago] when troubles in the housing market began infecting financial markets. Working from his well-worn playbook, he snapped up American International Group Inc., Wachovia Corp., Bear Stearns Cos. and Freddie Mac. As the shares continued to fall, he argued that investors were overreacting. He kept buying.” The article continued, “What he saw as an opportunity turned into the biggest market crash since the Great Depression. Many Value Trust holdings were more or less wiped out. After 15 years of placing savvy bets against the herd, Mr. Miller had been trampled by it.” Miller stated, “The thing I didn’t do, from Day One, was properly assess the severity of this liquidity crisis... I was naïve… Every decision to buy anything has been wrong…It’s been awful.” Not only did the assets themselves plummet, but investors bailed on the fund, pushing its assets down from its apex of $21 billion to around $4.2 billion.
At one point, Miller said, “The S&P 500 is a wonderful thing to put your money in. If somebody said, ‘I’ve got a fund here with a really low cost, that’s tax efficient, with a 15 to 20-year record of beating almost everybody, why wouldn’t you own it?’”
Figure 3-C shows that over the lifetime of the LMVTX, several indexes and index portfolios outperformed the LMVTX with lower risk than the LMVTX, and the more appropriate benchmark of U.S. Large Cap Value beat Miller with less risk.
Miller’s so-called streak was based on bad benchmarking. LMVTX was far riskier than the S&P 500, a reality most investors certainly did not understand—especially investor Peter Cohan who lamented to the Wall Street Journal, “Why didn’t I just throw my money out the window and light it on fire?”
Morningstar ranked Miller’s fund as one of the top 3 losers for fund performance in June 2011. Bloomberg News reports that Russel Kinnel, Morningstar director of mutual fund research, said, “People assume because certain managers have had good streaks that they are always going to be a step ahead of the market. It never works out that way.”
This is a lesson for long-term investors who pick fund managers whom they believe are skilled in stock picking. In this case, the manager is leaving the fund after a roller coaster 30-year career. It might be a good idea to put a warning on the Legg Mason Value Trust prospectus reminding investors that luck is not a reliable source of returns in the future – maybe something along the lines of the health warning on a package of cigarettes.
Source: Yahoo Tech Ticker
See this article for more Lessons from Bill Miller: Don't concentrate, don't style drift, and nobody can beat a risk-adjusted market over long periods. Invest right, sit tight. Also see the Quote of the Week #45.
The studies mentioned above represent only a sampling of the mountain of research that has been stockpiled over the years. The impact of the research can best be summed up in the words of Henry Blodget, former securities analyst turned financial journalist: “Academics have essentially proved that active fund management for the fund customer is a loser’s game. The vast majority of active funds underperform passive benchmarks. So, the vast majority of customers of active funds pay billions of dollars in exchange for, at best, nothing.”
All of the chances above are quite poor and are unreasonable odds based on the fact that the average actively managed mutual fund is about three times the cost of an index fund (1.5% versus 0.5%). So you pay three times the cost with only a 3% chance of winning. Other studies indicate a zero chance of winning. As Larry Swedroe has said, investors who buy actively managed funds should wear a shirt that says, "I can't add." Essentially, investors are being fooled by randomness and by poor statistical information provided by active managers. A better understanding of statistics will improve your ability to ignore the siren songs of active management and better manage your investment portfolio.
If the average index fund charges 0.25-0.5% and the average active mutual fund charges 1.5%, there is already an innate cost associated with active management even before taking into account that active management underperforms the respective index. What exactly are investors paying for? According to hundreds of studies, it appears that investors are paying for nothing more than false hope or promise. They are just speculating, and the expected return of speculation is zero, minus the costs of speculating. This means that as a group, active investors obtain the return of the market they play in, minus their cost of playing. As Nobel Laureate William Sharpe says, "Why pay people to gamble with your money?"
3.3.5
The attempt to predict the outcome of a coin toss is a futile endeavor. Unless the coin is rigged, the only way to make a correct prediction is to guess blindly. Unfortunately, financial institutions and the media show little regard for investors’ financial health when they perpetuate the false idea that some people have a gift or method for predicting future stock price gyrations.
In a study by Walter Good and Roy Hermansen, a hypothetical coin flipping experiment was compared to mutual fund manager performance. Three hundred college students were asked to guess the outcome of 10 coin tosses. Their guesses were tabulated and charted. The performances of 300 mutual fund managers were then tabulated for 10 years (1987 to 1996) from Morningstar® Principia®. See Figure 3-7.
The number of years that the mutual fund managers were rated in the top 50% of fund managers was then counted and compared to the ability of college students to correctly guess the outcome of the flip of a coin. The results were nearly identical.
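The pattern that pure guessing produces in such an experiment can be worked out directly. The sketch below is an illustration rather than a reproduction of the Good and Hermansen data: it assumes a fair coin and independent guesses, so the number of correct guesses per student follows a binomial distribution. The function name `expected_counts` is hypothetical.

```python
from math import comb

def expected_counts(n_people=300, n_tosses=10):
    """Expected number of people with k correct guesses, for k = 0..n_tosses."""
    total_outcomes = 2 ** n_tosses
    return [n_people * comb(n_tosses, k) / total_outcomes
            for k in range(n_tosses + 1)]

counts = expected_counts()
for k, count in enumerate(counts):
    print(f"{k:2d} correct: {count:6.1f} people")
```

Under these assumptions about 74 of the 300 guessers land on exactly five correct calls, and fewer than one is expected to get all ten right. A group of managers with no skill would be expected to show the same bell-shaped spread in the number of years spent in the top half.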
An interesting point was raised by a hypothetical nationwide coin toss. In this example proposed by Warren Buffett, 225 million Americans are given one silver dollar and expected to flip it once per day, with heads winning and tails losing. After 25 consecutive days, the statistical result would be comparable to six people flipping heads for 25 days in a row. These people would be regarded as geniuses for being so masterful at flipping coins. This is nonsense, of course, but it would do well for investors to see mutual fund managers as the six masterful coin flippers rather than geniuses, gurus or all-star analysts.
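The figure of about six survivors follows from halving the population once per day. A minimal sketch, assuming a fair coin and independent flips (the function name `expected_survivors` is hypothetical):

```python
def expected_survivors(population, days):
    """Expected number of people who flip heads every day for `days` days."""
    return population / 2 ** days

print(round(expected_survivors(225_000_000, 25), 1))  # 6.7
```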