Portfolio optimization with US large cap equity sectors

I had originally planned to do most of the tables below as nicely and uniformly formatted HTML tables, which ChatGPT whizzed up for me in no time. Alas, my Squarespace code and text blocks don’t like each other, and my original plan got lost in ruined formatting. I have yet to figure out why. AI can only compensate for incompetence to an extent, I guess.

I am still in a quant mood at the moment, so today I will go through some work I’ve done on portfolio optimization with US large cap equity sectors. I am doing this to augment my current MinVar framework, which I use for my own investments. First, a quick recap of the basics of portfolio optimization, with advance apologies to PMs reading this and lamenting that I’ve missed something. Finance has two workhorse models. The first is the tangent portfolio, which places the investor at the point on the efficient frontier where risk-adjusted return, or the Sharpe ratio, is maximised. The second is the minimum variance portfolio, which offers exposure to the combination of assets with the lowest variance, or standard deviation, regardless of return. These portfolios are often estimated given a set of constraints, as I explain below. Assuming most portfolio allocation decisions start with one of these ideal models in mind (you either want to achieve the best risk-adjusted return or the lowest volatility), the difference between the textbook models and real-time allocations is governed by the following layers of complexity.

  • The universe - Few PMs or investors optimise and choose their portfolios from the entire universe of assets, even in equities. The choice of universe, which in today’s investment world is closely related to factor exposure, domain expertise and simple remit imposed from above, has important implications for the kind of portfolio you end up with, and what you can expect from it. Amateur investors, in theory, have an advantage over many professional PMs in the sense that they can be more idiosyncratic in their selection of universe. But beware: the bigger the universe, the more complex your exposure and your portfolio overall.

  • Constraints (quantitative) - This is a catch-all category for the explicit constraints that the PM will impose in the quantitative analysis itself. The most common ones are whether to allow short selling or not and whether to allow leverage, and if so, how much. But you can also imagine a host of other constraints, for example the maximum or minimum allowed allocation to individual assets, as well as the minimum Sharpe ratio required, or the maximum volatility accepted. In theory, the model under consideration is a simple one; we are trying to find the optimal allocation subject to a number of quantitative constraints.

  • Constraints (qualitative) - All investors, be they professional PMs or amateurs, operate under constraints, which might not be quantitative, but which have important consequences for the portfolio decisions. These could be as simple as time-zone constraints, or as complex as legal constraints, or somewhere in between. Behavioural constraints, and preferences—especially of superiors—are often important too.

  • Rebalancing and forward-looking returns and variances - The Tangent and MinVar portfolios are, by definition, constructed from backward-looking metrics. The expected return of a given asset is based on its historical return, as is the asset’s expected variance and covariance with other assets. This invariably invites rebalancing to make sure that the portfolio allocation is always as close as possible to the efficient frontier or minimum variance, given the historical data. This, in turn, highlights the fundamental trade-off between optimising the asset allocation and transaction costs, slippage and the like. In theory, it is easy to create a portfolio that updates its weights with every price tick, but in practice it is impossible to gain exposure to this portfolio, even for the most sophisticated quantitative PM with access to all the bells and whistles offered by the likes of Citadel, Millennium, Renaissance or similar. Another way to correct for the backward-looking nature of traditional portfolio allocation models is to augment the traditional expected return metric with a forward-looking element. Similarly, for the historical variances and covariances, experimenting with different time periods, or identifying structural breaks in the variance and covariance structures conditional on external events, are ways to make the final covariance matrix used to estimate the portfolio weights more robust. Below, I run a study with a forward-looking returns component and a more robust covariance matrix.

  • Risk management - Portfolios, and the people that run them, don’t operate in quantitatively optimised vacuums. Many simple risk management metrics can be included in the quantitative constraints mentioned above, or at least made explicit by imposing a quantitative and rules-based risk management framework. But when the chips are down, risk management is an art, not a science, and more importantly, it is often imposed on PMs and investors from above, or by external events that brutally upend even the most perfect quantitative model.
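For readers who want to see the two workhorse models from the recap in code, here is a minimal sketch of their textbook closed-form solutions. It assumes the only constraint is that weights sum to one (so short selling is allowed) and a risk-free rate of zero. The three-asset inputs and the function names are made up for illustration; this is not the Excel Solver setup used later in the post.

```python
import numpy as np

def min_variance_weights(cov):
    """Fully invested minimum variance weights: proportional to inv(cov) @ 1."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)
    return raw / raw.sum()

def tangent_weights(mu, cov, rf=0.0):
    """Fully invested tangent (max Sharpe) weights: proportional to inv(cov) @ (mu - rf).

    Assumes the unnormalised weights sum to a positive number, which holds
    for the illustrative inputs below.
    """
    raw = np.linalg.solve(cov, mu - rf)
    return raw / raw.sum()

def sharpe(w, mu, cov, rf=0.0):
    """Sharpe ratio of a portfolio with weights w."""
    return (w @ mu - rf) / np.sqrt(w @ cov @ w)

# Hypothetical annual expected returns and covariance matrix for three assets.
mu = np.array([0.08, 0.12, 0.10])
cov = np.array([
    [0.0225, 0.0060, 0.0045],
    [0.0060, 0.0400, 0.0080],
    [0.0045, 0.0080, 0.0324],
])

w_mv = min_variance_weights(cov)
w_tan = tangent_weights(mu, cov)
print("MinVar :", w_mv.round(3), "vol", np.sqrt(w_mv @ cov @ w_mv).round(4))
print("Tangent:", w_tan.round(3), "Sharpe", sharpe(w_tan, mu, cov).round(3))
```

Once constraints such as no short selling or weight caps are added, the closed form no longer applies and a numerical optimiser is needed.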

The tangent portfolio in time and space

Large-cap US equity sectors are some of the most ubiquitous factors out there today, and they also lend themselves relatively easily to direct exposure for retail investors via ETFs. The list of sectors used for the analysis here is:

Categories
Business Services
Consumer Cyclicals
Consumer Non-Cyclicals
Consumer Services
Energy
Finance
Healthcare
Industrials
Non-Energy Materials
Technology
Telecommunications
Utilities

For the purpose of the analysis I am using weekly observations of total return series, with annual (52-week) returns. My plan is to automate this analysis via my portfolio optimization GPT, but in this first iteration I have run the analysis in Excel with the Solver as my optimization tool, so that I can see more of what’s going on. My aim is to check how stable the portfolio weights are over time, and conversely, how different, or similar, the estimated covariances are across the sample periods. The first step in this analysis is to split the sample of historical returns into three sub-samples (a five-year, a 10-year and the full sample running all the way back to 1989) and to see how the estimated tangent portfolio, unconstrained with no leverage, varies across the three samples. The results are shown below.
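For anyone who wants to replicate the data preparation outside Excel, here is a rough sketch in Python. It assumes that "annual returns on weekly observations" means overlapping 52-week simple returns computed each week, which is one reading of the setup rather than a confirmed detail. The data are synthetic and the function names are made up for illustration.

```python
import numpy as np

WEEKS_PER_YEAR = 52

def rolling_annual_returns(index_levels, window=WEEKS_PER_YEAR):
    """52-week simple returns from weekly total return index levels (rows = weeks)."""
    return index_levels[window:] / index_levels[:-window] - 1.0

def sample_stats(annual_returns, years=None):
    """Mean vector and covariance matrix over the last `years` years (None = full sample)."""
    if years is None:
        sample = annual_returns
    else:
        sample = annual_returns[-years * WEEKS_PER_YEAR:]
    return sample.mean(axis=0), np.cov(sample, rowvar=False)

# Synthetic stand-in data: 35 years of weekly index levels for 4 made-up sectors.
rng = np.random.default_rng(0)
weekly = rng.normal(0.002, 0.02, size=(35 * WEEKS_PER_YEAR, 4))
levels = 100 * np.cumprod(1 + weekly, axis=0)

annual = rolling_annual_returns(levels)
for label, years in [("5y", 5), ("10y", 10), ("full", None)]:
    mu, cov = sample_stats(annual, years)
    print(label, "mean", mu.round(3), "vol", np.sqrt(np.diag(cov)).round(3))
```

Each sub-sample's mean vector and covariance matrix can then be fed to whichever optimiser is used to estimate the tangent portfolio.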

Let’s start with the main characteristics of the portfolios. They all come with similar expected returns of 13-to-15%, but the standard deviations have declined in the more recent samples. This suggests that the expected return on the efficient frontier has been relatively stable over time, but that the standard deviation has come down somewhat. Looking more closely, I’d highlight three things. First, the five-year and 10-year efficient frontier portfolios look somewhat similar, indicating that expected returns, (co)variances and Sharpe ratios of the sectors are reasonably stable over the two periods. This is especially true considering that business services and technology are two sectors/factors with significant overlap. The second result is that the big non-cyclical overweight in the whole-period portfolio reflects a high Sharpe ratio, and low covariances with other sectors, in the distant past. This is a cautionary tale for portfolio optimization algos that use a sample with a multi-decade history, ostensibly for robustness and accuracy. Because the distant past carries as much weight as what happened yesterday in the simple models, it is easy to pick up superior characteristics of a sector due to performance metrics early in the sample, which won’t necessarily apply today. Finally, some sectors seem to have stood the test of time. Energy, technology and utilities make it into all three portfolios, indicating that the characteristics that make them good choices for optimal portfolios are stable over time.

To get a better understanding of what drives the shifts in portfolio weights in the three samples, and the persistence of some sectors, we need to look under the hood of the two metrics used by the optimization algo to construct the portfolios: the risk-adjusted return, or Sharpe ratio, and the covariances of the assets. The first table below shows the Sharpe ratio of the sectors in the three sample periods, with the three top and three bottom highlighted by me. The second table shows two variance metrics. The first column is the standard deviation of the covariances (for each sector with all other sectors) across the three samples, and the second shows the number of negative covariances each sector has with other sectors across the three samples. On the first metric, I realise that I am taking a standard deviation of three numbers, which isn’t ideal, but the idea is sound, I think. You could estimate a rolling set of covariance matrices and do the same calculation for a more robust result.
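The three metrics described here can be sketched in a few lines of Python. The exact aggregation behind the first variance column is not spelled out, so the version below is an assumption: for each pair of sectors it takes the standard deviation of the covariance across the samples, then averages that over the sector's partners. All numbers and sector labels are placeholders, and the risk-free rate is assumed to be zero.

```python
import numpy as np

def sharpe_ratios(mu, cov, rf=0.0):
    """Sharpe ratio per sector from mean returns and the covariance diagonal."""
    return (mu - rf) / np.sqrt(np.diag(cov))

def covariance_dispersion(covs):
    """Per sector: std across samples of each pairwise covariance, averaged over partners."""
    pair_std = np.stack(covs).std(axis=0, ddof=1)
    np.fill_diagonal(pair_std, np.nan)  # ignore each sector's own variance
    return np.nanmean(pair_std, axis=1)

def negative_covariance_counts(covs):
    """Per sector: number of negative covariances with other sectors, summed over samples."""
    stacked = np.stack(covs)
    off_diag = ~np.eye(stacked.shape[1], dtype=bool)
    return ((stacked < 0) & off_diag).sum(axis=(0, 2))

# Placeholder inputs for three sectors (A, B, C) and three sample periods.
mu_5y = np.array([0.10, 0.12, 0.05])
cov_5y = np.array([[0.0400, 0.0100, -0.0020],
                   [0.0100, 0.0900, 0.0030],
                   [-0.0020, 0.0030, 0.0225]])
cov_10y = np.array([[0.0500, 0.0120, -0.0010],
                    [0.0120, 0.0800, 0.0040],
                    [-0.0010, 0.0040, 0.0200]])
cov_full = np.array([[0.0600, 0.0200, 0.0010],
                     [0.0200, 0.1000, 0.0050],
                     [0.0010, 0.0050, 0.0300]])
covs = [cov_5y, cov_10y, cov_full]

sr = sharpe_ratios(mu_5y, cov_5y)
disp = covariance_dispersion(covs)
neg = negative_covariance_counts(covs)
print("Sharpe (5y):", sr.round(2))
print("Covariance dispersion:", disp.round(4))
print("Negative covariances:", neg)
```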

This analysis suggests that the top and bottom sectors, measured by Sharpe ratios, are fairly stable in the 5y and 10y samples, but not so in the full sample. This fits the results above showing relative stability in the efficient frontier portfolio on a 5y and 10y basis, compared to the full sample. The results also show that business services, technology and healthcare have been solid top performers on a 5y and 10y basis, with healthcare retaining a top-three Sharpe ratio in the full sample too. Consumer non-cyclicals and industrials are top performers in the full sample, but the results above indicate that this is due mainly to strong performance in the past, which won’t necessarily hold in more recent samples. Telecoms are bottom of the pile throughout, confirming what anyone trying to invest successfully in this sector (looking at you, V and T) has known for a long time.

Energy is another interesting sector. It is at the bottom of the pile on Sharpe ratio, but somehow still makes it into the optimal portfolios. The second column in the variance table shows why. Energy is, by far, the best diversifier, defined here as the sector with the most negative covariances with other sectors across the three samples. No one comes close. Most other sectors have at most one negative covariance with other sectors (you guessed it, with energy), while technology has two. Finally, the first column in the variance analysis above shows that utilities, healthcare and business services have the most stable covariances across time, though as I note above, this analysis is very rudimentary here. More generally, though, estimating a large number of covariance matrices over different timeframes and comparing covariance estimates across sectors over time is one way to check whether the covariance matrix is stable over time, and if it isn’t, which sectors are driving the volatility.
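The more general check, with many covariance matrices over different timeframes, might look roughly like the sketch below. It is an illustration under assumptions: the window length, step size and the "instability" score are my choices, the data are synthetic weekly returns (not the 52-week returns used in the post), and sector 0 is deliberately built to flip its relationship with the others halfway through.

```python
import numpy as np

def rolling_covariances(returns, window, step):
    """Covariance matrices from windows of `window` rows, moved `step` rows at a time."""
    starts = range(0, returns.shape[0] - window + 1, step)
    return np.stack([np.cov(returns[s:s + window], rowvar=False) for s in starts])

def instability_by_sector(cov_stack):
    """Per sector: std across windows of each covariance with another sector, averaged."""
    pair_std = cov_stack.std(axis=0, ddof=1)
    np.fill_diagonal(pair_std, np.nan)  # ignore own variances
    return np.nanmean(pair_std, axis=1)

# Synthetic weekly returns for four made-up sectors driven by one common factor.
# Sector 0 flips the sign of its factor loading halfway through the sample.
rng = np.random.default_rng(1)
weeks = 1040
factor = rng.normal(0.0, 0.02, size=weeks)
betas = np.ones((weeks, 4))
betas[:, 0] = np.where(np.arange(weeks) < weeks // 2, 1.5, -1.5)
returns = betas * factor[:, None] + rng.normal(0.0, 0.01, size=(weeks, 4))

cov_stack = rolling_covariances(returns, window=260, step=52)
scores = instability_by_sector(cov_stack)
print("windows:", cov_stack.shape[0])
print("instability by sector:", scores.round(6))
print("least stable sector:", int(np.argmax(scores)))
```

Ranking sectors by this score is one way to see which of them drive the changes in the covariance matrix.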

An augmented tangent portfolio - Utilities are red hot

The rule of thumb, taught in finance classes, is that the weights of the minimum variance portfolio are more stable over time than those of the tangent portfolio, for two reasons. First, because (co)variances themselves tend to be more stable than returns over time, and secondly because the tangent portfolio effectively optimises over two variables, risk and return. By contrast, the minimum variance portfolio optimises over volatility alone. Starting with that, an augmented tangent portfolio should attempt to do two things. First, it should be estimated on a covariance matrix that is as robust, i.e. stable, over time as possible. A further augmentation would be to include some forward-looking element in the covariance estimates, but I am not considering that here. Secondly, it should include a forward-looking element for returns to optimise for expected returns, rather than just past returns.

Let’s try to do that.

For the covariance matrix, I am simply using the equally weighted average of the three covariance matrices estimated above. This isn’t very sophisticated, but it is cumbersome in Excel to estimate multiple covariance matrices. It will be much easier in a more flexible coding environment to estimate multiple covariance matrices and to adjust the chosen “optimal” matrix over time, if need be. For the return vector, meanwhile, I am using two series: the standard expected return, a simple weighted average of the sample return, and an augmented return, which takes the standard return, plus the Z-score of expected EPS growth times one standard deviation of return, minus the Z-score of the P/B ratio times one standard deviation of return^.
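The two ingredients described above can be sketched in Python as follows. Two details are assumptions on my part, since the text leaves them open: the Z-scores are computed across sectors (not through time), and "one standard deviation of return" is each sector's own historical standard deviation. The inputs are hypothetical numbers for three sectors, not the data behind the results.

```python
import numpy as np

def zscore(x):
    """Z-score across sectors (an assumption; a time-series z-score is another option)."""
    return (x - x.mean()) / x.std(ddof=1)

def augmented_returns(hist_mu, hist_sigma, eps_growth, price_to_book):
    """Historical return + z(EPS growth) * sigma - z(P/B) * sigma, sector by sector."""
    return hist_mu + (zscore(eps_growth) - zscore(price_to_book)) * hist_sigma

def average_covariance(covs):
    """Equally weighted average of same-shaped covariance matrices."""
    return np.mean(np.stack(covs), axis=0)

# Hypothetical inputs for three sectors, not data from the post.
hist_mu = np.array([0.12, 0.09, 0.10])
hist_sigma = np.array([0.18, 0.12, 0.15])
eps_growth = np.array([0.05, 0.12, 0.08])
price_to_book = np.array([4.0, 1.5, 2.0])

aug_mu = augmented_returns(hist_mu, hist_sigma, eps_growth, price_to_book)
print("historical:", hist_mu)
print("augmented :", aug_mu.round(3))

# Placeholder covariance matrices for the 5y, 10y and full samples.
cov_5y = np.diag([0.030, 0.015, 0.020])
cov_10y = np.diag([0.034, 0.014, 0.024])
cov_full = np.diag([0.032, 0.016, 0.022])
avg_cov = average_covariance([cov_5y, cov_10y, cov_full])
print(avg_cov.round(4))
```

A sector with high expected EPS growth and a low P/B ratio gets its return marked up, and the reverse gets marked down. Averaging covariance matrices keeps the result symmetric and positive semi-definite, so it can go straight into the optimiser.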

The difference between these two return series is shown below.

The deviation between the two series is extreme in some cases, but that’s what we want. In the end, we want to optimise for a combination of the two return series, using the augmented variable as a way to determine the overweights in the portfolio. This is because, as is apparent below, the augmented return portfolio by definition is a very concentrated portfolio. Adding everything together, the final analysis produces the following optimal portfolio(s).

Let’s start with the simple tangent portfolio, which is spread over five sectors with overweights in utilities and technology. Given that business services and technology are likely to be factors with significant overlap, we can say that this portfolio buys momentum/growth (technology and business services), low vol (utilities) and value (healthcare and energy). This portfolio is not materially different from the initial 5y and 10y portfolios estimated above, indicating that a Sharpe ratio of just over 1, split between an expected annual return of just under 15% and a standard deviation of around 12-to-13%, is what investors should expect at the efficient frontier.

The 5% allocation to non-cyclicals/staples seems largely irrelevant in this context, but the importance of this factor is elevated in the augmented portfolio. This framework is currently suggesting overweights in this sector and in utilities. This leads to a very large weight in utilities in the combined portfolio, which is interesting in the current context given the focus on this sector as a beneficiary of the AI boom and the associated rise in demand for electricity. It is also because of this story that utilities have such a large weight in the augmented return model. Earnings expectations for this sector are now shooting higher, and valuations are still relatively attractive. That said, a +40% allocation to one sector in a portfolio with exposure to just five sectors seems unreasonable. But remember, this is an unconstrained model. It is easy to tell the optimization algo to restrict weights in any given sector, which forces the model to diversify further. More generally, how much extra Sharpe ratio should investors expect for exposing themselves to expected rather than historical returns? In this study, the Sharpe ratio goes from 1.2 to 2.0. That isn’t too bad, but will this result hold true in reality? Time will tell.
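To illustrate what restricting sector weights does, here is a small sketch of a long-only, fully invested max-Sharpe portfolio with a cap on each weight. It uses projected gradient ascent written in plain NumPy as a stand-in for a proper solver; it is not the Excel Solver setup behind the results above. The five assets, their returns, volatilities and the 0.3 correlation are hypothetical, and the risk-free rate is assumed to be zero.

```python
import numpy as np

def project_to_capped_simplex(v, cap):
    """Euclidean projection onto {w : 0 <= w <= cap, sum(w) = 1}, found by bisection."""
    lo, hi = v.min() - cap, v.max()
    for _ in range(50):
        tau = 0.5 * (lo + hi)
        if np.clip(v - tau, 0.0, cap).sum() > 1.0:
            lo = tau
        else:
            hi = tau
    return np.clip(v - 0.5 * (lo + hi), 0.0, cap)

def sharpe(w, mu, cov):
    """Sharpe ratio with a zero risk-free rate."""
    return (w @ mu) / np.sqrt(w @ cov @ w)

def max_sharpe_capped(mu, cov, cap=1.0, step=0.02, iters=2000):
    """Long-only, fully invested max-Sharpe weights with a per-asset cap."""
    n = len(mu)
    if cap * n < 1.0:
        raise ValueError("cap too tight for weights to sum to one")
    w = np.full(n, 1.0 / n)  # start from equal weights
    for _ in range(iters):
        vol = np.sqrt(w @ cov @ w)
        grad = mu / vol - (w @ mu) * (cov @ w) / vol**3  # gradient of the Sharpe ratio
        w = project_to_capped_simplex(w + step * grad, cap)
    return w

# Hypothetical inputs: asset 0 has by far the best risk-adjusted return.
mu = np.array([0.15, 0.10, 0.09, 0.08, 0.07])
vols = np.array([0.12, 0.18, 0.16, 0.20, 0.15])
corr = np.full((5, 5), 0.3) + 0.7 * np.eye(5)
cov = np.outer(vols, vols) * corr

w_free = max_sharpe_capped(mu, cov, cap=1.0)
w_capped = max_sharpe_capped(mu, cov, cap=0.40)
print("no cap :", w_free.round(3), "Sharpe", round(float(sharpe(w_free, mu, cov)), 3))
print("40% cap:", w_capped.round(3), "Sharpe", round(float(sharpe(w_capped, mu, cov)), 3))
```

The cap forces the weights to spread out, and the cost shows up as a lower Sharpe ratio than in the uncapped run. In practice a dedicated constrained optimiser would be the usual tool for this.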

Appendix

^ Mathematical notation for the return vector
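Going by the verbal description in the main text, one way to write the augmented return for sector i is sketched below. The symbols are assumptions introduced for illustration: \bar{r}_i is the standard (historical) expected return, \sigma_i is one standard deviation of return for the sector, and Z_i^{EPS} and Z_i^{P/B} are the Z-scores of expected EPS growth and of the P/B ratio. Whether the Z-scores are taken across sectors or through time is not stated.

```latex
\tilde{\mu}_i \;=\; \bar{r}_i \;+\; Z_i^{\mathrm{EPS}}\,\sigma_i \;-\; Z_i^{\mathrm{P/B}}\,\sigma_i
```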