My Finance

University Home Page


  • Updates for 2025

    Dear readers, after a long break, I am back. I updated the annual volatility and other data for S&P 500 for the year 2025. Annual volatility is computed as the empirical standard deviation of daily log changes multiplied by 1000 (for normalizing). The end-of-year price for S&P 500 in 2025 is also updated. We also add S&P 500 dividends for 2025. Now we have data on volatility for 1928-2025, on dividends for 1927-2025, and end-of-year level of S&P 500 for 1927-2025 too.

    We added the dividend data for 1927 as well, to increase the number of data points. This is fine, since S&P 90 (a predecessor for S&P 500) was created in 1926, and the data is taken from Robert Shiller’s data library.

    The volatility for 2025 is 11.77. This is higher than the long-term average 10.51, or the 2024 volatility, which is 7.98. See the original post with computations of Angel Piotrowski for 1928-2023 and its previous update for 2024.

    Dividends for 2025 are 78.92, which is significantly higher than dividends for 2024, which are 74.83.

    The S&P 500 increased a lot in 2025: End-of-year 2024 level is 5881.63, but end-of-year 2025 level is 6845.5.

    We could not yet provide earnings for 2025, since we have earnings for 2025 Quarter 4 reported only on 2026 Quarter 1, which is still ongoing. We will provide them as soon as we can.

    Finally, we added the BAA rate: December 2025 daily average. The BAA are lowest-rated investment-grade corporate bonds. The rate in December 2025 is 5.9, slightly higher than 5.8 for December 2024.

    Above, logarithmic plots of index levels and dividends for 1927-2025. Below, the annual volatility and December BAA rate.

    The data are published on my web page: We created a new tab named Financial Data Library on my web page. Let us now apply

    Let us replicate this post: Make stock returns IID Gaussian.

    We have the following notation:

    •  S(t) the S&P level at end-of-year  t.
    •  D(t) the dividend of S&P in year  t.
    •  R(t) December daily average BAA rate during year  t.
    •  V(t) annual realized volatility for the S&P for year  t.

    Compute total nominal geometric returns for the S&P 500:  Q(t) = \ln(S(t) + D(t)) - \ln S(t-1) for year  t. Below is the graph of returns 1928-2025.

    Now plot the autocorrelation function for these total returns  Q. And another autocorrelation function for their absolute values  |Q|. Both plots are below, and both are consistent with the white noise hypothesis. It is surprising that we, in fact, do not have to divide total returns by annual volatility to make it white noise.

    The quantile-quantile plot of these returns is shown below. We see that the returns are not Gaussian.

    This is consistent with the normality testing. Shapiro-Wilk and Jarque-Bera tests give us  p = 0.02\% and  p = 3\cdot 10^{-5}. What if we do divide these total returns by annual volatility? We get  N(t) = Q(t)/V(t). Let us plot the autocorrelation function for  N and the autocorrelation function for  |N|.

    These are still consistent with white noise, although, in my view, the autocorrelation function values are greater.

    January 28, 2026

  • Advanced Version

    In the repository https://github.com/asarantsev/advanced we added another page with advanced version of the simulator, where we can pick initial conditions for model factors:

    • S&P 500 Annual Volatility (VIX)
    • S&P 500 Bubble Valuation Measure
    • Moody’s BAA Bond Rate
    • Treasury 10Y – 3M Long-Short Bond Spread

    We repainted the button Compute in orange. For the advanced option, we made the legend font and the web page font smaller. And we updated the default initial conditions for the main version of the simulator to make it for June 2025.

    See the historical graphs of these four measures below. We can see their historical range. We included them in the HTML page for the advanced version of the simulator.

    Below we see the autocorrelation function plots for original and absolute values, and the quantile-quantile plot versus the normal distribution, of residuals for each of the seven regressions. First, let us consider four factors: volatility, BAA rate, and spread (autoregressions for logarithms) and earnings growth divided by volatility (needed to compute the evolution of the bubble measure; here no regression).

    Then consider regression residuals for total returns of three asset classes: S&P 500, International Stocks, USA Corporate Bonds.

    July 10, 2025

  • Autoregression Simplification

    I am back! Now let me reduce the complexity of this model to model factors: log volatility, log BAA rate, and long-short Treasury spread, only using one-dimensional simple autoregressions, possibly with non-Gaussian independent identically distributed innovations. Different series of innovations might correlate.

    We have the following differences from the previous setting:

    logarithm of the BAA rate is modeled as  X(t) = 0.643694 + 0.539518 \cdot X(t-1)

    and the spread is modeled as  X(t) = 0.107208  + 0.942062 \cdot X(t-1)

    Analysis of residuals for these is given below. You can see that residuals (innovations) are IID and close to normal.

    We shall not present here all residuals analysis, instead referring the reader to the new GitHub repository, but results are almost the same as before. If I have time, I will write a detailed description of analysis of residuals. Next, we need to move to an advanced version of the simulator, which allows you to choose factor values.

    July 9, 2025

  • Simulator Final Update

    I updated the online simulator to include the bubble measure involving earnings growth, and the long-short spread. The previous version included only the BAA rate and stock volatility as factors.

    I use a mix of developed and emerging markets in the portfolio of international stocks, in proportion 60/40, as in the previous version of the simulator. I use only total (not price) returns, and nominal (not real) returns. There are three asset classes: S&P 500; International stocks: 60% of MSCI EAFE (Europe-Australasia-Far East) and 40% of MSCI EM (emerging markets); and USA corporate bonds.

    I modeled innovations using kernel density estimation. There are 8 regressions, but one of them (the bubble measure) is used to create said measure, not model the actual returns, so we use 7 series of residuals. I use the same algorithm discussed there to impute missing data.

    For current market conditions, we use May 2025. I could write the API to take these from Yahoo Finance but I just manually put them. They need to be updated from time to time.

    I put the backend Python simulation code, the original Excel file for financial data, the Python code for filling innovations series, the Excel file for innovations before and after imputation, and the HTML frontend files, to a separate GitHub repository.

    This is my 50th post. And so far, the main work on this simulator has been finished, or so I hope. Now I would like to tell this as many people as possible. I am taking some break from this coding and web development. Enjoy the summer!

    June 25, 2025

  • Emerging Stocks Included

    From the same web site Novel Investor, we included total nominal annual returns of emerging markets stocks (MSCI EM index). They are available only from 1988, as opposed to developed markets from 1970. So I added emerging markets to the portfolio in the following way: Starting from 1988, we have 60% Developed and 40% Emerging portfolio of international stocks. I fitted the econometrics model using the new data, and rewrote the simulator.

    The model fits well, judging by the innovations for the regression for international stock returns. However, simulator runs show that returns have considerably increased. This is due to historical data: Returns of emerging markets were much higher than of developed markets during 1988-2024, although some years were an exception.

    Regression coefficients in the model for international stock returns  Q_2(t) have changed:

     Q_2(t) = a_2 - b_2V(t) - c_2(R(t) - R(t-1)) + \delta_2(t) with  a_2 = 30.328921 and  b_2 = 1.818674 and  c_2 = 6.297304.

    The updated code is available in the same GitHub/asarantsev repository simulator-current. It has version 10.

    I also made the following changes, per discussion with a colleague:

    • Removed the simplified version of the financial simulator, with pre-specified portfolios and withdrawals
    • Changed from 5 percentiles 10%, 30%, 50%, 70%, 90% to 3 percentiles 20%, 50%, 80%
    • Included average ruin time over all paths in ruin, and the ruin time for each chosen percentile path in ruin
    June 25, 2025

  • Modeling Innovations

    Contents: Problems; Kernel Density Estimation; Missing Data

    Problems. We have non-normality of innovations. Even when they are independent identically distributed, we cannot guarantee their normality. What is more, different series of innovations have different lengths. For example, in our current version of the simulator, we have five series of innovations. They are available in the Excel file innovations.xlsx in GitHub repository asarantsev/simulator-current

    • Autoregression for log volatility: 96 data points
    • S&P stock returns: 97 data points
    • International stock returns: 55 data points
    • Autoregression for log rate: 97 data points
    • Corporate bond returns: 52 data points

    In the previous version of the simulator, I simply used the multivariate normal distribution. We can compute the empirical covariance matrix by ignoring the missing data. And the means are, of course, zero. But, as mentioned above, the distribution is not normal in fact. To be more precise, the first and fourth components are not normal. In particular, their kurtosis is greater than that of the normal distribution, and the skewness is nonzero.

    I tried to apply other distributions to fit each of these two components: skew-normal and variance-gamma. But I failed to fit them well. What is more, fitting univariate distribution is not enough. I need to combine them with normal marginals for the other three components. I did not find relevant exact literature. This would require developing entire new theory of multivariate distributions.

    Kernel Density Estimation. This is a universal nonparametric method: (KDE). We apply Gaussian kernel: For  X_1, \ldots, X_N \in \mathbb R^d we have the density

     f(x) = \frac1N\sum\limits_{k=1}^N\varphi(x; X_N, \Sigma)

    where  \varphi(x; \mu, \sigma) is the probability density function on  \mathbb R^N for  \mathcal N_d(\mu, \Sigma) which is the  d dimensional Gaussian distribution with mean vector  \mu and covariance matrix  \Sigma. In other words, to simulate a random variable  Y with such density, we pick  U at random (uniformly) from  X_1, \ldots, X_N and simulate the additional noise  Z \sim \mathcal N_d(0, \Sigma) independent of  U. Thus  Y = U + Z.

    For  \Sigma we apply Silverman’s rule of thumb: This is a diagonal matrix with

     \Sigma_{ii}  = \left(\frac{4}{d+2}\right)^{1/(d+4)}\cdot N^{-1/(d+4)}\cdot \min(\sigma_i, \rho_i/1.34)

    Here,  \sigma_i, \rho_i are statistics of the ith component of the data  X_1, \ldots, X_n. Namely,  \sigma_i is the empirical standard deviation, and  \rho_i is the empirical inter quartile range: 75% quantile minus 25% quantile. This is realized in the file simKDE.py from our main repository. We stress that this code simulates innovations, not computes the joint density function.

    Missing Data: The other problem mentioned above is lack of data for some series of innovations. We considered many imputation methods, for example iterative imputer using Principal Component Analysis or k-nearest neighbor method. They are implemented in Python package sklearn. But such methods reduce variance, because imputed data reverts to the mean. I chose a custom designed approach. It comes in iterated steps. We describe one step below.

    Step. Assume we have d + 1 series of  N independent identically distributed data points. Out of these,  d have full  N values, and the last one has missing  M values. We then regress this last series (of course, only existing  N - M values) versus the full  d series (of course, only the matching  N - M data points). We use ordinary least squares linear regression. Then we take residuals of this regression. We randomly choose with replacement  M of these residuals. And for each of  M missing points, we pick the predicted value by this regression using the first  M data series, and add this randomly chosen residuals. This is the way to fill the missing data. This completes the description of this step.

    We first apply this step to one missing data point for series 1 (using series 2 and 4 as the backbone), then to  97-55 = 42 missing data points for series 3 (using series 1, 2, 4 as the backbone), and then to  97-52 = 45 missing data points for series 5. We write this new data frame into a separate Excel file called filled.xlsx. It is available in the same GitHub repository. The Python code is given in innovations.py in the same repository.

    June 20, 2025

  • Bubble Measure and Long-Short Spread in the Simulator

    Introduction. Here I talk about improvement of my financial simulator. So far, I have the following factors:

    • BAA corporate bond rate, average December
    • Annual volatility, S&P 500

    I decided to include two more important factors:

    • A bubble measure, which is an improvement over Robert Shiller’s cyclically adjusted price-earnings ratio (CAPE), for which he got a Nobel Prize in Economics. We discussed it here and here and here.
    • The long-short spread between 10-year and 3-month (average December) Treasury rates, which is often quoted as an important indicator. For example, if it is negative (inverted yield curve), then a recession is looming. We discussed it here and here and here.

    We will now use annual earnings of S&P for our research. This is important, since it connects stock returns with fundamentals. This is similar to computing bond returns using bond rates. Robert Shiller used this comparison for his work. And we use this comparison to make our valuation measure. But, of course, stocks are much more volatile than bonds. So it’s harder to model.

    Since we covered so much in previous posts, here we will be brief. For the person who wants to know details, we refer to GitHub repository.

    Results. 1. First, recall the autoregression of order 1 for annual volatility:

     \ln V(t) = \alpha + \beta \ln V(t-1) + W(t).

    Recall that  \alpha = 0.847850, \beta = 0.620146.

    2-3. Then, modify the autoregression for log BAA rate  R(t) to include spreads:  S(t). This is vector autoregression of order 1:

     \begin{bmatrix} \ln R(t) \\ S(t) \end{bmatrix}

    It does NOT include volatility. It has intercept and the slope matrix

     \begin{bmatrix} -0.263745 \\ 0.151941 \end{bmatrix}

     \begin{bmatrix} 1-0.471137 & 0.502590 \\ -0.043837 & 1-0.049088 \end{bmatrix}

    4. Model annual earnings  E(t) by considering its growth:  G(t) = \ln E(t) - \ln E(t-1). We model this as a regression

     G(t) = a + bS(t-1) + k(R(t) - R(t-1)) + cV(t) + V(t)Z(t).

    As usual, we fit it after dividing by volatility. We have

     a = 7.756497, c = -0.784163, b = 4.786272, k = 3.720972.

    5. Next, we consider the bubble measure  B(t) computed as in the above blog posts, with 10-year averaging window, and without using volatility. We get for  \gamma = -0.02138 and  \theta = 1 - 0.1901:

     B(t) = \gamma + \theta B(t-1) + U(t).

    6. We fit the corporate bond returns (geometric) denoted by  Q_0(t) in the same way as above. Namely,

     100\cdot Q_0(t) - R(t-1) = h_0 -m_0(R(t) - R(t-1)) + \delta_0(t).

    This makes sense in the context of bond markets, even though we use geometric instead of arithmetic returns here. Note that this does not use volatility. Similar to the above post, we see:  h_0 = -1.661125,\, m_0 = 5.588353.

    7. Next, we fit the geometric Standard & Poor 500 returns  Q_1(t) by dividing it by volatility  V(t). We also do regression versus

     B(t-1), S(t-1), R(t) - R(t-1), V(t).

     Q_1(t) = h_1 - m_1(R(t) - R(t-1)) - a_1B(t-1) - b_1S(t-1) + cV(t) + V(t)\delta_1(t).

    The quantity  m_1 plays the role of the duration: Dependence upon the change in interest rates. This coefficient is negative because returns of stocks and bonds decrease when interest rates increase. Next,  a_1 is the coefficient for the bubble measure. Of course, this is also negative, since being in a bubble implies low future returns. Same is true for the long-short spread, as discussed at the top of this post. Numerical values of coefficients are:

     h_1 = 26.851013, m_1 = 7.823798, c = -1.356810, a_1 = 16.439790, b_1 = 3.411985.

    8. Finally, for international stocks we do the same. Thus we write this regression as

     Q_2(t) = h_2 - m_2(R(t) - R(t-1)) - a_2B(t-1) + c_2V(t) + V(t)\delta_2(t).

    Note that  m_2 is still the duration. Although  R(t) is BAA rate, which is the USA, but it influences the international stocks as well. Same for the bubble measure. Numerical values of coefficients are:

     h_2 = 25.607625, m_2 = 4.081440, c_2 = -1.919254, a_2 = 12.617889.

    Remark. In the regression for earnings growth, we tried  S(t) - S(t-1) instead of  S(t-1) , but the p-value for the Student test is too large. Also, we tried  R(t-1) instead of  R(t) - R(t-1) , but the p-value for the Student test is too large. We used covariates in linear regressions if  p < 20\%. Similarly, for the international stocks, we found that spread  S(t-1) is not statistically significant.

    Innovations. Thus we have 8 (eight) innovation series. Only  \delta_i(t) for  i = 0, 1, 2, can be modeled as Gaussian. But all of them, judging by the autocorrelation function for  \delta_i and for  |\delta_i| and other tests, can be modeled as independent identically distributed random variables.

    I do not have energy to present all these plots for each innovation series. But below is the table. Here, ACF is the L1 norm for the first 5 lags of the autocorrelation function values. Kurtosis is normalized so for normal distribution it is zero. Of course, the same is true for skewness.

    SeriesLengthSkewnessKurtosisShapiro-Wilk pJarque-Bera pACF original valuesACF absolute values
    Volatility960.4010.4010.4010.4010.4010.237
    BAA rate970.0081.7540.0080.0020.3750.655
    Long-short spread971.0583.3820.0000.0000.4550.468
    Earnings growth970.6142.9030.0000.0000.4740.253
    Bubble measure97-0.8161.1020.0030.0000.2910.608
    US corporate bond returns520.1930.2380.8570.8000.8780.706
    US stock returns970.0390.1570.3440.9400.4130.590
    International stock returns55-0.015-0.2020.9410.9530.5270.331

    Covariance matrix

     \begin{array}{lrrrrrrrr} \textsc{Series} & \text{vol} & \text{spread} & \text{baa-rate} & \text{earn-growth} & \text{bubble} & \text{usa-ret} & \text{intl-ret} & \text{bond-ret} \\ \text{vol} & 0.1342 & 0.0179 & 0.0134 & -0.1393 & -0.0273 & -0.0253 & -0.0316 & 0.0724 \\ \text{spread} & 0.0179 & 1.0797 & -0.0270 & -0.1194 & -0.0106 & -0.1961 & -0.2815 & -0.3577 \\ \text{baa-rate} & 0.0134 & -0.0270 & 0.0156 & -0.0398 & -0.0120 & -0.0151 & -0.0008 & -0.1121 \\ \text{earn-growth} & -0.1393 & -0.1194 & -0.0398 & 3.1196 & -0.0136 & 0.3457 & 0.4155 & -1.1972 \\ \text{bubble} & -0.0273 & -0.0106 & -0.0120 & -0.0136 & 0.0344 & 0.1699 & 0.0914 & 0.1133 \\ \text{usa-ret} & -0.0253 & -0.1961 & -0.0151 & 0.3457 & 0.1699 & 1.8470 & 0.7857 & 0.9502 \\ \text{intl-ret} & -0.0316 & -0.2815 & -0.0008 & 0.4155 & 0.0914 & 0.7857 & 2.9181 & -0.3106 \\ \text{bond-ret} & 0.0724 & -0.3577 & -0.1121 & -1.1972 & 0.1133 & 0.9502 & -0.3106 & 7.0280 \\ \end{array}

    Correlation matrix

     \begin{array}{lrrrrrrrr} \textsc{Series} & \text{vol} & \text{spread} & \text{baa-rate} & \text{earn-growth} & \text{bubble} & \text{usa-ret} & \text{intl-ret} & \text{bond-ret} \\ \text{vol} & 1.0000 & 0.0472 & 0.2919 & -0.1800 & -0.4234 & -0.0516 & -0.0493 & 0.0725 \\ \text{spread} & 0.0472 & 1.0000 & -0.2082 & -0.0544 & -0.0575 & -0.1388 & -0.1304 & -0.1043 \\ \text{baa-rate} & 0.2919 & -0.2082 & 1.0000 & -0.1510 & -0.5435 & -0.0890 & -0.0036 & -0.3101 \\ \text{earn-growth} & -0.1800 & -0.0544 & -0.1510 & 1.0000 & -0.1304 & 0.1204 & 0.1036 & -0.1878 \\ \text{bubble} & -0.4234 & -0.0575 & -0.5435 & -0.1304 & 1.0000 & 0.7038 & 0.3467 & 0.2714 \\ \text{usa-ret} & -0.0516 & -0.1388 & -0.0890 & 0.1204 & 0.7038 & 1.0000 & 0.3705 & 0.2817 \\ \text{intl-ret} & -0.0493 & -0.1304 & -0.0036 & 0.1036 & 0.3467 & 0.3705 & 1.0000 & -0.0701 \\ \text{bond-ret} & 0.0725 & -0.1043 & -0.3101 & -0.1878 & 0.2714 & 0.2817 & -0.0701 & 1.0000 \\ \end{array}

    See below the p-values for the Student T-test for null hypothesis which is zero correlation between series of innovations.

     \begin{bmatrix} 1 & 65 & 0 & 8 & 0 & 62 & 72 & 61 \\ 65 & 1 & 4 & 60 & 58 & 17 & 34 & 46 \\ 0 & 4 & 1 & 14 & 0 & 39 & 98 & 3 \\ 8 & 60 & 14 & 1 & 14 & 24 & 45 & 18 \\ 0 & 58 & 0 & 14 & 1 & 0 & 1 & 5 \\ 62 & 17 & 39 & 24 & 0 & 1 & 1 & 4 \\ 72 & 34 & 98 & 45 & 1 & 1 & 1 & 62 \\ 61 & 46 & 3 & 18 & 5 & 4 & 62 & 1 \\ \end{bmatrix}

    June 19, 2025

  • Simplified Simulator Version

    The main simulator gives users a choice of portfolio: US stocks, international stocks, and bonds. Moreover, the main choice in the portfolio is the proportion of stocks and bonds. In the newest version, we allowed this proportion to vary and not to be fixed throughout these simulated years. For example, at the start of 30 years, the stock/bond split can be 80/20, and at the end, 50/50. This is done to make room for retirement planning. Usually, people choose to invest more in risky assets at the start of their savings journey, and to make it less risky when they become closer to retirement. Within retirement, it is wise to do the converse: Invest in bonds less and less as you progress through the retirement.

    Some users might be confused by this variability. In addition to options for withdrawals/contributions, this can be challenging to navigate. In practice, most users care about only a few modes: saving before retirement and living in retirement. To this end, I created a simplified version with the following options:

    Risk Tolerance: High (Assertive), Mid (Moderate), Low (Conservative)

    Your Goal:

    • Lump-Sum Investing: invest initial wealth, amount provided by user, and do not contribute annually
    • Regular Savings: start with zero wealth, contribute annually a fixed nominal amount, provided by user, which grows 3% annually
    • Retirement Spending: invest initial wealth, amount provided by user, and withdraw annually fixed nominal amount, initially it is 4% of the initial wealth, according to the celebrated 4% retirement rule, and grows 4% annually

    Why choose 3% annual increase for annual contributions and 4% annual increase for annual withdrawals? Historically, inflation was running around 3% annually in 1928-2024. We consider only nominal (not inflation-adjusted) returns, because we could not model inflation-adjusted version of corporate bond returns. Thus we need some compensation for inflation.

    Below, we provide the split between stocks and bonds: stocks/bonds. Stocks include both USA and international.

    Lump-Sum Investing and Regular Savings:

    • Conservative: 60/40 constant during simulation
    • Moderate: 90/10 at the start, 60/40 at the end, linear during simulation
    • Assertive: 90/10 constant during simulation

    Retirement Spending:

    • Conservative: 60/40 constant during simulation
    • Moderate: 60/40 at the start, 90/10 at the end, linear during simulation
    • Assertive: 90/10 constant during simulation

    Other than that, in this simulator the user can choose the number of years, wealth (initial investment in case of lump-sum investing or retirement spending, or annual investment in case of regular savings), but not growth rate.

    June 18, 2025

  • Disclaimer

    This is the usual disclaimer for risky financial products, found at every investment company web site.

    The performance data shown represent past performance, which is not a guarantee of future results. Investment returns and principal value will fluctuate so that investors’ shares, when sold, may be worth more or less than their original cost. Current performance may be lower or higher than the performance data cited. The performance of an index is not an exact representation of any particular investment, as you cannot invest directly in an index.

    Assets represented by these portfolios and indices in the simulator are not guaranteed and protected by the US government, including Federal Deposit Insurance Corporation, and may lose value, including the loss of principal.

    I, Andrey Sarantsev, PhD, the creator of this simulator, discount any responsibility, legal or otherwise, resulting in the use of this simulator. Any harms, losses, or damages resulting from this use are the responsibility of the user, not me.

    For professional advice on investments or retirement, speak to your financial adviser or retirement adviser.

    June 18, 2025

  • Updated Simulator of S&P, International Stocks, Corporate Bonds, with Rate and Volatility as Factors

    See my GitHub repository simulator-current for the HTML frontend pages, Python backend code in Flask, Excel data file, and Python code for validation of the model described below.

    Continuing the previous post, we updated the financial simulator to make for geometric returns instead of arithmetic returns. We had mistakenly made linear regression for arithmetic returns, but this does not work well, since such returns can be only greater than  -100\% . Thus we replaced returns of all three asset classes (US stocks, developed-markets stocks, US bonds) from arithmetic to geometric. To compute portfolio returns, we later convert these geometric returns to arithmetic returns. The updated version is specified below.

    1. Description and system of equations.
    2. Testing innovations for white noise.
    3. Testing innovations for normality.

    Description and system of equations

    We use the following autoregression equation for annual volatility for S&P 500 and its predecessor, S&P 90, computed by Angel Piotrowski:

     \ln V(t) = \alpha + \beta \ln V(t) + W(t) with \alpha = 0.847850 and \beta = 0.620146.

    Next, we use the following autoregression for BAA rate, following previous research:

     \ln R(t) = \gamma + \theta \ln R(t-1) + Z(t) with  \gamma = 0.107208 and \theta = 0.942062.

    We use the logarithm because otherwise the rate might become negative, even with very small probability.

    We consider three classes of assets and denote their annual geometric total returns (multiplied by 100 for normalization):

    • USA stocks, measured by Standard & Poor 500 index and its predecessor, the Standard & Poor 90 index  Q_1(t)
    • International stocks, measured by MSCI EAFE (Europe/Australasia/Far East) index  Q_2(t)
    • USA corporate investment-grade bonds, measured by Bank of America ICE index (ratings AAA, AA, A, BBB)  Q_0(t)

    We normalize the two stock returns by dividing them by annual volatility. But we do not normalize the bond returns. We have the following equations for these three classes of assets:

    • For USA corporate bonds, following this blog post, we get:  Q_0(t) - R(t-1) = a_0 - c_0(R(t) - R(t-1)) + \delta_0(t) with  a_0 = -1.661125 and  c_0 = 5.588353
    • For USA stocks, following this blog post, we get:  Q_1(t) = a_1 - b_1V(t) - c_1(R(t) - R(t-1)) + \delta_1(t) with  a_1 = 21.206735 and  b_1 = 1.088812 and  c_1 = 6.211856.
    • For international stocks, similarly to USA stocks, we get:  Q_2(t) = a_2 - b_2V(t) - c_2(R(t) - R(t-1)) + \delta_2(t) with  a_2 = 27.494934 and  b_2 = 1.929750 and  c_2 = 3.996489.

    Thus all three classes of assets have returns highly dependent upon change in interest rates, with duration (regression coefficient)  c_i for returns  Q_i. Note that  Q_1 and  Q_2

    All five series of residuals  \delta_0, \delta_1, \delta_2, Z, W are well-modeled by independent identically distributed random variables, judging by Monte Carlo simulation. I present the autocorrelation plots for them and their absolute values below.

    Unfortunately, they are not normal. Namely,  Z and  W (the innovations for factor autoregressions) are closer to skew-normal. I did not yet pursue this direction of research. But the other three residual series  \delta_0, \delta_1, \delta_2 are Gaussian. I discuss their normality below.

    However, I still modeled these five series as multivariate Gaussian with mean vector zero and the following empirical covariance matrix.

     \begin{bmatrix} 0.132753 & -0.063165 & -0.054849 & 0.014952 & 0.072379 \\ -0.063165 & 2.2129 & 0.913203 & -0.028769 & 1.531643 \\ -0.054849 & 0.913203 & 3.018478 & -0.00741 & -0.174603 \\ 0.014952 & -0.028769 & -0.00741 & 0.01842 & -0.057425 \\ 0.072379 & 1.531643 & -0.174603 & -0.057425 & 6.892802 \\ \end{bmatrix}

    We consider only nominal, not real returns. To compensate for that, withdrawals/contributions can change annually.

    We allow for constant split between US and international stocks. Bond and overall stock percentages might change from year to year linearly. This is to allow for a more conservative portfolio as time goes.

    Also, and very importantly, we added a separate web page with a simplified version of this simulator. We describe it in a separate post.

    Initial value for volatility is taken as average daily close VIX June 1, 2024 – May 31, 2025. The initial value for the BAA rate is taken as average daily May 2025.

    Testing innovations for white noise

    Below are autocorrelation function plots for the five series of innovations. Original and absolute in the captions refer to whether innovations are taken as is or after taking absolute values.

    •  W(t) has tag ln-vol
    •  Z(t) has tag ln-baa
    •  \delta_0(t) has tag bonds
    •  \delta_1(t) has tag usa-stocks
    •  \delta_2(t) has tag intl-stocks

    Also, we compute L1 norms for first 5 values of the ACF, for original innovations and their absolute values. Comparing with these threshold values, we see that it is reasonable to model these as independent identically distributed.

    N Data Points9697529755
    Innovations W(t)  Z(t)  \delta_0(t)  \delta_1(t)  \delta_2(t)
    Original  x 0.400.180.880.480.49
    Absolute  |x| 0.240.360.710.430.58

    Testing innovations for normality

    Similarly, we plot the five quantile-quantile plots versus the normal distribution below. Tags are the same.

    Also, see p-values for statistical testing for normality: Shapiro-Wilk (SW) and Jarque-Bera (JB) testing.

    Test W(t)  Z(t)  \delta_0(t)  \delta_1(t)  \delta_2(t)
    SW0.9%0.6%86%42%99%
    JB6.1%0.009%80%66%90%

    Finally, let us provide skewness and kurtosis for these innovations, normalized so that for the normal distribution they are 0.

    Function W(t)  Z(t)  \delta_0(t)  \delta_1(t)  \delta_2(t)
    Skewness0.590.810.190.23-0.13
    Kurtosis0.0571.40.240.0680.13

    Summary:  \delta_i(t) are well modeled by normal, but  W(t) and  Z(t) are not.

    June 18, 2025

Next Page

Blog at WordPress.com.

 

Loading Comments...
 

    • Subscribe Subscribed
      • My Finance
      • Already have a WordPress.com account? Log in now.
      • My Finance
      • Subscribe Subscribed
      • Sign up
      • Log in
      • Report this content
      • View site in Reader
      • Manage subscriptions
      • Collapse this bar