Accruals, out of sample, 2016 to 2026
Sloan's 1996 accruals anomaly claims that low-accruals firms (earnings backed by cash) outperform high-accruals firms by ~10.4% per year. On 123 months of US equity data with cash-flow-statement accruals, the primary VW NYSE-breakpoint spec produces +0.883% per month, almost exactly the canonical 0.87% monthly magnitude, but the i.i.d. t is +1.85, just below the rubric threshold of 2, and the Newey-West 12-lag t is +1.74. Verdict: FAILED by pre-registered rubric. Headline caveat: two of the three EW sensitivities REPLICATE under both i.i.d. and Newey-West (Sens A: +3.81 NW t, Sens B: +2.39 NW t), which suggests the accruals effect has migrated to smaller stocks, where the VW primary underweights it. This is a new kind of near-miss in the archive: FAILED at the canonical magnitude, with sensitivities that clearly replicate.
Source paper
Sloan (1996) "Do Stock Prices Fully Reflect Information in Accruals and Cash Flows about Future Earnings?"
The claim
Sloan’s 1996 paper in The Accounting Review argued that the market underweights the distinction between the cash and accruals components of earnings. Firms whose earnings are backed by operating cash flow generate persistent future earnings; firms whose earnings depend heavily on accruals (deferrals, accrued receivables, inventory buildup, and similar non-cash items) generate earnings that are less persistent and tend to reverse. Investors who ignore the distinction are surprised by the reversal, and the paper documents that a trading rule of long low-accruals firms, short high-accruals firms earns significant abnormal returns.
The canonical magnitude reported in Sloan’s original 1962-1991 sample on NYSE/AMEX/NASDAQ is approximately 10.4% per year on the decile long-short, or roughly 0.87% per month (10.4% / 12), in the neighborhood of the Jegadeesh-Titman momentum canonical. The headline magnitude is cited from secondary sources (Dechow-Khimich-Sloan’s SSRN review and Quantpedia’s accrual anomaly page) per Nullberg’s math-verification rule, because the primary JSTOR PDF returned HTTP 403 at the time of this verdict.
What we tested
The goal is to see whether the accruals anomaly still produces a significant long-short spread in the 2016-2026 US equity out-of-sample window, with the modern cash-flow-statement-based accruals measure rather than Sloan’s original balance-sheet formula.
Sample and data
- 2016-01-05 to 2026-04-09 daily closes, 123 usable formation months in the primary
- Cash-flow and balance-sheet fundamentals from FMP (cash_flows.parquet, 483K rows; balance_sheets.parquet, 485K rows)
- 8,479-row FMP company profile snapshot (exchange, sector, marketCap, ETF/fund flags)
- Stocks-only (NYSE + NASDAQ + AMEX; no ETFs, no funds, no OTC/CBOE/PNK)
- Financials and real estate excluded (Fama-French SIC 6 convention)
Accruals definition — Hribar and Collins (2002) cash-flow-statement approach:
accruals_cfs = Net Income − Cash Flow from Operating Activities
Hribar and Collins showed that this cash-flow-statement definition is cleaner than Sloan’s original balance-sheet formula, ((ΔCA − ΔCash) − (ΔCL − ΔSTDebt − ΔTaxesPayable) − Dep) / average total assets, because balance-sheet changes include non-operating distortions (M&A, divestitures, discontinued operations) that corrupt the accruals measure. The two definitions are highly correlated in practice, and the modern factor literature uses the cash-flow-statement version. Nullberg uses it here as the primary.
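To make the difference between the two definitions concrete, here is a minimal sketch with made-up numbers. The function names and the toy firm are illustrative assumptions, not part of the verdict script; the balance-sheet version follows Sloan's published formula, including the taxes-payable term.

```python
def accruals_cfs(net_income, cfo):
    """Cash-flow-statement accruals: Net Income minus Cash Flow from Operations."""
    return net_income - cfo


def accruals_balance_sheet(d_ca, d_cash, d_cl, d_std, d_tp, dep):
    """Balance-sheet accruals: (dCA - dCash) - (dCL - dSTDebt - dTaxesPayable) - Dep."""
    return (d_ca - d_cash) - (d_cl - d_std - d_tp) - dep


# Hypothetical firm-year, all figures invented for illustration.
cfs = accruals_cfs(net_income=50.0, cfo=80.0)

# Same year with no M&A activity: the two definitions agree in this toy case.
bs_clean = accruals_balance_sheet(d_ca=20.0, d_cash=5.0, d_cl=10.0,
                                  d_std=3.0, d_tp=2.0, dep=40.0)

# Same year, but an acquisition adds 25 of current assets to the balance sheet.
# The balance-sheet measure absorbs it; the cash-flow-statement measure does not.
bs_with_acquisition = accruals_balance_sheet(d_ca=45.0, d_cash=5.0, d_cl=10.0,
                                             d_std=3.0, d_tp=2.0, dep=40.0)

print(cfs, bs_clean, bs_with_acquisition)
```

The acquisition case is the kind of non-operating distortion Hribar and Collins describe: the balance-sheet number moves even though nothing about the firm's operating accruals has changed.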
Sort score — TTM cash-flow-statement accruals divided by same-period total assets:
gp_a_accruals = TTM_accruals / totalAssets
Where TTM_accruals = rolling 4-quarter sum of quarterly (NI − CFO) per symbol. Fundamentals are joined to monthly returns via pandas.merge_asof on filingDate (strict point-in-time, no look-ahead).
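The TTM construction and the point-in-time join can be sketched as follows. This is a minimal illustration, not the verdict script: the column names (`netIncome`, `cfo`, `formation_date`, `score`) and the toy values are assumptions, and only `pandas.merge_asof` and `filingDate` are taken from the description above.

```python
import pandas as pd

# Hypothetical quarterly fundamentals for one symbol (values are made up).
fund = pd.DataFrame({
    "symbol": ["AAA"] * 5,
    "filingDate": pd.to_datetime(
        ["2019-02-10", "2019-05-10", "2019-08-10", "2019-11-10", "2020-02-10"]),
    "netIncome": [10.0, 12.0, 11.0, 13.0, 14.0],
    "cfo": [8.0, 9.0, 12.0, 10.0, 9.0],
    "totalAssets": [200.0, 205.0, 210.0, 215.0, 220.0],
}).sort_values(["symbol", "filingDate"])

# Quarterly accruals, then a rolling 4-quarter sum within each symbol.
fund["accr_q"] = fund["netIncome"] - fund["cfo"]
fund["ttm_accruals"] = (
    fund.groupby("symbol")["accr_q"].transform(lambda s: s.rolling(4).sum())
)
fund["score"] = fund["ttm_accruals"] / fund["totalAssets"]

# Hypothetical month-end formation dates.
months = pd.DataFrame({
    "symbol": ["AAA"] * 3,
    "formation_date": pd.to_datetime(["2019-10-31", "2019-11-30", "2020-02-29"]),
})

# Backward as-of join: each formation date sees only the latest filing
# dated on or before it, so no future fundamentals leak in.
joined = pd.merge_asof(
    months.sort_values("formation_date"),
    fund.sort_values("filingDate")[["symbol", "filingDate", "score"]],
    left_on="formation_date",
    right_on="filingDate",
    by="symbol",
    direction="backward",
)
print(joined)
```

The first formation date has no score because fewer than four quarters had been filed by then. Passing `allow_exact_matches=False` would additionally exclude filings dated on the formation date itself; whether the verdict script does so is not stated.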
Sign convention — Sloan’s long-short is D1 (low accruals) minus D10 (high accruals). Low-accruals firms have cash-backed earnings and are the “long” side; high-accruals firms have less-persistent earnings and are the “short” side. A positive mean is the paper’s predicted direction.
Four specifications reported in full:
- PRIMARY (VW, NYSE breakpoints, stocks-only, financials excluded, TTM accruals, winsorized): the canonical specification most comparable to the original Sloan 1996 methodology. Value-weighted within deciles using FMP snapshot marketCap.
- Sensitivity A (EW, filtered, winsorized, TTM accruals): equal-weighted with price ≥ $5 and dollar-volume ≥ $1M filter.
- Sensitivity B (EW, filtered, winsorized, single-Q accruals score): same filter but using only the most recent quarterly accruals instead of TTM.
- Sensitivity C (EW, no filter, no winsorization, TTM): the rawest specification.
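The difference between the VW NYSE-breakpoint primary and the EW sensitivities comes down to how deciles are cut and how stocks are weighted inside them. The sketch below shows one way to do both for a single month. The helper names, the toy cross-section, and the choice of `statistics.quantiles` with the inclusive method are assumptions made for illustration; winsorization and the price and dollar-volume filters are omitted.

```python
from bisect import bisect_left
from statistics import quantiles


def nyse_decile(scores, is_nyse):
    """Assign deciles 1..10 to all stocks using cut points from NYSE stocks only."""
    nyse_scores = sorted(s for s, flag in zip(scores, is_nyse) if flag)
    cuts = quantiles(nyse_scores, n=10, method="inclusive")  # 9 cut points
    return [bisect_left(cuts, s) + 1 for s in scores]


def long_short(scores, is_nyse, returns, mktcap, value_weighted=True):
    """D1 (low accruals) minus D10 (high accruals) return for one month."""
    deciles = nyse_decile(scores, is_nyse)

    def leg(d):
        idx = [i for i, x in enumerate(deciles) if x == d]
        w = [mktcap[i] if value_weighted else 1.0 for i in idx]
        return sum(returns[i] * wi for i, wi in zip(idx, w)) / sum(w)

    return leg(1) - leg(10)


# Toy cross-section: 20 NYSE stocks plus 2 small non-NYSE stocks in the tails.
scores = [float(i) for i in range(1, 21)] + [0.5, 25.0]
is_nyse = [True] * 20 + [False] * 2
returns = [0.01] * 22
mktcap = [100.0] * 22
for i, r, m in [(0, 0.03, 100.0), (1, 0.01, 300.0), (20, 0.05, 10.0),
                (18, 0.00, 200.0), (19, 0.02, 200.0), (21, -0.04, 10.0)]:
    returns[i], mktcap[i] = r, m

ls_vw = long_short(scores, is_nyse, returns, mktcap, value_weighted=True)
ls_ew = long_short(scores, is_nyse, returns, mktcap, value_weighted=False)
print(ls_vw, ls_ew)
```

In this invented example the spread is concentrated in the two small non-NYSE stocks, so the equal-weighted long-short comes out several times larger than the value-weighted one.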
Pre-registered verdict thresholds, paper-specific calibration for Sloan’s ~0.87% per month canonical:
- Replicated: mean(D1 − D10) > 0.003 AND t > 2
- Degraded: 0 < mean ≤ 0.003 AND t > 2
- Failed: mean ≤ 0 OR t ≤ 2
- Inconclusive: data quality issue
The 0.003 floor is approximately 35% of canonical, reflecting the well-documented post-publication decay in the accruals literature (Richardson-Sloan-Soliman-Tuna 2005, Green-Hand-Zhang 2017, Dechow-Khimich-Sloan 2011).
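The rubric is mechanical enough to write down as a small function. The sketch below is an illustration only: the function name and the `data_ok` flag are assumptions, and the t-statistic passed in is the i.i.d. t, which is what the verdicts in this archive appear to be keyed on.

```python
def verdict(mean, t, floor=0.003, t_min=2.0, data_ok=True):
    """Classify a long-short result under the pre-registered thresholds.

    mean is the monthly D1 - D10 return as a decimal (0.003 = 0.3% per month).
    """
    if not data_ok:
        return "INCONCLUSIVE"
    if mean <= 0 or t <= t_min:
        return "FAILED"
    if mean > floor:
        return "REPLICATED"
    return "DEGRADED"


# Applied to the four specifications reported on this page (mean, i.i.d. t).
specs = {
    "primary": (0.00883, 1.85),
    "sens_a": (0.01058, 3.56),
    "sens_b": (0.00555, 2.34),
    "sens_c": (0.02071, 1.83),
}
results = {name: verdict(m, t) for name, (m, t) in specs.items()}
print(results)
```

Note that the t-test is checked before the magnitude, so a mean far above the floor still fails when t is at or below 2.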
The numbers
Primary specification
Value-weighted, NYSE breakpoints, stocks-only, financials excluded, TTM accruals, winsorized.
| Metric | Value |
|---|---|
| Sample months | 123 |
| Median stocks / month | 2,348 |
| D1 (low accruals, high quality) mean monthly | +2.912% |
| D10 (high accruals, low quality) mean monthly | +2.029% |
| D1 minus D10 mean monthly | +0.883% |
| i.i.d. t-statistic | +1.85 |
| Newey-West 12-lag t-statistic | +1.74 |
| Annualized Sharpe | — |
| Verdict (pre-registered rubric) | FAILED |
The primary point estimate matches the canonical 0.87% per month almost exactly (0.883% vs the Sloan/Dechow canonical of ~0.87%). The mean clears the REPLICATED floor by a wide margin (0.88% ≫ 0.30%). What fails the rubric is the t-statistic: +1.85 i.i.d., +1.74 Newey-West. Both are just below the |2| threshold.
This is the most unusual “failed” verdict in the Nullberg archive so far: the magnitude is right at the canonical, but 123 months is not quite enough to resolve it at conventional significance with a VW-on-large-caps construction. In the language of statistics, the point estimate says “replicated” while the standard error says “we cannot yet tell”. The pre-registered rubric requires t > 2 no matter how large the mean is, so the verdict is FAILED.
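For readers who want to see what separates the two t-statistics, here is a minimal standard-library sketch of an i.i.d. t and a Newey-West t for the mean of a monthly long-short series. It assumes a Bartlett kernel and autocovariances normalized by n, with no small-sample correction; the verdict script may differ in those details, and the toy series is invented.

```python
import math


def iid_t(x):
    """t-statistic of the mean, treating observations as independent."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    return mean / math.sqrt(var / n)


def newey_west_t(x, lags=12):
    """t-statistic of the mean using a Bartlett-kernel HAC long-run variance."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]

    def gamma(k):
        # Sample autocovariance at lag k, normalized by n.
        return sum(d[t] * d[t - k] for t in range(k, n)) / n

    lrv = gamma(0)
    for k in range(1, lags + 1):
        lrv += 2.0 * (1.0 - k / (lags + 1)) * gamma(k)
    return mean / math.sqrt(lrv / n)


# Toy series that alternates sign around a positive mean (negative autocorrelation).
toy = [0.03, -0.01] * 30
print(iid_t(toy), newey_west_t(toy, lags=0), newey_west_t(toy, lags=1))
```

With positively autocorrelated returns the long-run variance exceeds the plain variance and the Newey-West t falls below the i.i.d. t, as in the primary; with negatively autocorrelated returns the opposite happens.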
Sensitivity A: EW filtered winsorized, TTM
| Metric | Value |
|---|---|
| Median stocks / month | 1,792 |
| D1 - D10 mean monthly | +1.058% |
| i.i.d. t-statistic | +3.56 |
| Newey-West 12-lag t-statistic | +3.81 |
Sensitivity A REPLICATES under both i.i.d. and Newey-West. The Newey-West t actually improves over the i.i.d. t (the monthly series has mildly negative autocorrelation). Mean is +1.06% per month, exceeding even the original Sloan canonical of ~0.87%.
Sensitivity B: EW filtered winsorized, single-Q accruals
| Metric | Value |
|---|---|
| D1 - D10 mean monthly | +0.555% |
| i.i.d. t-statistic | +2.34 |
| Newey-West 12-lag t-statistic | +2.39 |
Using single-quarter accruals instead of TTM (so the score is more reactive to the most recent reported earnings) also REPLICATES under both i.i.d. and Newey-West, though with a smaller magnitude than TTM. Both |t|-statistics clear 2.
Sensitivity C: EW no filter no winsor, TTM
| Metric | Value |
|---|---|
| D1 - D10 mean monthly | +2.071% |
| i.i.d. t-statistic | +1.83 |
| Newey-West 12-lag t-statistic | +1.90 |
The rawest specification produces the largest mean but has elevated variance from the microcap tail, pulling the t-stat back below 2. FAILED per rubric.
What this means
The accruals verdict is genuinely different in shape from the first four verdicts. The VW NYSE primary is underpowered at the exact canonical magnitude, while two of the three EW sensitivities (A and B) REPLICATE under both i.i.d. and Newey-West tests. This is the opposite of the profitability and value patterns, where the large-cap VW specification carried the effect and EW mid-cap portfolios were flat.
The honest interpretation is that the accruals effect has migrated to smaller stocks in the 2016-2026 regime. In Sloan’s original 1962-1991 sample, accruals worked on value-weighted large-cap portfolios — it was a large-cap effect. In the post-2016 regime, it appears to be a smaller-stock earnings-quality effect, consistent with the Green-Hand-Zhang 2017 factor zoo literature that shows the accrual anomaly has shifted toward less liquid segments of the market. One plausible mechanism: large-cap stocks have dense analyst coverage that has learned to price the accruals distinction cleanly, while smaller stocks remain mispriced by the same heuristic that Sloan originally identified.
By the pre-registered rubric, the primary is FAILED. A reader who uses a decision rule of “any specification has to clear significance” would call this REPLICATED, because two of four do. Nullberg’s rubric is specifically the primary spec, so the headline is FAILED — but the page reports every sensitivity in full so the reader can apply their own standard.
Comparative picture across five verdicts
| Paper | Factor class | Primary mean | i.i.d. t | NW t | Verdict | Shape |
|---|---|---|---|---|---|---|
| MAX, Bali et al. 2011 | Lottery | -1.80% | -2.57 | -2.18 | Failed | Sign inverted |
| Momentum, JT 1993 | Trend | -0.13% | -0.16 | -0.26 | Failed | Decayed to null |
| Profitability, Novy-Marx 2013 | Quality | +0.50% | +0.93 | +1.05 | Failed | Underpowered, below canonical |
| Value, FF 1992 | Value | +1.70% | +2.82 | +1.94 | Replicated | Regime-driven, 2022 rebound |
| Accruals, Sloan 1996 | Earnings quality | +0.88% | +1.85 | +1.74 | Failed | Primary at canonical; EW sensitivities replicate |
Five verdicts, five distinct shapes. The rubric has now produced four FAILED and one REPLICATED, and within the four FAILEDs there are four economically different failure modes.
Reproducibility
- Script: scripts/verdicts/sloan_1996_accruals.py
- Results JSON: scripts/verdicts/sloan_1996_accruals.results.json
- Monthly LS CSVs: ..._primary.csv, ..._sensA.csv, ..._sensB.csv, ..._sensC.csv
What we will track from here
This verdict enters the archive as failed and is reviewed when at least one of the following happens:
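- A forward extension of the sample pushes the primary i.i.d. t above 2 at the current point estimate. At +0.88% per month and the current std, t grows with the square root of the sample length, so approximately 21 additional monthly observations would do it (123 × (2 / 1.85)² ≈ 144 months); clearing 2 on the Newey-West t would take approximately 40.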
- A small-cap-tilted or size-neutral specification shows whether the migration-to-smaller-stocks hypothesis is real. A 2×3 size × accruals sort is a natural next test.
- The original Sloan balance-sheet accruals definition, computed by hand from balance-sheet changes, produces a materially different result.
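The sample-extension trigger rests on simple arithmetic: if the mean and standard deviation stay where they are, the t-statistic scales with the square root of the number of months. The sketch below applies that to the primary's reported t-statistics. The function name is an assumption, and for the Newey-West t the square-root scaling is only an approximation.

```python
import math


def months_needed(t_now, n_now, t_target=2.0):
    """Smallest sample length n with t_now * sqrt(n / n_now) >= t_target.

    Assumes the mean and standard deviation of the series do not change.
    """
    return math.ceil(n_now * (t_target / t_now) ** 2)


n_now = 123
extra_iid = months_needed(1.85, n_now) - n_now  # from the i.i.d. t
extra_nw = months_needed(1.74, n_now) - n_now   # from the Newey-West t
print(extra_iid, extra_nw)
```

Both figures are conditional on the point estimate holding; a lower realized mean over the extension would push the required sample out further.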
Bibliography
- Sloan, Richard G. “Do Stock Prices Fully Reflect Information in Accruals and Cash Flows about Future Earnings?” The Accounting Review 71(3), 1996, pp. 289-315.
- Hribar, Paul, and Daniel W. Collins. “Errors in Estimating Accruals: Implications for Empirical Research.” Journal of Accounting Research 40(1), 2002, pp. 105-134. Source of the cash-flow-statement definition used for this verdict.
- Newey, Whitney K., and Kenneth D. West. “A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix.” Econometrica 55(3), 1987, pp. 703-708. HAC estimator used for the Newey-West t-statistics.