Methodology · 2026-05-21 · by Dipsern Research

Why Prediction Error Matters More Than the Prediction Itself

A point estimate without an error bar is a story, not a forecast. Here is how Dipsern computes prediction error, why it matters more than the headline number, and how to reason like a Bayesian without the math.

"A weather forecaster who says it will rain tomorrow is interesting. A weather forecaster who says there is a 70% chance of rain — and is calibrated — is useful."

The same logic applies to financial forecasts, and it is almost universally ignored. Open any retail investing newsletter or YouTube channel and you will find point estimates everywhere: "I expect this stock to go to $200." "Bitcoin will hit $150k by year-end." "The market will be up 12% next year."

None of those statements come with an error bar. None of them tell you how often the forecaster has been right before, or by how much they have been wrong. The number is presented as if it were a measurement, when it is in fact a guess from a distribution the speaker cannot describe.

This post is about the missing other half of every forecast: the error around the estimate. It is the difference between a useful signal and a confident-sounding story.

The Error Trap

Imagine two forecasters. Forecaster A says: "Stock X will return 8% over the next 90 days." Forecaster B says: "Stock Y will return 8% over the next 90 days, with a historical average error of plus or minus 6%."

Most retail investors treat both forecasts identically — they hear "8%" and reach for the buy button. But the two statements are profoundly different.

Forecaster A has given you a point estimate divorced from any reality check. You have no idea whether their typical forecast is off by 1% or by 30%. The number "8%" could mean anything from "almost certainly between 6% and 10%" to "anywhere from -40% to +50%."

Forecaster B has given you something usable. You know that when they predict 8%, they have historically missed by about 6 points on average, so a realized outcome anywhere from 2% to 14% would be unremarkable. That is a different kind of object: a forecast in the statistical sense, not a story.

The error trap is the habit of acting on the first kind of statement as if it were the second.

A Simple Thought Experiment

Suppose you have to choose between two bets, each costing the same.

Bet A: Expected payoff +8%, average historical error ±5%.
Bet B: Expected payoff +8%, average historical error ±25%.

Both have the same expected return. In an idealized expected-value world, they are identical. But in any world where you care about variance — that is, the real world, where you have a finite portfolio and a finite life — they are not identical at all.

Bet A's outcomes cluster tightly around +8%. Most realizations will be between +3% and +13%. The signal is robust.

Bet B's outcomes range wildly. Many realizations will be deeply negative; others will be spectacularly positive. The "+8%" point estimate is a center of mass, not a useful prediction. You could lose 17% on any given Bet B even though the average is positive.

In professional risk management, this is the difference between alpha and noise. On any single bet, the 8% edge in Bet B is swamped by variation roughly three times its size.

Yet retail investors are bombarded with Bet-B-style claims daily, presented as if they were Bet A.
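
To put rough numbers on the contrast, here is a minimal sketch that assumes forecast errors follow a Laplace distribution centered on the point estimate. That distribution is an illustrative choice, not something Dipsern specifies; it is convenient because its mean absolute deviation equals its scale parameter, so the "±" figure plugs in directly. The function name `prob_below` is hypothetical.

```python
import math


def prob_below(threshold, center, mae):
    """P(outcome < threshold) under an assumed Laplace(center, mae) model.

    For a Laplace distribution the mean absolute deviation equals the
    scale parameter, so `mae` is used directly as the scale.
    """
    z = (threshold - center) / mae
    if z < 0:
        return 0.5 * math.exp(z)
    return 1.0 - 0.5 * math.exp(-z)


# Both bets share the same +8% expected payoff; only the error differs.
for name, mae in [("Bet A", 5.0), ("Bet B", 25.0)]:
    p_loss = prob_below(0.0, center=8.0, mae=mae)
    print(f"{name}: chance of a negative outcome is about {p_loss:.0%}")
```

Under this assumption, Bet A ends negative roughly one time in ten and Bet B roughly one time in three, despite the identical headline number. A different error distribution would shift those figures but not the ordering.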

How Dipsern Computes Prediction Error

Dipsern's engine produces a forecast for each asset based on its current drawdown bucket: the rolling median of historical forward returns from observations in that bucket. The forecast is, by construction, a backward-looking statistic. So the natural error metric is also backward-looking.

For every historical observation, Dipsern can ask: "At the time this observation was made (no look-ahead), what did the model predict the forward return would be? What was the actual realized forward return? What is the difference?"

The absolute value of that difference, averaged across all historical observations for the asset, is the prediction error.

In math-light form:

error_t = |actual_forward_return_t - predicted_at_origin_t|
prediction_error = mean(error_t) across all valid t

A few details that matter:

The prediction at time t uses only data revealed by time t. The engine never peeks forward. Dipsern's test suite has an explicit regression test (test_no_lookahead_bias) that fails if this invariant is ever broken.
The error is in the same units as the forecast (percentage points of forward return).
It is a mean absolute error, not a standard deviation. We prefer MAE because the forward return distribution is not Gaussian: equity returns have fat tails, and a standard deviation squares each miss, so a handful of extreme observations can dominate it. MAE reads directly as the typical size of a miss.

The result is a single, interpretable number you can read alongside the median forecast: "Historical median forward return is +X%, with an average error of ±Y%."
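
The calculation can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not Dipsern's engine: it ignores drawdown buckets and treats the input as one series of realized forward returns, one per observation date. The names `walk_forward_mae`, `horizon`, and `min_history` are hypothetical. `horizon` is the number of steps it takes for a forward return to become known, which is what keeps the median free of look-ahead.

```python
from statistics import median


def walk_forward_mae(forward_returns, horizon=1, min_history=3):
    """Mean absolute error of an expanding-median forecast, no look-ahead.

    forward_returns[i] is the return realized over the `horizon` steps
    after observation i, so it only becomes known at step i + horizon.
    Returns None when no observation has enough history to be scored.
    """
    errors = []
    for t, actual in enumerate(forward_returns):
        # Only observations whose forward window has closed by time t.
        known = forward_returns[: max(0, t - horizon + 1)]
        if len(known) < min_history:
            continue
        predicted_at_origin = median(known)
        errors.append(abs(actual - predicted_at_origin))
    if not errors:
        return None
    return sum(errors) / len(errors)


# Toy series in percentage points (made-up numbers).
sample = [4.0, 6.0, 8.0, 2.0, 10.0]
print(walk_forward_mae(sample, horizon=1, min_history=3))
```

Raising `horizon` shrinks the history available at each origin, which is the price of respecting when each forward return was actually revealed.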

Three Tickers, Same Prediction, Different Error

Below is an illustrative comparison. The point estimates are identical, but the error tells you very different things about each asset. Numbers are stylized for illustration.

| Ticker | Median 90d forecast | Prediction error | Win rate | Sample size |
|---|---|---|---|---|
| Asset A (large-cap ETF) | +6% | ±4% | 68% | 400+ |
| Asset B (mid-cap stock) | +6% | ±11% | 58% | 120+ |
| Asset C (small-cap, recent IPO) | +6% | ±22% | 51% | 18 |

All three show the same median forecast. But the messages are different.

Asset A's tight error band suggests that when historical drawdown was at this level, forward returns clustered reliably near 6%. The signal is robust.

Asset B has a wider band, nearly triple Asset A's error. Forward returns from this bucket were positive on average but with substantial variation. Treat the 6% as a center of mass, not a target.

Asset C is barely a signal at all. With only 18 observations and an error of ±22%, the "6% median" is statistically indistinguishable from "anywhere between -16% and +28%." The win rate near 50% confirms it: this is a coin flip dressed up as a forecast.

Same headline, three different realities.
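
The bands quoted for the three assets are simply the median plus or minus the prediction error. A small sketch using the stylized numbers from the table (the helper name `error_band` is hypothetical):

```python
# (name, median 90d forecast in %, prediction error in %), from the table.
rows = [
    ("Asset A", 6.0, 4.0),
    ("Asset B", 6.0, 11.0),
    ("Asset C", 6.0, 22.0),
]


def error_band(median_forecast, prediction_error):
    """Return (low, high): the median minus and plus one average miss."""
    return median_forecast - prediction_error, median_forecast + prediction_error


for name, median_forecast, prediction_error in rows:
    low, high = error_band(median_forecast, prediction_error)
    print(f"{name}: {low:+.0f}% to {high:+.0f}%")
```

Note that this band is one average miss on either side of the median, not a confidence interval; realized returns can and do land outside it.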

How to Interpret 5% vs 15% Prediction Error

There is no universal threshold for "good" versus "bad" error, but the following rough heuristics hold across most asset classes Dipsern covers.

Error under 5%. Very tight. Typically appears in large, mature assets with long price histories and dense bucket sample sizes. Treat the median as a reasonably reliable central tendency.

Error 5-10%. Normal range for most equities at moderate drawdown levels. The median is informative but the realized return will commonly be a few percentage points off either way. Position sizing should respect that uncertainty.

Error 10-15%. Elevated. Common in volatile assets (small-caps, sector ETFs, crypto). The median is still useful as a base rate but should not be taken as a target. The right framing is: "On average, this level produces positive returns, but individual realizations vary widely."

Error above 15%. High noise. Either the asset is genuinely volatile, the sample size is too small, or the drawdown bucket sits on a fat-tail event that distorts the average. Be skeptical of any point estimate; treat the asset as a discretionary call, not a statistical one.

The right mental model: the error is the resolution of the forecast. A 5% error is a high-resolution forecast; a 20% error is a blurry one. Both can be informative, but only if you respect what each can and cannot tell you.
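
These heuristics are easy to encode as a lookup. The sketch below is one possible reading: the ranges above share their endpoints, so how exact boundary values (5, 10, 15) are assigned here is an assumption, and `classify_error` is a hypothetical name.

```python
def classify_error(error_pct):
    """Map a prediction error (percentage points) to a rough label.

    Boundary handling is an assumption: 5 counts as "normal",
    10 and 15 count as "elevated".
    """
    if error_pct < 0:
        raise ValueError("prediction error is an absolute value; it cannot be negative")
    if error_pct < 5:
        return "tight"
    if error_pct < 10:
        return "normal"
    if error_pct <= 15:
        return "elevated"
    return "high noise"


for err in (4, 11, 22):
    print(f"±{err}% -> {classify_error(err)}")
```

The labels are rough guides rather than thresholds with statistical meaning, so values near a boundary deserve a second look.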

The Bayesian Connection

There is a deeper principle hiding behind all of this, and it is the Bayesian idea that every belief should come with a credible interval, not just a point.

Bayesian thinkers do not say "I believe X." They say "I assign 65% probability to X being true." When the world supplies new evidence, they update — they do not abandon old beliefs entirely, but they shift the probability mass.
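
The update step can be sketched with Bayes' rule for a single yes/no belief. The 65% prior comes from the example above; the two likelihoods are made-up numbers for illustration, and `update_belief` is a hypothetical helper.

```python
def update_belief(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior P(X | evidence) from Bayes' rule for a binary belief X."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    if denominator == 0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return numerator / denominator


# Evidence assumed twice as likely if X is true (0.8) as if it is false (0.4).
posterior = update_belief(prior=0.65, p_evidence_if_true=0.8, p_evidence_if_false=0.4)
print(f"Belief moves from 65% to about {posterior:.0%}")
```

Evidence that favors X moves the belief up, but not to certainty, which is the "shift the probability mass" behavior described above.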

Prediction error is the empirical, frequentist cousin of a Bayesian credible interval. It does not produce a posterior distribution explicitly, but it tells you how widely past realizations have differed from past forecasts, which is exactly the information you need to know how much weight to put on a new forecast.

A useful habit: every time you read "the model expects X%," mentally append "...with an average historical error of Y%." If you do not know Y, you do not really know the forecast.

When Error Swamps the Signal

A practical rule, useful for filtering noise:

If the prediction error is larger than the absolute value of the predicted return, the signal is below the noise floor.

Concretely, a forecast of +6% with an error of ±10% has more error than signal. The realized outcome will routinely be negative even though the central tendency is positive. Trading on such a forecast is essentially trading on randomness with extra steps.

This rule is conservative — there are situations where a positive-skew distribution still rewards systematic exposure even when error exceeds the mean — but for a retail investor without sophisticated portfolio construction tooling, treating "error larger than signal" as a hard skip is a reasonable default.

It also has the nice property of penalizing small sample sizes in practice. With few observations, the bucket median is itself a noisy estimate, so its out-of-sample error tends to be larger, which pushes thinly sampled buckets below the noise floor.
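
The rule reduces to a one-line comparison. A minimal sketch, with `below_noise_floor` as a hypothetical name and both inputs in percentage points:

```python
def below_noise_floor(predicted_return, prediction_error):
    """True when the average historical miss exceeds the size of the forecast."""
    return prediction_error > abs(predicted_return)


# The +6% / ±10% example from this section, then a tighter ±4% case.
print(below_noise_floor(6.0, 10.0))
print(below_noise_floor(6.0, 4.0))
```

Applied to the stylized table earlier in the post, Asset A (±4%) passes while Asset B (±11%) and Asset C (±22%) both fall below the floor. Treating equality as passing is a choice made here, not part of the rule as stated.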

Key Takeaways

A point estimate without an error bar is a story, not a forecast.
Prediction error tells you how reliable the headline number actually is.
Dipsern computes mean absolute error of historical forecasts versus realized returns — no look-ahead, same units as the forecast.
Under 5% error is tight; 5-10% normal; 10-15% elevated; above 15% high noise.
If error exceeds the absolute value of the predicted return, treat the signal as noise.
This is the empirical version of Bayesian credible intervals, and it changes how you size positions, not just whether you take them.

Try It Yourself

Pull up two assets you know well in Dipsern: one large-cap with a long history, one small-cap or recent IPO. Compare the prediction error at similar drawdown levels. You will almost certainly see that the smaller, less-sampled asset has multiple times the error — even when the headline median return looks similar. Try VOO versus a more volatile name to see the contrast clearly.

Educational content. Past performance does not guarantee future results. This is not financial advice.
