Elections Economy: coverage simulation

Source: ElectionsEconomy/hibbs_coverage.Rmd

This example checks a model-fitting procedure by fake-data simulation. We create data from a known regression line, fit the model back to each fake dataset, and record how often nominal intervals for the slope contain the true value.

The original page uses rstanarm::stan_glm. Here the model is the same Gaussian linear regression, fit with statsmodels OLS. For this simple problem, the usual t intervals and the flat-prior Bayesian intervals are nearly the same.

Setup and data

Code
from pathlib import Path
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

root = Path("../../ROS-Examples")
hibbs = pd.read_csv(root / "ElectionsEconomy/data/hibbs.dat", sep=r"\s+")
rng = np.random.default_rng(1)
hibbs.head()
year growth vote inc_party_candidate other_candidate
0 1952 2.40 44.60 Stevenson Eisenhower
1 1956 2.89 57.76 Eisenhower Stevenson
2 1960 0.85 49.91 Nixon Kennedy
3 1964 4.21 61.34 Johnson Goldwater
4 1968 3.02 49.60 Humphrey Nixon

Step 1: create the pretend world

Code
a = 46.2
b = 3.1
sigma = 3.8
x = hibbs["growth"].to_numpy()
n = len(x)
pretend = pd.Series({"intercept": a, "slope": b, "sigma": sigma, "n": n})
pretend
intercept    46.2
slope         3.1
sigma         3.8
n            16.0
dtype: float64

Step 2: simulate one fake dataset

Code
y = a + b * x + rng.normal(0, sigma, size=n)
fake = pd.DataFrame({"x": x, "y": y})
fake.head()
x y
0 2.40 54.953220
1 2.89 58.281149
2 0.85 50.090661
3 4.21 54.299003
4 3.02 59.002352

Step 3: fit the model and check intervals

Code
fit = smf.ols("y ~ x", data=fake).fit()
fit.summary()
OLS Regression Results
Dep. Variable: y R-squared: 0.758
Model: OLS Adj. R-squared: 0.740
Method: Least Squares F-statistic: 43.79
Date: Sun, 31 May 2026 Prob (F-statistic): 1.16e-05
Time: 23:02:00 Log-Likelihood: -35.690
No. Observations: 16 AIC: 75.38
Df Residuals: 14 BIC: 76.93
Df Model: 1
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
Intercept 46.9749 1.037 45.278 0.000 44.750 49.200
x 2.9471 0.445 6.617 0.000 1.992 3.902
Omnibus: 2.138 Durbin-Watson: 2.423
Prob(Omnibus): 0.343 Jarque-Bera (JB): 1.558
Skew: -0.731 Prob(JB): 0.459
Kurtosis: 2.552 Cond. No. 4.54


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Code
def slope_interval(model, level):
    alpha = 1 - level
    lo, hi = model.conf_int(alpha=alpha).loc["x"]
    return float(lo), float(hi)

for level in [0.50, 0.90]:
    lo, hi = slope_interval(fit, level)
    print(f"{int(level * 100)}% interval: ({lo:.2f}, {hi:.2f}); covers true slope? {lo < b < hi}")
50% interval: (2.64, 3.26); covers true slope? True
90% interval: (2.16, 3.73); covers true slope? True

Step 4: repeat the fake-data experiment

Code
def one_coverage_sim(seed):
    local_rng = np.random.default_rng(seed)
    y_sim = a + b * x + local_rng.normal(0, sigma, size=n)
    model = smf.ols("y ~ x", data=pd.DataFrame({"x": x, "y": y_sim})).fit()

    covers = {}
    for level in [0.50, 0.68, 0.90, 0.95]:
        lo, hi = slope_interval(model, level)
        covers[f"ci_{int(level * 100)}"] = lo < b < hi

    slope_hat = model.params["x"]
    slope_se = model.bse["x"]
    covers["se_68"] = abs(b - slope_hat) < slope_se
    covers["se_95"] = abs(b - slope_hat) < 2 * slope_se
    return covers

n_fake = 1000
coverage = pd.DataFrame([one_coverage_sim(s) for s in range(1, n_fake + 1)])
coverage.mean()
ci_50    0.516
ci_68    0.703
ci_90    0.906
ci_95    0.947
se_68    0.691
se_95    0.939
dtype: float64
Code
coverage_summary = coverage.mean().rename({
    "ci_50": "50% interval",
    "ci_68": "68% interval",
    "ci_90": "90% interval",
    "ci_95": "95% interval",
    "se_68": "estimate +/- 1 se",
    "se_95": "estimate +/- 2 se",
})
coverage_summary.to_frame("empirical coverage")
empirical coverage
50% interval 0.516
68% interval 0.703
90% interval 0.906
95% interval 0.947
estimate +/- 1 se 0.691
estimate +/- 2 se 0.939

Because the fake data were generated from the same linear-normal model that is fit back to the simulations, empirical coverage should be close to the nominal interval levels, up to Monte Carlo error from using only 1000 fake datasets.