Newcomb / posterior predictive checking

Source: Newcomb/newcomb.Rmd

Simon Newcomb’s speed-of-light measurements are recorded as deviations from 24,800 nanoseconds. The book fits a normal model with only an intercept and then checks whether replicated normal data look like the observed data. The interesting feature is the extreme low observation: a normal model centered on the bulk of the data struggles to reproduce it.

Setup and data

Code
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
import statsmodels.formula.api as smf


def ros_root():
    candidates = [
        Path("../../ROS-Examples"),
        Path("../ROS-Examples"),
        Path("/Users/alal/tmp/ros-python-book/ROS-Examples"),
    ]
    for candidate in candidates:
        if candidate.exists():
            return candidate
    return candidates[0]

root = ros_root()
newcomb = pd.read_table(root / "Newcomb/data/newcomb.txt", sep=r"\s+")
y = newcomb["y"].to_numpy()
newcomb.head()
y
0 28
1 26
2 33
3 24
4 34
Code
pd.Series(y).describe()
count    66.000000
mean     26.212121
std      10.745325
min     -44.000000
25%      24.000000
50%      27.000000
75%      30.750000
max      40.000000
dtype: float64

Histogram of the measurements

Code
fig, ax = plt.subplots(figsize=(5, 3.5))
ax.hist(y, bins=30, color="0.75", edgecolor="white")
ax.set_xlabel("Deviation from 24,800 nanoseconds")
ax.set_yticks([])
ax.set_title("Newcomb's measurements")
Text(0.5, 1.0, "Newcomb's measurements")

Fit the intercept-only normal model

stan_glm(y ~ 1) is a Gaussian model with an unknown mean and residual scale. The least-squares estimate is simply the sample mean; for posterior predictive checks we draw from the usual conjugate approximation with a noninformative prior.

Code
fit = smf.ols("y ~ 1", data=newcomb).fit()
fit.summary()
OLS Regression Results
Dep. Variable: y R-squared: 0.000
Model: OLS Adj. R-squared: 0.000
Method: Least Squares F-statistic: nan
Date: Sun, 31 May 2026 Prob (F-statistic): nan
Time: 23:04:23 Log-Likelihood: -249.86
No. Observations: 66 AIC: 501.7
Df Residuals: 65 BIC: 503.9
Df Model: 0
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
Intercept 26.2121 1.323 19.818 0.000 23.571 28.854
Omnibus: 98.931 Durbin-Watson: 2.310
Prob(Omnibus): 0.000 Jarque-Bera (JB): 2139.175
Skew: -4.493 Prob(JB): 0.00
Kurtosis: 29.403 Cond. No. 1.00


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Code
n = len(y)
ybar = y.mean()
s2 = y.var(ddof=1)
rng = np.random.default_rng(202611)
n_sims = 4000

sigma2_draws = (n - 1) * s2 / rng.chisquare(df=n - 1, size=n_sims)
mu_draws = rng.normal(ybar, np.sqrt(sigma2_draws / n))
y_rep = rng.normal(mu_draws[:, None], np.sqrt(sigma2_draws)[:, None], size=(n_sims, n))

pd.DataFrame({
    "posterior_mean_mu": [mu_draws.mean()],
    "posterior_sd_mu": [mu_draws.std()],
    "posterior_mean_sigma": [np.sqrt(sigma2_draws).mean()],
})
posterior_mean_mu posterior_sd_mu posterior_mean_sigma
0 26.227695 1.360741 10.889986

Histograms of replicated datasets

Code
fig, axes = plt.subplots(4, 5, figsize=(10, 6), sharex=True, sharey=True)
for ax, s in zip(axes.ravel(), rng.choice(n_sims, size=20, replace=False)):
    ax.hist(y_rep[s], bins=15, color="0.8", edgecolor="white")
    ax.set_yticks([])
fig.suptitle("Twenty replicated datasets from the normal model")
fig.tight_layout()

Data histogram against predictive replications

Code
bins = np.arange(min(y.min(), y_rep[:19].min()) - 2, max(y.max(), y_rep[:19].max()) + 4, 8)
fig, axes = plt.subplots(4, 5, figsize=(10, 6), sharex=True, sharey=True)
axes = axes.ravel()
axes[0].hist(y, bins=bins, color="black", alpha=0.85)
axes[0].set_title("observed")
for i, ax in enumerate(axes[1:], start=1):
    ax.hist(y_rep[i - 1], bins=bins, color="0.75", edgecolor="white")
    ax.set_title(f"rep {i}", fontsize=8)
for ax in axes:
    ax.set_yticks([])
fig.tight_layout()

Density overlay

Code
xs = np.linspace(min(y.min(), y_rep[:100].min()) - 5, max(y.max(), y_rep[:100].max()) + 5, 400)
fig, ax = plt.subplots(figsize=(6, 3))
for rep in y_rep[:100]:
    kde = stats.gaussian_kde(rep)
    ax.plot(xs, kde(xs), color="0.75", linewidth=0.6, alpha=0.35)
obs_kde = stats.gaussian_kde(y)
ax.plot(xs, obs_kde(xs), color="black", linewidth=2, label="observed")
ax.set_yticks([])
ax.set_xlabel("Deviation from 24,800 nanoseconds")
ax.set_title("Observed density over replicated densities")
ax.legend(frameon=False)

Test statistic: the minimum

The book’s posterior predictive check focuses on min(y). If the observed minimum lies far in the tail of replicated minima, the normal model misses an important feature of the data.

Code
test_rep = y_rep.min(axis=1)
test_obs = y.min()
pp_value = np.mean(test_rep <= test_obs)
pd.DataFrame({"observed_min": [test_obs], "posterior_predictive_p_value": [pp_value]})
observed_min posterior_predictive_p_value
0 -44 0.0
Code
fig, ax = plt.subplots(figsize=(5.5, 3.5))
ax.hist(test_rep, bins=20, range=(-50, 20), color="0.75", edgecolor="white")
ax.axvline(test_obs, color="black", linewidth=3, label=f"observed min = {test_obs:g}")
ax.set_xlabel("Minimum of replicated dataset")
ax.set_ylabel("Frequency")
ax.legend(frameon=False)

CmdStanPy note

Code
from cmdstanpy import CmdStanModel

# A direct Stan implementation is a one-parameter-location normal model plus
# generated quantities for y_rep and min(y_rep).  The conjugate simulation above
# is enough for the posterior predictive lesson in this example.