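import numpy as np
rng = np.random.default_rng()  # assumption: the source does not show how rng was created or seeded
x = rng.lognormal(mean=0, sigma=0.8, size=70)
T = np.median  # statistic of interest: the sample median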
T(x)
np.float64(0.83287393253026)
Bradley Efron (1979)
The bootstrap replaces sampling from the unknown population distribution with sampling from the empirical distribution \(\hat F_n\). For a statistic \(T=T(X_1,\ldots,X_n)\), draw
\[ X_1^*,\ldots,X_n^* \stackrel{iid}{\sim} \hat F_n \]
and approximate the sampling distribution of \(T\) by the conditional distribution of
\[ T^* = T(X_1^*,\ldots,X_n^*)\mid X_1,\ldots,X_n. \]
This turns many standard-error and confidence-interval problems into simulation problems.
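As a sketch of what "simulation problem" means here, suppose \(B\) bootstrap replicates \(T^*_1,\ldots,T^*_B\) have been drawn (\(B\), \(\bar T^*\), and \(q^*_p\) are notation introduced for this illustration, not taken from the source). The usual Monte Carlo approximation to the standard error is
\[ \widehat{\mathrm{se}}_{\mathrm{boot}} = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\bigl(T^*_b-\bar T^*\bigr)^2}, \qquad \bar T^* = \frac{1}{B}\sum_{b=1}^{B} T^*_b, \]
and a percentile interval at level \(1-\alpha\) takes the form
\[ \bigl[\, q^*_{\alpha/2},\; q^*_{1-\alpha/2} \,\bigr], \]
where \(q^*_p\) denotes the empirical \(p\)-quantile of \(T^*_1,\ldots,T^*_B\).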
Treat the observed array as the empirical distribution \(\hat F_n\).
Draw \(X_1^*,\ldots,X_n^* \sim \hat F_n\) and compute \(T^*\) many times.
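The two steps above can be sketched as follows. This is a minimal illustration, not the original cell: the seed and the replicate count `B` are assumptions, and the names `boot` and `ci` are chosen only to match the plotting code further down.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumption: seed chosen for illustration only
x = rng.lognormal(mean=0, sigma=0.8, size=70)
T = np.median

B = 5000  # assumption: number of bootstrap replicates
n = x.size

# Sampling from the empirical distribution means drawing n indices
# uniformly with replacement; each row of idx is one resample.
idx = rng.integers(0, n, size=(B, n))

# Apply the statistic to every resample to get B values of T*.
boot = T(x[idx], axis=1)

# 95% percentile interval: the 2.5% and 97.5% quantiles of the T* values.
ci = np.percentile(boot, [2.5, 97.5])
print(ci)
```

Because the seed and `B` are assumed, the printed interval will generally differ somewhat from the values reported by the original run.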
95% percentile interval (`ci`) from the original run:
array([0.69442265, 1.07560355])
Plot the conditional bootstrap distribution of \(T^*\).
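import matplotlib.pyplot as plt

# Assumes boot (array of bootstrap medians) and ci (lower, upper percentile bounds) already exist.
fig, ax = plt.subplots(figsize=(6, 3.5))
ax.hist(boot, bins=36, alpha=0.75, color="#45b3e7")  # bootstrap distribution of T*
ax.axvline(np.median(x), color="#ffcc66", lw=2.5, label="sample median")
# Dashed lines mark the interval endpoints; only one is labeled to avoid a duplicate legend entry.
ax.axvline(ci[0], color="#66ff99", ls="--")
ax.axvline(ci[1], color="#66ff99", ls="--", label="95% percentile CI")
ax.set(title=f"CI=({ci[0]:.2f}, {ci[1]:.2f})", xlabel="bootstrap median")
ax.legend()
plt.show()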