Electric Company / treatment effects

Source: ElectricCompany/electric.Rmd

The electric-company example studies paired classes with pre-test and post-test scores. Each row in the wide data contains a treated class and a matched control class. The original R page uses rstanarm::stan_glm; here the core workflow is reproduced with pandas, statsmodels, and plotting code, using a lightweight Bayesian linear-regression helper where the page needs posterior draws.

Setup and wide data

Code
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
from python.bayes_glm import bayes_lm


def ros_root():
    candidates = [
        Path("../../ROS-Examples"),
        Path("../ROS-Examples"),
        Path("/Users/alal/tmp/ros-python-book/ROS-Examples"),
    ]
    for candidate in candidates:
        if candidate.exists():
            return candidate
    return candidates[0]

root = ros_root()
electric_wide = pd.read_table(root / "ElectricCompany/data/electric_wide.txt", sep=r"\s+")
electric_wide.head()
city grade treated_pretest treated_posttest control_pretest control_posttest supplement
0 Fresno 1 13.8 48.9 12.3 52.3 Supplement
1 Fresno 1 16.5 70.5 14.4 55.0 Replace
2 Fresno 1 18.5 89.7 17.7 80.4 Supplement
3 Fresno 1 8.8 44.2 11.5 47.0 Replace
4 Fresno 1 15.3 77.5 16.4 69.7 Supplement
Code
score_cols = ["treated_pretest", "treated_posttest", "control_pretest", "control_posttest"]
electric_wide.groupby("grade")[score_cols].agg(["mean", "std", "count"])
treated_pretest treated_posttest control_pretest control_posttest
mean std count mean std count mean std count mean std count
grade
1 15.528571 2.736082 21 77.090476 16.412949 21 15.623810 2.087320 21 68.790476 13.389657 21
2 75.897059 10.821905 34 101.570588 10.242263 34 70.708824 13.607871 34 93.211765 11.931490 34
3 93.500000 10.764073 20 106.510000 7.993412 20 95.800000 9.763843 20 106.175000 6.832729 20
4 107.300000 6.224227 21 114.066667 4.204204 21 104.238095 10.213984 21 110.357143 7.290855 21

Histograms of post-test scores by grade

Code
fig, axes = plt.subplots(4, 2, figsize=(8, 8), sharex=True, sharey=True)
bins = np.arange(40, 130, 5)
for row, grade in enumerate(range(1, 5)):
    d = electric_wide[electric_wide["grade"] == grade]
    for col, arm, title in [(0, "control_posttest", "Control"), (1, "treated_posttest", "Treated")]:
        ax = axes[row, col]
        ax.hist(d[arm], bins=bins, color="0.75", edgecolor="white")
        ax.axvline(d[arm].mean(), color="black", linewidth=1.5)
        ax.text(0.03, 0.86, f"mean = {d[arm].mean():.0f}\nsd = {d[arm].std():.0f}", transform=ax.transAxes, va="top")
        ax.set_title(f"Grade {grade}: {title}")
        if row == 3:
            ax.set_xlabel("Post-test score")
        if col == 0:
            ax.set_ylabel("Classes")
fig.tight_layout()

Convert to long paired-class data

Code
treated = electric_wide[["city", "grade", "treated_pretest", "treated_posttest", "supplement"]].rename(
    columns={"treated_pretest": "pre_test", "treated_posttest": "post_test"}
)
treated["treatment"] = 1
treated["supp"] = (treated["supplement"] == "Supplement").astype(float)

control = electric_wide[["city", "grade", "control_pretest", "control_posttest"]].rename(
    columns={"control_pretest": "pre_test", "control_posttest": "post_test"}
)
control["treatment"] = 0
control["supplement"] = pd.NA
control["supp"] = np.nan

electric = pd.concat([treated, control], ignore_index=True)
electric["pair_id"] = np.tile(np.arange(len(electric_wide)), 2)
electric.head()
city grade pre_test post_test supplement treatment supp pair_id
0 Fresno 1 13.8 48.9 Supplement 1 1.0 0
1 Fresno 1 16.5 70.5 Replace 1 0.0 1
2 Fresno 1 18.5 89.7 Supplement 1 1.0 2
3 Fresno 1 8.8 44.2 Replace 1 0.0 3
4 Fresno 1 15.3 77.5 Supplement 1 1.0 4

Scatterplots and parallel regression lines

For each grade, the R page overlays regression lines for treated and control classes. The first model uses a common pre-test slope and a treatment intercept shift.

Code
fig, axes = plt.subplots(1, 4, figsize=(13, 3.5), sharex=True, sharey=True)
for ax, grade in zip(axes, range(1, 5)):
    d = electric[electric["grade"] == grade]
    fit = smf.ols("post_test ~ pre_test + treatment", data=d).fit()
    xs = np.linspace(d["pre_test"].min(), d["pre_test"].max(), 100)
    control_line = fit.params["Intercept"] + fit.params["pre_test"] * xs
    treated_line = control_line + fit.params["treatment"]
    ax.scatter(d.loc[d.treatment == 0, "pre_test"], d.loc[d.treatment == 0, "post_test"], color="black", s=25, label="control")
    ax.scatter(d.loc[d.treatment == 1, "pre_test"], d.loc[d.treatment == 1, "post_test"], facecolors="white", edgecolors="black", s=30, label="treated")
    ax.plot(xs, control_line, color="black", linestyle="--", linewidth=1)
    ax.plot(xs, treated_line, color="black", linewidth=1)
    ax.set_title(f"Grade {grade}")
    ax.set_xlabel("pre-test")
axes[0].set_ylabel("post-test")
axes[0].legend(frameon=False, fontsize=8)
fig.tight_layout()

Treatment-by-pretest interaction in grade 4

Code
grade4 = electric[electric["grade"] == 4].copy()
fit_g4 = smf.ols("post_test ~ treatment * pre_test", data=grade4).fit()
fit_g4_bayes = bayes_lm("post_test ~ treatment * pre_test", data=grade4, draws=1000, prior_scale=10.0, seed=4020)
fit_g4.summary()
OLS Regression Results
Dep. Variable: post_test R-squared: 0.889
Model: OLS Adj. R-squared: 0.880
Method: Least Squares F-statistic: 101.2
Date: Sun, 31 May 2026 Prob (F-statistic): 3.61e-18
Time: 23:02:07 Log-Likelihood: -89.409
No. Observations: 42 AIC: 186.8
Df Residuals: 38 BIC: 193.8
Df Model: 3
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
Intercept 37.8402 4.901 7.721 0.000 27.918 47.762
treatment 17.3722 9.600 1.810 0.078 -2.062 36.807
pre_test 0.6957 0.047 14.863 0.000 0.601 0.790
treatment:pre_test -0.1472 0.090 -1.636 0.110 -0.329 0.035
Omnibus: 5.305 Durbin-Watson: 2.023
Prob(Omnibus): 0.070 Jarque-Bera (JB): 6.528
Skew: 0.116 Prob(JB): 0.0382
Kurtosis: 4.917 Cond. No. 3.70e+03


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 3.7e+03. This might indicate that there are
strong multicollinearity or other numerical problems.

The treatment effect at a given pre-test score is treatment + treatment:pre_test * pre_test.

Code
rng = np.random.default_rng(4020)
beta_draws = fit_g4_bayes.beta_draws
param_names = fit_g4_bayes.coef_names
it = param_names.index("treatment")
ii = param_names.index("treatment:pre_test")
xs = np.linspace(grade4["pre_test"].min(), grade4["pre_test"].max(), 100)
effect_draws = beta_draws[:, it, None] + beta_draws[:, ii, None] * xs
mean_effect_by_draw = beta_draws[:, it, None] + beta_draws[:, ii, None] * grade4["pre_test"].to_numpy()

fig, ax = plt.subplots(figsize=(6, 3.5))
ax.axhline(0, color="black", linestyle="--", linewidth=0.8)
for draw in effect_draws[:40]:
    ax.plot(xs, draw, color="0.75", linewidth=0.8, alpha=0.6)
ax.plot(xs, fit_g4.params["treatment"] + fit_g4.params["treatment:pre_test"] * xs, color="black", linewidth=2)
ax.set_xlabel("pre-test")
ax.set_ylabel("treatment effect")
ax.set_title("Grade 4 treatment effect by pre-test")
Text(0.5, 1.0, 'Grade 4 treatment effect by pre-test')

Code
pd.Series(mean_effect_by_draw.mean(axis=1), name="mean_grade4_effect").describe()
count    1000.000000
mean        1.745794
std         0.769802
min        -0.986987
25%         1.253102
50%         1.711963
75%         2.234293
max         4.687675
Name: mean_grade4_effect, dtype: float64

Grade-specific treatment estimates

Code
rows = []
for grade in range(1, 5):
    d = electric[electric["grade"] == grade]
    simple = smf.ols("post_test ~ treatment", data=d).fit()
    adjusted = smf.ols("post_test ~ treatment + pre_test", data=d).fit()
    rows.append({
        "grade": grade,
        "simple_est": simple.params["treatment"],
        "simple_se": simple.bse["treatment"],
        "adjusted_est": adjusted.params["treatment"],
        "adjusted_se": adjusted.bse["treatment"],
    })
estimates = pd.DataFrame(rows)
estimates
grade simple_est simple_se adjusted_est adjusted_se
0 1 8.300000 4.622243 8.786518 2.611757
1 2 8.358824 2.696754 4.265765 1.358955
2 3 0.335000 2.351391 1.909726 0.776162
3 4 3.709524 1.836559 1.701436 0.685347
Code
fig, ax = plt.subplots(figsize=(7, 3.8))
ypos = np.arange(len(estimates))
for offset, est_col, se_col, label, marker in [
    (-0.12, "simple_est", "simple_se", "Treatment only", "o"),
    (0.12, "adjusted_est", "adjusted_se", "Adjusted for pre-test", "s"),
]:
    ax.errorbar(estimates[est_col], ypos + offset, xerr=2 * estimates[se_col], fmt=marker, color="black", capsize=0, label=label)
ax.axvline(0, color="0.5", linestyle="--", linewidth=1)
ax.set_yticks(ypos)
ax.set_yticklabels([f"Grade {g}" for g in estimates["grade"]])
ax.set_xlabel("Estimated treatment effect on post-test score (± 2 SE)")
ax.legend(frameon=False)

Supplement versus replacement among treated classes

The treated arm has two variants: supplement and replacement. The R page first models supplement assignment as a function of pre-test score, then estimates the supplement effect on post-test score controlling for pre-test.

Code
def invlogit(x):
    return 1 / (1 + np.exp(-x))

supp_rows = []
fig, axes = plt.subplots(1, 4, figsize=(13, 3.2), sharey=True)
for ax, grade in zip(axes, range(1, 5)):
    d = electric[(electric["grade"] == grade) & electric["supp"].notna()].copy()
    logit_fit = smf.logit("supp ~ pre_test", data=d).fit(disp=False)
    outcome_fit = smf.ols("post_test ~ supp + pre_test", data=d).fit()
    supp_rows.append({
        "grade": grade,
        "supp_assignment_slope": logit_fit.params["pre_test"],
        "supp_effect_est": outcome_fit.params["supp"],
        "supp_effect_se": outcome_fit.bse["supp"],
    })
    xs = np.linspace(d["pre_test"].min(), d["pre_test"].max(), 100)
    probs = invlogit(logit_fit.params["Intercept"] + logit_fit.params["pre_test"] * xs)
    jitter = rng.uniform(-0.035, 0.035, size=len(d))
    ax.scatter(d["pre_test"], d["supp"] + jitter, facecolors=np.where(d["supp"] == 1, "white", "black"), edgecolors="black")
    ax.plot(xs, probs, color="black")
    ax.set_title(f"Grade {grade}")
    ax.set_xlabel("Pre-test score")
axes[0].set_yticks([0, 1])
axes[0].set_yticklabels(["Replace", "Supplement"])
fig.tight_layout()

supp_estimates = pd.DataFrame(supp_rows)
supp_estimates
grade supp_assignment_slope supp_effect_est supp_effect_se
0 1 0.154021 5.923001 4.444132
1 2 0.009163 4.686893 1.834017
2 3 -0.071659 -1.297883 1.513067
3 4 0.083116 -0.323840 1.171118

Code
fig, ax = plt.subplots(figsize=(6, 3.5))
ypos = np.arange(len(supp_estimates))
ax.errorbar(supp_estimates["supp_effect_est"], ypos, xerr=2 * supp_estimates["supp_effect_se"], fmt="o", color="black")
ax.axvline(0, color="0.5", linestyle="--", linewidth=1)
ax.set_yticks(ypos)
ax.set_yticklabels([f"Grade {g}" for g in supp_estimates["grade"]])
ax.set_xlabel("Estimated supplement effect vs replacement (± 2 SE)")
Text(0.5, 0, 'Estimated supplement effect vs replacement (± 2 SE)')

CmdStanPy note

Code
from cmdstanpy import CmdStanModel

# The rstanarm calls on the source page are Gaussian linear regressions and one
# logistic regression.  CmdStanPy is most useful here if you want shared priors
# or a hierarchical model that partially pools treatment effects across grades.