Sample size simulation

Source: SampleSize/simulation.Rmd

The original example uses stan_glm to show how the same noise level leads to different uncertainty for main effects and interactions under different predictor codings. This port uses ordinary least squares; with flat priors and Gaussian errors, the coefficient summaries closely match the large-sample posterior summaries from stan_glm.

Code
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(20260531)

Simulated outcome

Code
N = 1_000
sigma = 10
y = rng.normal(0, sigma, size=N)

A small helper fits the main-effect model and the interaction model for a chosen two-point predictor coding.

Code
def simulate_and_fit(levels, label):
    x1 = rng.choice(levels, size=N)
    x2 = rng.choice(levels, size=N)
    fake = pd.DataFrame({"y": y, "x1": x1, "x2": x2})
    fit_main = smf.ols("y ~ x1", data=fake).fit()
    fit_interaction = smf.ols("y ~ x1 * x2", data=fake).fit()
    rows = []
    for model_name, fit in [("main", fit_main), ("interaction", fit_interaction)]:
        table = pd.DataFrame({
            "coding": label,
            "model": model_name,
            "term": fit.params.index,
            "estimate": fit.params.values,
            "std_error": fit.bse.values,
        })
        rows.append(table)
    return fake, fit_main, fit_interaction, pd.concat(rows, ignore_index=True)

Predictor range (-0.5, 0.5)

With centered half-unit coding, the coefficient of x1 is the expected difference between the high and low settings of x1. The interaction is also on a compact scale.

Code
fake_half, fit_1a, fit_1b, tab_half = simulate_and_fit([-0.5, 0.5], "-0.5/0.5")
tab_half
coding model term estimate std_error
0 -0.5/0.5 main Intercept -0.619858 0.312006
1 -0.5/0.5 main x1 -0.941152 0.624011
2 -0.5/0.5 interaction Intercept -0.650228 0.312731
3 -0.5/0.5 interaction x1 -1.001892 0.625462
4 -0.5/0.5 interaction x2 -0.235538 0.625462
5 -0.5/0.5 interaction x1:x2 2.548815 1.250924

Predictor range (0, 1)

The source Rmd redraws x1 and x2 here but accidentally keeps the original fake data frame. The code below uses the intended regenerated predictors.

Code
fake_zero_one, fit_2a, fit_2b, tab_zero_one = simulate_and_fit([0, 1], "0/1")
tab_zero_one
coding model term estimate std_error
0 0/1 main Intercept -1.547944 0.450412
1 0/1 main x1 1.781741 0.622816
2 0/1 interaction Intercept -1.599636 0.632548
3 0/1 interaction x1 2.324774 0.871822
4 0/1 interaction x2 0.104923 0.901195
5 0/1 interaction x1:x2 -1.116622 1.246324

Predictor range (-1, 1)

Doubling the predictor range halves the standard error for the main-effect contrast. For the interaction term, changing from 0/1 to -1/1 also changes the meaning of the coefficient because the product column has a different scale and correlation pattern with the main effects.

Code
fake_pm_one, fit_3a, fit_3b, tab_pm_one = simulate_and_fit([-1, 1], "-1/1")
tab_pm_one
coding model term estimate std_error
0 -1/1 main Intercept -0.618180 0.312252
1 -1/1 main x1 -0.260882 0.312252
2 -1/1 interaction Intercept -0.624835 0.312369
3 -1/1 interaction x1 -0.258752 0.312369
4 -1/1 interaction x2 -0.157598 0.312369
5 -1/1 interaction x1:x2 -0.403571 0.312369

Comparing codings

Code
comparison = pd.concat([tab_half, tab_zero_one, tab_pm_one], ignore_index=True)
comparison.pivot_table(
    index=["coding", "model", "term"],
    values=["estimate", "std_error"],
).round(3)
estimate std_error
coding model term
-0.5/0.5 interaction Intercept -0.650 0.313
x1 -1.002 0.625
x1:x2 2.549 1.251
x2 -0.236 0.625
main Intercept -0.620 0.312
x1 -0.941 0.624
-1/1 interaction Intercept -0.625 0.312
x1 -0.259 0.312
x1:x2 -0.404 0.312
x2 -0.158 0.312
main Intercept -0.618 0.312
x1 -0.261 0.312
0/1 interaction Intercept -1.600 0.633
x1 2.325 0.872
x1:x2 -1.117 1.246
x2 0.105 0.901
main Intercept -1.548 0.450
x1 1.782 0.623
Code
key_terms = comparison.query("term in ['x1', 'x1:x2']").copy()
key_terms.assign(abs_t=lambda d: (d["estimate"] / d["std_error"]).abs())[
    ["coding", "model", "term", "estimate", "std_error", "abs_t"]
].round(3)
coding model term estimate std_error abs_t
1 -0.5/0.5 main x1 -0.941 0.624 1.508
3 -0.5/0.5 interaction x1 -1.002 0.625 1.602
5 -0.5/0.5 interaction x1:x2 2.549 1.251 2.038
7 0/1 main x1 1.782 0.623 2.861
9 0/1 interaction x1 2.325 0.872 2.667
11 0/1 interaction x1:x2 -1.117 1.246 0.896
13 -1/1 main x1 -0.261 0.312 0.835
15 -1/1 interaction x1 -0.259 0.312 0.828
17 -1/1 interaction x1:x2 -0.404 0.312 1.292

Even with the same sample size and outcome noise, the precision of regression coefficients depends on the scale of the predictors. Centered codings are especially useful for interactions because the lower-order terms remain interpretable as average effects near the center of the design.