Sample size simulation

Source: SampleSize/simulation.Rmd

The original example uses stan_glm to show how the same noise level leads to different uncertainty for main effects and interactions under different predictor codings. This port uses ordinary least squares; with flat priors and Gaussian errors, the coefficient summaries closely match the large-sample posterior summaries from stan_glm.

Code

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(20260531)

Simulated outcome

Code

N = 1_000
sigma = 10
y = rng.normal(0, sigma, size=N)

A small helper fits the main-effect model and the interaction model for a chosen two-point predictor coding.

Code

def simulate_and_fit(levels, label):
    x1 = rng.choice(levels, size=N)
    x2 = rng.choice(levels, size=N)
    fake = pd.DataFrame({"y": y, "x1": x1, "x2": x2})
    fit_main = smf.ols("y ~ x1", data=fake).fit()
    fit_interaction = smf.ols("y ~ x1 * x2", data=fake).fit()
    rows = []
    for model_name, fit in [("main", fit_main), ("interaction", fit_interaction)]:
        table = pd.DataFrame({
            "coding": label,
            "model": model_name,
            "term": fit.params.index,
            "estimate": fit.params.values,
            "std_error": fit.bse.values,
        })
        rows.append(table)
    return fake, fit_main, fit_interaction, pd.concat(rows, ignore_index=True)

Predictor range (-0.5, 0.5)

With centered half-unit coding, the coefficient of x1 is the expected difference between the high and low settings of x1. The interaction is also on a compact scale.

Code

fake_half, fit_1a, fit_1b, tab_half = simulate_and_fit([-0.5, 0.5], "-0.5/0.5")
tab_half

	coding	model	term	estimate	std_error
0	-0.5/0.5	main	Intercept	-0.619858	0.312006
1	-0.5/0.5	main	x1	-0.941152	0.624011
2	-0.5/0.5	interaction	Intercept	-0.650228	0.312731
3	-0.5/0.5	interaction	x1	-1.001892	0.625462
4	-0.5/0.5	interaction	x2	-0.235538	0.625462
5	-0.5/0.5	interaction	x1:x2	2.548815	1.250924

Predictor range (0, 1)

The source Rmd redraws x1 and x2 here but accidentally keeps the original fake data frame. The code below uses the intended regenerated predictors.

Code

fake_zero_one, fit_2a, fit_2b, tab_zero_one = simulate_and_fit([0, 1], "0/1")
tab_zero_one

	coding	model	term	estimate	std_error
0	0/1	main	Intercept	-1.547944	0.450412
1	0/1	main	x1	1.781741	0.622816
2	0/1	interaction	Intercept	-1.599636	0.632548
3	0/1	interaction	x1	2.324774	0.871822
4	0/1	interaction	x2	0.104923	0.901195
5	0/1	interaction	x1:x2	-1.116622	1.246324

Predictor range (-1, 1)

Doubling the predictor range halves the standard error for the main-effect contrast. For the interaction term, changing from 0/1 to -1/1 also changes the meaning of the coefficient because the product column has a different scale and correlation pattern with the main effects.

Code

fake_pm_one, fit_3a, fit_3b, tab_pm_one = simulate_and_fit([-1, 1], "-1/1")
tab_pm_one

	coding	model	term	estimate	std_error
0	-1/1	main	Intercept	-0.618180	0.312252
1	-1/1	main	x1	-0.260882	0.312252
2	-1/1	interaction	Intercept	-0.624835	0.312369
3	-1/1	interaction	x1	-0.258752	0.312369
4	-1/1	interaction	x2	-0.157598	0.312369
5	-1/1	interaction	x1:x2	-0.403571	0.312369

Comparing codings

Code

comparison = pd.concat([tab_half, tab_zero_one, tab_pm_one], ignore_index=True)
comparison.pivot_table(
    index=["coding", "model", "term"],
    values=["estimate", "std_error"],
).round(3)

			estimate	std_error
coding	model	term
-0.5/0.5	interaction	Intercept	-0.650	0.313
		x1	-1.002	0.625
		x1:x2	2.549	1.251
		x2	-0.236	0.625
	main	Intercept	-0.620	0.312
	main	x1	-0.941	0.624
-1/1	interaction	Intercept	-0.625	0.312
		x1	-0.259	0.312
		x1:x2	-0.404	0.312
		x2	-0.158	0.312
	main	Intercept	-0.618	0.312
	main	x1	-0.261	0.312
0/1	interaction	Intercept	-1.600	0.633
		x1	2.325	0.872
		x1:x2	-1.117	1.246
		x2	0.105	0.901
	main	Intercept	-1.548	0.450
	main	x1	1.782	0.623

Code

key_terms = comparison.query("term in ['x1', 'x1:x2']").copy()
key_terms.assign(abs_t=lambda d: (d["estimate"] / d["std_error"]).abs())[
    ["coding", "model", "term", "estimate", "std_error", "abs_t"]
].round(3)

	coding	model	term	estimate	std_error	abs_t
1	-0.5/0.5	main	x1	-0.941	0.624	1.508
3	-0.5/0.5	interaction	x1	-1.002	0.625	1.602
5	-0.5/0.5	interaction	x1:x2	2.549	1.251	2.038
7	0/1	main	x1	1.782	0.623	2.861
9	0/1	interaction	x1	2.325	0.872	2.667
11	0/1	interaction	x1:x2	-1.117	1.246	0.896
13	-1/1	main	x1	-0.261	0.312	0.835
15	-1/1	interaction	x1	-0.259	0.312	0.828
17	-1/1	interaction	x1:x2	-0.404	0.312	1.292

Even with the same sample size and outcome noise, the precision of regression coefficients depends on the scale of the predictors. Centered codings are especially useful for interactions because the lower-order terms remain interpretable as average effects near the center of the design.

# Sample size simulation Source: `SampleSize/simulation.Rmd` The original example uses `stan_glm` to show how the same noise level leads to different uncertainty for main effects and interactions under different predictor codings. This port uses ordinary least squares; with flat priors and Gaussian errors, the coefficient summaries closely match the large-sample posterior summaries from `stan_glm`. ```{python} import numpy as np import pandas as pd import statsmodels.formula.api as smf rng = np.random.default_rng(20260531) ``` ## Simulated outcome ```{python} N = 1_000 sigma = 10 y = rng.normal(0, sigma, size=N) ``` A small helper fits the main-effect model and the interaction model for a chosen two-point predictor coding. ```{python} def simulate_and_fit(levels, label): x1 = rng.choice(levels, size=N) x2 = rng.choice(levels, size=N) fake = pd.DataFrame({"y": y, "x1": x1, "x2": x2}) fit_main = smf.ols("y ~ x1", data=fake).fit() fit_interaction = smf.ols("y ~ x1 * x2", data=fake).fit() rows = [] for model_name, fit in [("main", fit_main), ("interaction", fit_interaction)]: table = pd.DataFrame({ "coding": label, "model": model_name, "term": fit.params.index, "estimate": fit.params.values, "std_error": fit.bse.values, }) rows.append(table) return fake, fit_main, fit_interaction, pd.concat(rows, ignore_index=True) ``` ## Predictor range (-0.5, 0.5) With centered half-unit coding, the coefficient of `x1` is the expected difference between the high and low settings of `x1`. The interaction is also on a compact scale. ```{python} fake_half, fit_1a, fit_1b, tab_half = simulate_and_fit([-0.5, 0.5], "-0.5/0.5") tab_half ``` ## Predictor range (0, 1) The source Rmd redraws `x1` and `x2` here but accidentally keeps the original `fake` data frame. The code below uses the intended regenerated predictors. ```{python} fake_zero_one, fit_2a, fit_2b, tab_zero_one = simulate_and_fit([0, 1], "0/1") tab_zero_one ``` ## Predictor range (-1, 1) Doubling the predictor range halves the standard error for the main-effect contrast. For the interaction term, changing from `0/1` to `-1/1` also changes the meaning of the coefficient because the product column has a different scale and correlation pattern with the main effects. ```{python} fake_pm_one, fit_3a, fit_3b, tab_pm_one = simulate_and_fit([-1, 1], "-1/1") tab_pm_one ``` ## Comparing codings ```{python} comparison = pd.concat([tab_half, tab_zero_one, tab_pm_one], ignore_index=True) comparison.pivot_table( index=["coding", "model", "term"], values=["estimate", "std_error"], ).round(3) ``` ```{python} key_terms = comparison.query("term in ['x1', 'x1:x2']").copy() key_terms.assign(abs_t=lambda d: (d["estimate"] / d["std_error"]).abs())[ ["coding", "model", "term", "estimate", "std_error", "abs_t"] ].round(3) ``` Even with the same sample size and outcome noise, the precision of regression coefficients depends on the scale of the predictors. Centered codings are especially useful for interactions because the lower-order terms remain interpretable as average effects near the center of the design.