Logistic regression graphs

Source: LogitGraphs/logitgraphs.Rmd

This example shows several ways to display logistic regression: raw binary data with fitted curves, binned averages, and discrimination lines for a model with two predictors. The original page uses stan_glm; here statsmodels fits the same logistic likelihood by maximum likelihood.

Setup

Code

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
from scipy.special import expit, logit

rng = np.random.default_rng(14014)

Simulate fake data from a one-predictor logit model

Code

n = 50
a = 2
b = 3
x_mean = -a / b
x_sd = 4 / b
x = rng.normal(x_mean, x_sd, size=n)
y = rng.binomial(1, expit(a + b * x))
fake_1 = pd.DataFrame({"x": x, "y": y})
fake_1.head()

	x	y
0	-1.457298	0
1	-1.840753	0
2	-1.216585	0
3	2.052998	1
4	-0.063965	0

Fit the model

Code

fit_1 = smf.logit("y ~ x", data=fake_1).fit(disp=False)
a_hat = fit_1.params["Intercept"]
b_hat = fit_1.params["x"]
fit_1.summary()

Logit Regression Results
Dep. Variable:	y	No. Observations:	50
Model:	Logit	Df Residuals:	48
Method:	MLE	Df Model:	1
Date:	Sun, 31 May 2026	Pseudo R-squ.:	0.5068
Time:	23:03:37	Log-Likelihood:	-17.074
converged:	True	LL-Null:	-34.617
Covariance Type:	nonrobust	LLR p-value:	3.153e-09

	coef	std err	z	P>\|z\|	[0.025	0.975]
Intercept	1.8147	0.637	2.847	0.004	0.566	3.064
x	2.8411	0.772	3.679	0.000	1.327	4.355

Raw data, true curve, and fitted curve

Code

def shifted(values, delta=0.008):
    values = np.asarray(values)
    return np.where(values == 0, delta, np.where(values == 1, 1 - delta, values))

x_grid = np.linspace(x.min() - 0.5, x.max() + 0.5, 300)

fig, ax = plt.subplots(figsize=(7, 4.5))
ax.scatter(x, shifted(y), color="black", s=25)
ax.plot(x_grid, expit(a + b * x_grid), color="0.25", label="true curve")
ax.plot(x_grid, expit(a_hat + b_hat * x_grid), color="0.25", linestyle="--", label="fitted curve")
ax.set_ylim(0, 1)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend(frameon=False)

The vertical shift keeps the binary observations visible without pretending that the outcomes are continuous.

Binned averages

Code

K = 5
fake_1["bin"] = pd.cut(fake_1["x"], bins=K)
binned = fake_1.groupby("bin", observed=True).agg(x_bar=("x", "mean"), y_bar=("y", "mean"), n=("y", "size"))
binned

	x_bar	y_bar	n
bin
(-3.191, -1.991]	-2.567395	0.0000	6
(-1.991, -0.798]	-1.275010	0.1875	16
(-0.798, 0.396]	-0.142363	0.7500	20
(0.396, 1.59]	0.648396	1.0000	5
(1.59, 2.783]	2.257293	1.0000	3

Code

fig, ax = plt.subplots(figsize=(7, 4.5))
ax.scatter(fake_1["x"], shifted(fake_1["y"]), color="0.55", s=22, label="data")
ax.scatter(binned["x_bar"], shifted(binned["y_bar"], delta=0.02), s=90, facecolor="white", edgecolor="black", label="bin averages")
ax.plot(x_grid, expit(a_hat + b_hat * x_grid), color="black", linewidth=1)
ax.set_ylim(0, 1)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Data and binned averages")
ax.legend(frameon=False)

Binning is only a display device: the model is still fit to the individual binary observations.

Logistic regression with two predictors

Code

n = 100
beta = np.array([2, 3, 4])
x1 = rng.normal(0, 0.4, size=n)
x2 = rng.normal(-0.5, 0.4, size=n)
linpred = beta[0] + beta[1] * x1 + beta[2] * x2
y = rng.binomial(1, expit(linpred))
fake_2 = pd.DataFrame({"x1": x1, "x2": x2, "y": y})
fake_2.head()

	x1	x2	y
0	-0.281543	-0.478992	0
1	0.005012	-0.727271	0
2	-0.845535	-0.299932	1
3	-0.023915	-0.021838	1
4	-0.134852	0.454685	1

Code

fit_2 = smf.logit("y ~ x1 + x2", data=fake_2).fit(disp=False)
beta_hat = fit_2.params[["Intercept", "x1", "x2"]].to_numpy()
fit_2.summary()

Logit Regression Results
Dep. Variable:	y	No. Observations:	100
Model:	Logit	Df Residuals:	97
Method:	MLE	Df Model:	2
Date:	Sun, 31 May 2026	Pseudo R-squ.:	0.5280
Time:	23:03:37	Log-Likelihood:	-32.707
converged:	True	LL-Null:	-69.295
Covariance Type:	nonrobust	LLR p-value:	1.288e-16

	coef	std err	z	P>\|z\|	[0.025	0.975]
Intercept	3.6698	0.852	4.305	0.000	1.999	5.340
x1	4.2329	1.075	3.939	0.000	2.127	6.339
x2	6.7458	1.463	4.610	0.000	3.878	9.614

Discrimination lines

For a fixed probability p, the line is the set of predictor values satisfying a + b1*x1 + b2*x2 = logit(p).

Code

def discrimination_line(beta_vec, p, x_values):
    return (logit(p) - beta_vec[0] - beta_vec[1] * x_values) / beta_vec[2]

x1_grid = np.linspace(fake_2["x1"].min() - 0.15, fake_2["x1"].max() + 0.15, 200)

fig, ax = plt.subplots(figsize=(7, 5))
ax.scatter(fake_2.loc[fake_2["y"] == 0, "x1"], fake_2.loc[fake_2["y"] == 0, "x2"], color="black", s=22, label="y=0")
ax.scatter(fake_2.loc[fake_2["y"] == 1, "x1"], fake_2.loc[fake_2["y"] == 1, "x2"], facecolor="white", edgecolor="black", s=36, label="y=1")
for p, style in [(0.1, "--"), (0.5, "-"), (0.9, "--")]:
    ax.plot(x1_grid, discrimination_line(beta_hat, p, x1_grid), color="black", linestyle=style, label=f"Pr(y=1)={p:.1f}")
ax.set_xlabel("x1")
ax.set_ylabel("x2")
ax.set_title("Data and 10%, 50%, 90% fitted discrimination lines")
ax.legend(frameon=False, loc="best")

Code

pd.DataFrame({
    "true": beta,
    "fitted_mle": beta_hat,
}, index=["Intercept", "x1", "x2"])

	true	fitted_mle
Intercept	2	3.669789
x1	3	4.232908
x2	4	6.745842

With two predictors, the logistic regression curve becomes a surface. The contour lines at probabilities 0.1, 0.5, and 0.9 are often easier to read than a three-dimensional surface.

# Logistic regression graphs Source: `LogitGraphs/logitgraphs.Rmd` This example shows several ways to display logistic regression: raw binary data with fitted curves, binned averages, and discrimination lines for a model with two predictors. The original page uses `stan_glm`; here `statsmodels` fits the same logistic likelihood by maximum likelihood. ## Setup ```{python} import numpy as np import pandas as pd import matplotlib.pyplot as plt import statsmodels.formula.api as smf from scipy.special import expit, logit rng = np.random.default_rng(14014) ``` ## Simulate fake data from a one-predictor logit model ```{python} n = 50 a = 2 b = 3 x_mean = -a / b x_sd = 4 / b x = rng.normal(x_mean, x_sd, size=n) y = rng.binomial(1, expit(a + b * x)) fake_1 = pd.DataFrame({"x": x, "y": y}) fake_1.head() ``` ## Fit the model ```{python} fit_1 = smf.logit("y ~ x", data=fake_1).fit(disp=False) a_hat = fit_1.params["Intercept"] b_hat = fit_1.params["x"] fit_1.summary() ``` ## Raw data, true curve, and fitted curve ```{python} def shifted(values, delta=0.008): values = np.asarray(values) return np.where(values == 0, delta, np.where(values == 1, 1 - delta, values)) x_grid = np.linspace(x.min() - 0.5, x.max() + 0.5, 300) fig, ax = plt.subplots(figsize=(7, 4.5)) ax.scatter(x, shifted(y), color="black", s=25) ax.plot(x_grid, expit(a + b * x_grid), color="0.25", label="true curve") ax.plot(x_grid, expit(a_hat + b_hat * x_grid), color="0.25", linestyle="--", label="fitted curve") ax.set_ylim(0, 1) ax.set_xlabel("x") ax.set_ylabel("y") ax.legend(frameon=False) ``` The vertical shift keeps the binary observations visible without pretending that the outcomes are continuous. ## Binned averages ```{python} K = 5 fake_1["bin"] = pd.cut(fake_1["x"], bins=K) binned = fake_1.groupby("bin", observed=True).agg(x_bar=("x", "mean"), y_bar=("y", "mean"), n=("y", "size")) binned ``` ```{python} fig, ax = plt.subplots(figsize=(7, 4.5)) ax.scatter(fake_1["x"], shifted(fake_1["y"]), color="0.55", s=22, label="data") ax.scatter(binned["x_bar"], shifted(binned["y_bar"], delta=0.02), s=90, facecolor="white", edgecolor="black", label="bin averages") ax.plot(x_grid, expit(a_hat + b_hat * x_grid), color="black", linewidth=1) ax.set_ylim(0, 1) ax.set_xlabel("x") ax.set_ylabel("y") ax.set_title("Data and binned averages") ax.legend(frameon=False) ``` Binning is only a display device: the model is still fit to the individual binary observations. ## Logistic regression with two predictors ```{python} n = 100 beta = np.array([2, 3, 4]) x1 = rng.normal(0, 0.4, size=n) x2 = rng.normal(-0.5, 0.4, size=n) linpred = beta[0] + beta[1] * x1 + beta[2] * x2 y = rng.binomial(1, expit(linpred)) fake_2 = pd.DataFrame({"x1": x1, "x2": x2, "y": y}) fake_2.head() ``` ```{python} fit_2 = smf.logit("y ~ x1 + x2", data=fake_2).fit(disp=False) beta_hat = fit_2.params[["Intercept", "x1", "x2"]].to_numpy() fit_2.summary() ``` ## Discrimination lines For a fixed probability `p`, the line is the set of predictor values satisfying `a + b1*x1 + b2*x2 = logit(p)`. ```{python} def discrimination_line(beta_vec, p, x_values): return (logit(p) - beta_vec[0] - beta_vec[1] * x_values) / beta_vec[2] x1_grid = np.linspace(fake_2["x1"].min() - 0.15, fake_2["x1"].max() + 0.15, 200) fig, ax = plt.subplots(figsize=(7, 5)) ax.scatter(fake_2.loc[fake_2["y"] == 0, "x1"], fake_2.loc[fake_2["y"] == 0, "x2"], color="black", s=22, label="y=0") ax.scatter(fake_2.loc[fake_2["y"] == 1, "x1"], fake_2.loc[fake_2["y"] == 1, "x2"], facecolor="white", edgecolor="black", s=36, label="y=1") for p, style in [(0.1, "--"), (0.5, "-"), (0.9, "--")]: ax.plot(x1_grid, discrimination_line(beta_hat, p, x1_grid), color="black", linestyle=style, label=f"Pr(y=1)={p:.1f}") ax.set_xlabel("x1") ax.set_ylabel("x2") ax.set_title("Data and 10%, 50%, 90% fitted discrimination lines") ax.legend(frameon=False, loc="best") ``` ```{python} pd.DataFrame({ "true": beta, "fitted_mle": beta_hat, }, index=["Intercept", "x1", "x2"]) ``` With two predictors, the logistic regression curve becomes a surface. The contour lines at probabilities 0.1, 0.5, and 0.9 are often easier to read than a three-dimensional surface.