# Risky behavior trial data

Data from a randomized HIV-prevention trial for high-risk couples. Treatment arms were control, woman-only counseling, and couple counseling; one outcome is the number of unprotected sex acts after three months.
```python
from pathlib import Path

import pandas as pd
import matplotlib.pyplot as plt

# Path to a local checkout of the ROS-Examples repository.
root = Path("../../ROS-Examples")
risky = pd.read_csv(root / "RiskyBehavior/data/risky.csv")
risky.head()
```
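|   | sex | couples | women_alone | bs_hiv | bupacts | fupacts |
|---|-------|---|---|----------|----|------|
| 0 | woman | 0 | 1 | negative | 7 | 32.0 |
| 1 | woman | 0 | 0 | negative | 2 | 5.0 |
| 2 | woman | 0 | 0 | positive | 0 | 15.0 |
| 3 | woman | 0 | 0 | negative | 24 | 9.0 |
| 4 | woman | 1 | 0 | negative | 2 | 2.0 |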
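The two indicator columns `couples` and `women_alone` appear to encode the three arms described above. The sketch below collapses them into a single label. It assumes that both indicators equal to 0 marks the control arm and that the two are never 1 on the same row, which the source does not state explicitly. The `arm` column and the `label_arm` helper are hypothetical names, and the small frame simply re-types the five rows shown above so the block runs without the CSV.

```python
import pandas as pd

# The five rows displayed by risky.head(), re-typed so the sketch is self-contained.
sample = pd.DataFrame(
    {
        "sex": ["woman"] * 5,
        "couples": [0, 0, 0, 0, 1],
        "women_alone": [1, 0, 0, 0, 0],
        "bs_hiv": ["negative", "negative", "positive", "negative", "negative"],
        "bupacts": [7, 2, 0, 24, 2],
        "fupacts": [32.0, 5.0, 15.0, 9.0, 2.0],
    }
)


def label_arm(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: collapse the two indicators into one arm label.

    Assumption: couples == 1 means couple counseling, women_alone == 1 means
    woman-only counseling, and both equal to 0 means control.
    """
    # Guard the assumption that the two indicators are mutually exclusive.
    if ((df["couples"] == 1) & (df["women_alone"] == 1)).any():
        raise ValueError("a row is flagged for both treatment arms")
    arm = pd.Series("control", index=df.index, name="arm")
    arm[df["couples"] == 1] = "couples"
    arm[df["women_alone"] == 1] = "women_alone"
    return arm


sample["arm"] = label_arm(sample)
print(sample["arm"].value_counts())
```

Applied to the full data, the same helper would be called as `label_arm(risky)`; the guard raises if the mutual-exclusivity assumption fails.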
## Quick structure check
```python
# Summary statistics for every column, categorical and numeric.
risky.describe(include="all")
```
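|        | sex | couples | women_alone | bs_hiv | bupacts | fupacts |
|--------|-------|------------|------------|----------|------------|------------|
| count  | 434 | 434.000000 | 434.000000 | 434 | 434.000000 | 434.000000 |
| unique | 2 | NaN | NaN | 2 | NaN | NaN |
| top    | woman | NaN | NaN | negative | NaN | NaN |
| freq   | 217 | NaN | NaN | 337 | NaN | NaN |
| mean   | NaN | 0.373272 | 0.336406 | NaN | 25.910138 | 16.489579 |
| std    | NaN | 0.484232 | 0.473025 | NaN | 31.917963 | 26.825769 |
| min    | NaN | 0.000000 | 0.000000 | NaN | 0.000000 | 0.000000 |
| 25%    | NaN | 0.000000 | 0.000000 | NaN | 5.000000 | 0.000000 |
| 50%    | NaN | 0.000000 | 0.000000 | NaN | 15.000000 | 5.000000 |
| 75%    | NaN | 1.000000 | 1.000000 | NaN | 36.000000 | 20.925600 |
| max    | NaN | 1.000000 | 1.000000 | NaN | 300.000000 | 200.000000 |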
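```python
# Row counts for each level of every low-cardinality column (at most 5
# distinct values). This covers the treatment indicators `couples` and
# `women_alone` as well as `sex` and `bs_hiv`.
for col in risky.columns:
    if risky[col].nunique(dropna=True) <= 5:
        print("\n", col)
        print(risky.groupby(col).size())
```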
The original R page only loads and displays the data; later causal chapters use this kind of randomized-treatment dataset for treatment-effect modeling.
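As a rough illustration of where such treatment-effect modeling usually starts, the sketch below computes a difference in mean `fupacts` between each treatment indicator and the rows with both indicators at 0. This is not the analysis from the original page or the later chapters. The `diff_in_means` helper is a hypothetical name, treating rows with both indicators at 0 as the control arm is an assumption, and the frame re-types only the five rows shown by `risky.head()`, so the resulting numbers carry no substantive meaning.

```python
import pandas as pd

# Five rows copied from the risky.head() output; illustration only.
sample = pd.DataFrame(
    {
        "couples": [0, 0, 0, 0, 1],
        "women_alone": [1, 0, 0, 0, 0],
        "fupacts": [32.0, 5.0, 15.0, 9.0, 2.0],
    }
)


def diff_in_means(df: pd.DataFrame, indicator: str, outcome: str = "fupacts") -> float:
    """Hypothetical helper: mean outcome in one arm minus mean outcome in control.

    Assumption: rows with couples == 0 and women_alone == 0 form the control arm.
    """
    control = df[(df["couples"] == 0) & (df["women_alone"] == 0)]
    treated = df[df[indicator] == 1]
    return float(treated[outcome].mean() - control[outcome].mean())


for indicator in ["couples", "women_alone"]:
    print(indicator, round(diff_in_means(sample, indicator), 3))
```

On the full data the call would be `diff_in_means(risky, "couples")`. Because `fupacts` is a heavily skewed count (mean about 16.5, median 5 in the summary above), a count regression is a more natural model than a raw difference in means.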
Source: `RiskyBehavior/risky.Rmd`