Synthetic control with scikit-learn models#
import causalpy as cp
Load data#
df = cp.load_data("sc")
treatment_time = 70
Analyse with WeightedProportion model#
result = cp.SyntheticControl(
    df,
    treatment_time,
    control_units=["a", "b", "c", "d", "e", "f", "g"],
    treated_units=["actual"],
    model=cp.skl_models.WeightedProportion(),
)
fig, ax = result.plot(plot_predictors=True)

result.summary(round_to=3)
================================SyntheticControl================================
Control units: ['a', 'b', 'c', 'd', 'e', 'f', 'g']
Treated unit: actual
Model coefficients:
a 0.319
b 0.0597
c 0.294
d 0.0605
e 0.000762
f 0.234
g 0.0321
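As a quick check on the "sum to 1" constraint that WeightedProportion imposes, the coefficients printed above can be added up. The snippet below is only a sketch: the values are copied by hand from the rounded summary output, and the variable names are illustrative rather than part of CausalPy.

```python
# Coefficients copied by hand from the summary output above (already rounded).
weights = {
    "a": 0.319,
    "b": 0.0597,
    "c": 0.294,
    "d": 0.0605,
    "e": 0.000762,
    "f": 0.234,
    "g": 0.0321,
}

# Every weight should be non-negative, and together they should sum to about 1.
all_non_negative = all(w >= 0 for w in weights.values())
total = sum(weights.values())

print(f"All weights non-negative: {all_non_negative}")
print(f"Sum of weights: {total:.3f}")
```

Up to rounding in the printed values, the weights are all non-negative and add up to 1.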
However, for this dataset these estimates are quite poor. We can therefore lift the "sum to 1" constraint and instead use the LinearRegression model, while still constraining the weights to be positive. Equally, you could experiment with the Ridge model (e.g. Ridge(positive=True, alpha=100)).