Synthetic control with scikit-learn models

import causalpy as cp

Load data

df = cp.load_data("sc")
treatment_time = 70

Analyse with the WeightedProportion model

result = cp.SyntheticControl(
    df,
    treatment_time,
    control_units=["a", "b", "c", "d", "e", "f", "g"],
    treated_units=["actual"],
    model=cp.skl_models.WeightedProportion(),
)
fig, ax = result.plot(plot_predictors=True)
result.summary(round_to=3)
================================SyntheticControl================================
Control units: ['a', 'b', 'c', 'd', 'e', 'f', 'g']
Treated unit: actual
Model coefficients:
  a	     0.319
  b	    0.0597
  c	     0.294
  d	    0.0605
  e	  0.000762
  f	     0.234
  g	    0.0321
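
As a quick sanity check on the "sum to 1" behaviour of this model, the coefficients printed above can be added up by hand. The sketch below copies the rounded values from the summary output into a plain dictionary (the variable names are illustrative, not part of CausalPy) and confirms that they are non-negative and sum to roughly 1, up to rounding.

```python
# Coefficients copied from the summary output above (rounded values).
weights = {
    "a": 0.319,
    "b": 0.0597,
    "c": 0.294,
    "d": 0.0605,
    "e": 0.000762,
    "f": 0.234,
    "g": 0.0321,
}

# Total weight across all control units.
total = sum(weights.values())

# Check that no control unit received a negative weight.
all_non_negative = all(w >= 0 for w in weights.values())

print(f"sum of weights: {total:.3f}")
print(f"all weights non-negative: {all_non_negative}")
```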

For this dataset, however, these estimates are quite poor. One option is to lift the “sum to 1” assumption and use the LinearRegression model instead, while still constraining the weights to be positive. Alternatively, you could experiment with the Ridge model (e.g. Ridge(positive=True, alpha=100)).