{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Bayesian Interrupted Time Series\n",
"\n",
"Interrupted Time Series (ITS) analysis is a powerful approach for estimating the causal impact of an intervention or treatment when you have a single time series of observations. The key idea is to compare what actually happened after the intervention to what would have happened in the absence of the intervention (the \"counterfactual\"). To do this, we train a statistical model on the pre-intervention data (when no treatment has occurred) and then use this model to forecast the expected outcomes into the post-intervention period as-if treatment had not occurred. The difference between the observed outcomes and these model-based counterfactual predictions provides an estimate of the causal effect of the intervention, along with a measure of uncertainty if using a Bayesian approach."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## What do we mean by _causal impact_ in Interrupted Time Series?\n",
"\n",
"In the context of interrupted time series analysis, the term **causal impact** refers to the estimated effect of an intervention or event on an outcome of interest. We can break this down into two components which tell us different aspects of the intervention's effect:\n",
"\n",
"- The **Instantaneous Bayesian Causal Effect** at each time point is the difference between the observed outcome and the model's posterior predictive distribution for the counterfactual (i.e., what would have happened without the intervention). This is not just a single number, but a full probability distribution that reflects our uncertainty.\n",
"- The **Cumulative Bayesian Causal Impact** is the sum of these instantaneous effects over the post-intervention period, again represented as a distribution.\n",
"\n",
"Let $y_t$ be the observed outcome at time $t$ (after the intervention), and $\\tilde{y}_t$ be the model's counterfactual prediction for the same time point. Then:\n",
"- **Instantaneous effect:** $\\Delta_t = y_t - \\tilde{y}_t$\n",
"- **Cumulative effect (up to time $T$):** $C_T = \\sum_{t=1}^T \\Delta_t$\n",
"\n",
"In Bayesian analysis, both $\\tilde{y}_t$ and $\\Delta_t$ are distributions, not just point estimates.\n",
"\n",
"### Why does this matter for decision making?\n",
"These metrics allow you to answer questions like:\n",
"- \"How much did the intervention change the outcome, compared to what would have happened otherwise?\"\n",
"- \"What is the total effect of the intervention over time, and how certain are we about it?\"\n",
"\n",
"By providing both instantaneous and cumulative effects, along with their uncertainty, you can make more informed decisions and better understand the impact of your interventions."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Interrupted Time Series example"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import arviz as az\n",
"import pandas as pd\n",
"\n",
"import causalpy as cp"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"%load_ext autoreload\n",
"%autoreload 2\n",
"%config InlineBackend.figure_format = 'retina'\n",
"seed = 42"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Load data"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
]
},
"metadata": {
"image/png": {
"height": 811,
"width": 711
}
},
"output_type": "display_data"
}
],
"source": [
"fig, ax = result.plot()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"==================================Pre-Post Fit==================================\n",
"Formula: y ~ 1 + t + C(month)\n",
"Model coefficients:\n",
" Intercept 23, 94% HDI [21, 24]\n",
" C(month)[T.2] 2.9, 94% HDI [0.95, 4.8]\n",
" C(month)[T.3] 1.2, 94% HDI [-0.82, 3.1]\n",
" C(month)[T.4] 7.1, 94% HDI [5.2, 9.2]\n",
" C(month)[T.5] 15, 94% HDI [13, 17]\n",
" C(month)[T.6] 25, 94% HDI [23, 27]\n",
" C(month)[T.7] 18, 94% HDI [16, 20]\n",
" C(month)[T.8] 33, 94% HDI [31, 36]\n",
" C(month)[T.9] 16, 94% HDI [14, 18]\n",
" C(month)[T.10] 9.2, 94% HDI [7.3, 11]\n",
" C(month)[T.11] 6.3, 94% HDI [4.4, 8.2]\n",
" C(month)[T.12] 0.58, 94% HDI [-1.3, 2.6]\n",
" t 0.21, 94% HDI [0.19, 0.23]\n",
" sigma 2, 94% HDI [1.7, 2.3]\n"
]
}
],
"source": [
"result.summary()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As well as the model coefficients, we might be interested in the estimated causal impact of the intervention over time - what we called the instantaneous Bayesian causal effect above. The post intervention causal impact estimates are contained in the `post_impact` attribute, which is of type `xarray.DataArray`. We can take a look at what this looks like, and we can see that it consists of 3 dimensions: `chain`, `draw`, and time (`obs_ind`). The `chain` and `draw` dimensions are used to store the samples from the posterior distribution, while the `obs_ind` dimension corresponds to the time points in the time series."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
" mean sd hdi_3% hdi_97%\n",
"x[unit_0] 65.654 23.943 20.56 109.182"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"az.summary(result.post_impact.sum(\"obs_ind\"), kind=\"stats\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Of course, if we wanted to query the estimated impact over a specific time period, we can leverage the fact that this is an `xarray.DataArray` and we can use the `sel` method to select the time points we are interested in. We will leave this as an exercise for the reader.\n",
"\n",
"Moving on, it would also be possible to look at the mean post-intervention impact estimates, which would give us the average impact of the intervention over the post-intervention period. This can be done using the `mean` method and would equate to $\\bar{C_T} = \\Big[ \\sum_{t=1}^T \\Delta_t \\Big] / T$"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
mean
\n",
"
sd
\n",
"
hdi_3%
\n",
"
hdi_97%
\n",
"
\n",
" \n",
" \n",
"
\n",
"
x[unit_0]
\n",
"
1.824
\n",
"
0.665
\n",
"
0.571
\n",
"
3.033
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" mean sd hdi_3% hdi_97%\n",
"x[unit_0] 1.824 0.665 0.571 3.033"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"az.summary(result.post_impact.mean(\"obs_ind\"), kind=\"stats\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
":::{warning}\n",
"Care must be taken with the mean causal impact statistic $\\bar{C_T}$. It only makes sense to use this statistic if it looks like the intervention had a lasting (and roughly constant) effect on the outcome variable. If the effect is transient (like in the example here), then clearly there will be a lot of post-intervention period where the impact of the intervention has ‘worn off’. If so, then it will be hard to interpret the mean impacts real meaning.\n",
"\n",
"But if there was a roughly constant impact of the intervention over the post-intervention period, then the mean causal impact can be interpreted as the mean impact of the intervention (per time point) over the post-intervention period.\n",
":::"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Similarly, if we wanted, we would also be able to query the estimated cumulative impact of the intervention over the post-intervention period by instead looking at the `post_impact_cumulative` attribute, rather than the `post_impact` attribute."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "CausalPy",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.2"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "02f5385db19eab57520277c5168790c7855381ee953bdbb5c89c321e1f17586e"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}