Columbia University
To develop a unified Bayesian approach to estimating heterogeneous causal effects
When running experiments, a social scientist can expect to observe the outcome of treating a given individual at a given time under given circumstances. But what the experimenter never gets to see directly is the counterfactual outcome that would have been obtained had the very same individual not been treated at that very same time under those very same circumstances. That is why so many empirical social scientists love randomized controlled trials (RCTs). By assigning people randomly to either the treatment or control group, RCTs are designed to make sure there are no systematic differences between one group and the other at the outset. If the experiment reveals statistically significant differences between those two groups when it is over, then we have good reason to conclude that the intervention caused them. One big problem, though, is that RCTs are often impractical or even unethical. And unless researchers have access to very large samples, RCTs usually produce reliable estimates of average treatment effects only.
Andrew Gelman of Columbia University has dedicated his career to developing and applying techniques that provide empirical evidence in complicated settings where desirable data may be incomplete or unavailable. Like many social scientists, he favors “Bayesian” methods over the “frequentist” ones preferred by natural scientists, who in many cases actually can repeat the same experiment over and over. Bayesians start with prior beliefs, expressed as a probability distribution over the quantity in question, then systematically update those beliefs in light of all the experimental evidence. Gelman specifically constructs reasonable priors using a process of “multilevel modeling.” His approach recognizes that treatment effects are heterogeneous from the outset and provides additional structure for breaking down those estimates by subgroup even when data are sparse.
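To make the idea of multilevel estimates for sparse subgroups concrete, here is a minimal sketch of partial pooling in Python. It is not Gelman's software or method as such. It assumes a simple normal-normal model in which the between-group standard deviation `tau` is known, whereas a full Bayesian fit would estimate it; the function name `partial_pool` and its parameters are hypothetical.

```python
def partial_pool(estimates, std_errors, tau):
    """Shrink noisy subgroup effect estimates toward a common mean.

    Illustrative assumption: a normal-normal multilevel model with a
    known between-group standard deviation `tau`. A full Bayesian
    analysis would place a prior on tau and estimate it from the data.
    """
    # Precision-weighted estimate of the overall mean effect.
    weights = [1.0 / (se ** 2 + tau ** 2) for se in std_errors]
    mu = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)

    pooled = []
    for y, se in zip(estimates, std_errors):
        # Shrinkage factor: close to 1 for noisy (sparse) subgroups,
        # close to 0 for precisely estimated ones.
        shrink = se ** 2 / (se ** 2 + tau ** 2)
        pooled.append(shrink * mu + (1.0 - shrink) * y)
    return mu, pooled


# Two subgroups with equally noisy raw estimates of the treatment effect.
mu, pooled = partial_pool([2.0, 0.0], [1.0, 1.0], tau=1.0)
print(mu, pooled)
```

In this sketch a subgroup with little data (a large standard error) is pulled strongly toward the overall mean, while a well-measured subgroup keeps an estimate close to its own raw value.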
Together with Jori Korpershoek, Gelman will develop a unified Bayesian approach to estimating causal effects in cross-sectional data from both panels and time series (i.e., the kinds of data most empirical economists study). They aim to express standard ways of estimating average treatment effects as special cases of a more comprehensive algorithm in which each method amounts to comparing differently weighted means. Rather than having researchers concoct specific weights, as is current practice, Gelman and Korpershoek will prescribe methods for estimating the weights, which will in turn produce better and more efficient estimators. Their open-source and user-friendly software will extend those methods to allow for the reliable estimation of individual treatment effects based on Bayesian multilevel modeling.
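The claim that standard estimators amount to comparing differently weighted means can be illustrated with a small Python sketch. This is not the project's algorithm: the weights here are specified by hand rather than estimated, the data are made up for illustration, and the function names are hypothetical. It shows only that a plain difference in means and a normalized inverse-propensity-weighted estimator share one form and differ in their weights.

```python
def weighted_mean_difference(outcomes, treated, weights):
    """Weighted mean of treated outcomes minus weighted mean of controls."""
    num_t = sum(w * y for y, t, w in zip(outcomes, treated, weights) if t)
    den_t = sum(w for t, w in zip(treated, weights) if t)
    num_c = sum(w * y for y, t, w in zip(outcomes, treated, weights) if not t)
    den_c = sum(w for t, w in zip(treated, weights) if not t)
    return num_t / den_t - num_c / den_c


def uniform_weights(treated):
    """Equal weights: recovers the simple difference in means."""
    return [1.0 for _ in treated]


def inverse_propensity_weights(treated, propensity):
    """Weights 1/p for treated units and 1/(1-p) for controls.

    `propensity` holds each unit's assumed probability of treatment.
    """
    return [1.0 / p if t else 1.0 / (1.0 - p)
            for t, p in zip(treated, propensity)]


# Made-up illustrative data: two treated units, two controls.
outcomes = [3.0, 5.0, 1.0, 2.0]
treated = [1, 1, 0, 0]

diff_in_means = weighted_mean_difference(
    outcomes, treated, uniform_weights(treated))

ipw = weighted_mean_difference(
    outcomes, treated,
    inverse_propensity_weights(treated, [0.8, 0.4, 0.2, 0.5]))

print(diff_in_means, ipw)
```

When every unit has the same propensity, as in a simple randomized experiment, the two sets of weights give the same estimate. They diverge only when treatment probabilities vary across units.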