A Practical Introduction to Causal Inference in Agricultural Economics 

20 January 2026, by A. Henningsen, G. Low, D. Wuepper, T. Dalhaus, H. Storm, D. Belay, S. Hirsch (a German version is available on the Agrarpolitik Blog of ETH Zürich)

Agricultural economics research often addresses policy-relevant questions, many of which are inherently causal, e.g., how a certain policy, a farmer's choice, or a market development affects some outcome such as consumer behavior, farm income, or the environment. Sometimes, data comes from randomized controlled trials with well-balanced treatment and control groups, and the analysis is straightforward and comparatively simple. Once it is established that the treatment and control groups were indistinguishable regarding both observable and unobservable factors before any treatment was provided, differences arising after treatment can plausibly be attributed to it.
[Image: XKCD comic, © XKCD]

However, most of the time, data does not come from randomized controlled trials. For instance, policymakers need to make a decision, e.g., on how to design a new agri-environmental scheme or a land use regulation, and there is no time to run a randomized controlled trial for multiple years before one can tell the policymaker what works and what does not.

Moreover, even if the policymaker had the time and patience to wait, many research questions are generally unsuitable for a randomized controlled trial. For instance, convincing the governments of randomly selected countries to change their laws, and convincing the governments of all other countries to keep their laws unchanged solely for cleaner causal inference, is highly impractical. Once we move to issues like land degradation, we even enter the realm of the unethical: no ethics board in the world would allow researchers to purposefully degrade the cropland of randomly chosen farmers just to learn how much this leads to cropland abandonment and migration. These challenges highlight why obtaining robust causal inference without randomized controlled trials is essential for designing effective agricultural policies that achieve intended outcomes and avoid unintended consequences. Instead of answering all research questions with randomized controlled trials, agricultural economists therefore mostly rely on the analysis of observational data, using a wide range of causal inference approaches that are suited for this purpose, conditional on different assumptions being valid.

In a new article published in the Journal of Agricultural Economics (Henningsen et al., 2025), we offer an introductory discussion of these causal inference approaches for observational data. The fundamental challenge when working with observational data is statistical endogeneity, i.e., explanatory variables being correlated with unobserved factors affecting the outcome. With observational data, we estimate correlations; only by choosing clever research designs and critically scrutinizing the identifying assumptions might we be able to rule out all non-causal explanations for a correlation between the variable(s) whose effect we aim to estimate and the outcomes of interest. Potential non-causal explanations include unobserved confounding factors, reverse causality, and certain types of measurement error.
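To see how an unobserved confounder breaks the naive correlation-as-causation reading, here is a toy simulation (our own illustration, not from the article; all numbers and variable names are made up): pest pressure drives pesticide use up and yields down, so the naive regression slope is far from the true causal effect, while adjusting for the confounder recovers it.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Unobserved confounder, e.g. pest pressure: raises pesticide use, lowers yield
pest = rng.normal(size=n)
pesticide = 0.8 * pest + rng.normal(size=n)  # farmers spray more under pressure
yield_ = 0.5 * pesticide - 1.0 * pest + rng.normal(size=n)  # true effect: 0.5

# Naive simple-regression slope of yield on pesticide use: cov / var
naive = np.cov(pesticide, yield_)[0, 1] / np.var(pesticide, ddof=1)

# Controlling for the confounder in a multiple regression recovers the effect
X = np.column_stack([np.ones(n), pesticide, pest])
beta = np.linalg.lstsq(X, yield_, rcond=None)[0]

print(f"true effect: 0.5, naive: {naive:.2f}, adjusted: {beta[1]:.2f}")
```

Here the naive slope is close to zero because the confounder's negative effect on yield almost cancels the positive causal effect of pesticide use, illustrating how badly an unadjusted correlation can mislead.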

The article begins by discussing approaches that aim to adjust for all confounding factors. For example, in a regression on the link between farm performance and pesticide use, one might want to control for environmental differences such as rainfall and temperature, or, even better, pest pressure itself. The article then discusses approaches for estimating causal effects when unobserved confounders may exist, such as methods relying on instrumental variables, difference-in-differences designs, the synthetic control method, regression discontinuity designs, and difference-in-discontinuities designs. Each of these research designs is based on a set of identifying assumptions, which need to be critically discussed and examined in every application. Often, assumptions cannot be directly tested, but if many attempts to falsify them using theory, domain knowledge, and data fail, one may conclude that it is credible to interpret the estimated relationships as causal effects.
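As a minimal sketch of one of these designs, a hypothetical two-group, two-period difference-in-differences comparison can be simulated as follows (variable names and numbers are invented; the parallel-trends assumption holds here by construction, which is exactly the assumption one would have to defend in a real application):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000  # farms per group and period

# Two groups (treated vs. control), two periods (before vs. after).
# Parallel trends hold by construction: both groups share the +2.0 time trend,
# but the groups may differ in levels (10.0 vs. 8.0).
tau = 1.5  # true treatment effect
y_treated_pre  = 10.0 + rng.normal(size=n)
y_treated_post = 10.0 + 2.0 + tau + rng.normal(size=n)
y_control_pre  = 8.0 + rng.normal(size=n)
y_control_post = 8.0 + 2.0 + rng.normal(size=n)

# DiD estimate: (after - before) for treated minus (after - before) for control
did = (y_treated_post.mean() - y_treated_pre.mean()) - (
    y_control_post.mean() - y_control_pre.mean()
)
print(f"true effect: {tau}, DiD estimate: {did:.2f}")
```

The differencing removes both the level difference between groups and the common time trend, leaving (up to noise) only the treatment effect.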

Good studies that aim to estimate causal effects with observational data provide easy-to-follow explanations of the empirical challenges that stand in the way of simply interpreting a correlation as causal. This is then followed by an explanation of which research design is chosen to overcome these challenges and what makes this research design credible, supported by reasoning as well as empirical evidence. Importantly, all important identifying assumptions need to be discussed, together with reasons why one should believe that they are valid. To make this point, both the context and the data need to be described well. For the latter, plotting and mapping can be helpful in many cases. Moreover, placebo tests are often a powerful tool to defuse concerns.
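One common flavor of placebo test can be sketched as a permutation exercise (a hypothetical illustration, not from the article): reassign the treatment label at random many times and check that the resulting fake "effects" cluster around zero, while the actual estimate stands out.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000  # units per group

# Hypothetical cross-section with a real treatment effect of 1.0
treated = np.repeat([1, 0], n)
y = 5.0 + 1.0 * treated + rng.normal(size=2 * n)

actual = y[treated == 1].mean() - y[treated == 0].mean()

# Placebo: reshuffle the treatment label; fake "effects" should center on zero
placebo = []
for _ in range(500):
    fake = rng.permutation(treated)
    placebo.append(y[fake == 1].mean() - y[fake == 0].mean())
placebo = np.array(placebo)

# Share of placebo estimates at least as large as the actual estimate
# (a permutation-style p-value)
share_larger = (np.abs(placebo) >= abs(actual)).mean()
print(f"actual: {actual:.2f}, share of placebos as large: {share_larger:.3f}")
```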

Another characteristic of a convincing analysis is that the overall patterns presented are internally consistent and in line with the hypothesized mechanisms. If the initially presented correlations are possibly inflated because a positive selection bias is plausible, then a causal research design is expected to yield a smaller estimate; if the estimate instead becomes larger, this can be a red flag. Likewise, there is a lot we know about agricultural production in general that we can use for sanity checks. Some effects can only show up with a time lag, while others deteriorate over time. If an estimated effect shows up implausibly quickly or implausibly late, this can call into question whether the estimates are indeed causal effects. Likewise, there are units and times for which the effect is expected to be larger, smaller, or non-existent, and it is helpful to confirm that this is indeed what comes out of the analysis. For example, if the units of observation are crop fields and the outcome of interest is crop-specific, then certain effects can plausibly only be found when these fields are actually planted with the crop, and not at other times, when there is either no crop or a different crop planted; certain weather effects can safely be assumed to be stronger in summer, weaker in spring and autumn, and non-existent in winter; and sometimes, information treatments can be expected to affect only inexperienced farmers but not those who already possess all the required information. In other words, instead of relying on one or two specifications that then need to carry all the weight of the evidence, it is advisable to establish, as far as possible, that all available empirical evidence supports the authors' interpretation of the data. If this is not the case, "puzzles", i.e., empirical patterns that do not fully fit the narrative of the paper, should be openly communicated.

Unfortunately, it often takes a while for misconceptions to be corrected. A famous example is the often-repeated claim that an instrumental variable is sufficiently strong if its first-stage F-statistic exceeds 10. Although more recent research has shown that the threshold for instrument strength is context-dependent and often much higher than 10, many papers simply repeat that whenever the F-value exceeds 10, there is no reason to worry. Our guidelines aim to support researchers in making better methodological choices and in enhancing the credibility of causal evidence in agricultural economics research.
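For a single instrument, the first-stage F-statistic is simply the squared t-statistic of the instrument in the first-stage regression. A small hypothetical sketch (our own illustration, not from the article, and not an endorsement of any fixed threshold):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1_000

# Hypothetical IV setup: instrument z shifts the endogenous regressor x,
# while an unobserved confounder u affects both x and the outcome y
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.3 * z + u + rng.normal(size=n)  # first stage
y = 0.5 * x + u + rng.normal(size=n)

# First-stage OLS of x on a constant and the instrument z
Z = np.column_stack([np.ones(n), z])
coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
resid = x - Z @ coef
sigma2 = resid @ resid / (n - 2)  # residual variance
se_z = np.sqrt(sigma2 * np.linalg.inv(Z.T @ Z)[1, 1])

# With a single instrument, the first-stage F is the squared t-statistic;
# crossing 10 alone does not settle whether the instrument is strong enough
F = (coef[1] / se_z) ** 2
print(f"first-stage F: {F:.1f}")
```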
Our introductory article cannot, of course, cover every topic in depth. It is meant as an overview and a starting point, e.g., for established researchers to refresh their expertise in causal inference, or for early-career researchers to start developing it. For further reading, there are more in-depth treatments of the various topics. For a general introduction to agricultural economics research, including how to make convincing arguments, there is the introductory book "Doing Economics" by Bellemare (2022). More detailed treatments of the different causal inference designs, including example code for different software, can be found in "The Mixtape" by Cunningham (2021) and "The Effect" by Huntington-Klein (2025). For each specific research design, there are dedicated review papers, such as on difference-in-differences designs, including recent insights into designs with staggered treatment (Baker et al., 2025; Roth, Sant'Anna, Bilinski & Poe, 2023), or on regression discontinuity designs (Wuepper & Finger, 2023), and for instrumental variables, there are even review papers on specific, popular kinds of instrumental variables, such as shift-shares (Borusyak, Hull & Jaravel, 2025). Accessible introductions to machine learning are provided by Storm, Baylis & Heckelei (2020) and Baylis, Heckelei & Storm (2021).

References
Baker, A., Callaway, B., Cunningham, S., Goodman-Bacon, A. & Sant'Anna, P.H.C. (2025). Difference-in-differences designs: A practitioner's guide. arXiv preprint arXiv:2503.13323. 
Baylis, K., Heckelei, T., & Storm, H. (2021). Machine learning in agricultural economics. In Handbook of Agricultural Economics (Vol. 5, pp. 4551-4612). Elsevier. 
Bellemare, M. F. (2022). Doing economics: What you should have learned in grad school—but didn’t. MIT Press. 
Borusyak, K., Hull, P., & Jaravel, X. (2025). A practical guide to shift-share instruments. Journal of Economic Perspectives, 39(1), 181-204. 
Cunningham, S. (2021). Causal inference: The mixtape. Yale University Press. 
Henningsen, A., Low, G., Wuepper, D., Dalhaus, T., Storm, H., Belay, D. & Hirsch, S. (2025). Estimating Causal Effects with Observational Data: Guidelines for Agricultural and Applied Economists. Journal of Agricultural Economics, forthcoming. 
Huntington-Klein, N. (2025). The effect: An introduction to research design and causality. Chapman and Hall/CRC. 
Roth, J., Sant’Anna, P. H., Bilinski, A., & Poe, J. (2023). What’s trending in difference-in-differences? A synthesis of the recent econometrics literature. Journal of Econometrics, 235(2), 2218-2244. 
Storm, H., Baylis, K., & Heckelei, T. (2020). Machine learning in agricultural and applied economics. European Review of Agricultural Economics, 47(3), 849-892. 
Wuepper, D., & Finger, R. (2023). Regression discontinuity designs in agricultural and environmental economics. European Review of Agricultural Economics, 50(1), 1-28. 
