The Building Blocks
External validity inferences rest on assumptions, background knowledge, and supplementary evidence that researchers collect. From these building blocks, researchers hypothesize about mechanisms: the conditions under which inputs lead to outputs at the population level. Mechanisms are foundational because external validity is not a binary question of whether an inference travels, but a matter of specifying when and how effects generalize.
Two Types of External Validity
There are two distinct types of external validity inference, distinguished by whether the target population shares the sample's eligibility criteria:
Generalizability
Generalizability refers to using facts learned about a sample to make inferences about a population that has similar eligibility criteria.
Estimand: Population Average Treatment Effect (PATE)
Transportability
Transportability refers to using facts learned about a sample to make inferences about a population that has different eligibility criteria.
Estimand: Target Average Treatment Effect (TATE)
Transportability requires additional assumptions or adjustment strategies to account for how mechanisms operate across differences in eligibility criteria. Regardless of whether an inference takes the form of generalizability or transportability, the strongest inferences will be mechanism-based.
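The two estimands can be written side by side in potential-outcomes notation. This is a sketch under assumed notation, none of which appears in the text: $Y(1)$ and $Y(0)$ are a unit's potential outcomes under treatment and control, $P$ is the population that shares the sample's eligibility criteria, $P^{*}$ is a target population with different eligibility criteria, and $X$ is a discrete moderator. The last line additionally assumes that the conditional effect $\tau(x)$ is the same in both populations, which is one example of the extra assumptions transportability may require.

```latex
\begin{align*}
\text{PATE} &= \mathbb{E}_{P}\bigl[\,Y(1) - Y(0)\,\bigr] \\
\text{TATE} &= \mathbb{E}_{P^{*}}\bigl[\,Y(1) - Y(0)\,\bigr] \\
\tau(x)     &= \mathbb{E}\bigl[\,Y(1) - Y(0) \mid X = x\,\bigr] \\
\text{TATE} &= \sum_{x} \tau(x)\,\Pr\nolimits_{P^{*}}(X = x)
  \quad \text{(if } \tau(x) \text{ is shared by } P \text{ and } P^{*}\text{)}
\end{align*}
```

Under that assumption, the two estimands differ only in the moderator distribution used to average $\tau(x)$, which is why a hypothesized mechanism does the work in a transportability inference.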
The M–STOUT Framework
Credible external validity inferences must be organized around multiple dimensions. We examine M–STOUT: Mechanisms, Settings, Treatments, Outcomes, Units, and Time. Our framework builds on, but reorganizes, the classic UTOS framework (Cronbach & Shapiro 1982) to emphasize the often-neglected dimensions of Mechanisms and Time.
We place Mechanisms first for three key reasons. First, the mechanism represents the causal structure, specifying the pathways and constraints that link inputs to outputs. Second, any particular empirical point estimate is relevant only to its specific STOUT context; such an estimate does not capture broader theory and, at best, represents an observable implication of a mechanism. Finally, the mechanism is the only dimension that, in principle, can travel across all STOUT contexts, explaining the conditions under which effects occur.
- Mechanisms: The causal structure specifying pathways and constraints that shape relationships of interest. Mechanisms here refer to structural mechanisms (the moderator variables that determine when and how effects vary) rather than causal process mechanisms (mediators). Mechanisms specify support factors, constraint factors, and the functional form governing how those factors interact across populations.
- Settings: The environments in which a study's data are generated, such as a laboratory, country, village, or other context. Different settings yield different levels of external validity.
- Treatments: The extent to which inferences hold across different operationalizations of the main explanatory variable. Such inferences require construct validity and measurement invariance.
- Outcomes: Whether inferences hold across different operationalizations of the dependent variable, including across different measurement approaches.
- Units: The population units for which sample inferences hold, such as individuals, households, municipalities, countries, or nested combinations thereof.
- Time: Effects and their mechanisms often vary across periods and sequences. The target population for any inference pertains to some future state, by which point the population's composition may have shifted and new confounders may have emerged.
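To make the role of a structural mechanism concrete, the sketch below shows one simple adjustment strategy: estimate the effect within each level of a moderator, then average those effects using the moderator's distribution in the population of interest. It is a minimal illustration, not a method taken from the text. It assumes a single discrete moderator, randomized treatment within the sample, and that effects differ across populations only through that moderator. All function names, variable names, and data values are hypothetical.

```python
from collections import defaultdict


def stratum_effects(records):
    """Treated-minus-control difference in mean outcomes per moderator stratum.

    `records` is an iterable of (stratum, treated, outcome) tuples, where
    treated is 1 or 0. Names and data layout are hypothetical.
    """
    # cells[stratum][arm] holds [sum of outcomes, number of units]
    cells = defaultdict(lambda: {1: [0.0, 0], 0: [0.0, 0]})
    for stratum, treated, outcome in records:
        cell = cells[stratum][treated]
        cell[0] += outcome
        cell[1] += 1

    effects = {}
    for stratum, arms in cells.items():
        (t_sum, t_n), (c_sum, c_n) = arms[1], arms[0]
        if t_n == 0 or c_n == 0:
            raise ValueError(f"stratum {stratum!r} lacks treated or control units")
        effects[stratum] = t_sum / t_n - c_sum / c_n
    return effects


def reweighted_effect(effects, shares):
    """Average stratum effects using a population's moderator shares."""
    missing = [s for s, share in shares.items() if share > 0 and s not in effects]
    if missing:
        # The sample offers no evidence for these strata, so nothing can be carried over.
        raise ValueError(f"no sample estimate for strata: {missing}")
    total = sum(shares.values())
    return sum(effects[s] * share for s, share in shares.items() if share > 0) / total


# Hypothetical sample: the effect is larger in urban than in rural units.
records = [
    ("urban", 1, 3.0), ("urban", 1, 5.0), ("urban", 0, 1.0), ("urban", 0, 1.0),
    ("rural", 1, 2.0), ("rural", 1, 2.0), ("rural", 0, 1.0), ("rural", 0, 1.0),
]
effects = stratum_effects(records)

# Moderator shares in a population like the sample and in a different target (assumed values).
sample_like_shares = {"urban": 0.5, "rural": 0.5}
target_shares = {"urban": 0.25, "rural": 0.75}

sample_like_effect = reweighted_effect(effects, sample_like_shares)
target_effect = reweighted_effect(effects, target_shares)
print(sample_like_effect, target_effect)
```

With these made-up numbers the reweighted effect is smaller in the target, because the target contains more rural units, where the effect is weaker. The error raised for strata missing from the sample reflects a limit of the approach: reweighting cannot stand in for evidence about moderator levels the study never observed.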