How to know the unknowable in observational studies
- Introduction
- Problem Setup
2.1. Causal Graph
2.2. Model With and Without Z
2.3. Strength of Z as a Confounder
- Sensitivity Analysis
3.1. Goal
3.2. Robustness Value
- PySensemakr
- Conclusion
- Acknowledgements
- References
The threat of unobserved confounding (aka omitted variable bias) is a notorious problem in observational studies. In most observational studies, unless we can reasonably assume that treatment assignment is as-if random as in a natural experiment, we can never be truly certain that we controlled for all possible confounders in our model. As a result, our model estimates can be severely biased if we fail to control for an important confounder, and we wouldn’t even know it since the unobserved confounder is, well, unobserved!
Given this problem, it is important to assess how sensitive our estimates are to possible sources of unobserved confounding. In other words, it is a helpful exercise to ask ourselves: how much unobserved confounding would there have to be for our estimates to drastically change (e.g., treatment effect no longer statistically significant)? Sensitivity analysis for unobserved confounding is an active area of research, and there are several approaches to tackling this problem. In this post, I will cover a simple linear method [1] based on the concept of partial R² that is widely applicable to a large spectrum of cases.
2.1. Causal Graph
Let us assume that we have four variables:
- Y: outcome
- D: treatment
- X: observed confounder(s)
- Z: unobserved confounder(s)
This is a common setting in many observational studies where the researcher is interested in knowing whether the treatment of interest has an effect on the outcome after controlling for possible treatment-outcome confounders.
In our hypothetical setting, the relationship between these variables is such that X and Z both affect D and Y, but D has no effect on Y. In other words, we are describing a scenario where the true treatment effect is null. As will become clear in the next section, the purpose of sensitivity analysis is being able to reason about this treatment effect when we have no access to Z, as we normally won’t since it’s unobserved. Figure 1 visualizes our setup.
Figure 1: Problem Setup
2.2. Model With and Without Z
To demonstrate the problem that our unobserved Z can cause, I simulated some data in line with the problem setup described above. You can refer to this notebook for the details of the simulation.
Since Z would be unobserved in real life, the only model we can normally fit to data is Y~D+X. Let us see what results we get if we run that regression.
Based on these results, it seems like D has a statistically significant effect of 0.2686 (p<0.001) per one unit change on Y, which we know isn’t true based on how we generated the data (no D effect).
Now, let’s see what happens to our D estimate when we control for Z as well. (In real life, we of course won’t be able to run this additional regression since Z is unobserved, but our simulation setting allows us to peek behind the curtain into the true data generation process.)
As expected, controlling for Z correctly removes the D effect by shrinking the estimate towards zero and giving us a p-value that is no longer statistically significant at the 𝛼=0.05 threshold (p=0.059).
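The pattern in this comparison can be sketched with a small simulation. The coefficients, the seed, and the helper name `d_coefficient` are illustrative assumptions rather than the values used in the notebook mentioned above, so the estimates will not reproduce 0.2686 exactly; the point is only that omitting Z inflates the D coefficient while including Z pulls it back towards zero.

```python
import numpy as np

# Hypothetical data-generating process: X and Z both affect D and Y,
# while D has no effect on Y (true treatment effect is zero).
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
z = rng.normal(size=n)
d = 0.5 * x + 0.5 * z + rng.normal(size=n)
y = 0.5 * x + 0.5 * z + rng.normal(size=n)


def d_coefficient(outcome, regressors):
    """Fit OLS with an intercept and return the coefficient of the first regressor."""
    design = np.column_stack([np.ones(len(outcome))] + list(regressors))
    beta, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return float(beta[1])  # beta[0] is the intercept


b_short = d_coefficient(y, [d, x])    # Y ~ D + X (Z omitted)
b_full = d_coefficient(y, [d, x, z])  # Y ~ D + X + Z
print(f"D coefficient without Z: {b_short:.3f}")
print(f"D coefficient with Z:    {b_full:.3f}")
```

Plain least squares is used here only to keep the sketch dependency-light; a regression library that also reports standard errors and p-values would be needed to reproduce the significance statements.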
2.3. Strength of Z as a Confounder
At this point, we have established that Z is strong enough of a confounder to eliminate the spurious D effect since the statistically significant D effect disappears when we control for Z. What we haven’t discussed yet is exactly how strong Z is as a confounder. For this, we will utilize a useful statistical concept called partial R², which quantifies the proportion of variation that a given variable of interest can explain that can’t already be explained by the existing variables in a model. In other words, partial R² tells us the added explanatory power of that variable of interest, above and beyond the other variables that are already in the model. Formally, it can be defined as follows
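In standard notation (the usual textbook form, with symbol names chosen to match the description that follows):

```latex
R^2_{\text{partial}} = \frac{RSS_{\text{reduced}} - RSS_{\text{full}}}{RSS_{\text{reduced}}}
```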
where RSS_reduced is the residual sum of squares from the model that doesn’t include the variable(s) of interest and RSS_full is the residual sum of squares from the model that includes the variable(s) of interest.
In our case, the variable of interest is Z, and we would like to know what proportion of the variation in Y and D Z can explain that can’t already be explained by the existing variables. More precisely, we are interested in the following two partial R² values
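Using the conventional subscript notation for partial R² and the residual sums of squares of the corresponding reduced and full models, the two quantities can be written as:

```latex
\begin{align}
R^2_{Y \sim Z \mid D, X} &= \frac{RSS_{Y \sim D + X} - RSS_{Y \sim D + X + Z}}{RSS_{Y \sim D + X}} \tag{1} \\
R^2_{D \sim Z \mid X} &= \frac{RSS_{D \sim X} - RSS_{D \sim X + Z}}{RSS_{D \sim X}} \tag{2}
\end{align}
```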
where (1) quantifies the proportion of variance in Y that can be explained by Z that can’t already be explained by D and X (so the reduced model is Y~D+X and the full model is Y~D+X+Z), and (2) quantifies the proportion of variance in D that can be explained by Z that can’t already be explained by X (so the reduced model is D~X and the full model is D~X+Z).
Now, let us see how strongly associated Z is with D and Y in our data in terms of partial R².
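A minimal sketch of this calculation is shown here, computing each partial R² directly from the residual sums of squares of the reduced and full models. The simulated data, seed, and function names (`rss`, `partial_r2`) are assumptions made for illustration; they are not the notebook's data, so the printed values will only be in the same neighbourhood as the figures reported for the original simulation.

```python
import numpy as np

# Hypothetical data in the spirit of the setup: X and Z drive both D and Y,
# and D has no effect on Y.
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
z = rng.normal(size=n)
d = 0.5 * x + 0.5 * z + rng.normal(size=n)
y = 0.5 * x + 0.5 * z + rng.normal(size=n)


def rss(outcome, regressors):
    """Residual sum of squares of an OLS fit with an intercept."""
    design = np.column_stack([np.ones(len(outcome))] + list(regressors))
    beta, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    resid = outcome - design @ beta
    return float(resid @ resid)


def partial_r2(outcome, reduced, full):
    """(RSS_reduced - RSS_full) / RSS_reduced."""
    rss_reduced = rss(outcome, reduced)
    rss_full = rss(outcome, full)
    return (rss_reduced - rss_full) / rss_reduced


# Equation (1): Y ~ D + X versus Y ~ D + X + Z
r2_y_z = partial_r2(y, [d, x], [d, x, z])
# Equation (2): D ~ X versus D ~ X + Z
r2_d_z = partial_r2(d, [x], [x, z])
print(f"partial R2 of Z with Y given D, X: {r2_y_z:.3f}")
print(f"partial R2 of Z with D given X:    {r2_d_z:.3f}")
```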
It turns out that Z explains 16% of the variation in Y that can’t already be explained by D and X (this is partial R² equation #1 above), and 20% of the variation in D that can’t already be explained by X (this is partial R² equation #2 above).
3.1. Goal
As we discussed in the previous section, unobserved confounding poses a problem in real research settings precisely because, unlike in our simulation setting, Z cannot be observed. In other words, we are stuck with the model Y~D+X, having no way to know what our results would have been if we could run the model Y~D+X+Z instead. So, what can we do?
Intuitively, a reasonable sensitivity analysis approach should be able to tell us that if a Z such as the one we have in our data were to exist, it would nullify our results. Remember that our Z explains 16% of the variation in Y and 20% of the variation in D that can’t be explained by observed variables. Therefore, we expect sensitivity analysis to tell us that a hypothetical Z-like confounder of similar strength would be enough to eliminate the statistically significant D effect.
But how can we calculate that the unobserved confounder’s strength should be in this 16–20% range on the partial R² scale without ever having access to it? Enter the robustness value.
3.2. Robustness Value
The robustness value (RV) formalizes the idea we mentioned above of determining the necessary strength of a hypothetical unobserved confounder that could nullify our results. The usefulness of the RV stems from the fact that we only need our observable model Y~D+X, and not the unobservable model Y~D+X+Z, to be able to calculate it.
Formally, we can write down as follows the RV that quantifies how strong unobserved confounding needs to be to change our observed statistical significance of the treatment effect (if the notation is too much to follow, just remember the key idea that the RV is a measure of the strength of confounding needed to change our results)
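One way to write it, following the form given by Cinelli and Hazlett [1] for the typical case in which the treatment effect is currently significant (the paper also defines edge cases, such as an RV of 0 when the quantity f below is negative, which are left out here):

```latex
RV_{q,\alpha} = \frac{1}{2}\left(\sqrt{f_{q,\alpha}^{4} + 4 f_{q,\alpha}^{2}} - f_{q,\alpha}^{2}\right),
\qquad
f_{q,\alpha} = q\,\frac{\left|t_{\hat{\beta}_{\text{treat}}}\right|}{\sqrt{df}} - \frac{\left|t^{*}_{\alpha,\,df-1}\right|}{\sqrt{df-1}}
```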
where
- 𝛼 is our chosen significance level (generally set to 0.05 or 5%),
- q determines the percent reduction q*100% in significance that we care about (generally set to 1, since we usually care about confounding that could reduce statistical significance by 1*100%=100%, hence rendering it not statistically significant),
- t_betahat_treat is the observed t-value of our treatment from the model Y~D+X (which is 8.389 in this case as can be seen from the regression results above),
- df is our degrees of freedom (which is 1000-3=997 in this case since we simulated 1000 samples and are estimating 3 parameters including the intercept), and
- t*_alpha,df-1 is the t-value threshold associated with a given 𝛼 and df-1 (approximately 1.96 if 𝛼 is set to 0.05).
We are now ready to calculate the RV in our own data using only the observed model Y~D+X (res_ydx).
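As a rough sketch of that calculation, the numbers quoted in the list above (t-value 8.389, df 997, threshold 1.96, q = 1) can be plugged into the main branch of the RV formula from Cinelli and Hazlett [1]. The function name `robustness_value` is hypothetical, the inputs are typed in by hand rather than read from the fitted model object, and the paper's additional branch for extremely large effects is omitted.

```python
import math


def robustness_value(t_value, df, q=1.0, t_crit=None):
    """Main-branch robustness value (hypothetical helper, not a library function).

    With t_crit given, this is the RV for statistical significance;
    with t_crit=None, it is the RV for the point estimate itself.
    """
    f = q * abs(t_value) / math.sqrt(df)
    if t_crit is not None:
        # Subtract the threshold term, which uses df - 1 degrees of freedom.
        f -= abs(t_crit) / math.sqrt(df - 1)
    if f <= 0:
        # The estimate is already not significant: no confounding is needed.
        return 0.0
    f2 = f * f
    return 0.5 * (math.sqrt(f2 * f2 + 4 * f2) - f2)


rv_significance = robustness_value(t_value=8.389, df=997, q=1.0, t_crit=1.96)
print(round(rv_significance, 3))  # 0.184
```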
It is by no stroke of luck that our RV (18%) falls right in the range of the partial R² values we calculated for Y~Z|D,X (16%) and D~Z|X (20%) above. What the RV is telling us here is that, even without any explicit knowledge of Z, we can still reason that any unobserved confounder needs, on average, at least 18% strength on the partial R² scale vis-à-vis both the treatment and the outcome to be able to nullify our statistically significant result.
The reason why the RV isn’t 16% or 20% but falls somewhere in between (18%) is that it is designed to be a single number that summarizes the necessary strength of the confounder with both the outcome and the treatment, so 18% makes perfect sense given what we know about the data. You can think about it like this: since the method doesn’t have access to the actual numbers 16% and 20% when calculating the RV, it is doing its best to quantify the strength of the confounder by assigning 18% to both partial R² values (Y~Z|D,X and D~Z|X), which isn’t far off from the truth and does a good job of summarizing the strength of the confounder.
Of course, in real life we won’t have the Z variable to double check that our RV is correct, but seeing how the two results align here should at least give you some confidence in the method. Finally, once we calculate the RV, we should think about whether an unobserved confounder of that strength is plausible. In our case, the answer is ‘yes’ because we have access to the data generation process, but for your specific real-life application, the existence of such a strong confounder might be an unreasonable assumption. This would be good news for you since no realistic unobserved confounder could drastically change your results.
The sensitivity analysis technique described above has already been implemented with all of its bells and whistles as a Python package under the name PySensemakr (R, Stata, and Shiny App versions exist as well). For example, to get the exact same result that we manually calculated in the previous section, we can simply run the following code chunk.
Note that “Robustness Value, q = 1 alpha = 0.05” is 0.184, which is exactly what we calculated above. In addition to the RV for statistical significance, the package also provides the RV that is needed for the coefficient estimate itself to shrink to 0. Not surprisingly, unobserved confounding needs to be even larger for this to happen (0.233 vs 0.184).
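The 0.233 figure can be recovered by hand in the same spirit: dropping the significance-threshold term from the RV formula leaves only the treatment's t-value and the degrees of freedom. This is a sketch using the numbers quoted earlier (t-value 8.389, df 997); `rv_point_estimate` is a hypothetical helper name, not a PySensemakr function.

```python
import math


def rv_point_estimate(t_value, df, q=1.0):
    """RV for shrinking the coefficient estimate by q*100% (no threshold term)."""
    f = q * abs(t_value) / math.sqrt(df)
    return 0.5 * (math.sqrt(f**4 + 4 * f**2) - f**2)


rv_estimate = rv_point_estimate(t_value=8.389, df=997)
print(round(rv_estimate, 3))  # 0.233
```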
The package also provides contour plots for the two partial R² values, which allows for an intuitive visual display of sensitivity to possible levels of confounding with the treatment and the outcome (in this case, it shouldn’t be surprising to see that the x/y-axis value pairs that meet the red dotted line include 0.18/0.18 as well as 0.20/0.16).
One can even add benchmark values to the contour plot as proxies for possible amounts of confounding. In our case, since we only have one observed covariate X, we can set our benchmarks to be 0.25x, 0.5x and 1x as strong as that observed covariate. The resulting plot tells us that a confounder that is half as strong as X should be enough to nullify our statistically significant result (since the “0.5x X” value falls right on the red dotted line).
Finally, I would like to note that while the simulated data in this example used a continuous treatment variable, in practice the method works for any kind of treatment variable including binary treatments. On the other hand, the outcome variable technically needs to be a continuous one since we are operating in the OLS framework. However, the method can still be used even with a binary outcome if we model it using OLS (this is called a linear probability model, or LPM [2]).
The possibility that our effect estimate may be biased due to unobserved confounding is a common danger in observational studies. Despite this potential danger, observational studies are a vital tool in data science because randomization simply isn’t feasible in many cases. Therefore, it is important to know how we can address the issue of unobserved confounding by running sensitivity analyses to see how robust our estimates are to such potential confounding.
The robustness value method by Cinelli and Hazlett discussed in this post is a simple and intuitive approach to sensitivity analysis formulated in a familiar linear model framework. If you are interested in learning more about the method, I highly recommend taking a look at the original paper and the package documentation where you can learn about many more interesting applications of the method such as ‘extreme scenario’ analysis.
There are also many other approaches to sensitivity analysis for unobserved confounding, and I would like to briefly mention some of them here for readers who would like to continue learning more on this topic. One versatile technique is the E-value developed by VanderWeele and Ding that formulates the problem in terms of risk ratios [3] (implemented in R here). Another technique is the Austen plot developed by Veitch and Zaveri based on the concepts of partial R² and propensity score [4] (implemented in Python here), and yet another recent approach is by Chernozhukov et al. [5] (implemented in Python here).
I would like to thank Chad Hazlett for answering my question related to using the method with binary outcomes and Xinyi Zhang for providing a lot of valuable feedback on the post. Unless otherwise noted, all images are by the author.
[1] C. Cinelli and C. Hazlett, Making Sense of Sensitivity: Extending Omitted Variable Bias (2019), Journal of the Royal Statistical Society
[2] J. Murray, Linear Probability Model, Murray’s personal website
[3] T. VanderWeele and P. Ding, Sensitivity Analysis in Observational Research: Introducing the E-Value (2017), Annals of Internal Medicine
[4] V. Veitch and A. Zaveri, Sense and Sensitivity Analysis: Simple Post-Hoc Analysis of Bias Due to Unobserved Confounding (2020), NeurIPS
[5] V. Chernozhukov, C. Cinelli, W. Newey, A. Sharma, and V. Syrgkanis, Long Story Short: Omitted Variable Bias in Causal Machine Learning (2022), NBER