• Effect modification means the magnitude or stability of a treatment effect varies as a function of an observed covariate. Generally, larger and more stable treatment effects are insensitive to larger biases from unmeasured covariates, so a causal conclusion may be considerably firmer if this pattern is noted when it occurs. We propose a new strategy, called the submax-method, that combines exploratory and confirmatory efforts to determine whether there is stronger evidence of causality, that is, greater insensitivity to unmeasured confounding, in some subgroups of individuals. It uses the joint distribution of test statistics that split the data in various ways based on certain observed covariates. For $L$ binary covariates, the method splits the population $L$ times into two subpopulations, perhaps first men and women, perhaps then smokers and nonsmokers, computing a test statistic from each subpopulation, and appends the test statistic for the whole population, making $2L+1$ test statistics in total. Although $L$ binary covariates define $2^{L}$ interaction groups, only $2L+1$ tests are performed, and at least $L+1$ of these tests use at least half of the data. The submax-method achieves the highest design sensitivity and the highest Bahadur efficiency of its component tests. Moreover, the form of the test is sufficiently tractable that its large sample power may be studied analytically. A simulation suggests that the submax-method exhibits superior performance, in comparison with an approach using CART, when there is effect modification of moderate size. An observational study of the effects of physical activity on survival, using data from the NHANES I Epidemiologic Follow-Up Survey, illustrates the method. The method is implemented in the $\texttt{R}$ package $\texttt{submax}$, which contains the NHANES example.
  • Two problems that arise in making causal inferences for non-mortality outcomes such as bronchopulmonary dysplasia (BPD) are unmeasured confounding and censoring by death, i.e., the outcome is only observed when subjects survive. In randomized experiments with noncompliance, instrumental variable methods can be used to control for the unmeasured confounding without censoring by death. But when there is censoring by death, the average causal treatment effect cannot be identified under the usual assumptions; it can instead be studied for a specific subpopulation by using sensitivity analysis with additional assumptions. However, in observational studies, evaluation of the local average treatment effect (LATE) in censoring by death problems with unmeasured confounding is not well studied. We develop a novel sensitivity analysis method based on instrumental variable models for studying the LATE. Specifically, we present the identification results under an additional assumption, and propose a three-step procedure for the LATE estimation. We also propose an improved two-step procedure that simultaneously estimates the instrument propensity score (i.e., the probability of the instrument given covariates) and the parameters induced by the assumption. We show in simulation studies that the two-step procedure can be more robust and efficient than the three-step procedure. Finally, we apply our sensitivity analysis methods to a study of the effect of delivery at high-level neonatal intensive care units on the risk of BPD.
  • Studies have shown that exposure to air pollution, even at low levels, significantly increases mortality. As regulatory actions are becoming prohibitively expensive, robust evidence to guide the development of targeted interventions to reduce air pollution exposure is needed. In this paper, we introduce a novel statistical method that splits the data into two subsamples: (a) using the first subsample, we consider a data-driven search for $\textit{de novo}$ discovery of subgroups that could have exposure effects that differ from the population mean; and then (b) using the second subsample, we quantify evidence of effect modification among the subgroups with nonparametric randomization-based tests. We also develop a sensitivity analysis method to assess the robustness of the conclusions to unmeasured confounding bias. Via simulation studies and theoretical arguments, we demonstrate that since we discover the subgroups in the first subsample, hypothesis testing on the second subsample can focus on these subgroups only, thus substantially increasing the statistical power of the test. We apply our method to the data of 1,612,414 Medicare beneficiaries in the New England region of the United States for the period 2000 to 2006. We find that seniors aged 81 to 85 with low income and seniors aged above 85 have statistically significantly higher causal effects of exposure to PM$_{2.5}$ on the 5-year mortality rate compared to the population mean.
  • In observational studies, the causal effect of a treatment on the distribution of outcomes is of interest beyond the average treatment effect. Instrumental variable methods allow for causal inference by controlling for unmeasured confounding. The existing nonparametric method for estimating the effect of the treatment on the distribution of outcomes for compliers has several drawbacks, such as producing estimates that violate the non-decreasing and non-negative properties of cumulative distribution functions. In this paper, we propose a novel nonparametric composite likelihood approach, referred to as the binomial likelihood (BL) method, which overcomes the limitations of the previous techniques and utilizes the advantage of likelihood methods. We show the consistency of the maximum binomial likelihood (MBL) estimators and derive their asymptotic distributions. Next, we develop a computationally efficient algorithm for computing the MBL estimates by combining the expectation-maximization (EM) algorithm and the pool-adjacent-violators algorithm (PAVA). Moreover, the BL method can be used to construct a binomial likelihood-ratio test (BLRT) for the null hypothesis of no distributional treatment effect. An asymptotic expansion of the BLRT is derived, and the performance of the BL method is demonstrated in simulation studies. Finally, we apply our method to a study of the effect of Vietnam veteran status on the distribution of civilian annual earnings.
  • There is effect modification if the magnitude or stability of a treatment effect varies systematically with the level of an observed covariate. A larger or more stable treatment effect is typically less sensitive to bias from unmeasured covariates, so it is important to recognize effect modification when it is present. We illustrate a recent proposal for conducting a sensitivity analysis that empirically discovers effect modification by exploratory methods, but controls the family-wise error rate in discovered groups. The example concerns a study of mortality and use of the intensive care unit in 23,715 matched pairs of two Medicare patients, one of whom underwent surgery at a hospital identified for superior nursing, the other at a conventional hospital. The pairs were matched exactly for 130 four-digit ICD-9 surgical procedure codes and balanced 172 observed covariates. The pairs were then split into five groups of pairs by CART in its effort to locate effect modification. The evidence of a beneficial effect of magnet hospitals on mortality is least sensitive to unmeasured biases in a large group of patients undergoing rather serious surgical procedures, but in the absence of other life-threatening conditions, such as a comorbidity of congestive heart failure or an emergency admission leading to surgery.
  • Malaria is a parasitic disease that is a major health problem in many tropical regions. The most characteristic symptom of malaria is fever. The fraction of fevers that are attributable to malaria, the malaria attributable fever fraction (MAFF), is an important public health measure for assessing the effect of malaria control programs and other purposes. Estimating the MAFF is not straightforward because there is no gold standard diagnosis of a malaria attributable fever; an individual can have malaria parasites in her blood and a fever, but the individual may have developed partial immunity that allows her to tolerate the parasites and the fever is being caused by another infection. We define the MAFF using the potential outcome framework for causal inference and show what assumptions underlie current estimation methods. Current estimation methods rely on an assumption that the parasite density is correctly measured. However, this assumption does not generally hold because (i) fever kills some parasites and (ii) the measurement of parasite density has measurement error. In the presence of these problems, we show current estimation methods do not perform well. We propose a novel maximum likelihood estimation method based on exponential family g-modeling. Under the assumption that the measurement error mechanism and the magnitude of the fever killing effect are known, we show that our proposed method provides approximately unbiased estimates of the MAFF in simulation studies. A sensitivity analysis can be used to assess the impact of different magnitudes of fever killing and different measurement error mechanisms. We apply our proposed method to estimate the MAFF in Kilombero, Tanzania.
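
The subgroup bookkeeping described in the submax-method abstract above (splitting on each of $L$ binary covariates in turn, then appending the whole population) can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation in the $\texttt{submax}$ package: the function name `submax_groups`, the simulated covariates, and the sample size are all assumptions made for the example, and no test statistics or joint distribution are computed here.

```python
import random


def submax_groups(covariates):
    """Return the 2L+1 index sets used for testing (hypothetical helper).

    covariates: list of rows, each a tuple of L binary (0/1) values.
    The result maps a group label to the list of row indices in that group.
    """
    n = len(covariates)
    L = len(covariates[0])
    # The whole population is always one of the groups.
    groups = {"all": list(range(n))}
    # Each covariate splits the population once into two subpopulations.
    for j in range(L):
        for level in (0, 1):
            label = f"x{j}={level}"
            groups[label] = [i for i in range(n) if covariates[i][j] == level]
    return groups


# Simulated binary covariates (an assumption for illustration only).
random.seed(0)
L = 3
rows = [tuple(random.randint(0, 1) for _ in range(L)) for _ in range(200)]

groups = submax_groups(rows)
num_tests = len(groups)  # 2L + 1 groups, one test statistic per group
# Groups that contain at least half of the data.
num_large = sum(1 for idx in groups.values() if 2 * len(idx) >= len(rows))
# Distinct covariate patterns actually observed; at most 2**L.
num_cells = len(set(rows))

print(num_tests, num_large, num_cells)
```

For each covariate, at least one of its two levels must hold at least half of the rows, which together with the whole population gives the "at least $L+1$" count stated in the abstract.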