MCAR is not necessary for the complete cases to constitute a simple random subsample of the target sample

abstract

  • Missing data is the norm rather than the exception in complex epidemiological studies. Complete-case analyses, which discard all subjects with some data values missing, are known to be valid under the very restrictive assumption that the response mechanism is missing completely at random (MCAR). While conditions weaker than MCAR are known under which estimators of regression coefficients are unbiased, one often comes across the view in the literature that MCAR is necessary for the complete cases to form a simple random subsample of the target sample. In this paper, we explain why this is not the case, and we distill an assumption weaker than MCAR under which the simple random subsample condition holds, which we call available at random (AAR). Moreover, we show that, unlike MCAR, AAR response mechanisms can be missing not at random (MNAR). We also suggest how approximate AAR mechanisms might arise in practice through cancellation of selection and drop-out effects, and we conclude that before pooling partially complete and complete cases into an analysis, the investigator should consider how selection might impact on the representativeness of the cases included in the pooled analysis (compared to those comprising the complete cases only).
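The central claim can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: a hypothetical two-variable sample (`x`, `y`), a data-independent coin flip that decides whether a subject is a complete case, and, among the incomplete subjects, a rule that makes `x` more likely to be missing when `x` is large. The function names, probabilities, and sample size are all illustrative. Under this setup the missingness pattern depends on an unobserved value (so the mechanism is MNAR), yet the complete cases should still behave like a simple random subsample.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Generate (x, y, pattern) rows under a hypothetical response mechanism."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 0.5 * x + rng.gauss(0.0, 1.0)
        # Assumption: completeness is a coin flip that ignores the data,
        # so the complete cases are a simple random subsample.
        if rng.random() < 0.5:
            pattern = "complete"
        else:
            # Assumption: among incomplete subjects, which variable is
            # missing depends on x itself, a value that is then unobserved.
            p_x_missing = 0.9 if x > 0 else 0.1
            pattern = "x_missing" if rng.random() < p_x_missing else "y_missing"
        rows.append((x, y, pattern))
    return rows


def mean_x(rows, pattern=None):
    """Mean of the true x values, optionally restricted to one pattern."""
    return statistics.fmean(x for x, _, p in rows if pattern is None or p == pattern)


rows = simulate()
full_mean = mean_x(rows)                     # whole target sample
complete_mean = mean_x(rows, "complete")     # complete cases only
x_missing_mean = mean_x(rows, "x_missing")   # subjects whose x would be unobserved

print(f"full sample mean of x:     {full_mean:.3f}")
print(f"complete-case mean of x:   {complete_mean:.3f}")
print(f"mean of x where x missing: {x_missing_mean:.3f}")
```

In this sketch the complete-case mean should track the full-sample mean closely, while the subjects with `x` missing have clearly larger true `x` values, which is what rules out MCAR here.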

publication date

  • 2016