Abstract
We thank Witte and Visscher1 for their very interesting comments. As we discussed in our article,2 there are many alternative approaches within our conceptual framework for how exactly to correct for misclassification. We used the simplest and most straightforward one for illustrative purposes, using a plain cutoff value assignment for reclassifying apparent controls to cases.2 We are more interested in promoting the concept than any particular technical method by which it could be implemented. In this regard, several options proposed by Witte and Visscher1 are certainly possible. We agree that it would be useful to compare the performance of these various correction options in future studies. However, we suspect that improvements, if any, would be incremental rather than major. This situation is fairly equivalent to the use of imputed genotypes for genetic variants that have not been directly genotyped.3,4 Each genotype is imputed with a probability, not full certainty. Including these variants with a probabilistic weight (eg, proportional to the accuracy of their imputation) is in theory preferable to simply using a cutoff value and assigning them the most likely genotype. However, differences between these approaches have been minimal in practice, as shown in the large number of studies using imputations. Overall, a balance between computational simplicity and incremental precision should be considered.
Correction for events that happen during follow-up may also be done with various techniques for modeling time-to-event processes. We do not agree with Witte and Visscher1 that death provides a good analogy to the future incidence of other phenotypes, such as age-related macular degeneration. Death is a universal outcome: everybody dies sooner or later. It is unlikely, however, that everyone would develop advanced age-related macular degeneration even if follow-up could be extended to the current limits of human longevity.
For most common phenotypes to which our method may be applied, it is similarly unlikely that the phenotype would eventually occur in all persons if they lived long enough. Phenotypes would become more common, but not ubiquitous. We think that in discovering genetic variants it is useful to assume an average life expectancy, so as to discover risk factors that would make a difference in a typical lifetime. Moreover, in applying our method to different phenotypes, one has to consider the incidence density of the phenotype over age. For example, if a disease always develops before 50 years of age, then clearly there is no need to correct for misclassification if all enrolled participants are 50 years old or older. Conversely, if all disease develops between age 50 and 80 years, then using an uncorrected control participant aged 50 years is a poor choice. For discovery purposes, we argue that if a control has not developed the phenotype simply because he or she is currently 50 years old, but is expected to develop the phenotype with sufficiently high certainty (based on predictive modeling) by age 80 years, it is worth correcting this. As our simulations show, if the predictive model has high discriminating ability and the phenotype is common, the correction will be useful. We agree with Witte and Visscher1 that juxtaposing the predicted status versus the observed status could also have collateral benefits, such as the documentation of diagnostic errors in both research and clinical applications. In fact (and in contrast to age-related macular degeneration), the misclassification of cases into apparent controls for most phenotypes may be primarily due not to insufficient follow-up but rather to inaccurate diagnostic tests or suboptimal definitions of what constitutes disease. For many common diseases, diagnosis is not easy, there is a lack of consensus among experts, or there are diverse sets of imperfect diagnostic criteria.
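The two correction options contrasted above, a plain cutoff assignment and a probabilistic weight, can be illustrated with a minimal sketch. This is only an illustration under stated assumptions, not the implementation used in our article: the function names, the example risks, and the cutoff of 0.9 are all hypothetical, and the predicted risks are assumed to come from some separate predictive model.

```python
from typing import List


def reclassify_by_cutoff(observed: List[int],
                         predicted_risk: List[float],
                         cutoff: float = 0.9) -> List[int]:
    """Hard reassignment: an apparent control (0) whose predicted risk
    reaches the cutoff is relabeled as a case (1). Observed cases stay cases.
    The default cutoff of 0.9 is an arbitrary, hypothetical choice."""
    return [1 if status == 1 or risk >= cutoff else 0
            for status, risk in zip(observed, predicted_risk)]


def probabilistic_status(observed: List[int],
                         predicted_risk: List[float]) -> List[float]:
    """Soft alternative: observed cases keep weight 1.0, and each apparent
    control carries its predicted risk as a fractional case weight."""
    return [1.0 if status == 1 else risk
            for status, risk in zip(observed, predicted_risk)]


# Hypothetical data: one observed case and three apparent controls.
observed = [1, 0, 0, 0]
predicted_risk = [0.95, 0.92, 0.40, 0.05]

hard = reclassify_by_cutoff(observed, predicted_risk, cutoff=0.9)
soft = probabilistic_status(observed, predicted_risk)

print(hard)  # [1, 1, 0, 0]
print(soft)  # [1.0, 0.92, 0.4, 0.05]
```

In this toy example only the second participant, an apparent control with a predicted risk of 0.92, changes status under the cutoff rule, whereas the probabilistic version retains the uncertainty for every control.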
We agree with Witte and Visscher that correcting for misclassification will not work if the model is not sufficiently accurate or if it is wrong, but we are not totally pessimistic about the prospects of predictive models. Models using genomic information alone may be unlikely to reach extremely high levels of discrimination for most phenotypes, with the exception of those with high heritability.5 However, composite models using both genomic and nongenomic information may have a better prospect. Age-related macular degeneration is probably the forerunner in this regard, and we suspect that similarly accurate predictive models will become possible in the near future for other diseases.