Methods to increase reproducibility in differential gene expression via meta-analysis
Article 2016 en
Authors
Timothy E. Sweeney
Winston Haynes
Francesco Vallania
Abstract
Findings from clinical and biological studies are often not reproducible when tested in independent cohorts. Results from whole-genome expression studies are particularly prone to this problem because they test a large number of hypotheses on relatively small samples. Compared to single-study analysis, gene expression meta-analysis can improve reproducibility by integrating data from multiple studies. However, designing and carrying out a meta-analysis involves many choices, and clear guidelines on best practices are scarce. Here, we hypothesized that studying subsets of very large meta-analyses would allow for systematic identification of best practices to improve reproducibility. We therefore constructed three very large gene expression meta-analyses from clinical samples, and then examined meta-analyses of subsets of the datasets (all combinations of datasets with up to N/2 samples and K/2 datasets), comparing each to a 'silver standard' of differentially expressed genes found in the entire cohort. We tested three random-effects meta-analysis models using this procedure. We found relatively greater reproducibility when more-stringent effect size thresholds were combined with relaxed significance thresholds; relatively lower reproducibility when extraneous constraints were imposed on residual heterogeneity; and an underestimation of the actual false positive rate by Benjamini-Hochberg correction. In addition, multivariate regression showed that the accuracy of a meta-analysis increased significantly with more included datasets, even when controlling for sample size.
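To make the filtering step concrete, here is a minimal Python sketch of one way a random-effects meta-analysis with combined effect size and significance thresholds could be set up. The abstract does not name the three random-effects models tested, so the DerSimonian-Laird estimator used here is an assumption. The function names, the toy inputs, and the threshold values (`min_effect`, `max_fdr`) are hypothetical and chosen only for illustration; they are not the authors' settings.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool per-study effect sizes for one gene (DerSimonian-Laird, assumed model).

    Returns (pooled_effect, standard_error, tau2, two_sided_p).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                     # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures between-study heterogeneity
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance tau2
    w_star = [1.0 / (v + tau2) for v in variances]
    sw_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sw_star
    se = math.sqrt(1.0 / sw_star)
    p = math.erfc(abs(pooled / se) / math.sqrt(2.0))     # two-sided normal p-value
    return pooled, se, tau2, p


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(gene_stats, min_effect=0.8, max_fdr=0.1):
    """Keep genes passing both thresholds (illustrative values, not the authors')."""
    names = list(gene_stats)
    results = [dersimonian_laird(*gene_stats[g]) for g in names]
    fdr = benjamini_hochberg([r[3] for r in results])
    return [g for g, r, q in zip(names, results, fdr)
            if abs(r[0]) >= min_effect and q <= max_fdr]


# Hypothetical toy input: gene -> (per-study effect sizes, per-study variances)
toy = {
    "GENE_A": ([1.0, 1.0, 1.0], [0.04, 0.04, 0.04]),
    "GENE_B": ([0.0, 2.0], [0.1, 0.1]),
}
selected = select_genes(toy)
```

In this toy input, `GENE_A` has consistent effects across studies, while `GENE_B` has a similar pooled effect but large between-study heterogeneity, which widens its standard error. Raising `min_effect` while relaxing `max_fdr` corresponds to the kind of threshold trade-off the abstract describes, though the specific values here are placeholders.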