Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks. / Appadurai, Vivek; Bybjerg-Grauholm, Jonas; Krebs, Morten Dybdahl; Rosengren, Anders; Buil, Alfonso; Ingason, Andrés; Mors, Ole; Børglum, Anders D.; Hougaard, David M.; Nordentoft, Merete; Mortensen, Preben B.; Delaneau, Olivier; Werge, Thomas; Schork, Andrew J.

In: Communications Biology, Vol. 6, 101, 2023.


Harvard

Appadurai, V, Bybjerg-Grauholm, J, Krebs, MD, Rosengren, A, Buil, A, Ingason, A, Mors, O, Børglum, AD, Hougaard, DM, Nordentoft, M, Mortensen, PB, Delaneau, O, Werge, T & Schork, AJ 2023, 'Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks', Communications Biology, vol. 6, 101. https://doi.org/10.1038/s42003-023-04477-y

APA

Appadurai, V., Bybjerg-Grauholm, J., Krebs, M. D., Rosengren, A., Buil, A., Ingason, A., Mors, O., Børglum, A. D., Hougaard, D. M., Nordentoft, M., Mortensen, P. B., Delaneau, O., Werge, T., & Schork, A. J. (2023). Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks. Communications Biology, 6, [101]. https://doi.org/10.1038/s42003-023-04477-y

Vancouver

Appadurai V, Bybjerg-Grauholm J, Krebs MD, Rosengren A, Buil A, Ingason A et al. Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks. Communications Biology. 2023;6:101. https://doi.org/10.1038/s42003-023-04477-y

Author

Appadurai, Vivek ; Bybjerg-Grauholm, Jonas ; Krebs, Morten Dybdahl ; Rosengren, Anders ; Buil, Alfonso ; Ingason, Andrés ; Mors, Ole ; Børglum, Anders D. ; Hougaard, David M. ; Nordentoft, Merete ; Mortensen, Preben B. ; Delaneau, Olivier ; Werge, Thomas ; Schork, Andrew J. / Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks. In: Communications Biology. 2023; Vol. 6.

Bibtex

@article{4e706f67afae45e6940a71ece9f07562,
title = "Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks",
abstract = "Sample recruitment for research consortia, biobanks, and personal genomics companies span years, necessitating genotyping in batches, using different technologies. As marker content on genotyping arrays varies, integrating such datasets is non-trivial and its impact on haplotype estimation (phasing) and whole genome imputation, necessary steps for complex trait analysis, remains under-evaluated. Using the iPSYCH dataset, comprising 130,438 individuals, genotyped in two stages, on different arrays, we evaluated phasing and imputation performance across multiple phasing methods and data integration protocols. While phasing accuracy varied by choice of method and data integration protocol, imputation accuracy varied mostly between data integration protocols. We demonstrate an attenuation in imputation accuracy within samples of non-European origin, highlighting challenges to studying complex traits in diverse populations. Finally, imputation errors can bias association tests, reduce predictive utility of polygenic scores. Carefully optimized data integration strategies enhance accuracy and replicability of complex trait analyses in complex biobanks.",
author = "Vivek Appadurai and Jonas Bybjerg-Grauholm and Krebs, {Morten Dybdahl} and Anders Rosengren and Alfonso Buil and Andr{\'e}s Ingason and Ole Mors and B{\o}rglum, {Anders D.} and Hougaard, {David M.} and Merete Nordentoft and Mortensen, {Preben B.} and Olivier Delaneau and Thomas Werge and Schork, {Andrew J.}",
note = "Publisher Copyright: {\textcopyright} 2023, The Author(s).",
year = "2023",
doi = "10.1038/s42003-023-04477-y",
language = "English",
volume = "6",
journal = "Communications Biology",
issn = "2399-3642",
publisher = "Nature Publishing Group",

}

RIS

TY - JOUR

T1 - Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks

AU - Appadurai, Vivek

AU - Bybjerg-Grauholm, Jonas

AU - Krebs, Morten Dybdahl

AU - Rosengren, Anders

AU - Buil, Alfonso

AU - Ingason, Andrés

AU - Mors, Ole

AU - Børglum, Anders D.

AU - Hougaard, David M.

AU - Nordentoft, Merete

AU - Mortensen, Preben B.

AU - Delaneau, Olivier

AU - Werge, Thomas

AU - Schork, Andrew J.

N1 - Publisher Copyright: © 2023, The Author(s).

PY - 2023

Y1 - 2023

N2 - Sample recruitment for research consortia, biobanks, and personal genomics companies span years, necessitating genotyping in batches, using different technologies. As marker content on genotyping arrays varies, integrating such datasets is non-trivial and its impact on haplotype estimation (phasing) and whole genome imputation, necessary steps for complex trait analysis, remains under-evaluated. Using the iPSYCH dataset, comprising 130,438 individuals, genotyped in two stages, on different arrays, we evaluated phasing and imputation performance across multiple phasing methods and data integration protocols. While phasing accuracy varied by choice of method and data integration protocol, imputation accuracy varied mostly between data integration protocols. We demonstrate an attenuation in imputation accuracy within samples of non-European origin, highlighting challenges to studying complex traits in diverse populations. Finally, imputation errors can bias association tests, reduce predictive utility of polygenic scores. Carefully optimized data integration strategies enhance accuracy and replicability of complex trait analyses in complex biobanks.

AB - Sample recruitment for research consortia, biobanks, and personal genomics companies span years, necessitating genotyping in batches, using different technologies. As marker content on genotyping arrays varies, integrating such datasets is non-trivial and its impact on haplotype estimation (phasing) and whole genome imputation, necessary steps for complex trait analysis, remains under-evaluated. Using the iPSYCH dataset, comprising 130,438 individuals, genotyped in two stages, on different arrays, we evaluated phasing and imputation performance across multiple phasing methods and data integration protocols. While phasing accuracy varied by choice of method and data integration protocol, imputation accuracy varied mostly between data integration protocols. We demonstrate an attenuation in imputation accuracy within samples of non-European origin, highlighting challenges to studying complex traits in diverse populations. Finally, imputation errors can bias association tests, reduce predictive utility of polygenic scores. Carefully optimized data integration strategies enhance accuracy and replicability of complex trait analyses in complex biobanks.

U2 - 10.1038/s42003-023-04477-y

DO - 10.1038/s42003-023-04477-y

M3 - Journal article

C2 - 36697501

AN - SCOPUS:85146757556

VL - 6

JO - Communications Biology

JF - Communications Biology

SN - 2399-3642

M1 - 101

ER -
