
Reconstructing genotypes in private genomic databases from genetic risk scores
Paige, Brooks, Bell, James, Bellet, Aurélien, Gascon, Adrià and Ezer, Daphne (2021) Reconstructing genotypes in private genomic databases from genetic risk scores. Journal of Computational Biology, 28 (5). pp. 435-451. doi:10.1089/cmb.2020.0445 ISSN 1557-8666.
PDF: cmb.2020.0445.pdf - Published Version. Available under License Creative Commons Attribution 4.0.
Official URL: https://doi.org/10.1089/cmb.2020.0445
Abstract
Some organizations, such as 23andMe and the UK Biobank, have large genomic databases that they re-use for multiple genome-wide association studies. Even research studies that compile smaller genomic databases often use these databases to investigate many related traits. It is common for a study to report a genetic risk score (GRS) model for each trait within the publication. Here, we show that under some circumstances, these GRS models can be used to recover the genetic variants of individuals in these genomic databases (a reconstruction attack). In particular, if two GRS models are trained by using a largely overlapping set of participants, it is often possible to determine the genotype of each individual who was used to train one GRS model but not the other. We demonstrate this theoretically and experimentally by analyzing the Cornell Dog Genome database. The accuracy of our reconstruction attack depends on how accurately we can estimate the rate of co-occurrence of pairs of single nucleotide polymorphisms within the private database, so if this aggregate information is ever released, it would drastically reduce the security of a private genomic database. Caution should be applied when using the same database for multiple analyses, especially when a small number of individuals are included in or excluded from one part of the study.
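The idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's algorithm; it rests on several simplifying assumptions that do not come from this record: both GRS models are ordinary least-squares fits without an intercept, the two training sets differ by exactly one individual, genotypes are coded as 0, 1, or 2, and the attacker knows the SNP co-occurrence matrix of the shared participants exactly. All function names and data values are hypothetical. Under these assumptions, the co-occurrence matrix multiplied by the difference between the two coefficient vectors is proportional to the extra individual's genotype.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination using exact rationals."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def gram(X):
    """X^T X: co-occurrence counts for every pair of SNPs."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]

def fit_grs(X, y):
    """Toy GRS model: least-squares coefficients (assumed model, no intercept)."""
    p = len(X[0])
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(gram(X), Xty)

def reconstruct(A, beta_without, beta_with):
    """Return every 0/1/2 genotype consistent with the two published models.

    A is the co-occurrence matrix of the shared participants. The normal
    equations give A (beta_with - beta_without) = c * x, where x is the extra
    individual's genotype and c is that individual's residual (unknown).
    """
    delta = [b2 - b1 for b1, b2 in zip(beta_without, beta_with)]
    direction = [sum(a * d for a, d in zip(row, delta)) for row in A]
    k = max(range(len(direction)), key=lambda i: abs(direction[i]))
    if direction[k] == 0:
        return []  # the extra individual left no trace in the coefficients
    candidates = []
    for largest in (1, 2):  # the largest genotype value is either 1 or 2
        scale = direction[k] / largest
        cand = [d / scale for d in direction]
        if all(v in (0, 1, 2) for v in cand):
            candidates.append([int(v) for v in cand])
    return candidates

# Hypothetical data: six shared participants, three SNPs.
X_shared = [[2, 0, 1], [1, 1, 0], [0, 2, 1], [1, 0, 2], [0, 1, 1], [2, 1, 0]]
y_shared = [3, 1, 4, 1, 5, 9]
x_target, y_target = [1, 2, 0], 2  # individual present in only one study

beta_without = fit_grs(X_shared, y_shared)
beta_with = fit_grs(X_shared + [x_target], y_shared + [y_target])

candidates = reconstruct(gram(X_shared), beta_without, beta_with)
print(candidates)  # [[1, 2, 0]]
```

In this sketch the genotype is recovered only up to scale, so an individual whose genotype contains no 1, or no 2, yields two candidates rather than one. The exact co-occurrence matrix is also the strongest assumption here: the abstract notes that the accuracy of the attack depends on how well this quantity can be estimated.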
Item Type: | Journal Article
---|---
Subjects: | Q Science > QH Natural history > QH426 Genetics
Divisions: | Faculty of Science, Engineering and Medicine > Science > Life Sciences (2010- ) > Biological Sciences ( -2010)
SWORD Depositor: | Library Publications Router
Library of Congress Subject Headings (LCSH): | Genomics -- Data processing, Genetic screening, Human genome -- Research -- Moral and ethical aspects, Genomics -- Law and legislation, Privacy, Right of
Journal or Publication Title: | Journal of Computational Biology
Publisher: | Mary Ann Liebert Inc
ISSN: | 1557-8666
Official Date: | 1 May 2021
Volume: | 28
Number: | 5
Page Range: | pp. 435-451
DOI: | 10.1089/cmb.2020.0445
Status: | Peer Reviewed
Publication Status: | Published
Access rights to Published version: | Open Access (Creative Commons)
Date of first compliant deposit: | 27 April 2022
Date of first compliant Open Access: | 27 April 2022
Is Part Of: | 1