A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State

Graffelman, Jan and Galván Femenía, Iván and de Cid, Rafael and Barceló Vidal, Carles (2019) A Log-Ratio Biplot Approach for Exploring Genetic Relatedness Based on Identity by State. Frontiers in Genetics, 10. ISSN 1664-8021

pubmed-zip/versions/1/package-entries/fgene-10-00341/fgene-10-00341.pdf - Published Version

Abstract

The detection of cryptic relatedness in large population-based cohorts is of great importance in genome research. The usual approach for detecting closely related individuals is to plot allele sharing statistics, based on identity-by-state or identity-by-descent, in a two-dimensional scatterplot. This approach ignores the fact that allele sharing data across individuals are in reality of higher dimensionality, and it also disregards the compositional nature of the underlying counts of shared genotypes. In this paper we develop biplot methodology based on log-ratio principal component analysis that overcomes these restrictions. This leads to entirely new graphics that are particularly useful for exploring relatedness in genetic databases from homogeneous populations. The proposed method can be applied iteratively, acting as a magnifying glass for more remote relationships that are harder to classify. Datasets from the 1000 Genomes Project and the Genomes For Life-GCAT Project are used to illustrate the proposed method. The discriminatory power of the log-ratio biplot approach is compared with that of the classical plots in a simulation study. In a non-inbred homogeneous population, the log-ratio principal component approach achieves a higher classification rate than the classical graphics across the whole allele frequency spectrum, using only identity by state. Under these circumstances, simulations show that with 35,000 independent bi-allelic variants, log-ratio principal component analysis combined with discriminant analysis can correctly classify relationships up to and including the fourth degree.

Item Type: Article
Subjects: South Asian Library > Medical Science
Depositing User: Unnamed user with email support@southasianlibrary.com
Date Deposited: 06 Mar 2023 10:16
Last Modified: 16 Jul 2024 08:37
URI: http://journal.repositoryarticle.com/id/eprint/237
