Journal article
Inferring heterozygosity from ancient and low coverage genomes
- Kousathanas, Athanasios
  Department of Biology and Biochemistry, University of Fribourg, Switzerland; Statistical and Computational Evolutionary Biology Group, Swiss Institute of Bioinformatics, Fribourg, Switzerland
- Leuenberger, Christoph
  Department of Mathematics, University of Fribourg, Switzerland
- Link, Vivian
  Department of Biology and Biochemistry, University of Fribourg, Switzerland; Statistical and Computational Evolutionary Biology Group, Swiss Institute of Bioinformatics, Fribourg, Switzerland
- Sell, Christian
  Paleogenetics Group, University of Mainz, Germany
- Burger, Joachim
  Paleogenetics Group, University of Mainz, Germany
- Wegmann, Daniel
  Department of Biology and Biochemistry, University of Fribourg, Switzerland; Statistical and Computational Evolutionary Biology Group, Swiss Institute of Bioinformatics, Fribourg, Switzerland
Published in:
- Genetics. - 2017, vol. 205, no. 1, p. 317–332
English
While genetic diversity can be quantified accurately from high coverage sequencing data, it is often desirable to obtain such estimates from data with low coverage, either to save costs or because of low DNA quality, as is observed for ancient samples. Here, we introduce a method to accurately infer heterozygosity probabilistically from sequences with average coverage [embedded image] of a single individual. The method relaxes the infinite sites assumption of previous methods, does not require a reference sequence, except for the initial alignment of the sequencing data, and takes into account both variable sequencing errors and potential postmortem damage. It is thus also applicable to nonmodel organisms and ancient genomes. Since error rates as reported by sequencing machines are generally distorted and require recalibration, we also introduce a method to accurately infer recalibration parameters in the presence of postmortem damage. This method does not require knowledge about the underlying genome sequence, but instead works with haploid data (e.g., from the X-chromosome from mammalian males) and integrates over the unknown genotypes. Using extensive simulations, we show that a few megabasepairs of haploid data are sufficient for accurate recalibration, even at average coverages as low as [embedded image]. At similar coverages, our method also produces very accurate estimates of heterozygosity down to [embedded image] within windows of about 1 Mbp. We further illustrate the usefulness of our approach by inferring genome-wide patterns of diversity for several ancient human samples, and we found that 3000–5000-year-old samples showed diversity patterns comparable to those of modern humans. In contrast, two European hunter-gatherer samples exhibited not only considerably lower levels of diversity than modern samples, but also highly distinct distributions of diversity along their genomes. Interestingly, these distributions were also very different between the two samples, supporting earlier conclusions of a highly diverse and structured population in Europe prior to the arrival of farming.
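To illustrate the general idea of inferring heterozygosity probabilistically from low-coverage reads by integrating over unknown genotypes, here is a minimal Python sketch. It is not the method described in the abstract: it assumes biallelic sites, a single known and symmetric error rate, and no postmortem damage, and it estimates only the fraction of heterozygous sites with a simple EM iteration. The function names (`site_likelihoods`, `estimate_heterozygosity`, `simulate_counts`) and all parameter values are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter


def site_likelihoods(a, b, error_rate):
    """Return (P(reads | homozygous), P(reads | heterozygous)) for one site.

    a and b are the numbers of reads showing each of the two alleles.
    """
    # Homozygous: either allele is the true one with equal weight, and
    # every read showing the other allele is a sequencing error.
    hom = 0.5 * ((1 - error_rate) ** a * error_rate ** b
                 + error_rate ** a * (1 - error_rate) ** b)
    # Heterozygous: each read shows either allele with probability 1/2
    # (a symmetric error rate leaves this probability unchanged).
    het = 0.5 ** (a + b)
    return hom, het


def estimate_heterozygosity(counts, error_rate, n_iter=200, h_init=0.01):
    """EM estimate of the fraction of heterozygous sites.

    counts is a list of (a, b) read-count pairs, one pair per site.
    """
    patterns = Counter(counts)  # identical sites share one likelihood
    n_sites = sum(patterns.values())
    if n_sites == 0:
        raise ValueError("need at least one site")
    liks = {p: site_likelihoods(p[0], p[1], error_rate) for p in patterns}
    h = h_init
    for _ in range(n_iter):
        expected_het = 0.0
        for pattern, n in patterns.items():
            hom, het = liks[pattern]
            # E-step: posterior probability that such a site is heterozygous.
            expected_het += n * (h * het) / ((1 - h) * hom + h * het)
        # M-step: the new estimate is the mean posterior over all sites.
        h = expected_het / n_sites
    return h


def simulate_counts(n_sites, depth, h, error_rate, seed=0):
    """Simulate (a, b) read counts under the same toy model."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_sites):
        if rng.random() < h:
            p_a = 0.5                 # heterozygous site
        elif rng.random() < 0.5:
            p_a = 1 - error_rate      # homozygous for the first allele
        else:
            p_a = error_rate          # homozygous for the second allele
        a = sum(rng.random() < p_a for _ in range(depth))
        counts.append((a, depth - a))
    return counts


if __name__ == "__main__":
    data = simulate_counts(n_sites=20000, depth=4, h=0.1, error_rate=0.01)
    print(estimate_heterozygosity(data, error_rate=0.01))
```

The sketch treats the genotype at each site as a latent variable and never calls genotypes explicitly, which is what keeps the estimate usable when only a few reads cover a site. A realistic implementation would additionally need per-base error rates, a model of postmortem damage, and all four nucleotides, none of which are attempted here.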
Faculty
- Faculté des sciences et de médecine
Department
- Département de Biologie
Classification
- Biological sciences
License
- License undefined
Persistent URL
- https://folia.unifr.ch/unifr/documents/305403
Statistics
Document views: 32
File downloads:
- weg_iha_sm.pdf: 75
- weg_iha.pdf: 71