Similar literature
1.
Recombination is an important evolutionary force in bacteria, but it remains challenging to reconstruct the imports that occurred in the ancestry of a genomic sample. Here we present ClonalFrameML, which uses maximum likelihood inference to simultaneously detect recombination in bacterial genomes and account for it in phylogenetic reconstruction. ClonalFrameML can analyse hundreds of genomes in a matter of hours, and we demonstrate its usefulness on simulated and real datasets. We find evidence for recombination hotspots associated with mobile elements in Clostridium difficile ST6 and a previously undescribed 310 kb chromosomal replacement in Staphylococcus aureus ST582. ClonalFrameML is freely available at http://clonalframeml.googlecode.com/.

2.
The population genetic structure of Native Hawaiians has yet to be comprehensively studied, and the ancestral origins of Polynesians remain in question. In this study, we utilized high-resolution genome-wide SNP data and mitochondrial genomes of 148 and 160 Native Hawaiians, respectively, to characterize their population structure of the nuclear and mitochondrial genomes, ancestral origins, and population expansion. Native Hawaiians, who self-reported full Native Hawaiian heritage, demonstrated 78% Native Hawaiian, 11.5% European, and 7.8% Asian ancestry with 99% belonging to the B4 mitochondrial haplogroup. The estimated proportions of Native Hawaiian ancestry for those who reported mixed ancestry (i.e. 75% and 50% Native Hawaiian heritage) were found to be consistent with their self-reported heritage. A significant proportion of Melanesian ancestry (mean = 32%) was estimated in 100% self-reported Native Hawaiians in an ADMIXTURE analysis of Asian, Melanesian, and Native Hawaiian populations of K = 2, where K denotes the number of ancestral populations. This notable proportion of Melanesian admixture supports the “Slow-Boat” model of migration of ancestral Polynesian populations from East Asia to the Pacific Islands. In addition, approximately 1,300 years ago a single, strong expansion of the Native Hawaiian population was estimated. By providing important insight into the underlying population structure of Native Hawaiians, this study lays the foundation for future genetic association studies of this U.S. minority population.

3.
Admixture and recombination create populations and genomes with genetic ancestry from multiple source populations. Analyses of genetic ancestry in admixed populations are relevant for trait and disease mapping, studies of speciation, and conservation efforts. Consequently, many methods have been developed to infer genome-average ancestry and to deconvolute ancestry into continuous local ancestry blocks or tracts within individuals. Current methods for local ancestry inference perform well when admixture occurred recently or hybridization is ongoing, or when admixture occurred in the distant past such that local ancestry blocks have fixed in the admixed population. However, methods to infer local ancestry frequencies in isolated admixed populations still segregating for ancestry do not exist. In the current paper, I develop and test a continuous correlated beta process model to fill this analytical gap. The method explicitly models autocorrelations in ancestry frequencies at the population-level and uses discriminant analysis of SNP windows to take advantage of ancestry blocks within individuals. Analyses of simulated data sets show that the method is generally accurate such that ancestry frequency estimates exhibited low root-mean-square error and were highly correlated with the true values, particularly when large (±10 or ±20) SNP windows were used. Along these lines, the proposed method outperformed post hoc inference of ancestry frequencies from a traditional hidden Markov model (i.e., the linkage model in structure), particularly when admixture occurred more distantly in the past with little on-going gene flow or was followed by natural selection. The reliability and utility of the method were further assessed by analyzing genetic ancestry in an admixed human population (Uyghur) and three populations from a hybrid zone between Mus domesticus and M. musculus. Considerable variation in ancestry frequencies was detected within and among chromosomes in the Uyghur, with a large region of excess French ancestry harboring a gene with a known disease association. Similar variation was detected in the mouse hybrid zone, with notable constancy in regions of excess ancestry among admixed populations. By filling what has been an analytical gap, the proposed method should be a useful tool for many biologists. A computer program (popanc), written in C++, has been developed based on the proposed method and is available on-line at http://sourceforge.net/projects/popanc/.

4.
Yongtao Guan. Genetics, 2014, 196(3): 625-642
We present a two-layer hidden Markov model to detect the structure of haplotypes for unrelated individuals. This allows us to model two scales of linkage disequilibrium (one within a group of haplotypes and one between groups), thereby taking advantage of rich haplotype information to infer local ancestry of admixed individuals. Our method outperforms competing state-of-the-art methods, particularly for regions of small ancestral track lengths. Applying our method to Mexican samples in HapMap3, we found two regions on chromosomes 6 and 8 that show significant departure of local ancestry from the genome-wide average. A software package implementing the methods described in this article is freely available at http://bcm.edu/cnrc/mcmcmc.

5.
Although a large part of the global domestic dog population is free-ranging and free-breeding, knowledge of genetic diversity in these free-breeding dogs (FBDs) and their ancestry relations to pure-breed dogs is limited, and the indigenous status of FBDs in Asia is still uncertain. We analyse genome-wide SNP variability of FBDs across Eurasia, and show that they display weak genetic structure and are genetically distinct from pure-breed dogs rather than constituting an admixture of breeds. Our results suggest that modern European breeds originated locally from European FBDs. East Asian and Arctic breeds show closest affinity to East Asian FBDs, and they both represent the earliest branching lineages in the phylogeny of extant Eurasian dogs. Our biogeographic reconstruction of ancestral distributions indicates a gradual westward expansion of East Asian indigenous dogs to the Middle East and Europe through Central and West Asia, providing evidence for a major expansion that shaped the patterns of genetic differentiation in modern dogs. This expansion was probably secondary and could have led to the replacement of earlier resident populations in Western Eurasia. This could explain why earlier studies based on modern DNA suggest East Asia as the region of dog origin, while ancient DNA and archaeological data point to Western Eurasia.

6.

Background

The National Children’s Study (NCS) is a prospective epidemiological study in the USA tasked with identifying a nationally representative sample of 100,000 children, and following them from their gestation until they are 21 years of age. The objective of the study is to measure environmental and genetic influences on growth, development, and health. Determination of the ancestry of these NCS participants is important for assessing the diversity of study participants and for examining the effect of ancestry on various health outcomes.

Results

We estimated the genetic ancestry of a convenience sample of 641 parents enrolled at the 7 original NCS Vanguard sites, by analyzing 30,000 markers on exome arrays, using the 1000 Genomes Project superpopulations as reference populations, and compared this with the measures of self-reported ethnicity and race. For 99% of the individuals, self-reported ethnicity and race agreed with the predicted superpopulation. NCS individuals self-reporting as Asian had genetic ancestry of either South Asian or East Asian groups, while those reporting as either Hispanic White or Hispanic Other had similar genetic ancestry. Of the 33 individuals who self-reported as Multiracial or Non-Hispanic Other, 33% matched the South Asian or East Asian groups, while these groups represented only 4.4% of the other reported categories.

Conclusions

Our data suggest that self-reported ethnicity and race have some limitations in accurately capturing Hispanic and South Asian populations. Overall, however, our data indicate that despite the complexity of the US population, individuals know their ancestral origins, and that self-reported ethnicity and race are a reliable indicator of genetic ancestry.

7.
Estimating admixture histories is crucial for understanding the genetic diversity we see in present-day populations. Allele frequency or phylogeny-based methods are excellent for inferring the existence of admixture or its proportions. However, to estimate admixture times, spatial information from admixed chromosomes of local ancestry or the decay of admixture linkage disequilibrium (ALD) is used. One popular method, implemented in the programs ALDER and ROLLOFF, uses two-locus ALD to infer the time of a single admixture event, but is only able to estimate the time of the most recent admixture event based on this summary statistic. To address this limitation, we derive analytical expressions for the expected ALD in a three-locus system and provide a new statistical method based on these results that is able to resolve more complicated admixture histories. Using simulations, we evaluate the performance of this method on a range of different admixture histories. As an example, we apply the method to the Colombian and Mexican samples from the 1000 Genomes project. The implementation of our method is available at https://github.com/Genomics-HSE/LaNeta.
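
The two-locus idea mentioned in this abstract can be illustrated with a small sketch. It assumes the commonly used single-pulse approximation in which ALD between markers separated by a genetic distance d (in Morgans) decays roughly as A·exp(−t·d) after t generations. The function name, the synthetic noise-free data, and the log-linear least-squares fit are illustrative assumptions; this is not the code of ALDER, ROLLOFF, or LaNeta, and it does not cover the three-locus method.

```python
import math

def estimate_admixture_time(distances_morgans, ald_values):
    """Fit log(ALD) = log(A) - t * d by ordinary least squares and return t.

    Hypothetical helper: assumes a single admixture pulse and positive ALD values.
    """
    xs = list(distances_morgans)
    ys = [math.log(v) for v in ald_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return -slope  # generations since admixture

# Synthetic, noise-free decay curve generated with t = 12 generations.
true_t = 12
distances = [0.005 * k for k in range(1, 21)]           # 0.5 cM to 10 cM
ald = [0.04 * math.exp(-true_t * d) for d in distances]  # amplitude 0.04 is arbitrary
estimated_t = estimate_admixture_time(distances, ald)
```

With real data the curve is noisy and may reflect several admixture events, which is the limitation of the two-locus summary that the abstract sets out to address.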

8.
North Africa has a great diversity of indigenous sheep breeds whose origin is linked to its environmental characteristics and to certain historical events that took place in the region. To date, few genome‐wide studies have been conducted to investigate the population structure of North African indigenous sheep. The objective of the present study was to provide a detailed assessment of the genetic structure and admixture patterns of six Maghreb sheep populations using the Illumina 50K Ovine BeadChip and comparisons with 22 global populations of sheep and mouflon. Regardless of the method of analysis used, patterns of multiple hybridization events were observed within all North African populations, leading to a heterogeneous genetic architecture that varies according to the breed. The Barbarine population showed the lowest genetic heterogeneity and major southwest Asian ancestry, providing additional support to the Asian origin of the North African fat‐tailed sheep. All other breeds presented substantial Merino introgression ranging from 15% for D'man to 31% for Black Thibar. We highlighted several signals of ancestral introgression between North African and southern European sheep. In addition, we identified two opposite gradients of ancestry, southwest Asian and central European, occurring between North Africa and central Europe. Our results provide further evidence of the weak global population structure of sheep resulting from high levels of gene flow among breeds occurring worldwide. At the regional level, signs of recent admixture among North African populations, resulting in a change of the original genomic architecture of minority breeds, were also detected.

9.
Traditional methods for analyzing population structure, such as the Structure program, ignore the influence of the effect of allele mutations between the ancestral and current alleles of genetic markers, which can dramatically influence the accuracy of the structural estimation of current populations. Studying these effects can also reveal additional information about population evolution such as the divergence time and migration history of admixed populations. We propose mStruct, an admixture of population-specific mixtures of inheritance models that addresses the task of structure inference and mutation estimation jointly through a hierarchical Bayesian framework, and a variational algorithm for inference. We validated our method on synthetic data and used it to analyze the Human Genome Diversity Project–Centre d'Etude du Polymorphisme Humain (HGDP–CEPH) cell line panel of microsatellites and HGDP single-nucleotide polymorphism (SNP) data. A comparison of the structural maps of world populations estimated by mStruct and Structure is presented, and we also report potentially interesting mutation patterns in world populations estimated by mStruct.

The deluge of genomic polymorphism data, such as the genomewide multilocus genotype profiles of variable numbers of tandem repeats (i.e., microsatellites) and single-nucleotide polymorphisms (SNPs), has fueled the long-standing interest in analyzing patterns of genetic variations to reconstruct the ancestral structures of modern human populations. Genetic ancestral information can shed light on the evolutionary history and migrations of modern populations (Bowcock et al. 1994; Rosenberg et al. 2002; Conrad et al. 2006). It also provides guidelines for more accurate association studies (Roeder et al. 1998) and is useful for many other population genetics problems (Queller et al. 1993; Hammer et al. 1998; Templeton 2002).

Various methods have been proposed for stratifying population structures on the basis of multilocus genotype information from a set of individuals. For example, Pritchard et al. (2000) proposed a model-based approach implemented in the program Structure, which uses a statistical methodology known as the allele-frequency admixture model to stratify population structures. This model, and admixture models in general arising in genetic and other contexts (Blei et al. 2003), belongs to a more general class of hierarchical Bayesian models known as the mixed membership models (Erosheva et al. 2004). Such a model postulates that an empirical multiple-instance sample, such as the ensemble of genetic markers of an individual, is made up of either independently and identically distributed (iid) instantiations (Pritchard et al. 2000) or spatially coupled (Falush et al. 2003) instantiations, from multiple population-specific fixed-dimensional multinomial distributions of marker alleles [known as allele-frequency profiles, AP (Falush et al. 2003)]. Under this assumption, the admixture model identifies each ancestral population by a specific AP (that defines a unique vector of allele frequencies of each marker in each ancestral population) and displays the fraction of contributions from each AP in a modern individual genome as an admixing vector (also known as an ancestral proportion vector or structure vector) in a structural map over the population sample in question. Figure 1 shows an example of a structural map of four modern populations inferred from a portion of the HapMap multipopulation data set by Structure. In this population structural map, the admixing vector underlying each individual is represented as a thin vertical line of unit length and multiple colors, with the height of each color reflecting the fraction of the individual's genome originated from a certain ancestral population denoted by that color and formally represented by a unique AP. This method has been applied to the Human Genome Diversity Project–Centre d'Etude du Polymorphisme Humain (HGDP–CEPH) Human Genome Diversity Cell Line Panel in Rosenberg et al. (2002) and many other studies, and has unraveled interesting patterns in the genetic structures of the world population. However, even though Structure was originally built on a genetic admixture model, in reality the structural patterns derived by Structure in various studies often turn out to be distinct clusters among the study populations (e.g., Figure 1), which has led many to think of it as a clustering program rather than a tool for uncovering genetic admixing as it was supposed to do. The design limitation of the Structure model behind this issue motivated us to develop a new approach in this article to analyze admixed genetic samples.

Figure 1. Population structural map inferred by Structure on HapMap data consisting of four populations.

A recent extension of Structure, known as Structurama (Pella and Masuda 2006; Huelsenbeck and Andolfatto 2007), relaxes the finite dimensional assumption on ancestral populations in the admixture model by employing a Dirichlet process prior over the ancestral allele-frequency profiles. This allows automatic estimation of the maximum a posteriori probable number of ancestral populations. This extension is a useful improvement since it eliminates the need for manual selection of the number of ancestral populations. Anderson and Thompson (2002) address the problem of classifying species hybrids into categories, using a model-based Bayesian clustering approach implemented in the NewHybrid program. While this problem is not exactly identical to the problem of stratifying the structure of highly admixed populations, it is useful for structural analysis of populations that were recently admixed. The BAPS program (Corander et al. 2003) also uses a Bayesian approach to find the best partition of a set of individuals into subpopulations on the basis of genotypes. Parallel to the aforementioned model-based approaches for genomic structural analysis, direct algebraic eigen-decomposition and dimensionality reduction methods, such as the Eigensoft program (Patterson et al. 2006) based on principal components analysis (PCA), offer an alternative approach to explore and visualize the ancestral composition of modern populations and facilitate formal statistical tests for significance of population differentiation. However, unlike the model-based methods such as Structure, where each inferred ancestral population bears a concrete genetic meaning as a population-specific allele-frequency profile, the eigenvectors computed by Eigensoft represent the mutually orthogonal directions in an abstract low-dimensional ancestral space, in which population samples can be embedded and visualized; these eigenvectors can be understood as mathematical surrogates of independent genetic sources underlying a population sample, but lack a concrete interpretation under a generative genetic inheritance model (from here on, we use the term “inheritance model” to describe the process by which a descendant allele is derived from an ancestral allele). Analyses based on Eigensoft are usually limited to two-dimensional ancestral spaces, offering limited power in stratifying highly admixed populations.

This progress notwithstanding, an important aspect of population admixing that is largely missing in the existing methods is the effect of allele mutations between the ancestral and current alleles of genetic markers, which can dramatically influence the accuracy of the structural estimation of current populations. It can also reveal additional information about population evolution, such as the relative divergence time and migration history of admixed populations.

Consider, for example, the Structure model. Since an AP merely represents the frequency of alleles in an ancestral population rather than the actual allelic content or haplotypes of the alleles themselves, the admixture models developed so far on the basis of APs do not model genetic changes due to mutations from the ancestral alleles. Indeed, a serious pitfall of the model underlying Structure, as pointed out in Excoffier and Hamilton (2003), is that there is no mutation model for modern individual alleles with respect to hypothetical common prototypes in the ancestral populations. That means every unique allele in the modern population is assumed to have a distinct ancestral proportion, rather than allowing the possibility of it just being a descendant of some common ancestral allele that can also give rise to other closely related alleles at the same locus of other individuals in the modern population. Thus, while Structure aims to provide ancestry information for each individual and each locus, there is no explicit representation of the “ancestors” as a physical set of “founding alleles.” Therefore, the inferred population structural map emphasizes revealing the contributions of abstract population-specific ancestral proportion profiles, which does not necessarily reflect individual diversity or the extent of genetic changes with respect to the founders. Due to this limitation, Structure does not enable inference of the founding genetic patterns, the age of the founding alleles, or the population divergence time (Excoffier and Hamilton 2003).

The lack of an appropriate allele mutation model in a structural inference program can also compromise our ability to reliably assess the amount or level of genetic admixing in different populations. The Structure model, like several other related models (Blei et al. 2003), is based on the fundamental assumption of the presence of genetic admixing among multiple founding populations. However, as we shall see later, on real population data such as the HGDP–CEPH panel, it produces results that favor clustering individuals into predominantly one allele-frequency profile or another, thus leading us to conclude that there was little or no admixing between the ancestral human populations. We believe that this occurs due to the absence of a mutation model in Structure. While a partitioning of individuals would be desirable for clustering them into groups, it does not offer enough biological insight into the intermixing of the populations.

In this article, we present mStruct (which stands for Structure under mutations), based on a new model: an admixture of population-specific mixtures of inheritance models (AdMim). Statistically, AdMim is an admixture of mixture models, which represents each ancestral population as a mixture of ancestral alleles each with its own inheritance process and each modern individual as an “ancestry vector” (or structure vector) that reflects membership proportions of the ancestral populations. As we explain shortly, mStruct facilitates estimation of both the structural map of populations and the mutation parameters of either SNP or microsatellite alleles under various contexts. A new variational inference algorithm, which is much faster than the MCMC algorithm used for Structure, was developed for estimating the structure vectors and other genetic parameters of interest. We compare our method with Structure on simulated genotype data and on the microsatellite and SNP genotype data of world populations (Rosenberg et al. 2002; Conrad et al. 2006). Our results using microsatellite data reveal the presence of significant levels of genetic admixing among the founding populations underlying the HGDP–CEPH cell line panel, as well as consequences of expansion of humans out of Africa. Our results suggest that the inability of Structure to model mutations during genetic admixing could have caused it to detect correct clustering but very low levels of genetic admixing in each modern population in the HGDP–CEPH data. We also report interesting visualizations of genetic divergence in world populations revealed by the mutation patterns estimated by mStruct. The mStruct software has been implemented in C++ and is available for download at http://www.sailing.cs.cmu.edu/mstruct.html.
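
The allele-frequency admixture model described in this excerpt can be sketched for a single locus: the probability of observing an allele is the structure-vector-weighted mix of the allele-frequency profiles (APs). The function name, the two toy profiles, and the repeat-count alleles below are hypothetical; this is a minimal illustration of the basic admixture model only, not of Structure's or mStruct's implementation, and it contains no mutation model.

```python
def allele_probability(allele, structure_vector, allele_frequency_profiles):
    """P(allele) at one locus = sum over k of theta_k * AP_k[allele].

    structure_vector: ancestry proportions theta_k (should sum to 1).
    allele_frequency_profiles: one dict per ancestral population mapping
    allele -> frequency at this locus (each dict should sum to 1).
    """
    return sum(
        theta_k * ap_k.get(allele, 0.0)
        for theta_k, ap_k in zip(structure_vector, allele_frequency_profiles)
    )

# Hypothetical microsatellite locus; alleles are repeat counts.
profiles = [
    {10: 0.7, 11: 0.3},  # AP of ancestral population 1
    {11: 0.2, 12: 0.8},  # AP of ancestral population 2
]
theta = [0.25, 0.75]     # structure vector of one individual

p11 = allele_probability(11, theta, profiles)  # 0.25 * 0.3 + 0.75 * 0.2
p13 = allele_probability(13, theta, profiles)  # allele absent from every AP
```

In this sketch an allele that appears in no profile (here 13) gets probability zero even though it is one repeat away from allele 12. That is one way to see the missing mutation model that the excerpt criticizes.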

10.

Background

While spouse correlations have been documented for numerous traits, no prior studies have assessed assortative mating for genetic ancestry in admixed populations.

Results

Using 104 ancestry informative markers, we examined spouse correlations in genetic ancestry for Mexican spouse pairs recruited from Mexico City and the San Francisco Bay Area, and Puerto Rican spouse pairs recruited from Puerto Rico and New York City. In the Mexican pairs, we found strong spouse correlations for European and Native American ancestry, but no correlation in African ancestry. In the Puerto Rican pairs, we found significant spouse correlations for African ancestry and European ancestry but not Native American ancestry. Correlations were not attributable to variation in socioeconomic status or geographic heterogeneity. Past evidence of spouse correlation was also seen in the strong evidence of linkage disequilibrium between unlinked markers, which was accounted for in regression analysis by ancestral allele frequency difference at the pair of markers (European versus Native American for Mexicans, European versus African for Puerto Ricans). We also observed an excess of homozygosity at individual markers within the spouses, but this provided weaker evidence, as expected, of spouse correlation. Ancestry variance is predicted to decline in each generation, but less so under assortative mating. We used the current observed variances of ancestry to infer even stronger patterns of spouse ancestry correlation in previous generations.

Conclusions

Assortative mating related to genetic ancestry persists in Latino populations to the current day, and has affected the genomic structure of these populations.

11.
Inference of population structure and individual ancestry is important both for population genetics and for association studies. With next generation sequencing technologies it is possible to obtain genetic data for all accessible genetic variations in the genome. Existing methods for admixture analysis rely on known genotypes. However, individual genotypes cannot be inferred from low-depth sequencing data without introducing errors. This article presents a new method for inferring an individual’s ancestry that takes the uncertainty introduced in next generation sequencing data into account. This is achieved by working directly with genotype likelihoods that contain all relevant information of the unobserved genotypes. Using simulations as well as publicly available sequencing data, we demonstrate that the presented method has great accuracy even for very low-depth data. At the same time, we demonstrate that applying existing methods to genotypes called from the same data can introduce severe biases. The presented method is implemented in the NGSadmix software available at http://www.popgen.dk/software.
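
Working directly with genotype likelihoods, as this abstract describes, can be sketched as follows. The sketch assumes a biallelic site, an individual allele frequency h equal to the ancestry-weighted mix of population frequencies, and Hardy-Weinberg genotype priors given h; the uncalled genotype is then summed out. Function names and input values are hypothetical, and this is not the NGSadmix code or its optimisation algorithm.

```python
import math

def site_likelihood(genotype_likelihoods, admixture_proportions, population_freqs):
    """Likelihood of the sequencing data at one site for one individual.

    genotype_likelihoods: (P(data | G=0), P(data | G=1), P(data | G=2)),
    where G counts copies of the allele whose frequencies are given.
    """
    h = sum(q * f for q, f in zip(admixture_proportions, population_freqs))
    genotype_priors = ((1 - h) ** 2, 2 * h * (1 - h), h ** 2)
    # Sum over the unobserved genotype instead of calling it.
    return sum(gl * prior for gl, prior in zip(genotype_likelihoods, genotype_priors))

def log_likelihood(gls_per_site, admixture_proportions, freqs_per_site):
    """Sum of per-site log-likelihoods, treating sites as independent."""
    return sum(
        math.log(site_likelihood(gls, admixture_proportions, freqs))
        for gls, freqs in zip(gls_per_site, freqs_per_site)
    )

q = (0.5, 0.5)                  # hypothetical ancestry proportions, K = 2
freqs = (0.2, 0.6)              # hypothetical population allele frequencies
certain = site_likelihood((0.0, 0.0, 1.0), q, freqs)   # genotype known to be 2
flat = site_likelihood((1.0, 1.0, 1.0), q, freqs)      # data carry no information
```

A flat likelihood vector contributes a constant factor, so uninformative low-depth sites neither help nor bias the ancestry estimate in this formulation, whereas a hard genotype call would force a choice.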

12.
Modern genetic samples are commonly used to trace dog origins, which entails untested assumptions that village dogs reflect indigenous ancestry or that breed origins can be reliably traced to particular regions. We used high-resolution Y chromosome markers (SNP and STR) and mitochondrial DNA to analyze 495 village dogs/dingoes from the Middle East and Southeast Asia, along with 138 dogs from >35 modern breeds to 1) assess genetic divergence between Middle Eastern and Southeast Asian village dogs and their phylogenetic affinities to Australian dingoes and gray wolves (Canis lupus) and 2) compare the genetic affinities of modern breeds to regional indigenous village dog populations. The Y chromosome markers indicated that village dogs in the two regions corresponded to reciprocally monophyletic clades, reflecting several to many thousand years of divergence, predating the Neolithic ages, and indicating long-indigenous roots to those regions. As expected, breeds of the Middle East and East Asia clustered within the respective regional village dog clade. Australian dingoes also clustered in the Southeast Asian clade. However, the European and American breeds clustered almost entirely within the Southeast Asian clade, even sharing many haplotypes, suggesting a substantial and recent influence of East Asian dogs in the creation of European breeds. Comparison to 818 published breed dog Y STR haplotypes confirmed this conclusion and indicated that some African breeds reflect another distinct patrilineal origin. The lower-resolution mtDNA marker consistently supported Y-chromosome results. Both marker types confirmed previous findings of higher genetic diversity in dogs from Southeast Asia than the Middle East. Our findings demonstrate the importance of village dogs as windows into the past and provide a reference against which ancient DNA can be used to further elucidate origins and spread of the domestic dog.

13.
Dogs were present in pre-Columbian America, presumably brought by early human migrants from Asia. Studies of free-ranging village/street dogs have indicated almost total replacement of these original dogs by European dogs, but the extent to which Arctic, North and South American breeds are descendants of the original population remains to be assessed. Using a comprehensive phylogeographic analysis, we traced the origin of the mitochondrial DNA lineages for Inuit, Eskimo and Greenland dogs, Alaskan Malamute, Chihuahua, xoloitzcuintli and perro sín pelo del Peru, by comparing to extensive samples of East Asian (n = 984) and European dogs (n = 639), and previously published pre-Columbian sequences. Evidence for a pre-Columbian origin was found for all these breeds, except Alaskan Malamute for which results were ambiguous. No European influence was indicated for the Arctic breeds Inuit, Eskimo and Greenland dog, and North/South American breeds had at most 30% European female lineages, suggesting marginal replacement by European dogs. Genetic continuity through time was shown by the sharing of a unique haplotype between the Mexican breed Chihuahua and ancient Mexican samples. We also analysed free-ranging dogs, confirming limited pre-Columbian ancestry overall, but also identifying pockets of remaining populations with high proportion of indigenous ancestry, and we provide the first DNA-based evidence that the Carolina dog, a free-ranging population in the USA, may have an ancient Asian origin.

14.
15.

Background

The estimation of individual ancestry from genetic data has become essential to applied population genetics and genetic epidemiology. Software programs for calculating ancestry estimates have become essential tools in the geneticist's analytic arsenal.

Results

Here we describe four enhancements to ADMIXTURE, a high-performance tool for estimating individual ancestries and population allele frequencies from SNP (single nucleotide polymorphism) data. First, ADMIXTURE can be used to estimate the number of underlying populations through cross-validation. Second, individuals of known ancestry can be exploited in supervised learning to yield more precise ancestry estimates. Third, by penalizing small admixture coefficients for each individual, one can encourage model parsimony, often yielding more interpretable results for small datasets or datasets with large numbers of ancestral populations. Finally, by exploiting multiple processors, large datasets can be analyzed even more rapidly.

Conclusions

The enhancements we have described make ADMIXTURE a more accurate, efficient, and versatile tool for ancestry estimation.

16.
The patterns of genetic variation within and among individuals and populations can be used to make inferences about the evolutionary forces that generated those patterns. Numerous population genetic approaches have been developed in order to infer evolutionary history. Here, we present the “Two-Two (TT)” and the “Two-Two-outgroup (TTo)” methods, two closely related approaches for estimating divergence time based on coalescent theory. They rely on sequence data from two haploid genomes (or a single diploid individual) from each of two populations. Under a simple population-divergence model, we derive the probabilities of the possible sample configurations. These probabilities form a set of equations that can be solved to obtain estimates of the model parameters, including population split times, directly from the sequence data. This transparent and computationally efficient approach to infer population divergence time makes it possible to estimate time scaled in generations (assuming a mutation rate), and not as a compound parameter of genetic drift. Using simulations under a range of demographic scenarios, we show that the method is relatively robust to migration and that the TTo method can alleviate biases that can appear from drastic ancestral population size changes. We illustrate the utility of the approaches with some examples, including estimating split times for pairs of human populations as well as providing further evidence for the complex relationship among Neandertals and Denisovans and their ancestors.

17.
Genomics, 2021, 113(4): 2056-2064
Ancestry informative markers are widely used and advantageous for inferring ancestral origins and estimating the ancestral genetic components of admixed populations. Given the extensive cultural exchange and admixed genetic structure of the Kyrgyz group, it is essential to enrich the genetic data available for this group. In this study, we used a self-developed ancestry informative marker-deletion/insertion polymorphic (AIM-DIP) panel to explore the ancestral components of the Chinese Kyrgyz group and the population genetic relationships between the Kyrgyz group and reference populations. Results showed that all AIM-DIP loci conformed to Hardy-Weinberg equilibrium. There were 36 AIM-DIP loci that contributed significantly to genetic information inference. Multiple statistical analyses revealed that the Chinese Kyrgyz group had a closer genetic relationship with the Chinese Uyghur group. The ancestral components of the Kyrgyz group, composed mostly of genetic components of European and East Asian populations, were more similar to the ancestral components of the Chinese Uyghur group.

18.
Genetic ancestry, admixture and health determinants in Latin America

Background

Modern Latin American populations were formed via genetic admixture among ancestral source populations from Africa, the Americas and Europe. We are interested in studying how combinations of genetic ancestry in admixed Latin American populations may impact genomic determinants of health and disease. For this study, we characterized the impact of ancestry and admixture on genetic variants that underlie health- and disease-related phenotypes in population genomic samples from Colombia, Mexico, Peru, and Puerto Rico.

Results

We analyzed a total of 347 admixed Latin American genomes along with 1102 putative ancestral source genomes from Africans, Europeans, and Native Americans. We characterized the genetic ancestry, relatedness, and admixture patterns for each of the admixed Latin American genomes, finding a spectrum of ancestry proportions within and between populations. We then identified single nucleotide polymorphisms (SNPs) with anomalous ancestry-enrichment patterns, i.e. SNPs that exist in any given Latin American population at a higher frequency than expected based on the population’s genetic ancestry profile. For this set of ancestry-enriched SNPs, we inspected their phenotypic impact on disease, metabolism, and the immune system. All four of the Latin American populations show ancestry-enrichment for a number of shared pathways, yielding evidence of similar selection pressures on these populations during their evolution. For example, all four populations show ancestry-enriched SNPs in multiple genes from immune system pathways, such as the cytokine receptor interaction, T cell receptor signaling, and antigen presentation pathways. We also found SNPs with excess African or European ancestry that are associated with ancestry-specific gene expression patterns and play crucial roles in the immune system and infectious disease responses. Genes from both the innate and adaptive immune system were found to be regulated by ancestry-enriched SNPs with population-specific regulatory effects.

Conclusions

Ancestry-enriched SNPs in Latin American populations have a substantial effect on health- and disease-related phenotypes. The concordant impact observed for the same phenotypes across populations points to a process of adaptive introgression, whereby ancestry-enriched SNPs with specific functional utility appear to have been retained in modern populations by virtue of their effects on health and fitness.
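The Results describe ancestry-enriched SNPs as variants found at a higher frequency than expected from a population's genetic ancestry profile. One common way to form such an expectation, assumed here and not confirmed by the abstract, is to weight each source population's allele frequency by that source's ancestry fraction in the admixed population. The sketch below illustrates that idea; the function names and the numbers in the usage lines are hypothetical, and the study's actual statistic may differ.

```python
from typing import Sequence


def expected_frequency(
    ancestry_fractions: Sequence[float],
    source_frequencies: Sequence[float],
) -> float:
    """Ancestry-weighted expected allele frequency in an admixed population.

    ancestry_fractions: population-level ancestry proportions (must sum to 1).
    source_frequencies: allele frequency in each corresponding source population.
    """
    if len(ancestry_fractions) != len(source_frequencies):
        raise ValueError("inputs must have the same length")
    if abs(sum(ancestry_fractions) - 1.0) > 1e-6:
        raise ValueError("ancestry fractions must sum to 1")
    return sum(q * f for q, f in zip(ancestry_fractions, source_frequencies))


def ancestry_enrichment(
    observed_frequency: float,
    ancestry_fractions: Sequence[float],
    source_frequencies: Sequence[float],
) -> float:
    """Observed minus expected frequency; positive values indicate enrichment."""
    return observed_frequency - expected_frequency(
        ancestry_fractions, source_frequencies
    )


# Made-up numbers: three source populations and one SNP (illustration only).
fractions = [0.5, 0.3, 0.2]
source_freqs = [0.1, 0.4, 0.8]
excess = ancestry_enrichment(0.5, fractions, source_freqs)
print(f"Excess over expectation: {excess:.2f}")
```

A positive excess alone does not establish selection; a threshold or significance test would be needed to call a SNP anomalous, and the abstract does not state which one was used.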

19.
Module network inference is an established statistical method to reconstruct co-expression modules and their upstream regulatory programs from integrated multi-omics datasets measuring the activity levels of various cellular components across different individuals, experimental conditions or time points of a dynamic process. We have developed Lemon-Tree, an open-source, platform-independent, modular, extensible software package implementing state-of-the-art ensemble methods for module network inference. We benchmarked Lemon-Tree using large-scale tumor datasets and showed that Lemon-Tree algorithms compare favorably with state-of-the-art module network inference software. We also analyzed a large dataset of somatic copy-number alterations and gene expression levels measured in glioblastoma samples from The Cancer Genome Atlas and found that Lemon-Tree correctly identifies known glioblastoma oncogenes and tumor suppressors as master regulators in the inferred module network. Novel candidate driver genes predicted by Lemon-Tree were validated using tumor pathway and survival analyses. Lemon-Tree is available from http://lemon-tree.googlecode.com under the GNU General Public License version 2.0.
This is a PLOS Computational Biology Software Article

20.
Genome-wide association analysis in populations of European descent has recently found more than a hundred genetic variants affecting risk for common disease. An open question, however, is how relevant the variants discovered in Europeans are to other populations. To address this problem for cardiovascular phenotypes, we studied a cohort of 4,464 African Americans from the Jackson Heart Study (JHS), in whom we genotyped both a panel of 12 recently discovered genetic variants known to predict lipid profile levels in Europeans and a panel of up to 1,447 ancestry informative markers allowing us to determine the African ancestry proportion of each individual at each position in the genome. Focusing on the lipid profiles HDL-cholesterol (HDL-C), LDL-cholesterol (LDL-C), and triglycerides (TG), we identified the lipoprotein lipase (LPL) locus as harboring variants that account for interethnic variation in HDL-C and TG. In particular, we identified a novel common variant within LPL that is strongly associated with TG (p = 2.7×10⁻⁶) and explains nearly 1% of the variability in this phenotype, the most of any variant in African Americans to date. Strikingly, the extensively studied “gain-of-function” S447X mutation at LPL, which has been hypothesized to be the major determinant of the LPL-TG genetic association and is in trials for human gene therapy, has a significantly diminished strength of biological effect when it is found on a background of African rather than European ancestry. These results suggest that there are other, yet undiscovered variants at the locus that are truly causal (and are in linkage disequilibrium with S447X) or that work synergistically with S447X to modulate TG levels. Finally, we find systematically lower effect sizes for the 12 risk variants discovered in European populations on the African local ancestry background in JHS, highlighting the need for caution in the use of genetic variants for risk assessment across different populations.
