
1.

Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data

Joseph K. Pickrell Jonathan K. Pritchard 《PLoS genetics》2012,8(11)

Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In our model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genome-wide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data, we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and “ancient” Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com.

2.

Viral Quasispecies Assembly via Maximal Clique Enumeration

Armin Töpfer Tobias Marschall Rowena A. Bull Fabio Luciani Alexander Schönhuth Niko Beerenwinkel 《PLoS computational biology》2014,10(3)

Virus populations can display high genetic diversity within individual hosts. The intra-host collection of viral haplotypes, called viral quasispecies, is an important determinant of virulence, pathogenesis, and treatment outcome. We present HaploClique, a computational approach to reconstruct the structure of a viral quasispecies from next-generation sequencing data as obtained from bulk sequencing of mixed virus samples. We develop a statistical model for paired-end reads accounting for mutations, insertions, and deletions. Using an iterative maximal clique enumeration approach, read pairs are assembled into haplotypes of increasing length, eventually enabling global haplotype assembly. The performance of our quasispecies assembly method is assessed on simulated data for varying population characteristics and sequencing technology parameters. Owing to its paired-end handling, HaploClique compares favorably to state-of-the-art haplotype inference methods. It can reconstruct error-free full-length haplotypes from low coverage samples and detect large insertions and deletions at low frequencies. We applied HaploClique to sequencing data derived from a clinical hepatitis C virus population of an infected patient and discovered a novel deletion of length 357±167 bp that was validated by two independent long-read sequencing experiments. HaploClique is available at https://github.com/armintoepfer/haploclique. A summary of this paper appears in the proceedings of the RECOMB 2014 conference, April 2-5.

3.

Molecular Pathology of Rare Bleeding Disorders (RBDs) in India: A Systematic Review

Bipin P. Kulkarni Sona B. Nair Manasi Vijapurkar Leenam Mota Sharda Shanbhag Shehnaz Ali Shrimati D. Shetty Kanjaksha Ghosh 《PloS one》2014,9(10)

Background

Though rare in occurrence, rare bleeding disorders (RBDs) are highly heterogeneous, and patients may manifest a severe bleeding diathesis. Due to the high rate of consanguinity in many caste groups, these autosomal recessive bleeding disorders, which are rare in populations across the world, may not be as rare in India.

Objectives

To comprehensively analyze the frequency and nature of mutations in Indian patients with RBDs.

Methods

A PubMed search (www.pubmed.com) was used to explore the published literature from India on RBDs using the key words “rare bleeding disorders”, “mutations”, “India”, “fibrinogen”, “afibrinogenemia”, “factor II deficiency”, “prothrombin”, “factor VII deficiency”, “factor V deficiency”, “factor X deficiency”, “factor XI deficiency”, “combined factor V and VIII deficiency”, “factor XIII deficiency”, “Bernard Soulier syndrome” and “Glanzmanns thrombasthenia” in different combinations. A total of 60 relevant articles were retrieved. The distribution of mutations from India was compared with that of the world literature by referring to the Human Gene Mutation Database (HGMD) (www.hgmd.org).

Results

Taken together, 181 mutations in 270 patients with different RBDs have been reported from India. Though the types of mutations reported from India and their percentage distribution are largely similar to the world data, a much higher percentage of small deletions, duplications, insertions, and indels was observed in this analysis. Besides the identification of novel mutations and polymorphisms, several common mutations have also been reported, which will allow a strategy to be developed for mutation screening in Indian patients with RBDs.

Conclusion

There is a need for a consortium of Institutions working on the molecular pathology of RBDs in India. This will facilitate a quicker and cheaper diagnosis of RBDs besides its utility in first trimester prenatal diagnosis of the affected families.

4.

A statistical model for describing and simulating microbial community profiles

Siyuan Ma Boyu Ren Himel Mallick Yo Sup Moon Emma Schwager Sagun Maharjan Timothy L. Tickle Yiren Lu Rachel N. Carmody Eric A. Franzosa Lucas Janson Curtis Huttenhower 《PLoS computational biology》2021,17(9)

Many methods have been developed for statistical analysis of microbial community profiles, but due to the complex nature of typical microbiome measurements (e.g. sparsity, zero-inflation, non-independence, and compositionality) and of the associated underlying biology, it is difficult to compare or evaluate such methods within a single systematic framework. To address this challenge, we developed SparseDOSSA (Sparse Data Observations for the Simulation of Synthetic Abundances): a statistical model of microbial ecological population structure, which can be used to parameterize real-world microbial community profiles and to simulate new, realistic profiles of known structure for methods evaluation. Specifically, SparseDOSSA’s model captures marginal microbial feature abundances as a zero-inflated log-normal distribution, with additional model components for absolute cell counts and the sequence read generation process, microbe-microbe, and microbe-environment interactions. Together, these allow fully known covariance structure between synthetic features (i.e. “taxa”) or between features and “phenotypes” to be simulated for method benchmarking. Here, we demonstrate SparseDOSSA’s performance for 1) accurately modeling human-associated microbial population profiles; 2) generating synthetic communities with controlled population and ecological structures; 3) spiking-in true positive synthetic associations to benchmark analysis methods; and 4) recapitulating an end-to-end mouse microbiome feeding experiment. Together, these represent the most common analysis types in assessment of real microbial community environmental and epidemiological statistics, thus demonstrating SparseDOSSA’s utility as a general-purpose aid for modeling communities and evaluating quantitative methods. An open-source implementation is available at http://huttenhower.sph.harvard.edu/sparsedossa2.
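
As a rough illustration of the zero-inflated log-normal marginal mentioned in the abstract above, the following Python sketch draws synthetic abundances for a single feature. It is not SparseDOSSA's implementation: the function names, the parameter names (`pi_zero`, `mu`, `sigma`), and the renormalization helper are assumptions introduced here for illustration only.

```python
import random


def sample_zero_inflated_lognormal(pi_zero, mu, sigma, n, seed=0):
    """Draw n abundances for one feature (hypothetical helper).

    With probability pi_zero the feature is absent (abundance 0.0);
    otherwise the abundance is exp(Normal(mu, sigma)).
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        if rng.random() < pi_zero:
            draws.append(0.0)  # structural zero
        else:
            draws.append(rng.lognormvariate(mu, sigma))
    return draws


def to_relative_abundance(values):
    """Rescale non-negative abundances so they sum to 1 (assumed step)."""
    total = sum(values)
    if total == 0:
        return list(values)
    return [v / total for v in values]


if __name__ == "__main__":
    abundances = sample_zero_inflated_lognormal(0.6, 0.0, 1.0, n=1000)
    zero_fraction = sum(1 for v in abundances if v == 0.0) / len(abundances)
    print(f"fraction of zeros: {zero_fraction:.2f}")
```

The zero fraction printed by the usage example should sit near the chosen `pi_zero`; the full model in the paper adds further components (cell counts, read generation, interactions) that this sketch does not attempt.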

5.

Allelic Variation, Aneuploidy, and Nongenetic Mechanisms Suppress a Monogenic Trait in Yeast

Amy Sirr Gareth A. Cromie Eric W. Jeffery Teresa L. Gilbert Catherine L. Ludlow Adrian C. Scott Aimée M. Dudley 《Genetics》2015,199(1):247-262

Clinically relevant features of monogenic diseases, including severity of symptoms and age of onset, can vary widely in response to environmental differences as well as to the presence of genetic modifiers affecting the trait’s penetrance and expressivity. While a better understanding of modifier loci could lead to treatments for Mendelian diseases, the rarity of individuals harboring both a disease-causing allele and a modifying genotype hinders their study in human populations. We examined the genetic architecture of monogenic trait modifiers using a well-characterized yeast model of the human Mendelian disease classic galactosemia. Yeast strains with loss-of-function mutations in the yeast ortholog (GAL7) of the human disease gene (GALT) fail to grow in the presence of even small amounts of galactose due to accumulation of the same toxic intermediates that poison human cells. To isolate and individually genotype large numbers of the very rare (∼0.1%) galactose-tolerant recombinant progeny from a cross between two gal7Δ parents, we developed a new method, called “FACS-QTL.” FACS-QTL improves upon the currently used approaches of bulk segregant analysis and extreme QTL mapping by requiring less genome engineering and strain manipulation as well as maintaining individual genotype information. Our results identified multiple distinct solutions by which the monogenic trait could be suppressed, including genetic and nongenetic mechanisms as well as frequent aneuploidy. Taken together, our results imply that the modifiers of monogenic traits are likely to be genetically complex and heterogeneous.

6.

AntiAngioPred: A Server for Prediction of Anti-Angiogenic Peptides

Azhagiya Singam Ettayapuram Ramaprasad Sandeep Singh Raghava Gajendra P. S Subramanian Venkatesan 《PloS one》2015,10(9)

The process of angiogenesis is a vital step towards the formation of malignant tumors. Anti-angiogenic peptides are therefore promising candidates in the treatment of cancer. In this study, we have collected anti-angiogenic peptides from the literature and analyzed the residue preference in these peptides. Residues like Cys, Pro, Ser, Arg, Trp, Thr and Gly are preferred while Ala, Asp, Ile, Leu, Val and Phe are not preferred in these peptides. There is a positional preference of Ser, Pro, Trp and Cys in the N terminal region and Cys, Gly and Arg in the C terminal region of anti-angiogenic peptides. Motif analysis suggests the motifs “CG-G”, “TC”, “SC”, “SP-S”, etc., which are highly prominent in anti-angiogenic peptides. Based on the primary analysis, we developed prediction models using different machine learning based methods. The maximum accuracy and MCC for the amino acid composition based model are 80.9% and 0.62, respectively. The performance of the models on an independent dataset is also reasonable. Based on the above study, we have developed a user-friendly web server named “AntiAngioPred” for the prediction of anti-angiogenic peptides. The AntiAngioPred web server is freely accessible at http://clri.res.in/subramanian/tools/antiangiopred/index.html (mirror site: http://crdd.osdd.net/raghava/antiangiopred/).

7.

A Preliminary Investigation of Individual Differences in Subjective Responses to D-Amphetamine, Alcohol, and Delta-9-Tetrahydrocannabinol Using a Within-Subjects Randomized Trial

Margaret C. Wardle Benjamin A. Marcus Harriet de Wit 《PloS one》2015,10(10)

Polydrug use is common, and might occur because certain individuals experience positive effects from several different drugs during early stages of use. This study examined individual differences in subjective responses to single oral doses of d-amphetamine, alcohol, and delta-9-tetrahydrocannabinol (THC) in healthy social drinkers. Each of these drugs produces feelings of well-being in at least some individuals, and we hypothesized that subjective responses to these drugs would be positively correlated. We also examined participants’ drug responses in relation to personality traits associated with drug use. In this initial, exploratory study, 24 healthy, light drug users (12 male, 12 female), aged 21–31 years, participated in a fully within-subject, randomized, counterbalanced design with six 5.5-hour sessions in which they received d-amphetamine (20 mg), alcohol (0.8 g/kg), or THC (7.5 mg), each paired with a placebo session. Participants rated the drugs’ effects on both global measures (e.g. feeling a drug effect at all) and drug-specific measures. In general, participants’ responses to the three drugs were unrelated. Unexpectedly, “wanting more” alcohol was inversely correlated with “wanting more” THC. Additionally, in women, but not in men, “disliking” alcohol was negatively correlated with “disliking” THC. Positive alcohol and amphetamine responses were related, but only in individuals who experienced a stimulant effect of alcohol. Finally, high trait constraint (or lack of impulsivity) was associated with lower reports of liking alcohol. No personality traits predicted responses across multiple drug types. Generally, these findings do not support the idea that certain individuals experience greater positive effects across multiple drug classes, but instead provide some evidence for a “drug of choice” model, in which individuals respond positively to certain classes of drugs that share similar subjective effects, and dislike other types of drugs.

Trial Registration

ClinicalTrials.gov NCT02485158

8.

Genome-Wide Scan for Adaptive Divergence and Association with Population-Specific Covariates

Mathieu Gautier 《Genetics》2015,201(4):1555-1579

In population genomics studies, accounting for the neutral covariance structure across population allele frequencies is critical to improve the robustness of genome-wide scan approaches. Elaborating on the BayEnv model, this study investigates several modeling extensions (i) to improve the estimation accuracy of the population covariance matrix and all the related measures, (ii) to identify significantly overly differentiated SNPs based on a calibration procedure of the XtX statistics, and (iii) to consider alternative covariate models for analyses of association with population-specific covariables. In particular, the auxiliary variable model allows one to deal with multiple testing issues and, providing the relative marker positions are available, to capture some linkage disequilibrium information. A comprehensive simulation study was carried out to evaluate the performances of these different models. Also, when compared in terms of power, robustness, and computational efficiency to five other state-of-the-art genome-scan methods (BayEnv2, BayScEnv, BayScan, flk, and lfmm), the proposed approaches proved highly effective. For illustration purposes, genotyping data on 18 French cattle breeds were analyzed, leading to the identification of 13 strong signatures of selection. Among these, four (surrounding the KITLG, KIT, EDN3, and ALB genes) contained SNPs strongly associated with the piebald coloration pattern while a fifth (surrounding PLAG1) could be associated to morphological differences across the populations. Finally, analysis of Pool-Seq data from 12 populations of Littorina saxatilis living in two different ecotypes illustrates how the proposed framework might help in addressing relevant ecological issues in nonmodel species. Overall, the proposed methods define a robust Bayesian framework to characterize adaptive genetic differentiation across populations. 
The BayPass program implementing the different models is available at http://www1.montpellier.inra.fr/CBGP/software/baypass/.

9.

The 2015 Bioinformatics Open Source Conference (BOSC 2015)

Nomi L. Harris Peter J. A. Cock Hilmar Lapp Brad Chapman Rob Davey Christopher Fields Karsten Hokamp Monica Munoz-Torres 《PLoS computational biology》2016,12(2)

The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included “Data Science;” “Standards and Interoperability;” “Open Science and Reproducibility;” “Translational Bioinformatics;” “Visualization;” and “Bioinformatics Open Source Project Updates”. In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled “Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community,” that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.

10.

High-Resolution Global Analysis of the Influences of Bas1 and Ino4 Transcription Factors on Meiotic DNA Break Distributions in Saccharomyces cerevisiae

Xuan Zhu Scott Keeney 《Genetics》2015,201(2):525-542


11.

MAFsnp: A Multi-Sample Accurate and Flexible SNP Caller Using Next-Generation Sequencing Data

Jiyuan Hu Tengfei Li Zidi Xiu Hong Zhang 《PloS one》2015,10(8)

Most existing statistical methods developed for calling single nucleotide polymorphisms (SNPs) using next-generation sequencing (NGS) data are based on Bayesian frameworks, and there does not exist any SNP caller that produces p-values for calling SNPs in a frequentist framework. To fill this gap, we develop a new method, MAFsnp, a Multiple-sample based Accurate and Flexible algorithm for calling SNPs with NGS data. MAFsnp is based on an estimated likelihood ratio test (eLRT) statistic. In practical situations, the involved parameter is very close to the boundary of the parametric space, so the standard large sample property is not suitable to evaluate the finite-sample distribution of the eLRT statistic. Observing that the distribution of the test statistic is a mixture of zero and a continuous part, we propose to model the test statistic with a novel two-parameter mixture distribution. Once the parameters in the mixture distribution are estimated, p-values can be easily calculated for detecting SNPs, and the multiple-testing corrected p-values can be used to control the false discovery rate (FDR) at any pre-specified level. With simulated data, MAFsnp is shown to have much better control of FDR than the existing SNP callers. Through the application to two real datasets, MAFsnp is also shown to outperform the existing SNP callers in terms of calling accuracy. An R package “MAFsnp” implementing the new SNP caller is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/.
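
The abstract above says that multiple-testing corrected p-values can be used to control FDR, but it does not name the correction. The Python sketch below uses the Benjamini-Hochberg step-up adjustment purely as an assumed, commonly used example of such a correction; it is not taken from the MAFsnp package, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here only as one standard FDR-controlling
    correction; the source abstract does not specify which one it uses.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    site_pvalues = [0.01, 0.04, 0.03, 0.5]  # made-up values for illustration
    adjusted = benjamini_hochberg(site_pvalues)
    # Call a site a SNP when its adjusted p-value is below the target FDR.
    calls = [q < 0.05 for q in adjusted]
    print(adjusted, calls)
```

Thresholding the adjusted values at a chosen level (0.05 in the usage example) is what turns per-site p-values into a call set with an FDR target.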

12.

mStruct: Inference of Population Structure in Light of Both Genetic Admixing and Allele Mutations

Suyash Shringarpure Eric P. Xing 《Genetics》2009,182(2):575-593

Traditional methods for analyzing population structure, such as the Structure program, ignore the effect of allele mutations between the ancestral and current alleles of genetic markers, which can dramatically influence the accuracy of the structural estimation of current populations. Studying these effects can also reveal additional information about population evolution such as the divergence time and migration history of admixed populations. We propose mStruct, an admixture of population-specific mixtures of inheritance models that addresses the task of structure inference and mutation estimation jointly through a hierarchical Bayesian framework, and a variational algorithm for inference. We validated our method on synthetic data and used it to analyze the Human Genome Diversity Project–Centre d'Etude du Polymorphisme Humain (HGDP–CEPH) cell line panel of microsatellites and HGDP single-nucleotide polymorphism (SNP) data. A comparison of the structural maps of world populations estimated by mStruct and Structure is presented, and we also report potentially interesting mutation patterns in world populations estimated by mStruct.

The deluge of genomic polymorphism data, such as the genomewide multilocus genotype profiles of variable numbers of tandem repeats (i.e., microsatellites) and single-nucleotide polymorphisms (SNPs), has fueled the long-standing interest in analyzing patterns of genetic variations to reconstruct the ancestral structures of modern human populations. Genetic ancestral information can shed light on the evolutionary history and migrations of modern populations. It also provides guidelines for more accurate association studies (Roeder et al. 1998) and is useful for many other population genetics problems.

Various methods have been proposed for stratifying population structures on the basis of multilocus genotype information from a set of individuals. For example, the program Structure implements a model-based approach that uses a statistical methodology known as the allele-frequency admixture model to stratify population structures. This model, and admixture models in general arising in genetic and other contexts (Blei et al. 2003), belongs to a more general class of hierarchical Bayesian models known as the mixed membership models. Such a model postulates that an empirical multiple-instance sample, such as the ensemble of genetic markers of an individual, is made up of either independently and identically distributed (iid) instantiations or spatially coupled instantiations, from multiple population-specific fixed-dimensional multinomial distributions of marker alleles [known as allele-frequency profiles (APs)]. Under this assumption, the admixture model identifies each ancestral population by a specific AP (that defines a unique vector of allele frequencies of each marker in each ancestral population) and displays the fraction of contributions from each AP in a modern individual genome as an admixing vector (also known as an ancestral proportion vector or structure vector) in a structural map over the population sample in question. Figure 1 shows an example of a structural map of four modern populations inferred from a portion of the HapMap multipopulation data set by Structure. In this population structural map, the admixing vector underlying each individual is represented as a thin vertical line of unit length and multiple colors, with the height of each color reflecting the fraction of the individual's genome originated from a certain ancestral population denoted by that color and formally represented by a unique AP.

This method has been applied to the Human Genome Diversity Project–Centre d'Etude du Polymorphisme Humain (HGDP–CEPH) Human Genome Diversity Cell Line Panel and in many other studies, and has unraveled interesting patterns in the genetic structures of the world population. However, even though Structure was originally built on a genetic admixture model, in reality the structural patterns derived by Structure in various studies often turn out to be distinct clusters among the study populations (e.g., Figure 1), which has led many to think of it as a clustering program rather than a tool for uncovering genetic admixing as it was supposed to do. The design limitation of the Structure model behind this issue motivated us to develop a new approach in this article to analyze admixed genetic samples.

Figure 1. Population structural map inferred by Structure on HapMap data consisting of four populations.

A recent extension of Structure, known as Structurama (Pella and Masuda 2006), relaxes the finite dimensional assumption on ancestral populations in the admixture model by employing a Dirichlet process prior over the ancestral allele-frequency profiles. This allows automatic estimation of the maximum a posteriori probable number of ancestral populations. This extension is a useful improvement since it eliminates the need for manual selection of the number of ancestral populations. The NewHybrid program addresses the problem of classifying species hybrids into categories, using a model-based Bayesian clustering approach. While this problem is not exactly identical to the problem of stratifying the structure of highly admixed populations, it is useful for structural analysis of populations that were recently admixed. The BAPS program also uses a Bayesian approach to find the best partition of a set of individuals into subpopulations on the basis of genotypes.

Parallel to the aforementioned model-based approaches for genomic structural analysis, direct algebraic eigen-decomposition and dimensionality reduction methods, such as the Eigensoft program based on principal components analysis (PCA), offer an alternative approach to explore and visualize the ancestral composition of modern populations and facilitate formal statistical tests for significance of population differentiation. However, unlike the model-based methods such as Structure, where each inferred ancestral population bears a concrete genetic meaning as a population-specific allele-frequency profile, the eigenvectors computed by Eigensoft represent the mutually orthogonal directions in an abstract low-dimensional ancestral space, in which population samples can be embedded and visualized; these eigenvectors can be understood as mathematical surrogates of independent genetic sources underlying a population sample, but lack a concrete interpretation under a generative genetic inheritance model (from here on, we use the term “inheritance model” to describe the process by which a descendant allele is derived from an ancestral allele). Analyses based on Eigensoft are usually limited to two-dimensional ancestral spaces, offering limited power in stratifying highly admixed populations.

This progress notwithstanding, an important aspect of population admixing that is largely missing in the existing methods is the effect of allele mutations between the ancestral and current alleles of genetic markers, which can dramatically influence the accuracy of the structural estimation of current populations. It can also reveal additional information about population evolution, such as the relative divergence time and migration history of admixed populations.

Consider, for example, the Structure model. Since an AP merely represents the frequency of alleles in an ancestral population rather than the actual allelic content or haplotypes of the alleles themselves, the admixture models developed so far on the basis of APs do not model genetic changes due to mutations from the ancestral alleles. Indeed, a serious pitfall of the model underlying Structure, as has been pointed out, is that there is no mutation model for modern individual alleles with respect to hypothetical common prototypes in the ancestral populations. That means every unique allele in the modern population is assumed to have a distinct ancestral proportion, rather than allowing the possibility of it just being a descendant of some common ancestral allele that can also give rise to other closely related alleles at the same locus of other individuals in the modern population. Thus, while Structure aims to provide ancestry information for each individual and each locus, there is no explicit representation of the “ancestors” as a physical set of “founding alleles.” Therefore, the inferred population structural map emphasizes revealing the contributions of abstract population-specific ancestral proportion profiles, which does not necessarily reflect individual diversity or the extent of genetic changes with respect to the founders. Due to this limitation, Structure does not enable inference of the founding genetic patterns, the age of the founding alleles, or the population divergence time.

The lack of an appropriate allele mutation model in a structural inference program can also compromise our ability to reliably assess the amount or level of genetic admixing in different populations. The Structure model, like several other related models (Blei et al. 2003), is based on the fundamental assumption of the presence of genetic admixing among multiple founding populations. However, as we shall see later, on real population data such as the HGDP–CEPH panel, it produces results that favor clustering individuals into predominantly one allele-frequency profile or another, thus leading us to conclude that there was little or no admixing between the ancestral human populations. We believe that this occurs due to the absence of a mutation model in Structure. While a partitioning of individuals would be desirable for clustering them into groups, it does not offer enough biological insight into the intermixing of the populations.

In this article, we present mStruct (which stands for Structure under mutations), based on a new model: an admixture of population-specific mixtures of inheritance models (AdMim). Statistically, AdMim is an admixture of mixture models, which represents each ancestral population as a mixture of ancestral alleles each with its own inheritance process and each modern individual as an “ancestry vector” (or structure vector) that reflects membership proportions of the ancestral populations. As we explain shortly, mStruct facilitates estimation of both the structural map of populations and the mutation parameters of either SNP or microsatellite alleles under various contexts. A new variational inference algorithm, which is much faster than the MCMC algorithm used for Structure, was developed for estimating the structure vectors and other genetic parameters of interest. We compare our method with Structure on simulated genotype data and on the microsatellite and SNP genotype data of world populations. Our results using microsatellite data reveal the presence of significant levels of genetic admixing among the founding populations underlying the HGDP–CEPH cell line panel, as well as consequences of expansion of humans out of Africa. Our results suggest that the inability of Structure to model mutations during genetic admixing could have caused it to detect correct clustering but very low levels of genetic admixing in each modern population in the HGDP–CEPH data. We also report interesting visualizations of genetic divergence in world populations revealed by the mutation patterns estimated by mStruct. The mStruct software has been implemented in C++ and is available for download at http://www.sailing.cs.cmu.edu/mstruct.html.
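
The allele-frequency admixture model described in the entry above (each ancestral population has an allele-frequency profile, and each individual has an admixing vector of ancestral proportions) can be sketched for a single marker as follows. This is a minimal Python illustration of that mixture only; it is neither Structure's nor mStruct's code, it includes no mutation model, and the function name and the numbers are hypothetical.

```python
def individual_allele_probs(admixing_vector, allele_profiles):
    """Mix K ancestral allele-frequency profiles (APs) at one marker.

    admixing_vector: K ancestral proportions that sum to 1.
    allele_profiles: K lists, each giving allele frequencies at the marker
                     in one ancestral population (each list sums to 1).
    Returns the probability of drawing each allele for this individual.
    """
    if len(admixing_vector) != len(allele_profiles):
        raise ValueError("need one allele-frequency profile per ancestry")
    n_alleles = len(allele_profiles[0])
    return [
        sum(theta * profile[a]
            for theta, profile in zip(admixing_vector, allele_profiles))
        for a in range(n_alleles)
    ]


if __name__ == "__main__":
    # Made-up example: two ancestral populations, one biallelic marker.
    theta = [0.75, 0.25]            # admixing vector for one individual
    profiles = [[0.9, 0.1],         # AP of ancestral population 1
                [0.2, 0.8]]         # AP of ancestral population 2
    print(individual_allele_probs(theta, profiles))
```

In a structural map, the entries of `theta` are what the colored segments of each individual's vertical line represent.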

13.

Rare Variants Create Synthetic Genome-Wide Associations

Samuel P. Dickson Kai Wang Ian Krantz Hakon Hakonarson David B. Goldstein 《PLoS biology》2010,8(1)

Genome-wide association studies (GWAS) have now identified at least 2,000 common variants that appear associated with common diseases or related traits (http://www.genome.gov/gwastudies), hundreds of which have been convincingly replicated. It is generally thought that the associated markers reflect the effect of a nearby common (minor allele frequency >0.05) causal site, which is associated with the marker, leading to extensive resequencing efforts to find causal sites. We propose as an alternative explanation that variants much less common than the associated one may create “synthetic associations” by occurring, stochastically, more often in association with one of the alleles at the common site versus the other allele. Although synthetic associations are an obvious theoretical possibility, they have never been systematically explored as a possible explanation for GWAS findings. Here, we use simple computer simulations to show the conditions under which such synthetic associations will arise and how they may be recognized. We show that they are not only possible, but inevitable, and that under simple but reasonable genetic models, they are likely to account for or contribute to many of the recently identified signals reported in genome-wide association studies. We also illustrate the behavior of synthetic associations in real datasets by showing that rare causal mutations responsible for both hearing loss and sickle cell anemia create genome-wide significant synthetic associations, in the latter case extending over a 2.5-Mb interval encompassing scores of “blocks” of associated variants. In conclusion, uncommon or rare genetic variants can easily create synthetic associations that are credited to common variants, and this possibility requires careful consideration in the interpretation and follow up of GWAS signals.

14.

VitisNet: “Omics” Integration through Grapevine Molecular Networks

Jérôme Grimplet Grant R. Cramer Julie A. Dickerson Kathy Mathiason John Van Hemert Anne Y. Fennell 《PloS one》2009,4(12)


15.

Detecting Selection on Temporal and Spatial Scales: A Genomic Time-Series Assessment of Selective Responses to Devil Facial Tumor Disease

Anna Brüniche-Olsen Jeremy J. Austin Menna E. Jones Barbara R. Holland Christopher P. Burridge 《PloS one》2016,11(3)

Detecting loci under selection is an important task in evolutionary biology. In conservation genetics, detecting selection is key to investigating adaptation to the spread of infectious disease. Loci under selection can be detected on a spatial scale, accounting for differences in demographic history among populations, or on a temporal scale, tracing changes in allele frequencies over time. Here we use these two approaches to investigate selective responses to the spread of an infectious cancer, devil facial tumor disease (DFTD), that since 1996 has ravaged the Tasmanian devil (Sarcophilus harrisii). Using time-series ‘restriction site associated DNA’ (RAD) markers from populations before and after DFTD arrival, and from DFTD-free populations, we infer loci under selection due to DFTD and investigate signatures of selection that are incongruent among methods, populations, and times. The lack of congruence among populations influenced by DFTD with respect to inferred loci under selection, and the direction of that selection, fails to implicate a consistent selective role for DFTD. Instead, genetic drift is more likely driving the observed allele frequency changes over time. Our study illustrates the importance of applying methods with different performance optima, e.g., accounting for population structure and background selection, and assessing congruence of the results.
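To make the drift explanation concrete, the following minimal sketch simulates allele frequency change under a neutral Wright-Fisher model, a standard textbook model of genetic drift. It is not the analysis used in the study; the function name `simulate_drift` and its parameters are assumptions chosen for illustration.

```python
import random


def simulate_drift(p0, n_diploid, generations, seed=0):
    """Neutral Wright-Fisher drift for one biallelic locus.

    p0          starting allele frequency
    n_diploid   number of diploid individuals (2 * n_diploid gene copies)
    generations number of generations to simulate
    seed        seed for the random number generator, for reproducibility
    Returns the allele frequency in each generation, starting with p0.
    """
    rng = random.Random(seed)
    n_copies = 2 * n_diploid
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Each gene copy in the next generation is drawn independently
        # from the current allele frequency (binomial sampling).
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        freqs.append(p)
    return freqs


# Small populations show large frequency swings with no selection at all.
trajectory = simulate_drift(p0=0.5, n_diploid=25, generations=20, seed=1)
```

Running the function with different seeds gives trajectories that move in different directions, which is why frequency changes that are not congruent across populations are hard to attribute to selection.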

16.

Cellular Memory of Acquired Stress Resistance in Saccharomyces cerevisiae

Qiaoning Guan Suraiya Haroon Diego González Bravo Jessica L. Will Audrey P. Gasch 《Genetics》2012,192(2):495-505


17.

Safety of Onartuzumab in Patients with Solid Tumors: Experience to Date from the Onartuzumab Clinical Trial Program

Roland Morley Alison Cardenas Peter Hawkins Yasuyo Suzuki Virginia Paton See-Chun Phan Mark Merchant Jessie Hsu Wei Yu Qi Xia Daniel Koralek Patricia Luhn Wassim Aldairy 《PloS one》2015,10(10)

Background

Onartuzumab, a recombinant humanized monovalent monoclonal antibody directed against MET, the receptor for the hepatocyte growth factor, has been investigated for the treatment of solid tumors. This publication describes the safety profile of onartuzumab in patients with solid tumors using data from the global onartuzumab clinical development program.

Methods

Adverse event (AE) and laboratory data from onartuzumab phase II/III studies were analyzed and coded into standardized terms according to industry standards. The severity of AEs was assessed using the NCI Common Toxicity Criteria, Version 4. Medical Dictionary for Regulatory Activities (MedDRA) AEs were grouped using the standardized MedDRA queries (SMQs) “gastrointestinal (GI) perforation”, “embolic and thrombotic events, venous (VTE)”, and “embolic and thrombotic events, arterial (ATE)”, and the Adverse Event Group Term (AEGT) “edema.” The safety evaluable populations (patients who received at least one dose of study treatment) for each study were included in this analysis.

Results

A total of 773 onartuzumab-treated patients from seven studies (phase II, n = 6; phase III, n = 1) were included. Edema and VTEs were reported in onartuzumab-treated patients in all seven studies. Edema events in onartuzumab arms were generally grade 1–2 in severity, observed more frequently than in control arms and at incidences ranging from 25.4–65.7% for all grades and from 1.2–14.1% for grade 3. Hypoalbuminemia was also more frequent in onartuzumab arms and observed at frequencies between 77.8% and 98.3%. The highest frequencies of all grade and grade ≥3 VTE events were 30.3% and 17.2%, respectively, in onartuzumab arms. The cumulative incidence of all grade ATE events ranged from 0–5.6% (grade ≥3, 0–5.1%) in onartuzumab arms. The frequency of GI perforation was below 10% in all studies; the highest estimates were observed in studies with onartuzumab plus bevacizumab for all grades (0–6.2%) and grade ≥3 (0–6.2%).

Conclusions

The frequencies of VTE, ATE, GI perforation, hypoalbuminemia, and edema in clinical studies were higher in patients receiving onartuzumab than in control arms; these are considered to be expected events in patients receiving onartuzumab.

18.

Genotype-Frequency Estimation from High-Throughput Sequencing Data

Takahiro Maruki Michael Lynch 《Genetics》2015,201(2):473-486

Rapidly improving high-throughput sequencing technologies provide unprecedented opportunities for carrying out population-genomic studies with various organisms. To take full advantage of these methods, it is essential to correctly estimate allele and genotype frequencies, and here we present a maximum-likelihood method that accomplishes these tasks. The proposed method fully accounts for uncertainties resulting from sequencing errors and biparental chromosome sampling and yields essentially unbiased estimates with minimal sampling variances at moderately high depths of coverage, regardless of the mating system and structure of the population. Moreover, we have developed statistical tests for examining the significance of polymorphisms and their genotypic deviations from Hardy–Weinberg equilibrium. We examine the performance of the proposed method by computer simulations and apply it to low-coverage human data generated by high-throughput sequencing. The results show that the proposed method improves our ability to carry out population-genomic analyses in important ways. The software package of the proposed method is freely available from https://github.com/Takahiro-Maruki/Package-GFE.
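As a point of reference for the quantities named in this abstract, the sketch below computes an allele frequency, an inbreeding-style coefficient, and a chi-square statistic for deviation from Hardy–Weinberg equilibrium. It assumes genotypes are already known without error, so it is the textbook calculation rather than the authors' maximum-likelihood method, which works from sequence reads and models sequencing error. The function name `hwe_summary` is hypothetical and unrelated to the Package-GFE software.

```python
def hwe_summary(n_aa, n_ab, n_bb):
    """Allele frequency and Hardy-Weinberg deviation from genotype counts.

    Assumes error-free genotype calls (an assumption of this sketch).
    Returns (p, f, chi2):
      p     frequency of allele A
      f     1 - observed/expected heterozygosity (0 under equilibrium)
      chi2  chi-square statistic (1 degree of freedom) for the deviation
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")

    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:
        # Monomorphic site: no heterozygotes are expected, nothing to test.
        return p, 0.0, 0.0

    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    f = 1 - (n_ab / n) / (2 * p * q)
    return p, f, chi2


# Counts in exact Hardy-Weinberg proportions give f = 0 and chi2 = 0.
in_equilibrium = hwe_summary(25, 50, 25)

# A complete absence of heterozygotes gives f = 1.
no_heterozygotes = hwe_summary(50, 0, 50)
```

With low-coverage sequencing the genotype counts themselves are uncertain, which is the gap a likelihood method working directly from reads is meant to close.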

19.

lociNGS: A Lightweight Alternative for Assessing Suitability of Next-Generation Loci for Evolutionary Analysis

Sarah M. Hird 《PloS one》2012,7(10)

Genomic enrichment methods and next-generation sequencing produce uneven coverage for the portions of the genome (the loci) they target; this information is essential for ascertaining the suitability of each locus for further analysis. lociNGS is a user-friendly accessory program that takes multi-FASTA formatted loci, next-generation sequence alignments and demographic data as input and collates, displays and outputs information about the data. Summary information includes the parameters coverage per locus, coverage per individual and number of polymorphic sites, among others. The program can output the raw sequences used to call loci from next-generation sequencing data. lociNGS also reformats subsets of loci in three commonly used formats for multi-locus phylogeographic and population genetics analyses – NEXUS, IMa2 and Migrate. lociNGS is available at https://github.com/SHird/lociNGS and is dependent on installation of MongoDB (freely available at http://www.mongodb.org/downloads). lociNGS is written in Python and is supported on MacOSX and Unix; it is distributed under a GNU General Public License.

20.

Multiple Co-Evolutionary Networks Are Supported by the Common Tertiary Scaffold of the LacI/GalR Proteins

Daniel J. Parente Liskin Swint-Kruse 《PloS one》2013,8(12)
