Similar articles
 Found 20 similar articles; search took 15 ms
1.
Gene-based tests of association can increase the power of a genome-wide association study by aggregating multiple independent effects across a gene or locus into a single stronger signal. Recent gene-based tests have distinct approaches to selecting which variants to aggregate within a locus, modeling the effects of linkage disequilibrium, representing fractional allele counts from imputation, and managing permutation tests for p-values. Implementing these tests in a single, efficient framework has great practical value. Fast ASsociation Tests (Fast) addresses this need by implementing leading gene-based association tests together with conventional SNP-based univariate tests and providing a consolidated, easily interpreted report. Fast scales readily to genome-wide SNP data with millions of SNPs and tens of thousands of individuals, provides implementations that are orders of magnitude faster than original literature reports, and provides a unified framework for performing several gene-based association tests concurrently and efficiently on the same data. Availability: https://bitbucket.org/baderlab/fast/downloads/FAST.tar.gz, with documentation at https://bitbucket.org/baderlab/fast/wiki/Home
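The idea of aggregating per-SNP signals into one gene-level statistic and calibrating it by permutation can be illustrated with a minimal sketch. This is not Fast's implementation: the statistic (the sum over SNPs of n times the squared genotype-phenotype correlation), the function names, and the simulated data are all assumptions chosen for illustration. Permuting the phenotype rather than the genotypes leaves the linkage disequilibrium between SNPs intact, and the genotype values may be fractional dosages.

```python
import random

def corr_sq(x, y):
    """Squared Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)

def gene_stat(genotypes, phenotype):
    """Hypothetical gene-level statistic: sum over SNPs of n * r^2."""
    n = len(phenotype)
    return sum(n * corr_sq(snp, phenotype) for snp in genotypes)

def gene_permutation_p(genotypes, phenotype, n_perm=200, seed=0):
    """Permutation p-value; shuffling the phenotype preserves LD among SNPs."""
    rng = random.Random(seed)
    observed = gene_stat(genotypes, phenotype)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if gene_stat(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)

def simulate(n=200, n_snps=5, effect=1.0, seed=1):
    """Toy data: independent SNPs (allele frequency 0.3); only SNP 0 is causal."""
    rng = random.Random(seed)
    genotypes = [
        [(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(n)]
        for _ in range(n_snps)
    ]
    phenotype = [effect * genotypes[0][i] + rng.gauss(0, 1) for i in range(n)]
    return genotypes, phenotype

genotypes, phenotype = simulate()
p_value = gene_permutation_p(genotypes, phenotype)
```

With `n_perm` permutations the smallest attainable p-value is `1 / (n_perm + 1)`, so genome-wide significance thresholds require far more permutations than this toy example uses.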

2.
3.
RNA-binding proteins (RBPs) regulate splicing according to position-dependent principles, which can be exploited for analysis of regulatory motifs. Here we present RNAmotifs, a method that evaluates the sequence around differentially regulated alternative exons to identify clusters of short and degenerate sequences, referred to as multivalent RNA motifs. We show that diverse RBPs share basic positional principles, but differ in their propensity to enhance or repress exon inclusion. We assess exons differentially spliced between brain and heart, identifying known and new regulatory motifs, and predict the expression pattern of RBPs that bind these motifs. RNAmotifs is available at https://bitbucket.org/rogrro/rna_motifs.

4.
Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance than other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf.

5.
6.
Cancer can be a result of the accumulation of different types of genetic mutations such as copy number aberrations. The data from tumors are cross-sectional and do not contain the temporal order of the genetic events. Finding the order in which the genetic events have occurred and the progression pathways is of vital importance in understanding the disease. In order to model cancer progression, we propose Progression Networks, a special case of Bayesian networks, that are tailored to model disease progression. Progression networks have similarities with Conjunctive Bayesian Networks (CBNs) [1], a variation of Bayesian networks also proposed for modeling disease progression. We also describe a learning algorithm for learning Bayesian networks in general and progression networks in particular. We reduce the hard problem of learning the Bayesian and progression networks to Mixed Integer Linear Programming (MILP). MILP is a Non-deterministic Polynomial-time complete (NP-complete) problem for which very good heuristics exist. We tested our algorithm on synthetic and real cytogenetic data from renal cell carcinoma. We also compared our learned progression networks with the networks proposed in earlier publications. The software is available on the website https://bitbucket.org/farahani/diprog.

7.
De-novo motif search is a frequently applied bioinformatics procedure to identify and prioritize recurrent elements in sequence sets for biological investigation, such as the ones derived from high-throughput differential expression experiments. Several algorithms have been developed to perform motif search, employing widely different approaches and often giving divergent results. In order to maximize the power of these investigations and ultimately be able to draft solid biological hypotheses, multiple tools need to be applied to the same sequences and their results merged. However, motif reporting formats and statistical evaluation methods currently make such an integration task difficult to perform and mostly restricted to specific scenarios. We thus introduce here the Dynamic Motif Integration Toolkit (DynaMIT), an extremely flexible platform that identifies motifs with multiple algorithms, integrates them by means of a user-selected strategy and visualizes the results in several ways; furthermore, the platform is user-extendible in all its aspects. DynaMIT is freely available at http://cibioltg.bitbucket.org.

8.
dadi is a popular but computationally intensive program for inferring models of demographic history and natural selection from population genetic data. I show that running dadi on a Graphics Processing Unit can dramatically speed computation compared with the CPU implementation, with minimal user burden. Motivated by this speed increase, I also extended dadi to four- and five-population models. This functionality is available in dadi version 2.1.0, https://bitbucket.org/gutenkunstlab/dadi/.

9.

Background

Structure-based drug design is an iterative process, following cycles of structural biology, computer-aided design, synthetic chemistry and bioassay. In favorable circumstances, this process can yield hundreds of protein-ligand crystal structures. In addition, molecular dynamics (MD) simulations are increasingly being used to further explore the conformational landscape of these complexes. Currently, methods capable of analyzing ensembles of crystal structures and MD trajectories are limited and usually rely upon least squares superposition of coordinates.

Results

Novel methodologies are described for the analysis of multiple structures of a protein. Statistical approaches that rely upon residue equivalence, but not superposition, are developed. Tasks that can be performed include the identification of hinge regions, allosteric conformational changes and transient binding sites. The approaches are tested on crystal structures of CDK2 and other CMGC protein kinases and a simulation of p38α. Known relationships between interactions and conformational changes are highlighted, and new ones are revealed. A transient but druggable allosteric pocket in CDK2 is predicted to occur under the CMGC insert. Furthermore, an evolutionarily conserved conformational link from the location of this pocket, via the αEF-αF loop, to phosphorylation sites on the activation loop is discovered.

Conclusions

New methodologies are described and validated for the superimposition-independent conformational analysis of large collections of structures or simulation snapshots of the same protein. The methodologies are encoded in a Python package called Polyphony, which is released as open source to accompany this paper [http://wrpitt.bitbucket.org/polyphony/].

10.
11.
Viral phylodynamics is defined as the study of how epidemiological, immunological, and evolutionary processes act and potentially interact to shape viral phylogenies. Since the coining of the term in 2004, research on viral phylodynamics has focused on transmission dynamics in an effort to shed light on how these dynamics impact viral genetic variation. Transmission dynamics can be considered at the level of cells within an infected host, individual hosts within a population, or entire populations of hosts. Many viruses, especially RNA viruses, rapidly accumulate genetic variation because of short generation times and high mutation rates. Patterns of viral genetic variation are therefore heavily influenced by how quickly transmission occurs and by which entities transmit to one another. Patterns of viral genetic variation will also be affected by selection acting on viral phenotypes. Although viruses can differ with respect to many phenotypes, phylodynamic studies have to date tended to focus on a limited number of viral phenotypes. These include virulence phenotypes, phenotypes associated with viral transmissibility, cell or tissue tropism phenotypes, and antigenic phenotypes that can facilitate escape from host immunity. Due to the impact that transmission dynamics and selection can have on viral genetic variation, viral phylogenies can therefore be used to investigate important epidemiological, immunological, and evolutionary processes, such as epidemic spread [2], spatio-temporal dynamics including metapopulation dynamics [3], zoonotic transmission, tissue tropism [4], and antigenic drift [5]. The quantitative investigation of these processes through the consideration of viral phylogenies is the central aim of viral phylodynamics.
This is a “Topic Page” article for PLOS Computational Biology.

12.
High-throughput sequencing based techniques, such as 16S rRNA gene profiling, have the potential to elucidate the complex inner workings of natural microbial communities - be they from the world's oceans or the human gut. A key step in exploring such data is the identification of dependencies between members of these communities, which is commonly achieved by correlation analysis. However, it has been known since the days of Karl Pearson that the analysis of the type of data generated by such techniques (referred to as compositional data) can produce unreliable results since the observed data take the form of relative fractions of genes or species, rather than their absolute abundances. Using simulated and real data from the Human Microbiome Project, we show that such compositional effects can be widespread and severe: in some real data sets many of the correlations among taxa can be artifactual, and true correlations may even appear with opposite sign. Additionally, we show that community diversity is the key factor that modulates the acuteness of such compositional effects, and develop a new approach, called SparCC (available at https://bitbucket.org/yonatanf/sparcc), which is capable of estimating correlation values from compositional data. To illustrate a potential application of SparCC, we infer a rich ecological network connecting hundreds of interacting species across 18 sites on the human body. Using the SparCC network as a reference, we estimated that the standard approach yields 3 spurious species-species interactions for each true interaction and misses 60% of the true interactions in the human microbiome data, and, as predicted, most of the erroneous links are found in the samples with the lowest diversity.
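The compositional effect described above can be reproduced with a small simulation. The sketch below is not SparCC and does not use the SparCC code base; the function names, the log-normal abundance model, and the parameter values are assumptions made for illustration. It draws independent absolute abundances, converts each sample to relative fractions, and compares the Pearson correlation before and after that normalization.

```python
import random

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def simulate(n_samples=2000, n_taxa=3, seed=0):
    """Independent absolute abundances and their per-sample relative fractions."""
    rng = random.Random(seed)
    # Taxa are drawn independently, so true correlations between taxa are zero.
    absolute = [
        [rng.lognormvariate(0.0, 1.0) for _ in range(n_taxa)]
        for _ in range(n_samples)
    ]
    # Closure: each sample is rescaled so that its fractions sum to 1.
    relative = [[v / sum(row) for v in row] for row in absolute]
    return absolute, relative

absolute, relative = simulate()
r_absolute = pearson([row[0] for row in absolute], [row[1] for row in absolute])
r_relative = pearson([row[0] for row in relative], [row[1] for row in relative])
```

For D identically distributed taxa, the fractions are exchangeable and sum to one, which forces a population correlation of -1/(D-1) between any two of them. The artifact is therefore strongest when few taxa dominate the community and fades as `n_taxa` grows, in line with the abstract's observation that low-diversity samples are the most affected.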

13.
The segregation of homologous chromosomes during the Meiosis I division requires an obligate crossover per homolog pair (crossover assurance). In Saccharomyces cerevisiae and mammals, Msh4 and Msh5 proteins stabilize Holliday junctions and their progenitors to facilitate crossing over. S. cerevisiae msh4/5 hypomorphs that reduce crossover levels up to twofold at specific loci on chromosomes VII, VIII, and XV without affecting homolog segregation were identified recently. We use the msh4–R676W hypomorph to ask if the obligate crossover is insulated from variation in crossover frequencies, using a S. cerevisiae S288c/YJM789 hybrid to map recombination genome-wide. The msh4–R676W hypomorph made on average 64 crossovers per meiosis compared to 94 made in wild type and 49 in the msh4Δ mutant, confirming the defect seen at individual loci on a genome-wide scale. Crossover reductions in msh4–R676W and msh4Δ were significant across chromosomes regardless of size, unlike previous observations made at specific loci. The msh4–R676W hypomorph showed reduced crossover interference. Although crossover reduction in msh4–R676W is modest, 42% of the four-viable-spore tetrads showed nonexchange chromosomes. These results, along with modeling of crossover distribution, suggest that the significant reduction in crossovers across chromosomes and the loss of interference compromise the obligate crossover in the msh4 hypomorph. The high spore viability of the msh4 hypomorph is maintained by efficient segregation of the natural nonexchange chromosomes. Our results suggest that variation in crossover frequencies can compromise the obligate crossover and also support a mechanistic role for interference in obligate crossover formation.

14.
15.
The rapidly growing amount of genomic sequence data being generated and made publicly available necessitates the development of new data storage and archiving methods. The vast amount of data being shared and manipulated also creates new challenges for network resources. Thus, developing advanced data compression techniques is becoming an integral part of data production and analysis. The HapMap project is one of the largest public resources of human single-nucleotide polymorphisms (SNPs), characterizing over 3 million SNPs genotyped in over 1000 individuals. The standard format and biological properties of HapMap data suggest that a dedicated genetic compression method can outperform generic compression tools. We propose a compression methodology for genetic data by introducing HapZipper, a lossless compression tool tailored to compress HapMap data beyond benchmarks defined by generic tools such as gzip, bzip2 and lzma. We demonstrate the usefulness of HapZipper by compressing HapMap 3 populations to <5% of their original sizes. HapZipper is freely downloadable from https://bitbucket.org/pchanda/hapzipper/downloads/HapZipper.tar.bz2.
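The generic-tool baseline that the abstract benchmarks against can be measured with Python's standard library, which ships gzip, bzip2 and LZMA codecs. The sketch below does not implement or call HapZipper. The synthetic haplotype-like text, its layout, and the function names are assumptions for illustration only; real HapMap files have a different format and much stronger structure.

```python
import bz2
import gzip
import lzma
import random

def synthetic_haplotypes(n_snps=2000, n_haplotypes=50, seed=0):
    """Hypothetical tab-separated text: an rs-style ID and a 0/1 allele string per SNP."""
    rng = random.Random(seed)
    lines = []
    for i in range(n_snps):
        maf = rng.uniform(0.01, 0.5)  # minor allele frequency for this SNP
        alleles = "".join(
            "1" if rng.random() < maf else "0" for _ in range(n_haplotypes)
        )
        lines.append(f"rs{i}\t{alleles}")
    return "\n".join(lines).encode("ascii")

def benchmark(data):
    """Return compressed size / original size for each generic codec."""
    codecs = {
        "gzip": (gzip.compress, gzip.decompress),
        "bzip2": (bz2.compress, bz2.decompress),
        "lzma": (lzma.compress, lzma.decompress),
    }
    ratios = {}
    for name, (compress, decompress) in codecs.items():
        packed = compress(data)
        # Lossless check: decompression must reproduce the input exactly.
        if decompress(packed) != data:
            raise ValueError(f"{name} round trip failed")
        ratios[name] = len(packed) / len(data)
    return ratios

ratios = benchmark(synthetic_haplotypes())
```

A domain-specific compressor would be judged by whether its ratio on the same input falls below the smallest value in `ratios`.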

16.
Dbf4-dependent kinase (DDK) and cyclin-dependent kinase (CDK) are essential to initiate DNA replication at individual origins. During replication stress, the S-phase checkpoint inhibits the DDK- and CDK-dependent activation of late replication origins. Rad53 kinase is a central effector of the replication checkpoint and both binds to and phosphorylates Dbf4 to prevent late-origin firing. The molecular basis for the Rad53-Dbf4 physical interaction is not clear but occurs through the Dbf4 N terminus. Here we found that both Rad53 FHA1 and FHA2 domains, which specifically recognize phospho-threonine (pT), interacted with Dbf4 through an N-terminal sequence and an adjacent BRCT domain. Purified Rad53 FHA1 domain (but not FHA2) bound to a pT Dbf4 peptide in vitro, suggesting a possible phospho-threonine-dependent interaction between FHA1 and Dbf4. The Dbf4-Rad53 interaction is governed by multiple contacts that are separable from the Cdc5- and Msa1-binding sites in the Dbf4 N terminus. Importantly, abrogation of the Rad53-Dbf4 physical interaction blocked Dbf4 phosphorylation and allowed late-origin firing during replication checkpoint activation. This indicated that Rad53 must stably bind to Dbf4 to regulate its activity.

17.
eIF5A is an essential and evolutionarily conserved translation elongation factor, which has recently been proposed to be required for the translation of proteins with consecutive prolines. The binding of eIF5A to ribosomes occurs upon its activation by hypusination, a modification that requires spermidine, an essential factor for mammalian fertility that also promotes yeast mating. We show that in response to pheromone, hypusinated eIF5A is required for shmoo formation, localization of polarisome components, induction of cell fusion proteins, and actin assembly in yeast. We also show that eIF5A is required for the translation of Bni1, a proline-rich formin involved in polarized growth during shmoo formation. Our data indicate that translation of the polyproline motifs in Bni1 is eIF5A dependent and this translation dependency is lost upon deletion of the polyprolines. Moreover, an exogenous increase in Bni1 protein levels partially restores the defect in shmoo formation seen in eIF5A mutants. Overall, our results identify eIF5A as a novel and essential regulator of yeast mating through formin translation. Since eIF5A and polyproline formins are conserved across species, our results also suggest that eIF5A-dependent translation of formins could regulate polarized growth in such processes as fertility and cancer in higher eukaryotes.

18.
The unc-17 gene encodes the vesicular acetylcholine transporter (VAChT) in Caenorhabditis elegans. unc-17 reduction-of-function mutants are small, slow growing, and uncoordinated. Several independent unc-17 alleles are associated with a glycine-to-arginine substitution (G347R), which introduces a positive charge in the ninth transmembrane domain (TMD) of UNC-17. To identify proteins that interact with UNC-17/VAChT, we screened for mutations that suppress the uncoordinated phenotype of UNC-17(G347R) mutants. We identified several dominant allele-specific suppressors, including mutations in the sup-1 locus. The sup-1 gene encodes a single-pass transmembrane protein that is expressed in a subset of neurons and in body muscles. Two independent suppressor alleles of sup-1 are associated with a glycine-to-glutamic acid substitution (G84E), resulting in a negative charge in the SUP-1 TMD. A sup-1 null mutant has no obvious deficits in cholinergic neurotransmission and does not suppress unc-17 mutant phenotypes. Bimolecular fluorescence complementation (BiFC) analysis demonstrated close association of SUP-1 and UNC-17 in synapse-rich regions of the cholinergic nervous system, including the nerve ring and dorsal nerve cords. These observations suggest that UNC-17 and SUP-1 are in close proximity at synapses. We propose that electrostatic interactions between the UNC-17(G347R) and SUP-1(G84E) TMDs alter the conformation of the mutant UNC-17 protein, thereby restoring UNC-17 function; this is similar to the interaction between UNC-17/VAChT and synaptobrevin.

19.
Despite the importance of clathrin-mediated endocytosis (CME) for cell biology, it is unclear if all components of the machinery have been discovered and many regulatory aspects remain poorly understood. Here, using Saccharomyces cerevisiae and a fluorescence microscopy screening approach we identify previously unknown regulatory factors of the endocytic machinery. We further studied the top scoring protein identified in the screen, Ubx3, a member of the conserved ubiquitin regulatory X (UBX) protein family. In vivo and in vitro approaches demonstrate that Ubx3 is a new coat component. Ubx3-GFP has typical endocytic coat protein dynamics with a patch lifetime of 45 ± 3 sec. Ubx3 contains a W-box that mediates physical interaction with clathrin and Ubx3-GFP patch lifetime depends on clathrin. Deletion of the UBX3 gene caused defects in the uptake of Lucifer Yellow and the methionine transporter Mup1 demonstrating that Ubx3 is needed for efficient endocytosis. Further, the UBX domain is required both for localization and function of Ubx3 at endocytic sites. Mechanistically, Ubx3 regulates dynamics and patch lifetime of the early arriving protein Ede1 but not later arriving coat proteins or actin assembly. Conversely, Ede1 regulates the patch lifetime of Ubx3. Ubx3 likely regulates CME via the AAA-ATPase Cdc48, a ubiquitin-editing complex. Our results uncovered new components of the CME machinery that regulate this fundamental process.

20.
Constitutive transport of cellular materials is essential for cell survival. Although multiple small GTPase Rab proteins are required for the process, few regulators of Rabs are known. Here we report that EAT-17, a novel GTPase-activating protein (GAP), regulates RAB-6.2 function in grinder formation in Caenorhabditis elegans. We identified EAT-17 as a novel RabGAP that interacts with RAB-6.2, a protein that presumably regulates vesicle trafficking between Golgi, the endoplasmic reticulum, and plasma membrane to form a functional grinder. EAT-17 has a canonical GAP domain that is critical for its function. RNA interference against 25 confirmed and/or predicted RABs in C. elegans shows that RNAi against rab-6.2 produces a phenotype identical to eat-17. A directed yeast two-hybrid screen using EAT-17 as bait and each of the 25 RAB proteins as prey identifies RAB-6.2 as the interacting partner of EAT-17, confirming that RAB-6.2 is a specific substrate of EAT-17. Additionally, deletion mutants of rab-6.2 show grinder defects identical to those of eat-17 loss-of-function mutants, and both RAB-6.2 and EAT-17 are expressed in the terminal bulb of the pharynx where the grinder is located. Collectively, these results suggest that EAT-17 is a specific GTPase-activating protein for RAB-6.2. Based on the conserved function of Rab6 in vesicular transport, we propose that EAT-17 regulates the turnover rate of RAB-6.2 activity in cargo trafficking for grinder formation.
