1.

Spidermonkey: rapid detection of co-evolving sites using Bayesian graphical models

Poon AF Lewis FI Frost SD Kosakovsky Pond SL 《Bioinformatics (Oxford, England)》2008,24(17):1949-1950

Spidermonkey is a new component of the Datamonkey suite of phylogenetic tools that provides methods for detecting coevolving sites from a multiple alignment of homologous nucleotide or amino acid sequences. It reconstructs the substitution history of the alignment by maximum likelihood-based phylogenetic methods, and then analyzes the joint distribution of substitution events using Bayesian graphical models to identify significant associations among sites. AVAILABILITY: Spidermonkey is publicly available both as a web application at http://www.data-monkey.org and as a stand-alone component of the phylogenetic software package HyPhy, which is freely distributed on the web (http://www.hyphy.org) as precompiled binaries and open source.

2.

RASCAL: rapid scanning and correction of multiple sequence alignments

Thompson JD Thierry JC Poch O 《Bioinformatics (Oxford, England)》2003,19(9):1155-1161

MOTIVATION: Most multiple sequence alignment programs use heuristics that sometimes introduce errors into the alignment. The most commonly used methods to correct these errors use iterative techniques to maximize an objective function. We present here an alternative, knowledge-based approach that combines a number of recently developed methods into a two-step refinement process. The alignment is divided horizontally and vertically to form a 'lattice' in which well-aligned regions can be differentiated. Alignment correction is then restricted to the less reliable regions, leading to a more reliable and efficient refinement strategy. RESULTS: The accuracy and reliability of RASCAL are demonstrated using: (i) alignments from the BAliBASE benchmark database, where significant improvements were often observed, with no deterioration of the existing high-quality regions; (ii) a large-scale study involving 946 alignments from the ProDom protein domain database, where alignment quality was increased in 68% of the cases; and (iii) an automatic pipeline to obtain a high-quality alignment of 695 full-length nuclear receptor proteins, which took 11 min on a DEC Alpha 6100 computer. AVAILABILITY: RASCAL is available at ftp://ftp-igbmc.u-strasbg.fr/pub/RASCAL. SUPPLEMENTARY INFORMATION: http://bioinfo-igbmc.u-strasbourg.fr/BioInfo/RASCAL/paper/rascal_supp.html

3.

Detecting genomic features under weak selective pressure: the example of codon usage in animals and plants

Duret L 《Bioinformatics (Oxford, England)》2002,18(Z2):S91

Large-scale gene inactivation experiments in yeast have shown that 50% of genes have no detectable impact on the phenotype, and similar observations have been made in other model organisms. This apparent paradox is probably due to the fact that many genes make only a marginal contribution to the fitness of organisms. Because of the size of populations and the number of generations that can be studied in laboratories, experimental approaches can detect only functional elements that have a strong phenotypic impact. Comparative sequence analysis can help to solve this problem: the analysis of sequence evolution makes it possible to detect the action of selection, and hence to reveal functional features of genomes. This approach will be illustrated by the study of synonymous codon usage in animals and plants.

4.

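Evolutionary analysis and selective pressure detection of the CVNH domain

Qi XQ Gao L Su YJ Wang T 《遗传》2010,32(1)

Cyanovirin-N (CV-N), a cyanobacterial antiviral protein, has broad-spectrum antiviral activity. Its homologs form the CVNH (Cyanovirin-N homology) protein family, and the anti-human immunodeficiency virus domain of the family members is highly conserved in evolution. By reconstructing the gene tree, this article examines the "scattered distribution" of the CVNH domain in more detail and finds multiple copies of the domain in species including Aspergillus niger, Neosartorya fischeri, Penicillium chrysogenum, Neurospora crassa, Cyanothece and Ceratopteris. On this basis, adaptive evolution at the sequence sites of the CVNH domain was analyzed with a mechanistic model and with the MEC (mechanistic-empirical combination) model. The results show that: 1) neither class of model detected statistically significant positively selected sites; 2) purifying selection plays the dominant role in CVNH; 3) the MEC model fits the studied data better. Further tests of the ancestral branch of Cyanothece strains 7822 and 7424 with "branch-specific" and "branch-site" models showed that this branch has undergone adaptive evolution, and six positively selected sites were identified (34L, 63L, 13H, 76C, 78K and 80I).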

5.

Mclip: motif detection based on cliques of gapped local profile-to-profile alignments

Frickey T Weiller G 《Bioinformatics (Oxford, England)》2007,23(4):502-503

A multitude of motif-finding tools have been published, which can generally be assigned to one of three classes: expectation-maximization, Gibbs-sampling or enumeration. Irrespective of this grouping, most motif detection tools only take into account similarities across ungapped sequence regions, so short motifs located peripherally and at varying distances from a 'core' motif may be missed. We present a new method, adding to the set of expectation-maximization approaches, that permits the use of gapped alignments for motif elucidation. AVAILABILITY: The program is available for download from: http://bioinfoserver.rsbs.anu.edu.au/downloads/mclip.jar. SUPPLEMENTARY INFORMATION: http://bioinfoserver.rsbs.anu.edu.au/utils/mclip/info.php.

6.

In Arabidopsis thaliana codon volatility scores reflect GC3 composition rather than selective pressure

MJ O'Connell AM Doyle TE Juenger MT Donoghue C Keshavaiah R Tuteja C Spillane 《BMC research notes》2012,5(1):359

BACKGROUND: Synonymous codon usage bias has typically been correlated with, and attributed to translational efficiency. However, there are other pressures on genomic sequence composition that can affect codon usage patterns such as mutational biases. This study provides an analysis of the codon usage patterns in Arabidopsis thaliana in relation to gene expression levels, codon volatility, mutational biases and selective pressures. RESULTS: We have performed synonymous codon usage and codon volatility analyses for all genes in the A. thaliana genome. In contrast to reports for species from other kingdoms, we find that neither codon usage nor volatility are correlated with selection pressure (as measured by dN/dS), nor with gene expression levels on a genome wide level. Our results show that codon volatility and usage are not synonymous, rather that they are correlated with the abundance of G and C at the third codon position (GC3). CONCLUSIONS: Our results indicate that while the A. thaliana genome shows evidence for synonymous codon usage bias, this is not related to the expression levels of its constituent genes. Neither codon volatility nor codon usage are correlated with expression levels or selective pressures but, because they are directly related to the composition of G and C at the third codon position, they are the result of mutational bias. Therefore, in A. thaliana codon volatility and usage do not result from selection for translation efficiency or protein functional shift as measured by positive selection.
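The GC3 quantity referred to in this abstract (the abundance of G and C at the third codon position) can be illustrated with a minimal Python sketch. The function name `gc3` and the convention of ignoring a trailing partial codon are assumptions made for illustration; the abstract does not say how the authors computed the value.

```python
def gc3(cds: str) -> float:
    """Fraction of complete codons whose third position is G or C.

    Assumes `cds` is an in-frame coding sequence (hypothetical input
    convention); a trailing partial codon is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3          # length covered by whole codons
    thirds = [seq[i + 2] for i in range(0, usable, 3)]
    if not thirds:
        raise ValueError("sequence has no complete codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


if __name__ == "__main__":
    # Codons ATG GCC AAA TAG have third positions G, C, A, G.
    print(gc3("ATGGCCAAATAG"))
```

Correlating this per-gene value with volatility or expression would be a separate step that is not sketched here.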

7.

ddbRNA: detection of conserved secondary structures in multiple alignments

di Bernardo D Down T Hubbard T 《Bioinformatics (Oxford, England)》2003,19(13):1606-1611

MOTIVATION: Structured non-coding RNAs (ncRNAs) have a very important functional role in the cell. No distinctive general features common to all ncRNA have yet been discovered. This makes it difficult to design computational tools able to detect novel ncRNAs in the genomic sequence. RESULTS: We devised an algorithm able to detect conserved secondary structures in both pairwise and multiple DNA sequence alignments with computational time proportional to the square of the sequence length. We implemented the algorithm for the case of pairwise and three-way alignments and tested it on ncRNAs obtained from public databases. On the test sets, the pairwise algorithm has a specificity greater than 97% with a sensitivity varying from 22.26% for Blast alignments to 56.35% for structural alignments. The three-way algorithm behaves similarly. Our algorithm is able to efficiently detect a conserved secondary structure in multiple alignments.

8.

Mutation-selection models of codon substitution and their use to estimate selective strengths on codon usage

Yang Z Nielsen R 《Molecular biology and evolution》2008,25(3):568-579

Current models of codon substitution are formulated at the levels of nucleotide substitution and do not explicitly consider the separate effects of mutation and selection. They are thus incapable of inferring whether mutation or selection is responsible for evolution at silent sites. Here we implement a few population genetics models of codon substitution that explicitly consider mutation bias and natural selection at the DNA level. Selection on codon usage is modeled by introducing codon-fitness parameters, which together with mutation-bias parameters, predict optimal codon frequencies for the gene. The selective pressure may be for translational efficiency and accuracy or for fine-tuning translational kinetics to produce correct protein folding. We apply the models to compare mitochondrial and nuclear genes from several mammalian species. Model assumptions concerning codon usage are found to affect the estimation of sequence distances (such as the synonymous rate d(S), the nonsynonymous rate d(N), and the rate at the 4-fold degenerate sites d(4)), as found in previous studies, but the new models produced very similar estimates to some old ones. We also develop a likelihood ratio test to examine the null hypothesis that codon usage is due to mutation bias alone, not influenced by natural selection. Application of the test to the mammalian data led to rejection of the null hypothesis in most genes, suggesting that natural selection may be a driving force in the evolution of synonymous codon usage in mammals. Estimates of selection coefficients nevertheless suggest that selection on codon usage is weak and most mutations are nearly neutral. The sensitivity of the analysis on the assumed mutation model is discussed.
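The likelihood ratio test mentioned in this abstract follows a generic pattern that can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract gives neither the degrees of freedom nor the reference distribution, so the chi-squared approximation, the `df` argument and the function names `chi2_sf` and `likelihood_ratio_test` are all hypothetical choices, and the log-likelihoods in the usage lines are made-up numbers.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_r chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.sqrt(2 / math.pi) * math.exp(-x / 2) * total
    return math.erfc(chi / math.sqrt(2)) + tail


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return (statistic, p-value) for nested models; assumes a chi-squared null."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)


if __name__ == "__main__":
    # Hypothetical log-likelihoods: mutation-only model vs. mutation + selection.
    stat, p = likelihood_ratio_test(-1000.0, -997.0, df=2)
    print(f"2*delta(lnL) = {stat:.2f}, p = {p:.4f}")
```

A small p-value would lead to rejecting the mutation-bias-only model for that gene, which is the pattern the abstract reports for most mammalian genes.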

9.

A model of evolution with constant selective pressure for regulatory DNA sites

Farida N Enikeeva Ekaterina A Kotelnikova Mikhail S Gelfand Vsevolod J Makeev 《BMC evolutionary biology》2007,7(1):125

10.

Hepatitis A virus mutant spectra under the selective pressure of monoclonal antibodies: codon usage constraints limit capsid variability

Aragonès L Bosch A Pintó RM 《Journal of virology》2008,82(4):1688-1700

Severe structural constraints in the hepatitis A virus (HAV) capsid have been suggested as the reason for the lack of emergence of new serotypes in spite of the occurrence of complex distributions of mutants or quasispecies. Analysis of the HAV mutant spectra under immune pressure by the monoclonal antibodies (MAbs) K34C8 (immunodominant site) and H7C27 (glycophorin binding site) has revealed different evolutionary dynamics. Populations composed of complex ensembles of mutants with very low fitness or single dominant mutants with high fitness permit the acquisition of resistance to each of the MAbs, respectively. Deletion mutants were detected as components of the mutant spectra: up to 61 residues, with an average of 19, and up to 83 residues, with an average of 45, in VP3 and VP1 proteins, respectively. A clear negative selection of those replacements affecting the residues encoded by rare codons of the capsid surface has been detected through the present quasispecies analysis, confirming a certain beneficial role of such clusters. Since these clusters are located near or at the epitope regions, the need to maintain such clusters might prevent the emergence of new serotypes.

11.

RDP2: recombination detection and analysis from sequence alignments

Martin DP Williamson C Posada D 《Bioinformatics (Oxford, England)》2005,21(2):260-262

12.

TOPAL 2.0: improved detection of mosaic sequences within multiple alignments

McGuire G Wright F 《Bioinformatics (Oxford, England)》2000,16(2):130-134

MOTIVATION: The Dss statistic was proposed by McGuire et al. (Mol. Biol. Evol., 14, 1125-1131, 1997) for scanning data sets for the presence of recombination, an important step in some phylogenetic analyses. The statistic, however, could not distinguish well between among-site rate variation and recombination, and had no statistical test for significant values. This paper addresses these shortfalls. RESULTS: A modification to the Dss statistic is proposed which accounts for rate variation to a large extent. A statistical test, based on parametric bootstrapping, is also suggested. AVAILABILITY: The TOPAL package (version 2) may be accessed from http://www.bioss.sari.ac.uk/frank/Genetics and by anonymous ftp from ftp://ftp.bioss.sari.ac.uk in the directory pub/phylogeny/topal. CONTACT: frank@bioss.sari.ac.uk

13.

Incorporating evolution of transcription factor binding sites into annotated alignments

Bais AS Grossmann S Vingron M 《Journal of biosciences》2007,32(5):841-850

14.

Incorporating evolution of transcription factor binding sites into annotated alignments

Abha S. Bais Steffen Grossmann Martin Vingron 《Journal of biosciences》2007,32(5):841-850

15.

Stepwise detection of recombination breakpoints in sequence alignments

Graham J McNeney B Seillier-Moiseiwitsch F 《Bioinformatics (Oxford, England)》2005,21(5):589-595

MOTIVATION: We propose a stepwise approach to identify recombination breakpoints in a sequence alignment. The approach can be applied to any recombination detection method that uses a permutation test and provides estimates of breakpoints. RESULTS: We illustrate the approach by analyses of a simulated dataset and alignments of real data from HIV-1 and human chromosome 7. The presented simulation results compare the statistical properties of one-step and two-step procedures. More breakpoints are found with a two-step procedure than with a single application of a given method, particularly for higher recombination rates. At higher recombination rates, the additional breakpoints were located at the cost of only a slight increase in the number of falsely declared breakpoints. However, a large proportion of breakpoints still go undetected. AVAILABILITY: A makefile and C source code for phylogenetic profiling and the maximum chi-squared method, tested with the gcc compiler on Linux and Windows XP, are available at http://stat-db.stat.sfu.ca/stepwise/ CONTACT: jgraham@stat.sfu.ca.
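As a rough illustration of a recombination detection method that uses a permutation test and provides breakpoint estimates, the Python sketch below implements a simplified two-sequence form of the maximum chi-squared idea. It is not the authors' C code and does not implement their stepwise procedure. The input encoding (one 0/1 flag per site, 1 meaning the two sequences differ), the function names and the default of 999 permutations are assumptions made for the example.

```python
import random


def max_chi2(diffs):
    """Scan all split points and return (largest 2x2 chi-squared, split index).

    `diffs` holds one 0/1 flag per site (1 = the two sequences differ there).
    The split index k means sites [0, k) are on the left of the breakpoint.
    """
    n, total = len(diffs), sum(diffs)
    best, best_k = 0.0, None
    left = 0
    for k in range(1, n):
        left += diffs[k - 1]
        a, b = left, k - left                        # left: differ / same
        c, d = total - left, (n - k) - (total - left)  # right: differ / same
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        if denom == 0:
            continue                                  # degenerate table
        stat = n * (a * d - b * c) ** 2 / denom
        if stat > best:
            best, best_k = stat, k
    return best, best_k


def permutation_p_value(diffs, n_perm=999, rng=None):
    """Permutation p-value for the observed maximum (sites shuffled at random)."""
    rng = rng or random.Random()
    observed, _ = max_chi2(diffs)
    shuffled = list(diffs)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_chi2(shuffled)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    sites = [0] * 20 + [1] * 20          # toy data with an obvious change at site 20
    print(max_chi2(sites))
    print(permutation_p_value(sites, rng=random.Random(1)))
```

A stepwise scheme in the spirit of the abstract would presumably re-apply such a test within the segments on either side of a declared breakpoint, but the details are not given in the abstract and are not sketched here.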

16.

Significance of nucleotide sequence alignments: a method for random sequence permutation that preserves dinucleotide and codon usage

Altschul SF; Erickson BW 《Molecular biology and evolution》1985,2(6):526-538

The similarity of two nucleotide sequences is often expressed in terms of evolutionary distance, a measure of the amount of change needed to transform one sequence into the other. Given two sequences with a small distance between them, can their similarity be explained by their base composition alone? The nucleotide order of these sequences contributes to their similarity if the distance is much smaller than their average permutation distance, which is obtained by calculating the distances for many random permutations of these sequences. To determine whether their similarity can be explained by their dinucleotide and codon usage, random sequences must be chosen from the set of permuted sequences that preserve dinucleotide and codon usage. The problem of choosing random dinucleotide and codon-preserving permutations can be expressed in the language of graph theory as the problem of generating random Eulerian walks on a directed multigraph. An efficient algorithm for generating such walks is described. This algorithm can be used to choose random sequence permutations that preserve (1) dinucleotide usage, (2) dinucleotide and trinucleotide usage, or (3) dinucleotide and codon usage. For example, the similarity of two 60-nucleotide DNA segments from the human beta-1 interferon gene (nucleotides 196-255 and 499-558) is not just the result of their nonrandom dinucleotide and codon usage.
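The Eulerian-walk formulation in this abstract can be made concrete with a short Python sketch covering case (1) only, dinucleotide-preserving permutation. It follows the commonly described form of the approach (choose a random "last exit" edge for each symbol, require those edges to lead to the final symbol, shuffle the remaining edges, then walk the graph); the function names and the rejection loop are illustrative assumptions rather than the authors' published code, and the trinucleotide- and codon-preserving variants are not covered.

```python
import random
from collections import Counter, defaultdict
from typing import Optional


def dinucleotide_counts(seq: str) -> Counter:
    """Multiset of adjacent symbol pairs in seq."""
    return Counter(zip(seq, seq[1:]))


def _reaches(start: str, target: str, last_exit: dict) -> bool:
    """Follow last-exit edges from start; True if target is reached without a cycle."""
    seen = set()
    v = start
    while v != target:
        if v in seen:
            return False
        seen.add(v)
        v = last_exit[v]
    return True


def dinucleotide_shuffle(seq: str, rng: Optional[random.Random] = None) -> str:
    """Random permutation of seq that preserves every dinucleotide count."""
    rng = rng or random.Random()
    if len(seq) < 3:
        return seq
    # Directed multigraph: one edge a -> b for every adjacent pair "ab".
    successors = defaultdict(list)
    for a, b in zip(seq, seq[1:]):
        successors[a].append(b)
    first, last = seq[0], seq[-1]
    vertices = [v for v in successors if v != last]
    # Draw one last-exit edge per vertex until they all lead to the final symbol.
    while True:
        last_exit = {v: rng.choice(successors[v]) for v in vertices}
        if all(_reaches(v, last, last_exit) for v in vertices):
            break
    # Shuffle the other edges of each vertex and put the last exit at the end.
    order = {}
    for v, outs in successors.items():
        outs = list(outs)
        if v in last_exit:
            outs.remove(last_exit[v])
            rng.shuffle(outs)
            outs.append(last_exit[v])
        else:
            rng.shuffle(outs)
        order[v] = outs
    # Walk the graph from the first symbol, using each vertex's edges in order.
    used = {v: 0 for v in order}
    current, result = first, [first]
    for _ in range(len(seq) - 1):
        nxt = order[current][used[current]]
        used[current] += 1
        result.append(nxt)
        current = nxt
    return "".join(result)


if __name__ == "__main__":
    original = "ACGTTGCAACGTAGCTAGCTAGGATCCA"
    shuffled = dinucleotide_shuffle(original, random.Random(0))
    print(shuffled)
    print(dinucleotide_counts(shuffled) == dinucleotide_counts(original))
```

Because dinucleotide counts are preserved, the first and last symbols and the base composition are preserved as well, so distances computed on many such permutations can serve as the dinucleotide-aware baseline the abstract describes.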

17.

A novel approach to remote homology detection: jumping alignments.

Rainer Spang Marc Rehmsmeier Jens Stoye 《Journal of computational biology》2002,9(5):747-760

We describe a new algorithm for protein classification and the detection of remote homologs. The rationale is to exploit both vertical and horizontal information of a multiple alignment in a well-balanced manner. This is in contrast to established methods such as profiles and profile hidden Markov models which focus on vertical information as they model the columns of the alignment independently and to family pairwise search which focuses on horizontal information as it treats given sequences separately. In our setting, we want to select from a given database of "candidate sequences" those proteins that belong to a given superfamily. In order to do so, each candidate sequence is separately tested against a multiple alignment of the known members of the superfamily by means of a new jumping alignment algorithm. This algorithm is an extension of the Smith-Waterman algorithm and computes a local alignment of a single sequence and a multiple alignment. In contrast to traditional methods, however, this alignment is not based on a summary of the individual columns of the multiple alignment. Rather, the candidate sequence is at each position aligned to one sequence of the multiple alignment, called the "reference sequence." In addition, the reference sequence may change within the alignment, while each such jump is penalized. To evaluate the discriminative quality of the jumping alignment algorithm, we compare it to profiles, profile hidden Markov models, and family pairwise search on a subset of the SCOP database of protein domains. The discriminative quality is assessed by median false positive counts (med-FP-counts). For moderate med-FP-counts, the number of successful searches with our method is considerably higher than with the competing methods.

18.

A combined PCR and selective enrichment method for rapid detection of Listeria monocytogenes

S. Fitter M. Heuzenroeder C.J. Thomas 《Journal of applied microbiology》1992,73(1):53-59

Development of a routine detection assay for Listeria monocytogenes in foods that uses the polymerase chain reaction (PCR) and enrichment cultures was investigated. Oligonucleotide primers were chosen to amplify a 3' region of the L. monocytogenes hlyA gene spanning a conserved HindIII site. PCR detection sensitivity for L. monocytogenes in dilutions of pure enrichment cultures was between 50 and 500 colony forming units. A short enrichment period before PCR amplification allowed detection of the organisms in a range of complex foods contaminated with 10⁴ cfu/g. Detection sensitivity for the assay in the presence of chicken skin and soft cheese was determined at 10–100 cfu/g. Utilization of enrichment cultures and PCR allowed identification of the organism within 24 h or 2 days.

19.

DbClustal: rapid and reliable global multiple alignments of protein sequences detected by database searches

Thompson JD Plewniak F Thierry J Poch O 《Nucleic acids research》2000,28(15):2919-2926

DbClustal addresses the important problem of the automatic multiple alignment of the top scoring full-length sequences detected by a database homology search. By combining the advantages of both local and global alignment algorithms into a single system, DbClustal is able to provide accurate global alignments of highly divergent, complex sequence sets. Local alignment information is incorporated into a ClustalW global alignment in the form of a list of anchor points between pairs of sequences. The method is demonstrated using anchors supplied by the Blast post-processing program, Ballast. The rapidity and reliability of DbClustal have been demonstrated using the recently annotated Pyrococcus abyssi proteome where the number of alignments with totally misaligned sequences was reduced from 20% to <2%. A web site has been implemented proposing BlastP database searches with automatic alignment of the top hits by DbClustal.

20.

Estimates of statistical significance for comparison of individual positions in multiple sequence alignments

Ruslan I Sadreyev Nick V Grishin 《BMC bioinformatics》2004,5(1):106

Background

Profile-based analysis of multiple sequence alignments (MSA) allows for accurate comparison of protein families. Here, we address the problems of detecting statistically confident dissimilarities (1) between an MSA position and a set of predicted residue frequencies, and (2) between two MSA positions. These problems are important for (i) evaluation and optimization of methods predicting residue occurrence at protein positions; (ii) detection of potentially misaligned regions in automatically produced alignments and their further refinement; and (iii) detection of sites that determine functional or structural specificity in two related families.