Found 20 similar documents (search time: 9 ms)
1.
Bordewich Magnus Deutschmann Ina Maria Fischer Mareike Kasbohm Elisa Semple Charles Steel Mike 《Journal of mathematical biology》2018,77(3):527-544
Phylogenetic inference aims to reconstruct the evolutionary relationships of different species based on genetic (or other) data. Discrete characters are a...
2.
Despite the recent surge of interest in studying the evolution of development, surprisingly little work has been done to investigate the phylogenetic signal in developmental characters. Yet, both the potential usefulness of developmental characters in phylogenetic reconstruction and the validity of inferences on the evolution of developmental characters depend on the presence of such a phylogenetic signal and on the ability of our coding scheme to capture it. In a recent study, we showed, using simulations, that a new method (called the continuous analysis) using standardized time or ontogenetic sequence data and squared-change parsimony outperformed event pairing and event cracking in analyzing developmental data on a reference phylogeny. Using the same simulated data, we demonstrate that all these coding methods (event pairing and standardized time or ontogenetic sequence data) can be used to produce phylogenetically informative data. Despite some dependence between characters (the position of an event in an ontogenetic sequence is not independent of the position of other events in the same sequence), parsimony analysis of such characters converges on the correct phylogeny as the amount of data increases. In this context, the new coding method (developed for the continuous analysis) outperforms event pairing; it recovers a lower proportion of incorrect clades. This study thus validates the use of ontogenetic data in phylogenetic inference and presents a simple coding scheme that can extract a reliable phylogenetic signal from these data.
3.
Bicego M Dellaglio F Felis GE 《Journal of bioinformatics and computational biology》2007,5(5):1069-1085
The crucial role played by the analysis of microbial diversity in biotechnology-based innovations has increased interest in microbial taxonomy research. Phylogenetic sequence analyses have contributed significantly to the advances in this field, also in view of the large amount of sequence data collected in recent years. Phylogenetic analyses can be carried out on the basis of protein-encoding nucleotide sequences or of the encoded amino acid sequences: these two approaches have different characteristics, although they start from two alternative representations of the same information. This complementarity could be exploited to achieve a multimodal phylogenetic scheme that integrates gene and protein information in order to produce a single final tree. This aspect has been poorly addressed in the literature. In this paper, we propose to integrate the two phylogenetic analyses using basic schemes derived from multimodality fusion theory (or multiclassifier systems theory), a well-founded and rigorous field whose power has already been demonstrated in other pattern recognition contexts. The proposed approach can be applied to distance matrix-based phylogenetic techniques (such as neighbor joining), resulting in a smart and fast method. The proposed methodology has been tested in a real case involving sequences of some species of lactic acid bacteria. With this dataset, both nucleotide sequence- and amino acid sequence-based phylogenetic analyses present some drawbacks, which are overcome with the multimodal analysis.
4.
Phylogenetic analyses based on mitochondrial DNA have yielded widely differing relationships among members of the arthropod lineage Arachnida, depending on the nucleotide coding schemes and models of evolution used. We enhanced taxonomic coverage within the Arachnida greatly by sequencing seven new arachnid mitochondrial genomes from five orders. We then used all 13 mitochondrial protein-coding genes from these genomes to evaluate patterns of nucleotide and amino acid biases. Our data show that two of the six orders of arachnids (spiders and scorpions) have experienced shifts in both nucleotide and amino acid usage in all their protein-coding genes, and that these biases mislead phylogeny reconstruction. These biases are most striking for the hydrophobic amino acids isoleucine and valine, which appear to have evolved asymmetrical exchanges in response to shifts in nucleotide composition. To improve phylogenetic accuracy based on amino acid differences, we tested two recoding methods: (1) removing all isoleucine and valine sites and (2) recoding amino acids based on their physiochemical properties. We find that these methods yield phylogenetic trees that are consistent in their support of ancient intraordinal divergences within the major arachnid lineages. Further refinement of amino acid recoding methods may help us better delineate interordinal relationships among these diverse organisms.
5.
6.
H W Detrich S A Overton 《Comparative biochemistry and physiology. B, Comparative biochemistry》1988,90(3):593-600
1. Tubulins purified from the brain tissues of three Antarctic fishes (Notothenia gibberifrons, Notothenia coriiceps neglecta, and Chaenocephalus aceratus) contain equimolar quantities of the alpha and beta chains and are free of microtubule-associated proteins (MAPs) and other non-tubulin proteins. 2. When examined by isoelectric focusing and by two-dimensional electrophoresis, brain tubulins from the Antarctic fishes were found to be highly heterogeneous; each was resolved into 15-20 distinct variants. The range of isoelectric points displayed by the Antarctic fish tubulins (5.30-5.75) is slightly more basic than that of bovine brain tubulin (5.25-5.60). 3. Peptide mapping demonstrated that tubulins from the Antarctic fishes and the cow differ in structure. 4. The amino acid compositions of piscine and mammalian tubulins are similar, but the Antarctic fish tubulins apparently contain fewer glutamyl and/or glutaminyl residues than do tubulins from the temperate channel catfish (Ictalurus punctatus) and the cow. 5. Native tubulin from N. coriiceps neglecta possesses 1-2 fewer net negative charges per tubulin dimer than does bovine tubulin. 6. We suggest that the enhanced assembly of Antarctic fish tubulins at low temperatures (-2 to +2 degrees C) results from adaptive, perhaps subtle, changes in their tubulin subunits.
7.
Y Kiho 《Cell structure and function》1988,13(5):387-405
Two methods of qualitative analysis of sequence distribution in DNA and protein are presented. The first method is based on the finding that the frequency of occurrence of each nucleotide in a defined sequence with functional significance deviates, to a greater or lesser extent, from a uniform distribution. The deviation found in this defined sequence seems to parallel the function of this sequence. In the second method, two model compounds (trypsin and its inhibitor) have been used to examine the topological fit between their local structures. An acrophilicity parameter for amino acids was used to construct the topological structure. Both methods may find practical application in algorithms to design functional DNA and protein molecules.
8.
9.
10.
In computational structural biology, structure comparison is fundamental for our understanding of proteins. Structure comparison is, e.g., algorithmically the starting point for computational studies of structural evolution and it guides our efforts to predict protein structures from their amino acid sequences. Most methods for structural alignment of protein structures optimize the distances between aligned and superimposed residue pairs, i.e., the distances traveled by the aligned and superimposed residues during linear interpolation. Considering such a linear interpolation, these methods do not differentiate if there is room for the interpolation, if it causes steric clashes, or more severely, if it changes the topology of the compared protein backbone curves. To distinguish such cases, we analyze the linear interpolation between two aligned and superimposed backbones. We quantify the amount of steric clashes and find all self-intersections in a linear backbone interpolation. To determine if the self-intersections alter the protein’s backbone curve significantly or not, we present a path-finding algorithm that checks if there exists a self-avoiding path in a neighborhood of the linear interpolation. A new path is constructed by altering the linear interpolation using a novel interpretation of Reidemeister moves from knot theory working on three-dimensional curves rather than on knot diagrams. Either the algorithm finds a self-avoiding path or it returns a smallest set of essential self-intersections. Each of these indicates a significant difference between the folds of the aligned protein structures. As expected, we find at least one essential self-intersection separating most unknotted structures from a knotted structure, and we find even larger motions in proteins connected by obstruction free linear interpolations. 
We also find examples of homologous proteins that are differently threaded, and we find many distinct folds connected by longer but simple deformations. TM-align is one of the most restrictive alignment programs. With standard parameters, it only aligns residues superimposed within 5 Ångström distance. We find 42165 topological obstructions between aligned parts in 142068 TM-alignments. Thus, this restrictive alignment procedure still allows topological dissimilarity of the aligned parts. Based on the data we conclude that our program ProteinAlignmentObstruction provides significant additional information to alignment scores based solely on distances between aligned and superimposed residue pairs.
11.
12.
13.
The presence in proteins of amino acid residues that change in concert during evolution is associated with keeping the protein's spatial structure and functions constant. As is the case with morphological features, correlated substitutions may become the cause of homoplasies, that is, the independent evolution of identical non-homologous adaptations. Our data obtained on model phylogenetic trees and corresponding sets of sequences have shown that the presence of correlated substitutions distorts the results of phylogenetic reconstructions. A method for accounting for co-evolving amino acid residues in phylogenetic analysis is proposed. According to this method, only a single site from each group of correlated amino acid positions should be retained, whereas the other positions should not be used in further phylogenetic analysis. The simulations performed have shown that replacing, on average, 8% of the variable positions in a pair of model sequences with coordinately evolving amino acid residues is enough to change the tree topology. The removal of such amino acid residues from sequences before phylogenetic analysis restores the correct topology.
14.
Proteus mirabilis amino acid deaminase: cloning, nucleotide sequence, and characterization of aad. Total citations: 2, self-citations: 0, citations by others: 2
Proteus, Providencia, and Morganella species produce deaminases that generate alpha-keto acids from amino acids. The alpha-keto acid products are detected by the formation of colored iron complexes, raising the possibility that the enzyme functions to secure iron for these species, which do not produce traditional siderophores. A gene encoding an amino acid deaminase of uropathogenic Proteus mirabilis was identified by screening a genomic library hosted in Escherichia coli DH5 alpha for amino acid deaminase activity. The deaminase gene, localized on a cosmid clone by subcloning and Tn5::751 mutagenesis, was subjected to nucleotide sequencing. A single open reading frame, designated aad (amino acid deaminase), which appears to be both necessary and sufficient for deaminase activity, predicts a 473-amino-acid polypeptide (51,151 Da) encoded within an area mapped by transposon mutagenesis. The predicted amino acid sequence of Aad did not share significant amino acid sequence similarity with any other polypeptide in the PIR or SwissProt database. Amino acid deaminase activity in both P. mirabilis and E. coli transformed with aad-encoding plasmids was not affected by medium iron concentration or expression of genes in multicopy in fur, cya, or crp E. coli backgrounds. Enzyme expression was negatively affected by growth with glucose or glycerol as the sole carbon source but was not consistent with catabolite repression.
15.
Sarani Ghoshal Lynn Jones Ramin Homayouni 《Metabolomics : Official journal of the Metabolomic Society》2014,10(2):250-258
4-Nitrophenyl phosphatase domain and non-neuronal SNAP25-like protein homolog 1 (NIPSNAP1) is an evolutionarily conserved protein found in a variety of species ranging from C. elegans to human. NIPSNAP1 protein is localized in mitochondria and is highly expressed in liver, brain and kidney. The molecular and cellular roles of NIPSNAP1 are still unknown. To gain insights into the function of NIPSNAP1, we generated a mouse model with a disruption of the Nipsnap1 gene and performed metabolomic analysis on liver tissue from these mice. Liver samples from 13 to 15 month old NIPSNAP1 deficient (n = 7) and wild-type (n = 8) mice were extracted and processed for analysis using liquid/gas chromatography followed by mass spectrometry (LC/MS and GC/MS). We examined a total of 291 compounds in liver samples and found 45 compounds whose levels were significantly altered (p < 0.05, Welch’s t test) in NIPSNAP1 deficient mice compared to controls. These compounds were associated with a variety of processes, including metabolism of nucleotides, amino acids and lipids. In addition, we found a significant reduction in reduced glutathione (GSH) (0.63-fold change, p < 0.05) and elevation in cysteine–glutathione disulfide (2.77-fold change, p < 0.05). Our results suggest that NIPSNAP1 deficiency affects multiple processes in intermediate metabolism and results in oxidative stress in the liver.
16.
Thermophilic prokaryotes have characteristic patterns of codon usage, amino acid composition and nucleotide content. Total citations: 13, self-citations: 0, citations by others: 13
A number of recent studies have shown that thermophilic prokaryotes have distinguishable patterns of both synonymous codon usage and amino acid composition, indicating the action of natural selection related to thermophily. On the other hand, several other studies of whole genomes have illustrated that nucleotide bias can have dramatic effects on synonymous codon usage and also on the amino acid composition of the encoded proteins. This raises the possibility that the thermophile-specific patterns observed at both the codon and protein levels are merely reflections of a single underlying effect at the level of nucleotide composition. Moreover, such an effect at the nucleotide level might be due entirely to mutational bias. In this study, we have compared the genomes of thermophiles and mesophiles at three levels: nucleotide content, codon usage and amino acid composition. Our results indicate that the genomes of thermophiles are distinguishable from mesophiles at all three levels and that the codon and amino acid frequency differences cannot be explained simply by the patterns of nucleotide composition. At the nucleotide level, we see a consistent tendency for the frequency of adenine to increase at all codon positions within the thermophiles. Thermophiles are also distinguished by their pattern of synonymous codon usage for several amino acids, particularly arginine and isoleucine. At the protein level, the most dramatic effect is a two-fold decrease in the frequency of glutamine residues among thermophiles. These results indicate that adaptation to growth at high temperature requires a coordinated set of evolutionary changes affecting (i) mRNA thermostability, (ii) stability of codon-anticodon interactions and (iii) increased thermostability of the protein products. We conclude that elevated growth temperature imposes selective constraints at all three molecular levels: nucleotide content, codon usage and amino acid composition. 
In addition to these multiple selective effects, however, the genomes of both thermophiles and mesophiles are often subject to superimposed large changes in composition due to mutational bias.
17.
We isolated 7.4 mg of pure renin from 2 kg of rat kidneys using affinity chromatography on pepstatin-aminohexyl-Sepharose and an octapeptide renin inhibitor, H-77-Sepharose. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis showed that renin consists of two polypeptide chains linked by a disulfide bond, one of Mr = 36,000 (heavy chain) and the other of Mr = 3,000 (light chain). The amino-terminal 10-amino acid sequences of the heavy and the light chains were identical to the sequences beginning at Ser72 and Asp355, respectively, of the amino acid sequence of preprorenin deduced from the renin cDNA sequence. Amino acid sequencing of the carboxyl-terminal peptide of the heavy chain, generated by digestion with lysyl endopeptidase, showed that the carboxyl-terminal residue of the heavy chain is Phe. Thus, the propeptide of prorenin is cleaved after Thr71, followed by removal of two amino acids, Arg353 and Asn354, the result being formation of the heavy and light chains. Thus, the site of cleavage of rat prorenin is after a nonbasic amino acid, in contrast to the cleavage of the propeptide after a pair of basic amino acids in mouse submaxillary renin, human renal renin, and many secretory proteins. Treatment of renin with neuraminidase or glycopeptidase F had no apparent effect on the charge heterogeneity of renin. Glycosylation probably does not contribute to charge heterogeneity.
18.
Comprehensive analysis of amino acid and nucleotide composition in eukaryotic genomes, comparing genes and pseudogenes. Total citations: 11, self-citations: 0, citations by others: 11
Echols N Harrison P Balasubramanian S Luscombe NM Bertone P Zhang Z Gerstein M 《Nucleic acids research》2002,30(11):2515-2523
Based on searches for disabled homologs to known proteins, we have identified a large population of pseudogenes in four sequenced eukaryotic genomes: the worm, yeast, fly and human (chromosomes 21 and 22 only). Each of our nearly 2500 pseudogenes is characterized by one or more disablements mid-domain, such as premature stops and frameshifts. Here, we perform a comprehensive survey of the amino acid and nucleotide composition of these pseudogenes in comparison to that of functional genes and intergenic DNA. We show that pseudogenes invariably have an amino acid composition intermediate between genes and translated intergenic DNA. Although the degree of intermediacy varies among the four organisms, in all cases, it is most evident for amino acid types that differ most in occurrence between genes and intergenic regions. The same intermediacy also applies to codon frequencies, especially in the worm and human. Moreover, the intermediate composition of pseudogenes applies even though the composition of the genes in the four organisms is markedly different, showing a strong correlation with the overall A/T content of the genomic sequence. Pseudogenes can be divided into ‘ancient’ and ‘modern’ subsets, based on the level of sequence identity with their closest matching homolog (within the same genome). Modern pseudogenes usually have a much closer sequence composition to genes than ancient pseudogenes. Collectively, our results indicate that the composition of pseudogenes that are under no selective constraints progressively drifts from that of coding DNA towards non-coding DNA. Therefore, we propose that the degree to which pseudogenes approach a random sequence composition may be useful in dating different sets of pseudogenes, as well as in assessing the rate at which intergenic DNA accumulates mutations. Our compositional analyses with the interactive viewer are available over the web at http://genecensus.org/pseudogene.
19.
Given two sequences, a pattern of length m, a text of length n and a positive integer k, we give two algorithms. The first finds all occurrences of the pattern in the text as long as these do not differ from each other by more than k differences. It runs in O(nk) time. The second algorithm finds all subsequence alignments between the pattern and the text with at most k differences. This algorithm runs in O(nmk) time, is very simple and easy to program.
Received on August 12, 1987; accepted on December 31, 1987
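To make the "k differences" problem in this abstract concrete, the sketch below uses the classic O(nm) dynamic-programming baseline for approximate matching. It is not the O(nk) or O(nmk) algorithm the abstract describes. The function name `find_k_difference_matches` and the choice to report 1-based end positions are assumptions made for illustration only.

```python
def find_k_difference_matches(pattern, text, k):
    """Return 1-based end positions in `text` where `pattern` matches
    some substring with at most k differences (substitutions,
    insertions or deletions).

    Baseline O(n*m) dynamic programme; hypothetical helper, not the
    faster algorithms mentioned in the abstract.
    """
    m = len(pattern)
    # prev[i] = fewest differences between pattern[:i] and the best
    # substring of the text ending at the previous text position.
    prev = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        # curr[0] = 0 because a match may start at any text position.
        curr = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            curr[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in the text
                curr[i - 1] + 1,     # pattern character left unmatched
            )
        if curr[m] <= k:
            ends.append(j)
        prev = curr
    return ends


if __name__ == "__main__":
    print(find_k_difference_matches("abc", "xxabcxx", 0))
    print(find_k_difference_matches("abc", "xxabcxx", 1))
```

Only two columns of the table are kept at a time, so memory use is O(m); the faster algorithms in the abstract avoid filling the whole table.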
20.
Teresa Przytycka 《Journal of computational biology》2007,14(5):539-549
Parsimony methods infer phylogenetic trees by minimizing the number of character changes required to explain observed character states. From the perspective of applicability of parsimony methods, it is important to assess whether the characters used to infer phylogeny are likely to provide a correct tree. We introduce a graph theoretical characterization that helps to assess whether a given set of characters is appropriate to use with parsimony methods. Given a set of characters and a set of taxa, we construct a network called the character overlap graph. We show that the character overlap graph for characters that are appropriate to use in parsimony methods is characterized by significant under-representation of subnetworks known as holes, and provide a validation for this observation. This characterization explains the success in constructing evolutionary trees using the parsimony method for some characters (e.g., protein domains) and the lack of such success for other characters (e.g., introns). In the latter case, the understanding of obstacles to applying parsimony methods in a direct way has led us to a new approach for detecting inconsistent and/or noisy data. Namely, we introduce the concept of stable characters, which is similar to but less restrictive than the well-known concept of pairwise compatible characters. Application of this approach to introns produces an evolutionary tree consistent with the Coelomata hypothesis.
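As a small illustration of what "minimizing the number of character changes" means, the sketch below counts the minimum number of state changes for a single character on a fixed rooted binary tree, using Fitch's small-parsimony procedure. It does not implement the character overlap graph or the stable-character approach from the abstract. The nested-tuple tree encoding and the name `fitch_changes` are assumptions made for this example.

```python
def fitch_changes(tree, leaf_states):
    """Minimum number of state changes for one character on a rooted
    binary tree (Fitch's algorithm).

    `tree` is a nested 2-tuple with leaf names (strings) at the tips;
    `leaf_states` maps each leaf name to its observed state.
    Both encodings are hypothetical choices for this sketch.
    """
    def visit(node):
        # A leaf contributes its observed state and no changes.
        if isinstance(node, str):
            return {leaf_states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:
            # The children can agree on a state: no extra change.
            return common, left_cost + right_cost
        # The children disagree: one change is needed on this subtree.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


if __name__ == "__main__":
    tree = (("A", "B"), ("C", "D"))
    print(fitch_changes(tree, {"A": 0, "B": 0, "C": 1, "D": 1}))
    print(fitch_changes(tree, {"A": 0, "B": 1, "C": 0, "D": 1}))
```

A parsimony search would sum this count over all characters and compare candidate trees; here the tree is fixed and only the scoring step is shown.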
