Similar articles
1.
This paper proposes a novel method that uses protein residue conservation and evolution information (spatial sequence profile, sequence information entropy, and evolution rate) to infer protein binding sites. Several predictors based on the support vector machine (SVM) algorithm are constructed to predict the role of surface residues in protein-protein interfaces. Combining these residue characteristics clearly improves prediction performance. We then use the predicted labels of neighboring residues to further improve the predictors. The efficiency and effectiveness of the proposed approach are verified by its better prediction performance on a non-redundant data set of heterodimers.
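
One of the features named above, sequence information entropy, can be illustrated with a short sketch. This is not the authors' implementation: it assumes the entropy is the Shannon entropy of each column of a multiple sequence alignment, that gaps are written as `-` and skipped, and the names `column_entropy` and `alignment_entropies` are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column, gap="-"):
    """Shannon entropy (in bits) of one alignment column, skipping gap characters."""
    residues = [r for r in column if r != gap]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # Sum of -p * log2(p) over the residue types seen in this column.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(alignment):
    """Per-column entropies for a list of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*alignment)]


# Toy alignment (made up for illustration): columns 1 and 4 are fully conserved.
toy_alignment = ["MKVL", "MKIL", "MRVL", "MK-L"]
print(alignment_entropies(toy_alignment))
```

A fully conserved column scores 0 bits and more variable columns score higher, so a low value is one possible signal of residue conservation.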

2.
3.
4.
Biological sequence families contain many sequences that are very similar to each other because they are related by evolution, so the strategy for splitting data into separate training and test sets is a nontrivial choice in benchmarking sequence analysis methods. A random split is insufficient because it will yield test sequences that are closely related or even identical to training sequences. Adapting ideas from independent set graph algorithms, we describe two new methods for splitting sequence data into dissimilar training and test sets. These algorithms input a sequence family and produce a split in which each test sequence is less than p% identical to any individual training sequence. These algorithms successfully split more families than a previous approach, enabling construction of more diverse benchmark datasets.
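
The constraint described above (every test sequence less than p% identical to every training sequence) can be shown with a deliberately simple sketch. It is not either of the two algorithms the abstract refers to: it assumes the sequences are already aligned to equal length, measures identity as the percentage of matching positions, picks the training set at random, and discards candidates that are too similar. The names `percent_identity` and `split_dissimilar` and the `train_fraction` parameter are hypothetical.

```python
import random


def percent_identity(a, b):
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def split_dissimilar(seqs, p, train_fraction=0.5, seed=0):
    """Randomly choose a training set, then keep as test sequences only the
    remaining sequences that are < p% identical to every training sequence.
    Sequences that fail the check are discarded rather than moved to training."""
    rng = random.Random(seed)
    order = list(seqs)
    rng.shuffle(order)
    n_train = max(1, int(len(order) * train_fraction))
    train, candidates = order[:n_train], order[n_train:]
    test, discarded = [], []
    for s in candidates:
        if all(percent_identity(s, t) < p for t in train):
            test.append(s)
        else:
            discarded.append(s)
    return train, test, discarded


# Toy family (made up for illustration).
family = ["AAAAAAAA", "AAAAAAAT", "CCCCCCCC", "CCCCCCCG", "GGGGTTTT"]
train, test, discarded = split_dissimilar(family, p=50)
print(len(train), len(test), len(discarded))
```

A random choice of training set can discard many sequences or leave the test set empty, which is the kind of failure that a more careful, graph-based selection is meant to reduce.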

5.
DeLuca S, Dorr B, Meiler J. Biochemistry 2011, 50(40): 8521-8528
We hypothesize that the degree of surface exposure of amino acid side chains within a globular, soluble protein has been optimized in evolution, not only to minimize the solvation free energy of the monomeric protein but also to prevent protein aggregation. This effect needs to be taken into account when engineering proteins de novo. We test this hypothesis through addition of a knowledge-based, exposure-dependent energy term to the RosettaDesign solvation potential [Lazaridis, T., and Karplus, M. (1999) Proteins 35, 133-152]. Correlation between amino acid type and surface exposure is determined from a representative set of experimental protein structures. The amino acid solvent accessible surface area (SASA) is estimated with a neighbor vector measure that increases in accuracy compared to the neighbor count measure while remaining pairwise decomposable [Durham, E., et al. (2009) J. Mol. Model. 15, 1093-1108]. Benchmarking of this potential in protein design displays a 3.2% improvement in the overall sequence recovery and an 8.5% improvement in recovery of amino acid types tolerated in evolution.

6.
The extent and nature of epistatic interactions between mutations are issues of fundamental importance in evolutionary biology. However, they are difficult to study and their influence on adaptation remains poorly understood. Here, we use a systems-level approach to examine epistatic interactions that arose during the evolution of Escherichia coli in a defined environment. We used expression arrays to compare the effect on global patterns of gene expression of deleting a central regulatory gene, crp. Effects were measured in two lineages that had independently evolved for 20,000 generations and in their common ancestor. We found that deleting crp had a much more dramatic effect on the expression profile of the two evolved lines than on the ancestor. Because the sequence of the crp gene was unchanged during evolution, these differences indicate epistatic interactions between crp and mutations at other loci that accumulated during evolution. Moreover, a striking degree of parallelism was observed between the two independently evolved lines; 115 genes that were not crp-dependent in the ancestor became dependent on crp in both evolved lines. An analysis of changes in crp dependence of well-characterized regulons identified a number of regulatory genes as candidates for harboring beneficial mutations that could account for these parallel expression changes. Mutations within three of these genes have previously been found and shown to contribute to fitness. Overall, these findings indicate that epistasis has been important in the adaptive evolution of these lines, and they provide new insight into the types of genetic changes through which epistasis can evolve. More generally, we demonstrate that expression profiles can be profitably used to investigate epistatic interactions.

7.
To understand the evolution of enzyme reactions and to gain an overview of biological catalysis, we have combined sequence and structural data to generate phylogenetic trees in an analysis of 276 structurally defined enzyme superfamilies, and used these to study how enzyme functions have evolved. We describe in detail the analysis of two superfamilies to illustrate different paradigms of enzyme evolution. Gathering together data from all the superfamilies supports and develops the observation that they have all evolved to act on a diverse set of substrates, whilst the evolution of new chemistry is much less common. Despite that, by bringing together so much data, we can provide a comprehensive overview of the most common and rare types of changes in function. Our analysis demonstrates, on a larger scale than previously studied, that modifications in overall chemistry still occur, with all possible changes at the primary level of the Enzyme Commission (E.C.) classification observed to a greater or lesser extent. The phylogenetic trees map out the evolutionary route taken within a superfamily, as well as all the possible changes within a superfamily. This has been used to generate a matrix of observed exchanges from one enzyme function to another, revealing the scale and nature of enzyme evolution and that some types of exchanges between and within E.C. classes are more prevalent than others. Surprisingly, a large proportion (71%) of all known enzyme functions are performed by this relatively small set of 276 superfamilies. This reinforces the hypothesis that relatively few ancient enzymatic domain superfamilies were progenitors for most of the chemistry required for life.

8.
Computers, the human mind, and social systems have common problems of inadequate memory and insufficient data manipulation speed. In each of these domains, information compression techniques have evolved to reduce storage and processing needs. Among the techniques for information compression, coding of information in procedures stands out as exceptionally powerful. Procedural information coding also gives rise to behavior that may be defined as intelligent. It is found in the human mind, in machines and in social systems. Its use in human thought is aided by language development which promotes regular review of abstract procedures. A practical consequence of better understanding of procedural information coding is the possibility of training people to exhibit greater mental capacity, a controversial possibility. This paper explores the impact of data processing resource limitations, data compression and procedural thinking in men and machines.

9.
Water is an essential element for living organisms, such that various responses have evolved to withstand water deficit in all living species. The study of these responses in plants has had particular relevance given the negative impact of water scarcity on agriculture. Among the molecules highly associated with plant responses to water limitation are the so-called late embryogenesis abundant (LEA) proteins. These proteins are ubiquitous in the plant kingdom and accumulate during the late phase of embryogenesis and in vegetative tissues in response to water deficit. To investigate the evolution of these proteins, we have studied the distribution of group 1 LEA proteins, a set that has also been found beyond the plant kingdom, in Bacillus subtilis and Artemia franciscana. Here, we report the presence of group 1 LEA proteins in green algae (Chlorophyta and Streptophyta), suggesting that this group of proteins emerged before plant land colonization. By sequence analysis of public genomic databases, we also show that 34 prokaryote genomes encode group 1 LEA-like proteins; two of them belong to the domain Archaea and 32 to bacterial phyla. Most of these microbes live in soil-associated habitats, suggesting horizontal transfer from plants to bacteria; however, our phylogenetic analysis points to convergent evolution. Furthermore, we present data showing that bacterial group 1 LEA proteins are able to prevent enzyme inactivation upon freeze–thaw treatments in vitro, suggesting that they have analogous functions to plant LEA proteins. Overall, data in this work indicate that LEA1 proteins’ properties might be relevant to cope with water deficit in different organisms.

10.
Mitochondrial ribosomes are complex molecular machines indispensable for respiration. Their assembly involves the import of several dozens of mitochondrial ribosomal proteins (MRPs), encoded in the nuclear genome, into the mitochondrial matrix. Proteomic and structural data as well as computational predictions indicate that up to 25% of yeast MRPs do not have a conventional N‐terminal mitochondrial targeting signal (MTS). We experimentally characterized a set of 15 yeast MRPs in vivo and found that five use internal MTSs. Further analysis of a conserved model MRP, Mrp17/bS6m, revealed the identity of the internal targeting signal. Similar to conventional MTS‐containing proteins, the internal sequence mediates binding to TOM complexes. The entire sequence of Mrp17 contains positive charges mediating translocation. The fact that these sequence properties could not be reliably predicted by standard methods shows that mitochondrial protein targeting is more versatile than expected. We hypothesize that structural constraints imposed by ribosome assembly interfaces may have disfavored N‐terminal presequences and driven the evolution of internal targeting signals in MRPs.

11.
Triplumaria selenica Latteur, Tuffrau and Wespes, 1970 was redescribed from pyridinated silver carbonate-impregnated specimens. Triplumaria selenica has a slit of the vestibular opening extending posteriorly along the left side of the vestibulum. The wide C-shaped adoral polybrachykinety extends along the ventral side of the vestibular opening. The narrow perivestibular polybrachykinety extends laterally along the dorsal side of the vestibular opening from the right end of the adoral polybrachykinety and forms a loop extending posteriorly along the vestibular slit to join the left end of the adoral polybrachykinety. The 18SSU rRNA gene of T. selenica, as well as those of six other entodiniomorphid species (Raabena bella, Blepharocorys curvigula, Entodinium longinucleatum, Eudiplodinium rostratum, Metadinium medium, and Ostracodinium gracile), was sequenced. Neighbor joining and maximum parsimony phylogenetic trees were constructed to discuss the evolution of entodiniomorphs. Our results support and extend Wolska’s hypothesis: the ancestral forms of blepharocorythids have evolved into ophryoscolecids and Cycloposthium species via the ancestor of Triplumaria.

12.
Lake Baikal is considered a unique place to study evolution. In this review, we report on recent data on the evolution of endemic freshwater sponges of this ancient lake. Nucleotide sequence data support the idea that these sponges are of monophyletic origin and evolved from Spongillidae. Baikalian sponges form the dominating biomass in the benthos of the lake. Data on the expression of the biomarker heat shock protein 70 revealed that the endemic sponge species of Lake Baikal are useful as bioindicators to assess the anthropogenic impact on the lake.

13.
Coding sequence evolution was once thought to be the result of selection on optimal protein function alone. Selection can, however, also act at the RNA level, for example, to facilitate rapid translation or ensure correct splicing. Here, we ask whether the way DNA works also imposes constraints on coding sequence evolution. We identify nucleosome positioning as a likely candidate to set up such a DNA-level selective regime and use high-resolution microarray data in yeast to compare the evolution of coding sequence bound to or free from nucleosomes. Controlling for gene expression and intra-gene location, we find a nucleosome-free "linker" sequence to evolve on average 5-6% slower at synonymous sites. A reduced rate of evolution in linker is especially evident at the 5' end of genes, where the effect extends to non-synonymous substitution rates. This is consistent with regular nucleosome architecture in this region being important in the context of gene expression control. As predicted, codons likely to generate a sequence unfavourable to nucleosome formation are enriched in linker sequence. Amino acid content is likewise skewed as a function of nucleosome occupancy. We conclude that selection operating on DNA to maintain correct positioning of nucleosomes impacts codon choice, amino acid choice, and synonymous and non-synonymous rates of evolution in coding sequence. The results support the exclusion model for nucleosome positioning and provide an alternative interpretation for runs of rare codons. As the intimate association of histones and DNA is a universal characteristic of genic sequence in eukaryotes, selection on coding sequence composition imposed by nucleosome positioning should be phylogenetically widespread.

14.
The aim of this study is to apply 3-D modeling to data obtained from different tableting machines and for different compression wheels on a linear rotary tableting machine replicator. A new analysis technique to interpret these data by 3-D parameter plots is presented. Tablets were produced on an instrumented eccentric tableting machine and on a linear rotary tableting machine replicator. The materials used were dicalcium phosphate dihydrate (DCPD), spray-dried lactose, microcrystalline cellulose (MCC), hydroxypropyl methylcellulose (HPMC), and theophylline monohydrate. Tableting was performed to different maximum relative densities (ρ rel, max). Force, time and displacement were recorded during compaction. The 3-D data plots were prepared using pressure, normalized time, and porosity according to Heckel. A twisted plane was fitted to these data according to the 3-D modeling technique. The resulting parameters were analyzed in a 3-D parameter plot. The results show that the 3-D modeling technique can be applied to compaction cycles from machines as different as eccentric and (simulated) rotary tableting machines. The relation of the data to each other is the same even when the absolute values are different. This is also true for different compression wheels used on the linear rotary tableting machine replicator. When compression wheels of different sizes are used on this simulator, it is mainly time plasticity that changes. With bigger compression wheels, the materials deform more slowly at lower densification and faster at higher densification. For brittle materials, the stages of higher densification are influenced; for plastically deforming materials, the stages of lower and higher densification can be influenced.

15.
Wang B, Chen P, Huang DS, Li JJ, Lok TM, Lyu MR. FEBS Letters 2006, 580(2): 380-384
This paper proposes a novel method that can predict protein interaction sites in heterocomplexes using residue spatial sequence profile and evolution rate approaches. The former represents the information of multiple sequence alignments while the latter corresponds to a residue's evolutionary conservation score based on a phylogenetic tree. Three predictors using a support vector machine algorithm are constructed to predict whether a surface residue is part of a protein-protein interface. The efficiency and effectiveness of the proposed approach are verified by its better prediction performance compared with other models. The study is based on a non-redundant data set of heterodimers consisting of 69 protein chains.

16.
MOTIVATION: A large, high-quality database of homologous sequence alignments with good estimates of their corresponding phylogenetic trees will be a valuable resource to those studying phylogenetics. It will allow researchers to compare current and new models of sequence evolution across a large variety of sequences. The large quantity of data may provide inspiration for new models and methodology to study sequence evolution and may allow general statements about the relative effect of different molecular processes on evolution. RESULTS: The Pandit 7.6 database contains 4341 families of sequences derived from the seed alignments of the Pfam database of amino acid alignments of families of homologous protein domains (Bateman et al., 2002). Each family in Pandit includes an alignment of amino acid sequences that matches the corresponding Pfam family seed alignment, an alignment of DNA sequences that contain the coding sequence of the Pfam alignment when they can be recovered (overall, 82.9% of sequences taken from Pfam) and the alignment of amino acid sequences restricted to only those sequences for which a DNA sequence could be recovered. Each of the alignments has an estimate of the phylogenetic tree associated with it. The tree topologies were obtained using the neighbor joining method based on maximum likelihood estimates of the evolutionary distances, with branch lengths then calculated using a standard maximum likelihood approach.

17.
The complete base sequence of the HIV-1 virus and of the GP120 ENV gene were analyzed to establish their distance from the expected neutral random sequence. A special methodology was devised to achieve this aim. Analyses included: a) proportion of dinucleotides (signatures); b) homogeneity in the distribution of dinucleotides and bases (isochores), by dividing the two segments into ten and three sub-segments, respectively; c) probability of runs of bases and No-bases according to the Bose-Einstein distribution. The analyses showed a huge deviation from the random distribution expected from neutral evolution and neutral-neighbor influence of nucleotide sites. The most significant result is the tremendous lack of CG dinucleotides (p < 10^-50), a selective trait of eukaryote genomes and not of single-stranded RNA virus genomes. The results not only refute neutral evolution and neutral-neighbor influence, but also strongly indicate that any base at any nucleotide site correlates with the whole viral genome or its sub-segments. These results suggest that evolution of HIV-1 is pan-selective rather than neutral or nearly neutral.
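
The dinucleotide comparison in (a) can be illustrated with a small sketch. It is not the methodology of the abstract: it assumes the commonly used observed/expected odds ratio, in which the frequency of each dinucleotide is divided by the product of the frequencies of its two bases, and the function name `dinucleotide_odds_ratios` is hypothetical.

```python
from collections import Counter


def dinucleotide_odds_ratios(seq):
    """Observed/expected ratio for every dinucleotide present in seq.

    Expected frequency assumes independent neighboring bases: f(X) * f(Y).
    Dinucleotides that never occur are absent from the result (ratio 0)."""
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    n_pairs = n - 1
    return {
        pair: (count / n_pairs) / (base_freq[pair[0]] * base_freq[pair[1]])
        for pair, count in pair_counts.items()
    }


# Toy sequence (made up for illustration); "CG" never occurs in it.
ratios = dinucleotide_odds_ratios("ATGCATTACCAGGTAACTGA")
print(ratios.get("CG", 0.0))
```

A ratio near 1 means the dinucleotide occurs about as often as independent bases would predict, while a ratio well below 1 indicates depletion of the kind reported above for CG.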

18.
A statistical analysis of a data set composed of over 1600 scission events of DNA produced by the 2:1 1,10-phenanthroline-copper complex (OP-Cu) has demonstrated that the nucleotide 5' to the site of phosphodiester bond scission is a primary influence in the kinetics of cleavage at any sequence position. The scission was less affected by the 3' neighbor. For each of the sixteen possible dinucleotides, a kinetic parameter can be computed reflecting scission at the 3' nucleotide. When used to predict the scission pattern of a DNA sequence not part of the present data set, correlation coefficients of about 0.6 between predicted and observed patterns were obtained.

19.
20.
The genomic structural organization of human UbC CDS repeat units could be representative of concerted evolution. The structure of the UbC gene and the frequency of its repeat unit number across different human ethnic populations remain insufficiently characterized. In this study, we performed a comparative analysis of UbC CDS regions in genomes from 140 Korean individuals. We found that the UbC gene allele types 9, 8 and 7 are present in the Korean population in proportions of 97.1%, 0.4% and 2.5%, respectively. Interestingly, we discovered that the allele types 7 and 8 harbor the novel UbC gene mosaic repeat units 3^5 (combining sequence parts derived from standard repeat units 3 and 5) and 8^9 (combining sequence parts derived from standard repeat units 8 and 9) within their sequence structures, respectively. Our analysis showed that the novel mosaic repeat unit 3^5 lacks the highly human-specific amino acid S38, implying a functional consequence. These results suggest that the genomic organization of UbC repeat units is still undergoing dynamic structural changes due to concerted evolution through unequal crossing-over. Our results could represent valuable data for future investigations related to treating genetic diseases caused by UbC gene mutations and variations.
