Similar literature
 Found 20 similar articles; search time 15 ms
1.
There is a debate on the folding of proteins with inverted sequences. Theoretical approaches and experiments give contradictory results. Many proteins in the Protein Data Bank (PDB) show conspicuous inverse sequence similarity (ISS) to each other. Here we analyze whether this ISS is related to structural similarity. For the first time, we performed a large-scale three-dimensional (3-D) superposition of corresponding Cα atoms of forwardly and inversely aligned proteins and tested the degree of secondary structure identity between them. Comparing proteins of less than 50% pairwise sequence identity, only 0.5% of the inversely aligned pairs had similar folds (99 out of 19073), whereas about 9% of forwardly aligned proteins in the same score and length range show similar 3-D structures (1731 out of 19248). This observation strongly supports the view that the inversion of sequences in almost all cases leads to a different folding property of the protein. Inverted sequences are thus suitable as protein-like sequences for control purposes without relations to existing proteins.
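
As a rough illustration of the idea of an inverted control sequence, the sketch below reverses a protein sequence and measures residue identity between the forward and inverted forms. It is a simplification under stated assumptions: the abstract describes alignments and 3-D superposition, whereas this uses a plain ungapped, position-by-position comparison, and the function names and the example sequence are hypothetical.

```python
def invert(seq):
    """Return the sequence read from the C-terminus to the N-terminus."""
    return seq[::-1]


def ungapped_identity(a, b):
    """Fraction of identical residues over the overlapping, ungapped positions.

    Assumption: no alignment is performed; position i of `a` is compared
    with position i of `b`.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / n


# Hypothetical 10-residue example, not taken from the study.
query = "MKTAYIAKQR"
decoy = invert(query)
print(decoy, ungapped_identity(query, decoy))
```

A real analysis would replace `ungapped_identity` with a proper alignment score and would compare structures, not just sequences.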

2.
Is the classical predator-prey theory inherently pathological? Defenders of the theory are losing ground in the debate. We will demonstrate that detractors' main argument is based on a faulty model, and that the conceptual and predictive bases of the theory are fundamentally sound.

3.
It is well-known that muscle redundancy grants the CNS numerous options to perform a task. Does muscle redundancy, however, allow sufficient robustness to compensate for loss or dysfunction of even a single muscle? Are all muscles equally redundant? We combined experimental and computational approaches to establish the limits of motor robustness for static force production. In computer-controlled cadaveric index fingers, we find that only a small subset (<5%) of feasible forces is robust to loss of any one muscle. Importantly, the loss of certain muscles compromises force production significantly more than others. Further computational modeling of a multi-joint, multi-muscle leg demonstrates that this severe lack of robustness generalizes to whole limbs. These results provide a biomechanical basis to begin to explain why redundant motor systems can be vulnerable to even mild neuromuscular pathology.

4.
5.
A program, OBSTRUCT, has been developed to obtain the largest possible subset according to specific constraints from a set of protein sequences whose tertiary structures have been determined crystallographically. The user can request a range in sequence similarity level and/or structural resolution. The program optionally includes sequences with known three-dimensional folds elicited from NMR data.

6.
7.
8.
Here, we discuss the relationship between protein sequence and protein structural similarity. It is established that a protein structural distance (PSD) of 2.0 is a threshold above which two proteins are unlikely to have a detectable pairwise sequence relationship. A precise correlation is established between the level of sequence similarity, defined by a normalized Smith-Waterman score, and the probability that two proteins will have a similar structure (defined by pairwise PSD<2). This correlation can be used in evaluating the likelihood for success in a comparative modeling procedure. We establish the existence of a correlation between sequence and structural similarity for pairs of proteins that are related in structure but whose sequence relationship is not detectable using standard pairwise sequence alignments. Although it is well known that there is a close relationship between sequence and structural similarity for pairwise sequence identities greater than about 30%, there has been little discussion as to the possible existence of such a relationship for pairs of proteins in or below the twilight zone of sequence similarity (<25% pairwise sequence identity). Possible implications of our results for the evolution of protein structure are discussed.

9.
10.
An important task in functional genomics is to cluster homologous proteins, which may share common functions. Annotating proteins of unknown function by transferring annotations from their homologues of known annotations is one of the most efficient ways to predict protein function. In this paper, we use a modularity-based method called CD for grouping together homologous proteins. The method employs a global heuristic search strategy to find the partitioning of the weighted adjacency graph with the largest modularity. The weighted adjacency graph is constructed by the sigmoidal transformation of all pairwise sequence similarities between all protein sequences in a given dataset. The method has been extensively tested on several subsets from the superfamily level of the SCOP (Structural Classification of Proteins) database, where some homologous proteins have very low sequence similarity. Compared with a widely used method, MCL, we observe that the number of clusters obtained by CD is closer to the number of superfamilies in the dataset, the value of the F-measure given by CD is 10% better than MCL on average, and CD is more tolerant of noise in the sequence similarity. The experimental results indicate that CD is ideally suited for clustering homologous proteins when sequence similarity is low.
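
The two ingredients named in this abstract, a sigmoidal transformation of similarity scores into edge weights and the modularity of a partition of the resulting weighted graph, can be sketched as follows. This is a minimal sketch, not the CD method itself: the `midpoint` and `steepness` parameters are assumptions (the abstract does not give the transformation's constants), the modularity shown is the standard weighted form Q = (1/2m) Σ_ij [A_ij − k_i k_j / 2m] δ(c_i, c_j), and the heuristic search over partitions is omitted.

```python
import math


def sigmoid_weight(score, midpoint=50.0, steepness=0.1):
    """Map a raw similarity score to an edge weight in (0, 1).

    `midpoint` and `steepness` are hypothetical values chosen for illustration.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (score - midpoint)))


def modularity(weights, labels):
    """Weighted modularity of a partition.

    weights: dict mapping (i, j) with i < j to the undirected edge weight.
    labels:  list where labels[i] is the cluster id of node i.
    """
    n = len(labels)
    strength = [0.0] * n  # k_i: total weight incident to node i
    total = 0.0           # m: total edge weight
    for (i, j), w in weights.items():
        strength[i] += w
        strength[j] += w
        total += w
    if total == 0.0:
        return 0.0
    two_m = 2.0 * total
    q = 0.0
    # Observed intra-cluster weight (each undirected edge counts as A_ij and A_ji).
    for (i, j), w in weights.items():
        if labels[i] == labels[j]:
            q += 2.0 * w
    # Expected intra-cluster weight under the null model.
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q -= strength[i] * strength[j] / two_m
    return q / two_m


# Toy graph: two disconnected pairs of proteins.
edges = {(0, 1): 1.0, (2, 3): 1.0}
print(modularity(edges, [0, 0, 1, 1]))  # partition matching the two pairs
print(modularity(edges, [0, 0, 0, 0]))  # everything in one cluster
```

A clustering method of this kind would search for the `labels` assignment that maximizes `modularity`, with edge weights produced by `sigmoid_weight` from the pairwise similarity scores.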

11.

Background  

We advocate unifying classical and genomic classification of bacteriophages by integration of proteomic data and physicochemical parameters. Our previous application of this approach to the entirely sequenced members of the Podoviridae fully supported the current phage classification of the International Committee on Taxonomy of Viruses (ICTV). It appears that horizontal gene transfer generally does not totally obliterate evolutionary relationships between phages.

12.
Current analyses of protein sequence/structure relationships have focused on expected similarity relationships for structurally similar proteins. To survey and explore the basis of these relationships, we present a general sequence/structure map that covers all combinations of similarity/dissimilarity relationships and provide novel energetic analyses of these relationships. To aid our analysis, we divide protein relationships into four categories: expected/unexpected similarity (S and S(?)) and expected/unexpected dissimilarity (D and D(?)) relationships. In the expected similarity region S, we show that trends in the sequence/structure relation can be derived based on the requirement of protein stability and the energetics of sequence and structural changes. Specifically, we derive a formula relating sequence and structural deviations to a parameter characterizing protein stiffness; the formula fits the data reasonably well. We suggest that the absence of data in region S(?) (high structural but low sequence similarity) is due to unfavorable energetics. In contrast to region S, region D(?) (high sequence but low structural similarity) is well-represented by proteins that can accommodate large structural changes. Our analyses indicate that there are several categories of similarity relationships and that protein energetics provide a basis for understanding these relationships.

13.
We have characterized the relationship between accurate phylogenetic reconstruction and sequence similarity, testing whether high levels of sequence similarity can consistently produce accurate evolutionary trees. We generated protein families with known phylogenies using a modified version of the PAML/EVOLVER program that produces insertions and deletions as well as substitutions. Protein families were evolved over a range of 100-400 point accepted mutations; at these distances 63% of the families shared significant sequence similarity. Protein families were evolved using balanced and unbalanced trees, with ancient or recent radiations. In families sharing statistically significant similarity, about 60% of multiple sequence alignments were 95% identical to true alignments. To compare recovered topologies with true topologies, we used a score that reflects the fraction of clades that were correctly clustered. As expected, the accuracy of the phylogenies was greatest in the least divergent families. About 88% of phylogenies clustered over 80% of clades in families that shared significant sequence similarity, using Bayesian, parsimony, distance, and maximum likelihood methods. However, for protein families with short ancient branches (ancient radiation), only 30% of the most divergent (but statistically significant) families produced accurate phylogenies, and only about 70% of the second most highly conserved families, with median expectation values better than 10^-60, produced accurate trees. These values represent upper bounds on expected tree accuracy for sequences with a simple divergence history; proteins from 700 Giardia families, with a similar range of sequence similarities but considerably more gaps, produced much less accurate trees. For our simulated insertions and deletions, correct multiple sequence alignments did not perform much better than those produced by T-COFFEE, and including sequences with expressed sequence tag-like sequencing errors did not significantly decrease phylogenetic accuracy. In general, although less-divergent sequence families produce more accurate trees, the likelihood of estimating an accurate tree is most dependent on whether radiation in the family was ancient or recent. Accuracy can be improved by combining genes from the same organism when creating species trees or by selecting protein families with the best bootstrap values in comprehensive studies.

14.
One key element in understanding the molecular machinery of the cell is to understand the structure and function of each protein encoded in the genome. A very successful means of inferring the structure or function of a previously unannotated protein is via sequence similarity with one or more proteins whose structure or function is already known. Toward this end, we propose a means of representing proteins using pairwise sequence similarity scores. This representation, combined with a discriminative classification algorithm known as the support vector machine (SVM), provides a powerful means of detecting subtle structural and evolutionary relationships among proteins. The algorithm, called SVM-pairwise, when tested on its ability to recognize previously unseen families from the SCOP database, yields significantly better performance than SVM-Fisher, profile HMMs, and PSI-BLAST.
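
The representation described here, where each protein becomes a vector of its similarity scores against a fixed set of reference sequences, might be sketched as below. The scoring function is an assumption: `toy_similarity` counts shared k-mers purely as a stand-in for a real pairwise alignment score, and the sequences are invented. Training the classifier is left out; the vectors would be passed to an SVM implementation of the reader's choice.

```python
def toy_similarity(a, b, k=2):
    """Stand-in similarity: number of distinct k-mers shared by two sequences.

    Assumption: a real pipeline would use a pairwise alignment score here.
    """
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    return len(kmers_a & kmers_b)


def pairwise_features(seq, reference_set, score=toy_similarity):
    """Represent `seq` as its vector of scores against every reference sequence."""
    return [score(seq, ref) for ref in reference_set]


# Hypothetical reference set; in practice this would be the training proteins.
references = ["ACDE", "CDEF", "WWWW"]
print(pairwise_features("ACDE", references))
```

Because every protein is mapped to a vector of the same length (one entry per reference sequence), sequences of different lengths become directly comparable inputs for a discriminative classifier.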

15.
The brain's link between perception and action involves several steps, which include stimulus transduction, neuronal coding of the stimulus, comparison to a memory template and choice of an appropriate behavioral response. All of these need time, and many studies report that the time needed to compare two stimuli correlates inversely with the perceived distance between them. We developed a behavioral assay in which we tested the time that a honeybee needs to discriminate between odors consisting of mixtures of two components, and included both very similar and very different stimuli spanning four log-concentration ranges. Bees learned to discriminate all odors, including very similar odors and the same odor at different concentrations. Even though discriminating two very similar odors appears to be a more difficult task than discriminating two very distinct substances, we found that the time needed to make a choice for or against an odor was independent of odor similarity. Our data suggest that, irrespective of the nature of the olfactory code, the bee olfactory system evaluates odor quality after a constant interval. This may ensure that odors are only assessed after the olfactory network has optimized its representation.

16.

Background  

Formal classification of a large collection of protein structures aids the understanding of evolutionary relationships among them. Classifications involving manual steps, such as SCOP and CATH, face the challenge of increasing volume of available structures. Automatic methods, such as FSSP or Dali Domain Dictionary, yield divergent classifications, for reasons not yet fully investigated. One possible reason is that the pairwise similarity scores used in automatic classification do not adequately reflect the judgments made in manual classification. Another possibility is the difference between manual and automatic classification procedures. We explore the degree to which these two factors might affect the final classification.

17.

Background  

Sequence similarity between proteins is usually considered a reliable indicator of homology. Pyruvate-ferredoxin oxidoreductase and quinol-fumarate reductase contain ferredoxin domains that bind [Fe-S] clusters and are involved in electron transport. Profile-based methods for sequence comparison, such as PSI-BLAST and HMMer, suggest statistically significant similarity between these domains.

18.
19.
Comparative accuracy of methods for protein sequence similarity search   (cited 2 times in total: 0 self-citations, 2 citations by others)
MOTIVATION: Searching a protein sequence database for homologs is a powerful tool for discovering the structure and function of a sequence. Two new methods for searching sequence databases have recently been described: Probabilistic Smith-Waterman (PSW), which is based on Hidden Markov models for a single sequence using a standard scoring matrix, and a new version of BLAST (WU-BLAST2), which uses Sum statistics for gapped alignments. RESULTS: This paper compares and contrasts the effectiveness of these methods with three older methods (Smith-Waterman: SSEARCH, FASTA and BLASTP). The analysis indicates that the new methods are useful, and often offer improved accuracy. These tools are compared using a curated (by Bill Pearson) version of the annotated portion of PIR 39. Three different statistical criteria are utilized: equivalence number, minimum errors and the receiver operating characteristic. For complete-length protein query sequences from large families, PSW's accuracy is superior to that of the other methods, but its accuracy is poor when used with partial-length query sequences. False negatives are twice as common as false positives irrespective of the search methods if a family-specific threshold score that minimizes the total number of errors (i.e. the most favorable threshold score possible) is used. Thus, sensitivity, not selectivity, is the major problem. Among the analyzed methods using default parameters, the best accuracy was obtained from SSEARCH and PSW for complete-length proteins, and the two BLAST programs, plus SSEARCH, for partial-length proteins.
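
The "minimum errors" criterion mentioned above, a family-specific threshold score that minimizes false positives plus false negatives, can be illustrated with a small sketch. The details are assumptions, not the paper's procedure: hits scoring at or above the threshold are treated as predicted family members, ties between thresholds are broken in favor of the lowest one, and the scores are invented.

```python
def min_error_threshold(hits):
    """Find the score threshold minimizing false positives + false negatives.

    hits: list of (score, is_member) pairs for one query family.
    Returns (threshold, false_positives, false_negatives). A hit is called
    positive when its score is >= threshold (an assumption of this sketch).
    """
    if not hits:
        raise ValueError("hits must not be empty")
    # Candidate thresholds: every observed score, plus one above all scores.
    candidates = sorted({score for score, _ in hits})
    candidates.append(float("inf"))
    best = None
    for t in candidates:
        fp = sum(1 for score, member in hits if score >= t and not member)
        fn = sum(1 for score, member in hits if score < t and member)
        if best is None or fp + fn < best[1] + best[2]:
            best = (t, fp, fn)
    return best


# Hypothetical search results: (score, true family member?)
results = [(90, True), (80, True), (70, False),
           (60, True), (50, False), (40, False)]
print(min_error_threshold(results))
```

Comparing the false-positive and false-negative counts at this most favorable threshold is what lets a study say whether sensitivity or selectivity is the larger problem for a given search method.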

20.