Similar literature
 20 similar articles found.
1.
The tissue inhibitor of metalloproteinase (TIMP) family regulates extracellular matrix turnover and tissue remodeling by forming tight-binding inhibitory complexes with matrix metalloproteinases (MMPs). MMPs and TIMPs have been implicated in many normal and pathological processes, such as morphogenesis, development, angiogenesis, and cancer metastasis. This minireview provides information that would aid in classification of the TIMP family and in understanding the similarities and differences among TIMP members according to the physical data, primary structure, and homology values. Calculations of molecular weight, isoelectric point values, and molar extinction coefficients are reported. This study also compares sequence similarities and differences among the TIMP members through calculations of homology within their individual loop regions and the mature region of the molecule. Lastly, this report examines structure–function relationships of TIMPs. Thorough knowledge of TIMP primary and tertiary structure would facilitate the uncovering of the molecular mechanisms underlying the metalloproteinase inhibitory activities and biological functions of TIMPs.
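As one illustration of the physical calculations this abstract mentions, the molar extinction coefficient of a protein at 280 nm is commonly estimated from its tryptophan, tyrosine, and cystine content. The sketch below uses the widely cited per-residue values of 5500, 1490, and 125 M^-1 cm^-1. The abstract does not say which method the authors used, so the function name, the `cystines` parameter, and the choice of formula are assumptions.

```python
def molar_extinction_280(seq: str, cystines: int = 0) -> int:
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1).

    Assumed formula (not taken from the abstract):
        5500 * n(Trp) + 1490 * n(Tyr) + 125 * n(cystine)
    `cystines` is the number of disulfide bonds, supplied by the caller
    because it cannot be read from the sequence alone.
    """
    seq = seq.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * cystines


if __name__ == "__main__":
    # Toy sequence, not a real TIMP: one Trp and two Tyr, no disulfides.
    print(molar_extinction_280("WYY"))
```

For a real protein, the number of disulfide bonds would have to come from structural or biochemical data.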

2.
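This article presents application examples of the sequence analysis programs in EMBOSS, the European Molecular Biology Open Software Suite. Section 1 briefly introduces the EMBOSS package and its basic usage. Section 2 covers commonly used sequence processing programs for format conversion, sequence extraction, sequence transformation, and sequence display. Section 3 covers sequence alignment programs, including pairwise alignment, multiple sequence alignment, and dot-plot programs. Section 4 covers commonly used nucleic acid sequence analysis programs, which can be used for nucleotide composition statistics, open reading frame analysis, C...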

3.
A novel method has been developed for acquiring the correct alignment of a query sequence against remotely homologous proteins by extracting structural information from profiles of multiple structure alignment. A systematic search algorithm combined with a group of score functions based on sequence information and structural information has been introduced in this procedure. A limited number of top solutions (15,000) with high scores were selected as candidates for further examination. On a test-set comprising 301 proteins from 75 protein families with sequence identity less than 30%, the proportion of proteins with completely correct alignment as first candidate was improved to 39.8% by our method, whereas the typical performance of existing sequence-based alignment methods was only between 16.1% and 22.7%. Furthermore, multiple candidates for possible alignment were provided in our approach, which dramatically increased the possibility of finding correct alignment, such that completely correct alignments were found amongst the top-ranked 1000 candidates in 88.3% of the proteins. With the assistance of a sequence database, completely correct alignment solutions were achieved amongst the top 1000 candidates in 94.3% of the proteins. From such a limited number of candidates, it would become possible to identify more correct alignments using a more time-consuming but more powerful method with more detailed structural information, such as side-chain packing and energy minimization. The results indicate that the novel alignment strategy could be helpful for extending the application of highly reliable methods for fold identification and homology modeling to a huge number of homologous proteins of low sequence similarity. Details of the methods, together with the results and implications for future development, are presented.

4.
We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain.

5.
Analysis for free: comparing programs for sequence analysis   (Cited by: 4; self-citations: 0; citations by others: 4)
Programs to import, manage and align sequences and to analyse the properties of DNA, RNA and proteins are essential for every biological laboratory. This review describes two freeware programs (BioEdit and pDRAW for MS Windows) and a commercial program (Sequencher for MS Windows and Apple MacOS). BioEdit and Sequencher offer functions such as sequence alignment and editing plus reading of sequence trace files. pDRAW is a convenient visualisation tool with a variety of analysis functions. While Sequencher impresses with a very user-friendly interface and easy-to-use tools, BioEdit offers the largest and most customisable variety of tools. The strength of pDRAW is drawing and analysis of single sequences for priming and restriction sites and virtual cloning. It has a database function for user-specific oligonucleotides and restriction enzymes.

6.
Multiple sequence alignment (MSA) accuracy is important, but there is no widely accepted method of judging the accuracy that different alignment algorithms give. We present a simple approach to detecting two types of error, namely block shifts and the misplacement of residues within a gap. Given an MSA, subsets of very similar sequences are generated through the use of a redundancy filter, typically using a 70–90% sequence identity cut-off. Subsets thus produced are typically small and degenerate, and errors can be easily detected even by manual examination. The errors, albeit minor, are inevitably associated with gaps in the alignment, and so the procedure is particularly relevant to homology modelling of protein loop regions. The usefulness of the approach is illustrated in the context of the universal but little-known [K/R]KLH motif that occurs in intracellular loop 1 of G protein-coupled receptors (GPCRs); other issues relevant to GPCR modelling are also discussed.
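The grouping step described in this abstract can be sketched as follows. This is a minimal illustration, assuming a simple greedy scheme in which each aligned sequence joins the first subset whose representative it matches at or above the identity cut-off. The abstract does not describe the actual filter, so the function names, the greedy strategy, and the way identity is computed over ungapped columns are all assumptions.

```python
def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical residues over columns where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def similar_subsets(msa: dict, cutoff: float = 0.8) -> list:
    """Greedily group aligned sequences into subsets of near-identical rows.

    `msa` maps sequence names to aligned rows of equal length.
    The first member of each subset acts as its representative.
    """
    subsets = []
    for name, row in msa.items():
        for subset in subsets:
            if pairwise_identity(row, msa[subset[0]]) >= cutoff:
                subset.append(name)
                break
        else:
            # No existing subset is similar enough: start a new one.
            subsets.append([name])
    return subsets


if __name__ == "__main__":
    toy = {"s1": "ACDEFGHIKL", "s2": "ACDEFGHIKV", "s3": "WWWWWWWWWW"}
    print(similar_subsets(toy, cutoff=0.8))
```

Within each resulting subset the rows are nearly identical, so a shifted block or a residue stranded inside a gap stands out on inspection.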

7.
This paper discusses the benefit of mapping paired cysteine mutation patterns as a guide to identifying the positions of protein disulfide bonds. This information can facilitate the computer modeling of protein tertiary structure. First, a simple, paired natural-cysteine-mutation map is presented that identifies the positions of putative disulfide bonds in protein families. The method is based on the observation that if, during the process of evolution, a disulfide-bonded cysteine residue is not conserved, then it is likely that its counterpart will also be mutated. For each target protein, protein databases were searched for the primary amino acid sequences of all known members of distinct protein families. Primary sequence alignment was carried out using PileUp algorithms in the GCG package. To search for correlated mutations, we listed only the positions where cysteine residues were highly conserved and emphasized the mutated residues. In proteins of known three-dimensional structure, a striking pattern of paired cysteine mutations correlated with the positions of known disulfide bridges. For proteins of unknown architecture, the mutation maps showed several positions where disulfide bridging might occur.
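The idea of a paired cysteine mutation map can be sketched in a few lines. The example below is only an illustration under stated assumptions: it takes an existing alignment, keeps columns where cysteine is present in at least a chosen fraction of sequences, and reports column pairs for which some sequence has lost cysteine at both positions. The function names, the `min_fraction` threshold, and the toy alignment are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def conserved_cys_columns(msa: dict, min_fraction: float = 0.7) -> list:
    """Return 0-based alignment columns where Cys occurs in >= min_fraction of rows."""
    rows = list(msa.values())
    n = len(rows)
    return [
        c for c in range(len(rows[0]))
        if sum(r[c] == "C" for r in rows) / n >= min_fraction
    ]


def paired_cys_losses(msa: dict, min_fraction: float = 0.7) -> dict:
    """Map each pair of conserved-Cys columns to the sequences lacking Cys at both."""
    cols = conserved_cys_columns(msa, min_fraction)
    result = {}
    for i, j in combinations(cols, 2):
        both_lost = [name for name, r in msa.items() if r[i] != "C" and r[j] != "C"]
        if both_lost:
            result[(i, j)] = both_lost
    return result


if __name__ == "__main__":
    toy = {"a": "CACAC", "b": "CACAC", "c": "CACAC", "d": "SACAA"}
    # Sequence "d" has lost Cys at columns 0 and 4 together.
    print(paired_cys_losses(toy))
```

Column pairs reported this way are candidates only; a joint loss in one or two sequences can also arise by chance.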

8.
Silva PJ. Proteins 2008, 70(4):1588-1594
Hydrophobic cluster analysis (HCA) has long been used as a tool to detect distant homologies between protein sequences, and to classify them into different folds. However, it relies on expert human intervention, and is sensitive to subjective interpretations of pattern similarities. In this study, we describe a novel algorithm to assess the similarity of hydrophobic amino acid distributions between two sequences. Our algorithm correctly identifies as misattributions several HCA-based proposals of structural similarity between unrelated proteins present in the literature. We have also used this method to identify the proper fold of a large variety of sequences, and to automatically select the most appropriate structure for homology modeling of several proteins with low sequence identity to any other member of the Protein Data Bank. Automatic modeling of the target proteins based on these templates yielded structures with TM-scores (vs. experimental structures) above 0.60, even without further refinement. Besides enabling a reliable identification of the correct fold of an unknown sequence and the choice of suitable templates, our algorithm also shows that whereas most structural classes of proteins are very homogeneous in hydrophobic cluster composition, a tenth of the described families are compatible with a large variety of hydrophobic patterns. We have built a browsable database of every major representative hydrophobic cluster pattern present in each structural class of proteins, freely available at http://www2.ufp.pt/ pedros/HCA_db/index.htm.

9.
The effectiveness of sequence alignment in detecting structural homology among protein sequences decreases markedly when pairwise sequence identity is low (the so‐called “twilight zone” problem of sequence alignment). Alternative sequence comparison strategies able to detect structural kinship among highly divergent sequences are necessary to address this need. Among them are alignment‐free methods, which use global sequence properties (such as amino acid composition) to identify structural homology in a rapid and straightforward way. We explore the viability of using tetramer sequence fragment composition profiles in finding structural relationships that lie undetected by traditional alignment. We establish a strategy to recast any given protein sequence into a tetramer sequence fragment composition profile, using a series of amino acid clustering steps that have been optimized for mutual information. Our method has the effect of compressing the set of 160,000 unique tetramers (if using the 20‐letter amino acid alphabet) into a more tractable number of reduced tetramers (~15–30), so that a meaningful tetramer composition profile can be constructed. We test remote homology detection at the topology and fold superfamily levels using a comprehensive set of fold homologs, culled from the CATH database, that share low pairwise sequence similarity. Using the receiver‐operating characteristic measure, we demonstrate potentially significant improvement in using information‐optimized reduced tetramer composition, over methods relying only on the raw amino acid composition or on traditional sequence alignment, in homology detection at or below the “twilight zone”. Proteins 2010. © 2010 Wiley‐Liss, Inc.
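The compression described in this abstract follows from simple counting: a 20-letter alphabet gives 20^4 = 160,000 possible tetramers, while a 2-letter alphabet gives only 2^4 = 16. The sketch below builds a tetramer composition profile over a reduced alphabet. The two-group hydrophobic/polar split, the `GROUPS` table, and the function name are assumptions chosen for illustration; the abstract's mutual-information-optimized clustering is not specified and is not reproduced here.

```python
from collections import Counter
from itertools import product

# Hypothetical 2-letter reduction (hydrophobic "h" vs polar "p"); not the paper's clustering.
GROUPS = {"h": "AVLIMFWYC", "p": "GPSTNQHDEKR"}
REDUCE = {aa: label for label, members in GROUPS.items() for aa in members}


def reduced_tetramer_profile(seq: str) -> dict:
    """Return the relative frequency of every reduced tetramer in `seq`.

    Residues outside the 20 standard amino acids are dropped before
    windowing, which joins their neighbours; acceptable for a sketch.
    """
    reduced = "".join(REDUCE[aa] for aa in seq.upper() if aa in REDUCE)
    counts = Counter(reduced[i:i + 4] for i in range(len(reduced) - 3))
    total = sum(counts.values())
    keys = ["".join(p) for p in product(sorted(GROUPS), repeat=4)]
    return {k: (counts[k] / total if total else 0.0) for k in keys}


if __name__ == "__main__":
    profile = reduced_tetramer_profile("MKVLAAGIVGS")
    print(len(profile), round(sum(profile.values()), 6))
```

Two sequences can then be compared by any distance between their fixed-length profiles, with no alignment step.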

10.
Elofsson A. Proteins 2002, 46(3):330-339
One of the most central methods in bioinformatics is the alignment of two protein or DNA sequences. However, so far large-scale benchmarks examining the quality of these alignments are scarce. On the other hand, several recent large-scale studies of the capacity of different methods to identify related sequences have led to new insights about the performance of fold recognition methods. To increase our understanding of fold recognition methods, we present a large-scale benchmark of alignment quality. We compare alignments from several different alignment methods, including sequence alignments, hidden Markov models, PSI-BLAST, CLUSTALW, and threading methods. For most methods, the alignment quality increases significantly at about 20% sequence identity. The difference in alignment quality between different methods is quite small, and the main difference can be seen at the exact positioning of the sharp rise in alignment quality, that is, around 15-20% sequence identity. The alignments are improved by using structural information. In general, the best alignments are obtained by methods that use predicted secondary structure information and sequence profiles obtained from PSI-BLAST. One interesting observation is that for different pairs many different methods create the best alignments. This finding implies that if a method that could select the best alignment method for each pair existed, a significant improvement of the alignment quality could be gained.

11.
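Using self-designed primers, this study cloned, sequenced, and analyzed partial genomic sequences of the odorant receptor genes Or1 and Or2 of the eastern honeybee Apis cerana Fabricius (GenBank accession numbers JN544932 and JN544931) in order to explore evolutionary differences between the conventional odorant receptor gene (AcOr1) and the atypical odorant receptor gene (AcOr2) among closely related insect species. The Or1 and Or2 sequences obtained were 1247 bp and 1138 bp long, contained 4 and 2 introns, and had coding regions of 682 bp and 686 bp, respectively. Sequence alignment showed that the DNA sequences of both odorant receptors differ considerably among the eastern honeybee, the western honeybee, and bumblebees, with a lowest similarity of only 56% (AcOr1 vs. BtOr82a-like). The differences arise mainly from variation in intron length and intron base composition, whereas the amino acid sequences of the coding regions are highly similar, all above 85%. Overall, within Hymenoptera, the atypical odorant receptor AcOr2 is a more conserved odorant receptor gene than the conventional odorant receptor AcOr1.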

12.
13.
We have developed a phylogeny-aware progressive alignment method that recognizes insertions and deletions as distinct evolutionary events and thus avoids systematic errors created by traditional alignment methods. We now extend this method to simultaneously model regional heterogeneity and evolution. This novel method can be flexibly adapted to alignment of nucleotide or amino acid sequences evolving under processes that vary over genomic regions and, being fully probabilistic, provides an estimate of regional heterogeneity of the evolutionary process along the alignment and a measure of local reliability of the solution. Furthermore, the evolutionary modelling of substitution process permits adjusting the sensitivity and specificity of the alignment and, if high specificity is aimed at, leaving sequences unaligned when their divergence is beyond a meaningful detection of homology.

14.
Detection of homologous proteins by an intermediate sequence search   (Cited by: 2; self-citations: 0; citations by others: 2)
We developed a variant of the intermediate sequence search method (ISS(new)) for detection and alignment of weakly similar pairs of protein sequences. ISS(new) relates two query sequences by an intermediate sequence that is potentially homologous to both queries. The improvement was achieved by a more robust overlap score for a match between the queries through an intermediate. The approach was benchmarked on a data set of 2369 sequences of known structure with insignificant sequence similarity to each other (BLAST E-value larger than 0.001); 2050 of these sequences had a related structure in the set. ISS(new) performed significantly better than both PSI-BLAST and a previously described intermediate sequence search method. PSI-BLAST could not detect correct homologs for 1619 of the 2369 sequences. In contrast, ISS(new) assigned a correct homolog as the top hit for 121 of these 1619 sequences, while incorrectly assigning homologs for only nine targets; it did not assign homologs for the remainder of the sequences. By estimate, ISS(new) may be able to assign the folds of domains in approximately 29,000 of the approximately 500,000 sequences unassigned by PSI-BLAST, with 90% specificity (1 - false positives fraction). In addition, we show that the 15 alignments with the most significant BLAST E-values include the nearly best alignments constructed by ISS(new).

15.
Summary: Sequence homologies among 34 chloroplast-type ferredoxins were examined using a computer program that quantitatively evaluates the extent of sequence similarity as a correlation coefficient. The resultant alignment contains six gaps representing insertions or deletions of some residues, all of which are located such that they precisely preserve the domains of structural fragments as determined by crystallographic data on Spirulina platensis ferredoxin. In the search for any total correlation between the chloroplast-type and 27 bacterial ferredoxins, 1891 comparison matrices prepared for possible combinations indicated that the bacterial basal sequence of 55 residues has been conserved evolutionarily in the chloroplast-type sequences corresponding to residue positions 36–90 of Spirulina platensis ferredoxin. In addition, the bacterial connector sequence region was found to be conserved. These findings strongly suggest that the bacterial and chloroplast-type ferredoxins descended from a common ancestor, and branched off after the bacterial gene duplication, whereas the chloroplast-type ferredoxins originally were generated by duplicating the already duplicated bacterial gene, i.e., by double-duplication.

16.
Baoqiang Cao, Ron Elber. Proteins 2010, 78(4):985-1003
We investigate small sequence adjustments (of one or a few amino acids) that induce large conformational transitions between distinct and stable folds of proteins. Such transitions are intriguing from evolutionary and protein‐design perspectives. They make it possible to search for ancient protein structures or to design protein switches that flip between folds and functions. A network of sequence flow between protein folds is computed for representative structures of the Protein Data Bank. The computed network is dense; on average, each structure is connected to tens of other folds. Proteins that attract sequences from a higher than expected number of neighboring folds are more likely to be enzymes and to have an alpha/beta fold. The large number of connections between folds may reflect the need of enzymes to adjust their structures for alternative substrates. The network of the Cro family is discussed, and we speculate that capacity is an important factor (but not the only one) that determines protein evolution. The experimentally observed flip from all alpha to alpha + beta fold is examined by the network tools. A kinetic model for the transition of sequences between the folds (with only protein stability in mind) is proposed. Proteins 2010. © 2009 Wiley‐Liss, Inc.

17.
The matrix metalloproteinases (MMPs) are zinc-dependent endopeptidases known for their ability to cleave one or several extracellular matrix (ECM) constituents, as well as non-matrix proteins. They comprise a large family of proteinases that share common structural and functional elements and are products of different genes. All members of this family contain a signal peptide, a propeptide and a catalytic domain. The catalytic domain contains two zinc ions and at least one calcium ion coordinated to various residues. All MMPs, with the exception of matrilysin, have a hemopexin/vitronectin-like domain that is connected to the catalytic domain by a hinge or linker region. The hemopexin-like domain influences tissue inhibitor of metalloproteinases (TIMP) binding, the binding of certain substrates, membrane activation, and some proteolytic activities. It has been proposed that the origin of MMPs could be traced to before the emergence of vertebrates from invertebrates. It appears conceivable that the domain assemblies occurred at an early stage of the diversification of different MMPs and that they progressed through the evolutionary process independent of one another, and perhaps parallel to each other.

18.
Melo F, Marti-Renom MA. Proteins 2006, 63(4):986-995
Reduced or simplified amino acid alphabets group the 20 naturally occurring amino acids into a smaller number of representative protein residues. To date, several reduced amino acid alphabets have been proposed, which have been derived and optimized by a variety of methods. The resulting reduced amino acid alphabets have been applied to pattern recognition, generation of consensus sequences from multiple alignments, protein folding, and protein structure prediction. In this work, amino acid substitution matrices and statistical potentials were derived based on several reduced amino acid alphabets and their performance assessed in a large benchmark for the tasks of sequence alignment and fold assessment of protein structure models, using as a reference frame the standard alphabet of 20 amino acids. The results showed that a large reduction in the total number of residue types does not necessarily translate into a significant loss of discriminative power for sequence alignment and fold assessment. Therefore, some definitions of a few residue types are able to encode most of the relevant sequence/structure information that is present in the 20 standard amino acids. Based on these results, we suggest that the use of reduced amino acid alphabets may make it possible to increase the accuracy of current substitution matrices and statistical potentials for the prediction of protein structure of remote homologs.

19.
The role of pattern databases in sequence analysis   (Cited by: 2; self-citations: 0; citations by others: 2)
In the wake of the numerous now-fruitful genome projects, we are entering an era rich in biological data. The field of bioinformatics is poised to exploit this information in increasingly powerful ways, but the abundance and growing complexity both of the data and of the tools and resources required to analyse them are threatening to overwhelm us. Databases and their search tools are now an essential part of the research environment. However, the rate of sequence generation and the haphazard proliferation of databases have made it difficult to keep pace with developments. In an age of information overload, researchers want rapid, easy-to-use, reliable tools for functional characterisation of newly determined sequences. But what are those tools? How do we access them? Which should we use? This review focuses on a particular type of database that is increasingly used in the task of routine sequence analysis: the so-called pattern database. The paper aims to provide an overview of the current status of pattern databases in common use, outlining the methods behind them and giving pointers on their diagnostic strengths and weaknesses.

20.
Identifying wheat leaf protein expression is a major challenge of functional genomics. Using two-dimensional gel electrophoresis, 541 wheat leaf proteins were separated and 55 of them were sequenced by nano liquid chromatography-tandem mass spectrometry. Peptide sequence data were screened against protein banks and expressed sequence tag public banks. Among these 55 spots, 20 proteins were found in wheat and 21 in other grass families (http://www.ncbi.nlm.nih.gov/). Twelve proteins showed similarities with other eukaryotic plant species. One protein showed homology to a bacterial sequence and another protein remained unknown. In 18 cases a significant score was found for the wheat TUC (Tentative Unique Contigs) of the PlantGDB (http://www.plantgdb.org/) data. In several cases, different spots were identified as corresponding to the same protein, which can probably be attributed to the hexaploid structure of wheat. The identified proteins were classified in six groups and their role is discussed. Most of them (31/55) are involved in carbohydrate metabolism.
