Similar literature (20 results)
1.
The three-dimensional structures of homologous proteins are usually conserved during evolution, as are critical residues in a few short sequence motifs that often constitute the active site in enzymes. The precise spatial organization of such sites depends on the lengths and positions of the secondary structural elements connecting the motifs. We show how members of protein superfamilies, such as kinesins, myosins, and G(alpha) subunits of trimeric G proteins, are identified and classed by simply counting the number of amino acid residues between important sequence motifs in their nucleotide triphosphate-hydrolyzing domains. Subfamily-specific landmark patterns (motif to motif scores) are principally due to inserts and gaps in surface loops. Unusual protein sequences and possible sequence prediction errors are detected.
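The motif-to-motif counting described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the function name `motif_spacings`, the toy sequence, and the three regular-expression patterns are assumptions made up for the example.

```python
import re

def motif_spacings(sequence, motif_patterns):
    """Count the residues lying between consecutive motifs.

    motif_patterns is an ordered list of (name, regex) pairs. Each motif is
    searched for downstream of the previous match.
    """
    spacings = {}
    prev_name, prev_end = None, 0
    for name, pattern in motif_patterns:
        match = re.compile(pattern).search(sequence, prev_end)
        if match is None:
            raise ValueError(f"motif {name!r} not found after position {prev_end}")
        if prev_name is not None:
            # Residues strictly between the end of one motif and the start of the next
            spacings[(prev_name, name)] = match.start() - prev_end
        prev_name, prev_end = name, match.end()
    return spacings

# Hypothetical toy sequence and illustrative patterns (not taken from the paper)
toy_sequence = "MAAGQTGSGKTYTMLLLLLSSRSHAVVVVDLAGSEKK"
toy_motifs = [
    ("P-loop", r"G.{4}GK[ST]"),
    ("switch-1", r"SSRSH"),
    ("switch-2", r"DLAGSE"),
]
print(motif_spacings(toy_sequence, toy_motifs))
```

Comparing such spacing dictionaries across sequences is one plausible way to group them into subfamilies; a real analysis would need curated motif definitions for each superfamily.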

2.
G-protein-coupled receptors (GPCRs) form the largest superfamily of membrane proteins, one that is biologically important and functionally diverse. GPCRs retain a characteristic membrane topology of seven alpha helices with three intracellular loops, three extracellular loops, and flanking N- and C-terminal residues. Subtle differences do exist among these sequences (clusters) in the helix boundaries (TM domains), loop lengths, and sequence features such as conserved motifs. They also differ in their amino acid substitution patterns and the physicochemical properties of the substituting residues, at both the intra-genomic and inter-genomic levels. In the current study, we employ prediction of helix boundaries and scores derived from amino acid substitution exchange matrices to identify the conserved amino acid residues (motifs) as consensus in an aligned set of homologous GPCR sequences. Co-clustered GPCRs from human and other genomes, organized as 32 clusters, were employed to study the amino acid conservation patterns and species-specific or cluster-specific motifs. Critical analysis of sequence composition and properties provides clues to functional relevance within and across genomes, with practical applications such as the design of mutations and the understanding of disease-causing genetic abnormalities.
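As a rough illustration of scoring alignment columns for conservation, the sketch below averages pairwise residue scores per column. The residue groups and the 1.0 / 0.5 / 0.0 scoring scheme are a hypothetical stand-in for a real substitution matrix, and the toy alignment and the names `pair_score`, `column_scores`, and `conserved_positions` are assumptions, not part of the study.

```python
from itertools import combinations

# Hypothetical coarse residue groups standing in for a real substitution matrix
GROUPS = ["AVLIMC", "FWY", "ST", "NQ", "DE", "KRH", "G", "P"]
GROUP_OF = {aa: index for index, group in enumerate(GROUPS) for aa in group}

def pair_score(a, b):
    """Score one residue pair: identical 1.0, same group 0.5, otherwise 0.0."""
    if "-" in (a, b):
        return 0.0  # gaps never count as conserved
    if a == b:
        return 1.0
    if a in GROUP_OF and GROUP_OF.get(a) == GROUP_OF.get(b):
        return 0.5
    return 0.0

def column_scores(alignment):
    """Average pairwise score for every column of an equal-length alignment."""
    if len(alignment) < 2 or len({len(row) for row in alignment}) != 1:
        raise ValueError("need at least two aligned sequences of equal length")
    scores = []
    for column in zip(*alignment):
        pairs = list(combinations(column, 2))
        scores.append(sum(pair_score(a, b) for a, b in pairs) / len(pairs))
    return scores

def conserved_positions(alignment, threshold=0.8):
    """Indices of columns whose average pairwise score reaches the threshold."""
    return [i for i, s in enumerate(column_scores(alignment)) if s >= threshold]

# Hypothetical toy alignment
toy_alignment = ["DRYLA", "DRYIV", "ERYLS"]
print(conserved_positions(toy_alignment))
```

Swapping `pair_score` for a lookup into a published exchange matrix would bring the sketch closer to what the abstract describes.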

3.
The amino acid sequences of proteins determine their three-dimensional structures and functions. However, how sequence information is related to structures and functions is still enigmatic. In this study, we show that at least a part of the sequence information can be extracted by treating amino acid sequences of proteins as a collection of English words, based on a working hypothesis that amino acid sequences of proteins are composed of short constituent amino acid sequences (SCSs) or “words”. We first confirmed that the English language very likely follows Zipf's law, a special case of a power law. We found that the rank-frequency plot of SCSs in proteins exhibits a similar distribution when low-rank tails are excluded. In comparison with natural English and “compressed” English without spaces between words, amino acid sequences of proteins show larger linear ranges and smaller exponents with heavier low-rank tails, demonstrating that the SCS distribution in proteins is largely scale-free. The distribution pattern of SCSs in proteins is similar among species, but species-specific features are also present. Based on the availability scores of SCSs, we found that sequence motifs are enriched in high-availability sites (i.e., “key words”) and vice versa. In fact, the highest availability peak within a given protein sequence often directly corresponds to a sequence motif. The amino acid composition of high-availability sites within motifs is different from that of entire motifs and all protein sequences, suggesting the possible functional importance of specific SCSs and their compositional amino acids within motifs. We anticipate that our availability-based word decoding approach is complementary to sequence alignment approaches in predicting functionally important sites of unknown proteins from their amino acid sequences.
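A rank-frequency analysis of short constituent sequences can be sketched as below. The word length `k`, the overlapping-window counting, and the plain least-squares fit on the log-log plot are assumptions chosen for illustration; the abstract does not specify how SCSs were extracted or how exponents were fitted.

```python
from collections import Counter
import math

def kmer_rank_frequency(sequences, k=3):
    """Frequencies of overlapping k-residue words, sorted from most to least common."""
    counts = Counter(
        seq[i:i + k] for seq in sequences for i in range(len(seq) - k + 1)
    )
    return sorted(counts.values(), reverse=True)

def zipf_exponent(frequencies):
    """Least-squares slope of log(frequency) against log(rank), sign-flipped.

    Needs at least two frequencies. Under Zipf's law, f(r) is proportional
    to 1 / r**s and the value returned here estimates s.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return -slope

# Frequencies that follow f = 12 / rank exactly, so the fitted exponent is 1
print(zipf_exponent([12, 6, 4, 3]))
```

Restricting the fit to a chosen range of ranks would correspond to the abstract's remark about excluding the tails.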

4.
We describe here a subclass of mammalian ABC transporters, the ABCA subfamily. This is a unique group that, in contrast to all other human ABC transporters, lacks a structural counterpart in yeast. The structural hallmark of the ABCA subfamily is the presence of a stretch of hydrophobic amino acids thought to span the membrane within the putative regulatory (R) domain. To date, four ABCA transporters have been fully characterised, but 11 ABCA-encoding genes have been identified. ABCA-specific motifs in the nucleotide binding folds can be detected when analysing the conserved sequences among the different members. These motifs may reveal functional constraints exclusive to this group of ABC transporters.

5.
The NBS-ARC domain sequences of Rx1 homologues were characterized in ten accessions of cultivated and wild potato species differing in their susceptibility to potato virus X. The NBS-ARC domain sequences studied contained a number of indels and nucleotide substitutions, some of them resulting in amino acid substitutions in the conserved motifs of the domain. There were no direct associations between the mutations of the NBS-ARC conserved motifs and the accessions’ susceptibility to the X virus.

6.
A foot-and-mouth disease virus (FMDV, HKN/2002) was isolated in Hong Kong in 2002. The nucleotide sequence of the 3D(pol) gene encoding the viral RNA-dependent RNA polymerase was determined and compared with that of the same gene from other FMDVs. The 3D(pol) gene was 1410 nucleotides in length, encoding a protein of 470 amino acid residues. Sequence comparisons indicated that HKN/2002 belonged to serotype O. An evolutionary tree based on the 3D(pol) sequences of 20 FMDV isolates revealed that the nucleotide sequence of the HKN/2002 3D(pol) gene was most similar to those of isolates found in Taiwan in 1997, suggesting that they share a common ancestor. The amino acid sequence encoded by the HKN/2002 3D(pol) gene was determined and aligned with those of representative isolates from seven other Picornaviridae genera. Eight highly conserved regions were detected, suggesting that these motifs are functionally important. Alignment of 20 FMDV 3D(pol) amino acid sequences revealed a hypermutation region near the N-terminus that may help the virus evade host immune systems.

7.
Direct genomic DNA amplification with primers recognizing the NBS–kinase sequence of the wheat gene Cre3 (GenBank accession AF052641) was used to obtain partial homologs of this gene in perennial and annual rye, wheat, and tall wheatgrass. The nucleotide sequences of the cloned fragments and their deduced amino acid sequences were compared to the already-known Cre3 homologs in other wheat, Aegilops, and barley genotypes. Within the tribe Triticeae, the extent of homology ranged from 86 to 94% for nucleotide sequences and from 74 to 96% for the deduced amino acid sequences, with the most variable region lying between the Kin3 and PR3 conserved motifs.

8.
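Sequence analysis of the segment 1 gene of rice dwarf virus and its encoded protein   Cited by: 7 (self-citations: 3, citations by others: 4)
Rice dwarf virus (RDV), a plant reovirus, is an important pathogen causing rice virus disease in southern China. A full-length cDNA of genome segment 1 (S1) was cloned from a Fujian (China) isolate and fully sequenced. The cloned S1 segment of the RDV Fujian isolate is 4422 bp long and contains one open reading frame of 4332 bp encoding a polypeptide (P1) of 1444 amino acids with a molecular weight of 164 kD. Analysis of the P1 amino acid sequence deduced from the gene sequence showed that it contains the conserved sequences of RNA-dependent RNA polymerases (RDRP): motif I (DXXXXD), motif II (SGXXXTXXXN), and motif III (GDD). In addition, a highly conserved region, EXXKXY, lies downstream of motif III. This indicates that the P1 protein encoded by RDV S1 may be an RDRP of the virus. Compared with a strain prevalent in Japan, the S1 nucleotide sequence and the encoded amino acid sequence of the RDV Fujian isolate show 95% and 97% homology, respectively. The S1 sequence of the RDV Fujian isolate has been accepted by GenBank under accession number U73201.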
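The motif notation used in this abstract (X for any residue) maps directly onto regular expressions. The sketch below searches for the four patterns in their stated order; the helper names and the toy protein are hypothetical and are not the actual P1 sequence.

```python
import re

def x_pattern_to_regex(pattern):
    """Turn a motif such as 'DXXXXD' into a regex where X matches any residue."""
    return re.compile(pattern.replace("X", "."))

def find_motifs_in_order(protein, patterns):
    """Locate each pattern downstream of the previous hit.

    Returns a list of (pattern, start, matched_text), or None if any motif
    is missing.
    """
    hits = []
    position = 0
    for pattern in patterns:
        match = x_pattern_to_regex(pattern).search(protein, position)
        if match is None:
            return None
        hits.append((pattern, match.start(), match.group()))
        position = match.end()
    return hits

# Motif patterns as quoted in the abstract; the protein is a made-up toy sequence
RDRP_PATTERNS = ["DXXXXD", "SGXXXTXXXN", "GDD", "EXXKXY"]
toy_protein = "MKDAAAADLLSGAAATAAANPPGDDQEAAKAYR"
print(find_motifs_in_order(toy_protein, RDRP_PATTERNS))
```

Short patterns such as GDD occur by chance in many proteins, so a hit from this kind of scan is only suggestive without the ordering and spacing context.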

9.
Helicase motifs: the engine that powers DNA unwinding   Cited by: 1 (self-citations: 0, citations by others: 1)
Helicases play essential roles in nearly all DNA metabolic transactions and have been implicated in a variety of human genetic disorders. A hallmark of these enzymes is the existence of a set of highly conserved amino acid sequences termed the 'helicase motifs' that were hypothesized to be critical for helicase function. These motifs are shared by another group of enzymes involved in chromatin remodelling. Numerous structure-function studies, targeting highly conserved residues within the helicase motifs, have been instrumental in uncovering the functional significance of these regions. Recently, the results of these mutational studies were augmented by the solution of the three-dimensional crystal structures of three different helicases. The structural model for each helicase revealed that the conserved motifs are clustered together, forming a nucleotide-binding pocket and a portion of the nucleic acid binding site. This result is gratifying, as it is consistent with structure-function studies suggesting that all the conserved motifs are involved in the nucleotide hydrolysis reaction. Here, we review helicase structure-function studies in the light of the recent crystal structure reports. The current data support a model for helicase action in which the conserved motifs define an engine that powers the unwinding of duplex nucleic acids, using energy derived from nucleotide hydrolysis and conformational changes that allow the transduction of energy between the nucleotide and nucleic acid binding sites. In addition, this ATP-hydrolysing engine is apparently also associated with proteins involved in chromatin remodelling and provides the energy required to alter protein-DNA structure, rather than duplex DNA or RNA structure.

10.
Discovering structural correlations in alpha-helices.   Cited by: 5 (self-citations: 2, citations by others: 3)
We have developed a new representation for structural and functional motifs in protein sequences based on correlations between pairs of amino acids and applied it to alpha-helical and beta-sheet sequences. Existing probabilistic methods for representing and analyzing protein sequences have traditionally assumed conditional independence of evidence. In other words, amino acids are assumed to have no effect on each other. However, analyses of protein structures have repeatedly demonstrated the importance of interactions between amino acids in conferring both structure and function. Using Bayesian networks, we are able to model the relationships between amino acids at distinct positions in a protein sequence in addition to the amino acid distributions at each position. We have also developed an automated program for discovering sequence correlations using standard statistical tests and validation techniques. In this paper, we test this program on sequences from secondary structure motifs, namely alpha-helices and beta-sheets. In each case, the correlations our program discovers correspond well with known physical and chemical interactions between amino acids in structures. Furthermore, we show that, using different chemical alphabets for the amino acids, we discover structural relationships based on the same chemical principle used in constructing the alphabet. This new representation of three-dimensional features in protein motifs, such as those arising from structural or functional constraints on the sequence, can be used to improve sequence analysis tools including pattern analysis and database search.
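One simple way to quantify dependence between two sequence positions is mutual information between the corresponding alignment columns. This is a swapped-in measure used purely for illustration: the abstract mentions Bayesian networks and standard statistical tests without naming them, and the function name and toy columns below are assumptions.

```python
from collections import Counter
import math

def mutual_information(column_a, column_b):
    """Mutual information (in bits) between two equally long residue columns."""
    if len(column_a) != len(column_b) or not column_a:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(column_a)
    count_a = Counter(column_a)
    count_b = Counter(column_b)
    count_ab = Counter(zip(column_a, column_b))
    mi = 0.0
    for (a, b), joint in count_ab.items():
        p_ab = joint / n
        # Compare the joint frequency with what independence would predict
        mi += p_ab * math.log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return mi

# Toy columns: position i holds E or K, a second position always holds the opposite charge
print(mutual_information("EEKK", "KKEE"))  # fully dependent columns
print(mutual_information("EEKK", "EKEK"))  # independent columns
```

With small samples, mutual information is biased upward, so a permutation test or similar validation would be needed before reading a high value as a real structural correlation.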

11.
Interdependent MHC-DRB exon-plus-intron evolution in artiodactyls   Cited by: 2 (self-citations: 0, citations by others: 2)
Exon 2 sequences of an expressed MHC-DRB locus from sheep were examined for polymorphisms in both the antigen-binding regions and the adjacent intronic mixed simple tandem repeat. Twenty-one novel exon 2 Ovar-DRB alleles were identified. Short nucleotide motifs are extensively shared between certain exon 2 regions of Ovar-DRB alleles. The simple repeat variations, the number of different amino acids at usually polymorphic sites, and the number of silent substitutions were reduced in the intraspecies analyses of sheep DRB sequences, compared with those of cattle and goats. Paradoxically, the abundance of different sheep alleles was similar to that of cattle and goats. This paradox may be explained by postulating a relatively small number of "ancient" alleles, with the present-day Ovar-DRB alleles being generated by reciprocal exchange of nucleotide motifs. At the antigen-binding sites, new combinations of amino acids were maintained in Ovar-DRB alleles by strong positive selection. In sheep, and to a lesser extent in goats and cattle, the DRB alleles can be divided into two groups. In one group, silent substitutions are increased when compared with the other. This suggests separate evolutionary pathways for certain groups of DRB alleles within a species. The simple repetitive sequences are also discussed with respect to the evolution of DRB alleles.

12.
Complete chromosome/genome sequences available from humans, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Saccharomyces cerevisiae were analyzed for the occurrence of mono-, di-, tri-, and tetranucleotide repeats. In all of the genomes studied, dinucleotide repeat stretches tended to be longer than other repeats. Additionally, tetranucleotide repeats in humans and trinucleotide repeats in Drosophila also seemed to be longer. Although the trends for different repeats are similar between different chromosomes within a genome, the density of repeats may vary between different chromosomes of the same species. The abundance or rarity of various di- and trinucleotide repeats in different genomes cannot be explained by nucleotide composition of a sequence or potential of repeated motifs to form alternative DNA structures. This suggests that in addition to nucleotide composition of repeat motifs, characteristic DNA replication/repair/recombination machinery might play an important role in the genesis of repeats. Moreover, analysis of complete genome coding DNA sequences of Drosophila, C. elegans, and yeast indicated that expansions of codon repeats corresponding to small hydrophilic amino acids are tolerated more, while strong selection pressures probably eliminate codon repeats encoding hydrophobic and basic amino acids. The locations and sequences of all of the repeat loci detected in genome sequences and coding DNA sequences are available at http://www.ncl-india.org/ssr and could be useful for further studies.
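Scanning a sequence for mono- to tetranucleotide repeats can be sketched with back-referencing regular expressions. The minimum of four repeat units, the function name `find_ssrs`, and the toy DNA string are assumptions for illustration; the abstract does not state the thresholds the authors used.

```python
import re

def find_ssrs(dna, max_unit=4, min_repeats=4):
    """Find perfect tandem repeats with unit lengths 1..max_unit.

    Returns sorted (start, unit, repeat_count) tuples. Units that are
    themselves repeats of a shorter motif (e.g. 'AA', 'ACAC') are skipped so
    that each locus is reported once, under its shortest unit.
    """
    results = []
    for unit_len in range(1, max_unit + 1):
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(dna):
            unit = match.group(1)
            is_compound = any(
                unit == unit[:d] * (unit_len // d)
                for d in range(1, unit_len)
                if unit_len % d == 0
            )
            if is_compound:
                continue
            results.append((match.start(), unit, len(match.group()) // unit_len))
    return sorted(results)

# Hypothetical toy sequence containing an (AC)4 and an (A)6 stretch
toy_dna = "TTGACACACACAGGAAAAAATCTG"
print(find_ssrs(toy_dna))
```

This only detects perfect repeats on one strand; imperfect repeats and reverse-complement equivalence of motifs would need extra handling.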

13.
14.
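Li Rong, Wang Zhezhi. Guihaia (《广西植物》), 2006, 26(5): 464-473
Bioinformatics methods and tools were used to analyze the nucleotide and amino acid sequences of 3-hydroxy-3-methylglutaryl-CoA reductase, an enzyme of terpenoid biosynthesis, from plants such as rubber tree, tobacco, pepper, and Andrographis paniculata registered in GenBank, and to predict and infer its composition, signal peptide, transmembrane topology, hydrophobicity/hydrophilicity, secondary and tertiary protein structure, and molecular phylogenetic relationships. The results show that the full-length genes of this enzyme class comprise 5′ and 3′ untranslated regions and one open reading frame. The protein has no signal peptide and is a hydrophilic transmembrane protein containing two functional HMG-CoA-binding motifs and two functional NADPH-binding motifs. α-Helices and random coils are the most abundant secondary-structure elements, with β-turns and extended strands scattered throughout the protein. The functional domain folds into a "V" shape whose two arms are formed by the helical N domain and the S domain, with the L domain forming the central part.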

15.

Background  

In biological sequence analysis, position specific scoring matrices (PSSMs) are widely used to represent sequence motifs in nucleotide as well as amino acid sequences. Searching with PSSMs in complete genomes or large sequence databases is a common, but computationally expensive task.
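To make the cost of PSSM searching concrete, the sketch below builds a log-odds matrix from a few aligned motif instances and scans a sequence with a naive sliding window. The uniform background, the pseudocount of 1, the function names, and the toy instances are all assumptions for illustration, not the method of the paper.

```python
import math

def build_pssm(instances, alphabet="ACGT", pseudocount=1.0):
    """Log-odds PSSM (base 2) from equal-length motif instances, uniform background."""
    length = len(instances[0])
    background = 1.0 / len(alphabet)
    pssm = []
    for pos in range(length):
        column = [instance[pos] for instance in instances]
        total = len(column) + pseudocount * len(alphabet)
        pssm.append({
            symbol: math.log2(((column.count(symbol) + pseudocount) / total) / background)
            for symbol in alphabet
        })
    return pssm

def scan(sequence, pssm, threshold):
    """Score every window of the sequence; return (start, score) for hits."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = sum(pssm[i][sequence[start + i]] for i in range(width))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Hypothetical motif instances and target sequence
toy_instances = ["TATAAT", "TATAAT", "TACAAT", "TATACT"]
toy_pssm = build_pssm(toy_instances)
print(scan("GGCTATAATGC", toy_pssm, threshold=5.0))
```

The scan touches every window at every matrix position, so its running time grows with sequence length times motif width, which is the expense the abstract refers to when searching whole genomes.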

16.
Evolutionary grouping of the RAS-protein family   Cited by: 1 (self-citations: 0, citations by others: 1)
Over 50 proteins related to the mammalian H-, K-, and N-RAS GTP binding and hydrolyzing proteins are known. These relatively low molecular weight proteins are usually grouped into four subfamilies, termed true RAS, RAS-like, RHO, and RAB/YPT, based on the presence of shared amino acid sequence motifs in addition to those involved in guanine nucleotide binding. Here, we apply parsimony analysis to the overall amino acid sequences of these proteins to infer possible phylogenetic relationships among them.

17.
18.
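NDF1, IPF1, and HNF4 are DNA-binding proteins involved in insulin gene expression. Comparison of the primary amino acid sequences, motifs, and domains of these three nuclear proteins from human, mouse, and rat in the SWISS-PROT protein database showed that their structures are very similar. Given the relationship between protein structure and function, the nucleotide sequences of the insulin gene bound by these DNA-binding proteins are presumed to be similar as well. Insulin DNA sequences from human, mouse, and rat were retrieved from the GenBank nucleotide database, and the promoter-region nucleotide sequences of the three were compared with ClustalW, revealing a stretch of relatively similar nucleotide sequence. A search of the TRANSFAC gene transcription database for the nucleotide-binding sites of NDF1, IPF1, and HNF4 showed that part of the conserved sequence in the nucleic acid alignment matches the DNA-binding sites of these three transcription factors in TRANSFAC. Other conserved nucleotide sequences may be binding sites of other, as yet unknown, DNA-binding proteins. This sequence-alignment design provides a simple and practical way for molecular biology experiments to locate and verify the sites where insulin DNA-binding proteins bind nucleotides.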

19.
20.