Similar literature
1.
2.
The EMOTIF database is a collection of more than 170 000 highly specific and sensitive protein sequence motifs representing conserved biochemical properties and biological functions. These protein motifs are derived from 7697 sequence alignments in the BLOCKS+ database (released on June 23, 2000) and all 8244 protein sequence alignments in the PRINTS database (version 27.0) using the emotif-maker algorithm developed by Nevill-Manning et al. (Nevill-Manning,C.G., Wu,T.D. and Brutlag,D.L. (1998) Proc. Natl Acad. Sci. USA, 95, 5865-5871; Nevill-Manning,C.G., Sethi,K.S., Wu,T.D. and Brutlag,D.L. (1997) ISMB-97, 5, 202-209). Because the amino acids and groups of amino acids in these sequence motifs represent critical positions conserved in evolution, search algorithms employing the EMOTIF patterns can identify and classify more widely divergent sequences than methods based on global sequence similarity. The EMOTIF protein pattern database is available at http://motif.stanford.edu/emotif/.
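
As a rough illustration of how a pattern built from amino acids and amino acid groups can be used for searching, the sketch below treats a motif as a regular expression, with bracketed groups for substitutable residues and `.` for unconstrained positions. The motif string, the sequences, and the function name `scan` are hypothetical and chosen only for the example; they are not taken from the EMOTIF database, and the actual EMOTIF search tools may work differently.

```python
import re

def scan(pattern, sequences):
    """Return (name, start, matched_text) for each non-overlapping match.

    `pattern` is a regex-style motif: a bracketed group such as [ILMV]
    allows any residue in the group, and "." allows any residue.
    """
    regex = re.compile(pattern)
    hits = []
    for name, seq in sequences.items():
        for match in regex.finditer(seq):
            hits.append((name, match.start(), match.group()))
    return hits

# Hypothetical motif and toy sequences, for illustration only.
MOTIF = "[ILMV]G.GK[ST]"
SEQS = {"protA": "MAAVGSGKSTLL", "protB": "MKKLLPQRW"}

if __name__ == "__main__":
    for name, start, text in scan(MOTIF, SEQS):
        print(name, start, text)
```

Because the pattern constrains only a few positions, it can match sequences that share little overall similarity, which is the property the abstract attributes to motif-based searching.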

3.
4.
5.
Phylomat: an automated protein motif analysis tool for phylogenomics (total citations: 2; self-citations: 0; citations by others: 2)
Recent progress in genomics, proteomics, and bioinformatics enables unprecedented opportunities to examine the evolutionary history of molecular, cellular, and developmental pathways through phylogenomics. Accordingly, we have developed a motif analysis tool for phylogenomics (Phylomat, http://alg.ncsa.uiuc.edu/pmat) that scans predicted proteome sets for proteins containing highly conserved amino acid motifs or domains, for in silico analysis of the evolutionary history of these motifs/domains. Phylomat enables the user to download results as full protein sequences or as the motif/domain sequences extracted from each protein. It displays tables containing the percent distribution of a motif/domain across organisms, normalized to proteome size. Phylomat can also align the set of full protein or extracted motif/domain sequences and predict a neighbor-joining tree from relative sequence similarity. Taken together, Phylomat serves as a user-friendly data-mining tool for the phylogenomic analysis of conserved sequence motifs/domains in annotated proteomes from the three domains of life.
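
The normalization to proteome size mentioned above can be sketched as a simple percentage: the number of proteins containing the motif divided by the number of proteins in the proteome. The function name, organism labels, and counts below are hypothetical, and this is only one plausible reading of the normalization, not Phylomat's actual code.

```python
def percent_with_motif(hit_counts, proteome_sizes):
    """Percentage of each organism's proteome that contains the motif.

    hit_counts: organism -> number of proteins with at least one hit.
    proteome_sizes: organism -> total number of predicted proteins.
    Organisms with no recorded hits are reported as 0.0.
    """
    return {
        organism: 100.0 * hit_counts.get(organism, 0) / size
        for organism, size in proteome_sizes.items()
        if size > 0
    }

if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sizes = {"organism_a": 200, "organism_b": 100}
    hits = {"organism_a": 5}
    print(percent_with_motif(hits, sizes))
```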

6.
Many protein regions have been shown to be intrinsically disordered, lacking unique structure under physiological conditions. These intrinsically disordered regions are not only very common in proteomes, but also crucial to the function of many proteins, especially those involved in signaling, recognition, and regulation. The goal of this work was to identify the prevalence, characteristics, and functions of conserved disordered regions within protein domains and families. A database was created to store the amino acid sequences of nearly one million proteins and their domain matches from the InterPro database, a resource integrating eight different protein family and domain databases. Disorder prediction was performed on these protein sequences. Regions of sequence corresponding to domains were aligned using a multiple sequence alignment tool. From this initial information, regions of conserved predicted disorder were found within the domains. The search identified runs of consecutive positions in the multiple sequence alignments in which 90% or more of the sequences were predicted to be disordered, and was constrained to report only runs at least 20 amino acids in length. The results of this work included 3,653 regions of conserved disorder prediction, found within 2,898 distinct InterPro entries. Most regions of conserved predicted disorder detected were short, with fewer than 10% exceeding 30 residues in length.
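
The search criterion described above (consecutive alignment columns in which at least 90% of sequences are predicted disordered, with a minimum run length of 20) can be sketched as follows. The input encoding (`D` for predicted disorder, `O` for predicted order, `-` for a gap), the function name, and the decision to count gaps as not disordered are assumptions made for this example; the abstract does not say how gaps were handled.

```python
def conserved_disorder_regions(rows, min_fraction=0.9, min_length=20):
    """Find runs of alignment columns with conserved predicted disorder.

    rows: equal-length strings, one per aligned sequence, where 'D' marks a
    residue predicted disordered; any other character ('O', '-') does not
    count as disordered (an assumption of this sketch).
    Returns half-open (start, end) column ranges.
    """
    if not rows:
        return []
    n_seqs = len(rows)
    n_cols = len(rows[0])
    regions = []
    start = None
    # Iterate one column past the end so a run touching the last column closes.
    for col in range(n_cols + 1):
        conserved = (
            col < n_cols
            and sum(row[col] == "D" for row in rows) / n_seqs >= min_fraction
        )
        if conserved and start is None:
            start = col
        elif not conserved and start is not None:
            if col - start >= min_length:
                regions.append((start, col))
            start = None
    return regions

if __name__ == "__main__":
    toy = ["OOOOO" + "D" * 25 + "OOOOO"] * 10
    print(conserved_disorder_regions(toy))
```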

7.
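Glutaredoxins (GRXs) are small redox proteins that regulate the redox state of proteins and thereby maintain their function; they play important roles in growth, development, and antioxidant responses. Glutaredoxin-like (GRL) proteins are a newly classified type of GRX. To investigate the function of the GRL gene family in upland cotton, this study carried out bioinformatic and expression analyses of the GhGRL gene family. The results show that the 32 GhGRL genes are mainly localized to the nucleus and that all of them contain the GRX-GRX-Like conserved domain. Multiple sequence alignment and conserved-sequence analysis of the amino acid sequences encoded by the GhGRL genes showed that sequence similarity among family members is about 31.31% and that most members contain four conserved motifs, all of which overlap the conserved domain. Based on the phylogenetic tree, the 32 GhGRL genes fall into three subgroups, and gene structure analysis showed that most genes in the family lack introns. Chromosomal mapping showed that the GhGRL genes are distributed across 19 chromosomes, with the number of GhGRL genes differing greatly between chromosomes. Expression profile data showed that most GhGRL genes are expressed in all seven tissues and organs examined (root, stem, stamen, pistil, ovary, leaf, and flower), with expression differing among them. These results help characterize the GhGRL gene family in cotton and provide a basis for further study of its biological functions.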

8.
9.
Most known plant disease-resistance genes (R genes) include in their encoded products domains such as a nucleotide-binding site (NBS) or leucine-rich repeats (LRRs). Sequences with unknown function, but encoding these conserved domains, have been defined as resistance gene analogues (RGAs). The conserved motifs within plant NBS domains make it possible to use degenerate primers and PCR to isolate RGAs. We used degenerate primers deduced from conserved motifs in the NBS domain of NBS-LRR resistance proteins to amplify genomic sequences from Lens species. Fragments of approximately 500-850 bp were obtained. The nucleotide sequence analysis of these fragments revealed 32 different RGA sequences in Lens species with a high similarity (up to 91%) to RGAs from other plants. The predicted amino acid sequences showed that lentil sequences contain all the conserved motifs (P-loop, kinase-2, kinase-3a, GLPL, and MHD) present in the majority of other known plant NBS-LRR resistance genes. Phylogenetic analyses grouped the Lens NBS sequences with the Toll and interleukin-1 receptor (TIR) subclass of NBS-LRR genes, as well as with RGA sequences isolated from other legume species. Using inverse PCR on one putative RGA of lentil, we were able to amplify the flanking regions of this sequence, which contained features found in R proteins.
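
To illustrate what a degenerate primer is, the sketch below expands the standard IUPAC nucleotide ambiguity codes into a regular expression and reports where a primer could anneal on the forward strand of a template. The primer and template strings and the function names are hypothetical; they are not the primers used in the study, and the sketch ignores the reverse strand, mismatches, and annealing conditions.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

def primer_to_regex(primer):
    """Compile a degenerate primer into a regex over A/C/G/T."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))

def find_sites(primer, template):
    """Return start positions of non-overlapping forward-strand matches."""
    regex = primer_to_regex(primer)
    return [match.start() for match in regex.finditer(template.upper())]

if __name__ == "__main__":
    # Toy primer and template, for illustration only.
    print(find_sites("GGNGGNAAR", "TTGGAGGCAAGCC"))
```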

10.
11.
Interpreting the impact of human genome variation on phenotype is challenging. The functional effect of protein-coding variants is often predicted using sequence conservation and population frequency data; however, other factors are likely relevant. We hypothesized that variants in protein post-translational modification (PTM) sites contribute to phenotype variation and disease. We analyzed the fraction of rare variants and the non-synonymous to synonymous variant ratio (Ka/Ks) in 7,500 human genomes and found a significant negative selection signal in PTM regions that is independent of six factors, including conservation, codon usage, and GC-content, and is widely distributed across tissue-specific genes and function classes. PTM regions are also enriched in known disease mutations, suggesting that PTM variation is more likely deleterious. PTM constraint also affects flanking sequence around modified residues and increases around clustered sites, indicating the presence of functionally important short linear motifs. Using target site motifs of 124 kinases, we predict at least ∼180,000 motif-breaker amino acid residues that disrupt PTM sites when substituted, and highlight kinase motifs that show specific negative selection and enrichment of disease mutations. We provide this dataset with corresponding hypothesized mechanisms as a community resource. As an example of our integrative approach, we propose that PTPN11 variants in Noonan syndrome aberrantly activate the protein by disrupting an uncharacterized cluster of phosphorylation sites. Further, as PTMs are molecular switches that are modulated by drugs, we study mutated binding sites of PTM enzymes in disease genes and define a drug-disease network containing 413 novel predicted disease-gene links.

12.
13.
14.
15.
16.
Tripeptidyl-peptidase II (TPP II) is a cytosolic peptidase that has been implicated in fat formation and cancer, apparently independent of its enzymatic activity. In a search for alternative functional regions, conserved motifs were identified and eleven signatures were constructed. Seven of the signatures covered previously investigated residues, whereas the functional importance of the other motifs is unknown. This provides directions for future investigations of alternative activities of TPP II. The obtained signatures provide an efficient bioinformatic tool for the identification of TPP II homologues. Using them, a TPP II sequence homologue from fission yeast, Schizosaccharomyces pombe, was identified and demonstrated to encode the TPP II-like protein previously reported as multicorn. Furthermore, a homologous protein was found in the prokaryote Blastopirellula marina, although the TPP II function was apparently not conserved. This gene is probably the result of a rare gene transfer from eukaryote to prokaryote.

17.
18.
19.
Protein tyrosine kinase-7 (PTK7) is a receptor protein tyrosine kinase (RPTK)-like molecule that contains a catalytically inactive tyrosine kinase domain. We report here the genomic structure of the human PTK7 gene, determined by screening a BAC library and DNA sequencing. The PTK7 gene is organized into 20 exons. All of the splicing junctions follow the conserved GT/AG rule. The exon-intron structure of the PTK7 gene in the region that encodes the catalytic domain is distinct from those of other RPTKs with strong homology. The 5'-flanking sequence of the PTK7 gene contains two GC boxes that concatenate Sp1 binding motifs, but does not contain either the TATA or CAAT consensus sequence. A luciferase reporter assay demonstrated that the 883-bp 5'-flanking sequence is functional as a promoter of the PTK7 gene. We identified four new splicing variants in testis that could be derived from alternative splicing of exons 8-10, 10, a part of 12-13, and 16. The expression patterns of the splicing variants in hepatoma and colon cancer cells were different from those in testis. Our findings suggest that PTK7 is evolutionarily distinct from other RPTKs, and that the alternative splicing of PTK7 mRNA may contribute to its diverse function in cell signaling.

20.