Similar articles
 20 similar articles found; search took 15 ms
1.
The rapid increase in genomic information requires new techniques to infer protein function and predict protein-protein interactions. Bioinformatics identifies modular signaling domains within protein sequences with a high degree of accuracy. In contrast, little success has been achieved in predicting short linear sequence motifs within proteins targeted by these domains to form complex signaling networks. Here we describe a peptide library-based searching algorithm, accessible over the World Wide Web, that identifies sequence motifs likely to bind to specific protein domains such as 14-3-3, SH2, and SH3 domains, or likely to be phosphorylated by specific protein kinases such as Src and AKT. Predictions from database searches for proteins containing motifs matching two different domains in a common signaling pathway provide a much higher success rate. This technology facilitates prediction of cell signaling networks within proteomes, and could aid in the identification of drug targets for the treatment of human diseases.
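As a rough illustration of how a motif search of this kind can work, the sketch below slides a position-specific scoring matrix along a protein sequence and reports windows that score above a threshold. The abstract does not specify the scoring scheme, so the matrix values, the default penalty, the thresholds, and the function names are all hypothetical; this is a minimal sketch of the general technique, not the authors' algorithm.

```python
# Hypothetical toy matrices for two 3-residue motifs. Real matrices would be
# derived from peptide library data; these values are illustrative only.
TOY_MATRIX_A = [
    {"R": 2.0, "K": 1.0},
    {"S": 1.5, "T": 1.0},
    {"P": 2.0},
]
TOY_MATRIX_B = [
    {"Y": 2.0},
    {"E": 1.0, "D": 1.0},
    {"I": 1.5, "L": 1.5},
]


def score_window(window, matrix, default=-1.0):
    """Sum per-position scores; residues missing from a column get `default`."""
    return sum(column.get(residue, default) for residue, column in zip(window, matrix))


def scan(sequence, matrix, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = score_window(window, matrix)
        if score >= threshold:
            hits.append((start, window, score))
    return hits


def matches_both(sequence, matrix_a, matrix_b, threshold):
    """True if the sequence has at least one hit for each of the two motifs."""
    return bool(scan(sequence, matrix_a, threshold)) and bool(scan(sequence, matrix_b, threshold))
```

Requiring hits for two different motifs in one protein, as `matches_both` does, mirrors the abstract's observation that searching for two domains in a common pathway improves the success rate.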

2.
Biet E  Maurisse R  Dutreix M  Sun Js 《Biochemistry》2001,40(6):1779-1786
Oligonucleotide-directed triple helix formation provides an elegant rational basis for gene-specific DNA targeting and has been widely used to interfere with gene expression ("antigene" strategies) and as a molecular tool for biological studies. Various strategies have been developed to introduce sequence modifications in genomes. However, the low efficiency of the overall process in eucaryotic cells impairs efficient recovery of recombinant genomes. Since one limiting step in homologous recombination is the targeting to the homologous sequence, we have tested the contribution of an oligonucleotide-directed triple helix formation on the RecA-dependent association of an oligonucleotide and its homologous target on duplex DNA (D-loop formation). For this study, the recombinant ssDNA fragment was noncovalently associated with a triple helix-forming oligonucleotide. The physicochemical and biochemical characteristics of the triple helix and D-loop structures formed by the complex molecules in the presence or in the absence of RecA protein were determined. We have demonstrated that the triple helix-forming oligonucleotide increases the efficiency of D-loop formation and that the RecA protein also speeds up triple helix formation. The so-called "GOREC" (for guided homologous recombination) approach can be developed as a novel tool to improve the efficiency of directed mutagenesis and gene alteration in living organisms.

3.
4.
We identified the most frequent, variable-length DNA sequence motifs in the human and mouse genomes and sub-selected those with multiple recurrences in the intergenic and intronic regions and at least one additional exonic instance in the corresponding genome. We discovered that these motifs have virtually no overlap with intronic sequences that are conserved between human and mouse, and thus are genome-specific. Moreover, we found that these motifs span a substantial fraction of previously uncharacterized human and mouse intronic space. Surprisingly, we found that these genome-specific motifs are over-represented in the introns of genes belonging to the same biological processes and molecular functions in both the human and mouse genomes even though the underlying sequences are not conserved between the two genomes. In fact, the processes and functions that are linked to these genome-specific sequence-motifs are distinct from the processes and functions which are associated with intronic regions that are conserved between human and mouse. The findings show that intronic regions from different genomes are linked to the same processes and functions in the absence of underlying sequence conservation. We highlight the ramifications of this observation with a concrete example that involves the microsatellite instability gene MLH1.

5.
6.
La D  Silver M  Edgar RC  Livesay DR 《Biochemistry》2003,42(30):8988-8998
Protein motifs represent highly conserved regions within protein families and are generally accepted to describe critical regions required for protein stability and/or function. In this comprehensive analysis, we present a robust, unique approach to identify and compare corresponding mesophilic and thermophilic sequence motifs between all orthologous proteins within 44 microbial genomes. Motif similarity is determined through global sequence alignment of mesophilic and thermophilic motif pairs, which are identified by a greedy algorithm. Our results reveal only modest correlation between motif and overall sequence similarity, highlighting the rationale of motif-based approaches in comprehensive multigenome comparisons. Conserved mutations reflect previously suggested physicochemical principles for conferring thermostability. Additionally, comparisons between corresponding mesophilic and thermophilic motif pairs provide key biochemical insights related to thermostability and can be used to test the evolutionary robustness of individual structural comparisons. We demonstrate the ability of our unique approach to provide key insights in two examples: the TATA-box binding protein and glutamate dehydrogenase families. In the latter example, conserved mutations hint at novel origins leading to structural stability differences within the hexamer structures. Additionally, we present amino acid composition data and average protein length comparisons for all 44 microbial genomes.
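The global alignment step mentioned in this abstract can be sketched with the standard Needleman-Wunsch recurrence. The abstract gives neither the substitution matrix nor the gap penalties, so the match, mismatch, and gap values below are placeholder assumptions, and the greedy pairing of motifs is not shown.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring parameters are illustrative assumptions; a real analysis
    would more likely use an amino acid substitution matrix.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]
```

Only two rows of the dynamic programming table are kept because the sketch returns the score alone; recovering the alignment itself would require storing the full table or traceback pointers.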

7.
Poliovirus (PV) 2C protein is a nonstructural polypeptide involved in viral RNA replication, whose biochemical activity(ies) in this process has not been defined. By using site-directed mutagenesis, it was shown previously that disruption of nucleotide-binding motifs present in this protein abolished viral RNA synthesis (C. Mirzayan and E. Wimmer, Virology 189:547-555, 1992; N. L. Teterina, K. M. Kean, E. Gorbalenya, V. I. Agol, and M. Girard, J. Gen. Virol. 73:1977-1986, 1992). We have tested whether PV 2C or 2BC protein provided in trans could rescue the replication of these mutated genomes. Rescuing proteins were provided either by cotransfection with helper chimeric PV-coxsackievirus genomes or by expression in cells with a vaccinia virus-T7 RNA polymerase transient-expression system. We report here that replication of mutated RNA genomes was poorly supported in trans both by helper genomes and by expressed 2C or 2BC proteins. Similarly, very inefficient complementation was observed for two mutated genomes with lethal lesions in the 3D polymerase coding sequence. Our results indicate that poliovirus RNA replication shows marked preference for proteins contributed in cis.

8.
9.
We present an algorithm to detect protein sub-structural motifs from primary sequence. The input to the algorithm is a set of aligned multiple protein sequences. It uses wavelet transforms to decompose protein sequences represented numerically by different indices (such as polarity, accessible surface area or electron-ion interaction potentials of the amino acids). The numerical representation of a protein sequence has significant correlation with its biological activity, thus common motifs are expected to be observable from the wavelet spectrum. The decomposed signals are then up-sampled and similarity search techniques are used to identify similar regions across all the proteins at multiple scales. Results indicate that wavelet transform techniques are a promising approach for rapid motif detection.
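The first two steps described here, numerical encoding followed by wavelet decomposition, can be sketched as follows. The abstract does not name the wavelet family or give index values, so this sketch assumes a Haar wavelet and substitutes the Kyte-Doolittle hydropathy scale for the indices the authors list; the up-sampling and similarity search stages are omitted.

```python
import math

# Kyte-Doolittle hydropathy values, used here only as a stand-in index.
# The abstract mentions polarity, accessible surface area, and
# electron-ion interaction potentials instead.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence):
    """Map a protein sequence to a list of numbers, one per residue."""
    return [HYDROPATHY[residue] for residue in sequence]


def haar_step(signal):
    """One level of the Haar transform: (approximation, detail) coefficients."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd-length input by repeating the last value
    scale = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Repeat haar_step on the approximation; return it plus per-level details."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details
```

Each level halves the resolution, so the detail coefficients at successive levels describe variation in the chosen index over progressively longer stretches of sequence, which is what makes a multi-scale comparison across proteins possible.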

10.
The RecJ protein of Escherichia coli plays an important role in a number of DNA repair and recombination pathways. RecJ catalyzes processive degradation of single-stranded DNA in a 5'-to-3' direction. Sequences highly related to those encoding RecJ can be found in most of the eubacterial genomes sequenced to date. From alignment of these sequences, seven conserved motifs are apparent. At least five of these motifs are shared among a large family of proteins in eubacteria, eukaryotes, and archaea, including the PPX1 polyphosphatase of yeast and Drosophila Prune. Archaeal genomes are particularly rich in such sequences, but it has not been clear whether any of the encoded proteins play a functional role similar to that of RecJ exonuclease. We have investigated three such proteins from Methanococcus jannaschii with the strongest overall sequence similarity to E. coli RecJ. Two of the genes, MJ0977 and MJ0831, partially complement a recJ mutant phenotype in E. coli. The expression of MJ0977 in E. coli resulted in high levels of a thermostable single-stranded DNase activity with properties similar to those of RecJ exonuclease. Despite overall weak sequence similarity between the MJ0977 product and RecJ, these nucleases are likely to have similar biological functions.

11.

Background

Arthropod cuticle is composed predominantly of a self-assembling matrix of chitin and protein. Genes encoding structural cuticular proteins are remarkably abundant in arthropod genomes, yet there has been no systematic survey of conserved motifs across cuticular protein families.

Methodology/Principal Findings

Two short sequence motifs with conserved tyrosines were identified in Drosophila cuticular proteins that were similar to the GYR and YLP Interpro domains. These motifs were found in members of the CPR, Tweedle, CPF/CPFL, and (in Anopheles gambiae) CPLCG cuticular protein families, and the Dusky/Miniature family of cuticle-associated proteins. Tweedle proteins have a characteristic motif architecture that is shared with the Drosophila protein GCR1 and its orthologs in other species, suggesting that GCR1 is also cuticular. A resilin repeat, which has been shown to confer elasticity, matched one of the motifs; a number of other Drosophila proteins of unknown function exhibit a motif architecture similar to that of resilin. The motifs were also present in some proteins of the peritrophic matrix and the eggshell, suggesting molecular convergence among distinct extracellular matrices. More surprisingly, gene regulation, development, and proteolysis were statistically over-represented ontology terms for all non-cuticular matches in Drosophila. Searches against other arthropod genomes indicate that the motifs are taxonomically widespread.

Conclusions

This survey suggests a more general definition for GYR and YLP motifs and reveals their contribution to several types of extracellular matrix. They may define sites of protein interaction with DNA or other proteins, based on ontology analysis. These results can help guide experimental studies on the biochemistry of cuticle assembly.

12.
We compared levels of sequence divergence between fourfold synonymous coding sites and noncoding sites from the intergenic and intronic regions of the Plasmodium falciparum and Plasmodium reichenowi genomes. We observed significant differences in the level of divergence between these classes of silent sites. Fourfold synonymous coding sites exhibited the highest level of sequence divergence, followed by introns, and then intergenic sequences. This pattern of relative divergence rates has been observed in primate genomes but was unexpected in Plasmodium due to a paucity of variation at silent sites in P. falciparum and the corollary hypothesis that silent sites in this genome may be subject to atypical selective constraints. Exclusion of hypermutable CpG dinucleotides reduces the divergence level of synonymous coding sites to that of intergenic sites but does not diminish the significantly higher divergence level of introns relative to intergenic sites. A greater than expected incidence of CpG dinucleotides in intergenic regions less than 500 bp from genes may indicate selective maintenance of regulatory motifs containing CpGs. Divergence rates of different classes of silent sites in these Plasmodium genomes are determined by a combination of mutational and selective pressures.
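To make the CpG exclusion concrete, the sketch below computes the uncorrected proportion of differing sites between two aligned sequences, optionally skipping positions that fall within a CpG dinucleotide in either sequence. The abstract does not state which divergence estimator or CpG masking rule was used, so both choices, along with the function names, are assumptions made for illustration.

```python
def cpg_positions(sequence):
    """Indices of every base that is part of a 'CG' dinucleotide."""
    positions = set()
    for i in range(len(sequence) - 1):
        if sequence[i:i + 2] == "CG":
            positions.update((i, i + 1))
    return positions


def divergence(seq_a, seq_b, exclude_cpg=False):
    """Uncorrected proportion of differing sites between aligned sequences.

    Alignment gaps ('-') are ignored. With exclude_cpg=True, sites inside a
    CpG in either sequence are skipped (an assumed masking rule).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    skip = cpg_positions(seq_a) | cpg_positions(seq_b) if exclude_cpg else set()
    compared = 0
    differences = 0
    for i, (x, y) in enumerate(zip(seq_a, seq_b)):
        if i in skip or "-" in (x, y):
            continue
        compared += 1
        differences += x != y
    return differences / compared if compared else 0.0
```

Running the function with and without `exclude_cpg` on each class of silent site gives the kind of before-and-after comparison the abstract reports, although a published analysis would normally also apply a correction for multiple substitutions.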

13.
14.
We have identified four novel repeats and two domains in cell surface proteins encoded by the Methanosarcina acetivorans genome and in some archaeal and bacterial genomes. The repeats correspond to a certain number of amino acid residues present in tandem in a protein sequence and each repeat is characterized by conserved sequence motifs. These correspond to: (a) a 42 amino acid (aa) residue RIVW repeat; (b) a 45 aa residue LGxL repeat; (c) a 42 aa residue LVIVD repeat; and (d) a 54 aa residue LGFP repeat. The domains correspond to a certain number of aa residues in a protein sequence that do not comprise internal repeats. These correspond to: (a) a 200 aa residue DNRLRE domain; and (b) a 70 aa residue PEGA domain. We discuss the occurrence of these repeats and domains in the different proteins and genomes analysed in this work.

15.
16.
MOTIVATION: The detection of function-related local 3D-motifs in protein structures can provide insights towards protein function in the absence of sequence or fold similarity. Protein loops are known to play important roles in protein function and several loop classifications have been described, but the automated identification of putative functional 3D-motifs in such classifications has not yet been addressed. This identification can be used on sequence annotations. RESULTS: We evaluated three different scoring methods for their ability to identify known motifs from the PROSITE database in ArchDB. More than 500 new putative function-related motifs not reported in PROSITE were identified. Sequence patterns derived from these motifs were especially useful at predicting precise annotations. The number of reliable sequence annotations could be increased up to 100% with respect to standard BLAST. CONTACT: boliva@imim.es SUPPLEMENTARY INFORMATION: Supplementary Data are available at Bioinformatics online.

17.
George J  Raju R 《Journal of virology》2000,74(20):9776-9785
The 3' nontranslated region of the genomes of Sindbis virus (SIN) and other alphaviruses carries several repeat sequence elements (RSEs) as well as a 19-nucleotide (nt) conserved sequence element (3'CSE). The 3'CSE and the adjoining poly(A) tail of the SIN genome are thought to act as viral promoters for negative-sense RNA synthesis and genome replication. Eight different SIN isolates that carry altered 3'CSEs were studied in detail to evaluate the role of the 3'CSE in genome replication. The salient findings of this study as they apply to SIN infection of BHK cells are as follows: i) the classical 19-nt 3'CSE of the SIN genome is not essential for genome replication, long-term stability, or packaging; ii) compensatory amino acid or nucleotide changes within the SIN genomes are not required to counteract base changes in the 3' terminal motifs of the SIN genome; iii) the 5' 1-kb regions of all SIN genomes, regardless of the differences in 3' terminal motifs, do not undergo any base changes even after 18 passages; iv) although extensive addition of AU-rich motifs occurs in the SIN genomes carrying defective 3'CSE, these are not essential for genome viability or function; and v) the newly added AU-rich motifs are composed predominantly of RSEs. These findings are consistent with the idea that the 3' terminal AU-rich motifs of the SIN genomes do not bind directly to the viral polymerase and that cellular proteins with broad AU-rich binding specificity may mediate this interaction. In addition to the classical 3'CSE, other RNA motifs located elsewhere in the SIN genome must play a major role in template selection by the SIN RNA polymerase.

18.
19.
Using a combination of several methods for protein sequence comparison and motif analysis, it is shown that the four recently described pseudouridine synthases with different specificities belong to four distinct families. Three of these families share two conserved motifs that are likely to be directly involved in catalysis. One of these motifs is detected also in two other families of enzymes that specifically bind uridine, namely deoxycytidine triphosphate deaminases and deoxyuridine triphosphatases. It is proposed that this motif is an essential part of the uridine-binding site. Two of the pseudouridine synthases, one of which modifies the anticodon arm of tRNAs and the other is predicted to modify a portion of the large ribosomal subunit RNA belonging to the peptidyltransferase center, are encoded in all extensively sequenced genomes, including the 'minimal' genome of Mycoplasma genitalium. These particular RNA modifications and the respective enzymes are likely to be essential for the functioning of any cell.

20.
There has been a dramatic increase in the number of completely sequenced bacterial genomes during the past two years as a result of the efforts both of public genome agencies and the pharmaceutical industry. The availability of completely sequenced genomes permits more systematic analyses of genes, evolution and genome function than was otherwise possible. Using computational methods for identifying genes and their functions (including statistics, sequence similarity, motifs, profiles, protein folds and probabilistic models), it is possible to develop characteristic genome signatures, assign functions to genes, identify pathogenic genes, identify metabolic pathways, develop diagnostic probes and discover potential drug-binding sites. All of these directions are critical to understanding bacterial growth, pathogenicity and host-pathogen interactions.
