Similar articles
20 similar articles found (search time: 31 ms)
1.
MOTIVATION: Cis-acting regulatory elements are frequently constrained by both sequence content and positioning relative to a functional site, such as a splice or polyadenylation site. We describe an approach to regulatory motif analysis based on non-negative matrix factorization (NMF). Whereas existing pattern recognition algorithms commonly focus primarily on sequence content, our method simultaneously characterizes both positioning and sequence content of putative motifs. RESULTS: Tests on artificially generated sequences show that NMF can faithfully reproduce both positioning and content of test motifs. We show how the variation of the residual sum of squares can be used to give a robust estimate of the number of motifs or patterns in a sequence set. Our analysis distinguishes multiple motifs with significant overlap in sequence content and/or positioning. Finally, we demonstrate the use of the NMF approach through characterization of biologically interesting datasets. Specifically, an analysis of mRNA 3'-processing (cleavage and polyadenylation) sites from a broad range of higher eukaryotes reveals a conserved core pattern of three elements.
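
The rank-selection idea in this abstract (watching how the residual sum of squares changes as the number of factors grows) can be sketched as follows. This is a minimal illustration using the standard multiplicative-update form of NMF, not the authors' implementation; the count matrix `V`, its interpretation (rows as positions, columns as k-mers), and all function names are assumptions made for the example.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def rss(v, w, h):
    """Residual sum of squares between v and the product w*h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor v (n x m, non-negative) into w (n x rank) and h (rank x m)
    using multiplicative updates; eps guards against division by zero."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h

# Hypothetical count matrix built from two overlapping patterns
# (rows: positions relative to a functional site; columns: k-mers).
V = [[5, 5, 0, 0],
     [5, 5, 0, 0],
     [0, 0, 3, 3],
     [0, 0, 3, 3],
     [5, 5, 3, 3]]

rss_by_rank = {}
for rank in (1, 2, 3):
    w, h = nmf(V, rank)
    rss_by_rank[rank] = rss(V, w, h)
    print(rank, round(rss_by_rank[rank], 4))
```

In a sketch like this, the rank at which the RSS stops dropping sharply is the candidate number of patterns; real data would need multiple random restarts and a comparison against shuffled input.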

2.
Computational methods such as sequence alignment and motif construction are useful in grouping related proteins into families, as well as helping to annotate new proteins of unknown function. These methods identify conserved amino acids in protein sequences, but cannot determine the specific functional or structural roles of conserved amino acids without additional study. In this work, we present 3MATRIX (http://3matrix.stanford.edu) and 3MOTIF (http://3motif.stanford.edu), a web-based sequence motif visualization system that displays sequence motif information in its appropriate three-dimensional (3D) context. This system is flexible in that users can enter sequences, keywords, structures or sequence motifs to generate visualizations. In 3MOTIF, users can search using discrete sequence motifs such as PROSITE patterns, eMOTIFs, or any other regular expression-like motif. Similarly, 3MATRIX accepts an eMATRIX position-specific scoring matrix, or will convert a multiple sequence alignment block into an eMATRIX for visualization. Each query motif is used to search the protein structure database for matches, in which the motif is then visually highlighted in three dimensions. Important properties of motifs such as sequence conservation and solvent accessible surface area are also displayed in the visualizations, using carefully chosen color shading schemes.
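
To make "regular expression-like motif" concrete, the sketch below translates a PROSITE-style pattern into a Python regular expression. It is an illustration only and says nothing about how 3MOTIF parses its input; the function name `prosite_to_regex` is hypothetical, and the converter covers only the common syntax (`x`, `[...]`, `{...}`, repeat counts, and `<`/`>` anchors at the pattern ends).

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern such as 'N-{P}-[ST]-{P}.'
    into a Python regular expression string."""
    pattern = pattern.rstrip(".")          # trailing period ends a pattern
    pieces = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):        # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):          # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count: x(2) or x(2,4).
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element).groups()
        if core == "x":                    # any residue
            core = "."
        elif core.startswith("{"):         # any residue except these
            core = "[^" + core[1:-1] + "]"
        # '[ABC]' and single letters are already valid regex syntax.
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        pieces.append(prefix + core + repeat + suffix)
    return "".join(pieces)

# The N-glycosylation site pattern, applied to a made-up sequence.
regex = prosite_to_regex("N-{P}-[ST]-{P}.")
print(regex)
print(re.search(regex, "AANGSAA").group())
```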

3.
4.
The sliding clamp of the Escherichia coli replisome is now understood to interact with many proteins involved in DNA synthesis and repair. A universal interaction motif is proposed to be one mechanism by which those proteins bind the E. coli sliding clamp, a homodimer of the beta subunit, at a single site on the dimer. The numerous beta(2)-binding proteins have various versions of the consensus interaction motif, including a related hexameric sequence. To determine if the variants of the motif could contribute to the competition of the beta-binding proteins for the beta(2) site, synthetic peptides derived from the putative beta(2)-binding motifs were assessed for their abilities to inhibit protein-beta(2) interactions, to bind directly to beta(2), and to inhibit DNA synthesis in vitro. A hierarchy emerged, which was consistent with sequence similarity to the pentameric consensus motif, QL(S/D)LF, and peptides containing proposed hexameric motifs were shown to have activities comparable to those containing the consensus sequence. The hierarchy of peptide binding may be indicative of a competitive hierarchy for the binding of proteins to beta(2) in various stages or circumstances of DNA replication and repair.
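
One simple way to picture "sequence similarity to the pentameric consensus motif" is to count how many positions of a peptide agree with QL(S/D)LF. The sketch below does exactly that and nothing more: the match count is a stand-in chosen for illustration, not the similarity measure used in the study, and the example peptides are invented.

```python
# QL(S/D)LF from the abstract, one entry of allowed residues per position.
CONSENSUS = ["Q", "L", "SD", "L", "F"]

def consensus_matches(peptide, consensus=CONSENSUS):
    """Return the largest number of consensus positions matched by any
    window of the peptide (0 if the peptide is shorter than the motif)."""
    best = 0
    width = len(consensus)
    for start in range(len(peptide) - width + 1):
        window = peptide[start:start + width]
        hits = sum(res in allowed for res, allowed in zip(window, consensus))
        best = max(best, hits)
    return best

# Hypothetical peptides, ranked from most to least consensus-like.
peptides = ["AQLDLFG", "QADMF", "GQLSAFA"]
for pep in sorted(peptides, key=consensus_matches, reverse=True):
    print(pep, consensus_matches(pep))
```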

5.
RNA structural motifs are recurrent three-dimensional (3D) components found in the RNA architecture. These RNA structural motifs play important structural or functional roles and usually exhibit highly conserved 3D geometries and base-interaction patterns. Analysis of the RNA 3D structures and elucidation of their molecular functions heavily rely on efficient and accurate identification of these motifs. However, efficient RNA structural motif search tools are lacking due to the high complexity of these motifs. In this work, we present RNAMotifScanX, a motif search tool based on a base-interaction graph alignment algorithm. This novel algorithm enables automatic identification of both partially and fully matched motif instances. RNAMotifScanX considers noncanonical base-pairing interactions, base-stacking interactions, and sequence conservation of the motifs, which leads to significantly improved sensitivity and specificity as compared with other state-of-the-art search tools. RNAMotifScanX also adopts a carefully designed branch-and-bound technique, which enables ultra-fast search of large kink-turn motifs against a 23S rRNA. The software package RNAMotifScanX is implemented using GNU C++, and is freely available from http://genome.ucf.edu/RNAMotifScanX.

6.
Sequence motifs are becoming increasingly important in the analysis of gene regulation. How do we define sequence motifs, and why should we use sequence logos instead of consensus sequences to represent them? Do they have any relation with binding affinity? How do we search for new instances of a motif in this sea of DNA?
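
The question of logos versus consensus sequences comes down to what each representation keeps. A sequence logo draws each position with a height equal to its information content in bits, which a consensus string cannot express. The sketch below computes that quantity for a set of aligned sites; it omits the small-sample correction that logo software usually applies, and the example sites are invented.

```python
import math
from collections import Counter

def information_content(sites, alphabet="ACGT"):
    """Per-position information content (bits) of equal-length aligned
    sites: log2(alphabet size) minus the Shannon entropy of the column."""
    max_bits = math.log2(len(alphabet))
    bits = []
    for column in zip(*sites):
        counts = Counter(column)
        entropy = -sum((n / len(column)) * math.log2(n / len(column))
                       for n in counts.values())
        bits.append(max_bits - entropy)
    return bits

# Hypothetical binding sites: three conserved positions, one free position.
sites = ["ACGT", "ACGA", "ACGC", "ACGG"]
print(information_content(sites))
```

A consensus string for these sites would have to write the last position as a single letter or as N, whereas the per-position values show directly that it carries no information.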

7.
8.
MOTIVATION: Searching for RNA gene occurrences in genomic sequences is a task whose importance has been renewed by the recent discovery of numerous functional RNAs, often interacting with other ligands. Although several programs exist for RNA motif search, none can represent and solve the problem of searching for occurrences of RNA motifs in interaction with other molecules. RESULTS: We present a constraint network formulation of this problem. RNAs are represented as structured motifs that can occur on more than one sequence and that are related by possible hybridization. The implemented tool MilPat is used to search for several sRNA families in genomic sequences. Results show that MilPat enables efficient searching for interacting motifs in large genomic sequences and offers a simple and extensible framework to solve such problems. New and known sRNAs are identified as H/ACA candidates in Methanocaldococcus jannaschii. AVAILABILITY: http://carlit.toulouse.inra.fr/MilPaT/MilPat.pl.

9.
A new sequence motif library StrProf was constructed characterizing the groups of related proteins in the PDB three-dimensional structure database. For a representative member of each protein family, which was identified by cross-referencing the PDB with the PIR superfamily classification, a group of related sequences was collected by the BLAST search against the nonredundant protein sequence database. For every group, the motifs were identified automatically according to the criteria of conservation and uniqueness of pentapeptide patterns and with a dual dynamic programming algorithm. In the StrProf library, motifs are represented by profile matrices rather than consensus patterns to allow more flexible search capabilities. Another dynamic programming algorithm was then developed to search this motif library. When the computationally derived StrProf was compared with PROSITE, which is a manually derived motif library in the best consensus pattern representation, the numbers of identified patterns were comparable. StrProf missed about one third of the PROSITE motifs, but there were also new motifs lacking in PROSITE. The new library was incorporated in SMART (Sequence Motif Analysis and Retrieval Tool), a computer tool designed to help search and annotate biologically important sites in an unknown protein sequence. The client program is available free of charge through the Internet.
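
The contrast drawn here between profile matrices and consensus patterns can be illustrated with a small log-odds profile. This is a generic sketch, not the StrProf construction or its dynamic programming search: it uses a uniform background, a fixed pseudocount, and ungapped window scoring, and the aligned pentapeptides and query sequence are invented.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(aligned, pseudocount=1.0):
    """Build a log-odds profile (one dict per column) from equal-length
    aligned peptides, assuming a uniform background distribution."""
    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for column in zip(*aligned):
        counts = Counter(column)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile

def best_window(profile, sequence):
    """Score every ungapped window and return (score, start) of the best."""
    width = len(profile)
    scored = [
        (sum(profile[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)

# Hypothetical aligned pentapeptides and query sequence.
profile = build_profile(["GKSTL", "GKTTL", "GKSSL", "AKSTL"])
score, start = best_window(profile, "MAAGKSTLQE")
print(start, round(score, 2))
```

Unlike a consensus pattern, the profile still gives partial credit to a window such as AKTSL, which is why profile representations tend to allow more flexible searches.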

10.
11.
Sister chromatid cohesion is resolved at anaphase onset when separase, a site-specific protease, cleaves the Scc1 subunit of the chromosomal cohesin complex that is responsible for holding sister chromatids together. This mechanism to initiate anaphase is conserved in eukaryotes from budding yeast to man. Budding yeast separase recognizes and cleaves two conserved peptide motifs within Scc1. In addition, separase cleaves a similar motif in the kinetochore and spindle protein Slk19. Separase may cleave further substrate proteins to orchestrate multiple cellular events that take place during anaphase. To investigate substrate recognition by budding yeast separase we analyzed the sequence requirements at one of the Scc1 cleavage site motifs by systematic mutagenesis. We derived a cleavage site consensus motif (not(FKRWY))(ACFHILMPVWY)(DE)X(AGSV)R/X. This motif is found in 1,139 of 5,889 predicted yeast proteins. We analyzed 28 candidate proteins containing this motif as well as 35 proteins that contain a core (DE)XXR motif. We have not so far been able to confirm new separase substrates, but we have uncovered other forms of mitotic regulation of some of the proteins. We studied whether determinants other than the cleavage site motif mediate separase-substrate interaction. When the separase active site was occupied with a peptide inhibitor covering the cleavage site motif, separase still efficiently interacted with its substrate Scc1. This suggests that separase recognizes both a cleavage site consensus sequence and features outside the cleavage site.
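
A proteome scan of the kind that yields counts such as "1,139 of 5,889" can be sketched by writing the consensus (not(FKRWY))(ACFHILMPVWY)(DE)X(AGSV)R/X as a regular expression. The translation below is an assumption about how to read the notation (the slash is taken as the scissile bond, followed by any residue), the function name is hypothetical, and the test sequence is invented rather than taken from Scc1 or Slk19.

```python
import re

# Six motif residues, then a lookahead requiring one residue after the
# scissile bond. Wrapping everything in a lookahead lets overlapping
# candidate sites be reported too.
CLEAVAGE_MOTIF = re.compile(r"(?=([^FKRWY][ACFHILMPVWY][DE].[AGSV]R).)")

def find_cleavage_sites(sequence):
    """Return (start, motif, cut_after) tuples, where start is the 0-based
    index of the first motif residue and cut_after is the 1-based position
    of the arginine that precedes the predicted cut."""
    return [(m.start(), m.group(1), m.start() + 6)
            for m in CLEAVAGE_MOTIF.finditer(sequence)]

# Hypothetical sequence containing one motif instance (GLDAAR).
print(find_cleavage_sites("MKGLDAARTQ"))
```

Note that `[^FKRWY]` accepts any character outside that set, so input should be restricted to valid one-letter amino acid codes before scanning.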

12.
13.
La D, Sutch B, Livesay DR. Proteins 2005, 58(2):309-320
In this report, we demonstrate that phylogenetic motifs, sequence regions conserving the overall familial phylogeny, represent a promising approach to protein functional site prediction. Across our structurally and functionally heterogeneous data set, phylogenetic motifs consistently correspond to functional sites defined by both surface loops and active site clefts. Additionally, the partially buried prosthetic group regions of cytochrome P450 and succinate dehydrogenase are identified as phylogenetic motifs. In nearly all instances, phylogenetic motifs are structurally clustered, despite little overall sequence proximity, around key functional site features. Based on calculated false-positive expectations and standard motif identification methods, we show that phylogenetic motifs are generally conserved in sequence. This result implies that they can be considered motifs in the traditional sense as well. However, there are instances where phylogenetic motifs are not (overall) well conserved in sequence. This point is enticing, because it implies that phylogenetic motifs are able to identify key sequence regions that traditional motif-based approaches would not. Further, phylogenetic motif results are also shown to be consistent with evolutionary trace results, and bootstrapping is used to demonstrate tree significance.

14.
15.
16.
Protein import into peroxisomes relies on the import receptor Pex5, which recognizes proteins with a peroxisomal targeting signal 1 (PTS1) in the cytosol and directs them to a docking complex at the peroxisomal membrane. Receptor-cargo docking occurs at the membrane-associated protein Pex14. In human cells, this interaction is mediated by seven conserved diaromatic penta-peptide motifs (WXXX(F/Y) motifs) in the N-terminal half of Pex5 and the N-terminal domain of Pex14. A systematic screening of a Pex5 peptide library by ligand blot analysis revealed a novel Pex5-Pex14 interaction site of Pex5. The novel motif comprises the sequence LVAEF with the evolutionarily conserved consensus sequence LVXEF. Replacement of the LVAEF amino acid sequence by alanines strongly affects matrix protein import into peroxisomes in vivo. The NMR structure of a complex of Pex5-(57–71) with the Pex14-N-terminal domain showed that the novel motif binds in a similar α-helical orientation as the WXXX(F/Y) motif but that the tryptophan pocket is now occupied by a leucine residue. Surface plasmon resonance analyses revealed 33 times faster dissociation rates for the LVXEF ligand when compared with a WXXX(F/Y) motif. Surprisingly, substitution of the novel motif with the higher affinity WXXX(F/Y) motif impairs protein import into peroxisomes. These data indicate that the distinct kinetic properties of the novel Pex14-binding site in Pex5 are important for processing of the peroxisomal targeting signal 1 receptor at the peroxisomal membrane. The novel Pex14-binding site may represent the initial tethering site of Pex5 from which the cargo-loaded receptor is further processed in a sequential manner.

17.
18.
MOTIVATION AND RESULTS: Motivated by the recent rise of interest in small regulatory RNAs, we present Locomotif, a new approach for locating RNA motifs that goes beyond previous ones in three ways: (1) Motif search is based on efficient dynamic programming algorithms, incorporating the established thermodynamic model of RNA secondary structure formation. (2) Motifs are described graphically, using a Java-based editor, and search algorithms are derived from the graphics in a fully automatic way. The editor allows us to draw secondary structures, annotated with size and sequence information. They closely resemble the established, but informal way in which RNA motifs are communicated in the literature. Thus, the learning effort for Locomotif users is minimal. (3) Locomotif employs a client-server approach. Motifs are designed by the user locally. Search programs are generated and compiled on a bioinformatics server. They are made available both for execution on the server, and for download as C source code plus an appropriate makefile. AVAILABILITY: Locomotif is available at http://bibiserv.techfak.uni-bielefeld.de/locomotif.

19.
Lu CH, Lin YS, Chen YC, Yu CS, Chang SY, Hwang JK. Proteins 2006, 63(3):636-643
Identifying functional structural motifs in protein structures of unknown function has become increasingly important in recent years owing to the progress of structural genomics initiatives. Although certain structural patterns such as the Asp-His-Ser catalytic triad are easy to detect because of their conserved residues and stringently constrained geometry, it is usually more challenging to detect more general structural motifs such as the ββα-metal binding motif, which has a much more variable conformation and sequence. At present, the identification of these motifs usually relies on manual procedures based on different structure and sequence analysis tools. In this study, we develop a structural alignment algorithm combining both structural and sequence information to identify local structure motifs. We applied our method to the following examples: the ββα-metal binding motif and the treble clef motif. The ββα-metal binding motif plays an important role in nonspecific DNA interactions and cleavage in host defense and apoptosis. The treble clef motif is a zinc-binding motif adaptable to diverse functions such as the binding of nucleic acid and hydrolysis of phosphodiester bonds. Our results are encouraging, indicating that we can effectively identify these structural motifs in an automatic fashion. Our method may provide a useful means for automatic functional annotation through detecting structural motifs associated with particular functions.

20.
Cyclophilin A (CyPA) and its peptidyl-prolyl isomerase (PPIase) activity play an essential role in hepatitis C virus (HCV) replication, and mounting evidence indicates that nonstructural protein 5A (NS5A) is the major target of CyPA. However, neither a consensus CyPA-binding motif nor specific proline substrates that regulate CyPA dependence and sensitivity to cyclophilin inhibitors (CPIs) have been defined to date. We systematically characterized all proline residues in NS5A domain II, low-complexity sequence II (LCS-II), and domain III with both biochemical binding and functional replication assays. A tandem cyclophilin-binding site spanning domain II and LCS-II was identified. The first site contains a consensus sequence motif of AØPXW (where Ø is a hydrophobic residue) that is highly conserved in the majority of the genotypes of HCV (six of seven; the remaining genotype has VØPXW). The second tandem site contains a similar motif, and the ØP sequence is again conserved in six of the seven genotypes. Consistent with the similarity of their sequences, peptides representing the two binding motifs competed for CyPA binding in a spot-binding assay and induced similar chemical shifts when bound to the active site of CyPA. The two prolines (P310 and P341 of Japanese fulminant hepatitis 1 [JFH-1]) contained in these motifs, as well as a conserved tryptophan in the spacer region, were required for CyPA binding, HCV replication, and CPI resistance. Together, these data provide a high-resolution mapping of proline residues important for CyPA binding and identify critical amino acids modulating HCV susceptibility to the clinical CPI Alisporivir.

