Found 20 similar articles; search took 15 ms.
1.
Similarity landscapes: A way to detect many structural and sequence motifs in both introns and exons
Michael Hultner, Douglas W. Smith, Christopher Wills 《Journal of molecular evolution》1994,38(2):188-203
When investigators undertake searches of DNA databases, they normally discard large numbers of alignments that demonstrate very weak resemblances to each other, retaining only those that show statistically significant levels of resemblance. We show here that a great deal of information can be extracted from these weak alignments by examining them en masse. This is done by building three-dimensional similarity landscapes from the alignments, landscapes that reveal whether an unusual number of individually nonsignificant alignments tend to match up to a particular region of the query sequence being searched. The power of the search is increased by the use of libraries consisting entirely of introns or of exons. We show that (1) similarity landscapes with a variety of features can be generated from both intron and exon libraries, using introns or exons as query sequences; (2) the landscape features are real and not a statistical artifact; (3) well-known protein motifs used as query sequences can generate various landscape features; and (4) there is some evidence for resemblances between short regions of sequence carried by introns and exons. One possible interpretation of these results is that both introns and exons may have been built up during their evolution from short regions of sequence that as a result are now widely distributed throughout eukaryotic genomes. Such an interpretation would imply that these short regions have common ancestry. Alternatively, the wide sharing of short pieces of DNA may reflect regions with particular structural properties that have arisen through convergent evolution. The similarity-landscape approach can be used to detect such widespread structural motifs and sequence motifs in the genome that might be missed by less-global searches. 
It can also be used in conjunction with algorithms developed for detecting significant multiple alignments by isolating promising subsets of the databases that can be examined in more detail. Correspondence to: C. Wills
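The idea of pooling many individually weak alignments can be illustrated with a much-simplified sketch. It assumes each alignment is reduced to a half-open `(start, end)` span on the query and counts how many spans cover each query position. The function name `similarity_profile` and the sample spans are hypothetical, and this one-dimensional count is only a stand-in for the three-dimensional landscapes the abstract describes, not the authors' method.

```python
def similarity_profile(query_length, alignments):
    """Count, for each query position, how many alignments cover it.

    `alignments` is an iterable of half-open (start, end) spans on the query.
    This is an illustrative simplification, not the published procedure.
    """
    counts = [0] * query_length
    for start, end in alignments:
        # Clip each span to the query so out-of-range hits are ignored.
        for pos in range(max(start, 0), min(end, query_length)):
            counts[pos] += 1
    return counts


# Hypothetical weak hits against a query of length 10.
weak_hits = [(0, 4), (2, 6), (3, 8), (3, 5)]
profile = similarity_profile(10, weak_hits)
peak = max(range(len(profile)), key=profile.__getitem__)
print(profile, peak)
```

A position where the count rises well above its neighbours is the kind of region where many nonsignificant alignments pile up; deciding whether such a peak is real would still need the statistical controls the abstract mentions.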
2.
In the chemokine family, we characterize two examples of evolutionarily conserved unfavorable sequence motifs that affect quaternary structure. In contrast to the straightforward action of favorable sequences, these unfavorable motifs produce interactions disfavoring one outcome to indirectly promote another one but should not be confused with the broad sampling produced by negative selection and/or design. To identify such motifs, we developed a statistically validated computational method combining structure and phylogeny. This approach was applied in an analysis of the alternate forms of homodimerization exhibited in the chemokine family. While the chemokine family exhibits the same tertiary fold, members of certain subfamilies, including CXCL8, form a homodimer across the beta1 strand whereas members of other subfamilies, including CCL4 and CCL2, form a homodimer on the opposite side of the chemokine fold. These alternate dimerization states suggest that CCL4 and CCL2 contain specific sequences that disfavor CXCL8 dimerization. Using our computational approach, we identified two evolutionarily conserved sequence motifs in the CC subfamilies: a drastic two-residue deletion (DeltaRV) and a simple point mutation (V27R). Cloned into the CXCL8 background, these two motifs were experimentally proven to confer a monomeric state. NMR analyses indicate that these variants are structured in solution and retain the chemokine fold. Structurally, the motifs retain a chemokine tertiary fold while introducing unfavorable quaternary interactions that inhibit CXCL8 dimerization. In demonstrating the success of our computational method, our results argue that these unfavorable motifs have been evolutionarily conserved to specifically disfavor one dimerization state and, as a result, indirectly contribute to favoring another.
3.
《Current biology : CB》2021,31(16):3515-3524.e6
4.
Identifying functional structural motifs in protein structures of unknown function has become increasingly important in recent years because of the progress of the structural genomics initiatives. Although certain structural patterns such as the Asp-His-Ser catalytic triad are easy to detect because of their conserved residues and stringently constrained geometry, it is usually more challenging to detect a general structural motif such as the ββα-metal binding motif, which has a much more variable conformation and sequence. At present, the identification of these motifs usually relies on manual procedures based on different structure and sequence analysis tools. In this study, we develop a structural alignment algorithm that combines structural and sequence information to identify local structural motifs. We applied our method to two examples: the ββα-metal binding motif and the treble clef motif. The ββα-metal binding motif plays an important role in nonspecific DNA interactions and cleavage in host defense and apoptosis. The treble clef motif is a zinc-binding motif adaptable to diverse functions such as the binding of nucleic acid and hydrolysis of phosphodiester bonds. Our results are encouraging, indicating that we can effectively identify these structural motifs in an automatic fashion. Our method may provide a useful means for automatic functional annotation through detecting structural motifs associated with particular functions.
5.
6.
7.
8.
9.
Glycosylation motifs that direct arabinogalactan addition to arabinogalactan-proteins (total citations: 1; self-citations: 0; citations by others: 1)
Hydroxyproline (Hyp)-rich glycoproteins (HRGPs) participate in all aspects of plant growth and development. HRGPs are generally highly O-glycosylated through the Hyp residues, which means carbohydrates help define the interactive molecular surface and, hence, HRGP function. The Hyp contiguity hypothesis predicts that contiguous Hyp residues are sites of HRGP arabinosylation, whereas clustered noncontiguous Hyp residues are sites of galactosylation, giving rise to the arabinogalactan heteropolysaccharides that characterize the arabinogalactan-proteins. Early tests of the hypothesis using synthetic genes encoding only clustered noncontiguous Hyp in the sequence (serine [Ser]-Hyp-Ser-Hyp)(n) or contiguous Hyp in the series (Ser-Hyp-Hyp)(n) and (Ser-Hyp-Hyp-Hyp-Hyp)(n) confirmed that arabinogalactan polysaccharide was added only to noncontiguous Hyp, whereas arabinosylation occurred on contiguous Hyp. Here, we extended our tests of the codes that direct arabinogalactan polysaccharide addition to Hyp by building genes encoding the repetitive sequences (alanine [Ala]-proline [Pro]-Ala-Pro)(n), (threonine [Thr]-Pro-Thr-Pro)(n), and (valine [Val]-Pro-Val-Pro)(n), and expressing them in tobacco (Nicotiana tabacum) Bright-Yellow 2 cells as fusion proteins with green fluorescent protein. All of the Pro residues in the (Ala-Pro-Ala-Pro)(n) fusion protein were hydroxylated and, consistent with the hypothesis, every Hyp residue was glycosylated with arabinogalactan polysaccharide. In contrast, 20% to 30% of Pro residues remained non-hydroxylated in the (Thr-Pro-Thr-Pro)(n) and (Val-Pro-Val-Pro)(n) fusion proteins. Furthermore, although 50% to 60% of the Hyp residues were glycosylated with arabinogalactan polysaccharide, some remained non-glycosylated or were arabinosylated.
These results suggest that the amino acid side chains of flanking residues influence the extent of Pro hydroxylation and Hyp glycosylation and may explain why isolated noncontiguous Hyp in extensins do not acquire an arabinogalactan polysaccharide but are arabinosylated or remain non-glycosylated.
10.
Motif3D is a web-based protein structure viewer designed to allow sequence motifs, and in particular those contained in the fingerprints of the PRINTS database, to be visualised on three-dimensional (3D) structures. Additional functionality is provided for the rhodopsin-like G protein-coupled receptors, enabling fingerprint motifs of any of the receptors in this family to be mapped onto the single structure available, that of bovine rhodopsin. Motif3D can be used via the web interface available at: http://www.bioinf.man.ac.uk/dbbrowser/motif3d/motif3d.html.
11.
Scrutineer: a computer program that flexibly seeks and describes motifs and profiles in protein sequence databases (total citations: 3; self-citations: 0; citations by others: 3)
Scrutineer is an interactive, user-friendly program designed to search for motifs, patterns and profiles in the Swissprot, Protein Identification Resource (PIR) or SeqDb protein sequence databases. Basic capabilities include (i) searches for strings of amino acids with multiple choices at a given position; (ii) searches for strings including variable-length segments and delocalized constraints; (iii) searches over subsets of a database or particular regions within each sequence (e.g. N-terminal one-third); (iv) searches involving secondary structure predictions, physicochemical characteristics, and the like; and (v) searches using aligned sequences as targets with various optional weighting schemes. The various search criteria and hits can be combined and complex targets located. Once the data are loaded into virtual memory, all occurrences in PIR release 22.0 (3.7 × 10^6 amino acids) of a given short string of amino acids (e.g. a hexamer) are found in ~36 s. Scrutineer can also describe the entire database, user-specified hits, user-defined regions of sequence and all hits. The source code and accompanying manual are being freely distributed.
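Search types (i) to (iii) in the abstract above can be approximated with ordinary regular expressions. The sketch below is not Scrutineer's code or interface: the function `find_motif`, the toy sequence and the pattern are all hypothetical, chosen only to show a position with multiple allowed residues, a variable-length segment, and a search restricted to the N-terminal third.

```python
import re


def find_motif(sequence, pattern):
    """Return (start, matched_text) for every match, including overlapping ones."""
    # Wrapping the pattern in a zero-width lookahead lets a new match start
    # at every position, so overlapping occurrences are all reported.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


# Hypothetical toy sequence and pattern:
#   [ST]    one position with two allowed residues (Ser or Thr)
#   .{2,4}  a variable-length segment of 2 to 4 residues
#   GK      a fixed dipeptide
seq = "MKSAAGKTTLSPPGKSA"
hits = find_motif(seq, r"[ST].{2,4}GK")

# Restricting the search to the N-terminal third is just a slice.
n_term_hits = find_motif(seq[: len(seq) // 3], r"[ST].{2,4}GK")
print(hits, n_term_hits)
```

Regular expressions cover only the string-based searches; the secondary-structure, physicochemical and weighted-profile searches listed in the abstract would need separate scoring code.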
12.
13.
An automatic procedure is proposed to identify, from the protein sequence database, conserved amino acid patterns (or sequence motifs) that are exclusive to a group of functionally related proteins. This procedure is applied to the PIR database, and a dictionary of sequence motifs that relate to specific superfamilies is constructed. The motifs have a practical relevance in identifying the membership of specific superfamilies without the need to perform sequence database searches in 20% of newly determined sequences. The sequence motifs identified represent functionally important sites on protein molecules. When multiple blocks exist in a single motif they are often close together in the 3-D structure. Furthermore, occasionally these motif blocks were found to be split by introns when the correlation with exon structures was examined.
14.
PCR primers of arbitrary nucleotide sequence have identified DNA polymorphisms useful for genetic mapping in a large variety of organisms. Although technically very powerful, the use of arbitrary primers for genome mapping has the disadvantage of characterizing DNA sequences of unknown function. Thus, there is no reason to anticipate that DNA fragments amplified by use of arbitrary primers will be enriched for either transcribed or promoter sequences that may be conserved in evolution. For these reasons, we modified the arbitrarily primed PCR method by using oligonucleotide primers derived from conserved promoter elements and protein motifs. Twenty-nine of these primers were tested individually and in pairwise combinations for their ability to amplify genomic DNA from a variety of species including various inbred strains of laboratory mice and Mus spretus. Using recombinant inbred strains of mice, we determined the chromosomal location of 27 polymorphic fragments in the mouse genome. The results demonstrated that motif sequence-tagged PCR products are reliable markers for mapping the mouse genome and that motif primers can also be used for genomic fingerprinting of many divergent species.
15.
Background
Predicting the suppression activity of antisense oligonucleotide sequences is the main goal of the rational design of nucleic acids. To create an effective predictive model, it is important to know what properties of an oligonucleotide sequence associate significantly with antisense activity. Also, for the model to be efficient we must know what properties do not associate significantly and can be omitted from the model. This paper will discuss the results of a randomization procedure to find motifs that associate significantly with either high or low antisense suppression activity, analysis of their properties, as well as the results of support vector machine modelling using these significant motifs as features.
16.
17.
The alpha-amylase gene in Drosophila melanogaster: nucleotide sequence, gene structure and expression motifs. (total citations: 11; self-citations: 3; citations by others: 8)
We present the complete nucleotide sequence of a Drosophila alpha-amylase gene and its flanking regions, as determined by cDNA and genomic sequence analysis. This gene, unlike its mammalian counterparts, contains no introns. Nevertheless the insect and mammalian genes share extensive nucleotide similarity and the insect protein contains the four amino acid sequence blocks common to all alpha-amylases. In Drosophila melanogaster, there are two closely-linked copies of the alpha-amylase gene and they are divergently transcribed. In the 5'-regions of the two gene-copies we find high sequence divergence, yet the typical eukaryotic gene expression motifs have been maintained. The 5'-terminus of the alpha-amylase mRNA, as determined by primer extension analysis, maps to a characteristic Drosophila sequence motif. Additional conserved elements upstream of both genes may also be involved in amylase gene expression which is known to be under complex controls that include glucose repression.
18.
Leitmeyer KC Vaughn DW Watts DM Salas R Villalobos I de Chacon Ramos C Rico-Hesse R 《Journal of virology》1999,73(6):4738-4747
19.