Similar literature
1.
Mouse immunoglobulin (Ig) switch-region sequences are anti-"runny"; that is, they have a smaller amount of their total bases in homonucleotide tracts ("runs") than would be expected if each nucleotide in the sequence were a random selection from a pool of the composition of the region. The switch sequences involve the first intron of rearranged Ig heavy-chain genes; this intron differs strikingly from the succeeding ones, which are "runny" (have more bases than expected in runs). Switch regions are the only category of sequences so far found to be antirunny by statistical test. This sequence characteristic is related to the presence in switch sequences of repeating heteronucleotides. We suggest that the resulting base dispersion and increased complexity favor more specific interactions between sequences, which may be advantageous in recombinational processes such as switching and translocation.
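
The "runny" versus "antirunny" comparison in this abstract can be illustrated with a minimal Python sketch. It is not the authors' statistical test: the function names, the minimum run length of 2, and the use of random shuffles to approximate "a random selection from a pool of the composition of the region" are all assumptions made for illustration.

```python
import random
from itertools import groupby


def run_fraction(seq, min_len=2):
    """Fraction of bases that lie in homonucleotide runs of at least min_len.

    min_len=2 is an assumed threshold; the abstract does not state one.
    """
    if not seq:
        return 0.0
    in_runs = 0
    for _, group in groupby(seq):
        n = sum(1 for _ in group)  # length of this stretch of identical bases
        if n >= min_len:
            in_runs += n
    return in_runs / len(seq)


def shuffled_run_fraction(seq, min_len=2, trials=200, seed=0):
    """Mean run fraction over random shuffles of seq.

    Shuffling keeps the base composition fixed, which is used here as a
    stand-in (an assumption) for the composition-based expectation.
    """
    rng = random.Random(seed)
    bases = list(seq)
    total = 0.0
    for _ in range(trials):
        rng.shuffle(bases)
        total += run_fraction("".join(bases), min_len)
    return total / trials
```

Under these assumptions, a sequence would be called antirunny when `run_fraction(seq)` falls below `shuffled_run_fraction(seq)` and runny when it falls above; a real analysis would also need a significance test on the difference.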

2.
This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.

3.

Background

Protein synthetic lethal genetic interactions are useful to define functional relationships between proteins and pathways. However, the molecular mechanism of synthetic lethal genetic interactions remains unclear.

Results

In this study we used the clusters of short polypeptide sequences, which are typically shorter than the classically defined protein domains, to characterize the functionalities of proteins. We developed a framework to identify significant short polypeptide clusters from yeast protein sequences, and then used these short polypeptide clusters as features to predict yeast synthetic lethal genetic interactions. The short polypeptide clusters based approach provides much higher coverage for predicting yeast synthetic lethal genetic interactions. Evaluation using experimental data sets showed that the short polypeptide clusters based approach is superior to the previous protein domain based one.

Conclusion

We were able to achieve higher performance in yeast synthetic lethal genetic interactions prediction using short polypeptide clusters as features. Our study suggests that the short polypeptide cluster may help better understand the functionalities of proteins.

4.
Protein:DNA interactions at chromosomal loop attachment sites
We have recently identified an evolutionarily conserved class of sequences that organize chromosomal loops in the interphase nucleus, which we have termed "matrix association regions" (MARs). MARs are about 200 bp long, AT-rich, contain topoisomerase II consensus sequences and other AT-rich sequence motifs, often reside near cis-acting regulatory sequences, and their binding sites are abundant (greater than 10,000 per mammalian nucleus). Here we demonstrate that the interactions between the mouse kappa immunoglobulin gene MAR and topoisomerase II or the "nuclear matrix" occur between multiple and sometimes overlapping binding sites. Interestingly, the sites most susceptible to topoisomerase II cleavage are localized near the breakpoints of a previously described illegitimate recombination event. The presence of multiple binding sites within single MARs may allow DNA and RNA polymerase passage without disrupting primary loop organization.

5.

Background

Reversible interactions between the components of cellular signaling pathways allow for the formation and dissociation of multimolecular complexes with spatial and temporal resolution and, thus, are an important means of integrating multiple signals into a coordinated cellular response. Several mechanisms that underlie these interactions have been identified, including the recognition of specific docking sites, termed a D-domain and FXFP motif, on proteins that bind mitogen-activated protein kinases (MAPKs). We recently found that phosphatidylinositol-specific phospholipase C-γ1 (PLC-γ1) directly binds to extracellular signal-regulated kinase 2 (ERK2), a MAPK, via a D-domain-dependent mechanism. In addition, we identified D-domain sequences in several other PLC isozymes. In the present studies we sought to determine whether MAPK docking sequences could be recognized in other enzymes that metabolize phosphatidylinositols (PIs), as well as in enzymes that metabolize inositol phosphates (IPs).

Results

We found that several, but not all, of these enzymes contain identifiable D-domain sequences. Further, we found a high degree of conservation of these sequences and their location in human and mouse proteins; notable exceptions were PI 3-kinase C2-γ, PI 4-kinase type IIβ, and inositol polyphosphate 1-phosphatase.

Conclusion

The results indicate that there may be extensive crosstalk between MAPK signaling and signaling pathways that are regulated by cellular levels of PIs or IPs.

6.
Extremely AT-rich DNA sequences present a challenging template for specific recognition by RNA polymerase. In bacteria, this is because the promoter −10 hexamer, the major DNA element recognised by RNA polymerase, is itself AT-rich. We show that Histone-like Nucleoid Structuring (H-NS) protein can facilitate correct recognition of a promoter by RNA polymerase in AT-rich gene regulatory regions. Thus, at the Escherichia coli ehxCABD operon, RNA polymerase is unable to distinguish between the promoter −10 element and similar overlapping sequences. This problem is resolved in native nucleoprotein because the overlapping sequences are masked by H-NS. Our work provides mechanistic insight into nucleoprotein structure and its effect on protein-DNA interactions in prokaryotic cells.

7.
Livák F 《Immunogenetics》2003,55(5):307-314
Antigen receptor gene rearrangement is mediated by interactions between the VDJ recombinase and the recombination signal sequences that flank the antigen receptor gene segments. In this report I present phylogenetic analyses that suggest a remarkable evolutionary conservation of the recombination signal sequences flanking some of the orthologous T-cell receptor- locus gene segments between human and mouse. Comparison of published data on the usage of the same gene segments between human and mouse indicates similar conservation in the shape of the primary T-cell receptor- repertoire. I propose that interactions between the recombinase and its cognate recognition sequences play a hitherto underestimated role in the formation of the specific pattern of the primary, combinatorial antigen receptor repertoire and that this pattern appears to be conserved in diverse mammalian species. Generation of a conserved pattern of the primary T-cell receptor repertoire may be critical for efficient selection of immature T lymphocytes.

8.
In the studies reported here, we have examined the properties of the Mcp element from the Drosophila melanogaster bithorax complex (BX-C). We have found that sequences from the Mcp region of BX-C have properties characteristic of Polycomb response elements (PREs), and that they silence adjacent reporters by a mechanism that requires trans-interactions between two copies of the transgene. However, Mcp trans-regulatory interactions have several novel features. In contrast to classical transvection, homolog pairing does not seem to be required. Thus, trans-regulatory interactions can be observed not only between Mcp transgenes inserted at the same site, but also between Mcp transgenes inserted at distant sites on the same chromosomal arm, or even on different arms. Trans-regulation can even be observed between transgenes inserted on different chromosomes. A small 800-bp Mcp sequence is sufficient to mediate these long-distance trans-regulatory interactions. This small fragment has little silencing activity on its own and must be combined with other Polycomb-Group-responsive elements to function as a "pairing-sensitive" silencer. Finally, this pairing element can also mediate long-distance interactions between enhancers and promoters, activating mini-white expression.

9.
The fusion of some viruses (SIV, BLV, etc.) to host cells implicates short fragments of the fusion protein that are asymmetric amphipathic helices in molecular modelling. The tilted orientation of these fragments at a water/lipid interface is directly related to their fusogenic capacity. On this basis, we have searched for fragments of sequences corresponding to "viral fusion peptides" in other proteins. We have developed a strategy to detect them from primary sequences. Many candidates were detected, especially in transmembrane areas of membranous proteins, in signal sequences and in globular proteins. We suggest that they are involved in the dynamics of lipid-protein interactions.

10.
Stacking interactions between amino acids and bases are common in RNA-protein interactions. Many proteins that regulate mRNAs interact with single-stranded RNA elements in the 3' UTR (3'-untranslated region) of their targets. PUF proteins are exemplary. Here we focus on complexes formed between a Caenorhabditis elegans PUF protein, FBF, and its cognate RNAs. Stacking interactions are particularly prominent and involve every RNA base in the recognition element. To assess the contribution of stacking interactions to formation of the RNA-protein complex, we combine in vivo selection experiments with site-directed mutagenesis, biochemistry, and structural analysis. Our results reveal that the identities of stacking amino acids in FBF affect both the affinity and specificity of the RNA-protein interaction. Substitutions in amino acid side chains can restrict or broaden RNA specificity. We conclude that the identities of stacking residues are important in achieving the natural specificities of PUF proteins. Similarly, in PUF proteins engineered to bind new RNA sequences, the identity of stacking residues may contribute to "target" versus "off-target" interactions, and thus be an important consideration in the design of proteins with new specificities.

11.
12.
We investigate methods of estimating residue correlation within protein sequences. We begin by using mutual information (MI) of adjacent residues, and improve our methodology by defining the mutual information vector (MIV) to estimate long range correlations between nonadjacent residues. We also consider correlation based on residue hydropathy rather than protein-specific interactions. Finally, in experiments of family classification tests, the modeling power of MIV was shown to be significantly better than the classic MI method, reaching the level where proteins can be classified without alignment information.
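
As a rough illustration of the quantities named in this abstract, the sketch below estimates the mutual information between residues separated by a fixed gap and stacks those values into a vector. The abstract does not define the MIV precisely, so treating it as "MI at gaps 1 through `max_gap`" is an assumption, as are the function names and the plug-in frequency estimates.

```python
from collections import Counter
from math import log2


def mutual_information(seq, gap=1):
    """Plug-in estimate (in bits) of MI between residues `gap` positions apart."""
    pairs = [(seq[i], seq[i + gap]) for i in range(len(seq) - gap)]
    if not pairs:
        return 0.0
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)    # marginal counts of the first residue
    right = Counter(b for _, b in pairs)   # marginal counts of the second residue
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (count / n) * log2(count * n / (left[a] * right[b]))
    return mi


def mutual_information_vector(seq, max_gap=20):
    """Assumed form of the MIV: MI values for gaps 1..max_gap."""
    return [mutual_information(seq, gap) for gap in range(1, max_gap + 1)]
```

With `gap=1` this reduces to the adjacent-residue MI mentioned first in the abstract. Plug-in estimates of this kind are biased upward for short sequences, so any real comparison would need a correction or a shuffled baseline.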

13.
Insertion of new sequences into the catalytic domain of an enzyme
Activities of enzymes can be modified by the replacement of active-site amino acids with residues that strengthen specific interactions with substrates or that alter the specificity. The scope for engineered enzymes would be broadened if additional, new sequences could be inserted into a catalytic domain. Properly designed, these sequences could encode new ligand binding sites, be intermediates in the construction of chimeric enzymes, or alter the internal flexibility and "breathing" modes of the active-site region. As a first step toward this objective, we inserted oligopeptides of up to 14 amino acids into various locations within an 82 amino acid region of the adenylate synthesis domain of Escherichia coli methionyl-tRNA synthetase. These sites include ones that are flanked by sequences that are conserved between the proteins from E. coli and the yeast Saccharomyces cerevisiae and those that are essential for activity and stability. We found that all of the insertional mutants are stable and some have catalytic parameters for adenylate synthesis that are comparable to those of the wild-type enzyme. Thus, such an approach may provide for a variety of novel applications.

14.
We investigate the relationship between the average fitness decay due to single mutations and the strength of epistatic interactions in genetic sequences. We observe that epistatic interactions between mutations are correlated to the average fitness decay, both in RNA secondary structure prediction and in digital organisms replicating in silico. This correlation implies that, during adaptation, epistasis and average mutational effect cannot be optimized independently. In experiments with RNA sequences evolving on a neutral network, the selective pressure to decrease the mutational load then leads to a reduction in the number of sequences with strong antagonistic interactions between deleterious mutations in the population.

15.

Background

Precise identification of three-dimensional genome organization, especially enhancer-promoter interactions (EPIs), is important to deciphering gene regulation, cell differentiation and disease mechanisms. Currently, it is a challenging task to distinguish true interactions from other nearby non-interacting ones since the power of traditional experimental methods is limited due to low resolution or low throughput.

Results

We propose a novel computational framework EP2vec to assay three-dimensional genomic interactions. We first extract sequence embedding features, defined as fixed-length vector representations learned from variable-length sequences using an unsupervised deep learning method in natural language processing. Then, we train a classifier to predict EPIs using the learned representations in a supervised way. Experimental results demonstrate that EP2vec obtains F1 scores ranging from 0.841 to 0.933 on different datasets, which outperforms existing methods. We demonstrate the robustness of sequence embedding features by carrying out sensitivity analysis. In addition, we identify motifs that represent cell line-specific information through analysis of the learned sequence embedding features by adopting an attention mechanism. Finally, we show that even better performance, with F1 scores of 0.889 to 0.940, can be achieved by combining sequence embedding features and experimental features.

Conclusions

EP2vec sheds light on feature extraction for DNA sequences of arbitrary lengths and provides a powerful approach for EPIs identification.

16.
Two approaches to the understanding of biological sequences are confronted. While the recognition of particular signals in sequences relies on complex physical interactions, the problem is often analysed in terms of the presence or absence of literal motifs (strings) in the sequence. We present here a test-case for evaluating the potential of this approach. We classify DNA sequences as positive or negative depending on whether they contain a single melted domain in the middle of the sequence, which is a global physical property. Two sets of positive "biological" sequences were generated by a computer simulation of evolutionary divergence along the branches of a phylogenetic tree, under the constraint that each intermediate sequence be positive. These two sets and a set of random positive sequences were subjected to pattern analysis. The observed local patterns were used to construct expert systems to discriminate positive from negative sequences. The experts achieved 79% to 90% success on random positive sequences and up to 99% on the biological sets, while making less than 2% errors on negative sequences. Thus, the global constraints imposed on sequences by a physical process may generate local patterns that are sufficient to predict, with a reasonable probability, the behaviour of the sequences. However, rather large sets of biological sequences are required to generate patterns free of illegitimate constraints. Furthermore, depending upon the initial sequence, the sets of sequences generated on a phylogenetic tree may be amenable or refractory to string analysis, while obeying identical physical constraints. Our study clarifies the relationship between experts' errors on positive and negative sequences, and the contributions of legitimate and illegitimate patterns to these errors. The test-case appears suitable both for further investigations of problems in the theory of sequence evolution and for further testing of pattern analysis techniques.

17.
Tao Li  Bin Han 《DNA sequence》2004,15(2):135-139
This work studied the relationship between any two nucleotides in genomic sequences, coding sequences and full-length cDNAs. We adopted the null hypothesis that no interactions exist between any two nucleotides in a sequence; under this hypothesis, a hypothetical combination distribution of two nucleotides is derived, and the difference between this hypothetical distribution and the actual distribution is used to measure the average interaction between the two nucleotides. We found that the interactions between any two nucleotides are closely related to damped wavelike patterns along the sequences. Based on these results, we venture some hypotheses on several biological topics. Further studies of the wave pattern may provide new clues for gene prediction and genome structure study.
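
The comparison between a hypothetical no-interaction distribution and the observed distribution can be sketched as follows. The abstract gives no formula, so the specific score used here (the summed absolute difference between observed pair frequencies at a given gap and the product of single-nucleotide frequencies) and the function names are illustrative assumptions, not the authors' measure.

```python
from collections import Counter
from itertools import product


def interaction_score(seq, gap=1):
    """Summed |observed - expected| over all nucleotide pairs at distance `gap`.

    Expected frequencies assume independence: p(a) * p(b), with p taken from
    the overall composition of seq (an assumed choice of null model).
    """
    n_pairs = len(seq) - gap
    if n_pairs <= 0:
        return 0.0
    composition = Counter(seq)
    observed = Counter((seq[i], seq[i + gap]) for i in range(n_pairs))
    total = len(seq)
    score = 0.0
    for a, b in product(composition, repeat=2):
        expected = (composition[a] / total) * (composition[b] / total)
        score += abs(observed[(a, b)] / n_pairs - expected)
    return score


def interaction_profile(seq, max_gap=10):
    """Scores for gaps 1..max_gap, i.e. the curve along which a wave would show."""
    return [interaction_score(seq, gap) for gap in range(1, max_gap + 1)]
```

Plotting `interaction_profile` against the gap for a long genomic sequence is one way such a distance-dependent pattern could be inspected; whether it reproduces the damped wave reported in the abstract depends on the measure the authors actually used.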

18.
We report an interesting case of structural similarity between 2 small, nonhomologous proteins, the third domain of ovomucoid (ovomucoid) and the C-terminal fragment of ribosomal L7/L12 protein (CTF). The region of similarity consists of a 3-stranded beta-sheet and an alpha-helix. This region is highly similar; the corresponding elements of secondary structure share a common topology, and the RMS difference for "equivalent" Cα atoms is 1.6 Å. Surprisingly, this common structure arises from completely different sequences. For the common core, the sequence identity is less than 3%, and there is neither significant sequence similarity nor similarity in the position or orientation of conserved hydrophobic residues. This superposition raises the question of how 2 entirely different sequences can produce an identical structure. Analyzing this common region in ovomucoid revealed that it is stabilized by disulfide bonds. In contrast, the corresponding structure in CTF is stabilized in the alpha-helix by a composition of residues with high helix-forming propensities. This result suggests that different sequences and different stabilizing interactions can produce an identical structure.

19.
We have successfully linked protein library screening directly with the identification of active proteins, without the need for individual purification, display technologies or physical linkage between the protein and its encoding sequence. By using ‘MAX’ randomization we have rapidly constructed 60 overlapping gene libraries that encode zinc finger proteins, randomized variously at the three principal DNA-contacting residues. Expression and screening of the libraries against five possible target DNA sequences generated data points covering a potential 40000 individual interactions. Comparative analysis of the resulting data enabled direct identification of active proteins. Accuracy of this library analysis methodology was confirmed by both in vitro and in vivo analyses of identified proteins to yield novel zinc finger proteins that bind to their target sequences with high affinity, as indicated by low nanomolar apparent dissociation constants.

20.
The role of histone N-terminal domains in the thermodynamic stability of nucleosomes assembled on several different telomeric DNAs, as well as on 'average' sequence DNA and on strong nucleosome positioning sequences, has been studied by competitive reconstitution. We find that hyperacetylation of histone tails favors nucleosome formation to a similar extent for all the examined sequences. In contrast, removal of histone terminal domains by selective trypsinization causes a decrease of nucleosome stability which is smaller for telomeres than for the other sequences examined, suggesting that telomeric sequences have only minor interactions with histone tails. Micrococcal nuclease kinetics shows enhanced accessibility of acetylated nucleosomes formed on both telomeric and 'average' sequence DNAs. These results suggest a more complex role for histone acetylation than the decrease of electrostatic interactions between DNA and histones.
