Similar literature
20 similar articles found (search time: 31 ms)
1.
2.
3.

Background

Meiotic recombination ensures proper segregation of homologous chromosomes and creates genetic variation. In many organisms, recombination occurs at limited sites, termed "hotspots", whose positions in mammals are determined by PR domain member 9 (PRDM9), a long-array zinc-finger and chromatin-modifier protein. Determining the rules governing the DNA binding of PRDM9 is a major issue in understanding how it functions.

Results

We show that mouse PRDM9 protein variants bind to hotspot DNA sequences in a manner that is specific for both PRDM9 and DNA haplotypes, and that in vitro binding parallels in vivo biological activity. Examining four hotspots, three activated by Prdm9Cst and one activated by Prdm9Dom2, we found that all binding sites required the full array of 11 or 12 contiguous fingers, depending on the allele, and that there was little sequence similarity between the binding sites of the three Prdm9Cst activated hotspots. The binding specificity of each position in the Hlx1 binding site, activated by Prdm9Cst, was tested by mutating each nucleotide to its three alternatives. The 31 positions along the binding site varied considerably in the ability of alternative bases to support binding, which also implicates a role for additional binding to the DNA phosphate backbone.
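
The mutational scan described above (each of 31 positions changed to its three alternative bases) can be enumerated with a short sketch. The function name and the placeholder sequence are assumptions made for illustration; the real Hlx1 binding-site sequence is not given in this abstract.

```python
def single_nucleotide_variants(site):
    """Return (position, ref, alt, sequence) for every single-base substitution."""
    variants = []
    for i, ref in enumerate(site):
        for alt in "ACGT":
            if alt != ref:
                # Positions are reported 1-based, as is usual for binding sites.
                variants.append((i + 1, ref, alt, site[:i] + alt + site[i + 1:]))
    return variants


# Placeholder 31-nt sequence, NOT the real Hlx1 binding site.
placeholder_site = ("ACGT" * 8)[:31]
variants = single_nucleotide_variants(placeholder_site)
print(len(variants))  # 31 positions x 3 alternatives
```

A 31-nucleotide site therefore yields 93 single-substitution probes, one binding assay per probe.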

Conclusions

These results, which provide the first detailed mapping of PRDM9 binding to DNA and, to our knowledge, the most detailed analysis yet of DNA binding by a long zinc-finger array, make clear that the binding specificities of PRDM9, and possibly other long-array zinc-finger proteins, are unusually complex.

4.
We have determined the complete nucleotide sequence of the monomer repeating unit of the 1.688 g/cm3 satellite DNA from Drosophila melanogaster. This satellite DNA, which makes up 4% of the Drosophila genome and is located primarily on the sex chromosomes, has a repeat unit 359 base-pairs in length. This complex sequence is unrelated to the other three major satellite DNAs present in this species, each of which contains a very short repeated sequence only 5 to 10 base-pairs long. The repeated sequence is more similar to the complex repeating units found in satellites of mammalian origin in that it contains runs of adenylate and thymidylate residues. We have determined the nature of the sequence variations in this DNA by restriction nuclease cleavage and by direct sequence determination of (1) individual monomer units cloned in hybrid plasmids, (2) mixtures of adjacent monomers from a cloned segment of this satellite DNA, (3) mixtures of monomer units isolated by restriction nuclease cleavage of total 1.688 g/cm3 satellite DNA. Both direct sequence determination and restriction nuclease cleavage indicate that certain positions in the repeat can be highly variable with up to 50% of certain restriction sites having altered recognition sequences. Despite the high degree of variation at certain sites, most positions in the sequence are highly conserved. Sequence analysis of a mixture of 15 adjacent monomer units detected only nine variable positions out of 359 base-pairs. Total satellite DNA showed only four additional positions. While some variability would have been missed due to the sequencing methods used, we conclude that the variation from one repeat to the next is not random and that most of the satellite repeat is conserved. 
This conservation may reflect functional aspects of the repeated DNA, since we have shown earlier that part of this sequence serves as a binding site for a sequence-specific DNA binding protein isolated from Drosophila embryos (Hsieh & Brutlag, 1979).

5.
Synthetic sites inserted into a plasmid were used to analyze the sequence requirements for in vivo DNA cleavage dependent on bacteriophage T4 endonuclease II. A 16-bp variable sequence surrounding the cleavage site was sufficient for cleavage, although context both within and around this sequence influenced cleavage efficiency. The most efficiently cleaved sites matched the sequence CGRCCGCNTTGGCNGC, in which the strongly conserved bases to the left were essential for cleavage. The less-conserved bases in the center and in the right half determined cleavage efficiency in a manner not directly correlated with the apparent base preference at each position; a sequence carrying, in each of the 16 positions, the base most preferred in natural sites in pBR322 was cleaved infrequently. This, along with the effects of substitutions at one or two of the less-conserved positions, suggests that several combinations of bases can fulfill the requirements for recognition of the right part of this sequence. The replacements that improve cleavage frequency are predicted to influence helical twist and roll, suggesting that recognition of sequence-dependent DNA structure and recognition of specific bases are both important. Upon introduction of a synthetic site, cleavage at natural sites within 800 to 1,500 bp from the synthetic site was significantly reduced. This suggests that the enzyme may engage more DNA than its cleavage site and cleaves the best site within this region. Cleavage frequency at sites which do not conform closely to the consensus is, therefore, highly context dependent. Models and possible biological implications of these findings are discussed.
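
The consensus CGRCCGCNTTGGCNGC uses IUPAC ambiguity codes (R = A or G, N = any base). A minimal sketch of scanning a sequence for exact matches follows; the helper names and the test sequence are illustrative assumptions. A match only indicates agreement with the consensus, since the abstract stresses that cleavage efficiency also depends on context.

```python
import re

# Only the IUPAC codes that occur in this consensus are listed here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}

CONSENSUS = "CGRCCGCNTTGGCNGC"


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular-expression string."""
    return "".join(IUPAC[code] for code in consensus)


def find_sites(sequence, consensus=CONSENSUS):
    """Return 0-based start positions of all matches, including overlapping ones."""
    lookahead = re.compile("(?=" + consensus_to_regex(consensus) + ")")
    return [match.start() for match in lookahead.finditer(sequence)]


# Hypothetical test sequence with one site embedded after two flanking bases.
print(find_sites("TT" + "CGACCGCATTGGCTGC" + "AA"))
```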

6.

Background

Modeling of transmembrane domains (TMDs) requires correct prediction of interfacial residues for in-silico modeling and membrane insertion studies. This implies defining a target sequence long enough to contain interfacial residues. However, overly long sequences induce artifactual polymorphism: within tested modeling methods, the longer the target sequence, the more variable the secondary structure, as though the procedure were stopped before the end of the calculation (which may in fact be unreachable). Moreover, delimitation of these TMDs can produce variable results with sequence-based two-dimensional prediction methods, especially for sequences showing polymorphism. To solve this problem, we developed a new modeling procedure using the PepLook method. We scanned the sequences by modeling peptides from the target sequence with a window of 19 residues.
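
The 19-residue scan can be pictured as a sliding window over the target sequence. The sketch below only generates the peptides; it does not model them, and the step of one residue between windows is an assumption, since the abstract does not state the step size.

```python
def sliding_windows(sequence, width=19, step=1):
    """Yield (1-based start, peptide) for every full-width window."""
    for start in range(0, len(sequence) - width + 1, step):
        yield start + 1, sequence[start:start + width]


# Hypothetical 25-residue sequence used only to show the window count.
example = "ACDEFGHIKLMNPQRSTVWYACDEF"
windows = list(sliding_windows(example))
print(len(windows), windows[0])
```

A sequence of length n gives n - 18 peptides at a step of one, each of which would then be modeled separately.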

Results

Using sequences whose NMR-structures are already known (GpA, EphA1 and Erb2-HER2), we first determined that the hydrophobic to hydrophilic accessible surface area ratio (ASAr) was the best criterion for delimiting the TMD sequence. The length of the helical structure and the Impala method further supported the determination of the TMD limits. This method was applied to the IL-2Rβ and IL-2Rγ TMD sequences of Homo sapiens, Rattus norvegicus, Mus musculus and Bos taurus.

Conclusions

We succeeded in reducing the variation in the TMD limits to only 2 residues and in gaining structural information.

7.
8.

Background

The CTCF insulator protein is a highly conserved zinc finger protein that has been implicated in many aspects of gene regulation and nuclear organization. The protein has been hypothesized to organize the human genome by forming DNA loops.

Results

In this paper, we report biochemical evidence to support the role for CTCF in forming DNA loops. We have measured DNA bending by CTCF at the chicken HS4 β-globin FII insulator element in vitro and have observed a unique DNA structure with aberrant electrophoretic mobility which we believe to be a DNA loop. CTCF is able to form this unusual DNA structure at two other binding sites: the c-myc P2 promoter and the chicken F1 lysozyme gene silencer. We also demonstrate that the length, though not the sequence, of the DNA downstream of the binding site is important for the ability of CTCF to form this unusual DNA structure. We hypothesize that a single CTCF protein molecule is able to act as a "looper", possibly through the use of several of its zinc fingers.

Conclusions

CTCF is able to form an unusual DNA structure through the zinc finger domain of the protein. This unusual DNA structure is formed in a directional manner by the CTCF protein. The findings described in this paper suggest mechanisms by which CTCF is able to form DNA loops, organize the mammalian genome and function as an insulator protein.

9.

Background

The retroviral integrase protein catalyzes the insertion of linear viral DNA into host cell DNA. Although different retroviruses have been shown to target distinctive chromosomal regions, few of them display site-specific integration. ZAM, a retroelement from Drosophila melanogaster very similar in structure and replication cycle to mammalian retroviruses, is highly site-specific. Indeed, ZAM copies target the genomic 5′-CGCGCg-3′ consensus sequences. To elucidate the determinants of this high integration specificity, we investigated the functional properties of its integrase protein, denoted ZAM-IN.

Principal Findings

Here we show that ZAM-IN is able to nick DNA molecules in vitro. This endonuclease activity targets specific sequences that are present in a 388 bp fragment taken from the white locus and known to be a genomic ZAM integration site in vivo. Furthermore, ZAM-IN has the unusual ability to bind specific genomic DNA sequences directly. Two specific and independent sites are recognized within the 388 bp fragment of the white locus: the CGCGCg sequence and a closely apposed site different in sequence.

Conclusion

This study strongly argues that the intrinsic properties of ZAM-IN, i.e., its binding properties and its endonuclease activity, play an important part in ZAM integration specificity. Its ability to select two binding sites and to nick the DNA molecule is reminiscent of the strategy used by some site-specific recombination enzymes and forms the basis for site-specific integration strategies potentially useful in a broad range of genetic engineering applications.

10.
Discriminating phylogenetic signal from noise in DNA sequence data is a difficult problem in phylogenetic inference at higher systematic levels. For protein-coding genes, noise at synonymous (silent) positions can be filtered by deleting entire codon positions or types of change at a codon position. This method is not appropriate for replacement sites, because changes at each site within a codon may not be independent. This research presents a method using information from protein structure to evaluate variation in replacement sites. Analysis of the correlation of amino acid variation with protein structure identified rapidly evolving codons in the COIII gene. In a series of phylogenetic analyses attempting to recover a known set of vertebrate relationships, downweighting these labile codons produced the most accurate results. Structural correlates of variable and invariant residues identified in this study can be used to increase the accuracy of models used for phylogenetic inference. Viewing amino acid variation within a phylogenetic framework provided insight into residue changes important in the secondary and tertiary structures of the molecule, changes that were correlated between pairs of neighboring residues or between residues in neighboring helices.

11.

Background

Chromosome structure, DNA metabolic processes and cell type identity can all be affected by changing the positions of nucleosomes along chromosomal DNA, a reaction that is catalysed by SNF2-type ATP-driven chromatin remodelers. Recently it was suggested that in vivo, more than 50% of the nucleosome positions can be predicted simply by DNA sequence, especially within promoter regions. This seemingly contrasts with remodeler induced nucleosome mobility. The ability of remodeling enzymes to mobilise nucleosomes over short DNA distances is well documented. However, the nucleosome translocation processivity along DNA remains elusive. Furthermore, it is unknown what determines the initial direction of movement and how new nucleosome positions are adopted.

Methodology/Principal Findings

We have used AFM imaging and high resolution PAGE of mononucleosomes on 600 and 2500 bp DNA molecules to analyze ATP-dependent nucleosome repositioning by native and recombinant SNF2-type enzymes. We report that the underlying DNA sequence can control the initial direction of translocation, translocation distance, as well as the new positions adopted by nucleosomes upon enzymatic mobilization. Within a strong nucleosomal positioning sequence both recombinant Drosophila Mi-2 (CHD-type) and native RSC from yeast (SWI/SNF-type) repositioned the nucleosome at 10 bp intervals, which are intrinsic to the positioning sequence. Furthermore, RSC-catalyzed nucleosome translocation was noticeably more efficient when beyond the influence of this sequence. Interestingly, under limiting ATP conditions RSC preferred to position the nucleosome with 20 bp intervals within the positioning sequence, suggesting that native RSC preferentially translocates nucleosomes with 15 to 25 bp DNA steps.

Conclusions/Significance

Nucleosome repositioning thus appears to be influenced by both remodeler intrinsic and DNA sequence specific properties that interplay to define ATPase-catalyzed repositioning. Here we propose a successive three-step framework consisting of initiation, translocation and release steps to describe SNF2-type enzyme mediated nucleosome translocation along DNA. This conceptual framework helps resolve the apparent paradox between the high abundance of ATP-dependent remodelers per nucleus and the relative success of sequence-based predictions of nucleosome positioning in vivo.

12.
13.

Background

Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there are indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families.

Results

The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appears to play an important role in the molecular structure and/or function.
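
As an illustration of the interdependency measure, the sketch below computes a normalized mutual information between two aligned sites (alignment columns). The abstract does not say which normalization the authors use; dividing by the joint entropy is an assumption made here, and the function names and toy columns are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(counts, total):
    """Shannon entropy in bits of an empirical distribution given as counts."""
    return -sum((c / total) * log2(c / total) for c in counts.values())


def normalized_mi(col_x, col_y):
    """Mutual information of two alignment columns divided by their joint entropy.

    Assumed normalization: I(X;Y) / H(X,Y), which lies between 0 and 1.
    """
    n = len(col_x)
    h_x = entropy(Counter(col_x), n)
    h_y = entropy(Counter(col_y), n)
    h_xy = entropy(Counter(zip(col_x, col_y)), n)
    if h_xy == 0:
        return 0.0  # both columns are invariant, so there is nothing to share
    return (h_x + h_y - h_xy) / h_xy


# Toy columns: one pair varies in lockstep, the other pair varies independently.
print(normalized_mi("AAGG", "TTCC"))
print(normalized_mi("AAGG", "TCTC"))
```

Sites that always change together score near 1 and independent sites score near 0, which is the kind of signal a site-clustering procedure can group on.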

Conclusions

Our results demonstrate that the method we present here using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family.

14.
15.

Background

Protein kinases (PKs) have emerged as the largest family of signaling proteins in eukaryotic cells and are involved in every aspect of cellular regulation. Great progress has been made in understanding how PKs phosphorylate their substrates, but the detailed mechanisms by which PKs ensure substrate specificity with their structurally conserved catalytic domains are still not adequately understood. Correlated mutation analysis based on large sets of diverse sequence data may provide new insights into this question.

Methodology/Principal Findings

Statistical coupling, residue correlation and mutual information analyses along with clustering were applied to analyze the structure-based multiple sequence alignment of the catalytic domains of the Ser/Thr PK family. Two clusters of highly coupled sites were identified. Mapping these positions onto the 3D structure of the PK catalytic domain showed that these two groups of positions form two physically close networks. We named these the θ-shaped and γ-shaped networks, respectively.

Conclusions/Significance

The θ-shaped network links the active site cleft and the substrate binding regions, and might participate in PKs recognizing and interacting with their substrates. The γ-shaped network is mainly situated on one side of the substrate binding regions, linking the activation loop and the substrate binding regions. It might play a role in supporting the activation loop and substrate binding regions before catalysis, and participate in product release after phosphoryl transfer. Our results exhibit significant correlations with experimental observations, and can be used as a guide to further experimental and theoretical studies on the mechanisms of PKs interacting with their substrates.

16.

Background

Type II transmembrane serine proteases (TTSPs) are a family of cell membrane tethered serine proteases with unclear roles as their cleavage site specificities and substrate degradomes have not been fully elucidated. Indeed just 52 cleavage sites are annotated in MEROPS, the database of proteases, their substrates and inhibitors.

Methodology/Principal Finding

To profile the active site specificities of the TTSPs, we applied Proteomic Identification of protease Cleavage Sites (PICS). Human proteome-derived database searchable peptide libraries were assayed with six human TTSPs (matriptase, matriptase-2, matriptase-3, HAT, DESC and hepsin) to simultaneously determine sequence preferences on the N-terminal non-prime (P) and C-terminal prime (P’) sides of the scissile bond. Prime-side cleavage products were isolated following biotinylation and identified by tandem mass spectrometry. The corresponding non-prime side sequences were derived from human proteome databases using bioinformatics. Sequencing of 2,405 individual cleaved peptides allowed for the development of the family consensus protease cleavage site specificity revealing a strong specificity for arginine in the P1 position and surprisingly a lysine in P1′ position. TTSP cleavage between R↓K was confirmed using synthetic peptides. By parsing through known substrates and known structures of TTSP catalytic domains, and by modeling the remainder, structural explanations for this strong specificity were derived.
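
Deriving a consensus from cleaved peptides amounts to tallying residues at each position relative to the scissile bond. The sketch below counts only P1 (last non-prime residue) and P1′ (first prime residue); the three peptide pairs are invented toy data, not PICS results, and the function name is hypothetical.

```python
from collections import Counter


def p1_p1prime_counts(cleavages):
    """Tally residues flanking the scissile bond.

    Each item is (non_prime_side, prime_side), both written N- to C-terminal,
    so P1 is the last residue of the first string and P1' the first of the second.
    """
    p1 = Counter(non_prime[-1] for non_prime, _ in cleavages)
    p1_prime = Counter(prime[0] for _, prime in cleavages)
    return p1, p1_prime


# Toy data for illustration only.
toy_cleavages = [("GSR", "KLV"), ("AAR", "KTT"), ("LQK", "SVA")]
p1, p1_prime = p1_p1prime_counts(toy_cleavages)
print(p1.most_common(1), p1_prime.most_common(1))
```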

Conclusions

Degradomics analysis of 2,405 cleavage sites revealed a similar and characteristic TTSP family specificity at the P1 and P1′ positions for arginine and lysine in unfolded peptides. The prime side is important for cleavage specificity, thus making these proteases unusual within the tryptic-enzyme class that generally has overriding non-prime side specificity.

17.
18.

Background

DNA-binding proteins perform their functions through specific or non-specific sequence recognition. Although many sequence- or structure-based approaches have been proposed to identify DNA-binding residues on proteins or protein-binding sites on DNA sequences with satisfactory performance, it remains a challenging task to unveil the exact mechanism of protein-DNA interactions without crystal complex structures. Without information from complexes, the linkages between DNA-binding proteins and their binding sites on DNA are still missing.

Methods

While it is still difficult to acquire co-crystallized structures in an efficient way, this study proposes a knowledge-based learning method to effectively predict DNA orientation and base locations around the protein’s DNA-binding sites when given a protein structure. First, the functionally important residues of a query protein are predicted by a sequential pattern mining tool. After that, surface residues falling in the predicted functional regions are determined based on the given structure. These residues are then clustered based on their spatial coordinates and the resultant clusters are ranked by a proposed DNA-binding propensity function. Clusters with high DNA-binding propensities are treated as DNA-binding units (DBUs) and each DBU is analyzed by principal component analysis (PCA) to predict potential orientation of DNA grooves. More specifically, the proposed method is developed to predict the direction of the tangent line to the helix curve of the DNA groove where a DBU is going to bind.
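
The PCA step can be illustrated by extracting the first principal axis of a cluster of 3D residue coordinates. This is a generic sketch using power iteration on the covariance matrix in plain Python; how the authors map principal components onto the groove tangent is not stated in the abstract, and the coordinates below are made up.

```python
def principal_axis(points, iterations=200):
    """Return the unit vector of greatest variance for a list of (x, y, z) points."""
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    centered = [[p[k] - mean[k] for k in range(3)] for p in points]
    # 3x3 covariance matrix of the centered coordinates.
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(3)]
           for i in range(3)]
    # Power iteration; the start vector is arbitrary and would fail only if it
    # happened to be exactly orthogonal to the principal axis.
    v = [1.0, 0.7, 0.3]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            break  # all points coincide, so no direction is defined
        v = [x / norm for x in w]
    return v


# Made-up coordinates lying roughly along the x axis.
cluster = [(0, 0, 0), (1, 0.1, 0), (2, -0.1, 0), (3, 0.1, 0), (4, 0, 0)]
axis = principal_axis(cluster)
print([round(x, 3) for x in axis])
```

The sign of the returned vector is arbitrary, so it describes a line through the cluster rather than a direction along it.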

Results

This paper proposes a knowledge-based learning procedure to determine the spatial location of the DNA groove with respect to the query protein structure by considering geometric propensity between protein side chains and DNA bases. The 11 test cases used in this study reveal that the location and orientation of the DNA groove around a selected DBU can be predicted with acceptable errors.

Conclusions

This study presents a method to predict the location and orientation of DNA grooves with respect to the structure of a DNA-binding protein. The test cases shown in this study reveal the possibility of imaging protein-DNA binding conformation before a co-crystallized structure can be determined. How the proposed method can be incorporated with existing protein-DNA docking tools to study protein-DNA interactions deserves further study in the near future.

19.
E G Levin, D J Loskutoff. Cell, 1980, 22(3): 701-707
Yeast strains harboring independent mutations within the SUP4 tyrosine tRNA gene have been selected by virtue of their inactivating effect upon the SUP4-o UAA suppressor. Three fourths of the mutations at SUP4 are point alterations; the rest resemble the deletions described by Rothstein (1979). A meiotic genetic fine structure map of the locus was made by crossing 69 of the mutants in all combinations and testing for the frequency of SUP4-o recombinants. The sequences of SUP4 genes cloned from 32 mutant strains were determined by the dideoxynucleotide terminator method, using as primer a synthetic oligodeoxynucleotide corresponding to a sequence adjoining the SUP4 3′ terminus. The positions of the DNA sequence alterations showed good colinearity with the positions of the mutations on the genetic map. One of the 26 mutant sites found by DNA sequencing lies within the intervening sequence. At this site three repeat mutations were found, each changing AT → TA. Whereas mutations were generally rather uniformly distributed throughout the tRNATyr coding sequence, none occurred in the DNA sequences flanking the mature tRNATyr sequence or in a 12 nucleotide sequence including the 10 bp which constitute the 3′ side of the intervening sequence.

20.
Eukaryotic DNA topoisomerase I introduces transient single-stranded breaks on double-stranded DNA and spontaneously breaks down single-stranded DNA. The cleavage sites on both single and double-stranded SV40 DNA have been determined by DNA sequencing. Consistent with other reports, the eukaryotic enzyme, in contrast to prokaryotic type I topoisomerases, links to the 3'-end of the cleaved DNA and generates a free 5'-hydroxyl end on the other half of the broken DNA strand. Both human and calf enzymes cleave SV40 DNA at identical, specific sites. From 827 nucleotides sequenced, 68 cleavage sites were mapped. The majority of the cleavage sites were present on both double and single-stranded DNA at exactly the same nucleotide positions, suggesting that the DNA sequence is essential for enzyme recognition. Analysis of all the cleavage sequences shows that certain nucleotides are less favored at the cleavage sites. G is very likely to be excluded from positions -4, -2, -1 and +1, T from position -3, and A from position -1. These five positions (-4 to +1 oriented in the 5' to 3' direction) around the cleavage sites must interact intimately with topo I and thus are essential for enzyme recognition. One topo I cleavage site which shows an atypical cleavage sequence maps in the middle of a palindromic sequence near the origin of SV40 DNA replication. It occurs only on single-stranded SV40 DNA, suggesting that the DNA hairpin can alter the cleavage specificity. The strongest cleavage site maps near the origin of SV40 DNA replication at nucleotides 31-32 and has a pentanucleotide sequence of 5'-TGACT-3'.
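
The base exclusions reported above can be written as a small lookup and checked against a pentanucleotide. Two assumptions are made in this sketch: the five bases are taken to occupy positions -4, -3, -2, -1 and +1 in 5' to 3' order, and the exclusions are treated as hard rules, whereas the abstract describes them only as highly probable.

```python
# Bases reported as disfavored at each position around the cleavage site.
EXCLUDED = {-4: "G", -3: "T", -2: "G", -1: "GA", +1: "G"}
POSITIONS = (-4, -3, -2, -1, +1)


def disfavored_positions(pentamer):
    """Return the positions at which a 5-nt sequence carries a disfavored base."""
    if len(pentamer) != len(POSITIONS):
        raise ValueError("expected exactly five nucleotides")
    return [pos for pos, base in zip(POSITIONS, pentamer.upper())
            if base in EXCLUDED[pos]]


print(disfavored_positions("TGACT"))  # the strongest site reported in the abstract
print(disfavored_positions("GTGGG"))  # hypothetical worst case
```

Under these assumptions the strongest reported site, 5'-TGACT-3', carries no disfavored base at any of the five positions.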
