Similar articles
1.
2.
Genetic recombination is an important process during the evolution of many virus species and occurs particularly frequently amongst begomoviruses in the single-stranded DNA virus family, Geminiviridae. As in many other recombining viruses, it is apparent that non-random recombination breakpoint distributions observable within begomovirus genomes sampled from nature are the product of variations both in basal recombination rates across genomes and in the overall viability of different recombinant genomes. Whereas factors influencing basal recombination rates might include local degrees of sequence similarity between recombining genomes, nucleic acid secondary structures and genomic sensitivity to nuclease attack or breakage, the viability of recombinant genomes could be influenced by the degree to which their co-evolved protein-protein, protein-nucleotide and nucleotide-nucleotide interactions are disruptable by recombination. Here we investigate patterns of recombination that occur over 120-day-long experimental infections of tomato plants with the begomoviruses Tomato yellow leaf curl virus and Tomato leaf curl Comoros virus. We show that patterns of sequence exchange between these viruses can be extraordinarily complex and present clear evidence that factors such as local degrees of sequence similarity but not genomic secondary structure strongly influence where recombination breakpoints occur. It is also apparent from our experiment that overall patterns of recombination are strongly influenced by selection against individual recombinants displaying disrupted intra-genomic interactions such as those required for proper protein and nucleic acid folding. Crucially, we find that selection favoring the preservation of co-evolved longer-range protein-protein and protein-DNA interactions is so strong that its imprint can even be used to identify the exact sequence tracts involved in these interactions.

3.
Protein sequences can be represented as binary patterns of polar (○) and nonpolar (●) amino acids. These binary sequence patterns are categorized into two classes: Class A patterns match the structural repeat of an idealized amphiphilic α-helix (3.6 residues per turn), and class B patterns match the structural repeat of an idealized amphiphilic β-strand (2 residues per turn). The difference between these two classes of sequence patterns has led to a strategy for de novo protein design based on binary patterning of polar and nonpolar amino acids. Here we ask whether similar binary patterning is incorporated in the sequences and structures of natural proteins. Analysis of the Protein Data Bank demonstrates the following. (1) Class A sequence patterns occur considerably more frequently in the sequences of natural proteins than would be expected at random, but class B patterns occur less often than expected. (2) Each pattern is found predominantly in the secondary structure expected from the binary strategy for protein design. Thus, class A patterns are found more frequently in α-helices than in β-strands, and class B patterns are found more frequently in β-strands than in α-helices. (3) Among the α-helices of natural proteins, the most commonly used binary patterns are indeed the class A patterns. (4) Among all β-strands in the database, the most commonly used binary patterns are not the expected class B patterns. (5) However, for solvent-exposed β-strands, the correlation is striking: All β-strands in the database that contain the class B patterns are exposed to solvent. (6) The bias of class A patterns for α-structure over β-structure and the bias of class B patterns for β-structure over α-structure are significant, not merely when compared to other binary patterns of polar (○) and nonpolar (●) amino acids, but also when compared to the full range of sequences in the database. The implications for the design of novel proteins are discussed.
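
The binary encoding described in this abstract can be illustrated with a short sketch. It is not the authors' procedure: the set of residues treated as nonpolar (`NONPOLAR`), the `P`/`N` letters, and both function names are assumptions made for illustration, and only the class B test (strict alternation, matching the 2-residue β-strand repeat) is shown.

```python
# Illustrative sketch only. The abstract does not list which residues count
# as nonpolar; this set is an assumption, and every other residue is treated
# as polar for simplicity.
NONPOLAR = set("FLIMV")


def binary_pattern(seq):
    """Map each residue to 'N' (nonpolar) or 'P' (polar)."""
    return "".join("N" if aa in NONPOLAR else "P" for aa in seq.upper())


def is_class_b(pattern):
    """True if polar and nonpolar positions strictly alternate,
    i.e. the pattern repeats every 2 residues like an idealized β-strand."""
    return len(pattern) >= 2 and all(a != b for a, b in zip(pattern, pattern[1:]))


if __name__ == "__main__":
    strand_like = binary_pattern("KVEIKV")
    helix_like = binary_pattern("LKKLLEEL")
    print(strand_like, is_class_b(strand_like))
    print(helix_like, is_class_b(helix_like))
```

A class A test would instead compare the pattern against the 3.6-residue helical repeat, which needs a choice of reference patterns that the abstract does not give.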

4.
We have analyzed micrococcal nuclease (MNase) DNA cleavage patterns at the sequence level by examining 2.3 × 10^3 base pairs of data derived from the Drosophila melanogaster 44D larval cuticle locus. Within this region, MNase preferentially cleaved 140 sites. Clusters of these sites appear to generate the preferential MNase eukaryotic DNA cleavage sites seen on agarose gels at roughly 100 to 300 base-pair intervals. These clusters of preferential cleavage sites rarely occur within gene coding regions. The analysis revealed that duplex DNA sequences preferentially cleaved by MNase are generally determined by a single strand sequence: d(A-T)n, where n ≥ 1, flanked by a 5' dC or dG. Cleavage of the other strand is generally staggered 5' by several nucleotides and occurs even if such sequences are absent on that strand. An empirical predictive DNA cleavage model derived from a statistical analysis of the sequence level data was applied to seven eukaryotic gene loci of known sequence. The predicted patterns were in good general agreement with the previously observed eukaryotic gene/spacer cleavage pattern. Statistical analysis also revealed that sites of predicted preferential DNA cleavage occur less frequently in protein coding regions than for randomized sequences of the same length and nucleotide content. Comparison of the MNase cleavage patterns to the sequence-dependent pattern of binding energies between duplex DNA strands indicates that MNase preferentially cleaves sequences with low helix stability.
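
As a rough illustration of the single-strand rule quoted in this abstract, the sketch below scans one strand for runs of A or T that are immediately preceded on the 5' side by C or G. This is one possible reading of "d(A-T)n, n ≥ 1, flanked by a 5' dC or dG"; the abstract does not specify the authors' empirical model, so the regular expression and the function name `candidate_sites` are assumptions, not the published method.

```python
import re

# Assumed reading of the rule: a run of one or more A/T bases whose
# 5' neighbour is C or G. The lookbehind keeps the flanking base out
# of the reported run.
_SITE = re.compile(r"(?<=[CG])[AT]+")


def candidate_sites(strand):
    """Return (start_index, run) pairs for candidate A/T runs on one strand."""
    return [(m.start(), m.group()) for m in _SITE.finditer(strand.upper())]


if __name__ == "__main__":
    for start, run in candidate_sites("GGATTCAGT"):
        print(start, run)
```

The sketch only flags candidate sequences; it says nothing about cleavage frequency or the staggered cut on the opposite strand.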

5.
Two different states of human immunodeficiency virus type 1 are apparent in the asymptomatic and late stages of infection. Important determinants associated with these two states have been found within the V3 loop of the viral Env protein. In this study, two large data sets of published V3 sequences were analyzed to identify patterns of sequence variability that would correspond to these two states of the virus. We were especially interested in the pattern of basic amino acid substitutions, since the presence of basic amino acids in V3 has been shown to change virus tropism in cell culture. Four features of the sequence heterogeneity in V3 were observed: (i) approximately 70% of all nonconservative basic substitutions occur at four positions in V3, and V3 sequences with a basic substitution in at least one of these four positions contain approximately 95% of all nonconservative basic substitutions; (ii) substitution patterns within V3 are influenced by the identity of the amino acid at position 25; (iii) sequence polymorphisms account for a significant fraction of uncharged amino acid substitutions at several positions in V3, and sequence heterogeneity other than these polymorphisms is most significant at two positions near the tip of V3; and (iv) sequence heterogeneity in V3 (in addition to the basic amino acid substitutions) is approximately twofold greater in V3 sequences that contain basic amino acid substitutions. By using this sequence analysis, we were able to identify distinct groups of V3 sequences in infected patients that appear to correspond to these two virus states. The identification of these discrete sequence patterns in vivo demonstrates how the V3 sequence can be used as a genetic marker for studying the two states of human immunodeficiency virus type 1.

6.
7.
A number of operator-binding proteins contain sequence features similar to those of the Cro and cI repressors of bacteriophage λ and the CAP protein of Escherichia coli, such as conserved amino acids at constant positions. However, these sequence patterns also occur in proteins that are not operator-binding. We use sequence analogy information in conjunction with a pattern recognition algorithm. The functional and structural properties of proteins, e.g., distributions of hydrophobicity, hydrophilicity, charged amino acids, electrostatic free energy, and helical structures, are also considered. Within the framework of discriminant analysis, we calculate the above variables and search for a better combination of variables. To assess the discriminatory power of these variables, we allocated additional sequences and predicted DNA-binding regions of regulatory proteins not included in the training set.

8.
DNA's genetic code can be represented as an alphabetic sequence composed of the four letters A, C, G, and T, which represent the four types of nucleotides (adenylic, cytidylic, guanylic, and thymidylic acid) of which DNA is composed. Now that these sequences have been identified for many genes and are available in computer-readable form, scientists can analyze these data and search for patterns in an attempt to learn more about the regulatory functions of the gene. One area of study is that of the frequency of occurrence of specific nucleotide subsequences (e.g., ACAC) within part or all of a nucleotide sequence. This paper derives the probability distribution of the frequency of occurrence of a subsequence within a nucleotide sequence, under the hypothesis that the four nucleotides occur at random and with equal probability. This distribution is nontrivial because different subsequences have different "overlap capability." For example, the subsequence AAAA can occur up to 17 times in a sequence of length 20 (which would happen if the sequence were composed solely of A's), but the subsequence ACGT cannot occur more than 5 times in a sequence of length 20. Thus, the frequency distributions are different for each type of overlap capability. It is of interest to assess and compare the degree of nonrandomness for different subsequences or among different portions of a sequence; the existence and degree of nonrandomness may be related to the type and degree of functionality of a nucleotide (sub)sequence. The frequency distributions provided here can be used to perform exact significance tests of the hypothesis of randomness. An approximate test is also described for use with long sequences; this can be used to test a more general null hypothesis of nucleotides occurring with unequal probabilities.
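
The "overlap capability" example in this abstract (AAAA up to 17 times, ACGT at most 5 times in a sequence of length 20) can be checked with a small sketch. The function names are hypothetical, and the code illustrates only the counting argument, not the paper's probability distribution. The maximum count follows from the shortest shift at which a subsequence overlaps itself.

```python
def count_overlapping(seq, sub):
    """Count occurrences of sub in seq, allowing occurrences to overlap."""
    count = 0
    start = seq.find(sub)
    while start != -1:
        count += 1
        start = seq.find(sub, start + 1)  # advance by one to allow overlaps
    return count


def max_occurrences(sub, length):
    """Largest possible overlapping count of sub in any sequence of the given length."""
    k = len(sub)
    if k == 0 or length < k:
        return 0
    # Smallest shift p > 0 at which sub matches itself; p == k means no self-overlap.
    period = next(p for p in range(1, k + 1) if sub[p:] == sub[:k - p])
    return (length - k) // period + 1


if __name__ == "__main__":
    print(count_overlapping("A" * 20, "AAAA"), max_occurrences("AAAA", 20))
    print(count_overlapping("ACGT" * 5, "ACGT"), max_occurrences("ACGT", 20))
```

Subsequences with an intermediate overlap capability, such as ACAC (self-overlap at a shift of 2), fall between these two extremes.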

9.
Functional & Integrative Genomics - Distinct gene expression patterns that occur during the adenoma-carcinoma sequence need to be determined to analyze the underlying mechanism in each step of...

10.
Divergence in expression between duplicated genes in Arabidopsis   Cited by: 2 (self-citations: 0, citations by others: 2)
New genes may arise through tandem duplication, dispersed small-scale duplication, and polyploidy, and patterns of divergence between duplicated genes may vary among these classes. We have examined patterns of gene expression and coding sequence divergence between duplicated genes in Arabidopsis thaliana. Due to the simultaneous origin of polyploidy-derived gene pairs, we can compare covariation in the rates of expression divergence and sequence divergence within this group. Among tandem and dispersed duplicates, much of the divergence in expression profile appears to occur at or shortly after duplication. Contrary to findings from other eukaryotic systems, there is little relationship between expression divergence and synonymous substitutions, whereas there is a strong positive relationship between expression divergence and nonsynonymous substitutions. Because this pattern is pronounced among the polyploidy-derived pairs, we infer that the strength of purifying selection acting on protein sequence and expression pattern is correlated. The polyploidy-derived pairs are somewhat atypical in that they have broader expression patterns and are expressed at higher levels, suggesting differences among polyploidy- and nonpolyploidy-derived duplicates in the types of genes that revert to single copy. Finally, within many of the duplicated pairs, one gene is expressed at a higher level across all assayed conditions, which suggests that the subfunctionalization model for duplicate gene preservation provides, at best, only a partial explanation for the patterns of expression divergence between duplicated genes.

11.
Genetic control of cell division patterns in the Drosophila embryo   Cited by: 45 (self-citations: 0, citations by others: 45)
B. A. Edgar, P. H. O'Farrell. Cell, 1989, 57(1): 177-187

12.
During spermatogenesis, the complex events of the first meiotic prophase and division phase bring about dramatic changes in nuclear organization. One factor frustrating mechanistic dissection of these events is lack of knowledge about precisely what events occur, in what order they occur, and how they may be interrelated by temporal sequence; in other words, a precise timeline is lacking. This temporal ordering problem can be tackled by following expression and localization in mouse spermatocytes of proteins critical to events of the meiotic cell division process. These include ones that are primarily chromosomal and related to pairing and recombination, as well as kinases and substrates that mediate the cell cycle transition. Distinct and protein-specific patterns occur with respect to expression and localization throughout meiotic prophase and division and dramatic relocalization of proteins occurs as spermatocytes enter the meiotic division phase. This information provides a foundation for a meiotic timeline that can be augmented to provide, eventually, a complete catalog of meiotic events and their temporal sequence. Such a framework can clarify mechanisms of normal meiosis as well as mutant phenotypes and aberrations of the meiotic process that lead to aneuploidy.

13.
Tissue-specific gene expression using the upstream activating sequence (UAS)–GAL4 binary system has facilitated genetic dissection of many biological processes in Drosophila melanogaster. Refining GAL4 expression patterns or independently manipulating multiple cell populations using additional binary systems are common experimental goals. To simplify these processes, we developed a convertible genetic platform, the integrase swappable in vivo targeting element (InSITE) system. This approach allows GAL4 to be replaced with any other sequence, placing different genetic effectors under the control of the same regulatory elements. Using InSITE, GAL4 can be replaced with LexA or QF, allowing an expression pattern to be repurposed. GAL4 can also be replaced with GAL80 or split-GAL4 hemi-drivers, allowing intersectional approaches to refine expression patterns. The exchanges occur through efficient in vivo manipulations, making it possible to generate many swaps in parallel. This system is modular, allowing future genetic tools to be easily incorporated into the existing framework.

14.
Cloning of the banana pectin lyase gene   Cited by: 7 (self-citations: 0, citations by others: 7)
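Specific primers were designed from the previously reported banana pectin lyase gene sequence, and the pectin lyase cDNA was obtained by RT-PCR, then cloned and sequenced. Comparison with the reported sequence showed 99.24% homology at the nucleotide level; the deduced amino acid sequences were also highly homologous, at 97.7%. RT-PCR was then used to study expression of the pectin lyase gene in different banana tissues and in fruit at different stages of ripeness. The results show that the gene is expressed only in fruit, that its expression is tissue-specific, and that it is expressed only at particular stages of fruit development.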

15.
16.
Chromatin self-organization by mutation bias   Cited by: 3 (self-citations: 0, citations by others: 3)
Proteins, on binding to a DNA sequence, alter the frequency and quality of mutations that occur in the sequence. This represents a reverse flow of information from proteins to DNA. Nucleosome binding causes patterns of UV-induced damage which, when converted to mutations by replication, will phase nucleosomes. We propose that DNA binding proteins create their own high- or low-affinity binding sites along DNA sequences by biased mutational pressure.

17.
We describe a method to assign a protein structure to a functional family using family-specific fingerprints. Fingerprints represent amino acid packing patterns that occur in most members of a family but are rare in the background, a nonredundant subset of PDB; their information is additional to sequence alignments, sequence patterns, structural superposition, and active-site templates. Fingerprints were derived for 120 families in SCOP using Frequent Subgraph Mining. For a new structure, all occurrences of these family-specific fingerprints may be found by a fast algorithm for subgraph isomorphism; the structure can then be assigned to a family with a confidence value derived from the number of fingerprints found and their distribution in background proteins. In validation experiments, we infer the function of new members added to SCOP families and we discriminate between structurally similar, but functionally divergent TIM barrel families. We then apply our method to predict function for several structural genomics proteins, including orphan structures. Some predictions have been corroborated by other computational methods and some validated by subsequent functional characterization.

18.
19.
Many proteins that are synthesized in the cytoplasm of cells are ultimately found in non-cytoplasmic locations. The correct targeting and transport of proteins must occur across bacterial cell membranes, the endoplasmic reticulum membrane, and those of mitochondria and chloroplasts. One unifying feature among transported proteins in these systems is the requirement for an amino-terminal targeting signal. Although the primary sequence of targeting signals varies substantially, many patterns involving overall properties are shared. A recent surge in the identification of components of the transport apparatus from many different systems has revealed that these are also closely related. In this review we describe some of the key components of different transport systems and highlight these common features.

20.
We report the results of phylogenetic analyses of 1447 bases of mitochondrial DNA sequence for 21 populations representing seven species of the Anolis grahami series (A. conspersus, A. garmani, A. grahami, A. lineatopus, A. opalinus, A. reconditus, and A. valencienni), six of which occur on Jamaica. These data include 705 characters that are phylogenetically informative according to parsimony. A parsimony analysis of these data combined with previously published allozymic data yields a single most parsimonious tree with strong support for monophyly of the A. grahami series, the sister-group relationship between Anolis lineatopus and A. reconditus, and a clade composed of Anolis garmani, A. grahami, and A. opalinus. Based on DNA data alone, A. conspersus is nested within A. grahami. Haplotypes sampled from geographic populations of A. grahami, A. lineatopus, and A. opalinus are highly divergent (approximately 12-15% sequence difference on average for each species) and show similar phylogeographic patterns, suggesting that each of these currently recognized species may be a complex of species. Anolis valencienni also shows high sequence divergence among haplotypes from different geographic populations (approximately 8% sequence difference) and may contain cryptic species. Divergence among haplotypes within A. garmani is substantially lower (approximately 3% sequence difference), and phylogeographic patterns are significantly different from those observed in A. grahami, A. lineatopus, and A. opalinus.
