Similar articles
 20 similar articles found; search took 15 ms
1.
2.
3.
The Red Queen hypothesis depicts evolution as the continual struggle to adapt. According to this hypothesis, new genes, especially those originating from nongenic sequences (i.e., de novo genes), are eliminated unless they evolve continually in adaptation to a changing environment. Here, we analyze two Drosophila de novo miRNAs that are expressed in a testis-specific manner with very high rates of evolution in their DNA sequence. We knocked out these miRNAs in two sibling species and investigated their contributions to different fitness components. We observed that the fitness contributions of miR-975 in Drosophila simulans seem positive, in contrast to its neutral contributions in D. melanogaster, whereas miR-983 appears to have negative contributions in both species, as the fitness of the knockout mutant increases. As predicted by the Red Queen hypothesis, the fitness difference of these de novo miRNAs indicates their different fates.

4.
5.
6.
To understand whether any human-specific new genes may be associated with human brain functions, we computationally screened the genetic vulnerability factors identified through Genome-Wide Association Studies and linkage analyses of nicotine addiction and found one human-specific de novo protein-coding gene, FLJ33706 (alternative gene symbol C20orf203). Cross-species analysis revealed interesting evolutionary paths of how this gene had originated from noncoding DNA sequences: insertion of repeat elements, especially Alu, contributed to the formation of the first coding exon and six standard splice junctions on the branch leading to humans and chimpanzees, and two subsequent substitutions in the human lineage escaped two stop codons and created an open reading frame of 194 amino acids. We experimentally verified FLJ33706's mRNA and protein expression in the brain. Real-Time PCR in multiple tissues demonstrated that FLJ33706 was most abundantly expressed in brain. Human polymorphism data suggested that FLJ33706 encodes a protein under purifying selection. A specifically designed antibody detected its protein expression across human cortex, cerebellum and midbrain. Immunohistochemistry study in normal human brain cortex revealed the localization of FLJ33706 protein in neurons. Elevated expression of FLJ33706 was detected in Alzheimer's brain samples, suggesting the role of this novel gene in human-specific pathogenesis of Alzheimer's disease. FLJ33706 provided the strongest evidence so far that human-specific de novo genes can have protein-coding potential and differential protein expression, and be involved in human brain functions.

7.
Gene regulatory networks exhibit complex, hierarchical features such as global regulation and network motifs. There is much debate about whether the evolutionary origins of such features are the results of adaptation, or the by-products of non-adaptive processes of DNA replication. The lack of availability of gene regulatory networks of ancestor species on evolutionary timescales makes this a particularly difficult problem to resolve. Digital organisms, however, can be used to provide a complete evolutionary record of lineages. We use a biologically realistic evolutionary model that includes gene expression, regulation, metabolism and biosynthesis, to investigate the evolution of complex function in gene regulatory networks. We discover that: (i) network architecture and complexity evolve in response to environmental complexity, (ii) global gene regulation is selected for in complex environments, (iii) complex, inter-connected, hierarchical structures evolve in stages, with energy regulation preceding stress responses, and stress responses preceding growth rate adaptations and (iv) robustness of evolved models to mutations depends on hierarchical level: energy regulation and stress responses tend not to be robust to mutations, whereas growth rate adaptations are more robust and non-lethal when mutated. These results highlight the adaptive and incremental evolution of complex biological networks, and the value and potential of studying realistic in silico evolutionary systems as a way of understanding living systems.

8.
9.
Metallothioneins (MTs) are proteins devoted to the control of metal homeostasis and detoxification, and therefore, MTs have been crucial for the adaptation of living beings to variable situations of metal bioavailability. The evolution of MTs is, however, not yet fully understood, and to provide new insights into it, we have investigated the MTs in the diverse classes of Mollusks. We have shown that most molluskan MTs are bimodular proteins that combine six domains (α, β1, β2, β3, γ, and δ) in a lineage-specific manner. We have functionally characterized the Neritimorpha β3β1 and the Patellogastropoda γβ1 MTs, demonstrating the metal-binding capacity of the new γ domain. Our results have revealed a modular organization of mollusk MT, whose evolution has been impacted by duplication, loss, and de novo emergence of domains. MTs represent a paradigmatic example of modular evolution probably driven by the structural and functional requirements of metal binding.

10.
11.
Emerging evidence indicates that epileptic encephalopathies are genetically highly heterogeneous, underscoring the need for large cohorts of well-characterized individuals to further define the genetic landscape. Through a collaboration between two consortia (EuroEPINOMICS and Epi4K/EPGP), we analyzed exome-sequencing data of 356 trios with the “classical” epileptic encephalopathies, infantile spasms and Lennox-Gastaut syndrome, including 264 trios previously analyzed by the Epi4K/EPGP consortium. In this expanded cohort, we find 429 de novo mutations, including de novo mutations in DNM1 in five individuals and de novo mutations in GABBR2, FASN, and RYR3 in two individuals each. Unlike previous studies, this cohort is sufficiently large to show a significant excess of de novo mutations in epileptic encephalopathy probands compared to the general population using a likelihood analysis (p = 8.2 × 10⁻⁴), supporting a prominent role for de novo mutations in epileptic encephalopathies. We provide statistical evidence that mutations in DNM1 cause epileptic encephalopathy, find suggestive evidence for a role of three additional genes, and show that at least 12% of analyzed individuals have an identifiable causal de novo mutation. Strikingly, 75% of mutations in these probands are predicted to disrupt a protein involved in regulating synaptic transmission, and there is a significant enrichment of de novo mutations in genes in this pathway in the entire cohort as well. These findings emphasize an important role for synaptic dysregulation in epileptic encephalopathies, above and beyond that caused by ion channel dysfunction.

12.
13.
Two de novo protein design frameworks are applied to the discovery of new compstatin variants. One is based on sequence selection and fold specificity, whereas the other approach is based on sequence selection and approximate binding affinity calculations. The proposed frameworks were applied to a complex of C3c with compstatin variant E1 and new variants with improved binding affinities are predicted and experimentally validated. The computational studies elucidated key positions in the sequence of compstatin that greatly affect the binding affinity. Positions 4 and 13 were found to favor Trp, whereas positions 1, 9, and 10 are dominated by Asn, and position 11 consists mainly of Gln. A structural analysis of the C3c-bound peptide analogs is presented.

14.
De novo mutations affect risk for many diseases and disorders, especially those with early onset. An example is autism spectrum disorders (ASD). Four recent whole-exome sequencing (WES) studies of ASD families revealed a handful of novel risk genes, based on independent de novo loss-of-function (LoF) mutations falling in the same gene, and found that de novo LoF mutations occurred at a twofold higher rate than expected by chance. However successful these studies were, they used only a small fraction of the data, excluding other types of de novo mutations and inherited rare variants. Moreover, such analyses cannot readily incorporate data from case-control studies. An important research challenge in gene discovery, therefore, is to develop statistical methods that accommodate a broader class of rare variation. We develop methods that can incorporate WES data regarding de novo mutations, inherited variants present, and variants identified within cases and controls. TADA, for Transmission And De novo Association, integrates these data by a gene-based likelihood model involving parameters for allele frequencies and gene-specific penetrances. Inference is based on a Hierarchical Bayes strategy that borrows information across all genes to infer parameters that would be difficult to estimate for individual genes. In addition to theoretical development, we validated TADA using realistic simulations mimicking rare, large-effect mutations affecting risk for ASD and show it has dramatically better power than other common methods of analysis. Thus TADA's integration of various kinds of WES data can be a highly effective means of identifying novel risk genes. Indeed, application of TADA to WES data from subjects with ASD and their families, as well as from a study of ASD subjects and controls, revealed several novel and promising ASD candidate genes with strong statistical support.

15.
Ticks and other arthropods are often hosts to nutrient-providing bacterial endosymbionts, which contribute to their host's fitness by supplying nutrients such as vitamins and amino acids. Our lab has found that Ixodes pacificus is host to Rickettsia species phylotype G021. This endosymbiont is predominantly present, and 100% maternally transmitted, in I. pacificus. To study the roles of phylotype G021 in I. pacificus, bioinformatic and molecular approaches were carried out. MUMmer genome alignments of the whole-genome sequence of I. scapularis, a close relative of I. pacificus, against completely sequenced genomes of R. bellii OSU85-389, R. conorii, and R. felis, identified 8,190 unique sequences that are homologous to Rickettsia sequences in the NCBI Trace Archive. MetaCyc metabolic reconstructions revealed that all folate gene orthologues (folA, folC, folE, folKP, ptpS) required for de novo folate biosynthesis are present in the genome of Rickettsia buchneri in I. scapularis. To examine the metabolic capability of phylotype G021 in I. pacificus, genes of the folate biosynthesis pathway of the bacterium were PCR amplified using degenerate primers. BLAST searches identified that nucleotide sequences of the folA, folC, folE, folKP, and ptpS genes possess 98.6%, 98.8%, 98.9%, 98.5% and 99.0% identity, respectively, to the corresponding genes of Rickettsia buchneri. Phylogenetic tree constructions show that the folate genes of phylotype G021 and homologous genes from various Rickettsia species are monophyletic. This study has shown that all folate genes exist in the genome of Rickettsia species phylotype G021 and that this bacterium has the genetic capability for de novo folate synthesis.

16.
It is widely assumed that new proteins are created by duplication, fusion, or fission of existing coding sequences. Another mechanism of protein birth is provided by overlapping genes. They are created de novo by mutations within a coding sequence that lead to the expression of a novel protein in another reading frame, a process called “overprinting.” To investigate this mechanism, we have analyzed the sequences of the protein products of manually curated overlapping genes from 43 genera of unspliced RNA viruses infecting eukaryotes. Overlapping proteins have a sequence composition globally biased toward disorder-promoting amino acids and are predicted to contain significantly more structural disorder than nonoverlapping proteins. By analyzing the phylogenetic distribution of overlapping proteins, we were able to confirm that 17 of these had been created de novo and to study them individually. Most proteins created de novo are orphans (i.e., restricted to one species or genus). Almost all are accessory proteins that play a role in viral pathogenicity or spread, rather than proteins central to viral replication or structure. Most proteins created de novo are predicted to be fully disordered and have a highly unusual sequence composition. This suggests that some viral overlapping reading frames encoding hypothetical proteins with highly biased composition, often discarded as noncoding, might in fact encode proteins. Some proteins created de novo are predicted to be ordered, however, and whenever a three-dimensional structure of such a protein has been solved, it corresponds to a fold previously unobserved, suggesting that the study of these proteins could enhance our knowledge of protein space.

Since their discovery (76), overlapping genes, i.e., DNA sequences simultaneously encoding two or more proteins in different reading frames, have exerted a fascination on evolutionary biologists. Among several mechanisms, they can be created by a process called “overprinting” (43), in which a DNA sequence originally encoding only one protein undergoes a genetic modification leading to the expression of a second reading frame in addition to the first one (Fig. 1). The resulting overlap encodes an ancestral, “overprinted” protein region and a protein region created de novo (i.e., not by duplication) called an “overprinting” or “novel” region (Fig. 1). At present, it is widely thought that the creation of proteins de novo is very rare, contrary to their emergence by gene duplication, which is thought to be the major factor (for reviews, see references 55 and 94). However, this belief might actually reflect the fact that proteins created de novo are in general very difficult to identify (55). Indeed, a long-standing question is whether a protein that has no detectable homolog in other organisms (called an “orphan” protein or “ORFan” [27] or “taxonomically restricted” [110]) represents a protein created de novo in a particular organism or merely a protein that is a member of a larger family whose other members have diverged beyond recognition or have become extinct (115). Proteins created de novo by overprinting provide a valuable opportunity to address these questions, and this constitutes one of the two strands of our study.

FIG. 1. Creation of a novel protein region (C-terminal extension) by overprinting. Top, a DNA sequence encodes two proteins in different reading frames. Notice the potential, unused stop codon downstream of protein X. Middle, a mutation abolishes the stop codon of protein X, causing its elongation (“overprinting”) to the preexisting stop codon. This results in a gene overlap. Bottom, the overlap encodes an overprinted (ancestral) protein region (dark gray) and an overprinting (novel) protein region (light gray).

Practically all studies of overlapping genes have been focused on evolutionary constraints and informational characteristics at the DNA level (see, e.g., references 46, 71, 75, 84, 85, and 114). However, very little has been done to assess potential effects of the overlap on the corresponding protein products. Two studies reported that overlapping proteins are enriched in amino acids with a high codon degeneracy (arginine, leucine, and serine) (68) and that they often simultaneously encode a cluster of basic amino acids in one frame and a stretch of acidic amino acids in the other frame (66).

The other strand of the present study is based on earlier observations of the overlapping gene set of measles virus (41), which suggested that protein regions encoded by overlapping genes might have a propensity toward structural disorder.

Structural disorder is an essential state of numerous proteins, in which it is associated mostly with signaling and regulation roles (21, 96, 111). The key feature of intrinsically disordered proteins (also called “unstructured” or “natively unfolded”) is that under physiological conditions, instead of a particular three-dimensional (3D) structure, they adopt ensembles of rapidly interconverting structural forms. Different degrees of disorder exist, from random coils to molten globules (100), and some disordered regions can become ordered under certain conditions (21, 96, 117). A variety of computer programs have been developed to predict these regions (19, 23, 101). Each predictor typically differs in what kind of “disorder” it identifies (23, 78), matching only some of the types of disorder mentioned above. Therefore, in order to choose a proper predictor, it was necessary to define precisely what kind of structural disorder we expected to find in proteins encoded by overlapping genes.

At least two nonexclusive hypotheses can explain why overlapping genes might encode disordered proteins: (i) the newly created (overprinting) protein of each overlap might tend to be disordered, and (ii) structural disorder in proteins encoded by overlapping genes might alleviate evolutionary constraints imposed on their sequence by the overlap. These hypotheses are clarified below.

Intuitively, the conditions required for a protein to fold into a stable 3D configuration, including sequence composition, periodicity, and complexity, are such that structurally ordered proteins represent a vanishingly small fraction of all possible amino acid sequences. Indeed, proteins artificially created from random nucleotide sequences generally have a low secondary structure content (107, 112). Hence our first hypothesis: novel, overprinting proteins are not expected to have a fixed 3D structure at birth, given the low probability of generating structure from a completely new sequence.

Disordered proteins are generally subject to less structural constraint than ordered ones (13). Hence our second hypothesis: the presence of disorder in one or both products of an overlapping gene pair could greatly alleviate evolutionary constraints imposed by the overlap, allowing both protein products to scan a wider sequence space without losing their function.

Both hypotheses suppose only the lack of a rigid structure, as opposed to a total lack of structure (e.g., some proteins created de novo from a random nucleotide sequence, though lacking secondary structure, have a certain degree of order [112]). For that reason, in this work, we use the widest possible definition of disorder, i.e., the lack of a rigid 3D structure, and we use a program whose predictions of disorder correspond to this definition, PONDR VSL2 (69) (see Results).

In this work, we collected a large number of experimentally proven cases of proteins encoded by overlapping genes in unspliced eukaryotic RNA viruses and analyzed their sequence properties.
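
The elongation step described in the legend of Fig. 1 can be illustrated with a toy example. The sketch below is not taken from the study: the DNA sequence, the function name `translate`, and the chosen substitution are all hypothetical, and only the standard genetic code is assumed. It translates a short sequence up to its first stop codon, then shows how a single substitution that abolishes that stop codon extends the protein to the next in-frame stop codon.

```python
# Toy illustration of overprinting (hypothetical sequence, not data from the study).
BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ..., GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate dna from offset `frame` until the first stop codon or the end."""
    protein = []
    for pos in range(frame, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":  # a stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical ancestral gene: ATG GCT TAA, followed by an unused stop codon (TGA).
ancestral = "ATGGCTTAAGGCCATTGA"
# A single T->C substitution turns the stop codon TAA into CAA (glutamine).
mutant = ancestral[:6] + "C" + ancestral[7:]

print(translate(ancestral))  # protein before the mutation
print(translate(mutant))     # elongated protein after the stop codon is lost
```

In this toy case the residues gained after the substitution stand in for the novel region; in a real overlap, the extended frame would run across a second gene that is read in a different frame, which `translate(dna, frame=1)` or `frame=2` can be used to inspect.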

17.
Most living organisms can synthesize inosinate from 5-phosphoribosyl 1-pyrophosphate in the de novo purine biosynthesis pathway, which is basically composed of 10 reaction steps. Phosphoribosylglycinamide synthetase (GARS) catalyzes the second step of the pathway. We found that the enzyme shows weak, but significant, sequence similarity to phosphoribosylglycinamide formyltransferase 2 (GART2) and the ATPase domain of phosphoribosylaminoimidazole carboxylase (AIRCA), which catalyze the third and sixth steps of the pathway, respectively. In addition, the three enzymes were similar in amino acid sequence to biotin carboxylase (BC) and carbamoylphosphate synthetase (CPS), which are members of the GS ADP-forming family. This family has been identified through a tertiary structure comparison and includes glutathione synthetase, D-alanine:D-alanine ligase, BC, succinyl-CoA synthetase β-chain, and phosphoribosylaminoimidazole-succinocarboxamide synthase. Molecular phylogenetic analysis based on a multiple alignment of GARS, GART2, AIRCA, BC, and CPS suggests that GART2 is more closely related to AIRCA than to GARS among the three enzymes from the pathway, though the three enzymes are relatively close to each other within the GS ADP-forming family. Moreover, the analysis showed that archaeal GARS had diverged before the speciation between bacteria and eucarya. Received: 3 June 1998 / Accepted: 8 September 1998

18.
19.
Insertions of the yeast element Ty3 resulting from induced retrotransposition were characterized in order to identify the genomic targets of transposition. The DNA sequences of the junctions between Ty3 and flanking DNA were determined for two insertions of an unmarked element. Each insertion was at position -17 from the 5' end of a tRNA-coding sequence. Ninety-one independent insertions of a marked Ty3 element were studied by Southern blot analysis. Pairs of independent insertions into seven genomic loci accounted for 14 of these insertions. The DNA sequence flanking the insertion site was determined for at least one member of each pair of integrated elements. In each case, insertion was at position -16 or -17 relative to the 5' end of one of seven different tRNA genes. This proportion of genomic loci used twice for Ty3 integration is consistent with that predicted by a Poisson distribution for a number of genomic targets roughly equivalent to the estimated number of yeast tRNA genes. In addition, insertions upstream of the same tRNA gene in one case were at different positions, but in all cases were in the same orientation. Thus, genomic insertions of Ty3 in a particular orientation are apparently specified by the target, while the actual position of the insertion relative to the tRNA-coding sequence can vary slightly.
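
The Poisson argument in this abstract can be made concrete with a small calculation. The sketch below is an illustration rather than the authors' analysis: it assumes every target is equally likely to receive an insertion, and the function name `expected_double_hits` and the candidate target counts are hypothetical. If 91 insertions are spread over N targets, the number of insertions per target is approximately Poisson with mean λ = 91/N, so the expected number of targets hit exactly twice is N · e^(−λ) · λ²/2.

```python
import math


def expected_double_hits(n_insertions: int, n_targets: int) -> float:
    """Expected number of targets hit exactly twice under a Poisson model."""
    lam = n_insertions / n_targets  # mean number of insertions per target
    p_two = math.exp(-lam) * lam ** 2 / 2  # Poisson probability of k = 2
    return n_targets * p_two


# 91 insertions were observed; the target counts below are hypothetical values
# chosen only to show how the expectation changes with the number of targets.
for n_targets in (100, 200, 300, 400, 500):
    print(n_targets, round(expected_double_hits(91, n_targets), 2))
```

Comparing these expectations with the seven loci observed to be used twice indicates which target counts are compatible with the data under this simplified model.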

20.