Similar articles
1.
The genome sequences of new viruses often encode many “orphan” or “taxon-specific” proteins apparently lacking homologs. However, because viral proteins evolve very fast, commonly used sequence similarity detection methods such as BLAST may overlook homologs. We analyzed a data set of proteins from RNA viruses characterized as “genus specific” by BLAST. More powerful methods developed recently, such as HHblits or HHpred (available through web-based, user-friendly interfaces), could detect distant homologs of a quarter of these proteins, suggesting that these methods should be used to annotate viral genomes. In-depth manual analyses of a subset of the remaining sequences, guided by contextual information such as taxonomy, gene order, or domain co-occurrence, identified distant homologs of another third. Thus, a combination of powerful automated methods and manual analyses can uncover distant homologs of many proteins thought to be orphans. We expect these methodological results to also be applicable to cellular organisms, since they generally evolve much more slowly than RNA viruses. As an application, we reanalyzed the genome of a bee pathogen, Chronic bee paralysis virus (CBPV). We could identify homologs of most of its proteins thought to be orphans; in each case, identifying homologs provided functional clues. We discovered that CBPV encodes a domain homologous to the Alphavirus methyltransferase-guanylyltransferase; a putative membrane protein, SP24, with homologs in unrelated insect viruses and insect-transmitted plant viruses having different morphologies (cileviruses, higreviruses, blunerviruses, negeviruses); and a putative virion glycoprotein, ORF2, also found in negeviruses. SP24 and ORF2 are probably major structural components of the virions.

2.
Advances in sequencing and detection technology over the past two decades, highlighted by the data explosion brought about by the human genome project, have transformed what was previously assumed to be a relatively simple genetic landscape into a new picture where the so-called “dark matter” of the genome has stolen the spotlight from the not-so-hip protein-coding genes. The simplified central dogma of molecular biology, in which a gene encodes a protein via a messenger RNA (mRNA), is still at the core of genetics but is now caught in a much more complex web of regulation by the genomic region previously known as “junk” DNA. Books such as Non-coding RNAs and epigenetic regulation of gene expression, published by Caister Academic Press, become essential guides to help us understand the current status of the very fast-paced field of RNA research, which has only just started to uncover the roles of non-coding RNAs (ncRNAs) in the regulation of gene expression.

3.
The melting of base pairs is a ubiquitous feature of RNA structural transitions, which are widely used to sense and respond to cellular stimuli. A recent study employing solution nuclear magnetic resonance (NMR) imino proton exchange spectroscopy provides a rare base-pair-specific view of duplex melting in the Salmonella FourU RNA thermosensor, which regulates gene expression in response to changes in temperature at the translational level by undergoing a melting transition. The authors observe “microscopic” enthalpy–entropy compensation (often seen “macroscopically” across a series of related molecular species) across base pairs within the same RNA. This yields variations in base-pair stabilities that are an order of magnitude smaller than corresponding variations in enthalpy and entropy. A surprising yet convincing link is established between the slopes of enthalpy–entropy correlations and RNA melting points determined by circular dichroism (CD), which argues that unfolding occurs when base-pair stabilities are equalized. A single AG-to-CG mutation, which enhances the macroscopic hairpin thermostability and folding cooperativity and renders the RNA thermometer inactive in vivo, spreads its effect microscopically throughout all base pairs in the RNA, including ones far removed from the site of mutation. The authors suggest that an extended network of hydration underlies this long-range communication. This study suggests that the deconstruction of macroscopic RNA unfolding in terms of microscopic unfolding events will require careful consideration of water interactions.

4.

Background

Only a small fraction of the mosquito species of the genus Anopheles are able to transmit malaria, one of the biggest killer diseases of poverty, which is mostly prevalent in the tropics. This diversity has genetic, yet unknown, causes. In a further attempt to contribute to the elucidation of these variances, the international “Anopheles Genomes Cluster Consortium” project (a.k.a. “16 Anopheles genomes project”) was established, aiming at a comprehensive genomic analysis of several anopheline species, most of which are malaria vectors. Within the international consortium carrying out this project, our team studied the genes encoding families of non-coding RNAs (ncRNAs), concentrating on four classes: microRNA (miRNA), ribosomal RNA (rRNA), small nuclear RNA (snRNA), in particular small nucleolar RNA (snoRNA), and, finally, transfer RNA (tRNA).

Results

Our analysis was carried out using, exclusively, computational approaches, and evaluating both the primary NGS reads as well as the respective genome assemblies produced by the consortium and stored in VectorBase; moreover, the results of RNAseq surveys in cases in which these were available and meaningful were also accessed in order to obtain supplementary data, as were “pre-genomic era” sequence data stored in nucleic acid databases. The investigation included the identification and analysis, in most species studied, of ncRNA genes belonging to several families, as well as the analysis of the evolutionary relations of some of those genes in cross-comparisons to other members of the genus Anopheles.

Conclusions

Our study led to the identification of members of these gene families in the majority of twenty different anopheline taxa. A set of tools for the study of the evolution and molecular biology of important disease vectors has, thus, been obtained.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1038) contains supplementary material, which is available to authorized users.

5.
Much of what is known about word recognition in toddlers comes from eye-tracking studies. Here we show that the speed and facility with which children recognize words, as revealed in such studies, cannot be attributed to a task-specific, closed-set strategy; rather, children’s gaze to referents of spoken nouns reflects successful search of the lexicon. Toddlers’ spoken word comprehension was examined in the context of pictures that had two possible names (such as a cup of juice which could be called “cup” or “juice”) and pictures that had only one likely name for toddlers (such as “apple”), using a visual world eye-tracking task and a picture-labeling task (n = 77, mean age, 21 months). Toddlers were just as fast and accurate in fixating named pictures with two likely names as pictures with one. If toddlers do name pictures to themselves, the name provides no apparent benefit in word recognition, because there is no cost to understanding an alternative lexical construal of the picture. In toddlers, as in adults, spoken words rapidly evoke their referents.

6.
7.
The Solanaceae or “nightshade” family is an economically important group with remarkable diversity. To gain a better understanding of how the unique biology of the Solanaceae relates to the family’s small RNA (sRNA) genomic landscape, we downloaded over 255 publicly available sRNA data sets that comprise over 2.6 billion reads of sequence data. We applied a suite of computational tools to predict and annotate two major sRNA classes: (1) microRNAs (miRNAs), typically 20- to 22-nucleotide (nt) RNAs generated from a hairpin precursor and functioning in gene silencing and (2) short interfering RNAs (siRNAs), including 24-nt heterochromatic siRNAs typically functioning to repress repetitive regions of the genome via RNA-directed DNA methylation, as well as secondary phased siRNAs and trans-acting siRNAs generated via miRNA-directed cleavage of a polymerase II-derived RNA precursor. Our analyses described thousands of sRNA loci, including poorly understood clusters of 22-nt siRNAs that accumulate during viral infection. The birth, death, expansion, and contraction of these sRNA loci are dynamic evolutionary processes that characterize the Solanaceae family. These analyses indicate that individuals within the same genus share similar sRNA landscapes, whereas comparisons between distinct genera within the Solanaceae reveal relatively few commonalities.

Analysis of over 255 publicly available small RNA data sets enabled characterization of the small RNA landscape for the Solanaceae family.

8.
Regulatory networks play a central role in cellular behavior and decision making. Learning these regulatory networks is a major task in biology, and devising computational methods and mathematical models for this task is a major endeavor in bioinformatics. Boolean networks have been used extensively for modeling regulatory networks. In this model, the state of each gene can be either ‘on’ or ‘off’, and the next state of a gene is updated, synchronously or asynchronously, according to a Boolean rule that is applied to the current state of the entire system. Inferring a Boolean network from a set of experimental data entails two main steps: first, the experimental time-series data are discretized into Boolean trajectories, and then, a Boolean network is learned from these Boolean trajectories. In this paper, we consider three methods for data discretization, including a new one we propose, and three methods for learning Boolean networks, and study the performance of all nine possible combinations on four regulatory systems of varying dynamic complexity. We find that employing the right combination of methods for data discretization and network learning results in Boolean networks that capture the dynamics well and provide predictive power. Our findings are in contrast to a recent survey that placed Boolean networks on the low end of the “faithfulness to biological reality” and “ability to model dynamics” spectra. Further, contrary to the common argument in favor of Boolean networks, we find that a relatively large number of time points in the time-series data is required to learn good Boolean networks for certain data sets. Last but not least, while methods have been proposed for inferring Boolean networks, as discussed above, publicly available implementations of them are still missing. Here, we make our implementation of the methods publicly available as open source at http://bioinfo.cs.rice.edu/.
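As a rough illustration of the ideas in the abstract above, the sketch below binarizes a time series and runs a synchronous Boolean update. It is not the authors' implementation: the mean-threshold discretization is only one simple choice assumed here, and the three-gene rule set `RULES` is entirely hypothetical.

```python
from statistics import mean


def discretize(series):
    """Binarize one gene's time series: 1 where the value exceeds the series mean.

    Mean thresholding is an assumption made for this sketch, not necessarily
    one of the discretization methods studied in the paper.
    """
    threshold = mean(series)
    return [1 if x > threshold else 0 for x in series]


def step(state, rules):
    """Synchronous update: every gene's next state is computed from the same current state."""
    return tuple(int(rule(state)) for rule in rules)


def trajectory(state, rules, n_steps):
    """Return the list of states visited, starting from `state`."""
    states = [state]
    for _ in range(n_steps):
        state = step(state, rules)
        states.append(state)
    return states


# Hypothetical three-gene network, one Boolean rule per gene.
RULES = (
    lambda s: not s[2],       # gene 0 is on unless gene 2 is on
    lambda s: s[0],           # gene 1 copies gene 0
    lambda s: s[0] and s[1],  # gene 2 needs both gene 0 and gene 1
)

if __name__ == "__main__":
    print(discretize([0.1, 0.2, 0.9, 0.8]))
    for s in trajectory((1, 0, 0), RULES, 3):
        print(s)
```

Learning a network would run in the opposite direction: given Boolean trajectories such as those produced by `discretize`, search for rules whose `step` output reproduces the observed transitions.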

9.
Phylogenomic analysis of hundreds of protein-coding genes, aimed at resolving phylogenetic relationships, is now common practice. However, no software currently exists that includes tools for dataset construction and subsequent analysis with diverse validation strategies to assess robustness. Furthermore, there are no publicly available high-quality curated databases designed to assess deep (>100 million years) relationships in the tree of eukaryotes. To address these issues, we developed an easy-to-use software package, PhyloFisher (https://github.com/TheBrownLab/PhyloFisher), written in Python 3. PhyloFisher includes a manually curated database of 240 protein-coding genes from 304 eukaryotic taxa covering known eukaryotic diversity, a novel tool for ortholog selection, and utilities that will perform diverse analyses required by state-of-the-art phylogenomic investigations. Through phylogenetic reconstructions of the tree of eukaryotes and of the Saccharomycetaceae clade of budding yeasts, we demonstrate the utility of the PhyloFisher workflow and the provided starting database to address phylogenetic questions across a large range of evolutionary time points for diverse groups of organisms. We also demonstrate that undetected paralogy can remain in phylogenomic “single-copy orthogroup” datasets constructed using widely accepted methods such as all vs. all BLAST searches followed by Markov Cluster Algorithm (MCL) clustering and application of automated tree pruning algorithms. Finally, we show how the PhyloFisher workflow helps detect inadvertent paralog inclusions, allowing the user to make more informed decisions regarding orthology assignments, leading to a more accurate final dataset.

This article presents PhyloFisher, a community-driven tool for phylogenomic dataset construction to infer deep and shallow phylogenetic relationships among eukaryotes.

10.
Phylogenomics of prokaryotic ribosomal proteins
Yutin N, Puigbò P, Koonin EV, Wolf YI. PLoS ONE. 2012;7(5):e36972.
Archaeal and bacterial ribosomes contain more than 50 proteins, including 34 that are universally conserved in the three domains of cellular life (bacteria, archaea, and eukaryotes). Despite the high sequence conservation, annotation of ribosomal (r-) protein genes is often difficult because of their short lengths and biased sequence composition. We developed an automated computational pipeline for identification of r-protein genes and applied it to 995 completely sequenced bacterial and 87 archaeal genomes available in the RefSeq database. The pipeline employs curated seed alignments of r-proteins to run position-specific scoring matrix (PSSM)-based BLAST searches against six-frame genome translations, mitigating possible gene annotation errors. As a result of this analysis, we performed a census of prokaryotic r-protein complements, enumerated missing and paralogous r-proteins, and analyzed the distributions of ribosomal protein genes among chromosomal partitions. Phyletic patterns of bacterial and archaeal r-protein genes were mapped to phylogenetic trees reconstructed from concatenated alignments of r-proteins to reveal the history of likely multiple independent gains and losses. These alignments, available for download, can be used as search profiles to improve genome annotation of r-proteins and for further comparative genomics studies.
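The abstract above searches against six-frame genome translations so that hits do not depend on existing gene annotations. A minimal sketch of what a six-frame translation is follows; it is not the authors' pipeline. It assumes the standard codon assignments (which bacterial and archaeal genomes share apart from alternative start codons), and the function names are illustrative.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate all three offsets on both strands, keyed '+1'..'+3' and '-1'..'-3'."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translations("ATGGCC").items():
        print(name, protein)
```

A protein-profile search can then be run against all six translated strings, so a gene missed or mis-called during annotation can still be found.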

11.
The hepatitis C virus (HCV) NS5b protein is an RNA-dependent RNA polymerase essential for replication of the viral RNA genome. In vitro and presumably in vivo, NS5b initiates RNA synthesis by a de novo mechanism. Different structural elements of NS5b have been reported to participate in RNA synthesis, especially a so-called “β-flap” and a C-terminal segment (designated “linker”) that connects the catalytic core of NS5b to a transmembrane anchor. High concentrations of GTP have also been shown to stimulate de novo RNA synthesis by HCV NS5b. Here we describe a combined structural and functional analysis of genotype 1 HCV-NS5b of strains H77 (subtype 1a), for which no structure has been previously reported, and J4 (subtype 1b). Our results highlight the linker as directly involved in lifting the first boundary to processive RNA synthesis, the formation of the first dinucleotide primer. The transition from this first dinucleotide primer state to processive RNA synthesis requires removal of the linker and of the β-flap with which it is shown to strongly interact in crystal structures of HCV NS5b. We find that GTP specifically stimulates this transition irrespective of its incorporation in neosynthesized RNA.

12.
There are two important problems in the assembly of small, icosahedral RNA viruses. First, how does the capsid protein select the viral RNA for packaging, when there are so many other candidate RNA molecules available? Second, what is the mechanism of assembly? With regard to the first question, there are a number of cases where a particular RNA sequence or structure (often one or more stem-loops) either promotes assembly or is required for assembly, but there are others where specific packaging signals are apparently not required. With regard to the assembly pathway, in those cases where stem-loops are involved, the first step is generally believed to be binding of the capsid proteins to these “fingers” of the RNA secondary structure. In the mature virus, the core of the RNA would then occupy the center of the viral particle, and the stem-loops would reach outward, towards the capsid, like stalagmites reaching up from the floor of a grotto towards the ceiling. Those viruses whose assembly does not depend on protein binding to stem-loops could have a different structure, with the core of the RNA lying just under the capsid, and the fingers reaching down into the interior of the virus, like stalactites. We review the literature on these alternative structures, focusing on RNA selectivity and the assembly mechanism, and we propose experiments aimed at determining, in a given virus, which of the two structures actually occurs.

13.

Background

A recessive mutation “c” in the Mexican axolotl, Ambystoma mexicanum, results in the failure of normal heart development. In homozygous recessive embryos, the hearts do not have organized myofibrils and fail to beat. In our previous studies, we identified a noncoding Myofibril-Inducing RNA (MIR) from axolotls which promotes myofibril formation and rescues heart development.

Results

We randomly cloned RNAs from fetal human heart. RNA from clone #291 promoted myofibril formation and induced heart development of mutant axolotls in organ culture. This RNA induced expression of cardiac markers in mutant hearts: tropomyosin, troponin and α-syntrophin. This cloned RNA matches in partial sequence alignment to human microRNA-499a and b, although it differs in length. We have concluded that this cloned RNA is unique in its length, but is still related to the microRNA-499 family. We have named this unique RNA, microRNA-499c. Thus, we will refer to this RNA derived from clone #291 as microRNA-499c throughout the rest of the paper.

Conclusions

This new form, microRNA-499c, plays an important role in cardiac development.

14.
In operant learning, behaviors are reinforced or inhibited in response to the consequences of similar actions taken in the past. However, because in natural environments the “same” situation never recurs, it is essential for the learner to decide what “similar” is so that they can generalize from experience in one state of the world to future actions in different states of the world. The computational principles underlying this generalization are poorly understood, in particular because natural environments are typically too complex to study quantitatively. In this paper we study the principles underlying generalization in operant learning of professional basketball players. In particular, we utilize detailed information about the spatial organization of shot locations to study how players adapt their attacking strategy in real time according to recent events in the game. To quantify this learning, we study how a make/miss from one location in the court affects the probabilities of shooting from different locations. We show that generalization is not a spatially local process, nor is it governed by the difficulty of the shot. Rather, to a first approximation, players use a simplified binary representation of the court into 2 pt and 3 pt zones. This result indicates that rather than using low-level features, generalization is determined by high-level cognitive processes that incorporate the abstract rules of the game.

15.
The mature embryos of rice seeds contain translatable mRNAs required for the initial phase of germination. To clarify the relationship between seed longevity and RNA integrity in embryos, germinability and stability of embryonic RNAs were analyzed using the seeds of japonica rice cultivars subjected to controlled deterioration treatment (CDT) or long periods of storage. Degradation of RNA from embryos of a japonica rice cultivar “Nipponbare” was induced by CDT before the decline of the germination rate and we observed a positive relationship between seed germinability and integrity of embryonic RNAs. Moreover, this relationship was confirmed in the experiments using aged seeds from the “Nipponbare”, “Sasanishiki” and “Koshihikari” rice cultivars. In addition, the RNA integrity number (RIN) values, calculated using electrophoresis data and Agilent Bioanalyzer software, had a positive correlation with germinability (R² = 0.75). Therefore, the stability of embryonic RNAs required for germination is involved in maintaining seed longevity over time and RIN values can serve as a quantitative indicator to evaluate germinability in rice.

16.
People have limited computational resources, yet they make complex strategic decisions over enormous spaces of possibilities. How do people efficiently search spaces with combinatorially branching paths? Here, we study players’ search strategies for a winning move in a “k-in-a-row” game. We find that players use scoring strategies to prune the search space and augment this pruning by a “shutter” heuristic that focuses the search on the paths emanating from their previous move. This strong pruning has its costs: both computational simulations and behavioral data indicate that the shutter size is correlated with players’ blindness to their opponent’s winning moves. However, simulations of the search while varying the shutter size, complexity levels, noise levels, branching factor, and computational limitations indicate that despite its costs, a narrow shutter strategy is the dominant strategy for most of the parameter space. Finally, we show that in the presence of computational limitations, the shutter heuristic enhances the performance of deep learning networks in these end-game scenarios. Together, our findings suggest a novel adaptive heuristic that benefits search in a vast space of possibilities of a strategic game.

17.
There is accumulating evidence that prior knowledge about expectations plays an important role in perception. The Bayesian framework is the standard computational approach to explain how prior knowledge about the distribution of expected stimuli is incorporated with noisy observations in order to improve performance. However, it is unclear what information about the prior distribution is acquired by the perceptual system over short periods of time and how this information is utilized in the process of perceptual decision making. Here we address this question using a simple two-tone discrimination task. We find that the “contraction bias”, in which small magnitudes are overestimated and large magnitudes are underestimated, dominates the pattern of responses of human participants. This contraction bias is consistent with the Bayesian hypothesis in which the true prior information is available to the decision-maker. However, a trial-by-trial analysis of the pattern of responses reveals that the contribution of most recent trials to performance is overweighted compared with the predictions of a standard Bayesian model. Moreover, we study participants’ performance in atypical distributions of stimuli and demonstrate substantial deviations from the ideal Bayesian detector, suggesting that the brain utilizes a heuristic approximation of the Bayesian inference. We propose a biologically plausible model, in which the decision in the two-tone discrimination task is based on a comparison between the second tone and an exponentially decaying average of the first tone and past tones. We show that this model accounts for both the contraction bias and the deviations from the ideal Bayesian detector hypothesis. These findings demonstrate the power of Bayesian-like heuristics in the brain, as well as their limitations in their failure to fully adapt to novel environments.
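The model proposed in the abstract above compares the second tone with an exponentially decaying average of the current and past first tones. The sketch below is one plausible reading of that description, not the authors' model: the update rule, the decay parameter `eta`, the absence of sensory noise, and the example frequencies are all assumptions made for illustration.

```python
def respond_first_higher(trials, eta):
    """Simulate noiseless decisions in a two-tone discrimination task.

    trials: list of (first_tone, second_tone) magnitudes.
    eta:    weight kept on the old memory trace (0 = no history, near 1 = long history).
    Returns, per trial, True if the model reports "first tone higher".
    """
    memory = None
    responses = []
    for first, second in trials:
        # The memory trace mixes the current first tone with earlier first tones.
        if memory is None:
            memory = first
        else:
            memory = eta * memory + (1 - eta) * first
        # The decision compares the second tone with the memory trace,
        # not with the first tone itself.
        responses.append(memory > second)
    return responses


if __name__ == "__main__":
    # Two high first tones followed by a trial whose first tone is low.
    trials = [(1000, 900), (1000, 900), (800, 850)]
    print(respond_first_higher(trials, eta=0.0))  # no history: follows the stimuli
    print(respond_first_higher(trials, eta=0.5))  # history pulls the low tone upward
```

With `eta = 0.5` the memory on the last trial sits midway between the earlier high tones and the current low tone, so the low first tone is effectively overestimated. This is the direction of the contraction bias described in the abstract.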

18.
RNA editing by adenosine deamination is particularly prevalent in the squid nervous system. We hypothesized that the squid editing enzyme might contain structural differences that help explain this phenomenon. As a first step, a squid adenosine deaminase that acts on RNA (sqADAR2a) cDNA and the gene that encodes it were cloned from the giant axon system. PCR and RNase protection assays showed that a splice variant of this clone (sqADAR2b) was also expressed in this tissue. Both versions are homologous to the vertebrate ADAR2 family. sqADAR2b encodes a conventional ADAR2 family member with an evolutionarily conserved deaminase domain and two double-stranded RNA binding domains (dsRBD). sqADAR2a differs from sqADAR2b by containing an optional exon that encodes an “extra” dsRBD. Both splice variants are expressed at comparable levels and are extensively edited, each in a unique pattern. Recombinant sqADAR2a and sqADAR2b, produced in Pichia pastoris, are both active on duplex RNA. Using a standard 48-h protein induction, both sqADAR2a and sqADAR2b exhibit promiscuous self-editing; however, this activity is particularly robust for sqADAR2a. By decreasing the induction time to 16 h, self-editing was mostly eliminated. We next tested the ability of sqADAR2a and sqADAR2b to edit two K⁺ channel mRNAs in vitro. Both substrates are known to be edited in squid. For each mRNA, sqADAR2a edited many more sites than sqADAR2b. These data suggest that the “extra” dsRBD confers high activity on sqADAR2a.

19.
The proteome of the amoebo-flagellate protozoan Naegleria gruberi is rich in candidate RNA repair enzymes, including 15 putative RNA ligases, one of which, NgrRnl, is a eukaryal homolog of Deinococcus radiodurans RNA ligase, DraRnl. Here we report that purified recombinant NgrRnl seals nicked 3′-OH/5′-PO4 duplexes in which the 3′-OH strand is RNA. It does so via the “classic” ligase pathway, entailing reaction with ATP to form a covalent NgrRnl–AMP intermediate, transfer of AMP to the nick 5′-PO4, and attack of the RNA 3′-OH on the adenylylated nick to form a 3′–5′ phosphodiester. Unlike members of the four known families of ATP-dependent RNA ligases, NgrRnl lacks a carboxy-terminal appendage to its nucleotidyltransferase domain. Instead, it contains a defining amino-terminal domain that we show is important for 3′-OH/5′-PO4 nick-sealing and ligase adenylylation, but dispensable for phosphodiester synthesis at a preadenylylated nick. We propose that NgrRnl, DraRnl, and their homologs from diverse bacteria, viruses, and unicellular eukarya comprise a new “Rnl5 family” of nick-sealing ligases with a signature domain organization.

20.