Similar articles
1.
Novel genomes are today often annotated by small consortia or individuals whose background is not in bioinformatics. This audience requires tools that are easy to use. Such need has been addressed by several genome annotation tools and pipelines. Visualizing the resulting annotation is a crucial step of quality control. The UCSC Genome Browser is a powerful and popular genome visualization tool. Assembly Hubs, which can be hosted on any publicly available web server, allow browsing genomes via UCSC Genome Browser servers. The steps for creating custom Assembly Hubs are well documented and the required tools are publicly available. However, the number of steps for creating a novel Assembly Hub is large. In some cases, the format of input files needs to be adapted, which is a difficult task for scientists without programming background. Here, we describe MakeHub, a novel command line tool that generates Assembly Hubs for the UCSC Genome Browser in a fully automated fashion. The pipeline also allows extending previously created Hubs by additional tracks. MakeHub is freely available for downloading at https://github.com/Gaius-Augustus/MakeHub.

2.

Background

Small RNA sequencing is commonly used to identify novel miRNAs and to determine their expression levels in plants. There are several miRNA identification tools for animals such as miRDeep, miRDeep2 and miRDeep*. miRDeep-P was developed to identify plant miRNA using miRDeep’s probabilistic model of miRNA biogenesis, but it depends on several third party tools and lacks a user-friendly interface. The objective of our miRPlant program is to predict novel plant miRNA, while providing a user-friendly interface with improved accuracy of prediction.

Results

We have developed a user-friendly plant miRNA prediction tool called miRPlant. Using 16 plant miRNA datasets from four different plant species, we show that miRPlant has at least a 10% improvement in accuracy compared to miRDeep-P, which is the most popular plant miRNA prediction tool. Furthermore, miRPlant uses a Graphical User Interface for data input and output, and identified miRNAs are shown with all RNAseq reads in a hairpin diagram.

Conclusions

We have developed miRPlant, which extends miRDeep* to various plant species by adopting suitable strategies to identify hairpin excision regions and hairpin structure filtering for plants. miRPlant does not require any third party tools such as mapping or RNA secondary structure prediction tools. miRPlant is also the first plant miRNA prediction tool that dynamically plots miRNA hairpin structure with small reads for identified novel miRNAs. This feature will enable biologists to visualize novel pre-miRNA structure and the location of small RNA reads relative to the hairpin. Moreover, miRPlant can be easily used by biologists with limited bioinformatics skills. miRPlant and its manual are freely available at http://www.australianprostatecentre.org/research/software/mirplant or http://sourceforge.net/projects/mirplant/.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-275) contains supplementary material, which is available to authorized users.

3.
4.
MicroRNAs (miRNAs) are a class of non-protein coding functional RNAs that are thought to regulate expression of target genes by direct interaction with mRNAs. miRNAs have been identified through both experimental and computational methods in a variety of eukaryotic organisms. Though these approaches have been partially successful, there is a need to develop more tools for detection of these RNAs, as they are also thought to be present in abundance in many genomes. In this report we describe a tool and a web server, named CID-miRNA, for identification of miRNA precursors in a given DNA sequence, utilising secondary structure-based filtering systems and an algorithm based on stochastic context free grammar trained on human miRNAs. Through its web interface, CID-miRNA analyses a given sequence for the presence of putative miRNA precursors, and the generated output lists all the potential regions that can form miRNA-like structures. In its stand-alone form, it can also scan large genomic sequences for the presence of potential miRNA precursors. The web server can be accessed at http://mirna.jnu.ac.in/cidmirna/.

5.
There are two important problems in the assembly of small, icosahedral RNA viruses. First, how does the capsid protein select the viral RNA for packaging, when there are so many other candidate RNA molecules available? Second, what is the mechanism of assembly? With regard to the first question, there are a number of cases where a particular RNA sequence or structure (often one or more stem-loops) either promotes assembly or is required for assembly, but there are others where specific packaging signals are apparently not required. With regard to the assembly pathway, in those cases where stem-loops are involved, the first step is generally believed to be binding of the capsid proteins to these “fingers” of the RNA secondary structure. In the mature virus, the core of the RNA would then occupy the center of the viral particle, and the stem-loops would reach outward, towards the capsid, like stalagmites reaching up from the floor of a grotto towards the ceiling. Those viruses whose assembly does not depend on protein binding to stem-loops could have a different structure, with the core of the RNA lying just under the capsid, and the fingers reaching down into the interior of the virus, like stalactites. We review the literature on these alternative structures, focusing on RNA selectivity and the assembly mechanism, and we propose experiments aimed at determining, in a given virus, which of the two structures actually occurs.

6.
RNAs play diverse roles in formation and function of subnuclear compartments, most of which are associated with active genes. NEAT1 and NEAT2/MALAT1 exemplify long non-coding RNAs (lncRNAs) known to function in nuclear bodies; however, we suggest that RNA biogenesis itself may underpin much nuclear compartmentalization. Recent studies show that active genes cluster with nuclear speckles on a genome-wide scale, significantly advancing earlier cytological evidence that speckles (aka SC-35 domains) are hubs of concentrated pre-mRNA metabolism. We propose the ‘karyotype to hub’ hypothesis to explain this organization: clustering of genes in the human karyotype may have evolved to facilitate the formation of efficient nuclear hubs, driven in part by the propensity of ribonucleoproteins (RNPs) to form large-scale condensates. The special capacity of highly repetitive RNAs to impact architecture is highlighted by recent findings that human satellite II RNA sequesters factors into abnormal nuclear bodies in disease, potentially co-opting a normal developmental mechanism.

7.
Abstract

Using primary and secondary structure information of an RNA molecule, the program RNA2D3D automatically and rapidly produces a first-order approximation of a 3-dimensional conformation consistent with this information. Applicable to structures of arbitrary branching complexity and pseudoknot content, it features efficient interactive graphical editing for the removal of any overlaps introduced by the initial generating procedure and for making conformational changes favorable to targeted features and subsequent refinement. With emphasis on fast exploration of alternative 3D conformations, one may interactively add or delete base-pairs, adjacent stems can be coaxially stacked or unstacked, single strands can be shaped to accommodate special constraints, and arbitrary subsets can be defined and manipulated as rigid bodies. Compaction, whereby base stacking within stems is optimally extended into connecting single strands, is also available as a means of strategically making the structures more compact and revealing folding motifs. Subsequent refinement of the first-order approximation, of modifications, and for the imposing of tertiary constraints is assisted with standard energy refinement techniques. Previously determined coordinates for any part of the molecule are readily incorporated, and any part of the modeled structure can be output as a PDB or XYZ file. Illustrative applications in the areas of ribozymes, viral kissing loops, viral internal ribosome entry sites, and nanobiology are presented.

8.
9.
Juvenile hormone (JH) plays a crucial role in preventing precocious metamorphosis and stimulating reproduction. Its hemolymph titer should therefore be under tight control. As a negative controller, juvenile hormone esterase (JHE) rapidly breaks down residual JH in the hemolymph during the last instar to induce larval-to-pupal metamorphosis. The whole genome of the diamondback moth (DBM), Plutella xylostella, has been annotated, and the annotation proposed 11 JHE candidates. Sequence analysis using conserved motifs commonly found in other JHEs proposed a putative JHE (Px004817). Px004817 (64.61 kDa, pI = 5.28) exhibited a characteristic JHE expression pattern, with a high peak at the early last instar, at which JHE enzyme activity was also at a maximal level. RNA interference of Px004817 reduced JHE activity and interrupted pupal development, with a significant increase in the larval period. This study identifies Px004817 as a JHE-like gene of P. xylostella.

10.
11.
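Complete sequence determination and analysis of the mitochondrial genome of Pieris rapae   Total citations: 2 (self-citations: 0, citations by others: 2)
Studies of complete mitochondrial genome sequences of butterflies and their molecular evolution are still scarce. In this study, the complete mitochondrial genome of the cabbage white butterfly, Pieris rapae Linnaeus, was sequenced by long PCR and primer walking and then preliminarily analyzed. The mitochondrial genome of P. rapae is 15 157 bp long and contains 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes and one non-coding control region, whose lengths are 11 196 bp, 1 474 bp, 2 093 bp and 393 bp, respectively. The arrangement of the 37 genes is largely the same as that reported for other butterflies. Ten pairs of genes overlap by a total of 59 bp, with individual overlaps ranging from 1 to 35 bp. There are 13 intergenic spacers totaling 120 bp and ranging from 1 to 46 bp in length; the largest spacer (46 bp) lies between the tRNAIle and tRNAGln genes. In addition, NJ and MP phylogenetic trees of 11 representative butterfly species were reconstructed from the amino acid sequences of the 13 protein-coding genes. In these trees, the swallowtails (including swallowtail and parnassian butterflies) form one major lineage, while the pierids, lycaenids and nymphalids (including nymphaline and acraeine butterflies) form another. The results do not support a monophyletic group consisting of Pieridae and Papilionidae (including swallowtails and parnassians); instead, they indicate that Pieridae, Lycaenidae and Nymphalidae together form a monophyletic group.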
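Overlap and spacer totals such as those reported in the abstract above are typically obtained by comparing the end of each gene with the start of the next. The following is a minimal Python sketch of that bookkeeping. The helper name `junction_stats` and the coordinates are hypothetical and serve only as illustration; they are not the actual Pieris rapae annotation, and the circular junction between the last and first gene is ignored.

```python
def junction_stats(genes):
    """Return (total_overlap_bp, total_spacer_bp) for genes on a linear map.

    genes: list of (name, start, end) tuples with 1-based inclusive coordinates.
    Only neighbouring genes (after sorting by start) are compared.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    overlap = 0
    spacer = 0
    for (_, _, prev_end), (_, next_start, _) in zip(ordered, ordered[1:]):
        # gap > 0: intergenic spacer; gap < 0: overlap; gap == 0: genes abut
        gap = next_start - prev_end - 1
        if gap > 0:
            spacer += gap
        elif gap < 0:
            overlap += -gap
    return overlap, spacer


# Illustrative coordinates only (assumed, not taken from the paper)
example = [("geneA", 1, 100), ("geneB", 98, 200), ("geneC", 247, 300)]
print(junction_stats(example))  # (3, 46)
```

In this toy input, geneA and geneB share positions 98 to 100 (3 bp), and 46 bp separate geneB from geneC.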

12.
RNA folding free energy change parameters are widely used to predict RNA secondary structure and to design RNA sequences. These parameters include terms for the folding free energies of helices and loops. Although the full set of parameters has only been traditionally available for the four common bases and backbone, it is well known that covalent modifications of nucleotides are widespread in natural RNAs. Covalent modifications are also widely used in engineered sequences. We recently derived a full set of nearest neighbor terms for RNA that includes N6-methyladenosine (m6A). In this work, we test the model using 98 optical melting experiments, matching duplexes with or without N6-methylation of A. Most experiments place RRACH, the consensus site of N6-methylation, in a variety of contexts, including helices, bulge loops, internal loops, dangling ends, and terminal mismatches. For matched sets of experiments that include either A or m6A in the same context, we find that the parameters for m6A are as accurate as those for A. Across all experiments, the root mean squared deviation between estimated and experimental free energy changes is 0.67 kcal/mol. We used the new experimental data to refine the set of nearest neighbor parameter terms for m6A. These parameters enable prediction of RNA secondary structures including m6A, which can be used to model how N6-methylation of A affects RNA structure.
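The root mean squared deviation quoted in the abstract above is the square root of the mean squared difference between estimated and experimental free energy changes. A minimal Python sketch of that calculation follows. The function name `rmsd` and the numbers in the usage example are assumptions made for illustration; they are not data from the study.

```python
import math


def rmsd(estimated, experimental):
    """Root mean squared deviation between paired values (e.g. in kcal/mol)."""
    if len(estimated) != len(experimental) or not estimated:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(est - exp) ** 2 for est, exp in zip(estimated, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up free energy changes, for illustration only
print(rmsd([-10.0, -8.0], [-9.0, -9.0]))  # 1.0
```

Both illustrative pairs differ by 1 kcal/mol, so the deviation is exactly 1.0.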

13.
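Li Xin, Li Kai, Li Yijia, Ma Lei. Chinese Journal of Bioinformatics (《生物信息学》), 2016, 14(3): 188-194
SeqMule automatically adjusts its parameters to the human genome or exome data it is given, and analyzes and annotates single nucleotide polymorphisms (SNPs) across all of the sequencing data. Objective: By analyzing experimental data from two gout patients, this paper introduces the SeqMule software in detail to bioinformatics researchers, with the aim of providing a one-stop analysis route for whole-genome and exome sequencing data. Methods: Using the alignment and analysis tools built into SeqMule, namely BWA (Burrows-Wheeler Aligner), GATK (The Genome Analysis Toolkit), SAMtools and FreeBayes, and taking the analysis of DNA sequencing data from two gout patients as an example, the paper describes the features and operation of SeqMule in detail and performs automated alignment and SNP analysis of the two patients' exome sequencing data. The authors found that SeqMule resolves a number of problems present in many analysis programs and enables comprehensive, flexible and efficient automated analysis of exome and whole-genome sequencing data. It handles high-throughput sequencing data better and ultimately improves the consistency and accuracy of the analysis.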

14.
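Sequencing and analysis of the complete mitochondrial genome of Gastrimargus marmoratus   Total citations: 2 (self-citations: 1, citations by others: 2)
Dang Jiangpeng, Liu Nian, Ye Wei, Huang Yuan. Acta Entomologica Sinica (《昆虫学报》), 2008, 51(7): 671-680
The complete mitochondrial genome of Gastrimargus marmoratus (Thunberg) was sequenced and annotated using long-distance PCR amplification and conserved-primer walking combined with cloning and sequencing. The complete mitochondrial genome of G. marmoratus is 15 904 bp long (GenBank accession no. EU527334), with an A+T content of 76.04%, slightly higher than that of the migratory locust Locusta migratoria. It comprises 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes and a 1 057 bp A+T-rich region. All protein-coding genes use the typical insect start codon ATN except COⅠ and ND5, which start with TTG. ND5 uses the incomplete stop codon T, whereas all other genes use the typical TAA or TAG. Secondary structures were predicted for the 22 tRNA genes: tRNASer(AGN) lacks the DHU arm, and the anticodon loop of tRNASer(UGY) contains 9 bases. The predicted secondary structures of the 12S and 16S rRNAs of G. marmoratus comprise 3 domains with 30 stem-loops and 6 domains with 44 stem-loops, respectively. The A+T-rich region contains 3 tandem repeats.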
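Two of the quantities in the abstract above, the A+T content and whether a coding sequence begins with an ATN start codon, are simple to compute from sequence. The Python sketch below shows one way to do so. The function names `at_content` and `is_atn_start` and the toy sequences are hypothetical; they are not taken from the paper or from GenBank record EU527334.

```python
def at_content(seq):
    """Percentage of A and T among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * sum(b in "AT" for b in bases) / len(bases)


def is_atn_start(cds):
    """True if the first codon of cds is ATA, ATC, ATG or ATT (i.e. ATN)."""
    codon = cds[:3].upper()
    return len(codon) == 3 and codon.startswith("AT") and codon[2] in "ACGT"


# Toy sequences, for illustration only
print(at_content("ATGC"))      # 50.0
print(is_atn_start("ATTGGA"))  # True
print(is_atn_start("TTGAAA"))  # False, TTG is a non-ATN start
```

Ambiguous symbols such as N are excluded from the A+T denominator here. This is a design choice of the sketch, not something stated in the abstract.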

15.
High-throughput RNA-seq has revolutionized the process of small RNA (sRNA) discovery, leading to a rapid expansion of sRNA categories. In addition to the previously well-characterized sRNAs such as microRNAs (miRNAs), piwi-interacting RNAs (piRNAs), and small nucleolar RNAs (snoRNAs), recent studies have spotlighted tRNA-derived sRNAs (tsRNAs) and rRNA-derived sRNAs (rsRNAs) as new categories of sRNAs that bear versatile functions. Since existing software and pipelines for sRNA annotation are mostly focused on analyzing miRNAs or piRNAs, here we developed the sRNA annotation pipeline optimized for rRNA- and tRNA-derived sRNAs (SPORTS1.0). SPORTS1.0 is optimized for analyzing tsRNAs and rsRNAs from sRNA-seq data, in addition to its capacity to annotate canonical sRNAs such as miRNAs and piRNAs. Moreover, SPORTS1.0 can predict potential RNA modification sites based on nucleotide mismatches within sRNAs. SPORTS1.0 is precompiled to annotate sRNAs for a wide range of 68 species across bacteria, yeast, plant, and animal kingdoms, while additional species for analyses could be readily expanded upon end users’ input. For demonstration, by analyzing sRNA datasets using SPORTS1.0, we reveal that distinct signatures are present in tsRNAs and rsRNAs from different mouse cell types. We also find that compared to other sRNA species, tsRNAs bear the highest mismatch rate, which is consistent with their highly modified nature. SPORTS1.0 is open-source software and can be publicly accessed at https://github.com/junchaoshi/sports1.0.

16.
The significance of the intron-exon structure of genes is a mystery. As eukaryotic proteins are made up of modular functional domains, each exon was suspected to encode some form of module; however, the definition of a module remained vague. Comparison of pre-mRNA splice junctions with the three-dimensional architecture of the corresponding protein products from different eukaryotes revealed that the junctions were far less likely to occur inside the α-helices and β-strands of proteins than within the more flexible linker regions (‘turns’ and ‘loops’) connecting them. The splice junctions were equally distributed in the different types of linkers and throughout the linker sequence, although a slight preference for the central region of the linker was observed. The avoidance of the α-helix and the β-strand by splice junctions suggests the existence of a selection pressure against their disruption, perhaps underscoring the investment made by nature in building these intricate secondary structures. A corollary is that the helix and the strand are the smallest integral architectural units of a protein and represent the minimal modules in the evolution of protein structure. These results should find use in comparative genomics, the design of cloning strategies, and the mutual verification of genome sequences with protein structures.

17.
18.
The medaka Oryzias latipes is a small egg-laying freshwater teleost, and has become an excellent model system for developmental genetics and evolutionary biology. The medaka genome is relatively small in size, ∼800 Mb, and the genome sequencing project was recently completed by Japanese research groups, providing a high-quality draft genome sequence of the inbred Hd-rR strain of medaka. In this review, I present an overview of the medaka genome project including genome resources, followed by specific findings obtained with the medaka draft genome. In particular, I focus on the analysis that was done by taking advantage of the medaka system, such as the sex chromosome differentiation and the regional history of medaka species using single nucleotide polymorphisms as genomic markers.

19.

Background

Next-Generation Sequencing (NGS) is revolutionizing molecular epidemiology by providing new approaches to undertake whole genome sequencing (WGS) in diagnostic settings for a variety of human and veterinary pathogens. Previous sequencing protocols have been subject to biases such as those encountered during PCR amplification and cell culture, or are restricted by the need for large quantities of starting material. We describe here a simple and robust methodology for the generation of whole genome sequences on the Illumina MiSeq. This protocol is specific for foot-and-mouth disease virus (FMDV) or other polyadenylated RNA viruses and circumvents both the use of PCR and the requirement for large amounts of initial template.

Results

The protocol was successfully validated using five FMDV positive clinical samples from the 2001 epidemic in the United Kingdom, as well as a panel of representative viruses from all seven serotypes. In addition, this protocol was successfully used to recover 94% of an FMDV genome that had previously been identified as cell culture negative. Genome sequences from three other non-FMDV polyadenylated RNA viruses (EMCV, ERAV, VESV) were also obtained with minor protocol amendments. We calculated that a minimum coverage depth of 22 reads was required to produce an accurate consensus sequence for FMDV O. This was achieved in 5 FMDV/O/UKG isolates and the type O FMDV from the serotype panel with the exception of the 5′ genomic termini and area immediately flanking the poly(C) region.

Conclusions

We have developed a universal WGS method for FMDV and other polyadenylated RNA viruses. This method works successfully from a limited quantity of starting material and eliminates the requirement for genome-specific PCR amplification. This protocol has the potential to generate consensus-level sequences within a routine high-throughput diagnostic environment.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-828) contains supplementary material, which is available to authorized users.
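The Results of the abstract above report that a minimum coverage depth of 22 reads was needed for an accurate FMDV O consensus. One simple way to apply such a threshold is to call the majority base only where depth is sufficient and to mask the rest. The Python sketch below does this under stated assumptions: the pileup representation (one string of observed bases per position), the function name `call_consensus`, and masking with `N` are all hypothetical choices for illustration and are not described as the authors' pipeline.

```python
from collections import Counter

MIN_DEPTH = 22  # threshold quoted in the abstract for FMDV O


def call_consensus(pileup, min_depth=MIN_DEPTH):
    """Majority-base consensus; positions with depth < min_depth become 'N'.

    pileup: list of strings, one per genome position, each holding the
    bases observed at that position across the aligned reads.
    """
    consensus = []
    for column in pileup:
        if len(column) < min_depth:
            consensus.append("N")  # too few reads to trust the call
        else:
            consensus.append(Counter(column).most_common(1)[0][0])
    return "".join(consensus)


# Toy pileup with depths 30, 10 and 25, for illustration only
toy = ["A" * 30, "C" * 10, "G" * 20 + "T" * 5]
print(call_consensus(toy))  # ANG
```

The second position is masked because only 10 reads cover it, which mirrors how low-coverage regions such as genome termini would drop out of a thresholded consensus.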

20.