Similar articles
1.
2.
Current computational methods used to analyze changes in DNA methylation and chromatin modification rely on sequenced genomes. Here we describe a pipeline for the detection of these changes from short-read sequence data that does not require a reference genome. Open source software packages were used for sequence assembly, alignment, and measurement of differential enrichment. The method was evaluated by comparing its results with reference-based results, showing a strong correlation between chromatin modification and gene expression. We then used our de novo sequence assembly to build the DNA methylation profile for the non-referenced Psammomys obesus genome. The pipeline described uses open source software for fast annotation and visualization of unreferenced genomic regions from short-read data.

3.
4.

Key message

This work suggests 2020 potential candidates in rice for the functional annotation of unannotated genes using meta-analysis of anatomical samples derived from microarray and RNA-seq technologies and this information will be useful to identify novel morphological agronomic traits.

Abstract

Although the genome of rice (Oryza sativa) has been sequenced, 14,365 genes are considered unannotated because they lack putative annotation information. According to the Rice Genome Annotation Project Database (http://rice.plantbiology.msu.edu/), the proportion of functionally characterized unannotated genes (0.35%) is quite limited when compared with the approximately 3.9% of annotated genes with assigned putative functions. Researchers require additional information to help them investigate the molecular mechanisms associated with those unannotated genes. To determine which of them might regulate morphological or physiological traits in the rice genome, we conducted a meta-analysis of expression data that covered a wide range of tissue/organ samples. Overall, 2020 genes showed cultivar-, tissue-, or organ-preferential patterns of expression. Representative candidates from featured groups were validated by RT-PCR, and the GUS reporter system was used to validate the expression of genes that were clustered according to their leaf or root preference. Taking a molecular and genetics approach, we examined meta-expression data and found that 127 genes were differentially expressed between japonica and indica rice cultivars. This is potentially significant for future agronomic applications. We also used a T-DNA insertional mutant and performed a co-expression network analysis of Sword shape dwarf1 (SSD1), a gene that regulates cell division. This network was refined via RT-PCR analysis. Our results suggested that SSD1 represses the expression of four genes related to the processes of DNA replication or cell division and provides insight into possible molecular mechanisms. Together, these strategies present a valuable tool for in-depth characterization of currently unannotated genes.

5.
6.
7.
Recent studies hint that endogenous dsRNA plays an unexpected role in cellular signaling. However, a complete understanding of endogenous dsRNA signaling is hindered by an incomplete annotation of dsRNA-producing genes. To identify dsRNAs expressed in Caenorhabditis elegans, we developed a bioinformatics pipeline that identifies dsRNA by detecting clustered RNA editing sites, which are strictly limited to long dsRNA substrates of Adenosine Deaminases that act on RNA (ADAR). We compared two alignment algorithms for mapping both unique and repetitive reads and detected as many as 664 editing-enriched regions (EERs) indicative of dsRNA loci. EERs are visually enriched on the distal arms of autosomes and are predicted to possess strong internal secondary structures as well as sequence complementarity with other EERs, indicative of both intramolecular and intermolecular duplexes. Most EERs were associated with protein-coding genes, with ∼1.7% of all C. elegans mRNAs containing an EER, located primarily in very long introns and in annotated, as well as unannotated, 3′ UTRs. In addition to numerous EERs associated with coding genes, we identified a population of prospective noncoding EERs that were distant from protein-coding genes and that had little or no coding potential. Finally, subsets of EERs are differentially expressed during development as well as during starvation and infection with bacterial or fungal pathogens. By combining RNA-seq with freely available bioinformatics tools, our workflow provides an easily accessible approach for the identification of dsRNAs, and more importantly, a catalog of the C. elegans dsRNAome.
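
The idea of detecting dsRNA loci from clustered editing sites can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the input layout (candidate editing-site positions per chromosome), and the thresholds `max_gap` and `min_sites` are all assumptions chosen for illustration, since the abstract does not state the clustering criteria.

```python
from typing import Dict, List, Tuple

Region = Tuple[str, int, int, int]  # (chromosome, start, end, number of sites)


def find_editing_enriched_regions(
    sites: Dict[str, List[int]],
    max_gap: int = 50,    # hypothetical: largest distance between neighbouring sites
    min_sites: int = 3,   # hypothetical: fewest sites that count as a cluster
) -> List[Region]:
    """Group candidate editing sites into clusters of nearby positions."""
    regions: List[Region] = []
    for chrom, positions in sites.items():
        ordered = sorted(set(positions))
        if not ordered:
            continue
        start = prev = ordered[0]
        count = 1
        for pos in ordered[1:]:
            if pos - prev <= max_gap:
                # Still inside the current cluster.
                count += 1
            else:
                # Gap too large: close the current cluster and start a new one.
                if count >= min_sites:
                    regions.append((chrom, start, prev, count))
                start = pos
                count = 1
            prev = pos
        # Close the last cluster on this chromosome.
        if count >= min_sites:
            regions.append((chrom, start, prev, count))
    return regions


example_sites = {"chrI": [100, 120, 130, 500, 900, 910, 915, 930]}
example_regions = find_editing_enriched_regions(example_sites)
```

With the illustrative input above, the isolated site at position 500 is discarded and two clusters remain. A real workflow would first have to call editing sites from alignments and filter out SNPs and sequencing errors, which this sketch leaves out.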

8.
RNA-Seq is a powerful tool for the annotation of genomes, in particular for the identification of isoforms and UTRs. Nevertheless, several software tools exist and no standard strategy for obtaining a reliable annotation has yet been established. We tested different combinations of the most commonly used reference-based alignment tools (TopHat, GSNAP) in combination with two frequently used reference-based assemblers (Cufflinks, Scripture) and evaluated the potential of RNA-Seq to improve the annotation of Drosophila pseudoobscura. While GSNAP maps a higher proportion of reads, TopHat resulted in a more accurate annotation when used in combination with Cufflinks. Scripture had the lowest sensitivity. Interestingly, after subsampling to the same coverage for GSNAP and TopHat, we find that both mappers have similar performance, implying that the advantage of TopHat is mainly an artifact of the lower coverage. Overall, we observed a low concordance among the different approaches tested both at junction and isoform levels. Using data from both sexes of two adult strains of D. pseudoobscura we detected alternative splicing for about 30% of the FlyBase multiple-exon genes. Moreover, we extended the boundaries for 6523 genes (about 40%). We annotated 669 new genes, 45% of them with splicing evidence. Most of the new genes are located on unassembled contigs, reflecting their incomplete annotation. Finally, we identified 99 additional new genes that are not represented in the current genome contigs of D. pseudoobscura, probably due to location in genomic regions that are difficult to assemble (e.g. heterochromatic regions).
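
The subsampling control mentioned above, comparing two mappers at the same coverage, can be sketched as follows. This is an illustration only: it assumes the mapped reads of each tool are available as in-memory lists of read identifiers and equalizes read counts as a stand-in for coverage. The function name and the seed are hypothetical, and the abstract does not describe how the authors performed the subsampling.

```python
import random
from typing import List, Sequence, Tuple


def subsample_to_equal_depth(
    reads_a: Sequence[str],
    reads_b: Sequence[str],
    seed: int = 0,  # hypothetical fixed seed so the draw is reproducible
) -> Tuple[List[str], List[str]]:
    """Randomly downsample both read sets to the size of the smaller one."""
    target = min(len(reads_a), len(reads_b))
    rng = random.Random(seed)
    # random.Random.sample draws without replacement.
    return rng.sample(list(reads_a), target), rng.sample(list(reads_b), target)


# Illustrative read identifiers, not real data.
mapper_a_reads = [f"read{i}" for i in range(1000)]
mapper_b_reads = [f"read{i}" for i in range(600)]
sub_a, sub_b = subsample_to_equal_depth(mapper_a_reads, mapper_b_reads)
```

After this step both sets contain 600 reads, so any remaining difference in downstream annotation accuracy cannot be attributed to depth alone.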

9.
10.
11.
12.
13.
14.
15.
16.
Rice functional genomics is a scientific approach that seeks to identify and define the function of rice genes, and to uncover when and how genes work together to produce phenotypic traits. Rapid progress in rice genome sequencing has facilitated research in rice functional genomics in China. The Ministry of Science and Technology of China has funded two major rice functional genomics research programmes to build the infrastructure for functional genomics studies, such as developing rice functional genomics tools and resources. The programmes were also aimed at the cloning and functional analysis of a number of rice genes controlling important agronomic traits. National and international collaborations in rice functional genomics are accelerating rice gene discovery and application.

17.
18.
Genomics now facilitates the identification of microbial virulence factors, a primary objective in antimicrobial drug and vaccine design. Many putative proteins that could act as potent drug targets have yet to be identified, and methods that appropriately combine several omics approaches to identify putative and new drug targets are lacking or limited. This study emphasizes a combined bioinformatic and theoretical method for screening unique and putative drug targets that lack similarity to experimentally reported essential genes and drug targets. A synteny-based comparison was carried out across 11 streptococci, with S. gordonii as the reference genome. It revealed 534 non-homologous genes, of which 334 were putative. Similarity searches against the host proteome, metabolic pathway annotation, and subcellular localization prediction identified 16 potent drug targets. This is a first attempt to combine several similarity-search approaches with target protein structural features for screening drug targets, yielding a pipeline that can be extended to other human pathogens.

19.
20.
The genome sequence of Manduca sexta was recently determined using 454 technology. Cufflinks and MAKER2 were used to establish gene models in the genome assembly based on the RNA-Seq data and other species' sequences. Aided by the extensive RNA-Seq data from 50 tissue samples at various life stages, annotators around the world (including the present authors) have manually confirmed and improved a small percentage of the models over months of effort. While such collaborative efforts are highly commendable, many of the predicted genes still have problems which may hamper future research on this insect species. As a biochemical model representing lepidopteran pests, M. sexta has been used extensively to study insect physiological processes for over five decades. In this work, we assembled Manduca datasets Cufflinks 3.0, Trinity 4.0, and Oases 4.0 to assist the manual annotation efforts and development of Official Gene Set (OGS) 2.0. To further improve annotation quality, we developed methods to evaluate gene models in the MAKER2, Cufflinks, Oases and Trinity assemblies and selected the best ones to constitute MCOT 1.0 after thorough crosschecking. MCOT 1.0 has 18,089 genes encoding 31,666 proteins: 32.8% match OGS 2.0 models perfectly or near perfectly, 11,747 differ considerably, and 29.5% are absent in OGS 2.0. Future automation of this process is anticipated to greatly reduce human efforts in generating comprehensive, reliable models of structural genes in other genome projects where extensive RNA-Seq data are available.
