Similar literature
 20 similar articles found (search time: 62 ms)
1.
2.

Background

Plant resistance genes (R genes) exist in large families and usually contain both a nucleotide-binding site domain and a leucine-rich repeat domain, denoted NBS-LRR. The genome sequence of cassava (Manihot esculenta) is a valuable resource for analysing the genomic organization of resistance genes in this crop.

Results

With searches for Pfam domains and manual curation of the cassava gene annotations, we identified 228 NBS-LRR type genes and 99 partial NBS genes. These represent almost 1% of the total predicted genes and show high sequence similarity to proteins from other plant species. Furthermore, 34 contained an N-terminal Toll/interleukin-1 receptor (TIR)-like domain, and 128 contained an N-terminal coiled-coil (CC) domain. Of the 327 R genes, 63% occurred in 39 clusters on the chromosomes. These clusters are mostly homogeneous, containing NBS-LRRs derived from a recent common ancestor.

Conclusions

This study provides insight into the evolution of NBS-LRR genes in the cassava genome; the phylogenetic and mapping information may aid efforts to further characterize the function of these predicted R genes.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1554-9) contains supplementary material, which is available to authorized users.

3.
4.
5.

Background

In recent years, several methods have been proposed in the context of Metabolic Engineering (ME) for simulating the phenotype of microorganisms under specified genetic and environmental conditions. These methods have provided insight into the functioning of microbial metabolism and played a key role in the design of genetic modifications that can lead to strains of industrial interest. Meanwhile, in Systems Biology research, biological network visualization has reinforced its role as a core tool for understanding biological processes. However, it has scarcely been used to support ME-related methods, despite its acknowledged potential.

Results

In this work, we propose open-source software that aims to fill the gap between ME and metabolic network visualization, in the form of a plugin for the OptFlux ME platform. The framework is based on an abstract layer, where the network is represented as a bipartite graph containing minimal information about the underlying entities and their desired relative placement. The framework provides input/output support for networks specified in standard formats, such as XGMML, SBGN or SBML, providing a connection to genome-scale metabolic models. A user interface makes it possible to edit, manipulate and query nodes in the network, and provides tools to visualize diverse effects, including visual filters and appearance changes (e.g. colors, shapes and sizes). These tools are particularly interesting for ME, since they allow phenotype simulation results or elementary flux modes to be overlaid on the networks.
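
The bipartite representation described above can be illustrated with a minimal Python sketch. It is not the plugin's code: the class `BipartiteNetwork`, its methods, and the reaction and metabolite names are all hypothetical, chosen only to show metabolite and reaction nodes joined by edges, with simulated flux values mapped to edge widths as one possible overlay.

```python
from dataclasses import dataclass, field


@dataclass
class BipartiteNetwork:
    """Hypothetical metabolic network with two node sets: metabolites and reactions."""
    metabolites: set = field(default_factory=set)
    reactions: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # directed (source, target) pairs

    def add_reaction(self, name, substrates, products):
        """Register a reaction node and connect it to its metabolites."""
        self.reactions.add(name)
        for m in substrates:
            self.metabolites.add(m)
            self.edges.add((m, name))  # metabolite -> reaction
        for m in products:
            self.metabolites.add(m)
            self.edges.add((name, m))  # reaction -> metabolite

    def is_bipartite(self):
        """True if every edge joins one metabolite node and one reaction node."""
        return all(
            (a in self.metabolites and b in self.reactions)
            or (a in self.reactions and b in self.metabolites)
            for a, b in self.edges
        )

    def edge_widths(self, fluxes, min_width=1.0, max_width=5.0):
        """Map each edge to a line width scaled by the absolute flux of its reaction."""
        peak = max((abs(v) for v in fluxes.values()), default=0.0)
        widths = {}
        for a, b in self.edges:
            reaction = a if a in self.reactions else b
            flux = abs(fluxes.get(reaction, 0.0))
            scale = flux / peak if peak > 0 else 0.0
            widths[(a, b)] = min_width + scale * (max_width - min_width)
        return widths
```

Keeping reactions as nodes rather than as edges lets a reaction with several substrates and products be drawn without hyperedges, which is one common reason for choosing a bipartite layout.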

Conclusions

The framework and its source code are freely available, together with documentation and other resources, and are illustrated with well-documented case studies.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0420-0) contains supplementary material, which is available to authorized users.

6.
7.

Background

Domestication has shaped the horse and led to a group of many different types. Some have been under strong human selection while others developed in close relationship with nature. The aim of our study was to perform next generation sequencing of breed and non-breed horses to provide an insight into genetic influences on selective forces.

Results

Whole genome sequencing of five horses of four different populations revealed 10,193,421 single nucleotide polymorphisms (SNPs) and 1,361,948 insertion/deletion polymorphisms (indels). In comparison to horse variant databases and previous reports, we were able to identify 3,394,883 novel SNPs and 868,525 novel indels. We analyzed the distribution of individual variants. In non-breed horses, we found significant enrichment of private mutations in coding regions of genes involved in primary metabolic processes, anatomical structures, morphogenesis and cellular components. In breed horses, by contrast, private mutations were enriched in genes affecting cell communication, lipid metabolic process, neurological system process, muscle contraction, ion transport, developmental processes of the nervous system and ectoderm.

Conclusions

Our next generation sequencing data constitute an important first step for the characterization of non-breed in comparison to breed horses and provide a large number of novel variants for future analyses. Functional annotations suggest specific variants that could play a role in the characterization of breed or non-breed horses.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-562) contains supplementary material, which is available to authorized users.

8.
9.

Background

While the gargantuan multi-nation effort of sequencing T. aestivum nears completion, the annotation process for the vast number of wheat genes and proteins is in its infancy. Previous experimental studies carried out on model plant organisms such as A. thaliana and O. sativa provide a plethora of gene annotations that can be used as potential starting points for wheat gene annotations, provided that solid cross-species gene-to-gene and protein-to-protein correspondences are available.

Results

DNA and protein sequences and corresponding annotations for T. aestivum and 9 other plant species were collected from Ensembl Plants release 22 and curated. Cliques of predicted 1-to-1 orthologs were identified and an annotation enrichment model was defined based on existing gene-GO term associations and phylogenetic relationships among wheat and 9 other plant species. A total of 13 cliques of size 10 were identified, which represent putative functionally equivalent genes and proteins in the 10 plant species. Eighty-five new and more specific GO terms were associated with wheat genes in the 13 cliques of size 10, which represents a 65% increase compared with the 130 previously known GO terms. Similar expression patterns for 4 genes from Arabidopsis, barley, maize and rice in cliques of size 10 provide experimental evidence to support our model. Overall, based on cliques of size 3 or larger, our model enriched the existing gene-GO term associations for 7,838 (8%) wheat genes, of which 2,139 had no previous annotation.
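
To make the notion of a clique of predicted 1-to-1 orthologs concrete, here is a small Python sketch. It assumes a toy input (a dictionary of genes per species and a list of pairwise ortholog predictions) and uses brute-force enumeration; the function name `ortholog_cliques` and the gene identifiers are hypothetical, and the abstract does not state how the authors computed their cliques.

```python
from itertools import combinations, product


def ortholog_cliques(genes_by_species, ortholog_pairs):
    """Return gene sets with one gene per species in which every pair is a predicted ortholog.

    genes_by_species: dict mapping species name -> list of gene identifiers
    ortholog_pairs:   iterable of (gene_a, gene_b) predicted 1-to-1 ortholog pairs
    """
    pairs = {frozenset(p) for p in ortholog_pairs}
    species = sorted(genes_by_species)
    cliques = []
    # Brute force: try every choice of one gene per species (fine for toy inputs only).
    for combo in product(*(genes_by_species[s] for s in species)):
        if all(frozenset((a, b)) in pairs for a, b in combinations(combo, 2)):
            cliques.append(dict(zip(species, combo)))
    return cliques
```

A clique spanning all species is a strict criterion: a single missing pairwise prediction excludes the whole gene set, which is consistent with only 13 cliques of size 10 being reported.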

Conclusions

Our novel comparative genomics approach enriches existing T. aestivum gene annotations based on cliques of predicted 1-to-1 orthologs, phylogenetic relationships and existing gene ontologies from 9 other plant species.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1496-2) contains supplementary material, which is available to authorized users.

10.

Background

Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs).

Results

The existence of many AMs and the development of new ones create the need to (a) compare different annotations for a single genome, and (b) generate an annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. To illustrate BEACON’s utility, we provide a comparative analysis of multiple annotations generated for four genomes and show with these examples that the extended annotation can increase the number of genes annotated with putative functions by up to 27%, while the number of genes without any function assignment is reduced.

Conclusions

We developed BEACON, a fast tool for automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1826-4) contains supplementary material, which is available to authorized users.

11.

Background

Ontology-based enrichment analysis aids in the interpretation and understanding of large-scale biological data. Ontologies are hierarchies of biologically relevant groupings. Using ontology annotations, which link ontology classes to biological entities, enrichment analysis methods assess whether entities are significantly over- or under-represented in ontology classes. While many tools exist that run enrichment analysis for protein sets annotated with the Gene Ontology, there are only a few that can be used for small-molecule enrichment analysis.
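
One common way to test for over-representation is a one-sided hypergeometric test, sketched below in Python. This is a generic illustration, not a description of any particular tool's statistics; the function name `overrepresentation_p` and its parameter names are assumptions made for this example.

```python
from math import comb


def overrepresentation_p(population, annotated, sample, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: total number of entities in the background set
    annotated:  entities in the background annotated with the ontology class
    sample:     size of the entity set under study
    hits:       entities in the study set annotated with the class
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass of seeing `hits` or more annotated entities by chance.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

In practice one such test is run per ontology class, so the resulting p-values would normally be corrected for multiple testing before being interpreted.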

Results

We describe BiNChE, an enrichment analysis tool for small molecules based on the ChEBI Ontology. BiNChE displays an interactive graph that can be exported as a high-resolution image or in network formats. The tool provides plain, weighted and fragment analysis based on either the ChEBI Role Ontology or the ChEBI Structural Ontology.

Conclusions

BiNChE aids in the exploration of large sets of small molecules produced within Metabolomics or other Systems Biology research contexts. The open-source tool provides easy and highly interactive web access to enrichment analysis with the ChEBI ontology tool and is additionally available as a standalone library.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0486-3) contains supplementary material, which is available to authorized users.

12.

Background

Deviations in the amount of genomic content that arise during tumorigenesis, called copy number alterations, are structural rearrangements that can critically affect gene expression patterns. Additionally, copy number alteration profiles allow insight into cancer discrimination, progression and complexity. On data obtained from high-throughput sequencing, improving quality through GC bias correction and keeping false positives to a minimum help build reliable copy number alteration profiles.

Results

We introduce seqCNA, a parallelized R package for an integral copy number analysis of high-throughput sequencing cancer data. The package includes novel methodology on (i) filtering, reducing false positives, and (ii) GC content correction, improving copy number profile quality, especially under great read coverage and high correlation between GC content and copy number. Adequate analysis steps are automatically chosen based on availability of paired-end mapping, matched normal samples and genome annotation.
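
The idea behind GC content correction can be sketched in Python with a simple median normalisation per GC bin. This is a generic technique, shown under the assumption that read counts and GC fractions are available per genomic window; it is not the novel methodology of seqCNA, which the abstract does not detail, and the function name `gc_correct` and the `bin_width` parameter are hypothetical.

```python
from collections import defaultdict
from statistics import median


def gc_correct(counts, gc_fractions, bin_width=0.05):
    """Rescale window read counts so that every GC bin has the same median count.

    counts:       read count per genomic window
    gc_fractions: GC fraction (0..1) of each window, in the same order
    bin_width:    width of the GC bins used for grouping (assumed value)
    """
    bins = defaultdict(list)
    for count, gc in zip(counts, gc_fractions):
        bins[int(gc / bin_width)].append(count)

    global_median = median(counts)
    bin_median = {b: median(values) for b, values in bins.items()}

    corrected = []
    for count, gc in zip(counts, gc_fractions):
        m = bin_median[int(gc / bin_width)]
        # Windows in a bin with zero median carry no usable signal; set them to 0.
        corrected.append(count * global_median / m if m > 0 else 0.0)
    return corrected
```

A limitation of this simple scheme is that it treats any association between GC content and read depth as technical bias, which is problematic in exactly the case the abstract highlights, where GC content and true copy number are highly correlated.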

Conclusions

seqCNA, available through Bioconductor, provides accurate copy number predictions in tumoural data, thanks to the extensive filtering and better GC bias correction, while providing an integrated and parallelized workflow.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-178) contains supplementary material, which is available to authorized users.

13.

Background

Propionibacterium freudenreichii (PF) is an actinobacterium used in cheese technology and for its probiotic properties. PF is also extremely adaptable to several ecological niches and can grow on a variety of carbon and nitrogen sources. The aim of this work was to discover the genetic basis for strain-dependent traits related to its ability to use specific carbon sources. High-throughput sequencing technologies were ideal for this purpose as they have the potential to decipher genomic diversity at a moderate cost.

Results

Twenty-one strains of PF were sequenced and the genomes were assembled de novo. Scaffolds were ordered by comparison with the complete reference genome CIRM-BIA1, obtained previously using traditional Sanger sequencing. Automatic functional annotation and manual curation were performed. Each gene was attributed to either the core genome or an accessory genome. The ability of the 21 strains to degrade 50 different sugars was evaluated. Thirty-three sugars were degraded by none of the sequenced strains whereas eight sugars were degraded by all of them. The corresponding genes were present in the core genome. Lactose, melibiose and xylitol were only used by some strains. In this case, the presence/absence of genes responsible for carbon uptake and degradation correlated well with the phenotypes, with the exception of xylitol. Furthermore, the simultaneous presence of these genes was in line with the metabolic pathways described previously in other species. We also considered the genetic origin (transduction, rearrangement) of the corresponding genomic islands. Ribose and gluconate were degraded to a greater or lesser extent (quantitative phenotype) by some strains. For these sugars, the phenotypes could not be explained by the presence/absence of a gene but correlated with the premature appearance of a stop codon interrupting protein synthesis and preventing the catabolism of the corresponding carbon sources.

Conclusion

These results illustrate (i) the power of correlation studies to discover the genetic basis of binary strain-dependent traits, and (ii) the plasticity of PF chromosomes, probably resulting from horizontal transfers, duplications, transpositions and an accumulation of mutations. Knowledge of the genetic basis of nitrogen and sugar degradation opens up new strategies for the screening of PF strain collections to enable optimum cheese starter, probiotic and white biotechnology applications.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1467-7) contains supplementary material, which is available to authorized users.

14.

Background

Patient-derived tumor xenografts in mice are widely used in cancer research and have become important in developing personalized therapies. When these xenografts are subject to DNA sequencing, the samples could contain various amounts of mouse DNA. It has been unclear how the mouse reads would affect data analyses. We conducted comprehensive simulations to compare three alignment strategies at different mutation rates, read lengths, sequencing error rates, human-mouse mixing ratios and sequenced regions. We also sequenced a nasopharyngeal carcinoma xenograft and a cell line to test how the strategies work on real data.

Results

We found that the "filtering" and "combined reference" strategies performed better than aligning reads directly to the human reference in terms of alignment and variant calling accuracy. The combined reference strategy was particularly good at reducing false-negative variant calls without significantly increasing the false-positive rate. In some scenarios the performance gain of these two strategies was too small for special handling to be cost-effective, but special handling proved crucial when false non-synonymous SNVs must be minimized, especially in exome sequencing.

Conclusions

Our study systematically analyzes the effects of mouse contamination in the sequencing data of human-in-mouse xenografts. Our findings provide information for designing data analysis pipelines for these data.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1172) contains supplementary material, which is available to authorized users.

15.

Background

Forming a new species through the merger of two or more divergent parent species is increasingly seen as a key phenomenon in the evolution of many biological systems. However, little is known about how expression of parental gene copies (homeologs) responds following genome merger. High throughput RNA sequencing now makes this analysis technically feasible, but tools to determine homeolog expression are still in their infancy.

Results

Here we present HyLiTE – a single-step analysis to obtain tables of homeolog expression in a hybrid or allopolyploid and its parent species directly from raw mRNA sequence files. By implementing on-the-fly detection of diagnostic parental polymorphisms, HyLiTE can perform SNP calling and read classification simultaneously, thus allowing HyLiTE to be run as parallelized code. HyLiTE accommodates any number of parent species, multiple data sources (including genomic DNA reads to improve SNP detection), and implements a statistical framework optimized for genes with low to moderate expression.

Conclusions

HyLiTE is a flexible and easy-to-use program designed for bench biologists to explore patterns of gene expression following genome merger. HyLiTE offers practical advantages over manual methods and existing programs, has been designed to accommodate a wide range of genome merger systems, can identify SNPs that arose following genome merger, and offers accurate performance on non-model organisms.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0433-8) contains supplementary material, which is available to authorized users.

16.

Background

Meta-analysis has become a popular approach for high-throughput genomic data analysis because it can often significantly increase the power to detect biological signals or patterns in datasets. However, when publicly available databases are used for meta-analysis, duplication of samples is a frequently encountered problem, especially for gene expression data. Failing to remove duplicates could lead to false-positive findings, misleading clustering patterns or model over-fitting in the subsequent data analysis.

Results

We developed a Bioconductor package, DupChecker, that efficiently identifies duplicated samples by generating MD5 fingerprints for raw data. A real-data example demonstrates the usage and output of the package.
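
The fingerprinting idea can be sketched in a few lines of Python using the standard `hashlib` module. This is an illustration of MD5-based duplicate detection in general, not DupChecker's code (DupChecker is an R/Bioconductor package); the function names and sample file names below are hypothetical.

```python
import hashlib
from collections import defaultdict


def md5_fingerprint(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a raw data file's bytes."""
    return hashlib.md5(data).hexdigest()


def find_duplicates(samples):
    """Group sample names whose raw data share an identical MD5 fingerprint.

    samples: dict mapping sample name -> raw file content as bytes
    Returns a list of sorted name lists, one per group of duplicated samples.
    """
    groups = defaultdict(list)
    for name, data in samples.items():
        groups[md5_fingerprint(data)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

Comparing fixed-length digests avoids pairwise comparison of the full raw files, but it only detects byte-identical data: the same sample stored in a different file format or after reprocessing would yield a different fingerprint.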

Conclusions

Researchers may not pay enough attention to checking for and removing duplicated samples, and the resulting data contamination could make the results or conclusions of a meta-analysis questionable. We suggest applying DupChecker to examine all gene expression data sets before any data analysis step.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-323) contains supplementary material, which is available to authorized users.

17.
18.

Background

The growth and development of the posterior silk gland and the biosynthesis of the silk core protein at the fifth larval instar stage of Bombyx mori are of paramount importance for silk production.

Results

Here, aided by next-generation sequencing and microarray assays, we profile 1,229 microRNAs (miRNAs), including 728 novel miRNAs and 110 miRNA/miRNA* duplexes, of the posterior silk gland at the fifth larval instar. Target gene prediction yields 14,222 unique target genes from 1,195 miRNAs. Functional categorization classifies the targets into complex pathways that include both cellular and metabolic processes, especially protein synthesis and processing.

Conclusion

The enrichment of target genes in the ribosome-related pathway indicates that miRNAs may directly regulate translation. Our findings pave a way for further functional elucidation of these miRNAs and their targets in silk production.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-410) contains supplementary material, which is available to authorized users.

19.
20.

Background

Bacterial spore germination is a developmental process during which all required metabolic pathways are restored to transfer cells from their dormant state into vegetative growth. Streptomyces are soil-dwelling filamentous bacteria with a complex life cycle, studied mostly for their ability to synthesize secondary metabolites including antibiotics.

Results

Here, we present a systematic approach that analyzes gene expression data obtained from 13 time points taken over 5.5 h of Streptomyces germination. Genes whose expression was significantly enhanced/diminished during the time-course were identified, and classified into metabolic and regulatory pathways. The classification into metabolic pathways revealed the timing of the activation of specific pathways during the course of germination. The analysis also identified remarkable changes in the expression of specific sigma factors over the course of germination. Based on our knowledge of the targets of these factors, we speculate on their possible roles during germination. Among the factors whose expression was enhanced during the initial part of germination, SigE is thought to manage cell wall reconstruction, SigR controls protein re-aggregation, and others (SigH, SigB, SigI, SigJ) control osmotic and oxidative stress responses.

Conclusions

From the results, we conclude that most of the metabolic pathway mRNAs required for the initial phases of germination were synthesized during the sporulation process and stably conserved in the spore. After rehydration in growth medium, the stored mRNAs are degraded and resynthesized during the first hour. From the analysis of sigma factors we conclude that conditions favoring germination evoke stress-like cell responses.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1173) contains supplementary material, which is available to authorized users.
