Similar literature
Found 20 similar articles (search time: 15 ms)
1.
2.
Burkholderia sprentiae strain WSM5005T is an aerobic, motile, Gram-negative, non-spore-forming rod that was isolated in Australia from an effective N2-fixing root nodule of Lebeckia ambigua collected in Klawer, Western Cape of South Africa, in October 2007. Here we describe the features of Burkholderia sprentiae strain WSM5005T, together with the genome sequence and its annotation. The 7,761,063 bp high-quality-draft genome is arranged in 8 scaffolds of 236 contigs, contains 7,147 protein-coding genes and 76 RNA-only encoding genes, and is one of 20 rhizobial genomes sequenced as part of the DOE Joint Genome Institute 2010 Community Sequencing Program.

3.
4.
A synchrotron X-ray microscope is a powerful imaging apparatus for taking high-resolution and high-contrast X-ray images of nanoscale objects. A sufficient number of X-ray projection images from different angles is required for constructing 3D volume images of an object. Because a synchrotron light source is immobile, a rotational object holder is required for tomography. At a resolution of 10 nm per pixel, the vibration of the holder caused by rotating the object cannot be disregarded if tomographic images are to be reconstructed accurately. This paper presents a computational method to compensate for the vibration of the rotational holder by aligning neighboring X-ray images. The alignment process involves two steps. The first step is to match the “projected feature points” in the sequence of images. The matched projected feature points in the x-θ plane (projected position versus rotation angle) should form a set of sine-shaped loci. The second step is to fit the loci to a set of sine waves to compute the parameters required for alignment. The experimental results show that the proposed method outperforms two previously proposed methods, Xradia and SPIDER. The software can be downloaded from http://www.cs.nctu.edu.tw/~chengchc/SCTA or http://goo.gl/s4AMx.
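The sine-fitting step can be illustrated with a minimal sketch. It assumes that each feature's projected position follows x = a·sin(θ) + b·cos(θ) + c over the rotation, which makes the fit a linear least-squares problem. The function names (`fit_sine_locus`, `horizontal_shifts`) are hypothetical, and this is not the authors' implementation.

```python
import math

def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for k in range(col, 4):
                a[r][k] -= f * a[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = a[r][3] - sum(a[r][k] * x[k] for k in range(r + 1, 3))
        x[r] = s / a[r][r]
    return x

def fit_sine_locus(thetas, xs):
    """Least-squares fit of x = a*sin(theta) + b*cos(theta) + c.

    Returns (amplitude, phase, offset) of the equivalent form
    x = amplitude * sin(theta + phase) + offset.
    """
    basis = [(math.sin(t), math.cos(t), 1.0) for t in thetas]
    # Normal equations (A^T A) p = A^T x for the three unknowns a, b, c.
    ata = [[sum(p[i] * p[j] for p in basis) for j in range(3)] for i in range(3)]
    atx = [sum(p[i] * x for p, x in zip(basis, xs)) for i in range(3)]
    a, b, c = solve3(ata, atx)
    return math.hypot(a, b), math.atan2(b, a), c

def horizontal_shifts(thetas, xs):
    """Per-image shift that would move each observed point onto the fitted sine wave."""
    amp, phase, offset = fit_sine_locus(thetas, xs)
    return [amp * math.sin(t + phase) + offset - x for t, x in zip(thetas, xs)]
```

In this sketch the residual between the fitted curve and each observed position serves as the horizontal correction for that image. A real pipeline would combine many feature loci and also handle vertical motion.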

5.
6.
7.
The vast scale of SARS-CoV-2 sequencing data has made it increasingly challenging to comprehensively analyze all available data using existing tools and file formats. To address this, we present a database of SARS-CoV-2 phylogenetic trees inferred with unrestricted public sequences, which we update daily to incorporate new sequences. Our database uses the recently proposed mutation-annotated tree (MAT) format to efficiently encode the tree with branches labeled with parsimony-inferred mutations, as well as Nextstrain clade and Pango lineage labels at clade roots. As of June 9, 2021, our SARS-CoV-2 MAT consists of 834,521 sequences and provides a comprehensive view of the virus’ evolutionary history using public data. We also present matUtils, a command-line utility for rapidly querying, interpreting, and manipulating the MATs. Our daily-updated SARS-CoV-2 MAT database and matUtils software are available at http://hgdownload.soe.ucsc.edu/goldenPath/wuhCor1/UShER_SARS-CoV-2/ and https://github.com/yatisht/usher, respectively.
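The idea behind a mutation-annotated tree can be sketched with a toy in-memory structure: each node stores only the mutations on the branch leading to it, and a sample's full mutation set is recovered by walking from the root to its leaf. The `Node` class, the `clade` field, and the helper names are hypothetical illustrations. They do not reflect the actual MAT file format or the matUtils interface.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    mutations: list = field(default_factory=list)  # mutations on the branch above this node, e.g. "A1G"
    children: list = field(default_factory=list)
    clade: str = ""  # optional label attached at a clade root (hypothetical field)

def mutation_paths(root):
    """Map each leaf name to the ordered list of mutations from the root down to that leaf."""
    paths = {}
    stack = [(root, [])]
    while stack:
        node, inherited = stack.pop()
        accumulated = inherited + node.mutations
        if not node.children:
            paths[node.name] = accumulated
        for child in node.children:
            stack.append((child, accumulated))
    return paths

def apply_mutations(reference, mutations):
    """Rebuild a sequence from the reference; each mutation is '<ref><1-based position><alt>'."""
    seq = list(reference)
    for m in mutations:
        position, alt = int(m[1:-1]), m[-1]
        seq[position - 1] = alt
    return "".join(seq)

# Toy tree: root -> inner -> s1, and root -> s2.
tree = Node("root", children=[
    Node("inner", ["A1G"], [Node("s1", ["C2T"])], clade="toy-clade"),
    Node("s2", ["T4A"]),
])
paths = mutation_paths(tree)
```

Storing only branch mutations is what keeps the encoding compact: closely related samples share most of their path, so shared mutations are recorded once.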

8.
Traditionally, phenotype-driven forward genetic plant mutant studies have been among the most successful approaches to revealing the roles of genes and their products and elucidating biochemical, developmental, and signaling pathways. A limitation is that it is time consuming, and sometimes technically challenging, to discover the gene responsible for a phenotype by map-based cloning or discovery of the insertion element. Reverse genetics is also an excellent way to associate genes with phenotypes, although an absence of detectable phenotypes often results when screening a small number of mutants with a limited range of phenotypic assays. The Arabidopsis Chloroplast 2010 Project (www.plastid.msu.edu) seeks synergy between forward and reverse genetics by screening thousands of sequence-indexed Arabidopsis (Arabidopsis thaliana) T-DNA insertion mutants for a diverse set of phenotypes. Results from this project are discussed that highlight the strengths and limitations of the approach. We describe the discovery of altered fatty acid desaturation phenotypes associated with mutants of At1g10310, previously described as a pterin aldehyde reductase in folate metabolism. Data are presented to show that growth, fatty acid, and chlorophyll fluorescence defects previously associated with antisense inhibition of synthesis of the family of acyl carrier proteins can be attributed to a single gene insertion in Acyl Carrier Protein4 (At4g25050). A variety of cautionary examples associated with the use of sequence-indexed T-DNA mutants are described, including the need to genotype all lines chosen for analysis (even when they number in the thousands) and the presence of tagged and untagged secondary mutations that can lead to the observed phenotypes.

Decoding of the Arabidopsis (Arabidopsis thaliana) genome sequence earlier this decade (Arabidopsis Genome Initiative, 2000) provided the opportunity to determine the functions of approximately 27,000 protein-coding genes. One or more functions of a small percentage of genes are currently experimentally determined, typically from mutant or transgenic analysis or through biochemistry. However, roles for the vast majority of plant genes are either more or less accurately predicted by DNA sequence homology or unpredictable based upon DNA sequence (Arabidopsis Genome Initiative, 2000; Cho and Walbot, 2001; Rhee et al., 2008; for recent specific examples, see Gao et al., 2009; Schilmiller et al., 2009). Because of the uncertainty associated with homology-based function assessment, high-throughput approaches to gene function identification are needed to expand the universe of genes with experimental annotation.

In contrast to organisms amenable to targeted gene replacement, such as bacteria, yeast, and mouse (Wendland, 2003; Wu et al., 2007; Adams and van der Weyden, 2008), obtaining a gene knockout is not as efficient in flowering plants. In Arabidopsis, the conventional way of creating a gene knockout is by insertional mutagenesis via Agrobacterium tumefaciens-mediated transformation (Krysan et al., 1999). Using this technique, a large piece of T-DNA is inserted into the genome in an untargeted manner (Alonso et al., 2003). If it lands within a coding or regulatory region, the T-DNA can influence the expression of the corresponding gene. While the probability of any single insertion element causing a mutation in a gene of interest is low, sequencing of hundreds of thousands of independent insertion sites has led to a collection of mutants in the majority of genes (http://signal.SALK.edu/tabout.html; Alonso et al., 2003).

T-DNA mutants can be a valuable tool for forward genetics, in which hundreds or thousands of mutants are subjected to phenotypic assays (Feldmann, 1991; Kuromori et al., 2006), but reverse genetics is the most common way in which these mutant collections are utilized. Typically, a small number of candidate genes are tested for a role in a particular biological process by reducing or increasing gene expression and assaying one or more phenotypes (for review, see Page and Grossniklaus, 2002; Alonso and Ecker, 2006). The availability of a gene-indexed T-DNA mutant collection allows researchers to rapidly obtain mutant lines for their genes of interest (http://signal.SALK.edu/cgi-bin/tdnaexpress). The availability of a large collection of indexed mutant or RNA interference lines in other model organisms has facilitated large-scale reverse genetics studies (Piano et al., 2000; Giaever et al., 2002; Ho et al., 2009).

In the course of a large reverse genetics project (The Chloroplast 2010 Project; http://www.plastid.msu.edu/), more than 3,500 T-DNA lines harboring insertions in nuclear genes, most of which were computationally predicted to encode chloroplast-targeted proteins, were subjected to a diverse set of phenotypic screens (Lu et al., 2008). In total, 85 phenotypic observations ranging from quantitative metabolite measurements to qualitative phenotypic observations are collected for each mutant line, and the data are stored in a relational database (http://bioinfo.bch.msu.edu/2010_LIMS). This approach seeks to take advantage of the best features of forward and reverse genetics by screening a large number of lines with mutations in known genes. Unlike conventional genetics screens, where plants are assayed for one or a small number of traits, this project surveys varied phenotypes.

In this study, a variety of phenotypic variants were analyzed. In some cases, independent mutants of the same gene were found to have similar phenotypes, revealing new information about those genes. In other examples, a single homozygous mutant allele was found to have a detectable phenotype. These run the gamut from cases where secondary mutations are strongly implicated in causing the phenotype, to an example where an analogous maize (Zea mays) mutant is known to have a similar phenotype, to other instances where the causative mutation is yet to be identified. In several examples of secondary mutations, the phenotype was not due to a T-DNA insertion, reinforcing the idea that these untagged alleles are a cause for concern in conducting large-scale reverse genetics screens (Vitha et al., 2003; Adham et al., 2005; Zolman et al., 2008), while providing opportunities for gene function discovery by map-based cloning or whole genome sequence analysis.

9.
Desulfotomaculum kuznetsovii is a moderately thermophilic member of the polyphyletic spore-forming genus Desulfotomaculum in the family Peptococcaceae. This species is of interest because it originates from deep subsurface thermal mineral water at a depth of about 3,000 m. D. kuznetsovii is a rather versatile bacterium as it can grow with a large variety of organic substrates, including short-chain and long-chain fatty acids, which are degraded completely to carbon dioxide coupled to the reduction of sulfate. It can grow methylotrophically with methanol and sulfate and autotrophically with H2 + CO2 and sulfate. It does not require any vitamins for growth. Here, we describe the features of D. kuznetsovii together with the genome sequence and annotation. The chromosome has 3,601,386 bp organized in one contig. A total of 3,567 candidate protein-encoding genes and 58 RNA genes were identified. Genes of the acetyl-CoA pathway, possibly involved in heterotrophic growth with acetate and methanol and in CO2 fixation during autotrophic growth, are present. Genomic comparison revealed that D. kuznetsovii shows a high similarity with Pelotomaculum thermopropionicum. Genes involved in propionate metabolism of these two strains show a strong similarity. However, the main differences are found in genes involved in the electron acceptor metabolism.

10.
PlantMetabolomics.org (PM) is a web portal and database for exploring, visualizing, and downloading plant metabolomics data. Widespread public access to well-annotated metabolomics datasets is essential for establishing metabolomics as a functional genomics tool. PM integrates metabolomics data generated from different analytical platforms from multiple laboratories along with the key visualization tools such as ratio and error plots. Visualization tools can quickly show how one condition compares to another and which analytical platforms show the largest changes. The database tries to capture a complete annotation of the experiment metadata along with the metabolite abundance data based on the evolving Metabolomics Standards Initiative. PM can be used as a platform for deriving hypotheses by enabling metabolomic comparisons between genetically unique Arabidopsis (Arabidopsis thaliana) populations subjected to different environmental conditions. Each metabolite is linked to relevant experimental data and information from various annotation databases. The portal also provides detailed protocols and tutorials on conducting plant metabolomics experiments to promote metabolomics in the community. PM currently houses Arabidopsis metabolomics data generated by a consortium of laboratories utilizing metabolomics to help elucidate the functions of uncharacterized genes. PM is publicly available at http://www.plantmetabolomics.org.

In the post genomics era, metabolomics is fast emerging as a vital source of information to aid in solving systems biology puzzles with an emphasis on metabolic solutions. Metabolomics is the science of measuring the pool sizes of metabolites (small molecules of Mr ≤ 1,000 D), which collectively define the metabolome of a biological sample (Fiehn et al., 2000; Hall et al., 2002). Coverage of the entire plant metabolome is a daunting task as it is estimated that there are over 200,000 different metabolites within the plant kingdom (Goodacre et al., 2004). Although technology is rapidly advancing, there are still large gaps in our knowledge of the plant metabolome.

Despite this lack of complete knowledge and the immense metabolic diversity among plants, metabolomics has become a key analytical tool in the plant community (Hall et al., 2002). This has led to the emergence of multiple experimental and analytical platforms that collectively generate millions of metabolite data points. Because of this vast amount of data, the development of public databases to capture information from metabolomics experiments is vital to provide the scientific community with comprehensive knowledge about metabolite data generation, annotation, and integration with metabolic pathway data. Some examples of these public databases are given below. The Human Metabolome Project contains comprehensive data for more than 2,000 metabolites found within the human body (Wishart et al., 2007). The Golm Database is a repository that provides access to mass spectrometry (MS) libraries, metabolite profiling experiments, and related information from gas chromatography (GC)-MS experimental platforms, along with tools to integrate this information with other systems biology knowledge (Kopka et al., 2005). The Madison Metabolomics Consortium Database contains primarily NMR spectra for Arabidopsis (Arabidopsis thaliana) and features thorough NMR search tools (Cui et al., 2008). SetupX and Binbase provide a framework that combines MS data and biological metadata for steering laboratory work flows and employs automated metabolite annotation (Scholz and Fiehn, 2007).

A single analytical technique cannot identify and quantify all the metabolites found in plants. Thus, PlantMetabolomics.org (PM) was developed to provide a portal for accessing publicly available MS-based plant metabolomics experimental results from multiple analytical and separation techniques. PM also follows the emerging metabolomics standards for experiment annotation. PM has extensive annotation links between the identified metabolites and metabolic pathways in AraCyc (Mueller et al., 2003) at The Arabidopsis Information Resource (Rhee et al., 2003) and the Plant Metabolic Network (www.plantcyc.org), the Kyoto Encyclopedia of Genes and Genomes (KEGG; Kanehisa et al., 2004), and MetNetDB (Wurtele et al., 2007).

Standards for the annotation of metabolomics experiments are still under active development and the metadata types collected in PM are based on the recommendations of the Metabolomics Standards Initiative (MSI; Fiehn et al., 2007a) and the Minimal Information for a Metabolomic Experiment (Bino et al., 2004) standards. MSI attempts to capture the complete annotation of metabolomics experiments and includes metadata of the experiments along with the metabolite abundance data. The initial database schema design was guided by the schema proposed in the Architecture for Metabolomics project (Jenkins et al., 2004).

11.
12.
13.
Recent studies have revealed that microRNAs (miRNAs), a class of small non-coding RNAs, down-regulate their mRNA targets. This effect plays an important role in various biological processes. Many studies have been devoted to predicting miRNA-target interactions. These studies indicate that the interactions may only be functional in some specific tissues, depending on the characteristics of the miRNA. No systematic methods have been established in the literature to investigate the correlation between miRNA-target interactions and tissue specificity through microarray data. In this study, we propose a method to investigate miRNA-target interaction-supported tissues, which is based on experimentally validated miRNA-target interactions. The tissue specificity results obtained by our method are in accordance with the experimental results in the literature.

Availability and Implementation

Our analysis results are available at http://tsmti.mbc.nctu.edu.tw/ and http://www.stat.nctu.edu.tw/hwang/tsmti.html.

14.
Functional protein annotation is an important matter for in vivo and in silico biology. Several computational methods have been proposed that make use of a wide range of features such as motifs, domains, homology, structure and physicochemical properties. There is no single method that performs best in all functional classification problems because information obtained using any of these features depends on the function to be assigned to the protein. In this study, we present a novel approach that combines different methods to better represent protein function. First, we formulated the function annotation problem as a classification problem defined on 300 different Gene Ontology (GO) terms from the molecular function aspect. We presented a method to form positive and negative training examples while taking into account the directed acyclic graph (DAG) structure and evidence codes of GO. We applied three different methods and their combinations. Results show that combining different methods improves prediction accuracy in most cases. The proposed method, GOPred, is available as an online computational annotation tool (http://kinaz.fen.bilkent.edu.tr/gopred).
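One way to build training sets that respect the GO DAG and evidence codes is sketched below. The rules are assumptions chosen for illustration, since the abstract does not state GOPred's exact criteria. A protein is a positive if it is annotated to the target term or any descendant. It is a negative if it has usable annotations but none to the target or to one of the target's ancestors. Annotations with excluded evidence codes (here "IEA") are ignored. The term names are toy placeholders, not real GO identifiers.

```python
def ancestors(term, parents):
    """All ancestors of a term in a DAG given as {term: [parent terms]}."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def training_sets(target, parents, annotations, excluded_evidence=("IEA",)):
    """Split proteins into (positives, negatives) for one target term.

    annotations: {protein: [(term, evidence_code), ...]}
    Proteins annotated only to an ancestor of the target are left out of both
    sets, because such an annotation neither confirms nor rules out the target.
    """
    positives, negatives = set(), set()
    target_ancestors = ancestors(target, parents)
    for protein, annots in annotations.items():
        terms = {t for t, evidence in annots if evidence not in excluded_evidence}
        if not terms:
            continue  # no trusted annotation at all
        closure = set(terms)
        for t in terms:
            closure |= ancestors(t, parents)  # annotation propagates up the DAG
        if target in closure:
            positives.add(protein)
        elif not (terms & target_ancestors):
            negatives.add(protein)
    return positives, negatives

# Toy ontology and annotations (hypothetical names).
parents = {
    "tyr_kinase": ["kinase"],
    "kinase": ["catalytic"],
    "catalytic": ["root"],
    "binding": ["root"],
}
annotations = {
    "P1": [("tyr_kinase", "IDA")],
    "P2": [("binding", "IDA")],
    "P3": [("catalytic", "IDA")],
    "P4": [("kinase", "IEA")],
}
pos, neg = training_sets("kinase", parents, annotations)
```

Here P3 lands in neither set because it is annotated only to an ancestor of the target, and P4 is skipped because its only annotation carries an excluded evidence code.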

15.
16.

Background

Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs).

Results

The presence of many AMs and the development of new ones create a need to (a) compare different annotations for a single genome and (b) generate an annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. To illustrate BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated with putative functions by up to 27%, while the number of genes without any function assignment is reduced.
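The two tasks named above, comparing annotations and combining them, can be sketched on a toy example. This assumes each annotation is a mapping from gene identifier to a free-text function and that "hypothetical protein" marks an unknown function. The function names and the matching rule (case-insensitive string equality) are simplifications for illustration and do not describe BEACON's actual algorithm.

```python
UNKNOWN = {"", "hypothetical protein"}

def normalize(function):
    """Lower-case and trim a function label for comparison."""
    return function.strip().lower()

def compare(ann_a, ann_b):
    """Count, per gene, how two annotations of the same genome relate."""
    counts = {"same": 0, "different": 0, "only_a": 0, "only_b": 0, "neither": 0}
    for gene in set(ann_a) | set(ann_b):
        fa = normalize(ann_a.get(gene, ""))
        fb = normalize(ann_b.get(gene, ""))
        has_a, has_b = fa not in UNKNOWN, fb not in UNKNOWN
        if has_a and has_b:
            counts["same" if fa == fb else "different"] += 1
        elif has_a:
            counts["only_a"] += 1
        elif has_b:
            counts["only_b"] += 1
        else:
            counts["neither"] += 1
    return counts

def extend(ann_a, ann_b):
    """Extended annotation: every known function proposed by either method."""
    merged = {}
    for gene in set(ann_a) | set(ann_b):
        known = {normalize(f) for f in (ann_a.get(gene, ""), ann_b.get(gene, ""))}
        merged[gene] = sorted(known - UNKNOWN)
    return merged

method_a = {"g1": "DNA gyrase subunit A", "g2": "hypothetical protein", "g3": "ABC transporter"}
method_b = {"g1": "dna gyrase subunit a", "g2": "Sigma factor", "g3": "Permease", "g4": "hypothetical protein"}
summary = compare(method_a, method_b)
extended = extend(method_a, method_b)
```

In the toy data, gene g2 gains a putative function in the extended annotation, which is the effect the abstract reports at genome scale.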

Conclusions

We developed BEACON, a fast tool for automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1826-4) contains supplementary material, which is available to authorized users.

17.

Background

Predicting type-1 Human Immunodeficiency Virus (HIV-1) protease cleavage sites in protein molecules and determining their specificity is an important task that has attracted considerable attention in the research community. Achievements in this area are expected to result in effective drug design (especially for HIV-1 protease inhibitors) against this life-threatening virus. However, some drawbacks (such as the shortage of available training data and the high dimensionality of the feature space) turn this task into a difficult classification problem. Thus, various machine learning techniques, and specifically several classification methods, have been proposed in order to increase the accuracy of the classification model. In addition, for classification problems characterized by few samples and many features, selecting the most relevant features is a major factor in increasing classification accuracy.

Results

We propose for HIV-1 data a consistency-based feature selection approach in conjunction with recursive feature elimination of support vector machines (SVMs). We used various classifiers for evaluating the results obtained from the feature selection process. We further demonstrated the effectiveness of our proposed method by comparing it with a state-of-the-art feature selection method applied on HIV-1 data, and we evaluated the reported results based on attributes which have been selected from different combinations.
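A consistency-based criterion can be illustrated with the inconsistency rate: instances that share the same values on a feature subset but carry different labels count as clashes. The sketch below pairs this measure with a simple backward elimination. It is an assumed, simplified stand-in and does not reproduce the authors' combination with SVM recursive feature elimination. The function names are hypothetical.

```python
from collections import Counter, defaultdict

def inconsistency_rate(rows, labels, subset):
    """Fraction of instances that disagree with the majority label of their pattern.

    Instances are grouped by their values on the features in `subset`; within each
    group, every instance outside the majority class counts as one inconsistency.
    """
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        groups[tuple(row[i] for i in subset)][label] += 1
    clashes = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return clashes / len(rows)

def backward_select(rows, labels, threshold=0.0):
    """Drop features one at a time while the inconsistency rate stays within threshold."""
    subset = list(range(len(rows[0])))
    for feature in list(subset):
        trial = [i for i in subset if i != feature]
        if trial and inconsistency_rate(rows, labels, trial) <= threshold:
            subset = trial
    return subset

# Toy data: feature 1 is noise; features 0 and 2 each determine the label.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = [0, 0, 1, 1]
selected = backward_select(rows, labels)
```

The result depends on the elimination order, so redundant features such as 0 and 2 in the toy data are resolved in favor of whichever is tested last.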

Conclusion

Applying feature selection to training data before performing the classification task seems to be a reasonable data-mining process when working with types of data similar to HIV-1. On HIV-1 data, some feature selection or extraction operations in conjunction with different classifiers have been tested and noteworthy outcomes have been reported. These facts motivate the work presented in this paper.

Software availability

The software is available at http://ozyer.etu.edu.tr/c-fs-svm.rar. It can also be downloaded at esnag.etu.edu.tr/software/hiv_cleavage_site_prediction.rar; the archive includes a readme file that explains how to set up the software.

18.
19.
DNA methylation is an important epigenetic modification involved in gene regulation, which can now be measured using whole-genome bisulfite sequencing. However, cost, complexity of the data, and lack of comprehensive analytical tools are major challenges that keep this technology from becoming widely applied. Here we present BSmooth, an alignment, quality control and analysis pipeline that provides accurate and precise results even with low coverage data, appropriately handling biological replicates. BSmooth is open source software, and can be downloaded from http://rafalab.jhsph.edu/bsmooth.
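Why smoothing helps with low-coverage data can be seen in a small sketch: a single CpG covered by one or two reads gives a very noisy methylation estimate, but pooling counts from nearby CpGs with distance-based weights stabilizes it. The code below uses a plain kernel-weighted ratio with a tricube kernel. This is a simplified stand-in chosen for illustration, not BSmooth's actual estimator, and the function name and `bandwidth` default are assumptions.

```python
def smooth_methylation(positions, methylated, coverage, bandwidth=1000):
    """Kernel-weighted methylation level at each CpG.

    positions:  genomic coordinates of CpGs
    methylated: number of methylated reads at each CpG
    coverage:   total number of reads at each CpG
    Each estimate pools counts from CpGs closer than `bandwidth` bases,
    weighted by a tricube kernel of the distance.
    """
    estimates = []
    for center in positions:
        num = den = 0.0
        for pos, m, n in zip(positions, methylated, coverage):
            d = abs(pos - center) / bandwidth
            if d < 1.0:
                w = (1.0 - d ** 3) ** 3  # tricube kernel weight
                num += w * m
                den += w * n
        estimates.append(num / den if den > 0 else float("nan"))
    return estimates

# Three nearby low-coverage CpGs and one isolated CpG.
positions = [100, 200, 300, 5000]
methylated = [1, 0, 2, 3]
coverage = [1, 1, 2, 4]
levels = smooth_methylation(positions, methylated, coverage)
```

The raw per-CpG levels for the first three sites are 1.0, 0.0, and 1.0; the smoothed values borrow strength from neighbors, while the isolated site keeps its own ratio.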

20.
Hundreds of millions of figures are available in biomedical literature, representing important biomedical experimental evidence. Since text is a rich source of information in figures, automatically extracting such text may assist in the task of mining figure information. A high-quality ground truth standard can greatly facilitate the development of an automated system. This article describes DeTEXT: A database for evaluating text extraction from biomedical literature figures. It is the first publicly available, human-annotated, high quality, and large-scale figure-text dataset with 288 full-text articles, 500 biomedical figures, and 9308 text regions. This article describes how figures were selected from open-access full-text biomedical articles and how annotation guidelines and annotation tools were developed. We also discuss the inter-annotator agreement and the reliability of the annotations. We summarize the statistics of the DeTEXT data and make available evaluation protocols for DeTEXT. Finally we lay out challenges we observed in the automated detection and recognition of figure text and discuss research directions in this area. DeTEXT is publicly available for downloading at http://prir.ustb.edu.cn/DeTEXT/.
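Evaluating text detection against annotated regions is commonly done by matching predicted boxes to ground-truth boxes by intersection over union (IoU). The sketch below assumes axis-aligned boxes given as (x1, y1, x2, y2), a 0.5 IoU threshold, and greedy one-to-one matching. These choices are illustrative assumptions; the abstract does not specify the DeTEXT evaluation protocols.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def precision_recall(predicted, truth, threshold=0.5):
    """Greedily match each predicted box to its best unmatched ground-truth box."""
    unmatched = list(truth)
    hits = 0
    for box in predicted:
        best = max(unmatched, key=lambda t: iou(box, t), default=None)
        if best is not None and iou(box, best) >= threshold:
            unmatched.remove(best)  # each ground-truth region can be matched once
            hits += 1
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall

predicted = [(0, 0, 10, 10), (100, 100, 110, 110)]
truth = [(1, 0, 10, 10)]
scores = precision_recall(predicted, truth)
```

Greedy matching is order-dependent; an evaluation that needs an optimal assignment would use a bipartite matching step instead.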


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.). ICP filing no. 京ICP备09084417号