Similar articles
 20 similar articles found
1.
GSTaxClassifier (Genomic Signature based Taxonomic Classifier) is a program for metagenomics analysis of shotgun DNA sequences. The program includes
  1. a simple but effective algorithm, a modification of the Bayesian method, to predict the most probable genomic origins of sequences at different taxonomical ranks, on the basis of genome databases;
  2. a function to generate genomic profiles of reference sequences with tri-, tetra-, penta-, and hexa-nucleotide motifs for setting a user-defined database;
  3. two different formats (tabular- and tree-based summaries) to display taxonomic predictions with improved analytical methods; and
  4. effective ways to retrieve, search, and summarize results by integrating the predictions into the NCBI tree-based taxonomic information.
GSTaxClassifier takes nucleotide sequences as input and uses a modified Bayesian model to evaluate the genomic signatures shared by metagenomic query sequences and reference genome databases. Simulation studies on numerical data sets showed that GSTaxClassifier could serve as a useful program for metagenomics studies. It is freely available at http://helix2.biotech.ufl.edu:26878/metagenomics/.
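
The idea of scoring a query against k-mer "genomic signature" profiles can be illustrated with a small sketch. This is not GSTaxClassifier's modified Bayesian method, which the abstract does not specify; it is a plain naive Bayes classifier with a uniform prior. The function names (`kmer_profile`, `log_likelihood`, `classify`), the pseudocount, and the toy reference sequences are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product


def kmer_profile(seq, k=4, pseudocount=1.0):
    """Log-frequencies of all 4**k k-mers in a reference sequence (smoothed)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers) + pseudocount * len(kmers)
    return {m: math.log((counts[m] + pseudocount) / total) for m in kmers}


def log_likelihood(query, profile, k=4):
    """Sum of log-probabilities of the query's k-mers under one profile."""
    query = query.upper()
    windows = (query[i:i + k] for i in range(len(query) - k + 1))
    # k-mers with ambiguous bases (e.g. N) are not in the profile and are skipped
    return sum(profile[w] for w in windows if w in profile)


def classify(query, profiles, k=4):
    """Return the best-scoring reference name and all scores (uniform prior)."""
    scores = {name: log_likelihood(query, p, k) for name, p in profiles.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy references; real profiles would be built from genome databases.
profiles = {
    "AT_rich": kmer_profile("ATATATTTAAATATAT" * 20),
    "GC_rich": kmer_profile("GCGCGGCCGCGCCGGC" * 20),
}
best, scores = classify("ATATTTAAATAT", profiles)
```

Changing `k` from 3 to 6 corresponds to the tri- through hexa-nucleotide motifs mentioned in the abstract; larger `k` gives sparser profiles, which is why a pseudocount is used here.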

2.
3.
4.
Schmieder R, Edwards R. PLoS One. 2011;6(3):e17288
High-throughput sequencing technologies have strongly impacted microbiology, providing a rapid and cost-effective way of generating draft genomes and exploring microbial diversity. However, sequences obtained from impure nucleic acid preparations may contain DNA from sources other than the sample. Those sequence contaminations are a serious concern to the quality of the data used for downstream analysis, causing misassembly of sequence contigs and erroneous conclusions. Therefore, the removal of sequence contaminants is a necessary and required step for all sequencing projects. We developed DeconSeq, a robust framework for the rapid, automated identification and removal of sequence contamination in longer-read datasets (150 bp mean read length). DeconSeq is publicly available as standalone and web-based versions. The results can be exported for subsequent analysis, and the databases used for the web-based version are automatically updated on a regular basis. DeconSeq categorizes possible contamination sequences, eliminates redundant hits with higher similarity to non-contaminant genomes, and provides graphical visualizations of the alignment results and classifications. Using DeconSeq, we conducted an analysis of possible human DNA contamination in 202 previously published microbial and viral metagenomes and found possible contamination in 145 (72%) metagenomes with as high as 64% contaminating sequences. This new framework allows scientists to automatically detect and efficiently remove unwanted sequence contamination from their datasets while eliminating critical limitations of current methods. DeconSeq's web interface is simple and user-friendly. The standalone version allows offline analysis and integration into existing data processing pipelines. DeconSeq's results reveal whether the sequencing experiment has succeeded, whether the correct sample was sequenced, and whether the sample contains any sequence contamination from DNA preparation or host. 
In addition, the analysis of 202 metagenomes demonstrated significant contamination of the non-human associated metagenomes, suggesting that this method is appropriate for screening all metagenomes. DeconSeq is available at http://deconseq.sourceforge.net/.
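
The decision described above (flag a read as contamination unless it matches a non-contaminant genome more closely) can be sketched as follows. This is an illustrative sketch only, not DeconSeq's implementation: the `Hit` record, the `classify_read` function, the category labels, and the identity and coverage thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hit:
    identity: float  # percent identity of the best alignment
    coverage: float  # percent of the read covered by the alignment


def classify_read(contam: Optional[Hit], retain: Optional[Hit],
                  min_identity: float = 94.0, min_coverage: float = 90.0) -> str:
    """Label a read from its best hit in a contaminant database and its
    best hit in a database of genomes to retain (thresholds are assumptions)."""

    def passes(hit: Optional[Hit]) -> bool:
        return (hit is not None
                and hit.identity >= min_identity
                and hit.coverage >= min_coverage)

    if not passes(contam):
        return "clean"
    if passes(retain) and retain.identity > contam.identity:
        # The read is more similar to a non-contaminant genome: keep it.
        return "clean"
    if passes(retain):
        return "ambiguous"
    return "contaminant"
```

Keeping an explicit "ambiguous" label, rather than forcing a binary decision, leaves the choice to the user for reads that match both databases about equally well.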

5.
6.

Background

Shared-usage high throughput screening (HTS) facilities are becoming more common in academe as large-scale small molecule and genome-scale RNAi screening strategies are adopted for basic research purposes. These shared facilities require a unique informatics infrastructure that must not only provide access to and analysis of screening data, but must also manage the administrative and technical challenges associated with conducting numerous, interleaved screening efforts run by multiple independent research groups.

Results

We have developed Screensaver, a free, open source, web-based lab information management system (LIMS), to address the informatics needs of our small molecule and RNAi screening facility. Screensaver supports the storage and comparison of screening data sets, as well as the management of information about screens, screeners, libraries, and laboratory work requests. To our knowledge, Screensaver is one of the first applications to support the storage and analysis of data from both genome-scale RNAi screening projects and small molecule screening projects.

Conclusions

The informatics and administrative needs of an HTS facility may be best managed by a single, integrated, web-accessible application such as Screensaver. Screensaver has proven useful in meeting the requirements of the ICCB-Longwood/NSRB Screening Facility at Harvard Medical School, and has provided similar benefits to other HTS facilities.

7.
Despite the current wealth of sequencing data, one‐third of all biochemically characterized metabolic enzymes lack a corresponding gene or protein sequence, and as such can be considered orphan enzymes. They represent a major gap between our molecular and biochemical knowledge, and consequently are not amenable to modern systemic analyses. As 555 of these orphan enzymes have metabolic pathway neighbours, we developed a global framework that utilizes the pathway and (meta)genomic neighbour information to assign candidate sequences to orphan enzymes. For 131 orphan enzymes (37% of those for which (meta)genomic neighbours are available), we associate sequences to them using scoring parameters with an estimated accuracy of 70%, implying functional annotation of 16 345 gene sequences in numerous (meta)genomes. As a case in point, two of these candidate sequences were experimentally validated to encode the predicted activity. In addition, we augmented the currently available genome‐scale metabolic models with these new sequence–function associations and were able to expand the models by on average 8%, with a considerable change in the flux connectivity patterns and improved essentiality prediction.

8.
9.
The genomic record of Humankind's evolutionary roots.   Total citations: 24 (self-citations: 3, citations by others: 21)

10.
Using metagenomic ‘parts lists’ to infer global patterns on microbial ecology remains a significant challenge. To deduce important ecological indicators such as environmental adaptation, molecular trait dispersal, diversity variation and primary production from the gene pool of an ecosystem, we integrated 25 ocean metagenomes with geographical, meteorological and geophysicochemical data. We find that climatic factors (temperature, sunlight) are the major determinants of the biomolecular repertoire of each sample and the main limiting factor on functional trait dispersal (absence of biogeographic provincialism). Molecular functional richness and diversity show a distinct latitudinal gradient peaking at 20°N and correlate with primary production. The latter can also be predicted from the molecular functional composition of an environmental sample. Together, our results show that the functional community composition derived from metagenomes is an important quantitative readout for molecular trait‐based biogeography and ecology.

11.
12.
The identification of disease genes via molecular DNA cloning has revolutionized human genetics and medicine. Both the candidate gene approach and positional cloning have been used successfully. The defects causing Huntington's disease, facioscapulohumeral muscular dystrophy, piebaldism, Hurler/Scheie syndrome, one form of autosomal recessive retinitis pigmentosa, and a second locus for autosomal dominant polycystic kidney disease have recently been localized to chromosome 4. In addition to the rapid progress in the cloning of the 203-megabase chromosome, the presence of more than 60 closely spaced microsatellites on this chromosome will undoubtedly lead to the localization of additional disease genes. In order to consider cloned genes as potential candidates for disorders assigned to chromosome 4, it is important to collect and order all genes with respect to their chromosomal localization. Analysis of cytogenetically visible interstitial and terminal deletions should also be helpful in defining new disease gene loci and in mapping novel genes. These data represent the status quo of the integrated molecular map for chromosome 4.

13.
A new gene encoding an esterase (designated as EstEP16) was identified from a metagenomic library prepared from a sediment sample collected from a deep-sea hydrothermal field in the east Pacific. The open reading frame of this gene encoded 249 amino acid residues. It was cloned and overexpressed in Escherichia coli, and the recombinant protein was purified to homogeneity. The monomeric EstEP16 presented a molecular mass of 51.7 kDa. Enzyme assays using p-nitrophenyl esters with different acyl chain lengths as the substrates confirmed its esterase activity, yielding the highest specific activity with p-nitrophenyl acetate. When p-nitrophenyl butyrate was used as a substrate, recombinant EstEP16 exhibited its highest activity at pH 8.0 and 60 °C. The recombinant enzyme retained about 80% residual activity after incubation at 90 °C for 6 h, which indicated that EstEP16 was thermostable. A homology model of EstEP16 was built with the monoacylglycerol lipase from Bacillus sp. H-257 as a template. The structure showed an α/β-hydrolase fold and indicated the presence of a typical catalytic triad. The activity of EstEP16 was inhibited by addition of phenylmethylsulfonyl fluoride, indicating that it contains a serine residue that plays a key role in the catalytic mechanism.

14.
Plant microbiota (the microorganisms that live in any association with plant tissues) represents a rather unexplored area of metagenomic research compared with soils and oceans. Constructing a metagenomic library for plant microbiota is technically challenging. Using all the biomass without pre-enrichment could lead to vast proportions of the host plant DNA in the metagenomic library, doubtless obliterating the microbial contribution. Therefore, the first and essential step is to enrich for the constituent microorganisms from plant tissues. Here, a strong enrichment for plant microbiota was achieved by coupling SDS (sodium dodecyl sulfate) with NaCl, creating a predominantly microbial metagenomic library that contains 88% bacterial inserts. 16S rDNA sequence analysis revealed that the metagenomic DNA of the enrichments originates from very diverse microorganisms. At least 74 distinct ribotypes (at a 97% threshold) from seven different bacterial phyla were identified, distributed mainly among Actinobacteria and Proteobacteria. Additionally, a simplified version of Amplified Ribosomal DNA Restriction Analysis (ARDRA) was developed for a quick and efficient assessment of the enrichment procedures. This work opens further insight into the great biotechnical potential of plant microbiota, holding more potential for drug discovery through a metagenomic strategy, and paving the way for recovery and biochemical characterization of the functional gene repertoire from plant microbiota.

15.
16.
17.
Metagenomes from various environmental soils were screened using alpha-naphthyl acetate and Fast Blue RR for a novel ester-hydrolyzing enzyme on Escherichia coli. Stepwise fragmentation and subcloning of the initial insert DNA (30-40 kb), using restriction enzymes selected to exclude already known esterases, with subsequent screenings resulted in a positive clone with a 2.5-kb DNA fragment. The cloned sequence included an open reading frame consisting of 1089 bp, designated as est25, encoding a protein of 363 amino acids with a molecular mass of about 38.3 kDa. Amino acid sequence analysis revealed only moderate identity (≤ 48%) to the known esterases/lipases in the databases containing the conserved sequence motifs of esterases/lipases, such as HGGG (residues 124-127), GxSxG (residues 199-203), and the putative catalytic triad composed of Ser201, Asp303, and His333. Est25 was functionally overexpressed in a soluble form in E. coli with optimal activity at pH 7.0 and 25 °C. The purified Est25 exhibited hydrolyzing activity toward p-nitrophenyl (NP)-fatty acyl esters with short-length acyl chains (≤ C6) with the highest activity toward p-NP-acetate (Km = 1.0 mM and Vmax = 63.7 U/mg), but not with chain lengths ≥ C8, demonstrating that Est25 is an esterase most likely originating from a mesophilic microorganism in soils. Est25 efficiently hydrolyzed (R,S)-ketoprofen ethyl ester with Km of 16.4 mM and Vmax of 59.1 U/mg with slight enantioselectivity toward (R)-ketoprofen ethyl ester. This study demonstrates that functional screening combined with the sequential use of restriction enzymes to exclude already known enzymes is a useful approach for isolating novel enzymes from a metagenome.
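
The reported Km and Vmax values can be turned into predicted reaction rates with the Michaelis-Menten equation, v = Vmax·[S] / (Km + [S]). The sketch below assumes that Est25 follows simple Michaelis-Menten kinetics, which is what reporting Km and Vmax implies; the function name and the substrate concentrations chosen are illustrative and not taken from the study.

```python
def michaelis_menten(substrate_mM, km_mM, vmax):
    """Initial rate (same units as vmax) at a given substrate concentration."""
    if substrate_mM < 0 or km_mM <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return vmax * substrate_mM / (km_mM + substrate_mM)


# Kinetic constants quoted in the abstract (Vmax in U/mg, Km in mM).
PNP_ACETATE = {"km_mM": 1.0, "vmax": 63.7}
KETOPROFEN_ETHYL_ESTER = {"km_mM": 16.4, "vmax": 59.1}

# Predicted rates for both substrates at the same, arbitrarily chosen 2 mM.
rate_acetate = michaelis_menten(2.0, **PNP_ACETATE)
rate_ketoprofen = michaelis_menten(2.0, **KETOPROFEN_ETHYL_ESTER)
```

At a substrate concentration equal to Km the rate is half of Vmax, so the much higher Km for the ketoprofen ester means that a far higher substrate concentration is needed to approach a similar maximal rate.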

18.
A novel esterase gene was isolated by functional screening of a metagenomic library prepared from an activated sludge sample. The gene (est-XG2) consists of 1,506 bp with GC content of 74.8 %, and encodes a protein of 501 amino acids with a molecular mass of 53 kDa. Sequence alignment revealed that Est-XG2 shows a maximum amino acid identity (47 %) with the carboxylesterase from Thermaerobacter marianensis DSM 12885 (YP_004101478). The catalytic triad of Est-XG2 was predicted to be Ser192-Glu313-His412 with Ser192 in a conserved pentapeptide (GXSXG), and further confirmed by site-directed mutagenesis. Phylogenetic analysis suggested Est-XG2 belongs to the bacterial lipase/esterase family VII. The recombinant Est-XG2, expressed and purified from Escherichia coli, preferred to hydrolyze short and medium length p-nitrophenyl esters with the best substrate being p-nitrophenyl acetate (Km and kcat of 0.33 mM and 36.21 s⁻¹, respectively). The purified enzyme also had the ability to cleave sterically hindered esters of tertiary alcohols. Biochemical characterization of Est-XG2 revealed that it is a thermophilic esterase that exhibits optimum activity at pH 8.5 and 70 °C. Est-XG2 had moderate tolerance to organic solvents and surfactants. The unique properties of Est-XG2, high thermostability and stability in the presence of organic solvents, may render it a potential candidate for industrial applications.

19.
With the expanding availability of sequencing technologies, research previously centered on the human genome can now afford to include the study of humans' internal ecosystem (human microbiome). Given the scale of the data involved in this metagenomic research (two orders of magnitude larger than the human genome) and their importance in relation to human health, it is crucial to guarantee (along with the appropriate data collection and taxonomy) proper tools for data analysis. We propose to adapt the approaches defined for the analysis of gene-expression microarrays in order to infer information in metagenomics. In particular, we applied SAM, a broadly used tool for the identification of differentially expressed genes among different sample classes, to a reported dataset on a research model with mice of two genotypes (a high density lipoprotein knockout mouse and its wild-type counterpart). The data contain two different diets (high-fat or normal-chow) to ensure the onset of obesity, prodrome of metabolic syndromes (MS). By using the 16S rRNA gene as a genomic diversity marker, we illustrate how this approach can identify bacterial populations differentially enriched among different genetic and dietary conditions of the host. This approach faithfully reproduces highly relevant results from phylogenetic and standard statistical analyses, used to explain the role of the gut microbiome in relation to obesity. This represents a promising proof-of-principle for using functional genomic approaches in the fast growing area of metagenomics, and warrants the availability of a large body of thoroughly tested and theoretically sound methodologies to this exciting new field.
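
The core of SAM is a moderated t-like score per feature: the difference in group means divided by a pooled standard error plus a small constant s0 that damps features with very low variance. The sketch below computes only that score for one taxon's abundances in two groups. It omits SAM's permutation-based false discovery rate estimate, and it takes s0 as a fixed, assumed parameter, whereas SAM chooses it from the data. The function name and the example abundances are hypothetical.

```python
import math
from statistics import mean


def sam_d(group_a, group_b, s0=0.1):
    """SAM-style relative difference d = (mean_a - mean_b) / (s + s0)."""
    n1, n2 = len(group_a), len(group_b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    m1, m2 = mean(group_a), mean(group_b)
    # Pooled sum of squared deviations across both groups
    ss = sum((x - m1) ** 2 for x in group_a) + sum((y - m2) ** 2 for y in group_b)
    s = math.sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
    return (m1 - m2) / (s + s0)


# Hypothetical relative abundances of one bacterial taxon under two diets.
high_fat = [10.0, 12.0, 11.0, 13.0]
normal_chow = [2.0, 3.0, 2.0, 3.0]
d = sam_d(high_fat, normal_chow)
```

In a full analysis this score would be computed for every taxon and compared with the scores obtained after permuting the sample labels, which is how significance is assessed in this kind of method.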

20.
A microarray‐based approach was used to screen a soil metagenome for the presence of blue light (BL) photoreceptor‐encoding genes. The microarray carried 149 different 54‐mer oligonucleotides, derived from consensus sequences of light, oxygen and voltage (LOV) domain BL photoreceptor genes. Calibration of the microarrays allowed the detection of minimally 50 ng of genomic DNA against a background of 2–5 μg of genomic DNA. Identification of a positive cosmid clone was still possible for an amount of 0.25 ng against a background of 10 μg of labelled DNA clones. The array could readily identify targets carrying 4% sequence mismatch. Using the LOV microarray, up to 1200 library clones in concentrations of c. 20 ng each with a c. 40 kb insert size could be screened in a single batch. After calibration and reliability controls, the microarray was probed with cosmid‐cloned DNA from the thermophilic fraction of a soil sample. From this approach, a novel gene was isolated that encodes a protein consisting of several Per‐Arnt‐Sim domains, a LOV domain associated to a histidine kinase and a response regulator domain. The novel gene showed highest similarity to a known sequence from Kineococcus radiotolerans SRS30216 (58% identity for the LOV domain only) and to a gene from Methylibium petroleiphilum PM1 (57% identity). The gene, designated as ht‐met1 (Hamburg Thermophile Metagenome 1), was isolated and fully sequenced (3615 bp). ht‐met1 is followed by a second open reading frame encoding a Fe‐chelatase, an arrangement quite frequent for BL photoreceptors. The LOV domain region of ht‐met1 was subcloned and expressed yielding a fully functional, flavin‐containing LOV domain. Irradiation generated the typical LOV photochemistry, with the transient formation of a flavin‐protein photoadduct. The dark recovery lifetime was found as τREC = 120 s (20°C) and is among the fastest ones determined so far for bacterial LOV domains.
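
To give a feel for what a dark recovery lifetime of 120 s means, the sketch below assumes that the photoadduct decays mono-exponentially in the dark, so the fraction remaining after time t is exp(-t / τREC). The mono-exponential model is an assumption made for illustration; the abstract reports only the lifetime. The function names are hypothetical.

```python
import math

TAU_REC_S = 120.0  # dark recovery lifetime reported at 20 °C, in seconds


def adduct_fraction(t_s, tau_s=TAU_REC_S):
    """Fraction of photoadduct remaining after t_s seconds in the dark."""
    if t_s < 0 or tau_s <= 0:
        raise ValueError("time must be >= 0 and lifetime must be > 0")
    return math.exp(-t_s / tau_s)


def half_life(tau_s=TAU_REC_S):
    """Time at which half of the photoadduct has reverted to the dark state."""
    return tau_s * math.log(2)
```

Under this model the lifetime is the time after which the adduct population has fallen to 1/e of its initial value, and the half-life is shorter than the lifetime by a factor of ln 2.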
