Similar literature
1.
Prediction of effective genome size in metagenomic samples   (total citations: 3; self-citations: 0; citations by others: 3)
We introduce a novel computational approach to predict effective genome size (EGS; a measure that includes multiple plasmid copies, inserted sequences, and associated phages and viruses) from short sequencing reads of environmental genomics (or metagenomics) projects. We observe considerable EGS differences between environments and link this with ecologic complexity as well as species composition (for instance, the presence of eukaryotes). For example, we estimate EGS in a complex, organism-dense farm soil sample at about 6.3 megabases (Mb) whereas that of the bacteria therein is only 4.7 Mb; for bacteria in a nutrient-poor, organism-sparse ocean surface water sample, EGS is as low as 1.6 Mb. The method also permits evaluation of completion status and assembly bias in single-genome sequencing projects.

2.

Background

A metagenomic sample is a set of DNA fragments, randomly extracted from multiple cells in an environment, belonging to distinct, often unknown species. Unsupervised metagenomic clustering aims at partitioning a metagenomic sample into sets that approximate taxonomic units, without using reference genomes. Since samples are large and steadily growing, space-efficient clustering algorithms are strongly needed.

Results

We design and implement a space-efficient algorithmic framework that solves a number of core primitives in unsupervised metagenomic clustering using just the bidirectional Burrows-Wheeler index and a union-find data structure on the set of reads. When run on a sample of total length n, with m reads of maximum length ℓ each, on an alphabet of total size σ, our algorithms take O(n(t + log σ)) time and just 2n + o(n) + O(max{ℓσ log n, K log m}) bits of space in addition to the index and to the union-find data structure, where K is a measure of the redundancy of the sample and t is the query time of the union-find data structure.

Conclusions

Our experimental results show that our algorithms are practical, they can exploit multiple cores by a parallel traversal of the suffix-link tree, and they are competitive both in space and in time with the state of the art.
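
As a rough illustration of the union-find primitive mentioned in the Results, the sketch below merges reads that share at least one k-mer. It is not the authors' framework: it uses an ordinary dictionary in place of the bidirectional Burrows-Wheeler index, ignores reverse complements, and makes no attempt to meet the stated space bounds. The names `UnionFind` and `cluster_reads`, the default `k=4`, and the toy reads are all assumptions made for this example.

```python
class UnionFind:
    """Disjoint-set forest with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller tree under the larger
        self.size[ra] += self.size[rb]


def cluster_reads(reads, k=4):
    """Group read indices so that reads sharing any k-mer end up together."""
    uf = UnionFind(len(reads))
    first_seen = {}  # k-mer -> index of the first read that contained it
    for i, read in enumerate(reads):
        for j in range(len(read) - k + 1):
            kmer = read[j:j + k]
            if kmer in first_seen:
                uf.union(i, first_seen[kmer])
            else:
                first_seen[kmer] = i
    clusters = {}
    for i in range(len(reads)):
        clusters.setdefault(uf.find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical toy reads: 0 and 1 share "GTAC", 2 and 3 share "TTTT".
reads = ["ACGTAC", "GTACGG", "TTTTTT", "CTTTTA"]
print(cluster_reads(reads))
```

Each `find` or `union` call plays the role of the union-find query whose cost is denoted t in the abstract's time bound.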

3.
Fruits harbour abundant and diverse microbial communities that protect them from post-harvest pathogens. Identification of functional traits associated with a given microbiota can provide a better understanding of their potential influence. Here, we focused on the epiphytic microbiome of apple fruit. We suggest that shotgun metagenomic data can indicate specific functions carried out by different groups and provide information on their potential impact. Samples were collected from the surface of ‘Golden Delicious’ apples from four orchards that differ in their geographic location and management practice. Approximately 1 million metagenes were predicted based on a high-quality assembly. Functional profiling of the microbiome of fruits from orchards differing in their management practice revealed a functional shift in the microbiota. The organic orchard microbiome was enriched in pathways involved in plant defence activities; the conventional orchard microbiome was enriched in pathways related to the synthesis of antibiotics. The functional significance of the variations was explored using microbial network modelling algorithms to reveal the metabolic role of specific phylogenetic groups. The analysis identified several associations supported by other published studies. For example, the analysis revealed the nutritional dependencies of the Capnodiales group, including the Alternaria pathogen, on aromatic compounds.

4.
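As metagenomic data continue to accumulate, it has become feasible to build an integrated analysis platform of high-quality metagenomic samples (also called "microbial communities"), so that such samples can be analyzed, compared, and searched effectively and deeper biological meaning can be drawn from them. However, most current metagenomic databases provide only simple data storage, with poor sample annotation or very few analysis functions. In addition, existing methods for computing the similarity of microbial community data can handle only a very limited amount of sample data. Researchers have long sought effective ways to compute similarity among massive numbers of microbial communities, in order to study how similar samples are and to uncover correlations in metagenomic data. Meta-Mesh is a new online metagenomic analysis system comprising a metagenomic database and an analysis platform; it supports systematic and efficient analysis of metagenomic samples, as well as community-structure comparison and accurate search of samples. The database has collected more than 7,000 high-quality, validly annotated samples from the public domain and from in-house laboratories. The analysis platform provides a range of online tools for community structure analysis and annotation and for multi-angle comparison, and it can efficiently search the database for similar samples using a fast indexing strategy and a community-structure similarity algorithm. Meta-Mesh demonstrated its analytical performance through a series of metagenomic case studies, including "database search and identification of human microbial community samples" and "sample clustering based on a similarity matrix". As an online metagenomic database and analysis system, Meta-Mesh will serve fields that require rapid analysis, identification, comparison, and search of metagenomic samples.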

5.
The taxonomic analysis of sequencing data has become important in many areas of life sciences. However, currently available tools for that purpose either consume large amounts of RAM or yield insufficient quality and robustness. Here, we present kASA, a k-mer based tool capable of identifying and profiling metagenomic DNA or protein sequences with high computational efficiency and a user-definable memory footprint. We ensure both high sensitivity and precision by using an amino acid-like encoding of k-mers together with a range of multiple k’s. Custom algorithms and data structures optimized for external memory storage enable a full-scale taxonomic analysis without compromise on laptop, desktop, and HPCC.
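
To make the idea of amino acid-like k-mers over a range of k values more concrete, here is a minimal sketch. It is not kASA's implementation: the abstract does not specify the encoding, so the standard genetic code is used as a stand-in, and the function names `translate` and `kmer_profile` and the bounds `k_min` and `k_max` are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order (assumed stand-in for kASA's encoding).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate one forward reading frame; unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def kmer_profile(dna, k_min=2, k_max=3):
    """Count amino-acid k-mers over the three forward frames for each k."""
    profile = Counter()
    for frame in range(3):
        protein = translate(dna, frame)
        for k in range(k_min, k_max + 1):
            for i in range(len(protein) - k + 1):
                profile[protein[i:i + k]] += 1
    return profile


print(translate("ATGGCCAAA"))  # frame 0 reads the codons ATG, GCC, AAA
print(kmer_profile("ATGGCCAAA"))
```

Working in translated space means that synonymous DNA changes map to the same k-mer, which is one plausible reason such an encoding could improve sensitivity. A fuller version would also process the reverse-complement frames.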

6.
7.
The ability to isolate and measure multiple complex analytes in a single biological sample holds great potential in many biomedical fields, especially immunology and diagnostic clinical chemistry. We have developed a procedure involving recycling immunoaffinity chromatography for the simultaneous measurement of a number of analytes in a single sample. The procedure is based on the passage of a fluorochrome-labelled sample through a battery of small immunoaffinity columns, each column extracting a single analyte. Detection is achieved by acid elution of the bound analytes and laser-induced fluorescence. We have applied this system to a number of different biological fluids and found that it is capable of reliably isolating and measuring up to ten different cytokines in a 25-μl sample of human body fluid.

8.
Larsen K. Biometrics, 2004, 60(1): 85-92
Multiple categorical variables are commonly used in medical and epidemiological research to measure specific aspects of human health and functioning. To analyze such data, models have been developed considering these categorical variables as imperfect indicators of an individual's "true" status of health or functioning. In this article, the latent class regression model is used to model the relationship between covariates, a latent class variable (the unobserved status of health or functioning), and the observed indicators (e.g., variables from a questionnaire). The Cox model is extended to encompass a latent class variable as predictor of time-to-event, while using information about latent class membership available from multiple categorical indicators. The expectation-maximization (EM) algorithm is employed to obtain maximum likelihood estimates, and standard errors are calculated based on the profile likelihood, treating the nonparametric baseline hazard as a nuisance parameter. A sampling-based method for model checking is proposed. It allows for graphical investigation of the assumption of proportional hazards across latent classes. It may also be used for checking other model assumptions, such as no additional effect of the observed indicators given latent class. The usefulness of the model framework and the proposed techniques are illustrated in an analysis of data from the Women's Health and Aging Study concerning the effect of severe mobility disability on time-to-death for elderly women.

9.
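Research progress in metagenomic library analysis techniques   (total citations: 2; self-citations: 0; citations by others: 2)
Li Wu, Zhao Yong, Wang Yujiong. Acta Ecologica Sinica (生态学报), 2007, 27(5): 2070-2076
The continual emergence and maturation of new analytical techniques has driven the birth and rapid development of microbial molecular ecology and related disciplines. Metagenomic library analysis is one such technique that has risen in microbial molecular ecology in recent years. This review discusses the background against which metagenomic analysis arose and the principles of the technique, and focuses on its applications in discovering new genes, developing new bioactive substances, studying microbial diversity in communities, and sequencing the human metagenome. It also summarizes the metagenomic library screening methods commonly used internationally, such as PCR-based screening, fluorescent in situ hybridization (FISH), substrate induced gene expression screening (SIGEX), and gene chips; analyzes and discusses the strengths and weaknesses of each; points out the main problems currently facing metagenomic library analysis; and offers an outlook on its future development.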

10.
In the present study, the conditions of analysis by inductively coupled plasma-atomic emission spectrometry (ICP-AES) were investigated. Twenty-six elements (Mg: 25 ppm; Sc: 10 ppm; Ti: 50 ppm; others: 100 ppm) were used as interfering elements at 24 selected wavelengths. The background values for 19 elements were affected to some extent. However, all of these effects disappeared at low concentrations of the interfering elements (less than 1 ppm). Next, the values from the ordinary calibration method were compared with those from the standard addition method using several biological samples. Depending on the sample, the results of the two methods differed, and three patterns were observed. However, no discrepancy was observed between the two methods in the values for the standard reference materials. There was no significant difference between the certified values of the standard reference materials and those obtained by ICP. Therefore, the analytical wavelengths and methods used in the present study were suggested to be useful for ICP-AES analysis of environmental and/or biological samples.
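
The difference between the two quantification methods compared above can be sketched numerically. The example below is a generic textbook version, not the study's procedure: all concentrations and signal values are invented for illustration, and the function names are hypothetical. External calibration fits a line to clean standards and reads the sample off that line; standard addition spikes the sample itself, so the fitted slope includes any matrix effect, and the unknown concentration is the intercept divided by the slope.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def external_calibration(std_ppm, std_signals, sample_signal):
    """Concentration read from a calibration line built on clean standards."""
    slope, intercept = linear_fit(std_ppm, std_signals)
    return (sample_signal - intercept) / slope


def standard_addition(added_ppm, signals):
    """Concentration from spiked aliquots: magnitude of the x-intercept."""
    slope, intercept = linear_fit(added_ppm, signals)
    return intercept / slope


# Invented data: the sample matrix suppresses sensitivity (slope 5 vs 6.25).
clean_std_ppm, clean_std_signal = [0.0, 2.0, 4.0], [0.0, 12.5, 25.0]
added_ppm, spiked_signal = [0.0, 1.0, 2.0, 3.0], [10.0, 15.0, 20.0, 25.0]

print(external_calibration(clean_std_ppm, clean_std_signal, spiked_signal[0]))
print(standard_addition(added_ppm, spiked_signal))
```

With these made-up numbers the two estimates disagree because the slope in the sample matrix differs from the slope in clean standards, which is the kind of sample-dependent discrepancy the abstract reports.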

11.
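Strains are the most basic biological entities in microbial research, and their functional diversity has an important influence on host phenotype. As microbiome research deepens, strain-level analysis of the composition and function of complex microbial communities is of great value for both basic research and clinical application. This article introduces the mainstream algorithms for strain analysis based on metagenomic data, along with the potential applications of strain analysis in microbiome research and its future directions.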

12.
General models of the evolution of cooperation, altruism and other social behaviours have focused almost entirely on single traits, whereas it is clear that social traits commonly interact. We develop a general kin-selection framework for the evolution of social behaviours in multiple dimensions. We show that whenever there are interactions among social traits new behaviours can emerge that are not predicted by one-dimensional analyses. For example, a prohibitively costly cooperative trait can ultimately be favoured owing to initial evolution in other (cheaper) social traits that in turn change the cost–benefit ratio of the original trait. To understand these behaviours, we use a two-dimensional stability criterion that can be viewed as an extension of Hamilton's rule. Our principal example is the social dilemma posed by, first, the construction and, second, the exploitation of a shared public good. We find that, contrary to the separate one-dimensional analyses, evolutionary feedback between the two traits can cause an increase in the equilibrium level of selfish exploitation with increasing relatedness, while both social (production plus exploitation) and asocial (neither) strategies can be locally stable. Our results demonstrate the importance of emergent stability properties of multidimensional social dilemmas, as one-dimensional stability in all component dimensions can conceal multidimensional instability.
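
For reference, the classical one-dimensional Hamilton's rule that the abstract extends says that a social trait is favoured when the benefit to recipients, weighted by relatedness, exceeds the cost to the actor. The symbols r (relatedness), b (benefit), and c (cost) are the conventional ones and do not appear in the abstract; the paper's two-dimensional stability criterion is not reproduced here.

```latex
r\,b - c > 0
```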

13.
Metagenomes represent an unlimited resource for the discovery of novel genes. Here we report sequence analysis of a salt-tolerant metagenomic clone (6B4) from a pond water metagenomic library. Clone 6B4 had an insert of 2254 bp with a G+C composition of 64.06%. The DNA sequence from 6B4 showed homology to DNA sequences from proteobacteria, indicating that the 6B4 metagenomic insert originated from an as-yet uncharacterized proteobacterium. Two encoded proteins from clone 6B4 matched the ATP-dependent Clp protease adaptor protein (ClpS) and phasin, while two truncated encoded proteins matched poly-3-hydroxybutyrate synthase and permease. The Clp complex is known to play a role in stress tolerance. Expression of ClpS from the metagenomic clone is proposed to be responsible for the salt tolerance of clone 6B4.
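
The G+C composition quoted above is a simple ratio, sketched below. The function name `gc_content` is hypothetical, and the choice to exclude ambiguous bases such as N from the denominator is an assumption of this example, not something stated in the abstract.

```python
def gc_content(seq):
    """Return G+C percentage over unambiguous bases (A, C, G, T) only."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Toy sequence, not the 6B4 insert: 4 of 6 bases are G or C.
print(round(gc_content("GGCCAT"), 2))
```

By the same ratio, a 2254 bp insert at 64.06% G+C corresponds to roughly 1444 G or C bases.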

14.
A quantitative PCR procedure targeting the β-tubulin gene determined the number of Rotylenchulus reniformis Linford & Oliveira 1940 in metagenomic DNA samples isolated from soil. Of note, this outcome was in the presence of other soil-dwelling plant parasitic nematodes including its sister genus Helicotylenchus Steiner, 1945. The methodology provides a framework for molecular diagnostics of nematodes from metagenomic DNA isolated directly from soil.

15.
Several PCR methods have recently been developed to identify fecal contamination in surface waters. In all cases, researchers have relied on one gene or one microorganism for selection of host-specific markers. Here we describe the application of a genome fragment enrichment (GFE) method to identify host-specific genetic markers from fecal microbial community DNA. As a proof of concept, bovine fecal DNA was challenged against a porcine fecal DNA background to select for bovine-specific DNA sequences. Bioinformatic analyses of 380 bovine enriched metagenomic sequences indicated a preponderance of Bacteroidales-like regions predicted to encode membrane-associated and secreted proteins. Oligonucleotide primers capable of annealing to select Bacteroidales-like bovine GFE sequences exhibited extremely high specificity (>99%) in PCR assays with total fecal DNAs from 279 different animal sources. These primers also demonstrated a broad distribution of corresponding genetic markers (81% positive) among 148 different bovine sources. These data demonstrate that direct metagenomic DNA analysis by the competitive solution hybridization approach described is an efficient method for identifying potentially useful fecal genetic markers and for characterizing differences between environmental microbial communities.

16.
Given the absence of universal marker genes in the viral kingdom, researchers typically use BLAST (with stringent E-values) for taxonomic classification of viral metagenomic sequences. Since the majority of metagenomic sequences originate from hitherto unknown viral groups, using stringent E-values results in most sequences remaining unclassified. Furthermore, using less stringent E-values results in a high number of incorrect taxonomic assignments. The SOrt-ITEMS algorithm provides an approach to address the above issues. Based on alignment parameters, SOrt-ITEMS follows an elaborate work-flow for assigning reads originating from hitherto unknown archaeal/bacterial genomes. In SOrt-ITEMS, alignment parameter thresholds were generated by observing patterns of sequence divergence within and across various taxonomic groups belonging to bacterial and archaeal kingdoms. However, many taxonomic groups within the viral kingdom lack a typical Linnean-like taxonomic hierarchy. In this paper, we present ProViDE (Program for Viral Diversity Estimation), an algorithm that uses a customized set of alignment parameter thresholds, specifically suited for viral metagenomic sequences. These thresholds capture the pattern of sequence divergence and the non-uniform taxonomic hierarchy observed within/across various taxonomic groups of the viral kingdom. Validation results indicate that the percentage of 'correct' assignments by ProViDE is around 1.7 to 3 times higher than that by the widely used similarity based method MEGAN. The misclassification rate of ProViDE is around 3 to 19% (as compared to 5 to 42% by MEGAN), indicating significantly better assignment accuracy. ProViDE software and a supplementary file (containing supplementary figures and tables referred to in this article) are available for download from http://metagenomics.atc.tcs.com/binning/ProViDE/

17.
18.
19.
20.
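Objective: To evaluate the effect of different storage conditions for fecal samples on the results of gut microecology studies. Methods: Experiments were designed to compare seven storage methods; using high-throughput sequencing, differences in fecal-sample DNA quality, microbial diversity, and pathogen detection were compared across storage times and storage temperatures. Results: The study confirmed that immediate nucleic acid extraction or storage at -20 °C is still recommended for collected fecal samples. Comparison of the various storage methods also showed that using different sample...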
