1.
Clustering 16S rRNA for OTU prediction: a method of unsupervised Bayesian clustering (cited by: 1; self-citations: 0; citations by others: 1)
MOTIVATION: With advances in next-generation sequencing technology, it is now possible to study samples obtained directly from the environment. In particular, 16S rRNA gene sequences have frequently been used to profile the diversity of organisms in a sample. However, such studies still struggle to determine both the number of operational taxonomic units (OTUs) and their relative abundance in a sample. RESULTS: To address these challenges, we propose an unsupervised Bayesian clustering method termed Clustering 16S rRNA for OTU Prediction (CROP). CROP can find clusters based on the natural organization of data without setting a hard cut-off threshold (3%/5%) as required by hierarchical clustering methods. By applying our method to several datasets, we demonstrate that CROP is robust against sequencing errors and that it produces more accurate results than conventional hierarchical clustering methods. Availability and Implementation: Source code freely available at the following URL: http://code.google.com/p/crop-tingchenlab/, implemented in C++ and supported on Linux and MS Windows.
2.
Recent studies of 16S rRNA sequences through next-generation sequencing have revolutionized our understanding of microbial community composition and structure. One common approach to using these data to explore the genetic diversity in a microbial community is to cluster the 16S rRNA sequences into Operational Taxonomic Units (OTUs) based on sequence similarities. The inferred OTUs can then be used to estimate species, diversity, composition, and richness. Although a number of methods have been developed and are commonly used to cluster the sequences into OTUs, relatively little guidance is available on their relative performance and the choice of key parameters for each method. In this study, we conducted a comprehensive evaluation of ten existing OTU inference methods. We found that the appropriate dissimilarity value for defining distinct OTUs depends not only on the specific method but also on the sample complexity. For data sets with low complexity, all the algorithms need a higher dissimilarity threshold to define OTUs. Some methods, such as CROP and SLP, are more robust to the specific choice of the threshold than other methods, especially for shorter reads. For high-complexity data sets, hierarchical clustering methods need a stricter dissimilarity threshold to define OTUs because the commonly used dissimilarity threshold of 3% often leads to an under-estimation of the number of OTUs. In general, hierarchical clustering methods perform better at lower dissimilarity thresholds. Our results show that sequence abundance plays an important role in OTU inference. We conclude that care is needed to choose both a threshold for dissimilarity and abundance for OTU inference.
3.
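Analysis and identification of the 16S rRNA sequence of strain 11371 (cited by: 1; self-citations: 0; citations by others: 1)
The 16S rRNA gene of strain 11371 was amplified by PCR and sequenced. The sequence was submitted to GenBank (accession number DQ531606) and compared with those of other Streptomyces species, and a phylogenetic tree of the 16S rRNA gene sequences was obtained with the DNAStar software. The morphological, cultural, physiological and biochemical characteristics of strain 11371 were also identified by the coverslip insertion method, microscopic observation and other methods. The results show that the 16S rRNA sequence of 11371 shares a certain degree of homology with those of other Streptomyces species. Combined with the physiological and biochemical identification results, the strain was further determined to be a new subspecies of Streptomyces ahygroscopicus (Streptomyces ahygroscopicus subsp. wuzhouensis n. subsp.). The 16S rRNA sequence of strain 11371 is the first Streptomyces ahygroscopicus 16S rRNA sequence in GenBank.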
4.
Wei Chen Yongmei Cheng Clarence Zhang Shaowu Zhang Hongyu Zhao 《Journal of microbiological methods》2013
Recent developments in next-generation sequencing technologies have led to rapid accumulation of 16S rRNA sequences for microbiome profiling. One key step in data processing is to cluster short sequences into operational taxonomic units (OTUs). Although many methods have been proposed for OTU inference, a major challenge is the balance between inference accuracy and computational efficiency, where inference accuracy is often sacrificed to accommodate the need to analyze large numbers of sequences. Inspired by the hierarchical clustering method and a modified greedy network clustering algorithm, we propose a novel multi-seeds based heuristic clustering method, named MSClust, for OTU inference. MSClust first adaptively selects multiple seeds instead of one seed for each candidate cluster, and the reads are then processed using a greedy clustering strategy. Through many numerical examples, we demonstrate that MSClust uses less memory and achieves better biological accuracy than existing heuristic clustering methods while preserving efficiency and scalability.
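For orientation, the single-seed greedy strategy that heuristic OTU methods build on can be sketched as follows. This is not MSClust and does not reproduce its adaptive multi-seed selection. It is a minimal baseline under stated assumptions: reads are pre-aligned and of equal length, similarity is the fraction of matching positions, and the names `identity` and `greedy_otu_clusters` are hypothetical.

```python
from typing import List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length reads."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes pre-aligned, equal-length reads")
    if not a:
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_otu_clusters(reads: List[str], threshold: float = 0.97) -> List[List[str]]:
    """Single-seed greedy clustering (an illustrative baseline, not MSClust).

    Each read joins the first cluster whose seed (its first member) it matches
    at or above `threshold`; otherwise it starts a new cluster as the seed.
    """
    clusters: List[List[str]] = []
    for read in reads:
        for cluster in clusters:
            seed = cluster[0]
            if identity(read, seed) >= threshold:
                cluster.append(read)
                break
        else:
            # No existing seed was similar enough: open a new cluster.
            clusters.append([read])
    return clusters
```

Because each cluster is represented by one seed only, the result depends on read order; using several seeds per candidate cluster, as the abstract describes, is one way to reduce that sensitivity.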
5.
6.
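Using multiple primer pairs, partial sequences of the 16S rRNA and 18S rRNA genes of the large yellow croaker were amplified and determined. Their lengths are 1202 bp and 1275 bp, respectively; the GC content of the 16S rRNA gene sequence is 46.12% and that of the 18S rRNA gene is 53.00%. The 16S rRNA gene sequence of the large yellow croaker was combined with homologous sequences from 15 teleost species in GenBank, and its 18S rRNA gene sequence with homologous sequences from 9 chordate species in GenBank. Software was used to obtain the percentage differences between sequences, the numbers of transitions and transversions, and related information. Based on these two gene sequences, molecular phylogenetic trees of 16 teleost species and 10 chordate species were constructed with the neighbor-joining (NJ) and Bayesian inference (BI) methods. The tree built from 18S rRNA comprises three major branches: one with 6 species of mammals, birds and reptiles, one with 1 amphibian species, and one with 2 teleost species. The tree built from 16S rRNA shows that Sciaenidae, the family to which the large yellow croaker belongs, is closely related to Percidae and Pomacanthidae. The sequence characteristics of the two genes are also discussed.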
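The summary statistics mentioned in this abstract (GC content, and transition and transversion counts between aligned sequences) can be computed as in the sketch below. It is illustrative only and is not the software used in the study; the function names are hypothetical, and it assumes DNA-alphabet input in which gaps and ambiguous bases are simply skipped.

```python
from typing import Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(b in "GC" for b in counted)
    return 100.0 * gc / len(counted)


def transitions_transversions(a: str, b: str) -> Tuple[int, int]:
    """Count transitions and transversions between two aligned sequences.

    Positions with a gap or an ambiguous base in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition;
        # any purine<->pyrimidine change is a transversion.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            ts += 1
        else:
            tv += 1
    return ts, tv
```

For example, `gc_content("ATGC")` gives 50.0, and `transitions_transversions("AACC", "GTCT")` gives `(2, 1)`: A to G and C to T are transitions, A to T is a transversion.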
7.
8.
Sequence versus Structure for the Direct Detection of 16S rRNA on Planar Oligonucleotide Microarrays
Darrell P. Chandler Gregory J. Newton Jonathan A. Small Don S. Daly 《Applied microbiology》2003,69(5):2950-2958
A two-probe proximal chaperone detection system consisting of a species-specific capture probe for the microarray and a labeled, proximal chaperone probe for detection was recently described for direct detection of intact rRNAs from environmental samples on oligonucleotide arrays. In this study, we investigated the physical spacing and nucleotide mismatch tolerance between capture and proximal chaperone detector probes that are required to achieve species-specific 16S rRNA detection for the dissimilatory metal and sulfate reducer 16S rRNAs. Microarray specificity was deduced by analyzing signal intensities across replicate microarrays with a statistical analysis-of-variance model that accommodates well-to-well and slide-to-slide variations in microarray signal intensity. Chaperone detector probes located in immediate proximity to the capture probe resulted in detectable, nonspecific binding of nontarget rRNA, presumably due to base-stacking effects. Species-specific rRNA detection was achieved by using a 22-nt capture probe and a 15-nt detector probe separated by 10 to 14 nt along the primary sequence. Chaperone detector probes with up to three mismatched nucleotides still resulted in species-specific capture of 16S rRNAs. There was no obvious relationship between position or number of mismatches and within- or between-genus hybridization specificity. From these results, we conclude that relieving secondary structure is of principal concern for the successful capture and detection of 16S rRNAs on planar surfaces but that the sequence of the capture probe is more important than relieving secondary structure for achieving specific hybridization.
9.
Sequence versus structure for the direct detection of 16S rRNA on planar oligonucleotide microarrays (cited by: 6; self-citations: 0; citations by others: 6)
10.
11.
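The phylogenetic relationships among deer have long been disputed, in particular whether the subfamily Hydropotinae should be established. The mitochondrial 16S rRNA gene of the water deer was sequenced, and complete mitochondrial 16S rRNA gene sequences of 14 other deer species were obtained from GenBank. With water buffalo and sheep as a double outgroup, phylogenetic trees were constructed to examine the phylogenetic relationships among deer and the validity of Hydropotinae. The analysis shows that: (1) the results support dividing Cervidae into Cervinae, Muntiacinae, Hydropotinae and Odocoileinae, and support Moschidae as a separate family; (2) Hydropotinae is valid, and the results support grouping the water deer with the roe deer, previously placed in Odocoileinae, to form Hydropotinae; (3) the taxonomic position of the tufted deer remains to be determined.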
12.
Quantitative Comparisons of 16S rRNA Gene Sequence Libraries from Environmental Samples (cited by: 18; self-citations: 13; citations by others: 18)
David R. Singleton Michelle A. Furlong Stephen L. Rathbun William B. Whitman 《Applied microbiology》2001,67(9):4374-4376
To determine the significance of differences between clonal libraries of environmental rRNA gene sequences, differences between homologous coverage curves, CX(D), and heterologous coverage curves, CXY(D), were calculated by a Cramér-von Mises-type statistic and compared by a Monte Carlo test procedure. This method successfully distinguished rRNA gene sequence libraries from soil and bioreactors and correctly failed to find differences between libraries of the same composition.
13.
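This paper reports the complete sequence of the rapeseed chloroplast 16S rRNA gene, together with 156 bp of nucleotide sequence upstream of its 5′ end and 101 bp downstream of its 3′ end. The rapeseed chloroplast 16S rRNA gene is 1491 bp long; its homology with the tobacco and maize genes is 98.5% and 96.1%, respectively. The sequences upstream of the 5′ end and downstream of the 3′ end of the rapeseed chloroplast 16S rRNA gene are complementary and can form a fairly large stem-loop structure. Compared with tobacco, however, a 79 bp deletion in the 3′ downstream sequence makes the stem portion of this structure only half the size of that in tobacco.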
14.
Sequence arrangement of the 16S and 26S rRNA genes in the pathogenic haemoflagellate Leishmania donovani. (cited by: 9; self-citations: 10; citations by others: 9)
Kinetic and chemical analyses show that the haploid genome of Leishmania donovani has between 4.6 and 6.5 × 10^7 Kb pairs of DNA. Cot analysis shows that the genome contains 12% rapidly reassociating DNA, 13% middle repetitive DNA with an average reiteration frequency of 77, and 62% single copy DNA. Saturation hybridization experiments show that 0.82% of the nuclear DNA is occupied by rRNA coding sequences. The average repetition frequency of these sequences is determined to be 166. Sedimentation velocity studies indicate that the two major rRNA species have sedimentation values of 26S and 16S, respectively. The arrangement of the rRNA genes and their spacer sequences on long strands of purified rDNA has been determined by examining the structure of rRNA:DNA hybrids prepared for electron microscopy by the gene 32-ethidium bromide technique. Long DNA strands are observed to contain several gene sets (16S + 26S). One repeat unit contains the following sequences in the order given: (a) a 16S gene of length 2.12 Kb; (b) an internal transcribed spacer (Spl) of length 1.23 Kb, which contains a short sequence that may code for a 5.8S rRNA; (c) a 26S gene with a length of 4.31 Kb, which contains an internal gap region of length 0.581 Kb; (d) an external spacer of average length 5.85 Kb.
15.
Identifying the dominant soil bacterial taxa in libraries of 16S rRNA and 16S rRNA genes (cited by: 19; self-citations: 0; citations by others: 19)
Janssen PH 《Applied and environmental microbiology》2006,72(3):1719-1728
16.
Huijing Hao Junrong Liang Ran Duan Yuhuang Chen Chang Liu Yuchun Xiao Xu Li Mingming Su Huaiqi Jing Xin Wang 《PloS one》2016,11(1)
The API 20E strip test, the standard for Enterobacteriaceae identification, is not sufficient to discriminate some Yersinia species, because some biochemical reactions are unstable and some species, e.g. Yersinia frederiksenii and Yersinia intermedia, present the same biochemical profile; such cases need a variety of molecular biology methods as auxiliaries for identification. The 16S rRNA gene is considered a valuable tool for assigning bacterial strains to species. However, the resolution of the 16S rRNA gene may be insufficient for discrimination because of the high similarity of sequences between some species and heterogeneity within copies at the intra-genomic level. In this study, we randomly selected five 16S rRNA gene clones from each of 768 Yersinia strains, collecting 3,840 16S rRNA gene sequences from 10 species, which were divided into 439 patterns. The similarity among the five 16S rRNA gene clones is over 99% for most strains. Identical sequences were found in strains of different species. A phylogenetic tree was constructed using the five 16S rRNA gene sequences for each strain; the phylogenetic classifications are consistent with biochemical tests, and species that are difficult to identify by biochemical phenotype can be differentiated. Most Yersinia strains form distinct groups within each species. However, Yersinia kristensenii, a heterogeneous species, clusters with some Yersinia enterocolitica and Yersinia frederiksenii/intermedia strains, although this does not affect the overall efficiency of the species classification. In conclusion, by integrating information from multiple 16S rRNA gene sequences, our method improves the discrimination of Yersinia species.
17.
Sequence heterogeneity in the two 16S rRNA genes of Phormium yellow leaf phytoplasma. (cited by: 1; self-citations: 0; citations by others: 1)
L W Liefting M T Andersen R E Beever R C Gardner R L Forster 《Applied microbiology》1996,62(9):3133-3139
Phormium yellow leaf (PYL) phytoplasma causes a lethal disease of the monocotyledon, New Zealand flax (Phormium tenax). The 16S rRNA genes of PYL phytoplasma were amplified from infected flax by PCR and cloned, and the nucleotide sequences were determined. DNA sequencing and Southern hybridization analysis of genomic DNA indicated the presence of two copies of the 16S rRNA gene. The two 16S rRNA genes exhibited sequence heterogeneity in 4 nucleotide positions and could be distinguished by the restriction enzymes BpmI and BsrI. This is the first record in which sequence heterogeneity in the 16S rRNA genes of a phytoplasma has been determined by sequence analysis. A phylogenetic tree based on 16S rRNA gene sequences showed that PYL phytoplasma is most closely related to the stolbur and German grapevine yellows phytoplasmas, which form the stolbur subgroup of the aster yellows group. This phylogenetic position of PYL phytoplasma was supported by 16S/23S spacer region sequence data.
18.
19.
Identifying the Dominant Soil Bacterial Taxa in Libraries of 16S rRNA and 16S rRNA Genes (cited by: 14; self-citations: 9; citations by others: 14)
Peter H. Janssen 《Applied microbiology》2006,72(3):1719-1728
20.
L M Doyle J O McInerney J Mooney R Powell A Haikara Dr A P Moran 《Journal of industrial microbiology & biotechnology》1995,15(2):67-70
The 16S ribosomal RNA gene from the beer-spoilage organism, Megasphaera cerevisiae, was polymerase chain reaction (PCR)-amplified and sequenced. Analysis confirmed the phylogenetic position of M. cerevisiae as a sister taxon of Megasphaera elsdenii, within the obligately anaerobic, Gram-negative cocci. The sequence obtained should facilitate the development of DNA probes for early detection of this spoilage organism.