
1.

Tools for T-RFLP data analysis using Excel

Nils Johan Fredriksson Malte Hermansson Britt-Marie Wilén 《BMC bioinformatics》2014,15(1)

Background

Terminal restriction fragment length polymorphism (T-RFLP) analysis is a DNA-fingerprinting method that can be used for comparisons of the microbial community composition in a large number of samples. There is no consensus on how T-RFLP data should be treated and analyzed before comparisons between samples are made, and several different approaches have been proposed in the literature. The analysis of T-RFLP data can be cumbersome and time-consuming, and for large datasets manual data analysis is not feasible. The currently available tools for automated T-RFLP analysis, although valuable, offer little flexibility, and few, if any, options regarding what methods to use. To enable comparisons and combinations of different data treatment methods an analysis template and an extensive collection of macros for T-RFLP data analysis using Microsoft Excel were developed.

Results

The Tools for T-RFLP data analysis template provides procedures for the analysis of large T-RFLP datasets, including application of a noise baseline threshold and setting of the analysis range; normalization and alignment of replicate profiles; generation of consensus profiles; normalization and alignment of consensus profiles; and final analysis of the samples, including calculation of association coefficients and diversity index. The procedures are designed so that several options are available in every analysis step, from the initial preparation of the data to the final comparison of the samples. The parameters regarding analysis range, noise baseline, T-RF alignment and generation of consensus profiles are all given by the user, and several different methods are available for normalization of the T-RF profiles. In each step, the user can also choose to base the calculations on either peak height data or peak area data.
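The first steps listed above (noise baseline, analysis range, normalization, diversity index) can be illustrated with a minimal Python sketch. This is not the authors' Excel macro code. The function names, the fixed-threshold noise rule, the choice of relative-abundance normalization and the Shannon index are all assumptions made for illustration; the template itself offers several alternatives for each step.

```python
import math

def preprocess_profile(peaks, noise_threshold, size_range):
    """Filter and normalize one T-RF profile (hypothetical helper).

    peaks: list of (fragment_size_bp, signal) tuples; the signal may be
           peak height or peak area, whichever the analyst prefers.
    noise_threshold: minimum signal for a peak to be kept (assumed rule).
    size_range: (min_bp, max_bp) analysis range, inclusive.
    Returns a list of (fragment_size_bp, relative_abundance) tuples.
    """
    low, high = size_range
    # Keep only peaks inside the analysis range and above the noise baseline.
    kept = [(size, signal) for size, signal in peaks
            if low <= size <= high and signal >= noise_threshold]
    total = sum(signal for _, signal in kept)
    if total == 0:
        return []
    # Normalize so that the retained peaks sum to 1.
    return [(size, signal / total) for size, signal in kept]

def shannon_index(profile):
    """Shannon diversity index of a normalized profile (natural log)."""
    return -sum(p * math.log(p) for _, p in profile if p > 0)

if __name__ == "__main__":
    raw = [(40, 500), (120, 300), (250, 100), (300, 20), (700, 100)]
    profile = preprocess_profile(raw, noise_threshold=50, size_range=(50, 600))
    print(profile)
    print(shannon_index(profile))
```

Passing peak areas instead of peak heights requires no code change, which mirrors the choice the template leaves to the user.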

Conclusions

The Tools for T-RFLP data analysis template enables an objective and flexible analysis of large T-RFLP datasets in a widely used spreadsheet application.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0361-7) contains supplementary material, which is available to authorized users.

2.

Analysis of T-RFLP data using analysis of variance and ordination methods: a comparative study (Total citations: 4; self-citations: 0; citations by others: 4)

Culman SW Gauch HG Blackwood CB Thies JE 《Journal of microbiological methods》2008,75(1):55-63

The analysis of T-RFLP data has developed considerably over the last decade, but there remains a lack of consensus about which statistical analyses offer the best means for finding trends in these data. In this study, we empirically tested and theoretically compared ten diverse T-RFLP datasets derived from soil microbial communities using the more common ordination methods in the literature: principal component analysis (PCA), nonmetric multidimensional scaling (NMS) with Sørensen, Jaccard and Euclidean distance measures, correspondence analysis (CA), detrended correspondence analysis (DCA) and a technique new to T-RFLP data analysis, the Additive Main Effects and Multiplicative Interaction (AMMI) model. Our objectives were i) to determine the distribution of variation in T-RFLP datasets using analysis of variance (ANOVA), ii) to determine the more robust and informative multivariate ordination methods for analyzing T-RFLP data, and iii) to compare the methods based on theoretical considerations. For the 10 datasets examined in this study, ANOVA revealed that the variation from Environment main effects was always small, variation from T-RFs main effects was large, and variation from T-RF × Environment (T × E) interactions was intermediate. Larger variation due to T × E indicated larger differences in microbial communities between environments/treatments and thus demonstrated the utility of ANOVA to provide an objective assessment of community dissimilarity. The comparison of statistical methods typically yielded similar empirical results. AMMI, T-RF-centered PCA, and DCA were the most robust methods in terms of producing ordinations that consistently reached a consensus with other methods. In datasets with high sample heterogeneity, NMS analyses with Sørensen and Jaccard distance were the most sensitive for recovery of complex gradients. 
The theoretical comparison showed that some methods hold distinct advantages for T-RFLP analysis, such as estimations of variation captured, realistic or minimal assumptions about the data, reduced weight placed on rare T-RFs, and uniqueness of solutions. Our results lead us to recommend that method selection be guided by T-RFLP dataset complexity and the outlined theoretical criteria. Finally, we recommend using binary or relativized peak height data with soil-based T-RFLP data for ordination-based exploratory microbial analyses.
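To make the terms in this abstract concrete, the sketch below shows binary and relativized peak height data and the Sørensen and Jaccard distances in their presence/absence form. It is an illustration only, not code from the study. Representing a profile as a dictionary from T-RF size to peak height is an assumption, and so are the function names.

```python
def relativize(profile):
    """Convert peak heights to proportions of the profile total."""
    total = sum(profile.values())
    if total == 0:
        return {size: 0.0 for size in profile}
    return {size: height / total for size, height in profile.items()}

def to_binary(profile):
    """Return the set of T-RF sizes present (height > 0) in a profile."""
    return {size for size, height in profile.items() if height > 0}

def jaccard_distance(a, b):
    """1 - |A intersect B| / |A union B| on presence/absence data."""
    set_a, set_b = to_binary(a), to_binary(b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)

def sorensen_distance(a, b):
    """1 - 2|A intersect B| / (|A| + |B|) on presence/absence data."""
    set_a, set_b = to_binary(a), to_binary(b)
    if not set_a and not set_b:
        return 0.0
    return 1.0 - 2.0 * len(set_a & set_b) / (len(set_a) + len(set_b))

if __name__ == "__main__":
    # Hypothetical profiles: T-RF size (bp) -> peak height.
    sample_a = {100: 50, 150: 30, 200: 20}
    sample_b = {100: 10, 200: 40, 300: 50}
    print(relativize(sample_a))
    print(jaccard_distance(sample_a, sample_b))
    print(sorensen_distance(sample_a, sample_b))
```

A matrix of such pairwise distances is the usual input to an ordination method such as NMS. The ordination itself is outside the scope of this sketch.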

3.

A correlation coefficient for circular data (Total citations: 2; self-citations: 0; citations by others: 2)

FISHER N. I.; LEE A. J. 《Biometrika》1983,70(2):327-332


4.

Partial correlation coefficient between distance matrices as a new indicator of protein-protein interactions

Sato T Yamanishi Y Horimoto K Kanehisa M Toh H 《Bioinformatics (Oxford, England)》2006,22(20):2488-2492


5.

T-REX: software for the processing and analysis of T-RFLP data

Steven W Culman Robert Bukowski Hugh G Gauch Hinsby Cadillo-Quiroz Daniel H Buckley 《BMC bioinformatics》2009,10(1):171

Background

Despite increasing popularity and improvements in terminal restriction fragment length polymorphism (T-RFLP) and other microbial community fingerprinting techniques, there are still numerous obstacles that hamper the analysis of these datasets. Many steps are required to process raw data into a format ready for analysis and interpretation. These steps can be time-intensive, error-prone, and can introduce unwanted variability into the analysis. Accordingly, we developed T-REX, free, online software for the processing and analysis of T-RFLP data.

6.

Exploration of phylogenetic data using a global sequence analysis method

Charles Chapus, Christine Dufraigne, Scott Edwards, Alain Giron, Bernard Fertil, Patrick Deschavanne 《BMC evolutionary biology》2005,5(1):63

Background

Molecular phylogenetic methods are based on alignments of nucleic or peptidic sequences. The tremendous increase in molecular data permits phylogenetic analyses of very long sequences and of many species, but also requires methods to help manage large datasets.

7.

Quantized correlation coefficient for measuring reproducibility of ChIP-chip data

Shouyong Peng Mitzi I Kuroda Peter J Park 《BMC bioinformatics》2010,11(1):399

Background

Chromatin immunoprecipitation followed by microarray hybridization (ChIP-chip) is used to study protein-DNA interactions and histone modifications on a genome scale. To ensure data quality, these experiments are usually performed in replicates, and a correlation coefficient between replicates is often used to assess reproducibility. However, the correlation coefficient can be misleading because it is affected not only by the reproducibility of the signal but also by the amount of binding signal present in the data.

8.

Impact of T-RFLP data analysis choices on assessments of microbial community structure and dynamics

Nils Johan Fredriksson Malte Hermansson Britt-Marie Wilén 《BMC bioinformatics》2014,15(1)

Background

Terminal restriction fragment length polymorphism (T-RFLP) analysis is a common DNA-fingerprinting technique used for comparisons of complex microbial communities. Although the technique is well established there is no consensus on how to treat T-RFLP data to achieve the highest possible accuracy and reproducibility. This study focused on two critical steps in the T-RFLP data treatment: the alignment of the terminal restriction fragments (T-RFs), which enables comparisons of samples, and the normalization of T-RF profiles, which adjusts for differences in signal strength, total fluorescence, between samples.

Results

Variations in the estimation of T-RF sizes were observed and these variations were found to affect the alignment of the T-RFs. A novel method was developed which improved the alignment by adjusting for systematic shifts in the T-RF size estimations between the T-RF profiles. Differences in total fluorescence were shown to be caused by differences in sample concentration and by the gel loading. Five normalization methods were evaluated and the total fluorescence normalization procedure based on peak height data was found to increase the similarity between replicate profiles the most. A high peak detection threshold, alignment correction, normalization and the use of consensus profiles instead of single profiles increased the similarity of replicate T-RF profiles, i.e. led to increased reproducibility. The impact of different treatment methods on the outcome of subsequent analyses of T-RFLP data was evaluated using a dataset from a longitudinal study of the bacterial community in an activated sludge wastewater treatment plant. Whether the alignment was corrected, and whether and how the T-RF profiles were normalized, had a substantial impact on ordination analyses, assessments of bacterial dynamics and analyses of correlations with environmental parameters.

Conclusions

A novel method for the evaluation and correction of the alignment of T-RF profiles was shown to reduce the uncertainty and ambiguity in alignments of T-RF profiles. Large differences in the outcome of assessments of bacterial community structure and dynamics were observed between different alignment and normalization methods. The results of this study can therefore be of value when considering what methods to use in the analysis of T-RFLP data.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0360-8) contains supplementary material, which is available to authorized users.

9.

An analysis of correlation matrices: Equal correlations (Total citations: 1; self-citations: 0; citations by others: 1)

BRIEN C. J.; VENABLES W. N.; JAMES A. T.; MAYO O. 《Biometrika》1984,71(3):545-554


10.

mtDNA analysis of 174 Eurasian populations using a new iterative rank correlation method

Zoltán Juhász Tibor Fehér Endre Németh Horolma Pamjav 《Molecular genetics and genomics : MGG》2016,291(1):493-509


11.

Target speech feature extraction using non-parametric correlation coefficient

Sang Yeob Oh Kyung-Yong Chung 《Cluster computing》2014,17(3):893-899

Speech recognition systems for the automobile have a few weaknesses, including failure to recognize speech due to the mixing of environment noise from inside and outside the car and from other voices. Therefore, this paper features a technique for extracting only the selected target voice from input sound that is a mixture of voices and noises. The feature for selective speech extraction composes a correlation map of auditory elements by using similarity between channels and continuity of time, and utilizes a method of extracting speech features by using a non-parametric correlation coefficient. This proposed method was validated by showing that the average distortion of separation of the technique decreased by 0.8630 dB. It was shown that the performance of the selective feature extraction utilizing a cross correlation is good, but overall, the selective feature extraction utilizing a non-parametric correlation is better.

12.

A general correlation coefficient for directional data and related regression problems

JUPP P. E.; MARDIA K. V. 《Biometrika》1980,67(1):163-173


13.

Joint explorative analysis of neuroreceptor subsystems in the human brain: application to receptor-transporter correlation using PET data

Cselényi Z Lundberg J Halldin C Farde L Gulyás B 《Neurochemistry international》2004,45(5):773-781

Positron emission tomography (PET) has proved to be a highly successful technique in the qualitative and quantitative exploration of the human brain's neurotransmitter-receptor systems. In recent years, the number of PET radioligands, targeted to different neuroreceptor systems of the human brain, has increased considerably. This development paves the way for a simultaneous analysis of different receptor systems and subsystems in the same individual. The detailed exploration of the versatility of neuroreceptor systems requires novel technical approaches, capable of operating on huge parametric image datasets. An initial step of such explorative data processing and analysis should be the development of novel exploratory data-mining tools to gain insight into the "structure" of complex multi-individual, multi-receptor data sets. For practical reasons, a possible and feasible starting point of multi-receptor research can be the analysis of the pre- and post-synaptic binding sites of the same neurotransmitter. In the present study, we propose an unsupervised, unbiased data-mining tool for this task and demonstrate its usefulness by using quantitative receptor maps, obtained with positron emission tomography, from five healthy subjects on (pre-synaptic) serotonin transporters (5-HTT or SERT) and (post-synaptic) 5-HT(1A) receptors. Major components of the proposed technique include the projection of the input receptor maps to a feature space, the quasi-clustering and classification of projected data (neighbourhood formation), trans-individual analysis of neighbourhood properties (trajectory analysis), and the back-projection of the results of trajectory analysis to normal space (creation of multi-receptor maps). The resulting multi-receptor maps suggest that complex relationships and tendencies in the relationship between pre- and post-synaptic transporter-receptor systems can be revealed and classified by using this method. 
As an example, we demonstrate the regional correlation of the serotonin transporter-receptor systems. These parameter-specific multi-receptor maps can usefully guide the researchers in their endeavour to formulate models of multi-receptor interactions and changes in the human brain.

14.

Visualizing plant metabolomic correlation networks using clique-metabolite matrices. (Total citations: 6; self-citations: 0; citations by others: 6)

F Kose W Weckwerth T Linke O Fiehn 《Bioinformatics (Oxford, England)》2001,17(12):1198-1208


15.

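Genus-specific T-RFLP for community analysis of Lactobacillus (Total citations: 2; self-citations: 0; citations by others: 2)

Zhang Silu, Liu Yunxiao, Zhang Haoqi, CHIN James, Wu Xiyang 《Microbiology China (微生物学通报)》2012,39(8):1179-1189

[Objective] To design a Lactobacillus genus-specific T-RFLP (terminal restriction fragment length polymorphism) technique for typing 14 Lactobacillus strains. [Methods] The Lactobacillus genus-specific primer LAB-rev, derived from the 16S-23S rRNA intergenic spacer region, was labeled with the fluorophore 6-FAM and combined with the universal 16S upstream primer 7f for PCR amplification of Lactobacillus. [Results] HaeIII and HhaI were selected for restriction digestion, and terminal sequencing of the digested products yielded T-RFLP peak profiles that allow rapid and accurate qualitative and quantitative analysis of different Lactobacillus species. [Conclusion] The experiment successfully established a T-RFLP platform for detecting Lactobacillus in microecological environments, which is of great significance for studying the effects of functional foods, lactic acid beverages and drugs on the intestinal microecology, and for strain identification.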

16.

Linking of digital images to phylogenetic data matrices using a morphological ontology (Total citations: 1; self-citations: 0; citations by others: 1)

Ramírez MJ Coddington JA Maddison WP Midford PE Prendini L Miller J Griswold CE Hormiga G Sierwald P Scharff N Benjamin SP Wheeler WC 《Systematic biology》2007,56(2):283-294

Images are paramount in documentation of morphological data. Production and reproduction costs have traditionally limited how many illustrations taxonomy could afford to publish, and much comparative knowledge continues to be lost as generations turn over. Now digital images are cheaply produced and easily disseminated electronically but pose problems in maintenance, curation, sharing, and use, particularly in long-term data sets involving multiple collaborators and institutions. We propose an efficient linkage of images to phylogenetic data sets via an ontology of morphological terms; an underlying, fine-grained database of specimens, images, and associated metadata; fixation of the meaning of morphological terms (homolog names) by ostensive references to particular taxa; and formalization of images as standard views. The ontology provides the intellectual structure and fundamental design of the relationships and enables intelligent queries to populate phylogenetic data sets with images. The database itself documents primary morphological observations, their vouchers, and associated metadata, rather than the conventional data set cell, and thereby facilitates data maintenance despite character redefinition or specimen reidentification. It minimizes reexamination of specimens, loss of information or data quality, and echoes the data models of web-based repositories for images, specimens, and taxonomic names. Confusion and ambiguity in the meanings of technical morphological terms are reduced by ostensive definitions pointing to features in particular taxa, which may serve as reference for globally unique identifiers of characters. 
Finally, the concept of standard views (an image illustrating one or more homologs in a specific sex and life stage, in a specific orientation, using a specific device and preparation technique) enables efficient, dynamic linkage of images to the data set and automatic population of matrix cells with images independently of scoring decisions.

17.

Metagenomic bacterial community profiles of chicken embryo gastrointestinal tract by using T-RFLP analysis

L. A. Ilina E. A. Yildirim I. N. Nikonov V. A. Filippova G. Y. Laptev N. I. Novikova A. A. Grozina T. N. Lenkova V. A. Manukyan I. A. Egorov V. I. Fisinin 《Doklady. Biochemistry and biophysics》2016,466(1):47-51

Thirty microbial phylotypes were found in the gastrointestinal tract of chicken embryos of the Hajseks White breed, and 38 phylotypes were found in the gastrointestinal tract of chicken embryos of the Hajseks Brown breed. The microbiome of the gastrointestinal tract of the chicken embryos of the Hajseks White breed was dominated by typical representatives of avian intestinal microflora: bacteria of the family Enterobacteriaceae (47.3%), orders Actinomycetales (13.6%) and Bifidobacteriales (20.6%), and the family Lachnospiraceae (1.1%). The microbiome of the gastrointestinal tract of the chicken embryos of the Hajseks Brown breed was dominated by the pathogenic bacteria of the order Rickettsiales (94.8%). The metagenome of the gastrointestinal tract of both breeds also contained a small number of genes of unidentified bacteria.

18.

Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM

Weir JP 《Journal of strength and conditioning research / National Strength & Conditioning Association》2005,19(1):231-240

Reliability, the consistency of a test or measurement, is frequently quantified in the movement sciences literature. A common metric is the intraclass correlation coefficient (ICC). In addition, the SEM, which can be calculated from the ICC, is also frequently reported in reliability studies. However, there are several versions of the ICC, and confusion exists in the movement sciences regarding which ICC to use. Further, the utility of the SEM is not fully appreciated. In this review, the basics of classic reliability theory are addressed in the context of choosing and interpreting an ICC. The primary distinction between ICC equations is argued to be one concerning the inclusion (equations 2,1 and 2,k) or exclusion (equations 3,1 and 3,k) of systematic error in the denominator of the ICC equation. Inferential tests of mean differences, which are performed in the process of deriving the necessary variance components for the calculation of ICC values, are useful to determine if systematic error is present. If so, the measurement schedule should be modified (removing trials where learning and/or fatigue effects are present) to remove systematic error, and ICC equations that only consider random error may be safely used. The use of ICC values is discussed in the context of estimating the effects of measurement error on sample size, statistical power, and correlation attenuation. Finally, calculation and application of the SEM are discussed. It is shown how the SEM and its variants can be used to construct confidence intervals for individual scores and to determine the minimal difference needed to be exhibited for one to be confident that a true change in performance of an individual has occurred.
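The relations mentioned at the end of this abstract can be sketched with the commonly used formulas SEM = SD × sqrt(1 − ICC) and minimal difference = SEM × z × sqrt(2). The abstract does not spell these out, so treat the formulas, the function names and the default z = 1.96 (a 95% level) as assumptions of this sketch. Centring the interval on the observed score is also a simplification.

```python
import math

def sem_from_icc(sd, icc):
    """Standard error of measurement from the score SD and the ICC.

    Uses the common relation SEM = SD * sqrt(1 - ICC).
    """
    if icc > 1.0:
        raise ValueError("ICC cannot exceed 1")
    return sd * math.sqrt(1.0 - icc)

def minimal_difference(sem, z=1.96):
    """Smallest change between two scores that exceeds measurement error.

    The sqrt(2) factor reflects that both scores carry measurement error.
    """
    return sem * z * math.sqrt(2.0)

def score_confidence_interval(score, sem, z=1.96):
    """Approximate confidence interval around one observed score."""
    return (score - z * sem, score + z * sem)

if __name__ == "__main__":
    # Hypothetical numbers: SD of 10 units and an ICC of 0.91.
    sem = sem_from_icc(sd=10.0, icc=0.91)
    print(sem)
    print(minimal_difference(sem))
    print(score_confidence_interval(50.0, sem))
```

Which ICC form feeds into `sem_from_icc` matters: an ICC that includes systematic error in its denominator gives a different SEM than one that considers random error only.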

19.

Molecular analysis of fecal microbiota in elderly individuals using 16S rDNA library and T-RFLP

Hayashi H Sakamoto M Kitahara M Benno Y 《Microbiology and immunology》2003,47(8):557-570

Fecal microbiota in six elderly individuals were characterized by the 16S rDNA libraries and terminal restriction fragment length polymorphism (T-RFLP) analysis. Random clones of 16S rRNA gene sequences were isolated after PCR amplification with universal primer sets from total genomic DNA extracted from feces of three elderly individuals. These clones were partially sequenced (about 500 bp). T-RFLP analysis was performed using 16S rDNA amplified from six subjects. The lengths of the terminal restriction fragment (T-RF) were analyzed after digestion by HhaI and MspI. Among 240 clones obtained, approximately 46% belonged to 27 known species. About 54% of the other clones were 56 novel "phylotypes" (at least 98% homology of clone sequence). These libraries included 83 species or phylotypes. In addition, about 13% (30 phylotypes) of these phylotypes were newly discovered in these libraries. A large number of species that are not yet known exist in the feces of elderly individuals. 16S rDNA libraries and T-RFLP analysis revealed that the majority of bacteria were Bacteroides and relatives, Clostridium rRNA cluster IV, IX, Clostridium rRNA subcluster XIVa, and "Gammaproteobacteria". The proportion of Clostridium rRNA subcluster XIVa was lower than in healthy adults. In addition, although Ruminococcus obeum and its closely related phylotypes were detected in high frequency in healthy young subjects, hardly any were detected in our elderly individuals. "Gammaproteobacteria" were detected at high frequency.

20.

Confidence interval estimation of the intraclass correlation coefficient for binary outcome data (Total citations: 3; self-citations: 0; citations by others: 3)

Zou G Donner A 《Biometrics》2004,60(3):807-811

We obtain closed-form asymptotic variance formulae for three point estimators of the intraclass correlation coefficient that may be applied to binary outcome data arising in clusters of variable size. Our results include as special cases those that have previously appeared in the literature (Fleiss and Cuzick, 1979, Applied Psychological Measurement 3, 537-542; Bloch and Kraemer, 1989, Biometrics 45, 269-287; Altaye, Donner, and Klar, 2001, Biometrics 57, 584-588). Simulation results indicate that confidence intervals based on the estimator proposed by Fleiss and Cuzick provide coverage levels close to nominal over a wide range of parameter combinations. Two examples are presented.
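As a rough illustration of a point estimator of this kind, the sketch below computes the Fleiss-Cuzick estimate in the form it is usually written: 1 − Σ Y_i(n_i − Y_i)/n_i divided by (N − k)·p·(1 − p). Here Y_i is the number of successes in cluster i, n_i the cluster size, N the total number of observations, k the number of clusters and p the overall success proportion. The abstract does not give the formula, so this form, the function name and the input layout are assumptions. The variance formulae and confidence intervals derived in the paper are not reproduced.

```python
def fleiss_cuzick_icc(clusters):
    """Point estimate of the ICC for clustered binary data.

    clusters: list of (successes, cluster_size) pairs, one per cluster.
    Assumed form: 1 - sum(Y_i * (n_i - Y_i) / n_i) / ((N - k) * p * (1 - p)).
    """
    k = len(clusters)
    total_n = sum(size for _, size in clusters)
    total_y = sum(successes for successes, _ in clusters)
    if k == 0 or total_n <= k:
        raise ValueError("need at least one cluster with more than one member")
    p = total_y / total_n
    if p in (0.0, 1.0):
        raise ValueError("ICC is undefined when all outcomes are identical")
    # Within-cluster disagreement, summed over clusters.
    within = sum(y * (n - y) / n for y, n in clusters if n > 0)
    return 1.0 - within / ((total_n - k) * p * (1.0 - p))

if __name__ == "__main__":
    # Hypothetical data: three clusters of four subjects each.
    print(fleiss_cuzick_icc([(4, 4), (0, 4), (2, 4)]))
```

Two sanity checks are consistent with this form: if every cluster is internally unanimous the estimate equals 1, and if outcomes are independent of cluster membership its expected numerator and denominator cancel to give 0.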