
1.

Data quality assessment of ungated flow cytometry data in high throughput experiments.

Nolwenn Le Meur Anthony Rossini Maura Gasparetto Clay Smith Ryan R Brinkman Robert Gentleman 《Cytometry. Part A》2007,71(6):393-403

BACKGROUND: The recent development of semiautomated techniques for staining and analyzing flow cytometry samples has presented new challenges. Quality control and quality assessment are critical when developing new high throughput technologies and their associated information services. Our experience suggests that significant bottlenecks remain in the development of high throughput flow cytometry methods for data analysis and display. In particular, data quality control and quality assessment are crucial steps in processing and analyzing high throughput flow cytometry data. METHODS: We propose a variety of graphical exploratory data analytic tools for exploring ungated flow cytometry data. We have implemented a number of specialized functions and methods in the Bioconductor package rflowcyt. We demonstrate the use of these approaches by investigating two independent sets of high throughput flow cytometry data. RESULTS: We found that graphical representations can reveal substantial nonbiological differences in samples. Empirical cumulative distribution function (ECDF) plots and summary scatterplots were especially useful in the rapid identification of problems not identified by manual review. CONCLUSIONS: Graphical exploratory data analytic tools are quick and useful means of assessing data quality. We propose that the described visualizations be used as quality assessment tools and, where possible, for quality control.
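The ECDF comparison mentioned in this abstract can be illustrated with a minimal Python sketch. This is not the rflowcyt implementation: the function names `ecdf` and `max_ecdf_gap`, the synthetic intensities, and the use of the largest vertical gap as a summary number are all assumptions made for illustration.

```python
import numpy as np

def ecdf(values, grid):
    """Empirical CDF of `values`, evaluated at each point of `grid`."""
    sorted_vals = np.sort(np.asarray(values, dtype=float))
    # side="right" counts how many observations are <= each grid point
    return np.searchsorted(sorted_vals, grid, side="right") / sorted_vals.size

def max_ecdf_gap(sample, reference):
    """Largest vertical distance between two ECDFs (Kolmogorov-Smirnov style)."""
    grid = np.sort(np.concatenate([sample, reference]))
    return float(np.max(np.abs(ecdf(sample, grid) - ecdf(reference, grid))))

# Synthetic single-channel intensities (hypothetical data, not from the paper)
rng = np.random.default_rng(0)
reference = rng.normal(loc=100.0, scale=10.0, size=2000)
similar = rng.normal(loc=100.0, scale=10.0, size=2000)
shifted = rng.normal(loc=130.0, scale=10.0, size=2000)  # simulated nonbiological shift

gap_similar = max_ecdf_gap(similar, reference)
gap_shifted = max_ecdf_gap(shifted, reference)
```

A sample whose ECDF sits far from the others on the same plate (a large gap) is a candidate for closer inspection; the threshold for "far" would have to be chosen per experiment.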

2.

Optimizing transformations for automated, high throughput analysis of flow cytometry data

Greg Finak Juan-Manuel Perez Andrew Weng Raphael Gottardo 《BMC bioinformatics》2010,11(1):546

Background

In a high throughput setting, effective flow cytometry data analysis depends heavily on proper data preprocessing. While the usual preprocessing steps of quality assessment, outlier removal, normalization, and gating have received considerable scrutiny from the community, the influence of data transformation on the output of high throughput analysis has been largely overlooked. Flow cytometry measurements can vary over several orders of magnitude, and cell populations can have variances that depend on their mean fluorescence intensities and may exhibit heavily skewed distributions. Consequently, the choice of data transformation can influence the output of automated gating. An appropriate data transformation aids in data visualization and gating of cell populations across the range of data. Experience shows that the choice of transformation is data specific. Our goal here is to compare the performance of different transformations applied to flow cytometry data in the context of automated gating in a high throughput, fully automated setting. We examine the most common transformations used in flow cytometry, including the generalized hyperbolic arcsine, biexponential, linlog, and generalized Box-Cox, all within the Bioconductor flowCore framework that is widely used in high throughput, automated flow cytometry data analysis. All of these transformations have adjustable parameters whose effects upon the data are non-intuitive for most users. By making some modelling assumptions about the transformed data, we develop maximum likelihood criteria to optimize parameter choice for these different transformations.
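To make the role of the adjustable parameters concrete, here is a minimal Python sketch of a hyperbolic arcsine transform. The parameterization asinh(a + b·x) + c, the parameter names, and the scale factor of 1/150 are assumptions chosen for illustration; the sketch does not reproduce the paper's maximum likelihood parameter selection.

```python
import numpy as np

def arcsinh_transform(x, a=0.0, b=1.0, c=0.0):
    """Hyperbolic arcsine transform, asinh(a + b*x) + c (assumed parameterization)."""
    return np.arcsinh(a + b * np.asarray(x, dtype=float)) + c

# Raw intensities spanning several orders of magnitude, including a negative value
raw = np.array([-50.0, 0.0, 10.0, 1_000.0, 100_000.0])

# b controls where the transform switches from near-linear to near-logarithmic;
# 1/150 is an illustrative value, not a recommendation from the paper
scaled = arcsinh_transform(raw, b=1.0 / 150.0)
```

Unlike a plain logarithm, this transform is defined for zero and negative values, and for large inputs it behaves like log(2·b·x), which is why the choice of b changes how populations near zero are spread out.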

3.

flowCore: a Bioconductor package for high throughput flow cytometry

Florian Hahne Nolwenn LeMeur Ryan R Brinkman Byron Ellis Perry Haaland Deepayan Sarkar Josef Spidlen Errol Strain Robert Gentleman 《BMC bioinformatics》2009,10(1):106-8

Background

Recent advances in automation technologies have enabled the use of flow cytometry for high throughput screening, generating large complex data sets often in clinical trials or drug discovery settings. However, data management and data analysis methods have not advanced sufficiently far from the initial small-scale studies to support modeling in the presence of multiple covariates.

4.

Combination of automated high throughput platforms, flow cytometry, and hierarchical clustering to detect cell state.

Christine M Kitsos Phani Bhamidipati Irena Melnikova Ethan P Cash Chris McNulty Julia Furman Michael J Cima Douglas Levinson 《Cytometry. Part A》2007,71(1):16-27

BACKGROUND: This study examined whether hierarchical clustering could be used to detect cell states induced by treatment combinations that were generated through automation and high-throughput (HT) technology. Data-mining techniques were used to analyze the large experimental data sets to determine whether nonlinear, non-obvious responses could be extracted from the data. METHODS: Unary, binary, and ternary combinations of pharmacological factors (examples of stimuli) were used to induce differentiation of HL-60 cells using a HT automated approach. Cell profiles were analyzed by incorporating hierarchical clustering methods on data collected by flow cytometry. Data-mining techniques were used to explore the combinatorial space for nonlinear, unexpected events. Additional small-scale, follow-up experiments were performed on cellular profiles of interest. RESULTS: Multiple, distinct cellular profiles were detected using hierarchical clustering of expressed cell-surface antigens. Data-mining of this large, complex data set retrieved cases of both factor dominance and cooperativity, as well as atypical cellular profiles. Follow-up experiments found that treatment combinations producing "atypical cell types" made those cells more susceptible to apoptosis. CONCLUSIONS: Hierarchical clustering and other data-mining techniques were applied to analyze large data sets from HT flow cytometry. From each sample, the data set was filtered and used to define discrete, usable states that were then related back to their original formulations. Analysis of resultant cell populations induced by a multitude of treatments identified unexpected phenotypes and nonlinear response profiles.

5.

High throughput flow cytometry

Kuckuck FW Edwards BS Sklar LA 《Cytometry》2001,44(1):83-90

BACKGROUND: Conventional flow cytometry does not allow the rapid analysis of multiple samples. This has limited its uses in drug discovery, for which the standard for throughput is 100,000 samples per day. METHODS: We describe a simple method in which commercial peristaltic tubing is connected from a commercial autosampler to a flow cytometer. The samples are delivered via a peristaltic pump from source wells in a multiwell plate. The samples are separated by air bubbles. RESULTS: Throughput rates approach the limit of the autosampler (up to 100 wells per minute). Using optimal tubing and flow rates, particles remain within appropriate light scatter and fluorescence gates. The carryover between wells is typically less than 5% without and 1% with a wash step. The volumes of sample delivered are in the microliter scale. The approach has been validated with instruments from three manufacturers. CONCLUSIONS: Flow cytometry has potential throughput of 100,000 samples or more per day starting with the method described. The method is currently best suited to end-point assays. However, combined with high-speed sorting and single-cell assays, the number of assays could approach 1 billion per day.
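The step from 100 wells per minute to 100,000 samples or more per day is simple arithmetic, sketched below in Python. The calculation assumes uninterrupted operation around the clock at the peak rate, which is an idealization not stated in the abstract; the variable names are illustrative.

```python
WELLS_PER_MINUTE = 100        # upper rate quoted in the abstract
MINUTES_PER_DAY = 60 * 24     # assumes continuous, uninterrupted operation

samples_per_day = WELLS_PER_MINUTE * MINUTES_PER_DAY
meets_drug_discovery_target = samples_per_day >= 100_000
```

Under this assumption the ceiling is 144,000 wells per day, so the 100,000-per-day benchmark leaves room for roughly 30% downtime (plate changes, wash steps).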

6.

Comparison of five clustering algorithms to classify phytoplankton from flow cytometry data.

M F Wilkins S A Hardy L Boddy C W Morris 《Cytometry》2001,44(3):210-217

BACKGROUND: Artificial neural networks (ANNs) have been shown to be valuable in the analysis of analytical flow cytometric (AFC) data in aquatic ecology. Automated extraction of clusters is an important first stage in deriving ANN training data from field samples, but AFC data pose a number of challenges for many types of clustering algorithm. The fuzzy k-means algorithm recently has been extended to address nonspherical clusters with the use of scatter matrices. Four variants were proposed, each optimizing a different measure of clustering "goodness." METHODS: With AFC data obtained from marine phytoplankton species in culture, the four fuzzy k-means algorithm variants were compared with each other and with another multivariate clustering algorithm based on critical distances currently used in flow cytometry. RESULTS: One of the algorithm variants (adaptive distances, also known as the Gustafson-Kessel algorithm) was found to be robust and reliable, whereas the others showed various problems. CONCLUSIONS: The adaptive distances algorithm was superior in use to the clustering algorithms against which it was tested, but the problem of automatic determination of the number of clusters remains to be addressed.

7.

Data standards for flow cytometry

Spidlen J Gentleman RC Haaland PD Langille M Le Meur N Ochs MF Schmitt C Smith CA Treister AS Brinkman RR 《Omics : a journal of integrative biology》2006,10(2):209-214

Flow cytometry (FCM) is an analytical tool widely used for cancer and HIV/AIDS research and treatment, stem cell manipulation, and the detection of microorganisms in environmental samples. Current data standards do not capture the full scope of FCM experiments, and there is a demand for software tools that can assist in the exploration and analysis of large FCM datasets. We are implementing a standardized approach to capturing, analyzing, and disseminating FCM data that will facilitate both more complex analyses and analysis of datasets that could not previously be efficiently studied. Initial work has focused on developing a community-based guideline for recording and reporting the details of FCM experiments. Open source software tools that implement this standard are being created, with an emphasis on facilitating reproducible and extensible data analyses. In addition, tools for electronic collaboration will assist the integrated access and comprehension of experiments to empower users to collaborate on FCM analyses. This coordinated, joint development of bioinformatics standards and software tools for FCM data analysis has the potential to greatly facilitate both basic and clinical research, impacting a notably diverse range of medical and environmental research areas.

8.

Single and multi-subject clustering of flow cytometry data for cell-type identification and anomaly detection

Maziyar Baran Pouyan Vasu Jindal Javad Birjandtalab Mehrdad Nourani 《BMC medical genomics》2016,9(2):41

Background

Measurement of various markers of single cells using flow cytometry has several biological applications. These applications include improving our understanding of the behavior of cellular systems, identifying rare cell populations, and personalized medication. A common critical issue in the existing methods is identification of the number of cellular populations, which heavily affects the accuracy of results. Furthermore, anomaly detection is crucial in flow cytometry experiments. In this work, we propose a two-stage clustering technique for cell type identification in single subject flow cytometry data and extend it for anomaly detection among multiple subjects.

Results

Our experimentation on 42 flow cytometry datasets indicates high performance and accurate clustering (F-measure > 91%) in identifying main cellular populations. Furthermore, our anomaly detection technique, evaluated on an Acute Myeloid Leukemia dataset, yields fewer than 2% false positives.


9.

Data handling strategies for high throughput pyrosequencers

Trombetti GA Bonnal RJ Rizzi E De Bellis G Milanesi L 《BMC bioinformatics》2007,8(Z1):S22


10.

Using phytoplankton and flow cytometry to analyze grazing by marine organisms (Cited by: 2; self-citations: 0; citations by others: 2)

T L Cucci S E Shumway W S Brown C R Newell 《Cytometry》1989,10(5):659-669

Phytoplankton can, through their autofluorescent characteristics, be thought of as tracer particles in much the same way as fluorescent microspheres when used in particle uptake experiments. Flow cytometric techniques can be used to differentiate phytoplankton from other suspended particles by the two primary autofluorescing photosynthetic pigments, chlorophyll and phycoerythrin. Based on these characteristics, phytoplankton assemblages have been used to assess grazing rates, particle selectivity, and endocytotic abilities in various marine species, from single-celled organisms to higher invertebrates.

11.

Statistical considerations for high throughput screening data

Xian-Jin Xie 《Frontiers in Biology》2010,5(4):354-360

High throughput screening (HTS) is a widely used and effective approach in genome-wide association and large scale protein expression studies, drug discovery, and biomedical imaging research. The question of how to accurately identify candidate ‘targets’ or biologically meaningful features with a high degree of confidence has led to extensive statistical research in an effort to minimize both false-positive and false-negative rates. A large body of literature on this topic with in-depth statistical content is available. We examine currently available statistical methods for HTS and aim to summarize some selected methods in a concise, easy-to-follow introduction for experimental biologists.

12.

Introduction to flow cytometry data file standard (Cited by: 2; self-citations: 0; citations by others: 2)

P N Dean C B Bagwell T Lindmo R F Murphy G C Salzman 《Cytometry》1990,11(3):321-322

The Data File Standards Committee of the Society for Analytical Cytology presents a Standard to be used for the storage of data associated with flow cytometric measurements. The Standard specifies a format that provides for the inclusion of all information necessary to fully describe: 1) the instrument used for the measurement; 2) the sample measured; 3) the data obtained; and 4) the results of analysis of the data. The Committee and the Society for Analytical Cytology point out that the use of this Standard by all those individuals and companies that generate or use data taken with flow cytometers or generate methods of analysis for the data will encourage the sharing of such data and methods of analysis.

13.

Using flowViz to visualize flow cytometry data

Sarkar D Le Meur N Gentleman R 《Bioinformatics (Oxford, England)》2008,24(6):878-879

Summary: Automated analysis of flow cytometry (FCM) data is essential for it to become successful as a high throughput technology. We believe that the principles of Trellis graphics can be adapted to provide useful visualizations that can aid such automation. In this article, we describe the R/Bioconductor package flowViz that implements such visualizations. Availability: flowViz is available as an R package from the Bioconductor project: http://bioconductor.org Contact: dsarkar{at}fhcrc.org Associate Editor: Olga Troyanskaya

14.

3-D clustering: a tool for high throughput docking

John P. Priestle 《Journal of molecular modeling》2009,15(5):551-560

This report describes a computer program for clustering docking poses based on their 3-dimensional (3D) coordinates as well as on their chemical structures. This is chiefly intended for reducing a set of hits coming from high throughput docking, since the capacity to prepare and biologically test such molecules is generally far more limited than the capacity to generate such hits. The advantage of clustering molecules based on 3D, rather than 2D, criteria is that small variations on a scaffold may bring about different binding modes for molecules that would not be predicted by 2D similarity alone. The program does a pose-by-pose/atom-by-atom comparison of a set of docking hits (poses), scoring both spatial and chemical similarity. Using these pair-wise similarities, the whole set is clustered based on a user-supplied similarity threshold. An output coordinate file is created that mirrors the input coordinate file, but contains two new properties: a cluster number and similarity to the cluster center. Poses in this output file can easily be sorted by cluster and displayed together for visual inspection with any standard molecular viewing program, and decisions made about which molecule should be selected for biological testing as the best representative of this group of similar molecules with similar binding modes.
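The abstract does not say which clustering scheme the program uses, so the following Python sketch substitutes a simple greedy "leader" scheme to show how a precomputed pairwise similarity matrix and a user-supplied threshold can yield the two output properties named above (cluster number and similarity to the cluster center). The function name `threshold_cluster` and the toy matrix are hypothetical.

```python
def threshold_cluster(similarity, threshold):
    """Greedy leader clustering on a square pairwise similarity matrix.

    Each pose joins the first existing cluster whose center it matches with
    similarity >= threshold; otherwise it becomes the center of a new cluster.
    Returns one (cluster_number, similarity_to_center) pair per pose.
    """
    centers = []      # index of the pose acting as center of each cluster
    assignment = []
    for i in range(len(similarity)):
        for cluster_id, center in enumerate(centers):
            if similarity[i][center] >= threshold:
                assignment.append((cluster_id, similarity[i][center]))
                break
        else:
            # No sufficiently similar center found: start a new cluster
            centers.append(i)
            assignment.append((len(centers) - 1, similarity[i][i]))
    return assignment

# Toy similarity matrix for four poses (illustrative values only)
toy_similarity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
clusters = threshold_cluster(toy_similarity, threshold=0.7)
```

A greedy scheme like this depends on the input order of the poses; sorting poses by docking score first would be one way to make the best-scoring pose the center of each cluster.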

15.

Misty Mountain clustering: application to fast unsupervised flow cytometry gating

István P Sugár Stuart C Sealfon 《BMC bioinformatics》2010,11(1):502

Background

There are many important clustering questions in computational biology for which no satisfactory method exists. Automated clustering algorithms, when applied to large, multidimensional datasets such as flow cytometry data, prove unsatisfactory in terms of speed, problems with local minima, or cluster shape bias. Model-based approaches are restricted by the assumptions of the fitting functions. Furthermore, model-based clustering requires serial clustering for all cluster numbers within a user-defined interval. The final cluster number is then selected by various criteria. These supervised serial clustering methods are time consuming, and different criteria frequently result in different optimal cluster numbers. Various unsupervised heuristic approaches that have been developed, such as affinity propagation, are too expensive to be applied to datasets on the order of 10⁶ points that are often generated by high throughput experiments.

16.

MIPHENO: data normalization for high throughput metabolite analysis

Shannon M Bell Lyle D Burgoon Robert L Last 《BMC bioinformatics》2012,13(1):10

Background

High throughput methodologies such as microarrays, mass spectrometry and plate-based small molecule screens are increasingly used to facilitate discoveries from gene function to drug candidate identification. These large-scale experiments are typically carried out over the course of months and years, often without the controls needed to compare directly across the dataset. Few methods are available to facilitate comparisons of high throughput metabolic data generated in batches where explicit in-group controls for normalization are lacking.

17.

The spectral networks paradigm in high throughput mass spectrometry

Guthals A Watrous JD Dorrestein PC Bandeira N 《Molecular bioSystems》2012,8(10):2535-2544

High-throughput proteomics is made possible by a combination of modern mass spectrometry instruments capable of generating many millions of tandem mass (MS(2)) spectra on a daily basis and the increasingly sophisticated associated software for their automated identification. Despite the growing accumulation of collections of identified spectra and the regular generation of MS(2) data from related peptides, the mainstream approach for peptide identification is still the nearly two decades old approach of matching one MS(2) spectrum at a time against a database of protein sequences. Moreover, database search tools overwhelmingly continue to require that users guess in advance a small set of 4-6 post-translational modifications that may be present in their data in order to avoid incurring substantial false positive and negative rates. The spectral networks paradigm for analysis of MS(2) spectra differs from the mainstream database search paradigm in three fundamental ways. First, spectral networks are based on matching spectra against other spectra instead of against protein sequences. Second, spectral networks find spectra from related peptides even before considering their possible identifications. Third, spectral networks determine consensus identifications from sets of spectra from related peptides instead of separately attempting to identify one spectrum at a time. Even though spectral networks algorithms are still in their infancy, they have already delivered the longest and most accurate de novo sequences to date, revealed a new route for the discovery of unexpected post-translational modifications and highly-modified peptides, enabled automated sequencing of cyclic non-ribosomal peptides with unknown amino acids and are now defining a novel approach for mapping the entire molecular output of biological systems that is suitable for analysis with tandem mass spectrometry. 
Here we review the current state of spectral networks algorithms and discuss possible future directions for automated interpretation of spectra from any class of molecules.

18.

Detection and monitoring of normal and leukemic cell populations with hierarchical clustering of flow cytometry data

Fišer K Sieger T Schumich A Wood B Irving J Mejstříková E Dworzak MN 《Cytometry. Part A》2012,81(1):25-34

Flow cytometry is a valuable tool in research and diagnostics including minimal residual disease (MRD) monitoring of hematologic malignancies. However, its gradual advancement toward increasing numbers of fluorescent parameters leads to information-rich datasets, which are challenging to analyze by standard gating, an approach that does not reflect the multidimensionality of the data. We have developed a novel method to analyze complex flow cytometry data, based on hierarchical clustering analysis (HCA) but with a new underlying algorithm, using the Mahalanobis distance measure. HCA is scalable to analyze complex multiparameter datasets (here demonstrated on up to 12-color flow cytometry and on a 20-parameter synthetic dataset). We have validated this method by comparison with standard gating approaches when performed independently by expert cytometrists. Acute lymphoblastic leukemia blast populations were analyzed in diagnostic and follow-up datasets (n = 123) from three centers. HCA results correlated very well (Passing-Bablok correlation coefficient = 0.992, slope = 1, intercept = -0.01) with standard gating data obtained by the I-BFM FLOW-MRD study group. To further improve the performance in follow-up samples with low MRD levels and to automate MRD detection, we combined HCA with support vector machine (SVM) learning. HCA in combination with SVM provides a novel diagnostic tool that not only allows analysis of increasingly complex flow cytometry data but also is less observer-dependent compared with classical gating and has potential for automation.
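The Mahalanobis distance named in this abstract can be sketched in a few lines of Python. The sketch shows only the distance from a point to a population, not the authors' clustering algorithm; the function name `mahalanobis` and the synthetic elongated population are assumptions for illustration.

```python
import numpy as np

def mahalanobis(point, cluster):
    """Mahalanobis distance from `point` to the mean of `cluster` (rows = events)."""
    cluster = np.asarray(cluster, dtype=float)
    mean = cluster.mean(axis=0)
    cov = np.cov(cluster, rowvar=False)            # sample covariance of the cluster
    diff = np.asarray(point, dtype=float) - mean
    # Solve cov @ z = diff instead of forming the inverse explicitly
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

# Synthetic elongated population: wide along the first axis, narrow along the second
rng = np.random.default_rng(1)
cluster = rng.normal(size=(500, 2)) * np.array([10.0, 1.0])

# Two points at roughly the same Euclidean distance from the cluster center
along = mahalanobis([10.0, 0.0], cluster)    # lies along the long axis
across = mahalanobis([0.0, 10.0], cluster)   # lies across the short axis
```

Because the distance is scaled by the population's covariance, the point along the long axis is much closer than the point across it, which is why this measure suits the elongated, non-spherical populations typical of flow cytometry data.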

19.

Microscope-based multiparameter laser scanning cytometer yielding data comparable to flow cytometry data (Cited by: 7; self-citations: 0; citations by others: 7)

L A Kamentsky L D Kamentsky 《Cytometry》1991,12(5):381-387

We describe a computer-controlled laser scanning cytometer with a 10-micron spot size for making multiple wavelength fluorescence and scatter measurements of unconstrained cells on a surface such as a microscope slide. Designated areas of slides placed on a microscope stage are automatically scanned, and cells which generate above-threshold scatter or fluorescence values are found and individually processed to determine a list of measurement parameters. For each fluorescence or scatter measurement parameter, this list contains the integrated and peak values and bit pattern images of a scan window centered on the cell. The measurement time, the position of the cell on the slide, and two segmentation indices are also included in the list. Measurement time, cell position, and properties derived from the bit patterns are used interchangeably with integrated or peak measurement values as coordinates of multiproperty displays. Cells may be selected for counting, data display in various forms, or visual observation based on their meeting complex criteria among a chain of two property screens. Cells with selected properties may be viewed during an experiment or retrospectively. A designated specimen field may be repeatedly remeasured to perform kinetic cell studies. An argon ion and a HeNe-based laser instrument have been constructed and software has been written and evaluated with the specific goal of increasing the precision of propidium iodide-stained cellular DNA measurements. Some of the capabilities of the instrument and its current performance are described.

20.

Modeling flow cytometry data for cancer vaccine immune monitoring

Jacob Frelinger Janet Ottinger Cécile Gouttefangeas Cliburn Chan 《Cancer immunology, immunotherapy : CII》2010,59(9):1435-1441

Flow cytometry (FCM) is widely used in cancer research for diagnosis, detection of minimal residual disease, as well as immune monitoring and profiling following immunotherapy. In all these applications, the challenge is to detect extremely rare cell subsets while avoiding spurious positive events. To achieve this objective, it helps to be able to analyze FCM data using multiple markers simultaneously, since the additional information provided often helps to minimize the number of false positive and false negative events, hence increasing both sensitivity and specificity. However, with manual gating, at most two markers can be examined in a single dot plot, and a sequential strategy is often used. As the sequential strategy discards events that fall outside preceding gates at each stage, the effectiveness of the strategy is difficult to evaluate without laborious and painstaking back-gating. Model-based analysis is a promising computational technique that works using information from all marker dimensions simultaneously, and offers an alternative approach to flow analysis that can usefully complement manual gating in the design of optimal gating strategies. Results from model-based analysis will be illustrated with examples from FCM assays commonly used in cancer immunotherapy laboratories.