Similar articles
Found 20 similar articles (search time: 0 ms)
1.
BACKGROUND: The recent development of semiautomated techniques for staining and analyzing flow cytometry samples has presented new challenges. Quality control and quality assessment are critical when developing new high throughput technologies and their associated information services. Our experience suggests that significant bottlenecks remain in the development of high throughput flow cytometry methods for data analysis and display. In particular, data quality control and quality assessment are crucial steps in processing and analyzing high throughput flow cytometry data. METHODS: We propose a variety of graphical exploratory data analytic tools for exploring ungated flow cytometry data. We have implemented a number of specialized functions and methods in the Bioconductor package rflowcyt. We demonstrate the use of these approaches by investigating two independent sets of high throughput flow cytometry data. RESULTS: We found that graphical representations can reveal substantial nonbiological differences in samples. Empirical Cumulative Distribution Function and summary scatterplots were especially useful in the rapid identification of problems not identified by manual review. CONCLUSIONS: Graphical exploratory data analytic tools are quick and useful means of assessing data quality. We propose that the described visualizations should be used as quality assessment tools and, where possible, for quality control.
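
The ECDF comparison mentioned in the abstract above can be illustrated with a minimal sketch. This is not the rflowcyt implementation: the function names, the well readings, and the 0.5 cutoff are all hypothetical, chosen only to show how the largest gap between two empirical cumulative distribution functions can flag a sample that differs from a reference for nonbiological reasons.

```python
from bisect import bisect_right


def ecdf(values):
    """Return a function F where F(x) is the fraction of values <= x."""
    ordered = sorted(values)
    n = len(ordered)

    def cdf(x):
        return bisect_right(ordered, x) / n

    return cdf


def max_ecdf_gap(sample_a, sample_b):
    """Largest vertical gap between two ECDFs (the two-sample KS statistic)."""
    cdf_a, cdf_b = ecdf(sample_a), ecdf(sample_b)
    grid = sorted(set(sample_a) | set(sample_b))
    return max(abs(cdf_a(x) - cdf_b(x)) for x in grid)


# Hypothetical forward-scatter readings from three wells of one plate.
wells = {
    "A1": [210, 220, 225, 230, 240, 250],
    "A2": [212, 218, 228, 232, 238, 252],
    "A3": [310, 325, 330, 340, 355, 360],  # shifted well, e.g. an instrument artifact
}

reference = wells["A1"]
gaps = {name: max_ecdf_gap(reference, values) for name, values in wells.items()}

# 0.5 is an arbitrary illustrative cutoff, not a recommended threshold.
flagged = [name for name, gap in gaps.items() if gap > 0.5]
print(flagged)
```

In practice the curves would be plotted and inspected visually, as the abstract describes; the numeric gap is only a convenient stand-in for what the eye picks up in an overlay of ECDFs.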

2.

Background  

In a high throughput setting, effective flow cytometry data analysis depends heavily on proper data preprocessing. While usual preprocessing steps of quality assessment, outlier removal, normalization, and gating have received considerable scrutiny from the community, the influence of data transformation on the output of high throughput analysis has been largely overlooked. Flow cytometry measurements can vary over several orders of magnitude, and cell populations can have variances that depend on their mean fluorescence intensities and may exhibit heavily skewed distributions. Consequently, the choice of data transformation can influence the output of automated gating. An appropriate data transformation aids in data visualization and gating of cell populations across the range of data. Experience shows that the choice of transformation is data specific. Our goal here is to compare the performance of different transformations applied to flow cytometry data in the context of automated gating in a high throughput, fully automated setting. We examine the most common transformations used in flow cytometry, including the generalized hyperbolic arcsine, biexponential, linlog, and generalized Box-Cox, all within the Bioconductor flowCore framework that is widely used in high throughput, automated flow cytometry data analysis. All of these transformations have adjustable parameters whose effects upon the data are non-intuitive for most users. By making some modelling assumptions about the transformed data, we develop maximum likelihood criteria to optimize parameter choice for these different transformations.
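
To make the role of the adjustable parameters more concrete, here is a small sketch of two of the transformation families named above, in their textbook forms. It does not reproduce the flowCore parameterizations or the maximum likelihood criteria from the paper; the `cofactor` and `lam` names and the sample intensities are assumptions made for illustration.

```python
import math


def arcsinh_transform(x, cofactor=150.0):
    """Scaled inverse hyperbolic sine: near-linear around 0, log-like for large |x|."""
    return math.asinh(x / cofactor)


def box_cox(x, lam):
    """Classic Box-Cox power transform for x > 0; lam == 0 gives the natural log."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive input")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


# Hypothetical fluorescence intensities spanning several orders of magnitude,
# including a negative value of the kind produced by compensation.
intensities = [-50.0, 0.0, 50.0, 500.0, 5000.0, 50000.0]

# The same data look very different depending on the (assumed) cofactor.
for cofactor in (5.0, 150.0, 1000.0):
    transformed = [round(arcsinh_transform(x, cofactor), 2) for x in intensities]
    print(cofactor, transformed)
```

A small cofactor stretches the region near zero and compresses bright events, while a large one keeps most of the range close to linear, which is one way to see why the parameter choice can change what an automated gating step finds.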

3.

Background  

Recent advances in automation technologies have enabled the use of flow cytometry for high throughput screening, generating large complex data sets often in clinical trials or drug discovery settings. However, data management and data analysis methods have not advanced sufficiently far from the initial small-scale studies to support modeling in the presence of multiple covariates.

4.
BACKGROUND: This study examined whether hierarchical clustering could be used to detect cell states induced by treatment combinations that were generated through automation and high-throughput (HT) technology. Data-mining techniques were used to analyze the large experimental data sets to determine whether nonlinear, non-obvious responses could be extracted from the data. METHODS: Unary, binary, and ternary combinations of pharmacological factors (examples of stimuli) were used to induce differentiation of HL-60 cells using a HT automated approach. Cell profiles were analyzed by incorporating hierarchical clustering methods on data collected by flow cytometry. Data-mining techniques were used to explore the combinatorial space for nonlinear, unexpected events. Additional small-scale, follow-up experiments were performed on cellular profiles of interest. RESULTS: Multiple, distinct cellular profiles were detected using hierarchical clustering of expressed cell-surface antigens. Data-mining of this large, complex data set retrieved cases of both factor dominance and cooperativity, as well as atypical cellular profiles. Follow-up experiments found that treatment combinations producing "atypical cell types" made those cells more susceptible to apoptosis. CONCLUSIONS: Hierarchical clustering and other data-mining techniques were applied to analyze large data sets from HT flow cytometry. From each sample, the data set was filtered and used to define discrete, usable states that were then related back to their original formulations. Analysis of resultant cell populations induced by a multitude of treatments identified unexpected phenotypes and nonlinear response profiles.

5.
BACKGROUND: Conventional flow cytometry does not allow the rapid analysis of multiple samples. This has limited its uses in drug discovery, for which the standard for throughput is 100,000 samples per day. METHODS: We describe a simple method in which commercial peristaltic tubing is connected from a commercial autosampler to a flow cytometer. The samples are delivered via a peristaltic pump from source wells in a multiwell plate. The samples are separated by air bubbles. RESULTS: Throughput rates approach the limit of the autosampler (up to 100 wells per minute). Using optimal tubing and flow rates, particles remain within appropriate light scatter and fluorescence gates. The carryover between wells is typically less than 5% without and 1% with a wash step. The volumes of sample delivered are in the microliter scale. The approach has been validated with instruments from three manufacturers. CONCLUSIONS: Flow cytometry has potential throughput of 100,000 samples or more per day starting with the method described. The method is currently best suited to end-point assays. However, combined with high-speed sorting and single-cell assays, the number of assays could approach 1 billion per day.
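
The throughput claim can be checked with simple arithmetic. The sketch below uses only the two figures quoted in the abstract (up to 100 wells per minute, and a target of 100,000 samples per day) and assumes, unrealistically, uninterrupted 24-hour operation with no plate changes or maintenance.

```python
MINUTES_PER_DAY = 24 * 60

wells_per_minute = 100          # upper rate quoted in the abstract
target_per_day = 100_000        # drug-discovery throughput standard quoted in the abstract

# Assumption: the sampler runs continuously for a full day.
max_wells_per_day = wells_per_minute * MINUTES_PER_DAY
required_rate = target_per_day / MINUTES_PER_DAY

print(max_wells_per_day)        # 144000
print(round(required_rate, 1))  # 69.4
```

Under that assumption the quoted peak rate leaves some headroom over the 100,000-per-day standard, and an average of roughly 70 wells per minute would already be enough to meet it.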

6.
BACKGROUND: Artificial neural networks (ANNs) have been shown to be valuable in the analysis of analytical flow cytometric (AFC) data in aquatic ecology. Automated extraction of clusters is an important first stage in deriving ANN training data from field samples, but AFC data pose a number of challenges for many types of clustering algorithm. The fuzzy k-means algorithm recently has been extended to address nonspherical clusters with the use of scatter matrices. Four variants were proposed, each optimizing a different measure of clustering "goodness." METHODS: With AFC data obtained from marine phytoplankton species in culture, the four fuzzy k-means algorithm variants were compared with each other and with another multivariate clustering algorithm based on critical distances currently used in flow cytometry. RESULTS: One of the algorithm variants (adaptive distances, also known as the Gustafson-Kessel algorithm) was found to be robust and reliable, whereas the others showed various problems. CONCLUSIONS: The adaptive distances algorithm was superior in use to the clustering algorithms against which it was tested, but the problem of automatic determination of the number of clusters remains to be addressed.

7.
We have investigated the use of hierarchical clustering of flow cytometry data to classify samples of conventional central chondrosarcoma, a malignant cartilage forming tumor of uncertain cellular origin, according to similarities with surface marker profiles of several known cell types. Human primary chondrosarcoma cells, articular chondrocytes, mesenchymal stem cells, fibroblasts, and a panel of tumor cell lines of chondrocytic or epithelial origin were clustered based on the expression profile of eleven surface markers. For clustering, eight hierarchical clustering algorithms, three distance metrics, as well as several approaches for data preprocessing, including multivariate outlier detection, logarithmic transformation, and z‐score normalization, were systematically evaluated. By selecting clustering approaches shown to give reproducible results for cluster recovery of known cell types, primary conventional central chondrosarcoma cells could be grouped in two main clusters with distinctive marker expression signatures: one group clustering together with mesenchymal stem cells (CD49b‐high/CD10‐low/CD221‐high) and a second group clustering close to fibroblasts (CD49b‐low/CD10‐high/CD221‐low). Hierarchical clustering also revealed substantial differences between primary conventional central chondrosarcoma cells and established chondrosarcoma cell lines, with the latter not only segregating apart from primary tumor cells and normal tissue cells, but clustering together with cell lines from epithelial lineage. Our study provides a foundation for the use of hierarchical clustering applied to flow cytometry data as a powerful tool to classify samples according to marker expression patterns, which could lead to the discovery of new cancer subtypes. J. Cell. Physiol. 225: 601–611, 2010. © 2010 Wiley‐Liss, Inc.
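
The preprocessing and clustering steps named in this abstract (z-score normalization followed by hierarchical clustering) can be sketched as follows. This is only one of the many algorithm and metric combinations the study evaluated, written naively for readability; the sample names and expression values are invented, with three columns standing in for CD49b, CD10, and CD221 on an assumed log scale.

```python
import math
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each marker (column) to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def average_linkage(rows, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(rows))]

    def cluster_distance(a, b):
        # Average Euclidean distance over all cross-cluster pairs.
        return mean(math.dist(rows[i], rows[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = [(cluster_distance(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical marker profiles (CD49b, CD10, CD221); values are made up.
samples = ["MSC", "tumor_1", "tumor_2", "fibroblast", "tumor_3", "tumor_4"]
profiles = [
    [3.9, 1.1, 3.5],
    [3.7, 1.3, 3.6],
    [4.0, 1.0, 3.3],
    [1.2, 3.8, 1.0],
    [1.0, 3.6, 1.3],
    [1.4, 3.9, 1.1],
]

groups = average_linkage(zscore_columns(profiles), n_clusters=2)
labelled = [[samples[i] for i in group] for group in groups]
print(labelled)
```

With these invented values, two tumor samples fall in the same group as the stem cell reference and two in the same group as the fibroblast reference, mirroring the kind of grouping the abstract reports. Real data would call for a library implementation and a check of how stable the grouping is across linkage methods and distance metrics, which is what the study set out to evaluate.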

8.
Flow cytometry (FCM) is an analytical tool widely used for cancer and HIV/AIDS research and treatment, stem cell manipulation, and detection of microorganisms in environmental samples. Current data standards do not capture the full scope of FCM experiments and there is a demand for software tools that can assist in the exploration and analysis of large FCM datasets. We are implementing a standardized approach to capturing, analyzing, and disseminating FCM data that will facilitate both more complex analyses and analysis of datasets that could not previously be efficiently studied. Initial work has focused on developing a community-based guideline for recording and reporting the details of FCM experiments. Open source software tools that implement this standard are being created, with an emphasis on facilitating reproducible and extensible data analyses. In addition, tools for electronic collaboration will assist the integrated access and comprehension of experiments to empower users to collaborate on FCM analyses. This coordinated, joint development of bioinformatics standards and software tools for FCM data analysis has the potential to greatly facilitate both basic and clinical research, impacting a notably diverse range of medical and environmental research areas.

9.

Background

Measurement of various markers of single cells using flow cytometry has several biological applications. These applications include improving our understanding of the behavior of cellular systems, identifying rare cell populations, and personalized medicine. A common critical issue in the existing methods is identification of the number of cellular populations, which heavily affects the accuracy of results. Furthermore, anomaly detection is crucial in flow cytometry experiments. In this work, we propose a two-stage clustering technique for cell type identification in single subject flow cytometry data and extend it for anomaly detection among multiple subjects.

Results

Our experimentation on 42 flow cytometry datasets indicates high performance and accurate clustering (F-measure > 91 %) in identifying main cellular populations. Furthermore, our anomaly detection technique, evaluated on an acute myeloid leukemia dataset, yields fewer than 2 % false positives.
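
For readers unfamiliar with the two metrics quoted in these results, the following sketch shows how an F-measure and a false positive rate are computed for a single population label. The abstract does not say which F-measure variant for clusterings was used, so the per-label form here is an assumption, and the manual and automated labels are invented.

```python
def confusion_counts(true_labels, predicted_labels, positive):
    """Count true/false positives and negatives for one population label."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, fn, tn


def f_measure(true_labels, predicted_labels, positive):
    """Harmonic mean of precision and recall for one population."""
    tp, fp, fn, _ = confusion_counts(true_labels, predicted_labels, positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def false_positive_rate(true_labels, predicted_labels, positive):
    """Share of true negatives that were wrongly called positive."""
    _, fp, _, tn = confusion_counts(true_labels, predicted_labels, positive)
    return fp / (fp + tn) if fp + tn else 0.0


# Hypothetical labels for ten cells: manual gating versus an automated clustering.
manual = ["T", "T", "T", "T", "B", "B", "B", "B", "B", "B"]
automated = ["T", "T", "T", "B", "B", "B", "B", "B", "B", "T"]

print(f_measure(manual, automated, positive="T"))            # 0.75
print(false_positive_rate(manual, automated, positive="T"))
```

In the anomaly detection setting of the abstract, the same false positive rate would be computed over subjects rather than over cells, with "positive" meaning a subject flagged as anomalous.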

10.
Phytoplankton can, through their autofluorescent characteristics, be thought of as tracer particles in much the same way as fluorescent microspheres when used in particle uptake experiments. Flow cytometric techniques can be used to differentiate phytoplankton from other suspended particles by the two primary autofluorescing photosynthetic pigments, chlorophyll and phycoerythrin. Based on these characteristics, phytoplankton assemblages have been used to assess grazing rates, particle selectivity, and endocytotic abilities in various marine species, from single-celled organisms to higher invertebrates.

11.
Introduction to flow cytometry data file standard   Total citations: 2, self-citations: 0, citations by others: 2
The Data File Standards Committee of the Society for Analytical Cytology presents a Standard to be used for the storage of data associated with flow cytometric measurements. The Standard specifies a format that provides for the inclusion of all information necessary to fully describe: 1) the instrument used for the measurement; 2) the sample measured; 3) the data obtained; and 4) the results of analysis of the data. The Committee and the Society for Analytical Cytology point out that the use of this Standard by all those individuals and companies that generate or use data taken with flow cytometers or generate methods of analysis for the data will encourage the sharing of such data and methods of analysis.

12.
Summary: Automated analysis of flow cytometry (FCM) data is essential for it to become successful as a high throughput technology. We believe that the principles of Trellis graphics can be adapted to provide useful visualizations that can aid such automation. In this article, we describe the R/Bioconductor package flowViz that implements such visualizations. Availability: flowViz is available as an R package from the Bioconductor project: http://bioconductor.org Contact: dsarkar{at}fhcrc.org Associate Editor: Olga Troyanskaya

13.
14.
High throughput screening (HTS) is a widely used, effective approach in genome-wide association and large scale protein expression studies, drug discovery, and biomedical imaging research. The question of how to accurately identify candidate ‘targets’ or biologically meaningful features with a high degree of confidence has led to extensive statistical research in an effort to minimize both false-positive and false-negative rates. A large body of literature on this topic with in-depth statistical contents is available. We examine currently available statistical methods on HTS and aim to summarize some selected methods into a concise, easy-to-follow introduction for experimental biologists.

15.
This report describes a computer program for clustering docking poses based on their 3-dimensional (3D) coordinates as well as on their chemical structures. This is chiefly intended for reducing a set of hits coming from high throughput docking, since the capacity to prepare and biologically test such molecules is generally far more limited than the capacity to generate such hits. The advantage of clustering molecules based on 3D, rather than 2D, criteria is that small variations on a scaffold may bring about different binding modes for molecules that would not be predicted by 2D similarity alone. The program does a pose-by-pose/atom-by-atom comparison of a set of docking hits (poses), scoring both spatial and chemical similarity. Using these pair-wise similarities, the whole set is clustered based on a user-supplied similarity threshold. An output coordinate file is created that mirrors the input coordinate file, but contains two new properties: a cluster number and similarity to the cluster center. Poses in this output file can easily be sorted by cluster and displayed together for visual inspection with any standard molecular viewing program, and decisions can then be made about which molecule should be selected for biological testing as the best representative of this group of similar molecules with similar binding modes.
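
The workflow in this abstract (score pairs of poses, cluster against a user-supplied threshold, report a cluster number and a similarity to the cluster center) can be sketched as below. The abstract does not give the program's scoring function or clustering procedure, so both are stand-ins: `pose_similarity` is a hypothetical score that counts atoms with a same-element partner within an assumed 1.0 angstrom cutoff, and `cluster_poses` uses simple leader clustering rather than the full pairwise scheme the report describes.

```python
import math


def pose_similarity(pose_a, pose_b, cutoff=1.0):
    """Fraction of atoms in pose_a with a same-element atom of pose_b within `cutoff`."""
    matched = sum(
        any(elem_a == elem_b and math.dist(xyz_a, xyz_b) <= cutoff
            for elem_b, xyz_b in pose_b)
        for elem_a, xyz_a in pose_a
    )
    return matched / len(pose_a)


def cluster_poses(poses, threshold):
    """Leader clustering: returns (cluster_number, similarity_to_center) per pose."""
    centers = []
    assignments = []
    for pose in poses:
        scores = [pose_similarity(pose, center) for center in centers]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            assignments.append((best + 1, scores[best]))
        else:
            # No existing center is similar enough: this pose starts a new cluster.
            centers.append(pose)
            assignments.append((len(centers), 1.0))
    return assignments


# Hypothetical three-atom poses given as (element, (x, y, z)) in angstroms.
pose_1 = [("C", (0.0, 0.0, 0.0)), ("N", (1.5, 0.0, 0.0)), ("O", (3.0, 0.0, 0.0))]
pose_2 = [("C", (0.2, 0.1, 0.0)), ("N", (1.6, 0.1, 0.0)), ("O", (4.2, 0.0, 0.0))]
pose_3 = [("C", (10.0, 0.0, 0.0)), ("N", (11.5, 0.0, 0.0)), ("O", (13.0, 0.0, 0.0))]
pose_4 = [("C", (0.1, 0.0, 0.0)), ("O", (1.5, 0.0, 0.0)), ("N", (3.0, 0.0, 0.0))]

assignments = cluster_poses([pose_1, pose_2, pose_3, pose_4], threshold=0.6)
print(assignments)
```

Here pose_4 occupies the same space as pose_1 but with different atom types at each position, so it ends up in its own cluster, which illustrates why the abstract scores chemical as well as spatial similarity.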

16.

Background  

There are many important clustering questions in computational biology for which no satisfactory method exists. Automated clustering algorithms, when applied to large, multidimensional datasets, such as flow cytometry data, prove unsatisfactory in terms of speed, problems with local minima or cluster shape bias. Model-based approaches are restricted by the assumptions of the fitting functions. Furthermore, model-based clustering requires serial clustering for all cluster numbers within a user-defined interval. The final cluster number is then selected by various criteria. These supervised serial clustering methods are time consuming, and different criteria frequently result in different optimal cluster numbers. Various unsupervised heuristic approaches that have been developed, such as affinity propagation, are too expensive to be applied to datasets on the order of 10^6 points that are often generated by high throughput experiments.
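
A rough calculation shows why methods built on pairwise similarities become impractical at this scale. The sketch assumes a dense similarity matrix with one 64-bit float per pair of points; sparse variants exist, and the abstract does not state which cost it had in mind.

```python
n_points = 10**6
bytes_per_value = 8  # assumption: one 64-bit float per stored similarity

# A dense pairwise similarity matrix has one entry per ordered pair of points.
dense_entries = n_points ** 2
dense_terabytes = dense_entries * bytes_per_value / 1e12

print(dense_entries)     # 1000000000000
print(dense_terabytes)   # 8.0
```

Under these assumptions the similarities alone would occupy about 8 TB before any message passing starts, which gives a sense of the cost the abstract alludes to.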

17.
We present a method for generating gel-based unordered 2D arrays of bacterial cells of a very high density, up to 10^5 cells per mm^2. Bacteria in a suspension are focused into a thin layer when the suspension and a dry gel matrix penetrate each other. Formation of a second gel from gel-forming components contained in the suspension results in immobilization of the cells. The immobilized cells stay alive and can repeatedly divide to produce microcolonies. The method provides for high-throughput screening and massively parallel analysis of individual cells in large populations, as well as for rapid isolation of rare clones.

18.

Background  

High throughput methodologies such as microarrays, mass spectrometry and plate-based small molecule screens are increasingly used to facilitate discoveries from gene function to drug candidate identification. These large-scale experiments are typically carried out over the course of months and years, often without the controls needed to compare directly across the dataset. Few methods are available to facilitate comparisons of high throughput metabolic data generated in batches where explicit in-group controls for normalization are lacking.

19.
We describe a computer-controlled laser scanning cytometer with a 10-micron spot size for making multiple wavelength fluorescence and scatter measurements of unconstrained cells on a surface such as a microscope slide. Designated areas of slides placed on a microscope stage are automatically scanned, and cells which generate above-threshold scatter or fluorescence values are found and individually processed to determine a list of measurement parameters. For each fluorescence or scatter measurement parameter, this list contains the integrated and peak values and bit pattern images of a scan window centered on the cell. The measurement time, the position of the cell on the slide, and two segmentation indices are also included in the list. Measurement time, cell position, and properties derived from the bit patterns are used interchangeably with integrated or peak measurement values as coordinates of multiproperty displays. Cells may be selected for counting, data display in various forms, or visual observation based on their meeting complex criteria among a chain of two property screens. Cells with selected properties may be viewed during an experiment or retrospectively. A designated specimen field may be repeatedly remeasured to perform kinetic cell studies. An argon ion-based and a HeNe-based laser instrument have been constructed, and software has been written and evaluated with the specific goal of increasing the precision of propidium iodide-stained cellular DNA measurements. Some of the capabilities of the instrument and its current performance are described.

20.
Flow cytometry is a valuable tool in research and diagnostics including minimal residual disease (MRD) monitoring of hematologic malignancies. However, its gradual advancement toward increasing numbers of fluorescent parameters leads to information-rich datasets, which are challenging to analyze by standard gating, an approach that does not reflect the multidimensionality of the data. We have developed a novel method to analyze complex flow cytometry data, based on hierarchical clustering analysis (HCA) but with a new underlying algorithm, using the Mahalanobis distance measure. HCA is scalable to analyze complex multiparameter datasets (here demonstrated on up to 12 color flow cytometry and on a 20-parameter synthetic dataset). We have validated this method by comparison with standard gating approaches when performed independently by expert cytometrists. Acute lymphoblastic leukemia blast populations were analyzed in diagnostic and follow-up datasets (n = 123) from three centers. HCA results correlated very well (Passing-Bablok correlation coefficient = 0.992, slope = 1, intercept = -0.01) with standard gating data obtained by the I-BFM FLOW-MRD study group. To further improve the performance in follow-up samples with low MRD levels and to automate MRD detection, we combined HCA with support vector machine (SVM) learning. HCA in combination with SVM provides a novel diagnostic tool that not only allows analysis of increasingly complex flow cytometry data but also is less observer-dependent compared with classical gating and has potential for automation.
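
The Mahalanobis distance mentioned in this abstract differs from the Euclidean distance in that it accounts for the shape of a population. The sketch below works through the two-parameter case by hand; it is not the authors' clustering algorithm, and the event coordinates are invented to represent a population stretched along the diagonal of a two-color plot.

```python
import math
from statistics import mean


def covariance_2d(points):
    """Return the centroid and the sample covariance terms (sxx, sxy, syy)."""
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    n = len(points) - 1
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(point, center, cov):
    """Distance from `point` to `center`, scaled by the population's covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - center[0], point[1] - center[1]
    # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, written out for the 2x2 case.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)


# Hypothetical events from one elongated population (two fluorescence parameters).
population = [(1.0, 1.0), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2), (5.0, 4.8)]
center, cov = covariance_2d(population)

along = (5.0, 5.0)    # lies along the population's long axis
across = (5.0, 1.0)   # about the same Euclidean distance, but off-axis

print(math.dist(along, center), math.dist(across, center))
print(mahalanobis_2d(along, center, cov), mahalanobis_2d(across, center, cov))
```

The two test events are almost equally far from the centroid in Euclidean terms, yet the off-axis event is many times farther in Mahalanobis terms, which is the property that makes the measure attractive for elongated cell populations.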
