Similar literature
 20 similar articles found (search time: 0 ms)
1.
We describe a probabilistic approach to simultaneous image segmentation and intensity estimation for complementary DNA microarray experiments. The approach overcomes several limitations of existing methods. In particular, it (a) uses a flexible Markov random field approach to segmentation that allows for a wider range of spot shapes than existing methods, including relatively common 'doughnut-shaped' spots; (b) models the image directly as background plus hybridization intensity, and estimates the two quantities simultaneously, avoiding the common logical error that estimates of foreground may be less than those of the corresponding background if the two are estimated separately; and (c) uses a probabilistic modeling approach to simultaneously perform segmentation and intensity estimation, and to compute spot quality measures. We describe two approaches to parameter estimation: a fast algorithm, based on the expectation-maximization and the iterated conditional modes algorithms, and a fully Bayesian framework. These approaches produce comparable results, and both appear to offer some advantages over other methods. We use an HIV experiment to compare our approach to two commercial software products: Spot and Arrayvision.

2.
3.
A statistical model is proposed for the analysis of errors in microarray experiments and is employed in the analysis and development of a combined normalisation regime. Through analysis of the model and two-dye microarray data sets, this study found the following. The systematic error introduced by microarray experiments mainly involves spot intensity-dependent, feature-specific and spot position-dependent contributions. It is difficult to remove all these errors effectively without a suitable combined normalisation operation. Adaptive normalisation using a suitable regression technique is more effective in removing spot intensity-related dye bias than self-normalisation, while regional normalisation (block normalisation) is an effective way to correct spot position-dependent errors. However, dye-flip replicates are necessary to remove feature-specific errors, and also allow the analyst to identify the experimentally introduced dye bias contained in non-self-self data sets. In this case, the bias present in the data sets may include both experimentally introduced dye bias and the biological difference between two samples. Self-normalisation is capable of removing dye bias without identifying the nature of that bias. The performance of adaptive normalisation, on the other hand, depends on its ability to correctly identify the dye bias. If adaptive normalisation is combined with an effective dye bias identification method then there is no systematic difference between the outcomes of the two methods.
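The abstract above gives no formulas, so the following is only a minimal sketch of how self-normalisation with dye-flip replicates is commonly formulated. It assumes the dye bias is additive on the log-ratio scale and identical in both arrays of a dye-flip pair. The function names `log_ratio` and `self_normalise` are hypothetical and do not come from the paper.

```python
import math


def log_ratio(red, green):
    """Log2 ratio M = log2(R/G) for one spot (assumes positive intensities)."""
    return math.log2(red / green)


def self_normalise(m_forward, m_flipped):
    """Combine the log ratios of a dye-flip pair, spot by spot.

    Assumed model (not stated in the abstract):
        forward array:  M_f =  signal + bias
        flipped array:  M_r = -signal + bias
    Half the difference cancels the bias; half the sum isolates it.
    """
    corrected = [(f - r) / 2 for f, r in zip(m_forward, m_flipped)]
    bias = [(f + r) / 2 for f, r in zip(m_forward, m_flipped)]
    return corrected, bias
```

Under this assumed model the corrected values never require the bias to be characterised, which matches the abstract's remark that self-normalisation removes dye bias without identifying its nature.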

4.
Many quantitative cell biology questions require fast yet reliable automated image segmentation to identify and link cells from frame‐to‐frame, and characterize the cell morphology and fluorescence. We present SuperSegger, an automated MATLAB‐based image processing package well‐suited to quantitative analysis of high‐throughput live‐cell fluorescence microscopy of bacterial cells. SuperSegger incorporates machine‐learning algorithms to optimize cellular boundaries and automated error resolution to reliably link cells from frame‐to‐frame. Unlike existing packages, it can reliably segment microcolonies with many cells, facilitating the analysis of cell‐cycle dynamics in bacteria as well as cell‐contact mediated phenomena. This package has a range of built‐in capabilities for characterizing bacterial cells, including the identification of cell division events, mother, daughter and neighbouring cells, and computing statistics on cellular fluorescence, the location and intensity of fluorescent foci. SuperSegger provides a variety of postprocessing data visualization tools for single cell and population level analysis, such as histograms, kymographs, frame mosaics, movies and consensus images. Finally, we demonstrate the power of the package by analyzing lag phase growth with single cell resolution.

5.
6.
MOTIVATION: Although numerous algorithms have been developed for microarray segmentation, extensive comparisons between the algorithms have received far less attention. In this study, we evaluate the performance of nine microarray segmentation algorithms. Using both simulated and real microarray experiments, we overcome the challenges in performance evaluation that arise from the lack of ground-truth information. The use of simulated experiments allows us to analyze the segmentation accuracy at the single-pixel level, as is commonly done in traditional image processing studies. With real experiments, we indirectly measure the segmentation performance, identify significant differences between the algorithms, and study the characteristics of the resulting gene expression data. RESULTS: Overall, our results show clear differences between the algorithms. The results demonstrate how the segmentation performance depends on the image quality, which algorithms operate on significantly different performance levels, and how the selection of a segmentation algorithm affects the identification of differentially expressed genes. AVAILABILITY: Supplementary results and the microarray images used in this study are available at the companion web site http://www.cs.tut.fi/sgn/csb/spotseg/

7.
Microarrays are part of a new class of biotechnologies that allow the monitoring of expression levels for thousands of genes simultaneously. Image analysis is an important aspect of microarray experiments, one that can have a potentially large impact on subsequent analyses, such as clustering or the identification of differentially expressed genes. This paper reviews a number of existing image analysis methods used on cDNA microarray data. In particular, it describes and discusses the different segmentation and background adjustment methods. It was found that in some cases background adjustment can substantially reduce the precision (that is, increase the variability) of low-intensity spot values. In contrast, the choice of segmentation procedure seems to have a smaller impact.

8.
Xu D, Li G, Wu L, Zhou J, Xu Y. Bioinformatics (Oxford, England), 2002, 18(11): 1432-1437
MOTIVATION: DNA microarray is a powerful high-throughput tool for studying gene function and regulatory networks. Due to the problem of potential cross hybridization, using full-length genes for microarray construction is not appropriate in some situations. A bioinformatic tool, PRIMEGENS, has recently been developed for the automatic design of PCR primers using DNA fragments that are specific to individual open reading frames (ORFs). RESULTS: PRIMEGENS first carries out a BLAST search for each target ORF against all other ORFs of the genome to quickly identify possible homologous sequences. Then it performs optimal sequence alignment between the target ORF and each of its homologous ORFs using dynamic programming. PRIMEGENS uses the sequence alignments to select gene-specific fragments, and then feeds the fragments to the Primer3 program to design primer pairs for PCR amplification. PRIMEGENS can be run from the command line on Unix/Linux platforms as a stand-alone package or it can be used from a Web interface. The program runs efficiently, taking a few seconds per sequence on a typical workstation. PCR primers specific to individual ORFs from Shewanella oneidensis MR-1 and Deinococcus radiodurans R1 have been designed. The PCR amplification results indicate that this method is very efficient and reliable for designing specific probes for microarray analysis.

9.
10.
A mixture model-based approach to the clustering of microarray expression data   Total citations: 13, self-citations: 0, citations by others: 13
MOTIVATION: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. RESULTS: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes can be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets. AVAILABILITY: EMMIX-GENE is available at http://www.maths.uq.edu.au/~gjm/emmix-gene/

11.
Recently, the model-based roentgen stereophotogrammetric analysis (RSA) method has been developed as an in vivo tool to estimate the static pose and dynamic motion of instrumented prostheses. The two essential inputs for the RSA method are prosthetic models and roentgen images. During RSA calculation, the implants are often reversely scanned and input in the form of meshes to estimate the outline error between the prosthetic projection and the roentgen images. However, the execution efficiency of the RSA iterative calculation may limit its clinical practicability, and one reason for the inefficiency may be the very large number of meshes in the model. This study uses two methods of mesh manipulation to improve the execution efficiency of RSA calculation. The first is to simplify the model meshes and the other is to segment and delete the meshes of insignificant regions. An index (i.e. critical percentage) of an optimal element number is defined as the trade-off between execution efficiency and result accuracy. The predicted results are numerically validated using a total knee prosthetic system. The outcome shows that the optimal strategy of mesh manipulation is simplification followed by segmentation. On average, the element number can even be reduced to 1% of the original models. After the mesh manipulation, the execution efficiency can be increased by about 75% without compromising the accuracy of the predicted RSA results (the increment of rotation and translation error: 0.06° and 0.02 mm). In conclusion, prosthetic models should be manipulated by simplification and segmentation methods prior to the RSA calculation to increase the execution efficiency and thereby improve the clinical applicability of the RSA method.

12.
Wang S, Zhu J. Biometrics, 2008, 64(2): 440-448
Summary. Variable selection in high-dimensional clustering analysis is an important yet challenging problem. In this article, we propose two methods that simultaneously separate data points into similar clusters and select informative variables that contribute to the clustering. Our methods are in the framework of penalized model-based clustering. Unlike the classical L1-norm penalization, the penalty terms that we propose make use of the fact that parameters belonging to one variable should be treated as a natural "group." Numerical results indicate that the two new methods tend to remove noninformative variables more effectively and provide better clustering results than the L1-norm approach.

13.
CRCView is a user-friendly point-and-click web server for analyzing and visualizing microarray gene expression data using a Dirichlet process mixture model-based clustering algorithm. CRCView is designed to cluster genes based on their expression profiles. It allows flexible input data formats, rich graphical illustration, and integrated GO term-based annotation and interpretation of clustering results. Availability: http://helab.bioinformatics.med.umich.edu/crcview/.

14.
As the topological properties of each spot in DNA microarray images may vary from one another, we employed granulometries to understand the shape-size content contributed by significant intensity values within a spot. Analysis was performed on a microarray image that consisted of 240 spots by using concepts from mathematical morphology. In order to find indices for each spot and to further classify them, we adopted morphological multiscale openings, which provided microarrays at multiple scales. Successive opened microarrays were subtracted to identify the protrusions that were smaller than the size of the structuring element. Spot-wise details, in terms of the probability of these observed protrusions, were computed by placing a regularly spaced grid on the microarray such that each spot was centered in a grid cell. Based on the probability size distribution functions of these protrusions isolated at each level, we estimated the mean size and texture index for each spot. With these characteristics, we classified the spots in a microarray image into bright and dull categories through pattern spectrum and shape-size complexity measures. These segregated spots can be compared with those of hybridization levels.
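The granulometric steps named in the abstract (multiscale openings, differences of successive openings, a mean size derived from the resulting distribution) can be illustrated with a small sketch. It is not the authors' implementation. It assumes a flat square structuring element of half-width k, non-negative intensities, and zero intensity outside the image. The function names are hypothetical, and the texture index and shape-size complexity measures are not reproduced.

```python
def _window_filter(img, k, pick):
    """Apply `pick` (min or max) over a (2k+1) x (2k+1) window at each pixel.

    Pixels outside the image are treated as 0 (assumption).
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [
                img[r + dr][c + dc] if 0 <= r + dr < h and 0 <= c + dc < w else 0
                for dr in range(-k, k + 1)
                for dc in range(-k, k + 1)
            ]
            out[r][c] = pick(window)
    return out


def opening(img, k):
    """Morphological opening: erosion (min) followed by dilation (max)."""
    return _window_filter(_window_filter(img, k, min), k, max)


def pattern_spectrum(img, max_k):
    """Intensity removed between successive openings at scales k and k+1."""
    volumes = [sum(map(sum, opening(img, k))) for k in range(max_k + 1)]
    return [volumes[k] - volumes[k + 1] for k in range(max_k)]


def mean_size(spectrum):
    """Mean scale index under the normalised pattern spectrum."""
    total = sum(spectrum)
    if total == 0:
        return 0.0
    return sum(k * v for k, v in enumerate(spectrum)) / total
```

Applied to the grid cell around one spot, `pattern_spectrum` yields that spot's size distribution, and dividing each entry by the total gives the probabilities from which `mean_size` is computed.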

15.
16.
MOTIVATION: Classifying genes into clusters depending on their expression profiles is one of the most important analysis techniques for microarray data. Because temporal gene expression profiles are indicative of the dynamic functional properties of genes, the application of clustering analysis to time-course data allows the more precise division of genes into functional classes. Conventional clustering methods treat the sampling data at each time point as data obtained under different experimental conditions without considering the continuity of time-course data between time periods t and t+1. Here, we propose a method designated mathematical model-based clustering (MMBC). RESULTS: The proposed method, designated MMBC, was applied to artificial data and time-course data obtained using Saccharomyces cerevisiae. Our method is able to divide data into clusters more accurately and coherently than conventional clustering methods. Furthermore, MMBC is more tolerant to noise than conventional clustering methods. AVAILABILITY: Software is available upon request. CONTACT: taizo@brs.kyushu-u.ac.jp.

17.
Inspired by the temporal correlation theory of brain functions, researchers have presented a number of neural oscillator networks to solve visual scene segmentation problems. Recently, it has been shown that many biological neural networks are typical small-world networks. In this paper, we propose and investigate two small-world models derived from the well-known LEGION (locally excitatory and globally inhibitory oscillator network) model. To form a small-world network, we add a proper proportion of unidirectional shortcuts (random long-range connections) to the original LEGION model. With local connections and shortcuts, the neural oscillators can not only communicate with neighbors but also exchange phase information with remote partners. Model 1 introduces excitatory shortcuts to enhance the synchronization within an oscillator group representing the same object. Model 2 goes further to replace the global inhibitor with a sparse set of inhibitory shortcuts. Simulation results indicate that the proposed small-world models could achieve synchronization faster than the original LEGION model and are more likely to bind disconnected image regions belonging together. In addition, we argue that these two models are more biologically plausible.

18.
MOTIVATION: Although several recently proposed analysis packages for microarray data can cope with heavy-tailed noise, many applications rely on Gaussian assumptions. Gaussian noise models foster computational efficiency. This comes, however, at the expense of increased sensitivity to outlying observations. Assessing potential insufficiencies of Gaussian noise in microarray data analysis is thus important and of general interest. RESULTS: To this end, we propose assessing different noise models on a large number of microarray experiments. The goodness of fit of noise models is quantified by a hierarchical Bayesian analysis of variance model, which predicts normalized expression values as a mixture of a Gaussian density and t-distributions with adjustable degrees of freedom. Inference of differentially expressed genes is taken into consideration at a second mixing level. To attain far-reaching validity, our investigations cover a wide range of analysis platforms and experimental settings. As the most striking result, we find that in all experiments, irrespective of the chosen preprocessing and normalization method, a heavy-tailed noise model is a better fit than a simple Gaussian. Further investigations revealed that an appropriate choice of noise model has a considerable influence on biological interpretations drawn at the level of inferred genes and gene ontology terms. We conclude from our investigation that neglecting the overdispersed noise in microarray data can mislead scientific discovery and suggest that the convenience of Gaussian-based modelling should be replaced by non-parametric approaches or other methods that account for heavy-tailed noise.

19.
MOTIVATION: Cluster analysis of gene expression profiles has been widely applied to clustering genes for gene function discovery. Many approaches have been proposed. The rationale is that genes with the same biological function or involved in the same biological process are more likely to co-express, hence they are more likely to form a cluster with similar gene expression patterns. However, most existing methods, including model-based clustering, ignore known gene functions in clustering. RESULTS: To take advantage of accumulating gene functional annotations, we propose incorporating known gene functions as prior probabilities in model-based clustering. In contrast to a global mixture model applicable to all the genes in standard model-based clustering, we use a stratified mixture model: one stratum corresponds to the genes of unknown function, while each of the others corresponds to the genes sharing the same biological function or pathway; the genes from the same stratum are assumed to have the same prior probability of coming from a cluster, while those from different strata are allowed to have different prior probabilities of coming from the same cluster. We derive a simple EM algorithm that can be used to fit the stratified model. A simulation study and an application to gene function prediction demonstrate the advantage of our proposal over the standard method. CONTACT: weip@biostat.umn.edu
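The abstract does not spell out its EM algorithm, so the sketch below is only an illustration of the stratified idea and not the authors' derivation. Component densities are shared by all genes, while the mixing proportions depend on each gene's stratum. It assumes univariate Gaussian components and a single expression value per gene, and the names `e_step` and `update_priors` are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate Gaussian (assumed component form)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def e_step(xs, strata, priors, means, sds):
    """Posterior cluster probabilities using stratum-specific priors.

    priors[s][k] is the prior probability that a gene in stratum s
    comes from cluster k; means and sds are shared across strata.
    """
    resp = []
    for x, s in zip(xs, strata):
        weighted = [
            priors[s][k] * normal_pdf(x, means[k], sds[k])
            for k in range(len(means))
        ]
        total = sum(weighted)
        resp.append([w / total for w in weighted])
    return resp


def update_priors(resp, strata, n_strata):
    """M-step for the priors: average responsibilities within each stratum."""
    n_clusters = len(resp[0])
    sums = [[0.0] * n_clusters for _ in range(n_strata)]
    counts = [0] * n_strata
    for r, s in zip(resp, strata):
        counts[s] += 1
        for k in range(n_clusters):
            sums[s][k] += r[k]
    # An empty stratum falls back to uniform priors (a choice made here).
    return [
        [v / counts[s] for v in sums[s]] if counts[s] else [1.0 / n_clusters] * n_clusters
        for s in range(n_strata)
    ]
```

Two genes with the same expression value can therefore receive different posterior cluster probabilities when they belong to different strata, which is how known annotations inform the clustering in this sketch.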

20.