Similar articles
 Found 20 similar articles (search time: 0 ms)
1.
Siegmund KD 《Human genetics》2011,129(6):585-595
Following the rapid development and adoption of DNA methylation microarray assays, we are now experiencing a growth in the number of statistical tools to analyze the resulting large-scale data sets. As is the case for other microarray applications, biases caused by technical issues are of concern. Some of these issues are old (e.g., two-color dye bias and probe- and array-specific effects), while others are new (e.g., fragment length bias and bisulfite conversion efficiency). Here, I highlight characteristics of DNA methylation that suggest standard statistical tools developed for other data types may not be directly suitable. I then describe the microarray technologies most commonly in use, along with the methods used for preprocessing and obtaining a summary measure. I finish with a section describing downstream analyses of the data, focusing on methods that model percentage DNA methylation as the outcome, and methods for integrating DNA methylation with gene expression or genotype data.

2.
Regression approaches for microarray data analysis.   Total citations: 6, self-citations: 0, citations by others: 6
A variety of new procedures have been devised to handle the two-sample comparison (e.g., tumor versus normal tissue) of gene expression values as measured with microarrays. Such new methods are required in part because of some defining characteristics of microarray-based studies: (i) the very large number of genes contributing expression measures which far exceeds the number of samples (observations) available and (ii) the fact that by virtue of pathway/network relationships, the gene expression measures tend to be highly correlated. These concerns are exacerbated in the regression setting, where the objective is to relate gene expression, simultaneously for multiple genes, to some external outcome or phenotype. Correspondingly, several methods have been recently proposed for addressing these issues. We briefly critique some of these methods prior to a detailed evaluation of gene harvesting. This reveals that gene harvesting, without additional constraints, can yield artifactual solutions. Results obtained employing such constraints motivate the use of regularized regression procedures such as the lasso, least angle regression, and support vector machines. Model selection and solution multiplicity issues are also discussed. The methods are evaluated using a microarray-based study of cardiomyopathy in transgenic mice.

3.
4.
A robust analysis of comparative genomic microarray data is critical for meaningful genomic comparison studies. In this paper, we compare our method (implemented in a new software tool, GENCOM, freely available at ) with three commonly used analysis methods: GACK (freely available at ), an empirical cut-off value of twofold difference between the fluorescence intensities after LOWESS normalization or after AVERAGE normalization in which the fluorescence intensity is divided by the average fluorescence intensity of the entire data set. Each method was tested using data sets from real experiments with prior knowledge of conserved and divergent genes. GENCOM and GACK were superior when a high proportion of genes were divergent. GENCOM was the most suitable method for the data set in which the relationship between the fluorescence intensities was not linear. GENCOM has proved robust in an analysis of all the data sets tested.
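The AVERAGE normalization and twofold cut-off named in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, all intensities are assumed positive, and the direction of the ratio (reference signal divided by test signal) is a guess, since the abstract does not say which way the twofold difference is measured.

```python
def average_normalize(intensities):
    """Divide each intensity by the mean intensity of the entire data set."""
    mean = sum(intensities) / len(intensities)
    return [x / mean for x in intensities]


def twofold_divergent(reference, test, fold=2.0):
    """Flag genes whose normalized reference/test ratio is at least `fold`.

    Assumption (not from the abstract): a divergent gene hybridizes weakly
    in the test channel, so its reference/test ratio is high.
    """
    ref_n = average_normalize(reference)
    test_n = average_normalize(test)
    return [r / t >= fold for r, t in zip(ref_n, test_n)]


# Hypothetical intensities for four genes; only the last one drops sharply.
flags = twofold_divergent([100.0, 200.0, 300.0, 400.0],
                          [100.0, 200.0, 300.0, 40.0])
```

A fixed cut-off like this takes no account of the shape of the intensity distribution, which may be why the abstract treats it as a baseline for comparison.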

5.

Background  

The data from DNA microarrays are increasingly being used in order to understand effects of different conditions, exposures or diseases on the modulation of the expression of various genes in a biological system. This knowledge is then further used in order to generate molecular mechanistic hypotheses for an organism when it is exposed to different conditions. Several different methods have been proposed to analyze these data under different distributional assumptions on gene expression. However, the empirical validation of these assumptions is lacking.

6.

Background  

When DNA microarray data are used for gene clustering, genotype/phenotype correlation studies, or tissue classification, the signal intensities are usually transformed and normalized in several steps in order to improve comparability and signal/noise ratio. These steps may include subtraction of an estimated background signal, subtraction of the reference signal, smoothing (to account for nonlinear measurement effects), and more. Different authors use different approaches, and it is generally not clear to users which method they should prefer.

7.
An enormous amount of microarray data has been collected and accumulated in public repositories. Although some of the depositions include raw and processed data, significant parts of them include processed data only. If we need to combine multiple datasets for specific purposes, the data should be adjusted prior to use to remove bias between the datasets. We focused on a GeneChip platform and a pre-processing method, RMA, and examined simple quantile correction as the post-processing method for integration. Integration of the data pre-processed by RMA was evaluated using artificial spike-in datasets and real microarray datasets of atopic dermatitis and lung cancer. Studies using the spike-in datasets show that the quantile correction for data integration reduces the data quality to some extent, but to an acceptable level. Studies using the real datasets show that the quantile correction significantly reduces the bias. These results show that the quantile correction is useful for integration of multiple datasets processed by RMA, and encourage effective use of public microarray data.
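The abstract does not spell out its quantile correction procedure. As a rough illustration, the sketch below implements standard quantile normalization, which may differ from the authors' exact method: every array is forced onto a shared reference distribution, taken here as the mean of the sorted values across arrays. The function name is hypothetical and ties are broken by sort order rather than averaged.

```python
def quantile_normalize(samples):
    """Map every sample onto the mean sorted distribution.

    samples: list of equal-length lists, one list of expression values per array.
    Returns new lists in the original gene order.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must contain the same number of genes")

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [sum(s[k] for s in sorted_samples) / len(samples)
                 for k in range(n)]

    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # gene indices by rank
        out = [0.0] * n
        for rank, gene in enumerate(order):
            out[gene] = reference[rank]
        normalized.append(out)
    return normalized


# Two hypothetical arrays measuring the same three genes.
result = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After this step all arrays contain the same set of values, so between-dataset differences in overall distribution are removed, while the within-array ranking of genes is preserved.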

8.
GenePublisher, a system for automatic analysis of data from DNA microarray experiments, has been implemented with a web interface at http://www.cbs.dtu.dk/services/GenePublisher. Raw data are uploaded to the server together with a specification of the data. The server performs normalization, statistical analysis and visualization of the data. The results are run against databases of signal transduction pathways, metabolic pathways and promoter sequences in order to extract more information. The results of the entire analysis are summarized in report form and returned to the user.

9.
This article focuses on clustering techniques for the analysis of microarray data and discusses contributions and applications for the implementation of intelligent diagnostic systems and therapy design studies. Approaches to validating and visualising expression clustering results and software and other relevant resources to support clustering-based analyses are reviewed. Finally, this paper addresses current limitations and problems that need to be investigated for the development of an advanced generation of pattern discovery tools.

10.
Michel W  Mai T  Naiser T  Ott A 《Biophysical journal》2007,92(3):999-1004
We investigate the kinetics of DNA hybridization reactions on glass substrates, where one 22-mer strand (bound-DNA) is immobilized via a phenylene-diisothiocyanate linker molecule on the substrate and the dye-labeled (Cy3) complementary strand (free-DNA) is in solution in a reaction chamber. We use total internal reflection fluorescence for surface detection of hybridization. As a new feature, we perform a simultaneous real-time measurement of the change of free-DNA concentration in bulk, parallel to the total internal reflection fluorescence measurement. We observe that the free-DNA concentration decreases considerably during hybridization. We show how the standard Langmuir kinetics needs to be extended to take into account the change in bulk concentration and explain our experimental results. Connecting both measurements, we can estimate the surface density of accessible, immobilized bound-DNA. We discuss the implications with respect to DNA microarray detection.

11.
Qian J  Kluger Y  Yu H  Gerstein M 《BioTechniques》2003,35(1):42-4, 46, 48

12.

Background  

Oligonucleotide arrays have become one of the most widely used high-throughput tools in biology. Due to their sensitivity to experimental conditions, normalization is a crucial step when comparing measurements from these arrays. Normalization is, however, far from a solved problem. Frequently, we encounter datasets with significant technical effects that currently available methods are not able to correct.

13.
DNA microarrays represent the latest advance in molecular technology. In combination with bioinformatics, they provide unparalleled opportunities for simultaneous detection of thousands of genes or target DNA sequences and offer tremendous potential for studying food-borne microorganisms. This review provides an up-to-date look at the application of DNA microarray technology to detect food-borne pathogenic bacteria, viruses, and parasites. In addition, it covers the advantages of using microarray technology to further characterize microorganisms by providing information for specific identification of isolates, to understand the pathogenesis based on the presence of virulence genes, and to indicate how new pathogenic strains evolved epidemiologically and phylogenetically.

14.
Analysis of recursive gene selection approaches from microarray data   Total citations: 1, self-citations: 0, citations by others: 1
MOTIVATION: Finding a small subset of the most predictive genes from microarray data for disease prediction is a challenging problem. Support vector machines (SVMs) have been found to be successful with a recursive procedure in selecting important genes for cancer prediction. However, it is not well understood how much of the success depends on the choice of the specific classifier and how much on the recursive procedure. We answer this question by examining multiple classifiers [SVM, ridge regression (RR) and Rocchio] with feature selection in recursive and non-recursive settings on three DNA microarray datasets (ALL-AML Leukemia data, Breast Cancer data and GCM data). RESULTS: We found recursive RR most effective. On the AML-ALL dataset, it achieved zero error rate on the test set using only three genes (selected from over 7000), which is more encouraging than the best published result (zero error rate using 8 genes by recursive SVM). On the Breast Cancer dataset and the two largest categories of the GCM dataset, the results achieved by recursive RR are also very encouraging. A further analysis of the experimental results shows that different classifiers penalize redundant features to different extents and that this property plays an important role in the recursive feature selection process. The RR classifier tends to penalize redundant features to a much larger extent than the SVM does. This may be the reason why recursive RR performs better in selecting genes.

15.
16.
Quantitative information about the nucleic acids hybridization reaction on microarrays is fundamental to designing optimized assays for molecular diagnostics. This study presents the kinetic, equilibrium, and thermodynamic analyses of DNA hybridization in a microarray system designed for fast molecular testing of pathogenic bacteria. Our microarray setup uses a porous, nylon membrane for probe immobilization and flowthrough incubation. The Langmuir model was used to determine the reaction rate constants of hybridization with antisense targets specific to Staphylococcus epidermidis and Staphylococcus aureus strains. The kinetic analysis revealed a sequence-dependent reaction rate, with association rate constants on the order of 10⁵ M⁻¹ s⁻¹ and dissociation rate constants of 10⁻⁴ s⁻¹. We found that by increasing the probe surface density from 10¹¹ to 10¹² molecules/cm², the hybridization rate and efficiency are suppressed while the melting temperature of the DNA duplex increases. The maximum fraction of hybridized capture probes at equilibrium did not exceed 50% for hybridization with antisense sequences and was below 6% for hybridization with long targets obtained from PCR. The van’t Hoff analysis of the temperature denaturation data showed that the DNA hybridization in our porous, flowthrough microarray is thermodynamically less favorable than the hybridization of the same sequences in solution.
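To make the Langmuir model mentioned in this abstract concrete, the sketch below evaluates the textbook closed-form solution of dθ/dt = k_on·C·(1 − θ) − k_off·θ for the fraction θ of hybridized probes, starting from an empty surface. The rate constants use the orders of magnitude quoted in the abstract. The target concentration is a hypothetical value chosen only for illustration, and the bulk concentration is assumed constant, which the study itself may not assume.

```python
import math


def equilibrium_fraction(k_on, k_off, conc):
    """Equilibrium fraction of hybridized probes in the Langmuir model."""
    return k_on * conc / (k_on * conc + k_off)


def langmuir_theta(t, k_on, k_off, conc):
    """Hybridized fraction at time t (seconds), with theta(0) = 0."""
    rate = k_on * conc + k_off  # observed relaxation rate, 1/s
    return equilibrium_fraction(k_on, k_off, conc) * (1.0 - math.exp(-rate * t))


K_ON = 1e5    # M^-1 s^-1, order of magnitude quoted in the abstract
K_OFF = 1e-4  # s^-1, order of magnitude quoted in the abstract
CONC = 1e-9   # M, hypothetical target concentration (not from the abstract)

theta_eq = equilibrium_fraction(K_ON, K_OFF, CONC)
theta_1h = langmuir_theta(3600.0, K_ON, K_OFF, CONC)
```

In this model the approach to equilibrium is governed by the single rate k_on·C + k_off, so with constants of this size and a nanomolar target the surface takes hours, not minutes, to equilibrate.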

17.

Background  

Numerous microarray analysis programs have been created through the efforts of Open Source software development projects. Providing browser-based interfaces that allow these programs to be executed over the Internet enhances the applicability and utility of these analytic software tools.

18.
With the emergence of genome-wide colorimetric in situ hybridization (ISH) data sets such as the Allen Brain Atlas, it is important to understand the relationship between this gene expression modality and those derived from more quantitative based technologies. This study introduces a novel method for standardized relative quantification of colorimetric ISH signal that enables a large-scale cross-platform expression level comparison of ISH with two publicly available microarray brain data sources.

19.
A mesophilic toluene-degrading consortium (TDC) and an ethylbenzene-degrading consortium (EDC) were established under sulfate-reducing conditions. These consortia were first characterized by denaturing gradient gel electrophoresis (DGGE) fingerprinting of PCR-amplified 16S rRNA gene fragments, followed by sequencing. The sequences of the major bands (T-1 and E-2) belonging to TDC and EDC, respectively, were affiliated with the family Desulfobacteriaceae. Another major band from EDC (E-1) was related to an uncultured non-sulfate-reducing soil bacterium. Oligonucleotide probes specific for the 16S rRNAs of target organisms corresponding to T-1, E-1, and E-2 were designed, and hybridization conditions were optimized for two analytical formats, membrane and DNA microarray hybridization. Both formats were used to characterize the TDC and EDC, and the results of both were consistent with DGGE analysis. In order to assess the utility of the microarray format for analysis of environmental samples, oil-contaminated sediments from the coast of Kuwait were analyzed. The DNA microarray successfully detected bacterial nucleic acids from these samples, but probes targeting specific groups of sulfate-reducing bacteria did not give positive signals. The results of this study demonstrate the limitations and the potential utility of DNA microarrays for microbial community analysis.

20.
MOTIVATION: One problem with discriminant analysis of DNA microarray data is that each sample is represented by quite a large number of genes, and many of them are irrelevant, insignificant or redundant to the discriminant problem at hand. Methods for selecting important genes are, therefore, of much significance in microarray data analysis. In the present study, a new criterion, called the LS Bound measure, is proposed to address the gene selection problem. The LS Bound measure is derived from the leave-one-out procedure of LS-SVMs (least squares support vector machines), and as the upper bound for leave-one-out classification results it reflects to some extent the generalization performance of gene subsets. RESULTS: We applied this LS Bound measure for gene selection on two benchmark microarray datasets: colon cancer and leukemia. We also compared the LS Bound measure with other evaluation criteria, including the well-known Fisher's ratio and Mahalanobis class separability measure, and other published gene selection algorithms, including Weighting factor and SVM Recursive Feature Elimination. The strength of the LS Bound measure is that it provides gene subsets leading to more accurate classification results than the filter method while its computational complexity is at the level of the filter method. AVAILABILITY: A companion website can be accessed at http://www.ntu.edu.sg/home5/pg02776030/lsbound/. The website contains: (1) the source code of the gene selection algorithm; (2) the complete set of tables and figures regarding the experimental study; (3) proof of the inequality (9). CONTACT: ekzmao@ntu.edu.sg.
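Of the criteria this abstract compares against, Fisher's ratio is simple enough to sketch. The code below is not the LS Bound measure and is not taken from the companion website. It uses one common per-gene form of Fisher's ratio, (mean_A − mean_B)² / (var_A + var_B), and assumes each class has at least two samples with non-zero combined variance. Function names and data are hypothetical.

```python
from statistics import mean, variance


def fisher_ratio(class_a, class_b):
    """Fisher's ratio for one gene: squared mean gap over summed variances."""
    gap = mean(class_a) - mean(class_b)
    return gap ** 2 / (variance(class_a) + variance(class_b))


def rank_genes(expr_a, expr_b):
    """Return gene indices sorted from most to least discriminative.

    expr_a[g] and expr_b[g] hold the expression values of gene g
    in the samples of class A and class B respectively.
    """
    scores = [fisher_ratio(a, b) for a, b in zip(expr_a, expr_b)]
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)


# Hypothetical data: gene 0 separates the classes, gene 1 does not.
expr_a = [[1.0, 2.0, 3.0], [1.0, 5.0, 9.0]]
expr_b = [[7.0, 8.0, 9.0], [2.0, 5.0, 8.0]]
ranking = rank_genes(expr_a, expr_b)
```

A filter criterion like this scores each gene independently, which keeps it cheap but means it cannot detect redundancy between genes.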
