Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
2.

Background

Array-based comparative genomic hybridization (aCGH) is a high-throughput method for measuring genome-wide DNA copy number changes. Current aCGH methods have limited resolution, sensitivity and reproducibility. Microarrays for aCGH are available only for a few organisms and combination of aCGH data with expression data is cumbersome.

Results

We present a novel method of using commercial oligonucleotide expression microarrays for aCGH, enabling DNA copy number measurements and expression profiles to be combined using the same platform. This method yields aCGH data from genomic DNA without complexity reduction at a median resolution of approximately 17,500 base pairs. Due to the well-defined nature of oligonucleotide probes, DNA amplification and deletion can be defined at the level of individual genes and can easily be combined with gene expression data.

Conclusion

A novel method of gene resolution analysis of copy number variation (graCNV) yields high-resolution maps of DNA copy number changes and is applicable to a broad range of organisms for which commercial oligonucleotide expression microarrays are available. Due to the standardization of oligonucleotide microarrays, graCNV results can reliably be compared between laboratories and can easily be combined with gene expression data using the same platform.

3.
Traits such as disease resistance are costly to evaluate and slow to improve using current methods. Analysis of gene expression profiles (e.g. DNA microarrays) has potential for predicting such phenotypes and has been used in an analogous way to classify cancer types in human patients. However, doubts have been raised regarding the use of classification methods with microarray data for this purpose. Here we propose a method using random regression with cross validation, which accounts for the distribution of variation in the trait and utilises different subsets of patients or animals to perform a complete validation of predictive ability. Published breast tumour data were used to test the method. Despite the small dataset (n < 100), the new approach resulted in a moderate but significant correlation between the predicted and actual phenotypes (0.32). Binary classification of the predicted phenotypes yielded similar classification error rates to those found by other authors (35%). Unlike other methods, the new method gave a quantitative estimate of phenotype that could be used to rank animals and select those with extreme phenotypic performance. Use of the method in an optimal way using larger sample sizes, and combining DNA microarrays and other testing platforms, is recommended.
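The validation idea in this abstract (predict each individual from a model fitted without it, then correlate predicted and actual phenotypes) can be sketched roughly as follows. This is only an illustration under stated assumptions: an ordinary single-predictor least-squares line stands in for the authors' random regression model, the data are synthetic, and the function names (`fit_line`, `cross_validated_predictions`) are hypothetical.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (a stand-in, not the paper's model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)  # assumes the predictor is not constant
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def pearson(u, v):
    """Pearson correlation between two equally long sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)

def cross_validated_predictions(xs, ys, k=5, seed=0):
    """Predict every sample from a model fitted on the other k-1 folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [None] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = a + b * xs[i]
    return preds

# Synthetic toy data: one expression summary per individual, noisy phenotype.
rng = random.Random(1)
expression = [rng.gauss(0, 1) for _ in range(60)]
phenotype = [0.5 * x + rng.gauss(0, 1) for x in expression]

preds = cross_validated_predictions(expression, phenotype)
r = pearson(preds, phenotype)  # correlation of predicted vs. actual phenotype
# Quantitative predictions allow ranking, e.g. to pick extreme individuals.
ranking = sorted(range(len(preds)), key=lambda i: preds[i], reverse=True)
```

Because every prediction comes from a model that never saw that individual, the correlation `r` estimates predictive ability rather than goodness of fit.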

4.
DNA microarrays have revolutionized gene expression studies and made large-scale parallel measurement of whole genome expression a feasible technique in model species where genomes are well characterized. Such studies are perfectly suited to unraveling the complex regulation and/or interaction of both genes and proteins likely involved in most physiological processes. Gene expression profiles are currently being used to identify genes underlying a range of physiological responses. Characterization of these genes will help to elucidate the pathways and processes regulating physiological processes. Expanding the use of DNA microarrays to non-model species that have been critical in elucidating certain physiological pathways will be valuable in determining the genes associated with these processes. Approaches that do not require complete genome information have recently been applied to "non-model" organisms. As whole genomes are sequenced for non-model organisms, the application of DNA microarrays to comparative physiology will expand even further. The recent development of protein microarrays will be critical in understanding the regulation of physiological processes not accounted for at the genomic level. Together, DNA and protein microarrays provide the most thorough and efficient method of understanding the molecular basis of physiological processes to date. In turn, classical physiological approaches will be vital in characterizing and verifying the function of the novel genes identified by microarray experiments. Ultimately, DNA and protein microarray expression profiles may be used to predict physiological responses.

5.
MOTIVATION: DNA microarrays allow the simultaneous measurement of thousands of gene expression levels in any given patient sample. Gene expression data have been shown to correlate with survival in several cancers; however, analysis of the data is difficult, since typically at most a few hundred patients are available, resulting in severely underdetermined regression or classification models. Several approaches exist to classify patients into different risk classes; however, relatively little has been done with respect to the prediction of actual survival times. We introduce CASPAR, a novel method to predict true survival times for the individual patient based on microarray measurements. CASPAR is based on a multivariate Cox regression model that is embedded in a Bayesian framework. A hierarchical prior distribution on the regression parameters is specifically designed to deal with the high-dimensionality (large number of genes) and low-sample-size settings that are typical for microarray measurements. This enables CASPAR to automatically select small, highly informative subsets of genes for prediction. RESULTS: Validity of the method is demonstrated on two publicly available datasets on diffuse large B-cell lymphoma (DLBCL) and on adenocarcinoma of the lung. The method successfully identifies long and short survivors, with high sensitivity and specificity. We compare our method with two alternative methods from the literature, demonstrating superior results of our approach. In addition, we show that CASPAR can further refine predictions made using clinical scoring systems such as the International Prognostic Index (IPI) for DLBCL and clinical staging for lung cancer, thus providing an additional tool for the clinician. An analysis of the genes identified confirms previously published results, and furthermore, new candidate genes correlated with survival are identified.

6.
Discriminating patients with a disease on the basis of gene expression data is a crucial problem in the clinical area. An important step in solving this problem is to find a discriminative subset of genes among the thousands of genes on a microarray or DNA chip. Aiming to find informative genes for disease classification on microarrays, we present a gene selection method based on the forward variable (gene) selection method (FSM) and show, using typical public microarray datasets, that our method can extract a small set of genes that are crucial for discriminating different classes with very high accuracy, close to perfect classification.
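Forward selection in general adds one gene at a time, keeping whichever gene most improves a chosen criterion, and stops when nothing improves. The sketch below illustrates that generic greedy loop only; the criterion used here (leave-one-out accuracy of a nearest-centroid classifier), the toy data and the function names are assumptions for illustration and are not taken from the paper.

```python
def nearest_centroid_loo_accuracy(samples, labels, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on the given genes."""
    correct = 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        best_label, best_dist = None, float("inf")
        for label in sorted(set(labels)):
            # Class members excluding the held-out sample i.
            members = [s for j, (s, l) in enumerate(zip(samples, labels))
                       if l == label and j != i]
            if not members:
                continue
            centroid = [sum(m[g] for m in members) / len(members) for g in genes]
            dist = sum((x[g] - c) ** 2 for g, c in zip(genes, centroid))
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y
    return correct / len(samples)

def forward_select(samples, labels, max_genes=3):
    """Greedily add the gene that most improves the criterion; stop when none does."""
    selected, best_score = [], 0.0
    candidates = set(range(len(samples[0])))
    while candidates and len(selected) < max_genes:
        # Ties are broken in favour of the lowest gene index.
        scored = [(nearest_centroid_loo_accuracy(samples, labels, selected + [g]), -g, g)
                  for g in candidates]
        score, _, gene = max(scored)
        if score <= best_score:
            break
        selected.append(gene)
        candidates.remove(gene)
        best_score = score
    return selected, best_score

# Toy data: rows are samples, columns are genes; only gene 1 separates the classes.
samples = [
    [0.1, 1.0, 5.0],
    [0.9, 1.2, 4.0],
    [0.5, 0.9, 6.0],
    [0.2, 3.0, 5.5],
    [0.8, 3.1, 4.5],
    [0.4, 2.9, 5.0],
]
labels = ["normal"] * 3 + ["tumour"] * 3
selected, score = forward_select(samples, labels)
```

On this toy set the loop stops after a single gene, because no second gene can improve on an already perfect leave-one-out accuracy.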

7.
Multivariate measurement of gene expression relationships   Total citations: 5 (self-citations: 0; citations by others: 5)

8.
The measurements of coordinated patterns of protein abundance using antibody microarrays could be used to gain insight into disease biology and to probe the use of combinations of proteins for disease classification. The correct use and interpretation of antibody microarray data requires proper normalization of the data, which has not yet been systematically studied. Therefore we undertook a study to determine the optimal normalization of data from antibody microarray profiling of proteins in human serum specimens. Forty-three serum samples collected from patients with pancreatic cancer and from control subjects were probed in triplicate on microarrays containing 48 different antibodies, using a direct labeling, two-color comparative fluorescence detection format. Seven different normalization methods representing major classes of normalization for antibody microarray data were compared by their effects on reproducibility, accuracy, and trends in the data set. Normalization with ELISA-determined concentrations of IgM resulted in the most accurate, reproducible, and reliable data. The other normalization methods were deficient in at least one of the criteria. Multiparametric classification of the samples based on the combined measurement of seven of the proteins demonstrated the potential for increased classification accuracy compared with the use of individual measurements. This study establishes reliable normalization for antibody microarray data, criteria for assessing normalization performance, and the capability of antibody microarrays for serum-protein profiling and multiparametric sample classification.

9.
With the advent of high-throughput technologies for measuring genome-wide expression profiles, a large number of methods have been proposed for discovering diagnostic markers that can accurately discriminate between different classes of a disease. However, factors such as the small sample size of typical clinical data, the inherent noise in high-throughput measurements, and the heterogeneity across different samples, often make it difficult to find reliable gene markers. To overcome this problem, several studies have proposed the use of pathway-based markers, instead of individual gene markers, for building the classifier. Given a set of known pathways, these methods estimate the activity level of each pathway by summarizing the expression values of its member genes, and use the pathway activities for classification. It has been shown that pathway-based classifiers typically yield more reliable results compared to traditional gene-based classifiers. In this paper, we propose a new classification method based on probabilistic inference of pathway activities. For a given sample, we compute the log-likelihood ratio between different disease phenotypes based on the expression level of each gene. The activity of a given pathway is then inferred by combining the log-likelihood ratios of the constituent genes. We apply the proposed method to the classification of breast cancer metastasis, and show that it achieves higher accuracy and identifies more reproducible pathway markers compared to several existing pathway activity inference methods.
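A rough sketch of the per-gene log-likelihood ratio and its combination into a pathway activity is given below. The abstract does not state the density model or the combination rule, so two assumptions are made and labelled in the code: each gene's expression is modelled as Gaussian within each phenotype, and the per-gene ratios are combined by summation. The gene names, data and function names are hypothetical.

```python
from math import log
from statistics import mean, stdev

def fit_gene_models(samples, labels, genes):
    """Estimate (mean, stdev) per gene and phenotype. ASSUMPTION: Gaussian densities."""
    models = {}
    for gene in genes:
        for phenotype in (0, 1):
            values = [s[gene] for s, l in zip(samples, labels) if l == phenotype]
            models[gene, phenotype] = (mean(values), stdev(values))
    return models

def log_likelihood_ratio(x, model_1, model_0):
    """log[f1(x) / f0(x)] for two Gaussians; the 0.5*log(2*pi) terms cancel."""
    (m1, s1), (m0, s0) = model_1, model_0
    return (log(s0) - log(s1)
            + (x - m0) ** 2 / (2 * s0 ** 2)
            - (x - m1) ** 2 / (2 * s1 ** 2))

def pathway_activity(sample, models, pathway):
    """ASSUMPTION: combine member genes by summing their log-likelihood ratios."""
    return sum(log_likelihood_ratio(sample[g], models[g, 1], models[g, 0])
               for g in pathway)

# Hypothetical training data: gene A is higher and gene B lower in phenotype 1.
train = [
    {"A": 1.0, "B": 2.0}, {"A": 1.2, "B": 2.2}, {"A": 0.8, "B": 1.8},
    {"A": 3.0, "B": 1.0}, {"A": 3.2, "B": 1.2}, {"A": 2.8, "B": 0.8},
]
train_labels = [0, 0, 0, 1, 1, 1]
pathway = ["A", "B"]
models = fit_gene_models(train, train_labels, pathway)

activity_like_1 = pathway_activity({"A": 3.0, "B": 1.0}, models, pathway)
activity_like_0 = pathway_activity({"A": 1.0, "B": 2.0}, models, pathway)
```

A positive activity favours phenotype 1 and a negative one favours phenotype 0, so the resulting activities can be fed to any downstream classifier as pathway-level features.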

10.

Background

The normalization of DNA microarrays allows comparison among samples by adjusting for individual hybridization intensities. The approaches most commonly used are global normalization methods that are based on the expression of all genes on the slide and on the modulation of a small proportion of genes. Alternative approaches must be developed for microarrays where the proportion of modulated genes and their distribution are unknown and they may be biased towards up- or down-modulated trends.

Results

The aim of the work is to study the use of spike-in controls to normalize low-density microarrays. Our test-array was designed to analyze gene modulation in response to hypoxia (a condition of low oxygen tension) in a macrophage cell line. RNA was extracted from controls and cells exposed to hypoxia, mixed with spike RNA, labeled and hybridized to our test-array. We used eight bacterial RNAs as the source of spikes. The test-array contained the oligonucleotides specific for 178 mouse genes and those specific for the eight spikes. We assessed the quality of the spike signals, the reproducibility of the results and, in general, the nature of the variability. The small values of the coefficients of variation revealed high reproducibility of our platform both in replicated spots and in technical replicates. We demonstrated that the spike-in system was suitable for normalizing our platform and determining the threshold for discriminating the hypoxia-modulated genes. We assessed the application of the spike-in normalization method to microarrays in which the distribution of the expression values was symmetric or asymmetric. We found that this system is accurate, reproducible and comparable to other normalization methods when the distribution of the expression values is symmetric. In contrast, we found that the use of the spike-in normalization method is superior and necessary when the distribution of the gene expression is asymmetric and biased towards up-regulated genes.

Conclusion

We demonstrate that normalization based on spike-in controls is a reliable and reproducible method whose major advantage is that it is also applicable to biased platforms where the distribution of up- and down-regulated genes is asymmetric, as may occur in diagnostic chips.
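The contrast drawn here between spike-in and global normalization can be illustrated with a small sketch. It assumes log2 (treated/control) ratios, spikes added in equal amounts to both samples so that their true log ratio is 0, and the median as the summary statistic; the probe names and numbers are invented and the abstract does not specify the authors' exact procedure.

```python
from statistics import median

def spike_in_offset(log_ratios, spike_ids):
    """Median log2 ratio of the spikes; their true ratio is assumed to be 0."""
    return median(log_ratios[s] for s in spike_ids)

def normalize(log_ratios, offset):
    """Subtract a per-array offset from every probe."""
    return {probe: value - offset for probe, value in log_ratios.items()}

# Invented array: a technical bias of +0.5, and three of four genes truly
# up-regulated by 1 (an asymmetric distribution biased towards up-regulation).
raw = {
    "spike1": 0.5, "spike2": 0.4, "spike3": 0.6,
    "geneA": 1.5, "geneB": 1.5, "geneC": 1.5, "geneD": 0.5,
}
spikes = ["spike1", "spike2", "spike3"]
genes = [p for p in raw if p not in spikes]

# Spike-in normalization removes only the technical bias.
by_spikes = normalize(raw, spike_in_offset(raw, spikes))
# Global median normalization assumes most genes are unchanged.
by_global = normalize(raw, median(raw[g] for g in genes))
```

In this toy case the spike-based offset leaves the up-regulated genes at a log2 ratio of about 1 and the unchanged gene at about 0, whereas the global median centres the up-regulated majority on 0 and makes the unchanged gene look down-regulated.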

11.
12.
13.
Kepler TB, Crosby L, Morgan KT. Genome Biology, 2002, 3(7): research0037.1-research0037.12

Background  

With the advent of DNA hybridization microarrays comes the remarkable ability, in principle, to simultaneously monitor the expression levels of thousands of genes. The quantitative comparison of two or more microarrays can reveal, for example, the distinct patterns of gene expression that define different cellular phenotypes or the genes induced in the cellular response to insult or changing environmental conditions. Normalization of the measured intensities is a prerequisite of such comparisons, and indeed, of any statistical analysis, yet insufficient attention has been paid to its systematic study. The most straightforward normalization techniques in use rest on the implicit assumption of linear response between true expression level and output intensity. We find that these assumptions are not generally met, and that these simple methods can be improved.

14.
With the complete sequencing of the human genome, research priorities have shifted from the identification of genes to the elucidation of their function. Methods currently used by scientists to characterize gene function, such as knock-out mice, are based upon loss of protein function and analysis of the resulting phenotypes to infer a potential role for the protein under scrutiny. Until now, these methods have been successful but time-consuming, and only a few genes at a time could be analyzed. Cell microarrays allow the simultaneous transfection of thousands of different nucleic acid molecules, RNA or DNA, into adherent cells. It is then possible to analyze a large palette of resulting phenotypes in clusters of transfected cells. We are currently manufacturing cell microarrays with collections of full-length cDNA cloned in expression vectors (gain-of-function analyses) or siRNA (loss-of-function studies) to unravel the function of genes involved in differentiation and proliferation of human cells. Although there are still some technological difficulties to overcome, the potential for cell microarrays to speed up functional exploration of genomes is very promising.

15.
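Two methods for identifying feature genes based on decision forests   Total citations: 1 (self-citations: 0; citations by others: 1)
DNA chips can yield expression profile data for thousands of genes. Finding feature genes that discriminate disease and filtering out genes unrelated to the disease is the key problem in analyzing gene expression profile data. Exploiting the ensemble advantage of the decision forest method, we propose two decision-forest-based methods for identifying feature genes. The approach first uses a decision forest to filter out, at a given significance level, most genes unrelated to the disease class, and then applies a frequency-counting method and a perturbation method to select more finely among the preliminarily chosen feature genes according to each selected feature's contribution to classification. Finally, a neural network is used as an external classifier to evaluate the selected feature gene subsets. The proposed methods were applied to experimental expression profile data for 2000 genes in 40 colon cancer tissues and 22 normal tissues. The results show that the feature genes selected by both methods have high disease-discriminating power, that both methods can obtain the optimal feature gene subset, and that the decision-forest-based frequency-counting method outperforms the perturbation method.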

16.
17.
Yan X, Zheng T. BMC Genomics, 2008, 9(Suppl 2): S14

Background

Gene expression data extracted from microarray experiments have been used to study the difference between mRNA abundance of genes under different conditions. In one such experiment, thousands of genes are measured simultaneously, which provides a high-dimensional feature space for discriminating between different sample classes. However, most of these dimensions are not informative about the between-class difference, and add noise to the discriminant analysis.

Results

In this paper we propose and study feature selection methods that evaluate the "informativeness" of a set of genes. Two measures of information based on multigene expression profiles are considered for a backward information-driven screening approach for selecting important gene features. By considering multigene expression profiles, we are able to utilize interaction information among these genes. Using a breast cancer dataset, we illustrate our methods and compare them to the performance of existing methods.

Conclusion

We illustrate in this paper that methods considering gene-gene interactions have better classification power in gene expression analysis. In our results, we identify important genes with relatively large p-values from single gene tests. This indicates that these are genes with weak marginal information but strong interaction information, which will be overlooked by strategies that only examine individual genes.
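The point about weak marginal but strong interaction information can be made concrete with a deliberately extreme toy case. The sketch below uses Shannon mutual information on binarized expression values, which is an assumption made for illustration; the abstract does not say which two information measures the authors use, and the data are synthetic.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())

# Synthetic binarized genes; the class label is the XOR of the two genes.
g1 = [0, 0, 1, 1] * 5
g2 = [0, 1, 0, 1] * 5
label = [a ^ b for a, b in zip(g1, g2)]

mi_g1 = mutual_information(g1, label)                     # single-gene view
mi_g2 = mutual_information(g2, label)                     # single-gene view
mi_joint = mutual_information(list(zip(g1, g2)), label)   # two-gene profile
```

Here each gene on its own carries no information about the class, so a single-gene test would discard both, while the pair taken together determines the class completely.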

18.
Minimum redundancy feature selection from microarray gene expression data   Total citations: 7 (self-citations: 0; citations by others: 7)
Selecting a small subset out of the thousands of genes in microarray data is important for accurate classification of phenotypes. Widely used methods typically rank genes according to their differential expression among phenotypes and pick the top-ranked genes. We observe that feature sets so obtained have certain redundancy and study methods to minimize it. We propose a minimum redundancy - maximum relevance (MRMR) feature selection framework. Genes selected via MRMR provide a more balanced coverage of the space and capture broader characteristics of phenotypes. They lead to significantly improved class predictions in extensive experiments on 6 gene expression data sets: NCI, Lymphoma, Lung, Child Leukemia, Leukemia, and Colon. Improvements are observed consistently among 4 classification methods: Naive Bayes, Linear discriminant analysis, Logistic regression, and Support vector machines. SUPPLEMENTARY: The top 60 MRMR genes for each of the datasets are listed in http://crd.lbl.gov/~cding/MRMR/. More information related to MRMR methods can be found at http://www.hpeng.net/.
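A greedy selection that trades relevance against redundancy can be sketched as follows. The scoring rule used here (mutual information with the class minus mean mutual information with the genes already chosen) is one common way of writing a minimum redundancy - maximum relevance criterion and is assumed for illustration; the abstract does not give the exact formula. The discretized toy data and function names are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())

def mrmr_select(genes, labels, k):
    """Greedily pick k genes, scoring relevance minus mean redundancy (assumed form)."""
    relevance = [mutual_information(g, labels) for g in genes]
    selected = []
    candidates = list(range(len(genes)))

    def score(g):
        if not selected:
            return relevance[g]
        redundancy = sum(mutual_information(genes[g], genes[s])
                         for s in selected) / len(selected)
        return relevance[g] - redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)  # ties go to the lowest gene index
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy discretized genes: gene 1 duplicates gene 0; gene 2 is weaker but different.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
genes = [
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1],
]

by_mrmr = mrmr_select(genes, labels, k=2)
# Plain relevance ranking, for comparison.
by_rank = sorted(range(len(genes)),
                 key=lambda g: -mutual_information(genes[g], labels))[:2]
```

On this toy set the relevance ranking keeps both copies of the same signal, while the redundancy penalty makes the greedy selection take the weaker but complementary gene as its second pick.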

19.
20.
MOTIVATION: DNA microarrays have revolutionized biological research, but their reliability and accuracy have not been extensively evaluated. Thorough testing of microarrays through comparison to dissimilar gene expression methods is necessary in order to determine their accuracy. RESULTS: We have systematically compared three global gene expression methods on all available histologically normal samples from five human organ types. The data included 25 Affymetrix high-density oligonucleotide array experiments, 23 expressed sequence tag based expression (EBE) experiments and 5 SAGE experiments. The reported gene-by-gene expression patterns showed a wide range of correlations between pairs of methods. This level of agreement was sufficient for accurate clustering of datasets from the same tissue and dissimilar methods, but highlights the need for thorough validation of individual gene expression measurements by alternate, non-global methods. Furthermore, analyses of mRNA abundance distributions indicate limitations in the EBE and SAGE methods at both high- and low-expression levels.
