Found 20 similar articles (search time: 31 ms)
1.
Background
Clustering is one of the most commonly used methods for discovering hidden structure in microarray gene expression data. Most current methods for clustering samples are based on distance metrics utilizing all genes. This has the effect of obscuring clustering in samples that may be evident only when looking at a subset of genes, because noise from irrelevant genes dominates the signal from the relevant genes in the distance calculation.
2.
3.
Background
A routine goal in the analysis of microarray data is to identify genes with expression levels that correlate with known classes of experiments. In a growing number of array data sets, it has been shown that there is an over-abundance of genes that discriminate between known classes as compared to expectations for random classes. Therefore, one can search for novel classes in array data by looking for partitions of experiments for which there is an over-abundance of discriminatory genes. We have previously used such an approach in a breast cancer study.
4.
Osama Mahmoud Andrew Harrison Aris Perperoglou Asma Gul Zardad Khan Metodi V Metodiev Berthold Lausen 《BMC bioinformatics》2014,15(1)
Background
Microarray technology, as well as other functional genomics experiments, allows simultaneous measurements of thousands of genes within each sample. Both the prediction accuracy and interpretability of a classifier could be enhanced by performing the classification based only on selected discriminative genes. We propose a statistical method for selecting genes based on overlapping analysis of expression data across classes. This method results in a novel measure, called proportional overlapping score (POS), of a feature’s relevance to a classification task.
Results
We apply POS, along with four widely used gene selection methods, to several benchmark gene expression datasets. The experimental results of classification error rates computed using the Random Forest, k Nearest Neighbor and Support Vector Machine classifiers show that POS achieves a better performance.
Conclusions
A novel gene selection method, POS, is proposed. POS analyzes the expression overlap across classes, taking into account the proportions of overlapping samples. It robustly defines a mask for each gene that allows it to minimize the effect of expression outliers. The constructed masks, along with a novel gene score, are exploited to produce the selected subset of genes.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2105-15-274) contains supplementary material, which is available to authorized users.
5.
Background
Microarray studies in cancer compare expression levels between two or more sample groups on thousands of genes. Data analysis follows a population-level approach (e.g., comparison of sample means) to identify differentially expressed genes. This leads to the discovery of 'population-level' markers, i.e., genes with the expression patterns A > B and B > A. We introduce the PPST test that identifies genes where a significantly large subset of cases exhibit expression values beyond upper and lower thresholds observed in the control samples.
6.
Selection of reference genes for gene expression studies in human neutrophils by real-time PCR Cited by: 1 (self-citations: 0; citations by others: 1)
Background
Reference genes, which are often referred to as housekeeping genes, are frequently used to normalize mRNA levels between different samples. However, the expression level of these genes may vary among tissues or cells, and may change under certain circumstances. Thus, the selection of reference gene(s) is critical for gene expression studies. For this purpose, 10 commonly used housekeeping genes were investigated in isolated human neutrophils.
7.
8.
Background
Proximal chromosome 15q is implicated in neurodevelopmental disorders including Prader-Willi and Angelman syndromes, autistic disorder and developmental abnormalities resulting from chromosomal deletions or duplications. A subset of genes in this region are subject to genomic imprinting, the expression of the gene from only one parental allele.
9.
Selection of housekeeping genes for gene expression studies in human reticulocytes using real-time PCR Cited by: 6 (self-citations: 0; citations by others: 6)
Background
Control genes, which are often referred to as housekeeping genes, are frequently used to normalise mRNA levels between different samples. However, the expression level of these genes may vary among tissues or cells and may change under certain circumstances. Thus, the selection of housekeeping genes is critical for gene expression studies. To address this issue, 7 candidate housekeeping genes including several commonly used ones were investigated in isolated human reticulocytes. For this, a simple ΔCt approach was employed by comparing relative expression of 'pairs of genes' within each sample. On this basis, stability of the candidate housekeeping genes was ranked according to repeatability of the gene expression differences among 31 samples.
10.
Joseph C Roden Brandon W King Diane Trout Ali Mortazavi Barbara J Wold Christopher E Hart 《BMC bioinformatics》2006,7(1):194-22
Background
There are many methods for analyzing microarray data that group together genes having similar patterns of expression over all conditions tested. However, in many instances the biologically important goal is to identify relatively small sets of genes that share coherent expression across only some conditions, rather than all or most conditions as required in traditional clustering; e.g. genes that are highly up-regulated and/or down-regulated similarly across only a subset of conditions. Equally important is the need to learn which conditions are the decisive ones in forming such gene sets of interest, and how they relate to diverse conditional covariates, such as disease diagnosis or prognosis.
11.
Background
Classification studies using gene expression datasets are usually based on small numbers of samples and tens of thousands of genes. The selection of those genes that are important for distinguishing the different sample classes being compared poses a challenging problem in high dimensional data analysis. We describe a new procedure for selecting significant genes as recursive cluster elimination (RCE) rather than recursive feature elimination (RFE). We have tested this algorithm on six datasets and compared its performance with that of two related classification procedures with RFE.
12.
Peter A DiMaggioJr Scott R McAllister Christodoulos A Floudas Xiao-Jiang Feng Joshua D Rabinowitz Herschel A Rabitz 《BMC bioinformatics》2008,9(1):458
Background
The analysis of large-scale data sets via clustering techniques is utilized in a number of applications. Biclustering in particular has emerged as an important problem in the analysis of gene expression data since genes may only jointly respond over a subset of conditions. Biclustering algorithms also have important applications in sample classification where, for instance, tissue samples can be classified as cancerous or normal. Many of the methods for biclustering, and clustering algorithms in general, utilize simplified models or heuristic strategies for identifying the "best" grouping of elements according to some metric and cluster definition and thus result in suboptimal clusters.
13.
14.
Ramdas L Cogdell DE Jia JY Taylor EE Dunmire VR Hu L Hamilton SR Zhang W 《BMC genomics》2004,5(1):35-9
Background
DNA microarrays using long oligonucleotide probes are widely used to evaluate gene expression in biological samples. These oligonucleotides are pre-synthesized and sequence-optimized to represent specific genes with minimal cross-hybridization to homologous genes. Probe length and concentration are critical factors for signal sensitivity, particularly when genes with various expression levels are being tested. We evaluated the effects of oligonucleotide probe length and concentration on signal intensity measurements of the expression levels of genes in a target sample.
15.
16.
Allison M Cotton Bing Ge Nicholas Light Veronique Adoue Tomi Pastinen Carolyn J Brown 《Genome biology》2013,14(11):R122
Background
X-chromosome inactivation (XCI) results in the silencing of most genes on one X chromosome, yielding mono-allelic expression in individual cells. However, random XCI results in expression of both alleles in most females. Allelic imbalances have been used genome-wide to detect mono-allelically expressed genes. Analysis of X-linked allelic imbalance in females with skewed XCI offers the opportunity to identify genes that escape XCI with bi-allelic expression in contrast to those with mono-allelic expression and which are therefore subject to XCI.
Results
We determine XCI status for 409 genes, all of which have at least five informative females in our dataset. The majority of genes are subject to XCI and genes that escape from XCI show a continuum of expression from the inactive X. Inactive X expression corresponds to differences in the level of histone modification detected by allelic imbalance after chromatin immunoprecipitation. Differences in XCI between populations and between cell lines derived from different tissues are observed.
Conclusions
We demonstrate that allelic imbalance can be used to determine an inactivation status for X-linked genes, even without completely non-random XCI. There is a range of expression from the inactive X. Genes escaping XCI, including those that do so in only a subset of females, cluster together, demonstrating that XCI and location on the X chromosome are related. In addition to revealing mechanisms involved in cis-gene regulation, determining which genes escape XCI can expand our understanding of the contributions of X-linked genes to sexual dimorphism.
17.
18.
Background
It is widely accepted that orthologous genes between species are conserved at the sequence level and perform similar functions in different organisms. However, the level of conservation of gene expression patterns of the orthologous genes in different species has been unclear. To address the issue, we compared gene expression of orthologous genes based on 2,557 human and 1,267 mouse samples with high quality gene expression data, selected from experiments stored in the public microarray repository ArrayExpress.
19.
Background
A central task in contemporary biosciences is the identification of biological processes showing response in genome-wide differential gene expression experiments. Two types of analysis are common. Either one generates an ordered list based on the differential expression values of the probed genes and examines the tail areas of the list for over-representation of various functional classes, or one monitors the average differential expression level of genes belonging to a given functional class. So far these two types of method have not been combined.
20.
Eva Freyhult Mattias Landfors Jenny Önskog Torgeir R Hvidsten Patrik Rydén 《BMC bioinformatics》2010,11(1):503