Similar articles
 Found 20 similar articles; search took 15 ms
1.
Sîrbu A  Ruskin HJ  Crane M 《PloS one》2010,5(11):e13822

Background

Inferring Gene Regulatory Networks (GRNs) from time course microarray data suffers from the dimensionality problem created by the short length of available time series compared to the large number of genes in the network. To overcome this, data integration from diverse sources is mandatory. Microarray data from different sources and platforms are publicly available, but integration is not straightforward, due to platform and experimental differences.

Methods

We analyse here different normalisation approaches for microarray data integration, in the context of reverse engineering of GRN quantitative models. We introduce two preprocessing approaches based on existing normalisation techniques and provide a comprehensive comparison of normalised datasets.

Conclusions

The results identify a method combining Loess normalisation with iterative K-means as the best for time series normalisation in this problem.

2.
3.
4.
5.
6.
Finding edging genes from microarray data   Total citations: 1, self-citations: 0, citations by others: 1
MOTIVATION: A set of genes and their expression levels are used to classify disease and normal tissues. Because a microarray measures a massive number of genes, there are a large number of edges that divide the different classes of genes in microarray space. The edging genes (EGs) can be co-regulated genes; they can also lie on the same pathway or be deregulated by the same non-coding genes, such as siRNA or miRNA. Every gene in a set of EGs is vital for identifying a tissue's class. A change in the expression of one EG may cause a tissue to switch from normal to disease and vice versa, so finding EGs is of biological importance. In this work, we propose an algorithm to find these EGs effectively. RESULT: We tested our algorithm on five microarray datasets. The results are compared with those of the border-based algorithm, which was used to find gene groups and subsequently divide different classes of tissues. Our algorithm finds a significantly larger number of EGs than the border-based algorithm does. Because our algorithm prunes irrelevant patterns at earlier stages, its time and space complexities are much lower than those of the border-based algorithm. AVAILABILITY: The proposed algorithm is implemented in C++ on the Linux platform. The EGs in the five microarray datasets have been calculated. The preprocessed datasets and the discovered EGs are available at http://www3.it.deakin.edu.au/~phoebe/microarray.html.

7.

Background  

Gene expression analysis has many applications in cancer diagnosis, prognosis and therapeutic care. Relative quantification is the most widely adopted approach whereby quantification of gene expression is normalised relative to an endogenously expressed control (EC) gene. Central to the reliable determination of gene expression is the choice of control gene. The purpose of this study was to evaluate a panel of candidate EC genes from which to identify the most stably expressed gene(s) to normalise RQ-PCR data derived from primary colorectal cancer tissue.

8.
9.

Background

Transfection of cells with gene-specific, single-stranded oligonucleotides can induce the targeted exchange of one or two nucleotides in the targeted gene. To characterize the features of the DNA-repair mechanisms involved, we examined the maximal distance for the simultaneous exchange of two nucleotides by a single-stranded oligonucleotide. The chosen experimental system comprised the correction of an hprt point mutation in a hamster cell line, the generation of an additional nucleotide exchange at a variable distance from the first exchange position, and the investigation of the rate of simultaneous nucleotide exchanges.

Results

The smaller the distance between the two exchange positions, the higher was the probability of a simultaneous exchange. The detected simultaneous nucleotide exchanges were found to cluster in a region of about fourteen nucleotides upstream and downstream from the first exchange position.

Conclusion

We suggest that the mechanism involved in the repair of the targeted DNA strand utilizes only a short sequence of the single-stranded oligonucleotide, which may be physically incorporated into the DNA or be used as a matrix for a repair process.

10.
11.
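Based on the periodicity and peak characteristics of periodically expressed genes, a method is proposed that divides microarray time-series expression data into several gene expression cycles and evaluates the peak characteristics within each cycle to identify periodically expressed genes, which effectively reduces the interference of noise in microarray experiments. Three widely used time-series expression datasets and a reliable set of periodically expressed genes were selected to test the method, and its performance was compared with that of three typical methods for identifying periodically expressed genes. The method can effectively identify periodically expressed genes from a variety of microarray time-series expression data.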
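The idea of splitting a time series into cycles and checking the peak within each cycle can be illustrated with a minimal sketch. The abstract does not give the scoring rule, so the function name `peak_consistency`, the fixed `period` (in time points), and the "fraction of cycles sharing the most common peak position" score are all assumptions made for illustration, not the authors' method.

```python
def peak_consistency(series, period):
    """Score how consistently the peak falls at the same position per cycle.

    Hypothetical scoring rule (an assumption, not taken from the paper):
    split `series` into complete cycles of `period` points, locate the
    maximum in each cycle, and return the fraction of cycles whose peak
    position equals the most common one. Trailing incomplete cycles are
    ignored.
    """
    cycles = [series[i:i + period]
              for i in range(0, len(series) - period + 1, period)]
    if len(cycles) < 2:
        raise ValueError("need at least two complete cycles")
    # Position of the maximum within each cycle.
    peaks = [cycle.index(max(cycle)) for cycle in cycles]
    # Share of cycles that agree with the most frequent peak position.
    most_common = max(set(peaks), key=peaks.count)
    return peaks.count(most_common) / len(peaks)
```

A gene whose peak recurs at the same offset in every cycle scores 1.0, while a gene whose peak wanders between cycles scores lower; a real method would also need to handle noise, unknown period length, and peaks that straddle a cycle boundary.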

12.

Background  

One frequent application of microarray experiments is monitoring gene activity in a cell during the cell cycle or cell division. A new challenge in analyzing such experiments is to identify genes that are periodically expressed during the cell cycle with statistical significance. The challenge arises from the large number of genes measured simultaneously, the moderate to small number of measurements per gene taken at different time points, and the high levels of non-normal random noise inherent in the data.

13.
Huang HL  Lee CC  Ho SY 《Bio Systems》2007,90(1):78-86
It is essential to select a minimal number of relevant genes from microarray data while maximizing classification accuracy for the development of inexpensive diagnostic tests. However, simultaneously optimizing gene selection and classification accuracy is intractable because it is a large parameter optimization problem. We propose an efficient evolutionary approach to gene selection from microarray data that can be combined with the optimal design of various multiclass classifiers. The proposed method (named GeneSelect) consists of three closely cooperating parts: an efficient encoding scheme for candidate solutions, a generalized fitness function, and an intelligent genetic algorithm (IGA). An existing hybrid approach based on a genetic algorithm and maximum likelihood classification (GA/MLHD) was proposed to select a small number of relevant genes for accurate classification of samples. To evaluate the performance of GeneSelect, its gene selection is combined with the same maximum likelihood classification (named IGA/MLHD) for convenient comparison. IGA/MLHD is applied to 11 cancer-related human gene expression datasets. The simulation results show that IGA/MLHD is superior to GA/MLHD in terms of the number of selected genes, classification accuracy, and robustness of the selected genes and accuracy.

14.
To overcome random experimental variation, even for simple screens, data from multiple microarrays have to be combined. There are, however, systematic differences between arrays, and any bias remaining after experimental measures to ensure consistency needs to be controlled for. It is often difficult to make the right choice of data transformation and normalisation methods to achieve this end. In this tutorial paper we review the problem and a selection of solutions, explaining the basic principles behind normalisation procedures and providing guidance for their application.
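As a concrete instance of removing a systematic offset between arrays, the sketch below applies per-array median centering to log-intensities. This is one of the simplest normalisation schemes and is shown only as an assumption-laden illustration; the abstract does not say which methods the tutorial covers, and the function name `median_center` is hypothetical.

```python
from statistics import median

def median_center(arrays):
    """Subtract each array's own median from its log-intensity values.

    `arrays` is a list of arrays, each a list of log-scale intensities.
    After centering, every array has median 0, which removes a constant
    per-array offset (an illustrative choice, not a recommendation).
    """
    centered = []
    for values in arrays:
        m = median(values)  # per-array location estimate
        centered.append([x - m for x in values])
    return centered
```

Median centering corrects only a constant shift between arrays; intensity-dependent bias calls for other techniques.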

15.
16.
Fuzzy J-Means and VNS methods for clustering genes from microarray data   Total citations: 4, self-citations: 0, citations by others: 4
MOTIVATION: In the interpretation of gene expression data from a group of microarray experiments that include samples from either different patients or conditions, special consideration must be given to the pleiotropic and epistatic roles of genes, as observed in the variation of gene coexpression patterns. Crisp clustering methods assign each gene to one cluster, thereby omitting information about the multiple roles of genes. RESULTS: Here, we present the application of a local search heuristic, Fuzzy J-Means, embedded into the variable neighborhood search metaheuristic for the clustering of microarray gene expression data. We show that for all the datasets studied this algorithm outperforms the standard Fuzzy C-Means heuristic. Different methods for the utilization of cluster membership information in determining gene coregulation are presented. The clustering and data analyses were performed on simulated datasets as well as experimental cDNA microarray data for breast cancer and human blood from the Stanford Microarray Database. AVAILABILITY: The source code of the clustering software (C programming language) is freely available from Nabil.Belacel@nrc-cnrc.gc.ca
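To make the contrast with crisp clustering concrete, the sketch below computes the standard Fuzzy C-Means membership of one expression profile in each cluster, given fixed centroids. It illustrates the baseline heuristic named in the abstract, not the Fuzzy J-Means or VNS procedure itself; the function name `fuzzy_memberships` and the default fuzzifier `m = 2.0` are assumptions.

```python
import math

def fuzzy_memberships(point, centroids, m=2.0):
    """Standard Fuzzy C-Means membership of `point` in each cluster.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the
    Euclidean distance from `point` to centroid i and m > 1 is the
    fuzzifier. The memberships sum to 1.
    """
    dists = [math.dist(point, c) for c in centroids]
    # A point lying exactly on one or more centroids belongs only to those.
    on_centroid = [d == 0.0 for d in dists]
    if any(on_centroid):
        n = sum(on_centroid)
        return [1.0 / n if hit else 0.0 for hit in on_centroid]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists)
            for d_i in dists]
```

A gene equidistant from two centroids receives membership 0.5 in each, which is the information a crisp assignment would discard.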

17.
Journal of Mathematical Biology - The 3D microarrays, generally known as gene-sample-time microarrays, couple the information on different time points collected by 2D microarrays that measure gene...

18.
Housekeeping genes are widely used as internal controls in a variety of study types, including real time RT-PCR, microarrays, Northern analysis and RNase protection assays. However, even commonly used housekeeping genes may vary in stability depending on the cell type or disease being studied. Thus, it is necessary to identify additional housekeeping-type genes that show sample-independent stability. Here, we used statistical analysis to examine a large human microarray database, seeking genes that were stably expressed in various tissues, disease states and cell lines. We further selected genes that were expressed at different levels, because reference and target genes should be present in similar copy numbers to achieve reliable quantitative results. Real time RT-PCR amplification of three newly identified reference genes, CGI-119, CTBP1 and GOLGA1, alongside three well-known housekeeping genes, B2M, GAPD, and TUBB, confirmed that the newly identified genes were more stably expressed in individual samples with similar ranges. These results collectively suggest that statistical analysis of microarray data can be used to identify new candidate housekeeping genes showing consistent expression across tissues and diseases. Our analysis identified three novel candidate housekeeping genes (CGI-119, GOLGA1, and CTBP1) that could prove useful for normalization across a variety of RNA-based techniques.
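One simple way to screen for stably expressed genes is to rank them by coefficient of variation across samples. The abstract does not specify which statistic the authors used, so the sketch below is an assumption for illustration; the function name `rank_by_stability` and the gene labels in the usage example are hypothetical, and expression values are assumed to be positive, linear-scale intensities.

```python
from statistics import mean, pstdev

def rank_by_stability(expression):
    """Rank genes from most to least stable across samples.

    `expression` maps a gene name to a list of positive, linear-scale
    expression values, one per sample. Stability is measured by the
    coefficient of variation (standard deviation divided by the mean);
    a lower value means more stable expression.
    """
    cv = {gene: pstdev(values) / mean(values)
          for gene, values in expression.items()}
    return sorted(cv, key=cv.get)
```

For example, `rank_by_stability({"geneA": [10, 10, 10], "geneB": [5, 10, 15], "geneC": [9, 10, 11]})` places `geneA` first and `geneB` last. Dividing by the mean makes genes expressed at different levels comparable, which matters when candidates are chosen across a range of expression levels.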

19.
束永俊  李勇  柏锡  才华  纪巍  朱延明 《生物信息学》2009,7(3):168-170,177
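Analysis of variance was used to mine abiotic stress-related genes from an Arabidopsis microarray expression profile database, and GO annotation analysis of these genes was performed to reveal the biological significance of abiotic stress. Abiotic stress was found to affect mainly transcriptional regulation in plant gene expression and phosphorylation in signal transduction. The upstream promoter sequences of these genes were also analysed to identify transcription factors involved in the regulation of, and adaptation to, abiotic stress responses; the plant abiotic stress process was found to be regulated mainly by bHLH-ZIP transcription factors and C2H2-type ZN-FINGER transcription factors.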

20.
Hu J  Xu J 《BMC genomics》2010,11(Z2):S3

Motivation

Identification of differentially expressed genes from microarray datasets is one of the most important analyses in microarray data mining. Popular algorithms such as the statistical t-test rank genes based on a single statistic. The false positive rate of these methods can be improved by considering other features of differentially expressed genes.

Results

We proposed a pattern recognition strategy for identifying differentially expressed genes. Genes are mapped to a two-dimensional feature space composed of the average difference in gene expression and the average expression level. A density based pruning algorithm (DB pruning) is developed to screen out potential differentially expressed genes, which are usually located in the sparse boundary region. Biases of popular algorithms for identifying differentially expressed genes are visually characterized. Experiments on 17 datasets from the Gene Expression Omnibus (GEO) database with experimentally verified differentially expressed genes showed that DB pruning can significantly improve the prediction accuracy of popular identification algorithms such as t-test, rank product, and fold change.

Conclusions

Density based pruning of non-differentially expressed genes is an effective method for enhancing statistical testing based algorithms for identifying differentially expressed genes. It improves t-test, rank product, and fold change by 11% to 50% in the numbers of identified true differentially expressed genes. The source code of DB pruning is freely available on our website http://mleg.cse.sc.edu/degprune
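The two-feature mapping and the idea of keeping genes in sparse regions can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' DB pruning implementation: the function names `gene_features` and `sparse_genes`, the fixed-radius neighbour count as the density estimate, and the `radius` and `min_neighbours` parameters are all hypothetical.

```python
import math
from statistics import mean

def gene_features(control, treated):
    """Map one gene to (average expression difference, average level).

    `control` and `treated` are lists of expression values for the
    same gene under two conditions.
    """
    diff = mean(treated) - mean(control)
    level = mean(control + treated)  # average over all samples
    return (diff, level)

def sparse_genes(features, radius, min_neighbours):
    """Return genes lying in sparse regions of the feature space.

    `features` maps a gene name to its 2D feature point. A gene is kept
    as a candidate when fewer than `min_neighbours` other genes fall
    within `radius` of it (an assumed density criterion).
    """
    kept = []
    for name, point in features.items():
        neighbours = sum(
            1 for other, q in features.items()
            if other != name and math.dist(point, q) <= radius
        )
        if neighbours < min_neighbours:
            kept.append(name)
    return kept
```

Genes that pass this filter would then be handed to a conventional test such as the t-test; the pairwise distance loop is quadratic in the number of genes, so a spatial index would be needed for full-size arrays.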
