Similar literature
 Found 20 similar articles (search time: 15 ms)
1.

Background  

The ability to monitor the change in expression patterns over time, and to observe the emergence of coherent temporal responses using gene expression time series, obtained from microarray experiments, is critical to advance our understanding of complex biological processes. In this context, biclustering algorithms have been recognized as an important tool for the discovery of local expression patterns, which are crucial to unravel potential regulatory mechanisms. Although most formulations of the biclustering problem are NP-hard, when working with time series expression data the interesting biclusters can be restricted to those with contiguous columns. This restriction leads to a tractable problem and enables the design of efficient biclustering algorithms able to identify all maximal contiguous column coherent biclusters.
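To illustrate why the contiguous-column restriction makes the search tractable, the sketch below enumerates every contiguous window of time points (there are only on the order of m² of them for m columns) and groups genes that share the same pattern inside the window. The discretization into symbols such as `U`, `D`, and `N` (up, down, no change), the function name `contiguous_biclusters`, and its parameters are assumptions made for illustration; the abstract does not describe the authors' algorithm at this level, and this brute-force version does not filter for maximal biclusters.

```python
from collections import defaultdict


def contiguous_biclusters(patterns, min_genes=2, min_cols=2):
    """Group genes sharing an identical symbol pattern over a contiguous window.

    patterns: one string per gene, one symbol per time point (hypothetical
    discretization, e.g. 'U' = up, 'D' = down, 'N' = no change).
    Returns a list of (genes, start, end, symbols) with end exclusive.
    """
    n_cols = len(patterns[0])
    found = []
    # Only O(m^2) contiguous windows exist, versus 2^m arbitrary column subsets.
    for start in range(n_cols):
        for end in range(start + min_cols, n_cols + 1):
            groups = defaultdict(list)
            for gene, pattern in enumerate(patterns):
                groups[pattern[start:end]].append(gene)
            for symbols, genes in groups.items():
                if len(genes) >= min_genes:
                    found.append((genes, start, end, symbols))
    return found


if __name__ == "__main__":
    for bicluster in contiguous_biclusters(["UDUN", "UDUD", "NDUD"]):
        print(bicluster)
```

A practical implementation would avoid re-slicing every window and would report only maximal biclusters; this version trades efficiency for readability.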

2.
J An, AW Liew, CC Nelson. PLoS ONE, 2012, 7(8): e42431

Background

Accumulated biological research outcomes show that biological functions do not depend on individual genes, but on complex gene networks. Microarray data are widely used to cluster genes according to their expression levels across experimental conditions. However, functionally related genes generally do not show coherent expression across all conditions since any given cellular process is active only under a subset of conditions. Biclustering finds gene clusters that have similar expression levels across a subset of conditions. This paper proposes a seed-based algorithm that identifies coherent genes in an exhaustive, but efficient manner.

Methods

To find the biclusters in a gene expression dataset, we exhaustively select combinations of genes and conditions as seeds to create candidate bicluster tables. Each table has two columns: (a) a gene set, and (b) the conditions on which the gene set has dissimilar expression levels to the seed. First, the genes with fewer than the maximum number of dissimilar conditions are identified and a table of these genes is created. Second, the rows that have the same dissimilar conditions are grouped together. Third, the table is sorted in ascending order by the number of dissimilar conditions. Finally, beginning with the first row of the table, a test is run repeatedly to determine whether the cardinality of the gene set in the row is greater than the minimum threshold number of genes in a bicluster. If so, a bicluster is output and the corresponding row is removed from the table. This process is repeated until the table is empty, so that all biclusters in the table are systematically identified.
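The table-building steps above can be sketched for a single seed as follows. This is a simplified illustration, not the authors' implementation: the similarity test (a gene matches the seed on a condition when its offset from the seed condition differs from the seed gene's offset by at most `eps`), the inclusive thresholds, and all names (`seed_biclusters`, `max_dissimilar`, `min_genes`) are assumptions. It also groups only genes with identical dissimilar-condition sets and does not merge rows.

```python
from collections import defaultdict


def seed_biclusters(data, seed_gene, seed_cond, eps=0.1,
                    max_dissimilar=1, min_genes=2):
    """Candidate additive biclusters for one (gene, condition) seed.

    data: list of rows (genes), each a list of expression values (conditions).
    Returns a list of (genes, conditions) pairs.
    """
    n_conds = len(data[0])
    seed = data[seed_gene]
    table = defaultdict(list)  # dissimilar-condition set -> gene set
    for gene, row in enumerate(data):
        # Assumed additive test: compare offsets relative to the seed condition.
        dissimilar = frozenset(
            j for j in range(n_conds)
            if abs((row[j] - row[seed_cond]) - (seed[j] - seed[seed_cond])) > eps
        )
        # Step 1: keep genes with few dissimilar conditions (inclusive bound assumed).
        # Step 2: genes with the same dissimilar conditions share one table row.
        if len(dissimilar) <= max_dissimilar:
            table[dissimilar].append(gene)
    # Step 3: sort rows by ascending number of dissimilar conditions.
    rows = sorted(table.items(), key=lambda item: (len(item[0]), sorted(item[0])))
    # Step 4: walk the rows and emit those with enough genes.
    biclusters = []
    for dissimilar, genes in rows:
        if len(genes) >= min_genes:
            conds = [j for j in range(n_conds) if j not in dissimilar]
            biclusters.append((genes, conds))
    return biclusters


if __name__ == "__main__":
    data = [
        [1, 2, 3, 4],
        [2, 3, 4, 5],
        [0, 1, 2, 9],
        [5, 6, 7, 15],
        [3, 1, 4, 1],
    ]
    print(seed_biclusters(data, seed_gene=0, seed_cond=0))
```

An exhaustive search in the spirit of the abstract would call this for every seed combination and remove duplicate results.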

Conclusions

This paper presents a novel biclustering algorithm for the identification of additive biclusters. Since it involves exhaustively testing combinations of genes and conditions, the additive biclusters can be found more readily.

3.

Background

Biclustering algorithms can find a number of co-expressed genes under a set of experimental conditions. Recently, differential co-expression bicluster mining has been used to infer reasonable patterns from two microarray datasets, such as those from normal and cancer cells.

Methods

In this paper, we propose an algorithm, DECluster, to mine Differential co-Expression biClusters in two discretized microarray datasets. First, DECluster produces the differential co-expressed genes from each pair of samples in the two microarray datasets, and constructs a differential weighted undirected sample–sample relational graph. Second, the differential biclusters are generated from this graph. To mine maximal differential co-expression biclusters efficiently, we design several pruning techniques for generating maximal biclusters without candidate maintenance.

Results

The experimental results show that our algorithm is more efficient than existing methods. The performance of DECluster is evaluated by empirical p-value and gene ontology; the results show that our algorithm can find more statistically significant and biologically meaningful differential co-expression biclusters than other algorithms.

Conclusions

Our proposed algorithm can find more statistically significant and biologically meaningful biclusters in two microarray datasets than the other two algorithms.

4.

Background

Biclustering is an important analysis procedure to understand the biological mechanisms from microarray gene expression data. Several algorithms have been proposed to identify biclusters, but very little effort was made to compare the performance of different algorithms on real datasets and combine the resultant biclusters into one unified ranking.

Results

In this paper we propose a differential co-expression framework and a differential co-expression scoring function to objectively quantify the quality or goodness of a bicluster of genes, based on the observation that genes in a bicluster are co-expressed in the conditions belonging to the bicluster and not co-expressed in the other conditions. Furthermore, we propose a scoring function to stratify biclusters into three types of co-expression. We used the proposed scoring functions to understand the performance and behavior of four well-established biclustering algorithms on six real datasets from different domains by combining their output into one unified ranking.

Conclusions

The differential co-expression framework is useful for providing a quantitative and objective assessment of the goodness of biclusters of co-expressed genes and of the performance of biclustering algorithms in identifying co-expression biclusters. It also helps to combine the biclusters output by different algorithms into one unified ranking, i.e., meta-biclustering.

5.
6.

Background  

RNA interference (RNAi) has become a powerful means for silencing target gene expression in mammalian cells and is envisioned to be useful in therapeutic approaches to human disease. In recent years, high-throughput, genome-wide screening of siRNA/miRNA libraries has emerged as a desirable approach. Current methods for constructing siRNA/miRNA expression vectors require the synthesis of long oligonucleotides, which is costly and suffers from mutation problems.

7.

Background  

The analysis of large-scale data sets via clustering techniques is utilized in a number of applications. Biclustering in particular has emerged as an important problem in the analysis of gene expression data since genes may only jointly respond over a subset of conditions. Biclustering algorithms also have important applications in sample classification where, for instance, tissue samples can be classified as cancerous or normal. Many of the methods for biclustering, and clustering algorithms in general, utilize simplified models or heuristic strategies for identifying the "best" grouping of elements according to some metric and cluster definition and thus result in suboptimal clusters.

8.

Background  

The use of small interfering RNA (siRNA) molecules in animals to achieve double-stranded RNA-mediated interference (RNAi) has recently emerged as a powerful method of sequence-specific gene knockdown. As DNA-based expression of short hairpin RNA (shRNA) for RNAi may offer some advantages over chemical and in vitro synthesised siRNA, a number of vectors for expression of shRNA have been developed. These often feature polymerase III (pol. III) promoters of either mouse or human origin.

9.

Background  

Adjustable gene expression is crucial in a number of applications such as de- or transdifferentiation of cell phenotypes, tissue engineering, various production processes as well as gene-therapy initiatives. Viral vectors, based on the Adeno-Associated Virus (AAV) type 2, have emerged as one of the most promising types of vectors for therapeutic applications due to excellent transduction efficiencies of a broad variety of dividing and mitotically inert cell types and due to their unique safety features.

10.
11.
12.
13.

Background  

Cells dynamically adapt their gene expression patterns in response to various stimuli. This response is orchestrated into a number of gene expression modules consisting of co-regulated genes. A growing pool of publicly available microarray datasets allows the identification of modules by monitoring expression changes over time. These time-series datasets can be searched for gene expression modules by one of the many clustering methods published to date. For an integrative analysis, several time-series datasets can be joined into a three-dimensional gene-condition-time dataset, to which standard clustering or biclustering methods are, however, not applicable. We thus devise a probabilistic clustering algorithm for gene-condition-time datasets.

14.

Background  

All currently available methods of network/association inference from microarray gene expression measurements implicitly assume that such measurements represent the actual expression levels of different genes within each cell included in the biological sample under study. Contrary to this common belief, modern microarray technology produces signals aggregated over a random number of individual cells, a "nitty-gritty" aspect of such arrays, thereby causing a random effect that distorts the correlation structure of intra-cellular gene expression levels.

15.
Detecting biclusters in expression data is useful because a bicluster is a set of genes coexpressed under only a subset of the given experimental conditions. We present SiBIC, software that, given an expression dataset, first exhaustively enumerates biclusters, then merges them into largely independent biclusters, and finally uses these to generate gene set networks, in which each node is assigned a gene set of coexpressed genes. We evaluated each step of this procedure: 1) the biological and statistical significance of the generated biclusters, 2) the biological quality of the merged biclusters, and 3) the biological significance of the gene set networks. We emphasize that gene set networks, in which nodes are gene sets rather than genes, can be more compact than ordinary gene networks and are therefore more comprehensible. SiBIC is available at http://utrecht.kuicr.kyoto-u.ac.jp:8080/miami/faces/index.jsp.

16.

Background  

DNA microarray technology allows the measurement of expression levels of thousands of genes under tens to hundreds of different conditions. In microarray data, genes with similar functions usually co-express under certain conditions only [1]. Thus, biclustering, which clusters genes and conditions simultaneously, is preferred over traditional clustering for discovering these coherent genes. Various biclustering algorithms have been developed using different bicluster formulations. Unfortunately, many useful formulations result in NP-complete problems. In this article, we investigate an efficient method for identifying a popular type of bicluster known as the additive model. Furthermore, parallel coordinate (PC) plots are used for bicluster visualization and analysis.
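As a small illustration of the additive model mentioned above, a submatrix is commonly called an additive bicluster when each entry can be written as a row effect plus a column effect, so that any two rows differ by a constant offset. The check below is a minimal sketch under that common definition; the function name `is_additive_bicluster` and the tolerance parameter `tol` are hypothetical, and the abstract does not specify the authors' exact formulation or noise handling.

```python
def is_additive_bicluster(sub, tol=1e-9):
    """Return True if sub[i][j] = a_i + b_j holds for every entry, within tol.

    sub: list of rows (genes), each a list of values (conditions).
    Uses the first row and first column as the reference.
    """
    base = sub[0]
    for row in sub:
        for j, value in enumerate(row):
            # Under an additive model this residual is zero for every entry.
            residual = value - row[0] - base[j] + base[0]
            if abs(residual) > tol:
                return False
    return True


if __name__ == "__main__":
    print(is_additive_bicluster([[1, 2, 4], [3, 4, 6], [0, 1, 3]]))
    print(is_additive_bicluster([[1, 2, 4], [3, 4, 7]]))
```

For noisy expression data, `tol` would need to be set to a value reflecting measurement error rather than the near-zero default used here.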

17.

Background  

Drosophila gene expression pattern images document the spatiotemporal dynamics of gene expression during embryogenesis. A comparative analysis of these images could provide a fundamentally important way for studying the regulatory networks governing development. To facilitate pattern comparison and searching, groups of images in the Berkeley Drosophila Genome Project (BDGP) high-throughput study were annotated with a variable number of anatomical terms manually using a controlled vocabulary. Considering that the number of available images is rapidly increasing, it is imperative to design computational methods to automate this task.

18.

Background  

One of the most commonly performed tasks when analysing high throughput gene expression data is to use clustering methods to classify the data into groups. There are a large number of methods available to perform clustering, but it is often unclear which method is best suited to the data and how to quantify the quality of the classifications produced.

19.

Background  

Variations in codon usage between species are one of the major causes affecting recombinant protein expression levels, with a significant impact on the economy of industrial enzyme production processes. The use of codon-optimized genes may overcome this problem. However, designing a gene for optimal expression requires choosing from a vast number of possible DNA sequences and different codon optimization methods have been used in the past decade. Here, a comparative study of the two most common methods is presented using calf prochymosin as a model.

20.

Background  

With the advance of microarray technology, several methods for gene classification and prognosis have been already designed. However, under various denominations, some of these methods have similar approaches. This study evaluates the influence of gene expression variance structure on the performance of methods that describe the relationship between gene expression levels and a given phenotype through projection of data onto discriminant axes.
