Similar literature
 Found 20 similar articles; search took 562 ms
1.
2.
3.
4.
5.
The availability of both the Xenopus tropicalis genome and the soon-to-be-released Xenopus laevis genome provides a solid foundation for Xenopus developmental biologists. The Xenopus community has so far amassed expression data for ~2,300 genes in the form of published images collected in Xenbase, the principal Xenopus research database. A few of these genes have been examined in both X. tropicalis and X. laevis, and the cross-species comparison has proven invaluable for studying gene function. A recently published work has yielded developmental expression profiles for the majority of Xenopus genes across fourteen developmental stages spanning the blastula, gastrula, neurula, and tail-bud. While these data were originally queried for global evolutionary and developmental principles, here we demonstrate their general use for gene-level analyses. In particular, we present the accessibility of this dataset through Xenbase and describe biases in the characterized genes in terms of sequence and expression conservation across the two species. We further indicate the advantage of examining coexpression for gene function discovery relating to developmental processes conserved across species. We suggest that the integration of additional large-scale datasets, comprising diverse functional data, into Xenbase promises to provide a strong foundation for researchers in elucidating biological processes including the gene regulatory programs encoding development.

6.
7.
Two genes are said to be coexpressed if their expression levels have a similar spatial or temporal pattern. Since the advent of microarray-based gene expression profiling, computational modeling of coexpression has become a major focus. As a result, several similarity/distance measures have evolved over time to quantify coexpression similarity/dissimilarity between gene pairs. Of these, the correlation coefficient has been established as a suitable quantifier of pairwise coexpression. In general, the correlation coefficient captures linear dependence well, but not nonlinear dependence. In spite of this drawback, it outperforms many other existing measures in modeling the dependency in biological data. In this paper, for the first time, we point out a significant weakness of the existing similarity/distance measures, including the standard correlation coefficient, in modeling pairwise coexpression of genes. A novel measure, called BioSim, which assumes values between -1 and +1 corresponding to negative and positive dependency and 0 for independence, is introduced. The computation of BioSim is based on the aggregation of the stepwise relative angular deviation of the expression vectors considered. The proposed measure is analytically suitable for modeling coexpression as it accounts for the features of expression similarity, expression deviation and also the relative dependence. It is demonstrated how the proposed measure is better able to capture the degree of coexpression between a pair of genes as compared to several other existing ones. The efficacy of the measure is statistically analyzed by integrating it with several module-finding algorithms based on coexpression values and then applying it to synthetic and biological data. The annotation results of the coexpressed genes as obtained from Gene Ontology establish the significance of the introduced measure.
By further extending the BioSim measure, it has been shown that one can effectively identify the variability in the expression patterns over multiple phenotypes. We have also extended BioSim to determine pairwise differential expression patterns and coexpression dynamics. The significance of these studies is shown based on the analysis of several real-life data sets. Because the measure is computed over stepwise time points, it is also effective at identifying partially coexpressed genes. On the whole, we put forward a complete framework for coexpression analysis based on the BioSim measure.
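The abstract above notes that the correlation coefficient captures linear dependence but not nonlinear dependence. A minimal sketch of that limitation follows, using the standard Pearson correlation on made-up expression vectors. This is not the BioSim measure, whose formula the abstract does not give; the function name `pearson` and the sample values are illustrative assumptions.

```python
import math


def pearson(x, y):
    """Standard Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (dev_x * dev_y)


# Hypothetical expression profiles over five time points.
gene_a = [-2.0, -1.0, 0.0, 1.0, 2.0]
linear_partner = [2 * v + 1 for v in gene_a]   # perfectly linear in gene_a
quadratic_partner = [v * v for v in gene_a]    # fully determined by gene_a, but nonlinear

r_linear = pearson(gene_a, linear_partner)
r_quadratic = pearson(gene_a, quadratic_partner)
```

Here `quadratic_partner` is completely determined by `gene_a`, yet its Pearson correlation with `gene_a` is zero because the dependence is symmetric around the mean, whereas the linear partner scores essentially 1.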

8.
9.
Coexpression analysis is a powerful, widely used methodology for the investigation of underlying patterns in gene expression data. This "guilt-by-association" approach aims to find groups of genes with closely correlated expression profiles. Observation of consistent correlations across phenotypically diverse samples suggests that these genes have a shared function. We have recently described the application of weighted gene coexpression network analysis (WGCNA) to a 295-sample production CHO cell line microarray dataset and elucidated groups of genes related to growth rate and cell-specific productivity (Qp). In this study, we present the CHO gene coexpression database (CGCDB), a web-based system designed specifically to give researchers in the CHO community user-friendly access to these gene-gene coexpression patterns. In addition to correlations between genes, the direct correlations between probesets and either growth rate or Qp are provided. Results are presented to the user via an interactive network diagram and in a downloadable tabular format. It is hoped that this resource will allow researchers to prioritize cell line engineering and/or biomarker candidates to enhance CHO-based cell culture for the production of biotherapeutics. Availability: www.cgcdb.org.

10.
Increasingly large-scale expression compendia for different species are becoming available. By exploiting the modularity of the coexpression network, these compendia can be used to identify biological processes for which the expression behavior is conserved over different species. However, comparing module networks across species is not trivial. The definition of a biologically meaningful module is not fixed, and changing the distance threshold that defines the degree of coexpression gives rise to different modules. As a result, when comparing modules across species, many different partially overlapping conserved module pairs exist, and deciding which pair is most relevant is hard. Therefore, we developed a method referred to as conserved modules across organisms (COMODO) that uses an objective selection criterion to identify conserved expression modules between two species. The method takes microarray data and a gene homology map as input and outputs pairs of conserved modules. It searches for the pair of modules for which the number of shared homologs is statistically most significant relative to the size of the linked modules. To demonstrate its principle, we applied COMODO to study coexpression conservation between the two well-studied bacteria Escherichia coli and Bacillus subtilis. COMODO is available at: http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Information_Zarrineh_2010/comodo/index.html.
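The abstract above scores a module pair by how significant its number of shared homologs is relative to the module sizes, but it does not say which statistic COMODO uses. One common way to quantify such an overlap is a hypergeometric tail probability. The sketch below shows that approach only as an assumption, with a one-to-one homology map assumed as well; the function name `overlap_pvalue` and all numbers are hypothetical.

```python
from math import comb


def overlap_pvalue(universe, size_a, size_b, shared):
    """P(X >= shared) for X ~ Hypergeometric(universe, size_a, size_b).

    universe: number of genes with a one-to-one homolog in the other species
    size_a:   genes of the module in species 1 that have such a homolog
    size_b:   genes of the module in species 2 that have such a homolog
    shared:   homolog pairs linking the two modules
    """
    smallest = min(size_a, size_b)
    if max(size_a, size_b) > universe or not 0 <= shared <= smallest:
        raise ValueError("inconsistent module sizes or overlap")
    total = comb(universe, size_b)
    # math.comb returns 0 when k > n, so impossible overlaps contribute nothing.
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(shared, smallest + 1)
    )
    return tail / total


# Hypothetical example: 1000 homolog pairs, modules of 40 and 50 genes.
p_weak = overlap_pvalue(1000, 40, 50, 3)
p_strong = overlap_pvalue(1000, 40, 50, 12)
```

A smaller value means the overlap is less likely to arise by chance, so under this assumed scoring `p_strong` would mark the better-conserved module pair. Comparing many candidate pairs in practice would also need a multiple-testing correction, which is omitted here.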

11.
cDNA-AFLP is a genome-wide expression analysis technology that does not require any prior knowledge of gene sequences. This PCR-based technique combines a high sensitivity with a high specificity, allowing detection of rarely expressed genes and distinguishing between homologous genes. In this report, we validated quantitative expression data of 110 cDNA-AFLP fragments in yeast with DNA microarrays and GeneChip data. The best correlation was found between cDNA-AFLP and GeneChip data. The cDNA-AFLP data revealed a low number of inconsistent profiles that could be explained by gel artifacts, overexposure, or mismatch amplification. In addition, 18 cDNA-AFLP fragments displayed homology to genomic yeast DNA, but could not be linked unambiguously to any known ORF. These fragments were most probably derived from 5' or 3' noncoding sequences or might represent previously unidentified ORFs. Genes liable to cross-hybridization showed identical results in cDNA-AFLP and GeneChip analysis. Three genes, which were readily detected with cDNA-AFLP, showed no significant expression in GeneChip experiments. We show that cDNA-AFLP is a very good alternative to microarrays and, since no preexisting biological or sequence information is required, it is applicable to any species.

12.
Large amounts of gene expression data from several different technologies are becoming available to the scientific community. A common practice is to use these data to calculate global gene coexpression for validation or integration of other "omic" data. To assess the utility of publicly available datasets for this purpose, we have analyzed Homo sapiens data from 1202 cDNA microarray experiments, 242 SAGE libraries, and 667 Affymetrix oligonucleotide microarray experiments. The three datasets compared demonstrate significant but low levels of global concordance (rc<0.11). Assessment against Gene Ontology (GO) revealed that all three platforms identify more coexpressed gene pairs with common biological processes than expected by chance. As the Pearson correlation for a gene pair increased, it was more likely to be confirmed by GO. The Affymetrix dataset performed best individually, with gene pairs of correlation 0.9-1.0 confirmed by GO in 74% of cases. However, in all cases, gene pairs confirmed by multiple platforms were more likely to be confirmed by GO. We show that combining results from different expression platforms increases reliability of coexpression. A comparison with other recently published coexpression studies found similar results in terms of performance against GO but with each method producing distinctly different gene pair lists.

13.
Although we know there is considerable variation in gut microbial composition within host species, little is known about how this variation is shaped and why such variation exists. In humans, obesity is associated with the relative abundance of two dominant bacterial phyla: an increase in the proportion of Firmicutes and a decrease in the proportion of Bacteroidetes. As there is evidence that humans have adapted to colder climates by increasing their body mass (e.g. Bergmann's rule), we tested whether Firmicutes increase and Bacteroidetes decrease with latitude, using 1020 healthy individuals drawn from 23 populations and six published studies. We found a positive correlation between Firmicutes and latitude and a negative correlation between Bacteroidetes and latitude. The overall pattern appears robust to sex, age and bacterial detection methods. Comparisons between African Americans and native Africans and between European Americans and native Europeans suggest no evidence of host genotype explaining the observed patterns. The variation of gut microbial composition described here is consistent with the pattern expected by Bergmann's rule. This surprising link between large-scale geography and human gut microbial composition merits further investigation.

14.
GENEVESTIGATOR. Arabidopsis microarray database and analysis toolbox   Total citations: 26, self-citations: 0, citations by others: 26
High-throughput gene expression analysis has become a frequent and powerful research tool in biology. At present, however, few software applications have been developed for biologists to query large microarray gene expression databases using a Web-browser interface. We present GENEVESTIGATOR, a database and Web-browser data mining interface for Affymetrix GeneChip data. Users can query the database to retrieve the expression patterns of individual genes throughout chosen environmental conditions, growth stages, or organs. Conversely, mining tools allow users to identify genes specifically expressed during selected stresses, growth stages, or in particular organs. Using GENEVESTIGATOR, the gene expression profiles of more than 22,000 Arabidopsis genes can be obtained, including those of 10,600 currently uncharacterized genes. The objective of this software application is to direct gene functional discovery and design of new experiments by providing plant biologists with contextual information on the expression of genes. The database and analysis toolbox is available as a community resource at https://www.genevestigator.ethz.ch.

15.
Large volumes of genomic data have been generated for several plant species over the past decade, including structural sequence data and functional annotation at the genome level. Various technologies such as expressed sequence tags (ESTs), massively parallel signature sequencing (MPSS) and microarrays have been used to study gene expression and to provide functional data for many genes simultaneously. This review focuses on recent advances in the application of microarrays in plant genomic research and in gene expression databases available for plants. Large sets of Arabidopsis microarray data are publicly available. Recently developed array platforms are currently being used to generate genome-wide expression profiles for several crop species. Coupled to these platforms are public databases that provide access to these large-scale expression data, which can be used to aid the discovery of gene function.

16.
The operon encoding ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) in the cyanobacterium Synechococcus sp. PCC7002 contains three rbc genes, rbcL, rbcX and rbcS, in this order. Introduction of a translational frameshift into the rbcX gene resulted in a significant decrease in the production of the large (RbcL) and small (RbcS) subunits of the Rubisco protein in Synechococcus sp. PCC7002 and in Escherichia coli. To investigate the function of the rbcX gene product (RbcX), we constructed an expression plasmid for the rbcX gene and examined the effects of RbcX on recombinant Rubisco production in Escherichia coli. The coexpression experiments revealed that RbcX had marked effects on the production of the large and small subunits of Rubisco without any significant influence on the mRNA level of rbc genes and/or the post-translational assembly of the Rubisco protein. The present rbcX coexpression system provides a novel and useful method for investigating the Rubisco maturation pathway.

17.
18.
Aim: Phytosociological databases often contain unbalanced samples of real vegetation, which should be carefully resampled before any analyses. We propose a new resampling method based on species composition, called heterogeneity-constrained random (HCR) resampling. Method: Many subsets of the source vegetation database are selected randomly. These subsets are sorted by decreasing mean dissimilarity between pairs of the vegetation plots, and then sorted again by increasing variance of these dissimilarities. Ranks from both sortings are summed for each subset, and the subset with the lowest summed rank is considered the most representative. The performance of this method was tested using simulated point patterns that represented different levels of aggregation of vegetation plots within a database. The distributions of points in the subsets resulting from different resampling methods, both with and without database stratification, were compared using Ripley's K function. The mean of random selections from an unbiased sample was used as a reference in these comparisons. The efficiency of the method was also demonstrated with real phytosociological data. Results: Both stratified and HCR resampling yielded selection patterns more similar to the reference than resampling without these tools. Outcomes from the resampling that combined these two methods were the most similar to the reference. The efficiency of the HCR resampling method varied with different levels of aggregation in the database. Conclusions: This new method is efficient for resampling phytosociological databases. As it only uses information on species occurrences/abundances, it does not require the definition of strata, thereby avoiding the effect of subjective decisions on the selection outcome. Nevertheless, this method can also be applied to stratified databases.
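The Method steps in the abstract above (draw many random subsets, rank them by decreasing mean dissimilarity and by increasing variance of dissimilarity, sum the ranks, keep the lowest) can be sketched as follows. The abstract does not name a dissimilarity index, so Jaccard dissimilarity on presence/absence data is used here as an assumption, and ties are broken by draw order. The names `hcr_resample` and `jaccard_dissimilarity` and the toy plots are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, pvariance


def jaccard_dissimilarity(a, b):
    """1 - |a & b| / |a | b| for two sets of species names (assumed index)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def hcr_resample(plots, subset_size, n_trials=1000, seed=0):
    """Return plot indices of the random subset with the lowest summed rank."""
    if not 2 <= subset_size <= len(plots):
        raise ValueError("subset_size must be between 2 and len(plots)")
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_trials):
        idx = sorted(rng.sample(range(len(plots)), subset_size))
        d = [jaccard_dissimilarity(plots[i], plots[j])
             for i, j in combinations(idx, 2)]
        candidates.append((idx, mean(d), pvariance(d)))
    # First sorting: decreasing mean dissimilarity (rank 0 = most heterogeneous).
    by_mean = sorted(range(n_trials), key=lambda k: -candidates[k][1])
    # Second sorting: increasing variance (rank 0 = most evenly spread).
    by_var = sorted(range(n_trials), key=lambda k: candidates[k][2])
    rank_sum = [0] * n_trials
    for rank, k in enumerate(by_mean):
        rank_sum[k] += rank
    for rank, k in enumerate(by_var):
        rank_sum[k] += rank
    best = min(range(n_trials), key=rank_sum.__getitem__)
    return candidates[best][0]


# Toy database: three duplicate plots (an oversampled vegetation type) plus three distinct ones.
plots = [{"a", "b"}, {"a", "b"}, {"a", "b"}, {"c", "d"}, {"e", "f"}, {"g"}]
chosen = hcr_resample(plots, subset_size=3, n_trials=200, seed=1)
```

In this toy database the selected subset avoids picking more than one of the duplicated plots, which is the balancing effect the abstract describes. A real application would work on abundance data and a much larger number of plots and trials.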

19.
Novak JP, Sladek R, Hudson TJ. Genomics, 2002, 79(1): 104-113
Large-scale gene expression measurement techniques provide a unique opportunity to gain insight into biological processes under normal and pathological conditions. To interpret the changes in expression profiles for thousands of genes, we face the nontrivial problem of understanding the significance of these changes. In practice, the sources of background variability in expression data can be divided into three categories: technical, physiological, and sampling. To assess the relative importance of these sources of background variation, we generated replicate gene expression profiles on high-density Affymetrix GeneChip oligonucleotide arrays, using either identical RNA samples or RNA samples obtained under similar biological states. We derived a novel measure of dispersion in two-way comparisons, using a linear characteristic function. When comparing expression profiles from replicate tests using the same RNA sample (a test for technical variability), we observed a level of dispersion similar to the pattern obtained with RNA samples from replicate cultures of the same cell line (a test for physiological variability). On the other hand, a higher level of dispersion was observed when tissue samples of different animals were compared (an example of sampling variability). This implies that, in experiments in which samples from different subjects are used, the variation induced by the stimulus may be masked by non-stimulus-related differences in the subjects' biological state. These analyses underscore the need for replicate experiments to reliably interpret large-scale expression data sets, even with simple microarray experiments.

20.