Similar literature
Found 20 similar articles (search time: 140 ms)
1.
A key complication in comparative genomics for reliable gene function prediction is the existence of duplicated genes. To study the effect of gene duplication on function prediction, we analyze orthologs between pairs of genomes where in one genome the orthologous gene has duplicated after the speciation of the two genomes (i.e. inparalogs). For these duplicated genes we investigate whether the gene that is most similar on the sequence level is also the gene that has retained the ancestral gene-neighborhood. Although the majority of investigated cases show a consistent pattern between sequence similarity and gene-neighborhood conservation, a substantial fraction, 29–38%, is inconsistent. The observation of inconsistency is not the result of a chance outcome owing to a lack of divergence time between inparalogs, but rather it seems to be the result of a chance outcome caused by very similar rates of sequence evolution of both inparalogs relative to their ortholog. If one-to-one orthologous relationships are required, it is advisable to combine contextual information (i.e. gene-neighborhood in prokaryotes and co-expression in eukaryotes) with protein sequence information to predict the most probable functional equivalent ortholog in the presence of inparalogs.

2.
Differences between species have been suggested to largely reside in the network of connections among the genes. Nevertheless, the rate at which these connections evolve has not been properly quantified. Here, we measure the extent to which co-regulation between pairs of genes is conserved over large phylogenetic distances: between two eukaryotes, Caenorhabditis elegans and Saccharomyces cerevisiae, and between two prokaryotes, Escherichia coli and Bacillus subtilis. We first construct a reliable set of co-regulated genes by combining various functional genomics data from yeast, and subsequently determine conservation of co-regulation in worm from the distribution of co-expression values. For B. subtilis and E. coli, we use known operons and regulons. We find that between 76 and 80% of the co-regulatory connections are conserved between orthologous pairs of genes, which is very high compared with previous estimates and expectations regarding network evolution. We show that in the case of gene duplication after speciation, one of the two inparalogous genes tends to retain its original co-regulatory relationship, while the other loses this link and is presumably free for differentiation or sub-functionalization. The high level of co-regulation conservation implies that reliably predicted functional relationships from functional genomics data in one species can be transferred with high accuracy to another species when that species also harbours the associated genes.

3.
Gene duplications are one of the most important mechanisms for the origin of evolutionary novelties. Even though various models of the fate of duplicated genes have been established, current knowledge about the role of divergent selection after gene duplication is rather limited. In this study, we analyzed sequence divergence in response to neo- and subfunctionalization of segmentally duplicated genes in the genome of Arabidopsis thaliana. We compared the genomes of A. thaliana and the poplar Populus trichocarpa to identify orthologous pairs of genes and their corresponding inparalogs. Maximum-likelihood analyses of the nonsynonymous to synonymous substitution rate ratio (ω) of pairs of A. thaliana inparalogs were used to detect differences in the evolutionary rates of protein coding sequences. We analyzed 1,924 A. thaliana paralogous pairs and our results indicate that around 6.9% show divergent ω values between the lineages for a fraction of sites. We observe an enrichment of regulatory sequences, a reduced level of co-expression and an increased number of substitutions that can be attributed to positive selection based on a McDonald-Kreitman-type analysis. Taken together, these results show that selection after duplication contributes substantially to gene novelties and hence functional divergence in plants.

4.
Co-expressed genes are often expected to be functionally related and many bioinformatics approaches based on co-expression have been developed to infer their biological role. However, such annotations may be unreliable, whereas the evolutionary conservation of gene co-expression among species may form a basis for more confident predictions. The huge amount of expression data (microarrays, SAGE, ESTs) has already allowed functional studies based on conserved co-expression in animals. Up to now, the implementation of analogous tools for plants has been strongly limited, probably by the paucity and heterogeneity of data. Here we present ORTom, a tomato-centred EST data-mining approach based on conserved co-expression in the Solanaceae family. ORTom can be used to predict functional relationships among genes and to prioritize candidate genes for targeted studies. The method consists of ranking ESTs co-expressed with a gene of interest according to the level of expression pattern conservation in phylogenetically related plants (potato, tobacco and pepper) to obtain lists of putative functionally related genes. The lists are then analyzed for Gene Ontology keyword enrichment. The web server ORTom has been implemented to make the results publicly available and searchable. A few biological examples of how the tool can be used are presented.

5.

Background

Why do some groups of physically linked genes stay linked over long evolutionary periods? Although several factors are associated with the formation of gene clusters in eukaryotic genomes, the particular contribution of each feature to clustering maintenance remains unclear.

Results

We quantify the strength of the proposed factors in a yeast lineage. First we identify the magnitude of each variable to determine linkage conservation by using several comparator species at different distances to Saccharomyces cerevisiae. For adjacent gene pairs, in line with null simulations, intergenic distance acts as the strongest covariate. Which of the other covariates appear important depends on the comparator, although high co-expression is related to synteny conservation commonly, especially in the more distant comparisons, these being expected to reveal strong but relatively rare selection. We also analyze those pairs that are immediate neighbors through all the lineages considered. Current intergene distance is again the best predictor, followed by the local density of essential genes and co-regulation, with co-expression and recombination rate being the weakest predictors. The genome duplication seen in yeast leaves some mark on linkage conservation, as adjacent pairs resolved as single copy in all post-whole genome duplication species are more often found as adjacent in pre-duplication species.

Conclusion

Current intergene distance is consistently the strongest predictor of synteny conservation as expected under a simple null model. Other variables are of lesser importance and their relevance depends both on the species comparison in question and the fate of the duplicates following genome duplication.

6.
7.
8.

Background

The massive scale of microarray derived gene expression data allows for a global view of cellular function. Thus far, comparative studies of gene expression between species have been based on the level of expression of the gene across corresponding tissues, or on the co-expression of the gene with another gene.

Results

To compare gene expression between distant species on a global scale, we introduce the "expression context". The expression context of a gene is based on the co-expression with all other genes that have unambiguous counterparts in both genomes. Employing this new measure, we show 1) that the expression context is largely conserved between orthologs, and 2) that sequence identity shows little correlation with expression context conservation after gene duplication and speciation.

Conclusion

This means that the degree of sequence identity has a limited predictive quality for differential expression context conservation between orthologs, and thus presumably also for other facets of gene function.
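The "expression context" described in this abstract can be illustrated with a small sketch. It is only one reading of the abstract, not the authors' implementation: it assumes Pearson correlation as the co-expression measure, assumes the query genes are not themselves part of the reference set, and uses invented toy profiles. All names (pearson, expression_context, context_conservation, a1, b1, b2, r1..r3, s1..s3) are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def expression_context(gene, profiles, reference):
    """Co-expression of `gene` with each reference gene, in a fixed order."""
    return [pearson(profiles[gene], profiles[r]) for r in reference]


def context_conservation(gene_a, gene_b, profiles_a, profiles_b, ref_pairs):
    """Correlate the expression contexts of two genes from two species.

    `ref_pairs` lists (gene in species A, counterpart in species B) for the
    genes assumed to have unambiguous counterparts in both genomes.
    """
    ctx_a = expression_context(gene_a, profiles_a, [a for a, _ in ref_pairs])
    ctx_b = expression_context(gene_b, profiles_b, [b for _, b in ref_pairs])
    return pearson(ctx_a, ctx_b)


# Toy data (invented): each species has its own set of conditions.
profiles_a = {
    "a1": [1, 2, 3, 4],
    "r1": [1, 2, 3, 5],
    "r2": [4, 3, 2, 1],
    "r3": [1, 3, 2, 4],
}
profiles_b = {
    "b1": [2, 4, 6],  # duplicate that keeps the ancestral pattern
    "b2": [6, 4, 2],  # duplicate with an inverted pattern
    "s1": [1, 2, 4],
    "s2": [3, 2, 1],
    "s3": [1, 3, 2],
}
ref_pairs = [("r1", "s1"), ("r2", "s2"), ("r3", "s3")]

score_b1 = context_conservation("a1", "b1", profiles_a, profiles_b, ref_pairs)
score_b2 = context_conservation("a1", "b2", profiles_a, profiles_b, ref_pairs)
print(f"a1 vs b1: {score_b1:.3f}")
print(f"a1 vs b2: {score_b2:.3f}")
```

Because the contexts are compared rather than the raw profiles, the two species do not need matching tissues or conditions; only the reference gene pairs must line up. In the toy data, the duplicate b1 scores high and b2 scores negative, which mirrors the idea of asking which duplicate has retained the ancestral expression context.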

9.
Gene duplication is one of the main mechanisms by which genomes can acquire novel functions. It has been proposed that the retention of gene duplicates can be associated with processes of tissue expression divergence. These models predict that divergent expression patterns should be acquired shortly after the duplication, and that larger divergence in tissue expression would be expected for paralogs, as compared to orthologs of a similar age. Many studies have shown that gene duplicates tend to have divergent expression patterns and that gene family expansions are associated with high levels of tissue specificity. However, the timeframe in which these processes occur has rarely been investigated in detail, particularly in vertebrates, and most analyses do not include direct comparisons of orthologs as a baseline for the expected levels of tissue specificity in the absence of duplications. To assess the specific contribution of duplications to expression divergence, we combine here phylogenetic analyses and expression data from human and mouse. In particular, we study differences in spatial expression among human-mouse paralogs, specifically duplicated after the radiation of mammals, and compare them to pairs of orthologs in the same species. Our results show that gene duplication leads to increased levels of tissue specificity and that this tends to occur promptly after the duplication event.

10.
Wang Y, Robbins KR, Rekaya R. PLoS ONE, 2010, 5(10): e13239
Assessing conservation/divergence of gene expression across species is important for the understanding of gene regulation evolution. Although advances in microarray technology have provided massive high-dimensional gene expression data, the analysis of such data is still challenging. To date, assessing cross-species conservation of gene expression using microarray data has been mainly based on comparison of expression patterns across corresponding tissues, or comparison of co-expression of a gene with a reference set of genes. Because direct and reliable high-throughput experimental data on conservation of gene expression are often unavailable, the assessment of these two computational models is very challenging and has not been reported yet. In this study, we compared one corresponding tissue based method and three co-expression based methods for assessing conservation of gene expression, in terms of their pair-wise agreements, using a frequently used human-mouse tissue expression dataset. We find that 1) the co-expression based methods are only moderately correlated with the corresponding tissue based methods, 2) the reliability of co-expression based methods is affected by the size of the reference ortholog set, and 3) the corresponding tissue based methods may lose some information for assessing conservation of gene expression. We suggest that the use of either of these two computational models to study the evolution of a gene's expression may be subject to great uncertainty, and the investigation of changes in both gene expression patterns over corresponding tissues and co-expression of the gene with other genes is necessary.

11.
To address the need for new antibacterials, a number of bacterial genomes have been systematically disrupted to identify essential genes. Such programs have focused on the disruption of single genes and may have missed functions encoded by gene pairs or multiple genes. In this work, we hypothesized that we could predict the identity of pairs of proteins within one organism that have the same function. We identified 135 putative protein pairs in Bacillus subtilis and attempted to disrupt the genes forming these, singly and then in pairs. The single gene disruptions revealed new genes that could not be disrupted individually and other genes required for growth in minimal medium or for sporulation. The pairwise disruptions revealed seven pairs of proteins that are likely to have the same function, as the presence of one protein can compensate for the absence of the other. Six of these pairs are essential for bacterial viability and in four cases show a pattern of species conservation appropriate for potential antibacterial development. This work highlights the importance of combinatorial studies in understanding gene duplication and identifying functional redundancy.

12.
Genomic DNA fragments bearing proline-rich protein (PRP) genes expressed specifically in hamster parotid glands have been isolated and characterized. Complete exonic sequences as well as intronic and a considerable portion of the flanking sequences are reported for a PRP gene, H29. H29 is interrupted by three intervening sequences, with consensus splice junctions, and it likely encodes the acidic hamster PRP Hp43a. Exceedingly high homology of the 5'-untranslated region and the sequence encoding the signal peptide is observed with other PRPs of all species studied. Significant homology was also detected among the repetitive sequences of the mature acidic PRPs from human, mouse, hamster, and rat. This conservation of the internal repeats of the PRPs suggested that proline-rich protein gene evolution involved intragenic duplication of internal repeats and gene duplication and conversion. Both hamster and mouse PRP genes (H29 and mouse proline-rich protein gene, respectively) share considerable sequence similarity in the 5'-flanking regions for about 100 base pairs upstream. The remainder of the upstream sequences were heterologous except for three oligonucleotide regions with 60-70% sequence conservation. These three regions are thought to be involved in the regulation of the tissue-specific PRP gene induction.

13.
Arabidopsis thaliana is believed to have experienced at least two and possibly three whole-genome duplication events in its evolutionary history. In order to investigate the evolutionary relationships between these duplication events and diversification of disease resistance (R) genes, segmental-duplication events containing R genes belonging to the nucleotide binding-leucine rich repeat (NB-LRR) class were identified. Of 153 segmental-duplication events containing NB-LRR genes, only 22 contained NB-LRR genes in both members of the duplication pair, indicating a high frequency of NB-LRR gene loss after whole-genome duplication. The relative age of the duplication events was estimated based on the average synonymous substitution rate of the duplicated gene pairs in the segments. These data were combined with phylogenetic analyses. NB-LRR genes present in segment pairs derived from the most recent whole-genome duplication event, estimated to have occurred only 20 to 40 million years ago, occupy very distant branches of the NB-LRR phylogenetic tree. These data suggest that when NB-LRR clusters are duplicated as part of a whole-genome duplication, homoeologous NB-LRR genes are preferentially lost, either by eliminating one copy of the cluster or by eliminating individual genes such that only paralogous NB-LRR genes are maintained.

14.
15.
16.

Background  

Genes that are co-expressed tend to be involved in the same biological process. However, co-expression is not a very reliable predictor of functional links between genes. The evolutionary conservation of co-expression between species can be used to predict protein function more reliably than co-expression in a single species. Here we examine whether co-expression across multiple species is also a better prioritizer of disease genes than is co-expression between human genes alone.

17.
Using a data set of protein translations associated with map positions in the human genome, we identified 1520 mapped highly conserved gene families. By comparing sharing of families between genomic windows, we identified 92 potentially duplicated blocks in the human genome containing 422 duplicated members of these families. Using branching order in the phylogenetic trees, we timed gene duplication events in these families relative to the primate-rodent divergence, the amniote-amphibian divergence, and the deuterostome-protostome divergence. The results showed similar patterns of gene duplication times within duplicated blocks and outside duplicated blocks. Both within and outside duplicated blocks, numerous duplications were timed prior to the deuterostome-protostome divergence, whereas others occurred after the amniote-amphibian divergence. Thus, neither gene duplication in general nor duplication of genomic blocks could be attributed entirely to polyploidization early in vertebrate history. The strongest signal in the data was a tendency for intrachromosomal duplications to be more recent than interchromosomal duplications, consistent with a model whereby tandem duplication, whether of single genes or of genomic blocks, may be followed by eventual separation of duplicates due to chromosomal rearrangements. The rate of separation of tandemly duplicated gene pairs onto separated chromosomes in the human lineage was estimated at 1.7 × 10⁻⁹ per gene-pair per year.

18.
19.
20.
A substantial proportion of human genes contain tissue-specifically DNA-methylated regions (TDMRs). However, little is known about the evolutionary conservation of differentially methylated loci, how they evolve, and the signals that regulate them. We have studied TDMR conservation in the PLG and TBX gene families and in 32 pseudogene–parental gene pairs. Among the members of the recently evolved PLG gene family, 5′-UTR methylation is conserved and inversely correlated with the cognate gene expression, indicating as well a conserved regulatory role of DNA methylation. Conversely, many genes of the much older TBX family display complementary tissue-specific methylation, suggesting an epigenetic complementation in the evolution of this gene family. Similar to gene families, unprocessed pseudogenes arose from gene duplications and we found TDMR conservation in some pseudogene–parental gene pairs displaying short evolutionary distances. However, for the majority of unprocessed pseudogenes and for all processed pseudogenes examined, we found that tissue-specific methylation arose de novo after gene duplication.
