Similar articles
 20 similar articles found; search took 15 ms
1.

Background

Most genes in Arabidopsis thaliana are members of gene families. How do the members of gene families arise, and how are gene family copy numbers maintained? Some gene families may evolve primarily through tandem duplication and high rates of birth and death in clusters, and others through infrequent polyploidy or large-scale segmental duplications and subsequent losses.

Results

Our approach to understanding the mechanisms of gene family evolution was to construct phylogenies for 50 large gene families in Arabidopsis thaliana, identify large internal segmental duplications in Arabidopsis, map gene duplications onto the segmental duplications, and use this information to identify which nodes in each phylogeny arose due to segmental or tandem duplication. Examples of six gene families exemplifying characteristic modes are described. Distributions of gene family sizes and patterns of duplication by genomic distance are also described in order to characterize patterns of local duplication and copy number for large gene families. Both gene family size and duplication by distance closely follow power-law distributions.

Conclusions

Combining information about genomic segmental duplications, gene family phylogenies, and gene positions provides a method to evaluate contributions of tandem duplication and segmental genome duplication in the generation and maintenance of gene families. These differences appear to correspond meaningfully to differences in functional roles of the members of the gene families.

2.
BACKGROUND: Proteins convey the majority of biochemical and cellular activities in organisms. Over the course of evolution, proteins undergo normal sequence mutations as well as large-scale mutations involving domain duplication and/or domain shuffling. These events result in the generation of new proteins and protein families. Processes that affect proteome evolution drive species diversity and adaptation. Herein, change over the course of metazoan evolution, as defined by birth/death and duplication/deletion events within protein families and domains, was examined using the proteomes of nine metazoan and two outgroup species. RESULTS: In studying members of the three major metazoan groups, the vertebrates, arthropods, and nematodes, we found that the number of protein families increased at the majority of lineages over the course of metazoan evolution, and that the magnitude of these increases was greatest at the lineages leading to mammals. In contrast, the number of protein domains decreased at most lineages and at all terminal lineages. This resulted in a weak correlation between protein family birth and domain birth; however, the correlation between domain birth and domain member duplication was quite strong. These data suggest that domain birth and protein family birth occur via different mechanisms, and that domain shuffling plays a role in the formation of protein families. The ratio of protein family birth to protein domain birth (the domain shuffling index) suggests that shuffling had a more demonstrable effect on protein families in nematodes and arthropods than in vertebrates. Through the contrast of high and low domain shuffling indices at the lineages of Trichinella spiralis and Gallus gallus, we propose a link between protein redundancy and evolutionary changes controlled by domain shuffling; however, the speed of adaptation among the different lineages was relatively invariant.
Evaluating the functions of protein families that appeared or disappeared at the last common ancestors (LCAs) of the three metazoan clades supports a correlation with organism adaptation. Furthermore, bursts of new protein families and domains in the LCAs of metazoans and vertebrates are consistent with whole genome duplications. CONCLUSION: Metazoan speciation and adaptation were explored by birth/death and duplication/deletion events among protein families and domains. Our results provide insights into protein evolution and its bearing on metazoan evolution.
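
The domain shuffling index defined in this abstract is a simple ratio, so it can be illustrated in a few lines. The following is a minimal sketch only: the function name, the lineage labels, and the counts are hypothetical and are not taken from the study.

```python
def domain_shuffling_index(family_births: int, domain_births: int) -> float:
    """Ratio of protein family births to protein domain births on one lineage."""
    if domain_births <= 0:
        raise ValueError("domain_births must be positive")
    return family_births / domain_births


# Hypothetical counts for illustration only (not data from the abstract).
example_lineages = {"lineage_A": (120, 15), "lineage_B": (40, 20)}
indices = {
    name: domain_shuffling_index(families, domains)
    for name, (families, domains) in example_lineages.items()
}
```

Under this definition a higher index means more new families per new domain, which the abstract reads as a stronger contribution of shuffling existing domains.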

3.
Many genes exist in the form of families; however, little is known about their size variation, evolution and biology. Here, we present the size variation and evolution of the nucleotide-binding site (NBS)-encoding gene family and receptor-like kinase (RLK) gene family in Oryza, Glycine and Gossypium. The sizes of both families vary several-fold, not only among species but, surprisingly, also within a species. The size variations of the gene families are shown to correlate with each other, indicating that they interact, and to be driven by natural selection, artificial selection and genome size variation, but likely not by polyploidization. The numbers of genes in the families in a polyploid species are similar to those of one of its diploid donors, suggesting that polyploidization plays little role in the expansion of the gene families and that organisms tend not to maintain their ‘surplus’ genes in the course of evolution. Furthermore, the size variations of both gene families are found to be associated with the organisms’ phylogeny, suggesting their roles in speciation and evolution. Since both selection and speciation act on an organism’s morphological, physiological and biological variation, our results indicate that the variation of gene family size provides a source of genetic variation and evolution.

4.
One of the unique insights provided by the growing number of fully sequenced genomes is the pervasiveness of gene duplication and gene loss. Indeed, several metrics now suggest that rates of gene birth and death per gene are only 10–40% lower than nucleotide substitutions per site, and that per nucleotide, the consequent lineage-specific expansion and contraction of gene families may play at least as large a role in adaptation as changes in orthologous sequences. While gene family evolution is pervasive, it may be especially important in our own evolution since it appears that the “revolving door” of gene duplication and loss has undergone multiple accelerations in the lineage leading to humans. In this paper, we review current understanding of gene family evolution including: methods for inferring copy number change, evidence for adaptive expansion and adaptive contraction of gene families, the origins of new families and deaths of previously established ones, and finally we conclude with a perspective on challenges and promising directions for future research.

5.

Background  

Previous studies in Ascomycetes have shown that the function of gene families that are considerably larger in extant pathogens than in non-pathogens could be related to pathogenicity traits. However, by only comparing gene inventories in extant species, no insights can be gained into the evolutionary process that gave rise to these larger family sizes in pathogens. Moreover, most studies that consider gene families in extant species only tend to explain observed differences in gene family sizes by gains rather than by losses, thereby largely underestimating the impact of gene loss during genome evolution.

6.
We introduce and analyze a simple discrete probabilistic model of genome evolution. It is based on four fundamental evolutionary events: gene duplication, loss, accumulated change and innovation. We call it the DLCI model. It is the first such model rigorously analyzed. The focus of the paper is on the size distribution of gene families. We derive formulae for the equilibrium of gene family sizes, and show that they follow a logarithmic distribution. We also consider a disjoint union of DLCI models and present empirical results for bacterial genomes.
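
For readers unfamiliar with the logarithmic (log-series) distribution named in this abstract, the sketch below evaluates its standard probability mass function, P(k) = -p^k / (k ln(1 - p)) for k = 1, 2, .... This is not the DLCI derivation itself: the abstract does not say how the shape parameter relates to the duplication, loss, change and innovation rates, so `p` is treated here as a free parameter, and the function names are assumptions.

```python
import math
from typing import List


def log_series_pmf(k: int, p: float) -> float:
    """Probability that a family has exactly k members under a log-series law."""
    if k < 1:
        raise ValueError("family size k must be at least 1")
    if not 0.0 < p < 1.0:
        raise ValueError("shape parameter p must lie strictly between 0 and 1")
    return -(p ** k) / (k * math.log(1.0 - p))


def expected_family_counts(n_families: int, p: float, max_size: int) -> List[float]:
    """Expected number of families of each size 1..max_size among n_families."""
    return [n_families * log_series_pmf(k, p) for k in range(1, max_size + 1)]
```

Comparing `expected_family_counts` with an observed size histogram is one simple way to check visually whether a genome's families look log-series distributed; values of `p` closer to 1 give a heavier tail of large families.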

7.
Several isolates of the marine cyanobacterial genus Prochlorococcus have smaller genome sizes than those of the closely related genus Synechococcus. In order to test whether loss of protein-coding genes has contributed to genome size reduction in Prochlorococcus, we reconstructed events of gene family evolution over a strongly supported phylogeny of 12 Prochlorococcus genomes and 9 Synechococcus genomes. Significantly more events, both of loss of paralogs within gene families and of loss of entire gene families, occurred in Prochlorococcus than in Synechococcus. The number of nonancestral gene families in genomes of both genera was positively correlated with the extent of genomic islands (GIs), consistent with the hypothesis that horizontal gene transfer (HGT) is associated with GIs. However, even when only isolates with comparable extents of GIs were compared, significantly more events of gene family loss and of paralog loss were seen in Prochlorococcus than in Synechococcus, implying that HGT is not the primary reason for the genome size difference between the two genera.

8.
MOTIVATION: Gene duplications and losses (GDLs) are important events in genome evolution. They result in expansion or contraction of gene families, with a likely role in phenotypic evolution. As more genomes become available and their annotations are improved, software programs capable of rapidly and accurately identifying the content of ancestral genomes and the timings of GDLs become necessary to understand the unique evolution of each lineage. RESULTS: We report EvolMAP, a new algorithm and software that utilizes a species tree-based gene clustering method to join all-to-all symmetrical similarity comparisons of multiple gene sets in order to infer the gene composition of multiple ancestral genomes. The algorithm further uses Dollo parsimony-based comparison of the inferred ancestral genes to pinpoint the timings of GDLs onto evolutionary intervals marked by speciation events. Using EvolMAP, we first analyzed the expansion of four families of G-protein coupled receptors (GPCRs) within animal lineages. In addition to demonstrating the unique expansion tree for each family, the results show that the ancestral eumetazoan genome contained many fewer GPCRs than modern animals, and that these families expanded through concurrent lineage-specific duplications. Second, we analyzed the history of GDLs in mammalian genomes by comparing seven proteomes. In agreement with previous studies, we report that mammalian gene family sizes have changed drastically through their evolution. Interestingly, although we identified a potential source of duplication for 75% of the gained genes, the remaining 25% did not have clear-cut sources, revealing thousands of genes that have likely gained their distinct sequence identities within the descent of mammals. AVAILABILITY: Query server, source code and executable are available at http://kosik-web.mcdb.ucsb.edu/evolmap/index.htm .
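
The Dollo parsimony step mentioned in this abstract can be illustrated with a generic presence/absence sketch. Under Dollo parsimony a gene is gained exactly once, at the last common ancestor of all species that carry it, and every absence inside that clade is explained by a loss. The code below is not EvolMAP's implementation: the tree encoding, the function name, and the example taxa are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

Tree = Dict[str, List[str]]  # node name -> child names; leaves map to []


def dollo_events(tree: Tree, root: str, present: Set[str]) -> Tuple[Optional[str], List[str]]:
    """Return (gain_node, loss_nodes) for one gene under Dollo parsimony.

    gain_node is the deepest node whose subtree holds every leaf carrying the
    gene; loss_nodes are the tops of subtrees inside that clade with no carrier.
    """
    counts: Dict[str, int] = {}

    def count(node: str) -> int:
        # Number of leaves below `node` that carry the gene.
        kids = tree.get(node, [])
        counts[node] = sum(count(k) for k in kids) if kids else int(node in present)
        return counts[node]

    total = count(root)
    if total == 0:
        return None, []  # the gene is absent everywhere: no gain to place

    # Walk down from the root while a single child still holds all carriers.
    gain = root
    while True:
        holders = [k for k in tree.get(gain, []) if counts[k] == total]
        if not holders:
            break
        gain = holders[0]

    # Inside the gain clade, each maximal carrier-free subtree is one loss.
    losses: List[str] = []
    stack = [gain]
    while stack:
        node = stack.pop()
        for kid in tree.get(node, []):
            if counts[kid] == 0:
                losses.append(kid)
            else:
                stack.append(kid)
    return gain, sorted(losses)


# Hypothetical species tree and gene presence, for illustration only.
example_tree: Tree = {
    "bilateria": ["amniotes", "fly"],
    "amniotes": ["mammals", "birds"],
    "mammals": ["human", "mouse"],
    "birds": ["chicken", "finch"],
    "human": [], "mouse": [], "chicken": [], "finch": [], "fly": [],
}
gain, losses = dollo_events(example_tree, "bilateria", {"human", "chicken", "finch"})
```

In this example the gain is placed on the branch leading to `amniotes` and a single loss on the branch leading to `mouse`; the absence in `fly` needs no loss because it lies outside the clade where the gene arose.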

9.
Evolution of plant microRNA gene families   Cited by: 3 (self-citations: 0, citations by others: 3)
Li A, Mao L. Cell Research, 2007, 17(3): 212-218

10.
Words are built from smaller meaning-bearing parts, called morphemes. Just as one word can contain multiple morphemes, one morpheme can be present in different words. The number of distinct words a morpheme can be found in is its family size. Here we used Birth-Death-Innovation Models (BDIMs) to analyze the distribution of morpheme family sizes in English and German vocabulary over the last 200 years. Rather than just fitting to a probability distribution, these mechanistic models allow for the direct interpretation of identified parameters. Despite the complexity of language change, we indeed found that a specific variant of this purely stochastic model, the second-order linear balanced BDIM, significantly fitted the observed distributions. In this model, birth and death rates are increased for smaller morpheme families. This finding indicates an influence of morpheme family sizes on vocabulary changes. This could be an effect of word formation, perception or both. On a more general level, we give an example of how mechanistic models can enable the identification of statistical trends in language change usually hidden by cultural influences.

11.
12.
McBride CS, Arguello JR, O'Meara BC. Genetics, 2007, 177(3): 1395-1416
The insect chemoreceptor superfamily comprises the olfactory receptor (Or) and gustatory receptor (Gr) multigene families. These families give insects the ability to smell and taste chemicals in the environment and are thus rich resources for linking molecular evolutionary and ecological processes. Although dramatic differences in family size among distant species and high divergence among paralogs have led to the belief that the two families evolve rapidly, a lack of evolutionary data over short time scales has frustrated efforts to identify the major forces shaping this evolution. Here, we investigate patterns of gene loss/gain, divergence, and polymorphism in the entire repertoire of approximately 130 chemoreceptor genes from five closely related species of Drosophila that share a common ancestor within the past 12 million years. We demonstrate that the overall evolution of the Or and Gr families is nonneutral. We also show that selection regimes differ both between the two families as wholes and within each family among groups of genes with varying functions, patterns of expression, and phylogenetic histories. Finally, we find that the independent evolution of host specialization in Drosophila sechellia and D. erecta is associated with a fivefold acceleration of gene loss and increased rates of amino acid evolution at receptors that remain intact. Gene loss appears to primarily affect Grs that respond to bitter compounds while elevated Ka/Ks is most pronounced in the subset of Ors that are expressed in larvae. Our results provide strong evidence that the observed phenomena result from the invasion of a novel ecological niche and present a unique synthesis of molecular evolutionary analyses with ecological data.

13.
MOTIVATION: The distributions of many genome-associated quantities, including the membership of paralogous gene families, can be approximated with power laws. We are interested in developing mathematical models of genome evolution that adequately account for the shape of these distributions and describe the evolutionary dynamics of their formation. RESULTS: We show that simple stochastic models of genome evolution lead to power-law asymptotics of protein domain family size distribution. These models, called Birth, Death and Innovation Models (BDIM), represent a special class of balanced birth-and-death processes, in which domain duplication and deletion rates are asymptotically equal up to the second order. The simplest, linear BDIM shows an excellent fit to the observed distributions of domain family size in diverse prokaryotic and eukaryotic genomes. However, the stochastic version of the linear BDIM explored here predicts that the actual size of large paralogous families is reached on an unrealistically long timescale. We show that introduction of non-linearity, which might be interpreted as interaction of a particular order between individual family members, allows the model to achieve genome evolution rates that are far more compatible with the current estimates of the rates of individual duplication/loss events.
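
How a balanced linear birth-and-death chain produces a power-law tail can be sketched numerically. The abstract does not give the rate formulas, so the sketch assumes per-family rates of the form birth(i) = birth_scale * (i + a) and death(i) = death_scale * (i + b) for a family of size i, with equal scales in the balanced case. The parameters `a`, `b`, the truncation at `max_size`, and the function name are all assumptions. The stationary shape then follows from detailed balance between neighbouring size classes, p_i * death(i) = p_{i-1} * birth(i-1).

```python
import math
from typing import List


def linear_bdim_stationary(a: float, b: float, max_size: int,
                           birth_scale: float = 1.0,
                           death_scale: float = 1.0) -> List[float]:
    """Stationary family-size probabilities p_1..p_max_size (truncated chain)."""
    if a <= -1.0 or b <= -1.0:
        raise ValueError("a and b must exceed -1 so that all rates stay positive")
    if max_size < 1:
        raise ValueError("max_size must be at least 1")

    # Accumulate log-weights to avoid underflow for large family sizes.
    log_weights = [0.0]
    for i in range(2, max_size + 1):
        ratio = (birth_scale * (i - 1 + a)) / (death_scale * (i + b))
        log_weights.append(log_weights[-1] + math.log(ratio))

    peak = max(log_weights)
    weights = [math.exp(w - peak) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative parameters only; with equal scales the weights behave like
# Gamma(i + a) / Gamma(i + 1 + b), i.e. roughly i ** (a - b - 1) for large i.
sizes = linear_bdim_stationary(a=2.0, b=3.5, max_size=1000)
tail_slope = math.log(sizes[999] / sizes[499]) / math.log(2.0)
```

With these illustrative values the log-log slope of the tail comes out close to a - b - 1 = -2.5. Innovation (the arrival of new families in size class 1) only affects the overall normalisation in this simplified view, which is why it does not appear as a separate parameter here.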

14.
This article deals with the theoretical size distribution of gene and protein families in complete genomes. A simple evolutionary model for the development of such families, in which genes in a family are formed or selected against independently and at random, and in which new families are formed by the random splitting of existing families, is used to derive the resulting size distribution. Mathematically this turns out to be the distribution of the state of a homogeneous birth-and-death process after an exponentially distributed time, which is shown to exhibit, under certain conditions, the power-law behaviour observed for gene and protein family sizes.

15.
The evolution of the mouse immunoglobulin heavy chain variable region (Igh-V) locus was investigated by the comprehensive analysis of variable region (Vh) gene family content and restriction fragment polymorphism in the genus Mus. The examination of natural Mus domesticus populations suggests an important role for recombination in the generation of the considerable restriction fragment polymorphism found at the Igh-V locus. Although the sizes of individual Vh gene families vary widely both within and between different Mus species, evolutionary trends of Vh gene family copy number are revealed by the analysis of homologues of mouse Vh gene families in Rattus and Peromyscus. Processes of duplication, deletion, and sequence divergence all contribute to the evolution of Vh gene copy number. Certain Vh gene families have expanded or contracted differently in the various muroid lineages examined. Collectively, these findings suggest that the evolution of individual Vh family size is not driven by strong selective pressure but is relatively neutral, and that gene flow, rather than selection, serves to maintain the high level of restriction fragment polymorphism seen in M. domesticus.

16.
17.
Many testis-specific genes from the sex chromosomes are subject to rapid evolution, which can make it difficult to identify murine genes in the human genome. The murine CYPT gene family includes 15 members, but orthologs were undetectable in the human genome. However, using a refined homology search, sequences corresponding to the shared promoter region of the CYPT family were identified at 39 loci. Most loci were located immediately upstream of genes belonging to the VCX/Y, SPANX, or CSAG gene families. Sequence comparison of the loci revealed a conserved CYPT promoter-like (CPL) element featuring TATA and CCAAT boxes. The expression of members of the three families harboring the CPL resembled the murine expression of the CYPT family, with weak expression in late pachytene spermatocytes and predominant expression in spermatids, but some genes were also weakly expressed in somatic cells and in other germ cell types. The genomic regions harboring the gene families were rich in direct and inverted segmental duplications (SD), which may facilitate gene conversion and rapid evolution. The conserved CPL and the common expression profiles suggest that the human VCX/Y, SPANX, and CSAG2 gene families together with the murine SPANX gene and the CYPT family may share a common ancestor. Finally, we present evidence that VCX/Y and SPANX may be paralogs with a similar protein structure consisting of C-terminal acidic repeats of variable lengths.

18.
Analysis of increasingly saturated sequence databases has shown that gene family sizes are highly skewed, with many families being small and few containing many far-diverged homologs. Additionally, recently published results have identified a structural determinant of mutational plasticity, designability, that correlates strongly with gene family size. In this paper, we explore the possible links between the two observations, examining the possible effect of designability on duplication and divergence. We show that designability has the inverse of the expected relationship with strength of selection: more designable domains, which should have more mutational plasticity, evolve more slowly. However, we also present evidence that recently duplicated genes have a variable probability of locus fixation that is correlated with strength of selection. As expected, paralogs under stronger evolutionary pressure have a lower failure rate. Finally, we show that the probability of pseudogene formation from gene duplication can be directly tied to the designability and functional flexibility of the family. We present evidence that gene families with higher designability have diverged farther because of a lower probability of pseudogenization. Additionally, mutational plasticity may play an integral role by influencing the pseudogenization rate. Either way, we show that considering the failure rate of duplications is integral to understanding the determinants and dynamics of molecular evolution.

19.
Life history patterns in lizards of the arid and semiarid zone of Australia   Cited by: 1 (self-citations: 1, citations by others: 0)
Klaus Henle. Oecologia, 1991, 88(3): 347-358
Summary: Studies on the life histories and population dynamics of lizards in the semiarid/arid zone of Australia are reviewed to identify the influence of size (female mean snout-vent length), phylogeny (family effects) and ecological parameters on the evolution of life history traits of these species. Species producing more than one clutch per year are larger than single-clutched ones. In an ANCOVA, significant effects of size and phylogeny on clutch size and on age at sexual maturity were found. Microhabitat (arboreal, terrestrial, and subterranean life style) also had an effect on clutch size, but only mediated through a significant interaction with size. However, results of the ANCOVAs depend on the families and ecological parameters included in the analyses. Therefore, caution is necessary in interpreting or generalizing the results; in any case, size and phylogeny explain only a small percentage of the observed variation. Nevertheless, a direct comparison of a set of syntopic/paratopic desert lizards supports and extends the main conclusions of the ANCOVA. A significant but small phylogenetic effect was found, and arboreal microhabitat was associated with greater age at sexual maturity. Activity (diurnal versus nocturnal) influenced yearly mortalities and clutch frequencies. For both microhabitat and activity, predation levels and size-dependent mortality were the likely selective factors causing these correlations. The demographic environment explains the paucity of diurnal lizard species with fixed clutch sizes in the semiarid/arid zone of Australia. Possible causes for the evolution of fixed clutch sizes are discussed.

20.
We surveyed the molecular evolutionary characteristics of 25 plant gene families, with the goal of better understanding general processes in plant gene family evolution. The survey was based on 247 GenBank sequences representing four grass species (maize, rice, wheat, and barley). For each gene family, orthology and paralogy relationships were uncertain. Recognizing this uncertainty, we characterized the molecular evolution of each gene family in four ways. First, we calculated the ratio of nonsynonymous to synonymous substitutions (dN/dS) both on branches of gene phylogenies and across codons. Our results indicated that the dN/dS ratio was statistically heterogeneous across branches in 17 of 25 (68%) gene families. The vast majority of dN/dS estimates were <<1.0, suggestive of selective constraint on amino acid replacements, and no estimates were >1.0, either across phylogenetic lineages or across codons. Second, we tested separately for nonsynonymous and synonymous molecular clocks. Sixty-eight percent of gene families rejected a nonsynonymous molecular clock, and 52% of gene families rejected a synonymous molecular clock. Thus, most gene families in this study deviated from clock-like evolution at either synonymous or nonsynonymous sites. Third, we calculated the effective number of codons and the proportion of G+C synonymous sites for each sequence in each gene family. One or both quantities vary significantly within 18 of 25 gene families. Finally, we tested for gene conversion, and only six gene families provided evidence of gene conversion events. Altogether, evolution for these 25 gene families is marked by selective constraint that varies among gene family members, a lack of molecular clock at both synonymous and nonsynonymous sites, and substantial variation in codon usage. Received: 25 May 2000 / Accepted: 16 October 2000
