Similar articles
 Found 20 similar articles (search time: 15 ms)
1.

Background  

Data from a number of completely sequenced eukaryotic genomes are available in the public domain. Eukaryotic genes are either 'intron containing' or 'intronless'. Eukaryotic 'intronless' genes are interesting datasets for comparative genomics and evolutionary studies. The SEGE database, which contains a collection of eukaryotic single exon genes, is available. However, SEGE is derived from GenBank. The redundant, incomplete and heterogeneous nature of GenBank data is a bottleneck for biological investigation in comparative genomics and evolutionary studies. Such studies often require representative gene sets from each genome, and this is possible only by deriving specific datasets from completely sequenced genome data. Thus Genome SEGE, a database for 'intronless' genes in completely sequenced eukaryotic genomes, has been constructed.
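As a rough illustration of what deriving an 'intronless' gene set from genome annotation involves, the sketch below keeps genes annotated with exactly one exon. It is not the Genome SEGE pipeline: the function name `single_exon_genes` and the `(gene_id, start, end)` record layout are assumptions made for this example, standing in for a real annotation file.

```python
from collections import defaultdict


def single_exon_genes(exon_records):
    """Return sorted IDs of genes annotated with exactly one exon.

    exon_records: iterable of (gene_id, start, end) tuples. This layout is
    a hypothetical, simplified stand-in for a genome annotation file.
    """
    exons_by_gene = defaultdict(set)
    for gene_id, start, end in exon_records:
        # A set drops duplicated annotation rows for the same exon.
        exons_by_gene[gene_id].add((start, end))
    return sorted(g for g, exons in exons_by_gene.items() if len(exons) == 1)


records = [
    ("geneA", 100, 900),
    ("geneB", 1200, 1500),
    ("geneB", 1800, 2100),
    ("geneC", 3000, 3600),
    ("geneC", 3000, 3600),  # redundant row for the same exon
]
print(single_exon_genes(records))  # ['geneA', 'geneC']
```

Collapsing duplicate rows before counting reflects the redundancy problem the abstract attributes to GenBank-derived data. A real pipeline would also need to handle incomplete records.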

2.
There are many ways to group completed genome sequences in hierarchical patterns (trees) reflecting relationships between their genes. Such groupings help us organize biological information and bear crucially on underlying processes of genome and organismal evolution. Genome trees make use of all comparable genes but can variously weight the contributions of these genes according to similarity, congruent patterns of similarity, or prevalence among genomes. Here we explore such possible weighting strategies in an analysis of 142 prokaryotic and 5 eukaryotic genomes. We demonstrate that alternative weighting strategies have different advantages, and we propose that each may have its specific uses in systematic or evolutionary biology. Comparisons of results obtained with different methods can provide further clues to major events and processes in genome evolution.

3.
4.
The aldehyde dehydrogenase (ALDH) superfamily represents a group of NAD(P)+-dependent enzymes that catalyze the oxidation of a wide spectrum of endogenous and exogenous aldehydes. With the advent of megabase genome sequencing, the ALDH superfamily is expanding rapidly on many fronts. As expected, ALDH genes are found in virtually all genomes analyzed to date, indicating the importance of these enzymes in biological functions. Complete genome sequences of various species have revealed additional ALDH genes. As of July 2000, the ALDH superfamily consists of 331 distinct genes, of which eight are found in archaea, 165 in eubacteria, and 158 in eukaryota. The number of ALDH genes in some species with their genomes completely sequenced and annotated, Escherichia coli and Caenorhabditis elegans, ranges from 10 to 17. In the human genome, 17 functional genes and three pseudogenes have been identified to date. Divergent evolution, based on multiple alignment analysis of 86 eukaryotic ALDH amino-acid sequences, was the basis of the standardized ALDH gene nomenclature system (Pharmacogenetics 9: 421-434, 1999). Thus far, the eukaryotic ALDHs comprise 20 gene families. A complete list of all ALDH sequences known to date is presented here along with the evolution analysis of the eukaryotic ALDHs.

5.
The aldehyde dehydrogenase (ALDH) superfamily represents a group of NAD(P)+-dependent enzymes that catalyze the oxidation of a wide spectrum of endogenous and exogenous aldehydes. With the advent of megabase genome sequencing, the ALDH superfamily is expanding rapidly on many fronts. As expected, ALDH genes are found in virtually all genomes analyzed to date, indicating the importance of these enzymes in biological functions. Complete genome sequences of various species have revealed additional ALDH genes. As of July 2000, the ALDH superfamily consists of 331 distinct genes, of which eight are found in archaea, 165 in eubacteria, and 158 in eukaryota. The number of ALDH genes in some species with their genomes completely sequenced and annotated, Escherichia coli and Caenorhabditis elegans, ranges from 10 to 17. In the human genome, 17 functional genes and three pseudogenes have been identified to date. Divergent evolution, based on multiple alignment analysis of 86 eukaryotic ALDH amino-acid sequences, was the basis of the standardized ALDH gene nomenclature system (Pharmacogenetics 9: 421–434, 1999). Thus far, the eukaryotic ALDHs comprise 20 gene families. A complete list of all ALDH sequences known to date is presented here along with the evolution analysis of the eukaryotic ALDHs.

6.
7.
Irimia M, Roy SW. PLoS Genetics 2008, 4(8): e1000148
The presence of spliceosomal introns in eukaryotes raises a range of questions about genomic evolution. Along with the fundamental mysteries of introns' initial proliferation and persistence, the evolutionary forces acting on intron sequences remain largely mysterious. Intron number varies across species from a few introns per genome to several introns per gene, and the elements of intron sequences directly implicated in splicing vary from degenerate to strict consensus motifs. We report a 50-species comparative genomic study of intron sequences across most eukaryotic groups. We find two broad and striking patterns. First, we find that some highly intron-poor lineages have undergone evolutionary convergence to strong 3' consensus intron structures. This finding holds for both branch point sequence and distance between the branch point and the 3' splice site. Interestingly, this difference appears to exist within the genomes of green algae of the genus Ostreococcus, which exhibit highly constrained intron sequences through most of the intron-poor genome, but not in one much more intron-dense genomic region. Second, we find evidence that ancestral genomes contained highly variable branch point sequences, similar to more complex modern intron-rich eukaryotic lineages. In addition, ancestral structures are likely to have included polyT tails similar to those in metazoans and plants, which we found in a variety of protist lineages. Intriguingly, intron structure evolution appears to be quite different across lineages experiencing different types of genome reduction: whereas lineages with very few introns tend towards highly regular intronic sequences, lineages with very short introns tend towards highly degenerate sequences.
Together, these results attest to the complex nature of ancestral eukaryotic splicing, the qualitatively different evolutionary forces acting on intron structures across modern lineages, and the impressive evolutionary malleability of eukaryotic gene structures.

8.
Human obesity is a main cause of morbidity and mortality. Recently, several studies have demonstrated an association between the FTO gene locus and early onset and severe obesity. To date, the FTO gene has only been discovered in vertebrates. We identified FTO homologs in the complete genome sequences of various evolutionarily diverse marine eukaryotic algae, ranging from unicellular photosynthetic picoplankton to a multicellular seaweed. However, FTO homologs appear to be absent from all other completely sequenced genomes of plants, fungi, and invertebrate animals. Although the biological roles of these marine algal FTO homologs are still unknown, these genes will be useful for exploring basic protein features and could hence help unravel the function of the FTO gene in vertebrates and its inferred link with obesity in humans.

9.
The current state of knowledge concerning the unsolved problem of the huge interspecific variation in eukaryotic genome size, which does not correlate with species phenotypic complexity (the C-value enigma, also known as the C-value paradox), is reviewed. Characteristic features of eukaryotic genome structure and molecular mechanisms that are the basis of genome size changes are examined in connection with the C-value enigma. It is emphasized that endogenous mutagens, including reactive oxygen species, create a constant nuclear environment where any genome evolves. An original quantitative model and general conception are proposed to explain the C-value enigma. In accordance with the theory, the noncoding sequences of the eukaryotic genome provide genes with global and differential protection against chemical mutagens and (in addition to the anti-mutagenesis and DNA repair systems) form a new, third system that protects eukaryotic genetic information. The joint action of these systems controls the spontaneous mutation rate in coding sequences of the eukaryotic genome. It is hypothesized that the genome size is inversely proportional to functional efficiency of the anti-mutagenesis and/or DNA repair systems in a particular biological species. In this connection, a model of eukaryotic genome evolution is proposed.

10.
The sequencing of eukaryotic genomes has lagged behind sequencing of organisms in the other domains of life, archaea and bacteria, primarily due to their greater size and complexity. With recent advances in high-throughput technologies such as robotics and improved computational resources, the number of eukaryotic genome sequencing projects has increased significantly. Among these are a number of sequencing projects of tropical pathogens of medical and veterinary importance, many of which are responsible for causing widespread morbidity and mortality in peoples of developing countries. Uncovering the complete gene complement of these organisms is proving to be of immense value in the development of novel methods of parasite control, such as antiparasitic drugs and vaccines, as well as the development of new diagnostic tools. Combining pathogen genome sequences with the host and vector genome sequences is promising to be a robust method for the identification of host-pathogen interactions. Finally, comparative sequencing of related species, especially of organisms used as model systems in the study of the disease, is beginning to realize its potential in the identification of genes, and the evolutionary forces that shape the genes, that are involved in evasion of the host immune response.

11.
Six DNA fragments of interphase chromosomes isolated from nuclear envelopes of murine hepatocytes were cloned and sequenced. Analysis of their structural-functional organization suggests that these fragments are highly specified protein-nonencoding fractions of a eukaryotic genome. In the evolutionary process, they appear already in archaebacteria and may be "ancestral" for DNA sequences involved in structuring chromosomal domains (rosette-like structures) of tissue-specific genes. In their composition, these fragments have nucleotide sequences homologous to the repeats of the SINE and LINE families and to the satellite DNA of murine centromeres.

12.
Complete genome sequences are accumulating rapidly, culminating with the announcement of the human genome sequence in February 2001. In addition to cataloguing the diversity of genes and other sequences, genome sequences will provide the first detailed and complete data on gene families and genome organization, including data on evolutionary changes. Reciprocally, evolutionary biology will make important contributions to the efforts to understand functions of genes and other sequences in genomes. Large-scale, detailed and unbiased comparisons between species will illuminate the evolution of genes and genomes, and population genetics methods will enable detection of functionally important genes or sequences, including sequences that have been involved in adaptive changes.

13.
Rich in repeated DNA sequences and poor in genes, the heterochromatin is an important functional part of the eukaryotic genome. Heterochromatin exhibits high evolutionary variability, which was revealed on the cytological and molecular levels in malarial mosquito species from the Anopheles maculipennis complex. In this connection, investigation of the heterochromatin molecular composition in species of this complex is of interest.

14.
15.
Analysis of evolution of exon-intron structure of eukaryotic genes   Total citations: 10, self-citations: 0, citations by others: 10
The availability of multiple, complete eukaryotic genome sequences allows one to address many fundamental evolutionary questions on genome scale. One such important, long-standing problem is evolution of exon-intron structure of eukaryotic genes. Analysis of orthologous genes from completely sequenced genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists. The data on shared and lineage-specific intron positions were used as the starting point for evolutionary reconstruction with parsimony and maximum-likelihood approaches. Parsimony methods produce reconstructions with intron-rich ancestors but also infer lineage-specific, in many cases, high levels of intron loss and gain. Different probabilistic models gave opposite results, apparently depending on model parameters and assumptions, from domination of intron loss, with extremely intron-rich ancestors, to dramatic excess of gains, to the point of denying any true conservation of intron positions among deep eukaryotic lineages. Development of models with adequate, realistic parameters and assumptions seems to be crucial for obtaining more definitive estimates of intron gain and loss in different eukaryotic lineages. Many shared intron positions were detected in ancestral eukaryotic paralogues which evolved by duplication prior to the divergence of extant eukaryotic lineages. These findings indicate that numerous introns were present in eukaryotic genes already at the earliest stages of evolution of eukaryotes and are compatible with the hypothesis that the original, catastrophic intron invasion accompanied the emergence of the eukaryotic cells. Comparison of various features of old and younger introns starts shedding light on probable mechanisms of intron insertion, indicating that propagation of old introns is unlikely to be a major mechanism for origin of new ones. 
The existence and structure of ancestral protosplice sites were addressed by examining the context of introns inserted within codons that encode amino acids conserved in all eukaryotes and, accordingly, are not subject to selection for splicing efficiency. It was shown that introns indeed predominantly insert into or are fixed in specific protosplice sites which have the consensus sequence (A/C)AG|Gt.
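To make the consensus (A/C)AG|Gt concrete, here is a minimal sketch that scans a DNA string for candidate protosplice sites and reports the position marked by "|", where an intron would sit. It is an illustration only, not the analysis used in the paper. The function name `protosplice_positions` is hypothetical, and treating the lower-case t as a required base is a simplifying assumption.

```python
import re

# Zero-width lookahead so that overlapping candidates are all reported.
# Assumption: the weakly conserved 't' of (A/C)AG|Gt is treated as required.
PROTOSPLICE = re.compile(r"(?=[AC]AGGT)")


def protosplice_positions(seq):
    """Return 0-based indices of the base just after each '|' insertion point."""
    seq = seq.upper()
    # The motif starts at m.start(); the '|' falls after the 3-base (A/C)AG.
    return [m.start() + 3 for m in PROTOSPLICE.finditer(seq)]


print(protosplice_positions("TTCAGGTAAAAGGTC"))  # [5, 12]
```

In the example, CAG|GT starts at index 2 and AAG|GT at index 9, so the reported insertion points are 5 and 12.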

16.
SUMMARY: Many biological papers describe short, functional DNA sites without specifying their exact positions in the genome. We have developed a Web server that automates the tedious task of locating such sites in eukaryotic genomes, thus giving access to the context of rich annotations that are increasingly available for genome sequences. AVAILABILITY: http://zlab.bu.edu/site2genome/
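The core task described here, locating a short site within a genome sequence, can be sketched as an exact-match search on both strands. This is not the server's actual method, which the abstract does not describe. The function names `reverse_complement` and `locate_site` are hypothetical, and the sketch assumes an unambiguous A/C/G/T alphabet with exact matching only.

```python
def reverse_complement(seq):
    """Reverse complement of an A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def locate_site(genome, site):
    """Return sorted (start, strand) pairs for exact matches of site.

    start is the 0-based forward-strand coordinate of the leftmost base of
    the match; strand is '+' or '-'.
    """
    genome, site = genome.upper(), site.upper()
    patterns = [("+", site)]
    rc = reverse_complement(site)
    if rc != site:  # a palindromic site would otherwise be reported twice
        patterns.append(("-", rc))
    hits = []
    for strand, pattern in patterns:
        start = genome.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


print(locate_site("AAGATTCCGGAATCTT", "GATTC"))  # [(2, '+'), (9, '-')]
```

Reporting forward-strand coordinates for both strands keeps the hits directly comparable with annotation tracks, which is the context the abstract says such a lookup is meant to unlock.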

17.
Origin and evolution of spliceosomal introns   Total citations: 1, self-citations: 0, citations by others: 1
ABSTRACT: Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded 'introns first', held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes.
There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence but exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor of evolution throughout the history of eukaryotes. This article was reviewed by I. King Jordan, Manuel Irimia (nominated by Anthony Poole), Tobias Mourier (nominated by Anthony Poole), and Fyodor Kondrashov. For the complete reports, see the Reviewers' Reports section.

18.
Pseudogenes   Total citations: 8, self-citations: 0, citations by others: 8

19.
20.