Similar articles
 20 similar articles found (search time: 296 ms)
1.

Background  

Fibroblast Growth Factors (FGF) and their receptors are well known for their major roles in the cell signalling that controls embryonic development. Recently, a gene coding for a protein closely related to FGFRs (Fibroblast Growth Factor Receptors), called FGFR5 or FGFR-like 1 (FGFRL1), has been described in vertebrates. An orthologous gene was also found in the cephalochordate amphioxus, but no orthologous genes were found by the authors in other non-vertebrate species, even though an FGFRL1 gene was identified in the sea urchin genome, as well as a closely related gene, named nou-darake, in the planarian Dugesia japonica. These intriguing data of a deuterostome-specific gene that might be implicated in FGF signalling prompted us to search for putative FGFRL1 orthologues in the completely sequenced genomes of metazoans.

2.
The COG database: an updated version includes eukaryotes (cited 4 times: 0 self-citations, 4 by others)

Background

The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies.

Results

We describe here a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi. The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or ~54% of the analyzed eukaryotic 110,655 gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included into the KOGs; addition of new eukaryotic genomes is expected to result in substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of ~20% of the KOG set. This conserved portion of the KOG set is much greater than the ubiquitous portion of the COG set (~1% of the COGs). In part, this difference is probably due to the small number of included eukaryotic genomes, but it could also reflect the relative compactness of eukaryotes as a clade and the greater evolutionary stability of eukaryotic genomes.
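The coverage figures quoted above follow directly from the protein counts given in the same paragraph. A minimal check, using only those counts (the function and variable names are illustrative and not from the source):

```python
# Protein counts quoted in the abstract above.
cog_proteins, unicellular_total = 138_458, 185_505
kog_proteins, eukaryote_total = 59_838, 110_655


def coverage(included: int, total: int) -> float:
    """Percentage of predicted proteins assigned to an orthologous cluster."""
    return 100.0 * included / total


cog_coverage = coverage(cog_proteins, unicellular_total)
kog_coverage = coverage(kog_proteins, eukaryote_total)

print(f"COG coverage: {cog_coverage:.1f}%")
print(f"KOG coverage: {kog_coverage:.1f}%")
```

Both values round to the percentages stated in the abstract (75% and ~54%).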

Conclusion

The updated collection of orthologous protein sets for prokaryotes and eukaryotes is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies.

3.

Background  

When analyzing protein sequences using sequence similarity searches, orthologous sequences (that diverged by speciation) are more reliable predictors of a new protein's function than paralogous sequences (that diverged by gene duplication). The utility of phylogenetic information in high-throughput genome annotation ("phylogenomics") is widely recognized, but existing approaches are either manual or not explicitly based on phylogenetic trees.

4.

Background  

Gastropod mitochondrial genomes exhibit an unusually great variety of gene orders compared to other metazoan mitochondrial genomes, such as those of vertebrates. Hence, gastropod mitochondrial genomes constitute a good model system to study patterns, rates, and mechanisms of mitochondrial genome rearrangement. However, this kind of evolutionary comparative analysis requires a robust phylogenetic framework of the group under study, which has been elusive so far for gastropods in spite of the efforts carried out during the last two decades. Here, we report the complete nucleotide sequence of five mitochondrial genomes of gastropods (Pyramidella dolabrata, Ascobulla fragilis, Siphonaria pectinata, Onchidella celtica, and Myosotella myosotis), and we analyze them together with another ten complete mitochondrial genomes of gastropods currently available in molecular databases in order to reconstruct the phylogenetic relationships among the main lineages of gastropods.

5.

Background

Plant disease resistance (R) genes with the nucleotide binding site (NBS) play an important role in offering resistance to pathogens. The availability of complete genome sequences of Brassica oleracea and Brassica rapa provides an important opportunity for researchers to identify and characterize NBS-encoding R genes in Brassica species and to compare with analogues in Arabidopsis thaliana based on a comparative genomics approach. However, little is known about the evolutionary fate of NBS-encoding genes in the Brassica lineage after split from A. thaliana.

Results

Here we present genome-wide analysis of NBS-encoding genes in B. oleracea, B. rapa and A. thaliana. Through the employment of HMM search and manual curation, we identified 157, 206 and 167 NBS-encoding genes in B. oleracea, B. rapa and A. thaliana genomes, respectively. Phylogenetic analysis among 3 species classified NBS-encoding genes into 6 subgroups. Tandem duplication and whole genome triplication (WGT) analyses revealed that after WGT of the Brassica ancestor, NBS-encoding homologous gene pairs on triplicated regions in Brassica ancestor were deleted or lost quickly, but NBS-encoding genes in Brassica species experienced species-specific gene amplification by tandem duplication after divergence of B. rapa and B. oleracea. Expression profiling of NBS-encoding orthologous gene pairs indicated the differential expression pattern of retained orthologous gene copies in B. oleracea and B. rapa. Furthermore, evolutionary analysis of CNL type NBS-encoding orthologous gene pairs among 3 species suggested that orthologous genes in B. rapa have undergone stronger negative selection than those in B. oleracea. For the TNL type, however, there are no significant differences in the orthologous gene pairs between the two species.

Conclusion

This study is the first identification and characterization of NBS-encoding genes in B. rapa and B. oleracea based on whole genome sequences. Through tandem duplication and whole genome triplication analysis in B. oleracea, B. rapa and A. thaliana genomes, our study provides insight into the evolutionary history of NBS-encoding genes after divergence of A. thaliana and the Brassica lineage. These results, together with expression pattern analysis of NBS-encoding orthologous genes, provide a useful resource for functional characterization of these genes and genetic improvement of relevant crops.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-3) contains supplementary material, which is available to authorized users.

6.
7.

Background  

Gene duplication and gene loss during the evolution of eukaryotes have hindered attempts to estimate phylogenies and divergence times of species. Although current methods that identify clusters of orthologous genes in complete genomes have helped to investigate gene function and gene content, they have not been optimized for evolutionary sequence analyses requiring strict orthology and complete gene matrices. Here we adopt a relatively simple and fast genome comparison approach designed to assemble orthologs for evolutionary analysis. Our approach identifies single-copy genes representing only species divergences (panorthologs) in order to minimize potential errors caused by gene duplication. We apply this approach to complete sets of proteins from published eukaryote genomes specifically for phylogeny and time estimation.
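The abstract does not describe the authors' pipeline, so the following is only a sketch of the stated criterion: keep a cluster when every species contributes exactly one gene. It assumes ortholog clusters have already been built by some other tool. The data layout, the function name `panorthologs`, and the toy clusters are all hypothetical.

```python
from collections import Counter


def panorthologs(clusters, species):
    """Keep clusters that contain exactly one gene from every species."""
    wanted = set(species)
    kept = {}
    for cluster_id, genes in clusters.items():
        # Count how many genes each species contributes to this cluster.
        counts = Counter(sp for sp, _gene in genes)
        if set(counts) == wanted and all(n == 1 for n in counts.values()):
            kept[cluster_id] = genes
    return kept


# Toy input (hypothetical): cluster id -> list of (species, gene id).
species = ["human", "fly", "yeast"]
clusters = {
    "c1": [("human", "h1"), ("fly", "f1"), ("yeast", "y1")],
    "c2": [("human", "h2"), ("human", "h3"), ("fly", "f2"), ("yeast", "y2")],
    "c3": [("human", "h4"), ("fly", "f3")],
}

single_copy = panorthologs(clusters, species)
print(sorted(single_copy))
```

In this toy input, `c2` is rejected because of the human duplicate and `c3` because yeast is missing, which is how the filter yields the complete, strictly orthologous gene matrix the abstract calls for.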

8.

Background  

The process of horizontal gene transfer (HGT) is believed to be widespread in Bacteria and Archaea, but little comparative data is available addressing its occurrence in complete microbial genomes. Collection of high-quality, automated HGT prediction data based on phylogenetic evidence has previously been impractical for large numbers of genomes at once, due to prohibitive computational demands. DarkHorse, a recently described statistical method for discovering phylogenetically atypical genes on a genome-wide basis, provides a means to solve this problem through lineage probability index (LPI) ranking scores. LPI scores inversely reflect phylogenetic distance between a test amino acid sequence and its closest available database matches. Proteins with low LPI scores are good horizontal gene transfer candidates; those with high scores are not.

9.

Background  

Parthenium argentatum (guayule) is an industrial crop that produces latex, which was recently commercialized as a source of latex rubber safe for people with Type I latex allergy. The complete plastid genome of P. argentatum was sequenced. The sequence provides important information useful for genetic engineering strategies. Comparison to the sequences of plastid genomes from three other members of the Asteraceae, Lactuca sativa, Guizotia abyssinica and Helianthus annuus, revealed details of the evolution of the four genomes. Chloroplast-specific DNA barcodes were developed for identification of Parthenium species and lines.

10.
Surprising complexity of the ancestral apoptosis network (cited 1 time: 1 self-citation, 0 by others)
Zmasek CM, Zhang Q, Ye Y, Godzik A. Genome Biology, 2007, 8(10): R226-8

Background

Apoptosis, one of the main types of programmed cell death, is regulated and performed by a complex protein network. Studies in model organisms, mostly in the nematode Caenorhabditis elegans, identified a relatively simple apoptotic network consisting of only a few proteins. However, analysis of several recently sequenced invertebrate genomes, ranging from the cnidarian sea anemone Nematostella vectensis, representing one of the morphologically simplest metazoans, to the deuterostomes sea urchin and amphioxus, contradicts the current paradigm of a simple ancestral network that expanded in vertebrates.

Results

Here we show that the apoptosome-forming CED-4/Apaf-1 protein, present in single copy in vertebrate, nematode, and insect genomes, had multiple paralogs in the cnidarian-bilaterian ancestor. Different members of this ancestral Apaf-1 family led to the extant proteins in nematodes/insects and in deuterostomes, explaining significant functional differences between proteins that until now were believed to be orthologous. Similarly, the evolution of the Bcl-2 and caspase protein families appears surprisingly complex and apparently included significant gene loss in nematodes and insects and expansions in deuterostomes.

Conclusion

The emerging picture of the evolution of the apoptosis network is one of a succession of lineage-specific expansions and losses, which combined with the limited number of 'apoptotic' protein families, resulted in apparent similarities between networks in different organisms that mask an underlying complex evolutionary history. Similar results are beginning to surface for other regulatory networks, contradicting the intuitive notion that regulatory networks evolved in a linear way, from simple to complex.

11.

Background

Phyletic patterns denote the presence and absence of orthologous genes in completely sequenced genomes and are used to infer functional links between genes, on the assumption that genes involved in the same pathway or functional system are co-inherited by the same set of genomes. However, this basic premise has not been quantitatively tested, and the limits of applicability of the phyletic-pattern method remain unknown.

Results

We characterized a hierarchy of 3,688 phyletic patterns encompassing more than 5,000 known protein-coding genes from 66 complete microbial genomes, using different distances, clustering algorithms, and measures of cluster quality. The most sensitive set of parameters recovered 223 clusters, each consisting of genes that belong to the same metabolic pathway or functional system. Fifty-six clusters included unexpected genes with plausible functional links to the rest of the cluster. Only a small percentage of known pathways and multiprotein complexes are co-inherited as one cluster; most are split into many clusters, indicating that gene loss and displacement has occurred in the evolution of most pathways.
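A phyletic pattern can be represented as a presence/absence vector over a fixed list of genomes. The sketch below shows only that representation, a Hamming distance between patterns, and the simplest possible grouping (genes with identical patterns). It is not the distances or clustering algorithms the authors compared, which the abstract does not specify. Genome names, gene names, and helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: which genomes carry an ortholog of each gene.
genomes = ["G1", "G2", "G3", "G4"]
presence = {
    "geneA": {"G1", "G2"},
    "geneB": {"G1", "G2"},
    "geneC": {"G1", "G2", "G3"},
    "geneD": {"G4"},
}


def phyletic_pattern(present_in, genomes):
    """Presence (1) / absence (0) of a gene across an ordered genome list."""
    return tuple(int(g in present_in) for g in genomes)


def hamming(p, q):
    """Number of genomes in which two patterns disagree."""
    return sum(a != b for a, b in zip(p, q))


patterns = {gene: phyletic_pattern(s, genomes) for gene, s in presence.items()}

# Group genes that share exactly the same pattern (distance 0).
groups = defaultdict(list)
for gene, pattern in patterns.items():
    groups[pattern].append(gene)

for pattern, genes in groups.items():
    print(pattern, genes)
```

Genes such as `geneA` and `geneB`, which share a pattern, are the candidates for a functional link under the co-inheritance assumption. A hierarchical clustering would additionally merge near-identical patterns such as that of `geneC`.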

Conclusions

Phyletic patterns of functionally linked genes are perturbed by differential gains, losses and displacements of orthologous genes in different species, reflecting the high plasticity of microbial genomes. Groups of genes that are co-inherited can, however, be recovered by hierarchical clustering, and may represent elementary functional modules of cellular metabolism. The phyletic patterns approach alone can confidently predict the functional linkages for about 24% of the entire data set.

12.
13.
14.

Background  

There has been remarkably little study of nucleotide substitution rate variation among plant nuclear genes, in part because orthology is difficult to establish. Orthology is even more problematic for intergenic regions of plant nuclear genomes, because plant genomes generally harbor a wealth of repetitive DNA. In theory orthologous intergenic data is valuable for studying rate variation because nucleotide substitutions in these regions should be under little selective constraint compared to coding regions. As a result, evolutionary rates in intergenic regions may more accurately reflect genomic features, like recombination and GC content, that contribute to nucleotide substitution.

15.

Background  

The three trypanosomatids pathogenic to humans, Trypanosoma cruzi, Trypanosoma brucei and Leishmania major, are etiological agents of Chagas disease, African sleeping sickness and cutaneous leishmaniasis, respectively. The complete sequencing of these trypanosomatid genomes represented a breakthrough in the understanding of these organisms. Genome sequencing is a step towards solving the parasite biology puzzle, as there is a high percentage of genes encoding proteins without functional annotation. Also, technical limitations in protein expression in heterologous systems reinforce the evident need for the development of a high-throughput reverse genetics platform. Ideally, such a platform would lead to efficient cloning and compatibility with various approaches. Thus, we aimed to construct a highly efficient cloning platform compatible with plasmid vectors that are suitable for various approaches.

16.

Background  

The combination of complete genome sequence information with expression data enables us to characterize the relationship between a protein's evolutionary origin or functional category and its expression pattern. In this study, mouse proteins were assigned into functional and phyletic groups and the gene expression patterns of the different protein groupings were examined by microarray analysis in various mouse tissues.

17.

Background

Paenibacillus larvae is a Firmicute bacterium that causes American Foulbrood, a lethal disease of honeybees and a major source of global agricultural losses. Although P. larvae phages were isolated prior to 2013, no full genome sequences of P. larvae bacteriophages were published or analyzed. This report includes an in-depth analysis of the structure, genomes, and relatedness of P. larvae myoviruses Abouo, Davies, Emery, Jimmer1, Jimmer2, and siphovirus phiIBB_Pl23 to each other and to other known phages.

Results

P. larvae phages Abouo, Davies, Emery, Jimmer1, and Jimmer2 are myoviruses with ~50 kbp genomes. The six P. larvae phages form three distinct groups by dotplot analysis. An annotated linear genome map of these six phages displays important identifiable genes and demonstrates the relationship between phages. Sixty phage assembly or structural protein genes and 133 regulatory or other non-structural protein genes were identifiable among the six P. larvae phages. Jimmer1, Jimmer2, and Davies formed stable lysogens resistant to superinfection by genetically similar phages. The correlation between tape measure protein gene length and phage tail length allowed identification of co-isolated phages Emery and Abouo in electron micrographs. A Phamerator database was assembled with the P. larvae phage genomes and 107 genomes of Firmicute-infecting phages, including 71 Bacillus phages. Phamerator identified conserved domains in 1,501 of 6,181 phamilies (only 24.3%) encoded by genes in the database and revealed that P. larvae phage genomes shared at least one phamily with 72 of the 107 other phages. The phamily relationship of large terminase proteins was used to indicate putative DNA packaging strategies. Analyses from CoreGenes, Phamerator, and electron micrograph measurements indicated Jimmer1, Jimmer2, Abouo and Davies were related to phages phiC2, EJ-1, KC5a, and AQ113, which are small-genome myoviruses that infect Streptococcus, Lactobacillus, and Clostridium, respectively.

Conclusions

This paper represents the first comparison of phage genomes in the Paenibacillus genus and the first organization of P. larvae phages based on sequence and structure. This analysis provides an important contribution to the field of bacteriophage genomics by serving as a foundation on which to build an understanding of the natural predators of P. larvae.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-745) contains supplementary material, which is available to authorized users.

18.

Background

Analysis of the complete genomes from the multicellular myxobacteria Myxococcus xanthus and Sorangium cellulosum identified the highest number of eukaryotic-like protein kinases (ELKs) compared to all other genomes analyzed. High numbers of protein phosphatases (PPs) could therefore be anticipated, as reversible protein phosphorylation is a major regulation mechanism of fundamental biological processes.

Methodology

Here we report an intensive analysis of the phosphatomes of M. xanthus and S. cellulosum in which we constructed phylogenetic trees to position these sequences relative to PPs from other prokaryotic organisms.

Principal Findings

Predominant observations were: (i) M. xanthus and S. cellulosum possess predominantly Ser/Thr PPs; (ii) S. cellulosum encodes the highest number of PP2c-type phosphatases so far reported for a prokaryotic organism; (iii) in contrast to M. xanthus only S. cellulosum encodes high numbers of SpoIIE-like PPs; (iv) there is a significant lack of synteny among M. xanthus and S. cellulosum, and (v) the degree of co-organization between kinase and phosphatase genes is extremely low in these myxobacterial genomes.

Conclusions

We conclude that there has been a greater expansion of ELKs than PPs in multicellular myxobacteria.

19.

Background

An organism's ability to adapt to its particular environmental niche is of fundamental importance to its survival and proliferation. In the largest study of its kind, we sought to identify and exploit the amino-acid signatures that make species-specific protein adaptation possible across 100 complete genomes.

Results

Environmental niche was determined to be a significant factor in variability from correspondence analysis using the amino acid composition of over 360,000 predicted open reading frames (ORFs) from 17 archaea, 76 bacteria and 7 eukaryote complete genomes. Additionally, we found clusters of phylogenetically unrelated archaea and bacteria that share similar environments by amino acid composition clustering. Composition analyses of conservative, domain-based homology modeling suggested an enrichment of small hydrophobic residues Ala, Gly, Val and charged residues Asp, Glu, His and Arg across all genomes. However, larger aromatic residues Phe, Trp and Tyr are reduced in folds, and these results were not affected by low complexity biases. We derived two simple log-odds scoring functions from ORFs (CG) and folds (CF) for each of the complete genomes. CF achieved an average cross-validation success rate of 85 ± 8% whereas the CG detected 73 ± 9% species-specific sequences when competing against all other non-redundant CG. Continuously updated results are available at http://genome.mshri.on.ca.
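The abstract does not give the form of the CG and CF scoring functions. As a generic illustration of a composition-based log-odds score, the sketch below compares amino acid frequencies in one set of sequences against a background set and scores a query by summing per-residue log ratios. The pseudocount, the toy sequences, and all function names are assumptions made for illustration, not the authors' definitions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequences, pseudocount=1.0):
    """Smoothed amino acid frequencies over a collection of sequences."""
    counts = Counter(aa for seq in sequences for aa in seq if aa in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def log_odds_table(genome_freq, background_freq):
    """Per-residue log ratio of genome frequency to background frequency."""
    return {aa: math.log(genome_freq[aa] / background_freq[aa]) for aa in AMINO_ACIDS}


def score(sequence, table):
    """Sum of per-residue log-odds; unknown symbols contribute zero."""
    return sum(table.get(aa, 0.0) for aa in sequence)


# Toy sequences (hypothetical): a genome rich in Ala/Gly/Val versus a background.
genome_orfs = ["AAGGVVAA", "GAVAAG"]
background_orfs = ["FWYFWY", "KLMNPQ", "AGVDEH"]

table = log_odds_table(composition(genome_orfs), composition(background_orfs))
print(score("AAGV", table) > 0, score("FWY", table) < 0)
```

A positive score means the query resembles the genome's composition more than the background's. Assigning a sequence to a species would then amount to picking the genome whose table gives the highest score, which is one plausible reading of "competing against all other non-redundant CG".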

Conclusion

Our analysis of amino acid compositions from the complete genomes provides stronger evidence for species-specific and environmental residue preferences in genomic sequences as well as in folds. Scoring functions derived from this work will be useful in future protein engineering experiments and possibly in identifying horizontal transfer events.

20.

Background  

The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress of plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of their gene content and genomic structure that illustrates the unique genomic features of gymnosperms.
