Similar articles (20 found)
1.
Whole-genome comparisons provide insight into genome evolution by informing on gene repertoires, gene gains/losses, and genome organization. Most of our knowledge about eukaryotic genome evolution is derived from studies of multicellular model organisms. The eukaryotic phylum Apicomplexa contains obligate intracellular protist parasites responsible for a wide range of human and veterinary diseases (e.g., malaria, toxoplasmosis, and theileriosis). We have developed an in silico protein-encoding gene based pipeline to investigate synteny across 12 apicomplexan species from six genera. Genome rearrangement between lineages is extensive. Syntenic regions (conserved gene content and order) are rare between lineages and appear to be totally absent across the phylum, with no group of three genes found on the same chromosome and in the same order within 25 kb up- and downstream of any orthologous genes. Conserved synteny between major lineages is limited to small regions in Plasmodium and Theileria/Babesia species, and within these conserved regions, there are a number of proteins putatively targeted to organelles. The observed overall lack of synteny is surprising considering the divergence times and the apparent absence of transposable elements (TEs) within any of the species examined. TEs are ubiquitous in all other groups of eukaryotes studied to date and have been shown to be involved in genomic rearrangements. It appears that there are different criteria governing genome evolution within the Apicomplexa relative to other well-studied unicellular and multicellular eukaryotes.
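
The synteny criterion described in this abstract (three genes on the same chromosome, in the same order, within 25 kb of one another) can be illustrated with a minimal sketch. This is not the authors' pipeline: the tuple layout, the function names `ordered_triplets` and `shared_synteny`, and the choice to compare only adjacent genes are assumptions made for illustration.

```python
from collections import defaultdict

def ordered_triplets(genes, window=25_000):
    """Collect ortholog-id triples formed by three adjacent genes on one chromosome.

    `genes` is a list of (ortholog_id, chromosome, start_bp) tuples (hypothetical layout).
    A triple is kept only if each flanking gene starts within `window` bp of the middle gene.
    """
    by_chrom = defaultdict(list)
    for ortholog, chrom, start in genes:
        by_chrom[chrom].append((start, ortholog))

    triplets = set()
    for located in by_chrom.values():
        located.sort()  # order genes along the chromosome
        for (s1, g1), (s2, g2), (s3, g3) in zip(located, located[1:], located[2:]):
            if s2 - s1 <= window and s3 - s2 <= window:
                # Store the triple in a direction-independent form so that an
                # inverted block in the other genome still counts as the same order.
                triplets.add(min((g1, g2, g3), (g3, g2, g1)))
    return triplets

def shared_synteny(genome_a, genome_b, window=25_000):
    """Return the ortholog triples that keep their order in both genomes."""
    return ordered_triplets(genome_a, window) & ordered_triplets(genome_b, window)
```

An empty result from `shared_synteny` for every pair of lineages would correspond to the "totally absent" case reported above, under these simplified assumptions.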

2.
We describe and validate a new membrane protein topology prediction method, TMHMM, based on a hidden Markov model. We present a detailed analysis of TMHMM's performance, and show that it correctly predicts 97-98 % of the transmembrane helices. Additionally, TMHMM can discriminate between soluble and membrane proteins with both specificity and sensitivity better than 99 %, although the accuracy drops when signal peptides are present. This high degree of accuracy allowed us to predict reliably integral membrane proteins in a large collection of genomes. Based on these predictions, we estimate that 20-30 % of all genes in most genomes encode membrane proteins, which is in agreement with previous estimates. We further discovered that proteins with N(in)-C(in) topologies are strongly preferred in all examined organisms, except Caenorhabditis elegans, where the large number of 7TM receptors increases the counts for N(out)-C(in) topologies. We discuss the possible relevance of this finding for our understanding of membrane protein assembly mechanisms. A TMHMM prediction service is available at http://www.cbs.dtu.dk/services/TMHMM/.

3.
Comparative sequencing of plant genomes: choices to make   Cited by: 4 (self-citations: 0; citations by others: 4)

4.
Magnifying Genomes (MaGe) is a microbial genome annotation system based on a relational database containing information on bacterial genomes, as well as a web interface for carrying out genome annotation projects. Our system allows one to initiate the annotation of a genome at the early stage of the finishing phase. MaGe's main features are (i) integration of annotation data from bacterial genomes enhanced by a gene coding re-annotation process using accurate gene models, (ii) integration of results obtained with a wide range of bioinformatics methods, including exploration of gene context by searching for conserved synteny and reconstruction of metabolic pathways, and (iii) an advanced web interface allowing multiple users to refine the automatic assignment of gene product functions. MaGe is also linked to numerous well-known biological databases and systems. Our system has been thoroughly tested during the annotation of complete bacterial genomes (Acinetobacter baylyi ADP1, Pseudoalteromonas haloplanktis, Frankia alni) and is currently used in the context of several new microbial genome annotation projects. In addition, MaGe allows for annotation curation and exploration of already published genomes from various genera (e.g. Yersinia, Bacillus and Neisseria). MaGe can be accessed at http://www.genoscope.cns.fr/agc/mage.

5.
A report on the Cold Spring Harbor Laboratory meeting 'Plant Genomes: From Sequence to Phenome', Cold Spring Harbor, USA, 9-12 December 2004.

6.
Version 5.1 of PlasmoDB, a resource for malaria parasite genomic and functional genomics datasets, was released in August 2006. This new release includes additional Plasmodium genomes and a newly designed website. The new site reflects the status of PlasmoDB as a member of a linked family of Apicomplexan databases.

7.
Genomic regulatory blocks are chromosomal regions spanned by long clusters of highly conserved noncoding elements devoted to long-range regulation of developmental genes, often immobilizing other, unrelated genes into long-lasting syntenic arrangements. Synorth is a web resource for exploring and categorizing the syntenic relationships in genomic regulatory blocks across multiple genomes, tracing their evolutionary fate after teleost whole genome duplication at the level of genomic regulatory block loci, individual genes, and their phylogenetic context.

8.
9.
Recent pathogenomic research on plant parasitic oomycete effector function and plant host responses has resulted in major conceptual advances in plant pathology, made possible by the availability of genome sequences.

10.
Key Message

Contrasting substitution rates in the organellar genomes of Lophophytum agree with the DNA repair, replication, and recombination gene content. Plastid and nuclear genes whose products form multisubunit complexes co-evolve.

Abstract

The organellar genomes of the holoparasitic plant Lophophytum (Balanophoraceae) show disparate evolution. In the plastid, the genome has been severely reduced and presents a >85% AT content, while in the mitochondria most protein-coding genes have been replaced by homologs acquired by horizontal gene transfer (HGT) from their hosts (Fabaceae). Both genomes carry genes whose products form multisubunit complexes with those of nuclear genes, creating a possible hotspot of cytonuclear coevolution. In this study, we assessed the evolutionary rates of plastid, mitochondrial and nuclear genes, and their impact on cytonuclear evolution of genes involved in multisubunit complexes related to lipid biosynthesis and proteolysis in the plastid and those in charge of the oxidative phosphorylation in the mitochondria. Genes from the plastid and the mitochondria (both native and foreign) of Lophophytum showed extremely high and ordinary substitution rates, respectively. These results agree with the biased loss of plastid-targeted proteins involved in angiosperm organellar repair, replication, and recombination machinery. Consistent with the high rate of evolution of plastid genes, nuclear-encoded subunits of plastid complexes showed disproportionate increases in non-synonymous substitution rates, while those of the mitochondrial complexes did not show different rates than the control (i.e. non-organellar nuclear genes). Moreover, the increases in the nuclear-encoded subunits of plastid complexes were positively correlated with the level of physical interaction they possess with the plastid-encoded ones. Overall, these results suggest that a structurally-mediated compensatory factor may be driving plastid-nuclear coevolution in Lophophytum, and that mito-nuclear coevolution was not altered by HGT.


11.
Bioinformatic approaches have allowed the identification in Arabidopsis thaliana of twenty genes encoding homologues of animal ionotropic glutamate receptors (iGLRs). Some of these putative receptor proteins, grouped into three subfamilies, have been localized to the plasma membrane, but their possible location in organelles has not been investigated so far. In the present work we provide multiple lines of evidence for the plastid localization of a glutamate receptor, AtGLR3.4, in Arabidopsis and tobacco. Biochemical analysis was performed using an antibody shown to specifically recognize both the native protein in Arabidopsis and the recombinant AtGLR3.4 fused to YFP expressed in tobacco. Western blots indicate the presence of AtGLR3.4 in both the plasma membrane and chloroplasts. In agreement, in transformed Arabidopsis cultured cells as well as in agroinfiltrated tobacco leaves, AtGLR3.4::YFP is detected both at the plasma membrane and at the plastid level by confocal microscopy. The photosynthetic phenotype of mutant plants lacking AtGLR3.4 was also investigated. These results identify for the first time a dual localization of a glutamate receptor, revealing its presence in plastids and chloroplasts and opening the way to functional studies.

12.
Mitochondrial DNA, widely applied in studies of population differentiation in animals, is rarely used in plants because of its slow rate of sequence evolution and its complex genomic organization. We demonstrate the utility of two polymorphic mitochondrial tandem repeats located in the second intron of the nad1 gene of Norway spruce. Most of the size variants showed pronounced population differentiation and a distinct geographical distribution. A GenBank search revealed that mitochondrial tandem repeats occur in a broad range of plant species and may serve as a novel molecular marker for unravelling population processes in plants.

13.
14.

Background

While next-generation sequencing technologies have made sequencing genomes faster and more affordable, deciphering the complete genome sequence of an organism remains a significant bioinformatics challenge, especially for large genomes. Low sequence coverage, repetitive elements and short read length make de novo genome assembly difficult, often resulting in sequence and/or fragment “gaps” – uncharacterized nucleotide (N) stretches of unknown or estimated lengths. Some of these gaps can be closed by re-processing latent information in the raw reads. Even though there are several tools for closing gaps, they do not easily scale up to processing billion base pair genomes.

Results

Here we describe Sealer, a tool designed to close gaps within assembly scaffolds by navigating de Bruijn graphs represented by space-efficient Bloom filter data structures. We demonstrate how it scales to successfully close 50.8 % and 13.8 % of gaps in human (3 Gbp) and white spruce (20 Gbp) draft assemblies in under 30 and 27 h, respectively – a feat that is not possible with other leading tools with the breadth of data used in our study.

Conclusion

Sealer is an automated finishing application that uses the succinct Bloom filter representation of a de Bruijn graph to close gaps in draft assemblies, including that of very large genomes. We expect Sealer to have broad utility for finishing genomes across the tree of life, from bacterial genomes to large plant genomes and beyond. Sealer is available for download at https://github.com/bcgsc/abyss/tree/sealer-release.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0663-4) contains supplementary material, which is available to authorized users.
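
The Sealer abstract describes navigating a de Bruijn graph stored in a Bloom filter. The sketch below illustrates that general idea only and is not Sealer's implementation: the class `KmerBloomFilter`, the helpers `build_filter` and `successors`, the filter size, and the SHA-256-based hashing are all assumptions chosen for brevity.

```python
import hashlib

class KmerBloomFilter:
    """Toy Bloom filter for k-mers (hypothetical; sizes chosen for illustration)."""

    def __init__(self, size_bits=1 << 16, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, kmer):
        # Derive several bit positions by salting one hash function with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer):
        # May return a false positive, but never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(kmer))

def build_filter(reads, k):
    """Insert every k-mer of every read into a fresh filter."""
    bloom = KmerBloomFilter()
    for read in reads:
        for i in range(len(read) - k + 1):
            bloom.add(read[i:i + k])
    return bloom

def successors(bloom, kmer):
    """Implicit de Bruijn edges: one-base extensions of `kmer` present in the filter."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in bloom]
```

Because the graph is never stored explicitly, walking from one gap flank toward the other only requires repeated `successors` queries, which is what keeps the memory footprint small; the price is an occasional false-positive edge.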

15.
The GeneSeqer@PlantGDB Web server (http://www.plantgdb.org/cgi-bin/GeneSeqer.cgi) provides a gene structure prediction tool tailored for applications to plant genomic sequences. Predictions are based on spliced alignment with source-native ESTs and full-length cDNAs or non-native probes derived from putative homologous genes. The tool is illustrated with applications to refinement of current gene structure annotation and de novo annotation of draft genomic sequences. The service should facilitate expert annotation as a community effort by providing convenient access to all public plant sequences via the PlantGDB database, a simple four-step protocol for spliced alignment and visually appealing displays of the predicted gene structures in addition to detailed sequence alignments.

16.
17.
With the arrival of low-cost, next-generation sequencing, a multitude of new plant genomes are being publicly released, providing unseen opportunities and challenges for comparative genomics studies. Here, we present PLAZA 2.5, a user-friendly online research environment to explore genomic information from different plants. This new release features updates to previous genome annotations and a substantial number of newly available plant genomes as well as various new interactive tools and visualizations. Currently, PLAZA hosts 25 organisms covering a broad taxonomic range, including 13 eudicots, five monocots, one lycopod, one moss, and five algae. The available data consist of structural and functional gene annotations, homologous gene families, multiple sequence alignments, phylogenetic trees, and colinear regions within and between species. A new Integrative Orthology Viewer, combining information from different orthology prediction methodologies, was developed to efficiently investigate complex orthology relationships. Cross-species expression analysis revealed that the integration of complementary data types extended the scope of complex orthology relationships, especially between more distantly related species. Finally, based on phylogenetic profiling, we propose a set of core gene families within the green plant lineage that will be instrumental to assess the gene space of draft or newly sequenced plant genomes during the assembly or annotation phase.

18.
• Knowledge of the phylogenetic pattern and biological relevance of the base composition of large eukaryotic genomes (including those of plants) is poor. With the use of flow cytometry (FCM), the amount of available data on the guanine + cytosine (GC) content of plants has nearly doubled in the last decade. However, skepticism exists concerning the reliability of the method because of uncertainty in some input parameters.
• Here, we tested the reliability of FCM for estimating GC content by comparison with the biochemical method of DNA temperature melting analysis (TMA). We conducted measurements in 14 plant species with a maximum currently known GC content range (33.6-47.5% as measured by FCM). We also compared the estimations of the GC content by FCM with genomic sequences in 11 Oryza species.
• FCM and TMA data exhibited a high degree of correspondence which remained stable over the relatively wide range of binding lengths (3.39-4.09) assumed for the base-specific dye used. A high correlation was also observed between FCM results and the sequence data in Oryza, although the latter GC contents were consistently lower.
• Reliable estimates of the genomic base composition in plants by FCM are comparable with estimates obtained using other methods, and so wider application of FCM in future plant genomic research, although it would pose a challenge, would be supported by these findings.
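
For the sequence-based side of the comparison in this abstract, GC content is simply the share of G and C among the unambiguous bases. A minimal sketch follows; the function name `gc_content` and the decision to ignore ambiguous symbols such as N are assumptions, and the abstract does not say how the authors handled them.

```python
def gc_content(sequence):
    """Return GC content as a percentage of the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted
```

A value computed this way depends on which parts of the genome were actually assembled, which is one possible reason for sequence-derived estimates to differ from flow cytometry measurements.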

19.
Surface proteins in Gram-positive bacteria are frequently implicated in virulence. We have focused on a group of extracellular cell wall-attached proteins (CWPs), containing an LPXTG motif for cleavage and covalent coupling to peptidoglycan by sortase enzymes. A hidden Markov model (HMM) approach for predicting the LPXTG-anchored cell wall proteins of Gram-positive bacteria was developed and compared against existing methods. The HMM model is parsimonious in terms of the number of freely estimated parameters, and it has proved to be very sensitive and specific in a training set of 55 experimentally verified LPXTG-anchored cell wall proteins as well as in reliable data sets of globular and transmembrane proteins. In order to identify such proteins in Gram-positive bacteria, a comprehensive analysis of 94 completely sequenced genomes has been performed. We identified, in total, 860 LPXTG-anchored cell wall proteins, a number that is significantly higher than those obtained by other available methods. Of these proteins, 237 are hypothetical proteins according to the annotation of SwissProt, and 88 had no homologs in the SwissProt database; this might be evidence that they are members of newly identified families of CWPs. The prediction tool, the database with the proteins identified in the genomes, and supplementary material are available online at http://bioinformatics.biol.uoa.gr/CW-PRED/.
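
As a point of contrast with the HMM described in this abstract, the simplest possible screen for the motif is a pattern match. The sketch below is a deliberately naive regular-expression scan, not the CW-PRED method: the function name `find_lpxtg` and the 50-residue C-terminal window are assumptions, and a plain pattern match will be far less specific than a trained model.

```python
import re

# L-P-any residue-T-G, the sortase recognition motif named in the abstract.
LPXTG = re.compile(r"LP.TG")

def find_lpxtg(protein, tail=50):
    """Return 0-based start positions of LPXTG matches in the last `tail` residues.

    Restricting the scan to the C-terminal end is an illustrative assumption.
    """
    offset = max(0, len(protein) - tail)
    return [offset + match.start() for match in LPXTG.finditer(protein[offset:])]
```

For example, a sequence carrying LPETG twenty residues before its C-terminus is reported, whereas the same pentapeptide at the N-terminus of a long protein is ignored.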

20.
A new system to recognize protein coding genes in the coronavirus genomes, especially suitable for the SARS-CoV genomes, is proposed in this paper. Compared with some existing systems, the new program package has the merits of simplicity, high accuracy, reliability, and speed. The system ZCURVE_CoV has been run for each of the 11 newly sequenced SARS-CoV genomes. Consequently, six genomes not annotated previously have been annotated, and some problems of previous annotations in the remaining five genomes have been pointed out and discussed. In addition to the polyprotein chain ORFs 1a and 1b and the four genes coding for the major structural proteins, spike (S), small envelope (E), membrane (M), and nucleocapsid (N), respectively, ZCURVE_CoV also predicts 5-6 putative proteins between 39 and 274 amino acids in length with unknown functions. Some single nucleotide mutations within these putative coding sequences have been detected and their biological implications are discussed. A web service is provided, by which a user can obtain the annotated result immediately by pasting the SARS-CoV genome sequences into the input window on the web site (http://tubic.tju.edu.cn/sars/). The software ZCURVE_CoV can also be downloaded freely from the web address mentioned above and run on computers under Windows or Linux.
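
The name ZCURVE_CoV refers to the Z-curve representation of DNA, in which a sequence is mapped to a three-dimensional curve whose coordinates track purine versus pyrimidine, amino versus keto, and weak versus strong hydrogen-bond bases. The abstract does not describe how the program uses this representation, so the sketch below only computes the cumulative coordinates; the function name `z_curve` and the choice to skip ambiguous bases are assumptions.

```python
# Per-base increments (dx, dy, dz):
#   x: purine (A, G) +1, pyrimidine (C, T) -1
#   y: amino (A, C) +1, keto (G, T) -1
#   z: weak H-bond (A, T) +1, strong H-bond (G, C) -1
STEPS = {
    "A": (1, 1, 1),
    "G": (1, -1, -1),
    "C": (-1, 1, -1),
    "T": (-1, -1, 1),
}

def z_curve(sequence):
    """Return the cumulative (x, y, z) coordinates after each unambiguous base."""
    x = y = z = 0
    points = []
    for base in sequence.upper():
        step = STEPS.get(base)
        if step is None:
            continue  # skip ambiguous symbols such as N (illustrative choice)
        x, y, z = x + step[0], y + step[1], z + step[2]
        points.append((x, y, z))
    return points
```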
