Found 20 similar articles (search time: 0 ms)
1.
Background
Some ferns have medicinal properties and are used in therapeutic interventions. However, the classification and phylogenetic relationships of ferns remain incompletely reported. Because chloroplast genomes provide ideal information for species identification and evolutionary studies, this study sequenced the chloroplast genomes of four ferns (three previously unpublished and one published) and compared them with those of other ferns to obtain comprehensive information on their classification and evolution.
Materials and Methods
The complete chloroplast genomes of Dryopteris goeringiana (Kunze) Koidz, D. crassirhizoma Nakai, Athyrium brevifrons Nakai ex Kitagawa, and Polystichum tripteron (Kunze) Presl were sequenced using the Illumina HiSeq 4000 platform. Simple sequence repeats (SSRs), nucleotide diversity, and RNA editing were investigated in all four species. Genome comparison and inverted repeat (IR) boundary expansion and contraction analyses were also performed. The relationships among the ferns were studied by phylogenetic analysis based on the whole chloroplast genomes.
Results
The whole chloroplast genomes ranged from 148,539 to 151,341 bp in size and exhibited typical quadripartite structures. Ten highly variable loci with parsimony informative (Pi) values of > 0.02 were identified. A total of 75–108 SSRs were identified, and only six SSRs were present in all four ferns. The SSRs contained more A + T than G + C bases. C‐to‐U conversion was the most common type of RNA editing event. Genome comparison analysis revealed that single‐copy regions were more highly conserved than IR regions. IR boundary expansion and contraction varied among the four ferns. Phylogenetic analysis showed that species in the same genus tended to cluster together and had relatively close relationships.
Conclusion
The results provide valuable information on fern chloroplast genomes that will be useful for identifying and classifying ferns and for studying their phylogenetic relationships and evolution.
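As a rough illustration of what an SSR scan involves, the sketch below finds mononucleotide and dinucleotide repeats in a DNA string with regular expressions. The function name `find_ssrs` and the 10 bp minimum length are assumptions chosen for illustration; the abstract does not state which tool or thresholds the authors used.

```python
import re

def find_ssrs(seq, min_len=10):
    """Return sorted (start, motif, length) tuples for mono- and
    dinucleotide repeats of at least min_len bp (hypothetical threshold)."""
    seq = seq.upper()
    hits = []
    # Mononucleotide runs, e.g. AAAAAAAAAA
    mono = r"([ACGT])\1{%d,}" % (min_len - 1)
    for m in re.finditer(mono, seq):
        hits.append((m.start(), m.group(1), len(m.group(0))))
    # Dinucleotide repeats whose two bases differ, e.g. ATATATATAT
    min_units = -(-min_len // 2)  # ceiling division
    di = r"(([ACGT])(?!\2)[ACGT])\1{%d,}" % (min_units - 1)
    for m in re.finditer(di, seq):
        hits.append((m.start(), m.group(1), len(m.group(0))))
    return sorted(hits)

if __name__ == "__main__":
    demo = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "C"
    print(find_ssrs(demo))  # [(2, 'A', 12), (18, 'AT', 12)]
```

Counting the A/T versus G/C content of the returned motifs is one simple way to check a statement such as "the SSRs contained more A + T than G + C bases" on one's own data.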
2.
As the number of complete microbial genomes publicly available is still growing, the problem of annotation quality in these very large sequences remains unsolved. Indeed, the number of annotations associated with complete genomes is usually lower than that of the shorter entries encountered in the repository collections. Moreover, classical sequence database management systems have difficulties in handling entries of such size. In this context, the Enhanced Microbial Genomes Library (EMGLib) was developed to try to alleviate these problems. This library contains all the complete genomes from prokaryotes (bacteria and archaea) already sequenced and the yeast genome in GenBank format. The annotations are improved by the introduction of data on codon usage, gene orientation on the chromosome and gene families. It is possible to access EMGLib through two database systems set up on WWW servers: the PBIL server at http://pbil.univ-lyon1.fr/emglib.html and the MICADO server at http://locus.jouy.inra.fr/micado
3.
Uchiyama I 《Nucleic acids research》2003,31(1):58-62
MBGD is a workbench system for comparative analysis of completely sequenced microbial genomes. The central function of MBGD is to create an orthologous gene classification table using precomputed all-against-all similarity relationships among genes in multiple genomes. In MBGD, an automated classification algorithm has been implemented so that users can create their own classification table by specifying a set of organisms and parameters. This feature is especially useful when the user's interest is focused on some taxonomically related organisms. The created classification table is stored in the database and can be explored in combination with the data of individual genomes as well as similarity relationships among genomes. Using these data, users can carry out comparative analyses from various points of view, such as phylogenetic pattern analysis, gene order comparison and detailed gene structure comparison. MBGD is accessible at http://mbgd.genome.ad.jp/.
4.
Comparative genome analysis is a powerful approach to understanding the biology of infectious bacterial pathogens. In this study, a quantitative approach, referred to as Gnom(Cmp), was developed to study the microevolution of bacterial pathogens. Although much more time-consuming than existing tools, this procedure provides a much higher resolution. Gnom(Cmp) accomplishes this by establishing genome-wide heterogeneity genotypes, which are then quantified and comparatively analyzed. The heterogeneity genotypes are defined as chromosomal base positions that have multiple variants within particular genomes, resulting from DNA duplications and subsequent mutations. To prove the concept, the procedure was applied to the genomes of 15 Staphylococcus aureus strains, focusing extensively on two pairs of hVISA/VISA strains. hVISA refers to heteroresistant vancomycin-intermediate S. aureus strains, and VISA refers to their vancomycin-intermediate mutants. The hVISA/VISA system displays some remarkable properties: hVISA is susceptible to vancomycin, but VISA mutants emerge after a short period of vancomycin therapy, making the pathogen a great model organism for fast-evolving bacterial pathogens. The analysis indicated that Gnom(Cmp) could reveal variants within the genomes, which can be analyzed within the global genome context. Gnom(Cmp) discovered evolutionary hotspots and their dynamics among many closely related, even isogenic, genomes. The analysis thus allows the exploration of the molecular mechanisms behind hVISA/VISA evolution, providing working hypotheses for experimental testing and validation.
5.
6.
7.
The subclass Pteriomorphia is a morphologically diverse and economically important group of Mollusca. We retrieved 42 mitochondrial genomes (mtGenomes) of Pteriomorphia and concatenated protein-coding genes, rRNAs and tRNAs to assess phylogenetic relationships and divergence times among the families with maximum likelihood (ML) and Bayesian inference (BI) analyses. Both ML and BI analyses strongly support the same topology except for the position of Atrina pectinata. Our study confirms the monophyly of the families Arcidae, Mytilidae, Pteriidae, Ostreidae and Pectinidae. Within Pteriomorphia, we recovered two clusters, one comprising Mytilidae, Arcidae and Pectinidae, the other consisting of Ostreidae, Pteriidae and Pinnidae, but we did not confirm a basal position for any family. The phylogenetic trees suggest that Ostreidae, Pteriidae and Pinnidae should be grouped as the order Ostreoida. Divergence times of major families are estimated as follows: Arcidae, 315.9 Ma; Pectinidae, 384.4 Ma; Ostreidae, 240.8 Ma; Mytilidae, 390.8 Ma. Comparative analysis indicates a low-level codon usage bias (with an average of 50.29) in mtGenomes of Pteriomorphia. In Mytilidae and Ostreidae, the codon usage bias was under mutation pressure rather than selection. In contrast, mutation is not the main factor in defining the codon usage in Pectinidae and Pteriidae. Among Ostreidae, Pectinidae and Mytilidae, Ka/Ks ratios range from 0.00 to 1.22 and most values (89.11%) are less than 0.20, indicating that most genes are under strong negative or purifying selection. The protein-coding gene orders show dramatically different patterns in Pteriomorphia. No gene block, even one consisting of just two genes, is shared by five families.
8.
Pradeep Kumar Burma Alok Raj Jayant K. Deb Samir K. Brahmachari 《Journal of biosciences》1992,17(4):395-411
In this article we describe and demonstrate the versatility of a computer program, GENOME MAPPING, that uses interactive graphics
and runs on an IRIS workstation. The program helps to visualize as well as analyse global and local patterns of genomic DNA
sequences. It was developed keeping in mind the requirements of the human genome sequencing programme, which requires rapid
analysis of the data. Using GENOME MAPPING one can discern signature patterns of different kinds of sequences and analyse
such patterns for repetitive as well as rare sequence strings. Further, one can visualize the extent of global homology between
different genomic sequences. An application of our method to the published yeast mitochondrial genome data shows similar sequence
organizations in the entire sequence and in smaller subsequences.
9.
Organellar Genome Retrieval (OGRe) is a relational database of complete mitochondrial genome sequences for over 250 Metazoan species. OGRe provides a resource for the comparative analysis of mitochondrial genomes at several levels. At the sequence level, OGRe allows the retrieval of any selected set of mitochondrial genes from any selected set of species. Species are classified using a taxonomic system that allows easy selection of related groups of species. Sequence alignments are also available for some species. At the level of individual nucleotides, the system contains information on base frequencies and codon usage frequencies that can be compared between organisms. At the level of whole genomes, OGRe provides several ways of visualizing information on gene order. Diagrams illustrating the genome arrangement can be generated for any selected set of species automatically from the information in the database. Searches can be done based on gene arrangement to find sets of species that have the same order as one another. Diagrams for pairwise comparison of species can be produced that show the positions of break-points in the gene order and use colour to highlight the sections of the genome that have moved. OGRe is available from http://www.bioinf.man.ac.uk/ogre.
10.
Metagenomics facilitates the study of the genetic information from uncultured microbes and complex microbial communities. Assembling complete genomes from metagenomics data is difficult because most samples have high organismal complexity and strain diversity. Some studies have attempted to extract complete bacterial, archaeal, and viral genomes and often focus on species with circular genomes so they can help confirm completeness with circularity. However, less than 100 circularized bacterial and archaeal genomes have been assembled and published from metagenomics data despite the thousands of datasets that are available. Circularized genomes are important for (1) building a reference collection as scaffolds for future assemblies, (2) providing complete gene content of a genome, (3) confirming little or no contamination of a genome, (4) studying the genomic context and synteny of genes, and (5) linking protein coding genes to ribosomal RNA genes to aid metabolic inference in 16S rRNA gene sequencing studies. We developed a semi-automated method called Jorg to help circularize small bacterial, archaeal, and viral genomes using iterative assembly, binning, and read mapping. In addition, this method exposes potential misassemblies from k-mer based assemblies. We chose species of the Candidate Phyla Radiation (CPR) to focus our initial efforts because they have small genomes and are only known to have one ribosomal RNA operon. In addition to 34 circular CPR genomes, we present one circular Margulisbacteria genome, one circular Chloroflexi genome, and two circular megaphage genomes from 19 public and published datasets. We demonstrate findings that would likely be difficult without circularizing genomes, including that ribosomal genes are likely not operonic in the majority of CPR, and that some CPR harbor diverged forms of RNase P RNA. 
Code and a tutorial for this method are available at https://github.com/lmlui/Jorg, and the method is available on the DOE Systems Biology KnowledgeBase as a beta app.
11.
Jochen Blom Stefan P Albaum Daniel Doppmeier Alfred Pühler Frank-Jörg Vorhölter Martha Zakrzewski Alexander Goesmann 《BMC bioinformatics》2009,10(1):154
Background
The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons.
12.
Strains of Staphylococcus aureus, an important human pathogen, display up to 20% variability in their genome sequence, and most sequence information is available for human clinical isolates that have not been subjected to genetic analysis of virulence attributes. S. aureus strain Newman, which was also isolated from a human infection, displays robust virulence properties in animal models of disease and has already been extensively analyzed for its molecular traits of staphylococcal pathogenesis. We report here the complete genome sequence of S. aureus Newman, which carries four integrated prophages, as well as two large pathogenicity islands. In agreement with the view that S. aureus Newman prophages contribute important properties to pathogenesis, fewer virulence factors are found outside of the prophages than for the highly virulent strain MW2. The absence of drug resistance genes reflects the general antibiotic-susceptible phenotype of S. aureus Newman. Phylogenetic analyses reveal clonal relationships between the staphylococcal strains Newman, COL, NCTC8325, and USA300 and a greater evolutionary distance to strains MRSA252, MW2, MSSA476, N315, Mu50, JH1, JH9, and RF122. However, polymorphism analysis of two large pathogenicity islands distributed among these strains shows that the two islands were acquired independently from the evolutionary pathway of the chromosomal backbones of staphylococcal genomes. Prophages and pathogenicity islands play central roles in S. aureus virulence and evolution.
13.
Ségolène Caboche Gaël Even Alexandre Loywick Christophe Audebert David Hot 《Genome biology》2017,18(1):233
The increase in available sequence data has advanced the field of microbiology; however, making sense of these data without bioinformatics skills is still problematic. We describe MICRA, an automatic pipeline, available as a web interface, for microbial identification and characterization through reads analysis. MICRA uses iterative mapping against reference genomes to identify genes and variations. Additional modules allow prediction of antibiotic susceptibility and resistance and comparing the results of several samples. MICRA is fast, producing few false-positive annotations and variant calls compared to current methods, making it a tool of great interest for fully exploiting sequencing data.
14.
15.
Feruza U. Mustafina Dong‐Keun Yi Kyung Choi Chang Ho Shin Komiljon Sh. Tojibaev Stephen R. Downie 《Ecology and evolution》2019,9(1):364-377
Prangos fedtschenkoi (Regel & Schmalh.) Korovin and P. lipskyi Korovin (Apiaceae) are rare plant species endemic to mountainous regions of Middle Asia. Both are edificators of biotic communities and valuable resource plants. The results of recent phylogenetic analyses place them in Prangos subgen. Koelzella (M. Hiroe) Lyskov & Pimenov and suggest they may possibly represent sister species. To aid in development of molecular markers useful for intraspecific phylogeographic and population‐level genetic studies of these ecologically and economically important plants, we determined their complete plastid genome sequences and compared the results obtained to several previously published plastomes of Apiaceae. The plastomes of P. fedtschenkoi and P. lipskyi are typical of Apiaceae and most other higher plant plastid DNAs in their sizes (153,626 and 154,143 bp, respectively), structural organization, gene arrangement, and gene content (with 113 unique genes). A total of 49 and 48 short sequence repeat (SSR) loci of 10 bp or longer were detected in P. fedtschenkoi and P. lipskyi plastomes, respectively, representing 42–43 mononucleotides and 6 AT dinucleotides. Seven tandem repeats of 30 bp or longer with a sequence identity ≥90% were identified in each plastome. Further comparisons revealed 319 polymorphic sites between the plastomes (IR, 21; LSC, 234; SSC, 64), representing 43.8% transitions (Ts), 56.1% transversions (Tv), and a Ts/Tv ratio of 0.78. Within genic regions, two indel events were observed in rpoA (6 and 51 bp) and ycf1 (3 and 12 bp), and one in ndhF (6 bp). The most variable intergenic spacer region was that of accD/psaI, with 21.1% nucleotide divergence. Each Prangos species possessed one of two separate inversions (either 5 bp in ndhB intron or 9 bp in petB intron), and these were predicted to form hairpin structures with flanking repeat sequences of 18 and 19 bp, respectively. 
Both species have also incorporated novel DNA in the LSC region adjacent to the LSC/IRa junction, and BLAST searches revealed it had a 100 bp match (86% sequence identity) to noncoding mitochondrial DNA. Prangos‐specific primers were developed for the variable accD/psaI intergenic spacer and preliminary PCR‐surveys suggest that this region will be useful for future phylogeographic and population‐level studies.
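The Ts/Tv figure reported in the abstract above can be checked with simple arithmetic: 43.8% transitions divided by 56.1% transversions gives about 0.78. The sketch below shows how such a ratio could be computed from two aligned sequences. The function names and the toy sequences are hypothetical, and the code assumes the two sequences are already aligned to equal length; it is not the authors' pipeline.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES

def classify(a, b):
    """Return 'ts' (transition), 'tv' (transversion), or None when the
    pair is identical or contains a gap/ambiguous character."""
    a, b = a.upper(), b.upper()
    if a == b or a not in BASES or b not in BASES:
        return None
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "ts" if same_class else "tv"

def ts_tv_ratio(seq1, seq2):
    """Count transitions and transversions between two aligned sequences."""
    counts = {"ts": 0, "tv": 0}
    for a, b in zip(seq1, seq2):
        kind = classify(a, b)
        if kind:
            counts[kind] += 1
    ratio = counts["ts"] / counts["tv"] if counts["tv"] else float("inf")
    return counts["ts"], counts["tv"], ratio

if __name__ == "__main__":
    # Toy alignment: A/G and C/T are transitions, G/C and T/A transversions.
    print(ts_tv_ratio("AAGCT-", "GACTAN"))  # (2, 2, 1.0)
    # Consistency check on the percentages quoted in the abstract.
    print(round(43.8 / 56.1, 2))  # 0.78
```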
16.
The Arabidopsis Information Resource (TAIR): a comprehensive database and web-based information retrieval, analysis, and visualization system for a model plant
Huala E Dickerman AW Garcia-Hernandez M Weems D Reiser L LaFond F Hanley D Kiphart D Zhuang M Huang W Mueller LA Bhattacharyya D Bhaya D Sobral BW Beavis W Meinke DW Town CD Somerville C Rhee SY 《Nucleic acids research》2001,29(1):102-105
Arabidopsis thaliana, a small annual plant belonging to the mustard family, is the subject of study by an estimated 7000 researchers around the world. In addition to the large body of genetic, physiological and biochemical data gathered for this plant, it will be the first higher plant genome to be completely sequenced, with completion expected at the end of the year 2000. The sequencing effort has been coordinated by an international collaboration, the Arabidopsis Genome Initiative (AGI). The rationale for intensive investigation of Arabidopsis is that it is an excellent model for higher plants. In order to maximize use of the knowledge gained about this plant, there is a need for a comprehensive database and information retrieval and analysis system that will provide user-friendly access to Arabidopsis information. This paper describes the initial steps we have taken toward realizing these goals in a project called The Arabidopsis Information Resource (TAIR) (www.arabidopsis.org).
17.
18.
The aim of this study is to design an Internet-based biological information retrieval and analysis system (BIRAS). Using a specific network protocol, BIRAS can send information to and receive information from the Entrez search and retrieval system maintained by the National Center for Biotechnology Information (NCBI) in the USA. Literature, nucleotide sequences, protein sequences, and other resources matching a user-defined term can then be retrieved and delivered to the user automatically by pop-up message or e-mail. All information retrieval and analysis is done in real time. As a robust system for intelligently and dynamically retrieving and analyzing user-defined information, BIRAS is expected to be widely used to retrieve specific information from today's large biological databases. The program is available on request from the corresponding author.
19.
C Harger M Skupski J Bingham A Farmer S Hoisie P Hraber D Kiphart L Krakowski M McLeod J Schwertfeger G Seluja A Siepel G Singh D Stamper P Steadman N Thayer R Thompson P Wargo M Waugh J J Zhuang P A Schad 《Nucleic acids research》1998,26(1):21-26
In 1997 the primary focus of the Genome Sequence DataBase (GSDB; www.ncgr.org/gsdb) located at the National Center for Genome Resources was to improve data quality and accessibility. Efforts to increase the quality of data within the database included two major projects: one to identify and remove all vector contamination from sequences in the database and one to create premier sequence sets (including both alignments and discontiguous sequences). Data accessibility was improved during the course of the last year in several ways. First, a graphical database sequence viewer was made available to researchers. Second, an update process was implemented for the web-based query tool, Maestro. Third, a web-based tool, Excerpt, was developed to retrieve selected regions of any sequence in the database. And lastly, a GSDB flatfile that contains annotation unique to GSDB (e.g., sequence analysis and alignment data) was developed. Additionally, the GSDB web site provides a tool for the detection of matrix attachment regions (MARs), which can be used to identify regions of high coding potential. The ultimate goal of this work is to make GSDB a more useful resource for genomic comparison studies and gene level studies by improving data quality and by providing data access capabilities that are consistent with the needs of both types of studies.