Similar literature
 20 similar articles found (search time: 15 ms)
1.
Hasan MS  Liu Q  Wang H  Fazekas J  Chen B  Che D 《Bioinformation》2012,8(4):203-205
Genomic Islands (GIs) are genomic regions that originate from other organisms and are acquired through a process known as Horizontal Gene Transfer (HGT). Detection of GIs plays a significant role in biomedical research, since such alien genomic regions usually contain important features, such as pathogenic genes. We have developed a user-friendly graphical user interface, Genomic Island Suite of Tools (GIST), which is a platform for scientific users to predict GIs. This software package includes five commonly used tools: AlienHunter, IslandPath, Colombo SIGI-HMM, INDeGenIUS and Pai-Ida. It also includes an optimization program, EGID, that combines the results of the existing tools for more accurate prediction. The tools in GIST can be used either separately or sequentially. GIST also includes a download feature that collects the input genomes automatically from the FTP server of the National Center for Biotechnology Information (NCBI). GIST was implemented in Java, and was compiled and executed on Linux/Unix operating systems. AVAILABILITY: The database is available for free at http://www5.esu.edu/cpsc/bioinfo/software/GIST.

2.
Che D  Hasan MS  Wang H  Fazekas J  Huang J  Liu Q 《Bioinformation》2011,7(6):311-314
Genomic islands (GIs) are genomic regions that were originally transferred from other organisms. The detection of genomic islands in genomes can lead to many applications in industrial, medical and environmental contexts. Existing computational tools for GI detection suffer from either low recall or low precision, leaving room for improvement. In this paper, we report the development of our Ensemble algorithm for Genomic Island Detection (EGID). EGID takes the prediction results of existing computational tools, filters them and generates consensus prediction results. Performance comparisons between our ensemble algorithm and existing programs have shown that our ensemble algorithm is better than any other program. EGID was implemented in Java, and was compiled and executed on Linux operating systems. EGID is freely available at http://www5.esu.edu/cpsc/bioinfo/software/EGID.
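
The abstract does not spell out how EGID filters and merges predictions, so the following is only a minimal sketch of one generic way to build a consensus: per-position voting across tools. The function name `consensus_islands`, the `min_votes` threshold and the interval representation are all assumptions made for illustration, not EGID's actual procedure.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end); hypothetical representation


def consensus_islands(
    predictions: Dict[str, List[Interval]],
    genome_length: int,
    min_votes: int,
) -> List[Interval]:
    """Return regions predicted as islands by at least `min_votes` tools."""
    votes = [0] * (genome_length + 1)  # index 0 is unused
    for intervals in predictions.values():
        # Mark each position at most once per tool, even if its intervals overlap.
        covered = [False] * (genome_length + 1)
        for start, end in intervals:
            for pos in range(max(1, start), min(genome_length, end) + 1):
                covered[pos] = True
        for pos in range(1, genome_length + 1):
            if covered[pos]:
                votes[pos] += 1

    # Collapse consecutive positions that reach the threshold into intervals.
    islands: List[Interval] = []
    start = None
    for pos in range(1, genome_length + 1):
        if votes[pos] >= min_votes:
            if start is None:
                start = pos
        elif start is not None:
            islands.append((start, pos - 1))
            start = None
    if start is not None:
        islands.append((start, genome_length))
    return islands


# Toy input: three hypothetical tools with partly overlapping predictions.
toy = {"tool_a": [(10, 20)], "tool_b": [(15, 30)], "tool_c": [(18, 25)]}
print(consensus_islands(toy, genome_length=40, min_votes=2))
```

Raising `min_votes` trades recall for precision, which is the tension between existing tools that the abstract describes.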

3.
4.
Genomes of prokaryotes harbor genomic islands (GIs), which are frequently acquired via horizontal gene transfer (HGT). Here I present an analysis of GIs with respect to gene-encoded functions. GIs were identified by statistical analysis of codon usage and clustering. Genes classified as putatively alien (pA) or putatively native (pN) were categorized according to the COG database. Among pA and pN genes, the distribution of COG functions and classes was studied for different groupings of prokaryotes. Groups were formed according to taxonomical relation or habitats. In all groups, genes related to class L (replication, recombination, and repair) were statistically significantly overrepresented in GIs. GIs of bacteria and archaea showed a distinct pattern of preferences. In archaeal GIs, genes belonging to COG class M (cell wall/membrane/envelope biogenesis) or Q (secondary metabolites biosynthesis, transport, and catabolism) were more frequent. In bacterial GIs, genes of classes U (intracellular trafficking, secretion, and vesicular transport), N (cell motility), and V (defense mechanisms) were predominant. Underrepresentation was strongest for genes belonging to class J (translation, ribosomal structure, and biogenesis). Among single COG functions overrepresented in GIs were transferases and transporters. In both superkingdoms, HGT enhances genomic content by meeting demands that are independent of the studied habitats. These findings are in agreement with the complexity theory, which predicts the preferential import of operational genes. However, only specific subsets of operational genes were enriched in GIs. Modification of the cell envelope, cell motility, secretion, and protection of cellular DNA are major issues in HGT. [Reviewing Editor: Dr. Siv Andersson]

5.
The Network Makeup Artist (NORMA) is a web tool for interactive network annotation visualization and topological analysis, able to handle multiple networks and annotations simultaneously. Precalculated annotations (e.g., Gene Ontology, Pathway enrichment, community detection, or clustering results) can be uploaded and visualized in a network, either as colored pie-chart nodes or as color-filled areas in a 2D/3D Venn-diagram-like style. In the case where no annotation exists, algorithms for automated community detection are offered. Users can adjust the network views using standard layout algorithms or allow NORMA to slightly modify them for visually better group separation. Once a network view is set, users can interactively select and highlight any group of interest in order to generate publication-ready figures. Briefly, with NORMA, users can encode three types of information simultaneously. These are 1) the network, 2) the communities or annotations of interest, and 3) node categories or expression values. Finally, NORMA offers basic topological analysis and direct topological comparison across any of the selected networks. NORMA service is available at http://norma.pavlopouloslab.info, whereas the code is available at https://github.com/PavlopoulosLab/NORMA.

6.
Helicobacter hepaticus is an important pathogen in laboratory mice and induces the development of liver tumors and gastrointestinal disease in susceptible strains of mice. In this study, a miniset of 36 cosmid clones from a genomic library of H. hepaticus was ordered and grouped into four large contigs representing approximately 1 Mb of the H. hepaticus genome using PCR, DNA sequencing, Southern and dot-blot hybridization and pulsed-field gel electrophoresis. From the 200-300 terminal nucleotide sequences of 38 cosmid clones, 56 coding regions were predicted, of which 51 were found to have orthologs in the public databases and five appeared to be unique to H. hepaticus. Of these 51 genes, 36 have orthologs in Helicobacter pylori and 25 display the highest sequence similarity to H. pylori. However, chromosomal positions of these genes are not conserved between these two helicobacters. In addition, 10 H. hepaticus genes had the highest sequence similarity to orthologs in Campylobacter jejuni. The GC content in a randomly selected 21-kb H. hepaticus genomic sequence was 35.8%, which approximates the average between H. pylori (39%) and C. jejuni (30.6%). These results demonstrate that: (1) H. hepaticus is more closely related to H. pylori than C. jejuni; (2) significant genomic alterations exist between H. hepaticus and H. pylori, including gene organization, protein sequences and GC content, probably in part due to specific adaptation to distinct ecological niches.

7.
Ilyina  T. S.  Romanova  Yu. M. 《Molecular Biology》2002,36(2):171-179
Data on the structural organization and evolutionary role of specific bacterial DNA regions known as genomic islands are reviewed. Emphasis is placed on the most extensively studied genomic islands, pathogenicity islands (PAIs), which are present in the chromosome of Gram-negative and Gram-positive pathogenic bacteria and absent from related nonpathogenic strains. PAIs are long DNA regions that harbor virulence genes and often differ in GC content from the remainder of the bacterial genome. Many PAIs occur in tRNA gene loci, which provide a convenient target for foreign gene insertion. Some PAIs are highly homologous to each other and contain sequences similar to ISs, phage att sites, and plasmid ori sites, along with functional or defective integrase and transposase genes, suggesting horizontal transfer of PAIs among bacteria.

8.
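Sequence elimination and the evolution of allopolyploid plant genomes   Cited 5 times in total (self-citations: 0; citations by others: 5)
Allopolyploid plants, formed by hybridization followed by polyploidization, are thought to undergo DNA sequence elimination in the early stages of their formation. The eliminated sequences include both high-copy and low-copy sequences, and in most cases sequences from one of the parents are preferentially eliminated. The pattern of sequence elimination varies with genome composition and species, and may be influenced by the cytoplasm. Although the molecular mechanism of sequence elimination is still not well understood, much evidence indicates that interaction between nonhomologous chromosomes is not the main cause. It is currently thought that sequence elimination increases the divergence between nonhomologous chromosomes and so provides the physical basis for rapidly restoring a diploid-like chromosome pairing pattern during meiosis after polyploidization, which helps polyploids stabilize quickly in nature.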

9.
《Fly》2013,7(5):279-281
Microsatellites show tremendous variation between genomes in terms of their occurrence and composition. Availability of whole genome sequences allows us to study microsatellite characteristics of fully sequenced insect genomes to understand the evolution and biological significance of microsatellites. InSatDb is an insect microsatellite database that provides an interactive interface to query information on microsatellites annotated with size (in base pairs and repeat units); genomic location (exon, intron, upstream or transposon); nature (perfect or imperfect); and sequence composition (repeat motif and GC%). Here, we present a snapshot of the distribution and composition of microsatellites in introns and exons of insect genomes. The data present interesting observations regarding the microsatellite life-cycle and genome flux.

10.
The nucleotide composition of genomes varies dramatically among all three kingdoms of life. GC content, an important characteristic of a genome, is related to many important functions, and therefore GC content and its distribution are routinely reported for sequenced genomes. Traditionally, GC content distribution is assessed by computing GC contents in windows that slide along the genome. Disadvantages of this routinely used window-based method include low resolution and low sensitivity. Additionally, different window sizes result in different GC content distribution patterns within the same genome. We proposed a windowless method, the GC profile, for displaying GC content variations across the genome. Compared to the window-based method, the GC profile has the following advantages: 1) higher sensitivity, because of variation-amplifying procedures; 2) higher resolution, because boundaries between domains can be determined at one single base pair; 3) uniqueness, because the GC profile is unique for a given genome; and 4) the capacity to show both global and regional GC content distributions. These characteristics are useful in identifying horizontally transferred genomic islands and homogeneous GC-content domains. Here, we review the applications of the GC profile in identifying genomic islands and genome segmentation points, and in serving as a platform to integrate with other algorithms for genome analysis. A web server generating GC profiles and implementing relevant genome segmentation algorithms is available at: www.zcurve.net.
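
To make the window-size dependence concrete, here is a minimal sketch of the traditional window-based calculation that the abstract contrasts with the GC profile. It does not implement the windowless GC profile itself; the function name `windowed_gc` and the toy sequence are assumptions chosen for illustration.

```python
from typing import List


def windowed_gc(seq: str, window: int, step: int) -> List[float]:
    """GC fraction of each full window of length `window`, advancing by `step`."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc_count = chunk.count("G") + chunk.count("C")
        values.append(gc_count / window)
    return values


toy_sequence = "GGCCAATT"  # GC-rich first half, AT-rich second half
print(windowed_gc(toy_sequence, window=4, step=4))  # two windows, sharp contrast
print(windowed_gc(toy_sequence, window=8, step=8))  # one window, contrast averaged away
```

The same sequence yields either two strongly contrasting values or a single averaged one, depending only on the chosen window size, which is the ambiguity the abstract attributes to window-based methods.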

11.
High islands, with potentially greater habitat diversity, are expected to have greater species richness and diversity compared to low islands, typically atolls and coral islands of lower habitat diversity, within the same geographical area. Patterns of species similarity, richness, and diversity were compared among coral reef fishes between the low island of the Southwest Palau Islands (SWPI), and the low and high islands of the Main Palauan Archipelago (MPA). Data from diurnal visual transects accounted for approximately 64% and 69% of the shorefish faunas known from the SWPI and MPA, respectively. Two distinct fish faunas were representative of low and high islands. The first was confined to the coral islands of the SWPI. The second was partitioned into both low and high islands of the MPA, and Helen Reef, a large atoll in the SWPI. The second type was clustered into atolls, low islands with atoll-like barrier reef systems, a coral island, and three high island systems, one with an extensive barrier reef system. Contrary to the prediction that high islands, with relatively greater habitat diversity, would have greater species richness and diversity, species richness and diversity were greatest at Kossol, a large atoll-like low island locality at the northern end of a high island in the MPA, followed by two atolls, Kayangel (MPA, north of Kossol) and Helen Reef. In contrast, species richness and diversity were lower at high island localities and lowest at small coral islands. These results suggest that habitat diversity for reef fishes increases as a function of increasing area regardless of whether the locality is a high or low island.

12.
BLAST (Basic Local Alignment Search Tool) searches against DNA and protein sequence databases have become an indispensable tool for biomedical research. The proliferation of the genome sequencing projects is steadily increasing the fraction of genome-derived sequences in the public databases and their importance as a public resource. We report here the availability of Genomic BLAST, a novel graphical tool for simplifying BLAST searches against complete and unfinished genome sequences. This tool allows the user to compare the query sequence against a virtual database of DNA and/or protein sequences from a selected group of organisms with finished or unfinished genomes. The organisms for such a database can be selected using either a graphic taxonomy-based tree or an alphabetical list of organism-specific sequences. The first option is designed to help explore the evolutionary relationships among organisms within a certain taxonomy group when performing BLAST searches. The use of an alphabetical list allows the user to perform a more elaborate set of selections, assembling any given number of organism-specific databases from unfinished or complete genomes. This tool, available at the NCBI web site http://www.ncbi.nlm.nih.gov/cgi-bin/Entrez/genom_table_cgi, currently provides access to over 170 bacterial and archaeal genomes and over 40 eukaryotic genomes.

13.
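The urea method for extracting spider genomic DNA is described. Comparison with other DNA extraction methods shows that the urea method can be carried out at room temperature, gives a high DNA yield with good integrity, and is simple and fast. Using the extracted DNA as the template, PCR amplification yielded DNA fragments of the expected size encoding the highly repetitive, GC-rich spider dragline silk protein gene.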

14.
The accelerating growth of public microbial genomic data imposes a substantial burden on the research community that uses such resources. Building databases for non-redundant reference sequences from massive microbial genomic data based on clustering analysis is essential. However, existing clustering algorithms perform poorly on long genomic sequences. In this article, we present Gclust, a parallel program for clustering complete or draft genomic sequences, where clustering is accelerated with a novel parallelization strategy and a fast sequence comparison algorithm using sparse suffix arrays (SSAs). Moreover, genome identity measures between two sequences are calculated based on their maximal exact matches (MEMs). In this paper, we demonstrate the high speed and clustering quality of Gclust by examining four genome sequence datasets. Gclust is freely available for non-commercial use at https://github.com/niu-lab/gclust. We also introduce a web server for clustering user-uploaded genomes at http://niulab.scgrid.cn/gclust.

15.
Naming of uncultured Bacteria and Archaea is often inconsistent with the International Code of Nomenclature of Prokaryotes. The recent practice of proposing names for higher taxa without designation of lower ranks and nomenclature types is one of the most important inconsistencies that needs to be addressed to avoid nomenclatural instability. The Code requires names of higher taxa up to the rank of class to be derived from the type genus name, with a proposal pending to formalise this requirement for the rank of phylum. Designation of nomenclature types is crucial for providing priority to names and ensures their uniqueness and stability. However, only legitimate names proposed for axenic cultures can be used for this purpose. Candidatus names reserved for taxa lacking cultured representatives may be granted this right if recent proposals to use genome sequences as type material are endorsed, thereby allowing the Code to be fully applied to lineages represented by metagenome-assembled genomes (MAGs) or single amplified genomes (SAGs). Genome quality standards need to be considered to ensure unambiguous assignment of type material. Here, we illustrate the recommended practice by proposing nomenclature type material for four major uncultured prokaryotic lineages based on high-quality MAGs in accordance with the Code.

16.
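Garbe JR  Da Y 《遗传学报》(Acta Genetica Sinica) 2003,30(12):1193-1195
Large pedigree structures remain difficult to analyze in genetic and family studies. Drawing the pedigree is usually the first step in analyzing a genetic trait. A pedigree diagram shows the structure of the whole population, the relationships among individuals, and the direction of gene flow, which helps in understanding the nature of the trait. As the number and complexity of the families studied increase, drawing a clear pedigree can become very difficult. A software tool named Pedigraph was therefore developed to address this problem. Pedigraph can draw pedigrees for large, complex populations and perform the corresponding pedigree analyses. Preliminary tests indicate that the software is a useful tool for genetics and breeding research in animals and plants, and that it can also be used in studies of human populations and history.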

17.
Determining the phylogeny of closely related prokaryotes may fail in an analysis of rRNA or a small set of sequences. Whole-genome phylogeny utilizes the maximally available sample space. For a precise determination of genome similarity, two aspects have to be considered when developing an algorithm of whole-genome phylogeny: (1) gene order conservation is a more precise signal than gene content; and (2) when using sequence similarity, failures in identifying orthologues or the in situ replacement of genes via horizontal gene transfer may give misleading results. GO4genome is a new paradigm, which is based on a detailed analysis of gene function and the location of the respective genes. For characterization of genes, the algorithm uses gene ontology enabling a comparison of function independent of evolutionary relationship. After the identification of locally optimal series of gene functions, their length distribution is utilized to compute a phylogenetic distance. The outcome is a classification of genomes based on metabolic capabilities and their organization. Thus, the impact of effects on genome organization that are not covered by methods of molecular phylogeny can be studied. Genomes of strains belonging to Escherichia coli, Shigella, Streptococcus, Methanosarcina, and Yersinia were analyzed. Differences from the findings of classical methods are discussed.

18.
Next-generation sequencing (NGS) technologies generate thousands to millions of genetic variants per sample. Identification of potential disease-causal variants is labor intensive as it relies on filtering using various annotation metrics and consideration of multiple pathogenicity prediction scores. We have developed VPOT (variant prioritization ordering tool), a Python-based command line tool that allows researchers to create a single fully customizable pathogenicity ranking score from any number of annotation values, each with a user-defined weighting. The use of VPOT can be informative when analyzing entire cohorts, as variants in a cohort can be prioritized. VPOT also provides additional functions to allow variant filtering based on a candidate gene list or by affected status in a family pedigree. VPOT outperforms similar tools in terms of efficacy, flexibility, scalability, and computational performance. VPOT is freely available for public use at GitHub (https://github.com/VCCRI/VPOT/). Documentation for installation along with a user tutorial, a default parameter file, and test data are provided.
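
The idea of collapsing several annotation values into one ranking score with user-defined weights can be illustrated with a plain weighted sum. This is only a sketch under stated assumptions: the field names (`score_a`, `score_b`), the dictionary layout and the functions `priority_score` and `rank_variants` are hypothetical and do not reflect VPOT's parameter file format or its actual scoring rules.

```python
from typing import Dict, List

Variant = Dict[str, object]  # hypothetical record: {"id": str, "annotations": {name: value}}


def priority_score(annotations: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of annotation values; a missing annotation contributes 0."""
    return sum(weight * float(annotations.get(name, 0.0)) for name, weight in weights.items())


def rank_variants(variants: List[Variant], weights: Dict[str, float]) -> List[Variant]:
    """Sort variants from highest to lowest combined score."""
    return sorted(
        variants,
        key=lambda variant: priority_score(variant["annotations"], weights),
        reverse=True,
    )


# Toy data: annotation names and values are invented for illustration.
variants = [
    {"id": "v1", "annotations": {"score_a": 10, "score_b": 0.2}},
    {"id": "v2", "annotations": {"score_a": 25, "score_b": 0.9}},
    {"id": "v3", "annotations": {"score_a": 30}},  # score_b missing
]
weights = {"score_a": 1.0, "score_b": 10.0}

for variant in rank_variants(variants, weights):
    print(variant["id"], priority_score(variant["annotations"], weights))
```

Changing the weights reorders the list without touching the annotation data, which is the kind of customization the abstract describes.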

19.

Background  

It is increasingly evident that there are multiple and overlapping patterns within the genome, and that these patterns contain different types of information - regarding both genome function and genome history. In order to discover additional genomic patterns which may have biological significance, novel strategies are required. To partially address this need, we introduce a new data visualization tool entitled Skittle.

20.
Ontologies have emerged as a fast-growing research topic in the area of the semantic web during the last decade. Currently there are 204 ontologies available through OBO Foundry and BioPortal. Several excellent tools for navigating the ontological structure are available; however, most of them are dedicated to specific annotation data or integrated with specific analysis applications, and do not offer flexibility in terms of general-purpose usage for ontology exploration. We developed OntoVisT, a web-based ontological visualization tool. This application is designed for interactive visualization of any ontological hierarchy for a specific node of interest, up to the chosen level of children and/or ancestors. It takes any ontology file in OBO format as input and generates a hierarchical DAG graph as output for the chosen query. To enhance the navigation of complex networks, we have embedded several features such as search criteria, zoom in/out, center focus, nearest neighbor highlights and mouse hover events. The application has been tested on all 72 data sets available in OBO format through OBO Foundry. The results for a few of them can be accessed through OntoVisT-Gallery. AVAILABILITY: The database is available for free at http://ccbb.jnu.ac.in/OntoVisT.html.
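
Retrieving the hierarchy around a node "up to the chosen level" of ancestors can be sketched as a bounded breadth-first walk over `is_a` links read from OBO text. This is a minimal illustration, not OntoVisT's implementation: the functions `parse_is_a` and `ancestors_within` and the `EX:` identifiers are hypothetical, and the parser handles only `[Term]` stanzas with `id:` and `is_a:` tags.

```python
from typing import Dict, List, Set

# Tiny made-up ontology in OBO flat-file style.
SAMPLE_OBO = """\
[Term]
id: EX:1
name: root

[Term]
id: EX:2
name: middle
is_a: EX:1 ! root

[Term]
id: EX:3
name: leaf
is_a: EX:2 ! middle
"""


def parse_is_a(obo_text: str) -> Dict[str, List[str]]:
    """Map each term id to the ids of its direct is_a parents."""
    parents: Dict[str, List[str]] = {}
    current = None
    in_term = False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            in_term = line == "[Term]"  # ignore other stanza types
            current = None
        elif in_term and line.startswith("id: "):
            current = line[4:].strip()
            parents.setdefault(current, [])
        elif in_term and current is not None and line.startswith("is_a: "):
            # Drop the trailing "! name" comment, keep only the parent id.
            parents[current].append(line[6:].split("!")[0].strip())
    return parents


def ancestors_within(parents: Dict[str, List[str]], node: str, max_levels: int) -> Set[str]:
    """Collect ancestors reachable in at most `max_levels` is_a steps."""
    seen: Set[str] = set()
    frontier = [node]
    for _ in range(max_levels):
        next_frontier = []
        for term in frontier:
            for parent in parents.get(term, []):
                if parent not in seen:
                    seen.add(parent)
                    next_frontier.append(parent)
        frontier = next_frontier
    return seen


graph = parse_is_a(SAMPLE_OBO)
print(sorted(ancestors_within(graph, "EX:3", max_levels=1)))
print(sorted(ancestors_within(graph, "EX:3", max_levels=2)))
```

Children up to a chosen level could be collected the same way by inverting the parent map first.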
