Similar literature
 20 similar articles found
1.
WebACT--an online companion for the Artemis Comparison Tool   Cited by: 4 (self-citations: 0, citations by others: 4)
SUMMARY: WebACT is an online resource which enables the rapid provision of simultaneous BLAST comparisons between up to five genomic sequences in a format amenable for visualization with the well-known Artemis Comparison Tool (ACT). Comparisons can be generated on-the-fly using sequences directly retrieved via EMBL database queries, or by entering or uploading user sequences. Furthermore, pre-computed comparisons are available between all publicly available, completed prokaryotic genomes and plasmids currently contained within the Genome Reviews database (372 sequences, representing 175 different species). The system is designed to minimize the volume of downloaded data and maximize performance. Genome sequences, annotation and pre-computed comparisons are stored in a relational database allowing flexible querying based on user-defined sequence regions, from whole genome to a defined region flanking a specified gene. Comparison and sequence files, whether computed online or retrieved from the database of pre-computed genome comparisons, can be viewed online using ACT and are available for download. AVAILABILITY: Freely accessible at http://www.webact.org. SUPPLEMENTARY INFORMATION: User guide and worked examples are available at http://www.webact.org/WebACT/docs.

2.
Artemis: sequence visualization and annotation   Cited by: 31 (self-citations: 0, citations by others: 31)
SUMMARY: Artemis is a DNA sequence visualization and annotation tool that allows the results of any analysis or sets of analyses to be viewed in the context of the sequence and its six-frame translation. Artemis is especially useful in analysing the compact genomes of bacteria, archaea and lower eukaryotes, and will cope with sequences of any size from small genes to whole genomes. It is implemented in Java, and can be run on any suitable platform. Sequences and annotation can be read and written directly in EMBL, GenBank and GFF format. AVAILABILITY: Artemis is available under the GNU General Public License from http://www.sanger.ac.uk/Software/Artemis

3.
With the number of published microbial genomes now in excess of 100, any new genome that is sequenced is likely to have a close relative available for comparison. Indeed, it is increasingly difficult to perform any genomic analysis that is not comparative. This should, however, not be seen as a drawback; it is often the case that a large amount of information can be drawn from these comparisons, especially between closely related organisms. Several genome sequences published recently indicate the value of comparisons at the genomic level.

4.
The Artemis Group comprises mammalian proteins with important functions in the repair of ionizing radiation-induced DNA double-strand breaks and in the cleavage of DNA hairpin extremities generated during V(D)J recombination. Little is known about the presence of Artemis/Artemis-like proteins in non-mammalian species. We have characterized new Artemis/Artemis-like sequences from the genomes of some fungi and from non-mammalian metazoan species. An in-depth phylogenetic analysis of these new Artemis/Artemis-like sequences showed that they form a distinct clade within the Pso2p/Snm1p A and B Groups. Hydrophobic cluster analysis and three-dimensional modeling allowed the conserved regions in these Artemis/Artemis-like proteins to be mapped and compared. The results indicate that Artemis probably belongs to an ancient DNA recombination mechanism that diversified with the evolution of the multicellular eukaryotic lineage.

5.
Artemis is a widely used software tool for annotating and viewing sequence data. No database is required to use Artemis. Instead, individual sequence data files can be analysed with little or no formatting, making it particularly suited to the study of small genomes and chromosomes, and straightforward for a novice user to get started. Since its release in 1999, Artemis has been used to annotate a diverse collection of prokaryotic and eukaryotic genomes, ranging from Streptomyces coelicolor to, more recently, a large proportion of the Plasmodium falciparum genome. Artemis allows annotated genomes to be easily browsed and makes it simple to add useful biological information to raw sequence data. This paper gives an overview of some of the features of Artemis and includes how it facilitates manual gene prediction and can provide an overview of entire chromosomes or small compact genomes--useful for uncovering unusual features such as pathogenicity islands.

6.
Adeno-associated virus type 2 (AAV2) preferentially integrates its genome into the AAVS1 locus on human chromosome 19. Preferential integration requires the AAV2 Rep68 or Rep78 protein (Rep68/78), a Rep68/78 binding site (RBS), and a nicking site within AAVS1 and may also require an RBS within the virus genome. To obtain further information that might help to elucidate the mechanism and preferred substrate configurations of preferential integration, we amplified junctions between AAV2 DNA and AAVS1 from AAV2-infected HeLaJW cells and cells with defective Artemis or xeroderma pigmentosum group A genes. We sequenced 61 distinct junctions. The integration junction sequences show the three classical types of nonhomologous-end-joining joints: microhomology at junctions (57%), insertion of sequences that are not normally contiguous with either the AAV2 or the AAVS1 sequences at the junction (31%), and direct joining (11%). These junctions were spread over 750 bases and were all downstream of the Rep68/78 nicking site within AAVS1. Two-thirds of the junctions map to 350 bases of AAVS1 that are rich in polypyrimidine tracts on the nicked strand. The majority of AAV2 breakpoints were within the inverted terminal repeat (ITR) sequences, which contain RBSs. We never detected a complete ITR at a junction. Residual ITRs at junctions never contained more than one RBS, suggesting that the hairpin form, rather than the linear ITR, is the more frequent integration substrate. Our data are consistent with a model in which a cellular protein other than Artemis cleaves AAV2 hairpins to produce free ends for integration.

7.

BACKGROUND: Systematic genome comparisons are an important tool to reveal gene functions, pathogenic features, metabolic pathways and genome evolution in the era of post-genomics. Furthermore, such comparisons provide important clues for vaccine and drug development. Existing genome comparison software often lacks accurate information on orthologs, the function of similar genes identified and genome-wide reports and lists on specific functions. All these features and further analyses are provided here in the context of a modular software tool "inGeno" written in Java with Biojava subroutines.

8.
MyGV is an application to visualize (potentially genome-scale) gene structure annotation and prediction. The output of any external gene prediction program can be easily converted to a generalized format for input into MyGV. The application displays all input simultaneously in graphical representation, with a toggle option for a text-based view. Zooming capabilities allow detailed comparisons for specific genome locations. The tool is particularly helpful for refinement of ab initio predicted gene structures by spliced alignment with cDNA or protein homologs. AVAILABILITY: The program was written in Java and is freely available to non-commercial users by electronic download from http://bioinformatics.iastate.edu/bioinformatics2go/MyGV.

9.
MOTIVATION: Dot-matrix plots are widely used for similarity analysis of biological sequences. Many algorithms and computer software tools have been developed for this purpose. Though some of these tools have been reported to handle sequences of a few hundred kb, analysis of genome sequences with a length of >10 Mb on a microcomputer is still impractical due to long execution time and computer memory requirement. RESULTS: Two dot-matrix comparison methods have been developed for analysis of large sequences. The methods initially locate similarity regions between two sequences using a fast word search algorithm, followed by an explicit comparison of these regions. Since the initial screening removes most random matches, the computing time is substantially reduced. The methods produce high quality dot-matrix plots with low background noise. Space requirements are linear, so the algorithms can be used for comparison of genome-size sequences. Computing speed may be affected by the highly repetitive sequence structures of eukaryote genomes. A dot-matrix plot of the yeast genome (12 Mb) with both strands was generated in 80 s on a 1 GHz personal computer.
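
The two-stage idea described in this abstract (a fast word search to locate candidate regions, then a closer look at those regions) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `word_matches` and `extend_diagonals`, the word length `k`, and the simple merging of adjacent hits on one diagonal (standing in for the "explicit comparison" step) are all assumptions made for illustration.

```python
from collections import defaultdict

def word_matches(seq_a, seq_b, k=8):
    """Return sorted (i, j) pairs where seq_a[i:i+k] == seq_b[j:j+k].

    Hypothetical helper: indexes every k-letter word of seq_a in a hash
    table, then looks up each word of seq_b, so random positions are
    never compared directly.
    """
    index = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        index[seq_a[i:i + k]].append(i)
    hits = []
    for j in range(len(seq_b) - k + 1):
        for i in index.get(seq_b[j:j + k], ()):
            hits.append((i, j))
    return sorted(hits)

def extend_diagonals(hits, k):
    """Merge adjacent word hits on the same diagonal (i - j) into
    (start_a, start_b, length) segments, i.e. the lines of a dot plot."""
    by_diag = defaultdict(list)
    for i, j in hits:
        by_diag[i - j].append(i)
    segments = []
    for diag, starts in by_diag.items():
        starts.sort()
        run_start = prev = starts[0]
        for i in starts[1:]:
            if i != prev + 1:  # gap on this diagonal: close the current run
                segments.append((run_start, run_start - diag, prev - run_start + k))
                run_start = i
            prev = i
        segments.append((run_start, run_start - diag, prev - run_start + k))
    return sorted(segments)

hits = word_matches("ACGTACGTTT", "GGACGTACGA", k=4)
segments = extend_diagonals(hits, k=4)
```

Each segment corresponds to one diagonal line in the plot; a plotting layer would draw it from `(start_a, start_b)` for `length` positions. Handling the reverse strand, as the abstract mentions, would require repeating the search against the reverse complement of one sequence.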

10.
SUMMARY: To annotate newly sequenced organisms, cross-species sequence comparison algorithms can be applied to align gene sequences to the genome of a related species. To improve the accuracy of alignment, spaced seeds must be optimized for each comparison. As the number and diversity of genomes increase, an efficient alternative is to cluster pairwise comparisons into groups and identify seeds for groups instead of individual comparisons. Here we investigate a measure of comparison closeness and identify classes of comparisons that show similar seed behavior and therefore can employ the same seed. AVAILABILITY: Source code is freely available at http://dna.cs.gwu.edu and from Bioinformatics online.
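
For readers unfamiliar with spaced seeds, the following Python sketch shows what a seed hit is: positions marked `1` in the pattern must match, while positions marked `0` are ignored. It is only an illustration under stated assumptions. The pattern `1101011` and the helper names `seed_key` and `spaced_seed_hits` are hypothetical, and the sketch does not attempt the seed optimization or comparison clustering that the abstract is about.

```python
from collections import defaultdict

def seed_key(seq, pos, pattern):
    """Letters of seq at the '1' positions of pattern, starting at pos."""
    return "".join(seq[pos + offset]
                   for offset, flag in enumerate(pattern) if flag == "1")

def spaced_seed_hits(query, target, pattern="1101011"):
    """Return (query_pos, target_pos) pairs whose windows agree at every
    '1' position of the (illustrative, non-optimized) seed pattern."""
    span = len(pattern)
    index = defaultdict(list)
    for t in range(len(target) - span + 1):
        index[seed_key(target, t, pattern)].append(t)
    hits = []
    for q in range(len(query) - span + 1):
        for t in index.get(seed_key(query, q, pattern), ()):
            hits.append((q, t))
    return hits

# The two windows differ at offsets 2 and 4, which the spaced pattern ignores.
spaced = spaced_seed_hits("ACGTACG", "ACTTCCG", "1101011")
contiguous = spaced_seed_hits("ACGTACG", "ACTTCCG", "1111111")
```

In this example the spaced pattern reports a hit that a contiguous seed of the same span misses, which is why the choice of pattern affects sensitivity for a given pair of species.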

11.
SUMMARY: ACGT (a comparative genomics tool) is a genomic DNA sequence comparison viewer and analyzer. It can read a pair of DNA sequences in GenBank, EMBL or FASTA formats, with or without a comparison file, and provide users with many options to view and analyze the similarities between the input sequences. It is written in Java and can be run on Unix, Linux and Windows platforms. AVAILABILITY: The ACGT program is freely available with documentation and examples at http://db.systemsbiology.net/projects/local/mhc/acgt/

12.
Hsieh MH, Goodman HM. Plant Physiology, 2002, 130(4): 1797-1806
In bacteria, the regulatory ACT domains serve as amino acid-binding sites in some feedback-regulated amino acid metabolic enzymes. We have identified a novel type of ACT domain-containing protein family in Arabidopsis whose members contain ACT domain repeats (the "ACR" protein family). There are at least eight ACR genes located on each of the five chromosomes in the Arabidopsis genome. Gene structure comparisons indicate that the ACR gene family may have arisen by gene duplications. Northern-blot analysis indicates that each member of the ACR gene family has a distinct expression pattern in various organs from 6-week-old Arabidopsis. Moreover, analyses of an ACR3 promoter-beta-glucuronidase (GUS) fusion in transgenic Arabidopsis revealed that the GUS activity formed a gradient in the developing leaves and sepals, whereas low or no GUS activity was detected in the basal regions. In 2-week-old Arabidopsis seedlings grown in tissue culture, the expression of the ACR gene family is differentially regulated by plant hormones, salt stress, cold stress, and light/dark treatment. The steady-state levels of ACR8 mRNA are dramatically increased by treatment with abscisic acid or salt. Levels of ACR3 and ACR4 mRNA are increased by treatment with benzyladenine. The amino acid sequences of Arabidopsis ACR proteins are most similar in the ACT domains to the bacterial sensor protein GlnD. The ACR proteins may function as novel regulatory or sensor proteins in plants.

13.
14.
Advances in sequencing technologies have accelerated the sequencing of new genomes, far outpacing the generation of gene and protein resources needed to annotate them. Direct comparison and alignment of existing cDNA sequences from a related species is an effective and readily available means to determine genes in the new genomes. Current spliced alignment programs are inadequate for comparing sequences between different species, owing to their low sensitivity and splice junction accuracy. A new spliced alignment tool, sim4cc, overcomes problems in the earlier tools by incorporating three new features: universal spaced seeds, to increase sensitivity and allow comparisons between species at various evolutionary distances, and powerful splice signal models and evolutionarily-aware alignment techniques, to improve the accuracy of gene models. When tested on vertebrate comparisons at diverse evolutionary distances, sim4cc had significantly higher sensitivity compared to existing alignment programs, more than 10% higher than the closest competitor for some comparisons, while being comparable in speed to its predecessor, sim4. Sim4cc can be used in one-to-one or one-to-many comparisons of genomic and cDNA sequences, and can also be effectively incorporated into a high-throughput annotation engine, as demonstrated by the mapping of 64 000 Fagus grandifolia 454 ESTs and unigenes to the poplar genome.

15.
16.
SUMMARY: We provide the graphical tool BACCardI for the construction of virtual clone maps from standard assembler output files or BLAST based sequence comparisons. This new tool has been applied to numerous genome projects to solve various problems including (a) validation of whole genome shotgun assemblies, (b) support for contig ordering in the finishing phase of a genome project, and (c) intergenome comparison between related strains when only one of the strains has been sequenced and a large insert library is available for the other. The BACCardI software can seamlessly interact with various sequence assembly packages. MOTIVATION: Genomic assemblies generated from sequence information need to be validated by independent methods such as physical maps. The time-consuming task of building physical maps can be circumvented by virtual clone maps derived from read pair information of large insert libraries.

17.
BACKGROUND: The anaerobic spirochaete Brachyspira pilosicoli causes enteric disease in avian, porcine and human hosts, amongst others. To date, the only available genome sequence of B. pilosicoli is that of strain 95/1000, a porcine isolate. In the first intra-species genome comparison within the Brachyspira genus, we report the whole genome sequence of B. pilosicoli B2904, an avian isolate, the incomplete genome sequence of B. pilosicoli WesB, a human isolate, and the comparisons with B. pilosicoli 95/1000. We also draw on incomplete genome sequences from three other Brachyspira species. Finally we report the first application of the high-throughput Biolog phenotype screening tool on the B. pilosicoli strains for detailed comparisons between genotype and phenotype. RESULTS: Feature and sequence genome comparisons revealed a high degree of similarity between the three B. pilosicoli strains, although the genomes of B2904 and WesB were larger than that of 95/1000 (~2.765, 2.890 and 2.596 Mb, respectively). Genome rearrangements were observed which correlated largely with the positions of mobile genetic elements. Through comparison of the B2904 and WesB genomes with the 95/1000 genome, features that we propose are non-essential due to their absence from 95/1000 include a peptidase, glycine reductase complex components and transposases. Novel bacteriophages were detected in the newly-sequenced genomes, which appeared to have involvement in intra- and inter-species horizontal gene transfer. Phenotypic differences predicted from genome analysis, such as the lack of genes for glucuronate catabolism in 95/1000, were confirmed by phenotyping. CONCLUSIONS: The availability of multiple B. pilosicoli genome sequences has allowed us to demonstrate the substantial genomic variation that exists between these strains, and provides an insight into genetic events that are shaping the species. In addition, phenotype screening allowed determination of how genotypic differences translated to phenotype. Further application of such comparisons will improve understanding of the metabolic capabilities of Brachyspira species.

18.
Organellar Genome Retrieval (OGRe) is a relational database of complete mitochondrial genome sequences for over 250 Metazoan species. OGRe provides a resource for the comparative analysis of mitochondrial genomes at several levels. At the sequence level, OGRe allows the retrieval of any selected set of mitochondrial genes from any selected set of species. Species are classified using a taxonomic system that allows easy selection of related groups of species. Sequence alignments are also available for some species. At the level of individual nucleotides, the system contains information on base frequencies and codon usage frequencies that can be compared between organisms. At the level of whole genomes, OGRe provides several ways of visualizing information on gene order. Diagrams illustrating the genome arrangement can be generated for any selected set of species automatically from the information in the database. Searches can be done based on gene arrangement to find sets of species that have the same order as one another. Diagrams for pairwise comparison of species can be produced that show the positions of break-points in the gene order and use colour to highlight the sections of the genome that have moved. OGRe is available from http://www.bioinf.man.ac.uk/ogre.

19.
We represent all DNA sequences as points in twelve-dimensional space in such a way that homologous DNA sequences are clustered together, from which a new genomic space is created for global comparison of the DNA sequences of millions of genes simultaneously. More specifically, based on the contents of the four nucleotides, their distances from the origin and their distribution along the sequence, a twelve-dimensional vector is assigned to any DNA sequence. The applicability of this analysis to global comparison of gene structures was tested on the myoglobin, beta-globin, histone-4, lysozyme, and rhodopsin families. Members of each family exhibit smaller vector distances relative to the distances between members of different families. The vector distance also distinguishes random sequences generated with the same base composition. Sequence comparisons showed consistency with the BLAST method. Once a new gene is discovered, we can compute its location in our genomic space. It is natural to predict that the properties of this new gene are similar to those of known genes located nearby. Biologists can then design experiments to test these properties.
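
The abstract does not give the formulas behind the twelve components, so the Python sketch below is only one plausible reading, stated as an assumption: for each of A, C, G and T it records the count (content), the mean 1-based position (distance from the origin), and a normalized second central moment of the positions (distribution along the sequence). The names `sequence_vector` and `vector_distance`, and the use of Euclidean distance, are hypothetical choices for illustration.

```python
import math

def sequence_vector(seq):
    """Map a DNA string to 12 numbers: for A, C, G, T in turn, the count,
    the mean 1-based position, and a normalized spread of positions.

    Assumed definitions; the source abstract does not specify them.
    """
    seq = seq.upper()
    n = len(seq)
    counts, means, spreads = [], [], []
    for base in "ACGT":
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        count = len(positions)
        mean = sum(positions) / count if count else 0.0
        spread = (sum((p - mean) ** 2 for p in positions) / (count * n)
                  if count else 0.0)
        counts.append(float(count))
        means.append(mean)
        spreads.append(spread)
    return counts + means + spreads

def vector_distance(u, v):
    """Euclidean distance between two 12-dimensional vectors (assumed metric)."""
    return math.dist(u, v)

v1 = sequence_vector("ACGT")
v2 = sequence_vector("AACG")
d = vector_distance(v1, v2)
```

Under these assumptions, sequences with similar composition and similar nucleotide placement receive nearby vectors, which is the clustering property the abstract describes; locating a newly discovered gene amounts to computing its vector and finding its nearest known neighbours.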

20.
An important computational technique for extracting the wealth of information hidden in human genomic sequence data is to compare the sequence with that from the corresponding region of the mouse genome, looking for segments that are conserved over evolutionary time. Moreover, the approach generalises to comparison of sequences from any two related species. The underlying rationale (which is abundantly confirmed by observation) is that a random mutation in a functional region is usually deleterious to the organism, and hence unlikely to become fixed in the population, whereas mutations in a non-functional region are free to accumulate over time. The potential value of this approach is so attractive that the public and private projects to sequence the human genome are now turning to sequencing the mouse, and you will soon be able to compare the human and mouse sequences of your favourite genomic region. We are currently witnessing an explosion of computer tools for comparative analysis of two genomic sequences. Here the capabilities of two new network servers for comparing genomic sequences from any pair of closely related species are sketched. The Syntenic Gene Prediction Program SGP-I utilises sequence comparisons to enhance the ability to locate protein coding segments in genomic data. PipMaker attempts to determine all conserved genomic regions, regardless of their function.
