Similar articles
 20 similar articles found (search time: 31 ms)
1.
The flood of sequence data resulting from the large number of current genome projects has increased the need for a flexible, open source genome annotation system, which so far has not existed. To account for the individual needs of different projects, such a system should be modular and easily extensible. We present a genome annotation system for prokaryote genomes, which is well tested and readily adaptable to different tasks. The modular system was developed using an object-oriented approach, and it relies on a relational database backend. Using a well-defined application programming interface (API), the system can be linked easily to other systems. GenDB supports manual as well as automatic annotation strategies. The software is currently in use in more than a dozen microbial genome annotation projects. In addition to its use as a production genome annotation system, it can be employed as a flexible framework for the large-scale evaluation of different annotation strategies. The system is open source.

2.
As sequencing technology improves, an increasing number of projects aim to generate full genome sequence, even for nonmodel taxa. These projects may be feasibly conducted at lower read depths if the alignment can be aided by previously developed genomic resources from a closely related species. We investigated the feasibility of constructing a complete mitochondrial (mt) genome without preamplification or other targeting of the sequence. Here we present a full mt genome sequence (16,463 nucleotides) for the bighorn sheep (Ovis canadensis) generated through alignment of SOLiD short-read sequences to a reference genome. Average read depth was 1240, and each base was covered by at least 36 reads. We then conducted a phylogenomic analysis with 27 other bovid mitogenomes, which placed bighorn sheep firmly in the Ovis clade. These results show that it is possible to generate a complete mitogenome by skimming a low-coverage genomic sequencing library. This technique will become increasingly applicable as the number of taxa with some level of genome sequence rises.
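
The two coverage figures quoted in this abstract (average read depth and minimum per-base depth) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the half-open `(start, end)` interval convention, and the toy reads are assumptions made for illustration, and the reference is treated as linear although a mitochondrial genome is circular.

```python
from typing import List, Tuple


def per_base_depth(reads: List[Tuple[int, int]], ref_len: int) -> List[int]:
    """Return the read depth at every reference position.

    Each read is a hypothetical half-open interval [start, end) in
    0-based reference coordinates. A difference array keeps the cost
    at O(reads + ref_len) instead of O(reads * read_length).
    """
    diff = [0] * (ref_len + 1)
    for start, end in reads:
        # Clip reads that hang over either end of the (linear) reference.
        start = max(start, 0)
        end = min(end, ref_len)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    depth = []
    running = 0
    for i in range(ref_len):
        running += diff[i]
        depth.append(running)
    return depth


def depth_summary(depth: List[int]) -> Tuple[float, int]:
    """Return (mean depth, minimum depth) over all positions."""
    return sum(depth) / len(depth), min(depth)


# Toy data: four reads on a 10-base reference (illustrative values only).
toy_reads = [(0, 5), (2, 8), (4, 10), (0, 10)]
toy_depth = per_base_depth(toy_reads, 10)
toy_mean, toy_min = depth_summary(toy_depth)
```

On real data the intervals would come from an alignment file rather than a hand-written list; the summary step stays the same.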

3.
The human genome initiative has provided the motivating force for launching sequencing projects suitable for testing various DNA-sequencing strategies, as well as motivating the development of mapping and sequencing technologies. In addition to projects targeting selected regions of the human genome, other projects are based on model organisms such as yeast, nematode and mouse. The sequencing of homologous regions of human and mouse genomes is a new approach to genome analysis, and is providing insights into gene evolution, function and regulation which could not be determined so easily from the analysis of just one species.

4.
After sequencing the human and mouse genomes, the annotation of these sequences with biological functions is an important challenge in genomic research. A major tool to analyse gene function on the organismal level is the analysis of mutant phenotypes. Because of its genetic and physiological similarity to man, the mouse has become the model organism of choice for the study of genetic diseases. In addition, there is at the moment no other vertebrate for which versatile techniques to manipulate the genome are as well developed. Several mouse mutagenesis projects have provided the proof-of-principle that a systematic and comprehensive mutagenesis of every gene in the mammalian genome will be feasible. An exhaustive functional annotation of the mammalian genome can only be achieved through a combination of phenotype- and gene-driven approaches in large- and small-scale academic and private projects. Major challenges will be to develop standardised phenotyping protocols for the clinical and pathological characterisation of mouse mutants, to improve mutation detection methods, and to disseminate resources and data. Beyond gene annotation, it will be necessary to understand how gene functions are integrated into the complex network of regulatory interactions in the cell.

5.
6.
The Arabidopsis genome sequence is scheduled for completion at the end of this year (December 2000). It will be the first higher plant genome to be sequenced, and will allow a detailed comparison with bacterial, yeast and animal genomes. Already, two of the five chromosomes have been sequenced, and we have had our first glimpse of higher eukaryotic centromeres and the structure of heterochromatin. The implications for understanding plant gene function, genome structure and genome organization are profound. This review covers the lessons learned for future genome projects and summarizes the initial findings in Arabidopsis.

7.
Although characterization of the genotype has undergone revolutionary advances as a result of the successful genome projects, the chasm between our understanding of a fully characterized gene sequence and the phenotypic repertoire of the organism is as broad and deep as it was in the pre-genomic era. There are two fundamental unsolved problems that provide the context for the challenges in relating genotype to phenotype. We address one of these and describe a generic method for constructing a system design space in which qualitatively distinct phenotypes can be identified and counted, their relative fitness analyzed and compared, and their tolerance to change measured.

8.
Fungi have now well and truly entered the genomic age. We currently know the complete DNA sequence for 18 fungal species and many more fungal genome sequencing projects are in progress. Whilst yeasts dominated the early genomic years, recently there has been a dramatic increase in filamentous fungal genome projects. The implications of this wealth of genetic information for mycologists worldwide are immense. In this review we summarise the background to fungal genome projects with an emphasis on the filamentous fungi. We discuss efforts to determine gene function and to compare genomes from different species. Since this is such a fast-moving field, useful web sites are listed that will enable the reader to keep up to date with developments.

9.
10.

Background  

At intermediate stages of genome assembly projects, when a number of contigs have been generated and their validity needs to be verified, it is desirable to align these contigs to a reference genome when one is available. The interest is not in analyzing a detailed base-level alignment between a contig and the reference genome, but rather in obtaining a rough estimate of where the contig aligns to the reference genome, specifically by identifying the starting and ending positions of that region. This information is very useful in ordering the contigs and in facilitating post-assembly analysis such as gap closure and repeat resolution. There exist programs, such as BLAST and MUMmer, that can quickly align and identify high-similarity segments between two sequences. When seen in a dot plot, these segments tend to agglomerate along a diagonal but can also be disrupted by gaps or shifted away from the main diagonal due to mismatches between the contig and the reference. Visually inspecting the dot plot to identify the regions covered by a large number of contigs from sequence assembly projects is tedious and practically impossible. A forced global alignment between a contig and the reference is not only time consuming but often meaningless.
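
The problem stated in this Background (turning a set of high-similarity segments into approximate start and end positions on the reference) can be sketched as follows. The abstract does not describe the authors' method, so this is only one plausible heuristic under stated assumptions: segments are grouped by their dot-plot diagonal (`ref_start - contig_start`), the group with the most matched bases is kept, and its extent on the reference is reported. The `Segment` type, the `max_shift` tolerance, and the toy coordinates are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    """A hypothetical gap-free match between a contig and the reference."""
    contig_start: int  # 0-based start on the contig
    ref_start: int     # 0-based start on the reference
    length: int        # number of matched bases


def estimate_placement(segments: List[Segment],
                       max_shift: int = 50) -> Tuple[int, int]:
    """Estimate the half-open reference interval [start, end) of a contig.

    Segments are sorted by diagonal and chained while consecutive
    diagonals differ by at most max_shift (an assumed tolerance for
    small indels). The chain with the most matched bases wins.
    """
    if not segments:
        raise ValueError("at least one segment is required")

    def diagonal(seg: Segment) -> int:
        return seg.ref_start - seg.contig_start

    ordered = sorted(segments, key=diagonal)
    clusters = [[ordered[0]]]
    for seg in ordered[1:]:
        if diagonal(seg) - diagonal(clusters[-1][-1]) <= max_shift:
            clusters[-1].append(seg)
        else:
            clusters.append([seg])

    # Keep the cluster supported by the largest number of matched bases.
    best = max(clusters, key=lambda c: sum(s.length for s in c))
    start = min(s.ref_start for s in best)
    end = max(s.ref_start + s.length for s in best)
    return start, end


# Toy input: three segments near one diagonal plus one spurious hit.
toy_segments = [
    Segment(contig_start=0, ref_start=1000, length=200),
    Segment(contig_start=250, ref_start=1260, length=300),
    Segment(contig_start=600, ref_start=1605, length=150),
    Segment(contig_start=100, ref_start=9000, length=40),
]
toy_placement = estimate_placement(toy_segments)
```

Grouping by diagonal is what lets the short spurious hit at reference position 9000 be ignored instead of stretching the reported interval. This sketch ignores reverse-complement matches, which a practical tool would need to handle.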

11.
Protozoan parasites cause some of the most devastating diseases world-wide. It has now been recognised that a major effort is needed to control or eliminate these diseases. Genome projects for the most important protozoan parasites have been initiated in the hope that the read-out of these projects will help to understand the biology of the parasites and identify new targets for urgently needed drugs. Here, I will review the current status of protozoan parasite genome projects, present findings obtained as a result of the availability of genomic data and discuss the potential impact of genome information on disease control.

12.
This Short Communication highlights the diversity of 'secondary' genome data (like mitochondrial and plastid genomes) that can be gleaned from next-generation sequencing projects, and encourages researchers to be mindful that these data are often as informative and useful as the 'primary' genome data.

13.
The draft sequence of several complete protozoan genomes is now available and genome projects are ongoing for a number of other species. Different strategies are being implemented to identify and annotate protein coding and RNA genes in these genomes, as well as study their genomic architecture. Since the genomes vary greatly in size, GC-content, nucleotide composition, and degree of repetitiveness, genome structure is often a factor in choosing the methodology utilised for annotation. In addition, the approach taken is dictated, to a greater or lesser extent, by the particular reasons for carrying out genome-wide analyses and the level of funding available for projects. Nevertheless, these projects have provided a plethora of material that will aid in understanding the biology and evolution of these parasites, as well as identifying new targets that can be used to design urgently required drug treatments for the diseases they cause.

14.
The past decade has seen the completion of numerous whole-genome sequencing projects, beginning with bacterial genomes and continuing with eukaryotic species from different phyla: fungi, plants and animals. In addition, more biological information is produced and shared thanks to information exchange systems, and more biological concepts, as well as more bioinformatics tools, are available. In this article, we describe how evolutionary biology concepts, as well as computer science, are useful for a better understanding of biology in general and genome annotation in particular. The genome annotation process consists of taking the raw DNA sequence produced, for example, by genome sequencing projects, adding the layers of analysis and interpretation necessary to extract its biological significance, and placing it in the context of our understanding of biological processes. Genome annotation is a multistep process falling into two broad categories: structural and functional annotation.

15.
16.
17.
The well-established inaccuracy of purely computational methods for annotating genome sequences necessitates an interactive tool to allow biological experts to refine these approximations by viewing and independently evaluating the data supporting each annotation. Apollo was developed to meet this need, enabling curators to inspect genome annotations closely and edit them. FlyBase biologists successfully used Apollo to annotate the Drosophila melanogaster genome and it is increasingly being used as a starting point for the development of customized annotation editing tools for other genome projects.

18.
With the expansion of next‐generation sequencing technology and advanced bioinformatics, there has been a rapid growth of genome sequencing projects. However, while this technology enables the rapid and cost‐effective assembly of draft genomes, the quality of these assemblies usually falls short of gold standard genome assemblies produced using the more traditional BAC by BAC and Sanger sequencing approaches. Assembly validation is often performed by the physical anchoring of genetically mapped markers, but this is prone to errors and the resolution is usually low, especially towards centromeric regions where recombination is limited. New approaches are required to validate reference genome assemblies. The ability to isolate individual chromosomes combined with next‐generation sequencing permits the validation of genome assemblies at the chromosome level. We demonstrate this approach by the assessment of the recently published chickpea kabuli and desi genomes. While previous genetic analysis suggests that these genomes should be very similar, a comparison of their chromosome sizes and published assemblies highlights significant differences. Our chromosomal genomics analysis highlights short defined regions that appear to have been misassembled in the kabuli genome and identifies large‐scale misassembly in the draft desi genome. The integration of chromosomal genomics tools within genome sequencing projects has the potential to significantly improve the construction and validation of genome assemblies. The approach could be applied both to new genome assemblies and to published assemblies, and complements currently applied genome assembly strategies.

19.
Rates of DNA Duplication and Mitochondrial DNA Insertion in the Human Genome   (total citations: 11; self-citations: 0; citations by others: 11)
The hundreds of mitochondrial pseudogenes in the human nuclear genome sequence (numts) constitute an excellent system for studying and dating DNA duplications and insertions. These pseudogenes are associated with many complete mitochondrial genome sequences and through those with a good fossil record. By comparing individual numts with primate and other mammalian mitochondrial genome sequences, we estimate that these numts arose continuously over the last 58 million years. Our pairwise comparisons between numts suggest that most human numts arose from different mitochondrial insertion events and not by DNA duplication within the nuclear genome. The nuclear genome appears to accumulate mtDNA insertions at a rate high enough to predict within-population polymorphism for the presence/absence of many recent mtDNA insertions. Pairwise analysis of numts and their flanking DNA produces an estimate for the DNA duplication rate in humans of 2.2 × 10⁻⁹ per numt per year. Thus, a nucleotide site is about as likely to be involved in a duplication event as it is to change by point substitution. This estimate of the rate of DNA duplication of noncoding DNA is based on sequences that are not in duplication hotspots, and is close to the rate reported for functional genes in other species.

20.
GOLD is a comprehensive resource for accessing information related to completed and ongoing genome projects world-wide. The database currently provides information on 350 genome projects, of which 48 have been completely sequenced and their analysis published. GOLD was created in 1997 and since April 2000 it has been licensed to Integrated Genomics. The database is freely available through the URL: http://igweb.integratedgenomics.com/GOLD/.
