Similar literature
20 similar documents found
1.
The Human Genome Project has generated extensive map and sequence data for a large number of Bacterial Artificial Chromosome (BAC) clones. In order to maximize the efficient use of the data and to minimize redundant work for the research community, The Institute for Genomic Research (TIGR) comprehensive BAC resource (cBACr) (http://www.tigr.org/tdb/BacResource/BAC_resource_intro.html) was built as an expansion of the TIGR human BAC ends database. This resource collects, integrates and reports the information on library, maps, sequence, annotation and functions for each human and mouse BAC. The current database contains 635 016 human BACs and 265 617 mouse BACs that were characterized by various approaches, among which 22 705 human clones and 1000 mouse clones have sequence and annotation data.

2.
Human BAC ends quality assessment and sequence analyses   Cited by: 8 (self-citations: 0; citations by others: 8)
Zhao S, Malek J, Mahairas G, Fu L, Nierman W, Venter JC, Adams MD. Genomics, 2000, 63(3): 321-332
End sequences from bacterial artificial chromosomes (BACs) provide highly specific sequence markers in large-scale sequencing projects. To date, we have generated >300,000 end sequences from >186,000 human BAC clones with an average read length of >460 bp for a total of 141 Mb covering approximately 4.7% of the genome. Over 60% of the clones have BAC end sequences (BESs) from both ends representing more than fivefold coverage of the human genome by the paired-end clones. Our quality assessments and sequence analyses indicate that BESs from human BAC libraries developed at The California Institute of Technology (CalTech) and Roswell Park Cancer Institute have similar properties. The analyses have highlighted differences in insert size for different segments of the CalTech library. Problems with the fidelity of tracking of sequence data back to physical clones have been observed in some subsets of the overall BES dataset. The annotation results of BESs for the contents of available genomic sequences, sequence tagged sites, expressed sequence tags, protein encoding regions, and repeats indicate that this resource will be valuable in many areas of genome research.
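The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the genome size of 3,000 Mb is an assumption (it is the value implied by 141 Mb being about 4.7% of the genome, not a number stated in the abstract), and the function names are hypothetical.

```python
# Rough consistency check of the BAC end sequence (BES) figures.
READS = 300_000        # lower bound from the abstract (">300,000")
MEAN_READ_BP = 460     # lower bound from the abstract (">460 bp")
TOTAL_BES_MB = 141     # total reported in the abstract
GENOME_MB = 3_000      # ASSUMPTION: approximate human genome size in Mb


def lower_bound_mb(reads: int, mean_read_bp: int) -> float:
    """Minimum total sequence (Mb) implied by read count and read length."""
    return reads * mean_read_bp / 1e6


def percent_of_genome(total_mb: float, genome_mb: float) -> float:
    """Total sequence expressed as a percentage of the genome size."""
    return 100 * total_mb / genome_mb


print(f"{lower_bound_mb(READS, MEAN_READ_BP):.1f} Mb")        # prints 138.0 Mb
print(f"{percent_of_genome(TOTAL_BES_MB, GENOME_MB):.1f} %")  # prints 4.7 %
```

The lower bound of 138 Mb sits just under the reported 141 Mb, as expected when both the read count and the mean read length are given as "greater than" values.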

3.
4.
5.
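Yang XL, Bai DZ, Qiu W, Dong HQ, Li DQ, Chen F, Ma RL, Hugh TB, Gao JF. 《遗传》, 2012, 34(7): 887-894
Given the known BAC (bacterial artificial chromosome) clone sequence information and predicted gene annotation for the MHC (major histocompatibility complex) region of Chinese Merino sheep, restriction fragments of six BAC clones from the MHC region of a Chinese Merino sheep genomic BAC library were used as probes to screen a Chinese Merino sheep mixed-tissue cDNA library by in situ plaque hybridization (library-to-library hybridization). The positive cDNA clones that were isolated were fully sequenced, aligned against the corresponding BAC clones with known sequence information and gene annotation, and searched for sequence similarity against the NCBI database with Blastn, with the aim of verifying the accuracy of the gene annotation and making a preliminary analysis of gene (sequence) function. Two rounds of hybridization yielded 27 positive cDNA clones (sequences). All of these sequences could be mapped to the corresponding BAC clones, and 25 of them fell within exons of annotated genes. Blastn similarity searches of the NCBI database showed that 23 of the sequences were most similar to bovine gene sequences and were closely related to immune function.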

6.
With the increasing quantities of Brassica genomic data being entered into the public domain and in preparation for the complete Brassica genome sequencing effort, there is a growing requirement for the structuring and detailed bioinformatic analysis of Brassica genomic information within a user-friendly database. At the Plant Biotechnology Centre, Melbourne, Australia, we have developed a series of tools and computational pipelines to assist in the processing and structuring of genomic data, to aid its application to agricultural biotechnology research. These tools include a sequence database, ASTRA, a sequence processing pipeline incorporating annotation against GenBank, SwissProt and Arabidopsis Gene Ontology (GO) data and tools for molecular marker discovery and comparative genome analysis. All sequences are mined for simple sequence repeat (SSR) molecular markers using 'SSR primer' and mapped onto the complete Arabidopsis thaliana genome by sequence comparison. The database may be queried using a text-based search of sequence annotation or GO terms, BLAST comparison against resident sequences, or by the position of candidate orthologues within the Arabidopsis genome. Tools have also been developed and applied to the discovery of single nucleotide polymorphism (SNP) molecular markers and the in silico mapping of Brassica BAC end sequences onto the Arabidopsis genome. Planned extensions to this resource include the integration of gene expression data and the development of an EnsEMBL-based genome viewer.
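To make the idea of mining sequences for simple sequence repeats concrete, here is a minimal sketch of SSR detection using a regular expression. It is not the 'SSR primer' tool named in the abstract and makes no claim about how that tool works; the function name, the motif length range (2 to 6 bases) and the minimum of five repeats are all assumptions chosen for illustration.

```python
import re


def find_ssrs(seq: str, min_motif: int = 2, max_motif: int = 6,
              min_repeats: int = 5) -> list[tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for each tandem repeat found.

    The lazy quantifier makes the regex prefer the shortest motif, so
    ATATATATAT is reported as AT x 5 rather than ATAT x 2.
    Simplification: a homopolymer run such as AAAAAAAAAA is reported as
    the 2-base motif AA; real SSR tools usually treat such runs separately.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, count))
    return hits


# Hypothetical test sequence: (AT)6 and (GAA)5 embedded in flanking bases.
example = "GGC" + "AT" * 6 + "CCG" + "GAA" * 5 + "T"
print(find_ssrs(example))  # prints [(3, 'AT', 6), (18, 'GAA', 5)]
```

In a real pipeline the reported coordinates would typically be passed to a primer design step so that each repeat can be amplified as a marker.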

7.
8.
We have created a federated database for genome studies of Magnaporthe grisea, the causal agent of rice blast disease, by integrating end sequence data from BAC clones, genetic marker data and BAC contig assembly data. A library of 9216 BAC clones providing >25-fold coverage of the entire genome was end sequenced and fingerprinted by HindIII digestion. The Image/FPC software package was then used to generate an assembly of 188 contigs covering >95% of the genome. The database contains the results of this assembly integrated with hybridization data of genetic markers to the BAC library. AceDB was used for the core database engine and a MySQL relational database, populated with numerical representations of BAC clones within FPC contigs, was used to create appropriately scaled images. The database is being used to facilitate sequencing efforts. It also gives researchers who are mapping known genes or other sequences of interest rapid and easy access to the fundamental organization of the M. grisea genome. This database, MagnaportheDB, can be accessed on the web at http://www.cals.ncsu.edu/fungal_genomics/mgdatabase/int.htm.

9.
Three-dimensional structures are now known within most protein families and it is likely, when searching a sequence database, that one will identify a homolog of known structure. The goal of Entrez's 3D-structure database is to make structure information and the functional annotation it can provide easily accessible to molecular biologists. To this end, Entrez's search engine provides several powerful features: (i) links between databases, for example between a protein's sequence and structure; (ii) pre-computed sequence and structure neighbors; and (iii) structure and sequence/structure alignment visualization. Here, we focus on a new feature of Entrez's Molecular Modeling Database (MMDB): graphical summaries of the biological annotation available for each 3D structure, based on the results of automated comparative analysis. MMDB is available at: http://www.ncbi.nlm.nih.gov/Entrez/structure.html.

10.
The only natural mechanism of malaria transmission in sub-Saharan Africa is the mosquito, generally Anopheles gambiae. Blocking malaria parasite transmission by stopping the development of Plasmodium in the insect vector would provide a useful alternative to the current methods of malaria control. Toward this end, it is important to understand the molecular basis of the malaria parasite refractory phenotype in An. gambiae mosquito strains. We have selected and sequenced six bacterial artificial chromosome (BAC) clones from the Pen-1 region that is the major quantitative trait locus involved in Plasmodium encapsulation. The sequence and the annotation of five overlapping BAC clones plus one adjacent, but not contiguous, clone, totaling 585 kb of genomic sequence from the centromeric end of the Pen-1 region of the PEST strain, were compared with the genome sequence of the same strain produced by the whole genome shotgun technique. This project identified 23 putative mosquito genes plus putative copies of the retrotransposable elements BEL12 and TRANSIBN1_AG in the six BAC clones. Nineteen of the predicted genes are most similar to their Drosophila melanogaster homologs while one is more closely related to vertebrate genes. Comparison of these new BAC sequences plus previously published BAC sequences to the cognate region of the assembled genome sequence identified three retrotransposons present in one sequence version but not the other. One of these elements, Indy, has not been previously described. These observations provide evidence for the recent active transposition of these elements and demonstrate the plasticity of the Anopheles genome. The BAC sequences strongly support the public whole genome shotgun assembly and automatic annotation while also demonstrating the benefit of complementary genome sequences and of human curation. Importantly, the data demonstrate the differences in the genome sequence of an individual mosquito compared to that of a hypothetical, average genome sequence generated by whole genome shotgun assembly.

11.
The increasing popularity of DNA chip technology for the study of gene expression is producing, for each experiment, a sizable quantity of numerical data to analyse and an accompanying large number of gene identifiers that should be associated with the relevant biological annotation. We describe here a website at IFOM (FIRC Institute of Molecular Oncology) where we release regularly updated annotation tables for the most used Affymetrix oligonucleotide DNA chips and for the whole Research Genetics 46K clone collection for cDNA arrays. These tables are synchronised with every new release of the mouse and human UniGene databases (NCBI; National Center for Biotechnology Information), allowing fast and easy preliminary annotation of DNA array experiments. We also report some comparative evidence about the importance of biological database synchronisation and cross-references in the process of generating annotation tables for DNA chips.

12.
SWISS-PROT, a curated protein sequence data bank, contains not only sequence data but also annotation relevant to a particular sequence. The annotation added to each entry is done by a team of biologists and comes, primarily, from articles in journals reporting the actual sequencing and sometimes characterisation. Review articles and collaboration with external experts also play a role along with the use of secondary databases like PROSITE and Pfam in addition to a variety of feature prediction methods. Annotation added by these methods is checked for relevance and likelihood to a particular sequence. The onset of genome sequencing has led to a dramatic increase in sequence data to be included in SWISS-PROT. This has led to the production of TrEMBL (Translation of the EMBL database). TrEMBL consists of entries in a SWISS-PROT format that are derived from the translation of all coding sequences in the EMBL nucleotide sequence database that are not in SWISS-PROT. Unlike SWISS-PROT entries, those in TrEMBL are awaiting manual annotation. However, rather than just representing basic sequence and source information, steps have been taken to add features and annotation automatically. In taking these steps it is hoped that TrEMBL entries are enhanced with some indication as to what a protein is, could or may be.

13.

Background

The expressed sequence tag (EST) methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets poses a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways.

Results

annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and also in a relational postgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools.

Conclusion

annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non-model species EST-sequencing projects.
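The "parse BLAST results, then retrieve annotations from a reference database" step described for annot8r can be sketched roughly as follows. This is not annot8r's code. It assumes BLAST tabular output (12 tab-separated columns, with the e-value in column 11 and the bit score in column 12), keeps only the best-scoring hit per query, and uses a plain dictionary as a stand-in for the reference database. The accessions, GO identifiers and e-value cutoff are hypothetical placeholders.

```python
def best_hits(blast_lines, max_evalue=1e-5):
    """Map each query ID to its highest-bit-score subject ID.

    ASSUMPTION: input is BLAST tabular output with the standard columns
    qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend,
    sstart, send, evalue, bitscore.
    """
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit too weak to transfer an annotation
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def annotate(hits, reference):
    """Look up the annotation list for each query's best hit."""
    return {query: reference.get(subject, []) for query, subject in hits.items()}


# Hypothetical BLAST rows and reference table, for illustration only.
rows = [
    "EST001\tACC_A\t85.0\t120\t18\t0\t1\t120\t5\t124\t1e-40\t150.0",
    "EST001\tACC_B\t60.0\t100\t40\t0\t1\t100\t5\t104\t1e-10\t70.0",
    "EST002\tACC_B\t40.0\t50\t30\t0\t1\t50\t5\t54\t0.5\t25.0",
]
reference = {"ACC_A": ["GO:PLACEHOLDER_1"], "ACC_B": ["GO:PLACEHOLDER_2"]}

print(annotate(best_hits(rows), reference))
# prints {'EST001': ['GO:PLACEHOLDER_1']}
```

Transferring annotation only from the single best hit is one simple policy among several; a tool could equally report every hit under the cutoff and leave the choice to the user.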

14.
15.
To provide a novel resource for analysis of the genome of Biomphalaria glabrata, members of the international Biomphalaria glabrata Genome Initiative (http://biology.unm.edu/biomphalaria-genome.html), working with the Arizona Genomics Institute (AGI) and supported by the National Human Genome Research Institute (NHGRI), produced a high quality bacterial artificial chromosome (BAC) library. The BB02 strain B. glabrata, a field isolate (Belo Horizonte, Minas Gerais, Brasil) that is susceptible to several strains of Schistosoma mansoni, was selfed for two generations to reduce haplotype diversity in the offspring. High molecular weight DNA was isolated from ovotestes of 40 snails, partially digested with HindIII, and ligated into pAGIBAC1 vector. The resulting B. glabrata BAC library (BG_BBa) consists of 61824 clones (136.3 kb average insert size) and provides 9.05× coverage of the 931 Mb genome. Probing with single/low copy number genes from B. glabrata and fingerprinting of selected BAC clones indicated that the BAC library sufficiently represents the gene complement. BAC end sequence data (514 reads, 299860 nt) indicated that the genome of B. glabrata contains ~63% AT, and disclosed several novel genes, transposable elements, and groups of high frequency sequence elements. This BG_BBa BAC library, available from AGI at cost to the research community, gains in relevance because BB02 strain B. glabrata is targeted for whole genome sequencing by NHGRI.
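The coverage figure quoted for the BG_BBa library follows from the usual relation between clone count, average insert size and genome size. The short sketch below reproduces it from the three numbers given in the abstract; the function name is a hypothetical one chosen for illustration.

```python
def library_coverage(n_clones: int, mean_insert_kb: float, genome_mb: float) -> float:
    """Genome equivalents held in a clone library.

    coverage = (number of clones x average insert size) / genome size
    """
    total_insert_mb = n_clones * mean_insert_kb / 1000.0  # kb -> Mb
    return total_insert_mb / genome_mb


# Values taken from the abstract: 61824 clones, 136.3 kb inserts, 931 Mb genome.
coverage = library_coverage(61824, 136.3, 931)
print(f"{coverage:.2f}x")  # prints 9.05x
```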

16.
17.
As a complement to whole-genome sequencing efforts, we are comparing targeted genomic regions among sweet orange cultivars to identify coding and conserved noncoding regions, including regulatory elements, responsible for biological features unique to this species. Here, we report the identification of 1,018 bacterial artificial chromosome (BAC) clones containing genes relevant to fruit quality from a Citrus sinensis cv. “Vaniglia” 19.3X BAC library by two-dimensional 9 × 9 overgo hybridization. To design the overgo probes, we used the “C38” expressed sequence tag assembly (http://harvest.ucr.edu/) and OligoSpawn software (http://138.23.178.42). For BAC library screening, we selected 81 overgo probes associated with unigenes that putatively code for enzymes relevant to fruit quality (flavonol, anthocyanin, carotenoid, cellulose, starch, ascorbic acid, aromatic amino acid, and lignin biosynthesis; sucrose catabolism; glycolysis; oxidative/nonoxidative pentose phosphate pathway; fatty acid biosynthesis and oxidation; Krebs cycle). Hybridization probes were pooled and hybridized in groups of intersecting rows and columns to high-density BAC filters, followed by a deconvolution process that established BAC-probe addresses. BAC addresses were obtained for 75 of the 81 overgo probes initially selected, for a total of 1,018 BAC clones, a number consistent with the depth of coverage of the BAC library. BAC end sequencing was carried out, and end-sequence pairs were mapped to their best location in the Citrus clementina genome sequence assembly using the comparative genomic database Phytozome (http://www.phytozome.net/). The BAC clones corresponding to each probe were mapped within the same scaffold as the target gene, demonstrating that the approach we used was successful in isolating the targeted genomic regions.
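The row-and-column pooling scheme described above can be illustrated with a small deconvolution sketch. It assumes the 81 probes sit in a 9 × 9 grid, that each row and each column is hybridized as one pool, and that a BAC positive in exactly one row pool and one column pool is assigned to the probe at their intersection. How the authors handled BACs positive in several pools is not stated, so the "ambiguous" bucket here is an assumption, and all BAC identifiers are hypothetical.

```python
def deconvolve(row_hits, col_hits, grid_size=9):
    """Assign BAC clones to probes from row-pool and column-pool hits.

    row_hits / col_hits map a pool index (0-based) to the set of BAC IDs
    that were positive with that pool. Returns (addresses, ambiguous):
    addresses maps BAC ID -> probe index (row * grid_size + column);
    ambiguous holds BACs without exactly one row and one column.
    """
    all_bacs = set().union(*row_hits.values(), *col_hits.values())
    addresses, ambiguous = {}, set()
    for bac in all_bacs:
        rows = [r for r, bacs in row_hits.items() if bac in bacs]
        cols = [c for c, bacs in col_hits.items() if bac in bacs]
        if len(rows) == 1 and len(cols) == 1:
            addresses[bac] = rows[0] * grid_size + cols[0]
        else:
            ambiguous.add(bac)  # needs a follow-up hybridization to resolve
    return addresses, ambiguous


# Hypothetical hits: B1 resolves cleanly, B2 lights up two columns,
# and B3 appears in a row pool but in no column pool.
row_hits = {0: {"B1"}, 2: {"B2", "B3"}}
col_hits = {4: {"B1", "B2"}, 5: {"B2"}}

addresses, ambiguous = deconvolve(row_hits, col_hits)
print(addresses)          # prints {'B1': 4}
print(sorted(ambiguous))  # prints ['B2', 'B3']
```

With this layout, 18 pooled hybridizations stand in for 81 single-probe hybridizations, which is the practical appeal of the two-dimensional design.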

18.
MOTIVATION: Because of the unique biological features of maize, a bioinformatic platform for its integrated genetic and physical map is required for storing, integrating, accessing and visualizing the underlying data. RESULTS: The goal of the Maize Mapping Project is to develop a fully integrated genetic and physical map for maize. To display this integrated map, we have developed iMap. iMap has three main components: a relational database (iMapDB), a map graphic browser (iMap Viewer) and a search utility (iMap Search). iMapDB is populated with current genetic and physical map data, describing relationships among genetic loci, molecular markers and bacterial artificial chromosome (BAC) contigs. The database also contains integrated information produced by applying a set of anchoring rules to assign BAC contigs to specific locations on the genetic map. The iMap Viewer and iMap Search functions are combined in the user interface to allow viewing and retrieving many types of genetic and physical map data. The iMap Viewer features side-by-side chromosome-based displays of the genetic map and associated BAC contigs. For each genetic locus, information about marker type or contig can be viewed via pop-up windows that feature links to external data resources. Searches can be conducted for genetic locus, probe or sequence accession number; search results include relevant map positions, anchored BAC contigs and links to the graphical display of relevant chromosomes. iMap can be accessed at http://www.maizemap.org. AVAILABILITY: The iMap utility package is available for non-commercial use upon request from the authors.

19.
A set of BAC clones spanning the human genome   Cited by: 13 (self-citations: 0; citations by others: 13)
Using the human bacterial artificial chromosome (BAC) fingerprint-based physical map, genome sequence assembly and BAC end sequences, we have generated a fingerprint-validated set of 32855 BAC clones spanning the human genome. The clone set provides coverage for at least 98% of the human fingerprint map and 99% of the current assembled sequence, and has an effective resolving power of 79 kb. We have made the clone set publicly available, anticipating that it will generally facilitate FISH or array-CGH-based identification and characterization of chromosomal alterations relevant to disease.

20.
A BAC-based integrated linkage map of the silkworm Bombyx mori   Cited by: 3 (self-citations: 0; citations by others: 3)

Background

In 2004, draft sequences of the model lepidopteran Bombyx mori were reported using whole-genome shotgun sequencing. Because of relatively shallow genome coverage, the silkworm genome remains fragmented, hampering annotation and comparative genome studies. For a more complete genome analysis, we developed extended scaffolds combining physical maps with improved genetic maps.

Results

We mapped 1,755 single nucleotide polymorphism (SNP) markers from bacterial artificial chromosome (BAC) end sequences onto 28 linkage groups using a recombining male backcross population, yielding an average inter-SNP distance of 0.81 cM (about 270 kilobases). We constructed 6,221 contigs by fingerprinting clones from three BAC libraries digested with different restriction enzymes, and assigned a total of 724 single copy genes to them by BLAST (basic local alignment search tool) search of the BAC end sequences and high-density BAC filter hybridization using expressed sequence tags as probes. We assigned 964 additional expressed sequence tags to linkage groups by restriction fragment length polymorphism analysis of a nonrecombining female backcross population. Altogether, 361.1 megabases of BAC contigs and singletons were integrated with a map containing 1,688 independent genes. A test of synteny using Oxford grid analysis with more than 500 silkworm genes revealed six versus 20 silkworm linkage groups containing eight or more orthologs of Apis versus Tribolium, respectively.

Conclusion

The integrated map contains approximately 10% of predicted silkworm genes and has an estimated 76% genome coverage by BACs. This provides a new resource for improved assembly of whole-genome shotgun data, gene annotation and positional cloning, and will serve as a platform for comparative genomics and gene discovery in Lepidoptera and other insects.
