Similar articles
20 similar articles found.
1.
Recent advances in both clone fingerprinting and draft sequencing technology have made it increasingly common for species to have a bacterial artificial chromosome (BAC) fingerprint map, BAC end sequences (BESs) and draft genomic sequence. The FPC (fingerprinted contigs) software package contains three modules that maximize the value of these resources. The BSS (blast some sequence) module provides a way to easily view the results of aligning draft sequence to the BESs, and integrates the results with the following two modules. The MTP (minimal tiling path) module uses sequence and fingerprints to determine a minimal tiling path of clones. The DSI (draft sequence integration) module aligns draft sequences to FPC contigs, displays them alongside the contigs and identifies potential discrepancies; the alignment can be based on either individual BES alignments to the draft, or on the locations of BESs that have been assembled into the draft. FPC also supports high-throughput fingerprint map generation as its time-intensive functions have been parallelized for Unix-based desktops or servers with multiple CPUs. Simulation results are provided for the MTP, DSI and parallelization. These features are in the FPC V9.3 software package, which is freely available.
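
As a rough illustration of what a minimal tiling path is, the sketch below picks a small set of overlapping clones that spans a contig using a greedy interval-cover pass. It is not FPC's MTP algorithm, which works from fingerprints and BES alignments rather than known coordinates; the function name, the `min_overlap` parameter, and the clone names and coordinates are all hypothetical.

```python
def greedy_tiling_path(clones, min_overlap=0):
    """Pick a small set of overlapping clones that spans a contig.

    clones: list of (name, start, end) tuples in contig coordinates
            (hypothetical input; FPC itself works from fingerprints).
    min_overlap: required overlap between consecutive clones.
    Returns the clone names along the chosen path, left to right.
    """
    if not clones:
        return []
    contig_start = min(c[1] for c in clones)
    contig_end = max(c[2] for c in clones)

    # Begin with the clone at the left edge that reaches farthest right.
    first = max((c for c in clones if c[1] == contig_start), key=lambda c: c[2])
    path = [first]
    covered = first[2]

    while covered < contig_end:
        # Clones that overlap the covered region enough and extend it.
        candidates = [
            c for c in clones
            if c[1] <= covered - min_overlap and c[2] > covered
        ]
        if not candidates:
            raise ValueError("no clone extends the path; the contig has a gap")
        best = max(candidates, key=lambda c: c[2])
        path.append(best)
        covered = best[2]

    return [c[0] for c in path]


# Hypothetical clones on a 320-unit contig.
example_clones = [
    ("a", 0, 100),
    ("b", 50, 180),
    ("c", 90, 200),
    ("d", 150, 300),
    ("e", 190, 320),
]
print(greedy_tiling_path(example_clones))
print(greedy_tiling_path(example_clones, min_overlap=20))
```

Raising `min_overlap` trades a longer path for safer joins between neighbouring clones, which is the kind of trade-off a tiling-path picker has to make.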

2.
Ustilago maydis, a basidiomycete, is a model organism among phytopathogenic fungi. A physical map of U. maydis strain 521 was developed from bacterial artificial chromosome (BAC) clones. BAC fingerprints used polyacrylamide gel electrophoresis to separate restriction fragments. Fragments were labeled at the HindIII site and co-digested with HaeIII to reduce fragments to 50-750 bp. Contiguous overlapping sets of clones (contigs) were assembled at nine stringencies (from P ≤ 1 × 10^-6 to 1 × 10^-24). Each assembly nucleated contigs with different percentages of bands overlapping between clones (from 20% to 97%). The number of clones per contig decreased linearly from 41 to 12 from P ≤ 1 × 10^-7 to 1 × 10^-12. The number of separate contigs increased from 56 to 150 over the same range. A hybridization-based physical map of the same BAC clones was compared with the fingerprint contigs built at P ≤ 1 × 10^-7. The two methods provided consistent physical maps that were largely validated by genome sequence. The combined hybridization and fingerprint physical map provided a minimum tile path composed of 258 BAC clones (18-20 Mbp) distributed among 28 merged contigs. The genome of U. maydis was estimated to be 20.5 Mbp by pulsed-field gel electrophoresis and 24 Mbp by BAC fingerprints. There were 23 separate chromosomes inferred by both pulsed-field gel electrophoresis and fingerprint contigs. Only 11 of the tile path BAC clones contained recognizable centromere, telomere, and subtelomere repeats (high-copy DNA), suggesting that repeats caused some false merges. There were 247 tile path BAC clones that encompassed about 17.5 Mbp of low-copy DNA sequence. BAC clones are available for repeat and unique gene cluster analysis including tDNA-mediated transformation. Program FingerPrint Contigs maps aligned with each chromosome can be viewed at http://www.siu.edu/~meksem/ustilago_maydis/.
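
To give a feel for what a stringency such as P ≤ 1 × 10^-7 means, the sketch below computes a coincidence probability for two fingerprints in the binomial form usually described as the Sulston score. The abstract does not state the formula, so the formula itself is an assumption here, and the band counts, tolerance and gel length are hypothetical values chosen only for illustration.

```python
from math import comb


def sulston_score(bands_a, bands_b, matches, tolerance, gel_length):
    """Probability that at least `matches` bands coincide by chance.

    Assumed form (not taken from the abstract):
      b = 2 * tolerance / gel_length      chance two given bands coincide
      p = (1 - b) ** n_high               a band matches nothing in the larger clone
      score = sum_{m=matches}^{n_low} C(n_low, m) * (1 - p)**m * p**(n_low - m)
    """
    n_low, n_high = sorted((bands_a, bands_b))
    b = 2 * tolerance / gel_length
    p_no_match = (1 - b) ** n_high
    p_match = 1 - p_no_match
    return sum(
        comb(n_low, m) * p_match ** m * p_no_match ** (n_low - m)
        for m in range(matches, n_low + 1)
    )


# Hypothetical clone pair: 30 and 35 bands, tolerance 7, gel length 3300.
cutoff = 1e-7  # one of the stringencies quoted in the abstract
for shared in (5, 10, 15, 20):
    score = sulston_score(30, 35, shared, tolerance=7, gel_length=3300)
    print(shared, f"{score:.2e}", "overlap" if score <= cutoff else "no call")
```

Under this model a smaller cutoff demands more shared bands before two clones are joined, which is consistent with the abstract's report of fewer clones per contig and more separate contigs at higher stringency.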

3.
We have created a federated database for genome studies of Magnaporthe grisea, the causal agent of rice blast disease, by integrating end sequence data from BAC clones, genetic marker data and BAC contig assembly data. A library of 9216 BAC clones providing >25-fold coverage of the entire genome was end sequenced and fingerprinted by HindIII digestion. The Image/FPC software package was then used to generate an assembly of 188 contigs covering >95% of the genome. The database contains the results of this assembly integrated with hybridization data of genetic markers to the BAC library. AceDB was used for the core database engine and a MySQL relational database, populated with numerical representations of BAC clones within FPC contigs, was used to create appropriately scaled images. The database is being used to facilitate sequencing efforts. It also gives researchers who are mapping known genes or other sequences of interest rapid and easy access to the fundamental organization of the M. grisea genome. This database, MagnaportheDB, can be accessed on the web at http://www.cals.ncsu.edu/fungal_genomics/mgdatabase/int.htm.

4.
PGAAS: a prokaryotic genome assembly assistant system
MOTIVATION: In order to accelerate the finishing phase of genome assembly, especially for the whole genome shotgun approach of prokaryotic species, we have developed a software package designated prokaryotic genome assembly assistant system (PGAAS). The approach upon which PGAAS is based is to confirm the order of contigs and fill gaps between contigs through peptide links obtained by searching each contig end with BLASTX against protein databases. RESULTS: We used the contig dataset of the cyanobacterium Synechococcus sp. strain PCC7002 (PCC7002), which was sequenced with six-fold coverage and assembled using the Phrap package. The subject database is the protein database of the cyanobacterium Synechocystis sp. strain PCC6803 (PCC6803). We found more than 100 non-redundant peptide segments, each of which can link at least 2 contigs. We tested one pair of linked contigs by sequencing and obtained a satisfactory result. PGAAS provides a graphical user interface to show the bridge peptides and pier contigs. We integrated Primer3 into our package to design PCR primers at the adjacent ends of the pier contigs. AVAILABILITY: We tested PGAAS on a PC running Linux (Redhat 6.2). It is developed with free software (MySQL, PHP and Apache). The whole package is distributed freely and can be downloaded as a UNIX compressed file: ftp://ftp.cbi.pku.edu.cn/pub/software/unix/pgaas1.0.tar.gz. The package is being continually updated.

5.
We have constructed a soybean bacterial artificial chromosome (BAC) library using the plant introduction (PI) 437654. The library contains 73,728 clones stored in 192 384-well microtiter plates. A random sampling of 230 BACs indicated an average insert size of 136 kb with a range of 20 to 325 kb, and less than 4% of the clones do not contain inserts. Ninety percent of BAC clones in the library have an insert size greater than 100 kb. Based on a genome size of 1115 Mb, library coverage is 9 haploid genome equivalents. Screening the BAC library colony filters with cpDNA sequences showed that contamination of the genomic library with chloroplast clones was low (1.85%). Library screening with three genomic RFLP probes linked to soybean cyst nematode (SCN) resistance genes resulted in an average of 18 hits per probe (range 7 to 30). Two separate pools of forward and reverse suppression subtractive cDNAs obtained from SCN-infected and uninfected roots of PI 437654 were hybridized to the BAC library filters. The 488 BACs identified from positive signals were fingerprinted and analyzed using FPC software (version 4.0), resulting in 85 different contigs. Contigs were grouped and analyzed in three categories: (1) contigs of BAC clones which hybridized to forward subtracted cDNAs, (2) contigs of BAC clones which hybridized to reverse subtracted cDNAs, and (3) contigs of BAC clones which hybridized to both forward and reverse subtracted cDNAs. This protocol provides an estimate of the number of genomic regions involved in early resistance response to a pathogenic attack.
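
The coverage figure quoted above can be checked with simple arithmetic: clones times mean insert size, divided by genome size. The sketch below uses only the numbers given in the abstract; the function name and the optional `empty_fraction` correction are illustrative assumptions, not part of the original study.

```python
def library_coverage(n_clones, mean_insert_kb, genome_mb, empty_fraction=0.0):
    """Genome equivalents held in a clone library.

    empty_fraction is an assumed correction for clones without inserts.
    """
    total_insert_mb = n_clones * (1 - empty_fraction) * mean_insert_kb / 1000
    return total_insert_mb / genome_mb


# Figures from the abstract: 192 plates of 384 wells, 136 kb inserts, 1115 Mb genome.
n_clones = 192 * 384
coverage = library_coverage(n_clones, 136, 1115)
print(n_clones)            # 73728
print(f"{coverage:.1f}")   # 9.0

# Upper bound on the effect of empty clones ("less than 4%" in the abstract).
print(f"{library_coverage(n_clones, 136, 1115, empty_fraction=0.04):.1f}")
```

The plate count also confirms the clone total: 192 plates of 384 wells hold exactly 73,728 clones.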

6.
MOTIVATION: When analysing novel protein sequences, it is now essential to extend search strategies to include a range of 'secondary' databases. Pattern databases have become vital tools for identifying distant relationships in sequences, and hence for predicting protein function and structure. The main drawback of such methods is the relatively small representation of proteins in trial samples at the time of their construction. Therefore, a negative result of an amino acid sequence comparison with such a databank forces a researcher to search for similarities in the original protein banks. We developed a database of patterns constructed for groups of related proteins with maximum representation of amino acid sequences of SWISS-PROT in the groups. RESULTS: Software tools and a new method have been designed to construct patterns of protein families. Using this method, a new version of the databank of protein family patterns, PROF_PAT 1.3, was produced. This bank is based on SWISS-PROT (r1.38) and TrEMBL (r1.11), and contains patterns of more than 13 000 groups of related proteins in a format similar to that of PROSITE. Motifs of patterns that had the minimum probability of being found in random sequences were selected. A flexible, fast search program accompanies the bank. The researcher can specify a similarity matrix (of the PAM, BLOSUM or another type). Variable levels of similarity can be set (permitting search strategies ranging from exact matches to increasing levels of 'fuzziness'). AVAILABILITY: The Internet address for comparing sequences with the bank is: http://wwwmgs.bionet.nsc.ru/mgs/programs/prof_pat/. The local version of the bank and search programs (approximately 50 Mb) is available via ftp: ftp://ftp.bionet.nsc.ru/pub/biology/vector/prof_pat/, and ftp://ftp.ebi.ac.uk/pub/databases/prof_pat/. Another appropriate way for its external use is to mail amino acid sequences to bachin@vector.nsc.ru for comparison with PROF_PAT 1.3.

7.
A BAC-based physical map of the channel catfish genome
Xu P, Wang S, Liu L, Thorsen J, Kucuktas H, Liu Z. Genomics 2007, 90(3):380-388
Catfish is the major aquaculture species in the United States. To enhance its genome studies involving genetic linkage and comparative mapping, a bacterial artificial chromosome (BAC) contig-based physical map of the channel catfish (Ictalurus punctatus) genome was generated using four-color fluorescence-based fingerprints. Fingerprints of 34,580 BAC clones (5.6x genome coverage) were generated for the FPC assembly of the BAC contigs. A total of 3307 contigs were assembled using a cutoff value of 1 × 10^-20. Each contig contains an average of 9.25 clones with an average size of 292 kb. The combined contig size for all contigs was 0.965 Gb, approximately the genome size of the channel catfish. The reliability of the contig assembly was assessed by both hybridization of gene probes to BAC clones contained in the fingerprinted assembly and validation of randomly selected contigs using overgo probes designed from BAC end sequences. The presented physical map should greatly enhance genome research in the catfish, particularly aiding in the identification of genomic regions containing genes underlying important performance traits.

8.
Draft sequence derived from the 46-Mb gene-rich euchromatic portion of human chromosome 19 (HSA19) was utilized to generate a sequence-ready physical map spanning homologous regions of mouse chromosomes. Sequence similarity searches with the human sequence identified more than 1000 individual orthologous mouse genes from which 382 overgo probes were developed for hybridization. Using human gene order and spacing as a model, these probes were used to isolate and assemble bacterial artificial chromosome (BAC) clone contigs spanning homologous mouse regions. Each contig was verified, extended, and joined to neighboring contigs by restriction enzyme fingerprinting analysis. Approximately 3000 mouse BACs were analyzed and assembled into 44 contigs with a combined length of 41.4 Mb. These BAC contigs, covering 90% of HSA19-related mouse DNA, are distributed throughout 15 homology segments derived from different regions of mouse chromosomes 7, 8, 9, 10, and 17. The alignment of the HSA19 map with the ordered mouse BAC contigs revealed a number of structural differences in several overtly conserved homologous regions and more precisely defined the borders of the known regions of HSA19-syntenic homology. Our results demonstrate that given a human draft sequence, BAC contig maps can be constructed quickly for comparative sequencing without the need for preestablished mouse-specific genetic or physical markers and indicate that similar strategies can be applied with equal success to genomes of other vertebrate species.

9.
Bacterial artificial chromosome (BAC) clones are effective mapping and sequencing reagents for use with a wide variety of small and large genomes. This report describes research aimed at determining the genome structure of Ochrobactrum anthropi, an opportunistic human pathogen that has potential applications in biodegradation of hazardous organic compounds. A BAC library for O. anthropi was constructed that provides a 70-fold genome coverage based on an estimated genome size of 4.8 Mb. The library contains 3072 clones with an average insert size of 112 kb. High-density colony filters of the library were made, and a physical map of the genome was constructed using a hybridization without replacement strategy. In addition, 1536 BAC clones were fingerprinted with HindIII and analyzed using IMAGE and Fingerprint Contig software (FPC, Sanger Centre, U.K.). The FPC results supported the hybridization data, resulting in the formation of two major contigs representing the two major replicons of the O. anthropi genome. After determining a reduced tiling path, 138 BAC ends from the reduced tile were sequenced for a preliminary gene survey. A search of the public databases with the BLASTX algorithm resulted in 77 strong hits (E-value < 0.001), of which 89% showed similarity to a wide variety of prokaryotic genes. These results provide a contig-based physical map to assist the cloning of important genomic regions and the potential sequencing of the O. anthropi genome.

10.
Trichoderma reesei is an important industrial fungus known for its ability to efficiently secrete large quantities of protein as well as its wide variety of biomass degrading enzymes. Past research on this fungus has primarily focused on extending its protein production capabilities, leaving the structure of its 33 Mb genome essentially a mystery. To begin to address these deficiencies and further our knowledge of T. reesei's secretion and cellulolytic potential, we have created a genomic framework for this fungus. We constructed a BAC library containing 9216 clones with an average insert size of 125 kb which provides a coverage of 28 genome equivalents. BAC ends were sequenced and annotated using publicly available software which identified a number of genes not seen in previously sequenced EST datasets. Little evidence was found for repetitive sequence in T. reesei with the exception of several copies of an element with similarity to the Podospora anserina transposon, PAT. Hybridization of 34 genes involved in biomass degradation revealed five groups of co-located genes in the genome. BAC clones were fingerprinted and analyzed using fingerprinted contigs (FPC) software resulting in 334 contigs covering 28 megabases of the genome. The assembly of these FPC contigs was verified by congruence with hybridization results.

11.
12.
Three maize (Zea mays) bacterial artificial chromosome (BAC) libraries were constructed from inbred line B73. High-density filter sets from all three libraries, made using different restriction enzymes (HindIII, EcoRI, and MboI, respectively), were evaluated with a set of complex probes including the 185-bp knob repeat, ribosomal DNA, two telomere-associated repeat sequences, four centromere repeats, the mitochondrial genome, a multifragment chloroplast DNA probe, and bacteriophage lambda. The results indicate that the libraries are of high quality with low contamination by organellar and lambda sequences. The use of libraries from multiple enzymes increased the chance of recovering each region of the genome. Ninety maize restriction fragment-length polymorphism core markers were hybridized to filters of the HindIII library, representing 6x coverage of the genome, to initiate development of a framework for anchoring BAC contigs to the intermated B73 x Mo17 genetic map and to mark the bin boundaries on the physical map. All of the clones used as hybridization probes detected at least three BACs. Twenty-two single-copy number core markers identified an average of 7.4 +/- 3.3 positive clones, consistent with the expectation of six clones. This information was integrated into fingerprinting data generated by the Arizona Genomics Institute to assemble the BAC contigs using the fingerprinted contigs (FPC) software, and it contributed to the process of physical map construction.

13.
Whole-genome sequencing of the soybean (Glycine max (L.) Merr. 'Williams 82') has made it important to integrate its physical and genetic maps. To facilitate this integration of maps, we screened 3290 microsatellites (SSRs) identified from BAC end sequences of clones comprising the 'Williams 82' physical map. SSRs were screened against 3 mapping populations. We found the AAT and ACT motifs produced the greatest frequency of length polymorphisms, ranging from 17.2% to 32.3% and from 11.8% to 33.3%, respectively. Other useful motifs include the dinucleotide repeats AT, AG, and GT, with frequency of length polymorphisms ranging from 11.2% to 18.4% (AT), 12.4% to 20.6% (AG), and 11.3% to 16.4% (GT). Repeat lengths less than 16 bp were generally less useful than repeat lengths of 40-60 bp. Two hundred and sixty-five SSRs were genetically mapped in at least one population. Of the 265 mapped SSRs, 60 came from BAC singletons not yet placed into contigs of the physical map. One hundred and ten originated in BACs located in contigs for which no genetic map location was previously known. Ninety-five SSRs came from BACs within contigs for which one or more other BACs had already been mapped. For these fingerprinted contigs (FPC) a high percentage of the mapped markers showed inconsistent map locations. A strategy is introduced by which physical and genetic map inconsistencies can be resolved using the preliminary 4x assembly of the whole genome sequence of soybean.

14.
Three Chinese chestnut bacterial artificial chromosome (BAC) libraries were developed and used for physical map construction. Specifically, high information content fingerprinting was used to assemble 126,445 BAC clones into 1,377 contigs and 12,919 singletons. Integration of the dense Chinese chestnut genetic map with the physical map was achieved via high-throughput hybridization using overgo probes derived from sequence-based genetic markers. A total of 1,026 probes were anchored to the physical map including 831 probes corresponding to 878 expressed sequence tag-based markers. Within the physical map, three BAC contigs were anchored to the three major fungal blight-resistant quantitative trait loci on chestnut linkage groups B, F, and G. A subset of probes corresponding to orthologous genes in poplar showed only a limited amount of conserved gene order between the poplar and chestnut genomes. The integrated genetic and physical map of Chinese chestnut is available at www.fagaceae.org/physical_maps.

15.
We have developed an automated, high-throughput fingerprinting technique for large genomic DNA fragments suitable for the construction of physical maps of large genomes. In the technique described here, BAC DNA is isolated in a 96-well plate format and simultaneously digested with four 6-bp-recognizing restriction endonucleases that generate 3' recessed ends and one 4-bp-recognizing restriction endonuclease that generates a blunt end. Each of the four recessed 3' ends is labeled with a different fluorescent dye, and restriction fragments are sized on a capillary DNA analyzer. The resulting fingerprints are edited with a fingerprint-editing computer program and contigs are assembled with the FPC computer program. The technique was evaluated by repeated fingerprinting of several BACs included as controls in plates during routine fingerprinting of a BAC library and by reconstruction of contigs of rice BAC clones with known positions on rice chromosome 10.

16.
Maize is a major cereal crop and an important model system for basic biological research. Knowledge gained from maize research can also be used to genetically improve its grass relatives such as sorghum, wheat, and rice. The primary objective of the Maize Genome Sequencing Consortium (MGSC) was to generate a reference genome sequence that was integrated with both the physical and genetic maps. Using a previously published integrated genetic and physical map, combined with incoming maize genomic sequence, new sequence-based genetic markers, and an optical map, we dynamically picked a minimum tiling path (MTP) of 16,910 bacterial artificial chromosome (BAC) and fosmid clones that were used by the MGSC to sequence the maize genome. The final MTP resulted in a significantly improved physical map that reduced the number of contigs from 721 to 435, incorporated a total of 8,315 mapped markers, and ordered and oriented the majority of FPC contigs. The new integrated physical and genetic map covered 2,120 Mb (93%) of the 2,300-Mb genome, of which 405 contigs were anchored to the genetic map, totaling 2,103.4 Mb (99.2% of the 2,120 Mb physical map). More importantly, 336 contigs, comprising 94.0% of the physical map (∼1,993 Mb), were ordered and oriented. Finally, we used all available physical, sequence, genetic, and optical data to generate a golden path (AGP) of chromosome-based pseudomolecules, herein referred to as the B73 Reference Genome Sequence version 1 (B73 RefGen_v1).

17.
Many clone-based physical maps have been built with the FingerPrinted Contig (FPC) software, which is written in C and runs locally for fast and flexible analysis. If the maps were viewable only from FPC, they would not be as useful to the whole community, since FPC must be installed on the user's machine and the database downloaded. Hence, we have created a set of Web tools so users can easily view the FPC data and perform salient queries with standard browsers. This set includes the following four programs: WebFPC, a view of the contigs; WebChrom, the location of the contigs and genetic markers along the chromosome; WebBSS, locating user-supplied sequence on the map; and WebFCmp, comparing fingerprints. For additional FPC support, we have developed an FPC module for BioPerl and an FPC browser using the Generic Model Organism Project (GMOD) genome browser (GBrowse), where the FPC BioPerl module generates the data files for input into GBrowse. This provides an alternative to the WebChrom/WebFPC view. These tools are available to download along with documentation. The tools have been implemented for both the rice (Oryza sativa) and maize (Zea mays) FPC maps, which both contain the locations of clones, markers, genetic markers, and sequenced clones (along with links to sites that contain additional information).

18.
A second-generation linkage map was constructed for the silkworm, Bombyx mori, focusing on mapping Bombyx sequences appearing in public nucleotide databases and bacterial artificial chromosome (BAC) contigs. A total of 874 BAC contigs containing 5067 clones (22% of the library) were constructed by PCR-based screening with sequence-tagged sites (STSs) derived from whole-genome shotgun (WGS) sequences. A total of 523 BAC contigs, including 342 independent genes registered in public databases and 85 expressed sequence tags (ESTs), were placed onto the linkage map. We found significant synteny and conserved gene order between B. mori and a nymphalid butterfly, Heliconius melpomene, in four linkage groups (LGs), strongly suggesting that using B. mori as a reference for comparative genomics in Lepidoptera is highly feasible.

19.
We have compiled the DNA sequence data for Escherichia coli available from the GenBank and EMBL data libraries and independently from the literature. Unlike the previous updates of our E. coli databases, we provide the most recent version preferentially via the World Wide Web System (use URL: http://susi.bio.unigiessen.de/usr/local/www++ +/html/ecdc.html). Our database includes an assembled set of contiguous sequences. Each of these contigs compiles all available sequence information, including that derived from a variety of older sequences. The organization of the database allows one to find the exact physical location of each individual gene or regulatory region, even where the nomenclature is inconsistent. The WWW program allows access to the original EMBL and SWISSPROT datafiles. A FASTA and BLAST search may be performed online. Besides the WWW format, a flat file version may be obtained via ftp. The complete compilation, including a full set of genetic map data and the E. coli protein index, can be obtained in machine-readable form from the EMBL data library as a part of the CD-ROM issue of the EMBL sequence database, released and updated every three months. After deletion of all detected overlaps, a total of 3 333 878 individual bp was determined by the end of September 1995. This corresponds to a total of 71.71% of the entire E. coli chromosome consisting of about 4720 kbp. About 94 kbp (2%) are available additionally, but have not yet been definitely mapped.

20.

Background

The presence of closely related genomes in polyploid species makes the assembly of total genomic sequence from shotgun sequence reads produced by the current sequencing platforms exceedingly difficult, if not impossible. Genomes of polyploid species could be sequenced following the ordered-clone sequencing approach employing contigs of bacterial artificial chromosome (BAC) clones and BAC-based physical maps. Although BAC contigs can currently be constructed for virtually any diploid organism with the SNaPshot high-information-content-fingerprinting (HICF) technology, it is currently unknown if this is also true for polyploid species. It is possible that BAC clones from orthologous regions of homoeologous chromosomes would share numerous restriction fragments and be therefore included into common contigs. Because of this and other concerns, physical mapping utilizing the SNaPshot HICF of BAC libraries of polyploid species has not been pursued and the possibility of doing so has not been assessed. The sole exception has been in common wheat, an allohexaploid in which it is possible to construct single-chromosome or single-chromosome-arm BAC libraries from DNA of flow-sorted chromosomes and bypass the obstacles created by polyploidy.

Results

The potential of the SNaPshot HICF technology for physical mapping of polyploid plants utilizing global BAC libraries was evaluated by assembling contigs of fingerprinted clones in an in silico merged BAC library composed of single-chromosome libraries of two wheat homoeologous chromosome arms, 3AS and 3DS, and complete chromosome 3B. Because the chromosome arm origin of each clone was known, it was possible to estimate the fidelity of contig assembly. On average 97.78% or more clones, depending on the library, were from a single chromosome arm. A large portion of the remaining clones was shown to be library contamination from other chromosomes, a feature that is unavoidable during the construction of single-chromosome BAC libraries.
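
One way to picture the fidelity estimate described above is to take, for each contig, the share of its clones that come from the most common chromosome arm. The sketch below does this for made-up data; the contig names, clone origins and function names are hypothetical, and the abstract does not say whether the reported average was computed per contig or per clone, so the per-contig averaging here is an assumption.

```python
from collections import Counter


def contig_purity(origins):
    """Fraction of a contig's clones that come from its dominant arm."""
    counts = Counter(origins)
    dominant_count = counts.most_common(1)[0][1]
    return dominant_count / len(origins)


def mean_purity(contigs):
    """Unweighted mean of per-contig purity (assumed averaging scheme)."""
    return sum(contig_purity(o) for o in contigs.values()) / len(contigs)


# Hypothetical contigs; each entry is the known source library of one clone.
example_contigs = {
    "ctg1": ["3AS"] * 9 + ["3DS"],   # one homoeologous clone mixed in
    "ctg2": ["3B"] * 5,              # pure contig
}
for name, origins in example_contigs.items():
    print(name, contig_purity(origins))
print("mean", mean_purity(example_contigs))
```

Because the source library of every clone is known in the merged data set, any clone outside the dominant arm can be flagged directly as either a homoeologous misassembly or library contamination.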

Conclusions

The negligibly low level of incorporation of clones from homoeologous chromosome arms into a contig during contig assembly suggested that it is feasible to construct contigs and physical maps using global BAC libraries of wheat and almost certainly also of other plant polyploid species with genome sizes comparable to that of wheat. Because of the high purity of the resulting assembled contigs, they can be directly used for genome sequencing. It is currently unknown but possible that equally good BAC contigs can be also constructed for polyploid species containing smaller, more gene-rich genomes.
