Similar articles
1.
An overview of the apple genome through BAC end sequence analysis   (total citations: 1; self-citations: 0; citations by others: 1)
The apple, Malus x domestica Borkh., is one of the most important fruit trees grown worldwide. A bacterial artificial chromosome (BAC)-based physical map of the apple genome has recently been constructed. Based on this physical map, approximately 2,100 clones from different contigs (sets of overlapping BAC clones) were selected and sequenced at both ends, generating 3,744 high-quality BAC end sequences (BESs), including 1,717 BAC end pairs. Approximately 8.5% of BESs contain simple sequence repeats (SSRs), most of which are AT/TA dimer repeats. Potential transposable elements were identified in approximately 21% of BESs, and most of these elements are retrotransposons. About 11% of BESs have homology to the Arabidopsis protein database. The matched proteins cover a broad range of categories. The average GC content of the predicted coding regions of BESs is 42.4%, while that of the BESs as a whole is 39%. A small number of BES pairs were mapped to neighboring chromosome regions of Arabidopsis thaliana and Populus trichocarpa, whereas no pairs were mapped to the Oryza sativa genome. The apple has a higher degree of synteny with the closely related Populus than with the distantly related Arabidopsis. BAC end sequencing can be used to anchor a small proportion of the apple genome to the Populus and possibly to the Arabidopsis genomes.
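The GC-content and SSR statistics reported in abstracts like the one above can be illustrated with a short sketch. This is not the authors' pipeline: the function names, the restriction to perfect dinucleotide repeats, and the `MIN_REPEATS` threshold are all assumptions chosen for illustration.

```python
import re

# Assumed threshold (not from the abstract): a dinucleotide motif must be
# repeated at least this many times in tandem to count as an SSR.
MIN_REPEATS = 6
_DIMER_SSR = re.compile(r"([ACGT]{2})\1{%d,}" % (MIN_REPEATS - 1))


def gc_content(seq: str) -> float:
    """Fraction of G/C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def find_dimer_ssrs(seq: str):
    """Return (start, motif, repeat_count) for each perfect dinucleotide SSR."""
    hits = []
    for match in _DIMER_SSR.finditer(seq.upper()):
        motif = match.group(1)
        if motif[0] == motif[1]:
            # Skip homopolymer runs such as AAAAAAAAAAAA.
            continue
        hits.append((match.start(), motif, len(match.group(0)) // 2))
    return hits
```

Applied to every BES in a set, the share of sequences with a non-empty `find_dimer_ssrs` result gives an SSR-containing fraction comparable in spirit to the percentages quoted above, although real surveys usually also count tri- to hexanucleotide motifs.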

2.
Due in part to its small genome (~350 Mb), Brachypodium distachyon is emerging as a model system for temperate grasses, including important crops like wheat and barley. We present the analysis of 10.9% of the Brachypodium genome based on 64,696 bacterial artificial chromosome (BAC) end sequences (BES). Analysis of repeat DNA content in BES revealed that approximately 11.0% of the genome consists of known repetitive DNA. The vast majority of the Brachypodium repetitive elements are LTR retrotransposons. While Bare-1 retrotransposons are common to wheat and barley, Brachypodium repetitive element sequence-1 (BRES-1), closely related to Bare-1, is also abundant in Brachypodium. Moreover, unique Brachypodium repetitive element sequences identified constitute approximately 7.4% of its genome. Simple sequence repeats from BES were analyzed, and flanking primer sequences for SSR detection potentially useful for genetic mapping are available at . Sequence analyses of BES indicated that approximately 21.2% of the Brachypodium genome represents coding sequence. Furthermore, Brachypodium BES have more significant matches to ESTs from wheat than to those from rice or maize, although these species have EST collections of similar size. A phylogenetic analysis based on 335 sequences shared among seven grass species further revealed a closer relationship between Brachypodium and Triticeae than between Brachypodium and rice or maize. N. Huo and G.R. Lazo contributed equally to this work.

3.
Papaya (Carica papaya L.) is a major tree fruit crop of tropical and subtropical regions with an estimated genome size of 372 Mbp. We present the analysis of 4.7% of the papaya genome based on BAC end sequences (BESs) representing 17 million high-quality bases. Microsatellites discovered in 5,452 BESs and flanking primer sequences are available to papaya breeding programs at . Sixteen percent of BESs contain plant repeat elements, the vast majority (83.3%) of which are class I retrotransposons. Several novel papaya-specific repeats were identified. Approximately 19.1% of the BESs have homology to Arabidopsis cDNA. Increasing numbers of completely sequenced plant genomes and BES projects enable novel approaches to comparative plant genomics. Paired BESs of Carica, Arabidopsis, Populus, Brassica and Lycopersicon were mapped onto the completed genomes of Arabidopsis and Populus. In general, the level of microsynteny was highest between closely related organisms. However, papaya revealed a higher degree of apparent synteny with the more distantly related poplar than with the more closely related Arabidopsis. This, as well as significant colinearity observed between peach and poplar genome sequences, supports recent observations of frequent genome rearrangements in the Arabidopsis lineage and suggests that the poplar genome sequence may be more useful for elucidating the papaya and other rosid genomes. These insights will play a critical role in selecting species and sequencing strategies that will optimally represent crop genomes in sequence databases. Chun Wan J. Lai and Qingyi Yu have contributed equally to this work.

4.
5.
Spartina species play an important ecological role in salt marshes. Spartina maritima is an Old-World species distributed along the European and North-African Atlantic coasts. This hexaploid species (2n = 6x = 60, 2C = 3,700 Mb) hybridized with different Spartina species introduced from the American coasts, which resulted in the formation of new invasive hybrids and allopolyploids. S. maritima is therefore of evolutionary and ecological interest. However, genomic information is severely lacking in this genus. In an effort to develop genomic resources, we analysed 40,641 high-quality bacterial artificial chromosome-end sequences (BESs), representing 26.7 Mb of the S. maritima genome. BESs were searched for sequence homology against known databases. A fraction of 16.91% of the BESs represents known repeats, the majority of which are long terminal repeat (LTR) retrotransposons (13.67%). Non-LTR retrotransposons represent 0.75% and DNA transposons 0.99%, whereas small RNA, simple repeats and low-complexity sequences account for 1.38% of the analysed BESs. In addition, 4,285 simple sequence repeats were detected. Using the coding sequence database of Sorghum bicolor, 6,809 BESs showed homology, accounting for 17.1% of all BESs. Comparative genomics with related genera reveals that microsynteny is better conserved with S. bicolor than with other sequenced Poaceae: 37.6% of the paired matching BESs are correctly orientated on the S. bicolor chromosomes. We did not observe large macrosyntenic rearrangements using the mapping strategy employed. However, some regions appeared to have experienced rearrangements when comparing Spartina to Sorghum and to Oryza. This work represents the first overview of the coding and repetitive components of the S. maritima genome. The syntenic relationships with other grass genomes examined here help clarify evolution in Poaceae, S. maritima being part of the poorly known Chloridoideae subfamily.
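The idea of "correctly orientated" paired BESs can be sketched as a simple concordance check on the two end alignments of one clone. The abstract does not give the criteria actually used, so the `EndHit` record, the span window (`min_span`, `max_span`), and the function names below are hypothetical; only the general principle is standard, namely that the two ends of one insert should map to the same chromosome, face each other, and lie a plausible insert length apart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndHit:
    chrom: str   # reference chromosome the BES aligned to
    pos: int     # leftmost reference coordinate of the alignment
    strand: str  # "+" or "-"


def is_concordant(a: EndHit, b: EndHit,
                  min_span: int = 50_000, max_span: int = 300_000) -> bool:
    """True if two ends of one clone map like the ends of a single insert.

    The span window is an assumed, illustrative range; in cross-species
    mapping it would need to be tuned to the genomes being compared.
    """
    if a.chrom != b.chrom:
        return False
    left, right = sorted((a, b), key=lambda hit: hit.pos)
    if not (min_span <= right.pos - left.pos <= max_span):
        return False
    # End reads point into the insert: left end on "+", right end on "-".
    return left.strand == "+" and right.strand == "-"


def concordant_fraction(pairs) -> float:
    """Share of (EndHit, EndHit) pairs that pass is_concordant."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(is_concordant(a, b) for a, b in pairs) / len(pairs)
```

A fraction computed this way is the kind of quantity that a figure such as 37.6% summarises, although the published analysis may have used different distance and orientation rules.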

6.
Transposable elements (TEs) are viewed as major contributors to the evolution of fungal genomes. Genomic resources such as BAC libraries are an underutilized resource for studying genome-wide TE distribution. Using the BAC end sequences and physical map that are available for the rice blast fungus, Magnaporthe grisea, we describe a likelihood ratio test designed to identify clustering of TEs in the genome. Significant variation in the distribution of three TEs, MAGGY, MGL, and Pot2, was observed among the fingerprint contigs of the physical map. We utilized a draft sequence of M. grisea chromosome 7 to validate our results and found a similar pattern of clustering. By examining individual BAC end sequences, we found evidence for 11 unique integrations of MAGGY or MGL into Pot2 but no evidence for the reciprocal integration of Pot2 into another TE. This suggests that (a) the presence of Pot2 in the genome predates that of the other TEs, (b) Pot2 was less transpositionally active than other TEs, or (c) MAGGY and MGL have an integration site preference for Pot2. High transition/transversion mutation ratios as well as bias in transition site context were observed in MAGGY and MGL elements, but not in Pot2 elements. These features are consistent with the effects of a Repeat-Induced Point (RIP) mutation-like process occurring in MAGGY and MGL elements. This study illustrates the general utility of a physical map and BAC end sequences for the study of genome-wide repetitive DNA content and organization.
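The abstract does not spell out the likelihood ratio test, so the following is only a generic sketch of how such a test for clustering could look, not the authors' statistic. It assumes each contig is summarised as a pair (k, n), where n is the number of BAC end sequences in the contig and k the number containing a given TE, and it compares a single pooled TE rate against a separate rate per contig. The function name and data layout are hypothetical.

```python
import math


def lrt_homogeneity(counts):
    """Likelihood ratio (G) statistic for equal TE rates across contigs.

    counts: iterable of (k, n) pairs, one per contig.
    Returns (G, degrees_of_freedom). Under the null hypothesis of one shared
    rate, G is approximately chi-square distributed with that many degrees
    of freedom when contigs are reasonably large.
    """
    counts = list(counts)
    total_k = sum(k for k, _ in counts)
    total_n = sum(n for _, n in counts)
    pooled = total_k / total_n
    g = 0.0
    for k, n in counts:
        # Log-likelihood with a contig-specific rate minus log-likelihood
        # with the pooled rate; zero counts contribute nothing.
        if k:
            g += k * math.log((k / n) / pooled)
        if n - k:
            g += (n - k) * math.log(((n - k) / n) / (1 - pooled))
    return 2 * g, len(counts) - 1
```

A large G relative to the chi-square reference indicates that the element is unevenly distributed among contigs, which is the sense of "clustering" used above. Small contigs make the chi-square approximation unreliable, so a permutation test may be a safer choice in practice.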

7.
In an effort to increase the density of sequence-based markers for the horse genome we generated 9473 BAC end sequences (BESs) from the CHORI-241 BAC library with an average read length of 677 bp. BLASTN searches with the BESs revealed 4036 meaningful hits (E

8.
We constructed and characterized arrayed bacterial artificial chromosome (BAC) libraries of five Drosophila species (D. melanogaster, D. simulans, D. sechellia, D. auraria, and D. ananassae), which are genetically well characterized in studies of meiosis, evolution, population genetics, and developmental biology. The BAC libraries comprise 8,000 to 12,500 clones for each species, estimated to cover most of each genome. We sequenced both ends of most of these BAC clones with a success rate of 91%. Of these, 53,701 clones consisting of non-repetitive BAC end sequences (BESs) were mapped with reference to the public D. melanogaster genome sequence. The BES mapping estimated that the BAC libraries of D. auraria and D. ananassae covered 47% and 57% of the D. melanogaster genome, respectively, and those of D. melanogaster, D. sechellia, and D. simulans covered 94-97%. The low coverage by BESs of D. auraria and D. ananassae may be due to their high sequence divergence from D. melanogaster. From the comparative BES mapping, 111 possible breakpoints of chromosomal rearrangements were identified in these four species. The breakpoints of the major chromosome rearrangement between D. simulans and D. melanogaster on the third chromosome were determined within 20 kb in 84E and 30 kb in 93E/F. Corresponding breakpoints were also identified in D. sechellia. The BAC clones described here will be an important addition to the Drosophila genomic resources.

9.
In common bean, a complex disease resistance (R) gene cluster, harboring many specific R genes against various pathogens, is located at the end of linkage group B4. A BAC library of the Mesoamerican bean genotype BAT93 was screened with PRLJ1, a probe previously shown to be specific to the B4 R gene cluster, leading to the identification of 73 positive BAC clones. BAC-end sequencing (BES) of the 73 positive BACs generated 75 kb of sequence. These BACs were organized into 6 contigs, all mapped at the B4 R gene cluster. To evaluate the potential of BES for marker development, BES-derived specific primers were used to check for linkage with two allelic anthracnose R specificities, Co-3 and Co-3², through the analysis of pairs of Near Isogenic Lines (NILs). Out of 32 primer pairs tested, two revealed polymorphisms between the NILs, confirming the suspected location of Co-3 and Co-3² at the B4 cluster. In order to identify the orthologous region of the B4 R gene cluster in the two model legume genomes, bean BESs were used as queries in TBLASTX searches of Medicago truncatula and Lotus japonicus BAC clones. Putative orthologous regions were identified on chromosomes Mt6 and Lj2, in agreement with the colinearity observed between Mt and Lj for these regions.

10.
11.
12.
Belknap WR, Wang Y, Huo N, Wu J, Rockhold DR, Gu YQ, Stover E. Génome, 2011, 54(12): 1005-1015
The citrus cultivar Carrizo is the single most important rootstock to the US citrus industry and has resistance or tolerance to a number of major citrus diseases, including citrus tristeza virus, foot rot, and Huanglongbing (HLB, citrus greening). A Carrizo genomic sequence database providing approximately 3.5× genome coverage (haploid genome size approximately 367 Mb) was populated through 454 GS FLX shotgun sequencing. Analysis of the repetitive DNA fraction indicated a total interspersed repeat fraction of 36.5%. Assembly and characterization of abundant citrus Ty3/gypsy elements revealed a novel type of element containing open reading frames encoding a viral RNA-silencing suppressor protein (RNA binding protein, rbp) and a plant cytokinin riboside 5′-monophosphate phosphoribohydrolase-related protein (LONELY GUY, log). Similar gypsy elements were identified in the Populus trichocarpa genome. Gene-coding region analysis indicated that 24.4% of the nonrepetitive reads contained genic regions. The depth of genome coverage was sufficient to allow accurate assembly of constituent genes, including a putative phloem-expressed gene. The development of the Carrizo database (http://citrus.pw.usda.gov/) will contribute to characterization of agronomically significant loci and provide a publicly available genomic resource to the citrus research community.

13.
A Bacterial Artificial Chromosome (BAC) genomic DNA library of Anopheles gambiae, the major human malaria vector in sub-Saharan Africa, was constructed and characterized. This library (ND-TAM) is composed of 30,720 BAC clones in eighty 384-well plates. The estimated average insert size of the library is 133 kb, with an overall genome coverage of approximately 14-fold. The ends of approximately two-thirds of the clones in the library were sequenced, yielding 32,340 pair-mate ends. A statistical analysis (G-test) of the results of PCR screening of the library indicated a random distribution of BACs in the genome, although one gap encompassing the white locus on the X-chromosome was identified. Furthermore, combined with another previously constructed BAC library (ND-1), ~2,000 BACs have been physically mapped by polytene chromosomal in situ hybridization. These BAC end pair mates and physically mapped BACs have been useful for both the assembly of a fully sequenced A. gambiae genome and for linking the assembled sequence to the three polytene chromosomes. This ND-TAM library is now publicly available at both http://www.malaria.mr4.org/mr4pages/index.html/ and http://hbz.tamu.edu/, providing a valuable resource to the mosquito research community.
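The coverage figure in the abstract above follows from clone count, mean insert size, and genome size. The sketch below reproduces that arithmetic; the genome size of 280 Mb is an assumption introduced here (the abstract does not state it), and the function names are illustrative. The second function uses the standard approximation that, with randomly placed clones, the chance a given locus is absent from the library is about e raised to minus the fold coverage.

```python
import math


def library_coverage(n_clones: int, mean_insert_kb: float,
                     genome_size_mb: float) -> float:
    """Fold coverage = total cloned DNA / genome size (both in kb)."""
    return n_clones * mean_insert_kb / (genome_size_mb * 1000)


def prob_locus_missed(fold_coverage: float) -> float:
    """Approximate chance that a given locus is in no clone, assuming
    clones are placed uniformly at random along the genome."""
    return math.exp(-fold_coverage)


# Clone count and insert size are from the abstract; the 280 Mb genome
# size is an assumed value used only for illustration.
coverage = library_coverage(30_720, 133, 280)
missed = prob_locus_missed(coverage)
```

With these inputs the coverage lands between 14 and 15, consistent with the "approximately 14-fold" quoted above. The random-placement assumption is exactly what the G-test in the abstract examines, and the reported gap at the white locus shows that it does not hold everywhere.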

14.

Background

Thellungiella halophila (also known as Thellungiella salsuginea) is a model halophyte with a small plant size, short life cycle, and small genome. It easily undergoes genetic transformation by the floral dipping method used with its close relative, Arabidopsis thaliana. Thellungiella genes exhibit high sequence identity (approximately 90% at the cDNA level) with Arabidopsis genes. Furthermore, Thellungiella not only shows tolerance to extreme salinity stress, but also to chilling, freezing, and ozone stress, supporting the use of Thellungiella as a good genomic resource in studies of abiotic stress tolerance.

Results

We constructed a full-length enriched Thellungiella (Shan Dong ecotype) cDNA library from various tissues and whole plants subjected to environmental stresses, including high salinity, chilling, freezing, and abscisic acid treatment. We randomly selected about 20 000 clones and sequenced them from both ends to obtain a total of 35 171 sequences. CAP3 software was used to assemble the sequences and cluster them into 9569 nonredundant cDNA groups. We named these cDNAs "RTFL" (RIKEN Thellungiella Full-Length) cDNAs. Information on functional domains and Gene Ontology (GO) terms for the RTFL cDNAs was obtained using InterPro. The 8289 genes assigned to InterPro IDs were classified according to the GO terms using Plant GO Slim. Categorical comparison between the whole Arabidopsis genome and Thellungiella genes showing low identity to Arabidopsis genes revealed that the population of Thellungiella transport genes is approximately 1.5 times the size of the corresponding population of Arabidopsis genes. This suggests that these genes regulate a unique ion transportation system in Thellungiella.

Conclusion

As the number of Thellungiella halophila (Thellungiella salsuginea) expressed sequence tags (ESTs) was 9388 in July 2008, the number of ESTs has increased to approximately four times the original value as a result of this effort. Our sequences will thus contribute to correct future annotation of the Thellungiella genome sequence. The full-length enriched cDNA clones will enable the construction of overexpressing mutant plants by introduction of the cDNAs driven by a constitutive promoter, the complementation of Thellungiella mutants, and the determination of promoter regions in the Thellungiella genome.

15.
In this report we present the results of the analysis of approximately 2.7 Mb of genomic information for the American mink (Neovison vison) derived through BAC end sequencing. Our study, which encompasses approximately 1/1000th of the mink genome, suggests that simple sequence repeats (SSRs) are less common in the mink than in the human genome, whereas the average GC content of the mink genome is slightly higher than that of its human counterpart. The 2.7 Mb mink genomic dataset also contained 2,416 repeat elements (retroids and DNA transposons) occupying almost 31% of the sequence space. Among repeat elements, LINEs were over-represented and endogenous viruses (aka LTRs) under-represented in comparison to the human genome. Finally, we present a virtual map of the mink genome constructed with reference to the human and canine genome assemblies using a comparative genomics approach and incorporating over 200 mink BESs with unique hits to the human genome.

16.
Availability of the human genome sequence and high similarity between humans and pigs at the molecular level provide an opportunity to use a comparative mapping approach to piggy-BAC the human genome. In order to advance the pig genome sequencing initiative, sequence similarity between large-scale porcine BAC-end sequences (BESs) and the human genome sequence was used to construct a comparatively anchored porcine physical map that is a first step towards sequencing the pig genome. A total of 50,300 porcine BAC clones were end-sequenced, yielding 76,906 BESs after trimming, with an average read length of 538 bp. To anchor the porcine BACs on the human genome, these BESs were subjected to BLAST analysis using the human draft sequence, revealing 31.5% significant hits (E < e^-5). Both genic and non-genic regions of homology contributed to the alignments between the human and porcine genomes. Porcine BESs with unique homology matches within the human genome provided a source of markers spaced approximately 70 to 300 kb apart along each human chromosome. In order to evaluate the utility of piggy-BACing human genome sequences, and to confirm predictions of orthology, 193 evenly spaced BESs with similarity to HSA3 and HSA21 were selected and then utilized for developing a high-resolution (1.22 Mb) comparative radiation hybrid map of SSC13, which represents a fusion of HSA3 and HSA21. The resulting RH mapping of SSC13 covers 99% and 97% of HSA3 and HSA21, respectively. Seven evolutionarily conserved blocks were identified, including six on HSA3 and a single syntenic block corresponding to HSA21. The strategy of piggy-BACing the human genome described in this study demonstrates that, through a directed, targeted comparative genomics approach, construction of a high-resolution anchored physical map of the pig genome can be achieved. This map supports the selection of BACs to construct a minimal tiling path for genome sequencing and targeted gap filling. Moreover, this approach is highly relevant to other genome sequencing projects.
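Anchoring BESs by unique, significant matches can be sketched as a filter over tabular BLAST output. The sketch assumes the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score). The 1e-5 cutoff is one reading of the threshold in the abstract, and the function name and the rule "exactly one passing hit" are assumptions made for illustration, not the procedure used in the study.

```python
import csv
from collections import defaultdict


def unique_anchors(blast_lines, max_evalue: float = 1e-5):
    """Map each query with exactly one passing hit to (subject, start, evalue).

    blast_lines: an iterable of tab-separated lines, for example an open
    file handle on standard 12-column tabular BLAST output.
    """
    hits = defaultdict(list)
    for row in csv.reader(blast_lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment headers
        query, subject = row[0], row[1]
        sstart, send, evalue = int(row[8]), int(row[9]), float(row[10])
        if evalue < max_evalue:
            # Subject coordinates are reversed for minus-strand hits,
            # so take the smaller one as the anchor position.
            hits[query].append((subject, min(sstart, send), evalue))
    return {query: found[0] for query, found in hits.items() if len(found) == 1}
```

One simplification to keep in mind: a query that produces several alignment segments at the same locus is discarded here as non-unique, so a production filter would first merge nearby segments before counting.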

17.
Whole-genome sequencing of the soybean (Glycine max (L.) Merr. 'Williams 82') has made it important to integrate its physical and genetic maps. To facilitate this integration of maps, we screened 3290 microsatellites (SSRs) identified from BAC end sequences of clones comprising the 'Williams 82' physical map. SSRs were screened against 3 mapping populations. We found that the AAT and ACT motifs produced the greatest frequency of length polymorphisms, ranging from 17.2% to 32.3% and from 11.8% to 33.3%, respectively. Other useful motifs include the dinucleotide repeats AT, AG, and GT, with frequencies of length polymorphisms ranging from 11.2% to 18.4% (AT), 12.4% to 20.6% (AG), and 11.3% to 16.4% (GT). Repeat lengths less than 16 bp were generally less useful than repeat lengths of 40-60 bp. Two hundred and sixty-five SSRs were genetically mapped in at least one population. Of the 265 mapped SSRs, 60 came from BAC singletons not yet placed into contigs of the physical map. One hundred and ten originated in BACs located in contigs for which no genetic map location was previously known. Ninety-five SSRs came from BACs within contigs for which one or more other BACs had already been mapped. For these fingerprinted contigs (FPC), a high percentage of the mapped markers showed inconsistent map locations. A strategy is introduced by which physical and genetic map inconsistencies can be resolved using the preliminary 4x assembly of the whole genome sequence of soybean.

18.
A set of BAC clones spanning the human genome   (total citations: 13; self-citations: 0; citations by others: 13)
Using the human bacterial artificial chromosome (BAC) fingerprint-based physical map, genome sequence assembly and BAC end sequences, we have generated a fingerprint-validated set of 32855 BAC clones spanning the human genome. The clone set provides coverage for at least 98% of the human fingerprint map, 99% of the current assembled sequence and has an effective resolving power of 79 kb. We have made the clone set publicly available, anticipating that it will generally facilitate FISH or array-CGH-based identification and characterization of chromosomal alterations relevant to disease.

19.
A family of dispersed repeats longer than 7 kilobase pairs (kbp) has been identified in the very large genome of Lilium henryi, and two subregions cloned. Initially a rapidly reannealing probe (C0t < 1 M·s) was prepared by hydroxyapatite chromatography. Half the copies of all sequences repeated 15000 times per genome are expected to reanneal by this C0t value. The probe hybridized to abundant fragments of 2, 5, and 7 kbp released from genomic DNA by BamHI digestion. Twelve 2-kbp fragments and ten 5-kbp sequences were cloned into pBR322. Restriction mapping of the two sets of clones showed individual members to be quite similar. Length variation was no more than 200 base pairs (bp) between repeats, and consensus sites were present on 80%–90% of occasions. In situ hybridization using representative 2-kbp and 5-kbp clones showed each sequence to be dispersed throughout all chromosomal regions. Studies on the genomic organization suggested that the 2-kbp and 5-kbp sequences are usually adjacent, and that occasional absence of the internal BamHI site results in the release of the 7-kbp fragment. There are at least 13000 copies of the full repeat per L. henryi genome, thus accounting for approximately 0.3% of the total of 32 million kbp.
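The closing estimate in the abstract above can be checked with one line of arithmetic: copies times repeat length, divided by genome size. The sketch below uses the figures given in the abstract (13000 copies, a repeat of roughly 7 kbp, and a genome of 32 million kbp); the function name is illustrative.

```python
def repeat_genome_fraction(copies: int, repeat_len_kbp: float,
                           genome_kbp: float) -> float:
    """Fraction of the genome occupied by a dispersed repeat family."""
    return copies * repeat_len_kbp / genome_kbp


# 13000 copies x 7 kbp = 91000 kbp, out of 32 million kbp in total.
fraction = repeat_genome_fraction(13_000, 7, 32_000_000)
percent = 100 * fraction
```

The result is a little under 0.3%, which matches the rounded figure in the abstract. Since the repeat is described as longer than 7 kbp and the copy number as a minimum, the true share would be somewhat higher.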

20.
Libraries constructed in bacterial artificial chromosome (BAC) vectors have become the choice for clone sets in high-throughput genomic sequencing projects, primarily because of their high stability. BAC libraries have been proposed as a source of minimally overlapping clones for sequencing large genomic regions, and BAC end sequences (i.e., sequences adjoining the insert sites) have been proposed as a primary means of selecting those clones. For this strategy to be effective, high-throughput methods for BAC end sequencing of all the clones in deep-coverage BAC libraries needed to be developed. Here we describe a low-cost, efficient, 96-well procedure for BAC end sequencing. These methods allow us to generate BAC end sequences from human and Arabidopsis libraries with an average read length of >450 bases and with a single-pass sequencing average accuracy of >98%. Application of BAC end sequences in genomic sequencing is discussed.
