Similar literature (20 results)
1.
BAC-end sequences (BESs) of hybrid sugarcane cultivar R570 are presented. A total of 66,990 informative BESs were obtained from 43,874 BAC clones. Similarity search using a variety of public databases revealed that 13.5% and 42.8% of BESs match known gene-coding and repeat regions, respectively. That 11.7% of BESs are still unmatched to any nucleotide sequences in the current public databases, despite the fact that a close relative, sorghum, is fully sequenced, indicates that there may be many sugarcane-specific or lineage-specific sequences. We found 1,742 simple sequence repeat motifs in 1,585 BESs, spanning 27,383 bp in length. As simple sequence repeat markers derived from BESs have some advantages over randomly generated markers, these may be particularly useful for comparing BAC-based physical maps with genetic maps. BES and overgo hybridization information was used for anchoring sugarcane BAC clones to the sorghum genome sequence. While sorghum and sugarcane have extensive similarity in terms of genomic structure, only 2,789 BACs (6.4%) could be confidently anchored to the sorghum genome at the stringent threshold of having both-end information (BESs or overgos) within 300 kb. This relatively low rate of anchoring may have been caused in part by small- or large-scale genomic rearrangements in the Saccharum genus after two rounds of whole genome duplication since its divergence from the sorghum lineage about 7.8 million years ago. Limiting consideration to only low-copy matches, 1,245 BACs were placed at 1,503 locations, covering ~198 Mb of the sorghum genome or about 78% of the estimated 252 Mb of euchromatin. BESs and their analyses presented here may provide an early profile of the sugarcane genome as well as a basis for BAC-by-BAC sequencing of much of the basic gene set of sugarcane.

2.
Summary Fifty random clones (350–2300 bp), derived from sheared, nuclear DNA, were studied via Southern analysis in order to make deductions about the organization and evolution of the tomato genome. Thirty-four of the clones were mapped genetically and determined to represent points on 11 of the 12 tomato chromosomes. Under moderate stringency conditions (80% homology required) 44% of the clones were classified as single copy. Under higher stringency, the majority of the clones (78%) behaved as single copy. Most of the remaining clones belonged to multicopy families containing 2–20 copies, while a few contained moderately or highly repeated sequences (10% at moderate stringency, 4% at high stringency). Divergence rates of sequences homologous to the 50 random genomic clones were compared with those corresponding to 20 previously described cDNA (coding sequence) clones. Rates were measured by probing each clone (random genomics and cDNAs) onto filters containing DNA from various species from the family Solanaceae (including potato, Datura, petunia and tobacco) as well as one species (watermelon) from another plant family, Cucurbitaceae. Under moderate stringency conditions, the majority of the random clones (single copy and repetitive) failed to detect homologous sequences in the more distantly related species, whereas approximately 90% of the 20 coding sequences analyzed could still be detected in all solanaceous species. The most highly repeated sequences appear to be the fastest evolving and homologous copies could be detected only in species most closely related to tomato. Dispersion of repetitive sequences, as opposed to tandem clustering, appears to be the rule for the tomato genome. None of the repetitive sequences discovered by this random sampling of the genome were tandemly arranged, a finding consistent with the notion that the tomato genome contains only a small fraction of satellite DNA. This study, along with a companion paper (Ganal et al. 1988), provides the first general sketch of the tomato genome at the molecular level and indicates that it is composed largely of single copy sequences and that these sequences, together with repetitive sequences, are evolving at a rate faster than the coding portion of the genome. The small genome and paucity of highly repetitive DNA are favourable attributes with respect to the possibilities of conducting chromosome walking experiments in tomato, and the fact that coding regions are well conserved among solanaceous species may be useful for distinguishing clones that contain coding regions from those that do not.

3.
Short interspersed nuclear elements (SINEs) are highly abundant non-autonomous retrotransposons that are widespread in plants. They are short in size, non-coding, show high sequence diversity, and are therefore mostly not or not correctly annotated in plant genome sequences. Hence, comparative studies on genomic SINE populations are rare. To explore the structural organization and impact of SINEs, we comparatively investigated the genome sequences of the Solanaceae species potato (Solanum tuberosum), tomato (Solanum lycopersicum), wild tomato (Solanum pennellii), and two pepper cultivars (Capsicum annuum). Based on 8.5 Gbp sequence data, we annotated 82 983 SINE copies belonging to 10 families and subfamilies on a base pair level. Solanaceae SINEs are dispersed over all chromosomes with enrichments in distal regions. Depending on the genome assemblies and gene predictions, 30% of all SINE copies are associated with genes, particularly frequent in introns and untranslated regions (UTRs). The close association with genes is family specific. More than 10% of all genes annotated in the Solanaceae species investigated contain at least one SINE insertion, and we found genes harbouring up to 16 SINE copies. We demonstrate the involvement of SINEs in gene and genome evolution including the donation of splice sites, start and stop codons and exons to genes, enlargement of introns and UTRs, generation of tandem-like duplications and transduction of adjacent sequence regions.

4.
The African trypanosome genome (total citations: 1; self-citations: 0; citations by others: 1)
The haploid nuclear genome of the African trypanosome, Trypanosoma brucei, is about 35 Mb and varies in size among different trypanosome isolates by as much as 25%. The nuclear DNA of this diploid organism is distributed among three size classes of chromosomes: the megabase chromosomes of which there are at least 11 pairs ranging from 1 Mb to more than 6 Mb (numbered I-XI from smallest to largest); several intermediate chromosomes of 200-900 kb and uncertain ploidy; and about 100 linear minichromosomes of 50-150 kb. Size differences of as much as four-fold can occur, both between the two homologues of a megabase chromosome pair in a specific trypanosome isolate and among chromosome pairs in different isolates. The genomic DNA sequences determined to date indicate that about 50% of the genome is coding sequence. The chromosomal telomeres possess TTAGGG repeats and many, if not all, of the telomeres of the megabase and intermediate chromosomes are linked to expression sites for genes encoding variant surface glycoproteins (VSGs). The minichromosomes serve as repositories for VSG genes since some but not all of their telomeres are linked to unexpressed VSG genes. A gene discovery program, based on sequencing the ends of cloned genomic DNA fragments, has generated more than 20 Mb of discontinuous single-pass genomic sequence data during the past year, and the complete sequences of chromosomes I and II (about 1 Mb each) in T. brucei GUTat 10.1 are currently being determined. It is anticipated that the entire genomic sequence of this organism will be known in a few years. Analysis of a test microarray of 400 cDNAs and small random genomic DNA fragments probed with RNAs from two developmental stages of T. brucei demonstrates that the microarray technology can be used to identify batteries of genes differentially expressed during the various life cycle stages of this parasite.

5.
6.
The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. ‘Francesco’ was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568 887 315 bp, consisting of 45 088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16 644 bp and 60 737 bp, respectively, and the longest scaffold was 1 287 144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNA and miRNA, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43 266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was ~98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp.
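
The assembly statistics quoted in this abstract (N50 and the fraction of the estimated genome covered) can be illustrated with a minimal Python sketch. The function names and the toy length list are hypothetical and only for illustration; the abstract does not say which software computed its values, and the N50 definition used here is the common one (the length at which sequences that long or longer hold at least half of the assembled bases).

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    Sequences are sorted from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def assembly_coverage(assembled_bp, estimated_genome_bp):
    """Fraction of the estimated genome represented by the assembly."""
    return assembled_bp / estimated_genome_bp


# Hypothetical toy scaffold lengths, not data from the carnation assembly.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)

# Figures taken from the abstract: 568,887,315 bp assembled, 622 Mb estimated.
carnation_coverage = assembly_coverage(568_887_315, 622_000_000)
print(toy_n50, f"{carnation_coverage:.1%}")
```

With the figures from the abstract, the coverage fraction comes out between 0.91 and 0.92, in line with the reported 91%.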

7.
The root-knot nematode resistance gene Mi-1 in tomato has long been thought to be located in the pericentromeric heterochromatin region of the long arm of chromosome 6 because of its very tight genetic linkage (approx. 1 cM) to the markers Aps-1 (Acid phosphatase 1) and yv (yellow virescent). Using Mi-BAC clones and an Aps-1 YAC clone in fluorescence in situ hybridisation (FISH) to pachytene chromosomes we now provide direct physical evidence showing that Mi-1 is located at the border of the euchromatin and heterochromatin regions in the short arm (6S) and Aps-1 in the pericentromeric heterochromatin of the long arm (6L) close to the euchromatin. Taking into account both the estimated DNA content of hetero- and euchromatin regions and the compactness of the tomato chromosomes at pachytene (2 Mb/μm), our data suggest that Mi-1 and Aps-1 are at least 40 Mb apart, a base pair-to-centiMorgan relationship that is more than 50-fold higher than the average value of 750 kb/cM of the tomato genome. An integrated cytogenetic-molecular map of chromosome 6 is presented that provides a framework for physical mapping. Received: 24 July 1998 / Accepted: 14 August 1998
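
The "more than 50-fold" comparison in this abstract follows from simple arithmetic, sketched below in Python. The function name is hypothetical, and the inputs are the rounded values quoted in the abstract (at least 40 Mb, approx. 1 cM, 750 kb/cM), so the result is a rough lower bound rather than a precise estimate.

```python
def fold_over_average(distance_kb, genetic_cm, average_kb_per_cm):
    """Ratio of a local kb/cM value to the genome-wide average kb/cM."""
    local_kb_per_cm = distance_kb / genetic_cm
    return local_kb_per_cm / average_kb_per_cm


# Values quoted in the abstract: >= 40 Mb physical distance, approx. 1 cM
# genetic distance, and a genome average of 750 kb/cM.
fold = fold_over_average(distance_kb=40_000, genetic_cm=1.0, average_kb_per_cm=750)
print(round(fold, 1))
```

Because 40 Mb is stated as a minimum distance, the true ratio could be larger than the value this sketch returns.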

8.
Wang Y, Tang X, Cheng Z, Mueller L, Giovannoni J, Tanksley SD. Genetics, 2006, 172(4): 2529-2540
Eleven sequenced BACs were annotated and localized via FISH to tomato pachytene chromosomes, providing the first global insights into the compositional differences of euchromatin and pericentromeric heterochromatin in this model dicot species. The results indicate that tomato euchromatin has a gene density (6.7 kb/gene) similar to that of Arabidopsis and rice. Thus, while the euchromatin comprises only 25% of the tomato nuclear DNA, it is sufficient to account for approximately 90% of the estimated 38,000 nontransposon genes that compose the tomato genome. Moreover, euchromatic BACs were largely devoid of transposons or other repetitive elements. In contrast, BACs assigned to the pericentromeric heterochromatin had a gene density 10-100 times lower than that of the euchromatin and are heavily populated by retrotransposons preferential to the heterochromatin, the most abundant transposons belonging to the Jinling Ty3/gypsy-like retrotransposon family. Jinling elements are highly methylated and rarely transcribed. Nonetheless, they have spread throughout the pericentromeric heterochromatin in tomato and wild tomato species fairly recently, well after tomato diverged from potato and other related solanaceous species. The implications of these findings on evolution and on sequencing the genomes of tomato and other solanaceous species are discussed.

9.
We present complete sequences of the mitochondrial genomes for two important mosquitoes, Aedes aegypti and Culex quinquefasciatus, that are major vectors of dengue virus and lymphatic filariasis, respectively. The A. aegypti mitochondrial genome is 16,655 bp in length and that of C. quinquefasciatus is 15,587 bp, yet both contain 13 protein coding genes, 22 transfer RNA (tRNA) genes, one 12S ribosomal RNA (rRNA) gene, one 16S rRNA gene and a control region (CR) in the same order. The difference in the genome size is due to the difference in the length of the control region. We also analyzed insertions of nuclear copies of mtDNA-like sequences (NUMTs) in a comparative manner between the two mosquitoes. The NUMT sequences occupy ~0.008% of the A. aegypti genome and ~0.001% of the C. quinquefasciatus genome. Several NUMTs were found localized in the introns of predicted protein coding genes in both genomes (32 genes in A. aegypti but only four in C. quinquefasciatus). None of these NUMT-containing genes had an ortholog between the two species or had paralogous copies within a genome that was also NUMT-containing. It was further observed that the NUMT-containing genes were relatively longer but had lower GC content compared to the NUMT-less paralogous copies. Moreover, stretches of homologies are present among the genic and non-genic NUMTs that may play important roles in genomic rearrangement of NUMTs in these genomes. Our study provides new insights on understanding the roles of nuclear mtDNA sequences in genome complexities of these mosquitoes.

10.
Automatic annotation of eukaryotic genes, pseudogenes and promoters (total citations: 1; self-citations: 0; citations by others: 1)

11.
Genomics, 2021, 113(4): 2189-2198
Sooty moulds are fungi of economic importance with a unique lifestyle, growing mainly on insect honeydew. However, the lack of genomic data hinders investigation of the genetic mechanisms underlying their ecological adaptation. With long-read sequencing technology, we generated the genome of Scorias spongiosa, an extraordinary sooty mould fungus associated with the honeydew of colony aphids and producing large fruiting bodies. A 24.21 Mb high-quality genome assembly with an N50 length of 3.37 Mb was obtained. The genome contained 7758 protein coding genes, 97.13% of which were homologous to known genes, and approximately 0.29 Mb of repeat sequences. Comparative genomics showed that S. spongiosa lost relatively more gene families and contained fewer species-specific genes and gene families, with many CAZyme families and sugar transporters reduced or absent. This study not only promotes understanding of the ecological adaptation of sooty moulds, but also provides a valuable genomic data resource for future comparative genomic and genetic studies.

12.
Brachypodium is well suited as a model system for temperate grasses because of its compact genome and a range of biological features. In an effort to develop resources for genome research in this emerging model species, we constructed 2 bacterial artificial chromosome (BAC) libraries from an inbred diploid Brachypodium distachyon line, Bd21, using restriction enzymes HindIII and BamHI. A total of 73,728 clones (36,864 per BAC library) were picked and arrayed in 192 384-well plates. The average insert size for the BamHI and HindIII libraries is estimated to be 100 and 105 kb, respectively, and inserts of chloroplast origin account for 4.4% and 2.4%, respectively. The libraries individually represent 9.4- and 9.9-fold haploid genome equivalents with combined 19.3-fold genome coverage, based on a genome size of 355 Mb reported for the diploid Brachypodium, implying a 99.99% probability that any given specific sequence will be present in each library. Hybridization of the libraries with 8 starch biosynthesis genes was used to empirically evaluate this theoretical genome coverage; the frequency at which these genes were present in the library clones gave an estimated coverage of 11.6- and 19.6-fold genome equivalents. To obtain a first view of the sequence composition of the Brachypodium genome, 2185 BAC end sequences (BES) representing 1.3 Mb of random genomic sequence were compared with the NCBI GenBank database and the GIRI repeat database. Using a cutoff expectation value of E < 10^-10, only 3.3% of the BESs showed similarity to repetitive sequences in the existing database, whereas 40.0% had matches to the sequences in the EST database, suggesting that a considerable portion of the Brachypodium genome is likely transcribed. When the BESs were compared with individual EST databases, more matches hit wheat than maize, although their EST collections are of a similar size, further supporting the close relationship between Brachypodium and the Triticeae. Moreover, 122 BESs have significant matches to wheat ESTs mapped to individual chromosome bin positions. These BACs represent colinear regions containing the mapped wheat ESTs and would be useful in identifying additional markers for specific wheat chromosome regions.
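
The link between fold coverage and the quoted 99.99% probability can be sketched in Python. The abstract does not name the formula it used; the sketch below assumes the standard Clarke-Carbon estimate for a random clone library, P = 1 - (1 - f)^N, together with its large-library approximation P ≈ 1 - e^(-c) for c-fold coverage. The function names are hypothetical.

```python
import math


def clarke_carbon_probability(n_clones, insert_bp, genome_bp):
    """Probability that a given sequence is in a library of n_clones,
    where each clone covers a fraction insert_bp / genome_bp of the genome
    (assumed formula: P = 1 - (1 - f)^N)."""
    fraction = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** n_clones


def probability_from_coverage(fold_coverage):
    """Approximation of the same probability from c-fold genome coverage
    (assumed formula: P = 1 - exp(-c))."""
    return 1.0 - math.exp(-fold_coverage)


# Fold coverages quoted in the abstract for the two libraries.
for coverage in (9.4, 9.9):
    print(coverage, f"{probability_from_coverage(coverage):.5f}")
```

Under this assumption, both quoted coverages give probabilities above 0.9999, which agrees with the 99.99% figure in the abstract.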

13.
Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. Short reads of genomic sequences of 191.1 Gb were obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, the whole draft genomic sequence of 402 Mb spanning 75.9% of the estimated genomic size and containing 61,572 predicted genes was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was combined further with another radish linkage map constructed mainly with expressed sequence tag-simple sequence repeat markers into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified.

14.
Methanobacterium sp. Mb1, a hydrogenotrophic methanogenic Archaeon, was isolated from a rural biogas plant producing methane-rich biogas from maize silage and cattle manure in Germany. Here we report the complete genome sequence of the novel methanogenic isolate Methanobacterium sp. Mb1 harboring a 2,029,766 bp circular chromosome featuring a GC content of 39.74%. The genome encodes two rRNA operons, 41 tRNA genes and 2021 coding sequences and represents the smallest genome currently known within the genus Methanobacterium.

15.
The collembolan Folsomia candida Willem, 1902, is widely distributed throughout the world and has been frequently used as a test organism in soil ecology and ecotoxicology studies. However, it is questioned as an ideal “standard” because of differences in reproductive modes and cryptic genetic diversity between strains from various geographical origins. In this study, we obtained two high-quality chromosome-level genomes of F. candida, for a parthenogenetic strain (named FCDK, 219.08 Mb, 25,139 protein-coding genes) and a sexual strain (named FCSH, 153.09 Mb, 21,609 protein-coding genes), reannotated the genome of the parthenogenetic strain reported by Faddeeva-Vakhrusheva et al. in 2017 (named FCBL, 221.7 Mb, 25,980 protein-coding genes) and conducted comparative genomic analyses of the three strains. High genome similarities between FCDK and FCBL based on synteny, genome architecture, mitochondrial and nuclear gene sequences suggest that they are conspecific. The seven chromosomes of FCDK are each 25%–54% larger than the corresponding chromosomes of FCSH, showing obvious repetitive element expansions and large-scale inversions and translocations but no whole-genome duplication. The strain-specific genes, expanded gene families and genes in nonsyntenic chromosomal regions identified in FCDK are highly related to the broader environmental adaptation of parthenogenetic strains. In addition, FCDK has fewer strain-specific microRNAs than FCSH, and their mitochondrial and nuclear genes have diverged greatly. In conclusion, FCDK/FCBL and FCSH have accumulated independent genetic changes and evolved into distinct species since 10 million years ago. Our work provides important genomic resources for studying the mechanisms of rapid cryptic speciation and soil arthropod adaptation to soil ecosystems.

16.
Summary The major families of repeated DNA sequences in the genome of tomato (Lycopersicon esculentum) were isolated from a sheared DNA library. One thousand clones, representing one million base pairs, or 0.15% of the genome, were surveyed for repeated DNA sequences by hybridization to total nuclear DNA. Four major repeat classes were identified and characterized with respect to copy number, chromosomal localization by in situ hybridization, and evolution in the family Solanaceae. The most highly repeated sequence, with approximately 77,000 copies, consists of a 162 bp tandemly repeated satellite DNA. This repeat is clustered at or near the telomeres of most chromosomes and also at the centromeres and interstitial sites of a few chromosomes. Another family of tandemly repeated sequences consists of the genes coding for the 45S ribosomal RNA. The 9.1 kb repeating unit in L. esculentum was estimated to be present in approximately 2300 copies. The single locus, previously mapped using restriction fragment length polymorphisms, was shown by in situ hybridization as a very intense signal at the end of chromosome 2. The third family of repeated sequences was interspersed throughout nearly all chromosomes with an average of 133 kb between elements. The total copy number in the genome is approximately 4200. The fourth class consists of another interspersed repeat showing clustering at or near the centromeres in several chromosomes. This repeat had a copy number of approximately 2100. Sequences homologous to the 45S ribosomal DNA showed cross-hybridization to DNA from all solanaceous species examined including potato, Datura, Petunia, tobacco and pepper. In contrast, with the exception of one class of interspersed repeats which is present in potato, all other repetitive sequences appear to be limited to the crossing-range of tomato. These results, along with those from a companion paper (Zamir and Tanksley 1988), indicate that tomato possesses few highly repetitive DNA sequences and those that do exist are evolving at a rate higher than most other genomic sequences.

17.
Unlike other important Solanaceae crops such as tomato, potato, chili pepper, and tobacco, all of which originated in South America and are cultivated worldwide, eggplant (Solanum melongena L.) is indigenous to the Old World and in this respect it is phylogenetically unique. To broaden our knowledge of the genomic nature of solanaceous plants further, we dissected the eggplant genome and built a draft genome dataset with 33,873 scaffolds termed SME_r2.5.1 that covers 833.1 Mb, ca. 74% of the eggplant genome. Approximately 90% of the gene space was estimated to be covered by SME_r2.5.1 and 85,446 genes were predicted in the genome. Clustering analysis of the predicted genes of eggplant along with the genes of three other solanaceous plants as well as Arabidopsis thaliana revealed that, of the 35,000 clusters generated, 4,018 were exclusively composed of eggplant genes that would perhaps confer eggplant-specific traits. Between eggplant and tomato, 16,573 pairs of genes were deduced to be orthologous, and 9,489 eggplant scaffolds could be mapped onto the tomato genome. Furthermore, 56 conserved synteny blocks were identified between the two species. The detailed comparative analysis of the eggplant and tomato genomes will facilitate our understanding of the genomic architecture of solanaceous plants, which will contribute to cultivation and further utilization of these crops.

18.
Shao Weiwei, Qiao Fen, Cai Wei, Lin Zhihua, Wei Li. Acta Theriologica Sinica (《兽类学报》), 2023, 43(2): 182-192
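Vertebrate genomes contain abundant microsatellite information. This study analyzed the distribution characteristics of microsatellites in the whole genome and in the genes of the great roundleaf bat (Hipposideros armiger), a member of the Chiroptera, and annotated the genes whose coding sequences contain microsatellites. The results show that the whole genome of the great roundleaf bat is 2.24 Gb in size and contains 497,883 microsatellites. Mononucleotide and dinucleotide repeats were the most numerous types, with 173,953 (34.94%) and 222,591 (44.71%) loci and relative abundances of 77.78 loci/Mb and 99.52 loci/Mb, respectively. The most frequent motifs for mono- through hexanucleotide repeat units were (A)n, (AC)n, (TAT)n, (TTTA)n, (AACAA)n and (TATCTA)n, accounting for 95.14%, 55.25%, 38.41%, 22.17%, 48.68% and 20.30% of their respective classes. Microsatellite number and abundance differed among genic regions and intergenic regions: intergenic regions had the largest number and abundance (322,666 loci and 2,541.57 loci/Mb), and coding regions had the smallest (1,461 loci and 461.98 loci/Mb). The distribution of microsatellites in intergenic regions was similar to that in the whole genome. The most frequent microsatellite type in coding regions was the trinucleotide repeat, and the most frequent types in exons were mono-, di- and trinucleotide repeats...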
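
A genome-wide microsatellite survey of the kind summarized here can be illustrated with a minimal Python sketch based on regular-expression backreferences. This is not the authors' pipeline: the abstract does not name the software or thresholds used, so the minimum repeat counts in `MIN_REPEATS`, the function names and the toy sequence are all assumptions made for illustration.

```python
import re

# Assumed minimum repeat counts per motif length (1-6 bp); the abstract
# does not state the thresholds actually used.
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}


def find_microsatellites(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA"),
            # so a run is reported only under its shortest motif.
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


def loci_per_mb(locus_count, genome_bp):
    """Relative abundance of loci per megabase of sequence."""
    return locus_count / (genome_bp / 1e6)


# Hypothetical toy sequence with three repeat runs separated by spacers.
toy = "GC" + "A" * 16 + "GT" + "AC" * 7 + "GC" + "TAT" * 5 + "GC"
print(find_microsatellites(toy))
print(round(loci_per_mb(173_953, 2.24e9), 2))
```

With the rounded 2.24 Gb genome size, the mononucleotide abundance comes out slightly below the reported 77.78 loci/Mb; the difference is presumably due to rounding of the genome size in the abstract.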

19.
Sequence organization of the mitochondrial genome of yeast--a review (total citations: 3; self-citations: 0; citations by others: 3)
M de Zamaroczy, G Bernardi. Gene, 1985, 37(1-3): 1-17
We have compiled the available primary structural data for the mitochondrial genome of Saccharomyces cerevisiae and have estimated the size of the remaining gaps, which represent 12-13% of the genome. The lengths of sequenced regions and of gaps lead to a new assessment of genome sizes; these range (in round figures) from 85 000 bp for the long genomes, to 78 000 bp for the short genomes, to 74 000 bp for the supershort genome of Saccharomyces carlsbergensis. These values are 8-11% higher than those previously estimated from restriction fragments. Interstrain differences concern not only facultative intervening sequences (introns) and mini-inserts, but also insertions/deletions in intergenic sequences. The primary structure appears to be extremely conserved in genes and ori sequences, and highly conserved in intergenic sequences. Since coding sequences represent at most 33-35% of the genome, at least two thirds of the genome is formed by noncoding and yet highly conserved sequences. The G + C level of genes or exons is 25%, and that of intronic open reading frames (ORFs) 22%; increasingly lower values are shown by intronic closed reading frames (CRFs), 20%, ori sequences, 19%, intergenic ORFs, 17.5% and intergenic sequences, 15%.

20.

Comparative sequence analyses have identified highly conserved genomic DNA sequences, including noncoding sequences, between humans and other species. By performing whole-genome comparisons of human and mouse, we have identified 611 conserved noncoding sequences longer than 500 bp, with more than 95% identity between the species. These long conserved noncoding sequences (LCNS) include 473 new sequences that do not overlap with previously reported ultraconserved elements (UCE), which are defined as aligned sequences longer than 200 bp with 100% identity in human, mouse, and rat. The LCNS were distributed throughout the genome except for the Y chromosome and often occurred in clusters within regions with a low density of coding genes. Many of the LCNS were also highly conserved in other mammals, chickens, frogs, and fish; however, we were unable to find orthologous sequences in the genomes of invertebrate species. In order to examine whether these conserved sequences are functionally important or merely mutational cold spots, we directly measured the frequencies of ENU-induced germline mutations in the LCNS of the mouse. By screening about 40.7 Mb, we found 35 mutations, including mutations at nucleotides that were conserved between human and fish. The mutation frequencies were equivalent to those found in other genomic regions, including coding sequences and introns, suggesting that the LCNS are not mutational cold spots at all. Taken together, these results suggest that mutations occur with equal frequency in LCNS but are eliminated by natural selection during the course of evolution.

