1.
Il-Young Ahn  Carlos E Winter 《Génome》2006,49(8):1007-1015
This work describes the physicochemical characterization of the genome and telomere structure from the nematode Oscheius tipulae CEW1. Oscheius tipulae is a free-living nematode belonging to the family Rhabditidae and has been used as a model system for comparative genetic studies. A new protocol that combines fluorescent detection of double-stranded DNA and S1 nuclease was used to determine the genome size of O. tipulae as 100.8 Mb (approximately 0.1 pg DNA/haploid nucleus). The genome of this nematode is made up of 83.4% unique copy sequences, 9.4% intermediate repetitive sequences, and 7.2% highly repetitive sequences, suggesting that its structure is similar to those of other nematodes of the genus Caenorhabditis. We also showed that O. tipulae has the same telomere repeats already found in Caenorhabditis elegans at the ends and in internal regions of the chromosomes. Using a cassette-ligation-mediated PCR protocol we were able to obtain 5 different putative subtelomeric sequences of O. tipulae, which show no similarity to C. elegans or C. briggsae subtelomeric regions. DAPI staining of hermaphrodite gonad cells shows that, as detected in C. elegans and other rhabditids, O. tipulae has a haploid complement of 6 chromosomes.
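The megabase-to-picogram figure quoted in this abstract can be sanity-checked with a short calculation. The sketch below assumes the commonly used factor of roughly 978 Mb per picogram of double-stranded DNA; that factor and the function name are assumptions for illustration and do not come from the abstract.

```python
# Assumed conversion factor (not stated in the abstract):
# 1 pg of double-stranded DNA corresponds to roughly 978 Mb.
MB_PER_PG = 978.0


def mb_to_pg(size_mb: float) -> float:
    """Convert a genome size in megabases to picograms of DNA."""
    return size_mb / MB_PER_PG


# Genome size reported for O. tipulae in the abstract.
pg = mb_to_pg(100.8)
print(f"{pg:.2f} pg per haploid genome")
```

Under that assumption the result agrees with the "approximately 0.1 pg" stated in the abstract.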

2.
Chibana H  Oka N  Nakayama H  Aoyama T  Magee BB  Magee PT  Mikami Y 《Genetics》2005,170(4):1525-1537
The size of the genome in the opportunistic fungus Candida albicans is 15.6 Mb. Whole-genome shotgun sequencing was carried out at Stanford University, where the sequences were assembled into 412 contigs. C. albicans is essentially diploid, and analysis of the sequence is complicated by repeated sequences and by sequence polymorphism between homologous chromosomes. Chromosome 7 is 1 Mb in size and is the best characterized of the 8 chromosomes in C. albicans. We assigned 16 of the contigs, ranging in length from 7309 to 267,590 bp, to chromosome 7 and determined sequences of 16 regions. These regions included four gaps, a misassembled sequence, and two major repeat sequences (MRS) of >16 kb. The length of the continuous sequence attained was 949,626 bp and provided complete coverage of chromosome 7 except for telomeric regions. Sequence analysis predicted 404 genes, 11 of which included at least one intron. A 7-kb indel, which might be caused by a retrotransposon, was identified as the largest difference between the homologous chromosomes. Synteny analysis revealed that the degree of synteny between C. albicans and Saccharomyces cerevisiae is too weak to use for completion of the genomic sequence in C. albicans.

3.
Populus euphratica is well adapted to extreme desert environments and is an important model species for elucidating the mechanisms of abiotic stress resistance in trees. The current assembly of the P. euphratica genome is highly fragmented with many gaps and errors, thereby impeding downstream applications. Here, we report an improved chromosome‐level reference genome of P. euphratica (v2.0) using single‐molecule sequencing and chromosome conformation capture (Hi‐C) technologies. Relative to the previous reference genome, our assembly represents a nearly 60‐fold improvement in contiguity, with a scaffold N50 size of 28.59 Mb. Using this genome, we have found that extensive expansion of Gypsy elements in P. euphratica led to its rapid increase in genome size compared to any other Salicaceae species studied to date, and potentially contributed to adaptive divergence driven by insertions near genes involved in stress tolerance. We also detected a wide range of unique structural rearrangements in P. euphratica, including 2,549 translocations, 454 inversions, 121 tandem and 14 segmental duplications. Several key genes likely to be involved in tolerance to abiotic stress were identified within these regions. This high‐quality genome represents a valuable resource for poplar breeding and genetic improvement in the future, as well as comparative genomic analysis with other Salicaceae species.
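Several abstracts in this list report contiguity as a scaffold or contig N50. As a reminder of what that statistic measures, here is a minimal sketch using the standard definition; the function name and the toy scaffold lengths are invented for illustration and are not data from any of the papers.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


# Made-up scaffold lengths (arbitrary units), total = 100.
toy_scaffolds = [40, 30, 20, 5, 5]
print(n50(toy_scaffolds))
```

Here the two longest scaffolds (40 + 30) are the first to reach half of the total, so the N50 is the length of the second one.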

4.
The genome sequence of silkworm, Bombyx mori.
We performed threefold shotgun sequencing of the silkworm (Bombyx mori) genome to obtain a draft sequence and establish a basic resource for comprehensive genome analysis. By using the newly developed RAMEN assembler, the sequence data derived from whole-genome shotgun (WGS) sequencing were assembled into 49,345 scaffolds that span a total length of 514 Mb including gaps and 387 Mb without gaps. Because the genome size of the silkworm is estimated to be 530 Mb, almost 97% of the genome has been organized in scaffolds, of which 75% has been sequenced. By carrying out a BLAST search for 50 characteristic Bombyx genes and 11,202 non-redundant expressed sequence tags (ESTs) in a Bombyx EST database against the WGS sequence data, we evaluated the validity of the sequence for elucidating the majority of silkworm genes. Analysis of the WGS data revealed that the silkworm genome contains many repetitive sequences with an average length of <500 bp. These repetitive sequences appear to have been derived from truncated transposons, which are interspersed at 2.5- to 3-kb intervals throughout the genome. This pattern suggests that silkworm may have an active mechanism that promotes removal of transposons from the genome. We also found evidence for insertions of mitochondrial DNA fragments at 9 sites. A search for Bombyx orthologs to Drosophila genes controlling sex determination in the WGS data revealed 11 Bombyx genes and suggested that the sex-determining systems differ profoundly between the two species.
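The two coverage percentages in this abstract follow directly from the three sizes it reports. The short check below uses only those figures; the variable names are chosen here for illustration.

```python
# Figures taken from the abstract (all in Mb).
estimated_genome_mb = 530   # estimated silkworm genome size
scaffold_span_mb = 514      # scaffold length including gaps
sequenced_mb = 387          # scaffold length excluding gaps

# Share of the estimated genome that is organized in scaffolds.
in_scaffolds = scaffold_span_mb / estimated_genome_mb
# Share of the scaffold span that is actual sequence rather than gap.
sequenced_of_scaffolds = sequenced_mb / scaffold_span_mb

print(f"in scaffolds: {in_scaffolds:.1%}")
print(f"sequenced within scaffolds: {sequenced_of_scaffolds:.1%}")
```

The ratios round to the "almost 97%" and "75%" quoted in the text.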

5.
The sizes of the centromeric regions of Arabidopsis thaliana chromosomes 1, 2, and 3 were determined by construction of their physical maps on the basis of restriction analysis. As the reported centromeric regions contain large gaps in the middle due to highly repetitive sequences, appropriate probes for Southern hybridization were prepared from the sequences reported for the flanking regions and from the sequences of BAC and YAC clones newly isolated in this work, and restriction analysis was performed using DNA of a hypomethylated strain (ddm1). The sizes of the genetically defined centromeric regions were deduced to be 9 megabases (Mb), 4.2 Mb and 4.1 Mb, respectively (chromosome 1, from markers T22C23-t7 to T3P8-sp6; chromosome 2, from F5J15-sp6 to T15D9; chromosome 3, from T9G9-sp6 to T15M14; G. P. Copenhaver et al. Science, 286, 2468-2479, 1999). By combining the sizes of the centromeric regions previously estimated for chromosomes 4 and 5 and the sequence data reported for the A. thaliana genome, the total genome size of A. thaliana was estimated to be approximately 146.0 Mb.

6.
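The butterfly Eurema hecabe is an important pollinating insect that is widely distributed across the Afrotropical, Oriental, and Australian regions and the eastern Palaearctic, and it is of considerable scientific and economic value. To investigate its broad adaptability and to determine a suitable whole-genome sequencing strategy for this species, we first carried out low-coverage genome survey sequencing, to be followed by large-scale deep whole-genome sequencing. Using second-generation high-throughput sequencing, we determined the genome size of E. hecabe and used bioinformatic methods to estimate its heterozygosity, repeat content, GC content, and other genomic features. The results showed that (1) the genome size of E. hecabe is estimated at 285.34 Mb, with a sequencing depth of 51×; and (2) the K-mer distribution curve shows a clear heterozygous peak, with a heterozygosity rate of 1.97% and a repeat-sequence proportion of 35.37%. These results are important for revealing the origin, evolution, and adaptability of E. hecabe and provide a basis for choosing a whole-genome sequencing strategy for the species.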
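Genome survey studies of this kind usually estimate genome size from the K-mer distribution curve by dividing the total number of K-mers by the depth of the main peak. The abstract does not describe the exact procedure, so the sketch below shows only that general approach; the function name, the error-depth cutoff, and the toy histogram are assumptions made for illustration.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size from a k-mer depth histogram.

    histogram maps k-mer depth -> number of distinct k-mers seen at that depth.
    Depths below min_depth are treated as sequencing errors and ignored
    (the cutoff is an assumed value, not one taken from the abstract).
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # Depth with the most distinct k-mers is taken as the main peak.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences divided by the peak depth.
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth


# Toy histogram: low-depth error k-mers plus one peak centred on depth 20.
toy_histogram = {1: 1000, 2: 200, 19: 50, 20: 100, 21: 50}
print(estimate_genome_size(toy_histogram))
```

In a highly heterozygous genome the heterozygous peak sits at about half the depth of the main peak, so a real analysis would need to make sure the homozygous peak is the one selected; this sketch does not handle that case.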

7.
A new approach to sequencing and assembling a highly heterozygous genome, that of grape, species Vitis vinifera cv Pinot Noir, is described. Combining genome shotgun sequencing of paired reads produced by Sanger sequencing with sequencing by synthesis of unpaired reads was shown to be an efficient procedure for decoding a complex genome. About 2 million SNPs and more than a million heterozygous gaps have been identified in the 500 Mb genome of grape. More than 91% of the sequence assembled into 58,611 contigs is now anchored to the 19 linkage groups of V. vinifera.

8.
Y L Chang  Q Tao  C Scheuring  K Ding  K Meksem  H B Zhang 《Genetics》2001,159(3):1231-1242
The genome of the model plant species Arabidopsis thaliana has recently been sequenced. To accelerate its current genome research, we developed a whole-genome, BAC/BIBAC-based, integrated physical, genetic, and sequence map of the A. thaliana ecotype Columbia. This new map was constructed from the clones of a new plant-transformation-competent BIBAC library and is integrated with the existing sequence map. The clones were restriction fingerprinted by DNA sequencing gel-based electrophoresis, assembled into contigs, and anchored to an existing genetic map. The map consists of 194 BAC/BIBAC contigs, spanning 126 Mb of the 130-Mb Arabidopsis genome. A total of 120 contigs, spanning 114 Mb, were anchored to the chromosomes of Arabidopsis. Accuracy of the integrated map was verified using the existing physical and sequence maps and numerous DNA markers. Integration of the new map with the sequence map has enabled gap closure of the sequence map and will facilitate functional analysis of the genome sequence. The method used here has been demonstrated to be sufficient for whole-genome physical mapping from large-insert random bacterial clones and thus is applicable to rapid development of whole-genome physical maps for other species.

9.
The genome sequence of the free-living nematode Caenorhabditis elegans is nearly complete, with resolution of the final difficult regions expected over the next few months. This will represent the first genome of a multicellular organism to be sequenced to completion. The genome is approximately 97 Mb in total, and encodes more than 19,099 proteins, considerably more than expected before sequencing began. The sequencing project, a collaboration between the Genome Sequencing Center in St Louis and the Sanger Centre in Hinxton, has lasted eight years, with the majority of the sequence generated in the past four years. Analysis of the genome sequence is just beginning and represents an effort that will undoubtedly last more than another decade. However, some interesting findings are already apparent, indicating that the scope of the project, the approach taken, and the usefulness of having the genetic blueprint for this small organism have been well worth the effort.

10.

Background

The different regions of a genome do not evolve at the same rate. For example, comparative genomic studies have suggested that the sex chromosomes and the regions harbouring the immune defence genes in the Major Histocompatibility Complex (MHC) may evolve faster than other genomic regions. The advent of next-generation sequencing technologies has made it possible to study which genomic regions are evolutionarily liable to change and which are static, as well as enabling an increasing number of genome studies of non-model species. However, de novo sequencing of the whole genome of an organism remains non-trivial. In this study, we present the draft genome of the black grouse, which was developed using a reference-guided assembly strategy.

Results

We generated 133 Gbp of sequence data from one black grouse individual on the SOLiD platform and used a combination of de novo assembly and chicken reference genome mapping to assemble the reads into 4572 scaffolds with a total length of 1022 Mb. The draft genome covers the main chicken chromosomes 1 to 28 and Z well; these have a total length of 1001 Mb. The draft genome is fragmented but has good coverage of the homologous chicken genes. In particular, 33.0% of the coding regions of the homologous genes have more than 90% of their sequence covered. In addition, we identified ~1 M SNPs from the genome and identified 106 genomic regions which had a high nucleotide divergence between black grouse and chicken or between black grouse and turkey.

Conclusions

Our results support the hypothesis that the chromosome X (Z) evolves faster than the autosomes and our data are consistent with the MHC regions being more liable to change than the genome average. Our study demonstrates how a moderate sequencing effort can be combined with existing genome references to generate a draft genome for a non-model species.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-180) contains supplementary material, which is available to authorized users.

11.
The rice leaffolder Cnaphalocrocis exigua (Crambidae, Lepidoptera) is an important agricultural pest that damages rice crops and other members of related grass families. C. exigua exhibits a very similar morphological phenotype and feeding behaviour to C. medinalis, another species of rice leaffolder whose genome was recently reported. However, genomic information for C. exigua remains extremely limited. Here, we used a hybrid strategy combining different sequencing technologies, including Illumina, PacBio, 10× Genomics, and Hi-C scaffolding, to generate a high-quality chromosome-level genome assembly of C. exigua. We initially obtained a 798.8 Mb assembly with a contig N50 size of 2.9 Mb, and the N50 size was subsequently increased to 25.7 Mb using Hi-C technology to anchor 1413 scaffolds to 32 chromosomes. We detected 97.7% of Benchmarking Universal Single-Copy Orthologues (BUSCO) in the genome assembly, which comprised ~52% repetitive sequence, and annotated 14,922 protein-coding genes. Of note, the Z and W sex chromosomes were assembled and identified. A comparative genomic analysis demonstrated that despite the high synteny observed between the two rice leaffolders, the species have distinct genomic features associated with expansion and contraction of gene families and selection pressure. In summary, our chromosome-level genome assembly and comparative genomic analysis of C. exigua provide novel insights into the evolution and ecology of this rice insect pest and offer useful information for pest control.

12.
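The nematode Caenorhabditis elegans is an important model organism. Analysis of its genome sequence was essentially completed at the end of 1998, and more than 19,000 genes have been identified. This article reviews the results of C. elegans genome research in genetic mapping, physical mapping, sequencing, and gene identification, and considers the impact that the C. elegans genome project will have on life-science research.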

13.
Arabidopsis thaliana has a relatively small genome of approximately 130 Mb containing about 10% repetitive DNA. Genome sequencing studies reveal a gene-rich genome, predicted to contain approximately 25,000 genes spaced on average every 4.5 kb. Between 10 and 20% of the predicted genes occur as clusters of related genes, indicating that local sequence duplication and subsequent divergence generates a significant proportion of gene families. In addition to gene families, repetitive sequences comprise individual and small clusters of two to three retroelements and other classes of smaller repeats. The clustering of highly repetitive elements is a striking feature of the A. thaliana genome emerging from sequence and other analyses.

14.
Genome-wide physical mapping with bacteria-based large-insert clones (e.g., BACs, PACs, and PBCs) promises to revolutionize genomics of large, complex genomes. To accelerate rice and other grass species genome research, we developed a genome-wide BAC-based map of the rice genome. The map consists of 298 BAC contigs and covers 419 Mb of the 430-Mb rice genome. Subsequent analysis indicated that the contigs constituting the map are accurate and reliable. Particularly important to proficiency were (1) a high-resolution, high-throughput DNA sequencing gel-based electrophoretic method for BAC fingerprinting, (2) the use of several complementary large-insert BAC libraries, and (3) computer-aided contig assembly. It has been demonstrated that the fingerprinting method is not significantly influenced by repeated sequences, genome size, and genome complexity. Use of several complementary libraries developed with different restriction enzymes minimized the "gaps" in the physical map. In contrast to previous estimates, a clonal coverage of 6.0-8.0 genome equivalents seems to be sufficient for development of a genome-wide physical map of approximately 95% genome coverage. This study indicates that genome-wide BAC-based physical maps can be developed quickly and economically for a variety of plant and animal species by restriction fingerprint analysis via DNA sequencing gel-based electrophoresis.

15.
Henk DA  Fisher MC 《PloS one》2012,7(2):e31268
Fungal genomes range in size from 2.3 Mb for the microsporidian Encephalitozoon intestinalis up to 8000 Mb for Entomophaga aulicae, with a mean genome size of 37 Mb. Basidiobolus, a common inhabitant of vertebrate guts, is distantly related to all other fungi, and is unique in possessing both EF-1α and EFL genes. Using DNA sequencing and a quantitative PCR approach, we estimated a haploid genome size for Basidiobolus at 350 Mb. However, based on allelic variation, the nuclear genome is at least diploid, leading us to believe that the final genome size is at least 700 Mb. We also found that EFL was in three times the copy number of its putatively functionally overlapping paralog EF-1α. This suggests that gene or genome duplication may be an important feature of B. ranarum evolution, and also suggests that B. ranarum may have mechanisms in place that favor the preservation of functionally overlapping genes.

16.
Bivalves, a highly diverse and the most evolutionarily successful class of invertebrates native to aquatic habitats, provide valuable molecular resources for understanding evolutionary adaptation and aquatic ecology. Here, we report a high‐quality chromosome‐level genome assembly of the razor clam Sinonovacula constricta using Pacific Bioscience single‐molecule real‐time sequencing, Illumina paired‐end sequencing, 10X Genomics linked‐reads and Hi‐C reads. The genome size was 1,220.85 Mb, with a scaffold N50 of 65.93 Mb and a contig N50 of 976.94 kb. A total of 899 complete (91.92%) and seven partial (0.72%) matches of the 978 metazoa Benchmarking Universal Single‐Copy Orthologs were determined in this genome assembly. Hi‐C scaffolding of the genome resulted in 19 pseudochromosomes. A total of 28,594 protein‐coding genes were predicted in the S. constricta genome, of which 25,413 genes (88.88%) were functionally annotated. In addition, 39.79% of the assembled genome was composed of repetitive sequences, and 4,372 noncoding RNAs were identified. The enrichment analyses of the significantly expanded and contracted genes suggested an evolutionary adaptation of S. constricta to highly stressful living environments. In summary, the genomic resources generated in this work not only provide a valuable reference genome for investigating the molecular mechanisms of S. constricta biological functions and evolutionary adaptation, but also facilitate its genetic improvement and disease treatment. Meanwhile, the obtained genome greatly improves our understanding of the genetics of molluscs and their comparative evolution.
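The completeness and annotation percentages in this abstract can be reproduced from the counts it gives. The helper below is a small illustrative check; the function name is chosen here and is not part of BUSCO or any other tool.

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Counts taken from the abstract.
complete_busco = pct(899, 978)        # complete BUSCO matches
partial_busco = pct(7, 978)           # partial BUSCO matches
annotated_genes = pct(25_413, 28_594) # functionally annotated genes

print(complete_busco, partial_busco, annotated_genes)
```

All three values match the percentages reported in the text (91.92%, 0.72%, and 88.88%).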

17.
Recent segmental and gene duplications in the mouse genome

Background

The high quality of the mouse genome draft sequence and its associated annotations are an invaluable biological resource. Identifying recent duplications in the mouse genome, especially in regions containing genes, may highlight important events in recent murine evolution. In addition, detecting recent sequence duplications can reveal potentially problematic regions of the genome assembly. We use BLAST-based computational heuristics to identify large (≥ 5 kb) and recent (≥ 90% sequence identity) segmental duplications in the mouse genome sequence. Here we present a database of recently duplicated regions of the mouse genome found in the mouse genome sequencing consortium (MGSC) February 2002 and February 2003 assemblies.

Results

We determined that 33.6 Mb of 2,695 Mb (1.2%) of sequence from the February 2003 mouse genome sequence assembly is involved in recent segmental duplications, which is less than that observed in the human genome (around 3.5-5%). From this dataset, 8.9 Mb (26%) of the duplication content consisted of 'unmapped' chromosome sequence. Moreover, we suspect that an additional 18.5 Mb of sequence is involved in duplication artifacts arising from sequence misassignment errors in this genome assembly. By searching for genes that are located within these regions, we identified 675 genes that mapped to duplicated regions of the mouse genome. Sixteen of these genes appear to have been duplicated independently in the human genome. From our dataset we further characterized a 42 kb recent segmental duplication of Mater, a maternal-effect gene essential for embryogenesis in mice.

Conclusion

Our results provide an initial analysis of the recently duplicated sequence and gene content of the mouse genome. Many of these duplicated loci, as well as regions identified to be involved in potential sequence misassignment errors, will require further mapping and sequencing to achieve accuracy. A Genome Browser database was set up to display the identified duplication content presented in this work. This data will also be relevant to the growing number of investigators who use the draft genome sequence for experimental design and analysis.
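The selection criteria and the headline fractions in this abstract can be sketched in a few lines. The thresholds (at least 5 kb and at least 90% identity) and the Mb figures come from the abstract; the `Alignment` record, the function name, and the toy alignments are hypothetical and do not reproduce the authors' BLAST-based pipeline.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical pairwise alignment record used only for illustration."""
    length_bp: int
    identity: float  # fraction between 0 and 1


def is_recent_segmental_duplication(aln, min_length_bp=5_000, min_identity=0.90):
    """Apply the abstract's thresholds: large (>= 5 kb) and recent (>= 90% identity)."""
    return aln.length_bp >= min_length_bp and aln.identity >= min_identity


# Made-up alignments: only the first passes both thresholds.
toy = [Alignment(42_000, 0.97), Alignment(3_000, 0.99), Alignment(8_000, 0.85)]
kept = [a for a in toy if is_recent_segmental_duplication(a)]
print(len(kept))

# Fractions recomputed from the Mb figures reported in the abstract.
duplicated_fraction = 33.6 / 2695  # share of the assembly in recent duplications
unmapped_share = 8.9 / 33.6        # share of duplication content that is unmapped
print(f"{duplicated_fraction:.1%}, {unmapped_share:.0%}")
```

The recomputed fractions round to the 1.2% and 26% quoted in the Results.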

18.
《Genomics》2022,114(5):110473
The potato grouper, Epinephelus tukula, is one of the largest coral reef teleosts, and it is an important germplasm resource for selection and cross breeding. Here we report a potato grouper genome assembly generated using PacBio long-read sequencing, Illumina sequencing and high-throughput chromatin conformation capture (Hi-C) technology. The genome size was 1.13 Gb, with a total of 508 contigs anchored into 24 chromosomes. The scaffold N50 was 42.65 Mb. For the gene models, our assembled genome contained 98.11% complete BUSCOs with the vertebrata_odb9 database. Additional copies of Gh and Hsp90b1 were identified in the E. tukula genome, which might contribute to its fast growth and high resistance to stress. In addition, 435 putative antimicrobial peptide (AMP) genes were identified in the potato grouper. This study provides a good reference for whole genome selective breeding of the potato grouper and for future development of novel marine drugs.

19.
China is the origin and evolutionary centre of Oriental pears. Pyrus betuleafolia is a wild species native to China and distributed in the northern region, and it is widely used as rootstock. Here, we report the de novo assembly of the genome of P. betuleafolia‐Shanxi Duli using an integrated strategy that combines PacBio sequencing, BioNano mapping and chromosome conformation capture (Hi‐C) sequencing. The genome assembly size was 532.7 Mb, with a contig N50 of 1.57 Mb. A total of 59,552 protein‐coding genes and 247.4 Mb of repetitive sequences were annotated for this genome. The expanded genes in P. betuleafolia were significantly enriched in secondary metabolism, which may account for the organism's considerable environmental adaptability. An alignment analysis of orthologous genes showed that genes related to fruit size, sugar metabolism and transport, and photosynthetic efficiency were positively selected in Oriental pear during domestication. A total of 573 nucleotide‐binding site (NBS)‐type resistance gene analogues (RGAs) were identified in the P. betuleafolia genome, 150 of which are TIR‐NBS‐LRR (TNL)‐type genes; this is the greatest number of TNL‐type genes among the published Rosaceae genomes and may explain the strong disease resistance of this wild species. The study of flavour metabolism‐related genes showed that the anthocyanidin reductase (ANR) metabolic pathway affected the astringency of pear fruit and that sorbitol transporter (SOT) transmembrane transport may be the main factor affecting the accumulation of soluble organic matter. This high‐quality P. betuleafolia genome provides a valuable resource for the utilization of wild pear in fundamental pear studies and breeding.

20.
A collection of 9,990 single-pass nuclear genomic sequences, corresponding to 5 Mb of tomato DNA, were obtained using a methylation filtration (MF) strategy and reduced to 7,053 unique undermethylated genomic islands (UGIs) distributed as follows: (1) 59% non-coding sequences, (2) 28% coding sequences, (3) 12% transposons, 96% of which are class I retroelements, and (4) 1% organellar sequences integrated into the nuclear genome over the past approximately 100 million years. A more detailed analysis of coding UGIs indicates that the unmethylated portion of tomato genes extends as far as 676 bp upstream and 766 bp downstream of coding regions with an average of 174 and 171 bp, respectively. Based on the analysis of the UGI copy distribution, the undermethylated portion of the tomato genome is determined to account for the majority of the unmethylated genes in the genome and is estimated to constitute 61 ± 15 Mb of DNA (~5% of the entire genome), which is significantly less than the 220 Mb estimated for gene-rich euchromatic arms of the tomato genome. This result indicates that, while most genes reside in the euchromatin, a significant portion of euchromatin is methylated in the intergenic spacer regions. Implications of the results for sequencing the genome of tomato and other solanaceous species are discussed.
