Similar literature (20 results)
1.
We have established a high-quality, chromosome-level genome assembly for the hexaploid common wheat cultivar ‘Fielder’, an American, soft, white, pastry-type wheat released in 1974 and known for its amenability to Agrobacterium tumefaciens-mediated transformation and genome editing. Accurate, long-read sequences were obtained using PacBio circular consensus sequencing with the HiFi approach. Sequence reads from 16 SMRT cells assembled using the hifiasm assembler produced assemblies with N50 greater than 20 Mb. We used the Omni-C chromosome conformation capture technique to order contigs into chromosome-level assemblies, resulting in 21 pseudomolecules with a cumulative size of 14.7 Gb and 0.3 Gb of unanchored contigs. Mapping of published short reads from a transgenic wheat plant with an edited seed-dormancy gene, TaQsd1, identified four positions of transgene insertion into wheat chromosomes. Detection of guide RNA sequences in pseudomolecules provided candidates for off-target mutation induction. These results demonstrate the efficiency of chromosome-scale assembly using PacBio HiFi reads and their application in wheat genome-editing studies.

2.
To gain genetic insights into the early-flowering phenotype of ornamental cherry, also known as sakura, we determined the genome sequences of two early-flowering cherry (Cerasus × kanzakura) varieties, ‘Kawazu-zakura’ and ‘Atami-zakura’. Because the two varieties are interspecific hybrids, likely derived from crosses between Cerasus campanulata (early-flowering species) and Cerasus speciosa, we employed the haplotype-resolved sequence assembly strategy. Genome sequence reads obtained from each variety by single-molecule real-time (SMRT) sequencing were split into two subsets, based on the genome sequence information of the two probable ancestors, and assembled to obtain haplotype-phased genome sequences. The resultant genome assembly of ‘Kawazu-zakura’ spanned 519.8 Mb with 1,544 contigs and an N50 value of 1,220.5 kb, while that of ‘Atami-zakura’ totalled 509.6 Mb with 2,180 contigs and an N50 value of 709.1 kb. A total of 72,702 and 69,528 potential protein-coding genes were predicted in the genome assemblies of ‘Kawazu-zakura’ and ‘Atami-zakura’, respectively. Gene clustering analysis identified 2,634 clusters uniquely present in the C. campanulata haplotype sequences, which might contribute to its early-flowering phenotype. Genome sequences determined in this study provide fundamental information for elucidating the molecular and genetic mechanisms underlying the early-flowering phenotype of ornamental cherry tree varieties and their relatives.

3.
Pseudobagrus ussuriensis is an aquaculture catfish with significant sexual dimorphism. In this study, a chromosome-level genome with a size of 741.97 Mb was assembled for female P. ussuriensis. A total of 26 chromosome-level contigs covering 97.34% of the whole-genome assembly were obtained with an N50 of 28.53 Mb and an L50 of 11. A total of 24,075 protein-coding genes were identified, of which 22,039 (91.54%) were functionally annotated. Based on the genome assembly, four chromosome evolution clusters of catfishes were identified and the formation process of P. ussuriensis chromosomes was predicted. A total of 55 sex-related quantitative trait loci (QTLs) with a phenotypic variance explained value of 100% were located on chromosome 8 (chr08). The QTLs and other previously identified sex-specific markers were located in a sex-determining region of 16.83 Mb (from 6.90 to 23.73 Mb) on chr08, which was predicted to be the X chromosome. The sex-determining region comprised 554 genes, of which 135 were differentially expressed between males and females/pseudofemales, and 16 candidate sex-determining genes were screened out. The results of this study provide a useful chromosome-level genome for genetic, genomic and evolutionary studies of P. ussuriensis, and will also be useful for further studies on sex-determination mechanisms and sex-control breeding of this fish.
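
The N50 and L50 values quoted in this abstract (and in most of the assemblies listed here) follow the usual definition: sort the contigs from longest to shortest and accumulate lengths until at least half of the total assembly is covered; N50 is the length of the contig that crosses that threshold and L50 is how many contigs it took. A minimal sketch of that calculation, assuming the contig lengths are already available as a list of integers (the function name `n50_l50` and the toy lengths are illustrative only and are not taken from any of the papers):

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)  # longest first
    half = sum(ordered) / 2                  # half of the total assembly size
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # `length` is the N50; `count` contigs were needed, so it is the L50
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly, not real data: total 300, half 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_contigs))  # (70, 2)
```

A higher N50 together with a lower L50 indicates a more contiguous assembly, which is why the abstracts report both alongside total assembly size.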

4.
The genus Pogonophryne is a speciose group that includes 28 species inhabiting the coastal or deep waters of the Antarctic Southern Ocean. The genus has been divided into five species groups, among which the P. albipinna group is the most deep-living group and is characterized by a lack of spots on the top of the head. Here, we carried out genome survey sequencing of P. albipinna using the Illumina HiSeq platform to estimate the genomic characteristics and identify genome-wide microsatellite motifs. The genome size was predicted to be ∼883.8 Mb by K-mer analysis (K = 25), and the heterozygosity and repeat ratio were 0.289 and 39.03%, respectively. The genome sequences were assembled into 571,624 contigs, covering a total length of ∼819.3 Mb with an N50 of 2,867 bp. A total of 2,217,422 simple sequence repeat (SSR) motifs were identified from the assembly data, and the number of SSRs decreased as motif length and repeat number increased. These data will provide a useful foundation for the development of new molecular markers for the P. albipinna group as well as for further whole-genome sequencing of P. albipinna.

5.
6.
The red‐spotted grouper Epinephelus akaara (E. akaara) is one of the most economically important marine fish in China, Japan and South‐East Asia and is a threatened species. The species is also considered a good model for studies of sex inversion, development, genetic diversity and immunity. Despite its importance, molecular resources for E. akaara remain limited and no reference genome has been published to date. In this study, we constructed a chromosome‐level reference genome of E. akaara by taking advantage of long‐read single‐molecule sequencing and de novo assembly by Oxford Nanopore Technology (ONT) and Hi‐C. A red‐spotted grouper genome of 1.135 Gb was assembled from a total of 106.29 Gb polished Nanopore sequence (GridION, ONT), equivalent to 96‐fold genome coverage. The assembled genome represents 96.8% completeness (BUSCO) with a contig N50 length of 5.25 Mb and a longest contig of 25.75 Mb. The contigs were clustered and ordered onto 24 pseudochromosomes covering approximately 95.55% of the genome assembly with Hi‐C data, with a scaffold N50 length of 46.03 Mb. The genome contained 43.02% repeat sequences and 5,480 noncoding RNAs. Furthermore, combined with several RNA‐seq data sets, 23,808 (99.5%) genes were functionally annotated from a total of 23,923 predicted protein‐coding sequences. The high‐quality chromosome‐level reference genome of E. akaara was assembled for the first time and will be a valuable resource for molecular breeding and functional genomics studies of red‐spotted grouper in the future.

7.
Dissecting the genetic mechanisms underlying dioecy (i.e., separate female and male individuals) is critical for understanding the evolution of this pervasive reproductive strategy. Nonetheless, the genetic basis of sex determination remains unclear in many cases, especially in systems where dioecy has arisen recently. Within the economically important plant genus Solanum (∼2,000 species), dioecy is thought to have evolved independently at least 4 times across roughly 20 species. Here, we generate the first genome sequence of a dioecious Solanum and use it to ascertain the genetic basis of sex determination in this species. We de novo assembled and annotated the genome of Solanum appendiculatum (assembly size: ∼750 Mb; scaffold N50: 0.92 Mb; ∼35,000 genes), identified sex-specific sequences and their locations in the genome, and inferred that males in this species are the heterogametic sex. We also analyzed gene expression patterns in floral tissues of males and females, finding approximately 100 genes that are differentially expressed between the sexes. These analyses, together with observed patterns of gene-family evolution specific to S. appendiculatum, consistently implicate a suite of genes from the regulatory network controlling pectin degradation and modification in the expression of sex. Furthermore, the genome of a species with a relatively young sex-determination system provides the foundational resources for future studies on the independent evolution of dioecy in this clade.

8.
About 85% of the maize genome consists of highly repetitive sequences that are interspersed by low-copy, gene-coding sequences. The maize community has dealt with this genomic complexity by the construction of an integrated genetic and physical map (iMap), but this resource alone was not sufficient for ensuring the quality of the current sequence build. For this purpose, we constructed a genome-wide, high-resolution optical map of the maize inbred line B73 genome containing >91,000 restriction sites (averaging 1 site/∼23 kb) accrued from mapping genomic DNA molecules. Our optical map comprises 66 contigs, averaging 31.88 Mb in size and spanning 91.5% (2,103.93 Mb/∼2,300 Mb) of the maize genome. A new algorithm was created that considered both optical map and unfinished BAC sequence data for placing 60/66 (2,032.42 Mb) optical map contigs onto the maize iMap. The alignment of optical maps against numerous data sources yielded comprehensive results that proved revealing and productive. For example, gaps were uncovered and characterized within the iMap, the FPC (fingerprinted contigs) map, and the chromosome-wide pseudomolecules. Such alignments also suggested amended placements of FPC contigs on the maize genetic map and proactively guided the assembly of chromosome-wide pseudomolecules, especially within complex genomic regions. Lastly, we think that the full integration of B73 optical maps with the maize iMap would greatly facilitate maize sequence finishing efforts that would make it a valuable reference for comparative studies among cereals, or other maize inbred lines and cultivars.

9.
Japanese chestnut (Castanea crenata Sieb. et Zucc.), unlike other Castanea species, is resistant to most diseases and wasps. However, genomic data of Japanese chestnut that could be used to determine its biotic stress resistance mechanisms have not been reported to date. In this study, we employed long-read sequencing and genetic mapping to generate genome sequences of Japanese chestnut at the chromosome level. Long reads (47.7 Gb; 71.6× genome coverage) were assembled into 781 contigs, with a total length of 721.2 Mb and a contig N50 length of 1.6 Mb. Genome sequences were anchored to the chestnut genetic map, comprising 14,973 single nucleotide polymorphisms (SNPs) and covering 1,807.8 cM map distance, to establish a chromosome-level genome assembly (683.8 Mb), with 69,980 potential protein-encoding genes and 425.5 Mb repetitive sequences. Furthermore, comparative genome structure analysis revealed that Japanese chestnut shares conserved chromosomal segments with woody plants, but not with herbaceous plants, of rosids. Overall, the genome sequence data of Japanese chestnut generated in this study is expected to enhance not only its genetics and genomics but also the evolutionary genomics of woody rosids.

10.
The comparative morphology and anatomy of the leaves of the rheophytic Rhododendron ripense and the closely related inland species Rhododendron macrosepalum were examined. The leaf of R. ripense is narrower than that of R. macrosepalum, with leaf length to width ratios (leaf index) of 2.92 and 1.91, respectively. Moreover, the leaf of R. ripense consists of fewer cells than the leaf of R. macrosepalum, suggesting stenophyllization of R. ripense caused by the decreased number of cells. In addition, leaf thickness and the number of stomata per leaf of R. ripense were significantly greater than those of R. macrosepalum, but the density of the short glandular pilose hairs on the leaf of R. ripense was lower. The observed morphological differences between the two species may be explained by certain aspects of the riparian environment, such as high irradiation and frequent flooding after heavy rainfall, to which R. ripense is exposed.

11.

Background

The availability of diverse second- and third-generation sequencing technologies enables the rapid determination of the sequences of bacterial genomes. However, identifying the sequencing technology most suitable for producing a finished genome with multiple chromosomes remains a challenge. We evaluated the abilities of the following three second-generation sequencers: Roche 454 GS Junior (GS Jr), Life Technologies Ion PGM (Ion PGM), and Illumina MiSeq (MiSeq) and a third-generation sequencer, the Pacific Biosciences RS sequencer (PacBio), by sequencing and assembling the genome of Vibrio parahaemolyticus, which consists of a 5-Mb genome comprising two circular chromosomes.

Results

We sequenced the genome of V. parahaemolyticus with GS Jr, Ion PGM, MiSeq, and PacBio and performed de novo assembly with several genome assemblers. Although GS Jr generated the longest mean read length of 418 bp among the second-generation sequencers, the maximum contig length of the best assembly from GS Jr was 165 kbp, and the number of contigs was 309. Single runs of Ion PGM and MiSeq produced data of considerably greater sequencing coverage, 279× and 1,927×, respectively. The optimized result for Ion PGM contained 61 contigs assembled from reads of 77× coverage, and the longest contig was 895 kbp in size. Those for MiSeq were 34 contigs, 58× coverage, and 733 kbp, respectively. These results suggest that higher coverage depth is unnecessary for a better assembly result. We observed that multiple rRNA coding regions were fragmented in the assemblies from the second-generation sequencers, whereas PacBio generated two exceptionally long contigs of 3,288,561 and 1,875,537 bp, each of which was from a single chromosome, with 73× coverage and a mean read length of 3,119 bp, allowing us to determine the absolute positions of all rRNA operons.

Conclusions

PacBio outperformed the other sequencers in terms of the length of contigs and reconstructed the greatest portion of the genome, achieving a genome assembly of “finished grade” because of its long reads. It showed the potential to assemble more complex genomes with multiple chromosomes containing more repetitive sequences.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-699) contains supplementary material, which is available to authorized users.

12.
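The monokaryotic strain "6-3" of the enoki mushroom (金针菇) was sequenced with both second- and third-generation sequencing technologies, and four assembly strategies were applied for de novo genome assembly to compare their results. In terms of assembly metrics, the assembly built from second-generation reads alone performed worst: contigs longer than 10 kb totalled only 24.6 Mb, the contig N50 was only 23 kb, and the assembly rate was only 59.27%. The strategy of assembling third-generation reads and correcting them with second-generation reads performed best: contigs longer than 10 kb totalled 38.3 Mb, the contig N50 was 2.8 Mb, and the assembly rate reached 92.16%. For conserved single-copy genes, the genome sequences obtained with the four strategies were compared against the conserved single-copy basidiomycete genes in the BUSCO database, and gene completeness exceeded 94% in every case. For assembly accuracy, PCR amplification and Sanger sequencing verified that the genome sequence from third-generation assembly with second-generation correction was complete and contiguous and contained the fewest SNPs and InDels. In summary, the genome sequence obtained by third-generation assembly with second-generation correction has a large contig N50, a high assembly rate and high base accuracy, making this a relatively ideal approach for sequencing the genomes of edible fungi.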

13.
Having a deep genetic structure evolved during its domestication and adaptation, the Asian cultivated rice (Oryza sativa) displays considerable physiological and morphological variations. Here, we describe deep whole-genome sequencing of the aus rice cultivar Kasalath by using the advanced next-generation sequencing (NGS) technologies to gain a better understanding of the sequence and structural changes among highly differentiated cultivars. The de novo assembled Kasalath sequences represented 91.1% (330.55 Mb) of the genome and contained 35,139 expressed loci annotated by RNA-Seq analysis. We detected 2,787,250 single-nucleotide polymorphisms (SNPs) and 7,393 large insertion/deletion (indel) sites (>100 bp) between Kasalath and Nipponbare, and 2,216,251 SNPs and 3,780 large indels between Kasalath and 93-11. Extensive comparison of the gene contents among these cultivars revealed similar rates of gene gain and loss. We detected at least 7.39 Mb of inserted sequences and 40.75 Mb of unmapped sequences in the Kasalath genome in comparison with the Nipponbare reference genome. Mapping of the publicly available NGS short reads from 50 rice accessions proved the necessity and the value of using the Kasalath whole-genome sequence as an additional reference to capture the sequence polymorphisms that cannot be discovered by using the Nipponbare sequence alone.

14.
Mitochondrial genome sequences are important markers for phylogenetics but taxon sampling remains sporadic because of the great effort and cost required to acquire full-length sequences. Here, we demonstrate a simple, cost-effective way to sequence the full complement of protein coding mitochondrial genes from pooled samples using the 454/Roche platform. Multiplexing was achieved without the need for expensive indexing tags (‘barcodes’). The method was trialled with a set of long-range polymerase chain reaction (PCR) fragments from 30 species of Coleoptera (beetles) sequenced in a 1/16th sector of a sequencing plate. Long contigs were produced from the pooled sequences with sequencing depths ranging from ∼10 to 100× per contig. Species identity of individual contigs was established via three ‘bait’ sequences matching disparate parts of the mitochondrial genome obtained by conventional PCR and Sanger sequencing. This proved that assembly of contigs from the sequencing pool was correct. Our study produced sequences for 21 nearly complete and seven partial sets of protein coding mitochondrial genes. Combined with existing sequences for 25 taxa, an improved estimate of basal relationships in Coleoptera was obtained. The procedure could be employed routinely for mitochondrial genome sequencing at the species level, to provide improved species ‘barcodes’ that currently use the cox1 gene only.

15.
16.
The Asian citrus psyllid, Diaphorina citri, is the insect vector of the causal agent of huanglongbing (HLB), a devastating bacterial disease of commercial citrus. Presently, few genomic resources exist for D. citri. In this study, we utilized PacBio HiFi and chromatin conformation capture (Hi-C) sequencing to sequence, assemble, and compare three high-quality, chromosome-scale genome assemblies of D. citri collected from California, Taiwan, and Uruguay. Our assemblies had final sizes of 282.67 Mb (California), 282.89 Mb (Taiwan), and 266.67 Mb (Uruguay) assembled into 13 pseudomolecules, a reduction in assembly size of 41–45% compared with previous assemblies which we validated using flow cytometry. We identified the X chromosome in D. citri and annotated each assembly for repetitive elements, protein-coding genes, transfer RNAs, ribosomal RNAs, piwi-interacting RNA clusters, and endogenous viral elements. Between 19,083 and 20,357 protein-coding genes were predicted. Repetitive DNA accounts for 36.87–38.26% of each assembly. Comparative analyses and mitochondrial haplotype networks suggest that Taiwan and Uruguay D. citri are more closely related, while California D. citri are closely related to Florida D. citri. These high-quality, chromosome-scale assemblies provide new genomic resources to researchers to further D. citri and HLB research.

17.
The greenfin horse‐faced filefish, Thamnaconus septentrionalis, is a valuable commercial fish species that is widely distributed in the Indo‐West Pacific Ocean. This fish has characteristic blue–green fins, rough skin and a spine‐like first dorsal fin. Thamnaconus septentrionalis is of conservation concern because its population has declined sharply, and it is an important marine aquaculture fish species in China. Genomic resources for the filefish are lacking, and no reference genome has been released. In this study, the first chromosome‐level genome of T. septentrionalis was constructed using nanopore sequencing and Hi‐C technology. A total of 50.95 Gb polished nanopore sequences were generated and were assembled into a 474.31‐Mb genome, accounting for 96.45% of the estimated genome size of this filefish. The assembled genome contained only 242 contigs, and the achieved contig N50 was 22.46 Mb, a surprisingly high value among all sequenced fish species. Hi‐C scaffolding of the genome resulted in 20 pseudochromosomes containing 99.44% of the total assembled sequences. The genome contained 67.35 Mb of repeat sequences, accounting for 14.2% of the assembly. A total of 22,067 protein‐coding genes were predicted, 94.82% of which were successfully annotated with putative functions. Furthermore, a phylogenetic tree was constructed using 1,872 single‐copy orthologous genes, and 67 unique gene families were identified in the filefish genome. This high‐quality assembled genome will be a valuable resource for a range of future genomic, conservation and breeding studies of T. septentrionalis.

18.
Onychostoma macrolepis is an emerging commercial cyprinid fish species. It is a model system for studies of sexual dimorphism and genome evolution. Here, we report the chromosome‐level assembly of the O. macrolepis genome obtained from the integration of nanopore long‐read sequencing with physical maps produced using Bionano and Hi‐C technology. A total of 87.9 Gb of nanopore sequence provided approximately 100‐fold coverage of the genome. The preliminary genome assembly was 883.2 Mb in size with a contig N50 size of 11.2 Mb. The 969 corrected contigs obtained from Bionano optical mapping were assembled into 853 scaffolds and produced an assembly of 886.5 Mb with a scaffold N50 of 16.5 Mb. Finally, using the Hi‐C data, 881.3 Mb (99.4% of genome) in 526 scaffolds were anchored and oriented in 25 chromosomes ranging in size from 25.27 to 56.49 Mb. In total, 24,770 protein‐coding genes were predicted in the genome, and ~96.85% of the genes were functionally annotated. The annotated assembly contains 93.3% complete genes from the BUSCO reference set. In addition, we identified 409 Mb (46.23% of the genome) of repetitive sequence, and 11,213 non‐coding RNAs, in the genome. Evolutionary analysis revealed that O. macrolepis diverged from common carp approximately 24.25 million years ago. The chromosomes of O. macrolepis showed an unambiguous correspondence to the chromosomes of zebrafish. The high‐quality genome assembled in this work provides a valuable genomic resource for further biological and evolutionary studies of O. macrolepis.
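
The "approximately 100-fold coverage" figure in this abstract is the ratio of total sequenced bases to genome size. As a rough check, and assuming the 883.2 Mb preliminary assembly size is used as the genome size (the abstract does not say which size estimate the authors divided by), the arithmetic can be sketched as follows; the function name `fold_coverage` is illustrative only:

```python
def fold_coverage(total_bases_gb, genome_size_mb):
    """Sequencing depth: total sequenced bases divided by genome size."""
    # Convert Gb to Mb so both quantities share a unit before dividing.
    return total_bases_gb * 1000 / genome_size_mb


# Figures quoted in the abstract: 87.9 Gb of reads, 883.2 Mb preliminary assembly.
print(round(fold_coverage(87.9, 883.2), 1))  # 99.5
```

The result, about 99.5×, agrees with the rounded value the abstract reports. The same ratio applied to other entries in this list depends on which genome-size estimate each study used, so small discrepancies from the quoted coverage are to be expected.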

19.
Asparagus kiusianus is a disease-resistant dioecious plant species and a wild relative of garden asparagus (Asparagus officinalis). To enhance A. kiusianus genomic resources, advance plant science, and facilitate asparagus breeding, we determined the genome sequences of the male and female lines of A. kiusianus. Genome sequence reads obtained with a linked-read technology were assembled into four haplotype-phased contig sequences (∼1.6 Gb each) for the male and female lines. The contig sequences were aligned onto the chromosome sequences of garden asparagus to construct pseudomolecule sequences. Approximately 55,000 potential protein-encoding genes were predicted in each genome assembly, and ∼70% of the genome sequence was annotated as repetitive. Comparative analysis of the genomes of the two species revealed structural and sequence variants between the two species as well as between the male and female lines of each species. Genes with high sequence similarity with the male-specific sex determinant gene in A. officinalis, MSE1/AoMYB35/AspTDF1, were present in the genomes of the male line but absent from the female genome assemblies. Overall, the genome sequence assemblies, gene sequences, and structural and sequence variants determined in this study will reveal the genetic mechanisms underlying sexual differentiation in plants, and will accelerate disease-resistance breeding in garden asparagus.

20.
Sophora japonica is a medium-size deciduous tree belonging to the Leguminosae family and famous for its high ecological, economic and medicinal value. Here, we reveal a draft genome of S. japonica, which was ∼511.49 Mb long (contig N50 size of 17.34 Mb) based on Illumina, Nanopore and Hi-C data. We reliably assembled 110 contigs into 14 chromosomes, representing 91.62% of the total genome, with an improved N50 size of 31.32 Mb based on Hi-C data. Further investigation identified 271.76 Mb (53.13%) of repetitive sequences and 31,000 protein-coding genes, of which 30,721 (99.1%) were functionally annotated. Phylogenetic analysis indicates that S. japonica separated from Arabidopsis thaliana and Glycine max ∼107.53 and 61.24 million years ago, respectively. We detected evidence of species-specific and common-legume whole-genome duplication events in S. japonica. We further found that multiple TF families (e.g. BBX and PAL) have expanded in S. japonica, which might have led to its enhanced tolerance to abiotic stress. In addition, S. japonica harbours more genes involved in the lignin and cellulose biosynthesis pathways than the other two species. Finally, population genomic analyses revealed no obvious differentiation among geographical groups, and the effective population size has continuously declined since 2 Ma. Our genomic data provide a powerful comparative framework to study the adaptation, evolution and active-ingredient biosynthesis in S. japonica. More importantly, our high-quality S. japonica genome is important for elucidating the biosynthesis of its main bioactive components, and improving its production and/or processing.
