Similar articles
Found 20 similar articles (search time: 31 ms)
1.
Gadus macrocephalus (Pacific cod) is an economically important species on the northern coast of the Pacific. Although numerous studies on G. macrocephalus exist, there are few reports on its genomic data. Here, we used whole-genome sequencing data to elucidate the genomic characteristics and phylogenetic relationships of G. macrocephalus. From the 19-mer frequency distribution, the genome size was estimated to be 658.22 Mb. The heterozygosity, repetitive sequence content and GC content were approximately 0.62%, 27.50% and 44.73%, respectively. The draft genome sequences were initially assembled, yielding a total of 500,760 scaffolds (N50 = 3565 bp). A total of 789,860 microsatellite motifs were identified from the genomic data, and dinucleotide repeats were the dominant simple sequence repeat motif. As a byproduct of whole-genome sequencing, the mitochondrial genome was assembled to investigate the evolutionary relationships between G. macrocephalus and its relatives. On the basis of 13 protein-coding gene sequences of the mitochondrial genomes of Gadidae species, the maximum likelihood phylogenetic tree revealed complicated relationships and divergence times among Gadidae species. Demographic history analysis using the pairwise sequentially Markovian coalescent model revealed changes in the G. macrocephalus population during the Pleistocene. These findings supplement the genomic data of G. macrocephalus and make a valuable contribution to whole-genome studies of the species.
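The abstract does not say which tool or formula produced the 658.22 Mb estimate. As a rough illustration only, one common approach divides the total number of k-mers by the depth of the main peak in the k-mer frequency histogram. The function name, the `min_depth` error cutoff, and the toy histogram below are all assumptions made for this sketch, not values from the study.

```python
def estimate_genome_size(histogram, min_depth=3):
    """Estimate genome size from a k-mer depth histogram.

    histogram maps depth -> number of distinct k-mers observed at that depth.
    Depths below min_depth are treated as sequencing-error k-mers
    (a simplifying assumption for this sketch).
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # The depth with the most distinct k-mers approximates the average
    # k-mer coverage of the genome.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences divided by coverage gives the genome size.
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth


# Toy histogram with made-up numbers (not data from the study).
toy = {1: 5000, 2: 800, 18: 200, 19: 400, 20: 600, 21: 400, 22: 200}
size = estimate_genome_size(toy)
print(size)
```

Real survey analyses usually also model heterozygous and repeat peaks, which this sketch ignores.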

2.
3.
Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. A total of 191.1 Gb of short genomic reads was obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, a whole draft genomic sequence of 402 Mb, spanning 75.9% of the estimated genome size and containing 61,572 predicted genes, was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was further combined with another radish linkage map, constructed mainly with expressed sequence tag-simple sequence repeat markers, into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified.

4.
Rosa roxburghii Tratt is an important commercial horticultural crop in China that is recognized for its nutritional and medicinal value. In spite of its economic significance, genomic information on this rose species is currently unavailable. In the present research, a genome survey of R. roxburghii was carried out using next-generation sequencing (NGS) technologies. A total of 30.29 Gb of sequence data was obtained by HiSeq 2500 sequencing, and the estimated genome size of R. roxburghii was 480.97 Mb, with a guanine plus cytosine (GC) content of 38.63%. The reads were assembled into 627,554 contigs with an N50 length of 1.484 kb and, further, into 335,902 scaffolds with a total length of 409.36 Mb. Transposable element (TE) sequences totaling 90.84 Mb, reported as 29.20% of the genome, and 167,859 simple sequence repeats (SSRs) were identified from the scaffolds. Among these, the mono- (66.30%), di- (25.67%), and tri- (6.64%) nucleotide repeats contributed nearly 99% of the SSRs, and the sequence motifs AG/CT (28.81%) and GAA/TTC (14.76%) were the most abundant among the dinucleotide and trinucleotide repeat motifs, respectively. Genome analysis predicted a total of 22,721 genes with an average length of 2311.52 bp, an average exon length of 228.15 bp, and an average intron length of 401.18 bp. Eleven genes putatively involved in ascorbate metabolism were identified, and their expression in R. roxburghii leaves was validated by quantitative real-time PCR (qRT-PCR). This is the first report of genome-wide characterization of this rose species.
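Several abstracts in this list quote an N50 length. A minimal sketch of the standard definition follows: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name and the example lengths are illustrative only and are not taken from any of the studies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must not be empty")


# Illustrative lengths (not data from the study).
example = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example))
```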

5.
In an analysis of the repeats from the mink X Chromosome (Chr), we have identified a B2-like repetitive sequence of 195 base pairs (bp) flanked by short direct repeats of 14 bp. It contains regions homologous to the split intragenic RNA polymerase III promoter and a 3′ A-rich region followed by an oligo(dA) sequence. A feature of the repeat is the presence of a perfect polypyrimidine tract 22 bp in length that is absent from the known Alu- and Alu-like sequences. Alignment of the mink B2-like sequence and mouse B2-consensus sequence allowed us to estimate their similarity as 55%. The repeat is present in 1–2×10⁵ copies per mink genome and 2–4×10³ copies per X Chr. In situ hybridization analysis demonstrated a similar distribution pattern of the B2-like repeat along the length of all the mink chromosomes including the X. We also observed the presence of mink B2-like hybridizable sequence in the genomes of other Carnivora species. The nucleotide sequence data reported in this paper have been submitted to EMBL Data Library and have been assigned the accession number X52381 (MVB2RPT).

6.
Characterizing the walnut genome through analyses of BAC end sequences   Cited by: 1 (self-citations: 0; citations by others: 1)
Persian walnut (Juglans regia L.) is an economically important tree for its nut crop and timber. To gain insight into the structure and evolution of the walnut genome, we constructed two bacterial artificial chromosome (BAC) libraries, containing a total of 129,024 clones, from in vitro-grown shoots of J. regia cv. Chandler using the HindIII and MboI cloning sites. A total of 48,218 high-quality BAC end sequences (BESs) were generated, with an accumulated sequence length of 31.2 Mb, representing approximately 5.1% of the walnut genome. Analysis of repeat DNA content in BESs revealed that approximately 15.42% of the genome consists of known repetitive DNA, while walnut-unique repetitive DNA identified in this study constitutes 13.5% of the genome. Among the walnut-unique repetitive DNA, Julia SINE and JrTRIM elements represent the first identified walnut short interspersed element (SINE) and terminal-repeat retrotransposon in miniature (TRIM) element, respectively; both types of elements are abundant in the genome. As in other species, these SINEs and TRIM elements could be exploited for developing repeat DNA-based molecular markers in walnut. Simple sequence repeats (SSRs) from BESs were analyzed and found to be more abundant in BESs than in expressed sequence tags. The density of SSRs in the analyzed walnut genome sequence was also slightly higher than that in poplar and papaya. Sequence analysis of BESs indicated that approximately 11.5% of the walnut genome consists of coding sequence. This study is an initial characterization of the walnut genome and provides the largest genomic resource currently available; as such, it will be a valuable tool in studies aimed at genetically improving walnut.

7.
Long INterspersed Elements (LINE-1s or L1s) are abundant non-LTR retrotransposons in mammalian genomes that are capable of insertional mutagenesis. They have been associated with target site deletions upon insertion in cell culture studies of retrotransposition. Here, we report 50 deletion events in the human and chimpanzee genomes directly linked to the insertion of L1 elements, resulting in the loss of ~18 kb of sequence from the human genome and ~15 kb from the chimpanzee genome. Our data suggest that during the primate radiation, L1 insertions may have deleted up to 7.5 Mb of target genomic sequences. While the results of our in vivo analysis differ from those of previous cell culture assays of L1 insertion-mediated deletions in terms of the size and rate of sequence deletion, evolutionary factors can reconcile the differences. We report a pattern of genomic deletion sizes similar to those created during the retrotransposition of Alu elements. Our study provides support for the existence of different mechanisms for small and large L1-mediated deletions, and we present a model for the correlation of L1 element size and the corresponding deletion size. In addition, we show that internal rearrangements can modify L1 structure during retrotransposition events associated with large deletions.

8.
We have determined the genome structure of the centromeric region of Arabidopsis thaliana chromosome 4 by sequence analysis of BAC clones obtained by genome walking, followed by construction of a physical map using DNA of a hypomethylated strain. The total size of the centromeric region, corresponding to the recombinant inbred (RI) markers between mi87 and mi167, was approximately 5.3 megabases (Mb). This value is over 3 Mb longer than that previously estimated by the Arabidopsis Genome Initiative (Nature, 408, 796-815, 2000). Although we could not cover the entire centromeric region by BAC clones because of the presence of highly repetitive sequences in the middle (2.7 Mb), the cloned regions spanning approximately 1 Mb at both sides of the gap were newly sequenced. These results together with the reported sequences in the adjacent regions suggest that the centromeric region is principally composed of a central domain of 2.7 Mb, consisting of mainly 180-bp repeats and Athila elements, and upper and lower flanking regions of 1.55 Mb and 1 Mb, respectively. The flanking regions were predominantly composed of various types of transposable elements, except for the upper end moiety in which a large 5S rDNA array (0.65 Mb) and central domain-like sequence are present. Such an organization is essentially identical to the centromeric region of chromosome 5 reported previously.

9.
Genomics, 2021, 113(4): 2189-2198
Sooty moulds are fungi of economic importance with a unique lifestyle, growing mainly on insect honeydew. However, the lack of genomic data hinders investigation of the genetic mechanisms underlying their ecological adaptation. With long-read sequencing technology, we generated the genome of Scorias spongiosa, an extraordinary sooty mould fungus associated with honeydew of colony aphids and producing large fruiting bodies. A 24.21 Mb high-quality genome assembly with an N50 length of 3.37 Mb was obtained. The genome contained 7758 protein-coding genes, 97.13% of which were homologous to known genes, and approximately 0.29 Mb of repeat sequences. Comparative genomics showed that S. spongiosa lost relatively more gene families and contained fewer species-specific genes and gene families, with many CAZyme families and sugar transporters reduced or absent. This study not only promotes understanding of the ecological adaptation of sooty moulds, but also provides a valuable genomic data resource for future comparative genomic and genetic studies.

10.
Retrotransposons and their remnants often constitute more than 50% of higher plant genomes. Although extensively studied in monocot crops such as maize (Zea mays) and rice (Oryza sativa), the impact of retrotransposons on dicot crop genomes is not well documented. Here, we present an analysis of retrotransposons in soybean (Glycine max). Analysis of approximately 3.7 megabases (Mb) of genomic sequence, including 0.87 Mb of pericentromeric sequence, uncovered 45 intact long terminal repeat (LTR)-retrotransposons. The ratio of intact elements to solo LTRs was 8:1, one of the highest reported to date in plants, suggesting that removal of retrotransposons by homologous recombination between LTRs is occurring more slowly in soybean than in previously characterized plant species. Analysis of paired LTR sequences uncovered a low frequency of deletions relative to base substitutions, indicating that removal of retrotransposon sequences by illegitimate recombination is also operating more slowly. Significantly, we identified three subfamilies of nonautonomous elements that have replicated in the recent past, suggesting that retrotransposition can be catalyzed in trans by autonomous elements elsewhere in the genome. Analysis of 1.6 Mb of sequence from Glycine tomentella, a wild perennial relative of soybean, uncovered 23 intact retroelements, two of which had accumulated no mutations in their LTRs, indicating very recent insertion. A similar pattern was found in 0.94 Mb of sequence from Phaseolus vulgaris (common bean). Thus, autonomous and nonautonomous retrotransposons appear to be both abundant and active in Glycine and Phaseolus. The impact of nonautonomous retrotransposon replication on genome size appears to be much greater than previously appreciated.

11.
Artemia is an industrially important genus used in aquaculture as a nutritious diet for fish and as an aquatic model organism for toxicity tests. However, despite the significance of Artemia, genomic research remains incomplete and knowledge of its genomic characteristics is insufficient. In particular, Artemia franciscana of North America has been widely used in fisheries on other continents, resulting in invasion of the habitats of native species. Therefore, studies on population genetics and molecular marker development, as well as morphological analyses, are required to investigate its population structure and to discriminate closely related species. Here, we used the Illumina HiSeq platform to estimate the genomic characteristics of A. franciscana through genome survey sequencing (GSS). Further, simple sequence repeat (SSR) loci were identified for microsatellite marker development. The predicted genome size was ∼867 Mb using K-mer (a sequence of k characters in a string) analysis (K = 17), and heterozygosity and duplication rates were 0.655 and 0.809%, respectively. A total of 421,467 SSRs were identified from the genome survey assembly, most of which were dinucleotide motifs with a frequency of 77.22%. The present study will be a useful basis for genomic and genetic research on A. franciscana.

12.
Pineapple (Ananas comosus (L.) Merrill) is the second most important tropical fruit in terms of international trade. The availability of whole genomic sequences and expressed sequence tags (ESTs) offers an opportunity to identify and characterize microsatellite or simple sequence repeat (SSR) markers in pineapple. A total of 278,245 SSRs and 41,962 SSRs, with overall densities of 728.57 SSRs/Mb and 619.37 SSRs/Mb, were mined from genomic and EST sequences, respectively. 5′-untranslated regions (5′-UTRs) had the greatest number of SSRs, with an SSR density 3.6–5.2 fold higher than that of other regions. For repeat length, 12 bp was the predominant repeat length in both the assembled genome and ESTs. Class I SSRs were underrepresented compared with class II SSRs. For motif length, dinucleotide repeats were the most abundant in genomic sequences, whereas trinucleotides were the most common motif in ESTs. Tri- and hexanucleotide repeats accounted for a larger share of total SSRs in ESTs than in the whole genome. The SSR frequency decreased dramatically as the number of repeat units increased. AT was the most frequent single motif across the entire genome, while AG was the most abundant motif in ESTs. Across six examined plant species, the pineapple genome displayed the highest SSR density, substantially higher than that of the second-place cucumber. Annotation and expression analyses were also conducted for genes containing SSRs. This thorough analysis of SSR markers in pineapple provided valuable information on the frequency and distribution of SSRs in the pineapple genome. This genomic resource will expedite genomic research and pineapple improvement.

13.
Moso bamboo (Phyllostachys pubescens) is one of the world’s most important bamboo species. It has the largest area of all planted bamboo (over two-thirds of the total bamboo forest area) and the highest economic value in China. Moso bamboo is a tetraploid (4x=48) and a special member of the grass family. Although several genomes have been sequenced or are being sequenced in the grass family, we know little about the genome of the bambusoids (bamboos). In this study, the moso bamboo genome size was estimated to be about 2034 Mb by flow cytometry (FCM), using maize (cv. B73) and rice (cv. Nipponbare) as internal references. The rice genome has been sequenced and the maize genome is being sequenced. We found that the size of the moso bamboo genome was similar to that of maize but significantly larger than that of rice. To determine whether the bamboo genome had a high proportion of repeat elements, similar to that of the maize genome, approximately 1000 genome survey sequences (GSS) were generated. Sequence analysis showed that the proportion of repeat elements was 23.3% for the bamboo genome, which is significantly lower than that of the maize genome (65.7%). The bamboo repeat elements were mainly Gypsy/DIRS1 and Ty1/Copia LTR retrotransposons (14.7%), with a few DNA transposons. However, more genomic sequences are needed to confirm the above results due to several factors, such as the limited size of our GSS data. This study is the first to investigate the sequence composition of the bamboo genome. Our results are valuable for future genome research on moso and other bamboos.

14.
Opium poppy (Papaver somniferum L.) is an important pharmaceutical crop with very few genetic marker resources. To expand these resources, we sequenced genomic DNA using pyrosequencing technology and examined the DNA sequences for simple sequence repeats (SSRs). A total of 1,244,412 sequence reads were obtained covering 474 Mb. Approximately half of the reads (52 %) were assembled into 166,724 contigs representing 105 Mb of the opium poppy genome. A total of 23,283 non-redundant SSRs were identified in 18,944 contigs (11.3 % of total contigs). Trinucleotide and tetranucleotide repeats were the most abundant SSR types, accounting for 49.0 and 27.9 % of all SSRs, respectively. The AAG/TTC repeat was the most abundant trinucleotide repeat, representing 19.7 % of trinucleotide repeats. Other SSR repeat types were AT-rich. A total of 23,126 primer pairs (98.7 % of total SSRs) were designed to amplify SSRs. Fifty-three genomic SSR markers were tested in 37 opium poppy accessions and seven Papaver species for determination of polymorphism and transferability. Intraspecific polymorphism information content (PIC) values of the genomic SSR markers were intermediate, with an average of 0.17, while the interspecific average PIC value was slightly higher, at 0.19. All markers showed at least 88 % transferability among related species. This study increases sequence coverage of the opium poppy genome by sevenfold and the number of opium poppy-specific SSR markers by sixfold. This is the first report of the development of genomic SSR markers in opium poppy, and the genomic SSR markers developed in this study will be useful in diversity, identification, mapping and breeding studies in opium poppy.

15.
Genomics, 2020, 112(6): 4742-4748
The flathead fish Platycephalus sp.1 is an ecologically and commercially important marine fish in the northwestern Pacific with notable sexual differences in growth and development. Yet genomic data for this species are lacking. In the present study, whole genome sequencing of two individuals (one male and one female) of Platycephalus sp.1 was conducted to provide fundamental genomic information. The genome sizes were estimated to be 674.96 Mb (male) and 684.15 Mb (female) by using k-mer analyses. The heterozygosity and repeat ratios suggested possible male heterogamety in this species. The draft genome sequences were initially assembled and genome-wide microsatellite motifs were identified. In addition, the complete mitochondrial genome sequences were assembled, and the phylogenetic analyses genetically supported the validity of Platycephalus sp.1. The genomic data and genetic markers reported in this study could be useful in future comparative genomics and evolutionary biology studies.

16.
Because of its popularity as an ornamental plant in East Asia, mei (Prunus mume Sieb. et Zucc.) has received increasing attention in genetic and genomic research with the recent shotgun sequencing of its genome. Here, we performed the genome-wide characterization of simple sequence repeats (SSRs) in the mei genome and detected a total of 188,149 SSRs occurring at a frequency of 794 SSR/Mb. Mononucleotide repeats were the most common type of SSR in genomic regions, followed by di- and tetranucleotide repeats. Most of the SSRs in coding sequences (CDS) were composed of tri- or hexanucleotide repeat motifs, but mononucleotide repeats were always the most common in intergenic regions. Genome-wide comparison of SSR patterns among the mei, strawberry (Fragaria vesca), and apple (Malus×domestica) genomes showed mei to have the highest density of SSRs, slightly higher than that of strawberry (608 SSR/Mb) and almost twice as high as that of apple (398 SSR/Mb). Mononucleotide repeats were the dominant SSR motifs in the three Rosaceae species. Using 144 SSR markers, we constructed a 670 cM-long linkage map of mei delimited into eight linkage groups (LGs), with an average marker distance of 5 cM. Seventy-one scaffolds covering about 27.9% of the assembled mei genome were anchored to the genetic map, on the basis of which the macro-colinearity between the mei genome and the Prunus T×E reference map was identified. The framework map of mei provides a first step toward subsequent high-resolution genetic mapping and marker-assisted selection for this ornamental species.
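Genome-wide SSR characterization of the kind summarized above is usually done with dedicated tools; the abstract does not name the software or thresholds used. The following is only a minimal regex-based sketch of the idea: scan a sequence for perfect tandem repeats of 1 to 6 bp motifs. The `MIN_REPEATS` thresholds, the function name, and the test sequence are hypothetical choices for illustration.

```python
import re

# Hypothetical minimum repeat counts per motif length (1-6 bp).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # A motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit ("AA", "ATAT"),
            # since those runs belong to the shorter motif class.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)


# Made-up test sequence: an (AT)8 repeat and an (A)12 run.
demo = "GGC" + "AT" * 8 + "CCG" + "A" * 12 + "TTG"
ssrs = find_ssrs(demo)
print(ssrs)
```

A density in SSR/Mb, as quoted in the abstract, would then be the number of hits divided by the sequence length in megabases.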

17.
Triticeae species (including wheat, barley and rye) have huge and complex genomes due to polyploidization and a high content of transposable elements (TEs). TEs are known to play a major role in the structure and evolutionary dynamics of Triticeae genomes. During the last 5 years, substantial stretches of contiguous genomic sequence from various species of Triticeae have been generated, making it necessary to update and standardize TE annotations and nomenclature. In this study we propose standard procedures for these tasks, based on structure, nucleic acid and protein sequence homologies. We report statistical analyses of TE composition and distribution in large blocks of genomic sequences from wheat and barley. Altogether, 3.8 Mb of wheat sequence available in the databases was analyzed or re-analyzed, and compared with 1.3 Mb of re-annotated genomic sequences from barley. The wheat sequences were relatively gene-rich (one gene per 23.9 kb), although wheat gene-derived sequences represented only 7.8% (159 elements) of the total, while the remainder mainly comprised coding sequences found in TEs (54.7%, 751 elements). Class I elements [mainly long terminal repeat (LTR) retrotransposons] accounted for the major proportion of TEs, in terms of sequence length as well as element number (83.6% and 498, respectively). In addition, we show that the gene-rich sequences of wheat genome A seem to have a higher TE content than those of genomes B and D, or of barley gene-rich sequences. Moreover, among the various TE groups, MITEs were most often associated with genes: 43.1% of MITEs fell into this category. Finally, the TRIM and copia elements were shown to be the most active TEs in the wheat genome. The implications of these results for the evolution of diploid and polyploid wheat species are discussed. Electronic Supplementary Material: Supplementary material is available for this article at

18.
Sebastiscus species, marine rockfishes, are of essential economic value. However, genomic data for this genus are lacking and incomplete. Here, whole genome sequencing of all species of Sebastiscus was conducted to provide fundamental genomic information. The genome sizes were estimated to be 802.49 Mb (S. albofasciatus), 786.79 Mb (S. tertius), and 776.00 Mb (S. marmoratus) by using k-mer analyses. The draft genome sequences were initially assembled, and genome-wide microsatellite motifs were identified. The heterozygosity, repeat ratios, and numbers of microsatellite motifs all suggested that S. tertius may be more closely related to S. albofasciatus than to S. marmoratus at the genetic level. Moreover, the complete mitochondrial genome sequences were assembled from the whole genome data, and the phylogenetic analyses genetically supported the validity of the Sebastiscus species. This study provides an important genome resource for further studies of Sebastiscus species.

19.
Powdery mildew of wheat (Triticum aestivum L.) is caused by the ascomycete fungus Blumeria graminis f.sp. tritici. Genomic approaches open new ways to study the biology of this obligate biotrophic pathogen. We started the analysis of the Bg tritici genome with low-pass sequencing using the 454 technology and the construction of the first genomic bacterial artificial chromosome (BAC) library for this fungus. High-coverage contigs were assembled with the 454 reads. They allowed the characterization of 56 transposable elements and the establishment of the Blumeria repeat database. The BAC library contains 12,288 clones with an average insert size of 115 kb, which represents a maximum of 7.5-fold genome coverage. Sequencing of the BAC ends generated 12.6 Mb of random sequence representative of the genome. Analysis of BAC-end sequences revealed a massive invasion of transposable elements accounting for at least 85% of the genome. This explains the unusually large size of this genome, which we estimate to be at least 174 Mb, based on a large-scale physical map constructed through the fingerprinting of the BAC library. Our study represents a crucial step toward determining and studying the whole Bg tritici genome sequence.

20.
Ricinus communis is a versatile industrial oil crop that is cultivated worldwide. Genetic improvement and marker-assisted breeding of castor bean have been slowed owing to the lack of abundant and efficient molecular markers. As co-dominant markers, simple sequence repeats (SSRs) are useful for genetic evaluation and molecular breeding. The recently released whole-genome sequence of castor bean provides useful genomic resources for developing markers on a genome-wide scale. In the present study, the distribution and frequency of microsatellites in the castor bean genome were characterised and numerous SSR markers were developed using genomic data mining. In total, 18,647 SSR loci at a density of one SSR per 18.89 kb in the castor bean genome sequence (representing approximately 352.27 Mb) were identified. Dinucleotide repeats were the most frequently observed microsatellites, although the AAT repeat motif was also prevalent. Using six cultivars as screening samples, 670 polymorphic SSR markers from 1,435 primer pairs (46.7 %) were developed. Trinucleotide motif loci contained a higher proportion of polymorphisms (48.5 %) than dinucleotide motif loci (39.2 %). The polymorphism level in the SSR loci was positively correlated with the number of repeat units in the microsatellites. The phylogenetic relationship among 32 varieties was evaluated using the developed SSR markers. Cultivars developed at the same institute clustered together, suggesting that these cultivars have a narrow genetic background. The large number of SSR markers developed in this study will be useful for genetic mapping and for breeding improved castor-oil plants. These markers will also facilitate genetic and genomic studies of Euphorbiaceae.
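The density and polymorphism figures quoted in this abstract can be cross-checked with simple arithmetic. The short sketch below uses only numbers stated in the abstract; the variable names are illustrative.

```python
# Figures as stated in the abstract.
genome_mb = 352.27          # sequence length in Mb
ssr_loci = 18_647           # SSR loci identified
primer_pairs = 1_435        # primer pairs screened
polymorphic_markers = 670   # polymorphic SSR markers obtained

# Average spacing between SSR loci, in kb (1 Mb = 1000 kb).
kb_per_ssr = genome_mb * 1000 / ssr_loci

# Share of screened primer pairs that yielded polymorphic markers.
polymorphic_pct = polymorphic_markers / primer_pairs * 100

print(round(kb_per_ssr, 2), round(polymorphic_pct, 1))
```

Both values agree with the abstract: about one SSR per 18.89 kb, and 46.7 % of primer pairs yielding polymorphic markers.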
