Similar literature
20 similar articles found.
1.
Genomics, 2023, 115(5): 110661
We report the sequencing and assembly of the PH8 strain of Leishmania amazonensis, one of the etiological agents of leishmaniasis. After combining data from long PacBio reads, short Illumina reads and synteny with the Leishmania mexicana genome, the sequence of 34 chromosomes with 8317 annotated genes was generated. Multigene families encoding three virulence factors, A2, amastins and the GP63 metalloproteases, were identified and compared to their annotation in other Leishmania species. Because proteins involved in parasite iron and heme metabolism have recently been recognized as virulence factors essential for disease establishment and progression of the infection, we also identified 14 genes encoding such proteins and compared them to genes from other trypanosomatids. To follow these studies with a genetic approach to address the role of virulence factors, we tested two CRISPR-Cas9 protocols to generate L. amazonensis knockout cell lines, using the Miltefosine transporter gene as a proof of concept.

2.
Clonorchis sinensis (family Opisthorchiidae) is an important foodborne parasite that has a major socioeconomic impact on ~35 million people predominantly in China, Vietnam, Korea and the Russian Far East. In humans, infection with C. sinensis causes clonorchiasis, a complex hepatobiliary disease that can induce cholangiocarcinoma (CCA), a malignant cancer of the bile ducts. Central to understanding the epidemiology of this disease is knowledge of genetic variation within and among populations of this parasite. Although most published molecular studies seem to suggest that C. sinensis represents a single species, evidence of karyotypic variation within C. sinensis and cryptic species within a related opisthorchiid fluke (Opisthorchis viverrini) emphasise the importance of studying and comparing the genes and genomes of geographically distinct isolates of C. sinensis. Recently, we sequenced, assembled and characterised a draft nuclear genome of a C. sinensis isolate from Korea and compared it with a published draft genome of a Chinese isolate of this species using a bioinformatic workflow established for comparing draft genome assemblies and their gene annotations. We identified that 50.6% and 51.3% of the Korean and Chinese C. sinensis genomic scaffolds were syntenic, respectively. Within aligned syntenic blocks, the genomes had a high level of nucleotide identity (99.1%) and encoded 15 variable proteins likely to be involved in diverse biological processes. Here, we review current technical challenges of using draft genome assemblies to undertake comparative genomic analyses to quantify genetic variation between isolates of the same species. Using a workflow that overcomes these challenges, we report on a high-quality draft genome for C. sinensis from Korea and comparative genomic analyses, as a basis for future investigations of the genetic structures of C. sinensis populations, and discuss the biotechnological implications of these explorations.

3.
De novo microbial genome sequencing reached a turning point with third-generation sequencing (TGS) platforms, and several microbial genomes have been improved by TGS long reads. Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and it is used in the production of the traditional Japanese fermented food “natto.” The B. subtilis natto BEST195 genome was previously sequenced with short reads, but it included some incomplete regions. We resequenced the BEST195 genome using a PacBio RS sequencer and obtained a complete genome sequence as a single scaffold without any gaps; we also applied Illumina MiSeq short reads to enhance quality. Comparison with the previous BEST195 draft genome and the Marburg 168 genome showed that the incomplete regions in the previous genome sequence were attributable to GC bias and repetitive sequences, and we also identified some novel genes that are found only in the new genome.

4.
Genetic diversity within parental lines of hybrid rice is the foundation of heterosis utilization and yield improvement. Previous studies have suggested that genetic diversity was narrow in cytoplasmic male sterile (CMS/A line) and restorer lines (R line) for three-line hybrid rice. However, the genetic diversity within maintainer lines (B line), especially at a genome-wide scale, remains largely unknown. In the present study, we performed deep re-sequencing of the elite maintainer line V20B (Oryza sativa L. ssp. indica). We then compared the V20B sequence with the 93-11 (Oryza sativa L. ssp. indica) genome sequence. A total of 112.1 × 10^6 paired-end reads (PE reads) were generated with approximately 30-fold sequencing depth. The V20B PE reads uniquely covered 87.6 % of the 93-11 genome sequence. Overall, a total of 660,778 single-nucleotide polymorphisms (SNPs) and 266,301 insertions and deletions (InDels) were identified, yielding an average of 2.1 SNPs/kb and 0.8 InDels/kb. Genome-wide distribution of the SNPs and InDels was non-random, and variation-rich and variation-poor regions were identified in all chromosomes. A total of 20,562 non-synonymous SNPs spanning 8,854 genes were annotated. Our results identified DNA polymorphisms at the genome-wide scale and uncovered the high level of genetic diversity between V20B and 93-11. Our results showed that next-generation sequencing technologies can be powerful tools to study genome-wide DNA polymorphisms, to query genetic diversity, and to enable molecular improvement efforts with three-line hybrid rice. Further, our results also indicated that 93-11 could be used as core germplasm for the improvement of wild-abortive CMS lines and the maintainer lines.
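
The two reported densities (2.1 SNPs/kb and 0.8 InDels/kb) can be checked against the reported counts without knowing the exact denominator. The sketch below is illustrative only: it assumes each density was rounded to one decimal place and asks whether both counts imply an overlapping range of analysed sequence length. The function name and the rounding assumption are ours, not the authors'.

```python
def implied_length_kb(n_variants: int, density_per_kb: float,
                      half_step: float = 0.05) -> tuple[float, float]:
    """Range of sequence lengths (kb) compatible with a rounded density.

    Assumes the density was rounded to one decimal place, so the true value
    lies within +/- half_step of the reported figure (an assumption).
    """
    low = n_variants / (density_per_kb + half_step)
    high = n_variants / (density_per_kb - half_step)
    return low, high

# Counts and densities as reported in the abstract.
snp_range = implied_length_kb(660_778, 2.1)
indel_range = implied_length_kb(266_301, 0.8)

# The figures are mutually consistent if the two ranges overlap.
overlap_low = max(snp_range[0], indel_range[0])
overlap_high = min(snp_range[1], indel_range[1])
consistent = overlap_low < overlap_high
print(f"SNP-implied length:   {snp_range[0]:.0f}-{snp_range[1]:.0f} kb")
print(f"InDel-implied length: {indel_range[0]:.0f}-{indel_range[1]:.0f} kb")
print("Consistent:", consistent)
```

An overlap only shows that the rounded figures agree with one another; it does not recover the denominator the authors actually used.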

5.
Detecting single nucleotide polymorphisms (SNPs) between genomes is becoming a routine task with next-generation sequencing. Generally, SNP detection methods use a reference genome. As non-model organisms are increasingly investigated, the need for reference-free methods has been amplified. Most of the existing reference-free methods have fundamental limitations: they can only call SNPs between exactly two datasets, and/or they require a prohibitive amount of computational resources. The method we propose, discoSnp, detects both heterozygous and homozygous isolated SNPs from any number of read datasets, without a reference genome, and with very low memory and time footprints (billions of reads can be analyzed with a standard desktop computer). To facilitate downstream genotyping analyses, discoSnp ranks predictions and outputs quality and coverage per allele. Compared to finding isolated SNPs using a state-of-the-art assembly and mapping approach, discoSnp requires significantly fewer computational resources and shows similar precision/recall values, and its highly ranked predictions are less likely to be false positives. An experimental validation was conducted on an arthropod species (the tick Ixodes ricinus) on which de novo sequencing was performed. Among the predicted SNPs that were tested, 96% were successfully genotyped and truly exhibited polymorphism.

6.
Background: Mosquitoes host and transmit numerous arthropod-borne viruses (arboviruses) that cause disease in both humans and animals. Effective surveillance of virome profiles in mosquitoes is vital to the prevention and control of mosquito-borne diseases in northwestern China, where epidemics occur frequently. Methods: Mosquitoes were collected in the Shaanxi-Gansu-Ningxia region (Shaanxi Province, Gansu Province, and Ningxia Hui Autonomous Region) of China from June to August 2019. Morphological methods were used for taxonomic identification of mosquito species. High-throughput sequencing and metagenomic analysis were used to characterize mosquito viromes. Results: A total of 22,959 mosquitoes were collected, including Culex pipiens (45.7%), Culex tritaeniorhynchus (40.6%), Anopheles sinensis (8.4%), Aedes (5.2%), and Armigeres subalbatus (0.1%). In total, 3,014,183 (0.95% of clean reads) viral sequences were identified and assigned to 116 viral species (including pathogens such as Japanese encephalitis virus and Getah virus) in 31 viral families, including Flaviviridae, Togaviridae, Phasmaviridae, Phenuiviridae, and some unclassified viruses. Mosquitoes collected in July (86 species in 26 families) showed greater viral diversity than those from June and August. Culex pipiens (69 species in 25 families) and Culex tritaeniorhynchus (73 species in 24 families) carried more viral species than Anopheles sinensis (50 species in 19 families) or Aedes (38 species in 20 families) mosquitoes. Conclusion: Viral diversity and abundance were affected by mosquito species and collection time. The present study elucidates the virome compositions of various mosquito species in northwestern China, improving the understanding of virus transmission dynamics for comparison with those of disease outbreaks.

7.
8.
Information about relatedness between individuals in wild populations is advantageous when studying evolutionary, behavioural and ecological processes. Genomic data can be used to determine relatedness between individuals either when no prior knowledge exists or to confirm suspected relatedness. Here we present a set of 96 SNPs suitable for inferring relatedness for brown bears (Ursus arctos) within Scandinavia. We sequenced reduced representation libraries from nine individuals throughout the geographic range. With consensus reads containing putative SNPs, we applied strict filtering criteria with the aim of finding only high-quality, highly informative SNPs. We tested 150 putative SNPs, of which 96% were validated on a panel of 68 individuals. Ninety-six of the validated SNPs with the highest minor allele frequency were selected. The final SNP panel includes four mitochondrial markers, two monomorphic Y-chromosome sex-determination markers, three X-chromosome SNPs and 87 autosomal SNPs. From our validation sample panel, we identified two previously known parent-offspring dyads with reasonable accuracy. This panel of SNPs is a promising tool for inferring relatedness in the brown bear population in Scandinavia.
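
The panel figures in this abstract can be tallied directly. The following minimal sketch only re-adds the numbers as reported; the dictionary keys and variable names are our own labels for illustration.

```python
# Marker counts of the final panel, as reported in the abstract.
panel = {
    "mitochondrial": 4,
    "Y-chromosome": 2,
    "X-chromosome": 3,
    "autosomal": 87,
}
panel_size = sum(panel.values())

# 96% of the 150 tested putative SNPs were reported as validated.
tested = 150
validated = round(tested * 0.96)

print("Panel size:", panel_size)
print("Validated SNPs:", validated)
# The final panel is drawn from the validated set, so it cannot be larger.
print("Panel fits within validated set:", panel_size <= validated)
```

Note that "96 SNPs" (the panel size) and "96%" (the validation rate) are two different quantities that happen to share a number.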

9.
10.
11.
The draft genome sequences of several primates are available, providing insights into evolutionary and anthropological research. However, genomic resources from New World monkeys are conspicuously lacking. To date, the genomes of only two platyrrhine species, the common marmoset and the Bolivian squirrel monkey, have been fully sequenced. This is especially limiting for comparative genomics research, considering that New World monkeys are the most speciose primate group, and platyrrhine genetic diversity is comparable to that of the catarrhines (i.e. apes and Old World monkeys). Here, we present the generation and annotation of numerous sequence reads from the genomes of Spider monkey (Ateles belzebuth), Owl monkey (Aotus lemurinus) and Uakari (Cacajao calvus), representing the three platyrrhine families, Atelidae, Cebidae and Pitheciidae, respectively. These sequencing reads were developed from gDNA shotgun libraries containing over 3000 individual sequences with an average length of 726 bp. Of these sequences, 1220 contain <20% repeats, and thus are potentially highly useful phylogenetic markers for other platyrrhine species. Among them, a large number of sequencing reads were found to match unique regions within the human (2462 sequences) and the marmoset (2829 sequences) genomes. In particular, the majority of these sequencing reads are from putatively neutrally evolving intergenic regions. Thus, they are likely to be highly informative for inferring neutral evolutionary patterns and genomic evolution for other New World monkeys.

12.
13.
14.
Despite the ever-increasing output of next-generation sequencing data along with developing assemblers, dozens to hundreds of gaps still exist in de novo microbial assemblies due to uneven coverage and large genomic repeats. Third-generation single-molecule, real-time (SMRT) sequencing technology avoids amplification artifacts and generates kilobase-long reads with the potential to complete microbial genome assembly. However, due to the low accuracy (~85%) of third-generation sequences, a considerable number of long reads (>50X) are required for self-correction and for subsequent de novo assembly. Recently developed hybrid approaches, using next-generation sequencing data and as few as 5X long reads, have been proposed to improve the completeness of microbial assembly. In this study we have evaluated the contemporary hybrid approaches and demonstrated that assembling corrected long reads (by runCA) produced the best assembly compared to long-read scaffolding (e.g., AHA, Cerulean and SSPACE-LongRead) and gap-filling (SPAdes). For generating corrected long reads, we further examined long-read correction tools, such as ECTools, LSC, LoRDEC, PBcR pipeline and proovread. We have demonstrated that three microbial genomes including Escherichia coli K12 MG1655, Meiothermus ruber DSM1279 and Pedobacter heparinus DSM2366 were successfully hybrid assembled by runCA into near-perfect assemblies using ECTools-corrected long reads. In addition, we developed a tool, Patch, which takes corrected long reads and pre-assembled contigs as inputs, to enhance microbial genome assemblies. With the additional 20X long reads, short reads of S. cerevisiae W303 were hybrid assembled into 115 contigs using the verified strategy, ECTools + runCA. Patch was subsequently applied to upgrade the assembly to a 35-contig draft genome.
Our evaluation of the hybrid approaches shows that assembling the ECTools-corrected long reads via runCA generates near-complete microbial genomes, suggesting that genome assembly could benefit from re-analyzing the available hybrid datasets that were not assembled in an optimal fashion.

15.
16.
17.
18.
A highly lethal hemorrhagic disease associated with infection by elephant endotheliotropic herpesvirus (EEHV) poses a severe threat to Asian elephant husbandry. We have used high-throughput methods to sequence the genomes of the two genotypes that are involved in most fatalities, namely, EEHV1A and EEHV1B (species Elephantid herpesvirus 1, genus Proboscivirus, subfamily Betaherpesvirinae, family Herpesviridae). The sequences were determined from postmortem tissue samples, despite the data containing tiny proportions of viral reads among reads from a host for which the genome sequence was not available. The EEHV1A genome is 180,421 bp in size and consists of a unique sequence (174,601 bp) flanked by a terminal direct repeat (2,910 bp). The genome contains 116 predicted protein-coding genes, of which six are fragmented, and seven paralogous gene families are present. The EEHV1B genome is very similar to that of EEHV1A in structure, size, and gene layout. Half of the EEHV1A genes lack orthologs in other members of subfamily Betaherpesvirinae, such as human cytomegalovirus (genus Cytomegalovirus) and human herpesvirus 6A (genus Roseolovirus). Notable among these are 23 genes encoding type 3 membrane proteins containing seven transmembrane domains (the 7TM family) and seven genes encoding related type 2 membrane proteins (the EE50 family). The EE50 family appears to be under intense evolutionary selection, as it is highly diverged between the two genotypes, exhibits evidence of sequence duplications or deletions, and contains several fragmented genes. The availability of the genome sequences will facilitate future research on the epidemiology, pathogenesis, diagnosis, and treatment of EEHV-associated disease.
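
The three EEHV1A sizes quoted above add up if the 2,910 bp direct repeat is counted once at each end of the unique sequence. The sketch below checks that reading; the function and constant names are ours, and the "one copy per terminus" layout is our interpretation of "flanked by a terminal direct repeat", not a detail stated by the authors.

```python
# Sizes reported in the abstract, in base pairs.
UNIQUE_BP = 174_601
TERMINAL_REPEAT_BP = 2_910
REPORTED_GENOME_BP = 180_421

def genome_length(unique_bp: int, repeat_bp: int, copies: int = 2) -> int:
    """Total length of a linear genome: unique region plus terminal repeats.

    copies=2 assumes one repeat copy at each terminus (our interpretation).
    """
    return unique_bp + copies * repeat_bp

computed = genome_length(UNIQUE_BP, TERMINAL_REPEAT_BP)
print("Computed length:", computed)
print("Matches reported size:", computed == REPORTED_GENOME_BP)
```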

19.
20.
The genus Citrus comprises important fruit crops and is a nutritional source for human health. Cytochrome P450s represent about 1 % of the proteome and mediate diverse biochemical reactions pertaining to both primary and secondary metabolism. Analysis of Citrus genomic resources identified 296 plant cytochrome P450 (CYP) coding genes in Citrus clementina, 272 in double haploid (dh) Citrus sinensis, and 202 in C. sinensis. In C. clementina and dh C. sinensis, CYP genes are distributed into nine clans. In the three genomes, single-intron-containing CYP genes are predominant in the A-type families. Among non-A-type CYP families, multiple-intron-containing genes are predominant. The larger number of genes in A-type CYP families than in non-A-type families is attributed to rapid evolution of A-type genes facilitated by their gene organization. Further, the complex gene organization of non-A-type genes, with multiple introns, might have contributed to the slower evolution of paralogs. The majority of introns (1,660) from the three genomes showed canonical GT-AG splice sites. However, 33 introns showed non-conventional GC… PyAG splice sites, and the functionality of these splice sites is confirmed by ESTs lacking the intron. Across the families, gene organization is conserved between the three genomes. In dh C. sinensis, 22 genes were identified as undergoing alternative splicing. Examination of scaffolds in C. clementina revealed that the majority of the Citrus CYP genes are solitary and a few of them are in clusters of 3–8 genes. PCR amplification of C. sinensis genomic DNA with gene-specific primers failed to amplify the out-grouped genes Ccl-CYP706A16 and Ccl-CYP706B1, confirming that they are specific to C. clementina. The difference in the number of CYP genes between C. clementina and C. sinensis is attributed to the extent of variability between their parents representing ancestral taxa.
