Similar literature
1.
Oxford Nanopore MinION Sequencing and Genome Assembly   Total citations: 1, self-citations: 0, citations by others: 1
The revolution in genome sequencing is continuing after the success of second-generation sequencing (SGS) technology. Third-generation sequencing (TGS) technology, led by Pacific Biosciences (PacBio), is progressing rapidly, moving from a technology once capable only of providing data for small-genome analysis or targeted screening to one that promises high-quality de novo assembly and structural variation detection for human-sized genomes. In 2014, the MinION, the first commercial sequencer using nanopore technology, was released by Oxford Nanopore Technologies (ONT). MinION identifies DNA bases by measuring the changes in electrical conductivity generated as DNA strands pass through a biological pore. Its portability, affordability, and speed in data production make it suitable for real-time applications, and the release of this long-read sequencer has thus generated much excitement and interest in the genomics community. While de novo genome assemblies can be cheaply produced from SGS data, assembly continuity is often relatively poor, due to the limited ability of short reads to handle long repeats. Assembly quality can be greatly improved by using TGS long reads, since longer reads can span repetitive regions, despite having higher error rates at the base level. The potential of nanopore sequencing has been demonstrated by various studies in genome surveillance at locations where rapid and reliable sequencing is needed but resources are limited.

2.
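Extant moles (family Talpidae) are distributed across Eurasia and North America and comprise 54 known species. Talpids show a rich variety of ecological types, making them a good model for studying adaptive evolution. In this study, we used second-generation sequencing to obtain transcriptome data from the heart and lung of the long-nosed shrew mole (长吻鼩鼹) and from the spleen and lung of Kuznetsov's long-nosed mole (库氏长吻鼹). These two species represent, respectively, the subfamily Uropsilinae, an unspecialized primitive group distributed in southwestern China and northern Myanmar, and the tribe Talpini of the subfamily Talpinae, which is highly adapted to subterranean life. We report the occurrence of Kuznetsov's long-nosed mole in China for the first time. De novo assembly yielded 197,092 and 225,956 transcripts, and 125,427 and 94,023 unigenes, for the long-nosed shrew mole and Kuznetsov's long-nosed mole, respectively. By comparison with genome annotation files in GenBank, we obtained 8,376 homologous gene families among talpid species and 8,114 homologous gene families shared between Talpidae and Soricidae. Among the highly expressed genes of each tissue in the differential expression analysis, more than 10 tissue-specific genes were found in every case. However, BUSCO analysis identified complete single-copy genes at only 43.0% and 56.6% in the two species, respectively, suggesting that mRNA degraded rapidly after death and affected transcriptome assembly. Comparing lung gene expression between the two species revealed 335 genes with relatively higher expression in Kuznetsov's long-nosed mole, including HMGB1, HSPD1, SF3B1, COL3A1, SUMO1 and JUNB; previous reports indicate that these genes may be associated with adaptation to hypoxia or high altitude...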

3.
Current challenges in de novo plant genome sequencing and assembly   Total citations: 1, self-citations: 0, citations by others: 1
Genome sequencing is now affordable, but assembling plant genomes de novo remains challenging. We assess the state of the art of assembly and review the best practices for the community.

4.
Pitzer E  Masselot A  Colinge J 《Proteomics》2007,7(17):3051-3054
De novo peptide sequencing algorithms are often tested on relatively small data sets made of excellent spectra. Since ever more tandem mass spectra are becoming available, we have assembled six large, reliable, and diverse (three mass spectrometer types) data sets intended for such tests, and we make them accessible via a web server. To exemplify their use, we investigate the performance of Lutefisk, PepNovo, and PepNovoTag, three well-established peptide de novo sequencing programs.

5.
As an increasing number of plant genome sequences become available, it is clear that gene content varies between individuals, and the challenge arises to predict the gene content of a species. However, genome comparison is often confounded by variation in assembly and annotation. Differentiating between true gene absence and variation in assembly or annotation is essential for the accurate identification of conserved and variable genes in a species. Here, we present the de novo assembly of the B. napus cultivar Tapidor and comparison with an improved assembly of the Brassica napus cultivar Darmor‐bzh. Both cultivars were annotated using the same method to allow comparison of gene content. We identified genes unique to each cultivar and differentiated these from artefacts due to variation in the assembly and annotation. We demonstrate that using a common annotation pipeline can result in different gene predictions, even for closely related cultivars, and that repeat regions which collapse during assembly impact whole-genome comparison. After accounting for differences in assembly and annotation, we demonstrate that the genome of Darmor‐bzh contains a greater number of genes than the genome of Tapidor. Our results are the first step towards comparison of the true differences between B. napus genomes and highlight the potential sources of error in future production of a B. napus pangenome.

6.
Until 2019, the human genome was available in only one fully annotated version, GRCh38, which was the result of 18 years of continuous improvement and revision. Despite dramatic improvements in sequencing technology, no other genome was available as an annotated reference until 2019, when the genome of an Ashkenazi individual, Ash1, was released. In this study, we describe the assembly and annotation of a second individual genome, from a Puerto Rican individual whose DNA was collected as part of the Human Pangenome project. The new genome, called PR1, is the first true reference genome created from an individual of African descent. Due to recent improvements in both sequencing and assembly technology, and particularly to the use of the recently completed CHM13 human genome as a guide to assembly, PR1 is more complete and more contiguous than either GRCh38 or Ash1. Annotation revealed 37,755 genes (of which 19,999 are protein coding), including 12 additional gene copies that are present in PR1 and missing from CHM13. Fifty-seven genes have fewer copies in PR1 than in CHM13, 9 map only partially, and 3 genes (all noncoding) from CHM13 are entirely missing from PR1.

7.
Scutellaria L. (family Lamiaceae) includes approximately 470 species found in most parts of the world and is commonly known as skullcaps. Scutellaria L. is a medicinal herb used as a folk remedy in Korea and East Asia, but it is difficult to identify and classify various subspecies by morphological methods. Since Scutellaria L. has not been studied genetically, to expand the knowledge of species in the genus Scutellaria L., de novo whole-genome assembly was performed in Scutellaria indica var. tsusimensis (H. Hara) Ohwi using the Illumina sequencing platform. We aimed to develop a molecular method that could be used to classify S. indica var. tsusimensis (H. Hara) Ohwi, S. indica L. and three other Scutellaria L. species. The assembly results for S. indica var. tsusimensis (H. Hara) Ohwi revealed a genome size of 318,741,328 bp and a scaffold N50 of 78,430. The assembly contained 92.08% of the conserved BUSCO core gene set and was estimated to cover 94.65% of the genome. The obtained genes were compared with previously registered Scutellaria nucleotide sequences and similar regions using the NCBI BLAST service, and a total of 279 similar nucleotide sequences were detected. By selecting the 279 similar nucleotide sequences and nine chloroplast DNA barcode genes, primers were prepared so that the size of the PCR product was 100 to 1000 bp. As a result, a species-specific primer set capable of distinguishing five species of Scutellaria L. was developed.

8.
9.
Illumina's Genome Analyzer generates ultra-short sequence reads, typically 36 nucleotides in length, and is primarily intended for resequencing. We tested the potential of this technology for de novo sequence assembly on the 6 Mbp genome of Pseudomonas syringae pv. syringae B728a with several freely available assembly software packages. Using an unpaired data set, velvet assembled >96% of the genome into contigs with an N50 length of 8289 nucleotides and an error rate of 0.33%. edena generated smaller contigs (N50 was 4192 nucleotides) with comparable error rates. ssake and vcake yielded shorter contigs with very high error rates. Assembly of paired-end sequence data carrying 400 bp inserts produced longer contigs (N50 up to 15 628 nucleotides), but with increased error rates (0.5%). Contig length and error rate were very sensitive to the choice of parameter values. Noncoding RNA genes were poorly resolved in de novo assemblies, while >90% of the protein-coding genes were assembled with 100% accuracy over their full length. This study demonstrates that, in practice, de novo assembly of 36-nucleotide reads can generate reasonably accurate assemblies from sequence data sets about 40× deep. These draft assemblies are useful for exploring an organism's proteomic potential at very low cost.

10.
11.
A report on the Strategies for de novo assemblies of complex crop genomes workshop held at The Genome Analysis Centre, Norwich, UK, 8-10 October 2012.

12.
The ability of mouse Krebs II ascites cell DNA methylase to add methyl groups to native, unmethylated DNA (de novo activity) is stimulated by limited proteolysis. The affinity of the enzyme for DNA is not altered by this treatment but the rate of reaction is increased so that 40% or more of methylatable sites are methylated within 4.5 h. The activation is associated with a decrease in size of the enzyme to 6.2 S.

13.
Several academic software tools are available to help with the validation and reporting of proteomics data generated by MS analyses. However, to our knowledge, none of them has been conceived to meet the particular needs generated by the study of organisms whose genomes are not sequenced. In that context, we have developed OVNIp, an open‐source application which facilitates the whole process of proteomics results interpretation. One of its unique attributes is its capacity to compile multiple results (from several search engines and/or several databank searches) while resolving conflicting interpretations. Moreover, OVNIp enables automated exploitation of de novo sequences generated from unassigned MS/MS spectra, leading to higher sequence coverage and enhancing confidence in the identified proteins. The exploitation of these additional spectra might also identify novel proteins through an MS‐BLAST search, which can be easily run from the OVNIp interface. Beyond this primary scope, OVNIp can also benefit users who are looking for a simple standalone application to both visualize and confirm MS/MS result interpretations through a simple graphical interface and generate reports according to user‐defined forms, which may integrate the prerequisites for publication. Sources, documentation and a stable release for Windows are available at http://wwwappli.nantes.inra.fr:8180/OVNIp .

14.
Advances in sequencing technologies for complex genomes   Total citations: 1, self-citations: 0, citations by others: 1
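Complex genomes are genomes that cannot be resolved directly with conventional sequencing and assembly approaches; the term usually refers to genomes with a high proportion of repetitive sequence, high heterozygosity, extreme GC content, or hard-to-remove contamination from foreign DNA. Solving the sequencing and assembly problems posed by complex genomes requires in-depth research in three areas: experimental methods for genome sequencing, sequencing technology platforms, and assembly algorithms and strategies. This article describes in detail the existing technologies and methods for sequencing and assembling complex genomes and, using classic examples of complex genomes, presents the technical solutions for complex genome sequencing and how they have developed. It can serve as a reference for devising a suitable sequencing strategy for a complex genome.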

15.
16.
Background: Next-generation sequencing (NGS) technologies have fostered an unprecedented proliferation of high-throughput sequencing projects and a concomitant development of novel algorithms for the assembly of short reads. However, numerous technical and computational challenges in de novo assembly still remain, although many new ideas and solutions have been suggested to tackle the challenges in both experimental and computational settings. Results: In this review, we first briefly introduce some of the major challenges faced by NGS sequence assembly. Then, we analyze the characteristics of various sequencing platforms and their impact on assembly results. After that, we classify de novo assemblers according to their frameworks (overlap graph-based, de Bruijn graph-based and string graph-based), and introduce the characteristics of each assembly tool and the scenarios to which it is suited. Next, we introduce in detail the solutions to the main challenges of de novo assembly of next-generation sequencing data, single-cell sequencing data and single-molecule sequencing (SMS) data. Finally, we discuss the application of SMS long reads in solving problems encountered in NGS assembly. Conclusions: This review not only gives an overview of the latest methods and developments in assembly algorithms, but also provides guidelines for determining the optimal assembly algorithm for a given type of input sequencing data.

17.
Some of the most effective antibiotics (e.g. Vancomycin and Daptomycin) are cyclic peptides produced by non-ribosomal biosynthetic pathways. While hundreds of biomedically important cyclic peptides have been sequenced, the computational techniques for sequencing cyclic peptides are still in their infancy. Previous methods for sequencing peptide antibiotics and other cyclic peptides are based on Nuclear Magnetic Resonance spectroscopy and require large amounts (milligrams) of purified material that, for most compounds, are not possible to obtain. Recently, the development of MS-based methods has provided some hope for accurate sequencing of cyclic peptides using picograms of material. In this paper we develop a method for sequencing cyclic peptides by multistage MS, and show its advantages over single-stage MS. The method is tested on known and new cyclic peptides from Bacillus brevis, Dianthus superbus and Streptomyces griseus, as well as a new family of cyclic peptides produced by marine bacteria.

18.
Due to the almost identical chemical properties of C-terminal and side-chain carboxylic groups, selective C-terminal derivatization has been difficult. Although oxazolone-based C-terminal derivatization is the only selective C-terminal modification available, it has not been used widely because of its low derivatization efficiency. In this paper, an improved oxazolone chemistry for incorporating a Br signature at the C-terminus is reported. MS/MS analysis of the brominated peptides led to a series of y ions with the Br signature, facilitating de novo C-terminal sequencing.

19.
M Nemoto  Q Wang  D Li  S Pan  T Matsunaga  D Kisailus 《Proteomics》2012,12(18):2890-2894
The biomineralized radular teeth of chitons are known to consist of iron-based magnetic crystals, associated with the maximum hardness and stiffness of any biomineral. Based on our transmission electron microscopy analysis of partially mineralized teeth, we suggest that the organic matrix within the teeth controls the iron oxide nucleation. Thus, we used Nano-LC-MS to perform a proteomic analysis of the organic matrix in radular teeth of the chiton Cryptochiton stelleri in order to identify the proteins involved in the biomineralization process. Since the genome sequence of C. stelleri is not available, cross-species similarity searching and de novo peptide sequencing were used to screen the proteins. Our results indicate that several proteins were dominant in the mineralized part of the radular teeth, among which myoglobin and a highly acidic peptide were identified as possibly involved in the biomineralization process.

20.