Similar literature
1.
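Wang Hao  Chen Ting 《生物信息学》 (Chinese Journal of Bioinformatics) 2021,19(1):26-34
DNA sequencing is one of the major topics in bioinformatics research, and de novo assembly of sequencing reads is a fundamental and important step in it. As sequencing technology continues to advance, third-generation sequencing data offer longer reads but also higher error rates. Given these properties, hybrid assembly that uses second- and third-generation sequencing data together is an important way to obtain better assemblies. This paper introduces the basic principles of existing hybrid assembly software and compares the assembly results of different tools.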

2.
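High-throughput sequencing (next-generation sequencing, NGS), which has developed rapidly in recent years, has demonstrated its advantages of low cost, high throughput, and broad applicability across all areas of life science research. In modern agricultural biotechnology, high-throughput sequencing allows scientists not only to carry out in-depth whole-genome sequencing and resequencing of crops, model plants, or different cultivars more economically and efficiently, but also to perform efficient and accurate analyses of genetic variation, molecular markers, linkage maps, epigenetics, and transcriptomes across hundreds or thousands of cultivars, thereby improving crop breeding techniques and accelerating the breeding of new varieties. Obtaining the whole-genome sequence of a crop is the foundation for these other studies and analyses. By reviewing recently published work on crop whole-genome sequencing and assembly using high-throughput sequencing, this paper illustrates the broad prospects of the technology in modern agricultural biotechnology and the research foundation it has established.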

3.
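The aim was to demonstrate that the sequences occurring at high frequency in high-throughput sequencing of phages are the terminal sequences of the phage genome. Specific adaptor sequences were ligated to the termini of the T3 phage genome, which was then subjected to high-throughput sequencing; the T3 genome without adaptors was sequenced in parallel, and the results were compared by bioinformatic analysis. A similar high-throughput sequencing approach was used to analyze the whole-genome sequence of an N4-like phage. The high-frequency sequences in the adaptor-ligated data were identical to those in the adaptor-free data, demonstrating that the high-frequency sequences obtained during high-throughput sequencing are the adaptor-ligated genome termini. The results also showed that the termini of T4-like phages are sequence-specific rather than completely random. In addition, the left terminus of the N4-like phage genome was found to have a unique sequence, whereas the right terminus was heterogeneous. High-throughput sequencing is convenient and fast, and can be used to determine phage genome termini and whole-genome sequences simultaneously.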

4.
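Application of forward-reverse sequencing information in whole-genome sequence assembly and analysis   Total citations: 1 (self-citations: 0, citations by others: 1)
Forward-reverse sequencing information from both ends of inserts (double-barreled data, DB information) has been widely used in large genome sequencing and assembly projects. Drawing on experience from producing the working draft of the indica rice genome, this paper summarizes how DB information is used in the sequence assembly pipeline and proposes improved ways of using it, covering genome sequence assembly, quality checking, and contig linking. It further proposes new applications of DB information in downstream data analysis, including using DB information to obtain precise information on the genomic fragment contained in each clone of a genomic library and, building on this, a new low-cost method for designing whole-genome gene chips. As the number of species to be sequenced grows each year, forward-reverse sequencing information is expected to play an increasingly important role in genome sequencing and assembly and in subsequent genome research.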

5.
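Research progress in sequencing technologies for complex genomes   Total citations: 1 (self-citations: 0, citations by others: 1)
Complex genomes are genomes that cannot be resolved directly with conventional sequencing and assembly approaches; the term usually refers to genomes with a high proportion of repetitive sequences, high heterozygosity, extreme GC content, or hard-to-remove contamination from foreign DNA. Solving the sequencing and assembly problems of complex genomes requires in-depth work in three areas: experimental methods for genome sequencing, sequencing technology platforms, and assembly algorithms and strategies. This paper describes in detail the existing techniques and methods for sequencing and assembling complex genomes and, using classic examples, presents the technical solutions and the history of complex genome sequencing, providing a reference for devising suitable sequencing strategies.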

6.
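Sequence and structure of the complete mitochondrial genome of the red muntjac (赤麂)   Total citations: 4 (self-citations: 0, citations by others: 4)
Total DNA was extracted from a red muntjac cell line. Primers were designed from the complete mitochondrial genome sequence of Reeves's muntjac (小麂), a species of the same genus whose sequence was previously determined in our laboratory. After PCR amplification, sequencing, and assembly, the complete mitochondrial genome sequence of the red muntjac was obtained and analyzed bioinformatically. The genome is 16,354 bp long. Twenty-two tRNA genes, 2 rRNA genes, 13 protein-coding genes, and 1 D-loop region were located. The mitochondrial genome structure of the red muntjac is the same as that of Reeves's muntjac and other mammals, and their sequences share high homology.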

7.
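In the April 17 issue of Nature, scientists in the United States published the first whole genome of James Watson, the "father of DNA", obtained with a new generation of high-speed sequencing technology. The result marks another milestone in human genome sequencing and brings the new technology a step closer to the grand goal of personalized genomes.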

8.
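Sequencing technology is essential for determining the base sequence of a genome. Today, a new generation of sequencers based on entirely new methods is becoming widespread.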

9.
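With modern science and technology advancing rapidly, this is undoubtedly the era of molecular biology. Genome sequencing is a technique for analyzing the genetic makeup of organisms. As a particularly important biotechnology, it has developed and been applied rapidly in recent years, making great strides and achieving revolutionary results in many fields. It plays an important role both in the prevention and treatment of human disease and in livestock genetics and breeding. This review describes the principles of first-, second-, and third-generation sequencing technologies and compares their advantages and disadvantages. It also reviews research progress in applying whole-genome high-throughput technologies to the origin, genetic breeding, and selection of desirable traits in beef cattle, and to disease control and improved production performance in dairy cattle. Finally, it discusses the problems of current high-throughput sequencing technologies and offers an outlook on their future.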

10.
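Biological sequence assembly and its algorithms   Total citations: 1 (self-citations: 0, citations by others: 1)
Biological sequence assembly is an important step in shotgun sequencing. This paper introduces biological sequence assembly and some of the basic problems involved in its study, outlines the two main classes of sequence assembly algorithms, analyzes their respective characteristics, and compares them.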

11.
T Druet  I M Macleod  B J Hayes 《Heredity》2014,112(1):39-47
Genomic prediction from whole-genome sequence data is attractive, as the accuracy of genomic prediction is no longer bounded by the extent of linkage disequilibrium between DNA markers and causal mutations affecting the trait, given that the causal mutations are in the data set. A cost-effective strategy could be to sequence a small proportion of the population, and impute sequence data to the rest of the reference population. Here, we describe strategies for selecting individuals for sequencing, based on either pedigree relationships or haplotype diversity. The performance of these strategies (number of variants detected and accuracy of imputation) was evaluated in sequence data simulated through a real Belgian Blue cattle pedigree. A strategy (AHAP), which selected a subset of individuals for sequencing that maximized the number of unique haplotypes (from single-nucleotide polymorphism panel data) sequenced, gave good performance across a range of variant minor allele frequencies. We then investigated the optimum number of individuals to sequence by fold coverage given a maximum total sequencing effort. At 600 total fold coverage (600×), the optimum strategy was to sequence 75 individuals at eightfold coverage. Finally, we investigated the accuracy of genomic predictions that could be achieved. The advantage of using imputed sequence data compared with dense SNP array genotypes was highly dependent on the allele frequency spectrum of the causative mutations affecting the trait. When this followed a neutral distribution, the advantage of the imputed sequence data was small; however, when the causal mutations all had low minor allele frequencies, using the sequence data improved the accuracy of genomic prediction by up to 30%.
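The trade-off described in this abstract (a fixed total sequencing effort split between number of individuals and per-individual coverage) can be illustrated with a small sketch. The function name and the candidate coverage values other than eightfold are hypothetical; only the 600-fold total and the 75 × 8 design come from the abstract, and the sketch does not reproduce the study's evaluation of imputation accuracy.

```python
def sequencing_designs(total_fold_coverage, coverages):
    """For each candidate per-individual coverage, return how many
    individuals can be sequenced within the total fold-coverage budget."""
    return {c: total_fold_coverage // c for c in coverages}


# Candidate coverages are illustrative assumptions, not values from the study.
designs = sequencing_designs(600, [2, 4, 8, 12, 20])
for coverage, individuals in designs.items():
    print(f"{coverage}x coverage -> {individuals} individuals")
```

With a 600-fold budget, eightfold coverage corresponds to 75 individuals, matching the design the abstract reports as optimal; which design is actually best depends on the variant detection and imputation results, which this sketch does not model.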

12.
Hai Peng  Jing Zhang 《Biologia》2009,64(1):20-26
DNA sequences can be used for the analysis of genetic variation and gene function. The high-throughput sequencing techniques that have been developed over the past three years can read as many as one billion bases per run, and are far less expensive than the traditional Sanger sequencing method. Therefore, high-throughput sequencing has been applied extensively to genomic analyses, such as screening for mutations, construction of genomic methylation maps, and the study of DNA-protein interactions. Although they have only been available for a short period, high-throughput sequencing techniques are profoundly affecting many of the life sciences, and are opening up new avenues of research. With highly developed commercial high-throughput sequencing platforms, each laboratory has the opportunity to explore this research field. Therefore, in this paper, we have focused on commercially popular high-throughput sequencing techniques and the ways in which they have been applied over the past three years.

13.
Current challenges in de novo plant genome sequencing and assembly   Total citations: 1 (self-citations: 0, citations by others: 1)
Genome sequencing is now affordable, but assembling plant genomes de novo remains challenging. We assess the state of the art of assembly and review the best practices for the community.

14.
To isolate novel genes related to human hepatocellular carcinoma (HCC), we sequenced the P1-derived artificial chromosome PAC579 (D17S926 locus), which maps to the minimum LOH (loss of heterozygosity) deletion region of chromosome 17p13.3 in HCC. Four novel genes mapped in this genomic sequence area were isolated and cloned by wet-lab experiments, and the exons of these genes were located. The 0–60 kb region of this genomic sequence, which includes the genes of interest, was scanned with five different computational exon prediction programs as well as four splice site recognition programs. After analyzing and comparing the computationally predicted results with the wet-lab experiment results, some potential exons were predicted in the genomic sequence by using these programs.

15.
16.
17.
18.

Background

Assembling genes from next-generation sequencing data is not only time consuming but computationally difficult, particularly for taxa without a closely related reference genome. Assembling even a draft genome using de novo approaches can take days, even on a powerful computer, and these assemblies typically require data from a variety of genomic libraries. Here we describe software that will alleviate these issues by rapidly assembling genes from distantly related taxa using a single library of paired-end reads: aTRAM, automated Target Restricted Assembly Method. The aTRAM pipeline uses a reference sequence, BLAST, and an iterative approach to target and locally assemble the genes of interest.

Results

Our results demonstrate that aTRAM rapidly assembles genes across distantly related taxa. In comparative tests with a closely related taxon, aTRAM assembled the same sequence as reference-based and de novo approaches taking on average < 1 min per gene. As a test case with divergent sequences, we assembled >1,000 genes from six taxa ranging from 25 – 110 million years divergent from the reference taxon. The gene recovery was between 97 – 99% from each taxon.

Conclusions

aTRAM can quickly assemble genes across distantly related taxa, obviating the need for draft genome assembly of all taxa of interest. Because aTRAM uses a targeted approach, loci can be assembled in minutes depending on the size of the target. Our results suggest that this software will be useful in rapidly assembling genes for phylogenomic projects covering a wide taxonomic range, as well as other applications. The software is freely available at http://www.github.com/juliema/aTRAM.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0515-2) contains supplementary material, which is available to authorized users.

19.
Illumina's Genome Analyzer generates ultra-short sequence reads, typically 36 nucleotides in length, and is primarily intended for resequencing. We tested the potential of this technology for de novo sequence assembly on the 6 Mbp genome of Pseudomonas syringae pv. syringae B728a with several freely available assembly software packages. Using an unpaired data set, velvet assembled >96% of the genome into contigs with an N50 length of 8289 nucleotides and an error rate of 0.33%. edena generated smaller contigs (N50 was 4192 nucleotides) and comparable error rates. ssake and vcake yielded shorter contigs with very high error rates. Assembly of paired-end sequence data carrying 400 bp inserts produced longer contigs (N50 up to 15 628 nucleotides), but with increased error rates (0.5%). Contig length and error rate were very sensitive to the choice of parameter values. Noncoding RNA genes were poorly resolved in de novo assemblies, while >90% of the protein-coding genes were assembled with 100% accuracy over their full length. This study demonstrates that, in practice, de novo assembly of 36-nucleotide reads can generate reasonably accurate assemblies from about 40× deep sequence data sets. These draft assemblies are useful for exploring an organism's proteomic potential at very low cost.
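The N50 values quoted in this abstract follow the usual definition of the statistic: the contig length at which contigs of that length or longer account for at least half of the total assembled bases. A minimal sketch of that calculation follows; the function name and the example contig lengths are illustrative assumptions and are not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a set of contigs: the length L such that contigs
    of length >= L together contain at least half of all assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Illustrative contig lengths (not from the study): total 300, half 150.
example = [100, 80, 60, 40, 20]
print(n50(example))
```

Because N50 ignores correctness, it is best read alongside an error rate, as the abstract does when comparing assemblers.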

20.
During the last three decades, both genome mapping and sequencing methods have advanced significantly to provide a foundation for scientists to understand genome structures and functions in many species. Generally speaking, genome mapping relies on genome sequencing to provide basic materials, such as DNA probes and markers for their localization, and thus to construct the maps. On the other hand, genome sequencing often requires a high-resolution map as a skeleton for whole genome assembly. However, genome mapping and sequencing have never come together in one pipeline. After reviewing mapping and next-generation sequencing methods, we would like to share our thoughts with the genome community on how to combine the HAPPY mapping technique with new-generation sequencing, thus integrating the two systems into one pipeline, called the HAPPY pipeline. The pipeline starts with preparation of a HAPPY panel, followed by multiple displacement amplification to produce a relatively large quantity of DNA. Instead of conventional marker genotyping, the amplified panel DNA samples are subjected to new-generation sequencing with a barcode method, which allows us to determine the presence or absence of a sequence contig, treated as a traditional marker, in the HAPPY panel. Statistical analysis will then be performed to infer how close to or how far from each other these contigs are within a genome, and to order the whole genome sequence assembly as well. We believe that such a universal approach will play an important role in genome sequencing, mapping, and assembly of many species, thus advancing genome science and its applications in biomedicine and agriculture.
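The abstract does not specify the statistical analysis used to infer contig proximity from presence/absence data. As a rough illustration only, the sketch below scores each pair of contigs by how often they are detected in the same panel samples, using a Jaccard index as a stand-in statistic; the function name, the choice of Jaccard index, and the toy panel are all assumptions and are not taken from the HAPPY pipeline itself.

```python
from itertools import combinations


def cosegregation(presence):
    """presence maps contig name -> set of panel sample indices in which the
    contig was detected. Returns a Jaccard co-occurrence score per contig
    pair; higher scores suggest the contigs tend to be retained together."""
    scores = {}
    for a, b in combinations(sorted(presence), 2):
        union = presence[a] | presence[b]
        shared = presence[a] & presence[b]
        scores[(a, b)] = len(shared) / len(union) if union else 0.0
    return scores


# Toy panel (hypothetical data): contig_A and contig_B share three samples.
panel = {
    "contig_A": {0, 1, 2, 5},
    "contig_B": {0, 1, 2, 6},
    "contig_C": {3, 4, 7},
}
scores = cosegregation(panel)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

In this toy panel the highest-scoring pair is contig_A and contig_B, so an ordering step would place them near each other; a real analysis would need a proper linkage model and a panel large enough to support it.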
