Similar literature
 20 similar articles found (search time: 302 ms)
1.
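The emergence of next-generation sequencing (NGS) has driven genomics research and has played a particularly important role in the study of disease-associated genetic variation. Although most types of genetic variant can be detected with existing NGS analysis tools, limitations remain, for example for length variation in short tandem repeats (STRs). Many genetic diseases, especially neurological disorders such as Huntington's disease, are caused by STR length expansions. However, almost no current tool can use NGS to detect STR variants that are longer than the sequencing read length. To overcome this limitation, we developed a new method that identifies STR length variation from paired-end NGS data and estimates the length of the expansion. Applied to a clinical study of motor neuron disease based on whole-exome sequencing, it successfully identified a pathogenic STR length expansion. The method is the first to use read coverage depth features to address STR variant detection; it has broad application value in human genetic disease research and may inform the development of other NGS analysis methods.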

2.
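Because many microorganisms cannot be isolated and cultured individually, metagenomics, which studies microbial communities as a whole, is currently an important approach for revealing microbial diversity. Long-read sequencing can span repetitive sequences and complex structures and recover genomic information that short reads cannot detect. This paper focuses on two classes of long-read sequencing technology, namely single-molecule long-read sequencing based on third-generation sequencing and synthetic long-read sequencing based on linking fragments to one another, and then describes applications of long-read sequencing in metagenomics.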

3.
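This study introduces the basic bioinformatics methods and frontier technologies for detecting genomic structural variation. It explains in detail the principles and characteristics of four detection methods based on second-generation sequencing (read-pair, read-depth, split-read, and sequence assembly methods) and analyzes the characteristics and development trends of second-generation sequencing in structural variation detection. Finally, it introduces the application of new technologies such as third-generation sequencing, Linked-reads, and optical mapping to structural variation detection, and discusses the characteristics and advantages of detection methods that integrate these technologies.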

4.
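When the mitochondrial DNA (mtDNA) control region (D-loop) of the Chinese sturgeon (Acipenser sinensis) was amplified by PCR, mtDNA length variation was found both among and within individuals in a natural population. DNA sequencing showed that the length variation occurs in the mtDNA D-loop near tRNA-Pro and is produced by tandem copies of a repeat unit about 82 base pairs (bp) long. Heteroplasmic individuals, whose heteroplasmy results from within-individual mtDNA length variation, accounted for 57.4% of individuals, and non-heteroplasmic (homoplasmic) individuals for 42.6%. mtDNA size also differed among homoplasmic individuals, that is, length variation was present among them. Homoplasmic individuals carried four molecular types, formed by 2, 3, 4, or 5 tandem repeats, whose frequencies ranked from high to low as 3→2→4→5. Among heteroplasmic individuals, those combining two different molecular types were the most common (77.78%), followed by those combining three types (18.52%); those combining four types were the rarest (3.70%). No heteroplasmic individual combining five types was found. Pooled analysis of all heteroplasmic individuals showed that the proportions of the repeat types were similar to those in homoplasmic individuals, with molecular sizes (repeat copy numbers) ranked from most to least frequent as 3→2→4→5→1. Analysis of genetic diversity indices within and among 47 Chinese sturgeon individuals showed that 65.3% of the genetic variation lay among individuals within the population and 34.7% within individuals. Within-individual diversity arising from mtDNA length heteroplasmy is an additional source of genetic diversity in the Chinese sturgeon.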

5.
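Zhang Siming, Zhang Yaping. Acta Genetica Sinica (《遗传学报》), 1999, 26(5): 489-496
When the mitochondrial DNA (mtDNA) control region of the Chinese sturgeon was amplified by PCR, mtDNA length variation was found both among and within individuals in a natural population. DNA sequencing showed that the length variation occurs in the mtDNA D-loop near tRNA-Pro and is produced by tandem copies of a repeat unit about 82 base pairs (bp) long.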

6.
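Using second-generation sequencing, this study built a gene library of Gastrodia elata (天麻), screened it for microsatellite sequences, and analyzed and compared the type, abundance, length, and motif preference of the microsatellite loci. Primers were designed for 60 microsatellite sequences with high repeat numbers, and 80 samples from 4 populations were used for PCR amplification and polyacrylamide gel electrophoresis. The results showed that: (1) sequencing of the G. elata genome yielded 61 048 gene sequences, in which 12 107 microsatellite loci were detected; dinucleotide repeats were the most numerous and showed large length variation. (2) Of the 60 designed primer pairs, 20 amplified clear, polymorphic bands; the number of alleles per locus (N_a) ranged from 4 to 14 with a mean of 8.40, and the mean polymorphism information content (PIC) was 0.77. The microsatellite markers developed in this study lay a foundation for genetic studies of G. elata and for germplasm identification.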
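The abstract above reports polymorphism information content (PIC) per locus without saying how it is computed. As a minimal sketch, assuming the widely used Botstein et al. (1980) definition, PIC = 1 - Σp_i² - ΣΣ 2·p_i²·p_j² over allele frequencies p_i, the calculation for one locus could look like this. The function name `pic` and the input format (raw allele counts) are illustrative assumptions, not taken from the study.

```python
def pic(allele_counts):
    """Polymorphism information content for one locus.

    allele_counts: observed counts (or frequencies) of each allele.
    Assumes the Botstein et al. (1980) formula; hypothetical helper.
    """
    total = sum(allele_counts)
    p = [c / total for c in allele_counts]  # normalise to frequencies
    # Expected homozygosity term: sum of squared allele frequencies
    homozygosity = sum(x * x for x in p)
    # Correction for uninformative matings between identical heterozygotes
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1 - homozygosity - cross


# Two equally frequent alleles: 1 - 0.5 - 0.125
print(pic([10, 10]))
```

A locus with more alleles at even frequencies gives a higher PIC, which is consistent with the high mean value reported for loci carrying 4 to 14 alleles.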

7.
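Application of nuclear rDNA ITS sequences in phylogenetic studies of seed plants   Total citations: 18, self-citations: 0, citations by others: 18
In seed plants, nuclear rDNA consists of highly repeated tandem sequences; owing to concerted evolution, these repeat units have become homogenized or nearly homogenized in most species. The 5.8S rDNA divides the internal transcribed spacer of nuclear rDNA into two parts, ITS1 and ITS2. In angiosperms, ITS1 is 165–298 bp long and ITS2 is 177–266 bp long, whereas in gymnosperms the ITS region is longer, and its length variation is mainly due to length variation in ITS1. PCR products of these two fragments can be sequenced directly or after cloning. Because ITS sequences evolve relatively quickly and provide abundant variable and informative sites, they have become an important molecular marker for phylogenetic and taxonomic studies of angiosperms at lower taxonomic ranks, providing important systematic information on reticulate evolution in polyploid complexes and on the origin of allopolyploids; they are generally not suitable for systematic studies above the family level. In gymnosperms the ITS region is longer, the degree of homogenization among repeats varies, and sequencing is more difficult, which has limited its use in gymnosperm phylogeny and taxonomy, although there has been progress in recent years.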

8.
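Structural variants (SVs), copy number variants (CNVs), and single nucleotide polymorphisms (SNPs) are very common in the human genome and are closely related to human health and disease, so detecting these variants is of great importance for human health. Many structural variation detection algorithms have been developed for second-generation sequencing platforms; they fall into five main categories: microarray, read-pair, read-depth, split-read, and sequence assembly methods. This paper systematically describes the basic principles, strengths and weaknesses, and scope of use of these five categories, briefly introduces the classic detection algorithms of each category together with their applications and detection performance, and offers an outlook on future research on detection algorithms.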

9.
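The prevention and treatment of genetic diseases is a major issue in public health, and identifying the cause is a key step. High-throughput sequencing (also called second-generation sequencing) offers high throughput, low cost, and high accuracy; it provides direct evidence for genetic diagnosis and counseling and has become an indispensable tool for genetic testing. Third-generation sequencing has also earned a place in clinical applications thanks to its unique advantage of long reads. Second- and third-generation sequencing each have their own characteristics and complement each other, and many types of sequencing protocol are available for different clinical testing needs. This review therefore summarizes the principles and classification of second- and third-generation sequencing and their progress in genetic diagnosis, with the aim of providing ideas and guidance for selecting clinical sequencing protocols.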

10.
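Copy number variation (CNV) refers to an increase or decrease in the copy number of large DNA segments in the genome. Existing studies show that CNV contributes to many human diseases and is closely related to the mechanisms of their onset and progression. High-throughput sequencing has provided technical support for CNV detection and has become the mainstream CNV detection technology in human disease research and clinical diagnosis and treatment. Although new algorithms and software based on high-throughput sequencing continue to be developed, their accuracy is still unsatisfactory. This paper comprehensively reviews CNV detection methods based on high-throughput sequencing data, including read-depth, paired-end mapping, split-read, and de novo assembly methods, as well as methods combining these four. It discusses the principle of each class of method, representative software tools, the data each class is suited to, and their strengths and weaknesses, and looks ahead to future directions.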
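To make the read-depth idea named in the abstract above concrete, here is a minimal sketch under simplifying assumptions: per-bin read depths are already available, the sample median stands in for the normal two-copy baseline, and a bin is labelled by its log2 depth ratio. The function name `classify_bins` and both cutoffs are hypothetical choices for illustration; they are not drawn from any tool covered by the review, and real callers also correct for GC content and mappability and segment adjacent bins.

```python
import math
from statistics import median


def classify_bins(depths, gain_cutoff=0.4, loss_cutoff=-0.6):
    """Label each genomic bin as 'gain', 'loss' or 'neutral'.

    depths: read depth per fixed-width bin (baseline assumed > 0).
    Cutoffs are illustrative; log2(3/2) ~ 0.58 and log2(1/2) = -1
    are the ideal ratios for a one-copy gain or loss in a diploid.
    """
    baseline = median(depths)  # proxy for the two-copy depth
    calls = []
    for depth in depths:
        if depth == 0:
            calls.append("loss")  # no reads at all: treat as deletion
            continue
        log_ratio = math.log2(depth / baseline)
        if log_ratio >= gain_cutoff:
            calls.append("gain")
        elif log_ratio <= loss_cutoff:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls


print(classify_bins([30, 30, 30, 45, 15, 30]))
```

The median is used rather than the mean so that a few strongly amplified or deleted bins do not shift the baseline.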

11.
Insertions and deletions (indels) in human genomes are associated with a wide range of phenotypes, including various clinical disorders. High-throughput, next generation sequencing (NGS) technologies enable the detection of short genetic variants, such as single nucleotide variants (SNVs) and indels. However, the variant calling accuracy for indels remains considerably lower than for SNVs. Here we present a comparative study of the performance of variant calling tools for indel calling, evaluated with a wide repertoire of NGS datasets. While there is no single optimal tool to suit all circumstances, our results demonstrate that the choice of variant calling tool greatly impacts the precision and recall of indel calling. Furthermore, to reliably detect indels, it is essential to choose NGS technologies that offer a long read length and high coverage coupled with specific variant calling tools.

12.
Background: Stargardt disease (STGD) is the most common form of juvenile macular dystrophy associated with progressive central vision loss, and is a genetically and clinically heterogeneous disease. Molecular diagnosis is of great significance in aiding the clinical diagnosis and helping to determine the phenotypic severity and visual prognosis. In the present study, we determined the clinical and genetic features of seven childhood-onset and three adult-onset Chinese STGD families. We performed capture next-generation sequencing (NGS) of the probands and searched for potentially disease-causing genetic variants in previously identified retinal or macular dystrophy genes. Methods: In all, ten unrelated Chinese families were enrolled. Panel-based NGS was performed to identify potentially disease-causing genetic variants in previously identified retinal or macular dystrophy genes, including the five known STGD genes (ABCA4, PROM1, PRPH2, VMD2, and ELOVL4). Variant analysis, Sanger validation, and segregation tests were utilized to validate the disease-causing mutations in these families. Results: Using systematic data analysis with an established bioinformatics pipeline and segregation analysis, 17 pathogenic mutations in ABCA4 were identified in the 10 STGD families. Four of these mutations were novel: c.371delG, c.681T>G, c.5509C>T, and EX37del. Childhood-onset STGD was associated with severe visual loss and generalized retinal dysfunction, and was due to more severe variants in ABCA4 than those found in adult-onset disease. Conclusions: We expand the existing spectrum of STGD and reveal the genotype–phenotype relationships of the ABCA4 mutations in Chinese patients. Childhood-onset STGD lies at the severe end of the spectrum of ABCA4-associated retinal phenotypes.

13.
Next-generation sequencing (NGS) technologies have been widely used in life sciences. However, several kinds of sequencing artifacts, including low-quality reads and contaminating reads, were found to be quite common in raw sequencing data, and these compromise downstream analysis. Therefore, quality control (QC) is essential for raw NGS data. However, although a few NGS data quality control tools are publicly available, there are two limitations: First, their processing speed cannot cope with the rapid increase in data volume. Second, with respect to removing contaminating reads, none of them can identify contaminating sources de novo, and they rely heavily on prior information about the contaminating species, which is usually not available in advance. Here we report QC-Chain, a fast, accurate and holistic NGS data quality-control method. It combines user-friendly tools for (1) quality assessment and trimming of raw reads using Parallel-QC, a fast read processing tool; (2) identification, quantification and filtration of unknown contamination to obtain high-quality clean reads. It was optimized based on parallel computation, so its processing speed is significantly higher than that of other QC methods. Experiments on simulated and real NGS data have shown that reads with low sequencing quality could be identified and filtered. Possible contaminating sources could be identified and quantified de novo, accurately and quickly. Comparison between raw reads and processed reads also showed that the results of subsequent analyses (genome assembly, gene prediction, gene annotation, etc.) based on processed reads improved significantly in completeness and accuracy. With regard to processing speed, QC-Chain achieves a 7–8-fold speed-up based on parallel computation as compared to traditional methods.
Therefore, QC-Chain is a fast and useful quality control tool for read quality processing and de novo contamination filtration of NGS reads, which could significantly facilitate downstream analysis. QC-Chain is publicly available at: http://www.computationalbioenergy.org/qc-chain.html.

14.
Short tandem repeats (STRs) are units of 1–6 bp that repeat in a tandem fashion in DNA. Along with single nucleotide polymorphisms and large structural variations, they are among the major genomic variants underlying genetic, and likely phenotypic, divergence. STRs experience mutation rates that are orders of magnitude higher than other well-studied genotypic variants. Frequent copy number changes result in a wide range of alleles, and provide unique opportunities for modulating complex phenotypes through variation in repeat length. While classical studies have identified key roles of individual STR loci, the advent of improved sequencing technology, high-quality genome assemblies for diverse species, and bioinformatics methods for genome-wide STR analysis now enable more systematic study of STR variation across wide evolutionary ranges. In this review, we explore mutation and selection processes that affect STR copy number evolution, and how these processes give rise to varying STR patterns both within and across species. Finally, we review recent examples of functional and adaptive changes linked to STRs.
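The definition in the abstract above (units of 1–6 bp repeated in tandem) can be illustrated with a short scan over a DNA string. This is only a sketch assuming perfect, uninterrupted repeats and an exact-match regular expression; the function name `find_strs` and the `min_repeats` threshold are hypothetical, and dedicated STR tools additionally handle imperfect repeats and sequencing errors.

```python
import re


def find_strs(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    Scans unit lengths min_unit..max_unit (1-6 bp by default).
    Hypothetical helper for illustration only.
    """
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter motif,
            # e.g. "CACA" is already reported as the 2-mer "CA"
            if any(unit == unit[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return hits


print(find_strs("TTCAGCAGCAGCAGAA"))
```

Because the regular expression reports the leftmost match, the motif may be returned in a rotated phase (for example `GCA` instead of `CAG`) when the repeat is preceded by a base that matches the end of the unit.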

15.
The advent of next‐generation sequencing (NGS) technologies has transformed the way microsatellites are isolated for ecological and evolutionary investigations. Recent attempts to employ NGS for microsatellite discovery have used the 454, Illumina, and Ion Torrent platforms, but other methods including single‐molecule real‐time DNA sequencing (Pacific Biosciences or PacBio) remain viable alternatives. We outline a workflow from sequence quality control to microsatellite marker validation in three plant species using PacBio circular consensus sequencing (CCS). We then evaluate the performance of PacBio CCS in comparison with other NGS platforms for microsatellite isolation, through simulations that focus on variations in read length, read quantity and sequencing error rate. Although quality control of CCS reads reduced microsatellite yield by around 50%, hundreds of microsatellite loci that are expected to have improved conversion efficiency to functional markers were retrieved for each species. The simulations quantitatively validate the advantages of long reads and emphasize the detrimental effects of sequencing errors on NGS‐enabled microsatellite development. In view of the continuing improvement in read length on NGS platforms, sequence quality and the corresponding strategies of quality control will become the primary factors to consider for effective microsatellite isolation. Among current options, PacBio CCS may be optimal for rapid, small‐scale microsatellite development due to its flexibility in scaling sequencing effort, while platforms such as Illumina MiSeq will provide cost‐efficient solutions for multispecies microsatellite projects.

16.
The growing volume of next-generation sequencing (NGS) data presents a unique opportunity to study the combined impact of mitochondrial and nuclear-encoded genetic variation in complex disease. Mitochondrial DNA variants, and in particular heteroplasmic variants, are critical for determining human disease severity. While there are approaches for obtaining mitochondrial DNA variants from NGS data, these tools do not account for the unique characteristics of mitochondrial genetics and can be inaccurate even for homoplasmic variants. We introduce MitoScape, a novel big-data software tool for extracting mitochondrial DNA sequences from NGS. MitoScape departs from other algorithms by using machine learning to model the unique characteristics of mitochondrial genetics. We also employ a novel approach of using rho-zero (mitochondrial DNA-depleted) data to model nuclear-encoded mitochondrial sequences. We showed that MitoScape produces accurate heteroplasmy estimates using gold-standard mitochondrial DNA data. We provide a comprehensive comparison of the most common tools for obtaining mtDNA variants from NGS and showed that MitoScape had superior performance to the compared tools in every statistical category we compared, including false positives and false negatives. By applying MitoScape to common disease examples, we illustrate how MitoScape facilitates important heteroplasmy-disease association discoveries by expanding upon a reported association between hypertrophic cardiomyopathy and mitochondrial haplogroup T in men (adjusted p-value = 0.003). The improved accuracy of mitochondrial DNA variants produced by MitoScape will be instrumental in diagnosing disease in the context of personalized medicine and clinical diagnostics.

17.
Inherited deafness has been shown to have high genetic heterogeneity. For many decades, linkage analysis and candidate gene approaches have been the main tools to elucidate the genetics of hearing loss. However, this study design is costly, time-consuming, and unsuitable for small families. This is mainly due to the inadequate numbers of available affected individuals, locus heterogeneity, and assortative mating. Exome sequencing has now become a technically feasible and cost-effective method for detecting disease variants underlying Mendelian disorders, owing to recent advances in next-generation sequencing (NGS) technologies. In the present study, we combined the Deafness Gene Mutation Detection Array and exome sequencing to identify deafness-causative variants in a large Chinese composite family with deaf-by-deaf mating. The simultaneous screening of the 9 common deafness mutations using the allele-specific PCR-based universal array resulted in the identification of the 1555A>G mutation in the mitochondrial DNA (mtDNA) 12S rRNA in affected individuals in one branch of the family. We then subjected the mutation-negative cases to exome sequencing and identified novel causative variants in the MYH14 and WFS1 genes. This report confirms the effective use of an NGS technique to detect pathogenic mutations in affected individuals who were not candidates for classical genetic studies.

18.
The development and screening of microsatellite markers have been accelerated by next‐generation sequencing (NGS) technology and in particular GS‐FLX pyro‐sequencing (454). More recent platforms such as the PGM semiconductor sequencer (Ion Torrent) offer potential benefits such as dramatic reductions in cost, but to date have not been well utilized. Here, we critically compare the advantages and disadvantages of microsatellite development using PGM semiconductor sequencing and GS‐FLX pyro‐sequencing for two gymnosperm (a conifer and a cycad) and one angiosperm species. We show that these NGS platforms differ in the quantity of returned sequence data, unique microsatellite data and primer design opportunities, mostly consistent with the differences in read length. The strength of the PGM lies in the large amount of data generated at a comparatively lower cost and in less time. The strength of GS‐FLX lies in the return of longer average length sequences and therefore greater flexibility in producing markers with variable product length, due to longer flanking regions, which is ideal for capillary multiplexing. These differences need to be considered when choosing an NGS method for microsatellite discovery. However, the ongoing improvement in read lengths of the NGS platforms will reduce the disadvantage of the current short read lengths, particularly for the PGM platform, allowing greater flexibility in primer design coupled with the power of a larger number of sequences.

19.
Next generation sequencing (NGS) is perhaps one of the most exciting advances in the field of life sciences and biomedical research in the last decade. With the availability of massively parallel sequencing, the human DNA blueprint can be decoded to explore hidden information with reduced time and cost. This technology has been used to understand the genetic aspects of various diseases including cardiomyopathies. Mutations for different cardiomyopathies have been identified, and the cataloging of mutations on a phenotypic basis is underway and is expected to lead to new discoveries that may translate to novel diagnostic, prognostic and therapeutic targets. With its ease of handling, cost effectiveness and fast data output, NGS is now considered a diagnostic tool for cardiomyopathy, providing targeted gene sequencing. Given the number of genetic variants identified in cardiomyopathies, there is a need for a quicker and easier way to screen multiple genes associated with the disease. In this review, an attempt has been made to explain the NGS technology, its methods and applications in cardiomyopathies, its perspective in clinical practice, and the challenges which are to be addressed.

20.
Traditional Sanger sequencing as well as Next-Generation Sequencing have been used for the identification of disease causing mutations in human molecular research. The majority of currently available tools are developed for research and explorative purposes and often do not provide a complete, efficient, one-stop solution. As the focus of currently developed tools is mainly on NGS data analysis, no integrative solution for the analysis of Sanger data is provided and consequently a one-stop solution to analyze reads from both sequencing platforms is not available. We have therefore developed a new pipeline called MutAid to analyze and interpret raw sequencing data produced by Sanger or several NGS sequencing platforms. It performs format conversion, base calling, quality trimming, filtering, read mapping, variant calling, variant annotation and analysis of Sanger and NGS data under a single platform. It is capable of analyzing reads from multiple patients in a single run to create a list of potential disease causing base substitutions as well as insertions and deletions. MutAid has been developed for expert and non-expert users and supports four sequencing platforms including Sanger, Illumina, 454 and Ion Torrent. Furthermore, for NGS data analysis, five read mappers including BWA, TMAP, Bowtie, Bowtie2 and GSNAP and four variant callers including GATK-HaplotypeCaller, SAMTOOLS, Freebayes and VarScan2 pipelines are supported. MutAid is freely available at https://sourceforge.net/projects/mutaid.
