Similar literature
 19 similar articles found; search took 78 ms
1.
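By analyzing next-generation high-throughput sequencing data from COVID-19 patients in the outbreak in Washington State, USA, from March to the end of April 2020, this study identified all mutation types present in the SARS-CoV-2 spike glycoprotein (S protein), providing baseline data for studying the mutation patterns of viral replication in vivo and for vaccine research. Next-generation sequencing data from 130 COVID-19 patients reported in the Washington area and published in NCBI were assembled, and the S protein-coding gene was subjected to deep mutation analysis to identify potential suspected antigen variation sites (hereafter, mutation sites). After excluding 30 datasets that did not yield full-length sequences, complete SARS-CoV-2 sequences were obtained from 100 patients. In the S protein-coding gene, the main mutation sites were concentrated in the SP region of S1 and the spacer region preceding the receptor-binding domain (RBD) (the mutated regions were essentially contiguous: 126aa~153aa, 194aa~204aa), and in the 1250aa~1270aa region of S2. The results from the 100 samples show that the S gene is relatively stable during replication in patients; amino acid-changing mutation sites (mutation frequency 15%) were scattered across single samples, and S2 was more stable than S1. No site with a mutation rate of 20% was found in the RBD, and the sporadic mutations in the S protein were concentrated in the S1 spacer region (126aa~153aa and 194aa~204aa, essentially contiguous) and in roughly the last 20 aa of S2.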

2.
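Using next-generation sequencing data to discover genomic mutations in cancer cells has long been an important applied scientific problem. This study used a large dataset from a single cancer patient to evaluate several existing tools for identifying genomic mutations. Comparing the methods and accuracy of the tools showed that each has its own strengths and weaknesses. Based on these, the paper offers recommendations to help users choose a suitable tool.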

3.
宋琳琳  顾朝辉  韦朝春  陈赛娟 《生物磁学》2009,(15):2899-2902,2912
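Objective: To study data analysis and quality assessment methods suited to the large data volume and short read length of next-generation sequencing data. Methods: Published sequencing data from the Illumina-Solexa platform were aligned to the human whole-genome sequence with the MAQ software, and, taking exonic regions as an example, data quality was assessed at the site level. Results: Combining existing software with a linear algorithm developed in this work, a sequencing data quality assessment system covering alignment and assembly was established. After alignment, the raw reads covered 127,113,378 sites, involving 64,868 exons on 24 chromosomes. Of these, 0.50% of exons had every site sequenced, and 3.98% of exons had a mean per-site sequencing depth of at least 1. Conclusion: A data analysis and quality assessment method based on the Illumina-Solexa sequencing platform was successfully constructed and is applicable to other second-generation sequencing platforms. Researchers can refine sequencing experimental design on the basis of quality assessment and proceed to SNP and mutation screening and subsequent functional studies.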
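The two exon-level figures reported in the abstract above (exons with every site sequenced, and exons with mean per-site depth of at least 1) can be illustrated with a minimal Python sketch. This is not the authors' linear algorithm, which the abstract does not describe; the function names and the toy depth values are hypothetical, and the input is assumed to be a mapping from exon ID to a list of per-site read depths.

```python
def exon_coverage_stats(depths):
    """Return the covered fraction and mean depth for one exon.

    depths: per-site read depths for the exon (hypothetical input format).
    """
    n_sites = len(depths)
    covered = sum(1 for d in depths if d > 0)
    return {
        "fraction_covered": covered / n_sites,
        "mean_depth": sum(depths) / n_sites,
    }


def summarize_exons(exon_depths):
    """Return (share of exons fully covered, share with mean depth >= 1)."""
    stats = [exon_coverage_stats(d) for d in exon_depths.values()]
    n_exons = len(stats)
    fully_covered = sum(1 for s in stats if s["fraction_covered"] == 1.0)
    mean_depth_ge_1 = sum(1 for s in stats if s["mean_depth"] >= 1.0)
    return fully_covered / n_exons, mean_depth_ge_1 / n_exons


# Toy data, invented for illustration only.
toy_exons = {
    "exon1": [1, 2, 3],   # every site covered, mean depth 2
    "exon2": [0, 0, 5],   # partly covered, mean depth above 1
    "exon3": [0, 0, 0],   # not covered
    "exon4": [0, 1, 0],   # partly covered, mean depth below 1
}
full_share, deep_share = summarize_exons(toy_exons)
print(f"fully covered: {full_share:.2%}, mean depth >= 1: {deep_share:.2%}")
```

Keeping the two criteria separate matters because an exon can reach a mean depth of 1 from a few deeply covered sites while most of its sites remain unsequenced, as `exon2` shows.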

4.
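Tuberculosis is a serious public health problem, and the rise of drug-resistant tuberculosis is one of the main obstacles to controlling the epidemic. Rapid and accurate diagnosis is key to improving cure rates and reducing mortality among tuberculosis patients. This study established a next-generation-sequencing-based amplicon sequencing method to test 17 drug-resistance genes for 5 first-line anti-tuberculosis drugs. In 26 clinical drug-resistant tuberculosis strains, 65 mutations were identified, including 33 hotspot mutations, 9 rare mutations and 23 novel mutations. Protein sequence conservation and local protein structure were analyzed for 18 newly discovered missense mutations. The results showed that 14 novel missense mutations were highly conserved across 9 mycobacterial species and altered the local structure of the protein. Based on these results, the newly discovered mutations are inferred to be potential drug-resistance mutations. The amplicon sequencing assay constructed here can test 17 drug-resistance genes in 10 clinical tuberculosis strains simultaneously, and is a rapid, accurate and comprehensive method for detecting resistance mutations to first-line drugs in drug-resistant Mycobacterium tuberculosis. It detects not only hotspot and rare mutations but also previously unreported novel mutations. The assay may be useful for clinical diagnosis and basic research.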

5.
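The transcriptome of germinated buckwheat was sequenced with the new-generation high-throughput platform Illumina Solexa HiSeq 2500, and bioinformatic methods were used to study gene expression profiles and predict functional genes. Sequencing yielded 42,953,962 reads containing 5.37 Gb of sequence. Assembly of the reads produced 45,278 unigenes with a mean length of 862 bp, totaling 39 Mb of sequence. The unigenes were further evaluated in terms of length distribution, GC content and expression level; the data indicate good sequencing quality and high reliability. Homology comparison against database sequences showed that 2,127 unigenes shared varying degrees of homology with known genes of other organisms. Unigenes in the germinated tartary buckwheat transcriptome were associated with cellular process, cell, and protein binding. Alignment against the KOG database grouped the unigenes into roughly 24 functional categories. Using the KEGG database as reference, the unigenes mapped to 328 metabolic pathway branches, including the ribosome pathway and carbohydrate metabolism, and 38 unigenes involved in the oxidative phosphorylation metabolism of GABA synthesis were screened out. An SSR search found 7,141 SSR loci among 71,366 unigenes. Among SSR repeat motif types, A/T was the most frequent, followed by AAG/CTT and AT/AT.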

6.
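Objective: To observe the diversity and composition of the microbiota at different intestinal sites in the tree shrew. Methods: Contents of the ileum, cecum and colon were collected from 3 male tree shrews, DNA was extracted, the V4 region of the bacterial 16S rDNA was amplified and sequenced on the Illumina PE250 high-throughput platform, and community structure and richness were analyzed. Results: The numbers of optimized sequences did not differ significantly among ileum, cecum and colon. In α-diversity analysis, the Chao1, PD, Simpson and Shannon-Wiener indices did not differ significantly among the 3 sites; relative to ileum, the diversity of the cecal and colonic microbiota was more similar. Rank-Abundance curves showed that the ileal microbiota had higher richness and a more even distribution. In β-diversity analysis, ileal community structure varied little, whereas cecal and colonic community structure varied more. A total of 26 phyla were detected, 17 of them shared by all 3 groups. Synergistetes, Rokubacteria, Thaumarchaeota and TA06 were found only in ileum; Chlamydiae was unique to cecum; Elusimicrobia was found only in colon. In total 414 genera were obtained, with 15, 7 and 3 genera unique to colon, cecum and ileum, respectively. In total 530 species were found, of which Lactobacillus salivarius was the most abundant. Random Forest analysis identified 7 biomarkers in ileum, cecum and colon. Conclusion: Microbial diversity does not differ significantly among ileum, cecum and colon, but relative to colon and cecum, the ileal microbiota is richer and more evenly distributed. Each of the 3 intestinal sites has its own distinctive microbiota.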
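Three of the α-diversity indices named in the abstract above (Chao1, Simpson, Shannon-Wiener) can be computed from a vector of per-taxon read counts. The sketch below is only an illustration under stated assumptions: it uses the bias-corrected Chao1 estimator, the Gini-Simpson form (1 minus the sum of squared proportions), and the natural logarithm for Shannon-Wiener. The abstract does not say which conventions or software the authors used, and the count vectors are invented.

```python
import math


def shannon_index(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln p_i), natural log assumed."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def simpson_index(counts):
    """Gini-Simpson index 1 - sum(p_i^2); other conventions exist."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1*(F1-1) / (2*(F2+1))."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical count vectors for two samples.
even_sample = [10, 10, 10, 10]
skewed_sample = [37, 1, 1, 2]
for name, counts in (("even", even_sample), ("skewed", skewed_sample)):
    print(name, round(shannon_index(counts), 3),
          round(simpson_index(counts), 3), chao1(counts))
```

The two toy samples have the same observed richness (four taxa) but different evenness, which is the distinction the Rank-Abundance curves in the abstract are used to show.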

7.
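DNA methylation, an epigenetic modification, plays important roles in regulating gene expression, X-chromosome inactivation, imprinted genes and more. Different DNA methylation pretreatment methods combined with next-generation sequencing have generated large volumes of high-throughput methylation data, whose storage, processing and analysis are pressing problems. This paper summarizes the three existing high-throughput DNA methylation detection technologies (restriction enzyme digestion, affinity purification, and bisulfite conversion) and the storage, processing and analysis tools developed for the data they produce. It also gives particular attention to the single-base-resolution method BS-Seq: its sequencing principle, data processing workflow, and downstream analysis tools.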

8.
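Objective: Numerous studies have confirmed that mitochondrial DNA (mtDNA) mutations are closely associated with tumorigenesis and tumor progression, but traditional sequencing methods struggle to detect mtDNA mutations with high throughput and high accuracy. This study therefore established an mtDNA mutation detection method based on next-generation sequencing. Methods: Total DNA was extracted from tumor tissue, adjacent non-tumor tissue and peripheral blood cells of liver cancer patients. The mitochondrial genome was enriched by PCR, and mtDNA sequencing libraries were constructed by blunt-end or sticky-end ligation of the PCR products or by amino modification of the PCR primers. After sequencing on the Illumina HiSeq 2000 platform, reads were aligned to the human mtDNA reference sequence by bioinformatic methods and the sequencing data were analyzed. Results: After evaluation on genomic DNA of differing quality, the three-primer-pair method was found suitable for mtDNA enrichment in most DNA samples. Amino modification of the PCR primers further markedly improved the uniformity of sequencing coverage and reduced sequencing cost. Conclusion: By optimizing the mtDNA enrichment method and the uniformity of sequencing coverage, this study established a sensitive, specific and high-throughput mtDNA mutation detection strategy based on next-generation sequencing, providing a new method for research on mtDNA mutations and disease.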

9.
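Strategies for genome-wide association studies based on high-throughput sequencing   Total citations: 1, self-citations: 0, citations by others: 1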
周家蓬  裴智勇  陈禹保  陈润生 《遗传》2014,36(11):1099-1111
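Genome-wide association study (GWAS) is an important component of research on human complex diseases; it detects, at the population level, genetic associations between genome-wide genetic variants and observable traits. Traditional GWAS obtains high-density genetic variants with array technology; although fruitful, it has a number of problems. These include the so-called "missing heritability", in which variants reaching genome-wide significance in association analysis explain only a small fraction of heritability; weak consistency among studies for some traits; and the difficulty of interpreting the function of significantly associated variants. High-throughput sequencing, also called next-generation sequencing (NGS), can rapidly and accurately produce high-throughput variant data, offering a feasible solution to these problems. GWAS based on NGS (NGS-GWAS) can to some extent compensate for the shortcomings of traditional GWAS. This article systematically surveys NGS-GWAS strategies and methods, proposes currently feasible implementation strategies and methods, and discusses prospects for applying NGS-GWAS to personalized medicine (PM).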

10.
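Genetic parameters of major histocompatibility complex (MHC) genes are an important reference for the captive genetic management of endangered animals. By establishing a direct sequencing method for MHC genotyping, this study examined 91 giant pandas (Ailuropoda melanoleuca) held in captivity at the China Conservation and Research Center for the Giant Panda for 3 class I MHC genes (Aime-C, Aime-I, Aime-L) and 4 class II MHC genes (Aime-DQA1...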

11.

Background

High-throughput DNA sequencing technologies are generating vast amounts of data. Fast, flexible and memory efficient implementations are needed in order to facilitate analyses of thousands of samples simultaneously.

Results

We present a multithreaded program suite called ANGSD. This program can calculate various summary statistics, and perform association mapping and population genetic analyses utilizing the full information in next generation sequencing data by working directly on the raw sequencing data or by using genotype likelihoods.

Conclusions

The open source C/C++ program ANGSD is available at http://www.popgen.dk/angsd. The program is tested and validated on GNU/Linux systems. The program supports multiple input formats including BAM and imputed beagle genotype probability files. The program allows the user to choose between combinations of existing methods and can perform analyses that are not implemented elsewhere.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0356-4) contains supplementary material, which is available to authorized users.

12.
Next-generation sequencing (NGS) is becoming routinely used in the diagnosis of hereditary diseases, such as human cardiomyopathies. Hence, it is of utmost importance to secure high quality sequencing data, enabling the identification of disease-relevant mutations or the conclusion of negative test results. During the process of sample preparation, each protocol for target enrichment library preparation has its own requirements for quality control (QC); however, there is little evidence on the actual impact of these guidelines on resulting data quality. In this study, we analyzed the impact of QC during the diverse library preparation steps of Agilent SureSelect XT target enrichment and Illumina sequencing. We quantified the parameters for a cohort of around 600 samples, which include starting amount of DNA, amount of sheared DNA, smallest and largest fragment size of the starting DNA; amount of DNA after the pre-PCR, and smallest and largest fragment size of the resulting DNA; as well as the amount of the final library, the corresponding smallest and largest fragment size, and the number of detected variants. Intriguingly, there is a high tolerance for variations in all QC steps, meaning that within the boundaries proposed in the current study, a considerable variance at each step of QC can be well tolerated without compromising NGS quality.

13.
As next-generation sequencing (NGS) technology has become widely used to identify genetic causal variants for various diseases and traits, a number of packages for checking NGS data quality have sprung up in public domains. In addition to the quality of sequencing data, sample quality issues, such as gender mismatch, abnormal inbreeding coefficient, cryptic relatedness, and population outliers, can also have fundamental impact on downstream analysis. However, there is a lack of tools specialized in identifying problematic samples from NGS data, often due to the limitation of sample size and variant counts. We developed SeqSQC, a Bioconductor package, to automate and accelerate sample cleaning in NGS data of any scale. SeqSQC is designed for efficient data storage and access, and equipped with interactive plots for intuitive data visualization to expedite the identification of problematic samples. SeqSQC is available at http://bioconductor.org/packages/SeqSQC.

14.
Next-generation sequencing (NGS) technology, with its high-throughput capacity and low cost, has developed rapidly in recent years and become an important analytical tool for many genomics researchers. New opportunities in the research domain of forensic studies emerge by harnessing the power of NGS technology, which can be applied to simultaneously analyzing multiple loci of forensic interest in different genetic contexts, such as autosomes, mitochondrial and sex chromosomes. Furthermore, NGS technology can also have potential applications in many other aspects of research. These include DNA database construction, ancestry and phenotypic inference, monozygotic twin studies, body fluid and species identification, and forensic animal, plant and microbiological analyses. Here we review the application of NGS technology in the field of forensic science with the aim of providing a reference for future forensics studies and practice.

15.

Background

DNA-based methods like PCR efficiently identify and quantify the taxon composition of complex biological materials, but are limited to detecting species targeted by the choice of the primer assay. We show here how untargeted deep sequencing of foodstuff total genomic DNA, followed by bioinformatic analysis of sequence reads, facilitates highly accurate identification of species from all kingdoms of life, at the same time enabling quantitative measurement of the main ingredients and detection of unanticipated food components.

Results

Sequence data simulation and real-case Illumina sequencing of DNA from reference sausages composed of mammalian (pig, cow, horse, sheep) and avian (chicken, turkey) species are able to quantify material correctly at the 1% discrimination level via a read counting approach. An additional metagenomic step facilitates identification of traces from animal, plant and microbial DNA including unexpected species, which is prospectively important for the detection of allergens and pathogens.

Conclusions

Our data suggest that deep sequencing of total genomic DNA from samples of heterogeneous taxon composition promises to be a valuable screening tool for reference species identification and quantification in biosurveillance applications like food testing, potentially alleviating some of the problems in taxon representation and quantification associated with targeted PCR-based approaches.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-639) contains supplementary material, which is available to authorized users.

16.
Little is known about the inheritance of very low heteroplasmy mitochondrial DNA (mtDNA) variations. Even with the development of new next-generation sequencing methods, the practical lower limit of measured heteroplasmy is still about 1% due to the inherent noise level of the sequencing. In this study, we sequenced the mitochondrial genome of 44 individuals using Illumina high-throughput sequencing technology and obtained high-coverage mitochondrial sequencing data. Our study population contains many mother-offspring pairs. This unique study design allows us to bypass the usual heteroplasmy limitation by analyzing the correlation of mutation levels at each position in the mtDNA sequence between maternally related pairs and non-related pairs. The study showed that very low heteroplasmy variants, down to almost 0.1%, are inherited maternally and that this inheritance begins to decrease at about 0.5%, corresponding to a bottleneck of about 200 mtDNA.

17.
To investigate the community structure and diversity of endophytic fungi in the leaves of Artemisia argyi, leaf samples were collected from five A. argyi varieties grown in different cultivation areas in China, namely, Tangyin Beiai in Henan (BA), Qichun Qiai in Hubei (QA), Wanai in Nanyang in Henan (WA), Haiai in Ningbo in Zhejiang (HA), and Anguo Qiai in Anguo in Hebei (AQA), and analyzed using Illumina high-throughput sequencing technology. A total of 365,919 pairs of reads were obtained, and the number of operational taxonomic units for each sample was between 165 and 285. The alpha diversity of the QA and BA samples was higher, and a total of two phyla, eight classes, 12 orders, 15 families, and 16 genera were detected. At the genus level, significant differences were noted in the dominant genera among the samples, with three genera being shared by all the samples. The dominant genus in QA was Erythrobasidium, while that in AQA, HA, and BA was Sporobolomyces, and that in WA was Alternaria, reaching a proportion of 16.50%. These results showed that the fungal community structure and diversity in QA and BA were high. Endophytes are of great importance to plants, especially for protection, production of phytohormones and other phytochemicals, and nutrition. Therefore, this study may be significant from an industrial perspective for Artemisia species.

18.
19.
Multi-sample pooling and Illumina Genome Analyzer (GA) sequencing allows high throughput sequencing of multiple samples to determine population sequence variation. A preliminary experiment, using the RET proto-oncogene as a model, predicted ≤30 samples could be pooled to reliably detect singleton variants without requiring additional confirmation testing. This report used 30 and 50 sample pools to test the hypothesized pooling limit and also to test recent protocol improvements, Illumina GAIIx upgrades, and longer read chemistry. The SequalPrep™ method was used to normalize amplicons before pooling. For comparison, a single ‘control’ sample was run in a different flow cell lane. Data were evaluated by variant read percentages and the subtractive correction method, which utilizes the control sample. In total, 59 variants were detected within the pooled samples, which included all 47 known true variants. The 15 singleton variants known from Sanger sequencing had an average of 1.62±0.26% variant reads for the 30 pool (expected 1.67% for a singleton variant [unique variant within the pool]) and 1.01±0.19% for the 50 pool (expected 1%). The 76 base read lengths had higher error rates than shorter read lengths (33 and 50 base reads), which eliminated the distinction of true singleton variants from background error. This report demonstrated pooling limits from 30 up to 50 samples (depending on error rates and coverage), for reliable singleton variant detection. The presented pooling protocols and analysis methods can be used for variant discovery in other genes, facilitating molecular diagnostic test design and interpretation.
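The expected singleton percentages quoted in the abstract above (1.67% for the 30-sample pool, 1% for the 50-sample pool) are consistent with a simple calculation, sketched here in Python. The sketch assumes diploid samples, one heterozygous carrier per pool, and equal contribution of every sample to the pool; these assumptions are not stated in the abstract, and the function name is hypothetical.

```python
def expected_singleton_fraction(n_samples, ploidy=2):
    """Expected share of reads carrying a variant found on one chromosome copy.

    Assumes every sample contributes equally to the pool (an idealization).
    """
    return 1.0 / (ploidy * n_samples)


for pool_size in (30, 50):
    fraction = expected_singleton_fraction(pool_size)
    print(f"{pool_size}-sample pool: {fraction:.2%} expected variant reads")
# prints 1.67% for the 30-sample pool and 1.00% for the 50-sample pool
```

Because the expected signal shrinks as 1/(2N), a per-base error rate approaching 1% leaves little margin at N = 50, which matches the abstract's observation that the higher error rate of the 76-base reads erased the distinction between true singletons and background error.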
