Similar articles
 Found 20 similar articles (search time: 218 ms)
1.
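To identify SNP loci closely associated with the red-leaf trait of Eucommia ulmoides 'Hongye', and to further clarify the genetic basis and molecular mechanism of that trait, 'Hongye' and the common green-leaved cultivar Eucommia ulmoides 'Xiaoye' were resequenced genome-wide at a coverage depth of about 10×. SnpEff was used to predict the effect of each variant on protein-coding sequence, and loci that differed between the cultivars and were related to leaf colour formation in 'Hongye' were screened with reference to the anthocyanin metabolic pathway and its key enzyme genes. SNP loci selected from the next-generation sequencing data were verified by Sanger sequencing, with 'Hongye' and 'Xiaoye' as the validation populations for the molecular markers. Sequencing produced 14.16 Gb of clean data for 'Hongye' and 14.29 Gb for 'Xiaoye'. In 'Hongye', 1,516 SNPs were annotated as having a severe impact on protein function and 41,328 as having a moderate impact; in 'Xiaoye', the corresponding numbers were 1,640 and 47,192. Of the 26,722 genes detected, 228 encode enzymes related to anthocyanin or flavonoid biosynthesis. Screening identified 12 specific SNP loci, all of them missense mutations in exonic regions. For validation by first-generation (Sanger) sequencing, 7 primer pairs were designed according to the SNP positions, and the SNP accuracy reached 100%.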

2.
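The hypothalamus and pituitary regulate reproductive physiology and egg-laying performance in hens. To understand the mechanisms that regulate egg laying, RNA-seq was used to sequence and analyse hypothalamic and pituitary tissue from high-yielding and low-yielding Zhuanghe Dagu chickens. SNP/InDel loci were obtained and subjected to statistical and bioinformatic analysis, and, with the chicken genome as the reference, GO enrichment analysis was performed on the genes containing SNPs. The results showed that homozygous SNP/InDel variants outnumbered heterozygous variants, and that pituitary tissue contained more SNPs/InDels than hypothalamic tissue. Transitions comprise fewer substitution types than transversions, yet transition SNPs were far more numerous than transversion SNPs. In the pituitary, the degree of GO enrichment of SNP-containing genes was negatively correlated with production performance (P0.05); in the hypothalamus, it was positively correlated (P0.05). The results provide a theoretical basis for breeding Dagu chickens and for research on improving their production performance.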
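The transition/transversion observation in the abstract above can be illustrated with a minimal sketch. It is a generic classification by purine/pyrimidine class, not the authors' pipeline; the function names and the toy SNP list are hypothetical.

```python
from collections import Counter
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def substitution_class(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # A transition stays within one chemical class (purine<->purine or
    # pyrimidine<->pyrimidine); a transversion crosses classes.
    same_chemical_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_chemical_class else "transversion"


def ts_tv_ratio(snps):
    """Ratio of transition count to transversion count for (ref, alt) pairs."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in snps)
    if counts["transversion"] == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return counts["transition"] / counts["transversion"]


# Enumerate all 12 ordered single-base substitutions and count each kind.
kinds = Counter(substitution_class(r, a) for r, a in permutations(sorted(BASES), 2))

# Hypothetical SNP calls, for illustration only.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
ratio = ts_tv_ratio(toy_snps)  # 3 transitions / 1 transversion
```

Enumerating the substitutions shows why there are fewer kinds of transition (4) than of transversion (8), even though transitions usually dominate observed SNP counts.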

3.
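Objective: To obtain high-density single nucleotide polymorphism (SNP) genotype data by whole genome sequencing (WGS), evaluate genotyping accuracy, and establish a method for using WGS data in forensic SNP-based genealogy inference. Methods: Samples were sequenced to a depth of 30× on the BGI MGISEQ-200RS platform. The 645,199 autosomal SNP loci of the Wegene GSA chip were extracted from the sequencing data; after quality-control filtering, IBS/IBD algorithms were used to predict kinship, and the population origin of the samples was analysed. Results: SNP genotypes extracted from the sequencing data agreed with Wegene GSA chip genotypes at a rate above 99.62%. With the SNP data obtained by sequencing, the IBS algorithm predicted 1st- to 4th-degree kinship, with 100% confidence-interval accuracy for 4th-degree predictions; the IBD algorithm predicted 1st- to 7th-degree kinship, and 7th-degree relatives were predicted to be related with 100% accuracy. The genealogy inference ability of SNPs obtained from high-depth WGS data did not differ significantly from chip-based predictions. Population inference from WGS data was also consistent with survey results. Conclusion: WGS can be applied to forensic SNP genealogy inference and can provide leads for case investigation.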
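As a rough illustration of the IBS (identity-by-state) idea mentioned in the abstract above, the sketch below counts shared alleles between two individuals. It assumes biallelic genotypes coded as alternate-allele counts (0, 1, 2) with `None` for missing calls. The function names and the toy genotypes are hypothetical, and this is not the study's kinship model, only the underlying allele-sharing arithmetic.

```python
def ibs_state(g1, g2):
    """Number of alleles shared identical-by-state (0, 1 or 2) at one locus."""
    return 2 - abs(g1 - g2)


def ibs_summary(sample_a, sample_b):
    """Return ([n_IBS0, n_IBS1, n_IBS2], mean allele-sharing proportion)."""
    if len(sample_a) != len(sample_b):
        raise ValueError("genotype vectors must cover the same loci")
    counts = [0, 0, 0]
    for g1, g2 in zip(sample_a, sample_b):
        if g1 is None or g2 is None:
            continue  # skip loci with a missing call in either sample
        counts[ibs_state(g1, g2)] += 1
    n_loci = sum(counts)
    if n_loci == 0:
        raise ValueError("no loci with calls in both samples")
    # Each IBS2 locus shares 2 of 2 alleles, each IBS1 locus 1 of 2.
    similarity = (counts[2] + 0.5 * counts[1]) / n_loci
    return counts, similarity


# Hypothetical genotypes at five loci, for illustration only.
person_a = [0, 1, 2, 2, None]
person_b = [0, 1, 0, 1, 1]
counts, similarity = ibs_summary(person_a, person_b)
```

Close relatives tend to show few IBS0 loci and a high sharing proportion; turning such summaries into degree-of-kinship calls requires reference distributions that are outside this sketch.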

4.
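A comparison of four commonly used assemblers for high-throughput sequencing data   Total citations: 1; self-citations: 0; citations by others: 1
The advent of next-generation sequencing platforms has driven research on assembly algorithms and software for whole-genome shotgun sequencing data. Since 2005, many sequence assemblers for high-throughput sequencing have been developed and continually improved to raise assembly quality. In this article, the widely used assemblers Velvet, ABySS, SOAPdenovo and CLC Genomics Workbench were each used to assemble the high-throughput sequencing data of bacteriophage IME08, a strain isolated in our laboratory. We describe the installation, use and parameter optimization of these assemblers, compare their assembly results, and report optimized assembly parameters for each one, as a reference for other researchers who use these programs.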

5.
李文轲  李丰余  张思瑶  蔡斌  郑娜  聂宇  周到  赵倩 《遗传》2014,36(6):618-624
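The development of next-generation sequencing (NGS) places high demands on the processing and analysis of sequencing data. Many NGS analysis programs exist, but most perform only a single function (for example, read alignment only, variant calling only, or functional annotation only), so selecting and integrating them correctly and efficiently has become a pressing need. This article presents an automated pipeline, based on Perl and SGE resource management, for analysing genome sequencing data from the Illumina platform. The pipeline takes raw sequencing reads as input, calls industry-standard data-processing software (such as BWA, Samtools, GATK and ANNOVAR), and produces a list of variants with functional annotation that researchers can analyse further. Automated parallel scripts control the run efficiently, and results and reports are output in one step, which reduces manual work during data analysis and greatly improves efficiency. Users only need to fill in a configuration file or enter settings through a graphical interface to complete all operations. This work gives researchers a convenient route to analysing NGS data.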

6.
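Using Phred/Phrap/Consed, cross_match, RepeatMasker, Blast and programs developed in-house, we built an EST sequence analysis system for forest trees on the Linux operating system. The system covers conversion of sequencing chromatograms into nucleotide sequences, removal of vector sequences, identification of repeats, clustering and assembly of ESTs, functional annotation and functional classification of ESTs, and discovery of SSRs and SNPs. Scripts written in Perl with BioPerl modules automate the analysis, so that large batches of forest tree EST data can be analysed quickly, providing useful information for functional genomics research in forest trees.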

7.
《遗传》2020,(7)
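As sequencing technology continues to develop, massive amounts of genome sequencing data have been generated, greatly enriching public genetic data resources. To cope with this volume of data, algorithms and tools for genome comparison and annotation are constantly updated, which makes it possible to combine several annotation tools and obtain more accurate annotation of protein-coding genes. Some prokaryotic genome sequences and assemblies in public databases are more than 10 years old and contain many predicted coding genes of unknown function. To improve the annotation quality of genomes in the National Center for Biotechnology Information (NCBI) database, this study combined several prokaryotic gene-finding algorithms/programs with gene expression data to re-annotate 1587 bacterial and archaeal genomes. First, using 33 Z-curve variables, 3092 sequences that had been over-annotated as protein-coding genes were identified in the original annotations of 177 genomes. Second, homology searches assigned specific functions to 4447 protein-coding genes of unknown function in 939 genomes. Finally, ZCURVE 3.0, Glimmer 3.02 and Prodigal, three accurate and widely used gene finders that rest on different and therefore complementary algorithms, were combined to search for genes missing from the annotation. In total, 2003 missed protein-coding genes were found in 9 genomes; these genes belong to multiple clusters of orthologous groups of proteins (COG). By re-annotating early-sequenced bacterial and archaeal genomes with newer tools and multi-omics data, this study provides a reference annotation approach for newly sequenced strains, and the re-annotated bacterial gene sequences should also benefit subsequent basic research.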

8.
李鑫  李凯  李一佳  马磊 《生物信息学》2016,14(3):188-194
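SeqMule automatically adjusts its variables according to the human genome or exome data it is given, and analyses and annotates single nucleotide polymorphisms (SNPs) in all of the sequencing data. Objective: To introduce SeqMule in detail to bioinformatics researchers by analysing experimental data from two gout patients, with the aim of providing a one-stop analysis route for whole-genome and exome sequencing data. Methods: Using the alignment and analysis tools built into SeqMule, namely BWA (Burrows-Wheeler Aligner), GATK (The Genome Analysis Toolkit), SAMtools and FreeBayes, and taking the analysis of DNA sequencing data from two gout patients as an example, this article describes the features and operation of SeqMule in detail and performs automated alignment and SNP analysis of the two patients' exome sequencing data. SeqMule was found to resolve several problems present in many analysis programs. It provides comprehensive, flexible and efficient automated analysis of exome and whole-genome sequencing data, handles high-throughput sequencing data better, and ultimately improves the consistency and accuracy of data analysis.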

9.
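Using the Illumina HiSeq™ 2500 platform, reduced-representation genome sequencing (SLAF-seq) was performed on 20 heat-tolerant and 20 heat-sensitive large yellow croaker (Larimichthys crocea) selected by a high-temperature stress experiment. The average sequencing depth per sample was 10.26×, and 419,211 high-quality population single nucleotide polymorphism (SNP) loci were obtained. A genome-wide association study (GWAS) using the mixed linear model (MLM) in TASSEL identified 38 SNP loci significantly associated with heat tolerance in large yellow croaker (P<2.39E-08). BLAST was used to locate each SNP in the large yellow croaker genome and to analyse the functional genes around it. In total, 26 known functional genes were found near the 38 SNPs; these genes are mainly related to transcription, metabolism and immunity. The results provide a reference for further work on the molecular mechanism of heat tolerance in large yellow croaker and for breeding heat-tolerant varieties.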
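The significance threshold quoted in the abstract above (P<2.39E-08) is numerically consistent with a Bonferroni correction at α = 0.01 over the 419,211 SNPs tested. The abstract does not state how the threshold was chosen, so treat that as an inference. The sketch below reproduces the arithmetic; the function names and the toy p-values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_snps(p_values, threshold):
    """Return the SNP identifiers whose p-value falls below the threshold."""
    return [snp for snp, p in p_values.items() if p < threshold]


# alpha = 0.01 is an assumption; 419,211 is the SNP count from the abstract.
threshold = bonferroni_threshold(0.01, 419_211)
print(f"{threshold:.2e}")  # 2.39e-08

# Hypothetical association results, for illustration only.
toy_p_values = {"snp_1": 1.0e-09, "snp_2": 5.0e-08, "snp_3": 2.0e-08}
hits = significant_snps(toy_p_values, threshold)
```

Bonferroni correction is conservative when markers are in linkage disequilibrium, which is one reason published thresholds vary between studies.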

10.
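RNA editing is an important post-transcriptional modification, and several algorithms are available for identifying it. This study examines how sequencing depth affects RNA-editing detection algorithms in mouse, in order to recommend methods for RNA-editing research. Mouse RNA-seq data were aligned with STAR, SNVs were called with GATK, and RNA editing sites were identified with three methods: Separate Method, GIREMI and RNAEditor. The three methods were then compared in terms of the sites they share, detection efficiency, detection stability, and the relationship between detection and sequencing depth. The numbers of editing sites identified by the three methods differed greatly and few sites were shared; the number of RNA editing sites identified increased with sequencing depth. The results indicate that the performance of RNA-editing detection algorithms in mouse is positively correlated with sequencing depth.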
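Comparing the sites shared by several callers, as the abstract above describes, reduces to set operations on site coordinates. The following sketch assumes each method's calls are available as a set of (chromosome, position) tuples. The site coordinates are invented for illustration and the helper names are hypothetical; only the method names come from the abstract.

```python
def shared_sites(calls_by_method):
    """Sites reported by every method (set intersection)."""
    return set.intersection(*calls_by_method.values())


def unique_sites(calls_by_method):
    """For each method, the sites that no other method reported."""
    unique = {}
    for method, sites in calls_by_method.items():
        others = set().union(
            *(s for m, s in calls_by_method.items() if m != method)
        )
        unique[method] = sites - others
    return unique


# Hypothetical editing-site calls, for illustration only.
calls = {
    "Separate Method": {("chr1", 100), ("chr1", 250), ("chr2", 40)},
    "GIREMI": {("chr1", 100), ("chr2", 40), ("chr3", 7)},
    "RNAEditor": {("chr1", 100), ("chr2", 99)},
}
common = shared_sites(calls)
only = unique_sites(calls)
```

Repeating this comparison on data subsampled to different depths would show how the shared and method-specific fractions change with coverage.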

11.
12.
13.
14.
The completion of the Human Genome Project provided a reference sequence against which researchers could compare sequences from individual patients in the hope of identifying disease-causing mutations. However, this still required candidate gene testing or a very limited screen of multiple genes using Sanger sequencing. With the advent of high-throughput Sanger sequencing, it became possible to screen hundreds of patients for alterations in hundreds of genes. This process was time-consuming and limited to the few institutions that had the space to house dozens of sequencing instruments. The development of next-generation sequencing revolutionized the process. It is now feasible to sequence the entire exome of multiple individuals in about 10 days. However, this means that a massive amount of data must be filtered to identify the relevant alteration. Filtering is presently the rate-limiting step in establishing a convincing association between a genetic alteration and a human disorder.

15.
Genomic measures of inbreeding based on identical-by-descent (IBD) segments are increasingly used and are mostly estimated from SNP arrays and whole-genome sequencing (WGS) data. However, some software packages commonly used for this estimation assume that genomic positions that have not been genotyped are nonvariant. This might be true for WGS data, but not for reduced genomic representations, and it can lead to spurious IBD segment estimates. In this project, we simulated the outputs of WGS, two SNP arrays of different sizes and RAD-sequencing for three populations with different sizes and histories. We compare IBD segment estimates from two software packages: runs of homozygosity (ROHs) estimated with PLINK and homozygous-by-descent (HBD) segments estimated with RZooRoH. We demonstrate that, to obtain meaningful estimates of inbreeding, RZooRoH requires an SNP density 11 times lower than PLINK does: ranks of inbreeding coefficients were conserved among individuals above 22 SNPs/Mb for PLINK and above 2 SNPs/Mb for RZooRoH. We also show that in populations with simple demographic histories, the distributions of ROHs and HBD segments are correctly estimated with both SNP arrays and WGS. PLINK correctly estimated the distribution of ROHs with SNP densities above 22 SNPs/Mb, while RZooRoH correctly estimated the distribution of HBD segments with SNP densities above 11 SNPs/Mb. However, in a population with a more complex demographic history, RZooRoH estimated the distribution of IBD segments better than PLINK even with WGS data. Consequently, we advise researchers to estimate inbreeding either with methods that rely on excess homozygosity averaged across SNPs or with model-based HBD segment calling methods.
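To make the run-of-homozygosity idea in the abstract above concrete, here is a deliberately simplified scan. It treats any maximal stretch of consecutive homozygous genotypes as a candidate run. It is not PLINK's sliding-window procedure or RZooRoH's model, and the function names, default thresholds and toy data are hypothetical.

```python
def find_roh(positions, genotypes, min_snps=50, min_length_bp=1_000_000):
    """Return (start, end) spans of consecutive homozygous SNPs.

    genotypes are alternate-allele counts: 0 or 2 homozygous, 1 heterozygous.
    A span is kept only if it has at least min_snps SNPs and covers at least
    min_length_bp base pairs.
    """
    runs = []
    run_start = run_end = None
    run_count = 0

    def close_run():
        # Record the current run if it passes both thresholds.
        if (run_start is not None and run_count >= min_snps
                and run_end - run_start >= min_length_bp):
            runs.append((run_start, run_end))

    for pos, gt in zip(positions, genotypes):
        if gt in (0, 2):
            if run_start is None:
                run_start, run_count = pos, 0
            run_end = pos
            run_count += 1
        else:
            close_run()  # a heterozygous call ends the current run
            run_start, run_count = None, 0
    close_run()  # handle a run that reaches the last SNP
    return runs


def f_roh(runs, genome_length_bp):
    """Inbreeding coefficient as the genome fraction covered by ROHs."""
    return sum(end - start for start, end in runs) / genome_length_bp


# Hypothetical toy data, with thresholds scaled down to match.
positions = [100, 200, 300, 400, 500, 600, 700]
genotypes = [0, 2, 0, 1, 2, 2, 0]
runs = find_roh(positions, genotypes, min_snps=3, min_length_bp=100)
coefficient = f_roh(runs, genome_length_bp=1000)
```

The sketch also shows why marker density matters: with sparse SNPs, an ungenotyped heterozygous site between two homozygous calls goes unseen, so a scan of this kind can join or overextend runs.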

16.
17.
We have developed a software package named PEAS to facilitate analyses of large data sets of single nucleotide polymorphisms (SNPs) for population genetics and molecular phylogenetics studies. PEAS reads SNP data in various formats as input and is versatile in data formatting; using PEAS, it is easy to create input files for many popular packages, such as STRUCTURE, frappe, Arlequin, Haploview, LDhat, PLINK, EIGENSOFT, PHASE, fastPHASE, MEGA and PHYLIP. In addition, PEAS fills several analysis gaps in currently available computer programs for population genetics and molecular phylogenetics. Notably, (i) it calculates genetic distance matrices with bootstrapping for both individuals and populations from genome-wide high-density SNP data, and the output can be passed directly to MEGA and PHYLIP for further processing; (ii) it calculates genetic distances from STRUCTURE output and generates a MEGA file to reconstruct component trees; (iii) it provides tools to conduct haplotype sharing analysis for phylogenetic studies based on high-density SNP data. To our knowledge, these analyses are not available in any other computer program. PEAS for Windows is freely available for academic users from http://www.picb.ac.cn/~xushua/index.files/Download_PEAS.htm.

18.

Background  

Spectral processing and post-experimental data analysis are the major tasks in NMR-based metabonomics studies. Although commercial and freely licensed software tools are available to assist with these tasks, researchers usually have to use multiple software packages in a single study because each package generally focuses on specific tasks. A highly integrated platform in which all of these tasks can be completed within one package would therefore be beneficial. Moreover, with an open-source architecture, newly proposed algorithms or methods for spectral processing and data analysis can be implemented much more easily and accessed freely by the public.

19.
Interdigitating dendritic cell sarcoma (IDCS) is an aggressive and extremely rare neoplasm that is challenging to diagnose. The etiology of IDCS is also unknown, and most published studies are case reports. In our case, immunohistochemistry showed that the tumor cells were positive for S100, CD45, and CD68, but negative for CD1a and CD21. This study aimed to investigate the causative factors of IDCS by sequencing the protein-coding regions of an IDCS patient's genome. We performed whole-exome sequencing with genomic DNA from blood and sarcoma tissue of the IDCS patient using the Illumina HiSeq 2500 platform. We then conducted Sanger sequencing to validate sarcoma-specific variants and performed gene ontology analysis using DAVID bioinformatics resources. By comparing the sequencing data of the sarcoma with that of normal blood, we obtained 15 nonsynonymous single nucleotide polymorphisms (SNPs) as sarcoma-specific variants. Although the 15 SNPs were not validated by Sanger sequencing, owing to tumor heterogeneity and the low sensitivity of Sanger sequencing, we examined the function of the gene in which each SNP is located. Based on previous studies and the gene ontology database, we found that POLQ, which encodes DNA polymerase theta, and FNIP1, which encodes the tumor suppressor folliculin-interacting protein, might have contributed to the IDCS. Our study identifies potential causative genetic factors of IDCS and helps advance the understanding of IDCS pathogenesis.

20.
Despite the ubiquity of Sanger sequencing, automated assembly tools are predominantly stand-alone software packages for desktop or laptop use, with very few online equivalents, which ties sequence analysis and assembly to particular machines and locations. With data output increasing worldwide, there is also a need for automated quality checks and trimming before large assemblies, along with automated detection of mutations. Through web servers with expanded automation and functionality, even smartphones and phablets can be used to perform complex analyses previously limited to desktops, especially if they can upload files from cloud storage. To facilitate such online sequence assembly and analysis, we created the Yet Another Quick Assembly, Analysis and Trimming Tool web server. It performs automated de novo assembly of multiple .ab1 and .FASTQ sequencing reads, with automated trimming and scanning of the assembled sequences for single nucleotide polymorphisms and insertions or deletions. No software installation is required, so the server can be used from anywhere with Internet access and with minimal dependency on other software and web tools.
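The automated trimming step mentioned in the abstract above is not specified in detail, so the following is only a generic sketch of end trimming by base quality. It assumes Phred+33 encoded quality strings as used in FASTQ files, and the function names, the Q20 cutoff and the toy read are hypothetical.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores."""
    return [ord(char) - offset for char in quality]


def trim_ends(sequence, quality, min_q=20, offset=33):
    """Remove leading and trailing bases whose Phred score is below min_q."""
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    scores = phred_scores(quality, offset)
    start, end = 0, len(scores)
    while start < end and scores[start] < min_q:
        start += 1  # drop low-quality bases from the 5' end
    while end > start and scores[end - 1] < min_q:
        end -= 1  # drop low-quality bases from the 3' end
    return sequence[start:end], quality[start:end]


# Hypothetical read: '!' encodes Q0, '#' encodes Q2, 'I' encodes Q40.
trimmed_seq, trimmed_qual = trim_ends("ACGTACGT", "!!IIII##")
```

Trimming only the ends leaves low-quality bases inside the read untouched; sliding-window or error-probability approaches are common alternatives when internal quality dips matter.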
