Similar articles
20 similar articles found (search time: 218 ms)
1.
宋琳琳  顾朝辉  韦朝春  陈赛娟 《生物磁学》2009,(15):2899-2902,2912
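Objective: To develop data analysis and quality assessment methods suited to next-generation sequencing data, which are large in volume and short in read length. Methods: Published sequencing data from the Illumina-Solexa platform were aligned to the whole human genome sequence with the MAQ software and, taking exonic regions as an example, data quality was assessed at the level of individual sites. Results: By combining existing software with a linear algorithm devised in this work, a sequencing data quality assessment system covering alignment and assembly was established. After alignment, the raw reads were found to cover 127,113,378 sites, involving 64,868 exons on 24 chromosomes. Exons in which every site was sequenced accounted for 0.50%, and exons with a mean per-site sequencing depth of at least 1 accounted for 3.98%. Conclusion: A data analysis and quality assessment method based on the Illumina-Solexa sequencing platform was successfully established, and it is applicable to other second-generation sequencing platforms. On the basis of the quality assessment, researchers can refine sequencing experiment designs and carry out SNP and mutation screening and subsequent functional studies.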
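The two exon-level figures in the abstract above (the share of exons in which every site was sequenced, and the share with a mean per-site depth of at least 1) can be illustrated with a minimal Python sketch. The toy depth lists and the function name `exon_coverage_summary` are hypothetical; this is not the authors' algorithm or the MAQ output format, only one plausible way to compute such percentages.

```python
def exon_coverage_summary(exon_depths):
    """Summarize exon coverage.

    exon_depths: dict mapping an exon id to a list of per-site read depths
    (hypothetical input layout, assumed for illustration).
    """
    n_exons = len(exon_depths)
    # An exon is "fully covered" when every site has depth >= 1.
    fully_covered = sum(1 for d in exon_depths.values() if d and min(d) >= 1)
    # An exon passes the mean-depth test when its average depth is >= 1.
    mean_ge_1 = sum(1 for d in exon_depths.values() if d and sum(d) / len(d) >= 1)
    return {
        "pct_fully_covered": 100.0 * fully_covered / n_exons,
        "pct_mean_depth_ge_1": 100.0 * mean_ge_1 / n_exons,
    }


# Toy data: four exons of three sites each.
toy = {
    "exonA": [2, 1, 3],  # every site covered, mean depth 2
    "exonB": [0, 4, 0],  # gaps, but mean depth 4/3
    "exonC": [0, 0, 1],  # gaps, mean depth 1/3
    "exonD": [0, 0, 0],  # not covered at all
}
summary = exon_coverage_summary(toy)
```

Note that the second metric is the looser of the two: an exon with uncovered sites can still reach a mean depth of 1, which is consistent with the abstract reporting a higher percentage for it.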

2.
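Feeding and infestation by the elm gall aphid induce galls on elm leaves. In this study, elm leaves stimulated by elm gall aphid feeding were subjected to transcriptome sequencing and functional annotation on the high-throughput Illumina HiSeq™ 2000 platform, and gene expression and function were examined with biological methods. Sequencing yielded 23.19 Gb of sequence data; after filtering, assembly, and redundancy removal, 102,017 unigenes were obtained, of which 37,899 (37.15%) were annotated by BLAST comparison against databases such as NR. The unigenes from galled elm leaves were compared against the KOG, GO, and KEGG databases: the matched unigenes were assigned to 25 functional categories; GO annotation grouped them into 3 main categories and 57 subcategories; and, with KEGG as the reference, unigenes were mapped to 110 distinct metabolic pathways, including pathways related to oxidative stress defense, plant hormone signal transduction, carbohydrate metabolism, and secondary metabolism. By using second-generation high-throughput transcriptome sequencing to study genes associated with elm galls under elm gall aphid infestation, this work provides baseline data for future studies of the molecular mechanism by which the aphid induces gall formation on elm leaves.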

3.
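By combining sequence analysis of severe acute respiratory syndrome (SARS) virus strains with high-throughput sequencing, a rapid and simple method was established for identifying SARS virus strains and screening for mutation sites and mutation frequencies. Viral RNA was extracted from Vero-6 cells infected with human SARS virus and reverse transcribed into cDNA; the target gene fragments were amplified by PCR, and pyrosequencing (Pyrosequencing Technology, PSQ) was used to sequence the mutation sites at positions 2601, 7919, 9479, and 19838 and to analyze mutation frequencies. Sequencing of these candidate mutation sites identified the virus as the Beijing epidemic strain and revealed an A/G mutation at position 7919. PSQ is simple, rapid, and sensitive for high-throughput screening of viral gene mutations and for determining viral strain type. Bioinformatic analysis of nucleic acid polymorphisms, combined with experimental validation, can establish the characteristics of epidemic SARS virus strains and help identify the source of infection early in an outbreak.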

4.
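Objective: Many studies have shown that mitochondrial DNA (mtDNA) mutations are closely associated with tumor initiation and progression, but conventional sequencing methods cannot readily detect mtDNA mutations with high throughput and high accuracy. This study therefore established an mtDNA mutation detection method based on next-generation sequencing. Methods: Total DNA was extracted from tumor tissue, adjacent non-tumor tissue, and peripheral blood cells of patients with liver cancer. The mitochondrial genome was enriched by PCR, and mtDNA sequencing libraries were constructed by blunt-end or sticky-end ligation of the PCR products or by amino modification of the PCR primers. After sequencing on the Illumina HiSeq 2000 platform, reads were aligned to the human mtDNA reference sequence using bioinformatic methods, and the sequencing data were analyzed. Results: Evaluation of genomic DNA of differing quality showed that the three-primer-pair method is suitable for mtDNA enrichment in most DNA samples. Amino modification of the PCR primers further markedly improved the uniformity of sequencing coverage and reduced sequencing cost. Conclusion: By optimizing the mtDNA enrichment method and the uniformity of sequencing coverage, this study established a sensitive, specific, high-throughput strategy for detecting mtDNA mutations with next-generation sequencing, providing a new method for research on mtDNA mutations and disease.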

5.
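Objective: To build a local RNA-Seq data processing and analysis platform for RNA-Seq researchers. Methods: Based on a survey of existing RNA-Seq data analysis work, a local RNA-Seq analysis platform was constructed. The platform first filters low-quality reads from the sequencing data, then aligns the filtered data to the reference genome with TopHat, uses the alignment results for alternative splicing analysis, differential gene expression analysis, and related analyses, and finally visualizes the results with R packages. Results: In an analysis of RNA-Seq data from 2 groups of mice, the platform effectively filtered low-quality sequencing data, identified genes differentially expressed between the 2 groups, and displayed those genes graphically. Conclusion: The platform supports quality control of RNA-Seq data, differential expression analysis, and visualization of the results.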

6.
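Objective: To develop a fast and accurate method for finding indels in next-generation sequencing data, particularly single-end data. Methods: Reads are first aligned rapidly against the whole-genome reference sequence to select those containing indels; these reads then undergo a second, bidirectional alignment to determine indel length; finally, the length information is used to find the exact position and related details of each indel within a restricted range. Results: The FIND (Fast INDel detection system) system was successfully built to find indels in single-end sequencing data. With simulated sequencing data at 12X as the test set, the sensitivity and specificity of FIND were 87.71% and 99.66%, respectively, and performance improved as sequencing depth increased. Conclusion: By making full use of the information obtained during alignment, the approach determines the approximate position of an indel at the same time as its length, and thereby finds indels in single-end sequencing data quickly and accurately within a local range.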
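For readers unfamiliar with the two performance figures quoted for FIND, the following Python sketch shows the textbook definitions of sensitivity and specificity. It assumes the abstract uses these standard definitions, which the source does not state, and the counts are invented for illustration; they are not FIND's results.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real indels that the caller reports: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-indel cases correctly left uncalled: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts from a simulated data set (illustration only).
sens = sensitivity(true_pos=80, false_neg=20)
spec = specificity(true_neg=990, false_pos=10)
```

With simulated reads the true indel positions are known in advance, which is what makes both quantities directly computable.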

7.
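With the rapid growth of influenza virus genome sequencing data, mining the biological information contained in these large datasets has become a research focus. Building an automated, integrated, information-based sequence database system on the basis of influenza epidemiological data from China is of considerable practical value for rapid batch translation, annotation, storage, querying, and analysis of influenza virus genomes. By integrating a series of software tools and toolkits with functions developed in-house, and by adding refined filtering rules for selecting the best-matching translation and annotation on top of 2 key reference datasets maintained at the base layer, our group built an automated system that provides influenza virus genome storage, automated translation, accurate protein sequence annotation, homologous sequence alignment, and phylogenetic tree analysis. The results show that, when influenza virus gene sequences in fasta format are submitted through the Web interface, the system runs a Blast homology search against the reference segment dataset (blastdb.fasta) and can identify the virus type (A, B, or C), subtype, and gene segment (segments 1 to 8). It then queries the underlying reference dataset of gene segments used for translation and annotation to obtain a set of peptides, and calls the ProSplign software in a loop to make predictions for them. Using the refined filtering and admission rules, the translation product that best matches the input sequence is selected as its predicted protein and output in common formats such as gbk, asn, and fasta, together with sequence length, whether the sequence is full length, virus type, subtype, segment, and other information. On this basis, additional functions such as phylogenetic tree analysis and display and genome data storage were developed in-house, resulting in a Web-service-based system for automated translation and annotation of influenza virus genomes. This study indicates that the system, which tightly integrates a series of software tools with its own annotation and translation database files, covers sequence storage, translation, annotation, analysis, and display, and can fully meet China's need for shared, localized, integrated, and automated handling of high-throughput genetic testing data.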

8.
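To strengthen the conservation and use of Semiliquidambar cathayensis (半枫荷), an endangered plant endemic to China, its transcriptome was sequenced on the Illumina HiSeq 2500 high-throughput platform. The filtered data were assembled de novo and clustered to remove redundancy, yielding 77,629 unigenes; after comparison, analysis, and annotation against nine functional databases, 45,293 unigenes were annotated. In KOG, 25,253 functional annotations were obtained and grouped into 25 functional categories. GO annotation fell into 3 main categories (cellular component, biological process, and molecular function), subdivided into 22, 26, and 17 subcategories, respectively (65 in total). Comparison with the KEGG database identified 286 metabolic pathways, including 177 pathways of secondary metabolite biosynthesis that may be related to the medicinally active constituents of the species. From the assembly, 1,547 unigenes encoding transcription factors were predicted, belonging to 88 gene families, and transcription factor families that regulate the synthesis of medicinally active compounds were identified. In addition, 12,579 SNP sites were detected and 57,671 CDSs were predicted from the annotation results. This study is the first analysis of the S. cathayensis transcriptome and provides baseline data for further molecular biology research on the species.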

9.
陆才瑞  邹长松  宋国立 《遗传》2015,37(8):765-776
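Traditional forward-genetics gene mapping generally relies on constructing genetic linkage maps, a process that is tedious, time-consuming, and labor-intensive, and that in many cases yields low mapping precision and large candidate intervals. With the rapid development of high-throughput sequencing and the continuing fall in sequencing costs, a variety of simple and fast sequencing-based gene mapping methods have been developed, including direct sequencing of mutant genomes, sequencing of pooled mutant material, and map construction by sequencing genetically segregating populations; mapping can also be done by sequencing the transcriptome or part of the genome. These methods can identify mutation sites at nucleotide resolution and have been extended to complex genetic backgrounds. Some recently reported sequencing-based mapping has even been accomplished without relying on a reference genome sequence, genetic crosses, or linkage information, which makes forward genetics feasible in many non-model species. This article reviews these new technologies and their application to gene mapping.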

10.
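The maize weevil, Sitophilus zeamais, is a stored-grain pest found worldwide, but its genetic information remains incomplete. In this study, the transcriptome of adult maize weevils was sequenced on the Illumina HiSeq™ 2000 high-throughput platform, yielding 64,358 unigenes with a total length of 39,481,752 bp, a minimum length of 201 bp, a maximum length of 29,046 bp, and a mean length of 613 bp. The unigene sequences were annotated against 7 databases (Nr, GO, Swissprot, KOG, Pfam, KEGG, and TrEMBL), in which 25,271, 20,747, 16,477, 12,060, 9,471, 3,370, and 25,500 unigenes were annotated, respectively. In the GO functional classification, 20,747 unigenes were assigned to 68 subcategories within the 3 main categories of biological process, cellular component, and molecular function. The KOG results assigned 12,606 unigenes to 25 gene families. KEGG pathway analysis annotated 3,370 unigenes, which fell into 5 major classes (cellular processes, environmental information processing, genetic information processing, metabolism, and organismal systems) comprising 33 subclasses and 274 functional pathways. Further analysis identified 146,081 SNP sites and 5,002 simple sequence repeat (SSR) sites. The maize weevil transcriptome data obtained here provide an important resource for mining functional genes in this species.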

11.
Phaeochromocytomas (PCCs) and paragangliomas (PGLs) are rare, catecholamine-producing tumors. Most familial PCC/PGLs show autosomal dominant inheritance. This study, however, was undertaken in a family with PCCs to identify candidate genes under either a dominant or a recessive inheritance pattern. After excluding mutations in ten PCC/PGL susceptibility genes by Sanger sequencing, we used whole exome sequencing to screen four family members for novel candidate genes associated with PCCs. Given the absence of non-synonymous mutations or indels in the ten known genes and the structure of the pedigree, the candidates were narrowed down to 3 damaging loci under a dominant inheritance pattern, 5 damaging loci under a recessive homozygous inheritance pattern, and 6 damaging genes under a compound heterozygous inheritance pattern, which may be associated with PCCs. In Gene Ontology (GO) category analysis of the combined results, cell adhesion showed the most significant enrichment.

12.
Mutation analysis of Taiwanese Wilson disease patients
Wilson disease (WD) is an autosomal recessive disorder of copper metabolism caused by mutations in the copper-transporting ATPase gene (ATP7B). In the present study, we report a molecular diagnostic method for screening WD chromosomes in patients or heterozygous carriers in Taiwan. Exons 8, 11, 12, 13, 16, 17, and 18 of ATP7B were selected for mutation screening. The most common mutation, Arg778Leu or Arg778Gln, was first screened by PCR-RFLP; single-stranded conformation polymorphism (SSCP) analysis was then performed, followed by direct DNA sequencing of the DNA fragments that showed a mobility shift on SSCP analysis. The diagnostic rate was compared with that of standard whole-gene sequencing of ATP7B. Ten different mutations were identified among 29 WD patients; four of them were novel (Ala1168Pro, Thr1178Ala, Ala1193Pro, and Pro1273Gln). The false positive rates, tested against 100 normal individuals, were as follows: exon 8: 5%; exon 11: 4%; exon 12: 6%; exon 13: 5%; exon 16: 5%; exon 17: 3%; exon 18: 4%. The Arg778Leu mutation exhibited the highest allelic frequency (43.1%). The detection rate of WD chromosomes was 65.52%, which is as sensitive as whole-gene sequencing. According to our results, WD mutations in Taiwan are predominantly located in exons 8, 11, 12, 13, 16, 17, and 18. Standard sequencing analysis of the entire gene is time consuming. For individuals in Taiwan at higher risk of WD, we recommend screening these 7 exons first, before sequencing the whole gene and promoter.
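The percentages in this abstract can be checked with a short worked calculation. Because WD is autosomal recessive, each patient contributes two disease chromosomes, so 29 patients correspond to 58 WD chromosomes. The chromosome counts below (38 detected, 25 carrying Arg778Leu) are not stated in the abstract; they are inferred here as the whole numbers that reproduce the reported 65.52% and 43.1%, and should be read as an assumption.

```python
patients = 29
wd_chromosomes = patients * 2  # two disease alleles per recessive patient

# Inferred counts (assumption, chosen to match the reported percentages).
detected_chromosomes = 38
arg778leu_chromosomes = 25

detection_rate_pct = round(100 * detected_chromosomes / wd_chromosomes, 2)
arg778leu_freq_pct = round(100 * arg778leu_chromosomes / wd_chromosomes, 1)
```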

13.
Isolated dystonia is a disorder characterized by involuntary twisting postures arising from sustained muscle contractions. Although autosomal-dominant mutations in TOR1A, THAP1, and GNAL have been found in some cases, the molecular mechanisms underlying isolated dystonia are largely unknown. In addition, although emphasis has been placed on dominant isolated dystonia, the disorder is also transmitted as a recessive trait, for which no mutations have been defined. Using whole-exome sequencing in a recessive isolated dystonia-affected kindred, we identified disease-segregating compound heterozygous mutations in COL6A3, a collagen VI gene associated previously with muscular dystrophy. Genetic screening of a further 367 isolated dystonia subjects revealed two additional recessive pedigrees harboring compound heterozygous mutations in COL6A3. Strikingly, all affected individuals had at least one pathogenic allele in exon 41, including an exon-skipping mutation that induced an in-frame deletion. We tested the hypothesis that disruption of this exon is pathognomonic for isolated dystonia by inducing a series of in-frame deletions in zebrafish embryos. Consistent with our human genetics data, suppression of the exon 41 ortholog caused deficits in axonal outgrowth, whereas suppression of other exons phenocopied collagen deposition mutants. All recessive mutation carriers demonstrated early-onset segmental isolated dystonia without muscular disease. Finally, we show that Col6a3 is expressed in neurons, with relevant mRNA levels detectable throughout the adult mouse brain. Taken together, our data indicate that loss-of-function mutations affecting a specific region of COL6A3 cause recessive isolated dystonia with underlying neurodevelopmental deficits and highlight the brain extracellular matrix as a contributor to dystonia pathogenesis.

14.

Background

Targeting Induced Local Lesions IN Genomes (TILLING) is a reverse genetics approach to directly identify point mutations in specific genes of interest in genomic DNA from a large chemically mutagenized population. Classical TILLING processes, based on enzymatic detection of mutations in heteroduplex PCR amplicons, are slow and labor intensive.

Results

Here we describe a new TILLING strategy in zebrafish using direct next generation sequencing (NGS) of 250bp amplicons followed by Paired-End Low-Error (PELE) sequence analysis. By pooling a genomic DNA library made from over 9,000 N-ethyl-N-nitrosourea (ENU) mutagenized F1 fish into 32 equal pools of 288 fish, each with a unique Illumina barcode, we reduce the complexity of the template to a level at which we can detect mutations that occur in a single heterozygous fish in the entire library. MiSeq sequencing generates 250 base-pair overlapping paired-end reads, and PELE analysis aligns the overlapping sequences to each other and filters out any imperfect matches, thereby eliminating variants introduced during the sequencing process. We find that this filtering step reduces the number of false positive calls 50-fold without loss of true variant calls. After PELE we were able to validate 61.5% of the mutant calls that occurred at a frequency between 1 mutant call:100 wildtype calls and 1 mutant call:1000 wildtype calls in a pool of 288 fish. We then use high-resolution melt analysis to identify the single heterozygous mutation carrier in the 288-fish pool in which the mutation was identified.
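The filtering idea behind PELE, and the pooling arithmetic in the Results above, can be sketched in Python under simplifying assumptions. The sketch assumes that the two mates of a pair have the same length and overlap completely (plausible for 250 bp amplicons read with 250 bp paired-end reads) and requires an exact match; the published PELE analysis aligns the overlapping sequences and may differ in detail. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def pele_filter(read1, read2):
    """Keep a read pair only if both mates agree at every base.

    read2 comes from the opposite strand, so it is reverse-complemented
    before comparison. Returns the agreed sequence, or None if the pair
    disagrees anywhere (treated as a likely sequencing error).
    """
    mate = reverse_complement(read2)
    return read1 if read1 == mate else None


def expected_mutant_fraction(pool_size, ploidy=2):
    """Allele fraction of one heterozygous carrier in a pool of diploid fish."""
    return 1 / (pool_size * ploidy)


fish_in_library = 32 * 288                 # 32 pools of 288 fish
fraction = expected_mutant_fraction(288)   # 1 mutant allele in 576
```

Under these assumptions, a single heterozygous carrier contributes roughly 1 mutant call per 575 wildtype calls, which lies inside the 1:100 to 1:1000 window in which the abstract reports validated calls.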

Conclusions

Using this NGS-TILLING protocol we validated 28 nonsense or splice site mutations in 20 genes, at a two-fold higher efficiency than using traditional Cel1 screening. We conclude that this approach significantly increases screening efficiency and accuracy at reduced cost and can be applied in a wide range of organisms.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1263-4) contains supplementary material, which is available to authorized users.

15.
Nephrotic syndrome (NS), the association of gross proteinuria, hypoalbuminaemia, edema, and hyperlipidemia, can be clinically divided into steroid-sensitive (SSNS) and steroid-resistant (SRNS) forms. SRNS regularly progresses to end-stage renal failure. By homozygosity mapping and whole exome sequencing, we here identify recessive mutations in Crumbs homolog 2 (CRB2) in four different families affected by SRNS. Previously, we established a requirement for zebrafish crb2b, a conserved regulator of epithelial polarity, in podocyte morphogenesis. By characterization of a loss-of-function mutation in zebrafish crb2b, we now show that zebrafish crb2b is required for podocyte foot process arborization, slit diaphragm formation, and proper nephrin trafficking. Furthermore, by complementation experiments in zebrafish, we demonstrate that CRB2 mutations result in loss of function and therefore constitute causative mutations leading to NS in humans. These results implicate defects in podocyte apico-basal polarity in the pathogenesis of NS.

16.
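Objective: To study mutations of the thyroid peroxidase gene (TPO) in Chinese children with congenital hypothyroidism (CH) and their inheritance pattern in families. Methods: 140 children with CH and some of their family members were enrolled, and DNA was extracted from peripheral blood. TPO mutations were detected by targeted sequencing: primers were designed to amplify every exon of TPO and the exon-intron boundaries, mutations were detected by next-generation sequencing and verified by Sanger (first-generation) sequencing, and the parents of two children carrying compound heterozygous TPO mutations were also tested by Sanger sequencing. Results: Among the 140 children with congenital hypothyroidism, 13 carried 12 different TPO mutations (R189Q, C269S, W428R, A430E, A433P, A489T, V748M, C756fs, E799D, G860R, P883S, and Q913fs). One was a hotspot mutation (C756fs, carried by 6 patients), and three were novel (C269S, A430E, and E799D). Conclusion: The TPO mutation rate is relatively high in Chinese children with congenital hypothyroidism, and the inheritance pattern is autosomal recessive.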

17.
Retinitis pigmentosa (RP) is a heterogeneous group of inherited retinal dystrophies characterised ultimately by the loss of photoreceptor cells. RP is the leading cause of visual loss in individuals younger than 60 years, with a prevalence of about 1 in 4000. The molecular genetic diagnosis of autosomal recessive RP (arRP) is challenging because of its large genetic and clinical heterogeneity. Traditional methods for sequencing arRP genes are often laborious and not easily available, and a screening technique that enables rapid detection of the genetic cause would be very helpful in clinical practice. The goal of this study was to develop and apply microarray-based resequencing technology capable of detecting both known and novel mutations on a single high-throughput platform. Hence, the coding regions and exon/intron boundaries of 16 arRP genes were resequenced using microarrays in 102 Spanish patients with a clinical diagnosis of arRP. All detected variations were confirmed by direct sequencing, and potential pathogenicity was assessed by functional predictions and frequency in controls. For validation purposes, 4 positive controls carrying previously identified changes were hybridized on the array. The screening detected 44 variants, of which 15, found in 14 arRP families (14%), are very likely pathogenic. Finally, the design of this array can easily be transformed into an equivalent diagnostic system based on targeted enrichment followed by next generation sequencing.

18.
Animal models provide an in vivo system to study gene function by transgenic and knockout approaches. Targeted knockout approaches have been very successful in mice, but are currently not feasible in zebrafish due to the inability to grow embryonic stem cells. As an alternative, a reverse genetic approach that utilizes screening by resequencing and/or TILLING (Targeting Induced Local Lesions IN Genomes) of mutagenized genomes has recently gained popularity in the zebrafish field. Spermatogonia of healthy males are mutagenized using ENU (N-ethyl-N-nitrosourea), and F1 progeny are collected by breeding treated males with healthy wild type females. Sperm and DNA banks are generated from F1 males. DNA is screened for ENU-induced mutations by sequencing or TILLING. These mutations can then be studied by in vitro fertilization (IVF) from the cryopreserved sperm of the corresponding F1 male followed by breeding to homozygosity. A high-throughput method of screening for rare heterozygotes and efficient recovery of mutant lines are important for identifying a large number of mutations using this approach. This article provides optimized protocols for resequencing and TILLING based on our experiences. We performed a pilot screen on 1235 F1 males by resequencing 54 exons from 17 genes and analyzed the sequencing data using multiple programs to maximize mutation detection with minimal false positive detection. As an alternative to sequencing, we developed protocols for TILLING by capillary electrophoresis using an ABI Genetic Analyzer 3100 platform followed by fragment analysis using the GeneScan and Genotyper software. PCR products generated by fluorescently labeled universal primers and tailed exon-specific primers were pooled 4-fold prior to heteroduplex formation. Overall, our pilot screen shows that a combination of TILLING and sequencing is optimal for achieving cost-effective, high-throughput screening of a large number of samples. Amplicons with fewer common SNPs are ideal for TILLING, whereas amplicons with multiple SNPs and in/del polymorphisms are best suited for sequencing followed by analysis with SNPdetector.

19.
Bowen ME  Henke K  Siegfried KR  Warman ML  Harris MP 《Genetics》2012,190(3):1017-1024
The generation and analysis of mutants in zebrafish has been instrumental in defining the genetic regulation of vertebrate development, physiology, and disease. However, identifying the genetic changes that underlie mutant phenotypes remains a significant bottleneck in the analysis of mutants. Whole-genome sequencing has recently emerged as a fast and efficient approach for identifying mutations in nonvertebrate model organisms. However, this approach has not been applied to zebrafish due to the complicating factors of having a large genome and lack of fully inbred lines. Here we provide a method for efficiently mapping and detecting mutations in zebrafish using these new parallel sequencing technologies. This method utilizes an extensive reference SNP database to define regions of homozygosity-by-descent by low coverage, whole-genome sequencing of pooled DNA from only a limited number of mutant F(2) fish. With this approach we mapped each of the five different zebrafish mutants we sequenced and identified likely causative nonsense mutations in two and candidate mutations in the remainder. Furthermore, we provide evidence that one of the identified mutations, a nonsense mutation in bmp1a, underlies the welded mutant phenotype.

20.
Hai Peng  Jing Zhang 《Biologia》2009,64(1):20-26
DNA sequences can be used for the analysis of genetic variation and gene function. The high-throughput sequencing techniques that have been developed over the past three years can read as many as one billion bases per run, and are far less expensive than the traditional Sanger sequencing method. High-throughput sequencing has therefore been applied extensively to genomic analyses, such as screening for mutations, construction of genomic methylation maps, and the study of DNA-protein interactions. Although they have only been available for a short period, high-throughput sequencing techniques are profoundly affecting many of the life sciences, and are opening up new avenues of research. With highly developed commercial high-throughput sequencing platforms, each laboratory has the opportunity to explore this research field. In this paper, we therefore focus on commercially popular high-throughput sequencing techniques and the ways in which they have been applied over the past three years.
