Similar literature
20 similar articles found.
1.
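Objective: To compare the microbiota analysis results obtained from the same specimens on two different next-generation sequencing platforms and their bioinformatics pipelines. Methods: In a single laboratory, and using the same bioinformatics analysis protocol, 16S rRNA amplicon sequencing was performed on 56 (28 pairs of) mother-infant fecal samples on two different sequencing platforms (Ion Torrent S5-xl and Illumina HiSeq 2500). Correlation analysis, principal component analysis (PCA), principal coordinate analysis (PCoA), and MRPP analysis were used to compare the microbial community structures produced by the two platforms. Results: For alpha diversity, all indices except the Shannon index differed significantly between platforms: Chao1 (t=1.96, P=0.0011), Observed species (t=2.13, P<0.0010), PD_whole tree (t=2.07, P<0.0010), Simpson (t=1.87, P=0.0031), and Good coverage (t=2.32, P<0.0010). Analysis of relative abundance at different taxonomic levels showed that the more bacterial taxa were sequenced and annotated, the lower the correlation between the relative abundances from the two platforms. PCA showed that more than 87% of the samples clustered together. PCoA divided the 56 samples from the two platforms into two clusters, with an overlap rate of 71.43% between platforms. MRPP showed statistically significant differences between the microbiota data of the two platforms at the family and genus levels (A=0.0941, P=0.0010; A=0.0852, P=0.0021). When sample source was taken into account, at the family level the maternal group showed no evident platform difference (A=0.0063, P=0.1491), whereas the neonatal group differed significantly (A=0.0352, P=0.0061); at the genus level, both the maternal and infant groups showed significant platform differences (A=0.0216, P=0.0042; A=0.0981, P=0.0010). Conclusion: When the same samples undergo amplicon sequencing on different platforms, the relative abundances of the community structure are broadly similar, but diversity and correlation still differ considerably. For the reproducibility and reliability of cohort study data, using the same sequencing platform and analysis pipeline is recommended to reduce bias in microbiota analysis.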
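The Shannon and Simpson indices compared above can be computed from a vector of per-taxon read counts. The following is a minimal sketch, not the pipeline used in the study: the function names are hypothetical, the Shannon index uses the natural logarithm (some tools use log base 2), and the Simpson index is taken as 1 minus the sum of squared proportions.

```python
import math
from typing import Sequence


def relative_abundances(counts: Sequence[int]) -> list[float]:
    """Convert raw counts to proportions, dropping taxa with zero reads."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [c / total for c in counts if c > 0]


def shannon_index(counts: Sequence[int]) -> float:
    """Shannon diversity H = -sum(p * ln p); natural log is an assumption."""
    return -sum(p * math.log(p) for p in relative_abundances(counts))


def simpson_index(counts: Sequence[int]) -> float:
    """Simpson diversity in the 1 - sum(p^2) form."""
    return 1.0 - sum(p * p for p in relative_abundances(counts))
```

A perfectly even community of four taxa gives a Shannon index of ln 4 and a Simpson index of 0.75, which is a convenient sanity check before comparing platforms.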

2.
After analyzing the base composition asymmetry of coding regions in vertebrate mitochondria, we identified 2 fishes, Albula glossodonta and Bathygadus antrodes, with inverted compositional patterns. Both species appear to have an unusual control region (CR), and in B. antrodes, it has switched from the light strand to the heavy strand. To our knowledge, this is the first report in vertebrates of inverted mitochondrial replication, caused by an inversion of the CR. These findings support the strand-asymmetric model of mtDNA replication and suggest that vertebrate mtDNA can tolerate globally reversed mutational pressures. In addition, we propose that nucleotide bias is not strand specific but that it depends on the location of the CR.
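Base composition asymmetry is commonly summarized as GC skew and AT skew. The sketch below assumes that convention; the abstract does not state which measure the authors used, and the function names are hypothetical. It also shows why an inverted pattern is informative: reading the complementary strand flips the sign of the skew.

```python
def skew(seq: str, a: str, b: str) -> float:
    """Return (count(a) - count(b)) / (count(a) + count(b)) for one strand."""
    seq = seq.upper()
    n_a, n_b = seq.count(a), seq.count(b)
    if n_a + n_b == 0:
        raise ValueError(f"sequence contains neither {a} nor {b}")
    return (n_a - n_b) / (n_a + n_b)


def gc_skew(seq: str) -> float:
    return skew(seq, "G", "C")


def at_skew(seq: str) -> float:
    return skew(seq, "A", "T")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]
```

For example, `gc_skew("GGGC")` is 0.5, while the same call on its reverse complement gives -0.5.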

3.
The Chinese indigenous pig breeds in the Taihu Lake region are the most prolific pig breeds in the world. In this study, we investigated the genetic diversity and population structure of six breeds in this region, Meishan, Erhualian, Mi, Fengjing, Shawutou and Jiaxing Black, using whole‐genome SNP data. The Chinese indigenous pigs in the Taihu Lake region exhibited a high level of SNP polymorphism, with proportions of polymorphic markers ranging from 0.925 to 0.995. The allelic richness and expected heterozygosity were also calculated and indicated that the genetic diversity of the Meishan breed was the greatest, whereas that of the Fengjing breed was the lowest. The genetic differentiation, as indicated by the fixation index, exhibited an overall mean of 0.149. Both neighbor‐joining tree and principal components analysis were able to distinguish the breeds from each other, but structure analysis indicated that the Mi and Erhualian breeds exhibited similar major signals of admixture. With this genome‐wide comprehensive survey of the genetic diversity and population structure of the indigenous Chinese pigs in the Taihu Lake region, we confirmed the rationality of the current breed classification of the pigs in this region.

4.
Noncanonical microRNAs (miRNAs) and endogenous small interfering RNAs (endo-siRNAs) are distinct subclasses of small RNAs that bypass the DGCR8/DROSHA Microprocessor but still require DICER1 for their biogenesis. What role, if any, they have in mammals remains unknown. To identify potential functional properties for these subclasses, we compared the phenotypes resulting from conditional deletion of Dgcr8 versus Dicer1 in post-mitotic neurons. The loss of Dicer1 resulted in an earlier lethality, more severe structural abnormalities, and increased apoptosis relative to that from Dgcr8 loss. Deep sequencing of small RNAs from the hippocampus and cortex of the conditional knockouts and control littermates identified multiple noncanonical microRNAs that were expressed at high levels in the brain relative to other tissues, including mirtrons and H/ACA snoRNA-derived small RNAs. In contrast, we found no evidence for endo-siRNAs in the brain. Taken together, our findings provide evidence for a diverse population of highly expressed noncanonical miRNAs that together are likely to play important functional roles in post-mitotic neurons.

5.
Owing to rapid advancements in next generation sequencing (NGS), genomic alteration is now considered an essential predictive biomarker that affects treatment decisions in many cases of cancer. Among the various predictive biomarkers, tumor mutation burden (TMB) was identified by NGS and was considered to be useful in predicting clinical response in cancer cases treated by immunotherapy. In this study, we directly compared the lab-developed-test (LDT) results from a target sequencing panel, K-MASTER panel v3.0, with whole-exome sequencing (WES) to evaluate the concordance of TMB. As an initial step, reference materials (n = 3) with known TMB status were used as an exploratory test. To validate and evaluate TMB, we used one hundred samples acquired from surgically resected tissues of non-small cell lung cancer (NSCLC) patients. The TMB of each sample was tested by both the LDT and WES methods, with DNA extracted from the samples at the same time. In addition, we evaluated the impact of capture region, which might lead to different values of TMB; the evaluation of capture region was based on the size of NGS and target sequencing panels. In this pilot study, TMB was evaluated by LDT and WES using duplicated reference samples; the results showed a high concordance rate (R² = 0.887). This was also reflected in clinical samples (n = 100), which showed an R² of 0.71. The difference between the coding sequence ratio (3.49%) and the ratio of mutations (4.8%) indicated that the LDT panel identified a relatively higher number of mutations. It was feasible to calculate TMB with the LDT panel, which can be useful in clinical practice. Furthermore, a customized approach must be developed for calculating TMB, which differs according to cancer types and specific clinical settings.
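The dependence of TMB on capture region size, and the concordance statistic quoted above, can be illustrated with a short sketch. It assumes that TMB is expressed as mutations per megabase of covered sequence and that R² is the squared Pearson correlation (equivalent to the coefficient of determination for a simple linear fit). The function names are hypothetical and no data from the study are used.

```python
from typing import Sequence


def tmb_per_mb(n_mutations: int, covered_bases: int) -> float:
    """Tumor mutation burden as mutations per megabase of covered sequence."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return n_mutations / (covered_bases / 1_000_000)


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between paired measurements."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("R^2 is undefined when one variable is constant")
    return s_xy * s_xy / (s_xx * s_yy)
```

Because the denominator is the covered length, the same mutation count yields a higher TMB on a small panel than on a whole exome, which is why panel and WES values are compared by correlation rather than by raw counts.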

6.
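Sepsis is a severe, life-threatening infection, and accurate, rapid etiological diagnosis helps clinicians optimize the use of antimicrobial drugs. Culture-based methods remain the mainstay of etiological diagnosis in sepsis, but they are time-consuming and have low sensitivity, drawbacks that cannot be ignored. In recent years several culture-independent diagnostic methods have emerged, among which those based on the polymerase chain reaction (PCR) are relatively mature. However, PCR can detect only known, specific pathogens; clinical quantitative PCR is used only for viruses and a small number of bacteria, and pathogen PCR in sepsis is mostly qualitative. Next-generation sequencing is now maturing and entering clinical use, and has become a powerful tool for etiological diagnosis. Compared with traditional methods such as blood culture, it is rapid, non-selective, and capable of quantitative or semi-quantitative analysis. At present, next-generation sequencing still has shortcomings, including the lack of accepted interpretation criteria, an unclear relationship between sequencing results and treatment, and difficulty in detecting resistance genes. Large-scale studies comparing next-generation sequencing with traditional diagnostic methods are also lacking, and support from higher-level evidence-based medicine is still needed.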

7.
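Using second-generation sequencing data to discover genomic mutations in cancer cells has long been an important applied scientific problem. This study used a large volume of data from one cancer patient to evaluate several existing tools for identifying genomic mutations. After comparing the methods and accuracy of the tools, we found that each has its own strengths and weaknesses. Based on these, we offer suggestions to help users choose a suitable tool.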

8.
MicroRNAs (miRNAs) have been implicated in key normal physiological functions, and altered expression of specific miRNAs has been associated with a number of diseases. It is of great interest to understand their roles, and a prerequisite for such study is the ability to comprehensively and accurately assess the levels of the entire repertoire of miRNAs in a given sample. It has been shown that some miRNAs frequently have sequence variations termed isomirs. To better understand the extent of miRNA sequence heterogeneity and its potential implications for miRNA function and measurement, we conducted a comprehensive survey of miRNA sequence variations from human and mouse samples using next generation sequencing platforms. Our results suggest that the process of generating this isomir spectrum might not be random and that heterogeneity at the ends of miRNA affects the consistency and accuracy of miRNA level measurement. In addition, we have constructed a database from our sequencing data that catalogs the entire repertoire of miRNA sequences (http://galas.systemsbiology.net/cgi-bin/isomir/find.pl). This enables users to determine the most abundant sequence and the degree of heterogeneity for each individual miRNA species. This information will be useful both to better understand the functions of isomirs and to improve probe or primer design for miRNA detection and measurement.

9.
Catla catla, the second most important Indian major carp, is gaining popularity among Indian fish farmers due to its high growth rate and consumer preferences. Simple sequence repeats (SSRs) are rapidly evolving, versatile, co-dominant and highly informative molecular markers used in genetic research. However, the time and cost involved in developing such resources have limited their extensive use. The advent of massively parallel sequencing technology has considerably eased these limitations. In the present investigation, we used the Ion Torrent sequencing platform to identify potentially amplifiable microsatellite loci for catla. A modest sequencing volume generated approximately 5.7 MB of sequence data. Out of 29,794 sequences generated, 21,477 contained simple sequence repeats. Only 81 sequences had enough flanking sequence for primer design. Out of 81 loci, 51 were successfully PCR amplified in a panel of five unrelated individuals. Out of 15 loci randomly checked for polymorphism, 13 loci were polymorphic, with allele numbers ranging from 3 to 6, and two loci were monomorphic. The observed and expected heterozygosities ranged from 0.565 to 0.870 and from 0.483 to 0.804, respectively. These markers will be useful for studying genetics of wild populations, breeding programs of C. catla and closely related species.
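Screening reads for simple sequence repeats can be sketched with a regular expression. This is an illustration only and not the tool used in the study: the motif length range (2 to 6 bp), the minimum of five repeat units, and the function name are all assumptions, and mononucleotide runs are skipped.

```python
import re


def find_ssrs(seq: str, min_repeats: int = 5) -> list[tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for perfect SSRs with 2-6 bp motifs.

    The lazy quantifier makes the regex prefer the shortest motif, so
    ACACACACAC is reported with motif AC rather than ACAC.
    """
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer run such as AAAAAAAAAA, not reported here
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits
```

A read such as `"TTG" + "AC" * 6 + "GGT"` yields one hit with motif AC and six repeats. Checking that enough flanking sequence remains on both sides for primer design would be a separate step.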

10.
Detection of antibodies in serum has many important applications. Our goal was to develop a facile general experimental approach for identifying antibody-specific peptide ligands that could be used as the reagents for antibody detection. Our emphasis was on an approach that would allow identification of peptide ligands for antibodies in serum without the need to isolate the target antibody or to know the identity of its antigen. We combined ribosome display (RD) with the analysis of peptide libraries by next generation sequencing (NGS) of their coding RNA to facilitate identification of antibody-specific peptide ligands from a random sequence peptide library. We first demonstrated, using purified antibodies, that with our approach, specific peptide ligands for antibodies with simple linear epitopes, as well as peptide mimotopes for antibodies recognizing complex epitopes, were readily identified. Inclusion of NGS analysis reduced the number of RD selection rounds that were required to identify specific ligands and facilitated discrimination between specific and spurious nonspecific sequences. We then used a model of human serum spiked with a known target antibody to develop NGS-based analysis that allowed identification of specific ligands for a target antibody in the context of an overwhelming amount of unrelated immunoglobulins present in serum.

11.

Background

The biological and clinical consequences of the tight interactions between host and microbiota are rapidly being unraveled by next generation sequencing technologies and sophisticated bioinformatics, also referred to as microbiota metagenomics. The recent success of metagenomics has created a demand to rapidly apply the technology to large case–control cohort studies and to studies of microbiota from various habitats, including habitats relatively poor in microbes. It is therefore of foremost importance to enable a robust and rapid quality assessment of metagenomic data from samples that challenge present technological limits (sample numbers and size). Here we demonstrate that the distribution of overlapping k-mers of metagenome sequence data predicts sequence quality as defined by gene distribution and efficiency of sequence mapping to a reference gene catalogue.

Results

We used serial dilutions of gut microbiota metagenomic datasets to generate well-defined high to low quality metagenomes. We also analyzed a collection of 52 microbiota-derived metagenomes. We demonstrate that k-mer distributions of metagenomic sequence data identify sequence contaminations, such as sequences derived from “empty” ligation products. Of note, k-mer distributions were also able to predict the frequency of sequences mapping to a reference gene catalogue not only for the well-defined serial dilution datasets, but also for 52 human gut microbiota derived metagenomic datasets.

Conclusions

We propose that k-mer analysis of raw metagenome sequence reads should be implemented as a first quality assessment prior to more extensive bioinformatics analysis, such as sequence filtering and gene mapping. With the rising demand for metagenomic analysis of microbiota it is crucial to provide tools for rapid and efficient decision making. This will eventually lead to a faster turn-around time, improved analytical quality including sample quality metrics and a significant cost reduction. Finally, improved quality assessment will have a major impact on the robustness of biological and clinical conclusions drawn from metagenomic studies.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1406-7) contains supplementary material, which is available to authorized users.
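The k-mer based quality check described in this entry can be sketched as follows. This is a simplified illustration, not the authors' method: the function names are hypothetical, k-mers containing N are skipped by assumption, and the share of the most frequent k-mers is used only as one possible indicator of contamination such as "empty" ligation products.

```python
from collections import Counter
from typing import Iterable


def kmer_counts(reads: Iterable[str], k: int) -> Counter:
    """Count all overlapping k-mers across reads, skipping k-mers with N."""
    counts: Counter = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def kmer_spectrum(counts: Counter) -> Counter:
    """Map multiplicity to the number of distinct k-mers seen that often."""
    return Counter(counts.values())


def top_kmer_fraction(counts: Counter, top: int = 1) -> float:
    """Fraction of all k-mer occurrences taken by the `top` most frequent k-mers."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no k-mers counted")
    return sum(c for _, c in counts.most_common(top)) / total
```

A dataset dominated by a few adapter-like sequences would show a large `top_kmer_fraction` and a spectrum with a long tail of very high multiplicities, which can be inspected before any mapping is attempted.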

12.

Background

RNA viruses have high mutation rates and exist within their hosts as large, complex and heterogeneous populations, comprising a spectrum of related but non-identical genome sequences. Next generation sequencing is revolutionising the study of viral populations by enabling the ultra deep sequencing of their genomes, and the subsequent identification of the full spectrum of variants within the population. Identification of low frequency variants is important for our understanding of mutational dynamics, disease progression, immune pressure, and for the detection of drug resistant or pathogenic mutations. However, the current challenge is to accurately model the errors in the sequence data and distinguish real viral variants, particularly those that exist at low frequency, from errors introduced during sequencing and sample processing, which can both be substantial.

Results

We have created a novel set of laboratory control samples that are derived from a plasmid containing a full-length viral genome with extremely limited diversity in the starting population. One sample was sequenced without PCR amplification whilst the other samples were subjected to increasing amounts of RT and PCR amplification prior to ultra-deep sequencing. This enabled the level of error introduced by the RT and PCR processes to be assessed and minimum frequency thresholds to be set for true viral variant identification. We developed a genome-scale computational model of the sample processing and NGS calling process to gain a detailed understanding of the errors at each step, which predicted that RT and PCR errors are more likely to occur at some genomic sites than others. The model can also be used to investigate whether the number of observed mutations at a given site of interest is greater than would be expected from processing errors alone in any NGS data set. After providing basic sample processing information and the site’s coverage and quality scores, the model utilises the fitted RT-PCR error distributions to simulate the number of mutations that would be observed from processing errors alone.

Conclusions

These data sets and models provide an effective means of separating true viral mutations from those erroneously introduced during sample processing and sequencing.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1456-x) contains supplementary material, which is available to authorized users.
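The question posed in this entry, whether the number of mutations observed at a site exceeds what processing errors alone would produce, can be approximated with a binomial tail test. This is a deliberately simplified stand-in: the authors simulate from fitted RT-PCR error distributions, whereas the sketch assumes a single, hypothetical per-base error rate and independent reads. Function names and the default significance level are assumptions.

```python
import math


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        # log of the binomial probability mass, stable for deep coverage
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def exceeds_error_background(coverage: int, observed: int,
                             error_rate: float, alpha: float = 0.01) -> bool:
    """True if `observed` mutant reads are unlikely under errors alone."""
    return binomial_tail(coverage, observed, error_rate) < alpha
```

With a coverage of 10,000 and an assumed error rate of 0.001, about 10 mutant reads are expected from errors, so 10 observed mutant reads would not be flagged while 100 would.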

13.
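This study introduces the basic bioinformatics methods and cutting-edge technologies for detecting genomic structural variation. It explains in detail the principles and characteristics of four detection methods based on second-generation sequencing (the read-pair, read-depth, split-read, and sequence assembly methods), and analyzes the characteristics and development trends of applying second-generation sequencing to structural variation detection. Finally, it introduces the application of newer technologies such as third-generation sequencing, Linked-reads, and optical mapping to the detection of genomic structural variation, and discusses the characteristics and advantages of detection methods that integrate these technologies.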

14.

Background

Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform’s sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects.

Results

Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics.

Conclusion

FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0366-2) contains supplementary material, which is available to authorized users.
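Since read quality often declines toward the end of a read, a basic form of quality trimming removes low-quality bases from the 3' end. The sketch below is not the FaQCs algorithm; it is a minimal illustration that assumes Phred+33 encoded quality strings, a hypothetical threshold of Q20, and hypothetical function names.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq: str, quality: str, min_q: int = 20,
                offset: int = 33) -> tuple[str, str]:
    """Remove trailing bases whose Phred score is below min_q."""
    if len(seq) != len(quality):
        raise ValueError("sequence and quality strings differ in length")
    scores = phred_scores(quality, offset)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality[:end]
```

For the read `ACGTAC` with qualities `IIII##`, the last two bases have a Phred score of 2 and are removed, leaving `ACGT`. Production tools typically add further steps such as length filtering and adapter or control-sequence removal.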

15.
16.
In the present study the whole bacterial community structure of Tapovan hot spring soil, located in the state of Uttarakhand, India, was analysed through next generation sequencing. The hot spring soil is slightly alkaline in nature with an abundance of sulphur. The spring soil was rich in various metallic and non-metallic elements required for bacterial survival. The community was found to comprise 14 bacterial phyla, with representation of members belonging to Firmicutes, Proteobacteria, Thermi, Bacteroidetes, Aquificae, Actinobacteria, Chloroflexi, TM7, Fusobacteria etc. At the genus level, Bacillus, Pseudomonas, Symbiobacterium, Thermus, Geobacillus and Anoxybacillus were found in abundance compared with other genera such as Flavobacterium, Ureibacillus, Clostridium, Meiothermus, Acinetobacter, Desulfotomaculum and Rheinheimera.

17.
Li Yan (李燕), Li Yaoyao (李垚垚). 《生物信息学》, 2015, 13(3): 186-191
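Several methods exist for detecting gene copy number variation based on different sequencing technologies, but their time complexity is high; the development of next-generation sequencing has opened a new field for research on copy number variation detection. Using simulation experiments and permutation tests, we designed a new copy number variation detection algorithm based on next-generation sequencing. Unlike other algorithms, it requires no reference sample and detects copy number variation by directly studying the aligned sequences and the relationship between reads and copy number. Experimental results show that, in terms of time complexity, it improves computing speed by more than 50%, which is of great significance for future studies of copy number and disease.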
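A permutation test on read depth within a single sample can be sketched as follows. This is not the published algorithm, whose details are not given in the abstract; it only illustrates the general idea of testing whether the depth in a candidate window differs from the rest of the same sample without a reference sample. The function name, the number of permutations, and the fixed random seed are assumptions.

```python
import random
from typing import Sequence


def permutation_pvalue(window: Sequence[float], background: Sequence[float],
                       n_perm: int = 2000, seed: int = 0) -> float:
    """Two-sided permutation p-value for a difference in mean read depth."""
    if not window or not background:
        raise ValueError("both groups must be non-empty")

    def mean(values: Sequence[float]) -> float:
        return sum(values) / len(values)

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(window) - mean(background))
    pooled = list(window) + list(background)
    k = len(window)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[:k]) - mean(pooled[k:])) >= observed:
            hits += 1
    # add-one correction keeps the estimate away from exactly zero
    return (hits + 1) / (n_perm + 1)
```

A window with roughly three times the background depth returns a very small p-value, whereas a window drawn from the same depth distribution as the background does not.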

18.
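Vaginal infectious diseases are the most common gynecological disorders in women of reproductive age; the main types are bacterial vaginosis and vulvovaginal candidiasis. Current clinical diagnosis relies mainly on microscopic observation of pathogens and detection of pathogen metabolites, but its low positive rate and low sensitivity make the choice of anti-infective drugs difficult and lead to low cure rates and high recurrence rates. This review compares the common clinical diagnostic methods for vaginal infectious diseases, especially bacterial vaginosis and vulvovaginal candidiasis, and then describes the application of high-throughput sequencing to pathogen identification and diagnosis in these two conditions, with the aim of providing an experimental basis for the diagnosis, treatment, and prognosis of vaginal infectious diseases.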

19.
Dicentrarchus labrax is one of the major marine aquaculture species in the European Union. In this study, we have developed a directed-sequencing strategy to sequence three sea bass chromosomes and compared results with other teleosts. Three BAC DNA pools were created from sea bass BAC clones that mapped to stickleback chromosomes/groups V, XVII and XXI. The pools were sequenced to 17-39x coverage by pyrosequencing. Data assembly was supported by Sanger reads and mate pair data and resulted in superscaffolds of 13.2 Mb, 17.5 Mb and 13.7 Mb respectively. Annotation features of the superscaffolds include 1477 genes. We analyzed size change of exon, intron and intergenic sequence between teleost species and deduced a simple model for the evolution of genome composition in the teleost lineage. Combination of second generation sequencing technologies, Sanger sequencing and genome partitioning strategies allows “high-quality draft assemblies” of chromosome-sized superscaffolds, which are crucial for the prediction and annotation of complete genes.

20.
Due to the presence of moisture and nutrients, brewery filling line surfaces are susceptible to unwanted microbial attachment. Knowledge of the attaching microbes will aid in designing hygienic control of the process. In this study the bacterial diversity present on brewery filling line surfaces was revealed by next generation sequencing. The two filling lines studied maintained their characteristic bacterial community throughout three sampling times (13–163 days). On the glass bottle line, γ-proteobacteria dominated (35–82% of all OTUs), whereas on the canning line α-, β- and γ-proteobacteria and actinobacteria were most common. The most frequently detected genera were Acinetobacter, Propionibacterium and Pseudomonas. The halophilic genus Halomonas was commonly detected, which might be due to its tolerance to alkaline foam cleaners. This study has revealed a detailed overall picture of the bacterial groups present on filling line surfaces. Further effort should be given to determine the efficacy of washing procedures on different bacterial groups.
