Similar literature
 20 similar articles found; search time 15 ms
1.
2.
MALINA is a web service for bioinformatic analysis of whole-genome metagenomic data obtained from human gut microbiota sequencing. As input data, it accepts metagenomic reads from various sequencing technologies, including long reads (such as Sanger and 454 sequencing) and next-generation reads (including SOLiD and Illumina). To the authors' knowledge, it is the first metagenomic web service capable of processing SOLiD color-space reads. The web service allows phylogenetic and functional profiling of metagenomic samples using the coverage depth resulting from the alignment of the reads to a catalogue of reference sequences, which is built into the pipeline and contains prevalent microbial genomes and genes of human gut microbiota. The obtained metagenomic composition vectors are processed by the statistical analysis and visualization module, which contains methods for clustering, dimension reduction and group comparison. Additionally, the MALINA database includes vectors of bacterial and functional composition for human gut microbiota samples from a large number of existing studies, namely datasets from the Russian Metagenome project, MetaHIT and the Human Microbiome Project (downloaded from http://hmpdacc.org), allowing their comparative analysis together with user samples. MALINA is made freely available on the web at http://malina.metagenome.ru. The website is implemented in JavaScript (using Ext JS), Microsoft .NET Framework, MS SQL and Python, and supports all major browsers.

3.
Next-generation sequencing (NGS) of PCR amplicons is a standard approach to detect genetic variations in personalized medicine such as cancer diagnostics. Computer programs used in the NGS community often miss insertions and deletions (indels) that constitute a large part of known human mutations. We have developed HeurAA, an open source, heuristic amplicon aligner program. We tested the program on simulated datasets as well as experimental data from multiplex sequencing of 40 amplicons in 12 oncogenes collected on a 454 Genome Sequencer from lung cancer cell lines. We found that HeurAA can accurately detect all indels, and is more than an order of magnitude faster than previous programs. HeurAA can compare reads and reference sequences up to several thousand base pairs in length, and it can evaluate data from complex mixtures containing reads of different gene segments from different samples. HeurAA is written in C and Perl for Linux operating systems; the code and the documentation are available for research applications at http://sourceforge.net/projects/heuraa/

4.
The Korean Native Chicken (KNC) is an important endemic biological resource in Korea. While numerous studies have been conducted exploring this breed, none have used next-generation sequencing to identify its specific genomic features. We sequenced five strains of KNC and identified 10.9 million SNVs and 1.3 million InDels. Through the analysis, we found that the highly variable region common to all five strains had genes like PCHD15, CISD1, PIK3C2A, and NUCB2 that might be related to the phenotypic traits of the chicken such as auditory sense, growth rate and egg traits. In addition, we assembled unaligned reads that could not be mapped to the reference genome. By assembling the unaligned reads, we were able to present genomic sequences characteristic of the KNC. Based on this, we also identified genes related to olfactory receptors and antigens that are common to all five strains. Finally, through the reconstructed mitochondrial genome sequences, we performed phylogenomic analysis and elucidated the maternal origin of the artificially restored KNC. Our results revealed that the KNC has multiple maternal origins, which is in agreement with Korea's history of chicken breed imports. The results presented here provide a valuable basis for future research on genomic features of KNC and further understanding of KNC's origin.

5.
The emergence of benchtop sequencers has made clinical genetic testing using next-generation sequencing more feasible. Ion Torrent's PGM™ is one such benchtop sequencer that shows clinical promise in detecting single nucleotide variations (SNVs) and microindel variations (indels). However, the large number of false positive indels caused by the high frequency of homopolymer sequencing errors has impeded the use of PGM™ for clinical genetic testing. An extensive analysis was done of PGM™ data from the sequencing reads of the well-characterized genome of the Escherichia coli DH10B strain and of sequences of the BRCA1 and BRCA2 genes from six germline samples. Three commonly used variant detection tools, SAMtools, Dindel, and GATK's Unified Genotyper, all had substantial false positive rates for indels. By incorporating filters on two major measures we could dramatically reduce false positive rates without sacrificing sensitivity. The two measures were B-Allele Frequency (BAF) and VARiation of the Width of gaps and inserts (VARW) per indel position. A BAF threshold applied to indels detected by UnifiedGenotyper removed ∼99% of the indel errors detected in both the DH10B and BRCA sequences. The optimum BAF threshold for BRCA sequences was determined by requiring 100% detection sensitivity and minimum false discovery rate, using variants detected from Sanger sequencing as reference. This resulted in 15 indel errors remaining, of which 7 were removed by selecting a VARW threshold of zero. VARW-specific errors increased in frequency with higher read depth in the BRCA datasets, suggesting that homopolymer-associated indel errors cannot be reduced by increasing the depth of coverage. Thus, using a VARW threshold is likely to be important in reducing indel errors in data with higher coverage.
In conclusion, BAF and VARW thresholds provide simple and effective filtering criteria that can improve the specificity of indel detection in PGM™ data without compromising sensitivity.
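The two-threshold filter described in this abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so everything below is an assumption made for illustration: BAF is taken as the fraction of covering reads that support the indel, VARW as the population variance of the indel widths seen in the supporting reads, and the `min_baf` default is a hypothetical placeholder rather than the optimum threshold the authors determined. The names `IndelCall`, `baf`, `varw` and `passes_filters` are likewise hypothetical.

```python
from dataclasses import dataclass
from statistics import pvariance
from typing import List


@dataclass
class IndelCall:
    position: int
    alt_reads: int      # reads supporting the indel
    total_reads: int    # reads covering the position
    widths: List[int]   # indel width observed in each supporting read


def baf(call: IndelCall) -> float:
    """Assumed definition: share of covering reads that carry the indel."""
    return call.alt_reads / call.total_reads if call.total_reads else 0.0


def varw(call: IndelCall) -> float:
    """Assumed definition: variance of the indel widths at this position."""
    return float(pvariance(call.widths)) if call.widths else 0.0


def passes_filters(call: IndelCall, min_baf: float = 0.3, max_varw: float = 0.0) -> bool:
    """Keep a call only if it clears both thresholds (min_baf is a placeholder)."""
    return baf(call) >= min_baf and varw(call) <= max_varw


calls = [
    IndelCall(101, 45, 100, [1] * 45),           # consistent width, well supported
    IndelCall(202, 5, 100, [1] * 5),             # too few supporting reads
    IndelCall(303, 50, 100, [1, 2] * 25),        # widths disagree between reads
]
kept = [c.position for c in calls if passes_filters(c)]
```

With a VARW threshold of zero, a call survives only when every supporting read reports the same indel width, which is one plausible reading of why such a filter would target homopolymer-length errors.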

6.
GWASs have identified numerous genetic variants associated with a wide variety of diseases, yet despite the wide availability of genetic testing, the insights that would enhance the interpretability of these results are not widely available to members of the public. As a proof of concept and demonstration of technological feasibility, we developed PAGEANT (Personal Access to Genome & Analysis of Natural Traits), available with a graphical user interface or as a command-line version, aiming to serve as a protocol and prototype that guides the overarching design of genetic reporting tools. PAGEANT is structured across five core modules, summarized by five Qs: (i) quality assurance of the genetic data; (ii) qualitative assessment of genetic characteristics; (iii) quantitative assessment of health risk susceptibility based on polygenic risk scores and population reference; (iv) query of third-party variant databases (e.g. ClinVAR and PharmGKB) and (v) Quick Response (QR) code of genetic variants of interest. A literature review was conducted to compare PAGEANT with academic and industry tools. For 2504 genomes made publicly available through the 1000 Genomes Project, we derived their genomic characteristics for a suite of qualitative and quantitative traits. One exemplary trait is susceptibility to COVID-19, based on the most up-to-date scientific findings reported.
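The quantitative module in (iii) rests on polygenic risk scores read against a population reference. As a rough sketch of that general idea, and not of PAGEANT's own code, the example below uses the common additive form (sum of per-variant effect weight times allele dosage) and places the score within a list of reference scores. The function names, variant identifiers and weights are all hypothetical.

```python
from bisect import bisect_right
from typing import Dict, List


def polygenic_risk_score(dosages: Dict[str, int], weights: Dict[str, float]) -> float:
    """Additive score: effect weight times allele dosage (0, 1 or 2), summed over scored variants."""
    return sum(weights[v] * d for v, d in dosages.items() if v in weights)


def percentile_in_reference(score: float, reference_scores: List[float]) -> float:
    """Percentage of reference individuals whose score is at or below the given score."""
    ranked = sorted(reference_scores)
    return 100.0 * bisect_right(ranked, score) / len(ranked)


# Hypothetical individual and hypothetical effect weights.
dosages = {"rsA": 2, "rsB": 1, "rsC": 0}
weights = {"rsA": 0.5, "rsB": -0.25}

score = polygenic_risk_score(dosages, weights)
rank = percentile_in_reference(score, [-1.0, 0.0, 1.0, 2.0])
```

Variants without a weight (here `rsC`) are simply skipped, which mirrors the usual practice of scoring only the variants present in the published weight file.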

7.
The warfare among microbial species as well as between pathogens and hosts is fierce, complicated, and continuous. In Pseudomonas aeruginosa, the muramidase effector Tse3 (Type VI secretion exported 3) can be injected into the periplasm of neighboring bacterial competitors by a Type VI secretion apparatus, eventually leading to cell lysis and death. However, P. aeruginosa protects itself from lysis by expressing the immunity protein Tsi3 (Type six secretion immunity 3). Here, we report the crystal structure of the Tse3-Tsi3 complex at 1.8 Å resolution, revealing that Tse3 possesses one open, accessible, goose-type lysozyme-like domain with peptidoglycan hydrolysis activity. Calcium ions bind specifically in the Tse3 active site and are shown to be crucial for its bacteriolytic activity. In combination with biochemical studies, the structural basis of the self-protection mechanism of Tsi3 is also elucidated, thus providing new insights into the effectors of the Type VI secretion system.

8.
In most bacteria, two tRNAs decode the four arginine CGN codons. One tRNA harboring a wobble inosine (tRNAArgICG) reads the CGU, CGC and CGA codons, whereas a second tRNA harboring a wobble cytidine (tRNAArgCCG) reads the remaining CGG codon. The reduced genomes of Mycoplasmas and other Mollicutes lack the gene encoding tRNAArgCCG. This raises the question of how these organisms decode CGG codons. Examination of 36 Mollicute genomes for genes encoding tRNAArg and the TadA enzyme, responsible for wobble inosine formation, suggested an evolutionary scenario where tadA gene mutations first occurred. This allowed the temporary accumulation of non-deaminated tRNAArgACG, capable of reading all CGN codons. This hypothesis was verified in Mycoplasma capricolum, which contains a small fraction of tRNAArgACG with a non-deaminated wobble adenosine. Subsets of Mollicutes continued to evolve by losing both the mutated tRNAArgCCG and tadA, and then acquired a new tRNAArgUCG. This permitted further tRNAArgACG mutations with tRNAArgGCG or its disappearance, leaving a single tRNAArgUCG to decode the four CGN codons. The key point of our model is that the A-to-I deamination activity had to be controlled before the loss of the tadA gene, allowing the stepwise evolution of Mollicutes toward an alternative decoding strategy.

9.

Background

Metagenomics can reveal the vast majority of microbes that have been missed by traditional cultivation-based methods. Due to its extremely wide range of application areas, fast metagenome sequencing simulation systems with high fidelity are in great demand to facilitate the development and comparison of metagenomics analysis tools.

Results

We present here a customizable metagenome simulation system: NeSSM (Next-generation Sequencing Simulator for Metagenomics). Combining the complete genomes currently available, a community composition table, and sequencing parameters, it can simulate metagenome sequencing better than existing systems. Sequencing error models based on the explicit distribution of errors at each base and sequencing coverage bias are incorporated in the simulation. To improve the fidelity of simulation, NeSSM provides tools to estimate the sequencing error models, sequencing coverage bias and the community composition directly from existing metagenome sequencing data. Currently, NeSSM supports single-end and paired-end sequencing for both 454 and Illumina platforms. In addition, a GPU (graphics processing unit) version of NeSSM was also developed to accelerate the simulation. By comparing the simulated sequencing data from NeSSM with experimental metagenome sequencing data, we have demonstrated that NeSSM performs better in many aspects than existing popular metagenome simulators, such as MetaSim, GemSIM and Grinder. The GPU version of NeSSM is more than one order of magnitude faster than MetaSim.

Conclusions

NeSSM is a fast simulation system for high-throughput metagenome sequencing. It can help to develop tools and evaluate strategies for metagenomics analysis, and it is freely available for academic users at http://cbb.sjtu.edu.cn/~ccwei/pub/software/NeSSM.php.
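To make the inputs named in this abstract concrete (genomes, a community composition table, sequencing parameters), here is a deliberately simplified single-end read simulator. It is a sketch under stated assumptions and not NeSSM's algorithm: it applies one uniform substitution rate instead of per-base error distributions, ignores coverage bias, and assumes reads are drawn from each genome in proportion to abundance times genome length. The function name `simulate_reads` and all parameter names are hypothetical.

```python
import random
from typing import Dict, List, Tuple


def simulate_reads(
    genomes: Dict[str, str],
    abundance: Dict[str, float],
    n_reads: int,
    read_len: int,
    error_rate: float,
    seed: int = 0,
) -> List[Tuple[str, int, str]]:
    """Return (genome name, start position, read sequence) tuples."""
    rng = random.Random(seed)
    names = sorted(genomes)
    # Assumption: sampling weight is relative abundance times genome length.
    weights = [abundance[n] * len(genomes[n]) for n in names]
    reads = []
    for _ in range(n_reads):
        name = rng.choices(names, weights=weights, k=1)[0]
        seq = genomes[name]
        start = rng.randrange(len(seq) - read_len + 1)
        bases = list(seq[start:start + read_len])
        for i, base in enumerate(bases):
            # Uniform substitution errors only; a simplification of real error models.
            if rng.random() < error_rate:
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append((name, start, "".join(bases)))
    return reads


# Toy genomes and composition table, purely for illustration.
toy_genomes = {"g1": "ACGT" * 10, "g2": "TTGCA" * 10}
toy_abundance = {"g1": 0.9, "g2": 0.1}
clean = simulate_reads(toy_genomes, toy_abundance, n_reads=50, read_len=20, error_rate=0.0)
noisy = simulate_reads(toy_genomes, toy_abundance, n_reads=50, read_len=20, error_rate=1.0)
```

Fixing the seed makes runs reproducible, which matters when simulated data are used to compare analysis tools.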

10.
Clustered regularly interspaced short palindromic repeat (CRISPR)-associated (Cas) systems in bacteria and archaea target foreign elements, such as bacteriophages and conjugative plasmids, through the incorporation of short sequences (termed spacers) from the foreign element into the CRISPR array, thereby allowing sequence-specific targeting of the invader. Thus, CRISPR-Cas systems are typically considered a microbial adaptive immune system. While many of these incorporated spacers match targets on bacteriophages and plasmids, a noticeable number are derived from chromosomal DNA. While usually lethal to the self-targeting bacteria, in certain circumstances, these self-targeting spacers can have profound effects in regard to microbial biology, including functions beyond adaptive immunity. In this minireview, we discuss recent studies that focus on the functions and consequences of CRISPR-Cas self-targeting, including reshaping of the host population, group behavior modification, and the potential applications of CRISPR-Cas self-targeting as a tool in microbial biotechnology. Understanding the effects of CRISPR-Cas self-targeting is vital to fully understanding the spectrum of function of these systems.

11.
MetaMetaDB (http://mmdb.aori.u-tokyo.ac.jp/) is a database and analytic system for investigating microbial habitability, i.e., how a prokaryotic group can inhabit different environments. The interaction between prokaryotes and the environment is a key issue in microbiology because distinct prokaryotic communities maintain distinct ecosystems. Because 16S ribosomal RNA (rRNA) sequences play pivotal roles in identifying prokaryotic species, a system that comprehensively links diverse environments to 16S rRNA sequences of the inhabitant prokaryotes is necessary for the systematic understanding of microbial habitability. However, existing databases are biased toward culturable prokaryotes and are limited in the comprehensiveness of their data because most prokaryotes are unculturable. Recently, metagenomic and 16S rRNA amplicon sequencing approaches have generated abundant 16S rRNA sequence data that encompass unculturable prokaryotes across diverse environments; however, these data are usually buried in large databases and are difficult to access. In this study, we developed MetaMetaDB (Meta-Metagenomic DataBase), which comprehensively and compactly covers 16S rRNA sequences retrieved from public datasets. Using MetaMetaDB, users can quickly generate hypotheses regarding the types of environments a prokaryotic group may be adapted to. We anticipate that MetaMetaDB will improve our understanding of the diversity and evolution of prokaryotes.

12.
Triggering receptor expressed on myeloid cells-2 (TREM-2) is rapidly emerging as a key regulator of the innate immune response via its regulation of macrophage inflammatory responses. Here we demonstrate that proximal TREM-2 signaling parallels other DAP12-based receptor systems in its use of Syk and Src-family kinases. However, we find that the linker for activation of T cells (LAT) is severely reduced as monocytes differentiate into macrophages and that TREM-2 exclusively uses the linker for activation of B cells (LAB, encoded by the gene Lat2) to mediate downstream signaling. LAB is required for TREM-2-mediated activation of Erk1/2 and dampens proximal TREM-2 signals through a novel LAT-independent mechanism resulting in macrophages with proinflammatory properties. Thus, Lat2−/− macrophages have increased TREM-2-induced proximal phosphorylation, and lipopolysaccharide stimulation of these cells leads to increased interleukin-10 (IL-10) and decreased IL-12p40 production relative to wild-type cells. Together these data identify LAB as a critical, LAT-independent regulator of TREM-2 signaling and macrophage development capable of controlling subsequent inflammatory responses.

13.
14.
Telomere shortening can cause detrimental diseases and contribute to aging. It occurs due to the end replication problem in cells lacking telomerase. Furthermore, recent studies revealed that telomere shortening can be attributed to difficulties of the semi-conservative DNA replication machinery in replicating the bulk of telomeric DNA repeats. To investigate telomere replication in a comprehensive manner, we developed QTIP-iPOND (Quantitative Telomeric chromatin Isolation Protocol followed by isolation of Proteins On Nascent DNA), which enables purification of proteins that associate with telomeres specifically during replication. In addition to the core replisome, we identify a large number of proteins that specifically associate with telomere replication forks. Depletion of several of these proteins induces telomere fragility, validating their importance for telomere replication. We also find that at telomere replication forks the single-strand telomere binding protein POT1 is depleted, whereas histone H1 is enriched. Our work reveals the dynamic changes of the telomeric proteome during replication, providing a valuable resource of telomere replication proteins. To our knowledge, this is the first study that examines the replisome at a specific region of the genome.

15.
Rapid and accurate strain identification is paramount in the battle against microbial outbreaks, and several subtyping approaches have been developed. One such method uses clustered regularly interspaced short palindromic repeats (CRISPRs), DNA repeat elements that are present in approximately half of all bacteria. Though their signature function is as an adaptive immune system against invading DNA such as bacteriophages and plasmids, CRISPRs also provide an excellent framework for pathogen tracking and evolutionary studies. Analysis of the spacer DNA sequences that reside between the repeats has been tremendously useful for bacterial subtyping during molecular epidemiological investigations. Subtyping, or strain identification, using CRISPRs has been employed in diverse Gram-positive and Gram-negative bacteria, including Mycobacterium tuberculosis, Salmonella enterica, and the plant pathogen Erwinia amylovora. This review discusses the several ways in which CRISPR sequences are exploited for subtyping. These include the well-established spoligotyping methodologies that have been used for two decades to type Mycobacterium species, as well as in-depth consideration of newer, higher-throughput CRISPR-based protocols.

16.
High-throughput sequencing techniques are becoming attractive to molecular biologists and ecologists as they provide a time- and cost-effective way to explore diversity patterns in environmental samples at an unprecedented resolution. An issue common to many studies is the definition of what fractions of a data set should be considered as rare or dominant. Yet this question has not been satisfactorily addressed, nor has the impact of such a definition on data set structure and interpretation been fully evaluated. Here we propose a strategy, MultiCoLA (Multivariate Cutoff Level Analysis), to systematically assess the impact of various abundance or rarity cutoff levels on the resulting data set structure and on the consistency of the further ecological interpretation. We applied MultiCoLA to a 454 massively parallel tag sequencing data set of V6 ribosomal sequences from marine microbes in temperate coastal sands. Consistent ecological patterns were maintained after removing up to 35–40% rare sequences, and similar patterns of beta diversity were observed after denoising the data set by using a preclustering algorithm of 454 flowgrams. This example validates the importance of exploring the impact of the definition of rarity in large community data sets. Future applications can be foreseen for data sets from different types of habitats, e.g. other marine environments, soil and human microbiota.
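The cutoff idea in this abstract can be sketched in a few lines: remove the rarest sequence types up to a chosen share of all sequences, then check how much the between-sample dissimilarities change. This is an illustrative approximation, not the MultiCoLA implementation. The use of Bray-Curtis dissimilarity, the rule that rarity is ranked by total abundance across samples, and the names `drop_rare`, `bray_curtis` and `pairwise` are all assumptions.

```python
from itertools import combinations
from typing import Dict, List

Table = Dict[str, Dict[str, int]]  # sample -> sequence type -> count


def drop_rare(table: Table, fraction: float) -> Table:
    """Remove the rarest types whose cumulative share of all sequences stays within `fraction`."""
    totals: Dict[str, int] = {}
    for counts in table.values():
        for otu, n in counts.items():
            totals[otu] = totals.get(otu, 0) + n
    budget = fraction * sum(totals.values())
    removed, running = set(), 0
    # Walk from rarest to most abundant; ties are broken by name for determinism.
    for otu, n in sorted(totals.items(), key=lambda kv: (kv[1], kv[0])):
        if running + n > budget:
            break
        running += n
        removed.add(otu)
    return {s: {o: n for o, n in c.items() if o not in removed} for s, c in table.items()}


def bray_curtis(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Bray-Curtis dissimilarity between two count profiles."""
    diff = sum(abs(a.get(o, 0) - b.get(o, 0)) for o in set(a) | set(b))
    total = sum(a.values()) + sum(b.values())
    return diff / total if total else 0.0


def pairwise(table: Table) -> List[float]:
    """Dissimilarities for all sample pairs, in sorted sample order."""
    return [bray_curtis(table[s], table[t]) for s, t in combinations(sorted(table), 2)]


# Toy table, purely for illustration.
toy = {"A": {"x": 50, "y": 45, "z": 5}, "B": {"x": 10, "y": 85, "w": 5}}
trimmed = drop_rare(toy, 0.1)
before, after = pairwise(toy), pairwise(trimmed)
```

Repeating the comparison over a range of `fraction` values gives a simple picture of how sensitive the community pattern is to the rarity cutoff.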

17.

Background

The Schistosoma mansoni Venom-Allergen-Like proteins (SmVALs) are members of the SCP/TAPS (Sperm-coating protein/Tpx-1/Ag5/PR-1/Sc7) protein superfamily, which may be important in the host-pathogen interaction. Some of these molecules were suggested by us and others as potential immunomodulators and vaccine candidates, due to their functional classification, expression profile and predicted localization. From a vaccine perspective, one of the concerns is the potential allergic effect of these molecules.

Methodology/Principal Findings

Herein, we characterized the putative secreted proteins SmVAL4 and SmVAL26 and explored the mouse model of airway inflammation to investigate their potential allergenic properties. The respective recombinant proteins were obtained in the Pichia pastoris system and the purified proteins were used to produce specific antibodies. SmVAL4 protein was found only in the cercarial stage, increasing from 0–6 h in the secretions of newly transformed schistosomula. SmVAL26 was identified only in the egg stage, mainly in the hatched eggs' fluid and also in the secretions of cultured eggs. In the mouse model of airway inflammation, SmVAL4 induced a significant increase in total cells in the bronchoalveolar lavage fluid, mostly due to an increase in eosinophils and macrophages, which correlated with increases in IgG1, IgE and IL-5, characterizing a typical allergic airway inflammation response. High titers of anaphylactic IgG1 were revealed by the Passive Cutaneous Anaphylactic (PCA) hypersensitivity assay. Additionally, in a more conventional protocol of immunization for vaccine trials, rSmVAL4 still induced high levels of IgG1 and IgE.

Conclusions

Our results suggest that members of the SmVAL family do present allergic properties; however, this varies significantly and therefore should be considered in the design of a schistosomiasis vaccine. Additionally, the murine model of airway inflammation proved to be useful in the investigation of allergic properties of potential vaccine candidates.

18.
19.
20.