Similar literature
 20 similar documents found (search time: 34 ms)
1.

Background

Microbial communities of traditional cheeses are complex and insufficiently characterized. The origin, safety and functional role in cheese making of these microbial communities are still not well understood. Metagenomic analysis of these communities by high throughput shotgun sequencing is a promising approach to characterize their genomic and functional profiles. Such analyses, however, critically depend on the availability of appropriate reference genome databases against which the sequencing reads can be aligned.

Results

We built a reference genome catalog suitable for short read metagenomic analysis using a low-cost sequencing strategy. We selected 142 bacteria isolated from dairy products, belonging to 137 different species and 67 genera, and succeeded in reconstructing the draft genomes of 117 of them at a standard or high quality level, including isolates from the genera Kluyvera, Luteococcus and Marinilactibacillus, which are still missing from public databases. To demonstrate the potential of this catalog, we analysed the microbial composition of the surface of two smear cheeses and one blue-veined cheese, and showed that a significant part of the microbiota of these traditional cheeses was composed of microorganisms newly sequenced in our study.

Conclusions

Our study provides data which, combined with publicly available genome references, represent the most expansive catalog to date of cheese-associated bacteria. Using this extended dairy catalog, we revealed the presence in traditional cheese of dominant microorganisms not deliberately inoculated, mainly Gram-negative bacteria such as Pseudoalteromonas haloplanktis or Psychrobacter immobilis, that may contribute to the characteristics of cheese produced through traditional methods.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1101) contains supplementary material, which is available to authorized users.

2.

Background

Genotype imputation is commonly used in genetic association studies to test untyped variants using information on linkage disequilibrium (LD) with typed markers. Imputing genotypes requires a suitable reference population in which the LD pattern is known, most often one selected from HapMap. However, some populations, such as American Indians, are not represented in HapMap. In the present study, we assessed accuracy of imputation using HapMap reference populations in a genome-wide association study in Pima Indians.

Results

Data from six randomly selected chromosomes were used. Genotypes in the study population were masked (either 1% or 20% of SNPs available for a given chromosome). The masked genotypes were then imputed using the software Markov Chain Haplotyping Algorithm. Using four HapMap reference populations, average genotype error rates ranged from 7.86% for Mexican Americans to 22.30% for Yoruba. In contrast, use of the original Pima Indian data as a reference resulted in an average error rate of 1.73%.
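
The masking-and-scoring evaluation described in these results can be sketched in a few lines. This is a minimal illustration rather than the authors' pipeline: the function names, the 0/1/2 genotype coding and the toy data are assumptions, and the imputation step itself (performed in the study with the Markov Chain Haplotyping software) is left to the caller.

```python
import random

def mask_genotypes(genotypes, fraction, rng):
    """Hide a fraction of genotypes; return the masked copy and the hidden indices."""
    n_masked = min(len(genotypes), max(1, round(len(genotypes) * fraction)))
    hidden = sorted(rng.sample(range(len(genotypes)), n_masked))
    masked = list(genotypes)
    for i in hidden:
        masked[i] = None  # None marks a genotype withheld from the imputation tool
    return masked, hidden

def genotype_error_rate(truth, imputed, hidden):
    """Fraction of hidden genotypes that were imputed incorrectly."""
    errors = sum(1 for i in hidden if truth[i] != imputed[i])
    return errors / len(hidden)

# Toy data (hypothetical): genotypes coded as minor-allele counts 0, 1 or 2.
truth = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
masked, hidden = mask_genotypes(truth, 0.20, random.Random(0))
# Naive stand-in for an imputation tool: always guess the heterozygote.
imputed = [1 if g is None else g for g in masked]
print(genotype_error_rate(truth, imputed, hidden))
```

Running the same scoring function with different reference panels behind the imputation step would give error rates that are directly comparable, which is the kind of comparison the abstract reports.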

Conclusions

Our results suggest that the use of HapMap reference populations results in substantial inaccuracy in the imputation of genotypes in American Indians. A possible solution would be to densely genotype or sequence a reference American Indian population.

3.

Background

The study of structural genomic variation, together with advances in microarray technology, has provided many genomic resources related to the architecture of the human genome and has shown that human genome structure is considerably more complicated than previously thought.

Methodology/Principal Findings

For the International HapMap Project, Epstein-Barr virus immortalized cell lines were used in preference to blood in order to obtain a larger amount of genomic DNA. However, genomic aberrations stemming from the immortalization process, biased representation of the donor tissue, and the culture process may influence the accuracy of SNP genotypes. In order to identify chromosome aberrations, including loss of heterozygosity (LOH) and large-scale and small-scale copy number variations (CNVs), we used the Illumina HumanHap500 BeadChip (555,352 markers) on Korean HapMap individuals (n = 90) to obtain Log R ratio and B allele frequency information, and then analysed the data with various programs including Illumina ChromoZone, cnvPartition and PennCNV. As a result, we identified 28 LOHs (>3 Mb) and 35 large-scale CNVs (>1 Mb), with 4 samples having a completely duplicated chromosome. In addition, after checking the sample quality (standard deviation of log R ratio <0.30), we selected 79 samples and used both signal intensity and B allele frequency simultaneously to identify small-scale CNVs (<1 Mb), discovering 4,989 small-scale CNVs. The CNVs identified in this study were successfully validated by visual examination of the genoplot images, overlap analysis with previously reported CNVs in DGV, and quantitative PCR.
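
The sample quality check mentioned here (standard deviation of log R ratio below 0.30) can be illustrated with a short sketch. The function names and the toy values are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not say which estimator was applied.

```python
from statistics import stdev

LRR_SD_THRESHOLD = 0.30  # cut-off quoted in the abstract

def passes_lrr_qc(log_r_ratios, threshold=LRR_SD_THRESHOLD):
    """True when the per-sample spread of log R ratio is below the threshold."""
    return stdev(log_r_ratios) < threshold

def select_samples(lrr_by_sample, threshold=LRR_SD_THRESHOLD):
    """Return the names of samples that pass the log R ratio QC, sorted."""
    return sorted(
        name for name, values in lrr_by_sample.items()
        if passes_lrr_qc(values, threshold)
    )

# Toy input (hypothetical): one clean sample and one noisy sample.
toy = {
    "sample_clean": [0.00, 0.10, -0.10, 0.05],
    "sample_noisy": [1.00, -1.00, 0.80, -0.90],
}
print(select_samples(toy))
```

A noisy log R ratio signal inflates false CNV calls, which is presumably why only samples passing this filter were carried into the small-scale CNV analysis.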

Conclusion/Significance

In this study, we describe the chromosome aberrations identified in Korean HapMap individuals, and expect that these findings will provide more meaningful information on the human genome.

4.

Background

Mangalicas are fatty type local/rare pig breeds with an increasing presence in the niche pork market in Hungary and in other countries. To explore their genetic resources, we have analysed data from next-generation sequencing of an individual male from each of three Mangalica breeds along with a local male Duroc pig. Structural variations, such as SNPs, INDELs and CNVs, were identified and particular genes with SNP variations were analysed with special emphasis on functions related to fat metabolism in pigs.

Results

More than 60 Gb of sequence data were generated for each of the sequenced individuals, resulting in 11× to 19× autosomal median coverage. After stringent filtering, around six million SNPs, of which approximately 10% are novel compared to the dbSNP138 database, were identified in each animal. Several hundred thousand INDELs and about 1,000 CNV gains were also identified. The functional annotation of genes with exonic, non-synonymous SNPs, which are common to all three Mangalicas but are absent from both the reference genome and the sequenced Duroc of this study, highlighted 52 genes in lipid metabolism processes. Further analysis revealed that 41 of these genes are associated with lipid metabolic or regulatory pathways, 49 are in fat-metabolism and fatness-phenotype QTLs and, with the exception of ACACA, ANKRD23, GM2A, KIT, MOGAT2, MTTP, FASN, SGMS1, SLC27A6 and RETSAT, have not previously been associated with fat-related phenotypes.
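
The selection of SNPs shared by all three Mangalicas but not seen in the Duroc amounts to a set intersection followed by a set difference. The sketch below is an illustration under assumptions: variants are represented as hypothetical (chromosome, position, alternate allele) tuples called against the reference genome, and the sample labels are placeholders rather than the breed names used in the study.

```python
def shared_private_variants(breed_calls, control_calls):
    """Variants present in every breed sample and absent from the control sample."""
    call_sets = [set(calls) for calls in breed_calls.values()]
    if not call_sets:
        return set()
    shared = set.intersection(*call_sets)  # common to all breed samples
    return shared - set(control_calls)     # drop anything the control also carries

# Toy calls (hypothetical): (chromosome, position, alternate allele).
mangalica = {
    "mangalica_1": [("1", 100, "A"), ("1", 250, "T"), ("2", 40, "G")],
    "mangalica_2": [("1", 100, "A"), ("1", 250, "T"), ("3", 77, "C")],
    "mangalica_3": [("1", 100, "A"), ("1", 250, "T")],
}
duroc = [("1", 250, "T"), ("5", 10, "G")]
print(shared_private_variants(mangalica, duroc))
```

Because every call is made relative to the reference genome, a variant in the resulting set is by construction absent from the reference as well as from the control animal.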

Conclusions

Genome analysis of Mangalica breeds revealed that local/rare breeds could be a rich source of sequence variations not present in cosmopolitan/industrial breeds. The identified Mangalica variations may, therefore, be a very useful resource for future studies of agronomically important traits in pigs.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-761) contains supplementary material, which is available to authorized users.

5.
6.

Background

Prestin, encoded by the gene SLC26A5, is a transmembrane protein of the cochlear outer hair cell (OHC). Prestin is required for the somatic electromotile activity of OHCs; in mice lacking prestin, this activity is absent and severe hearing impairment results. In humans, the role of sequence variations in SLC26A5 in hearing loss is less clear. Although prestin is expected to be required for functional human OHCs, the clinical significance of reported putative mutant alleles in humans is uncertain.

Methodology/Principal Findings

To explore the hypothesis that SLC26A5 may act as a modifier gene, affecting the severity of hearing loss caused by an independent etiology, a patient-control cohort was screened for DNA sequence variations in SLC26A5 using sequencing and allele specific methods. Patients in this study carried known pathogenic or controversial sequence variations in GJB2, encoding Connexin 26, or confirmed or suspected sequence variations in SLC26A5; controls included four ethnic populations. Twenty-three different DNA sequence variations in SLC26A5, 14 of which are novel, were observed: 4 novel sequence variations were found exclusively among patients; 7 novel sequence variations were found exclusively among controls; and 12 sequence variations, 3 of which are novel, were found in both patients and controls. Twenty-one of the 23 DNA sequence variations were located in non-coding regions of SLC26A5. Two coding sequence variations, both novel, were observed only in patients and predict a silent change, p.S434S, and an amino acid substitution, p.I663V. In silico analysis of the p.I663V amino acid variation suggested this variant might be benign. Using Fisher's exact test, no statistically significant difference was observed between patients and controls in the frequency of the identified DNA sequence variations. Haplotype analysis using HaploView 4.0 software revealed the same predominant haplotype in patients and controls and derived haplotype blocks in the patient-control cohort similar to those generated from the International HapMap Project.
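
For readers unfamiliar with the test used for the patient-control comparison, the following is a minimal sketch of a two-sided Fisher's exact test on a 2×2 table, written with the standard library only. The function name and the example counts are hypothetical; the abstract does not report the allele counts that were actually tested.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)

# Hypothetical counts: carriers / non-carriers of a variant in patients and controls.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

An exact test is a natural choice when individual variants are rare, because the expected cell counts are too small for a chi-square approximation to be reliable.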

Conclusions/Significance

Although these data fail to support a hypothesis that SLC26A5 acts as a modifier gene of GJB2-related hearing loss, the sample size is small and investigation of a larger population might be more informative. The 14 novel DNA sequence variations in SLC26A5 reported here will serve as useful research tools for future studies of prestin.

7.
8.

Background

With the advance of next generation sequencing (NGS) technologies, a large number of insertion and deletion (indel) variants have been identified in human populations. Despite much research into variant calling, it has been found that a non-negligible proportion of the identified indel variants might be false positives due to sequencing errors, artifacts caused by ambiguous alignments, and annotation errors.

Results

In this paper, we examine indel redundancy in dbSNP, one of the central databases for indel variants, and develop a standalone computational pipeline, dubbed Vindel, to detect redundant indels. The pipeline first applies indel position information to form candidate redundant groups, then performs indel mutations to the reference genome to generate corresponding indel variant substrings. Finally the indel variant substrings in the same candidate redundant groups are compared in a pairwise fashion to identify redundant indels. We applied our pipeline to check for redundancy in the human indels in dbSNP. Our pipeline identified approximately 8% redundancy in insertion type indels, 12% in deletion type indels, and overall 10% for insertions and deletions combined. These numbers are largely consistent across all human autosomes. We also investigated indel size distribution and adjacent indel distance distribution for a better understanding of the mechanisms generating indel variants.
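
The three pipeline steps described here (grouping by position, applying each indel to the reference, and pairwise comparison of the resulting substrings) can be sketched as follows. This is an illustration of the idea, not the Vindel implementation: the `Indel` record, the `window` and `flank` parameters and the toy reference are all assumptions.

```python
from typing import NamedTuple

class Indel(NamedTuple):
    name: str
    pos: int   # 0-based index in the reference where `ref` starts
    ref: str   # reference bases removed ("" for a pure insertion)
    alt: str   # bases inserted ("" for a pure deletion)

def apply_indel(reference, indel, start, end):
    """Return reference[start:end] after applying the indel."""
    stop = indel.pos + len(indel.ref)
    if reference[indel.pos:stop] != indel.ref:
        raise ValueError(f"{indel.name}: ref allele does not match the reference")
    return reference[start:indel.pos] + indel.alt + reference[stop:end]

def candidate_groups(indels, window):
    """Group indels whose start lies within `window` bases of the previous one."""
    groups, current = [], []
    for indel in sorted(indels, key=lambda i: i.pos):
        if current and indel.pos - current[-1].pos > window:
            groups.append(current)
            current = []
        current.append(indel)
    if current:
        groups.append(current)
    return groups

def redundant_pairs(reference, indels, window=10, flank=5):
    """Pairs of indels that produce the same variant substring."""
    pairs = []
    for group in candidate_groups(indels, window):
        start = max(0, min(i.pos for i in group) - flank)
        end = min(len(reference), max(i.pos + len(i.ref) for i in group) + flank)
        variants = [(i.name, apply_indel(reference, i, start, end)) for i in group]
        for a in range(len(variants)):
            for b in range(a + 1, len(variants)):
                if variants[a][1] == variants[b][1]:
                    pairs.append((variants[a][0], variants[b][0]))
    return pairs

# Toy case (hypothetical): deleting "CA" at two places in a CA repeat.
reference = "TTGCACACATGG"
indels = [Indel("d1", 3, "CA", ""), Indel("d2", 5, "CA", ""), Indel("i1", 9, "", "A")]
print(redundant_pairs(reference, indels))
```

The toy case shows why redundancy arises: within a tandem repeat, the same deletion can be recorded at several positions while describing an identical variant sequence.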

Conclusions

Vindel, a simple yet effective computational pipeline, can be used to check whether a set of indels is redundant with respect to those already in a database of interest such as NCBI’s dbSNP. Of the approximately 5.9 million indels we examined, nearly 0.6 million are redundant, revealing a serious limitation in the current indel annotation. Statistical results demonstrate the consistency of the pipeline in detecting indel redundancy across all 22 autosomes. Apart from the standalone Vindel pipeline, the indel redundancy check algorithm is also implemented in the web server http://bioinformatics.cs.vt.edu/zhanglab/indelRedundant.php.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0359-1) contains supplementary material, which is available to authorized users.

9.
10.

Background

The inhalation of allergens by allergic asthmatics results in the early asthmatic response (EAR), which is characterized by acute airway obstruction beginning within a few minutes. The EAR is the earliest indicator of the pathological progression of allergic asthma. Because the molecular mechanism underlying the EAR is not fully defined, this study will contribute to a better understanding of asthma.

Methods

In order to gain insight into the molecular basis of the EAR, we examined changes in protein expression patterns in the lung tissue of asthmatic rats during the EAR using 2-DE/MS-based proteomic techniques. Bioinformatic analysis of the proteomic data was then performed using PPI Spider and KEGG Spider to investigate the underlying molecular mechanism.

Results

In total, 44 differentially expressed protein spots were detected in the 2-DE gels. Of these 44 protein spots, 42 corresponded to 36 unique proteins successfully identified using mass spectrometry. During subsequent bioinformatic analysis, the gene ontology classification, the protein-protein interaction networking and the biological pathway exploration demonstrated that the identified proteins were mainly involved in glycolysis, calcium binding and mitochondrial activity. Using western blot and semi-quantitative RT-PCR, we confirmed the changes in expression of five selected proteins, which further supports our proteomic and bioinformatic analyses.

Conclusions

Our results reveal that the allergen-induced EAR in asthmatic rats is associated with glycolysis, calcium binding and mitochondrial activity, which could establish a functional network in which calcium binding may play a central role in promoting the progression of asthma.

11.

Background

Schistosomiasis japonica is a serious, debilitating and sometimes fatal disease. Accurate diagnostic tests play a key role in patient management and control of the disease. However, currently available diagnostic methods are not ideal, and the detection of parasite DNA in blood samples has turned out to be one of the most promising tools for the diagnosis of schistosomiasis. In our previous investigations, a 230-bp sequence from the highly repetitive retrotransposon SjR2 was identified, and it showed high sensitivity and specificity for detecting Schistosoma japonicum DNA in the sera of a rabbit model and of patients. Recently, 29 retrotransposons were found in the S. japonicum genome by our group. The present study highlights the key factors for selecting a new, potentially sensitive target DNA sequence for the diagnosis of schistosomiasis, which can serve as an example for other parasitic pathogens.

Methodology/Principal Findings

In this study, we demonstrated, based on bioinformatic analysis, that the key factors for selecting a target sequence are a higher genome proportion, more repetitive complete and partial copies, and more active ESTs than other sequences in the genome. New primers based on 25 novel retrotransposons and SjR2 were designed, and their sensitivity and specificity for detecting S. japonicum DNA were compared. The results showed that a new 303-bp sequence from a non-long terminal repeat (non-LTR) retrotransposon (SjCHGCS19) had high sensitivity and specificity. The 303-bp target sequence was amplified by nested PCR from the sera of the rabbit model at 3 d post-infection, and it became negative at 17 weeks post-treatment. Furthermore, the sensitivity of the nested PCR was 97.67% in 43 serum samples of S. japonicum-infected patients.
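
The reported sensitivity can be checked with a one-line calculation. The abstract gives only the percentage and the number of patient sera; the count of 42 positive results used below is an inference from those two figures (42/43 ≈ 97.67%), not a number stated in the source, and the function name is hypothetical.

```python
def sensitivity_percent(true_positives, false_negatives):
    """Sensitivity = TP / (TP + FN), expressed as a percentage."""
    return 100.0 * true_positives / (true_positives + false_negatives)

# Inferred counts: 42 of the 43 infected patients testing positive.
print(round(sensitivity_percent(42, 1), 2))
```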

Conclusions/Significance

Our findings highlight the key factors, based on bioinformatic analysis, for selecting a target sequence from the S. japonicum genome, which provide a basis for establishing powerful molecular diagnostic techniques that can be used for monitoring early infection and therapy efficacy to support schistosomiasis control programs.

12.
13.

Background

A key argument in favor of conserving biodiversity is that as yet undiscovered biodiversity will yield products of great use to humans. However, the link between undiscovered biodiversity and useful products is largely conjectural. Here we provide direct evidence from bioassays of endophytes isolated from tropical plants and bioinformatic analyses that novel biology will indeed yield novel chemistry of potential value.

Methodology/Principal Findings

We isolated and cultured 135 endophytic fungi and bacteria from plants collected in Peru. nrDNAs were compared to samples deposited in GenBank to ascertain the genetic novelty of cultured specimens. Ten endophytes were found to differ by as much as 15–30% from any sequence in GenBank. Phylogenetic trees, using the most similar sequences in GenBank, were constructed for each endophyte to measure phylogenetic distance. Assays were also conducted on each cultured endophyte to record bioactivity; 65 of the endophytes were found to be bioactive.

Conclusions/Significance

The novelty of our contribution is that we have combined bioinformatic analyses that document the diversity found in environmental samples with culturing and bioassays. These results highlight the hidden hyperdiversity of endophytic fungi and the urgent need to explore and conserve hidden microbial diversity. This study also showcases how undergraduate students can obtain data of great scientific significance.

14.
15.

Background

Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) is a key enzyme of the glycolytic pathway, reversibly catalyzing the sixth step of glycolysis and concurrently reducing the coenzyme NAD+ to NADH. In photosynthetic organisms a GAPDH paralog (Gap2 in Cyanobacteria, GapA in most photosynthetic eukaryotes) functions in the Calvin cycle, performing the reverse of the glycolytic reaction and using the coenzyme NADPH preferentially. In a number of photosynthetic eukaryotes that acquired their plastid by the secondary endosymbiosis of a eukaryotic red alga (alveolates, haptophytes, cryptomonads and stramenopiles), GapA has apparently been replaced with a paralog of the host’s own cytosolic GAPDH (GapC1). Plastid GapC1 and GapA therefore represent two independent cases of functional divergence and adaptation to the Calvin cycle, entailing a shift in subcellular targeting and a shift in binding preference from NAD+ to NADPH.

Methods

We used the programs FunDi, GroupSim, and Difference Evolutionary-Trace to detect sites involved in the functional divergence of these two groups of GAPDH sequences and to identify potential cases of convergent evolution in the Calvin-cycle adapted GapA and GapC1 families. Sites identified as being functionally divergent by all or some of these programs were then investigated with respect to their possible roles in the structure and function of both glycolytic and plastid-targeted GAPDH isoforms.

Conclusions

In this work we found substantial evidence for convergent evolution in GapA/B and GapC1. In many cases sites in GAPDHs of these groups converged on identical amino acid residues in specific positions of the protein known to play a role in the function and regulation of plastid-functioning enzymes relative to their cytosolic counterparts. In addition, we demonstrate that bioinformatic programs like FunDi are important tools for the generation of meaningful biological hypotheses that can then be tested with direct experimental techniques.

16.

Background

Although numerous sequence variants in desmoglein-2 (DSG2) have been associated with arrhythmogenic right ventricular cardiomyopathy (ARVC), the functional impact of new sequence variations is difficult to estimate.

Methodology/Principal Findings

To test the functional consequences of DSG2-variants, we established an expression system for the extracellular domain and the full-length DSG2 using the human cell line HT1080. We established new tools to investigate ARVC-associated DSG2 variations and compared wild-type proteins and proteins with one of the five selected variations (DSG2-p.R46Q, -p.D154E, -p.D187G, -p.K294E, -p.V392I) with respect to prodomain cleavage, adhesion properties and cellular localisation.

Conclusions/Significance

The ARVC-associated DSG2-p.R46Q variation was predicted by bioinformatics tools to be probably damaging and to affect a conserved proprotein convertase cleavage site. In this study, impaired prodomain cleavage and an influence on the properties of DSG2 could be demonstrated for the R46Q variant, leading to the classification of the variant as a potential gain-of-function mutant. In contrast, the variants DSG2-p.K294E and -p.V392I, which have an arguable impact on ARVC pathogenesis and are predicted to be benign, did not show functional differences from the wild-type protein in our study. Notably, the variants DSG2-p.D154E and -p.D187G, which were predicted to be damaging by bioinformatics tools, had no detectable effects on the DSG2 protein properties in our study.

17.
18.

Background

Functional similarity is challenging to identify when global sequence and structure similarity is low. Active-sites or functionally relevant regions are evolutionarily more stable relative to the remainder of a protein structure and provide an alternative means to identify potential functional similarity between proteins. We recently developed the FAST-NMR methodology to discover biochemical functions or functional hypotheses of proteins of unknown function by experimentally identifying ligand binding sites. FAST-NMR utilizes our CPASS software and database to assign a function based on a similarity in the structure and sequence of ligand binding sites between proteins of known and unknown function.

Methodology/Principal Findings

The PrgI protein from Salmonella typhimurium forms the needle complex in the type III secretion system (T3SS). A FAST-NMR screen identified a similarity between the ligand binding sites of PrgI and the Bcl-2 apoptosis protein Bcl-xL. These ligand binding sites correlate with known protein-protein binding interfaces required for oligomerization. Both proteins form membrane pores through this oligomerization to release effector proteins to stimulate cell death. Structural analysis indicates an overlap between the PrgI structure and the pore forming motif of Bcl-xL. A sequence alignment indicates conservation between the PrgI and Bcl-xL ligand binding sites and pore formation regions. This active-site similarity was then used to verify that chelerythrine, a known Bcl-xL inhibitor, also binds PrgI.

Conclusions/Significance

A structural and functional relationship between the bacterial T3SS and eukaryotic apoptosis was identified using our FAST-NMR ligand affinity screen in combination with a bioinformatic analysis based on our CPASS program. A similarity between PrgI and Bcl-xL is not readily apparent using traditional global sequence and structure analysis, but was only identified because of conservation in ligand binding sites. These results demonstrate the unique opportunity that ligand-binding sites provide for the identification of functional relationships when global sequence and structural information is limited.

19.

Background

Pyrosequencing technology has the potential to rapidly sequence HIV-1 viral quasispecies without requiring the traditional approach of cloning. In this study, we investigated the utility of ultra-deep pyrosequencing to characterize genetic diversity of the HIV-1 gag quasispecies and assessed the possible contribution of pyrosequencing technology in studying HIV-1 biology and evolution.

Methodology/Principal Findings

The HIV-1 gag gene was amplified from 96 patients using nested PCR. The PCR products were cloned and sequenced using capillary-based Sanger fluorescent dideoxy termination sequencing. The same PCR products were also directly sequenced using the 454 pyrosequencing technology. The two sequencing methods were evaluated for their ability to characterize quasispecies variation, and to reveal sites under host immune pressure for their putative functional significance. A total of 14,034 variations were identified by 454 pyrosequencing versus 3,632 variations by Sanger clone-based (SCB) sequencing. Of the variations identified by pyrosequencing, 11,050 were detected only by that method. These variations, undetected by SCB sequencing, were located in the HIV-1 Gag region which is known to contain putative cytotoxic T lymphocyte (CTL) and neutralizing antibody epitopes, and sites related to virus assembly and packaging. Analysis of the positively selected sites derived by the two sequencing methods identified several differences. All of them were located within the CTL epitope regions.
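
The three counts reported here imply how the two call sets overlap. The sketch below works through that arithmetic; it assumes that a variation found by both methods is counted once in each total, which the abstract does not state explicitly, and the derived figures are therefore an inference rather than reported results. The function name is hypothetical.

```python
def overlap_counts(total_a, total_b, only_a):
    """Given two set sizes and the size of A-only, return (shared, B-only)."""
    shared = total_a - only_a
    only_b = total_b - shared
    if shared < 0 or only_b < 0:
        raise ValueError("counts are inconsistent with two overlapping sets")
    return shared, only_b

# Counts from the abstract: 14,034 (pyrosequencing), 3,632 (SCB), 11,050 (pyrosequencing only).
shared, sanger_only = overlap_counts(14034, 3632, 11050)
print(shared, sanger_only)
```

Under that assumption, 2,984 variations would be common to both methods and 648 would be specific to Sanger clone-based sequencing.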

Conclusions/Significance

Ultra-deep pyrosequencing has proven to be a powerful tool for characterization of HIV-1 genetic diversity with enhanced sensitivity, efficiency, and accuracy. It also improved reliability of downstream evolutionary and functional analysis of HIV-1 quasispecies.

20.

Background

The HUGO Pan-Asian SNP Consortium (PASNP) has generated a genetic resource of almost 55,000 autosomal single nucleotide polymorphisms (SNPs) across more than 1,800 individuals from 73 urban and indigenous populations in Asia. This has offered valuable insights into the correlation of the genetic ancestry of these populations with major linguistic systems and geography. Here, we attempt to understand whether adaptation to local climate, diet and environment partly explains the genetic variation present in these populations by investigating the genomic signatures of positive selection.

Results

To evaluate the impact on the selection analyses of the considerably lower SNP density compared with other population genetics resources such as the International HapMap Project (HapMap) or the Singapore Genome Variation Project, we evaluated the extent of haplotype phasing switch errors and the consistency of selection signals from three haplotype-based approaches (iHS, XP-EHH, haploPS) when the HapMap data are thinned to a density similar to that of PASNP. We subsequently applied haploPS to detect and characterize positive selection in the PASNP populations, identifying 59 genomic regions that were selected in at least one PASNP population. A cluster analysis on the basis of these 59 signals showed that indigenous populations such as the Negrito from Malaysia and the Philippines, the China Hmong, and the Taiwan Ami and Atayal shared more of these signals. We also reported evidence of a positive selection signal encompassing the beta globin gene in the Taiwan Ami and Atayal that was distinct from the signal in the HapMap Africans, suggesting the possibility of convergent evolution at this locus due to malarial selection.

Conclusions

We established that the lower SNP content of the PASNP data conferred weaker ability to detect signatures of positive selection, but the availability of the new approach haploPS retained modest power. Out of all the populations in PASNP, we identified only 59 signals, suggesting a strong need for high-density population-level genotyping data or sequencing data in order to achieve a comprehensive survey of positive selection in Asian populations.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-332) contains supplementary material, which is available to authorized users.
