Similar literature
 Found 20 similar articles; search took 31 ms.
1.

Background:

The goal of the gene normalization task is to link genes or gene products mentioned in the literature to biological databases. This is a key step in an accurate search of the biological literature. It is a challenging task, even for the human expert; genes are often described rather than referred to by gene symbol and, confusingly, one gene name may refer to different genes (often from different organisms). For BioCreative II, the task was to list the Entrez Gene identifiers for human genes or gene products mentioned in PubMed/MEDLINE abstracts. We selected abstracts associated with articles previously curated for human genes. We provided 281 expert-annotated abstracts containing 684 gene identifiers for training, and a blind test set of 262 documents containing 785 identifiers, with a gold standard created by expert annotators. Inter-annotator agreement was measured at over 90%.

Results:

Twenty groups submitted one to three runs each, for a total of 54 runs. Three systems achieved F-measures (balanced precision and recall) between 0.80 and 0.81. Combining the system outputs using simple voting schemes and classifiers obtained improved results; the best composite system achieved an F-measure of 0.92 with 10-fold cross-validation. A 'maximum recall' system based on the pooled responses of all participants gave a recall of 0.97 (with precision 0.23), identifying 763 out of 785 identifiers.
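The figures in this paragraph can be checked with a short sketch. The function names and the toy gene identifiers below are illustrative assumptions, not part of the BioCreative II evaluation software, and the abstract does not specify which voting scheme was used; a simple vote-count threshold is shown only as one plausible reading of "simple voting schemes".

```python
from collections import Counter


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def vote(runs, min_votes):
    """Keep identifiers proposed by at least `min_votes` runs.

    With min_votes=1 this is the pooled union of all runs, which is the
    idea behind a 'maximum recall' system.
    """
    counts = Counter(gene_id for run in runs for gene_id in set(run))
    return {gene_id for gene_id, n in counts.items() if n >= min_votes}


# Figures reported in the abstract for the pooled 'maximum recall' system.
recall = 763 / 785
precision = 0.23
print(round(recall, 2))                        # 0.97
print(round(f_measure(precision, recall), 2))  # 0.37

# Hypothetical identifiers from three runs on one abstract.
runs = [{1, 2}, {2, 3}, {1, 2}]
print(sorted(vote(runs, min_votes=2)))  # [1, 2]
print(sorted(vote(runs, min_votes=1)))  # [1, 2, 3]
```

The low F-measure of the pooled system follows directly from its low precision, which is why pooling alone is not a substitute for the voting and classifier-based composites.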

Conclusion:

Major advances for the BioCreative II gene normalization task include broader participation (20 versus 8 teams) and a pooled system performance comparable to human experts, at over 90% agreement. These results show promise as tools to link the literature with biological databases.

2.

Introduction

Data sharing is being increasingly required by journals and has been heralded as a solution to the ‘replication crisis’.

Objectives

(i) Review data sharing policies of journals publishing the most metabolomics papers associated with open data and (ii) compare these journals’ policies to those that publish the most metabolomics papers.

Methods

A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications.

Results

Journals that support data sharing are not necessarily those with the most papers associated with open metabolomics data.

Conclusion

Further efforts are required to improve data sharing in metabolomics.

3.

Background

Group B Streptococcus (GBS) is a human commensal bacterium that can cause several infectious diseases in infants and in people with chronic diseases. GBS has been the most common cause of urinary tract infections in the elderly, but relatively few studies have reported on urine-isolated GBS and their antimicrobial susceptibilities. Hence, we investigated GBS isolated specifically from urine in Suzhou, China.

Methods

Twenty-seven GBS samples were isolated from urine in Suzhou, China. PCR and agarose gel electrophoresis were used to identify the serotype distribution. Susceptibility testing was based on the MIC test and the Kirby–Bauer test. Genomes were sequenced on the Illumina HiSeq platform and assembled with SPAdes. Genomes of five isolates were sequenced and submitted to the NCBI Genome database. The sequencing files in FASTQ format were submitted to the NCBI SRA database.

Results

Five serotypes were identified. The resistance rates measured for tetracycline, erythromycin, clindamycin and fluoroquinolones were 74.1%, 63.0%, 44.4% and 48.1%, respectively. In addition, 18.5% of the isolates were nonsusceptible to nitrofurantoin. Resistance to tetracycline was mainly associated with the gene tetM. Erythromycin resistance was mainly associated with the genes ermB and mefE. The genes ermB and lnuB were the prevalent genes in the cMLSB type. No known nitrofurantoin resistance gene was found in nitrofurantoin-nonsusceptible GBS.
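As a quick consistency check, the reported percentages can be mapped back to whole numbers of isolates out of the 27 samples. The per-drug counts below are not stated in the abstract; they are inferred here as the only integers that reproduce the published rates, so treat them as an assumption.

```python
N_ISOLATES = 27  # number of urine isolates stated in the Methods

# Inferred counts (assumption): chosen so that count / 27 matches the
# percentage reported in the abstract after rounding to one decimal place.
inferred_counts = {
    "tetracycline": 20,
    "erythromycin": 17,
    "clindamycin": 12,
    "fluoroquinolones": 13,
    "nitrofurantoin (nonsusceptible)": 5,
}


def rate_percent(count: int, total: int) -> float:
    """Return count / total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


for drug, count in inferred_counts.items():
    print(f"{drug}: {count}/{N_ISOLATES} = {rate_percent(count, N_ISOLATES)}%")
```

With so few isolates, each additional resistant sample shifts a rate by about 3.7 percentage points, which is worth keeping in mind when comparing these rates with larger surveys.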

Conclusions

Five serotypes were identified in our study. A high proportion of GBS isolates were resistant to tetracycline, erythromycin, clindamycin and fluoroquinolones. The genes ermB and lnuB were highly prevalent in the cMLSB phenotype.

4.
Yan X  Zheng T 《BMC genomics》2008,9(Z2):S14

Background

Gene expression data extracted from microarray experiments have been used to study differences in the mRNA abundance of genes under different conditions. In one such experiment, thousands of genes are measured simultaneously, which provides a high-dimensional feature space for discriminating between different sample classes. However, most of these dimensions are not informative about the between-class difference and add noise to the discriminant analysis.

Results

In this paper we propose and study feature selection methods that evaluate the "informativeness" of a set of genes. Two measures of information based on multigene expression profiles are considered for a backward information-driven screening approach for selecting important gene features. By considering multigene expression profiles, we are able to utilize interaction information among these genes. Using a breast cancer dataset, we illustrate our methods and compare their performance with that of existing methods.

Conclusion

We illustrate in this paper that methods considering gene-gene interactions have better classification power in gene expression analysis. In our results, we identify important genes with relatively large p-values from single-gene tests. This indicates that these genes carry weak marginal information but strong interaction information, and would be overlooked by strategies that only examine individual genes.

5.

Background

Serrapeptase is a proteolytic enzyme with many favorable biological properties, including anti-inflammatory, analgesic, anti-bacterial and fibrinolytic activity, and hence is widely used in clinical practice for the treatment of many diseases. Although Serrapeptase is widely used, there are very few published papers, and the information available about the enzyme is meagre. Hence this review article compiles the available information about this important enzyme.

Methods

A literature search of various databases and search engines, such as PubMed, SpringerLink and Scopus, was performed.

Results

We gathered and highlighted all the published information regarding the molecular aspects, properties, sources, production, purification, detection, yield optimization, immobilization, clinical studies, pharmacology, interaction studies, formulation, dosage and safety of the enzyme Serrapeptase.

Conclusion

Serrapeptase is used in many clinical studies against various diseases for its anti-inflammatory, fibrinolytic and analgesic effects. There are insufficient data regarding the safety of the enzyme as a health supplement. Data about the antiatherosclerotic activity, safety, tolerability, efficacy and mechanism of action of Serrapeptase are still required.

6.

Background

Protein kinase C ζ (PKCζ), an isoform of the atypical protein kinase C, is a pivotal regulator in cancer. However, the molecular and cellular mechanisms whereby PKCζ regulates tumorigenesis and metastasis are still not fully understood. In this study, proteomics and bioinformatics analyses were performed to establish a protein-protein interaction (PPI) network associated with PKCζ, laying a stepping stone to further understand the diverse biological roles of PKCζ.

Methods

Protein complexes associated with PKCζ were purified by co-immunoprecipitation from the breast cancer cell line MDA-MB-231 and identified by LC-MS/MS. Two biological replicates and two technical replicates were analyzed. The observed proteins were filtered using the CRAPome database to eliminate potential false positives. The proteomics identification results were combined with a PPI database search to construct the interactome network. Gene ontology (GO) and pathway analyses were performed using the PANTHER database and DAVID. Next, the interaction between PKCζ and protein phosphatase 2 catalytic subunit alpha (PPP2CA) was validated by co-immunoprecipitation, Western blotting and immunofluorescence. Furthermore, the TCGA database and the COSMIC database were used to analyze the expression of these two proteins in clinical samples.

Results

A PKCζ-centered PPI network containing 178 nodes and 1225 connections was built. Network analysis showed that the identified proteins were significantly associated with several key signaling pathways regulating cancer-related cellular processes.

Conclusions

Through combining the proteomics and bioinformatics analyses, a PKCζ centered PPI network was constructed, providing a more complete picture regarding the biological roles of PKCζ in both cancer regulation and other aspects of cellular biology.

7.
8.

Background

Extracting biological knowledge from large amounts of gene expression information deposited in public databases is a major challenge of the postgenomic era. Additional insights may be derived by data integration and cross-platform comparisons of expression profiles. However, database meta-analysis is complicated by differences in experimental technologies, data post-processing, database formats, and inconsistent gene and sample annotation.

Results

We have analysed expression profiles from three public databases: Gene Expression Atlas, SAGEmap and TissueInfo. These are repositories of oligonucleotide microarray, Serial Analysis of Gene Expression and Expressed Sequence Tag human gene expression data respectively. We devised a method, Preferential Expression Measure, to identify genes that are significantly over- or under-expressed in any given tissue. We examined intra- and inter-database consistency of Preferential Expression Measures. There was good correlation between replicate experiments of oligonucleotide microarray data, but there was less coherence in expression profiles as measured by Serial Analysis of Gene Expression and Expressed Sequence Tag counts. We investigated inter-database correlations for six tissue categories, for which data were present in the three databases. Significant positive correlations were found for brain, prostate and vascular endothelium but not for ovary, kidney, and pancreas.

Conclusion

We show that data from Gene Expression Atlas, SAGEmap and TissueInfo can be integrated using the UniGene gene index, and that expression profiles correlate relatively well when large numbers of tags are available or when tissue cellular composition is simple. Finally, in the case of brain, we demonstrate that when PEM values show good correlation, predictions of tissue-specific expression based on integrated data are very accurate.

9.

Background

Pancreatic cancer is one of the most lethal tumors, with a poor prognosis and a lack of effective biomarkers for diagnosis and treatment. The aim of this investigation was to identify hub genes in pancreatic cancer that could serve as potential biomarkers for cancer diagnosis and therapy in the future.

Methods

Two expression profiles, GSE16515 and GSE22780, from the Gene Expression Omnibus (GEO) database were combined to serve as the training set. Differentially expressed genes (DEGs) within the top 25% of variance were selected, followed by protein-protein interaction (PPI) network analysis, to find candidate genes. Hub genes were then further screened by survival and Cox analyses in The Cancer Genome Atlas (TCGA) database. Finally, the hub genes were validated in the GSE15471 dataset from GEO using two supervised learning methods, the k-nearest neighbor (kNN) and random forest algorithms.

Results

After quality control and batch effect elimination in the training set, 181 DEGs within the top 25% of variance were identified as candidate genes. Two genes, MMP7 and ITGA2, which correlated with the diagnosis and prognosis of pancreatic cancer, were then screened as hub genes using the bioinformatics methods described above. Finally, the hub genes were shown to distinguish tumor samples from normal tissues, with predictive accuracies of 93.59% and 81.31% using the kNN and random forest algorithms, respectively.
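To make the validation step concrete, here is a minimal sketch of k-nearest-neighbor classification on two features. The expression values are invented toy numbers, not data from GSE15471, and the function names are illustrative; the study's actual implementation, distance metric and choice of k are not given in the abstract.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training samples.

    `train` is a list of (features, label) pairs; Euclidean distance is an
    assumption made for this sketch.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


# Hypothetical (MMP7, ITGA2) expression values, for illustration only.
train = [
    ((8.1, 7.9), "tumor"), ((7.6, 8.3), "tumor"), ((8.4, 7.2), "tumor"),
    ((4.2, 5.0), "normal"), ((3.9, 4.6), "normal"), ((4.8, 5.3), "normal"),
]
test = [((7.9, 7.5), "tumor"), ((4.0, 5.1), "normal")]

print(knn_predict(train, (7.9, 7.5)))  # tumor
print(accuracy(train, test))           # 1.0
```

The key design point mirrored from the abstract is that the classifier is fitted on one dataset and scored on an independent one, so the accuracy reflects generalization rather than fit.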

Conclusions

All the hub genes were associated with regulation of the tumor microenvironment, which is implicated in tumor proliferation, progression, migration, and metastasis. Our results provide a novel prospect for the diagnosis and treatment of pancreatic cancer, which may have further clinical application.

10.

Objective

Recent studies have shown that coagulation factors play an important role in controlling pregnancy duration in addition to controlling homeostasis. Recent studies have also shown that several polymorphisms of coagulation factor genes increase clot formation and lead to abortion. In this study, we evaluated the polymorphisms of coagulation factors and their effects on the development of the fetus.

Material and Methods

Relevant literature was identified by a PubMed search (1988-2017) of English language papers using the terms Abortion, pregnancy woman, coagulation factor and polymorphism.

Result

Several polymorphisms of coagulation factors disturb the exchange of nutrients and other materials between the fetus and the mother, and impair the formation of the placenta during embryonic stages.

Discussion

Evaluation of functional polymorphisms in coagulation factor genes during fetal development can be used as a prognostic factor in the prevention of abortion.

11.

Background

Charge states of tandem mass spectra from low-resolution collision-induced dissociation cannot be determined by mass spectrometry. As a result, such spectra with multiple charges are usually searched multiple times, once for each possible charge state. Not only does this strategy increase the overall database search time, but it also yields more false positives. Hence, it is advantageous to determine the charge states of such spectra before the database search.

Results

We propose a new approach capable of determining the charge states of low-resolution tandem mass spectra. Four novel and discriminant features are introduced to describe tandem mass spectra and are used in a Gaussian mixture model to distinguish doubly and triply charged peptides. Tests on three independent datasets with known validity show that this method can assign charge states to low-resolution tandem mass spectra more accurately than existing methods.
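The classification idea can be illustrated with a deliberately simplified sketch: one Gaussian per charge class on a single feature, with the class of higher weighted likelihood chosen. This is a simplification of a Gaussian mixture model (one component per class, one feature instead of four), and the feature values below are invented; the paper's actual features and fitting procedure are not described in the abstract.

```python
from statistics import NormalDist


def fit_class(values):
    """Fit a single Gaussian to one class's feature values (needs >= 2 values)."""
    return NormalDist.from_samples(values)


def classify_charge(x, model_2plus, model_3plus, prior_2plus=0.5):
    """Return 2 or 3 depending on which class gives the higher weighted density."""
    score_2 = prior_2plus * model_2plus.pdf(x)
    score_3 = (1 - prior_2plus) * model_3plus.pdf(x)
    return 2 if score_2 >= score_3 else 3


# Hypothetical values of one spectrum feature for spectra of known charge.
doubly_charged = [0.20, 0.25, 0.30, 0.22, 0.28]
triply_charged = [0.60, 0.65, 0.70, 0.62, 0.68]

model_2 = fit_class(doubly_charged)
model_3 = fit_class(triply_charged)

print(classify_charge(0.27, model_2, model_3))  # 2
print(classify_charge(0.66, model_2, model_3))  # 3
```

Assigning a single charge state before the search means each spectrum is searched once rather than once per candidate charge, which is the source of the speed gain the abstract describes.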

Conclusions

The proposed method can be used to improve the speed and reliability of peptide identification.

12.
13.

Background

The development of biologically relevant models from gene expression data, notably microarray data, has become a topic of great interest in bioinformatics, clinical genetics and oncology. Only a small number of the genes explored show expression that correlates significantly with a given phenotype. Gene selection enables researchers to obtain substantial insight into the genetic nature of the disease and the mechanisms responsible for it. Besides improving the performance of cancer classification, it can also cut down the time and cost of medical diagnoses.

Methods

This study presents a modified Artificial Bee Colony (ABC) algorithm that selects a minimal number of genes deemed significant for cancer while improving predictive accuracy. The search equation of ABC is believed to be good at exploration but poor at exploitation. To overcome this limitation, we modified the ABC algorithm by incorporating the concept of pheromones, one of the major components of the Ant Colony Optimization (ACO) algorithm, together with a new operation in which successive bees communicate to share their findings.

Results

The proposed algorithm is evaluated on a suite of ten publicly available datasets after the parameters are tuned systematically on one of the datasets. The results are compared with those of other works that used the same datasets. The performance of the proposed method is shown to be superior.

Conclusion

The method presented in this paper can provide a smaller subset of genes that leads to more accurate classification results. Additionally, the proposed modified Artificial Bee Colony algorithm could conceivably be applied to problems in other areas as well.

14.

Background

Periodontitis, i.e. inflammation of the periodontium, is a multifactorial disease. Antimicrobial peptides (AMPs), which demonstrate a broad spectrum of activity against a variety of bacteria, fungi, viruses, parasites and cancerous cells, have been linked to periodontitis. AMPs are also capable of immunomodulation and are significantly responsive to innate immuno-stimulation and infections. LL-37 plays a beneficial role in the prevention and treatment of chronic forms of periodontitis.

Objective

In the present work we review the role of the antimicrobial peptide LL-37 in periodontitis.

Methods

A systematic search covering all records up to August 2016 was carried out using the PubMed search engine. The keywords included “LL-37,” “periodontitis,” “Papillon–Lefevre syndrome,” “Morbus Kostmann,” and “Haim-Munk syndrome,” combined with the Boolean operator “and.”

Results

The search identified 67 articles, including articles linking LL-37 with periodontitis; articles on Papillon–Lefevre syndrome, Morbus Kostmann and Haim-Munk syndrome in relation to LL-37 and periodontitis; and articles on the pathogenicity of periodontitis.

Conclusion

The literature search indicated that LL-37 plays a pivotal role in the prevention and treatment of severe forms of periodontitis.

15.

Background

Glioblastoma multiforme, the most prevalent and aggressive brain tumour, has a poor prognosis. The molecular mechanisms underlying gliomagenesis remain poorly understood. Therefore, molecular research, including various markers, is necessary to understand the occurrence and development of glioma.

Method

Weighted gene co-expression network analysis (WGCNA) was performed to construct a gene co-expression network in TCGA glioblastoma samples. Gene ontology (GO) and pathway-enrichment analyses were used to identify the significance of gene modules. A Cox proportional hazards regression model was used to predict the outcome of glioblastoma patients.

Results

We performed weighted gene co-expression network analysis (WGCNA) and identified a gene module (yellow module) related to the survival time of TCGA glioblastoma samples. Then, 228 hub genes were calculated based on gene significance (GS) and module significance (MS). Four genes (OSMR + SOX21 + MED10 + PTPRN) were selected to construct a Cox proportional hazards regression model with high accuracy (AUC = 0.905). The prognostic value of the Cox proportional hazards regression model was also confirmed in GSE16011 dataset (GBM: n = 156).

Conclusion

We developed a promising mRNA signature for estimating overall survival in glioblastoma patients.

16.
17.

Background

Porphyromonas gingivalis is a periodontal pathogen that is considered a keystone pathogen for periodontitis. A diverse set of P. gingivalis virulence factors, including lipopolysaccharide, fimbriae, capsular polysaccharide, haemagglutinin and cysteine proteases (Arg-gingipains and Lys-gingipain), is considered to be involved in the pathogenesis of periodontitis. Leupeptin is a cysteine protease inhibitor that is specific for Arg-gingipains. The present review focuses on the action of leupeptin on Arg-gingipains.

Method

A systematic search covering all records up to September 2016 was carried out in the Medline database via PubMed. The keywords were “leupeptin,” “gingipains,” and “periodontitis,” combined with the Boolean operator “and.”

Results

The search led to the selection of 58 articles covering the link between leupeptin, periodontitis and gingipains; the pathogenesis of periodontitis; the pathogenicity of gingipains; and the role of leupeptin.

Conclusion

It was concluded that leupeptin inhibits and attenuates a number of destructive activities of Arg-gingipains. Its reported effects include inhibiting platelet aggregation; inhibiting degradation of LL-37, an antimicrobial peptide; blocking inhibition of monocyte chemoattractant protein; restoring the level of interleukin-2; and inhibiting degradation of collagen types I and IV, to name a few.

18.
19.

Background

Breast cancer and ovarian cancer are hormone driven and are known to share some predisposition genes, such as the two well-known cancer genes BRCA1 and BRCA2. The objective of this study is to compare the coexpression network modules of both cancers in order to infer potential cancer-related modules.

Methods

We applied the eigen-decomposition to the matrix that integrates the gene coexpression networks of both breast cancer and ovarian cancer. With hierarchical clustering of the related eigenvectors, we obtained the network modules of both cancers simultaneously. Enrichment analysis on Gene Ontology (GO), KEGG pathway, Disease Ontology (DO), and Gene Set Enrichment Analysis (GSEA) in the identified modules was performed.

Results

We identified 43 modules that are enriched in at least one of the four types of enrichment analysis. Of these, 31, 25, and 18 modules are enriched in GO terms, KEGG pathways, and DO terms, respectively. The structure of 29 modules differs significantly between the two cancers (p-values less than 0.05), of which 25 modules have larger densities in ovarian cancer. One module was found to be significantly enriched in breast cancer-related terms from GO, KEGG and DO enrichment. One module was found to be significantly enriched in ovarian cancer-related terms.

Conclusion

Breast cancer and ovarian cancer share some common properties at the module level. Integrating both cancers helps to identify potential cancer-associated modules.

20.

Background

One of the three tracks of the iDASH Privacy & Security Workshop 2017 competition was to execute a whole-genome variant search on private genomic data. Specifically, the search application was to find the most significant SNPs (Single-Nucleotide Polymorphisms) in a database of genome records labeled as control or case. In this paper we discuss the solution submitted by our team to this competition.

Methods

Privacy and confidentiality of genome data had to be ensured using Intel SGX enclaves. The typical use case of this application is the multi-party computation (each party possessing one or several genome records) of the SNPs that statistically differentiate control and case genome datasets.

Results

Our solution consists of two applications: (i) one that compresses and encrypts genome files and (ii) one that performs genome processing (the search for the most significant SNPs). We opted for a horizontal treatment of genome records and made heavy use of parallel processing. The Rust programming language was used to develop both applications.
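The search task itself, stripped of the SGX enclave, encryption and parallelism, can be sketched as ranking SNPs by a case-versus-control association statistic. The sketch below is in Python for brevity rather than Rust, uses a 2×2 chi-square statistic as an assumed significance measure (the abstract does not name the test), and runs on invented counts with hypothetical SNP names.

```python
import heapq


def chi_square_2x2(a, b, c, d):
    """Chi-square statistic (no continuity correction) for a 2x2 table.

    a, b: case records with / without the variant
    c, d: control records with / without the variant
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def top_k_snps(counts, k):
    """Return the k SNP names with the largest chi-square statistic."""
    scored = ((chi_square_2x2(*table), snp) for snp, table in counts.items())
    return [snp for _, snp in heapq.nlargest(k, scored)]


# Hypothetical per-SNP counts: (case with, case without, control with, control without).
counts = {
    "snp_A": (40, 10, 10, 40),
    "snp_B": (25, 25, 24, 26),
    "snp_C": (35, 15, 20, 30),
}

print(top_k_snps(counts, 2))  # ['snp_A', 'snp_C']
```

Because each SNP's statistic depends only on its own counts, the per-SNP scoring is independent across SNPs, which is the kind of structure that lends itself to the parallel processing the authors describe.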

Conclusions

The execution performance of the processing applications scales well, and very good performance metrics were obtained. The contest organizers selected our solution as the best submission among the competition entries received, and our team was awarded first prize in this track.
