Found 20 similar articles (search time: 15 ms)
1.
ManChon U Eric Talevich Samiksha Katiyar Khaled Rasheed Natarajan Kannan 《PLoS computational biology》2014,10(4)
Cancer is a genetic disease that develops through a series of somatic mutations, a subset of which drive cancer progression. Although cancer genome sequencing studies are beginning to reveal the mutational patterns of genes in various cancers, identifying the small subset of “causative” mutations from the large subset of “non-causative” mutations, which accumulate as a consequence of the disease, is a challenge. In this article, we present an effective machine learning approach for identifying cancer-associated mutations in human protein kinases, a class of signaling proteins known to be frequently mutated in human cancers. We evaluate the performance of 11 well known supervised learners and show that a multiple-classifier approach, which combines the performances of individual learners, significantly improves the classification of known cancer-associated mutations. We introduce several novel features related specifically to structural and functional characteristics of protein kinases and find that the level of conservation of the mutated residue at specific evolutionary depths is an important predictor of oncogenic effect. We consolidate the novel features and the multiple-classifier approach to prioritize and experimentally test a set of rare unconfirmed mutations in the epidermal growth factor receptor tyrosine kinase (EGFR). Our studies identify T725M and L861R as rare cancer-associated mutations inasmuch as these mutations increase EGFR activity in the absence of the activating EGF ligand in cell-based assays.
2.
Background
Various studies assessing the diagnostic value of serum tumor markers in patients with esophageal cancer remain controversial. This study aims to comprehensively and quantitatively summarize the potential diagnostic value of 5 serum tumour markers in esophageal cancer.
Methods
We systematically searched PubMed, Embase, Chinese National Knowledge Infrastructure (CNKI) and Chinese Biomedical Database (CBM), through February 28, 2013, without language restriction. Studies were assessed for quality using QUADAS (quality assessment of studies of diagnostic accuracy). The positive likelihood ratio (PLR) and negative likelihood ratio (NLR) were pooled separately and compared with overall accuracy measures using diagnostic odds ratios (DORs) and symmetric summary receiver operating characteristic (SROC) curves.
Results
Of 4391 studies initially identified, 44 eligible studies including five tumor markers met the inclusion criteria for the meta-analysis, while meta-analysis could not be conducted for 12 other tumor markers. Approximately 79.55% (35/44) of the included studies were of relatively high quality (QUADAS score≥7). The summary estimates of the positive likelihood ratio (PLR), negative likelihood ratio (NLR) and diagnostic odds ratio (DOR) for diagnosing EC were as follows: CEA, 5.94/0.76/9.26; Cyfra21-1, 12.11/0.59/22.27; p53 antibody, 6.71/0.75/9.60; SCC-Ag, 7.66/0.68/12.41; and VEGF-C, 0.74/0.37/8.12. The estimated summary receiver operating characteristic curves showed that the performance of all five tumor markers was reasonable.
Conclusions
The current evidence suggests that CEA, Cyfra21-1, p53, SCC-Ag and VEGF-C have a potential diagnostic value for esophageal carcinoma.
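For readers unfamiliar with the accuracy measures named in this abstract, the sketch below shows how PLR, NLR and DOR are defined for a single study from its sensitivity and specificity. The function name and the input values (0.80, 0.90) are illustrative assumptions, not data from the meta-analysis. Because the abstract states that PLR and NLR were pooled separately, a pooled DOR need not equal pooled PLR divided by pooled NLR (for CEA, 5.94/0.76 is about 7.8, whereas the reported DOR is 9.26).

```python
def diagnostic_ratios(sensitivity, specificity):
    """Return (PLR, NLR, DOR) for one study's sensitivity and specificity."""
    # Positive likelihood ratio: true-positive rate over false-positive rate.
    plr = sensitivity / (1.0 - specificity)
    # Negative likelihood ratio: false-negative rate over true-negative rate.
    nlr = (1.0 - sensitivity) / specificity
    # Diagnostic odds ratio: for a single 2x2 table it equals PLR / NLR.
    dor = plr / nlr
    return plr, nlr, dor


# Hypothetical single-study values, chosen only for illustration.
plr, nlr, dor = diagnostic_ratios(sensitivity=0.80, specificity=0.90)
print(f"PLR={plr:.2f} NLR={nlr:.2f} DOR={dor:.2f}")
```

A PLR well above 1 and an NLR well below 1 both indicate a more informative test; the DOR folds the two into a single number.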
3.
4.
Legumes are known to provide nutritious proteins and vegetable oils while at the same time providing industrial products such as biodiesel. It is estimated that approximately 25% of world crop production is derived from legumes. Recently, knowledge of the molecular biology and genomics of legumes has been extended significantly using two model species, Lotus japonicus (http://www.kazusa.or.jp/lotus/) and Medicago
5.
6.
Lucía Irene González-Villaseñor Thomas T. Chen 《Marine biotechnology (New York, N.Y.)》1999,1(3):211-220
Antibodies elicited by novel synthetic peptide antigens derived from a highly conserved domain of the growth hormone (GH)
and prolactin (PRL) of vertebrates were developed using the multiple antigen peptide approach. The sequence of the antigens
is located near the carboxy-terminus in the D domain of the GH and PRL in a cluster of 11 and 10 conserved amino acids, respectively,
within a sequence of 18 residues. The synthetic peptides were manually synthesized, purified by high-performance liquid chromatography,
and the corresponding antibodies, elicited in rabbits, were cross-reacted with the GH and PRL of a variety of mammalian (human,
bovine, ovine, pig, and equine) and nonmammalian (chicken, coho salmon, chum salmon, rainbow trout, catfish and striped bass)
vertebrates. The cross-reactivity between the immunogen and its corresponding antigen was tested by immunoblotting using either
GH or PRL. The GH and PRL of the organisms tested cross-reacted specifically with the corresponding antibody. Chicken and
fish GH and PRL showed stronger antibody cross-reactivity than that observed in mammalian sources. These results demonstrate
the utility of peptide-derived polyclonal antibodies in the detection of native and recombinant GH and PRL of a variety of
vertebrates.
Received June 1, 1998; accepted November 13, 1998.
7.
System for Determining the Relative Fitness of Multiple Bacterial Populations without Using Selective Markers
A device for simultaneously measuring the relative fitness of multiple bacterial populations was developed and evaluated. The new device eliminates the need to construct strains with selectively neutral markers so that strains can be readily distinguished, and it provides a means to perform multispecies competition experiments.
8.
Maureen H. Diaz Jessica L. Waller Rebecca A. Napoliello Md. Shahidul Islam Bernard J. Wolff Daniel J. Burken Rhiannon L. Holden Velusamy Srinivasan Melissa Arvay Lesley McGee M. Steven Oberste Cynthia G. Whitney Stephanie J. Schrag Jonas M. Winchell Samir K. Saha 《PloS one》2013,8(6)
Identification of etiology remains a significant challenge in the diagnosis of infectious diseases, particularly in resource-poor settings. Viral, bacterial, and fungal pathogens, as well as parasites, play a role for many syndromes, and optimizing a single diagnostic system to detect a range of pathogens is challenging. The TaqMan Array Card (TAC) is a multiple-pathogen detection method that has previously been identified as a valuable technique for determining etiology of infections and holds promise for expanded use in clinical microbiology laboratories and surveillance studies. We selected TAC for use in the Aetiology of Neonatal Infection in South Asia (ANISA) study for identifying etiologies of severe disease in neonates in Bangladesh, India, and Pakistan. Here we report optimization of TAC to improve pathogen detection and overcome technical challenges associated with use of this technology in a large-scale surveillance study. Specifically, we increased the number of assay replicates, implemented a more robust RT-qPCR enzyme formulation, and adopted a more efficient method for extraction of total nucleic acid from blood specimens. We also report the development and analytical validation of ten new assays for use in the ANISA study. Based on these data, we revised the study-specific TACs for detection of 22 pathogens in NP/OP swabs and 12 pathogens in blood specimens as well as two control reactions (internal positive control and human nucleic acid control) for each specimen type. The cumulative improvements realized through these optimization studies will benefit ANISA and perhaps other studies utilizing multiple-pathogen detection approaches. These lessons may also contribute to the expansion of TAC technology to the clinical setting.
9.
Kemal Sonmez Naunihal T. Zaveri Ilan A. Kerman Sharon Burke Charles R. Neal Xinmin Xie Stanley J. Watson Lawrence Toll 《PLoS computational biology》2009,5(1)
There are currently a large number of “orphan” G-protein-coupled receptors (GPCRs) whose endogenous ligands (peptide hormones) are unknown. Identification of these peptide hormones is a difficult and important problem. We describe a computational framework that models spatial structure along the genomic sequence simultaneously with the temporal evolutionary path structure across species and show how such models can be used to discover new functional molecules, in particular peptide hormones, via cross-genomic sequence comparisons. The computational framework incorporates a priori high-level knowledge of structural and evolutionary constraints into a hierarchical grammar of evolutionary probabilistic models. This computational method was used for identifying novel prohormones and the processed peptide sites by producing sequence alignments across many species at the functional-element level. Experimental results with an initial implementation of the algorithm were used to identify potential prohormones by comparing the human and non-human proteins in the Swiss-Prot database of known annotated proteins. In this proof of concept, we identified 45 out of 54 prohormones with only 44 false positives. The comparison of known and hypothetical human and mouse proteins resulted in the identification of a novel putative prohormone with at least four potential neuropeptides. Finally, in order to validate the computational methodology, we present the basic molecular biological characterization of the novel putative peptide hormone, including its identification and regional localization in the brain. This species comparison, HMM-based computational approach succeeded in identifying a previously undiscovered neuropeptide from whole genome protein sequences. This novel putative peptide hormone is found in discrete brain regions as well as other organs.
The success of this approach will have a great impact on our understanding of GPCRs and associated pathways and help to identify new targets for drug development.
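The proof-of-concept figures in this abstract (45 of 54 known prohormones recovered, with 44 false positives) can be restated as precision and recall. The abstract does not report these two metrics itself; the sketch below simply derives them, assuming the 54 prohormones are the full positive set and the 44 false positives are the only incorrect hits.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from confusion-matrix counts."""
    # Precision: fraction of reported hits that are real.
    precision = true_pos / (true_pos + false_pos)
    # Recall: fraction of real items that were recovered.
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


# Counts taken from the abstract: 45 hits, 44 false positives, 54 - 45 misses.
precision, recall = precision_recall(true_pos=45, false_pos=44, false_neg=54 - 45)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Under these assumptions, roughly five in six known prohormones are recovered and about half of the reported hits are correct.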
10.
Disease gene prioritization aims to suggest potential implications of genes in disease susceptibility. Often accomplished in a guilt-by-association scheme, promising candidates are sorted according to their relatedness to known disease genes. Network-based methods have been successfully exploiting this concept by capturing the interaction of genes or proteins into a score. Nonetheless, most current approaches yield at least some of the following limitations: (1) networks comprise only curated physical interactions leading to poor genome coverage and density, and bias toward a particular source; (2) scores focus on adjacencies (direct links) or the most direct paths (shortest paths) within a constrained neighborhood around the disease genes, ignoring potentially informative indirect paths; (3) global clustering is widely applied to partition the network in an unsupervised manner, attributing little importance to prior knowledge; (4) confidence weights and their contribution to edge differentiation and ranking reliability are often disregarded. We hypothesize that network-based prioritization related to local clustering on graphs and considering full topology of weighted gene association networks integrating heterogeneous sources should overcome the above challenges. We term such a strategy Interactogeneous. We conducted cross-validation tests to assess the impact of network sources, alternative path inclusion and confidence weights on the prioritization of putative genes for 29 diseases. Heat diffusion ranking proved the best prioritization method overall, increasing the gap to neighborhood and shortest paths scores mostly on single source networks. Heterogeneous associations consistently delivered superior performance over single source data across the majority of methods. Results on the contribution of confidence weights were inconclusive. 
Finally, the best Interactogeneous strategy, heat diffusion ranking and associations from the STRING database, was used to prioritize genes for Parkinson’s disease. This method effectively recovered known genes and uncovered interesting candidates which could be linked to pathogenic mechanisms of the disease.
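The abstract does not give the formula behind its heat diffusion ranking, so the following is only a generic sketch of the idea: heat placed on known disease genes spreads over a weighted network according to dh/dt = -L h (L being the graph Laplacian), and the remaining genes are ranked by the heat they receive. The function name, the explicit Euler integration, the diffusion time `t` and the toy network are all assumptions made for illustration, not the authors' implementation.

```python
def heat_diffusion_scores(weights, seeds, t=0.5, steps=500):
    """Approximate heat diffusion on a weighted undirected graph.

    weights: dict mapping (u, v) node pairs to edge confidence weights.
    seeds:   set of nodes (known disease genes) that hold the initial heat.
    """
    nodes = sorted({n for edge in weights for n in edge})
    # Build a symmetric adjacency map from the undirected edge list.
    adj = {n: {} for n in nodes}
    for (u, v), w in weights.items():
        adj[u][v] = w
        adj[v][u] = w
    # Unit heat is split evenly over the seed nodes.
    heat = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    dt = t / steps
    for _ in range(steps):
        # Net inflow per node: heat moves along each edge in proportion
        # to its weight and to the temperature difference across it.
        flow = {
            n: sum(w * (heat[m] - heat[n]) for m, w in adj[n].items())
            for n in nodes
        }
        heat = {n: heat[n] + dt * flow[n] for n in nodes}
    return heat


# Hypothetical toy network; gene names and weights are invented.
toy_edges = {("G0", "G1"): 0.9, ("G0", "G2"): 0.2, ("G2", "G3"): 0.5}
scores = heat_diffusion_scores(toy_edges, seeds={"G0"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Unlike a direct-neighbor or shortest-path score, this kind of score lets a gene such as `G3`, which has no direct link to the seed, still receive a small nonzero value through indirect paths, and edge weights control how much heat each path carries. In a real prioritization the seed genes themselves would be dropped from the final candidate list.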
11.
12.
13.
Antimicrobial peptides (AMPs) belong to a class of natural microbicidal molecules that have been receiving great attention for their lower propensity for inducing drug resistance, hence, their potential as alternative drugs to conventional antibiotics. By generating AMP libraries, one can study a large number of candidates for their activities simultaneously in a timely manner. Here, we describe a novel methodology where in silico designed AMP-encoding oligonucleotide libraries are cloned and expressed in a cellular host for rapid screening of active molecules. The combination of parallel oligonucleotide synthesis with microbial expression systems not only offers complete flexibility for sequence design but also allows for economical construction of very large peptide libraries. An application of this approach to discovery of novel AMPs has been demonstrated by constructing and screening a custom library of twelve thousand plantaricin-423 mutants in Escherichia coli. Analysis of selected clones by both Sanger-sequencing and 454 high-throughput sequencing produced a significant amount of data for positionally important residues of plantaricin-423 responsible for antimicrobial activity and, moreover, resulted in identification of many novel variants with enhanced specific activities against Listeria innocua. This approach allows for generation of fully tailored peptide collections in a very cost effective way and will have countless applications from discovery of novel AMPs to gaining fundamental understanding of their biological function and characteristics.
14.
Judith R. Denery Ashlee A. K. Nunes Mark S. Hixon Tobin J. Dickerson Kim D. Janda 《PLoS neglected tropical diseases》2010,4(10)
Background
Development of robust, sensitive, and reproducible diagnostic tests for understanding the epidemiology of neglected tropical diseases is an integral aspect of the success of worldwide control and elimination programs. In the treatment of onchocerciasis, clinical diagnostics that can function in an elimination scenario are non-existent and desperately needed. Due to its sensitivity and quantitative reproducibility, liquid chromatography-mass spectrometry (LC-MS) based metabolomics is a powerful approach to this problem.
Methodology/Principal Findings
Analysis of an African sample set comprised of 73 serum and plasma samples revealed a set of 14 biomarkers that showed excellent discrimination between Onchocerca volvulus–positive and negative individuals by multivariate statistical analysis. Application of this biomarker set to an additional sample set from onchocerciasis endemic areas where long-term ivermectin treatment has been successful revealed that the biomarker set may also distinguish individuals with worms of compromised viability from those with active infection. Machine learning extended the utility of the biomarker set from a complex multivariate analysis to a binary format applicable for adaptation to a field-based diagnostic, validating the use of complex data mining tools applied to infectious disease biomarker discovery and diagnostic development.
Conclusions/Significance
An LC-MS metabolomics-based diagnostic has the potential to monitor the progression of onchocerciasis in both endemic and non-endemic geographic areas, as well as provide an essential tool to multinational programs in the ongoing fight against this neglected tropical disease. Ultimately this technology can be expanded for the diagnosis of other filarial and/or neglected tropical diseases.
15.
Jordan N. Grapel Domenic V. Cicchetti Fred R. Volkmar 《The Yale journal of biology and medicine》2015,88(1):69-71
In this study, we examined the frequency of sensory-related issues as reported by parents in a large sample of school-age adolescents and adults with autism/autism spectrum disorder (ASD) [1] as compared to a group of individuals receiving similar clinical evaluations for developmental/behavioral difficulties but whose final diagnoses were not on the autism spectrum. In no comparison were the features examined predictive of autism or autism spectrum in comparison to the non-ASD sample. Only failure to respond to noises had sensitivity above .75 in the comparison of the broader autism spectrum group, but specificity was poor. While sensory issues are relatively common in autism/ASD, they are also frequent in other disorders. These results question the rationale for including sensory items as a diagnostic criterion for autism.
16.
《Endocrine practice》2008,14(9):1075-1083
Objective
To identify triggers for islet neogenesis in humans that may lead to new treatments that address the underlying mechanism of disease for patients with type 1 or type 2 diabetes.
Methods
In an effort to identify bioactive human peptide sequences that might trigger islet neogenesis, we evaluated amino acid sequences within a variety of mammalian pancreas-specific REG genes. We evaluated GenBank, the Basic Local Alignment Search Tool algorithm, and all available proteomic databases and developed large-scale protein-to-protein interaction maps. Studies of peptides of interest were conducted in human pancreatic ductal tissue, followed by investigations in mice with streptozocin-induced diabetes.
Results
Our team has defined a 14-amino acid bioactive peptide encoded by a portion of the human REG3a gene we termed Human proIslet Peptide (HIP), which is well conserved among many mammals. Treatment of human pancreatic ductal tissue with HIP stimulated the production of insulin. In diabetic mice, administration of HIP improved glycemic control and significantly increased islet number. Bioinformatics analysis, coupled with biochemical interaction studies in a human pancreatic cell line, identified the human exostoses-like protein 3 (EXTL3) as a HIP-binding protein. HIP enhanced EXTL3 translocation from the membrane to the nucleus, in support of a model whereby EXTL3 mediates HIP signaling for islet neogenesis.
Conclusion
Our data suggest that HIP may be a potential stimulus for islet neogenesis and that the differentiation of new islets is a process distinct from beta cell proliferation within existing islets. Human clinical trials are soon to commence to determine the effect of HIP on generating new islets from one’s own pancreatic progenitor cells. (Endocr Pract. 2008;14:1075-1083)
17.
Jeng-Yih Wu Chun-Chia Cheng Jaw-Yuan Wang Deng-Chyang Wu Jan-Sing Hsieh Shui-Cheng Lee Wen-Ming Wang 《PloS one》2014,9(1)
Gastric cancer (GC) has a high rate of morbidity and mortality among various cancers worldwide. The development of noninvasive diagnostic methods or technologies for tracking the occurrence of GC is urgent, and the search for reliable biomarkers is being considered. This study intended to discover differential biomarkers directly from GC tissues by two-dimensional differential gel electrophoresis (2D-DIGE), and to further validate protein expression by western blotting (WB) and immunohistochemistry (IHC). Pairs of GC tissues (gastric cancer tissues and the adjacent normal tissues) obtained from surgery were investigated by 2D-DIGE. Five proteins were confirmed by WB and IHC, including glucose-regulated protein 78 (GRP78), glutathione S-transferase pi (GSTpi), apolipoprotein AI (ApoAI), alpha-1 antitrypsin (A1AT) and gastrokine-1 (GKN-1). Among the results, GRP78, GSTpi and A1AT were significantly up-regulated and down-regulated respectively in gastric cancer patients. Moreover, the protein expression of GRP78 and ApoAI was correlated with that of A1AT. This study proposes that these proteins could be candidate biomarkers for gastric cancer.
18.
Given thousands of proteins constituting a eukaryotic pathogen, the principal objective for a high-throughput in silico vaccine discovery pipeline is to select those proteins worthy of laboratory validation. Accurate prediction of T-cell epitopes on protein antigens is one crucial piece of evidence that would aid in this selection. Prediction of peptides recognised by T-cell receptors has to date proved to be of insufficient accuracy. The in silico approach is consequently reliant on an indirect method, which involves the prediction of peptides binding to major histocompatibility complex (MHC) molecules. There is no guarantee nevertheless that predicted peptide-MHC complexes will be presented by antigen-presenting cells and/or recognised by cognate T-cell receptors. The aim of this study was to determine if predicted peptide-MHC binding scores could provide contributing evidence to establish a protein’s potential as a vaccine. Using T-Cell MHC class I binding prediction tools provided by the Immune Epitope Database and Analysis Resource, peptide binding affinities to 76 common MHC I alleles were predicted for 160 Toxoplasma gondii proteins: 75 taken from published studies represented proteins known or expected to induce T-cell immune responses and 85 considered less likely vaccine candidates. The results show there is no universal set of rules that can be applied directly to binding scores to distinguish a vaccine from a non-vaccine candidate. We present, however, two proposed strategies exploiting binding scores that provide supporting evidence that a protein is likely to induce a T-cell immune response: one using random forest (a machine learning algorithm), with 72% sensitivity and 82.4% specificity, and the other using amino acid conservation scores, with 74.6% sensitivity and 70.5% specificity, when applied to the 160 benchmark proteins.
More importantly, the binding score strategies are valuable evidence contributors to the overall in silico vaccine discovery pool of evidence.
19.
20.
Peter S. Kutchukian Nadya Y. Vasilyeva Jordan Xu Mika K. Lindvall Michael P. Dillon Meir Glick John D. Coley Natasja Brooijmans 《PloS one》2012,7(11)
Medicinal chemists’ “intuition” is critical for success in modern drug discovery. Early in the discovery process, chemists select a subset of compounds for further research, often from many viable candidates. These decisions determine the success of a discovery campaign, and ultimately what kind of drugs are developed and marketed to the public. Surprisingly little is known about the cognitive aspects of chemists’ decision-making when they prioritize compounds. We investigate 1) how and to what extent chemists simplify the problem of identifying promising compounds, 2) whether chemists agree with each other about the criteria used for such decisions, and 3) how accurately chemists report the criteria they use for these decisions. Chemists were surveyed and asked to select chemical fragments that they would be willing to develop into a lead compound from a set of ∼4,000 available fragments. Based on each chemist’s selections, computational classifiers were built to model each chemist’s selection strategy. Results suggest that chemists greatly simplified the problem, typically using only 1–2 of many possible parameters when making their selections. Although chemists tended to use the same parameters to select compounds, differing value preferences for these parameters led to an overall lack of consensus in compound selections. Moreover, what little agreement there was among the chemists was largely in what fragments were undesirable. Furthermore, chemists were often unaware of the parameters (such as compound size) which were statistically significant in their selections, and overestimated the number of parameters they employed. A critical evaluation of the problem space faced by medicinal chemists and cognitive models of categorization were especially useful in understanding the low consensus between chemists.