Similar Literature
Found 20 similar documents; search took 31 ms
1.
2.
3.
K Zhou, C Ai, P Dong, X Fan, L Yang. Glycoconjugate Journal, 2012, 29(7): 551-564
In silico approaches have become an alternative way to study O-glycosylation. In this paper, we developed an interpretable linear model for O-glycosylation prediction based on an unbalanced dataset and analyzed the underlying biological knowledge of glycosylation. A training set of 4446 sites, comprising 468 positive sites and 3978 negative sites, was assembled during this research. The sites were encoded using the amino acid index (AAindex), and a forward stepwise procedure was used for feature selection. Linear discriminant analysis with equal a priori probabilities (PP-LDA) was employed to develop the interpretable model. Model performance was verified using both internal leave-one-out cross-validation and external validation. Two non-linear algorithms, the supervised support vector machine and the unsupervised self-organizing competitive neural network, were used for comparison. The PP-LDA model exhibited improved classification results, with an accuracy of 82.1% for cross-validation and 80.3% for external prediction. Further analysis of this linear model indicated that the properties at position R(1) and the properties related to hydrophobicity contributed more to the glycosylation prediction. However, the alpha and turn propensities at the C-terminus, together with physicochemical properties at the N-terminus, are also related to glycosylation activity. This model is not only capable of predicting the possibility of glycosylation from an unbalanced dataset, but also helps in understanding the underlying biological mechanisms of glycosylation. To make the prediction model publicly accessible, a downloadable program is provided in the supplementary materials.
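
The role of the equal prior on an unbalanced dataset can be illustrated with a minimal one-feature sketch. This is not the authors' program: the class sizes (468 positive, 3978 negative) come from the abstract, while the class means, the shared variance, and the function name `lda_threshold_1d` are hypothetical values chosen only for illustration.

```python
import math


def lda_threshold_1d(mu_neg, mu_pos, var, prior_neg, prior_pos):
    """Decision threshold of a two-class, one-feature LDA with shared variance.

    Assuming mu_pos > mu_neg, a site is called positive when its feature
    value exceeds the returned threshold.
    """
    midpoint = (mu_neg + mu_pos) / 2.0
    # The prior ratio shifts the boundary away from the more frequent class.
    shift = var * math.log(prior_neg / prior_pos) / (mu_pos - mu_neg)
    return midpoint + shift


# Class sizes from the abstract; means and variance are hypothetical.
n_pos, n_neg = 468, 3978
mu_neg, mu_pos, var = 0.0, 1.0, 1.0

equal = lda_threshold_1d(mu_neg, mu_pos, var, 0.5, 0.5)
empirical = lda_threshold_1d(
    mu_neg, mu_pos, var,
    n_neg / (n_pos + n_neg), n_pos / (n_pos + n_neg),
)

print(f"equal priors:     threshold = {equal:.3f}")
print(f"empirical priors: threshold = {empirical:.3f}")
```

With equal priors the boundary sits at the midpoint of the two class means. With priors taken from the class frequencies it moves toward the rarer positive class, so fewer sites are called positive, which is one plausible reason to fix equal priors when the data are unbalanced.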

4.
5.
6.
7.
In this article, we present COMSAT, a hybrid framework for residue contact prediction of transmembrane (TM) proteins, integrating a support vector machine (SVM) method and a mixed integer linear programming (MILP) method. COMSAT consists of two modules: COMSAT_SVM, which is trained mainly on position-specific scoring matrix features, and COMSAT_MILP, which is an ab initio method based on optimization models. Contacts predicted by the SVM model are ranked by SVM confidence scores, and a threshold is trained to improve the reliability of the predicted contacts. For TM proteins with no contacts above the threshold, COMSAT_MILP is used. The proposed hybrid contact prediction scheme was tested on two independent TM protein sets based on the contact definition of 14 Å between Cα-Cα atoms. First, using a rigorous leave-one-protein-out cross validation on the training set of 90 TM proteins, an accuracy of 66.8%, a coverage of 12.3%, a specificity of 99.3% and a Matthews' correlation coefficient (MCC) of 0.184 were obtained for residue pairs that are at least six amino acids apart. Second, when tested on a test set of 87 TM proteins, the proposed method showed a prediction accuracy of 64.5%, a coverage of 5.3%, a specificity of 99.4% and an MCC of 0.106. COMSAT shows satisfactory results when compared with 12 other state-of-the-art predictors, and is more robust in terms of prediction accuracy as the length and complexity of TM proteins increase. COMSAT is freely accessible at http://hpcc.siat.ac.cn/COMSAT/ . Proteins 2016; 84:332-348. © 2016 Wiley Periodicals, Inc.
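
The four evaluation measures named above can be sketched from confusion-matrix counts. This sketch assumes the convention common in contact prediction, where "accuracy" is the fraction of predicted contacts that are correct and "coverage" is the fraction of true contacts that are recovered; the abstract does not spell out its definitions, and the counts below are hypothetical rather than taken from the paper.

```python
import math


def contact_metrics(tp, fp, tn, fn):
    """Return (accuracy, coverage, specificity, mcc) for predicted contacts."""
    accuracy = tp / (tp + fp)     # assumed: correct fraction of predicted contacts
    coverage = tp / (tp + fn)     # assumed: recovered fraction of true contacts
    specificity = tn / (tn + fp)  # fraction of non-contacts correctly rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, coverage, specificity, mcc


# Hypothetical counts for a single protein (illustration only).
acc, cov, spec, mcc = contact_metrics(tp=20, fp=10, tn=1800, fn=170)
print(f"accuracy={acc:.3f} coverage={cov:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

Because non-contacts vastly outnumber contacts, specificity stays near 1 even when coverage is low, which matches the pattern of the reported figures.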

8.
9.
This report presents the use of atom-based linear indices to find functions that discriminate between tyrosinase inhibitors and inactive compounds. Discriminant models were fitted, and good overall classification rates of 93.51% and 92.46% were observed in the training set for the best models based on non-stochastic and stochastic linear indices, respectively. The external prediction sets had accuracies of 91.67% and 89.44%. In addition, the fitted models were used to screen new cycloartane compounds isolated from herbal plants. Good agreement was found between the theoretical and experimental results. These results provide a tool that can be used to identify new tyrosinase inhibitor compounds.

10.
The application of 3D-MEDNEs as a novel alternative technique to reduce the use of animal experimentation in toxicology in the early stages of medicinal chemistry research has been extended from agranulocytosis to chemically induced eosinophilia. First, a heterogeneous series of organic compounds, classified either as eosinophilia inductors or noninductors, was collected. A linear discriminant analysis was subsequently used to obtain a QSTR that gave a very good classification of 91.82% (110 chemicals in the training series). Eosinophilia inductors (88.89%) composed the first group while the other one contained only harmless compounds (97.37%). The total predictability (88.1%) was tested by means of an external validation series (42 compounds). The model correctly classifies 88.89% of harmless compounds and 87.5% of toxic ones. Finally, a comparison of predicted versus experimental results for G1 [2-bromo-5-(2-bromo-2-nitroethenyl)furan, which is a promising antibacterial-antifungal compound] illustrates the practical application of the method. A dose-dependent study of G1 (9.8-185.6 mg/kg) at 48, 72 and 96 h after oral administration in rats is reported here for the first time. The study has shown that G1 does not affect the murine eosinophil count under these conditions, a result in total agreement with the model prediction.

11.
The detection of lung cancer has special value in the diagnosis of cancer diseases. Based on nine elemental concentrations (chromium, iron, manganese, aluminum, cadmium, copper, zinc, nickel, and selenium) in urine samples and an ensemble linear discriminant analysis (ELDA), a detection method for lung cancer has been developed. A dataset containing 30 healthy samples and 27 lung cancer samples was used for the experiments. The whole dataset was first split into a training set with 29 samples and a test set with 28 samples. The prediction results from the ELDA classifier were compared with those from a single Fisher's discriminant analysis (FDA). On the test set, the ELDA classifier achieved better performance, that is, a sensitivity of 100%, a specificity of 86.7%, and an overall accuracy of 92.9%, while the FDA classifier had a sensitivity of 92.3%, a specificity of 93.3%, and an overall accuracy of 92.9%. The superiority of ELDA to FDA is ascribed to the fact that ELDA can model more nonlinear relationships through the cooperation of several single models, suggesting that ensemble modeling is more advisable in such a task.
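
The reported test-set figures can be reproduced with a short arithmetic sketch. The abstract gives only the test-set size (28); the split into 13 cancer and 15 healthy samples, and the individual error counts, are assumptions inferred here because they are consistent with the reported percentages.

```python
def rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed test-set composition: 13 cancer + 15 healthy = 28 samples.
elda = rates(tp=13, fn=0, tn=13, fp=2)
fda = rates(tp=12, fn=1, tn=14, fp=1)

for name, (sens, spec, acc) in (("ELDA", elda), ("FDA", fda)):
    print(f"{name}: sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
# ELDA: sensitivity=100.0% specificity=86.7% accuracy=92.9%
# FDA: sensitivity=92.3% specificity=93.3% accuracy=92.9%
```

Under these assumed counts both classifiers misclassify two of 28 samples, so the overall accuracy is identical and the difference lies in which class the errors fall in: ELDA misses no cancer samples, at the cost of one more false positive.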

12.
The MARCH-INSIDE methodology, combined with a statistical classification method, linear discriminant analysis (LDA), is proposed as an alternative to the Draize eye irritation test. This methodology has been successfully applied to a set of 46 neutral organic chemicals classified as ocular irritants or nonirritants. The model correctly categorizes 37 out of 46 compounds, showing an accuracy of 80.46%. Specifically, the model achieves good classification rates of 91.67% and 76.47% for irritant and nonirritant compounds, respectively. The model was validated using two cross-validation tools, leave-one-out (LOO) and leave-group-out (LGO), which gave global predictabilities of 71.7% and 70%, respectively. The agreement between the leave-one-out/leave-group-out predictions and the training-set classifications was 91.3% (42 out of 46 cases)/89.1% (41 out of 46 cases), demonstrating the robustness of the model. An ocular irritancy distribution diagram was built to determine the intervals of the property in which the probability of finding an irritant compound is maximal relative to that of finding a false nonirritant one. To date, the present model appears to be the first predictive linear discriminant equation able to discriminate between eye irritant and nonirritant chemicals.
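
The two validation schemes differ only in how many compounds are held out at a time, which a minimal index-splitting sketch can show. The number of compounds (46) comes from the abstract; the number of groups and the interleaved group assignment are assumptions, since the abstract does not say how the groups were formed.

```python
def leave_one_out(n_samples):
    """Yield (train_indices, test_indices) with one sample held out per split."""
    for i in range(n_samples):
        yield [j for j in range(n_samples) if j != i], [i]


def leave_group_out(n_samples, n_groups):
    """Yield splits in which each of n_groups interleaved groups is held out once."""
    for g in range(n_groups):
        test = list(range(g, n_samples, n_groups))
        held_out = set(test)
        yield [j for j in range(n_samples) if j not in held_out], test


n = 46  # number of compounds in the abstract
loo_splits = list(leave_one_out(n))
lgo_splits = list(leave_group_out(n, n_groups=5))  # group count is an assumption

print(len(loo_splits), "LOO splits;", len(lgo_splits), "LGO splits")
```

In both schemes every compound is predicted exactly once by a model that never saw it, so the per-compound predictions can be compared directly with the training-set classifications.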

13.
14.
There are many pathogenic parasite species with different susceptibility profiles to antiparasitic drugs. Unfortunately, almost all QSAR models predict the biological activity of drugs against only one parasite species. Consequently, predicting the probability that a drug is active against different species with a single unified model is a goal of major importance. To this end, we use Markov chain theory to calculate new multi-target spectral moments and fit, for the first time, an mt-QSAR model for 500 drugs tested in the literature against 16 parasite species and another 207 drugs not tested in the literature. The data were processed by linear discriminant analysis (LDA), classifying drugs as active or non-active against the different parasite species tested. The model correctly classifies 311 out of 358 active compounds (86.9%) and 2328 out of 2577 non-active compounds (90.3%) in the training series. Overall training performance was 89.9%. The model was validated by means of external prediction series, in which it correctly classified 157 out of 190 antiparasitic compounds (82.6%) and 1151 out of 1277 non-active compounds (90.1%). Overall predictability was 89.2%. In addition, we developed four types of non-linear artificial neural networks (ANNs) and compared them with the mt-QSAR model. The improved ANN model had an overall training performance of 87%. The present work reports the first attempt to calculate, within a unified framework, the probabilities of antiparasitic action of drugs against different parasite species based on spectral moment analysis.
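
The class-wise and overall percentages follow directly from the counts quoted in the abstract, as the following short check shows. Only the counts are taken from the source; the helper names `pct` and `summarize` are illustrative.

```python
def pct(correct, total):
    """Percentage of correctly classified cases."""
    return 100.0 * correct / total


# (correctly classified, total) per class, as quoted in the abstract.
training = {"active": (311, 358), "non-active": (2328, 2577)}
external = {"active": (157, 190), "non-active": (1151, 1277)}


def summarize(counts):
    """Return per-class and overall percentages, rounded to one decimal."""
    summary = {name: round(pct(c, t), 1) for name, (c, t) in counts.items()}
    correct = sum(c for c, _ in counts.values())
    total = sum(t for _, t in counts.values())
    summary["overall"] = round(pct(correct, total), 1)
    return summary


train_summary = summarize(training)
external_summary = summarize(external)
print("training:", train_summary)
print("external:", external_summary)
```

The overall figure is weighted by class size, so it lies much closer to the non-active rate because non-active cases outnumber active ones by roughly seven to one.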

15.
A new application of TOPological Sub-structural MOlecular DEsign (TOPS-MODE) to anti-inflammatory compounds was carried out using computer-aided molecular design. Two series of compounds, one containing anti-inflammatory and the other non-anti-inflammatory compounds, were processed by k-means cluster analysis in order to design the training and prediction sets. A linear classification function was developed to discriminate the anti-inflammatory from the inactive compounds. The model correctly classified 88% of active and 91% of inactive compounds in the training set, for a good global classification of 90% (399 cases out of 441). In the prediction set, the model showed an overall predictability of 88% and 84% for active and inactive compounds, respectively, with a global percentage of good classification of 85%. Furthermore, this paper describes a fragment analysis to determine the contribution of several fragments to the anti-inflammatory property; the presence of halogens in the selected fragments was also analyzed. The present TOPS-MODE-based QSAR appears to be the first general 'in silico' alternative to experimentation in anti-inflammatory discovery.

16.
A novel method for the in silico selection of flukicidal drugs is introduced. Two QSARs are developed: the first discriminates between fasciolicide and non-fasciolicide drugs, and the second allows some conclusions to be drawn about the possible mechanism of action of a chemical. The first model correctly classified 93.85% of compounds in the training series and 89.5% of the compounds in the prediction series. This model correctly classified 87.7, 93.8, 92.2 and 93.9% of compounds in leave-n-out cross-validation procedures when n takes values from 2 to 6. The model appears to stabilize at around 92% good classification in leave-n-out cross-validation when n>6. The second model correctly classified 70% of non-fasciolicide compounds, 85.71% of beta-tubulin inhibitors and 100% of proton ionophores in the training set. This model recognizes as proton ionophores 100% of the nitrosalicylanilides in the prediction series. Both models have a low p-level (<0.05). Finally, an in vivo experimental assay of six organic chemicals allowed an assessment of the model, with 100% agreement between experiment and theoretical prediction.

17.
18.
19.
20.