Similar articles

Found 20 similar articles (search time: 15 ms)

1.

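Relationship between the penetrating activity and sequence features of cell-penetrating peptides

Cao Zanxia, Dong Chuan, Zhao Liling, Wang Jihua 《Progress in Biochemistry and Biophysics》(生物化学与生物物理进展) 2016,43(1):75-82

Cell-penetrating peptides (CPPs) are short peptides that readily cross the cell membrane. Such molecules, especially CPPs with targeting functions, offer hope for delivering drugs to target cells efficiently, so studying them is of value to biomedicine. This work examines CPPs with different penetrating activities at the sequence level. It seeks the factors that influence CPP penetrating activity and the sequence differences between CPPs of different activity and non-penetrating peptides (Non-CPPs), and it introduces a method for analyzing biological sequences. We obtained CPP and Non-CPP sequences from the CPPsite database and from various publications, and then extracted peptides with high, medium and low penetrating activity (HCPPs, MCPPs, LCPPs) from the CPP sequences to build datasets. Based on these datasets we carried out the following studies. First, using analysis of variance, we analyzed the amino acid and secondary structure composition of CPPs of different activity and of Non-CPPs. We found that the electrostatic and hydrophobic interactions of amino acids strongly influence CPP penetrating activity, and that helical structure and random coil also affect it. Second, using physicochemical properties and length, we displayed CPPs of different activity on a two-dimensional plane and found that, for certain properties, CPPs of different activity and Non-CPPs form clusters: HCPPs, MCPPs, and LCPPs together with Non-CPPs were separated into three clusters, which shows the differences among them. Finally, we introduce the concept of the physicochemical centroid of a biological sequence: the residues that make up a sequence are treated as point masses, so the sequence is abstracted as a system of point masses. We applied this method to CPPs and used PCA to project CPPs of different activity into three-dimensional space. The vast majority of CPPs clustered together, while some LCPPs clustered with Non-CPPs. This work is of value for CPP design and for understanding the sequence differences among CPPs of different activity. In addition, the physicochemical centroid analysis introduced here can be applied to other biological problems, and the centroids can serve as input parameters for certain biological classification problems, playing a role in pattern recognition.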
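The abstract does not give the formula for the physicochemical centroid, so the following is only a minimal sketch of one plausible reading: each residue is a point mass whose "mass" is a physicochemical property value and whose coordinate is its normalized position in the sequence. The function name `physchem_centroid`, the choice of the Kyte-Doolittle hydropathy scale, and the `shift` that keeps all weights non-negative are assumptions made for illustration, not the authors' definition.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
# Using this scale as the residue "mass" is an assumption for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def physchem_centroid(seq, scale=KYTE_DOOLITTLE, shift=4.5):
    """Weighted mean of normalized residue positions (hypothetical definition).

    Each residue i (1-based) sits at coordinate i / len(seq) and carries the
    weight scale[residue] + shift. The shift makes every weight non-negative
    so the centroid stays inside the sequence.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    weights = [scale[aa] + shift for aa in seq]
    total = sum(weights)
    if total == 0:
        raise ValueError("all residue weights are zero; centroid undefined")
    n = len(seq)
    return sum(((i + 1) / n) * w for i, w in enumerate(weights)) / total
```

Under these assumptions, one centroid per property scale gives a short numeric vector per peptide, which could then be passed to PCA or a classifier as the abstract describes.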

2.

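Internalization mechanisms of cell-penetrating peptides and their applications

Zhang Lanxin, Zhang Shuxiang 《Chinese Journal of Biochemistry and Molecular Biology》(中国生物化学与分子生物学报) 2008,24(12):1092-1096

Cell-penetrating peptides are a class of peptides with a special membrane-crossing function: they can carry other molecules, and even supramolecular particles, across the membrane into the cell. Early studies held that their entry is a non-classical endocytic process that requires no receptor and shows no saturation. Recent studies indicate that the penetration mechanism may depend strongly on the types of amino acids the peptide contains. The penetration process is now described as macropinocytosis, which closely resembles conventional endocytosis. Other routes of entry may also exist that have not yet been demonstrated or discovered. Applications of cell-penetrating peptides attract the greatest interest, and research in many fields is under way and continues to progress. In both biology and medicine, cell-penetrating peptides are regarded as peptides with great development potential.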

3.

PSP_MCSVM: brainstorming consensus prediction of protein secondary structures using two-stage multiclass support vector machines

Chatterjee P Basu S Kundu M Nasipuri M Plewczynski D 《Journal of molecular modeling》2011,17(9):2191-2201

Secondary structure prediction is a crucial task for understanding the variety of protein structures and the biological functions they perform. Prediction of secondary structures for new proteins using their amino acid sequences is of fundamental importance in bioinformatics. We propose a novel technique to predict protein secondary structures based on position-specific scoring matrices (PSSMs) and physicochemical properties of amino acids. It is a two-stage approach involving multiclass support vector machines (SVMs) as classifiers for three different structural conformations, viz., helix, sheet and coil. In the first stage, PSSMs obtained from PSI-BLAST and five specially selected physicochemical properties of amino acids are fed into SVMs as features for sequence-to-structure prediction. Confidence values for forming helix, sheet and coil that are obtained from the first-stage SVM are then used in the second-stage SVM for performing structure-to-structure prediction. The two-stage cascaded classifiers (PSP_MCSVM) are trained with proteins from the RS126 dataset. The classifiers are finally tested on target proteins of the critical assessment of protein structure prediction experiment-9 (CASP9). PSP_MCSVM with the brainstorming consensus procedure performs better than prediction servers such as Predator, DSC and SIMPA96 for randomly selected proteins from CASP9 targets. The overall performance is found to be comparable with the current state of the art. PSP_MCSVM source code, train-test datasets and supplementary files are freely available in the public domain.

4.

Mitochondrial targeting of a cationic amphiphilic polyproline helix

Kalafut D Anderson TN Chmielewski J 《Bioorganic & medicinal chemistry letters》2012,22(1):561-563

The development of cell penetrating peptides (CPPs) for the cellular delivery of attached cargo is an area of growing interest. Many CPPs, however, are found trapped within endosomes, thereby limiting their use as drug delivery agents with sub-cellular applications. Herein, we detail the properties of a highly efficient class of CPPs, cationic amphiphilic polyproline helices (CAPHs), that are found localized to the mitochondria by direct transport into cells.

5.

A novel numerical model for protein sequences analysis based on spherical coordinates and multiple physicochemical properties of amino acids

Jia-Feng Yu Ang Qu Hu-Cheng Tang Fang-Hua Wang Chun-Ling Wang Hong-Mei Wang Ji-Hua Wang Huai-Qiu Zhu 《Biopolymers》2019,110(8):e23282

How to characterize short protein sequences to make an effective connection to their functions is an unsolved problem. Here we propose to map the physicochemical properties of each amino acid onto unit spheres so that each protein sequence can be represented quantitatively. We demonstrate the usefulness of this representation by applying it to the prediction of cell penetrating peptides. We show that its combination with traditional composition features yields the best performance across different datasets, among several methods compared. For the convenience of users, a web server has been established for automatic calculations of the proposed features at http://biophy.dzu.edu.cn/SNumD/ .

6.

Peptide internalization enabled by folding: triple helical cell‐penetrating peptides

Aparna Shinde Katie M. Feher Chloe Hu Katarzyna Slowinska 《Journal of peptide science》2015,21(2):77-84

Cell‐penetrating peptides (CPPs) are known as efficient transporters of molecular cargo across cellular membranes. Their properties make them ideal candidates for in vivo applications. However, challenges in the development of effective CPPs still exist: CPPs are often rapidly degraded by proteases, and the high concentrations of CPPs required for cargo transport can cause cytotoxicity. It was previously shown that restricting peptide flexibility can improve peptide stability against enzymatic degradation and that limiting the length of the CPP can lower cytotoxic effects. Here, we present peptides (30‐mers) that efficiently penetrate cellular membranes by combining very short CPP sequences and collagen‐like folding domains. The CPP domains are hexa‐arginine (R₆) or arginine/glycine (RRGRRG). Folding is achieved through multiple proline‐hydroxyproline‐glycine repeats, (POG)_n, that form a collagen‐like triple helical conformation. The folded peptides with CPP domains are efficiently internalized, show stability against enzymatic degradation in human serum and have minimal toxicity. Peptides lacking correct folding (random coil) or CPP domains are unable to cross cellular membranes. These features make triple helical cell‐penetrating peptides promising candidates for efficient transporters of molecular cargo across cellular membranes. Copyright © 2014 European Peptide Society and John Wiley & Sons, Ltd.

7.

Predicting protein structural class by SVM with class-wise optimized features and decision probabilities

Anand A Pugalenthi G Suganthan PN 《Journal of theoretical biology》2008,253(2):375-380

Determination of protein structural class solely from sequence information is a challenging task. Several attempts to solve this problem using various methods can be found in the literature. We present a support vector machine (SVM) approach in which probability-based decisions are used along with class-wise optimized feature sets. This approach has two characteristics that distinguish it from earlier attempts: (1) it uses class-wise optimized features and (2) decisions of different SVM classifiers are coupled with probability estimates to make the final prediction. The algorithm was tested on three datasets, containing 498 domains, 1092 domains and 5261 domains. Ten-fold external cross-validation was performed to assess the performance of the algorithm. A high accuracy of 92.89% was obtained for the 498-dataset. We achieved 54.67% accuracy for the dataset with 1092 domains, which is better than the previously reported best accuracy of 53.8%. We obtained 59.43% prediction accuracy for the larger and less redundant 5261-dataset. We also investigated the advantage of using class-wise features over the union of these features (the conventional approach) in a one-vs.-all SVM framework. Our results clearly show the advantage of using class-wise optimized features. Brief analysis of the selected class-wise features indicates their biological significance.

8.

Fatty acyl moieties: improving Pro-rich peptide uptake inside HeLa cells.

J Fernández-Carneado M J Kogan N Van Mau S Pujals C López-Iglesias F Heitz E Giralt 《The journal of peptide research》2005,65(6):580-590

In the field of drug delivery there has been a continuous study of powerful delivery systems to aid non permeable drugs in reaching their intracellular target. Among the systems explored are cell penetrating peptides (CPPs), which first garnered interest a decade ago when the interesting translocation properties of the pioneer CPPs Tat and Antp were described. A new family of CPPs has recently been described as non cytotoxic Pro-rich vectors with favorable profiles for internalization in HeLa cells. Fatty acyl moieties that can tune a peptide's interaction with the lipophilic environment of a cell membrane have been incorporated into the Pro-rich sequence. Improvements in cellular uptake of peptides modified with fatty acyl groups, as studied by confocal microscopy and flow cytometry, as well as the results obtained by the interaction of these peptides with a model dioleoylphosphatidylcholine (DOPC) membrane and transmission electron microscopy (TEM), illustrate the importance of the fatty acyl moieties for efficient internalization.

9.

Global discriminative learning for higher-accuracy computational gene prediction

Bernal A Crammer K Hatzigeorgiou A Pereira F 《PLoS computational biology》2007,3(3):e54

Most ab initio gene predictors use a probabilistic sequence model, typically a hidden Markov model, to combine separately trained models of genomic signals and content. By combining separate models of relevant genomic features, such gene predictors can exploit small training sets and incomplete annotations, and can be trained fairly efficiently. However, that type of piecewise training does not optimize prediction accuracy and has difficulty in accounting for statistical dependencies among different parts of the gene model. With genomic information being created at an ever-increasing rate, it is worth investigating alternative approaches in which many different types of genomic evidence, with complex statistical dependencies, can be integrated by discriminative learning to maximize annotation accuracy. Among discriminative learning methods, large-margin classifiers have become prominent because of the success of support vector machines (SVM) in many classification tasks. We describe CRAIG, a new program for ab initio gene prediction based on a conditional random field model with semi-Markov structure that is trained with an online large-margin algorithm related to multiclass SVMs. Our experiments on benchmark vertebrate datasets and on regions from the ENCODE project show significant improvements in prediction accuracy over published gene predictors that use intrinsic features only, particularly at the gene level and on genes with long introns.

10.

Properties of cell penetrating peptides (CPPs)

Kerkis A Hayashi MA Yamane T Kerkis I 《IUBMB life》2006,58(1):7-13

Different approaches have been developed for the introduction of macromolecules, proteins and DNA into target cells. Viral (retroviruses, lentiviruses, etc.) and nonviral (liposomes, bioballistics, etc.) vectors as well as lipid particles have been tested as DNA delivery systems. However, all of them share several undesirable effects that are difficult to overcome, such as unwanted immune responses and limited cell targeting. The discovery of cell penetrating peptides (CPPs), which act as carriers of macromolecules and enhancers of viral vectors, opened new opportunities for the delivery of biologically active cargos, including therapeutically relevant genes, into various cells and tissues. This review summarizes recent data about the best characterized CPPs as well as those sharing cell-penetrating and cargo delivery properties despite differing in primary sequence. The putative mechanisms of CPP penetration into cells and interaction with intracellular structures such as chromosomes, cytoskeleton and centrioles are addressed. We further discuss recent developments in overcoming the lack of cell specificity, one of the main obstacles to CPP application in gene therapy. In particular, we review a newly discovered affinity of CPPs for actively proliferating cells.

11.

Transduction of peptides and proteins into live cells by cell penetrating peptides

Mussbach F Franke M Zoch A Schaefer B Reissmann S 《Journal of cellular biochemistry》2011,112(12):3824-3833

Internalization of peptides and proteins into live cells is an essential prerequisite for studies on intracellular signal pathways, for treatment of certain microbial diseases and for signal transduction therapy, especially for cancer treatment. Cell penetrating peptides (CPPs) facilitate the transport of cargo-proteins through the cell membrane into live cells. CPPs which allow formation of non-covalent complexes with the cargo are used primarily in this study due to the relatively easy handling procedure. Efficiency of the protein uptake is estimated qualitatively by fluorescence microscopy and quantitatively by SDS-PAGE. Using the CPP cocktail JBS-Proteoducin, the intracellular concentrations of a secondary antibody and bovine serum albumin can reach the micromolar range. Internalization of antibodies allows mediation of intracellular pathways including knock down of signal transduction. The high specificity and affinity of antibodies makes them potentially more powerful than siRNA. Thus, CPPs represent a significant new possibility to study signal transduction processes in competition or in comparison to the commonly used other techniques. To estimate the highest attainable intracellular concentrations of cargo proteins, the CPPs are tested for cytotoxicity. Cell viability and membrane integrity relative to concentration of CPPs are investigated. Viability as estimated by the reductive activity of mitochondria (MTT-test) is more sensitive to higher concentrations of CPPs versus membrane integrity, as measured by the release of dead cell protease. Distinct differences in uptake efficiency and cytotoxic effects are found using six different CPPs and six different adhesion and suspension cell lines.

12.

Potential efficacy of cell-penetrating peptides for nucleic acid and drug delivery in cancer

Bolhassani A 《Biochimica et biophysica acta》2011,1816(2):232-246

Cell penetrating peptides (CPPs) are short amphipathic and cationic peptides that are rapidly internalized across cell membranes. They can be used to deliver molecular cargo, such as imaging agents (fluorescent dyes and quantum dots), drugs, liposomes, peptide/protein, oligonucleotide/DNA/RNA, nanoparticles and bacteriophage into cells. The CPP used, the attached cargo, the concentration and the cell type all significantly affect the mechanism of internalization. The mechanism of cellular uptake and subsequent processing still remains controversial. It is now clear that CPPs can mediate intracellular delivery via both endocytic and non-endocytic pathways. In addition, the orientation of the peptide and cargo and the type of linkage are likely important. In gene therapy, the designed cationic peptides must be able to 1) tightly condense DNA into small, compact particles; 2) target the condensate to specific cell surface receptors; 3) induce endosomal escape; and 4) target the DNA cargo to the nucleus for gene expression. Other studies have demonstrated that these small peptides can be conjugated to tumor homing peptides in order to achieve tumor-targeted delivery in vivo. On the other hand, one of the major aims in molecular cancer research is the development of new therapeutic strategies and compounds that directly target the genetic and biochemical agents of malignant transformation. For example, cell penetrating peptide aptamers might disrupt protein-protein interactions crucial for cancer cell growth or survival. In this review, we discuss potential functions of CPPs, especially for drug and gene delivery in cancer, and indicate their powerful promise for clinical efficacy.

13.

Secondary structure of cell-penetrating peptides controls membrane interaction and insertion

Emelía Eiríksdóttir Karidia Konate Ülo Langel Gilles Divita Sébastien Deshayes 《Biochimica et Biophysica Acta (BBA) - Biomembranes》2010,1798(6):1119-1128

The clinical use of efficient therapeutic agents is often limited by the poor permeability of the biological membranes. In order to enhance their cell delivery, short amphipathic peptides called cell-penetrating peptides (CPPs) have been intensively developed for the last two decades. CPPs are based either on protein transduction domains, model peptide or chimeric constructs and have been used to deliver cargoes into cells through either covalent or non-covalent strategies. Although several parameters are simultaneously involved in their internalization mechanism, recent focuses on CPPs suggested that structural properties and interactions with membrane phospholipids could play a major role in the cellular uptake mechanism. In the present work, we report a comparative analysis of the structural plasticity of 10 well-known CPPs as well as their ability to interact with phospholipid membranes. We propose a new classification of CPPs based on their structural properties, affinity for phospholipids and internalization pathways already reported in the literature.

14.

Chemical-Functional Diversity in Cell-Penetrating Peptides

Sofie Stalmans Evelien Wynendaele Nathalie Bracke Bert Gevaert Matthias D’Hondt Kathelijne Peremans Christian Burvenich Bart De Spiegeleer 《PloS one》2013,8(8)

15.

Analysis of recursive gene selection approaches from microarray data

Li F Yang Y 《Bioinformatics (Oxford, England)》2005,21(19):3741-3747

MOTIVATION: Finding a small subset of the most predictive genes from microarray data for disease prediction is a challenging problem. Support vector machines (SVMs) have been found to be successful with a recursive procedure in selecting important genes for cancer prediction. However, it is not well understood how much of the success depends on the choice of the specific classifier and how much on the recursive procedure. We answer this question by examining multiple classifiers [SVM, ridge regression (RR) and Rocchio] with feature selection in recursive and non-recursive settings on three DNA microarray datasets (ALL-AML Leukemia data, Breast Cancer data and GCM data). RESULTS: We found recursive RR most effective. On the ALL-AML dataset, it achieved zero error rate on the test set using only three genes (selected from over 7000), which is more encouraging than the best published result (zero error rate using 8 genes by recursive SVM). On the Breast Cancer dataset and the two largest categories of the GCM dataset, the results achieved by recursive RR are also very encouraging. A further analysis of the experimental results shows that different classifiers penalize redundant features to different extents and that this property plays an important role in the recursive feature selection process. The RR classifier tends to penalize redundant features to a much larger extent than the SVM does. This may be the reason why recursive RR performs better in selecting genes.

16.

Cell penetrating peptide modulation of membrane biomechanics by Molecular dynamics

《Journal of biomechanics》2018

The efficacy of a pharmaceutical treatment is often limited by inadequate membrane permeability, which prevents drugs from reaching their specific intracellular targets. Cell penetrating peptides (CPPs) are able to carry various types of cargo, including drugs and nanoparticles, across the cell membrane. However, CPP internalization mechanisms are not yet fully understood and depend on a wide variety of factors. In this context, the entry of a CPP into the lipid bilayer might induce molecular conformational changes, including marked variations in the membrane's mechanical properties. Understanding how a CPP influences the mechanical properties of the cell membrane is crucial to designing, engineering and improving new and existing penetrating peptides. Here, all-atom Molecular Dynamics (MD) simulations were used to investigate the interaction between different types of CPPs embedded in a lipid bilayer of dioleoyl phosphatidylcholine (DOPC). In greater detail, we systematically examined how CPP properties modulate the membrane bending modulus. Our findings show that CPP hydropathy correlates strongly with the penetration of water molecules into the lipid bilayer, supporting the hypothesis that the amount of water each CPP can carry into the membrane is modulated by the hydrophobic and hydrophilic character of the peptide. Water penetration promoted by CPPs leads to a local decrease of the lipid order, which emerges macroscopically as a reduction of the membrane bending modulus.

17.

Machine learning study of classifiers trained with biophysiochemical properties of amino acids to predict fibril forming Peptide motifs

Nair SS Subba Reddy NV Hareesha KS 《Protein and peptide letters》2012,19(9):917-923

It is important to understand the cause of amyloid illnesses by predicting the short protein fragments capable of forming amyloid-like fibril motifs, which aids the discovery of sequence-targeted anti-aggregation drugs. Because molecular techniques for identifying such fragments are limited, computational tools that provide affordable in silico predictions are highly desirable. In this research article, we study, from a machine learning perspective, the performance of several classifiers that use heterogeneous features based on biochemical and biophysical properties of amino acids to discriminate between amyloidogenic and non-amyloidogenic regions in peptides. Four conventional machine learning classifiers, namely Support Vector Machine, Neural network, Decision tree and Random forest, were trained and tested to find the classifier that best fits the problem domain. Prior to classification, novel implementations of two biologically inspired feature optimization techniques, based on evolutionary algorithms and on methodologies that mimic social life, and a multivariate method based on projection are used to remove unimportant and uninformative features. Among the dimensionality reduction algorithms considered in the study, prediction results show that algorithms based on evolutionary computation are the most effective. Among the classifiers considered, SVM best suits the problem domain. The best classifier is also compared with an online predictor to demonstrate the balance between true positive rates and false positive rates in the proposed classifier. This exploratory study suggests that these methods are promising for amyloidogenicity prediction and may be further extended to large-scale proteomic studies.

18.

Cell Penetrating Peptides and Cationic Antibacterial Peptides: TWO SIDES OF THE SAME COIN*

Jonathan G. Rodriguez Plaza Rosmarbel Morales-Nava Christian Diener Gabriele Schreiber Zyanya D. Gonzalez Maria Teresa Lara Ortiz Ivan Ortega Blake Omar Pantoja Rudolf Volkmer Edda Klipp Andreas Herrmann Gabriel Del Rio 《The Journal of biological chemistry》2014,289(21):14448-14457

Cell penetrating peptides (CPP) and cationic antibacterial peptides (CAP) have similar physicochemical properties and yet it is not understood how such similar peptides display different activities. To address this question, we used Iztli peptide 1 (IP-1) because it has both CPP and CAP activities. Combining experimental and computational modeling of the internalization of IP-1, we show it is not internalized by receptor-mediated endocytosis, yet it permeates into many different cell types, including fungi and human cells. We also show that IP-1 makes pores in the presence of high electrical potential at the membrane, such as those found in bacteria and mitochondria. These results provide the basis to understand the functional redundancy of CPPs and CAPs.

19.

Mem-PHybrid: hybrid features-based prediction system for classifying membrane protein types

Hayat M Khan A 《Analytical biochemistry》2012,424(1):35-44

Membrane proteins are a major class of proteins and are encoded by approximately 20% to 30% of genes in most organisms. In this work, a novel two-layer membrane protein prediction system, called Mem-PHybrid, is proposed. In the first layer, it identifies the query protein as a membrane or nonmembrane protein. In the second layer, it further identifies the type of membrane protein. The proposed Mem-PHybrid prediction system is based on hybrid features, whereby a fusion of both the physicochemical and split amino acid composition-based features is performed. This enables Mem-PHybrid to exploit the discrimination capabilities of both types of feature extraction strategy. In addition, minimum redundancy and maximum relevance has been applied to reduce the dimensionality of the feature vector. We employ random forest, evidence-theoretic K-nearest neighbor, and support vector machine (SVM) as classifiers and analyze their performance on two datasets. SVM using hybrid features yields the highest accuracy of 89.6% and 97.3% on dataset1 and 91.5% and 95.5% on dataset2 for jackknife and independent dataset tests, respectively. The enhanced prediction performance of Mem-PHybrid is largely attributed to the exploitation of the discrimination power of the hybrid features and of the learning capability of SVM. Mem-PHybrid is accessible at http://www.111.68.99.218/Mem-PHybrid.
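The feature fusion described above can be sketched as follows. This is a hedged illustration, not the Mem-PHybrid implementation: split amino acid composition is taken here to mean the residue frequencies of the N-terminal, middle and C-terminal segments computed separately, and the segment length of 25 residues, the function names and the `physchem` argument are all assumptions introduced for the example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(segment):
    """Fraction of each standard amino acid in a segment (zeros if empty)."""
    if not segment:
        return [0.0] * len(AMINO_ACIDS)
    return [segment.count(aa) / len(segment) for aa in AMINO_ACIDS]


def split_aa_composition(seq, n_term=25, c_term=25):
    """60-dimensional vector: composition of N-terminal, middle, C-terminal parts.

    The 25-residue terminal segments are an assumed setting, not a value
    taken from the abstract.
    """
    if len(seq) < n_term + c_term:
        raise ValueError("sequence shorter than the two terminal segments")
    head = seq[:n_term]
    middle = seq[n_term:len(seq) - c_term]
    tail = seq[len(seq) - c_term:]
    return composition(head) + composition(middle) + composition(tail)


def hybrid_features(seq, physchem):
    """Concatenate precomputed physicochemical features with the split composition."""
    return list(physchem) + split_aa_composition(seq)
```

A vector built this way would then go through feature selection and into a classifier; those stages are outside this sketch.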

20.

Protein-RNA interface residue prediction using machine learning: an assessment of the state of the art

Walia RR Caragea C Lewis BA Towfic FG Terribilini M El-Manzalawy Y Dobbs D Honavar V 《BMC bioinformatics》2012,13(1):89

ABSTRACT: BACKGROUND: RNA molecules play diverse functional and structural roles in cells. They function as messengers for transferring genetic information from DNA to proteins, as the primary genetic material in many viruses, as catalysts (ribozymes) important for protein synthesis and RNA processing, and as essential and ubiquitous regulators of gene expression in living organisms. Many of these functions depend on precisely orchestrated interactions between RNA molecules and specific proteins in cells. Understanding the molecular mechanisms by which proteins recognize and bind RNA is essential for comprehending the functional implications of these interactions, but the recognition 'code' that mediates interactions between proteins and RNA is not yet understood. Success in deciphering this code would dramatically impact the development of new therapeutic strategies for intervening in devastating diseases such as AIDS and cancer. Because of the high cost of experimental determination of protein-RNA interfaces, there is an increasing reliance on statistical machine learning methods for training predictors of RNA-binding residues in proteins. However, because of differences in the choice of datasets, performance measures, and data representations used, it has been difficult to obtain an accurate assessment of the current state of the art in protein-RNA interface prediction. RESULTS: We provide a review of published approaches for predicting RNA-binding residues in proteins and a systematic comparison and critical assessment of protein-RNA interface residue predictors trained using these approaches on three carefully curated non-redundant datasets. We directly compare two widely used machine learning algorithms (Naive Bayes (NB) and Support Vector Machine (SVM)) using three different data representations in which features are encoded using either sequence- or structure-based windows. 
Our results show that (i) Sequence-based classifiers that use a position-specific scoring matrix (PSSM)-based representation (PSSMSeq) outperform those that use an amino acid identity based representation (IDSeq) or a smoothed PSSM (SmoPSSMSeq); (ii) Structure-based classifiers that use smoothed PSSM representation (SmoPSSMStr) outperform those that use PSSM (PSSMStr) as well as sequence identity based representation (IDStr). PSSMSeq classifiers, when tested on an independent test set of 44 proteins, achieve performance that is comparable to that of three state-of-the-art structure-based predictors (including those that exploit geometric features) in terms of Matthews Correlation Coefficient (MCC), although the structure-based methods achieve substantially higher Specificity (albeit at the expense of Sensitivity) compared to sequence-based methods. We also find that the expected performance of the classifiers on a residue level can be markedly different from that on a protein level. Our experiments show that the classifiers trained on three different non-redundant protein-RNA interface datasets achieve comparable cross-validation performance. However, we find that the results are significantly affected by differences in the distance threshold used to define interface residues. CONCLUSIONS: Our results demonstrate that protein-RNA interface residue predictors that use a PSSM-based encoding of sequence windows outperform classifiers that use other encodings of sequence windows. While structure-based methods that exploit geometric features can yield significant increases in the Specificity of protein-RNA interface residue predictions, such increases are offset by decreases in Sensitivity. 
These results underscore the importance of comparing alternative methods using rigorous statistical procedures, multiple performance measures, and datasets that are constructed based on several alternative definitions of interface residues and redundancy cutoffs as well as including evaluations on independent test sets into the comparisons.
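The abstract compares predictors by Sensitivity, Specificity and the Matthews Correlation Coefficient (MCC). As a minimal sketch, these measures can be computed from residue-level confusion counts using their standard definitions; the function name `interface_metrics` and the convention of returning 0.0 when a denominator is zero are choices made for this example, not taken from the paper.

```python
import math


def interface_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and MCC from confusion-matrix counts.

    tp/fn count true interface residues predicted as interface / non-interface;
    tn/fp count non-interface residues predicted as non-interface / interface.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; 0.0 is returned by convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc
```

Because MCC uses all four counts, it stays informative when interface residues are a small minority, which is why a gain in Specificity paid for by lower Sensitivity does not automatically raise it.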
