Similar literature
 20 similar articles found
1.
2.
According to the concept of ritualization, acquisition of signal function (i.e., species recognition) leads to formalization of specific signal structure (i.e., species-specific pattern). By contrast, loss of species recognition may result in specific pattern collapse of the signal, from a temporally regular pattern to an irregular form. Several studies have reported the loss of species recognition driven by release from selective pressure against hybridization under isolation from close relatives, but no study has reported collapse of a regular signal pattern that serves in species recognition. In nocturnal lizards of the genus Gekko, several species pairs hybridize when brought together, whereas other species pairs never hybridize, despite being sympatric. I investigated the relationship between the existence of natural hybridization, courtship call similarity and species recognition ability in eight species of Gekko. I found that males of all eight species court using calls, and four of these species have species-specific regular patterns in the courtship calls, whereas the other four have no temporal patterns. In playback experiments, females of the species-specific pattern call species discriminated species by the call. On the other hand, in the patternless call species, females did not discriminate species by the call, suggesting that natural hybridization has resulted from the loss of species recognition ability in the patternless call species. In insular Gekko isolated from congeners, acoustic species recognition might have been lost, followed by pattern collapse of the courtship calls. This unique example of signal pattern collapse may provide a key to answering basic questions concerning the causes and consequences of signal evolution.

3.
Multivariate time-series (MTS) data are prevalent in diverse domains and often high dimensional. We propose new random projection ensemble classifiers for high-dimensional MTS. The method first applies dimension reduction in the time domain by randomly projecting the time-series variables into some low-dimensional space, followed by measuring the disparity via some novel base classifier between the data and the candidate generating processes in the projected space. Our contributions are twofold: (i) we derive optimal weighted majority voting schemes for pooling information from the base classifiers for multiclass classification and (ii) we introduce new base frequency-domain classifiers based on Whittle likelihood (WL), Kullback-Leibler (KL) divergence, eigen-distance (ED), and Chernoff (CH) divergence. Simulations for binary and multiclass problems and an electroencephalogram (EEG) application demonstrate the efficacy of the proposed methods in constructing accurate classifiers for high-dimensional MTS.

4.
G-protein coupled receptors (GPCRs) are a family of membrane proteins that serves as an interface between the cell and the outside world. They are involved in various physiological processes and are the targets of more than 50% of marketed drugs. The function of GPCRs can be determined by conducting biological experiments. However, given the rapid increase of GPCR sequences entering databanks, it is very time consuming and expensive to determine their function based only on experimental techniques. Hence, the computational prediction of GPCRs is in high demand for both pharmaceutical and educational research. Feature extraction of GPCRs in the proposed research is performed using three techniques: pseudo amino acid composition, wavelet-based multi-scale energy, and evolutionary-information-based feature extraction utilizing position-specific scoring matrices. For classification, a majority-voting-based ensemble method is used, whose weights are optimized using a genetic algorithm. Four classifiers are used in the ensemble: Nearest Neighbor, Probabilistic Neural Network, Support Vector Machine, and Grey Incidence Degree. The performance of the proposed method is assessed using the jackknife test on a number of datasets. First, the individual performance of each classifier is assessed for each dataset using the jackknife test. After that, the performance for each dataset is improved by using weighted ensemble classification. The weights of the ensemble are optimized over several runs of the genetic algorithm. We have compared our method with various other methods. The performance of the proposed method indicates that it is useful for GPCR classification.
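The weighted ensemble step described in this abstract can be illustrated with a minimal sketch. The function name `weighted_majority_vote`, the class labels, and the weight values below are assumptions made for illustration; in the paper the weights are tuned by a genetic algorithm, which is not reproduced here.

```python
from collections import defaultdict

def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per classifier.
    weights: one non-negative weight per classifier (hypothetical values;
             the abstract says they are tuned by a genetic algorithm).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Sorting the labels first makes tie-breaking deterministic.
    return max(sorted(totals), key=lambda label: totals[label])

# Four classifiers, in the order listed in the abstract: NN, PNN, SVM, GID.
votes = ["class_A", "class_B", "class_B", "class_A"]
weights = [0.4, 0.1, 0.2, 0.3]  # illustrative only, not from the paper
print(weighted_majority_vote(votes, weights))  # prints class_A
```

With equal weights this reduces to ordinary majority voting; unequal weights let a more reliable classifier outvote two weaker ones.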

5.
Lu J, Zhu Y, Li Y, Lu W, Hu L, Niu B, Qing P, Gu L. Protein and Peptide Letters 2010, 17(12): 1536-1541
Information about interactions between enzymes and small molecules is important for understanding various metabolic bioprocesses. In this article we applied a majority voting system to predict the interactions between enzymes and small molecules in the metabolic pathways, by combining several classifiers including AdaBoost, Bagging and KNN. The advantage of such a strategy rests on the principle that a predictor based on a majority voting system usually provides more reliable results than any single classifier. The prediction accuracies thus obtained on a training dataset and an independent testing dataset were 82.8% and 84.8%, respectively. The prediction accuracy for the networking couples in the independent testing dataset was 75.5%, which is about 4% higher than that reported in a previous study. The web-server for the prediction method presented in this paper is available at http://chemdata.shu.edu.cn/small-enz.
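As a rough illustration of the unweighted voting principle stated in this abstract, the sketch below combines the outputs of three classifiers pair by pair. The prediction values are made up, and the dictionary keys merely echo the classifier names in the abstract; no actual AdaBoost, Bagging or KNN model is trained here.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most common label; ties go to the smallest label."""
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)

# Hypothetical 0/1 predictions ("interacts" = 1) for four enzyme-molecule pairs.
predictions = {
    "adaboost": [1, 0, 1, 0],
    "bagging":  [1, 1, 0, 0],
    "knn":      [0, 1, 1, 0],
}

# zip(*...) regroups the per-classifier lists into one tuple of votes per pair.
combined = [majority_vote(votes) for votes in zip(*predictions.values())]
print(combined)  # prints [1, 1, 1, 0]
```

In this toy case each classifier is wrong on a different pair, so the vote recovers an answer that no single classifier gives on its own.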

6.
This paper proposes an ensemble of classifiers for biomedical name recognition in which three classifiers, one Support Vector Machine and two discriminative Hidden Markov Models, are combined effectively using a simple majority voting strategy. In addition, we incorporate three post-processing modules, including an abbreviation resolution module, a protein/gene name refinement module and a simple dictionary matching module, into the system to further improve the performance. Evaluation shows that our system achieves the best performance from among 10 systems with a balanced F-measure of 82.58 on the closed evaluation of the BioCreative protein/gene name recognition task (Task 1A).

7.

Background  

Feature selection is a pattern recognition approach for choosing important variables according to some criteria in order to distinguish or explain certain phenomena (i.e., for dimensionality reduction). There are many genomic and proteomic applications that rely on feature selection to answer questions such as selecting signature genes which are informative about some biological state, e.g., normal tissues and several types of cancer; or inferring a prediction network among elements such as genes, proteins and external stimuli. In these applications, a recurrent problem is the lack of samples to perform an adequate estimate of the joint probabilities between element states. A myriad of feature selection algorithms and criterion functions have been proposed, although it is difficult to identify the best solution for each application.

8.
Pattern recognition and classification are two of the key topics in computer science. In this paper a novel method for the task of pattern classification is presented. The proposed method combines a hybrid associative classifier (Clasificador Híbrido Asociativo con Traslación, CHAT, in Spanish), a coding technique for output patterns called one-hot vector and majority voting during the classification step. The method is termed CHAT One-Hot Majority (CHAT-OHM). The performance of the method is validated by comparing the accuracy of CHAT-OHM with other well-known classification algorithms. During the experimental phase, the classifier was applied to four datasets related to the medical field. The results also show that the proposed method outperforms the original CHAT in classification accuracy.

9.
10.
The third party     
Abstract. Spatial and temporal variation in interactions among plants, other species and the abiotic environment create context‐dependency in vegetation pattern. We argue that we can enhance understanding of context‐dependency by being more explicit about the kinds of direct interactions that occur among more than two living and non‐living entities (i.e., third through nth parties) and formalizing how their combinations create context‐dependency using simple conceptual models. This general approach can be translated into field studies of context‐dependency in communities by combining: progressive sampling of local variation in vegetation pattern that encompasses variation in combinations of direct interactions; spatial and temporal measures of these direct interactions; locally parameterized versions of the conceptual models; and appropriately scaled experiments.

11.
In pharmaceutical sciences, a crucial step of the drug discovery process is the identification of drug-target interactions. However, only a small portion of the drug-target interactions have been experimentally validated, as the experimental validation is laborious and costly. To improve the drug discovery efficiency, there is a great need for the development of accurate computational approaches that can predict potential drug-target interactions to direct the experimental verification. In this paper, we propose a novel drug-target interaction prediction algorithm, namely neighborhood regularized logistic matrix factorization (NRLMF). Specifically, the proposed NRLMF method focuses on modeling the probability that a drug would interact with a target by logistic matrix factorization, where the properties of drugs and targets are represented by drug-specific and target-specific latent vectors, respectively. Moreover, NRLMF assigns higher importance levels to positive observations (i.e., the observed interacting drug-target pairs) than negative observations (i.e., the unknown pairs). Because the positive observations are already experimentally verified, they are usually more trustworthy. Furthermore, the local structure of the drug-target interaction data has also been exploited via neighborhood regularization to achieve better prediction accuracy. We conducted extensive experiments over four benchmark datasets, and NRLMF demonstrated its effectiveness compared with five state-of-the-art approaches.
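The core modeling idea in this abstract, a logistic function of the inner product of a drug latent vector and a target latent vector with positive observations weighted more heavily, might be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the function names, the weight `c`, and all numeric values are hypothetical, and the neighborhood regularization and any priors on the latent vectors are left out.

```python
import math

def interaction_probability(drug_vec, target_vec):
    """Logistic function of the inner product of the two latent vectors."""
    score = sum(d * t for d, t in zip(drug_vec, target_vec))
    return 1.0 / (1.0 + math.exp(-score))

def weighted_log_likelihood(Y, U, V, c):
    """Log-likelihood in which each observed interaction counts c times.

    Y[i][j] is 1 for a known interaction and 0 for an unknown pair;
    U[i] and V[j] are the latent vectors of drug i and target j.
    """
    total = 0.0
    for i, row in enumerate(Y):
        for j, y in enumerate(row):
            p = interaction_probability(U[i], V[j])
            total += c * math.log(p) if y == 1 else math.log(1.0 - p)
    return total

# Toy values, chosen only for illustration.
U = [[0.5, 1.0], [-1.0, 0.2]]   # latent vectors of two drugs
V = [[1.0, 0.5], [0.3, -0.8]]   # latent vectors of two targets
Y = [[1, 0], [0, 0]]            # one verified interaction
print(round(interaction_probability(U[0], V[0]), 3))
print(weighted_log_likelihood(Y, U, V, c=5.0))
```

Training would adjust `U` and `V` to maximize this quantity; setting `c` above 1 makes a misfit on a verified interaction cost more than a misfit on an unknown pair.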

12.
Human rationality, the ability to behave so as to maximize the achievement of one's presumed goals (i.e., to make optimal choices), is the foundation for democracy. Research evidence has suggested that voters may not make decisions after exhaustively processing relevant information; instead, our decision-making capacity may be restricted by our own biases and the environment. In this paper, we investigate the extent to which humans in a democratic society can be rational when making decisions in a serious, complex situation: voting in a local political election. We believe examining human rationality in a political election is important, because a well-functioning democracy rests largely upon the rational choices of individual voters. Previous research has shown that explicit political attitudes predict voting intention and choices (i.e., actual votes) in democratic societies, indicating that people are able to reason comprehensively when making voting decisions. Other work, though, has demonstrated that attitudes of which we may not be aware, such as our implicit (e.g., subconscious) preferences, can predict voting choices, which may call into question how well democracy functions. In this study, we systematically examined predictors of voting intention and choices in the 2014 mayoral election in Taipei, Taiwan. Results indicate that explicit political party preferences had the largest impact on voting intention and choices. Moreover, implicit political party preferences interacted with explicit political party preferences in accounting for voting intention, and in turn predicted voting choices. Ethnic identity and perceived voting intention of significant others were found to predict voting choices, but not voting intention. In sum, to the comfort of democracy, voters appeared to engage mainly explicit, controlled processes in making their decisions; but findings on ethnic identity and perceived voting intention of significant others may suggest otherwise.

13.
14.
MiRNAs play an essential role in the networks of gene regulation by inhibiting the translation of target mRNAs. Several computational approaches have been proposed for the prediction of miRNA target genes. Reports reveal a large fraction of under-predicted or falsely predicted target genes. Thus, there is an imperative need to develop a computational method by which the target mRNAs of existing miRNAs can be correctly identified. In this study, a combined pattern recognition neural network (PRNN) and principal component analysis (PCA) architecture is proposed to model the complicated relationship between miRNAs and their target mRNAs in humans. The results of several types of intelligent classifiers and our proposed model were compared, showing that our algorithm outperformed them with higher sensitivity and specificity. Using the recent release of the miRBase database to find potential targets of miRNAs, this model incorporated twelve structural, thermodynamic and positional features of miRNA:mRNA binding sites to select target candidates.

15.
Classification of datasets with imbalanced sample distributions has always been a challenge. In general, a popular approach for enhancing classification performance is the construction of an ensemble of classifiers. However, the performance of an ensemble is dependent on the choice of constituent base classifiers. Therefore, we propose a genetic algorithm-based search method for finding the optimum combination from a pool of base classifiers to form a heterogeneous ensemble. The algorithm, called GA-EoC, utilises 10-fold cross-validation on training data for evaluating the quality of each candidate ensemble. In order to combine the base classifiers' decisions into the ensemble’s output, we used the simple and widely used majority voting approach. The proposed algorithm, along with the random sub-sampling approach to balance the class distribution, has been used for classifying class-imbalanced datasets. Additionally, if a feature set was not available, we used the (α, β) − k Feature Set method to select a better subset of features for classification. We have tested GA-EoC with three benchmarking datasets from the UCI-Machine Learning repository, one Alzheimer’s disease dataset and a subset of the PubFig database of Columbia University. In general, the performance of the proposed method on the chosen datasets is robust and better than that of the constituent base classifiers and many other well-known ensembles. Based on our empirical study we claim that a genetic algorithm is a superior and reliable approach to heterogeneous ensemble construction and we expect that the proposed GA-EoC would perform consistently in other cases.

16.
S Wang, X Li, J Fang. BMC Bioinformatics 2012, 13(1): 178-26
ABSTRACT: BACKGROUND: Previous studies on tumor classification based on gene expression profiles suggest that gene selection plays a key role in improving the classification performance. Moreover, finding important tumor-related genes with the highest accuracy is a very important task because these genes might serve as tumor biomarkers, which is of great benefit to not only tumor molecular diagnosis but also drug development. RESULTS: This paper proposes a novel gene selection method with rich biomedical meaning based on Heuristic Breadth-first Search Algorithm (HBSA) to find as many optimal gene subsets as possible. Due to the curse of dimensionality, this type of method could suffer from over-fitting and selection bias problems. To address these potential problems, an HBSA-based ensemble classifier is constructed using a majority voting strategy from individual classifiers constructed by the selected gene subsets, and a novel HBSA-based gene ranking method is designed to find important tumor-related genes by measuring the significance of genes using their occurrence frequencies in the selected gene subsets. The experimental results on nine tumor datasets including three pairs of cross-platform datasets indicate that the proposed method can not only obtain better generalization performance but also find many important tumor-related genes. CONCLUSIONS: It is found that the frequencies of the selected genes follow a power-law distribution, indicating that only a few top-ranked genes can be used as potential diagnosis biomarkers. Moreover, the top-ranked genes leading to very high prediction accuracy are closely related to specific tumor subtypes and even hub genes. Compared with other related methods, the proposed method can achieve higher prediction accuracy with fewer genes. Moreover, these genes are further justified by analyzing the top-ranked genes in the context of individual gene function, biological pathway, and protein-protein interaction network.
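The frequency-based ranking idea in this abstract (scoring a gene by how often it occurs across the selected gene subsets) can be sketched in a few lines. The function name and the gene identifiers are placeholders, and the HBSA search that would produce the subsets is assumed to have already run.

```python
from collections import Counter

def rank_genes_by_frequency(gene_subsets):
    """Rank genes by the number of selected subsets in which they occur."""
    # set(subset) ensures a gene is counted at most once per subset.
    counts = Counter(gene for subset in gene_subsets for gene in set(subset))
    # Most frequent first; ties are ordered alphabetically for reproducibility.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical output of a subset search; "g1" etc. are placeholder gene names.
subsets = [["g1", "g2"], ["g1", "g3"], ["g1", "g2", "g4"]]
print(rank_genes_by_frequency(subsets))
# prints [('g1', 3), ('g2', 2), ('g3', 1), ('g4', 1)]
```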

17.
MOTIVATION: Protein fold recognition is an important approach to structure discovery without relying on sequence similarity. We study this approach with new multi-class classification methods and examine many issues important for a practical recognition system. RESULTS: Most current discriminative methods for protein fold prediction use the one-against-others method, which has the well-known 'False Positives' problem. We investigated two new methods: the unique one-against-others and the all-against-all methods. Both improve prediction accuracy by 14-110% on a dataset containing 27 SCOP folds. We used the Support Vector Machine (SVM) and the Neural Network (NN) learning methods as base classifiers. SVMs converge fast and lead to high accuracy. When scores of multiple parameter datasets are combined, majority voting reduces noise and increases recognition accuracy. We examined many issues involved with a large number of classes, including dependencies of prediction accuracy on the number of folds and on the number of representatives in a fold. Overall, recognition systems achieve 56% fold prediction accuracy on a protein test dataset, where most of the proteins have below 25% sequence identity with the proteins used in training.
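The all-against-all scheme mentioned in this abstract trains one binary classifier per pair of classes and lets them vote. A minimal sketch follows, with a toy nearest-centroid rule standing in for the SVM or NN base classifiers; the centroid values, fold names and function names are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations

def all_against_all_predict(sample, classes, pairwise_classifier):
    """Ask every pairwise classifier for a winner and return the top-voted class."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_classifier(a, b, sample)] += 1
    best = max(votes.values())
    # Ties are broken alphabetically to keep the result deterministic.
    return min(c for c, n in votes.items() if n == best)

# Toy stand-in for trained binary classifiers: pick the nearer 1-D centroid.
centroids = {"fold_1": 0.0, "fold_2": 5.0, "fold_3": 10.0}

def nearest_of_two(a, b, x):
    return a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b

print(all_against_all_predict(6.0, list(centroids), nearest_of_two))  # prints fold_2
```

With k classes this needs k(k-1)/2 binary classifiers, compared with k for one-against-others, but each pairwise classifier only has to separate two classes.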

18.
The crystal structure of activated tobacco rubisco, complexed with the reaction-intermediate analogue 2-carboxy-arabinitol 1,5-bisphosphate (CABP), has been determined by molecular replacement, using the structure of activated spinach rubisco (Knight, S., Andersson, I., & Brändén, C.-I., 1990, J. Mol. Biol. 215, 113-160) as a model. The R-factor after refinement is 21.0% for 57,855 reflections between 9.0 and 2.7 Å resolution. The local fourfold axis of the rubisco hexadecamer coincides with a crystallographic twofold axis. The result is that the asymmetric unit of the crystals contains half of the L8S8 complex (molecular mass 280 kDa in the asymmetric unit). The activated form of tobacco rubisco is very similar to the activated form of spinach rubisco. The root mean square difference is 0.4 Å for 587 equivalent Cα atoms. Analysis of mutations between tobacco and spinach rubisco revealed that the vast majority of mutations concerned exposed residues. Only 7 buried residues were found to be mutated versus 54 residues at or near the surface of the protein. The crystal structure suggests that the Cys 247-Cys 247 and Cys 449-Cys 459 pairs are linked via disulfide bridges. This pattern of disulfide links differs from the pattern of disulfide links observed in crystals of unactivated tobacco rubisco (Curmi, P.M.G., et al., 1992, J. Biol. Chem. 267, 16980-16989) and is similar to the pattern observed for activated spinach rubisco.

19.
Multigradient method for optimization of slow biotechnological processes   (Total citations: 1; self-citations: 0; citations by others: 1)
A new method (named a "jumping spider") is introduced for the optimization of slow biotechnological processes. The more traditional sequential experimentation (i.e., gradient search, simplex, etc.) is not well suited for slow dynamic processes, e.g., plant cell culture and differentiation. Therefore, a more simultaneous approach is proposed. A large number of initial experiments are performed, on the basis of which several of the initial experiments are selected as starting points. A search is then performed simultaneously from several gradient directions and the optimum is estimated by a quadratic approximation. In simulations, the spider generally climbs up the slopes quickly and the final estimator yields good maximum point estimates even on a complex topography. The spider may even approach more than one local maximum point simultaneously. As a model application, the average xylitol conversion rate of Candida guilliermondii was optimized in relation to cultivation volume (oxygen availability) and the concentration of nitrogen and phosphorus in the medium. A threefold increase in xylitol production was obtained with three experimental steps. (c) 1993 John Wiley & Sons, Inc.
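The final step named in this abstract, estimating the optimum by a quadratic approximation, can be illustrated in one variable by fitting a parabola through three measurements and taking its vertex. This is a simplified sketch: the paper works with several factors at once, and the function name and the data points below are invented for the example.

```python
def quadratic_maximum(points):
    """Fit y = a*x^2 + b*x + c through three (x, y) points; return the x of the peak."""
    (x1, y1), (x2, y2), (x3, y3) = points
    denom = (x1 - x2) * (x1 - x3) * (x2 - x3)
    if denom == 0:
        raise ValueError("the three x values must be distinct")
    # Closed-form coefficients of the interpolating parabola.
    a = (x3 * (y2 - y1) + x2 * (y1 - y3) + x1 * (y3 - y2)) / denom
    b = (x3**2 * (y1 - y2) + x2**2 * (y3 - y1) + x1**2 * (y2 - y3)) / denom
    if a >= 0:
        raise ValueError("the fitted parabola has no maximum")
    return -b / (2 * a)

# Hypothetical (factor level, response) measurements from three experiments.
experiments = [(0.0, 1.0), (1.0, 4.0), (3.0, 4.0)]
print(quadratic_maximum(experiments))  # prints 2.0
```

The three points here lie on y = -(x - 2)^2 + 5, so the estimated optimum at x = 2 is exact; with noisy measurements the vertex is only an estimate to be checked by a further experiment.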

20.
Linear regression and two-class classification with gene expression data   (Total citations: 3; self-citations: 0; citations by others: 3)
MOTIVATION: Using gene expression data to classify (or predict) tumor types has received much research attention recently. Due to some special features of gene expression data, several new methods have been proposed, including the weighted voting scheme of Golub et al., the compound covariate method of Hedenfalk et al. (originally proposed by Tukey), and the shrunken centroids method of Tibshirani et al. These methods look different and are more or less ad hoc. RESULTS: We point out a close connection of the three methods with a linear regression model. Casting the classification problem in the general framework of linear regression naturally leads to new alternatives, such as partial least squares (PLS) methods and penalized PLS (PPLS) methods. Using two real data sets, we show the competitive performance of our new methods when compared with the other three methods.
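Casting two-class classification as linear regression, as this abstract describes, amounts to coding the classes numerically, fitting a regression, and classifying by the sign of the fitted value. The sketch below assumes a single predictor, a +1/-1 class coding, and made-up expression values; it uses ordinary least squares, not the PLS or penalized PLS methods the paper proposes.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def classify(value, intercept, slope):
    """Assign class +1 when the fitted response is non-negative, else -1."""
    return 1 if intercept + slope * value >= 0 else -1

# Hypothetical expression levels of one gene; classes are coded as -1 and +1.
expression = [1.0, 1.5, 2.0, 4.0, 4.5, 5.0]
labels = [-1, -1, -1, 1, 1, 1]

intercept, slope = fit_simple_regression(expression, labels)
print(classify(1.2, intercept, slope), classify(4.8, intercept, slope))  # prints -1 1
```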
