1.
Allan Birnbaum 《Genetics》1972,72(4):739-758
A random phenotype is defined as a probability distribution over any given set of phenotypes. This includes as special cases the kinds of phenotypes usually considered (qualitative, quantitative, and threshold characters) and all others. Correspondingly general methods are indicated for analyzing data of all forms in terms of the classical Mendelian factor concept (as distinct from the biometrical methods usually applied to measurement and graded data, associated with the effective factor concept). These are applied in a new analysis of the data of E. L. Green (1951, 1954, 1962) on skeletal variation in the mouse. The adequacies of various classical one-factor and several-factor models are considered. Indications of an underlying scale are found from this new standpoint. The results are compared with those obtained by Green using the scaling approach. An illustrative application is also made to some of Bruell's (1962) continuous behavioural data on mice. This work was substantially completed in 1959 but not previously prepared for publication. The same approach was originated and developed independently by R. L. Collins who has treated a wider range of theoretical problems (cf. Collins 1967, 1968a, 1969b, 1970c) and a wider range of applications (cf. Collins and Fuller 1968; Collins 1968b, 1969a, 1970a). A less general independent development is that of Mode and Gasser 1972.
2.
Cell populations can benefit from changing phenotype when the environment changes. One mechanism for generating these changes
is stochastic phenotype switching, whereby cells switch stochastically from one phenotype to another according to genetically
determined rates, irrespective of the current environment, with the matching of phenotype to environment then determined by
selective pressure. This mechanism has been observed in numerous contexts, but identifying the precise connection between
switching rates and environmental changes remains an open problem. Here, we introduce a simple model to study the evolution
of phenotype switching in a finite population subject to random environmental shocks. We compare the successes of competing
genotypes with different switching rates, and analyze how the optimal switching rates depend on the frequency of environmental
changes. If environmental changes are as rare as mutations, then the optimal switching rates mimic the rates of environmental
changes. If the environment changes more frequently, then the optimal genotype either maximally favors fitness in the more
common environment or has the maximal switching rate to each phenotype. Our results also explain why the optimum is relatively
insensitive to fitness in each environment.
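As a rough illustration of the mechanism described in this abstract, the sketch below tracks the fraction of cells in each of two phenotypes when switching ignores the environment and selection then reweights the phenotypes. It is a deterministic, infinite-population simplification, not the finite-population model of the paper; the function names, fitness values and switching rates are hypothetical.

```python
def step(frac_a, switch_ab, switch_ba, fitness_a, fitness_b):
    """Advance the fraction of phenotype A by one generation.

    Switching happens first and ignores the environment; selection then
    reweights the two phenotypes by their fitness in the current environment.
    """
    # Environment-blind switching between phenotypes A and B.
    after_switch = frac_a * (1.0 - switch_ab) + (1.0 - frac_a) * switch_ba
    # Selection: weight each phenotype by its fitness and renormalise.
    weighted_a = after_switch * fitness_a
    weighted_b = (1.0 - after_switch) * fitness_b
    return weighted_a / (weighted_a + weighted_b)


def simulate(environments, switch_ab, switch_ba, frac_a=0.5,
             fit_matched=1.0, fit_mismatched=0.5):
    """Return the trajectory of the phenotype-A fraction.

    `environments` is a sequence of "A"/"B" labels. Phenotype A is assumed
    (for illustration only) to be fitter in environment "A", and B in "B".
    """
    trajectory = [frac_a]
    for env in environments:
        if env == "A":
            fit_a, fit_b = fit_matched, fit_mismatched
        else:
            fit_a, fit_b = fit_mismatched, fit_matched
        frac_a = step(frac_a, switch_ab, switch_ba, fit_a, fit_b)
        trajectory.append(frac_a)
    return trajectory


# Hypothetical scenario: 20 generations in environment A, then 20 in B.
envs = ["A"] * 20 + ["B"] * 20
traj = simulate(envs, switch_ab=0.01, switch_ba=0.01)
```

In this toy setting the small switching rate keeps a minority of mismatched cells alive, which is what allows the population to recover after the environment flips.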
3.
Mariana R. Mendoza Guilherme C. da Fonseca Guilherme Loss-Morais Ronnie Alves Rogerio Margis Ana L. C. Bazzan 《PloS one》2013,8(7)
MicroRNAs are key regulators of eukaryotic gene expression whose fundamental role has already been identified in many cell pathways. The correct identification of miRNA targets is still a major challenge in bioinformatics and has motivated the development of several computational methods to overcome inherent limitations of experimental analysis. Indeed, the best results reported so far in terms of specificity and sensitivity are associated with machine learning-based methods for microRNA-target prediction. Following this trend, in the current paper we discuss and explore a microRNA-target prediction method based on a random forest classifier, namely RFMirTarget. Despite their well-known robustness in general classification tasks, to the best of our knowledge, random forests have not been deeply explored in the specific context of predicting microRNA targets. Our framework first analyzes alignments between candidate microRNA-target pairs and extracts a set of structural, thermodynamic, alignment, seed and position-based features, upon which classification is performed. Experiments have shown that RFMirTarget outperforms several well-known classifiers with statistical significance, and that its performance is not impaired by the class imbalance problem or feature correlation. Moreover, comparing it against other algorithms for microRNA target prediction using independent test data sets from TarBase and starBase, we observe a very promising performance, with higher sensitivity than other methods. Finally, tests performed with RFMirTarget show the benefits of feature selection even for a classifier with embedded feature importance analysis, and the consistency between the relevant features identified and important biological properties for effective microRNA-target gene alignment.
4.
To control a dexterous prosthetic hand with a high number of active degrees of freedom (DOF), it is necessary to reliably extract the intended finger motions from the human body. In this study, a large variety of finger motions are discriminated based on differences in the pressure distribution produced by the mechanical action of muscles on the forearm. The pressure distribution patterns corresponding to the motions were measured by a sensor array composed of 32 force-sensitive resistor (FSR) sensors. To map the pressure patterns to different finger motions, a multiclass classifier was designed based on the support vector machine (SVM) algorithm. Multi-subject experiments show that as many as seventeen different finger motions, including individual finger motions and multi-finger grasping motions, can be identified with accuracy above 99% in in-session validation. Further, cross-session validation demonstrates that the performance of the proposed method is robust as long as the FSR array is not reset. The results suggest that the proposed method has great application prospects for the control of multi-DOF dexterous hand prostheses.
5.
The extracellular matrix (ECM) is a dynamic composite of secreted proteins that play important roles in numerous biological processes such as tissue morphogenesis, differentiation and homeostasis. Furthermore, various diseases are caused by the dysfunction of ECM proteins. Therefore, identifying these important ECM proteins may assist in understanding related biological processes and in drug development. In view of the serious imbalance in the training dataset, a Random Forest-based ensemble method with hybrid features is developed in this paper to identify ECM proteins. The hybrid features incorporate sequence composition, physicochemical properties, and evolutionary and structural information. The Information Gain Ratio and Incremental Feature Selection (IGR-IFS) methods are adopted to select the optimal features. Finally, the resulting predictor, termed IECMP (Identify ECM Proteins), achieves a balanced accuracy of 86.4% using 10-fold cross-validation on the training dataset, which is much higher than the results obtained by other methods (ECMPRED: 71.0%, ECMPP: 77.8%). Moreover, when tested on a common independent dataset, our method also achieves significantly improved performance over ECMPP and ECMPRED. These results indicate that IECMP is an effective method for ECM protein prediction, with a more balanced prediction capability for positive and negative samples. It is anticipated that the proposed method will provide significant information to fully decipher the molecular mechanisms of ECM-related biological processes and discover candidate drug targets. For public access, we develop a user-friendly web server for ECM protein identification that is freely accessible at http://iecmp.weka.cc.
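Balanced accuracy, the metric reported in this abstract, is commonly defined as the mean of sensitivity and specificity, which is why it is preferred over plain accuracy on imbalanced data. The sketch below shows the difference on hypothetical confusion-matrix counts; the numbers are illustrative and are not taken from the paper.

```python
def accuracy(tp, fn, tn, fp):
    """Fraction of all samples that are classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2.0


# Hypothetical, heavily imbalanced test set: 10 positives, 990 negatives,
# scored by a classifier that labels every sample as negative.
counts = dict(tp=0, fn=10, tn=990, fp=0)
print(accuracy(**counts))           # 0.99
print(balanced_accuracy(**counts))  # 0.5
```

Plain accuracy rewards the trivial all-negative classifier, while balanced accuracy exposes that it finds no positives at all.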
6.
This article focuses on improving machine learning models that predict protein expression levels from codon encoding. Support vector regression (SVR) and partial least squares (PLS) were used to create the models. SVR yields predictions that surpass those of PLS. It is shown that the models' predictive ability can be improved by using two more input features, codon identification number and codon count, in addition to the previously used codon bias and minimum free energy. Applying ensemble averaging to the SVR or PLS models improves the results even further. The present work motivates the testing of different ensembles and features with the aim of improving prediction models whose correlation coefficients are still far from perfect. These results are relevant for the optimization of codon usage and the enhancement of protein expression levels in synthetic biology problems.
7.
HPSLPred: An Ensemble Multi-Label Classifier for Human Protein Subcellular Location Prediction with Imbalanced Source
Predicting the subcellular localization of proteins is an important and challenging problem. Traditional experimental approaches are often expensive and time-consuming. Consequently, a growing number of research efforts employ machine learning approaches to predict the subcellular location of proteins. There are two main challenges among the state-of-the-art prediction methods. First, most of the existing techniques are designed to deal with multi-class rather than multi-label classification, which ignores connections between multiple labels. In reality, the multiple locations of particular proteins imply vital and unique biological significance that deserves special focus and cannot be ignored. Second, techniques for handling imbalanced data in multi-label classification problems are necessary but have never been employed. To address these two issues, we have developed an ensemble multi-label classifier called HPSLPred, which can be applied to multi-label classification with an imbalanced protein source. For convenience, a user-friendly webserver has been established at http://server.malab.cn/HPSLPred.
8.
As a newly identified protein post-translational modification, malonylation is involved in a variety of biological functions. Recognizing malonylation sites in substrates represents an initial but crucial step in elucidating the molecular mechanisms underlying protein malonylation. In this study, we constructed a deep learning (DL) network classifier based on long short-term memory (LSTM) with word embedding (LSTMWE) for the prediction of mammalian malonylation sites. LSTMWE performs better than traditional classifiers developed with common pre-defined feature encodings or a DL classifier based on LSTM with a one-hot vector. The performance of LSTMWE is sensitive to the size of the training set, but this limitation can be overcome by integration with a traditional machine learning (ML) classifier. Accordingly, an integrated approach called LEMP was developed, which includes LSTMWE and the random forest classifier with a novel encoding of enhanced amino acid content. LEMP not only outperforms the individual classifiers but is also superior to the currently available malonylation predictors. Additionally, it demonstrates promising performance with a low false positive rate, which is highly useful in prediction applications. Overall, LEMP is a useful tool for easily identifying malonylation sites with high confidence. LEMP is available at http://www.bioinfogo.org/lemp.
9.
10.
Supatcha Lertampaiporn Chinae Thammarongtham Chakarida Nukoolkit Boonserm Kaewkamnerdpong Marasri Ruengjitchatchawalya 《Nucleic acids research》2014,42(11):e93
To identify non-coding RNA (ncRNA) signals within genomic regions, a classification tool was developed based on a hybrid random forest (RF) with a logistic regression model to efficiently discriminate short ncRNA sequences as well as long complex ncRNA sequences. This RF-based classifier was trained on a well-balanced dataset with a discriminative set of features and achieved an accuracy, sensitivity and specificity of 92.11%, 90.7% and 93.5%, respectively. The selected feature set includes a newly proposed feature, SCORE. This feature is generated by a logistic regression function that combines five significant features (structure, sequence, modularity, structural robustness and coding potential) to enable improved characterization of long ncRNA (lncRNA) elements. The use of SCORE improved the performance of the RF-based classifier in the identification of Rfam lncRNA families. A genome-wide ncRNA classification framework was applied to a wide variety of organisms, with an emphasis on those of economic, social, public health, environmental and agricultural significance, such as various bacterial genomes, the Arthrospira (Spirulina) genome, and rice and human genomic regions. Our framework was able to identify known ncRNAs with sensitivities of greater than 90% and 77.7% for prokaryotic and eukaryotic sequences, respectively. Our classifier is available at http://ncrna-pred.com/HLRF.htm.
11.
12.
《IRBM》2022,43(6):549-560
Objectives: MR imaging has recently been used to detect diagnostic differences in dementia at preclinical stages. Mild cognitive impairment (MCI) is characterized by slight cognitive deficits. It can be categorized into early and late mild cognitive impairment according to the extent of episodic cognitive impairment. MCI subjects have a higher risk of converting to Alzheimer's disease. No appropriate biomarker has been established for detecting severity changes in dementia. Thus, this work aims to identify an appropriate biomarker using radiomic features and hybrid social algorithms. Materials: The ADNI database is used for this study. Grey matter, cerebrospinal fluid, ventricle, hippocampus, brain stem and midbrain regions are examined to extract the radiomic features, which capture local and global tissue changes in these regions. The significant features are obtained using a hybrid salp swarm and particle swarm optimization method (SSA-PSO). An SVM is adopted to classify the normal and severity groups. The performance of the work is validated clinically and statistically. Results: Results show that radiomic features capture anatomical changes in the considered regions. The significant features from SSA-PSO show greater causal association and statistical significance for all considered regions. However, the hippocampus achieves a classification accuracy of 88.5%, higher than the other regions in the considered group. The inter-class variations of the hippocampus give precise prognostic differences. From the clinical validation, it is also found that the obtained results show high statistical significance () among the different severity groups. Conclusion: The proposed work shows promising results in using these biomarkers to detect dementia and support clinical decisions.
13.
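Decoding the abnormal epileptiform discharges of neuronal populations in pre-seizure EEG signals, so that seizures can be predicted effectively and intervention applied before onset, can significantly reduce the harm caused by the disease and is one of the research hotspots in epilepsy prevention and treatment. The key to EEG-based seizure prediction lies in recognizing the abnormal states of the interictal and preictal periods. Studying the differences in neurodynamic characteristics between these two states is of great value for clarifying the pathogenesis of epilepsy, selecting highly discriminative features, and thereby effectively identifying the seizure stage of this progressive disease. Researchers have already surveyed the current mainstream feature extraction and pattern recognition methods thoroughly, but have overlooked the significance of changes in neural dynamic features for seizure prediction. Accordingly, this paper summarizes five typical classes of feature analysis methods for seizure prediction together with their advantages and disadvantages, focuses on the dynamic changes of neurophysiological features from the interictal to the preictal period and their dynamical properties, and comparatively analyzes the mainstream machine learning and deep learning feature recognition methods in this field, with the aim of providing new ideas for establishing accurate and efficient seizure prediction techniques.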
14.
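Outer membrane proteins (OMPs) are a class of proteins with important biological functions. Predicting OMPs with bioinformatics methods can assist in predicting their secondary and tertiary structures and in discovering new OMPs in genomes. This paper proposes computing three types of features from a protein sequence: amino acid composition, dipeptide composition, and weighted multi-order correlation coefficients of amino acid residue indices. The three feature types are combined, and a support vector machine (SVM) is used to identify OMPs. Identification results were computed for a variety of combined features involving four residue indices, and the effects of the order and weight of the correlation coefficients on prediction performance are discussed. Ten-fold cross-validation and independent tests on the dataset show that the combined-feature method reaches maximum identification accuracies of 96.96% for OMPs and 97.33% for non-OMPs, outperforming several existing methods. Results for identifying OMPs in five bacterial genomes show that the combined-feature method has high specificity, and its identification accuracy for OMPs of known structure in the PDB database exceeds 99%. This indicates that the method can serve as an effective tool for screening OMPs within genomes.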
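Two of the three feature types named in this abstract, amino acid composition and dipeptide composition, have a widely used standard form that can be sketched briefly. The code below is a minimal illustration under that assumption, not the authors' implementation: the function names are hypothetical, and the third feature type (weighted multi-order residue-index correlation coefficients) is omitted because the abstract does not define it.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def amino_acid_composition(seq):
    """Return 20 frequencies, one per standard amino acid."""
    seq = seq.upper()
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 frequencies, one per ordered pair of adjacent residues.

    Assumes the sequence has at least two residues.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard letters
            counts[pair] += 1
    total = len(seq) - 1
    return [counts[p] / total for p in DIPEPTIDES]


def combined_features(seq):
    """Concatenate both compositions into one 420-dimensional vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


features = combined_features("ACAC")  # toy sequence, for illustration only
```

A vector of this kind could then be passed to any SVM implementation; the choice of kernel and parameters is not given in the abstract.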
16.
In face recognition, most appearance-based methods require several images of each person to construct the feature space for recognition. However, in the real world it is difficult to collect multiple images per person, and in many cases there is only a single sample per person (SSPP). In this paper, we propose a method to generate new images with various illuminations from a single image taken under frontal illumination. Motivated by the integral image, which was developed for face detection, we extract the bidirectional integral feature (BIF) to obtain the characteristics of the illumination condition at the time of the picture being taken. The experimental results for various face databases show that the proposed method results in improved recognition performance under illumination variation.
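The integral image mentioned in this abstract is the standard summed-area table, which lets the sum over any rectangle be read off in constant time. The sketch below shows only that standard construction on a toy image; the bidirectional integral feature (BIF) itself is not defined in the abstract and is not reproduced here.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros.

    ii[r][c] holds the sum of img over rows 0..r-1 and columns 0..c-1.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0  # running sum of the current row
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1, in O(1)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Toy 3x3 "image" for illustration.
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(box_sum(ii, 0, 0, 3, 3))  # 45
print(box_sum(ii, 1, 1, 3, 3))  # 28
```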
17.
Samme Amena Tasmia Fee Faysal Ahmed Parvez Mosharaf Mehedi Hasan Nurul Haque Mollah 《Current Genomics》2021,22(2):122
Background: Lysine succinylation is one of the reversible protein post-translational modifications (PTMs), which regulate the structure and function of proteins. It plays a significant role in various cellular physiologies, including some diseases of humans as well as many other organisms. Accurate identification of succinylation sites is essential for understanding the various biological functions and for drug development. Methods: In this study, we developed an improved method to predict lysine succinylation sites in Homo sapiens by fusing three encoding schemes, namely binary, the composition of k-spaced amino acid pairs (CKSAAP) and amino acid composition (AAC), with the random forest (RF) classifier. The prediction performance of the proposed RF-based fusion model was compared with that of other candidates using 20-fold cross-validation (CV) and two independent test datasets collected from two different sources. Results: The CV results showed that the proposed predictor achieves the highest scores of sensitivity (SN) as 0.800, specificity (SP) as 0.902, accuracy (ACC) as 0.919, Matthews correlation coefficient (MCC) as 0.766 and partial AUC (pAUC) as 0.163 at a false-positive rate (FPR) = 0.10 and area under the ROC curve (AUC) as 0.958. It achieved the highest performance scores of SN as 0.811, SP as 0.902, ACC as 0.891, MCC as 0.629 and pAUC as 0.139 and AUC as 0.921 for the independent test protein set-1 and SN as 0.772, SP as 0.901, ACC as 0.836, MCC as 0.677 and pAUC as 0.141 at FPR = 0.10 and AUC as 0.923 for the independent test protein set-2. It also outperformed all the other existing prediction models. Conclusion: The prediction performance reported in this article suggests that the proposed method may be a useful computational resource for lysine succinylation site prediction in humans.
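The scores reported in this abstract (SN, SP, ACC and MCC) all follow from the four confusion-matrix counts. The sketch below uses the standard definitions with hypothetical counts; the abstract does not give the underlying counts, so the numbers here are illustrative only.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and Matthews correlation coefficient."""
    sn = tp / (tp + fn)                      # sensitivity: recall on positives
    sp = tn / (tn + fp)                      # specificity: recall on negatives
    acc = (tp + tn) / (tp + fn + tn + fp)    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SN": sn, "SP": sp, "ACC": acc, "MCC": mcc}


# Hypothetical counts: 100 positive and 100 negative sites.
metrics = classification_metrics(tp=80, fn=20, tn=90, fp=10)
```

With these counts SN is 0.8, SP is 0.9 and ACC is 0.85; MCC summarizes all four counts in a single value between -1 and 1.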
18.
Teruya Nakamura Sachiko Meshitsuka Seiju Kitagawa Nanase Abe Junichi Yamada Tetsuya Ishino Hiroaki Nakano Teruhisa Tsuzuki Takefumi Doi Yuji Kobayashi Satoshi Fujii Mutsuo Sekiguchi Yuriko Yamagata 《The Journal of biological chemistry》2010,285(1):444-452
Escherichia coli MutT hydrolyzes 8-oxo-dGTP to 8-oxo-dGMP, an event that can prevent the misincorporation of 8-oxoguanine opposite adenine in DNA. Of the several enzymes that recognize 8-oxoguanine, MutT exhibits high substrate specificity for 8-oxoguanine nucleotides; however, the structural basis for this specificity is unknown. The crystal structures of MutT in the apo and holo forms and in the binary and ternary forms complexed with the product 8-oxo-dGMP and 8-oxo-dGMP plus Mn2+, respectively, were determined. MutT strictly recognizes the overall conformation of 8-oxo-dGMP through a number of hydrogen bonds. This recognition mode revealed that 8-oxoguanine nucleotides are discriminated from guanine nucleotides by not only the hydrogen bond between the N7-H and Oδ (N119) atoms but also by the syn glycosidic conformation that 8-oxoguanine nucleotides prefer. Nevertheless, these discrimination factors cannot by themselves explain the roughly 34,000-fold difference between the affinity of MutT for 8-oxo-dGMP and dGMP. When the binary complex of MutT with 8-oxo-dGMP is compared with the ligand-free form, ordering and considerable movement of the flexible loops surrounding 8-oxo-dGMP in the binary complex are observed. These results indicate that MutT specifically recognizes 8-oxoguanine nucleotides by the ligand-induced conformational change.
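To give a sense of scale for the 34,000-fold affinity difference, it can be converted into a difference in binding free energy. The estimate below assumes that the fold difference is a ratio of dissociation constants and that the temperature is 298 K; neither assumption is stated in the abstract.

```latex
\Delta\Delta G
  = RT \ln\frac{K_d^{\mathrm{dGMP}}}{K_d^{\text{8-oxo-dGMP}}}
  \approx \left(1.987\times10^{-3}\ \mathrm{kcal\,mol^{-1}\,K^{-1}}\right)
          \left(298\ \mathrm{K}\right)\ln\!\left(3.4\times10^{4}\right)
  \approx 6.2\ \mathrm{kcal\,mol^{-1}}
```

Under these assumptions the discrimination corresponds to roughly 6 kcal/mol of binding free energy.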
19.