Similar articles
Found 20 similar articles; search time: 15 ms
1.
Mechanisms through which tissues are formed and maintained remain unknown but are fundamental aspects of biology. Tissue-specific gene expression is a valuable tool to study such mechanisms. But in many biomedical studies, cell lines, rather than human body tissues, are used to investigate biological mechanisms. Whether or not cell lines maintain their tissue-specific characteristics after they are isolated and cultured outside the human body remains to be explored. In this study, we applied a novel computational method to identify core genes that contribute to the differentiation of cell lines from various tissues. Several advanced computational techniques, such as the Monte Carlo feature selection method, the incremental feature selection method, and the support vector machine (SVM) algorithm, were incorporated in the proposed method, which extensively analyzed the gene expression profiles of cell lines from different tissues. As a result, we extracted a group of functional genes that can indicate the differences of cell lines in different tissues and built an optimal SVM classifier for identifying cell lines in different tissues. In addition, a set of rules for classifying cell lines was also reported, which can give a clearer picture of cell lines in different tissues, although its performance was not better than that of the optimal SVM classifier. Finally, we compared such genes with the tissue-specific genes identified by the Genotype-Tissue Expression project. Results showed that most expression patterns between tissues remained in the derived cell lines, despite some uniqueness in that some genes show tissue specificity.

2.
In this technical note, we investigate a combination of PCA with SVM to classify gait patterns based on kinetic data. The gait data of 30 young and 30 elderly participants were recorded using a strain gauge force platform during normal walking. The gait features were first extracted from the recorded vertical foot-ground reaction force curves using PCA, and these extracted features were then used to develop the SVM gait classifier. The test results indicated that the PCA-based SVM recognized young-elderly gait patterns with an average performance of 90%, a markedly improved performance over an artificial neural network-based classifier. The classification ability of the SVM with polynomial and radial basis function kernels was superior to that of the SVM with a linear kernel. These results suggest that the proposed technique could provide an effective tool for gait classification in future clinical applications.

3.
This paper proposes a new power spectral-based hybrid genetic algorithm-support vector machines (SVMGA) technique to classify five types of electrocardiogram (ECG) beats, namely normal beats and four manifestations of heart arrhythmia. This method employs three modules: a feature extraction module, a classification module and an optimization module. The feature extraction module extracts the electrocardiogram's spectral features and three timing interval features. Non-parametric power spectral density (PSD) estimation methods are used to extract the spectral features. A support vector machine (SVM) is employed as a classifier to recognize the ECG beats. We investigate and compare two such classification approaches. In the first, the SVM parameters are specified experimentally by trial and error. In the second, the relevant parameters are optimized through an intelligent algorithm. These parameters are the Gaussian radial basis function (GRBF) kernel parameter σ and the penalty parameter C of the SVM classifier. The performance of both approaches in classifying ECG signals is then evaluated on eight files obtained from the MIT–BIH arrhythmia database. The classification accuracy of the SVMGA approach proves superior to that of the SVM with constant, manually chosen parameters.
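To make the two tuned quantities concrete, the following is a minimal sketch of a Gaussian RBF kernel and of a candidate (σ, C) grid such as a trial-and-error search might walk through. The kernel form exp(-||x - y||² / (2σ²)) is one common convention and may differ from the paper's; the candidate values and the toy feature vectors are hypothetical, and the genetic algorithm itself is not reproduced here.

```python
import math


def gaussian_rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# Hypothetical candidate values; the abstract does not report the ranges searched.
candidate_sigmas = [0.1, 1.0, 10.0]
candidate_costs = [0.1, 1.0, 10.0, 100.0]
parameter_grid = [(s, c) for s in candidate_sigmas for c in candidate_costs]

# Toy feature vectors, not real ECG features.
beat_a = [0.2, 0.5, 0.1]
beat_b = [0.3, 0.1, 0.4]

# A small sigma makes the kernel very local; a large sigma flattens it toward 1.
for sigma in candidate_sigmas:
    print(sigma, gaussian_rbf_kernel(beat_a, beat_b, sigma))
```

A larger C penalizes training errors more heavily, so σ and C are usually tuned jointly rather than one at a time.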

4.
Support vector machine applications in bioinformatics (total citations: 14; self-citations: 0; citations by others: 14)

5.
Long intergenic non-coding RNAs (lincRNAs) are a new type of non-coding RNA and are closely related to the occurrence and development of diseases. In previous studies, most lincRNAs have been identified through next-generation sequencing. Because lincRNAs exhibit tissue-specific expression, the reproducibility of lincRNA discovery across studies is very poor. In this study, without using lincRNA expression, we used sequence, structural and protein-coding potential features as candidate features to construct a classifier that can distinguish lincRNAs from non-lincRNAs. The GA–SVM algorithm was applied to extract the optimized feature subset. Compared with several other feature subsets, the five-fold cross-validation results showed that this optimized feature subset exhibited the best performance for the identification of human lincRNAs. Moreover, the LincRNA Classifier based on Selected Features (linc-SF) was constructed with a support vector machine (SVM) using the optimized feature subset. The performance of this classifier was further evaluated by predicting lincRNAs from two independent lincRNA sets. Because the recognition rates for the two lincRNA sets were 100% and 99.8%, linc-SF was found to be effective for the prediction of human lincRNAs.

6.
This paper applies and studies the behavior of three learning algorithms, i.e. the Support Vector Machine (SVM), the Radial Basis Function Network (the RBF network), and k-Nearest Neighbor (k-NN), for predicting HIV-1 drug resistance from genotype data. In addition, a new algorithm for classifier combination is proposed. A comparison of the predictive performance of the three learning algorithms shows that SVM yields the highest average accuracy, the RBF network gives the highest sensitivity, and k-NN yields the best specificity. Finally, a comparison of the predictive performance of the composite classifier with that of the three learning algorithms demonstrates that the proposed composite classifier provides the highest average accuracy.
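The abstract does not describe how the composite classifier combines its members. As a stand-in only, the sketch below shows plain majority voting over three base predictions, which is a generic combination rule and not the paper's proposed algorithm; the labels and the tie-breaking rule are assumptions.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; ties go to the earliest-listed classifier."""
    counts = Counter(predictions)
    top_count = max(counts.values())
    for label in predictions:
        if counts[label] == top_count:
            return label


# Hypothetical per-sample outputs from SVM, RBF network and k-NN, in that order.
base_predictions = ["resistant", "susceptible", "resistant"]
print(majority_vote(base_predictions))
```

With three binary classifiers a tie cannot occur, so the tie-breaking rule only matters if more labels or an even number of classifiers are used.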

7.
Histone H3 (H3) phosphorylation at Ser(10) occurs during mitosis in eukaryotes and was recently shown to play an important role in chromosome condensation in Tetrahymena. When producing monoclonal antibodies that recognize glial fibrillary acidic protein phosphorylation at Thr(7), we obtained some monoclonal antibodies that cross-reacted with early mitotic chromosomes. They reacted with a 15-kDa phosphoprotein specifically in mitotic cell lysate. By microsequencing, this phosphoprotein was shown to be H3. Mutational analysis revealed that the antibodies recognized H3 Ser(28) phosphorylation. We then produced a monoclonal antibody, HTA28, using a phosphopeptide corresponding to phosphorylated H3 Ser(28). This antibody specifically recognized the phosphorylation of H3 Ser(28) but not that of glial fibrillary acidic protein Thr(7). Immunocytochemical studies with HTA28 revealed that Ser(28) phosphorylation occurred in chromosomes predominantly during early mitosis and coincided with the initiation of mitotic chromosome condensation. Biochemical analyses using (32)P-labeled mitotic cells also confirmed that H3 is phosphorylated at Ser(28) during early mitosis. In addition, we found that H3 is phosphorylated at Ser(28) as well as Ser(10) when premature chromosome condensation was induced in tsBN2 cells. These observations suggest that H3 phosphorylation at Ser(28), together with Ser(10), is a conserved event and is likely to be involved in mitotic chromosome condensation.

8.
In optical printed Chinese character recognition (OPCCR), many classifiers have been proposed. Among them, the support vector machine (SVM) might be the best classifier. However, SVM is a two-class classifier, and when it is used for multi-class problems in OPCCR, its computation is time-consuming. Thus, we propose a neighbor-classes-based SVM (NC-SVM) to reduce the computational cost of SVM. Experiments on NC-SVM classification for OPCCR have been carried out. The results show that the proposed NC-SVM can effectively reduce the computation time in OPCCR.

9.
Microarray data provide information on the expression levels of thousands of genes in a cell in a single experiment. Numerous efforts have been made to use gene expression profiles to improve the precision of tumor classification. In the present study we used the benchmark colon cancer data set for analysis. Feature selection was done using the t-statistic. A comparative study of the class prediction accuracy of three different classifiers, viz. support vector machine (SVM), neural nets and logistic regression, was performed using the top 10 genes ranked by the t-statistic. SVM turned out to be the best classifier for this dataset based on area under the receiver operating characteristic curve (AUC) and total accuracy. Logistic regression ranks as the next best classifier, followed by the multilayer perceptron (MLP). The top 10 genes selected by us for classification are all well documented for their variable expression in colon cancer. We conclude that SVM together with t-statistic based feature selection is an efficient and viable alternative to popular techniques.
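The ranking step can be sketched as follows: compute a two-sample t-statistic per gene between the two classes and keep the genes with the largest absolute values. This is a minimal illustration that assumes Welch's unequal-variance form (the abstract does not say which variant was used); the gene names and expression values are made up.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's two-sample t-statistic (unequal variances)."""
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    if standard_error == 0:
        return 0.0
    return (mean(group_a) - mean(group_b)) / standard_error


def rank_genes(expression, labels, top_k):
    """Return the top_k gene names ordered by |t| between class 1 and class 0."""
    scores = {}
    for gene, values in expression.items():
        tumor = [v for v, label in zip(values, labels) if label == 1]
        normal = [v for v, label in zip(values, labels) if label == 0]
        scores[gene] = abs(welch_t(tumor, normal))
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data: three hypothetical genes measured in three tumor and three normal samples.
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "gene_A": [5.1, 4.9, 5.3, 1.0, 1.2, 0.9],
    "gene_B": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "gene_C": [3.0, 3.5, 2.8, 2.0, 2.4, 1.9],
}
print(rank_genes(expression, labels, top_k=2))
```

In a real analysis the ranking should be computed inside each cross-validation fold, otherwise the selected genes leak information from the test samples.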

10.
Mass spectrometry (MS)-based metabolomics studies often require handling of both identified and unidentified metabolite data. To avoid bias in data interpretation, it would be advantageous for the data analysis to include all available data. A practical challenge in exploratory metabolomics analysis is therefore how to interpret the changes related to unidentified peaks. In this paper, we address the challenge by predicting the class membership of unknown peaks by applying and comparing multiple supervised classifiers on selected lipidomics datasets. The employed classifiers include k-nearest neighbours (k-NN), support vector machines (SVM), partial least squares discriminant analysis (PLS-DA) and Naive Bayes methods, which are known to be effective and efficient in predicting the labels of unseen data. Here, the class label predictions are sought for unidentified lipid profiles coming from high-throughput global screening in an Ultra Performance Liquid Chromatography Mass Spectrometry (UPLC™/MS) experimental setup. Our investigation reveals that the k-NN and SVM classifiers outperform both the PLS-DA and Naive Bayes classifiers. The Naive Bayes classifier performs worst among all models, and this observation seems logical, as lipids are highly co-regulated and do not respect the Naive Bayes assumption that features are conditionally independent given the class. Common label predictions from k-NN and SVM can serve as a good starting point for exploring the full data and thereby facilitate exploratory studies where label information is critical for data interpretation.

11.
Zhao N, Pang B, Shyu CR, Korkin D. Proteomics, 2011, 11(22): 4321-4330
Structural knowledge about protein-protein interactions can provide insights into the basic processes underlying cell function. Recent progress in experimental and computational structural biology has led to a rapid growth of experimentally resolved structures and computationally determined near-native models of protein-protein interactions. However, determining whether a protein-protein interaction is physiological or is an artifact of an experimental or computational method remains a challenging problem. In this work, we have addressed two related problems. The first problem is distinguishing between experimentally obtained physiological and crystal-packing protein-protein interactions. The second problem is concerned with the classification of near-native and inaccurate docking models. We first defined a universal set of interface features and employed a support vector machines (SVM)-based approach to classify the interactions for both problems, with the accuracy, precision, and recall of the classifier for the first problem reaching 93%. To improve the classification, we next developed a semi-supervised learning approach for the second problem, using transductive SVM (TSVM). We applied both classifiers to a commonly used protein docking benchmark of 124 complexes. We found that while we reached classification accuracies of 78.9% for the SVM classifier and 80.3% for the TSVM classifier, improving protein-docking methods by model re-ranking remains a challenging problem.

12.
13.
Elderly tripping falls cost billions annually in medical funds and result in high mortality rates, often caused by pulmonary embolism (internal bleeding) and infected fractures that do not heal well. In this paper, we propose an intelligent gait detection system (AR-SVM) for screening elderly individuals at risk of suffering tripping falls. The motivation for this system is to provide early detection of elderly gait reminiscent of tripping characteristics so that preventive measures can be administered. Our system is composed of two stages: a predictor model estimated by an autoregressive (AR) process and a support vector machine (SVM) classifier. The system input is a digital signal constructed from consecutive measurements of minimum toe clearance (MTC) representative of steady-state walking. The AR-SVM system was tested on 23 individuals (13 healthy and 10 having suffered at least one tripping fall in the past year) who each completed a minimum of 10 min of walking on a treadmill at a self-selected pace. In the first stage, a fourth-order AR model required at least 64 MTC values to correctly detect all fallers and non-fallers. Detection was further improved to less than 1 min of walking when the model coefficients were used as input features to the SVM classifier. The system achieved a detection accuracy of 95.65% with the leave-one-out method using only 16 MTC samples, but accuracy fell to 69.57% when eight MTC samples were used. These results demonstrate a fast and efficient system requiring a small number of strides and only MTC measurements for accurate detection of tripping gait characteristics.
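The leave-one-out procedure behind these figures can be sketched generically: each of the 23 subjects is held out once, the classifier is trained on the rest, and the held-out subject is predicted. In the sketch below a nearest-class-mean rule on a single made-up feature stands in for the AR-SVM system, so every data value and helper name is hypothetical. With 23 subjects, the reported 95.65% and 69.57% are consistent with 22 and 16 correct detections, respectively.

```python
def leave_one_out_accuracy(samples, labels, train_fn, predict_fn):
    """Hold out each sample once, train on the rest, and score the held-out prediction."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train_fn(train_x, train_y)
        if predict_fn(model, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


def train_nearest_mean(xs, ys):
    """Stand-in classifier (not an SVM): store the mean feature value per class."""
    grouped = {}
    for x, y in zip(xs, ys):
        grouped.setdefault(y, []).append(x)
    return {label: sum(values) / len(values) for label, values in grouped.items()}


def predict_nearest_mean(model, x):
    """Predict the class whose stored mean is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


# Made-up one-dimensional feature per subject (e.g. one AR coefficient).
features = [0.10, 0.20, 0.15, 0.90, 1.00, 0.95]
groups = ["non-faller"] * 3 + ["faller"] * 3
print(leave_one_out_accuracy(features, groups, train_nearest_mean, predict_nearest_mean))

# Arithmetic check of the reported percentages for 23 subjects.
print(round(100 * 22 / 23, 2), round(100 * 16 / 23, 2))
```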

14.
Computational models of cytochrome P450 3A4 inhibition were developed based on high-throughput screening data for 4470 proprietary compounds. Multiple models differentiating inhibitors (IC50 < 3 µM) from noninhibitors were generated using various machine-learning algorithms (recursive partitioning [RP], Bayesian classifier, logistic regression, k-nearest-neighbor, and support vector machine [SVM]) with structural fingerprints and topological indices. Nineteen models were evaluated by internal 10-fold cross-validation and also on an independent test set. The three most predictive models, Barnard Chemical Information (BCI)-fingerprint/SVM, MDL-keyset/SVM, and topological indices/RP, correctly classified 249, 248, and 236 compounds of 291 noninhibitors and 135, 137, and 147 compounds of 179 inhibitors in the validation set. Their overall accuracies were 82%, 82%, and 81%, respectively. An investigation of the applicability of the BCI/SVM model found a strong correlation between predictive performance and structural similarity to the training set. Using the Tanimoto similarity index as a confidence measure for the predictions, the limit of extrapolation was 0.7 in the case of the BCI/SVM model. Taking the consensus of the three best models yielded a further improvement in predictive capability (kappa = 0.65 and accuracy = 83%). The consensus model could also be tuned to minimize either false positives or false negatives, depending on the emphasis of the screening.
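The reported accuracies follow directly from the validation-set counts, and Cohen's kappa can be derived from the same counts. The sketch below recomputes both for the three individual models, treating inhibitors as the positive class (an assumption about labeling only). The consensus model's kappa of 0.65 cannot be recomputed this way because its per-class counts are not given.

```python
def accuracy_and_kappa(true_pos, true_neg, n_pos, n_neg):
    """Accuracy and Cohen's kappa from correct counts per class and class sizes."""
    false_neg = n_pos - true_pos
    false_pos = n_neg - true_neg
    total = n_pos + n_neg
    observed = (true_pos + true_neg) / total
    predicted_pos = true_pos + false_pos
    predicted_neg = true_neg + false_neg
    # Agreement expected by chance, from the row and column marginals.
    expected = (n_pos * predicted_pos + n_neg * predicted_neg) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


N_INHIBITORS = 179
N_NONINHIBITORS = 291

# (correctly classified inhibitors, correctly classified noninhibitors) per model.
models = {
    "BCI-fingerprint/SVM": (135, 249),
    "MDL-keyset/SVM": (137, 248),
    "topological indices/RP": (147, 236),
}

for name, (tp, tn) in models.items():
    accuracy, kappa = accuracy_and_kappa(tp, tn, N_INHIBITORS, N_NONINHIBITORS)
    print(f"{name}: accuracy = {accuracy:.1%}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement expected by chance, which matters here because the two classes are unbalanced (291 versus 179 compounds).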

15.
Comprehensive proteome analysis of rare cell phenotypes remains a significant challenge. We report a method for low cell number MS-based proteomics using protease digestion of mildly formaldehyde-fixed cells in cellulo, which we call the “in-cell digest.” We combined this with averaged MS1 precursor library matching to quantitatively characterize proteomes from low cell numbers of human lymphoblasts. About 4500 proteins were detected from 2000 cells, and 2500 proteins were quantitated from 200 lymphoblasts. The ease of sample processing and high sensitivity make this method exceptionally suited for the proteomic analysis of rare cell states, including immune cell subsets and cell cycle subphases. To demonstrate the method, we characterized the proteome changes across 16 cell cycle states (CCSs) isolated from asynchronous TK6 cells, avoiding synchronization. States included late mitotic cells present at extremely low frequency. We identified 119 pseudoperiodic proteins that vary across the cell cycle. Clustering of the pseudoperiodic proteins showed abundance patterns consistent with “waves” of protein degradation in late S, at the G2&M border, midmitosis, and at mitotic exit. These clusters were distinguished by significant differences in predicted nuclear localization and interaction with the anaphase-promoting complex/cyclosome. The dataset also identifies putative anaphase-promoting complex/cyclosome substrates in mitosis and the temporal order in which they are targeted for degradation. We demonstrate that a protein signature made of these 119 high-confidence cell cycle–regulated proteins can be used to perform unbiased classification of proteomes into CCSs. We applied this signature to 296 proteomes that encompass a range of quantitation methods, cell types, and experimental conditions. The analysis confidently assigns a CCS for 49 proteomes, including correct classification for proteomes from synchronized cells. We anticipate that this robust cell cycle protein signature will be crucial for classifying cell states in single-cell proteomes.

16.
17.
A system has been developed that automatically recognizes the mitotic phase of human chromosome spreads for karyotyping. Suitable spreads are classified into one of five subphases of mitosis. Classification is performed on the basis of summed chromosome length and most probable chromosome width. Classification requires 100-500 msec. A television camera scans the spread through microscope optics; a computer and special-purpose electronics process the video signals to generate run-length histograms. The histograms are used to determine the mitotic phase. A total of 133 unbanded spreads were classified with a 4.5% error rate. One hundred banded spreads were classified with a 15% error rate.

18.
Functional annotation of protein sequences with low similarity to well-characterized protein sequences is a major challenge of computational biology in the post-genomic era. The cyclin protein family is one such important family of proteins; it consists of sequences with low sequence similarity, which makes discovering novel cyclins and establishing orthologous relationships among the cyclins a difficult task. The currently identified cyclin motifs and cyclin-associated domains do not represent all of the identified and characterized cyclin sequences. We describe a Support Vector Machine (SVM) based classifier, CyclinPred, which can predict cyclin sequences with high efficiency. The SVM classifier was trained with features of selected cyclin and non-cyclin protein sequences. The training features of the protein sequences include amino acid composition, dipeptide composition, secondary structure composition and PSI-BLAST generated Position Specific Scoring Matrix (PSSM) profiles. Results obtained from leave-one-out cross-validation (jackknife test), self-consistency and holdout tests show that the SVM classifier trained with PSSM profile features was more accurate than the classifiers based on any of the other features alone or on hybrids of these features. A cyclin prediction server, CyclinPred, has been set up based on the SVM model trained with PSSM profiles. CyclinPred prediction results show that the method may be used as a cyclin prediction tool, complementing conventional cyclin prediction methods.

19.
Since a culture increases in cell number when dividing cells separate into two newborn cells, the fraction of mitotic cells in a growing cell population directly reflects the overall growth behavior of a cell culture. To rapidly assess the effects of growth conditions on the fraction of mitotic cells, we have employed an antibody specific for the phosphorylated form of histone H3 for the identification of mitotic cells using flow cytometry. The phosphorylation of histone H3 closely correlates with the chromosomal condensation that accompanies the onset of mitosis, and, therefore, it represents a convenient marker for dividing cells. We have optimized the protocol for the staining of mitotic cells for both Chinese hamster ovary and hybridoma cell cultures. Fluorescence micrographs taken of stained cells show that cells in the various stages of mitosis can be detected based on the morphological characteristics of the chromosomes. The variation in the mitotic cell fraction has been determined throughout the batch growth phases of cultures under different growth conditions. The dynamics of the mitotic index show that balanced growth was never truly reached and that the growth rate is in fact quite variable for these cultures, since large variations in the mitotic index are observed. In addition, a large increase in the fraction of mitotic cells just prior to the exponential growth phase for all cultures indicates that they are partially synchronized at the exit from the lag phase. According to a two-stage, age-structured population balance model, the mitotic index is directly proportional to the growth rate of a culture. The proportionality constant for this case is shown to be the time required for cells to progress through mitosis. This time is believed to be constant for a particular cell line, as shown by experimental data. Thus, growth rates can be determined solely by measurement of the fraction of cells in mitosis. The mitotic index measurements were then used to calculate the growth in cell number of the cultures, and these simulations accurately reflect observed cell counts. Other simulations also show that changes in cell growth can be predicted before they are reflected in the cell count data. This technique can be used as a sensitive indicator of cell growth and could be useful as a process monitoring technique and for developing better feeding strategies for animal cell cultures.
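The stated proportionality can be turned into a short calculation: if the mitotic index equals the growth rate multiplied by the time spent in mitosis, then the growth rate is the mitotic index divided by that time, and cell counts can be propagated between measurements. The sketch below assumes piecewise-constant growth over each sampling interval; the mitosis duration, sampling interval, starting count and index values are all hypothetical, not data from the study.

```python
import math


def growth_rate_from_mitotic_index(mitotic_index, mitosis_duration_h):
    """Specific growth rate (1/h), assuming mitotic_index = growth_rate * mitosis_duration."""
    return mitotic_index / mitosis_duration_h


def simulate_cell_count(initial_count, mitotic_indices, mitosis_duration_h, step_h):
    """Propagate cell number, holding the growth rate constant within each interval."""
    counts = [initial_count]
    for mitotic_index in mitotic_indices:
        rate = growth_rate_from_mitotic_index(mitotic_index, mitosis_duration_h)
        counts.append(counts[-1] * math.exp(rate * step_h))
    return counts


# Hypothetical inputs: 0.5 h in mitosis, one mitotic index reading every 12 h.
measured_indices = [0.005, 0.020, 0.018, 0.010]
trajectory = simulate_cell_count(
    initial_count=2.0e5,
    mitotic_indices=measured_indices,
    mitosis_duration_h=0.5,
    step_h=12.0,
)
for hours, count in zip(range(0, 60, 12), trajectory):
    print(hours, f"{count:.3e}")
```

Because the index responds before the cell count does, a drop in the measured index lowers the projected trajectory one interval ahead of the counts, which is the early-warning behavior the abstract describes.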

20.
Summary: A method based on BrdU incorporation for analyzing in detail the kinetics of the cell cycle is described. The S phase has been subdivided into five subphases, each recognizable by its BrdU incorporation pattern at metaphase. The method can be useful for the study of abnormal cell cycles, and may have particular application in mutagenesis studies concerning the various subphases of the S phase, without using synchronization techniques. An application of the method is described, showing that γ-irradiation during the course of the S phase leads to a lack of cells that were in early S phase at the time of irradiation. This finding can be related either to a higher lethality at this stage of the cell cycle or to a delay in completion of DNA replication after irradiation. Holder of a C.E.C. scholarship.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司). 京ICP备09084417号