Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Information on the subcellular localization of proteins is crucially important for revealing their biological functions in a cell, the basic unit of life. With the avalanche of protein sequences generated in the postgenomic age, it is highly desirable to develop computational tools that identify their subcellular locations in a timely manner from sequence information alone. The current study focuses on Gram-negative bacterial proteins. Although considerable efforts have been made in protein subcellular prediction, the problem is far from solved, because mounting evidence indicates that many Gram-negative bacterial proteins exist in two or more location sites. Unfortunately, most existing methods can deal with single-location proteins only. Proteins with multiple locations may in fact have special biological functions important for both basic research and drug design. In this study, using multi-label theory, we developed a new predictor called “pLoc-mGneg” for predicting the subcellular localization of Gram-negative bacterial proteins with both single and multiple locations. Rigorous cross-validation on a high-quality benchmark dataset indicated that the proposed predictor is remarkably superior to “iLoc-Gneg”, the state-of-the-art predictor for the same purpose. For the convenience of most experimental scientists, a user-friendly web-server for the novel predictor has been established at http://www.jci-bioinfo.cn/pLoc-mGneg/, by which users can easily get their desired results without needing to go through the complicated mathematics involved.

2.
Many efforts have been made to predict the subcellular localization of eukaryotic proteins, but most existing methods have two limitations: (1) they cover fewer than ten locations, so many organelles in a eukaryotic cell cannot be covered, and (2) they can only deal with single-label systems in which each constituent protein has one and only one location. Proteins with multiple locations are particularly interesting, since they may have exceptional functions that are very important for an in-depth understanding of the biological processes in a cell and for selecting drug targets. Although several predictors (such as “Euk-mPLoc”, “Euk-PLoc 2.0” and “iLoc-Euk”) can cover up to 22 different location sites and can also treat multi-labeled proteins, further efforts are needed to improve their prediction quality, particularly in enhancing the absolute true rate and reducing the absolute false rate. Here we propose a new predictor called “pLoc-mEuk”, built by extracting the key GO (Gene Ontology) information into the general PseAAC (Pseudo Amino Acid Composition). Rigorous cross-validations on a high-quality and stringent benchmark dataset have indicated that the proposed pLoc-mEuk predictor is remarkably superior to iLoc-Euk, the best of the aforementioned three predictors. To maximize the convenience of most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc-mEuk/, by which users can easily get their desired results without needing to go through the complicated mathematics involved.
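
The abstract above names the absolute true rate and the absolute false rate without defining them. The following is a minimal sketch of how such multi-label scores are commonly computed, assuming the usual definitions (exact-match ratio and a Hamming-style error normalized by the number of candidate locations). The function names and the toy data are hypothetical and are not taken from the paper.

```python
def absolute_true(true_sets, pred_sets):
    """Fraction of proteins whose predicted location set equals the true set exactly."""
    hits = sum(1 for t, p in zip(true_sets, pred_sets) if t == p)
    return hits / len(true_sets)


def absolute_false(true_sets, pred_sets, n_labels):
    """Average fraction of candidate locations that are missed or wrongly assigned."""
    # t ^ p is the symmetric difference: labels in exactly one of the two sets.
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * n_labels)


# Toy data (hypothetical): three proteins, 22 candidate locations.
true_sets = [{"nucleus"}, {"cytoplasm", "nucleus"}, {"mitochondrion"}]
pred_sets = [{"nucleus"}, {"cytoplasm"}, {"mitochondrion", "nucleus"}]

print(absolute_true(true_sets, pred_sets))
print(absolute_false(true_sets, pred_sets, n_labels=22))
```

Under these assumed definitions a perfect predictor scores 1 on the first measure and 0 on the second, so the two are read in opposite directions.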

3.
Knowledge of protein subcellular localization is vitally important for both basic research and drug development. With the avalanche of protein sequences emerging in the post-genomic age, it is highly desirable to develop computational tools that identify their subcellular localization in a timely and effective manner from sequence information alone. Recently, a predictor called “pLoc-mGpos” was developed for identifying the subcellular localization of Gram-positive bacterial proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, called “multiplex proteins”, may simultaneously occur in two or more subcellular locations. Although it is a very powerful predictor, more effort is needed to improve it further, because pLoc-mGpos was trained on an extremely skewed dataset in which one subset (subcellular location) was over 11 times the size of the other subsets. Accordingly, it cannot avoid the bias caused by such an uneven training dataset. To alleviate this bias, we have developed a new, bias-reducing predictor called pLoc_bal-mGpos by quasi-balancing the training dataset. Rigorous target jackknife tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mGpos, the existing state-of-the-art predictor for identifying the subcellular localization of Gram-positive bacterial proteins. To maximize the convenience of most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mGpos/, by which users can easily get their desired results without needing to go through the detailed mathematics.

4.
5.
As one of the most important posttranslational modifications (PTMs), ubiquitination plays an important role in regulating a variety of biological processes, such as signal transduction, cell division, apoptosis, and immune response. Ubiquitination is also named “lysine ubiquitination” because it occurs when a ubiquitin is covalently attached to lysine (K) residues of target proteins. Given an uncharacterized protein sequence that contains many lysine residues, which of them are ubiquitination sites and which are not? With the avalanche of protein sequences generated in the postgenomic age, it is highly desirable for both basic research and drug development to develop an automated method for rapidly and accurately annotating the ubiquitination sites in proteins. In view of this, a new predictor called “iUbiq-Lys” was developed based on evolutionary information, the gray system model, and the general form of pseudo-amino acid composition. Rigorous cross-validations demonstrated that the new predictor remarkably outperformed all its counterparts. As a web-server, iUbiq-Lys is accessible to the public at http://www.jci-bioinfo.cn/iUbiq-Lys. For the convenience of most experimental scientists, we have further provided a step-by-step guide, by which users can easily get their desired results without needing to follow the complicated mathematics, which are presented in this paper only for the integrity of the development process.

6.
7.
With the explosive growth of protein sequences entering protein data banks in the post-genomic era, there is high demand for automated methods that rapidly and effectively identify protein–protein binding sites (PPBSs) from sequence information alone. To address this problem, we proposed a predictor called iPPBS-PseAAC, in which each amino acid residue site of the proteins concerned is treated as a 15-tuple peptide segment generated by sliding a window along the protein chain with its center aligned with the target residue. The working peptide segment is further formulated by a general form of pseudo amino acid composition via the following procedures: (1) it is converted into a numerical series via the physicochemical properties of amino acids; (2) the numerical series is subsequently converted into a 20-D feature vector by means of the stationary wavelet transform technique. The prediction engine is a two-layer ensemble classifier formed by many individual “Random Forest” classifiers, with the 1st layer voting out the best training dataset from many bootstrap systems and the 2nd layer voting out the most relevant one from seven physicochemical properties. Cross-validation tests indicate that the new predictor is very promising, meaning that many important key features, which are deeply hidden in complicated protein sequences, can be extracted via the wavelet transform approach. This is quite consistent with the fact that many important biological functions of proteins can be elucidated through their low-frequency internal motions. The web server of iPPBS-PseAAC is accessible at http://www.jci-bioinfo.cn/iPPBS-PseAAC, by which users can easily acquire their desired results without needing to follow the complicated mathematical equations involved.
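
The window step described above (one 15-residue segment per residue, centered on that residue) can be sketched as follows. This is an illustrative sketch only: the abstract does not say how chain termini are handled, so padding with a dummy residue `X` is an assumption, and the function name `sliding_windows` is hypothetical.

```python
def sliding_windows(sequence, width=15, pad="X"):
    """Return one fixed-width segment per residue, centered on that residue.

    Padding the chain ends with a dummy residue is an assumption; the
    abstract does not describe how termini are treated.
    """
    if width % 2 == 0:
        raise ValueError("width must be odd so the window has a center")
    half = width // 2
    padded = pad * half + sequence + pad * half
    # Segment i covers padded[i : i + width]; its center is sequence[i].
    return [padded[i:i + width] for i in range(len(sequence))]


sequence = "ACDEFGHIKLMNPQRSTVWY"  # toy 20-residue chain
segments = sliding_windows(sequence)
for residue, segment in list(zip(sequence, segments))[:3]:
    print(residue, segment)
```

Each segment would then be mapped to a numerical series and a feature vector, which this sketch does not attempt.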

8.
Information about the interactions of drug compounds with proteins in cellular networking is very important for drug development. Unfortunately, all the existing predictors for identifying drug–protein interactions were trained on a skewed benchmark dataset in which the number of non-interactive drug–protein pairs is overwhelmingly larger than that of the interactive ones. Training predictors on this kind of highly unbalanced benchmark dataset leads to many interactive drug–protein pairs being mispredicted as non-interactive. Since the minority interactive pairs often contain the most important information for drug design, it is necessary to minimize this kind of misprediction. In this study, we adopted the neighborhood cleaning rule and the synthetic minority over-sampling technique to treat the skewed benchmark datasets and balance the positive and negative subsets. The new benchmark datasets thus obtained are called the optimized benchmark datasets, based on which a new predictor called iDrug-Target was developed. It contains four sub-predictors: iDrug-GPCR, iDrug-Chl, iDrug-Ezy, and iDrug-NR, specialized for identifying the interactions of drug compounds with GPCRs (G-protein-coupled receptors), ion channels, enzymes, and NRs (nuclear receptors), respectively. Rigorous cross-validations on a set of experiment-confirmed datasets have indicated that these new predictors remarkably outperformed the existing ones for the same purpose. To maximize users’ convenience, a publicly accessible web server for iDrug-Target has been established at http://www.jci-bioinfo.cn/iDrug-Target/, by which users can easily get their desired results. It has not escaped our notice that the aforementioned strategy can be widely used in many other areas as well.
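
The synthetic minority over-sampling technique mentioned above can be illustrated with a minimal sketch. It assumes the textbook formulation (interpolating between a minority sample and one of its k nearest minority neighbours); the parameter values, the function names, and the toy points are hypothetical. The neighborhood cleaning rule is not shown.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Toy minority class (hypothetical feature vectors for interactive pairs).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6)
print(new_points)
```

Because each synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies.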

9.
Involved in many diseases such as cancer, diabetes, and neurodegenerative, inflammatory and respiratory disorders, G-protein-coupled receptors (GPCRs) are among the most frequent targets of therapeutic drugs. It is time-consuming and expensive to determine purely by experimental techniques whether a drug and a GPCR interact with each other in a cellular network. Although some computational methods have been developed for this purpose based on knowledge of the 3D (three-dimensional) structure of the protein, their usage is quite limited because the 3D structures of most GPCRs are still unknown. To overcome this situation, a sequence-based classifier called “iGPCR-Drug” was developed to predict the interactions between GPCRs and drugs in cellular networking. In the predictor, the drug compound is formulated by a 2D (two-dimensional) fingerprint via a 256D vector, the GPCR by the PseAAC (pseudo amino acid composition) generated with the grey model theory, and the prediction engine is operated by the fuzzy K-nearest neighbour algorithm. Moreover, a user-friendly web-server for iGPCR-Drug was established at http://www.jci-bioinfo.cn/iGPCR-Drug/. For the convenience of most experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results without needing to follow the complicated mathematical equations, which are presented in this paper only for its integrity. The overall success rate achieved by iGPCR-Drug via the jackknife test was 85.5%, which is remarkably higher than the rate of the existing peer method developed in 2010, for which no web server was ever established. It is anticipated that iGPCR-Drug may become a useful high-throughput tool for both basic research and drug development, and that the approach presented here can also be extended to study other drug–target interaction networks.
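
As an illustration of the fuzzy K-nearest neighbour idea named above, here is a minimal sketch assuming the commonly used distance-weighted membership formulation with crisp training labels. The values of `k` and the fuzzifier `m`, the function name, and the toy vectors are assumptions; the abstract does not report the settings actually used.

```python
import math


def fuzzy_knn(train, labels, query, k=3, m=2.0, eps=1e-9):
    """Return class memberships for query from its k nearest training samples.

    Each neighbour votes with weight 1 / d**(2 / (m - 1)); memberships sum to 1.
    """
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train, labels))[:k]
    weights = {}
    total = 0.0
    for d, y in nearest:
        w = 1.0 / (d ** (2.0 / (m - 1.0)) + eps)  # eps guards against d == 0
        weights[y] = weights.get(y, 0.0) + w
        total += w
    return {y: w / total for y, w in weights.items()}


# Toy data (hypothetical 2-D feature vectors for drug-GPCR pairs).
train = [[0, 0], [0, 1], [5, 5], [5, 6]]
labels = ["interacting", "interacting", "non-interacting", "non-interacting"]
memberships = fuzzy_knn(train, labels, query=[0.2, 0.1])
print(memberships)
```

Unlike a plain majority vote, the output is a graded membership per class, so a query that sits between two clusters receives intermediate values instead of a hard label.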

10.
Malaria has become a cause of poverty and a major hindrance to economic development. The culprit of the disease is the parasite, which secretes an array of proteins within the host erythrocyte to facilitate its own survival. Accordingly, the secretory proteins of the malaria parasite have become a logical target for drug design against malaria. Unfortunately, with increasing resistance to the drugs thus developed, the situation has become more complicated. To cope with the drug resistance problem, one strategy is to identify in a timely manner the proteins secreted by the malaria parasite, which can serve as potential drug targets. However, it is both expensive and time-consuming to identify the secretory proteins of the malaria parasite by experiments alone. To expedite the development of effective drugs against malaria, a computational predictor called “iSMP-Grey” was developed that can identify the secretory proteins of the malaria parasite from protein sequence information alone. During the prediction process, a protein sample was formulated as a 60D (60-dimensional) feature vector formed by incorporating sequence evolution information into the general form of PseAAC (pseudo amino acid composition) via a grey system model, which is particularly useful for solving complicated problems that lack sufficient information or need to process uncertain information. The jackknife test showed that iSMP-Grey achieved an overall success rate of 94.8%, remarkably higher than those of the existing predictors in this area. As a user-friendly web-server, iSMP-Grey is freely accessible to the public at http://www.jci-bioinfo.cn/iSMP-Grey. Moreover, for the convenience of most experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results without needing to follow the complicated mathematical equations involved in this paper.

11.
Circular RNAs (circRNAs) from back-splicing of exon(s) have recently been found to be broadly expressed in eukaryotes, in tissue- and species-specific manners. Although the functions of most circRNAs remain elusive, some circRNAs have been shown to function in gene expression regulation and may be related to diseases. Owing to their stability, circRNAs can also be used as biomarkers for diagnosis. Profiling circRNAs by integrating their expression across different samples thus provides a molecular basis for further functional study of circRNAs and their potential clinical application. Here, we report CIRCpedia v2, an updated database for comprehensive circRNA annotation from over 180 RNA-seq datasets across six different species. This atlas allows users to search, browse, and download circRNAs with expression features in various cell types/tissues, including disease samples. In addition, the updated database incorporates conservation analysis of circRNAs between humans and mice. Finally, the web interface also contains computational tools to compare circRNA expression among samples. CIRCpedia v2 is accessible at http://www.picb.ac.cn/rnomics/circpedia.

12.
Lin WZ, Fang JA, Xiao X, Chou KC. PLoS One, 2011, 6(9): e24756
DNA-binding proteins play crucial roles in various cellular processes. Developing high-throughput tools for rapidly and effectively identifying DNA-binding proteins is one of the major challenges in the field of genome annotation. Although many efforts have been made in this regard, further effort is needed to enhance the prediction power. By incorporating features extracted from protein sequences via the "grey model" into the general form of pseudo amino acid composition, and by adopting the random forest operation engine, we proposed a new predictor, called iDNA-Prot, for identifying uncharacterized proteins as DNA-binding or non-DNA-binding proteins from their amino acid sequence information alone. The overall success rate of iDNA-Prot was 83.96%, obtained via jackknife tests on a newly constructed stringent benchmark dataset in which none of the proteins included has ≥25% pairwise sequence identity to any other in the same subset. In addition to achieving a high success rate, iDNA-Prot requires remarkably less computational time than the relevant existing predictors. Hence it is anticipated that iDNA-Prot may become a useful high-throughput tool for large-scale analysis of DNA-binding proteins. As a user-friendly web-server, iDNA-Prot is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iDNA-Prot or http://www.jci-bioinfo.cn/iDNA-Prot. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results.
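
The jackknife test cited above is a leave-one-out procedure: each sample is held out in turn, the predictor is trained on the rest, and the held-out sample is predicted. The sketch below shows only that loop. A toy nearest-neighbour rule stands in for the random forest engine, and the data and function names are hypothetical.

```python
import math


def nearest_neighbour_predict(train_x, train_y, query):
    """Toy stand-in classifier: copy the label of the closest training sample."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def jackknife_success_rate(xs, ys, predict):
    """Leave each sample out once, predict it from the others, count the hits."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Toy data (hypothetical): 1 = DNA-binding, 0 = non-DNA-binding.
xs = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
ys = [1, 1, 1, 0, 0, 0]
rate = jackknife_success_rate(xs, ys, nearest_neighbour_predict)
print(rate)
```

Because every sample is tested exactly once and never seen during its own training round, the result does not depend on a random split, which is one reason this test is often reported for small benchmark datasets.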

13.
The ancestry composition of populations and individuals has been extensively investigated in recent years owing to advances in genotyping and sequencing technologies. As the number of populations and individuals used for ancestry inference increases remarkably, say to more than 100 populations or 1000 individuals, it is usually challenging to present the ancestry composition in the traditional way using a rectangular graph. To address this issue, we developed a program, AncestryPainter, which illustrates the ancestry composition of populations and individuals with a compact circular graph to save space. Individuals are depicted as fixed-length bars partitioned into colored segments representing different ancestries, and the population of interest can be highlighted as a pie chart in the center of the circle plot. In addition, AncestryPainter can be applied to display personal ancestry in a way similar to that for displaying population ancestry. AncestryPainter is publicly available at http://www.picb.ac.cn/PGG/resource.php.

14.
The pregnane X receptor (PXR), in its activated state, is a validated target for controlling certain drug–drug interactions in humans. In this context, there is a paucity of inhibitors directed toward activated PXR. Using prior observations with ketoconazole as a PXR inhibitor, the target compound 3 was synthesized from (S)-glycidol in 56% overall yield. (+)-Glycidol was reacted with 4-bromophenol and potassium carbonate in DMF to yield the ring-opened compound 6. This was then heated to reflux in benzene with 2′,4′-difluoroacetophenone and a catalytic amount of para-toluenesulfonic acid to yield 8. The resultant acetal 8 was then functionalized using palladium chemistry to yield the target compound 3. The activity of the compound was compared with that of ketoconazole and UCL2158H. In contrast with ketoconazole (IC50  0.020 μM; 100% inhibition), 3 has negligible effects on inhibition of microsomal CYP450 (maximum 20% inhibition) at concentrations >40 μM. In vitro, micromolar concentrations of ketoconazole are toxic to passaged human cell lines, while 3 does not exhibit cytotoxicity at concentrations up to 100 μM (viability >85%). This is the first demonstration of a chemical analog of a PXR inhibitor that retains activity against activated PXR. Furthermore, in contrast with ketoconazole, 3 is less toxic in human cell lines and has negligible CYP450 activity.

15.
Objectives: Thiopurines play an essential role in the management of inflammatory bowel diseases (IBD, i.e. Crohn's disease and ulcerative colitis). Over the past decade, several strategies to optimize treatment with thiopurines have been evaluated, including co-administration of allopurinol, a xanthine-oxidoreductase (XO) inhibitor, to low-dose thiopurine therapy. We aimed to assess the inter-individual variability of XO-activity between IBD-patients.

Methods: We assessed XO activity in the serum of IBD patients from two medical centers in The Netherlands using the Amplex® Red Xanthine/Xanthine Oxidase Assay Kit, which measures superoxide formation in a coupled reaction that yields the red-fluorescent oxidation product resorufin.

Results: We observed a high inter-individual variability of XO-activity in 119 patients, with a median activity of 16 µU/ml/hour (range 1–85 µU/ml/hour). The XO-activity was influenced by gender (male 19.5 vs. female 14.0 µU/ml/hour, p < 0.01), patient's age (Pearson's correlation r = 0.21, p = 0.02) and duration of IBD (r = 0.23, p = 0.01). The XO activity was not affected by the type of IBD, smoking status, body mass index or (type of) thiopurine use (p > 0.05).

Conclusions: There is high inter-individual variability of XO activity in IBD patients; XO activity is positively associated with male gender and patient's age.
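
The correlation results above can be sanity-checked with the standard t statistic for a Pearson coefficient, t = r·sqrt(n − 2) / sqrt(1 − r²). The sketch below assumes that all 119 patients contributed to each correlation, which the abstract does not state; the function name is hypothetical.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing a Pearson correlation r against zero, n samples."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


n_patients = 119  # assumption: every patient is included in both correlations
t_age = correlation_t_statistic(0.21, n_patients)
t_duration = correlation_t_statistic(0.23, n_patients)
print(round(t_age, 2), round(t_duration, 2))
```

With 117 degrees of freedom, a two-sided test at the 0.05 level needs |t| above roughly 1.98, so both statistics computed under this assumption fall on the significant side, in line with the reported p values.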


16.
17.
DNA methylation is an important epigenetic mark that plays a vital role in gene expression and cell differentiation. The average DNA methylation level among a group of cells has been extensively documented. However, the cell-to-cell heterogeneity in DNA methylation, which reflects the differentiation of epigenetic status among cells, remains less investigated. Here we established a gold standard of the cell-to-cell heterogeneity in DNA methylation based on single-cell bisulfite sequencing (BS-seq) data. With that, we optimized a computational pipeline for estimating the heterogeneity in DNA methylation from bulk BS-seq data. We further built HeteroMeth, a database for searching, browsing, visualizing, and downloading the data for heterogeneity in DNA methylation for a total of 141 samples in humans, mice, Arabidopsis, and rice. Three genes are used as examples to illustrate the power of HeteroMeth in the identification of unique features in DNA methylation. The optimization of the computational strategy and the construction of the database in this study complement the recent experimental attempts on single-cell DNA methylomes and will facilitate the understanding of epigenetic mechanisms underlying cell differentiation and embryonic development. HeteroMeth is publicly available at http://qianlab.genetics.ac.cn/HeteroMeth.

18.
By introducing the "multi-layer scale" and hybridizing gene ontology information with sequential evolution information, a novel predictor, called iLoc-Gpos, has been developed for predicting the subcellular localization of Gram-positive bacterial proteins with both single-location and multiple-location sites. To facilitate comparison, the same stringent benchmark dataset used to estimate the accuracy of Gpos-mPLoc was adopted to demonstrate the power of iLoc-Gpos. The dataset contains 519 Gram-positive bacterial proteins classified into the following four subcellular locations: (1) cell membrane, (2) cell wall, (3) cytoplasm, and (4) extracell; none of the proteins included has ≥25% pairwise sequence identity to any other in the same subset (subcellular location). The overall success rate of iLoc-Gpos by the jackknife test on this stringent benchmark dataset was over 93%, which is about 11% higher than that of Gpos-mPLoc. As a user-friendly web-server, iLoc-Gpos is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Gpos or http://www.jci-bioinfo.cn/iLoc-Gpos. A step-by-step guide is also provided on how to use the web-server to get the desired results. Furthermore, for the user's convenience, the iLoc-Gpos web-server accepts batch job submission, which is not available in the existing version of the Gpos-mPLoc web-server.

19.

Key message

We develop a set of universal genetic markers based on single-copy orthologous (COSII) genes in Poaceae.

Abstract

Being evolutionarily conserved, single-copy orthologous (COSII) genes are particularly useful in comparative mapping and phylogenetic investigation among species. In this study, we identified 2,684 COSII genes based on five sequenced Poaceae genomes (rice, maize, sorghum, foxtail millet, and brachypodium), and then developed 1,072 COSII markers, whose transferability and polymorphism among five bamboo species were further evaluated with 46 pairs of randomly selected primers. Of the 46 primers, 91.3% gave clear amplification in at least one bamboo species, and 65.2% produced polymorphism in more than one species. We also used 42 of them to construct a phylogeny for the five bamboo species, which might reflect a more precise evolutionary relationship than the one based on vegetative morphology. The results indicate a promising prospect for applying these markers to the investigation of genetic diversity and the classification of Poaceae. To ease access to information of common interest to readers, a web-based database of the COSII markers is provided (http://www.sicau.edu.cn/web/yms/PCOSWeb/PCOS.html).

20.
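In healthcare reform, hardware is the foundation and software is the core, and building the workforce of primary-care institutions is of vital importance. This paper offers suggestions on how to attract graduates to work at the primary-care level and how to raise the technical skills of the existing primary-care workforce. Toward the overall goal of training 300,000 general practitioners by 2020, it also proposes using social financing to train rural general practitioners.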
