Similar articles
 20 similar articles found (search time: 15 ms)
1.
2.
Shen HB, Yang J, Chou KC. Amino Acids. 2007;33(1):57-67
With the avalanche of newly found protein sequences emerging in the post-genomic era, it is highly desirable to develop an automated method for quickly and reliably identifying their subcellular locations, because knowledge thus obtained can provide key clues for revealing their functions and understanding how they interact with each other in cellular networking. However, predicting the subcellular location of eukaryotic proteins is a challenging problem, particularly when unknown query proteins do not have significant homology to proteins of known subcellular locations and when more locations need to be covered. To cope with the challenge, protein samples are formulated by hybridizing the information derived from the gene ontology database and amphiphilic pseudo amino acid composition. Based on such a representation, a novel ensemble hybridization classifier was developed by fusing many basic individual classifiers through a voting system. Each of these basic classifiers was engineered by the KNN (K-Nearest Neighbor) principle. As a demonstration, a new benchmark dataset was constructed that covers the following 18 localizations: (1) cell wall, (2) centriole, (3) chloroplast, (4) cyanelle, (5) cytoplasm, (6) cytoskeleton, (7) endoplasmic reticulum, (8) extracell, (9) Golgi apparatus, (10) hydrogenosome, (11) lysosome, (12) mitochondria, (13) nucleus, (14) peroxisome, (15) plasma membrane, (16) plastid, (17) spindle pole body, and (18) vacuole. To avoid homology bias, none of the proteins included has ≥25% sequence identity to any other in the same subcellular location. The overall success rates thus obtained via the 5-fold and jackknife cross-validation tests were 81.6% and 80.3%, respectively, which were 40-50% higher than those achieved by the other existing methods on the same strict dataset. The powerful predictor, named "Euk-PLoc", is available as a web server at http://202.120.37.186/bioinf/euk .
Furthermore, to support the needs of people working in the relevant areas, a downloadable file will be provided at the same website to list the results predicted by Euk-PLoc for all eukaryotic protein entries (excluding fragments) in the Swiss-Prot database that do not have subcellular location annotations or are annotated as being uncertain. The large-scale results will be updated twice a year to include the new entries of eukaryotic proteins and reflect the continuous development of Euk-PLoc.
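
The ensemble idea in this abstract (several KNN classifiers fused by voting, then scored with a jackknife test) can be illustrated with a minimal sketch. It is not the Euk-PLoc implementation. The abstract does not say how the individual classifiers differ, so varying k is an assumption made here, and toy 2-D vectors stand in for the gene ontology and amphiphilic pseudo amino acid composition features. All function names and data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Label `query` by majority vote among its k nearest training samples."""
    # train is a list of (feature_vector, label) pairs
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def ensemble_predict(train, query, ks=(1, 3, 5)):
    """Fuse several KNN classifiers (one per k, an assumption) by majority vote."""
    votes = Counter(knn_predict(train, query, k) for k in ks)
    return votes.most_common(1)[0][0]


def jackknife_accuracy(samples, ks=(1, 3, 5)):
    """Leave-one-out test: each sample is predicted from all the others."""
    hits = 0
    for i, (x, y) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += ensemble_predict(rest, x, ks) == y
    return hits / len(samples)


# Hypothetical toy data: two well-separated clusters of 2-D feature vectors.
toy = [
    ((0.0, 0.1), "nucleus"), ((0.1, 0.0), "nucleus"),
    ((0.2, 0.1), "nucleus"), ((0.1, 0.2), "nucleus"),
    ((1.0, 0.9), "cytoplasm"), ((0.9, 1.0), "cytoplasm"),
    ((1.1, 1.0), "cytoplasm"), ((1.0, 1.1), "cytoplasm"),
]

print(ensemble_predict(toy, (0.05, 0.05)))
print(jackknife_accuracy(toy))
```

On real data the jackknife loop is expensive, since every sample triggers a full prediction against all the others. That cost is one reason the abstract also reports a 5-fold test.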

3.
Kazi JU, Kabir NN, Soh JW. Gene. 2008;410(1):147-153
Eukaryotic protein kinases, containing a conserved catalytic domain, represent one of the largest superfamilies of eukaryotic proteins and play distinct roles in cell signaling and diseases. The near completion of the rat genome sequencing project enables the evaluation of a near-complete set of rat protein kinases. Publicly accessible genetic sequence databases were searched for rat protein kinases, and 515 eukaryotic protein kinases, 40 atypical protein kinases and 45 kinase pseudogenes were identified. The rat has 509 putative protein kinases orthologous to human kinases. Unlike microtubule affinity-regulating kinases, the rat has a few more kinases, in addition to the orthologous pairs of mouse kinases. The comparison of 11 different eukaryotic species revealed the evolutionary conservation of this diverse family of proteins. The evolutionary rate studies of human disease and non-disease associated kinases suggested that relatively uniform selective pressures have been applied to these kinase classes. This bioinformatic study of the rat protein kinases provides a suitable framework for further characterization of the functional and structural properties of these protein kinases.

4.
5.
Computer programs for eukaryotic gene prediction   (cited 3 times in total: 0 self-citations, 3 citations by others)
Seven popular programs for gene prediction in eukaryotic organisms are described and evaluated on the basis of availability for in-house and on-line use and prediction accuracy. This report outlines generally applicable approaches to computational gene prediction and known limitations in this field.

6.
The human genome sequence is the book of our life. Buried in this large volume are our genes, which are scattered as small DNA fragments throughout the genome and comprise a small percentage of the total text. Finding these indistinct 'needles' in a vast genomic 'haystack' can be extremely challenging. In response to this challenge, computational prediction approaches have proliferated in recent years that predict the location and structure of genes. Here, I discuss these approaches and explain why they have become essential for the analyses of newly sequenced genomes.

7.
The biopharmaceuticals market is currently outperforming the pharmaceuticals market and is now valued at US$48 billion with an average annual growth of 19%. Behind this success is a 100-fold increase in productivities of eukaryotic expression systems. However, the productivity per cell has remained unchanged for more than 10 years. The engineering of the ER-resident protein folding machinery is discussed together with an overview of signal transduction pathways activated by heterologous protein overexpression to increase cell-specific productivities.

8.
Cryptococcus neoformans, which causes fatal infection in immunocompromised individuals, has an elaborate polysaccharide capsule surrounding its cell wall. The cryptococcal capsule is the major virulence factor of this fungal organism, but its biosynthetic pathways are virtually unknown. Extracellular polysaccharides of eukaryotes may be made at the cell membrane or within the secretory pathway. To test these possibilities for cryptococcal capsule synthesis, we generated a secretion mutant in C. neoformans by mutating a Sec4/Rab8 GTPase homolog. At a restrictive temperature, the mutant displayed reduced growth and protein secretion, and accumulated approximately 100-nm vesicles in a polarized manner. These vesicles were not endocytic, as shown by their continued accumulation in the absence of polymerized actin, and could be labeled with anti-capsular antibodies as visualized by immunoelectron microscopy. These results indicate that glucuronoxylomannan, the major cryptococcal capsule polysaccharide, is trafficked within post-Golgi secretory vesicles. This strongly supports the conclusion that the cryptococcal capsule is synthesized intracellularly and secreted via exocytosis.

9.
Perspectives on the classification of eukaryotic diversity have changed rapidly in recent years, as the four eukaryotic groups within the five-kingdom classification--plants, animals, fungi, and protists--have been transformed through numerous permutations into the current system of six "supergroups." The intent of the supergroup classification system is to unite microbial and macroscopic eukaryotes based on phylogenetic inference. This supergroup approach is increasing in popularity in the literature and is appearing in introductory biology textbooks. We evaluate the stability and support for the current six-supergroup classification of eukaryotes based on molecular genealogies. We assess three aspects of each supergroup: (1) the stability of its taxonomy, (2) the support for monophyly (single evolutionary origin) in molecular analyses targeting a supergroup, and (3) the support for monophyly when a supergroup is included as an out-group in phylogenetic studies targeting other taxa. Our analysis demonstrates that supergroup taxonomies are unstable and that support for groups varies tremendously, indicating that the current classification scheme of eukaryotes is likely premature. We highlight several trends contributing to the instability and discuss the requirements for establishing robust clades within the eukaryotic tree of life.

10.
One of the critical challenges in predicting protein subcellular localization is how to deal with the case of multiple location sites. Unfortunately, so far, no efforts have been made in this regard except for one focused on the proteins in budding yeast only. For most existing predictors, the multiple-site proteins are either excluded from consideration or assumed not to exist. Actually, proteins may simultaneously exist at, or move between, two or more different subcellular locations. For instance, according to the Swiss-Prot database (version 50.7, released 19-Sept-2006), among the 33,925 eukaryotic protein entries that have experimentally observed subcellular location annotations, 2715 have multiple location sites, meaning about 8% bear the multiplex feature. Proteins with multiple locations or a dynamic feature of this kind are particularly interesting because they may have some very special biological functions intriguing to investigators in both basic research and drug discovery. Meanwhile, according to the same Swiss-Prot database, the number of total eukaryotic protein entries (except those annotated with "fragment" or those with fewer than 50 amino acids) is 90,909, meaning a gap of (90,909-33,925) = 56,984 entries for which no knowledge is available about their subcellular locations. Although one can use the computational approach to predict the desired information for the blank, so far, all the existing methods for predicting eukaryotic protein subcellular localization are limited to the case of a single location site. To overcome such a barrier, a new ensemble classifier, named Euk-mPLoc, was developed that can be used to deal with the case of multiple location sites as well. Euk-mPLoc is freely accessible to the public as a Web server at http://202.120.37.186/bioinf/euk-multi.
Meanwhile, to support the people working in the relevant areas, Euk-mPLoc has been used to identify all eukaryotic protein entries in the Swiss-Prot database that do not have subcellular location annotations or are annotated as being uncertain. The large-scale results thus obtained have been deposited at the same Web site via a downloadable file prepared with Microsoft Excel and named "Tab_Euk-mPLoc.xls". Furthermore, to include new entries of eukaryotic proteins and reflect the continuous development of Euk-mPLoc in both the coverage scope and prediction accuracy, we will update the downloadable file as well as the predictor in a timely manner, and keep users informed by publishing a short note in the Journal and making an announcement on the Web page.
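
The two figures quoted in this abstract (about 8% multi-site proteins, and a gap of 56,984 unannotated entries) follow directly from the Swiss-Prot counts it cites. A short check, using only numbers from the abstract (the variable names are ours):

```python
# Counts quoted in the abstract (Swiss-Prot version 50.7).
annotated = 33_925   # eukaryotic entries with experimentally observed locations
multi_site = 2_715   # of those, entries with multiple location sites
total = 90_909       # all eukaryotic entries (no fragments, at least 50 residues)

fraction_multi = multi_site / annotated   # share of annotated entries with several sites
unannotated = total - annotated           # entries with no location knowledge

print(f"{fraction_multi:.1%} multi-site, {unannotated} entries unannotated")
```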

11.
We develop a method to predict and validate gene models using PacBio single-molecule, real-time (SMRT) cDNA reads. Ninety-eight percent of full-insert SMRT reads span complete open reading frames. Gene model validation using SMRT reads is developed as an automated process. Optimized training and prediction settings and mRNA-seq noise reduction of assisting Illumina reads result in increased gene prediction sensitivity and precision. Additionally, we present an improved gene set for sugar beet (Beta vulgaris) and the first genome-wide gene set for spinach (Spinacia oleracea). The workflow and guidelines are a valuable resource to obtain comprehensive gene sets for newly sequenced genomes of non-model eukaryotes.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-015-0729-7) contains supplementary material, which is available to authorized users.

12.
Models for prediction and recognition of eukaryotic promoters   (cited 13 times in total: 0 self-citations, 13 citations by others)

13.
Li L, Zhang Y, Zou L, Li C, Yu B, Zheng X, Zhou Y. PLoS One. 2012;7(1):e31057
With the rapid increase of protein sequences in the post-genomic age, it is challenging to develop accurate and automated methods for reliably and quickly predicting their subcellular localizations. Until now, many approaches have been tried, but most of them used only a single algorithm. In this paper, we proposed an ensemble classifier of KNN (k-nearest neighbor) and SVM (support vector machine) algorithms to predict the subcellular localization of eukaryotic proteins based on a voting system. The overall prediction accuracies by the one-versus-one strategy are 78.17%, 89.94% and 75.55% for three benchmark datasets of eukaryotic proteins. The improved prediction accuracies reveal that GO annotations and hydrophobicity of amino acids help to predict subcellular locations of eukaryotic proteins.
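
The one-versus-one strategy named in this abstract trains one binary decision per pair of classes and lets the pairwise winners vote. The sketch below shows only that voting scheme. The binary decider is a nearest-centroid rule swapped in for brevity, not the KNN or SVM classifiers the authors used, and the data and function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def one_vs_one_predict(train, query):
    """Vote over all class pairs; each pair is decided by the nearer centroid.

    The nearest-centroid rule is a stand-in (an assumption), not KNN or SVM.
    """
    cents = {label: centroid(pts) for label, pts in train.items()}
    votes = Counter()
    for a, b in combinations(sorted(train), 2):
        winner = a if math.dist(cents[a], query) <= math.dist(cents[b], query) else b
        votes[winner] += 1
    return votes.most_common(1)[0][0]


# Hypothetical toy training set: class label -> list of 2-D feature vectors.
train = {
    "nucleus": [(0.0, 0.0), (0.2, 0.0)],
    "cytoplasm": [(1.0, 1.0), (1.2, 1.0)],
    "mitochondria": [(0.0, 2.0), (0.0, 2.2)],
}

print(one_vs_one_predict(train, (0.1, 1.9)))
```

With n classes this takes n(n-1)/2 pairwise decisions, which stays cheap for the handful of subcellular locations typical of such benchmarks.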

14.
Gao X, Jin C, Ren J, Yao X, Xue Y. Genomics. 2008;92(6):457-463
Protein phosphorylation is one of the most essential post-translational modifications (PTMs), and orchestrates a variety of cellular functions and processes. Besides experimental studies, numerous computational predictors implemented in various algorithms have been developed for phosphorylation site prediction. However, large-scale predictions of kinase-specific phosphorylation sites have not been successfully pursued and remain a great challenge. In this work, we proposed a “kiss farewell” model and conducted a high-throughput prediction of cAMP-dependent kinase (PKA) phosphorylation sites. Since a protein kinase (PK) should at least “kiss” its substrates and then run away, we proposed a PKA-binding protein to be a potential PKA substrate if at least one PKA site was predicted. To improve the prediction specificity, we reduced the false positive rate (FPR) to less than 1% when the cut-off value was set as 4. We successfully predicted 1387, 630, 568 and 912 potential PKA sites from 410, 217, 173 and 260 PKA-interacting proteins in Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster and Homo sapiens, respectively. Most of these potential phosphorylation sites remain to be experimentally verified. In addition, we detected two sites in one of the PKA regulatory subunits to be conserved in eukaryotes as potentially ancient regulatory signals. Our prediction results provide an excellent resource for delineating PKA-mediated signaling pathways and their system integration underlying cellular dynamics and plasticity.
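
The decision rule in this abstract has two parts: a score cut-off chosen so that the false positive rate stays below 1%, and the "kiss farewell" criterion that a PKA-binding protein is a potential substrate if it has at least one predicted site. A minimal sketch of both follows. The abstract does not describe the scoring function, so the scores are invented, and treating higher scores as more site-like with "score >= cut-off" as the test is an assumption. All names are hypothetical.

```python
def false_positive_rate(negative_scores, cutoff):
    """Fraction of known non-sites whose score reaches the cut-off."""
    return sum(s >= cutoff for s in negative_scores) / len(negative_scores)


def predicted_sites(site_scores, cutoff):
    """Positions in one protein whose score reaches the cut-off."""
    return [pos for pos, score in site_scores.items() if score >= cutoff]


def is_potential_substrate(site_scores, cutoff):
    """'Kiss farewell' criterion: at least one predicted site in a binding protein."""
    return bool(predicted_sites(site_scores, cutoff))


# Hypothetical scores for 200 known non-sites; only one reaches a cut-off of 4.
negatives = [1.0] * 150 + [3.0] * 49 + [4.5]
# Hypothetical candidate residues of one PKA-interacting protein.
protein = {"S15": 5.2, "T88": 2.1, "S140": 4.0}

print(false_positive_rate(negatives, 4))
print(predicted_sites(protein, 4), is_potential_substrate(protein, 4))
```

Raising the cut-off lowers the false positive rate but also rejects true sites, so in practice the cut-off is tuned on a held-out negative set rather than fixed in advance.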

15.
Salmonella-induced enteritis is associated with the induction of an acute intestinal inflammatory response and net fluid secretion into the lumen of infected mucosa. Proteins secreted by the Inv/Spa type III secretion system of Salmonella play a key role in the induction of these responses. We have demonstrated recently that the Inv/Spa-secreted SopB and SopD effector proteins are translocated into eukaryotic cells via a Sip-dependent pathway and act in concert to mediate inflammation and fluid secretion in infected ileal mucosa. Mutations of both sopB and sopD significantly reduced, but did not abrogate, the enteropathogenic phenotype. This indicated that other virulence factors are involved in the induction of enteritis. In this work, we characterize SopA, a secreted protein belonging to the family of Sop effectors of Salmonella dublin. We demonstrate that SopA is translocated into eukaryotic cells and provide evidence suggesting that SopA has a role in the induction of enteritis.

16.
17.
Gram-positive bacteria have been widely investigated for their huge capability to secrete proteins, such as those involved in gene expression, bacterial surface display and bacterial pathogenesis. The N-terminal signal peptide of a secretory protein is responsible for the translocation of the polypeptide through the cytoplasmic membrane. Recently, signal peptide prediction has become a major task in bioinformatics, and many programs with different algorithms have been developed to predict signal peptides. In this paper, five prediction programs (SignalP 3.0, PrediSi, Phobius, SOSUIsignal and SIG-Pred) were selected to evaluate their prediction accuracy for signal peptides and cleavage sites using 509 unbiased and experimentally verified Gram-positive protein sequences. The results showed that SignalP was the most accurate program in signal peptide (96% accuracy) and cleavage site (83%) prediction. Prediction performance could be further improved by combining multiple methods into a consensus prediction, which would increase the accuracy to 98% and decrease the false positives to zero. When the consensus method was used to predict Bacillus extracellular proteins identified by proteomics, more new signal peptides were successfully identified. It could be concluded that the consensus method would be useful to make the prediction of signal peptides more reliable.
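
A consensus prediction of the kind described here can be sketched as a simple agreement threshold over the yes/no calls of the individual programs. The abstract does not state the exact consensus rule, so the "at least `min_agree` programs" criterion is an assumption. The boolean calls below are invented inputs, not real output of the named tools, and no tool is invoked.

```python
def consensus_call(calls, min_agree):
    """Return True if at least `min_agree` programs predict a signal peptide.

    `calls` maps a program name to its yes/no prediction for one sequence.
    """
    return sum(calls.values()) >= min_agree


# Hypothetical per-program calls for two sequences.
seq_a = {"SignalP 3.0": True, "PrediSi": True, "Phobius": True,
         "SOSUIsignal": False, "SIG-Pred": True}
seq_b = {"SignalP 3.0": False, "PrediSi": True, "Phobius": False,
         "SOSUIsignal": False, "SIG-Pred": False}

print(consensus_call(seq_a, min_agree=3))
print(consensus_call(seq_b, min_agree=3))
```

A stricter threshold trades sensitivity for fewer false positives, which is consistent with the drop in false positives the abstract reports for the consensus approach.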

18.
Enteritis induced by non-typhoid pathogenic Salmonella is characterized by fluid secretion and inflammatory responses in the infected ileum. The inflammatory response provoked by Salmonella initially consists largely of a neutrophil (PMN) migration into the intestinal mucosa and the gut lumen. The interactions between Salmonella and intestinal epithelial cells are known to play an essential role in inducing the inflammatory response. Upon interaction with epithelial cells salmonellae are able to elicit transepithelial signalling to neutrophils. This signalling is recognized as a key virulence feature underlying Salmonella-induced enteritis. However, the nature and mechanism of such signalling has not been clarified to date. Here, we characterize SopB, a novel secreted effector protein of Salmonella dublin, and present data implying that SopB is translocated into eukaryotic cells via a sip-dependent pathway to promote fluid secretion and inflammatory responses in the infected ileum.

19.
Characterization of the extracellular protein interactome has lagged far behind that of intracellular proteins, where mass spectrometry and yeast two-hybrid technologies have excelled. Improved methods for identifying receptor-ligand and extracellular matrix protein interactions will greatly accelerate biological discovery in cell signaling and cellular communication. These technologies must be able to identify low-affinity binding events that are often observed between membrane-bound coreceptor molecules during cell-cell or cell-extracellular matrix contact. Here we demonstrate that functional protein microarrays are particularly well-suited for high-throughput screening of extracellular protein interactions. To evaluate the performance of the platform, we screened a set of 89 immunoglobulin (Ig)-type receptors against a highly diverse extracellular protein microarray with 686 genes represented. To enhance detection of low-affinity interactions, we developed a rapid method to assemble bait Fc fusion proteins into multivalent complexes using protein A microbeads. Based on these screens, we developed a statistical methodology for hit calling and identification of nonspecific interactions on protein microarrays. We found that the Ig receptor interactions identified using our methodology are highly specific and display minimal off-target binding, resulting in a 70% true-positive to false-positive hit ratio. We anticipate that these methods will be useful for a wide variety of functional protein microarray users.

20.