1.

The molecular biology and nomenclature of the activating transcription factor/cAMP responsive element binding family of transcription factors: activating transcription factor proteins and homeostasis (total citations: 36; self-citations: 0; citations by others: 36)

Hai T Hartman MG 《Gene》2001,273(1):1-11

2.

Mastery of naming skills by a chimpanzee

Timothy V. Gill Duane M. Rumbaugh 《Journal of human evolution》1974,3(6):483-492

The acquisition of naming skills by a chimpanzee in a computer-controlled language-training situation is described. Initial training consisted of presenting one of two exemplars coupled with the question, What name of this? Upon mastery of that phase, two transfer-of-naming tasks were given, results of which demonstrated that the subject had come to learn that things can be referred to by name.

3.

Efficient ASK‐assisted system for expression and purification of plant F‐box proteins

Haiou Li Ruifeng Yao Sui Ma Shuai Hu Suhua Li Yupei Wang Chun Yan Daoxin Xie Jianbin Yan 《The Plant journal : for cell and molecular biology》2017,92(4):736-743

Ubiquitin‐mediated protein degradation plays an essential role in plant growth and development as well as responses to environmental and endogenous signals. The F‐box protein is one of the key components of the SCF (SKP1‐CUL1‐F‐box protein) E3 ubiquitin ligase complex, which recruits specific substrate proteins for subsequent ubiquitination and 26S proteasome‐mediated degradation to regulate developmental processes and signaling networks. However, it is not easy to obtain purified F‐box proteins with high activity due to their unstable protein structures. Here, we found that Arabidopsis SKP‐like proteins (ASKs) can significantly improve soluble expression of F‐box proteins and maintain their bioactivity. We established an efficient ASK‐assisted method to express and purify plant F‐box proteins. The method meets a broad range of criteria required for the biochemical analysis or protein crystallization of plant F‐box proteins.

4.

Altered low-γ sampling in auditory cortex accounts for the three main facets of dyslexia

Lehongre K Ramus F Villiermet N Schwartz D Giraud AL 《Neuron》2011,72(6):1080-1090

It has recently been conjectured that dyslexia arises from abnormal auditory sampling. What sampling rate is altered and how it affects reading remains unclear. We hypothesized that by impairing phonemic parsing abnormal low-gamma sampling could yield phonemic representations of unusual format and disrupt phonological processing and verbal memory. Using magnetoencephalography and behavioral tests, we show in dyslexic subjects a reduced left-hemisphere bias for phonemic processing, reflected in less entrainment to ≈30 Hz acoustic modulations in left auditory cortex. This deficit correlates with measures of phonological processing and rapid naming. We further observed enhanced cortical entrainment at rates beyond 40 Hz in dyslexics and show that this particularity is associated with a verbal memory deficit. These data suggest that a single auditory anomaly, i.e., phonemic oversampling in left auditory cortex, accounts for three main facets of the linguistic deficit in dyslexia.

5.

On Estimating the Relationship between Longitudinal Measurements and Time‐to‐Event Data Using a Simple Two‐Stage Procedure

Paul S. Albert Joanna H. Shih 《Biometrics》2010,66(3):983-987

Summary: Ye, Lin, and Taylor (2008, Biometrics 64, 1238–1246) proposed a joint model for longitudinal measurements and time‐to‐event data in which the longitudinal measurements are modeled with a semiparametric mixed model to allow for the complex patterns in longitudinal biomarker data. They proposed a two‐stage regression calibration approach that is simpler to implement than a joint modeling approach. In the first stage of their approach, the mixed model is fit without regard to the time‐to‐event data. In the second stage, the posterior expectations of an individual's random effects from the mixed model are included as covariates in a Cox model. Although Ye et al. (2008) acknowledged that their regression calibration approach may cause a bias due to the problem of informative dropout and measurement error, they argued that the bias is small relative to alternative methods. In this article, we show that this bias may be substantial. We show how to alleviate much of this bias with an alternative regression calibration approach that can be applied for both discrete and continuous time‐to‐event data. Through simulations, the proposed approach is shown to have substantially less bias than the regression calibration approach proposed by Ye et al. (2008). In agreement with the methodology proposed by Ye et al. (2008), an advantage of our proposed approach over joint modeling is that it can be implemented with standard statistical software and does not require complex estimation techniques.

6.

Understanding protein non-folding

Vladimir N. Uversky A. Keith Dunker 《Biochimica et Biophysica Acta - Proteins and Proteomics》2010,1804(6):1231-1264

This review describes the family of intrinsically disordered proteins, members of which fail to form rigid 3-D structures under physiological conditions, either along their entire lengths or only in localized regions. Instead, these intriguing proteins/regions exist as dynamic ensembles within which atom positions and backbone Ramachandran angles exhibit extreme temporal fluctuations without specific equilibrium values. Many of these intrinsically disordered proteins are known to carry out important biological functions which, in fact, depend on the absence of a specific 3-D structure. The existence of such proteins does not fit the prevailing structure–function paradigm, which states that a unique 3-D structure is a prerequisite to function. Thus, the protein structure–function paradigm has to be expanded to include intrinsically disordered proteins and alternative relationships among protein sequence, structure, and function. This shift in the paradigm represents a major breakthrough for biochemistry, biophysics and molecular biology, as it opens new levels of understanding with regard to the complex life of proteins. This review will try to answer the following questions: how were intrinsically disordered proteins discovered? Why don't these proteins fold? What is so special about intrinsic disorder? What are the functional advantages of disordered proteins/regions? What is the functional repertoire of these proteins? What are the relationships between intrinsically disordered proteins and human diseases?

7.

Reduced alphabet for protein folding prediction

Jitao T. Huang Titi Wang Shanran R. Huang Xin Li 《Proteins》2015,83(4):631-639

What are the key building blocks that would have been needed to construct complex protein folds? This is an important issue for understanding protein folding mechanism and guiding de novo protein design. Twenty naturally occurring amino acids and eight secondary structures constitute a 28‐letter alphabet that determines folding kinetics and mechanism. Here we predict folding kinetic rates of proteins from many reduced alphabets. We find that a reduced alphabet of 10 letters achieves good correlation with folding rates, close to the one achieved by the full 28‐letter alphabet. Many other reduced alphabets are not significantly correlated to folding rates. The finding suggests that not all amino acids and secondary structures are equally important for protein folding. The foldable sequence of a protein could be designed using at least 10 folding units, which can either promote or inhibit protein folding. Reducing alphabet cardinality without losing key folding kinetic information opens the door to potentially faster machine learning and data mining applications in protein structure prediction, sequence alignment and protein design. Proteins 2015; 83:631–639. © 2015 Wiley Periodicals, Inc.

8.

Analysis of sequence-reactivity space for protein-protein interactions

Li J Yi Z Laskowski MC Laskowski M Bailey-Kellogg C 《Proteins》2005,58(3):661-671

Sequence–reactivity space is defined by the relationships between amino acid type choices at some residue positions in a protein and the reactivities of the resulting variants. We are studying Kazal superfamily serine proteinase inhibitors, under substitution of any combination of residue types at 10 binding‐region positions. Reactivities are defined by the standard free energy of association for an inhibitor against an enzyme, and we are interested in both the strength (the free energy value) and specificity (relative free energy values for one inhibitor against different enzymes). Characterizing the structure of such a space poses several interesting questions: (1) How many sequences achieve particular strength and specificity characteristics? (2) What is the best such sequence? (3) What are some nearly‐as‐good alternatives? (4) What are their common residue type characteristics (e.g., conservation and correlation)? Although these problems are all highly combinatorial in nature, this article develops an efficient, integrated mechanism to address them under a data‐driven model that predicts reactivity for given sequences. We employ sampling and a novel deterministic distribution propagation algorithm, in order to determine both the reactivity distribution and sequence composition statistics; integer programming and a novel branch‐and‐bound search algorithm, in order to optimize sequences and enumerate near‐optimal sequences; and correlation‐based sequence decomposition, in order to identify sequence motifs. We demonstrate the value of our mechanism in analyzing the Kazal superfamily sequence–reactivity space, providing insights into the underlying biochemistry and suggesting hypotheses for further experimental consideration. In general, our mechanism offers a valuable tool for investigating the available degrees of freedom in protein design within a combined computational–experimental framework. Proteins 2005. © 2004 Wiley‐Liss, Inc.

9.

Benchmarking protein–protein interface predictions: Why you should care about protein size

Juliette Martin 《Proteins》2014,82(7):1444-1452

A number of predictive methods have been developed to predict protein–protein binding sites. Each new method is traditionally benchmarked using sets of protein structures of various sizes, and global statistics are used to assess the quality of the prediction. Little attention has been paid to the potential bias due to protein size on these statistics. Indeed, small proteins involve proportionally more residues at interfaces than large ones. If a predictive method is biased toward small proteins, this can lead to an over‐estimation of its performance. Here, we investigate the bias due to the size effect when benchmarking protein‐protein interface prediction on the widely used docking benchmark 4.0. First, we simulate random scores that favor small proteins over large ones. Instead of the 0.5 AUC (Area Under the Curve) value expected by chance, these biased scores result in an AUC equal to 0.6 using hypergeometric distributions, and up to 0.65 using constant scores. We then use real prediction results to illustrate how to detect the size bias by shuffling, and subsequently correct it using a simple conversion of the scores into normalized ranks. In addition, we investigate the scores produced by eight published methods and show that they are all affected by the size effect, which can change their relative ranking. The size effect also has an impact on linear combination scores by modifying the relative contributions of each method. In the future, systematic corrections should be applied when benchmarking predictive methods using data sets with mixed protein sizes. Proteins 2014; 82:1444–1452. © 2014 Wiley Periodicals, Inc.
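The abstract does not spell out the rank conversion, so the following is only a minimal sketch of one plausible reading: within each protein, residue scores are replaced by their rank divided by the number of residues, which puts proteins of different sizes on the same (0, 1] scale. The function name `normalized_ranks` and the toy scores are hypothetical and not taken from the paper.

```python
from typing import List


def normalized_ranks(scores: List[float]) -> List[float]:
    """Replace one protein's residue scores by ranks scaled into (0, 1].

    Assumption: rank / number_of_residues, with tied scores sharing
    the average of the ranks they span. The paper may define it differently.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # indices, lowest score first
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the block while the next score ties with the current one.
        while end + 1 < n and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # 1-based average rank of the tied block
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank / n
        pos = end + 1
    return ranks


# Toy example: a small and a large protein scored on different raw scales.
small_protein = [0.2, 0.9, 0.5]
large_protein = [12.0, 3.0, 7.0, 7.0, 40.0, 1.0]

small_ranks = normalized_ranks(small_protein)
large_ranks = normalized_ranks(large_protein)
```

After the conversion, the top-scoring residue of every protein has the value 1.0 regardless of protein size or raw score scale, so pooling residues from many proteins no longer rewards a method simply for scoring small proteins higher.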

10.

LISTA, a comprehensive compilation of nucleotide sequences encoding proteins from the yeast Saccharomyces (total citations: 2; self-citations: 2; citations by others: 0)

P Linder R Dölz M O Mossé J Lazowska P P Slonimski 《Nucleic acids research》1993,21(13):3001-3002

The amount of nucleotide sequence data is increasing exponentially. We therefore made an effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. Each sequence has been attributed a single genetic name and in the case of allelic duplicated sequences, synonyms are given, if necessary. For the nomenclature we have introduced a standard principle for naming gene sequences based on priority rules. We have also applied a simple method to distinguish duplicated sequences of one and the same gene from non-allelic sequences of duplicated genes. By using these principles we have sorted out a lot of confusion in the literature and databanks. Along with the genetic name, the mnemonic from the EMBL databank, the codon bias, reference of the publication of the sequence and the EMBL accession numbers are included in each entry.

11.

Predicting enzyme family class in a hybridization space

Chou KC Cai YD 《Protein science : a publication of the Protein Society》2004,13(11):2857-2863

Given the sequence of a protein, how can we predict whether it is an enzyme or a non‐enzyme? If it is, to which enzyme family class does it belong? Because these questions are closely related to the biological function of a protein and its acting object, their importance is self‐evident. Particularly with the explosion of protein sequences entering into data banks and the relatively much slower progress in using biochemical experiments to determine their functions, it is highly desirable to develop an automated method that can be used to give fast answers to these questions. By hybridizing the gene ontology and pseudo‐amino‐acid composition, we have introduced a new method, called the GO‐PseAA predictor, that operates in a hybridization space. To avoid redundancy and bias, demonstrations were performed on a data set in which none of the proteins in an individual class has ≥40% sequence identity to any other. The overall success rate thus obtained by the jackknife cross‐validation test in identifying enzyme and non‐enzyme was 93%, and that in identifying the enzyme family was 94% for the following six main Enzyme Commission (EC) classes: (1) oxidoreductase, (2) transferase, (3) hydrolase, (4) lyase, (5) isomerase, and (6) ligase. The corresponding rates by the independent data set test were 98% and 97%, respectively.
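The jackknife test mentioned in this abstract is commonly understood as leave-one-out cross-validation: each protein is held out in turn, the predictor is built from the rest, and the held-out protein is classified. The sketch below illustrates only that evaluation loop. The nearest-neighbour classifier, the two-dimensional feature vectors, and all names are hypothetical stand-ins, since the abstract does not describe the GO‐PseAA features or the classifier in enough detail to reproduce them.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, class label)


def nearest_neighbor_label(train: List[Sample], x: Sequence[float]) -> str:
    """Return the label of the training sample closest to x (squared Euclidean)."""
    best = min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    return best[1]


def jackknife_success_rate(data: List[Sample]) -> float:
    """Leave each sample out once, predict it from the others, report accuracy."""
    correct = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]  # every sample except the i-th
        if nearest_neighbor_label(train, x) == label:
            correct += 1
    return correct / len(data)


# Hypothetical toy data: two well-separated classes.
toy = [
    ((0.0, 0.1), "enzyme"),
    ((0.1, 0.0), "enzyme"),
    ((1.0, 0.9), "non-enzyme"),
    ((0.9, 1.0), "non-enzyme"),
]
rate = jackknife_success_rate(toy)  # 1.0 on this toy set
```

Because every sample is tested exactly once and never appears in its own training set, the resulting rate does not depend on a random split, which is one reason this test is often reported alongside an independent data set test.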

12.

Exploring dynamics of protein structure determination and homology-based prediction to estimate the number of superfamilies and folds

Ruslan I Sadreyev Nick V Grishin 《BMC structural biology》2006,6(1):6-14

Background

As tertiary structure is currently available only for a fraction of known protein families, it is important to assess what parts of sequence space have been structurally characterized. We consider protein domains whose structure can be predicted by sequence similarity to proteins with solved structure and address the following questions. Do these domains represent an unbiased random sample of all sequence families? Do targets solved by structural genomic initiatives (SGI) provide such a sample? What are approximate total numbers of structure-based superfamilies and folds among soluble globular domains?

Results

To make these assessments, we combine two approaches: (i) sequence analysis and homology-based structure prediction for proteins from complete genomes; and (ii) monitoring dynamics of the assigned structure set in time, with the accumulation of experimentally solved structures. In the Clusters of Orthologous Groups (COG) database, we map the growing population of structurally characterized domain families onto the network of sequence-based connections between domains. This mapping reveals a systematic bias suggesting that target families for structure determination tend to be located in highly populated areas of sequence space. In contrast, the subset of domains whose structure is initially inferred by SGI is similar to a random sample from the whole population. To accommodate for the observed bias, we propose a new non-parametric approach to the estimation of the total numbers of structural superfamilies and folds, which does not rely on a specific model of the sampling process. Based on dynamics of robust distribution-based parameters in the growing set of structure predictions, we estimate the total numbers of superfamilies and folds among soluble globular proteins in the COG database.

Conclusion

The set of currently solved protein structures allows for structure prediction in approximately a third of sequence-based domain families. The choice of targets for structure determination is biased towards domains with many sequence-based homologs. The growing SGI output in the future should further contribute to the reduction of this bias. The total number of structural superfamilies and folds in the COG database are estimated as ~4000 and ~1700. These numbers are respectively four and three times higher than the numbers of superfamilies and folds that can currently be assigned to COG proteins.

13.

English names for a world list of mammals, exemplified by species of Indochina

J. W. Duckworth Ronald H. Pine 《Mammal Review》2003,33(2):151-173

1. The lack of a globally accepted list of English‐language names for mammal species leads to various problems stemming from the reduced ability to communicate unambiguously. This impacts directly on their conservation. We use the larger mammals of Indochina to exemplify the use of an explicit set of principles designed to provide each species with a unique and non‐misleading (or at least minimally so) English name. 2. For most species, a suitable name is already in use, sometimes generally so. For species for which multiple names are in use, standardization would consist of adopting the most suitable name. Only for a very few species are all extant names so unsuitable that a neologism should be coined. One species, Panthera pardus, presents potentially insoluble problems. 3. Name standardization among the world's birds has generated some controversy, but this has not led to abandonment of the process. Much can be learned by those developing a similar process for mammals, through studying the bird‐naming process. Progress can be advanced by detractors indicating whether they oppose standardization per se, the principles used or the names resulting from application of the principles. Also, proponents of standardization should always emphasize that the purpose of the process is to produce a list available for those who want to use it, not to produce a binding selection that must be used in all circumstances.

14.

Evaluation of floristic diversity in urban areas as a basis for habitat management

Audrey Muratet E. Porcher V. Devictor G. Arnal J. Moret S. Wright N. Machon 《Applied Vegetation Science》2008,11(4):451-460

Questions: How can floristic diversity be evaluated in conservation plans to identify sites of highest interest for biodiversity? What are the mechanisms influencing the distribution of species in human‐dominated environments? What are the best criteria to identify sites where active urban management is most likely to enhance floristic diversity?

Location: The Hauts‐de‐Seine district bordering Paris, France.

Methods: We described the floristic diversity in one of the most urbanized French districts through the inventory of ca. 1000 sites located in 23 habitats. We built a new index of floristic interest (IFI), integrating information on richness, indigeneity, typicality and rarity of species, to identify sites and habitats of highest interest for conservation. Finally, we explored the relationship between site IFI and land use patterns (LUP).

Results: We observed a total of 626 vascular plant species. Habitats with highest IFI were typically situated in seminatural environments or environments with moderate human impact. We also showed that neighbouring (urban) structures had a significant influence on the floristic interest of sites: for example, the presence of collective dwellings around a site had a strong negative impact on IFI.

Conclusions: Our approach can be used to optimize management in urban zones; we illustrate such possibilities by defining a ‘Site Potential Value’, which was then compared with the observed IFI, to identify areas (e.g. river banks) where better management could improve the district's biodiversity.

15.

The necessity of adjusting tests of protein category enrichment in discovery proteomics

Louie B Higdon R Kolker E 《Bioinformatics (Oxford, England)》2010,26(24):3007-3011

MOTIVATION: Enrichment tests are used in high-throughput experimentation to measure the association between gene or protein expression and membership in groups or pathways. Fisher's exact test is commonly used. We specifically examined the associations produced by the Fisher test between protein identification by mass spectrometry discovery proteomics, and their Gene Ontology (GO) term assignments in a large yeast dataset. We found that direct application of the Fisher test is misleading in proteomics due to the bias in mass spectrometry to preferentially identify proteins based on their biochemical properties. False inference about associations can be made if this bias is not corrected. Our method adjusts Fisher tests for these biases and produces associations more directly attributable to protein expression rather than experimental bias.

RESULTS: Using logistic regression, we modeled the association between protein identification and GO term assignments while adjusting for identification bias in mass spectrometry. The model accounts for five biochemical properties of peptides: (i) hydrophobicity, (ii) molecular weight, (iii) transfer energy, (iv) beta turn frequency and (v) isoelectric point. The model was fit on 181 060 peptides from 2678 proteins identified in 24 yeast proteomics datasets with a 1% false discovery rate. In analyzing the association between protein identification and their GO term assignments, we found that 25% (134 out of 544) of Fisher tests that showed significant association (q-value ≤0.05) were non-significant after adjustment using our model. Simulations generating yeast protein sets enriched for identification propensity show that unadjusted enrichment tests were biased while our approach worked well.
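For readers unfamiliar with the unadjusted test this abstract starts from, the sketch below computes a one-sided Fisher's exact (hypergeometric tail) p-value for enrichment of one category among identified proteins. It does not reproduce the authors' logistic-regression adjustment. The function name, argument names, and toy counts are hypothetical; only the 134 and 544 in the last lines come from the abstract.

```python
from math import comb


def fisher_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher's exact p-value for over-representation.

    N: proteins in the background set
    K: background proteins annotated with the category (e.g. a GO term)
    n: proteins identified in the experiment
    k: identified proteins annotated with the category
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    # Count the ways to draw i annotated and n - i unannotated proteins.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Toy counts (hypothetical): 10 background proteins, 5 annotated,
# 5 identified, and all 5 identified proteins carry the annotation.
p_toy = fisher_enrichment_p(k=5, n=5, K=5, N=10)  # 1/252

# Arithmetic check on the figure quoted in the abstract: 134 of 544 tests.
share_no_longer_significant = 134 / 544  # about 0.246, reported as 25%
```

The test treats every background protein as equally likely to be identified. The abstract's point is that this assumption fails in mass spectrometry, where identification depends on peptide properties, so a small p-value from this function alone would not separate biological enrichment from detection bias.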

16.

Cover Picture: Proteomics 13/2008

《Proteomics》2008,8(13)

In this issue of Proteomics you will find the following highlighted articles:

Mini pig kidney pie? A lot of antigens to chew on

Miniature pigs have been of interest as potential organ xeno‐transplant donors for a number of years but mostly without success. A galactosyl transferase gene knock‐out heart lasted for 6 months, but then succumbed to vascular rejection, indicating previously unrecognized antigens. Kim et al. applied current glycome analysis techniques to mini‐pig kidney surface antigens. They found an abundance of new ones, over 100 N‐glycans total, some sialylated, some neutral, some never reported before. The structures of many were determined and relatively quantitated. What was sauce for the kidney was not necessarily sauce for the heart. The information gathered and the questions raised will keep transplanters chewing for a long time. Y.‐G. Kim et al., Proteomics 2008, 8, 2596–2610.

PACE‐ing along with the DUKX that are really hamsters

Turning a marching band or moving it through a bottleneck requires different speeds at different points across the ranks. So does maximal production of biologically produced pharmaceuticals. Here Meleady et al. use 2‐D DIGE technology to look at the required proteins and the levels of expression required for optimal production of human bone morphogenetic protein 2 (rhBMP‐2) in Chinese hamster ovary‐derived cell lines (CHO DUKX and engineered derivatives). Maturation of BMP‐2 requires the action of PACE (paired basic amino acid cleaving enzyme) and PACE levels are improved by co‐transfection with a soluble PACE gene. With high levels of PACE activity, yields of BMP‐2 improved 4‐fold. PACEsol enhances production of a variety of other proteins as well. Comparison of DUKX‐BMP‐2 cells expressing vs. not expressing PACEsol showed ~180 differentially expressed proteins, 60 identified, that were assigned to a number of functional categories. P. Meleady et al., Proteomics 2008, 8, 2611–2624.

Ever deeper into cheesy secretome

Kluyveromyces lactis, a budding yeast related to Saccharomyces cerevisiae, is of genetic and industrial interest. Its name comes from its ability to convert sweet milk to sour by fermentation of lactose to lactic acid, not quite the same as glucose to ethanol, but useful nonetheless. Industrially, it has been engineered to produce a vegetarian rennet for cheese‐making as well as other secreted protein products. Swaim et al. compared the proteins in spent fermentation broth of the industrial expression strain K. lactis GG799 to the predicted secretion products based on genome sequence information and to predicted secretions from Candida albicans and S. cerevisiae. Using multidimensional LC‐MS/MS to analyze tryptic digests, they found 81 secreted products out of 178 predicted. Twenty‐six of those did not exhibit an N‐terminal secretion signal, suggesting that there are alternative pathways to the cell surface. An intracellular nano‐Swiss, perhaps? C. L. Swaim et al., Proteomics 2008, 8, 2714–2723.

17.

Parasite‐encoded Hsp40 proteins define novel mobile structures in the cytosol of the P. falciparum‐infected erythrocyte

Simone Külzer Melanie Rug Klaus Brinkmann Ping Cannon Alan Cowman Klaus Lingelbach Gregory L. Blatch Alexander G. Maier Jude M. Przyborski 《Cellular microbiology》2010,12(10):1398-1420

Plasmodium falciparum is predicted to transport over 300 proteins to the cytosol of its chosen host cell, the mature human erythrocyte, including 19 members of the Hsp40 family. Here, we have generated transfectant lines expressing GFP‐ or HA‐Strep‐tagged versions of these proteins, and used these to investigate both localization and other properties of these Hsp40 co‐chaperones. These fusion proteins labelled punctate structures within the infected erythrocyte, initially suggestive of a Maurer's clefts localization. Further experiments demonstrated that these structures were distinct from the Maurer's clefts in protein composition. Transmission electron microscopy verifies a non‐cleft localization for HA‐Strep‐tagged versions of these proteins. We were not able to label these structures with BODIPY–ceramide, suggesting a lower size and/or different lipid composition compared with the Maurer's clefts. Solubility studies revealed that the Hsp40–GFP fusion proteins appear to be tightly associated with membranes, but could be released from the bilayer under conditions affecting membrane cholesterol content or organization, suggesting interaction with a binding partner localized to cholesterol‐rich domains. These novel structures are highly mobile in the infected erythrocyte, but based on velocity calculations, can be distinguished from the ‘highly mobile vesicles’ previously described. Our study identifies a further extra‐parasitic structure in the P. falciparum‐infected erythrocyte, which we name ‘J‐dots’ (as their defining characteristic so far is the content of J‐proteins). We suggest that these J‐dots are involved in trafficking of parasite‐encoded proteins through the cytosol of the infected erythrocyte.

18.

In this issue: Proteomics 13/2008

《Proteomics》2008,8(13)

19.

Generation of a large gene/protein lexicon by morphological pattern analysis

Tanabe L Wilbur WJ 《Journal of bioinformatics and computational biology》2004,1(4):611-626

The identification of gene/protein names in natural language text is an important problem in named entity recognition. In previous work we have processed MEDLINE documents to obtain a collection of over two million names of which we estimate that perhaps two thirds are valid gene/protein names. Our problem has been how to purify this set to obtain a high quality subset of gene/protein names. Here we describe an approach which is based on the generation of certain classes of names that are characterized by common morphological features. Within each class inductive logic programming (ILP) is applied to learn the characteristics of those names that are gene/protein names. The criteria learned in this manner are then applied to our large set of names. We generated 193 classes of names and ILP led to criteria defining a select subset of 1,240,462 names. A simple false positive filter was applied to remove 8% of this set leaving 1,145,913 names. Examination of a random sample from this gene/protein name lexicon suggests it is composed of 82% (+/-3%) complete and accurate gene/protein names, 12% names related to genes/proteins (too generic, a valid name plus additional text, part of a valid name, etc.), and 6% names unrelated to genes/proteins. The lexicon is freely available at ftp.ncbi.nlm.nih.gov/pub/tanabe/Gene.Lexicon.

20.

Probing the role of interfacial waters in protein–DNA recognition using a hybrid implicit/explicit solvation model

Shen Li Philip Bradley 《Proteins》2013,81(8):1318-1329

When proteins bind to their DNA target sites, ordered water molecules are often present at the protein–DNA interface bridging protein and DNA through hydrogen bonds. What is the role of these ordered interfacial waters? Are they important determinants of the specificity of DNA sequence recognition, or do they act in binding in a primarily nonspecific manner, by improving packing of the interface, shielding unfavorable electrostatic interactions, and solvating unsatisfied polar groups that are inaccessible to bulk solvent? When modeling details of structure and binding preferences, can fully implicit solvent models be fruitfully applied to protein–DNA interfaces, or must the individualistic properties of these interfacial waters be accounted for? To address these questions, we have developed a hybrid implicit/explicit solvation model that specifically accounts for the locations and orientations of small numbers of DNA‐bound water molecules, while treating the majority of the solvent implicitly. Comparing the performance of this model with that of its fully implicit counterpart, we find that explicit treatment of interfacial waters results in a modest but significant improvement in protein side‐chain placement and DNA sequence recovery. Base‐by‐base comparison of the performance of the two models highlights DNA sequence positions whose recognition may be dependent on interfacial water. Our study offers large‐scale statistical evidence for the role of ordered water for protein–DNA recognition, together with detailed examination of several well‐characterized systems. In addition, our approach provides a template for modeling explicit water molecules at interfaces that should be extensible to other systems. Proteins 2013; 81:1318–1329. © 2013 Wiley Periodicals, Inc.