Similar literature
Found 20 similar articles (search time: 20 ms)
1.
Elastomeric proteins: biological roles, structures and mechanisms   Total citations: 1 (self-citations: 0, citations by others: 1)
Elastomeric proteins are able to withstand significant deformations without rupture before returning to their original state when the stress is removed. Although elastomeric proteins differ considerably in their amino acid sequence, they all have a complex domain structure and share two common properties. Namely, they contain elastomeric domains, comprised of repeated sequences, and additional domains that form intermolecular crosslinks. Furthermore, several proteins contain beta-turns as a structural motif within the elastomeric domains.

2.
Low-complexity sequences are extremely abundant in eukaryotic proteins for reasons that remain unclear. One hypothesis is that they contribute to the formation of novel coding sequences, facilitating the generation of novel protein functions. Here, we test this hypothesis by examining the content of low-complexity sequences in proteins of different age. We show that recently emerged proteins contain more low-complexity sequences than older proteins and that these sequences often form functional domains. These data are consistent with the idea that low-complexity sequences may play a key role in the emergence of novel genes.

3.
Pleckstrin homology (PH) domains are a family of compact protein modules defined by sequences of roughly 100 amino acids. These domains are common in vertebrate, Drosophila, C. elegans and yeast proteins, suggesting an early origin and fundamental importance to eukaryotic biology. Many enzymes which have important regulatory functions contain PH domains, and mutant forms of several such proteins are implicated in oncogenesis and developmental disorders. Numerous recent studies show that PH domains bind various proteins and inositolphosphates. Here I discuss PH domains in detail and conclude that they form a versatile family of membrane binding and protein localization modules.

4.
Structural genomics initiatives aim to elucidate representative 3D structures for the majority of protein families over the next decade, but many obstacles must be overcome. The correct design of constructs is extremely important since many proteins will be too large or contain unstructured regions and will not be amenable to crystallization. It is therefore essential to identify regions in protein sequences that are likely to be suitable for structural study. Scooby-Domain is a fast and simple method to identify globular domains in protein sequences. Domains are compact units of protein structure and their correct delineation will aid structural elucidation through a divide-and-conquer approach. Scooby-Domain predictions are based on the observed lengths and hydrophobicities of domains from proteins with known tertiary structure. The prediction method employs an A*-search to identify sequence regions that form a globular structure and those that are unstructured. On a test set of 173 proteins with consensus CATH and SCOP domain definitions, Scooby-Domain has a sensitivity of 50% and an accuracy of 29%, which is better than current state-of-the-art methods. The method does not rely on homology searches and, therefore, can identify previously unknown domains.

5.
Desmoplakins (DP) and bullous pemphigoid antigen (BPA) are major plaque components of the desmosome and hemidesmosome, respectively. These cell adhesion structures are both associated intimately with the intermediate filament (IF) network. Structural analyses of DP and BPA sequences have indicated that these molecules are likely to form extended dumbbell-shaped dimers with a central rod and globular end domains. Recent sequence data have indicated that the N-terminal domains of both DP and BPA (like their C-terminal domains) are highly related: the former contain regions of heptad repeats that are predicted to form several alpha-helical bundles. Comparisons of DP and BPA protein sequences with that of plectin (PL), a 466 kDa IF-associated protein, have also revealed large scale homology. Identities between their N-terminal domains are: DP:BPA = 35%, DP:PL = 32%, BPA:PL = 40%, suggesting that BPA is more closely related to PL than DP in this region. In the C-terminal domains, which contain a 38-residue repeating motif, however, DP and PL are closer relatives (identities: DP:BPA = 38%, BPA:PL = 40%, DP:PL = 49%). The central domains of all three proteins have extensive heptad repeat substructure, express the same periodic distribution of charged residues, and are predicted to form two-stranded alpha-helical coiled-coil ropes. These observations suggest that DP, BPA and PL belong to a new gene family encoding proteins involved in IF organization.

6.
Cullin-RING ubiquitin ligases promote the polyubiquitination and degradation of many important cellular proteins, which previous studies indicated can be targeted for degradation via interaction with BTB domain-containing subunits of this E3 ligase complex. PEST domains are known to promote the degradation of proteins that contain them. However, the molecular mechanism by which PEST sequences promote degradation of these proteins is not understood. Here we show that the PEST sequences of a short-lived protein called HSF2 interact with Cullin3, a subunit of a Cullin-RING E3 ubiquitin ligase, and that this interaction mediates the Cul3-dependent ubiquitination and degradation of HSF2. These results indicate how, at the molecular level, PEST sequences can promote the proteolysis of proteins that contain them. They also expand understanding of the mechanisms by which substrates can be recruited to Cullin-RING E3 ubiquitin ligases to include interactions between PEST sequences and Cul3.

7.
Rolling-circle replication of bacterial plasmids.   Total citations: 24 (self-citations: 1, citations by others: 23)
Many bacterial plasmids replicate by a rolling-circle (RC) mechanism. Their replication properties have many similarities to as well as significant differences from those of single-stranded DNA (ssDNA) coliphages, which also replicate by an RC mechanism. Studies on a large number of RC plasmids have revealed that they fall into several families based on homology in their initiator proteins and leading-strand origins. The leading-strand origins contain distinct sequences that are required for binding and nicking by the Rep proteins. Leading-strand origins also contain domains that are required for the initiation and termination of replication. RC plasmids generate ssDNA intermediates during replication, since their lagging-strand synthesis does not usually initiate until the leading strand has been almost fully synthesized. The leading- and lagging-strand origins are distinct, and the displaced leading-strand DNA is converted to the double-stranded form by using solely the host proteins. The Rep proteins encoded by RC plasmids contain specific domains that are involved in their origin binding and nicking activities. The replication and copy number of RC plasmids, in general, are regulated at the level of synthesis of their Rep proteins, which are usually rate limiting for replication. Some RC Rep proteins are known to be inactivated after supporting one round of replication. A number of in vitro replication systems have been developed for RC plasmids and have provided insight into the mechanism of plasmid RC replication.

8.
Eukaryotic cells contain assemblies of RNAs and proteins termed RNA granules. Many proteins within these bodies contain KH or RRM RNA-binding domains as well as low complexity (LC) sequences of unknown function. We discovered that exposure of cell or tissue lysates to a biotinylated isoxazole (b-isox) chemical precipitated hundreds of RNA-binding proteins with significant overlap to the constituents of RNA granules. The LC sequences within these proteins are both necessary and sufficient for b-isox-mediated aggregation, and these domains can undergo a concentration-dependent phase transition to a hydrogel-like state in the absence of the chemical. X-ray diffraction and EM studies revealed the hydrogels to be composed of uniformly polymerized amyloid-like fibers. Unlike pathogenic fibers, the LC sequence-based polymers described here are dynamic and accommodate heterotypic polymerization. These observations offer a framework for understanding the function of LC sequences as well as an organizing principle for cellular structures that are not membrane bound.

9.
Obtaining novel zinc finger protein genes by combining RACE with cDNA library screening   Total citations: 6 (self-citations: 1, citations by others: 5)
Du Zhanwen, Liu Liren, Zhang Junwu. Yi Chuan (Hereditas) 2002, 24(3): 329-331
Most functionally important proteins contain corresponding functional domains that consist of conserved amino acid sequences. This study provides a method for identifying novel genes that encode proteins containing functionally important domains with conserved sequences. First, gene ESTs were obtained by reverse PCR with degenerate primers designed from the conserved amino acid sequence of the zinc finger domain. Primers were then designed according to the sequences of the cDNA library vector and of the ESTs, and a modified rapid amplification of cDNA ends (RACE) procedure was run with the cDNA library DNA as the template. The amplified fragment, which contains the nonhomologous sequences of the cDNA, was inserted into the pGEM-T Easy vector, then recovered and used as a probe for screening the cDNA library. Several full-length cDNAs that encode zinc finger proteins and represent the original ESTs were successfully cloned from a human bone marrow cDNA library. This strategy should also be applicable to screening for genes that encode proteins containing other functional domains with conserved sequences.

10.
Origin and evolution of eukaryotic apoptosis: the bacterial connection   Total citations: 1 (self-citations: 0, citations by others: 1)
The availability of numerous complete genome sequences of prokaryotes and several eukaryotic genome sequences provides for new insights into the origin of unique functional systems of the eukaryotes. Several key enzymes of the apoptotic machinery, including the paracaspase and metacaspase families of the caspase-like protease superfamily, apoptotic ATPases and NACHT family NTPases, and mitochondrial HtrA-like proteases, have diverse homologs in bacteria, but not in archaea. Phylogenetic analysis strongly suggests a mitochondrial origin for metacaspases and the HtrA-like proteases, whereas acquisition from Actinomycetes appears to be the most likely scenario for AP-ATPases. The homologs of apoptotic proteins are particularly abundant and diverse in bacteria that undergo complex development, such as Actinomycetes, Cyanobacteria and alpha-proteobacteria, the latter being progenitors of the mitochondria. In these bacteria, the apoptosis-related domains typically form multidomain proteins, which are known or inferred to participate in signal transduction and regulation of gene expression. Some of these bacterial multidomain proteins contain fusions between apoptosis-related domains, such as AP-ATPase fused with a metacaspase or a TIR domain. Thus, bacterial homologs of eukaryotic apoptotic machinery components might functionally and physically interact with each other as parts of signaling pathways that remain to be investigated. An emerging scenario of the origin of the eukaryotic apoptotic system involves acquisition of several central apoptotic effectors as a consequence of mitochondrial endosymbiosis and probably also as a result of subsequent, additional horizontal gene transfer events, which was followed by recruitment of newly emerging eukaryotic domains as adaptors.

11.
The protein kinases C (PKCs) define a growing family of ubiquitous signal-transducing serine/threonine kinases that control ion conductance channels, release of hormones and cell growth and proliferation. Degenerate oligonucleotides were used as primers for polymerase chain reactions to amplify PKC-related sequences from the white truffle species Tuber magnatum and Tuber borchii. The deduced amino acid sequences of the cloned sequences reveal domains homologous to the regulatory and kinase domains of PKC-related proteins, but lack the typical Ca(2+)-binding domain and therefore should be classified as nPKCs. Both contain a large extended N-terminus which is found exclusively in fungal PKCs. Phylogenetic analysis of the kinase domain demonstrates high homology with known isoenzymes from filamentous fungi.

12.
Classifications of proteins into groups of related sequences are in some respects like a periodic table for biology, allowing us to understand the underlying molecular biology of any organism. Pfam is a large collection of protein domains and families. Its scientific goal is to provide a complete and accurate classification of protein families and domains. The next release of the database will contain over 10,000 entries, which leads us to reflect on how far we are from completing this work. Currently Pfam matches 72% of known protein sequences, but for proteins with known structure Pfam matches 95%, which we believe represents the likely upper bound. Based on our analysis a further 28,000 families would be required to achieve this level of coverage for the current sequence database. We also show that as more sequences are added to the sequence databases the fraction of sequences that Pfam matches is reduced, suggesting that continued addition of new families is essential to maintain its relevance.

13.
Members of the immunoglobulin superfamily in bacteria.   Total citations: 4 (self-citations: 0, citations by others: 4)
We report a prediction that two prokaryotic proteins contain immunoglobulin superfamily domains. Immunoglobulin-like folds have been identified previously in prokaryotic proteins, but these share no recognizable sequence similarity with eukaryotic immunoglobulin superfamily (IgSF) folds, and may be the result of the physics and chemistry of proteins favoring certain common folds. In contrast, the prokaryotic proteins identified have sequences whose match to the immunoglobulin superfamily can be detected by hidden Markov modeling, BLASTP matches, key residue analysis, and secondary structure predictions. We propose that these prokaryotic immunoglobulin-like domains are almost certain to be related by divergence from a common ancestor to eukaryotic immunoglobulin superfamily domains.

14.
The Protein Data Bank (PDB) is the preeminent source of protein structural information. PDB contains over 32,500 experimentally determined 3-D structures solved using X-ray crystallography or nuclear magnetic resonance spectroscopy. Intrinsically disordered regions fail to form a fixed 3-D structure under physiological conditions. In this study, we compare the amino-acid sequences of proteins whose structures are determined by X-ray crystallography with the corresponding sequences from the Swiss-Prot database. The analyzed dataset includes 16,370 structures, which represent 18,101 PDB chains and 5,434 different proteins from 910 different organisms (2,793 eukaryotic, 2,109 bacterial, 288 viral, and 244 archaeal). In this dataset, on average, each Swiss-Prot protein is represented by 7 PDB chains with 76% of the crystallized regions being represented by more than one structure. Intriguingly, the complete sequences of only ~7% of proteins are observed in the corresponding PDB structures, and only ~25% of the total dataset have >95% of their lengths observed in the corresponding PDB structures. This suggests that the vast majority of PDB proteins is shorter than their corresponding Swiss-Prot sequences and/or contain numerous residues, which are not observed in maps of electron density. To determine the prevalence of disordered regions in PDB, the residues in the Swiss-Prot sequences were grouped into four general categories, “Observed” (which correspond to structured regions), “Not observed” (regions with missing electron density, potentially disordered), “Uncharacterized,” and “Ambiguous,” depending on their appearance in the corresponding PDB entries. This non-redundant set of residues can be viewed as a ‘fragment’ or empirical domain database that contains a set of experimentally determined structured regions or domains and a set of experimentally verified disordered regions or domains. 
We studied the propensities and properties of residues in these four categories and analyzed their relations to the predictions of disorder using several algorithms. “Non-observed,” “Ambiguous,” and “Uncharacterized” regions were shown to possess the amino acid compositional biases typical of intrinsically disordered proteins. The application of four different disorder predictors (PONDR® VL-XT, VL3-BA, VSL1P, and IUPred) revealed that the vast majority of residues in the “Observed” dataset are ordered, and that the “Not observed” regions are mostly disordered. The “Uncharacterized” regions possess some tendency toward order, whereas the predictions for the short “Ambiguous” regions are really ambiguous. Long “Ambiguous” regions (>70 amino acid residues) are mostly predicted to be ordered, suggesting that they are likely to be “wobbly” domains.

Overall, we showed that completely ordered proteins are not highly abundant in PDB and many PDB sequences have disordered regions. In fact, in the analyzed dataset ~10% of the PDB proteins contain regions of consecutive missing or ambiguous residues longer than 30 amino-acids and ~40% of the proteins possess short regions (≥10 and <30 amino-acid long) of missing and ambiguous residues.

15.
Intrinsic disorder in the Protein Data Bank   Total citations: 2 (self-citations: 0, citations by others: 2)
The Protein Data Bank (PDB) is the preeminent source of protein structural information. PDB contains over 32,500 experimentally determined 3-D structures solved using X-ray crystallography or nuclear magnetic resonance spectroscopy. Intrinsically disordered regions fail to form a fixed 3-D structure under physiological conditions. In this study, we compare the amino-acid sequences of proteins whose structures are determined by X-ray crystallography with the corresponding sequences from the Swiss-Prot database. The analyzed dataset includes 16,370 structures, which represent 18,101 PDB chains and 5,434 different proteins from 910 different organisms (2,793 eukaryotic, 2,109 bacterial, 288 viral, and 244 archaeal). In this dataset, on average, each Swiss-Prot protein is represented by 7 PDB chains with 76% of the crystallized regions being represented by more than one structure. Intriguingly, the complete sequences of only approximately 7% of proteins are observed in the corresponding PDB structures, and only approximately 25% of the total dataset have >95% of their lengths observed in the corresponding PDB structures. This suggests that the vast majority of PDB proteins is shorter than their corresponding Swiss-Prot sequences and/or contain numerous residues, which are not observed in maps of electron density. To determine the prevalence of disordered regions in PDB, the residues in the Swiss-Prot sequences were grouped into four general categories, "Observed" (which correspond to structured regions), "Not observed" (regions with missing electron density, potentially disordered), "Uncharacterized," and "Ambiguous," depending on their appearance in the corresponding PDB entries. This non-redundant set of residues can be viewed as a 'fragment' or empirical domain database that contains a set of experimentally determined structured regions or domains and a set of experimentally verified disordered regions or domains. 
We studied the propensities and properties of residues in these four categories and analyzed their relations to the predictions of disorder using several algorithms. "Non-observed," "Ambiguous," and "Uncharacterized" regions were shown to possess the amino acid compositional biases typical of intrinsically disordered proteins. The application of four different disorder predictors (PONDR® VL-XT, VL3-BA, VSL1P, and IUPred) revealed that the vast majority of residues in the "Observed" dataset are ordered, and that the "Not observed" regions are mostly disordered. The "Uncharacterized" regions possess some tendency toward order, whereas the predictions for the short "Ambiguous" regions are really ambiguous. Long "Ambiguous" regions (>70 amino acid residues) are mostly predicted to be ordered, suggesting that they are likely to be "wobbly" domains. Overall, we showed that completely ordered proteins are not highly abundant in PDB and many PDB sequences have disordered regions. In fact, in the analyzed dataset approximately 10% of the PDB proteins contain regions of consecutive missing or ambiguous residues longer than 30 amino-acids and approximately 40% of the proteins possess short regions (≥10 and <30 amino-acid long) of missing and ambiguous residues.

16.
Harvey SH, Krien MJ, O'Connell MJ. Genome Biology 2002, 3(2): reviews3003.1-reviews3003.5
The structural maintenance of chromosomes (SMC) proteins are essential for successful chromosome transmission during replication and segregation of the genome in all organisms. SMCs are generally present as single proteins in bacteria, and as at least six distinct proteins in eukaryotes. The proteins range in size from approximately 110 to 170 kDa, and each has five distinct domains: amino- and carboxy-terminal globular domains, which contain sequences characteristic of ATPases, two coiled-coil regions separating the terminal domains and a central flexible hinge. SMC proteins function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair and epigenetic silencing of gene expression. Recent studies are beginning to decipher molecular details of how these processes are carried out.

17.
Fliess A, Motro B, Unger R. Proteins 2002, 48(2): 377-387
An important question in protein evolution is to what extent proteins may have undergone swaps (switches of domain or fragment order) during evolution. Such events might have occurred in several forms: swaps of short fragments, swaps of structural and functional motifs, or recombination of domains in multidomain proteins. This question is important for the theoretical understanding of the evolution of proteins, and has practical implications for using swaps as a design tool in protein engineering. In order to analyze the question systematically, we conducted a large scale survey of possible swaps and permutations among all pairs of proteins from the Swissprot database. A swap is defined as a specific kind of sequence mutation between two proteins in which two fragments that appear in both sequences have different relative order in the two sequences. For example, aXbYc and dYeXf are defined as a swap, where X and Y represent sequence fragments that switched their order. Identifying such swaps is difficult using standard sequence comparison packages. One of the main problems in the analysis stems from the fact that many sequences contain repeats, which may be identified as false-positive swaps. We have used two different approaches to detect pairs of proteins with swaps. The first approach is based on the predefined list of domains in Pfam. We identified all the proteins that share at least two domains and analyzed their relative order, looking for pairs in which the order of these domains was switched. We designed an algorithm to distinguish between real swaps and duplications. In the second approach, we used Blast to detect pairs of proteins that share several fragments. Then, we used an automatic procedure to select pairs that are likely to contain swaps. Those pairs were analyzed visually, using a graphical tool, to eliminate duplications.
Combining these approaches, about 140 different cases of swaps in the Swissprot database were found (after eliminating multiple pairs within the same family). Some of the cases have been described in the literature, but many are novel examples. Although each new example identified may be interesting to analyze, our main conclusion is that cases of swaps are rare in protein evolution. This observation is at odds with the common view that proteins are very modular to the point that modules (e.g., domains) can be shuffled between proteins with minimal constraints. Our study suggests that sequential constraints, i.e., the relative order between domains, are highly conserved.
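
The swap definition in this abstract (aXbYc versus dYeXf) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function name `find_swaps`, the representation of each protein as an ordered list of domain labels, and the rule of ignoring any domain that occurs more than once in either protein (a crude stand-in for the repeat and duplication filtering the abstract mentions) are all assumptions made for illustration.

```python
from itertools import combinations


def find_swaps(domains_a, domains_b):
    """Return (X, Y) pairs whose relative order differs between two proteins.

    Each protein is an ordered list of domain labels (hypothetical input
    format). Domains that occur more than once in either protein are
    skipped, as a simple guard against repeats being reported as swaps.
    """
    unique_a = {d for d in domains_a if domains_a.count(d) == 1}
    unique_b = {d for d in domains_b if domains_b.count(d) == 1}
    shared = unique_a & unique_b

    pos_a = {d: i for i, d in enumerate(domains_a) if d in shared}
    pos_b = {d: i for i, d in enumerate(domains_b) if d in shared}

    swaps = []
    for x, y in combinations(sorted(shared), 2):
        # Opposite signs mean X precedes Y in one protein and follows it in the other.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
            swaps.append((x, y))
    return swaps


if __name__ == "__main__":
    # The example from the abstract: X and Y switch order.
    print(find_swaps(list("aXbYc"), list("dYeXf")))
```

Under these assumptions, the abstract's own example yields the single pair (X, Y), while a protein pair in which X is repeated yields nothing, because the repeated domain is excluded before the order comparison.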

18.
K Weber, U Plessmann, W Ulrich. The EMBO Journal 1989, 8(11): 3221-3227
The giant body muscle cells of the nematode Ascaris lumbricoides show a complex three dimensional array of intermediate filaments (IFs). They contain two proteins, A (71 kd) and B (63 kd), which we now show are able to form homopolymeric filaments in vitro. The complete amino acid sequence of B and 80% of A have been determined. A and B are two homologous proteins with a 55% sequence identity over the rod and tail domains. Sequence comparisons with the only other invertebrate IF protein currently known (Helix pomatia) and with vertebrate IF proteins show that along the coiled-coil rod domain, sequence principles rather than actual sequences are conserved in evolution. Noticeable exceptions are the consensus sequences at the ends of the rod, which probably play a direct role in IF assembly. Like the Helix IF protein the nematode proteins have six extra heptads in the coil 1b segment. These are characteristic of nuclear lamins from vertebrates and invertebrates and are not found in vertebrate IF proteins. Unexpectedly the enhanced homology between lamins and invertebrate IF proteins continues in the tail domains, which in vertebrate IF proteins totally diverge. The sequence alignment necessitates the introduction of a 15 residue deletion in the tail domain of all three invertebrate IF proteins. Its location coincides with the position of the karyophilic signal sequence, which dictates nuclear entry of the lamins. The results provide the first molecular support for the speculation that nuclear lamins and cytoplasmic IF proteins arose in eukaryotic evolution from a common lamin-like predecessor.

19.
Genomics 2020, 112(3): 2271-2281
Collagens and collagen-like proteins are found in a wide range of organisms. The common feature of these proteins is a triple helix fold, requiring a characteristic pattern of amino acid sequences, composed of Gly-X-Y tripeptide repeats. Collagen-like proteins from bacteria are heterogeneous in terms of length and amino acid composition of their collagenous sequences. However, different bacteria live in different environments, some at extreme temperatures and conditions. This study explores the occurrence of collagen-like sequences in the genomes of different extreme condition-adapted bacteria, and investigates features that could be linked to conditions where they thrive. Our results show that proteins containing collagen-like sequences are encoded by genomes of various extremophiles. Some of these proteins contain conserved domains, characteristic of cell or endospore surface proteins, while most other proteins are unknown. The characteristics of collagenous sequences may depend on both the phylogenetic relationship and the living conditions of the bacteria.
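
As a rough illustration of what scanning for the Gly-X-Y pattern described in this abstract might look like, the sketch below uses a regular expression to find uninterrupted runs of Gly-X-Y tripeptides in a one-letter amino acid sequence. The function name `find_gxy_runs` and the default threshold of five consecutive repeats are assumptions chosen for the example; the abstract does not state how the study identified collagen-like sequences.

```python
import re


def find_gxy_runs(sequence, min_repeats=5):
    """Find runs of at least `min_repeats` consecutive Gly-X-Y tripeptides.

    Returns a list of (start, end, repeat_count) tuples with 0-based,
    end-exclusive coordinates. The threshold is an illustrative assumption.
    """
    # G followed by any two residues, repeated min_repeats or more times.
    pattern = re.compile(r"(?:G[A-Z]{2}){%d,}" % min_repeats)
    runs = []
    for match in pattern.finditer(sequence.upper()):
        length = match.end() - match.start()
        runs.append((match.start(), match.end(), length // 3))
    return runs


if __name__ == "__main__":
    # Toy sequence: two leading residues, six GPP repeats, then a non-collagenous tail.
    toy = "MK" + "GPP" * 6 + "AAA"
    print(find_gxy_runs(toy))
```

A plain regular expression only captures perfect repeats; real collagen-like domains can contain interruptions, so a production scan would need a more tolerant model.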

20.
Bracovirus gene products are highly divergent from insect proteins   Total citations: 1 (self-citations: 0, citations by others: 1)
Recently, several polydnavirus (PDV) genomes have been completely sequenced. The dsDNA circles enclosed in virus particles and injected by wasps into caterpillars appear to mainly encode virulence factors potentially involved in altering host immunity and/or development, thereby allowing the survival of the parasitoid larvae within the host tissues. Parasitoid wasps generally inject virulence factors produced in the venom gland. As PDV genomes are inherited vertically by wasps through a proviral form, wasp virulence genes may have been transferred to this chromosomal form, leading to their incorporation into virus particles. Indeed, many gene products from Cotesia congregata bracovirus (CcBV), such as PTPs, IkappaB-like, and cystatins, contain protein domains conserved in metazoans. Surprisingly however, CcBV virulence gene products are not more closely related to insect proteins than to human proteins. To determine whether the distance between CcBV and insect proteins is a specific feature of BV proteins or simply reflects a general high divergence of parasitoid wasp products, which might be due to parasitic lifestyle, we have analyzed the sequences of wasp genes obtained from a cDNA library. Wasp sequences having a high similarity with Apis mellifera genes involved in a variety of biological functions could be identified indicating that the high level of divergence observed for BV products is a hallmark of these viral proteins. We discuss how this divergence might be explained in the context of the current hypotheses on the origin and evolution of wasp-bracovirus associations.
