Similar literature
20 similar documents found.
1.
2.
Summary: Internal regularities of amino acid sequences of flavodoxins, FMN-containing, low molecular weight flavoproteins, were statistically examined using the minimum mutation method. The sequence of Clostridium pasteurianum flavodoxin shows statistically significant evidence of repetitious internal gene duplications at different levels of structure. Peptide pairs with a low chance probability of occurrence were frequently observed at a shift of 5 residues. The pair with the lowest chance probability is a pair of heptapeptides at positions 39–45 vs. 44–50, a 5-residue shift (p = 9 × 10⁻⁶). Most of the related pairs are consistent with, and could best be explained by, the repeating pentapeptide sequence (Lys-Gly-Ala-Asp-Val-)n with appropriate gaps. Internal repetitions with longer shifts were also suggested for other flavodoxins. Repetitious gene duplication is proposed for the early stages of flavodoxin evolution.
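The 5-residue periodicity described in this abstract can be illustrated with a simple self-comparison. The sketch below is not the minimum mutation method used in the paper; it only counts identical residues between a sequence and a copy of itself shifted by k positions. The function name `shift_identities` and the idealized input (four copies of the proposed pentapeptide, written in one-letter code as KGADV) are assumptions made for illustration.

```python
def shift_identities(seq: str, shift: int) -> int:
    """Count positions i where seq[i] equals seq[i + shift]."""
    return sum(1 for a, b in zip(seq, seq[shift:]) if a == b)


# Idealized repeat (Lys-Gly-Ala-Asp-Val)4 in one-letter code (hypothetical input).
ideal = "KGADV" * 4

# Identity counts for shifts 1..9; a perfect pentapeptide repeat peaks at shift 5.
profile = {k: shift_identities(ideal, k) for k in range(1, 10)}
best_shift = max(profile, key=profile.get)
```

On a real flavodoxin sequence the peak would be far less clean, which is why the paper attaches a chance probability to each peptide pair instead of relying on raw identity counts.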

3.
This paper continues an examination of the hypothesis that modern proteins evolved from random heteropeptide sequences. In support of the hypothesis, White and Jacobs (1993, J Mol Evol 36:79–95) have shown that any sequence chosen randomly from a large collection of nonhomologous proteins has a 90% or better chance of having a lengthwise distribution of amino acids that is indistinguishable from the random expectation regardless of amino acid type. The goal of the present study was to investigate the possibility that the random-origin hypothesis could explain the lengths of modern protein sequences without invoking specific mechanisms such as gene duplication or exon splicing. The sets of sequences examined were taken from the 1989 PIR database and consisted of 1,792 super-family proteins selected to have little sequence identity, 623 E. coli sequences, and 398 human sequences. The length distributions of the proteins could be described with high significance by either of two closely related probability density functions: the gamma distribution with parameter 2 or the distribution for the sum of two independent exponential random variables. A simple theory for the distributions was developed which assumes that (1) protoprotein sequences had exponentially distributed random independent lengths, (2) the length dependence of protein stability determined which of these protoproteins could fold into compact primitive proteins and thereby attain the potential for biochemical activity, (3) the useful protein sequences were preserved by the primitive genome, and (4) the resulting distribution of sequence lengths is reflected by modern proteins. The theory successfully predicts the two observed distributions, which can be distinguished by the functional form of the dependence of protein stability on length. The theory leads to three interesting conclusions. First, it predicts that a tetra-nucleotide was the signal for primitive translation termination. This prediction is entirely consistent with the observations of Brown et al. (1990a,b, Nucleic Acids Res 18:2079–2086 and 18:6339–6345), which show that tetra-nucleotides (stop codon plus following nucleotide) are the actual signals for termination of translation in both prokaryotes and eukaryotes. Second, the strong dependence of statistical length distributions on sequence-termination signaling codes implies that the evolution of stop codons and translation-termination processes was as important as gene splicing in early evolution. Third, because the theory is based upon a simple no-exon stochastic model, it provides a plausible alternative to a limited universe of exons from which all proteins evolved by gene duplication and exon splicing (Dorit et al. 1990, Science 250:1377–1382).
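The two length distributions named in this abstract have simple closed forms, sketched below. This assumes that "parameter 2" means a gamma shape parameter of 2 and that the two exponential variables may have different rates; the function names and rate arguments are illustrative and do not come from the paper.

```python
import math


def gamma2_pdf(x: float, rate: float) -> float:
    """Gamma density with shape 2: rate^2 * x * exp(-rate * x)."""
    if x < 0:
        return 0.0
    return rate * rate * x * math.exp(-rate * x)


def two_exp_sum_pdf(x: float, rate1: float, rate2: float) -> float:
    """Density of the sum of two independent exponential variables."""
    if x < 0:
        return 0.0
    if rate1 == rate2:
        # Equal rates reduce to the gamma density with shape 2.
        return gamma2_pdf(x, rate1)
    scale = rate1 * rate2 / (rate2 - rate1)
    return scale * (math.exp(-rate1 * x) - math.exp(-rate2 * x))
```

The two densities coincide when the rates are equal, which is one way to read the abstract's description of them as closely related.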

4.
Summary: A nucleic acid chain L nucleotides in length, with the specific base sequence $B_1 B_2 \cdots B_L$, each $B_i$ being A, G, C, or T, is defined by the L-dimensional vector $\mathbf{B} = (B_1, B_2, \ldots, B_L)$, the kth position in the chain being occupied by the base $B_k$. Let $P_{BB'}$ be the twelve given constant non-negative transition probabilities that in a specified position the base $B$ is replaced by the base $B'$ in a single step, and let $P_{BB'}(X)$ be the probability that the position goes from base $B$ to $B'$ in $X$ steps. An exact analytical expression for $P_{BB'}(X)$ is derived. Assuming that each base mutates independently of the others, an exact expression is derived for the probability $P_{\mathbf{B}\mathbf{B}'}(\mathbf{X})$ that the initial gene sequence $\mathbf{B}$ goes to a sequence $\mathbf{B}' = (B'_1, B'_2, \ldots, B'_L)$ after $\mathbf{X} = (X_1, X_2, \ldots, X_L)$ base replacements, where $X_k$ is the number of single-step base replacements in the kth position. The resulting equations allow a more precise accounting for the effects of Darwinian natural selection in molecular evolution than does the idealized but biologically less accurate assumption that each of the four nucleotides is equally likely to mutate to and be fixed as one of the other three. Illustrative applications of the theory to some problems in biological evolution are given.
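The quantities in this abstract can be computed numerically as a check on intuition. The sketch below does not reproduce the paper's exact analytical expression; it simply raises the single-step matrix to the power X and multiplies the per-position results, which follows from the independence assumption. It assumes that every step is an actual replacement, so the three probabilities leaving each base sum to 1 and the diagonal is zero. All function names are hypothetical.

```python
BASES = "AGCT"


def step_matrix(p):
    """Build the 4x4 single-step matrix from the twelve replacement probabilities.

    p maps (B, B') with B != B' to a probability. The diagonal is zero because
    each step is assumed to be an actual replacement.
    """
    m = [[0.0 if b == c else p[(b, c)] for c in BASES] for b in BASES]
    for row in m:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("probabilities out of each base must sum to 1")
    return m


def mat_mul(a, b):
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def x_step_matrix(m, x):
    """Probability of going from B to B' in exactly x steps (matrix power)."""
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for _ in range(x):
        result = mat_mul(result, m)
    return result


def sequence_probability(m, start, end, steps):
    """Product over positions of the x_k-step probabilities (independent sites)."""
    prob = 1.0
    for b, c, x in zip(start, end, steps):
        prob *= x_step_matrix(m, x)[BASES.index(b)][BASES.index(c)]
    return prob


# The idealized case the abstract contrasts with: all replacements equally likely.
uniform = {(b, c): 1.0 / 3.0 for b in BASES for c in BASES if b != c}
m = step_matrix(uniform)
two_step = x_step_matrix(m, 2)
```

Replacing `uniform` with twelve unequal probabilities gives the non-idealized case that the abstract argues is biologically more accurate.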

5.
Abstract: The secreted yield of hen egg-white lysozyme (HEWL) from the filamentous fungus Aspergillus niger was increased 10–20-fold by constructing a novel gene fusion. The cDNA sequence encoding mature HEWL was fused in frame to part of the native A. niger gene encoding glucoamylase (glaA), separated by a proteolytic cleavage site for in vivo processing. Using this construct, peak secreted HEWL yields of 1 g/l were obtained in A. niger shake flask cultures, compared to about 50 mg/l when using an expression cassette lacking any glaA coding sequence. The portion of glaA used in the gene fusion encoded the first 498 amino acids of glucoamylase (G498) and comprised its secretion signal, the catalytic domain and most of the O-glycosylated linker region which, in the entire glucoamylase molecule, spatially separates and links the catalytic and starch-binding domains.

6.
Protein phosphatase 2A (PP2A) is regulated through a variety of mechanisms, including post-translational modifications and association with regulatory proteins. Alpha4 is one such regulatory protein that binds the PP2A catalytic subunit (PP2Ac) and protects it from polyubiquitination and degradation. Alpha4 is a multidomain protein with a C-terminal domain that binds Mid1, a putative E3 ubiquitin ligase, and an N-terminal domain containing the PP2Ac-binding site. In this work, we present the structure of the N-terminal domain of mammalian Alpha4 determined by X-ray crystallography and use double electron-electron resonance spectroscopy to show that it is a flexible tetratricopeptide repeat-like protein. Structurally, Alpha4 differs from its yeast homolog, Tap42, in two important ways: 1) the helix containing the PP2Ac-binding residues is in a more open conformation, showing flexibility in this region; and 2) Alpha4 contains a ubiquitin-interacting motif. The effects of wild-type and mutant Alpha4 on PP2Ac ubiquitination and stability were examined in mammalian cells by performing tandem ubiquitin-binding entity precipitations and cycloheximide chase experiments. Our results reveal that both the C-terminal Mid1-binding domain and the PP2Ac-binding determinants are required for Alpha4-mediated protection of PP2Ac from polyubiquitination and degradation.

7.
Boehm AM, Sickmann A. Proteomics 2006, 6(15):4223–4226
In mass spectrometry-based proteomics, protein identification results usually consist of peptide sequences and database-dependent accession identifiers of the matching proteins. Often certain annotations are only available in particular databases, which in turn must be queried by a certain identifier. In order to simplify and unify the tracing of identified proteins back to their original annotation information, a system capable of set-oriented mapping of the different accession identifiers of proteins derived from multiple sequence database sources has been developed. This unifies access to protein information, allows tracing to other online resources that provide additional information, and resolves cross-references of protein identifications. The interface of seqDB is available via http://www.protein-ms.de following the link to seqDB.

8.
An E. coli vector system was constructed which allows the expression of fusion genes via an L-rhamnose-inducible promoter. The corresponding fusion proteins consist of the maltose-binding protein and a His-tag sequence for affinity purification, the Saccharomyces cerevisiae Smt3 protein for protein processing by proteolytic cleavage, and the protein of interest. The Smt3 gene was codon-optimized for expression in E. coli. In a second rhamnose-inducible vector, the S. cerevisiae Ulp1 protease gene for processing Smt3 fusion proteins was fused in the same way to the maltose-binding protein and His-tag sequence but without the Smt3 gene. The enhanced green fluorescent protein (eGFP) was used as reporter and protein of interest. Both fusion proteins (MalE-6xHis-Smt3-eGFP and MalE-6xHis-Ulp1) were efficiently produced in E. coli and separately purified on amylose resin. After proteolytic cleavage, the products were applied to a Ni-NTA column to remove protease and tags. Pure eGFP protein was obtained in the flow-through of the column in a yield of around 35% of the crude cell extract.

9.
Abstract: The untranslatable, RNA polymerase II-dependent gene (dutA) of Dictyostelium discoideum is induced early in development. However, unlike other early genes, dutA induction was not affected by cAMP pulses and occurred normally in various cAMP-related mutant cells, indicating that this induction depended solely on factors other than cAMP. In the knockout strain of the catalytic subunit of protein kinase A, dutA expression was severely blocked and not recovered by cAMP pulses. This demonstrates that even the cAMP-independent gene, dutA, requires protein kinase A for its expression.

10.
Three different probes, obtained by PCR amplification and labelling of different segments of a PDI cDNA clone from common wheat, were used to identify the gene sequences coding for protein disulphide isomerase (PDI) and to assign them to wheat chromosomes. One of these probes, containing the whole coding region except for a short segment coding for the C-terminal sequence, displayed defined and specific RFLP patterns. PDI gene sequences were consequently assigned to wheat chromosome arms 4BS, 4DS, 4AL and 1BS by Southern hybridisation of EcoRI-, HindIII- and BamHI-digested total DNA of nulli-tetrasomic and di-telosomic lines of Chinese Spring. This probe was also employed for assessing the restriction fragment length polymorphism in several hexaploid and tetraploid cultivated wheats. These showed considerable conservation at PDI loci; in fact, polymorphism was only observed for the chromosome 1B fragment. Received: 7 July 1998 / Accepted: 14 August 1998

11.
Summary: A method has been developed which enables the estimation of the plant gene flow parameters p (pollen dispersal), s (seed dispersal) and t (outcrossing rate) from a selection-free, continuously structured population in equilibrium. The method uses Wright's F-coefficients and introduces a new F-function which describes the genetic similarity as a function of the spatial distance. The method has been elaborated for wind-pollinated plant species but can be modified for insect pollination and for animal species. In practice, allozymes will provide the necessary neutral genetic variation. The more loci used and the more intermediate the gene frequencies, the more reliable the results. For the estimation of p and t together (when the outcrossing rate is not known), at least two chromosomally unlinked loci are required. The method for estimating s depends on whether the plant species is annual or perennial. The mechanism of selfing has been analysed by explaining the value of t in terms of three components: population density (d), pollen flow (p) and relative fertilization potential of own pollen (Z). The concepts of neighbourhood size and isolation by distance, developed by Wright, who used a single gene flow parameter, have been extended to the situation which is realistic for seed plants, using all three parameters p, s and t. When p is large with respect to s, s largely determines the value of the neighbourhood size, whereas p is the dominant factor in isolation by distance. The use of local effective population size and mean gene transport per generation instead of neighbourhood size and neighbourhood area, respectively, is proposed to avoid confusion. Computer simulations have been carried out to check the validity and the reliability of the method. Populations of 200 plants, using two or three loci with intermediate allele frequencies, gave good results in the calculation of p with known value of t and of s and Ne. With unknown t, especially with lower values of t, larger populations of at least 1,000 plants are necessary to obtain reasonably accurate results for p and mean gene transport per generation M. Grassland Species Research Group Publication No. 81

12.
13.
The gbpC gene encoding the glucan-binding protein C which is involved in dextran (glucan)-dependent aggregation (ddag) of Streptococcus mutans has been identified by random mutagenesis. We analyzed ddag(-) mutants containing the intact gbpC gene and found that these mutants possessed a large and characteristic duplication of a region of the chromosome which was responsible for the phenotype. Based upon characterization of these duplications, we developed a strategy to introduce a duplication into any specific region of the chromosome of these organisms. The 690-bp gene responsible for the ddag(-) phenotype was identified within a 60-kb region by observing ddag (positive or negative) phenotypes of successively constructed specific duplication mutants.

14.
Graphic representation of biological sequences can provide intuitive overall pictures as well as useful insights for performing large-scale analysis. Here, a new two-dimensional graph, called “2D-MH”, is proposed to represent protein sequences. It is formed by incorporating the side-chain mass of each constituent amino acid and its hydrophobicity. The resulting curve features (1) a one-to-one correspondence without circuit or degeneracy, (2) a better reflection of the innate structure of the protein sequence, (3) clear visibility of the similarity between protein sequences, (4) greater sensitivity to mutation sites important for drug targeting, and (5) usability as a metric for the “evolutionary distance” of a protein from one species to another. It is anticipated that the presented graphic method may become a useful vehicle for large-scale analysis of the avalanche of protein sequences generated in the post-genomic age. As a web server, 2D-MH is freely accessible at http://icpr.jci.jx.cn/bioinfo/pplot/2D-MH, where one can easily generate the two-dimensional graphs for any number of protein sequences and compare the evolutionary distances between them.

15.
For accurate and reliable gene expression results, real-time polymerase chain reaction (PCR) data require normalization with an internal control gene expressed at constant levels under all the experimental conditions being analyzed. In this study, the expression of 12 candidate internal control genes, including ACT1, EF1α, GAPDH, IF4a, TUB6, UBC, UBQ5, UBQ10, 18S rRNA, 25S rRNA, GRX and HSP90, has been validated in a diverse set of 18 tissue samples representing different organs/developmental stages and stress conditions in chickpea (Cicer arietinum L.). Their expression levels vary considerably among the tissue samples analyzed. The expression levels of EF1α and HSP90 are the most constant across the organs/developmental stages analyzed. Similarly, the expression levels of IF4a and GAPDH are the most constant across the stress conditions. A set of the two most stable genes is found sufficient for accurate and reliable normalization of real-time PCR data in the given set of chickpea tissue samples. The genes with the most constant expression identified in this study should be useful for normalization of gene expression data in a wide variety of tissue samples in chickpea.
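Normalization against two reference genes, as this abstract recommends, can be sketched with the common delta-Ct calculation. The abstract does not say which formula the authors used, so this is an assumption: it supposes 100% amplification efficiency (one doubling per cycle), and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target: float, ct_references: list[float]) -> float:
    """Expression of a target gene relative to the mean of the reference genes.

    Assumes a doubling per cycle. Averaging Ct values is equivalent to taking
    the geometric mean of the reference gene quantities.
    """
    if not ct_references:
        raise ValueError("at least one reference gene is required")
    mean_ref_ct = sum(ct_references) / len(ct_references)
    return 2.0 ** -(ct_target - mean_ref_ct)


# Hypothetical Ct values for a target gene and two reference genes
# (for example EF1α and HSP90 in an organ comparison).
value = relative_expression(24.0, [20.0, 22.0])  # 2 ** -3 = 0.125
```

Using the two most stable genes together dampens the effect of any residual variation in either one, which is the practical reason for averaging over references.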

16.
Abstract: Porphyromonas gingivalis 381 fimbriae and a synthetic peptide composed of residues 69–73 (ALTTE) of the fimbrial subunit protein, FP381(69–73), function in the induction of interleukin 6 (IL-6) production, IL-6 mRNA expression, and tyrosine and serine/threonine phosphorylation of several proteins in human peripheral blood mononuclear cells (PBMC). Herbimycin A and H-7, inhibitors of tyrosine kinases and protein kinase C (PKC), respectively, markedly inhibited IL-6 production, gene expression, and tyrosine and serine/threonine phosphorylation of proteins. GLTTE, an inactive analog of the synthetic peptide in which alanine at position 69 of FP381(69–73) is replaced by glycine, exhibited an antagonistic effect on the IL-6 production induced by the fimbriae. These results suggest that the peptide ALTTE functions as an agent in inflammatory reactions and immune responses in inflamed gingival and periodontal tissues, in which protein phosphorylation by tyrosine kinases and PKC may participate in signal transduction.

17.
18.
Summary: A method for detecting homology between two protein or nucleic acid sequences which require insertions or deletions for optimum alignment has been devised for use with a computer. Sequences are assessed for possible relationship by Monte Carlo methods involving comparisons between the alignment of the real sequences and alignments of randomly scrambled sequences of the same composition as the real sequences, each alignment having the optimum number of gaps. As each gap is successively introduced into a comparison (real or random), a maximum score is determined from the similarity of the aligned residues. From the distribution of the maximum alignment scores of randomly scrambled sequences having the same number of gaps, the percentage of random comparisons having higher scores is determined, and the smallest of these percentage levels for each pair of sequences (real or random) indicates the optimum alignment. The fraction of the comparisons of random sequences having percentage levels at their optimum alignment below that of the real sequence comparison at its optimum estimates the probability that such an alignment might have arisen by chance. Related sequences are detected since their optimum alignment score, by virtue of a contribution from ancestral homology in addition to optimised random considerations, occupies a more extreme position in the appropriate frequency distribution of scores than do the majority of optimum scores of randomly scrambled sequences in their appropriate distributions. Application of this optimum match method of sequence comparison shows that the sensitivity of the maximum match method of Needleman and Wunsch (1970) decreases quite dramatically with sequence comparisons which require only a few gaps for a reasonable alignment, or when sequences differ greatly in length. The maximum match method as applied by Barker and Dayhoff (1972) has the additional disadvantage that deletions which have occurred in the longer of two homologous protein sequences further decrease the sensitivity of detection of relationship. The constrained match method of Sankoff and Cedergren (1973) is seen to be misleading, since large increments in the alignment score from added gaps do not necessarily result in the high total alignment score required to demonstrate sequence homology.
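The Monte Carlo idea in this abstract, comparing a real alignment score with scores from scrambled sequences of the same composition, can be sketched in simplified form. The code below is not the optimum match method itself: it uses a single global alignment score with a linear gap penalty and omits the separate score distributions per gap count. The scoring values, function names, and default trial count are assumptions.

```python
import random


def global_score(a: str, b: str, match: int = 1, mismatch: int = 0, gap: int = 1) -> int:
    """Global alignment score by dynamic programming with a linear gap penalty."""
    prev = [-gap * j for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [-gap * i]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] - gap, curr[j - 1] - gap))
        prev = curr
    return prev[-1]


def scrambled(seq: str, rng: random.Random) -> str:
    """Return a random permutation of seq, preserving its composition."""
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


def chance_fraction(a: str, b: str, trials: int = 200, seed: int = 0) -> float:
    """Fraction of scrambled comparisons scoring at least as high as the real one."""
    rng = random.Random(seed)
    real = global_score(a, b)
    hits = sum(
        1
        for _ in range(trials)
        if global_score(scrambled(a, rng), scrambled(b, rng)) >= real
    )
    return hits / trials
```

A small fraction suggests that the real alignment is unlikely to have arisen by chance. The method in the abstract refines this by choosing, for each pair, the gap count whose score is most extreme within its own random distribution.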

19.
Serum amyloid A protein (apo-SAA) is an acute-phase reactant and an apolipoprotein of high density lipoproteins (HDL). Six major isoforms of apo-SAA occur in humans (pI 6.0, 6.4, 7.0, 7.4, 7.5, 8.0). In this report we have rationalized the phenotypic expression of apo-SAA isoforms with published apo-SAA structures predicted from the apo-SAA cDNAs pA1 and pSAA82 and the genomic DNA SAAg9. The six apo-SAA isoforms fall into three pairs, pI 6.0/6.4, 7.0/7.5, and 7.4/8.0, which are products of cDNA pA1, cDNA pSAA82, and genomic DNA SAAg9, respectively. The second of each isoform pair (i.e. pI 6.4, 7.5, and 8.0) is the "primary" product: a 104-residue peptide with the NH2-terminal sequence Arg-Ser-Phe-Phe. Each primary product is processed either to a major 103-residue peptide with the NH2-terminal sequence Ser-Phe-Phe or to a minor 102-residue product which results from the loss of both an Arg and a Ser residue from the NH2 terminus. These "secondary" products have the lower pI values of 6.0, 7.0, and 7.4, respectively. The isoelectric points of the SAAg9 products were confirmed by expression of SAAg9 in transfected mouse L-cells. Both the pI 8.0 and 7.4 isoforms were present in cellular extracts, suggesting that post-translational modification of apo-SAA may occur intracellularly. However, the greater relative abundance of the pI 7.4 isoform extracellularly suggests that the major conversion may occur after secretion. Whereas the gene corresponding to the pA1 cDNA sequence does not show allelic variation, the segregation characteristics of the pI 7.0/7.5 and 7.4/8.0 isoform pairs amongst individuals suggest that these isoforms are the products of genes (with sequences corresponding to pSAA82 and SAAg9, respectively) which are allelic variants at a single locus distinct from that for the pI 6.0/6.4 isoform pair.

20.
We have recently cloned a cDNA encoding mitochondrial porin in Drosophila melanogaster and shown its chromosomal localization (Messina et al., FEBS Lett. (1996) 384, 9–13). This cDNA was used as a probe for screening a genomic library. We thus cloned and sequenced a 4494-bp genomic region which contained the whole gene for the mitochondrial porin or VDAC. It was found that this D. melanogaster porin gene contains five exons, numbered IA (115 bp), IB (123 bp), II (320 bp), III (228 bp) and IV (752 bp). Exons II, III and IV contain the protein coding sequence and the 3′ untranslated sequence (3′-UTR). The first base in exon II precisely corresponds to the first base of the starting ATG codon. Exon IA corresponds to the 5′-UTR sequence reported in the published cDNA sequence. Exon IB corresponds to an alternative 5′-UTR sequence, demonstrated to be transcribed by 5′-RACE experiments. The exon-intron splicing borders and the length of exon III perfectly match a homologous internal exon detected in the mouse genes. This exon encodes a protein domain predicted by sequence transmembrane arrangement models to contain major hydrophilic loops, and it is thus suspected to have a conserved distinct function. In situ hybridization experiments confirmed the localization of the genomic clone on chromosome 2L at region 32B3-4. Together with genomic Southern blotting at various stringencies, the same experiment did not confirm the presence of a second genetic locus on D. melanogaster chromosomes. Northern blots demonstrated that the porin gene is a housekeeping one: three messages of approx. 1.2–1.6 kbp are transcribed in every fly developmental stage that was studied. They were shown to derive from alternative usage of different promoters and polyadenylation sites.
