Similar articles
 20 similar articles found (search time: 32 ms)
1.
AAindex: amino acid index database   Cited by: 12 (self-citations: 0, citations by others: 12)
AAindex is a database of amino acid indices and amino acid mutation matrices. An amino acid index is a set of 20 numerical values representing various physico-chemical and biochemical properties of amino acids. An amino acid mutation matrix is generally 20 × 20 numerical values representing similarity of amino acids. AAindex consists of two sections: AAindex1 for the collection of published amino acid indices and AAindex2 for the collection of published amino acid mutation matrices. Each entry of either AAindex1 or AAindex2 consists of the definition, the reference information, a list of related entries in terms of the correlation coefficient and the actual data. The database may be accessed through the DBGET/LinkDB system at GenomeNet (http://www.genome.ad.jp/aaindex/) or may be downloaded by anonymous FTP (ftp://ftp.genome.ad.jp/db/genomenet/aaindex/).
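To make the idea of an "amino acid index" and of "related entries in terms of the correlation coefficient" concrete, here is a minimal sketch. It assumes the Kyte-Doolittle hydropathy scale as the example index (the values are quoted from general knowledge, not from this listing, and should be checked against the database before use), and the second index is a hypothetical sign-flipped copy used only to exercise the function. It does not read or reproduce the AAindex file format.

```python
# One amino acid index: 20 numbers, one per standard residue.
# Assumption: Kyte-Doolittle hydropathy values, quoted from general
# knowledge rather than from this listing.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pearson(index_a, index_b):
    """Pearson correlation between two indices keyed by residue letter."""
    residues = sorted(index_a)
    if sorted(index_b) != residues:
        raise ValueError("indices must cover the same residues")
    xs = [index_a[r] for r in residues]
    ys = [index_b[r] for r in residues]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical second index: the same scale with the sign flipped.
flipped = {res: -value for res, value in KYTE_DOOLITTLE.items()}
print(round(pearson(KYTE_DOOLITTLE, flipped), 6))
```

Ranking all other indices by the absolute value of this coefficient would give a "related entries" list in the spirit described above.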

2.
Mitochondria often use genetic codes different from the standard genetic code. Now that many mitochondrial genomes have been sequenced, these variant codes provide the first opportunity to examine empirically the processes that produce new genetic codes. The key question is: Are codon reassignments the sole result of mutation and genetic drift? Or are they the result of natural selection? Here we present an analysis of 24 phylogenetically independent codon reassignments in mitochondria. Although the mutation-drift hypothesis can explain reassignments from stop to an amino acid, we found that it cannot explain reassignments from one amino acid to another. In particular—and contrary to the predictions of the mutation-drift hypothesis—the codon involved in such a reassignment was not rare in the ancestral genome. Instead, such reassignments appear to take place while the codon is in use at an appreciable frequency. Moreover, the comparison of inferred amino acid usage in the ancestral genome with the neutral expectation shows that the amino acid gaining the codon was selectively favored over the amino acid losing the codon. These results are consistent with a simple model of weak selection on the amino acid composition of proteins in which codon reassignments are selected because they compensate for multiple slightly deleterious mutations throughout the mitochondrial genome. We propose that the selection pressure is for reduced protein synthesis cost: most reassignments give amino acids that are less expensive to synthesize. Taken together, our results strongly suggest that mitochondrial genetic codes evolve to match the amino acid requirements of proteins.  相似文献   

3.
AAindex: Amino Acid Index Database.   Cited by: 10 (self-citations: 0, citations by others: 10)
AAindex is a database of numerical indices representing various physicochemical and biochemical properties of amino acids and pairs of amino acids. It consists of two sections: AAindex1 for the amino acid index of 20 numerical values and AAindex2 for the amino acid mutation matrix of 210 numerical values. Each entry of either AAindex1 or AAindex2 consists of the definition, the reference information, a list of related entries in terms of the correlation coefficient, and the actual data. The database may be accessed through the DBGET/LinkDB system at GenomeNet (http://www.genome.ad.jp/dbget/) or may be downloaded by anonymous FTP (ftp://ftp.genome.ad.jp/db/genomenet/aaindex/).
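The figure of 210 values follows from symmetry: a symmetric 20 × 20 matrix is fully described by its diagonal plus one triangle, that is 20 × 21 / 2 = 210 cells. The sketch below shows that count and one possible way to address a cell in row-by-row lower-triangle storage. The function names and the storage order are assumptions chosen for illustration, not the layout of the AAindex2 files.

```python
N_AMINO_ACIDS = 20


def triangle_size(n):
    """Number of cells in the diagonal plus one triangle of an n x n matrix."""
    return n * (n + 1) // 2


def triangle_index(i, j):
    """Position of cell (i, j) in row-by-row lower-triangle storage.

    The matrix is symmetric, so (i, j) and (j, i) map to the same slot.
    """
    if i < j:
        i, j = j, i
    return i * (i + 1) // 2 + j


print(triangle_size(N_AMINO_ACIDS))
```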

4.
The genomes of the spirochaetes Borrelia burgdorferi and Treponema pallidum show strong strand-specific skews in nucleotide composition, with the leading strand in replication being richer in G and T than the lagging strand in both species. This mutation bias results in codon usage and amino acid composition patterns that are significantly different between genes encoded on the two strands, in both species. There are also substantial differences between the species, with T. pallidum having a much higher G+C content than B. burgdorferi. These changes in amino acid and codon compositions represent neutral sequence change that has been caused by strong strand- and species-specific mutation pressures. Genes that have been relocated between the leading and lagging strands since B. burgdorferi and T. pallidum diverged from a common ancestor now show codon and amino acid compositions typical of their current locations. There is no evidence that translational selection operates on codon usage in highly expressed genes in these species, and the primary influence on codon usage is whether a gene is transcribed in the same direction as replication, or opposite to it. The dnaA gene in both species has codon usage patterns distinctive of a lagging strand gene, indicating that the origin of replication lies downstream of this gene, possibly within dnaN. Our findings strongly suggest that gene-finding algorithms that ignore variability within the genome may be flawed.
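A strand "richer in G and T" can be quantified with the usual skew statistics, GC skew = (G − C)/(G + C) and TA skew = (T − A)/(T + A). The following is a minimal sketch of that calculation on a made-up toy sequence; the function name and the sequence are illustrative assumptions, and the abstract does not say which exact statistic the authors used.

```python
def skew(seq, first, second):
    """Return (first - second) / (first + second) for two bases in seq."""
    seq = seq.upper()
    count_first = seq.count(first)
    count_second = seq.count(second)
    total = count_first + count_second
    if total == 0:
        return 0.0  # neither base occurs; report no skew
    return (count_first - count_second) / total


# Toy sequence invented for illustration (3 G, 1 C, 3 T, 1 A).
toy_strand = "GGTTGACT"
print(skew(toy_strand, "G", "C"), skew(toy_strand, "T", "A"))
```

Positive values for both statistics on one strand, and negative values on its complement, would correspond to the leading/lagging pattern described above.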

5.
Sau K, Gupta SK, Sau S, Mandal SC, Ghosh TC. Bio Systems. 2006;85(2):107-113
Synonymous codon and amino acid usage biases were investigated in 903 Mimivirus protein-coding genes in order to understand the architecture and evolution of the Mimivirus genome. As expected for an AT-rich genome, the third positions of synonymous codons in Mimivirus carry mostly A or T bases. Codon usage bias in Mimivirus genes was found to be dictated by both mutational pressure and translational selection. The evidence shows that four factors, namely mean molecular weight (MMW), hydropathy, aromaticity and cysteine content, are mostly responsible for the variation of amino acid usage in Mimivirus proteins. Based on these observations, we suggest that genes involved in translation, DNA repair, protein folding, etc., were laterally transferred to Mimivirus long ago from living organisms, and that over time these genes acquired the codon usage pattern of other Mimivirus genes under selection pressure.

6.
How genomic diversity within bacterial populations originates and is maintained in the presence of frequent recombination is a central problem in understanding bacterial evolution. Natural populations of Borrelia burgdorferi, the bacterial agent of Lyme disease, consist of diverse genomic groups co-infecting single individual vertebrate hosts and tick vectors. To understand mechanisms of sympatric genome differentiation in B. burgdorferi, we sequenced and compared 23 genomes representing major genomic groups in North America and Europe. Linkage analysis of >13,500 single-nucleotide polymorphisms revealed pervasive horizontal DNA exchanges. Although three times more frequent than point mutation, recombination is localized and weakly affects genome-wide linkage disequilibrium. We show by computer simulations that, while enhancing population fitness, recombination constrains neutral and adaptive divergence among sympatric genomes through periodic selective sweeps. In contrast, simulations of frequency-dependent selection with recombination produced the observed pattern of a large number of sympatric genomic groups associated with major sequence variations at the selected locus. We conclude that negative frequency-dependent selection targeting a small number of surface-antigen loci (ospC in particular) sufficiently explains the maintenance of sympatric genome diversity in B. burgdorferi without adaptive divergence. We suggest that pervasive recombination makes it less likely for local B. burgdorferi genomic groups to achieve host specialization. B. burgdorferi genomic groups in the northeastern United States are thus best viewed as constituting a single bacterial species, whose generalist nature is a key to its rapid spread and human virulence.  相似文献   

7.
We outline a method for estimating quantitatively the influence of point mutations and selection on the frequencies of codons and amino acids. We show how the mutation rate, i.e., the rate of amino acid replacement due to point mutation, can be affected by the codon usage as well as by the rates of the involved base exchanges. A comparison of the mutation rates calculated from reliable values of codon usage and base exchange probabilities with those that would be expected on the basis of chance reveals a notable suppression of replacements leading to tryptophan, glutamate, lysine, and methionine, and particularly of those leading to the termination codons. If selection constraints are neglected and only mutations are taken into account, the best agreement between expected and observed frequencies of both codons and amino acids is obtained for alpha = 1.13-1.15, where (Formula: see text). The "selection values" of codons and amino acids derived by our method show a pattern that partially deviates from others in the literature. For example, the selection pressure on methionine and cysteine turns out to be much more pronounced than expected if only the discrepancies between their observed and expected occurrences in proteins are considered. To estimate to what extent randomly occurring amino acid replacements are accepted by selection, we constructed an "acceptability matrix" from the well-established matrix of accepted point mutations. On the basis of this matrix "acceptability values" of the amino acids can be defined that correlate with their selection values. We also examine the significance of mutations and selection of amino acids with respect to their physicochemical properties and functions in proteins. 
The conservatism of amino acid replacements with respect to certain properties such as polarity can be brought about by the mutational process alone, whereas the conservatism with respect to other relevant properties, among them all measures of bulkiness, obviously is the result of additional selectional constraints on the evolution of protein structures.

8.
9.
The relationship among 222 published indices representing various physicochemical and biochemical properties of amino acid residues has been investigated by hierarchical cluster analysis. The clustering result is illustrated by the minimum spanning tree, which is conveniently divided into four regions: alpha and turn propensities, beta propensity, hydrophobicity and other physicochemical properties including, among others, bulkiness of amino acid residues. In addition, several subclasses of hydrophobicity scales have been identified: preference of inside and outside, accessible surface area, surrounding hydrophobicity and other mostly experimental scales including transfer free energy, partition coefficients, HPLC parameters and polarity. Representative amino acid indices are identified in each of these groups. The collection of amino acid indices is a useful resource for empirical analyses correlating sequence information with structural and functional properties of proteins. As an example, the indices that best reproduce the amino acid mutation data matrix are searched against this collection.  相似文献   

10.
On the PAM matrix model of protein evolution   Cited by: 2 (self-citations: 0, citations by others: 2)
The internal consistency of the PAM matrix model of protein evolution is here investigated. The 1 PAM matrix has been constructed from amino acid replacements observed in closely related sequences. Such replacements are of two types, those that do not require an intermediate amino acid replacement and those that do. The second type of replacement must generally be produced by a repetition of the first. This allows data on the first type to be used in predicting data on the second type so that some elements of the 1 PAM matrix may be used to predict others. A discrepancy of more than two orders of magnitude is found between the predictions and the data when this is carried out. This is partly accounted for by an error in constructing the matrix. However, it also seems necessary that the basic model be modified. Several possibilities are considered. One of these is to incorporate a site-dependent spectrum of mutabilities associated with each amino acid.
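The phrase "produced by a repetition of the first" can be illustrated with a toy Markov chain: if state A cannot change to state C in one step, the probability of seeing A replaced by C after two steps is the sum over intermediates of the one-step probabilities, which is what squaring the transition matrix computes. The three-state matrix below is invented for illustration only; it is not the 1 PAM matrix, and this sketch is not the consistency test carried out in the paper.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Invented one-step replacement probabilities for three states A, B, C.
# A -> C and C -> A are impossible in one step and must pass through B.
TOY_STEP = [
    [0.99, 0.01, 0.00],
    [0.01, 0.98, 0.01],
    [0.00, 0.01, 0.99],
]

two_steps = matmul(TOY_STEP, TOY_STEP)

# A -> C after two steps equals P(A -> B) * P(B -> C) in this toy chain.
print(two_steps[0][2], TOY_STEP[0][1] * TOY_STEP[1][2])
```

In this toy chain the two-step entry is fully determined by the one-step entries, which is the kind of internal prediction the abstract describes comparing against observed data.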

11.
da Silva J. Genetics. 2012;190(3):1087-1099
Human immunodeficiency virus type 1 (HIV-1) undergoes a severe population bottleneck during sexual transmission and yet adapts extremely rapidly to the earliest immune responses. The bottleneck has been inferred to typically consist of a single genome, and typically eight amino acid mutations in viral proteins spread to fixation by the end of the early chronic phase of infection in response to selection by CD8(+) T cells. Stochastic simulation was used to examine the effects of the transmission bottleneck and of potential interference among spreading immune-escape mutations on the adaptive dynamics of the virus in early infection. If major viral population genetic parameters are assigned realistic values that permit rapid adaptive evolution, then a bottleneck of a single genome is not inconsistent with the observed pattern of adaptive fixations. One requirement is strong selection by CD8(+) T cells that decreases over time. Such selection may reduce effective population sizes at linked loci through genetic hitchhiking. However, this effect is predicted to be minor in early infection because the transmission bottleneck reduces the effective population size to such an extent that the resulting strong selection and weak mutation cause beneficial mutations to fix sequentially and thus avoid interference.  相似文献   

12.
Summary: Chou-Fasman parameters, measuring preferences of each amino acid for different conformational regions in proteins, were used to obtain an amino acid difference index of conformational parameter distance (CPD) values. CPD values were found to be significantly lower for amino acid exchanges representing in the genetic code transitions of purines, G↔A, than for exchanges representing either transitions of pyrimidines, C↔U, or transversions of purines and pyrimidines. Inasmuch as the distribution of CPD values in these non-G↔A exchanges resembles that obtained for amino acid pairs with double or triple base differences in their underlying codons, we conclude that the genetic code was not particularly designed to minimize effects of mutation on protein conformation. That natural selection minimizes these changes, however, was shown by tabulating results obtained by the maximum parsimony method for eight protein genealogies with a total occurrence of 4574 base substitutions. At the beginning position of the codons G↔A transitions were in very great excess over other base substitutions, and, conversely, C↔U transitions were deficient. At the middle position of the codons only fast evolving proteins showed an excess of G↔A transitions, as though selection mainly preserved conformation in these proteins while weeding out mutations affecting chemical properties of functional sites in slow evolving proteins. In both fast and slow evolving proteins the net direction of transitions and transversions was found to be from G beginning codons to non-G beginning codons resulting in more commonly occurring amino acids, especially alanine with its generalized conformational properties, being replaced at suitable sites by amino acids with more specialized conformational and chemical properties.
Historical circumstances pertaining to the origin of the genetic code and the nature of primordial proteins could account for such directional changes leading to increases in the functional density of proteins. In order to further explore the course of protein evolution, a modified parsimony algorithm was developed for constructing protein genealogies on the basis of minimum CPD length. The algorithm's ability to judge with finer discrimination that in protein evolution certain pathways of amino acid substitution should occur more readily than others was considered a potential advantage over strict maximum parsimony. In developing this CPD algorithm, the path of minimum CPD length through intermediate amino acids allowed by the genetic code for each pair of amino acids was determined. It was found that amino acid exchanges representing two base changes have a considerably lower average CPD value per base substitution than the amino acid exchanges representing single base changes. Amino acid exchanges representing three base changes have yet a further marked reduction in CPD per base change. This shows how extreme constraining effects of stabilizing selection can be circumvented, for by way of intermediate amino acids almost any amino acid can ultimately be substituted for another without damage to an evolving protein's conformation during the process.

13.
Yang Z, Nielsen R, Goldman N, Pedersen AM. Genetics. 2000;155(1):431-449
Comparison of relative fixation rates of synonymous (silent) and nonsynonymous (amino acid-altering) mutations provides a means for understanding the mechanisms of molecular sequence evolution. The nonsynonymous/synonymous rate ratio (omega = d(N)/d(S)) is an important indicator of selective pressure at the protein level, with omega = 1 meaning neutral mutations, omega < 1 purifying selection, and omega > 1 diversifying positive selection. Amino acid sites in a protein are expected to be under different selective pressures and have different underlying omega ratios. We develop models that account for heterogeneous omega ratios among amino acid sites and apply them to phylogenetic analyses of protein-coding DNA sequences. These models are useful for testing for adaptive molecular evolution and identifying amino acid sites under diversifying selection. Ten data sets of genes from nuclear, mitochondrial, and viral genomes are analyzed to estimate the distributions of omega among sites. In all data sets analyzed, the selective pressure indicated by the omega ratio is found to be highly heterogeneous among sites. Previously unsuspected Darwinian selection is detected in several genes in which the average omega ratio across sites is <1, but in which some sites are clearly under diversifying selection with omega > 1. Genes undergoing positive selection include the beta-globin gene from vertebrates, mitochondrial protein-coding genes from hominoids, the hemagglutinin (HA) gene from human influenza virus A, and HIV-1 env, vif, and pol genes. Tests for the presence of positively selected sites and their subsequent identification appear quite robust to the specific distributional form assumed for omega and can be achieved using any of several models we implement. However, we encountered difficulties in estimating the precise distribution of omega among sites from real data sets.
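The interpretation of omega given above can be written down as a small helper. This is only a sketch of the reading rule (omega below, equal to, or above 1); the function name, the tolerance parameter, and the per-site values are hypothetical, and the snippet does not implement the site-heterogeneity models the abstract describes. It also shows why a gene-wide average can hide selected sites.

```python
def classify_omega(d_n, d_s, tolerance=0.0):
    """Label selective pressure from nonsynonymous and synonymous rates.

    tolerance is a hypothetical band around 1 treated as neutral.
    """
    if d_s <= 0:
        raise ValueError("d_s must be positive to form the ratio")
    omega = d_n / d_s
    if omega > 1 + tolerance:
        return omega, "diversifying (positive) selection"
    if omega < 1 - tolerance:
        return omega, "purifying selection"
    return omega, "neutral"


# Hypothetical per-site omega values, invented for illustration.
site_omegas = [0.1, 0.2, 0.1, 3.0, 0.1]
average = sum(site_omegas) / len(site_omegas)
positive_sites = [i for i, w in enumerate(site_omegas) if w > 1]
print(average, positive_sites)
```

With these invented values the average is below 1 even though one site has omega above 1, which mirrors the abstract's point about genes whose average ratio is <1.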

14.
15.
We develop an approximate maximum likelihood method to estimate flanking nucleotide context-dependent mutation rates and amino acid exchange-dependent selection in orthologous protein-coding sequences and use it to analyze genome-wide coding sequence alignments from mammals and yeast. Allowing context-dependent mutation provides a better fit to coding sequence data than simpler (context-independent or CpG "hotspot") models and significantly affects selection parameter estimates. Allowing asymmetric (nonreciprocal) selection on amino acid exchanges gives a better fit than simple dN/dS or symmetric selection models. Relative selection strength estimates from our models show good agreement with independent estimates derived from human disease-causing and engineered mutations. Selection strengths depend on local protein structure, showing expected biophysical trends in helical versus nonhelical regions and increased asymmetry on polar-hydrophobic exchanges with increased burial. The more stringent selection that has previously been observed for highly expressed proteins is primarily concentrated in buried regions, supporting the notion that such proteins are under stronger than average selection for stability. Our analyses indicate that a highly parameterized model of mutation and selection is computationally tractable and is a useful tool for exploring a variety of biological questions concerning protein and coding sequence evolution.  相似文献   

16.
A gene encoding a putative carboxyl-terminal protease (CtpA), an unusual type of protease, is present in the Borrelia burgdorferi B31 genome. The B. burgdorferi CtpA amino acid sequence exhibits similarities to the sequences of the CtpA enzymes of the cyanobacterium Synechocystis sp. strain PCC 6803 and higher plants and also exhibits similarities to the sequences of putative CtpA proteins in other bacterial species. Here, we studied the effect of ctpA gene inactivation on the B. burgdorferi protein expression profile. Total B. burgdorferi proteins were separated by two-dimensional gel electrophoresis, and the results revealed that six proteins of the wild type were not detected in the ctpA mutant and that nine proteins observed in the ctpA mutant were undetectable in the wild type. Immunoblot analysis showed that the integral outer membrane protein P13 was larger and had a more acidic pI in the ctpA mutant, which is consistent with the theoretical change in pI for P13 not processed at the carboxyl terminus. Matrix-assisted laser desorption ionization-time of flight data indicated that in addition to P13, the BB0323 protein may serve as a substrate for carboxyl-terminal processing by CtpA. Complementation analysis of the ctpA mutant provided strong evidence that the observed effect on proteins depended on inactivation of the ctpA gene alone. We show that CtpA in B. burgdorferi is involved in the processing of proteins such as P13 and BB0323 and that inactivation of ctpA has a pleiotropic effect on borrelial protein synthesis. To our knowledge, this is the first analysis of both a CtpA protease and different substrate proteins in a pathogenic bacterium.  相似文献   

17.
The advent of full genome sequences provides exceptionally rich data sets to explore molecular and evolutionary mechanisms that shape divergence among and within genomes. In this study, we use multivariate analysis to determine the processes driving genome-wide patterns of amino usage in the obligate endosymbiont Buchnera and its close free-living relative Escherichia coli. In the AT-rich Buchnera genome, the primary source of variation in amino acid usage differentiates high- and low-expression genes. Amino acids of high-expression Buchnera genes are generally less aromatic and use relatively GC-rich codons, suggesting that selection against aromatic amino acids and against amino acids with AT-rich codons is stronger in high-expression genes. Selection to maintain hydrophobic amino acids in integral membrane proteins is a primary factor driving protein evolution in E. coli but is a secondary factor in Buchnera. In E. coli, gene expression is a secondary force driving amino acid usage, and a correlation with tRNA abundance suggests that translational selection contributes to this effect. Although this and previous studies demonstrate that AT mutational bias and genetic drift influence amino acid usage in Buchnera, this genome-wide analysis argues that selection is sufficient to affect the amino acid content of proteins with different expression and hydropathy levels.  相似文献   

18.
Glycoside hydrolase family 77 (GH77) contains prokaryotic amylomaltases and plant-disproportionating enzymes (both possessing the 4-alpha-glucanotransferase activity; EC 2.4.1.25). Together with GH13 and GH70, it forms the clan GH-H, known as the alpha-amylase family. Bioinformatics analysis revealed that the putative GH77 amylomaltase (MalQ) from the Lyme disease spirochaete Borrelia burgdorferi genome (BB0166) contains several amino acid substitutions in the positions that are important and conserved in all GH77 amylomaltases. The most important mutation concerned the functionally important arginine positioned two residues before the catalytic nucleophile that is replaced by lysine in B. burgdorferi MalQ. Similar remarkable substitutions were found in two other putative GH77 amylomaltases from related borreliae. In order to confirm the exclusive sequence features and to verify the eventual enzymatic activity, the malQ gene from B. burgdorferi was amplified using PCR. A c. 1.5-kb amplified DNA fragment was sequenced, cloned and expressed in Escherichia coli, and the resulting recombinant protein was preliminarily characterized for its activity towards glucose (G1) and a series of malto-oligosaccharides (G2-G7). This study confirmed that the remarkable substitution of the arginine really exists and the GH77 MalQ protein from B. burgdorferi is a functional amylomaltase because it is able to hydrolyse the malto-oligosaccharides as well as to form their longer transglycosylation products.  相似文献   

19.
The RNA genome of the hepatitis C virus (HCV) diversifies rapidly during the acute phase of infection, but the selective forces that drive this process remain poorly defined. Here we examined whether Darwinian selection pressure imposed by CD8(+) T cells is a dominant force driving early amino acid replacement in HCV viral populations. This question was addressed in two chimpanzees followed for 8 to 10 years after infection with a well-defined inoculum composed of a clonal genotype 1a (isolate H77C) HCV genome. Detailed characterization of CD8(+) T cell responses combined with sequencing of recovered virus at frequent intervals revealed that most acute-phase nonsynonymous mutations were clustered in class I epitopes and appeared much earlier than those in the remainder of the HCV genome. Moreover, the ratio of nonsynonymous to synonymous mutations, a measure of positive selection pressure, was increased 50-fold in class I epitopes compared with the rest of the HCV genome. Finally, some mutation of the clonal H77C genome toward a genotype 1a consensus sequence considered most fit for replication was observed during the acute phase of infection, but the majority of these amino acid substitutions occurred slowly over several years of chronic infection. Together these observations indicate that during acute hepatitis C, virus evolution was driven primarily by positive selection pressure exerted by CD8(+) T cells. This influence of immune pressure on viral evolution appears to subside as chronic infection is established and genetic drift becomes the dominant evolutionary force.  相似文献   

20.
Xin Gaowei, Hu Xixun, Wang Kejian, Wang Xingchun. 遗传 (Hereditas (Beijing)). 2018;40(12):1112-1119
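The clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 (CRISPR/Cas9) system is a third-generation genome editing tool that has been developed and widely adopted in recent years. However, the Streptococcus pyogenes Cas9 (SpCas9) used in this system recognizes only the NGG protospacer adjacent motif (PAM), which greatly limits the range of sites that can be edited. The SpCas9 variant VQR (D1135V/R1335Q/T1337R) recognizes NGAA, NGAG and NGAT PAMs in rice, but it has been unclear whether it also recognizes the NGAC PAM. In this study, an improved CRISPR/VQR system was used to edit three relatively inefficient VQR target sites in rice, NAL1-Q1, NAL1-Q2 and LPA1-Q. The improved system edited all three sites efficiently, with editing efficiencies of 9.75%, 43.90% and 29.26%, respectively. To determine whether the improved CRISPR/VQR system recognizes the NGAC PAM, the NAL1-C site in the rice leaf-width regulatory gene NARROW LEAF 1 (NAL1) and the GL1-C site in the wax synthesis gene GLOSSY1 (GL1) were selected for editing, and 57 transgenic rice plants were obtained. PCR amplification and sequencing of the target sites showed that 27 and 44 plants carried mutations at the NAL1-C and GL1-C target sites, corresponding to mutation rates of 47.36% and 77.19%, respectively; 26 plants were NAL1-C/GL1-C double mutants, a double-mutation rate of 45.61%. Further analysis showed that the CRISPR/VQR system produced four types of mutation: heterozygous, biallelic, chimeric and homozygous, with heterozygous and biallelic mutations predominating. These results indicate that the improved CRISPR/VQR system can efficiently edit NGAC PAM sites in rice and generates a rich variety of mutation types. This study provides a theoretical basis for editing NGAC PAM sites in relevant genes of rice and other plants.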
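Two pieces of the abstract lend themselves to a short worked example: matching a site against a PAM pattern in which N stands for any base, and deriving the reported mutation rates from the plant counts (27, 44 and 26 out of 57). The sketch below is illustrative only. The function names are hypothetical, and the observation that the reported percentages correspond to cutting rather than rounding at two decimals (27/57 would round to 47.37%) is an inference from the numbers, not something the abstract states.

```python
def matches_pam(site, pattern):
    """True if a DNA site fits a PAM pattern in which N means any base."""
    site, pattern = site.upper(), pattern.upper()
    if len(site) != len(pattern):
        return False
    return all(p == "N" or p == s for s, p in zip(site, pattern))


def truncated_percent(count, total):
    """Percentage cut (not rounded) to two decimals, using integers only."""
    hundredths = count * 10000 // total
    return f"{hundredths // 100}.{hundredths % 100:02d}"


TOTAL_PLANTS = 57
for label, mutated in [("NAL1-C", 27), ("GL1-C", 44), ("double", 26)]:
    print(label, truncated_percent(mutated, TOTAL_PLANTS) + "%")

print(matches_pam("TGAC", "NGAC"), matches_pam("TGGA", "NGAC"))
```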
