Similar literature
1.
Sexually induced gene 1 (Sig1) in the centric diatom Thalassiosira weissflogii is considered to encode a gamete recognition protein. Sorhannus (2003) analyzed nucleotide sequences of Sig1 using parsimony analysis and the maximum-likelihood (ML)-based Bayesian method for inferring positive selection at single amino acid sites and reported that positively selected sites were detected by the latter method but not by the former. He then concluded that for this type of study, the ML-based method is more reliable than parsimony analysis. Here we show that his results apparently represent false-positive cases of the ML-based method and that there is no solid evidence that this gene contains positively selected sites. We further demonstrate that in the tax gene of human T-cell lymphotropic virus type I (HTLV-I), all codon sites, including invariable sites, can be inferred as positively selected sites by the ML-based method. These observations indicate that the ML-based method may produce many false-positive sites. One of the main reasons for the occurrence of false positives is that in the ML-based method, codon sites are grouped into several categories, with different nonsynonymous/synonymous rate ratios (omegas), on a purely statistical basis, and positive selection is inferred indirectly by examining whether the average omega for each category is greater than 1. In parsimony analysis, however, the evolutionary change of nucleotides at each codon site is examined. For this reason, parsimony-based methods rarely produce false positives and are safer than ML-based methods for detecting positive selection at individual codon sites, although a large number of sequences are necessary.

2.
The reliabilities of parsimony-based and likelihood-based methods for inferring positive selection at single amino acid sites were studied using the nucleotide sequences of human leukocyte antigen (HLA) genes, in which positive selection is known to be operating at the antigen recognition site. The results indicate that the inference by parsimony-based methods is robust to the use of different evolutionary models and generally more reliable than that by likelihood-based methods. In contrast, the results obtained by likelihood-based methods depend on the models and on the initial parameter values used. It is sometimes difficult to obtain the maximum likelihood estimates of parameters for a given model, and the results obtained may be false negatives or false positives depending on the initial parameter values. It is therefore preferable to use parsimony-based methods as long as the number of sequences is relatively large and the branch lengths of the phylogenetic tree are relatively small.

3.
Single likelihood ancestor counting (SLAC), fixed effects likelihood (FEL), and several random effects likelihood (REL) methods were utilized to identify positively and negatively selected sites in sexually induced gene 1 (Sig1) of four different Thalassiosira species. The SLAC analysis did not find any sites affected by positive selection but suggested 13 sites influenced by negative selection. The SLAC approach may be too conservative because of low sequence divergence. The FEL and REL analyses revealed over 60 negatively selected sites and two positively selected sites that were unique to each method. The REL method may not be able to reliably identify individual sites under selection when applied to short sequences with low divergence. Instead, we proposed a new alignment-wide test for adaptive evolution based on codon models with variation in synonymous and nonsynonymous substitution rates among sites and found evidence for diversifying evolution without relying on site-by-site testing. The performance of the FEL and REL approaches was evaluated by subjecting the tests to a type I error rate simulation analysis, using the specific characteristics of the Sig1 data set. Simulation results indicated that the FEL test had reasonable type I error rates, while REL might have been too liberal, suggesting that the two positively selected sites identified by FEL (codons 94 and 174) are not likely to be false positives. The evolution of these codon sites, one of which is located in functional domain II, appears to be associated with divergence among the three major Thalassiosira lineages. Electronic supplementary material is available for this article and is accessible to authorised users. [Reviewing Editor: Dr. Martin Kreitman]

4.
Inferring positive selection at single amino acid sites is of biological and medical importance. Parsimony-based and likelihood-based methods have been developed for this purpose, but the reliabilities of these methods are not well understood. Because the evolutionary models assumed in these methods are only rough approximations to reality, it is desirable that the methods are not very sensitive to violation of the assumptions made. In this study we show by computer simulation that the likelihood-based method is sensitive to violation of the assumptions and produces many false-positive results under certain conditions, whereas the parsimony-based method tends to be conservative. These observations, together with those from previous studies, suggest that the positively selected sites inferred by the parsimony-based method are more reliable than those inferred by the likelihood-based method.

5.
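The SSU1 gene is one of the key factors involved in sulfite efflux and SO2 tolerance. To investigate the role of SSU1 in SO2 tolerance in Saccharomyces cerevisiae and the mechanism of its divergence, this study examined the genetic characteristics and evolutionary patterns of the SSU1 gene in S. cerevisiae. Cluster analysis based on SSU1 gene sequences showed that the S. cerevisiae population can be divided by this gene into three subgroups, which are unrelated to the geographic locations from which the strains were isolated. The McDonald-Kreitman test based on population data indicated that SSU1 has been subject to adaptive selection in S. cerevisiae. The Ka/Ks test showed Ka/Ks values significantly greater than 1 between different subgroups, and the PAML branch model detected positive selection acting on specific lineages within the population. The PAML branch-site model identified nine potential positively selected sites, four of which occur in the specific lineages under positive selection. Analysis based on the ssu1p protein structure showed that, among the positively selected sites present in the specific lineages, all but site 345 (R/K), where both alternative residues are basic amino acids, involve substitutions between polar and hydrophobic amino acids. Given that the pKa values of amino acids in different regions play an important role in maintaining normal function, substitutions at such sites may affect SO2 transport by the ssu1p protein.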
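As a rough illustration of the McDonald-Kreitman test mentioned in this abstract, the sketch below arranges nonsynonymous and synonymous counts for fixed differences and polymorphisms in a 2×2 table, then computes the neutrality index and a two-sided Fisher exact p-value. The function names and the counts are illustrative assumptions and are not taken from the SSU1 study.

```python
from math import comb


def neutrality_index(dn, ds, pn, ps):
    """Neutrality index NI = (Pn/Ps) / (Dn/Ds).

    dn, ds: fixed nonsynonymous / synonymous differences between groups.
    pn, ps: nonsynonymous / synonymous polymorphisms within a group.
    """
    if dn == 0 or ds == 0 or ps == 0:
        raise ValueError("NI is undefined when Dn, Ds or Ps is zero")
    return (pn / ps) / (dn / ds)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts, for illustration only.
dn, ds, pn, ps = 7, 17, 2, 42
print("NI =", neutrality_index(dn, ds, pn, ps))
print("p  =", fisher_exact_two_sided(dn, ds, pn, ps))
```

A neutrality index below 1 together with a small p-value is usually read as an excess of fixed nonsynonymous differences, which is compatible with adaptive evolution.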

6.
Ramaiah Arunachalam 《Genetica》2013,141(4-6):143-155
In the twenty-first century, the first pandemic outbreak of novel human influenza A/H1N1 virus (NIV) was reported in Mexico and the USA in March and early April 2009, respectively. The outbreak spread among human populations because of a meager or absent immune response against the newly emerged viruses. The success of vaccines and drugs depends on their low susceptibility to the formation of escape mutants in the virus. An excess of non-synonymous over synonymous substitutions is a main indicator of positive Darwinian selection in protein-coding genes of NIVs. Positive Darwinian selection operating on each site of the proteins was inferred by computing ω, the ratio of non-synonymous to synonymous substitutions (dN/dS, or Ka/Ks), which was calculated by three different methods (codon-based maximum likelihood, branch-site, and empirical Bayesian methods) under various models. In total, nine sites from PB2, PB1, HA, M2 and NS1 were inferred to be positively selected. The functions of the amino acid sites of NIV proteins under positive selection were inferred by comparing them with experimentally determined, functionally characterized amino acid sites. In all, four positively selected sites of PB1, HA and M2 were found to be involved in B-cell epitopes (BCEs). Interestingly, most of these sites are also involved in T-cell epitopes (TCEs). However, more sites under positive selection are involved in TCEs than in BCEs. Amino acid sites involved in both BCEs and TCEs should be regarded as highly suitable targets, because these sites could induce strong humoral and cellular immune responses.

7.
Amino acid residues that are involved in functional interactions in proteins have strong evolutionary pressure to remain unchanged and consequently their substitution patterns are different from those that are noninteracting. To characterize and quantify the differences between amino acid substitution patterns due to structural restraints and those under functional restraints, we have made a comparative analysis of families of homologous proteins. Residues classified as having the same amino acid type, secondary structure, accessibility, and side-chain hydrogen bonds are shown to be better conserved if they are close to the active site. We have focused on enzyme families for this analysis since they have functional sites that are easily defined by their catalytic residues. We have derived new sets of environment-specific substitution tables, which we term function-dependent environment-specific substitution tables, where amino acid residues are classified according to their distance from the functional sites. The residues that are within a distance of 9 Å from the active site have distinct amino acid substitution patterns when compared to the other sites. The function-dependent environment-specific substitution tables have been tested using the sequence-structure homology recognition program FUGUE and the results compared with the recognition performance obtained using the standard environment-specific substitution tables. Significant improvements are obtained in both recognition performance and alignment accuracy using the function-dependent environment-specific substitution tables (P-value = 0.02, according to the Wilcoxon signed rank test for alignment accuracy). The alignments near the active site are greatly improved with pronounced improvements at lower percentage identities (less than 30%).

8.
Natural selection operating at the amino acid sequence level can be detected by comparing the rates of synonymous (r(S)) and nonsynonymous (r(N)) nucleotide substitutions, where r(N)/r(S) (omega) > 1 and omega < 1 suggest positive and negative selection, respectively. The branch-site test has been developed for detecting positive selection operating at a group of amino acid sites for a pre-specified (foreground) branch of a phylogenetic tree by taking into account the heterogeneity of omega among sites and branches. Here the performance of the branch-site test was examined by computer simulation, with special reference to the false-positive rate when the divergence of the sequences analyzed was small. The false-positive rate was found to inflate when the assumptions made on the omega values for the foreground and other (background) branches in the branch-site test were violated. In addition, under a similar condition, false-positive results were often obtained even when Bonferroni correction was conducted and the false-discovery rate was controlled in a large-scale analysis. False-positive results were also obtained even when the number of nonsynonymous substitutions for the foreground branch was smaller than the minimum value required for detecting positive selection. The existence of a codon site with a possibility of occurrence of multiple nonsynonymous substitutions for the foreground branch often caused the branch-site test to falsely identify positive selection. In the re-analysis of orthologous trios of protein-coding genes from humans, chimpanzees, and macaques, most of the genes previously identified to be positively selected for the human or chimpanzee branch by the branch-site test contained such a codon site, suggesting a possibility that a significant fraction of these genes are false positives.
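The two multiple-testing procedures named in this abstract, Bonferroni correction and false-discovery-rate control, can be sketched generically as below. This is the textbook Bonferroni rule and the Benjamini-Hochberg step-up procedure, assumed here as a common way to control the false-discovery rate; it is not the authors' simulation code, and the p-values are made up for illustration.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject H0 for tests whose p-value is at most alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Sort p-values, find the largest rank k with p_(k) <= k * alpha / m,
    and reject the k hypotheses with the smallest p-values.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected


# Hypothetical per-gene p-values from a branch-site test.
pvals = [0.001, 0.02, 0.03, 0.5]
print(bonferroni(pvals))
print(benjamini_hochberg(pvals))
```

Both procedures only adjust for the number of tests. They cannot repair p-values that are already too small because the model assumptions are violated, which is the situation the abstract describes.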

9.
Acrosin is thought to fulfill several different roles in fertilization including that of a serine protease and in secondary zona pellucida (ZP) binding. However, acrosin's importance as a fertilization protein has been questioned, especially since it was discovered that acrosin knockout mice are fertile. In this study, we explored the sites involved in serine protease activity and secondary binding. We also assessed conservation in functional sites across species and examined whether amino acid changes present in the human population have the potential to affect fertility. In addition, since many mammalian reproduction proteins have been found to evolve rapidly, we tested for positive selection. Sequences from 43 mammals from all 19 placental orders, which included a total of 828 nucleotides from acrosin exons 2, 3, 4, and a portion of exon 5, were obtained. We found that all sites of the serine catalytic triad as well as three other sites linked to catalytic activity were completely conserved. Five of six sites proposed to play a role in secondary binding were 100% conserved as basic residues. These results support an evolutionarily conserved role for acrosin as a serine protease and secondary binding protein across placental mammals. We found statistically significant support for positive selection within acrosin, but no single amino acid site reached the significance level of P > 0.95 for inclusion within the category omega > 1. Based upon two amino acid mutation scoring systems, three out of seven human residue-changing single nucleotide polymorphisms (SNPs) were found to be potentially protein-altering mutations.

10.
The nature of selection on capsid genes of foot-and-mouth disease virus (FMDV) was characterized by examining the ratio of nonsynonymous to synonymous substitutions in 11 data sets of sequences obtained from six different serotypes of FMDV. Using a method of analysis that assigns each codon position to one of a number of estimated values of nonsynonymous to synonymous ratio, significant evidence of positive selection was identified in 5 data sets, operating at 1-7% of codon positions. Evidence of positive selection was identified in complete capsid sequences of serotypes A and C and in VP1 sequences of serotypes SAT 1 and 2. Sequences of serotype SAT-2 recovered from a persistently infected African buffalo also revealed evidence for positive selection. Locations of codons under positive selection coincide closely with those of antigenic sites previously identified with the use of monoclonal antibody escape mutants. The vast majority of codons are under mild to strong purifying selection. However, these results suggest that arising antigenic variants benefit from a selective advantage in their interaction with the immune system, either during the course of an infection or in transmission to individuals with previous exposure to antigen. Analysis of amino acid usage at sites under positive selection indicates that this selective advantage can be conferred by amino acid substitutions that share physicochemically similar properties.

11.
Yang Z  Nielsen R  Goldman N  Pedersen AM 《Genetics》2000,155(1):431-449
Comparison of relative fixation rates of synonymous (silent) and nonsynonymous (amino acid-altering) mutations provides a means for understanding the mechanisms of molecular sequence evolution. The nonsynonymous/synonymous rate ratio (omega = d(N)/d(S)) is an important indicator of selective pressure at the protein level, with omega = 1 meaning neutral mutations, omega < 1 purifying selection, and omega > 1 diversifying positive selection. Amino acid sites in a protein are expected to be under different selective pressures and have different underlying omega ratios. We develop models that account for heterogeneous omega ratios among amino acid sites and apply them to phylogenetic analyses of protein-coding DNA sequences. These models are useful for testing for adaptive molecular evolution and identifying amino acid sites under diversifying selection. Ten data sets of genes from nuclear, mitochondrial, and viral genomes are analyzed to estimate the distributions of omega among sites. In all data sets analyzed, the selective pressure indicated by the omega ratio is found to be highly heterogeneous among sites. Previously unsuspected Darwinian selection is detected in several genes in which the average omega ratio across sites is <1, but in which some sites are clearly under diversifying selection with omega > 1. Genes undergoing positive selection include the beta-globin gene from vertebrates, mitochondrial protein-coding genes from hominoids, the hemagglutinin (HA) gene from human influenza virus A, and HIV-1 env, vif, and pol genes. Tests for the presence of positively selected sites and their subsequent identification appear quite robust to the specific distributional form assumed for omega and can be achieved using any of several models we implement. However, we encountered difficulties in estimating the precise distribution of omega among sites from real data sets.
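The interpretation of omega given in this abstract, and the kind of likelihood ratio test used to compare nested site models, can be sketched as follows. The sketch assumes a comparison in which the alternative model has two more free parameters than the null, so that twice the log-likelihood difference is compared with a chi-square distribution with 2 degrees of freedom, whose survival function is exp(-x/2). The function names and the log-likelihood values are hypothetical.

```python
import math


def classify_omega(omega, tol=1e-9):
    """Interpret a d(N)/d(S) ratio as described in the abstract."""
    if omega > 1 + tol:
        return "diversifying positive selection"
    if omega < 1 - tol:
        return "purifying selection"
    return "neutral"


def lrt_df2(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by 2 parameters.

    Returns the statistic 2 * (lnL_alt - lnL_null) and its p-value under
    a chi-square distribution with 2 degrees of freedom (assumption).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = math.exp(-stat / 2.0)  # exact chi-square survival for df = 2
    return stat, p_value


# Hypothetical log-likelihoods for a null and an alternative site model.
stat, p = lrt_df2(lnl_null=-1000.0, lnl_alt=-995.0)
print(classify_omega(0.2), classify_omega(2.5))
print(f"2*delta lnL = {stat:.2f}, p = {p:.4f}")
```

A significant test only indicates that a model allowing some sites with omega > 1 fits better; identifying which sites are involved is a separate step.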

12.
In human immunodeficiency virus type 1 (HIV-1), mutations that escape from cytotoxic T-lymphocyte (CTL) recognition have been documented, and sequence analyses have provided indirect support for the hypothesis that natural selection has favored CTL escape mutants within an infected host. In spite of such evidence for within-host selection by CTL, it has been more difficult to determine how natural selection by host CTL has influenced long-term evolution of HIV-1. We used statistical analysis of published HIV-1 genomic sequences to examine the role of natural selection in between-host evolution of CTL epitopes. Based on a phylogenetic analysis, we identified 21 pairs of closely related genomes isolated from different hosts and examined the pattern of nucleotide substitution in genomic regions encoding well-characterized CTL epitopes. The results revealed that certain CTL epitopes have been subject to repeated positive selection across the population, while others are generally conserved. Furthermore, evidence of positive selection was associated with divergence from the canonical epitope sequence and with an enhanced frequency of convergent amino acid sequence changes in CTL epitopes. The results support the hypothesis that CTL-driven selection has been a major factor in the long-term evolution of HIV-1.

13.
A number of statistical tests have been proposed to detect positive Darwinian selection affecting a few amino acid sites in a protein, exemplified by an excess of nonsynonymous nucleotide substitutions. These tests are often more powerful than pairwise sequence comparison, which averages synonymous (d(S)) and nonsynonymous (d(N)) rates over the whole gene. In a recent study, however, Hughes AL and Friedman R (2005. Variation in the pattern of synonymous and nonsynonymous difference between two fungal genomes. Mol Biol Evol. 22: 1320-1324) argue that d(S) and d(N) are expected to fluctuate along the sequence by chance and that an excess of nonsynonymous differences in individual codons is no evidence for positive selection. The authors compared codons in protein-coding genes from the genomes of 2 yeast species, Saccharomyces cerevisiae and Saccharomyces paradoxus. They calculated the proportions of synonymous and nonsynonymous differences per site (p(S) and p(N)) in every codon and discovered that p(N) is often greater than p(S) and that among some codons p(S) and p(N) are negatively correlated. The authors argued that these results invalidate previous tests of codons under positive selection. Here I discuss several errors of statistics in the analysis of Hughes and Friedman, including confusion of statistics with parameters, arbitrary data filtering, and derivation of hypotheses from data. I also apply likelihood ratio tests of positive selection to the yeast data and illustrate empirically that Hughes and Friedman's criticisms of such tests are not valid.

14.
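Ten bamboo species representing two supertribes, six tribes, and three subtribes of the Bambusoideae were selected as materials: Taizhu (泰竹), Fengweizhu (凤尾竹), Qingpizhu (青皮竹), Dayeci (大叶慈), Cizhu (慈竹), Yelongzhu (野龙竹), Maozhu (毛竹), Xiangzhu (香竹), Kuzhu (苦竹), and Feibaizhu (菲白竹). Their lea3 genes were isolated and cloned, and sequence alignment and evolutionary analysis were carried out against rice as the outgroup species. In the tests with the branch model and the branch-site model, the lea3 genes of the different bamboo species were found to be under different positive selection pressures, while purifying selection predominated in the lea3 coding region (ω<1). In the site-model tests, a total of 18 significant positively selected sites were detected, accounting for 11.1% of the total number of amino acids. Mapping these 18 sites showed that 15 of them lie near the 11-amino-acid tandem repeat sequences. This indicates that the 11-amino-acid tandem repeat region of the lea3 gene is more susceptible to natural selection than other regions of the gene. In addition, building on the site-model results, mapping of the sites under strong purifying selection revealed a long stretch within the 11-amino-acid tandem repeat region that contains no sites under strong purifying selection.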

15.
Isoeugenol-O-methyltransferase (IEMT) is an enzyme involved in the production of the floral volatile compounds methyl eugenol and methyl isoeugenol in Clarkia breweri (Onagraceae). IEMT likely evolved by gene duplication from caffeic acid-O-methyltransferase followed by amino acid divergence, leading to the acquisition of its novel function. To investigate the selective context under which IEMT evolved, maximum likelihood methods that estimate variable d(N)/d(S) ratios among lineages, among sites, and among a combination of both lineages and sites were utilized. Statistically significant support was obtained for a hypothesis of positive selection driving the evolution of IEMT since its origin. Subsequent Bayesian analyses identified several sites in IEMT that have experienced positive selection. Most of these positions are in the active site of IEMT and have been shown by site-directed mutagenesis to have large effects on substrate specificity. Although the selective agent is unknown, the adaptive evolution of this gene may have resulted in increased effectiveness of pollinator attraction or herbivore repellence.

16.
Human enterovirus 71 viruses have long been circulating throughout the world. In this study, we performed a positive selection analysis of the VP1 genes of capsid proteins from enterovirus 71 viruses. Our results showed that although most sites were under negative or neutral evolution, four positions of the VP1 genes were under positive selection pressure. This might account for the spread and frequent outbreaks of the viruses and the enhanced neurovirulence. In particular, position 98 might be involved in neutralizing antibodies, modulating the virus-receptor interaction and enhancing the virulence of the viruses. Moreover, positions 145 and 241 might jointly determine the receptor specificity. However, these positions did not display much difference in amino acid polymorphism. In addition, no position in the VP1 genes of viruses isolated from China was under positive selection.

17.
We investigated variable selective pressures among amino acid sites in HIV-1 genes. Selective pressure at the amino acid level was measured by using the nonsynonymous/synonymous substitution rate ratio (ω = dN/dS). To identify amino acid sites under positive selection with ω > 1, we applied maximum likelihood models that allow variable ω ratios among sites to analyze genomic sequences of 26 HIV-1 lineages including subtypes A, B, and C. Likelihood ratio tests detected sites under positive selection in each of the major genes in the genome: env, gag, pol, vif, and vpr. Positive selection was also detected in nef, tat, and vpu, although those genes are very small. The majority of positive selection sites are located in gp160. Positive selection was not detected if ω was estimated as an average across all sites, indicating the lack of power of the averaging approach. Candidate positive selection sites were mapped onto the available protein tertiary structures and immunogenic epitopes. We measured the physicochemical properties of amino acids and found that those at positive selection sites were more diverse than those at variable sites. Furthermore, amino acid residues at exposed positive selection sites were more physicochemically diverse than at buried positive selection sites. Our results demonstrate genomewide diversifying selection acting on HIV-1.

18.
Positive selection has been shown to be pervasive in sex-related proteins of many metazoan taxa. However, we are only beginning to understand molecular evolutionary processes on the lineage to humans. To elucidate the evolution of proteins involved in human reproduction, we studied the sequence evolution of MAM domains of the sperm-ligand zonadhesin with respect to single amino acid sites, solvent accessibility, and posttranslational modification. GenBank data were supplemented by new cDNA sequences of a representative non-human primate panel. Solvent accessibility predictions identified a probably exposed fragment of 30 amino acids belonging to MAM domain 2 (i.e., MAM domain 3 in mouse). The fragment is characterized by a significantly increased rate of positively selected amino acid sites and exhibits high variability in predicted posttranslational modification, and, thus, might represent a binding region in the mature protein. At the same time, there is a significant coincidence of positively selected amino acid sites and non-conserved posttranslational motifs. We conclude that the binding specificity of zonadhesin MAM domains, especially of the presumed epitope, is achieved by positive selection at the level of single amino acid sites and posttranslational modifications, respectively.

19.
Proteins evolve under a myriad of biophysical selection pressures that collectively control the patterns of amino acid substitutions. These evolutionary pressures are sufficiently consistent over time and across protein families to produce substitution patterns, summarized in global amino acid substitution matrices such as BLOSUM, JTT, WAG, and LG, which can be used to successfully detect homologs, infer phylogenies, and reconstruct ancestral sequences. Although the factors that govern the variation of amino acid substitution rates have received much attention, the influence of thermodynamic stability constraints remains unresolved. Here we develop a simple model to calculate amino acid substitution matrices from evolutionary dynamics controlled by a fitness function that reports on the thermodynamic effects of amino acid mutations in protein structures. This hybrid biophysical and evolutionary model accounts for nucleotide transition/transversion rate bias, multi-nucleotide codon changes, the number of codons per amino acid, and thermodynamic protein stability. We find that our theoretical model accurately recapitulates the complex yet universal pattern observed in common global amino acid substitution matrices used in phylogenetics. These results suggest that selection for thermodynamically stable proteins, coupled with nucleotide mutation bias filtered by the structure of the genetic code, is the primary driver behind the global amino acid substitution patterns observed in proteins throughout the tree of life.

20.
The envelope glycoprotein of human immunodeficiency virus type 1 (HIV-1) interacts with receptors on the target cell and mediates virus entry by fusing the viral and cell membranes. To maintain the viral infectivity, amino acids that interact with receptors are expected to be more conserved than the other sites on the protein surface. In contrast to the functional constraint of amino acids for the receptor binding, some amino acid changes in this protein may produce antigenic variations that enable the virus to escape from recognition of the host immune system. Therefore, both positive selection (higher fitness) for and negative selection (lower fitness) against amino acid changes are taking place during evolution of surface proteins of parasites. To elucidate the evolutionary mechanisms of the whole HIV-1 gp120 envelope glycoprotein at the single site level, we collected and analyzed all available sequence data for the protein. By analyzing 186 sequences of the HIV-1 gp120 (subtype B), we reevaluated amino acid variability at the single site level, and estimated the numbers of synonymous and nonsynonymous substitutions at each codon position to detect positive and negative selection. We identified 33 amino acid positions which may be under positive selection. Some of these positions may form discontinuous epitopes. We also analyzed amino acid sequences to find amino acid positions responsible for usage of the second receptor. We found that, in addition to the V3 loop, amino acid variation at residue 440 in the C4 region is clearly linked with the usage of CXCR4.
