Found 20 similar documents (search time: 93 ms)
1.
Background
Widely used substitution models for proteins, such as the Jones-Taylor-Thornton (JTT) or Whelan and Goldman (WAG) models, are based on empirical amino acid interchange matrices estimated from databases of protein alignments that incorporate the average amino acid frequencies of the data set under examination (e.g., JTT + F). Variation in the evolutionary process between sites is typically modelled by a rates-across-sites distribution such as the gamma (Γ) distribution. However, sites in proteins also vary in the kinds of amino acid interchanges that are favoured, a feature that is ignored by standard empirical substitution matrices. Here we examine the degree to which the pattern of evolution at sites differs from that expected based on empirical amino acid substitution models and evaluate the impact of these deviations on phylogenetic estimation.
2.
Internal protein dynamics is essential for biological function. During evolution, protein divergence is functionally constrained:
properties more relevant for function vary more slowly than less important properties. Thus, if protein dynamics is relevant
for function, it should be evolutionary conserved. In contrast with the well-studied evolution of protein structure, the evolutionary
divergence of protein dynamics has not been addressed systematically before, apart from a few case studies. X-Ray diffraction
analysis gives information not only on protein structure but also on B-factors, which characterize the flexibility that results
from protein dynamics. Here we study the evolutionary divergence of protein backbone dynamics by comparing the Cα flexibility (B-factor) profiles for a large dataset of homologous proteins classified into families and superfamilies. We
show that Cα flexibility profiles diverge slowly, so that they are conserved at family and superfamily levels, even for pairs of proteins
with nonsignificant sequence similarity. We also analyze and discuss the correlations among the divergences of flexibility,
sequence, and structure.
Electronic Supplementary Material Electronic Supplementary material is available for this article at
and accessible for authorised users.
[Reviewing Editor: Dr. David Pollock]
3.
Hebsgaard MB Wiuf C Gilbert MT Glenner H Willerslev E 《Journal of molecular evolution》2007,64(1):50-60
The retrieval of Neanderthal (Homo neanderthalensis) mitochondrial DNA is thought to be among the most significant ancient DNA contributions to date, allowing conflicting hypotheses
on modern human (Homo sapiens) evolution to be tested directly. Recently, however, both the authenticity of the Neanderthal sequences and their phylogenetic
position outside contemporary human diversity have been questioned. Using Bayesian inference and the largest dataset to date,
we find strong support for a monophyletic Neanderthal clade outside the diversity of contemporary humans, in agreement with
the expectations of the Out-of-Africa replacement model of modern human origin. From average pairwise sequence differences,
we obtain support for claims that the first published Neanderthal sequence may include errors due to postmortem damage in
the template molecules for PCR. In contrast, we find that recent results implying that the Neanderthal sequences are products
of PCR artifacts are not well supported, suffering from inadequate experimental design and a presumably high percentage (>68%)
of chimeric sequences due to “jumping PCR” events.
[Reviewing Editor: Dr. Martin Kreitman]
4.
Flinn B Rothwell C Griffiths R Lägue M DeKoeyer D Sardana R Audy P Goyer C Li XQ Wang-Pruski G Regan S 《Plant molecular biology》2005,59(3):407-433
To help develop an understanding of the genes that govern the developmental characteristics of the potato (Solanum tuberosum), as well as the genes associated with responses to specified pathogens and storage conditions, The Canadian Potato Genome
Project (CPGP) carried out 5′ end sequencing of regular, normalized and full-length cDNA libraries of the Shepody potato cultivar,
generating over 66,600 expressed sequence tags (ESTs). Libraries sequenced represented tuber developmental stages, pathogen-challenged
tubers, as well as leaf, floral developmental stages, suspension cultured cells and roots. All libraries analysed to date
have contributed unique sequences, with the normalized libraries high on the list. In addition, a low molecular weight library
has enhanced the 3′ ends of our sequence assemblies. Using the combined assembly dataset, unique tuber developmental, cold
storage and pathogen-challenged sequences have been identified. A comparison of the ESTs specific to the pathogen-challenged
tuber and foliar libraries revealed minimal overlap between these libraries. Mixed assemblies using over 189,000 potato EST
sequences from CPGP and The Institute for Genomic Research (TIGR) have revealed common sequences, as well as CPGP- and TIGR-unique
sequences.
5.
A series of metallopeptides based on the amino terminal copper/nickel (ATCUN) binding motif have been evaluated as classical inhibitors and catalytic inactivators of both rabbit and human angiotensin-converting enzyme (hACE), and human endothelin-converting enzyme 1 (hECE-1). The cobalt complex [KGHK–Co(NH3)2]2+, where KGHK is lysylglycylhistidyllysine, displayed similar K_I and IC50 values to those found for [KGHK–Cu]+, in spite of the enhanced charge, and so either the influence of charge is offset by the steric influence of the axially coordinated ammine ligands, or binding is dominated by contributions from the amino acid side chains, especially the C-terminal lysine that mimics the binding pattern observed for lisinopril. Moreover, the inhibition observed for [KGHK–Co(NH3)2]2+ contrasts with the activation of hACE by Co2+(aq), reflecting the stimulation of enzyme activity following replacement of the catalytic zinc cofactor by cobalt ion at each of the two active sites. Quantitative analysis of the dose-dependent stimulation of activity by Co2+(aq) yielded apparent affinities of 1.3 ± 0.2 and 56 ± 8 μM for the two sites in the presence of saturating Zn2+ (10 μM). Catalytic inactivation of hACE by [KGHK–Cu]+ at subsaturating concentrations had previously been characterized, with k_obs = (2.9 ± 0.5) × 10^-2 min^-1. Under similar conditions, the same complex is found to catalytically inactivate hECE-1, with k_obs = (2.12 ± 0.16) × 10^-2 min^-1, demonstrating the potential for dual-action activity against two key drug targets in cardiovascular disease. Irreversible inactivation of a drug target represents a novel mechanism of drug action that complements existing classical inhibitor strategies that underlie current drug discovery efforts.
6.
Genes related to sex and reproduction are known to evolve rapidly; however, the mechanism for rapid evolutionary change is
proving to be more complex than a simple relaxation of selective constraint. We compared the divergence between orthologous
human and mouse fertility genes according to their degree of dispensability as suggested by mouse knockout mutation phenotypes.
The dataset consisted of 161 orthologous genes affecting fertility and 803 orthologous genes affecting viability. We find
that essential fertility genes affecting both sexes evolve at a similar rate as essential viability genes, but that within
sexes the degree of dispensability is not an important factor affecting the rate of fertility gene evolution. We also find
no difference in the evolutionary rates of fertility genes that affect the male versus the female; however, there are a greater
number of sterility genes that affect the male. Generally there are a significantly greater number of fertility genes that
affect one sex rather than both, suggesting that fertility genes tend toward sex-specific functions, particularly in the male.
Our findings support the hypothesis that the rapid evolution of sex- and reproduction-related genes is facilitated through
an increased specialization of gene function and that dispensability is not a major factor determining their evolutionary
rate.
[Reviewing Editor: Dr. Willie J. Swanson]
7.
A pathogenesis related protein, AhPR10 from peanut: an insight of its mode of antifungal activity (total citations: 10; self-citations: 0; citations by others: 10)
A pathogenesis related protein (AhPR10) is identified from a clone of a 6-day old Arachis hypogaea L. (peanut) cDNA library. The clone was expressed as a ∼20 kDa protein in E. coli. The amino acid sequence derived from the nucleotide sequence of the coding region shows homology with PR10 proteins that have a Betv1 domain and a P loop motif. Recombinant AhPR10 has ribonuclease activity, and antifungal activity against the peanut pathogens Fusarium oxysporum and Rhizoctonia solani. The mutant protein AhPR10-K54N, in which Lys54 is mutated to Asn, loses its ribonuclease and antifungal activities. FITC-labeled AhPR10 and AhPR10-K54N are internalized by hyphae of F. oxysporum and R. solani, but the latter protein does not inhibit fungal growth. This suggests that the ribonuclease function of AhPR10 is essential for its antifungal activity. Energy- and temperature-dependent internalization of AhPR10 into sensitive fungal hyphae indicates that internalization of the protein occurs through active uptake. The nucleotide sequence of AhPR10 reported in this paper has been submitted to the NCBI Nucleotide Sequence Database under the accession number AY726607.
8.
Chromosomal deoxyribonucleic acid was isolated and purified from 10 strains of Flavobacterium breve, originating from human or other animal sources. The mean and standard deviation for the species in base content was 32.4 ± 0.6%
G+C, and in genome size was 3.21 ± 0.37 × 10^9 daltons. In vitro DNA reassociation showed that seven F. breve strains (mainly from human sources) had high levels of intraspecific base sequence similarity (>70%) as derived from reassociations
done at the optimum temperature of reassociation (TOR) or TOR − 10°C (nonstringent conditions). The three other F. breve strains contained a high degree of base sequence divergence. All 10 strains of F. breve were readily distinguishable in their DNA characteristics from F. meningosepticum, F. odoratum, and allied Gram-negative bacteria.
9.
Many biological questions, including the estimation of deep evolutionary histories and the detection of remote homology between protein sequences, rely upon multiple sequence alignments and phylogenetic trees of large datasets. However, accurate large-scale multiple sequence alignment is very difficult, especially when the dataset contains fragmentary sequences. We present UPP, a multiple sequence alignment method that uses a new machine learning technique, the ensemble of hidden Markov models, which we propose here. UPP produces highly accurate alignments for both nucleotide and amino acid sequences, even on ultra-large datasets or datasets containing fragmentary sequences. UPP is available at https://github.com/smirarab/sepp.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-015-0688-z) contains supplementary material, which is available to authorized users.
10.
Siu Wah Wong-Deyrup Youngbae Kim Sonya J. Franklin 《Journal of biological inorganic chemistry》2006,11(1):17-25
The DNA-binding behavior and target sequences of two designed metallopeptides have been investigated with an iterative electrophoresis
mobility shift assay followed by PCR amplification, and by circular dichroism spectroscopy. Peptides P3W and P5b were designed
based on the structural similarity of the helix–turn–helix motif of homeodomains and the EF-hand motifs of calmodulin, as
previously described for P3W. Like P3W, P5b binds both Eu(III) (K_d = 12.6 ± 1.9 μM) and Ca(II) (K_d = 70 ± 8 μM) with reasonable affinity. Binding selection from a library of randomized 8-mer DNA oligonucleotide sequences identified
one target family for CaP5b [5′-pur-T-pur-G-(G/C)-3′], and two target sites for CaP3W [5′-(A/T)-G-G-G-(T/C)-3′ and 5′-A-T-(G/T)-T-G-3′].
Circular dichroism studies indicate that unlike EuP3W, EuP5b is poorly folded in the absence of DNA. In the presence of DNA
containing target-binding sites for both peptides, both EuP3W and EuP5b increase in helical content, in the latter case significantly.
These results suggest that EuP5b binding to target DNA involves an induced-fit mechanism. These small chimeric metallopeptides
have been found to bind selectively to DNA targets, analogous to natural protein–DNA interactions. This corroborates our earlier
conclusions (J. Am. Chem. Soc. 125:6656, 2003) that sequence-preferential DNA cleavage by Ce(IV)P3W was due to sequence recognition.
11.
Miyazawa S 《PloS one》2011,6(3):e17244
Background
Empirical substitution matrices represent the average tendencies of substitutions over various protein families by sacrificing gene-level resolution. We develop a codon-based model, in which mutational tendencies of codon, a genetic code, and the strength of selective constraints against amino acid replacements can be tailored to a given gene. First, selective constraints averaged over proteins are estimated by maximizing the likelihood of each 1-PAM matrix of empirical amino acid (JTT, WAG, and LG) and codon (KHG) substitution matrices. Then, selective constraints specific to given proteins are approximated as a linear function of those estimated from the empirical substitution matrices.
Results
Akaike information criterion (AIC) values indicate that a model allowing multiple nucleotide changes fits the empirical substitution matrices significantly better. Also, the ML estimates of transition-transversion bias obtained from these empirical matrices are not so large as previously estimated. The selective constraints are characteristic of proteins rather than species. However, their relative strengths among amino acid pairs can be approximated not to depend very much on protein families but amino acid pairs, because the present model, in which selective constraints are approximated to be a linear function of those estimated from the JTT/WAG/LG/KHG matrices, can provide a good fit to other empirical substitution matrices including cpREV for chloroplast proteins and mtREV for vertebrate mitochondrial proteins.
Conclusions/Significance
The present codon-based model with the ML estimates of selective constraints and with adjustable mutation rates of nucleotide would be useful as a simple substitution model in ML and Bayesian inferences of molecular phylogenetic trees, and enables us to obtain biologically meaningful information at both nucleotide and amino acid levels from codon and protein sequences.
12.
Macario AJ Brocchieri L Shenoy AR Conway de Macario E 《Journal of molecular evolution》2006,63(1):74-86
The stress chaperone protein Hsp70 (DnaK) (abbreviated DnaK) and its co-chaperones Hsp40(DnaJ) (or DnaJ) and GrpE are universal
in bacteria and eukaryotes but occur only in some archaea clustered in the order 5′-grpE-dnaK-dnaJ-3′ in a locus termed Locus I. Three structural varieties of Locus I, termed Types I, II, and III, were identified, respectively,
in Methanosarcinales, in Thermoplasmatales and Methanothermobacter thermoautotrophicus, and in Halobacteriales. These Locus I types corresponded to three groups identified by phylogenetic trees of archaeal DnaK
proteins including the same archaeal subdivisions. These archaeal DnaK groups were not significantly interrelated, clustering
instead with DnaKs from three bacterial lineages, Methanosarcinales with Firmicutes, Thermoplasmatales and M. thermoautotrophicus with Thermotoga, and Halobacteriales with Actinobacteria, suggesting that the three archaeal types of Locus I were acquired by independent
events of lateral gene transfer. These associations, however, lacked strong bootstrap support and were sensitive to dataset
choice and tree-reconstruction method. Structural features of dnaK loci in bacteria revealed that Methanosarcinales and Firmicutes shared a similar structure, also common to most other bacterial
groups. Structural differences were observed instead in Thermotoga compared to Thermoplasmatales and M. thermoautotrophicus, and in Actinobacteria compared to Halobacteriales. It was also found that the association between the DnaK sequences from
Halobacteriales and Actinobacteria likely reflects common biases in their amino acid compositions. Although the loci structural
features and the DnaK trees suggested the possibility of lateral gene transfer between Firmicutes and Methanosarcinales, the
similarity between the archaeal and the ancestral bacterial loci favors the more parsimonious hypothesis that all archaeal
sequences originated from a unique prokaryotic ancestor.
[Reviewing Editor: Dr. Stephen Freeland]
13.
Daniell H Lee SB Grevich J Saski C Quesada-Vargas T Guda C Tomkins J Jansen RK 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2006,112(8):1503-1518
Despite the agricultural importance of both potato and tomato, very little is known about their chloroplast genomes. Analysis of the complete sequences of tomato, potato, tobacco, and Atropa chloroplast genomes reveals significant insertions and deletions within certain coding regions or regulatory sequences (e.g., deletion of repeated sequences within 16S rRNA, ycf2 or ribosomal binding sites in ycf2). RNA, photosynthesis, and ATP synthase genes are the least divergent and the most divergent genes are clpP, cemA, ccsA, and matK. Repeat analyses identified 33–45 direct and inverted repeats ≥30 bp with a sequence identity of at least 90%; all but five of the repeats shared by all four Solanaceae genomes are located in the same genes or intergenic regions, suggesting a functional role. A comprehensive genome-wide analysis of all coding sequences and intergenic spacer regions was done for the first time in chloroplast genomes. Only four spacer regions are fully conserved (100% sequence identity) among all genomes; deletions or insertions within some intergenic spacer regions result in less than 25% sequence identity, underscoring the importance of choosing appropriate intergenic spacers for plastid transformation and providing valuable new information for phylogenetic utility of the chloroplast intergenic spacer regions. Comparison of coding sequences with expressed sequence tags showed a considerable amount of variation, resulting in amino acid changes; none of the C-to-U conversions observed in potato and tomato were conserved in tobacco and Atropa. It is possible that there has been a loss of conserved editing sites in potato and tomato.
14.
15.
The maize ZmEA1 protein was recently postulated to be involved in short-range pollen tube guidance from the embryo sac. To
date, EA1-like sequences had only been identified in monocot species. Using a more conserved C-terminal motif found in the
monocot species, numerous ZmEA1-like sequences were retrieved in EST databases from dicot species, as well as from unannotated
genomic sequences of Arabidopsis thaliana. RT-PCR analyses were performed for these unannotated genes and showed that these were indeed expressed genes. Further structural
and phylogenetic analyses revealed that all members of the EA1-like (EAL) gene family shared a conserved 27–29 amino acid motif, termed the EA box near the C-terminal end, and appear to be secretory
proteins. Therefore, the EA box proteins define a new class of small secretory proteins, some of which may be involved
in pollen tube guidance.
16.
Connecting protein sequence to function is becoming increasingly relevant since high-throughput sequencing studies accumulate large amounts of genomic data. In order to go beyond the existing database annotation, it is fundamental to understand the mechanisms underlying functional inheritance and divergence. If the homology relationship between proteins is known, can we determine whether the function diverged? In this work, we analyze different possibilities of protein sequence evolution after gene duplication and identify “inter-paralog inversions”, i.e., sites where the relationship between the ancestry and the functional signal is decoupled. The amino acids in these sites are masked from being recognized by other prediction tools. Still, they play a role in functional divergence and could indicate a shift in protein function. We develop a method to specifically recognize inter-paralog amino acid inversions in a phylogeny and test it on real and simulated datasets. In a dataset built from the Epidermal Growth Factor Receptor (EGFR) sequences found in 88 fish species, we identify 19 amino acid sites that went through inversion after gene duplication, mostly located at the ligand-binding extracellular domain. Our work uncovers an outcome of protein duplications with direct implications in protein functional annotation and sequence evolution. The developed method is optimized to work with large protein datasets and can be readily included in a targeted protein analysis pipeline.
17.
Obaidur Rahman Stephen P. Cummings Dean J. Harrington Iain C. Sutcliffe 《World journal of microbiology & biotechnology》2008,24(11):2377-2382
Bacterial lipoproteins are a diverse and functionally important group of proteins that are amenable to bioinformatic analyses
because of their unique signal peptide features. Here we have used a dataset of sequences of experimentally verified lipoproteins
of Gram-positive bacteria to refine our previously described lipoprotein recognition pattern (G+LPP). Sequenced bacterial
genomes can be screened for putative lipoproteins using the G+LPP pattern. The sequences identified can then be validated
using online tools for lipoprotein sequence identification. We have used our protein sequence datasets to evaluate six online
tools for efficacy of lipoprotein sequence identification. Our analyses demonstrate that LipoP performs best individually but that a consensus approach, incorporating outputs from predictors of general signal peptide
properties, is most informative.
18.
Variation in the internal transcribed spacer (ITS) of the rRNA (rrn) operon is increasingly used to infer population-level diversity in bacterial communities. However, intragenomic ITS variation
may skew diversity estimates that do not correct for multiple rrn operons within a genome. This study characterizes variation in ITS length, tRNA composition, and intragenomic nucleotide
divergence across 155 Bacteria genomes. On average, these genomes encode 4.8 rrn operons (range: 2–15) and contain 2.4 unique ITS length variants (range: 1–12) and 2.8 unique sequence variants (range: 1–12).
ITS variation stems primarily from differences in tRNA gene composition, with ITS regions containing tRNA-Ala + tRNA-Ile (48%
of sequences), tRNA-Ala or tRNA-Ile (10%), tRNA-Glu (11%), other tRNAs (3%), or no tRNA genes (27%). Intragenomic divergence
among paralogous ITS sequences grouped by tRNA composition ranges from 0% to 12.11% (mean: 0.94%). Low divergence values indicate
extensive homogenization among ITS copies. In 78% of alignments, divergence is <1%, with 54% showing zero variation and 81%
containing at least two identical sequences. ITS homogenization occurs over relatively long sequence tracts, frequently spanning
the entire ITS, and is largely independent of the distance (basepairs) between operons. This study underscores the potential
contribution of interoperon ITS variation to bacterial microdiversity studies and unequivocally demonstrates the pervasiveness
of concerted evolution in the rrn gene family.
[Reviewing Editor: Dr. Margaret Riley]
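As a rough illustration of the intragenomic divergence figures quoted in entry 18, the sketch below computes the mean pairwise percent divergence among already-aligned ITS copies from one genome. It assumes a simple uncorrected distance that skips gapped columns; the abstract does not say which distance measure the authors used, and the function names and toy sequences here are hypothetical.

```python
from itertools import combinations


def percent_divergence(seq_a, seq_b):
    """Percent of aligned, ungapped positions at which two sequences differ.

    Assumption: an uncorrected (p-distance style) measure; the source
    abstract does not specify the distance actually used.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * differing / compared


def mean_intragenomic_divergence(aligned_copies):
    """Average percent divergence over all pairs of ITS copies in one genome."""
    pairs = list(combinations(aligned_copies, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(percent_divergence(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy alignment: two identical copies and one variant.
    copies = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
    print(round(mean_intragenomic_divergence(copies), 2))
```

Two identical copies contribute a divergence of zero, so a genome with extensive homogenization pulls the mean toward zero, which matches the pattern the abstract reports.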
19.
Berlin S Brandström M Backström N Axelsson E Smith NG Ellegren H 《Journal of molecular evolution》2006,62(2):226-233
Germline mutation rates have been found to be higher in males than in females in many organisms, a likely consequence of cell
division being more frequent in spermatogenesis than in oogenesis. If the majority of mutations are due to DNA replication
error, the male-to-female mutation rate ratio (αm) is expected to be similar to the ratio of the number of germ line cell divisions in males and females (c), an assumption that can be tested with proper estimates of αm and c. αm is usually estimated by comparing substitution rates in putatively neutral sequences on the sex chromosomes. However, substantial
regional variation in substitution rates across chromosomes may bias estimates of αm based on the substitution rates of short sequences. To investigate regional substitution rate variation, we estimated sequence
divergence in 16 gametologous introns located on the Z and W chromosomes of five bird species of the order Galliformes. Intron
ends and potentially conserved blocks were excluded to reduce the effect of using sequences subject to negative selection.
We found significant substitution rate variation within Z chromosome (G15 = 37.6, p = 0.0010) as well as within W chromosome introns (G15 = 44.0, p = 0.0001). This heterogeneity also affected the estimates of αm, which varied significantly, from 1.53 to 3.51, among the introns (ANOVA: F13,14 = 2.68, p = 0.04). Our results suggest the importance of using extensive data sets from several genomic regions to avoid the effects
of regional mutation rate variation and to ensure accurate estimates of αm.
[Reviewing Editor: Mr. Martin Kreitman]
Nick G.C. Smith: Deceased
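To make the estimation of αm in entry 19 concrete, here is a minimal sketch of one commonly used relation for birds. It assumes that all mutations are replication-dependent, that the Z chromosome spends two thirds of its time in males, and that the W chromosome is transmitted only through females, so that Z/W = (2αm + 1)/3. The abstract does not state the exact formula the authors applied, so this relation and the function name are assumptions.

```python
def alpha_m_from_zw(z_rate, w_rate):
    """Estimate the male-to-female mutation rate ratio from Z and W rates.

    Assumed model (not stated in the abstract):
      Z rate = (2/3) * mu_male + (1/3) * mu_female
      W rate = mu_female
    which gives Z/W = (2 * alpha_m + 1) / 3, i.e.
      alpha_m = (3 * Z/W - 1) / 2.
    """
    if z_rate <= 0 or w_rate <= 0:
        raise ValueError("substitution rates must be positive")
    ratio = z_rate / w_rate
    return (3.0 * ratio - 1.0) / 2.0


if __name__ == "__main__":
    # Hypothetical intron divergences on Z and W (substitutions per site).
    print(alpha_m_from_zw(0.10, 0.05))
```

Because αm depends on the ratio of two rates, regional rate variation in either intron set feeds directly into the estimate, which is the sensitivity the abstract highlights.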
20.
The comparative study of the human and chimpanzee genomes may shed light on the genetic ingredients for the evolution of the
unique traits of humans. Here, we present a simple procedure to identify human-specific nonsense mutations that might have
arisen since the human–chimpanzee divergence. The procedure involves collecting orthologous sequences in which a stop codon
of the human sequence is aligned to a non-stop codon in the chimpanzee sequence and verifying that the latter is ancestral
by finding homologs in other species without a stop codon. Using this procedure, we identify nine genes (CML2, FLJ14640, MT1L, NPPA, PDE3B, SERPINA13, TAP2, UIP1, and ZNF277) that would produce human-specific truncated proteins resulting in a loss or modification of the function. The premature
terminations of CML2, MT1L, and SERPINA13 genes appear to abolish the original function of the encoded protein because the mutation removes a major part of the known
active site in each case. The other six mutated genes are either known or presumed to produce functionally modified proteins.
The mutations of five genes (CML2, FLJ14640, MT1L, NPPA, TAP2) are known or predicted to be polymorphic in humans. In these cases, the stop codon alleles are more prevalent than the ancestral
allele, suggesting that the mutant alleles are approaching fixation since their emergence during human evolution. The
findings support the notion that functional modification or inactivation of genes by nonsense mutation is a part of the process
of adaptive evolution and acquisition of species-specific features.
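The screening procedure described in entry 20 can be sketched roughly as follows. The sketch assumes in-frame, codon-aligned coding sequences of equal length, uses the standard stop codons, and requires every supplied outgroup to carry a sense codon at the position. The abstract only says that homologs in other species lack a stop codon, so that strictness, the function names, and the toy sequences are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def codons(aligned_cds):
    """Split an in-frame, codon-aligned coding sequence into codons."""
    seq = aligned_cds.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_sense(codon):
    """True for a complete, unambiguous codon that is not a stop codon."""
    return (
        len(codon) == 3
        and set(codon) <= set("ACGT")
        and codon not in STOP_CODONS
    )


def human_specific_stops(human, chimp, outgroups):
    """Return codon indices where human has a stop, chimpanzee has a sense
    codon, and every outgroup sequence also has a sense codon (assumption:
    all supplied outgroups must agree for the chimp state to count as
    ancestral)."""
    human_codons = codons(human)
    chimp_codons = codons(chimp)
    outgroup_codons = [codons(o) for o in outgroups]
    hits = []
    for i, (h, c) in enumerate(zip(human_codons, chimp_codons)):
        if h not in STOP_CODONS or not is_sense(c):
            continue
        at_site = [o[i] for o in outgroup_codons if i < len(o)]
        if at_site and all(is_sense(oc) for oc in at_site):
            hits.append(i)
    return hits


if __name__ == "__main__":
    # Hypothetical toy alignment: human TGA aligned to TGG elsewhere.
    print(human_specific_stops("ATGTGAGGC", "ATGTGGGGC", ["ATGTGGGGC"]))
```

A stop codon shared by human and chimpanzee, such as the normal terminal stop, is skipped because the chimpanzee codon is not a sense codon at that position.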