Similar articles
 20 similar articles found
1.
Protein evolution is most commonly studied by analyzing related protein sequences and generating ancestral sequences through Bayesian and Maximum Likelihood methods, and/or by resurrecting ancestral proteins in the lab and performing ligand binding studies to determine function. Structural and dynamic evolution have largely been left out of molecular evolution studies. Here we incorporate both structure and dynamics to elucidate the molecular principles behind the divergence in the evolutionary path of the steroid receptor proteins. We determine the likely structure of three evolutionarily diverged ancestral steroid receptor proteins using the Zipping and Assembly Method with FRODA (ZAMF). Our predictions are within ∼2.7 Å all-atom RMSD of the respective crystal structures of the ancestral steroid receptors. Beyond static structure prediction, a particular feature of ZAMF is that it generates protein dynamics information. We investigate the differences in conformational dynamics of diverged proteins by obtaining the most collective motion through essential dynamics. Strikingly, our analysis shows that evolutionarily diverged proteins of the same family do not share the same dynamic subspace, while those sharing the same function cluster together and lie distant from those that have functionally diverged. Dynamic analysis also enables those mutations that most affect dynamics to be identified. It correctly predicts all mutations (functional and permissive) necessary to evolve new function and ∼60% of permissive mutations necessary to recover ancestral function.

2.
DNA polymerase I (pol I) processes RNA primers during lagging-strand synthesis and fills small gaps during DNA repair reactions. However, it is unclear how pol I and pol III work together during replication and repair or how extensive pol I processing of Okazaki fragments is in vivo. Here, we address these questions by analyzing pol I mutations generated through error-prone replication of ColE1 plasmids. The data were obtained by direct sequencing, allowing an accurate determination of the mutation spectrum and distribution. Pol I’s mutational footprint suggests: (i) during leading-strand replication pol I is gradually replaced by pol III over at least 1.3 kb; (ii) pol I processing of Okazaki fragments is limited to ∼20 nt and (iii) the size of Okazaki fragments is short (∼250 nt). While based on ColE1 plasmid replication, our findings are likely relevant to other pol I replicative processes such as chromosomal replication and DNA repair, which differ from ColE1 replication mostly at the recruitment steps. This mutation footprinting approach should help establish the role of other prokaryotic or eukaryotic polymerases in vivo, and provides a tool to investigate how sequence topology, DNA damage, or interactions with protein partners may affect the function of individual DNA polymerases.

3.
High-throughput sequencing platforms are generating massive amounts of genetic variation data for diverse genomes, but it remains a challenge to pinpoint a small subset of functionally important variants. To fill these unmet needs, we developed the ANNOVAR tool to annotate single nucleotide variants (SNVs) and insertions/deletions, such as examining their functional consequence on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, or identifying variants reported in the 1000 Genomes Project and dbSNP. ANNOVAR can utilize annotation databases from the UCSC Genome Browser or any annotation data set conforming to Generic Feature Format version 3 (GFF3). We also illustrate a ‘variants reduction’ protocol on 4.7 million SNVs and indels from a human genome, including two causal mutations for Miller syndrome, a rare recessive disease. Through a stepwise procedure, we excluded variants that are unlikely to be causal, and identified 20 candidate genes including the causal gene. Using a desktop computer, ANNOVAR requires ∼4 min to perform gene-based annotation and ∼15 min to perform variants reduction on 4.7 million variants, making it practical to handle hundreds of human genomes in a day. ANNOVAR is freely available at http://www.openbioinformatics.org/annovar/.

4.
Interpreting the impact of human genome variation on phenotype is challenging. The functional effect of protein-coding variants is often predicted using sequence conservation and population frequency data; however, other factors are likely relevant. We hypothesized that variants in protein post-translational modification (PTM) sites contribute to phenotype variation and disease. We analyzed the fraction of rare variants and the non-synonymous to synonymous variant ratio (Ka/Ks) in 7,500 human genomes and found a significant negative selection signal in PTM regions independent of six factors, including conservation, codon usage, and GC-content, that is widely distributed across tissue-specific genes and function classes. PTM regions are also enriched in known disease mutations, suggesting that PTM variation is more likely deleterious. PTM constraint also affects flanking sequence around modified residues and increases around clustered sites, indicating the presence of functionally important short linear motifs. Using target site motifs of 124 kinases, we predict at least ∼180,000 motif-breaker amino acid residues that disrupt PTM sites when substituted, and highlight kinase motifs that show specific negative selection and enrichment of disease mutations. We provide this dataset with corresponding hypothesized mechanisms as a community resource. As an example of our integrative approach, we propose that PTPN11 variants in Noonan syndrome aberrantly activate the protein by disrupting an uncharacterized cluster of phosphorylation sites. Further, as PTMs are molecular switches that are modulated by drugs, we study mutated binding sites of PTM enzymes in disease genes and define a drug-disease network containing 413 novel predicted disease-gene links.

5.
What is the extent and scale of local adaptation (LA)? How quickly does LA arise? And what is its underlying molecular basis? Our review and meta-analysis on salmonid fishes estimates the frequency of LA to be ∼55–70%, with local populations having a 1.2 times average fitness advantage relative to foreign populations or to their performance in new environments. Salmonid LA is evident at a variety of spatial scales (for example, a few km to >1000 km) and can manifest itself quickly (6–30 generations). As the geographic scale between populations increases, LA is generally more frequent and stronger. Yet the extent of LA in salmonids does not appear to differ from that in other assessed taxa. Moreover, the frequency with which foreign salmonid populations outperform local populations (∼23–35%) suggests that drift, gene flow and plasticity often limit or mediate LA. The relatively few studies based on candidate gene and genomewide analyses have identified footprints of selection at both small and large geographical scales, likely reflecting the specific functional properties of loci and the associated selection regimes (for example, local niche partitioning, pathogens, parasites, photoperiodicity and seasonal timing). The molecular basis of LA in salmonids is still largely unknown, but differential expression at the same few genes is implicated in the convergent evolution of certain phenotypes. Collectively, future research will benefit from an integration of classical and molecular approaches to understand: (i) species differences and how they originate, (ii) variation in adaptation across scales, life stages, population sizes and environmental gradients, and (iii) evolutionary responses to human activities.

6.
7.
eIF4A is a key component in eukaryotic translation initiation; however, it has not been clear how auxiliary factors like eIF4B and eIF4G stimulate eIF4A and how this contributes to the initiation process. Based on results from isothermal titration calorimetry, we propose a two-site model for eIF4A binding to an 83.5 kDa eIF4G fragment (eIF4G-MC), with a high- and a low-affinity site, having binding constants KD of ∼50 and ∼1000 nM, respectively. Small angle X-ray scattering analysis shows that the eIF4G-MC fragment adopts an elongated, well-defined structure with a maximum dimension of 220 Å, able to span the width of the 40S ribosomal subunit. We establish a stable eIF4A–eIF4B complex requiring RNA, nucleotide and the eIF4G-MC fragment, using an in vitro RNA pull-down assay. The eIF4G-MC fragment does not stably associate with the eIF4A–eIF4B–RNA-nucleotide complex but acts catalytically in its formation. Furthermore, we demonstrate that eIF4B and eIF4G-MC act synergistically in stimulating the ATPase activity of eIF4A.

8.
9.
Although patterns of somatic alterations have been reported for tumor genomes, little is known about how they compare with alterations present in non-tumor genomes. A comparison of the two would be crucial to better characterize the genetic alterations driving tumorigenesis. We sequenced the genomes of a lymphoblastoid (HCC1954BL) and a breast tumor (HCC1954) cell line derived from the same patient and compared the somatic alterations present in both. The lymphoblastoid genome presents a comparable number and similar spectrum of nucleotide substitutions to that found in the tumor genome. However, a significant difference in the ratio of non-synonymous to synonymous substitutions was observed between the two genomes (P = 0.031). Protein–protein interaction analysis revealed that mutations in the tumor genome preferentially affect hub-genes (P = 0.0017) and are co-selected to present synergistic functions (P < 0.0001). KEGG analysis showed that in the tumor genome most mutated genes were organized into signaling pathways related to tumorigenesis. No such organization or synergy was observed in the lymphoblastoid genome. Our results indicate that endogenous mutagens and replication errors can generate the overall number of mutations required to drive tumorigenesis and that it is the combination rather than the frequency of mutations that is crucial to complete tumorigenic transformation.

10.
11.
12.
Current methods for measuring deoxyribonucleoside triphosphates (dNTPs) employ reagent- and labor-intensive assays utilizing radioisotopes in DNA polymerase-based assays and/or chromatography-based approaches. We have developed a rapid and sensitive 96-well fluorescence-based assay to quantify cellular dNTPs utilizing a standard real-time PCR thermocycler. This assay relies on the principle that incorporation of a limiting dNTP is required for primer-extension and Taq polymerase-mediated 5′–3′ exonuclease hydrolysis of a dual-quenched fluorophore-labeled probe, resulting in fluorescence. The concentration of limiting dNTP is directly proportional to the fluorescence generated. The assay demonstrated excellent linearity (R² > 0.99) and can be modified to detect between ∼0.5 and 100 pmol of dNTP. The limits of detection (LOD) and quantification (LOQ) for all dNTPs were defined as <0.77 and <1.3 pmol, respectively. The intra-assay and inter-assay variation coefficients were determined to be <4.6% and <10%, respectively, with an accuracy of 100 ± 15% for all dNTPs. The assay quantified intracellular dNTPs with results similar to those obtained from a validated LC–MS/MS approach and successfully measured quantitative differences in dNTP pools in human cancer cells treated with inhibitors of thymidylate metabolism. This assay has important application in research that investigates the influence of pathological conditions or pharmacological agents on dNTP biosynthesis and regulation.
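
The proportionality between limiting dNTP and fluorescence described above is what lets a standard curve convert a reading into an amount. A minimal sketch of that idea follows, assuming an ordinary least-squares line through hypothetical standards; the function names and all numbers are illustrative and are not taken from the paper.

```python
def fit_standard_curve(amounts, signals):
    """Least-squares line signal = slope * amount + intercept, plus R^2."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(amounts, signals))
    ss_tot = sum((y - mean_y) ** 2 for y in signals)
    return slope, intercept, 1.0 - ss_res / ss_tot


def quantify(signal, slope, intercept):
    """Invert the standard curve to estimate the amount behind a reading."""
    return (signal - intercept) / slope


# Hypothetical standards (pmol) and fluorescence readings (arbitrary units).
standards = [0.5, 1.0, 5.0, 10.0, 50.0, 100.0]
readings = [11.0, 12.0, 20.0, 30.0, 110.0, 210.0]
slope, intercept, r_squared = fit_standard_curve(standards, readings)
unknown_pmol = quantify(70.0, slope, intercept)
```

A reading is only meaningful inside the range spanned by the standards, which is why the abstract reports a working range alongside the linearity.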

13.
Detection of copy number variation (CNV) in DNA has recently become an important method for understanding the pathogenesis of cancer. While existing algorithms for extracting CNV from microarray data have worked reasonably well, the trend towards ever larger sample sizes and higher resolution microarrays has vastly increased the challenges they face. Here, we present Segmentation analysis of DNA (SAD), a clustering algorithm constructed with a strategy in which all operational decisions are based on simple and rigorous applications of statistical principles, measurement theory and precise mathematical relations. Compared with existing packages, SAD is simpler in formulation, more user friendly, much faster and less thirsty for memory, offers higher accuracy and supplies quantitative statistics for its predictions. Unique among such algorithms, SAD's running time scales linearly with array size; on a typical modern notebook, it completes high-quality CNV analyses for a 250 thousand-probe array in ∼1 s and a 1.8 million-probe array in ∼8 s.

14.
Gosal G, Kochut KJ, Kannan N. PLoS ONE 2011, 6(12): e28782

Background

Protein kinases are a large and diverse family of enzymes that are genomically altered in many human cancers. Targeted cancer genome sequencing efforts have unveiled the mutational profiles of protein kinase genes from many different cancer types. While mutational data on protein kinases is currently catalogued in various databases, integration of mutation data with other forms of data on protein kinases such as sequence, structure, function and pathway is necessary to identify and characterize key cancer causing mutations. Integrative analysis of protein kinase data, however, is a challenge because of the disparate nature of protein kinase data sources and data formats.

Results

Here, we describe ProKinO, a protein kinase-specific ontology, which provides a controlled vocabulary of terms, their hierarchy, and relationships unifying sequence, structure, function, mutation and pathway information on protein kinases. The conceptual representation of such diverse forms of information in one place not only allows rapid discovery of significant information related to a specific protein kinase, but also enables large-scale integrative analysis of protein kinase data in ways not possible through other kinase-specific resources. We have performed several integrative analyses of ProKinO data and, as an example, found that a large number of somatic mutations (∼288 distinct mutations) associated with the haematopoietic neoplasm cancer type map to only 8 kinases in the human kinome. This is in contrast to glioma, where the mutations are spread over 82 distinct kinases. We also provide examples of how ontology-based data analysis can be used to generate testable hypotheses regarding cancer mutations.

Conclusion

We present an integrated framework for large-scale integrative analysis of protein kinase data. Navigation and analysis of ontology data can be performed using the ontology browser available at: http://vulcan.cs.uga.edu/prokino.

15.
Recently it has been shown that cancer mutations selectively target protein-protein interactions. We hypothesized that mutations affecting distinct protein interactions involving established cancer genes could contribute to tumor heterogeneity, and that novel mechanistic insights might be gained into tumorigenesis by investigating protein interactions under positive selection in cancer. To identify protein interactions under positive selection in cancer, we mapped over 1.2 million nonsynonymous somatic cancer mutations onto 4,896 experimentally determined protein structures and analyzed their spatial distribution. In total, 20% of mutations on the surface of known cancer genes perturbed protein-protein interactions (PPIs), and this enrichment for PPI interfaces was observed for both tumor suppressors (Odds Ratio 1.28, P-value < 10⁻⁴) and oncogenes (Odds Ratio 1.17, P-value < 10⁻³). To study this further, we constructed a bipartite network representing structurally resolved PPIs from all available human complexes in the Protein Data Bank (2,864 proteins, 3,072 PPIs). Analysis of frequently mutated cancer genes within this network revealed that tumor-suppressors, but not oncogenes, are significantly enriched with functional mutations in homo-oligomerization regions (Odds Ratio 3.68, P-value < 10⁻⁸). We present two important examples, TP53 and beta-2-microglobulin, for which the patterns of somatic mutations at interfaces provide insights into specifically perturbed biological circuits. In patients with TP53 mutations, patient survival correlated with the specific interactions that were perturbed. Moreover, we investigated mutations at the interface of protein-nucleotide interactions and observed an unexpected number of missense mutations but not silent mutations occurring within DNA and RNA binding sites. Finally, we provide a resource of 3,072 PPI interfaces ranked according to their mutation rates. Analysis of this list highlights 282 novel candidate cancer genes that encode proteins participating in interactions that are perturbed recurrently across tumors. In summary, mutation of specific protein interactions is an important contributor to tumor heterogeneity and may have important implications for clinical outcomes.
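
The enrichment figures quoted above are odds ratios. As a rough illustration of how such a ratio is read, the sketch below computes one from a 2×2 table of residue counts; the counts are hypothetical, and the abstract does not state how the authors built their tables or tested significance.

```python
def odds_ratio(mut_interface, unmut_interface, mut_other, unmut_other):
    """Odds of mutation at interface residues divided by odds elsewhere."""
    return (mut_interface * unmut_other) / (unmut_interface * mut_other)


# Hypothetical counts of surface residues, for illustration only.
ratio = odds_ratio(mut_interface=30, unmut_interface=70,
                   mut_other=20, unmut_other=80)
print(round(ratio, 2))  # 1.71
```

A value above 1 means mutations fall on interface residues more often than on the rest of the surface; a value of 1 means no enrichment.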

16.
The chemical modification of histones at specific DNA regulatory elements is linked to the activation, inactivation and poising of genes. A number of tools exist to predict enhancers from chromatin modification maps, but their practical application is limited because they either (i) consider a smaller number of marks than those necessary to define the various enhancer classes or (ii) work with an excessive number of marks, which is experimentally unviable. We have developed a method for chromatin state detection using support vector machines in combination with genetic algorithm optimization, called ChromaGenSVM. ChromaGenSVM selects optimum combinations of specific histone epigenetic marks to predict enhancers. In an independent test, ChromaGenSVM recovered 88% of the experimentally supported enhancers in the pilot ENCODE region of interferon gamma-treated HeLa cells. Furthermore, ChromaGenSVM successfully combined the profiles of only five distinct methylation and acetylation marks from ChIP-seq libraries done in human CD4+ T cells to predict ∼21 000 experimentally supported enhancers within 1.0 kb regions and with a precision of ∼90%, thereby improving previous predictions on the same dataset by 21%. The combined results indicate that ChromaGenSVM comfortably outperforms previously published methods and that enhancers are best predicted by specific combinations of histone methylation and acetylation marks.

17.
18.
Phenotypic variation in natural populations results from a combination of genetic effects, environmental effects, and gene-by-environment interactions. Despite the vast amount of genomic data becoming available, many pressing questions remain about the nature of genetic mutations that underlie functional variation. We present the results of combining genome-wide association analysis of 41 different phenotypes in ∼5,000 inbred maize lines to analyze patterns of high-resolution genetic association among 28.9 million single-nucleotide polymorphisms (SNPs) and ∼800,000 copy-number variants (CNVs). We show that genic and intergenic regions have opposite patterns of enrichment, minor allele frequencies, and effect sizes, implying tradeoffs among the probability that a given polymorphism will have an effect, the detectable size of that effect, and its frequency in the population. We also find that genes tagged by GWAS are enriched for regulatory functions and are ∼50% more likely to have a paralog than expected by chance, indicating that gene regulation and gene duplication are strong drivers of phenotypic variation. These results will likely apply to many other organisms, especially ones with large and complex genomes like maize.

19.
Abnormal heat shock protein (HSP) levels have been observed in a number of human tumours, where they are involved in all hallmarks of cancer. Since bovine urothelial tumours share striking morphological and biochemical features with their human counterparts, the aim of this study was to evaluate the immunohistochemical levels of Hsp27, Hsp60, Hsp72, Hsp73 and Hsp90 in 28 normal bovine urinary bladders and 30 bovine papillomavirus-positive urothelial tumours (9 in situ carcinomas, 9 low-grade and 12 high-grade carcinomas) and adjacent premalignant lesions obtained from cows suffering from chronic enzootic haematuria, in order to investigate the role of these proteins in the process of urothelial carcinogenesis. A semi-quantitative method was used for the analysis of the results. Western blot analysis was also used to confirm HSP expression in normal controls. All investigated HSPs were expressed in normal bovine urothelium, showing characteristic patterns of immunolabelling throughout urothelial cell layers, which usually appeared to be conserved in urothelial hyperplasia and dysplasia. On the other hand, gradual loss of Hsp27 immunostaining was significantly associated with increasing histological grade of malignancy (P < 0.01). Likewise, a significantly reduced immunosignal of Hsp73 and Hsp90 was observed in high-grade and low-/high-grade carcinomas, respectively (P < 0.01). In contrast, Hsp60 (P < 0.01) and Hsp72 (P < 0.05) immunoreactivity appeared to be significantly increased in both premalignant and malignant lesions when compared to that observed in normal urothelium, thus suggesting an early involvement of these proteins in neoplastic transformation of urinary bladder mucosa.

20.
The tumor suppressor protein p53 can lose its function upon single-point missense mutations in the core DNA-binding domain (“cancer mutants”). Activity can be restored by second-site suppressor mutations (“rescue mutants”). This paper relates the functional activity of p53 cancer and rescue mutants to their overall molecular dynamics (MD), without focusing on local structural details. A novel global measure of protein flexibility for the p53 core DNA-binding domain, the number of clusters at a certain RMSD cutoff, was computed by clustering over 0.7 µs of explicitly solvated all-atom MD simulations. For wild-type p53 and a sample of p53 cancer or rescue mutants, the number of clusters was a good predictor of in vivo p53 functional activity in cell-based assays. This number-of-clusters (NOC) metric was strongly correlated (r² = 0.77) with reported values of experimentally measured ΔΔG protein thermodynamic stability. Interpreting the number of clusters as a measure of protein flexibility: (i) p53 cancer mutants were more flexible than wild-type protein, (ii) second-site rescue mutations decreased the flexibility of cancer mutants, and (iii) negative controls of non-rescue second-site mutants did not. This new method reflects the overall stability of the p53 core domain and can discriminate which second-site mutations restore activity to p53 cancer mutants.
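
To make the number-of-clusters idea concrete, here is a minimal sketch that counts clusters of conformations at an RMSD cutoff. It assumes the frames are already superimposed and uses simple greedy leader clustering; the abstract does not say which clustering algorithm the authors used, so the algorithm, function names and toy coordinates are all assumptions for illustration.

```python
import math


def rmsd(conf_a, conf_b):
    """RMSD between two pre-aligned conformations (lists of xyz tuples)."""
    squared = sum((a - b) ** 2
                  for atom_a, atom_b in zip(conf_a, conf_b)
                  for a, b in zip(atom_a, atom_b))
    return math.sqrt(squared / len(conf_a))


def number_of_clusters(frames, cutoff):
    """Greedy leader clustering: a frame starts a new cluster only if it is
    farther than `cutoff` from every existing cluster representative."""
    leaders = []
    for frame in frames:
        if not any(rmsd(frame, leader) <= cutoff for leader in leaders):
            leaders.append(frame)
    return len(leaders)


# Toy two-atom trajectories (hypothetical coordinates, arbitrary units).
rigid = [[(0, 0, 0), (1, 0, 0.0)],
         [(0, 0, 0), (1, 0, 0.1)],
         [(0, 0, 0), (1, 0, 0.05)]]
flexible = [[(0, 0, 0), (1, 0, 0.0)],
            [(0, 0, 0), (1, 0, 2.0)],
            [(0, 0, 0), (1, 0, 4.0)]]
print(number_of_clusters(rigid, cutoff=0.5))     # 1
print(number_of_clusters(flexible, cutoff=0.5))  # 3
```

A trajectory that wanders over more distinct conformations needs more representatives to cover it, which is the sense in which a higher count indicates a more flexible protein.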
