Similar literature
1.
2.
We assessed the disease-causing potential of single nucleotide polymorphisms (SNPs) based on a simple set of sequence-based features. We focused on SNPs from the dbSNP database in G-protein-coupled receptors (GPCRs), a large class of important transmembrane (TM) proteins. Apart from the location of the SNP in the protein, we evaluated the predictive power of three major classes of features to differentiate between disease-causing mutations and neutral changes: (i) properties derived from amino-acid scales, such as volume and hydrophobicity; (ii) position-specific phylogenetic features reflecting evolutionary conservation, such as normalized site entropy, residue frequency and SIFT score; and (iii) substitution-matrix scores, such as those derived from the BLOSUM62, GRANTHAM and PHAT matrices. We validated our approach using a control dataset consisting of known disease-causing mutations and neutral variations. Logistic regression analyses indicated that position-specific phylogenetic features that describe the conservation of an amino acid at a specific site are the best discriminators of disease mutations versus neutral variations, and integration of all our features improves discrimination power. Overall, we identify 115 SNPs in GPCRs from dbSNP that are likely to be associated with disease and thus are good candidates for genotyping in association studies.
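
One of the conservation features named in the abstract above, normalized site entropy, can be illustrated with a minimal sketch. The abstract does not give the exact definition the authors used, so the version below is an assumption: Shannon entropy of one alignment column, divided by log2(20) so that the value falls between 0 (fully conserved) and 1 (all 20 amino acids equally frequent). The function name and the gap handling are hypothetical.

```python
import math
from collections import Counter


def normalized_site_entropy(column: str) -> float:
    """Shannon entropy of one alignment column, scaled by log2(20).

    Assumption: gaps and other non-letter characters are ignored, and the
    normalizer is the maximum entropy over the 20 standard amino acids.
    """
    residues = [c for c in column.upper() if c.isalpha()]
    if not residues:
        raise ValueError("column contains no residues")
    total = len(residues)
    counts = Counter(residues)
    # Sum of p * log2(1/p) over the observed residues.
    entropy = sum((n / total) * math.log2(total / n) for n in counts.values())
    return entropy / math.log2(20)


if __name__ == "__main__":
    # A fully conserved column scores 0; a mixed column scores higher.
    print(normalized_site_entropy("LLLLLLLL"))
    print(normalized_site_entropy("LLIVLMLL"))
```

A low score marks a conserved site, which is where the abstract reports substitutions to be most likely disease-causing.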

3.
Type 2 diabetes mellitus (T2DM) is a non-autoimmune, complex, heterogeneous and polygenic metabolic disease characterized by persistently elevated blood glucose levels (hyperglycemia). India, often called the diabetes capital of the world, is likely to experience the largest increase in T2DM and to have the greatest number of diabetic individuals in the world by the year 2030. Identifying specific genetic variations in a particular ethnic group is critical to understanding the risk of developing T2DM more efficiently in the future. These genetic variations include numerous types of polymorphisms, among which single nucleotide polymorphisms (SNPs) are the most frequent. SNPs are basically located within the regulatory elements of several gene sequences. Scores of genes interact with various environmental factors, affecting various pathways and sometimes even the whole signalling network, to cause diseases like T2DM. This review discusses the biomarkers for early risk prediction of T2DM. Such predictions could be used to understand the pathogenesis of T2DM and to improve diagnostics, treatment, and eventually prevention.

4.
C A Wise  M Sraml  S Easteal 《Genetics》1998,148(1):409-421
To test whether patterns of mitochondrial DNA (mtDNA) variation are consistent with a neutral model of molecular evolution, nucleotide sequences were determined for the 1041 bp of the NADH dehydrogenase subunit 2 (ND2) gene in 20 geographically diverse humans and 20 common chimpanzees. Contingency tests of neutrality were performed using four mutational categories for the ND2 molecule: synonymous and nonsynonymous mutations in the transmembrane regions, and synonymous and nonsynonymous mutations in the surface regions. The following three topological mutational categories were also used: intraspecific tips, intraspecific interiors, and interspecific fixed differences. The analyses reveal a significantly greater number of nonsynonymous polymorphisms within human transmembrane regions than expected based on interspecific comparisons, and they are inconsistent with a neutral equilibrium model. This pattern of excess nonsynonymous polymorphism is not seen within chimpanzees. Statistical tests of neutrality, such as Tajima's D test, and the D and F tests proposed by Fu and Li, indicate an excess of low frequency polymorphisms in the human data, but not in the chimpanzee data. This is consistent with recent directional selection, a population bottleneck or background selection of slightly deleterious mutations in human mtDNA samples. The analyses further support the idea that mitochondrial genome evolution is governed by selective forces that have the potential to affect its use as a "neutral" marker in evolutionary and population genetic studies.
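
As an illustration of the Tajima's D test mentioned above, here is a minimal sketch of the standard statistic (Tajima 1989) computed from a set of aligned sequences. It is not the authors' code: the function name is hypothetical, and the sketch assumes equal-length sequences with no gaps or missing data. A negative D reflects an excess of low-frequency polymorphisms, which is the pattern the abstract reports for the human data.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences.

    Assumptions: no gaps or missing data; at least 4 sequences, because the
    variance term is zero for smaller samples.
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must have equal length")

    # S: number of segregating (polymorphic) sites.
    s_sites = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if s_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * s_sites + e2 * s_sites * (s_sites - 1)
    return (pi - s_sites / a1) / math.sqrt(variance)


if __name__ == "__main__":
    # Toy data: one singleton site gives a negative D.
    print(tajimas_d(["ACGT", "ACGT", "ACGT", "ACGA"]))
```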

5.
Mularoni L  Veitia RA  Albà MM 《Genomics》2007,89(3):316-325
Single-amino-acid tandem repeats are very common in mammalian proteins but their function and evolution are still poorly understood. Here we investigate how the variability and prevalence of amino acid repeats are related to the evolutionary constraints operating on the proteins. We find a significant positive correlation between repeat size difference and protein nonsynonymous substitution rate in human and mouse orthologous genes. This association is observed for all the common amino acid repeat types and indicates that rapid diversification of repeat structures, involving both trinucleotide slippage and nucleotide substitutions, preferentially occurs in proteins subject to low selective constraints. However, strikingly, we also observe a significant negative correlation between the number of repeats in a protein and the gene nonsynonymous substitution rate, particularly for glutamine, glycine, and alanine repeats. This implies that proteins subject to strong selective constraints tend to contain an unexpectedly high number of repeats, which tend to be well conserved between the two species. This is consistent with a role for selection in the maintenance of a significant number of repeats. Analysis of the codon structure of the sequences encoding the repeats shows that codon purity is associated with high repeat size interspecific variability. Interestingly, polyalanine and polyglutamine repeats associated with disease show very distinctive features regarding the degree of repeat conservation and the protein sequence selective constraints.

6.
Recent analyses have shown that nonsynonymous variation in human mitochondrial DNA (mtDNA) contains nonneutral variants, suggesting the presence of mildly deleterious mutations. Many of the disease-causing mutations in mtDNA occur in the genes encoding the tRNAs. Nucleotide sequence variation in these genes has not been studied in human populations, nor have the structural consequences of nucleotide substitutions in tRNA molecules been examined. We therefore determined the nucleotide sequences of the 22 tRNA genes in the mtDNA of 477 Finns and also obtained 435 European sequences from the MitoKor database. No differences in population polymorphism indices were found between the two data sets. We assessed selective constraints against various tRNA domains by comparing allele frequencies between these domains and the synonymous and nonsynonymous sites, respectively. All tRNA domains except the variable loop were more conserved than synonymous sites, and T stem and D stem were more conserved than the respective loops. We also analyzed the energetic consequences of the 96 polymorphisms recovered in the two data sets or in the Mitomap database. The minimum free energy (ΔG) was calculated using the free energy rules as implemented in mfold version 3.1. The ΔG values were normally distributed among the 22 wild-type tRNA genes, whereas the 96 polymorphic tRNAs departed significantly from a normal distribution. The largest differences in ΔG between the wild-type and the polymorphic tRNAs in the Finnish population tended to be in the polymorphisms that were present at low frequencies. Allele frequency distributions and minimum free energy calculations both suggested that some polymorphisms in tRNA genes are nonneutral. Reviewing Editor: Dr. Rüdiger Cerff

7.
Non-synonymous single nucleotide polymorphisms (nsSNPs) are known to alter protein function, contributing to disease susceptibility. This report explores the nature of nsSNPs in the gene products of the highly conserved mitogen-activated protein kinase (MAPK) signaling pathways already implicated in cancer development. MAPK signaling pathways regulate cellular processes such as proliferation, differentiation, apoptosis, and survival mediated through interconnected signaling cascades. Using the dbSNP database, we have identified 25 nsSNPs in 17 out of 98 MAPK genes studied. Computational algorithms were used to predict whether the amino acid substitutions were evolutionarily tolerated, or affected putative functional units such as phosphorylation sites, protein motifs and domains. This study predicts that 36% of nsSNPs are likely to have functional consequences, based on evolutionary conservation analysis, and 36% based on phosphorylation prediction analysis. All such nsSNPs represent potentially functional and disease-causing/modifying alleles. More interestingly, the epistatic relationships discussed in this report represent potential synergistic/antagonistic/additive effects of nsSNP combinations found within the same protein, or within members of the same protein complex and cascades. This strategy can effectively determine which nsSNPs potentially alter protein function, and can be utilized to study the genetic architecture and disease association of other biological protein complexes and networks.

8.
Baum J  Thomas AW  Conway DJ 《Genetics》2003,163(4):1327-1336
Malaria parasite antigens involved in erythrocyte invasion are primary vaccine candidates. The erythrocyte-binding antigen 175K (EBA-175) of Plasmodium falciparum binds to glycophorin A on the human erythrocyte surface via an N-terminal cysteine-rich region (termed region II) and is a target of antibody responses. A survey of polymorphism in a malaria-endemic population shows that nucleotide alleles in eba-175 region II occur at more intermediate frequencies than expected under neutrality, but polymorphisms in the homologous domains of two closely related genes, eba-140 (encoding a second erythrocyte-binding protein) and psieba-165 (a putative pseudogene), show an opposite trend. McDonald-Kreitman tests employing interspecific comparison with the orthologous genes in P. reichenowi (a closely related parasite of chimpanzees) reveal a significant excess of nonsynonymous polymorphism in P. falciparum eba-175 but not in eba-140. An analysis of the Duffy-binding protein gene, encoding a major erythrocyte-binding antigen in the other common human malaria parasite P. vivax, also reveals a significant excess of nonsynonymous polymorphisms when compared with divergence from its ortholog in P. knowlesi (a closely related parasite of macaques). The results suggest that EBA-175 in P. falciparum and DBP in P. vivax are both under diversifying selection from acquired human immune responses.
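
The McDonald-Kreitman test referred to above compares nonsynonymous and synonymous counts within a species (polymorphism) against the same counts between species (fixed differences) in a 2x2 table. The sketch below shows one common way to evaluate such a table, using a neutrality index and a two-sided Fisher exact test built on the standard library. The function names are hypothetical, the counts in the usage example are made up for illustration and are not taken from the paper, and the abstract does not say which significance test the authors used.

```python
from math import comb


def neutrality_index(pn, ps, dn, ds):
    """(Pn/Ps) / (Dn/Ds); values above 1 indicate excess nonsynonymous polymorphism.

    Assumption: ps, dn and ds are all nonzero.
    """
    return (pn / ps) / (dn / ds)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Illustrative counts only: rows are polymorphic / fixed,
    # columns are nonsynonymous / synonymous.
    pn, ps, dn, ds = 20, 5, 10, 25
    print(neutrality_index(pn, ps, dn, ds))
    print(fisher_exact_two_sided(pn, ps, dn, ds))
```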

9.
Evolution of the Borrelia burgdorferi outer surface protein OspC.
The genes coding for outer surface protein OspC from 22 Borrelia burgdorferi strains isolated from patients with Lyme borreliosis were cloned and sequenced. For reference purposes, the 16S rRNA genes from 17 of these strains were sequenced after being cloned. The deduced OspC amino acid sequences were aligned with 12 published OspC sequences and revealed the presence of 48 conserved amino acids. On the basis of the alignment, OspC could be divided into an amino-terminal relatively conserved region and a relatively variable region in the central portion. The distance tree obtained divided the ospC sequences into three groups. The first group contained ospC alleles from all (n = 13) sensu stricto strains, the second group contained ospC alleles from seven Borrelia afzelii strains, and the third group contained ospC alleles from five B. afzelii and all (n = 9) Borrelia garinii strains. The ratio of the mean number of synonymous (dS) and nonsynonymous (dN) nucleotide substitutions per site calculated for B. burgdorferi sensu stricto, B. garinii, and B. afzelii ospC alleles suggested that the polymorphism of OspC is due to positive selection favoring diversity at the amino acid level in the relatively variable region. On the basis of the comparison of 16S rRNA gene sequences, Borrelia hermsii is more closely related to B. afzelii than to B. burgdorferi sensu stricto and B. garinii. In contrast, the phylogenetic tree obtained for the B. hermsii variable major protein, Vmp33, and 18 OspC amino acid sequences suggested that Vmp33 and OspC from B. burgdorferi sensu stricto strains share a common evolutionary origin.

10.
Our understanding of the impact of recombination, mutation, genetic drift, and selection on the evolution of a single gene is still limited. Here we investigate the impact of all these evolutionary forces at the complementary sex determiner (csd) gene that evolves under a balancing mode of selection. Females are heterozygous at the csd gene and males are hemizygous; diploid males are lethal and occur when csd is homozygous. Rare alleles thus have a selective advantage, are seldom lost by the effect of genetic drift, and are maintained over extended periods of time when compared with neutral polymorphisms. Here, we report on the analysis of 17, 19, and 15 csd alleles of Apis cerana, Apis dorsata, and Apis mellifera honeybees, respectively. We observed great heterogeneity of synonymous (piS) and nonsynonymous (piN) polymorphisms across the gene, with a consistent peak in exons 6 and 7. We propose that exons 6 and 7 encode the potential specifying domain (csd-PSD) that has accumulated elevated nucleotide polymorphisms over time by balancing selection. We observed no direct evidence that balancing selection favors the accumulation of nonsynonymous changes at csd-PSD (piN/piS ratios are all <1, ranging from 0.6 to 0.95). We observed an excess of shared nonsynonymous changes, which suggest that strong evolutionary constraints are operating at csd-PSD resulting in the independent accumulation of the same nonsynonymous changes in different alleles across species (convergent evolution). Analysis of csd-PSD genealogy revealed relatively short average coalescence times (approximately 6 Myr), low average synonymous nucleotide diversity (piS < 0.09), and a lack of trans-specific alleles that substantially contrasts with previously analyzed loci under strong balancing selection. We excluded the possibility of a burst of diversification after population bottlenecking and intragenic recombination as explanatory factors, leaving high turnover rates as the explanation for this observation. By comparing observed allele richness and average coalescence times with a simplified model of csd-coalescence, we found that small long-term population sizes (i.e., N_e < 10^4), but not high mutation rates, can explain short maintenance times, implicating a strong historical impact of genetic drift on the molecular evolution of highly social honeybees.

11.
One of the major goals of comparative genomics is to understand the evolutionary history of each nucleotide in the human genome sequence, and the degree to which it is under selective pressure. Ascertainment of selective constraint at nucleotide resolution is particularly important for predicting the functional significance of human genetic variation and for analyzing the sequence substructure of cis-regulatory sequences and other functional elements. Current methods for analysis of sequence conservation are focused on delineation of conserved regions comprising tens or even hundreds of consecutive nucleotides. We therefore developed a novel computational approach designed specifically for scoring evolutionary conservation at individual base-pair resolution. Our approach estimates the rate at which each nucleotide position is evolving, computes the probability of neutrality given this rate estimate, and summarizes the result in a Sequence CONservation Evaluation (SCONE) score. We computed SCONE scores in a continuous fashion across 1% of the human genome for which high-quality sequence information from up to 23 genomes is available. We show that SCONE scores are clearly correlated with the allele frequency of human polymorphisms in both coding and noncoding regions. We find that the majority of noncoding conserved nucleotides lie outside of longer conserved elements predicted by other conservation analyses, and are experiencing ongoing selection in modern humans as evident from the allele frequency spectrum of human polymorphism. We also applied SCONE to analyze the distribution of conserved nucleotides within functional regions. These regions are markedly enriched in individually conserved positions and short (<15 bp) conserved "chunks." Our results collectively suggest that the majority of functionally important noncoding conserved positions are highly fragmented and reside outside of canonically defined long conserved noncoding sequences. A small subset of these fragmented positions may be identified with high confidence.

12.
Proteins are extensively modified after translation due to cellular regulation, signal transduction, or chemical damage. Peptide tandem mass spectrometry can discover post-translational modifications, as well as sequence polymorphisms. Recent efforts have studied modifications at the proteomic scale. In this context, it becomes crucial to assess the accuracy of modification discovery. We discuss methods to quantify the false discovery rate from a search and demonstrate how several features can be used to distinguish valid modifications from search artifacts. We present a tool, PTMFinder, which implements these methods. We summarize the corpus of post-translational modifications identified on large data sets. Thousands of known and novel modification sites are identified, including site-specific modifications conserved over vast evolutionary distances.

13.
Three common protein isoforms of apolipoprotein E (apoE), encoded by the epsilon2, epsilon3, and epsilon4 alleles of the APOE gene, differ in their association with cardiovascular and Alzheimer's disease risk. To gain a better understanding of the genetic variation underlying this important polymorphism, we identified sequence haplotype variation in 5.5 kb of genomic DNA encompassing the whole of the APOE locus and adjoining flanking regions in 96 individuals from four populations: blacks from Jackson, MS (n=48 chromosomes), Mayans from Campeche, Mexico (n=48), Finns from North Karelia, Finland (n=48), and non-Hispanic whites from Rochester, MN (n=48). In the region sequenced, 23 sites varied (21 single nucleotide polymorphisms, or SNPs, 1 diallelic indel, and 1 multiallelic indel). The 22 diallelic sites defined 31 distinct haplotypes in the sample. The estimate of nucleotide diversity (site-specific heterozygosity) for the locus was 0.0005 ± 0.0003. Sequence analysis of the chimpanzee APOE gene showed that it was most closely related to human epsilon4-type haplotypes, differing from the human consensus sequence at 67 synonymous (54 substitutions and 13 indels) and 9 nonsynonymous fixed positions. The evolutionary history of allelic divergence within humans was inferred from the pattern of haplotype relationships. This analysis suggests that haplotypes defining the epsilon3 and epsilon2 alleles are derived from the ancestral epsilon4s and that the epsilon3 group of haplotypes has increased in frequency, relative to epsilon4s, in the past 200,000 years. Substantial heterogeneity exists within all three classes of sequence haplotypes, and there are important interpopulation differences in the sequence variation underlying the protein isoforms that may be relevant to interpreting conflicting reports of phenotypic associations with variation in the common protein isoforms.

14.
Sen K  Ghosh TC 《Gene》2012,501(2):164-170
Pseudogenes, the 'genomic fossils', present a portrayal of the evolutionary history of the human genome. The human genes that give rise to pseudogenes are also emerging as important resources in the study of human protein evolution. In this communication, we explored the evolutionary conservation of genes that form pseudogenes relative to genes lacking any pseudogene and, delving deeper, probed an evolutionary rate difference between the disease genes in the two groups. We illustrated this differential evolutionary pattern by gene expressivity, the number of regulatory miRNAs targeting each gene, the abundance of protein complex-forming genes and a lower percentage of protein intrinsic disorder. Furthermore, pseudogenes are observed to harbor sequence variations over their entire length that become degenerative disease-causing mutations, though the disease involvement of their progenitors is still unexplored. Here, we unveiled a strong association of disease genes with the genes that give rise to pseudogenes in humans. We interpreted this finding by disease-associated miRNA targeting, genes containing polymorphisms in miRNA target sites, the abundance of genes having disease-causing non-synonymous mutations, disease gene-specific network properties, the presence of genes having repeat regions, the abundance of dosage-sensitive genes and the presence of intrinsically unstructured protein regions.

15.
Crohn's disease is a chronic inflammatory bowel disease, with multifactorial traits, that can involve any part of the gastrointestinal tract. In recent years, a dozen genome-wide association scans and meta-analyses have been published, bringing the number of susceptibility alleles to more than 30. However, the major susceptibility gene for Crohn's disease is NOD2, located on proximal 16q, which is involved in the innate immune response. Three main variants of this gene are involved in susceptibility to Crohn's disease: two single nucleotide polymorphisms, causing the p.Arg702Trp and p.Gly908Arg substitutions, and the frameshift polymorphism p.Leu1007fsinsC.

16.
We have obtained 15 sequences of Est-6 from a natural population of Drosophila melanogaster to test whether linkage disequilibrium exists between Est-6 and the closely linked Sod, and whether natural selection may be involved. An early experiment with allozymes had shown linkage disequilibrium between these two loci, while none was detected between other gene pairs. The Sod sequences for the same 15 haplotypes were obtained previously. The two genes exhibit similar levels of nucleotide polymorphism, but the patterns are different. In Est-6, there are nine amino acid replacement polymorphisms, one of which accounts for the S-F allozyme polymorphism. In Sod, there is only one replacement polymorphism, which corresponds to the S-F allozyme polymorphism. The transversion/transition ratio is more than five times larger in Sod than in Est-6. At the nucleotide level, the S and F alleles of Est-6 make up two allele families that are quite different from each other, while there is relatively little variation within each of them. There are also two families of alleles in Sod, one consisting of a subset of F alleles, and the other consisting of another subset of F alleles, designated F(A), plus all the S alleles. The Sod F(A) and S alleles are completely or nearly identical in nucleotide sequence, except for the replacement mutation that accounts for the allozyme difference. The two allele families have independent evolutionary histories in the two genes. There are traces of statistically significant linkage disequilibrium between the two genes that, we suggest, may have arisen as a consequence of selection favoring one particular sequence at each locus.

17.
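Objective: To investigate the relationship between the interaction of the UCP2-866G/A and ADIPOQ+45T/G gene polymorphisms and the risk of type 2 diabetes mellitus complicated by coronary heart disease. Methods: A case-control study was conducted in 130 patients with type 2 diabetes alone and 128 patients with type 2 diabetes complicated by coronary heart disease, randomly selected from those attending the First Affiliated Hospital of Jiamusi University between October 2014 and May 2015. The UCP2-866G/A and ADIPOQ+45T/G polymorphisms were genotyped by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) and polymerase chain reaction-high-resolution melting (PCR-HRM) analysis, respectively, and the interaction between the two genes was analyzed by unconditional logistic regression. Results: In separate association analyses of the UCP2-866G/A and ADIPOQ+45T/G polymorphisms, neither the genotype and allele frequency distributions of the two variant sites nor the genetic-model association analyses differed significantly between the two groups (P>0.05). Combined analysis of the two sites showed that the GG and GA genotypes of UCP2-866G/A each had a positive interaction with the TG genotype of ADIPOQ+45T/G in type 2 diabetes complicated by coronary heart disease (P=0.000, ORI = ORAB/(ORA × ORB) = 30.533/(0.549 × 0.116) > 1; P=0.007, ORI = ORAB/(ORA × ORB) = 13.914/(0.525 × 0.116) > 1). Conclusion: Polymorphisms of UCP2-866G/A and ADIPOQ+45T/G individually are not associated with the risk of type 2 diabetes complicated by coronary heart disease, whereas the interaction between them may increase that risk.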
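
The multiplicative interaction index quoted in the results above, ORI = ORAB/(ORA × ORB), can be worked through directly from the three odds ratios reported for each genotype combination. The short sketch below only redoes that arithmetic. The function name is hypothetical, and the abstract itself states only that both ratios exceed 1.

```python
def interaction_odds_ratio(or_ab: float, or_a: float, or_b: float) -> float:
    """Multiplicative interaction index: joint OR over the product of single ORs.

    A value above 1 is read as a positive (more than multiplicative) interaction.
    """
    return or_ab / (or_a * or_b)


if __name__ == "__main__":
    # Odds ratios as reported in the abstract.
    gg_tg = interaction_odds_ratio(30.533, 0.549, 0.116)  # UCP2 GG with ADIPOQ TG
    ga_tg = interaction_odds_ratio(13.914, 0.525, 0.116)  # UCP2 GA with ADIPOQ TG
    print(round(gg_tg, 1), round(ga_tg, 1))
```

With the reported values, both indices come out in the hundreds (roughly 479 and 228), far above the threshold of 1 that the abstract uses to call the interaction positive.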

18.
Examination of polymorphisms in the Plasmodium falciparum gene for falcipain 2 revealed that this gene is one of two paralogs separated by 10.8 kb in chromosome 11. We designate the annotated gene denoted chr11.gen_424 as encoding falcipain 2A and the annotated gene denoted chr11.gen_427 as encoding falcipain 2B. The paralogs are 96% identical at the nucleotide level and 93% identical at the amino acid level. The consensus sequences differ in 31/309 synonymous sites and 45/1140 nonsynonymous sites, including three amino acid replacements (V393I, A400P, and Q414E) that are near the catalytic site and that may affect substrate affinity or specificity. In six reference isolates, among 36 synonymous sites and 46 nonsynonymous sites that are polymorphic in the gene for falcipain 2A, falcipain 2B, or both, significant spatial clustering is observed. All but one of the polymorphisms appear to result from gene conversion between the paralogs. The estimated rate of gene conversion between the paralogs may be as many as 1,400 to 1,700 times greater than the rate of mutation. Owing to gene conversion, one of the falcipain 2A alleles is more similar to the falcipain 2B alleles than it is to other falcipain 2A alleles. Divergence among the synonymous sites suggests that the paralogous genes last shared a common ancestor 15.2 MYA, with a range of 8.8 to 20.6 MYA. During this period, the paralogs have acquired 0.10 synonymous substitutions per synonymous site in the coding region. The 5' and 3' flanking regions differ in 47.7% and 39.8% of the nucleotide sites, respectively. Hence synonymous sites and flanking regions are not conserved in sequence in spite of their high AT content and T skew.

19.
Mutations in the androgen receptor (AR) are associated with a variety of diseases including androgen insensitivity syndrome and prostate cancer, but the way in which these mutations cause disease is poorly understood. We present a method for distinguishing likely disease-causing mutations from mutations that are merely associated with disease but have no causal role. Our method uses a measure of nucleotide conservation, and we find that conservation often correlates with severity of the clinical phenotype. Further, by only including mutations whose pathogenicity has been proven experimentally, this correlation is enhanced in the case of prostate cancer-associated mutations. Our method provides a means for assessing the significance of single nucleotide polymorphisms (SNPs) and cancer-associated mutations.

20.
The β3-adrenergic receptor (ADRB3) is predominantly expressed in white and brown adipose tissue and mediates the lipolytic and thermogenic effects of high catecholamine concentrations. Variation in the ADRB3 gene (ADRB3) has been associated with obesity and the earlier onset of non-insulin-dependent diabetes mellitus in some ethnic groups, as well as some production traits of sheep, but to date variation of bovine ADRB3 has not been reported. In this study, variation in the promoter region of bovine ADRB3 was investigated in 737 cattle by polymerase chain reaction-single strand conformational polymorphism (PCR-SSCP) analysis. Six PCR-SSCP patterns representing six allelic variations and containing four single nucleotide polymorphisms (SNPs) and three nucleotide deletions/insertions were observed. Allele A was the most common allele (93.83%), whereas alleles C, D, E and F were rare (0.07, 1.09, 0.41, and 0.34%, respectively). The variation identified here might have an impact on both the function and level of expression of bovine ADRB3.
