Similar literature
20 similar documents found.
1.
The volumetric growth of tumor cells as a function of time is most likely a complex trait, controlled by the combined influences of multiple genes and environmental factors. Genetic mapping has proven to be a powerful tool for detecting and identifying specific genes affecting complex traits, i.e., quantitative trait loci (QTL), based on polymorphic markers. In this article, we present a novel statistical model for genetic mapping of QTL governing tumor growth trajectories in humans. In principle, this model is a combination of functional mapping proposed to map function-valued traits and linkage disequilibrium mapping designed to provide high resolution mapping of QTL by making use of recombination events created at a historic time. We implement an EM-simplex hybrid algorithm for parameter estimation, in which a closed-form solution for the EM algorithm is derived to estimate the population genetic parameters of QTL including the allele frequencies and the coefficient of linkage disequilibrium, and the simplex algorithm is incorporated to estimate the curve parameters describing the dynamic changes of cancer cells for different QTL genotypes. Extensive simulations are performed to investigate the statistical properties of our model. Through a number of hypothesis tests, our model allows for cutting-edge studies aimed to decipher the genetic mechanisms underlying cancer growth, development and differentiation. The implications of our model in gene therapy for cancer research are discussed.
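
As a rough illustration of the "coefficient of linkage disequilibrium" mentioned in this abstract, the sketch below computes the standard two-locus quantities D = p_AB - p_A p_B and r² from haplotype and allele frequencies. The function name `ld_stats` and the example frequencies are hypothetical; the abstract does not give the authors' parameterization, so this shows only the textbook definition, not their estimator.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (D, r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    # D measures the departure of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    # r2 normalizes D^2 by the product of the allele-frequency variances.
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2


# Hypothetical frequencies chosen only for illustration.
d, r2 = ld_stats(p_ab=0.4, p_a=0.5, p_b=0.5)
print(f"D = {d:.3f}, r^2 = {r2:.3f}")
```

In a mapping model of this kind the frequencies would themselves be unknowns estimated from data (for example by EM); here they are passed in directly to keep the definition visible.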

2.
Increasing evidence shows that quantitative inheritance is based on both DNA sequence and non‐DNA sequence variants. However, how to simultaneously detect these variants from a mapping study has been unexplored, hampering our effort to illustrate the detailed genetic architecture of complex traits. We address this issue by developing a unified model of quantitative trait locus (QTL) mapping based on an open‐pollinated design composed of randomly sampling maternal plants from a natural population and their half‐sib seeds. This design forms a two‐level hierarchical platform for a joint linkage‐linkage disequilibrium analysis of population structure. The EM algorithm was implemented to estimate and test DNA sequence‐based effects and non‐DNA sequence‐based effects of QTLs. We applied this model to analyze genetic mapping data from the OP design of a gymnosperm coniferous species, Torreya grandis, identifying 25 significant DNA sequence and non‐DNA sequence QTLs for seedling height and diameter growth in different years. Results from computer simulation show that the unified model has good statistical properties and is powerful for QTL detection. Our model enables the tests of how a complex trait is affected differently by DNA‐based effects and non‐DNA sequence‐based transgenerational effects, thus allowing a more comprehensive picture of genetic architecture to be charted and quantified.

3.
Single nucleotide polymorphisms (SNPs) represent the most widespread type of DNA sequence variation in the human genome and they have recently emerged as valuable genetic markers for revealing the genetic architecture of complex traits in terms of nucleotide combination and sequence. Here, we extend an algorithmic model for the haplotype analysis of SNPs to estimate the effects of genetic imprinting expressed at the DNA sequence level. The model provides a general procedure for identifying the number and types of optimal DNA sequence variants that are expressed differently due to their parental origin. The model is used to analyze a genetic data set collected from a pain genetics project. We find that DNA haplotype GAC from three SNPs, OPRKG36T (with two alleles G and T), OPRKA843G (with alleles A and G), and OPRKC846T (with alleles C and T), at the kappa-opioid receptor, triggers a significant effect on pain sensitivity, but with expression significantly depending on the parent from which it is inherited (p = 0.008). With a tremendous advance in SNP identification and automated screening, the model founded on haplotype discovery and statistical inference may provide a useful tool for genetic analysis of any quantitative trait with complex inheritance.

4.
Mining single-nucleotide polymorphisms from hexaploid wheat ESTs. Total citations: 20 (self-citations: 0; citations by others: 20)
Single-nucleotide polymorphisms (SNPs) represent a new form of functional marker, particularly when they are derived from expressed sequence tags (ESTs). A bioinformatics strategy was developed to discover SNPs within a large wheat EST database and to demonstrate the utility of SNPs in genetic mapping and genetic diversity applications. A collection of > 90000 wheat ESTs was assembled into contiguous sequences (contigs), and 45 random contigs were then visually inspected to identify primer pairs capable of amplifying specific alleles. We estimate that homoeologue sequence variants occurred 1 in 24 bp and the frequency of SNPs between wheat genotypes was 1 SNP/540 bp (theta = 0.0069). Furthermore, we estimate that one diagnostic SNP test can be developed from every contig with 10-60 EST members. Thus, EST databases are an abundant source of SNP markers. Polymorphism information content for SNPs ranged from 0.04 to 0.50 and ESTs could be mapped into a framework of microsatellite markers using segregating populations. The results showed that SNPs in wheat can be discovered in ESTs, validated, and be applied to conventional genetic studies.

5.
It has been recognized that genetic mutations in specific nucleotides may give rise to cancer via the alteration of signaling pathways. Thus, the detection of those cancer-causing mutations has received considerable interest in cancer genetic research. Here, we propose a statistical model for characterizing genes that lead to cancer through point mutations using genome-wide single nucleotide polymorphism (SNP) data. The basic idea of the model is that mutated genes may be in high association with their nearby SNPs because of evolutionary forces. By genotyping SNPs in both normal and cancer cells, we formulate a polynomial likelihood to estimate the population genetic parameters related to cancer, such as allele frequencies of cancer-causing alleles, mutation rates of alleles derived from maternal or paternal parents, and zygotic linkage disequilibria between different loci after the mutation occurs. We implement the EM algorithm to estimate some of these parameters because of the missing information in the likelihood construction. The model allows the elegant tests of the significant associations between mutated cancer genes and genome-wide SNPs, thus providing a way for predicting the occurrence and formation of cancer with genetic information. The model, validated through computer simulation, may help cancer geneticists design efficient experiments and formulate hypotheses for cancer gene identification.

6.
Linkage mapping of gene-associated SNPs to pig chromosome 11. Total citations: 3 (self-citations: 0; citations by others: 3)
Single nucleotide polymorphisms (SNPs) were discovered in porcine expressed sequence tags (ESTs) orthologous to genes from human chromosome 13 (HSA13) and predicted to be located on pig chromosome 11 (SSC11). The SNPs were identified as sequence variants in clusters of EST sequences from pig cDNA libraries constructed in the Sino-Danish pig genome project. In total, 312 human gene sequences from HSA13 were used for similarity searches in our pig EST database. Pig ESTs showing significant similarity with HSA13 genes were clustered and candidate SNPs were identified. Allele frequencies for 26 SNPs were estimated in a group of 80 unrelated pigs from Danish commercial pig breeds: Duroc, Hampshire, Landrace and Large White. Eighteen of the 26 SNPs genotyped in the PiGMaP Reference Families were mapped by linkage analysis to SSC11. The EST-based SNPs published here are new genetic markers useful for linkage and association studies in commercial and experimental pig populations. This study represents the first gene-associated SNP linkage map of pig chromosome 11 and adds new comparative mapping information between SSC11 and HSA13. Furthermore, our data facilitate future studies aimed at the identification of interesting regions on pig chromosome 11, positional cloning and fine mapping of quantitative trait loci in pig.

7.
Quantitative trait nucleotide analysis using Bayesian model selection. Total citations: 4 (self-citations: 0; citations by others: 4)
Although much attention has been given to statistical genetic methods for the initial localization and fine mapping of quantitative trait loci (QTLs), little methodological work has been done to date on the problem of statistically identifying the most likely functional polymorphisms using sequence data. In this paper we provide a general statistical genetic framework, called Bayesian quantitative trait nucleotide (BQTN) analysis, for assessing the likely functional status of genetic variants. The approach requires the initial enumeration of all genetic variants in a set of resequenced individuals. These polymorphisms are then typed in a large number of individuals (potentially in families), and marker variation is related to quantitative phenotypic variation using Bayesian model selection and averaging. For each sequence variant a posterior probability of effect is obtained and can be used to prioritize additional molecular functional experiments. An example of this quantitative nucleotide analysis is provided using the GAW12 simulated data. The results show that the BQTN method may be useful for choosing the most likely functional variants within a gene (or set of genes). We also include instructions on how to use our computer program, SOLAR, for association analysis and BQTN analysis.
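
The "posterior probability of effect" obtained by Bayesian model averaging can be sketched as follows. This is a minimal illustration that assumes equal prior model probabilities and the common BIC approximation to posterior model weights (weight proportional to exp(-BIC/2)); the abstract does not state that BQTN or SOLAR computes it exactly this way, and the function name and BIC values are hypothetical.

```python
import math


def posterior_effect_probabilities(models):
    """Average over candidate models to get a per-variant posterior probability.

    models: list of (variants, bic) pairs, where `variants` is the set of
            variant names included in that model and `bic` is its BIC.
    Assumes equal prior probability for every model (an illustrative choice).
    """
    # Subtract the best BIC before exponentiating for numerical stability.
    best = min(bic for _, bic in models)
    weights = [math.exp(-0.5 * (bic - best)) for _, bic in models]
    total = sum(weights)

    # A variant's posterior probability of effect is the summed weight
    # of all models that include it.
    probs = {}
    for (variants, _), w in zip(models, weights):
        for v in variants:
            probs[v] = probs.get(v, 0.0) + w / total
    return probs


# Hypothetical BIC values for the four models over two variants.
models = [
    (set(), 100.0),
    ({"snp1"}, 96.0),
    ({"snp2"}, 100.0),
    ({"snp1", "snp2"}, 98.0),
]
for name, p in sorted(posterior_effect_probabilities(models).items()):
    print(name, round(p, 3))
```

Variants whose summed weight is high would be the natural candidates to prioritize for follow-up functional experiments.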

8.
Kang HM  Zaitlen NA  Wade CM  Kirby A  Heckerman D  Daly MJ  Eskin E 《Genetics》2008,178(3):1709-1723
Genomewide association mapping in model organisms such as inbred mouse strains is a promising approach for the identification of risk factors related to human diseases. However, genetic association studies in inbred model organisms are confronted by the problem of complex population structure among strains. This induces inflated false positive rates, which cannot be corrected using standard approaches applied in human association studies such as genomic control or structured association. Recent studies demonstrated that mixed models successfully correct for the genetic relatedness in association mapping in maize and Arabidopsis panel data sets. However, the currently available mixed-model methods suffer from computational inefficiency. In this article, we propose a new method, efficient mixed-model association (EMMA), which corrects for population structure and genetic relatedness in model organism association mapping. Our method takes advantage of the specific nature of the optimization problem in applying mixed models for association mapping, which allows us to substantially increase the computational speed and reliability of the results. We applied EMMA to in silico whole-genome association mapping of inbred mouse strains involving hundreds of thousands of SNPs, in addition to Arabidopsis and maize data sets. We also performed extensive simulation studies to estimate the statistical power of EMMA under various SNP effects, varying degrees of population structure, and differing numbers of multiple measurements per strain. Despite the limited power of inbred mouse association mapping due to the limited number of available inbred strains, we are able to identify significantly associated SNPs, which fall into known QTL or genes identified through previous studies while avoiding an inflation of false positives. An R package implementation and webserver of our EMMA method are publicly available.

9.
Functional mapping is a statistical method for mapping quantitative trait loci (QTLs) that regulate the dynamic pattern of a biological trait. This method integrates mathematical aspects of biological complexity into a mixture model for genetic mapping and tests the genetic effects of QTLs by comparing genotype-specific curve parameters. As a way of quantitatively specifying the dynamic behaviour of a system, differential equations have proved to be powerful for modelling and unravelling the biochemical, molecular, and cellular mechanisms of a biological process, such as biological rhythms. The equipment of functional mapping with biologically meaningful differential equations provides new insights into the genetic control of any dynamic processes. We formulate a new functional mapping framework for a dynamic biological rhythm by incorporating a group of ordinary differential equations (ODE). The Runge–Kutta fourth-order algorithm was implemented to estimate the parameters that define the system of ODE. The new model will find its implications for understanding the interplay between gene interactions and developmental pathways in complex biological rhythms.
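
For readers unfamiliar with the fourth-order Runge-Kutta scheme named in this abstract, a minimal sketch follows. It integrates a generic system dy/dt = f(t, y) with a fixed step; the function names and the exponential-decay test system are illustrative assumptions, not the rhythm model of the paper. In a parameter-estimation setting, a solver like this would typically be called inside a likelihood or least-squares loop, one solve per candidate parameter set.

```python
import math


def rk4_step(f, t, y, h):
    """Advance the state y (a list of floats) by one step h using classical RK4."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    # Weighted average of the four slope estimates.
    return [
        yi + h / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def integrate(f, t0, y0, t1, n_steps):
    """Integrate from t0 to t1 in n_steps equal RK4 steps; return the final state."""
    h = (t1 - t0) / n_steps
    t, y = t0, list(y0)
    for _ in range(n_steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


def decay(t, y):
    """Illustrative test system dy/dt = -y, whose exact solution is exp(-t)."""
    return [-y[0]]


y_end = integrate(decay, 0.0, [1.0], 1.0, 100)
print(y_end[0], math.exp(-1.0))
```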

10.
Functional mapping is a statistical method for mapping quantitative trait loci (QTLs) that regulate the dynamic pattern of a biological trait. This method integrates mathematical aspects of biological complexity into a mixture model for genetic mapping and tests the genetic effects of QTLs by comparing genotype-specific curve parameters. As a way of quantitatively specifying the dynamic behavior of a system, differential equations have proven to be powerful for modeling and unraveling the biochemical, molecular, and cellular mechanisms of a biological process, such as biological rhythms. The equipment of functional mapping with biologically meaningful differential equations provides new insights into the genetic control of any dynamic processes. We formulate a new functional mapping framework for a dynamic biological rhythm by incorporating a group of ordinary differential equations (ODE). The Runge-Kutta fourth order algorithm was implemented to estimate the parameters that define the system of ODE. The new model will find its implications for understanding the interplay between gene interactions and developmental pathways in complex biological rhythms.

11.
Lin M  Lou XY  Chang M  Wu R 《Genetics》2003,165(2):901-913
Because of uncertainty about linkage phases of founders, linkage mapping in nonmodel, outcrossing systems using molecular markers presents one of the major statistical challenges in genetic research. In this article, we devise a statistical method for mapping QTL affecting a complex trait by incorporating all possible QTL-marker linkage phases within a mapping framework. The advantage of this model is the simultaneous estimation of linkage phases and QTL location and effect parameters. These estimates are obtained through maximum-likelihood methods implemented with the EM algorithm. Extensive simulation studies are performed to investigate the statistical properties of our model. In a case study from a forest tree, this model has successfully identified a significant QTL affecting wood density. Also, the probability of the linkage phase between this QTL and its flanking markers is estimated. The implications of our model and its extension to more general circumstances are discussed.

12.
Genome-wide association studies (GWAS) examine the entire human genome with the goal of identifying genetic variants (usually single nucleotide polymorphisms (SNPs)) that are associated with phenotypic traits such as disease status and drug response. The discordance of significantly associated SNPs for the same disease identified from different GWAS indicates that false associations exist in such results. In addition to the possible sources of spurious associations that have been investigated and discussed intensively, such as sample size and population stratification, an accurate and reproducible genotype calling algorithm is required for concordant GWAS results from different studies. However, variations of genotype calling of an algorithm and their effects on significantly associated SNPs identified in downstream association analyses have not been systematically investigated. In this paper, the variations of genotype calling using the Bayesian Robust Linear Model with Mahalanobis distance classifier (BRLMM) algorithm and the resulting influence on the lists of significantly associated SNPs were evaluated using the raw data of 270 HapMap samples analysed with the Affymetrix Human Mapping 500K Array Set (Affy500K) by changing algorithmic parameters. Modified were the Dynamic Model (DM) call confidence threshold (threshold) and the number of randomly selected SNPs (size). Comparative analysis of the calling results and the corresponding lists of significantly associated SNPs identified through association analysis revealed that algorithmic parameters used in BRLMM affected the genotype calls and the significantly associated SNPs. Both the threshold and the size affected the called genotypes and the lists of significantly associated SNPs in association analysis. The effect of the threshold was much larger than the effect of the size. Moreover, the heterozygous calls had lower consistency compared to the homozygous calls.

13.
The growing collection of publicly available high-throughput data provides an invaluable resource for generating preliminary in silico data in support of novel hypotheses. In this study we used a cross-dataset meta-analysis strategy to identify novel candidate genes and genetic variations relevant to paclitaxel/carboplatin-induced myelosuppression and neuropathy. We identified genes affected by drug exposure and present in tissues associated with toxicity. From ten top-ranked genes 42 non-synonymous single nucleotide polymorphisms (SNPs) were identified in silico and genotyped in 94 cancer patients treated with carboplatin/paclitaxel. We observed variations in 11 SNPs, of which seven were present in a sufficient frequency for statistical evaluation. Of these seven SNPs, three were present in ABCA1 and ATM, and showed significant or borderline significant association with either myelosuppression or neuropathy. The strikingly high number of associations between genotype and clinically observed toxicity provides support for our data-driven computational strategy to identify biomarkers for drug toxicity.

14.
Identification of genes that harbor variation associated with inter-individual differences in risk of complex diseases remains one of the most challenging and important problems in human genetics. For genetic variants that are sufficiently common and have sufficiently large effects, direct tests of association through linkage disequilibrium with anonymous SNPs may prove effective. But the two critical parameters - the frequency of risk-inflating alleles and the magnitudes of their effect on risk - remain largely unknown. In this review we consider the latest information regarding the likely efficacy of the linkage disequilibrium mapping approach.

15.
16.
In spite of the success of genome-wide association studies (GWASs), only a small proportion of heritability for each complex trait has been explained by identified genetic variants, mainly SNPs. Likely reasons include genetic heterogeneity (i.e., multiple causal genetic variants) and small effect sizes of causal variants, for which pathway analysis has been proposed as a promising alternative to the standard single-SNP-based analysis. A pathway contains a set of functionally related genes, each of which includes multiple SNPs. Here we propose a pathway-based test that is adaptive at both the gene and SNP levels, thus maintaining high power across a wide range of situations with varying numbers of the genes and SNPs associated with a trait. The proposed method is applicable to both common variants and rare variants and can incorporate biological knowledge on SNPs and genes to boost statistical power. We use extensively simulated data and a WTCCC GWAS dataset to compare our proposal with several existing pathway-based and SNP-set-based tests, demonstrating its promising performance and its potential use in practice.

17.
Single nucleotide polymorphisms (SNPs) represent the most abundant type of genetic variation that can be used as molecular markers. The SNPs that are hidden in sequence databases can be unlocked using bioinformatic tools. For efficient application of these SNPs, the sequence set should be as error-free as possible, targeting single loci and suitable for the SNP scoring platform of choice. We have developed a pipeline to effectively mine SNPs from public EST databases with or without quality information using QualitySNP software, select reliable SNPs and prepare the loci for analysis on the Illumina GoldenGate genotyping platform. The applicability of the pipeline was demonstrated using publicly available potato EST data, genotyping individuals from two diploid mapping populations and subsequently mapping the SNP markers (putative genes) in both populations. Over 7000 reliable SNPs were identified that met the criteria for genotyping on the GoldenGate platform. Of the 384 SNPs on the SNP array approximately 12% dropped out. For the two potato mapping populations 165 and 185 segregating SNP loci could be mapped on the respective genetic maps, illustrating the effectiveness of our pipeline for SNP selection and validation.

18.
MOTIVATION: Genetic interactions or epistasis may play an important role in the genetic etiology of drug response. With the availability of large-scale, high-density single nucleotide polymorphism markers, a great challenge is how to associate haplotype structures and complex drug response through its underlying pharmacodynamic mechanisms. RESULTS: We have derived a general statistical model for detecting an interactive network of DNA sequence variants that encode pharmacodynamic processes based on the haplotype map constructed by single nucleotide polymorphisms. The model was validated by a pharmacogenetic study for two predominant beta-adrenergic receptor (betaAR) subtypes expressed in the heart, beta1AR and beta2AR. Haplotypes from these two receptors trigger significant interaction effects on the response of heart rate to different dose levels of dobutamine. This model will have implications for pharmacogenetic and pharmacogenomic research and drug discovery. AVAILABILITY: A computer program written in Matlab can be downloaded from the webpage of statistical genetics group at the University of Florida. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

19.
The standard approach for identifying gene networks is based on experimental perturbations of gene regulatory systems such as gene knock-out experiments, followed by a genome-wide profiling of differential gene expressions. However, this approach is significantly limited in that it is not possible to perturb more than one or two genes simultaneously to discover complex gene interactions or to distinguish between direct and indirect downstream regulations of the differentially-expressed genes. As an alternative, genetical genomics study has been proposed to treat naturally-occurring genetic variants as potential perturbants of gene regulatory system and to recover gene networks via analysis of population gene-expression and genotype data. Despite many advantages of genetical genomics data analysis, the computational challenge that the effects of multifactorial genetic perturbations should be decoded simultaneously from data has prevented a widespread application of genetical genomics analysis. In this article, we propose a statistical framework for learning gene networks that overcomes the limitations of experimental perturbation methods and addresses the challenges of genetical genomics analysis. We introduce a new statistical model, called a sparse conditional Gaussian graphical model, and describe an efficient learning algorithm that simultaneously decodes the perturbations of gene regulatory system by a large number of SNPs to identify a gene network along with expression quantitative trait loci (eQTLs) that perturb this network. While our statistical model captures direct genetic perturbations of gene network, by performing inference on the probabilistic graphical model, we obtain detailed characterizations of how the direct SNP perturbation effects propagate through the gene network to perturb other genes indirectly. We demonstrate our statistical method using HapMap-simulated and yeast eQTL datasets. In particular, the yeast gene network identified computationally by our method under SNP perturbations is well supported by the results from experimental perturbation studies related to DNA replication stress response.

20.
Cui Y  Wu R 《Genetical research》2005,86(1):65-75
To study the effects of maternal and endosperm quantitative trait locus (QTL) interaction on endosperm development, we derive a two-stage hierarchical statistical model within the maximum-likelihood context, implemented with an expectation-maximization algorithm. A model incorporating both maternal and offspring marker information can improve the accuracy and precision of genetic mapping. Extensive simulations under different sampling strategies, heritability levels and gene action modes were performed to investigate the statistical properties of the model. The QTL location and parameters are better estimated when two QTLs are located at different intervals than when they are located at the same interval. Also, the additive effect of the offspring QTLs is better estimated than the additive effect of the maternal QTLs. The implications of our model for agricultural and evolutionary genetic research are discussed.
