Similar literature
20 similar articles found
1.
Basu S  Pan W  Oetting WS 《Human heredity》2011,71(4):234-245
Studying one locus or one single nucleotide polymorphism (SNP) at a time may not be sufficient to understand complex diseases because they are unlikely to result from the effect of only one SNP. Each SNP alone may have little or no effect on the risk of the disease, but together they may increase the risk substantially. Analyses focusing on individual SNPs ignore the possibility of interaction among SNPs. In this paper, we propose a parsimonious model to assess the joint effect of a group of SNPs in a case-control study. The model implements a data reduction strategy within a likelihood framework and uses a test to assess the statistical significance of the effect of the group of SNPs on the binary trait. The primary advantage of the proposed approach is that the dimension reduction technique produces a test statistic with degrees of freedom significantly lower than a multiple logistic regression with only main effects of the SNPs, and our parsimonious model can incorporate the possibility of interaction among the SNPs. Moreover, the proposed approach estimates the direction of association of each SNP with the disease and provides an estimate of the average effect of the group of SNPs positively and negatively associated with the disease in the given SNP set. We illustrate the proposed model on simulated and real data, and compare its performance with a few other existing approaches. Our proposed approach appeared to outperform the other approaches for independent SNPs in our simulation studies.

2.

Background

The problem of efficient utilization of genome-wide expression profiles for identification and prediction of complex disease conditions is both important and challenging. Polygenic pathologies such as most types of cancer involve dysregulation of many interacting genes, which has prompted a search for suitable statistical models for their representation. By accounting for changes in gene regulation between comparable conditions, graphical statistical models are expected to improve prediction precision.

Methods

In comparison problems with two or more experimental conditions, we represent the classes by categorical Bayesian networks that share one and the same graph structure but have class-specific probability parameters. The graph structure is learned by a score-based procedure that maximizes the difference between class probabilities using a suitable measure of divergence. The proposed framework includes an indirect model selection by adhering to a principle of optimal class separation and identifies interactions presenting significant difference between the compared conditions.

Results

We evaluate the performance of the new model against some benchmark algorithms such as support vector machine, penalized linear regression and linear Gaussian networks. The classifiers are compared by prediction accuracy across 15 different data sets from breast, lung, gastric and renal cancer studies. In addition to the demonstrated strong performance against the competitors, the proposed method is able to identify disease specific changes in gene regulations which are inaccessible by other approaches. The latter is illustrated by analyzing some gene interactions differentiating adenocarcinoma and squamous cell lung cancers.

3.
Abstract: Ecologists often develop complex regression models that include multiple categorical and continuous variables, interactions among predictors, and nonlinear relationships between the response and predictor variables. Nomograms, which are graphical devices for presenting mathematical functions and calculating output values, can aid biologists in interpreting and presenting these complex models. To illustrate benefits of nomograms, we developed a logistic regression model of elk (Cervus elaphus) resource selection. With this model, we demonstrated how a nomogram helps scientists and managers interpret interactions among variables, compare the relative biological importance of variables, and examine predicted shapes of relationships (e.g., linear vs. nonlinear) between response and predictor variables. Although our example focused on logistic regression, nomograms are equally useful for other linear and nonlinear models. Regardless of the approach used for model development, nomograms and other graphical summaries can help scientists and managers develop, interpret, and apply statistical models.

4.
Summary: Continuous-time multistate models are widely used for categorical response data, particularly in the modeling of chronic diseases. However, inference is difficult when the process is only observed at discrete time points, with no information about the times or types of events between observation times, unless a Markov assumption is made. This assumption can be limiting as rates of transition between disease states might instead depend on the time since entry into the current state. Such a formulation results in a semi-Markov model. We show that the computational problems associated with fitting semi-Markov models to panel-observed data can be alleviated by considering a class of semi-Markov models with phase-type sojourn distributions. This allows methods for hidden Markov models to be applied. In addition, extensions to models where observed states are subject to classification error are given. The methodology is demonstrated on a dataset relating to development of bronchiolitis obliterans syndrome in post-lung-transplantation patients.

5.

Background

Genome-wide association studies (GWAS) aim to identify causal variants and genes for complex disease by independently testing a large number of SNP markers for disease association. Although genes have been implicated in these studies, few utilise the multiple-hit model of complex disease to identify causal candidates. A major benefit of multi-locus comparison is that it compensates for some shortcomings of current statistical analyses that test the frequency of each SNP in isolation for the phenotype population versus control.

Results

Here we developed and benchmarked several protocols for GWAS data analysis using different in-silico gene prediction and prioritisation methodologies. We adopted a high-sensitivity approach to the data, using less conservative statistical SNP associations. Multiple gene search spaces, either fixed-width or proximity-based, were generated around each SNP marker. We used the candidate disease gene prediction system Gentrepid to identify candidates based on shared biomolecular pathways or domain-based protein homology. Predictions were made either with phenotype-specific known disease genes as input or, without a priori knowledge, by exhaustive comparison of genes in distinct loci. Because Gentrepid uses biomolecular data to find interactions and common features between genes in distinct loci of the search spaces, it takes advantage of the multi-locus aspect of the data.

Conclusions

Results suggest testing multiple SNP-to-gene search spaces compensates for differences in phenotypes, populations and SNP platforms. Surprisingly, domain-based homology information was more informative when benchmarked against gene candidates reported by GWA studies compared to previously determined disease genes, possibly suggesting a larger contribution of gene homologs to complex diseases than Mendelian diseases.

6.
GWAS have emerged as popular tools for identifying genetic variants that are associated with disease risk. Standard analysis of a case-control GWAS involves assessing the association between each individual genotyped SNP and disease risk. However, this approach suffers from limited reproducibility and difficulties in detecting multi-SNP and epistatic effects. As an alternative analytical strategy, we propose grouping SNPs together into SNP sets on the basis of proximity to genomic features such as genes or haplotype blocks, then testing the joint effect of each SNP set. Testing of each SNP set proceeds via the logistic kernel-machine-based test, which is based on a statistical framework that allows for flexible modeling of epistatic and nonlinear SNP effects. This flexibility and the ability to naturally adjust for covariate effects are important features of our test that make it appealing in comparison to individual SNP tests and existing multimarker tests. Using simulated data based on the International HapMap Project, we show that SNP-set testing can have improved power over standard individual-SNP analysis under a wide range of settings. In particular, we find that our approach has higher power than individual-SNP analysis when the median correlation between the disease-susceptibility variant and the genotyped SNPs is moderate to high. When the correlation is low, both individual-SNP analysis and the SNP-set analysis tend to have low power. We apply SNP-set analysis to analyze the Cancer Genetic Markers of Susceptibility (CGEMS) breast cancer GWAS discovery-phase data.

7.
Best linear unbiased prediction is well known for its wide range of applications including small area estimation. While the theory is well established for mixed linear models and under normality of the error and mixing distributions, the literature is sparse for nonlinear mixed models under nonnormality of the error distribution or of the mixing distributions. We develop a resampling-based unified approach for predicting mixed effects under a generalized mixed model set-up. Second-order-accurate nonnegative estimators of mean squared prediction errors are also developed. Given the parametric model, the proposed methodology automatically produces estimators of the small area parameters and their mean squared prediction errors, without requiring explicit analytical expressions for the mean squared prediction errors.

8.
Networks offer a powerful tool for understanding and visualizing inter-species ecological and evolutionary interactions. Previously considered examples, such as trophic networks, are just representations of experimentally observed direct interactions. However, species interactions are so rich and complex that it is not feasible to directly observe more than a small fraction of them. In this paper, using data mining techniques, we show how potential interactions can be inferred from geographic data, rather than by direct observation. An important application area for this methodology is that of emerging diseases, where, often, little is known about inter-species interactions, such as between vectors and reservoirs. Here, we show how, using geographic data, biotic interaction networks that model statistical dependencies between species distributions can be used to infer and understand inter-species interactions. Furthermore, we show how such networks can be used to build prediction models, for example to predict the most important reservoirs of a disease or the degree of disease risk associated with a geographical area. We illustrate the general methodology by considering an important emerging disease, leishmaniasis. This data mining methodology allows for the use of geographic data to construct inferential biotic interaction networks which can then be used to build prediction models with a wide range of applications in ecology, biodiversity and emerging diseases.

9.
Building an accurate disease risk prediction model is an essential step in the modern quest for precision medicine. While high-dimensional genomic data provide valuable resources for the investigation of disease risk, their large amount of noise and the complex relationships between predictors and outcomes pose tremendous analytical challenges. Deep learning models are the state-of-the-art methods for many prediction tasks, and they offer a promising framework for the analysis of genomic data. However, deep learning models generally suffer from the curse of dimensionality and a lack of biological interpretability, both of which have greatly limited their applications. In this work, we have developed a deep neural network (DNN) based prediction modeling framework. We first proposed a group-wise feature importance score for feature selection, where genes harboring genetic variants with both linear and non-linear effects are efficiently detected. We then designed an explainable transfer-learning based DNN method, which can directly incorporate information from feature selection and accurately capture complex predictive effects. The proposed DNN framework is biologically interpretable, as it is built on the selected predictive genes. It is also computationally efficient and can be applied to genome-wide data. Through extensive simulations and real data analyses, we have demonstrated that our proposed method can not only efficiently detect predictive features, but also accurately predict disease risk, as compared to many existing methods.

10.
Liu XY  Li M  Yang SY  Su B  Yin LD 《动物学研究》2011,32(5):499-503
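Schizophrenia is a common, complex psychiatric disorder. A large body of experimental evidence indicates that genetic factors play an important role in its development. To date, at least 100 genes have been reported to be associated with schizophrenia, but these associations replicate poorly across populations. Among these genes, RELN has been confirmed to be associated with schizophrenia in multiple populations, suggesting that it may be a true susceptibility gene. Many single nucleotide polymorphisms (SNPs) in RELN have now been shown to be associated with schizophrenia; the most studied is rs7341475, a SNP in the fourth intron of RELN that was identified by a genome-wide association study and shown to be associated with the occurrence of schizophrenia. To test whether this locus is associated with schizophrenia in the Chinese population, the authors performed a genetic analysis of a case-control sample (400 patients and 400 healthy controls) from Yuxi, China. The results show that rs7341475 is not associated with schizophrenia in this sample, suggesting that rs7341475 may not be a risk polymorphism in the Chinese population.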

11.
Weed risk assessment has become an accepted methodology for examining the likelihood and consequence of a plant species becoming invasive outside of its native range. Weed risk assessment draws upon biological and ecological information to estimate the likelihood and magnitude of the threats posed by introducing non-indigenous plants. In geographical terms, this has traditionally been understood as within a new country following importation of plant material. However, recent risk assessment development has focused more specifically on intracountry risk posed by already-present invasive plants and is referred to as post-border weed risk management. This form of assessment calls for fine-scale predictions of invasive species habitat suitability. This study applies some of the more popular and widely available habitat prediction models that represent a variety of different statistical approaches (linear regression, logistic regression, Bayesian probability, Classification and Regression Trees, Genetic Algorithm for Rule-set Production) to a single invasive plant, the vertebrate-dispersed, fleshy-fruited European olive (Olea europaea L.) in southern Australia. The relationships between the dependent (O. europaea distribution) and independent (soil and climate) variables are used in the models to produce predictive maps for each model. Accuracy was calculated for each model output as well as a combined surface to examine whether ensemble modelling of distributions, as recently called for, produces improved predictions. Overall, the combined prediction demonstrated superior accuracy compared to any individual model output. The combined outputs can be likened to mapped gradations of predicted habitat suitability. The type of output produced in this study should form a critical component of post-border weed risk management but, more importantly, the methodology will add to this important discipline.

12.
We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with the current reference method in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets.

13.
Schizophrenia is a serious neuropsychiatric illness affecting about 1% of the world’s population. Family, twin and adoption studies have demonstrated that 65%-85% of the susceptibility to schizophrenia can be attributed to genes. On the basis of genetic modeling of epidemiological data and recent results of a number of whole-genome screens for susceptibility genes, schizophrenia has been considered a complex disorder. A number of genes with small to moderate effects are involved in combinat…

14.
The aim of genetic mapping is to locate the loci responsible for specific traits such as complex diseases. These traits are normally caused by mutations at multiple loci of unknown locations and interactions. In this work, we model the biological system that relates DNA polymorphisms with complex traits as a linear mixing process. Given this model, we propose a new fine-scale genetic mapping method based on independent component analysis. The proposed method outputs both independent groups of associated SNPs and specific SNPs associated with the phenotype. It is applied to a clinical schizophrenia data set with 368 individuals and 42 SNPs. It is also applied in a simulation study to investigate its performance in more depth. The obtained results demonstrate the novel characteristics of the proposed method compared to other genetic mapping methods. Finally, we study the robustness of the proposed method to missing genotype values and limited sample sizes.

15.
Colorectal cancer screening is well established. The identification of high risk populations is the key to implement effective risk-adjusted screening. Good statistical approaches for risk prediction do not exist. The family's colorectal cancer history is used for identification of high risk families and usually assessed by a questionnaire. This paper introduces a prediction algorithm to designate a family for colorectal cancer risk and discusses its statistical properties. The new algorithm uses Bayesian reasoning and a detailed family history illustrated by a pedigree and a Lexis diagram. The algorithm is able to integrate different hereditary mechanisms that define complex latent class or random factor structures. They are generic and do not reflect specific genetic models. This is comparable to strategies in complex segregation analysis. Furthermore, the algorithm can integrate different statistical penetrance models for right censored event data. Computational challenges related to the handling of the likelihood are discussed. Simulation studies assess the predictive quality of the new algorithm in terms of ROC curves and corresponding AUCs. The algorithm is applied to data of a recent study on familial colorectal cancer risk. Its predictive performance is compared to that of a questionnaire currently used in screening for familial colorectal cancer. The results of the proposed algorithm are robust against different inheritance models. Using the simplest hereditary mechanism, the simulation study provides evidence that the algorithm improves detection of families with high cancer risk in comparison to the currently used questionnaire. The applicability of the algorithm goes beyond the field of colorectal cancer.
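The abstract above evaluates predictive quality by ROC curves and AUCs. As a reminder of what that summary measures, here is a minimal sketch of the AUC computed as the probability that a randomly chosen high-risk family receives a higher score than a randomly chosen low-risk one, with ties counted as one half. The function name `roc_auc` and the toy scores are illustrative assumptions; this is not the paper's algorithm or its data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risk scores (higher means higher predicted risk).
    labels: 1 for a true high-risk case, 0 otherwise.
    Hypothetical helper for illustration only.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties contribute one half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up scores for four families, two truly high risk.
    print(roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

The pairwise form is quadratic in the number of cases, which is fine for a sketch; a rank-based computation would be preferable for large samples.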

16.
Whole-genome resequencing technology has improved rapidly during recent years and is expected to improve further such that the sequencing of an entire human genome sequence for $1000 is within reach. Our main aim here is to use whole-genome sequence data for the prediction of genetic values of individuals for complex traits and to explore the accuracy of such predictions. This is relevant for the fields of plant and animal breeding and, in human genetics, for the prediction of an individual's risk for complex diseases. Here, population history and genomic architectures were simulated under the Wright–Fisher population and infinite-sites mutation model, and prediction of genetic value was by the genomic selection approach, where a Bayesian nonlinear model was used to predict the effects of individual SNPs. The Bayesian model assumed a priori that only few SNPs are causative, i.e., have an effect different from zero. When using whole-genome sequence data, accuracies of prediction of genetic value were more than 40% higher than with dense ∼30K SNP chips. At equal high density, the inclusion of the causative mutations yielded an extra increase of accuracy of 2.5–3.7%. Predictions of genetic value remained accurate even when the training and evaluation data were 10 generations apart. Best linear unbiased prediction (BLUP) of SNP effects does not take full advantage of the genome sequence data, and nonlinear predictions, such as the Bayesian method used here, are needed to achieve maximum accuracy. On the basis of theoretical work, the results could be extended to more realistic genome and population sizes.

Genome resequencing technologies, which for simplicity we call genome sequencing even though they are used on species with a reference sequence, are currently developing at a very rapid rate. The current generation sequencing technology is two orders of magnitude faster and more cost effective than the technologies used for the sequencing of the human genome (Shendure and Ji 2008; TenBosch and Grody 2008). Future technologies are expected to reduce cost by another 100-fold so that sequencing an entire human genome for $1000 is considered achievable in the near future (Mardis 2008). The question arises: How can we make best use of entire genome sequence data on many individuals? One use will be the ability to predict the genetic value of an individual for complex traits. In the fields of animal and plant breeding, this would be of great practical benefit because most important traits are complex, quantitative traits, i.e., traits that are affected by many genes and by the environment. In humans the promise of personalized medicine relies on the ability to predict an individual's genetic risk for complex, multifactorial diseases, such as Crohn's disease (Barrett et al. 2008), and the ability to predict response to alternative treatments. The first aim of this article is to explore the accuracy of this prediction using the full genome sequence of the individual.

The use of high-density SNP genotype data to predict genetic value, called genomic selection, was first proposed by Meuwissen et al. (2001). In its most sophisticated form, a Bayesian model was used to predict the effects of thousands of SNPs on the total genetic value simultaneously, where a priori it was assumed that only few SNPs were useful for predicting the trait [because they were in linkage disequilibrium (LD) with mutations causing variation in the trait], while many SNPs were not useful. Even among the SNPs that were useful for prediction, it was assumed that the distribution of effects was not normal because occasionally SNPs were in LD with quantitative trait loci (QTL) of very large effect. To model this, the distribution of SNP effects was assumed to follow a distribution with thicker tails than the normal distribution (e.g., the t-distribution is often used). In the case of whole-genome sequence data, the polymorphisms that are causing the genetic differences between the individuals are among those being analyzed. For the sake of simplicity we call all polymorphisms in the sequence data SNPs while recognizing that other types of polymorphisms such as indels will be included. Assuming that the causal SNPs are included in the analysis simplifies the prior distribution of the SNP effects, because the effects of all the other SNPs, even if they are in LD with the causal SNPs, are expected to disappear. Thus, the prior distribution simplifies to the fact that some SNPs are expected to be causative and have an effect drawn from the distribution of the gene effects. The distribution of gene effects is investigated extensively in the evolutionary and other literature and is reported to be gamma (Hayes and Goddard 2001) or exponentially distributed (Erickson et al. 2004; Rocha et al. 2004), where the latter is a special form of the gamma distribution. On the downside, whole-genome sequence data will contain millions of SNPs and it may be difficult for genomic selection to separate the relatively few causative SNPs from all the others.

Meuwissen et al. (2001) also investigated a model in which all SNPs were assumed to have an effect drawn from the same normal distribution [the so-called genome-wide best linear unbiased prediction (GWBLUP) model]. Although this model seems biologically implausible, it has been found to perform well in data from dairy cattle (VanRaden et al. 2009). However, we hypothesize that with sequence level data the BLUP model will not perform as well as models that assume that only some causal SNPs need to be included in the model.

The aims here are to investigate the following: how accurately genetic values for complex traits can be predicted by genomic selection when whole-genome sequence data are available on a large number of individuals; whether it makes a difference to have the whole-genome sequence available, including the causative mutations, vs. very dense SNP marker genotypes; whether the estimates of the SNP effects can be used on individuals that are many generations separated from the data set in which they were estimated; the effect of the statistical model used on accuracy of prediction; and how accurately causative mutations can be detected and mapped. Because whole-genome sequence data on many individuals are not yet available, and because we needed to know the true genetic values of the individuals, the aforementioned questions were investigated by computer simulations of whole-genome sequence data.

17.

Background

Predicting disease-causative genes (or simply, disease genes) plays a critical role in understanding the genetic basis of human diseases and in providing disease treatment guidelines. Various computational methods have been proposed for disease gene prediction, and the increasing availability of biological information about genes provides strong motivation to leverage these data sources and extract useful information for accurately predicting disease genes.

Results

We present an integrative framework called N2VKO to predict disease genes. Firstly, we learn the node embeddings from protein-protein interaction (PPI) network for genes by adapting the well-known representation learning method node2vec. Secondly, we combine the learned node embeddings with various biological annotations as rich feature representation for genes, and subsequently build binary classification models for disease gene prediction. Finally, as the data for disease gene prediction is usually imbalanced (i.e. the number of the causative genes for a specific disease is much less than that of its non-causative genes), we further address this serious data imbalance issue by applying oversampling techniques for imbalance data correction to improve the prediction performance. Comprehensive experiments demonstrate that our proposed N2VKO significantly outperforms four state-of-the-art methods for disease gene prediction across seven diseases.

Conclusions

In this study, we show that node embeddings learned from PPI networks work well for disease gene prediction, while integrating node embeddings with other biological annotations further improves the performance of classification models. Moreover, oversampling techniques for imbalance correction further enhance the prediction performance. In addition, the literature search of predicted disease genes also shows the effectiveness of our proposed N2VKO framework for disease gene prediction.
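The imbalance-correction step mentioned in this abstract can be illustrated with the simplest possible variant. The sketch below performs plain random oversampling, duplicating minority-class samples until every class matches the largest one. The abstract does not say which oversampling technique N2VKO uses, so this is a stand-in chosen for brevity, and `random_oversample` and its arguments are hypothetical names.

```python
import random


def random_oversample(features, labels, seed=0):
    """Duplicate minority-class samples at random until classes are balanced.

    Plain random oversampling, used here only as an illustration; it is
    an assumption, not the method described in the abstract.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    # Every class is grown to the size of the largest class.
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


if __name__ == "__main__":
    # Toy data: four non-disease genes (0) and one disease gene (1).
    X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
    y = [0, 0, 0, 0, 1]
    Xb, yb = random_oversample(X, y)
    print(yb.count(0), yb.count(1))
```

Oversampling should be applied to the training split only; duplicating samples before splitting would leak copies of the same gene into the evaluation data.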

18.
The clinical utility of family history and genetic tests is generally well understood for simple Mendelian disorders and rare subforms of complex diseases that are directly attributable to highly penetrant genetic variants. However, little is presently known regarding the performance of these methods in situations where disease susceptibility depends on the cumulative contribution of multiple genetic factors of moderate or low penetrance. Using quantitative genetic theory, we develop a model for studying the predictive ability of family history and single nucleotide polymorphism (SNP)-based methods for assessing risk of polygenic disorders. We show that family history is most useful for highly common, heritable conditions (e.g., coronary artery disease), where it explains roughly 20%–30% of disease heritability, on par with the most successful SNP models based on associations discovered to date. In contrast, we find that for diseases of moderate or low frequency (e.g., Crohn disease) family history accounts for less than 4% of disease heritability, substantially lagging behind SNPs in almost all cases. These results indicate that, for a broad range of diseases, already identified SNP associations may be better predictors of risk than their family history-based counterparts, despite the large fraction of missing heritability that remains to be explained. Our model illustrates the difficulty of using either family history or SNPs for standalone disease prediction. On the other hand, we show that, unlike family history, SNP-based tests can reveal extreme likelihood ratios for a relatively large percentage of individuals, thus providing potentially valuable adjunctive evidence in a differential diagnosis.

19.
SUMMARY: The conditional autoregressive (CAR) model is widely used to describe the geographical distribution of a specific disease risk in lattice mapping. Successful developments based on frequentist and Bayesian procedures have been extensively applied to obtain two-stage disease risk predictions at the subregional level. Bayesian procedures are preferred for making inferences, as the posterior standard errors (SE) of the two-stage prediction account for the variability in the variance component estimates; however, some recent work based on frequentist procedures and the use of bootstrap adjustments for the SE has been undertaken. In this article we investigate the suitability of an analytical adjustment for disease risk inference that provides accurate interval predictions by using the penalized quasilikelihood (PQL) technique to obtain model parameter estimates. The method is a first-order approximation of the naive SE based on a Taylor expansion and is interpreted as a conditional measure of variability providing conditional calibrated prediction intervals, given the data. We conduct a simulation study to demonstrate how the method can be used to estimate the specific subregion risk by interval. We evaluate the proposed methodology by analyzing the commonly used example data set of lip cancer incidence in the 56 counties of Scotland for the period 1975-1980. This evaluation reveals a close similarity between the solutions provided by the method proposed here and those of its fully Bayesian counterpart.

20.
Chen J  Lin D  Hochner H 《Biometrics》2012,68(3):869-877
Summary: Case-control mother-child pair design represents a unique advantage for dissecting genetic susceptibility of complex traits because it allows the assessment of both maternal and offspring genetic compositions. This design has been widely adopted in studies of obstetric complications and neonatal outcomes. In this work, we developed an efficient statistical method for evaluating joint genetic and environmental effects on a binary phenotype. Using a logistic regression model to describe the relationship between the phenotype and maternal and offspring genetic and environmental risk factors, we developed a semiparametric maximum likelihood method for the estimation of odds ratio association parameters. Our method is novel because it exploits two unique features of the study data for the parameter estimation. First, the correlation between maternal and offspring SNP genotypes can be specified under the assumptions of random mating, Hardy-Weinberg equilibrium, and Mendelian inheritance. Second, environmental exposures are often not affected by offspring genes conditional on maternal genes. Our method yields more efficient estimates compared with the standard prospective method for fitting logistic regression models to case-control data. We demonstrated the performance of our method through extensive simulation studies and the analysis of data from the Jerusalem Perinatal Study.
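The first feature named in this abstract, that the maternal and offspring genotype correlation follows from random mating, Hardy-Weinberg equilibrium, and Mendelian inheritance, can be made concrete for a single biallelic SNP. The sketch below is a textbook calculation under those three assumptions, not the authors' estimator; the function names and the allele-frequency argument `p` are illustrative.

```python
def hwe(p):
    """Genotype probabilities (0, 1, 2 copies of the minor allele) under
    Hardy-Weinberg equilibrium with minor allele frequency p."""
    q = 1.0 - p
    return [q * q, 2.0 * p * q, p * p]


def offspring_given_mother(m, p):
    """P(offspring genotype = 0, 1, 2 | maternal genotype m).

    Assumes Mendelian transmission from the mother and, under random
    mating and HWE, a paternal allele drawn with frequency p.
    """
    if m not in (0, 1, 2) or not 0.0 <= p <= 1.0:
        raise ValueError("m must be 0, 1 or 2 and p must lie in [0, 1]")
    t = m / 2.0  # probability that the mother transmits the minor allele
    return [
        (1.0 - t) * (1.0 - p),          # neither parent transmits it
        t * (1.0 - p) + (1.0 - t) * p,  # exactly one parent transmits it
        t * p,                          # both parents transmit it
    ]


def offspring_marginal(p):
    """Offspring genotype distribution after averaging over mothers."""
    mothers = hwe(p)
    return [
        sum(mothers[m] * offspring_given_mother(m, p)[c] for m in range(3))
        for c in range(3)
    ]


if __name__ == "__main__":
    print(offspring_given_mother(1, 0.3))
    print(offspring_marginal(0.3), hwe(0.3))
```

Averaging the conditional distribution over mothers recovers the Hardy-Weinberg proportions, which is a quick consistency check on the conditional probabilities.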
