Similar literature (20 results)
1.
Inherited genetic variation contributes to individual risk for many complex diseases and is increasingly being used for predictive patient stratification. Previous work has shown that genetic factors are not equally relevant to human traits across age and other contexts, though the reasons for such variation are not clear. Here, we introduce methods to infer the form of the longitudinal relationship between genetic relative risk for disease and age and to test whether all genetic risk factors behave similarly. We use a proportional hazards model within an interval-based censoring methodology to estimate age-varying individual variant contributions to genetic relative risk for 24 common diseases within the British ancestry subset of UK Biobank, applying a Bayesian clustering approach to group variants by their relative risk profile over age and permutation tests for age dependency and multiplicity of profiles. We find evidence for age-varying relative risk profiles in nine diseases, including hypertension, skin cancer, atherosclerotic heart disease, hypothyroidism and calculus of gallbladder, several of which show evidence, albeit weak, for multiple distinct profiles of genetic relative risk. The predominant pattern shows genetic risk factors having the greatest relative impact on risk of early disease, with a monotonic decrease over time, at least for the majority of variants, although the magnitude and form of the decrease varies among diseases. As a consequence, for diseases where genetic relative risk decreases over age, genetic risk factors have stronger explanatory power among younger populations, compared to older ones. We show that these patterns cannot be explained by a simple model involving the presence of unobserved covariates such as environmental factors. We discuss possible models that can explain our observations and the implications for genetic risk prediction.

2.

Background

After years of implementing Roll Back Malaria (RBM) interventions, the changing landscape of malaria in terms of risk factors and spatial pattern has not been fully investigated. This paper uses the 2010 malaria indicator survey data to investigate if known malaria risk factors remain relevant after many years of interventions.

Methods

We adopted a structured additive logistic regression model that allowed for spatial correlation, to more realistically estimate malaria risk factors. Our model included child and household level covariates, as well as climatic and environmental factors. Continuous variables were modelled by assuming second order random walk priors, while spatial correlation was specified as a Markov random field prior, with fixed effects assigned diffuse priors. Inference was fully Bayesian resulting in an under five malaria risk map for Malawi.
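The second-order random walk prior mentioned above can be illustrated with a minimal sketch. It assumes equally spaced covariate values and the common formulation in which each second difference f[i] - 2 f[i+1] + f[i+2] is Gaussian with mean zero and variance tau2; the function name rw2_penalty and the parameter tau2 are hypothetical and are not taken from the paper.

```python
def rw2_penalty(f, tau2=1.0):
    """Negative log-density (up to an additive constant) of a second-order
    random walk prior on the values f, assuming equally spaced knots.

    Assumption: each second difference is N(0, tau2), so the penalty is
    the sum of squared second differences divided by 2 * tau2.
    """
    second_diffs = [f[i] - 2.0 * f[i + 1] + f[i + 2] for i in range(len(f) - 2)]
    return sum(d * d for d in second_diffs) / (2.0 * tau2)


# A straight line has zero second differences, so it is not penalized.
linear = [0.5 * i + 1.0 for i in range(10)]
# An alternating sequence has large second differences, so it is penalized heavily.
wiggly = [(-1.0) ** i for i in range(10)]

print(rw2_penalty(linear))
print(rw2_penalty(wiggly))
```

Under this assumed form, the prior leaves linear trends unpenalized and shrinks rough functions, which is why it is used for smooth nonlinear effects of continuous covariates.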

Results

Malaria risk increased with increasing age of the child. With respect to socio-economic factors, the greater the household wealth, the lower the malaria prevalence. A general decline in malaria risk was observed as altitude increased. Minimum temperatures and average total rainfall in the three months preceding the survey did not show a strong association with disease risk.

Conclusions

The structured additive regression model offered a flexible extension to standard regression models by enabling simultaneous modelling of possible nonlinear effects of continuous covariates, spatial correlation and heterogeneity, while estimating usual fixed effects of categorical and continuous observed variables. Our results confirmed that malaria epidemiology is a complex interaction of biotic and abiotic factors at the individual, household and community levels, and that risk factors are still relevant many years after extensive implementation of RBM activities.

3.
Disentangling the relative contributions of selective and neutral processes underlying phenotypic and genetic variation under natural, environmental conditions remains a central challenge in evolutionary ecology. However, much of the variation that could be informative in this area of research is likely to be cryptic in nature; thus, the identification of wild populations suitable for study may be problematic. We use a landscape genetics approach to identify such populations of three-spined stickleback inhabiting the Saint Lawrence River estuary. We sampled 1865 adult fish over multiple years. Individuals were genotyped for nine microsatellite loci, and georeferenced multilocus data were used to infer population groupings, as well as locations of genetic discontinuities, under a Bayesian model framework (Geneland). We modelled environmental data using nonparametric multiple regression to explain genetic differentiation as a function of spatio-ecological effects. Additionally, we used genotype data to estimate dispersal and gene flow to parameterize a simple model predicting adaptive vs. plastic divergence between demes. We demonstrate a bipartite division of the genetic landscape into freshwater and maritime zones, independent of geographical distance. Moreover, we show that the greatest proportion of genetic variation (31.5%) is explained by environmental differences. However, the potential for either adaptive or plastic divergence between demes is highly dependent upon the strength of migration and selection. Consequently, we highlight the utility of landscape genetics as a tool for hypothesis generation and experimental design, to identify focal populations and putative selection gradients, in order to distinguish between phenotypic plasticity and local adaptation.

4.
We propose a semiparametric mean residual life mixture cure model for right-censored survival data with a cured fraction. The model employs the proportional mean residual life model to describe the effects of covariates on the mean residual time of uncured subjects and the logistic regression model to describe the effects of covariates on the cure rate. We develop estimating equations to estimate the proposed cure model for right-censored data with and without length-biased sampling; such sampling is often found in prevalent cohort studies. In particular, we propose two estimating equations to estimate the effects of covariates on the cure rate and a method to combine them to improve the estimation efficiency. The consistency and asymptotic normality of the proposed estimates are established. The finite sample performance of the estimates is confirmed with simulations. The proposed estimation methods are applied to a clinical trial study on melanoma and a prevalent cohort study on early-onset type 2 diabetes mellitus.
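The structure of the model in this abstract can be sketched in equations. The notation below is assumed for illustration and is not taken from the paper: T is the survival time, x and z are covariate vectors, \pi(z) is the probability of being uncured (so the cure rate is 1 - \pi(z)), S_u is the survival function of uncured subjects, and m_0 is a baseline mean residual life function.

```latex
% Assumed notation; a sketch of a mixture cure model with a
% proportional mean residual life component for uncured subjects.
\begin{align}
  S_{\mathrm{pop}}(t \mid x, z) &= 1 - \pi(z) + \pi(z)\, S_u(t \mid x), \\
  \operatorname{logit} \pi(z) &= \gamma^{\top} z, \\
  m(t \mid x) &= \mathrm{E}\left[\, T - t \mid T > t,\ x,\ \text{uncured} \,\right]
               = m_0(t)\, \exp\!\left(\beta^{\top} x\right).
\end{align}
```

Modelling the uncured probability or the cure rate on the logit scale is equivalent up to the sign of \gamma, so either convention would match the logistic component described above.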

5.

Background

Neonatal mortality contributes a large proportion towards early childhood mortality in developing countries, with considerable geographical variation at small areas within countries.

Methods

A geo-additive logistic regression model is proposed for quantifying small-scale geographical variation in neonatal mortality and for estimating risk factors of neonatal mortality. Random effects are introduced to capture spatial correlation and heterogeneity. Spatial correlation can be modelled using Markov random fields (MRF) when data are aggregated, or two-dimensional P-splines when exact locations are available, whereas the unstructured spatial effects are assigned an independent Gaussian prior. Socio-economic and bio-demographic factors which may affect the risk of neonatal mortality are simultaneously estimated as fixed effects and as nonlinear effects for continuous covariates. The smooth effects of continuous covariates are modelled by second-order random walk priors. Modelling and inference use an empirical Bayesian approach via penalized likelihood. The methodology is applied to analyse the likelihood of neonatal deaths, using data from the 2000 Malawi demographic and health survey. The spatial effects are quantified through MRF and two-dimensional P-spline priors.

Results

Findings indicate that both fixed and spatial effects are associated with neonatal mortality.

Conclusions

Our study, therefore, suggests that the challenge of reducing neonatal mortality goes beyond addressing individual factors and also requires understanding unmeasured covariates in order to design potentially effective interventions.

6.
Recent studies have implicated folic acid as an important determinant of normal human growth, development, and function. Insufficient folate levels appear to be a risk factor for neural tube defects (NTD), as well as for several chronic diseases of adulthood. However, relatively little is known about the factors that influence folate status in the general population. To estimate the relative contribution of genetic and nongenetic factors to variation in folate, we have evaluated red blood cell (RBC) folate levels in 440 pairs of MZ twins and in 331 pairs of DZ twins. The data were best described by a model in which 46% of the variance in RBC folate was attributable to additive genetic effects, 16% of the variance was due to measured phenotypic covariates, and 38% of the variance was due to random environmental effects. Moreover, the correlations for RBC folate in MZ co-twins (r = .46) and in repeat measures from the same individual (r = .51) were very similar, indicating that virtually all repeatable variation in RBC folate is attributable to genetic factors. On the basis of these results, it would seem reasonable to initiate a search for the specific genes that influence RBC folate levels in the general population. Such genes ultimately may be used to identify individuals at increased risk for NTD and other folate-related diseases.

7.
Spatial and environmental heterogeneity are major factors in structuring species distributions in alpine landscapes. These landscapes have also been affected by glacial advances and retreats, causing alpine taxa to undergo range shifts and demographic changes. These nonequilibrium population dynamics have the potential to obscure the effects of environmental factors on the distribution of genetic variation. Here, we investigate how demographic change and environmental factors influence genetic variation in the alpine butterfly Colias behrii. Data from 14 microsatellite loci provide evidence of bottlenecks in all population samples. We test several alternative models of demography using approximate Bayesian computation (ABC), with the results favouring a model in which a recent bottleneck precedes rapid population growth. Applying independent calibrations to microsatellite loci and a nuclear gene, we estimate that this bottleneck affected both northern and southern populations 531–281 years ago, coinciding with a period of global cooling. Using regression approaches, we attempt to separate the effects of population structure, geographical distance and landscape on patterns of population genetic differentiation. Only 40% of the variation in FST is explained by these models, with geographical distance and least‐cost distance among meadow patches selected as the best predictors. Various measures of genetic diversity within populations are also decoupled from estimates of local abundance and habitat patch characteristics. Our results demonstrate that demographic change can have a disproportionate influence on genetic diversity in alpine species, contrasting with other studies that suggest landscape features control contemporary demographic processes in high‐elevation environments.

8.
Spatial statistics has seen rapid application in many fields, especially epidemiology and public health. Many studies, nonetheless, make limited use of geographical location information and usually assume that the covariates related to the response variable have linear effects. We develop a Bayesian semi-parametric regression model for HIV prevalence data. Model estimation and inference are based on a fully Bayesian approach via Markov chain Monte Carlo (MCMC). The model is applied to HIV prevalence data among men in Kenya, derived from the Kenya AIDS indicator survey, with n = 3,662. Past studies have concluded that HIV infection has a nonlinear association with age. In this study a smooth function based on penalized regression splines is used to estimate this nonlinear effect. Other covariates were assumed to have a linear effect. Spatial references to the counties were modeled as both structured and unstructured spatial effects. We observe that circumcision reduces the risk of HIV infection. The results also indicate that men in urban areas were more likely to be infected with HIV than their rural counterparts. Men with higher education had the lowest risk of HIV infection. A nonlinear relationship between HIV infection and age was established: the risk of HIV infection increases with age up to the age of 40 and then declines. Men who had an STI in the last 12 months were more likely to be infected with HIV. Men who had ever used a condom were also found to be more likely to be infected with HIV. A significant spatial variation of HIV infection in Kenya was also established. The study shows the practicality and flexibility of the Bayesian semi-parametric regression model in analyzing epidemiological data.

9.
The space-time pattern and environmental drivers (land cover, climate) of bovine anaplasmosis in the Midwestern state of Kansas were retrospectively evaluated using Bayesian hierarchical spatio-temporal models and publicly available, remotely-sensed environmental covariate information. Cases of bovine anaplasmosis positively diagnosed at Kansas State Veterinary Diagnostic Laboratory (n = 478) between 2005 and 2013 were used to construct the models, which included random effects for space, time and space-time interaction with defined priors, and fixed-effect covariates selected a priori using a univariate screening procedure. The Bayesian posterior median and 95% credible intervals for the space-time interaction term in the best-fitting covariate model indicated a steady progression of bovine anaplasmosis over time and geographic area in the state. Posterior median estimates and 95% credible intervals derived for covariates in the final covariate model indicated land surface temperature (minimum), relative humidity and diurnal temperature range to be important risk factors for bovine anaplasmosis in the study. Model performance, measured using the Area Under the Curve (AUC), was good for the covariate model (> 0.7). The relevance of climatological factors for bovine anaplasmosis is discussed.

10.
Bayesian hierarchical models usually model the risk surface on the same arbitrary geographical units for all data sources. Poisson/gamma random field models overcome this restriction, as the underlying risk surface can be specified independently of the resolution of the data. Moreover, covariates may be considered as either excess or relative risk factors. We compare the performance of the Poisson/gamma random field model to the Markov random field (MRF)‐based ecologic regression model and the Bayesian Detection of Clusters and Discontinuities (BDCD) model, in both a simulation study and a real data example. We find the BDCD model to have advantages in situations dominated by abruptly changing risk, while the Poisson/gamma random field model stands out for its flexibility in estimating random field structures and in incorporating covariates. The MRF‐based ecologic regression model is inferior. WinBUGS code for Poisson/gamma random field models is provided.

11.
Capture-recapture models were developed to estimate survival using data arising from marking and monitoring wild animals over time. Variation in survival may be explained by incorporating relevant covariates. We propose nonparametric and semiparametric regression methods for estimating survival in capture-recapture models. A fully Bayesian approach using Markov chain Monte Carlo simulations was employed to estimate the model parameters. The work is illustrated by a study of Snow petrels, in which survival probabilities are expressed as nonlinear functions of a climate covariate, using data from a 40-year study on marked individuals, nesting at Petrels Island, Terre Adélie.

12.
Shen Y, Cheng SC. Biometrics 1999, 55(4):1093-1100
In the context of competing risks, the cumulative incidence function is often used to summarize the cause-specific failure-time data. As an alternative to the proportional hazards model, the additive risk model is used to investigate covariate effects by specifying that the subject-specific hazard function is the sum of a baseline hazard function and a regression function of covariates. Based on such a formulation, we present an approach to constructing simultaneous confidence intervals for the cause-specific cumulative incidence function of patients with given risk factors. A melanoma data set is used for the purpose of illustration.
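The contrast drawn in this abstract can be written out as follows. This is a sketch under assumed notation, not the paper's own formulation: k indexes the cause of failure, Z is the covariate vector, \lambda_{0k} is a cause-specific baseline hazard, and the regression function is taken to be linear, \beta_k^{\top} Z, which is one common choice.

```latex
% Assumed notation; the linear regression function is an illustrative choice.
\begin{align}
  \text{proportional hazards:}\quad
    \lambda_k(t \mid Z) &= \lambda_{0k}(t)\, \exp\!\left(\beta_k^{\top} Z\right), \\
  \text{additive risk:}\quad
    \lambda_k(t \mid Z) &= \lambda_{0k}(t) + \beta_k^{\top} Z, \\
  \text{cumulative incidence:}\quad
    F_k(t \mid Z) &= \int_0^t
      \exp\!\left\{ -\sum_{j} \int_0^u \lambda_j(s \mid Z)\, ds \right\}
      \lambda_k(u \mid Z)\, du .
\end{align}
```

The cumulative incidence for cause k depends on the hazards of all causes through the overall survival term inside the integral, which is why it is not a simple transform of \lambda_k alone.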

13.

Background

Atherosclerotic peripheral arterial disease (PAD) affects 8–10 million people in the United States and is associated with a marked impairment in quality of life and an increased risk of cardiovascular events. Noninvasive assessment of PAD is performed by measuring the ankle-brachial index (ABI). Complex traits, such as ABI, are influenced by a large array of genetic and environmental factors and their interactions. We attempted to characterize the genetic architecture of ABI by examining the main and interactive effects of individual single nucleotide polymorphisms (SNPs) and conventional risk factors.

Methods

We applied linear regression analysis to investigate the association of 435 SNPs in 112 positional and biological candidate genes with ABI and related physiological and biochemical traits in 1046 non-Hispanic white, hypertensive participants from the Genetic Epidemiology Network of Arteriopathy (GENOA) study. The main effects of each SNP, as well as SNP-covariate and SNP-SNP interactions, were assessed to investigate how they contribute to the inter-individual variation in ABI. Multivariable linear regression models were then used to assess the joint contributions of the top SNP associations and interactions to ABI after adjustment for covariates. We reduced the chance of false positives by 1) correcting for multiple testing using the false discovery rate, 2) internal replication, and 3) four-fold cross-validation.

Results

When the results from these three procedures were combined, only two SNP main effects in NOS3, three SNP-covariate interactions (ADRB2 Gly 16 – lipoprotein(a) and SLC4A5 – diabetes interactions), and 25 SNP-SNP interactions (involving SNPs from 29 different genes) were significant, replicated, and cross-validated. Combining the top SNPs, risk factors, and their interactions into a model explained nearly 18% of variation in ABI in the sample. SNPs in six genes (ADD2, ATP6V1B1, PRKAR2B, SLC17A2, SLC22A3, and TGFB3) also influenced triglycerides, C-reactive protein, homocysteine, and lipoprotein(a) levels.

Conclusion

We found that candidate gene SNP main effects, SNP-covariate and SNP-SNP interactions contribute to the inter-individual variation in ABI, a marker of PAD. Our findings underscore the importance of conducting systematic investigations that consider context-dependent frameworks for developing a deeper understanding of the multidimensional genetic and environmental factors that contribute to complex diseases.

14.
Species distribution models (SDMs) are a common approach to describing species’ space-use and spatially-explicit abundance. With a myriad of model types, methods and parameterization options available, it is challenging to make informed decisions about how to build robust SDMs appropriate for a given purpose. One key component of SDM development is the appropriate parameterization of covariates, such as the inclusion of covariates that reflect underlying processes (e.g. abiotic and biotic covariates) and covariates that act as proxies for unobserved processes (e.g. space and time covariates). It is unclear how different SDMs apportion variance among a suite of covariates, and how parameterization decisions influence model accuracy and performance. To examine trade-offs in covariate parameterization in SDMs, we explore the attribution of spatiotemporal and environmental variation across a suite of SDMs. We first used simulated species distributions with known environmental preferences to compare three types of SDM: a machine learning model (boosted regression tree), a semi-parametric model (generalized additive model) and a spatiotemporal mixed-effects model (vector autoregressive spatiotemporal model, VAST). We then applied the same comparative framework to a case study with three fish species (arrowtooth flounder, Pacific cod and walleye pollock) in the eastern Bering Sea, USA. Model type and covariate parameterization both had significant effects on model accuracy and performance. We found that including either spatiotemporal or environmental covariates typically reproduced patterns of species distribution and abundance across the three models tested, but model accuracy and performance were maximized when including both spatiotemporal and environmental covariates in the same model framework. Our results reveal trade-offs in the current generation of SDM tools between accurately estimating species abundance, accurately estimating spatial patterns, and accurately quantifying underlying species–environment relationships. These comparisons between model types and parameterization options can help SDM users better understand sources of model bias and estimate error.

15.
Lifestyle and genetic factors play a large role in the development of Type 2 Diabetes (T2D). Despite the important role of genetic factors, genetic information is not incorporated into the clinical assessment of T2D risk. We assessed and compared Whole Genome Regression methods to predict the T2D status of 5,245 subjects from the Framingham Heart Study. To evaluate each method, we constructed the following set of regression models. A clinical baseline model (CBM) included non-genetic covariates only. CBM was extended by adding the first two marker-derived principal components and 65 SNPs identified by a recent GWAS consortium for T2D (M-65SNPs). Subsequently, it was further extended by adding 249,798 genome-wide SNPs from a high-density array. The Bayesian models used to incorporate genome-wide marker information as predictors were Bayes A, Bayes Cπ, Bayesian LASSO (BL), and the Genomic Best Linear Unbiased Prediction (G-BLUP). Results included estimates of the genetic variance and heritability, genetic scores for T2D, and predictive ability evaluated in a 10-fold cross-validation. The predictive AUC estimates for CBM and M-65SNPs were 0.668 and 0.684, respectively. We found evidence of a contribution of genetic effects to T2D, as reflected in the genomic heritability estimates (0.492±0.066). The highest predictive AUC among the genome-wide marker Bayesian models was 0.681 for the Bayesian LASSO. Overall, the improvement in predictive ability was moderate and did not differ greatly among models that included genetic information. Approximately 58% of the total number of genetic variants was found to contribute to the overall genetic variation, indicating a complex genetic architecture for T2D. Our results suggest that the Bayes Cπ and the G-BLUP models with a large set of genome-wide markers could be used for predicting risk of T2D, as an alternative to using high-density arrays when selected markers from large consortiums for a given complex trait or disease are unavailable.

16.
Factors influencing Soay sheep survival: a Bayesian analysis (cited by 1: 0 self-citations, 1 citation by others)
King R, Brooks SP, Morgan BJ, Coulson T. Biometrics 2006, 62(1):211-220
This article presents a Bayesian analysis of mark-recapture-recovery data on Soay sheep. A reversible jump Markov chain Monte Carlo technique is used to determine age classes of common survival, and to model the survival probabilities in those classes using logistic regression. This involves environmental and individual covariates, as well as random effects. Auxiliary variables are used to impute missing covariates measured on individual sheep. The Bayesian approach suggests different models from those previously obtained using classical statistical methods. Following model averaging, features that were not previously detected, and which are of ecological importance, are identified.

17.
In survival models, some covariates affecting the lifetime cannot be observed or measured. These covariates may correspond to environmental or genetic factors and can be treated as a random effect, related to a frailty of the individuals, that explains their survival times. We propose a methodology based on a Birnbaum–Saunders frailty regression model, which can be applied to censored or uncensored data. Maximum‐likelihood methods are used to estimate the model parameters and to derive local influence techniques. Diagnostic tools are important in regression to detect anomalies, such as departures from error assumptions and the presence of outliers and influential cases. Normal curvatures for local influence under different perturbations are computed and two types of residuals are introduced. Two examples with uncensored and censored real‐world data illustrate the proposed methodology. Comparison with classical frailty models is carried out in these examples, which shows the superiority of the proposed model.

18.
Elucidating the factors influencing genetic differentiation is an important task in biology, and the relative contribution from natural selection and genetic drift has long been debated. In this study, we used a regression-based approach to simultaneously estimate the quantitative contributions of environmental adaptation and isolation by distance to genetic variation in Boechera stricta, a wild relative of Arabidopsis. Patterns of discrete and continuous genetic differentiation coexist within this species. For the discrete differentiation between two major genetic groups, environment makes a larger contribution than geography, and we also identified a significant environment-by-geography interaction effect. Elsewhere in the species range, we found a latitudinal cline of genetic variation reflecting only isolation by distance. To further confirm the effect of environmental selection on genetic divergence, we identified the specific environmental variables predicting local genotypes in allopatric and sympatric regions. Water availability was identified as the possible cause of differential local adaptation in both geographical regions, confirming the role of environmental adaptation in driving and maintaining genetic differentiation between the two major genetic groups. In addition, the environment-by-geography interaction is further confirmed by the finding that water availability is represented by different environmental factors in the allopatric and sympatric regions. In conclusion, this study shows that geographical and environmental factors together created stronger and more discrete genetic differentiation than isolation by distance alone, which only produced a gradual, clinal pattern of genetic variation. These findings emphasize the importance of environmental selection in shaping patterns of species-wide genetic variation in the natural environment.

19.
Genes, environment, and the interaction between them are each known to play an important role in the risk for developing complex diseases such as metabolic syndrome. For environmental factors, most studies have focused on measurements observed at the individual level, and therefore can only consider the gene-environment interaction at the same individual scale. Yet group-level (contextual) environmental variables, such as community factors and the degree of local area development, may modify the genetic effect as well. To examine such cross-level interaction between genes and contextual factors, a flexible statistical model quantifying the variability of the genetic effects across different categories of the contextual variable is needed. With a Bayesian generalized linear mixed-effects model with an unconditional likelihood, we investigate whether the individual genetic effect is modified by the group-level residential environment factor in a matched case-control metabolic syndrome study. Such cross-level interaction is evaluated by examining the heterogeneity in allelic effects under various contextual categories, based on posterior samples from Markov chain Monte Carlo methods. The Bayesian analysis indicates that the effect of rs1801282 on metabolic syndrome development is modified by the contextual environmental factor. That is, even among individuals with the same genetic component of PPARG_Pro12Ala, living in a residential area with low availability of exercise facilities may result in higher risk. The modification of individual genetic attributes by group-level environmental factors can be essential, and this Bayesian model is able to provide a quantitative assessment of such cross-level interaction. The Bayesian inference based on the full likelihood is flexible with any phenotype, and easy to implement computationally. This model has wide applicability and may help unravel the complexity in the development of complex diseases.

20.
We assessed the relative roles of natural covariates, human disturbance (water quality and catchment land use) together with geography in driving variation in aquatic macrophyte community composition, richness and status among 101 lakes in southern and central Finland. In addition to all species together, we studied different growth forms (i.e. emergent and submerged macrophytes and aquatic bryophytes) separately. Partial redundancy analysis (taxonomic composition) and partial least-squares regression (species richness and status index) were employed to display the share of variability in macrophyte assemblages that was attributable to the environmental factors (both natural and human-affected) and the spatial filters generated through principal coordinates of neighbor matrices (PCNM). Macrophyte community composition, richness and status were explained by natural covariates, together with joint effects of human disturbance variables and space. The contributions of pure fractions of human disturbance and space were mostly modest, albeit variable among macrophyte groups and status indices. Alkalinity, historical distributions, colour, dynamic ratio and lake area were the most important natural covariates for macrophytes. Of the variables influenced by humans, conductivity, total phosphorus, turbidity and chlorophyll-a explained the most variation in macrophytes. Our results demonstrate, as expected, that macrophytes are dominantly affected by local environmental variables, whereas dispersal-related processes seem not to be important at the regional extent. The response of macrophyte growth forms to environment and space, however, varied significantly. Community composition and richness of emergent macrophytes showed a congruent response to natural covariates and human disturbance. Aquatic bryophytes, which are rarely studied alongside other lake macrophytes, responded more strongly than other growth forms to human disturbance. Contrary to our expectations, ecological indices were not affected by dispersal-related processes, but were mainly explained by natural covariates. This study is the first to investigate spatial patterns in bioassessment derived from aquatic macrophytes. Geographical structuring of environmental variables and regional extent negatively affected indices, suggesting that ecological status assessment needs further development.
