Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Different methods have been developed to consider the effects of statistical associations among genes that arise in population genetics models: kin selection models deal with associations among genes present in different interacting individuals, while multilocus models deal with associations among genes at different loci. It was pointed out recently that these two types of models are very similar in essence. In this paper, we present a method to construct multilocus models in the infinite island model of population structure (where deme size may be arbitrarily small). This method allows one to compute recursions on allele frequencies, and different types of genetic associations (including associations between different individuals from the same deme), and incorporates selection. Recursions can be simplified using quasi-equilibrium approximations; however, we show that quasi-equilibrium calculations for associations that are different from zero under neutrality must include a term that has not been previously considered. The method is illustrated using simple examples.

2.
Recent advances in statistical software have led to the rapid diffusion of new methods for modelling longitudinal data. Multilevel (also known as hierarchical or random effects) models for binary outcomes have generally been based on a logistic-normal specification, by analogy with earlier work for normally distributed data. The appropriate application and interpretation of these models remains somewhat unclear, especially when compared with the computationally more straightforward semiparametric or 'marginal' modelling (GEE) approaches. In this paper we pose two interrelated questions. First, what limits should be placed on the interpretation of the coefficients and inferences derived from random-effect models involving binary outcomes? Second, what diagnostic checks are appropriate for evaluating whether such random-effect models provide adequate fits to the data? We address these questions by means of an extended case study using data on adolescent smoking from a large cohort study. Bayesian estimation methods are used to fit a discrete-mixture alternative to the standard logistic-normal model, and posterior predictive checking is used to assess model fit. Surprising parallels in the parameter estimates from the logistic-normal and mixture models are described and used to question the interpretability of the so-called 'subject-specific' regression coefficients from the standard multilevel approach. Posterior predictive checks suggest a serious lack of fit of both multilevel models. The results do not provide final answers to the two questions posed, but we expect that lessons learned from the case study will provide general guidance for further investigation of these important issues.

3.
The analyses of observational longitudinal studies involving concurrent changes in treatment and medical conditions present difficulties because of the multitude of directions of potential relationships: past medication influences current symptoms; past symptoms influence current medication; and current medication is associated with current symptoms. In the context of a long-term study of non-randomized pharmacological treatment of schizophrenic relapse, we present an analysis of bivariate discrete-time transitional data with binary responses in an attempt to understand the transitional and concurrent relationships between schizophrenia relapse and medication use. A naive analysis does not show any association between previous medication and current relapse. However, we provide evidence suggesting that current treatment may impact current relapse for those who have previously taken medication, but not for those who haven't taken medication in the past. When univariate models are specified to assess these associations, the bivariate nature of the problem requires a choice of which response, relapse or medication, should be the dependent variable. In this case, the choice of relapse or medication as a dependent variable does matter. Hence, our results derive from models where both relapse and medication are treated as dependent variables. Specifically, we specify a bivariate log odds ratio for current relapse and current medication use and a separate univariate logit component for each of these outcomes. Each of these components contains transitional associations with previous relapse and medication. Such models represent extensions of univariate transitional association models (e.g. Diggle et al. (1994)) and correspond to bivariate transitional models (e.g. Zeger and Liang (1991)). 
We incorporate changes in transitional associations into the full-data parametric model for final inference, and investigate whether these temporal changes are due to learning effects or the impact of drop-out. We also perform residual analyses and sensitivity analyses in the context of missing data patterns.

4.
Disease mapping of a single disease has been widely studied in the public health setting. Simultaneous modeling of related diseases can also be a valuable tool from both the epidemiological and the statistical point of view. In particular, when we have several measurements recorded at each spatial location, we need to consider multivariate models in order to handle the dependence among the multivariate components as well as the spatial dependence between locations. It is then customary to use multivariate spatial models assuming the same distribution through the entire population density. However, in many circumstances, it is a very strong assumption to have the same distribution for all the areas of population density. To overcome this issue, we propose a hierarchical multivariate mixture generalized linear model to simultaneously analyze spatial Normal and non‐Normal outcomes. As an application of our proposed approach, esophageal and lung cancer deaths in Minnesota are used to show that assuming different distributions for different counties of Minnesota outperforms assuming a single distribution for the population density. Performance of the proposed approach is also evaluated through a simulation study.

5.
Traditionally, the niche of a species is described as a hypothetical 3D space, constituted by well‐known biotic interactions (e.g. predation, competition, trophic relationships, resource–consumer interactions, etc.) and various abiotic environmental factors. Species distribution models (SDMs), also called “niche models” and often used to predict wildlife distribution at landscape scale, are typically constructed using abiotic factors, with biotic interactions generally being ignored. Here, we compared the goodness of fit of SDMs for red‐backed shrike Lanius collurio in farmlands of Western Poland, using both the classical approach (modeled only on environmental variables) and an approach that also included other potentially associated bird species. The potential associations among species were derived from the relevant ecological literature and from a correlation matrix of occurrences. Our findings highlight the importance of including heterospecific interactions in improving our understanding of niche occupation for bird species. We suggest that the suite of measures currently used to quantify realized species niches could be improved by also considering the occurrence of certain associated species. A hypothetical “species 1” can then use the occurrence of a successfully established individual of “species 2” as an indicator or “trace” of the location of available suitable habitat in which to breed. We hypothesize that this kind of biotic interaction constitutes a “heterospecific trace effect” (HTE): an interaction based on the availability and use of “public information” provided by individuals from different species. Finally, we discuss the contribution of biotic interactions to enhancing the predictive capacity of species distribution models.

6.
In addition to the processes structuring free‐living communities, host‐associated microbiota are directly or indirectly shaped by the host. Therefore, microbiota data have a hierarchical structure where samples are nested under one or several variables representing host‐specific factors, often spanning multiple levels of biological organization. Current statistical methods do not accommodate this hierarchical data structure and therefore cannot explicitly account for the effect of the host in structuring the microbiota. We introduce a novel extension of joint species distribution models (JSDMs) which can straightforwardly accommodate and discern between effects such as host phylogeny and traits, recorded covariates such as diet and collection site, among other ecological processes. Our proposed methodology includes powerful yet familiar outputs seen in community ecology overall, including (a) model‐based ordination to visualize and quantify the main patterns in the data; (b) variance partitioning to assess how influential the included host‐specific factors are in structuring the microbiota; and (c) co‐occurrence networks to visualize microbe‐to‐microbe associations.

7.
Cure models are used in time-to-event analysis when not all individuals are expected to experience the event of interest, or when the survival of the considered individuals reaches the same level as the general population. These scenarios correspond to a plateau in the survival and relative survival function, respectively. The main parameters of interest in cure models are the proportion of individuals who are cured, termed the cure proportion, and the survival function of the uncured individuals. Although numerous cure models have been proposed in the statistical literature, there is no consensus on how to formulate these. We introduce a general parametric formulation of mixture cure models and a new class of cure models, termed latent cure (LC) models, together with a general estimation framework and software, which enable fitting of a wide range of different models. Through simulations, we assess the statistical properties of the models with respect to the cure proportion and the survival of the uncured individuals. Finally, we illustrate the models using survival data on colon cancer, which typically display a plateau in the relative survival. As demonstrated in the simulations, mixture cure models that are not guaranteed to be constant after a finite time point tend to produce accurate estimates of the cure proportion and the survival of the uncured. However, these models are very unstable in certain cases due to identifiability issues, whereas LC models generally provide stable results at the price of more biased estimates.
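
For orientation, the plateau described in this abstract can be written down with the textbook mixture cure formulation. This is a sketch under assumed notation (the symbols \pi for the cure proportion, S_u for the survival function of the uncured, S^* for the expected survival in the general population and R for the relative survival do not appear in the abstract), and it is not necessarily the general parametric formulation the authors introduce.

```latex
% Mixture cure model for the all-cause survival function:
S(t) = \pi + (1 - \pi)\, S_u(t),
\qquad \lim_{t \to \infty} S(t) = \pi \quad \text{whenever } S_u(t) \to 0 .

% Relative survival version, where the plateau appears in R(t) rather than S(t):
S(t) = S^{*}(t)\, R(t),
\qquad R(t) = \pi + (1 - \pi)\, S_u(t) .
```

In both versions the cure proportion is the level at which the (relative) survival curve flattens, which is why the two quantities named in the abstract, \pi and S_u, are the natural targets of estimation.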

8.
A vast literature has recently been concerned with the analysis of variation in disease counts recorded across geographical areas with the aim of detecting clusters of regions with homogeneous behavior. Most of the proposed modeling approaches have been discussed for the univariate case, and only very recently have spatial models been extended to predict more than one outcome simultaneously. In this paper we extend the standard finite mixture models to the analysis of multiple, spatially correlated, counts. Dependence among outcomes is modeled using a set of correlated random effects and estimation is carried out by numerical integration through an EM algorithm without assuming any specific parametric distribution for the random effects. The spatial structure is captured by the use of a Gibbs representation for the prior probabilities of component membership through a Strauss‐like model. The proposed model is illustrated using real data (© 2009 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

9.
In this paper we study analytically the stick-slip models recently introduced to explain the stochastic migration of free cells. We show that persistent motion of cells of many different types is compatible with stochastic reorientation models which admit an analytical mesoscopic treatment. This is proved by examining and discussing experimental data compiled from different sources in the literature, and by fitting some of these results too. We are able to explain many of the ‘apparently complex’ migration patterns obtained recently from cell tracking data, like power-law dependences in the mean square displacement or non-Gaussian behavior for the kurtosis and the velocity distributions, which depart from the predictions of the classical Ornstein-Uhlenbeck process.

10.
Large-scale hypothesis testing has become a ubiquitous problem in high-dimensional statistical inference, with broad applications in various scientific disciplines. One relevant application is constituted by imaging mass spectrometry (IMS) association studies, where a large number of tests are performed simultaneously in order to identify molecular masses that are associated with a particular phenotype, for example, a cancer subtype. Mass spectra obtained from matrix-assisted laser desorption/ionization (MALDI) experiments are dependent, when considered as statistical quantities. False discovery proportion (FDP) estimation and control under arbitrary dependency structure among test statistics is an active topic in modern multiple testing research. In this context, we are concerned with the evaluation of associations between the binary outcome variable (describing the phenotype) and multiple predictors derived from MALDI measurements. We propose an inference procedure in which the correlation matrix of the test statistics is utilized. The approach is based on multiple marginal models. Specifically, we fit a marginal logistic regression model for each predictor individually. Asymptotic joint normality of the stacked vector of the marginal regression coefficients is established under standard regularity assumptions, and their (limiting) correlation matrix is estimated. The proposed method extracts common factors from the resulting empirical correlation matrix. Finally, we estimate the realized FDP of a thresholding procedure for the marginal p-values. We demonstrate a practical application of the proposed workflow to MALDI IMS data in an oncological context.

11.
Correlations between heterozygosity and components of fitness have been investigated in natural populations for over 20 years. Positive correlations between a trait of interest and heterozygosity (usually measured at allozyme loci) are generally recognized as evidence of inbreeding depression. More recently, molecular markers such as microsatellites have been employed for the same purpose. A typical study might use around five to ten markers. In this paper we use a panel of 71 microsatellite loci to: (1) Compare the efficacy of heterozygosity and a related microsatellite‐specific variable, mean d², in detecting inbreeding depression; (2) Examine the statistical power of heterozygosity to detect such associations. We performed our analyses in a wild population of red deer (Cervus elaphus) in which inbreeding depression in juvenile traits had previously been detected using a panel of nine markers. We conclude that heterozygosity‐based measures outperform mean d²‐based measures, but that power to detect heterozygosity‐fitness associations is nonetheless low when ten or fewer markers are typed.
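
The two marker-based measures compared in this abstract are simple to compute from genotype data. The sketch below assumes the usual definitions (multilocus heterozygosity as the fraction of typed loci that are heterozygous, and mean d² as the squared difference in repeat units between the two alleles at a locus, averaged over loci). The function names and the four-locus example genotype are hypothetical, and the abstract does not say how the study handled missing loci or scaling.

```python
from typing import List, Tuple

# One (allele_a, allele_b) pair per locus, with allele sizes in repeat units.
Genotype = List[Tuple[int, int]]


def multilocus_heterozygosity(genotype: Genotype) -> float:
    """Fraction of typed loci at which the two alleles differ."""
    if not genotype:
        raise ValueError("genotype must contain at least one locus")
    heterozygous = sum(1 for a, b in genotype if a != b)
    return heterozygous / len(genotype)


def mean_d2(genotype: Genotype) -> float:
    """Squared allele-size difference (in repeat units), averaged over loci."""
    if not genotype:
        raise ValueError("genotype must contain at least one locus")
    return sum((a - b) ** 2 for a, b in genotype) / len(genotype)


# Hypothetical individual typed at four microsatellite loci.
example: Genotype = [(10, 10), (12, 14), (8, 11), (9, 9)]

print(multilocus_heterozygosity(example))  # 2 of 4 loci heterozygous -> 0.5
print(mean_d2(example))                    # (0 + 4 + 9 + 0) / 4 -> 3.25
```

Both measures are zero for a fully homozygous individual, but mean d² additionally weights heterozygous loci by how far apart the two alleles are.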

12.
Why species are found where they are is a central question in biogeography. The most widely used tool for understanding the controls on distribution is species distribution modelling. Species distribution modelling is now a well‐established method in both the theoretical and applied ecological literature. In this special issue we examine the current state of the art in species distribution modelling and explore avenues for including more biological processes in such models. In particular we focus on physiological, demographic, dispersal, competitive and ecological‐modulation processes. This overview highlights opportunities for new species distribution model concepts and developments, as well as a statistical agenda for implementing such models.

13.
Accurate modelling of biological systems requires a deeper and more complete knowledge about the molecular components and their functional associations than we currently have. Traditionally, new knowledge on protein associations generated by experiments has played a central role in systems modelling, in contrast to generally less trusted bio-computational predictions. However, we will not achieve realistic modelling of complex molecular systems if the current experimental designs lead to biased screenings of real protein networks and leave large, functionally important areas poorly characterised. To assess the likelihood of this, we have built comprehensive network models of the yeast and human proteomes by using a meta-statistical integration of diverse computationally predicted protein association datasets. We have compared these predicted networks against combined experimental datasets from seven biological resources at different levels of statistical significance. These eukaryotic predicted networks resemble all the topological and noise features of the experimentally inferred networks in both species, and we also show that this observation is not due to random behaviour. In addition, the topology of the predicted networks contains information on true protein associations, beyond the constitutive first order binary predictions. We also observe that most of the reliable predicted protein associations are experimentally uncharacterised in our models, constituting the hidden or “dark matter” of networks by analogy to astronomical systems. Some of this dark matter shows enrichment of particular functions and contains key functional elements of protein networks, such as hubs associated with important functional areas like the regulation of Ras protein signal transduction in human cells. Thus, characterising this large and functionally important dark matter, elusive to established experimental designs, may be crucial for modelling biological systems.
In any case, these predictions provide a valuable guide to these experimentally elusive regions.

14.
Species interactions are dynamic processes that vary across environmental and ecological contexts, and operate across scale boundaries, making them difficult to quantify. Nevertheless, ecologists are increasingly interested in inferring species interactions from observational data using statistical analyses of their spatial co‐occurrence patterns. Trophic interactions present a particular challenge, as predators and prey may frequently or rarely co‐occur, depending on the spatial or temporal scale of observation. In this study, we investigate the accuracy of inferred interactions among species that both compete and trophically interact. We utilized a long‐term dataset of pond‐breeding amphibian co‐occurrences from Mt Rainier National Park (Washington, USA) and compiled a new dataset of their empirical interactions from the literature. We compared the accuracy of four statistical methods in inferring these known species interactions from spatial associations. We then used the best performing statistical method, the Markov network, to further investigate the sensitivity of interaction inference to spatial scale‐dependence and the presence of predators. We show that co‐occurrence methods are generally inaccurate when estimating trophic interactions. Further, the strength and sign of inferred interactions were dependent upon the spatial scale of observation, and predator presence influenced the detectability of competitive interactions among prey species. However, co‐occurrence analysis revealed new patterns of spatial association among pairs of species with known interactions. Overall, our study highlights a limiting frontier in co‐occurrence theory and the disconnect between widely implemented methodologies and their ability to accurately infer interactions in trophically‐structured communities.

15.
An equation for the rate of photosynthesis as a function of irradiance introduced by T. T. Bannister included an empirical parameter b to account for observed variations in curvature between the initial slope and the maximum rate of photosynthesis. Yet researchers have generally favored equations with fixed curvature, possibly because b was viewed as having no physiological meaning. We developed an analytic photosynthesis‐irradiance equation relating variations in curvature to changes in the degree of connectivity between photosystems, and also considered a recently published alternative, based on changes in the size of the plastoquinone pool. When fitted to a set of 185 observed photosynthesis‐irradiance curves, the Bannister equation provided the best fit more frequently than either of the analytic equations. While Bannister's curvature parameter engendered negligible improvement in the statistical fit to the study data, we argued that the parameter is nevertheless quite useful because it allows for consistent estimates of initial slope and saturation irradiance for observations exhibiting a range of curvatures, which would otherwise have to be fitted to different fixed‐curvature equations. Using theoretical models, we also found that intra‐ and intercellular self‐shading can result in biased estimates of both curvature and the saturation irradiance parameter. We concluded that Bannister's is the best currently available equation accounting for variations in curvature precisely because it does not assign inappropriate physiological meaning to its curvature parameter, and we proposed that b should be thought of as the expression of the integration of all factors impacting curvature.
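
To make the role of the curvature parameter concrete, the sketch below evaluates a photosynthesis‐irradiance curve of the form P = P_max · I / (I_k^b + I^b)^(1/b). This form is an assumption: it is the version of the Bannister equation commonly quoted in the literature, but the abstract itself does not state the equation, and the function name, argument names and numerical values are hypothetical.

```python
import math


def bannister_pe(irradiance: float, p_max: float, i_k: float, b: float) -> float:
    """Photosynthetic rate under the ASSUMED Bannister-type form
    P = p_max * I / (i_k**b + I**b) ** (1 / b).

    p_max: maximum (light-saturated) rate
    i_k:   saturation irradiance, so the initial slope is p_max / i_k
    b:     empirical curvature parameter (b = 1 gives a rectangular hyperbola)
    """
    if irradiance < 0 or p_max <= 0 or i_k <= 0 or b <= 0:
        raise ValueError("irradiance must be >= 0; p_max, i_k and b must be > 0")
    return p_max * irradiance / (i_k ** b + irradiance ** b) ** (1.0 / b)


# Hypothetical parameter values, chosen only to show the effect of b.
for curvature in (1.0, 2.0, 4.0):
    rate_at_ik = bannister_pe(100.0, p_max=10.0, i_k=100.0, b=curvature)
    print(curvature, round(rate_at_ik, 3))
```

Under this assumed form the initial slope and the maximum rate are the same for every b, and only the sharpness of the transition between them changes. That is consistent with the abstract's point that one equation can give comparable estimates of initial slope and saturation irradiance across curves of differing curvature.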

16.
Understanding and predicting a species’ distribution across a landscape is of central importance in ecology, biogeography and conservation biology. However, it presents daunting challenges when populations are highly dynamic (i.e. increasing or decreasing their ranges), particularly for small populations where information about ecology and life history traits is lacking. Currently, many modelling approaches fail to distinguish whether a site is unoccupied because the available habitat is unsuitable or because a species expanding its range has not arrived at the site yet. As a result, habitat that is indeed suitable may appear unsuitable. To overcome some of these limitations, we use a statistical modelling approach based on spatio‐temporal log‐Gaussian Cox processes. These model the spatial distribution of the species across available habitat and how this distribution changes over time, relative to covariates. In addition, the model explicitly accounts for spatio‐temporal dynamics that are unaccounted for by covariates through a spatio‐temporal stochastic process. We illustrate the approach by predicting the distribution of a recently established population of Eurasian cranes Grus grus in England, UK, and estimate the effect of a reintroduction on the range expansion of the population. Our models show that wetland extent and perimeter‐to‐area ratio have a positive and negative effect, respectively, on crane colonisation probability. Moreover, we find that cranes are more likely to colonise areas near already occupied wetlands and that the colonisation process is progressing at a low rate. Finally, the reintroduction of cranes in SW England can be considered a human‐assisted long‐distance dispersal event that has increased the dispersal potential of the species along a longitudinal axis in S England.
Spatio‐temporal log‐Gaussian Cox process models offer an excellent opportunity for the study of species where information on life history traits is lacking, since these are represented through the spatio‐temporal dynamics reflected in the model.

17.
Recent technological advances continue to provide noninvasive and more accurate biomarkers for evaluating disease status. One standard tool for assessing the accuracy of diagnostic tests is the receiver operating characteristic (ROC) curve. Few statistical methods exist to accommodate multiple continuous‐scale biomarkers in the framework of ROC analysis. In this paper, we propose a method to integrate continuous‐scale biomarkers to optimize classification accuracy. Specifically, we develop semiparametric transformation models for multiple biomarkers. We assume that unknown and marker‐specific transformations of biomarkers follow a multivariate normal distribution. Our models accommodate biomarkers subject to limits of detection and account for the dependence among biomarkers by including a subject‐specific random effect. We also propose a diagnostic measure using an optimal linear combination of the transformed biomarkers. Our diagnostic rule does not depend on any monotone transformation of biomarkers and is not sensitive to extreme biomarker values. Nonparametric maximum likelihood estimation (NPMLE) is used for inference. We show that the parameter estimators are asymptotically normal and efficient. We illustrate our semiparametric approach using data from the Endometriosis, Natural History, Diagnosis, and Outcomes (ENDO) study.
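
As a rough illustration of scoring subjects by a linear combination of biomarkers and summarising accuracy with the area under the ROC curve (AUC), the sketch below uses the empirical (Mann–Whitney) AUC estimator. It is not the semiparametric NPMLE procedure of the abstract: the weights are fixed by hand rather than optimised, no transformation or limit-of-detection handling is included, and all names and data values are hypothetical.

```python
from typing import Sequence


def linear_score(markers: Sequence[float], weights: Sequence[float]) -> float:
    """Combine one subject's biomarker values into a single score."""
    if len(markers) != len(weights):
        raise ValueError("markers and weights must have the same length")
    return sum(w * m for w, m in zip(weights, markers))


def empirical_auc(diseased: Sequence[float], healthy: Sequence[float]) -> float:
    """Mann-Whitney estimate of P(diseased score > healthy score), ties counted as 1/2."""
    if not diseased or not healthy:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


# Hypothetical two-marker data and hand-picked weights (assumptions, not study data).
weights = (1.0, 1.0)
diseased_markers = [(3.0, 2.0), (1.0, 4.0), (2.0, 1.0)]
healthy_markers = [(1.0, 1.0), (2.0, 2.0), (0.0, 3.0)]

diseased_scores = [linear_score(m, weights) for m in diseased_markers]
healthy_scores = [linear_score(m, weights) for m in healthy_markers]
auc = empirical_auc(diseased_scores, healthy_scores)
print(round(auc, 4))
```

Because this estimator depends only on the ordering of the combined scores, any strictly increasing transformation of the combined score leaves the AUC unchanged. The abstract makes a stronger claim for its own rule, namely invariance to monotone transformations of the individual biomarkers, which this simple sketch does not reproduce.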

18.
Clinical prediction models play a key role in risk stratification, therapy assignment and many other fields of medical decision making. Before they can enter clinical practice, their usefulness has to be demonstrated using systematic validation. Methods to assess their predictive performance have been proposed for continuous, binary, and time-to-event outcomes, but the literature on validation methods for discrete time-to-event models with competing risks is sparse. The present paper tries to fill this gap and proposes new methodology to quantify discrimination, calibration, and prediction error (PE) for discrete time-to-event outcomes in the presence of competing risks. In our case study, the goal was to predict the risk of ventilator-associated pneumonia (VAP) attributed to Pseudomonas aeruginosa in intensive care units (ICUs). Competing events are extubation, death, and VAP due to other bacteria. The aim of this application is to validate complex prediction models developed in previous work on more recently available validation data.

19.
Variation in mitochondrial DNA is often assumed to be neutral and is used to construct the genealogical relationships among populations and species. However, if extant variation is the result of episodes of positive selection, these genealogies may be incorrect, although this information itself may provide biologically and evolutionarily meaningful information. In fact, positive Darwinian selection has been detected in the mitochondrial‐encoded subunits that comprise complex I from diverse taxa with seemingly dissimilar bioenergetic life histories, but the functional implications of the selected sites are unknown. Complex I produces roughly 40% of the proton flux that is used to synthesize ATP from ADP, and a functional model based on the high‐resolution structure of complex I described a unique biomechanical apparatus for proton translocation. We reported positive selection at sites in this apparatus during the evolution of Pacific salmon, and it appeared this was also the case in published reports from other taxa, but a comparison among studies was difficult because different statistical tests were used to detect selection and, oftentimes, specific sites were not reported. Here we review the literature of positive selection in mitochondrial genomes, the statistical tests used to detect selection, and the structural and functional models that are currently available to study the physiological implications of selection. We then search for signatures of positive selection among the coding mitochondrial genomes of 237 species with a common set of tests and verify that the ND5 subunit of complex I is a repeated target of positive Darwinian selection in diverse taxa. We propose a novel hypothesis to explain the results based on their bioenergetic life histories and provide a guide for laboratory and field studies to test this hypothesis.

20.
Spatial autocorrelation is a well‐recognized concern for observational data in general, and more specifically for spatial data in ecology. Generalized linear mixed models (GLMMs) with spatially autocorrelated random effects are a potential general framework for handling these spatial correlations. However, as the result of statistical and practical issues, such GLMMs have been fitted through the undocumented use of procedures based on penalized quasi‐likelihood approximations (PQL), and under restrictive models of spatial correlation. Alternatively, they are often neglected in favor of simpler but more questionable approaches. In this work we aim to provide practical and validated means of inference under spatial GLMMs that overcome these limitations. For this purpose, new software is developed to fit spatial GLMMs. We use it to assess the performance of likelihood ratio tests for fixed effects under spatial autocorrelation, based on Laplace or PQL approximations of the likelihood. As expected, the Laplace approximation generally performs slightly better, although a variant of PQL was better in the binary case. We show that a previous implementation of PQL methods in the R language, glmmPQL, is not appropriate for such applications. Finally, we illustrate the efficiency of a bootstrap procedure for correcting the small sample bias of the tests, which applies also to non‐spatial models.
