Similar articles
20 similar articles found; search took 15 ms.
1.
Huiping Xu  Bruce A. Craig 《Biometrics》2009,65(4):1145-1155
Traditional latent class modeling has been widely applied to assess the accuracy of dichotomous diagnostic tests. These models, however, assume that the tests are independent conditional on the true disease status, which is rarely valid in practice. Alternative models using probit analysis have been proposed to incorporate dependence among tests, but these models consider restricted correlation structures. In this article, we propose a probit latent class (PLC) model that allows a general correlation structure. When combined with some helpful diagnostics, this model provides a more flexible framework from which to evaluate the correlation structure and model fit. Our model encompasses several other PLC models but uses a parameter‐expanded Monte Carlo EM algorithm to obtain the maximum‐likelihood estimates. The parameter‐expanded EM algorithm was designed to accelerate the convergence rate of the EM algorithm by expanding the complete‐data model to include a larger set of parameters, and it ensures a simple solution in fitting the PLC model. We demonstrate our estimation and model selection methods using a simulation study and two published medical studies.

2.
Variable selection is critical in competing risks regression with high-dimensional data. Although penalized variable selection methods and other machine learning-based approaches have been developed, many of these methods suffer from instability in practice. This paper proposes a novel method named Random Approximate Elastic Net (RAEN). Under the proportional subdistribution hazards model, RAEN provides a stable and generalizable solution to the large-p-small-n variable selection problem for competing risks data. Our general framework allows the proposed algorithm to be applied to other time-to-event regression models, including competing risks quantile regression and accelerated failure time models. Through extensive simulations, we show that variable selection and parameter estimation improve markedly with the new, computationally intensive algorithm. A user-friendly R package, RAEN, is developed for public use. We also apply our method to a cancer study to identify influential genes associated with death or progression from bladder cancer.

3.
Transition state theory provides a well-established means to compute the rate at which rare events occur; however, this is strictly an equilibrium approach. Here we consider a nonequilibrium problem of this nature in the form of transport through a liquid–liquid interface. When two immiscible liquids coexist in equilibrium, there will be a certain amount of mixing between the two phases, resulting in a finite linear mobility across the liquid–liquid interface. We derive an exact relationship between the mobility and the local diffusion in the direction perpendicular to the interface. We compute the mobility using both nonequilibrium molecular dynamics and a variety of linear-response-type approaches, with accurate agreement being obtained for the best of these. Our analysis makes it clear how the local diffusion is influenced by the inhomogeneities of the interface, even at a distance from it. This nonlocal character of the mobility has not been appreciated before and results in a strong variation in the local diffusion, which is formally coupled to the variation in the potential of mean force. The nonlocal aspect of the diffusion requires the velocity autocorrelation function to be integrated out to far longer times than is the case for homogeneous liquids, and requires special care with regard to the choice of numerical approach.

4.
Phenotypic plasticity and related processes (learning, developmental noise) have been proposed to both accelerate and slow down genetically based evolutionary change. While both views have been supported by various mathematical models and simulations, no general predictions have been offered as to when these alternative outcomes should occur. Here we propose a general framework to study the effects of plasticity on the rate of evolution under directional selection. It is formulated in terms of the fitness gain gradient, which measures the effect of a marginal change in the degree of plasticity on the slope of the relationship between the genotypic value of the focal trait and log fitness. If the gain gradient has the same sign as the direction of selection, an increase in plasticity will magnify the response to selection; if the two signs are opposite, greater plasticity will lead to slower response. We use this general result to derive conditions for the acceleration/deceleration under several simple forms of plasticity, including developmental noise. We also show that our approach explains the results of several specific models from the literature and thus provides a unifying framework.

5.
Monomorphic loci evolve through a series of substitutions on a fitness landscape. Understanding how mutation, selection, and genetic drift drive this process, and uncovering the structure of the fitness landscape from genomic data are two major goals of evolutionary theory. Population genetics models of the substitution process have traditionally focused on the weak-selection regime, which is accurately described by diffusion theory. Predictions in this regime can be considered universal in the sense that many population models exhibit equivalent behavior in the diffusion limit. However, a growing number of experimental studies suggest that strong selection plays a key role in some systems, and thus there is a need to understand universal properties of models without a priori assumptions about selection strength. Here we study time reversibility in a general substitution model of a monomorphic haploid population. We show that for any time-reversible population model, such as the Moran process, substitution rates obey an exact scaling law. For several other irreversible models, such as the simple Wright-Fisher process and its extensions, the scaling law is accurate up to selection strengths that are well outside the diffusion regime. Time reversibility gives rise to a power-law expression for the steady-state distribution of populations on an arbitrary fitness landscape. The steady-state behavior is dominated by weak selection and is thus adequately described by the diffusion approximation, which guarantees universality of the steady-state formula and its applicability to the problem of reconstructing fitness landscapes from DNA or protein sequence data.
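
The scaling behavior described in this abstract can be illustrated with the textbook fixation probability of the haploid Moran process. The sketch below is not taken from the paper: it assumes the standard Moran formula for a single mutant of relative fitness r in a population of size N, and it assumes symmetric mutation between alleles when forming the steady state. The function names are hypothetical. Under those assumptions, the ratio of forward to backward substitution rates is r^(N-1), and detailed balance gives a steady-state weight proportional to fitness^(N-1), which is a power law.

```python
def moran_fixation_probability(r: float, n: int) -> float:
    """Fixation probability of one mutant with relative fitness r
    in a haploid Moran population of size n (textbook formula)."""
    if n < 1 or r <= 0:
        raise ValueError("need n >= 1 and r > 0")
    if abs(r - 1.0) < 1e-12:
        return 1.0 / n  # neutral limit
    return (1.0 - 1.0 / r) / (1.0 - r ** (-n))


def substitution_rate_ratio(r: float, n: int) -> float:
    """Forward/backward substitution rate ratio, assuming equal
    mutation rates in both directions so that they cancel."""
    return moran_fixation_probability(r, n) / moran_fixation_probability(1.0 / r, n)


def steady_state(fitnesses: list[float], n: int) -> list[float]:
    """Steady-state probabilities over monomorphic states, assuming
    symmetric mutation: weight of each state is fitness ** (n - 1)."""
    weights = [f ** (n - 1) for f in fitnesses]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    r, n = 1.5, 10
    print(substitution_rate_ratio(r, n), r ** (n - 1))
    print(steady_state([1.0, 1.1, 1.2], n))
```

The two numbers printed on the first line should agree up to floating-point error, which is the Moran-process instance of an exact scaling law.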

6.
We propose a general framework for the analysis of animal telemetry data through the use of weighted distributions. It is shown that several interpretations of resource selection functions arise when constructed from the ratio of a use and an availability distribution. Through the proposed general framework, several popular resource selection models are shown to be special cases of the general model by making assumptions about animal movement and behavior. The weighted distribution framework is shown to be easily extended to account for telemetry data that are highly autocorrelated, as is typical of animal relocations obtained with new technology such as global positioning systems. An analysis of simulated data using several models constructed within the proposed framework is also presented to illustrate the possible gains from the flexible modeling framework. The proposed model is applied to a brown bear data set from southeast Alaska.
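
A weighted distribution in the sense used above can be sketched for a discrete set of habitat classes. This is an illustration only, not the authors' model: the habitat availabilities and selection weights are made-up numbers, and the function names are hypothetical. The use distribution is the availability distribution reweighted by a selection function and renormalized, so the ratio of use to availability recovers the selection function up to a constant.

```python
def use_distribution(availability: list[float], weights: list[float]) -> list[float]:
    """Weighted distribution: use_i = w_i * a_i / sum_j (w_j * a_j)."""
    if len(availability) != len(weights):
        raise ValueError("availability and weights must have the same length")
    unnormalized = [w * a for w, a in zip(weights, availability)]
    total = sum(unnormalized)
    if total <= 0:
        raise ValueError("weights must give positive total mass")
    return [u / total for u in unnormalized]


def selection_ratios(use: list[float], availability: list[float]) -> list[float]:
    """Ratio of use to availability, proportional to the selection weights."""
    return [u / a for u, a in zip(use, availability)]


if __name__ == "__main__":
    # Hypothetical habitat classes: forest, meadow, riparian.
    availability = [0.5, 0.3, 0.2]
    weights = [1.0, 2.0, 4.0]  # assumed relative selection strengths
    use = use_distribution(availability, weights)
    print(use)
    print(selection_ratios(use, availability))
```

Because only ratios of the weights matter after normalization, the selection function is identified only up to a multiplicative constant in this simple setting.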

7.

Background  

Designing appropriate machine learning methods for identifying genes that have significant discriminating power for disease outcomes has become increasingly important for our understanding of diseases at the genomic level. Although many machine learning methods have been developed and applied to microarray gene expression data analysis, the majority of them are based on linear models, which are not necessarily appropriate for the underlying connection between the target disease and its associated explanatory genes. Linear-model-based methods also tend to admit false-positive significant features more easily. Furthermore, linear-model-based algorithms often involve calculating the inverse of a matrix that may be singular when the number of potentially important genes is relatively large, which leads to numerical instability. To overcome these limitations, a few non-linear methods have recently been introduced to the area. Many of the existing non-linear methods have two critical problems, the model selection problem and the model parameter tuning problem, that remain unsolved or even untouched. In general, a unified framework that allows the parameters of both linear and non-linear models to be easily tuned is preferred in real-world applications. Kernel-induced learning methods form a class of approaches that show promise for achieving this goal.

8.
Roze D  Rousset F 《Genetics》2003,165(4):2153-2166
Population structure affects the relative influence of selection and drift on the change in allele frequencies. Several models have been proposed recently, using diffusion approximations to calculate fixation probabilities, fixation times, and equilibrium properties of subdivided populations. We propose here a simple method to construct diffusion approximations in structured populations; it relies on general expressions for the expectation and variance in allele frequency change over one generation, in terms of partial derivatives of a "fitness function" and probabilities of genetic identity evaluated in a neutral model. In the limit of a very large number of demes, these probabilities can be expressed as functions of average allele frequencies in the metapopulation, provided that coalescence occurs on two different timescales, which is the case in the island model. We then use the method to derive expressions for the probability of fixation of new mutations, as a function of their dominance coefficient, the rate of partial selfing, and the rate of deme extinction. We obtain more precise approximations than those derived by recent work, in particular (but not only) when deme sizes are small. Comparisons with simulations show that the method gives good results as long as migration is stronger than selection.

9.
Nonparametric feature selection for high-dimensional data is an important and challenging problem in statistics and machine learning. Most existing methods for feature selection focus on parametric or additive models, which may suffer from model misspecification. In this paper, we propose a new framework to perform nonparametric feature selection for both regression and classification problems. Under this framework, we learn prediction functions through empirical risk minimization over a reproducing kernel Hilbert space. The space is generated by a novel tensor product kernel, which depends on a set of parameters that determine the importance of the features. Computationally, we minimize the empirical risk with a penalty to estimate the prediction and kernel parameters simultaneously. The solution can be obtained by iteratively solving convex optimization problems. We study the theoretical properties of the kernel feature space and prove the oracle selection property and Fisher consistency of our proposed method. Finally, we demonstrate the superior performance of our approach compared to existing methods via extensive simulation studies and applications to two real studies.

10.
We apply the marginal stability hypothesis, which has been proposed for solving the problem of dendritic crystal growth, to the pattern selection problem in the Gierer-Meinhardt models. In a large system, a definite wavelength of the ultimate spatial pattern is selected when the unstable homogeneous steady state is locally disturbed. The numerical results are analyzed theoretically by means of the marginal stability hypothesis, and they are in good agreement with it. Biologically, these results suggest why, for large systems, the Gierer-Meinhardt model (and presumably other reaction-diffusion schemes) can explain the observation that pattern-generating mechanisms are remarkably insensitive to a wide range of environmental and experimental conditions.

11.
Several papers have recently been published that model parasitic egg-laying by birds in the nests of others of their own species. While these papers address different questions, they approach the problem in a similar way and have many features in common. In this paper a framework is developed that unifies these models, in the sense that they all become special cases of a more general model. This is useful for two main reasons: first, it aids clarity, in that the assumptions and conclusions of each of the models are easier to compare; second, it provides a base from which further similar models can start. The basic assumptions for this framework are outlined, and a method for finding the ESSs of such models is introduced. Some mathematical results for the general, and more specific, models are considered and their implications discussed. In addition, we explore the biological consequences of the results that we have obtained and suggest possible questions that could be investigated using models within, or very closely related to, our framework. M. Broom is also a member of the Centre for the Study of Evolution at the University of Sussex.

12.
Models for sexual partner choice are discussed for the case of highly variable sexual activity in the population. It is demonstrated that the variance in the number of infected persons may be extremely large. For the random mixing model, higher-order cumulants are also evaluated. On the basis of these results, the applicability of deterministic models and of models for expectations only is questioned. A general model is proposed for handling nonrandom, or correlated, mixing. The problem of inconsistency is overcome by considering the couples having sex as the natural unit in the model. In the case of s discrete homogeneous groups, it is shown that only $\binom{s}{2}$ parameters defining the interaction between the groups can be chosen freely. Finally, the effect of correlation in partner choice is demonstrated by a bivariate lognormal model for partner choice.

13.
We have developed an approximate maximum likelihood framework for the problem of estimating the selection coefficients in a simple fertility selection model via random union of zygotes. We consider a sampling scheme in which a random sample from each (discrete) generation of a population observed over several generations is collected and genotyped at one nuclear locus and one cytonuclear locus simultaneously. Simulation results show excellent small-sample performance of the resulting approximate MLE. The asymptotic variance‐covariance matrix of our estimator is also obtained. We further show that these estimates can be used to obtain simple test statistics for testing various types of selection hypotheses, including a test of neutrality.

14.
Several models for the spread of AIDS within a homosexual community have been proposed that incorporate biased mixing of different risk groups. A simple model is presented that captures many of the features of these more complex models. Analytical expressions are derived for the time to the state of maximum infection (SMI) in a particular risk group, the proportion infected at SMI, and the number of infected individuals as the group approaches SMI. These results agree qualitatively with numerical simulations of the model.

15.
Optical Mapping is an emerging technology for constructing ordered restriction maps of DNA molecules. The underlying computational problems for this technology have been studied, and several models have been proposed in recent literature. Most of these are combinatorial models; some also present statistical approaches. However, it is not a priori clear how these models relate to one another and to the underlying problem. We present a uniform framework for the restriction map problems in which each of these models is a specific instance of the basic framework. We achieve this by identifying two "signature" functions f() and g() that characterize the models. We identify the constraints these two functions must satisfy, thus opening up the possibility of exploring other plausible models. We show that for all of the combinatorial models proposed in the literature, the signature functions are semi-algebraic. We also analyze a proposed statistical method in this framework and show that the signature functions are transcendental for this model. We also believe that this framework would provide useful guidelines for dealing with other inference problems arising in practice. Finally, we indicate the open problems by including a survey of the best known results for these problems.

16.
We hypothesized that the growth rates of filaments and floc formers in activated sludge are affected by the combination of kinetic selection (Lou and de los Reyes, Biotechnol Bioeng 92(6): 729-739, 2005b) and substrate diffusion limitation (Martins et al., Water Res 37:2555-2570, 2003). To clarify the influence of these factors in explaining filamentous bulking, a conceptual framework was developed in this study. The framework suggests the existence of three regions, corresponding to bulking, non-bulking, and intermediate conditions, based on substrate concentration. In the bulking and non-bulking regions, kinetic growth differences control the competition process, and filaments or floc formers dominate, respectively. In the intermediate region, substrate diffusion limitation, determined by the floc size, plays the major role in causing bulking. To test this framework, sequencing batch reactors (SBRs) were operated with influent COD of 100, 300, 600, and 1,000 mg/L, and the sludge settleability was measured at various floc size distributions that were developed using different mixing strengths. The experimental data in the bulking and intermediate regions supported the proposed framework. A model integrating the two factors was developed to simulate the substrate concentrations at different depths and floc sizes under intermittent feeding conditions. The modeling results confirmed that substrate diffusion limitation occurs inside the flocs over a certain range of activated sludge floc sizes during the operation cycle, and provided additional support for the proposed framework.

17.
How natural selection acts to limit the proliferation of transposable elements (TEs) in genomes has been of interest to evolutionary biologists for many years. To describe TE dynamics in populations, previous studies have used models of transposition–selection equilibrium that assume a constant rate of transposition. However, since TE invasions are known to happen in bursts through time, this assumption may not be reasonable. Here we propose a test of neutrality for TE insertions that does not rely on the assumption of a constant transposition rate. We consider the case of TE insertions that have been ascertained from a single haploid reference genome sequence. By conditioning on the age of an individual TE insertion allele (inferred by the number of unique substitutions that have occurred within the particular TE sequence since insertion), we determine the probability distribution of the insertion allele frequency in a population sample under neutrality. Taking models of varying population size into account, we then evaluate predictions of our model against allele frequency data from 190 retrotransposon insertions sampled from North American and African populations of Drosophila melanogaster. Using this nonequilibrium neutral model, we are able to explain ∼80% of the variance in TE insertion allele frequencies based on age alone. Controlling for both nonequilibrium dynamics of transposition and host demography, we provide evidence for negative selection acting against most TEs as well as for positive selection acting on a small subset of TEs. Our work establishes a new framework for the analysis of the evolutionary forces governing large insertion mutations like TEs, gene duplications, or other copy number variants.

18.
Building an accurate disease risk prediction model is an essential step in the modern quest for precision medicine. While high-dimensional genomic data provide valuable resources for investigating disease risk, their high noise levels and the complex relationships between predictors and outcomes pose tremendous analytical challenges. Deep learning models are the state-of-the-art methods for many prediction tasks and offer a promising framework for the analysis of genomic data. However, deep learning models generally suffer from the curse of dimensionality and a lack of biological interpretability, both of which have greatly limited their applications. In this work, we have developed a deep neural network (DNN) based prediction modeling framework. We first propose a group-wise feature importance score for feature selection, with which genes harboring genetic variants with both linear and non-linear effects are efficiently detected. We then design an explainable transfer-learning-based DNN method, which can directly incorporate information from feature selection and accurately capture complex predictive effects. The proposed DNN framework is biologically interpretable, as it is built on the selected predictive genes. It is also computationally efficient and can be applied to genome-wide data. Through extensive simulations and real data analyses, we demonstrate that, compared with many existing methods, our proposed method can both efficiently detect predictive features and accurately predict disease risk.

19.
Song YS  Steinrücken M 《Genetics》2012,190(3):1117-1129
The transition density function of the Wright-Fisher diffusion describes the evolution of population-wide allele frequencies over time. This function has important practical applications in population genetics, but finding an explicit formula under a general diploid selection model has remained a difficult open problem. In this article, we develop a new computational method to tackle this classic problem. Specifically, our method explicitly finds the eigenvalues and eigenfunctions of the diffusion generator associated with the Wright-Fisher diffusion with recurrent mutation and arbitrary diploid selection, thus allowing one to obtain an accurate spectral representation of the transition density function. Simplicity is one of the appealing features of our approach. Although our derivation involves somewhat advanced mathematical concepts, the resulting algorithm is quite simple and efficient, only involving standard linear algebra. Furthermore, unlike previous approaches based on perturbation, which is applicable only when the population-scaled selection coefficient is small, our method is nonperturbative and is valid for a broad range of parameter values. As a by-product of our work, we obtain the rate of convergence to the stationary distribution under mutation-selection balance.

20.
A critically important challenge in empirical population genetics is distinguishing neutral nonequilibrium processes from selective forces that produce similar patterns of variation. We here examine the extent to which linkage disequilibrium (i.e., nonrandom associations between markers) improves this discrimination. We show that patterns of linkage disequilibrium recently proposed to be unique to hitchhiking models are replicated under nonequilibrium neutral models. We also demonstrate that jointly considering spatial patterns of association among variants alongside the site-frequency spectrum is nonetheless of value. Through a comparison of models of equilibrium neutrality, nonequilibrium neutrality, equilibrium hitchhiking, nonequilibrium hitchhiking, and recurrent hitchhiking, we evaluate a linkage disequilibrium (LD) statistic (ω_max) that appears to have power to identify regions recently shaped by positive selection. Most notably, for demographic parameters relevant to non-African populations of Drosophila melanogaster, we demonstrate that selected loci are distinguishable from neutral loci using this statistic.
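
The abstract does not define the LD statistic it evaluates, so the following is only a sketch under an assumed definition. It takes ω at a split point to be the mean pairwise r² within the left and right groups of sites divided by the mean r² between the two groups, and takes the maximum over split points that leave at least two sites on each side. The input is a symmetric matrix of pairwise r² values, here a made-up example, and all function names are hypothetical.

```python
from itertools import combinations


def omega_at_split(r2: list[list[float]], l: int) -> float:
    """Assumed definition: mean within-group r^2 over mean between-group r^2,
    with the first l sites on the left and the rest on the right."""
    s = len(r2)
    if l < 2 or s - l < 2:
        raise ValueError("each side of the split needs at least two sites")
    left, right = range(l), range(l, s)
    within = sum(r2[i][j] for i, j in combinations(left, 2))
    within += sum(r2[i][j] for i, j in combinations(right, 2))
    n_within = l * (l - 1) // 2 + (s - l) * (s - l - 1) // 2
    across = sum(r2[i][j] for i in left for j in right)
    mean_across = across / (l * (s - l))
    if mean_across == 0:
        return float("inf")  # no LD across the split at all
    return (within / n_within) / mean_across


def omega_max(r2: list[list[float]]) -> tuple[float, int]:
    """Return (largest omega, split index l that attains it)."""
    s = len(r2)
    if s < 4:
        raise ValueError("need at least four sites")
    return max((omega_at_split(r2, l), l) for l in range(2, s - 1))


if __name__ == "__main__":
    # Made-up r^2 matrix: strong LD within sites 0-2 and within sites 3-5,
    # weak LD between the two blocks (diagonal entries are never read).
    r2 = [[0.9 if (i < 3) == (j < 3) else 0.1 for j in range(6)] for i in range(6)]
    print(omega_max(r2))
```

A high value at some split indicates strong association on each flank and weak association across it, the spatial pattern the abstract associates with recent positive selection.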
