Similar documents
 20 similar documents found (search time: 937 ms)
1.
Procedures for discriminating between competing statistical models of synaptic transmission, and for providing confidence limits on the parameters of these models, have been developed. These procedures were tested against simulated data and were used to analyze the fluctuations in synaptic currents evoked in hippocampal neurones. All models were fitted to data using the Expectation-Maximization algorithm and a maximum likelihood criterion. Competing models were evaluated using the log-likelihood ratio (Wilks statistic). When the competing models were not nested, Monte Carlo sampling of the model used as the null hypothesis (H0) provided density functions against which H0 and the alternative model (H1) were tested. The statistic for the log-likelihood ratio was determined from the fit of H0 and H1 to these probability densities. This statistic was used to determine the significance level at which H0 could be rejected for the original data. When the competing models were nested, log-likelihood ratios and the χ² statistic were used to determine the confidence level for rejection. Once the model that provided the best statistical fit to the data was identified, many estimates for the model parameters were calculated by resampling the original data. Bootstrap techniques were then used to obtain the confidence limits of these parameters.
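
The resampling step described in this abstract can be illustrated with a percentile bootstrap. This is only a minimal sketch: the abstract does not say which bootstrap variant was used, so the percentile method, the function name `bootstrap_ci`, and the amplitude values are all assumptions made for illustration.

```python
import random
import statistics

def bootstrap_ci(data, estimator, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence limits for estimator(data).

    Assumption: the percentile method; the abstract does not name a variant.
    """
    rng = random.Random(seed)
    n = len(data)
    # Resample the data with replacement and re-estimate each time.
    estimates = sorted(
        estimator([rng.choice(data) for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical current amplitudes (pA), invented for illustration only.
amplitudes = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.5, 4.9]
low, high = bootstrap_ci(amplitudes, statistics.mean)
```

Here the estimator is the sample mean; in a setting like the one described, it would presumably be the maximum-likelihood fit of the chosen model, re-run on each resample.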

2.
Molecular markers make it possible to estimate the pairwise relatedness between the members of a breeding pool when their selection history is no longer available or has become too complex for a classical pedigree analysis. The field of population genetics has several estimation procedures at its disposal, but when the genotyped individuals are highly selected inbred lines, their application is not warranted because the theoretical assumptions on which these estimators were built, usually linkage equilibrium between marker loci or even Hardy–Weinberg equilibrium, are not met. An alternative approach requires a genotyped reference set of inbred lines, which makes it possible to correct the observed marker similarities for their inherent upward bias when used as a coancestry measure. However, this approach does not guarantee that the resulting coancestry matrix is at least positive semi-definite (psd), a necessary condition for its use as a covariance matrix. In this paper we present the weighted alikeness in state (WAIS) estimator. This marker-based coancestry estimator is compared to several other commonly applied relatedness estimators under realistic hybrid breeding conditions in a number of simulations. We also fit a linear mixed model to phenotypic data from a commercial maize breeding programme and compare the likelihood of the different variance structures. WAIS is shown to be psd, which makes it suitable for modelling the covariance between genetic components in linear mixed models involved in breeding value estimation or association studies. Results indicate that it generally produces a low root mean squared error under different breeding circumstances and provides a fit to the data that is comparable to that of several other marker-based alternatives. Recommendations for each of the examined coancestry measures are provided.

3.
The problem of testing the separability of a covariance matrix against an unstructured variance-covariance matrix is studied in the context of multivariate repeated measures data using Rao's score test (RST). The RST statistic is developed with the first component of the separable structure as a first-order autoregressive (AR(1)) correlation matrix or an unstructured (UN) covariance matrix under the assumption of multivariate normality. It is shown that the distribution of the RST statistic under the null hypothesis of any separability does not depend on the true values of the mean or the unstructured components of the separable structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Monte Carlo simulations are then used to study the comparative behavior of the null distribution of the RST statistic, as well as that of the LRT statistic, in terms of sample size considerations, and for the estimation of the empirical percentiles. Our findings are compared with existing results where the first component of the separable structure is a compound symmetry (CS) correlation matrix. It is also shown by simulations that the empirical null distribution of the RST statistic converges faster than the empirical null distribution of the LRT statistic to the limiting χ² distribution. The tests are implemented on a real dataset from medical studies.

4.
Vonesh EF, Chinchilli VM, Pu K. Biometrics, 1996, 52(2): 572-587
In recent years, generalized linear and nonlinear mixed-effects models have proved to be powerful tools for the analysis of unbalanced longitudinal data. To date, much of the work has focused on various methods for estimating and comparing the parameters of mixed-effects models. Very little work has been done in the area of model selection and goodness-of-fit, particularly with respect to the assumed variance-covariance structure. In this paper, we present a goodness-of-fit statistic which can be used in a manner similar to the R² criterion in linear regression for assessing the adequacy of an assumed mean and variance-covariance structure. In addition, we introduce an approximate pseudo-likelihood ratio test for testing the adequacy of the hypothesized covariance structure. These methods are illustrated and compared to the usual normal theory likelihood methods (Akaike's information criterion and the likelihood ratio test) using three examples. Simulation results indicate the pseudo-likelihood ratio test compares favorably with the standard normal theory likelihood ratio test, but both procedures are sensitive to departures from normality.

5.
We propose a general likelihood-based approach to the linkage analysis of qualitative and quantitative traits using identity by descent (IBD) data from sib-pairs. We consider the likelihood of IBD data conditional on phenotypes and test the null hypothesis of no linkage between a marker locus and a gene influencing the trait using a score test in the recombination fraction theta between the two loci. This method unifies the linkage analysis of qualitative and quantitative traits into a single inferential framework, yielding a simple and intuitive test statistic. Conditioning on phenotypes avoids unrealistic random sampling assumptions and allows sib-pairs from differing ascertainment mechanisms to be incorporated into a single likelihood analysis. In particular, it allows the selection of sib-pairs based on their trait values and the analysis of only those pairs having the most informative phenotypes. The score test is based on the full likelihood, i.e., the likelihood based on all phenotype data rather than just differences of sib-pair phenotypes. Considering only phenotype differences, as in Haseman and Elston (1972) and Kruglyak and Lander (1995), may result in important losses in power. The linkage score test is derived under general genetic models for the trait, which may include multiple unlinked genes. Population genetic assumptions, such as random mating or linkage equilibrium at the trait loci, are not required. This score test is thus particularly promising for the analysis of complex human traits. The score statistic readily extends to accommodate incomplete IBD data at the test locus, by using the hidden Markov model implemented in the programs MAPMAKER/SIBS and GENEHUNTER (Kruglyak and Lander, 1995; Kruglyak et al., 1996). Preliminary simulation studies indicate that the linkage score test generally matches or outperforms the Haseman-Elston test, the largest gains in power being for selected samples of sib-pairs with extreme phenotypes.

6.
The central theme in case-control genetic association studies is to efficiently identify genetic markers associated with trait status. Powerful statistical methods are critical to accomplishing this goal. A popular method is the omnibus Pearson's chi-square test applied to genotype counts. To achieve increased power, tests based on an assumed trait model have been proposed. However, they are not robust to model misspecification. Much research has been carried out on enhancing the robustness of such model-based tests. An analysis framework is proposed that tests the equality of allele frequencies while allowing for different deviations from Hardy-Weinberg equilibrium (HWE) in cases and controls. The proposed method requires neither the specification of a trait model nor HWE. It involves only 1 degree of freedom. The likelihood ratio statistic, score statistic, and Wald statistic associated with this framework are introduced. Their performance is evaluated by extensive computer simulation in comparison with existing methods.
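
For orientation, the omnibus Pearson chi-square test on genotype counts that this abstract uses as its baseline can be sketched as follows. This is the standard 2 degree-of-freedom test, not the 1 degree-of-freedom framework the abstract proposes; the function name and the genotype counts are hypothetical.

```python
import math

def pearson_chi2_genotypes(cases, controls):
    """Omnibus Pearson chi-square on a 2x3 table of genotype counts (AA, Aa, aa).

    Returns (statistic, p_value) for 2 degrees of freedom.
    """
    table = [cases, controls]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    # With 2 df, the chi-square survival function is exactly exp(-x / 2).
    return stat, math.exp(-stat / 2)

# Hypothetical genotype counts, invented for illustration only.
stat, p = pearson_chi2_genotypes(cases=[30, 50, 20], controls=[50, 40, 10])
```

The closed-form p-value only holds for 2 degrees of freedom, which is why the sketch is restricted to a 2x3 table.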

7.
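Because SNP data do not follow a normal distribution, a confirmatory factor model is fitted with the Satorra-Bentler (S-B) scaled estimation method to analyze the overall effects and associations of SNPs. SNP data provided by GAW17 are used as a worked example. Thirteen SNPs located in six genes on chromosome 2 were randomly selected as the study objects; latent-variable scores were computed for the six selected genes, and the relationship between the genes and disease infection was then tested. The results show that the ratio of chi-square to degrees of freedom (χ²/df) was 3.59 for maximum likelihood estimation and 2.89 for S-B scaled estimation, and that the RMSEA was 0.061 for maximum likelihood estimation and 0.052 for S-B scaled estimation. All six genes affected the infection. It is concluded that S-B scaled estimation yields a better-fitting model when handling SNP data. It can be inferred that the 13 SNP loci in these six genes may be causal loci for the infection.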

8.
Analytic approaches to twin data using structural equation models   Cited by: 5 (self-citations: 0; by others: 5)
The classical twin study is the most popular design in behavioural genetics. It has strong roots in biometrical genetic theory, which allows predictions to be made about the correlations between observed traits of identical and fraternal twins in terms of underlying genetic and environmental components. One can infer the relative importance of these 'latent' factors (model parameters) by structural equation modelling (SEM) of observed covariances of both twin types. SEM programs estimate model parameters by minimising a goodness-of-fit function between observed and predicted covariance matrices, usually by the maximum-likelihood criterion. Likelihood ratio statistics also allow the fit of competing models to be compared. The program Mx, specifically developed to model genetically sensitive data, is now widely used in twin analyses. The flexibility of Mx allows the modelling of multivariate data to examine the genetic and environmental relations between two or more phenotypes, and the modelling of categorical traits under liability-threshold models.

9.
Likelihood methods for detecting temporal shifts in diversification rates   Cited by: 8 (self-citations: 0; by others: 8)
Maximum likelihood is a potentially powerful approach for investigating the tempo of diversification using molecular phylogenetic data. Likelihood methods distinguish between rate-constant and rate-variable models of diversification by fitting birth-death models to phylogenetic data. Because model selection in this context is a test of the null hypothesis that diversification rates have been constant over time, strategies for selecting best-fit models must minimize Type I error rates while retaining power to detect rate variation when it is present. Here I examine model selection, parameter estimation, and power to reject the null hypothesis using likelihood models based on the birth-death process. The Akaike information criterion (AIC) has often been used to select among diversification models; however, I find that selecting models based on the lowest AIC score leads to a dramatic inflation of the Type I error rate. When appropriately corrected to reduce Type I error rates, the birth-death likelihood approach performs as well as or better than the widely used gamma statistic, at least when diversification rates have shifted abruptly over time. Analyses of datasets simulated under a range of rate-variable diversification scenarios indicate that the birth-death likelihood method has much greater power to detect variation in diversification rates when extinction is present. Furthermore, this method appears to be the only approach available that can distinguish between a temporal increase in diversification rates and a rate-constant model with nonzero extinction. I illustrate use of the method by analyzing a published phylogeny for Australian agamid lizards.
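
The contrast between picking the lowest AIC and applying a stricter rule can be sketched as below. The AIC formula is standard; the `min_delta` threshold is an assumption made here for illustration (the abstract does not state how the correction was done), and the log-likelihoods are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

def prefer_rate_variable(ll_constant, k_constant, ll_variable, k_variable,
                         min_delta=0.0):
    """Return (decision, delta_aic).

    The rate-variable model is preferred only if it lowers AIC by more than
    min_delta. min_delta=0 is plain lowest-AIC selection; a positive value is
    a hypothetical stricter rule, e.g. one calibrated by simulating the null.
    """
    delta = aic(ll_constant, k_constant) - aic(ll_variable, k_variable)
    return delta > min_delta, delta

# Hypothetical log-likelihoods, invented for illustration only.
naive, delta = prefer_rate_variable(-120.0, 1, -117.0, 3)
strict, _ = prefer_rate_variable(-120.0, 1, -117.0, 3, min_delta=4.0)
```

In this made-up case the lowest-AIC rule would select the rate-variable model while the stricter rule would not, which is the kind of disagreement the abstract attributes to inflated Type I error.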

10.
The power to detect linkage for likelihood and nonparametric (Haseman-Elston, affected-sib-pair, and affected-pedigree-member) methods is compared for the case of a common, dichotomous trait resulting from the segregation of two loci. Pedigree data for several two-locus epistatic and heterogeneity models have been simulated, with one of the loci linked to a marker locus. Replicate samples of 20 three-generation pedigrees (16 individuals/pedigree) were simulated and then ascertained for having at least 6 affected individuals. The power of linkage detection calculated under the correct two-locus model is only slightly higher than that under a single-locus model with reduced penetrance. As expected, the nonparametric linkage methods have somewhat lower power than does the lod-score method, the difference depending on the mode of transmission of the linked locus. Thus, for many pedigree linkage studies, the lod-score method will have the best power. However, this conclusion depends on how many times the lod score will be calculated for a given marker. The Haseman-Elston method would likely be preferable to calculating lod scores under a large number of genetic models (i.e., varying both the mode of transmission and the penetrances), since such an analysis requires an increase in the critical value of the lod criterion. The power of the affected-pedigree-member method is lower than that of the other methods, which can be shown to be largely due to the fact that marker genotypes for unaffected individuals are not used.

11.
Statistical methods to map quantitative trait loci (QTL) in outbred populations are reviewed, extensions and applications to human and plant genetic data are indicated, and areas for further research are identified. Simple and computationally inexpensive methods include (multiple) linear regression of phenotype on marker genotypes and regression of squared phenotypic differences among relative pairs on estimated proportions of identity-by-descent at a locus. These methods are less suited for genetic parameter estimation in outbred populations but allow the determination of test statistic distributions via simulation or data permutation; however, further inferences including confidence intervals of QTL location require the use of Monte Carlo or bootstrap sampling techniques. A method which is intermediate in computational requirements is residual maximum likelihood (REML) with a covariance matrix of random QTL effects conditional on information from multiple linked markers. Testing for the number of QTLs on a chromosome is difficult in a classical framework. The computationally most demanding methods are maximum likelihood and Bayesian analysis, which take account of the distribution of multilocus marker-QTL genotypes on a pedigree and permit investigators to fit different models of variation at the QTL. The Bayesian analysis includes the number of QTLs on a chromosome as an unknown.

12.
The use of polynomial functions to describe the average growth trajectory and covariance functions of Nellore and MA (21/32 Charolais + 11/32 Nellore) young bulls in performance tests was studied. The average growth trajectories and additive genetic and permanent environmental covariance functions were fitted with Legendre (linear through quintic) and quadratic B-spline (with two to four intervals) polynomials. In general, the Legendre and quadratic B-spline models that included more covariance parameters provided a better fit to the data. When comparing models with the same number of parameters, the quadratic B-spline provided a better fit than the Legendre polynomials. The quadratic B-spline with four intervals provided the best fit for the Nellore and MA groups. Fitting random regression models with different types of polynomials (Legendre polynomials or B-splines) affected neither the genetic parameter estimates nor the ranking of the Nellore young bulls. However, fitting different types of polynomials affected the genetic parameter estimates and the ranking of the MA young bulls. Parsimonious Legendre or quadratic B-spline models could be used for genetic evaluation of body weight of Nellore young bulls in performance tests, whereas these parsimonious models were less efficient for animals of the MA genetic group owing to limited data at the extreme ages.

13.
Elston TC. Biophysical journal, 2002, 82(3): 1239-1253
A quantitative analysis of experimental data for posttranslational translocation into the endoplasmic reticulum is performed. This analysis reveals that translocation involves a single rate-limiting step, which is postulated to be the release of the signal sequence from the translocation channel. Next, the Brownian ratchet and power stroke models of translocation are compared against the data. The data sets are simultaneously fit using a least-squares criterion, and both models are found to accurately reproduce the experimental results. A likelihood-ratio test reveals that the optimal fit of the Brownian ratchet model, which contains one fewer free parameter, does not differ significantly from that of the power stroke model. Therefore, the data considered here cannot be used to reject this import mechanism. The models are further analyzed using the estimated parameters to make experimentally testable predictions.
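
A likelihood-ratio test between two least-squares fits that differ by one free parameter can be sketched as follows. The sketch assumes nested models and independent Gaussian errors with a common unknown variance, in which case the statistic is n ln(RSS_reduced / RSS_full); the abstract does not state these assumptions, and the residual sums of squares below are hypothetical.

```python
import math

def lrt_from_rss(rss_reduced, rss_full, n_obs):
    """Likelihood-ratio test for two nested least-squares fits, 1 df.

    Assumes i.i.d. Gaussian errors with unknown common variance, so that
    -2 ln(Lambda) = n * ln(RSS_reduced / RSS_full).
    """
    stat = max(n_obs * math.log(rss_reduced / rss_full), 0.0)
    # Chi-square survival function for 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical residual sums of squares, invented for illustration only.
stat, p = lrt_from_rss(rss_reduced=12.6, rss_full=12.0, n_obs=40)
```

With these made-up numbers the p-value is well above 0.05, so the model with fewer parameters would not be rejected, mirroring the kind of conclusion the abstract reports.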

14.
A procedure for comparing survival times between several groups of patients through rank analysis of covariance was introduced by Woolson and Lachenbruch (1983). It is a modification of Quade's rank analysis of covariance procedure (1967) and can be used for the analysis of right-censored data. In this paper, two additional modifications of Quade's original test statistic are proposed and compared to the original modification introduced by Woolson and Lachenbruch. These statistics are compared to one another and to the score test from Cox's proportional hazards model by way of a limited Monte Carlo study. One of the statistics, QR2, is recommended for general use for the rank analysis of covariance of right-censored survivorship data.

15.
In the analysis of data generated by change-point processes, one critical challenge is to determine the number of change-points. The classic Bayes information criterion (BIC) statistic does not work well here because of irregularities in the likelihood function. By asymptotic approximation of the Bayes factor, we derive a modified BIC for the model of Brownian motion with changing drift. The modified BIC is similar to the classic BIC in the sense that the first term consists of the log likelihood, but it differs in the terms that penalize for model dimension. As an example of application, this new statistic is used to analyze array-based comparative genomic hybridization (array-CGH) data. Array-CGH measures the number of chromosome copies at each genome location of a cell sample, and is useful for finding the regions of genome deletion and amplification in tumor cells. The modified BIC performs well compared to existing methods in accurately choosing the number of regions of changed copy number. Unlike existing methods, it does not rely on tuning parameters or intensive computing. Thus it is impartial and easier to understand and to use.

16.
It is fundamentally important to assess the fit of data to model in phylogenetic and evolutionary studies. Phylogenetic methods using molecular sequences typically start with a multiple alignment. It is possible to measure the fit of data to model expectations of data, for example, via the likelihood-ratio (G) test or the X² test, if all sites in all sequences have an unambiguous residue. However, nearly all alignments of interest contain sites (columns of the alignment) with missing data, that is, ambiguous nucleotides, gaps, or unsequenced regions, which must presently be removed before using the above tests. Unfortunately, this is often either undesirable or impractical, as it will discard much of the data. Here, we show how iterative ML estimators may directly estimate the site-pattern probabilities for columns with missing data, given only standard i.i.d. assumptions. The optimization may use an EM or Newton algorithm, or any other hill-climbing approach. The resulting optimal likelihood under the unconstrained or multinomial model may be compared directly with the likelihood of the data coming from the model (a G statistic). Alternatively, the modified observed and the expected frequencies of site patterns may be compared using an X² test. The distribution of such statistics is best assessed using appropriate simulations. The new method is applicable to models using codons or paired sites. The methods are also useful with Hadamard conjugations (spectral analysis) and are illustrated with these and with ML evolutionary models that allow site-rate variability.
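
The two fit statistics named in this abstract can be computed from observed and expected site-pattern counts as sketched below. This covers only the complete-data case; it does not implement the missing-data estimator the abstract describes, and the counts and function name are hypothetical.

```python
import math

def g_and_x2(observed, expected):
    """Likelihood-ratio (G) and Pearson (X^2) goodness-of-fit statistics.

    G   = 2 * sum(O * ln(O / E)), skipping patterns with O = 0
    X^2 = sum((O - E)^2 / E)
    """
    g = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return g, x2

# Hypothetical site-pattern counts, invented for illustration only.
observed = [40, 30, 20, 10]
expected = [35.0, 30.0, 25.0, 10.0]
g, x2 = g_and_x2(observed, expected)
```

As the abstract notes, the reference distribution for such statistics is best obtained by simulation, so no p-value is computed here.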

17.
On the inheritance of abdominal aortic aneurysm.   Cited by: 7 (self-citations: 0; by others: 7)
To determine the mode of inheritance of abdominal aortic aneurysm, data on first-degree relatives of 91 probands were collected. Results of segregation analysis performed on these data are reported. Many models, including nongenetic and genetic models, were compared using likelihood methods. The nongenetic model was rejected; statistically significant evidence in favor of a genetic model was found. Among the many genetic models compared, the most parsimonious genetic model was that susceptibility to abdominal aortic aneurysm is determined by a recessive gene at an autosomal diallelic major locus. A multifactorial component in addition to the major locus does not increase the likelihood of the data significantly.

18.
A central goal of computational biology is the prediction of phenotype from DNA and protein sequence data. Recent models of sequence change use in silico prediction systems to incorporate the effects of phenotype on evolutionary rates. These models have been designed for analyzing sequence data from different species and have been accompanied by statistical techniques for estimating model parameters when the incorporation of phenotype induces dependent change among sequence positions. A difficulty with these efforts to link phenotype and interspecific evolution is that evolution occurs within populations, and parameters of interspecific models should have population genetic interpretations. We show, with two examples, how population genetic interpretations can be assigned to evolutionary models. The first example considers the impact of RNA secondary structure on sequence change, and the second reflects the tendency for protein tertiary structure to influence nonsynonymous substitution rates. We argue that statistical fit to data should not be the sole criterion for assessing models of sequence change. A good interspecific model should also yield a clear and biologically plausible population genetic interpretation.

19.
Generalized linear model analyses of repeated measurements typically rely on simplifying mathematical models of the error covariance structure for testing the significance of differences in patterns of change across time. The robustness of the tests of significance depends not only on the degree of agreement between the specified mathematical model and the actual population data structure, but also on the precision and robustness of the computational criteria for fitting the specified covariance structure to the data. Generalized estimating equation (GEE) solutions utilizing the robust empirical sandwich estimator for modeling of the error structure were compared with general linear mixed model (GLMM) solutions that utilized the commonly employed restricted maximum likelihood (REML) procedure. Under the conditions considered, the GEE and GLMM procedures were identical in assuming that the data are normally distributed and that the variance-covariance structure of the data is the one specified by the user. The question addressed in this article concerns the relative sensitivity of tests of significance for treatment effects to varying degrees of misspecification of the error covariance structure model when fitted by the alternative procedures. Simulated data that were subjected to Monte Carlo evaluation of actual Type I error and power of tests of the equal slopes hypothesis conformed to assumptions of ordinary linear model ANOVA for repeated measures except for autoregressive covariance structures and missing data due to dropouts. The actual within-groups correlation structures of the simulated repeated measurements ranged from AR(1) to compound symmetry in graded steps, whereas the GEE and GLMM formulations restricted the respective error structure models to be either AR(1), compound symmetry (CS), or unstructured (UN).
The GEE-based tests utilizing empirical sandwich estimator criteria were documented to be relatively insensitive to misspecification of the covariance structure models, whereas GLMM tests which relied on restricted maximum likelihood (REML) were highly sensitive to relatively modest misspecification of the error correlation structure even though normality, variance homogeneity, and linearity were not an issue in the simulated data. Goodness-of-fit statistics were of little utility in identifying cases in which relatively minor misspecification of the GLMM error structure model resulted in inadequate alpha protection for tests of the equal slopes hypothesis. Both GEE and GLMM formulations that relied on unstructured (UN) error model specification produced nonconservative results regardless of the actual correlation structure of the repeated measurements. A random coefficients model produced robust tests with competitive power across all conditions examined. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

20.
We consider the statistical modeling and analysis of replicated multi-type point process data with covariates. Such data arise when heterogeneous subjects experience repeated events or failures which may be of several distinct types. The underlying processes are modeled as nonhomogeneous mixed Poisson processes with random (subject) and fixed (covariate) effects. The method of maximum likelihood is used to obtain estimates and standard errors of the failure rate parameters and regression coefficients. Score tests and likelihood ratio statistics are used for covariate selection. A graphical test of goodness of fit of the selected model is based on generalized residuals. Measures for determining the influence of an individual observation on the estimated regression coefficients and on the score test statistic are developed. An application is described to a large ongoing randomized controlled clinical trial for the efficacy of nutritional supplements of selenium for the prevention of two types of skin cancer.
