Similar Literature
20 similar articles found.
1.
Estimation of variance components in linear mixed models is important in clinical trials and longitudinal data analysis. It is also important in animal and plant breeding for accurately partitioning total phenotypic variance into genetic and environmental variances. The restricted maximum likelihood (REML) method is often preferred over maximum likelihood (ML) for variance component estimation because REML accounts for the degrees of freedom lost in estimating the fixed effects. The original restricted likelihood function involves a linear transformation of the response variable (a collection of error contrasts). Harville's final form of the restricted likelihood function does not involve this transformation and is therefore much easier to manipulate. There are several ways to show that the two forms of the restricted likelihood are equivalent. In this study, I present a much simpler derivation of Harville's restricted likelihood function. I first treat the fixed effects as random effects and call such a mixed model a pseudo random model (PDRM). I then construct a likelihood function for the PDRM. Finally, I let the variance of the pseudo random effects tend to infinity and show that the limit of the likelihood function of the PDRM is the restricted likelihood function.
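The limiting argument can be sketched in standard mixed-model notation (a schematic reconstruction from the abstract, not the author's full derivation; the symbols below are the usual ones, not necessarily the paper's):

```latex
% Mixed model:  y = X\beta + Zu + e,  u ~ N(0, G),  e ~ N(0, R),
% with marginal covariance V = ZGZ' + R.
% PDRM: treat the fixed effects as random, \beta ~ N(0, \tau^2 I), so that
% marginally  y ~ N(0, V_\tau)  with  V_\tau = V + \tau^2 XX', and
\[
  \ell_\tau \;=\; -\tfrac12 \log |V_\tau| \;-\; \tfrac12\, y' V_\tau^{-1} y .
\]
% Letting \tau^2 \to \infty and discarding terms free of the variance
% parameters recovers Harville's restricted log-likelihood:
\[
  \ell_R \;=\; -\tfrac12 \log |V| \;-\; \tfrac12 \log |X' V^{-1} X|
  \;-\; \tfrac12\, (y - X\hat\beta)' V^{-1} (y - X\hat\beta),
  \qquad
  \hat\beta = (X' V^{-1} X)^{-1} X' V^{-1} y .
\]
```

The second display is the well-known Harville (1974) form; the abstract's contribution is obtaining it as the limit of the first.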

2.
Nielsen R. Genetics, 2000, 154(2): 931-942.
Some general likelihood and Bayesian methods for analyzing single nucleotide polymorphisms (SNPs) are presented. First, an efficient method for estimating demographic parameters from SNPs in linkage equilibrium is derived. The method is applied in the estimation of growth rates of a human population based on 37 SNP loci. It is demonstrated how ascertainment biases, due to biased sampling of loci, can be avoided, at least in some cases, by appropriate conditioning when calculating the likelihood function. Second, a Markov chain Monte Carlo (MCMC) method for analyzing linked SNPs is developed. This method can be used for Bayesian and likelihood inference on linked SNPs. The utility of the method is illustrated by estimating recombination rates in a human data set containing 17 SNPs and 60 individuals. Both methods are based on assumptions of low mutation rates.

3.
A commonly used tool in disease association studies is the search for discrepancies between the haplotype distributions in the case and control populations. To find such discrepancies, the haplotype frequencies in each population are estimated from the genotypes. We present a new method, HAPLOFREQ, to estimate haplotype frequencies over a short genomic region given genotypes or haplotypes with missing data or sequencing errors. Our approach incorporates a maximum likelihood model based on a simple random generative model which assumes that the genotypes are independently sampled from the population. We first show that if the phased haplotypes are given, possibly with missing data, we can estimate the frequency of the haplotypes in the population by finding the global optimum of the likelihood function in polynomial time. If the haplotypes are not phased, finding the maximum of the likelihood function is NP-hard. In this case, we define an alternative function that can be thought of as a relaxed likelihood. We show that the maximum of the relaxed likelihood can be found in polynomial time and that its optimal solution converges asymptotically to the haplotype frequencies in the population. In contrast to previous approaches, our algorithms are guaranteed to converge in polynomial time to a global maximum of the respective likelihood functions. We compared the performance of our algorithm to the widely used program PHASE and found that our estimates are at least 10% more accurate and about ten times faster than PHASE. Our techniques involve new algorithms in convex optimization, which may be of independent interest; in particular, they may be helpful in other maximum likelihood problems arising from survey sampling.
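For intuition about the underlying estimation problem, here is the classical EM approach to haplotype frequencies from unphased genotypes (this is the textbook algorithm, not HAPLOFREQ's convex-optimization method, and it carries no global-optimum guarantee; the two-SNP setup and all names are mine):

```python
from itertools import product

# haplotypes over two biallelic sites
HAPS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def compatible_pairs(geno):
    """All ordered haplotype pairs (h1, h2) summing to the genotype,
    where each genotype coordinate counts derived alleles (0, 1, or 2)."""
    return [(h1, h2) for h1, h2 in product(HAPS, repeat=2)
            if all(a + b == g for a, b, g in zip(h1, h2, geno))]

def em_hap_freqs(genotypes, n_iter=200):
    """EM: E-step splits each genotype over its compatible haplotype
    pairs in proportion to current frequency products; M-step renormalizes."""
    freq = {h: 1.0 / len(HAPS) for h in HAPS}
    for _ in range(n_iter):
        counts = {h: 0.0 for h in HAPS}
        for g in genotypes:
            pairs = compatible_pairs(g)
            weights = [freq[h1] * freq[h2] for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        n = 2 * len(genotypes)
        freq = {h: c / n for h, c in counts.items()}
    return freq
```

Only the double-heterozygote genotype (1, 1) is phase-ambiguous here; given some unambiguous genotypes, EM resolves it toward the more frequent haplotype pairing.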

4.
Distribution-free regression analysis of grouped survival data
Methods based on regression models for logarithmic hazard functions (Cox models) are given for the analysis of grouped and censored survival data. By making an approximation it is possible to obtain explicitly a maximum likelihood function involving only the regression parameters. This likelihood function is a convenient analog to Cox's partial likelihood for ungrouped data. The method is applied to data from a toxicological experiment.

5.
A Likelihood Approach to Populations Samples of Microsatellite Alleles
Nielsen R. Genetics, 1997, 146(2): 711-716.
This paper presents a likelihood approach to population samples of microsatellite alleles. A Markov chain recursion method previously published by Griffiths and Tavaré is applied to estimate the likelihood function under different models of microsatellite evolution. The method presented can be applied to estimate a fundamental population genetics parameter, θ, as well as parameters of the mutational model. The new likelihood estimator provides a better estimator of θ in terms of mean square error than previous approaches. Furthermore, it is demonstrated how the method may easily be applied to test models of microsatellite evolution. In particular, it is shown how to compare a one-step model of microsatellite evolution to a multi-step model by a likelihood ratio test.

6.
When family data are ascertained through single selection based on truncation, a prevailing method of analysis is to condition the likelihood function on the proband's actual phenotypic value. An alternative method conditions the likelihood function on the event that the proband's measurement lies in the truncation region. Both methods are contrasted here using Monte Carlo simulations; identical sets of data were analyzed using both methods. The results suggest that, under either method, (1) parameter estimates are nearly unbiased and (2) likelihood-ratio tests of null hypotheses are approximately chi-square distributed. However, conditioning on the proband's actual phenotypic value yields considerably less efficient estimates and reduced power for hypothesis tests. A corresponding result also holds under complete ascertainment. It is argued, therefore, that whenever sufficient information is available on the nature of the truncation, the alternative approach should be used.
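A minimal sketch of the "condition on the truncation region" likelihood for a normal trait with probands ascertained above a threshold t (single probands only, no relatives; all function names and the grid-search maximizer are mine):

```python
import math
import random

def norm_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def norm_sf(t, mu, sd):
    # P(X > t) for X ~ N(mu, sd^2)
    return 0.5 * math.erfc((t - mu) / (sd * math.sqrt(2)))

def trunc_loglik(mu, xs, t, sd=1.0):
    # likelihood of each proband's value conditional on it lying in the
    # truncation region (X > t):  f(x; mu, sd) / P(X > t)
    return sum(math.log(norm_pdf(x, mu, sd)) - math.log(norm_sf(t, mu, sd))
               for x in xs)

def mle_mu(xs, t, lo=-3.0, hi=3.0, step=0.01):
    grid = [lo + i * step for i in range(int((hi - lo) / step) + 1)]
    return max(grid, key=lambda m: trunc_loglik(m, xs, t))

# ascertain probands with phenotype above t = 1 from a N(0, 1) population
random.seed(0)
xs = []
while len(xs) < 500:
    x = random.gauss(0, 1)
    if x > 1.0:
        xs.append(x)

naive = sum(xs) / len(xs)        # ignores truncation: badly biased upward
corrected = mle_mu(xs, t=1.0)    # conditions on the truncation region
```

The naive mean estimates E[X | X > 1] rather than the population mean, while the truncation-corrected MLE recovers a value near the true mean of 0.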

7.
Owing to its robustness properties, marginal interpretations, and ease of implementation, the pseudo-partial likelihood method proposed in the seminal papers of Pepe and Cai and Lin et al. has become the default approach for analyzing recurrent event data with Cox-type proportional rate models. However, the construction of the pseudo-partial score function ignores the dependency among recurrent events and thus can be inefficient. An attempt to investigate the asymptotic efficiency of weighted pseudo-partial likelihood estimation found that the optimal weight function involves the unknown variance–covariance process of the recurrent event process and may not have closed-form expression. Thus, instead of deriving the optimal weights, we propose to combine a system of pre-specified weighted pseudo-partial score equations via the generalized method of moments and empirical likelihood estimation. We show that a substantial efficiency gain can be easily achieved without imposing additional model assumptions. More importantly, the proposed estimation procedures can be implemented with existing software. Theoretical and numerical analyses show that the empirical likelihood estimator is more appealing than the generalized method of moments estimator when the sample size is sufficiently large. An analysis of readmission risk in colorectal cancer patients is presented to illustrate the proposed methodology.

8.
There has been growing interest in the likelihood paradigm of statistics, where statistical evidence is represented by the likelihood function and its strength is measured by likelihood ratios. The available literature in this area has so far focused on parametric likelihood functions, though in some cases a parametric likelihood can be robustified. This focused discussion of parametric models, while insightful and productive, may have left the impression that the likelihood paradigm is best suited to parametric situations. This article discusses the use of empirical likelihood functions, a well-developed methodology in the frequentist paradigm, to interpret statistical evidence in nonparametric and semiparametric situations. A comparative review of the literature shows that, while an empirical likelihood is not a true probability density, it has the essential properties, namely consistency and local asymptotic normality, that unify and justify the various parametric likelihood methods for evidential analysis. Real examples are presented to illustrate and compare the empirical likelihood method and parametric likelihood methods. These methods are also compared in terms of asymptotic efficiency by combining relevant results from different areas. It is seen that a parametric likelihood based on a correctly specified model is generally more efficient than an empirical likelihood for the same parameter. However, when the working model fails, a parametric likelihood either breaks down or, if a robust version exists, becomes less efficient than the corresponding empirical likelihood.
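For readers unfamiliar with the construction, here is a minimal sketch of the empirical likelihood for a scalar mean (Owen's formulation, with the Lagrange multiplier found by bisection; this toy is mine, not an example from the article):

```python
import math

def el_stat(xs, mu, tol=1e-12):
    """-2 log empirical likelihood ratio for testing E[X] = mu.
    Maximizes prod(n * w_i) s.t. sum(w_i) = 1, sum(w_i * (x_i - mu)) = 0.
    Returns inf if mu lies outside the convex hull of the data."""
    z = [x - mu for x in xs]
    if min(z) >= 0 or max(z) <= 0:
        return float('inf')
    # the constraint 1 + lam * z_i > 0 for all i confines lam to (lo, hi)
    lo = -1.0 / max(z) + tol
    hi = -1.0 / min(z) - tol

    def g(lam):  # profile score in lam; strictly decreasing
        return sum(zi / (1.0 + lam * zi) for zi in z)

    for _ in range(200):          # bisection for the root of g
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    # w_i = 1 / (n * (1 + lam * z_i)), so -2 log R = 2 * sum log(1 + lam * z_i)
    return 2.0 * sum(math.log(1.0 + lam * zi) for zi in z)
```

Under mild conditions `el_stat(xs, mu0)` is asymptotically chi-square with one degree of freedom at the true mean, which is the nonparametric Wilks phenomenon the article builds on.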

9.
Xu S. Genetics, 1996, 144(4): 1951-1960.
The proportion of alleles identical by descent (IBD) determines the genetic covariance between relatives and thus is crucial in estimating genetic variances of quantitative trait loci (QTL). However, IBD proportions at QTL are unobservable and must be inferred from marker information. The conventional method of QTL variance analysis maximizes the likelihood function by replacing the missing IBDs with their conditional expectations (the expectation method), whereas the full likelihood function should take into account the conditional distribution of the IBDs (the distribution method). The distribution method for families of more than two sibs has not been obvious because there are n(n - 1)/2 IBD variables in a family of size n, forming an n × n symmetric matrix. In this paper, I use four binary variables, where each indicates the event that an allele from one of the four grandparents has passed to the individual. The IBD proportion between any two sibs is then expressed as a function of the indicators. Subsequently, the joint distribution of the IBD matrix is derived from the distribution of the indicator variables. Given the joint distribution of the unknown IBDs, a method to compute the full likelihood function is developed for families of arbitrary size.
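A small simulation sketch of the indicator construction (my simplified rendering: each sib carries one paternal and one maternal transmission indicator, which together encode the four grandparental origins, and the IBD proportion between two sibs averages the two matches):

```python
import random

def sib_ibd_matrix(n, rng):
    """IBD proportion matrix for n full sibs. For each sib, inh = (u, v):
    u = which of the father's two alleles was inherited, v = which of the
    mother's (each 0 or 1, independent fair coin flips)."""
    inh = [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n)]
    pi = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                pi[i][j] = ((inh[i][0] == inh[j][0]) +
                            (inh[i][1] == inh[j][1])) / 2.0
    return pi
```

The full matrix is determined by only 2n indicators rather than n(n - 1)/2 free entries, which is what makes the joint distribution of the IBD matrix tractable; for a sib pair the proportion takes values 0, 1/2, 1 with probabilities 1/4, 1/2, 1/4.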

10.
Xue Liugen, Zhu Lixing. Biometrika, 2007, 94(4): 921-937.
A semiparametric regression model for longitudinal data is considered. The empirical likelihood method is used to estimate the regression coefficients and the baseline function, and to construct confidence regions and intervals. It is proved that the maximum empirical likelihood estimator of the regression coefficients achieves asymptotic efficiency and the estimator of the baseline function attains asymptotic normality when a bias correction is made. Two calibrated empirical likelihood approaches to inference for the baseline function are developed. We propose a groupwise empirical likelihood procedure to handle the inter-series dependence for the longitudinal semiparametric regression model, and employ bias correction to construct the empirical likelihood ratio functions for the parameters of interest. This leads us to prove a nonparametric version of Wilks' theorem. Compared with methods based on normal approximations, the empirical likelihood does not require consistent estimators for the asymptotic variance and bias. A simulation compares the empirical likelihood and normal-based methods in terms of coverage accuracies and average areas/lengths of confidence regions/intervals.

11.
A new method is presented for studying the relationship between human recombination fractions and parental age at the time of conception. Assuming the sex-specific recombination fraction to be a linear function of age, a feasible computer algorithm is described whereby the likelihood of multigenerational families can be calculated. Using this method and the likelihood ratio test, it is found that for the ABO:nail-patella linkage, age (P = .17) is more significant than sex (P = .23) in its effect on the recombination fraction. The age effect, if real, appears to be limited to males: the paternal recombination fraction decreases by 0.0062 (± 0.0036) per year.

12.
Approximate likelihood ratios for general estimating functions
The method of estimating functions (Godambe, 1991) is commonly used when one desires to conduct inference about some parameters of interest but the full distribution of the observations is unknown. However, this approach may have limited utility, due to multiple roots for the estimating function, a poorly behaved Wald test, or lack of a goodness-of-fit test. This paper presents approximate likelihood ratios that can be used along with estimating functions when any of these three problems occurs. We show that the approximate likelihood ratio provides correct large sample inference under very general circumstances, including clustered data and misspecified weights in the estimating function. Two methods of constructing the approximate likelihood ratio, one based on the quasi-likelihood approach and the other based on the linear projection approach, are compared and shown to be closely related. In particular we show that quasi-likelihood is the limit of the projection approach. We illustrate the technique with two applications.

13.
BACKGROUND: Linkage analysis is a useful tool for detecting genetic variants that regulate a trait of interest, especially genes associated with a given disease. Although penetrance parameters play an important role in determining gene location, they are often assigned arbitrary values according to the researcher's intuition or estimated by the maximum likelihood principle. Several methods exist for evaluating maximum likelihood estimates of penetrance, although not all of these are supported by software packages, and some are biased by marker genotype information even when disease development is due solely to the genotype of a single allele. FINDINGS: Programs for exploring the maximum likelihood estimates of penetrance parameters were developed using the R statistical programming language supplemented by external C functions. The software returns a vector of polynomial coefficients of the penetrance parameters, representing the likelihood of the pedigree data. From the likelihood polynomial supplied by the proposed method, the likelihood value and its gradient can be computed precisely. To reduce the effect of the supplied dataset on the likelihood function, feasible parameter constraints can be introduced into the maximum likelihood estimates, enabling flexible exploration of the penetrance estimates. An auxiliary program generates a perspective plot allowing visual validation of the model's convergence. The functions are collectively available as the MLEP R package. CONCLUSIONS: Linkage analysis using penetrance parameters estimated by the MLEP package enables feasible localization of a disease locus. This is shown through a simulation study and by demonstrating how the package is used to explore maximum likelihood estimates. Although the input dataset tends to bias the likelihood estimates, the method yields accurate results superior to an analysis using intuitive penetrance values for diseases with low allele frequencies.
MLEP is part of the Comprehensive R Archive Network and is freely available at http://cran.r-project.org/web/packages/MLEP/index.html.

14.
A method to estimate genetic variance components in populations partially pedigreed by DNA fingerprinting is presented. The focus is on aquaculture, where breeding procedures may produce thousands of individuals. In aquaculture populations the individuals available for measurement will often be selected, i.e. will come from the upper tail of a size‐at‐age distribution, or the lower tail of an age‐at‐maturity distribution etc. Selection typically occurs by size grading during grow‐out and/or choice of superior fish as broodstock. The method presented in this paper enables us to estimate genetic variance components when only a small proportion of individuals, those with extreme phenotypes, have been identified by DNA fingerprinting. We replace the usual normal density by appropriate robust least favourable densities to ensure the robustness of our estimates. Standard analysis of variance or maximum likelihood estimation cannot be used when only the extreme progeny have been pedigreed because of the biased nature of the estimates. In our model‐based procedure a full robust likelihood function is defined, in which the missing information about non‐extreme progeny has been taken into account. This robust likelihood function is transformed into a computable function which is maximized to get the estimates. The estimates of sire and dam additive variance components are significantly and uniformly more accurate than those obtained by any of the standard methods when tested on simulated population data and have desirable robustness properties.

15.
The molecular clock provides a powerful way to estimate species divergence times. If information on some species divergence times is available from the fossil or geological record, it can be used to calibrate a phylogeny and estimate divergence times for all nodes in the tree. The Bayesian method provides a natural framework to incorporate different sources of information concerning divergence times, such as information in the fossil and molecular data. Current models of sequence evolution are intractable in a Bayesian setting, and Markov chain Monte Carlo (MCMC) is used to generate the posterior distribution of divergence times and evolutionary rates. This method is computationally expensive, as it involves the repeated calculation of the likelihood function. Here, we explore the use of Taylor expansion to approximate the likelihood during MCMC iteration. The approximation is much faster than conventional likelihood calculation. However, the approximation is expected to be poor when the proposed parameters are far from the likelihood peak. We explore the use of parameter transforms (square root, logarithm, and arcsine) to improve the approximation to the likelihood curve. We found that the new methods, particularly the arcsine-based transform, provided very good approximations under relaxed clock models and also under the global clock model when the global clock is not seriously violated. The approximation is poorer for analysis under the global clock when the global clock is seriously wrong and should thus not be used. The results suggest that the approximate method may be useful for Bayesian dating analysis using large data sets.
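The idea of Taylor-approximating a log-likelihood around its peak, and of a transform improving the approximation, can be illustrated with a one-parameter toy (a Poisson log-likelihood, not the paper's phylogenetic models; here the square-root transform is variance-stabilizing, playing the role the arcsine transform plays in the paper):

```python
import math

def pois_loglik(lam, xs):
    # Poisson log-likelihood up to an additive constant
    return sum(xs) * math.log(lam) - len(xs) * lam

def taylor_at_mle(lam, xs, transform):
    """Second-order Taylor expansion of the log-likelihood around the MLE,
    taken in the given parameterization of lambda."""
    n, s = len(xs), sum(xs)
    mle = s / n
    if transform == 'identity':
        d2 = -s / mle ** 2                 # l''(lam) at the MLE
        step = lam - mle
    elif transform == 'sqrt':
        d2 = -4.0 * n                      # l''(sqrt(lam)) at the MLE
        step = math.sqrt(lam) - math.sqrt(mle)
    elif transform == 'log':
        d2 = -s                            # l''(log(lam)) at the MLE
        step = math.log(lam) - math.log(mle)
    else:
        raise ValueError(transform)
    return pois_loglik(mle, xs) + 0.5 * d2 * step ** 2
```

Away from the peak the quadratic in the square-root parameterization tracks the exact curve far better than the untransformed quadratic, which is the paper's motivation for transforming before expanding.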

16.
The heterogeneous Poisson process with a discretized exponential quadratic rate function is considered. Maximum likelihood estimates of the parameters of the rate function are derived for the case when the data consist of numbers of occurrences in consecutive equal time periods. A likelihood ratio test of the null hypothesis of an exponential quadratic rate is presented. Its power against exponential linear rate functions is estimated using Monte Carlo simulation. The maximum likelihood method is compared with a log-linear least squares technique. An application of the technique to the analysis of mortality rates due to congenital malformations is presented.
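The MLE in this setting is a Poisson regression of the period counts on a quadratic in time. A sketch via Newton scoring (equivalently IRLS), using a hand-rolled 3×3 solver so it is self-contained; this is a generic GLM fit under my own parameterization log rate = a + b·t + c·t², not the paper's derivation:

```python
import math

def solve3(A, b):
    # Gaussian elimination with partial pivoting for a 3x3 system
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x

def fit_quadratic_rate(counts, times, n_iter=50):
    """Newton/IRLS fit of log rate = a + b*t + c*t^2 to period counts."""
    X = [[1.0, t, t * t] for t in times]
    beta = [math.log(sum(counts) / len(counts) + 1e-9), 0.0, 0.0]
    for _ in range(n_iter):
        # cap the linear predictor to guard against overflow early on
        mu = [math.exp(min(30.0, sum(b * x for b, x in zip(beta, row))))
              for row in X]
        # score U = X'(y - mu), Fisher information I = X' diag(mu) X
        U = [sum(X[i][p] * (counts[i] - mu[i]) for i in range(len(X)))
             for p in range(3)]
        I = [[sum(X[i][p] * mu[i] * X[i][q] for i in range(len(X)))
              for q in range(3)] for p in range(3)]
        beta = [b + s for b, s in zip(beta, solve3(I, U))]
    return beta
```

Fitted to noise-free expected counts the score equations are solved exactly, so the true coefficients are recovered; the paper's likelihood ratio test of quadratic versus linear rate would compare this fit against one with c fixed at 0.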

17.
Proportional hazards model with covariates subject to measurement error.
Nakamura T. Biometrics, 1992, 48(3): 829-838.
When covariates of a proportional hazards model are subject to measurement error, the maximum likelihood estimates of regression coefficients based on the partial likelihood are asymptotically biased. Prentice (1982, Biometrika 69, 331-342) presents an example of such bias and suggests a modified partial likelihood. This paper applies the corrected score function method (Nakamura, 1990, Biometrika 77, 127-137) to the proportional hazards model when measurement errors are additive and normally distributed. The result allows a simple correction to the ordinary partial likelihood that yields asymptotically unbiased estimates; the validity of the correction is confirmed via a limited simulation study.
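The corrected-score idea is easiest to see in a no-intercept linear model rather than the Cox model treated in the paper (my toy example, not Nakamura's estimator): the naive score is biased by the error variance, and subtracting that known bias from the score yields an unbiased estimating equation.

```python
import random

def naive_and_corrected_slope(w, y, sigma_u2):
    """Model y = beta * x + e, with x observed as w = x + u, u ~ N(0, sigma_u2).
    Naive score sum(w * (y - b * w)) = 0 gives an attenuated slope; the
    corrected score replaces w^2 by its unbiased version w^2 - sigma_u2."""
    n = len(w)
    swy = sum(wi * yi for wi, yi in zip(w, y))
    sww = sum(wi * wi for wi in w)
    naive = swy / sww
    corrected = swy / (sww - n * sigma_u2)
    return naive, corrected

# simulate: true slope 2, unit-variance x, measurement error variance 0.25
random.seed(1)
n, beta, sigma_u2 = 2000, 2.0, 0.25
x = [random.gauss(0, 1) for _ in range(n)]
w = [xi + random.gauss(0, 0.5) for xi in x]
y = [beta * xi + random.gauss(0, 0.1) for xi in x]
naive, corrected = naive_and_corrected_slope(w, y, sigma_u2)
```

Here the naive slope is attenuated toward beta · 1/(1 + 0.25) = 1.6, while the corrected slope recovers approximately 2.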

18.
We present a statistical framework for estimation and application of sample allele frequency spectra from next-generation sequencing (NGS) data. In this method, we first estimate the allele frequency spectrum using maximum likelihood. In contrast to previous methods, the likelihood function is calculated using a dynamic programming algorithm and numerically optimized using analytical derivatives. We then use a Bayesian method for estimating the sample allele frequency at a single site, and show how the method can be used for genotype calling and SNP calling. We also show how the method can be extended to various other cases, including deviations from Hardy-Weinberg equilibrium. We evaluate the statistical properties of the methods using simulations and by application to a real data set.
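A rough per-site sketch of the genotype-likelihood machinery such frameworks build on (this assumes Hardy-Weinberg proportions and uses a simple grid search for the site allele frequency, rather than the paper's dynamic-programming spectrum estimate; all names are mine):

```python
import math

def hwe_prior(g, f):
    # P(genotype g | derived allele frequency f) under Hardy-Weinberg
    return ((1 - f) ** 2, 2 * f * (1 - f), f * f)[g]

def site_loglik(f, gls):
    # gls: per-individual genotype likelihoods [L(g=0), L(g=1), L(g=2)],
    # e.g. computed from read data; individuals are independent given f
    return sum(math.log(sum(L[g] * hwe_prior(g, f) for g in range(3)))
               for L in gls)

def ml_freq(gls, grid=201):
    fs = [i / (grid - 1) for i in range(grid)]
    return max(fs, key=lambda f: site_loglik(f, gls))

def genotype_posterior(L, f):
    # Bayesian genotype call: prior from the estimated frequency
    w = [L[g] * hwe_prior(g, f) for g in range(3)]
    s = sum(w)
    return [wi / s for wi in w]
```

With informative genotype likelihoods, the estimated frequency tracks the sample allele count, and the posterior picks the supported genotype even when no hard calls are made.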

19.
Tai JJ, Hsiao CK. Human Heredity, 2001, 51(4): 192-198.
In human genetic analysis, data are collected through the so-called 'ascertainment procedure'. Statistically, this sampling scheme can be thought of as a multistage sampling method. At the first stage, one or several probands are ascertained. At the subsequent stages, a sequential sampling scheme is applied. Sampling in such a way is virtually a nonrandom procedure, which, in most cases, causes biased estimation that may be intractable. This paper focuses on the underlying causes of the intractability problem of ascertained genetic data. Three types of parameters, i.e. target, design and nuisance parameters, are defined as the essential elements in formulating the true likelihood of a set of data. These parameters are also classified as explicit or implicit depending on whether they can be expressed explicitly in the likelihood function. For ascertained genetic data, a sequential scheme is regarded as an implicit design parameter, and a true pedigree structure as an implicit nuisance parameter. The intractability problem is attributed to loss of information on any implicit parameter in the likelihood formulation. Several approaches to building a likelihood for estimation of the segregation ratio when only an observed pedigree structure is available are proposed.

20.
Maximum likelihood (ML) for phylogenetic inference from sequence data remains a method of choice, but has computational limitations. In particular, it cannot be applied for a global search through all potential trees when the number of taxa is large, and hence a heuristic restriction in the search space is required. In this paper, we derive a quadratic approximation, QAML, to the likelihood function whose maximum is easily determined for a given tree. The derivation depends on Hadamard conjugation, and hence is limited to the simple symmetric models of Kimura and of Jukes and Cantor. Preliminary testing has demonstrated the accuracy of QAML is close to that of ML.
