Similar articles
 Found 20 similar articles; search time: 31 ms
1.
Bayesian Inference in Semiparametric Mixed Models for Longitudinal Data   Total citations: 1 (self-citations: 0, citations by others: 1)
Summary. We consider Bayesian inference in semiparametric mixed models (SPMMs) for longitudinal data. SPMMs are a class of models that use a nonparametric function to model a time effect, a parametric function to model other covariate effects, and parametric or nonparametric random effects to account for the within-subject correlation. We model the nonparametric function using a Bayesian formulation of a cubic smoothing spline, and the random effect distribution using a normal distribution and alternatively a nonparametric Dirichlet process (DP) prior. When the random effect distribution is assumed to be normal, we propose a uniform shrinkage prior (USP) for the variance components and the smoothing parameter. When the random effect distribution is modeled nonparametrically, we use a DP prior with a normal base measure and propose a USP for the hyperparameters of the DP base measure. We argue that the commonly assumed DP prior implies a nonzero mean of the random effect distribution, even when a base measure with mean zero is specified. This implies weak identifiability for the fixed effects, and can therefore lead to biased estimators and poor inference for the regression coefficients and the spline estimator of the nonparametric function. We propose an adjustment using a postprocessing technique. We show that under mild conditions the posterior is proper under the proposed USP, a flat prior for the fixed effect parameters, and an improper prior for the residual variance. We illustrate the proposed approach using a longitudinal hormone dataset, and carry out extensive simulation studies to compare its finite sample performance with existing methods.

2.
Böhning D  Sarol J 《Biometrics》2000,56(1):304-308
In this paper, we consider efficient estimation of the risk difference in a multicenter study allowing for baseline heterogeneity. We consider the optimally weighted estimator for the common risk difference and show that this estimator has considerable bias when the true weights (which are inversely proportional to the variances of the center-specific risk difference estimates) are replaced by their sample estimates. In addition, we propose a new Mantel-Haenszel-type estimator for this situation that is unbiased and also has a smaller variance for small sample sizes within the study centers. Simulations illustrate these findings.
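The contrast between the two weighting schemes can be sketched in a few lines of Python. This is only an illustration: the abstract does not state its formulas, so the sketch assumes the commonly used Mantel-Haenszel-type weights n1*n0/(n1+n0), which depend only on the sample sizes, and plug-in inverse-variance weights for comparison. The function names and the example counts are hypothetical.

```python
def center_rd(x1, n1, x0, n0):
    """Risk difference in one center: treated risk minus control risk."""
    return x1 / n1 - x0 / n0


def inverse_variance_rd(centers):
    """Pooled risk difference with estimated inverse-variance weights.

    The weights use the sample proportions, which is the plug-in step the
    abstract identifies as a source of bias. Raises ZeroDivisionError if a
    center has an estimated variance of zero.
    """
    num = den = 0.0
    for x1, n1, x0, n0 in centers:
        p1, p0 = x1 / n1, x0 / n0
        var = p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0
        weight = 1.0 / var
        num += weight * (p1 - p0)
        den += weight
    return num / den


def mantel_haenszel_rd(centers):
    """Pooled risk difference with weights n1*n0/(n1+n0) (assumed form).

    These weights do not depend on the observed event counts.
    """
    num = den = 0.0
    for x1, n1, x0, n0 in centers:
        weight = n1 * n0 / (n1 + n0)
        num += weight * center_rd(x1, n1, x0, n0)
        den += weight
    return num / den


# Hypothetical data: (events treated, n treated, events control, n control)
centers = [(12, 50, 6, 50), (8, 40, 5, 45), (15, 60, 9, 55)]
print(mantel_haenszel_rd(centers), inverse_variance_rd(centers))
```

Because the Mantel-Haenszel-type weights are fixed by the design, the pooled estimate is a fixed linear combination of the center-specific differences, which is one way to see why it can avoid the bias introduced by estimated weights.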

3.
4.
Reduction of bias in estimating the frequency of recessive genes.   Total citations: 3 (self-citations: 2, citations by others: 1)
The standard approach to estimating the frequency of a completely recessive autosomal gene is to use the maximum-likelihood estimator (MLE), q = √(q²), that is, the square root of the observed frequency of recessive homozygotes. Since the expectation of q using the MLE is systematically less than the true value, this estimator always gives a negatively biased estimate of q. Here we describe the bias associated with the MLE over a range of q and N values, explore some of the properties of this estimator, and propose new estimators which reduce the bias. We also describe some of the new estimators' properties, as well as the remaining bias associated with them for varying q and N values. We further propose one of these estimators as the one which most effectively reduces bias over a specific q value range of approximately .005 to .05, and which is less biased than the MLE over essentially all q and N values. The proposed estimator is also directly compared with the MLE in calculating various available estimates of q, demonstrating the percentage of reduction in bias achieved. This reduction varies from negligible for estimates of q above .3 and N greater than 100, to a 23% reduction in bias for a q value of .09 and an N value of 215.
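The negative bias of the square-root estimator can be checked numerically. The sketch below is not the authors' calculation: it assumes that the number of recessive homozygotes in a random sample of N individuals is binomial with success probability q², and it computes the exact expectation of the estimator under that assumption. The function names are hypothetical.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability, computed in log space for numerical stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def expected_sqrt_estimator(q, n):
    """Exact E[sqrt(X / n)] for X ~ Binomial(n, q**2), with 0 < q < 1."""
    p = q * q
    return sum(math.sqrt(k / n) * binom_pmf(k, n, p) for k in range(n + 1))


def bias(q, n):
    """Expected value of the square-root estimator minus the true q."""
    return expected_sqrt_estimator(q, n) - q


for q, n in [(0.09, 215), (0.3, 100)]:
    print(f"q={q}, N={n}: bias={bias(q, n):.5f}")
```

The bias is negative in both settings because the square root is concave, and it is larger in magnitude when few recessive homozygotes are expected in the sample.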

5.
Robust estimation of multivariate covariance components   Total citations: 1 (self-citations: 0, citations by others: 1)
Dueck A  Lohr S 《Biometrics》2005,61(1):162-169
In many settings, such as interlaboratory testing, small area estimation in sample surveys, and heritability studies, investigators are interested in estimating covariance components for multivariate measurements. However, the presence of outliers can seriously distort estimates obtained using standard procedures such as maximum likelihood. We propose a procedure based on M-estimation for robustly estimating multivariate covariance components in the presence of outliers; the procedure applies to balanced and unbalanced data. We present an algorithm for computing the robust estimates and examine the performance of the estimator through a simulation study. The estimator is used to find covariance components and identify outliers in a study of variability of egg length and breadth measurements of American coots.

6.
A stabilized moment estimator for the beta-binomial distribution   Total citations: 1 (self-citations: 0, citations by others: 1)
R N Tamura  S S Young 《Biometrics》1987,43(4):813-824
The beta-binomial distribution has been proposed as a model for the incorporation of historical control data in the analysis of rodent carcinogenesis bioassays. Low spontaneous tumor incidences along with the small number and sizes of historical control groups combine to make the moment and maximum likelihood estimates of the beta-binomial parameters deficient. We therefore propose a stabilized moment estimator for one of the parameters. The stabilized moment estimator is similar to the ridge regression estimator and introduces a shrinkage parameter. Computer simulations were run to examine the behavior of the stabilized moment estimator. The effect of the stabilized moment estimator on the score test for dose-related trend is considered both on simulated data and on an example from the literature.

7.
We propose a Bayesian hypothesis testing procedure for comparing the distributions of paired samples. The procedure is based on a flexible model for the joint distribution of both samples. The flexibility is given by a mixture of Dirichlet processes. Our proposal uses a spike-slab prior specification for the base measure of the Dirichlet process and a particular parametrization for the kernel of the mixture in order to facilitate comparisons and posterior inference. The joint model allows us to derive the marginal distributions and test whether they differ or not. The procedure exploits the correlation between samples, relaxes the parametric assumptions, and detects possible differences throughout the entire distributions. A Monte Carlo simulation study comparing the performance of this strategy to other traditional alternatives is provided. Finally, we apply the proposed approach to spirometry data collected in the United States to investigate changes in pulmonary function in children and adolescents in response to air polluting factors.

8.
Bochkina N  Richardson S 《Biometrics》2007,63(4):1117-1125
We consider the problem of identifying differentially expressed genes in microarray data in a Bayesian framework with a noninformative prior distribution on the parameter quantifying differential expression. We introduce a new rule, tail posterior probability, based on the posterior distribution of the standardized difference, to identify genes differentially expressed between two conditions, and we derive a frequentist estimator of the false discovery rate associated with this rule. We compare it to other Bayesian rules in the considered settings. We show how the tail posterior probability can be extended to testing a compound null hypothesis against a class of specific alternatives in multiclass data.

9.
Accurate estimation of the size of animal populations is an important task in ecological science. Recent advances in molecular genetics allow the use of genetic data to estimate the size of a population from a single capture occasion rather than repeated occasions as in the usual capture–recapture experiments. Estimating the population size using genetic data has sometimes led to estimates that differ markedly from each other and also from classical capture–recapture estimates. Here, we develop a closed-form estimator that uses genetic information to estimate the size of a population consisting of mothers and daughters, focusing on estimating the number of mothers, using data from a single sample. We demonstrate that the estimator is consistent and propose a parametric bootstrap to estimate the standard errors. The estimator is evaluated in a simulation study and applied to real data. We also consider maximum likelihood in this setting and discover problems that preclude its general use.

10.
Problems involving thousands of null hypotheses have been addressed by estimating the local false discovery rate (LFDR). A previous LFDR approach to reporting point and interval estimates of an effect-size parameter uses an estimate of the prior distribution of the parameter conditional on the alternative hypothesis. That estimated prior is often unreliable, and yet strongly influences the posterior intervals and point estimates, causing the posterior intervals to differ from fixed-parameter confidence intervals, even for arbitrarily small estimates of the LFDR. That influence of the estimated prior manifests the failure of the conditional posterior intervals, given the truth of the alternative hypothesis, to match the confidence intervals. Those problems are overcome by changing the posterior distribution conditional on the alternative hypothesis from a Bayesian posterior to a confidence posterior. Unlike the Bayesian posterior, the confidence posterior equates the posterior probability that the parameter lies in a fixed interval with the coverage rate of the coinciding confidence interval. The resulting confidence-Bayes hybrid posterior supplies interval and point estimates that shrink toward the null hypothesis value. The confidence intervals tend to be much shorter than their fixed-parameter counterparts, as illustrated with gene expression data. Simulations nonetheless confirm that the shrunken confidence intervals cover the parameter more frequently than stated. Generally applicable sufficient conditions for correct coverage are given. In addition to having those frequentist properties, the hybrid posterior can also be motivated from an objective Bayesian perspective by requiring coherence with some default prior conditional on the alternative hypothesis. That requirement generates a new class of approximate posteriors that supplement Bayes factors modified for improper priors and that dampen the influence of proper priors on the credibility intervals. While that class of posteriors intersects the class of confidence-Bayes posteriors, neither class is a subset of the other. In short, two first principles generate both classes of posteriors: a coherence principle and a relevance principle. The coherence principle requires that all effect size estimates comply with the same probability distribution. The relevance principle means effect size estimates given the truth of an alternative hypothesis cannot depend on whether that truth was known prior to observing the data or whether it was learned from the data.

11.
Computing Bayes factors using thermodynamic integration   Total citations: 1 (self-citations: 0, citations by others: 1)
In the Bayesian paradigm, a common method for comparing two models is to compute the Bayes factor, defined as the ratio of their respective marginal likelihoods. In recent phylogenetic work, the numerical evaluation of marginal likelihoods has often been performed using the harmonic mean estimation procedure. In the present article, we propose to employ another method, based on an analogy with statistical physics, called thermodynamic integration. We describe the method, propose an implementation, and show on two analytical examples that this numerical method yields reliable estimates. In contrast, the harmonic mean estimator leads to a strong overestimation of the marginal likelihood, which is all the more pronounced as the model is higher dimensional. As a result, the harmonic mean estimator systematically favors more parameter-rich models, an artefact that might explain some recent puzzling observations, based on harmonic mean estimates, suggesting that Bayes factors tend to overscore complex models. Finally, we apply our method to the comparison of several alternative models of amino-acid replacement. We confirm our previous observations, indicating that modeling pattern heterogeneity across sites tends to yield better models than standard empirical matrices.
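Thermodynamic integration rests on the identity log Z = ∫₀¹ E_β[log L] dβ, where the expectation is taken under the "power posterior" proportional to L^β times the prior. The sketch below illustrates that identity on a toy model that is not from the article: observations y_i ~ N(θ, 1) with prior θ ~ N(0, 1), chosen as an assumption because both the power-posterior expectation and the exact marginal likelihood are available in closed form. In a realistic application the expectation at each β would be estimated by MCMC rather than computed analytically. All function names are hypothetical.

```python
import math


def exact_log_marginal(y):
    """Closed-form log marginal likelihood for y_i ~ N(theta, 1), theta ~ N(0, 1)."""
    n, s = len(y), sum(y)
    ss = sum(v * v for v in y)
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * math.log(1 + n)
            - 0.5 * (ss - s * s / (1 + n)))


def expected_log_lik(y, beta):
    """E[log L(theta)] under the power posterior at inverse temperature beta.

    For this conjugate model the power posterior is normal with
    precision beta*n + 1 and mean beta*sum(y) / (beta*n + 1).
    """
    n, s = len(y), sum(y)
    precision = beta * n + 1.0
    mean, var = beta * s / precision, 1.0 / precision
    expected_sq = sum((v - mean) ** 2 for v in y) + n * var
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * expected_sq


def thermodynamic_log_marginal(y, steps=2000):
    """Trapezoidal integration of E_beta[log L] over beta in [0, 1]."""
    h = 1.0 / steps
    total = 0.5 * (expected_log_lik(y, 0.0) + expected_log_lik(y, 1.0))
    total += sum(expected_log_lik(y, i * h) for i in range(1, steps))
    return h * total


y = [0.3, -1.2, 0.8, 1.5, 0.1]  # hypothetical data
print(thermodynamic_log_marginal(y), exact_log_marginal(y))
```

The two printed values should agree closely, since the only approximation left in this toy setting is the quadrature over β.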

12.
Researchers interested in studying populations that are difficult to reach through traditional survey methods can now draw on a range of methods to access these populations. Yet many of these methods are more expensive and difficult to implement than studies using conventional sampling frames and trusted sampling methods. The network scale-up method (NSUM) provides a middle ground for researchers who wish to estimate the size of a hidden population, but lack the resources to conduct a more specialized hidden population study. Through this method it is possible to generate population estimates for a wide variety of groups that are perhaps unwilling to self-identify as such (for example, users of illegal drugs or other stigmatized populations) via traditional survey tools such as telephone or mail surveys, by asking a representative sample to estimate the number of people they know who are members of such a “hidden” subpopulation. The original estimator is formulated to minimize the weight a single scaling variable can exert upon the estimates. We argue that this introduces hidden and difficult to predict biases, and instead propose a series of methodological advances on the traditional scale-up estimation procedure, including a new estimator. Additionally, we formalize the incorporation of sample weights into the network scale-up estimation process, and propose a recursive process of back estimation “trimming” to identify and remove poorly performing predictors from the estimation process. To demonstrate these suggestions we use data from a network scale-up mail survey conducted in Nebraska during 2014. We find that using the new estimator and recursive trimming process provides more accurate estimates, especially when used in conjunction with sampling weights.

13.
We propose methods for Bayesian inference for a new class of semiparametric survival models with a cure fraction. Specifically, we propose a semiparametric cure rate model with a smoothing parameter that controls the degree of parametricity in the right tail of the survival distribution. We show that such a parameter is crucial for these kinds of models and can have an impact on the posterior estimates. Several novel properties of the proposed model are derived. In addition, we propose a class of improper noninformative priors based on this model and examine the properties of the implied posterior. Also, a class of informative priors based on historical data is proposed and its theoretical properties are investigated. A case study involving a melanoma clinical trial is discussed in detail to demonstrate the proposed methodology.

14.
Estimating pairwise correlation from replicated genome-scale (a.k.a. OMICS) data is fundamental to cluster functionally relevant biomolecules to a cellular pathway. The popular Pearson correlation coefficient estimates bivariate correlation by averaging over replicates. It is not completely satisfactory since it introduces strong bias while reducing variance. We propose a new multivariate correlation estimator that models all replicates as independent and identically distributed (i.i.d.) samples from the multivariate normal distribution. We derive the estimator by maximizing the likelihood function. For small sample data, we provide a resampling-based statistical inference procedure, and for moderate to large sample data, we provide an asymptotic statistical inference procedure based on the Likelihood Ratio Test (LRT). We demonstrate advantages of the new multivariate correlation estimator over the Pearson bivariate correlation estimator using simulations and real-world data analysis examples. AVAILABILITY: The estimator and statistical inference procedures have been implemented in an R package 'CORREP' that is available from CRAN [http://cran.r-project.org] and Bioconductor [http://www.bioconductor.org/]. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

15.
Best linear unbiased allele-frequency estimation in complex pedigrees   Total citations: 4 (self-citations: 0, citations by others: 4)
McPeek MS  Wu X  Ober C 《Biometrics》2004,60(2):359-367
Many types of genetic analyses depend on estimates of allele frequencies. We consider the problem of allele-frequency estimation based on data from related individuals. The motivation for this work is data collected on the Hutterites, an isolated founder population, so we focus particularly on the case in which the relationships among the sampled individuals are specified by a large, complex pedigree for which maximum likelihood estimation is impractical. For this case, we propose to use the best linear unbiased estimator (BLUE) of allele frequency. We derive this estimator, which is equivalent to the quasi-likelihood estimator for this problem, and we describe an efficient algorithm for computing the estimate and its variance. We show that our estimator has certain desirable small-sample properties in common with the maximum likelihood estimator (MLE) for this problem. We treat both the case when parental origin of each allele is known and when it is unknown. The results are extended to prediction of allele frequency in some set of individuals S based on genotype data collected on a set of individuals R. We compare the mean-squared error of the BLUE, the commonly used naive estimator (sample frequency) and the MLE when the latter is feasible to calculate. The results indicate that although the MLE performs the best of the three, the BLUE is close in performance to the MLE and is substantially easier to calculate, making it particularly useful for large complex pedigrees in which MLE calculation is impractical or infeasible. We apply our method to allele-frequency estimation in a Hutterite data set.

16.
Unbiased estimator for genetic drift and effective population size   Total citations: 2 (self-citations: 0, citations by others: 2)
Jorde PE  Ryman N 《Genetics》2007,177(2):927-935
Amounts of genetic drift and the effective size of populations can be estimated from observed temporal shifts in sample allele frequencies. Bias in this so-called temporal method has been noted in cases of small sample sizes and when allele frequencies are highly skewed. We characterize bias in commonly applied estimators under different sampling plans and propose an alternative estimator for genetic drift and effective size that weights alleles differently. Numerical evaluations of exact probability distributions and computer simulations verify that this new estimator yields unbiased estimates also when based on a modest number of alleles and loci. At the cost of a larger standard deviation, it thus eliminates the bias associated with earlier estimators. The new estimator should be particularly useful for microsatellite loci and panels of SNPs, representing a large number of alleles, many of which will occur at low frequencies.

17.
For multicenter randomized trials or multilevel observational studies, the Cox regression model has long been the primary approach to study the effects of covariates on time-to-event outcomes. A critical assumption of the Cox model is the proportionality of the hazard functions for modeled covariates, violations of which can result in ambiguous interpretations of the hazard ratio estimates. To address this issue, the restricted mean survival time (RMST), defined as the mean survival time up to a fixed time in a target population, has been recommended as a model-free target parameter. In this article, we generalize the RMST regression model to clustered data by directly modeling the RMST as a continuous function of restriction times with covariates while properly accounting for within-cluster correlations to achieve valid inference. The proposed method estimates regression coefficients via weighted generalized estimating equations, coupled with a cluster-robust sandwich variance estimator to achieve asymptotically valid inference with a sufficient number of clusters. In small-sample scenarios where a limited number of clusters are available, however, the proposed sandwich variance estimator can exhibit negative bias in capturing the variability of regression coefficient estimates. To overcome this limitation, we further propose and examine bias-corrected sandwich variance estimators to reduce the negative bias of the cluster-robust sandwich variance estimator. We study the finite-sample operating characteristics of proposed methods through simulations and reanalyze two multicenter randomized trials.
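The RMST definition above, the mean survival time up to a fixed restriction time, equals the area under the survival curve from 0 to that time. The sketch below computes it nonparametrically from a Kaplan-Meier curve for a single sample. It is an illustration of the target parameter only and is not the clustered regression method of the article; the function name, the argument names and the example data are hypothetical.

```python
from collections import defaultdict


def km_rmst(times, events, tau):
    """Area under the Kaplan-Meier survival curve on [0, tau].

    times  : observed follow-up times (event or censoring)
    events : 1 if the event was observed, 0 if censored
    tau    : restriction time

    If the last observation before tau is censored, the survival estimate
    is carried forward to tau (a convention assumed here).
    """
    deaths = defaultdict(int)
    censored = defaultdict(int)
    for t, e in zip(times, events):
        if e:
            deaths[t] += 1
        else:
            censored[t] += 1

    at_risk = len(times)
    surv, last_t, area = 1.0, 0.0, 0.0
    for t in sorted(set(times)):
        if t > tau:
            break
        area += surv * (t - last_t)      # rectangle up to this time point
        last_t = t
        if deaths[t]:
            surv *= 1.0 - deaths[t] / at_risk
        at_risk -= deaths[t] + censored[t]
    area += surv * (tau - last_t)        # final piece up to tau
    return area


# Hypothetical data without censoring: RMST equals the mean of min(T, tau).
print(km_rmst([1, 2, 3, 4], [1, 1, 1, 1], tau=3))
```

Without censoring the result reduces to the sample mean of min(T, tau), which gives a simple check on the implementation.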

18.
Although generally considered environmentally friendly, wind power has been associated with extensive mortality of birds and bats. There is therefore a need for reliable estimates of fatalities at wind farms, where the heterogeneity of the basic information used among environmental assessment studies is unlikely to support an accurate universal estimation method. We tested the applicability of the Stochastic Dynamic Methodology (StDM) to estimate bat fatalities, based on multifactorial cause–effect relationships (by integrating multi-model inference statistical analysis and dynamic modelling) between mortality estimates, detected fatalities and the selected key components of reality, such as the real number of bat mortalities simulated, the rate of carcass removal, the searcher efficiency, the monitoring periodicity and the number of turbines for different realistic scenarios associated with particular wind farm conditions. Although some existing mortality estimators are considered accurate, the choice of a given universal formula for all mortality assessments, based on deterministic parameters and assumptions, may give rise to unsuspected errors. Therefore, we propose a flexible dynamic modelling framework, the StDM estimator, where the obtained algorithms are adaptable to the intended universe of application. The StDM estimator takes into account random, non-constant and scenario-dependent parameters, providing bias-corrected estimates. The StDM estimator was applied to the European wind farm context and validated in most of the cases tested by comparison with independent data. Overall, this approach is considered a valuable tool to improve the quality of mortality estimates at onshore wind facilities, within the local, environmental and methodological gradients (including the cases where no mortality is detected), namely in the scope of environmental impact assessments and general ecological monitoring programmes.

19.
Thall PF  Simon RM  Shen Y 《Biometrics》2000,56(1):213-219
We propose an approximate Bayesian method for comparing an experimental treatment to a control based on a randomized clinical trial with multivariate patient outcomes. Overall treatment effect is characterized by a vector of parameters corresponding to effects on the individual patient outcomes. We partition the parameter space into four sets where, respectively, the experimental treatment is superior to the control, the control is superior to the experimental, the two treatments are equivalent, and the treatment effects are discordant. We compute posterior probabilities of the parameter sets by treating an estimator of the parameter vector like a random variable in the Bayesian paradigm. The approximation may be used in any setting where a consistent, asymptotically normal estimator of the parameter vector is available. The method is illustrated by application to a breast cancer data set consisting of multiple time-to-event outcomes with covariates and to count data arising from a cross-classification of response, infection, and treatment in an acute leukemia trial.

20.
Summary. Expressed sequence tag (EST) sequencing is a one-pass sequencing read of cloned cDNAs derived from a certain tissue. The frequency of unique tags among different unbiased cDNA libraries is used to infer the relative expression level of each tag. In this article, we propose a hierarchical multinomial model with a nonlinear Dirichlet prior for the EST data with multiple libraries and multiple types of tissues. A novel hierarchical prior is developed and the properties of the proposed prior are examined. An efficient Markov chain Monte Carlo algorithm is developed for carrying out the posterior computation. We also propose a new selection criterion for detecting which genes are differentially expressed between two tissue types. Our new method with the new gene selection criterion is demonstrated via several simulations to have low false negative and false positive rates. A real EST data set is used to motivate and illustrate the proposed method.
