Similar Documents
20 similar documents found
1.
Huang Y, Leroux B (2011) Biometrics 67(3): 843–851
Williamson, Datta, and Satten's (2003, Biometrics 59, 36–42) cluster-weighted generalized estimating equations (CWGEEs) are effective in adjusting for bias due to informative cluster sizes for cluster-level covariates. We show that CWGEE may not perform well, however, for covariates that can take different values within a cluster if the numbers of observations at each covariate level are informative. On the other hand, inverse probability of treatment weighting accounts for informative treatment propensity but not for informative cluster size. Motivated by evaluating the effect of a binary exposure in the presence of such types of informativeness, we propose several weighted GEE estimators, with weights related to the size of a cluster as well as the distribution of the binary exposure within the cluster. Choice of the weights depends on the population of interest and the nature of the exposure. Through simulation studies, we demonstrate the superior performance of the new estimators compared to existing estimators such as those from GEE, CWGEE, and inverse-probability-of-treatment-weighted GEE. We illustrate our method with an example examining covariate effects on the risk of dental caries among small children.
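A minimal sketch of the weighting machinery, assuming an independence working correlation and a binary outcome: the estimating equation sum_i sum_j w_ij x_ij (y_ij - mu_ij) = 0 is solved by Newton-Raphson, with CWGEE's w_ij = 1/n_i as the default; the paper's proposed weights, which also involve the within-cluster exposure split, would plug into the same slot. The data below are simulated toys, and a robust (sandwich) variance would be added in practice.

```python
import numpy as np

def weighted_gee_logistic(X, y, cluster, weights=None, n_iter=25, tol=1e-8):
    """Independence-working weighted GEE for a binary outcome.

    Solves sum_i sum_j w_ij x_ij (y_ij - mu_ij) = 0 by Newton-Raphson.
    With w_ij = 1 / n_i this is the CWGEE weighting; other weights
    (e.g. depending on the within-cluster exposure split) plug in the same way.
    """
    if weights is None:  # default: cluster-weighted (1 / cluster size)
        _, inv = np.unique(cluster, return_inverse=True)
        weights = 1.0 / np.bincount(inv)[inv]
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (weights * (y - mu))                    # weighted score
        info = (X * (weights * mu * (1 - mu))[:, None]).T @ X  # weighted information
        step = np.linalg.solve(info, score)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# toy data: 50 clusters of varying size, binary exposure and outcome
rng = np.random.default_rng(0)
sizes = rng.integers(2, 9, size=50)
cluster = np.repeat(np.arange(50), sizes)
x = rng.integers(0, 2, size=sizes.sum())
X = np.column_stack([np.ones_like(x, dtype=float), x])
y = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 0.8 * x))))
print(weighted_gee_logistic(X, y, cluster))   # approx (-0.5, 0.8)
```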

2.
Many late-phase clinical trials recruit subjects at multiple study sites. This introduces a hierarchical structure into the data that can result in a power loss compared to a more homogeneous single-center trial. Building on a recently proposed approach to sample size determination, we suggest a sample size recalculation procedure for multicenter trials with continuous endpoints. The procedure estimates nuisance parameters at interim from noncomparative data and recalculates the sample size required based on these estimates. In contrast to other sample size calculation methods for multicenter trials, our approach assumes a mixed effects model and does not rely on balanced data within centers. It is therefore advantageous, especially for sample size recalculation at interim. We illustrate the proposed methodology with a study evaluating a diabetes management system. Monte Carlo simulations are carried out to evaluate the operating characteristics of the sample size recalculation procedure using comparative as well as noncomparative data, assessing their dependence on parameters such as between-center heterogeneity, residual variance of observations, treatment effect size, and number of centers. We compare two different estimators for between-center heterogeneity, an unadjusted and a bias-adjusted estimator, both based on quadratic forms. The type I error probability as well as the statistical power are close to their nominal levels for all parameter combinations considered in our simulation study for the proposed unadjusted estimator, whereas the adjusted estimator exhibits some type I error rate inflation. Overall, the sample size recalculation procedure can be recommended to mitigate risks arising from misspecified nuisance parameters at the planning stage.
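A sketch of the two steps under simplifying assumptions: the nuisance parameters are estimated from pooled (noncomparative) interim data with the standard one-way ANOVA quadratic forms, and the recalculation uses a normal-approximation formula treating sigma^2 + tau^2 as the variance of an individual response. This is not the paper's exact mixed-model procedure; note also that at interim the pooled within-center variance absorbs part of the treatment effect, which the paper's estimators address.

```python
import numpy as np
from scipy.stats import norm

def interim_nuisance(y, center):
    """Unadjusted quadratic-form estimates of residual variance and
    between-center heterogeneity from pooled (noncomparative) interim data."""
    centers, inv = np.unique(center, return_inverse=True)
    n_c = np.bincount(inv)
    N, k = len(y), len(centers)
    ybar_c = np.bincount(inv, weights=y) / n_c
    msw = np.sum((y - ybar_c[inv]) ** 2) / (N - k)          # residual variance
    msb = np.sum(n_c * (ybar_c - y.mean()) ** 2) / (k - 1)
    n_tilde = (N - np.sum(n_c ** 2) / N) / (k - 1)          # effective group size
    tau2 = max((msb - msw) / n_tilde, 0.0)                  # between-center variance
    return msw, tau2

def recalc_n_per_arm(sigma2, tau2, delta, alpha=0.05, power=0.8):
    """Normal-approximation sample size per arm; a simplification of the
    paper's mixed-model recalculation."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return int(np.ceil(2 * (sigma2 + tau2) * z ** 2 / delta ** 2))

rng = np.random.default_rng(1)
center = np.repeat(np.arange(8), 15)                        # 8 centers, 15 patients each
y = rng.normal(0, 1, 120) + rng.normal(0, 0.5, 8)[center]   # sigma = 1, tau = 0.5
sigma2, tau2 = interim_nuisance(y, center)
print(sigma2, tau2, recalc_n_per_arm(sigma2, tau2, delta=0.5))
```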

3.
This paper develops Bayesian sample size formulae for experiments comparing two groups, where relevant preexperimental information from multiple sources can be incorporated in a robust prior to support both the design and analysis. We use commensurate predictive priors for borrowing of information and further place Gamma mixture priors on the precisions to account for preliminary belief about the pairwise (in)commensurability between parameters that underpin the historical and new experiments. Averaged over the probability space of the new experimental data, appropriate sample sizes are found according to criteria that control certain aspects of the posterior distribution, such as the coverage probability or length of a defined density region. Our Bayesian methodology can be applied to circumstances that compare two normal means, proportions, or event times. When nuisance parameters (such as variance) in the new experiment are unknown, a prior distribution can further be specified based on preexperimental data. Exact solutions are available based on most of the criteria considered for Bayesian sample size determination, while a search procedure is described in cases for which there are no closed-form expressions. We illustrate the application of our sample size formulae in the design of clinical trials, where pretrial information is available to be leveraged. Hypothetical data examples, motivated by a rare-disease trial with an elicited expert prior opinion, and a comprehensive performance evaluation of the proposed methodology are presented.
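For intuition on criterion-based Bayesian sample sizes, a stripped-down case with a closed form: normal data with known variance and a single normal prior (rather than the paper's commensurate mixture priors), sized so that a 95% credible interval is no longer than L. The parameter values are illustrative.

```python
import numpy as np
from scipy.stats import norm

def n_for_credible_length(sigma2, tau2, L, level=0.95):
    """Smallest n such that the posterior credible interval for a normal mean
    has length <= L, under a single N(mu0, tau2) prior and known sampling
    variance sigma2. The posterior variance 1/(1/tau2 + n/sigma2) does not
    depend on the data here, so the length criterion reduces to a closed form."""
    z = norm.ppf(0.5 + level / 2)
    n = sigma2 * (4 * z**2 / L**2 - 1 / tau2)     # from 2 * z * post_sd <= L
    return max(int(np.ceil(n)), 0)

# e.g. sampling variance 4, moderately informative prior, target length 0.5
print(n_for_credible_length(sigma2=4.0, tau2=1.0, L=0.5))   # -> 242
```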

4.
In clinical trials of chronic diseases such as acquired immunodeficiency syndrome, cancer, or cardiovascular diseases, the concept of quality-adjusted lifetime (QAL) has received increasing attention. In this paper, we consider the problem of how covariates affect the mean QAL when the data are subject to right censoring. We allow a very general form for the mean model as a function of covariates. Using the idea of inverse probability weighting, we first construct a simple weighted estimating equation for the parameters in our mean model. We then find the form of the most efficient estimating equation, which yields the most efficient estimator for the regression parameters. Since the most efficient estimator depends on the distribution of the health history processes, and thus cannot be estimated nonparametrically, we consider different approaches for improving the efficiency of the simple weighted estimating equation using observed data. The applicability of these methods is demonstrated by both simulation experiments and a data example from a breast cancer clinical trial study.
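A sketch of the "simple" weighted estimating equation under illustrative assumptions: censoring is independent of everything, the censoring survival function G is estimated by Kaplan-Meier, and uncensored subjects are weighted by 1/G(T_i) in a linear mean model for QAL. The efficient augmented version described in the abstract is not shown.

```python
import numpy as np

def km_censoring(time, event):
    """Kaplan-Meier estimate of the censoring survival function G(t),
    treating censorings (event == 0) as the 'events'."""
    order = np.argsort(time)
    t, d = time[order], 1 - event[order]
    at_risk = len(t) - np.arange(len(t))
    G = np.cumprod(1.0 - d / at_risk)
    return lambda s: np.concatenate([[1.0], G])[np.searchsorted(t, s, side="right")]

def ipw_mean_qal(qal, time, event, Z):
    """Simple IPW estimating equation for E[QAL | Z] = Z beta: weighted least
    squares over uncensored subjects, weight 1 / G(T_i)."""
    G = km_censoring(time, event)
    w = event / np.maximum(G(time), 1e-10)         # censored subjects get weight 0
    return np.linalg.solve(Z.T @ (w[:, None] * Z), Z.T @ (w * qal))

rng = np.random.default_rng(2)
n = 500
z = rng.integers(0, 2, n)                          # binary covariate (e.g. arm)
T = rng.exponential(2 + z)                         # latent survival time
C = rng.exponential(4, n)                          # censoring time
time, event = np.minimum(T, C), (T <= C).astype(float)
qal = 0.8 * np.minimum(T, C)                       # toy quality adjustment;
Z = np.column_stack([np.ones(n), z])               # equals 0.8*T when uncensored
print(ipw_mean_qal(qal, time, event, Z))           # approx (1.6, 0.8)
```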

5.
Marginal structural models for time-fixed treatments fit using inverse-probability weighted estimating equations are increasingly popular. Nonetheless, the resulting effect estimates are subject to finite-sample bias when data are sparse, as is typical of large-sample procedures. Here we propose a semi-Bayes estimation approach which penalizes or shrinks the estimated model parameters to improve finite-sample performance. This approach uses simple symmetric data-augmentation priors. Limited simulation experiments indicate that the proposed approach reduces finite-sample bias and improves confidence-interval coverage when the true values lie within the central “hill” of the prior distribution. We illustrate the approach with data from a nonexperimental study of HIV treatments.
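The paper's data-augmentation priors append prior-encoding pseudo-records to the dataset; the sketch below works in the same spirit but swaps in an L2 (ridge) penalty, the penalized-likelihood form of a mean-zero normal prior on the log odds ratio, on top of a stabilized-IPW marginal structural model. The variable names, data, and prior variance v are illustrative; sklearn's C is used as the prior-variance knob (the intercept is left unpenalized).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 300                                           # deliberately sparse data
L = rng.normal(size=(n, 3))                       # confounders
a = rng.binomial(1, 1 / (1 + np.exp(-L @ [0.6, -0.4, 0.3])))   # treatment
y = rng.binomial(1, 1 / (1 + np.exp(-(-2 + 0.5 * a + L @ [0.5, 0.5, -0.5]))))

# stabilized inverse-probability-of-treatment weights
ps = LogisticRegression().fit(L, a).predict_proba(L)[:, 1]
sw = np.where(a == 1, a.mean() / ps, (1 - a.mean()) / (1 - ps))

# weighted MSM for the marginal effect of a, with an L2 penalty acting as a
# N(0, v) shrinkage prior on the log OR; here sklearn's C plays the role of v
v = 0.5                                           # assumed prior variance
msm = LogisticRegression(C=v).fit(a.reshape(-1, 1), y, sample_weight=sw)
print("shrunk marginal log OR:", msm.coef_[0, 0])
```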

6.
In this paper, we investigate K-group comparisons on survival endpoints for observational studies. In clinical databases for observational studies, treatments for patients are chosen with probabilities that vary depending on their baseline characteristics. This often results in noncomparable treatment groups because of imbalance in the baseline characteristics of patients among treatment groups. In order to overcome this issue, we conduct a propensity analysis and match the subjects with similar propensity scores across treatment groups, or compare weighted group means (or weighted survival curves for censored outcome variables) using inverse probability weighting (IPW). To this end, multinomial logistic regression has been a popular propensity analysis method to estimate the weights. We propose to use the decision tree method as an alternative propensity analysis, owing to its simplicity and robustness. We also propose IPW rank statistics, called the Dunnett-type test and the ANOVA-type test, to compare three or more treatment groups on survival endpoints. Using simulations, we evaluate the finite sample performance of the weighted rank statistics combined with these propensity analysis methods. We demonstrate these methods with a real data example. The IPW method also allows unbiased estimation of population parameters of each treatment group. In this paper, we limit our discussion to survival outcomes, but all the methods can be easily modified for any type of outcome, such as binary or continuous variables.
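A sketch of the propensity-and-weighting pipeline, assuming three groups and a toy data-generating process: a depth-capped DecisionTreeClassifier supplies generalized propensity scores in place of multinomial logistic regression, and the resulting IPW weights feed a weighted Kaplan-Meier estimate. The abstract's Dunnett-type and ANOVA-type rank tests are not reproduced here.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def weighted_km(time, event, w):
    """IPW-weighted Kaplan-Meier curve: S(t) = prod over event times of
    (1 - weighted events / weighted number at risk)."""
    S, curve = 1.0, []
    for t in np.unique(time[event == 1]):
        dw = w[(time == t) & (event == 1)].sum()
        nw = w[time >= t].sum()
        S *= 1 - dw / nw
        curve.append((t, S))
    return curve

rng = np.random.default_rng(4)
n = 600
X = rng.normal(size=(n, 2))                          # baseline covariates
logits = np.column_stack([np.zeros(n), 0.8 * X[:, 0], -0.8 * X[:, 0]])
p = np.exp(logits) / np.exp(logits).sum(1, keepdims=True)
g = (p.cumsum(1) > rng.random((n, 1))).argmax(1)     # covariate-driven assignment
T = rng.exponential(1 + 0.3 * g)                     # survival depends on group
C = rng.exponential(3, n)
time, event = np.minimum(T, C), (T <= C).astype(int)

# decision tree as the propensity model; depth capped for stability
ps = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, g).predict_proba(X)
w = 1.0 / np.maximum(ps[np.arange(n), g], 0.02)      # truncated IPW weights

for k in range(3):
    m = g == k
    s1 = [S for t, S in weighted_km(time[m], event[m], w[m]) if t <= 1.0][-1]
    print(f"group {k}: weighted S(1.0) = {s1:.3f} (true {np.exp(-1/(1+0.3*k)):.3f})")
```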

7.
1. Correlative species distribution models (SDMs) assess relationships between species distribution data and environmental features to evaluate the environmental suitability (ES) of a given area for a species, providing a measure of the probability of presence. If the output of SDMs represents the relationships between habitat features and species performance well, SDM results can also be related to other key population parameters, including reproductive parameters. To test this hypothesis, we evaluated whether SDM results can be used as a proxy for reproductive parameters (breeding output, territory size) in red-backed shrikes (Lanius collurio). 2. The distribution of 726 shrike territories in Northern Italy was obtained through multiple focused surveys; for a subset of pairs, we also measured territory area and number of fledged juveniles. We used Maximum Entropy modelling to build an SDM on the basis of territory distribution. We used generalized least squares and spatial generalized mixed models to relate territory size and number of fledged juveniles to SDM suitability, while controlling for spatial autocorrelation. 3. Species distribution models predicted shrike distribution very well. Territory size was negatively related to suitability estimated through the SDM, while the number of fledglings significantly increased with the suitability of the territory. This was true also when the SDM was built using only spatially and temporally independent data. 4. The results show a clear relationship between ES estimated through presence-only SDMs and two key parameters related to species' reproduction, suggesting that suitability estimated by the SDM and the habitat quality determining reproductive parameters in our model system are correlated. Our study shows the potential of SDMs for inferring important fitness parameters; this information can be of great importance in management and conservation.

8.
The internal pilot study design makes it possible to estimate nuisance parameters required for sample size calculation on the basis of data accumulated in an ongoing trial. In this way, misspecifications made when determining the sample size in the planning phase can be corrected using updated knowledge. According to regulatory guidelines, the blinding of all personnel involved in the trial has to be preserved and the specified type I error rate has to be controlled when the internal pilot study design is applied. Especially in the late phase of drug development, most clinical studies are run in more than one centre. In these multicentre trials, one may have to deal with an unequal distribution of the patient numbers among the centres. Depending on the type of analysis (weighted or unweighted), unequal centre sample sizes may lead to a substantial loss of power. Like the variance, the magnitude of the imbalance is difficult to predict in the planning phase. We propose a blinded sample size recalculation procedure for the internal pilot study design in multicentre trials with a normally distributed outcome and two balanced treatment groups that are analysed with the weighted or the unweighted approach. The method addresses both uncertainty with respect to the variance of the endpoint and the extent of disparity of the centre sample sizes. The actual type I error rate as well as the expected power and sample size of the procedure are investigated in simulation studies. For the weighted as well as the unweighted analysis, the maximal type I error rate was not exceeded, or only minimally so. Furthermore, application of the proposed procedure led to an expected power that achieves the specified value in many cases and is otherwise very close to it.
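Why imbalance matters for the weighted analysis can be shown with two lines of variance algebra: with per-arm centre sizes m_i and residual variance sigma^2, the pooled (unweighted) effect estimator has variance 2*sigma^2 / sum(m_i), while giving every centre equal weight yields (2*sigma^2 / k^2) * sum(1/m_i). A sketch comparing the implied power follows; it illustrates the loss the recalculation procedure guards against, not the blinded procedure itself.

```python
import numpy as np
from scipy.stats import norm

def power_two_arm(var_delta, delta, alpha=0.05):
    """Normal-approximation power of a two-sided test of the effect delta."""
    return norm.cdf(abs(delta) / np.sqrt(var_delta) - norm.ppf(1 - alpha / 2))

def center_imbalance_demo(m, sigma2=1.0, delta=0.4):
    """m[i]: patients per arm in centre i. Returns power of the unweighted
    (pooled) analysis and of the weighted analysis (equal centre weights)."""
    m = np.asarray(m, float)
    k = len(m)
    var_unw = 2 * sigma2 / m.sum()
    var_w = (2 * sigma2 / k**2) * np.sum(1.0 / m)
    return power_two_arm(var_unw, delta), power_two_arm(var_w, delta)

print(center_imbalance_demo([20, 20, 20, 20]))   # balanced: both analyses agree
print(center_imbalance_demo([50, 20, 6, 4]))     # imbalanced: weighted loses power
```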

9.
In observational cohort studies with complex sampling schemes, truncation arises when the time to event of interest is observed only when it falls below or exceeds another random time, that is, the truncation time. In more complex settings, observation may require a particular ordering of event times; we refer to this as sequential truncation. Estimators of the event time distribution have been developed for simple left-truncated or right-truncated data. However, these estimators may be inconsistent under sequential truncation. We propose nonparametric and semiparametric maximum likelihood estimators for the distribution of the event time of interest in the presence of sequential truncation, under two truncation models. We show the equivalence of an inverse probability weighted estimator and a product limit estimator under one of these models. We study the large sample properties of the proposed estimators and derive their asymptotic variance estimators. We evaluate the proposed methods through simulation studies and apply the methods to an Alzheimer's disease study. We have developed an R package, seqTrun, for implementation of our method.
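The paper's sequential-truncation estimators generalize the classical product-limit estimator for simply left-truncated data. As background, a sketch of that building block (the Lynden-Bell estimator), assuming truncation pairs (L_i, T_i) are observed only when T_i >= L_i; the data are simulated, and the seqTrun package mentioned in the abstract is not used here.

```python
import numpy as np

def left_truncated_km(L, T):
    """Product-limit (Lynden-Bell) estimator of P(X > t) from left-truncated
    data: X_i is observed only if X_i >= L_i, and the risk set at t is
    {i : L_i <= t <= T_i}. The paper generalizes this risk-set construction
    to sequentially truncated observation schemes."""
    S, curve = 1.0, []
    for t in np.unique(T):
        d = np.sum(T == t)                        # events at t
        n_risk = np.sum((L <= t) & (T >= t))      # truncation-adjusted risk set
        S *= 1 - d / n_risk
        curve.append((t, S))
    return curve

rng = np.random.default_rng(5)
X = rng.exponential(1.0, 4000)                    # latent event times
L = rng.uniform(0, 2, 4000)                       # truncation times
keep = X >= L                                     # only these pairs are observed
est = left_truncated_km(L[keep], X[keep])
t, S = est[len(est) // 2]
print(f"S({t:.2f}) estimate = {S:.3f}, true = {np.exp(-t):.3f}")
```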

10.
Although a large body of work exists on tests of correlated evolution between two continuous characters, hypotheses such as character displacement are really tests of whether substantial evolutionary change has occurred on a particular branch or branches of the phylogenetic tree. In this study, we present a methodology for testing such a hypothesis using ancestral character state reconstruction and simulation. Furthermore, we suggest how to investigate the robustness of the hypothesis test by varying the reconstruction methods or simulation parameters. As a case study, we tested a hypothesis of character displacement in body size of Caribbean Anolis lizards. We compared squared-change, weighted squared-change, and linear parsimony reconstruction methods; gradual Brownian motion and speciational models of evolution; and several resolution methods for linear parsimony. We used ancestor reconstruction methods to infer the amount of body size evolution, and tested whether evolutionary change in body size was greater on branches of the phylogenetic tree on which a transition from occupying a single-species island to a two-species island occurred. Simulations were used to generate null distributions of reconstructed body size change. The hypothesis of character displacement was tested using Wilcoxon rank-sum tests. When tested against simulated null distributions, all of the reconstruction methods resulted in more significant P-values than when standard statistical tables were used. These results confirm that P-values for tests using ancestor reconstruction methods should be assessed via simulation rather than from standard statistical tables. Linear parsimony can produce an infinite number of most parsimonious reconstructions for continuous characters. We present an example of assessing the robustness of our statistical test by exploring the sample space of possible resolutions. We compare ACCTRAN and DELTRAN resolutions of ambiguous character reconstructions in linear parsimony to the most and least conservative resolutions for our particular hypothesis.

11.
This paper presents an analysis of a longitudinal multi-center clinical trial with missing data. It illustrates the application, the appropriateness, and the limitations of a straightforward ratio estimation procedure for dealing with multivariate situations in which missing data occur at random and with small probability. The parameter estimates are computed via matrix operators such as those used for the generalized least squares analysis of categorical data. Thus, the estimates may be conveniently analyzed by asymptotic regression methods within the same computer program that computes the estimates, provided that the sample size is sufficiently large.

12.
Various models have been developed for modeling the distribution of chromosome aberrations in the literature (e.g., Consul, 1989). The generalized Poisson distribution is among the most popular. The parameters of this distribution provide a meaningful interpretation of the induction mechanism of chromosome aberrations. In this article, we apply several estimation methods to estimate the generalized Poisson parameters for fitting the number of chromosome aberrations under different radiation doses. The methods compared are moment, maximum likelihood, minimum chi-square, weighted discrepancy, and empirically weighted rates of change. Our study suggests that the empirically weighted rates of change method yields the smallest mean square error and mean absolute error for most radiation doses. The data used for this comparison are from Janardan and Schaeffer (1977).
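The paper's preferred "empirically weighted rates of change" estimator is not reproduced here, but the moment and maximum-likelihood baselines are easy to sketch under the standard generalized Poisson pmf P(X = x) = theta * (theta + x*lambda)^(x-1) * exp(-theta - x*lambda) / x!, whose mean is theta/(1-lambda) and variance theta/(1-lambda)^3. The toy data are simulated.

```python
import numpy as np
from math import lgamma
from scipy.optimize import minimize

def gp_moment_estimates(x):
    """Method of moments: lambda = 1 - sqrt(mean/var), theta = mean*(1-lambda)."""
    m, v = np.mean(x), np.var(x, ddof=1)
    lam = 1 - np.sqrt(m / v)
    return m * (1 - lam), lam

def gp_negloglik(params, x):
    """Negative log-likelihood of the generalized Poisson distribution."""
    theta, lam = params
    mu = theta + lam * x
    if theta <= 0 or lam >= 1 or np.any(mu <= 0):
        return np.inf
    lfact = np.array([lgamma(k + 1) for k in x])
    return -np.sum(np.log(theta) + (x - 1) * np.log(mu) - mu - lfact)

rng = np.random.default_rng(6)
x = rng.poisson(2.0, 500)                 # toy counts; true lambda = 0 (pure Poisson)
theta0, lam0 = gp_moment_estimates(x)
fit = minimize(gp_negloglik, [theta0, max(lam0, 0.0)], args=(x,),
               method="Nelder-Mead")
print("moments:", (round(theta0, 3), round(lam0, 3)), "MLE:", fit.x.round(3))
```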

13.
The fitness of animals subjected to natural selection can be defined as the probability of surviving selection for a given interval of time, or some convenient multiple of this probability. If the fitness of animals is related to some quantitative variable X (such as size), then this relationship is expressed mathematically in the fitness function w(x), and this function can be estimated by comparing the distribution of X in samples taken before and after selection. In this note, five methods for estimating the fitness function on the basis of samples from a large population are discussed. They are compared on three previously published data sets, and as a result estimation by weighted multiple regression is recommended.
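Among routes to w(x), a density-ratio estimate is the quickest to sketch: since f_after(x) = w(x) * f_before(x) / w-bar, the fitness function is the ratio of trait densities after versus before selection, scaled by the mean fitness (the overall survival fraction). This is one simple nonparametric option, not the weighted-multiple-regression method the note recommends; the selection function in the toy is illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

def fitness_function(before, after, surv_frac, grid):
    """w(x) = surv_frac * f_after(x) / f_before(x), with kernel density
    estimates of the trait distribution before and after selection."""
    f0 = gaussian_kde(before)(grid)
    f1 = gaussian_kde(after)(grid)
    return surv_frac * f1 / np.maximum(f0, 1e-12)

rng = np.random.default_rng(7)
x0 = rng.normal(0, 1, 2000)                       # trait before selection
true_w = 1 / (1 + np.exp(-1.5 * x0))              # directional selection on x
x1 = x0[rng.random(2000) < true_w]                # survivors after selection
grid = np.linspace(-2, 2, 5)
w_hat = fitness_function(x0, x1, len(x1) / len(x0), grid)
print(np.round(w_hat, 2))                          # compare with:
print(np.round(1 / (1 + np.exp(-1.5 * grid)), 2))  # the true fitness function
```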

14.
Stochastic hydrological processes are shaped jointly by random and deterministic factors: their time series contain not only a purely random component that reflects hereditary behavior, but also deterministic jump, trend, and periodic components and a stochastic dependence component that reflect variability, so that stochastic hydrological processes exhibit complex patterns of change and evolution. To build a unified understanding of these patterns, this paper describes the hereditary and variational properties of nonstationary hydrological series from the two perspectives of stochastic process simulation and time series analysis, compares approaches to nonstationary hydrological frequency analysis, and outlines the main open problems in nonstationarity research. On this basis, and borrowing the concept of the gene from biology, the paper defines the "hydrological gene" and describes its construction and expression using conventional moments, weight-function moments, probability-weighted moments, and L-moments; it further defines the jump, trend, periodic, dependence, and purely random components as the five "hydrological bases" that make up a hydrological gene. Considering both the hereditary and the variational components of nonstationary hydrological series, the paper expounds the principles of their heredity, variation, and evolution, so as to reveal how the probability distributions of hydrological variables are inherited, vary, and evolve.
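Of the moment systems the abstract lists for building "hydrological genes", the L-moments are the easiest to sketch: a minimal implementation of Hosking's sample probability-weighted moments b_r and the first four L-moments, applied to a toy annual-maximum series (the Gumbel parameters are illustrative).

```python
import numpy as np

def sample_l_moments(x):
    """First four sample L-moments plus L-skewness (t3) and L-kurtosis (t4),
    via the unbiased probability-weighted moments b_r."""
    x = np.sort(np.asarray(x, float))
    n = len(x)
    j = np.arange(1, n + 1)
    b = [x.mean()]
    for r in (1, 2, 3):
        coef = np.ones(n)
        for s in range(1, r + 1):                 # (j-1)...(j-r) / (n-1)...(n-r)
            coef *= (j - s) / (n - s)
        b.append(np.mean(coef * x))
    l1, l2 = b[0], 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]
    return l1, l2, l3 / l2, l4 / l2               # mean, L-scale, t3, t4

rng = np.random.default_rng(8)
flows = rng.gumbel(100, 25, size=60)              # toy annual-maximum series
print(np.round(sample_l_moments(flows), 3))       # Gumbel: t3 ~ 0.17, t4 ~ 0.15
```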

15.
Chen GB, Xu Y, Xu HM, Li MD, Zhu J, Lou XY (2011) PLoS ONE 6(2): e16981
Detection of interacting risk factors for complex traits is challenging. The choice of an appropriate method, sample size, and allocation of cases and controls are serious concerns. To provide empirical guidelines for planning such studies and data analyses, we investigated the performance of the multifactor dimensionality reduction (MDR) and generalized MDR (GMDR) methods under various experimental scenarios. We developed the mathematical expectation of accuracy and used it as an indicator parameter to perform a gene-gene interaction study. We then examined the statistical power of GMDR and MDR within the plausible range of accuracy (0.50–0.65) reported in the literature. The GMDR with covariate adjustment had a power of >80% in a case-control design with a sample size of ≥2000, with theoretical accuracy ranging from 0.56 to 0.62. However, when the accuracy was <0.56, a sample size of ≥4000 was required to have sufficient power. In our simulations, the GMDR outperformed the MDR under all models with accuracy ranging from 0.56 to 0.62 for a sample size of 1000–2000. However, the two methods performed similarly when the accuracy was outside this range or the sample was significantly larger. We conclude that, with adjustment for a covariate, GMDR performs better than MDR, and a sample size of 1000–2000 is reasonably large for detecting gene-gene interactions in the range of effect size reported by the current literature, whereas a larger sample size is required for more subtle interactions with accuracy <0.56.

16.
Clustering of dwell times in data from single-channel recordings, in excess of the value predicted from the probability density function (pdf) alone, places restrictions on modeling schemes. Two methods, (a) the probability density function of the running median for groups of any size of sequential dwell times, and (b) the distribution of cumulative probabilities associated with dwell times separated by any lag (the second cumulative probability distribution), are proposed as alternative representations of single-channel data; these methods are suitable for the detection of such clusters or modes. Simulation of three models with and without modes is done to test the efficacy of these methods. It is found that they often yield a better estimate of moding parameters than the methods of running mean pdf and autocorrelation.
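Simple sketches of the two proposed representations, assuming a dwell-time sequence that switches slowly between two exponential modes; the window size g and the lag are illustrative choices.

```python
import numpy as np

def running_median(dwells, g):
    """Median of each window of g consecutive dwell times; its empirical
    distribution separates modes more sharply than the raw dwell-time pdf."""
    windows = np.lib.stride_tricks.sliding_window_view(np.asarray(dwells, float), g)
    return np.median(windows, axis=1)

def second_cumulative(dwells, lag):
    """Pairs (F(t_i), F(t_{i+lag})) of empirical cumulative probabilities for
    dwell times 'lag' apart; clustering away from uniformity indicates modes."""
    d = np.asarray(dwells, float)
    F = (np.argsort(np.argsort(d)) + 1) / (len(d) + 1)   # empirical CDF ranks
    return np.column_stack([F[:-lag], F[lag:]])

# toy two-mode channel: runs of short and long dwell times
rng = np.random.default_rng(9)
modes = np.repeat(rng.integers(0, 2, 40), 25)            # sojourns in each mode
dwells = rng.exponential(np.where(modes == 0, 1.0, 5.0))
rm = running_median(dwells, g=7)
corr = np.corrcoef(second_cumulative(dwells, lag=1).T)[0, 1]
print(f"running-median range: {rm.min():.2f}-{rm.max():.2f}, lag-1 rank corr: {corr:.2f}")
```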

17.
We are concerned with calculating the sample size required for estimating the mean of the continuous distribution in the context of a two-component nonstandard mixture distribution (i.e., a mixture of an identifiable point degenerate function F at a constant with probability P and a continuous distribution G with probability 1 – P). A common ad hoc procedure of escalating the naïve sample size n (calculated under the assumption of no point degenerate function F) by a factor of 1/(1 – P) has about 0.5 probability of achieving the pre-specified statistical power. Such an ad hoc approach may seriously underestimate the necessary sample size and jeopardize inferences in scientific investigations. We argue that sample size calculations in this context should have a pre-specified probability of power ≥1 – β set by the researcher at a level greater than 0.5. To that end, we propose an exact method and an approximate method to calculate sample size in this context so that the pre-specified probability of achieving a desired statistical power is determined by the researcher.
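The counting argument behind the 0.5-probability claim is easy to make concrete: the number of informative (continuous-component) observations in a sample of N is Binomial(N, 1 – P), and the ad hoc N = n/(1 – P) puts the target n right at the binomial mean. A sketch of an exact search in that spirit, assuming the naïve size n for the continuous part is already known; the paper couples this with the power calculation itself.

```python
from math import ceil
from scipy.stats import binom

def exact_mixture_n(n_cont, P, assurance=0.9):
    """Smallest total N so that, with each draw degenerate w.p. P, at least
    n_cont observations come from the continuous component with probability
    >= assurance. The ad hoc rule N = n_cont / (1 - P) achieves only ~0.5."""
    N = ceil(n_cont / (1 - P))                     # start at the ad hoc value
    while binom.sf(n_cont - 1, N, 1 - P) < assurance:
        N += 1
    return N

n_cont, P = 100, 0.3
ad_hoc = ceil(n_cont / (1 - P))
print("ad hoc N:", ad_hoc, "-> prob:", round(binom.sf(n_cont - 1, ad_hoc, 1 - P), 3))
print("exact N (90% assurance):", exact_mixture_n(n_cont, P))
```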

18.
Allele frequencies after a bottleneck
The effect that a recent change in population size (a “bottleneck”) has on the genetic composition of a random sample of genes is studied. The population is assumed to evolve as in the Wright-Fisher model with infinitely many neutral alleles. Simple analytic formulas are found for such quantities as the probability distribution and moments of the total number of alleles, the allelic “frequency spectrum,” and the homozygosity, in the sample. Numerical examples are given which compare these results with those obtained previously by a variety of other methods.
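A quick numerical companion to the theory, assuming a haploid Wright-Fisher population with infinitely many neutral alleles and θ = 2Nu = 1; population sizes and bottleneck depth are illustrative, and the paper's analytic formulas are what such simulated counts would be checked against.

```python
import numpy as np

def wright_fisher_alleles(N_by_gen, u, sample_n, rng):
    """Wright-Fisher, infinitely-many-neutral-alleles: each gene copy picks a
    random parent; with probability u it mutates to a brand-new allele label.
    Returns allele count and homozygosity in a final sample of sample_n genes."""
    pop = np.zeros(N_by_gen[0], dtype=np.int64)   # start monomorphic
    next_label = 1
    for N in N_by_gen[1:]:
        pop = pop[rng.integers(0, len(pop), N)]   # drift (random parents)
        mut = rng.random(N) < u                   # neutral mutations
        pop[mut] = next_label + np.arange(mut.sum())
        next_label += int(mut.sum())
    counts = np.bincount(pop[rng.integers(0, len(pop), sample_n)])
    counts = counts[counts > 0]
    return len(counts), float(np.sum((counts / sample_n) ** 2))

rng = np.random.default_rng(10)
N0, u, reps = 500, 1e-3, 10                       # theta = 2 * N0 * u = 1
burn = [N0] * 3000                                # burn-in to near-stationarity
flat = burn + [N0] * 50
bott = burn + [50] * 30 + [N0] * 20               # recent crash and recovery
for name, sched in [("constant", flat), ("bottleneck", bott)]:
    res = np.array([wright_fisher_alleles(sched, u, 50, rng) for _ in range(reps)])
    print(f"{name}: mean alleles = {res[:, 0].mean():.1f}, "
          f"mean homozygosity = {res[:, 1].mean():.3f}")
```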

19.
Outcome-dependent sampling (ODS) schemes can be a cost-effective way to enhance study efficiency. The case-control design has been widely used in epidemiologic studies. However, when the outcome is measured on a continuous scale, dichotomizing the outcome could lead to a loss of efficiency. Recent epidemiologic studies have used ODS sampling schemes where, in addition to an overall random sample, there are also a number of supplemental samples that are collected based on a continuous outcome variable. We consider a semiparametric empirical likelihood inference procedure in which the underlying distribution of covariates is treated as a nuisance parameter and is left unspecified. The proposed estimator has asymptotic normality properties. The likelihood ratio statistic using the semiparametric empirical likelihood function has Wilks-type properties in that, under the null, it follows a chi-square distribution asymptotically and is independent of the nuisance parameters. Our simulation results indicate that, for data obtained using an ODS design, the semiparametric empirical likelihood estimator is more efficient than conditional likelihood and probability weighted pseudolikelihood estimators and that ODS designs (along with the proposed estimator) can produce more efficient estimates than simple random sample designs of the same size. We apply the proposed method to analyze a data set from the Collaborative Perinatal Project (CPP), an ongoing environmental epidemiologic study, to assess the relationship between maternal polychlorinated biphenyl (PCB) level and children's IQ test performance.
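A sketch contrasting the probability-weighted (Horvitz-Thompson) pseudolikelihood, the comparator mentioned in the abstract, with a naïve unweighted fit on an ODS sample; the paper's semiparametric empirical likelihood estimator is more efficient than both but is not reproduced here. The design probabilities and cutoffs are illustrative.

```python
import numpy as np

rng = np.random.default_rng(12)
N = 20000                                         # underlying cohort
x = rng.normal(size=N)
y = 1.0 + 0.5 * x + rng.normal(size=N)            # true intercept 1, slope 0.5

# ODS design: an overall random sample plus supplemental samples from y's tails
cut_lo, cut_hi = np.quantile(y, [0.1, 0.9])
tail = (y < cut_lo) | (y > cut_hi)
pi = np.where(tail, 0.32, 0.02)                   # inclusion probs, known by design
keep = rng.random(N) < pi

# probability-weighted least squares on the ODS sample (weight = 1 / pi)
w = 1.0 / pi[keep]
X = np.column_stack([np.ones(keep.sum()), x[keep]])
beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y[keep]))
print("weighted fit:  ", beta.round(3))           # ~ (1.0, 0.5)

# naive unweighted fit on the same tail-enriched sample, for contrast
beta_naive = np.linalg.solve(X.T @ X, X.T @ y[keep])
print("unweighted fit:", beta_naive.round(3))     # slope biased away from 0.5
```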

20.
Features of the Correlation Structure of Price Indices
What are the features of the correlation structure of price indices? To answer this question, 5 types of price indices, comprising 195 specific price indices from 2003 to 2011, were selected as sample data. To build a weighted network of price indices, each price index is represented by a vertex, and a positive correlation between two price indices is represented by an edge. We studied the features of the weighted network structure by applying economic theory to the analysis of complex network parameters. By counting the weighted degrees of the nodes, we found that the frequency of the price indices follows a normal distribution, and we identified the price indices that have an important impact on the network's structure. We identified small groups in the weighted network using the k-core and k-plex methods. We discovered structural holes in the network by calculating the hierarchy of the nodes. Finally, we found that the price indices weighted network has a small-world effect by calculating the shortest paths. These results provide a scientific basis for macroeconomic control policies.
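The construction is straightforward to reproduce on synthetic data: vertices are indices, edges are positive correlations weighted by their magnitude, and weighted degree and k-core expose hubs and small groups (networkx has k-core built in; k-plex is similar in spirit but not included). The index names and factor structure below are made up.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(11)
T, k = 108, 12                                     # 9 years of monthly data
common = rng.normal(size=T)                        # shared macro factor
series = 0.8 * common[:, None] + rng.normal(size=(T, k))
names = [f"idx{i:02d}" for i in range(k)]          # hypothetical index names

corr = np.corrcoef(series.T)
G = nx.Graph()
G.add_nodes_from(names)
for i in range(k):                                 # vertex = index; edge = positive
    for j in range(i + 1, k):                      # correlation, weighted by r
        if corr[i, j] > 0:
            G.add_edge(names[i], names[j], weight=corr[i, j])

wdeg = dict(G.degree(weight="weight"))             # weighted degree of each node
hub = max(wdeg, key=wdeg.get)
strong = nx.Graph([(u, v) for u, v, d in G.edges(data=True) if d["weight"] > 0.3])
print("highest weighted degree:", hub, round(wdeg[hub], 2))
print("2-core at r > 0.3:", sorted(nx.k_core(strong, 2).nodes))
```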
