Similar Literature
20 similar documents retrieved.
1.
Stepped wedge designed trials are a type of cluster-randomized study in which the intervention is introduced to each cluster in a random order over time. This design is often used to assess the effect of a new intervention as it is rolled out across a series of clinics or communities. Based on a permutation argument, we derive a closed-form expression for an estimate of the intervention effect, along with its standard error, for a stepped wedge design trial. We show that these estimates are robust to misspecification of both the mean and covariance structure of the underlying data-generating mechanism, thereby providing a robust approach to inference for the intervention effect in stepped wedge designs. We use simulations to evaluate the type I error and power of the proposed estimate and to compare the performance of the proposed estimate to the optimal estimate when the correct model specification is known. The limitations, possible extensions, and open problems regarding the method are discussed.
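The closed-form estimator itself is not reproduced in the abstract; as a rough illustration of the permutation reasoning behind it, the following Python sketch (hypothetical data-generating model, cluster counts, and test statistic, not the authors' estimator) re-randomizes the order in which clusters cross over to the intervention and compares a simple period-adjusted difference in means against its permutation distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stepped wedge layout: 6 clusters, 7 periods, one cluster
# crossing over to the intervention at each of periods 2..7.
n_clusters, n_periods = 6, 7
crossover = rng.permutation(np.arange(1, n_periods))   # random rollout order

def simulate(effect=0.5):
    """Cluster random intercepts + secular trend + intervention effect."""
    u = rng.normal(0, 1, n_clusters)                    # cluster effects
    trend = np.linspace(0, 0.3, n_periods)              # period effects
    treat = np.zeros((n_clusters, n_periods))
    for c in range(n_clusters):
        treat[c, crossover[c]:] = 1                     # treated after crossover
    y = (u[:, None] + trend[None, :] + effect * treat
         + rng.normal(0, 1, (n_clusters, n_periods)))
    return y, treat

def stat(y, treat):
    """Crude intervention-effect statistic: period-adjusted mean difference."""
    resid = y - y.mean(axis=0, keepdims=True)
    return resid[treat == 1].mean() - resid[treat == 0].mean()

y, treat = simulate()
obs = stat(y, treat)

# Permutation distribution: re-randomize which cluster crosses over when.
perm_stats = []
for _ in range(2000):
    order = rng.permutation(crossover)
    t_perm = np.zeros_like(treat)
    for c in range(n_clusters):
        t_perm[c, order[c]:] = 1
    perm_stats.append(stat(y, t_perm))

p_value = np.mean(np.abs(perm_stats) >= abs(obs))
print(f"observed statistic = {obs:.3f}, permutation p = {p_value:.3f}")
```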

2.
Restriction-site associated DNA sequencing (RADSeq) facilitates rapid generation of thousands of genetic markers at relatively low cost; however, several sources of error specific to RADSeq methods often lead to biased estimates of allele frequencies and thereby to erroneous population genetic inference. Estimating the distribution of sample allele frequencies without calling genotypes was shown to improve population inference from whole genome sequencing (WGS) data, but the ability of this approach to account for RADSeq-specific biases remains unexplored. Here we assess to what extent genotype-free methods of allele frequency estimation affect demographic inference from empirical RADSeq data. Using the well-studied pied flycatcher (Ficedula hypoleuca) as a study system, we compare allele frequency estimation and demographic inference from whole genome sequencing data with that from RADSeq data matched for samples, using both genotype-based and genotype-free methods. The demographic history of pied flycatchers as inferred from RADSeq data was highly congruent with that inferred from WGS data when allele frequencies were estimated directly from the read data. In contrast, when allele frequencies were derived from called genotypes, RADSeq-based estimates of most model parameters fell outside the 95% confidence interval of estimates derived from WGS data. Notably, more stringent filtering of the genotype calls tended to increase the discrepancy between parameter estimates from WGS and RADSeq data. The results from this study demonstrate the ability of genotype-free methods to improve allele frequency spectrum (AFS)-based demographic inference from empirical RADSeq data and highlight the need to account for uncertainty in NGS data regardless of sequencing method.

3.
Bayesian inference in ecology
Bayesian inference is an important statistical tool that is increasingly being used by ecologists. In a Bayesian analysis, information available before a study is conducted is summarized in a quantitative model or hypothesis: the prior probability distribution. Bayes' Theorem uses the prior probability distribution and the likelihood of the data to generate a posterior probability distribution. Posterior probability distributions are an epistemological alternative to P-values and provide a direct measure of the degree of belief that can be placed on models, hypotheses, or parameter estimates. Moreover, Bayesian information-theoretic methods provide robust measures of the probability of alternative models, and multiple models can be averaged into a single model that reflects uncertainty in model construction and selection. These methods are demonstrated through a simple worked example. Ecologists are using Bayesian inference in studies that range from predicting single-species population dynamics to understanding ecosystem processes. Not all ecologists, however, appreciate the philosophical underpinnings of Bayesian inference. In particular, Bayesians and frequentists differ in their definition of probability and in their treatment of model parameters as random variables or estimates of true values. These assumptions must be addressed explicitly before deciding whether or not to use Bayesian methods to analyse ecological data.
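To make the prior-to-posterior updating concrete, here is a minimal Python sketch of a conjugate beta-binomial analysis for a hypothetical species-occupancy question; the prior parameters, detection counts, and threshold are invented for illustration and are not the paper's worked example.

```python
from scipy import stats

# Hypothetical example: estimating the occupancy probability of a species.
# Prior knowledge (e.g. from earlier surveys) is encoded as a Beta(2, 8)
# distribution, i.e. a prior mean occupancy of about 0.2.
a_prior, b_prior = 2, 8

# New data: the species is detected at 12 of 30 surveyed sites.
detections, sites = 12, 30

# With a conjugate beta prior, Bayes' Theorem gives the posterior in closed form.
posterior = stats.beta(a_prior + detections, b_prior + (sites - detections))

print(f"posterior mean occupancy = {posterior.mean():.3f}")
lo, hi = posterior.ppf([0.025, 0.975])
print(f"95% credible interval = ({lo:.3f}, {hi:.3f})")

# A direct probability statement of the kind contrasted with P-values above:
print(f"P(occupancy > 0.3 | data) = {1 - posterior.cdf(0.3):.3f}")
```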

4.
Study designs where data have been aggregated by geographical areas are popular in environmental epidemiology. These studies are commonly based on administrative databases and, because they provide complete spatial coverage, are particularly appealing for drawing inference on the entire population. However, the resulting estimates are often biased and difficult to interpret due to unmeasured confounders, which typically are not available from routinely collected data. We propose a framework to improve inference drawn from such studies by exploiting information derived from individual-level survey data. The latter are summarized in an area-level scalar score by mimicking at the ecological level the well-known propensity score methodology. The literature on propensity scores for confounding adjustment is mainly based on individual-level studies and assumes a binary exposure variable. Here, we generalize its use to cope with area-referenced studies characterized by a continuous exposure. Our approach is based upon Bayesian hierarchical structures specified in a two-stage design: (i) geolocated individual-level data from survey samples are up-scaled to the ecological level, and the latter are used to estimate a generalized ecological propensity score (EPS) in the in-sample areas; (ii) the generalized EPS is imputed in the out-of-sample areas under different assumptions about the missingness mechanisms, and then it is included in the ecological regression linking the exposure of interest to the health outcome. This delivers area-level risk estimates, which allow a fuller adjustment for confounding than traditional areal studies. The methodology is illustrated using simulations and a case study investigating the risk of lung cancer mortality associated with nitrogen dioxide in England (UK).

5.
Generalized linear model analyses of repeated measurements typically rely on simplifying mathematical models of the error covariance structure for testing the significance of differences in patterns of change across time. The robustness of the tests of significance depends not only on the degree of agreement between the specified mathematical model and the actual population data structure, but also on the precision and robustness of the computational criteria for fitting the specified covariance structure to the data. Generalized estimating equation (GEE) solutions utilizing the robust empirical sandwich estimator for modeling of the error structure were compared with general linear mixed model (GLMM) solutions that utilized the commonly employed restricted maximum likelihood (REML) procedure. Under the conditions considered, the GEE and GLMM procedures were identical in assuming that the data are normally distributed and that the variance-covariance structure of the data is the one specified by the user. The question addressed in this article concerns the relative sensitivity of tests of significance for treatment effects to varying degrees of misspecification of the error covariance structure model when fitted by the alternative procedures. Simulated data that were subjected to Monte Carlo evaluation of actual Type I error and power of tests of the equal slopes hypothesis conformed to the assumptions of ordinary linear model ANOVA for repeated measures, except for autoregressive covariance structures and missing data due to dropouts. The actual within-groups correlation structures of the simulated repeated measurements ranged from AR(1) to compound symmetry in graded steps, whereas the GEE and GLMM formulations restricted the respective error structure models to be either AR(1), compound symmetry (CS), or unstructured (UN). The GEE-based tests utilizing empirical sandwich estimator criteria were documented to be relatively insensitive to misspecification of the covariance structure models, whereas GLMM tests which relied on restricted maximum likelihood (REML) were highly sensitive to relatively modest misspecification of the error correlation structure, even though normality, variance homogeneity, and linearity were not an issue in the simulated data. Goodness-of-fit statistics were of little utility in identifying cases in which relatively minor misspecification of the GLMM error structure model resulted in inadequate alpha protection for tests of the equal slopes hypothesis. Both GEE and GLMM formulations that relied on unstructured (UN) error model specification produced nonconservative results regardless of the actual correlation structure of the repeated measurements. A random coefficients model produced robust tests with competitive power across all conditions examined.
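The comparison described above rests on the GEE sandwich estimator remaining valid under a wrong working correlation. The following Python sketch, using statsmodels, fits a GEE with a deliberately misspecified exchangeable working correlation to simulated AR(1) repeated measures; the data-generating model and variable names are hypothetical, and the GLMM/REML side of the comparison is not reproduced here.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Simulate repeated measures with a true AR(1) within-subject error structure
# and a group difference in slopes (so the "equal slopes" hypothesis is false).
n_subj, n_time, rho = 60, 5, 0.7
group = np.repeat([0, 1], n_subj // 2)
rows = []
for i in range(n_subj):
    e = np.zeros(n_time)
    e[0] = rng.normal()
    for t in range(1, n_time):
        e[t] = rho * e[t - 1] + rng.normal(0, np.sqrt(1 - rho ** 2))
    for t in range(n_time):
        rows.append({"id": i, "time": t, "group": group[i],
                     "y": 1.0 + 0.2 * t + 0.3 * group[i] * t + e[t]})
df = pd.DataFrame(rows)

# GEE with a deliberately misspecified (exchangeable) working correlation;
# the empirical sandwich standard errors reported by default remain valid.
model = smf.gee("y ~ time * group", groups="id", data=df,
                cov_struct=sm.cov_struct.Exchangeable(),
                family=sm.families.Gaussian())
result = model.fit()
print(result.summary().tables[1])   # inspect the time:group (slope) contrast
```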

6.
Valid inference in random effects meta-analysis
The standard approach to inference for random effects meta-analysis relies on approximating the null distribution of a test statistic by a standard normal distribution. This approximation is asymptotic in k, the number of studies, and can be substantially in error in medical meta-analyses, which often have only a few studies. This paper proposes permutation and ad hoc methods for testing with the random effects model. Under the group permutation method, we randomly switch the treatment and control group labels in each trial. This idea is similar to using a permutation distribution for a community intervention trial where communities are randomized in pairs. The permutation method theoretically controls the type I error rate for typical meta-analysis scenarios. We also suggest two ad hoc procedures. Our first suggestion is to use a t-reference distribution with k-1 degrees of freedom rather than a standard normal distribution for the usual random effects test statistic. We also investigate the use of a simple t-statistic on the reported treatment effects.
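Because switching the treatment and control labels within a trial flips the sign of that trial's effect estimate, the group permutation method can be sketched as a sign-flip test on study-level effects. The Python sketch below uses made-up effect sizes and a DerSimonian-Laird z statistic as the test statistic; it illustrates the idea rather than reproducing the paper's implementation.

```python
import itertools
import numpy as np

# Hypothetical meta-analysis: estimated treatment effects (e.g. log odds
# ratios) and within-study variances from k = 7 trials.
effects = np.array([0.42, 0.10, 0.55, -0.05, 0.31, 0.60, 0.18])
variances = np.array([0.04, 0.09, 0.06, 0.12, 0.05, 0.10, 0.08])

def dl_z(y, v):
    """Usual DerSimonian-Laird random-effects z statistic."""
    w = 1.0 / v
    mu_fe = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - mu_fe) ** 2)
    tau2 = max(0.0, (q - (len(y) - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
    w_star = 1.0 / (v + tau2)
    mu = np.sum(w_star * y) / np.sum(w_star)
    return mu / np.sqrt(1.0 / np.sum(w_star))

obs = dl_z(effects, variances)

# Group permutation: relabelling treatment and control within a trial flips
# the sign of its effect estimate, so enumerate all 2^k sign patterns.
signs = itertools.product([-1.0, 1.0], repeat=len(effects))
perm = np.array([dl_z(np.array(s) * effects, variances) for s in signs])
p_value = np.mean(np.abs(perm) >= abs(obs))
print(f"z = {obs:.3f}, group-permutation p = {p_value:.4f}")
```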

7.
Cheung YK. Biometrics 2005; 61(2): 524-531.
When comparing follow-up measurements from two independent populations, missing records may arise due to censoring by events whose occurrence is associated with baseline covariates. In these situations, inferences based only on the completely followed observations may be biased if the follow-up measurements and the covariates are correlated. This article describes exact inference for a class of modified U-statistics under covariate-dependent dropouts. The method involves weighting each permutation according to the retention probabilities, and thus requires estimation of the missing data mechanism. The proposed procedure is nonparametric in that no distributional assumption is necessary for the outcome variables and the missingness patterns. Monte Carlo approximation by the Gibbs sampler is proposed, and is shown to be fast and accurate via simulation. The method is illustrated in two small data sets for which asymptotic inferential procedures may not be appropriate.

8.
Multivariate meta-analysis is gaining prominence in evidence synthesis research because it enables simultaneous synthesis of multiple correlated outcome data, and random-effects models have generally been used for addressing between-study heterogeneity. However, coverage probabilities of confidence regions or intervals for standard inference methods for random-effects models (e.g., restricted maximum likelihood estimation) cannot retain their nominal confidence levels in general, especially when the number of synthesized studies is small, because their validity depends on large sample approximations. In this article, we provide permutation-based inference methods that enable exact joint inferences for average outcome measures without large sample approximations. We also provide accurate marginal inference methods under general settings of multivariate meta-analyses. We propose effective approaches for permutation inferences using optimal weighting based on the efficient score statistic. The effectiveness of the proposed methods is illustrated via applications to bivariate meta-analyses of diagnostic accuracy studies for airway eosinophilia in asthma and a network meta-analysis for antihypertensive drugs on incident diabetes, as well as through simulation experiments. In numerical evaluations performed via simulations, our methods generally provided accurate confidence regions or intervals under a broad range of settings, whereas the current standard inference methods exhibited serious undercoverage properties.

9.
We investigate rank-based studentized permutation methods for the nonparametric Behrens–Fisher problem, that is, inference methods for the area under the ROC curve. We prove that the studentized permutation distribution of the Brunner–Munzel rank statistic is asymptotically standard normal, even under the alternative, thereby providing the hitherto missing theoretical foundation for the Neubert and Brunner studentized permutation test. In particular, we not only show its consistency, but also that confidence intervals for the underlying treatment effects can be computed by inverting this permutation test. In addition, we derive permutation-based range-preserving confidence intervals. Extensive simulation studies show that the permutation-based confidence intervals appear to maintain the preassigned coverage probability quite accurately (even for rather small sample sizes). For a convenient application of the proposed methods, a freely available software package for the statistical software R has been developed. A real data example illustrates the application.
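As a rough illustration of a studentized permutation approach of this kind (not the authors' R package), the following Python sketch permutes the pooled sample and recomputes the Brunner-Munzel statistic available in scipy; the data are simulated and purely hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical two-sample data with unequal variances (Behrens-Fisher setting).
x = rng.normal(0.0, 1.0, size=15)
y = rng.normal(0.5, 2.0, size=20)

t_obs = stats.brunnermunzel(x, y).statistic

# Studentized permutation: permute the pooled sample and recompute the
# studentized Brunner-Munzel statistic for each rearrangement.
pooled = np.concatenate([x, y])
n_x = len(x)
B = 5000
perm = np.empty(B)
for b in range(B):
    shuffled = rng.permutation(pooled)
    perm[b] = stats.brunnermunzel(shuffled[:n_x], shuffled[n_x:]).statistic

p_value = np.mean(np.abs(perm) >= abs(t_obs))
print(f"Brunner-Munzel statistic = {t_obs:.3f}, permutation p = {p_value:.4f}")
```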

10.
Robust estimation of multivariate covariance components
Dueck A, Lohr S. Biometrics 2005; 61(1): 162-169.
In many settings, such as interlaboratory testing, small area estimation in sample surveys, and heritability studies, investigators are interested in estimating covariance components for multivariate measurements. However, the presence of outliers can seriously distort estimates obtained using standard procedures such as maximum likelihood. We propose a procedure based on M-estimation for robustly estimating multivariate covariance components in the presence of outliers; the procedure applies to balanced and unbalanced data. We present an algorithm for computing the robust estimates and examine the performance of the estimator through a simulation study. The estimator is used to find covariance components and identify outliers in a study of variability of egg length and breadth measurements of American coots.

11.
The Cochran–Armitage (CA) linear trend test for proportions is often used for genotype-based analysis of candidate gene association. Depending on the underlying genetic mode of inheritance, the use of model-specific scores maximises the power. Commonly, the underlying genetic model, i.e. additive, dominant or recessive mode of inheritance, is a priori unknown. Association studies are commonly analysed using permutation tests, where both inference and identification of the underlying mode of inheritance are important. Especially interesting are tests for case–control studies defined by a maximum over a series of standardised CA tests, because such a procedure has power under all three genetic models. We reformulate the test problem and propose a conditional maximum test of score-specific linear-by-linear association tests. For maximum-type, sum and quadratic test statistics the asymptotic expectation and covariance can be derived in closed form and the limiting distribution is known. Both the limiting distribution and approximations of the exact conditional distribution can easily be computed using standard software packages. In addition to these technical advances, we extend the area of application to stratified designs, studies involving more than two groups and the simultaneous analysis of multiple loci by means of multiplicity-adjusted p-values for the underlying multiple CA trend tests. The new test is applied to reanalyse a study investigating genetic components of different subtypes of psoriasis. The result is a new and flexible inference tool for association studies that is available both in theory and in practice, since existing software packages can easily be used to implement the suggested test procedures.
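The maximum-over-models idea can be sketched as follows: compute a standardized trend statistic under additive, dominant and recessive scores, take the maximum, and calibrate it by permuting case-control labels. The Python sketch below uses a simple correlation-based approximation to the standardized CA statistic and invented genotype frequencies; it is not the conditional maximum test proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical case-control genotype data (0/1/2 copies of the risk allele).
cases = rng.choice([0, 1, 2], size=400, p=[0.45, 0.40, 0.15])
controls = rng.choice([0, 1, 2], size=400, p=[0.55, 0.35, 0.10])
genotypes = np.concatenate([cases, controls])
status = np.concatenate([np.ones(400), np.zeros(400)])   # 1 = case, 0 = control

# Model-specific scores for additive, dominant and recessive inheritance.
SCORES = {"additive": (0, 1, 2), "dominant": (0, 1, 1), "recessive": (0, 0, 1)}

def standardized_trend(status, genotypes, score):
    """Correlation-based approximation to the standardized CA trend statistic."""
    x = np.asarray(score, dtype=float)[genotypes]
    x = (x - x.mean()) / x.std()
    d = (status - status.mean()) / status.std()
    return np.sum(x * d) / np.sqrt(len(x))

def max_statistic(status, genotypes):
    return max(abs(standardized_trend(status, genotypes, s)) for s in SCORES.values())

obs = max_statistic(status, genotypes)

# Permutation null distribution: shuffle case/control labels, genotypes fixed.
perm = np.array([max_statistic(rng.permutation(status), genotypes)
                 for _ in range(2000)])
p_value = (np.sum(perm >= obs) + 1) / (len(perm) + 1)
print(f"MAX statistic = {obs:.3f}, permutation p = {p_value:.4f}")
```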

12.
This article is concerned with inference for a certain class of inhomogeneous Neyman-Scott point processes depending on spatial covariates. Regression parameter estimates obtained from a simple estimating function are shown to be asymptotically normal when the "mother" intensity for the Neyman-Scott process tends to infinity. Clustering parameter estimates are obtained using minimum contrast estimation based on the K-function. The approach is motivated and illustrated by applications to point pattern data from a tropical rain forest plot.

13.
Liu Q, Chi GY. Biometrics 2001; 57(1): 172-177.
Proschan and Hunsberger (1995, Biometrics 51, 1315-1324) proposed a two-stage adaptive design that maintains the Type I error rate. For practical applications, a two-stage adaptive design is also required to achieve a desired statistical power while limiting the maximum overall sample size. In our proposal, a two-stage adaptive design comprises a main stage and an extension stage, where the main stage has sufficient power to reject the null hypothesis under the anticipated effect size and the extension stage allows increasing the sample size in case the true effect size is smaller than anticipated. For statistical inference, methods for obtaining the overall adjusted p-value, point estimate and confidence intervals are developed. An exact two-stage test procedure is also outlined for robust inference.

14.
An estimated quadratic inference function method is proposed for correlated failure time data with auxiliary covariates. The proposed method makes efficient use of the auxiliary information for the incomplete exposure covariates and preserves the property of the quadratic inference function method that requires the covariates to be completely observed. It can improve the estimation efficiency and easily deal with the situation when the cluster size is large. The proposed estimator which minimizes the estimated quadratic inference function is shown to be consistent and asymptotically normal. A chi-squared test based on the estimated quadratic inference function is proposed to test hypotheses about the regression parameters. The small-sample performance of the proposed method is investigated through extensive simulation studies. The proposed method is then applied to analyze the Study of Left Ventricular Dysfunction (SOLVD) data as an illustration.

15.
Motivated by a study on pregnancy outcome, a computationally simple resampling procedure for nonparametric analysis of the cumulative incidence function of a competing risk is investigated for left-truncated data. We also modify the original procedure to produce the more desirable Greenwood-type variance estimates. These approaches are used to construct simultaneous confidence bands for the cumulative incidence function, a task that is otherwise hampered by the complicated nature of the covariance process. Simulation results and a real data example are provided.

16.
Global Positioning System (GPS) and very high frequency (VHF) telemetry data redefined the examination of wildlife resource use. Researchers collar animals, relocate those animals over time, and utilize the estimated locations to infer resource use and build predictive models. Precision of these estimated wildlife locations, however, influences the reliability of point-based models, with accuracy depending on the interaction between mean telemetry error and how habitat characteristics are mapped (categorical raster resolution and patch size). Telemetry data often foster the assumption that locational error can be ignored without biasing study results. We evaluated the effects of mean telemetry error and categorical raster resolution on the correct characterization of patch use when locational error is ignored. We found that our ability to accurately attribute patch type to an estimated telemetry location improved nonlinearly as patch size increased and mean telemetry error decreased. Furthermore, the exact shape of these relationships was directly influenced by categorical raster resolution. Accuracy ranged from 100% (200-ha patch size, 1- to 5-m telemetry error) to 46% (0.5-ha patch size, 56- to 60-m telemetry error) for 10-m resolution rasters. Accuracy ranged from 99% (200-ha patch size, 1- to 5-m telemetry error) to 57% (0.5-ha patch size, 56- to 60-m telemetry error) for 30-m resolution rasters. When covariate rasters were coarser (30 m vs. 10 m), estimates that ignored telemetry error were more accurate at smaller patch sizes. Hence, both fine resolution (10 m) covariate rasters and small patch sizes increased the probability of patch misidentification. Our results help frame the scope of ecological inference made from point-based wildlife resource use models. For instance, to make ecological inferences with 90% accuracy at small patch sizes (≤5 ha), mean telemetry error ≤5 m is required for 10-m resolution categorical rasters. To achieve the same inference on 30-m resolution categorical rasters, mean telemetry error ≤10 m is required. We encourage wildlife professionals creating point-based models to assess whether reasonable estimates of resource use can be expected given their telemetry error, covariate raster resolution, and range of patch sizes.
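The mechanism behind these accuracy figures can be illustrated with a toy Monte Carlo: place true locations inside a square patch of a given area, add Gaussian telemetry error, snap the estimated fix to a raster cell, and record how often the fix is still attributed to the patch. The Python sketch below (hypothetical square-patch landscape, not the authors' raster simulation) reproduces the qualitative pattern of accuracy rising with patch size and falling with telemetry error.

```python
import numpy as np

rng = np.random.default_rng(4)

def patch_accuracy(patch_ha, error_sd_m, resolution_m, n_fixes=20_000):
    """Monte Carlo probability that a telemetry fix with Gaussian location
    error is still attributed to the correct (square) habitat patch."""
    side = np.sqrt(patch_ha * 10_000)                     # patch side length (m)
    true_xy = rng.uniform(0, side, size=(n_fixes, 2))     # true locations in patch
    est_xy = true_xy + rng.normal(0, error_sd_m, size=(n_fixes, 2))
    # Assign each estimated fix to a raster cell and use the cell centre to
    # decide which patch type the fix is attributed to.
    cell_centre = (np.floor(est_xy / resolution_m) + 0.5) * resolution_m
    correct = np.all((cell_centre >= 0) & (cell_centre <= side), axis=1)
    return correct.mean()

for res in (10, 30):
    for patch_ha in (0.5, 5, 200):
        for err in (5, 30, 60):
            acc = patch_accuracy(patch_ha, err, resolution_m=res)
            print(f"{res:2d}-m raster, {patch_ha:6.1f}-ha patch, "
                  f"{err:2d}-m error: accuracy {acc:.2f}")
```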

17.
The concept of balanced sampling is applied to prediction in finite samples using model-based inference procedures. Necessary and sufficient conditions are derived for a general linear model with arbitrary covariance structure to yield the expansion estimator as the best linear unbiased predictor for the mean. The analysis is extended to produce a robust estimator for the mean squared error under balanced sampling, and the results are discussed in the context of statistical genetics, where appropriate sampling produces simple, efficient and robust genetic predictors free from unnecessary genetic assumptions.

18.
Understanding the demography of species over recent history (e.g. <100 years) is critical in studies of ecology and evolution, but records of population history are rarely available. Surveying genetic variation is a potential alternative to census-based estimates of population size, and can yield insight into the demography of a population. However, to assess the performance of genetic methods, it is important to compare their estimates of population history to known demography. Here, we leveraged the exceptional resources from a wetland with 37 years of amphibian mark–recapture data to study the utility of genetically based demographic inference on salamander species with documented population declines (Ambystoma talpoideum) and expansions (A. opacum), patterns that have been shown to be correlated with changes in wetland hydroperiod. We generated ddRAD data from two temporally sampled populations of A. opacum (1993, 2013) and A. talpoideum (1984, 2011) and used coalescent-based demographic inference to compare alternate evolutionary models. For both species, demographic model inference supported population size changes that corroborated mark–recapture data. Parameter estimation in A. talpoideum was robust to our variations in analytical approach, while estimates for A. opacum were highly inconsistent, tempering our confidence in detecting a demographic trend in this species. Overall, our robust results in A. talpoideum suggest that genome-based demographic inference has utility on an ecological scale, but researchers should also be cognizant that these methods may not work in all systems and evolutionary scenarios. Demographic inference may be an important tool for population monitoring and conservation management planning.

19.
Doubly robust estimation in missing data and causal inference models
Bang H, Robins JM. Biometrics 2005; 61(4): 962-973.
The goal of this article is to construct doubly robust (DR) estimators in ignorable missing data and causal inference models. In a missing data model, an estimator is DR if it remains consistent when either (but not necessarily both) a model for the missingness mechanism or a model for the distribution of the complete data is correctly specified. Because with observational data one can never be sure that either a missingness model or a complete data model is correct, perhaps the best that can be hoped for is to find a DR estimator. DR estimators, in contrast to standard likelihood-based or (nonaugmented) inverse probability-weighted estimators, give the analyst two chances, instead of only one, to make a valid inference. In a causal inference model, an estimator is DR if it remains consistent when either a model for the treatment assignment mechanism or a model for the distribution of the counterfactual data is correctly specified. Because with observational data one can never be sure that a model for the treatment assignment mechanism or a model for the counterfactual data is correct, inference based on DR estimators should improve upon previous approaches. Indeed, we present the results of simulation studies which demonstrate that the finite sample performance of DR estimators is as impressive as theory would predict. The proposed method is applied to a cardiovascular clinical trial.
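A generic augmented inverse-probability-weighted (AIPW) estimator conveys the double robustness idea: it combines a propensity model with outcome regressions and stays consistent if either one is correct. The Python sketch below is a standard AIPW construction on simulated data, not the article's specific regression-based DR estimator; all variable names and models are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(5)

# Simulated observational data: confounders X, binary treatment A, outcome Y.
n = 5000
X = rng.normal(size=(n, 2))
p_treat = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.5 * X[:, 1])))
A = rng.binomial(1, p_treat)
Y = 1.0 * A + X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)   # true effect = 1.0

# Working models: propensity score e(X) and outcome regressions m1(X), m0(X).
e_hat = LogisticRegression().fit(X, A).predict_proba(X)[:, 1]
m1_hat = LinearRegression().fit(X[A == 1], Y[A == 1]).predict(X)
m0_hat = LinearRegression().fit(X[A == 0], Y[A == 0]).predict(X)

# Augmented inverse-probability-weighted (doubly robust) estimates of the two
# counterfactual means: consistent if either the propensity model or the
# outcome model is correctly specified.
mu1 = np.mean(A * Y / e_hat - (A - e_hat) / e_hat * m1_hat)
mu0 = np.mean((1 - A) * Y / (1 - e_hat) + (A - e_hat) / (1 - e_hat) * m0_hat)
print(f"doubly robust estimate of the average treatment effect = {mu1 - mu0:.3f}")
```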

20.
In many applications where it is necessary to test multiple hypotheses simultaneously, the data encountered are discrete. In such cases, it is important for the multiplicity adjustment to take into account the discreteness of the distributions of the p-values, to ensure that the procedure is not overly conservative. In this paper, we review some known multiple testing procedures for discrete data that control the familywise error rate, the probability of making any false rejection. Taking advantage of the fact that the exact permutation or exact pairwise permutation distributions of the p-values can often be determined when the sample size is small, we investigate procedures that incorporate the dependence structure through the exact permutation distribution and propose two new procedures that incorporate the exact pairwise permutation distributions. A step-up procedure is also proposed that accounts for the discreteness of the data. The performance of the proposed procedures is investigated through simulation studies and two applications. The results show that by incorporating both the discreteness and dependency of p-value distributions, gains in power can be achieved.
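As a generic illustration of how permutation can capture both the discreteness and the dependence of the p-value distributions (a standard single-step min-P adjustment, not the procedures proposed in the paper), the following Python sketch computes Fisher exact p-values for several binary endpoints and adjusts them against the permutation distribution of the minimum p-value; all data are simulated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Hypothetical discrete data: binary responses for two groups on 5 endpoints.
n_per_group, n_endpoints = 12, 5
group = np.repeat([0, 1], n_per_group)
X = rng.binomial(1, 0.2, size=(2 * n_per_group, n_endpoints))
X[group == 1, 0] = rng.binomial(1, 0.7, n_per_group)   # endpoint 0: real effect

def fisher_pvals(X, group):
    """Two-sided Fisher exact p-value for each endpoint (discrete p-values)."""
    pvals = []
    for j in range(X.shape[1]):
        table = [[int(np.sum(X[group == g, j] == v)) for v in (0, 1)] for g in (0, 1)]
        pvals.append(stats.fisher_exact(table)[1])
    return np.array(pvals)

obs_p = fisher_pvals(X, group)

# Single-step min-P adjustment: permuting group labels preserves both the
# discreteness of the p-values and their dependence across endpoints.
B = 2000
min_p = np.array([fisher_pvals(X, rng.permutation(group)).min() for _ in range(B)])
adj_p = np.array([np.mean(min_p <= p) for p in obs_p])
print("raw p-values:      ", obs_p.round(4))
print("min-P adjusted p:  ", adj_p.round(4))
```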
