Similar Literature
20 similar documents retrieved.
1.
The nonparametric Behrens‐Fisher hypothesis is the most appropriate null hypothesis for the two‐sample comparison when one does not wish to make restrictive assumptions about possible distributions. In this paper, a numerical approach is described by which the likelihood ratio test can be calculated for the nonparametric Behrens‐Fisher problem. The approach taken here effectively reduces the number of parameters in the score equations to one by using a recursive formula for the remaining parameters. The resulting single dimensional problem can be solved numerically. The power of the likelihood ratio test is compared by simulation to that of a generalized Wilcoxon test of Brunner and Munzel. The tests have similar power for all alternatives considered when a simulated null distribution is used to generate cutoff values for the tests. The methods are illustrated on data on shoulder pain from a clinical trial.
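For orientation, the Brunner–Munzel comparison referred to above is available directly in SciPy. The following is a minimal sketch on simulated two‐sample data; the sample sizes and distributions are illustrative assumptions, and this is not the likelihood ratio test developed in the paper.

```python
# Minimal sketch: Brunner-Munzel test for the nonparametric Behrens-Fisher
# problem on simulated two-sample data (illustrative settings, not the
# shoulder-pain data analyzed in the paper).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Two samples with different shapes and unequal variances
x = rng.normal(loc=0.0, scale=1.0, size=30)
y = rng.lognormal(mean=0.2, sigma=0.8, size=40)

# H0: P(X < Y) + 0.5 * P(X = Y) = 0.5  (nonparametric Behrens-Fisher null)
res = stats.brunnermunzel(x, y)
print(f"Brunner-Munzel statistic = {res.statistic:.3f}, p-value = {res.pvalue:.4f}")
```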

2.
High‐dimensional data provide many potential confounders that may bolster the plausibility of the ignorability assumption in causal inference problems. Propensity score methods are powerful causal inference tools, which are popular in health care research and are particularly useful for high‐dimensional data. Recent interest has surrounded a Bayesian treatment of propensity scores in order to flexibly model the treatment assignment mechanism and summarize posterior quantities while incorporating variance from the treatment model. We discuss methods for Bayesian propensity score analysis of binary treatments, focusing on modern methods for high‐dimensional Bayesian regression and the propagation of uncertainty. We introduce a novel and simple estimator for the average treatment effect that capitalizes on conjugacy of the beta and binomial distributions. Through simulations, we show the utility of horseshoe priors and Bayesian additive regression trees paired with our new estimator, while demonstrating the importance of including variance from the treatment regression model. An application to cardiac stent data with almost 500 confounders and 9000 patients illustrates approaches and facilitates comparison with existing alternatives. As measured by a falsifiability endpoint, we improved confounder adjustment compared with past observational research of the same problem.
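As a small, self-contained illustration of the beta–binomial conjugacy that such an estimator exploits, the sketch below draws posterior samples of a risk difference from hypothetical counts within a single propensity-score stratum. The counts, priors, and single-stratum setup are assumptions for illustration; this is not the paper's estimator or the stent data analysis.

```python
# Minimal sketch of beta-binomial conjugacy for a binary outcome: posterior
# draws of the risk difference between treated and control patients within a
# single propensity-score stratum.  Counts below are made up for illustration.
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical counts within one stratum: (events, patients)
treated_events, treated_n = 18, 120
control_events, control_n = 30, 140

# Beta(1, 1) priors; conjugacy gives Beta posteriors in closed form
post_treated = rng.beta(1 + treated_events, 1 + treated_n - treated_events, size=10_000)
post_control = rng.beta(1 + control_events, 1 + control_n - control_events, size=10_000)

risk_diff = post_treated - post_control
lo, hi = np.percentile(risk_diff, [2.5, 97.5])
print(f"posterior mean risk difference = {risk_diff.mean():.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```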

3.

Summary

We consider a functional linear Cox regression model for characterizing the association between time‐to‐event data and a set of functional and scalar predictors. The functional linear Cox regression model incorporates a functional principal component analysis for modeling the functional predictors and a high‐dimensional Cox regression model to characterize the joint effects of both functional and scalar predictors on the time‐to‐event data. We develop an algorithm to calculate the maximum approximate partial likelihood estimates of unknown finite and infinite dimensional parameters. We also systematically investigate the rate of convergence of the maximum approximate partial likelihood estimates and a score test statistic for testing the nullity of the slope function associated with the functional predictors. We demonstrate our estimation and testing procedures by using simulations and the analysis of the Alzheimer's Disease Neuroimaging Initiative (ADNI) data. Our real data analyses show that high‐dimensional hippocampus surface data may be an important marker for predicting time to conversion to Alzheimer's disease. Data used in the preparation of this article were obtained from the ADNI database (adni.loni.usc.edu).
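A rough sketch of the general pipeline (functional principal component scores extracted from discretized curves, then entered as covariates in a Cox model) is shown below on simulated data. It assumes the lifelines package for the Cox fit and does not implement the paper's maximum approximate partial likelihood algorithm or the ADNI analysis.

```python
# Rough sketch: FPC scores from densely observed curves used as covariates in a
# Cox model.  Simulated data; assumes the `lifelines` package is installed.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(2)
n, n_grid = 200, 50

# Simulated functional predictor on a common grid, plus a scalar covariate
curves = rng.normal(size=(n, n_grid)).cumsum(axis=1)
age = rng.normal(70, 5, size=n)

# FPCA via SVD of the centered curves; keep the first 3 component scores
centered = curves - curves.mean(axis=0)
u, s, _ = np.linalg.svd(centered, full_matrices=False)
scores = u[:, :3] * s[:3]

# Simulated survival times with censoring (illustrative only)
lin_pred = 0.5 * scores[:, 0] + 0.02 * (age - 70)
time = rng.exponential(scale=np.exp(-lin_pred))
censor = rng.exponential(scale=2.0, size=n)
df = pd.DataFrame({
    "T": np.minimum(time, censor),
    "E": (time <= censor).astype(int),
    "fpc1": scores[:, 0], "fpc2": scores[:, 1], "fpc3": scores[:, 2],
    "age": age,
})

cph = CoxPHFitter().fit(df, duration_col="T", event_col="E")
cph.print_summary()
```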

4.
Fei Liu, David Dunson & Fei Zou. Biometrics, 2011, 67(2): 504–512.
Summary This article considers the problem of selecting predictors of time to an event from a high‐dimensional set of candidate predictors using data from multiple studies. As an alternative to the current multistage testing approaches, we propose to model the study‐to‐study heterogeneity explicitly using a hierarchical model to borrow strength. Our method incorporates censored data through an accelerated failure time model. Using a carefully formulated prior specification, we develop a fast approach to predictor selection and shrinkage estimation for high‐dimensional predictors. For model fitting, we develop a Monte Carlo expectation maximization (MC‐EM) algorithm to accommodate censored data. The proposed approach, which is related to the relevance vector machine (RVM), relies on maximum a posteriori estimation to rapidly obtain a sparse estimate. As with the typical RVM, there is an intrinsic thresholding property in which unimportant predictors tend to have their coefficients shrunk to zero. We compare our method with some commonly used procedures through simulation studies. We also illustrate the method using the gene expression barcode data from three breast cancer studies.

5.
The generalized maximum likelihood estimator (GMLE) is a nonparametric estimator of the joint distribution function F0 of a d‐dimensional (d ≥ 2) random vector with interval‐censored (IC) data. The GMLE of F0 with univariate IC data is uniquely defined at each follow‐up time. However, this is no longer true in general with multivariate IC data, as demonstrated by a data set from an eye study. How to estimate the survival function and the covariance matrix of the estimator in such a case is a new practical issue in analyzing IC data. We propose a procedure for such a situation and apply it to the data set from the eye study. Our method always results in a GMLE with a nonsingular sample information matrix. We also give a theoretical justification for such a procedure. Extension of our procedure to Cox's regression model is also mentioned.

6.
The popularity of penalized regression in high‐dimensional data analysis has led to a demand for new inferential tools for these models. False discovery rate control is widely used in high‐dimensional hypothesis testing, but has only recently been considered in the context of penalized regression. Almost all of this work, however, has focused on lasso‐penalized linear regression. In this paper, we derive a general method for controlling the marginal false discovery rate that can be applied to any penalized likelihood‐based model, such as logistic regression and Cox regression. Our approach is fast, flexible and can be used with a variety of penalty functions including lasso, elastic net, MCP, and MNet. We derive theoretical results under which the proposed method is valid, and use simulation studies to demonstrate that the approach is reasonably robust, albeit slightly conservative, when these assumptions are violated. Despite being conservative, we show that our method often offers more power to select causally important features than existing approaches. Finally, the practical utility of the method is demonstrated on gene expression datasets with binary and time‐to‐event outcomes.
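To convey the notion of an expected number of false selections along a penalization path (without reproducing the paper's analytic mFDR estimator), one crude alternative is to append known-null noise columns to the design and count how many are selected at each lasso penalty. A hedged sketch, with arbitrary simulation settings:

```python
# Crude, simulation-flavored illustration of "expected false selections" along a
# lasso path: append known-null noise columns and count how many get selected.
# This is NOT the analytic marginal FDR estimator derived in the paper.
import numpy as np
from sklearn.linear_model import lasso_path

rng = np.random.default_rng(3)
n, p_signal, p_null = 200, 5, 200

X_signal = rng.normal(size=(n, p_signal))
beta = np.array([2.0, -1.5, 1.0, -1.0, 0.5])
y = X_signal @ beta + rng.normal(size=n)

# Known-null features: independent noise columns unrelated to y
X_null = rng.normal(size=(n, p_null))
X = np.hstack([X_signal, X_null])

alphas, coefs, _ = lasso_path(X, y, n_alphas=30)
for alpha, b in zip(alphas, coefs.T):
    selected = np.flatnonzero(b)
    n_null_selected = np.sum(selected >= p_signal)
    if selected.size:
        print(f"alpha={alpha:.3f}  selected={selected.size:3d}  "
              f"known-null selected={n_null_selected:3d}")
```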

7.
A score‐type test is proposed for testing the hypothesis of independent binary random variables against positive correlation in linear logistic models with sparse data and cluster‐specific covariates. The test is developed for univariate and multivariate one‐sided alternatives. The main advantage of using a score test is that it requires estimation of the model only under the null hypothesis, which in this case corresponds to the binomial maximum likelihood fit. The score‐type test is developed from a class of estimating equations with block‐diagonal structure in which the coefficients of the linear logistic model are estimated simultaneously with the correlation. The simplicity of the score test is illustrated in two particular examples.

8.
Longitudinal data are common in clinical trials and observational studies, where missing outcomes due to dropouts are frequently encountered. In this context, under the assumption of missing at random, the weighted generalized estimating equation (WGEE) approach is widely adopted for marginal analysis. Model selection on marginal mean regression is a crucial aspect of data analysis, and identifying an appropriate correlation structure for model fitting may also be of interest and importance. However, the existing information criteria for model selection in WGEE have limitations, such as separate criteria for the selection of marginal mean and correlation structures and unsatisfactory selection performance in small‐sample setups. In particular, few studies have developed joint information criteria for selection of both marginal mean and correlation structures. In this work, by embedding empirical likelihood into the WGEE framework, we propose two innovative information criteria, a joint empirical Akaike information criterion and a joint empirical Bayesian information criterion, which can simultaneously select the variables for the marginal mean regression and the correlation structure. In extensive simulation studies, these empirical‐likelihood‐based criteria are robust and flexible and outperform other criteria, including the weighted quasi‐likelihood under the independence model criterion, the missing longitudinal information criterion, and the joint longitudinal information criterion. In addition, we provide a theoretical justification of our proposed criteria and present two real data examples for further illustration.

9.
Pratiti Bhadra & Debnath Pal. Proteins, 2014, 82(10): 2443–2454.
Inference of the molecular function of proteins is a fundamental task in the quest to understand cellular processes. The task is getting increasingly difficult with thousands of new proteins discovered each day. The difficulty arises primarily from the lack of high‐throughput experimental techniques for assessing protein molecular function, a lacuna that computational approaches are trying hard to fill. The latter also face a major bottleneck in the absence of clear evidence based on evolutionary information. Here we propose a de novo approach to annotate protein molecular function through a structural dynamics match for a pair of segments from two dissimilar proteins, which may share even <10% sequence identity. To screen these matches, corresponding 1 µs coarse‐grained (CG) molecular dynamics trajectories were used to compute normalized root‐mean‐square‐fluctuation graphs and select mobile segments, which were thereafter matched for all pairs using unweighted three‐dimensional autocorrelation vectors. Our in‐house custom‐built forcefield (FF), extensively validated against dynamics information obtained from experimental nuclear magnetic resonance data, was specifically used to generate the CG dynamics trajectories. The test for correspondence between the dynamics signature of protein segments and function revealed an 87% true positive rate and a 93.5% true negative rate on a dataset of 60 experimentally validated proteins, including moonlighting proteins and those with novel functional motifs. A random test against 315 unique fold/function proteins, serving as a negative control, gave >99% true recall. A blind prediction on a novel protein appears consistent with additional evidence retrieved therein. This is the first proof‐of‐principle of the generalized use of structural dynamics for inferring protein molecular function, leveraging our custom‐made CG FF, which is useful to all. Proteins 2014; 82:2443–2454. © 2014 Wiley Periodicals, Inc.
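The sketch below shows only the RMSF step on a synthetic coarse-grained trajectory stored as a (frames, residues, 3) array: compute per-residue fluctuations, normalize, and flag mobile segments. The trajectory, threshold, and array layout are assumptions; the custom force field and the 3D autocorrelation matching are not reproduced.

```python
# Toy computation of a normalized root-mean-square-fluctuation (RMSF) profile
# from a coarse-grained trajectory stored as a (frames, residues, 3) array.
import numpy as np

rng = np.random.default_rng(4)
n_frames, n_res = 1000, 80

# Synthetic trajectory: residues fluctuate around fixed mean positions,
# with a "mobile" loop (residues 30-45) given larger fluctuations.
mean_pos = rng.normal(size=(n_res, 3)) * 10
scale = np.full(n_res, 0.5)
scale[30:46] = 2.0
traj = mean_pos + rng.normal(size=(n_frames, n_res, 3)) * scale[None, :, None]

# RMSF per residue: sqrt of the time-averaged squared deviation from the mean
mean_structure = traj.mean(axis=0)
rmsf = np.sqrt(((traj - mean_structure) ** 2).sum(axis=2).mean(axis=0))
rmsf_norm = (rmsf - rmsf.min()) / (rmsf.max() - rmsf.min())

mobile = np.flatnonzero(rmsf_norm > 0.5)
print("residues flagged as mobile:", mobile)
```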

10.
It has long been known that insufficient consideration of spatial autocorrelation leads to unreliable hypothesis tests and inaccurate parameter estimates. Yet, ecologists are confronted with a confusing array of methods to account for spatial autocorrelation. Although Beale et al. (2010) provided guidance for continuous data on regular grids, researchers still need advice for other types of data in more flexible spatial contexts. In this paper, we extend the work of Beale et al. (2010) to count data on both regularly‐ and irregularly‐spaced plots, the latter being commonly encountered in ecological studies. Through a simulation‐based approach, we assessed the accuracy and the type I errors of two frequentist and two Bayesian ready‐to‐use methods in the family of generalized mixed models, with distance‐based or neighbourhood‐based correlated random effects. In addition, we tested whether the methods are robust to spatial non‐stationarity, and over‐ and under‐dispersion – both typical features of species distribution count data that violate standard regression assumptions. In the simplest of our simulated datasets, the two frequentist methods gave inflated type I errors, while the two Bayesian methods provided satisfying results. When facing real‐world complexities, the distance‐based Bayesian method (MCMC with Langevin–Hastings updates) performed best of all. We hope that, in the light of our results, ecological researchers will feel more comfortable including spatial autocorrelation in their analyses of count data.
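As an illustration of the data-generating side of such a simulation study, the sketch below produces spatially autocorrelated counts on irregularly spaced plots by passing a Gaussian random effect with exponential spatial covariance through a log-linear Poisson model. The parameter values are arbitrary assumptions, and the frequentist and Bayesian GLMM fits compared in the paper are not shown.

```python
# Sketch: simulate spatially autocorrelated count data on irregularly spaced
# plots via a Gaussian random effect with exponential spatial covariance.
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(5)
n_plots = 150

# Irregularly spaced plot coordinates in a unit square
coords = rng.uniform(size=(n_plots, 2))

# Exponential covariance: sigma^2 * exp(-distance / range)
sigma2, spatial_range = 1.0, 0.2
dist = cdist(coords, coords)
cov = sigma2 * np.exp(-dist / spatial_range)

# Spatially correlated random effect and Poisson counts
random_effect = rng.multivariate_normal(np.zeros(n_plots), cov)
elevation = rng.normal(size=n_plots)           # an arbitrary covariate
log_mu = 1.0 + 0.5 * elevation + random_effect
counts = rng.poisson(np.exp(log_mu))

print("mean count:", counts.mean(), " max count:", counts.max())
```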

11.
Occupancy modeling is important for exploring species distribution patterns and for conservation monitoring. Within this framework, explicit attention is given to species detection probabilities estimated from replicate surveys of sample units. A central assumption is that replicate surveys are independent Bernoulli trials, but this assumption becomes untenable when ecologists serially deploy remote cameras and acoustic recording devices over days and weeks to survey rare and elusive animals. Proposed solutions involve modifying the detection‐level component of the model (e.g., a first‐order Markov covariate). Evaluating whether a model sufficiently accounts for correlation is imperative, but clear guidance for practitioners is lacking. Currently, an omnibus goodness‐of‐fit test using a chi‐square discrepancy measure on unique detection histories is available for occupancy models (MacKenzie and Bailey, Journal of Agricultural, Biological, and Environmental Statistics, 9, 2004, 300; hereafter, the MacKenzie–Bailey test). We propose a join count summary measure adapted from spatial statistics to directly assess correlation after fitting a model. We motivate our work with a dataset of multinight bat call recordings from a pilot study for the North American Bat Monitoring Program. We found in simulations that our join count test was more reliable than the MacKenzie–Bailey test for detecting inadequacy of a model that assumed independence, particularly when serial correlation was low to moderate. A model that included a Markov‐structured detection‐level covariate produced unbiased occupancy estimates except in the presence of strong serial correlation and a revisit design consisting only of temporal replicates. When applied to two common bat species, our approach illustrates that sophisticated models do not guarantee adequate fit to real data, underscoring the importance of model assessment. Our join count test provides a widely applicable goodness‐of‐fit test and specifically evaluates occupancy‐model lack of fit related to correlation among detections within a sample unit. Our diagnostic tool is available to practitioners who serially deploy survey equipment as a way to achieve cost savings.
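A toy version of the join count idea is sketched below: count adjacent 1-1 pairs across survey nights and compare the observed total with histories simulated under independent Bernoulli detections. It conveys the flavor of a join count discrepancy measure only; the model-based goodness-of-fit test proposed in the paper is not implemented, and the detection histories are made up.

```python
# Toy join count check for serial correlation in detection histories.
import numpy as np

rng = np.random.default_rng(6)

def join_count_11(histories):
    """Number of adjacent (night t, night t+1) detections that are both 1."""
    h = np.asarray(histories)
    return int(np.sum(h[:, :-1] * h[:, 1:]))

# Hypothetical detection histories: 40 sites x 8 consecutive nights
obs = rng.binomial(1, 0.3, size=(40, 8))
obs_stat = join_count_11(obs)

# Reference distribution under independence, using each site's detection rate
p_site = obs.mean(axis=1, keepdims=True)
sim_stats = np.array([
    join_count_11(rng.binomial(1, np.broadcast_to(p_site, obs.shape)))
    for _ in range(2000)
])
p_value = np.mean(sim_stats >= obs_stat)
print(f"observed 1-1 joins = {obs_stat}, simulation p-value = {p_value:.3f}")
```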

12.
In this paper, our aim is to analyze the geographical and temporal variability of disease incidence when spatio‐temporal count data have excess zeros. To that end, we consider random effects in zero‐inflated Poisson models to investigate geographical and temporal patterns of disease incidence. Spatio‐temporal models that employ conditionally autoregressive smoothing across the spatial dimension and B‐spline smoothing over the temporal dimension are proposed. The analysis of these complex models is computationally difficult from the frequentist perspective. On the other hand, the advent of the Markov chain Monte Carlo algorithm has made the Bayesian analysis of complex models computationally convenient. The recently developed data cloning method provides a frequentist approach to mixed models that is also computationally convenient. We propose to use data cloning, which yields maximum likelihood estimates, to conduct a frequentist analysis of zero‐inflated spatio‐temporal modeling of disease incidence. One of the advantages of the data cloning approach is that predictions and corresponding standard errors (or prediction intervals) for smoothed disease incidence over space and time are easily obtained. We illustrate our approach using a real dataset of monthly children's asthma visits to hospital in the province of Manitoba, Canada, during the period April 2006 to March 2010. The performance of our approach is also evaluated through a simulation study.
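The mixture structure that these spatio-temporal models build on can be seen in a basic zero-inflated Poisson fit by direct maximum likelihood, sketched below. There is no CAR smoothing, B-spline term, or data cloning here, and the data and starting values are illustrative assumptions.

```python
# Basic zero-inflated Poisson (ZIP) fit by direct maximum likelihood.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

rng = np.random.default_rng(7)

# Simulate ZIP data: structural zeros with probability pi, else Poisson(lam)
true_pi, true_lam = 0.35, 2.5
n = 500
is_zero = rng.uniform(size=n) < true_pi
y = np.where(is_zero, 0, rng.poisson(true_lam, size=n))

def neg_loglik(params):
    pi = expit(params[0])          # zero-inflation probability
    lam = np.exp(params[1])        # Poisson mean
    log_pois = -lam + y * np.log(lam) - gammaln(y + 1)
    ll_zero = np.log(pi + (1 - pi) * np.exp(-lam))   # contribution of zeros
    ll_pos = np.log(1 - pi) + log_pois               # contribution of positives
    return -np.sum(np.where(y == 0, ll_zero, ll_pos))

fit = minimize(neg_loglik, x0=np.array([0.0, 0.0]), method="BFGS")
print("estimated pi =", float(expit(fit.x[0])), " estimated lambda =", float(np.exp(fit.x[1])))
```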

13.
Summary We propose a Bayesian dose‐finding design that accounts for two important factors, the severity of toxicity and heterogeneity in patients' susceptibility to toxicity. We consider toxicity outcomes with various levels of severity and define appropriate scores for these severity levels. We then use a multinomial‐likelihood function and a Dirichlet prior to model the probabilities of these toxicity scores at each dose, and characterize the overall toxicity using an average toxicity score (ATS) parameter. To address the issue of heterogeneity in patients' susceptibility to toxicity, we categorize patients into different risk groups based on their susceptibility. A Bayesian isotonic transformation is applied to induce an order‐restricted posterior inference on the ATS. We demonstrate the performance of the proposed dose‐finding design using simulations based on a clinical trial in multiple myeloma.
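The multinomial–Dirichlet conjugacy at a single dose, and the induced posterior of an average toxicity score, can be sketched as follows. The severity scores, counts, and prior are hypothetical, and the risk-group stratification and Bayesian isotonic transformation are not included.

```python
# Sketch: Dirichlet posterior for toxicity-severity probabilities at one dose,
# and the induced posterior of an average toxicity score (ATS).
import numpy as np

rng = np.random.default_rng(8)

severity_scores = np.array([0.0, 0.5, 1.0, 1.5])   # e.g. none/mild/moderate/severe
counts = np.array([10, 5, 3, 1])                   # hypothetical outcomes at this dose
prior = np.ones(4)                                 # Dirichlet(1,1,1,1) prior

# Conjugacy: posterior is Dirichlet(prior + counts)
post_probs = rng.dirichlet(prior + counts, size=10_000)
ats_draws = post_probs @ severity_scores

print(f"posterior mean ATS = {ats_draws.mean():.3f}")
print(f"P(ATS > 0.5) = {np.mean(ats_draws > 0.5):.3f}")
```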

14.
A new statistical testing approach is developed for rodent tumorigenicity assays that have a single terminal sacrifice or occasionally interim sacrifices but no cause‐of‐death data. For experiments that lack cause‐of‐death data, statistically imputed numbers of fatal tumors and incidental tumors are used to modify Peto's cause‐of‐death test, which is usually implemented using pathologist‐assigned cause‐of‐death information. The numbers of fatal tumors are estimated using a constrained nonparametric maximum likelihood estimation method. A new Newton‐based approach under inequality constraints is proposed for finding the global maximum likelihood estimates. In this study, the proposed method concentrates on data from a single‐sacrifice experiment without imposing further assumptions. The new testing approach may be more reliable than Peto's test because of the potential for misclassification of cause of death by pathologists. A Monte Carlo simulation study is conducted to assess the size and power of the proposed test. Asymptotic normality of the statistic of the proposed test is also investigated. The proposed testing approach is illustrated using a real data set.

15.
Successful pharmaceutical drug development requires finding correct doses. The issues that conventional dose‐response analyses consider, namely whether responses are related to doses, which doses have responses differing from a control dose response, the functional form of a dose‐response relationship, and the dose(s) to carry forward, do not need to be addressed simultaneously. Determining if a dose‐response relationship exists, regardless of its functional form, and then identifying a range of doses to study further may be a more efficient strategy. This article describes a novel estimation‐focused Bayesian approach (BMA‐Mod) for carrying out the analyses when the actual dose‐response function is unknown. Realizations from Bayesian analyses of linear, generalized linear, and nonlinear regression models that may include random effects and covariates other than dose are optimally combined to produce distributions of important secondary quantities, including test‐control differences, predictive distributions of possible outcomes from future trials, and ranges of doses corresponding to target outcomes. The objective is similar to the objective of the hypothesis‐testing based MCP‐Mod approach, but provides more model and distributional flexibility and does not require testing hypotheses or adjusting for multiple comparisons. A number of examples illustrate the application of the method.

16.
Summary We propose a Bayesian chi‐squared model diagnostic for analysis of data subject to censoring. The test statistic has the form of Pearson's chi‐squared test statistic and is easy to calculate from standard output of Markov chain Monte Carlo algorithms. The key innovation of this diagnostic is that it is based only on observed failure times. Because it does not rely on the imputation of failure times for observations that have been censored, we show that under heavy censoring it can have higher power for detecting model departures than a comparable test based on the complete data. In a simulation study, we show that tests based on this diagnostic exhibit comparable power and better nominal Type I error rates than a commonly used alternative test proposed by Akritas (1988, Journal of the American Statistical Association 83, 222–230). An important advantage of the proposed diagnostic is that it can be applied to a broad class of censored data models, including generalized linear models and other models with nonidentically distributed and nonadditive error structures. We illustrate the proposed model diagnostic for testing the adequacy of two parametric survival models for Space Shuttle main engine failures.

17.
Chris J. Lloyd. Biometrics, 2010, 66(3): 975–982.
Summary Clinical trials data often come in the form of low‐dimensional tables of small counts. Standard approximate tests such as score and likelihood ratio tests are imperfect in several respects. First, they can give quite different answers from the same data. Second, the actual type‐1 error can differ significantly from nominal, even for quite large sample sizes. Third, exact inferences based on these tests can be strongly nonmonotonic functions of the null parameter and lead to confidence sets that are discontiguous. There are two modern approaches to small sample inference. One is to use so‐called higher order asymptotics (Reid, 2003, Annals of Statistics 31, 1695–1731) to provide an explicit adjustment to the likelihood ratio statistic. The theory for this is complex but the statistic is quick to compute. The second approach is to perform an exact calculation of significance assuming the nuisance parameters equal their null estimates (Lee and Young, 2005, Statistics and Probability Letters 71, 143–153), which is a kind of parametric bootstrap. The purpose of this article is to explain and evaluate these two methods for testing whether a difference in probabilities p2 − p1 exceeds a prechosen noninferiority margin δ0. On the basis of an extensive numerical study, we recommend bootstrap P‐values as superior to all other alternatives. First, they produce practically identical answers regardless of the basic test statistic chosen. Second, they have excellent size accuracy and higher power. Third, they vary much less erratically with the null parameter value δ0.
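A crude stand-in for the parametric bootstrap idea is sketched below for binomial data: estimate the nuisance parameter by a numerically computed null-constrained maximum likelihood fit, then simulate a plain Wald statistic under that fit. The counts, margin, and choice of statistic are illustrative assumptions rather than the procedures evaluated in the article.

```python
# Crude parametric bootstrap p-value for H0: p2 - p1 <= delta0 vs H1: p2 - p1 > delta0.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(9)

x1, n1 = 28, 100     # group 1: 28 events out of 100 (hypothetical)
x2, n2 = 36, 100     # group 2: 36 events out of 100 (hypothetical)
delta0 = 0.0         # noninferiority margin (illustrative)

def wald_z(x1, n1, x2, n2):
    p1, p2 = x1 / n1, x2 / n2
    se = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p2 - p1 - delta0) / max(se, 1e-12)

def constrained_p1(x1, n1, x2, n2):
    """MLE of p1 under the boundary constraint p2 = p1 + delta0."""
    def nll(p1):
        p2 = p1 + delta0
        if not (0 < p1 < 1 and 0 < p2 < 1):
            return np.inf
        return -(x1 * np.log(p1) + (n1 - x1) * np.log(1 - p1)
                 + x2 * np.log(p2) + (n2 - x2) * np.log(1 - p2))
    lo = max(1e-6, -delta0 + 1e-6)
    hi = min(1 - 1e-6, 1 - delta0 - 1e-6)
    return minimize_scalar(nll, bounds=(lo, hi), method="bounded").x

t_obs = wald_z(x1, n1, x2, n2)
p1_null = constrained_p1(x1, n1, x2, n2)
p2_null = p1_null + delta0

# Simulate the statistic under the null-constrained estimates
t_sim = np.array([
    wald_z(rng.binomial(n1, p1_null), n1, rng.binomial(n2, p2_null), n2)
    for _ in range(5000)
])
print(f"observed z = {t_obs:.3f}, bootstrap p-value = {np.mean(t_sim >= t_obs):.4f}")
```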

18.
We consider the estimation of the prevalence of a rare disease, and of the log‐odds ratio for two specified groups of individuals, from group testing data. For a low‐prevalence disease, the maximum likelihood estimate of the log‐odds ratio is severely biased. However, the Firth correction to the score function leads to a considerable improvement of the estimator. Also, for a low‐prevalence disease, if the diagnostic test is imperfect, group testing is found to yield a more precise estimate of the log‐odds ratio than individual testing.
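For the perfect-assay case, the maximum likelihood estimate of prevalence from group testing has a closed form, sketched below with hypothetical pool counts. The Firth correction and the imperfect-test setting discussed above are not implemented here.

```python
# Minimal sketch of prevalence estimation from group testing with a perfect
# assay: each of m pools of size k tests positive with probability 1 - (1 - p)^k.
import numpy as np

k = 10          # individuals per pool
m = 200         # number of pools
t_pos = 23      # pools testing positive (hypothetical)

# MLE: the proportion of positive pools equals 1 - (1 - p)^k
p_hat = 1.0 - (1.0 - t_pos / m) ** (1.0 / k)

# Delta-method standard error on the prevalence scale
q_hat = t_pos / m
se_q = np.sqrt(q_hat * (1 - q_hat) / m)
se_p = se_q / (k * (1 - q_hat) ** (1 - 1 / k))
print(f"estimated prevalence = {p_hat:.4f} (approx. SE {se_p:.4f})")
```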

19.
Summary The rapid development of new biotechnologies allows us to understand biomedical dynamic systems in greater detail and at the cellular level. Many of the subject‐specific biomedical systems can be described by a set of differential or difference equations that are similar to engineering dynamic systems. In this article, motivated by HIV dynamic studies, we propose a class of mixed‐effects state‐space models based on the longitudinal feature of dynamic systems. State‐space models with mixed‐effects components are very flexible in modeling the serial correlation of within‐subject observations and between‐subject variations. The Bayesian approach and the maximum likelihood method for standard mixed‐effects models and state‐space models are modified and investigated for estimating unknown parameters in the proposed models. In the Bayesian approach, full conditional distributions are derived and the Gibbs sampler is constructed to explore the posterior distributions. For the maximum likelihood method, we develop a Monte Carlo EM algorithm with a Gibbs sampler step to approximate the conditional expectations in the E‐step. Simulation studies are conducted to compare the two proposed methods. We apply the mixed‐effects state‐space model to a data set from an AIDS clinical trial to illustrate the proposed methodologies. The proposed models and methods may also have potential applications in other biomedical system analyses such as tumor dynamics in cancer research and genetic regulatory network modeling.

20.
Studies of evolutionary correlations commonly use phylogenetic regression (i.e., independent contrasts and phylogenetic generalized least squares) to assess trait covariation in a phylogenetic context. However, while this approach is appropriate for evaluating trends in one or a few traits, it is incapable of assessing patterns in highly multivariate data, as the large number of variables relative to sample size prohibits parametric test statistics from being computed. This poses serious limitations for comparative biologists, who must either simplify how they quantify phenotypic traits, or alter the biological hypotheses they wish to examine. In this article, I propose a new statistical procedure for performing ANOVA and regression in a phylogenetic context that can accommodate high‐dimensional datasets. The approach is derived from the statistical equivalency between parametric methods using covariance matrices and methods based on distance matrices. Using simulations under Brownian motion, I show that the method displays appropriate Type I error rates and statistical power, whereas standard parametric procedures have decreasing power as data dimensionality increases. As such, the new procedure provides a useful means of assessing trait covariation across a set of taxa related by a phylogeny, enabling macroevolutionary biologists to test hypotheses of adaptation and phenotypic change in high‐dimensional datasets.
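The equivalence the method rests on, namely that distance matrices can substitute for covariance matrices when traits outnumber taxa, can be illustrated with an ordinary distance-based permutation test (a PERMANOVA-style pseudo-F). The sketch below ignores the phylogeny entirely and is not the phylogenetically informed procedure proposed in the article; the groups and effect size are simulated assumptions.

```python
# Distance-based permutation test of a group effect on a high-dimensional trait
# matrix, using a pseudo-F statistic computed from pairwise distances.
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(10)
n_taxa, n_traits = 30, 200
groups = np.repeat([0, 1], n_taxa // 2)

# High-dimensional traits with a small group effect
Y = rng.normal(size=(n_taxa, n_traits))
Y[groups == 1] += 0.3

D2 = squareform(pdist(Y)) ** 2  # squared Euclidean distances

def pseudo_f(d2, g):
    n = len(g)
    total_ss = d2[np.triu_indices(n, 1)].sum() / n
    within_ss = 0.0
    for level in np.unique(g):
        idx = np.flatnonzero(g == level)
        within_ss += d2[np.ix_(idx, idx)][np.triu_indices(len(idx), 1)].sum() / len(idx)
    between_ss = total_ss - within_ss
    a = len(np.unique(g))
    return (between_ss / (a - 1)) / (within_ss / (n - a))

f_obs = pseudo_f(D2, groups)
f_perm = np.array([pseudo_f(D2, rng.permutation(groups)) for _ in range(999)])
p_value = (1 + np.sum(f_perm >= f_obs)) / 1000
print(f"pseudo-F = {f_obs:.3f}, permutation p-value = {p_value:.3f}")
```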
