Similar literature
 20 similar articles found; search took 328 ms
1.
Summary. Classical diagnostics for structural equation models are based on aggregate forms of the data and are ill suited for checking distributional or linearity assumptions. We extend recently developed goodness-of-fit tests for correlated data based on subject-specific residuals to structural equation models with latent variables. The proposed tests lend themselves to graphical displays and are designed to detect misspecified distributional or linearity assumptions. To complement graphical displays, test statistics are defined; the null distributions of the test statistics are approximated using computationally efficient simulation techniques. The properties of the proposed tests are examined via simulation studies. We illustrate the methods using data from a study of in utero lead exposure.

2.
Guan Y 《Biometrics》2008,64(3):800-806
Summary. We propose a formal method to test stationarity for spatial point processes. The proposed test statistic is based on the integrated squared deviations of observed counts of events from their means estimated under stationarity. We show that the resulting test statistic converges in distribution to a functional of a two-dimensional Brownian motion. To conduct the test, we compare the calculated statistic with the upper tail critical values of this functional. Our method requires only a weak dependence condition on the process but does not assume any parametric model for it. As a result, it can be applied to a wide class of spatial point process models. We study the efficacy of the test through both simulations and applications to two real data examples that were previously suspected to be nonstationary based on graphical evidence. Our test formally confirmed the suspected nonstationarity for both data sets.

3.
4.
We consider the statistical modeling and analysis of replicated multi-type point process data with covariates. Such data arise when heterogeneous subjects experience repeated events or failures which may be of several distinct types. The underlying processes are modeled as nonhomogeneous mixed Poisson processes with random (subject) and fixed (covariate) effects. The method of maximum likelihood is used to obtain estimates and standard errors of the failure rate parameters and regression coefficients. Score tests and likelihood ratio statistics are used for covariate selection. A graphical test of goodness of fit of the selected model is based on generalized residuals. Measures for determining the influence of an individual observation on the estimated regression coefficients and on the score test statistic are developed. An application is described to a large ongoing randomized controlled clinical trial for the efficacy of nutritional supplements of selenium for the prevention of two types of skin cancer.

5.
Briggs WM  Zaretzki R 《Biometrics》2008,64(1):250-6; discussion 256-61
Summary. We introduce the Skill Plot, a method that is directly relevant to a decision maker who must use a diagnostic test. In contrast to ROC curves, the skill curve allows easy graphical inspection of the optimal cutoff or decision rule for a diagnostic test. The skill curve and test also determine whether diagnoses based on this cutoff improve upon a naive forecast (of always present or of always absent). The skill measure makes it easy to directly compare the predictive utility of two different classifiers in an analogy to the area under the curve statistic related to ROC analysis. Finally, this article shows that the skill-based cutoff inferred from the plot is equivalent to the cutoff indicated by optimizing the posterior odds in accordance with Bayesian decision theory. A method for constructing a confidence interval for this optimal point is presented and briefly discussed.

6.
Dobson A  Henderson R 《Biometrics》2003,59(4):741-751
We present a variety of informal graphical procedures for diagnostic assessment of joint models for longitudinal and dropout time data. A random effects approach for Gaussian responses and proportional hazards dropout time is assumed. We consider preliminary assessment of dropout classification categories based on residuals following a standard longitudinal data analysis with no allowance for informative dropout. Residual properties conditional upon dropout information are discussed and case influence is considered. The proposed methods do not require computationally intensive methods over and above those used to fit the proposed model. A longitudinal trial into the treatment of schizophrenia is used to illustrate the suggestions.

7.
Giles Hooker 《Biometrics》2009,65(3):928-936
Summary. This article investigates the problem of model diagnostics for systems described by nonlinear ordinary differential equations (ODEs). I propose modeling lack of fit as a time-varying correction to the right-hand side of a proposed differential equation. This correction can be described as being a set of additive forcing functions, estimated from data. Representing lack of fit in this manner allows us to graphically investigate model inadequacies and to suggest model improvements. I derive lack-of-fit tests based on estimated forcing functions. Model building in partially observed systems of ODEs is particularly difficult and I consider the problem of identification of forcing functions in these systems. The methods are illustrated with examples from computational neuroscience.

8.
Guan Y 《Biometrics》2006,62(1):126-134
A convenient assumption while modeling a marked point process is that the observations (i.e., marks) and the locations (i.e., points) are independent. We propose new graphical and formal testing approaches to test for this assumption. The proposed graphical procedures are easy to obtain and can be used to diagnose the nature and range of dependence between marks and points. The formal testing procedures require only minimal conditions on marks and thus can be applied to a variety of settings. We illustrate these procedures through a simulation study and an application to some real data.

9.
Lin DY  Wei LJ  Ying Z 《Biometrics》2002,58(1):1-12
Residuals have long been used for graphical and numerical examinations of the adequacy of regression models. Conventional residual analysis based on the plots of raw residuals or their smoothed curves is highly subjective, whereas most numerical goodness-of-fit tests provide little information about the nature of model misspecification. In this paper, we develop objective and informative model-checking techniques by taking the cumulative sums of residuals over certain coordinates (e.g., covariates or fitted values) or by considering some related aggregates of residuals, such as moving sums and moving averages. For a variety of statistical models and data structures, including generalized linear models with independent or dependent observations, the distributions of these stochastic processes under the assumed model can be approximated by the distributions of certain zero-mean Gaussian processes whose realizations can be easily generated by computer simulation. Each observed process can then be compared, both graphically and numerically, with a number of realizations from the Gaussian process. Such comparisons enable one to assess objectively whether a trend seen in a residual plot reflects model misspecification or natural variation. The proposed techniques are particularly useful in checking the functional form of a covariate and the link function. Illustrations with several medical studies are provided.
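The cumulative-residual idea in this abstract can be illustrated with a minimal sketch. It is restricted to ordinary least squares with a single covariate, which is an assumption made here for brevity and is much narrower than the models the abstract covers. The function name `cumulative_residual_check` and the multiplier scheme (residuals times independent standard normals, re-projected off the design matrix) are illustrative choices, not code from the paper.

```python
import numpy as np

def cumulative_residual_check(x, y, n_sim=1000, seed=0):
    """Cumulative sums of OLS residuals over x, with simulated null paths.

    Hypothetical helper: a simplified linear-model version of the
    cumulative-residual check, not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)

    # Fit y = b0 + b1 * x by least squares via the hat matrix H.
    X = np.column_stack([np.ones(n), x])
    H = X @ np.linalg.solve(X.T @ X, X.T)
    resid = y - H @ y

    # Observed process: residuals accumulated in increasing order of x.
    order = np.argsort(x)
    observed = np.cumsum(resid[order]) / np.sqrt(n)

    # Null realizations: perturb residuals with N(0, 1) multipliers,
    # remove the part explained by the design, then accumulate again.
    perturbed = resid * rng.standard_normal((n_sim, n))
    perturbed = perturbed - perturbed @ H  # H is symmetric
    simulated = np.cumsum(perturbed[:, order], axis=1) / np.sqrt(n)

    # Supremum-type p-value: how often a null path is as extreme.
    sup_obs = np.abs(observed).max()
    sup_sim = np.abs(simulated).max(axis=1)
    p_value = float((sup_sim >= sup_obs).mean())
    return observed, simulated, p_value
```

Plotting `observed` against a handful of rows of `simulated` gives the graphical comparison the abstract describes; a small `p_value` suggests the linear form of the covariate is misspecified.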

10.
The confirmatory analysis of pre-specified multiple hypotheses has become common in pivotal clinical trials. In the recent past multiple test procedures have been developed that reflect the relative importance of different study objectives, such as fixed sequence, fallback, and gatekeeping procedures. In addition, graphical approaches have been proposed that facilitate the visualization and communication of Bonferroni-based closed test procedures for common multiple test problems, such as comparing several treatments with a control, assessing the benefit of a new drug for more than one endpoint, combined non-inferiority and superiority testing, or testing a treatment at different dose levels in an overall and a subpopulation. In this paper, we focus on extended graphical approaches by dissociating the underlying weighting strategy from the employed test procedure. This allows one to first derive suitable weighting strategies that reflect the given study objectives and subsequently apply appropriate test procedures, such as weighted Bonferroni tests, weighted parametric tests accounting for the correlation between the test statistics, or weighted Simes tests. We illustrate the extended graphical approaches with several examples. In addition, we describe briefly the gMCP package in R, which implements some of the methods described in this paper.
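As a rough illustration of the Bonferroni-based graphical procedure mentioned in this abstract, the sketch below applies the usual sequentially rejective rule: reject a hypothesis when its p-value is at most its weight times alpha, pass its weight along the outgoing edges, and rewire the graph. The function name `graphical_bonferroni` is hypothetical, the update formulas are the standard ones from the graphical-approach literature as understood here, and nothing in the block is taken from the gMCP package.

```python
import numpy as np

def graphical_bonferroni(p, w, G, alpha=0.05):
    """Sequentially rejective weighted Bonferroni test on a weighted graph.

    p: p-values; w: initial weights (sum <= 1); G: transition matrix with
    zero diagonal and row sums <= 1. Returns sorted indices of rejections.
    Hypothetical helper for illustration only.
    """
    p = np.asarray(p, dtype=float)
    w = np.asarray(w, dtype=float).copy()
    G = np.asarray(G, dtype=float).copy()
    active = set(range(len(p)))
    rejected = []

    while True:
        candidates = [j for j in active if p[j] <= w[j] * alpha]
        if not candidates:
            break
        j = min(candidates, key=lambda k: p[k])
        active.remove(j)
        rejected.append(j)

        # Pass the weight of the rejected hypothesis along its edges.
        for l in active:
            w[l] += w[j] * G[j, l]
        w[j] = 0.0

        # Rewire the graph so that paths through j are preserved.
        G_new = np.zeros_like(G)
        for l in active:
            for k in active:
                if l == k:
                    continue
                denom = 1.0 - G[l, j] * G[j, l]
                if denom > 0:
                    G_new[l, k] = (G[l, k] + G[l, j] * G[j, k]) / denom
        G = G_new

    return sorted(rejected)

# Holm procedure for two hypotheses, expressed as a graph.
holm_rejections = graphical_bonferroni(
    p=[0.01, 0.04], w=[0.5, 0.5], G=[[0, 1], [1, 0]]
)
```

With equal weights and full weight transfer between two hypotheses, the graph reproduces the Holm procedure; a fixed-sequence test corresponds to `w=[1, 0]` with a single edge from the first hypothesis to the second.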

11.
The semiparametric Cox proportional hazards model is routinely adopted to model time-to-event data. Proportionality is a strong assumption, especially when follow-up time, or study duration, is long. Zeng and Lin (J. R. Stat. Soc., Ser. B, 69:1–30, 2007) proposed a useful generalisation through a family of transformation models which allow hazard ratios to vary over time. In this paper we explore a variety of tests for the need for transformation, arguing that the Cox model is so ubiquitous that it should be considered as the default model, to be discarded only if there is good evidence against the model assumptions. Since fitting an alternative transformation model is more complicated than fitting the Cox model, especially as procedures are not yet incorporated in standard software, we focus mainly on tests which require a Cox fit only. A score test is derived, and we also consider performance of omnibus goodness-of-fit tests based on Schoenfeld residuals. These tests can be extended to compare different transformation models. In addition we explore the consequences of fitting a misspecified Cox model to data generated under a true transformation model. Data on survival of 1043 leukaemia patients are used for illustration.

12.
The mysterious ‘fairy circles’ are vegetation‐free discs that cover vast areas along the pro‐Namib Desert. Despite 30 yr of research their origin remains unknown. Here we adopt a novel approach that focuses on analysis of the spatial patterns of fairy circles obtained from representative 25‐ha aerial images of north‐west Namibia. We use spatial point pattern analysis to quantify different features of their spatial structures and then critically inspect existing hypotheses with respect to their ability to generate the observed circle patterns. Our working hypothesis is that fairy circles are a self‐organized vegetation pattern. Finally, we test if an existing partial‐differential‐equation model, that was designed to describe vegetation pattern formation, is able to reproduce the characteristic features of the observed fairy circle patterns. The model is based on key‐processes in arid areas such as plant competition for water and local resource‐biomass feedbacks. The fairy circles showed at all three study areas the same regular spatial distribution patterns, characterized by Voronoi cells with mostly six corners, negative correlations in their size up to a distance of 13 m, and remarkable homogeneity over large spatial scales. These results cast doubts on abiotic gas‐leakage along geological lines or social insects as causal agents of their origin. However, our mathematical model was able to generate spatial patterns that agreed quantitatively in all of these features with the observed patterns. This supports the hypothesis that fairy circles are self‐organized vegetation patterns that emerge from positive biomass‐water feedbacks involving water transport by extended root systems and soil‐water diffusion. 
Future research should search for mechanisms that explain how the different hypotheses can generate the patterns observed here and test the ability of self‐organization to match the birth‐ and death dynamics of fairy circles and their regional patterns in the density and size with respect to environmental gradients.

13.
MOTIVATION: Microarray technology allows the monitoring of expression levels for thousands of genes simultaneously. In time-course experiments in which gene expression is monitored over time, we are interested in testing gene expression profiles for different experimental groups. However, no sophisticated analytic methods have yet been proposed to handle time-course experiment data. RESULTS: We propose a statistical test procedure based on the ANOVA model to identify genes that have different gene expression profiles among experimental groups in time-course experiments. In particular, we propose a permutation test which does not require the normality assumption. For this test, we use residuals from the ANOVA model with time effects only. Using this test, we detect genes that have different gene expression profiles among experimental groups. The proposed model is illustrated using cDNA microarrays of 3840 genes obtained in an experiment to search for changes in gene expression profiles during neuronal differentiation of cortical stem cells.
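A permutation test on residuals from a time-effects-only model might look like the sketch below for a single gene. The abstract does not give the test statistic or the permutation scheme, so both are assumptions here: the statistic is the sum of squared group-by-time cell means of the residuals, and residuals are shuffled across groups within each time point. The function name `profile_permutation_test` is hypothetical.

```python
import numpy as np

def profile_permutation_test(y, n_perm=999, seed=0):
    """Permutation test for group differences in a time-course profile.

    y has shape (groups, times, replicates) for one gene. The statistic
    and the within-time shuffling are assumptions made for illustration.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n_groups, n_times, n_reps = y.shape

    # Residuals from the model with time effects only: subtract the
    # mean over all groups and replicates at each time point.
    resid = y - y.mean(axis=(0, 2), keepdims=True)

    def statistic(res):
        # Large when group-by-time cell means deviate from zero.
        return float((res.mean(axis=2) ** 2).sum())

    observed = statistic(resid)
    exceed = 0
    for _ in range(n_perm):
        shuffled = np.empty_like(resid)
        for t in range(n_times):
            pooled = rng.permutation(resid[:, t, :].ravel())
            shuffled[:, t, :] = pooled.reshape(n_groups, n_reps)
        if statistic(shuffled) >= observed:
            exceed += 1

    p_value = (1 + exceed) / (1 + n_perm)
    return observed, p_value
```

In a microarray setting the function would be applied gene by gene, followed by a multiple-testing adjustment that is outside the scope of this sketch.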

14.
A new simple graphical method is described for the determination of inhibition type and kinetic parameters of an enzyme reaction without any replot. The method consists of plotting experimental data as v/(vo − v) versus the reciprocal of the inhibitor concentration at different substrate concentrations, where v and vo represent the velocity in the presence and in the absence of the inhibitor respectively with a given concentration of the substrate. Partial inhibition gives straight lines that converge on the abscissa at a point away from the origin, whereas complete inhibition gives lines that go through the origin. The inhibition constants of enzymes and the reaction rate constant of the enzyme-substrate-inhibitor complex can be calculated from the abscissa and ordinate intercepts of the plot. The relationship between the slope of the plot and the substrate concentration shows characteristic features depending on the inhibition type: for partial competitive inhibition, the straight line converging on the abscissa at −Ks, the dissociation constant of the enzyme-substrate complex; for non-competitive inhibition, a constant slope independent of the substrate concentration; for uncompetitive inhibition, a hyperbola decreasing with the increase in the substrate concentration; for mixed-type inhibition, a hyperbola increasing with the increase in the substrate concentration. The properties of the replot are useful in confirmation of the inhibition mechanism.
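The claim that complete inhibition gives lines through the origin can be checked numerically for one case. The sketch below assumes the textbook rate law for complete competitive inhibition, v = Vmax·S / (Ks·(1 + I/Ki) + S), which is not stated in the abstract; under that assumption v/(vo − v) equals Ki·(1 + S/Ks)·(1/I). The parameter values and the name `velocity_competitive` are illustrative only.

```python
import numpy as np

def velocity_competitive(S, I, Vmax=1.0, Ks=2.0, Ki=0.5):
    """Rate under complete competitive inhibition (assumed rate law)."""
    return Vmax * S / (Ks * (1.0 + I / Ki) + S)

S = 3.0                                   # substrate concentration
I = np.array([0.25, 0.5, 1.0, 2.0])       # inhibitor concentrations

v0 = velocity_competitive(S, 0.0)         # velocity without inhibitor
v = velocity_competitive(S, I)            # velocity with inhibitor
ratio = v / (v0 - v)                      # ordinate of the plot

# Fit a straight line to ratio versus 1/I.
slope, intercept = np.polyfit(1.0 / I, ratio, 1)

# Under the assumed rate law the slope is Ki * (1 + S / Ks).
expected_slope = 0.5 * (1.0 + S / 2.0)
```

The fitted intercept is zero up to rounding, so the line passes through the origin, and the slope grows linearly with S, which is how Ks and Ki could be read off from lines at several substrate concentrations.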

15.
Simple regression of genetic similarities between pairs of populations on their corresponding geographic distances is frequently used to detect the presence of isolation by distance (IBD). However, these pairwise values are obviously not independent and there is no parametric procedure for estimating and testing for the IBD intercepts and slopes based on standard regression theory. Nonparametric tests, such as the Mantel test, and resampling techniques, such as bootstrapping, have been exploited with limited success. Here, I describe a likelihood-based analysis to allow for simultaneously detecting patterns of correlated residuals and estimating and testing for the presence of IBD. It is shown, through the analysis of two molecular datasets in pine species, that different covariance structures of the residuals exist. Moreover, the likelihood ratio tests under these covariance structures are less sensitive to the presence of IBD than the Mantel test and the simple regression analysis but more sensitive than the bootstrap and jackknife samples over independent populations or population pairs. Because the likelihood analysis directly models and accounts for nonindependence of residuals, it should legitimately detect the presence of IBD, thereby allowing for accurate inferences about evolutionary and demographic processes influencing the extent and patterns of IBD.
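For readers unfamiliar with the Mantel test that this abstract uses as a comparison, a minimal sketch follows. It is the generic permutation version (correlate the off-diagonal entries of two distance matrices, then permute the rows and columns of one matrix together), written with genetic distance rather than similarity so that isolation by distance shows up as a positive correlation. It is not the likelihood-based analysis proposed in the abstract, and the name `mantel_test` is hypothetical.

```python
import numpy as np

def mantel_test(A, B, n_perm=999, seed=0):
    """One-sided Mantel permutation test for two symmetric distance matrices.

    Returns the observed correlation of the upper-triangular entries and
    a permutation p-value for a positive association.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    n = A.shape[0]
    upper = np.triu_indices(n, k=1)
    observed = np.corrcoef(A[upper], B[upper])[0, 1]

    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # Relabel populations: permute rows and columns together.
        B_perm = B[np.ix_(perm, perm)]
        if np.corrcoef(A[upper], B_perm[upper])[0, 1] >= observed:
            count += 1

    p_value = (1 + count) / (1 + n_perm)
    return observed, p_value
```

Permuting whole populations, rather than individual pairwise values, is what lets the test respect the nonindependence of pairs that the abstract points out.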

16.
《Freshwater Biology》1999,41(4):747-757
1.   The prediction of macroinvertebrate community composition in flowing waters from environmental data has enabled pollution assessments that take account of natural variability. Polluted sites are identified by discrepancies between the observed fauna and the fauna expected at an unpolluted site on the same type of river.
2.   The usual method of prediction involves a sequence of (a) classification of unpolluted reference sites by cluster analysis of macroinvertebrate community data (b) multiple discriminant analysis to relate site clusters to environmental variables, and (c) use of site clusters, discriminant functions and environmental data to estimate the probability of collection of each macroinvertebrate taxon at sites that are to be assessed (test sites).
3.   This paper describes an alternative method that does not require classification and predicts abundance rather than probability of occurrence. The main steps are (a) multiple regression of biological differences between pairs of reference sites on differences in physical variables (b) use of the multiple regression relationship to predict the biological similarity of a test site to each reference site, and (c) estimation of the expected fauna at the test site as a weighted mean of the faunas at the reference sites. The predicted similarities of the test site to each reference site are used to derive the weightings.
4.   The method is illustrated using macroinvertebrate and environmental data collected in the upper Murrumbidgee River catchment as part of Australia's Monitoring River Health Initiative. In comparison with a classification-based analysis of these data, macroinvertebrate indices generated by the new method showed a greater distinction between human-disturbed and undisturbed test sites, and a similar or higher degree of correlation with physical and chemical indicators of human disturbance.
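Steps (a) to (c) of the alternative method in point 3 can be sketched as follows. Several details are assumptions, because the abstract does not specify them: biological difference is measured with Bray-Curtis dissimilarity, physical differences are absolute differences per variable, and weights are the predicted similarities rescaled to sum to one. The name `predict_expected_fauna` is hypothetical.

```python
import numpy as np
from itertools import combinations

def predict_expected_fauna(ref_env, ref_fauna, test_env):
    """Expected abundances at a test site as a weighted mean of reference sites.

    ref_env: (sites, variables); ref_fauna: (sites, taxa), positive totals;
    test_env: (variables,). Dissimilarity measure and weighting are assumed.
    """
    ref_env = np.asarray(ref_env, dtype=float)
    ref_fauna = np.asarray(ref_fauna, dtype=float)
    test_env = np.asarray(test_env, dtype=float)
    n_sites = len(ref_env)
    pairs = list(combinations(range(n_sites), 2))

    # (a) Regress biological dissimilarity between reference-site pairs
    #     on their differences in physical variables.
    X = np.array([np.abs(ref_env[i] - ref_env[j]) for i, j in pairs])
    d = np.array([
        np.abs(ref_fauna[i] - ref_fauna[j]).sum()
        / (ref_fauna[i] + ref_fauna[j]).sum()
        for i, j in pairs
    ])
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, d, rcond=None)

    # (b) Predict the similarity of the test site to each reference site.
    Xt = np.column_stack([np.ones(n_sites), np.abs(ref_env - test_env)])
    similarity = 1.0 - np.clip(Xt @ coef, 0.0, 1.0)

    # (c) Weighted mean of the reference faunas.
    total = similarity.sum()
    if total > 0:
        weights = similarity / total
    else:
        weights = np.full(n_sites, 1.0 / n_sites)
    return weights @ ref_fauna, weights
```

Comparing the returned expected abundances with the fauna actually observed at the test site would then give the kind of observed-versus-expected index the abstract refers to.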

17.
Pan Z  Lin DY 《Biometrics》2005,61(4):1000-1009
We develop graphical and numerical methods for checking the adequacy of generalized linear mixed models (GLMMs). These methods are based on the cumulative sums of residuals over covariates or predicted values of the response variable. Under the assumed model, the asymptotic distributions of these stochastic processes can be approximated by certain zero-mean Gaussian processes, whose realizations can be generated through Monte Carlo simulation. Each observed process can then be compared, both visually and analytically, to a number of realizations simulated from the null distribution. These comparisons enable one to assess objectively whether the observed residual patterns reflect model misspecification or random variation. The proposed methods are particularly useful for checking the functional form of a covariate or the link function. Extensive simulation studies show that the proposed goodness-of-fit tests have proper sizes and are sensitive to model misspecification. Applications to two medical studies lead to improved models.

18.
We develop and test machine learning methods for the prediction of coarse 3D protein structures, where a protein is represented by a set of rigid rods associated with its secondary structure elements (alpha-helices and beta-strands). First, we employ cascades of recursive neural networks derived from graphical models to predict the relative placements of segments. These are represented as discretized distance and angle maps, and the discretization levels are statistically inferred from a large and curated dataset. Coarse 3D folds of proteins are then assembled starting from topological information predicted in the first stage. Reconstruction is carried out by minimizing a cost function taking the form of a purely geometrical potential. We show that the proposed architecture outperforms simpler alternatives and can accurately predict binary and multiclass coarse maps. The reconstruction procedure proves to be fast and often leads to topologically correct coarse structures that could be exploited as a starting point for various protein modeling strategies. The fully integrated rod-shaped protein builder (predictor of contact maps + reconstruction algorithm) can be accessed at http://distill.ucd.ie/.

19.
Zhang P  Song PX  Qu A  Greene T 《Biometrics》2008,64(1):29-38
Summary. This article presents a new class of nonnormal linear mixed models that provide an efficient estimation of subject-specific disease progression in the analysis of longitudinal data from the Modification of Diet in Renal Disease (MDRD) trial. This new analysis addresses the previously reported finding that the distribution of the random effect characterizing disease progression is negatively skewed. We assume a log-gamma distribution for the random effects and provide the maximum likelihood inference for the proposed nonnormal linear mixed model. We derive the predictive distribution of patient-specific disease progression rates, which demonstrates rather different individual progression profiles from those obtained from the normal linear mixed model analysis. To validate the adequacy of the log-gamma assumption versus the usual normality assumption for the random effects, we propose a lack-of-fit test that clearly indicates a better fit for the log-gamma modeling in the analysis of the MDRD data. The full maximum likelihood inference is also advantageous in dealing with the missing at random (MAR) type of dropouts encountered in the MDRD data.

20.
AIM: Local‐regional (LR) species diversity plots were conceived to assess the contribution of regional and local processes in shaping the patterns of biological diversity, but have been used also to explore the scaling of diversity in terms of its alpha, beta, and gamma components. Here we explore the idea that patterns in the geographical ranges of species over a continent can determine the shape of small region to large region (SRLR) plots, which are equivalent to LR plots when comparing the diversity of sites at two regional scales. LOCATION: To test that idea, we analysed the diversity patterns at two regional scales for the mammals of North America, defined as the mainland from Alaska and Canada to Panama. METHOD: We developed a theoretical model relating average range size of species over a large‐scale region with its average regional point species diversity (RPD). Then, we generated a null model of expected SRLR plots based on theoretical predictions. Species diversities at two scales were modelled using linear and saturation functions for Type I and Type II SRLR relationships, respectively. We applied the models to the case of North American mammals by examining the regional diversity and the RPD for 21 large‐scale quadrats (with area equal to 160,000 km²), arranged along a latitudinal gradient. RESULTS: Our model showed that continental and large‐scale regional patterns of distribution of species can generate both types of SRLR relationship, and that these patterns can be reflected in LR plots without invoking any kind of local processes. We found that North American nonvolant mammals follow a Type I SRLR relationship, whereas bats follow a Type II pattern. This difference was linked to patterns in which species of the two mammalian groups distribute in geographical space. CONCLUSION: Traditional LR plots and the new SRLR plots are useful tools in exploring the scaling of species diversity and in showing the relationship between distribution and diversity. Their usefulness in comparing the relative role of local and regional processes is, however, very limited.
