Similar literature (20 results)
1.
The well‐known χ² goodness‐of‐fit test for a multinomial distribution is generally biased when observations are subject to misclassification. In this paper, based on a double sampling scheme, the family of φ‐divergence test statistics is introduced for testing goodness of fit under misclassification of the data. The case of binomial data is discussed and an illustrative example is also given.

2.
Extensions of linear models are very commonly used in the analysis of biological data. Whereas goodness‐of‐fit measures such as the coefficient of determination (R²) or the adjusted R² are well established for linear models, it is not obvious how such measures should be defined for generalized linear and mixed models. There are by now several proposals, but no consensus has yet emerged as to the best unified approach in these settings. In particular, it is an open question how best to account for heteroscedasticity and for covariance among observations present in residual error or induced by random effects. This paper proposes a new approach that addresses this issue and is universally applicable for arbitrary variance‐covariance structures, including spatial models and repeated measures. It is exemplified using three biological examples.
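For reference, the two linear-model measures named in this abstract can be sketched as follows. This is a minimal illustration of the textbook definitions only, not the new approach the paper proposes; the function names and the sample numbers are hypothetical.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjusted R² for n observations and p predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical observed and fitted values, for illustration only.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(y, y_hat)
adj = adjusted_r_squared(r2, n=len(y), p=1)
```

Both functions assume independent, homoscedastic errors, which is exactly the assumption the abstract says breaks down for generalized linear and mixed models.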

3.
Triatomines (Hemiptera: Reduviidae) are vectors of Trypanosoma cruzi Chagas, the etiological agent of Chagas disease. They display pre‐adult development delay – that is, a development time much longer than average – which has usually been considered a maladaptive trait. However, this hypothesis has not been tested. We carried out an experiment under controlled laboratory conditions to (1) test whether a development delay exists in the fifth nymphal stage of Rhodnius prolixus Stål (Hemiptera: Reduviidae, Rhodniini), and (2) measure any fitness cost related to such delay by estimating the relationship between individual development time and other life‐history traits. We analyzed the development time with various continuous statistical distributions (normal, log‐normal, Weibull, gamma, Pareto, Burr, and log‐logistic). Using goodness‐of‐fit tests, the best fit was obtained with asymmetrical distributions, with the Burr distribution showing the best fit to the data. We concluded that a development delay exists in stage five of R. prolixus without fitness cost. The combination of our results and previous work suggests that such a delay could be viewed as an adaptive response to environmental stochasticity and/or density‐dependence rather than as a maladaptive trait. We propose further investigations to provide a conclusive test of adaptive delay in triatomines.

4.
The coefficient of determination (R²) is a common measure of goodness of fit for linear models. Various proposals have been made for extension of this measure to generalized linear and mixed models. When the model has random effects or correlated residual effects, the observed responses are correlated. This paper proposes a new coefficient of determination for this setting that accounts for any such correlation. A key advantage of the proposed method is that it only requires the fit of the model under consideration, with no need to also fit a null model. Also, the approach entails a bias correction in the estimator assessing the variance explained by fixed effects. Three examples are used to illustrate the new measure. A simulation shows that the proposed estimator of the new coefficient of determination has only minimal bias.

5.
S-sample smooth goodness‐of‐fit tests may be constructed using components from one-sample goodness‐of‐fit testing. Each sample could be assessed for consistency with a target distribution using these components, although that is not our objective here. Contrasts in the components may be used to assess consistency of the samples with each other. If all the samples are consistent, we could then conveniently perform a one-sample goodness‐of‐fit test for the target distribution. If the samples are not consistent, an LSD-type analysis can be performed on the one-sample components to identify where the differences between samples occur. This approach gives a detailed and informative scrutiny of the data.

6.
Understanding causes of nest loss is critical for the management of endangered bird populations. Available methods for estimating nest loss probabilities to competing sources do not allow for random effects and covariation among sources, and there are few data simulation methods or goodness‐of‐fit (GOF) tests for such models. We developed a Bayesian multinomial extension of the widely used logistic exposure (LE) nest survival model which can incorporate multiple random effects and fixed‐effect covariates for each nest loss category. We investigated the performance of this model and the accompanying GOF test by analysing simulated nest fate datasets with and without age‐biased discovery probability, and by comparing the estimates with those of traditional fixed‐effects estimators. We then exemplify the use of the multinomial LE model and GOF test by analysing Piping Plover Charadrius melodus nest fate data (n = 443) to explore the effects of wire cages (exclosures) constructed around nests, which are used to protect nests from predation but can lead to increased nest abandonment rates. Mean parameter estimates of the random‐effects multinomial LE model were all within 1 sd of the true values used to simulate the datasets. Age‐biased discovery probability did not result in biased parameter estimates. Traditional fixed‐effects models provided estimates with a high bias of up to 43% with a mean of 71% smaller standard deviations. The GOF test identified models that were a poor fit to the simulated data. For the Piping Plover dataset, the fixed‐effects model was less well‐supported than the random‐effects model and underestimated the risk of exclosure use by 16%. The random‐effects model estimated a range of 1–6% probability of abandonment for nests not protected by exclosures across sites and 5–41% probability of abandonment for nests with exclosures, suggesting that the magnitude of exclosure‐related abandonment is site‐specific. 
Our results demonstrate that unmodelled heterogeneity can result in biased estimates potentially leading to incorrect management recommendations. The Bayesian multinomial LE model offers a flexible method of incorporating random effects into an analysis of nest failure and is robust to age‐biased nest discovery probability. This model can be generalized to other staggered‐entry, time‐to‐hazard situations.

7.
Sexual size dimorphism (SSD) is widespread in animals, especially in lizards (Reptilia: Squamata), and is driven by fecundity selection, male–male competition, or other adaptive hypotheses. However, these selective pressures may vary through different life history periods; thus, it is essential to assess the relationship between growth and SSD. In this study, we tracked SSD dynamics between a “fading‐tail color skink” (blue tail skink whose tail is only blue during its juvenile stage: Plestiodon elegans) and a “nonfade color” tail skink (retains a blue tail throughout life: Plestiodon quadrilineatus) under a controlled experimental environment. We fitted growth curves of morphological traits (body mass, SVL, and TL) using three growth models (Logistic, Gompertz, and von Bertalanffy). We found that both skinks have male‐biased SSD as adults. Body mass has a higher goodness of fit (as represented by very high R² values) using the von Bertalanffy model than the other two models. In contrast, SVL and TL for both skinks had higher goodness of fit when using the Gompertz model. The two lizards displayed divergent life history tactics: P. elegans grows faster, matures earlier (at 65 weeks), and presents an allometric growth rate, whereas P. quadrilineatus grows slower, matures later (at 106 weeks), and presents an isometric growth rate. Our findings imply that species‐ and sex‐specific trade‐offs in the allocation of energy to growth and reproduction may cause the growth patterns to diverge, ultimately resulting in the dissimilar patterns of SSD.
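The three growth models named in this abstract can be sketched as plain functions. The abstract does not give the parameterizations the authors used, so the forms below are common textbook versions and should be read as assumptions; `asymptote`, `rate` and `t0` are hypothetical parameter names.

```python
import math


def logistic(t, asymptote, rate, t0):
    """Logistic growth: symmetric S-curve with inflection at t0."""
    return asymptote / (1.0 + math.exp(-rate * (t - t0)))


def gompertz(t, asymptote, rate, t0):
    """Gompertz growth: asymmetric S-curve with inflection at t0."""
    return asymptote * math.exp(-math.exp(-rate * (t - t0)))


def von_bertalanffy(t, asymptote, rate, t0):
    """von Bertalanffy growth (length form): size is zero at t0."""
    return asymptote * (1.0 - math.exp(-rate * (t - t0)))


# Hypothetical parameter values, chosen only to show the call pattern.
sizes = [gompertz(week, asymptote=90.0, rate=0.05, t0=20.0) for week in (0, 20, 65)]
```

In practice each curve would be fitted to the measured trait by least squares and the fits compared by R², as the abstract describes.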

8.
The risk of acute aortic dissection (AAD) exhibits chronobiological variations with peak onset in the morning and in winter. However, it is not known whether the time of day or season of the year of the AAD affects clinical outcomes. We studied 1,032 patients enrolled in the International Registry of Acute Aortic Dissection from January 1997 to December 2001. For circadian and seasonal analysis, the time and date of symptom onset were available for 741 and 1,007 patients, respectively, and were grouped into four 6 h periods (morning, afternoon, evening, and night) and four seasons (winter, spring, summer, and autumn). The χ² test for goodness of fit was used to evaluate non‐uniformity of the time of day and time of year for critical in‐hospital clinical events, including death. While the highest incidence of AAD occurred in the morning and in winter, clinical events (including mortality) were similar across the four periods of the day (χ²=1.9, p=0.60) and across the four seasons (χ²=1.2, p=0.75).
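The uniformity test described here can be sketched with the standard library alone. The sketch assumes four equally likely categories, hence 3 degrees of freedom, for which the χ² survival function has a closed form; the function names and the example counts are hypothetical and are not data from the study.

```python
import math


def chi_square_uniform(observed):
    """Pearson χ² statistic against equal expected counts in every category."""
    expected = sum(observed) / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)


def chi2_sf_df3(x):
    """Upper-tail probability of a χ² variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Made-up event counts for four periods, for illustration only.
statistic = chi_square_uniform([30, 20, 25, 25])
p_value = chi2_sf_df3(statistic)

# p-values implied by the statistics reported in the abstract, assuming 3 df.
p_circadian = chi2_sf_df3(1.9)
p_seasonal = chi2_sf_df3(1.2)
```

With 3 degrees of freedom the reported statistics give p-values close to the published 0.60 and 0.75; small differences are expected because the statistics are rounded to one decimal place.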

9.
10.
The intraclass version of the kappa coefficient has been commonly applied as a measure of agreement for two ratings per subject with binary outcome in reliability studies. We present an efficient statistic for testing the strength of kappa agreement using likelihood scores, and derive asymptotic power and sample size formulae. Exact evaluation shows that the score test is generally conservative and more powerful than a method based on a chi‐square goodness‐of‐fit statistic (Donner and Eliasziw, 1992, Statistics in Medicine 11, 1511–1519). In particular, when the research question is one‐directional, the one‐sided score test is substantially more powerful and the reduction in sample size is appreciable.

11.
Occupancy modeling is important for exploring species distribution patterns and for conservation monitoring. Within this framework, explicit attention is given to species detection probabilities estimated from replicate surveys to sample units. A central assumption is that replicate surveys are independent Bernoulli trials, but this assumption becomes untenable when ecologists serially deploy remote cameras and acoustic recording devices over days and weeks to survey rare and elusive animals. Proposed solutions involve modifying the detection‐level component of the model (e.g., first‐order Markov covariate). Evaluating whether a model sufficiently accounts for correlation is imperative, but clear guidance for practitioners is lacking. Currently, an omnibus goodness‐of‐fit test using a chi‐square discrepancy measure on unique detection histories is available for occupancy models (MacKenzie and Bailey, Journal of Agricultural, Biological, and Environmental Statistics, 9, 2004, 300; hereafter, MacKenzie–Bailey test). We propose a join count summary measure adapted from spatial statistics to directly assess correlation after fitting a model. We motivate our work with a dataset of multinight bat call recordings from a pilot study for the North American Bat Monitoring Program. We found in simulations that our join count test was more reliable than the MacKenzie–Bailey test for detecting inadequacy of a model that assumed independence, particularly when serial correlation was low to moderate. A model that included a Markov‐structured detection‐level covariate produced unbiased occupancy estimates except in the presence of strong serial correlation and a revisit design consisting only of temporal replicates. When applied to two common bat species, our approach illustrates that sophisticated models do not guarantee adequate fit to real data, underscoring the importance of model assessment. 
Our join count test provides a widely applicable goodness‐of‐fit test and specifically evaluates occupancy model lack of fit related to correlation among detections within a sample unit. Our diagnostic tool is available for practitioners that serially deploy survey equipment as a way to achieve cost savings.

12.
Aim: Scheiner (Journal of Biogeography, 2009, 36, 2005–2008) criticized several issues regarding the typology and analysis of species richness curves that were brought forward by Dengler (Journal of Biogeography, 2009, 36, 728–744). In order to test these two sets of views in greater detail, we used a simulation model of ecological communities to demonstrate the effects of different sampling schemes on the shapes of species richness curves and their extrapolation capability. Methods: We simulated five random communities with 100 species on a 64 × 64 grid using random fields. Then we sampled species–area relationships (SARs, contiguous plots) as well as species–sampling relationships (SSRs, non‐contiguous plots) from these communities, both for the full extent and the central quarter of the grid. Finally, we fitted different functions (power, quadratic power, logarithmic, Michaelis–Menten, Lomolino) to the obtained data and assessed their goodness‐of‐fit (Akaike weights) and their extrapolation capability (deviation of the predicted value from the true value). Results: We found that power functions gave the best fit for SARs, while for SSRs saturation functions performed better. Curves constructed from data of 32² grid cells gave reasonable extrapolations for 64² grid cells for SARs, irrespective of whether samples were gathered from the full extent or the centre only. By contrast, SSRs worked well for extrapolation only in the latter case. Main conclusions: SARs and SSRs have fundamentally different curve shapes. Both sampling strategies can be used for extrapolation of species richness to a target area, but only SARs allow for extrapolation to a larger area than that sampled. These results confirm a fundamental difference between SARs and area‐based SSRs and thus support their typological differentiation.
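Akaike weights, used above to compare the candidate functions, can be sketched as follows. The AIC values in the example are hypothetical placeholders, not results from the study, and `akaike_weights` is an illustrative name.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC values into weights that sum to 1 across the candidate models."""
    best = min(aic_values)
    # Relative likelihood of each model given its AIC difference from the best model.
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [value / total for value in relative]


# Hypothetical AIC values for two candidate curve functions.
weights = akaike_weights([100.0, 102.0])
```

The weight of a model can be read as its relative support within the candidate set, so it only ranks the fitted functions against each other and says nothing about how well they extrapolate.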

13.
Seasonal variation in the occurrence of cardiovascular and cerebrovascular events, including pulmonary embolism (PE), has been reported; however, recent large‐scale, population‐based studies conducted in the United States did not confirm such seasonality. The aim of this large‐scale population study was to determine whether a temporal pattern in the occurrence of PE exists. The analysis considered all consecutive cases of PE in the database of all hospital admissions of the Emilia Romagna region in Italy at the Center for Health Statistics between January 1998 and December 2005. PE cases were first grouped according to season of occurrence, and the data were analyzed by the χ² test for goodness of fit. Then, inferential chronobiologic (cosinor and partial Fourier) analysis was applied to monthly data, and the best‐fitting curve for the annual variation was derived. The total sample consisted of 19,245 patients (8,143 male, mean age 71.6±14.1 yrs; 11,102 female, mean age 76.1±13.7 yrs). Of these, 2,484 were <65 yrs, 5,443 were between 65 and 74, and 11,318 were ≥75 yrs. There were 4,486 (23.3%) fatal‐case outcomes. PE occurred least frequently in spring (n=4,442 or 23.1%) and most frequently in winter (n=5,236 or 27.2%; goodness of fit χ²=75.75, p<0.001). Similar results were obtained for subgroups formed by gender, age, fatal/non‐fatal outcome, presence/absence of major underlying co‐morbid conditions, and specific risk factors. Inferential chronobiological analysis identified a significant annual pattern in PE, with the peak between November and December for the total sample of cases (p<0.001), males (p<0.001), females (p=0.002), fatal and non‐fatal cases (p<0.001 for both), and subgroups formed by age (<65 yrs, p=0.012; 65–74 yrs, p<0.001; ≥75 yrs, p=0.012). 
This pattern was independent of the presence/absence of hypertension (p=0.003 and p<0.001, respectively), pulmonary disease (p<0.001 and p<0.001, respectively), stroke (p<0.001 and p=0.004, respectively), neoplasms (p=0.005 and p=0.001, respectively), heart failure (p=0.022 and p<0.001, respectively), and deep vein thrombosis (p=0.002 and p<0.001, respectively). However, only a non‐statistically significant trend was found for subgroups formed by cases of diabetes mellitus, infections, renal failure, and trauma.

14.
Measures of human body mass confound 1) well‐established population differences in body form and 2) exposure to obesogenic environments, posing challenges for using body mass index (BMI) in cross‐population studies of body form, energy reserves, and obesity‐linked disease risk. We propose a method for decomposing population BMI by estimating basal BMI (bBMI) among young adults living in extremely poor, rural households where excess body mass accumulation is uncommon. We test this method with nationally representative, cross‐sectional Demographic and Health Surveys (DHS) collected from 69,916 rural women (20–24 years) in 47 low‐income countries. Predicting BMI by household wealth, we estimate country‐level bBMI as the average BMI of young women (20–24 years) living in rural households with total assets <400 USD per capita. Above 400 USD per capita, BMI increases with both wealth and age. Below this point, BMI hits a baseline floor showing little effect of either age or wealth. Between‐country variation in bBMI (range of 4.3 kg m⁻²) is reliable across decades and age groups (R² = 0.83–0.88). Country‐level estimates of bBMI show no relation to diabetes prevalence or country‐level GDP (R² < 0.05), supporting its independence from excess body mass. Residual BMI (average BMI minus bBMI) shows better fit with both country‐level GDP (R² = 0.55 vs. 0.40) and diabetes prevalence (R² = 0.23 vs. 0.17) than does conventional BMI. This method produces reliable estimates of bBMI across a wide range of nationally representative samples, providing a new approach to investigating population variation in body mass. Am J Phys Anthropol 153:542–550, 2014. © 2013 Wiley Periodicals, Inc.

15.
Imperfect detection can bias estimates of site occupancy in ecological surveys but can be corrected by estimating detection probability. Time‐to‐first‐detection (TTD) occupancy models have been proposed as a cost–effective survey method that allows detection probability to be estimated from single site visits. Nevertheless, few studies have validated the performance of occupancy‐detection models by creating a situation where occupancy is known, and model outputs can be compared with the truth. We tested the performance of TTD occupancy models in the face of detection heterogeneity using an experiment based on standard survey methods to monitor koala Phascolarctos cinereus populations in Australia. Known numbers of koala faecal pellets were placed under trees, and observers, uninformed as to which trees had pellets under them, carried out a TTD survey. We fitted five TTD occupancy models to the survey data, each making different assumptions about detectability, to evaluate how well each estimated the true occupancy status. Relative to the truth, all five models produced strongly biased estimates, overestimating detection probability and underestimating the number of occupied trees. Despite this, goodness‐of‐fit tests indicated that some models fitted the data well, with no evidence of model misfit. Hence, TTD occupancy models that appear to perform well with respect to the available data may be performing poorly. The reason for poor model performance was unaccounted‐for heterogeneity in detection probability, which is known to bias occupancy‐detection models. This poses a problem because unaccounted‐for heterogeneity could not be detected using goodness‐of‐fit tests and was only revealed because we knew the experimentally determined outcome. A challenge for occupancy‐detection models is to find ways to identify and mitigate the impacts of unobserved heterogeneity, which could unknowingly bias many models.

16.
Next‐generation sequencing (NGS) experiments are often performed in biomedical research nowadays, leading to methodological challenges related to the high‐dimensional and complex nature of the recorded data. In this work we review some of the issues that arise in disorder detection from NGS experiments, that is, when the focus is the detection of deletion and duplication disorders for homozygosity and heterozygosity in DNA sequencing. A statistical model to cope with guanine/cytosine bias and phasing and prephasing phenomena at base level is proposed, and a goodness‐of‐fit procedure for disorder detection is derived. The method combines the proper evaluation of local p‐values (one for each DNA base) with suitable corrections for multiple comparisons and the discrete nature of the p‐values. A global test for the detection of disorders in the whole DNA region is proposed too. The performance of the introduced procedures is investigated through simulations. A real data illustration is provided.

17.
Many parasitic and endophagous insect species are capable of discriminating among the quality of their hosts. However, there is no appropriate way to quantify their discrimination performance. In this study, we quantified how oviposition of the cowpea seed beetle, Callosobruchus maculatus (Fabricius) (Coleoptera: Bruchidae), was affected by the relative contributions of both egg number and host size discrimination. The effect of egg density and resource heterogeneity on these discrimination performances was also explored. Egg‐distribution predictions were made by combining time‐dependent available resource fitness (egg discrimination) and host weight factors (size discrimination). The χ² test was then used for goodness‐of‐fit testing. The effects of both egg and size discrimination on oviposition in environments with different levels of resource heterogeneity were compared. It was found that host size, rather than the number of eggs on the host, plays a larger role in the egg‐laying decision for most individual seed beetles, especially when egg density is high. Host size discrimination behavior was reinforced when the beetles experienced increasing resource heterogeneity, but the performance might reach a plateau. This is the first quantitative evaluation of the effect of host discrimination on egg‐laying decisions of seed beetles.

18.
Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one‐step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulation studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster‐deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts.

19.
We consider the general case of probability prediction models having two or more outcomes and propose an adjusted χ² statistic which can be used to assess the goodness of fit of these models. We present a simulation study to show that our proposed statistic has an approximate χ² distribution under the null hypothesis. Two applications are provided to illustrate the use of the new statistic. The first application examines the fit of a logistic regression model using both the proposed statistic and the popular Hosmer-Lemeshow statistic, and we compare and contrast these two methods. The second application evaluates the goodness of fit of a polychotomous regression model.
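The Hosmer-Lemeshow statistic that this abstract uses as a comparison can be sketched in its usual grouped form. This is not the adjusted χ² statistic the authors propose, whose definition the abstract does not give; the function name, the equal-size grouping rule and the toy data are assumptions made for illustration.

```python
def hosmer_lemeshow(probabilities, outcomes, groups=10):
    """Grouped Hosmer-Lemeshow statistic for a binary-outcome prediction model.

    Cases are sorted by predicted probability and split into `groups` bins of
    near-equal size; the statistic is conventionally referred to a χ²
    distribution with groups - 2 degrees of freedom.
    """
    pairs = sorted(zip(probabilities, outcomes))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # more groups than cases: skip empty bins
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        # Binomial-style variance term; undefined if a bin's mean probability is 0 or 1.
        statistic += (observed - expected) ** 2 / (expected * (1.0 - expected / size))
    return statistic


# Toy predictions and outcomes, for illustration only.
hl = hosmer_lemeshow([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1], groups=2)
```

Because the result depends on how cases are binned, different software can report different values for the same model, which is one reason alternatives to this statistic are of interest.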

20.
The linkage maps of male and female tiger shrimp (P. monodon) were constructed based on 256 microsatellite and 85 amplified fragment length polymorphism (AFLP) markers. Microsatellite markers obtained from clone sequences of partial genomic libraries, tandem repeat sequences from databases and previous publications and fosmid end sequences were employed. Of 670 microsatellite and 158 AFLP markers tested for polymorphism, 341 (256 microsatellite and 85 AFLP markers) were used for genotyping with three F1 mapping panels, each comprising two parents and more than 100 progeny. Chi‐square (χ²) goodness‐of‐fit tests revealed that only 19 microsatellite and 28 AFLP markers showed a highly significant segregation distortion (P < 0.005). Linkage analysis with a LOD score of 4.5 revealed 43 and 46 linkage groups in male and female linkage maps respectively. The male map consisted of 176 microsatellite and 49 AFLP markers spaced every ~11.2 cM, with an observed genome length of 2033.4 cM. The female map consisted of 171 microsatellite and 36 AFLP markers spaced every ~13.8 cM, with an observed genome length of 2182 cM. Both maps shared 136 microsatellite markers, and the alignment between them indicated 38 homologous pairs of linkage groups including the linkage group representing the sex chromosome. The karyotype of P. monodon is also presented. The tentative assignment of the 44 pairs of P. monodon haploid chromosomes showed the composition of forty metacentric, one submetacentric and three acrocentric chromosomes. Our maps provided a solid foundation for gene and QTL mapping in the tiger shrimp.
