Similar literature
1.
Specification of an appropriate model is critical to valid statistical inference. Given the “true model” for the data is unknown, the goal of model selection is to select a plausible approximating model that balances model bias and sampling variance. Model selection based on information criteria such as AIC or its variant AICc, or criteria like CAIC, has proven useful in a variety of contexts including the analysis of open-population capture-recapture data. These criteria have not been intensively evaluated for closed-population capture-recapture models, which are integer parameter models used to estimate population size (N), and there is concern that they will not perform well. To address this concern, we evaluated AIC, AICc, and CAIC model selection for closed-population capture-recapture models by empirically assessing the quality of inference for the population size parameter N. We found that AIC-, AICc-, and CAIC-selected models had smaller relative mean squared errors than randomly selected models, but that confidence interval coverage on N was poor unless unconditional variance estimates (which incorporate model uncertainty) were used to compute confidence intervals. Overall, AIC and AICc outperformed CAIC, and are preferred to CAIC for selection among the closed-population capture-recapture models we investigated. A model averaging approach to estimation, using AIC, AICc, or CAIC to estimate weights, was also investigated and proved superior to estimation using AIC-, AICc-, or CAIC-selected models. Our results suggested that, for model averaging, AIC or AICc should be favored over CAIC for estimating weights.
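As an illustration of the criteria compared in this abstract, the sketch below computes AIC, AICc, and CAIC from a maximized log-likelihood, turns criterion values into model weights, and forms a model-averaged estimate with an unconditional standard error. It assumes the usual textbook definitions (AIC = -2 log L + 2K, AICc = AIC + 2K(K+1)/(n-K-1), CAIC = -2 log L + K(ln n + 1)) and the form of the unconditional standard error commonly attributed to Burnham and Anderson. The function names, the choice of n, and all example numbers are hypothetical and are not taken from the study.

```python
import math


def aic(log_lik, k):
    """AIC from a maximized log-likelihood and k estimated parameters."""
    return -2.0 * log_lik + 2.0 * k


def aicc(log_lik, k, n):
    """Small-sample corrected AIC; requires n > k + 1."""
    return aic(log_lik, k) + 2.0 * k * (k + 1) / (n - k - 1)


def caic(log_lik, k, n):
    """Consistent AIC, which penalizes each parameter by ln(n) + 1."""
    return -2.0 * log_lik + k * (math.log(n) + 1.0)


def criterion_weights(scores):
    """Convert criterion values (smaller is better) to weights summing to 1."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]


def model_average(estimates, variances, weights):
    """Weighted estimate and unconditional standard error.

    The standard error adds the between-model spread of the estimates
    to each model's conditional variance before weighting.
    """
    mean = sum(w * e for w, e in zip(weights, estimates))
    se = sum(
        w * math.sqrt(v + (e - mean) ** 2)
        for w, e, v in zip(weights, estimates, variances)
    )
    return mean, se


# Hypothetical fits of three candidate models (not data from the study).
log_liks = [-120.4, -118.9, -118.7]   # maximized log-likelihoods
n_params = [2, 3, 5]                  # K for each model
n_hats = [95.0, 104.0, 110.0]         # population size estimates
var_n_hats = [36.0, 49.0, 81.0]       # conditional variances of the estimates
n = 60                                # assumed effective sample size

scores = [aicc(ll, k, n) for ll, k in zip(log_liks, n_params)]
weights = criterion_weights(scores)
n_bar, se_uncond = model_average(n_hats, var_n_hats, weights)
print("AICc weights:", [round(w, 3) for w in weights])
print("model-averaged N:", round(n_bar, 1), "unconditional SE:", round(se_uncond, 1))
```

Swapping `aicc` for `aic` or `caic` in the list comprehension changes only the weights, which is the comparison the abstract describes.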

2.
The relationship between the silent substitution rate (Ks) and the GC content along the genome is a focal point of the debate about the origin of the isochore structure in vertebrates. Recent estimation of the silent substitution rate showed a positive correlation between Ks and GC content, in contradiction with the predictions of both the regional mutation bias model and the selection or biased gene conversion model. The aim of this paper is to help resolve this contradiction between theoretical studies and data. We analyzed the relationship between Ks and GC content under (1) uniform mutation bias, (2) a regional mutation bias, and (3) mutation bias and selection. We report that an increase in Ks with GC content is expected under mutation bias because of either nonequilibrium of the isochore structure or an increasing mutation rate from AT toward GC nucleotides in GC-richer isochores. We show by simulations that CpG deamination tends to increase the mutation rate with GC content in a regional mutation bias model. We also demonstrate that the relationship between Ks and GC under the selectionist or biased gene conversion model is positive under weak selection if the mutation selection equilibrium GC frequency is less than 0.5. Received: 28 March 2001 / Accepted: 16 May 2001

3.
Ishiguro, Sakamoto, and Kitagawa (1997, Annals of the Institute of Statistical Mathematics 49, 411-434) proposed EIC as an extension of the Akaike criterion (AIC); the idea leading to EIC is to correct the bias of the log-likelihood, considered as an estimator of the Kullback-Leibler information, using the bootstrap. We develop this criterion for use in multivariate semiparametric situations, and argue that it can be used for choosing among parametric and semiparametric estimators. A simulation study based on a regression model shows that EIC is better than its competitors, although likelihood cross-validation performs nearly as well except at small sample sizes. Its use is illustrated by estimating the mean evolution of viral RNA levels in a group of infants infected by HIV.

4.
Diseased animals may exhibit behavioral shifts that increase or decrease their probability of being randomly sampled. In harvest-based sampling approaches, animal movements, changes in habitat utilization, changes in breeding behaviors during harvest periods, or differential susceptibility to harvest via behaviors like hiding or decreased sensitivity to stimuli may result in a non-random sample that biases prevalence estimates. We present a method that can be used to determine whether bias exists in prevalence estimates from harvest samples. Using data from harvested mule deer (Odocoileus hemionus) sampled in north-central Colorado (USA) during fall hunting seasons 1996-98 and Akaike's information criterion (AIC) model selection, we detected within-year trends indicating potential bias in harvest-based prevalence estimates for chronic wasting disease (CWD). The proportion of CWD-positive deer harvested increased slightly through time within a year. We speculate that differential susceptibility to harvest or breeding season movements may explain the positive trend in proportion of CWD-positive deer harvested during fall hunting seasons. Detection of bias may provide information about temporal patterns of a disease, suggest biological hypotheses that could further understanding of a disease, or provide wildlife managers with information about when diseased animals are more or less likely to be harvested. Although AIC model selection can be useful for detecting bias in data, it has limited utility in determining underlying causes of bias. Where bias is detected in data using such model selection methods, design-based methods (i.e., experimental manipulation) may be necessary to assign causality.

5.
Using the ratio of nonsynonymous to synonymous nucleotide substitution rates (Ka/Ks) is a common approach for detecting positive selection. However, calculation of this ratio over a whole gene combines amino acid sites that may be under positive selection with those that are highly conserved. We introduce a new covarion‐based method to sample only the sites potentially under selective pressure. Using ancestral sequence reconstruction over a phylogenetic tree coupled with calculation of Ka/Ks ratios, positive selection is better detected by this simple covarion‐based approach than it is using a whole gene analysis or a windowing analysis. This is demonstrated on a synthetic dataset and is tested on primate leptin, which indicates a previously undetected round of positive selection in the branch leading to Gorilla gorilla.

6.
A spatial open-population capture-recapture model is described that extends both the non-spatial open-population model of Schwarz and Arnason and the spatially explicit closed-population model of Borchers and Efford. The superpopulation of animals available for detection at some time during a study is conceived as a two-dimensional Poisson point process. Individual probabilities of birth and death follow the conventional open-population model. Movement between sampling times may be modeled with a dispersal kernel using a recursive Markovian algorithm. Observations arise from distance-dependent sampling at an array of detectors. As in the closed-population spatial model, the observed data likelihood relies on integration over the unknown animal locations; maximization of this likelihood yields estimates of the birth, death, movement, and detection parameters. The models were fitted to data from a live-trapping study of brushtail possums (Trichosurus vulpecula) in New Zealand. Simulations confirmed that spatial modeling can greatly reduce the bias of capture-recapture survival estimates and that there is a degree of robustness to misspecification of the dispersal kernel. An R package is available that includes various extensions.

7.
The theory of competitive ligand–receptor binding has been used to analyze the effect of afucosylation‐based antibody heterogeneity on Fc‐FcγRIIIa ligand–receptor binding activity. In vitro activity is found to represent a linear combination of the component antibody activities, weighted by the relative concentrations of the different afucosylated antibody forms. An analysis of ELISA binding activity data has allowed for the dissection of the activity contributions of the different afucosylated antibodies, revealing that the heterogeneous afucosylated antibody exhibits greater activity, on a per mole basis, when compared to the homogeneous afucosylated antibody. The ratio of the afucosylated antibody equilibrium dissociation constants is computed to be KAF/KA ≈ 0.6–0.9, where KAF and KA denote the equilibrium dissociation constants of the heterogeneous and the homogeneous afucosylated antibodies, respectively. Our analysis also reveals that, in general, activity scales quadratically with the afucosylated glycan content of a sample. Linear activity–afucosylated glycan fraction correlations reported in the literature are shown to represent specific cases of this general scaling and result from oversimplifying the underlying antibody concentration distributions. The implications of our findings for drug development are also discussed. © 2015 American Institute of Chemical Engineers Biotechnol. Prog., 31:775–782, 2015

8.
Aim Spatial autocorrelation is a frequent phenomenon in ecological data and can affect estimates of model coefficients and inference from statistical models. Here, we test the performance of three different simultaneous autoregressive (SAR) model types (spatial error = SARerr, lagged = SARlag and mixed = SARmix) and common ordinary least squares (OLS) regression when accounting for spatial autocorrelation in species distribution data using four artificial data sets with known (but different) spatial autocorrelation structures. Methods We evaluate the performance of SAR models by examining spatial patterns in model residuals (with correlograms and residual maps), by comparing model parameter estimates with true values, and by assessing their type I error control with calibration curves. We calculate a total of 3240 SAR models and illustrate how the best models [in terms of minimum residual spatial autocorrelation (minRSA), maximum model fit (R²), or Akaike information criterion (AIC)] can be identified using model selection procedures. Results Our study shows that the performance of SAR models depends on model specification (i.e. model type, neighbourhood distance, coding styles of spatial weights matrices) and on the kind of spatial autocorrelation present. SAR model parameter estimates might not be more precise than those from OLS regressions in all cases. SARerr models were the most reliable SAR models and performed well in all cases (independent of the kind of spatial autocorrelation induced and whether models were selected by minRSA, R² or AIC), whereas OLS, SARlag and SARmix models showed weak type I error control and/or unpredictable biases in parameter estimates. Main conclusions SARerr models are recommended for use when dealing with spatially autocorrelated species distribution data. SARlag and SARmix might not always give better estimates of model coefficients than OLS, and can thus generate bias. Other spatial modelling techniques should be assessed comprehensively to test their predictive performance and accuracy for biogeographical and macroecological research.

9.
10.
To characterize the coding-sequence divergence of closely related genomes, we compared DNA sequence divergence between sequences from a Brassica rapa ssp. pekinensis EST library isolated from flower buds and genomic sequences from Arabidopsis thaliana. The specific objectives were (i) to determine the distribution of and relationship between Ka and Ks, (ii) to identify genes with the lowest and highest Ka:Ks values, and (iii) to evaluate how codon usage has diverged between two closely related species. We found that the distribution of Ka:Ks was unimodal, and that substitution rates were more variable at nonsynonymous than synonymous sites, and detected no evidence that Ka and Ks were positively correlated. Several genes had Ka:Ks values equal to or near zero, as expected for genes that have evolved under strong selective constraint. In contrast, there were no genes with Ka:Ks > 1 and thus we found no strong evidence that any of the 218 sequences we analyzed have evolved in response to positive selection. We detected a stronger codon bias but a lower frequency of GC at synonymous sites in A. thaliana than in B. rapa. Moreover, there has been a shift in the profile of most commonly used synonymous codons since these two species diverged from one another. This shift in codon usage may have been caused by stronger selection acting on codon usage or by a shift in the direction of mutational bias in the B. rapa phylogenetic lineage.

11.
Publication bias and p-hacking are two well-known phenomena that strongly affect the scientific literature and cause severe problems in meta-analyses. Due to these phenomena, the assumptions of meta-analyses are seriously violated and the results of the studies cannot be trusted. While publication bias is very often captured well by the weighting function selection model, p-hacking is much harder to model and no definitive solution has been found yet. In this paper, we advocate the selection model approach to model publication bias and propose a mixture model for p-hacking. We derive some properties for these models, and we compare them formally and through simulations. Finally, two real data examples are used to show how the models work in practice.

12.
Abstract: As use of Akaike's Information Criterion (AIC) for model selection has become increasingly common, so has a mistake involving interpretation of models that are within 2 AIC units (ΔAIC ≤ 2) of the top-supported model. Such models are <2 ΔAIC units because the penalty for one additional parameter is +2 AIC units, but model deviance is not reduced by an amount sufficient to overcome the 2-unit penalty and, hence, the additional parameter provides no net reduction in AIC. Simply put, the uninformative parameter does not explain enough variation to justify its inclusion in the model and it should not be interpreted as having any ecological effect. Models with uninformative parameters are frequently presented as being competitive in the Journal of Wildlife Management, including 72% of all AIC-based papers in 2008, and authors and readers need to be more aware of this problem and take appropriate steps to eliminate misinterpretation. I reviewed 5 potential solutions to this problem: 1) report all models but ignore or dismiss those with uninformative parameters, 2) use model averaging to ameliorate the effect of uninformative parameters, 3) use 95% confidence intervals to identify uninformative parameters, 4) perform all-possible subsets regression and use weight-of-evidence approaches to discriminate useful from uninformative parameters, or 5) adopt a methodological approach that allows models containing uninformative parameters to be culled from reported model sets. The first approach is preferable for small sets of a priori models, whereas the last 2 approaches should be used for large model sets or exploratory modeling.
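The arithmetic behind the 2-unit penalty can be sketched as follows, assuming AIC is written as deviance + 2K and that the larger model nests the top model with exactly one extra parameter. The function names and the deviance values are hypothetical; the screening rule is only one simple reading of the abstract, not the author's procedure.

```python
def delta_aic_one_extra_parameter(deviance_top, deviance_larger):
    """AIC(larger) - AIC(top) when the larger model adds exactly one parameter.

    With AIC = deviance + 2K, the difference is the 2-unit penalty minus
    whatever reduction in deviance the extra parameter buys.
    """
    return 2.0 - (deviance_top - deviance_larger)


def looks_uninformative(deviance_top, deviance_larger):
    """Flag a one-parameter extension that sits within 2 AIC units of the
    top model only because its deviance barely changed."""
    delta = delta_aic_one_extra_parameter(deviance_top, deviance_larger)
    return 0.0 <= delta <= 2.0


# Hypothetical deviances: the extra parameter lowers deviance by only 0.5,
# so the larger model is 1.5 AIC units worse, yet still "within 2 units".
print(delta_aic_one_extra_parameter(250.0, 249.5))
print(looks_uninformative(250.0, 249.5))

# Here the extra parameter lowers deviance by 6.0, so AIC improves by 4.0.
print(delta_aic_one_extra_parameter(250.0, 244.0))
print(looks_uninformative(250.0, 244.0))
```

A model flagged this way has essentially the same fit as the top model, which is why its small ΔAIC should not be read as support for the added variable.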

13.
Aim To demonstrate that the concept of carrying capacity for species richness (SK) is highly relevant to the conservation of biodiversity, and to estimate the spatial pattern of SK for native landbirds as a basis for conservation planning. Location North America. Methods We evaluated the leading hypotheses on biophysical factors affecting species richness for Breeding Bird Survey routes from areas with little influence of human activities. We then derived a best model based on information theory, and used this model to extrapolate SK across North America based on the biophysical predictor variables. The predictor variables included the latest and probably most accurate satellite and simulation‐model derived products. Results The best model of SK included mean annual and inter‐annual variation in gross primary productivity and potential evapotranspiration. This model explained 70% of the variation in landbird species richness. Geographically, predicted SK was lowest at higher latitudes and in the arid west, intermediate in the Rocky Mountains and highest in the eastern USA and the Great Lakes region of the USA and Canada. Main conclusions Areas that are high in SK but low in human density are high priorities for protection, and areas high in SK and high in human density are high priorities for restoration. Human density was positively related to SK, indicating that humans select environments similar to those with high bird species richness. Federal lands were disproportionately located in areas of low predicted SK.

14.
Male-biased sex ratios in adult odonate populations have been the subject of vigorous discussion among students of this order of insects. The debate has centered on whether the observed male bias in many populations is real, perhaps due to unequal survival rates, or whether it is an artifact caused by differences in recapture probabilities. A mark–recapture study to assess the relative contribution of survivorship and recapture rates to the male-biased sex ratio was performed in a Cuban population of the damselfly Hypolestes trinitatis. Maximum likelihood theory and the Akaike information criterion were used for parameter estimation and model selection, respectively. Females in the sample were outnumbered two to one by males. Estimated recapture and survival rates were 0.188 (females) and 0.638 (males), and 0.933 (females) and 0.944 (males), respectively. Recapture rates only partially explained the bias since the population sex ratio estimated after correcting for differences in this parameter was male biased (1.5). The observed higher survival probabilities in males could have generated the male-biased population sex ratio. Therefore, we concluded that the observed male-biased population sex ratio in H. trinitatis is real.

15.
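The cross-validation statistic CV used in model selection is improved to obtain RCV, a new criterion for judging whether a model is adequate. RCV contains the information in CV and also accounts for goodness of fit, the number of parameters to be estimated in the model, the sample size, and so on. Compared with AIC, BIC, and CV, it offers better stability and discriminating power.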

16.
Wei Zhang, Simon J. Bonner. Biometrics 2020, 76(3):1028-1033
Schofield et al. (2018, Biometrics 74, 626–635) presented simple and efficient algorithms for fitting continuous-time capture-recapture models based on Poisson processes. They also demonstrated by real examples that the standard method of discretizing continuous-time capture-recapture data and then fitting traditional discrete-time models may lead to information loss in population size estimation. In this article, we aim to clarify that key to the approach of Schofield et al. (2018) is the Poisson model assumed for the number of captures of each individual throughout the study, rather than the fact of data being collected in continuous time. We further show that the method of data discretization works equally well as the method of Schofield et al. (2018), provided that a Poisson model is applied instead of the traditional Bernoulli model to the number of captures for each individual on each sampling occasion.

17.
The bootstrap method has become a widely used tool applied in diverse areas where results based on asymptotic theory are scarce. It can be applied, for example, for assessing the variance of a statistic, a quantile of interest or for significance testing by resampling from the null hypothesis. Recently, some approaches have been proposed in the biometrical field where hypothesis testing or model selection is performed on a bootstrap sample as if it were the original sample. P‐values computed from bootstrap samples have been used, for example, in the statistics and bioinformatics literature for ranking genes with respect to their differential expression, for estimating the variability of p‐values and for model stability investigations. Procedures which make use of bootstrapped information criteria are often applied in model stability investigations and model averaging approaches as well as when estimating the error of model selection procedures which involve tuning parameters. From the literature, however, there is evidence that p‐values and model selection criteria evaluated on bootstrap data sets do not represent what would be obtained on the original data or new data drawn from the overall population. We explain the reasons for this and, through the use of a real data set and simulations, we assess the practical impact on procedures relevant to biometrical applications in cases where it has not yet been studied. Moreover, we investigate the behavior of subsampling (i.e., drawing from a data set without replacement) as a potential alternative solution to the bootstrap for these procedures.
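The two resampling schemes contrasted in this abstract can be sketched with the standard library alone. The example draws one bootstrap sample (with replacement, same size as the data) and one subsample (without replacement, smaller than the data) and reports the share of distinct observations in each. The data, the subsample size of 632, and the function names are hypothetical choices for illustration; the sketch does not reproduce the authors' analysis or their explanation of why bootstrapped criteria behave differently.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) observations with replacement."""
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]


def subsample(data, m, rng):
    """Draw m observations without replacement (m <= len(data))."""
    return rng.sample(data, m)


def unique_fraction(sample):
    """Share of distinct values among the drawn observations."""
    return len(set(sample)) / len(sample)


rng = random.Random(0)          # fixed seed so the run is repeatable
data = list(range(1000))        # stand-in for 1000 distinct observations

boot = bootstrap_sample(data, rng)
sub = subsample(data, 632, rng)

# A bootstrap sample repeats some observations and omits others; for large n
# the expected share of distinct observations is about 1 - 1/e (roughly 0.632).
print("bootstrap unique fraction:", unique_fraction(boot))
# A subsample never repeats an observation, so every drawn value is distinct.
print("subsample unique fraction:", unique_fraction(sub))
```

Any p-value or information criterion would then be computed on `boot` or `sub` in place of `data`, which is the step whose validity the abstract examines.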

18.
It is often assumed that the von Bertalanffy growth model (VBGM) is appropriate to describe growth in length-at-age of elasmobranchs. However, a review of the literature suggests that a two-phase growth model could better describe growth in elasmobranchs. We compare the two-phase growth model (TPGM) with the VBGM for 18 data sets of elasmobranch species, by fitting the models to the 36 age-length-at-age data pairs available. The Akaike Information Criterion (AIC) and the difference in AIC between both models revealed that in 23 cases the probability that the TPGM was true was ≥50%. The VBGM tends to estimate larger L values than the two-phase growth model, while the k parameter tends to be underestimated. The growth rate in length-at-age appears to decrease near the age at first maturity in several species of elasmobranchs. The importance of the TPGM lies in that it may better describe this aspect of the life history of many elasmobranchs. In this context, we conclude that the TPGM should be used along with other growth models in order to precisely estimate elasmobranch life history parameters.
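For orientation, the sketch below evaluates a von Bertalanffy curve and converts an AIC difference between two models into the probability that one of them is the better of the pair. It assumes the standard three-parameter form L(t) = L_inf * (1 - exp(-k * (t - t0))), which the abstract does not spell out, and the usual two-model Akaike weight. The names `l_inf`, `k`, and `t0`, the function names, and all numbers are hypothetical. The two-phase model is not implemented because its form is not given here.

```python
import math


def vbgm_length(age, l_inf, k, t0):
    """Length at a given age under the assumed three-parameter VBGM."""
    return l_inf * (1.0 - math.exp(-k * (age - t0)))


def prob_first_model_better(aic_first, aic_second):
    """Akaike weight of the first model when only two models are compared.

    Equals 0.5 when the AIC values tie and exceeds 0.5 whenever the first
    model has the lower AIC.
    """
    delta = aic_first - aic_second
    return 1.0 / (1.0 + math.exp(0.5 * delta))


# Hypothetical parameters: asymptotic length 200 cm, k = 0.2 per year.
for age in (0, 5, 10, 20):
    print(age, round(vbgm_length(age, l_inf=200.0, k=0.2, t0=-1.0), 1))

# Hypothetical AIC values for a two-phase fit and a VBGM fit to one data set.
print(round(prob_first_model_better(aic_first=98.0, aic_second=100.0), 3))
```

Under this reading, a reported probability of at least 50% for the TPGM corresponds to its AIC being no larger than that of the VBGM on the same data.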

19.
Populations of Drosophila melanogaster were maintained for 36 generations in r- and K-selected environments in order to test the life-history predictions of theories on density-dependent selection. In the r-selection environment, populations were reduced to low densities by density-independent adult mortality, whereas populations in the K-selection environment were maintained at their carrying capacity. Some of the experimental results support the predictions of r- and K-selection theory; relative to the r-selected populations, the K-selected populations evolved an increased larval-to-adult viability, larger body size, and longer development time at high larval densities. Mueller and Ayala (1981) found that K-selected populations also have a higher rate of population growth at high densities. Other predictions of the theory are contradicted by the lack of differences between the r and K populations in adult longevity and fecundity and a slower rate of development for r-selected individuals at low densities. The differences between selected populations in larval survivorship, larval-to-adult development time, and adult body size are strongly dependent on larval density, and there is a significant interaction between populations and larval density for each trait. This reveals an inadequacy of the theory on r- and K-selection, which does not take into account such interactions between genotypes and environments. We describe mechanisms that may explain the evolution of preadult life-history traits in our experiment and discuss the need for changes in theories of density-dependent selection.

20.
Sample size calculations based on two‐sample comparisons of slopes in repeated measurements have been reported by many investigators. In contrast, the literature has paid relatively little attention to the design and analysis of K‐sample trials in repeated measurements studies where K is 3 or greater. Jung and Ahn (2003) derived a closed-form sample size formula for two‐sample comparisons of slopes by taking into account the impact of missing data. We extend their method to compare K‐sample slopes in repeated measurement studies using the generalized estimating equation (GEE) approach based on an independent working correlation structure. Because the sample size formula is based on asymptotic theory, we investigate its performance. The proposed sample size formula is illustrated using a clinical trial example. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)
