Similar Articles
 20 similar articles retrieved (search time: 31 ms)
1.
2.
Transformation and computer intensive methods such as the jackknife and bootstrap are applied to construct accurate confidence intervals for the ratio of specific occurrence/exposure rates, which are used to compare the mortality (or survival) experience of individuals in two study populations. Monte Carlo simulations are employed to compare the performances of the proposed confidence intervals when sample sizes are small or moderate.
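The occurrence/exposure rate here is the number of events divided by the total person-time at risk. The sketch below illustrates a percentile bootstrap interval for the ratio of two such rates by resampling individuals; it is a minimal illustration of the resampling idea, not the transformation-based or jackknife procedures of the paper, and the data and replicate count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2024)

def occ_exp_rate(events, exposure):
    """Occurrence/exposure rate: total events divided by total person-time."""
    return events.sum() / exposure.sum()

def bootstrap_ratio_ci(ev1, ex1, ev2, ex2, n_boot=2000, alpha=0.05):
    """Percentile bootstrap CI for the ratio of two occurrence/exposure rates,
    resampling individuals independently within each study population."""
    ratios = np.empty(n_boot)
    for b in range(n_boot):
        i1 = rng.integers(0, len(ev1), len(ev1))
        i2 = rng.integers(0, len(ev2), len(ev2))
        ratios[b] = occ_exp_rate(ev1[i1], ex1[i1]) / occ_exp_rate(ev2[i2], ex2[i2])
    return np.quantile(ratios, [alpha / 2, 1 - alpha / 2])

# Hypothetical data: event indicator and follow-up time per individual.
ev1, ex1 = rng.binomial(1, 0.35, 60), rng.exponential(2.0, 60)
ev2, ex2 = rng.binomial(1, 0.25, 60), rng.exponential(2.0, 60)
print(bootstrap_ratio_ci(ev1, ex1, ev2, ex2))
```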

3.
Previously, we showed that in randomised experiments, correction for measurement error in a baseline variable induces bias in the estimated treatment effect, and conversely that ignoring measurement error avoids bias. In observational studies, non-zero baseline covariate differences between treatment groups may be anticipated. Using a graphical approach, we argue intuitively that if baseline differences are large, failing to correct for measurement error leads to a biased estimate of the treatment effect. In contrast, correction eliminates bias if the true and observed baseline differences are equal. If this equality is not satisfied, the corrected estimator is also biased, but typically less so than the uncorrected estimator. Contrasting these findings, we conclude that there must be a threshold for the true baseline difference, above which correction is worthwhile. We derive expressions for the bias of the corrected and uncorrected estimators, as functions of the correlation of the baseline variable with the study outcome, its reliability, the true baseline difference, and the sample sizes. Comparison of these expressions defines a theoretical decision threshold about whether to correct for measurement error. The results show that correction is usually preferred in large studies, and also in small studies with moderate baseline differences. If the group sample sizes are very disparate, correction is less advantageous. If the equivalent balanced sample size is less than about 25 per group, one should correct for measurement error if the true baseline difference is expected to exceed 0.2-0.3 standard deviation units. These results are illustrated with data from a cohort study of atherosclerosis.
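As a rough sketch of the kind of correction at issue, the snippet below contrasts an ANCOVA-style treatment effect based on the observed (error-prone) baseline with one whose baseline slope is divided by an assumed reliability, a classical regression-calibration-type adjustment. The reliability value, data, and model are hypothetical, and this is not the authors' exact estimator.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
group = (rng.random(n) < 0.5).astype(float)            # treatment indicator
true_baseline = rng.normal(0, 1, n) + 0.4 * group      # non-zero true baseline difference
reliability = 0.7                                      # assumed reliability of the baseline measure
err_sd = np.sqrt((1 - reliability) / reliability)      # approximate, since Var(true) is ~1
observed_baseline = true_baseline + rng.normal(0, err_sd, n)
outcome = 0.8 * true_baseline + 0.5 * group + rng.normal(0, 1, n)

# Uncorrected ANCOVA-style estimate: regress outcome on observed baseline and group.
X = np.column_stack([np.ones(n), observed_baseline, group])
beta = np.linalg.lstsq(X, outcome, rcond=None)[0]
effect_uncorrected = beta[2]

# Reliability correction: inflate the baseline slope by 1/reliability and
# recompute the adjusted group difference from the group means.
slope_corrected = beta[1] / reliability
diff_y = outcome[group == 1].mean() - outcome[group == 0].mean()
diff_x = observed_baseline[group == 1].mean() - observed_baseline[group == 0].mean()
effect_corrected = diff_y - slope_corrected * diff_x

print(effect_uncorrected, effect_corrected)
```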

4.
The total deviation index of Lin and Lin et al. is an intuitive approach for the assessment of agreement between two methods of measurement. It assumes that the differences of the paired measurements are a random sample from a normal distribution and works essentially by constructing a probability content tolerance interval for this distribution. We generalize this approach to the case when differences may not have identical distributions -- a common scenario in applications. In particular, we use the regression approach to model the mean and the variance of differences as functions of observed values of the average of the paired measurements, and describe two methods based on asymptotic theory of maximum likelihood estimators for constructing a simultaneous probability content tolerance band. The first method uses bootstrap to approximate the critical point and the second method is an analytical approximation. Simulation shows that the first method works well for sample sizes as small as 30 and the second method is preferable for large sample sizes. We also extend the methodology for the case when the mean function is modeled using penalized splines via a mixed model representation. Two real data applications are presented.
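To make the basic quantity concrete: under the constant-distribution assumption, the total deviation index at probability p is the value k solving P(|D| <= k) = p for normal differences D. A minimal numerical sketch is given below (not the regression-based, heteroscedastic extension described in the abstract); the mean, SD, and p are assumed values.

```python
import numpy as np
from scipy import stats, optimize

def tdi(mu, sigma, p=0.90):
    """Total deviation index: smallest k with P(|D| <= k) = p for D ~ N(mu, sigma^2)."""
    def coverage_gap(k):
        return stats.norm.cdf((k - mu) / sigma) - stats.norm.cdf((-k - mu) / sigma) - p
    # Coverage increases in k, so a simple bracketing root search suffices.
    upper = abs(mu) + 10 * sigma
    return optimize.brentq(coverage_gap, 0.0, upper)

# Assumed summary statistics for the paired differences.
print(tdi(mu=0.2, sigma=1.0, p=0.90))   # ~90% of differences fall within +/- this value
```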

5.
In diagnostic studies, a new diagnostic test is often compared with a standard test and both tests are applied to the same patients, the so‐called paired design. The true disease state is in general given by the so‐called gold standard (the most reliable method for classification), which has to be known for all patients. The benefit of the new diagnostic test can be evaluated by sensitivity and specificity, which are in fact proportions. This means that, for the comparison of two diagnostic tests, confidence intervals for the difference of the dependent estimated sensitivities and specificities are calculated. In the literature, many comparisons of different approaches can be found, but none explicitly for diagnostic studies. For this reason we compare 13 approaches for a set of scenarios that represent data of diagnostic studies (e.g., with sensitivity and specificity ≥ 0.8). With simulation studies, we show that, for the difference of two dependent sensitivities or specificities, the nonparametric interval with normal approximation can be recommended without restriction; the Wald interval can be recommended with the limitation of slightly anti‐conservative results for small sample sizes; and the nonparametric interval with t‐approximation and the Tango interval can be recommended with the limitation of conservative results for high correlations.
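For orientation, the Wald interval mentioned above can be written down directly from the paired 2x2 table of the two tests among diseased subjects; a minimal sketch follows. The cell counts are hypothetical, and the other intervals compared in the paper (nonparametric, Tango) are not reproduced here.

```python
import numpy as np
from scipy import stats

def wald_ci_paired_diff(n11, n10, n01, n00, alpha=0.05):
    """Wald CI for the difference of two dependent proportions (e.g. sensitivities).
    n11: both tests positive, n10: only test 1 positive,
    n01: only test 2 positive, n00: both negative (diseased subjects only)."""
    n = n11 + n10 + n01 + n00
    diff = (n10 - n01) / n
    var = (n10 + n01 - (n10 - n01) ** 2 / n) / n ** 2
    half = stats.norm.ppf(1 - alpha / 2) * np.sqrt(var)
    return diff - half, diff + half

# Hypothetical paired results of a new and a standard test on 100 diseased patients.
print(wald_ci_paired_diff(n11=70, n10=15, n01=5, n00=10))
```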

6.
The present study demonstrates the possibility of estimating species numbers of animal or plant communities from samples using relative abundance distributions. We use log‐abundance–species‐rank order plots and derive two new estimators that are based on log‐series and lognormal distributions. At small to moderate sample sizes these estimators appear to be more precise than previous parametric and nonparametric estimators. We test our estimators using samples from 171 published medium‐sized to large animal and plant communities taken from the literature. In doing so, we show that our new estimators also define limits of precision.

7.
The influence of methodologic aspects on cytomorphometric features was studied using preparations of hepatoma and/or mastocytoma cells. First, two preparation techniques (smear and oese) were compared. Second, four methods of selecting cells for cytomorphometric analysis (two conventional and two stratified methods) were tested for reproducibility. Third, heterogeneous cell populations were used to estimate the required sample size using the running coefficient of variation (CV), and the results were compared with expected (theoretical) values of the required sample size calculated using the standard error of the mean. The results showed significantly lower CVs for the smear preparation technique. The stratified methods appeared to be superior to the conventional methods for selecting cells for measurement. The experimentally assessed sample sizes were considerably lower than the corresponding theoretical calculations. These findings suggest that morphometric assessments in cytologic smears should use a stratified cell selection method. While experimentally assessed sample sizes are relatively small and therefore easier to apply routinely, they may yield less reliable results in some cases. The need to test a sample for its reproducibility as well as its discriminatory power is emphasized.
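The "running coefficient of variation" approach amounts to tracking the CV of the measured feature as cells are added and stopping once it stabilizes. A hedged sketch of one such stopping rule (with an assumed tolerance and hypothetical data) is shown below, alongside the textbook sample-size formula based on the standard error of the mean; neither is claimed to be the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
areas = rng.lognormal(mean=4.0, sigma=0.3, size=300)   # hypothetical nuclear areas

def running_cv_sample_size(x, tol=0.01, window=10, start=10):
    """Smallest n at which the running CV has changed by less than `tol`
    over the last `window` added observations."""
    cv = np.array([x[:k].std(ddof=1) / x[:k].mean() for k in range(2, len(x) + 1)])
    for k in range(start, len(cv)):
        if np.all(np.abs(cv[k - window:k] - cv[k]) < tol):
            return k + 2          # +2 because cv[0] corresponds to n = 2
    return len(x)

def theoretical_sample_size(x, rel_error=0.05, z=1.96):
    """Classical n = (z * CV / relative error)^2 based on the SE of the mean."""
    cv = x.std(ddof=1) / x.mean()
    return int(np.ceil((z * cv / rel_error) ** 2))

print(running_cv_sample_size(areas), theoretical_sample_size(areas))
```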

8.
Cross-validation based point estimates of prediction accuracy are frequently reported in microarray class prediction problems. However, these point estimates can be highly variable, particularly for small sample sizes, and it would be useful to provide confidence intervals of prediction accuracy. We performed an extensive study of existing confidence interval methods and compared their performance in terms of empirical coverage and width. We developed a bootstrap case cross-validation (BCCV) resampling scheme and defined several confidence interval methods using BCCV with and without bias-correction. The widely used approach of basing confidence intervals on an independent binomial assumption of the leave-one-out cross-validation errors results in serious under-coverage of the true prediction error. Two split-sample based methods previously proposed in the literature tend to give overly conservative confidence intervals. Using BCCV resampling, the percentile confidence interval method was also found to be overly conservative without bias-correction, while the bias-corrected and accelerated (BCa) interval method of Efron returns substantially anti-conservative confidence intervals. We propose a simple bias reduction on the BCCV percentile interval. The method provides mildly conservative inference under all circumstances studied and outperforms the other methods in microarray applications with small to moderate sample sizes.
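As a rough illustration of the resampling idea (not the exact BCCV scheme or its bias correction), the sketch below bootstraps the cases, runs cross-validation within each bootstrap sample using a simple classifier, and takes percentile limits of the resulting accuracies; the classifier, data, and replicate count are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n, p = 40, 50                                   # small-sample, high-dimensional setting
X = rng.normal(size=(n, p))
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)

def bootstrap_cv_accuracy_ci(X, y, n_boot=200, alpha=0.05):
    """Percentile interval for cross-validated accuracy under case resampling."""
    clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
    accs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))
        Xb, yb = X[idx], y[idx]
        if np.bincount(yb, minlength=2).min() < 5:   # skip degenerate resamples
            continue
        accs.append(cross_val_score(clf, Xb, yb, cv=5, scoring="accuracy").mean())
    return np.quantile(accs, [alpha / 2, 1 - alpha / 2])

print(bootstrap_cv_accuracy_ci(X, y))
```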

9.
Zhou XH, Tu W. Biometrics 2000;56(4):1118-1125
In this paper, we consider the problem of interval estimation for the mean of diagnostic test charges. Diagnostic test charge data may contain zero values, and the nonzero values can often be modeled by a log-normal distribution. Under such a model, we propose three different interval estimation procedures: a percentile-t bootstrap interval based on sufficient statistics and two likelihood-based confidence intervals. For theoretical properties, we show that the two likelihood-based one-sided confidence intervals are only first-order accurate and that the bootstrap-based one-sided confidence interval is second-order accurate. For two-sided confidence intervals, all three proposed methods are second-order accurate. A simulation study with finite sample sizes suggests that all three proposed intervals outperform a widely used minimum variance unbiased estimator (MVUE)-based interval except in the case of one-sided lower end-point intervals when the skewness is very small. Among the proposed one-sided intervals, the bootstrap interval has the best coverage accuracy. For the two-sided intervals, when the sample size is small, the bootstrap method still yields the best coverage accuracy unless the skewness is very small, in which case the bias-corrected ML method has the best accuracy. When the sample size is large, all three proposed intervals have similar coverage accuracy. Finally, we apply the proposed methods to a real example assessing diagnostic test charges among older adults with depression.
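A generic percentile-t (bootstrap-t) interval for the mean of charge data containing zeros can be sketched as below; it resamples the raw observations and studentizes with the ordinary standard error, so it is a simplified stand-in for the sufficient-statistic-based bootstrap and the likelihood intervals developed in the paper. The data and replicate numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical diagnostic-charge data: a point mass at zero plus log-normal costs.
n = 60
charges = np.where(rng.random(n) < 0.25, 0.0, rng.lognormal(mean=5.0, sigma=1.2, size=n))

def percentile_t_ci(x, n_boot=4000, alpha=0.05):
    """Bootstrap-t CI for the mean: studentize each bootstrap mean by its own SE."""
    mean_hat = x.mean()
    se_hat = x.std(ddof=1) / np.sqrt(len(x))
    t_stats = np.empty(n_boot)
    for b in range(n_boot):
        xb = x[rng.integers(0, len(x), len(x))]
        se_b = xb.std(ddof=1) / np.sqrt(len(xb))
        t_stats[b] = (xb.mean() - mean_hat) / se_b
    t_lo, t_hi = np.quantile(t_stats, [alpha / 2, 1 - alpha / 2])
    # Note the reversal: the upper t quantile gives the lower limit and vice versa.
    return mean_hat - t_hi * se_hat, mean_hat - t_lo * se_hat

print(percentile_t_ci(charges))
```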

10.
Problems of establishing equivalence or noninferiority between two medical diagnostic procedures involve comparisons of the response rates between correlated proportions. When the sample size is small, the asymptotic tests may not be reliable. This article proposes an unconditional exact test procedure to assess equivalence or noninferiority. Two statistics are considered for defining the rejection region of the exact test: a sample-based test statistic and a restricted maximum likelihood estimation (RMLE)-based test statistic. We show that the p-value of the proposed unconditional exact tests can be attained at the boundary point of the null hypothesis. Assessment of equivalence is often based on a comparison of the confidence limits with the equivalence limits. We also derive the unconditional exact confidence intervals on the difference of the two proportions for the two test statistics. A typical data set comparing two diagnostic procedures is analyzed using the proposed unconditional exact and asymptotic methods. The p-value from the unconditional exact tests is generally larger than the p-value from the asymptotic tests. In other words, an exact confidence interval is generally wider than the confidence interval obtained from an asymptotic test.

11.
This study examined two problems in the measurement of chimpanzee behavior: (1) comparability among data sets varying in length of total observation time; and (2) the longest interval for scoring reliable numbers of sample points with instantaneous sampling (this required procedures for evaluating the chi-square statistics of the sampled data). During a 4.5-month field study conducted at the Mahale Mountains National Park, Tanzania, one adult male was observed as a focal animal for about 300 hr with continuous recording. His behavior was classified into five categories. Data sets varying in total time were prepared by extraction from the raw data. Comparability among the data sets was evaluated using Pearson's correlation coefficients and Kendall's coefficients of concordance calculated from two kinds of measures obtained from the raw and simulated data sets: (a) the percentages of time spent by the focal animal in each behavior category; and (b) those of the time spent by adult males in his proximity. The results revealed that observation time of 25 hr was the critical length for scoring the above measures reliably. Sample points for the focal animal's behavior categories and for adult males in his proximity were simulated with intervals of various lengths for data sets differing in total time. The longest interval was measured by comparing the simulated scores with confidence limits calculated for the number of sample points to be scored with the respective intervals. It was found that the interval for sampling should be set at 3 min or shorter, and that chi-square statistics calculated from the data sampled with such an interval should be evaluated after their modification into the values to be obtained from the data sampled with a 5-min interval. These results may not be directly applicable to studies dealing with other behavior categories, other age/sex classes of focal animals, etc. However, the above problems should be examined widely in studies attempting to measure animal behavior, and the methods employed in this study are applicable to such studies.

12.
Feifel J, Dobler D. Biometrics 2021;77(1):175-185
Nested case‐control designs are attractive in studies with a time‐to‐event endpoint if the outcome is rare or if interest lies in evaluating expensive covariates. The appeal is that these designs restrict to small subsets of all patients at risk just prior to the observed event times. Only these small subsets need to be evaluated. Typically, the controls are selected at random and methods for time‐simultaneous inference have been proposed in the literature. However, the martingale structure behind nested case‐control designs allows for more powerful and flexible non‐standard sampling designs. We exploit that structure to find simultaneous confidence bands based on wild bootstrap resampling procedures within this general class of designs. We show in a simulation study that the intended coverage probability is obtained for confidence bands for cumulative baseline hazard functions. We apply our methods to observational data about hospital‐acquired infections.

13.
King M, Dobson A. Biometrics 2000;56(4):1197-1203
The responsiveness of a measuring instrument is its ability to detect change over time. A commonly used index of responsiveness is the effect size for paired differences. This paper generalizes the effect size for paired differences to more than two repeated observations per subject. The sampling distribution of the generalized responsiveness statistic, Rt, is simulated for a range of plausible parameter values and for a range of sample sizes varying both the number of subjects (n) and the number of observations per subject (t). The coverage properties of confidence intervals constructed by four methods are compared. Confidence intervals based on jackknife estimates of the standard error and bias of Rt have good coverage properties even when n and t are small. The methods are used to determine which of two standard quality-of-life measures is more responsive to improvements in quality of life following surgery for early-stage breast cancer.
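The two-occasion version of the index, the standardized mean of the paired changes, together with a jackknife standard error and a normal-theory interval, can be sketched as follows. This is not the authors' generalized statistic Rt for more than two occasions, and the data are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
before = rng.normal(50, 10, 30)
after = before + rng.normal(4, 8, 30)           # hypothetical quality-of-life scores

def effect_size_paired(d):
    """Effect size for paired differences: mean change / SD of change."""
    return d.mean() / d.std(ddof=1)

def jackknife_ci(d, alpha=0.05):
    """Normal-theory CI using jackknife estimates of bias and standard error."""
    n = len(d)
    theta_hat = effect_size_paired(d)
    loo = np.array([effect_size_paired(np.delete(d, i)) for i in range(n)])
    theta_bar = loo.mean()
    se_jack = np.sqrt((n - 1) / n * np.sum((loo - theta_bar) ** 2))
    bias_jack = (n - 1) * (theta_bar - theta_hat)
    est = theta_hat - bias_jack                 # bias-corrected point estimate
    z = stats.norm.ppf(1 - alpha / 2)
    return est - z * se_jack, est + z * se_jack

print(jackknife_ci(after - before))
```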

14.
Krishnamoorthy K, Lu Y. Biometrics 2003;59(2):237-247
This article presents procedures for hypothesis testing and interval estimation of the common mean of several normal populations. The methods are based on the concepts of generalized p-value and generalized confidence limit. The merits of the proposed methods are evaluated numerically and compared with those of the existing methods. Numerical studies show that the new procedures are accurate and perform better than the existing methods when the sample sizes are moderate and the number of populations is four or less. If the number of populations is five or more, then the generalized variable method performs much better than the existing methods regardless of the sample sizes. The generalized variable method and other existing methods are illustrated using two examples.
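To illustrate the generalized-inference machinery on the simplest possible case (a single normal mean with unknown variance), one can simulate a generalized pivotal quantity as below and take its percentiles as confidence limits; the multi-population common-mean procedure of the paper combines such quantities across groups and is not reproduced here. The data and simulation size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(11)
x = rng.normal(10.0, 2.0, size=15)              # hypothetical sample
n, xbar, s = len(x), x.mean(), x.std(ddof=1)

def gpq_mean_ci(n, xbar, s, n_sim=100_000, alpha=0.05):
    """Generalized pivotal quantity for a normal mean with unknown variance.
    Its percentiles reproduce the usual t interval, which makes it a useful
    sanity check for the generalized-inference approach."""
    z = rng.standard_normal(n_sim)
    v = rng.chisquare(n - 1, n_sim)
    r_mu = xbar - z / np.sqrt(v / (n - 1)) * s / np.sqrt(n)
    return np.quantile(r_mu, [alpha / 2, 1 - alpha / 2])

print(gpq_mean_ci(n, xbar, s))
```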

15.
Metric data are usually assessed on a continuous scale with good precision, but sometimes agricultural researchers cannot obtain precise measurements of a variable. Values of such a variable cannot then be expressed as real numbers (e.g., 1.51 or 2.56), but often can be represented by intervals into which the values fall (e.g., from 1 to 2 or from 2 to 3). In this situation, statisticians talk about censoring and censored data, as opposed to missing data, where no information is available at all. Traditionally, in agriculture and biology, three methods have been used to analyse such data: (a) when intervals are narrow, some form of imputation (e.g., mid‐point imputation) is used to replace the interval and traditional methods for continuous data are employed (such as analyses of variance [ANOVA] and regression); (b) for time‐to‐event data, the cumulative proportions of individuals that experienced the event of interest are analysed, instead of the individual observed times‐to‐event; (c) when intervals are wide and many individuals are collected, non‐parametric methods of data analysis are favoured, where counts are considered instead of the individual observed value for each sample element. In this paper, we show that these methods may be suboptimal: The first one does not respect the process of data collection, the second leads to unreliable standard errors (SEs), while the third does not make full use of all the available information. As an alternative, methods of survival analysis for censored data can be useful, leading to reliable inferences and sound hypotheses testing. These methods are illustrated using three examples from plant and crop sciences.
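As a small illustration of why the censored-likelihood treatment differs from mid-point imputation, the sketch below fits a normal model to interval-censored observations by maximizing the probability assigned to each observed interval and compares the result with the naive mid-point mean. The data-generating values are assumptions, and real analyses would typically use dedicated survival-analysis software.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(5)
true_values = rng.normal(2.3, 0.8, size=80)
lower = np.floor(true_values)                   # each value is only known to lie in [k, k+1)
upper = lower + 1.0

def neg_log_lik(params):
    """Interval-censored normal likelihood: each observation contributes
    the probability mass of its interval."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    p = stats.norm.cdf(upper, mu, sigma) - stats.norm.cdf(lower, mu, sigma)
    return -np.sum(np.log(np.clip(p, 1e-300, None)))

fit = optimize.minimize(neg_log_lik, x0=[lower.mean(), 0.0], method="Nelder-Mead")
mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])

midpoint_mean = ((lower + upper) / 2).mean()    # naive mid-point imputation
print(mu_hat, sigma_hat, midpoint_mean)
```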

16.
17.
(1) Numerous studies have explored the role of semiochemicals in the behaviour of bark beetles (Scolytidae). (2) Multiple‐funnel traps are often used to elucidate these behavioural responses. Sufficient sample sizes are obtained by using large numbers of traps to which treatments are randomly assigned once, or by frequent collection of trap catches and subsequent re‐randomization of treatments. (3) Recently, there has been some debate about the potential for trap contamination to occur when semiochemical treatments (baits), and not trap‐treatment units (traps and baits), are re‐randomized among existing traps. Due to the volatility of many semiochemicals, small levels of contamination could potentially confound results. (4) A literature survey was conducted to determine the frequency of re‐randomizing semiochemical treatments (baits) vs. trap‐treatment units (traps and baits) in scolytid trapping bioassays. An experiment was then conducted to determine whether differences in the response of Dendroctonus brevicomis LeConte to attractant‐baited traps exist between the two methods. (5) The majority of papers examined reported use of a large number of fixed replicates (traps) rather than re‐randomization of treatments at frequent intervals. Seventy‐five percent of papers for which re‐randomization methods could be determined reported relocation of semiochemical treatments (baits) only. (6) No significant differences in trap catch were observed among multiple‐funnel traps aged with D. brevicomis baits (Phero Tech Inc., Canada) for 0, 30 and 90 days, suggesting that contamination did not influence the results. (7) It is concluded that re‐randomizing baits is a viable, cost‐effective alternative to re‐randomizing trap and bait units.

18.
Paired survival times with potential censoring are often observed from two treatment groups in clinical trials and other types of clinical studies. The ratio of marginal hazard rates may be used to quantify the treatment effect in these studies. In this paper, a recently proposed nonparametric kernel method is used to estimate the marginal hazard rate, and the method of variance estimates recovery (MOVER) is used for the construction of the confidence intervals of a time‐dependent hazard ratio based on the confidence limits of a single marginal hazard rate. Two methods are proposed: one uses the delta method and another adopts the transformation method to construct confidence limits for the marginal hazard rate. Simulations are performed to evaluate the performance of the proposed methods. Real data from two clinical trials are analyzed using the proposed methods.
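The MOVER step itself is easy to state once the two marginal hazard-rate estimates and their individual confidence limits are in hand: work on the log scale, combine the limits as for a difference of logs, and exponentiate. The sketch below assumes those marginal estimates and limits are already available (hypothetical numbers are used) and omits the kernel estimation of the hazard rates.

```python
import numpy as np

def mover_log_ratio_ci(h1, l1, u1, h2, l2, u2):
    """MOVER CI for h1/h2 built from CIs of the two marginal hazard rates.
    The limits are combined for log(h1) - log(h2) and then exponentiated."""
    t1, a1, b1 = np.log(h1), np.log(l1), np.log(u1)
    t2, a2, b2 = np.log(h2), np.log(l2), np.log(u2)
    lower = t1 - t2 - np.sqrt((t1 - a1) ** 2 + (b2 - t2) ** 2)
    upper = t1 - t2 + np.sqrt((b1 - t1) ** 2 + (t2 - a2) ** 2)
    return np.exp(lower), np.exp(upper)

# Hypothetical marginal hazard-rate estimates and 95% limits at a given time point.
print(mover_log_ratio_ci(h1=0.12, l1=0.08, u1=0.18, h2=0.07, l2=0.045, u2=0.11))
```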

19.
Inference after two‐stage single‐arm designs with binary endpoint is challenging due to the nonunique ordering of the sampling space in multistage designs. We illustrate the problem of specifying test‐compatible confidence intervals for designs with nonconstant second‐stage sample size and present two approaches that guarantee confidence intervals consistent with the test decision. Firstly, we extend the well‐known Clopper–Pearson approach of inverting a family of two‐sided hypothesis tests from the group‐sequential case to designs with fully adaptive sample size. Test compatibility is achieved by using a sample space ordering that is derived from a test‐compatible estimator. The resulting confidence intervals tend to be conservative but assure the nominal coverage probability. In order to assess the possibility of further improving these confidence intervals, we pursue a direct optimization approach minimizing the mean width of the confidence intervals. While the latter approach produces more stable coverage probabilities, it is also slightly anti‐conservative and yields only negligible improvements in mean width. We conclude that the Clopper–Pearson‐type confidence intervals based on a test‐compatible estimator are the best choice if the nominal coverage probability is not to be undershot and compatibility of test decision and confidence interval is to be preserved.
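For reference, the single-stage Clopper–Pearson interval that the first approach generalizes can be computed directly from beta quantiles; the two-stage, test-compatible extension discussed in the abstract additionally requires an ordering of the two-stage sample space and is not shown. The counts below are hypothetical.

```python
from scipy import stats

def clopper_pearson(x, n, alpha=0.05):
    """Exact (Clopper-Pearson) CI for a binomial proportion via beta quantiles."""
    lower = 0.0 if x == 0 else stats.beta.ppf(alpha / 2, x, n - x + 1)
    upper = 1.0 if x == n else stats.beta.ppf(1 - alpha / 2, x + 1, n - x)
    return lower, upper

print(clopper_pearson(x=7, n=29))               # e.g. 7 responses among 29 patients
```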

20.
This paper focuses on inferences about the overall treatment effect in meta-analysis with normally distributed responses, based on the concepts of generalized inference. A refined generalized pivotal quantity based on the t distribution is presented, and a simulation study shows that it can provide confidence intervals with satisfactory coverage probabilities and perform hypothesis testing with satisfactory type-I error control at very small sample sizes.
