Similar Articles
20 similar articles found (search time: 31 ms)
1.
In geostatistics, the Durbin-Watson test is frequently employed to detect residual serial correlation in least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data drawn by spatial random sampling, the test is ineffectual because the value of the Durbin-Watson statistic depends on the ordering of the data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran's index, an autocorrelation coefficient is defined using a standardized residual vector and a normalized spatial weight matrix. Then, by analogy with the Durbin-Watson statistic, two new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 Chinese regions. The results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test.
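A minimal sketch of the central construction, assuming a Moran-type coefficient computed from the standardized OLS residual vector and a row-normalized spatial weight matrix; the function name, toy data, and contiguity matrix below are illustrative and do not reproduce the paper's two indices.

```python
import numpy as np

def residual_spatial_autocorrelation(y, X, W):
    """Moran-type autocorrelation of OLS residuals.

    y : (n,) response, X : (n, p) covariates (intercept added internally),
    W : (n, n) spatial weight matrix (row-normalized before use).
    """
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    e = y - X1 @ beta                      # OLS residuals
    z = (e - e.mean()) / e.std(ddof=1)     # standardized residual vector
    Wn = W / W.sum(axis=1, keepdims=True)  # row-normalized spatial weights
    # Moran-type coefficient: z' W z / z'z
    return float(z @ Wn @ z / (z @ z))

# toy example: 5 regions with an arbitrary contiguity matrix
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
y = X @ np.array([1.0, -0.5]) + rng.normal(size=5)
W = np.array([[0, 1, 0, 0, 1],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [1, 0, 0, 1, 0]], dtype=float)
print(residual_spatial_autocorrelation(y, X, W))
```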

2.
Law B, Buckleton JS, Triggs CM, Weir BS. Genetics 2003; 164(1):381-387.
The probability of multilocus genotype counts conditional on allelic counts and on allelic independence provides a test statistic for independence within and between loci. As the number of loci increases and each sampled genotype becomes unique, the conditional probability becomes a function of total heterozygosity. In that case, it does not address between-locus dependence directly but only indirectly through detection of the Wahlund effect. Moreover, the test will reject the hypothesis of allelic independence only for small values of heterozygosity. Low heterozygosity is expected for population subdivision but not for population admixture. The test may therefore be inappropriate for admixed populations. If individuals with parents in two different populations are always considered to belong to one of the populations, then heterozygosity is increased in that population and the exact test should not be used for sparse data sets from that population. If such a case is suspected, then alternative testing strategies are suggested.

3.
Bayesian estimation of the risk of a disease around a known point source of exposure is considered. The minimal data requirements are that cases and populations at risk are known for a fixed set of concentric annuli around the point source, and that each annulus has a uniquely defined distance from the source. The conventional Poisson likelihood is assumed for the counts of disease cases in each annular zone, with zone-specific relative risk parameters, and, conditional on the risks, the counts are considered independent. The prior for the relative risk parameters is assumed to be piecewise constant in distance, with a known number of components; this prior is the well-known change-point model. Monte Carlo sampling from the posterior yields zone-specific posterior summaries, which can be used to calculate a smooth curve describing the variation in disease risk as a function of distance from the putative source. In addition, the posterior can be used to calculate posterior probabilities for hypotheses of interest. The suggested model is suitable for use in geographical information systems (GIS) aimed at monitoring disease risks. As an application, a case study on the incidence of lung cancer around a former asbestos mine in eastern Finland is presented. Further extensions of the model are discussed.
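A small sketch of the basic likelihood structure, assuming a single change point and conjugate Gamma priors so the marginal likelihood is available in closed form; the paper's model allows a known number of components and uses Monte Carlo sampling, which is not reproduced here. The counts, expected values, and hyperparameters below are illustrative.

```python
import numpy as np
from scipy.special import gammaln

def log_marglik_segment(y, E, a=1.0, b=1.0):
    """Log marginal likelihood of a Poisson segment y_i ~ Poisson(theta * E_i)
    with a conjugate Gamma(a, b) prior on the common relative risk theta."""
    y, E = np.asarray(y, float), np.asarray(E, float)
    return (np.sum(y * np.log(E) - gammaln(y + 1))
            + a * np.log(b) - gammaln(a)
            + gammaln(a + y.sum()) - (a + y.sum()) * np.log(b + E.sum()))

def changepoint_posterior(y, E):
    """Posterior over the annulus index k at which the relative risk changes
    (theta1 for annuli 1..k, theta2 beyond), with a uniform prior on k."""
    K = len(y)
    logp = np.array([log_marglik_segment(y[:k], E[:k]) +
                     log_marglik_segment(y[k:], E[k:])
                     for k in range(1, K)])
    p = np.exp(logp - logp.max())
    return p / p.sum()

# toy data: observed and expected cases in 6 concentric annuli around the source
cases    = np.array([9, 7, 4, 3, 2, 2])
expected = np.array([3.0, 3.2, 3.1, 2.9, 3.0, 3.1])
print(changepoint_posterior(cases, expected))
```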

4.
Significance testing for correlated binary outcome data
Rosner B, Milton RC. Biometrics 1988; 44(2):505-512.
Multiple logistic regression is a commonly used multivariate technique for analyzing data with a binary outcome. One assumption needed for this method of analysis is the independence of outcomes for all sample points in a data set. In ophthalmologic data and other types of correlated binary data, this assumption is often grossly violated and the validity of the technique becomes an issue. A technique has been developed (Rosner, 1984) that utilizes a polychotomous logistic regression model to allow one to examine multiple exposure variables in the context of a correlated binary data structure. This model is an extension of the beta-binomial model, which has been widely used to model correlated binary data when no covariates are present. In this paper, a relationship is developed between the two techniques, whereby it is shown that use of ordinary logistic regression in the presence of correlated binary data can result in true significance levels that are considerably larger than nominal levels in frequently encountered situations. This relationship is explored in detail in the case of a single dichotomous exposure variable. In this case, the appropriate test statistic can be expressed as an adjusted chi-square statistic based on the 2 × 2 contingency table relating exposure to outcome. The test statistic is easily computed as a function of the ordinary chi-square statistic and the correlations between eyes (or more generally between cluster members) for outcome and exposure, respectively. This generalizes some previous results obtained by Koval and Donner (1987, in Festschrift for V. M. Joshi, I. B. MacNeill (ed.), Vol. V, 199-224). [Abstract truncated at 250 words.]
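The snippet below is only a generic design-effect (variance-inflation) correction for a clustered 2 × 2 table, not Rosner's exact adjusted statistic; the table, cluster size, and intra-cluster correlation are made-up values used to illustrate how correlation deflates the ordinary chi-square.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

def adjusted_chi2(table, m, rho):
    """Ordinary Pearson chi-square for a 2x2 exposure-by-outcome table,
    deflated by a design effect 1 + (m - 1) * rho for clusters of size m
    with intra-cluster correlation rho (a generic correction, not
    Rosner's exact statistic)."""
    stat, _, _, _ = chi2_contingency(np.asarray(table), correction=False)
    deff = 1.0 + (m - 1) * rho
    adj = stat / deff
    return adj, chi2.sf(adj, df=1)

# toy 2x2 table of eyes: rows = exposed/unexposed, cols = affected/unaffected
table = [[30, 70], [15, 85]]
print(adjusted_chi2(table, m=2, rho=0.6))
```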

5.
Both theoretical calculations and simulation studies have been used to compare and contrast the statistical power of methods for mapping quantitative trait loci (QTLs) in simple and complex pedigrees. A widely used approach in such studies is to derive or simulate the expected mean test statistic under the alternative hypothesis of a segregating QTL and to equate a larger mean test statistic with greater power. In the present study, we show that, even when the test statistic under the null hypothesis of no linkage follows a known asymptotic distribution (the standard being chi-square), it cannot be assumed that the distribution under the alternative hypothesis is noncentral chi-square. Hence, mean test statistics cannot be used to indicate power differences, and a comparison between methods that is based on simulated average test statistics may lead to the wrong conclusion. We illustrate this important finding, through simulations and analytical derivations, for a recently proposed new regression method for the analysis of general pedigrees to map quantitative trait loci. We show that this regression method is neither necessarily more powerful nor computationally more efficient than a maximum-likelihood variance-component approach. We advocate the use of empirical power to compare trait-mapping methods.
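A minimal sketch of the empirical-power idea the authors advocate: simulate the test statistic under the alternative and count how often it exceeds the nominal critical value, rather than comparing average statistics. The simple one-sample statistic used as the generator is purely illustrative, not a QTL-mapping method.

```python
import numpy as np
from scipy.stats import chi2

def empirical_power(simulate_stat, n_sim=2000, df=1, alpha=0.05, seed=1):
    """Empirical power: the proportion of simulated replicates whose test
    statistic exceeds the nominal critical value, rather than a comparison
    of mean test statistics across methods."""
    rng = np.random.default_rng(seed)
    crit = chi2.ppf(1 - alpha, df)
    stats = np.array([simulate_stat(rng) for _ in range(n_sim)])
    return float(np.mean(stats > crit))

# illustrative generator: a statistic whose null distribution is chi2(1)
# but whose distribution under the alternative need not be noncentral chi2
def sim_alt(rng, n=50, effect=0.4):
    x = rng.normal(effect, 1.0, size=n)
    t = np.sqrt(n) * x.mean() / x.std(ddof=1)
    return t ** 2

print(empirical_power(sim_alt))
```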

6.
The purpose of this work is the development of a family-based association test that allows for random genotyping errors and missing data and makes use of information on affected and unaffected pedigree members. We derive the conditional likelihood functions of the general nuclear family for the following scenarios: complete parental genotype data and no genotyping errors; only one genotyped parent and no genotyping errors; no parental genotype data and no genotyping errors; and no parental genotype data with genotyping errors. We find maximum likelihood estimates of the marker locus parameters, including the penetrances and population genotype frequencies, under the null hypothesis that all penetrance values are equal and under the alternative hypothesis. We then compute the likelihood ratio test. We perform simulations to assess the adequacy of the central chi-square distribution approximation when the null hypothesis is true. We also perform simulations to compare the power of the TDT and this likelihood-based method. Finally, we apply our method to 23 SNPs genotyped in nuclear families from a recently published study of idiopathic scoliosis (IS). Our simulations suggest that this likelihood ratio test statistic follows a central chi-square distribution with 1 degree of freedom under the null hypothesis, even in the presence of missing data and genotyping errors. The power comparison shows that this likelihood ratio test is more powerful than the original TDT for the simulations considered. For the IS data, the marker rs7843033 shows the most significant evidence with our method (p = 0.0003), consistent with a previous report that found rs7843033 to have the second most significant TDTae p value among the set of 23 SNPs.
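As a reminder of the final step, a short sketch of how a likelihood ratio statistic is referred to a central chi-square distribution with 1 degree of freedom; the two maximized log-likelihood values are illustrative placeholders, not numbers from the study.

```python
from scipy.stats import chi2

def lrt_pvalue(loglik_alt, loglik_null, df=1):
    """Likelihood ratio test: 2 * (max log-likelihood under H1 minus under H0),
    compared to a central chi-square with `df` degrees of freedom."""
    stat = 2.0 * (loglik_alt - loglik_null)
    return stat, chi2.sf(stat, df)

# e.g. maximized log-likelihoods from fitting the family-based model with
# free vs. equal penetrances (values are illustrative only)
print(lrt_pvalue(loglik_alt=-1231.4, loglik_null=-1237.9))
```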

7.
Multiple endpoints are tested to assess an overall treatment effect and also to identify which endpoints or subsets of endpoints contributed to the treatment differences. Conventional p-value adjustment methods, such as single-step, step-up, or step-down procedures, sequentially identify each significant individual endpoint. Closed test procedures can also detect individual endpoints that have effects via a step-by-step closed strategy. This paper proposes a global statistic for testing an a priori number, say r, of the k endpoints, as opposed to the conventional approach of testing one (r = 1) endpoint. The proposed test statistic is an extension of the single-step p-value-based statistic built on the distribution of the smallest p-value. The test maintains strong control of the familywise error (FWE) rate under the null hypothesis of no difference in any (sub)set of r endpoints among all possible combinations of the k endpoints. After rejecting the null hypothesis, the individual endpoints in the rejected sets can be tested further, using a univariate test statistic in a second step, if desired. However, the second-step test only weakly controls the FWE. The proposed method is illustrated by application to a psychosis data set.
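A simplified illustration of a single-step smallest-p-value statistic: the observed min-p over a set of endpoints is referred to a permutation distribution obtained by shuffling group labels jointly across endpoints (which preserves the correlation between endpoints). This is not the paper's procedure for subsets of size r or its FWE argument; the data below are synthetic.

```python
import numpy as np
from scipy.stats import ttest_ind

def minp_global_test(x, y, n_perm=2000, seed=0):
    """Single-step min-p global test for a set of endpoints, assessed by
    permuting group labels jointly across all endpoints."""
    rng = np.random.default_rng(seed)
    k = x.shape[1]
    obs_minp = min(ttest_ind(x[:, j], y[:, j]).pvalue for j in range(k))
    pooled, n_x = np.vstack([x, y]), len(x)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        px, py = pooled[idx[:n_x]], pooled[idx[n_x:]]
        perm_minp = min(ttest_ind(px[:, j], py[:, j]).pvalue for j in range(k))
        count += perm_minp <= obs_minp
    return obs_minp, (count + 1) / (n_perm + 1)

rng = np.random.default_rng(2)
treat   = rng.normal(0.5, 1.0, size=(30, 3))   # 3 endpoints, treated group
control = rng.normal(0.0, 1.0, size=(30, 3))   # 3 endpoints, control group
print(minp_global_test(treat, control))
```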

8.
OBJECTIVE: p values are inaccurate for model-free linkage analysis using the conditional logistic model if we assume that the LOD score is asymptotically distributed as a simple mixture of chi-square distributions. When analyzing affected relative pairs alone, permuting the allele sharing of relative pairs does not lead to a useful permutation distribution. As an alternative, we have developed regression prediction models that provide more accurate p values. METHODS: Let E(alpha) be the empirical p value, i.e., the proportion of statistical tests whose LOD score under the null hypothesis exceeds a threshold determined by alpha, the nominal single-test significance level. We used simulated data to obtain values of E(alpha) and compared them with alpha. We also developed a regression model, based on sample size, number of covariates in the model, alpha, and marker density, to derive predicted p values for both single-point and multipoint analyses. To evaluate our predictions, we used another set of simulated data, comparing the E(alpha) for these data with the values obtained from the prediction model, referred to as predicted p values (P(alpha)). RESULTS: Under almost all circumstances, the values of P(alpha) were closer to the E(alpha) than were the values of alpha. CONCLUSION: The regression models suggested by our analysis provide more accurate alternative p values for model-free linkage analysis when using the conditional logistic model.

9.

Background

Faecal egg counts are a common indicator of nematode infection and, since they are a heritable trait, they provide a marker for selective breeding. However, because resistance to disease changes as the adaptive immune system develops, quantifying temporal changes in heritability could help improve selective breeding programs. Faecal egg counts can be extremely skewed and difficult to handle statistically. Therefore, previous heritability analyses have log transformed faecal egg counts to estimate heritability on a latent scale. However, such transformations may not always be appropriate. In addition, analyses of faecal egg counts have typically used univariate rather than multivariate analyses, such as random regression, that are appropriate when traits are correlated. We present a method for estimating the heritability of untransformed faecal egg counts over the grazing season using random regression.

Results

Replicating standard univariate analyses, we showed the dependence of heritability estimates on the choice of transformation. Then, using a multitrait model, we exposed temporal correlations, highlighting the need for a random regression approach. Since random regression can sometimes involve estimating more parameters than there are observations, or result in computationally intractable problems, we chose to investigate reduced rank random regression. Using standard software (WOMBAT), we discuss the estimation of variance components for log transformed data using both full and reduced rank analyses. Then, we modelled the untransformed data, assuming it to be negative binomially distributed, and used Metropolis-Hastings sampling to fit a generalized reduced rank random regression model with additive genetic, permanent environmental, and maternal effects. These three variance components explained more than 80% of the total phenotypic variation, whereas the variance components for the log transformed data accounted for considerably less. The heritability, on the link scale, increased from around 0.25 at the beginning of the grazing season to around 0.4 at the end.

Conclusions

Random regressions are a useful tool for quantifying sources of variation across time. Our MCMC (Markov chain Monte Carlo) algorithm provides a flexible approach to fitting random regression models to non-normal data. Here we applied the algorithm to negative binomially distributed faecal egg count data, but this method is readily applicable to other types of overdispersed data.
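A minimal building block, assuming negative binomially distributed counts on a log link: a random-walk Metropolis update for a single mean parameter with a vague normal prior. It is not the authors' reduced rank random regression sampler; the dispersion value, step size, and counts below are illustrative.

```python
import numpy as np
from scipy.stats import nbinom

def nb_loglik(counts, mu, k):
    """Negative binomial log-likelihood with mean mu and dispersion k
    (scipy parametrization: n = k, p = k / (k + mu))."""
    return nbinom.logpmf(counts, k, k / (k + mu)).sum()

def metropolis_log_mu(counts, k=0.5, n_iter=5000, step=0.1, seed=3):
    """Random-walk Metropolis on eta = log(mu) with a vague N(0, 100) prior;
    a minimal piece of an MCMC sampler for overdispersed faecal egg counts
    (not the full reduced rank random regression model)."""
    rng = np.random.default_rng(seed)
    eta = np.log(counts.mean() + 0.5)
    logpost = nb_loglik(counts, np.exp(eta), k) - eta**2 / (2 * 100.0)
    draws = []
    for _ in range(n_iter):
        prop = eta + step * rng.normal()
        lp = nb_loglik(counts, np.exp(prop), k) - prop**2 / (2 * 100.0)
        if np.log(rng.uniform()) < lp - logpost:
            eta, logpost = prop, lp
        draws.append(eta)
    return np.exp(np.array(draws[n_iter // 2:]))  # posterior draws of mu

counts = np.array([0, 3, 0, 12, 1, 0, 45, 2, 7, 0])  # skewed egg counts
print(metropolis_log_mu(counts).mean())
```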

10.
Background: Detection and estimation of trends in cancer incidence rates are commonly achieved by fitting standardized rates to a joinpoint log-linear regression. The efficiency of this approach is inadequate when applied to relatively low levels of incidence. We compared that approach with the Cuscore test with respect to detecting a log-linear increasing trend of chronic myelomonocytic leukemia (CMML) in datasets simulated to match a province of about 700,000 inhabitants. Methods: For better efficiency, we replaced the standardized rate as the dependent variable with a continuous statistic that reflects the inverse of the standardized incidence ratio (SIR). Both procedures were applied to datasets simulated to match published results from the Girona Province of Spain. We also present the use of the q-interval in displaying the temporal pattern of the events. This approach is demonstrated by analyses of CMML diagnoses in Girona County (1994–2008). Results: The Cuscore was clearly more efficient than regression in detecting the simulated trend. The relative efficiency of the Cuscore is likely to be maintained at even higher levels of incidence. The use of graphical displays in providing clues for interpreting the results is demonstrated. Conclusions: The Cuscore test, coupled with visual inspection of the temporal pattern of the events, appears to be more efficient than regression analysis in detecting and interpreting data suspected to be at elevated risk. A confirmatory analysis is expected to weed out 75% of the superfluous significant results.

11.
Guan Y, Sherman M, Calvin JA. Biometrics 2006; 62(1):119-125.
A common assumption while analyzing spatial point processes is direction invariance, i.e., isotropy. In this article, we propose a formal nonparametric approach to test for isotropy based on the asymptotic joint normality of the sample second-order intensity function. We derive an L2-consistent subsampling estimator for the asymptotic covariance matrix of the sample second-order intensity function and use this to construct a test statistic with a chi-square limiting distribution. We demonstrate the efficacy of the approach through simulation studies and an application to a desert plant data set, where our approach confirms suspected directional effects in the spatial distribution of the desert plant species.

12.
We applied a new approach based on Mantel statistics to analyze the Genetic Analysis Workshop 14 simulated data with prior knowledge of the answers. The method was developed in order to improve the power of a haplotype sharing analysis for gene mapping in complex disease. The new statistic correlates genetic similarity and phenotypic similarity across pairs of haplotypes from case-control studies. The genetic similarity is measured as the shared length between haplotype pairs around a genetic marker. The phenotypic similarity is measured as the mean-corrected cross-product based on the respective phenotypes. Cases with phenotype P1 and unrelated controls were drawn from the population of Danacaa. Power to detect main effects was compared with that of a chi-square test for association based on 3-marker haplotypes and a global permutation test for haplotype association. Power to detect gene × gene interaction was compared with unconditional logistic regression. The results suggest that the Mantel statistics might be more powerful than the alternative tests.
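A hedged sketch of the core correlation: a Mantel-type statistic that multiplies pairwise genetic similarity (e.g. shared haplotype length) by pairwise phenotypic similarity (mean-corrected cross-products) and is assessed by permuting phenotypes. The toy similarity matrix and case/control labels below are invented for illustration.

```python
import numpy as np

def mantel_statistic(G, P):
    """Mantel-type statistic: sum over haplotype pairs of the product of
    genetic similarity (e.g. shared length around a marker) and phenotypic
    similarity (mean-corrected cross-product of the phenotypes)."""
    iu = np.triu_indices_from(G, k=1)
    return float((G[iu] * P[iu]).sum())

def mantel_permutation_p(G, phenotypes, n_perm=5000, seed=4):
    """Permutation p-value: phenotypes are shuffled so that the phenotypic
    similarity matrix is recomputed under the null of no association."""
    rng = np.random.default_rng(seed)
    y = phenotypes - phenotypes.mean()
    obs = mantel_statistic(G, np.outer(y, y))
    count = 0
    for _ in range(n_perm):
        yp = rng.permutation(y)
        count += mantel_statistic(G, np.outer(yp, yp)) >= obs
    return obs, (count + 1) / (n_perm + 1)

# toy example: 6 haplotypes, pairwise shared length and case/control status
rng = np.random.default_rng(5)
shared = np.abs(rng.normal(size=(6, 6))); shared = (shared + shared.T) / 2
status = np.array([1, 1, 1, 0, 0, 0], dtype=float)
print(mantel_permutation_p(shared, status))
```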

13.
The discovery of rare genetic variants through next-generation sequencing is a very challenging issue in the field of human genetics. We propose a novel region-based statistical approach based on a Bayes factor (BF) to assess evidence of association between a set of rare variants (RVs) located in the same genomic region and a disease outcome in the context of a case-control design. Marginal likelihoods are computed under the null and alternative hypotheses, assuming a binomial distribution for the RV count in the region and a beta or a mixture of Dirac and beta prior distribution for the probability of an RV. We derive the theoretical null distribution of the BF under our prior setting and show that Bayesian control of the false discovery rate can be obtained for genome-wide inference. Informative priors are introduced using prior evidence of association from a Kolmogorov-Smirnov test statistic. We use our simulation program, sim1000G, to generate RV data similar to the 1000 Genomes sequencing project. Our simulation studies showed that the new BF statistic outperforms standard methods (SKAT, SKAT-O, burden test) in case-control studies with moderate sample sizes and is equivalent to them under large-sample-size scenarios. Our real-data application to a lung cancer case-control study found enrichment for RVs in known and novel cancer genes. It also suggests that using the BF with an informative prior improves overall gene discovery compared to the BF with a noninformative prior.
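A minimal sketch of a Bayes factor for a binomial RV count with a beta prior, where the alternative marginal likelihood is the closed-form beta-binomial; the paper's mixture of Dirac and beta prior, its null distribution derivation, and the informative Kolmogorov-Smirnov-based prior are not reproduced. The counts, background probability, and hyperparameters are illustrative.

```python
import numpy as np
from scipy.special import betaln, gammaln

def log_bayes_factor(x, n, p0, a=1.0, b=1.0):
    """log Bayes factor for a regional rare-variant count x out of n:
    H0: carrier probability fixed at p0 (e.g. a genome-wide background);
    H1: probability ~ Beta(a, b), giving a beta-binomial marginal."""
    log_choose = gammaln(n + 1) - gammaln(x + 1) - gammaln(n - x + 1)
    log_m1 = log_choose + betaln(x + a, n - x + b) - betaln(a, b)
    log_m0 = log_choose + x * np.log(p0) + (n - x) * np.log1p(-p0)
    return log_m1 - log_m0

# e.g. 9 rare-variant carriers among 400 case chromosomes in one region,
# against a background carrier probability of 1%
print(log_bayes_factor(x=9, n=400, p0=0.01))
```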

14.
Strand M. Biometrics 2000; 56(4):1222-1226.
Treatment means in factorial experiments are lattice ordered when there is an increase in mean response as the level of any factor is increased while holding the other factors fixed. Such means occur naturally in many experiments. A nonparametric test for lattice-ordered means involving a Kendall-type statistic will be summarized for k-factor factorial experiments. Specifically, the form of the test statistic and its variance under the null hypothesis will be presented. In addition, a normalized version of the test statistic will be discussed and applied to relevant data.

15.
Guan Y. Biometrics 2008; 64(3):800-806.
Summary. We propose a formal method to test stationarity for spatial point processes. The proposed test statistic is based on the integrated squared deviations of observed counts of events from their means estimated under stationarity. We show that the resulting test statistic converges in distribution to a functional of a two-dimensional Brownian motion. To conduct the test, we compare the calculated statistic with the upper-tail critical values of this functional. Our method requires only a weak dependence condition on the process and does not assume any parametric model for it. As a result, it can be applied to a wide class of spatial point process models. We study the efficacy of the test through both simulations and applications to two real data examples that were previously suspected to be nonstationary based on graphical evidence. Our test formally confirmed the suspected nonstationarity for both data sets.
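A loose, simplified analogue of the squared-deviation idea using quadrat counts, with a Monte Carlo reference distribution under a homogeneous Poisson process standing in for the paper's Brownian-motion critical values; the grid size, window, and simulated points are arbitrary, and this is not the authors' test.

```python
import numpy as np

def quadrat_deviation_stat(points, window, nx=4, ny=4):
    """Sum over quadrats of (observed count - mean count)^2: a discrete
    squared-deviation statistic computed on a grid over the window."""
    (x0, x1), (y0, y1) = window
    counts, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                                  bins=[nx, ny],
                                  range=[[x0, x1], [y0, y1]])
    return float(((counts - counts.mean()) ** 2).sum())

def mc_pvalue(points, window, n_sim=999, seed=6):
    """Monte Carlo p-value under a homogeneous Poisson process with the
    same expected number of points in the same window."""
    rng = np.random.default_rng(seed)
    (x0, x1), (y0, y1) = window
    obs = quadrat_deviation_stat(points, window)
    n = len(points)
    sims = []
    for _ in range(n_sim):
        m = rng.poisson(n)
        sim = np.column_stack([rng.uniform(x0, x1, m), rng.uniform(y0, y1, m)])
        sims.append(quadrat_deviation_stat(sim, window))
    return obs, (1 + sum(s >= obs for s in sims)) / (n_sim + 1)

pts = np.random.default_rng(7).uniform(0, 1, size=(200, 2))
print(mc_pvalue(pts, window=((0, 1), (0, 1))))
```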

16.
17.
Objective: The purpose of the present study was to derive linear and non-linear regression equations that estimate energy expenditure (EE) from triaxial accelerometer counts and that can be used to quantify activity in young children. We are unaware of any data regarding the validity of triaxial accelerometry for assessing physical activity intensity in this age group. Research Methods and Procedures: EE for 27 girls and boys (6.0 ± 0.3 years) was assessed for nine activities (lying down, watching a video while sitting and standing, line drawing for coloring-in, playing with blocks, walking, stair climbing, ball toss, and running) using indirect calorimetry and was then estimated using a triaxial accelerometer (ActivTracer, GMS). Results: Significant correlations were observed between synthetic (synthesized tri-axes as the vector), vertical, and horizontal accelerometer counts and EE for all activities (0.878 to 0.932 for EE). However, linear and non-linear regression equations underestimated EE by >30% for stair climbing (up and down) and ball toss. Therefore, linear and non-linear regression equations were calculated for all activities except these two and then evaluated for all activities. Linear and non-linear regression equations using combined vertical and horizontal acceleration counts, synthetic counts, or horizontal counts demonstrated a better relationship between accelerometer counts and EE than did regression equations using vertical acceleration counts alone. Adjustment of the predicted value by the regression equations using the vertical/horizontal counts ratio improved the overestimation of EE for the ball toss. Discussion: The results suggest that triaxial accelerometry is a good tool for assessing daily EE in young children.
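A small sketch of the kind of calibration involved: fitting a linear equation and a simple non-linear (power-law, fitted on the log scale) equation of EE on accelerometer counts. The paired counts/EE values below are made up and are not the study's equations or data.

```python
import numpy as np

# toy paired observations: triaxial counts (counts/min) and measured EE (kcal/min)
counts = np.array([50, 120, 300, 650, 900, 1400, 2100, 3200, 4100, 5200], float)
ee     = np.array([1.1, 1.3, 1.6, 2.0, 2.3, 2.9, 3.4, 4.2, 4.6, 5.3])

# linear calibration: EE = b0 + b1 * counts
X = np.column_stack([np.ones_like(counts), counts])
b_lin, *_ = np.linalg.lstsq(X, ee, rcond=None)

# non-linear (power-law) calibration fitted on the log scale:
# EE = a * counts**c  <=>  log(EE) = log(a) + c * log(counts)
Xl = np.column_stack([np.ones_like(counts), np.log(counts)])
b_log, *_ = np.linalg.lstsq(Xl, np.log(ee), rcond=None)

def predict_linear(c):
    return b_lin[0] + b_lin[1] * c

def predict_power(c):
    return np.exp(b_log[0]) * c ** b_log[1]

print(predict_linear(2500), predict_power(2500))
```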

18.
For assessment of genetic association between single-nucleotide polymorphisms (SNPs) and disease status, the logistic-regression model or generalized linear model is typically employed. However, testing for deviation from Hardy-Weinberg proportion in a patient group could be another approach for genetic-association studies. The Hardy-Weinberg proportion is one of the most important principles in population genetics. Deviation from Hardy-Weinberg proportion among cases (patients) could provide additional evidence for the association between SNPs and diseases. To develop a more powerful statistical test for genetic-association studies, we combined evidence about deviation from Hardy-Weinberg proportion in case subjects and standard regression approaches that use case and control subjects. In this paper, we propose two approaches for combining such information: the mean-based tail-strength measure and the median-based tail-strength measure. These measures integrate logistic regression and Hardy-Weinberg-proportion tests for the study of the association between a binary disease outcome and an SNP on the basis of case- and control-subject data. For both mean-based and median-based tail-strength measures, we derived exact formulas to compute p values. We also developed an approach for obtaining empirical p values with the use of a resampling procedure. Results from simulation studies and real-disease studies demonstrate that the proposed approach is more powerful than the traditional logistic-regression model. The type I error probabilities of our approach were also well controlled.
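A hedged sketch of the two ingredients being combined: a 1-df chi-square test of Hardy-Weinberg proportion in cases and a genotype-association p-value (a 2 × 3 chi-square is used here as a stand-in for the logistic regression), merged with a mean-based tail-strength measure in the Taylor-Tibshirani form. The combining formula and the genotype counts are assumptions for illustration; the paper's exact and resampling p values are not reproduced.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

def hwe_pvalue(n_aa, n_ab, n_bb):
    """1-df chi-square test of Hardy-Weinberg proportion from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    exp = np.array([n * p**2, 2 * n * p * (1 - p), n * (1 - p)**2])
    obs = np.array([n_aa, n_ab, n_bb], float)
    stat = ((obs - exp) ** 2 / exp).sum()
    return chi2.sf(stat, df=1)

def tail_strength(pvals):
    """Mean-based tail-strength measure (Taylor & Tibshirani form), used here
    only to illustrate combining the HWE-in-cases evidence with an
    association p-value."""
    p = np.sort(np.asarray(pvals))
    m = len(p)
    k = np.arange(1, m + 1)
    return float(np.mean(1 - p * (m + 1) / k))

# genotype counts (AA, AB, BB) in cases and controls for one SNP
cases, controls = (42, 18, 40), (30, 48, 22)
p_hwe = hwe_pvalue(*cases)                        # deviation from HWE in cases
p_assoc = chi2_contingency([cases, controls])[1]  # 2x3 genotype association
print(p_hwe, p_assoc, tail_strength([p_hwe, p_assoc]))
```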

19.
Interim analyses in clinical trials are planned for ethical as well as economic reasons. General results have been published in the literature that allow the use of standard group sequential methodology if one uses an efficient test statistic, e.g., when Wald-type statistics are used in random-effects models for ordinal longitudinal data. These models often assume that the random effects are normally distributed. However, this is not always the case. We will show that, when the random-effects distribution is misspecified in ordinal regression models, the joint distribution of the test statistics over the different interim analyses is still a multivariate normal distribution, but a sandwich-type correction to the covariance matrix is needed in order to obtain the correct covariance matrix. The independent increment structure is also investigated. A bias in estimation will occur due to the misspecification. However, we will also show that the treatment effect estimate will be unbiased under the null hypothesis, thus maintaining the type I error. Extensive simulations based on a toenail dermatophyte onychomycosis trial are used to illustrate our results.

20.
Kim S, Imoto S, Miyano S. Bio Systems 2004; 75(1-3):57-65.
We propose a dynamic Bayesian network and nonparametric regression model for constructing a gene network from time series microarray gene expression data. The proposed method can overcome a shortcoming of the Bayesian network model, namely its inability to represent cyclic regulations. The proposed method can analyze the microarray data as continuous data and can capture even nonlinear relations among genes. It can be expected that this model will give deeper insight into complicated biological systems. We also derive a new criterion for evaluating an estimated network from a Bayesian approach. We conduct Monte Carlo experiments to examine the effectiveness of the proposed method. We also demonstrate the proposed method through an analysis of Saccharomyces cerevisiae gene expression data.
