Similar literature
 20 similar articles found (search time: 41 ms)
1.
S Eguchi, M Matsuura. Biometrics 1990, 46(2): 415-426
A new method of testing the Hardy-Weinberg equilibrium in the human leukocyte antigen (HLA) system is proposed and applied to real data. The derivation is based on the maximum likelihood method and closely related to standard regression theory. The test statistic has a closed representation of residual sum of squares by a projection mapping of data onto the estimated regression plane. Under the Hardy-Weinberg law the noniterative estimates for the gene frequencies are suggested by the use of the projection mapping. The test statistic and gene frequency estimates are shown to be asymptotically equivalent to the maximum likelihood method and to be more efficient than the other suggested test statistic when there are more than two identified alleles.

2.
MOTIVATION: Recently a class of nonparametric statistical methods, including the empirical Bayes (EB) method, the significance analysis of microarray (SAM) method and the mixture model method (MMM), has been proposed to detect differential gene expression for replicated microarray experiments conducted under two conditions. All the methods depend on constructing a test statistic Z and a so-called null statistic z. The null statistic z is used to provide some reference distribution for Z such that statistical inference can be accomplished. A common way of constructing z is to apply Z to randomly permuted data. Here we point out that the distribution of z may not approximate the null distribution of Z well, leading to possibly too conservative inference. This observation may apply to other permutation-based nonparametric methods. We propose a new method of constructing a null statistic that aims to estimate the null distribution of a test statistic directly. RESULTS: Using simulated data and real data, we assess and compare the performance of the existing method and our new method when applied in EB, SAM and MMM. Some interesting findings on operating characteristics of EB, SAM and MMM are also reported. Finally, by combining the idea of SAM and MMM, we outline a simple nonparametric method based on the direct use of a test statistic and a null statistic.
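The "common way" of building a null statistic described in this abstract (apply Z to randomly permuted data) can be sketched as follows. This is only a minimal illustration for a single gene, not the new method the authors propose: the Welch-type t-statistic, the function names, and the toy measurements are all assumptions made for the example.

```python
import random
from statistics import mean, variance


def t_statistic(x, y):
    """Welch-type two-sample t-statistic (assumed choice of Z)."""
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    return (mean(x) - mean(y)) / se


def permutation_null(x, y, n_perm=1000, seed=0):
    """Null scores z: the same statistic recomputed on relabelled data."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    null_scores = []
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of condition labels
        null_scores.append(t_statistic(pooled[:len(x)], pooled[len(x):]))
    return null_scores


def permutation_p_value(observed, null_scores):
    """Two-sided p-value with the usual +1 correction."""
    extreme = sum(abs(z) >= abs(observed) for z in null_scores)
    return (extreme + 1) / (len(null_scores) + 1)


# Hypothetical replicate measurements for one gene under two conditions.
condition_a = [5.1, 4.8, 5.6, 5.3, 4.9]
condition_b = [3.9, 4.2, 4.0, 4.4, 3.7]
z_obs = t_statistic(condition_a, condition_b)
p = permutation_p_value(z_obs, permutation_null(condition_a, condition_b))
```

In a microarray setting the null scores would be pooled across genes. The abstract's point is that genes which really are differentially expressed then contaminate that pooled reference distribution.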

3.
A method is proposed to conduct phylogenetic analyses of comparative or interspecific data when the true phylogeny is not known. Standard models of speciation and/or extinction or other methods are used to generate a sample from the set of all possible phylogenies for the measured species. The comparative data are then analyzed on each of the possible trees to obtain a distribution of possible evolutionary statistics for these data. The mean of this distribution is proposed as a reasonable estimate of the true evolutionary statistic of interest. Ways of obtaining confidence intervals and of developing hypothesis tests for this mean statistic are also proposed. The method can be used with any comparative method or phylogenetic analysis technique when phylogenetic relationships among species are not known or when branch lengths for a phylogeny in units of expected character change (as required by most methods) are not available. Computer programs to conduct the analyses are available on request.

4.
The Haseman-Elston method is widely used for the mapping of quantitative-trait loci. However, this method does not use all the information in the data, because it only considers the sib-pair trait-value difference. In addition, the Haseman-Elston method was developed for independent sib pairs; its generalization to nonindependent sib pairs is not straightforward. Here we introduce a score test statistic derived from a normal likelihood based on multiplex sibship data, conditional on identical-by-descent sharing statuses. This score test is asymptotically equivalent to the corresponding likelihood-ratio test, but it is much easier to implement. Because the proposed test uses all of the trait values, it makes more efficient use of the data than does the Haseman-Elston method. The proposed test is naturally applicable to sibships of arbitrary size. The finite-sample properties of the proposed score statistic are evaluated via simulations.

5.
A class of nonparametric statistical methods, including a nonparametric empirical Bayes (EB) method, the Significance Analysis of Microarrays (SAM) and the mixture model method (MMM), has been proposed to detect differential gene expression for replicated microarray experiments. They all depend on constructing a test statistic, for example, a t-statistic, and then using permutation to draw inferences. However, due to special features of microarray data, using standard permutation scores may not estimate the null distribution of the test statistic well, leading to possibly too conservative inferences. We propose a new method of constructing weighted permutation scores to overcome the problem: posterior probabilities of having no differential expression from the EB method are used as weights for genes to better estimate the null distribution of the test statistic. We also propose a weighted method to estimate the false discovery rate (FDR) using the posterior probabilities. Using simulated data and real data for time-course microarray experiments, we show the improved performance of the proposed methods when implemented in MMM, EB and SAM.

6.
MOTIVATION: Time-course microarray experiments are designed to study biological processes in a temporal fashion. Longitudinal gene expression data arise when biological samples taken from the same subject at different time points are used to measure the gene expression levels. It has been observed that the gene expression patterns of samples of a given tumor measured at different time points are likely to be much more similar to each other than are the expression patterns of tumor samples of the same type taken from different subjects. In statistics, this phenomenon is called the within-subject correlation of repeated measurements on the same subject, and the resulting data are called longitudinal data. It is well known in other applications that valid statistical analyses have to appropriately take account of the possible within-subject correlation in longitudinal data. RESULTS: We apply estimating equation techniques to construct a robust statistic, which is a variant of the robust Wald statistic and accounts for the potential within-subject correlation of longitudinal gene expression data, to detect genes with temporal changes in expression. We associate significance levels to the proposed statistic by either incorporating the idea of the significance analysis of microarrays method or using the mixture model method to identify significant genes. The utility of the statistic is demonstrated by applying it to an important study of osteoblast lineage-specific differentiation. Using simulated data, we also show pitfalls in drawing statistical inference when the within-subject correlation in longitudinal gene expression data is ignored.

7.
Summary. A time‐specific log‐linear regression method on quantile residual lifetime is proposed. Under the proposed regression model, any quantile of a time‐to‐event distribution among survivors beyond a certain time point is associated with selected covariates under right censoring. Consistency and asymptotic normality of the regression estimator are established. An asymptotic test statistic is proposed to evaluate the covariate effects on the quantile residual lifetimes at a specific time point. Evaluation of the test statistic does not require estimation of the variance–covariance matrix of the regression estimators, which involves the probability density function of the survival distribution with censoring. Simulation studies are performed to assess finite sample properties of the regression parameter estimator and test statistic. The new regression method is applied to a breast cancer data set with long‐term follow‐up to estimate the patients' median residual lifetimes, adjusting for important prognostic factors.

8.
Parzen M, Lipsitz SR. Biometrics 1999, 55(2): 580-584
In this paper, a global goodness-of-fit test statistic for a Cox regression model, which has an approximate chi-squared distribution when the model has been correctly specified, is proposed. Our goodness-of-fit statistic is global and has power to detect if interactions or higher order powers of covariates in the model are needed. The proposed statistic is similar to the Hosmer and Lemeshow (1980, Communications in Statistics A10, 1043-1069) goodness-of-fit statistic for binary data as well as Schoenfeld's (1980, Biometrika 67, 145-153) statistic for the Cox model. The methods are illustrated using data from a Mayo Clinic trial in primary biliary cirrhosis of the liver (Fleming and Harrington, 1991, Counting Processes and Survival Analysis), in which the outcome is the time until liver transplantation or death. There are 17 possible covariates. Two Cox proportional hazards models are fit to the data, and the proposed goodness-of-fit statistic is applied to the fitted models.

9.
Although the Kolmogorov-Smirnov (KS) statistic is widely used, it has weaknesses when applied to abrupt change point (CP) problems: for example, it is time-consuming and sometimes invalid. To detect abrupt changes in a time series quickly, a novel method (HWKS) is proposed based on the Haar wavelet (HW) and the KS statistic. First, two binary search trees (BSTs), termed TcA and TcD, are constructed by multi-level HW from a diagnosed time series; the framework of the HWKS method is implemented by introducing a modified KS statistic and two search rules based on the two BSTs; and fast CP detection is then implemented by two HWKS-based algorithms. Second, the performance of HWKS is evaluated on a simulated time series dataset. The simulations show that HWKS is faster, more sensitive and more efficient than the KS, HW, and T methods. Last, HWKS is applied to electrocardiogram (ECG) time series; the experimental results show that the proposed method finds abrupt changes in the ECG segment with maximal data fluctuation more quickly and efficiently, and that it is very helpful for inspecting and diagnosing the different states of health from a patient's ECG signal.
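For orientation, the plain KS baseline that this abstract calls time-consuming can be sketched as an exhaustive scan over split points. This is not the HWKS algorithm: it has no Haar wavelet step, no search trees, and no modified statistic. The function names, the `min_size` parameter, and the toy series are assumptions made for the example.

```python
def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_change_point(series, min_size=2):
    """Return (index, D) for the split series[:index] / series[index:]
    with the largest KS statistic. Every admissible split is examined,
    which is why this baseline is slow on long series."""
    best_index, best_d = None, -1.0
    for k in range(min_size, len(series) - min_size + 1):
        d = ks_statistic(series[:k], series[k:])
        if d > best_d:
            best_index, best_d = k, d
    return best_index, best_d


# Hypothetical series with an abrupt level shift after the fifth value.
signal = [0.1, 0.2, 0.0, 0.3, 0.15, 5.1, 5.0, 5.3, 4.9, 5.2]
change_index, d_max = ks_change_point(signal)
```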

10.
It is very common in regression analysis to encounter incompletely observed covariate information. A recent approach to analyse such data is weighted estimating equations (Robins, J. M., Rotnitzky, A. and Zhao, L. P. (1994), JASA, 89, 846-866, and Zhao, L. P., Lipsitz, S. R. and Lew, D. (1996), Biometrics, 52, 1165-1182). With weighted estimating equations, the contribution to the estimating equation from a complete observation is weighted by the inverse of the probability of being observed. We propose a test statistic to assess if the weighted estimating equations produce biased estimates. Our test statistic is similar to the test statistic proposed by DuMouchel and Duncan (1983) for weighted least squares estimates for sample survey data. The method is illustrated using data from a randomized clinical trial on chemotherapy for multiple myeloma.
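The inverse-probability weighting described above can be illustrated with the simplest estimating equation, the one for a mean, where each complete observation contributes (y - mu) / p and the equation is solved for mu. This is a deliberately reduced sketch rather than the regression setting or the proposed test from the abstract. The function name and the toy numbers are assumptions, and the observation probabilities are taken as known.

```python
def ipw_mean(values, observed, prob_observed):
    """Solve sum_i r_i / p_i * (y_i - mu) = 0 for mu.

    values        -- outcomes (ignored where not observed)
    observed      -- True if the record is complete
    prob_observed -- assumed-known probability of being observed
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for y, r, p in zip(values, observed, prob_observed):
        if r:
            w = 1.0 / p  # inverse of the probability of being observed
            weighted_sum += w * y
            weight_total += w
    return weighted_sum / weight_total


# Hypothetical data: the third record is incomplete and contributes nothing.
mu_hat = ipw_mean(
    values=[1.0, 2.0, None, 4.0],
    observed=[True, True, False, True],
    prob_observed=[0.5, 1.0, 0.5, 0.25],
)
```

Records that were unlikely to be observed get large weights, so they stand in for similar records that went missing.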

11.
Guan Y. Biometrics 2008, 64(3): 800-806
Summary. We propose a formal method to test stationarity for spatial point processes. The proposed test statistic is based on the integrated squared deviations of observed counts of events from their means estimated under stationarity. We show that the resulting test statistic converges in distribution to a functional of a two-dimensional Brownian motion. To conduct the test, we compare the calculated statistic with the upper tail critical values of this functional. Our method requires only a weak dependence condition on the process but does not assume any parametric model for it. As a result, it can be applied to a wide class of spatial point process models. We study the efficacy of the test through both simulations and applications to two real data examples that were previously suspected to be nonstationary based on graphical evidence. Our test formally confirmed the suspected nonstationarity for both data sets.

12.
Recent developments such as the more powerful quasi-likelihood score test (MQLS) statistic have enabled efficient association analysis with related samples. Although these approaches are robust against a misspecified phenotypic distribution and covariance structure, it has been shown that the MQLS statistic becomes invalid in the presence of population substructure if the level of population substructure depends on the genomic location. In this report, we propose a new statistical method which combines the EIGENSTRAT approach and the MQLS statistic. The proposed method was evaluated with simulated data under various scenarios, and we found that it performs better than traditional methods such as the transmission disequilibrium test. The proposed method was applied to genetic association analysis for body mass index in the Framingham Heart Study, and we found that rs1121980 and rs9940128 in the linkage block in the FTO gene are associated with body mass index.

13.
In medical research, investigators are often interested in inferring time‐to‐event distributions under competing risks. It is well known, however, that the naive approach based on the Kaplan–Meier method to estimate the proportion of cause‐specific events overestimates the true quantity. In this paper, we show that the quantile residual life function, a natural and popular summary measure of survival data, could also be seriously affected by the competing events. An existing two‐sample test statistic for inference on median residual life is modified for competing risks data, which does not involve estimation of the improper probability density function of the subdistribution of cause‐specific events under censoring. Simulation results demonstrate that the test statistic controls the type 1 error probabilities reasonably well. The proposed method is applied to a real data example from a large‐scale phase III breast cancer study.

14.
An efficient recursive polynomial multiplication method is proposed for exact unconditional power calculation for unordered 2 × K contingency tables with up to moderate sample sizes. Our method can be applied to the family of cell-additive statistics, which includes the Freeman-Halton statistic, the Pearson χ² statistic and the likelihood ratio statistic. We illustrate our proposed method by several numerical examples.

15.
MOTIVATION: The parametric F-test has been widely used in the analysis of factorial microarray experiments to assess treatment effects. However, the normality assumption is often untenable for microarray experiments with small replications. Therefore, permutation-based methods are called upon to assess the statistical significance. The distribution of the F-statistics across all the genes on the array can be regarded as a mixture distribution, with a proportion of statistics generated from the null distribution of no differential gene expression and the remaining proportion generated from the alternative distribution of differentially expressed genes. As a result, the permutation distribution of the F-statistics may not approximate the true null distribution of the F-statistics well. Therefore, the construction of a proper null statistic to better approximate the null distribution of the F-statistic is of great importance to permutation-based multiple testing in microarray data analysis. RESULTS: In this paper, we extend the ideas of constructing null statistics based on pairwise differences to neglect the treatment effects from the two-sample comparison problem to the multifactorial balanced or unbalanced microarray experiments. A null statistic based on a subpartition method is proposed and its distribution is employed to approximate the null distribution of the F-statistic. The proposed null statistic is able to accommodate unbalance in the design and is also corrected for the undue correlation between its numerator and denominator. In the simulation studies and real biological data analysis, the number of true positives and the false discovery rate (FDR) of the proposed null statistic are compared with those of the permuted version of the F-statistic. It has been shown that our proposed method has better control of the FDR and higher power than the standard permutation method to detect differentially expressed genes because of the better approximated tail probabilities.

16.
Nonparametric inference on median residual life function (total citations: 1; self-citations: 0; citations by others: 1)
Summary. A simple approach to the estimation of the median residual lifetime is proposed for a single group by inverting a function of the Kaplan–Meier estimators. A test statistic is proposed to compare two median residual lifetimes at any fixed time point. The test statistic does not involve estimation of the underlying probability density function of failure times under censoring. Extensive simulation studies are performed to validate the proposed test statistic in terms of type I error probabilities and powers at various time points. One of the oldest data sets from the National Surgical Adjuvant Breast and Bowel Project (NSABP), which has more than a quarter century of follow-up, is used to illustrate the method. The analysis results indicate that, without systematic post-operative therapy, a significant difference in median residual lifetimes between node-negative and node-positive breast cancer patients persists for about 10 years after surgery. The new estimates of the median residual lifetime could serve as a baseline for physicians to explain any incremental effects of post-operative treatments in terms of delaying breast cancer recurrence or prolonging remaining lifetimes of breast cancer patients.
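The idea of inverting the Kaplan–Meier estimator can be sketched for a single group as follows. It assumes the usual definition of the median residual lifetime at time t0 (the smallest theta with S(t0 + theta) <= 0.5 * S(t0)) and should not be read as the authors' exact estimator or test. The function names and the toy data are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier curve as [(t, S(t))] at each distinct event time.

    events[i] is 1 for an observed failure and 0 for a censored time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def survival_at(curve, t):
    """Value of the right-continuous step function S at time t."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


def median_residual_life(times, events, t0):
    """Smallest theta with S(t0 + theta) <= 0.5 * S(t0), or None if the
    curve never drops that far (the median is then not estimable)."""
    curve = kaplan_meier(times, events)
    target = 0.5 * survival_at(curve, t0)
    for time, surv in curve:
        if time > t0 and surv <= target:
            return time - t0
    return None


# Hypothetical follow-up times; the third subject is censored at time 3.
follow_up = [1, 2, 3, 4, 5, 6, 7]
status = [1, 1, 0, 1, 1, 1, 1]
theta_hat = median_residual_life(follow_up, status, t0=0)
```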

17.
Adaptive two‐stage designs allow a data‐driven change of design characteristics during the ongoing trial. One of the available options is an adaptive choice of the test statistic for the second stage of the trial based on the results of the interim analysis. Since there is often only a vague knowledge of the distribution shape of the primary endpoint in the planning phase of a study, a change of the test statistic may then be considered if the data indicate that the assumptions underlying the initial choice of the test are not correct. Collings and Hamilton proposed a bootstrap method for the estimation of the power of the two‐sample Wilcoxon test for shift alternatives. We use this approach for the selection of the test statistic. By means of a simulation study, we show that the gain in terms of power may be considerable when the initial assumption about the underlying distribution was wrong, whereas the loss is relatively small when in the first instance the optimal test statistic was chosen. The results also hold true for comparison with a one‐stage design. Application of the method is illustrated by a clinical trial example.

18.
On the use of the variogram in checking for independence in spatial data (total citations: 1; self-citations: 0; citations by others: 1)
Diblasi A, Bowman AW. Biometrics 2001, 57(1): 211-218
The variogram is a standard tool in the analysis of spatial data, and its shape provides useful information on the form of spatial correlation that may be present. However, it is also useful to be able to assess the evidence for the presence of any spatial correlation. A method of doing this, based on an assessment of whether the true function underlying the variogram is constant, is proposed. Nonparametric smoothing of the squared differences of the observed variables, on a suitably transformed scale, is used to estimate variogram shape. A statistic based on a ratio of quadratic forms is proposed and the test is constructed by investigating the distributional properties of this statistic under the assumption of an independent Gaussian process. The power of the test is investigated. Reference bands are proposed as a graphical follow-up. An example is discussed.
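As background for the abstract above, the classical binned (Matheron) empirical variogram averages half the squared differences of all pairs of observations whose separation falls in the same distance bin. This stands in for, and is simpler than, the smoothing on a transformed scale that the authors use. The function name, bin layout, and toy data are assumptions made for the example.

```python
from math import dist


def empirical_variogram(coords, values, bin_width, n_bins):
    """Classical binned variogram estimate.

    Bin k collects pairs whose distance lies in
    [k * bin_width, (k + 1) * bin_width). Empty bins yield None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            k = int(dist(coords[i], coords[j]) // bin_width)
            if k < n_bins:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Hypothetical observations at three sites on a line.
sites = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
readings = [1.0, 2.0, 4.0]
gamma = empirical_variogram(sites, readings, bin_width=1.0, n_bins=3)
```

Under spatial independence the estimates should fluctuate around a constant level across bins, which is the hypothesis the abstract's test examines.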

19.
Pathway analysis of microarray data evaluates gene expression profiles of a priori defined biological pathways in association with a phenotype of interest. We propose a unified pathway-analysis method that can be used for diverse phenotypes including binary, multiclass, continuous, count, rate, and censored survival phenotypes. The proposed method also allows covariate adjustments and correlation in the phenotype variable that is encountered in longitudinal, cluster-sampled, and paired designs. These are accomplished by combining the regression-based test statistic for each individual gene in a pathway of interest into a pathway-level test statistic. Applications of the proposed method are illustrated with two real pathway-analysis examples: one evaluating relapse-associated gene expression involving a matched-pair binary phenotype in children with acute lymphoblastic leukemia; and the other investigating gene expression in breast cancer tissues in relation to patients' survival (a censored survival phenotype). Implementations for various phenotypes are available in R. Additionally, an Excel Add-in for a user-friendly interface is currently being developed.

20.
The most widely used statistical methods for finding differentially expressed genes (DEGs) are essentially univariate. In this study, we present a new T² statistic for analyzing microarray data. We implemented our method using a multiple forward search (MFS) algorithm that is designed for selecting a subset of feature vectors in high-dimensional microarray datasets. The proposed T² statistic is a corollary to that originally developed for multivariate analyses and possesses two prominent statistical properties. First, our method takes into account the multidimensional structure of microarray data. The utilization of the information hidden in gene interactions allows for finding genes whose differential expressions are not marginally detectable in univariate testing methods. Second, the statistic has a close relationship to discriminant analyses for classification of gene expression patterns. Our search algorithm sequentially maximizes gene expression difference/distance between two groups of genes. Including such a set of DEGs into initial feature variables may increase the power of classification rules. We validated our method by using a spike-in HGU95 dataset from Affymetrix. The utility of the new method was demonstrated by application to the analyses of gene expression patterns in human liver cancers and breast cancers. Extensive bioinformatics analyses and cross-validation of DEGs identified in the application datasets showed the significant advantages of our new algorithm.
