Improved Estimation of the Noncentrality Parameter Distribution from a Large Number of t‐Statistics, with Applications to False Discovery Rate Estimation in Microarray Data Analysis |
| |
Authors: | Long Qu, Dan Nettleton, Jack C. M. Dekkers |
| |
Institution: | 1. Biostat Solutions, Inc., 114 South Main Street Suite 2, Mount Airy, Maryland 21771, U.S.A.; 2. Department of Statistics, Iowa State University, 2115 Snedecor Hall, Ames, Iowa 50011, U.S.A.; 3. Department of Animal Science, Iowa State University, 239D Kildee Hall, Ames, Iowa 50011, U.S.A. |
| |
Abstract: | Given a large number of t‐statistics, we consider the problem of approximating the distribution of noncentrality parameters (NCPs) by a continuous density. This problem is closely related to the control of false discovery rates (FDR) in massive hypothesis testing applications, e.g., microarray gene expression analysis. Our methodology is similar to, but improves upon, the existing approach by Ruppert, Nettleton, and Hwang (2007, Biometrics, 63, 483–495). We provide parametric, nonparametric, and semiparametric estimators for the distribution of NCPs, as well as estimates of the FDR and local FDR. In the parametric situation, we assume that the NCPs follow a distribution that leads to an analytically available marginal distribution for the test statistics. In the nonparametric situation, we use convex combinations of basis density functions to estimate the density of the NCPs. A sequential quadratic programming procedure is developed to maximize the penalized likelihood. The smoothing parameter is selected with the approximate network information criterion. A semiparametric estimator is also developed to combine both parametric and nonparametric fits. Simulations show that, under a variety of situations, our density estimates are closer to the underlying truth and our FDR estimates are improved compared with alternative methods. Data‐based simulations and the analyses of two microarray datasets are used to evaluate the performance in realistic situations. |
| |
Keywords: | Density estimation; False discovery rates; Microarray; Noncentrality parameter |
|
|