Similar literature
1.
Summary: Microarray gene expression studies over ordered categories are routinely conducted to gain insights into biological functions of genes and the underlying biological processes. Some common experiments are time‐course/dose‐response experiments where a tissue or cell line is exposed to a chemical at different doses and/or for different durations. A goal of such studies is to identify gene expression patterns/profiles over the ordered categories. This problem can be formulated as a multiple testing problem where for each gene the null hypothesis of no difference between the successive mean gene expressions is tested and further directional decisions are made if it is rejected. Many of the existing multiple testing procedures are devised for controlling the usual false discovery rate (FDR) rather than the mixed directional FDR (mdFDR), the expected proportion of Type I and directional errors among all rejections. Benjamini and Yekutieli (2005, Journal of the American Statistical Association 100, 71–93) proved that an augmentation of the usual Benjamini–Hochberg (BH) procedure can control the mdFDR while testing simple null hypotheses against two‐sided alternatives in terms of one‐dimensional parameters. In this article, we consider the problem of controlling the mdFDR involving multidimensional parameters. To deal with this problem, we develop a procedure extending that of Benjamini and Yekutieli based on the Bonferroni test for each gene. A proof is given for its mdFDR control when the underlying test statistics are independent across the genes. The results of a simulation study evaluating its performance under independence as well as under dependence of the underlying test statistics across the genes relative to other relevant procedures are reported. Finally, the proposed methodology is applied to time‐course microarray data obtained by Lobenhofer et al. (2002, Molecular Endocrinology 16, 1215–1229). We identified several important cell‐cycle genes, such as DNA replication/repair gene MCM4 and replication factor subunit C2, which were not identified by the previous analyses of the same data by Lobenhofer et al. (2002) and Peddada et al. (2003, Bioinformatics 19, 834–841). Although some of our findings overlap with previous findings, we identify several other genes that complement the results of Lobenhofer et al. (2002).

2.
Summary: Given a large number of t‐statistics, we consider the problem of approximating the distribution of noncentrality parameters (NCPs) by a continuous density. This problem is closely related to the control of false discovery rates (FDR) in massive hypothesis testing applications, e.g., microarray gene expression analysis. Our methodology is similar to, but improves upon, the existing approach by Ruppert, Nettleton, and Hwang (2007, Biometrics, 63, 483–495). We provide parametric, nonparametric, and semiparametric estimators for the distribution of NCPs, as well as estimates of the FDR and local FDR. In the parametric situation, we assume that the NCPs follow a distribution that leads to an analytically available marginal distribution for the test statistics. In the nonparametric situation, we use convex combinations of basis density functions to estimate the density of the NCPs. A sequential quadratic programming procedure is developed to maximize the penalized likelihood. The smoothing parameter is selected with the approximate network information criterion. A semiparametric estimator is also developed to combine both parametric and nonparametric fits. Simulations show that, under a variety of situations, our density estimates are closer to the underlying truth and our FDR estimates are improved compared with alternative methods. Data‐based simulations and the analyses of two microarray datasets are used to evaluate the performance in realistic situations.

3.
We consider multiple testing with false discovery rate (FDR) control when p values have discrete and heterogeneous null distributions. We propose a new estimator of the proportion of true null hypotheses and demonstrate that it is less upwardly biased than Storey's estimator and two other estimators. The new estimator induces two adaptive procedures, that is, an adaptive Benjamini–Hochberg (BH) procedure and an adaptive Benjamini–Hochberg–Heyse (BHH) procedure. We prove that the adaptive BH (aBH) procedure is conservative nonasymptotically. Through simulation studies, we show that these procedures are usually more powerful than their nonadaptive counterparts and that the adaptive BHH procedure is usually more powerful than the aBH procedure and a procedure based on randomized p‐values. The adaptive procedures are applied to a study of HIV vaccine efficacy, where they identify more differentially polymorphic positions than the BH procedure at the same FDR level.
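For orientation, the baseline that this abstract compares against can be sketched as follows. This is a minimal illustration of the standard form of Storey's estimator of the proportion of true nulls, not the new estimator proposed in the paper; the function name `storey_pi0` and the default tuning value `lam=0.5` are assumptions made for this example.

```python
def storey_pi0(p_values, lam=0.5):
    """Standard Storey-type estimate of the proportion of true nulls.

    Illustrative sketch only: the name and the default lam are assumptions,
    and this is NOT the discrete-data estimator proposed in the abstract.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("need at least one p-value")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    # p-values above lam are assumed to come mostly from true nulls,
    # which are (super-)uniform on [0, 1].
    count_above = sum(1 for p in p_values if p > lam)
    # Rescale by the width of (lam, 1] and cap the estimate at 1.
    return min(1.0, count_above / (m * (1.0 - lam)))
```

With discrete p-values the null distributions are not uniform, which is one reason an estimator of this form can be upwardly biased in the setting the abstract describes.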

4.
Multiple testing (MT) with false discovery rate (FDR) control has been widely conducted in the “discrete paradigm” where p-values have discrete and heterogeneous null distributions. However, in this scenario existing FDR procedures often lose some power and may yield unreliable inference, and for this scenario there does not seem to be an FDR procedure that partitions hypotheses into groups, employs data-adaptive weights and is nonasymptotically conservative. We propose a weighted p-value-based FDR procedure, “weighted FDR (wFDR) procedure” for short, for MT in the discrete paradigm that efficiently adapts to both heterogeneity and discreteness of p-value distributions. We theoretically justify the nonasymptotic conservativeness of the wFDR procedure under independence, and show via simulation studies that, for MT based on p-values of the binomial test or Fisher's exact test, it is more powerful than six other procedures. The wFDR procedure is applied to two examples based on discrete data, a drug safety study and a differential methylation study, where it makes more discoveries than two existing methods.

5.
The paper is concerned with expected type I errors of some stepwise multiple test procedures based on independent p‐values controlling the so‐called false discovery rate (FDR). We derive an asymptotic result for the supremum of the expected type I error rate (EER) when the number of hypotheses tends to infinity. Among other results, it will be shown that when the original Benjamini‐Hochberg step‐up procedure controls the FDR at level α, its EER may approach a value slightly larger than α/4 when the number of hypotheses increases. Moreover, we derive some least favourable parameter configuration results, some bounds for the FDR and the EER as well as easily computable formulae for the familywise error rate (FWER) of two FDR‐controlling procedures. Finally, we discuss some undesirable properties of the FDR concept, especially the problem of cheating.

6.
In MS‐based quantitative proteomics, the FDR control (i.e. the limitation of the number of proteins that are wrongly claimed as differentially abundant between several conditions) is a major postanalysis step. It is classically achieved with a specific statistical procedure that computes the adjusted p‐values of the putative differentially abundant proteins. Unfortunately, such adjustment is conservative only if the p‐values are well‐calibrated; otherwise, the false discovery control is spuriously underestimated. However, well‐calibration is a property that can be violated in some practical cases. To overcome this limitation, we propose a graphical method to straightforwardly and visually assess the p‐value well‐calibration, as well as the R codes to embed it in any pipeline. All MS data have been deposited in the ProteomeXchange with identifier PXD002370 ( http://proteomecentral.proteomexchange.org/dataset/PXD002370 ).

7.
Many recently developed nonparametric jump tests can be viewed as multiple hypothesis testing problems. For such multiple hypothesis tests, it is well known that controlling the type I error alone often yields a large proportion of erroneous rejections, and the situation becomes even worse when the jump occurrence is a rare event. To obtain more reliable results, we aim to control the false discovery rate (FDR), an efficient compound error measure for erroneous rejections in multiple testing problems. We perform the test via the Barndorff-Nielsen and Shephard (BNS) test statistic, and control the FDR with the Benjamini and Hochberg (BH) procedure. We provide asymptotic results for the FDR control. From simulations, we examine relevant theoretical results and demonstrate the advantages of controlling the FDR. The hybrid approach is then applied to an empirical analysis of two benchmark stock indices with high-frequency data.
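The Benjamini and Hochberg (BH) step-up procedure named in this abstract (and in several others on this page) can be sketched in a few lines. This is a generic illustration of the textbook procedure, not code from any of the listed papers; the function name `benjamini_hochberg` and the default level `alpha=0.05` are assumptions made for this example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Textbook BH step-up procedure (illustrative sketch).

    Returns a list of booleans, True where the hypothesis is rejected.
    The function name and default alpha are assumptions for this example.
    """
    m = len(p_values)
    # Indices of the p-values sorted in increasing order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the LARGEST rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max smallest p-values, even if some of them
    # individually exceed their own threshold.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected
```

Because the rule looks for the largest qualifying rank, a p-value that misses its own threshold can still be rejected when a larger p-value meets its threshold; with p-values 0.01, 0.03, 0.035, 0.04 at level 0.05, all four hypotheses are rejected.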

8.
The use of multiple hypothesis testing procedures has recently received a lot of attention from statisticians in DNA microarray analysis. The traditional FWER controlling procedures are not very useful in this situation since the experiments are exploratory by nature and researchers are more interested in controlling the rate of false positives rather than controlling the probability of making a single erroneous decision. This has led to increased use of FDR (False Discovery Rate) controlling procedures. Genovese and Wasserman proposed a single-step FDR procedure that is an asymptotic approximation to the original Benjamini and Hochberg stepwise procedure. In this paper, we modify the Genovese-Wasserman procedure to force the FDR control closer to the level alpha in the independence setting. Assuming that the data come from a mixture of two normals, we also propose to make this procedure adaptive by first estimating the parameters using the EM algorithm and then plugging these estimated parameters into the above modification of the Genovese-Wasserman procedure. We compare this procedure with the original Benjamini-Hochberg and the SAM thresholding procedures. The FDR control and other properties of this adaptive procedure are verified numerically.

9.
Summary: In a microarray experiment, one experimental design is used to obtain expression measures for all genes. One popular analysis method involves fitting the same linear mixed model for each gene, obtaining gene‐specific p‐values for tests of interest involving fixed effects, and then choosing a threshold for significance that is intended to control false discovery rate (FDR) at a desired level. When one or more random factors have zero variance components for some genes, the standard practice of fitting the same full linear mixed model for all genes can result in failure to control FDR. We propose a new method that combines results from the fit of full and selected linear mixed models to identify differentially expressed genes and provide FDR control at target levels when the true underlying random effects structure varies across genes.

10.
In genome-wide genetic studies with a large number of markers, balancing the type I error rate and power is a challenging issue. Recently proposed false discovery rate (FDR) approaches are promising solutions to this problem. Using the 100 simulated datasets of a genome-wide marker map spaced about 3 cM and phenotypes from the Genetic Analysis Workshop 14, we studied the type I error rate and power of Storey's FDR approach, and compared it to the traditional Bonferroni procedure. We confirmed that Storey's FDR approach provided strong control of the FDR. We found that Storey's FDR approach only provided weak control of the family-wise error rate (FWER). For these simulated datasets, Storey's FDR approach only had slightly higher power than the Bonferroni procedure. In conclusion, Storey's FDR approach is more powerful than the Bonferroni procedure if strong control of FDR or weak control of FWER is desired. Storey's FDR approach has little power advantage over the Bonferroni procedure if there is low linkage disequilibrium among the markers. Further evaluation of the type I error rate and power of the FDR approaches for higher linkage disequilibrium and for haplotype analyses is warranted.

11.
LC‐MS experiments can generate large quantities of data, for which a variety of database search engines are available to make peptide and protein identifications. Decoy databases are becoming widely used to place statistical confidence in result sets, allowing the false discovery rate (FDR) to be estimated. Different search engines produce different identification sets, so employing more than one search engine could result in an increased number of peptides (and proteins) being identified, if an appropriate mechanism for combining data can be defined. We have developed a search‐engine‐independent score, based on FDR, which allows peptide identifications from different search engines to be combined, called the FDR Score. The results demonstrate that the observed FDR is significantly different when analysing the set of identifications made by all three search engines, by each pair of search engines or by a single search engine. Our algorithm assigns identifications to groups according to the set of search engines that have made the identification, and re‐assigns the score (combined FDR Score). The combined FDR Score can differentiate between correct and incorrect peptide identifications with high accuracy, allowing on average 35% more peptide identifications to be made at a fixed FDR than using a single search engine.
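The decoy-based FDR estimate that this abstract builds on can be illustrated with a small sketch. It uses one common convention (decoy hits divided by target hits above a score threshold) and does not reproduce the paper's FDR Score or combined FDR Score; the function name `decoy_fdr`, the `(score, is_decoy)` tuple layout, and the higher-is-better score direction are all assumptions made for this example.

```python
def decoy_fdr(matches, threshold):
    """Estimate the FDR among target matches scoring at or above threshold.

    Illustrative sketch only. `matches` is a list of (score, is_decoy)
    pairs, with higher scores assumed to be better. The estimate used here
    is (#decoy hits) / (#target hits); other conventions exist.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    if targets == 0:
        # No accepted target identifications, so nothing can be false.
        return 0.0
    # Decoy hits stand in for the unknown number of false target hits.
    return min(1.0, decoys / targets)
```

For example, with four target hits and one decoy hit above the threshold, the estimated FDR is 0.25.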

12.
Aflatoxins are polyaromatic mycotoxins that contaminate a range of food crops as a result of fungal growth and contribute to serious health problems in the developing world because of their toxicity and mutagenicity. Although relatively resistant to biotic degradation, aflatoxins can be metabolized by certain species of Actinomycetales. However, the enzymatic basis for their breakdown has not been reported until now. We have identified nine Mycobacterium smegmatis enzymes that utilize the deazaflavin cofactor F420H2 to catalyse the reduction of the α,β‐unsaturated ester moiety of aflatoxins, activating the molecules for spontaneous hydrolysis and detoxification. These enzymes belong to two previously uncharacterized F420H2‐dependent reductase (FDR‐A and ‐B) families that are distantly related to the flavin mononucleotide (FMN)‐dependent pyridoxamine 5′‐phosphate oxidases (PNPOxs). We have solved crystal structures of an enzyme from each FDR family and show that they, like the PNPOxs, adopt a split barrel protein fold, although the FDRs also possess an extended and highly charged F420H2 binding groove. A general role for these enzymes in xenobiotic metabolism is discussed, including the observation that the nitro‐reductase Rv3547 from Mycobacterium tuberculosis that is responsible for the activation of bicyclic nitroimidazole prodrugs belongs to the FDR‐A family.

13.
A widespread phenomenon in migrant birds is that they travel faster in spring than in autumn. During migration birds spend most time at stopover sites and, correspondingly, the faster spring migration is mainly explained by shorter stopovers in spring than autumn. Because a main purpose of stopovers is to replenish the fuel used in flight, a higher rate of fuel deposition (FDR) in spring is thought to explain the shorter stopovers and hence shorter total duration of migration in spring. Critical migratory processes, including the onset and extent of pre‐migratory fueling, are endogenously regulated. It is therefore not unlikely that refueling at stopover sites is, at least partly, also under endogenous control. We here tested whether there is an endogenous seasonal difference in food intake and FDR, which could contribute to shorter stopovers and hence faster migration in spring. We measured daily food intake and daily FDR in two subspecies of the northern wheatear Oenanthe oenanthe, temporarily confined at stopover under identical constant indoor conditions in spring and autumn. The two wheatear subspecies differed markedly in absolute food intake and FDR. Within subspecies, however, food intake and FDR did not differ between spring and autumn, indicating that faster spring migration in northern wheatears is not explained by an endogenously controlled seasonal difference in birds’ motivation to refuel. To further substantiate this claim, similar measurements should be taken at other locations along northern wheatears’ migration routes. Comparable experiments in other species could test the generality of our results.

14.
In many applications where it is necessary to test multiple hypotheses simultaneously, the data encountered are discrete. In such cases, it is important for multiplicity adjustment to take into account the discreteness of the distributions of the p‐values, to assure that the procedure is not overly conservative. In this paper, we review some known multiple testing procedures for discrete data that control the familywise error rate, the probability of making any false rejection. Taking advantage of the fact that the exact permutation or exact pairwise permutation distributions of the p‐values can often be determined when the sample size is small, we investigate procedures that incorporate the dependence structure through the exact permutation distribution and propose two new procedures that incorporate the exact pairwise permutation distributions. A step‐up procedure is also proposed that accounts for the discreteness of the data. The performance of the proposed procedures is investigated through simulation studies and two applications. The results show that by incorporating both discreteness and dependency of p‐value distributions, gains in power can be achieved.

15.
One of the multiple testing problems in drug-finding experiments is the comparison of several treatments with one control. In this paper we discuss a particular situation of such an experiment, i.e., a microarray setting, where the many-to-one comparisons need to be addressed for thousands of genes simultaneously. For a gene-specific analysis, Dunnett's single-step procedure is considered for the within-gene tests, while FDR-controlling procedures such as Significance Analysis of Microarrays (SAM) and the Benjamini and Hochberg (BH) False Discovery Rate (FDR) adjustment are applied to control the error rate across genes. The method is applied to a microarray experiment with four treatment groups (three microarrays in each group) and 16,998 genes. Simulation studies are conducted to investigate the performance of the SAM method and the BH-FDR procedure with regard to controlling the FDR, and to investigate the effect of small-variance genes on the FDR in the SAM procedure.

16.
Benjamini Y, Heller R. Biometrics 2008, 64(4): 1215-1222
SUMMARY: We consider the problem of testing for partial conjunction of hypotheses, which argues that at least u out of n tested hypotheses are false. It offers an in-between approach to the testing of the conjunction of null hypotheses against the alternative that at least one is not, and the testing of the disjunction of null hypotheses against the alternative that all hypotheses are not null. We suggest powerful test statistics for testing such a partial conjunction hypothesis that are valid under dependence between the test statistics as well as under independence. We then address the problem of testing many partial conjunction hypotheses simultaneously using the false discovery rate (FDR) approach. We prove that if the FDR controlling procedure in Benjamini and Hochberg (1995, Journal of the Royal Statistical Society, Series B 57, 289-300) is used for this purpose the FDR is controlled under various dependency structures. Moreover, we can screen at all levels simultaneously in order to display the findings on a superimposed map and still control an appropriate FDR measure. We apply the method to examples from microarray analysis and functional magnetic resonance imaging (fMRI), two application areas where the need for partial conjunction analysis has been identified.
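To make the partial conjunction idea concrete, the sketch below computes a Bonferroni-style p-value for the claim that at least u of n hypotheses are false: (n - u + 1) times the u-th smallest p-value, capped at 1. This is offered as one simple pooling rule that is commonly associated with partial conjunction testing; the abstract does not state which test statistics the paper uses, so treating this as the paper's method is an assumption, as is the function name `partial_conjunction_pvalue`.

```python
def partial_conjunction_pvalue(p_values, u):
    """Bonferroni-style p-value for 'at least u of n hypotheses are false'.

    Illustrative sketch under stated assumptions; the name is hypothetical
    and the abstract does not confirm this exact statistic.
    """
    n = len(p_values)
    if not 1 <= u <= n:
        raise ValueError("u must satisfy 1 <= u <= n")
    # u-th smallest p-value (1-indexed), i.e. p_(u).
    p_u = sorted(p_values)[u - 1]
    # Bonferroni correction over the n - u + 1 largest p-values.
    return min(1.0, (n - u + 1) * p_u)
```

The two extremes match the cases described in the abstract: u = 1 gives the usual Bonferroni test of the global null (n times the smallest p-value), and u = n gives the largest p-value, which tests whether all n hypotheses are false.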

17.
The Newman-Keuls (NK) procedure for testing all pairwise comparisons among a set of treatment means, introduced by Newman (1939) and in a slightly different form by Keuls (1952), was proposed as a reasonable way to alleviate the inflation of error rates when a large number of means are compared. It was proposed before the concepts of different types of multiple error rates were introduced by Tukey (1952a, b; 1953). Although it was popular in the 1950s and 1960s, once control of the familywise error rate (FWER) was accepted generally as an appropriate criterion in multiple testing, and it was realized that the NK procedure does not control the FWER at the nominal level at which it is performed, the procedure gradually fell out of favor. Recently, a more liberal criterion, control of the false discovery rate (FDR), has been proposed as more appropriate in some situations than FWER control. This paper notes that the NK procedure and a nonparametric extension control the FWER within any set of homogeneous treatments. It proves that the extended procedure controls the FDR when there are well-separated clusters of homogeneous means and between-cluster test statistics are independent, and extensive simulation provides strong evidence that the original procedure controls the FDR under the same conditions and some dependent conditions when the clusters are not well-separated. Thus, the test has two desirable error-controlling properties, providing a compromise between FDR control with no subgroup FWER control and global FWER control. Yekutieli (2002) developed an FDR-controlling procedure for testing all pairwise differences among means, without any FWER-controlling criteria when there is more than one cluster. The empirical example in Yekutieli's paper was used to compare the Benjamini-Hochberg (1995) method with apparent FDR control in this context, Yekutieli's proposed method with proven FDR control, the Newman-Keuls method that controls FWER within equal clusters with apparent FDR control, and several methods that control FWER globally. The Newman-Keuls method is shown to be intermediate in number of rejections between the FWER-controlling methods and the FDR-controlling methods in this example, although it is not always more conservative than the other FDR-controlling methods.

18.
Various characteristics of a long‐distance acoustic signal have been shown to vary to different degrees. It has been suggested that female preferences based on stable song parameters are stabilising or weakly directional, and preferences based on variable parameters are strongly directional. We tested this hypothesis based on a short‐distance signal (courtship song) produced by the field cricket, Gryllus bimaculatus. We studied the degree of variability of different courtship song parameters and the behavioural importance of several parameters using synthesised song models in playback experiments. We found that most of the courtship song elements of G. bimaculatus were quite variable (coefficient of variation, CV, in the range of 20–53%). The most variable parameter of the courtship song was the relative amplitude of two elements: high‐amplitude ticks and low‐amplitude pulses. Because songs containing only ticks (of rare occurrence) appeared to be more effective than songs with both ticks and pulses (of frequent occurrence), we consider female preferences to be directional. Alteration of less variable traits, such as the carrier frequency and duration of ticks (CV = 20–25%), had a different effect on female responsiveness. The synthesised songs with different carrier frequencies of ticks were as attractive to females as the positive control (courtship of muted males accompanied by playback of the recorded song). Altering the duration of ticks had a crucial effect on the female response rate, decreasing female responsiveness to the level observed in the negative control (courtship of muted males). Thus, we did not find a strong relationship between the variability of individual song parameters and their potential importance in song recognition and the evaluation of male quality. The partial inconsistency of our results with the data of other authors may be due to different patterns of past and current selection on long‐distance and short‐distance acoustic signals.  

19.
Recently, Efron (2007) provided methods for assessing the effect of correlation on false discovery rate (FDR) in large‐scale testing problems in the context of microarray data. Although the FDR procedure does not require independence of the tests, the existence of correlation can lead to gross under‐ or overestimation of the number of critical genes. Here, we briefly review Efron's method and apply it to a relatively smaller spectrometry proteomics dataset. We show that even here the correlation can affect the FDR values and the number of proteins declared as critical.

20.
MicroRNAs (miRNAs) regulate gene expression, with emerging data suggesting miRNAs play a role in skeletal muscle biology. We sought to examine the association of miRNAs with grip strength in a community‐based sample. Framingham Heart Study Offspring and Generation 3 participants (n = 5668; 54% women; mean age 55 years, range 24 to 90 years) underwent grip strength measurement and miRNA profiling using whole blood from fasting morning samples. Linear mixed‐effects regression modeling of grip strength (kg) versus continuous miRNA ‘Cq’ values and versus binary miRNA expression was performed. We conducted an integrative miRNA–mRNA coexpression analysis and examined the enrichment of biologic pathways for the top miRNAs associated with grip strength. Grip strength was lower in women than in men and declined with age, with a mean of 44.7 (10.0) kg in men and 26.5 (6.3) kg in women. Among 299 miRNAs interrogated for association with grip strength, 93 (31%) had an FDR q value < 0.05, 54 (18%) had an FDR q value < 0.01, and 15 (5%) had an FDR q value < 0.001. For almost all miRNA–grip strength associations, increasing miRNA concentration is associated with increasing grip strength. miR‐20a‐5p (FDR q 1.8 × 10⁻⁶) had the most significant association and several among the top 15 miRNAs had links to skeletal muscle including miR‐126‐3p, miR‐30a‐5p, and miR‐30d‐5p. The top associated biologic pathways included metabolism, chemokine signaling, and ubiquitin‐mediated proteolysis. Our comprehensive assessment in a community‐based sample of miRNAs in blood associated with grip strength provides a framework to further our understanding of the biology of muscle strength.
