Similar literature
 20 similar articles found (search time: 343 ms)
1.
For multiple testing based on discrete p-values, we propose a false discovery rate (FDR) procedure “BH+” with proven conservativeness. BH+ is at least as powerful as the BH (i.e., Benjamini-Hochberg) procedure when they are applied to superuniform p-values. Further, when applied to mid-p-values, BH+ can be more powerful than when it is applied to conventional p-values. An easily verifiable necessary and sufficient condition for this is provided. BH+ is perhaps the first conservative FDR procedure applicable to mid-p-values and to p-values with general distributions. It is applied to multiple testing based on discrete p-values in a methylation study, an HIV study and a clinical safety study, where it makes considerably more discoveries than the BH procedure. In addition, we propose an adaptive version of the BH+ procedure, prove its conservativeness under certain conditions, and provide evidence of its excellent performance via simulation studies.
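To make the distinction between conventional and mid-p-values concrete, here is a minimal sketch for a one-sided binomial test. It uses the standard textbook definition of the mid-p-value (half the probability of the observed outcome plus the probability of more extreme outcomes). It is not the BH+ procedure itself, and the function names are illustrative assumptions.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail_p_values(x, n, p0):
    """Return (conventional, mid) p-values for H0: p = p0 against p > p0.

    Hypothetical helper for illustration only.
    """
    # Probability of outcomes strictly more extreme than the observed x
    tail_above = sum(binom_pmf(k, n, p0) for k in range(x + 1, n + 1))
    # Probability of the observed outcome itself
    point = binom_pmf(x, n, p0)
    conventional = tail_above + point
    mid_p = tail_above + 0.5 * point
    return conventional, mid_p


conv, mid = upper_tail_p_values(x=8, n=10, p0=0.5)
print(conv, mid)
```

The mid-p-value is never larger than the conventional one, which is why procedures applied to mid-p-values can reject more often and why their conservativeness has to be established separately.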

2.
For two independent binomial proportions Barnard (1947) has introduced a method to construct a non-asymptotic unconditional test by maximisation of the probabilities over the ‘classical’ null hypothesis H0 = {(θ1, θ2) ∈ [0, 1]²: θ1 = θ2}. It is shown that this method is also useful when studying test problems for different null hypotheses such as, for example, shifted null hypotheses of the form H0 = {(θ1, θ2) ∈ [0, 1]²: θ2 ≤ θ1 ± Δ} for non-inferiority and 1-sided superiority problems (including the classical null hypothesis with a 1-sided alternative hypothesis). We will derive some results for the more general ‘shifted’ null hypotheses of the form H0 = {(θ1, θ2) ∈ [0, 1]²: θ2 ≤ g(θ1)} where g is a non-decreasing curvilinear function of θ1. Two examples for such null hypotheses in the regulatory setting are given. It is shown that the usual asymptotic approximations by the normal distribution may be quite unreliable. Non-asymptotic unconditional tests (and the corresponding p-values) may, therefore, be an alternative, particularly because the effort to compute non-asymptotic unconditional p-values for such more complex situations does not increase as compared to the classical situation. For ‘classical’ null hypotheses it is known that the number of possible p-values derived by the unconditional method is very large, albeit finite, and the same is true for the null hypotheses studied in this paper. In most of the situations investigated it becomes obvious that Barnard's CSM test (1947), when adapted to the respective null space, is again a very powerful test. A theorem is provided which, in addition to allowing fast algorithms to compute unconditional non-asymptotic p-values, fills a methodological gap in the calculation of exact unconditional p-values as it is implemented, for example, in StatXact 3 for Windows (1995).

3.

Background  

A large number of genes usually show differential expression in a microarray experiment with two types of tissues, and the p-values of a proper statistical test are often used to quantify the significance of these differences. The genes with small p-values are then picked as the genes responsible for the differences in the tissue RNA expression. One key question is what threshold should be used to consider a p-value small. There is always a trade-off between this threshold and the rate of false claims. Recent statistical literature shows that the false discovery rate (FDR) criterion is a powerful and reasonable criterion for picking those genes with differential expression. Moreover, the power of detection can be increased by knowing the number of non-differentially expressed genes. While this number is unknown in practice, there are methods to estimate it from data. The purpose of this paper is to present a new method of estimating this number and to use it in constructing the FDR procedure.

4.
A key challenge in genomics is to identify genetic variants that distinguish patients with different survival times following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because the two populations determined by a genetic variant may have very different sizes, and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for populations of any size. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens to hundreds of likely false positive associations as more significant than these known associations.

5.
Circadian preference toward eveningness has been associated with increased risk for mental health problems both in early adolescence and in adulthood. However, in late adolescence, when circadian rhythm naturally shifts later, its significance for mental health is not clear. Accordingly, we studied how circadian rhythm, estimated both by self-reported chronotype and by actigraph-defined midpoint of sleep, was associated with self-reported psychiatric problems based on the Youth Self Report (YSR). The study builds on a community cohort born in 1998 in Helsinki, Finland. At age 17 years (mean age = 16.9, SD = 0.1 years), 183 adolescents (65.6% of the invited) participated in the study. We used the shortened version of the Horne-Östberg Morningness–Eveningness Questionnaire to define the chronotype, and actigraphs to define the naturally occurring circadian rhythm over a period of 4 to 17 days (mean nights N = 8.3, SD = 1.8). The Achenbach software was used to obtain T-score values for YSR psychiatric problem scales. The analyses were adjusted for important covariates including gender, socioeconomic status, body mass index, pubertal maturation, mother’s licorice consumption during pregnancy, and actigraph-defined sleep duration and quality. Eveningness was associated with higher scores in rule-breaking behavior and conduct problems (as assessed either by midpoint of sleep or by self-reported chronotype, p-values <0.05), attention deficit/hyperactivity problems (by self-reported chronotype, p-values <0.05), affective problems (by midpoint of sleep and by self-reported chronotype, p-values <0.05) and somatic complaints (by self-reported chronotype, p-values <0.05), as compared to circadian tendency toward morningness. Our results suggest that the association between eveningness and externalizing problem behavior, present in children and younger adolescents, is also present in late adolescence when circadian rhythms shift toward evening.

6.

Background

Evaluating the significance of a group of genes or proteins in a pathway or biological process for a disease could help researchers understand the mechanism of the disease. For example, identifying related pathways or gene functions for chromatin states of tumor-specific T cells will help determine whether T cells could reprogram or not, and further help design the cancer treatment strategy. Some existing p-value combination methods can be used in this scenario. However, these methods suffer from different disadvantages, and thus it is still challenging to design a more powerful and robust statistical method.

Results

The existing group combined p-value (GCP) method first partitions p-values into several groups using a set of truncation points, but the method is often sensitive to these truncation points. Another method, the adaptive rank truncated product (ARTP) method, makes use of multiple truncation integers to adaptively combine the smallest p-values, but it loses statistical power since it ignores the larger p-values. To tackle these problems, we propose a robust p-value combination method (rPCMP) that considers multiple partitions of p-values with different sets of truncation points. The proposed rPCMP statistic has a three-layer hierarchical structure. The inner layer considers a statistic which combines p-values in a specified interval defined by two threshold points, the intermediate layer uses a GCP statistic which optimizes the statistic from the inner layer for a partition set of threshold points, and the outer layer integrates the GCP statistics from multiple partitions of p-values. The empirical distribution of the statistic under the null hypothesis can be estimated by a permutation procedure.

Conclusions

Our proposed rPCMP method has been shown to be more robust and to have higher statistical power. A simulation study shows that our method effectively controls the type I error rate and has higher statistical power than the existing methods. Finally, we apply our rPCMP method to an ATAC-seq dataset to discover gene functions related to chromatin states in mouse tumor T cells.

7.

Background  

In microarray studies researchers are often interested in the comparison of relevant quantities between two or more similar experiments, involving different treatments, tissues, or species. Typically each experiment reports measures of significance (e.g., p-values) or other measures that rank its features (e.g., genes). Our objective is to find a list of features that are significant in all experiments, to be further investigated. In this paper we present an R package called sdef that allows the user to quantify the evidence of communality between the experiments using previously proposed statistical methods based on the ranked lists of p-values. sdef implements two approaches that address this objective: the first is a permutation test of the maximal ratio of observed to expected common features under the hypothesis of independence between the experiments. The second approach, set in a Bayesian framework, is more flexible as it takes into account the uncertainty about the number of genes differentially expressed in each experiment.

8.
Multiple testing (MT) with false discovery rate (FDR) control has been widely conducted in the “discrete paradigm” where p-values have discrete and heterogeneous null distributions. However, in this scenario existing FDR procedures often lose some power and may yield unreliable inference, and for this scenario there does not seem to be an FDR procedure that partitions hypotheses into groups, employs data-adaptive weights and is nonasymptotically conservative. We propose a weighted p-value-based FDR procedure, “weighted FDR (wFDR) procedure” for short, for MT in the discrete paradigm that efficiently adapts to both heterogeneity and discreteness of p-value distributions. We theoretically justify the nonasymptotic conservativeness of the wFDR procedure under independence, and show via simulation studies that, for MT based on p-values of the binomial test or Fisher's exact test, it is more powerful than six other procedures. The wFDR procedure is applied to two examples based on discrete data, a drug safety study and a differential methylation study, where it makes more discoveries than two existing methods.

9.
False discovery rates are routinely controlled by application of the Benjamini–Hochberg step-up procedure to a set of p-values. A method is demonstrated for representing the values so obtained (the BH-FDRs) on a quantile–quantile (Q-Q) plot of the p-values transformed to the negative-logarithmic scale. Recognition of this connection between the BH-FDR and the Q-Q plot facilitates both understanding of the meaning of the BH-FDR and interpretation of the BH-FDR in a particular data set.
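As a rough illustration of the quantities involved, the sketch below computes Benjamini–Hochberg adjusted p-values (what the abstract calls BH-FDRs) with the usual step-up rule: the i-th smallest of m p-values is multiplied by m / i, and a running minimum taken from the largest rank downward keeps the result monotone. The function name is an assumption, and the plotting step described in the article is not reproduced here.

```python
def bh_adjusted(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Illustrative helper; assumes every p-value lies in [0, 1].
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest rank down so each value is capped by those above it
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(bh_adjusted([0.01, 0.04, 0.03, 0.005]))
```

Before the running minimum is applied, -log10 of the value at rank i equals -log10 of the observed p-value minus -log10(i / m). If i / m is used as the expected quantile (an assumption, since plotting positions vary), that difference is the vertical distance of a point above the diagonal of a negative-log Q-Q plot.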

10.
Abiotic stresses such as cold, drought, heat, salinity, nutrient deficiency, and toxicity adversely affect lentil yields worldwide. Therefore, the purpose of this study was to investigate the response of two lentil cultivars (Lens culinaris Medik) (Jordan 1 and Jordan 2) to NaCl, mannitol, sorbitol, and H2O2 via the characterization of seed germination, accumulation of reactive oxygen species, and γ-aminobutyric acid (GABA) level. There was a significant increase in GABA and malondialdehyde (MDA) levels in the two lentil cultivars under all treatments. Jordan 1 showed the highest germination percentages with p-values: 0.009, 0.013, 0.026, and 0.015, while Jordan 2 seedlings showed the highest GABA levels with p-values: 0.023, 0.007, 0.023, and 0.019 and MDA accumulation with p-values: 0.009, 0.012, 0.007, and 0.009 under salt, osmotic, and oxidative stresses, respectively, compared with Jordan 1 seedlings under the same treatments. Our results indicate that GABA shunt is a key signaling and metabolic pathway that allows adaptation of lentil seedlings to salt, osmotic, and oxidative stresses. In addition, Jordan 1 cultivar showed significant tolerance to abiotic stress treatments and it is the most recommended lentil cultivar to be used in soil with high salt and osmotic contents.

11.
12.
Identifying multiple enzyme targets for metabolic engineering is critical for redirecting cellular metabolism to achieve desirable phenotypes, e.g., overproduction of a target chemical. The challenge is to determine which enzymes and how much of these enzymes should be manipulated by adding, deleting, under-, and/or over-expressing associated genes. In this study, we report the development of a systematic multiple enzyme targeting method (SMET) to rationally design optimal strains for target chemical overproduction. The SMET method combines elementary mode analysis and ensemble metabolic modeling to derive SMET metrics, including l-values and c-values, that can identify rate-limiting reaction steps and suggest which enzymes and how much of these enzymes to manipulate to enhance product yields, titers, and productivities. We illustrated, tested, and validated the SMET method by analyzing two networks, a simple network for concept demonstration and an Escherichia coli metabolic network for aromatic amino acid overproduction. The SMET method could systematically predict simultaneous multiple enzyme targets and their optimized expression levels, consistent with experimental data from the literature, without performing an iterative sequence of single-enzyme perturbations. The SMET method was much more efficient and effective than single-enzyme perturbation in terms of computation time and finding improved solutions.

13.

Background  

In the analysis of microarray data one generally produces a vector of p-values that for each gene give the likelihood of obtaining equally strong evidence of change by pure chance. The distribution of these p-values is a mixture of two components corresponding to the changed genes and the unchanged ones. The focus of this article is how to estimate the proportion unchanged and the false discovery rate (FDR) and how to make inferences based on these concepts. Six published methods for estimating the proportion of unchanged genes are reviewed, two alternatives are presented, and all are tested on both simulated and real data. All estimates but one make do without any parametric assumptions concerning the distributions of the p-values. Furthermore, the estimation and use of the FDR and the closely related q-value are illustrated with examples. Five published estimates of the FDR and one new one are presented and tested. Implementations in R code are available.
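One common nonparametric way to estimate the proportion of unchanged genes uses the fact that p-values from unchanged genes are roughly uniform, so the count above a cut-off λ, rescaled by 1 / (1 − λ), approximates their share. The sketch below follows that threshold idea, which is usually attributed to Storey. Whether it is one of the methods the article reviews is not stated in the abstract, and both the function name and the default λ = 0.5 are assumptions.

```python
def estimate_pi0(p_values, lam=0.5):
    """Threshold-based estimate of the proportion of true null hypotheses.

    Illustrative helper; lam must lie strictly between 0 and 1.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    m = len(p_values)
    # Null p-values are roughly uniform, so about (1 - lam) of them exceed lam
    above = sum(1 for p in p_values if p > lam)
    # Cap at 1 because a proportion cannot exceed 1
    return min(1.0, above / (m * (1.0 - lam)))


# Hypothetical p-values, made up for illustration only
example = [0.001, 0.002, 0.01, 0.03, 0.2, 0.45, 0.6, 0.7, 0.85, 0.95]
print(estimate_pi0(example))
```

Multiplying BH-type FDR estimates by such an estimate is the usual route to the less conservative, adaptive procedures and q-values that several abstracts in this list mention.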

14.
We have investigated the correlation between DNA adduct levels and aryl hydrocarbon hydroxylase (AHH) activity in peripheral lymphocyte samples obtained from 42 lung cancer patients. DNA adducts and AHH activity were determined by the ³²P-postlabelling technique and the fluorometric method, respectively. The mean ± SD of DNA adduct level was 0.88 ± 0.37 (range 0.22 to 1.90) per 10⁸ nucleotides. The geometric means of non-induced and 3-methylcholanthrene (MC)-induced AHH activity, as well as AHH inducibility (MC-induced AHH activity/non-induced AHH activity), were 0.029 and 0.228 pmol min⁻¹ 10⁻⁶ cells, and 7.776, respectively. There was no statistically significant correlation between DNA adduct levels and non-induced or MC-induced AHH activity. A tendency toward positive correlation was found between DNA adduct levels and AHH inducibility for all subjects (n = 42, r = 0.25, p = 0.11). This positive correlation reached statistical significance in the subjects with squamous cell carcinoma (n = 13, r = 0.70, p < 0.01). In addition, a similar correlation of DNA adducts with AHH inducibility was also observed in individuals with the GSTM1 present genotype (n = 17, r = 0.44, p = 0.07) and the GSTP1-AA genotype (n = 29, r = 0.37, p = 0.05). These findings suggest that DNA adduct levels are mediated by the CYP1A1 enzyme, and AHH inducibility may be a more relevant indicator than specific AHH activity for explaining the variation of DNA adduct levels in lymphocytes.

15.
The unwanted horse issue continues to be a major concern in the U.S. equine industry. Nonprofit organizations dedicated to rescuing, retraining, and rehoming unwanted horses are critical in minimizing this problem. This study utilized data collected nationwide from organizations that provide these services for thoroughbreds retired from racing to identify individual horse characteristics that influenced length of stay at the adoption facility as well as characteristics that increased the probability that an adopted horse would be returned to the facility. The results suggested that horses with fewer activity limitations were rehomed more quickly (p < .01), as were gray horses (relative to bays, p < .03) and stallions (relative to geldings, p < .04). Older horses took longer to rehome (p < .05). Interestingly, the results also suggested that gray horses were more likely to be returned to the facility postadoption (p < .02). Results from this study could benefit thoroughbreds retired from racing, nonprofit organizations, end consumers, and the thoroughbred racing industry.

16.
Lee YH, Nath SK. Human Genetics 2005, 118(3-4): 434-443
To date, several susceptibility loci for systemic lupus erythematosus (SLE) have been identified by individual genome-wide scans, but many of these loci have shown inconsistent results across studies. Additionally, many individual studies are at the lower limit of acceptable power recommended for declaring significant linkage. The genome search meta-analysis (GSMA) has been proposed as a valid and robust method for combining several genome scan results. The aim of this study is to investigate whether there is any consistent evidence of linkage across multiple studies, and to identify novel SLE susceptibility loci using the GSMA method. Twelve genome scan results generated from nine independent studies have been used for the present GSMA. Altogether, the data consist of 605 families with 1,355 SLE affected individuals from three self-reported ethnicities: Caucasian, African-American, and Hispanic. For each study, the genome was divided into 120 bins (30 cM) and ranked according to the maximum evidence of linkage within each bin. The ranks were summed and averaged across studies, after which significance was assessed by permutation tests. The present study identified two genomic locations at 6p22.3–6p21.1 and 16p12.3–16q12.2 that met genome-wide significance (p<0.000417). The identified region at 6p22.3–6p21.1 contains the HLA region. The combined p-values using Fisher’s method also supported the significance in these regions. Clustering of significant adjacent bins was observed for chromosomes 6 and 16. Additionally, there are 12 other bins with two point-wise p-values (Psumrnk and Pord) <0.05, suggesting that these bin regions are highly likely to contain SLE susceptibility loci. Among them, the present GSMA also identified two novel regions at 4q32.1–4q34.3 and 13q13.2–13q22.2. However, separate analysis using only Caucasian populations identified the strongest evidence for linkage at chromosome 6p21.1–6q15 (Psumrnk=0.00021).
One interesting novel region, 3q22.1–3q25.33 (Psumrnk=0.01376), may represent an ethnicity-specific SLE linkage. In summary, the present GSMA has identified two statistically significant genomic regions that reconfirmed the SLE linkage at chromosomes 6 and 16.
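The abstract mentions combining p-values with Fisher's method. As a minimal sketch of that general technique (not the study's actual pipeline), the statistic is -2 times the sum of the natural logs of k independent p-values, which under the null follows a chi-square distribution with 2k degrees of freedom. Because the degrees of freedom are even, the tail probability has a closed form and no statistics library is needed. The function name is an assumption.

```python
from math import exp, log


def fisher_combined_p(p_values):
    """Return (statistic, combined p-value) for Fisher's method.

    Illustrative helper; assumes independent p-values, each in (0, 1].
    """
    k = len(p_values)
    statistic = -2.0 * sum(log(p) for p in p_values)
    half = statistic / 2.0
    # Chi-square survival function with 2k degrees of freedom:
    # exp(-x/2) * sum over i < k of (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, exp(-half) * total


stat, combined = fisher_combined_p([0.05, 0.05])
print(stat, combined)
```

With a single input the combined value equals that input, which is a quick sanity check on the formula.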

17.
Summary: Salt-free and 0.2 M NaCl oxygenated aqueous solutions of poly-L-glutamic acid were irradiated with ⁶⁰Co γ-radiation at various pH values to examine whether the changes caused by exposure to ionizing radiation depend on pH, that is, on the conformation of the polypeptide. The G-values (the number of main-chain scissions per 100 eV of energy absorbed) in both salt-free and 0.2 M NaCl solutions of poly-L-glutamic acid were found to change sharply with pH and to have a maximum at the pH of the mid-point of the helix-coil transition. The change of G-values with pH is discussed in terms of the conformational change of poly-L-glutamic acid.

18.
Objective: With increasing frequency, health promotion messages advocating physical activity are claiming weight loss as a benefit. However, messages promoting physical activity as a weight loss strategy may have limited effectiveness and cross‐cultural relevance. We recently found self‐perceived overweight to be a more robust correlate of sedentary behavior than BMI in Los Angeles County adults. In this study, we examined ethnic and sex differences in overweight self‐perception and their association with sedentariness in this sample. Research Methods and Procedures: We conducted bivariate and multivariate analyses of cross‐sectional survey data from a representative sample of Los Angeles County adults. Results: Women were more likely to perceive themselves to be overweight than men overall (73.2% of overweight/non‐obese and 24.1% of average weight women vs. 44.5% of overweight/non‐obese and 5.6% of average weight men) and within each ethnic group. African‐Americans were least likely (41.3% of overweight/non‐obese African‐Americans self‐identified as overweight) and whites were most likely to consider themselves overweight (60.6% of overweight/non‐obese whites self‐identified as overweight). Overweight (vs. average weight) self‐perception was correlated with sedentariness among average weight adults (45.3% vs. 33.0%, p < 0.001), overweight adults (43.4% vs. 33.6%, p < 0.001), men (average and overweight: 38.4% vs. 27.8%, p < 0.001), overweight whites (41.9% vs. 29.7%, p = 0.0012), and African‐Americans and Latinos (41.6% vs. 33.9%, p = 0.005). Discussion: These data suggest that our society's emphasis on weight loss rather than lifestyle change may inadvertently discourage physical activity adoption/maintenance among non‐obese individuals. However, further research is needed, particularly from prospective cohort and intervention studies, to elucidate the relationship between overweight self‐perception and healthy lifestyle change.

19.
Survival curves of a cocktail of eight serotypes of Salmonella in ground beef and pork meat of different levels of fat (4% to 28%), at temperatures that ranged from 58°C to 65°C, were examined. Asymptotic D-values (D-values for large times) and initial D-values (D-values for small times, near zero) were estimated by identifying regions where the survival curves were linear, and performing linear regressions on data within the identified regions. The initial lag D-values increase with increasing fat levels for both beef and pork. The relationship of the asymptotic D-values with fat levels and temperature is complex, and definitive conclusions could not be made. It appears that, for ground beef, asymptotic D-values increase with increasing fat levels, but this was not the case for ground pork. The shapes of the survival curves were concave, convex, and sigmoidal, and depended upon the temperature, where for the lower temperatures studied (58°C and 60°C) the curves exhibited tailing. The Gompertz function was found to provide a good fit to the data over the range of temperatures and fat levels studied. These results, particularly for beef, suggest the importance of determining the shape of the survival curves (concave, convex or sigmoidal) when estimating times needed to obtain an adequate margin of safety for thermal processes of red meat.
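The regression step described above can be sketched as follows. A D-value is the heating time needed for a tenfold (one log10) reduction in survivors, so within a linear region of the survival curve it is the negative reciprocal of the least-squares slope of log10 count against time. The function name and the data points are hypothetical, and choosing which region counts as linear is left to the analyst.

```python
def d_value(times_min, log10_counts):
    """D-value (minutes per log10 reduction) from a linear survival segment.

    Illustrative helper; expects at least two distinct time points.
    """
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(log10_counts) / n
    # Ordinary least-squares slope of log10 count versus time
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, log10_counts))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("counts do not decline over this segment")
    return -1.0 / slope


# Hypothetical data: log10 CFU/g sampled every 2 minutes
print(d_value([0, 2, 4, 6], [8.0, 7.0, 6.0, 5.0]))
```

Applying the same function to the early and late linear segments separately would give the initial and asymptotic D-values the abstract distinguishes.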

20.
Cellulases are of economic significance, particularly in the detergent and textile industries, where they are subjected to a wide range of operating conditions affecting their stability. To increase our insight into the properties of this class of enzymes, we have carried out a study of the stability and folding behavior of the 413-residue endoglucanase I (Cel7B) from Humicola insolens. Data from chemical denaturation in guanidinium chloride agree satisfactorily with calorimetric measurements, revealing an optimum stability of ca. 20 kcal mol⁻¹ around pH 7 and a peak half-width of 3–4 pH units. Stability and activity show very similar pH-profiles, but this is probably fortuitous. Judging from equilibrium m-values (the dependence of the log of the equilibrium unfolding constant on the denaturant concentration), the denatured state becomes significantly more compact outside pH 6–9.

Folding and unfolding proceed very slowly, with relaxation half-times of up to 6 h. Single- and double-jump kinetic data at pH 7 suggest a folding scheme involving two intermediates with native-like secondary structure but varying degrees of tertiary structure.
