Similar Documents
1.
The conventional nonparametric tests in survival analysis, such as the log-rank test, assess the null hypothesis that the hazards are equal at all times. However, hazards are hard to interpret causally, and other null hypotheses are more relevant in many scenarios with survival outcomes. To allow for a wider range of null hypotheses, we present a generic approach to defining test statistics. This approach exploits the fact that a wide range of common parameters in survival analysis can be expressed as solutions of differential equations. We can thereby test hypotheses based on survival parameters that solve differential equations driven by cumulative hazards, and the tests are easy to implement on a computer. We present simulations suggesting that our tests perform well for several hypotheses in a range of scenarios. As an illustration, we apply our tests to evaluate the effect of adjuvant chemotherapies in patients with colon cancer, using data from a randomized controlled trial.
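The differential-equation view can be made concrete with a small sketch (our illustration, not the authors' implementation): the survival function S solves dS(t) = -S(t-) dA(t), where A is the cumulative hazard, so plugging the Nelson-Aalen estimator into the equation and solving it by product integration recovers the Kaplan-Meier estimator.

```python
import numpy as np

def survival_from_cumhaz(times, events):
    """Solve dS(t) = -S(t-) dA(t) by product integration, with A the
    Nelson-Aalen cumulative hazard. times: event/censoring times;
    events: 1 = event, 0 = censored. Returns (event times, S(t))."""
    order = np.argsort(times)
    times, events = np.asarray(times)[order], np.asarray(events)[order]
    uniq = np.unique(times[events == 1])
    s, surv = 1.0, []
    for t in uniq:
        at_risk = np.sum(times >= t)
        d_a = np.sum((times == t) & (events == 1)) / at_risk  # increment dA(t)
        s *= 1.0 - d_a                # product-integral step of the ODE
        surv.append(s)
    return uniq, np.array(surv)

t, s = survival_from_cumhaz([2, 3, 3, 5, 7, 8], [1, 1, 0, 1, 0, 1])
print(dict(zip(t, np.round(s, 3))))   # Kaplan-Meier values at event times
```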

2.

Background

When conducting multiple hypothesis tests, it is important to control the number of false positives, or the False Discovery Rate (FDR). However, there is a tradeoff between controlling the FDR and maximizing power. Several methods, such as the q-value method, have been proposed to estimate the proportion of true null hypotheses among the tested hypotheses and to use this estimate in the control of the FDR. These methods usually depend on the assumption that the test statistics are independent (or only weakly correlated). However, many types of data, for example microarray data, often contain large-scale correlation structures. Our objective was to develop methods that control the FDR while maintaining a greater level of power in highly correlated datasets, by improving the estimation of the proportion of null hypotheses.

Results

We showed that when strong correlation exists among the data, as is common in microarray datasets, the estimated proportion of null hypotheses can be highly variable, resulting in a high level of variation in the FDR. We therefore developed a re-sampling strategy that reduces this variation by breaking the correlations between gene expression values, and then applies a conservative rule, selecting the upper quartile of the re-sampling estimates, to obtain strong control of the FDR.
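As a rough sketch of this strategy in Python (the within-group bootstrap used to break inter-gene correlations and the Storey-type pi0 estimator are our assumptions; the paper's exact resampling scheme may differ):

```python
import numpy as np
from scipy import stats

def pi0_storey(pvals, lam=0.5):
    # Storey's lambda estimator of the proportion of true null hypotheses
    return min(1.0, float(np.mean(pvals > lam)) / (1.0 - lam))

def pi0_upper_quartile(data, groups, n_boot=100, seed=0):
    """Conservative pi0: upper quartile of estimates from resampled data.

    data: genes x samples array; groups: boolean mask of group membership.
    Each gene is bootstrapped within its group independently of the other
    genes, which breaks the inter-gene correlation structure while roughly
    preserving each gene's marginal group distributions.
    """
    rng = np.random.default_rng(seed)
    g1, g2 = data[:, groups], data[:, ~groups]
    estimates = []
    for _ in range(n_boot):
        b1 = np.take_along_axis(g1, rng.integers(0, g1.shape[1], g1.shape), axis=1)
        b2 = np.take_along_axis(g2, rng.integers(0, g2.shape[1], g2.shape), axis=1)
        pvals = stats.ttest_ind(b1, b2, axis=1).pvalue
        estimates.append(pi0_storey(pvals))
    return np.percentile(estimates, 75)   # upper quartile for strong control
```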

Conclusion

In simulation studies and perturbations of actual microarray datasets, our method, compared to competing methods such as the q-value, generated slightly biased estimates of the proportion of null hypotheses but with lower mean squared errors. When selecting genes while controlling the FDR at the same level, our method achieves, on average, a significantly lower false discovery rate in exchange for a minor reduction in power.

3.
4.

Background  

Multiple data-analytic methods have been proposed for evaluating gene-expression levels in specific biological pathways, assessing differential expression associated with a binary phenotype. Following Goeman and Bühlmann's recent review, we compared the statistical performance of three methods, namely Global Test, ANCOVA Global Test, and SAM-GS, that test "self-contained null hypotheses" via subject sampling. The three methods were compared based on a simulation experiment and analyses of three real-world microarray datasets.

5.

Background

The role of migratory birds and of poultry trade in the dispersal of highly pathogenic H5N1 is still the topic of intense and controversial debate. In a recent contribution to this journal, Flint argues that the strict application of the scientific method can help to resolve this issue.

Discussion

We argue that Flint's identification of the scientific method with null hypothesis testing is misleading and counterproductive. There is far more to science than the testing of hypotheses; not only the justification, but also the discovery of hypotheses belongs to science. We also show why null hypothesis testing is weak and why Bayesian methods are a preferable approach to statistical inference. Furthermore, we criticize the analogy put forward by Flint between involuntary transport of poultry and long-distance migration.

Summary

To expect ultimate answers and unequivocal policy guidance from null hypothesis testing puts unrealistic expectations on a flawed approach to statistical inference and on science in general.

6.
1 Most plant-feeding insects show some degree of specialization and use a variety of cues to locate their host. Two main mechanisms of host location, primary attraction and random landing, have been investigated for such insects. 2 Research has led to contradictory conclusions about these hypotheses, especially for wood-feeding insects; however, recent studies suggest that both mechanisms may operate in a single taxon but at different scales. 3 We developed a field experiment to test the hypothesis that primary attraction occurs at a larger scale and random landing at a finer scale in wood-feeding insects. Landing rates, measured using sticky traps, were compared first between patches and then between individual trees according to their distance from a baited central tree. 4 Polynomial functions describing the landing rate-to-distance relationship were compared with a function produced by a null model describing what should occur under the random landing hypothesis, as illustrated in the sketch below. Scolytidae and Cerambycidae (Coleoptera) responded to volatiles at the patch scale, supporting the primary attraction hypothesis, but the landing patterns of some groups at the finer scale closely matched the predictions of our null model, supporting the random landing hypothesis. 5 Our results show that the primary attraction and random landing hypotheses are not mutually exclusive and that the prelanding use of host-produced volatiles is scale-dependent. Scale considerations should therefore be included in the study of prelanding host selection by wood-feeding insects.
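The sketch below is a hypothetical rendering of that comparison: it fits a polynomial to landing rate versus distance and tests it against the flat response expected under random landing with an ordinary F-test; the original study built a purpose-specific null model rather than this textbook test.

```python
import numpy as np
from scipy import stats

def random_landing_ftest(distance, landing_rate, degree=2):
    """F-test of a polynomial distance response against the flat response
    expected under random landing (a textbook stand-in for the study's
    purpose-built null model)."""
    d = np.asarray(distance, float)
    y = np.asarray(landing_rate, float)
    n = len(y)
    rss0 = np.sum((y - y.mean()) ** 2)              # null: no distance effect
    fit = np.polyval(np.polyfit(d, y, degree), d)   # polynomial alternative
    rss1 = np.sum((y - fit) ** 2)
    df1, df2 = degree, n - degree - 1
    f = ((rss0 - rss1) / df1) / (rss1 / df2)
    return f, stats.f.sf(f, df1, df2)               # one-sided p-value
```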

7.

Background  

DNA methylation is an epigenetic phenomenon known to play an important role in the development of cancers, including colorectal cancer (CRC). Aberrant methylation of the promoter regions of genes is potentially reversible, and if methylation is important for cancer survival, demethylation should do the opposite. To test this, we addressed the hypothesis that the DNA methyltransferase inhibitors (DNMTi) decitabine and zebularine potentiate the inhibitory effects of the classical anti-CRC cytostatics oxaliplatin and 5-fluorouracil (5-FU) on the survival of CRC cells in vitro.

8.
Dunson DB, Chen Z. Biometrics 2004, 60(2): 352-358
In multivariate survival analysis, investigators are often interested in testing for heterogeneity among clusters, both overall and within specific classes. We represent different hypotheses about the heterogeneity structure using a sequence of gamma frailty models, ranging from a null model with no random effects to a full model having random effects for each class. Following a Bayesian approach, we define prior distributions for the frailty variances consisting of mixtures of point masses at zero and inverse-gamma densities. Since frailties with zero variance effectively drop out of the model, this prior allocates probability to each model in the sequence, including the overall null hypothesis of homogeneity. Using a counting process formulation, the conditional posterior distributions of the frailties and proportional hazards regression coefficients have simple forms. Posterior computation proceeds via a data augmentation Gibbs sampling algorithm, a single run of which can be used to obtain model-averaged estimates of the population parameters and posterior model probabilities for testing hypotheses about the heterogeneity structure. The methods are illustrated using data from a lung cancer trial.
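The mixture prior is easy to visualize; here is a minimal sketch of drawing a frailty variance from a point mass at zero mixed with an inverse-gamma density (the hyperparameters p0, a, b are placeholders, not values from the paper, and this is only the prior, not the full Gibbs sampler).

```python
import numpy as np

def draw_frailty_variance(rng, p0=0.5, a=2.0, b=1.0):
    """One draw from the mixture prior: a point mass at zero (probability
    p0, the homogeneity null, where the frailty term drops out) mixed with
    an inverse-gamma density."""
    if rng.random() < p0:
        return 0.0
    return 1.0 / rng.gamma(a, 1.0 / b)   # InvGamma(a, b) via reciprocal gamma

rng = np.random.default_rng(1)
draws = np.array([draw_frailty_variance(rng) for _ in range(10_000)])
print(f"P(variance = 0) ~ {np.mean(draws == 0.0):.3f}")   # approximately p0
```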

9.

Objectives

Testicular cancer is the leading cancer of young adults and its incidence is increasing in almost all industrialized countries. The survival rate after testicular cancer is 95%, all stages combined, but a group of patients with a poor prognosis still fails to respond to treatment. The time to diagnosis is defined as the time, in months, between the patient's perception of the first symptoms of testicular cancer and the doctor's diagnosis of the disease. The objective of this study is to determine whether the time to diagnosis has prognostic value, particularly whether it is correlated with the stage of the disease and with survival.

Material and Methods

The time to diagnosis was studied in 542 patients diagnosed with testicular cancer between 1983 and 2002 in the Midi-Pyrénées region. Information concerning the disease and its treatment was abstracted from medical records onto a summary form. The time to diagnosis was correlated with prognostic parameters, including stage and survival.

Results

The mean time to diagnosis was 3.7±5.1 months and was longer for seminomas (4.9±6.1 months) than for non-seminomatous germ cell tumours (NSGCT) (2.8±4.0 months). The time to diagnosis was correlated with the stage of the disease and with 5-year survival in the overall population and in the NSGCT group, but not in the seminoma group.

Conclusions

Early diagnosis has prognostic value (correlation with the stage of the disease and the 5-year survival rate). Testicular cancer information campaigns should therefore be considered.

10.

Background  

Time-course microarray experiments are widely used to study the temporal profiles of gene expression. Storey et al. (2005) developed a method for analyzing time-course microarray studies that can be applied to discovering genes whose expression trajectories change over time within a single biological group, or genes that follow different time trajectories among multiple groups. They estimated the expression trajectories of each gene using natural cubic splines under the null (no time-course) and alternative (time-course) hypotheses, and used a goodness-of-fit test statistic to quantify the discrepancy. The null distribution of the statistic was approximated through a bootstrap method. Gene expression levels in microarray data often have a complicated correlation structure. Accurate type I error control that adjusts for multiple testing requires the joint null distribution of the test statistics for a large number of genes. For this purpose, permutation methods have been widely used because of their computational ease and intuitive interpretation.
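The permutation idea can be sketched for a simple two-group comparison (a generic max-statistic illustration of how permuting samples preserves the between-gene correlation under the joint null; it is not the Storey et al. spline/bootstrap procedure):

```python
import numpy as np
from scipy import stats

def maxt_adjusted_pvalues(data, groups, n_perm=1000, seed=0):
    """Westfall-Young maxT-style adjusted p-values from sample permutations.

    data: genes x samples array; groups: boolean mask for the two groups.
    Each permuted dataset keeps the between-gene correlation intact, so the
    max-statistic null approximates the joint null distribution.
    """
    rng = np.random.default_rng(seed)
    t_obs = np.abs(stats.ttest_ind(data[:, groups], data[:, ~groups],
                                   axis=1).statistic)
    null_max = np.empty(n_perm)
    for i in range(n_perm):
        perm = rng.permutation(groups)          # shuffle sample labels
        t = stats.ttest_ind(data[:, perm], data[:, ~perm], axis=1).statistic
        null_max[i] = np.max(np.abs(t))         # max over genes
    return (null_max[None, :] >= t_obs[:, None]).mean(axis=1)
```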

11.
We consider the problem, treated by Simes, of testing the overall null hypothesis formed by the intersection of a set of elementary null hypotheses, based on the ordered p-values of the associated test statistics. The Simes test uses critical constants that do not need tabulation. Cai and Sarkar gave a method to compute generalized Simes critical constants that improve upon the power of the Simes test when more than a few hypotheses are false. The Simes constants can be viewed as the first-order constants (requiring solution of a linear equation) and the Cai-Sarkar constants as the second-order constants (requiring solution of a quadratic equation). We extend the method to third-order constants (requiring solution of a cubic equation), and also offer an extension to an arbitrary kth order. We show by simulation that the third-order constants are more powerful than the second-order constants for testing the overall null hypothesis in most cases. However, there are some drawbacks associated with these higher-order constants, especially for , which limit their practical usefulness.
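For reference, the classical Simes test with the first-order constants c_i = i*alpha/n can be written in a few lines; the higher-order constants discussed above require solving quadratic or cubic equations and are not reproduced here.

```python
import numpy as np

def simes_test(pvals, alpha=0.05):
    """Simes test of the overall (intersection) null hypothesis: reject if
    any ordered p-value p_(i) falls at or below c_i = i * alpha / n."""
    p = np.sort(np.asarray(pvals, float))
    n = len(p)
    crit = alpha * np.arange(1, n + 1) / n    # first-order critical constants
    return bool(np.any(p <= crit))

print(simes_test([0.01, 0.04, 0.30, 0.50]))   # rejects at alpha = 0.05
```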

12.

Background

Independence between observations is a standard prerequisite of traditional statistical tests of association. This condition is, however, violated when autocorrelation is present within the data. In the case of variables that are regularly sampled in space (i.e. lattice data or images), such as those provided by remote-sensing or geographical databases, this problem is particularly acute. Because analytic derivation of the null probability distribution of the test statistic (e.g. Pearson's r) is not always possible when autocorrelation is present, we propose instead the use of a Monte Carlo simulation with surrogate data.

Methodology/Principal Findings

The null hypothesis that two observed mapped variables are the result of independent pattern generating processes is tested here by generating sets of random image data while preserving the autocorrelation function of the original images. Surrogates are generated by matching the dual-tree complex wavelet spectra (and hence the autocorrelation functions) of white noise images with the spectra of the original images. The generated images can then be used to build the probability distribution function of any statistic of association under the null hypothesis. We demonstrate the validity of a statistical test of association based on these surrogates with both actual and synthetic data and compare it with a corrected parametric test and three existing methods that generate surrogates (randomization, random rotations and shifts, and iterative amplitude adjusted Fourier transform). Type I error control was excellent, even with strong and long-range autocorrelation, which is not the case for alternative methods.
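The surrogate-data logic can be illustrated with the simpler Fourier analogue of the wavelet matching described above (our sketch, not the authors' dual-tree complex wavelet implementation): the amplitude spectrum of the original image is combined with the phases of a white-noise image, which preserves the autocorrelation function while destroying any association with the second map.

```python
import numpy as np

def fourier_surrogate(image, rng):
    """Random image with the original's amplitude spectrum, hence (by the
    Wiener-Khinchin theorem) the same autocorrelation function. A Fourier
    stand-in for the dual-tree complex wavelet matching described above."""
    noise_phase = np.angle(np.fft.fft2(rng.standard_normal(image.shape)))
    spec = np.abs(np.fft.fft2(image)) * np.exp(1j * noise_phase)
    return np.real(np.fft.ifft2(spec))   # imaginary part is round-off only

def surrogate_association_test(x, y, n_surr=999, seed=0):
    """Monte Carlo p-value for the correlation of two maps under the null of
    independent pattern-generating processes with preserved autocorrelation."""
    rng = np.random.default_rng(seed)
    r_obs = abs(np.corrcoef(x.ravel(), y.ravel())[0, 1])
    null = np.array([abs(np.corrcoef(fourier_surrogate(x, rng).ravel(),
                                     y.ravel())[0, 1]) for _ in range(n_surr)])
    return (1 + np.sum(null >= r_obs)) / (n_surr + 1)
```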

Conclusions/Significance

The wavelet-based surrogates are particularly appropriate in cases where autocorrelation appears at all scales or is direction-dependent (anisotropy). We explore the potential of the method for association tests involving a lattice of binary data and discuss its potential for validation of species distribution models. An implementation of the method in Java for the generation of wavelet-based surrogates is available online as supporting material.

13.
Using the strictly neutral model as a null hypothesis, we tested for deviations from expected levels of nucleotide polymorphism at the alcohol dehydrogenase locus (Adh-1) within and among four species of pocket gophers (Geomys bursarius major, G. knoxjonesi, G. texensis llanensis, and G. attwateri). The complete protein-encoding region was examined, and 10 unique alleles, representing both electromorphic and cryptic alleles, were used to test hypotheses (e.g., the neutral model) concerning the maintenance of genetic variation. Nineteen variable sites were identified among the 10 alleles examined, including 9 segregating sites occurring in synonymous positions and 10 that were nonsynonymous. Several statistical methods, including those that test for within-species variation as well as those that examine variation within and among species, failed to reject the null hypothesis that variation (both within and between species of Geomys) at the Adh locus is consistent with the neutral theory. However, there was significant heterogeneity in the ratio of polymorphism to divergence across the gene, with polymorphisms clustered in the first half of the coding region and fixed differences clustered in the second half of the gene. Two alternative hypotheses are discussed as possible explanations for this heterogeneity: an old balanced polymorphism in the first half of the gene or a recent selective sweep in the second half of the gene.
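Heterogeneity in the polymorphism-to-divergence ratio can be checked with a simple 2x2 contingency comparison between the two halves of the gene; the sketch below uses hypothetical counts (the actual Adh site counts are not reproduced here).

```python
from scipy.stats import fisher_exact

# Hypothetical counts only, for illustration.
# Columns: first half of the gene, second half of the gene.
polymorphic = [12, 3]   # segregating sites within species
fixed_diffs = [2, 9]    # fixed differences between species
odds, p = fisher_exact([polymorphic, fixed_diffs])
print(f"odds ratio = {odds:.2f}, p = {p:.4f}")   # small p -> heterogeneity
```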

14.
Significance analysis of groups of genes in expression profiling studies
MOTIVATION: Gene class testing (GCT) is a statistical approach to determine whether some functionally predefined classes of genes express differently under two experimental conditions. GCT computes the P-value of each gene class based on the null distribution, and the gene classes are ranked for importance in accordance with their P-values. Currently, two null hypotheses have been considered: the Q1 hypothesis tests the relative strength of association with the phenotypes among the gene classes, and the Q2 hypothesis assesses the statistical significance. These two hypotheses are related but not equivalent. METHOD: We investigate three one-sided and two two-sided test statistics under Q1 and Q2. The null distributions of gene classes under Q1 are generated by permuting gene labels, and the null distributions under Q2 are generated by permuting samples. RESULTS: We applied the five statistics to a diabetes dataset with 143 gene classes and to a breast cancer dataset with 508 GO (Gene Ontology) terms. For each statistic, the null distributions of the gene classes under Q1 differ from those under Q2 in both datasets, and their rankings can differ as well. We clarify the one-sided and two-sided hypotheses and discuss some issues regarding the Q1 and Q2 hypotheses for gene class ranking in GCT. Because Q1 does not deal with correlations among genes, we prefer tests based on Q2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
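The two permutation schemes differ only in what gets shuffled; here is a minimal sketch, with a simple mean |t| class score standing in for the five statistics studied in the paper.

```python
import numpy as np
from scipy import stats

def gene_class_score(data, groups, members):
    # mean |t| over the genes in the class (one simple GCT score)
    t = stats.ttest_ind(data[:, groups], data[:, ~groups], axis=1).statistic
    return float(np.mean(np.abs(t[members])))

def null_q1(data, groups, members, n_perm, rng):
    # Q1 null: reassign gene labels at random, phenotype labels fixed
    return np.array([
        gene_class_score(data, groups,
                         rng.choice(data.shape[0], len(members), replace=False))
        for _ in range(n_perm)])

def null_q2(data, groups, members, n_perm, rng):
    # Q2 null: permute sample (phenotype) labels, class membership fixed
    return np.array([
        gene_class_score(data, rng.permutation(groups), members)
        for _ in range(n_perm)])
```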

15.

Background  

Ranked gene lists from microarray experiments are usually analysed by assigning significance to predefined gene categories, e.g., based on functional annotations. Tools performing such analyses are often restricted to a category score based on a cutoff in the ranked list and to a significance calculation based on random gene permutations as the null hypothesis.

16.

Background  

It is widely held that in toothed whales, high frequency tonal sounds called 'whistles' evolved in association with 'sociality' because in delphinids they are used in a social context. Recently, whistles were hypothesized to be an evolutionary innovation of social dolphins (the 'dolphin hypothesis'). However, both 'whistles' and 'sociality' are broad concepts each representing a conglomerate of characters. Many non-delphinids, whether solitary or social, produce tonal sounds that share most of the acoustic characteristics of delphinid whistles. Furthermore, hypotheses of character correlation are best tested in a phylogenetic context, which has hitherto not been done. Here we summarize data from over 300 studies on cetacean tonal sounds and social structure and phylogenetically test existing hypotheses on their co-evolution.

17.
Random trees and random characters can be used in null models for testing phylogenetic hypotheses. We consider three interpretations of random trees: first, that trees are selected from the set of all possible trees with equal probability; second, that trees are formed by random speciation or coalescence (equivalent processes); and third, that trees are formed by a series of random partitions of the taxa. We consider two interpretations of random characters: first, that the number of taxa with each state is held constant, but the states are randomly reshuffled among the taxa; and second, that the probability that each taxon is assigned a particular state is constant from one taxon to the next. Under null models representing various combinations of randomizations of trees and characters, exact recursion equations are given to calculate the probability distribution of the number of character state changes required by a phylogenetic tree. Possible applications of these probability distributions are discussed. They can be used, for example, to test for a panmictic population structure within a species or to test for phylogenetic inertia in a character's evolution. Whether and how a null model incorporates tree randomness makes little difference to the probability distribution in many, but not all, circumstances. The null model's sense of character randomness appears more critical. The difficult issue of choosing a null model is discussed.
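The number of character state changes required by a tree is its parsimony length, computable with the Fitch algorithm; the sketch below pairs it with the state-reshuffling randomization (the first interpretation of random characters) to simulate the null distribution rather than apply the paper's exact recursions.

```python
import random

def fitch_changes(tree, states):
    """Minimum number of state changes on a tree (Fitch parsimony).
    tree: nested 2-tuples with taxon names at the tips; states: taxon -> state."""
    changes = 0
    def post(node):
        nonlocal changes
        if isinstance(node, str):
            return {states[node]}
        left, right = post(node[0]), post(node[1])
        if left & right:
            return left & right
        changes += 1                      # empty intersection: one change
        return left | right
    post(tree)
    return changes

tree = ((("a", "b"), "c"), ("d", "e"))
states = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 1}
print(fitch_changes(tree, states))        # -> 2

# Null model: reshuffle the fixed state counts among taxa, then recompute
# the required number of changes.
taxa, vals = list(states), list(states.values())
random.shuffle(vals)
print(fitch_changes(tree, dict(zip(taxa, vals))))
```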

18.
19.
A gene tree is an evolutionary reconstruction of the genealogical history of the genetic variation found in a sample of homologous genes or DNA regions that have experienced little or no recombination. Gene trees have the potential to straddle the interface between intra- and interspecific evolution. It is precisely at this interface that the process of speciation occurs, and gene trees can therefore be used as a powerful tool to probe this interface. One application is to infer species status. The cohesion species is defined as an evolutionary lineage, or set of lineages, with genetic exchangeability and/or ecological interchangeability. This species concept can be phrased in terms of null hypotheses that can be tested rigorously and objectively by using gene trees. First, an overlay of geography upon the gene tree is used to test the null hypothesis that the sample is from a single evolutionary lineage. This phase of testing can indicate that the sampled organisms are indeed from a single lineage and therefore a single cohesion species. In other cases, this null hypothesis is not rejected owing to a lack of power or inadequate sampling. Alternatively, this null hypothesis can be rejected because two or more lineages are in the sample. The test can identify lineages even when hybridization and lineage sorting occur. Only when this null hypothesis is rejected is there the potential for more than one cohesion species. Although all cohesion species are evolutionary lineages, not all evolutionary lineages are cohesion species. Therefore, if the first null hypothesis is rejected, a second null hypothesis is tested: that all lineages are genetically exchangeable and/or ecologically interchangeable. This second test is accomplished by direct contrasts of previously identified lineages, or by overlaying reproductive and/or ecological data upon the gene tree and testing for significant transitions that are concordant with the previously identified lineages. Only when this second null hypothesis is rejected is a lineage elevated to the status of cohesion species. By using gene trees in this manner, species can be identified with objective, a priori criteria through an inference procedure that automatically yields much insight into the process of speciation. When one or more of the null hypotheses cannot be rejected, this procedure also provides specific guidance for the future work needed to judge species status.

20.

Background  

Both computational and experimental approaches have been used to determine the minimal gene set required to sustain a bacterial cell. Such studies have provided clues to the minimal set of cellular functions needed for life. We evaluate a minimal cellular-function set directly, instead of a minimal gene set.

