Similar articles
20 similar articles found.
1.
We consider the problematic relationship between publication success and statistical significance in the light of analyses in which we examine the distribution of published probability (P) values across the statistical 'significance' range, below the 5% probability threshold. P-values are often judged according to whether they lie beneath traditionally accepted thresholds (< 0.05, < 0.01, < 0.001, < 0.0001); we examine how these thresholds influence the distribution of reported absolute P-values in published scientific papers, the majority in biological sciences. We collected published P-values from three leading journals, and summarized their distribution using the frequencies falling across and within these four threshold values between 0.05 and 0. These published frequencies were then fitted to three complementary null models which allowed us to predict the expected proportions of P-values in the top and bottom half of each inter-threshold interval (i.e. those lying below, as opposed to above, each P-value threshold). Statistical comparison of these predicted proportions, against those actually observed, provides the first empirical evidence for a remarkable excess of probability values being cited on, or just below, each threshold relative to the smoothed theoretical distributions. The pattern is consistent across thresholds and journals, and holds whichever theoretical approach is used to generate our expected proportions. We discuss this novel finding and its implications for solving the problems of publication bias and selective reporting in evolutionary biology.
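The abstract does not name the test used to compare observed and predicted proportions. As a minimal sketch, assuming an exact one-sided binomial test is an acceptable stand-in, the comparison for a single inter-threshold interval could look like the following. The counts and the expected share are hypothetical and are not taken from the paper.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical inputs (assumptions for illustration only):
n_interval = 200         # published P-values falling in one inter-threshold interval
n_near_threshold = 130   # of those, the number in the half just below the threshold
expected_share = 0.55    # share a null model predicts for that half

p_excess = binomial_upper_tail(n_near_threshold, n_interval, expected_share)
print(f"one-sided P-value for an excess near the threshold: {p_excess:.4g}")
```

A small value here would indicate more reported P-values just below the threshold than the null model predicts for that interval.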

2.
MOTIVATION: Current methods for multiplicity adjustment do not make use of the graph structure of Gene Ontology (GO) when testing for association of expression profiles of GO terms with a response variable. RESULTS: We propose a multiple testing method, called the focus level procedure, that preserves the graph structure of Gene Ontology (GO). The procedure is constructed as a combination of a Closed Testing procedure with Holm's method. It requires a user to choose a 'focus level' in the GO graph, which reflects the level of specificity of terms in which the user is most interested. This choice also determines the level in the GO graph at which the procedure has most power. We prove that the procedure strongly controls the family-wise error rate without any additional assumptions on the joint distribution of the test statistics used. We also present an algorithm to calculate multiplicity-adjusted P-values. Because the focus level procedure preserves the structure of the GO graph, it does not generally preserve the ordering of the raw P-values in the adjusted P-values. AVAILABILITY: The focus level procedure has been implemented in the globaltest and GlobalAncova packages, both of which are available on www.bioconductor.org.
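For orientation, the sketch below shows plain Holm step-down adjustment, one of the two building blocks named in the abstract. It is not the focus level procedure itself and it ignores the GO graph entirely; the function name and the example P-values are assumptions made for illustration.

```python
def holm_adjust(pvalues):
    """Return Holm step-down adjusted P-values in the original order."""
    m = len(pvalues)
    # Visit the hypotheses from the smallest raw P-value to the largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P-value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw P-values
print(holm_adjust(raw))
```

Unlike the graph-based procedure described above, plain Holm adjustment always preserves the ordering of the raw P-values.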

3.
Genetic association studies routinely involve massive numbers of statistical tests accompanied by P-values. Whole genome sequencing technologies increased the potential number of tested variants to tens of millions. The more tests are performed, the smaller the P-value required to be deemed significant. However, a small P-value is not equivalent to small chances of a spurious finding and significance thresholds may fail to serve as efficient filters against false results. While the Bayesian approach can provide a direct assessment of the probability that a finding is spurious, its adoption in association studies has been slow, due in part to the ubiquity of P-values and the automated way they are, as a rule, produced by software packages. Attempts to design simple ways to convert an association P-value into the probability that a finding is spurious have been met with difficulties. The False Positive Report Probability (FPRP) method has gained increasing popularity. However, FPRP is not designed to estimate the probability for a particular finding, because it is defined for an entire region of hypothetical findings with P-values at least as small as the one observed for that finding. Here we propose a method that lets researchers extract the probability that a finding is spurious directly from a P-value. Considering the counterpart of that probability, we term this method POFIG: the Probability that a Finding is Genuine. Our approach shares FPRP's simplicity, but gives a valid probability that a finding is spurious given a P-value. In addition to straightforward interpretation, POFIG has desirable statistical properties. The POFIG average across a set of tentative associations provides an estimated proportion of false discoveries in that set. POFIGs are easily combined across studies and are immune to multiple testing and selection bias. We illustrate an application of the POFIG method via analysis of GWAS associations with Crohn's disease.
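The abstract does not give the POFIG formula, so the sketch below covers only FPRP in its commonly cited form, FPRP = α(1 − π) / (α(1 − π) + (1 − β)π). Here α is the observed P-value used as a significance level, 1 − β is the power and π is the prior probability of a true association. The function name and all numeric inputs are assumptions chosen for illustration.

```python
def fprp(alpha, power, prior):
    """False Positive Report Probability for a given alpha, power and prior."""
    if not (0.0 < alpha <= 1.0 and 0.0 < power <= 1.0 and 0.0 < prior < 1.0):
        raise ValueError("alpha and power must lie in (0, 1], prior in (0, 1)")
    false_positive_mass = alpha * (1.0 - prior)  # null is true and the test rejects
    true_positive_mass = power * prior           # association is real and is detected
    return false_positive_mass / (false_positive_mass + true_positive_mass)


# Hypothetical inputs: the same P-value threshold under two different priors.
for prior in (0.5, 0.001):
    print(prior, round(fprp(alpha=0.05, power=0.8, prior=prior), 4))
```

The same threshold yields very different values under the two priors, which illustrates the abstract's point that a small P-value alone does not imply a small chance of a spurious finding.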

4.
Summary. Pharmacovigilance systems aim at early detection of adverse effects of marketed drugs. They maintain large spontaneous reporting databases for which several automatic signaling methods have been developed. One limitation of those methods is that the decision rules for the signal generation are based on arbitrary thresholds. In this article, we propose a new signal-generation procedure. The decision criterion is formulated in terms of a critical region for the P-values resulting from the reporting odds ratio method as well as from the Fisher's exact test. For the latter, we also study the use of mid-P-values. The critical region is defined by the false discovery rate, which can be estimated by adapting the P-value mixture-model-based procedures to one-sided tests. The methodology is mainly illustrated with the location-based estimator procedure. It is studied through a large simulation study and applied to the French pharmacovigilance database.
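To make the two P-value variants concrete, here is a minimal sketch of a one-sided Fisher's exact test and its mid-P-value for a single drug-event pair. The 2×2 cell labels (`a`, `b`, `c`, `d`) and the example counts are assumptions, and the FDR-based critical region from the abstract is not implemented.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """Return (p, mid_p) for an excess in cell a of a 2x2 report table.

    a: reports with the drug and the event   b: drug, other events
    c: other drugs with the event            d: other drugs, other events
    """
    drug_total = a + b
    event_total = a + c
    n = a + b + c + d
    denom = comb(n, drug_total)

    def prob(k):
        # Hypergeometric probability of k drug-event reports given fixed margins.
        return comb(event_total, k) * comb(n - event_total, drug_total - k) / denom

    p_equal = prob(a)
    p_greater = sum(prob(k) for k in range(a + 1, min(drug_total, event_total) + 1))
    # The mid-P-value counts only half of the probability of the observed table.
    return p_equal + p_greater, 0.5 * p_equal + p_greater


p, mid_p = fisher_one_sided(3, 1, 1, 3)  # hypothetical counts
print(p, mid_p)
```

The mid-P-value is never larger than the exact P-value, which makes it less conservative for the small counts typical of spontaneous reporting data.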

5.
Kuo CL, Zaykin DV. Genetics, 2011, 189(1): 329-340
In recent years, genome-wide association studies (GWAS) have uncovered a large number of susceptibility variants. Nevertheless, GWAS findings provide only tentative evidence of association, and replication studies are required to establish their validity. Due to this uncertainty, researchers often focus on top-ranking SNPs, instead of considering strict significance thresholds to guide replication efforts. The number of SNPs for replication is often determined ad hoc. We show how the rank-based approach can be used for sample size allocation in GWAS as well as for deciding on a number of SNPs for replication. The basis of this approach is the "ranking probability": chances that at least j true associations will rank among top u SNPs, when SNPs are sorted by P-value. By employing simple but accurate approximations for ranking probabilities, we accommodate linkage disequilibrium (LD) and evaluate consequences of ignoring LD. Further, we relate ranking probabilities to the proportion of false discoveries among top u SNPs. A study-specific proportion can be estimated from P-values, and its expected value can be predicted for study design applications.
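The ranking probability defined above can be estimated by brute-force simulation. The sketch below is not the authors' approximation: it assumes independent SNPs (no LD), normally distributed test statistics, and a common effect size `mu` for all true associations. The function name and every parameter value are illustrative assumptions.

```python
import random


def ranking_probability(m, k, mu, u, j, n_sim=2000, seed=1):
    """Monte Carlo estimate of P(at least j of k true SNPs rank in the top u of m)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        # True associations: statistic ~ N(mu, 1); null SNPs: statistic ~ N(0, 1).
        stats = [(abs(rng.gauss(mu, 1.0)), True) for _ in range(k)]
        stats += [(abs(rng.gauss(0.0, 1.0)), False) for _ in range(m - k)]
        # Sorting by |z| descending is the same as sorting by two-sided P-value.
        stats.sort(key=lambda item: item[0], reverse=True)
        true_in_top = sum(1 for _, is_true in stats[:u] if is_true)
        if true_in_top >= j:
            hits += 1
    return hits / n_sim


# Hypothetical design: 1000 SNPs, 5 true associations, follow up the top 20.
print(ranking_probability(m=1000, k=5, mu=3.0, u=20, j=3, n_sim=500))
```

Re-running the function over a grid of `u` values gives a rough sense of how many top-ranked SNPs a replication study would need to carry forward under these assumptions.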

6.
Ten genes (ANK1, bR10D1, CA3, EPOR, HMGA2, MYPN, NME1, PDGFRA, ERC1, TTN), whose candidacy for meat-quality and carcass traits arises from their differential expression in prenatal muscle development, were examined for association in 1700 performance-tested fattening pigs of commercial purebred and crossbred herds of Duroc, Pietrain, Pietrain × (Landrace × Large White), Duroc × (Landrace × Large White) as well as in an experimental F2 population based on a reciprocal cross of Duroc and Pietrain. Comparative sequencing revealed polymorphic sites segregating across commercial breeds. Genetic mapping results corresponded to pre-existing assignments to porcine chromosomes or current human-porcine comparative maps. Nine of these genes showed association with meat-quality and carcass traits at a nominal P-value of ≤ 0.05; PDGFRA revealed no association reaching the P ≤ 0.05 threshold. In particular, HMGA2, CA3, EPOR, NME1 and TTN were associated with meat colour, pH and conductivity of loin 24 h postmortem; CA3 and MYPN exhibited association with ham weight and lean content (FOM) respectively at P-values of < 0.003 that correspond to false discovery rates of < 0.05. However, none of the genes showed significant associations for a particular trait across all populations. The study revealed statistical-genetic evidence for association of the functional candidate genes with traits related to meat quality and muscle deposition. The polymorphisms detected are not likely causal, but markers were identified that are in linkage disequilibrium with causal genetic variation within particular populations.
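The abstract relates nominal P-values to false discovery rates without saying how the rates were computed. As a sketch under the assumption that a Benjamini-Hochberg style adjustment is used (the study may have used a different estimator), the mapping from P-values to FDR-adjusted values looks like this. The input P-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(pvalues)
    # Walk from the largest P-value down so a running minimum enforces monotonicity.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # rank 1 is the smallest P-value
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


nominal = [0.002, 0.04, 0.20, 0.0005, 0.65]  # hypothetical nominal P-values
print(bh_adjust(nominal))
```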

7.
Ranks of genuine associations in whole-genome scans
Zaykin DV, Zhivotovsky LA. Genetics, 2005, 171(2): 813-823
With the recent advances in high-throughput genotyping techniques, it is now possible to perform whole-genome association studies to fine map causal polymorphisms underlying important traits that influence susceptibility to human diseases and efficacy of drugs. Once a genome scan is completed the results can be sorted by the association statistic value. What is the probability that true positives will be encountered among the first most associated markers? When a particular polymorphism is found associated with the trait, there is a chance that it represents either a "true" or a "false" association (TA vs. FA). Setting appropriate significance thresholds has been considered to provide assurance of sufficient odds that the associations found to be significant are genuine. However, the problem with genome scans involving thousands of markers is that the statistic values of FAs can reach quite extreme magnitudes. In such situations, the distributions corresponding to TAs and the most extreme FAs become comparable and significance thresholds tend to penalize TAs and FAs in a similar fashion. When sorting between true and false associations, the "typical" place (i.e., rank) of TAs among the most significant outcomes becomes important, ordered by the association statistic value. The distribution of ranks that we study here allows calculation of several useful quantities. In particular, it gives the number of most significant markers needed for a follow-up study to guarantee that a true association is included with certain probability. This can be calculated conditionally on having applied a multiple-testing correction. Effects of multilocus (e.g., haplotype association) tests and impact of linkage disequilibrium on the distribution of ranks associated with TAs are evaluated and can be taken into account.

8.
Jiang Z, Akey JM, Shi J, Xiong M, Wang Y, Shen Y, Xu X, Chen H, Wu H, Xiao J, Lu D, Huang W, Jin L. Human Genetics, 2001, 109(1): 95-98
Catalase is an important antioxidant enzyme that detoxifies H2O2 into oxygen and water and thus limits the deleterious effects of reactive oxygen species (ROS). Because chronic exposure to excess ROS may contribute to vascular damage, we investigated whether genetic variation in catalase was associated with susceptibility to essential hypertension (EHYT) in 324 individuals (at least 50 years old) who were randomly sampled from an isolated population living in Xiangchang, China. They were screened for genetic variation in the promoter of catalase by direct sequencing. In total, four single nucleotide polymorphisms (SNPs) were identified. The association between the SNPs and EHYT was investigated by a linear regression model under phenotypic selection; in our analyses, we used both SBP>150 mmHg and SBP>160 mmHg as thresholds. A SNP 844 bp upstream of the start codon (SNP-844) demonstrated strong evidence of association with EHYT (SBP>150 mmHg: F=5.09, P=0.008; SBP>160 mmHg: F=7.13, P=0.002). This is the first study to implicate genetic variation in catalase in susceptibility to EHYT and suggests that polymorphisms in promoter regions may be particularly relevant to the study of complex diseases.

9.

Background

Advanced intercross lines (AIL) are segregating populations created using a multi-generation breeding protocol for fine mapping complex trait loci (QTL) in mice and other organisms. Applying QTL mapping methods for intercross and backcross populations, often followed by naïve permutation of individuals and phenotypes, does not account for the effect of AIL family structure in which final generations have been expanded and leads to inappropriately low significance thresholds. The critical problem with naïve mapping approaches in AIL populations is that the individual is not an exchangeable unit.

Methodology/Principal Findings

The effect of family structure has immediate implications for the optimal AIL creation (many crosses, few animals per cross, and population expansion before the final generation) and we discuss these and the utility of AIL populations for QTL fine mapping. We also describe Genome Reshuffling for Advanced Intercross Permutation (GRAIP), a method for analyzing AIL data that accounts for family structure. GRAIP permutes a more interchangeable unit in the final generation crosses, the parental genome, and simulates regeneration of a permuted AIL population based on exchanged parental identities. GRAIP determines appropriate genome-wide significance thresholds and locus-specific P-values for AILs and other populations with similar family structures. We contrast GRAIP with naïve permutation using a large densely genotyped mouse AIL population (1333 individuals from 32 crosses). A naïve permutation using coat color as a model phenotype demonstrates high false-positive locus identification and uncertain significance levels, which are corrected using GRAIP. GRAIP also detects an established hippocampus weight locus and a new locus, Hipp9a.

Conclusions and Significance

GRAIP determines appropriate genome-wide significance thresholds and locus-specific P-values for AILs and other populations with similar family structures. The effect of family structure has immediate implications for the optimal AIL creation and we discuss these and the utility of AIL populations.
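For contrast with GRAIP, the following sketch shows the naïve procedure the abstract warns against: phenotypes are shuffled among individuals and the genome-wide threshold is read off the distribution of the maximum statistic. It treats individuals as exchangeable, which is exactly the assumption that fails in AILs, and it does not implement GRAIP. The absolute correlation statistic, the function names and the data layout are all assumptions made for illustration.

```python
import random
from math import sqrt


def abs_correlation(x, y):
    """Absolute Pearson correlation, or 0.0 if either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic marker or constant trait carries no signal
    return abs(sxy) / sqrt(sxx * syy)


def naive_permutation_threshold(genotypes, phenotype, n_perm=500, level=0.05, seed=1):
    """Genome-wide threshold from permuting phenotypes among individuals.

    genotypes: one list per marker, each holding one coded value per individual.
    """
    rng = random.Random(seed)
    shuffled = list(phenotype)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Record the strongest marker signal seen in this permuted data set.
        maxima.append(max(abs_correlation(marker, shuffled) for marker in genotypes))
    maxima.sort()
    index = min(n_perm - 1, int((1 - level) * n_perm))
    return maxima[index]
```

A marker would be called genome-wide significant when its observed statistic exceeds the returned value. In a family-structured population this threshold is expected to be too lenient, which is the problem GRAIP is designed to correct.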

10.
WHAP: haplotype-based association analysis
We describe a software tool to perform haplotype-based association analysis, for quantitative and qualitative traits, in population and family samples, using single nucleotide polymorphism or multiallelic marker data. A range of tests is offered: omnibus and haplotype-specific tests; prospective and retrospective likelihoods; covariates and moderators; sliding window analyses; permutation P-values. We focus on the ability to flexibly impose constraints on haplotype effects, which allows for a range of conditional haplotype-based likelihood ratio tests: for example, whether an allele has an effect independent of its haplotypic background, or whether a single variant can explain the overall association at a locus. We illustrate using these tests to dissect a multi-locus association. AVAILABILITY: WHAP is a C/C++ program, freely available from the author's website: http://pngu.mgh.harvard.edu/purcell/whap/

11.
The cytotoxic T lymphocyte antigen 4 (CTLA4) gene plays a critical role in the control of T cell activation. The gene encodes a surface molecule with inhibitory effects on activated T cells. Several studies have disclosed an association between the previously known variants of the CTLA4 gene and autoimmune disorders, but no study has as yet found any definite association between vitiligo and the CTLA4 polymorphisms. A recent study identified new candidate susceptibility polymorphisms in this region, associated with differential gene splicing and thereby the relative abundance of soluble CTLA4. To assess these new polymorphisms in patients with vitiligo, we genotyped 100 vitiligo patients and 140 healthy controls from the UK for these novel polymorphisms. No association was found in patients with isolated vitiligo, but a significant association was seen in patients with vitiligo and other autoimmune diseases. The results indicate that the polymorphisms in the CTLA4 gene region confer susceptibility to vitiligo when occurring together with other autoimmune diseases, but not in patients with isolated vitiligo. This raises the possibility that there are two distinct forms of vitiligo where only a subgroup of patients may have a disease caused by the autoimmune destruction of melanocytes.

12.
Aulchenko YS, de Koning DJ, Haley C. Genetics, 2007, 177(1): 577-585
For pedigree-based quantitative trait loci (QTL) association analysis, a range of methods utilizing within-family variation such as transmission-disequilibrium test (TDT)-based methods have been developed. In scenarios where stratification is not a concern, methods exploiting between-family variation in addition to within-family variation, such as the measured genotype (MG) approach, have greater power. Application of MG methods can be computationally demanding (especially for large pedigrees), making genomewide scans practically infeasible. Here we suggest a novel approach for genomewide pedigree-based quantitative trait loci (QTL) association analysis: genomewide rapid association using mixed model and regression (GRAMMAR). The method first obtains residuals adjusted for family effects and subsequently analyzes the association between these residuals and genetic polymorphisms using rapid least-squares methods. At the final step, the selected polymorphisms may be followed up with the full measured genotype (MG) analysis. In a simulation study, we compared type 1 error, power, and operational characteristics of the proposed method with those of MG and TDT-based approaches. For moderately heritable (30%) traits in human pedigrees the power of the GRAMMAR and the MG approaches is similar and is much higher than that of TDT-based approaches. When using tabulated thresholds, the proposed method is less powerful than MG for very high heritabilities and pedigrees including large sibships like those observed in livestock pedigrees. However, there is little or no difference in empirical power of MG and the proposed method. In any scenario, GRAMMAR is much faster than MG and enables rapid analysis of hundreds of thousands of markers.

13.

Background

The genetic contribution to sporadic amyotrophic lateral sclerosis (ALS) has not been fully elucidated. There are increasing efforts to characterise the role of copy number variants (CNVs) in human diseases; two previous studies concluded that CNVs may influence risk of sporadic ALS, with multiple rare CNVs more important than common CNVs. A little-explored issue surrounding genome-wide CNV association studies is that of post-calling filtering and merging of raw CNV calls. We undertook simulations to define filter thresholds and considered optimal ways of merging overlapping CNV calls for association testing, taking into consideration possibly overlapping or nested, but distinct, CNVs and boundary estimation uncertainty.

Methodology and Principal Findings

In this study we screened Illumina 300K SNP genotyping data from 730 ALS cases and 789 controls for copy number variation. Following quality control filters using thresholds defined by simulation, a total of 11321 CNV calls were made across 575 cases and 621 controls. Using region-based and gene-based association analyses, we identified several loci showing nominally significant association. However, the choice of criteria for combining calls for association testing has an impact on the ranking of the results by their significance. Several loci which were previously reported as being associated with ALS were identified here. However, of another 15 genes previously reported as exhibiting ALS-specific copy number variation, only four exhibited copy number variation in this study. Potentially interesting novel loci, including EEF1D, a translation elongation factor involved in the delivery of aminoacyl tRNAs to the ribosome (a process which has previously been implicated in genetic studies of spinal muscular atrophy) were identified but must be treated with caution due to concerns surrounding genomic location and platform suitability.

Conclusions and Significance

Interpretation of CNV association findings must take into account the effects of filtering and combining CNV calls when based on early genome-wide genotyping platforms and modest study sizes.

14.
The ataxia telangiectasia mutated (ATM) gene plays a major role in repairing double-strand breaks and maintaining genome stability. In this case-control study, associations of seven ATM single-nucleotide polymorphisms (rs600931, rs189037, rs652311, rs624366, rs228589, rs227092 and rs227060) with the risk of childhood leukemia in a Taiwanese population were investigated. Two hundred and sixty-six patients with childhood leukemia and 266 age-matched healthy controls were recruited, genotyped and analyzed. The P-values of the distributions of the genotypic frequencies in the seven ATM polymorphisms were 0.8925, 0.2835, 0.5772, 0.8731, 0.3641, 0.9181 and 0.5071, respectively. The P-values of the distributions of the allelic frequencies in the seven ATM polymorphisms were 0.6158, 0.1179, 0.6971, 0.7944, 0.1887, 0.6605 and 0.2747, respectively. Although the results did not indicate that ATM polymorphism is directly associated with childhood leukemia, the gene-gene and gene-environment interactions of ATM with other factors are worthy of further investigation.

15.
Global human genetics of HIV-1 infection and China
Zhu TF, Feng TJ, Xiao X, Wang H, Zhou BP. Cell Research, 2005, 15(11-12): 833-842
Genetic polymorphisms in human genes can influence the risk for HIV-1 infection and disease progression, although the reported effects of these alleles have been inconsistent. This review highlights the recent discoveries on global and Chinese genetic polymorphisms and their association with HIV-1 transmission and disease progression.

16.
INTRODUCTION HIV-1 infection results in a variety of clinical outcomes. The majority of HIV-1 infected individuals progress to AIDS within 5 to 10 years, some progress rapidly to AIDS (termed rapid progressors) while others progress to AIDS slowly (slow p…

17.
Genetic polymorphisms in human genes can influence the risk for HIV-1 infection and disease progression, although the reported effects of these alleles have been inconsistent. This review highlights the recent discoveries on global and Chinese genetic polymorphisms and their association with HIV-1 transmission and disease progression.

18.
Genetic polymorphisms in human genes can influence the risk for HIV-1 infection and disease progression, although the reported effects of these alleles have been inconsistent. This review highlights the recent discoveries on global and Chinese genetic polymorphisms and their association with HIV-1 transmission and disease progression.

19.
This study focused on the association of polymorphisms of the FADS2 gene with fatty acid profiles in egg yolk of eight Japanese quail lines selected for high and low omega-6:omega-3 PUFA ratio (h² = 0.36-0.38). For the identification of polymorphisms within the FADS2 gene, 1350 bp of cDNA sequence encoding 404 amino acids were obtained. Five synonymous SNPs were found by comparative sequencing of animals of the high and low lines. These SNPs were genotyped by single base extension on 160 Japanese quail. The association analysis, comprising analysis of variance and family based association test (FBAT), revealed significant effects of SNP3 and SNP4 genotypes on the egg yolk fatty acid profiles, especially the omega-6 and omega-3 PUFAs (P < 0.05). No effects of the other SNPs were found, indicating that these are not in linkage disequilibrium with the causal polymorphism. The results of this study promote FADS2 as a functional candidate gene for traits related to omega-6 and omega-3 PUFA concentration in the egg yolk.

20.
Yang Z, Liang Y, Qin B, Li C, Zhong R. Cytokine, 2012, 57(2): 282-289
The results from previous studies on the association of TLR9 polymorphisms with the risk of systemic lupus erythematosus (SLE) remain contradictory. Therefore, a meta-analysis was performed to assess the association between TLR9 polymorphisms and SLE susceptibility. A literature-based search was conducted to identify all relevant studies. Pooled data were estimated by fixed- and random-effects models when appropriate. We examined seven publications, showing that there were only three polymorphisms (-1486C/T, +1174A/G and +1635C/T) existing in Asian populations. The meta-analysis indicated that none of these three polymorphisms showed any significant association with SLE risk in Asian populations. In conclusion, the present study indicates that TLR9 polymorphisms are not candidates for susceptibility to SLE, at least in eastern Asian populations. Furthermore, a large number of studies should be performed to explore the association of TLR9 polymorphisms with the risk of SLE in other populations, such as Europeans, Americans and Africans.
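The fixed- and random-effects pooling mentioned above can be sketched with inverse-variance weights and the DerSimonian-Laird estimate of between-study variance. The abstract does not state which estimators were used, so this choice is an assumption, and the per-study log odds ratios and variances below are hypothetical rather than taken from the seven publications.

```python
from math import sqrt


def pool_effects(effects, variances):
    """Return fixed- and random-effects (DerSimonian-Laird) pooled estimates."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    # Between-study variance, truncated at zero; zero when it cannot be estimated.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    random_est = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    return {
        "fixed": fixed, "fixed_se": sqrt(1.0 / sum_w),
        "random": random_est, "random_se": sqrt(1.0 / sum_re),
        "Q": q, "tau2": tau2,
    }


# Hypothetical per-study log odds ratios and their variances.
result = pool_effects([0.10, -0.05, 0.20], [0.04, 0.09, 0.06])
print(result)
```

When the estimated between-study variance is zero the two models coincide, which is one common reading of using fixed or random effects "when appropriate".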
