Similar Documents
20 similar documents found.
1.
The inverse normal and Fisher's methods are two common approaches for combining P-values. Whitlock demonstrated that a weighted version of the inverse normal method, or 'weighted Z-test', is superior to Fisher's method for combining P-values for one-sided T-tests. The problem with Fisher's method is that it does not take advantage of weighting and loses power relative to the weighted Z-test when studies are differently sized. This issue was recently revisited by Chen, who observed that Lancaster's variation of Fisher's method had higher power than the weighted Z-test. Nevertheless, the weighted Z-test has comparable power to Lancaster's method when its weights are set to the square roots of sample sizes. Power can be further improved when additional information is available. Although no single approach is best in every situation, the weighted Z-test enjoys certain properties that make it an appealing choice as a combination method for meta-analysis.
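A minimal sketch of the weighted Z-test described above, using only the Python standard library. The per-study P-values and sample sizes are hypothetical, and the weights are set to the square roots of the sample sizes as the abstract recommends:

```python
# Weighted Z-test (weighted inverse normal method) for one-sided P-values.
from math import sqrt
from statistics import NormalDist

def weighted_z_test(p_values, weights):
    """Combine one-sided P-values with the weighted Z-test."""
    z = [NormalDist().inv_cdf(1.0 - p) for p in p_values]   # one-sided P -> Z
    num = sum(w * zi for w, zi in zip(weights, z))
    den = sqrt(sum(w * w for w in weights))
    return 1.0 - NormalDist().cdf(num / den)                # combined one-sided P

p = [0.01, 0.20, 0.40]                 # hypothetical per-study P-values
n = [100, 50, 25]                      # hypothetical sample sizes
print(weighted_z_test(p, [sqrt(ni) for ni in n]))
```

With all weights equal, this reduces to the ordinary (unweighted) Z-transform test.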

2.
The most commonly used method in evolutionary biology for combining information across multiple tests of the same null hypothesis is Fisher's combined probability test. This note shows that an alternative method called the weighted Z-test has more power and more precision than does Fisher's test. Furthermore, in contrast to some statements in the literature, the weighted Z-method is superior to the unweighted Z-transform approach. The results in this note show that, when combining P-values from multiple tests of the same hypothesis, the weighted Z-method should be preferred.

3.
As more investigators conduct extensive whole-genome linkage scans for complex traits, interest is growing in meta-analysis as a way of integrating the weak or conflicting evidence from multiple studies. However, there is a bias in the most commonly used meta-analysis linkage technique (i.e., Fisher's [1925] method of combining P-values) when it is applied to many nonparametric (i.e., model-free) linkage results. The bias arises in those methods (e.g., variance components, affected sib pair, extremely discordant sib pairs, etc.) that truncate all "negative evidence against linkage" into the single value of LOD = 0. If incorrectly handled, this bias can artificially inflate or deflate the combined meta-analysis linkage results for any given locus. This is an especially troublesome problem in the context of a genome scan, since LOD = 0 is expected to occur over half the unlinked genome. The bias can be overcome (nearly) completely by simply interpreting LOD = 0 as a P-value of 1/(2 ln 2) ≈ 0.72 in Fisher's formula.
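The correction above can be sketched in a few lines of standard-library Python: each nonparametric LOD score is converted to a one-sided P-value, with a truncated LOD = 0 entered as 1/(2 ln 2) ≈ 0.72 rather than 0.5, and the P-values are combined with Fisher's formula. The LOD scores below are hypothetical, and the chi-square(1 df) LOD-to-P conversion is one standard choice:

```python
from math import erfc, exp, log, sqrt

LOD0_P = 1.0 / (2.0 * log(2.0))            # ~0.7213, corrected P for LOD = 0

def lod_to_p(lod):
    """One-sided P-value for a nonparametric LOD score (chi-square, 1 df)."""
    if lod == 0.0:
        return LOD0_P                       # corrected value for truncated scores
    x = 2.0 * log(10.0) * lod               # LOD -> chi-square statistic
    return erfc(sqrt(x / 2.0)) / 2.0        # upper tail of chi2(1), halved

def fisher_combine(p_values):
    """Fisher: -2*sum(ln p) ~ chi-square with 2k df (closed-form even-df tail)."""
    x = -2.0 * sum(log(p) for p in p_values)
    k = len(p_values)                       # df = 2k, always even
    term, s = 1.0, 1.0
    for j in range(1, k):
        term *= (x / 2.0) / j
        s += term
    return exp(-x / 2.0) * s

lods = [1.5, 0.0, 0.8, 0.0]                 # hypothetical LOD scores at one locus
print(fisher_combine([lod_to_p(l) for l in lods]))
```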

4.
Through simulation, Whitlock showed that when all the alternatives have the same effect size, the weighted z-test is superior to both the unweighted z-test and Fisher's method when combining P-values from independent studies. In this paper, we show that under the same situation, the generalized Fisher method due to Lancaster outperforms the weighted z-test.
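A sketch of Lancaster's generalized Fisher method, assuming SciPy is available: each P-value is converted to a chi-square variate with study-specific degrees of freedom d_i (Fisher's method is the special case d_i = 2), and the sum is referred to a chi-square with sum(d_i) df. Taking d_i proportional to sample size is one common weighting; the numbers below are hypothetical:

```python
from scipy.stats import chi2

def lancaster_combine(p_values, dfs):
    """Lancaster's method: P_i -> chi2(d_i) variate, sum ~ chi2(sum d_i)."""
    x = sum(chi2.isf(p, df=d) for p, d in zip(p_values, dfs))
    return chi2.sf(x, df=sum(dfs))

p = [0.01, 0.20, 0.40]     # hypothetical P-values
n = [100, 50, 25]          # hypothetical sample sizes used as degrees of freedom
print(lancaster_combine(p, n))
```

Setting every entry of `dfs` to 2 recovers Fisher's combined probability test exactly.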

5.
Evolutionarily stable strategy (ESS) models are widely viewed as predicting the strategy of an individual that when monomorphic or nearly so prevents a mutant with any other strategy from entering the population. In fact, the prediction of some of these models is ambiguous when the predicted strategy is "mixed", as in the case of a sex ratio, which may be regarded as a mixture of the subtraits "produce a daughter" and "produce a son." Some models predict only that such a mixture be manifested by the population as a whole, that is, as an "evolutionarily stable state"; consequently, strategy monomorphism or polymorphism is consistent with the prediction. The hawk-dove game and the sex-ratio game in a panmictic population are models that make such a "degenerate" prediction. We show here that the incorporation of population finiteness into degenerate models has effects for and against the evolution of a monomorphism (an ESS) that are of equal order in the population size, so that no one effect can be said to predominate. Therefore, we used Monte Carlo simulations to determine the probability that a finite population evolves to an ESS as opposed to a polymorphism. We show that the probability that an ESS will evolve is generally much less than has been reported and that this probability depends on the population size, the type of competition among individuals, and the number of and distribution of strategies in the initial population. We also demonstrate how the strength of natural selection on strategies can increase as population size decreases. This inverse dependency underscores the incorrectness of Fisher's and Wright's assumption that there is just one qualitative relationship between population size and the intensity of natural selection.
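The hawk-dove game mentioned above is easy to make concrete. In the standard payoff matrix with resource value V and fight cost C > V, the degenerate mixed ESS plays hawk with probability V/C. The sketch below uses hypothetical V and C and the usual infinite-population expected payoffs (not the paper's finite-population Monte Carlo simulations) to check that hawk and dove payoffs are equal at that frequency:

```python
def payoffs(p, v=2.0, c=4.0):
    """Expected payoffs to hawk and dove when a fraction p plays hawk."""
    hawk = p * (v - c) / 2.0 + (1.0 - p) * v   # fight half the time vs hawks
    dove = (1.0 - p) * v / 2.0                  # share only with other doves
    return hawk, dove

p_ess = 2.0 / 4.0                               # ESS hawk frequency V/C
h, d = payoffs(p_ess)
print(h, d)                                     # equal at the ESS frequency
```

Away from V/C the rarer pure strategy does better, which is what pulls an infinite population toward the mixed state; in a finite population, drift around this equilibrium is exactly what the abstract's simulations quantify.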

6.
Molecular structure of amino acids, degeneracy of the genetic code, and a two-dimensional set classification
According to the degree of degeneracy of the amino acid codons, the 64 genetic codons can be divided into two classes: a high-degeneracy class (the 3-, 4-, and 6-fold degenerate groups) and a low-degeneracy class (the 1- and 2-fold degenerate groups). The high-degeneracy class contains 9 amino acids, which have relatively small molecular weights and closely clustered isoelectric points. The low-degeneracy class contains 11 amino acids with relatively complex molecular structures. With reference to Taylor's classification diagram of amino acid properties, this paper proposes molecular weight (M) and isoelectric point (P) as chemical-property coordinates for amino acids and constructs a two-dimensional MP set-classification diagram. The MP diagram reflects various amino acid attributes, such as molecular weight, degree of codon degeneracy, polarity versus nonpolarity, charged versus uncharged state, hydrophobicity versus hydrophilicity, and residue type. From this classification analysis, the high-degeneracy amino acids are mostly aliphatic and hydroxy-aliphatic amino acids with small molecular weights and simple structures; most are hydrophobic, mainly forming transmembrane structures or protein domains, and may be amino acids that appeared early in evolution. The low-degeneracy amino acids have more complex structures and larger molecular weights; most carry groups closely tied to protein function and may be structures that appeared later in evolution.

7.
Pharmacovigilance systems aim at early detection of adverse effects of marketed drugs. They maintain large spontaneous reporting databases for which several automatic signaling methods have been developed. One limitation of these methods is that their signal-generation decision rules rest on arbitrary thresholds. In this article, we propose a new signal-generation procedure. The decision criterion is formulated in terms of a critical region for the P-values resulting from the reporting odds ratio method as well as from Fisher's exact test. For the latter, we also study the use of mid-P-values. The critical region is defined by the false discovery rate, which can be estimated by adapting the P-value mixture-model-based procedures to one-sided tests. The methodology is mainly illustrated with the location-based estimator procedure. It is studied through a large simulation study and applied to the French pharmacovigilance database.
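The article's location-based mixture-model FDR estimator is beyond a short sketch, but the general idea of replacing an arbitrary threshold with an FDR-controlled critical region for P-values can be illustrated with the Benjamini-Hochberg step-up rule (a stand-in for illustration, not the authors' procedure). The drug-event P-values below are hypothetical:

```python
def bh_reject(p_values, q=0.05):
    """Benjamini-Hochberg step-up: True where the hypothesis is rejected at FDR q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:     # largest rank passing its threshold
            cutoff = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[i] = True
    return rejected

p = [0.001, 0.008, 0.039, 0.041, 0.60]      # hypothetical signal P-values
print(bh_reject(p))
```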

8.
MOTIVATION: A number of available program packages determine the significant enrichments and/or depletions of GO categories among a class of genes of interest. Whereas a correct formulation of the problem leads to a single exact null distribution, these GO tools use a large variety of statistical tests whose denominations often do not clarify the underlying P-value computations. SUMMARY: We review the different formulations of the problem and the tests they lead to: the binomial, chi-square, equality of two probabilities, Fisher's exact and hypergeometric tests. We clarify the relationships existing between these tests, in particular the equivalence between the hypergeometric test and Fisher's exact test. We recall that the other tests are valid only for large samples, the test of equality of two probabilities and the chi-square test being equivalent. We discuss the appropriateness of one- and two-sided P-values, as well as some discreteness and conservatism issues. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
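The equivalence the review recalls between the hypergeometric test and Fisher's exact test can be checked directly with SciPy. The counts below are a hypothetical GO category: N genes in the genome, K annotated to the category, n in the gene list of interest, and k of those falling in the category:

```python
from scipy.stats import hypergeom, fisher_exact

N, K, n, k = 10000, 200, 150, 12            # hypothetical counts

# Enrichment P-value P(X >= k) under the hypergeometric null
p_hyper = hypergeom.sf(k - 1, N, K, n)

# The same one-sided P-value from Fisher's exact test on the 2x2 table
table = [[k, K - k], [n - k, N - K - (n - k)]]
_, p_fisher = fisher_exact(table, alternative="greater")

print(p_hyper, p_fisher)                    # identical up to rounding
```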

9.
Dalmasso C, Génin E, Trégouet DA. Genetics 2008;180(1):697-702
In the context of genomewide association studies where hundreds of thousands of polymorphisms are tested, stringent thresholds on the raw association test P-values are generally used to limit false-positive results. Instead of using thresholds based on raw P-values as in Bonferroni and sequential Sidak (SidakSD) corrections, we propose here to use a weighted-Holm procedure with weights depending on allele frequency of the polymorphisms. This method is shown to substantially improve the power to detect associations, in particular by favoring the detection of rare variants with high genetic effects over more frequent ones with lower effects.
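A sketch of a weighted-Holm step-down procedure of the kind proposed: each P-value is divided by its weight, the weighted P-values are ordered, and rejection proceeds while p/w is at most alpha over the sum of the still-unrejected weights. The weights below are arbitrary placeholders, not the allele-frequency-dependent weights of the article:

```python
def weighted_holm(p_values, weights, alpha=0.05):
    """Weighted Holm step-down; returns True where the hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i] / weights[i])
    rejected = [False] * m
    for step, i in enumerate(order):
        remaining = sum(weights[j] for j in order[step:])
        if p_values[i] / weights[i] <= alpha / remaining:
            rejected[i] = True
        else:
            break                      # step-down: stop at the first failure
    return rejected

p = [0.0001, 0.01, 0.03, 0.5]          # hypothetical association P-values
w = [2.0, 1.0, 0.5, 0.5]               # hypothetical (placeholder) weights
print(weighted_holm(p, w))
```

With all weights equal to 1 this reduces to the ordinary Holm correction.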

10.
Tarone RE, Gart JJ. Biometrics 1989;45(3):883-890
The goal of a cancer screening program is to reduce cancer mortality by detecting tumors at earlier stages of their development. For some types of cancer, screening tests may allow the preclinical detection of benign precursors of a tumor, and thus a screening program could result in reductions in both cancer incidence and mortality. For other types of cancer, a screening program will not reduce cancer incidence, and thus the expected outcome in a randomized cancer screening trial would be equal cancer incidence rates in control and study groups, but reduced cancer mortality in the study group. For the latter situation, we employ a variety of Poisson models for cancer incidence and mortality to derive optimal tests for equality of cancer mortality rates in a cancer screening trial, and we compare the asymptotic relative efficiencies of the test statistics under various alternatives. We demonstrate that testing equality of case mortality rates using Fisher's exact test or its Pearson chi-square approximation is nearly optimal when cancer incidence rates are equal and is fully efficient when cancer incidence rates are unequal. When valid, this comparison of case mortality rates in the study and control groups can be considerably more powerful than the standard comparison of population mortality rates. We illustrate the results using data from a clinical trial of a breast cancer screening program.

11.
Zaykin DV, Pudovkin A, Weir BS. Genetics 2008;180(1):533-545
The correlation between alleles at a pair of genetic loci is a measure of linkage disequilibrium. The square of the sample correlation multiplied by sample size provides the usual test statistic for the hypothesis of no disequilibrium for loci with two alleles, and this relation has proved useful for study design and marker selection. Nevertheless, this relation holds only in a diallelic case, and an extension to multiple alleles has not been made. Here we introduce a similar statistic, R², which leads to a correlation-based test for loci with multiple alleles: for a pair of loci with k and m alleles, and a sample of n individuals, the approximate distribution of n(k − 1)(m − 1)/(km) R² under independence between loci is chi-square with (k − 1)(m − 1) degrees of freedom. One advantage of this statistic is that it can be interpreted as the total correlation between a pair of loci. When the phase of two-locus genotypes is known, the approach is equivalent to a test for the overall correlation between rows and columns in a contingency table. In the phase-known case, R² is the sum of the squared sample correlations for all km 2×2 subtables formed by collapsing to one allele vs. the rest at each locus. We examine the approximate distribution under the null of independence for R² and report its close agreement with the exact distribution obtained by permutation. The test for independence using R² is a strong competitor to approaches such as Pearson's chi-square, Fisher's exact test, and a test based on Cressie and Read's power divergence statistic. We combine this approach with our previous composite-disequilibrium measures to address the case when the genotypic phase is unknown. Calculation of the new multiallele test statistic and its P-value is very simple and utilizes the approximate distribution of R². We provide a computer program that evaluates approximate as well as "exact" permutational P-values.
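A sketch of the phase-known version of the statistic, assuming SciPy for the chi-square tail: R² is accumulated as the sum of squared allele-indicator correlations over all km collapsed 2×2 subtables, and n(k − 1)(m − 1)/(km)·R² is referred to a chi-square with (k − 1)(m − 1) df. The haplotype counts below are hypothetical:

```python
from scipy.stats import chi2

def r2_test(counts):
    """counts: k x m table of haplotype counts; returns (R2, P-value)."""
    k, m = len(counts), len(counts[0])
    n = sum(sum(row) for row in counts)
    pa = [sum(row) / n for row in counts]                         # locus-1 allele freqs
    pb = [sum(counts[i][j] for i in range(k)) / n for j in range(m)]
    r2 = 0.0
    for i in range(k):
        for j in range(m):
            pab = counts[i][j] / n
            denom = pa[i] * (1 - pa[i]) * pb[j] * (1 - pb[j])
            r2 += (pab - pa[i] * pb[j]) ** 2 / denom              # squared correlation
    stat = n * (k - 1) * (m - 1) / (k * m) * r2
    return r2, chi2.sf(stat, df=(k - 1) * (m - 1))

counts = [[40, 10, 10],                  # hypothetical haplotype counts,
          [10, 20, 10]]                  # 2 alleles at locus 1, 3 at locus 2
print(r2_test(counts))
```

In the diallelic case (k = m = 2) the statistic collapses to the familiar n·r².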

12.
Measuring in a quantitative, statistical sense the degree to which structural and functional information can be "transferred" between pairs of related protein sequences at various levels of similarity is an essential prerequisite for robust genome annotation. To this end, we performed pairwise sequence, structure and function comparisons on approximately 30,000 pairs of protein domains with known structure and function. Our domain pairs, which are constructed according to the SCOP fold classification, range in similarity from just sharing a fold, to being nearly identical. Our results show that traditional scores for sequence and structure similarity have the same basic exponential relationship as observed previously, with structural divergence, measured in RMS, being exponentially related to sequence divergence, measured in percent identity. However, as the scale of our survey is much larger than any previous investigations, our results have greater statistical weight and precision. We have been able to express the relationship of sequence and structure similarity using more "modern scores," such as Smith-Waterman alignment scores and probabilistic P-values for both sequence and structure comparison. These modern scores address some of the problems with traditional scores, such as determining a conserved core and correcting for length dependency; they enable us to phrase the sequence-structure relationship in more precise and accurate terms. We found that the basic exponential sequence-structure relationship is very general: the same essential relationship is found in the different secondary-structure classes and is evident in all the scoring schemes. To relate function to sequence and structure we assigned various levels of functional similarity to the domain pairs, based on a simple functional classification scheme. This scheme was constructed by combining and augmenting annotations in the enzyme and fly functional classifications and comparing subsets of these to the Escherichia coli and yeast classifications. We found sigmoidal relationships between similarity in function and sequence, with clear thresholds for different levels of functional conservation. For pairs of domains that share the same fold, precise function appears to be conserved down to approximately 40% sequence identity, whereas broad functional class is conserved to approximately 25%. Interestingly, percent identity is more effective at quantifying functional conservation than the more modern scores (e.g. P-values). Results of all the pairwise comparisons and our combined functional classification scheme for protein structures can be accessed from a web database at http://bioinfo.mbb.yale.edu/align. Copyright 2000 Academic Press.

13.
Several extensions to implied weighting, recently implemented in TNT, allow a better treatment of data sets combining morphological and molecular data, as well as those comprising large numbers of missing entries (e.g. palaeontological matrices, or combined matrices with some genes sequenced for few taxa). As there have been recent suggestions that molecular matrices may be better analysed using equal weights (rather than implied weighting), a simple way to apply implied weighting to only some characters (e.g. morphology), leaving other characters with a constant weight (e.g. molecules), is proposed. The new methods also allow weighting entire partitions according to their average homoplasy, giving each of the characters in the partition the same weight (this can be used for dynamically weighting, e.g., entire genes, or first, second, and third positions collectively). Such an approach is easily implemented in schemes like successive weighting, but in the case of implied weighting poses some particular problems. The approach has the peculiar implication that the inclusion of uninformative characters influences the results (by influencing the implied weights for the partitions). Last, the concern that characters with many missing entries may receive artificially inflated weights (because they necessarily display less homoplasy) can be solved by allowing the use of different weighting functions for different characters, in such a way that the cost of additional transformations decreases more rapidly for characters with more missing entries (thus effectively assuming that the unobserved entries are likely to also display some unobserved homoplasy). The conceptual and practical aspects of all these problems, as well as details of the implementation in TNT, are discussed.

14.
We have investigated phosphatidylcholines with the same two saturated hydrocarbon chains of 12, 10 and 8 carbon atoms. Langmuir trough data could be evaluated for even small desorption of lipid into the subphase when applying a novel approach that had recently been developed in our laboratory. The C12 lipid turned out to form a nearly insoluble monolayer with slight desorption only beyond 15 mN/m for an area/volume ratio around 1 cm⁻¹. Above 22 mN/m, micellation in the subphase apparently terminates further accumulation in the interface, forcing additionally added lipid to enter the bulk volume. A comparatively substantial increase of solubilization was observed for the C10 monolayer. For the C8 lipid, partitioning proved to take place in nearly equal parts. In that case, strong multimeric aggregation is indicated to occur in both the interfacial and the bulk volume domains. All the results are quantitatively discussed in the light of basic thermodynamic and structural considerations.

15.
Relationship coefficients are particularly useful to improve genetic management of endangered populations. These coefficients are traditionally based on pedigree data, but when pedigrees are incomplete or nonexistent they are replaced by coefficients calculated from molecular data. The main objective of this study was to develop a new method to estimate relationship coefficients by combining molecular with pedigree data, which is useful for specific situations where neither pedigree nor molecular data are complete. The developed method was applied to contribute to the conservation of the Skyros pony breed, which consists of fewer than 200 individuals, divided into 3 main herds or subpopulations. In this study, relationships between individuals were estimated using traditional estimators as well as the newly developed method. For this purpose, 99 Skyros ponies were genotyped at 16 microsatellite loci. It appeared that the limitation of the most common molecular-based estimators is the use of weights that assume relationships equal to 0. The results showed that, as a consequence of this limitation, negative relationship values can be obtained in small inbred populations, for example. By contrast, the combined estimator gave no negative values. Using principal component analysis, the combined estimator also enabled a better graphic differentiation between the 3 subpopulations defined previously. In conclusion, this new estimator can be a promising alternative to traditionally used estimators, especially in inbred populations, with both incomplete pedigree and molecular information.

16.
A basic element in the determination of the zygosity of a twin pair is the proportion of genotypically concordant pairs among the dizygotic pairs. Two methods to derive this proportion are in common use: the first method requires a laborious enumeration of parental genotypic mating types, and the second method relies on a set of formulas, one for each of the possible combinations of genotypes of two full sibs. In this paper the relation between both methods is uncovered. The set of formulas of the second method is reduced to a single general formula, of which the connection with the ITO method (Li and Sacks 1954) is indicated. By applying both methods in turn to an example concerning the MNS blood group system (Fisher 1951), Fisher's way of performing the calculations according to the first method is unraveled, and the preferability of the second method is made clear. Next, formulas are derived for the probability of genotypic or phenotypic concordance of dizygotic twins when direct information on the genotype or phenotype of one of the parents is available. The case of an X-linked locus is also considered. To facilitate applications, tables are given.

17.
In this paper we consider estimating heterogeneity variance with the DerSimonian-Laird (DSL) estimator as typically used in meta-analysis. In its general form the DSL estimator requires inverse population-averaged study-specific variances as weights, in which case the estimator is unbiased. It has become common practice, however, to use estimates of the study-specific variances instead of their population-averaged versions. This can lead to considerable bias. Simulations illustrate these findings.
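The estimator under discussion is simple to write down. The sketch below plugs in estimated study-specific variances, which is precisely the common practice the paper shows can bias the estimate; the effect sizes and variances are hypothetical:

```python
def dersimonian_laird(effects, variances):
    """DerSimonian-Laird moment estimator of between-study variance tau^2."""
    w = [1.0 / v for v in variances]                          # inverse-variance weights
    sw = sum(w)
    ybar = sum(wi * yi for wi, yi in zip(w, effects)) / sw    # weighted mean effect
    q = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, effects))
    k = len(effects)
    denom = sw - sum(wi * wi for wi in w) / sw
    return max(0.0, (q - (k - 1)) / denom)                    # truncate at zero

y = [0.30, 0.10, 0.55, -0.05]        # hypothetical study effect sizes
v = [0.01, 0.02, 0.05, 0.02]         # hypothetical estimated within-study variances
print(dersimonian_laird(y, v))
```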

18.
A phylogeny of the meiofaunal polychaete family Nerillidae based on morphological, molecular and combined data is presented here. The data sets comprise nearly complete sequences of 18S rDNA and 40 morphological characters of 17 taxa. Sequences were analyzed simultaneously with the morphological data by direct optimization in the program POY, with a variety of parameter sets (costs of gaps: transversions: transitions). Three outgroups were selected from the major polychaete group Aciculata and one from Scolecida. The 13 nerillid species from 11 genera were monophyletic in all analyses with very high support, and three new apomorphies for Nerillidae are identified. The topology of the ingroup varied according to the various parameter settings. Reducing the number of outgroups to one decreased the variance among the phylogenetic hypotheses. The congruence among these was tested and a parameter set, with equal weights (222) and extension gap weighted 1, yielded minimum incongruence (ILD). Several terminal clades of the combined analysis were highly supported, as well as the position of Leptonerilla prospera as sister terminal to the other nerillids. The evolution of morphological characters such as segment numbers, chaetae, appendages and ciliation are traced and discussed. A regressive pathway within Nerillidae is indicated for several characters, however, generally implying several convergent losses. Numerous genera are shown to require revision. © The Willi Hennig Society 2005.

19.
A non-overlapping generation model is proposed which links Wright's adaptive topography concept to the rather inexact notion of Darwinian fitness as survival and reproduction. In general, evolution is seen to weight proportionate increases in survival twice as greatly as proportionate increases in fertility. Certain special cases are also delineated in which measurements of survival and fertility receive equal weight in the fitness equations. In addition, some implications of the present study for Wright's shifting balance theory and models based on Fisher's reproductive value are discussed.

20.
The genetic code is degenerate, but alternative synonymous codons are generally not used with equal frequency. Since the pioneering work of Grantham's group it has been apparent that genes from one species often share similarities in codon frequency; under the "genome hypothesis" there is a species-specific pattern to codon usage. However, it has become clear that in most species there are also considerable differences among genes. Multivariate analyses have revealed that in each species so far examined there is a single major trend in codon usage among genes, usually from highly biased to more nearly even usage of synonymous codons. Thus, to represent the codon usage pattern of an organism it is not sufficient to sum over all genes as this conceals the underlying heterogeneity. Rather, it is necessary to describe the trend among genes seen in that species. We illustrate these trends for six species where codon usage has been examined in detail, by presenting the pooled codon usage for the 10% of genes at either end of the major trend. Closely related organisms have similar patterns of codon usage, and so the six species in Table 1 are representative of wider groups. For example, with respect to codon usage, Salmonella typhimurium closely resembles E. coli, while all mammalian species so far examined (principally mouse, rat and cow) largely resemble humans.
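One standard way to quantify the biased-versus-even usage trend described above is relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonyms of its amino acid were used equally, so RSCU = 1 everywhere means perfectly even usage. The two-codon family below (Lys: AAA/AAG) is a hypothetical illustration:

```python
def rscu(codon_counts, synonym_families):
    """codon_counts: {codon: count}; synonym_families: lists of synonymous codons."""
    out = {}
    for family in synonym_families:
        total = sum(codon_counts.get(c, 0) for c in family)
        for c in family:
            # observed / expected, where expected = total / family size
            out[c] = (codon_counts.get(c, 0) * len(family) / total) if total else 0.0
    return out

counts = {"AAA": 30, "AAG": 10}            # hypothetical Lys codon counts
print(rscu(counts, [["AAA", "AAG"]]))      # AAA overused, AAG underused
```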


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号