Similar Documents
20 similar documents found (search time: 15 ms)
1.
We examined Type I error rates of Felsenstein's (1985; Am. Nat. 125:1-15) comparative method of phylogenetically independent contrasts when branch lengths are in error and the model of evolution is not Brownian motion. We used seven evolutionary models, six of which depart strongly from Brownian motion, to simulate the evolution of two continuously valued characters along two different phylogenies (15 and 49 species). First, we examined the performance of independent contrasts when branch lengths are distorted systematically, for example, by taking the square root of each branch segment. These distortions often caused inflated Type I error rates, but performance was almost always restored when branch length transformations were used. Next, we investigated effects of random errors in branch lengths. After the data were simulated, we added errors to the branch lengths and then used the altered phylogenies to estimate character correlations. Errors in the branches could be of two types: fixed, where branch lengths are either shortened or lengthened by a fixed fraction; or variable, where the error is a normal variate with mean zero and the variance is scaled to the length of the branch (so that expected error relative to branch length is constant for the whole tree). Thus, the error added is unrelated to the microevolutionary model. Without branch length checks and transformations, independent contrasts tended to yield extremely inflated and highly variable Type I error rates. Type I error rates were reduced, however, when branch lengths were checked and transformed as proposed by Garland et al. (1992; Syst. Biol. 41:18-32), and almost never exceeded twice the nominal P-value at alpha = 0.05. Our results also indicate that, if branch length transformations are applied, then the appropriate degrees of freedom for testing the significance of a correlation coefficient should, in general, be reduced to account for estimation of the best branch length transformation. 
These results extend those reported in Díaz-Uriarte and Garland (1996; Syst. Biol. 45:27-47), and show that, even with errors in branch lengths and evolutionary models different from Brownian motion, independent contrasts are a robust method for testing hypotheses of correlated evolution.
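As a concrete illustration of the method examined in this abstract (a sketch only, not code from the study), the following computes Felsenstein's standardized contrasts for a hypothetical three-species tree ((A,B),C); all tip values and branch lengths are invented.

```python
import math

def contrast(x1, x2, v1, v2):
    """One Felsenstein (1985) contrast: the standardized difference, the
    weighted ancestral estimate, and the branch-length correction added
    to the ancestor's own branch."""
    c = (x1 - x2) / math.sqrt(v1 + v2)               # standardized contrast
    anc = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)    # ancestral value estimate
    extra = v1 * v2 / (v1 + v2)                      # added branch length
    return c, anc, extra

# Tips A=4.0 and B=2.0 on branches of length 1.0 join at an ancestor whose
# own branch (length 0.5) leads to the root shared with C=1.0 (branch 1.0).
c1, anc, extra = contrast(4.0, 2.0, 1.0, 1.0)
c2, _, _ = contrast(anc, 1.0, 0.5 + extra, 1.0)
print(round(c1, 3), round(anc, 3), round(c2, 3))
```

Under Brownian motion each contrast has unit expected variance, which is why errors in the branch lengths (the v terms above) distort the Type I error rate of the test.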

2.
Non-normality of the phenotypic distribution can affect power to detect quantitative trait loci in sib pair studies. Previously, we observed that Winsorizing the sib pair phenotypes increased the power of quantitative trait locus (QTL) detection for both Haseman-Elston (HE) least-squares tests [Hum Hered 2002;53:59-67] and maximum likelihood-based variance components (MLVC) analysis [Behav Genet (in press)]. Winsorizing the phenotypes led to a slight increase in type I error in HE tests and a slight decrease in type I error for MLVC analysis. Herein, we considered transforming the sib pair phenotypes using the Box-Cox family of transformations. Data were simulated for normal and non-normal (skewed and kurtic) distributions. Phenotypic values were replaced by Box-Cox transformed values. Twenty thousand replications were performed for three HE tests of linkage and for the likelihood ratio test (LRT), the Wald test, and other robust versions based on the MLVC method. We calculated the relative nominal inflation rate as the ratio of the observed empirical type I error to the set alpha level (5, 1 and 0.1% alpha levels). MLVC tests applied to non-normal data had inflated type I errors (rate ratio greater than 1.0), which were controlled best by the Box-Cox transformation and to a lesser degree by Winsorizing. For example, for non-transformed, skewed phenotypes (derived from a chi-square distribution with 2 degrees of freedom), the rates of empirical type I error with respect to the set alpha level of 0.01 were 0.80, 4.35 and 7.33 for the original HE test, the LRT and the Wald test, respectively. For the same alpha level of 0.01, these rates were 1.12, 3.095 and 4.088 after Winsorizing and 0.723, 1.195 and 1.905 after Box-Cox transformation. Winsorizing reduced inflated error rates for the leptokurtic distribution (derived from a Laplace distribution with mean 0 and variance 8).
Further, power (adjusted for empirical type I error) at the 0.01 alpha level ranged from 4.7 to 17.3% across all tests using the non-transformed, skewed phenotypes, from 7.5 to 20.1% after Winsorizing, and from 12.6 to 33.2% after Box-Cox transformation. Likewise, power (adjusted for empirical type I error) using leptokurtic phenotypes at the 0.01 alpha level ranged from 4.4 to 12.5% across all tests with no transformation, from 7 to 19.2% after Winsorizing, and from 4.5 to 13.8% after Box-Cox transformation. Thus the Box-Cox transformation apparently provided the best type I error control and maximal power among the procedures we considered for analyzing a non-normal, skewed distribution (chi-square), while Winsorizing worked best for the non-normal, kurtic distribution (Laplace). We repeated the same simulations using a larger sample size (200 sib pairs) and found similar results.
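A minimal sketch of the two phenotype adjustments compared in this abstract, the Box-Cox transformation and Winsorizing; the sample values and the 10% Winsorizing fraction are illustrative, not the study's simulation settings.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a positive value; lam = 0 gives the log."""
    return math.log(y) if lam == 0 else (y ** lam - 1) / lam

def winsorize(values, frac=0.1):
    """Clamp the lowest and highest frac of values to the nearest kept value."""
    s = sorted(values)
    k = int(len(s) * frac)
    lo, hi = s[k], s[-k - 1]
    return [min(max(v, lo), hi) for v in values]

data = [0.2, 0.5, 1.0, 1.5, 2.0, 3.0, 5.0, 9.0, 15.0, 40.0]  # right-skewed
print([round(box_cox(v, 0.0), 2) for v in data[:3]])  # log transform
print(winsorize(data))                                # tails pulled in
```

In practice the Box-Cox lambda is chosen by maximum likelihood; here it is fixed by hand for illustration.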

3.
A comparison was made between mathematical variations of the square root and Schoolfield models for predicting growth rate as a function of temperature. The statistical consequences of square root and natural logarithm transformations of growth rate used in several variations of the Schoolfield and square root models were examined. Growth rate variances of Yersinia enterocolitica in brain heart infusion broth increased as a function of temperature. The ability of the two data transformations to correct for the heterogeneity of variance was evaluated. A natural logarithm transformation of growth rate was more effective than a square root transformation at correcting for the heterogeneity of variance. The square root model was more accurate than the Schoolfield model when both models used the natural logarithm transformation.
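For context, the "square root" model referred to here is the Ratkowsky-type secondary model sqrt(rate) = b(T - T0); the sketch below uses illustrative parameter values, not values fitted to the Yersinia data.

```python
def sqrt_model_rate(T, b=0.03, T0=2.0):
    """Growth rate predicted by the square root model at temperature T (deg C);
    b and T0 (the notional minimum growth temperature) are illustrative."""
    if T <= T0:
        return 0.0
    return (b * (T - T0)) ** 2  # the model is linear on the sqrt(rate) scale

print(round(sqrt_model_rate(30.0), 4))
```

Fitting on the square-root or log scale (the transformations compared in the abstract) changes how the error variance enters, which is why the choice matters when variance grows with temperature.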

4.
We consider some multiple comparison problems in repeated measures designs for data with ties, particularly ordinal data; the methods are also applicable to continuous data, with or without ties. A unified asymptotic theory of the rank tests of Brunner, Puri and Sen (1995) and Akritas and Brunner (1997) is utilized to derive large sample multiple comparison procedures (MCPs). First, we consider a single treatment and address the problem of comparing its time effects with respect to the baseline. Multiple sign tests and rank tests (and the corresponding simultaneous confidence intervals) are derived for this problem. Next, we consider two treatments and address the problem of testing for treatment × time interactions by comparing their time effects with respect to the baseline. Simulation studies are conducted to study the type I familywise error rates and powers of competing procedures under different distributional models. The data from a psychiatric study are analyzed using the above MCPs to answer the clinicians' questions.

5.
Exact tests are given for the usual hypotheses on split-plot models with random blocks and fixed treatment effects, considering different numbers of blocks for each level of the whole-plot treatment and assuming normally distributed observations. U- and D-optimal designs are considered with respect to the tests of main effects and interactions as well as to the estimation of parameters.

6.
Yu Z. Human Heredity 2011;71(3):171-179.
The case-parents design has been widely used to detect genetic associations as it can prevent the spurious association that could occur in population-based designs. When examining the effect of an individual genetic locus on a disease, logistic regressions developed by conditioning on parental genotypes provide complete protection from spurious association caused by population stratification. However, when testing gene-gene interactions, it is unknown whether conditional logistic regressions are still robust. Here we evaluate the robustness and efficiency of several gene-gene interaction tests that are derived from conditional logistic regressions. We found that in the presence of SNP genotype correlation due to population stratification or linkage disequilibrium, tests with incorrectly specified main-genetic-effect models can lead to inflated type I error rates. We also found that a test with fully flexible main genetic effects always maintains the correct test size, and that this robustness is achieved with a negligible sacrifice of power. When testing gene-gene interactions is the focus, we therefore recommend the test that allows fully flexible main effects.

7.
Statistical analyses are an integral component of scientific research, and for decades, biologists have applied transformations to data to meet the normal error assumptions for F and t tests. Over the years, there has been a movement from data transformation toward model reformation—the use of non‐normal error structures within the framework of the generalized linear model (GLM). The principal advantage of model reformation is that parameters are estimated on the original, rather than the transformed scale. However, data transformation has been shown to give better control over type I error, for simulated data with known error structures. We conducted a literature review of statistical textbooks directed toward biologists and of journal articles published in the primary literature to determine temporal trends in both the text recommendations and the practice in the refereed literature over the past 35 years. In this review, a trend of increasing use of reformation in the primary literature was evident, moving from no use of reformation before 1996 to >50% of the articles reviewed applying GLM after 2006. However, no such trend was observed in the recommendations in statistical textbooks. We then undertook 12 analyses based on published datasets in which we compared the type I error estimates, residual plot diagnostics, and coefficients yielded by analyses using square root transformations, log transformations, and the GLM. All analyses yielded acceptable residual versus fit plots and had similar p‐values within each analysis, but as expected, the coefficient estimates differed substantially. Furthermore, no consensus could be found in the literature regarding a procedure to back‐transform the coefficient estimates obtained from linear models performed on transformed datasets. This lack of consistency among coefficient estimates constitutes a major argument for model reformation over data transformation in biology.
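One way to see the back-transformation problem this abstract raises: after a log transformation, naively exponentiating the mean recovers a geometric rather than an arithmetic mean. The numbers below are an invented example, not data from the reviewed articles.

```python
import math
import statistics

y = [1.0, 2.0, 4.0, 8.0, 16.0]    # skewed positive response
arith = statistics.mean(y)         # the mean a GLM models on the original scale
geo = math.exp(statistics.mean(math.log(v) for v in y))  # naive back-transform
print(arith, geo)                  # the two "means" disagree
```

The gap between the two estimates is exactly the kind of coefficient inconsistency the review identifies as an argument for model reformation.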

8.
Data screening is an indispensable phase in initiating the scientific discovery process. Fractional factorial designs offer quick and economical options for engineering highly dense structured datasets. Maximum information content is harvested when a selected fractional factorial scheme is driven to saturation while data gathering is suppressed to no replication. A novel multi-factorial profiler is presented that allows screening of saturated-unreplicated designs by decomposing the examined response into its constituent contributions. Partial effects are sliced off systematically from the investigated response to form individual contrasts using simple robust measures. By isolating each time the disturbance attributed solely to a single controlling factor, Wilcoxon-Mann-Whitney rank statistics are employed to assign significance. We demonstrate that the proposed profiler possesses its own self-checking mechanism for detecting a potential influence due to fluctuations attributed to the remaining unexplainable error. Main benefits of the method are: 1) easy to grasp, 2) well-explained test-power properties, 3) distribution-free, 4) sparsity-free, 5) calibration-free, 6) simulation-free, 7) easy to implement, and 8) expanded usability to any type and size of multi-factorial screening designs. The method is elucidated with a benchmarked profiling effort for a water filtration process.
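The Wilcoxon-Mann-Whitney machinery the profiler relies on reduces to a plain U statistic on two slices of the response; the values below are invented for illustration.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x against sample y; ties count 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

low_setting = [5.1, 4.8, 5.5, 5.0]    # response with a factor at its low level
high_setting = [6.2, 5.9, 6.8, 6.1]   # response with the factor at its high level
print(mann_whitney_u(high_setting, low_setting))  # maximum possible is 4 * 4 = 16
```

An extreme U (near 0 or near the maximum) relative to its null distribution is what flags a factor contrast as significant.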

9.
Restriction-enzyme-based sequencing methods enable the genotyping of thousands of single nucleotide polymorphism (SNP) loci in nonmodel organisms. However, in contrast to traditional genetic markers, genotyping error rates in SNPs derived from restriction-enzyme-based methods remain largely unknown. Here, we estimated genotyping error rates in SNPs genotyped with double digest RAD sequencing from Mendelian incompatibilities in known mother-offspring dyads of Hoffman's two-toed sloth (Choloepus hoffmanni) across a range of coverage and sequence quality criteria, for both reference-aligned and de novo-assembled data sets. Genotyping error rates were more sensitive to coverage than to sequence quality, and low coverage yielded high error rates, particularly in de novo-assembled data sets. For example, coverage ≥5 yielded median genotyping error rates of ≥0.03 and ≥0.11 in reference-aligned and de novo-assembled data sets, respectively. Genotyping error rates declined to ≤0.01 in reference-aligned data sets with a coverage ≥30, but remained ≥0.04 in the de novo-assembled data sets. We observed approximately 10- and 13-fold declines in the number of loci sampled in the reference-aligned and de novo-assembled data sets, respectively, when coverage was increased from ≥5 to ≥30 at quality score ≥30. Finally, we assessed the effects of genotyping coverage on a common population genetic application, parentage assignment, and showed that the proportion of incorrectly assigned maternities was relatively high at low coverage. Overall, our results suggest that the trade-off between sample size and genotyping error rates should be considered prior to building sequencing libraries, that reporting genotyping error rates should become standard practice, and that the effects of genotyping errors on inference should be evaluated in restriction-enzyme-based SNP studies.
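The error-rate estimate from mother-offspring dyads rests on a simple rule: at a biallelic SNP, an offspring genotype sharing no allele with its mother implies a genotyping error in one of the two genotypes. A sketch with invented dyads:

```python
def incompatible(mother, offspring):
    """True if the offspring shares no allele with the mother (a Mendelian
    incompatibility), flagging a genotyping error somewhere in the dyad."""
    return not (set(mother) & set(offspring))

dyads = [(("A", "A"), ("A", "G")),   # compatible
         (("A", "A"), ("G", "G")),   # incompatible
         (("A", "G"), ("G", "G"))]   # compatible
error_fraction = sum(incompatible(m, o) for m, o in dyads) / len(dyads)
print(error_fraction)
```

Aggregating this flag over many loci and dyads, stratified by coverage and quality thresholds, yields the per-threshold error-rate curves the abstract describes.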

10.
We derive and compare the operating characteristics of hierarchical and square array-based testing algorithms for case identification in the presence of testing error. The operating characteristics investigated include efficiency (i.e., expected number of tests per specimen) and error rates (i.e., sensitivity, specificity, positive and negative predictive values, per-family error rate, and per-comparison error rate). The methodology is illustrated by comparing different pooling algorithms for the detection of individuals recently infected with HIV in North Carolina and Malawi.
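As one illustration of the operating characteristics involved (a sketch, not the array-based algorithms of the paper), the expected number of tests per specimen under simple two-stage Dorfman pooling with an imperfect assay is:

```python
def dorfman_efficiency(p, n, Se=1.0, Sp=1.0):
    """Expected tests per specimen for two-stage pooling: one pooled test per
    n specimens, plus n individual retests whenever the pool reads positive.
    p = prevalence; Se/Sp = assay sensitivity/specificity (illustrative)."""
    pool_truly_negative = (1 - p) ** n
    pool_reads_positive = Se * (1 - pool_truly_negative) \
        + (1 - Sp) * pool_truly_negative
    return 1.0 / n + pool_reads_positive

print(round(dorfman_efficiency(0.01, 10), 3))  # well under 1 test per specimen
```

At low prevalence the pooled scheme needs far fewer tests than individual testing, which is the efficiency gain the paper quantifies alongside the error rates.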

12.
For investigation of main and interactive effects of six experimentally controlled environmental factors on phenol biodegradation in a shake-flask system, a largely neglected statistical procedure was applied. A major benefit resulting from the application of the orthogonal, fractional factorial design is that the number of experiments necessary to evaluate multifactor interactions is limited. In our investigation, the required number of experiments was reduced to 81 from the 324 necessary with conventional factorial designs; information was sacrificed for only 3 of 15 possible two-factor interactions. Six experimentally controlled factors were investigated at two or three treatment levels each; the six factors were (1) amount of phenol substrate, (2) amount of bacterial inoculum, (3) filtration of inoculum, (4) type of basal salts medium, (5) initial pH of basal salts medium, and (6) flask closure. Significant main effects were found for factors 1, 2, and 4; whereas significant interactive effects were found only for factor 2 with factor 3 and for factor 2 with factor 5. Our results suggest that the application of these statistical designs will greatly reduce the number of experiments necessary to evaluate multifactor effects on degradation rates during optimization of both hazard screening systems and waste treatment systems.

13.
The variance-components model is the method of choice for mapping quantitative trait loci in general human pedigrees. This model assumes normally distributed trait values and includes a major gene effect, random polygenic and environmental effects, and covariate effects. Violation of the normality assumption has detrimental effects on the type I error and power. One possible way of achieving normality is to transform trait values. The true transformation is unknown in practice, and different transformations may yield conflicting results. In addition, the commonly used transformations are ineffective in dealing with outlying trait values. We propose a novel extension of the variance-components model that allows the true transformation function to be completely unspecified. We present efficient likelihood-based procedures to estimate variance components and to test for genetic linkage. Simulation studies demonstrated that the new method is as powerful as the existing variance-components methods when the normality assumption holds; when the normality assumption fails, the new method still provides accurate control of type I error and is substantially more powerful than the existing methods. We performed a genomewide scan of monoamine oxidase B for the Collaborative Study on the Genetics of Alcoholism. In that study, the results that are based on the existing variance-components method changed dramatically when three outlying trait values were excluded from the analysis, whereas our method yielded essentially the same answers with or without those three outliers. The computer program that implements the new method is freely available.

14.
Diagnostic or screening tests are widely used in medical fields to classify patients according to their disease status. Several statistical models for meta‐analysis of diagnostic test accuracy studies have been developed to synthesize test sensitivity and specificity of a diagnostic test of interest. Because of the correlation between test sensitivity and specificity, modeling the two measures using a bivariate model is recommended. In this paper, we extend the current standard bivariate linear mixed model (LMM) by proposing two variance‐stabilizing transformations: the arcsine square root and the Freeman–Tukey double arcsine transformation. We compared the performance of the proposed methods with the standard method through simulations using several performance measures. The simulation results showed that our proposed methods performed better than the standard LMM in terms of bias, root mean square error, and coverage probability in most of the scenarios, even when data were generated assuming the standard LMM. We also illustrated the methods using two real data sets.
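The two variance-stabilizing transformations named above have simple closed forms for an observed proportion x/n; the sketch below applies them (following the Freeman and Tukey (1950) double-arcsine definition) to an illustrative sensitivity estimate, without reproducing the bivariate model itself.

```python
import math

def arcsine_sqrt(x, n):
    """Arcsine square root transform of the proportion x/n."""
    return math.asin(math.sqrt(x / n))

def freeman_tukey(x, n):
    """Freeman-Tukey double arcsine transform of x successes in n trials."""
    return 0.5 * (math.asin(math.sqrt(x / (n + 1)))
                  + math.asin(math.sqrt((x + 1) / (n + 1))))

# e.g. a study reporting 45 true positives among 50 diseased subjects
print(round(arcsine_sqrt(45, 50), 4), round(freeman_tukey(45, 50), 4))
```

Both transforms pull proportions near 0 or 1 away from the boundary, which is what stabilizes the variance before the LMM is fitted.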

15.
In the statistical evaluation of data from a dose-response experiment, it is frequently of interest to test for a dose-related trend: an increasing trend in response with increasing dose. The randomization trend test, a generalization of Fisher's exact test, has been recommended for animal tumorigenicity testing when the numbers of tumor occurrences are small. This paper examines the type I error of the randomization trend test and of the Cochran-Armitage and Mantel-Haenszel tests. Simulation results show that when the tumor incidence rates are less than 10%, the randomization test is conservative; the test becomes very conservative when the incidence rate is less than 5%. The Cochran-Armitage and Mantel-Haenszel tests are slightly anti-conservative (liberal) when the incidence rates are larger than 3%. Further, we propose a less conservative method of calculating the p-value of the randomization trend test by excluding some permutations whose probabilities of occurrence are greater than the probability of the observed outcome.
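Of the asymptotic competitors examined here, the Cochran-Armitage statistic is the easiest to state directly; the sketch below computes its Z value for an invented dose-response table (the randomization version, which permutes the table, is not reproduced).

```python
import math

def cochran_armitage_z(cases, totals, scores):
    """Z statistic for an increasing trend in proportions across ordered
    dose groups; large positive values favour a dose-related trend."""
    N = sum(totals)
    p = sum(cases) / N
    num = sum(s * (x - n * p) for s, x, n in zip(scores, cases, totals))
    s_bar = sum(s * n for s, n in zip(scores, totals)) / N
    var = p * (1 - p) * sum(n * (s - s_bar) ** 2
                            for s, n in zip(scores, totals))
    return num / math.sqrt(var)

# tumor counts in four dose groups of 50 animals each, doses scored 0..3
z = cochran_armitage_z(cases=[1, 2, 4, 8], totals=[50] * 4, scores=[0, 1, 2, 3])
print(round(z, 3))
```

With incidence this low, the abstract's point is that the normal approximation behind this Z can be liberal, while the exact randomization version is conservative.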

16.
Computer simulations are used to examine the significance levels and powers of several tests which have been employed to compare the means of Poisson distributions. In particular, attention is focused on the behaviour of the tests when the means are small, as is often the case in ecological studies when populations of organisms are sampled using quadrats. Two approaches to testing are considered. The first assumes a log linear model for the Poisson data and leads to tests based on the deviance. The second employs standard analysis of variance tests following data transformations, including the often used logarithmic and square root transformations. For very small means it is found that a deviance-based test has the most favourable characteristics, generally outperforming analysis of variance tests on transformed data; none of the latter appears consistently better than any other. For larger means the standard analysis of variance on untransformed data performs well.
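The deviance-based test favoured here for very small means has a compact form: with group totals S_g, group means mu_g, and pooled mean mu_0, the deviance is 2 * sum_g S_g * log(mu_g / mu_0), referred to a chi-square distribution with k - 1 degrees of freedom. A sketch on invented quadrat counts:

```python
import math

def poisson_deviance(groups):
    """Deviance comparing separate Poisson means per group against one
    common mean; groups whose counts are all zero contribute nothing."""
    n_all = sum(len(g) for g in groups)
    mu0 = sum(sum(g) for g in groups) / n_all
    d = 0.0
    for g in groups:
        s = sum(g)
        if s > 0:
            d += 2 * s * math.log((s / len(g)) / mu0)
    return d

quadrats = [[0, 1, 0, 2, 1], [2, 3, 1, 4, 2]]  # counts under two treatments
print(round(poisson_deviance(quadrats), 3))
```

Unlike ANOVA on transformed counts, this statistic handles zeros directly, which is part of its advantage at very small means.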

17.
Different types of panelist by treatment interaction are explored to determine how they influence the outcomes of discrimination tests. The study compares the situations where panelists are considered as fixed or random effects over the range of most testing conditions for small panels (5–15 panelists) that replicate their judgements. Magnitude interaction and nonperceivers or nondiscriminators have minor effects on test outcomes. Cross-over interaction increases the chances for a type II error, especially when panelists are considered as random effects. False discrimination increases the chances for a type I error when panelists are considered as fixed effects. Applications of methods to reduce the chances for these errors in the testing for differences among treatments are discussed.

18.
Probit plots of sperm concentration for 1711 suspected infertile men (those with azoospermia being excluded) were compared for the untransformed and natural log-, square root- and cube root-transformed values. For the distribution of sperm concentrations, which was highly skewed towards low values, the square-root transformation produced the most normal (Gaussian) distribution. Natural log and cube-root transformations caused skewing towards high values. Such treatment of the data should always be considered before using parametric statistical tests to make comparisons between sperm concentrations of groups of men.

19.
Cultivations of Streptomyces peucetius in two types of medium were monitored on-line using a Fourier transform infrared (FTIR) spectrometer combined with an attenuated total reflection probe. The quantitative measurements of the glucose, starch and acetate concentrations were implemented using partial least squares calibration models. These were regressed on spectral and concentration information obtained by adding together single constituent spectra of the main constituents in the medium according to a full factorial design. The accuracy achieved was considered to be satisfactory, with an average root mean square error of prediction of 1.5 g/l for glucose and 0.25 g/l for acetate. The methodology used is considered to be a rapid technique for generation of calibration data, and a step towards the use of library type data for calibration purposes in quantitative FTIR spectroscopy applications in bioprocesses.

20.
The complexity of natural ecosystems makes it difficult to compare the relative importance of abiotic and biotic factors and to assess the effects of their interactions on ecosystem development. To improve our understanding of ecosystem complexity, we initiated an experiment designed to quantify the main effects and interactions of several factors that are thought to affect nutrient export from developing forest ecosystems. Using a replicated 2 × 2 × 4 factorial experiment, we quantified the main effects of these factors and the factor interactions on annual calcium, magnesium, and potassium export from field mesocosms over 4 years for two Vermont locations, two soils, and four different tree seedling communities. We found that the main effects explained 56%–97% of total variation in nutrient export. Abiotic factors (location and soil) accounted for a greater percentage of the total variation in nutrient export (47%–94%) than the biotic factor (plant community) (2%–15%). However, biotic control over nutrient export was significant, even when biomass was minimal. Factor interactions were often significant, but they explained less of the variation in nutrient export (1%–33%) than the main effects. Year-to-year fluctuations influenced the relative importance of the main effects in determining nutrient export and created factor interactions between most of the explanatory variables. Our study suggests that when research is focused on typically used main effects, such as location and soil, and interactions are aggregated into overall error terms, important information about the factors controlling ecosystem processes can be lost.
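The variance partition described above can be sketched as eta-squared shares (SS_effect / SS_total) in a balanced two-factor design; the cell values below are invented, not the mesocosm data, and the interaction is left in the remainder.

```python
import statistics

# cells[(a, b)] -> replicate responses at factor A level a, factor B level b
cells = {(0, 0): [10.0, 12.0], (0, 1): [14.0, 16.0],
         (1, 0): [20.0, 22.0], (1, 1): [30.0, 32.0]}

all_vals = [v for vs in cells.values() for v in vs]
grand = statistics.mean(all_vals)
ss_total = sum((v - grand) ** 2 for v in all_vals)

def level_mean(factor, level):
    """Mean response over every observation at one level of one factor."""
    return statistics.mean(v for key, vs in cells.items()
                           if key[factor] == level for v in vs)

n_half = len(all_vals) // 2  # observations per level of a two-level factor
ss_A = n_half * sum((level_mean(0, lvl) - grand) ** 2 for lvl in (0, 1))
ss_B = n_half * sum((level_mean(1, lvl) - grand) ** 2 for lvl in (0, 1))
print(round(ss_A / ss_total, 3), round(ss_B / ss_total, 3))
```

Comparing such shares across factors is the same logic the study uses to weigh abiotic against biotic control of nutrient export.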


Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号