Similar articles
20 similar articles found.
1.
We address the problem of tests of homogeneity in two-way contingency tables in case-control studies when the case category is subdivided into k subcategories. In this situation, we have two cells with large frequencies and 2 × k cells with frequencies that become small as k increases. We propose two ad hoc statistics in which a statistic for the sparse cells is combined with a statistic for the cells with large frequencies. We study these tests along with the Pearson test (using a chi-square approximation) in a Monte Carlo simulation study. Two sets of null hypothesis models and two sets of alternative hypothesis models are considered. The best test for the models considered is the usual Pearson test (using an approximate chi-square distribution), although the ad hoc tests are more powerful under one alternative model considered.
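
As a rough illustration of the kind of test discussed above, the following sketch computes the Pearson chi-square statistic for a two-row table and a Monte Carlo p-value obtained by shuffling column labels. This is a generic permutation scheme assumed here for illustration, not the simulation design or the ad hoc statistics of the abstract; the function names and the example counts are hypothetical.

```python
import random

def pearson_chi2(table):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty margins
                stat += (observed - expected) ** 2 / expected
    return stat

def monte_carlo_p_value(table, n_sim=2000, seed=0):
    """Permutation p-value: shuffle column labels with both margins fixed."""
    rng = random.Random(seed)
    # Expand the table into one (row label, column label) pair per subject.
    rows, cols = [], []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            rows.extend([i] * count)
            cols.extend([j] * count)
    observed = pearson_chi2(table)
    n_rows, n_cols = len(table), len(table[0])
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(cols)
        sim = [[0] * n_cols for _ in range(n_rows)]
        for r, c in zip(rows, cols):
            sim[r][c] += 1
        if pearson_chi2(sim) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)

# Hypothetical counts: rows = exposure status; first column = controls,
# remaining columns = k = 4 sparse case subcategories.
example = [[480, 6, 4, 5, 3],
           [520, 4, 7, 2, 6]]
print(pearson_chi2(example), monte_carlo_p_value(example, n_sim=500))
```

With sparse cells the chi-square approximation can be poor, which is one reason a resampling-based p-value is a natural point of comparison.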

2.
The sensitivity and specificity of markers for event times
The statistical literature on assessing the accuracy of risk factors or disease markers as diagnostic tests deals almost exclusively with settings where the test, Y, is measured concurrently with disease status D. In practice, however, disease status may vary over time and there is often a time lag between when the marker is measured and the occurrence of disease. One example concerns the Framingham risk score (FR-score) as a marker for the future risk of cardiovascular events, events that occur after the score is ascertained. To evaluate such a marker, one needs to take the time lag into account since the predictive accuracy may be higher when the marker is measured closer to the time of disease occurrence. We therefore consider inference for sensitivity and specificity functions that are defined as functions of time. Semiparametric regression models are proposed. Data from a cohort study are used to estimate model parameters. One issue that arises in practice is that event times may be censored. In this research, we extend in several respects the work by Leisenring et al. (1997) that dealt only with parametric models for binary tests and uncensored data. We propose semiparametric models that accommodate continuous tests and censoring. Asymptotic distribution theory for parameter estimates is developed and procedures for making statistical inference are evaluated with simulation studies. We illustrate our methods with data from the Cardiovascular Health Study, relating the FR-score measured at enrollment to subsequent risk of cardiovascular events.

3.
In this study, we used the phenotype simulation package naturalgwas to test the performance of Zhao's Random Forest method in comparison to an uncorrected Random Forest test, latent factor mixed models (LFMM), genome-wide efficient mixed models (GEMMA), and confounder adjusted linear regression (CATE). We created 400 sets of phenotypes, corresponding to five effect sizes and two, five, 15, or 30 causal loci, simulated from two empirical data sets containing SNPs from Striped Bass representing three and 13 populations. All association methods were evaluated for their ability to detect genotype–phenotype associations based on power, false discovery rates, and number of false positives. Genomic inflation was highest for uncorrected Random Forest and LFMM tests and lowest for GEMMA and Zhao's Random Forest. All association tests had similar power to detect causal loci, and Zhao's Random Forest had the lowest false discovery rate in all scenarios. To measure the performance of association tests in small data sets with few loci surrounding a causal gene, we also ran analyses again after removing causal loci from each data set. All association tests were only able to find true positives, defined as loci located within 30 kbp of a causal locus, in 3%–18% of simulations. In contrast, at least one false positive was found in 17%–44% of simulations. Zhao's Random Forest again identified the fewest false positives of all association tests studied. The ability to test the power of association tests for individual empirical data sets can be an extremely useful first step when designing a GWAS.

4.
Mathematical models of interacting populations have a prominent position in population and community ecology, but are often criticized for not being testable. The authors reviewed tests of a particular model, the exploitation ecosystem hypothesis as it was formulated in Oksanen et al. (1981), in order to study problems that may be encountered when testing models. A general problem is how to determine if an experimental system should be regarded as within the model's theoretical domain or not. The theoretical domain defines the type of system the model is meant to apply to. It is noted that both liberal and strict domain definitions can be problematic. Most important is that a too liberal domain definition can result in false understanding (i.e. that it is falsely concluded that the processes included in the model are controlling the study system). Other problems encountered were more system-specific. Equilibrium predictions were tested in experiments that were too short to reach steady state, and several studies used ambiguous definitions and measurements of model variables such as productivity, biomass and the number of trophic levels. It is concluded that a major obstacle when performing tests lies in the conceptual and methodological problems encountered when translating model abstractions into an empirical reality.

5.
A virologic marker, the number of HIV RNA copies or viral load, is currently used to evaluate anti-HIV therapies in AIDS clinical trials. This marker can be used to assess the antiviral potency of therapies, but is easily affected by noncompliance, drug resistance, toxicities, and other factors during the long-term treatment evaluation process. Recently it has been suggested to use viral dynamics to assess the potency of antiviral drugs and therapies, since viral decay rates in viral dynamic models have been shown to be related to the antiviral drug potency directly, and they need a shorter evaluation time. In this paper we first review the two statistical approaches for characterizing HIV dynamics and estimating viral decay rates: the individual nonlinear least squares regression (INLS) method and the population nonlinear mixed-effect model (PMEM) approach. To compare the viral decay rates between two treatment arms, parametric and nonparametric tests, based on the estimates of viral decay rates (the derived variables) from both the INLS and PMEM methods, are proposed and studied. We show, using the concept of exchangeability, that the test based on the empirical Bayes' estimates from the PMEM is valid, powerful and robust. This proposed method is very useful in most practical cases where the INLS-based tests and the general likelihood ratio test may not apply. We validate and compare various tests for finite samples using Monte Carlo simulations. Finally, we apply the proposed tests to an AIDS clinical trial to compare the antiviral potency between a 3-drug combination regimen and a 4-drug combination regimen. The proposed tests provide some significant evidence that the 4-drug regimen is more potent than the 3-drug regimen, while the naive methods fail to give a significant result.

6.
In genome-wide association (GWA) studies, test statistics that are efficient and robust across various genetic models are preferable, particularly for studying multiple diseases in the Wellcome Trust Case–Control Consortium (WTCCC, 2007, Nature 447, 661–678). A new test statistic, the minimum of the p-values of the trend test and Pearson's test, was considered by the WTCCC. It is referred to here as MIN2. Because the minimum of two p-values is no longer a valid p-value itself, the WTCCC only used it to rank single nucleotide polymorphisms (SNPs) but did not report the p-values of the associated SNPs when MIN2 was used for ranking. Given its importance in practice, we derive the asymptotic null distribution of MIN2, study some of its analytical properties related to GWA studies, and compare it with existing methods (the trend test, Pearson's test, MAX3, and the constrained likelihood ratio test [CLRT]) by simulations across a wide range of possible genetic models: the recessive (REC), additive (ADD), multiplicative (MUL), dominant (DOM), and overdominant models. The results show that MAX3 and CLRT have greater efficiency robustness than other tests when the REC, ADD/MUL, and DOM models are possible, whereas Pearson's test and MIN2 have greater efficiency robustness if the possible genetic models also include the overdominant model. We conclude that robust tests (MAX3, MIN2, CLRT, and Pearson's test) are preferable to a single trend test for initial GWA studies. The four robust tests are applied to more than 100 SNPs associated with 11 common diseases identified by the two WTCCC GWA studies.
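
The MIN2 statistic described above can be sketched for a single SNP as follows. The sketch assumes a 2 × 3 case-control genotype table, the Cochran-Armitage trend test with additive scores (0, 1, 2), and the standard chi-square reference distributions (1 df for the trend test, 2 df for Pearson's test). The function names and example counts are hypothetical, and, as the abstract notes, the returned minimum is a ranking score rather than a valid p-value.

```python
import math

def trend_chi2(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend statistic (1 df) for genotype counts."""
    n = [r + s for r, s in zip(cases, controls)]
    R, S = sum(cases), sum(controls)
    N = R + S
    sum_xr = sum(x * r for x, r in zip(scores, cases))
    sum_xn = sum(x * m for x, m in zip(scores, n))
    sum_x2n = sum(x * x * m for x, m in zip(scores, n))
    numerator = N * (N * sum_xr - R * sum_xn) ** 2
    denominator = R * S * (N * sum_x2n - sum_xn ** 2)
    return numerator / denominator

def pearson_chi2(cases, controls):
    """Pearson chi-square statistic (2 df) for a 2 x 3 genotype table."""
    R, S = sum(cases), sum(controls)
    N = R + S
    stat = 0.0
    for r, s in zip(cases, controls):
        col = r + s
        for observed, row_total in ((r, R), (s, S)):
            expected = row_total * col / N
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat

def min2(cases, controls):
    """Minimum of the trend-test and Pearson p-values (a ranking score)."""
    # Closed-form chi-square tail probabilities for 1 and 2 df.
    p_trend = math.erfc(math.sqrt(trend_chi2(cases, controls) / 2.0))
    p_pearson = math.exp(-pearson_chi2(cases, controls) / 2.0)
    return min(p_trend, p_pearson), p_trend, p_pearson

# Hypothetical genotype counts (0, 1, 2 copies of the risk allele).
print(min2(cases=[10, 20, 30], controls=[30, 20, 10]))
```

Turning this score into a p-value requires the null distribution of the minimum, which is what the abstract derives.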

7.
Albert PS. Biometrics, 2007, 63(2): 593-602
Estimating diagnostic accuracy without a gold standard is an important problem in medical testing. Although there is a fairly large literature on this problem for the case of repeated binary tests, there is substantially less work for the case of ordinal tests. A noted exception is the work by Zhou, Castelluccio, and Zhou (2005, Biometrics 61, 600-609), which proposed a methodology for estimating receiver operating characteristic (ROC) curves without a gold standard from multiple ordinal tests. A key assumption in their work was that the test results are independent conditional on the true test result. I propose random effects modeling approaches that incorporate dependence between the ordinal tests, and I show through asymptotic results and simulations the importance of correctly accounting for the dependence between tests. These modeling approaches, along with the importance of accounting for the dependence between tests, are illustrated by analyzing the uterine cancer pathology data analyzed by Zhou et al. (2005).

8.
Albert PS. Biometrics, 2000, 56(2): 602-608
Binary longitudinal data are often collected in clinical trials when interest is in assessing the effect of a treatment over time. Our application is a recent study of opiate addiction that examined the effect of a new treatment on repeated urine tests to assess opiate use over an extended follow-up. Drug addiction is episodic, and a new treatment may affect various features of the opiate-use process such as the proportion of positive urine tests over follow-up and the time to the first occurrence of a positive test. Complications in this trial were the large amounts of dropout and intermittent missing data and the large number of observations on each subject. We develop a transitional model for longitudinal binary data subject to nonignorable missing data and propose an EM algorithm for parameter estimation. We use the transitional model to derive summary measures of the opiate-use process that can be compared across treatment groups to assess treatment effect. Through analyses and simulations, we show the importance of properly accounting for the missing data mechanism when assessing the treatment effect in our example.

9.
We suggest a cure-mixture model to analyze bivariate time-to-event data, as motivated by the article of Chatterjee and Shih (2001, Biometrics 57, 779-786), but with a simpler estimation procedure and the correlated gamma-frailty model instead of the shared gamma-frailty model. This approach allows us to deal with left-truncated and right-censored lifetime data, and accounts for heterogeneity, as well as for an insusceptible (cure) fraction in the study population. We perform a simulation study to evaluate the properties of the estimates in the proposed model and apply it to breast cancer incidence data for 5857 Swedish female monozygotic and dizygotic twin pairs from the so-called old cohort of the Swedish Twin Registry. This model is used to estimate the size of the susceptible fraction and the correlation between the frailties of the twin partners. Possible extensions, advantages, and limitations of the proposed method are discussed.

10.
Since 1983, the study of natural selection has relied heavily on multiple regression of fitness on the values for a set of traits via ordinary least squares (OLS), as proposed by Lande and Arnold, to obtain an estimate of the quadratic relationship between fitness and the traits, the fitness surface. However, well-known statistical problems with this approach can affect inferences about selection. One key concern is that measures of lifetime fitness do not conform to a normal or any other standard sampling distribution, as needed to justify the usual statistical tests. Another is that OLS may yield an estimate of the sign of the fitness function's curvature that is opposite to the truth. We here show that the recently developed aster modeling approach, which explicitly models the components of fitness as the basis for inferences about lifetime fitness, eliminates these problems. We illustrate selection analysis via aster using simulated datasets involving five fitness components expressed in each of four years. We demonstrate that aster analysis yields accurate estimates of the fitness function in cases in which OLS misleads, as well as accurate confidence regions for directional selection gradients. Further, to evaluate selection when many traits are under consideration, we recommend model selection by information criteria and frequentist model averaging.

11.
Genome-wide association studies (GWAS) are widely applied to analyze the genetic effects on phenotypes. With the availability of high-throughput technologies for metabolite measurements, GWAS successfully identified loci that affect metabolite concentrations and underlying pathways. In most GWAS, the effect of each SNP on the phenotype is assumed to be additive. Other genetic models such as recessive, dominant, or overdominant were considered only by very few studies. In contrast to this, there are theories that emphasize the relevance of nonadditive effects as a consequence of physiologic mechanisms. This might be especially important for metabolites because these intermediate phenotypes are closer to the underlying pathways than other traits or diseases. In this study we systematically analyzed nonadditive effects on a large panel of serum metabolites and all possible ratios (22,801 total) in a population-based study [Cooperative Health Research in the Region of Augsburg (KORA) F4, N = 1,785]. We applied four different 1-degree-of-freedom (1-df) tests corresponding to an additive, dominant, recessive, and overdominant trait model as well as a genotypic model with two degrees of freedom (2-df) that allows a more general consideration of genetic effects. Twenty-three loci were found to be genome-wide significantly associated (Bonferroni corrected P ≤ 2.19 × 10⁻¹²) with at least one metabolite or ratio. For five of them, we show the evidence of nonadditive effects. We replicated 17 loci, including 3 loci with nonadditive effects, in an independent study (TwinsUK, N = 846). In conclusion, we found that most genetic effects on metabolite concentrations and ratios were indeed additive, which verifies the practice of using the additive model for analyzing SNP effects on metabolites.
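
The four 1-df trait models and the 2-df genotypic model named above differ only in how a genotype is coded as a predictor. The sketch below uses the conventional codings, assuming genotypes are recorded as the number of copies (0, 1, or 2) of the effect allele; the names `MODEL_CODINGS`, `encode`, and `encode_genotypic` are hypothetical and not taken from the study.

```python
# Genotype = number of copies of the effect allele (0, 1, or 2).
MODEL_CODINGS = {
    "additive":     {0: 0, 1: 1, 2: 2},  # each copy adds the same effect
    "dominant":     {0: 0, 1: 1, 2: 1},  # one copy is enough
    "recessive":    {0: 0, 1: 0, 2: 1},  # two copies are required
    "overdominant": {0: 0, 1: 1, 2: 0},  # heterozygotes differ from both homozygotes
}

def encode(genotypes, model):
    """Map genotypes to the single predictor used by a 1-df model."""
    coding = MODEL_CODINGS[model]
    return [coding[g] for g in genotypes]

def encode_genotypic(genotypes):
    """2-df genotypic model: separate indicators for 1 copy and 2 copies."""
    return [(int(g == 1), int(g == 2)) for g in genotypes]

sample = [0, 1, 2, 1, 0]
for name in MODEL_CODINGS:
    print(name, encode(sample, name))
print("genotypic", encode_genotypic(sample))
```

Each coded predictor would then enter a regression of the metabolite trait on the SNP; the genotypic model spends a second degree of freedom so that it does not commit to any one of the four shapes.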

12.
Di CZ, Liang KY. Biometrics, 2011, 67(4): 1249-1259
We consider likelihood ratio tests (LRT) and their modifications for homogeneity in admixture models. The admixture model is a two-component mixture model, where one component is indexed by an unknown parameter while the parameter value for the other component is known. This model is widely used in genetic linkage analysis under heterogeneity in which the kernel distribution is binomial. For such models, it has long been recognized that testing for homogeneity is nonstandard, and the LRT statistic does not converge to a conventional χ² distribution. In this article, we investigate the asymptotic behavior of the LRT for general admixture models and show that its limiting distribution is equivalent to the supremum of a squared Gaussian process. We also discuss the connection and comparison between LRT and alternative approaches such as modifications of LRT and score tests, including the modified LRT (Fu, Chen, and Kalbfleisch, 2006, Statistica Sinica 16, 805–823). The LRT is an omnibus test that is powerful to detect general alternative hypotheses. In contrast, alternative approaches may be slightly more powerful to detect certain types of alternatives, but much less powerful for others. Our results are illustrated by simulation studies and an application to a genetic linkage study of schizophrenia.

13.
Goudet J, Perrin N, Waser P. Molecular Ecology, 2002, 11(6): 1103-1114
Understanding why dispersal is sex-biased in many taxa is still a major concern in evolutionary ecology. Dispersal tends to be male-biased in mammals and female-biased in birds, but counter-examples exist and little is known about sex bias in other taxa. Obtaining accurate measures of dispersal in the field remains a problem. Here we describe and compare several methods for detecting sex-biased dispersal using bi-parentally inherited, codominant genetic markers. If gene flow is restricted among populations, then the genotype of an individual tells something about its origin. Provided that dispersal occurs at the juvenile stage and that sampling is carried out on adults, genotypes sampled from the dispersing sex should on average be less likely (compared to genotypes from the philopatric sex) in the population in which they were sampled. The dispersing sex should be less genetically structured and should present a larger heterozygote deficit. In this study we use computer simulations and a permutation test on four statistics to investigate the conditions under which sex-biased dispersal can be detected. Two tests emerge as fairly powerful. We present results concerning the optimal sampling strategy (varying number of samples, individuals, loci per individual and level of polymorphism) under different amounts of dispersal for each sex. These tests for biases in dispersal are also appropriate for any attribute (e.g. size, colour, status) suspected to influence the probability of dispersal. A Windows program carrying out these tests can be freely downloaded from http://www.unil.ch/izea/softwares/fstat.html
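
The permutation logic mentioned above can be sketched generically: compute a statistic that contrasts the two sexes, then reshuffle the sex labels many times to see how often chance alone produces a contrast at least as large. The sketch below assumes a hypothetical per-individual score (for example, a measure of how likely each genotype is in its sampling population) and uses the difference in means as the statistic; it does not reproduce the four statistics of the abstract or the FSTAT program.

```python
import random

def permutation_test(scores, sexes, n_perm=5000, seed=1):
    """Two-sided permutation test for a difference in mean score between sexes.

    scores: one number per individual (hypothetical per-individual statistic)
    sexes:  matching labels, "F" or "M"
    """
    rng = random.Random(seed)

    def mean_diff(labels):
        females = [s for s, lab in zip(scores, labels) if lab == "F"]
        males = [s for s, lab in zip(scores, labels) if lab == "M"]
        return sum(females) / len(females) - sum(males) / len(males)

    observed = mean_diff(sexes)
    labels = list(sexes)  # copy, so the caller's list is not shuffled
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between sex and score
        if abs(mean_diff(labels)) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)

# Hypothetical data: females score higher than males on average.
scores = [1.2, 0.9, 1.5, 1.1, 1.4, 0.2, 0.1, 0.4, 0.3, 0.0]
sexes = ["F"] * 5 + ["M"] * 5
print(permutation_test(scores, sexes))
```

Shuffling labels keeps the number of individuals of each sex fixed, so the null distribution reflects the actual sample sizes.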

14.
The current article explores whether generalized linear models (GLM) and generalized estimating equations (GEE) can be used in place of conventional statistical analyses in the study of ordinal data that code an underlying continuous variable, like entheseal changes. The analysis of artificial data and ordinal data expressing entheseal changes in archaeological North African populations gave the following results. Parametric and nonparametric tests give convergent results, particularly for P values <0.1, irrespective of whether the underlying variable is normally distributed or not, under the condition that the samples involved in the tests exhibit approximately equal sizes. If this prerequisite is valid and provided that the samples are of equal variances, analysis of covariance may be adopted. GLM are not subject to constraints and give results that converge to those obtained from all nonparametric tests. Therefore, they can be used instead of traditional tests as they give the same amount of information, but with the advantage of allowing the study of the simultaneous impact of multiple predictors and their interactions and the modeling of the experimental data. However, GLM should be replaced by GEE for the study of bilateral asymmetry and in general when paired samples are tested, because GEE are appropriate for correlated data. Am J Phys Anthropol 153:473–483, 2014. © 2013 Wiley Periodicals, Inc.

15.
We provide theoretical tests of a novel experimental technique to determine mechanostability of proteins based on stretching a mechanically protected protein by single-molecule force spectroscopy. This technique involves stretching a homogeneous or heterogeneous chain of reference proteins (single-molecule markers) in which one of them acts as host to the guest protein under study. The guest protein is grafted into the host through genetic engineering. It is expected that unraveling of the host precedes the unraveling of the guest removing ambiguities in the reading of the force-extension patterns of the guest protein. We study examples of such systems within a coarse-grained structure-based model. We consider systems with various ratios of mechanostability for the host and guest molecules and compare them to experimental results involving cohesin I as the guest molecule. For a comparison, we also study the force-displacement patterns in proteins that are linked in a serial fashion. We find that the mechanostability of the guest is similar to that of the isolated or serially linked protein. We also demonstrate that the ideal configuration of this strategy would be one in which the host is much more mechanostable than the single-molecule markers. We finally show that it is troublesome to use the highly stable cystine knot proteins as a host to graft a guest in stretching studies because this would involve a cleaving procedure. Proteins 2014; 82:717–726. © 2014 Wiley Periodicals, Inc.

16.
In light of historical and recent anthropogenic influences on Malagasy primate populations, in this study ring-tailed lemur (Lemur catta) samples from two sites in southwestern Madagascar, Beza Mahafaly Special Reserve (BMSR) and Tsimanampetsotsa National Park (TNP), were evaluated for the genetic signature of a population bottleneck. A total of 45 individuals (20 from BMSR and 25 from TNP) were genotyped at seven microsatellite loci. Three methods were used to evaluate these populations for evidence of a historical bottleneck: M-ratio, mode-shift, and heterozygosity excess tests. Three mutation models were used for heterozygosity excess tests: the stepwise mutation model (SMM), two-phase model (TPM), and infinite allele model (IAM). M-ratio estimations indicated a potential bottleneck in both populations under some conditions. Although mode-shift tests did not strongly indicate a population bottleneck in the recent historical past when samples from all individuals were included, a female-only analysis indicated a potential bottleneck in TNP. Heterozygosity excess was indicated under two of the three mutation models (IAM and TPM), with TNP showing stronger evidence of heterozygosity excess than BMSR. Taken together, these results suggest that a bottleneck may have occurred among L. catta in southwestern Madagascar in the recent past. Given knowledge of how current major stochastic climatic events and human-induced change can negatively impact extant lemur populations, it is reasonable that comparable events in the historical past could have caused a population bottleneck. This evaluation additionally functions to highlight the continuing environmental and anthropogenic challenges faced by lemurs in southwestern Madagascar.
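
For readers unfamiliar with the M-ratio mentioned above, a minimal sketch follows. It assumes the commonly used definition M = k / r, where k is the number of distinct alleles observed at a microsatellite locus and r is the number of possible allele states spanned by the observed size range (range in repeat units plus one), averaged across loci. The abstract does not state the definition or software used, so the formula, function names, and allele sizes here are assumptions for illustration only.

```python
def m_ratio(allele_sizes, repeat_length):
    """M = k / r for one locus.

    allele_sizes:  observed allele sizes in base pairs (duplicates allowed)
    repeat_length: length of the repeat motif in base pairs
    """
    distinct = sorted(set(allele_sizes))
    k = len(distinct)  # number of distinct alleles observed
    # Number of allele states between the smallest and largest allele.
    r = (distinct[-1] - distinct[0]) // repeat_length + 1
    return k / r

def mean_m_ratio(loci):
    """Average M across loci given as (allele_sizes, repeat_length) pairs."""
    values = [m_ratio(sizes, rep) for sizes, rep in loci]
    return sum(values) / len(values)

# Hypothetical dinucleotide loci: the second has gaps in its size range,
# as might be left behind when rare alleles are lost in a bottleneck.
loci = [
    ([100, 102, 104, 106, 102, 104], 2),  # k = 4, r = 4, M = 1.0
    ([150, 152, 160, 152, 150], 2),       # k = 3, r = 6, M = 0.5
]
print(mean_m_ratio(loci))
```

A low M suggests that alleles have been lost faster than the size range has contracted; how low is low enough depends on critical values obtained by simulation under an assumed mutation model.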

17.
Rates of trait evolution are known to vary across phylogenies; however, standard evolutionary models assume a homogeneous process of trait change. These simple methods are widely applied in small-scale phylogenetic studies, whereas models of rate heterogeneity are not, so the prevalence and patterns of potential rate variation in groups up to hundreds of species remain unclear. The extent to which trait evolution is modelled accurately on a given phylogeny is also largely unknown because studies typically lack absolute model fit tests. We investigated these issues by applying both rate-static and variable-rates methods on (i) body mass data for 88 avian clades of 10–318 species, and (ii) data simulated under a range of rate-heterogeneity scenarios. Our results show that rate heterogeneity is present across small-scale avian clades, and consequently applying only standard single-process models prompts inaccurate inferences about the generating evolutionary process. Specifically, these approaches underestimate rate variation, and systematically mislabel temporal trends in trait evolution. Conversely, variable-rates approaches have superior relative fit (they are the best model) and absolute fit (they describe the data well). We show that rate changes such as single internal branch variations, rate decreases and early bursts are hard to detect, even by variable-rates models. We also use recently developed absolute adequacy tests to highlight misleading conclusions based on relative fit alone (e.g. a consistent preference for constrained evolution when isolated terminal branch rate increases are present). This work highlights the potential for robust inferences about trait evolution when fitting flexible models in conjunction with tests for absolute model fit.

18.
McLain AC, Lum KJ, Sundaram R. Biometrics, 2012, 68(2): 648-656
Menstrual cycle patterns are often used as indicators of female fecundity and are associated with hormonally dependent diseases such as breast cancer. A question of considerable interest is how to identify menstrual cycle patterns and their association with fecundity. A source of data for addressing this question is prospective pregnancy studies that collect detailed information on reproductive aged women. However, methodological challenges exist in ascertaining the association between these two processes, as the number of longitudinally measured menstrual cycles is relatively small and informatively censored by time to pregnancy (TTP), and the cycle length distribution is highly skewed. We propose a joint modeling approach with a mixed effects dispersion model for the menstrual cycle lengths and a discrete survival model for TTP to address this question. This allows us to assess the effect of important characteristics of menstrual cycle that are associated with fecundity. We are also able to assess the effect of fecundity predictors such as age at menarche, age, and parity on both these processes. An advantage of the proposed approach is the prediction of the TTP, thus allowing us to study the efficacy of menstrual cycle characteristics in predicting fecundity. We analyze two prospective pregnancy studies to illustrate our proposed method by building a model based on the Oxford Conception Study, and predicting for the New York State Angler Cohort Prospective Pregnancy Study. Our analysis has relevant findings for assessing fecundity.

19.
Furcraea foetida (Asparagaceae) is a native plant of Central America and northern South America but there is no information about its country of origin. The species was introduced into Brazil and is now considered invasive, particularly in coastal ecosystems. To date, nothing is known about the environmental factors that constrain its distribution and there is only inconclusive information about its location of origin. We used reciprocal distribution models (RDM) to assess invasion risk of F. foetida across Brazil and to identify source regions in its native range. We also tested the niche conservatism hypothesis using Principal Components Analyses and statistical tests of niche equivalency and similarity between its native and invaded ranges. For RDM analysis, we built two models using maximum entropy, one using records in the native range to predict the invaded distribution (forward-Ecological Niche Model or forward-ENM) and one using records in the invaded range to predict the native distribution (reverse-ENM). Forward-ENM indicated invasion risk in the Cerrado region and the innermost region of the Atlantic Forest, but failed to predict the current occurrence in southern Brazil. Reverse-ENM supported an existing hypothesis that F. foetida originated in the Orinoco river basin, Amazon basin and Caribbean islands. Prediction errors in the RDM and multivariate analysis indicated that the species expanded its realized niche in Brazil. The niche similarity test further suggested that the niche differences are because of differences in habitat availability between the two ranges, not because of evolutionary changes. We hypothesize that physiological pre-adaptation (especially, the crassulacean acid metabolism), human-driven propagule pressure and high competitive ability are the main factors determining the current spatial distribution of the species in Brazil. Our study highlights the need to include F. foetida in plant invasion monitoring programs, especially in priority conservation areas where the species has still not been introduced.

20.
Biological data are often intrinsically hierarchical (e.g., species from different genera, plants within different mountain regions), which has made mixed-effects models a common analysis tool in ecology and evolution because they can account for the non-independence. Many questions around their practical applications are solved but one is still debated: Should we treat a grouping variable with a low number of levels as a random or fixed effect? In such situations, the variance estimate of the random effect can be imprecise, but it is unknown if this affects statistical power and type I error rates of the fixed effects of interest. Here, we analyzed the consequences of treating a grouping variable with 2–8 levels as a fixed or random effect in correctly specified and alternative models (under- or overparametrized models). We calculated type I error rates and statistical power for all model specifications and quantified the influences of study design on these quantities. We found no influence of model choice on type I error rate and power on the population-level effect (slope) for random intercept-only models. However, with varying intercepts and slopes in the data-generating process, using a random slope and intercept model, and switching to a fixed-effects model in case of a singular fit, avoids overconfidence in the results. Additionally, the number of levels and the differences between them strongly influence power and type I error. We conclude that inferring the correct random-effect structure is of great importance to obtain correct type I error rates. We encourage starting with a mixed-effects model independent of the number of levels in the grouping variable and switching to a fixed-effects model only in case of a singular fit. With these recommendations, we allow for more informed choices about study design and data analysis and make ecological inference with mixed-effects models more robust for small numbers of levels.
