Similar articles (20 found)
1.
Generalized linear models (GLM) with a canonical logit link function are the primary modeling technique used to relate a binary outcome to predictor variables. However, noncanonical links can offer more flexibility, producing convenient analytical quantities (e.g., probit GLMs in toxicology) and desired measures of effect (e.g., relative risk from log GLMs). Many summary goodness‐of‐fit (GOF) statistics exist for logistic GLM. The properties of logistic GLMs make the development of GOF statistics relatively straightforward, but it can be more difficult under noncanonical links. Although GOF tests for logistic GLM with continuous covariates (GLMCC) have been applied to GLMCCs with log links, we know of no GOF tests in the literature specifically developed for GLMCCs that can be applied regardless of link function chosen. We generalize the Tsiatis GOF statistic, originally developed for logistic GLMCCs, so that it can be applied under any link function. Further, we show that the algebraically related Hosmer–Lemeshow and Pigeon–Heyse (J2) statistics can be applied directly. In a simulation study, the generalized Tsiatis, Hosmer–Lemeshow, and J2 statistics were used to evaluate the fit of probit, log–log, complementary log–log, and log models, all calculated with a common grouping method. The generalized Tsiatis statistic consistently maintained Type I error rates, while those of the Hosmer–Lemeshow and J2 statistics were often lower than expected if terms with little influence were included. Generally, the statistics had similar power to detect an incorrect model. An exception occurred when a log GLMCC was incorrectly fit to data generated from a logistic GLMCC. In this case, the generalized Tsiatis statistic had more power than the Hosmer–Lemeshow or J2 statistics.
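As an illustration of the grouped statistics discussed in this abstract, the following is a minimal sketch of a Hosmer–Lemeshow-type statistic computed from fitted probabilities. It only needs the fitted probabilities, so the link function used to obtain them does not enter the calculation. The function name `hosmer_lemeshow`, the equal-size grouping by sorted fitted probability, and the default of 10 groups are assumptions made for this example and are not taken from the paper.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow-type chi-square statistic (illustrative sketch).

    y: observed binary outcomes (0 or 1)
    p: fitted probabilities from any binary GLM (any link function)
    groups: number of roughly equal-size groups (an assumption of this sketch)
    """
    # Sort observations by fitted probability so that groups collect
    # observations with similar predicted risk.
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        observed = sum(obs for _, obs in chunk)    # observed events in group
        expected = sum(prob for prob, _ in chunk)  # expected events in group
        mean_p = expected / n_g
        denom = n_g * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


if __name__ == "__main__":
    # Toy data: two groups, each slightly miscalibrated.
    print(hosmer_lemeshow([0, 0, 1, 1], [0.2, 0.2, 0.8, 0.8], groups=2))
```

For logistic models the statistic is usually referred to a chi-square distribution with (groups − 2) degrees of freedom; whether that reference distribution is adequate under other links is the kind of question the simulation study above addresses.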

2.
When establishing a treatment in clinical trials, it is important to evaluate both effectiveness and toxicity. In phase II clinical trials, multinomial data are collected in m‐stage designs, especially in the two‐stage (m = 2) design. Exact tests on two proportions, for the response rate and for the nontoxicity rate, should be employed due to limited sample sizes. However, existing tests use certain parameter configurations at the boundary of the null hypothesis space to determine rejection regions without showing that the maximum Type I error rate is achieved at the boundary of the null hypothesis. In this paper, we show that the power function for each test in a large family of tests is nondecreasing in both proportions; identify the parameter configurations at which the maximum Type I error rate and the minimum power are achieved and derive level‐α tests; provide optimal two‐stage designs with the least expected total sample size and the optimization algorithm; and extend the results to the case of more than two stages. Some R code is given in the Supporting Information.

3.
Few articles have been written on analyzing three‐way interactions between drugs. It may seem quite straightforward to extend a statistical method from two drugs to three drugs. However, compared with a two‐drug study, a three‐drug study may have a more complex nonlinear response surface of the interaction index, with more complex local synergy and/or local antagonism interspersed in different regions of drug combinations. In addition, it is not possible to obtain a four‐dimensional (4D) response surface plot for a three‐drug study. We propose an analysis procedure to construct the dose combination regions of interest (say, the synergistic areas). First, we use the model robust regression method (MRR), a semiparametric method, to fit the entire response surface of the interaction index, which allows fitting a complex response surface with local synergy/antagonism. Second, we run a modified genetic algorithm (MGA), a stochastic optimization method, many times with different random seeds, to collect as many feasible points as possible that satisfy the condition on the estimated interaction index. Last, all these feasible points are used to construct the approximate dose regions of interest in 3D. A case study with three anti‐cancer drugs in an in vitro experiment is employed to illustrate how to find the dose regions of interest.

4.
Biomarkers are subject to censoring whenever some measurements are not quantifiable given a laboratory detection limit. Methods for handling censoring have received less attention in genetic epidemiology, and censored data are still often replaced with a fixed value. We compared different strategies for handling a left‐censored continuous biomarker in a family‐based study, where the biomarker is tested for association with a genetic variant, adjusting for a covariate, X. Allowing different correlations between X and the genetic variant, we compared simple substitution of censored observations with the detection limit followed by a linear mixed effect model (LMM), a Bayesian model with noninformative priors, a Tobit model with robust standard errors, and multiple imputation (MI) with and without the genetic variant in the imputation model, followed by a LMM. Our comparison was based on real and simulated data in which 20% and 40% censoring were artificially induced. The complete data were also analyzed with a LMM. In the MICROS study, the Bayesian model gave results closer to those obtained with the complete data. In the simulations, simple substitution was always the most biased method, the Tobit approach gave the least biased estimates at all censoring levels and correlation values, and the Bayesian model and both MI approaches gave slightly biased estimates but smaller root mean square errors. On the basis of these results the Bayesian approach is highly recommended for candidate gene studies; however, the computationally simpler Tobit model and MI without the genetic variant are both good options for genome‐wide studies.

5.
In scientific research, many hypotheses relate to the comparison of two independent groups. Usually, it is of interest to use a design (i.e., the allocation of sample sizes m and n for a fixed total sample size) that maximizes the power of the applied statistical test. It is known that the two‐sample t‐tests for homogeneous and heterogeneous variances may lose substantial power when variances are unequal but equally large samples are used. We demonstrate that this is not the case for the nonparametric Wilcoxon–Mann–Whitney test, whose application in biometrical research fields is motivated by two examples from cancer research. We prove the optimality of the balanced design in the case of symmetric and identically shaped distributions using normal approximations and show that this design generally offers power only negligibly lower than the optimal design for a wide range of distributions.
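The test statistic and the normal approximation mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `mann_whitney_u` is hypothetical, and the variance formula used here omits the correction for ties.

```python
from math import sqrt


def mann_whitney_u(x, y):
    """Mann-Whitney U for sample x versus sample y, with a normal-approximation z.

    U counts the pairs (x_i, y_j) with x_i > y_j; ties count as one half.
    The variance below ignores the tie correction (an assumption of this sketch).
    """
    m, n = len(x), len(y)
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    mean_u = m * n / 2.0                      # expectation of U under the null
    sd_u = sqrt(m * n * (m + n + 1) / 12.0)   # null standard deviation, no ties
    return u, (u - mean_u) / sd_u


if __name__ == "__main__":
    u, z = mann_whitney_u([1, 2, 3], [4, 5, 6])
    print(u, z)
```

With the total m + n held fixed, the null variance m·n·(m + n + 1)/12 depends on the allocation only through the product m·n, which is one simple way to see why the allocation of m and n matters for the test.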

6.
Increasing atmospheric reactive nitrogen (N) deposition due to human activities could change N cycling in terrestrial ecosystems. However, the differences between the fates of deposited NH4+ and NO3− are still not fully understood. Here, we investigated the fates of deposited NH4+ and NO3−, respectively, via the application of 15NH4NO3 and NH415NO3 in a temperate forest ecosystem. Results showed that at 410 days after tracer application, most NH4+ was immobilized in the litter layer (50 ± 2%), while a considerable amount of NO3− penetrated into the 0–5 cm mineral soil (42 ± 2%), indicating that the litter layer and the 0–5 cm mineral soil were the major N sinks of NH4+ and NO3−, respectively. Broad‐leaved trees assimilated more 15N under the NH415NO3 treatment than under the 15NH4NO3 treatment, indicating their preference for NO3−–N. At 410 days after tracer application, 16 ± 4% of the added 15N was found in aboveground biomass under the NO3− treatment, roughly twice the amount found under the NH4+ treatment (6 ± 1%). At the same time, approximately 80% of the added 15N was recovered in soil and plants under both treatments, which suggested that this forest had high potential for retention of deposited N. These results provided evidence that there were great differences between the fates of deposited NH4+ and NO3−, which could help us better understand the mechanisms and capability of forest ecosystems as a sink of reactive nitrogen.

7.
The genetic effective population size, Ne, can be estimated from the average gametic disequilibrium between pairs of loci, but such estimates require evaluation of assumptions and currently have few methods to estimate confidence intervals. speed‐ne is a suite of MATLAB computer code functions to estimate Ne from gametic disequilibrium, with a graphical user interface and a rich set of outputs that aid in understanding data patterns and comparing multiple estimators. speed‐ne includes functions to either generate or input simulated genotype data to facilitate comparative studies of Ne estimators under various population genetic scenarios. speed‐ne was validated with data simulated under both time‐forward and time‐backward coalescent models of genetic drift. Three classes of estimators were compared with simulated data to examine several general questions: what are the impacts of microsatellite null alleles on Ne estimates, how should missing data be treated, and does disequilibrium contributed by reduced recombination among some loci in a sample impact Ne estimates. Estimators differed greatly in precision in the scenarios examined, and a widely employed estimator exhibited the largest variances among replicate data sets. speed‐ne implements several jackknife approaches to estimate confidence intervals, and simulated data showed that jackknifing over loci and jackknifing over individuals provided ~95% confidence interval coverage for some estimators and should be useful for empirical studies. speed‐ne provides an open‐source extensible tool for estimation of Ne from empirical genotype data and to conduct simulations of both microsatellite and single nucleotide polymorphism (SNP) data types to develop expectations and to compare estimators.
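The jackknife idea referred to in this abstract can be sketched generically. The example below is a delete-one jackknife over a list of units (for instance, per-locus or per-individual values) and is written in Python for illustration only; it is not speed‐ne's MATLAB implementation. The function name `jackknife`, the `estimator` callback, and the normal-approximation interval using 1.96 standard errors are all assumptions of this sketch.

```python
from math import sqrt


def jackknife(values, estimator):
    """Delete-one jackknife standard error and an approximate 95% interval.

    values: list of units to leave out one at a time (e.g., loci or individuals)
    estimator: function mapping a list of units to a single number
    """
    values = list(values)
    n = len(values)
    full = estimator(values)
    # Recompute the estimate with each unit removed in turn.
    leave_one_out = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    loo_mean = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((t - loo_mean) ** 2 for t in leave_one_out)
    se = sqrt(variance)
    # Normal-approximation interval; an assumption, not the only possible choice.
    return full, se, (full - 1.96 * se, full + 1.96 * se)


if __name__ == "__main__":
    mean = lambda v: sum(v) / len(v)
    print(jackknife([1.0, 2.0, 3.0, 4.0, 5.0], mean))
```

For the sample mean, the jackknife standard error reduces to the familiar s/√n, which gives a quick sanity check before plugging in a more complicated estimator.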

8.
Interest has surged recently in removing siblings from population genetic data sets before conducting downstream analyses. However, even if the pedigree is inferred correctly, this has the potential to do more harm than good. We used computer simulations and empirical samples of coho salmon to evaluate strategies for adjusting samples to account for family structure. We compared performance in full samples and sibling‐reduced samples of estimators of allele frequency, population differentiation and effective population size. Results: (i) unless simulated samples included large family groups together with a component of unrelated individuals, removing siblings generally reduced precision of the allele frequency and population differentiation estimates; (ii) the effective size estimate based on the linkage disequilibrium method was largely unbiased using full random samples but became increasingly upwardly biased under aggressive purging of siblings. Under nonrandom sampling (some families over‐represented), the effective size estimate using full samples was downwardly biased; removing just the right ‘Goldilocks’ fraction of siblings could produce an unbiased estimate, but this sweet spot varied widely among scenarios; (iii) weighting individuals based on the inferred pedigree (to produce a best linear unbiased estimator, BLUE) maximized precision of the allele frequency estimate when the inferred pedigree was correct but performed poorly when the pedigree was wrong; (iv) a variant of sibling removal that leaves intact small sibling groups appears to be more robust to errors in inferences about family structure. Our results illustrate the complex challenges posed by presence of family structure, suggest that no single optimal solution exists and argue for caution in adjusting population genetic data sets for the presence of putative siblings without fully understanding the consequences.

10.
Comparisons of QST to FST can provide insights into the evolutionary processes that lead to differentiation, or lack thereof, among the phenotypes of different groups (e.g., populations, species), and these comparisons have been performed on a variety of taxa, including humans. Here, I show that for neutrally evolving (i.e., by genetic drift, mutation, and gene flow alone) quantitative characters, the two commonly used QST estimators have somewhat different interpretations in terms of coalescence times, particularly when the number of groups that have been sampled is small. A similar situation occurs for FST estimators. Consequently, when observations come from only a small number of groups, which is not an unusual situation, it is important to match estimators appropriately when comparing QST to FST.

11.
The ratio between the effective and the census population size, Ne/N, is an important measure of the long‐term viability and sustainability of a population. Understanding which demographic processes affect Ne/N most will improve our understanding of how genetic drift and the probability of fixation of alleles are affected by demography. This knowledge may also be of vital importance in management of endangered populations and species. Here, we use data from 13 natural populations of house sparrow (Passer domesticus) in Norway to calculate the demographic parameters that determine Ne/N. Using the global variance‐based Sobol’ method for the sensitivity analyses, we found that Ne/N was most sensitive to demographic variance, especially among older individuals. Furthermore, the individual reproductive values (that determine the demographic variance) were most sensitive to variation in fecundity. Our results draw attention to the applicability of sensitivity analyses in population management and conservation. For population management aiming to reduce the loss of genetic variation, a sensitivity analysis may indicate the demographic parameters towards which resources should be focused. The result of such an analysis may depend on the life history and mating system of the population or species under consideration, because the vital rates and sex–age classes that Ne/N is most sensitive to may change accordingly.

13.
In this work we propose the use of functional data analysis (FDA) to deal with a very large dataset of atmospheric aerosol size distribution resolved in both space and time. Data come from a mobile measurement platform in the town of Perugia (Central Italy). An OPC (Optical Particle Counter) is integrated on a cabin of the Minimetrò, an urban transportation system that moves along a monorail on a line transect of the town. The OPC takes a sample of air every six seconds, counts the number of particles of urban aerosols with a diameter between 0.28 µm and 10 µm, and classifies such particles into 21 size bins according to their diameter. Here, we adopt a 2D functional data representation for each of the 21 spatiotemporal series. In fact, space is unidimensional since it is measured as the distance on the monorail from the base station of the Minimetrò. FDA allows for a reduction of the dimensionality of each dataset and accounts for the high space‐time resolution of the data. Functional cluster analysis is then performed to search for similarities among the 21 size channels in terms of their spatiotemporal pattern. Results provide a good classification of the 21 size bins into a relatively small number of groups (between three and four) according to the season of the year. Groups including coarser particles have more similar patterns, while those including finer particles show more varied behavior according to the period of the year. Such features are consistent with the physics of atmospheric aerosol and the highlighted patterns provide a very useful ground for prospective model‐based studies.

14.
Lin Wang  Lin Li  Emil Alexov 《Proteins》2015,83(12):2186-2197
We developed a Poisson‐Boltzmann based approach to calculate the pKa values of protein ionizable residues (Glu, Asp, His, Lys and Arg) and of nucleotides of RNA and single stranded DNA. Two novel features were utilized: the dielectric properties of the macromolecules and water phase were modeled via the smooth Gaussian‐based dielectric function in DelPhi and the corresponding electrostatic energies were calculated without defining the molecular surface. We tested the algorithm by calculating pKa values for more than 300 residues from 32 proteins from the PPD dataset and achieved an overall RMSD of 0.77. In particular, an RMSD of 0.55 was achieved for surface residues and an RMSD of 1.1 for buried residues. The approach was also found capable of capturing the large pKa shifts of various single point mutations in staphylococcal nuclease (SNase) from the pKa‐cooperative dataset, resulting in an overall RMSD of 1.6 for this set of pKa's. Investigations showed that predictions for most of the buried mutant residues of SNase could be improved by using higher dielectric constant values. Furthermore, an option to generate different hydrogen positions also improves predictions for buried carboxyl residues. Finally, the calculations on two RNAs demonstrated the capability of this approach for other types of biomolecules. Proteins 2015; 83:2186–2197. © 2015 Wiley Periodicals, Inc.

15.
Reliable estimates of effective population size are of central importance in population genetics and evolutionary biology. For populations that fluctuate in size, harmonic mean population size is commonly used as a proxy for (multi‐) generational effective size. This assumes no effects of density dependence on the ratio between effective and actual population size, which limits its potential application. Here, we introduce density dependence on vital rates in a demographic model of variance effective size. We derive an expression for the Ne/N ratio in a density‐regulated population in a fluctuating environment. We show by simulations that yearly genetic drift is accurately predicted by our model, and not proportional to 1/N as assumed by the harmonic mean model, where N is the total population size of mature individuals. We find a negative relationship between Ne/N and N. For a given N, the ratio depends on variance in reproductive success and the degree of resource limitation acting on the population growth rate. Finally, our model indicates that environmental stochasticity may affect Ne/N not only through fluctuations in N, but also for a given N at a given time. Our results show that estimates of effective population size must include effects of density dependence and environmental stochasticity.
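The harmonic mean proxy that this abstract takes as its starting point can be written out in a few lines. The sketch below only illustrates that baseline calculation, not the density-dependent model the authors derive; the function name `harmonic_mean_size` and the example population sizes are hypothetical.

```python
def harmonic_mean_size(sizes):
    """Harmonic mean of a sequence of population sizes N_1, ..., N_t.

    This is the classical proxy for multi-generational effective size in a
    fluctuating population; it assumes a constant ratio of effective to
    actual size, which is the assumption the abstract calls into question.
    """
    sizes = list(sizes)
    if not sizes or any(n <= 0 for n in sizes):
        raise ValueError("population sizes must be positive")
    return len(sizes) / sum(1.0 / n for n in sizes)


if __name__ == "__main__":
    yearly_sizes = [100, 1000]  # hypothetical census sizes in two years
    print(harmonic_mean_size(yearly_sizes))       # dominated by the small year
    print(sum(yearly_sizes) / len(yearly_sizes))  # arithmetic mean, for contrast
```

The harmonic mean is pulled strongly toward the smallest sizes, which is why bottleneck years dominate this proxy.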

17.
The effect of a mutation on protein stability is traditionally measured by genetic construction, expression, purification, and physical analysis using low‐throughput methods. This process is tedious and limits the number of mutants that can be examined in a single study. In contrast, functional fitness effects can be measured in a high‐throughput manner by various deep mutational scanning tools. Using protein GB1, we have recently demonstrated the feasibility of estimating the mutational stability effect (ΔΔG) of single substitutions based on the functional fitness profile of all double substitutions. The principle is to identify genetic backgrounds that have an exhausted stability margin. The functional effect of an additional substitution on these genetic backgrounds can then be used to compute the mutational ΔΔG based on the biophysical relationship between functional fitness and thermodynamic stability. However, to identify such genetic backgrounds, the approach described in our previous study required a benchmark dataset, which is a set of known mutational ΔΔG values. In this study, a benchmark‐independent approach is developed. The genetic backgrounds of interest are identified using k‐means clustering with the integration of structural information. We further demonstrate that a reasonable approximation of ΔΔG can also be obtained without taking structural information into account. In summary, this study describes a novel method for computing ΔΔG from double‐substitution functional fitness profiles alone, without relying on any known mutational ΔΔG as a benchmark.

18.
In this paper, we introduce a new estimator of a percentile residual life function with censored data under a monotonicity constraint. Specifically, it is assumed that the percentile residual life is a decreasing function. This assumption is useful when estimating the percentile residual life of units that degenerate with age. We establish a law of the iterated logarithm for the proposed estimator, and its asymptotic equivalence to the unrestricted estimator. The asymptotic normal distribution of the estimator and its strong approximation to a Gaussian process are also established. We investigate the finite sample performance of the monotone estimator in an extensive simulation study. Finally, data from a clinical trial in primary biliary cirrhosis of the liver are analyzed with the proposed methods. One of the conclusions of our work is that the restricted estimator may be much more efficient than the unrestricted one.

19.
Mutation may impose a substantial load on populations, which varies according to the reproductive mode of organisms. Over the past years, various authors used adaptive landscape models to predict the long‐term effect of mutation on mean fitness; however, many of these studies assumed very weak mutation rates, so that at most one mutation segregates in the population. In this article, we derive several simple approximations (confirmed by simulations) for the mutation load at high mutation rate (U), using a general model that allows us to vary the number of selected traits (n), the degree of pleiotropy of mutations, and the shape of the fitness function (which affects the average sign and magnitude of epistasis among mutations). When mutations have strong fitness effects, the equilibrium mean fitness of sexuals and asexuals is close to exp(−U); under weaker mutational effects, sexuals reach a different regime where mean fitness is a simple function of U and of a parameter describing the shape of the fitness function. Contrary to weak mutation results showing that mean fitness is an increasing function of population size and a decreasing function of n, these parameters may have opposite effects in sexual populations at high mutation rate.
