Similar literature
1.
Generalized linear models (GLMs) with a canonical logit link function are the primary modeling technique used to relate a binary outcome to predictor variables. However, noncanonical links can offer more flexibility, producing convenient analytical quantities (e.g., probit GLMs in toxicology) and desired measures of effect (e.g., relative risk from log GLMs). Many summary goodness-of-fit (GOF) statistics exist for logistic GLMs. The properties of logistic GLMs make the development of GOF statistics relatively straightforward, but it can be more difficult under noncanonical links. Although GOF tests for logistic GLMs with continuous covariates (GLMCCs) have been applied to GLMCCs with log links, we know of no GOF tests in the literature specifically developed for GLMCCs that can be applied regardless of the link function chosen. We generalize the Tsiatis GOF statistic, originally developed for logistic GLMCCs, so that it can be applied under any link function. Further, we show that the algebraically related Hosmer–Lemeshow and Pigeon–Heyse (J2) statistics can be applied directly. In a simulation study, the generalized Tsiatis statistic, the Hosmer–Lemeshow statistic, and J2 were used to evaluate the fit of probit, log–log, complementary log–log, and log models, all calculated with a common grouping method. The generalized Tsiatis statistic consistently maintained Type I error rates, while those of the Hosmer–Lemeshow statistic and J2 were often lower than expected if terms with little influence were included. Generally, the statistics had similar power to detect an incorrect model. An exception occurred when a log GLMCC was incorrectly fit to data generated from a logistic GLMCC. In this case, the generalized Tsiatis statistic had more power than the Hosmer–Lemeshow statistic or J2.
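As a rough illustration of the kind of grouped GOF statistic discussed in this abstract, the following sketch computes a Hosmer–Lemeshow-type statistic from binary outcomes and fitted probabilities. It only needs fitted probabilities, so it does not depend on which link produced them. The function name `hosmer_lemeshow`, the equal-size grouping rule, and the default of 10 groups are assumptions made for illustration; the abstract does not specify the grouping method the authors used, and no reference distribution is computed here.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow-type statistic for binary outcomes y (0/1) and
    fitted probabilities p, using equal-size groups of sorted p.

    The grouping rule is an assumption for illustration only.
    """
    if len(y) != len(p) or len(y) == 0:
        raise ValueError("y and p must be non-empty and of equal length")
    if groups < 1 or groups > len(y):
        raise ValueError("groups must be between 1 and len(y)")

    n = len(y)
    # Sort observation indices by fitted probability.
    order = sorted(range(n), key=lambda i: p[i])

    stat = 0.0
    for k in range(groups):
        # Split the sorted indices into `groups` nearly equal chunks.
        idx = order[k * n // groups:(k + 1) * n // groups]
        n_k = len(idx)
        observed = sum(y[i] for i in idx)   # observed events in group
        expected = sum(p[i] for i in idx)   # expected events in group
        denom = expected * (1.0 - expected / n_k)
        if denom <= 0.0:
            raise ValueError("a group has fitted probabilities all 0 or all 1")
        stat += (observed - expected) ** 2 / denom
    return stat


if __name__ == "__main__":
    y = [0, 0, 1, 1]
    p = [0.25, 0.25, 0.75, 0.75]
    print(hosmer_lemeshow(y, p, groups=2))
```

Larger values indicate a larger gap between observed and expected event counts within groups.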

2.
Few articles have been written on analyzing three-way interactions between drugs. It may seem quite straightforward to extend a statistical method from two drugs to three drugs. However, compared with a two-drug study, a three-drug study may have a more complex nonlinear response surface of the interaction index, with more complex local synergy and/or local antagonism interspersed in different regions of drug combinations. In addition, it is not possible to obtain a four-dimensional (4D) response surface plot for a three-drug study. We propose an analysis procedure to construct the dose combination regions of interest (say, the synergistic areas). First, we use the model robust regression method (MRR), a semiparametric method, to fit the entire response surface of the interaction index, which allows a complex response surface with local synergy/antagonism to be fitted. Second, we run a modified genetic algorithm (MGA), a stochastic optimization method, many times with different random seeds, to collect as many feasible points as possible whose estimated interaction index values satisfy the criterion of interest. Last, all these feasible points are used to construct the approximate dose regions of interest in 3D. A case study with three anti-cancer drugs in an in vitro experiment is employed to illustrate how to find the dose regions of interest.

3.
In scientific research, many hypotheses relate to the comparison of two independent groups. Usually, it is of interest to use a design (i.e., the allocation of sample sizes m and n for a fixed total N = m + n) that maximizes the power of the applied statistical test. It is known that the two-sample t-tests for homogeneous and heterogeneous variances may lose substantial power when variances are unequal but equally large samples are used. We demonstrate that this is not the case for the nonparametric Wilcoxon–Mann–Whitney test, whose application in biometrical research fields is motivated by two examples from cancer research. We prove the optimality of the balanced design (m = n) in the case of symmetric and identically shaped distributions using normal approximations, and show that this design generally offers power only negligibly lower than the optimal design for a wide range of distributions.

4.
Matched case-control paired data are commonly used to study the association between a disease and an exposure of interest. This work provides a consistent test for this association with respect to the conditional odds ratio, which is a measure of association that is also valid in prospective studies. We formulate the test from the maximum likelihood (ML) estimate of the conditional odds ratio by using data under inverse binomial sampling, in which individuals are selected sequentially to form matched pairs until, for the first time, one obtains a prefixed number of index pairs either with the case unexposed but the control exposed, or with the case exposed but the control unexposed. We discuss the situation of possible early stopping. We compare numerically the performance of our procedure with a competitor proposed by Lui (1996) in terms of type I error rate, power, average sample number (ASN), and the corresponding standard error. Our numerical study shows a gain in sample size without loss in power as compared to the competitor. Finally, we use data taken from a case-control study on the use of X-rays and the risk of childhood acute myeloid leukemia for illustration.

5.
Comparisons of QST to FST can provide insights into the evolutionary processes that lead to differentiation, or lack thereof, among the phenotypes of different groups (e.g., populations, species), and these comparisons have been performed on a variety of taxa, including humans. Here, I show that for neutrally evolving (i.e., by genetic drift, mutation, and gene flow alone) quantitative characters, the two commonly used QST estimators have somewhat different interpretations in terms of coalescence times, particularly when the number of groups that have been sampled is small. A similar situation occurs for FST estimators. Consequently, when observations come from only a small number of groups, which is not an unusual situation, it is important to match estimators appropriately when comparing QST to FST.

6.
We develop nonparametric maximum likelihood estimation for the parameters of an irreversible Markov chain on the states {0, 1, 2} from observations with interval-censored times of the 0 → 1, 0 → 2 and 1 → 2 transitions. The distinguishing aspect of the data is that, in addition to all transition times being interval censored, the times of two events (the 0 → 1 and 1 → 2 transitions) can be censored into the same interval. This development was motivated by a common data structure in oral health research, here specifically illustrated by data from a prospective cohort study on the longevity of dental veneers. Using the self-consistency algorithm, we obtain the maximum likelihood estimators of the cumulative incidences of the times to events 1 and 2 and of the intensity of the 1 → 2 transition. This work generalizes previous results on estimation in an "illness-death" model from interval-censored observations.

7.
Interest has surged recently in removing siblings from population genetic data sets before conducting downstream analyses. However, even if the pedigree is inferred correctly, this has the potential to do more harm than good. We used computer simulations and empirical samples of coho salmon to evaluate strategies for adjusting samples to account for family structure. We compared performance in full samples and sibling-reduced samples of estimators of allele frequency, population differentiation (FST) and effective population size (Ne). Results: (i) unless simulated samples included large family groups together with a component of unrelated individuals, removing siblings generally reduced the precision of the allele-frequency and FST estimates; (ii) the Ne estimate based on the linkage disequilibrium method was largely unbiased using full random samples but became increasingly upwardly biased under aggressive purging of siblings. Under nonrandom sampling (some families over-represented), the Ne estimate using full samples was downwardly biased; removing just the right 'Goldilocks' fraction of siblings could produce an unbiased estimate, but this sweet spot varied widely among scenarios; (iii) weighting individuals based on the inferred pedigree (to produce a best linear unbiased estimator, BLUE) maximized precision of the allele-frequency estimate when the inferred pedigree was correct but performed poorly when the pedigree was wrong; (iv) a variant of sibling removal that leaves small sibling groups intact appears to be more robust to errors in inferences about family structure. Our results illustrate the complex challenges posed by the presence of family structure, suggest that no single optimal solution exists, and argue for caution in adjusting population genetic data sets for the presence of putative siblings without fully understanding the consequences.

8.
In this paper, we introduce a new estimator of a percentile residual life function with censored data under a monotonicity constraint. Specifically, it is assumed that the percentile residual life is a decreasing function. This assumption is useful when estimating the percentile residual life of units that degenerate with age. We establish a law of the iterated logarithm for the proposed estimator, and its ‐equivalence to the unrestricted estimator. The asymptotic normal distribution of the estimator and its strong approximation to a Gaussian process are also established. We investigate the finite sample performance of the monotone estimator in an extensive simulation study. Finally, data from a clinical trial in primary biliary cirrhosis of the liver are analyzed with the proposed methods. One of the conclusions of our work is that the restricted estimator may be much more efficient than the unrestricted one.

9.
When establishing a treatment in clinical trials, it is important to evaluate both effectiveness and toxicity. In phase II clinical trials, multinomial data are collected in m-stage designs, especially in the two-stage (m = 2) design. Exact tests on two proportions, one for the response rate and one for the nontoxicity rate, should be employed due to limited sample sizes. However, existing tests use certain parameter configurations at the boundary of the null hypothesis space to determine rejection regions without showing that the maximum Type I error rate is achieved at that boundary. In this paper, we show that the power function for each test in a large family of tests is nondecreasing in both the response rate and the nontoxicity rate; identify the parameter configurations at which the maximum Type I error rate and the minimum power are achieved and derive level-α tests; provide optimal two-stage designs with the least expected total sample size, together with the optimization algorithm; and extend the results to the case of m > 2. Some R code is given in the Supporting Information.

10.
In this work we propose the use of functional data analysis (FDA) to deal with a very large dataset of atmospheric aerosol size distributions resolved in both space and time. Data come from a mobile measurement platform in the town of Perugia (Central Italy). An OPC (Optical Particle Counter) is integrated on a cabin of the Minimetrò, an urban transportation system that moves along a monorail on a line transect of the town. The OPC takes a sample of air every six seconds, counts the number of particles of urban aerosols with a diameter between 0.28 µm and 10 µm, and classifies such particles into 21 size bins according to their diameter. Here, we adopt a 2D functional data representation for each of the 21 spatiotemporal series. Space is unidimensional, since it is measured as the distance on the monorail from the base station of the Minimetrò. FDA allows for a reduction of the dimensionality of each dataset and accounts for the high space-time resolution of the data. Functional cluster analysis is then performed to search for similarities among the 21 size channels in terms of their spatiotemporal pattern. Results provide a good classification of the 21 size bins into a relatively small number of groups (between three and four) according to the season of the year. Groups including coarser particles have more similar patterns, while those including finer particles behave more differently according to the period of the year. Such features are consistent with the physics of atmospheric aerosol, and the highlighted patterns provide a very useful ground for prospective model-based studies.

11.
The effect of a mutation on protein stability is traditionally measured by genetic construction, expression, purification, and physical analysis using low-throughput methods. This process is tedious and limits the number of mutants that can be examined in a single study. In contrast, functional fitness effects can be measured in a high-throughput manner by various deep mutational scanning tools. Using protein GB1, we have recently demonstrated the feasibility of estimating the mutational stability effect (ΔΔG) of single substitutions based on the functional fitness profile of all double substitutions. The principle is to identify genetic backgrounds that have an exhausted stability margin. The functional effect of an additional substitution on these genetic backgrounds can then be used to compute the mutational ΔΔG based on the biophysical relationship between functional fitness and thermodynamic stability. However, to identify such genetic backgrounds, the approach described in our previous study required a benchmark dataset, that is, a set of known mutational ΔΔG values. In this study, a benchmark-independent approach is developed. The genetic backgrounds of interest are identified using k-means clustering with the integration of structural information. We further demonstrate that a reasonable approximation of ΔΔG can also be obtained without taking structural information into account. In summary, this study describes a novel method for computing ΔΔG from double-substitution functional fitness profiles alone, without relying on any known mutational ΔΔG as a benchmark.

12.
We estimated local and metapopulation effective sizes (Ne and meta-Ne) for three coexisting salmonid species (Salmo salar, Salvelinus fontinalis, Salvelinus alpinus) inhabiting a freshwater system comprising seven interconnected lakes. First, we hypothesized that Ne might be inversely related to within-species population divergence as reported in an earlier study (i.e., FST: S. salar > S. fontinalis > S. alpinus). Using the approximate Bayesian computation method implemented in ONeSAMP, we found significant differences in Ne between species, consistent with a hierarchy of adult population sizes. Using another method based on a measure of linkage disequilibrium (LDNE), we found more finite Ne values for S. salar than for the other two salmonids, in line with the results above that indicate that S. salar exhibits the lowest Ne among the three species. Considering subpopulations as open to migration (i.e., removing putative immigrants) led to only marginal and non-significant changes in Ne, suggesting that migration may be at equilibrium between genetically similar sources. Second, we hypothesized that meta-Ne might be significantly smaller than the sum of local Ne values (null model) if gene flow is asymmetric, varies among subpopulations, and is driven by common landscape features such as waterfalls. One 'bottom-up' or numerical approach that explicitly incorporates variable and asymmetric migration rates showed this very pattern, while a number of analytical models provided meta-Ne estimates that were not significantly different from the null model or from each other. Our study of three species inhabiting a shared environment highlights the importance and utility of differentiating species-specific and landscape effects, not only on dispersal but also in the demography of wild populations as assessed through local Ne and meta-Ne values and their relevance in ecology, evolution and conservation.

13.
Lin Wang  Lin Li  Emil Alexov 《Proteins》2015,83(12):2186-2197
We developed a Poisson-Boltzmann based approach to calculate the pKa values of protein ionizable residues (Glu, Asp, His, Lys and Arg) and of nucleotides of RNA and single-stranded DNA. Two novel features were utilized: the dielectric properties of the macromolecules and water phase were modeled via the smooth Gaussian-based dielectric function in DelPhi, and the corresponding electrostatic energies were calculated without defining the molecular surface. We tested the algorithm by calculating pKa values for more than 300 residues from 32 proteins from the PPD dataset and achieved an overall RMSD of 0.77. In particular, an RMSD of 0.55 was achieved for surface residues and an RMSD of 1.1 for buried residues. The approach was also found capable of capturing the large pKa shifts of various single point mutations in staphylococcal nuclease (SNase) from the pKa-cooperative dataset, resulting in an overall RMSD of 1.6 for this set of pKa's. Investigations showed that predictions for most buried mutant residues of SNase could be improved by using higher dielectric constant values. Furthermore, an option to generate different hydrogen positions also improves pKa predictions for buried carboxyl residues. Finally, the pKa calculations on two RNAs demonstrated the capability of this approach for other types of biomolecules. Proteins 2015; 83:2186–2197. © 2015 Wiley Periodicals, Inc.
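The accuracy figures quoted in this abstract are root-mean-square deviations (RMSD) between predicted and experimental pKa values. A minimal sketch of that metric follows; the function name `rmsd` and the sample numbers are hypothetical and are not taken from the paper or from DelPhi.

```python
import math


def rmsd(predicted, experimental):
    """Root-mean-square deviation between paired predicted and
    experimental values (for example, pKa values in pH units)."""
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - b) ** 2 for a, b in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical pKa values, for illustration only.
    predicted = [4.0, 6.5]
    experimental = [4.5, 6.0]
    print(rmsd(predicted, experimental))
```

Because deviations are squared before averaging, a few badly predicted buried residues can dominate the overall figure, which is consistent with the separate reporting for surface and buried residues.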

14.
Reliable estimates of effective population size are of central importance in population genetics and evolutionary biology. For populations that fluctuate in size, harmonic mean population size is commonly used as a proxy for (multi-)generational effective size. This assumes no effects of density dependence on the ratio between effective and actual population size, which limits its potential application. Here, we introduce density dependence on vital rates in a demographic model of variance effective size. We derive an expression for the ratio Ne/N in a density-regulated population in a fluctuating environment. We show by simulations that yearly genetic drift is accurately predicted by our model, and is not proportional to 1/N as assumed by the harmonic mean model, where N is the total population size of mature individuals. We find a negative relationship between Ne/N and N. For a given N, the ratio depends on variance in reproductive success and the degree of resource limitation acting on the population growth rate. Finally, our model indicates that environmental stochasticity may affect Ne not only through fluctuations in N, but also for a given N at a given time. Our results show that estimates of effective population size must include effects of density dependence and environmental stochasticity.
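For reference, the harmonic mean proxy that this abstract argues against can be sketched in a few lines. This shows only the conventional baseline, not the authors' density-dependent model; the function name `harmonic_mean_size` and the example sizes are assumptions for illustration.

```python
def harmonic_mean_size(sizes):
    """Harmonic mean of per-generation population sizes, the common
    proxy for multi-generational effective size."""
    if len(sizes) == 0:
        raise ValueError("sizes must not be empty")
    if any(n <= 0 for n in sizes):
        raise ValueError("all population sizes must be positive")
    return len(sizes) / sum(1.0 / n for n in sizes)


if __name__ == "__main__":
    # Hypothetical population sizes over two generations.
    print(harmonic_mean_size([10, 1000]))
```

The harmonic mean is dominated by the smallest sizes, so a single bottleneck generation pulls the proxy far below the arithmetic mean.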

15.
The ratio between the effective and the census population size, Ne/N, is an important measure of the long-term viability and sustainability of a population. Understanding which demographic processes affect Ne/N most will improve our understanding of how genetic drift and the probability of fixation of alleles are affected by demography. This knowledge may also be of vital importance in the management of endangered populations and species. Here, we use data from 13 natural populations of house sparrow (Passer domesticus) in Norway to calculate the demographic parameters that determine Ne/N. Using the global variance-based Sobol' method for the sensitivity analyses, we found that Ne/N was most sensitive to demographic variance, especially among older individuals. Furthermore, the individual reproductive values (that determine the demographic variance) were most sensitive to variation in fecundity. Our results draw attention to the applicability of sensitivity analyses in population management and conservation. For population management aiming to reduce the loss of genetic variation, a sensitivity analysis may indicate the demographic parameters towards which resources should be focused. The result of such an analysis may depend on the life history and mating system of the population or species under consideration, because the vital rates and sex–age classes to which Ne/N is most sensitive may change accordingly.

16.
Increasing atmospheric reactive nitrogen (N) deposition due to human activities could change N cycling in terrestrial ecosystems. However, the differences between the fates of deposited NH4+ and NO3− are still not fully understood. Here, we investigated the fates of deposited NH4+ and NO3−, respectively, via the application of 15NH4NO3 and NH415NO3 in a temperate forest ecosystem. Results showed that at 410 days after tracer application, most 15NH4+ was immobilized in the litter layer (50 ± 2%), while a considerable amount of 15NO3− penetrated into the 0–5 cm mineral soil (42 ± 2%), indicating that the litter layer and the 0–5 cm mineral soil were the major N sinks of NH4+ and NO3−, respectively. Broad-leaved trees assimilated more 15N under the NH415NO3 treatment than under the 15NH4NO3 treatment, indicating their preference for NO3−–N. At 410 days after tracer application, 16 ± 4% of the added 15N was found in aboveground biomass under the NO3− treatment, more than twice that under the NH4+ treatment (6 ± 1%). At the same time, approximately 80% of the added 15N was recovered in soil and plants under both treatments, which suggests that this forest has a high potential for retention of deposited N. These results provide evidence that there are great differences between the fates of deposited NH4+ and NO3−, which could help us better understand the mechanisms and capability of forest ecosystems as a sink of reactive nitrogen.

17.
18.
The genetic effective population size, Ne, can be estimated from the average gametic disequilibrium between pairs of loci, but such estimates require evaluation of assumptions, and few methods currently exist to estimate their confidence intervals. speed-ne is a suite of MATLAB computer code functions to estimate Ne from gametic disequilibrium, with a graphical user interface and a rich set of outputs that aid in understanding data patterns and comparing multiple estimators. speed-ne includes functions to either generate or input simulated genotype data to facilitate comparative studies of estimators under various population genetic scenarios. speed-ne was validated with data simulated under both time-forward and time-backward coalescent models of genetic drift. Three classes of estimators were compared with simulated data to examine several general questions: what are the impacts of microsatellite null alleles on Ne estimates, how should missing data be treated, and does disequilibrium contributed by reduced recombination among some loci in a sample affect Ne estimates? Estimators differed greatly in precision in the scenarios examined, and a widely employed estimator exhibited the largest variances among replicate data sets. speed-ne implements several jackknife approaches to estimate confidence intervals, and simulated data showed that jackknifing over loci and jackknifing over individuals provided ~95% confidence interval coverage for some estimators and should be useful for empirical studies. speed-ne provides an open-source, extensible tool for estimating Ne from empirical genotype data and for conducting simulations of both microsatellite and single nucleotide polymorphism (SNP) data types to develop expectations and to compare estimators.
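The jackknife confidence intervals mentioned in this abstract follow a general resampling recipe: recompute the estimate with one unit (a locus or an individual) left out at a time, then use the spread of those estimates as a standard error. The sketch below is a generic delete-one jackknife with a normal-approximation interval. It is not the speed-ne implementation; the names `jackknife_ci` and `estimator` and the default multiplier of 1.96 are assumptions.

```python
import math


def jackknife_ci(units, estimator, z=1.96):
    """Delete-one jackknife standard error and normal-approximation
    confidence interval for estimator(units).

    `units` is a sequence of sampling units (e.g. loci or individuals);
    `estimator` maps a list of units to a single number.
    """
    units = list(units)
    n = len(units)
    if n < 2:
        raise ValueError("at least two units are required")

    full = estimator(units)
    # Leave-one-out estimates.
    loo = [estimator(units[:i] + units[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    variance = (n - 1) / n * sum((t - mean_loo) ** 2 for t in loo)
    se = math.sqrt(variance)
    return full, se, (full - z * se, full + z * se)


if __name__ == "__main__":
    # Hypothetical per-locus values; the estimator is a plain mean.
    values = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(jackknife_ci(values, lambda xs: sum(xs) / len(xs)))
```

Passing the estimator as a function keeps the resampling logic separate from any particular Ne estimator, so the same routine can jackknife over loci or over individuals depending on what `units` holds.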

19.
The response of soil carbon dynamics to climate and land-use change will affect both the future climate and the quality of ecosystems. Deep soil carbon (>20 cm) is the primary component of the soil carbon pool, but the dynamics of deep soil carbon remain poorly understood. Radiocarbon activity (Δ14C), which is a function of the age of carbon, may therefore help to understand the rates of soil carbon biodegradation and stabilization. We analyzed the published 14C contents in 122 profiles of mineral soil that were well distributed across most of the large world biomes, except for the boreal zone. With a multivariate extension of a linear mixed-effects model whose inference was based on the parallel combination of two algorithms, the expectation–maximization (EM) and the Metropolis–Hastings algorithms, we expressed soil Δ14C profiles as a four-parameter function of depth. The four-parameter model produced insightful predictions of soil Δ14C as dependent on depth, soil type, climate, vegetation, land use and date of sampling. Further analysis with the model showed that the age of topsoil carbon was primarily affected by climate and cultivation. By contrast, the age of deep soil carbon was affected more by soil taxa than by climate, illustrating the strong dependence of soil carbon dynamics on other pedologic traits such as clay content and mineralogy.

20.