Similar literature
 Found 20 similar documents; search took 31 ms
1.
Nucleotide substitution in both coding and noncoding regions is context-dependent, in the sense that substitution rates depend on the identity of neighboring bases. Context-dependent substitution has been modeled in the case of two sequences and an unrooted phylogenetic tree, but it has only been accommodated in limited ways with more general phylogenies. In this article, extensions are presented to standard phylogenetic models that allow for better handling of context-dependent substitution, yet still permit exact inference at reasonable computational cost. The new models improve goodness of fit substantially for both coding and noncoding data. Considering context dependence leads to much larger improvements than does using a richer substitution model or allowing for rate variation across sites, under the assumption of site independence. The observed improvements appear to derive from three separate properties of the models: their explicit characterization of context-dependent substitution within N-tuples of adjacent sites, their ability to accommodate overlapping N-tuples, and their rich parameterization of the substitution process. Parameter estimation is accomplished using an expectation maximization algorithm, with a quasi-Newton algorithm for the maximization step; this approach is shown to be preferable to ordinary Newton methods for parameter-rich models. Overlapping tuples are efficiently handled by assuming Markov dependence of the observed bases at each site on those at the N - 1 preceding sites, and the required conditional probabilities are computed with an extension of Felsenstein's algorithm. Estimated substitution rates based on a data set of about 160,000 noncoding sites in mammalian genomes indicate a pronounced CpG effect, but they also suggest a complex overall pattern of context-dependent substitution, comprising a variety of subtle effects. 
Estimates based on about 3 million sites in coding regions demonstrate that amino acid substitution rates can be learned at the nucleotide level, and suggest that context effects across codon boundaries are significant.

2.
Evaluating the goodness of fit of logistic regression models is crucial to ensure the accuracy of the estimated probabilities. Unfortunately, such evaluation is problematic in large samples. Because the power of traditional goodness of fit tests increases with the sample size, practically irrelevant discrepancies between estimated and true probabilities are increasingly likely to cause the rejection of the hypothesis of perfect fit in larger and larger samples. This phenomenon has been widely documented for popular goodness of fit tests, such as the Hosmer-Lemeshow test. To address this limitation, we propose a modification of the Hosmer-Lemeshow approach. By standardizing the noncentrality parameter that characterizes the alternative distribution of the Hosmer-Lemeshow statistic, we introduce a parameter that measures the goodness of fit of a model but does not depend on the sample size. We provide the methodology to estimate this parameter and construct confidence intervals for it. Finally, we propose a formal statistical test to rigorously assess whether the fit of a model, albeit not perfect, is acceptable for practical purposes. The proposed method is compared in a simulation study with a competing modification of the Hosmer-Lemeshow test, based on repeated subsampling. We provide a step-by-step illustration of our method using a model for postneonatal mortality developed in a large cohort of more than 300 000 observations.
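The sample-size problem described in this abstract can be illustrated with a minimal sketch of the classical Hosmer-Lemeshow statistic (not the modified test the authors propose). The helper `make_data`, the four risk groups, and the 2-percentage-point miscalibration are hypothetical choices made only for illustration.

```python
from typing import List, Sequence, Tuple


def hosmer_lemeshow(y: Sequence[int], p: Sequence[float], groups: int = 10) -> float:
    """Classical Hosmer-Lemeshow chi-square statistic.

    Observations are sorted by predicted probability and split into
    `groups` bins of (nearly) equal size.
    """
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        expected = sum(prob for prob, _ in chunk)   # expected events in the bin
        observed = sum(out for _, out in chunk)     # observed events in the bin
        mean_p = expected / n_g
        denom = n_g * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


def make_data(m: int) -> Tuple[List[int], List[float]]:
    """Hypothetical data: four risk groups of size m, each with an observed
    event rate 2 percentage points above the predicted probability."""
    y: List[int] = []
    p: List[float] = []
    for pct in (20, 40, 60, 80):
        events = m * (pct + 2) // 100
        y += [1] * events + [0] * (m - events)
        p += [pct / 100] * m
    return y, p


small = hosmer_lemeshow(*make_data(100), groups=4)
large = hosmer_lemeshow(*make_data(10_000), groups=4)
print(round(small, 3), round(large, 3))
```

With the same small miscalibration, the statistic grows in proportion to the sample size, so a reference chi-square test (2 degrees of freedom for 4 groups, 5% critical value about 5.99) would not reject at m = 100 but would reject at m = 10,000.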

3.
The Japanese atomic bomb survivors and three other cohorts of children exposed to radiation are analyzed, and evidence is found for a reduction in the radiation-induced relative risk of cancers other than leukemia with time following exposure. Multiplicative adjustments to the excess risk either of the form exp[-δ · (time since exposure)] or of the form [time since exposure]^γ give equivalent goodness of fit. Using the former type of adjustment an annual overall reduction of 6.9-8.6% in excess relative risk is indicated (depending on the year after which this reduction might take effect). Using the second type of multiplier an adjustment to the excess relative risk varying between [time after exposure]^-2.0 and [time after exposure]^-3.2 fits best overall. All these reductions are statistically significant at the 5% level. There is no significant variation by cohort, by sex, by cancer type, or by age at exposure group in the degree of annual reduction in excess relative risk. Although time-adjusted relative and absolute risk models give equivalently good fits within each cohort, there is significant variation between cohorts in the degree of increase of risk with time in the absolute risk formulation, in contrast to the lack of such heterogeneity for the relative risk formulation. It is shown that if the range of observed reductions in relative risk is assumed to operate 40 or more years after exposure in the youngest age groups, the calculated UK population risks would be reduced by 30-45% compared to those based on a constant relative risk model.
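The two multiplicative time adjustments named in this abstract can be sketched as follows. This is only an illustration: the function names are hypothetical, and reading an "annual reduction" r as a decay rate of -ln(1 - r) per year is an assumption, not something the abstract states.

```python
import math


def err_exponential(err0: float, years: float, annual_reduction: float) -> float:
    """Excess relative risk under an exponential adjustment exp(-delta * t).

    Assumption: an annual reduction r corresponds to delta = -ln(1 - r).
    """
    delta = -math.log(1.0 - annual_reduction)
    return err0 * math.exp(-delta * years)


def err_power(err0: float, years: float, gamma: float) -> float:
    """Excess relative risk under a power adjustment t ** gamma (t in years)."""
    return err0 * years ** gamma


# Fraction of the initial excess relative risk remaining after 10 years
# at the two ends of the reported 6.9-8.6% annual range.
print(err_exponential(1.0, 10, 0.069), err_exponential(1.0, 10, 0.086))
```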

4.
We have compiled an extensive database of archaeological evidence for rice across Asia, including 400 sites from mainland East Asia, Southeast Asia and South Asia. This dataset is used to compare several models for the geographical origins of rice cultivation and infer the most likely region(s) for its origins and subsequent outward diffusion. The approach is based on regression modelling wherein goodness of fit is obtained from power law quantile regressions of the archaeologically inferred age versus a least-cost distance from the putative origin(s). The Fast Marching method is used to estimate the least-cost distances based on simple geographical features. The origin region that best fits the archaeobotanical data is also compared to other hypothetical geographical origins derived from the literature, including from genetics, archaeology and historical linguistics. The model that best fits all available archaeological evidence is a dual origin model with two centres for the cultivation and dispersal of rice focused on the Middle Yangtze and the Lower Yangtze valleys.

5.
A general linear model is described here for cultural and biological inheritance of lipids and lipoproteins. This model involves 10 parameters to be estimated from a total of 17 correlations, leaving ample degrees of freedom to test the goodness of fit. The model fits very well to each of the five lipid and lipoprotein variables analyzed here from a Lipid Research Clinic family data set. Both genetic and cultural inheritance are significant for each trait with the single exception that triglyceride levels fail to support genetic inheritance. Under the most parsimonious hypothesis, the genetic heritability (h²) ranges from .194 ± .092 for triglyceride to .624 ± .093 for low-density lipoprotein-cholesterol. Cultural heritability ranges from .070 ± .030 for total cholesterol to .149 ± .034 for triglyceride.

6.
A generalization of the two-mutation stochastic carcinogenesis model of Moolgavkar, Venzon and Knudson and certain models constructed by Little [Little, M.P. (1995). Are two mutations sufficient to cause cancer? Some generalizations of the two-mutation model of carcinogenesis of Moolgavkar, Venzon, and Knudson, and of the multistage model of Armitage and Doll. Biometrics 51, 1278-1291] and Little and Wright [Little, M.P., Wright, E.G. (2003). A stochastic carcinogenesis model incorporating genomic instability fitted to colon cancer data. Math. Biosci. 183, 111-134] is developed; the model incorporates multiple types of progressive genomic instability and an arbitrary number of mutational stages. The model is fitted to US Caucasian colon cancer incidence data. On the basis of the comparison of fits to the population-based data, there is little evidence to support the hypothesis that the model with more than one type of genomic instability fits better than models with a single type of genomic instability. Given the good fit of the model to this large dataset, it is unlikely that further information on presence of genomic instability or of types of genomic instability can be extracted from age-incidence data by extensions of this model.

7.
Estimating the probability that a species is extinct and the timing of extinctions is useful in biological fields ranging from paleoecology to conservation biology. Various statistical methods have been introduced to infer the time of extinction and extinction probability from a series of individual sightings. There is little evidence, however, as to which of these models provide adequate fit to actual sighting records. We use L-moment diagrams and probability plot correlation coefficient (PPCC) hypothesis tests to evaluate the goodness of fit of various probabilistic models to sighting data collected for a set of North American and Hawaiian bird populations that have either gone extinct, or are suspected of having gone extinct, during the past 150 years. For our data, the uniform, truncated exponential, and generalized Pareto models performed moderately well, but the Weibull model performed poorly. Of the acceptable models, the uniform distribution performed best based on PPCC goodness of fit comparisons and sequential Bonferroni-type tests. Further analyses using field significance tests suggest that although the uniform distribution is the best of those considered, additional work remains to evaluate the truncated exponential model more fully. The methods we present here provide a framework for evaluating subsequent models.
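A probability plot correlation coefficient of the kind mentioned in this abstract can be sketched as the Pearson correlation between the ordered observations and the theoretical quantiles of a candidate model. The plotting position (i + 0.5)/n, the function names, and the sighting years below are assumptions chosen for illustration; the study itself may use different plotting positions and its own data.

```python
import math
from typing import Callable, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ppcc(sample: Sequence[float], quantile: Callable[[float], float]) -> float:
    """Correlation between the ordered sample and the model quantiles
    evaluated at the plotting positions (i + 0.5) / n."""
    ordered: List[float] = sorted(sample)
    n = len(ordered)
    theoretical = [quantile((i + 0.5) / n) for i in range(n)]
    return pearson(ordered, theoretical)


def uniform_quantile(p: float) -> float:
    return p


def exponential_quantile(p: float) -> float:
    return -math.log(1.0 - p)


# Hypothetical, evenly spaced sighting years.
sightings = [1900, 1910, 1920, 1930, 1940]
r_uniform = ppcc(sightings, uniform_quantile)
r_exponential = ppcc(sightings, exponential_quantile)
print(r_uniform, r_exponential)
```

Because correlation is unchanged by shifting and rescaling, the location and scale of each model do not need to be estimated before comparing coefficients; a value closer to 1 indicates a straighter probability plot.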

8.
In certain toxicological experiments with laboratory animals, the outcome of interest is the occurrence of dead or malformed fetuses in a litter. Previous investigations have shown that the simple one-parameter binomial and Poisson models generally provide poor fits to this type of binary data. In this paper, a type of correlated binomial model is proposed for use in this situation. First, the model is described in detail and is compared to a beta-binomial model proposed by Williams (1975). These two-parameter models are then contrasted for goodness of fit to some real-life data. Finally, numerical examples are given in which likelihood ratio tests based on these models are employed to assess the significance of treatment-control differences.

9.
Perceptual multistability, alternative perceptions of an unchanging stimulus, gives important clues to neural dynamics. The present study examined 56 perceptual dominance time series for a Necker cube stimulus, for ambiguous motion, and for binocular rivalry. We made histograms of the perceptual dominance times, based on 307 to 2478 responses per time series (median = 612), and compared these histograms to gamma, lognormal and Weibull fitted distributions using the Kolmogorov–Smirnov goodness-of-fit test. In 40 of the 56 tested cases a lognormal distribution provided an acceptable fit to the histogram (in 24 cases it was the only fit). In 16 cases a gamma distribution, and in 11 cases a Weibull distribution, was acceptable, but never as the only fit in either case. Any of the three distributions was acceptable in three cases and none provided acceptable fits in 12 cases. Considering only the 16 cases in which a lognormal distribution was rejected (p < 0.05) revealed that minor adjustments to the fourth-moment term of the lognormal characteristic function restored good fits. These findings suggest that random fractal theory might provide insight into the underlying mechanisms of multistable perceptions.

10.
11.
Pathogen infection is typically costly to hosts, resulting in reduced fitness. However, pathogen exposure may also come at a cost even if the host does not become infected. These fitness reductions, referred to as “resistance costs”, are inducible physiological costs expressed as a result of a trade‐off between resistance to a pathogen and aspects of host fitness (e.g., reproduction). Here, we examine resistance and infection costs of a generalist fungal pathogen (Metschnikowia bicuspidata) capable of infecting a number of host species. Costs were quantified as reductions in host lifespan, total reproduction, and mean clutch size as a function of pathogen exposure (resistance cost) or infection (infection cost). We provide empirical support for infection costs and modest support for resistance costs for five Daphnia host species. Specifically, only one host species examined incurred a significant cost of resistance. This species was the least susceptible to infection, suggesting the possibility that host susceptibility to infection is associated with the detectability and size of resistance cost. Host age at the time of pathogen exposure did not influence the magnitude of resistance or infection cost. Lastly, resistant hosts had fitness values intermediate between unexposed control hosts and infected hosts. Although not statistically significant, this could suggest that pathogen exposure does come at some marginal cost. Taken together, our findings suggest that infection is costly, resistance costs may simply be difficult to detect, and the magnitude of resistance cost may vary among host species as a result of host life history or susceptibility.

12.
Well‐defined productivity–precipitation relationships of ecosystems are needed as benchmarks for the validation of land models used for future projections. The productivity–precipitation relationship may be studied in two ways: the spatial approach relates differences in productivity to those in precipitation among sites along a precipitation gradient (the spatial fit, with a steeper slope); the temporal approach relates interannual productivity changes to variation in precipitation within sites (the temporal fits, with flatter slopes). Precipitation–reduction experiments in natural ecosystems represent a complement to the fits, because they can reduce precipitation below the natural range and are thus well suited to study potential effects of climate drying. Here, we analyse the effects of dry treatments in eleven multiyear precipitation–manipulation experiments, focusing on changes in the temporal fit. We expected that structural changes in the dry treatments would occur in some experiments, thereby reducing the intercept of the temporal fit and displacing the productivity–precipitation relationship downward the spatial fit. The majority of experiments (72%) showed that dry treatments did not alter the temporal fit. This implies that current temporal fits are to be preferred over the spatial fit to benchmark land‐model projections of productivity under future climate within the precipitation ranges covered by the experiments. Moreover, in two experiments, the intercept of the temporal fit unexpectedly increased due to mechanisms that reduced either water loss or nutrient loss. The expected decrease of the intercept was observed in only one experiment, and only when distinguishing between the late and the early phases of the experiment. This implies that we currently do not know at which precipitation–reduction level or at which experimental duration structural changes will start to alter ecosystem productivity. 
Our study highlights the need for experiments with multiple, including more extreme, dry treatments, to identify the precipitation boundaries within which the current temporal fits remain valid.

13.
Isothermal titration calorimetry (ITC) is commonly used to determine the thermodynamic parameters associated with the binding of a ligand to a host macromolecule. ITC has some advantages over common spectroscopic approaches for studying host/ligand interactions. For example, the heat released or absorbed when the two components interact is directly measured and does not require any exogenous reporters. Thus the binding enthalpy and the association constant (Ka) are directly obtained from ITC data, and can be used to compute the entropic contribution. Moreover, the shape of the isotherm is dependent on the c-value and the mechanistic model involved. The c-value is defined as c = n·[P]t·Ka, where [P]t is the protein concentration, and n is the number of ligand binding sites within the host. In many cases, multiple binding sites for a given ligand are non-equivalent and ITC allows the characterization of the thermodynamic binding parameters for each individual binding site. This, however, requires that the correct binding model be used. This choice can be problematic if different models can fit the same experimental data. We have previously shown that this problem can be circumvented by performing experiments at several c-values. The multiple isotherms obtained at different c-values are fit simultaneously to separate models. The correct model is next identified based on the goodness of fit across the entire variable-c dataset. This process is applied here to the aminoglycoside resistance-causing enzyme aminoglycoside N-6'-acetyltransferase-Ii (AAC(6')-Ii). Although our methodology is applicable to any system, the necessity of this strategy is better demonstrated with a macromolecule-ligand system showing allostery or cooperativity, and when different binding models provide essentially identical fits to the same data. To our knowledge, there are no such systems commercially available.
AAC(6')-Ii is a homo-dimer containing two active sites, showing cooperativity between the two subunits. However, ITC data obtained at a single c-value can be fit equally well to at least two different models: a two-sets-of-sites independent model and a two-site sequential (cooperative) model. Through varying the c-value as explained above, it was established that the correct binding model for AAC(6')-Ii is a two-site sequential binding model. Herein, we describe the steps that must be taken when performing ITC experiments in order to obtain datasets suitable for variable-c analyses.
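The c-value definition given in this abstract, c = n·[P]t·Ka, lends itself to a short worked sketch. The function names and the numerical values (two sites, an association constant of 1e5 per molar, the target c-values) are hypothetical and are not taken from the study.

```python
import math
from typing import List, Sequence


def c_value(n_sites: int, protein_conc_molar: float, ka_per_molar: float) -> float:
    """c = n * [P]t * Ka, a dimensionless parameter shaping the ITC isotherm."""
    return n_sites * protein_conc_molar * ka_per_molar


def protein_conc_for_c(target_c: float, n_sites: int, ka_per_molar: float) -> float:
    """Protein concentration (molar) needed to reach a target c-value."""
    return target_c / (n_sites * ka_per_molar)


def concentrations_for_series(targets: Sequence[float], n_sites: int,
                              ka_per_molar: float) -> List[float]:
    """Concentrations for a set of target c-values, as in a variable-c design."""
    return [protein_conc_for_c(c, n_sites, ka_per_molar) for c in targets]


# Hypothetical example: 2 binding sites, Ka = 1e5 M^-1.
c_example = c_value(2, 50e-6, 1e5)
series = concentrations_for_series([1, 10, 100], 2, 1e5)
print(c_example, series)
```

Planning concentrations this way needs a rough prior estimate of Ka, so in practice the targets would be approximate.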

14.
Uncovering pathways of tumor progression is an important topic in cancer research that has led to numerous studies, including several pathway models proposed and investigated by Sontag and Axelrod [Progression of heterogeneous breast tumors. J. Theoret. Biol. 210, 107-119, 2005]. In their comparative studies, the authors focused on the relative goodness of fit of the various pathways, but a simple test revealed that even the "best" model did not provide an adequate explanation for the observed breast tumor data. The heterogeneous nature of breast tumors leads to the question of whether a combination of more than one pathway model is needed in order to explain the observed data. In the current paper, we address this question based on the finite mixture modeling framework and utilizing the four pathways proposed in Sontag and Axelrod as our individual pathway models. The expectation-maximization algorithm was used to derive estimates for the mixing proportions of the mixture models. Indeed, a two-pathway mixture provides a dramatic improvement over any of the single pathway models for explaining the data derived under either the Van Nuys or the Holland system. In particular, for data graded under the Van Nuys system, the mixture model was shown to be consistent with the observed data at the 1% significance level.
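Estimating mixing proportions by expectation-maximization, as this abstract describes, can be sketched for the simplest case in which each component model predicts a fixed distribution over observed categories. The two component distributions and the counts below are hypothetical; they are not the Sontag and Axelrod pathway models or the tumor data.

```python
from typing import List, Sequence


def em_mixing_proportions(counts: Sequence[float],
                          components: Sequence[Sequence[float]],
                          iterations: int = 500) -> List[float]:
    """EM estimate of mixture weights when each component's category
    probabilities are fixed and only the weights are unknown."""
    k = len(components)
    total = sum(counts)
    weights = [1.0 / k] * k
    for _ in range(iterations):
        new_weights = [0.0] * k
        for j, n_j in enumerate(counts):
            # E-step: responsibility of each component for category j.
            joint = [weights[c] * components[c][j] for c in range(k)]
            norm = sum(joint)
            if norm == 0.0:
                continue
            # M-step accumulation: expected count attributed to each component.
            for c in range(k):
                new_weights[c] += n_j * joint[c] / norm
        weights = [w / total for w in new_weights]
    return weights


# Hypothetical components over three categories, and counts that match
# a 30% / 70% mixture of them exactly.
model_a = [0.7, 0.2, 0.1]
model_b = [0.1, 0.3, 0.6]
observed = [280, 270, 450]
estimated = em_mixing_proportions(observed, [model_a, model_b])
print(estimated)
```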

15.
An algorithm has been developed for placing three-dimensional atomic structures into appropriately scaled cryoelectron microscopy maps. The first stage in this process is to conduct a three-dimensional angular search in which the center of gravity of an X-ray crystallographically determined structure is placed on a selected position in the cryoelectron microscopy map. The quality of the fit is measured by the sum of the density at each atomic position. The second stage is to refine the three angles and three translational parameters for the best (usually 25 to 100) fits. Useful criteria for this refinement include the sum of densities at atomic sites, the lack of atoms in negative or low density, the absence of atomic clashes between symmetry-related positions of the atomic structure, and the distances between identifiable features in the map and their positions on the fitted atomic structure. These refinements generally lead to a convergence of the originally chosen, top scoring fits to just a few (about 3 to 8) acceptable possibilities. Usually, the best remaining fit is clearly superior to any of the others.

16.
Although fatigue is a common and distressing symptom in cancer survivors, the mechanism of fatigue is not fully understood. Therefore, this study aims to investigate the relation between the fatigue and mindfulness of breast cancer survivors using anxiety, depression, pain, loneliness, and sleep disturbance as mediators. Path analysis was performed to examine direct and indirect associations between mindfulness and fatigue. Participants were breast cancer survivors who visited a breast surgery department at a university hospital in Japan for hormonal therapy or regular check-ups after treatment. The questionnaire measured cancer-related-fatigue, mindfulness, anxiety, depression, pain, loneliness, and sleep disturbance. Demographic and clinical characteristics were collected from medical records. Two-hundred and seventy-nine breast cancer survivors were registered, of which 259 answered the questionnaire. Ten respondents with incomplete questionnaire data were excluded, resulting in 249 participants for the analyses. Our final model fit the data well (goodness of fit index = .993; adjusted goodness of fit index = .966; comparative fit index = .999; root mean square error of approximation = .016). Mindfulness, anxiety, depression, pain, loneliness, and sleep disturbance were related to fatigue, and mindfulness had the most influence on fatigue (β = − .52). Mindfulness affected fatigue not only directly but also indirectly through anxiety, depression, pain, loneliness, and sleep disturbance. The study model helps to explain the process by which mindfulness affects fatigue. Our results suggest that mindfulness has both direct and indirect effects on the fatigue of breast cancer survivors and that mindfulness can be used to more effectively reduce their fatigue. It also suggests that health care professionals should be aware of factors such as anxiety, depression, pain, loneliness, and sleep disturbance in their care for fatigue of breast cancer survivors. 
This study was registered in the University Hospital Medical Information Network Clinical Trials Registry (UMIN number. 000027720) on June 12, 2017.

17.
Plants of duckweed (Lemna minor) were grown under constant illumination and with a controlled supply of ammonium-N so as to maintain a constant low concentration. In two kinetic experiments (differing in illumination and N level) with 15N-ammonia, plants were periodically harvested and their free amino acids analysed for 15N abundance. Attempts were then made to fit the data by computer simulation models. Only models which had at least two or more intracellular compartments gave adequate fits. Two two-compartment models were tested fully. Both had in compartment 1 the glutamine synthetase-glutamate synthase cycle and in compartment 2 a second site of glutamine synthesis. In one model the glutamate for compartment 2 was derived by transport from compartment 1; in the second model it was synthesized from ammonia by glutamate dehydrogenase at a rate equivalent to 10% of the total N uptake. This second model was rejected after it was found that plants previously treated with methionine sulphoximine and aza-serine (inhibitors of the glutamate synthase cycle) were unable to incorporate 15N. In spite of wide differences in labelling pattern between the two experiments the first model gave acceptable fits to both when different pool sizes were allowed for. Operation of the glutamate synthase cycle was confirmed by the correspondence between model and data for labelling of glutamine amide, glutamine amino and glutamic acid. Consideration of enzyme distributions suggested that compartment 1 (the glutamate synthase system) is the chloroplasts and compartment 2 the cytosol. Analysis of asparagine and neutral amino acids made it possible to construct balance sheets for N uptake in the two experiments. They suggest that all glutamine synthesized in the chloroplast is used for glutamate and asparagine synthesis and that the cytosol enzyme meets the need of the cell for glutamine per se. 
The high turnover rates for asparagine indicate that this compound is an important intermediate even under steady state conditions, and carries between 20 and 50% of the products of N assimilation.

18.
A periodic regression model, named the Baseline Cosinus Function (BCF), was designed to fit biological rhythms that show temporal deviations (peaks) above or below an otherwise relatively stable baseline. The BCF model has four parameters only, namely, baseline, peak-height, acrophase, and peak-width. BCF-regressions to daily rhythms in urinary 6-sulphatoxymelatonin (aMT6s), hypothalamic glutamate concentration, and body temperature of hamsters are compared to fits of single (SCF) and complex cosine functions (CCF; using the fundamental and the first harmonic). Goodness of fit statistics show that BCF-regressions to aMT6s-profiles of 36 hamsters resulted in lower residual errors than both SCF and CCF regressions, in particular when rhythms were determined under long photoperiod (n = 18) with relatively short nocturnal peaks (χ² = 316.6, 142.7 and 74.5 for SCF, CCF and BCF, respectively). For aMT6s rhythms obtained from hamsters in short photoperiod (n = 18) with prolonged nocturnal peaks, goodness of fit was equivalent in CCF and BCF regressions (χ² = 326.3, 107.0 and 101.4, for SCF, CCF, BCF, respectively), while BCF requires one parameter less than CCF. BCF-fits to daily patterns of hypothalamic glutamate and body temperature demonstrate that this model may be applied to various data types and has particular advantages when rhythms are sharply peaked, and when an independent estimate of peak-width, i.e., the total duration of a rise above the baseline, is desired.

19.
The species–area relationship (SAR) constitutes one of the most general ecological patterns globally. A number of different SAR models have been proposed. Recent work has shown that no single model universally provides the best fit to empirical SAR datasets: multiple models may be of practical and theoretical interest. However, there are no software packages available that a) allow users to fit the full range of published SAR models, or b) provide functions to undertake a range of additional SAR‐related analyses. To address these needs, we have developed the R package ‘sars’ that provides a wide variety of SAR‐related functionality. The package provides functions to: a) fit 20 SAR models using non‐linear and linear regression, b) calculate multi‐model averaged curves using various information criteria, and c) generate confidence intervals using bootstrapping. Plotting functions allow users to depict and scrutinize the fits of individual models and multi‐model averaged curves. The package also provides additional SAR functionality, including functions to fit, plot and evaluate the random placement model using a species–sites abundance matrix, and to fit the general dynamic model of oceanic island biogeography. The ‘sars’ R package will aid future SAR research by providing a comprehensive set of simple to use tools that enable in‐depth exploration of SARs and SAR‐related patterns. The package has been designed to allow other researchers to add new functions and models in the future and thus the package represents a resource for future SAR work that can be built on and expanded by workers in the field.
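As a rough illustration of what fitting a single SAR model involves, the sketch below fits the widely used power model S = c·A^z by ordinary least squares on log-transformed values. It is written in Python and does not use or reproduce the API of the ‘sars’ R package; the function name and the island data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_power_sar(areas: Sequence[float], richness: Sequence[float]) -> Tuple[float, float]:
    """Fit S = c * A**z by linear regression of log(S) on log(A).

    Returns (c, z). All areas and richness values must be positive.
    """
    xs = [math.log(a) for a in areas]
    ys = [math.log(s) for s in richness]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    z = sxy / sxx                  # slope on the log-log scale
    c = math.exp(my - z * mx)      # back-transformed intercept
    return c, z


# Hypothetical islands whose richness follows S = 5 * A**0.25 exactly.
island_areas = [1.0, 10.0, 100.0, 1000.0]
island_richness = [5.0 * a ** 0.25 for a in island_areas]
c_hat, z_hat = fit_power_sar(island_areas, island_richness)
print(c_hat, z_hat)
```

A log-log fit is the linear-regression route; a non-linear fit on the untransformed scale would weight large islands differently and can give different estimates on real data.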

20.
Frisch JE. International Journal for Parasitology 1999, 29(1):57-71; discussion 73-5
Acaricides are essential in the short-term but do not offer a permanent solution to tick control. This situation will not change without a change of approach. A vaccine against Boophilus microplus confers partial long-term control but has little immediate effect on tick burdens. The effectiveness of acaricides and vaccination is greatest for breeds of high tick resistance. High host resistance is the key to effective long-term tick control with total resistance the ultimate aim. While improvements to acaricides and vaccines are continuously pursued, improvements to the most important single factor controlling ticks, host resistance, have been neglected. Resistance is as heritable as milk yield or growth and in tropical breeds can be increased to very high levels by selection. Despite this there are no current examples of sustained selection for tick resistance. Temperate breeds have low resistance but because of high production potentials are favoured for crossbreeding with tropical breeds. This perpetuates the need for reliance on acaricides. Selection to increase polygenic resistance of temperate breeds is impractical. However, a quantum increase can be achieved by introgressing major resistance genes. Such a gene occurs in the Belmont Adaptaur and in suitable genetic backgrounds confers 100% resistance. Total resistance is achievable and provides a permanent solution to ticks.
