Similar documents
 20 similar documents found (search time: 31 ms)
1.
Estimates of quantitative trait loci (QTL) effects derived from complete genome scans are biased if no assumptions are made about the distribution of QTL effects. Bias should be reduced if estimates are derived by maximum likelihood, with the QTL effects sampled from a known distribution. The parameters of the distributions of QTL effects for nine economic traits in dairy cattle were estimated from a daughter design analysis of the Israeli Holstein population including 490 marker-by-sire contrasts. A separate gamma distribution was derived for each trait. Estimates for both the α and β parameters and their SE decreased as a function of heritability. The maximum likelihood estimates derived for the individual QTL effects using the gamma distributions for each trait were regressed relative to the least squares estimates, but the regression factor decreased as a function of the least squares estimate. On simulated data, the mean of least squares estimates for effects with nominal 1% significance was more than twice the simulated values, while the mean of the maximum likelihood estimates was slightly lower than the mean of the simulated values. The coefficient of determination for the maximum likelihood estimates was five-fold the corresponding value for the least squares estimates.
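As an illustrative aside, fitting a gamma distribution to a collection of estimated effects can be sketched with a simple method-of-moments fit; this is not the maximum likelihood procedure used in the paper, and the data and parameter values below are simulated/hypothetical.

```python
import random
import statistics

def fit_gamma_moments(samples):
    # Method-of-moments estimates for Gamma(shape=a, rate=b):
    # mean = a/b, variance = a/b^2  =>  a = mean^2/var, b = mean/var
    m = statistics.fmean(samples)
    v = statistics.pvariance(samples)
    return m * m / v, m / v

random.seed(1)
true_a, true_b = 2.0, 4.0  # hypothetical shape and rate
# random.gammavariate takes (shape, scale), so scale = 1/rate
effects = [random.gammavariate(true_a, 1.0 / true_b) for _ in range(20000)]
a_hat, b_hat = fit_gamma_moments(effects)
```

With 20,000 simulated effects the moment estimates land close to the generating values; a full ML fit would iterate on the gamma log-likelihood instead.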

2.
MOTIVATION: Maximum likelihood-based methods to estimate site-by-site substitution rate variability in aligned homologous protein sequences rely on the formulation of a phylogenetic tree and generally assume that the patterns of relative variability follow a pre-determined distribution. We present a phylogenetic tree-independent method to estimate the relative variability of individual sites within large datasets of homologous protein sequences. It is based upon two simple assumptions: firstly, that substitutions observed between two closely related sequences are likely, in general, to occur at the most variable sites; secondly, that non-conservative amino acid substitutions tend to occur at more variable sites. Our methodology makes no assumptions regarding the underlying pattern of relative variability between sites. RESULTS: We have compared, using data simulated under a non-gamma-distributed model, the performance of this approach to that of a maximum likelihood method that assumes gamma-distributed rates. At low mean rates of evolution, our method inferred site-by-site relative substitution rates more accurately than the maximum likelihood approach in the absence of prior assumptions about the relationships between sequences. Our method does not directly account for the effects of mutational saturation. However, we have incorporated an 'ad hoc' modification that allows the accurate estimation of relative site variability in fast-evolving and saturated datasets.

3.
S M Kokoska, Biometrics, 1987, 43(3):525-534
This paper is concerned with the analysis of certain cancer chemoprevention experiments that involve Type I censoring. In experiments of this nature, two common response variables are the number of induced cancers and the rate at which they develop. In this study we assume that the number of induced tumors and their times to detection are described by the Poisson and gamma distributions, respectively. Using the method of maximum likelihood, we discuss a procedure for estimating the parameters characterizing these two distributions. We apply standard techniques in order to construct a confidence region and conduct a hypothesis test concerning the parameters of interest. We discuss a method for comparing the effects of two different treatments using the likelihood ratio principle. A technique for isolating group differences in terms of the mean number of promoted tumors and the mean time to detection is described. Using the techniques developed in this paper, we reanalyze an existing data set in the cancer chemoprevention literature and obtain contrasting results.
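The likelihood ratio principle for comparing two treatment groups can be illustrated, for the Poisson count component alone, with a minimal sketch; the counts below are hypothetical, and the paper's joint Poisson-gamma likelihood is more involved.

```python
import math

def poisson_loglik(lam, counts):
    # Poisson log-likelihood: sum of k*log(lam) - lam - log(k!)
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in counts)

def poisson_lrt(group1, group2):
    # MLE of a Poisson mean is the sample mean, so both the null
    # (common mean) and alternative (separate means) fits are closed form.
    l1 = sum(group1) / len(group1)
    l2 = sum(group2) / len(group2)
    l0 = (sum(group1) + sum(group2)) / (len(group1) + len(group2))
    ll_alt = poisson_loglik(l1, group1) + poisson_loglik(l2, group2)
    ll_null = poisson_loglik(l0, group1 + group2)
    return 2.0 * (ll_alt - ll_null)  # compare to chi-square(1); 3.84 at the 5% level

stat = poisson_lrt([3, 4, 5], [8, 9, 10])  # hypothetical tumor counts per animal
```

Here the statistic exceeds 3.84, so the two hypothetical group means would be judged different at the 5% level.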

4.
Bivariate time series of counts with excess zeros relative to the Poisson process are common in many bioscience applications. Failure to account for the extra zeros in the analysis may result in biased parameter estimates and misleading inferences. A class of bivariate zero-inflated Poisson autoregression models is presented to accommodate the zero-inflation and the inherent serial dependency between successive observations. An autoregressive correlation structure is assumed in the random component of the compound regression model. Parameter estimation is achieved via an EM algorithm, by maximizing an appropriate log-likelihood function to obtain residual maximum likelihood estimates. The proposed method is applied to analyze a bivariate series from an occupational health study, in which the zero-inflated injury count events are classified as either musculoskeletal or non-musculoskeletal in nature. The approach enables the evaluation of the effectiveness of a participatory ergonomics intervention at the population level, in terms of reducing the overall incidence of lost-time injury and a simultaneous decline in the two mean injury rates.
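Stripped of the bivariate and autoregressive structure, the zero-inflation idea reduces to the univariate zero-inflated Poisson, whose EM fit can be sketched as follows; the data are simulated, the parameter values are hypothetical, and this is not the paper's residual maximum likelihood algorithm.

```python
import math
import random

def rpois(lam, rng):
    # Knuth's Poisson sampler (the standard library has none)
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while p > limit:
        k += 1
        p *= rng.random()
    return k - 1

def zip_em(counts, iters=200):
    # EM for a zero-inflated Poisson: pi = extra-zero probability, lam = Poisson mean
    n = len(counts)
    n_zero = counts.count(0)
    total = sum(counts)
    pi, lam = 0.5, max(total / n, 0.1)
    for _ in range(iters):
        # E-step: posterior probability that an observed zero is a structural zero
        r = pi / (pi + (1.0 - pi) * math.exp(-lam))
        z = n_zero * r
        # M-step: closed-form updates
        pi = z / n
        lam = total / (n - z)
    return pi, lam

rng = random.Random(7)
true_pi, true_lam = 0.3, 2.0  # hypothetical values
data = [0 if rng.random() < true_pi else rpois(true_lam, rng) for _ in range(20000)]
pi_hat, lam_hat = zip_em(data)
```

Note that the EM must split the observed zeros between the structural-zero state and genuine Poisson zeros, which is exactly what the E-step responsibility `r` does.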

5.
We have investigated the effects of different among-site rate variation models on the estimation of substitution model parameters, branch lengths, topology, and bootstrap proportions under minimum evolution (ME) and maximum likelihood (ML). Specifically, we examined equal rates, invariable sites, gamma-distributed rates, and site-specific rates (SSR) models, using mitochondrial DNA sequence data from three protein-coding genes and one tRNA gene from species of the New Zealand cicada genus Maoricicada. Estimates of topology were relatively insensitive to the substitution model used; however, estimates of bootstrap support, branch lengths, and R-matrices (underlying relative substitution rate matrix) were strongly influenced by the assumptions of the substitution model. We identified one situation where ME and ML tree building became inaccurate when implemented with an inappropriate among-site rate variation model. Despite the fact that SSR models often have a better fit to the data than do invariable sites and gamma rates models, SSR models have some serious weaknesses. First, SSR rate parameters are not comparable across data sets, unlike the proportion of invariable sites or the alpha shape parameter of the gamma distribution. Second, the extreme among-site rate variation within codon positions is problematic for SSR models, which explicitly assume rate homogeneity within each rate class. Third, the SSR models appear to give severe underestimates of R-matrices and branch lengths relative to invariable sites and gamma rates models in this example. We recommend performing phylogenetic analyses under a range of substitution models to test the effects of model assumptions not only on estimates of topology but also on estimates of branch length and nodal support.

6.
We consider the statistical modeling and analysis of replicated multi-type point process data with covariates. Such data arise when heterogeneous subjects experience repeated events or failures which may be of several distinct types. The underlying processes are modeled as nonhomogeneous mixed Poisson processes with random (subject) and fixed (covariate) effects. The method of maximum likelihood is used to obtain estimates and standard errors of the failure rate parameters and regression coefficients. Score tests and likelihood ratio statistics are used for covariate selection. A graphical test of goodness of fit of the selected model is based on generalized residuals. Measures for determining the influence of an individual observation on the estimated regression coefficients and on the score test statistic are developed. An application is described to a large ongoing randomized controlled clinical trial for the efficacy of nutritional supplements of selenium for the prevention of two types of skin cancer.

7.
For a prospective randomized clinical trial with two groups, the relative risk can be used as a measure of treatment effect and is directly interpretable as the ratio of success probabilities in the new treatment group versus the placebo group. For a prospective study with many covariates and a binary outcome (success or failure), relative risk regression may be of interest. If we model the log of the success probability as a linear function of covariates, the regression coefficients are log-relative risks. However, using such a log-linear model with a Bernoulli likelihood can lead to convergence problems in the Newton-Raphson algorithm. This is likely to occur when the success probabilities are close to one. A constrained likelihood method proposed by Wacholder (1986, American Journal of Epidemiology 123, 174-184) also has convergence problems. We propose a quasi-likelihood method of moments technique in which we naively assume the Bernoulli outcome is Poisson, with the mean (success probability) following a log-linear model. We use the Poisson maximum likelihood equations to estimate the regression coefficients without constraints. Using method of moments ideas, one can show that the estimates using the Poisson likelihood will be consistent and asymptotically normal. We apply these methods to a double-blinded randomized trial in primary biliary cirrhosis of the liver (Markus et al., 1989, New England Journal of Medicine 320, 1709-1713).
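For the special case of a single binary treatment indicator, the Poisson working-model estimate of the log relative risk has a closed form, each group's fitted mean being its observed success rate, which makes the idea easy to sketch; the success counts below are hypothetical.

```python
import math

def log_relative_risk(y_treat, y_placebo):
    # With one binary covariate, the Poisson score equations solve in closed
    # form: the fitted mean in each group is that group's success proportion,
    # so the coefficient is the log of the ratio of proportions.
    p1 = sum(y_treat) / len(y_treat)
    p0 = sum(y_placebo) / len(y_placebo)
    return math.log(p1 / p0)

# hypothetical trial: 30/40 successes on treatment vs 20/40 on placebo
y_new = [1] * 30 + [0] * 10
y_ctrl = [1] * 20 + [0] * 20
rr = math.exp(log_relative_risk(y_new, y_ctrl))  # relative risk 0.75 / 0.50 = 1.5
```

Because no constraint keeps fitted probabilities below one, a robust (sandwich) variance would be used for inference in practice, in line with the method-of-moments argument in the abstract.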

8.
Bayesian hierarchical models usually model the risk surface on the same arbitrary geographical units for all data sources. Poisson/gamma random field models overcome this restriction, as the underlying risk surface can be specified independently of the resolution of the data. Moreover, covariates may be considered as either excess or relative risk factors. We compare the performance of the Poisson/gamma random field model to the Markov random field (MRF)-based ecologic regression model and the Bayesian Detection of Clusters and Discontinuities (BDCD) model, in both a simulation study and a real data example. We find the BDCD model to have advantages in situations dominated by abruptly changing risk, while the Poisson/gamma random field model convinces by its flexibility in estimating random field structures and in incorporating covariates. The MRF-based ecologic regression model is inferior. WinBUGS code for Poisson/gamma random field models is provided.
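The Poisson/gamma construction rests on a standard fact: a Poisson count whose rate is gamma distributed is marginally negative binomial. A minimal numerical check, with hypothetical shape and rate values:

```python
import math

def poisson_gamma_marginal(k, a, b):
    # If K | lam ~ Poisson(lam) and lam ~ Gamma(shape=a, rate=b), then marginally
    # P(K=k) = Gamma(k+a)/(Gamma(a) k!) * (b/(b+1))^a * (1/(b+1))^k  (negative binomial)
    return math.exp(
        math.lgamma(k + a) - math.lgamma(a) - math.lgamma(k + 1)
        + a * math.log(b / (b + 1.0)) - k * math.log(b + 1.0)
    )

a, b = 2.0, 0.5  # hypothetical values
pmf = [poisson_gamma_marginal(k, a, b) for k in range(400)]
total = sum(pmf)                          # should be ~1 (mass sums to one)
mean = sum(k * p for k, p in enumerate(pmf))  # should be ~a/b = 4
```

The extra-Poisson variance of this marginal is what lets Poisson/gamma models absorb overdispersion that a plain Poisson surface cannot.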

9.
The risk difference is an intelligible measure for comparing disease incidence in two exposure or treatment groups. Despite its convenience in interpretation, it is less prevalent in epidemiological and clinical areas where regression models are required in order to adjust for confounding. One major barrier to its popularity is that standard linear binomial or Poisson regression models can produce estimated probabilities outside the range (0,1), resulting in possible convergence issues. For estimating adjusted risk differences, we propose a general framework covering various constraint approaches based on binomial and Poisson regression models. The proposed methods span the areas of ordinary least squares, maximum likelihood estimation, and Bayesian inference. Compared to existing approaches, our methods prevent estimates and confidence intervals of predicted probabilities from falling outside the valid range. Through extensive simulation studies, we demonstrate that the proposed methods solve the issue of having estimates or confidence limits of predicted probabilities outside (0,1), while offering performance comparable to the alternatives in terms of bias, variability, and coverage rates in point and interval estimation of the risk difference. An application study is performed using data from the Prospective Registry Evaluating Myocardial Infarction: Event and Recovery (PREMIER) study.

10.
Maps depicting cancer incidence rates have become useful tools in public health research, giving valuable information about the spatial variation in rates of disease. Typically, these maps are generated using count data aggregated over areas such as counties or census blocks. However, with the proliferation of geographic information systems and related databases, it is becoming easier to obtain exact spatial locations for the cancer cases and suitable control subjects. The use of such point data allows us to adjust for individual-level covariates, such as age and smoking status, when estimating the spatial variation in disease risk. Unfortunately, such covariate information is often subject to missingness. We propose a method for mapping cancer risk when covariates are not completely observed. We model these data using a logistic generalized additive model. Estimates of the linear and non-linear effects are obtained using a mixed effects model representation. We develop an EM algorithm to account for missing data and the random effects. Since the expectation step involves an intractable integral, we estimate the E-step with a Laplace approximation. This framework provides a general method for handling missing covariate values when fitting generalized additive models. We illustrate our method through an analysis of cancer incidence data from Cape Cod, Massachusetts. These analyses demonstrate that standard complete-case methods can yield biased estimates of the spatial variation of cancer risk.

11.
Tissue heterogeneity, radioactive decay and measurement noise are the main error sources in compartmental modeling used to estimate the physiologic rate constants of various radiopharmaceuticals from a dynamic PET study. We introduce a new approach to this problem by modeling the tissue heterogeneity with random rate constants in compartment models. In addition, the Poisson nature of the radioactive decay is included as a Poisson random variable in the measurement equations. The estimation problem is carried out using maximum likelihood estimation. With this approach, we obtain not only accurate mean estimates for the rate constants, but also estimates of tissue heterogeneity within the region of interest and of other possibly unknown model parameters, e.g. instrument noise variance. We also avoid the problem of the optimal weighting of the data associated with the conventionally used weighted least-squares method. The new approach was tested with simulated time–activity curves from the conventional three-compartment, three-rate-constant model with normally distributed rate constants and with a noise mixture of Poisson and normally distributed random variables. Our simulation results showed that this new model gave accurate estimates for the mean of the rate constants, the measurement noise parameter and also for the tissue heterogeneity, i.e. the variance of the rate constants within the region of interest.

12.
Hsieh F, Tseng YK, Wang JL, Biometrics, 2006, 62(4):1037-1043
The maximum likelihood approach to jointly modeling the survival time and its longitudinal covariates has been successful in modeling both processes in longitudinal studies. Random effects in the longitudinal process are often used to model the survival times through a proportional hazards model, and this invokes an EM algorithm to search for the maximum likelihood estimates (MLEs). Several intriguing issues are examined here, including the robustness of the MLEs against departure from the normal random effects assumption, and difficulties with the profile likelihood approach to provide reliable estimates for the standard error of the MLEs. We provide insights into the robustness property and suggest using bootstrap procedures to overcome the difficulty of obtaining reliable standard error estimates. Numerical studies and data analysis illustrate our points.

13.
Xiang L, Yau KK, Van Hui Y, Lee AH, Biometrics, 2008, 64(2):508-518
The k-component Poisson regression mixture with random effects is an effective model for describing the heterogeneity of clustered count data arising from several latent subpopulations. However, residual maximum likelihood (REML) estimation of regression coefficients and variance component parameters tends to be unstable and may result in misleading inferences in the presence of outliers or extreme contamination. In the literature, minimum Hellinger distance (MHD) estimation has been investigated to obtain robust estimation for finite Poisson mixtures. This article aims to develop a robust MHD estimation approach for k-component Poisson mixtures with normally distributed random effects. By applying the Gaussian quadrature technique to approximate the integrals involved in the marginal distribution, the marginal probability function of the k-component Poisson mixture with random effects can be approximated by the summation of a set of finite Poisson mixtures. A simulation study shows that the MHD estimates perform satisfactorily for data without outlying observation(s), and outperform the REML estimates when data are contaminated. Application to a data set of recurrent urinary tract infections (UTI) with random institution effects demonstrates the practical use of the robust MHD estimation method.
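For discrete distributions, the Hellinger distance underlying MHD estimation is a simple bounded quantity; its boundedness is what limits the influence of outlying cells. A minimal sketch with toy probability vectors:

```python
import math

def hellinger(p, q):
    # Hellinger distance between two discrete distributions on the same support.
    # Always in [0, 1]: 0 for identical distributions, 1 for disjoint supports.
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))

d_same = hellinger([0.2, 0.3, 0.5], [0.2, 0.3, 0.5])  # identical -> 0
d_disjoint = hellinger([1.0, 0.0], [0.0, 1.0])        # disjoint  -> 1
```

An MHD estimator picks the model parameters whose implied distribution minimizes this distance to the empirical distribution, rather than maximizing a likelihood.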

14.
Little is known about long-term cancer risks following in utero radiation exposure. We evaluated the association between in utero radiation exposure and risk of solid cancer and leukemia mortality among 8,000 offspring, born from 1948-1988, of female workers at the Mayak Nuclear Facility in Ozyorsk, Russia. Mother's cumulative gamma radiation uterine dose during pregnancy served as a surrogate for fetal dose. We used Poisson regression methods to estimate relative risks (RRs) and 95% confidence intervals (CIs) of solid cancer and leukemia mortality associated with in utero radiation exposure and to quantify excess relative risks (ERRs) as a function of dose. Using currently available dosimetry information, 3,226 (40%) offspring were exposed in utero (mean dose = 54.5 mGy). Based on 75 deaths from solid cancers (28 exposed) and 12 deaths from leukemia (6 exposed), in utero exposure status was not significantly associated with solid cancer mortality (RR = 0.94, 95% CI 0.58 to 1.49; ERR/Gy = -0.1, 95% CI < -0.1 to 4.1) or leukemia mortality (RR = 1.65, 95% CI 0.52 to 5.27; ERR/Gy = -0.8, 95% CI < -0.8 to 46.9). These initial results provide no evidence that low-dose gamma in utero radiation exposure increases solid cancer or leukemia mortality risk, but the data are not inconsistent with such an increase. As the offspring cohort is relatively young, subsequent analyses based on larger case numbers are expected to provide more precise estimates of adult cancer mortality risk following in utero exposure to ionizing radiation.
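As a rough illustration only, a crude (unadjusted) risk ratio and Wald confidence interval can be computed from the death counts quoted above; this is not the paper's Poisson regression and will not reproduce its adjusted RR of 0.94, and the unexposed denominator is inferred here as 8,000 − 3,226 = 4,774.

```python
import math

def risk_ratio_ci(cases_e, n_e, cases_u, n_u, z=1.96):
    # Crude risk ratio with a Wald CI on the log scale
    rr = (cases_e / n_e) / (cases_u / n_u)
    se = math.sqrt(1.0 / cases_e - 1.0 / n_e + 1.0 / cases_u - 1.0 / n_u)
    return rr, rr * math.exp(-z * se), rr * math.exp(z * se)

# solid cancer deaths from the abstract: 28 among 3,226 exposed,
# 75 - 28 = 47 among the inferred 4,774 unexposed offspring
rr, lo, hi = risk_ratio_ci(28, 3226, 47, 4774)
```

The crude interval straddles 1, consistent with the abstract's conclusion of no significant association for solid cancer mortality.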

15.
Several analyses of the geographic variation of mortality rates have been proposed in the literature. Poisson models allowing the incorporation of random effects to model extra-variability are widely used. The typical modelling approach uses normal random effects to accommodate local spatial autocorrelation. When spatial autocorrelation is absent but overdispersion persists, a discrete mixture model is an alternative approach. However, a technique for identifying regions with significantly high or low risk in any given area has not yet been developed for the discrete mixture model. Given the importance of this information to epidemiologists formulating hypotheses about potential risk factors affecting the population, different procedures for obtaining confidence intervals for relative risks are derived in this paper. These methods are the standard information-based method and four others based on bootstrap techniques, namely the asymptotic bootstrap, the percentile bootstrap, the BC bootstrap, and the modified information-based method. All of them are compared empirically through their application to mortality data due to cardiovascular diseases in women from Navarra, Spain, during the period 1988–1994. In the small-area example considered here, we find that the information-based method is sensible for estimating standard errors of the component means in the discrete mixture model, but it is not appropriate for providing standard errors of the estimated relative risks and hence for constructing confidence intervals for the relative risk associated with each region. Therefore, the bootstrap-based methods are recommended for this matter. More specifically, the BC method seems to provide better coverage probabilities in the case studied, according to a small-scale simulation study carried out using a scenario as encountered in the analysis of the real data.
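Of the bootstrap variants compared, the percentile bootstrap is the simplest to sketch; the example below uses toy data and the sample mean as the statistic, rather than the mixture-model relative risks, and the quantile indexing follows one common convention.

```python
import random
import statistics

def percentile_bootstrap_ci(data, stat, reps=2000, alpha=0.05, seed=0):
    # Percentile bootstrap: resample with replacement, evaluate the statistic,
    # and take the empirical alpha/2 and 1 - alpha/2 quantiles of the replicates.
    rng = random.Random(seed)
    n = len(data)
    draws = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)]) for _ in range(reps)
    )
    lo = draws[int(reps * alpha / 2)]
    hi = draws[int(reps * (1 - alpha / 2)) - 1]
    return lo, hi

data = list(range(1, 101))  # toy data with mean 50.5
lo, hi = percentile_bootstrap_ci(data, statistics.fmean)
```

The BC (bias-corrected) variant favored in the abstract adjusts these quantile positions using the fraction of replicates below the point estimate.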

16.
Friedl H, Kauermann G, Biometrics, 2000, 56(3):761-767
A procedure is derived for computing standard errors of EM estimates in generalized linear models with random effects. Quadrature formulas are used to approximate the integrals in the EM algorithm, where two different approaches are pursued, i.e., Gauss-Hermite quadrature in the case of Gaussian random effects and nonparametric maximum likelihood estimation for an unspecified random effect distribution. An approximation of the expected Fisher information matrix is derived from an expansion of the EM estimating equations. This allows for inferential arguments based on EM estimates, as demonstrated by an example and simulations.
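Gauss-Hermite quadrature for expectations under a Gaussian random effect can be sketched with the classical 3-point rule, which is exact for polynomial integrands up to degree 5; in a real GLMM the integrand would be the conditional likelihood rather than a polynomial.

```python
import math

# 3-point Gauss-Hermite rule for the weight exp(-x^2)
NODES = [-math.sqrt(1.5), 0.0, math.sqrt(1.5)]
WEIGHTS = [math.sqrt(math.pi) / 6.0,
           2.0 * math.sqrt(math.pi) / 3.0,
           math.sqrt(math.pi) / 6.0]

def normal_expectation(f):
    # E[f(Z)] for Z ~ N(0,1): substitute z = sqrt(2)*x so the Gaussian
    # density matches the exp(-x^2) weight, then divide by sqrt(pi).
    s = sum(w * f(math.sqrt(2.0) * x) for x, w in zip(NODES, WEIGHTS))
    return s / math.sqrt(math.pi)

ez2 = normal_expectation(lambda z: z * z)   # exact: E[Z^2] = 1
ez4 = normal_expectation(lambda z: z ** 4)  # exact: E[Z^4] = 3
```

With more nodes the same substitution approximates non-polynomial integrands, which is how the quadrature enters the EM algorithm described above.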

17.
Species range shifts associated with environmental change or biological invasions are increasingly important study areas. However, quantifying range expansion rates may be heavily influenced by methodology and/or sampling bias. We compared expansion rate estimates of Roesel's bush-cricket (Metrioptera roeselii, Hagenbach 1822), a nonnative species currently expanding its range in south-central Sweden, from range statistic models based on distance measures (mean, median, 95th gamma quantile, marginal mean, maximum, and conditional maximum) and an area-based method (grid occupancy). We used sampling simulations to determine the sensitivity of the different methods to incomplete sampling across the species' range. For periods when we had comprehensive survey data, range expansion estimates clustered into two groups: (1) those calculated from range margin statistics (gamma, marginal mean, maximum, and conditional maximum: ~3 km/year), and (2) those calculated from the central tendency (mean and median) and the area-based method of grid occupancy (~1.5 km/year). Range statistic measures differed greatly in their sensitivity to sampling effort; the proportion of sampling required to achieve an estimate within 10% of the true value ranged from 0.17 to 0.9. Grid occupancy and median were most sensitive to sampling effort, and the maximum and gamma quantile the least. If periods with incomplete sampling were included in the range expansion calculations, this generally lowered the estimates (range 16–72%), with the exception of the gamma quantile, which was slightly higher (6%). Care should be taken when interpreting rate expansion estimates from data sampled from only a fraction of the full distribution. Methods based on the central tendency will give rates approximately half those of methods based on the range margin. The gamma quantile method appears to be the most robust to incomplete sampling bias and should be considered the method of choice when sampling the entire distribution is not possible.

18.
The Information Capacity of Nerve Cells Using a Frequency Code
Approximate equations are derived for the amount of information a nerve cell or group of nerve cells can transmit about a stimulus of a given duration using a frequency code (i.e., assuming the mean frequency of nerve impulses measures the intensity of a maintained stimulus). The equations take into account the variability of successive interspike intervals, and any serial correlations between successive intervals, but do not require detailed assumptions about the mechanism of impulse initiation. The errors involved in using these approximations are evaluated for neurons which discharge either completely regularly, completely at random (Poisson process) or show a particular type of intermediate variability (gamma distribution model). The errors become negligibly small as the stimulus duration or the number of functionally similar nerve cells increases. The conditions for applying these equations to experimental data are discussed. The application of these equations should help considerably in eliminating the enormous discrepancies between some earlier estimates for the information processing capabilities of single nerve cells and systems of nerve cells.

19.
Significantly elevated lung cancer deaths and statistically significantly positive linear trends between leukemia mortality and radiation exposure were reported in a previous analysis of Portsmouth Naval Shipyard workers. The purpose of this study was to conduct a modeling-based analysis that incorporates previously unanalyzed confounders in exploring the exposure-response relationship between cumulative external ionizing radiation exposure and mortality from these cancers among radiation-monitored workers in this cohort. The main analyses were carried out with Poisson regression fitted with maximum likelihood in linear excess relative risk models. Sensitivity analyses varying model components and using other regression models were conducted. The positive association between lung cancer risk and ionizing radiation observed previously was no longer present after adjusting for socioeconomic status (smoking surrogate) and welding fume and asbestos exposures. Excesses of leukemia were found to be positively, though not significantly, associated with external ionizing radiation, with or without including potential confounders. The estimated excess relative risk was 10.88% (95% CI -0.90%, 38.77%) per 10 mSv of radiation exposure, which was within the ranges of risk estimates in previous epidemiological studies (-4.1 to 19.0%). These results are limited by many factors and are subject to uncertainties of the exposure and confounder estimates.

20.
P. D. Keightley, Genetics, 1994, 138(4):1315-1322
Parameters of continuous distributions of effects and rates of spontaneous mutation for relative viability in Drosophila are estimated by maximum likelihood from data of two published experiments on accumulation of mutations on protected second chromosomes. A model of equal mutant effects gives a poor fit to the data of the two experiments; higher likelihoods are obtained with leptokurtic distributions or for models in which there is more than one class of mutation effect. Minimum estimates of mutation rates (events per generation) at polygenes affecting viability on chromosome 2 are 0.14 and 0.068, but estimates are strongly confounded with other parameters in the model. Separate information on rates of molecular divergence between Drosophila species and from rates of movement of transposable elements is used to infer the overall genomic mutation rate in Drosophila, and the viability data are analyzed with mutation rate as a known parameter. If, for example, a mutation rate for chromosome 2 of 0.4 is assumed, maximum likelihood estimates of mean mutant effect on relative viability are 0.4% and 1%, but the majority of mutations have very much smaller effects than these values as distributions are highly leptokurtic. The methodology is applied to estimate viability effects of single P element insertional mutations. The mean effect per insertion is found to be higher, and their distribution is found to be less leptokurtic than for spontaneous mutations. The equilibrium genetic variance of viability predicted by a mutation-selection balance model with parameters estimated from the mutation accumulation experiments is similar to laboratory estimates of genetic variance of viability from natural populations of Drosophila.
