Similar articles
 20 similar articles found (search time: 46 ms)
1.
Regressions of biological variables across species are rarely perfect. Usually, there are residual deviations from the estimated model relationship, and such deviations commonly show a pattern of phylogenetic correlations indicating that they have biological causes. We discuss the origins and effects of phylogenetically correlated biological variation in regression studies. In particular, we discuss the interplay of biological deviations with deviations due to observational or measurement errors, which are also important in comparative studies based on estimated species means. We show how bias in estimated evolutionary regressions can arise from several sources, including phylogenetic inertia and either observational or biological error in the predictor variables. We show how all these biases can be estimated and corrected for in the presence of phylogenetic correlations. We present general formulas for incorporating measurement error in linear models with correlated data. We also show how alternative regression models, such as major axis and reduced major axis regression, which are often recommended when there is error in predictor variables, are strongly biased when there is biological variation in any part of the model. We argue that such methods should never be used to estimate evolutionary or allometric regression slopes.

2.
A method for fitting regression models to data that exhibit spatial correlation and heteroskedasticity is proposed. It is well known that ignoring a nonconstant variance does not bias least-squares estimates of regression parameters; thus, data analysts are easily led to the false belief that moderate heteroskedasticity can generally be ignored. Unfortunately, ignoring nonconstant variance when fitting variograms can seriously bias estimated correlation functions. By modeling heteroskedasticity and standardizing by estimated standard deviations, our approach eliminates this bias in the correlations. A combination of parametric and nonparametric regression techniques is used to iteratively estimate the various components of the model. The approach is demonstrated on a large data set of predicted nitrogen runoff from agricultural lands in the Midwest and Northern Plains regions of the U.S.A. For this data set, the model comprises three main components: (1) the mean function, which includes farming practice variables, local soil and climate characteristics, and the nitrogen application treatment, is assumed to be linear in the parameters and is fitted by generalized least squares; (2) the variance function, which contains a local and a spatial component whose shapes are left unspecified, is estimated by local linear regression; and (3) the spatial correlation function is estimated by fitting a parametric variogram model to the standardized residuals, with the standardization adjusting the variogram for the presence of heteroskedasticity. The fitting of these three components is iterated until convergence. The model provides an improved fit to the data compared with a previous model that ignored the heteroskedasticity and the spatial correlation.

3.
In order to decide which is the best growth model for the tambaqui Colossoma macropomum Cuvier, 1818, we utilized 249 and 256 length-at-age ring readings in otoliths and scales, respectively, for the same sample of individuals. The Schnute model was utilized and it is concluded that the Von Bertalanffy model is the most adequate for these data, because it proved highly stable for the data set and only slightly sensitive to the initial values of the estimated parameters. The phi values estimated from five different data sources presented a CV = 4.78%. The numerical discrepancies between these values are of little concern because of the high negative correlation between k and L∞: when one of them increases, the other decreases, and the final result in phi remains nearly unchanged.
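The abstract does not define phi. Assuming it denotes the growth performance index of Pauly and Munro, phi' = log10(k) + 2·log10(L∞), the following minimal sketch illustrates why compensating changes in k and L∞ leave the index nearly constant. The parameter pairs are hypothetical and are not values from the study.

```python
import math

def phi_prime(k, l_inf):
    """Growth performance index, assumed form: log10(k) + 2*log10(L_inf)."""
    return math.log10(k) + 2.0 * math.log10(l_inf)

# Hypothetical von Bertalanffy parameter pairs (k per year, L_inf in cm):
# k decreases while L_inf increases, mimicking their negative correlation.
pairs = [(0.30, 100.0), (0.25, 110.0), (0.20, 122.0)]
values = [phi_prime(k, l_inf) for k, l_inf in pairs]

# Spread of the index across the three pairs.
spread = max(values) - min(values)
for (k, l_inf), value in zip(pairs, values):
    print(f"k={k:.2f}, L_inf={l_inf:.0f}: phi'={value:.3f}")
print(f"spread = {spread:.4f}")
```

Although k changes by a third across these pairs, the index moves only in the third decimal place, which is the behaviour the abstract describes.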

4.
The effect of lignocellulosic hydrolysate of crushed corn cobs on growth and lactic acid formation in a continuous culture of Lactobacillus casei and L. lactis at dilution rates of 0.08–0.3/h was studied. A simple physiological model of the process, which relates bacterial growth and lactic acid formation to the utilization of two different sources of nutrients, was derived from computer-aided analysis of the data. The parameters of the model were estimated by nonlinear regression and used for process simulation and for estimating the maximum productivity of lactic acid.

5.
In population‐based case‐control studies, it is of great public‐health importance to estimate the disease incidence rates associated with different levels of risk factors. This estimation is complicated by the fact that in such studies the selection probabilities for the cases and controls are unequal. A further complication arises when the subjects who are selected into the study do not participate (i.e. become nonrespondents) and nonrespondents differ systematically from respondents. In this paper, we show how to account for unequal selection probabilities as well as differential nonresponses in the incidence estimation. We use two logistic models, one relating the disease incidence rate to the risk factors, and one modelling the predictors that affect the nonresponse probability. After estimating the regression parameters in the nonresponse model, we estimate the regression parameters in the disease incidence model by a weighted estimating function that weights a respondent's contribution to the likelihood score function by the inverse of the product of his/her selection probability and his/her model‐predicted response probability. The resulting estimators of the regression parameters and the corresponding estimators of the incidence rates are shown to be consistent and asymptotically normal with easily estimated variances. Simulation results demonstrate that the asymptotic approximations are adequate for practical use and that failure to adjust for nonresponses could result in severe biases. An illustration with data from a cardiovascular study that motivated this work is presented.

6.
Trend estimates are often used as part of environmental monitoring programs. These trends inform managers (e.g., are desired species increasing or undesired species decreasing?). Data collected from environmental monitoring programs are often aggregated (i.e., averaged), which confounds sampling and process variation. State-space models allow sampling variation and process variation to be separated. We used simulated time series to compare linear trend estimates from three state-space models, a simple linear regression model, and an autoregressive model. We also compared the performance of these five models in estimating trends from a long-term monitoring program. We specifically estimated trends for two species of fish and four species of aquatic vegetation from the Upper Mississippi River system. We found that the simple linear regression had the best performance of all the given models because it was best able to recover parameters and had consistent numerical convergence. Conversely, the simple linear regression did the worst job of estimating populations in a given year. The state-space models did not estimate trends well, but estimated population sizes best when the models converged. We found that a simple linear regression performed better than more complex autoregressive and state-space models when used to analyze aggregated environmental monitoring data.
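As a rough illustration of the simplest of the five approaches, the sketch below fits an ordinary least-squares line to a simulated annual index and reads the slope as the trend. The simulation settings (a linear trend plus independent process and sampling noise) and all names are assumptions made for this example; they are not the simulation design used in the study.

```python
import random

def ols_trend(years, values):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

random.seed(1)
true_slope = 0.5          # assumed true change in the index per year
years = list(range(20))   # 20 years of aggregated (averaged) observations

# Each aggregated value mixes process noise and sampling noise; once the data
# are averaged, the two sources can no longer be told apart.
index = [
    10.0 + true_slope * t + random.gauss(0, 0.5) + random.gauss(0, 0.5)
    for t in years
]

slope, intercept = ols_trend(years, index)
print(f"estimated trend: {slope:.3f} per year (true value {true_slope})")
```

The fitted slope recovers the trend even though the two noise sources are confounded, but the fitted line says nothing about how much of the year-to-year scatter is real population change.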

7.
Atrial fibrillation (AF) is an abnormal heart rhythm characterized by rapid and irregular heartbeat, with or without perceivable symptoms. In clinical practice, the electrocardiogram (ECG) is often used for diagnosis of AF. Since the AF often arrives as recurrent episodes of varying frequency and duration and only the episodes that occur at the time of ECG can be detected, the AF is often underdiagnosed when a limited number of repeated ECGs are used. In studies evaluating the efficacy of AF ablation surgery, each patient undergoes multiple ECGs and the AF status at the time of ECG is recorded. The objective of this paper is to estimate the marginal proportions of patients with or without AF in a population, which are important measures of the efficacy of the treatment. The underdiagnosis problem is addressed by a three‐class mixture regression model in which a patient's probability of having no AF, paroxysmal AF, and permanent AF is modeled by auxiliary baseline covariates in a nested logistic regression. A binomial regression model is specified conditional on a subject being in the paroxysmal AF group. The model parameters are estimated by the Expectation‐Maximization (EM) algorithm. These parameters are themselves nuisance parameters for the purpose of this research, but the estimators of the marginal proportions of interest can be expressed as functions of the data and these nuisance parameters and their variances can be estimated by the sandwich method. We examine the performance of the proposed methodology in simulations and two real data applications.

8.
The biokinetic parameters for autotrophic systems are difficult to obtain and are often mistakenly determined because the size of the autotrophic population in mixed (i.e., heterotrophic and autotrophic) cultures cannot be accurately estimated. This article presents a systematic approach, combining bioenergetic calculations and experimental data, to obtain values of the biokinetic parameters pertinent to the aerobic, autotrophic biodegradation of thiocyanate. Nonlinear regression techniques were employed using both initial thiocyanate utilization rate data and single thiocyanate depletion curves. Both types of data were necessary to overcome the problems arising from the linear nature of the substrate depletion curves and the high correlation of the biokinetic model parameters inherent in nonlinear regression analysis. The aerobic biodegradation of thiocyanate followed a substrate inhibition pattern that was successfully described by the Haldane-Andrews model. Although regression analysis did not yield unique biokinetic parameter estimates, the following parameter value ranges were obtained: maximum specific substrate utilization rate (k), 0.26 to 0.44 mg SCN⁻/(mg biomass·h); half-saturation coefficient (Ks), 2.3 to 7.1 mg SCN⁻/L; and inhibition coefficient (Ki), 28 to 109 mg SCN⁻/L. Based on the estimated biokinetic parameter values, a design and operation diagram was constructed that depicts the steady-state thiocyanate concentration as a function of solids retention time for a completely mixed, continuous-flow reactor.
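For orientation, the Haldane-Andrews substrate-inhibition rate is commonly written as q = k·S / (Ks + S + S²/Ki). The sketch below evaluates that form with illustrative parameter values picked from inside the ranges reported in the abstract; the specific values and the function name are assumptions for this example, not estimates from the article.

```python
import math

def haldane_rate(s, k, ks, ki):
    """Specific utilization rate under the Haldane-Andrews form.

    s  : substrate concentration (mg SCN-/L)
    k  : maximum specific utilization rate (mg SCN-/(mg biomass h))
    ks : half-saturation coefficient (mg SCN-/L)
    ki : inhibition coefficient (mg SCN-/L)
    """
    return k * s / (ks + s + s * s / ki)

# Illustrative values chosen within the reported ranges (not fitted values).
k, ks, ki = 0.35, 4.0, 60.0

# The rate peaks at S* = sqrt(Ks * Ki) and declines at higher concentrations.
s_peak = math.sqrt(ks * ki)
q_peak = haldane_rate(s_peak, k, ks, ki)

for s in (1.0, 5.0, s_peak, 50.0, 200.0):
    print(f"S = {s:7.2f} mg/L -> q = {haldane_rate(s, k, ks, ki):.4f}")
```

Because inhibition keeps the observed rate below k at every concentration, depletion data alone constrain k, Ks and Ki only weakly, which is consistent with the non-unique estimates the abstract reports.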

9.
In this paper a generalization of the Poisson regression model indexed by a shape parameter is proposed for the analysis of life table and follow-up data with concomitant variables. The model is suitable for analysis of extra-Poisson variation data. The model is used to fit the survival data given in Holford (1980). The model parameters, the hazard and survival functions are estimated by the method of maximum likelihood. The results obtained from this study seem to be comparable to those obtained by Chen (1988). Approximate tests of the dispersion and goodness-of-fit of the data to the model are also discussed.

10.
A nonlinear regression technique for estimating the Monod parameters describing biodegradation kinetics is presented and analyzed. Two model data sets were taken from a study of aerobic biodegradation of the polycyclic aromatic hydrocarbons (PAHs) naphthalene and 2-methylnaphthalene as the growth-limiting substrates, where substrate and biomass concentrations were measured with time. For each PAH, the parameters estimated were: q(max), the maximum substrate utilization rate per unit biomass; K(S), the half-saturation coefficient; and Y, the stoichiometric yield coefficient. Estimating parameters when measurements have been made for two variables with different error structures requires a technique more rigorous than least squares regression. An optimization function is derived from the maximum-likelihood equation assuming an unknown, nondiagonal covariance matrix for the measured variables. Because the derivation is based on an assumption of normally distributed errors in the observations, the error structures of the regression variables were examined. Through residual analysis, the errors in the substrate concentration data were found to be distributed log-normally, demonstrating a need for log transformation of this variable. The covariance between ln C and X was found to be small but significantly nonzero at the 67% confidence level for NPH and at the 94% confidence level for 2MN. The nonlinear parameter estimation yielded unique values for q(max), K(S), and Y for naphthalene. Thus, despite the low concentrations of this sparingly soluble compound, the data contained sufficient information for parameter estimation. For 2-methylnaphthalene, the values of q(max) and K(S) could not be estimated uniquely; however, q(max)/K(S) was estimated. To assess the value of including the relatively imprecise biomass concentration data, the results from the bivariate method were compared with a univariate method using only the substrate concentration data. The results demonstrated that the bivariate data yielded a better confidence in the estimates and provided additional information about the model fit and model adequacy. The combination of the value of the bivariate data set and their nonzero covariance justifies the need for maximum likelihood estimation over the simpler nonlinear least squares regression.

11.
The Poisson regression model for the analysis of life table and follow-up data with covariates is presented. An example is presented to show how this technique can be used to construct a parsimonious model which describes a set of survival data. All parameters in the model, the hazard and survival functions are estimated by maximum likelihood.

12.
A mixture Markov regression model is proposed to analyze heterogeneous time series data. Mixture quasi‐likelihood is formulated to model time series with mixture components and exogenous variables. The parameters are estimated by quasi‐likelihood estimating equations. A modified EM algorithm is developed for the mixture time series model. The model and proposed algorithm are tested on simulated data and applied to mosquito surveillance data in Peel Region, Canada.

13.
Testimation is considered in the problem of estimating regression parameters. The first-stage sample is used to test a (null) hypothesis that specifies initial (preassumed) values for some of the regression parameters. If the data agree with the hypothesis, a linear combination of the preassumed values and the ordinary least squares (OLS) estimates is taken as the estimate. Otherwise, a second sample is taken and the parameters are estimated by OLS alone, based on the combined sample. The procedure protects against type II error and against taking larger samples when inference can be made from a smaller sample.

14.
Five parameters of one of the most common neuronal models, the diffusion leaky integrate-and-fire model, also known as the Ornstein-Uhlenbeck neuronal model, were estimated on the basis of intracellular recording. These parameters can be classified into two categories. Three of them (the membrane time constant, the resting potential and the firing threshold) characterize the neuron itself. The remaining two characterize the neuronal input. The intracellular data were collected during spontaneous firing, which in this case is characterized by a Poisson process of interspike intervals. Two methods for the estimation were applied, the regression method and the maximum-likelihood method. Both methods permit estimation of the input parameters and the membrane time constant in a short time window (a single interspike interval). We found that, at least in our example, the regression method gave more consistent results than the maximum-likelihood method. The estimates of the input parameters show asymptotic normality, which can be further used for statistical testing, under the condition that the data are collected in different experimental situations. The model neuron, as deduced from the determined parameters, works in a subthreshold regime. This result was confirmed by both applied methods. The subthreshold regime for this model is characterized by Poissonian firing. This is in complete agreement with the observed interspike interval data. Action Editor: Nicolas Brunel

15.
16.
Development and application of photogrammetric mass-estimation techniques in marine mammal studies is becoming increasingly common. When a photogrammetrically estimated mass is used as a covariate in regression modeling, the error associated with estimating mass induces bias in regression statistics and decreases model explanatory power. Thus, it is important to understand and account for prediction variance when addressing ecological questions that require use of estimated mass values. In a simulation study based on data collected from Weddell seals, we developed regression models of pup weaning mass as a function of maternal postparturition mass where maternal mass was directly measured and second where maternal mass was photogrammetrically estimated. We demonstrate that when estimated mass was used, the regression coefficient was biased toward zero and the coefficient of determination was 30% less than the value obtained when using maternal postparturition mass obtained from direct measurement. After applying bias correction procedures, however, the regression coefficient and coefficient of determination were within 2% of their true values. To effectively use photogrammetrically estimated masses, prediction variance should be understood and accounted for in all analyses. The methods presented in this paper are effective and simple techniques to explore and account for prediction variance.
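The bias toward zero described here can be reproduced in a few lines. The sketch below simulates a covariate measured with error and applies the classical method-of-moments correction, in which the naive slope is divided by the reliability ratio var(true) / (var(true) + var(error)). The abstract does not say which correction procedures the authors used, so this correction, the simulated values, and all variable names are assumptions for illustration only.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

random.seed(42)
n = 5000
true_slope = 2.0
sd_true, sd_error = 1.0, 0.5   # spread of true mass and of estimation error

# Centred, arbitrary units: true covariate, error-prone estimate, response.
true_mass = [random.gauss(0, sd_true) for _ in range(n)]
estimated_mass = [m + random.gauss(0, sd_error) for m in true_mass]
pup_mass = [true_slope * m + random.gauss(0, 1.0) for m in true_mass]

# Regressing on the error-prone covariate attenuates the slope.
naive = ols_slope(estimated_mass, pup_mass)

# Reliability ratio, treated here as known; in practice it must be estimated.
reliability = sd_true ** 2 / (sd_true ** 2 + sd_error ** 2)
corrected = naive / reliability

print(f"naive slope     = {naive:.3f}")
print(f"corrected slope = {corrected:.3f} (true value {true_slope})")
```

With these settings the reliability ratio is 0.8, so the naive slope settles near 1.6 rather than 2.0, and dividing by the ratio restores it.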

17.
Wunder MB, Kester CL, Knopf FL, Rye RO. Oecologia 2005, 144(4): 607-617
We used feathers of known origin collected from across the breeding range of a migratory shorebird to test the use of isotope tracers for assigning breeding origins. We analyzed δD, δ13C, and δ15N in feathers from 75 mountain plover (Charadrius montanus) chicks sampled in 2001 and from 119 chicks sampled in 2002. We estimated parameters for continuous-response inverse regression models and for discrete-response Bayesian probability models from data for each year independently. We evaluated model predictions with both the training data and by using the alternate year as an independent test dataset. Our results provide weak support for modeling latitude and isotope values as monotonic functions of one another, especially when data are pooled over known sources of variation such as sample year or location. We were unable to make even qualitative statements, such as north versus south, about the likely origin of birds using both δD and δ13C in inverse regression models; results were no better than random assignment. Probability models provided better results and a more natural framework for the problem. Correct assignment rates were highest when considering all three isotopes in the probability framework, but the use of even a single isotope was better than random assignment. The method appears relatively robust to temporal effects and is most sensitive to the isotope discrimination gradients over which samples are taken. We offer that the problem of using isotope tracers to infer geographic origin is best framed as one of assignment, rather than prediction.

18.
The traditional method for determining compartmental analysis parameters relies on a visual selection of data points to be used for regression of data from each cellular compartment. This method is appropriate when the compartments are kinetically discrete and are easily discernible. However, where treatment effects on compartment parameters are being evaluated, a more objective method for determining initial parameters is desirable.

Three methods were examined for determining initial isotopic contents and half-times of 86Rb elution from cellular compartments using theoretical data with known parameters. Experimental data from roots of intact Douglas fir (Pseudotsuga menziesii [Mirb.] Franco) and barley (Hordeum vulgare L.) seedlings were also used. The three methods were: a visually assisted linear regression on a semilog plot of isotope elution versus time; a microcomputer-assisted linear regression on the semilog plot, in which maximization of the square of the correlation coefficient (r²) was the criterion for choosing the data points for each regression; and a mainframe computer-assisted, direct nonlinear regression on the elution data using a model of the sum of three exponential decay functions. The visual method resulted in the least accurate estimates of compartmental analysis parameters. The microcomputer-assisted and nonlinear regression methods calculated the parameters equally well.
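The semilog step shared by the first two methods can be sketched as follows: at late times only the slowest compartment still contributes, so a straight line fitted to ln(content) versus time gives that compartment's initial content (from the intercept) and half-time (from the slope). The amplitudes, rate constants and the 300-minute cut-off below are hypothetical and noise-free, chosen only to show the mechanics; they are not the theoretical data used in the article.

```python
import math

def semilog_fit(times, counts):
    """Fit ln(count) = ln(content) - rate * t; return (content, half_time)."""
    logs = [math.log(c) for c in counts]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    slope = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs)) / sum(
        (t - mean_t) ** 2 for t in times
    )
    intercept = mean_l - slope * mean_t
    return math.exp(intercept), math.log(2) / -slope

# Hypothetical three-compartment elution: initial contents and rate constants
# (per minute), ordered from the fastest to the slowest compartment.
amplitudes = [1000.0, 500.0, 200.0]
rates = [1.0, 0.1, 0.01]

times = list(range(0, 601, 10))
counts = [
    sum(a * math.exp(-r * t) for a, r in zip(amplitudes, rates)) for t in times
]

# Use only the tail, where the two faster compartments have emptied.
tail = [(t, c) for t, c in zip(times, counts) if t >= 300]
content, half_time = semilog_fit([t for t, _ in tail], [c for _, c in tail])

print(f"slowest compartment: content = {content:.1f}, half-time = {half_time:.1f} min")
```

In the full procedure the fitted line would then be subtracted from the earlier points and the fit repeated for the next compartment. The choice of which points belong to the tail is exactly the step that the visual and r²-based methods make differently.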


19.
Random regression models are widely used in the field of animal breeding for the genetic evaluation of daily milk yields from different test days. These models are capable of handling different environmental effects on the respective test day, and they describe the characteristics of the course of the lactation period by using suitable covariates with fixed and random regression coefficients. As the numerically expensive estimation of parameters is already part of advanced computer software, modifications of random regression models will considerably grow in importance for statistical evaluations of nutrition and behaviour experiments with animals. Random regression models belong to the large class of linear mixed models. Thus, when choosing a model, or more precisely, when selecting a suitable covariance structure of the random effects, the information criteria of Akaike and Schwarz can be used. In this study, the fitting of random regression models for a statistical analysis of a feeding experiment with dairy cows is illustrated using the SAS program package. For each of the feeding groups, lactation curves modelled by covariates with fixed regression coefficients are estimated simultaneously. With the help of the fixed regression coefficients, differences between the groups are estimated and then tested for significance. The covariance structure of the random and subject-specific effects and the serial correlation matrix are selected by using information criteria and by estimating correlations between repeated measurements. For the verification of the selected model and the alternative models, mean values and standard deviations estimated with ordinary least squares residuals are used.

20.
Gray RJ. Biometrics 2000, 56(2): 571-576
An estimator of the regression parameters in a semiparametric transformed linear survival model is examined. This estimator consists of a single Newton-like update of the solution to a rank-based estimating equation from an initial consistent estimator. An automated penalized likelihood algorithm is proposed for estimating the optimal weight function for the estimating equations and the error hazard function that is needed in the variance estimator. In simulations, the estimated optimal weights are found to give reasonably efficient estimators of the regression parameters, and the variance estimators are found to perform well. The methodology is applied to an analysis of prognostic factors in non-Hodgkin's lymphoma.
