Similar literature
 20 similar articles found (search time: 8 ms)
1.
2.
Summary: Various studies have estimated covariance components as half the difference between the variance component of the sum of the variable values, for each observation, and the sum of the corresponding variable variance components. Although the variance components for the separate variables can be computed using all available data, the variance components of the sum can be computed only from those observations with records for both variables. Previous studies have suggested eliminating observations with missing data, because of possible selection bias. The effect of missing data on estimates of covariance components and genetic correlations was tested on sample beef cattle data and simulated data by randomly deleting differing proportions of records of one variable for each pair of variables analyzed. Estimates of genetic correlations computed after eliminating observations with missing data were more accurate than estimates computed using all available data. Furthermore, when observations with missing data were included, estimates of genetic correlation far outside the parameter space were common. Therefore, this method should be used only if observations with missing data have been eliminated.
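The estimator described in this abstract rests on the identity cov(x, y) = ½[var(x + y) − var(x) − var(y)]. The sketch below is a simplified illustration only: it uses ordinary sample variances rather than the genetic variance components of the study, and the function name `cov_from_sum` and the toy records are hypothetical. It contrasts the complete-case calculation with one that uses all available records for the separate variances.

```python
from math import sqrt
from statistics import variance


def cov_from_sum(x, y, complete_cases_only):
    """Covariance as half of [var(x + y) - var(x) - var(y)].

    Missing records are coded as None. var(x + y) can only use paired
    records; var(x) and var(y) use either the same paired records or
    every available record, depending on the flag.
    """
    paired = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    sums = [a + b for a, b in paired]
    if complete_cases_only:
        xs = [a for a, _ in paired]
        ys = [b for _, b in paired]
    else:
        xs = [a for a in x if a is not None]
        ys = [b for b in y if b is not None]
    var_x, var_y = variance(xs), variance(ys)
    cov = 0.5 * (variance(sums) - var_x - var_y)
    return cov, cov / sqrt(var_x * var_y)


# Hypothetical toy records: y is missing for the last five observations.
x = [1.0, 2.0, 3.0, 2.0, 2.0, 2.0, 2.0, 2.0]
y = [2.0, 4.0, 6.0, None, None, None, None, None]

cov_cc, r_cc = cov_from_sum(x, y, complete_cases_only=True)
cov_all, r_all = cov_from_sum(x, y, complete_cases_only=False)
print(f"complete cases: cov={cov_cc:.3f}, r={r_cc:.3f}")
print(f"all available:  cov={cov_all:.3f}, r={r_all:.3f}")
```

With these toy records the complete-case correlation stays inside [-1, 1], while mixing paired and unpaired records pushes the implied correlation above 1, which mirrors the out-of-range estimates the abstract warns about.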

3.
Efficiency of regression estimates for clustered data   Cited by: 1 (self-citations: 0; cited by others: 1)
Mancl LA, Leroux BG. Biometrics, 1996, 52(2): 500-511
Statistical methods for clustered data, such as generalized estimating equations (GEE) and generalized least squares (GLS), require selecting a correlation or covariance structure to specify the dependence between observations within a cluster. Valid regression estimates can be obtained that do not depend on correct specification of the true correlation, but inappropriate specifications can result in a loss of efficiency. We derive general expressions for the asymptotic relative efficiency of GEE and GLS estimators under nested correlation structures. Efficiency is shown to depend on the covariate distribution, the cluster sizes, the response variable correlation, and the regression parameters. The results demonstrate that efficiency is quite sensitive to the between- and within-cluster variation of the covariates, and provide useful characterizations of models for which upper and lower efficiency bounds are attained. Efficiency losses for simple working correlation matrices, such as independence, can be large even for small to moderate correlations and cluster sizes.
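A much-reduced special case can make the dependence on covariate variation concrete. The sketch below is not the paper's general expression: it assumes a linear model with a single covariate and no intercept, a true exchangeable correlation rho (0 ≤ rho < 1) within each cluster, and compares the variance of the independence (ordinary least squares) slope estimator with the GLS estimator that uses the true correlation. The function names and covariate values are hypothetical.

```python
def quadratic_forms(x, rho):
    """Return x'x, x'Vx and x'V^{-1}x for exchangeable V = (1-rho)I + rho*J."""
    n = len(x)
    sum_sq = sum(v * v for v in x)
    total = sum(x)
    xtx = sum_sq
    xvx = (1.0 - rho) * sum_sq + rho * total ** 2
    # Closed-form inverse: V^{-1} = [I - rho/(1+(n-1)rho) J] / (1-rho)
    xvinvx = (sum_sq - rho * total ** 2 / (1.0 + (n - 1) * rho)) / (1.0 - rho)
    return xtx, xvx, xvinvx


def relative_efficiency(clusters, rho):
    """Var(GLS slope) / Var(independence slope); 1.0 means no efficiency loss."""
    forms = [quadratic_forms(x, rho) for x in clusters]
    sum_xtx = sum(f[0] for f in forms)
    sum_xvx = sum(f[1] for f in forms)
    sum_xvinvx = sum(f[2] for f in forms)
    var_independence = sum_xvx / sum_xtx ** 2  # sandwich variance (up to sigma^2)
    var_gls = 1.0 / sum_xvinvx                 # GLS variance (up to sigma^2)
    return var_gls / var_independence


# Hypothetical covariate layouts, one inner list per cluster.
between_only = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]    # constant within cluster
within_only = [[-1.0, 1.0], [-2.0, 2.0], [-0.5, 0.5]]  # sums to zero per cluster
mixed = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]           # both kinds of variation

for name, layout in [("between", between_only), ("within", within_only), ("mixed", mixed)]:
    print(name, round(relative_efficiency(layout, rho=0.5), 4))
```

In this restricted setting the independence estimator loses nothing when the covariate varies only between clusters or only within clusters, and loses efficiency when both kinds of variation are present.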

4.
5.
The availability of brewery pale malt as a substrate for gibberellin bioassay was investigated. GA3 at concentrations of 0.001 to 1 µg/ml caused an increase in α-amylase activity in pale malt under aerobic incubation, while no increase was observed under anaerobic conditions. Pale malt heated at 130°C for 2 hr showed no increase in α-amylase activity in the presence of GA3. Although the mechanism for the enhancement of α-amylase activity in pale malt by GA3 is not clear, it is evident that this phenomenon can be used in the bioassay of gibberellins. Experimental conditions for the bioassay using pale malt are described. With this method, the enhancement of α-amylase activity by different gibberellins was: GA3 > GA4 > GA20 (inactive). (Received October 16, 1975)

6.

Sex‐related Cognitive Differences: By Julia A. Sherman. Charles C Thomas, Springfield, Illinois, 1978. 262 pp. Cloth, $16.25; paper, $12.25.

Kinometrics: Determinants of Socioeconomic Success Within and Between Families. Edited by Paul Taubman. North‐Holland Publishing Company, Amsterdam, 1977. $28.25

Nutrition and Human Reproduction: Edited by W. Henry Mosley. Plenum Publishing Corporation, New York, 1978. 526 pp. $42.50.

Manual of the International Statistical Classification of Diseases, Injuries, and Causes of Death, Volume 1: World Health Organization, 1977. 777 pp. $12.00 (hardbound). (Also to be published in French, Spanish, and Russian.)

Women in Jamaica: Patterns of Reproduction and Family: By George W. Roberts and Sonja A. Sinclair. KTO Press, New York, 1978. $16.00.

7.
A note on lifetime regression models   Cited by: 3 (self-citations: 0; cited by others: 3)
Lawless JF. Biometrika, 1986, 73(2): 509-512

8.
The concentrations of soluble biologically available nitrogen and phosphorus were determined using Selenastrum capricornutum bioassays and compared with analytically measured soluble nitrate (NO3-N) and soluble reactive phosphorus concentrations during enrichment studies in a South African impoundment. The NO3-N analyses consistently underestimated the soluble biologically available nitrogen and the extent of the discrepancy decreased with increasing NO3-N concentration. Biological availability of soluble organic nitrogen during the bioassays is suggested as a reason for the discrepancies. At low soluble reactive phosphorus concentrations the analytical measurements underestimated the soluble biologically available phosphorus while at high soluble reactive phosphorus concentrations the analytical measurements were considerable overestimates of soluble biologically available phosphorus. Possible reasons for the observed trend are discussed.

9.

Background

Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods.

Methods

In an attempt to alleviate potential discrepancies between assumptions of linear models and multi-population data, two types of alternative models were used: (1) a multi-trait genomic best linear unbiased prediction (GBLUP) model that modelled trait by line combinations as separate but correlated traits and (2) non-linear models based on kernel learning. These models were compared to conventional linear models for genomic prediction for two lines of brown layer hens (B1 and B2) and one line of white hens (W1). The three lines each had 1004 to 1023 training and 238 to 240 validation animals. Prediction accuracy was evaluated by estimating the correlation between observed phenotypes and predicted breeding values.

Results

When the training dataset included only data from the evaluated line, non-linear models yielded, at best, accuracy similar to that of linear models. In some cases, when adding a distantly related line, the linear models showed a slight decrease in performance, while non-linear models generally showed no change in accuracy. When only information from a closely related line was used for training, linear models and non-linear radial basis function (RBF) kernel models performed similarly. The multi-trait GBLUP model took advantage of the estimated genetic correlations between the lines. Combining linear and non-linear models improved the accuracy of multi-line genomic prediction.

Conclusions

Linear models and non-linear RBF models performed very similarly for genomic prediction, despite the expectation that non-linear models could deal better with the heterogeneous multi-population data. This heterogeneity of the data can be overcome by modelling trait by line combinations as separate but correlated traits, which avoids the occasional occurrence of large negative accuracies when the evaluated line was not included in the training dataset. Furthermore, when using a multi-line training dataset, non-linear models provided information on the genotype data that was complementary to the linear models, which indicates that the underlying data distributions of the three studied lines were indeed heterogeneous.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-014-0075-3) contains supplementary material, which is available to authorized users.
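The Methods of the abstract above evaluate prediction accuracy as the correlation between observed phenotypes and predicted values, and mention non-linear models based on RBF kernels. The sketch below shows one generic way to combine the two ideas: kernel ridge regression with a Gaussian (RBF) kernel, scored by Pearson correlation on validation animals. It is an assumption-laden illustration, not the models fitted in the study; the genotype codes, phenotypes, bandwidth, ridge value and all function names are hypothetical.

```python
import math


def rbf_kernel(a, b, bandwidth):
    """Gaussian (RBF) similarity between two genotype vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def kernel_ridge_predict(train_x, train_y, test_x, bandwidth, ridge):
    """Fit (K + ridge*I) alpha = y - mean(y), then predict for test_x."""
    mean_y = sum(train_y) / len(train_y)
    gram = [[rbf_kernel(a, b, bandwidth) + (ridge if i == j else 0.0)
             for j, b in enumerate(train_x)]
            for i, a in enumerate(train_x)]
    alpha = solve(gram, [y - mean_y for y in train_y])
    return [mean_y + sum(w * rbf_kernel(x, a, bandwidth)
                         for w, a in zip(alpha, train_x))
            for x in test_x]


def pearson(u, v):
    """Pearson correlation, used here as the prediction accuracy."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    sd_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (sd_u * sd_v)


# Hypothetical toy data: genotypes coded 0/1/2 at four markers.
train_x = [[0, 1, 2, 0], [1, 1, 0, 2], [2, 0, 1, 1], [0, 2, 2, 1], [1, 0, 0, 0]]
train_y = [1.2, 0.4, 0.9, 1.5, 0.1]
valid_x = [[0, 1, 2, 1], [1, 0, 0, 1], [2, 0, 1, 0]]
valid_y = [1.3, 0.3, 0.8]

predicted = kernel_ridge_predict(train_x, train_y, valid_x, bandwidth=1.0, ridge=1.0)
accuracy = pearson(valid_y, predicted)
print(f"validation accuracy (Pearson r): {accuracy:.3f}")
```

Swapping `rbf_kernel` for a linear kernel (a plain dot product of genotype vectors) would give a linear counterpart for comparison, in the spirit of the linear versus non-linear contrast in the abstract.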

10.
Recently, regression analysis of the cumulative incidence function has gained interest in competing risks data analysis, through the model proposed by Fine and Gray (JASA 1999; 94: 496-509). In this note, we point out that inclusion of time-dependent covariates in this model can lead to serious bias. We illustrate the problems arising in such a context, using bone marrow transplant data as a working example and numerical simulations. Practical advice is given to prevent misuse of this model.

11.
12.
Tree-based models are a popular tool for predicting a response given a set of explanatory variables when the regression function is characterized by a certain degree of complexity. Sometimes, they are also used to identify important variables and for variable selection. We show that if the generating model contains chains of direct and indirect effects, then the typical variable importance measures suggest selecting as important mainly the background variables, which have a strong indirect effect, disregarding the variables that directly influence the response. This is attributable mainly to the variable choice in the first steps of the algorithm selecting the splitting variable and to the greedy nature of such a search. This pitfall could be relevant when using tree-based algorithms for understanding the underlying generating process, for population segmentation and for causal inference.

13.
A note on shrinkage sliced inverse regression   Cited by: 3 (self-citations: 0; cited by others: 3)

14.
15.
16.
Simulated experimental data were generated from error-free data following the equation $y = A - Be^{-kt}$, where A, B, and k are constants, and were analyzed by iterative nonlinear regression using one of two basic published computer programs. The effect of the simulated experimental error in y on the precision of the computed constants A, B, and k was evaluated. The errors were either independent of y (simple errors) or proportional to y (relative errors) and outliers were sometimes introduced. Other factors investigated were the number of data points per regression, the range of values of y, and the effect of weighting the data. The results show that the errors in the computed constants, and particularly the rate constant k, may be considerably magnified with respect to the errors in the experimental data. The quantitative relationships that are presented are useful aids in the design of biochemical experiments in which the above equation is applicable.
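The kind of simulation described in this abstract can be sketched as follows, assuming the model has the form y = A − B·exp(−k·t) with t as the independent variable. The fitting routine is not one of the published programs mentioned in the abstract: it swaps in a simple profile approach, in which A and B are obtained by linear least squares for each trial k and k is located by a coarse grid followed by a golden-section refinement. All names, parameter values and the 1% alternating relative error are hypothetical.

```python
import math


def fit_linear_part(t, y, k):
    """For a fixed k the model y = A - B*exp(-k*t) is linear in A and B."""
    u = [math.exp(-k * ti) for ti in t]
    n = len(t)
    u_mean = sum(u) / n
    y_mean = sum(y) / n
    s_uu = sum((ui - u_mean) ** 2 for ui in u)
    s_uy = sum((ui - u_mean) * (yi - y_mean) for ui, yi in zip(u, y))
    slope = s_uy / s_uu
    a_hat = y_mean - slope * u_mean
    b_hat = -slope
    sse = sum((yi - (a_hat - b_hat * ui)) ** 2 for ui, yi in zip(u, y))
    return a_hat, b_hat, sse


def fit_exponential(t, y, k_min=0.01, k_max=5.0, steps=500, iterations=60):
    """Estimate A, B and k by minimizing the profiled sum of squares over k."""
    # Coarse grid over k to bracket the best value.
    grid = [k_min + i * (k_max - k_min) / steps for i in range(steps + 1)]
    best = min(range(len(grid)), key=lambda i: fit_linear_part(t, y, grid[i])[2])
    lo = grid[max(best - 1, 0)]
    hi = grid[min(best + 1, steps)]
    # Golden-section refinement inside the bracket.
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        if fit_linear_part(t, y, c)[2] < fit_linear_part(t, y, d)[2]:
            hi = d
        else:
            lo = c
    k_hat = (lo + hi) / 2.0
    a_hat, b_hat, _ = fit_linear_part(t, y, k_hat)
    return a_hat, b_hat, k_hat


# Error-free data from hypothetical constants A=10, B=8, k=0.3.
t = [float(i) for i in range(11)]
y_exact = [10.0 - 8.0 * math.exp(-0.3 * ti) for ti in t]
a_fit, b_fit, k_fit = fit_exponential(t, y_exact)

# The same data with a deterministic +/-1% relative error, for comparison.
y_noisy = [yi * (1.0 + 0.01 * (-1) ** i) for i, yi in enumerate(y_exact)]
a_noisy, b_noisy, k_noisy = fit_exponential(t, y_noisy)

print(f"exact data: A={a_fit:.4f}, B={b_fit:.4f}, k={k_fit:.4f}")
print(f"noisy data: A={a_noisy:.4f}, B={b_noisy:.4f}, k={k_noisy:.4f}")
```

Comparing the relative change in the fitted k with the 1% perturbation in y gives a rough feel for the error magnification the abstract reports; a proper study would repeat this over many random error draws.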

17.
18.
19.
Isofemale lines are commonly used in Drosophila and other genera for the purpose of assaying genetic variation. Isofemale lines can be kept in the laboratory for many generations before genetic work is carried out, and permit the confirmation of newly discovered alleles. A problem not realized by many workers is that the commonly used estimate of allele frequency from these lines is biased. This estimation bias occurs at all times after the first laboratory generation, regardless of whether single individuals or pooled samples are used in each well of an electrophoretic gel. This bias can potentially affect the estimation of population genetic parameters, and in the case of rare allele analysis it can cause gross overestimates of gene flow. This paper provides a correction for allele frequency estimates derived from isofemale lines for any time after the lines are established in the laboratory. When pooled samples are used, this estimator performs better than the standard estimator at all times after the first generation. The estimator is also insensitive to multiple inseminations. After the lines have drifted one N_e generations, multiple inseminations actually make the new estimator perform better than it does in singly inseminated females. Simulations show that estimates made using either estimator after the lines have drifted to fixation have a much greater error than estimates made earlier using the correction. In general it is better to use corrected estimates of gene frequency soon after lines are established than to use uncorrected estimates made after the first laboratory generation. This work was supported by an NSERC fellowship to A.D.L.

20.
Summary: In Li and Yin (2008, Biometrics 64, 124–131), a ridge SIR estimator is introduced as the solution of a minimization problem and computed by means of an alternating least-squares algorithm. This methodology shows good performance in practice. In this note, we focus on the theoretical properties of the estimator. It is shown that the minimization problem is degenerate in the sense that only two situations can occur: either the ridge SIR estimator does not exist or it is zero.
