Imputation of missing ages in pedigree data |
| |
Authors: | Balise Raymond R; Chen Yu; Dite Gillian; Felberg Anna; Sun Limei; Ziogas Argyrios; Whittemore Alice S |
| |
Affiliation: | Department of Health Research and Policy, Stanford University, Stanford, CA, USA. balise@stanford.edu |
| |
Abstract: | BACKGROUND: In human pedigree data, age at disease occurrence is frequently missing and is imputed using various methods. However, little is known about the performance of these methods when applied to families. In particular, there is little information about the level of agreement between imputed and actual values of temporal data, or about the effect of imputation on inferences. METHODS: We performed two evaluations of five imputation methods used to generate complete data for repositories to be shared by many investigators. Two of the methods are mean substitution methods, two are regression methods, and one is a multiple imputation method based on one of the regression methods. To evaluate the methods, we randomly deleted the years of disease diagnosis of some men in a sample of pedigrees ascertained as part of a prostate cancer study. In the first evaluation, we used the five methods to impute the missing diagnosis years and evaluated agreement between imputed and actual values. In the second evaluation, we assessed agreement between regression coefficients estimated using imputed diagnosis years and those estimated using the actual years. RESULTS/CONCLUSIONS: For both evaluations, we found optimal or near-optimal performance from a regression method that imputes a man's diagnosis year based on the year of birth and year of last observation of all affected men with complete data. The multiple imputation analogue of this method also performed well. |
| |
Keywords: | |
This article is indexed in PubMed and other databases. |
|
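The regression method favored in the abstract can be illustrated with a minimal sketch. It assumes ordinary least squares of diagnosis year on year of birth and year of last observation, fitted on affected men with complete data. The abstract does not give the exact model specification, so the function name `impute_diagnosis_year`, the use of NaN to mark missing values, and the rounding and clamping step are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def impute_diagnosis_year(birth_year, last_obs_year, diagnosis_year):
    """Fill missing diagnosis years (NaN) by regression imputation.

    Hypothetical helper: fits diagnosis_year ~ birth_year + last_obs_year
    by least squares on the complete cases, then predicts the missing ones.
    """
    birth = np.asarray(birth_year, dtype=float)
    last = np.asarray(last_obs_year, dtype=float)
    diag = np.asarray(diagnosis_year, dtype=float)
    missing = np.isnan(diag)

    # Design matrix: intercept, year of birth, year of last observation.
    X = np.column_stack([np.ones_like(birth), birth, last])

    # Fit on affected men whose diagnosis year is known (complete cases).
    coef, *_ = np.linalg.lstsq(X[~missing], diag[~missing], rcond=None)

    imputed = diag.copy()
    predicted = X[missing] @ coef
    # Assumption (not from the abstract): round to a whole year and keep the
    # imputed value between the birth year and the year of last observation.
    imputed[missing] = np.clip(np.round(predicted), birth[missing], last[missing])
    return imputed


# Made-up example data; the last man's diagnosis year is missing.
birth = [1920, 1930, 1925, 1940, 1935, 1915]
last_obs = [1996, 1998, 1991, 2000, 1997, 1985]
diagnosis = [1958, 1964, 1958, 1970, 1966, np.nan]
completed = impute_diagnosis_year(birth, last_obs, diagnosis)
```

This is single imputation. The multiple imputation analogue mentioned in the abstract would draw several plausible values per missing year rather than one prediction, and is not shown here.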