Development of predictive models to identify advanced-stage cancer patients in a US healthcare claims database
Institution: 1. HealthCore, Inc., Wilmington, DE, United States; 2. Boston University, Boston, MA, United States; 3. Pfizer, Inc., Collegeville, PA, United States; 4. Merck KGaA, Darmstadt, Germany
Abstract:
Background: Although healthcare databases are a valuable source for real-world oncology data, cancer stage is often lacking. We developed predictive models using claims data to identify metastatic/advanced-stage patients with ovarian cancer, urothelial carcinoma, gastric adenocarcinoma, Merkel cell carcinoma (MCC), and non-small cell lung cancer (NSCLC).
Methods: Patients with ≥1 diagnosis of a cancer of interest were identified in the HealthCore Integrated Research Database (HIRD), a United States (US) healthcare database (2010–2016). Data were linked to three US state cancer registries and the HealthCore Integrated Research Environment Oncology database to identify cancer stage. Predictive models were constructed to estimate the probability of metastatic/advanced stage. Predictors available in the HIRD were identified and coefficients estimated by Least Absolute Shrinkage and Selection Operator (LASSO) regression with cross-validation to control overfitting. Classification error rates and receiver operating characteristic curves were used to select probability thresholds for classifying patients as cases of metastatic/advanced cancer.
Results: We used 2723 ovarian cancer, 6522 urothelial carcinoma, 1441 gastric adenocarcinoma, 109 MCC, and 12,373 NSCLC cases of early and metastatic/advanced cancer to develop predictive models. All models had high discrimination (C > 0.85). At thresholds selected for each model, PPVs were all >0.75: ovarian cancer = 0.95 (95% confidence interval [95% CI]: 0.94–0.96), urothelial carcinoma = 0.78 (95% CI: 0.70–0.86), gastric adenocarcinoma = 0.86 (95% CI: 0.83–0.88), MCC = 0.77 (95% CI: 0.68–0.89), and NSCLC = 0.91 (95% CI: 0.90–0.92).
Conclusion: Predictive modeling was used to identify five types of metastatic/advanced cancer in a healthcare claims database with greater accuracy than previous methods.
Keywords: Ovarian cancer; Urothelial carcinoma; Gastric adenocarcinoma; Merkel cell carcinoma; Non-small cell lung cancer; Predictive modeling; Machine learning; LASSO regression
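The threshold-selection step described in the Methods can be illustrated with a small sketch. It is not the authors' code: the toy probabilities and labels are invented, the helper names (`confusion_counts`, `wilson_interval`, `select_threshold`) are hypothetical, the selection rule (lowest classification error among thresholds meeting a minimum PPV) is one plausible reading of the abstract, and the Wilson score interval is an assumed choice because the abstract does not say how its confidence intervals were computed.

```python
import math

def confusion_counts(probs, labels, threshold):
    """Count (TP, FP, TN, FN) when probability >= threshold is called advanced-stage."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        predicted = p >= threshold
        if predicted and y:
            tp += 1
        elif predicted and not y:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed method, not stated in the abstract)."""
    if n == 0:
        return (0.0, 1.0)
    phat = successes / n
    denom = 1 + z * z / n
    centre = (phat + z * z / (2 * n)) / denom
    half = z * math.sqrt(phat * (1 - phat) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

def select_threshold(probs, labels, min_ppv=0.75):
    """Return the candidate threshold with the lowest classification error
    among those whose PPV is at least min_ppv (hypothetical selection rule)."""
    best = None
    for t in sorted(set(probs)):
        tp, fp, tn, fn = confusion_counts(probs, labels, t)
        if tp + fp == 0:
            continue  # nobody classified as advanced-stage; PPV undefined
        ppv = tp / (tp + fp)
        error = (fp + fn) / len(labels)
        if ppv >= min_ppv and (best is None or error < best["error"]):
            best = {
                "threshold": t,
                "ppv": ppv,
                "ppv_ci": wilson_interval(tp, tp + fp),
                "error": error,
            }
    return best

# Toy data: model-predicted probabilities and registry-confirmed stage (1 = advanced).
probs = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.70, 0.85, 0.95]
labels = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]
result = select_threshold(probs, labels)
print(result)
```

On this toy data the rule picks the threshold 0.60, where every patient classified as advanced-stage is a true case. In practice the probabilities would come from the fitted LASSO model and the labels from the linked registry data, and the width of the PPV interval would depend heavily on how many patients exceed the threshold.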
This article is indexed in ScienceDirect and other databases.