Prediction models for burden of caregivers applying data mining techniques

  • * Corresponding author: Sunmoo Yoon, RN, PhD, Associate Research Scientist, Columbia University, sy2102@columbia.edu

  • Introduction

    Caregiver stress negatively influences both patients and caregivers. Predictors of caregiver difficulty may provide crucial insights for providers to prioritize those with the highest risk of stress. The purpose of this study was to develop a prediction model of caregiver difficulty by applying data mining techniques to a national behavioral risk factor data set.


    Behavioral data comprising 397 variables on 2,264 informal caregivers, defined as those who provided any care to a friend or family member during the past month, were extracted from a publicly available national dataset in the U.S. (N = 451,075) and analyzed. We applied several classification algorithms (J48, RandomForest, MultilayerPerceptron, AdaBoostM1) to iteratively generate prediction models for caregiving difficulty, evaluated with 10-fold cross-validation.
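
The evaluation protocol in the paragraph above can be sketched in a few lines. The classifier names in the study (J48, RandomForest, and so on) are Weka class names; the sketch below does not call Weka. It is a pure-Python illustration of 10-fold cross-validation in which the one-rule classifier `train_one_rule`, the feature name `cognitive_change`, and the synthetic records are all assumptions made for illustration, not the authors' pipeline or data.

```python
import random
from collections import Counter, defaultdict


def k_fold_indices(n_rows, k=10, seed=0):
    """Shuffle row indices and split them into k nearly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_one_rule(rows, labels, feature):
    """Hypothetical stand-in classifier: map each value of one feature
    to the majority label seen for that value in the training rows."""
    votes = defaultdict(Counter)
    for row, label in zip(rows, labels):
        votes[row[feature]][label] += 1
    default = Counter(labels).most_common(1)[0][0]
    rule = {value: c.most_common(1)[0][0] for value, c in votes.items()}
    return lambda row: rule.get(row[feature], default)


def cross_validate(rows, labels, feature, k=10, seed=0):
    """Return the accuracy pooled over the k held-out folds."""
    folds = k_fold_indices(len(rows), k, seed)
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(rows)) if i not in held]
        model = train_one_rule([rows[i] for i in train_idx],
                               [labels[i] for i in train_idx], feature)
        correct += sum(model(rows[i]) == labels[i] for i in held_out)
    return correct / len(rows)


# Synthetic records (NOT the study data): difficulty is made more likely
# when the invented "cognitive_change" flag is set.
rng = random.Random(42)
rows, labels = [], []
for _ in range(500):
    flag = rng.random() < 0.5
    p_difficulty = 0.65 if flag else 0.35
    rows.append({"cognitive_change": flag})
    labels.append("difficulty" if rng.random() < p_difficulty else "no difficulty")

cv_accuracy = cross_validate(rows, labels, "cognitive_change", k=10)
print(f"10-fold cross-validated accuracy: {cv_accuracy:.2f}")
```

Each record is held out exactly once, so the pooled accuracy is an estimate of performance on unseen caregivers rather than on the rows used to fit the rule.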


    Of the informal caregivers, 44.7% reported facing difficulties while caring for patients. Among those who reported difficulties, the most frequently cited greatest difficulty was that caregiving creates an emotional burden (45%). Patient cognitive alteration (e.g., changes in thinking or remembering during the past year), care hours, and the relationship with the caregiver emerged as the main predictors of caregiver stress (63% classified correctly; AUC = 65% for both the difficulty and no-difficulty classes).
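
The two summary measures reported above, the percentage classified correctly and the AUC, can be computed from predicted scores as sketched below. The labels and scores are made-up toy values and the function names are illustrative; nothing here reproduces the study's 63% or 65%.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores, positive):
    """Area under the ROC curve by pairwise comparison: the probability
    that a randomly chosen positive case scores higher than a randomly
    chosen negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (assumed values): predicted probability of "difficulty".
y_true = ["difficulty"] * 4 + ["no difficulty"] * 4
p_difficulty = [0.9, 0.7, 0.4, 0.6, 0.8, 0.3, 0.5, 0.2]

# Turn scores into labels with a 0.5 cut-off.
y_pred = ["difficulty" if s >= 0.5 else "no difficulty" for s in p_difficulty]

acc = accuracy(y_true, y_pred)
auc_difficulty = auc(y_true, p_difficulty, "difficulty")
# For the other class, score each case by its complementary probability.
auc_no_difficulty = auc(y_true, [1 - s for s in p_difficulty], "no difficulty")

print(acc, auc_difficulty, auc_no_difficulty)
```

In a two-class problem scored this way, the AUC is the same whichever class is treated as positive, which is one reason a report may show identical AUC values for both classes.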


    Data mining methods were useful for discovering new behavioral risk knowledge and for visualizing predictors of caregiver stress from a multidimensional behavioral dataset. This study suggests that health professionals should target family caregivers of people with dementia, who can be expected to encounter patients' neurocognitive changes, and inform them about the importance of limiting care hours, the risk of burnout, and delegating caregiving tasks.

    Mathematics Subject Classification: Primary: 97R50; Secondary: 97R71.


  • Figure 1.  Iterative steps of the data mining process to build a prediction model from a large dataset

    Figure 2.  Burden of caregivers

    Table 1.  Characteristics of Caregivers (n=2,264)

    Patient age, mean (SD): 69.87 (20.53)
    Caregiver age, mean (SD): 56.14 (15.46)
    Race/ethnicity
      White 2,049 90.50%
      Black 61 2.69%
      Hispanic 69 3.05%
      Others 56 2.47%
    Patient Gender
      Male 795 35.11%
      Female 1,455 64.27%
    Employment status
      Employed for wages 1,035 45.72%
      Self-employed 220 9.72%
      Unemployed 423 18.69%
      Retired 582 25.71%
    Income
      < $35,000 577 25.49%
      < $50,000 299 13.21%
      < $75,000 344 15.19%
      ≥ $75,000 734 32.42%
    Relationship
      (Grand) Parents 915 40.41%
      Spouse 371 16.39%
      Child, sibling, relatives 504 22.26%
      Friends 451 19.92%
    Patient status
      Cognitive changes 1,156 51.06%
      No cognitive changes 1,038 45.85%
      Not sure 29 1.28%

    Table 2.  Characteristics of Caregivers, continued (n=2,264)

    Caregiving duration
      ≤ 1 year 769 33.97%
      ≤ 5 years 907 40.06%
      > 5 years 497 21.95%
    Caregiving frequency
      ≤ 10 hours/week 1,344 59.36%
      ≤ 30 hours/week 380 16.78%
      ≤ 100 hours/week 201 8.88%
      > 100 hours/week 92 4.06%
    Most needs
      Cleaning, managing money, preparing meals 614 27.12%
      Transportation outside of the home 503 22.22%
      Something else 317 14.00%
      Self-care (eating, dressing, bathing) 302 13.34%
      Relieving anxiety or depression 184 8.13%
    Caregiving difficulties
      No difficulty 1,013 54.0%
      Difficulty 1,178 44.7%
      Not sure/ Don't know 28 1.25%
      Refused 24 1.07%
    Greatest difficulty (among those having difficulties)
      Creates emotional burden 528 44.82%
      Not enough time for yourself 165 14.01%
      Other difficulty 113 9.59%
      Creates financial burden 95 8.06%
      Affects family relationships 85 7.22%
      Not enough time for your family 84 7.13%
      Interferes with your work 71 6.03%
      Aggravates health problems 37 3.14%
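
The percentage column for the greatest-difficulty rows in Table 2 can be reproduced from the counts alone, which is a quick way to check a flattened table. The sketch below assumes that each percentage is taken over the sum of the eight listed categories; that denominator is inferred from the counts and is not stated in the source.

```python
# Counts copied from the greatest-difficulty rows of Table 2.
greatest_difficulty = {
    "Creates emotional burden": 528,
    "Not enough time for yourself": 165,
    "Other difficulty": 113,
    "Creates financial burden": 95,
    "Affects family relationships": 85,
    "Not enough time for your family": 84,
    "Interferes with your work": 71,
    "Aggravates health problems": 37,
}

# Assumed denominator: everyone who named a greatest difficulty.
total = sum(greatest_difficulty.values())

# Share of each category, as a percentage rounded to two decimals.
shares = {
    reason: round(100 * count / total, 2)
    for reason, count in greatest_difficulty.items()
}

for reason, share in shares.items():
    print(f"{reason}: {greatest_difficulty[reason]} ({share:.2f}%)")
```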