An excessive amount of total hospitalization is caused by delays due to patients waiting to be placed in a rehabilitation facility or skilled nursing facility (RF/SNF). An accurate preoperative prediction of who would need a RF/SNF place after surgery could reduce costs and allow more efficient organizational planning. We aimed to develop a machine learning algorithm that predicts non-home discharge after elective surgery for lumbar spinal stenosis.
Methods
We used the American College of Surgeons National Surgical Quality Improvement Program to select patients who underwent elective surgery for lumbar spinal stenosis between 2009 and 2016. The primary outcome measure for the algorithm was non-home discharge. Four machine learning algorithms were developed to predict non-home discharge. Performance of the algorithms was measured with discrimination, calibration, and an overall performance score.
Results
We included 28,600 patients with a median age of 67 (interquartile range 58–74). The non-home discharge rate was 18.2%. Our final model consisted of the following variables: age, sex, body mass index, diabetes, functional status, ASA class, level, fusion, preoperative hematocrit, and preoperative serum creatinine. The neural network was the best model based on discrimination (c-statistic = 0.751), calibration (slope = 0.933; intercept = 0.037), and overall performance (Brier score = 0.131).
Conclusions
A machine learning algorithm is able to predict discharge placement after surgery for lumbar spinal stenosis with both good discrimination and calibration. Implementing this type of algorithm in clinical practice could avert risks associated with delayed discharge and lower costs.
Graphical abstract
These slides can be retrieved under Electronic Supplementary Material.
In recent years, there has been a trend toward quicker discharges after orthopedic surgery, which does not seem to affect patients’ outcomes inordinately [1, 2]. However, an excessive amount of total hospitalization is caused by delays due to patients waiting to be placed in a rehabilitation facility or skilled nursing facility (RF/SNF) [3, 4, 5, 6, 7, 8]. Not only does this incur unnecessary costs and hamper efficient delivery of care, but more importantly delayed discharges are detrimental to the patient [9]. Increased length of stay has been associated with hospital acquired infections and adverse drug events [7, 9, 10, 11]. Although increasing the number of facilities seems the obvious solution, a study by Gaughan et al. [12] found that this would only have a small effect on delayed discharges and would actually cost more.
Previous studies have determined risk factors for non-home discharge placement. Some have developed scoring systems based on these risk factors aiming to predict who will likely not be discharged home after spine surgery [13, 14, 15, 16]. However, no studies have looked at employing machine learning (ML) algorithms. The increased amounts of available data combined with more computational hardware are currently causing a rapid expansion of ML in medicine. ML is a form of artificial intelligence which allows algorithms to learn and self-improve from experience without explicit programming by a data scientist. The capacity of these algorithms to handle large datasets and incorporate nonlinear interactions allows for more accurate and personalized prediction than regular statistical methods.
An accurate personal preoperative prediction of who would need a RF/SNF place would allow reservation of a place in advance and earlier insurance precertification. This could reduce costs and avoid the risks of (unnecessary) prolonged hospitalization.
Lumbar spinal stenosis is a relatively common degenerative spine condition for which the SPORT trial has indicated surgical treatment to be superior to non-surgical treatment. Currently, it is one of the most common indications for spine surgery [17, 18].
In this study, we aim to develop a prediction tool using ML algorithms to predict discharge to a RF/SNF after elective surgery for lumbar spinal stenosis for patients living at home preoperatively. Second, we aim to select the best performing algorithm and develop an application to enable healthcare providers to arrange a place in a RF/SNF well in advance.
Methods
Data source
We used the American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) as our main data source. The ACS-NSQIP is a large clinical database with data of more than 680 US hospitals combined and has often been used in the spine literature.
We included patients based on the following criteria: (1) International Classification of Disease—Ninth Revision (ICD-9) code 724.02 or 724.03 for lumbar spinal stenosis; (2) year of surgery between 2009 and 2016; (3) Current Procedural Terminology (CPT) codes for decompression, fusion, or fixation. We included 28,600 patients in our dataset to train and test the algorithms.
Data analysis
Our primary outcome measure was non-home discharge defined as all discharges not to home. This variable was created by grouping together discharges to rehabilitation facilities, skilled nursing facilities, and unskilled nursing facilities. Variable selection for our algorithm was performed by entering all available variables in a random forest regression, which then ranks variables according to their predictive power for the outcome variable [19].
We performed a stratified 80:20 split of the dataset into a training set and a testing set. We used the training set for algorithm training and assessment of performance by tenfold cross-validation. Cross-validation means dividing the data into a selected number of groups, named folds. Each fold is withheld once and treated as the test set, while the other folds together are treated as the training set. Results are subsequently averaged across all repetitions of this sequence [20].
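The split and cross-validation scheme described above can be sketched in a few lines of plain Python. This is only an illustration on made-up outcome labels, not the pipeline used in the study; the function names `stratified_split` and `k_fold_indices` are hypothetical, and the folds here are formed by simple shuffling, since the text does not state whether the folds themselves were stratified.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with the outcome rate preserved in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def k_fold_indices(indices, k=10, seed=0):
    """Yield (fit_idx, held_out_idx) pairs; every index is held out exactly once."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, held_out


# Toy outcome vector (assumption): 1 = non-home discharge, 0 = home discharge.
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels)
folds = list(k_fold_indices(train_idx, k=10))
```

In a real analysis, a model would be fitted on `fit_idx` and scored on `held_out_idx` in each of the ten rounds, and the scores averaged; the testing set stays untouched until a final model has been chosen.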
Four different algorithms (neural network, support vector machine, Bayes point machine, boosted decision tree) were trained using these variables to predict non-home discharge. We chose these four because they each have different merits for prediction (Appendix 1). Senders et al. [20] provide an accessible overview of the most commonly used algorithms. The model with the best performance was subsequently used in the testing set to predict discharge placement. These predictions were then compared with the actual outcomes of the testing set to assess the performance of the algorithm outside of the training set.
Model assessment
Performance of the algorithms was measured with discrimination, calibration, and an overall performance score [21, 22].
Discrimination is the algorithm’s ability to distinguish patients who were discharged to home from patients who were not discharged to home. We assessed discrimination with receiver operating characteristic (ROC) curves and c-statistics. A c-statistic of 1.0 indicates perfect discrimination, while a c-statistic of 0.5 indicates discrimination no better than chance [23].
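For a binary outcome, the c-statistic equals the proportion of all (event, non-event) patient pairs in which the patient with the event received the higher predicted probability, with ties counted as one half. A minimal sketch of that definition follows; the function name `c_statistic` and the toy data are illustrative assumptions, and the pairwise loop is chosen for clarity rather than speed.

```python
def c_statistic(outcomes, probabilities):
    """Fraction of (event, non-event) pairs ranked correctly; ties count as half."""
    events = [p for y, p in zip(outcomes, probabilities) if y == 1]
    non_events = [p for y, p in zip(outcomes, probabilities) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non in non_events:
            if p_event > p_non:
                concordant += 1.0
            elif p_event == p_non:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Made-up example: 1 = non-home discharge, 0 = home discharge.
outcomes = [1, 0, 1, 0, 0]
probabilities = [0.60, 0.20, 0.30, 0.40, 0.10]
auc = c_statistic(outcomes, probabilities)  # 5 of 6 pairs are ranked correctly
```

A model that assigns every patient the same probability scores 0.5 under this definition, which matches the chance-level interpretation given above.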
Calibration determines whether the predicted probabilities of the algorithm are similar to the actual observed events. The calibration intercept determines whether the algorithm is over- or underestimating the probabilities; the calibration slope determines whether the predictor effects are similar in the training and the testing set. A perfect model has an intercept score of 0.0 and a slope score of 1.0.
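One common way to obtain these two numbers is logistic recalibration: the observed outcome is regressed on the logit of the predicted probability. The sketch below assumes that approach, which the text does not spell out. The slope comes from a two-parameter fit, and the intercept from a fit with the slope fixed at 1. The function names and the toy data are hypothetical, and the fit uses a plain Newton-Raphson loop rather than a statistics library.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def calibration_slope(outcomes, probabilities, iterations=25):
    """Fit logit(P(y=1)) = a + b * logit(p_hat) by Newton-Raphson; return (a, b)."""
    x = [logit(p) for p in probabilities]
    a, b = 0.0, 1.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, yi in zip(x, outcomes):
            mu = sigmoid(a + b * xi)
            w = mu * (1.0 - mu)
            g_a += yi - mu          # gradient with respect to a
            g_b += (yi - mu) * xi   # gradient with respect to b
            h_aa += w
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


def calibration_intercept(outcomes, probabilities, iterations=25):
    """Fit logit(P(y=1)) = a + logit(p_hat), i.e. slope fixed at 1; return a."""
    x = [logit(p) for p in probabilities]
    a = 0.0
    for _ in range(iterations):
        grad = hess = 0.0
        for xi, yi in zip(x, outcomes):
            mu = sigmoid(a + xi)
            grad += yi - mu
            hess += mu * (1.0 - mu)
        a += grad / hess
    return a


# Made-up data: predictions of 0.2 and 0.5 that match the observed event rates.
probabilities = [0.2] * 10 + [0.5] * 10
outcomes = [1, 1] + [0] * 8 + [1] * 5 + [0] * 5
_, slope = calibration_slope(outcomes, probabilities)
intercept = calibration_intercept(outcomes, probabilities)
```

Because the toy predictions equal the observed event rate in each group, the fit returns a slope of about 1 and an intercept of about 0, the values of a perfectly calibrated model. An intercept below 0 would indicate that probabilities are overestimated on average, and a slope below 1 that predictions are too extreme.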
Overall model performance was assessed with the Brier score, calculated as the mean squared error between the observed events in the testing set and the predictions given by the algorithm. A perfect algorithm would have a score of 0. The Brier score combines discrimination and calibration characteristics, but must always be interpreted in the context of the prevalence of the predicted outcome, in our study non-home discharge [22]. If the prevalence of the outcome variable is lower, the maximum score of a poor algorithm is lower as well. Therefore, the Brier score must be compared with the null Brier score, which is calculated by assigning each patient a probability equivalent to the prevalence of non-home discharge. Steyerberg et al. [22] offer a detailed framework of all performance metrics.
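The two quantities can be written out directly from the definitions above. The sketch below uses made-up outcomes and predictions, and the function names are illustrative. Assigning every patient the prevalence p gives a null Brier score of p × (1 − p), so a prevalence of 18.2% would give roughly 0.149.

```python
def brier_score(outcomes, probabilities):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(outcomes, probabilities)) / len(outcomes)


def null_brier_score(outcomes):
    """Brier score of a model that predicts the outcome prevalence for everyone."""
    prevalence = sum(outcomes) / len(outcomes)
    return brier_score(outcomes, [prevalence] * len(outcomes))


# Made-up example: 1 = non-home discharge, 0 = home discharge.
outcomes = [1, 0, 0, 0, 0]
probabilities = [0.7, 0.1, 0.2, 0.1, 0.3]
model_brier = brier_score(outcomes, probabilities)
null_brier = null_brier_score(outcomes)
```

A model is only informative if its Brier score is lower than the null score computed on the same patients, which is the comparison described in the paragraph above.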
Web-based application
The algorithm with the best performance based on discrimination, calibration, and overall performance was subsequently incorporated in a Web-based application. This application is built to input the variable values collected by a healthcare provider into the algorithm, calculate the probability, and output the result to the healthcare provider in real-time.
Microsoft Azure, STATA 13 (StataCorp LP, College Station, TX, USA), RStudio version 1.0.153, and Python version 3.6 (Python Software Foundation) (Anaconda distribution) were used for data analysis, model creation, and application development.
Results
Of the 28,600 patients, 18.2% were not discharged to home. Baseline characteristics are given in Table 1. The following variables were included after variable selection: age (years), sex (male/female), body mass index (BMI), American Society of Anesthesiologists (ASA) class (I/II/III/IV), functional status (independent/dependent), number of levels included in surgery (1 or 2 levels/3 or more levels), fusion (yes/no), diabetes (no/oral medication/insulin dependent), preoperative hematocrit (vol.%), and preoperative serum creatinine (mg/dL).
Table 1
Baseline characteristics of patients, n = 28,600

| Variable | Category or statistic | Value, n (%) unless stated otherwise |
| --- | --- | --- |
| Age | Median (IQR) | 67 (58–74) |
| | Missing, n (%) | 113 (0.37) |
| Sex | Female | 13,518 (47.3) |
| | Male | 15,082 (52.7) |
| BMI | Median (IQR) | 30.09 (26.58–34.43) |
| | Missing, n (%) | 125 (0.34) |
| Functional status | Independent | 27,917 (97.6) |
| | Dependent | 508 (1.8) |
| | Missing, n (%) | 175 (0.6) |
| Fusion | No | 13,053 (45.6) |
| | Yes | 15,547 (54.4) |
| Approach | Posterior | 26,633 (93.1) |
| | Anterior | 682 (2.4) |
| | Combined | 1285 (4.5) |
| Level | One or two levels | 14,618 (41.5) |
| | Three or more levels | 20,638 (58.5) |
| Instrumentation | No | 15,973 (55.8) |
| | Yes | 12,627 (44.2) |
| ASA class | I | 475 (1.7) |
| | II | 12,281 (42.9) |
| | III | 15,079 (52.7) |
| | IV | 765 (2.7) |
| Hematocrit | Median (IQR) | 41.1 (38.4–43.8) |
| | Missing, n (%) | 1926 (6.7) |
| White blood cell | Median (IQR) | 7.0 (5.8–8.4) |
| | Missing, n (%) | 2260 (7.9) |
| Platelet | Median (IQR) | 232 (194–276) |
| | Missing, n (%) | 2285 (7.9) |
| Sodium | Median (IQR) | 140 (138–141) |
| | Missing, n (%) | 3401 (11.9) |
| Creatinine | Median (IQR) | 0.90 (0.77–1.09) |
| | Missing, n (%) | 3270 (11.4) |
| Diabetes | No | 22,488 (78.6) |
| | Oral medication | 4111 (14.4) |
| | Insulin dependent | 2001 (7.0) |
| Hypertension | | 18,742 (65.5) |
| Current smoker | | 4676 (16.3) |
| Chronic obstructive pulmonary disease | | 1493 (5.2) |
| Chronic steroid use | | 1189 (4.2) |
| Bleeding disorder | | 571 (2.0) |
BMI body mass index, ASA American Society of Anesthesiologists Classification, IQR interquartile range
Table 2 lists the c-statistic (AUC), calibration slope and intercept, and Brier score for the four algorithms. The null Brier score was 0.150. Based on numerical and graphical assessment of these metrics, the neural network algorithm was chosen as the final model with a c-statistic of 0.751, a calibration slope of 0.933, a calibration intercept of 0.037, and a Brier score of 0.130 (Fig. 1).
Table 2
Model performance for discharge disposition prediction on training set

| Performance metric | Neural network | Support vector machine | Bayes point machine | Boosted decision tree |
| --- | --- | --- | --- | --- |
| C-statistic | 0.751 | 0.743 | 0.752 | 0.747 |
| Calibration slope | 0.933 | 0.996 | 1.038 | 0.694 |
| Calibration intercept | 0.037 | 5.2 × 10⁻⁴ | −3.57 × 10⁻⁴ | 4.58 × 10⁻³ |
| Brier score | 0.130 | 0.131 | 0.131 | 0.133 |

Null model Brier score: 0.150
When the neural network algorithm was evaluated on the testing set, it achieved a c-statistic of 0.744, a calibration slope of 0.915, a calibration intercept of −0.131, and a Brier score of 0.131 (Figs. 2, 3).
The Web application based on the neural network can be accessed at https://sorg-apps.shinyapps.io/stenosisdisposition/. As an example, a 75-year-old male is scheduled for two-level surgery with fusion. He has a BMI of 34 and is classified as ASA II; he lives independently at home and does not have diabetes. His preoperative creatinine level is 2.9 mg/dL and preoperative hematocrit level is 34%. After filling out these values in the algorithm, this patient has a 24.4% chance of non-home discharge.
Discussion
We aimed to develop an ML algorithm that can predict discharge to a RF/SNF after elective surgery for lumbar stenosis. Our algorithm included age, sex, BMI, functional status, ASA class, level, fusion, diabetes, preoperative hematocrit, and preoperative serum creatinine. The neural network was picked as the best algorithm based on discrimination (c-statistic = 0.751), calibration (intercept = 0.037; slope = 0.933), and overall performance (Brier score = 0.130) in the training set and subsequent performance on internal validation.
Our study has several limitations. First, studies using a large clinical database are always affected by miscoding and other inaccuracies. Although the NSQIP database is widely used, few studies have assessed its actual accuracy. Rolston et al. [24] found many internal inconsistencies between procedure CPT codes and postoperative ICD-9 codes in neurosurgery. However, the codes for lumbar stenosis and lumbar surgery are more straightforward, so we estimate that potential miscoding will not severely affect our algorithm. Second, certain variables of interest are not always available in the ACS-NSQIP. Because preoperative patient-reported outcomes are known to be predictors of discharge placement after spine surgery, we consider their absence a major limitation of our work [25]. While the current AUC of 0.751 is fair, the algorithm could potentially be improved by adding these and other relevant variables. Third, although the ACS-NSQIP database consists of data from 680 US hospitals, the algorithm may not be applicable to all the patients it is intended for due to differences in demographic or clinical characteristics. Fourth, the differences between the algorithms are small, which makes the choice of a neural network somewhat arbitrary. However, settling on an algorithm based on numerical and graphical assessment is the most reproducible method. Finally, it must be emphasized that this study focuses on accurate prediction of a rather simple, prespecified outcome (here ‘non-home discharge’), in contrast to the explanation of this outcome, which is the focus of the vast majority of medical research. The variables in our model cannot simply be interpreted as independent explanatory variables.
Age, sex, diabetes, functional status, fusion, and preoperative hematocrit have previously been identified in other (explanatory) studies on discharge placement after spine surgery [26, 27, 28]. The inclusion of most variables in our model can likely be attributed to being independent risk factors for major complications after surgery for lumbar stenosis. Age, diabetes, BMI, functional status, ASA class, preoperative hematocrit, and preoperative creatinine have all been shown to be associated with major complications [29, 30, 31, 32]. Number of levels and fusion are likely surrogates for longer procedural time which is also implicated in postoperative morbidity [30, 31].
The importance of eliminating delayed discharges for patients lies in averting the risks associated with longer hospitalization and in the advantages of starting rehabilitation earlier. Umarji et al. [11] found that 58% of patients with a hip fracture acquired nosocomial infections when discharge was delayed beyond 8 days. Hauck et al. [33] found that each additional night in hospital increases the risk by 0.5% for adverse drug events and 1.6% for infections. With regard to rehabilitation, other studies have found worse post-rehabilitation scores for patients with delays in discharge [34, 35]. While those studies did not necessarily focus on elective spine patients, other spine centers have acknowledged the problem and aimed to construct risk scores for predicting discharge placement. McGirt et al. [15] created the Carolina-Semmes grading score for all degenerative lumbar spine surgery based on logistic regression. They included the variables age, ASA class, fusion, Oswestry disability index score, ambulation, and non-private insurance and achieved an area under the curve (AUC) of 0.731. Kanaan et al. [14] used age, prior level of function, and gait distance to create a model for discharge placement after lumbar laminectomy and achieved an AUC of 0.80. Slover et al. [13] stratified spine patients into low, medium, and high risk based on points for age, sex, walking distance, gait aid, community support, and availability of a caregiver at home. They did not report an AUC. None of the above-mentioned studies assessed calibration.
Although often overlooked, assessment of calibration is an essential feature of studies creating prediction models. In our study, the neural network and the Bayes point machine had highly similar performance metrics. However, on graphical assessment the calibration of the Bayes point machine was slightly inaccurate between the predicted probabilities of 0.15 and 0.50, which represent a significant part of the study population (Fig. 3). This deviation means the algorithm slightly underestimates the chance of discharge to a RF/SNF, which for some patients would mean no placement has been arranged before surgery—the situation as it is right now. Assessing calibration over the full range of predictions is crucial in ensuring the model is useful [23]. Future studies aiming to create models should always feature a numerical and graphical assessment of calibration. As depicted in the calibration subplot in Fig. 3, the vast majority of patients have a 10–40% chance of discharge to an RF/SNF, as can be expected for an elective spine procedure. The algorithm is meant to trace and designate higher-risk patients so their potential discharge delay might be avoided.
Where hospitals set their threshold to arrange an RF/SNF placement in advance would differ per health system. There are major differences in the availability of RF/SNF beds, insurance regulations, and discharge practices between countries [36, 37, 38]. Length of stay for deforming dorsopathies ranges from 4.6 to 27 days within Europe. American patients with a hip fracture are three times more likely to be discharged to an RF/SNF than Canadian patients [39]. While these complex differences do exist, delayed discharges are a problem for patients and hospitals around the world [9]. Mirroring the differences between countries, a wide variety of policies have been implemented internationally to try to lower the number and duration of these delayed discharges [40, 41]. In Great Britain, imposing fines has reduced the number of delayed discharges, but simultaneously rising readmission rates raised questions about the quality of discharges [42]. Sweden tried making local municipalities financially responsible for the care of the elderly [43]. Others focused on developing allocation decision tools or on the effect of increasing nursing home supply [12, 44].
At the very core of all these suggested policies, regardless of health system, is the inability to make an accurate assessment of who will need a RF/SNF placement with enough time to set things in motion. An ML algorithm can give an individualized prediction. Thorough external validation needs to be performed along with an assessment of where to place the threshold before these algorithms can be implemented, especially if the algorithm were to be used outside the USA.
Nevertheless, considering the risks for patients and the unnecessary costs involved with longer hospitalization due to delayed discharges, the use of predictive algorithms could be worth the initial effort.
Conclusion
A prediction tool based on an ML algorithm is able to predict discharge placement after surgery for lumbar spinal stenosis with both good discrimination and calibration. This methodology can be implemented for a variety of other diseases and elective treatments, which could avoid risks associated with delayed discharge and lower costs.
Notes
Acknowledgements
The American College of Surgeons National Surgical Quality Improvement Program and the hospitals participating in the ACS-NSQIP are the source of the data used herein; they have not verified and are not responsible for the statistical validity of the data analysis or the conclusions derived by the authors.
Compliance with ethical standards
Conflict of interest
The authors have nothing to disclose.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
2. Basques BA, Tetreault MW, Della Valle CJ (2017) Same-day discharge compared with inpatient hospitalization following hip and knee arthroplasty. J Bone Joint Surg Am 99:1969–1977. https://doi.org/10.2106/JBJS.16.00739
3. Hwabejire JO, Kaafarani HMA, Imam AM et al (2013) Excessively long hospital stays after trauma are not related to the severity of illness: let’s aim to the right target! JAMA Surg 148:956–961. https://doi.org/10.1001/jamasurg.2013.2148
7. Rosman M, Rachminov O, Segal O, Segal G (2015) Prolonged patients’ In-Hospital Waiting Period after discharge eligibility is associated with increased risk of infection, morbidity and mortality: a retrospective cohort analysis. BMC Health Serv Res 15:1–5. https://doi.org/10.1186/s12913-015-0929-6
8. New PW, Andrianopoulos N, Cameron PA et al (2013) Reducing the length of stay for acute hospital patients needing admission into inpatient rehabilitation: a multicentre study of process barriers. Intern Med J 43:1005–1011. https://doi.org/10.1111/imj.12227
10. Härkänen M, Kervinen M, Ahonen J et al (2015) Patient-specific risk factors of adverse drug events in adult inpatients—evidence detected using the Global Trigger Tool method. J Clin Nurs 24:582–591. https://doi.org/10.1111/jocn.12714
12. Gaughan J, Gravelle H, Siciliani L (2015) Testing the bed-blocking hypothesis: does nursing and care home supply reduce delayed hospital discharges? Health Econ 24:32–44. https://doi.org/10.1002/hec
14. Kanaan SF, Yeh H-W, Waitman RL et al (2014) Predicting discharge placement and health care needs after lumbar spine laminectomy. J Allied Health 43:88–97
15. McGirt MJ, Parker SL, Chotai S et al (2017) Predictors of extended length of stay, discharge to inpatient rehab, and hospital readmission following elective lumbar spine surgery: introduction of the Carolina-Semmes Grading Scale. J Neurosurg Spine 27:382–390. https://doi.org/10.3171/2016.12.SPINE16928
16. Niedermeier S, Przybylowicz R, Virk SS et al (2017) Predictors of discharge to an inpatient rehabilitation facility after a single-level posterior spinal fusion procedure. Eur Spine J 26:771–776. https://doi.org/10.1007/s00586-016-4605-2
18. Weinstein JN, Tosteson TD, Lurie JD et al (2010) Surgical versus nonoperative treatment for lumbar spinal stenosis four-year results of the Spine Patient Outcomes Research Trial. Spine 35:1329–1338. https://doi.org/10.1097/BRS.0b013e3181e0f04d
19. Degenhardt F, Seifert S, Szymczak S (2017) Evaluation of variable selection methods for random forests and omics data sets. Brief Bioinform. https://doi.org/10.1093/bib/bbx124
24. Rolston JD, Han SJ, Chang EF (2017) Systemic inaccuracies in the National Surgical Quality Improvement Program database: implications for accuracy and validity for neurosurgery outcomes research. J Clin Neurosci 37:44–47. https://doi.org/10.1016/j.jocn.2016.10.045
25. Mancuso CA, Duculan R, Craig CM, Girardi FP (2018) Psychosocial variables contribute to length of stay and discharge destination after lumbar surgery independent of demographic and clinical variables. Spine 43:281–286. https://doi.org/10.1097/BRS.0000000000002312
26. Best MJ, Buller LT, Falakassa J, Vecchione D (2015) Risk factors for nonroutine discharge in patients undergoing spinal fusion for intervertebral disc disorders. Iowa Orthop J 35:147–155
27. Abt NB, McCutcheon BA, Kerezoudis P et al (2017) Discharge to a rehabilitation facility is associated with decreased 30-day readmission in elective spinal surgery. J Clin Neurosci 36:37–42. https://doi.org/10.1016/j.jocn.2016.10.029
30. Schoenfeld AJ, Carey PA, Cleveland AW et al (2013) Patient factors, comorbidities, and surgical characteristics that increase mortality and complication risk after spinal arthrodesis: a prognostic study based on 5,887 patients. Spine J 13:1171–1179. https://doi.org/10.1016/j.spinee.2013.02.071
34. Young J, Green J (2010) Effects of delays in transfer on independence outcomes for older people requiring postacute care in community hospitals in England. J Clin Gerontol Geriatr 1:48–52. https://doi.org/10.1016/j.jcgg.2010.10.009
36. Kondo A, Zierler BK, Isokawa Y et al (2010) Comparison of lengths of hospital stay after surgery and mortality in elderly hip fracture patients between Japan and the United States—the relationship between the lengths of hospital stay after surgery and mortality. Disabil Rehabil 32:826–835. https://doi.org/10.3109/09638280903314051
37. Nikkel LE, Kates SL, Schreck M et al (2015) Length of hospital stay after hip fracture and risk of early mortality after discharge in New York state: retrospective cohort study. BMJ 351:1–10. https://doi.org/10.1136/bmj.h6246
39. Beaupre LA, Wai EK, Hoover DR et al (2018) A comparison of outcomes between Canada and the United States in patients recovering from hip fracture repair: secondary analysis of the FOCUS trial. Int J Qual Health Care 30:97–103. https://doi.org/10.1093/intqhc/mzx199
42. McCoy D, Godden S, Pollock AM, Bianchessi C (2007) Carrot and sticks? The Community Care Act (2003) and the effect of financial incentives on delays in discharge from hospitals in England. J Public Health 29:281–287. https://doi.org/10.1093/pubmed/fdm026
44. Zychlinski N (2017) Time-varying fluid networks with blocking: models supporting patient flow analysis in hospitals. Doctoral dissertation, Israel Institute of Technology
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Cite this article as: Ogink, P.T., Karhade, A.V., Thio, Q.C.B.S. et al. Eur Spine J (2019) 28: 1433. https://doi.org/10.1007/s00586-019-05928-z