Evidence-Based Recommendations for Spine Surgery
Abstract and Introduction
While the efficacy of surgical treatment has been well established, the comparative effectiveness of adding instrumented lumbar spinal fusion to decompressive laminectomy in lumbar spondylolisthesis and spinal stenosis is controversial. Nonrandomized, prospective comparative studies have found evidence of laminectomy with fusion being superior but there is no Class I evidence. In partial contrast, other prospective studies with at least 5 years of postsurgery follow-up demonstrate excellent outcomes with only lumbar decompression. The lack of high-quality evidence hinders development of clinical practice guidelines. The SLIP trial (spinal laminectomy vs. instrumented pedicle screw) reported by Ghogawala et al presents the 2-year results for surgical patients randomized to either decompression or decompression and fusion for lumbar degenerative spondylolisthesis.
The trial began with 130 patients screened, of whom 106 enrolled and 66 were randomized (35 decompression alone, 31 decompression and fusion). At the 2-year follow-up the group sizes were 29 and 28. At baseline, the fusion group had a Short Form-36 physical component summary (SF-36 PCS) score that was 3.2 points lower than in the decompression-alone group. However, there were no statistically significant baseline differences between the groups, suggesting a successful randomization process.
At the primary endpoint of 2 years, the mean change in SF-36 PCS for the decompression-alone group was 9.5 and for the fusion group was 15.2; the difference of 5.7 was statistically significant (P = 0.046) and larger than the minimal clinically significant difference. While the difference was not statistically significant at 1 year, it was significant at 3 months, 6 months, and then at 2 years through to the end of the 4-year follow-up period. Differences in Oswestry Disability Index (ODI) did not reach statistical significance until 4 years. The fusion group had significantly higher surgical complications (blood loss, length of stay, length of procedure) but had a significantly lower rate of reoperation (14% vs. 34%).
The authors conclude that laminectomy plus fusion was associated with statistically greater and clinically meaningful improvement in physical health and function compared with laminectomy alone at 2-, 3-, and 4-year follow-up. The authors’ conclusions are strongly supported by the well-designed study and analysis. Differences in low back pain disability, as measured by the ODI, were not found until 4 years after surgery. Despite its widespread use across many spinal conditions and being the primary outcome measure in many studies, the ODI is a historic measure of low back pain related disability and may not be the best measure for patients with neurogenic claudication. Surgery for lumbar spinal stenosis with spondylolisthesis is indicated to improve symptoms of neurogenic claudication, with axial lower back pain being a secondary objective if present. Therefore, the lack of difference in ODI scores noted in this study does not carry a clinically meaningful implication for patients with lumbar spinal stenosis and spondylolisthesis. On the contrary, the improvements in SF-36 PCS scores probably more accurately reflect functional improvements associated with surgery for this condition.
This is a randomized, controlled trial with patients from five centers, although most (51 of 66) patients were from one site. The study plan was to enroll 100 patients, and randomly assign 64 of them. A parallel registry of the other approximately 40 was planned as an observation cohort for patients who declined randomization. Details of the final protocol are available at NEJM.org.
A detailed list of inclusion and exclusion criteria is provided. The patient flowchart is comprehensive and well-presented. A panel of 10 expert spine surgeons assessed suitability of each patient for randomization. Independent interpretation of radiologic imaging was also performed to confirm the diagnosis of degenerative lumbar canal stenosis with spondylolisthesis without disk herniation. The authors note that “This novel approach appeared to increase patient consent to undergo randomization.” Patients underwent one of two surgical procedures: laminectomy or laminectomy and instrumented fusion. The details of the interventions are clearly explained and the authors ensured that all surgeons had experience with at least 100 laminectomy and 100 laminectomy and instrumented fusion procedures.
The a priori primary outcome measure is the change in SF-36 physical-component (PC) at 2 years with the ODI defined as a secondary outcome measure. The choice of a generic (SF-36) rather than a disease-specific (Oswestry) outcome instrument as the primary outcome is unconventional. Minimal clinically important differences were pre-set at five points for the SF-36 and 10 points for the ODI. Sample size calculations are based on a between-group difference of 7.5 (standard deviation 10) for the SF-36 PC and yielded a needed sample size of 64 patients. We replicated their calculations, confirming that an appropriate power analysis was performed.
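For readers who wish to retrace this kind of calculation, the following is a minimal sketch using the standard normal-approximation formula for comparing two means. The between-group difference (7.5) and standard deviation (10) come from the text above; the two-sided alpha of 0.05 and the candidate power levels are assumptions made here for illustration, and the function name is hypothetical. It is not a reproduction of the trial's actual protocol.

```python
from math import ceil
from statistics import NormalDist


def patients_per_group(delta, sd, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided, two-sample comparison of means
    (normal approximation; alpha and power are assumed inputs)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Difference of 7.5 and SD of 10 are taken from the text; power levels are assumed.
for assumed_power in (0.80, 0.85, 0.90):
    n = patients_per_group(delta=7.5, sd=10, power=assumed_power)
    print(f"power={assumed_power:.2f}: {n} per group, {2 * n} in total")
```

Under these assumptions the required total falls in the range of roughly 56 to 76 patients depending on the power chosen, which is of the same order as the 64 patients planned. Any allowance for attrition would raise the figure further and is not modeled here.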
Statistical analysis includes standard comparisons of two groups, using independent-sample t tests for continuous variables and chi-square tests or Fisher’s exact tests for categorical variables. To compare groups with respect to the changes from baseline in SF-36 and ODI scores, mixed-effects models for repeated measures were used. Fixed effects were treatment type (the key variable under study), site, time since randomization, and time-by-treatment interaction. Finally, changes in SF-36 PC were dichotomized as over or under five points (the minimal clinically important difference) at 2 years, and a logistic regression carried out using the same fixed effects as in the mixed-effects models.
Analysis was on an intent-to-treat basis (i.e., according to a patient’s original randomized treatment assignment) for all patients with follow-up assessments. Details of the statistical analysis plan are also available in the study protocol (NEJM.org); the study design, presentation, and statistical analysis are strong and appropriate for the design and nature of the investigational question.
Recommendation Regarding Impact on Clinical Practice
This paper demonstrates, with high quality and a high level of evidence, that lumbar laminectomy with fusion leads to a greater (statistically significant and clinically meaningful) improvement in physical health-related quality of life than laminectomy alone, over a follow-up period of 4 years. While the short-term complications were lower in the laminectomy-alone group, reoperation rates were significantly higher. Therefore, we offer a strong recommendation that laminectomy plus fusion is the better treatment option for patients with lumbar spinal stenosis and spondylolisthesis. While there is a role for laminectomy alone, the subset of patients in whom this procedure will prove durable remains to be identified.
Surgery to treat lumbar spinal stenosis has become more prevalent over recent decades such that it is the most common indication for spine surgery at present. Surgical options may include decompression or decompression with fusion. When stenosis is present with a spondylolisthesis many have advocated fusion, in addition to decompression, in an effort to prevent worsening of the spondylolisthesis and recurrent or persistent symptoms. This current study investigates whether or not the addition of fusion to surgical decompression, in those with spinal stenosis, results in better clinical outcomes at 2 years. Neurogenic claudication caused by degenerative lumbar spinal stenosis is common. High-quality evidence to guide treatment decisions is lacking; specifically, there is controversy as to the benefit of adding fusion to surgical decompression. In an attempt to address these ongoing issues, the authors performed the present study which was intended to assess the utility of performing an adjunctive fusion for patients with symptomatic spinal stenosis.
Forsth et al have conducted a prospective randomized trial investigating the effectiveness of fusion as an adjunct to decompression in treatment of lumbar spinal stenosis. Clinical inclusion criteria included patients between the ages of 50 and 80 with more than 6 months of neurogenic claudication in one or both legs and back pain (visual analogue scale [VAS] >30) while radiographic criteria limited the population to patients with 1 to 2 level disease with thecal cross-sectional area of ≤75 mm². Patients were additionally stratified according to presence or absence of degenerative spondylolisthesis defined as ≥3 mm of anterolisthesis on plain lateral X-ray (not flexion-extension dynamic views). A total of 247 patients were enrolled between 2006 and 2012 (123 assigned to decompression and fusion and 124 to decompression alone). Of these, 14 patients did not receive the prescribed intervention and 5 were lost to follow-up, yielding 228 patients who were included in the per-protocol analysis. Baseline characteristics and preoperative variables are reported and were similar across all groups.
Specific surgical procedures performed were left to the discretion of the operating surgeons. As might be expected, there was a statistically significant difference between the treatment groups for intraoperative variables with longer operating times and greater estimated blood loss in the fusion groups. There was no significant difference in either intraoperative or postoperative complications. While there was no difference in reoperation rates or postoperative resource utilization between the two groups, patients in the fusion group stayed in the hospital longer (7.4 days for fusion vs. 4.1 days for decompression, P < 0.001). The mean operative costs were higher in the fusion group ($12,000 USD vs. $5,400 USD).
At 2 years, there was no significant difference in the primary outcome (i.e., ODI) between the two groups. A subgroup analysis showed no significant difference in outcome between patients with spondylolisthesis undergoing decompression alone and those undergoing decompression with fusion. Secondary measures included EuroQol-Five Dimensions, VAS-back, VAS-leg, Zurich Claudication Questionnaire score, 6-minute walk test, and patient-reported satisfaction. There were no differences in any of these secondary outcome measures between patients undergoing decompression and those undergoing decompression and fusion. Moreover, this same finding applied for patients with spondylolisthesis. Of the 228 included patients, 144 were available for the 5-year follow-up and 138 provided outcome information. Once again, there were no differences in any of the primary or secondary outcome measures between the fusion group and the decompression group at 5 years.
On the basis of the above data, the authors conclude that the addition of fusion as an adjunct to decompression for patients with lumbar spinal stenosis does not confer any significant clinical benefit but does increase hospital length of stay, blood loss, and direct costs. Most interestingly, the authors conclude that the presence of spondylolisthesis does not change this finding: decompression and fusion for spondylolisthesis with stenosis is no better than decompression alone for patients presenting with symptoms of neurogenic claudication and back pain. Moreover, in their population, fusion did not offer protection from the risk of reoperation even in the setting of spondylolisthesis.
The authors clearly state that the aim of this study was to investigate whether the addition of fusion to decompression surgery resulted in better clinical outcomes at 2 years. There was no clear statement of study hypothesis and the authors did not commit to a stated hypothesis a priori. Judging from the inclusion/exclusion criteria and the heterogeneity of permissible surgical procedures, the present study does not seek to define the efficacy of fusion as an adjunct but rather the effectiveness of fusion as an adjunct to decompression.
The authors have used a prospective randomized trial design—patients with lumbar spinal stenosis were randomly assigned to undergo decompression plus fusion or decompression surgery alone. Patients were stratified according to the presence or absence of spondylolisthesis. Using this prognostic stratification ensured an equal distribution of spondylolisthesis patients in both treatment groups. Patients were not blinded to treatment received.
A clear description of the population from which the study sample was drawn was lacking, although the authors did mention that the study sample was derived from seven Swedish hospitals. A detailed list of appropriate inclusion and exclusion criteria was provided; stenosis was defined as a cross-sectional area of the dural sac of ≤75 mm². A detailed patient flow diagram was provided, and in so doing the authors accounted for all eligible patients. Baseline comparability of the treatment and control groups was documented in terms of age, sex, smoking status, ASA score, various quality-of-life measures, and a functional test.
It was clearly stated that the surgical interventions were at the discretion of the treating surgeon. The type of decompression or fusion was not stipulated as part of the trial design. Postoperative cointervention with other treatments was not controlled for, again in keeping with this trial's effectiveness design. All patients who entered the study were accounted for. Patients who did not receive the intervention to which they were allocated were excluded from the analysis. This is of concern in that the excluded patients may have been systematically different from those who were included; however, the number excluded was small and would not likely have changed the results. All the outcome measures utilized were appropriate and relevant to the questions asked.
The statistical tests were simple and applied appropriately. Sample size was considered prior to the study and the study sample was large enough to detect important differences should they have existed.
This study demonstrated no significant difference between groups in the mean ODI score at 2 years. This was the primary outcome. The results were the same in those patients with or without spondylolisthesis. Patients who underwent fusion were hospitalized longer and with greater associated operative costs. The recorded adverse events and reoperation rate were not substantially different between groups.
It is likely that the patients entered into this study are sufficiently representative that the results can be generalized to other settings. Similarly, the heterogeneity of surgical intervention, in terms of the type of decompression and fusion, also allows for generalizability. Generic, disease-specific, and functional outcome measures all suggested that the addition of fusion did not produce a superior result.
Recommendation Regarding Impact on Clinical Practice
Forsth et al have conducted a very well-designed and executed clinical trial addressing an exceedingly common pathology encountered by spine surgeons. The results of this study should cause the reader to seriously question the addition of fusion to surgical decompression in the symptomatic patient with central/lateral recess lumbar stenosis secondary to degenerative changes, particularly without clear evidence of segmental instability. Nevertheless, given the methodological limitations discussed previously, we believe that these findings support only a weak recommendation for incorporation into clinical practice, and no changes are warranted for patients with degenerative spondylolisthesis.
Cervical spondylotic myelopathy (CSM) is a degenerative spinal condition and the most common cause of spinal cord dysfunction worldwide. Although surgery may be beneficial, it is still a challenge to accurately predict improvement after surgical treatment. A clinical prediction rule would: (1) help appropriately manage patient expectations, and, therefore, satisfaction; (2) give decision-making support to surgeons; (3) provide a quantitative tool to discuss prognostic information during the consent conversation; and (4) align surgeons’ perceptions of outcomes across hospitals, regions, and countries.
This study is the third of a three-part study, and reports on a new analysis of previously collected data, combining data from a CSM-North America study of 278 patients, and a CSM-International study of 479 patients (from 16 global sites). The first study (North America) led to a clinical prediction model to predict postoperative functional outcomes with prospective data. The model distinguished between patients with an mJOA score of 16+ versus <16 at 1 year postoperatively and reported predictive factors of age, duration of symptoms, severity, smoking, depression, bipolar disorder, and impaired gait. The second study yielded a model with predictors similar to those in the first, but with some differences, most importantly in psychiatric disorders (relevant in North America but not internationally). Other limitations noted included the observation that the model did better in predicting outcome in moderate (mJOA = 12–14) to mild (mJOA = 15–18) myelopathy. The objective of the current (third) part was to address these limitations and refine the original model to increase global validity by using the combined data set.
Although the title mentions 757 subjects (which is the sum of 278 [N.A. study] and 479 [Int’l study]), 14 subjects with pre-op mJOA of 18 (the maximum) were excluded so analysis actually proceeded on 743 patients. Of these, 614 returned for the 1-year follow-up and hence were evaluated for improvements in functional status. The sample of 743 was comprised of 193 mild, 296 moderate, and 254 severe CSM subjects.
The authors demonstrated improvements with the surgical treatment of CSM in a large number of patients. The authors utilized an mJOA score of 16 as the threshold for minimal clinical symptoms, which is a well-established value. Through a large set of patients, with prospectively collected data, the authors identify a number of clinical factors associated with a greater relative risk of poor mJOA recovery. These factors included baseline mJOA score, impaired gait, age, co-morbidity score, smoking status, and duration of symptoms.
While the information presented is helpful, the paper does not deliver a clinical prediction model. The reader learns which are the significant predictors in each of the models, and the RR and 95% confidence interval (CI), but the coefficients of the predictors are not provided and, therefore, the models cannot be used for prediction. The correlation with poorer postsurgical outcomes can be utilized to inform patients and establish postsurgical expectations. These negative correlations do not, as noted by the authors, support nonsurgical treatment even in candidates with a higher risk of poorer results.
The two studies were both prospective multicenter cohort studies done to compare preoperative and postoperative neurological status and quality of life in CSM patients. The secondary objective was to develop a clinical prediction rule with the most significant predictors of surgical outcome. Inclusion criteria were age 18+, symptomatic CSM, evidence of spinal cord compression on imaging, and no previous spine surgery. Exclusion criteria were asymptomatic, active infection, neoplastic disease, rheumatoid arthritis, ankylosing spondylitis, or concomitant lumbar stenosis.
All patients underwent surgical decompression of the spinal cord and details of the surgical intervention are provided. Data were collected at baseline and 12 months postoperatively on demographics, symptomatology, imaging and clinical assessment, medical history, previous conservative treatment, functional status and health related quality of life (mJOA, neck disability index [NDI], SF-36, 30-meter walking test).
The primary outcome measure was the 18-point mJOA. The authors note that the reliability of the mJOA has not yet been established, but the original JOA has high inter- and intra-observer reliability. To develop prediction models, a cut-off of 16 was used for one model and 12 for a second model. Statistical methods used were univariate log-binomial regression (to model the relationship between clinical factors and primary outcome measure, and for relative risk estimation). Multi-collinearity was assessed to guard against over-specification of the model. Modified Poisson regression with robust error variance was used to create the final multivariate model and calculate relative risk for each predictor. Logistic regression was carried out on the final model to obtain a receiver operating characteristic (ROC) curve and to compute the area under the curve (AUC) as a one-number summary of the predictive performance of the model.
Odds ratios and relative risks are similar for rare events, but when events are not rare, odds ratios are not good estimates of relative risk. Hence, log-binomial regression, which estimates relative risk directly, was used here rather than the more familiar logistic regression. Both methods are appropriate for binary (dichotomous) outcomes. Poisson regression is an appropriate approach for analyzing rare events when subjects are followed for a variable length of time, but when used with binary data, the error for the estimated relative risk will be overestimated. Modified Poisson regression with a robust error variance procedure (known as sandwich estimation) fixes this problem. Logistic regression is the appropriate method to get the ROC and AUC. Hence, all the modeling techniques appear to be appropriately used.
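The relationship between odds ratios and relative risks described above can be illustrated with a short numerical sketch. The outcome probabilities below are hypothetical values chosen only to contrast a rare outcome with a common one; they are not taken from the study.

```python
def relative_risk(p_exposed, p_unexposed):
    """Ratio of outcome probabilities between two groups."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed, p_unexposed):
    """Ratio of outcome odds, p / (1 - p), between two groups."""
    odds_exposed = p_exposed / (1 - p_exposed)
    odds_unexposed = p_unexposed / (1 - p_unexposed)
    return odds_exposed / odds_unexposed


# Hypothetical outcome probabilities (exposed, unexposed); the RR is 2 in both cases.
scenarios = {
    "rare outcome": (0.02, 0.01),
    "common outcome": (0.60, 0.30),
}

for label, (p1, p0) in scenarios.items():
    rr = relative_risk(p1, p0)
    or_ = odds_ratio(p1, p0)
    print(f"{label}: RR = {rr:.2f}, OR = {or_:.2f}")
```

With a rare outcome the two measures nearly coincide, whereas with a common outcome the odds ratio (3.5 in this illustration) substantially overstates a relative risk of 2. This is the reason a method that estimates relative risk directly is preferable when the outcome is frequent.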
The primary weakness of the study is the lack of a final, unifying predictive model. The authors call the paper a clinical prediction rule, but the findings are presented as the reduction in relative risk based on which category of a predictor variable a patient is in. There is, however, no overall model presented that can be used to compute a probability of a positive outcome. A logistic regression model apparently was developed to get the ROC and AUC, but the coefficients are not presented, only the individual relative risks (RRs). Unfortunately, the authors do not discuss how a clinician may combine the various RRs for the predictors in the final model.
Recommendation Regarding Impact on Clinical Practice
The study does an admirable job of performing a new analysis of existing data, with the goal being to address limitations and discrepancies in earlier study investigations. It discusses a list of significant predictors, along with individual relative risks, for various groups of patients. However, it does not provide a clinical prediction rule. Furthermore, the study does not support nonsurgical treatment for patients even if negative correlative factors are present. As such, we agree that there is sufficient evidence to justify a strong recommendation for advocating operative treatment for individuals with CSM. However, we only can provide a weak recommendation for the use of the identified risk factors to predict poorer outcomes after surgery.
Lumbar discectomy has been shown to be a useful treatment for those patients with radiculopathy that has been refractory to conservative measures. The reported incidence of reoperation after lumbar microdiscectomy is variable. Moreover, the specific indication for reoperation—recurrent herniation at the index level, complication, new condition—has not been extensively examined. This current study seeks to add clarity to both the rate of reoperation and risk factors which might be predictive of the need for further surgical intervention.
The authors present combined data from the randomized and observational arms of the Spine Outcomes Research Trial (SPORT) study. Of the 810 patients for whom 8-year data were available, 119 patients (15%) underwent reoperation of any sort within 8 years. They examine both the specific indications for reoperation and the time points at which reoperation was performed. The reason for reoperation was recurrent disk herniation in 62% (86% at the index level, 11% different level, 3% unspecified), complication in 25%, and a new condition in 11%. The incremental annual risk of surgery for recurrent disk herniation declined with time—4% in the first year and a fairly steady 0.5% to 1% per year thereafter. The incremental risk of reoperation for any reason also declined with time with annual risk of 6% in the first year, 2% in the second year, and approximately 1% per year thereafter.
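As a rough consistency check on the any-cause figures above, the annual incremental risks can be accumulated over the 8 years of follow-up. This is a sketch under two assumptions that are not stated in the source: that the annual figures are additive increments expressed as a percentage of the original cohort, and that "approximately 1%" is exactly 1% in each of years 3 through 8.

```python
from itertools import accumulate

# Any-cause annual incremental reoperation risk, in percent of the original cohort.
# Years 1 and 2 are from the text; years 3-8 are assumed to be exactly 1% each.
annual_risk_pct = [6, 2] + [1] * 6

# Running total of risk at the end of each follow-up year.
cumulative_pct = list(accumulate(annual_risk_pct))

for year, total in enumerate(cumulative_pct, start=1):
    print(f"year {year}: cumulative reoperation risk ~{total}%")

# Share of all reoperations that fall within the first 2 years.
share_first_two_years = 100 * cumulative_pct[1] / cumulative_pct[-1]
print(f"share within 2 years: ~{share_first_two_years:.0f}%")
```

Under these assumptions the schedule accumulates to about 14% at 8 years, in line with the reported 15%, with somewhat more than half of the reoperations falling in the first 2 years.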
Of all of the index microdiskectomy procedures, 57% were performed at L5-S1, 39% at L4–5, 3% at L3–4, and 2% at L2–3. Neither the index herniation level nor the type of herniation (sequestered, extruded, protruding, or posterolateral) affected the risk of reoperation. The only factors associated with the likelihood of reoperation were age (patients undergoing reoperation were modestly younger) and presence of asymmetric motor weakness (less likely to undergo reoperation). There is no explanation offered for this unusual finding that asymmetric motor weakness was less common in the reoperation group. Patients in the reoperation group had a slightly, but statistically significantly, longer hospital length of stay and were more likely to have postoperative infection and total operative complication. Other patient variables were not found to statistically influence risk of reoperation; these included smoking, diabetes, obesity, depression, workers' compensation status, and work lift demand.
As might be expected, patient satisfaction and other patient-reported outcome measures in the reoperation group were inferior to those reported in the group not undergoing reoperation. Together with the finding that patients who perceived their symptoms to be worsening at enrollment had a higher likelihood of undergoing reoperation, this underscores that patient expectations and perceptions may have been heterogeneous in this population. This is an important consideration in the decision for reoperation and introduces a significant source of variability. Additionally, there is the potential that the randomized arm and observational arm of the SPORT study do not represent populations with uniform expectations. Although 245 (49%) of the 501 patients enrolled in the randomized cohort were assigned to surgery, only 148 (60%) underwent surgery; conversely, of the 256 patients assigned to non-surgical treatment, 122 (48%) underwent surgery. The details of this considerable cross-over are not examined or incorporated into the analysis; instead, data for surgical patients are aggregated. In the observational arm, 521 (70%) of 743 patients chose surgery and 494 (94%) went on to have surgery; of the 222 (30%) who chose non-surgical treatment, 56 (25%) went on to have surgery. Factors influencing the initial willingness to be randomized and the motivators for cross-over may have implications for subsequent satisfaction with treatment and willingness/interest to pursue reoperation.
Reoperation after lumbar discectomy is relatively common so having further understanding regarding possible risk factors which might contribute to the rate of reoperation is desirable. It follows then that there is sufficient need to justify the present study. These authors clearly stated their purpose in that they sought to determine which patient baseline characteristics were risk factors for reoperation in patients treated surgically for intervertebral disc herniation and further to compare outcomes among patients who underwent reoperation with those who did not. Leven et al hypothesized that patient and clinical characteristics are risk factors for reoperation and that patient outcomes are more favorable in those who did not undergo reoperation.
The SPORT study subjects form the inception cohort for this prognostic evaluation. This study made use of the 8-year follow-up data from the SPORT study; it was a retrospective review of prospectively collected data. Patients from both the randomized cohort and observational cohort who underwent surgery were combined and formed the basis of this analysis.
The authors described the patient population in considerable detail. A total of 13 institutions from 11 states in the USA contributed patients who were treated with surgery for lumbar disc herniation. Inclusion and exclusion criteria were specified and replicable. Enrollment spanned 4 years and concluded in November 2004. The study interventions, both surgical and nonsurgical care, were clearly stated. Commonly accepted patient demographic variables, operative data, and patient self-reported outcomes were measured. The authors presented a patient flow diagram from the 8-year follow-up SPORT study. Of the 1244 patients, 810 (65%) had some data available for 8-year follow-up. The authors report far too many P values without adjustment for multiple comparisons and they used Cox regression to identify variables predictive of reoperation.
The rate of reoperation in the available cohort at 8 years was 15%. The majority (55%) of patients undergoing reoperation did so within the first 2 years after the index surgery. Of those who both underwent reoperation and had complete surgical data (104/119) the surgery was performed at the same level as the index surgery in 71%. With respect to the primary aim of this study, the predictors of reoperation were age and motor weakness: older patients and the presence of motor weakness were associated with a decreased risk of reoperation. The 8-year patient-reported outcomes data (SF-36, ODI, and sciatica bothersomeness) were all better in the group that did not undergo reoperation.
The authors conclude that a younger age is predictive of reoperation. The mean age difference in this study was 2.7 years and although this difference is statistically significant, it may not be clinically relevant. The authors speculate that motor weakness, which can often be painless (i.e., a painless foot drop) and does not respond to the index surgery, is less likely to be a trigger for reoperation as compared with persistent/recurrent radicular pain; it would follow then that those with persisting motor weakness alone may have a lower reoperation rate.
Recommendation Regarding Impact on Clinical Practice
Leven et al have used 8-year SPORT data with a 35% loss to follow-up to identify risk factors for reoperation in patients treated surgically for lumbar disc herniation. The substantial loss to follow-up calls into question the results of this analysis. Younger patients may be at increased risk of reoperation although the observed mean age difference was small. Those patients with asymmetrical motor weakness were noted to have a lower reoperation rate. The 15% reoperation rate, with the majority occurring in the first 2 years, is important information; however, based on this study no changes to clinical practice are recommended.
Arthrodesis remains one of the most common treatments for a wide range of spinal disorders including fractures because it may confer immediate stability, allow for correction of deformities, and minimize the potential for further neurologic injury. However, there are ongoing concerns about the deleterious biomechanical effects of spinal fusion because of the observation that spinal segments contiguous to the construct frequently exhibit progressive spondylosis. The exact etiology of “adjacent level degeneration” remains a matter of some controversy; although it has been proposed that this process may occur because of the greater forces that these segments are subjected to as a consequence of an arthrodesis, it is also possible that these changes may simply represent the natural history of spondylosis.[7–12] Wood et al recently published the results of a prospective randomized trial assessing the long-term outcomes of patients with stable thoracolumbar burst fractures and normal neurologic function which suggested that nonsurgical care may give rise to less pain and better function than fusion over time. Nevertheless, the incidence of adjacent segment degeneration and symptomatic disease necessitating operative intervention associated with thoracolumbar fractures has yet to be definitively established. In their study, D’Oro et al attempted to quantify the risks of developing disc degeneration as well as the rates of subsequent fusion by evaluating a large series of thoracic and lumbar fractures which were initially managed either conservatively or with surgery.
Using a large clinical database comprised primarily of medical records collected from Medicare and United Healthcare, the authors utilized International Classification of Diseases, Ninth Edition (ICD-9) codes to identify all patients who were noted to have fractures involving either the thoracic or lumbar spines in 2007. These subjects were further subdivided according to whether they were managed nonoperatively or with a fusion within 90 days of presentation. For all of these groups, ICD-9 codes were also employed to determine how many of these individuals were found to have developed disc degeneration at 1, 2, and 3 years after their initial treatment; an analysis of current procedural terminology (CPT) and ICD-9 procedural codes was performed to ascertain how many of these patients elected to proceed with an operation related to disc degeneration within 1 year after this diagnosis had been established.
Of the 3699 patients diagnosed with thoracic fractures, 3215 (86.9%) were managed nonoperatively, 117 (3.2%) underwent fusion, and the remaining subjects were treated with a surgical procedure other than an arthrodesis. In the nonoperative subgroup, 147 individuals (4.6%) were assigned the ICD-9 code for thoracic disc degeneration within 3 years and 11 required surgery for symptomatic disease. Because of issues related to privacy, the authors were only able to determine that between 1 and 11 individuals in the fusion cohort (0.9%–8.5%) had been diagnosed with thoracic disc degeneration but there were no surgical procedures performed for this indication; there were no statistically significant differences noted between the values recorded for these two groups (P > 0.05).
A similar analysis of this database yielded 5016 cases of lumbar fractures of which 4371 (87.1%) were treated conservatively and 150 (3.0%) were immediately stabilized with an arthrodesis. Within 3 years, the percentage of patients in the fusion cohort who had been diagnosed with lumbar disc degeneration was significantly higher than that observed for those whose fractures were managed without surgery (23.3% vs. 11.5%, respectively; P < 0.05). However, only 42 individuals (1.0%) in the nonoperative subgroup and no more than 11 (0.7%–6.7%) of the subjects who had initially been fused underwent subsequent surgical intervention for disc degeneration, a difference which was not found to be statistically significant (P > 0.05).
Based upon these findings, the authors concluded that fusions for thoracic fractures do not appear to increase the incidence of disc degeneration or the likelihood of further surgery for symptomatic disease. In contrast, patients with lumbar fractures definitively managed with an arthrodesis may exhibit a greater predisposition for the development of adjacent level degeneration which could potentially necessitate additional operative procedures for this condition.
In this investigation, the ICD-9 codes for thoracic or lumbar fractures were utilized to identify the study cohorts although individuals with a preexisting diagnosis of degenerative disc disease present at the time of the initial traumatic event were excluded from this analysis; this strategy was also employed to determine the incidences of adjacent segment degeneration within 3 years. Likewise, the treatment of these injuries (i.e., nonoperative vs. arthrodesis) within the first 90 days as well as the percentages of subjects who underwent additional surgical intervention for disc degeneration during the follow-up period were assessed by searching for various CPT codes specific for fusion. Given the dichotomous nature of these variables, the chi-square test was used appropriately to evaluate for any statistically significant differences between the two groups.
As is typical of these types of administrative databases, many details of each particular diagnosis (e.g., level of injury, fracture morphology) and operative procedure (e.g., number of segments fused, bone graft material) were not available and, therefore, not considered in this analysis. Given this relative lack of resolution on an individual basis as well as an inability to control for how individual coding is performed, it is difficult to reliably corroborate the homogeneity of these cohorts. For example, there may be a strong impetus to continue to report a “trauma” code after surgery which is an error that is essentially impossible to take into account. Thus, while the methodology of this study is generally sound, there are still multiple sources of bias which could not fully be controlled for by the authors.
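As a rough illustration of the chi-square comparison described above, the following Python sketch applies the test to the 3-year lumbar disc degeneration rates reported earlier (23.3% of 150 fused patients vs. 11.5% of 4371 nonoperative patients). The cell counts (35 and 503) are back-calculated from those percentages and are assumptions, not values reported by the authors; the function name is hypothetical, and the calculation omits any continuity correction.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Assumed counts reconstructed from the reported percentages (not from the paper).
fusion_total, fusion_degen = 150, 35    # 35 / 150 = 23.3%
nonop_total, nonop_degen = 4371, 503    # 503 / 4371 = 11.5%

chi2, p = chi_square_2x2(
    fusion_degen, fusion_total - fusion_degen,
    nonop_degen, nonop_total - nonop_degen,
)
print(f"chi-square = {chi2:.2f}, P = {p:.2g}")
```

Under these assumed counts the statistic is well above 3.84, the critical value for one degree of freedom at the 0.05 level, which is consistent with the reported P < 0.05.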
Recommendation Regarding Impact on Clinical Practice
The results of this retrospective review of clinical data compiled in a large database suggest that although fusion surgery for thoracic fractures does not appear to hasten the development of degenerative disc disease, patients undergoing these procedures as a treatment for fractures of the lumbar spine are more likely to be diagnosed with adjacent segment degeneration which when symptomatic could necessitate operative intervention. Nevertheless, the applicability of these findings is clearly diminished by the inherent limitations of this investigation so we do not recommend any changes in clinical practice.
With recent advances in operative techniques, the surgical treatment of adult spinal deformities (ASD) has become more feasible and these procedures are being performed with increasing frequency. Given the relative complexity of the pathology and the medical comorbidities characteristically exhibited by this patient population, the correction and stabilization of ASD is known to give rise to a wide range of complications including neurologic deterioration. Nevertheless, there continues to be a paucity of high-quality clinical studies elucidating the safety and efficacy of operative intervention for ASD using validated outcome measures. To this end, Lenke et al recently published the results of a prospective, multicenter investigation assessing the motor function of a large series of individuals undergoing surgery for complex ASD.
The authors present findings derived from a subset of data prospectively collected as part of the Scoli-RISK-1 Trial which is an ongoing international, multicenter, observational study. This analysis included a cohort of 272 adult subjects who were enrolled by 43 surgeons from 15 centers across the world over a period of approximately 1 year. All of these patients underwent surgical correction for a diagnosis of “complex ASD” (apex between C7 and L2) which involved a major Cobb angle ≥80° in the coronal or sagittal planes, corrective osteotomies, or reconstructions requiring concomitant decompression for myelopathy or ossified ligamentous structures (i.e., ligamentum flavum, posterior longitudinal ligament). The primary outcome instrument utilized for this study was the American Spinal Injury Association Lower Extremity Motor Score (ASIA LEMS) which assesses the strength of five muscle groups (hip flexors, quadriceps, anterior tibialis, extensor hallucis longus, and gastrocnemius/soleus) on a scale of 0 (no function) to 5 (normal) with a maximum value of 50 points (25 for each leg). Each patient was evaluated by an ASIA-certified examiner within 6 weeks before surgery, prior to hospital discharge, and then at 6 weeks and 6 months postoperatively in order to identify any changes in motor function over time which were classified as “maintenance,” “improvement,” or “decline.”
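To make the scoring scheme concrete, here is a minimal Python sketch of how a LEMS total and the three change categories could be computed. The function names, the dictionary layout, and the rule that any difference in total score counts as a change are illustrative assumptions; the classification procedure actually used in the study is not detailed here.

```python
MUSCLE_GROUPS = (
    "hip flexors",
    "quadriceps",
    "anterior tibialis",
    "extensor hallucis longus",
    "gastrocnemius/soleus",
)

def lems_total(left, right):
    """Sum the 0-5 grades for five muscle groups on each leg (maximum 50)."""
    total = 0
    for side in (left, right):
        for muscle in MUSCLE_GROUPS:
            grade = side[muscle]
            if not 0 <= grade <= 5:
                raise ValueError(f"grade for {muscle} must be between 0 and 5")
            total += grade
    return total

def classify_change(baseline, follow_up):
    """Label a follow-up LEMS relative to the preoperative value (assumed rule)."""
    if follow_up > baseline:
        return "improvement"
    if follow_up < baseline:
        return "decline"
    return "maintenance"

# Hypothetical patient: normal strength before surgery, weaker left hip flexors after.
normal_leg = {muscle: 5 for muscle in MUSCLE_GROUPS}
weak_leg = dict(normal_leg, **{"hip flexors": 3})

preop = lems_total(normal_leg, normal_leg)
discharge = lems_total(weak_leg, normal_leg)
print(preop, discharge, classify_change(preop, discharge))
```

For this hypothetical patient the preoperative total is 50 and the discharge total is 48, so the change is labeled a decline.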
For the purpose of this analysis, subjects were stratified according to their preoperative neurologic status: individuals with normal function (Preop NML, N = 204) and those with preexisting motor deficits (Preop ABNL, N = 68). At the time of hospital discharge, the LEMS were lower for 22.18% of these patients and higher for 12.78%; however, these values were significantly different at 6 weeks compared with hospital discharge (17.9% and 16.42%, respectively; P = 0.0042). By 6 months, 10.82% of subjects had declined, 20.52% were improved, and 68.66% exhibited maintenance which reflected a significant change relative to the LEMS recorded at 6 weeks (P = 0.0011) and hospital discharge (P = 0.0001). Overall, the LEMS of the Preop NML group were significantly reduced at all three follow-up time points whereas the motor function of the Preop ABNL cohort was found to be significantly better at 6 months. Based upon these results, the authors concluded that neurologic deterioration after operative correction of complex ASD may be more prevalent than previously reported with over 20% demonstrating weakness in the perioperative period in this series. Nevertheless, in many cases this motor dysfunction appears to be transient and a large percentage of these patients may be expected to regain their strength within 6 months after surgery.
There are a number of methodologic limitations to this analysis, most of which are acknowledged by the authors. While this series was comprised of a large number of patients initially identified by 43 investigators at 15 international ASD centers, it should be noted that over two-thirds of them were recruited by the eight highest-enrolling surgeons and 18 of the practitioners only contributed a single subject. In addition, the diagnostic and treatment criteria utilized for this study were somewhat heterogeneous which allowed for the inclusion of individuals undergoing more complex ASD operations which are inherently associated with a greater risk of neurologic injury. Thus, the applicability of these results to all patients with ASD remains uncertain. There is also potential for a subjective bias because of inconsistencies in the manner in which the LEMS assessments were performed (e.g., different ASIA-certified examiner at the various time points, time of day, administration of pain medications, etc.).
The statistical methodology employed by the investigators is generally sound in that the categorical data derived from ASIA scoring were analyzed with the Fisher exact test whereas the paired t test and analysis of variance (ANOVA) for continuous variables were used to compare the LEMS scores. One issue is that the LEMS is an additive value (i.e., a maximum of 5 points per myotome, 25 points per side, and 50 points in total) so it is not really a continuous variable with a normal distribution, which is a presumption of the ANOVA tests. Nevertheless, the primary results were reported by “categorizing” the groups into normal and abnormal motor score groups so that the Fisher exact test would be an appropriate statistical tool for this purpose.
Recommendation Regarding Impact on Clinical Practice
This study reports the findings of a large prospective, multicenter, international, observational study which elucidate the changes in motor function that may occur after surgical correction of ASD. Although this prognostic information would certainly be of interest to surgeons as they counsel their patients who are undergoing these types of procedures, it is unclear whether these results would actually significantly alter the operative decision-making process such as indications, technique, etc. Consequently, we feel that this investigation does not warrant any changes in clinical practice.