We found excellent clinical results for both treatment modalities at 3 months, which were sustained at 2 years. There was no significant difference between arthroplasty and fusion at any of the follow-up times. However, statistical analyses using linear mixed models that adjust for baseline values, dropout and missing data showed a difference in self-rated neck disability and the numeric rating score for arm pain in favor of fusion after 2 years.
This is not consistent with most randomized controlled trials [6, 7, 8, 9, 10, 11, 12], the recent study on available registry data by Staub and colleagues [29], and three recent meta-analyses [15, 16, 17] reporting clinical outcome in favor of arthroplasty.
The between-group difference in NDI score of 5.9% shown in the present study is small and its statistical significance is weak, so the results must be interpreted with caution. One might argue that the difference should not be considered clinically important, but there is no clear consensus on how large the between-group difference should be [30, 31]. An NDI change of 10 or more from baseline to 2-year follow-up was reported by 78.3% of the fusion group and 70.0% of the arthroplasty group. Even though this difference was not statistically significant, its direction did not favor arthroplasty. There may be several reasons for the discrepancy with previous studies, such as differences in implant design, study methods, fusion technique, and length of follow-up, and the impact of funding by arthroplasty manufacturers.
Different arthroplasty designs have shown different biomechanical performance in the treatment of single-level cervical disc disease [32]. Arthroplasty devices are considered constrained in certain planes if they restrict motion to less than that seen physiologically. The usual designs are, however, “semiconstrained”, which allow for physiological movement, or “nonconstrained”, where there is no mechanical stop and extremes of motion are prevented by the perispinal soft tissue and inherent compression across the disc space [33]. The nonconstrained device used in the present trial is comparable in this respect with the Bryan device (Medtronic Spine and Biologics) [7, 8, 10] and the Porous Coated Motion (PCM) device (NuVasive Inc., San Diego, CA, USA) [11]. The Prestige ST (Medtronic Sofamor Danek) [6, 9] differs from the present study implant in its semiconstrained design and in its implantation technique, where the device is fixed with screws to the vertebrae cranial and caudal to the disc space. In addition to the degree of constraint, implants may also differ in the design of their articulating surfaces. The ball and socket design of the device used in the present trial has a different impact on range of motion (ROM) compared with the Bryan and PCM devices, and the adjacent level intradiscal pressure has been shown to differ according to implant design [32].
The study methods of the present trial also differ from those of the previously mentioned studies, of which only two describe blinding of the participating patients [7, 11]. However, Heller and colleagues [7] could not continue blinding of patients after completion of the surgical procedure because the arthroplasty group was treated with nonsteroidal anti-inflammatory drugs (NSAIDs) for two weeks after surgery. Phillips and colleagues [11] blinded patients only until the surgical procedure was completed. Blinding of the surgical team until after decompression of the compressed nerve root has rarely been included in previous study designs, but was conducted in the study by Skeppholm and colleagues [14], consistent with the present study methods. Strict study methods are probably important to avoid expectation bias in both patients and surgeons, and may have been a contributing factor to the discrepancy with previous trials.
Another aspect that may influence the outcome is the applied fusion technique. The stand-alone polyetheretherketone (PEEK) cage implant used in the NORCAT differs from the technique in most other comparable trials, where allograft and anterior plating are most commonly used [6, 7, 8, 9, 10, 11]. The reported fusion rates for the two techniques after 2 years are, however, similar, at 97.5% [6], 94.3% [7], and 92.1% [11] for allograft with plating and 92% [34] for stand-alone PEEK cage. Nemoto and colleagues [34] recently compared clinical outcome and complications, including postoperative dysphagia, between stand-alone cage implant and cage with anterior plating in single-level cervical disc disease, and found no difference between the two surgical methods.
The length of follow-up may also have an impact on the clinical outcome, and a longer observation period after surgery is often requested. Time is naturally highly relevant in relation to the impact of adjacent level disease [35]. However, the present study results demonstrate that there is little change in clinical outcome from 3 months up to 2 years after surgery. A longer follow-up probably has little effect on clinical outcome related to the completed surgery, as recently demonstrated by Gornet et al. [36].
Arthroplasty manufacturers are often represented as sponsors of large randomized, controlled trials, as was the case in the present study. Their role in relation to outcome is probably important to include in the overall discussion regarding outcome discrepancy between authors, and was recently discussed by Alvin and colleagues [37]. They assessed whether trials funded by arthroplasty manufacturers had a greater likelihood of reporting results in favor of arthroplasty, and found lower complication rates when a conflict of interest was reported, but no impact on health-related quality of life outcomes.
Critical issues which may explain the discrepancy in clinical outcome between the present study and most previous comparable trials are difficult to point out. The truth, however, may be a combination of physiological and actual differences between the implants, as well as different study designs as discussed above.
The expected clinical outcome is important in the surgical decision-making for individual patients. Differences between surgical techniques are also key factors to consider. In the present trial, patients who underwent arthroplasty had a significantly longer duration of surgery, which corresponds to the results of a newly published meta-analysis [15]. Even though experienced spinal surgeons operated on the patients, all surgeons were more familiar with the fusion procedure, as it was the standard treatment in the departments involved. Thus, level of experience is one possible explanation for the difference in surgery duration. Another possible explanation is that implantation of the specific arthroplasty device is technically more demanding and time consuming. There were no severe complications in the present study, but the reoperation rate differed from previous trials reporting more secondary surgeries with fusion [6, 8, 9]. The difference in index level reoperations could be explained by suboptimal implantation technique or incorrect size of the arthroplasty device. However, all patients who were reoperated had their primary surgery at a time-point when all surgeons had good experience with the particular arthroplasty device. In a recent study using the same implant [38], instability and accompanying neck pain after arthroplasty were found in 8% of patients, all of whom underwent revision surgery.
Consistent with previous reports [6, 7], patients in the arthroplasty group returned to work two weeks earlier than patients in the fusion group, but there was no difference in employment status at 2-year follow-up. A previous study concluded that the duration of preoperative sick leave influenced return to work postoperatively [39]. In the present trial, preoperative sick leave was 3 weeks shorter in the arthroplasty group, but the difference was not significant.
Ament and colleagues recently assessed the cost-effectiveness of 2-level arthroplasty or fusion at 2- and 5-year follow-up. Arthroplasty was more expensive than fusion, but yielded more total quality-adjusted life years, suggesting that it is a highly cost-effective treatment option [40, 41]. Consistent with these results, Zou and colleagues recently presented a meta-analysis on clinical outcome after two-contiguous level cervical disc surgery and concluded that arthroplasty was equivalent, and in some aspects significantly superior, to fusion regarding clinical outcome [42]. Considering the results of the present trial, the growing interest among physicians in arthroplasty as an alternative to fusion, and the high number of surgical procedures performed each year [43], future studies should focus on both clinical outcome and cost-effectiveness analyses.
The role of adjacent level disease was not addressed in the present study since clinical outcome was the only focus of this report. The impact of adjacent level disease will be presented in a forthcoming paper including the NORCAT 5-year follow-up data. Regarding maintenance of mobility, which is the main goal of choosing arthroplasty over fusion, the authors of the present study have recently shown that high-grade heterotopic ossification around the Discover arthroplasty device was found in 62% after 2 years [44].
Limitations
Our study may be criticized for too short a follow-up period. However, the present study shows that there is little change in clinical outcome from 3 months up to 2 years. Similar results at even longer follow-up were recently presented by Staub and colleagues, who, based on registry data, reported a quite stable postoperative course of patient-reported outcomes between 2 and 5 years after both arthroplasty and fusion [29]. Their results also strengthen the external validity of randomized controlled trials comparing cervical arthroplasty and fusion, where a large number of patients often do not meet the inclusion criteria, as was the case in the present trial.
Even though no patients with severe spondylosis should have been included in the NORCAT, the degree of spondylosis, evaluated using radiographic parameters, could have been specified more explicitly in the inclusion/exclusion criteria. Therefore, one cannot exclude the possibility that some patients not meeting the criteria for arthroplasty may have been included, which in turn could have biased the study in favor of the fusion group.