Several randomised clinical trials have clearly demonstrated that the immunomodulating drugs human recombinant interferon beta (IFN-β) and glatiramer acetate (GA) are more effective than placebo in the treatment of relapsing-remitting multiple sclerosis (RRMS).1 These drugs are now the approved treatment for RRMS, and clinical practice treatment guidelines have been issued. After many years of effective and safe treatment of RRMS with immunomodulating drugs, several trials have compared the efficacy of the various drugs. However, the results from these trials have been conflicting.
Prospective Randomised Trials
The Independent Comparison of Interferon (INCOMIN) trial is the first prospective, randomised, directly comparative trial of two IFNs: 250μg IFN-β-1b once every other day (qod) and 30μg IFN-β-1a once weekly (qw).2 It involved 15 MS centres and 188 patients with a two-year prospective follow-up and showed that IFN-β-1b qod has greater clinical and magnetic resonance imaging (MRI) efficacy than IFN-β-1a qw. In particular, IFN-β-1b reduced the risk of disease progression to less than half that of patients treated with IFN-β-1a qw.
The Evidence for Interferon Dose Effect: European-North American Comparative Efficacy (EVIDENCE) trial is a prospective randomised trial comparing two regimens of IFN-β-1a: 44μg three times weekly (tiw) and 30μg qw.3 It involved 677 patients who were followed up for one year. Again, both clinical and MRI effects favoured the multiple weekly high-dose administration protocol.
In conclusion, both the INCOMIN and EVIDENCE trials confirmed the American Academy of Neurology (AAN) clinical practice treatment guidelines for MS, stating that there is a dose-response curve associated with the effect of IFN-β in the treatment of MS.1
IFN treatment requires multiple weekly parenteral administrations for an as yet undetermined time period. Some patients may find it hard to cope with such a treatment regimen in the long term and might ask to reduce the dose or the frequency of administrations. The Dose Reduction study was aimed at trying to identify the minimum effective dose and frequency of administration of IFN-β.4 Patients on chronic 250μg IFN-β-1b qod who were doing very well (i.e. no relapses or disease progression for at least three years, and no signs of disease activity in two consecutive MRI scans) were randomised to either continue on 250μg IFN-β-1b qod or be gradually switched to intramuscular (IM) 30μg IFN-β-1a qw. Patients were followed up for one year and a resumption of the clinical and MRI signs of disease activity was observed in the group of patients who reduced the dose of IFN to 30μg IFN-β-1a qw.
This prospective randomised trial showed that IFN-β-1b is a chronic treatment that should be continued at high dose and with frequent weekly administration. A reduction in the administered weekly dose of IFN-β-1b is not only inadvisable but can also be dangerous – even in patients with prolonged absence of clinical and MRI signs of disease activity.
Observational Non-randomised Trials
The Detroit study, a prospective observational study of 122 patients, confirmed a greater efficacy of IFN-β-1b qod and GA, which reduced the relapse rate relative to an untreated control group more than IFN-β-1a qw did (see Figure 1).5
The University of Bari study, a large retrospective observational comparative study performed in 15 MS centres and involving over 1,000 patients, did not show any significant difference in the reduction in relapse rate from baseline between three different IFN-βs (250μg IFN-β-1b qod; 22μg IFN-β-1a tiw; 30μg IFN-β-1a qw) (see Figure 2).6
The Berlin study, a single-centre retrospective observational study comparing IFN-β-1b qod, IFN-β-1a qw, IFN-β-1a tiw and GA, showed a greater efficacy of GA than IFN-βs, and no significant differences between the various IFN-βs (see Figure 3).7
The Quality Assessment in MS Therapy (QUASIMS) study collected the largest cohort of treated MS patients to compare all available IFN-βs.8 It involved 4,754 patients from 510 centres in three countries (Germany, Austria and Switzerland) who were followed up for two years. It is, again, a retrospective observational non-randomised study. It did not show any significant difference in the reduction in relapse rate, Expanded Disability Status Scale (EDSS) change, or percentages of relapse-free or progression-free patients between the three IFN-βs (IFN-β-1b qod, IFN-β-1a qw and IFN-β-1a tiw). The only difference was a greater number of relapse-free patients at two years in the IFN-β-1a qw group compared with the 44μg IFN-β-1a tiw group. The difference was highly significant (p<0.008) and exactly contrary to the results of the EVIDENCE trial.
The Reason Behind Different Results
The INCOMIN, Dose Reduction and EVIDENCE trials were prospective randomised trials; the other ones were all observational, i.e. non-randomised, trials. The International Conference on Harmonisation (ICH) working group has drawn up a set of guidelines that cover all aspects of the conduct of clinical trials.9 A group of scientists, statisticians and editors of various fields of medicine developed the Consolidated Standards of Reporting Trials (CONSORT) statement to improve the quality of reporting clinical trials.10 Below, the authors describe the most important CONSORT guidelines.
These guidelines point to randomisation as a crucial and unavoidable step. Many confounding factors might affect trial results; the only way to ensure that confounding factors are evenly distributed among treatment arms is randomisation. Observational studies are subject to systematic errors because their design cannot control for confounding factors; known confounders can, however, be partially controlled for by introducing stratification, logistic regression analysis, and so on. Often, however, the confounding factors cannot be identified. In a disease such as MS, where there are so many prognostic factors (clinical, demographic and biological), most of them unknown, randomisation is the only means of controlling for everything, particularly the unknown factors. The results of observational studies must therefore be evaluated very cautiously. Besides the MS studies described above, there are many other examples in the medical literature where treatments or procedures validated by observational studies have been completely contradicted by randomised studies. For example, observational studies showed that blood vitamin C level correlated with a reduction in the incidence of cardiovascular events,11 while a subsequent randomised trial demonstrated that vitamin C supplements fail to change the incidence of cardiovascular events.12 Similarly, observational studies showed that the use of hormone replacement therapy (HRT) was associated with a reduction in cardiovascular disease (CVD) incidence,13 while a subsequent randomised trial showed that supplementation with HRT was associated with an increase in CVD incidence.14
Allocation concealment is also extremely important: it requires central randomisation performed by a co-ordinating centre whose personnel are unaware of patients’ demographic and clinical characteristics. In observational studies without allocation concealment, the investigating physician is more likely to assign more severely affected patients to the treatment he/she believes to be more effective. The baseline co-variates that differentiate the two treatment groups can then be carried forward and minimise or reverse treatment differences.
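The effect of unconcealed allocation can be illustrated with a minimal simulation sketch. Everything in it is a hypothetical assumption made for illustration, not data from any of the trials discussed: the function name `simulate`, the 50% share of severe patients, the 80%/20% physician preference and the relapse risks of 1.5 and 0.5 per year. Both drugs are deliberately given identical efficacy, so any difference between arms comes from allocation alone.

```python
import random


def simulate(allocation, n=20000, seed=0):
    """Return mean expected annual relapses in arms A and B.

    Both drugs are given the same (null) effect, so any gap between
    the arms is created purely by how patients were allocated.
    """
    rng = random.Random(seed)
    total = {"A": 0.0, "B": 0.0}
    count = {"A": 0, "B": 0}
    for _ in range(n):
        severe = rng.random() < 0.5  # hidden prognostic factor (assumed)
        if allocation == "randomised":
            p_a = 0.5  # central randomisation ignores severity
        else:
            # physician's choice: severe patients mostly receive drug A
            p_a = 0.8 if severe else 0.2
        arm = "A" if rng.random() < p_a else "B"
        # hypothetical relapse risk depends on severity only, not on the drug
        total[arm] += 1.5 if severe else 0.5
        count[arm] += 1
    return total["A"] / count["A"], total["B"] / count["B"]


for mode in ("randomised", "physician"):
    a, b = simulate(mode)
    print(f"{mode:>10}: arm A {a:.2f}, arm B {b:.2f}")
```

Under these assumptions the randomised arms end up with nearly identical relapse risk, whereas physician-driven allocation makes drug A look clearly worse even though the two drugs are equivalent by construction.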
A meta-analysis of several systematic reviews of hundreds of trials in different fields of medicine showed that trials with inadequate allocation concealment yielded estimates of treatment results exaggerated on average by 30% compared with trials using adequate methods to conceal treatment allocation.15
Blinding is obviously an important step, although a double-blind design does not carry the same weight in the final estimate of treatment results as careful allocation concealment and randomisation.15 Inadequate allocation concealment yields estimates of treatment results exaggerated by 30% on average, while trials without double-blinding yield estimates of treatment effects exaggerated by 14% on average compared with double-blinded trials.
In addition, blinding is usually unreliable in IFN trials because of the well-known IFN side effects. Trials attempt to overcome this by using two different physicians in each centre: one treating physician who is informed about the treatment and one evaluating physician who is totally blinded. Nevertheless, when the reliability of blinding is assessed by means of a questionnaire completed by patients and physicians at the end of the trial, most patients and many evaluating physicians guess the nature of the treatment correctly.16
Intention to treat (ITT) means that all enrolled patients must be evaluated in the final analysis. The drop-outs – i.e. patients who are completely lost to follow-up or lack information for part of the follow-up – are assumed to be bad outcomes; the withdrawals – i.e. patients who discontinued treatment but remained in follow-up – must be evaluated with the final result they had, in spite of the fact that they stopped treatment.
In observational studies, patients who are lost to follow-up are excluded and the analyses are performed on the so-called completers – i.e. the patients who remained on a certain treatment for a certain period of time. Thus, patients dissatisfied with their therapy for any reason (including lack of efficacy) are excluded.
ITT assumes that drop-outs and withdrawals occurred by chance and are evenly distributed between treatment arms. If they did not occur by chance (for example, a patient’s perception of lack of efficacy of a poorly effective drug is likely to increase the drop-out number in the arm of that drug), they can obviously affect the final result.
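The contrast between a completers analysis and an ITT analysis can be shown with a small worked example. The sketch below uses invented counts for two hypothetical arms, X and Y, and simplifies the ITT rule described above by treating every patient who left the study as a drop-out counted as a bad outcome; the `Arm` class and its field names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    enrolled: int
    dropped_out: int              # lost to follow-up
    relapse_free_completers: int  # completers with a good outcome

    def completers_rate(self):
        """Proportion relapse-free among patients who stayed on treatment."""
        return self.relapse_free_completers / (self.enrolled - self.dropped_out)

    def itt_rate(self):
        """Proportion relapse-free among all enrolled; drop-outs count as bad outcomes."""
        return self.relapse_free_completers / self.enrolled


# Hypothetical arms: drug X loses many dissatisfied patients, drug Y few.
arm_x = Arm(enrolled=100, dropped_out=20, relapse_free_completers=60)
arm_y = Arm(enrolled=100, dropped_out=5, relapse_free_completers=65)

for name, arm in (("X", arm_x), ("Y", arm_y)):
    print(f"drug {name}: completers {arm.completers_rate():.1%}, ITT {arm.itt_rate():.1%}")
```

With these invented numbers drug X looks better among completers (60 of 80 versus 65 of 95), but worse under ITT (60 of 100 versus 65 of 100), because the patients who abandoned drug X have been silently removed from the completers analysis.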
Outcomes must be clearly specified before a trial is started, and not all outcomes are alike. Relapse (exacerbation) rate is the most commonly used clinical outcome measure in RRMS clinical trials. Relapse rate is, however, sensitive to the relapse counts of patients at the extremes of the distribution: many relapses occurring in a few patients may substantially change the overall population relapse rate. If the proportion of relapse-free patients is used as the outcome, as in the more recent MS trials, each patient with many relapses is counted as one clinically active patient, just like those with one or a few relapses, so all active patients have the same proportional influence on the final count of patients with or without relapses.
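A small numerical sketch makes the difference between the two outcome measures concrete. The relapse counts below are invented for illustration, and the helper name `relapse_free_proportion` is an assumption of this example.

```python
from statistics import mean


def relapse_free_proportion(counts):
    """Fraction of patients with zero relapses."""
    return sum(1 for c in counts if c == 0) / len(counts)


# Hypothetical cohort of 10 patients: relapse counts over the follow-up period.
typical = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
# Same cohort, except that one active patient has 9 relapses instead of 1.
with_outlier = [0, 0, 0, 0, 0, 0, 1, 1, 1, 9]

for label, counts in (("typical", typical), ("with outlier", with_outlier)):
    print(f"{label:>12}: mean relapses {mean(counts):.1f}, "
          f"relapse-free {relapse_free_proportion(counts):.0%}")
```

In this example a single patient triples the mean relapse count (from 0.4 to 1.2), while the proportion of relapse-free patients stays at 60%.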
Randomised trials are the gold standard for evaluating treatment efficacy. The only randomised comparative trials in RRMS are the INCOMIN, EVIDENCE and Dose Reduction trials.2–4
Non-randomised large-scale observational studies that analyse large patient samples yield results with a very high level of statistical significance. The QUASIMS study,8 for example, showed IFN-β-1a qw to be more effective (for a certain end-point) than IFN-β-1a tiw, with a very high statistical significance (p<0.008) owing to the huge sample size. Results such as this give the false impression of a high-quality study. However, lacking randomisation, a large observational study carries over and amplifies all the bias and confounding factors that the study design cannot control for.
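How sample size alone drives statistical significance can be sketched with a standard pooled two-proportion z-test. The 50% and 54% relapse-free proportions and the arm sizes are invented for illustration and are not taken from QUASIMS or any other study; the point is only that a fixed small difference, which in an observational study could be entirely due to bias, becomes "highly significant" once the sample is large enough.

```python
import math


def two_sided_p(p1, p2, n_per_arm):
    """Two-sided p-value of a pooled two-proportion z-test with equal arm sizes."""
    pooled = (p1 + p2) / 2  # equal arms, so the pooled proportion is the mean
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
    z = abs(p1 - p2) / se
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal
    return math.erfc(z / math.sqrt(2))


# The same hypothetical 4-point difference at increasing sample sizes.
for n in (100, 2000, 5000):
    print(f"n per arm = {n:>5}: p = {two_sided_p(0.50, 0.54, n):.5f}")
```

The p-value measures only how unlikely the observed difference is under chance; it says nothing about whether the difference was produced by the treatment or by unbalanced confounders.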
There is international consensus that the results of observational studies cannot overturn those of randomised studies. Without taking into consideration the main sources of bias, observational studies may simply be producing tight confidence intervals around spurious results.