Evaluation of new treatments in clinical trials of multiple sclerosis (MS) requires valid and reliable measures of disability and disease progression. It is also important to monitor clinical outcomes in individual patients to optimize care. There are different kinds of outcome measures, including physician-oriented measures, such as those based on the neurologic examination and quantitative tests of neurologic function, as well as patient-oriented self-report measures.1 Physician-oriented outcomes tend to be more objective than patient self-report measures, whereas quantitative tests of neurologic function are more standardized and reliable than measures based on the neurologic examination. However, physicians tend to be more familiar with the latter measures, whereas the clinical relevance of changes in objective tests of neurologic function is unclear.1
The expanded disability status scale (EDSS)2 is the most widely used measure for assessing disability and progression in MS.1 The EDSS is based on the neurologic examination and measures impairment in eight functional systems, with EDSS scores in steps of 0.5, ranging from zero (normal neurologic examination) to 10 (death).2 The EDSS has been used as a primary or secondary efficacy end-point in clinical trials of disease-modifying therapies (DMTs) in MS.3–12
The main strengths of the EDSS are its familiarity and widespread use, which enable comparisons between different trials. In addition, data have been collected on its reliability and validity.1,13 However, over the past few years, there has been discussion regarding the limitations of the EDSS.1,13 First, the scale is ordinal rather than equal interval, requiring non-parametric analyses. The mean staying time is different at each level of the scale, with longer mean staying times at the upper and lower ends of the scale than at scores of three, four, or five.14 Second, there is subjectivity in determining scores of ambulation, and bowel and bladder dysfunction.1 There is also interexaminer variability in rating functional system scores as mild, moderate, or severe, resulting in lower reliability at low EDSS scores.13 Finally, higher EDSS scores are fully dependent upon ambulatory disability, so new changes in functional system scores do not affect upper-range EDSS scores. Additionally, the EDSS is relatively insensitive to arm function, cognitive function, and fatigue, which are important dimensions of MS.1
Development of the Multiple Sclerosis Functional Composite
Owing to known limitations of the EDSS and the increasing number of clinical trials in MS, the National Multiple Sclerosis Society (NMSS) sponsored a workshop in 1994 to evaluate currently available outcome tools. At this workshop, participants agreed that there was no optimal assessment measure available and recommended the development of a multidimensional assessment tool incorporating multiple clinically independent dimensions of MS, including cognitive function.15 A task force was then appointed to recommend improved clinical outcome measures. This task force published important criteria for MS clinical trial outcome measures16 and conducted a meta-analysis of quantitative measures of arm, leg, cognitive, and visual function from historic data collected in natural-history and clinical studies of MS.17 This led to the development of the multiple sclerosis functional composite (MSFC).18
Components of the Multiple Sclerosis Functional Composite
The MSFC is a composite of three objective quantitative tests covering dimensions of neurologic function that were identified as being important in MS: ambulatory function, arm function, and cognitive function (see Table 1).17 The timed 25-foot walk (25FTW)17 is a test of ambulatory function, requiring the patient to walk 25 feet quickly and safely in his or her usual manner. The nine-hole peg test (9HPT)17,19 measures arm and hand function: the patient moves nine pegs from a box into nine holes on a peg board, then back into the open box, twice with each hand. The time is averaged for each hand. The three-second paced auditory serial-addition task (PASAT3)17,20 measures cognitive function. Patients listen to a series of 61 spoken numbers with three seconds between each, and must add each number to the number immediately preceding it. The score is the number of correct additions out of 60.
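The PASAT scoring rule described above can be illustrated with a short sketch. This is not an official scoring implementation; the function name `pasat_score` and the convention that a missing answer is recorded as `None` are assumptions made for illustration only.

```python
from typing import Optional, Sequence


def pasat_score(stimuli: Sequence[int],
                responses: Sequence[Optional[int]]) -> int:
    """Count correct additions for a PASAT-style run.

    Each response should equal the sum of the current number and the one
    immediately before it (not a running total). Hypothetical helper.
    """
    if len(responses) != len(stimuli) - 1:
        raise ValueError("expected one response per consecutive pair of numbers")
    # Expected answer for each consecutive pair of spoken numbers.
    expected = [prev + cur for prev, cur in zip(stimuli, stimuli[1:])]
    # A missing answer (None) never matches, so it counts as incorrect.
    return sum(1 for want, got in zip(expected, responses) if got == want)


# Four spoken numbers give three additions; the last one was not answered.
example = pasat_score([3, 5, 2, 7], [8, 7, None])
```

With 61 spoken numbers there are 60 consecutive pairs, which is why the maximum score is 60.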
The MSFC score is reported as a standardized z-score, because the three components differ in units of measurement (seconds and number correct) and in direction of change (improvement is indicated by higher PASAT scores but lower 25FTW and 9HPT scores). A z-score is created for each component by standardizing to a reference population, and the z-scores for the 25FTW and 9HPT are transformed such that a decrease represents worsening. Finally, the z-scores from the three tests are averaged to create the final MSFC score.21 The reference population might be the baseline study population or a standard external reference population, such as that of the task force pooled data set.17 Lower MSFC scores compared with baseline suggest neurologic deterioration.21
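A minimal sketch of this calculation follows, under stated assumptions. The reference means and standard deviations are invented placeholders, not published values. A simple sign flip stands in for the transformation of the timed tests, and the published scoring procedure may specify a different transformation. The names `z_score` and `msfc_score` are hypothetical.

```python
from statistics import mean
from typing import Dict, Tuple


def z_score(value: float, ref_mean: float, ref_sd: float) -> float:
    """Standardize a raw result against a reference population."""
    return (value - ref_mean) / ref_sd


def msfc_score(walk_s: float, peg_s: float, pasat_correct: float,
               ref: Dict[str, Tuple[float, float]]) -> float:
    """Average of three component z-scores; lower means worse function.

    Assumption: timed tests are sign-flipped so that a slower time
    lowers the composite, matching the direction of the PASAT.
    """
    z_walk = -z_score(walk_s, *ref["walk"])
    z_peg = -z_score(peg_s, *ref["peg"])
    z_pasat = z_score(pasat_correct, *ref["pasat"])
    return mean([z_walk, z_peg, z_pasat])


# Placeholder reference population as (mean, standard deviation).
REFERENCE = {"walk": (10.0, 2.0), "peg": (25.0, 5.0), "pasat": (45.0, 10.0)}

# A patient one standard deviation worse than reference on every test.
composite = msfc_score(12.0, 30.0, 35, REFERENCE)
```

Because the composite depends on the chosen reference means and standard deviations, the same raw results yield different MSFC scores under different reference populations.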
The MSFC has excellent reliability, but practice effects have been demonstrated. A pilot study of 10 patients with secondary progressive MS assessed the reliability of the MSFC through administration of six sessions of the MSFC over a two-week period.22 The first five sessions were conducted by the same technician, whereas another technician administered the sixth session. The intraclass correlation coefficient between sessions four and five was 0.97, demonstrating excellent intra-rater reliability. The intraclass correlation coefficient between sessions five and six was 0.95, demonstrating excellent inter-rater reliability, which was maintained six months later.22
There were similar findings of excellent reliability in a larger phase III trial. The MSFC was used as the primary efficacy end-point in the phase III International Multiple Sclerosis Secondary Progressive Avonex Controlled Trial (IMPACT).23 Before randomization, the 436 patients underwent three pre-baseline MSFC testing sessions. The MSFC had excellent intra-rater reliability, with an intraclass correlation coefficient of 0.90 for session three (final pre-baseline session) and session four (baseline session).23
Both studies demonstrated practice effects with the MSFC. Although these effects were evident initially, the MSFC scores stabilized by the fourth administration.22,23 Practice effects were most apparent with the 9HPT, followed by the PASAT, whereas there were no practice effects for the 25FTW after the first administration.24 Thus, it has been suggested that there should be one pre-baseline administration of the 25FTW, three pre-baseline administrations of the PASAT, and four pre-baseline administrations of the 9HPT to maximize efficiency.24
Several studies have assessed various components of the validity of the MSFC. Face validity (i.e. the extent to which the tool measures what it is supposed to measure) and content validity (i.e. the extent to which the tool measures dimensions from the range of disease) were established through a group process. The NMSS task force established important MS clinical dimensions, and these were reviewed by the NMSS Advisory Committee on Clinical Trials and by the NMSS Medical Advisory Board,16,17 establishing face validity. Content validity was determined through incorporating tests of various clinical dimensions of MS, including cognitive function, ambulatory function, and arm function. Addition of further clinical dimensions, such as fatigue, visual function and sensory function, could also improve the content validity of the MSFC, as previously suggested.21 Construct validity is the ability of a tool to measure the disease dimensions that it was designed to measure. The EDSS score (>3.0) and the 25FTW component of the MSFC are tests of ambulatory function, whereas the 9HPT and the PASAT measure non-ambulatory functions that are not well measured by the EDSS. The EDSS correlates more strongly with the 25FTW than with the 9HPT or the PASAT, which supports the construct validity of the MSFC.17,23
Concurrent criterion validity involves the degree to which the tool correlates with other accepted instruments. Concurrent criterion validity of the MSFC was established through comparisons with EDSS, magnetic resonance imaging (MRI), and measures of self-reported quality of life. Several studies have demonstrated a moderately strong correlation between the MSFC and EDSS.17,23,25 MRI measures, such as T1 and T2 lesion load, also correlate significantly with MSFC scores.26 Additionally, one study found a moderately strong correlation between the MSFC and measures of brain atrophy in patients with relapsing–remitting MS (RRMS) studied over eight years.27 Furthermore, Miller and colleagues28 found significant correlations between the MSFC and measures of quality of life, including the Medical Outcomes Study Short Form 36 (SF-36) and the Sickness Impact Profile (SIP) in 300 patients with MS. The physical components of both the SF-36 and SIP were more strongly correlated with the MSFC than the mental and psychosocial components of these instruments.28
Predictive criterion validity is the ability of an instrument to predict future disease status. Support for this form of validity was provided by a follow-up study of the phase III study of intramuscular interferon β-1a (IFNβ-1a; Avonex®).5 In 160 patients it was found that MSFC scores from this clinical trial strongly predicted MSFC and MRI status at eight-year follow-up.29 MSFC baseline scores were strongly correlated with MSFC scores at two and eight years, whereas there was a more moderate correlation between baseline EDSS scores and EDSS scores at two and eight years. Additionally, baseline MSFC scores and change in the MSFC over two years were correlated with both EDSS scores and brain atrophy (measured by brain parenchymal fraction with MRI) at the eight-year follow-up.29 Despite this demonstrated high predictive validity in RRMS, a study of 161 patients with primary progressive MS found that short-term worsening in both the MSFC and EDSS had poor predictive validity for future disability.30
Clinical Trials Using the Multiple Sclerosis Functional Composite
The MSFC has been used in several clinical trials, primarily as a supplement to the EDSS rather than a replacement measure of disability. The MSFC was not available in early phase III trials of first-line DMTs, but has been used in further clinical trials of IFNβ-1a, IFNβ-1b and glatiramer acetate. Newer phase III trials have incorporated the MSFC as a secondary outcome measure, including trials of natalizumab, fingolimod and teriflunomide. The MSFC was not reported in the phase III study of cladribine.12
The MSFC was first used as a primary outcome measure in a phase III placebo-controlled study of IFNβ-1a in secondary progressive MS (IMPACT).31 The median MSFC z-score change was reduced by 40.4% in IFNβ-1a patients compared with placebo, whereas there was no benefit demonstrated by the EDSS.31 These findings suggest that the MSFC is more sensitive to change in disability than is the EDSS.
The MSFC was used as a secondary outcome measure in the phase III trial of Betaseron in newly emerging MS for initial treatment (BENEFIT).32 Patients with a clinically isolated syndrome, including a first neurologic event and two or more clinically silent MRI lesions, were given either subcutaneous IFNβ-1b (Betaseron®) or placebo every other day for two years or until they developed MS. They were then eligible to enter a follow-up study that involved continuing IFNβ-1b or switching from placebo to IFNβ-1b for three additional years to assess whether early treatment had an effect on disability progression. Early treatment had a beneficial effect on six-month-confirmed EDSS disability progression three years after the initial neurologic event, suggesting that a treatment delay early in the course of disease affects later disability accumulation. However, the MSFC did not detect any relevant deterioration in either group and there was no difference between groups in their overall scores. The investigators were surprised by this finding because the MSFC was designed to improve sensitivity to change compared with the EDSS. However, the authors concluded that the MSFC might not be suitable in measuring disability early during the course of disease because domains not included in the MSFC (i.e. visual and sensory function) are often more affected in early MS than are those domains measured by the MSFC (i.e. arm dexterity, ambulation, and cognition).33
The MSFC was also used as a secondary outcome measure in a randomized placebo-controlled pilot trial of IFNβ-1b in 73 patients with primary progressive or transitional MS. There was no difference between groups in disability progression as measured by the EDSS; however, there was a significant difference in MSFC scores favoring IFNβ-1b,34 suggesting better sensitivity of the MSFC in this study.
The MSFC was used as a secondary disability end-point in a large placebo-controlled trial of glatiramer acetate (Copaxone®) in primary progressive MS (PROMiSE).35 Changes in the MSFC and EDSS scores did not differ significantly between the placebo and treatment groups, and the study was terminated early.35
The MSFC was a secondary efficacy end-point in the placebo-controlled phase III trial of natalizumab (Tysabri®) monotherapy in relapsing MS (Natalizumab Safety and Efficacy in Relapsing–Remitting MS [AFFIRM]).7 Natalizumab reduced disability progression compared with placebo as measured by both the EDSS and MSFC. There was a significant difference in MSFC z-score change from baseline apparent after 12 weeks of treatment, which was maintained over two years.36 The MSFC was also used as a secondary efficacy end-point in the trial of natalizumab plus IFNβ-1a versus IFNβ-1a alone (The Safety and Efficacy of Antegren in Combination with IFNβ-1a in Subjects with Relapsing–Remitting MS [SENTINEL]).8 Similarly, the natalizumab-treated group had a reduced risk of disability progression as measured by the EDSS and MSFC compared with IFNβ-1a alone. There was a significant difference between groups in the MSFC z-score change from baseline that was apparent 48 weeks after beginning natalizumab and sustained over two years.36
The MSFC was also a secondary efficacy end-point in the two recently published phase III trials of fingolimod in RRMS. In the phase III placebo-controlled trial of oral fingolimod for relapsing MS (FREEDOMS study),10 both EDSS scores and MSFC z-scores remained stable or improved slightly in the fingolimod groups, but worsened in the placebo group. Similarly, in the phase III study of oral fingolimod versus IFNβ-1a (Trial assessing injectable interferon versus FTY720 oral in relapsing–remitting multiple sclerosis [TRANSFORMS]),11 EDSS and MSFC z-score changes were similar and both measures were generally better in fingolimod-treated groups than in the IFNβ-1a group. Figure 1 displays a comparison between the 24-month z-score change in EDSS and in MSFC from baseline in the fingolimod and placebo groups in the FREEDOMS study.10 Figure 2 displays a similar comparison of 12-month z-score change in EDSS and in MSFC in the fingolimod and IFNβ-1a groups in the TRANSFORMS study.11 Changes in EDSS and MSFC z-scores in a given treatment group were similar, which suggests that the MSFC did not provide a substantially more sensitive measure of disability in these trials.
Clinical Relevance of the Multiple Sclerosis Functional Composite
To interpret MSFC scores in both clinical trials and individual patients, it is important to understand what constitutes a meaningful change in the MSFC. It has been suggested that a 20% change in the 25FTW and 9HPT represents a reliably true change in function, whereas lower levels of change might represent clinically insignificant day-to-day fluctuations.37 In addition, it has been suggested that an increase of more than 20% in the 25FTW or 9HPT also indicates a clinically significant impact on disability, as perceived by patients with MS.38–40 However, the clinically relevant change in the overall MSFC score has not yet been determined.41 This limits the usefulness of the MSFC as an outcome measure. Additionally, the MSFC z-score value is not clinically useful and it is neither practical nor beneficial to incorporate the MSFC routinely into clinical practice.
Multiple Sclerosis Functional Composite versus Expanded Disability Status Scale
The MSFC was originally developed to improve or supplement the EDSS as a measure of disability, given flaws identified in the EDSS. There are several technical issues that favor one scale over the other, which are discussed below. The single biggest limitation of the MSFC is that a given score tells a clinician nothing about how a patient with MS appears from a neurologic perspective, whereas an EDSS score does. As such, the MSFC is less informative for clinicians and, therefore, is used far less than the EDSS, which is a widely used disability end-point in clinical trials of MS. Although there are advantages of the MSFC, there are also several additional limitations to its use, as discussed below.
The main advantage of the MSFC is that it is a quantitative, linear, continuous measure with high reliability and validity. By contrast, the EDSS is an ordinal scale and deterioration is non-linear with a ceiling effect.2 It has been suggested that, given its continuous nature, the MSFC is more sensitive to change in disability than is the EDSS.17 This is supported by a study that showed that the MSFC had better precision than did the EDSS in detecting differences in MS severity based on MRI findings; however, overall, both the EDSS and MSFC correlated weakly with MRI pathology.42 Additionally, the MSFC measures a broader range of MS dimensions than does the EDSS, with inclusion of measurements of cognitive and arm function, rather than the sole reliance on ambulation at high EDSS scores.17 However, despite including these dimensions, the MSFC lacks measures of visual function, sensory function, and fatigue, which are also important dimensions of MS.17 It has been suggested that contrast letter acuity would be a useful addition to the MSFC as a measure of visual function.43 Another quoted advantage of the MSFC is that it can be administered by a trained staff member rather than a neurologist, which has been suggested to be cost effective and more practical than the neurologist-administered EDSS.44 However, clinical trials generally still include the EDSS, with the MSFC as an additional measure rather than a replacement. Thus, the argument of lowering costs by implementing the MSFC is problematic.
There are several additional limitations to the MSFC. Unlike the EDSS, there are practice effects with the 9HPT and PASAT components of the MSFC, making interpretation of improvement difficult and requiring at least three pre-baseline sessions.24 Additionally, the use of various reference populations affects MSFC scores, limiting the comparability of scores across different studies.17 Finally, the clinical interpretation of changes in MSFC z-scores is unclear. Although clinically meaningful changes have been recommended for the components of the MSFC,37–40 a clinically meaningful change in the overall MSFC score has not been established.41 Thus, the MSFC has rarely been used as a primary outcome measure in clinical trials and is not useful in clinical practice. This is in contrast to the EDSS, which can be scored in clinical practice across patient visits to track changes in the neurologic examination, thus aiding in treatment decision making.2
New Approach—Multiple Sclerosis Functional Composite Progression
To address limitations and improve the clinical interpretation of the MSFC as an outcome measure, an MSFC Working Group was recently formed to develop new approaches to using MSFC data.45 Rather than using MSFC z-score change as an outcome, this group created a definition for MSFC progression, which involved worsening from baseline score on at least one MSFC component by 20% (MSFC Progression-20) or 15% (MSFC Progression-15), sustained for at least three months. The group used AFFIRM7 and SENTINEL8 data to study MSFC progression rates using this definition. They found that the MSFC Progression-20 and MSFC Progression-15 were sensitive measures of disability and correlated with EDSS, relapse rates, and SF-36 Physical Component Summary score change. The MSFC Progression-20 and MSFC Progression-15 at one year were predictive of EDSS progression at two years, and both MSFC progression end-points demonstrated treatment effects in AFFIRM and SENTINEL.45 The MSFC progression is more useful and clinically meaningful than is the MSFC z-score change, and is more similar to the way that EDSS data are currently used in clinical trials.
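The progression definition above can be sketched in code as follows. This is an illustration under explicit assumptions, not the Working Group's algorithm. It assumes that worsening means a longer time on the 25FTW or 9HPT and a lower count on the PASAT, and that "sustained" means every visit from the first worsened visit up to one at least three months later still meets the threshold. All names (`worsened`, `component_progressed`, `msfc_progression`) and the sample values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Assumed direction of worsening: times get longer, PASAT counts get lower.
HIGHER_IS_WORSE = {"walk": True, "peg": True, "pasat": False}

Visits = List[Tuple[float, float]]  # (months since baseline, raw result)


def worsened(baseline: float, value: float, threshold: float,
             higher_is_worse: bool) -> bool:
    """True if the relative change from baseline meets the threshold."""
    change = (value - baseline) / baseline
    return change >= threshold if higher_is_worse else change <= -threshold


def component_progressed(baseline: float, visits: Visits, threshold: float,
                         higher_is_worse: bool,
                         sustain_months: float = 3.0) -> bool:
    """True if worsening persists across visits spanning the sustain window."""
    run_start: Optional[float] = None
    for month, value in sorted(visits):
        if worsened(baseline, value, threshold, higher_is_worse):
            if run_start is None:
                run_start = month
            if month - run_start >= sustain_months:
                return True
        else:
            run_start = None  # worsening was not sustained; start over
    return False


def msfc_progression(baseline: Dict[str, float],
                     follow_up: Dict[str, Visits],
                     threshold: float = 0.20) -> bool:
    """Progression if at least one component shows sustained worsening."""
    return any(
        component_progressed(baseline[name], follow_up.get(name, []),
                             threshold, HIGHER_IS_WORSE[name])
        for name in HIGHER_IS_WORSE
    )


baseline = {"walk": 6.0, "peg": 20.0, "pasat": 50.0}
stable = {"walk": [(3, 6.1), (6, 6.2)],
          "peg": [(3, 20.5), (6, 19.8)],
          "pasat": [(3, 51.0), (6, 49.0)]}
sustained = dict(stable, walk=[(3, 7.5), (6, 7.6)])   # 25% slower, confirmed
transient = dict(stable, walk=[(3, 7.5), (6, 6.1)])   # recovers by month 6
```

Passing `threshold=0.15` gives the Progression-15 variant. Unlike a z-score change, the result is a yes/no event per patient.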
The MSFC is a multidimensional objective measure of neurologic function that was developed to be a more sensitive measure of disability than the EDSS for use as a clinical trial disability end-point. The MSFC has excellent intra- and inter-rater reliability.22,23 Validity of the MSFC has also been demonstrated; the MSFC correlates well with EDSS, MRI measures of disease, and quality of life measures.17,23,25–28 Since its development, the MSFC z-score change has been used as a secondary disability end-point in clinical trials.7,8,10,11,32–35 The MSFC is a linear, quantitative, continuous measure that may be more sensitive in detecting changes in disability than the ordinal EDSS.2,17,42 Additionally, it measures a broader range of disability, including cognitive and arm function in addition to ambulation. However, it does not include a measure of visual function.17 Other limitations include significant practice effects with the 9HPT and PASAT3 components24 and the use of varying reference populations, which affects MSFC scores and limits comparability between studies.17 Although a 20% change in components of the MSFC has been suggested to be clinically meaningful,37 clinical interpretation of MSFC z-score change remains unclear,41 which limits the use of the MSFC as a primary outcome measure in clinical trials. An alternative approach to analyzing MSFC data has recently been suggested to improve the clinical interpretation of this scale. This involves defining MSFC progression based on a three-month period of sustained worsening by 15 or 20% in at least one MSFC component, rather than using MSFC z-score change.45 Currently, the most widely accepted end-points in MS clinical trials are relapse rate and disability progression measured using the EDSS. With further study, the newly defined MSFC progression could be used as a primary disability outcome measure in future clinical trials.