Report of a Satellite Symposium Held at the Joint Congress of European Neurology, Istanbul, Turkey, 1 June 2014
Relevance of Systematic Real-world Data Collection for Physicians and Patients
Physicians routinely collect data from multiple sclerosis (MS) patients as part of clinical practice. Many of these data are now in electronic formats. Minimum datasets for monitoring clinical outcomes are similar in many centres, and consensus is generally easily reached. When aggregated, these data potentially produce a powerful database. When paper records were ubiquitous, clinical research was largely separate from clinical practice, with clinical research organisations and pharma companies populating databases via questionnaires. The development of electronic health records is a major improvement, but an agreed minimum dataset and data-sharing arrangements are also required to monitor a long-term disease such as MS. The information in electronic medical records is ‘big data’ and typically describes overall global trends, but is not necessarily suitable for analysing individual patient outcomes. In common with a clinical trial, a disease registry (or long-term database) uses an agreed minimum dataset. Examples in MS include the European Database for Multiple Sclerosis (EDMUS) project,1 the Swedish National Registry,2 the Danish National Registry3 and the global MS Registry (MSBase).4 Once the language (i.e. the minimum dataset) has been agreed, typically including demographics, disease classification, relapse dates, Expanded Disability Status Scale (EDSS) scores, disease-modifying drug (DMD) start and stop dates, codified magnetic resonance imaging (MRI) and other diagnostic testing, it can be implemented as a physician-initiated registry. Ideally, the registry data collection effort is highly integrated with clinical practice, with little extra time taken. A common misconception among physicians is that they lack the time, but the large existing registries, with over 150,000 MS records in aggregate, illustrate that data collection is possible with proper motivation and, ideally, some dedicated resources.
For example, the Danish registry was started in 1948 and now includes over 25,000 records.3 It demonstrates the motivation of the Danish neurologists with minimal additional resources to produce this valuable dataset.
Most of the successful approaches are hybrids – comprising both a registry function and an e-health record function. For example, 80 % of Swedish MS patients are monitored online using such a system that provides data to the clinician on treatment, relapses and EDSS scores, together with comparative information on the severity of disease compared with other patients (see Figure 1).2 Registries provide long-term information displayed graphically making it easy to relate the patients’ current situation to the past, thus helping their management.
The Serono Symposia International Foundation originated the MSBase registry in 2000 to collect codified data with ethical approval and patient consent. The project was relatively unsuccessful due to perceived direct pharmaceutical company influence, and thus became independent in 2004, registered as a not-for-profit organisation in Australia. It has now registered over 31,500 patients, with recruitment significantly accelerating in the 2013–2014 period. In total 28 countries contribute to the registry, involving more than 150 participating neurologists, and the patient datasets have generated over 165,000 patient–years of follow-up. Median patient follow-up is 5 years and EDSS scores are submitted regularly (median 4–5 months), a data density equivalent to some clinical trials. One advantage of the Swedish Registry and MSBase is their built-in capacity to benchmark patient outcomes, and in the case of MSBase the ‘MS severity calculator’ is open to non-members of the system online. These MSBase severity calculators summarise EDSS score rank at various disease durations by displaying 25th, 50th and 75th percentiles using the entire MSBase dataset as a ‘live’ reference population. EDSS scores of an individual patient can be entered and the severity calculator provides the percentile of the EDSS for any given annualised disease duration (see Figure 2). In the future, more sophisticated benchmarking efforts could be utilised to analyse potential clinical therapeutic decision outcomes, especially if MRI metrics can be incorporated.
Randomised clinical trials in MS provide information on the comparative efficacy (relapses, EDSS, MRI lesions, brain volume loss) of a disease-modifying treatment (DMT) in a controlled setting usually lasting up to 2 years.5 By contrast, real-world data assess the effectiveness of a DMT in a real-world setting over a longer period.
There are many kinds of real-world data including MS registries giving comparative effectiveness (relapses, EDSS, treatment discontinuation) of a DMT in clinical practice, claims databases illustrating the impact of a DMT on resource utilisation and medical costs, population-based registries giving the impact of a DMT on clinical, safety, resource use and socioeconomic outcomes and observational studies that usually monitor a single DMT over a long period reporting its impact on multiple patient and clinical-reported outcomes.1–3 In MS a number of important questions can potentially be addressed with real-world data:
- How do DMTs compare head-to-head if no randomised clinical trial has been reported?
- Do DMTs prevent long-term disability accumulation?
- What is the impact of relapses on long-term disability progression?
- What is the long-term disability impact of pregnancy? Do relapses in pregnancy influence long-term disability?
- Do DMTs work in progressive forms of MS?
- Can treatment be individualised to maximise benefit:risk ratios for people with MS?
In terms of treatment comparisons, a potential problem in the real world is the lack of randomisation, so that baseline variables between different treatment groups may not be well matched. This problem can be addressed by various techniques, and one popular method is propensity score matching. Individual patients with different treatment decisions (for instance, switch to fingolimod or natalizumab after relapse on injectable DMD compared with switch between injectable DMD classes) can be selected as pairs on the basis of closely ‘matched’ baseline characteristics, using both patient demographics and prior disease activity/severity.6 The statistical method merges all significant predictors of differential treatment assignment into a single score, known as a propensity score, representing the conditional probability that a particular patient receives one of the two treatment options. For example, a patient in group A having a conditional propensity score of 0.41 of receiving treatment A (based on his or her baseline characteristics) will be matched with a patient in group B who also has a propensity score of 0.41 of receiving treatment A (see Figure 3). On the basis of this probability, individuals with the same propensity score can then be matched (as perfect pairs) in the two groups to be compared, and subsequent outcome analysis (e.g. relapse rate, disability progression, DMD discontinuation) can be performed using powerful paired statistics.
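The pairing step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the registry analyses: the propensity scores are assumed to have been estimated beforehand (for example by logistic regression on baseline characteristics), and the function name `match_pairs`, the patient identifiers and the caliper of 0.05 are hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple

Patient = Tuple[str, float]  # (patient identifier, propensity score of receiving treatment A)


def match_pairs(group_a: List[Patient], group_b: List[Patient],
                caliper: float = 0.05) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbour matching on the propensity score.

    Each patient in group A is paired with the not-yet-used patient in
    group B whose score is closest, provided the two scores differ by no
    more than `caliper`. Patients without such a partner stay unmatched.
    """
    available: Dict[str, float] = dict(group_b)  # group B patients still unpaired
    pairs: List[Tuple[str, str]] = []
    for id_a, score_a in sorted(group_a, key=lambda p: p[1]):
        if not available:
            break
        # Closest remaining group B patient by absolute score difference.
        id_b = min(available, key=lambda k: abs(available[k] - score_a))
        if abs(available[id_b] - score_a) <= caliper:
            pairs.append((id_a, id_b))
            del available[id_b]
    return pairs


# Hypothetical scores, for illustration only.
group_a = [("A1", 0.41), ("A2", 0.70), ("A3", 0.99)]
group_b = [("B1", 0.40), ("B2", 0.72), ("B3", 0.10)]
pairs = match_pairs(group_a, group_b)
print(pairs)
```

In this toy input, A3 (score 0.99) and B3 (score 0.10) find no partner within the caliper and are left out of the paired outcome analysis.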
By way of further explanation, in a randomised controlled trial, the probability of being treated with DMT A or B is fixed, often at 50:50, and thus the propensity score (probability of receiving treatment A) is 0.5. In real-world data, by contrast, the probability of receiving treatment A or B differs from patient to patient, so the population that can be studied in propensity-matched analyses is the overlap that occurs between the treatment assignment probabilities (see Figure 4). One effect is that patients with unmatched scores, e.g. 0.01 or 0.99, whose treatment assignment is almost always either A or B, are removed from the subsequent analysis. In practice, this could often include patients with either very mild or very severe disease initially. Questions have been raised about the validity of such analyses. MSBase proposes that one important means of validation is to assess treatment comparisons for which the outcome is already known from a randomised clinical trial; several such projects are underway, delivering results largely concordant with clinical trials.7,9
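The restriction to the overlapping range of propensity scores can be illustrated as follows. This is a simplified sketch: it assumes the overlap is taken as the interval between the larger of the two group minima and the smaller of the two group maxima, which is one common convention rather than necessarily the rule applied in the analyses described here. The function names and the scores are hypothetical.

```python
from typing import List, Tuple


def common_support(scores_a: List[float], scores_b: List[float]) -> Tuple[float, float]:
    """Return the (low, high) score interval covered by both treatment groups."""
    low = max(min(scores_a), min(scores_b))
    high = min(max(scores_a), max(scores_b))
    return low, high


def trim(scores: List[float], low: float, high: float) -> List[float]:
    """Keep only the scores that fall inside the overlap interval."""
    return [s for s in scores if low <= s <= high]


# Hypothetical propensity scores (probability of receiving treatment A).
scores_a = [0.30, 0.41, 0.55, 0.99]  # patients who received treatment A
scores_b = [0.01, 0.35, 0.41, 0.60]  # patients who received treatment B

low, high = common_support(scores_a, scores_b)
kept_a = trim(scores_a, low, high)
kept_b = trim(scores_b, low, high)
print(low, high, kept_a, kept_b)
```

Here the patients with scores of 0.99 and 0.01 fall outside the overlap and are excluded before matching.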
An MS registry programme in clinical practice will improve the management of this chronic disease, especially where a large number of patients are seen. By participating in a registry, benchmarking and decision support can be enhanced, adding value to a clinical practice. Furthermore, such registries and databases are the only feasible way to generate long-term safety and efficacy data that are clinically meaningful to patients and may facilitate personalised medicine especially if combined with biomarker and MRI data. Importantly, real-world data can be benchmarked against randomised clinical trial results to demonstrate validity and improve the confidence of results generated from registries.
Collection of Fingolimod Real-World Evidence and Use in Clinical Practice and Research
A number of tools developed over the last few years can be used to investigate real-world data. It is increasingly important to collect pharmacoeconomic data, which can be facilitated using claims databases, observational studies and population-based registries. Claims databases represent an easy way to obtain data as the DMTs and services are paid for, and the data collected can be interrogated to establish the effect of DMTs on resource utilisation in the clinical setting and medical costs. PharMetrics Plus™ is one of the most important US claims databases and processes the billable interactions between patients and healthcare providers.10,11 This database has information on over 200,000 patients with MS claims diagnosed between 2006 and 2013.11
Considerable data have been collected on fingolimod, the first oral MS therapy. The other oral therapies were not available at the time these data were collected. PharMetrics Plus has data on a matched cohort of patients who switched to fingolimod or glatiramer acetate (GA) from interferons (IFNs). Those switched to fingolimod had a reduced claims-based relapse rate compared with GA over 12 months (62 %; p=0.0013).11 A number of other parameters can be investigated, such as resource utilisation, allowing the quality of life of patients for the money invested to be determined. In this patient group, inpatient visits were reduced by 71 % (p<0.01) in patients switched to fingolimod compared with before switch. A similar reduction (67 %; p<0.001) was observed for the number of days corticosteroids were supplied.12 These data provide invaluable insights into the effect of a DMT.
Another approach involves population-based registries, which have been developed especially in Scandinavia and can reveal the long-term impact of a DMT on clinical, safety, resource-use and socioeconomic outcomes. In the Swedish MS registry set up in 2002 (>60 healthcare units), information on approximately 12,000 patients includes treatment with DMTs, demographics, EDSS, relapses and MRI.13 It can also be linked to other population-based registries. Immunomodulation and Multiple Sclerosis Epidemiology Study of Fingolimod (IMSE) II is one of the five post-marketing Swedish surveillance studies monitoring DMTs, investigating the long-term effectiveness and safety of fingolimod in 806 patients from 44 centres.14 The majority were relapsing–remitting MS (RRMS) patients most frequently treated with natalizumab or IFNs (mean duration 11.5 months). Patients switched to fingolimod had a reduced annualised relapse rate (0.15 with fingolimod versus 0.39) and a more stable EDSS score over 12 months compared with before switch. In addition, more innovative parameters were investigated, such as disease severity (Multiple Sclerosis Severity Score [MSSS]), cognition (Symbol Digit Modalities Test [SDMT]) and quality of life (European Quality of Life 5 dimensions questionnaire [EQ-5D]), and over 12 months all of these clinical measures improved with fingolimod (MSSS, –17.1 %; SDMT, +6.8 %; EQ-5D score +8.3 %) (see Figure 5). Eventually, long-term data will be available.
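An annualised relapse rate is the number of relapses divided by the patient-years of observation, and a change after switching can be expressed relative to the pre-switch rate. The short sketch below shows both calculations. The function names and the relapse count are illustrative assumptions; the two rates are those quoted above, and the resulting percentage is plain arithmetic on them rather than a figure reported by the study.

```python
def annualised_relapse_rate(relapses: int, patient_years: float) -> float:
    """Relapses per patient-year of follow-up."""
    if patient_years <= 0:
        raise ValueError("patient_years must be positive")
    return relapses / patient_years


def relative_change_pct(before: float, after: float) -> float:
    """Percentage change from `before` to `after` (negative means a reduction)."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0


# Hypothetical cohort: 12 relapses observed over 80 patient-years.
arr = annualised_relapse_rate(relapses=12, patient_years=80.0)

# Rates quoted in the text: 0.39 before switch, 0.15 with fingolimod.
change = relative_change_pct(before=0.39, after=0.15)
print(arr, round(change, 1))
```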
Finally, observational studies can be conducted to investigate the impact of one DMT on multiple patient- and clinical-reported outcomes. An example from Germany is Post-Authorization Non-interventional German sAfety study of GilEnyA in RRMS patients (PANGAEA), a prospective, observational, 5-year registry study of patients treated with fingolimod.15
Over a number of years, software tools have been developed to provide and collect data. For example, the Multiple Sclerosis Documentation System (MSDS3D), designed as a database application for the documentation of MS patients, is used for PANGAEA.16 It can interact with various patient-monitoring systems, such as clinical documentation and specific treatment management for the more recently licensed DMTs (alemtuzumab, natalizumab, fingolimod, dimethyl fumarate or teriflunomide), and can be linked to different registries and research projects. Its architecture is innovative, with the data stored in the MSDS3D cloud. The MS nurse can add most patient information via an iPad, with only specific clinical aspects, such as adverse events and patient-related outcomes, reported by the physician. There is also the facility to present case reports for discussion.
The real-world evidence programme has demonstrated the benefits of fingolimod in regular clinical practice. Considerable claims outcome data have been collected for a large population of patients, but to date imaging measures of brain tissue damage are missing; such data would undoubtedly strengthen real-world evidence.
Evolving the Evidence Base – Importance of Collecting Imaging Measures
Although imaging data are routinely collected to assist in diagnosing MS and following disease progression, to date such data have not been systematically included in real-world databases. Conventional MRI sequences include, among others, T2 fluid-attenuated inversion recovery (FLAIR), proton density and T2-weighted imaging (T2/PD-weighted imaging). Over the last 10 years, improvement in scanner technology has made these techniques more sophisticated for performing quantitative analysis.17 FLAIR and T2/PD-weighted imaging are important sequences in the clinical routine for recognition of hyperintense lesions (T2 volume and number). In addition, pre- and post-contrast T1 spin echo scans help to identify hypointense T1 and gadolinium (Gd)-enhancing lesions, respectively. Gd-enhancing lesions represent acute inflammatory activity indicative of the breakdown of the blood–brain barrier.18 About a third of these lesions evolve over time into chronic hypointense T1 lesions or black holes that are related to neurodegeneration in MS. Three-dimensional (3D) T1-weighted images are suited to volumetric assessments and better detection of tissue morphology and, with accelerated acquisition techniques, high-quality images can be obtained in only a few minutes. Such images allow tissue segmentation of grey matter, white matter and cerebrospinal fluid compartments. Historically these techniques were performed in academic centres but are now increasingly available in community practice. However, it is not necessary to acquire these sophisticated sequences to perform quantitative analysis in MS patients.
A number of recommendations for brain MRI protocols are available for adults with MS, for example, the Consortium of Multiple Sclerosis Centers (CMSC) guidelines (see Table 1).19 New improved scanners produce 3D FLAIR sequences that can give different orientation of the slices including sagittal, coronal and axial, without acquisition of each plane separately. It is important to note that these images can be obtained on 1.5 or 3 Tesla scanners and the recommended slice thickness is 3 mm without gap. In the past this was problematic due to the time involved, but in the last few years it has become possible to obtain this recommended protocol in less than 30 minutes, especially using acceleration acquisition techniques, such as parallel imaging.
An important question to address is whether collection of MRI data is feasible in clinical routine. There are a number of general requirements for image processing that need to be considered:
- Provide the technological resources to perform accurate, reproducible analyses of medical imaging data from which sound routine and scientific conclusions can be drawn.
- Maintain data integrity and confidentiality and ensure its future availability as techniques are constantly evolving.
- Make the workflows involved as efficient as possible – many techniques are becoming fully automated, are suitable for routine clinical use and make virtually no errors:
  - receive scans;
  - store scans;
  - analyse scans;
  - ensure scan analysis quality;
  - store analyses for future use;
  - maintain data integrity; and
  - report data within a few hours to the physician.
In many centres, picture archiving and communication systems (PACS) store the MRI scans, which is important for allowing future data analysis. Extremely powerful systems, able to perform analyses on a large amount of data, are now relatively inexpensive. However, in addition to the computational storage system, it is crucial that as much of the analysis as possible is automated to ensure a consistent workflow. In the future, the role of the operator in directly performing analyses will change substantially. Human interaction will become necessary only for processes requiring quality control checks to determine whether the analysis is acceptable to be used on an individual patient basis.20 Quality checks are undertaken at various stages during the process to reduce or remove scanner error, diminish variability, prevent unanalysable scans and increase the power to detect effects earlier or in smaller groups.
Pre-processing techniques play a major role in avoiding scanner or motion artefact errors. One of the most important tools in imaging pre-processing is non-uniformity correction, which removes low spatial frequency background perturbations via iterative, non-parametric bias field estimation and adjusts the scan without changing its quality. Another major component is co-registration, which realigns images in 3D in order to correct for positioning errors and/or patient movements. New lesions are sometimes reported in MS patients when some of these in fact represent positioning errors. Both these techniques run automatically in a few seconds with virtually zero errors. A multidimensional combination of quantitative analyses is performed and, although there are many different types, they can generally be divided into focal (Gd-enhancing lesions, T2 lesions, T1 black holes), macroscopic or visible global (atrophy, tissue-specific atrophy, regional atrophy) and microscopic or invisible global (diffusion-tensor, magnetisation transfer ratio, perfusion, spectroscopy, phase [iron]). Quantitative activity analysis of lesions can help to reduce the time spent examining scans. If a new lesion appears on longitudinal FLAIR images, a newer post-processing technique, subtraction imaging, can give a clear indication of whether an apparently new lesion is real; such verification is invaluable as part of the clinical routine.
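The idea behind subtraction can be shown on a toy example. The sketch below assumes the two scans have already been co-registered and intensity-normalised, represents each image slice as a small grid of numbers, and flags voxels whose intensity rose by more than a chosen threshold. The function names, the intensities and the threshold of 20 are hypothetical; real pipelines operate on 3D volumes and apply further filtering.

```python
from typing import List, Tuple

Image = List[List[int]]  # one 2D slice, row by row


def subtraction_map(baseline: Image, follow_up: Image) -> Image:
    """Voxel-wise difference: follow-up intensity minus baseline intensity."""
    return [
        [f - b for b, f in zip(row_b, row_f)]
        for row_b, row_f in zip(baseline, follow_up)
    ]


def new_lesion_voxels(diff: Image, threshold: int) -> List[Tuple[int, int]]:
    """(row, column) positions where intensity increased by more than `threshold`."""
    return [
        (r, c)
        for r, row in enumerate(diff)
        for c, value in enumerate(row)
        if value > threshold
    ]


# Hypothetical, already co-registered slices.
baseline = [[10, 10, 10],
            [10, 50, 10],
            [10, 10, 10]]
follow_up = [[10, 10, 10],
             [10, 52, 10],
             [10, 10, 48]]

diff = subtraction_map(baseline, follow_up)
new_voxels = new_lesion_voxels(diff, threshold=20)
print(diff, new_voxels)
```

The pre-existing lesion in the centre changes by only 2 units and is ignored, whereas the bright voxel in the lower right corner stands out as a candidate new lesion.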
Many measures of brain atrophy are currently available and this is an extremely important outcome both for clinical trials and in predicting disability. However, the situation is complicated because the importance of some of these measures changes during the disease course. For example, in very early disease (clinically isolated syndrome [CIS] or early MS) the thalamus and other grey matter structures change more rapidly than in normal controls.21 Later in the disease process, however, cortical atrophy is more involved. By contrast, lateral ventricular volume enlargement is independent of disease course and unfolds linearly across time. It is very predictive of clinically definite MS, disability development in CIS patients or secondary progressive MS (SPMS).22 However, to obtain a more comprehensive picture, other volumetric components, such as corpus callosum atrophy of the white matter, also have to be considered.
To address whether these measures can be used on an individual basis, 26 age- and sex-matched healthy controls were compared with 98 RRMS patients on 3D-T1-WI (see Figure 6). Although the images look broadly similar, the thalamus volume is 13.5 % lower in the RRMS group, illustrating that it is almost impossible to define the changes visually, necessitating the move to quantitative tools.23
The Buffalo Neuroimaging Analysis Center at the University at Buffalo has extensive experience collaborating with long-term MRI databases. A 10-year collaboration with Charles University, Prague on the collection of data from the Avonex-Steroid-Azathioprine clinical trial24–30 yielded important insights into understanding the natural history of brain atrophy development under DMTs. This is one of few studies that used the same scanner without hardware or software changes to acquire serial yearly scans over 10 consecutive years. The 5-year data showed a good correlation between disability and whole brain volume loss, cortical atrophy and lateral ventricle volume enlargement but no correlation with T2 lesion volume was found.29
Furthermore, this centre has also collected MRI data from more than 2,500 MS patients and 400 healthy controls on the same 3 Tesla and 1.5 Tesla scanners over 8 years using standardised MRI clinical routine protocols.31–33 MRI scans are stored daily in a centralised database system and fully automated computational systems run the analyses, which are overseen and checked by the operators. All these data are available to physicians. There are multiple examples from that database that can illustrate dynamic changes of lesion volume accumulation and brain atrophy progression as outcomes on an individual patient basis. Figure 7A shows a patient who had approximately 30 ml of T2 lesion volume after 20 years of disease duration, which is representative of lesion accumulation over this time period in a typical MS patient. Based on a group of more than 1,000 matched MS patients and over 400 matched healthy controls, this patient fell 25 percentiles below the standard line of the MS patient and control age-related brain volumes, indicating more brain atrophy. This is expected since MS patients with more lesions tend to have increased brain atrophy. However, Figure 7B shows another example of an individual patient with lower T2- and T1-lesion volumes than expected for the disease duration, but with more advanced brain atrophy in terms of the matched MS and control groups. Such patients are more difficult to treat because it is believed that their disability progression is not so much dependent on lesions, but is due to rapid brain atrophy development. These real-world cases indicate that brain atrophy should be part of the clinical routine evaluation.
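Placing an individual patient against a matched reference group amounts to computing a percentile rank. The sketch below shows one simple way to do this; the function name `percentile_rank`, the definition used (share of reference values at or below the patient's value) and the brain volumes are all assumptions for illustration, not the centre's actual method or data.

```python
from bisect import bisect_right
from typing import Sequence


def percentile_rank(value: float, reference: Sequence[float]) -> float:
    """Percentage of reference values that are less than or equal to `value`."""
    if not reference:
        raise ValueError("reference population must not be empty")
    ordered = sorted(reference)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Hypothetical brain volumes (ml) from an age-matched reference group.
reference = [1400, 1450, 1480, 1500, 1520, 1550, 1580, 1600]

# Hypothetical patient with a brain volume of 1460 ml.
rank = percentile_rank(1460, reference)
print(rank)
```

A low rank on a volume measure indicates more atrophy than in most of the reference group; the same calculation applies to any other quantitative measure for which a matched reference population is available.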
It is feasible to use the quantitative tools described in a clinical routine since the quality of MRI sequences has improved dramatically over the last couple of years. The collection, storage and computation of the images are now more straightforward and the use of huge database systems together with automated analysis pipelines is revolutionising the analysis. Furthermore, use of pre-processing techniques is making a dramatic difference in improving the quality of scans and facilitating reliable analysis.
Improvements in standard MRI protocols allow for quantitative data assessment in academic and community centres worldwide. In addition, PACS and other storage systems are available in most centres and allow easy MRI transfer and collection. Software improvements in MRI analysis are closing the gap towards application in clinical routine. It is clear that systematic MRI data collection on an individual patient basis is possible, but ideally should be further standardised. In the future, the focus should be on simplifying the measures described to enhance their role in the follow-up of MS patients since improving follow-up will make a significant difference to patient management.
Real-world Evidence – Evolving Outcome Measures
Real-world evidence provides information about the effectiveness and long-term safety of a treatment in everyday clinical practice, but currently the systematic collection of MRI data is missing. To strengthen real-world evidence, the collection of clinical outcomes needs to evolve to include four core measures – relapses, EDSS, new lesion development and brain volume loss. Another outcome measure of great potential is tracking cognitive change over time, but the challenge of its implementation is even greater than that of routinely quantified MRI, due to its resource intensity. Large democratic data collection platforms, such as MSBase, make it easy for practising neurologists to form collaborations for collective outcome analysis and knowledge exchange.
Data from randomised, controlled trials can be expanded and enhanced with real-world data. For example, the systematic collection of real-world evidence has confirmed the effectiveness and safety of fingolimod in the real-world setting. First-hand real-world evidence is generated daily by clinical practices and a valuable contribution would be made to real-world MS research if data were routinely submitted to a local or international registry.
The authors are therefore confident that increased precision in monitoring will occur in the near future. Over the next few years, it is hoped that it will be possible to access other measures such as cognition and quantitative MRI, which together with clinical outcome measures, will facilitate improvements in treatment and treatment-failure definitions.