Longitudinal data, comprising repeated measurements of the same individuals on a number of occasions, arise frequently in a wide range of fields, including medicine, public health, psychology and biology. The main objectives of a longitudinal study are to characterise changes in the response of interest over time and to examine the selected co-variates that contribute to those changes. Traditional methods for analysing longitudinal data include autoregressive models, repeated-measures multivariate analysis of variance, mixed-effects models and multiple regression.
Recently, there has been growing interest in models that incorporate information about not only the group or population but also change in the individual. Latent (growth) curve modelling allows complex models of developmental trends to be tested at both inter- and intra-individual levels. It has received increasing attention in medical research and is well recognised as a useful longitudinal technique for analysing patterns of change.1–6
Latent Curve Model
The latent curve model (LCM) is a method for modelling individual change, assessing the effects of co-variates and the relationships among multiple outcomes, and taking measurement error into account. It provides a means of modelling developmental processes from both the inter- and intra-individual perspective. Generally speaking, the LCM models patterns of change in two stages. In the first stage, the repeated measures of each individual across time are fitted with a regression-type curve, which is either linear or non-linear. In the second stage, the focus of the analysis is on the latent growth factors, which identify the individual’s growth curve. The interest is no longer specifically in the original observed repeated measures, but in the unobserved latent growth factors that give rise to them. The LCM takes into account not only the means of the latent growth factors, which represent group-level change, but also their variances, which measure the degree of individual differences. This combination of group- and individual-level analyses is synthesised in the LCM procedure.
The interpretation of the model parameters in the LCM is illustrated with a simple two-factor linear model. The first-stage regression-type equation, also called the trajectory equation, is:

$$y_{it} = \alpha_i + \beta_i t_i + \varepsilon_{it} \qquad (1)$$

where $y_{it}$ is the measurement for individual $i$ at time $t$; $\alpha_i$ and $\beta_i$ are, respectively, the random intercept and slope for individual $i$; $t_i$ represents the sequential value of each time record; and $\varepsilon_{it}$ is the measurement error for individual $i$ at time $t$.
In the second stage, the random intercept and slope are of interest, leading to the following two equations, called the intercept equation and the slope equation, respectively:

$$\alpha_i = \mu_\alpha + \delta_{\alpha i} \qquad (2)$$

$$\beta_i = \mu_\beta + \delta_{\beta i} \qquad (3)$$

where $\mu_\alpha$ and $\mu_\beta$ are the mean intercept and mean slope averaged across all individuals over time, $\delta_{\alpha i}$ is a residual reflecting the difference between the mean intercept $\mu_\alpha$ and the individual intercept $\alpha_i$, and $\delta_{\beta i}$ is a residual reflecting the difference between the mean slope $\mu_\beta$ and the individual slope $\beta_i$. The variances of $\delta_{\alpha i}$ and $\delta_{\beta i}$ measure how far the individual-specific intercepts and slopes spread around the mean intercept and slope.
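The two stages can be made concrete with a small simulation. The sketch below generates data from the trajectory, intercept and slope equations and then recovers the mean intercept and slope by fitting a separate least-squares line to each individual. It is only an illustration: the parameter values, the time coding (0, 1, 2, 3, shared by all individuals) and the function names are assumptions, and per-individual least squares is a simple stand-in for the simultaneous estimation that LCM software performs.

```python
import numpy as np

def simulate_linear_lcm(n=500, times=(0, 1, 2, 3), mu_alpha=10.0, mu_beta=2.0,
                        sd_alpha=1.5, sd_beta=0.5, sd_eps=1.0, seed=0):
    """Simulate repeated measures from a two-factor linear LCM (all values assumed)."""
    rng = np.random.default_rng(seed)
    t = np.asarray(times, dtype=float)
    alpha = mu_alpha + rng.normal(0.0, sd_alpha, size=n)    # intercept equation
    beta = mu_beta + rng.normal(0.0, sd_beta, size=n)       # slope equation
    eps = rng.normal(0.0, sd_eps, size=(n, t.size))         # measurement error
    y = alpha[:, None] + beta[:, None] * t[None, :] + eps   # trajectory equation
    return t, y

def two_stage_estimates(t, y):
    """Fit one least-squares line per individual, then summarise the fitted factors."""
    X = np.column_stack([np.ones_like(t), t])               # design matrix, shape (T, 2)
    coef, *_ = np.linalg.lstsq(X, y.T, rcond=None)          # shape (2, n)
    alpha_hat, beta_hat = coef
    return {
        "mean_intercept": alpha_hat.mean(),
        "mean_slope": beta_hat.mean(),
        # These variances include sampling error from stage one, so they
        # overstate the true variances of the latent growth factors.
        "var_intercept": alpha_hat.var(ddof=1),
        "var_slope": beta_hat.var(ddof=1),
    }

t, y = simulate_linear_lcm()
est = two_stage_estimates(t, y)
for name, value in est.items():
    print(f"{name}: {value:.3f}")
```

The means of the fitted intercepts and slopes approximate the group-level change, while their variances give a rough, upwardly biased indication of individual differences.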
Many extensions of the basic LCM above have been developed to meet real-world needs. For example, substantive co-variates can be included to account for their effects on latent growth factors.7 Non-linear LCMs can be used to capture higher-order polynomial trajectories.2,7,8 For hierarchical longitudinal data, a multilevel LCM can characterise dynamic change in nested data,9 and for long-term heterogeneous data a mixture of LCMs can explore and identify different patterns of dynamic change in different subject clusters.4,10
Recent Applications of Latent Curve Models in Medical Research
Linear Latent Curve Models
The linear LCM defined in equations 1–3 has been widely applied in medical research. The first illustrative example is the application of LCM in polydrug use. Brecht et al.11 conducted an analysis to examine 10-year patterns of heroin, cocaine, methamphetamine, marijuana and alcohol use for primary users of heroin, cocaine and methamphetamine. Their main purpose was to investigate whether the trajectories of primary substance use over time are related to trajectories of other substance use. Non-overlapping samples from five studies that collected longitudinal information using the Natural History Instrument (NHI) were pooled in this analysis. Only those subjects who reported a primary drug problem of heroin, cocaine or methamphetamine were selected from each study.
Figure 1 depicts the trajectories of the average number of days per month with use of each of the five substances for each of the primary drug sub-samples: heroin, cocaine and methamphetamine. It shows that use of non-primary heroin, cocaine or methamphetamine was very low during periods concurrent with use of the primary drug. For users of all three primary drugs, however, there was consistently moderate use of marijuana and alcohol. For each of the three primary drug sub-samples, a linear LCM was used to explore the patterns and relationships across drugs over time. The main finding was that primary drug use declined for heroin and methamphetamine users and remained relatively consistent over 10 years for cocaine users, while use of non-primary drugs remained consistently low or declined in tandem with the primary drug. This finding is consistent with the patterns observed in Figure 1.
Recently, Pan et al.3 applied the above LCM to a study of quality of life (QoL). It was a prospective cohort study of first disabling stroke patients conducted at the Prince of Wales Hospital (PWH) in Hong Kong. The study aimed to assess longitudinal behaviours of health-related QoL (HRQoL) in stroke survivors in relation to the changes in activities of daily living (ADL), handicap and post-stroke depression over time in the sub-acute phase of stroke recovery. Subjects were interviewed at three, six and 12 months after stroke to measure their levels of ADL, handicap, depression and HRQoL using the modified Barthel Index (MBI),12 the London Handicap Scale (LHS),13 the Geriatric Depression Scale (GDS)14 and the World Health Organization (WHO) QoL questionnaire (abbreviated Hong Kong version).15 A linear LCM was proposed to examine how the dynamic changes in ADL, handicap and depression influenced dynamic changes in the four domains of HRQoL: physical health, psychological health, social interaction and environment. The LCM in analysing the physical health domain of HRQoL is depicted in Figure 2. The main finding of this analysis was that changes in mood in the sub-acute phase of the stroke recovery had the most significant effect on the HRQoL of the stroke survivors, while changes in basic functional status and handicap had less significant effects. In this longitudinal study, LCMs played an important role in delineating the relative contribution of basic functional status, handicap and depression in the HRQoL of stroke patients in a prospective manner.
Since the current care services in the sub-acute phase of stroke recovery do not specifically address the problem of post-stroke depression, this study suggests that, apart from drug treatment, there is a need to devise alternative strategies to manage post-stroke depression for better QoL for stroke survivors.
Non-linear Latent Curve Models
Basic LCMs can be extended to incorporate non-linear trajectories and effects of co-variates. With this extension, a higher-order polynomial is used to describe a non-linear pattern of dynamic change in individual characteristics. A longitudinal study about depression in multiple sclerosis (MS)7 is used for illustration.
The study goals focused on three research issues: to reveal the patterns of change in depressive symptoms over time; to identify substantial effects of co-variates such as age, type of MS (TMS), years since diagnosis of MS (YMS) and functional limitation (FL) on the trajectory of depression over time; and to examine the correlations between characteristics of change in FL and depressive symptoms over the seven-year time period. The data were collected from 607 MS patients over a seven-year period, with initial recruitment in 1999 as part of an ongoing longitudinal study of QoL16 in chronic illness.
A non-linear trajectory in depression for the sample was suggested by an examination of randomly selected empirical growth plots. A three-factor LCM was used to model the trajectories of depression for the MS sample:

$$y_{it} = \alpha_i + \beta_i t_i + \gamma_i t_i^2 + \varepsilon_{it} \qquad (4)$$

where $\alpha_i$, $\beta_i$ and $\gamma_i$ represent the random intercept, slope and quadratic slope for individual $i$, respectively. As depicted in Figure 3, the change pattern and the correlations between the characteristics of change in depressive symptoms were examined. Furthermore, predictors of change in depression were examined by regressing the intercept, slope and quadratic slope on the co-variates of interest, which leads to the following equations:
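$$\alpha_i = \mu_\alpha + b_{\alpha 1}(\text{Age}) + b_{\alpha 2}(\text{TMS}) + b_{\alpha 3}(\text{YMS}) + b_{\alpha 4}(\text{FL}) + \delta_{\alpha i} \qquad (5)$$

$$\beta_i = \mu_\beta + b_{\beta 1}(\text{Age}) + b_{\beta 2}(\text{TMS}) + b_{\beta 3}(\text{YMS}) + b_{\beta 4}(\text{FL}) + \delta_{\beta i} \qquad (6)$$

$$\gamma_i = \mu_\gamma + b_{\gamma 1}(\text{Age}) + b_{\gamma 2}(\text{TMS}) + b_{\gamma 3}(\text{YMS}) + b_{\gamma 4}(\text{FL}) + \delta_{\gamma i} \qquad (7)$$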
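To show how the co-variate equations for the intercept, slope and quadratic slope operate, the following sketch simulates a quadratic LCM with four co-variates and recovers the regression coefficients in two steps. Everything numeric here is assumed for illustration and is not taken from the MS study: the co-variates are simulated as standardised scores, the seven occasions are coded 0 to 6, and the coefficient values are arbitrary. Regressing per-individual least-squares estimates on the co-variates is a simplification of the simultaneous estimation used in LCM software.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
t = np.arange(7, dtype=float)                  # assumed time coding: 0, 1, ..., 6

# Hypothetical standardised co-variates, columns: Age, TMS, YMS, FL
Z = rng.normal(size=(n, 4))

# Assumed population values (rows: intercept, slope, quadratic slope)
mu = np.array([12.0, 0.5, -0.05])              # mean growth factors
B = np.array([[-0.8, 0.6, 0.4, 1.5],           # effects on the intercept
              [0.0, 0.0, 0.0, 0.3],            # effects on the slope
              [0.0, 0.0, 0.0, -0.02]])         # effects on the quadratic slope
delta = rng.normal(size=(n, 3)) * np.array([2.0, 0.4, 0.05])  # residuals

# Co-variate equations: individual growth factors, shape (n, 3)
factors = mu + Z @ B.T + delta

# Quadratic trajectory equation with measurement error
T = np.column_stack([np.ones_like(t), t, t ** 2])             # shape (7, 3)
y = factors @ T.T + rng.normal(0.0, 1.0, size=(n, t.size))

# Stage 1: least-squares quadratic fit for each individual
factors_hat = np.linalg.lstsq(T, y.T, rcond=None)[0].T        # shape (n, 3)

# Stage 2: regress each estimated growth factor on the co-variates
D = np.column_stack([np.ones(n), Z])                          # shape (n, 5)
coef = np.linalg.lstsq(D, factors_hat, rcond=None)[0]         # shape (5, 3)
mu_hat, B_hat = coef[0], coef[1:].T

print("estimated means of growth factors:", np.round(mu_hat, 3))
print("estimated co-variate effects (rows: intercept, slope, quadratic):")
print(np.round(B_hat, 3))
```

Each row of `B_hat` corresponds to one of the three co-variate equations, so the first row estimates the effects of the co-variates on the level of the outcome at the first occasion.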
Findings associated with the aforementioned three specific issues were obtained.
First, there was no significant increasing or decreasing trend in depressive symptoms, although they fluctuated over time for individuals. Second, younger age, longer time since diagnosis of MS, progressive forms of MS and greater FL were associated with greater depressive symptoms at time one. Third, FL showed an association with depression at all time periods, but other co-variates did not. In addition, gender did not predict changes in depressive symptoms. These results indicated that screening for depression in all patients with MS was necessary and important.
Multilevel Latent Curve Models
In some circumstances, clinical trials involve a multilevel design, leading to hierarchically longitudinal data. For example, to evaluate the effects of a neighbourhood walking programme on QoL among older adults, Fisher and Li9 used a multilevel sampling scheme to collect a sample of neighbourhoods from a large metropolitan city, from which older adult residents were randomly recruited. This two-level design resulted in a nested data structure in which participants were clustered within neighbourhoods.
The substantive interest of this study focused on whether a six-month neighbourhood walking programme would improve neighbourhood-level QoL for senior residents. A two-level LCM of QoL, with individual- and neighbourhood-level data structures, is shown in Figure 4. All of the measures involved were assessed at baseline and at three and six months of the study period.
Results from the two-level LCM indicated that, compared with the control neighbourhoods, the physical, mental and satisfaction-with-life aspects of QoL improved significantly over the course of the six-month intervention. The study concluded that it was feasible and beneficial to implement a neighbourhood-based walking programme of low to moderate intensity in order to promote QoL among senior residents at a community level.
Mixture Latent Curve Models
Heterogeneity is commonly encountered in longitudinal analyses in medical research. Heterogeneous data may contain latent classes in which the characteristics of interest follow completely different patterns of change.
A mixture of LCMs can be used to characterise the heterogeneity and to reveal specific change patterns for each distinct latent class. Compared with the basic LCM, the additional tasks in applying mixture LCMs are to identify the number of latent classes, detect the class membership of each individual and predict the probability of each individual falling into a specific class. To formulate the probability of an individual belonging to a given latent class, the following multinomial logistic regression model is used. For $k = 1, 2, \ldots, K$,
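$$P(C_i = k \mid x_{i1}, \ldots, x_{ip}) = \frac{\exp(a_{0k} + a_{1k}x_{i1} + a_{2k}x_{i2} + \ldots + a_{pk}x_{ip})}{\sum_{j=1}^{K} \exp(a_{0j} + a_{1j}x_{i1} + a_{2j}x_{i2} + \ldots + a_{pj}x_{ip})} \qquad (8)$$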
where $K$ is the number of latent classes, $C_i$ is the class membership for individual $i$, $x_{i1}, x_{i2}, \ldots, x_{ip}$ are co-variates that may influence the chance of individual $i$ belonging to latent class $k$, and $a_{0k}, a_{1k}, \ldots, a_{pk}$ are the corresponding regression co-efficients, which reflect the importance of the potential co-variates.
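The class membership model can be evaluated directly once the co-efficients are known. The sketch below computes the membership probabilities for a few individuals; the function name, the co-variate values and the co-efficients are hypothetical. It also assumes the common identification convention of fixing the co-efficients of one reference class (here the last) at zero, which the text above does not specify.

```python
import numpy as np

def class_membership_probs(x, a0, A):
    """Multinomial logistic class probabilities.

    x  : (n, p) co-variates for n individuals
    a0 : (K,)   class intercepts a_0k
    A  : (K, p) class slopes a_1k, ..., a_pk
    Returns an (n, K) array whose rows sum to one.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    eta = np.asarray(a0, dtype=float) + x @ np.asarray(A, dtype=float).T
    eta = eta - eta.max(axis=1, keepdims=True)   # guard against overflow in exp
    w = np.exp(eta)
    return w / w.sum(axis=1, keepdims=True)

# Hypothetical example: K = 3 classes, p = 2 co-variates,
# with the last class as the reference (all co-efficients zero).
a0 = np.array([0.5, -0.2, 0.0])
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
x = np.array([[0.0, 0.0],
              [1.0, -1.0]])

probs = class_membership_probs(x, a0, A)
print(np.round(probs, 3))
print("most likely class per individual:", probs.argmax(axis=1))
```

Subtracting the row maximum before exponentiating leaves the probabilities unchanged, because the same constant cancels in the numerator and denominator.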
Hser et al.10 applied this model to examine long-term trajectories of drug use for primary heroin, cocaine and methamphetamine users. The data included 629 primary heroin users, 694 cocaine users and 474 methamphetamine users. The main outcome measure was the number of days using the primary drug per month.
As shown in Figure 5, the analysis of mixture LCMs revealed five distinct groups with different drug use trajectories over a 10-year follow-up: consistently high use, increasing use, decreasing use, moderate use and low use. In addition, primary drug type was significantly associated with different trajectory patterns.
Heroin users were most likely to be in the consistently high use group and cocaine and methamphetamine users were most likely to be in the moderate use group. The study also revealed that users in the high use group had earlier onset of drug use and crime, longer incarceration durations and fewer employed periods than those in other groups. Compared with other existing studies of drug addiction, the use of mixture LCMs in this analysis emphasised the heterogeneity of drug use patterns and the importance of understanding and addressing the full spectrum of drug use patterns over time.
Another application of mixture LCMs is the analysis of depression in persons with myocardial infarction (MI). Elliott et al.4 analysed affect and event data from subjects post-MI in order to understand how mood and reactivity to negative events over time relate to diagnostic-level depression. In this study, 35 patients who had experienced an MI within the past year and were in treatment were investigated. The affect scores and event indicators (indicating presence of positive, negative and neutral events) of the patients were collected for up to 35 consecutive days.
The analysis of mixture LCMs suggested a two-class model for the MI patients: an ‘optimist’ class, with stable positive affect and declining perceived negative events, and a ‘pessimist’ class, with declining positive affect and continuing perceived negative events. Depressed subjects had a 92% chance of belonging to the ‘pessimist’ class compared with 62% among non-depressed subjects. This finding uncovered some hitherto unobserved structure in the positive affect and negative event data in this sample.
The key advantage of using mixture LCMs in this study was that persons who are most at risk of developing major or minor depression could be identified, which will assist in developing more specific interventions or treatments.
The LCM is an integrated and flexible approach that can be used to identify an appropriate growth curve for describing the developmental trend accurately and parsimoniously and to study the individual differences in changes. It has become a common and useful tool for the analysis of longitudinal data in behavioural, educational, social and psychological sciences. However, in medical applications LCMs have not been as popular as in other fields. There has been a lag between the recent development of statistical techniques and their widespread application in medical disciplines.
One reason why LCMs have been slow to move into the realm of medical sciences is that applied researchers are not familiar with this useful method and the associated user-friendly computer packages such as SPSS, SAS, Amos,19 EQS,20 LISREL,21 Mplus,21 WinBUGS,23 etc. It is believed that the medical field would benefit from an increasing awareness and use of these powerful and flexible methods.