1. The Core Framework: Nonlinear Dynamic Latent Class SEM (NDLC-SEM)

Forecasting critical states in the SAM study required methods that capture the multilevel nature of student dropout.
The approach relies on the Nonlinear Dynamic Latent Class Structural Equation Model (NDLC-SEM), a flexible Bayesian framework combined with a modified Forward Filtering Backward Sampling (FFBS) algorithm for real-time prediction.

1.1 Key Capabilities

The NDLC-SEM framework combines the capabilities of several dynamic models to simultaneously address four essential data structures for modeling the longitudinal dropout process using Intensive Longitudinal Data (ILD). It models:
  1. Inter-Individual Differences (Traits): Stable characteristics (e.g., cognitive abilities).
  2. Intra-Individual Changes (States): Changeable psychological states (e.g., affective and motivational states).
  3. Unobserved Heterogeneity of Trajectories: Captured via time-varying latent classes following a hidden Markov process.
    • In this study, these classes represent the discrete latent variable $S_{it}$:
      • $s=1$: Intention to stay
      • $s=2$: Intention to quit
  4. Time-Dependent Nonlinearities: Latent within-person variables predict transition probabilities with nonlinear effects.
The model was implemented in JAGS 4.2 and executed via the R2jags package, using a Gibbs sampler for Bayesian estimation.

2. Model Specification

The model specification outlines how observed variables relate to latent constructs (measurement models) and how these latent constructs evolve and interact across levels (structural models).

2.1 Measurement Models

Within-Level (States – $\mathbf{\eta}_{1it}$): Seventeen observed variables ($\mathbf{Y}_{1it}$) operationalize seven continuous latent within-factors ($\mathbf{\eta}_{1its}$), representing affective/cognitive states such as stress, fear of failure, and affect balance. All observed variables were centered and oriented so that higher values indicate stronger intention to quit. The within-level measurement model is consistent across latent states:
Equation (1): $(\mathbf{Y}_{1it} \mid S_{it} = s) = \mathbf{\Lambda}_{10}\mathbf{\eta}_{1its} + \mathbf{\varepsilon}_{1it}$
Where:
  • $\mathbf{Y}_{1it}$ = (17 × 1) vector of observed variables
  • $\mathbf{\eta}_{1its}$ = (7 × 1) vector of latent state factors
  • $\mathbf{\varepsilon}_{1it}$ = vector of residuals

Between-Level (Traits – $\mathbf{\eta}_{2i}$): A single latent construct, cognitive ability (IQ), was modeled using three CFT-3 test items measured at baseline.
Equation (2): $\mathbf{Y}_{2i} = \mathbf{\Lambda}_{2}\mathbf{\eta}_{2i} + \mathbf{\varepsilon}_{2i}$
Where:
  • $\mathbf{Y}_{2i}$ = (3 × 1) vector of observed indicators
  • $\mathbf{\eta}_{2i}$ = latent IQ factor
  • $\mathbf{\varepsilon}_{2i}$ = uncorrelated residuals

2.2 Structural Dynamics (Within-Level)

The within-level dynamics were modeled using a first-order autoregressive process (AR(1)), specific to the discrete latent class $S_{it}=s$:
Equation (3): $(\mathbf{\eta}_{1it} \mid S_{it}=s) = \mathbf{\alpha}_{1is} + \mathbf{B}_{1is}\mathbf{\eta}_{1i,t-1} + \mathbf{\zeta}_{1it}$
Where:
  • $\mathbf{\eta}_{1i,t-1}$ = latent states at previous time $t-1$
  • $\mathbf{\alpha}_{1is}$ = class-specific intercept vector
  • $\mathbf{B}_{1is}$ = diagonal matrix of AR(1) effects
  • $\mathbf{\zeta}_{1it}$ = innovation term
Note: Initial tests showed near-zero cross-lagged effects, justifying a simplified AR(1) structure.
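
Because $\mathbf{B}_{1is}$ is diagonal, each state factor follows its own scalar AR(1) recursion, and only the intercept and slope switch with the class. The following is a minimal sketch of Equation (3) for one factor, not the study's JAGS code. The function name `simulate_ar1` and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate_ar1(alpha, b, classes, eta0=0.0, innovation_sd=0.0, seed=0):
    """Simulate one latent state factor under a class-specific AR(1) process.

    alpha, b : dicts mapping class label s to the intercept / AR(1) coefficient
    classes  : sequence of class labels S_it for t = 1..T
    """
    rng = random.Random(seed)
    eta, path = eta0, []
    for s in classes:
        # Innovation term zeta_1it; set innovation_sd=0 for a deterministic path.
        zeta = rng.gauss(0.0, innovation_sd) if innovation_sd > 0 else 0.0
        eta = alpha[s] + b[s] * eta + zeta
        path.append(eta)
    return path


# Hypothetical values: class 2 ("intention to quit") has a higher intercept.
alpha = {1: 0.0, 2: 0.5}
b = {1: 0.3, 2: 0.5}
path = simulate_ar1(alpha, b, classes=[1, 1, 2, 2, 2])
```

With these illustrative values the factor stays at zero while the student is in class 1 and climbs toward the class-2 equilibrium $\alpha / (1 - b) = 1$ after the switch.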

2.3 Structural Models (Between- and Cross-Level Interactions)

The stable latent trait (IQ, $\mathbf{\eta}_{2i}$) influences both the intercepts and autoregressive dynamics of the latent state processes.
Intercept Function:
Equation (4): $\mathbf{\alpha}_{1is} = \mathbf{\alpha}_{21s} + \mathbf{\beta}_{2s}\eta_{2i} + \mathbf{\zeta}_{2i}$
AR Coefficients with Cross-Level Moderation:
Equation (5): $\mathbf{B}_{1is} = \mathbf{B}_{1s} + \mathbf{\Omega}_{2s}\eta_{2i}$
Here, $\mathbf{\Omega}_{2s}$ allows cognitive ability (IQ) to moderate motivational and self-regulatory state dynamics over time.
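
To make the cross-level moderation concrete, the sketch below evaluates Equations (4) and (5) for a single state factor in a single class. It is an illustration under assumed values: the function name `person_specific_parameters` and every number passed to it are hypothetical, not estimates from the study.

```python
def person_specific_parameters(eta2, alpha21, beta2, b1, omega2, zeta2=0.0):
    """Return (intercept, AR coefficient) for one state factor in one class.

    Scalar versions of Equations (4) and (5):
      alpha_1is = alpha_21s + beta_2s * eta_2i + zeta_2i
      B_1is     = B_1s + Omega_2s * eta_2i
    """
    intercept = alpha21 + beta2 * eta2 + zeta2
    ar_coef = b1 + omega2 * eta2
    return intercept, ar_coef


# Hypothetical parameters for class s = 2 and a student with latent IQ score 1.0
intercept, ar_coef = person_specific_parameters(
    eta2=1.0, alpha21=0.5, beta2=-0.25, b1=0.5, omega2=-0.125
)
```

In this made-up configuration a higher IQ score lowers both the level of the negative state and its carry-over from one time point to the next.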

2.4 Markov Switching Model (Transition Probabilities)

The discrete latent state $S_{it}$ evolves according to a hidden Markov process, governed by transition probabilities derived via a logit link function.
Probability of Staying in $s=1$ (No Intention to Quit):
Equation (6): $\text{P}(S_{it}=1 \mid S_{i,t-1}=1) = \frac{\exp(\nu_{it}^{11})}{\exp(\nu_{it}^{11}) + 1}$
Logit Function Definition:
Equation (10): $\nu_{it}^{11} = \gamma_{1} + \gamma_{2}\eta_{2i} + \mathbf{\gamma}_{3}\mathbf{\eta}_{1i,t-1} + \mathbf{\gamma}_{4}\mathbf{\eta}_{1i,t-1}\eta_{2i}$
Return Probability ($P_{12}$):
The transition from intention to quit ($s=2$) to intention to stay ($s=1$) is assumed rare and slow:
$P_{12} \sim \text{unif}(0.0, 0.1)$
This aligns with the Rubicon model, positing that individuals rarely revert once a quitting intention is formed.
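
The logit and its inverse can be sketched in a few lines. The code below is a plain-Python illustration of the stay probability and the resulting 2 × 2 transition matrix. The function names and the coefficient values are hypothetical, and $P_{12}$ is passed in as a fixed number here, whereas the model treats it as a parameter with a uniform prior.

```python
import math


def stay_probability(eta2, eta1_prev, gamma1, gamma2, gamma3, gamma4):
    """P(S_it = 1 | S_i,t-1 = 1) via the logistic function of the logit nu.

    eta1_prev, gamma3 and gamma4 are equal-length sequences with one entry
    per latent state factor; eta2 is the scalar latent IQ score.
    """
    nu = gamma1 + gamma2 * eta2
    for g3, g4, e in zip(gamma3, gamma4, eta1_prev):
        nu += g3 * e + g4 * e * eta2  # main effect plus interaction with IQ
    # Algebraically equal to exp(nu) / (exp(nu) + 1).
    return 1.0 / (1.0 + math.exp(-nu))


def transition_matrix(p11, p12):
    """Rows: previous class (1, 2). Columns: current class (1, 2).

    p12 is the return probability P(S_it = 1 | S_i,t-1 = 2), as in the text.
    """
    return [[p11, 1.0 - p11], [p12, 1.0 - p12]]


# Hypothetical coefficients for two state factors (e.g., stress, fear of failure)
p11 = stay_probability(
    eta2=0.5, eta1_prev=[1.0, 0.2],
    gamma1=2.0, gamma2=0.3, gamma3=[-0.8, -0.4], gamma4=[0.1, 0.0],
)
matrix = transition_matrix(p11, p12=0.05)
```

Negative entries in `gamma3` encode the idea that higher negative-affect scores at $t-1$ lower the probability of remaining in the "stay" class.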

3. Identification of Latent Discrete States

Identifying the latent states ($S_{it}$) as “intention to drop out” involved a confirmatory modeling strategy:
  1. Imposed Constraints:
    Persons in state $s=2$ were constrained to show higher scores on all seven negative affect scales.
  2. Predictive Link:
    Transition probabilities were regressed on the affective scales and their interactions with IQ.
  3. Temporal Restrictions:
    Transitions back to $s=1$ were restricted to a low probability.
  4. Partial Observation:
    Dropouts observed during the semester were coded directly as $S_{it}=2$ (manifest dropout).

4. Forecasting Implementation: Forward Filtering Backward Sampling (FFBS)

Forecasting dynamic latent states is achieved through the Forward Filtering Backward Sampling (FFBS) algorithm — a Bayesian sequential estimation method adapted for hidden Markov models (see West & Harrison, 1997).

4.1 Key Features of the FFBS Adaptation

  • Integrates seamlessly with the Gibbs sampler (forecasting within estimation loop).
  • Handles latent time-dependent predictors ($\mathbf{\eta}_{1it}$) driving state transitions.
  • Produces real-time posterior forecasts for the latent dropout intention states.

4.2 Core Algorithmic Steps

Step 1 — Reformulation (Aspect i.)

Reformulate NDLC-SEM into a Dynamic Linear Model (DLM) framework:
  • Observation Equation: $\mathbf{\eta}_{1jts} = \mathbf{F}_{jt}\mathbf{\theta}_{jts} + \mathbf{v}_{jts}$
  • System Equation: $\mathbf{\theta}_{jts} = \mathbf{G}_{jts}\mathbf{\theta}_{j,t-1,s} + \mathbf{w}_{jts}$
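
Once the model is in DLM form, forward filtering within one class follows the standard prior, forecast, and update recursions described by West & Harrison (1997). The sketch below shows a single filtering step for a scalar DLM with known variances. It illustrates the generic recursion rather than the study's modified algorithm, and the function name `dlm_filter_step` and its arguments are assumptions made for this example.

```python
def dlm_filter_step(m_prev, c_prev, y, f, g, v, w):
    """One forward-filtering step for a scalar DLM.

    Observation: y_t     = f * theta_t     + v_t,  Var(v_t) = v
    System:      theta_t = g * theta_{t-1} + w_t,  Var(w_t) = w
    (m_prev, c_prev) are the posterior mean and variance of theta_{t-1}.
    """
    a = g * m_prev                    # prior mean of theta_t
    r = g * g * c_prev + w            # prior variance of theta_t
    forecast = f * a                  # one-step forecast mean of y_t
    q = f * f * r + v                 # one-step forecast variance of y_t
    gain = r * f / q                  # adaptive coefficient (Kalman gain)
    m = a + gain * (y - forecast)     # posterior mean of theta_t
    c = r - gain * gain * q           # posterior variance of theta_t
    return {"a": a, "r": r, "forecast": forecast, "q": q, "m": m, "c": c}


# Hypothetical numbers: vague-ish prior, one observation y = 2.0
step = dlm_filter_step(m_prev=0.0, c_prev=1.0, y=2.0, f=1.0, g=1.0, v=1.0, w=0.0)
```

In the class-switching setting, one such step would be run per stratum with that stratum's $\mathbf{G}_{jts}$, which is what produces the separate forecasts combined in the later steps.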

Step 2 — Define Strata (Aspect ii.)

Define four strata $(s, s')$ for every consecutive time pair $(t, t-1)$, covering all possible transitions between “stay” and “quit” states.

Step 3 — Continuous State Prediction (Aspect iii.)

For each stratum, sample latent factor scores, producing four forecast draws: $(\eta_{1jit} \mid (s,s'), D_{t-1})$

Step 4 — Marginal Predictive Density (Aspect iv.)

Compute the mixture of predictive densities across the four strata:
$\text{P}(\mathbf{\eta}_{1jit}\mid D_{t-1}) = \sum_{s=1}^{2}\sum_{s'=1}^{2} \left[ \pi_{i}(s,s')p_{i,t-1}(s') \text{P}(\mathbf{\eta}_{1jit}\mid(s,s'),D_{t-1}) \right]$
This produces the overall forecast distribution of the continuous latent variable.
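
The double sum is a four-component mixture whose weights are the transition probability into $s$ times the previous class probability of $s'$. A minimal sketch follows, assuming a single factor and Gaussian stratum-specific forecast densities. The dictionaries `pi`, `p_prev`, `means`, and `sds` and all values in them are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def marginal_predictive_density(x, pi, p_prev, means, sds):
    """Mixture of stratum-specific densities over the four strata (s, s').

    pi[(s, s_prev)]    : probability of moving from s_prev at t-1 to s at t
    p_prev[s_prev]     : class probability at t-1
    means, sds         : Gaussian forecast moments per stratum (an assumption)
    """
    total = 0.0
    for s in (1, 2):
        for s_prev in (1, 2):
            weight = pi[(s, s_prev)] * p_prev[s_prev]
            total += weight * normal_pdf(x, means[(s, s_prev)], sds[(s, s_prev)])
    return total


# Hypothetical inputs: staying is likely, returning from "quit" is rare.
pi = {(1, 1): 0.9, (2, 1): 0.1, (1, 2): 0.05, (2, 2): 0.95}
p_prev = {1: 0.8, 2: 0.2}
means = {(1, 1): 0.0, (2, 1): 1.0, (1, 2): 0.2, (2, 2): 1.2}
sds = {key: 1.0 for key in pi}
density_at_half = marginal_predictive_density(0.5, pi, p_prev, means, sds)
```

Because the transition probabilities out of each previous class sum to one, the four weights sum to one and the result is a proper density.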

4.3 Posterior Updating and Smoothing

After observing $D_t$, priors are updated and the joint posterior over model combinations is computed:
$p_{it}(s,s') \propto \pi_{i}(s,s')p_{i,t-1}(s')\text{P}(\mathbf{\eta}_{1jit}\mid M_{it}(s), M_{i,t-1}(s'),D_t)$
The smoothed posterior for each latent state is then:
$p_{it}(s) = \sum_{s'=1}^{2} p_{it}(s,s')$
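
The update amounts to multiplying three numbers per stratum, normalizing over the four strata, and summing out the previous class. A minimal sketch is given below. The function name `update_class_posterior` and the likelihood values are hypothetical; in practice the likelihood terms would come from the stratum-specific densities.

```python
def update_class_posterior(pi, p_prev, likelihood):
    """Return (joint, marginal) class posteriors at time t.

    pi[(s, s_prev)]         : transition probability from s_prev to s
    p_prev[s_prev]          : class probability at t-1
    likelihood[(s, s_prev)] : density of the current factor scores in stratum (s, s_prev)
    """
    joint = {
        (s, s_prev): pi[(s, s_prev)] * p_prev[s_prev] * likelihood[(s, s_prev)]
        for s in (1, 2)
        for s_prev in (1, 2)
    }
    norm = sum(joint.values())
    joint = {key: value / norm for key, value in joint.items()}
    # Sum over the previous class s' to obtain p_it(s).
    marginal = {s: joint[(s, 1)] + joint[(s, 2)] for s in (1, 2)}
    return joint, marginal


# Hypothetical inputs: the data are three times more likely under class 2.
pi = {(1, 1): 0.9, (2, 1): 0.1, (1, 2): 0.05, (2, 2): 0.95}
p_prev = {1: 0.8, 2: 0.2}
likelihood = {(1, 1): 1.0, (1, 2): 1.0, (2, 1): 3.0, (2, 2): 3.0}
joint, marginal = update_class_posterior(pi, p_prev, likelihood)
```

In this example the prior probability of class 2 at time $t$ is 0.27 (0.1 × 0.8 + 0.95 × 0.2), and the data move it to roughly 0.53.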

4.4 H-Steps-Ahead Forecasting

For future predictions ($t > N_t$):
  • Use last observed posteriors ($t = N_t$).
  • Recursively define expected values $\mathbf{a}_{jt}$ and covariance matrices $\mathbf{R}_{jt}$.
  • Generate the $H$-step-ahead forecast distribution of latent factors $\mathbf{\eta}_{1jt}$.
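
For a single factor that stays in one class, the recursion for the forecast mean and variance can be sketched as follows. This is an illustration under simplifying assumptions (scalar state, fixed class, known innovation variance `w`) and leaves out the mixing over classes that the full method performs. The function name `h_step_forecast` and the input values are hypothetical.

```python
def h_step_forecast(m, c, alpha, b, w, horizon):
    """Forecast moments for eta_{t+1}, ..., eta_{t+H} of a scalar AR(1) state.

    m, c     : posterior mean and variance at the last observed time point
    alpha, b : class-specific intercept and AR(1) coefficient
    w        : innovation variance
    Returns a list of (a, r) pairs: forecast mean and variance per step.
    """
    a, r = m, c
    moments = []
    for _ in range(horizon):
        a = alpha + b * a      # expected value carried one step forward
        r = b * b * r + w      # uncertainty accumulates with each step
        moments.append((a, r))
    return moments


# Hypothetical starting point: state known exactly (c = 0) at the last observation
forecast = h_step_forecast(m=1.0, c=0.0, alpha=0.5, b=0.5, w=1.0, horizon=3)
```

The forecast variance grows toward its stationary value $w / (1 - b^2)$ as the horizon increases, so distant forecasts revert to the long-run distribution of the class.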

Summary:
The integrated NDLC-SEM + FFBS framework enables dynamic, real-time prediction of latent dropout intentions, bridging stable cognitive traits and fluctuating emotional-motivational states.
It provides a powerful approach to modeling nonlinear psychological dynamics in longitudinal educational data.