White Paper | Clinical Screening | 24-Month Progression Risk | XGBoost

Stage 1 — Clinical Progression Model

A sensitivity-optimized XGBoost classifier for estimating 24-month MCI-to-Alzheimer's progression risk using routinely available clinical and neuropsychological variables, trained on the ADNI cohort.

Metric	Value
AUC	0.916
Sensitivity	89.9%
Specificity	80.1%
Accuracy	82.3%

1. Clinical Rationale

Mild Cognitive Impairment (MCI) is a heterogeneous syndrome sitting between normal aging and dementia. Approximately 10–15% of MCI patients progress to Alzheimer's disease per year, but individual progression trajectories vary substantially. Identifying high-risk patients at the point of clinical screening — before costly biomarker testing — has significant implications for care pathway efficiency and early intervention.

Stage 1 serves as the entry gate to the pipeline. Its cognitive and functional inputs are available in a standard outpatient neurology or primary care encounter: demographic information, MMSE, a validated informant-rated functional scale (ECog), and two components of the Rey Auditory Verbal Learning Test (RAVLT). The feature set in Section 4 also includes APOE ε4 allele count, which requires a prior genotyping result; apart from that, no blood draws or imaging are required at this stage.

The decision threshold is tuned for high sensitivity (89.9%) rather than balanced accuracy, reflecting the asymmetric cost of errors in a screening context: a false negative delays intervention for a high-risk patient, while false positives are expected to be caught and corrected by the downstream plasma and MRI stages.

2. Dataset and Cohort

The model was trained on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. The analytic sample comprised 2,430 participants after preprocessing, with a 24-month progression label derived from longitudinal follow-up visits.

Parameter	Value
Dataset	ADNI (adnimerge.csv)
Total samples	2,430
Progressors (N)	547 (22.5%)
Non-progressors (N)	1,883 (77.5%)
Train / Test	80% / 20% stratified
Class imbalance	Handled via scale_pos_weight
Input dimensionality	9 clinical features + missingness indicators
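As a rough illustration of the 80% / 20% stratified split listed above, the following sketch splits synthetic labels that match the cohort counts. The helper `stratified_split` and its rounding rule are assumptions made for illustration; the source does not say which library or random seed was used, and a library implementation may round per-class counts differently.

```python
import random

def stratified_split(labels, test_fraction=0.20, seed=0):
    """Return (train_idx, test_idx) with each class split at the same fraction."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Synthetic labels matching the cohort table: 547 progressors, 1,883 non-progressors.
labels = [1] * 547 + [0] * 1883
train_idx, test_idx = stratified_split(labels)
test_pos = sum(labels[i] for i in test_idx)

print(len(train_idx), len(test_idx), test_pos)
print(f"test prevalence: {test_pos / len(test_idx):.3f}")
```

Splitting each class separately keeps the progressor share in the test set close to the 22.5% seen in the full cohort, which matters when the positive class is a minority.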

3. Target Definition

The binary target variable Target_24m was defined as cognitive progression within 24 months of the baseline visit, operationalized as a transition from MCI to dementia or Alzheimer's disease diagnosis as recorded in the ADNI longitudinal assessment data.

Class	Label	Count	Proportion
0	Low Risk Monitor	1,883	77.5%
1	High Risk Progressor	547	22.5%

Design note: The 24-month window was chosen to align with standard ADNI follow-up intervals and to capture clinically actionable near-term progression risk. Longer horizons (36–48 months) would increase progressor prevalence but reduce the actionability of the prediction for treatment initiation decisions.

4. Feature Set

Eight of the nine input features can be collected in a standard outpatient clinical encounter; the exception is APOE4, which requires a genotyping result. Missing values were median-imputed, with binary missingness indicators retained as auxiliary features. No imaging or plasma biomarker values are required.
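The imputation step described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the `_missing` suffix, the helper names, and the toy rows are hypothetical, and fitting the medians on training rows only is an assumption the source does not confirm.

```python
from statistics import median

def fit_medians(rows, features):
    """Learn per-feature medians from training rows, skipping missing (None) values."""
    return {f: median(r[f] for r in rows if r[f] is not None) for f in features}

def impute_with_indicators(rows, medians):
    """Fill missing values with the stored median and add a 0/1 '<feature>_missing' flag."""
    out = []
    for r in rows:
        new = {}
        for f, m in medians.items():
            missing = r[f] is None
            new[f] = m if missing else r[f]
            new[f + "_missing"] = int(missing)  # hypothetical indicator name
        out.append(new)
    return out

# Toy rows (made-up values) using two of the feature names from the table below.
train = [
    {"MMSE": 28, "EcogSPTotal": 1.5},
    {"MMSE": 24, "EcogSPTotal": None},
    {"MMSE": None, "EcogSPTotal": 2.5},
    {"MMSE": 26, "EcogSPTotal": 2.0},
]
medians = fit_medians(train, ["MMSE", "EcogSPTotal"])
imputed = impute_with_indicators(train, medians)
print(imputed[1])
```

Keeping the indicator lets the tree model treat "ECog not available" as information in its own right instead of silently equating it with a median score.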

Feature	Group	Description
MMSE	Cognitive	Mini-Mental State Examination — global cognitive screening tool (0–30). Lower scores indicate greater impairment.
RAVLT_immediate	Memory	Rey Auditory Verbal Learning Test — sum of trials 1–5. Measures verbal learning capacity; strongly sensitive to hippocampal dysfunction.
EcogSPTotal	Functional	Everyday Cognition scale rated by a study partner across 6 cognitive domains (1 = normal, 4 = impaired). Provides informant-based functional assessment.
APOE4	Genetic	Number of APOE ε4 alleles (0, 1, or 2). The strongest genetic risk factor for late-onset Alzheimer's disease.
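EcogMem_discrepancy	Functional	Difference between patient self-rating and study partner rating on the memory subscale (Pt − SP). Because higher ECog scores indicate greater impairment, negative values (the patient reports less impairment than the study partner) indicate anosognosia, a lack of insight into cognitive decline.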
AGE	Demographic	Age in years. Risk of progression increases non-linearly with age across the MCI population.
RAVLT_forgetting	Memory	Trial 5 minus delayed recall — measures retention loss over a delay interval. Elevated scores indicate accelerated forgetting.
PTEDUCAT	Demographic	Total years of formal education. Higher education is associated with greater cognitive reserve, potentially delaying symptom expression.
PTGENDER	Demographic	Biological sex (0 = female, 1 = male). Contributed zero SHAP importance in this model — sex did not independently predict 24-month progression in this cohort.
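The two derived features in the table are simple differences. The sketch below shows the arithmetic only; the function and argument names are hypothetical and the example scores are made up.

```python
def ravlt_forgetting(trial5_score, delayed_recall_score):
    """RAVLT forgetting: words recalled on trial 5 minus words recalled after the delay."""
    return trial5_score - delayed_recall_score

def ecog_mem_discrepancy(patient_memory, partner_memory):
    """ECog memory discrepancy: patient self-rating minus study partner rating (Pt - SP)."""
    return patient_memory - partner_memory

# Made-up example: 10 words on trial 5, 4 after the delay; ECog memory ratings on the 1-4 scale.
print(ravlt_forgetting(10, 4))
print(ecog_mem_discrepancy(1.5, 2.75))
```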

5. Model Architecture

An XGBoost gradient-boosted tree classifier was trained with sensitivity optimization. The classification threshold was set not at the default 0.5 but at the value that achieves approximately 90% sensitivity on the validation set, reflecting the screening-first clinical philosophy of Stage 1.

Hyperparameter	Value
Algorithm	XGBoost (gradient boosted trees)
Classification threshold	0.1082 (sensitivity-optimized)
Sensitivity target	≥ 0.90
Class imbalance	scale_pos_weight = N_negative / N_positive
Validation strategy	80/20 stratified train/test split
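For reference, the class-weight formula in the table works out as follows when applied to the full-cohort counts from Section 2. This is a sketch under assumptions: the weight was presumably computed on the training split (where the ratio is nearly identical), and the `objective` entry is assumed, not taken from the source.

```python
n_negative = 1883  # non-progressors in the full cohort
n_positive = 547   # progressors in the full cohort

# Each positive example is weighted as if it were about 3.44 negatives.
scale_pos_weight = n_negative / n_positive

# Parameter names follow XGBoost's documented options; the values other than
# scale_pos_weight are assumptions for illustration.
params = {
    "objective": "binary:logistic",
    "scale_pos_weight": scale_pos_weight,
}
print(round(scale_pos_weight, 2))  # 3.44
```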

Threshold design: A threshold of 0.1082 means the model flags a patient as high-risk whenever the estimated progression probability exceeds 10.8%. This aggressive threshold accepts a higher false positive rate (19.9%, i.e. 1 − specificity) in exchange for catching approximately 90% of true progressors (89.9% on the test set). False positives proceed to Stage 2a and 2b, where biomarker evidence is expected to correct them.
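One way to pick a sensitivity-targeted threshold is to scan candidate cut-offs from high to low and keep the first one that meets the target. The sketch below is an assumption about the procedure, not the authors' code; the function names and the toy scores are hypothetical, and the scan should be run on data kept separate from the final test set.

```python
def sensitivity_at(threshold, scores, labels):
    """Share of true progressors (label 1) whose score strictly exceeds the threshold."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s > threshold for s in pos) / len(pos)

def pick_threshold(scores, labels, target=0.90):
    """Highest observed score usable as a threshold while keeping sensitivity >= target."""
    for t in sorted(set(scores), reverse=True):
        if sensitivity_at(t, scores, labels) >= target:
            return t
    return 0.0  # fall back to flagging every patient with a positive score

# Toy validation scores (made up): 10 progressors followed by 6 non-progressors.
scores = [0.05, 0.12, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9,
          0.01, 0.03, 0.08, 0.11, 0.15, 0.35]
labels = [1] * 10 + [0] * 6

threshold = pick_threshold(scores, labels)
print(threshold, sensitivity_at(threshold, scores, labels))
```

Scanning from the top returns the least aggressive threshold that still meets the target, which limits the number of false positives sent downstream.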

6. Feature Importance (SHAP)

SHAP (SHapley Additive exPlanations) values quantify each feature's average contribution to the model output across the test set. Values reflect mean absolute SHAP — larger values indicate greater influence on the progression probability estimate, regardless of direction.
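Mean absolute SHAP is the per-feature average of the absolute per-patient SHAP values. The sketch below shows that aggregation on a made-up 3 × 3 matrix; in practice the matrix would come from a SHAP explainer run on the test set, which is not reproduced here.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Average |SHAP| per feature over all rows (one row per patient)."""
    n = len(shap_rows)
    return {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }

# Made-up SHAP values for three patients and three features.
toy_shap = [
    [0.20, -0.05, 0.0],
    [-0.10, 0.15, 0.0],
    [0.00, -0.10, 0.0],
]
importance = mean_abs_shap(toy_shap, ["MMSE", "RAVLT_immediate", "PTGENDER"])
print(importance)
```

Taking absolute values before averaging is what makes the metric direction-free: a feature that pushes risk up for some patients and down for others still registers as influential.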

Feature	Group	Mean |SHAP|
MMSE Score	Cognitive	0.1045
RAVLT Immediate Recall	Memory	0.0956
ECog Study Partner Total	Functional	0.0842
APOE ε4 Allele Count	Genetic	0.0376
ECog Memory Discrepancy	Functional	0.0263
Age	Demographic	0.0180
RAVLT Forgetting Score	Memory	0.0170
Years of Education	Demographic	0.0162
Sex	Demographic	0.0000 (no contribution)

Feature descriptions are given in Section 4.

Key findings

• MMSE, RAVLT, and ECog: The top three features (MMSE, RAVLT Immediate Recall, and ECog Study Partner Total) together account for approximately 71% of the summed mean |SHAP| across the nine listed features (0.2843 of 0.3994). All three measure cognitive function from different angles: global screening, verbal memory capacity, and informant-reported functional decline.
• ECog Memory Discrepancy: The discrepancy between self-rated and informant-rated memory captures anosognosia, a clinically significant marker of disease progression that is often missed by self-report instruments alone. Its presence in the top five features supports its inclusion.
• APOE4: APOE4 contributes meaningfully (mean |SHAP| = 0.038) despite being a static genetic variable. This reflects its role as a strong prior for amyloid accumulation and progression risk, particularly in the MCI population.
• Sex (PTGENDER): Sex contributed zero SHAP importance in this model. This does not imply sex is clinically irrelevant (it may be captured indirectly through correlated features such as education or ECog scores), but it did not independently drive predictions in this cohort.
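The relative share of each feature can be recomputed from the mean |SHAP| values reported in this section. The sketch below normalizes over the nine listed features only; any contribution from the missingness indicators is not reported in the source and is therefore left out, which is an assumption of this calculation.

```python
# Mean |SHAP| values as reported in Section 6.
mean_abs_shap = {
    "MMSE": 0.1045,
    "RAVLT_immediate": 0.0956,
    "EcogSPTotal": 0.0842,
    "APOE4": 0.0376,
    "EcogMem_discrepancy": 0.0263,
    "AGE": 0.0180,
    "RAVLT_forgetting": 0.0170,
    "PTEDUCAT": 0.0162,
    "PTGENDER": 0.0000,
}

total = sum(mean_abs_shap.values())
ranked = sorted(mean_abs_shap.items(), key=lambda kv: kv[1], reverse=True)
top3_share = sum(value for _, value in ranked[:3]) / total

for name, value in ranked:
    print(f"{name:22s} {value / total:6.1%}")
print(f"top-3 share: {top3_share:.1%}")
```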

7. Model Performance

Performance was evaluated on a held-out 20% stratified test set. The sensitivity-optimized threshold (0.1082) achieves 89.9% recall on progressors, close to the 90% target, while maintaining 80.1% specificity. Roughly 1 in 5 non-progressors is therefore flagged for downstream biomarker workup, which is acceptable in a gated pipeline where a false positive incurs the cost of Stage 2 biomarker testing (plasma and MRI) rather than immediate treatment.
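The reported accuracy of 82.3% follows from sensitivity, specificity, and class prevalence, since accuracy is the prevalence-weighted average of the two. The check below assumes the stratified test set has the same progressor share as the full cohort (547 of 2,430).

```python
sensitivity = 0.899
specificity = 0.801
prevalence = 547 / 2430  # assumed equal in the stratified test set

# Accuracy = P(correct | progressor) * P(progressor) + P(correct | non-progressor) * P(non-progressor)
accuracy = sensitivity * prevalence + specificity * (1 - prevalence)
false_positive_rate = 1 - specificity

print(f"accuracy ~ {accuracy:.3f}, false positive rate ~ {false_positive_rate:.3f}")
```

Because non-progressors make up more than three quarters of the sample, accuracy sits much closer to specificity than to sensitivity.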

Metric	Value	Interpretation
AUC	0.916	Discrimination
Sensitivity	89.9%	True positive rate
Specificity	80.1%	True negative rate
Accuracy	82.3%	Overall

AUC interpretation: An AUC of 0.916 indicates that the model correctly ranks a randomly selected progressor above a randomly selected non-progressor 91.6% of the time — substantially better than chance and competitive with published clinical prediction models for MCI progression using similar feature sets.
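The ranking interpretation of AUC can be made concrete by counting progressor/non-progressor pairs directly. The sketch below uses made-up scores and a hypothetical helper name; ties are counted as half a win, which is the usual convention.

```python
def pairwise_auc(scores, labels):
    """Fraction of (progressor, non-progressor) pairs where the progressor scores higher."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Made-up scores: 3 progressors, 4 non-progressors.
scores = [0.9, 0.4, 0.35, 0.5, 0.2, 0.1, 0.05]
labels = [1, 1, 1, 0, 0, 0, 0]
auc = pairwise_auc(scores, labels)
print(f"{auc:.3f}")
```

Unlike sensitivity and specificity, this quantity does not depend on the 0.1082 threshold; it describes ranking quality across all possible cut-offs.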

8. Pipeline Integration

Stage 1 output determines the downstream routing decision for each patient:

HIGH_RISK_PROGRESSOR

Patient proceeds to Stage 2a (plasma biomarker triage) and Stage 2b (MRI neurodegeneration gate), which run in parallel to minimize time-to-diagnosis.

LOW_RISK_MONITOR

Patient is placed on a 12-month clinical monitoring schedule. No immediate biomarker testing is triggered. The Stage 1 result is logged for longitudinal tracking.

Stage 1 output also feeds directly into the decision support layer (Tools 1–6), contributing to risk stratification tier assignment, NHI diagnosis code suggestions, and the uncertainty flag engine. The progression_probability field is used as an amyloid probability proxy when Stage 2a plasma data is unavailable.
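The routing rule described in this section can be sketched as a small function. The two labels, the 0.1082 threshold, and the `progression_probability` field come from the text; the function name, the `label` and `next_steps` keys, and the step identifiers are hypothetical.

```python
STAGE1_THRESHOLD = 0.1082  # sensitivity-optimized threshold from Section 5

def route_stage1(progression_probability, threshold=STAGE1_THRESHOLD):
    """Map a Stage 1 probability to a routing label and its follow-up steps."""
    if progression_probability > threshold:
        label = "HIGH_RISK_PROGRESSOR"
        next_steps = ["stage_2a_plasma", "stage_2b_mri"]  # run in parallel
    else:
        label = "LOW_RISK_MONITOR"
        next_steps = ["clinical_monitoring_12_months"]
    return {
        "label": label,
        "progression_probability": progression_probability,
        "next_steps": next_steps,
    }

print(route_stage1(0.25))
print(route_stage1(0.05))
```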

9. Limitations

• The model was trained exclusively on ADNI data. ADNI participants are predominantly highly educated, English-speaking, and North American. Performance in Korean community clinic populations, where educational norms, neuropsychological test administration, and MCI referral patterns differ, requires prospective validation.
• ECog requires a reliable study partner (family member or caregiver). In patients without an available informant, the ECog subscores cannot be obtained, reducing model completeness. Missingness indicators partially mitigate this but do not fully compensate.
• The RAVLT is a verbal memory test and may underperform in patients whose primary language is not English, or in those with hearing impairment or low premorbid literacy.
• The 24-month outcome window creates a survivor bias: patients who died, withdrew, or were lost to follow-up before month 24 are excluded, which may overestimate model performance in real-world populations with higher attrition.
• The sensitivity-optimized threshold (0.1082) produces a 19.9% false positive rate. In populations with lower MCI-to-AD base rates than ADNI, the positive predictive value will be lower, and more patients will be unnecessarily routed to biomarker testing.
• This tool is intended for research-use clinical decision support only. It is not cleared as a medical device and must not be used as a standalone diagnostic instrument.
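The base-rate limitation above can be quantified with Bayes' rule. The sketch below assumes sensitivity and specificity stay at their ADNI test-set values in other populations, which may not hold; the 10% and 5% prevalence figures are illustrative, not measured.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of flagged patients who are true progressors."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

SENSITIVITY = 0.899
SPECIFICITY = 0.801

# 22.5% is the ADNI cohort prevalence; the lower values are hypothetical settings.
for prevalence in (0.225, 0.10, 0.05):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.1%} -> PPV {ppv:.1%}")
```

At the ADNI prevalence a little over half of flagged patients are true progressors; at a 10% base rate the figure drops to about one in three, so most Stage 2 referrals would be false positives.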

10. References

[1] Petersen RC, et al. Alzheimer's Disease Neuroimaging Initiative (ADNI): Clinical characterization. Neurology. 2010;74(3):201–209.
[2] Folstein MF, Folstein SE, McHugh PR. Mini-Mental State: a practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res. 1975;12(3):189–198.
[3] Rey A. L'examen clinique en psychologie. Paris: Presses Universitaires de France; 1958.
[4] Farias ST, et al. The measurement of everyday cognition (ECog): Scale development and initial validation. Neuropsychology. 2008;22(4):531–544.
[5] Corder EH, et al. Gene dose of apolipoprotein E type 4 allele and the risk of Alzheimer's disease in late onset families. Science. 1993;261(5123):921–923.
[6] Lundberg SM, Lee SI. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems. 2017;30.
[7] Chen T, Guestrin C. XGBoost: A scalable tree boosting system. KDD. 2016:785–794.
[8] Mitchell AJ, Shiri-Feshki M. Rate of progression of mild cognitive impairment to dementia – meta-analysis of 41 robust inception cohort studies. Acta Psychiatr Scand. 2009;119(4):252–265.