Real performance numbers
Every screening algorithm is validated on independent datasets across multiple sources, devices, and patient populations. All metrics are reported with 95% confidence intervals.
| Condition | Modality | AUC-ROC | Sensitivity | Specificity | 95% CI (AUC) |
|---|---|---|---|---|---|
| Diabetic Retinopathy | Fundus | 0.949 | 90.1% | 82.8% | 0.938–0.960 |
| Glaucoma | Fundus | 0.979 | 93.0% | 94.0% | 0.972–0.985 |
| Hypertensive Retinopathy | Fundus | 0.968 | 93.7% | 89.1% | 0.956–0.978 |
| AMD & DME | OCT | 0.999 | 99.2% | 99.4% | 0.999–1.000 |
All metrics validated on held-out external datasets not used during model training.
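The document does not state how the 95% confidence intervals are computed; a common choice for AUC on a held-out test set is the percentile bootstrap. A minimal sketch, assuming rank-based AUC and simple resampling with replacement (all function names here are illustrative):

```python
import random

def auc(labels, scores):
    """Rank-based AUC: probability a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample the test set, recompute AUC each time."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in sample]
        ss = [scores[i] for i in sample]
        if 0 < sum(ys) < n:  # need both classes present in the resample
            stats.append(auc(ys, ss))
    stats.sort()
    lo = stats[int(alpha / 2 * len(stats))]
    hi = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lo, hi
```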
Validation Methodology
How we ensure our screening models perform reliably across real-world clinical environments.
External Dataset Validation
All models are evaluated on independent, held-out datasets that were never seen during training. Across all conditions, RetGuard has been validated on 13 external datasets sourced from institutions across multiple countries, device manufacturers, and clinical settings.
Cross-Device Generalization
Screening performance is tested across images captured by different fundus cameras and OCT devices to ensure the models generalize beyond a single hardware platform. This is critical for deployment across diverse clinical environments.
Population Diversity
Validation datasets include patients across a range of ages, ethnicities, disease severities, and comorbidity profiles. This ensures the models perform equitably and do not degrade for underrepresented patient groups.
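Equity checks of this kind typically stratify the same headline metrics by subgroup. A minimal sketch of per-group sensitivity, assuming a hypothetical record schema with `group`, `label`, and `pred` fields (not the product's actual data model):

```python
def sensitivity(labels, preds):
    """True-positive rate: flagged positives over all true positives."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return tp / (tp + fn) if tp + fn else float("nan")

def subgroup_sensitivity(records):
    """Compute sensitivity separately for each patient subgroup."""
    groups = {}
    for r in records:
        ys, ps = groups.setdefault(r["group"], ([], []))
        ys.append(r["label"])
        ps.append(r["pred"])
    return {g: sensitivity(ys, ps) for g, (ys, ps) in groups.items()}
```

Large gaps between subgroup values would indicate degraded performance for an underrepresented group.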
Clinical Robustness
Designed for real-world screening.
Model Design
Sensitivity-First Thresholds
Operating points are tuned to maximize sensitivity — ensuring at-risk patients are flagged for referral, even at the cost of slightly higher false positive rates.
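One standard way to tune such an operating point is to scan candidate thresholds on a validation set and keep the most specific one that still meets a sensitivity target. A minimal sketch under that assumption (threshold values and the 0.95 target are illustrative):

```python
def pick_operating_point(labels, scores, min_sens=0.95):
    """Scan thresholds from high to low; return the first (most specific)
    threshold whose sensitivity meets the target, with its sens/spec."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    for t in sorted(set(scores), reverse=True):
        preds = [1 if s >= t else 0 for s in scores]
        tp = sum(p for p, y in zip(preds, labels) if y == 1)
        tn = sum(1 - p for p, y in zip(preds, labels) if y == 0)
        sens, spec = tp / n_pos, tn / n_neg
        if sens >= min_sens:
            return t, sens, spec
    return min(scores), 1.0, 0.0
```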
Calibrated Confidence Scores
Every prediction includes a calibrated probability, not just a binary pass/fail. Clinicians see exactly how confident the model is for each condition.
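The calibration method is not specified here; histogram binning is one standard post-hoc technique, mapping each raw-score bin to the event rate observed in a held-out calibration set. A minimal sketch (function names and bin count are illustrative):

```python
def fit_binned_calibrator(scores, labels, n_bins=10):
    """Histogram binning: learn the observed positive rate per score bin."""
    bins = [[] for _ in range(n_bins)]
    for s, y in zip(scores, labels):
        bins[min(int(s * n_bins), n_bins - 1)].append(y)
    # fall back to the bin midpoint when a bin received no calibration data
    return [sum(b) / len(b) if b else (i + 0.5) / n_bins
            for i, b in enumerate(bins)]

def calibrate(score, table):
    """Replace a raw model score with the learned per-bin event rate."""
    n_bins = len(table)
    return table[min(int(score * n_bins), n_bins - 1)]
```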
Interpretability
Grad-CAM Evidence Maps
Each result includes visual heatmaps highlighting image regions clinicians can verify.
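Grad-CAM weights each feature map of the last convolutional layer by the spatial average of its gradient, then keeps only positive evidence. A framework-free toy sketch of that computation on small activation maps (real implementations pull `activations` and `gradients` from a trained network):

```python
def grad_cam(activations, gradients):
    """activations, gradients: lists of HxW maps, one per channel.
    Returns ReLU(sum_k alpha_k * A_k), with alpha_k the pooled gradient."""
    h, w = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * w for _ in range(h)]
    for a_map, g_map in zip(activations, gradients):
        # channel weight = global average pool of the gradient map
        alpha = sum(sum(row) for row in g_map) / (h * w)
        for i in range(h):
            for j in range(w):
                heatmap[i][j] += alpha * a_map[i][j]
    # ReLU keeps only regions with positive influence on the prediction
    return [[max(0.0, v) for v in row] for row in heatmap]
```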
Multi-Condition Correlation
All five conditions are analyzed in a single pass, allowing the system to flag clinically expected co-occurrences, such as DR alongside DME.
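One simple way to surface such co-occurrences is to check pairs of per-condition probabilities against a rule list after the shared forward pass. A minimal sketch; the pair list and threshold are illustrative, not the product's actual logic:

```python
# Illustrative list of clinically expected co-occurrences
EXPECTED_PAIRS = [("DR", "DME")]

def flag_cooccurrences(probs, threshold=0.5, pairs=EXPECTED_PAIRS):
    """Return expected pairs where both conditions score above threshold.
    probs: dict mapping condition name -> probability from one pass."""
    return [(a, b) for a, b in pairs
            if probs.get(a, 0.0) >= threshold and probs.get(b, 0.0) >= threshold]
```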
Safety & Reliability
Built-in safeguards that automatically flag suboptimal images and unusual inputs before they reach the screening algorithms.
Image Quality Assurance
Every image is automatically assessed for clinical gradability before analysis. Images that do not meet quality standards are flagged with clear guidance, prompting the operator to recapture and ensuring only reliable inputs proceed.
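The gradability criteria are not specified; a simple pre-analysis gate might check exposure and contrast and return recapture guidance. A minimal sketch, with purely illustrative thresholds (real quality models are typically learned, not rule-based):

```python
def gradability_check(image, min_mean=30, max_mean=225, min_std=20):
    """image: 2-D list of 0-255 grayscale values. Flags frames that are
    too dark, washed out, or low-contrast, with operator guidance."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    std = (sum((p - mean) ** 2 for p in pixels) / len(pixels)) ** 0.5
    if not (min_mean <= mean <= max_mean):
        return False, "exposure out of range; recapture with adjusted illumination"
    if std < min_std:
        return False, "low contrast; check focus and media opacity"
    return True, "gradable"
```

Only images that pass the gate would proceed to the screening models.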
Input Validation
The system automatically identifies and flags inputs that fall outside its validated range. This prevents unreliable predictions and ensures clinicians are alerted whenever an image requires manual review.
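The detection method is not described; one common lightweight proxy for out-of-range inputs is a low maximum softmax probability, which routes uncertain cases to manual review. A minimal sketch under that assumption (the 0.7 cutoff is illustrative):

```python
import math

def max_softmax(logits):
    """Highest class probability under a numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    return max(exps) / sum(exps)

def needs_manual_review(logits, min_confidence=0.7):
    """Flag inputs whose top-class probability is low, a simple proxy
    for images outside the model's validated range."""
    return max_softmax(logits) < min_confidence
```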