AI Red-Team Realism Auditor
The AI Realism Auditor is a post-run verification workflow for IINTS-AF physiological simulations. It is designed to answer one question:
Did the simulator produce a biologically plausible extreme, or did the math/data pipeline produce an impossible state?
It is not a clinical decision tool. It is an engineering and education tool for finding simulator bugs, broken assumptions, and unrealistic edge cases before a model's output is presented as evidence.
What It Does
The workflow has two layers:
- Deterministic filter: scans every row in a CSV for hard violations such as negative glucose, an impossible glucose rate of change, explosive ketones, or negative pump delivery.
- Optional local AI verdict: sends only the small window of rows around a flagged event to a local Ollama server running Ministral, so the explanation stays local and context-bounded.
This avoids sending a full 14-day run to an LLM. The LLM explains flagged evidence; it does not make insulin-dosing decisions and it does not replace the deterministic filter.
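The "small window around a flagged event" idea can be sketched as follows. This is a minimal illustration, not the auditor's actual code: the function name `evidence_windows`, the `radius` parameter, and the 5-minute step used in the example are all assumptions. It turns flagged row indices into merged row ranges so that nearby flags share one window instead of producing duplicate context.

```python
from typing import List, Tuple


def evidence_windows(flagged: List[int], n_rows: int, radius: int = 12) -> List[Tuple[int, int]]:
    """Return merged half-open [start, end) row ranges around flagged row indices."""
    windows: List[Tuple[int, int]] = []
    for idx in sorted(set(flagged)):
        start = max(0, idx - radius)
        end = min(n_rows, idx + radius + 1)
        if windows and start <= windows[-1][1]:
            # Overlaps or touches the previous window: extend it instead of duplicating rows.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows


# Assuming 5-minute steps, a 14-day run has 4032 rows; three flags yield two small windows.
print(evidence_windows([100, 105, 3000], n_rows=4032, radius=12))
# → [(88, 118), (2988, 3013)]
```

In this example only 55 of the 4032 rows would be passed on for explanation.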
Install Requirements
For the full Jetson/research workflow, install the SDK with report dependencies:
python3 -m pip install -e ".[full]"
For local AI explanations, Ollama must be running locally:
ollama serve
ollama pull ministral-3:8b
On small edge devices, start with deterministic mode, which skips the AI layer:
python3 -m iints.tools.ai_realism_auditor results/red_team/endurance_data.csv --no-ai
Path A: Official Jetson Endurance Output
Run the official endurance workflow:
iints jetson endurance start \
    --algo algorithms/example_algorithm.py \
    --duration 14d \
    --output-dir results/jetson_14day \
    --profile mixed_adversarial \
    --seed 42
Audit the raw step CSV:
python3 -m iints.tools.ai_realism_auditor \
    results/jetson_14day/raw/steps.csv \
    --report results/jetson_14day/final/AI_REALISM_AUDIT.md \
    --no-ai
With local Ollama enabled:
python3 -m iints.tools.ai_realism_auditor \
    results/jetson_14day/raw/steps.csv \
    --report results/jetson_14day/final/AI_REALISM_AUDIT.md \
    --model ministral-3:8b
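For readers curious what a context-bounded request to a local Ollama server could look like, here is a rough sketch. It is not the auditor's internal code: the function names, the prompt wording, and the row fields are assumptions made for illustration. The endpoint and the `model`, `prompt`, and `stream` fields follow Ollama's standard local generate API.

```python
import json
import urllib.request
from typing import Dict, List

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(
    window_rows: List[Dict[str, float]],
    flags: List[str],
    model: str = "ministral-3:8b",
) -> Dict[str, object]:
    """Build a non-streaming generate request containing only the flagged window."""
    # Hypothetical prompt wording; only the rows around the event are included.
    prompt = (
        "You are reviewing simulator output, not giving clinical advice.\n"
        f"Deterministic flags: {', '.join(flags)}\n"
        "Rows around the event (JSON):\n"
        f"{json.dumps(window_rows)}\n"
        "Is this a plausible physiological extreme or a likely simulator/data bug? "
        "Explain briefly."
    )
    return {"model": model, "prompt": prompt, "stream": False}


def ask_ollama(payload: Dict[str, object], timeout_s: float = 120.0) -> str:
    """POST the payload to a local Ollama server and return the generated text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.loads(resp.read())["response"]
```

Keeping payload construction separate from the network call makes the prompt size easy to inspect without a running server.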
Path B: Advanced Metabolic Model Stress CSV
For an educational DKA/lipotoxicity stress demo, generate a CSV directly from the AdvancedMetabolicModel:
PYTHONPATH=src python3 examples/jetson_endurance_test.py \
    --days 14 \
    --output results/red_team/endurance_data.csv \
    --inject-demo-glitch
Then audit it:
PYTHONPATH=src python3 -m iints.tools.ai_realism_auditor \
    results/red_team/endurance_data.csv \
    --report results/red_team/AI_REALISM_AUDIT.md \
    --no-ai
The --inject-demo-glitch flag writes one deliberately corrupted row. Use it only for a self-test or a booth demonstration. It proves that the auditor catches impossible states; it is not evidence that the physiological model produced that state on its own.
Red-Team Thresholds
The deterministic filter flags a row when any of these conditions holds:
| Check | Threshold |
|---|---|
| Negative glucose | < 0 mg/dL |
| Extreme hypoglycemia | < 20 mg/dL |
| Extreme hyperglycemia | > 800 mg/dL |
| Impossible glucose velocity | abs(dG/dt) > 15 mg/dL/min |
| Explosive ketones | > 15 mmol/L |
| Negative insulin delivery | < 0 U |
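The thresholds in the table translate directly into per-row checks. The sketch below is illustrative only: the column names (`time_min`, `glucose_mgdl`, `ketones_mmol`, `insulin_u`) and the flag labels are hypothetical and may not match the real CSV schema. The velocity check needs the previous row, since it compares consecutive glucose readings over the elapsed time in minutes.

```python
from typing import Dict, List, Optional

# Thresholds copied from the table above.
GLUCOSE_MIN_MGDL = 20.0
GLUCOSE_MAX_MGDL = 800.0
MAX_GLUCOSE_VELOCITY = 15.0  # mg/dL/min
KETONE_MAX_MMOL = 15.0


def check_row(row: Dict[str, float], prev: Optional[Dict[str, float]] = None) -> List[str]:
    """Return the names of the checks violated by one row (hypothetical column names)."""
    flags: List[str] = []
    g = row["glucose_mgdl"]
    if g < 0:
        flags.append("negative_glucose")
    elif g < GLUCOSE_MIN_MGDL:
        flags.append("extreme_hypoglycemia")
    elif g > GLUCOSE_MAX_MGDL:
        flags.append("extreme_hyperglycemia")
    if prev is not None:
        dt = row["time_min"] - prev["time_min"]
        # Skip the velocity check when timestamps do not advance.
        if dt > 0 and abs(g - prev["glucose_mgdl"]) / dt > MAX_GLUCOSE_VELOCITY:
            flags.append("impossible_glucose_velocity")
    if row["ketones_mmol"] > KETONE_MAX_MMOL:
        flags.append("explosive_ketones")
    if row["insulin_u"] < 0:
        flags.append("negative_insulin_delivery")
    return flags


prev = {"time_min": 0, "glucose_mgdl": 120, "ketones_mmol": 0.4, "insulin_u": 0.05}
row = {"time_min": 5, "glucose_mgdl": -30, "ketones_mmol": 0.4, "insulin_u": 0.05}
print(check_row(row, prev))
# → ['negative_glucose', 'impossible_glucose_velocity']
```

One design choice in this sketch: a negative glucose value is reported only as negative glucose, not additionally as extreme hypoglycemia, so the most severe label wins.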
Mathematical Model Summary
The current advanced model extends the Bergman-style state vector to 18 states:
where:
- \(G\): plasma glucose in mg/dL
- \(X\): remote insulin action
- \(I\): plasma insulin
- \(F\): free fatty acids (FFA)
- \(K\): ketones
- \(\beta\): residual beta-cell mass fraction
- \(Q_{fat}\): slow fat stomach pool
- \(Q_{prot}\): slow protein pool
Free Fatty Acids
Insulin suppresses lipolysis. The implemented FFA dynamics are:
with the current parameters:
Ketones
Ketone production is driven by high FFA and low insulin:
with:
Lipotoxic Insulin Resistance
High FFA reduces insulin sensitivity through:
and the effective insulin-action gain becomes:
Beta-Cell Decay
Residual beta-cell mass decays exponentially:
where \(a\) is autoimmune_aggressiveness.
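As a worked form of the statement above, and assuming \(a\) enters directly as the first-order rate constant (the implementation may scale it or combine it with other terms), exponential decay can be written as follows. Here \(\beta_0\) is notation introduced for this sketch and denotes the initial residual mass fraction.

```latex
\frac{d\beta}{dt} = -a\,\beta
\quad\Longrightarrow\quad
\beta(t) = \beta_0\, e^{-a t}
```

Under that assumption the residual mass halves every \(\ln 2 / a\) time units, so a larger autoimmune_aggressiveness shortens the time to near-complete beta-cell loss.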
Endogenous Glucose Production
The T1D instability upgrade removes the old automatic pull back to basal glucose and models glucose as:
where:
and starvation/hepatic resistance scales the effective basal term:
Safety Guard
The implementation still applies simulator guards after ODE solving:
This means a negative glucose value should not appear in normal model output. If it appears in a CSV, the auditor should treat it as data corruption or a mathematical bug.
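The exact guards are defined in the implementation. As a generic illustration of post-solve clamping, the sketch below assumes simple non-negativity floors on a few states. The state names, floor values, and the `apply_guards` function are all hypothetical.

```python
from typing import Dict

# Hypothetical lower bounds; the real guard values are not listed in this document.
STATE_FLOORS: Dict[str, float] = {"G": 0.0, "I": 0.0, "F": 0.0, "K": 0.0, "beta": 0.0}


def apply_guards(state: Dict[str, float]) -> Dict[str, float]:
    """Clamp selected states to their floors after an ODE step; other states pass through."""
    return {
        name: max(value, STATE_FLOORS.get(name, float("-inf")))
        for name, value in state.items()
    }


print(apply_guards({"G": -3.2, "K": 1.1, "X": -0.01}))
# → {'G': 0.0, 'K': 1.1, 'X': -0.01}
```

With a guard of this kind in place, a negative glucose value cannot leave the solver step, which is why the auditor can attribute one found in a CSV to something downstream of the model or to a bug in the guard itself.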
What To Say At A Booth
Use this phrasing:
We run the digital patient for many simulated days. A fast red-team filter scans every row for impossible biology. If something suspicious appears, a local AI explains the small evidence window in plain language. The AI explains the bug; it does not control the pump.
That distinction matters: it keeps the AI in an explanatory role and out of the control loop, which is what makes the claim credible.