
Local AI Research

Use this page when you want to train local models from IINTS-AF data instead of only running hand-written algorithms.

The SDK now separates three AI roles:

| Role | Best model type | Trained on | Purpose |
| --- | --- | --- | --- |
| explanation assistant | local LLM such as Ministral via Ollama | reports and certified payloads | explain, summarize, review |
| glucose predictor | time-series model | real multimodal T1D datasets | forecast future glucose |
| controller policy | compact numeric policy model | supervised safe-action labels from simulated runs | propose insulin actions in research simulations |

Do not confuse them:

  • Ministral is useful for language and review.
  • A controller needs physiological inputs and numeric validation.
  • The deterministic safety supervisor still wraps experimental policy outputs.

For the dedicated glucose-model workflow, start with IINTS Glucose Forecast Model. That page covers the new iints research glucose-model ... commands, long-training setup, and Hugging Face-safe export.

Start by generating the dataset acquisition plan:

iints data research-plan --output-dir data_packs/research_dataset_plan

The full source map lives in Diabetes Research Datasets.

| Dataset family | Best current role | Why |
| --- | --- | --- |
| AZT1D | primary multimodal predictor training | real AID-system data with detailed bolus variables |
| HUPA-UCM | multimodal predictor training and subgroup analysis | CGM, insulin, carbs, steps, calories, heart rate, sleep |
| OhioT1DM | external held-out benchmark | widely used T1D forecasting benchmark with CGM, insulin, meals, exercise, and life events |
| DCLP3/iDCL and Loop | closed-loop / AID external validation | useful for benchmark-style comparisons after schema conversion |
| T1DEXI / T1DexiP | exercise-aware stress testing | captures exercise context that ordinary meal-only scenarios miss |
| MetaboNet / Glucose-ML | dataset-selection references | useful for cross-dataset benchmark design and source discovery |
| Jetson / simulator teacher runs | controller-policy imitation data | gives exact safe-action labels under known scenarios |

The scientific split is intentional:

  • real datasets teach glucose dynamics
  • teacher-labeled simulator runs teach experimental action policies

This split is more defensible than expecting one small dataset to solve both problems safely.

1. Build A Better Real-Data Predictor Blend

Prepare the source datasets first:

iints research prepare-azt1d
iints research prepare-hupa
iints research prepare-ohio

If you have the full request-gated OhioT1DM XML release locally, prepare it as train/test/all files and keep the raw data outside git:

export OHIO_T1DM_ROOT="/path/to/OhioT1DM-volledig"

PYTHONPATH=src python3 research/prepare_ohio_t1dm.py \
  --input "$OHIO_T1DM_ROOT" \
  --splits train \
  --output data_packs/public/ohio_t1dm_full/processed/ohio_train.csv \
  --report data_packs/public/ohio_t1dm_full/processed/ohio_train_quality_report.json

PYTHONPATH=src python3 research/prepare_ohio_t1dm.py \
  --input "$OHIO_T1DM_ROOT" \
  --splits test \
  --output data_packs/public/ohio_t1dm_full/processed/ohio_test.csv \
  --report data_packs/public/ohio_t1dm_full/processed/ohio_test_quality_report.json

Recommended leakage-safe Ohio workflow:

PYTHONPATH=src python3 research/train_predictor.py \
  --data data_packs/public/ohio_t1dm_full/processed/ohio_train.csv \
  --config research/configs/predictor_ohio_dual_guard_v2.yaml \
  --out models/ohio_t1dm_full

PYTHONPATH=src python3 research/evaluate_predictor.py \
  --data data_packs/public/ohio_t1dm_full/processed/ohio_test.csv \
  --model models/ohio_t1dm_full/predictor.pt \
  --reference-data data_packs/public/ohio_t1dm_full/processed/ohio_train.csv \
  --subgroup-column subject_id \
  --subgroup-column dataset_year \
  --out results/ohio_t1dm_full_eval.json

Use ohio_train.csv for fitting, ohio_test.csv for held-out evaluation, and ohio_all.csv only for descriptive cohort summaries or final re-training after the benchmark protocol is frozen.
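Before trusting a held-out score, it can help to confirm that the two files really are disjoint in time. The following is a minimal sketch, not part of the SDK. It assumes the processed CSVs carry a `subject_id` column and an ISO-8601 `timestamp` column (the latter name is an assumption), and that the same subjects appear in both files with test data recorded after training data. The rows are made-up toy values.

```python
from datetime import datetime


def find_temporal_leaks(train_rows, test_rows, id_key="subject_id", time_key="timestamp"):
    """Return subject IDs whose test samples start at or before their last training sample."""
    last_train = {}
    for row in train_rows:
        ts = datetime.fromisoformat(row[time_key])
        sid = row[id_key]
        if sid not in last_train or ts > last_train[sid]:
            last_train[sid] = ts

    leaks = set()
    for row in test_rows:
        sid = row[id_key]
        ts = datetime.fromisoformat(row[time_key])
        # A test sample that is not strictly later than training data is a leak.
        if sid in last_train and ts <= last_train[sid]:
            leaks.add(sid)
    return sorted(leaks)


# Toy rows for illustration only; column names are assumptions.
train = [
    {"subject_id": "s1", "timestamp": "2021-12-07T01:00:00"},
    {"subject_id": "s2", "timestamp": "2021-09-13T12:00:00"},
]
test = [
    {"subject_id": "s1", "timestamp": "2022-01-18T00:00:00"},
    {"subject_id": "s2", "timestamp": "2021-09-13T11:55:00"},  # overlaps training
]

print(find_temporal_leaks(train, test))  # ['s2']
```

In practice the rows would come from `csv.DictReader` over `ohio_train.csv` and `ohio_test.csv`; an empty result is the expected outcome.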

Then blend the real training sources while preserving source-aware subject IDs:

iints research blend-datasets \
  --source azt1d=data_packs/public/azt1d/processed/azt1d_merged.csv \
  --source hupa=data_packs/public/hupa_ucm/processed/hupa_ucm_merged.csv \
  --output data_packs/processed/predictor_blend.csv \
  --manifest data_packs/processed/predictor_blend_manifest.json
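
Why source-aware subject IDs matter: two datasets can both contain a subject called `1`, and a naive concatenation would merge them into one person. The sketch below illustrates the idea only. It is not the `blend-datasets` implementation, and the `source:id` prefix format and the `glucose_mg_dl` column are assumptions made for this example.

```python
def blend_sources(sources, id_key="subject_id"):
    """Concatenate rows from several sources, namespacing subject IDs by source."""
    blended = []
    for source_name, rows in sources.items():
        for row in rows:
            merged = dict(row)  # copy so the input rows stay untouched
            merged["source"] = source_name
            merged[id_key] = f"{source_name}:{row[id_key]}"
            blended.append(merged)
    return blended


# Toy rows: both sources use the raw subject ID "1".
sources = {
    "azt1d": [{"subject_id": "1", "glucose_mg_dl": 110.0}],
    "hupa": [{"subject_id": "1", "glucose_mg_dl": 95.0}],
}

blended = blend_sources(sources)
print(sorted({row["subject_id"] for row in blended}))  # ['azt1d:1', 'hupa:1']
```

With namespaced IDs, any later subject-level split or subgroup report treats the two people as distinct.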

Train on the blend, then use OhioT1DM separately as an external benchmark rather than silently mixing every dataset together:

PYTHONPATH=src python3 research/train_predictor.py \
  --data data_packs/processed/predictor_blend.csv \
  --config research/configs/predictor_multimodal_dual_guard.yaml \
  --out models/predictor_blend

PYTHONPATH=src python3 research/evaluate_predictor.py \
  --data data_packs/public/ohio_t1dm/processed/ohio_t1dm_merged.csv \
  --model models/predictor_blend/predictor.pt \
  --external-data ohio=data_packs/public/ohio_t1dm/processed/ohio_t1dm_merged.csv \
  --reference-data data_packs/processed/predictor_blend.csv \
  --out results/predictor_blend_external_eval.json

2. Forecast One Run Before Training A Controller

Glucose prediction now has a dedicated evidence command. Use it on a run folder or CSV before you start making controller claims:

iints research forecast-run \
  --input results/jetson_research_day \
  --output-dir results/jetson_research_day_forecast

With a trained predictor checkpoint:

iints research forecast-run \
  --input results/jetson_research_day \
  --predictor models/predictor_blend/predictor.pt \
  --output-dir results/jetson_research_day_forecast_ai

For hidden-biology stress testing, you can explicitly simulate an insulin-antibody-like delay in the forecast evidence:

iints research forecast-run \
  --input results/jetson_research_day \
  --hidden-biology insulin-antibody \
  --output-dir results/jetson_research_day_forecast_antibody

This does not diagnose antibody problems and it is not a normal assumption for most people with type 1 diabetes. It is a research-only stress test for the idea that some relevant biological variables are hidden from the algorithm.

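Each forecast-run invocation writes the following files to its output directory (shown here for the first example):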

results/jetson_research_day_forecast/
  forecast_predictions.csv
  forecast_report.json
  forecast_report.md
  forecast_manifest.json

The forecast bundle compares:

  • a last-value baseline
  • a transparent physiology-aware baseline using glucose trend, insulin on board (IOB), carbs on board (COB), insulin sensitivity factor (ISF), insulin-to-carb ratio (ICR), activity and stress features
  • the neural predictor when a checkpoint is provided
  • optional hidden-biology feature overrides such as insulin-antibody-like binding/release

The key rule is simple: the AI predictor must beat transparent baselines and must be checked for missed hypoglycemia, uncertainty, calibration and risk-level behavior before it is used in a research controller experiment.
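
The rule above can be made concrete with two numbers: forecast error against the last-value baseline, and the fraction of true hypoglycemic samples the forecast failed to flag. The sketch below is illustrative only and does not reproduce the contents of `forecast_report.json`. The 70 mg/dL cut-off is a commonly used threshold adopted here as an assumption, and all glucose values and the model predictions are made up.

```python
import math

HYPO_THRESHOLD_MG_DL = 70.0  # assumed cut-off for this sketch


def rmse(actual, predicted):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def last_value_forecast(series, horizon_steps):
    """Predict that glucose at t + horizon equals the value observed at t."""
    return series[:-horizon_steps]


def missed_hypo_rate(actual, predicted, threshold=HYPO_THRESHOLD_MG_DL):
    """Fraction of true hypoglycemic samples where the forecast stayed at or above threshold."""
    hypo_idx = [i for i, a in enumerate(actual) if a < threshold]
    if not hypo_idx:
        return 0.0
    missed = sum(1 for i in hypo_idx if predicted[i] >= threshold)
    return missed / len(hypo_idx)


# Toy glucose trace (mg/dL) and a made-up model forecast two steps ahead.
glucose = [120.0, 110.0, 98.0, 85.0, 72.0, 66.0, 64.0, 70.0]
horizon = 2
actual = glucose[horizon:]
baseline = last_value_forecast(glucose, horizon)
model = [100.0, 88.0, 74.0, 68.0, 66.0, 69.0]

print("baseline RMSE:", round(rmse(actual, baseline), 2))
print("model RMSE:", round(rmse(actual, model), 2))
print("baseline missed hypo rate:", missed_hypo_rate(actual, baseline))
print("model missed hypo rate:", missed_hypo_rate(actual, model))
```

A model that wins on RMSE but has a higher missed-hypo rate than the baseline should not move forward to a controller experiment.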

3. Build Controller Training Data

The Jetson research runner now writes:

research/
  predictor_training.csv
  controller_teacher_dataset.csv
  training_manifest.json

For a true 24-hour research acquisition:

iints jetson endurance start \
  --algo algorithms/example_algorithm.py \
  --duration 1d \
  --profile normal \
  --wall-clock \
  --output-dir results/jetson_research_day

You can combine several safe supervised runs into one controller dataset:

iints research build-control-dataset \
  --run day1=results/jetson_research_day \
  --run stress=results/jetson_stress_day \
  --output data_packs/processed/controller_teacher_dataset.csv \
  --manifest data_packs/processed/controller_teacher_manifest.json

4. Train A Local Controller

The first controller learner is intentionally simple and auditable:

iints research train-controller \
  --data data_packs/processed/controller_teacher_dataset.csv \
  --output models/controller_imitation.json \
  --metrics-output models/controller_imitation_metrics.json

Use it in Python research code with:

from iints.core.algorithms.imitation_controller import ExperimentalImitationController

algorithm = ExperimentalImitationController(
    settings={"model_path": "models/controller_imitation.json"}
)

This controller is not presented as clinically validated. It is a baseline that proves the full local-AI research loop:

  1. collect or simulate data
  2. build a supervised dataset
  3. train a local policy
  4. run it behind the supervisor
  5. compare it against rule-based baselines
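
Step 4 of this loop, running a learned policy behind the supervisor, can be pictured as a deterministic wrapper that accepts, clamps, or rejects each proposed action. The sketch below is a hypothetical illustration for simulation work only; it is not the SDK's safety supervisor. The class name, the reason strings, and both limit values are placeholders invented for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SupervisorLimits:
    max_bolus_units: float = 2.0        # placeholder limit, not a recommendation
    suspend_below_mg_dl: float = 80.0   # placeholder limit, not a recommendation


def supervise(proposed_units, glucose_mg_dl, limits=SupervisorLimits()):
    """Return (approved_units, reason) for one proposed action in a simulation."""
    # Reject NaN, infinite, or negative proposals outright.
    if not math.isfinite(proposed_units) or proposed_units < 0:
        return 0.0, "invalid_proposal"
    # Suspend delivery when simulated glucose is already low.
    if glucose_mg_dl < limits.suspend_below_mg_dl:
        return 0.0, "low_glucose_suspend"
    # Clamp oversized proposals to the configured ceiling.
    if proposed_units > limits.max_bolus_units:
        return limits.max_bolus_units, "clamped_to_max"
    return proposed_units, "accepted"


# Count how often the wrapper had to intervene on a few made-up proposals.
proposals = [(1.0, 150.0), (5.0, 150.0), (1.0, 75.0), (float("nan"), 150.0)]
results = [supervise(units, glucose) for units, glucose in proposals]
interventions = sum(1 for _, reason in results if reason != "accepted")
print(results)
print("interventions:", interventions)
```

Counting interventions per run gives a direct measure of how often the learned policy needed to be overridden.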

When you need a stronger local model after that auditable baseline, train the PyTorch policy:

iints research train-neural-controller \
  --data data_packs/processed/controller_teacher_dataset.csv \
  --output models/controller_neural.pt \
  --metrics-output models/controller_neural_metrics.json

Then require held-out closed-loop evidence before treating it as a serious research candidate:

iints research evaluate-controller \
  --model models/controller_neural.pt \
  --model-kind neural \
  --output-dir results/controller_neural_eval

That report compares the learned controller with ClinicalBaselineAlgorithm on unseen presets such as hypo_prone_night, hyper_challenge, pizza_paradox, and midnight_crash.
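
A comparison of this kind usually reduces each simulated glucose trace to a few range statistics. The sketch below shows one plausible way to compute them. It is not the format of the SDK's evaluation report; the 70 to 180 mg/dL target range is the commonly used convention, assumed here, and both traces are made up.

```python
def glycemic_summary(trace_mg_dl, low=70.0, high=180.0):
    """Return the fraction of samples in, below, and above the target range."""
    n = len(trace_mg_dl)
    below = sum(1 for g in trace_mg_dl if g < low)
    above = sum(1 for g in trace_mg_dl if g > high)
    return {
        "time_in_range": (n - below - above) / n,
        "time_below_range": below / n,
        "time_above_range": above / n,
    }


def no_worse_on_hypo(candidate, baseline):
    """A learned controller should never trade more time below range for better averages."""
    return candidate["time_below_range"] <= baseline["time_below_range"]


# Made-up traces standing in for one held-out preset.
baseline_trace = [65.0, 90.0, 140.0, 190.0, 150.0]
learned_trace = [75.0, 95.0, 150.0, 185.0, 160.0]

baseline_summary = glycemic_summary(baseline_trace)
learned_summary = glycemic_summary(learned_trace)
print(baseline_summary)
print(learned_summary)
print("no worse on hypo:", no_worse_on_hypo(learned_summary, baseline_summary))
```

Checking time below range separately from time in range keeps a controller from looking better overall while causing more lows.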

5. One-Step Jetson Research Finalization

After a completed endurance run you can close the whole post-run loop in one command:

iints jetson endurance finalize-research \
  --output-dir results/jetson_research_day

It will:

  1. train the auditable linear imitation baseline
  2. train the stronger PyTorch controller
  3. train the glucose predictor when the exported dataset is large enough
  4. run held-out closed-loop evaluation against the clinical baseline
  5. write research/RESEARCH_PIPELINE_REPORT.md

If you want the same work to happen automatically at the end of the endurance command, add:

--finalize-research

to iints jetson endurance start.

6. Multi-Run Local AI Lab

For actual AI research, one run should not be the whole story. The SDK now has a higher-level lab command that combines multiple completed Jetson or simulator bundles into one training workspace:

iints research train-local-ai \
  --run day1=results/jetson_research_day \
  --run day2=results/jetson_research_day_2 \
  --output-dir results/local_ai_lab

It writes:

results/local_ai_lab/
  datasets/
    predictor_training.csv
    predictor_dataset_manifest.json
    controller_teacher_dataset.csv
    controller_dataset_manifest.json
    LOCAL_AI_DATASET_CARD.json
  models/
    linear_controller.json
    neural_controller.pt
    predictor/
  evaluation/
    CONTROL_EVALUATION_REPORT.md
  LOCAL_AI_RESEARCH_SUMMARY.json
  LOCAL_AI_RESEARCH_REPORT.md

This is the clearest workflow when you want to use long Jetson runs as training data:

  1. collect one or more 24h/7d research bundles
  2. merge predictor rows and controller-teacher rows with source labels
  3. train an auditable linear local controller
  4. optionally train the PyTorch neural controller
  5. optionally train the glucose predictor
  6. evaluate learned controllers on held-out scenarios

For a first smoke test without heavy PyTorch/predictor work:

iints research train-local-ai \
  --run day1=results/jetson_research_day \
  --output-dir results/local_ai_lab_smoke \
  --skip-predictor \
  --skip-neural \
  --skip-evaluation

The important separation is:

  • predictor_training.csv trains a glucose forecasting model
  • controller_teacher_dataset.csv trains a research controller policy
  • LOCAL_AI_DATASET_CARD.json records lineage, row counts, sources, and research-only limits
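
To show what recording lineage and row counts can look like, here is a minimal sketch that builds a card from in-memory file contents. The real schema of `LOCAL_AI_DATASET_CARD.json` is not shown on this page, so every field name below is an assumption, and the CSV contents are toy data.

```python
import hashlib
import json


def build_dataset_card(files, sources, research_only=True):
    """Build a small lineage record: one entry per file with a hash and a row count."""
    entries = []
    for name, payload in files.items():
        lines = payload.decode("utf-8").splitlines()
        entries.append({
            "file": name,
            "sha256": hashlib.sha256(payload).hexdigest(),
            "rows": max(len(lines) - 1, 0),  # subtract the header line
        })
    return {"research_only": research_only, "sources": sorted(sources), "files": entries}


# Toy file contents; column names are placeholders.
files = {
    "predictor_training.csv": b"subject_id,glucose\ns1,110\ns1,105\n",
    "controller_teacher_dataset.csv": b"subject_id,action\ns1,0.5\n",
}

card = build_dataset_card(files, sources=["day2", "day1"])
print(json.dumps(card, indent=2))
```

Storing a content hash next to each row count lets a reviewer confirm that a model was trained on exactly the files the card describes.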

The older command name iints research local-ai-lab remains available, but train-local-ai is the clearest command for new users.

7. Publish The Research Evidence

After training, attach the run folders and local AI lab to one evidence bundle:

iints evidence build \
  --run day1=results/jetson_research_day \
  --run day2=results/jetson_research_day_2 \
  --local-ai-dir results/local_ai_lab \
  --output-dir results/local_ai_evidence

This writes:

  • README.md with run-level glucose metrics
  • MODEL_CARD.md with research-only use, non-use, and safety gate status
  • evidence_summary.json for machine-readable review
  • run_index.csv for quick comparison tables

8. What Good Research Looks Like

Before making any strong claim, require all of the following:

  • subject/source-aware splits
  • external-data evaluation for predictors
  • calibration and hypo-detection analysis
  • held-out scenario evaluation for controllers
  • comparison against deterministic baselines
  • safety-supervisor intervention counts
  • exact run manifests and dataset manifests
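
The first item, subject/source-aware splits, means every row from a given person lands on the same side of the split. One simple way to get that, sketched below under the assumption that rows carry a namespaced `subject_id`, is to hash the ID into a stable bucket. This is an illustration, not the SDK's split logic, and the rows are synthetic.

```python
import hashlib


def assign_split(subject_id, test_fraction=0.2):
    """Deterministically map a subject ID to 'train' or 'test' via a stable hash."""
    digest = hashlib.sha256(subject_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform value in [0, 1)
    return "test" if bucket < test_fraction else "train"


def split_rows(rows, id_key="subject_id", test_fraction=0.2):
    """Split rows so that all samples of one subject stay together."""
    train, test = [], []
    for row in rows:
        target = test if assign_split(row[id_key], test_fraction) == "test" else train
        target.append(row)
    return train, test


# Synthetic rows: 10 subjects with 5 samples each.
rows = [{"subject_id": f"azt1d:{i % 10}", "step": i} for i in range(50)]
train_rows, test_rows = split_rows(rows)

train_subjects = {row["subject_id"] for row in train_rows}
test_subjects = {row["subject_id"] for row in test_rows}
print("subjects in both splits:", train_subjects & test_subjects)
```

Because the assignment depends only on the ID, the split stays the same across machines and reruns, which makes manifests easier to compare. With few subjects the realized test fraction can differ noticeably from the requested one.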

The SDK now gives you both the model layer and the first evidence layer:

  • auditable linear imitation baseline
  • stronger PyTorch neural controller
  • held-out closed-loop controller evaluation
  • automatic post-run Jetson research finalization

The next scientific step after that is not "make the model bigger"; it is stricter promotion gates, more held-out patients, and external real-data validation for every predictor claim.
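
A promotion gate can be as simple as a table of named limits that a candidate model must satisfy before it advances. The sketch below is hypothetical: the metric names, the gate directions, and every threshold are placeholders chosen for illustration, not values defined by the SDK or recommended for any study.

```python
def check_promotion(metrics, gates):
    """Return (passed, failures) for a metrics dict checked against gate rules.

    Each gate is (direction, limit): "max" means the metric must not exceed
    the limit, "min" means it must not fall below it. A missing metric fails.
    """
    failures = []
    for name, (direction, limit) in gates.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: metric missing")
        elif direction == "max" and value > limit:
            failures.append(f"{name}: {value} > {limit}")
        elif direction == "min" and value < limit:
            failures.append(f"{name}: {value} < {limit}")
    return not failures, failures


# Placeholder gates and a made-up candidate report.
GATES = {
    "missed_hypo_rate": ("max", 0.05),
    "rmse_ratio_vs_baseline": ("max", 1.0),
    "held_out_subjects": ("min", 5),
}
candidate = {"missed_hypo_rate": 0.12, "rmse_ratio_vs_baseline": 0.8}

passed, failures = check_promotion(candidate, GATES)
print("promoted:", passed)
for failure in failures:
    print(" -", failure)
```

Treating a missing metric as a failure means a model cannot pass simply because an evaluation step was skipped.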