IINTS Glucose Forecast Model

The IINTS glucose model workflow builds a dedicated, research-only model that reads glucose time-series context and forecasts future glucose trends.

It is designed for:

  • CGM trend forecasting
  • 30/60/120 minute glucose prediction experiments
  • hypoglycemia and hyperglycemia risk research
  • uncertainty and calibration studies
  • local AI experiments using OhioT1DM, AZT1D, HUPA-UCM, simulator exports, and Jetson endurance runs
  • Hugging Face model packaging without publishing private/raw dataset rows

It is not designed for:

  • real-world treatment decisions
  • insulin or glucagon dosing authority
  • replacing a deterministic safety supervisor
  • uploading gated/private patient data to public repositories

Mental Model

The workflow chains three commands, each producing the artifact that feeds the next:

prepared glucose datasets
        ↓
iints research glucose-model build-dataset
        ↓
normalized training pack + manifest + config
        ↓
iints research glucose-model train
        ↓
predictor.pt + training_report.json
        ↓
iints research glucose-model export-hf
        ↓
Hugging Face-ready model folder

The dedicated model line is called:

iints-glucose-forecast-v0

This is intentionally a numeric time-series model, not a language model. LLMs can explain runs, summarize results, or manage research artifacts, but the glucose forecaster itself should be trained and evaluated as a physiological time-series predictor.

The default training profile is PINN-first: the generated config uses loss: pinn, which combines forecast error with penalties for impossible glucose bounds, unrealistic rate-of-change, and suspicious IOB/COB logic.
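As a rough illustration of how such a combined loss can be assembled, the sketch below adds soft penalties to a plain mean squared error for a single forecast window. It is not the SDK's implementation: the function name, the 4 mg/dL/min rate limit, the 5-minute step, and the weights are assumptions made for this example, and the IOB/COB penalty is left out for brevity. Only the 20 and 600 mg/dL bounds come from the comparison gate described later on this page.

```python
# Illustrative constants; real values would come from the generated config.
GLUCOSE_MIN_MGDL = 20.0
GLUCOSE_MAX_MGDL = 600.0
MAX_RATE_MGDL_PER_MIN = 4.0  # assumed plausibility limit
STEP_MINUTES = 5.0           # assumed CGM sampling cadence


def pinn_style_loss(pred, target, last_observed, w_bounds=1.0, w_rate=1.0):
    """Forecast MSE plus soft physiological penalties for one window.

    pred, target: lists of glucose values in mg/dL, one per forecast step.
    last_observed: latest measured glucose before the forecast starts.
    """
    n = len(pred)
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / n

    # Penalize only the amount by which a prediction leaves the plausible range.
    bounds_penalty = sum(
        max(GLUCOSE_MIN_MGDL - p, 0.0) ** 2 + max(p - GLUCOSE_MAX_MGDL, 0.0) ** 2
        for p in pred
    ) / n

    # Rate of change between consecutive steps, starting at the last observation.
    path = [last_observed] + list(pred)
    rates = [(b - a) / STEP_MINUTES for a, b in zip(path, path[1:])]
    rate_penalty = sum(
        max(abs(r) - MAX_RATE_MGDL_PER_MIN, 0.0) ** 2 for r in rates
    ) / n

    return mse + w_bounds * bounds_penalty + w_rate * rate_penalty
```

A forecast that matches the target exactly can still receive a non-zero loss if it jumps faster than the assumed rate limit, which is the point of the extra terms.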

Build A Training Pack

Use one or more prepared datasets. Keep full/private OhioT1DM data outside git.

export OHIO_T1DM_ROOT="/path/to/OhioT1DM-volledig"

PYTHONPATH=src python3 research/prepare_ohio_t1dm.py \
  --input "$OHIO_T1DM_ROOT" \
  --splits train \
  --output data_packs/public/ohio_t1dm_full/processed/ohio_train.csv \
  --report data_packs/public/ohio_t1dm_full/processed/ohio_train_quality_report.json

Then normalize the training sources into the glucose-model contract:

iints research glucose-model build-dataset \
  --input data_packs/public/ohio_t1dm_full/processed/ohio_train.csv \
  --input results/realism_learning_10k/research/predictor_training.csv \
  --labels ohio_full,sim_10k \
  --profile long \
  --history-minutes 360 \
  --horizon-minutes 120 \
  --output-dir models/iints-glucose-forecast-v0/dataset

This writes:

models/iints-glucose-forecast-v0/dataset/
├── glucose_training_dataset.csv
├── glucose_dataset_manifest.json
├── glucose_model_config.yaml
└── MODEL_INTENT.md
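To make the --history-minutes and --horizon-minutes values concrete, the sketch below converts them into step counts and cuts a series into (history, future) pairs. It assumes an evenly sampled 5-minute CGM cadence, which this page does not state, and the helper names are hypothetical rather than part of the iints CLI.

```python
def window_steps(history_minutes, horizon_minutes, step_minutes=5):
    """Convert window lengths in minutes to sample counts (assumed 5-minute cadence)."""
    if history_minutes % step_minutes or horizon_minutes % step_minutes:
        raise ValueError("window lengths must be multiples of the sampling step")
    return history_minutes // step_minutes, horizon_minutes // step_minutes


def sliding_windows(series, history_steps, horizon_steps):
    """Yield (history, future) pairs from one subject's evenly sampled series."""
    total = history_steps + horizon_steps
    for start in range(len(series) - total + 1):
        history = series[start:start + history_steps]
        future = series[start + history_steps:start + total]
        yield history, future


# With the flags used above: 360 minutes of history, 120 minutes of horizon.
history_steps, horizon_steps = window_steps(360, 120)
```

Under that cadence assumption, each training example spans 72 history samples and 24 forecast samples, so a subject needs at least 96 consecutive readings to contribute a single window.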

OhioT1DM On Jetson Without Pushing Raw Data

Do not commit the full raw OhioT1DM folder to GitHub. Keep it on a local SSD, Jetson disk, or another access-controlled storage location. The repository contains the preparation code and training commands, not the gated/raw dataset.

If the raw folder is available on the Jetson, prepare it locally:

iints research prepare-ohio \
  --input-dir /path/to/OhioT1DM-volledig \
  --splits train \
  --output data_packs/public/ohio_t1dm_full/processed/ohio_train.csv \
  --report data_packs/public/ohio_t1dm_full/processed/ohio_train_quality_report.json

Then build the normalized glucose-model dataset:

iints research glucose-model build-dataset \
  --input data_packs/public/ohio_t1dm_full/processed/ohio_train.csv \
  --labels ohio_full \
  --profile long \
  --history-minutes 360 \
  --horizon-minutes 120 \
  --output-dir models/iints-glucose-forecast-v0/dataset

The generated processed files live under gitignored folders (data_packs/ and models/), so they stay available to the Jetson for training without being published to GitHub by accident.

Train The Model

For a quick smoke test:

iints research glucose-model train \
  --data models/iints-glucose-forecast-v0/dataset/glucose_training_dataset.csv \
  --config models/iints-glucose-forecast-v0/dataset/glucose_model_config.yaml \
  --output-dir models/iints-glucose-forecast-v0 \
  --epochs 2

For a full-length local run:

iints research glucose-model train \
  --data models/iints-glucose-forecast-v0/dataset/glucose_training_dataset.csv \
  --config models/iints-glucose-forecast-v0/dataset/glucose_model_config.yaml \
  --output-dir models/iints-glucose-forecast-v0 \
  --epochs 220 \
  --batch-size 256 \
  --export-hf

The training output includes:

models/iints-glucose-forecast-v0/
├── predictor.pt
├── training_report.json
├── glucose_model_config.resolved.yaml
└── huggingface/

Evaluate Against Held-Out Data

Use external data as a separate benchmark whenever possible:

PYTHONPATH=src python3 research/evaluate_predictor.py \
  --data data_packs/public/ohio_t1dm_full/processed/ohio_test.csv \
  --model models/iints-glucose-forecast-v0/predictor.pt \
  --config models/iints-glucose-forecast-v0/glucose_model_config.resolved.yaml \
  --reference-data models/iints-glucose-forecast-v0/dataset/glucose_training_dataset.csv \
  --out results/iints_glucose_forecast_v0_eval.json \
  --mc-samples 30

Minimum metrics to inspect:

  • MAE and RMSE by forecast horizon
  • band-wise error for hypo, target, and hyper ranges
  • missed hypoglycemia rate
  • false hypoglycemia alarm rate
  • uncertainty calibration if MC dropout is used
  • subject-level split and leakage audit
  • external dataset performance, not just internal validation
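Several of the metrics above can be computed directly from paired true and predicted values. The sketch below is one possible definition, not the output format of evaluate_predictor.py: the 70 mg/dL hypoglycemia threshold and the choice to normalize false alarms by the number of non-hypo samples are assumptions made for illustration.

```python
import math

HYPO_THRESHOLD_MGDL = 70.0  # assumed threshold for this sketch


def forecast_metrics(y_true, y_pred, hypo_threshold=HYPO_THRESHOLD_MGDL):
    """Return MAE, RMSE, and hypoglycemia detection rates for one horizon."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

    true_hypo = [t < hypo_threshold for t in y_true]
    pred_hypo = [p < hypo_threshold for p in y_pred]
    n_hypo = sum(true_hypo)
    n_non_hypo = len(true_hypo) - n_hypo

    # Missed: the truth was hypo but the forecast stayed above the threshold.
    missed = sum(t and not p for t, p in zip(true_hypo, pred_hypo))
    # False alarm: the forecast went below the threshold but the truth did not.
    false_alarms = sum(p and not t for t, p in zip(true_hypo, pred_hypo))

    return {
        "mae": mae,
        "rmse": rmse,
        "missed_hypo_rate": missed / n_hypo if n_hypo else None,
        "false_hypo_alarm_rate": false_alarms / n_non_hypo if n_non_hypo else None,
    }
```

Running it once per forecast horizon (for example 30, 60, and 120 minutes) gives the per-horizon view the first bullet asks for.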

Compare MSE, Band-Weighted, And PINN Models

After training multiple candidates, compare them with one command:

iints research glucose-model compare \
  --data data_packs/public/ohio_t1dm_full/processed/ohio_test.csv \
  --config models/iints-glucose-forecast-v0/dataset/glucose_model_config.yaml \
  --model mse=models/glucose_mse/predictor.pt \
  --model band=models/glucose_band/predictor.pt \
  --model pinn=models/iints-glucose-forecast-v0/predictor.pt \
  --mc-samples 30 \
  --output-dir results/glucose_model_comparison

This writes:

results/glucose_model_comparison/
├── comparison_report.json
├── comparison_report.md
├── horizon_metrics.csv
├── physiological_violation_metrics.csv
├── hypo_detection_metrics.csv
├── model_card_metrics.json
└── figures/

Do not promote a model just because MAE improved. A useful diabetes model must also reduce missed hypoglycemia, avoid overconfident uncertainty, and reduce physiologically impossible predictions.

The comparison gate checks:

  • MAE, RMSE, bias, and within-range error
  • per-horizon metrics across the forecast window
  • missed hypoglycemia rate and false hypo alarms
  • impossible glucose predictions below 20 mg/dL or above 600 mg/dL
  • unrealistic predicted rate-of-change
  • suspicious rise with high IOB and no COB
  • suspicious drop with COB and no IOB

For the scientific reasoning behind these outputs, see Interpreting Glucose Forecast Results. That page explains why the lowest MSE is not automatically the best research model, why PINN can be preferable, and why long-horizon forecasts are harder.

Export For Hugging Face

After training:

iints research glucose-model export-hf \
  --model-dir models/iints-glucose-forecast-v0 \
  --dataset-manifest models/iints-glucose-forecast-v0/dataset/glucose_dataset_manifest.json \
  --comparison-dir results/glucose_model_comparison \
  --repo-id IINTS/iints-glucose-forecast-v0 \
  --output-dir models/iints-glucose-forecast-v0/huggingface

The export folder contains:

huggingface/
├── README.md
├── PUBLISHING.md
├── privacy.md
├── limitations.md
├── config.json
├── glucose_model_config.yaml
├── predictor.pt
├── training_report.json
├── dataset_manifest.public.json
├── comparison_report.md
├── comparison_interpretation.md
├── comparison_report.json
├── horizon_metrics.csv
├── physiological_violation_metrics.csv
├── hypo_detection_metrics.csv
├── model_card_metrics.json
└── examples/
    ├── inference_example.py
    └── sample_glucose_trace.csv

The public manifest redacts local source paths and raw file hashes. This is important for gated datasets such as OhioT1DM. The comparison files are optional, but strongly recommended before publishing because they show why the model is judged by physiology-aware safety gates, not only by MAE.
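The redaction step can be thought of as a recursive filter over the manifest. The sketch below shows the idea only; the key names source_path and raw_sha256 are hypothetical placeholders, since this page does not list the actual manifest schema.

```python
# Hypothetical key names; the real manifest schema may differ.
SENSITIVE_KEYS = {"source_path", "raw_sha256"}


def redact_manifest(node):
    """Return a copy of a manifest with sensitive keys removed at any depth."""
    if isinstance(node, dict):
        return {
            key: redact_manifest(value)
            for key, value in node.items()
            if key not in SENSITIVE_KEYS
        }
    if isinstance(node, list):
        return [redact_manifest(item) for item in node]
    return node
```

Because the function builds new dicts and lists rather than editing in place, the private manifest on disk stays untouched while the public copy is written separately.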

Continue Training On Jetson From Hugging Face

If your model already exists on Hugging Face, use the Jetson as a conservative fine-tuning worker. The SDK downloads the current model, trains candidates with warm-start, compares the candidate against the current local champion, and only promotes the candidate when a physiology-aware composite score improves.
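A promotion rule of this kind can be sketched as a weighted score in which lower is better. The metric names and weights below are assumptions chosen for illustration; the SDK's actual composite score is not documented on this page.

```python
def composite_score(metrics, w_mae=1.0, w_missed_hypo=100.0, w_violation=100.0):
    """Combine error and safety metrics into one number (lower is better).

    metrics: dict with mae (mg/dL), missed_hypo_rate and violation_rate (0..1).
    The weights are illustrative, not the SDK's values.
    """
    return (
        w_mae * metrics["mae"]
        + w_missed_hypo * metrics["missed_hypo_rate"]
        + w_violation * metrics["violation_rate"]
    )


def should_promote(champion, candidate):
    """Promote only when the candidate strictly improves the composite score."""
    return composite_score(candidate) < composite_score(champion)


champion = {"mae": 20.0, "missed_hypo_rate": 0.10, "violation_rate": 0.01}
candidate = {"mae": 18.0, "missed_hypo_rate": 0.15, "violation_rate": 0.01}
promote = should_promote(champion, candidate)
```

In this example the candidate has the lower MAE but misses more hypoglycemia, so under the assumed weights it is not promoted.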

Login once on the Jetson:

HF_HOME="$PWD/.cache/huggingface" hf auth login

Run one safe smoke trial:

iints research glucose-model jetson-train-hf \
  --repo-id IINTS/iints-glucose-forecast-v0 \
  --dataset models/iints-glucose-forecast-v0/dataset/glucose_training_dataset.csv \
  --dataset-manifest models/iints-glucose-forecast-v0/dataset/glucose_dataset_manifest.json \
  --work-dir models/jetson_hf_training \
  --max-trials 1 \
  --epochs 2 \
  --batch-size 64 \
  --upload-mode none

If that succeeds, start a longer run:

nohup iints research glucose-model jetson-train-hf \
  --repo-id IINTS/iints-glucose-forecast-v0 \
  --dataset models/iints-glucose-forecast-v0/dataset/glucose_training_dataset.csv \
  --dataset-manifest models/iints-glucose-forecast-v0/dataset/glucose_dataset_manifest.json \
  --work-dir models/jetson_hf_training \
  --max-trials 0 \
  --epochs 8 \
  --batch-size 64 \
  --timeout-minutes 45 \
  --cooldown-seconds 20 \
  --upload-mode none \
  > jetson_hf_training.log 2>&1 &

Monitor progress:

tail -f jetson_hf_training.log
cat models/jetson_hf_training/jetson_hf_leaderboard.csv
ls models/jetson_hf_training/champion

When you are ready to send a candidate to Hugging Face, prefer a pull request:

iints research glucose-model jetson-train-hf \
  --repo-id IINTS/iints-glucose-forecast-v0 \
  --dataset models/iints-glucose-forecast-v0/dataset/glucose_training_dataset.csv \
  --work-dir models/jetson_hf_training \
  --max-trials 1 \
  --epochs 8 \
  --batch-size 64 \
  --upload-mode pr

This command does not upload raw OhioT1DM rows. It uploads the champion model bundle only when a candidate is promoted, together with the research-only model card, privacy notes, limitations, comparison artifacts, and example inference script.

Private-First Upload

Upload privately first:

cd models/iints-glucose-forecast-v0/huggingface
hf upload IINTS/iints-glucose-forecast-v0 . . --repo-type model --private

Before making it public, verify:

  • no raw OhioT1DM rows are included
  • no private local file paths are visible
  • the model card clearly says research-only and not for treatment
  • evaluation includes held-out subjects
  • limitations and uncertainty are documented
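Part of this checklist can be automated before the upload. The sketch below scans text files in the export folder for strings that look like absolute local paths. The regular expression and the list of file suffixes are assumptions for illustration and will not catch every leak, so a manual review is still needed.

```python
import re
from pathlib import Path

# Assumed patterns for local absolute paths; extend them for your own machines.
PATH_PATTERN = re.compile(r"(/home/|/Users/|/mnt/|[A-Za-z]:\\)")
TEXT_SUFFIXES = {".md", ".json", ".yaml", ".yml", ".csv", ".py", ".txt"}


def find_path_leaks(export_dir):
    """Return (relative file, line number) pairs that look like local paths."""
    root = Path(export_dir)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in TEXT_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATH_PATTERN.search(line):
                hits.append((str(path.relative_to(root)), lineno))
    return hits
```

Calling find_path_leaks("models/iints-glucose-forecast-v0/huggingface") and checking that the result is empty is a cheap gate to run before switching the repository from private to public.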

Feature Contract

The v0 feature contract includes:

glucose_actual_mgdl
glucose_trend_mgdl_min
patient_iob_units
patient_cob_grams
delivered_insulin_units
carb_intake_grams
effective_isf
effective_icr
effective_basal_rate_u_per_hr
exercise_intensity
stress_intensity
steps
heart_rate
time_of_day_sin
time_of_day_cos
glucagon_mg
haaf_memory

Only glucose_actual_mgdl is required. Any other feature that is missing from a source dataset is filled with a conservative default during dataset preparation.
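The sketch below illustrates how a row could be normalized into this contract. It is not the SDK's code: the zero defaults, the subset of features shown, and the sine/cosine formula over the 1440 minutes of a day are assumptions, since this page does not specify them.

```python
import math
from datetime import datetime

# Hypothetical defaults for a few optional features; real values may differ.
OPTIONAL_DEFAULTS = {
    "patient_iob_units": 0.0,
    "patient_cob_grams": 0.0,
    "delivered_insulin_units": 0.0,
    "carb_intake_grams": 0.0,
    "exercise_intensity": 0.0,
    "stress_intensity": 0.0,
}


def time_of_day_features(ts):
    """Encode clock time on a circle so 23:55 and 00:00 end up close together."""
    minutes = ts.hour * 60 + ts.minute
    angle = 2.0 * math.pi * minutes / 1440.0
    return math.sin(angle), math.cos(angle)


def normalize_row(row, ts):
    """Fill missing optional features and add the time-of-day encoding."""
    if row.get("glucose_actual_mgdl") is None:
        raise ValueError("glucose_actual_mgdl is required")
    out = dict(OPTIONAL_DEFAULTS)
    # Keep every value the source provided; None counts as missing.
    out.update({k: v for k, v in row.items() if v is not None})
    out["time_of_day_sin"], out["time_of_day_cos"] = time_of_day_features(ts)
    return out
```

A cyclic encoding is a common choice for clock time because it avoids the artificial jump between the last sample of one day and the first sample of the next.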

Research Boundary

This model should be treated as a forecast signal only. In IINTS controller experiments, the deterministic supervisor remains the final authority. The model can inform analysis and simulation, but it must never bypass safety constraints.