AI Assistant Guide

This guide explains how the local open-weight Ministral 3 AI layer works inside IINTS-AF.

Scope

  • Research use only.
  • Not a medical device.
  • No clinical dosing advice.
  • AI output is blocked unless MDMP verification succeeds first.

What The AI Layer Does

The local AI assistant is designed for four narrow tasks:

  • explain: explain a single simulation step in plain language
  • trends: summarize glucose-oriented trends from a payload
  • anomalies: call out unusual or safety-relevant patterns
  • report: generate a short markdown run summary

The assistant is intentionally conservative. It explains simulation behavior or imported glucose data patterns; it does not produce treatment advice.

Architecture

The flow is:

  1. iints ai ... loads a JSON payload from disk.
  2. MDMPGuard verifies the signed MDMP artifact and enforces the minimum grade.
  3. IINTSAssistant selects the backend.
  4. OllamaBackend checks that Ollama is reachable and that a local open Ministral 3 tag is installed.
  5. The prompt is built from a fixed system instruction plus a serialized payload.
  6. The response is wrapped with a hard-coded research-only disclaimer before output is shown or saved.
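In code, this ordering is the important part. Below is a minimal Python sketch of the flow; apart from MDMPGuard, IINTSAssistant, and OllamaBackend, every name in it (verify, generate, the two constants) is an assumption for illustration, not the SDK's actual API.

import json
from pathlib import Path

# Step 1: load the JSON payload from disk.
payload = json.loads(Path("results/step.json").read_text())

# Step 2: the MDMP gate runs before any model call and raises on failure.
guard = MDMPGuard(minimum_grade="research_grade")
guard.verify("results/report.signed.mdmp")

# Steps 3-4: backend selection; the Ollama backend checks reachability and the local tag.
assistant = IINTSAssistant(backend=OllamaBackend())

# Step 5: fixed system instruction plus the serialized payload.
prompt = SYSTEM_INSTRUCTION + "\n" + json.dumps(payload)
answer = assistant.generate(prompt)

# Step 6: the research-only disclaimer is wrapped onto the output, not prompted.
print(RESEARCH_ONLY_DISCLAIMER + "\n\n" + answer)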

Local Backend Behavior

The SDK defaults to local inference through Ollama.

Default model:

ministral-3:8b

Supported convenience aliases include:

  • ministral
  • ministral-3
  • ministral-3:8b
  • ministral-8b
  • ministral-8b-instruct

If one of these aliases is used, IINTS resolves it to the installed local Ollama tag before generation.
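Resolution can be pictured as a lookup against Ollama's local tag list. A minimal sketch, using only Ollama's public GET /api/tags endpoint; the resolve_alias helper and its preference rule are hypothetical, not the SDK's actual function.

import json
import urllib.request

ALIASES = {"ministral", "ministral-3", "ministral-3:8b", "ministral-8b", "ministral-8b-instruct"}

def resolve_alias(name, host="http://127.0.0.1:11434"):
    # /api/tags lists the locally installed model tags.
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        installed = [m["name"] for m in json.load(resp)["models"]]
    if name in installed:
        return name  # already an exact installed tag
    if name in ALIASES:
        # Prefer any installed ministral-3 tag, e.g. ministral-3:8b.
        for tag in installed:
            if tag.startswith("ministral-3"):
                return tag
    raise ValueError(f"{name} is not installed; run: ollama pull ministral-3:8b")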

Always work from an active virtual environment:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -e ".[full,mdmp]"

Clean Ollama Setup (Small Version)

This is the shortest reliable setup if you want the local AI layer working end to end.

1. Install the SDK

Released SDK:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -U "iints-sdk-python35[full,mdmp]"

Source checkout:

cd /path/to/IINTS-SDK
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -U -e ".[full,mdmp]"

2. Install Ollama

On macOS/Linux, the quick official path is:

curl -fsSL https://ollama.com/install.sh | sh
ollama -v

If you are on Windows, install Ollama first via the official installer and then reopen your terminal.

3. Start Ollama

ollama serve

If Ollama is already running as a service, you do not need to start it manually again.
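If you are not sure whether a service instance is already up, Ollama's public GET /api/version endpoint is a cheap probe; for example:

import json
import urllib.request

try:
    with urllib.request.urlopen("http://127.0.0.1:11434/api/version", timeout=3) as resp:
        print("Ollama is running, version", json.load(resp)["version"])
except OSError:
    print("Ollama is not reachable; start it with: ollama serve")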

4. Pull a local model

Balanced default:

ollama pull ministral-3:8b

Smaller fallback:

ollama pull ministral-3:3b

5. Point the SDK at Ollama

By default, the SDK looks for Ollama here:

http://127.0.0.1:11434

So on a normal single-machine setup, there is nothing extra to configure.

If you want to set it explicitly:

export OLLAMA_HOST=http://127.0.0.1:11434

If you want to override it for one command only:

iints ai local-check \
  --model ministral-3:8b \
  --ollama-host http://127.0.0.1:11434

Important notes:

  • OLLAMA_HOST is the normal way to point the SDK at a non-default local Ollama endpoint.
  • Remote Ollama endpoints are blocked by default. Only enable them deliberately.
  • If you truly want to connect to a non-loopback Ollama host, you must also set IINTS_ALLOW_REMOTE_OLLAMA=1; a sketch of this check follows below.
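That loopback rule can be pictured as a small gate on the configured host. A minimal sketch, assuming nothing about the SDK beyond the two environment variables named above; the function and its exact behavior are illustrative.

import os
from urllib.parse import urlparse

LOOPBACK = {"127.0.0.1", "localhost", "::1"}

def check_ollama_host(url):
    # Refuse non-loopback hosts unless remote access was enabled deliberately.
    host = urlparse(url).hostname
    if host in LOOPBACK or os.environ.get("IINTS_ALLOW_REMOTE_OLLAMA") == "1":
        return url
    raise RuntimeError(f"Refusing remote Ollama host {host}; set IINTS_ALLOW_REMOTE_OLLAMA=1 to allow it")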

6. Verify the full chain

Install and check the local model:

ollama pull ministral-3:8b
iints ai local-check --model ministral-3:8b

If the model is missing, the command fails with the exact ollama pull ... command to run next. If your Ollama runtime is too old for the open Ministral 3 line, local-check tells you that as well; the current Ollama listing for ministral-3 expects Ollama 0.13.1 or newer. local-check also runs a tiny generation smoke-test by default, so it catches the common case where /api/tags works but the model crashes during real inference.
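You can reproduce the runtime-version check by hand against Ollama's public GET /api/version endpoint. A minimal sketch using the 0.13.1 floor mentioned above; the comparison logic is illustrative and assumes a plain X.Y.Z version string.

import json
import urllib.request

MIN_VERSION = (0, 13, 1)  # floor for the open Ministral 3 line, per the Ollama listing

with urllib.request.urlopen("http://127.0.0.1:11434/api/version", timeout=3) as resp:
    version = json.load(resp)["version"]

if tuple(int(p) for p in version.split(".")[:3]) < MIN_VERSION:
    print(f"Ollama {version} is too old for ministral-3; upgrade to 0.13.1 or newer")
else:
    print(f"Ollama {version} meets the ministral-3 minimum")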

Hardware Recommendations

Use this as a practical starting point:

Model             Good Fit                                                    Recommended System RAM   Recommended GPU VRAM   Approx Download
ministral-3:3b    smaller laptops, CPU-first setups, entry-level edge boxes   16 GB                    6 GB                   ~3 GB
ministral-3:8b    balanced desktop or strong laptop                           24 GB                    10 GB                  ~6 GB
ministral-3:14b   high-end workstation                                        32 GB                    16 GB                  ~10 GB

General advice:

  • Start with ministral-3:8b unless you have a specific reason to go smaller or larger.
  • Choose ministral-3:3b if latency and memory matter more than answer quality.
  • Choose ministral-3:14b only if your machine can comfortably absorb the extra RAM and latency.
  • Run iints ai models in the CLI to see the same recommendations in a terminal-friendly table.
  • If ministral-3:8b closes the connection during generation, try ministral-3:3b first before assuming something is wrong with the SDK.

Preparing Payloads

After a run completes, prepare the run directory once:

iints ai prepare results/<run_id>

That command creates:

  • ai/report_payload.json
  • ai/anomalies_payload.json
  • ai/trends_payload.json
  • ai/step_riskiest.json
  • ai/step_latest.json
  • ai/report.signed.mdmp plus ai/keys/ when the MDMP extra is installed
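These payloads are ordinary JSON files, so you can sanity-check one before handing it to the model. A quick look that assumes nothing about the payload schema beyond it being valid JSON:

import json
from pathlib import Path

# Replace <run_id> with your actual run directory.
payload = json.loads(Path("results/<run_id>/ai/report_payload.json").read_text())
if isinstance(payload, dict):
    print("top-level keys:", sorted(payload))
else:
    print("payload is a", type(payload).__name__, "with", len(payload), "items")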

After that, you can point the AI commands directly at the run directory:

iints ai explain results/<run_id>
iints ai trends results/<run_id>
iints ai anomalies results/<run_id>
iints ai report results/<run_id> --output results/<run_id>/ai/ai_report.md
iints ai review results/<run_id>

For imported CareLink data, generate a personal workspace first:

iints carelink-workbench \
  --input-csv "/path/to/CareLink export.csv" \
  --output-dir results/personal_carelink

That creates:

  • carelink_dashboard.png
  • carelink_poster.png
  • carelink_dashboard.html
  • carelink_timeline.csv
  • carelink_metrics.json
  • ai/report_payload.json
  • ai/review_payload.json
  • ai/trends_payload.json
  • ai/anomalies_payload.json
  • ai/step_riskiest.json
  • ai/report.signed.mdmp when the MDMP extra is installed

After that, the same AI commands work directly on the CareLink workspace directory:

iints ai report results/personal_carelink --model ministral-3:3b
iints ai review results/personal_carelink --model ministral-3:3b
iints ai trends results/personal_carelink --model ministral-3:3b
iints ai explain results/personal_carelink --model ministral-3:3b

Digital Patient Review

The Raspberry Pi live runtime also plugs into the same AI layer.

After a live patient session has written its bundle under patient_runtime/live_bundle/, run:

iints patient review \
  --workspace patient_runtime \
  --model ministral-3:3b

That automatically prepares the runtime bundle, checks the MDMP gate, and writes:

  • patient_runtime/live_bundle/ai/realism_review.md

This is useful for expo demos where you want the Pi to both simulate and critique the realism of the current run.

Generation Commands

Prepared run directory mode:

iints ai explain results/<run_id>
iints ai trends results/<run_id>
iints ai anomalies results/<run_id>
iints ai report results/<run_id> --output results/<run_id>/ai/ai_report.md
iints ai review results/<run_id>

Prepared CareLink workspace mode:

iints carelink-workbench \
  --input-csv "/path/to/CareLink export.csv" \
  --output-dir results/personal_carelink

iints ai report results/personal_carelink --model ministral-3:3b
iints ai review results/personal_carelink --model ministral-3:3b
iints ai trends results/personal_carelink --model ministral-3:3b
iints ai explain results/personal_carelink --model ministral-3:3b

Direct JSON mode:

iints ai explain results/step.json \
  --mdmp-cert results/report.signed.mdmp

iints ai trends results/glucose_payload.json \
  --mdmp-cert results/report.signed.mdmp

iints ai anomalies results/simulation_run.json \
  --mdmp-cert results/report.signed.mdmp

iints ai report results/simulation_run.json \
  --mdmp-cert results/report.signed.mdmp \
  --output results/ai_report.md

iints ai review results/simulation_run.json \
  --mdmp-cert results/report.signed.mdmp \
  --output results/realism_review.md

iints ai review writes a realism-focused critique. When you point it at a prepared run directory and do not pass --output, it automatically saves to results/<run_id>/ai/realism_review.md.

The review now asks the local model to always structure feedback as:

  • overall realism verdict
  • what looks realistic
  • what looks suspicious
  • priority fixes
  • what to improve next
  • follow-up validation checks

Useful options:

  • --mode local to require Ollama explicitly
  • --model ministral-3:8b or --model ministral
  • --model ministral-3:3b for lighter machines
  • --model ministral-3:14b for stronger workstations
  • --ollama-host http://127.0.0.1:11434 to override the endpoint
  • --timeout-seconds 120 for slower local hardware
  • --minimum-grade research_grade to raise or lower the MDMP floor

How Reliability Is Enforced

For local robustness, the SDK now runs five checks before a real generation call:

  • verifies that the Ollama HTTP endpoint is reachable
  • verifies that a compatible local open Ministral 3 model is installed
  • normalizes common local model aliases to the installed tag
  • truncates oversized JSON payloads before prompt construction, so large run artifacts do not overwhelm local inference (see the sketch after this list)
  • flags Ollama runtimes that do not meet the minimum version expected for the open Ministral 3 line
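The truncation step can be pictured as a simple character budget applied to the serialized payload before the prompt is built. A minimal sketch; the budget and marker text are assumptions, not the SDK's actual values.

def clip_payload(payload_json: str, max_chars: int = 20_000) -> str:
    # Keep prompts bounded so local inference is not overwhelmed.
    if len(payload_json) <= max_chars:
        return payload_json
    return payload_json[:max_chars] + "\n...[payload truncated for local inference]..."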

If a generation succeeds, the response records the actual resolved model name used by the local backend.

MDMP Guard Behavior

The AI assistant does not run on unsigned or insufficiently graded artifacts.

The guard enforces:

  • signed MDMP verification
  • minimum grade threshold
  • hard-coded disclaimer injection on every response

That means the research-only warning is not dependent on the prompt and cannot be removed by changing prompt text alone.
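In other words, the disclaimer lives in the response path, not in the prompt. A sketch of the idea; the wrapper and the disclaimer text here are illustrative, not the SDK's actual strings.

DISCLAIMER = "RESEARCH USE ONLY. Not a medical device. No clinical dosing advice."  # illustrative text

def guarded_generate(guard, backend, cert_path, prompt):
    guard.verify(cert_path)              # raises on unsigned or under-graded artifacts
    answer = backend.generate(prompt)
    return f"{DISCLAIMER}\n\n{answer}"   # appended after generation, outside the prompt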

Troubleshooting

Ollama Not Reachable

Run:

iints ai local-check --model ministral-3:8b

This now checks both basic reachability and a tiny real generation.

If the endpoint is wrong, retry with:

iints ai local-check --model ministral-3:8b --ollama-host http://127.0.0.1:11434
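You can also reproduce the tiny generation by hand against Ollama's public POST /api/generate endpoint; the one-word prompt here is just an example:

import json
import urllib.request

req = urllib.request.Request(
    "http://127.0.0.1:11434/api/generate",
    data=json.dumps({"model": "ministral-3:8b", "prompt": "ping", "stream": False}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=60) as resp:
    print(json.load(resp)["response"][:80])

If this fails while /api/tags succeeds, you are seeing exactly the case the smoke-test exists to catch.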

Model Missing

Pull the model shown in the error output:

ollama pull ministral-3:8b

Local Inference Is Slow

Increase the timeout:

iints ai report results/simulation_run.json \
  --mdmp-cert results/report.signed.mdmp \
  --timeout-seconds 180

If the server disconnects instead of timing out, the model may be too heavy for the machine at that moment. In that case try:

ollama pull ministral-3:3b
iints ai local-check --model ministral-3:3b
iints ai report results/<run_id> --model ministral-3:3b

Large Run Payloads

The assistant now clips oversized payloads automatically before sending them to the model. If you want tighter control, pass a smaller JSON summary rather than a full raw run dump.
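One way to build such a summary is to keep only the fields you actually want discussed. A sketch with hypothetical field names; adapt the key list to your own run schema.

import json
from pathlib import Path

run = json.loads(Path("results/simulation_run.json").read_text())

# Hypothetical keys; keep whatever your run format actually uses.
KEEP = ("run_id", "duration_minutes", "glucose_summary", "alarms")
summary = {k: run[k] for k in KEEP if k in run}

Path("results/run_summary.json").write_text(json.dumps(summary, indent=2))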

No report.signed.mdmp In My Run Folder

That is now expected for a fresh raw run. The easiest fix is:

iints ai prepare results/<run_id>

This creates a local development certificate for AI use when the MDMP extra is available, so you do not have to hand-build step.json and report.signed.mdmp yourself.

No such command 'ai'

If the CLI says No such command 'ai', the most common cause is a legacy iints package still installed alongside iints-sdk-python35. That older package can shadow the newer SDK command tree.

Run the install doctor:

iints-sdk-doctor

If it reports a package ownership conflict, repair the environment:

python -m pip uninstall -y iints iints-sdk-python35
python -m pip install -U "iints-sdk-python35[full,mdmp]"
hash -r

Then retry:

iints ai models
iints ai local-check --model ministral-3:8b