AI Assistant Guide

This guide explains how the local open-weight Ministral 3 AI layer works inside IINTS-AF.

Scope

  • Research use only.
  • Not a medical device.
  • No clinical dosing advice.
  • AI output is blocked unless MDMP verification succeeds first.

What The AI Layer Does

The local AI assistant is designed for four narrow tasks:

  • explain: explain a single simulation step in plain language
  • trends: summarize glucose-oriented trends from a payload
  • anomalies: call out unusual or safety-relevant patterns
  • report: generate a short markdown run summary

The assistant is intentionally conservative. It explains simulation behavior or imported glucose data patterns; it does not produce treatment advice.

Architecture

The flow is:

  1. iints ai ... loads a JSON payload from disk.
  2. MDMPGuard verifies the signed MDMP artifact and enforces the minimum grade.
  3. IINTSAssistant selects the backend.
  4. OllamaBackend checks that Ollama is reachable and that a local open Ministral 3 tag is installed.
  5. The prompt is built from a fixed system instruction plus a serialized payload.
  6. The response is wrapped with a hard-coded research-only disclaimer before output is shown or saved.
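In code, this ordering is the important part. Below is a minimal Python sketch of the flow; apart from MDMPGuard, IINTSAssistant, and OllamaBackend, every name in it (verify, generate, the two constants) is an assumption for illustration, not the SDK's actual API.

import json
from pathlib import Path

# Step 1: load the JSON payload from disk.
payload = json.loads(Path("results/step.json").read_text())

# Step 2: the MDMP gate runs before any model call and raises on failure.
guard = MDMPGuard(minimum_grade="research_grade")
guard.verify("results/report.signed.mdmp")

# Steps 3-4: backend selection; the Ollama backend checks reachability and the local tag.
assistant = IINTSAssistant(backend=OllamaBackend())

# Step 5: fixed system instruction plus the serialized payload.
prompt = SYSTEM_INSTRUCTION + "\n" + json.dumps(payload)
answer = assistant.generate(prompt)

# Step 6: the research-only disclaimer is wrapped onto the output, not prompted.
print(RESEARCH_ONLY_DISCLAIMER + "\n\n" + answer)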

Local Backend Behavior

The SDK defaults to local inference through Ollama.

Default model:

ministral-3:8b

Supported convenience aliases include:

  • ministral
  • ministral-3
  • ministral-3:8b
  • ministral-8b
  • ministral-8b-instruct

If one of these aliases is used, IINTS resolves it to the installed local Ollama tag before generation.
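Resolution can be pictured as a lookup against Ollama's local tag list. A minimal sketch, using only Ollama's public GET /api/tags endpoint; the resolve_alias helper and its preference rule are hypothetical, not the SDK's actual function.

import json
import urllib.request

ALIASES = {"ministral", "ministral-3", "ministral-3:8b", "ministral-8b", "ministral-8b-instruct"}

def resolve_alias(name, host="http://127.0.0.1:11434"):
    # /api/tags lists the locally installed model tags.
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        installed = [m["name"] for m in json.load(resp)["models"]]
    if name in installed:
        return name  # already an exact installed tag
    if name in ALIASES:
        # Prefer any installed ministral-3 tag, e.g. ministral-3:8b.
        for tag in installed:
            if tag.startswith("ministral-3"):
                return tag
    raise ValueError(f"{name} is not installed; run: ollama pull ministral-3:8b")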

Always work from an active virtual environment:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -e ".[full,mdmp]"

Clean Ollama Setup (Small Version)

This is the shortest reliable setup if you want the local AI layer working end to end.

1. Install the SDK

Released SDK:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -U "iints-sdk-python35[full,mdmp]"

Source checkout:

cd /path/to/IINTS-SDK
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -U -e ".[full,mdmp]"

2. Install Ollama

On macOS/Linux, the quick official path is:

curl -fsSL https://ollama.com/install.sh | sh
ollama -v

If you are on Windows, install Ollama first via the official installer and then reopen your terminal.

3. Start Ollama

ollama serve

If Ollama is already running as a service, you do not need to start it manually again.
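If you are not sure whether a service instance is already up, Ollama's public GET /api/version endpoint is a cheap probe; for example:

import json
import urllib.request

try:
    with urllib.request.urlopen("http://127.0.0.1:11434/api/version", timeout=3) as resp:
        print("Ollama is running, version", json.load(resp)["version"])
except OSError:
    print("Ollama is not reachable; start it with: ollama serve")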

4. Pull a local model

Balanced default:

ollama pull ministral-3:8b

Smaller fallback:

ollama pull ministral-3:3b

5. Point the SDK at Ollama

By default, the SDK looks for Ollama here:

http://127.0.0.1:11434

So on a normal single-machine setup, there is nothing extra to configure.

If you want to set it explicitly:

export OLLAMA_HOST=http://127.0.0.1:11434

If you want to override it for one command only:

iints ai local-check \
  --model ministral-3:8b \
  --ollama-host http://127.0.0.1:11434

Important notes:

  • OLLAMA_HOST is the normal way to point the SDK at a non-default local Ollama endpoint.
  • Remote Ollama endpoints are blocked by default. Only enable them deliberately.
  • If you truly want to connect to a non-loopback Ollama host, you must also set IINTS_ALLOW_REMOTE_OLLAMA=1; a sketch of this check follows below.
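That loopback rule can be pictured as a small gate on the configured host. A minimal sketch, assuming nothing about the SDK beyond the two environment variables named above; the function and its exact behavior are illustrative.

import os
from urllib.parse import urlparse

LOOPBACK = {"127.0.0.1", "localhost", "::1"}

def check_ollama_host(url):
    # Refuse non-loopback hosts unless remote access was enabled deliberately.
    host = urlparse(url).hostname
    if host in LOOPBACK or os.environ.get("IINTS_ALLOW_REMOTE_OLLAMA") == "1":
        return url
    raise RuntimeError(f"Refusing remote Ollama host {host}; set IINTS_ALLOW_REMOTE_OLLAMA=1 to allow it")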

6. Verify the full chain

Install and check the local model:

ollama pull ministral-3:8b
iints ai local-check --model ministral-3:8b

If the model is missing, the command fails with the exact ollama pull ... command to run next. If your Ollama runtime is too old for the open Ministral 3 line, local-check tells you that as well; the current Ollama listing for ministral-3 expects Ollama 0.13.1 or newer. local-check also runs a tiny generation smoke-test by default, so it catches the common case where /api/tags works but the model crashes during real inference.
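You can reproduce the runtime-version check by hand against Ollama's public GET /api/version endpoint. A minimal sketch using the 0.13.1 floor mentioned above; the comparison logic is illustrative and assumes a plain X.Y.Z version string.

import json
import urllib.request

MIN_VERSION = (0, 13, 1)  # floor for the open Ministral 3 line, per the Ollama listing

with urllib.request.urlopen("http://127.0.0.1:11434/api/version", timeout=3) as resp:
    version = json.load(resp)["version"]

if tuple(int(p) for p in version.split(".")[:3]) < MIN_VERSION:
    print(f"Ollama {version} is too old for ministral-3; upgrade to 0.13.1 or newer")
else:
    print(f"Ollama {version} meets the ministral-3 minimum")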

Hardware Recommendations

Use this as a practical starting point:

Model             Good Fit                                                    Recommended System RAM   Recommended GPU VRAM   Approx Download
ministral-3:3b    smaller laptops, CPU-first setups, entry-level edge boxes   16 GB                    6 GB                   ~3 GB
ministral-3:8b    balanced desktop or strong laptop                           24 GB                    10 GB                  ~6 GB
ministral-3:14b   high-end workstation                                        32 GB                    16 GB                  ~10 GB

General advice:

  • Start with ministral-3:8b unless you have a specific reason to go smaller or larger.
  • Choose ministral-3:3b if latency and memory matter more than answer quality.
  • Choose ministral-3:14b only if your machine can comfortably absorb the extra RAM and latency.
  • Run iints ai models in the CLI to see the same recommendations in a terminal-friendly table.
  • If ministral-3:8b closes the connection during generation, try ministral-3:3b first before assuming something is wrong with the SDK.

Preparing Payloads

After a run completes, prepare the run directory once:

iints ai prepare results/<run_id>

That command creates:

  • ai/report_payload.json
  • ai/anomalies_payload.json
  • ai/trends_payload.json
  • ai/step_riskiest.json
  • ai/step_latest.json
  • ai/report.signed.mdmp plus ai/keys/ when the MDMP extra is installed
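These payloads are ordinary JSON files, so you can sanity-check one before handing it to the model. A quick look that assumes nothing about the payload schema beyond it being valid JSON:

import json
from pathlib import Path

# Replace <run_id> with your actual run directory.
payload = json.loads(Path("results/<run_id>/ai/report_payload.json").read_text())
if isinstance(payload, dict):
    print("top-level keys:", sorted(payload))
else:
    print("payload is a", type(payload).__name__, "with", len(payload), "items")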

After that, you can point the AI commands directly at the run directory:

iints ai explain results/<run_id>
iints ai trends results/<run_id>
iints ai anomalies results/<run_id>
iints ai report results/<run_id> --output results/<run_id>/ai/ai_report.md
iints ai review results/<run_id>

For imported CareLink data, generate a personal workspace first:

iints carelink-workbench \
  --input-csv "/path/to/CareLink export.csv" \
  --output-dir results/personal_carelink

That creates:

  • carelink_dashboard.png
  • carelink_poster.png
  • carelink_dashboard.html
  • carelink_timeline.csv
  • carelink_metrics.json
  • ai/report_payload.json
  • ai/review_payload.json
  • ai/trends_payload.json
  • ai/anomalies_payload.json
  • ai/step_riskiest.json
  • ai/report.signed.mdmp when the MDMP extra is installed

After that, the same AI commands work directly on the CareLink workspace directory:

iints ai report results/personal_carelink --model ministral-3:3b
iints ai review results/personal_carelink --model ministral-3:3b
iints ai trends results/personal_carelink --model ministral-3:3b
iints ai explain results/personal_carelink --model ministral-3:3b

Digital Patient Review

The Raspberry Pi live runtime also plugs into the same AI layer.

After a live patient session has written its bundle under patient_runtime/live_bundle/, run:

iints patient review \
  --workspace patient_runtime \
  --model ministral-3:3b

That automatically prepares the runtime bundle, checks the MDMP gate, and writes:

  • patient_runtime/live_bundle/ai/realism_review.md

This is useful for expo demos where you want the Pi to both simulate and critique the realism of the current run.

Generation Commands

Prepared run directory mode:

iints ai explain results/<run_id>
iints ai trends results/<run_id>
iints ai anomalies results/<run_id>
iints ai report results/<run_id> --output results/<run_id>/ai/ai_report.md
iints ai review results/<run_id>

Prepared CareLink workspace mode:

iints carelink-workbench \
  --input-csv "/path/to/CareLink export.csv" \
  --output-dir results/personal_carelink

iints ai report results/personal_carelink --model ministral-3:3b
iints ai review results/personal_carelink --model ministral-3:3b
iints ai trends results/personal_carelink --model ministral-3:3b
iints ai explain results/personal_carelink --model ministral-3:3b

Direct JSON mode:

iints ai explain results/step.json \
  --mdmp-cert results/report.signed.mdmp

iints ai trends results/glucose_payload.json \
  --mdmp-cert results/report.signed.mdmp

iints ai anomalies results/simulation_run.json \
  --mdmp-cert results/report.signed.mdmp

iints ai report results/simulation_run.json \
  --mdmp-cert results/report.signed.mdmp \
  --output results/ai_report.md

iints ai review results/simulation_run.json \
  --mdmp-cert results/report.signed.mdmp \
  --output results/realism_review.md

iints ai review writes a realism-focused critique. When you point it at a prepared run directory and do not pass --output, it automatically saves to results/<run_id>/ai/realism_review.md.

The review now asks the local model to always structure feedback as:

  • overall realism verdict
  • what looks realistic
  • what looks suspicious
  • priority fixes
  • what to improve next
  • follow-up validation checks

Useful options:

  • --mode local to require Ollama explicitly
  • --model ministral-3:8b or --model ministral
  • --model ministral-3:3b for lighter machines
  • --model ministral-3:14b for stronger workstations
  • --ollama-host http://127.0.0.1:11434 to override the endpoint
  • --timeout-seconds 120 for slower local hardware
  • --minimum-grade research_grade to raise or lower the MDMP floor

How Reliability Is Enforced

For local robustness, the SDK now runs five checks before a real generation call:

  • verifies that the Ollama HTTP endpoint is reachable
  • verifies that a compatible local open Ministral 3 model is installed
  • normalizes common local model aliases to the installed tag
  • truncates oversized JSON payloads before prompt construction, so large run artifacts do not overwhelm local inference (see the sketch after this list)
  • flags Ollama runtimes that do not meet the minimum version expected for the open Ministral 3 line
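The truncation step can be pictured as a simple character budget applied to the serialized payload before the prompt is built. A minimal sketch; the budget and marker text are assumptions, not the SDK's actual values.

def clip_payload(payload_json: str, max_chars: int = 20_000) -> str:
    # Keep prompts bounded so local inference is not overwhelmed.
    if len(payload_json) <= max_chars:
        return payload_json
    return payload_json[:max_chars] + "\n...[payload truncated for local inference]..."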

If a generation succeeds, the response records the actual resolved model name used by the local backend.

MDMP Guard Behavior

The AI assistant does not run on unsigned or insufficiently graded artifacts.

The guard enforces:

  • signed MDMP verification
  • minimum grade threshold
  • hard-coded disclaimer injection on every response

That means the research-only warning is not dependent on the prompt and cannot be removed by changing prompt text alone.
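In other words, the disclaimer lives in the response path, not in the prompt. A sketch of the idea; the wrapper and the disclaimer text here are illustrative, not the SDK's actual strings.

DISCLAIMER = "RESEARCH USE ONLY. Not a medical device. No clinical dosing advice."  # illustrative text

def guarded_generate(guard, backend, cert_path, prompt):
    guard.verify(cert_path)              # raises on unsigned or under-graded artifacts
    answer = backend.generate(prompt)
    return f"{DISCLAIMER}\n\n{answer}"   # appended after generation, outside the prompt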

Troubleshooting

Ollama Not Reachable

Run:

iints ai local-check --model ministral-3:8b

This now checks both basic reachability and a tiny real generation.

If the endpoint is wrong, retry with:

iints ai local-check --model ministral-3:8b --ollama-host http://127.0.0.1:11434
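You can also reproduce the tiny generation by hand against Ollama's public POST /api/generate endpoint; the one-word prompt here is just an example:

import json
import urllib.request

req = urllib.request.Request(
    "http://127.0.0.1:11434/api/generate",
    data=json.dumps({"model": "ministral-3:8b", "prompt": "ping", "stream": False}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=60) as resp:
    print(json.load(resp)["response"][:80])

If this fails while /api/tags succeeds, you are seeing exactly the case the smoke-test exists to catch.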

Model Missing

Pull the model shown in the error output:

ollama pull ministral-3:8b

Local Inference Is Slow

Increase the timeout:

iints ai report results/simulation_run.json \
  --mdmp-cert results/report.signed.mdmp \
  --timeout-seconds 180

If the server disconnects instead of timing out, the model may be too heavy for the machine at that moment. In that case try:

ollama pull ministral-3:3b
iints ai local-check --model ministral-3:3b
iints ai report results/<run_id> --model ministral-3:3b

Large Run Payloads

The assistant now clips oversized payloads automatically before sending them to the model. If you want tighter control, pass a smaller JSON summary rather than a full raw run dump.
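One way to build such a summary is to keep only the fields you actually want discussed. A sketch with hypothetical field names; adapt the key list to your own run schema.

import json
from pathlib import Path

run = json.loads(Path("results/simulation_run.json").read_text())

# Hypothetical keys; keep whatever your run format actually uses.
KEEP = ("run_id", "duration_minutes", "glucose_summary", "alarms")
summary = {k: run[k] for k in KEEP if k in run}

Path("results/run_summary.json").write_text(json.dumps(summary, indent=2))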

No report.signed.mdmp In My Run Folder

That is now expected for a fresh raw run. The easiest fix is:

iints ai prepare results/<run_id>

This creates a local development certificate for AI use when the MDMP extra is available, so you do not have to hand-build step.json and report.signed.mdmp yourself.

No such command 'ai'

If the CLI says No such command 'ai', the most common cause is a legacy iints package still installed alongside iints-sdk-python35. That older package can shadow the newer SDK command tree.

Run the install doctor:

iints-sdk-doctor

If it reports a package ownership conflict, repair the environment:

python -m pip uninstall -y iints iints-sdk-python35
python -m pip install -U "iints-sdk-python35[full,mdmp]"
hash -r

Then retry:

iints ai models
iints ai local-check --model ministral-3:8b