AI Assistant Guide¶
This guide explains how the local open-weight Ministral 3 AI layer works inside IINTS-AF.
Scope¶
- Research use only.
- Not a medical device.
- No clinical dosing advice.
- AI output is blocked unless MDMP verification succeeds first.
What The AI Layer Does¶
The local AI assistant is designed for four narrow tasks:
- explain: explain a single simulation step in plain language
- trends: summarize glucose-oriented trends from a payload
- anomalies: call out unusual or safety-relevant patterns
- report: generate a short markdown run summary
The assistant is intentionally conservative. It explains simulation behavior or imported glucose data patterns; it does not produce treatment advice.
Architecture¶
The flow is:
- iints ai ... loads a JSON payload from disk.
- MDMPGuard verifies the signed MDMP artifact and enforces the minimum grade.
- IINTSAssistant selects the backend.
- OllamaBackend checks that Ollama is reachable and that a local open Ministral 3 tag is installed.
- The prompt is built from a fixed system instruction plus a serialized payload.
- The response is wrapped with a hard-coded research-only disclaimer before output is shown or saved.
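The flow above can be sketched roughly as follows. The class and method names here are illustrative, not the SDK's actual API; only the ordering matters: verification runs before any model call, and the disclaimer is appended in code rather than in the prompt.

```python
import json

RESEARCH_DISCLAIMER = "Research use only. Not a medical device. No clinical dosing advice."

class MDMPGuard:
    """Stand-in for signed-artifact verification plus the minimum-grade gate."""
    def __init__(self, minimum_grade: str = "research_grade"):
        self.minimum_grade = minimum_grade

    def verify(self, cert: dict) -> None:
        # The real guard checks a cryptographic signature; this models only the grade floor.
        if cert.get("grade") != self.minimum_grade:
            raise PermissionError("MDMP verification failed")

def run_ai_task(payload: dict, cert: dict, guard: MDMPGuard) -> str:
    guard.verify(cert)  # verification happens before any model call
    prompt = "Explain this simulation step:\n" + json.dumps(payload)
    response = f"(model output for {len(prompt)}-char prompt)"  # stand-in for local generation
    # The disclaimer is appended in code, so changing the prompt cannot remove it.
    return response + "\n\n" + RESEARCH_DISCLAIMER
```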
Local Backend Behavior¶
The SDK defaults to local inference through Ollama.
Default model:
ministral-3:8b
Supported convenience aliases include:
- ministral
- ministral-3
- ministral-3:8b
- ministral-8b
- ministral-8b-instruct
If the alias is used, IINTS resolves it to the installed local Ollama tag before generation.
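Conceptually, that resolution is a lookup from convenience names to a concrete tag. A minimal sketch, assuming the default tag as the target; the real SDK resolves against the tags actually installed in the local Ollama instance:

```python
# Illustrative alias table; the real SDK inspects the locally installed Ollama tags.
ALIASES = {
    "ministral": "ministral-3:8b",
    "ministral-3": "ministral-3:8b",
    "ministral-8b": "ministral-3:8b",
    "ministral-8b-instruct": "ministral-3:8b",
}

def resolve_model(name: str) -> str:
    """Map a convenience alias to a concrete local Ollama tag; pass through full tags."""
    return ALIASES.get(name, name)
```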
Recommended Setup¶
Always work from an active virtual environment:
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -e ".[full,mdmp]"
Clean Ollama Setup (Small Version)¶
This is the shortest reliable setup if you want the local AI layer working end to end.
1. Install the SDK¶
Released SDK:
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -U "iints-sdk-python35[full,mdmp]"
Source checkout:
cd /path/to/IINTS-SDK
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -U -e ".[full,mdmp]"
2. Install Ollama¶
On macOS/Linux, the quick official path is:
curl -fsSL https://ollama.com/install.sh | sh
ollama -v
If you are on Windows, install Ollama first via the official installer and then reopen your terminal.
3. Start Ollama¶
ollama serve
If Ollama is already running as a service, you do not need to start it manually again.
4. Pull a local model¶
Balanced default:
ollama pull ministral-3:8b
Smaller fallback:
ollama pull ministral-3:3b
5. Link Ollama to the SDK¶
By default, the SDK looks for Ollama here:
http://127.0.0.1:11434
So on a normal single-machine setup, there is nothing extra to configure.
If you want to set it explicitly:
export OLLAMA_HOST=http://127.0.0.1:11434
If you want to override it for one command only:
iints ai local-check \
--model ministral-3:8b \
--ollama-host http://127.0.0.1:11434
Important notes:
- OLLAMA_HOST is the normal way to point the SDK at a non-default local Ollama endpoint.
- Remote Ollama endpoints are blocked by default. Only enable them deliberately.
- If you truly want to connect to a non-loopback Ollama host, you must also set IINTS_ALLOW_REMOTE_OLLAMA=1.
6. Verify the full chain¶
Install and check the local model:
ollama pull ministral-3:8b
iints ai local-check --model ministral-3:8b
If the model is missing, the command fails with the exact ollama pull ... command to run next.
If your Ollama runtime is too old for the open Ministral 3 line, local-check now tells you that as well.
local-check also runs a tiny generation smoke-test by default, so it catches the common case where /api/tags works but the model crashes during real inference.
The current Ollama listing for ministral-3 expects Ollama 0.13.1 or newer.
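That version gate amounts to a simple tuple comparison against the output of ollama -v. A sketch, with 0.13.1 taken from the listing above and the function name illustrative:

```python
MIN_OLLAMA_VERSION = (0, 13, 1)  # minimum expected for the open Ministral 3 line

def ollama_version_ok(version_string: str) -> bool:
    """Compare an `ollama -v` style version string against the required minimum."""
    parts = tuple(int(p) for p in version_string.strip().lstrip("v").split("."))
    return parts >= MIN_OLLAMA_VERSION
```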
Hardware Recommendations¶
Use this as a practical starting point:
| Model | Good Fit | Recommended System RAM | Recommended GPU VRAM | Approx Download |
|---|---|---|---|---|
| ministral-3:3b | smaller laptops, CPU-first setups, entry-level edge boxes | 16 GB | 6 GB | ~3 GB |
| ministral-3:8b | balanced desktop or strong laptop | 24 GB | 10 GB | ~6 GB |
| ministral-3:14b | high-end workstation | 32 GB | 16 GB | ~10 GB |
General advice:
- Start with ministral-3:8b unless you have a specific reason to go smaller or larger.
- Choose ministral-3:3b if latency and memory matter more than answer quality.
- Choose ministral-3:14b only if your machine can comfortably absorb the extra RAM and latency.
- Run iints ai models in the CLI to see the same recommendations in a terminal-friendly table.
- If ministral-3:8b closes the connection during generation, try ministral-3:3b first before assuming something is wrong with the SDK.
Recommended Workflow¶
After a run completes, prepare the run directory once:
iints ai prepare results/<run_id>
That command creates:
- ai/report_payload.json
- ai/anomalies_payload.json
- ai/trends_payload.json
- ai/step_riskiest.json
- ai/step_latest.json
- ai/report.signed.mdmp plus ai/keys/ when the MDMP extra is installed
After that, you can point the AI commands directly at the run directory:
iints ai explain results/<run_id>
iints ai trends results/<run_id>
iints ai anomalies results/<run_id>
iints ai report results/<run_id> --output results/<run_id>/ai/ai_report.md
iints ai review results/<run_id>
For imported CareLink data, generate a personal workspace first:
iints carelink-workbench \
--input-csv "/path/to/CareLink export.csv" \
--output-dir results/personal_carelink
That creates:
- carelink_dashboard.png
- carelink_poster.png
- carelink_dashboard.html
- carelink_timeline.csv
- carelink_metrics.json
- ai/report_payload.json
- ai/review_payload.json
- ai/trends_payload.json
- ai/anomalies_payload.json
- ai/step_riskiest.json
- ai/report.signed.mdmp when the MDMP extra is installed
After that, the same AI commands work directly on the CareLink workspace directory:
iints ai report results/personal_carelink --model ministral-3:3b
iints ai review results/personal_carelink --model ministral-3:3b
iints ai trends results/personal_carelink --model ministral-3:3b
iints ai explain results/personal_carelink --model ministral-3:3b
Digital Patient Review¶
The Raspberry Pi live runtime also plugs into the same AI layer.
After a live patient session has written its bundle under patient_runtime/live_bundle/, run:
iints patient review \
--workspace patient_runtime \
--model ministral-3:3b
That automatically prepares the runtime bundle, checks the MDMP gate, and writes:
patient_runtime/live_bundle/ai/realism_review.md
This is useful for expo demos where you want the Pi to both simulate and critique the realism of the current run.
Generation Commands¶
Prepared run directory mode:
iints ai explain results/<run_id>
iints ai trends results/<run_id>
iints ai anomalies results/<run_id>
iints ai report results/<run_id> --output results/<run_id>/ai/ai_report.md
iints ai review results/<run_id>
Prepared CareLink workspace mode:
iints carelink-workbench \
--input-csv "/path/to/CareLink export.csv" \
--output-dir results/personal_carelink
iints ai report results/personal_carelink --model ministral-3:3b
iints ai review results/personal_carelink --model ministral-3:3b
iints ai trends results/personal_carelink --model ministral-3:3b
iints ai explain results/personal_carelink --model ministral-3:3b
Direct JSON mode:
iints ai explain results/step.json \
--mdmp-cert results/report.signed.mdmp
iints ai trends results/glucose_payload.json \
--mdmp-cert results/report.signed.mdmp
iints ai anomalies results/simulation_run.json \
--mdmp-cert results/report.signed.mdmp
iints ai report results/simulation_run.json \
--mdmp-cert results/report.signed.mdmp \
--output results/ai_report.md
iints ai review results/simulation_run.json \
--mdmp-cert results/report.signed.mdmp \
--output results/realism_review.md
iints ai review writes a realism-focused critique. When you point it at a prepared run directory and do not pass --output, it automatically saves to results/<run_id>/ai/realism_review.md.
The review now asks the local model to always structure feedback as:
- overall realism verdict
- what looks realistic
- what looks suspicious
- priority fixes
- what to improve next
- follow-up validation checks
Useful options:
- --mode local to require Ollama explicitly
- --model ministral-3:8b or --model ministral
- --model ministral-3:3b for lighter machines
- --model ministral-3:14b for stronger workstations
- --ollama-host http://127.0.0.1:11434 to override the endpoint
- --timeout-seconds 120 for slower local hardware
- --minimum-grade research_grade to raise or lower the MDMP floor
How Reliability Is Enforced¶
For local robustness, the SDK now does five checks before a real generation call:
- verifies that the Ollama HTTP endpoint is reachable
- verifies that a compatible local open Ministral 3 model is installed
- normalizes common local model aliases to the installed tag
- truncates oversized JSON payloads before prompt construction so large run artifacts do not overwhelm local inference
- flags too-old Ollama runtimes when they do not meet the minimum version expected for the open Ministral 3 line
If a generation succeeds, the response records the actual resolved model name used by the local backend.
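The payload-truncation step can be sketched like this; the character limit here is illustrative, not the SDK's actual cutoff:

```python
import json

MAX_PAYLOAD_CHARS = 20_000  # illustrative cutoff, not the SDK's actual limit

def clip_payload(payload: dict, limit: int = MAX_PAYLOAD_CHARS) -> str:
    """Serialize a run payload, clipping oversized JSON before prompt construction."""
    text = json.dumps(payload)
    if len(text) <= limit:
        return text
    return text[:limit] + f"\n...[truncated {len(text) - limit} characters]"
```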
MDMP Guard Behavior¶
The AI assistant does not run on unsigned or insufficiently graded artifacts.
The guard enforces:
- signed MDMP verification
- minimum grade threshold
- hard-coded disclaimer injection on every response
That means the research-only warning is not dependent on the prompt and cannot be removed by changing prompt text alone.
Troubleshooting¶
Ollama Not Reachable¶
Run:
iints ai local-check --model ministral-3:8b
This now checks both basic reachability and a tiny real generation.
If the endpoint is wrong, retry with:
iints ai local-check --model ministral-3:8b --ollama-host http://127.0.0.1:11434
Model Missing¶
Pull the model shown in the error output:
ollama pull ministral-3:8b
Local Inference Is Slow¶
Increase the timeout:
iints ai report results/simulation_run.json \
--mdmp-cert results/report.signed.mdmp \
--timeout-seconds 180
If the server disconnects instead of timing out, the model may be too heavy for the machine at that moment. In that case try:
ollama pull ministral-3:3b
iints ai local-check --model ministral-3:3b
iints ai report results/<run_id> --model ministral-3:3b
Large Run Payloads¶
The assistant now clips oversized payloads automatically before sending them to the model. If you want tighter control, pass a smaller JSON summary rather than a full raw run dump.
No report.signed.mdmp In My Run Folder¶
That is now expected for a fresh raw run. The easiest fix is:
iints ai prepare results/<run_id>
This creates a local development certificate for AI use when the MDMP extra is available, so you do not have to hand-build step.json and report.signed.mdmp yourself.
No such command 'ai'¶
If the CLI says No such command 'ai', the most common cause is a legacy iints package still being installed beside iints-sdk-python35. That older package can shadow the newer SDK command tree.
Run the install doctor:
iints-sdk-doctor
If it reports a package ownership conflict, repair the environment:
python -m pip uninstall -y iints iints-sdk-python35
python -m pip install -U "iints-sdk-python35[full,mdmp]"
hash -r
Then retry:
iints ai models
iints ai local-check --model ministral-3:8b