Data Certification Quickstart
Use this page when you need a quick, defensible answer to one question: “is this dataset good enough to use?”
By the end, you should have a contract, a certification report, and a shareable audit dashboard.
Read before: Getting Started if you have not produced a run or dataset yet.
Read next: Data Certification Full Guide when you need the full protocol details.
Use a virtual environment
Run all commands from an active .venv.
What Certification Produces
- contract validation results
- compliance score
- deterministic dataset and contract fingerprints
- trust grade: draft, research_grade, or clinical_grade
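To make "deterministic fingerprint" concrete, here is a minimal sketch that assumes a plain SHA-256 hash over the raw file bytes. The SDK's real fingerprinting scheme is not described on this page and may differ (for example, it may normalize the data first). `file_fingerprint` is a hypothetical helper, not an `iints` function.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path: str, chunk_size: int = 65536) -> str:
    """Return a SHA-256 hex digest of the file's raw bytes.

    Hypothetical helper for illustration; not part of the iints SDK.
    """
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in chunks so large datasets do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The same bytes always produce the same digest, so the value can be used to confirm that a report refers to exactly the file you have on disk.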
Fastest Working Path
1. Create a contract
iints data certify-template --output-path data_contract.yaml
Edit the schema, units, and value ranges so they match your dataset.
2. Validate the dataset
iints data certify data_contract.yaml data/my_cgm.csv --output-json results/certification.json
For pipelines, use a strict gate:
iints data certify data_contract.yaml data/my_cgm.csv \
  --min-mdmp-grade research_grade \
  --fail-on-noncompliant \
  --output-json results/certification.json
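If the pipeline is driven from Python, the strict gate can be wrapped with `subprocess`. The sketch below uses only the flags shown above. It assumes the `iints` CLI is on `PATH` and that a failed gate is signalled by a non-zero exit status, which this page does not state explicitly. `build_certify_command` and `run_gate` are hypothetical helper names.

```python
import subprocess
from typing import List


def build_certify_command(contract: str, dataset: str, output_json: str,
                          min_grade: str = "research_grade") -> List[str]:
    """Assemble the strict-gate command as an argument list."""
    return [
        "iints", "data", "certify", contract, dataset,
        "--min-mdmp-grade", min_grade,
        "--fail-on-noncompliant",
        "--output-json", output_json,
    ]


def run_gate(contract: str, dataset: str, output_json: str) -> bool:
    """Run the gate and return True when the CLI exits with status 0.

    Assumption: a failed gate is reported through a non-zero exit status.
    """
    command = build_certify_command(contract, dataset, output_json)
    # check=False so the caller decides how to react to a failure.
    completed = subprocess.run(command, check=False)
    return completed.returncode == 0
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.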
3. Generate a dashboard
iints data certify-visualizer results/certification.json --output-html results/mdmp_dashboard.html
The dashboard is a single-file HTML artifact you can share offline.
Optional: Create Synthetic Mirror Data
iints data synthetic-mirror data/my_cgm.csv data_contract.yaml \
  --output-csv data/synthetic_mirror.csv \
  --output-json results/synthetic_mirror_report.json
Use this when you need schema-realistic development data without distributing raw sensitive rows.
Grade Meaning
| Grade | Interpretation |
|---|---|
| draft | useful for iteration, not ready for rigorous research claims |
| research_grade | acceptable for research workflows |
| clinical_grade | strongest validation level currently exposed by the SDK |
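A minimum-grade check reduces to comparing positions in an ordered list. The sketch below assumes the ordering implied by the table (draft, then research_grade, then clinical_grade). `GRADE_ORDER` and `meets_min_grade` are hypothetical names for illustration, not SDK identifiers.

```python
# Weakest to strongest, as implied by the grade table (an assumption).
GRADE_ORDER = ("draft", "research_grade", "clinical_grade")


def meets_min_grade(actual: str, minimum: str) -> bool:
    """Return True when `actual` is at least as strong as `minimum`.

    Raises ValueError if either grade name is not in GRADE_ORDER.
    """
    return GRADE_ORDER.index(actual) >= GRADE_ORDER.index(minimum)
```

Under this ordering, a dataset graded clinical_grade passes a research_grade minimum, while a draft dataset does not.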
Optional Python Gate
import pandas as pd
from iints import mdmp_gate


# Gate the function: incoming data must reach at least research_grade.
@mdmp_gate("contracts/clinical_mdmp_contract.yaml", min_grade="research_grade")
def process(df: pd.DataFrame) -> int:
    return len(df)
This blocks or warns when incoming data does not meet the required quality level.
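The "block or warn" behaviour follows a common decorator pattern. The sketch below is a generic illustration of that pattern, not the implementation of `mdmp_gate`. `quality_gate`, its `check` callable, and the `on_fail` parameter are all hypothetical.

```python
import functools
import warnings
from typing import Any, Callable


def quality_gate(check: Callable[[Any], bool], on_fail: str = "raise"):
    """Wrap a function so its first argument must pass `check`.

    Hypothetical illustration of the gate pattern; not the iints API.
    """
    if on_fail not in ("raise", "warn"):
        raise ValueError("on_fail must be 'raise' or 'warn'")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(data, *args, **kwargs):
            if not check(data):
                if on_fail == "raise":
                    # Blocking mode: stop before the wrapped function runs.
                    raise ValueError("data failed the quality gate")
                # Warning mode: report the problem but continue.
                warnings.warn("data failed the quality gate; continuing")
            return func(data, *args, **kwargs)
        return wrapper
    return decorator


@quality_gate(lambda rows: len(rows) > 0, on_fail="warn")
def count_rows(rows):
    return len(rows)
```

Blocking mode suits automated pipelines where bad data must never reach downstream steps. Warning mode suits exploratory work where you still want to see the result.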
Where To Go Next
| If you want to... | Continue with |
|---|---|
| understand every rule and fingerprint | Data Certification Full Guide |
| use certified data in a benchmark | Scientific Workflow |
| inspect the science behind realism claims | Evidence Base |