Iris: Licensing & Commercialization Audit
Status: Iris is a private, soon-to-be-commercial product. This document inventories every third-party component, flags what blocks commercial use, and lays out the options (replace / license / self-train) with cost and effort.
Bottom line: Iris's own code is MIT (yours). Only three dependencies block commercial sale today: the object detector (YOLOv8, AGPL-3.0), the face models (InsightFace buffalo_l, non-commercial), and the re-ID weights (OSNet trained on MSMT17, research-only data). Everything else (STT, translation, both VAD engines, the web stack, FAISS, ONNX Runtime) is already commercial-clean. All three blockers have a free Apache/MIT replacement path, so none requires paying anyone if you swap models, although the face-recognition swap carries by far the highest engineering effort (§3.2).
1. Deployment model decides which licenses bite
This matters before any line-item:
| Model | What it means | License impact |
|---|---|---|
| SaaS / hosted | Customers use Iris over the network; you never ship them the software | GPL (ffmpeg) does not trigger (no distribution). AGPL still bites (its §13 network clause covers remote interaction). Non-commercial model/data licenses still bite (you're using them commercially regardless). |
| On-prem / shipped (Docker image to the customer) | Customer runs the container | GPL and AGPL bite, plus you redistribute ffmpeg/CUDA/NVENC → their redistribution terms apply. |
Either way the three blockers below must be resolved. On-prem additionally needs an LGPL ffmpeg build (§5).
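The table above can be read as a small decision rule. The sketch below is only an illustration of that rule, not legal logic to rely on: the `Deployment` and `Terms` enums and the `bites` function are hypothetical names introduced here, and the four categories are the ones the table mentions.

```python
from enum import Enum


class Deployment(Enum):
    SAAS = "saas"        # customers reach Iris over the network only
    ON_PREM = "on_prem"  # a Docker image is shipped to the customer


class Terms(Enum):
    GPL = "gpl"                        # e.g. an ffmpeg built with --enable-gpl
    AGPL = "agpl"                      # e.g. Ultralytics YOLOv8
    NON_COMMERCIAL = "non_commercial"  # e.g. buffalo_l weights, MSMT17 data
    REDISTRIBUTION = "redistribution"  # e.g. ffmpeg / CUDA / NVENC redistribution terms


def bites(terms: Terms, deployment: Deployment) -> bool:
    """Return True when `terms` creates an obligation under `deployment`."""
    if terms in (Terms.AGPL, Terms.NON_COMMERCIAL):
        # Remote network interaction (AGPL section 13) and commercial use
        # occur in both deployment models.
        return True
    # GPL and redistribution terms are triggered by shipping the software.
    return deployment is Deployment.ON_PREM
```

For example, `bites(Terms.GPL, Deployment.SAAS)` is false while `bites(Terms.AGPL, Deployment.SAAS)` is true, which is why the AGPL detector is a blocker even for a hosted product.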
2. Full component inventory
✅ = commercial-OK · ⛔ = blocks commercial use · ⚠️ = conditional
| Component | Role in Iris | License | Commercial? |
|---|---|---|---|
| Iris application code | everything we wrote | MIT (ours) | ✅ relicense freely |
| Ultralytics YOLOv8n | person/object detector | AGPL-3.0 | ⛔ blocker |
| InsightFace — code | SCRFD + ArcFace runtime | MIT | ✅ |
| InsightFace (`buffalo_l` weights) | face detect + 512-d embedding | non-commercial research | ⛔ blocker |
| OSNet `osnet_ain_x1_0_msmt17` | appearance re-ID embedding | code MIT, weights trained on MSMT17 (research-only) | ⛔ blocker (weights) |
| faster-whisper + CTranslate2 | speech-to-text | MIT | ✅ |
| OpenAI Whisper model | STT weights | MIT | ✅ |
| EuroLLM-9B | translation LLM | Apache-2.0 | ✅ |
| ollama | LLM server | MIT | ✅ |
| TEN VAD | live voice gate | Apache-2.0 | ✅ |
| Silero VAD | fallback VAD | MIT | ✅ |
| yt-dlp | web-link source resolver | Unlicense (public domain) | ✅ |
| mediamtx | RTSP fan-out sidecar | MIT | ✅ |
| ONNX Runtime (GPU) | inference | MIT | ✅ |
| FAISS | vector index | MIT | ✅ |
| OpenCV (headless) | frame I/O | Apache-2.0 | ✅ |
| NumPy / Pillow / SQLAlchemy / FastAPI / uvicorn / pydantic / Starlette | web + utils | BSD / MIT | ✅ |
| hls.js | browser HLS player | Apache-2.0 | ✅ |
| ffmpeg | decode/encode/segment | LGPL-2.1+ core; GPL if built `--enable-gpl` | ⚠️ see §5 |
| CUDA / cuDNN / NVENC / NVDEC | GPU runtime | NVIDIA proprietary EULA (redistributable for deployment) | ⚠️ keep the EULA notice |
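If the inventory is kept as data, the blocker list can be checked mechanically, for example in a release gate. The following is a minimal sketch under that assumption: the `Component` dataclass, the `INVENTORY` excerpt, and the `blockers` helper are hypothetical, and only a few rows of the table are reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    license: str
    status: str  # "ok", "blocker", or "conditional", mirroring the legend above


# Excerpt of the table; a real list would carry every row.
INVENTORY = [
    Component("Ultralytics YOLOv8n", "AGPL-3.0", "blocker"),
    Component("InsightFace buffalo_l weights", "non-commercial research", "blocker"),
    Component("OSNet osnet_ain_x1_0_msmt17", "weights trained on MSMT17", "blocker"),
    Component("faster-whisper + CTranslate2", "MIT", "ok"),
    Component("EuroLLM-9B", "Apache-2.0", "ok"),
    Component("ffmpeg", "LGPL-2.1+ core; GPL if built --enable-gpl", "conditional"),
]


def blockers(inventory):
    """Names of the components that block commercial use."""
    return [c.name for c in inventory if c.status == "blocker"]


def needs_review(inventory):
    """Names of the components whose status depends on how Iris is shipped."""
    return [c.name for c in inventory if c.status == "conditional"]
```

A build script could fail when `blockers(INVENTORY)` is non-empty for the edition being packaged.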
3. The three blockers: options, cost, effort
3.1 Object detector: YOLOv8 (AGPL-3.0)
YOLOv8 is the easiest to fix: the detection task is commoditised and there are several Apache-2.0 detectors with COCO-pretrained weights that ship person/vehicle classes out of the box.
| Option | License cost | Eng. effort | Notes |
|---|---|---|---|
| A. Buy Ultralytics Enterprise license | Contact sales — custom; one publicly-reported quote ≈ $5k/yr (older, unverified), expect low-to-mid 5 figures/yr, subscription or one-time | ~0 (keep current model) | Keeps YOLOv8 weights/accuracy; recurring cost; vendor lock-in |
| B. Swap to an Apache-2.0 detector ⭐ | $0 | ~days | Re-export to ONNX, adapt the pre/post-process in app/detect/. Candidates: RTMDet (OpenMMLab), RT-DETR (Baidu original), RF-DETR (Roboflow), YOLOX, D-FINE — all Apache-2.0, COCO-pretrained, real-time. Avoid YOLO-NAS — its weights carry a restrictive custom license. |
| C. Train our own | $0 (COCO annotations CC-BY 4.0) | ~1–2 wks | Only worth it if we need custom classes; otherwise (B)'s pretrained weights are enough |
Recommendation: B — swap to RTMDet or RT-DETR (Apache-2.0). Free, removes the AGPL obligation entirely, comparable or better accuracy than YOLOv8n.
Status (1.55.0): resolved in code. The Apache-2.0 RT-DETR backend (`app/detect/rtdetr_onnx.py`, NMS-free) ships and is selectable per-camera or globally. The `model_license_clean` edition switch forces it in place of YOLOv8n, `scripts/download_models.py --license-clean` provisions only the Apache detector, and the settings API accepts `detector_backend="rtdetr"`. The AGPL blocker is lifted for any build that enables the clean edition and drops a `*rtdetr*.onnx` in place of `yolov8n.onnx`.
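The way the edition switch overrides a per-camera or global choice might look like the sketch below. It is an illustration of the behaviour described in the status note, not the code in `app/detect/`: the function `resolve_detector_backend`, its argument layout, and the `"yolov8"` backend string are assumptions, while `model_license_clean` and `"rtdetr"` come from the note above.

```python
from typing import Optional

# Backends the audit treats as commercial-clean (Apache-2.0).
CLEAN_BACKENDS = frozenset({"rtdetr"})


def resolve_detector_backend(
    global_backend: str,
    camera_backend: Optional[str],
    model_license_clean: bool,
) -> str:
    """Pick the detector backend for one camera.

    A per-camera setting wins over the global one, but the clean edition
    forces an Apache-2.0 backend whatever was requested.
    """
    chosen = camera_backend or global_backend
    if model_license_clean and chosen not in CLEAN_BACKENDS:
        return "rtdetr"
    return chosen
```

With the clean edition enabled, `resolve_detector_backend("yolov8", None, True)` yields `"rtdetr"`, so a stale setting cannot reintroduce the AGPL model.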
3.2 Face recognition: InsightFace buffalo_l (non-commercial)
The InsightFace code is MIT (fine to keep); the pretrained packs are the problem — trained on datasets (MS1M / Glint360K) released for non-commercial research. This is the hardest blocker because commercially-licensed face data is scarce.
| Option | License cost | Eng. effort | Notes |
|---|---|---|---|
| A. Buy InsightFace commercial model license ⭐ (fastest) | Contact insightface.ai — not public; expect 4–5 figures per deployment/model | ~0 | Keeps current SCRFD+ArcFace accuracy; cleanest legal path that preserves the feature |
| B. Defer face-ID to an "enterprise add-on" | $0 | low | Ship v1 commercially with detection + appearance re-ID only; face recognition stays an opt-in module the customer enables under their own license. De-risks launch |
| C. Replace with a permissively-licensed face stack | $0 | high | Detector → YuNet (OpenCV Zoo, permissive) is easy; the recognition embedding is the hard part — few ArcFace weights exist on commercially-usable data |
| D. Self-train ArcFace on a commercial dataset | dataset licensing cost | very high (wks–mos + data) | Realistic commercial face datasets must be licensed or collected with consent; rarely worth it vs (A) |
Recommendation: A or B. Buy the InsightFace commercial license if face-ID is a headline feature; otherwise ship without it first (B) and add it under license later. Self-training (D) is the worst ROI.
3.3 Appearance re-ID: OSNet on MSMT17 (research data)
OSNet code (torchreid) is MIT, but our weights are trained on MSMT17, an academic ReID dataset. Cross-camera re-ID is core to Iris, so this needs a clean embedding.
| Option | License cost | Eng. effort | Notes |
|---|---|---|---|
| A. Generic appearance embedding ⭐ | $0 | ~moderate | Use a self-supervised backbone with a permissive license — DINOv2 (Apache-2.0) or an OpenCLIP image encoder — as the clothing/appearance feature instead of MSMT17-trained OSNet. No ReID-dataset provenance |
| B. Retrain OSNet on commercial data | dataset cost | high | Same data-licensing problem as faces |
| C. Enterprise add-on | $0 | low | Like 3.2-B: ship without cross-camera appearance re-ID at first |
Recommendation: A. Use DINOv2 (Apache-2.0) or OpenCLIP (MIT) appearance features. This removes the dataset-provenance risk; it needs re-tuning of the match thresholds in `app/reid/`.
Status (1.56.0): wired, tuning-gated. `reid_appearance_backend="dinov2"` (forced by `model_license_clean`) loads an Apache-2.0 DINOv2 backbone in place of OSNet. The appearance embedding dimension is decoupled from the 512-d face gallery (`reid_app_embedding_dim`, default 384 for DINOv2-small) across serialization, the appearance store, and the maintenance centroid; stale-dim exemplars from a backend switch are quarantined and decay out. `download_models.py --license-clean` provisions the DINOv2 ONNX. Remaining before production trust: the `reid_app_*` cosine thresholds are OSNet-tuned, so re-tune them on real footage (DINOv2 similarities distribute differently) before enabling a DINOv2 build on live cameras.
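The two mechanics mentioned in the status note, quarantining exemplars of the wrong dimension and re-tuning a cosine threshold, can be sketched as follows. This is not the code in `app/reid/`: the function names, the pair-labelling workflow, and the false-match target are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors with the same length."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions differ")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def split_by_dim(
    exemplars: Sequence[Sequence[float]], expected_dim: int
) -> Tuple[List[Sequence[float]], List[Sequence[float]]]:
    """Partition exemplars into (usable, quarantined) by embedding length."""
    usable = [e for e in exemplars if len(e) == expected_dim]
    quarantined = [e for e in exemplars if len(e) != expected_dim]
    return usable, quarantined


def pick_threshold(
    same_scores: Sequence[float],
    diff_scores: Sequence[float],
    max_false_match_rate: float = 0.01,
) -> float:
    """Lowest observed score t such that 'match if score >= t' keeps the
    false-match rate on different-identity pairs within the target."""
    if not diff_scores:
        raise ValueError("need at least one different-identity pair")
    for t in sorted(set(same_scores) | set(diff_scores)):
        false_matches = sum(1 for s in diff_scores if s >= t)
        if false_matches / len(diff_scores) <= max_false_match_rate:
            return t
    raise ValueError("no observed score meets the false-match target")
```

In practice the scores would come from pairs of crops labelled on real footage, computed with the DINOv2 embeddings, and the resulting value would be compared against the current OSNet-tuned setting before any change is made.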
4. Can we just train our own models? (summary)
- Detector: don't need to — Apache-2.0 COCO-pretrained weights already exist (RTMDet/RT-DETR). Train only for custom classes.
- Face recognition: training is the expensive path — the blocker is data licensing, not compute. Buying the InsightFace commercial license is almost always cheaper than sourcing a clean face dataset.
- Re-ID: swapping to a self-supervised Apache backbone (DINOv2) avoids training and the dataset problem.
So: swap models (free) for the detector and re-ID; license-or-defer for face recognition. No mandatory recurring fees if face-ID is deferred.
5. ffmpeg & GPU notes (on-prem only)
- ffmpeg: the Ubuntu `apt` build is GPL (`--enable-gpl`, bundles x264/x265). Iris encodes via NVENC (`h264_nvenc`), not x264, so for SaaS there is no distribution and no obligation. For on-prem shipping, rebuild ffmpeg LGPL (drop `--enable-gpl`, no x264/x265) or document GPL compliance.
- CUDA / cuDNN / NVENC: redistributable under the NVIDIA EULA for deployment; keep the NVIDIA notice in the image and product docs.
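Whether a given ffmpeg binary is a GPL build can be read from the `configuration:` line that `ffmpeg -version` prints. The sketch below classifies that output; the function names are hypothetical, and it only looks at the `--enable-gpl` and `--enable-nonfree` configure flags, so it is a quick check for an image build rather than a full license review.

```python
import subprocess


def ffmpeg_build_license(version_output: str) -> str:
    """Classify an ffmpeg build from the text printed by `ffmpeg -version`."""
    flags = set()
    for line in version_output.splitlines():
        line = line.strip()
        if line.startswith("configuration:"):
            flags.update(line[len("configuration:"):].split())
    if "--enable-nonfree" in flags:
        return "nonfree"  # such a binary is not redistributable
    if "--enable-gpl" in flags:
        return "GPL"
    return "LGPL"


def read_ffmpeg_version(binary: str = "ffmpeg") -> str:
    """Run the binary and return its version banner (requires ffmpeg on PATH)."""
    result = subprocess.run(
        [binary, "-version"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

An on-prem image build could call `ffmpeg_build_license(read_ffmpeg_version())` and refuse to publish unless the result is `"LGPL"` or GPL compliance has been documented.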
6. Recommended commercialization path
- Detector → RTMDet/RT-DETR (Apache-2.0). Free, removes AGPL. (do first) ✅ Done (1.55.0): RT-DETR backend + `model_license_clean` edition + provisioning.
- Re-ID → DINOv2 (Apache-2.0) or OpenCLIP (MIT) appearance features. Free, removes the MSMT17 data risk. 🟡 Wired (1.56.0): DINOv2 backend + dim-decoupling + provisioning; thresholds still need real-footage tuning before production.
- Face recognition → decide: buy the InsightFace commercial license (keep accuracy) or ship v1 without face-ID (B) and add it later.
- On-prem only: LGPL ffmpeg build; carry the NVIDIA EULA notice.
- Keep STT/translate/VAD as-is: all already commercial-clean.
After steps 1–2 (free, ~1–2 weeks of engineering) Iris is sellable as a SaaS with no per-year model fees, with face recognition as the only license-or-defer decision left.
Pricing figures are indicative and were not publicly confirmed for 2026 — treat Ultralytics and InsightFace numbers as "contact sales", and re-verify before budgeting.