Iris: Licensing & Commercialization Audit

Status: Iris is a private, soon-to-be-commercial product. This document inventories every third-party component, flags what blocks commercial use, and lays out the options (replace / license / self-train) with cost and effort.

Bottom line: Iris's own code is MIT (ours). Only three dependencies block commercial sale today: the object detector (YOLOv8, AGPL-3.0), the face models (InsightFace buffalo_l, non-commercial), and the re-ID weights (OSNet trained on MSMT17, research-only data). Everything else (STT, translation, both VAD engines, the web stack, FAISS, ONNX Runtime) is already commercial-clean; ffmpeg and the NVIDIA runtime are conditional for on-prem only (§5). The detector and re-ID blockers have free Apache-2.0 replacements. Face recognition has no drop-in permissive replacement, so it is a license-or-defer decision (§3.2). Nothing requires paying anyone if you swap the first two and defer face-ID.

1. Deployment model decides which licenses bite

This matters before any line-item:

Model	What it means	License impact
SaaS / hosted	Customers use Iris over the network; you never ship them the software	GPL (ffmpeg) does not trigger (no distribution). AGPL still bites (its §13 network clause covers remote interaction). Non-commercial model/data licenses still bite (you're using them commercially regardless).
On-prem / shipped (Docker image to the customer)	Customer runs the container	GPL and AGPL bite, plus you redistribute ffmpeg/CUDA/NVENC → their redistribution terms apply.

Either way, the three blockers below must be resolved. On-prem additionally needs an LGPL ffmpeg build or documented GPL compliance (§5).
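The table above can be read as a small decision rule. The sketch below encodes it so a build script or review checklist could query it; the enum names and the `license_bites` function are hypothetical, not part of Iris, and the grouping of licenses into five kinds is a simplifying assumption rather than legal advice.

```python
from enum import Enum


class Deployment(Enum):
    SAAS = "saas"        # customers reach Iris over the network only
    ON_PREM = "on_prem"  # a Docker image is shipped to the customer


class LicenseKind(Enum):
    PERMISSIVE = "permissive"                  # MIT, BSD, Apache-2.0
    GPL = "gpl"                                # copyleft triggered by distribution
    AGPL = "agpl"                              # copyleft also triggered by network use
    NON_COMMERCIAL = "non_commercial"          # model/data licenses barring commercial use
    REDISTRIBUTION_EULA = "redistribution_eula"  # e.g. vendor runtime redistribution terms


def license_bites(kind: LicenseKind, deployment: Deployment) -> bool:
    """Return True if this license kind needs action under the given deployment model."""
    if kind is LicenseKind.PERMISSIVE:
        return False
    if kind in (LicenseKind.AGPL, LicenseKind.NON_COMMERCIAL):
        # These apply to hosted and shipped deployments alike.
        return True
    # GPL and redistribution terms only matter once software is shipped.
    return deployment is Deployment.ON_PREM
```

For example, `license_bites(LicenseKind.GPL, Deployment.SAAS)` is false while `license_bites(LicenseKind.AGPL, Deployment.SAAS)` is true, which is why the detector blocks even a hosted-only launch.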

2. Full component inventory

✅ = commercial-OK · ⛔ = blocks commercial use · ⚠️ = conditional

Component	Role in Iris	License	Commercial?
Iris application code	everything we wrote	MIT (ours)	✅ relicense freely
Ultralytics YOLOv8n	person/object detector	AGPL-3.0	⛔ blocker
InsightFace — code	SCRFD + ArcFace runtime	MIT	✅
InsightFace — `buffalo_l` weights	face detect + 512-d embedding	non-commercial research	⛔ blocker
OSNet `osnet_ain_x1_0_msmt17`	appearance re-ID embedding	code MIT, weights trained on MSMT17 (research-only)	⛔ blocker (weights)
faster-whisper + CTranslate2	speech-to-text	MIT	✅
OpenAI Whisper model	STT weights	MIT	✅
EuroLLM-9B	translation LLM	Apache-2.0	✅
ollama	LLM server	MIT	✅
TEN VAD	live voice gate	Apache-2.0	✅
Silero VAD	fallback VAD	MIT	✅
yt-dlp	web-link source resolver	Unlicense (public domain)	✅
mediamtx	RTSP fan-out sidecar	MIT	✅
ONNX Runtime (GPU)	inference	MIT	✅
FAISS	vector index	MIT	✅
OpenCV (headless)	frame I/O	Apache-2.0	✅
NumPy / Pillow / SQLAlchemy / FastAPI / uvicorn / pydantic / Starlette	web + utils	BSD / MIT	✅
hls.js	browser HLS player	Apache-2.0	✅
ffmpeg	decode/encode/segment	LGPL-2.1+ core; GPL if built `--enable-gpl`	⚠️ see §5
CUDA / cuDNN / NVENC / NVDEC	GPU runtime	NVIDIA proprietary EULA (redistributable for deployment)	⚠️ keep the EULA notice

3. The three blockers: options, cost, effort

3.1 Object detector: YOLOv8 (AGPL-3.0)

YOLOv8 is the easiest to fix: the detection task is commoditised and there are several Apache-2.0 detectors with COCO-pretrained weights that ship person/vehicle classes out of the box.

Option	License cost	Eng. effort	Notes
A. Buy Ultralytics Enterprise license	Contact sales (custom pricing). One publicly reported quote was ≈ $5k/yr (older, unverified); budget for low-to-mid five figures per year, billed as a subscription or a one-time fee	~0 (keep current model)	Keeps YOLOv8 weights/accuracy; recurring cost; vendor lock-in
B. Swap to an Apache-2.0 detector ⭐	$0	~days	Re-export to ONNX, adapt the pre/post-process in `app/detect/`. Candidates: RTMDet (OpenMMLab), RT-DETR (Baidu original), RF-DETR (Roboflow), YOLOX, D-FINE — all Apache-2.0, COCO-pretrained, real-time. Avoid YOLO-NAS — its weights carry a restrictive custom license.
C. Train our own	$0 (COCO annotations CC-BY 4.0)	~1–2 wks	Only worth it if we need custom classes; otherwise (B)'s pretrained weights are enough

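Recommendation: B. Swap to RTMDet or RT-DETR (Apache-2.0). It is free, removes the AGPL obligation entirely, and should match or beat YOLOv8n on COCO accuracy. RT-DETR in particular is a larger model than YOLOv8n, so confirm per-camera throughput on the target GPU before switching.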
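The post-process change in option B is mostly the removal of NMS. As a rough illustration, the sketch below decodes DETR-style outputs under the assumption that the exported model yields per-query class logits and boxes as normalised (cx, cy, w, h); actual ONNX exports differ in output layout, so the function name, argument shapes and the per-query argmax (instead of a global top-k) are assumptions, not the code in `app/detect/`.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_detections(logits, boxes, img_w, img_h, score_thr=0.5):
    """Turn per-query outputs into (class_id, score, (x1, y1, x2, y2)) tuples.

    logits: one list of class logits per query (assumed layout).
    boxes:  one (cx, cy, w, h) tuple per query, normalised to [0, 1] (assumed layout).
    No NMS step: each query is treated as at most one object.
    """
    detections = []
    for query_logits, (cx, cy, w, h) in zip(logits, boxes):
        # Best class for this query, scored with a sigmoid rather than a softmax.
        class_id = max(range(len(query_logits)), key=query_logits.__getitem__)
        score = sigmoid(query_logits[class_id])
        if score < score_thr:
            continue
        # Centre/size in normalised units -> corner coordinates in pixels.
        x1 = (cx - w / 2) * img_w
        y1 = (cy - h / 2) * img_h
        x2 = (cx + w / 2) * img_w
        y2 = (cy + h / 2) * img_h
        detections.append((class_id, score, (x1, y1, x2, y2)))
    return detections
```

Compared with a YOLOv8 head, the confidence threshold is the only filtering step, so downstream code that relied on an IoU threshold setting would need that setting ignored or hidden for this backend.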

Status (1.55.0): resolved in code. The Apache-2.0 RT-DETR backend (`app/detect/rtdetr_onnx.py`, NMS-free) ships and is selectable per-camera or globally. The `model_license_clean` edition switch forces it in place of YOLOv8n, `scripts/download_models.py --license-clean` provisions only the Apache detector, and the settings API accepts `detector_backend="rtdetr"`. The AGPL blocker is lifted for any build that enables the clean edition and drops a `*rtdetr*.onnx` in place of `yolov8n.onnx`.
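One way to picture how an edition switch can override a requested backend is sketched below. Only the setting names `model_license_clean` and `detector_backend` and the value `"rtdetr"` come from this document; the `DetectorSettings` class, the `"yolov8"` default value and `resolve_detector_backend` are hypothetical stand-ins for whatever the settings layer actually does.

```python
from dataclasses import dataclass

# Backends whose weights and code are permissively licensed (assumed set).
CLEAN_DETECTOR_BACKENDS = {"rtdetr"}


@dataclass(frozen=True)
class DetectorSettings:
    detector_backend: str = "yolov8"   # hypothetical default value
    model_license_clean: bool = False  # edition switch named in the status note


def resolve_detector_backend(settings: DetectorSettings) -> str:
    """Return the backend to load, letting the clean edition override the request."""
    if settings.model_license_clean and settings.detector_backend not in CLEAN_DETECTOR_BACKENDS:
        # The clean edition never loads an AGPL detector, whatever was requested.
        return "rtdetr"
    return settings.detector_backend
```

Resolving the backend in one place, rather than checking the edition flag at each call site, keeps a per-camera override from reintroducing the AGPL model by accident.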

3.2 Face recognition: InsightFace `buffalo_l` (non-commercial)

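The InsightFace code is MIT (fine to keep); the pretrained packs are the problem. They are released for non-commercial research use only and were trained on web-scraped face datasets (the InsightFace model zoo lists WebFace600K for `buffalo_l`; other packs use MS1M or Glint360K). This is the hardest blocker because commercially licensed face data is scarce.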

Option	License cost	Eng. effort	Notes
A. Buy InsightFace commercial model license ⭐ (fastest)	Contact insightface.ai — not public; expect 4–5 figures per deployment/model	~0	Keeps current SCRFD+ArcFace accuracy; cleanest legal path that preserves the feature
B. Defer face-ID to an "enterprise add-on"	$0	low	Ship v1 commercially with detection + appearance re-ID only; face recognition stays an opt-in module the customer enables under their own license. De-risks launch
C. Replace with a permissively-licensed face stack	$0	high	Detector → YuNet (OpenCV Zoo, permissive) is easy; the recognition embedding is the hard part — few ArcFace weights exist on commercially-usable data
D. Self-train ArcFace on a commercial dataset	dataset licensing cost	very high (wks–mos + data)	Realistic commercial face datasets must be licensed or collected with consent; rarely worth it vs (A)

Recommendation: A or B. Buy the InsightFace commercial license if face-ID is a headline feature; otherwise ship without it first (B) and add it under license later. Self-training (D) is the worst ROI.

3.3 Appearance re-ID: OSNet on MSMT17 (research data)

OSNet code (torchreid) is MIT, but our weights are trained on MSMT17, an academic ReID dataset. Cross-camera re-ID is core to Iris, so this needs a clean embedding.

Option	License cost	Eng. effort	Notes
A. Generic appearance embedding ⭐	$0	moderate	Use a general-purpose pretrained backbone with a permissive license, such as DINOv2 (self-supervised, Apache-2.0) or an OpenCLIP image encoder (check the license of the chosen checkpoint), as the clothing/appearance feature instead of MSMT17-trained OSNet. No ReID-dataset provenance to clear
B. Retrain OSNet on commercial data	dataset cost	high	Same data-licensing problem as faces
C. Enterprise add-on	$0	low	Like 3.2-B: ship without cross-camera appearance re-ID at first

Recommendation: A. Use DINOv2 (Apache-2.0) or OpenCLIP appearance features. This removes the dataset-provenance risk but needs re-tuning of the match thresholds in `app/reid/`.
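Re-tuning a match threshold usually comes down to picking a cut-off from labelled similarity scores. The sketch below assumes two lists of cosine similarities collected from real footage, one for same-person pairs and one for different-person pairs, and chooses the lowest threshold that keeps false matches under a target rate. The function names and the 1% default are illustrative assumptions, not the procedure used in `app/reid/`.

```python
def pick_match_threshold(impostor_sims, max_false_match_rate=0.01):
    """Lowest threshold t such that at most the target share of
    different-identity pairs score strictly above t."""
    if not impostor_sims:
        raise ValueError("need at least one different-identity similarity")
    ordered = sorted(impostor_sims, reverse=True)
    # Number of different-identity pairs we tolerate passing the threshold.
    allowed = int(max_false_match_rate * len(ordered))
    if allowed >= len(ordered):
        return float("-inf")  # every pair is allowed to match
    return ordered[allowed]


def match_rate(sims, threshold):
    """Share of pairs whose similarity is strictly above the threshold."""
    if not sims:
        raise ValueError("need at least one similarity")
    return sum(1 for s in sims if s > threshold) / len(sims)
```

Calling `match_rate` on the same-person list at the chosen threshold shows how many true matches survive, which is the number to compare between the OSNet and DINOv2 backends before changing any `reid_app_*` default.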

Status (1.56.0): wired, tuning-gated. `reid_appearance_backend="dinov2"` (forced by `model_license_clean`) loads an Apache-2.0 DINOv2 backbone in place of OSNet. The appearance embedding dimension is decoupled from the 512-d face gallery (`reid_app_embedding_dim`, default 384 for DINOv2-small) across serialization, the appearance store, and the maintenance centroid; stale-dim exemplars from a backend switch are quarantined and decay out. `download_models.py --license-clean` provisions the DINOv2 ONNX. Remaining before production trust: the `reid_app_*` cosine thresholds are OSNet-tuned. Re-tune them on real footage (DINOv2 similarities distribute differently) before enabling a DINOv2 build on live cameras.
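The quarantine step described above can be pictured as a partition by vector length. The sketch below is a minimal illustration with a hypothetical `Exemplar` record and `split_by_dim` helper; the real appearance store, its serialization and its decay policy are not shown here and may work differently.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Exemplar:
    track_id: str           # hypothetical identifier for the tracked person
    embedding: List[float]  # appearance vector produced by some backend


def split_by_dim(exemplars: List[Exemplar], expected_dim: int) -> Tuple[List[Exemplar], List[Exemplar]]:
    """Separate exemplars usable by the current backend from stale ones.

    Exemplars whose vector length differs from expected_dim cannot be
    compared by cosine similarity, so they are set aside, not matched.
    """
    active, quarantined = [], []
    for exemplar in exemplars:
        if len(exemplar.embedding) == expected_dim:
            active.append(exemplar)
        else:
            quarantined.append(exemplar)
    return active, quarantined
```

With `expected_dim=384`, a 512-length vector left over from an earlier backend lands in the quarantined list, so it never reaches a similarity computation where mismatched lengths would fail or give meaningless scores.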

4. Can we just train our own models? (summary)

Detector: don't need to — Apache-2.0 COCO-pretrained weights already exist (RTMDet/RT-DETR). Train only for custom classes.
Face recognition: training is the expensive path — the blocker is data licensing, not compute. Buying the InsightFace commercial license is almost always cheaper than sourcing a clean face dataset.
Re-ID: swapping to a self-supervised Apache backbone (DINOv2) avoids training and the dataset problem.

So: swap models (free) for the detector and re-ID; license-or-defer for face recognition. No mandatory recurring fees if face-ID is deferred.

5. ffmpeg & GPU notes (on-prem only)

ffmpeg: the Ubuntu apt build is GPL (`--enable-gpl`, bundles x264/x265). For SaaS nothing is distributed, so the GPL imposes no obligation. For on-prem shipping, rebuild ffmpeg as LGPL (drop `--enable-gpl`, no x264/x265) or document GPL compliance. Iris encodes via NVENC (`h264_nvenc`), not x264, so dropping the GPL encoders should not cost it anything it uses.
CUDA / cuDNN / NVENC: redistributable under the NVIDIA EULA for deployment; keep the NVIDIA notice in the image and product docs.
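For the ffmpeg point, an image build could check which flavour of ffmpeg it actually contains. The sketch below parses the `configuration:` line that `ffmpeg -version` prints and classifies the build by its configure flags; the function names are hypothetical, and the classification is a coarse first check under the assumption that those flags reflect the binary, not a substitute for reviewing the linked libraries.

```python
import subprocess


def ffmpeg_config_flags(version_output: str) -> set:
    """Extract configure flags from the text printed by `ffmpeg -version`."""
    for line in version_output.splitlines():
        line = line.strip()
        if line.startswith("configuration:"):
            return set(line[len("configuration:"):].split())
    return set()


def license_flavour(flags: set) -> str:
    """Classify a build from its configure flags (coarse check only)."""
    if "--enable-nonfree" in flags:
        return "nonfree"  # such builds are not redistributable
    if "--enable-gpl" in flags:
        return "GPL"
    return "LGPL"


def installed_ffmpeg_flavour(binary: str = "ffmpeg") -> str:
    """Run the local ffmpeg binary and classify it."""
    result = subprocess.run([binary, "-version"], capture_output=True, text=True, check=True)
    return license_flavour(ffmpeg_config_flags(result.stdout))
```

An on-prem image build could fail when `installed_ffmpeg_flavour()` returns anything other than `"LGPL"`, unless GPL compliance has been documented for that release. Running `ffmpeg -L` prints the license text of the binary and is a useful manual cross-check.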

6. Recommended commercialization path

1. Detector → RTMDet/RT-DETR (Apache-2.0). Free, removes AGPL. Do this first. ✅ Done (1.55.0): RT-DETR backend + `model_license_clean` edition + provisioning.
2. Re-ID → DINOv2 (Apache-2.0) or OpenCLIP appearance features. Free, removes the MSMT17 data risk. 🟡 Wired (1.56.0): DINOv2 backend + dim-decoupling + provisioning; thresholds still need real-footage tuning before production.
3. Face recognition → decide: buy the InsightFace commercial license (keep accuracy) or ship v1 without face-ID (B) and add it later.
4. On-prem only: LGPL ffmpeg build; carry the NVIDIA EULA notice.
5. Keep STT/translate/VAD as-is; all are already commercial-clean.

After steps 1–2 (free, ~1–2 weeks of engineering) Iris is sellable as a SaaS with no per-year model fees, with face recognition as the only license-or-defer decision left.

Pricing figures are indicative and were not publicly confirmed for 2026 — treat Ultralytics and InsightFace numbers as "contact sales", and re-verify before budgeting.