Mechanics vs discipline
Your feature flag platform can safely ship variant B to 10% of players and roll it back if metrics drop. What it can't do is tell you which variant to build in the first place, or what the result actually means for your roadmap.
Experimentation and feature flag platforms do their job well: targeting segments, gating exposure, rolling back safely. Iridae doesn't replace any of that.
The hard part is everything around the rollout: what to test, which tradeoffs you accept, what stop rules you enforce, who owns the call, and what changes after you see the result.
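One way to make stop rules concrete before a test runs is to write them down as data rather than leave them to debate mid-rollout. A minimal sketch, assuming illustrative metrics and thresholds (none of these names are Iridae's or any flag platform's API):

```python
# Hypothetical sketch: pre-registering guardrails and stop rules as data,
# so the call is agreed before the test runs instead of argued under pressure.
# Metric names and thresholds below are illustrative.
from dataclasses import dataclass

@dataclass
class StopRule:
    metric: str        # guardrail metric to watch
    threshold: float   # worst acceptable value
    direction: str     # "max" = stop if metric exceeds threshold,
                       # "min" = stop if metric falls below it

    def triggered(self, observed: float) -> bool:
        if self.direction == "max":
            return observed > self.threshold
        return observed < self.threshold

rules = [
    StopRule("crash_rate", 0.02, "max"),    # stop if crashes exceed 2%
    StopRule("d1_retention", 0.35, "min"),  # stop if day-1 retention dips below 35%
]

observed = {"crash_rate": 0.031, "d1_retention": 0.41}
tripped = [r.metric for r in rules if r.triggered(observed[r.metric])]
print(tripped)
```

With the rules written down, "roll back if metrics drop" stops being a judgment call made live and becomes a check anyone on the team can run.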
Why results stall
- Too many candidate tests, no shared prioritization.
- Guardrails and stop rules stay vague; debates repeat under pressure.
- Readouts exist, but ownership and decision rights are unclear.
- Local wins ship but never become reusable playbooks.
What Iridae adds
- Rank hypotheses by player impact, business leverage, and confidence.
- Define success criteria, guardrails, stop rules, and a decision owner before the test runs.
- Coordinate rollout tasks across product, live ops, engineering, and community.
- Interpret outcomes into a ship, iterate, or rollback call, with the rationale captured.
- Route follow-through into the roadmap so the next cycle compounds.
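The first step above, ranking hypotheses by impact, leverage, and confidence, can be sketched as a simple multiplicative score (similar in spirit to ICE scoring). The hypotheses and numbers here are invented for illustration; this is not Iridae's scoring model:

```python
# Hypothetical sketch: rank candidate tests by player impact, business
# leverage, and confidence. All entries and weights are illustrative.
hypotheses = [
    {"name": "shorter FTUE", "impact": 8, "leverage": 6, "confidence": 0.7},
    {"name": "daily login streaks", "impact": 5, "leverage": 9, "confidence": 0.5},
    {"name": "new store layout", "impact": 4, "leverage": 4, "confidence": 0.9},
]

def score(h: dict) -> float:
    # Multiplicative so that a low confidence drags down an otherwise big bet.
    return h["impact"] * h["leverage"] * h["confidence"]

ranked = sorted(hypotheses, key=score, reverse=True)
for h in ranked:
    print(f'{h["name"]}: {score(h):.1f}')
```

Even a crude shared score like this gives a team one list to argue about instead of three competing backlogs.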
Your rollout tools run the mechanics. Iridae runs the decision loop around them.
Comparison at a glance
| Dimension | Experimentation + feature flags | Iridae |
|---|---|---|
| Core question | "Did this change work?" (measurement) | "What should we test, and what changes because of it?" (decisions) |
| Typical outputs | Experiments, flags, rollouts, readouts | Prioritized hypotheses, stop rules, approved actions, traced outcomes |
| Best for | Safe rollouts and proving impact | Compounding learning across teams and cycles |
| Together | Test stack runs the mechanics | Iridae runs prioritization, decisions, and post-test follow-through |