As teams increasingly rely on dynamic configurations for AI systems — from model hyperparameters to prompt templates — controlling and validating these changes before rollout becomes critical.
At Statsig, we’ve heard from customers who wanted to automate this process:
“We’d like to run our custom benchmark tests automatically whenever a config changes, and block rollout if it fails.”
This is a perfect example of why we built Release Pipelines and recently expanded our Webhook + Console API capabilities. Together, they allow teams to integrate Statsig directly into their CI/CD and validation workflows — ensuring only safe, validated configurations reach production.
Here’s how you can automate benchmark validation with Statsig’s new webhook event, “Release Pipeline Waiting for Review.”

Start by creating a Release Pipeline in Statsig for your AI Config (for example, your prompt or model configuration).
Define your rollout phases — such as:
- Phase 1: Dev (10%)
- Phase 2: Staging (50%)
- Phase 3: Production (100%)
You can require manual approval before advancing between phases — which we’ll automate next.
📘 Learn more about Release Pipelines → Release Pipeline Overview
Go to Project Settings → Integrations → Webhook → Event Filtering.
Under Configuration Changes → Action Types, enable “Release Pipeline Waiting For Review.”
This webhook will fire whenever a rollout phase is awaiting approval.
The payload includes metadata like:
```json
{
  "event": "ReleasePipelineWaitingForReview",
  "releasePipelineMetadata": {
    "releasePipelineID": "rp_123",
    "phaseID": "phase_2",
    "gateID": "g_456",
    "triggerID": "t_789"
  }
}
```
You can use this metadata to trigger your CI/CD workflow — for example, running a custom benchmark test suite on your updated model config.
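For instance, here is a minimal sketch of a webhook receiver, assuming a small Flask service that forwards the event to a GitHub Actions workflow via workflow_dispatch. The repo path, workflow name (benchmark.yml), and input names are placeholders for your own setup, not part of Statsig’s API.

```python
import os

import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]
# Placeholder repo/workflow: point this at your own benchmark workflow.
DISPATCH_URL = (
    "https://api.github.com/repos/your-org/your-repo/"
    "actions/workflows/benchmark.yml/dispatches"
)


def trigger_benchmark_run(meta: dict) -> None:
    """Kick off the benchmark workflow, forwarding the pipeline identifiers
    so the CI job can approve or halt the right phase afterwards."""
    resp = requests.post(
        DISPATCH_URL,
        headers={"Authorization": f"Bearer {GITHUB_TOKEN}"},
        json={
            "ref": "main",
            "inputs": {
                "release_pipeline_id": meta.get("releasePipelineID", ""),
                "phase_id": meta.get("phaseID", ""),
            },
        },
        timeout=10,
    )
    resp.raise_for_status()


@app.route("/statsig-webhook", methods=["POST"])
def handle_statsig_webhook():
    payload = request.get_json(force=True)
    # Only react to the release-pipeline review event; ignore other events.
    if payload.get("event") != "ReleasePipelineWaitingForReview":
        return jsonify({"status": "ignored"}), 200
    trigger_benchmark_run(payload.get("releasePipelineMetadata", {}))
    return jsonify({"status": "benchmarks_started"}), 200


if __name__ == "__main__":
    app.run(port=8080)
```

The handler just hands off the identifiers and returns immediately, leaving the long-running benchmark work to the CI job so the webhook call never times out.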
When your webhook fires, your CI/CD system (e.g., GitHub Actions, Jenkins, or an internal testing service) can automatically:
- Pull the latest config from Statsig.
- Run your internal benchmarks, such as prompt quality evaluation, latency checks, or regression testing.
- Evaluate results against your acceptance criteria.
- If benchmarks pass, proceed to the next phase.
- If benchmarks fail, block rollout or trigger a rollback.
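As a deliberately simplified sketch of that CI step, the script below fetches the config from the Console API and fails the job when a benchmark score falls below a threshold. The config name, environment variable, threshold, and run_benchmarks() stub are all stand-ins for your own suite; confirm the exact Console API path for your config type against the Statsig docs.

```python
import os
import sys

import requests

CONSOLE_API_KEY = os.environ["STATSIG_CONSOLE_API_KEY"]  # your Console API key
CONFIG_ID = "my_prompt_config"  # placeholder config name
PASS_THRESHOLD = 0.90           # placeholder acceptance criterion


def fetch_latest_config() -> dict:
    """Pull the latest config values so benchmarks run against exactly what
    is about to roll out. Confirm the endpoint path in the Statsig docs."""
    resp = requests.get(
        f"https://statsigapi.net/console/v1/dynamic_configs/{CONFIG_ID}",
        headers={"STATSIG-API-KEY": CONSOLE_API_KEY},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()


def run_benchmarks(config: dict) -> float:
    """Placeholder: swap in your prompt-quality evals, latency checks, or
    regression tests. Returns an aggregate score in [0, 1]."""
    return 1.0  # dummy passing score


if __name__ == "__main__":
    score = run_benchmarks(fetch_latest_config())
    print(f"benchmark score: {score:.3f}")
    # A non-zero exit code fails the CI job, which the next step maps to
    # halting the rollout instead of approving it.
    sys.exit(0 if score >= PASS_THRESHOLD else 1)
```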
Statsig’s Console API (CAPI) lets you programmatically approve or halt rollout phases based on your benchmark results.
✅ To advance rollout: call the Approve Phase API endpoint using the releasePipelineMetadata payload.
❌ To block rollout or roll back: use the Kill Switch API to stop rollout for a specific region (e.g. SouthAmerica) or environment (e.g. Staging).
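A rough sketch of that decision step might look like the following. The two endpoint paths are placeholders only: substitute the actual Approve Phase and Kill Switch endpoints from the Console API reference, and pass in the IDs captured from the webhook payload.

```python
import os

import requests

CONSOLE_API_KEY = os.environ["STATSIG_CONSOLE_API_KEY"]
HEADERS = {"STATSIG-API-KEY": CONSOLE_API_KEY}
BASE = "https://statsigapi.net/console/v1"


def approve_phase(pipeline_id: str, phase_id: str) -> None:
    """Advance the waiting phase once benchmarks pass.
    Placeholder path: replace with the Approve Phase endpoint from the docs."""
    requests.post(
        f"{BASE}/release_pipelines/{pipeline_id}/phases/{phase_id}/approve",
        headers=HEADERS,
        timeout=10,
    ).raise_for_status()


def halt_rollout(config_id: str) -> None:
    """Stop the rollout when benchmarks fail.
    Placeholder path: replace with the Kill Switch endpoint from the docs."""
    requests.post(
        f"{BASE}/configs/{config_id}/kill",
        headers=HEADERS,
        timeout=10,
    ).raise_for_status()


if __name__ == "__main__":
    # IDs forwarded from the webhook payload via the CI job's inputs.
    pipeline_id = os.environ["RELEASE_PIPELINE_ID"]
    phase_id = os.environ["PHASE_ID"]
    benchmarks_passed = os.environ.get("BENCHMARKS_PASSED") == "true"

    if benchmarks_passed:
        approve_phase(pipeline_id, phase_id)
    else:
        halt_rollout("my_prompt_config")  # placeholder config name
```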
This closes the loop — enabling an end-to-end automated validation process governed by your own benchmark logic, but powered by Statsig’s feature and config rollout engine.
1. You push a new prompt config to Statsig.
2. Statsig triggers a webhook: “Release Pipeline Waiting for Review.”
3. Your CI workflow starts automatically: it runs benchmark tests and sends results back to Statsig via the Console API.
4. If results pass → rollout advances to the next phase.
5. If results fail → rollout halts or rolls back.
The result: fully automated, test-gated configuration rollouts.

This integration allows you to:
- Automate quality gates for AI and ML configurations.
- Enforce CI validation before any rollout proceeds.
- Protect production environments from bad configs or regressions.
- Accelerate deployment velocity while maintaining control and trust.
By combining Statsig’s Release Pipelines, Webhooks, and Console API, you can treat configuration changes like code — continuously tested, validated, and safely deployed.
We’re continuing to expand automation and CI/CD integrations for config and feature rollouts.
Future enhancements include:
- Direct integrations with popular CI tools (e.g., GitHub Actions, CircleCI)
- More granular approval APIs (targeting specific phases)
- Enhanced observability for automated approvals and test results
If you’re exploring automated gating for your AI Configs or CI/CD workflows, we’d love to hear from you — reach out in Statsig Community.
Try it yourself:
Enable Release Pipeline Waiting for Review webhooks in your Statsig project today, and see how easily you can add automated benchmark validation to your rollouts.