When a metric moves or an alert fires, teams need to understand what changed, why it mattered, and what to do next. As products get more complex, that learning step takes longer—even though shipping doesn’t.
Statsig already connects features, experiments, and metrics. But those objects don’t show how the product actually produces those outcomes. That understanding lives in code and in the mental models engineers build over time.
The Knowledge Graph makes that understanding explicit, so teams—and the AI systems they use—can learn what matters and act faster.
The gap shows up immediately when context is missing.
Imagine joining a team on call and getting paged with an alert that reads “Hemingway Kirby Tail P99 Latency.” If you’re new, you might reasonably wonder why your company is tracking the literary output of Ernest Hemingway or when Kirby acquired a tail. A senior engineer, meanwhile, already knows those names refer to parts of a streaming pipeline that absolutely must not melt.
Veteran intuition fills in the blanks. New teammates—and automated agents—don’t get that luxury. Without context, the alert is technically accurate but practically useless, and every minute spent deciphering it is a minute not spent fixing what users are experiencing.
The same breakdown happens in product work. A PM asks, “Activation dropped 6% after last week’s release. What changed?” The metric definition lives in a dashboard, but the reality lives in code. Which onboarding steps feed the activation definition? Which endpoints do they hit? Which feature gates changed eligibility? Which experiment variant altered the flow? Which downstream services added latency or errors that caused drop-off?
Without a concrete map from outcomes back to the system, teams default to grepping, guessing, and debating narratives. Learning slows because it’s unclear where to look or which hypothesis is even plausible. AI systems face the same limitation: they can surface correlations, but they don’t know which parts of the system actually matter.
Every experienced engineer relies on a mental model of their product. They know which services own onboarding, which flags gate risky paths, which metrics reflect real user value, and which components quietly hold everything together. That model is what makes debugging faster and experimentation interpretable.
The problem is that this model rarely exists in a shared, durable form. New hires don’t have it. Tools don’t have it. And as teams increasingly use AI systems to investigate issues, propose changes, or modify production code, those systems don’t have it either.
The Statsig Knowledge Graph is an attempt to make that model explicit and machine-readable.
It represents a product as a connected system. On one side are Statsig primitives—events, metrics, feature gates, and experiments. On the other are the components that implement them in the codebase—services, endpoints, handlers, workflows, domain concepts, and the logic that influences behavior. The graph records how these elements relate, where they live in code, and what role they play in producing outcomes.
Instead of treating metrics, experiments, and code as separate concerns, the graph ties them together into a structure that reflects how the product actually works.
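As a rough mental model (the node kinds, edge kinds, and field names below are illustrative, not Statsig's actual schema), the structure can be pictured as typed nodes and edges, each carrying the code locations that justify it:

```typescript
// Illustrative sketch of a product knowledge graph's data model.
// Node kinds, edge kinds, and field names are hypothetical, not
// Statsig's actual schema. Edges point from the thing doing the
// influencing to the thing it influences.

type NodeKind =
  // Statsig primitives
  | "event" | "metric" | "feature_gate" | "experiment"
  // Components found in the codebase
  | "service" | "endpoint" | "handler" | "workflow" | "domain_concept";

type EdgeKind =
  | "emits"    // handler/service -> event it logs
  | "feeds"    // event -> metric it contributes to
  | "gates"    // feature_gate -> code path it controls
  | "varies"   // experiment -> behavior its variants change
  | "affects"; // upstream dependency -> downstream component

interface CodeLocation {
  repo: string;
  path: string;
  line: number;
}

interface GraphNode {
  id: string;                 // e.g. "metric:activation"
  kind: NodeKind;
  name: string;
  locations: CodeLocation[];  // where this entity appears in code
}

interface GraphEdge {
  from: string;               // node id of the influencer
  to: string;                 // node id of the thing influenced
  kind: EdgeKind;
  evidence: CodeLocation[];   // code sites that justify the relationship
}

interface KnowledgeGraph {
  nodes: Map<string, GraphNode>;
  edges: GraphEdge[];
}
```

The important property is that every node and every edge points back to specific lines of code, so any answer the graph gives can be checked against the repository.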

When a customer connects their GitHub repository, Statsig runs an offline, multi-step analysis workflow designed to stay pragmatic and grounded in real usage.
The workflow starts by anchoring on Statsig entities—events, metrics, feature gates, and experiments. These represent the behaviors teams already care about and sharply constrain the search space. Rather than attempting full static analysis of a large repository, Statsig works outward from these anchors into the code paths that actually influence them.
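In sketch form, that anchoring step amounts to seeding the analysis with the entities already defined in Statsig and only queuing the code that references them. The helper below is a simplified illustration, not the real pipeline:

```typescript
// Hypothetical sketch of the anchoring step: start from known Statsig
// entities rather than analyzing the whole repository up front.

interface Anchor {
  kind: "event" | "metric" | "feature_gate" | "experiment";
  name: string;
}

// In practice these would come from the Statsig project; hard-coded here
// for illustration.
const anchors: Anchor[] = [
  { kind: "metric", name: "activation" },
  { kind: "feature_gate", name: "new_onboarding_flow" },
  { kind: "experiment", name: "onboarding_copy_test" },
];

// A naive reference search: only files that mention an anchor by name
// enter the traversal queue. Real extraction would need to be more robust
// (SDK call sites, config references, aliases), but the idea is the same:
// the anchors sharply constrain what has to be read.
function findCandidateFiles(
  files: Map<string, string>, // path -> file contents
  anchors: Anchor[]
): Map<string, Anchor[]> {
  const candidates = new Map<string, Anchor[]>();
  for (const [path, source] of files) {
    const hits = anchors.filter((a) => source.includes(a.name));
    if (hits.length > 0) candidates.set(path, hits);
  }
  return candidates;
}
```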
From there, the system traverses the surrounding code to understand how each entity behaves in practice. It examines where a gate is evaluated and which branches it controls, where events are emitted and with what properties, where metric inputs are produced, and where experiment variants influence logic or UI. The goal isn’t to catalog references, but to extract evidence that connects decisions in code to observable behavior.
This is what turns an alert like “Hemingway Kirby Tail” or a metric like “Activation” from a label into a traceable set of handlers, services, and dependencies.
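A minimal version of that evidence gathering might look for gate checks and event emissions and record exactly where they occur. The call patterns (`checkGate`, `logEvent`) and the record shape here are assumptions made for the sketch, not a description of the actual analysis:

```typescript
// Illustrative evidence extraction: find where a feature gate is checked
// and where events are logged, and record the exact code location.

interface Evidence {
  kind: "gate_check" | "event_emit";
  name: string;          // gate or event name found in code
  path: string;
  line: number;          // 1-based line number
  snippet: string;       // the matching line, kept for human review
}

function extractEvidence(path: string, source: string): Evidence[] {
  const gatePattern = /checkGate\(\s*["']([^"']+)["']/;
  const eventPattern = /logEvent\(\s*["']([^"']+)["']/;
  const evidence: Evidence[] = [];

  source.split("\n").forEach((text, i) => {
    const gate = text.match(gatePattern);
    if (gate) {
      evidence.push({ kind: "gate_check", name: gate[1], path, line: i + 1, snippet: text.trim() });
    }
    const event = text.match(eventPattern);
    if (event) {
      evidence.push({ kind: "event_emit", name: event[1], path, line: i + 1, snippet: text.trim() });
    }
  });

  return evidence;
}
```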
As traversal continues, Statsig identifies the components engineers actually reason about: services with clear responsibilities, request handlers, API endpoints, workflow entry points, recurring interaction patterns, and domain objects that thread through the system. These become nodes in the graph. The intent isn’t architectural purity, but alignment with how engineers debug incidents, reason about changes, and assign ownership.
Finally, the workflow captures how these components relate. A service emits an event. A handler influences a metric. A workflow is gated behind a feature flag. An experiment variant changes behavior. An upstream dependency introduces latency or errors that affect conversion. These relationships encode the “why” and “how” links that normally live only in tribal knowledge.
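A simplified sketch of that last step, folding per-file evidence into relationship edges (hypothetical shapes and naming; edges point from the thing doing the influencing to the thing influenced):

```typescript
// Illustrative sketch: fold per-file evidence into relationship edges.

interface FileEvidence {
  path: string;
  component: string;                        // e.g. "onboarding-service"
  gates: { name: string; line: number }[];  // gates checked in this file
  events: { name: string; line: number }[]; // events logged in this file
}

interface Relationship {
  from: string;
  to: string;
  kind: "gates" | "emits";
  evidence: { path: string; line: number };
}

function toRelationships(file: FileEvidence): Relationship[] {
  const edges: Relationship[] = [];
  // A gate checked inside the component controls that component's behavior.
  for (const g of file.gates) {
    edges.push({
      from: `feature_gate:${g.name}`,
      to: `component:${file.component}`,
      kind: "gates",
      evidence: { path: file.path, line: g.line },
    });
  }
  // An event logged by the component is produced by it.
  for (const e of file.events) {
    edges.push({
      from: `component:${file.component}`,
      to: `event:${e.name}`,
      kind: "emits",
      evidence: { path: file.path, line: e.line },
    });
  }
  return edges;
}
```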
The result is a unified product graph grounded in real code locations and structured so both humans and AI systems can move from outcomes to causes without starting from scratch.
The Knowledge Graph isn’t just documentation. It’s infrastructure for learning faster.

When a metric moves, teams can immediately see which flows, gates, and services contribute to it. When an alert fires, the relevant parts of the system are already connected. When an experiment runs, it’s clear which code paths and user experiences were actually touched.
That clarity changes how teams respond. Instead of broad rollbacks or speculative fixes, they can zero in on the smallest place to learn. If a conversion drop runs through a specific onboarding handler behind a feature gate, the right response might be tightening rollout for a risky segment or A/B testing the old and new handler paths, along with a clear expectation of which metrics should move if the hypothesis is correct.
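Mechanically, that zeroing in can be pictured as a reverse walk from the metric node to everything that can influence it. The graph below is a tiny hypothetical example, not real data:

```typescript
// Illustrative sketch: given a metric node, walk the graph backwards to
// find everything that can influence it. Edges point from influencer to
// influenced.

type Edge = { from: string; to: string; kind: string };

const edges: Edge[] = [
  { from: "handler:onboarding/complete_step", to: "event:step_completed", kind: "emits" },
  { from: "event:step_completed", to: "metric:activation", kind: "feeds" },
  { from: "feature_gate:new_onboarding_flow", to: "handler:onboarding/complete_step", kind: "gates" },
  { from: "service:profile-service", to: "handler:onboarding/complete_step", kind: "affects" },
];

// Everything upstream of the metric is a candidate "smallest place to learn".
function upstreamOf(target: string, edges: Edge[]): Set<string> {
  const found = new Set<string>();
  const queue = [target];
  while (queue.length > 0) {
    const node = queue.pop()!;
    for (const e of edges) {
      if (e.to === node && !found.has(e.from)) {
        found.add(e.from);
        queue.push(e.from);
      }
    }
  }
  return found;
}

console.log(upstreamOf("metric:activation", edges));
// -> event:step_completed, handler:onboarding/complete_step,
//    feature_gate:new_onboarding_flow, service:profile-service
```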
This is also where AI systems become genuinely useful. With the Knowledge Graph as context, coding agents can operate on the same levers experienced engineers rely on. They can navigate the codebase with an understanding of which components are relevant, suggest targeted experiments rather than generic fixes, and recommend monitoring based on what a change actually touches.
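One way to picture "the Knowledge Graph as context" in practice, sketched under assumptions rather than as a description of Statsig's agent integration: slice out the part of the graph whose evidence touches the files in a change and hand that slice to the agent.

```typescript
// Illustrative sketch: build a context bundle for a coding agent from the
// subgraph around the files a change touches. All shapes are hypothetical.

interface ContextEdge {
  from: string;
  to: string;
  kind: string;
  paths: string[]; // files containing the evidence for this relationship
}

interface AgentContext {
  changedFiles: string[];
  relatedNodes: string[];      // gates, metrics, events, components near the change
  relationships: ContextEdge[];
}

function contextForChange(changedFiles: string[], edges: ContextEdge[]): AgentContext {
  // Keep only relationships whose supporting evidence lives in the changed files.
  const relationships = edges.filter((e) =>
    e.paths.some((p) => changedFiles.includes(p))
  );
  const relatedNodes = [...new Set(relationships.flatMap((e) => [e.from, e.to]))];
  return { changedFiles, relatedNodes, relationships };
}
```

Even a thin slice like this tells an agent which gates, events, and metrics its change can plausibly affect, which is the difference between a generic fix and a targeted one.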
Over time, this compounds. New flows come with suggested monitoring. New alerts come with meaningful runbooks. Launches come with checklists derived from real dependencies. Teams (and their AI systems) can spend less time figuring out what matters and more time improving it.
Statsig’s core loop hasn’t changed: ship, measure, learn, and iterate. What has changed is the complexity of the systems involved and the role AI plays in working with them. We recently wrote about several AI-powered features that are improving workflows and delivering real value for customers.
The Knowledge Graph is infrastructure for that next phase. By connecting metrics, experiments, and feature flags to the code paths that produce them, it reduces the gap between observing a change and learning what to do next.
That makes it easier to debug, easier to design targeted experiments, and easier for AI systems to operate with the same context experienced engineers rely on today.
It’s a foundation we’ll keep building on as we expand more agentic workflows—grounded in real systems, real code, and real product decisions.