Statsig’s unified experimentation and analytics platform helps Character.ai scale safe, engaging AI entertainment at speed.

How Character.ai scales AI entertainment with AI experimentation

3.5x

revenue YTD attributed to monetization experiments

500+

experiments YTD

3

experimentation and analytics platforms combined into one

Character.ai is a groundbreaking conversational AI platform that was launched prior to ChatGPT. The platform enables tens of millions of users worldwide to interact with AI-powered virtual characters, and has evolved from a purely text-based chat experience into a full-fledged multimodal AI entertainment platform. From educational dialogues to playful chats with fictional personas, Character.ai provides uniquely immersive experiences driven by advanced large language models (LLMs).

For Character.ai, data-driven development has been foundational from the start. Given the nature of their initial product, virtual AI characters responding dynamically in conversation, there often isn't a clearly correct or incorrect response. Instead, success is determined by how entertaining and safe the interactions are. This makes a rigorous experimentation process, validated against real user feedback, crucial to their development.

An AI development philosophy rooted in experimentation

With Character.ai's goal to deliver entertaining yet safe conversational experiences, offline evaluations and standard benchmarks aren’t sufficient. Unlike AI tools designed for tasks with clear-cut answers, such as code completion, the success of Character.ai's models depends on subtle, subjective interactions.

This means they need a way to validate that these interactions are consistently moving in the right direction: toward being both entertaining and safe. From the beginning, after validation through offline evaluations and red-teaming, they've relied on real user feedback to guide development, with experimentation serving as the backbone of that process. Character.ai's founding team included engineers and data scientists with deep experience in experimentation, and they recognized early on the value of building a product grounded in user data. This is especially critical when there's no objectively correct answer, only whether the experience resonates with users.

For Character.ai, model hallucinations that generate dynamic and creative responses can actually enhance the user experience, as long as they align with the goals of being entertaining and safe. Validating these nuanced behaviors through offline evaluation and then through experimentation is essential. In fact, experimentation often serves as the final tie-breaker in deciding whether a new model is ready for broad release.

To support this, Character.ai needs to run many experiments in parallel, which requires sophisticated experimentation capabilities such as holdouts, layers, rules-based targeting, and precise traffic allocation.
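
As an illustration of how layers keep parallel experiments from colliding, here is a minimal sketch of deterministic, hash-based layer assignment: each experiment owns a disjoint slice of the layer's traffic, so no user is ever in two of them at once. This is a generic sketch of the technique, not Statsig's actual implementation; the function names and bucket count are assumptions.

```python
import hashlib

def bucket(user_id: str, salt: str, buckets: int = 10000) -> int:
    """Deterministically map a user to a bucket in [0, buckets)."""
    digest = hashlib.sha256(f"{salt}.{user_id}".encode()).hexdigest()
    return int(digest, 16) % buckets

def assign_in_layer(user_id, layer_salt, experiments):
    """Walk the layer's traffic allocation; each experiment owns a disjoint
    slice. `experiments` is a list of (name, fraction_of_layer) pairs whose
    fractions sum to at most 1.0."""
    point = bucket(user_id, layer_salt) / 10000.0
    cumulative = 0.0
    for name, share in experiments:
        cumulative += share
        if point < cumulative:
            return name
    return None  # user falls in the layer's unallocated remainder
```

Because the assignment depends only on the user ID and the layer's salt, it is stable across sessions and servers, and experiments in different layers (different salts) are randomized independently of one another.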

Fragmented tools, friction-filled processes

Before adopting Statsig, Character.ai relied on separate tools for backend model experimentation and product UI testing. While each tool served its individual purpose, managing experimentation across both led to inefficiencies and a disjointed workflow.

Challenges included:

  • Implementation and ongoing maintenance of multiple tools was cumbersome and error-prone
  • Experimentation in the UI testing tool led to frequent false launches that required restarting
  • Measuring results in the UI testing tool was inefficient, prompting the team to log data in one tool but analyze it in another
  • Because their setup wasn’t warehouse-native, the team had to engineer workarounds to export data into their warehouse

Ultimately, the fragmented setup slowed teams down and made it harder to scale experimentation with confidence.

A unified platform for confident AI product development

Character.ai uses Statsig as a centralized platform for experimentation, feature management, and product analytics. The tool has become deeply embedded across engineering, research, and operations, enabling the team to move quickly while validating most releases with real user feedback.

Building an experimentation program for scale

Experimentation is at the core of Character.ai's development philosophy, and Statsig is the system powering it across every team. From product engineering and safety to research and post-training, Statsig enables self-serve experimentation, with the majority of experiments engineer-led. Unless there's a blocking reason not to, nearly every user-facing change is tested before release to protect the user experience.

While the team moves quickly, every experiment is grounded in a rigorous, user-first approach that prioritizes safety and the overall experience. Experiment velocity is important, but never at the expense of thoughtful design, evaluation, and user trust.

Online experimentation is especially critical for the text modeling team, where model success is measured through subjective signals like engagement and safety, not predefined answers.
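
For readers unfamiliar with how such online comparisons are typically decided, a standard two-proportion z-test on an engagement rate is one simple way to compare two model variants. This is a generic statistical sketch, not Statsig's stats engine; the function name and sample numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Two-sided pooled z-test comparing the engagement rates of two
    model variants; returns the z statistic and p-value."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```

In practice, an experimentation platform layers variance-reduction techniques such as covariate adjustment on top of tests like this, but the core question is the same: is the observed difference between variants larger than chance would explain?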

“We use online experimentation powered by Statsig to figure out whether a model resonated with our users at scale. It’s the tiebreaker for determining which models would best serve our users' needs.”
Sri Lacoul

Staff Data Scientist, Character.ai

Feature Flags for confident rollouts

While most launches go through an experiment, the team uses feature gates for slow, controlled rollouts when they’re confident in the change. For example, they used a feature gate for rolling out their annual subscriptions product. If the goal is to learn—or if there are multiple variants to compare—they default to experimentation. The Statsig platform’s flexibility supports both rollout styles seamlessly.
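
A percentage rollout behind a feature gate is typically implemented with deterministic hashing, so the same user stays in (or out of) the rollout as the percentage grows. The sketch below illustrates that general technique; it is not Statsig's implementation, and the gate name is hypothetical.

```python
import hashlib

def passes_gate(user_id: str, gate_name: str, rollout_pct: float) -> bool:
    """Deterministic hash-based rollout: each user lands at a fixed point
    in [0, 100) per gate, so raising rollout_pct only ever adds users."""
    digest = hashlib.sha256(f"{gate_name}:{user_id}".encode()).hexdigest()
    point = (int(digest, 16) % 10000) / 100.0
    return point < rollout_pct
```

The key property is monotonicity: a user who sees the feature at 10% still sees it at 50%, which makes a slow ramp from 1% to 100% safe and reversible.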

Operational and analytics use cases

Beyond experiments and rollouts, Character.ai uses Statsig for key operational workflows. The operations team manages elements like the homepage “Featured” carousel using dynamic configs, enabling fast changes without requiring code deployments. This allows non-engineering teams to control and update parts of the user experience in real time.
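
Conceptually, a dynamic config is a named blob of values that application code reads at request time, so non-engineers can change behavior without a deploy. A minimal sketch, with a hypothetical in-memory store standing in for the platform's config service (the config name and values are illustrative, not Character.ai's actual configuration):

```python
import json

# Hypothetical stand-in for the platform's config service; in production
# this data would be fetched from the experimentation platform, not a dict.
CONFIG_STORE = {
    "featured_carousel": json.dumps({
        "character_ids": ["sherlock", "tutor-bot"],
        "max_items": 8,
    }),
}

def get_dynamic_config(name, default=None):
    """Fetch a config by name at request time; updating the store changes
    behavior on the next read, with no code deployment."""
    raw = CONFIG_STORE.get(name)
    return json.loads(raw) if raw is not None else (default or {})
```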

On the analytics side, Statsig’s dashboards are increasingly used by the text and multimodal research teams to explore experiment outcomes and uncover behavioral trends. Because metrics are already defined and tightly integrated with experimentation, teams can quickly generate insights without duplicating setup or logic.

Weekly metrics reviews, held alongside their experiment reviews, often feature Statsig charts and screenshots to communicate top-line impact across executive and cross-functional stakeholders.

Achieving responsible AI experimentation at scale

Statsig’s flexibility and statistical depth have played a big role in enabling this experimentation culture, giving Character.ai far greater confidence in its experimentation processes. The team highlighted the flexibility of Statsig’s stats engine and the ability to customize methodologies depending on the experiment type.

“We’ve enjoyed using a lot of the advanced features available to us in Statsig including covariates, qualified exposures, and filtering assignments.”
Sri Lacoul

Staff Data Scientist, Character.ai

This breadth of functionality, combined with speed and ease of use, has made Statsig essential to Character.ai’s ability to run high-velocity, high-confidence experimentation at scale.

Character.ai has experienced measurable gains from using Statsig:

  • Increased experiment velocity and reduced time-to-insight, enabling 500+ experiments YTD with a lean team without compromising precision or rigor
  • 3.5x revenue YTD attributed to successful monetization experiments
  • Reduced operational cost and effort by consolidating 3 separate experimentation platforms into Statsig
  • Reliable detection and mitigation of regressions and safety risks using Statsig’s self-serve diagnostic tools

“Statsig has elevated our ability to rapidly and confidently deploy AI-powered experiences. It’s become central to the development of our conversational AI models.”
Andy Nahman

Head of Data Science, Character.ai

What’s next?

As Character.ai looks ahead, they’re focused on evolving their data-driven development process to meet the growing complexity of building safe, engaging, and high-performing AI systems. This evolution reflects their strategic shift from a text-based AI chat platform to a multimodal AI entertainment platform, expanding the possibilities for creative and interactive user experiences. While real-time experimentation has become their default approach for evaluating models, they see an opportunity to introduce more structure, particularly around sequencing offline and online evaluations.

Looking ahead, they’re excited about Statsig’s continued investment in AI-specific tooling, especially in areas like prompt testing and enhanced offline model evaluations, which they believe will be critical to building the next generation of conversational AI.

“I see Statsig very clearly focusing their investments around unlocking this value for AI companies and we are excited to explore the new features as they roll out.”
Andy Nahman

Head of Data Science, Character.ai

About Character.ai

Character.ai is a leading generative AI platform allowing users to create and interact with virtual characters powered by advanced language models. With millions of active users, Character.ai is redefining human-AI interactions, delivering engaging, personalized, and safe conversational experiences across diverse use cases.
