Cloud-hosted vs warehouse native experimentation platforms: How to decide

Thu Nov 09 2023

Sid Kumar

Product Marketing, Statsig

At Statsig, we collaborate with a diverse range of companies scaling their experimentation culture, spanning industry leaders, fast-growing startups, and cutting-edge AI enterprises. Earlier this year, we introduced our Warehouse Native (WHN) offering, granting new customers the flexibility to select their deployment model according to their unique needs and use cases.

Companies exhibit significant variations in their existing experimentation approaches, toolsets, resource allocations, and policies. While some customers enter the picture with a distinct architectural inclination, others are in the midst of evaluating their alternatives. In light of this, we wanted to provide some guidance based on our customer onboarding experiences.

Below is a summary of key criteria to consider when making your decision between the two modes of deployment:

| Criteria | Cloud-hosted | Warehouse native (WHN) |
| --- | --- | --- |
| Data source | Metrics come primarily from Statsig SDKs or CDPs like Segment. Some metrics can still come from a warehouse. | The warehouse is the primary source of metrics, making WHN ideal when you want to reuse existing data pipelines and computation. |
| Analysis needs | Automated analysis for every experiment and product launch, especially with metrics derived from event logging. | Flexible analysis on top of your existing source-of-truth metric data. |
| Data team involvement | Optional, but recommended for experiment design and readouts. | Necessary for setting up the warehouse connection and configuring core metrics, but not mandatory for every experiment. |
| Costs | TCO is slightly lower. No warehouse costs involved. | TCO includes the Statsig license plus the computation and storage costs incurred in your warehouse. |
| Modularity | An integrated end-to-end platform that spans SDKs for feature rollout, experiment execution, analysis, and experiment readouts. | Modular: you can opt for the integrated end-to-end platform or use only a subset of capabilities, such as assignment or experiment analysis. |

This framework offers directional guidance to ease your organization's decision-making. Specific requirements vary, so evaluate your own situation carefully. That said, the following factors, which build on the table above, merit consideration:

Primary source of metrics

This is often a crucial factor for companies that have a data warehouse as their primary source of metrics and want to perform experiments using their existing, reliable datasets. In this case, the warehouse-native approach allows for experiment analysis directly within the data warehouse, preserving privacy and data ownership and avoiding data egress.

Existing metrics

Building on the preceding point, enterprises with pre-established, trusted metrics in their data warehouse usually seek an experimentation platform that directly interfaces with these metrics.

On the other hand, Statsig's cloud-hosted solution is perfect for organizations seeking a reliable method for creating metrics. It's more engineering-friendly and doesn't necessitate additional data work. You can simply use the Statsig SDK to log events and send them to Statsig, where they can be seamlessly integrated with any precomputed metrics from a data warehouse, if needed. While metrics can be easily imported in the cloud-hosted version, using WHN may be a better choice if all your metrics are derived from the warehouse.

Tool stack with cloud vendors

Organizations that already use other SaaS tools for customer data and metrics often find a cloud-hosted experimentation platform the natural fit. For instance, companies already using tools like Segment have a straightforward path for ingesting event data directly into cloud-hosted Statsig.

Whether an organization already has a feature flagging tool can also influence its decision. For instance, customers considering migration from an existing vendor like LaunchDarkly might be more accustomed to cloud workflows. However, many organizations are moving toward a single source of truth for their metrics and making experiment data available in a warehouse for rapid analysis, so they might prefer a warehouse-native tool that operates on top of these metrics.

Total cost of ownership (TCO)

Cloud-hosted deployments typically have a slightly lower total cost of ownership. A warehouse-native deployment adds the compute and storage costs incurred in your warehouse, as well as any initial setup costs, on top of the Statsig license.
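To make the comparison concrete, here is a minimal sketch of how the warehouse-native cost components add up. The class name, field names, and figures are hypothetical placeholders for illustration, not Statsig pricing; substitute your own license quote and warehouse billing data.

```python
from dataclasses import dataclass


@dataclass
class WarehouseNativeCosts:
    """Hypothetical cost components for a warehouse-native deployment (same currency and period)."""

    statsig_license: float
    warehouse_compute: float
    warehouse_storage: float
    setup: float = 0.0  # one-time setup cost, if you want to include it

    def total(self) -> float:
        # TCO = license + warehouse compute + warehouse storage (+ optional setup)
        return (
            self.statsig_license
            + self.warehouse_compute
            + self.warehouse_storage
            + self.setup
        )


# Placeholder figures only; these are not real prices.
example = WarehouseNativeCosts(
    statsig_license=100.0,
    warehouse_compute=15.0,
    warehouse_storage=5.0,
)
print(example.total())
```

For a cloud-hosted deployment, the warehouse terms drop out, which is where the slightly lower TCO comes from.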

Security/privacy policies

Organizations that enforce stringent egress policies for user-level data in their warehouse may gravitate toward a warehouse-native deployment. However, it's important to note that private information such as names and email addresses isn't required for experiments, because they operate on user IDs. Some of our largest enterprise customers trust our cloud-hosted offering to run their workloads, using hashing or private attributes to avoid logging IDs or sensitive data.
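As an illustration of the hashing approach, the sketch below pseudonymizes a user ID with a keyed hash before it leaves your systems. This is a generic pattern built on Python's standard `hmac` and `hashlib` modules, not Statsig's implementation; the function name `pseudonymize_user_id` and the key value are assumptions made for this example.

```python
import hashlib
import hmac


def pseudonymize_user_id(user_id: str, secret_key: bytes) -> str:
    """Return a stable, keyed SHA-256 digest of a user ID (hypothetical helper).

    The same input always yields the same digest, so a user keeps a
    consistent identity across sessions, while the raw ID is never logged.
    """
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Assumption: in practice the key would come from a secrets manager, not source code.
key = b"replace-with-a-secret-from-your-vault"

hashed_id = pseudonymize_user_id("user-12345", key)
print(hashed_id)  # 64 hexadecimal characters
```

Because the digest is deterministic, it can stand in for the user ID wherever one is expected, as long as every system that logs events for that user applies the same key.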

Data maturity

Organizations are at different stages of building a data-driven culture of experimentation. Those starting their journey often find the cloud-hosted option a more accessible starting point, because developers can adopt it easily. In contrast, a warehouse-native deployment requires a trusted and reusable set of metrics and events within a warehouse.

Data team support

The cloud-hosted option places lower demands on data team involvement in experimentation, as all teams, including Engineering and Product, gain access to the console for metrics and analysis. Statsig Cloud-Hosted caters to users who favor automation, minimizing the need for manual data work to compute experiment results. It fosters simplicity and accessibility for the entire organization, encompassing engineering, product, and data teams. Warehouse-native entails some data team support, though not for every experiment.

As a result, Statsig Cloud is often an easy starting point for organizations new to experimentation. It accommodates companies at varying levels of data maturity, offering a "just works" approach. This is especially valuable for organizations without data engineers or extensive data management resources, since they can start experimenting quickly without managing a data warehouse or intricate data processes.

Flexibility and customizations

Organizations that extensively customize their metrics may discover that defining these metrics within their data warehouse and orchestrating precise experiments using a warehouse-native platform is more convenient. This is because complex data transformations and the integration of new data sources are more straightforward with a data warehouse.

Need for built-in product analytics

Organizations seeking in-depth metric analysis and insights through built-in analytics (Metrics Explorer) are best suited to Statsig Cloud-hosted. Most other features align between our cloud-hosted and warehouse-native offerings, and we are working diligently to bring the two offerings to parity.

Related: See the full comparison table between Statsig Warehouse Native and cloud-hosted.

TLDR: The choice between cloud-hosted and warehouse-native isn't about one being superior to the other but rather about selecting based on your current tooling, processes, and objectives. Don't hesitate to reach out if you would like to discuss further what aligns with your goals.
