Ever tried to measure the impact of a big product change when you can't run a proper A/B test? You know the feeling - maybe you're rolling out to an entire country, or the change affects your whole user base. Traditional experiments just don't work here.
That's where synthetic control methods come in. Think of it as building a "what if" scenario from your existing data - a clever way to create a comparison group when you can't actually have one.
Let's start with the basics. Synthetic control methods (SCM) construct a comparison group by combining data from multiple untreated units. Instead of finding a single perfect control group (which rarely exists), you're essentially mixing and matching to create one that closely mirrors your treated unit before the intervention happens.
Here's the magic: Microsoft's data science team showed how this weighted combination approach can estimate causal effects even when randomized trials aren't possible. The synthetic control becomes your benchmark - what would have happened if you hadn't made that change.
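To make that concrete, here's a minimal sketch of what a "weighted combination" looks like. The numbers and weights are made up purely for illustration; in practice you fit the weights rather than pick them by hand (more on that below).

```python
import numpy as np

# Toy example: 12 weeks of a metric for three untreated markets (donors)
# and one treated market. The change ships after week 8.
donors = np.array([
    [100, 102, 101, 103, 105, 104, 106, 108, 110, 111, 113, 115],  # donor A
    [ 80,  81,  83,  82,  84,  86,  85,  87,  88,  90,  91,  92],  # donor B
    [120, 119, 121, 123, 122, 125, 126, 127, 129, 130, 132, 133],  # donor C
], dtype=float).T
treated = np.array([98, 99, 100, 101, 102, 103, 104, 105, 112, 114, 116, 118], dtype=float)

weights = np.array([0.5, 0.3, 0.2])      # illustrative; in practice these are fitted
synthetic = donors @ weights             # the "what would have happened" series

post = slice(8, None)                    # weeks after the change shipped
lift = treated[post] - synthetic[post]   # estimated per-week effect
print("Estimated lift per post-launch week:", np.round(lift, 1))
```

The synthetic series tracks the treated market closely before week 8, and the gap that opens up afterward is your estimated effect.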
This approach really shines in specific situations:
When you're dealing with a single treated unit (like one specific market or region)
When interventions affect entire populations at once
When running an RCT would be unethical or just plain impossible
The flexibility is what makes SCM so useful across different fields. Public health researchers use it to evaluate population-wide interventions. Marketing teams apply it to measure campaign effectiveness. Product teams at companies like Statsig leverage it when traditional A/B tests aren't feasible.
Now, here's where things get tricky. When your test groups are complex - think multiple interacting features or spillover effects between users - even synthetic controls can struggle.
The biggest challenge? Selecting the right control units. Measured's research on geo-testing found that random market selection can completely throw off your results. You need control units that actually behave like your treated unit would have - easier said than done.
Then there's the pre-intervention fit problem. Your synthetic control needs to track your treated unit closely before the intervention. If it doesn't, you're basically comparing apples to oranges. Medium's data science community emphasizes running robustness checks to validate your results (a placebo-test sketch follows this list):
Placebo tests (applying your method to units that weren't actually treated)
Permutation tests
Sensitivity analysis
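To make the placebo idea concrete, here's a simplified sketch of an in-space placebo test. It matches on the outcome series only (full SCM implementations also match on covariates), and the helper names are mine, not from any particular library:

```python
import numpy as np
from scipy.optimize import minimize

def fit_weights(treated_pre: np.ndarray, donors_pre: np.ndarray) -> np.ndarray:
    """Find donor weights (non-negative, summing to 1) that best match the
    treated unit's pre-intervention outcome series."""
    n = donors_pre.shape[1]
    loss = lambda w: np.sum((treated_pre - donors_pre @ w) ** 2)
    res = minimize(loss, x0=np.full(n, 1.0 / n), bounds=[(0, 1)] * n,
                   constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1}],
                   method="SLSQP")
    return res.x

def rmspe_ratio(series: np.ndarray, donors: np.ndarray, pre: slice, post: slice) -> float:
    """Post-period RMSPE divided by pre-period RMSPE: a large ratio means the
    unit diverged from its synthetic control after the intervention date."""
    w = fit_weights(series[pre], donors[pre])
    gap = series - donors @ w
    return np.sqrt(np.mean(gap[post] ** 2)) / np.sqrt(np.mean(gap[pre] ** 2))

def placebo_test(treated: np.ndarray, donors: np.ndarray, pre: slice, post: slice):
    """In-space placebo: pretend each untreated unit was treated and re-run the
    whole pipeline. If the real treated unit doesn't clearly stand out, the
    'effect' may just be noise."""
    actual = rmspe_ratio(treated, donors, pre, post)
    placebos = [rmspe_ratio(donors[:, j], np.delete(donors, j, axis=1), pre, post)
                for j in range(donors.shape[1])]
    rank = 1 + sum(p >= actual for p in placebos)
    print(f"Treated ratio {actual:.2f}, rank {rank} of {len(placebos) + 1}")
```

If the treated unit's post/pre ratio ranks near the top of that distribution, you have some evidence the effect is real; if it lands in the middle of the pack, be skeptical.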
Despite these challenges, SCM remains incredibly powerful for complex environments. The key is understanding its limitations and building in safeguards. Some teams combine synthetic controls with quasi-experimental designs to strengthen their findings.
So how do you actually implement this with complex test groups? Start by identifying units that share key characteristics with your treated unit during the pre-intervention period.
The implementation process typically looks like this (a Python sketch follows the list):
Gather your potential control units
Define your pre-intervention period
Fit the model using tools like R's Synth library or Python's CausalImpact
Validate the pre-intervention fit
Estimate the treatment effect
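Here's roughly what those five steps can look like in Python. This assumes the pycausalimpact port of Google's CausalImpact and hypothetical file and column names; note that CausalImpact fits a Bayesian structural time-series model rather than the classic donor-weight optimization, but the workflow maps onto the same steps:

```python
import pandas as pd
from causalimpact import CausalImpact  # pip install pycausalimpact (interface assumed)

# 1. Gather your potential control units: untreated markets tracked on the same metric.
df = pd.read_csv("weekly_metric.csv")  # hypothetical file with one row per week
data = df[["treated_market", "control_a", "control_b", "control_c"]]  # treated series first

# 2. Define the pre-intervention period (and, implicitly, the post period).
pre_period = [0, 51]     # weeks before the launch
post_period = [52, 77]   # weeks after the launch

# 3. Fit the model.
ci = CausalImpact(data, pre_period, post_period)

# 4. Validate the pre-intervention fit: the predicted series should hug the
#    observed series before week 52. Eyeball the plot before trusting anything.
ci.plot()

# 5. Estimate the treatment effect.
print(ci.summary())
```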
Microsoft's approach emphasizes that the weighted combination should closely resemble your treated unit before the intervention. This isn't just about matching averages - you want similar trends and patterns too.
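One lightweight way to check that (an illustrative diagnostic, not a formal test) is to compare the two series on both level and week-over-week movement before the intervention:

```python
import numpy as np

def pre_fit_report(treated_pre: np.ndarray, synthetic_pre: np.ndarray) -> dict:
    """Compare level and shape of the treated vs. synthetic series pre-intervention.
    Matching the mean alone isn't enough; the week-over-week movements should line up too."""
    return {
        "mean_gap": float(np.mean(treated_pre - synthetic_pre)),
        "rmspe": float(np.sqrt(np.mean((treated_pre - synthetic_pre) ** 2))),
        # Correlation of first differences checks whether the two series move together.
        "trend_corr": float(np.corrcoef(np.diff(treated_pre), np.diff(synthetic_pre))[0, 1]),
    }
```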
When dealing with intricate group dynamics, pay extra attention to interaction effects. Your synthetic control needs to capture not just the main behaviors but also how different segments interact with each other.
Robustness checks aren't optional - they're essential. Research teams recommend (a sketch of the first two follows this list):
Leave-one-out tests (removing each control unit and re-running the analysis)
Time placebo tests (pretending the intervention happened earlier)
Multiple specifications to ensure your results aren't sensitive to small changes
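Here's a sketch of the first two checks, reusing the same simple weight-fitting helper as in the placebo sketch above; again, these helpers are illustrative rather than from a specific library:

```python
import numpy as np
from scipy.optimize import minimize

def fit_weights(treated_pre, donors_pre):
    """Non-negative donor weights summing to 1, minimizing pre-period error."""
    n = donors_pre.shape[1]
    res = minimize(lambda w: np.sum((treated_pre - donors_pre @ w) ** 2),
                   x0=np.full(n, 1.0 / n), bounds=[(0, 1)] * n, method="SLSQP",
                   constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1}])
    return res.x

def leave_one_out_effects(treated, donors, pre, post):
    """Drop each donor in turn, refit, and collect the post-period effect estimates.
    If the estimate swings wildly depending on which donor is in the pool, the
    result is being driven by a single control unit."""
    effects = []
    for j in range(donors.shape[1]):
        reduced = np.delete(donors, j, axis=1)
        w = fit_weights(treated[pre], reduced[pre])
        effects.append(float(np.mean(treated[post] - (reduced @ w)[post])))
    return effects

def time_placebo_effect(treated, donors, fake_pre, fake_post):
    """In-time placebo: pretend the intervention happened earlier, entirely
    within the real pre-period. A large 'effect' here is a red flag."""
    w = fit_weights(treated[fake_pre], donors[fake_pre])
    return float(np.mean(treated[fake_post] - (donors @ w)[fake_post]))
```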
Let's talk real-world applications. Epidemiologists have used SCM to evaluate smoking bans and vaccination programs - cases where you can't exactly randomize who gets the intervention. Advertising teams apply it for geo-testing campaigns when holdout regions aren't feasible.
But here's the catch: SCM is incredibly sensitive to your choice of control units. Pick the wrong ones, and your entire analysis falls apart. Researchers at various institutions have found that poor control selection is the number one reason synthetic control analyses fail.
Best practices from teams who've done this successfully:
Clean your data obsessively: Missing data and outliers can wreck your synthetic control
Check pre-intervention fit religiously: If it's not good, don't proceed
Run multiple sensitivity analyses: Change your control pool, adjust your weights, see if results hold
Document everything: Future you (and your team) will thank you
One approach gaining traction is combining SCM with other methods. Data scientists at Microsoft often pair it with difference-in-differences analysis. This triangulation approach - getting the same answer from multiple methods - builds confidence in your results.
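As a rough illustration of triangulation, here's the textbook two-by-two difference-in-differences estimate you could compute alongside the synthetic control number (a generic sketch, not Microsoft's specific implementation):

```python
import numpy as np

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Classic 2x2 difference-in-differences: the treated unit's before/after
    change minus the control group's before/after change."""
    treated_change = np.mean(treated_post) - np.mean(treated_pre)
    control_change = np.mean(control_post) - np.mean(control_pre)
    return treated_change - control_change
```

If the DiD estimate and the synthetic control estimate land in the same ballpark, that agreement is the whole point of triangulation; if they diverge, dig into why before reporting either number.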
For teams using experimentation platforms like Statsig, synthetic controls can complement your existing toolkit. When you can't run that perfect A/B test, SCM gives you another path to understanding causal impact.
Synthetic control methods aren't a silver bullet, but they're an incredibly valuable tool when traditional experiments fall short. The key is understanding when to use them and how to validate your results.
If you're dealing with single-unit interventions, population-wide changes, or complex test groups where randomization isn't possible, SCM deserves a spot in your toolkit. Just remember: the method is only as good as your control selection and validation process.
Want to dive deeper? Check out Matteo Courthoud's technical walkthrough for implementation details, or explore how platforms like Statsig are incorporating these methods into modern experimentation workflows.
Hope you find this useful!