Tavour wanted to ship a “no-brainer” feature - address autocomplete. What they saw surprised them!
One point of friction Tavour noticed was requiring users to input an address for delivery, so they decided to add autocomplete to ease the process. They expected this to be a no-brainer feature to ship: it should increase speed and accuracy in the user sign-up flow, resulting in better conversion.
Autocomplete makes typing an address fast!
They put this behind a Statsig feature flag and rolled it out to a small percentage of users to make sure nothing was broken. What they saw next surprised them!
Statsig’s “Pulse” view shows impact on metrics
Tavour was expecting “Address Auto-Complete” to increase the number of successfully activated users. Instead, they found that users exposed to this feature churned at a higher rate than users in the control group (who didn’t have auto-complete).
Initially, they hypothesized there was a data quality issue. Could an issue related to logging or data pipelines be causing this? After verifying that this was not the case, the team looked at other metrics impacted by the rollout. A system event, “Application Backgrounded”, had also shot up. A new feature causing users to abandon the app suggested something weird could be going on.
The Tavour team started investigating usability for the new feature. Looking at other apps, they noticed that those apps displayed more address results than the Tavour app did, without having to scroll. They formed a hypothesis that the partial list of autocomplete suggestions displayed in the Tavour app did not convey to users that there were additional suggestions hidden below the scroll.
When they sliced data by phone size, they saw a marked difference between small and large phones. With a small phone, fewer address suggestions were visible without scrolling. With a large phone, more address suggestions were visible. This finding provided evidence for the hypothesis that showing fewer addresses confused users and prompted them to abandon registration.
Tavour decided to tweak the feature to let users see more autocomplete suggestions without having to scroll.
“New user activation rate increased by double digit percent points”
— Bella Muno, Product Manager at Tavour
The revised feature increased new user activation rate, giving them the confidence to finish rolling out this feature! It’s easy to assume that “no-brainer features” don’t need to have impact measured. It’s obvious that the user experience will improve. Tavour’s example is a reminder that if you don’t measure this value, you may miss something.
Every feature is well intentioned… that’s why we build them. However, in our experience, less than a third of new features create positive impact. Another third require iteration before they land the desired impact, because users might not discover them or might be confused by them. The final third of the features are bad for users: they have a negative effect on product metrics, and the best bet is to abandon them.
This split varies with both product maturity (it’s harder to find wins on a well optimized product) and the product team’s insights and creativity. Yet, these three buckets almost always exist. Without critically analyzing impact, you can’t know which bucket a feature is in.
Statsig turns every feature rollout into an A/B test with no additional work. In a partial rollout, people who are not yet getting the new feature are the Control group of the A/B test. People who are getting the new feature are the Test group. By comparing metrics you’re already logging across these groups, Statsig can tell how much any given feature is impacting your KPIs. Statistical tests identify differences between the groups that are unlikely to be due to randomness and noise.
This type of testing enables product teams to understand the impact of a feature and determine which of the three buckets above the feature is likely to be in!
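To make the Control versus Test comparison concrete, here is a minimal sketch of one common statistical test for a conversion-style metric: a two-sided, two-proportion z-test. It is an illustration only. The article does not say which test Statsig runs, and the function name and the user counts below are hypothetical, not Tavour’s data.

```python
import math


def two_proportion_z_test(control_conversions, control_users,
                          test_conversions, test_users):
    """Compare conversion rates between a Control and a Test group.

    Returns (delta, z, p_value), where delta is the Test rate minus the
    Control rate and p_value is two-sided.
    """
    p_control = control_conversions / control_users
    p_test = test_conversions / test_users

    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (control_conversions + test_conversions) / (control_users + test_users)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_users + 1 / test_users))

    z = (p_test - p_control) / std_err
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_test - p_control, z, p_value


# Hypothetical counts: 400 of 1000 Control users activated, 340 of 1000 Test users.
delta, z, p_value = two_proportion_z_test(400, 1000, 340, 1000)
print(f"delta={delta:+.3f}, z={z:.2f}, p={p_value:.4f}")
```

A negative delta with a small p-value would point to the kind of regression Tavour saw: the Test group activating less often than Control, by more than noise alone would plausibly explain.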
Simple feature flagging systems let you turn features on and off and control gradual rollout, but don’t offer automatic A/B tests and analysis. Simple experimentation systems let you do something similar — but introduce too much overhead for use on every feature rollout. Statsig makes it easy to feature flag and measure changes.