When Amazon launched Home Services, the team was convinced that most people wanted to schedule home installations in the mornings, in the evenings, or on weekends. This naturally constrained the number of available time slots and stretched delivery dates months into the future. Customers weren’t pleased and happiness was far from guaranteed.
Frustrated and out of ideas, the team decided to experiment by offering the next available time slot by default (customers could still choose another slot if they wanted to). It turned out that most customers wanted the installation as soon as possible and took the first available slot. As orders clocked up, technicians completed more installations and delivered higher quality service based on customer satisfaction. Happiness began to trend upwards.
Experiments enable us to form a deeper understanding of a problem. While the desire to understand is universal, using experimentation to understand a problem is not a natural human instinct.
In intentionally seeking data that disproves one’s deepest convictions, experimentation defines the culture of the team where it takes hold. From the scientific revolution in Europe to the rise of Netflix, Amazon, and Facebook, experimentation has enabled unparalleled growth. Large companies such as Amazon and Facebook didn’t become experimental after they grew big. They grew big because they tested their convictions every step of the way.
This mindset of testing one’s convictions, even when deeply rooted in instinct, is valuable at every stage of a company regardless of product-market fit. It’s essential before product-market fit, when we’re still learning, assimilating, and applying our understanding to push deeper into a problem. We might hit upon a great idea by sheer luck, but usually it’s because we’ve spent countless iterations trying different things.
Tristan, CEO and founder of dbt, rejects experimentation maximalism as a product creator. His experience with experimentation is 100% on the money. A/B testing sucks today. The tooling is painful and error-prone, driving teams to spend countless hours and sweat on experimentation without revealing any earth-shattering insight. A successful A/B test might reveal 1 bit of information if you’re lucky, taking you from “being 50/50 on which variant to being sure of one”.
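One way to read the “1 bit” claim is through Shannon entropy: a 50/50 belief about which variant wins carries exactly one bit of uncertainty, and a fully conclusive test removes all of it. The sketch below is only an illustration of that arithmetic. The function names `binary_entropy` and `bits_learned` are made up for this example and do not come from Tristan’s post or from any library.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a belief that variant A wins with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # no uncertainty left when we are sure
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def bits_learned(prior: float, posterior: float) -> float:
    """Reduction in uncertainty, in bits, when belief moves from prior to posterior."""
    return binary_entropy(prior) - binary_entropy(posterior)


# Going from 50/50 to certainty yields exactly one bit.
print(bits_learned(0.5, 1.0))  # 1.0

# A test that only moves us to 90% confidence yields about half a bit.
print(round(bits_learned(0.5, 0.9), 3))  # 0.531
```

Seen this way, the cost per experiment matters a great deal: if each test returns at most one bit, the only way to learn quickly is to make each test cheap.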
What big tech companies have solved is running experiments at industrial scale. They have reduced the marginal economic and human cost of running an experiment to nearly zero. Amazon is huge today, but this is Jeff Bezos speaking in 2005, a week after they launched Prime, on how to maximize experimentation by reducing the cost of experimentation.
Ubiquitous, low-cost experimentation enables every engineer to ask, “Why shouldn’t I A/B test this? Maybe I’ll learn something new.”¹ Learning how users respond (or how complex systems respond) to successive software updates compounds the engineer’s ability to detect signals and fix issues as early as possible. Without this feedback loop, mistakes and poor assumptions morph beyond recognition over time and become progressively harder to root cause.
Good experimentation infrastructure compounds learning and discovery throughout the company, arming everyone with data to make smarter decisions. Every engineer on the team aligns with and owns the vision instead of blindly taking instructions via the roadmap.
While I violently agree with Tristan that A/B testing sucks today, I disagree that it is overused. If anything, the lack of good tools limits use of experimentation in most companies beyond big tech.
People often trip on the idea that big companies can enjoy the benefits of experimentation because they can simply optimize what they’ve built. I have nothing against optimization, but if people thought this way at Amazon, they’d never have created any new products or businesses.
Amazon launched Amazon Auctions in 1999, zShops later the same year, and Amazon Marketplace in late 2000, all iterations to build a marketplace. In helping Target and Marks & Spencer build on top of Amazon’s e-commerce engine, Amazon recognized the need to untangle the mess that was the e-commerce platform into well-documented APIs. “So very quietly around 2000, we became a services company with really no fanfare,” Andy Jassy described in an interview. At a summer retreat in 2003, the team progressively began to recognize what they’d become good at: running reliable, scalable, cost-effective data centers. “In retrospect it seems fairly obvious, but at the time I don’t think we had ever really internalized that,” Jassy said of the first steps that Amazon took towards web services.
No one sits in a corner to come up with breakthrough product insights. Similarly, no one relies on experimentation alone to come up with fundamentally new and great products. However, the path to invention that combines experimentation, learning, and reflection is often lost in the prevailing folklore.
Henry Ford is famously quoted as saying, “If I had asked people what they wanted, they would have said faster horses.” Yet, Ford didn’t invent cars to replace horses. He experimented relentlessly to make cars more cost-effective.
Cars had been around for decades before the Model T. In fact, Ford’s 1908 Model T was the 20th iteration over a five-year period that began with the production of Model A in 1903. Ford’s vision was mass production. Introducing consistently interchangeable auto parts was the first big step. The moving assembly line was a further “optimization” to reduce the time workers spent walking around the shop floor to procure and fit components into a car. “Progress through cautious, well-founded experiments,” was Ford’s motto.
There’s no rule that live experiments are the only way to create great products. On a lucky day, our vision and planned roadmap are outcomes of prior exploration and learning. Focus is essential to solve most problems, and roadmaps help us focus further exploration in a given space. While a series of experiments may be structured to focus on a specific problem, occasionally these experiments break out a new dimension of the problem and take us back to the drawing board.
It’s great to have a vision, but it’s essential to keep your eyes open. Experimentation is a critical tool to remain open-minded even as the world shifts around you.
There’s also no rule that experiments must move an existing KPI. As our understanding of the problem space evolves, we must rewrite the metrics to best reflect our current understanding of the problem.
When I was in EC2, we tracked normalized instance hours as the primary usage metric, but in EC2 Spot we tracked utilization of unused instances as the primary metric². Compared to an auction-based pricing model that was designed to maximize revenue, we flipped the problem over to reduce waste. We now ran experiments to solve the problem as we defined it and described our vision in terms of metrics that we created. Today, all three major cloud providers have “Spot” instances.
You’re a believer in the culture of experimentation and you have appropriate tooling. What does experimentation look like in practice?
Start by measuring: Turn every new feature into an A/B test - with feature gates this should be a no-brainer for any engineering team.
Build shared context: Train your team to recognize unexpected patterns and go deeper into these “issues” jointly as a team.
Take calculated risks: Break down large roadmap projects into small, incremental experiments. Roll back effortlessly when things don’t pan out.
Invest in timely and accurate data: No engineer should need to be told that her work isn’t showing results. Pave the way for everyone in the team to arrive at the same conclusions about the product.
Iterate fast: Try out more ideas to improve the rate at which you generate better ideas.
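To make the first and third practices concrete, here is a minimal sketch of a feature gate that doubles as an A/B test. It is an illustration under stated assumptions, not Statsig’s SDK or API: `in_test_group`, `ExposureLog`, the gate name `new_checkout`, and the hash-based bucketing scheme are all hypothetical names and choices made for this example.

```python
import hashlib
from collections import defaultdict


def in_test_group(user_id: str, gate_name: str, rollout_pct: float = 50.0) -> bool:
    """Deterministically bucket a user (hypothetical helper).

    The same user and gate always hash to the same bucket, so a user
    sees a consistent experience across sessions.
    """
    digest = hashlib.sha256(f"{gate_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 10_000  # bucket in 0..9999
    return bucket < rollout_pct * 100      # e.g. 50% -> buckets 0..4999


class ExposureLog:
    """In-memory stand-in for an event pipeline (hypothetical)."""

    def __init__(self) -> None:
        self.exposed = defaultdict(set)
        self.converted = defaultdict(set)

    def log_exposure(self, user_id: str, group: str) -> None:
        self.exposed[group].add(user_id)

    def log_conversion(self, user_id: str, group: str) -> None:
        # Only count conversions from users we actually exposed.
        if user_id in self.exposed[group]:
            self.converted[group].add(user_id)

    def conversion_rate(self, group: str) -> float:
        exposed = len(self.exposed[group])
        return len(self.converted[group]) / exposed if exposed else 0.0


# Usage: gate the new feature, log which group each user landed in.
log = ExposureLog()
for i in range(1000):
    user = f"user-{i}"
    group = "test" if in_test_group(user, "new_checkout") else "control"
    log.log_exposure(user, group)

# Rolling back is a configuration change, not a redeploy:
# with rollout_pct=0 nobody sees the new feature.
assert not any(in_test_group(f"user-{i}", "new_checkout", 0.0) for i in range(1000))
```

Comparing `conversion_rate("test")` with `conversion_rate("control")` gives the raw lift. A real analysis would also need a significance test and guardrail metrics, which this sketch leaves out.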
A previous post goes into more detail on these. As a product creator, I also exchange notes with other creators on what’s working for them³. Tim, our data science lead, writes extensively about not needing large sample sets for A/B tests. My colleague Vineeth writes often about democratizing experimentation with best practices.
At Statsig, we’re doing our small bit to make experimentation simple and rewarding for every team, big or small. We focus on the infrastructure and tooling so you can focus on learning and iterating. Building a culture of experimentation is in your hands and we’re behind you every step of the way.
[1] This forward reasoning is the basis of the paper Why ask Why? by Andrew Gelman and Guido Imbens: “We do not try to answer ‘Why’ questions; rather, ‘Why’ questions motivate ‘What if’ questions that can be studied using standard statistical tools such as experiments, observational studies, and structural equation models.”
[2] If somehow EC2 got better at ordering exactly what was needed, we’d be out of business.
[3] One way to find practitioners from teams of all sizes is in the growing Statsig Experimentation Community.