Last month, I hosted Dylan Lewis, Experimentation Leader at Atlassian, for a virtual fireside chat on building a culture of experimentation. Dylan brings over two decades of experience in the domain and had a lot of great anecdotes to share.
Back in 2005, when Dylan was working at Intuit-TurboTax as the first Data Analyst on their web team, they had a learning window during tax season, from January through April.
This period essentially provided them one quarter to try out ideas, learn as much as they could, and help customers.
Leadership proposed ideas each Monday morning. The team would then build and launch those experiments by Friday and review the early results the following Monday.
The outcome at the end of the tax season was revealing:
Out of the 40 experiments they ran, 38 didn’t win. Side note: The two winning experiments came from marketers. ;)
The Highest Paid Opinion (HIPO) was not always correct.
The customers—the ones actually using the product and experiencing the treatment variants—helped them understand what would ultimately succeed.
Dylan shared, “The term HIPO was modified to 'HIPPO'. Avinash Kaushik presented it at an Emetrics conference, and Ronny Kohavi published this.” The term has since become commonplace in the vocabulary of experimentation. Dylan noted that turning it into a mascot added a lot of fun and excitement.
“We loved it, and as teams began experimenting, we sent a (stuffed) hippo to the team with a winning experiment for that week. It moved from one place to another depending on which team was winning, and they got to decorate it. By the end of tax season, the hippo would be covered in souvenirs from the teams.”
It didn't stop with the hippos; they also introduced skunks, awarded for experiments that didn't win. Engineers would write the experiment ID on each skunk and hand it to the person whose experiment didn't achieve 100% success. By the end of the tax season, engineers would have collected plenty of skunks, proudly displayed on their tables in intricate dioramas!
Now at Atlassian, Dylan is working to scale a mature experimentation program. Modern-day experimentation platforms have become more robust in terms of metric trustworthiness and statistical capabilities, enabling greater experimentation velocity.
Yet Dylan noted that culture remains the biggest challenge for most organizations. One example he shared of how culture can make a difference involved a key metric on his dashboard: the percentage of failed or restarted experiments, a figure that should ideally be low.
One of their experimentation teams was experiencing a 40% restart rate. To address this, they organized a launch party, during which the experiment was made available to those in the room. This allowed them to verify whether the experience worked as expected before it went out to customers.
One of the critical factors for success here was including someone who wasn’t part of the experiment to ensure an unbiased perspective.
The results were striking: the team's restart rate dropped from 40% to 5%.
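For readers who want to track a similar number, here is a minimal sketch of how a restart rate could be computed. The `Experiment` record and its `restarted` flag are hypothetical names chosen for illustration; they are not taken from Dylan's dashboard or from any particular platform.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    # Hypothetical record: an ID plus a flag for whether the
    # experiment had to be stopped and relaunched.
    experiment_id: str
    restarted: bool


def restart_rate(experiments: list[Experiment]) -> float:
    """Return the percentage of experiments that were restarted."""
    if not experiments:
        return 0.0  # no experiments launched, so nothing was restarted
    restarted = sum(1 for e in experiments if e.restarted)
    return 100.0 * restarted / len(experiments)


# Illustrative data only: 8 restarts out of 20 launches.
launches = [Experiment(f"exp-{i}", restarted=(i < 8)) for i in range(20)]
print(f"Restart rate: {restart_rate(launches):.0f}%")
```

Computed per team and per period, a figure like this makes it easy to see whether a practice such as the launch party is actually moving the number.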
Our conversation was filled with valuable takeaways for operationalizing a culture of experimentation, with themes including identifying roadblocks, conducting reviews, prioritization, and ensuring trustworthiness.
This fireside chat is one you won’t want to miss! Watch below. 👇