The culture in News Feed wasn’t just to use experiments to verify our intuitions about which changes might lead to improvements. We also used experiments the way any good scientist would, to learn more about our field: we started with a question, formed a hypothesis, and then constructed an experiment that would let us test that hypothesis and, hopefully, answer the question.
The questions were usually centered on understanding more about the people who used our products, such as “Is watching a video or liking a video a stronger signal of someone’s interest in it?” or “Does receiving more likes on content you create make you more interested in creating even more content?”
Asking these more open-ended questions, and then answering them by running experiments, allowed us to decide which products to build in the first place. Instead of relying on intuition about what to build or which features had the greatest potential, we could prioritize the features we believed would have the biggest impact, run experiments once those features were implemented, and finally decide which features to ship. Experimentation guided the full process, from brainstorming to prioritization to launching.
Here’s an example of this method in action.
The situation: we want to boost content creation on the platform, because we know having more content is good for the ecosystem. We have engineers ready to build new tools and entry points for creation, but we don’t know what kind of content creation is best for the ecosystem—is it photo content, video content, posts that link to external blogs, or is it all the same?
We don’t want to spend time building things for all of these different types; we’d rather identify the most impactful areas to build and focus on those.
The solution: the first step is to figure out which of these types of creation is most beneficial to the ecosystem. To do this, we can construct a set of experiment conditions, each of which uses minor adjustments to the ranking algorithm to increase the prevalence of a different type of content in people’s News Feeds by the same amount, simulating what it would be like if creation of that type increased (one condition would increase photo content by 5%, another would increase video content by 5%, and so on).
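A minimal sketch of this setup, with entirely hypothetical names and scores (real feed ranking is far more complex, and a production system would tune the boost to hit a target prevalence rather than apply a fixed multiplier):

```python
import hashlib

BOOST = 1.05  # hypothetical score multiplier standing in for a "+5% prevalence" condition

def rank_feed(items, boosted_type=None):
    """Rank (item_id, content_type, base_score) tuples by score,
    multiplying the boosted type's score so it surfaces more often."""
    def score(item):
        _, content_type, base_score = item
        return base_score * BOOST if content_type == boosted_type else base_score
    return sorted(items, key=score, reverse=True)

def assign_condition(user_id, conditions):
    """Deterministically bucket each user into one experiment condition."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return conditions[int(digest, 16) % len(conditions)]

conditions = [None, "photo", "video", "link"]  # None = control
items = [(1, "photo", 0.90), (2, "video", 0.88), (3, "link", 0.85)]

# In the "video" condition, the video item now outranks the photo item.
print([item_id for item_id, _, _ in rank_feed(items, boosted_type="video")])
```

The deterministic hash keeps each user in the same condition for the duration of the experiment, which is what makes per-condition metrics comparable.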
We then choose a metric that, for us, defines ecosystem value; in this case it could be time spent on the platform. Next we let the experiment run and observe which condition increases time spent the most. If the video content boost wins, we conclude it’s most efficient to dedicate our time to developing video creation tools; if the photo content boost wins, we dedicate our time to photo creation tools instead.
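The read-out step can be sketched as a simple per-condition comparison. The condition names and numbers below are made up for illustration; a real analysis would also check statistical significance and confidence intervals before declaring a winner:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-user results: (condition, minutes spent per day)
results = [
    ("control", 31.2), ("control", 29.8), ("control", 30.5),
    ("photo+5%", 31.0), ("photo+5%", 30.9), ("photo+5%", 31.4),
    ("video+5%", 33.1), ("video+5%", 32.6), ("video+5%", 33.4),
    ("link+5%", 29.9), ("link+5%", 30.2), ("link+5%", 30.0),
]

def winning_condition(results):
    """Average the metric within each treatment arm and return the best one."""
    by_condition = defaultdict(list)
    for condition, minutes in results:
        by_condition[condition].append(minutes)
    averages = {c: mean(v) for c, v in by_condition.items() if c != "control"}
    return max(averages, key=averages.get)

print(winning_condition(results))  # -> video+5%
```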
The end result is that we were able to use experimentation to quickly identify where to focus our efforts based on real data, instead of relying on our intuition.
We also applied experimentation in other novel ways — from automating the process of running experiments, reading results, and iterating on configurations, to training person-level models on the results of experiments in order to personalize configurations to each individual.
Experimentation can bring a lot of value to any organization working on improving their products. Whether you are working on a very early stage product and want to verify that the changes you are making are truly good for metrics before launching them fully, or you are a mature product that is trying to streamline the process of tuning the many parameters that drive your product behavior, Statsig can help you understand and grow your product.