New podcast with Susan Athey, The Economics of Technology Professor at Stanford's Graduate Business School. Our conversation is available everywhere (Apple Podcasts, Spotify, YouTube, etc.). Please subscribe, follow, and review.
Susan Athey is one of the top 10 smartest people I have ever met. And I’ve met a lot of people. I’d put her in the same league as Tyler Cowen and Peter Thiel. She’s a genius.
Susan is the Economics of Technology Professor at Stanford's Graduate Business School. She won the John Bates Clark Medal -- an award given to the American economist under the age of 40 who made the greatest contribution to economic thought and knowledge. She basically invented the role of Tech Economist when she served as Chief Economist for Microsoft. She sits on several boards. And so much more.
Susan and I dive into what tech companies are doing wrong and how we can use machine learning to make better decisions. Here are some highlights from my conversation with Susan Athey.
Most algorithms on the web are really good at optimizing for clicks because there is a very tight feedback loop -- and feedback is really important to ML systems. The tighter the feedback loop, the faster the algorithm improves. The problem is that the things with the tightest feedback loops are not always good proxies for what you actually want to measure. Measuring clicks has a lot of downsides: irrelevant clicks will never lead to a purchase. What we really want to measure is long-term value to the customer -- but that could take a year, and a year is too long a feedback loop. So we need to be creative about what we measure and honest about the trade-offs between fast feedback and long-term value.
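To make the proxy problem concrete, here is a toy sketch (my own illustration with made-up numbers, not data from the conversation): the variant that wins on the short-term proxy (click-through rate) is not the variant that wins on a metric closer to long-term value (downstream purchases).

```python
# Hypothetical results for two ranking variants. Variant A is
# "clickbaity": lots of clicks, few purchases. Variant B gets fewer
# clicks but converts better.
variants = {
    "A": {"impressions": 10_000, "clicks": 900, "purchases": 18},
    "B": {"impressions": 10_000, "clicks": 600, "purchases": 45},
}

def click_through_rate(v):
    return v["clicks"] / v["impressions"]

def purchase_rate(v):
    return v["purchases"] / v["impressions"]

# Picking a winner by the tight-feedback-loop metric vs. the
# closer-to-long-term-value metric gives opposite answers.
short_term_winner = max(variants, key=lambda k: click_through_rate(variants[k]))
long_term_winner = max(variants, key=lambda k: purchase_rate(variants[k]))

print(short_term_winner)  # "A" -- wins on clicks
print(long_term_winner)   # "B" -- wins on purchases
```

If you only ever look at click-through rate, you ship variant A and never find out you went the wrong way.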
Almost all companies have this idea that in order to have data-driven innovation, you want to run a lot of A/B tests. To run a lot of tests, they need to be short-term. But if they're short-term, you're going to overlook long-term feedback, which is super important for understanding whether your algorithm is doing what you originally intended. Most people understand this tradeoff. But A/B tests let you ship algorithms often, and engineers are typically rewarded when their algorithm ships.
You have to recognize that there's a tradeoff. If you optimize for the long-run objective, there's a lot of noise, innovation can slow down, and you may not get a signal. If you only optimize for a short-term objective, you're probably going in the wrong direction. But figuring out the right KPIs to optimize for is one of the most powerful things you can do.
When you put in a reinforcement learning algorithm, you have to commit to metrics. Self-learning works well when your metrics capture everything you care about: the system will keep optimizing without you stopping it. But if you have bad metrics, it will optimize for the wrong thing.
At a smaller company, you can make sure engineers pay attention to long-term effects. Ship a new algorithm, keep a holdout group, and evaluate a few months later. This works well when you have a team of 20 people where everyone understands all of the algorithms and decisions being made.
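The ship-with-a-holdout pattern can be sketched in a few lines. This is a minimal illustration of the idea, not anything Susan prescribed; the 5% split, the experiment name, and the function names are my own assumptions. The key property is that assignment is deterministic, so the same user stays in the same group for months and you can compare long-term metrics later.

```python
import hashlib

HOLDOUT_PCT = 5  # assumption: 5% of users keep the old algorithm

def in_holdout(user_id: str, experiment: str = "ranking-v2") -> bool:
    """Stable assignment: hashing means the same user always lands
    in the same group, with no assignment table to store."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < HOLDOUT_PCT

def serve(user_id: str) -> str:
    # Holdout users keep the old algorithm; everyone else gets the new one.
    return "old_algorithm" if in_holdout(user_id) else "new_algorithm"

def evaluate(users, long_term_metric):
    """Months later: compare a long-term metric (e.g. retention or
    customer value) between the holdout and treated groups."""
    holdout = [long_term_metric(u) for u in users if in_holdout(u)]
    treated = [long_term_metric(u) for u in users if not in_holdout(u)]
    return sum(holdout) / len(holdout), sum(treated) / len(treated)
```

Because the hash is stable, `serve` and `evaluate` agree on who is in the holdout even if they run months apart on different machines.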
At a larger organization, it’s harder. You don’t want to stop decentralized innovation. You want to make hundreds of product decisions in a week. So you can have peer reviews, where you have to explain the effects of your work and the metrics that capture this information. Most people know what’s qualitatively going wrong. So you can also identify when A/B tests are the right fit. And sometimes you just have to change the metrics that you’re monitoring.
Setting up a two-year holdout group can help you figure out the right long-term metrics to optimize for. But this will require passing on some revenue, which can be expensive for a small firm. It's easiest to set up before you have revenue; it's hard to claw back and give up revenue once you have investors to report to.
Machine learning allows us to more quickly see interesting things in the world because we can maybe run a million experiments, whereas before we could just run one. But we also risk learning things from experiments that aren't true. High quality data from natural experiments is key to avoiding this. Of course, this is music to our ears at SafeGraph as our core focus is high-quality data for data scientists and machine learning engineers.
We stopped teaching a new generation of people working on machine learning how to think about drawing inferences from observational data. And we stopped teaching them that there are situations where you just can't. Regardless of the data or the AI, there are theorems that say you literally cannot answer certain questions.