James Le

Datacast Episode 87: Product Experimentation, ML Platforms, and Metrics Store with Nick Handel

The 87th episode of Datacast is my conversation with Nick Handel, the CEO and Co-Founder of Transform.

Our wide-ranging conversation touches on his education at UCLA, where he graduated at the age of 19; his time as a quant analyst at BlackRock; his four-year stint at Airbnb scaling their data science team and managing their ML infrastructure platform; his time as the Head of Data at Branch; his current journey with Transform making data accessible by way of a metrics store; lessons learned from building different metrics capabilities, pushing interoperability within the Modern Data Stack, finding design partners and driving go-to-market, identifying cultural values, choosing the right investors; and much more.

Please enjoy my conversation with Nick!

Listen to the show on (1) Spotify, (2) Apple Podcasts, (3) Google Podcasts, (4) TuneIn, and (5) iHeart Radio

Key Takeaways

Here are the highlights from my conversation with Nick:

On Studying Math at UCLA

Source: https://ww3.math.ucla.edu/majors-minors-specializations/

When starting at UCLA at age 16, I decided that math and economics were probably two of the most useful and broadly applicable fields that I could study. So I chose them as my majors. I ended up graduating quickly from college after that because when I interned at BlackRock in 2011, I was somewhat concerned about a double-dip recession. I didn’t want to graduate and not have a job. So I decided to rush my last year and graduated at 19.

I took linear algebra with a great professor. He was writing a textbook and asked me to do an honors research seminar with him. I basically read a chapter of his book and highlighted things I didn’t understand. We would get together once a week, go through the chapter, and talk about what made sense and what didn’t. That was probably the most fun because I learned more about algebra than in my other classes. That one-on-one time gave me a small-school feel since UCLA is a big school. Ultimately, it made me love linear algebra as a topic.

On Being a Quant Analyst at BlackRock

Source: https://fortune.com/company/blackrock/fortune500/

I interviewed at BlackRock after my sophomore year and got an internship. I wanted to apply my math skills, but I had heard a lot about finance and was skeptical of getting into some parts of it. What I found at BlackRock was a very high-empathy environment and the kind of culture I wanted to be a part of. I felt it was unique relative to the rest of finance. Beyond that, it’s a large firm with many intelligent people, so I felt I could navigate to the right place if the team I landed on didn’t end up being the right fit. I found a team that rarely hired undergraduates, and because of the hiring freeze in 2011, they couldn’t hire anyone else, so I was their only option. I got to join a team of mostly Ph.D.s and impressive quant researchers, which felt like an opportunity too good to pass up.

We did some interesting math, but not the kind of ML problems people pursue at companies like Airbnb. Most of what we did was feature creation. We gathered a large amount of macroeconomic data, such as measures of political risk and reports coming out of different countries (for every country and every asset class). We built features out of those and tried to use macroeconomic foundations to create signals. One outcome for me was a slow feedback loop in my first job. The macroeconomy takes years to play out, so I had no idea whether I was doing a good job. It was the encouragement of my coworkers that kept me going, which ultimately led me to look around and think: “How could I do something that gives me more signal on whether I’m doing a good job and allows me to learn faster?”

One of the more exciting things I worked on was reinsurance markets, which is the epitome of finance and risk re-distribution. I still reflect on that a lot because it’s one of the ways that finance can actually add a lot of value to society.

On Doing Data Science Growth for Airbnb

Source: https://blog.transform.co/data-talks/my-experience-with-airbnbs-early-metrics-store/

The majority of what I did in the early days was product experimentation. I joined a few weeks before Airbnb released a tool called the Experiment Reporting framework. It’s actually the origin of what we are working on now with Transform. A sub-part of that tool, called metrics repo, allowed me to say: “Hey, these are the metrics I want to track for my experiments.” It would then build the Hive and Airflow pipelines to capture those metrics at the granularity of the subject of the experiment and over the specific window of time that each subject was exposed to the experiment. It would return all of these metrics for each treatment group. That allowed me to go from doing a bunch of manual analysis (maybe calculating 5 to 10 metrics on any given experiment) to walking in every morning and looking at the 10 to 20 experiments I was running live at that moment, with data up to date as of the night before, when all the pipelines ran. This was an amazing time when we went from running maybe one experiment every week or two to running 10 or 20 experiments at any given time.
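To make the mechanics concrete, here is a rough sketch (hypothetical table and column names, not Airbnb’s actual pipelines) of the kind of query such a tool might generate: one row per experiment subject, counting a metric only inside that subject’s exposure window, ready to be aggregated by treatment group.

```sql
-- Hypothetical sketch: compute "bookings" per experiment subject,
-- restricted to each subject's own exposure window.
SELECT
  a.experiment_id,
  a.treatment_group,
  a.subject_id,
  COUNT(b.booking_id) AS bookings
FROM experiment_assignments a
LEFT JOIN bookings b
  ON b.user_id = a.subject_id
 AND b.booked_at >= a.first_exposed_at
 AND b.booked_at <  a.exposure_window_end
GROUP BY 1, 2, 3;
```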

I was working on emails and anything related to notifications (user login, sign-up flow, or anything related to onboarding). I was working on the engagement and activation teams — doing a little product iteration and making a discovery every once in a while. My favorite experiment: we found a big drop-off in one of our email experiments if we sent the email more than 48 hours after users were on the site. Basically, we got far fewer bookings than if we sent it within 48 hours. We realized that we were logging our users out after 48 hours, so when they came back logged out, they had extra friction to go and book accommodations (versus coming back still logged in from their previous session). It was literally a one-character change (from 48 hours to a week) that had a massive impact. It made me realize how powerful the right insight can be. Ultimately, this was the age of the growth hacker. The ability to change just a few characters in your codebase and get millions of dollars back to the business is the ultimate growth-hacker move.

On Experimentation For Airbnb Trips

The idea of adding an experiences product had existed in many different iterations. In 2016, there was a shift to take it much more seriously. A team of probably about eight folks had been working on it for more than a year. They laid an amazing foundation of user research and an understanding of what that product could actually look like. The decision was to launch at the end of 2016.

At the beginning of 2016, they started ramping up this team to build an entirely new dimension of Airbnb’s product. I was lucky to be called in as somebody who could potentially help on the data side. I was about two years into my time there and had a bunch of experience with product experimentation and the whole data stack at Airbnb. My charter was to join this team (about 12 people at the time) and help them figure out how to do product experimentation on the platform we already had, ahead of their public launch at the end of the year. I started with some user testing that included product experimentation and laid the foundation for all the data ingestion that would come off of that — all the event logging and data engineering work to turn production databases into nice tables for consumption by the rest of the company. It was a fun time when we had the charter to use Airbnb’s infrastructure or build our own if it made sense. We generally used Airbnb’s infrastructure, but it ended up being a great place to test the event logging framework that Airbnb now uses across its products. I ended up partnering with an engineer on the infrastructure team to build that out.

The core goal was to (1) launch this product and (2) fit into the fold of Airbnb’s infrastructure in a way that could benefit from all the great data tools that had been built. One of the sub-goals was to enable ML at launch. We wanted to make things more personalized. The hypothesis was that booking an experience requires a lot more personalization than booking a home because people have strong preferences for what they do with their time (versus the kinds of places they stay at). There is also probably more variance across these experiences, which pushed me to do more ML. The goal was to get the datasets in a place where we could build more sophisticated ML models at launch. Over seven months, the team grew from 12 when I joined to 150 at launch. It was a great product launch and an exciting time.

On Scaling Airbnb’s Data Science Team

Source: https://www.linkedin.com/pulse/one-data-science-job-doesnt-fit-all-elena-grewal/

I was lucky to benefit from the fact that Airbnb was a fantastic place to work. It was very attractive to candidates, and because of that, we got exceptional applicants. We also went out to recruit exceptional people. On the other side of that, we had a really high hiring bar, not just technically but also on culture fit. Airbnb is an extremely culture-focused and culture-cognizant place, so we cared a lot about hiring empathetic people and protecting our culture. The biggest challenge around hiring was figuring out how to balance and assess those two things together. The hiring process was built around figuring out whether a person was a good technical fit and putting them in scenarios where they would have to collaborate with people on the team (to get a sense of how well they could collaborate).

I didn’t interview a ton at BlackRock, so this was new for me. I was interviewing probably two to three people a day at peak times as we scaled this team from 20 to 85+ by the time I shifted to product. Two things stood out in what I looked for:

  1. I came in not knowing Python well, which was pretty essential to the job when I joined. So the biggest thing for me was interviewing for an analytical mindset and the technical chops to pick up our tools, even if a candidate’s language preferences weren’t the same as the ones we were using.

  2. The other thing I looked for was the desire to contribute outside of day-to-day work. We had a team of people who worked on interesting side projects and contributed a lot of value to the data science work. We had internal libraries like Rbnb and AirPy that allowed us to do interesting data analytics internally.

On Leveling Up as a Product Manager

The best way to transition to product is to do some product work and have great product mentors. For me, that started with the product lead on growth, Gustaf Alstromer, who handed me small projects I could own and run with under a little bit of oversight. At one point, I was a data scientist also doing some product work. I collaborated with one engineer and half a designer’s time as a nimble team doing little projects. I helped launch Airbnb Trips and then worked on various ML projects. The product lead for infrastructure, James Mayfield, came over and said: “Hey, you’ve done some product work and worked on this big launch. You’ve been sending emails to the whole company and sharing the metrics. I think you have the attributes to come over and be a product manager. I’m forming this new team around ML infrastructure and would like to have you.” That was the official transition.

Product is an interesting role because you need some designer and data analyst skills while being technical enough to understand all of the engineering work. But you’re also expected to have an entirely different set of skills around communication, building clarity, and the ability to build relationships and influence across the organization. That makes it a complicated role to jump right into, but if you have the right mentors and can spend 20% of your time on it, the transition becomes much easier over time.

On Designing Airbnb’s End-To-End ML Platform Bighead

Source: https://ieeexplore.ieee.org/document/8964147/authors#authors

Bighead is Airbnb’s ML platform. We started working on it in 2017 when I transitioned to this product team. Originally, it was four people pulled from different parts of the data infrastructure team who wanted to work on ML infrastructure, plus myself, going through a three-month trial as a product manager. We outlined a plan to build five core components and designed the interfaces between them.

  1. It started with a data management tool called Zipline, now commonly referred to as a feature store. There was no reference to feature stores on the Internet when we started building this, so it was one of the first takes on what a feature store could be.

  2. The second piece provided access to large compute environments, a tool called Redspot. It consisted of hosted Jupyter notebooks on large machines, allowing you to manage the environment and access it from your laptop.

  3. The third piece was model lifecycle management, later renamed Bighead service. Initially, it was called Model Repo and allowed you to train new models, save them, version control them, and run them as needed.

  4. The fourth piece of the workflow consists of two systems: one was called ML Automator, which allowed you to put in a chunk of code from a notebook and have that be a production model. The other was a serving environment called DeepThought, which allowed you to run that code on a machine in a low-latency environment and productionize/serve models.

  5. Holding all of that together was a common library called the Bighead library.

You can tell by the name that we didn’t take ourselves too seriously. We built great infrastructure and had a lot of fun doing it. The team grew from 4 to 13 in about a year. It was fun to scale my product skill set at the same time as scaling this team and building this tool. This was the 0-to-1 moment for Bighead.
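To give a feel for the model lifecycle piece described above, here is a toy sketch (invented names, not Airbnb’s actual API) of what a “Model Repo” style component does: save models under incrementing versions and load an exact version back for serving.

```python
# Toy sketch of a versioned model registry -- NOT Airbnb's actual code.
import pickle
from pathlib import Path

class ModelRepo:
    def __init__(self, root: str = "./model_repo"):
        self.root = Path(root)

    def save(self, name: str, model) -> int:
        """Persist a model under the next version number and return it."""
        model_dir = self.root / name
        model_dir.mkdir(parents=True, exist_ok=True)
        existing = [int(p.stem) for p in model_dir.glob("*.pkl")]
        version = max(existing, default=0) + 1
        (model_dir / f"{version}.pkl").write_bytes(pickle.dumps(model))
        return version

    def load(self, name: str, version: int):
        """Load an exact model version, e.g. for a serving environment."""
        return pickle.loads((self.root / name / f"{version}.pkl").read_bytes())
```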

We served a number of different teams. The really hard part was trying to serve the three main ML applications at Airbnb: pricing, search, and advertising. Those teams had a ton of nuanced requirements around what their systems did, making it really hard for our system to support them in the early days. Instead, we started with the longer tail of ML applications — for instance, the data scientists on business travel who wanted to predict whether an upcoming reservation would be business or personal travel. They were trying to make some incremental product experience slightly better, not to serve a massive core system like Airbnb pricing. The challenge was building out the various features we needed to move up that stack and get teams to use more and more of what we were building.

It reminds me a lot of what I’m doing today: trying to figure out who I can support the soonest and how to support the widest range of use cases. But ultimately, the goal is to get everyone onto one system that has all the features they need — and hopefully more than just the features they need, the ones that empower them to do better, more exciting things with ML applications (in that case) and analytics applications (in the case of Transform today).

On Building ML Infrastructure at Branch

I wanted to bring some of the ideas learned at Airbnb to Branch. It was an experiment, but I believed that a lot of the large infrastructure at Airbnb could be built in a very simplified way at Branch, given that none of the pieces of infrastructure that we built at Airbnb were things that we could buy. The market wasn’t quite there. At a foundational level, I wanted to build the more impactful pieces of ML and analytical infrastructure — metrics repository, experimentation tool, and feature repository.

It took maybe two engineers about a month or two to build. We scaled everything down and had total control over all of the systems so that we could build them in a very specific way. The core problem was that our ML models would change or break, the systems were fragile, or we were training on different features than we were running inference on. That led to poor performance in our ML applications. We built a basic Python library that we could run for both batch model training and model inference. It didn’t have to be that performant, since people could wait 30 seconds for their loan application to be approved or not. Some of our fraud applications had to be a little faster, but we got by making those models smaller.
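A minimal sketch of that idea (assumed names and toy features, not Branch’s actual library): each feature transformation is defined once and called by both the batch training job and the online inference path, so a model never sees differently computed features.

```python
# Toy feature library shared by training and inference -- hypothetical names.
from datetime import datetime

def days_since_signup(user: dict, as_of: datetime) -> float:
    return float((as_of - user["signup_date"]).days)

def repayment_rate(user: dict, as_of: datetime) -> float:
    loans = [l for l in user["loans"] if l["due_date"] < as_of]
    if not loans:
        return 0.0
    return sum(l["repaid"] for l in loans) / len(loans)

FEATURES = [days_since_signup, repayment_rate]

def build_feature_vector(user: dict, as_of: datetime) -> list[float]:
    # Imported verbatim by BOTH the offline training job and the
    # online scoring service, which is the whole point.
    return [f(user, as_of) for f in FEATURES]
```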

Before this talk, not many people were talking about feature stores. Airbnb had just released their paper on Bighead/Zipline. Uber had written their post about Michelangelo. Google had written a few articles around feature serving. This was my take to say: “Hey, this is not just for Uber, Airbnb, or Google. Everyone can benefit from this.” It ended up being really impactful at Branch. Our ML models improved dramatically. Ultimately, enhancing our ML models allowed us to approve more of the right loans, which let us expand access to financial services. It’s fun to see that very direct impact on the customers of our tool.

On Pursuing Startup Ideas in Analytics

I’m a bit jaded on ML after all of that work. The hardest thing about the ML space is that ML features can differ across the company, and everyone is still happy. As long as you’ve chosen the best feature for the model you’re running, it doesn’t matter that somebody else has chosen a slight variation of that same feature, as long as you both have the best models. That’s not the case for metrics. People want consistency, and if you have slight variations of a metric all over the company, it leads to chaos. So people have a natural force pushing for consistency of metrics that they don’t have for features.

That was one of the core hypotheses we were testing at Airbnb, and ultimately, it was something we could enforce at Branch but couldn’t enforce at Airbnb. There ended up being very little reuse of features across disciplines. I think a feature store is an incredibly important piece of tooling from a technical perspective, as it ensures the right features are going into models in training and production. However, it had a much less profound impact from an organizational perspective: Is this going to make data scientists and ML engineers more productive and allow them to build more ML applications faster? Maybe to some extent, but less so than I originally believed in 2017.

That being said, we’ll continue to see more valuable applications of ML. I just think that the overall risk of ML applications is much greater than the risk-reward profile of analytics applications. I have gone back and forth on both of these. I worked primarily on analytics for a number of years, then on ML for a number of years, and now I’m back to analytics.

On Founding Transform

I mentioned joining Airbnb just a few weeks before this metrics repo thing came into existence. It was a profound change in my workflow and a foundational thing that got me excited. I don’t think I had strong opinions about it at the time. I had a bunch of different feature requests and was working closely with the infrastructure team. My other two co-founders were both on that team, so that was how I started working with them.

It was an incredible tool, and I didn’t fully grasp how far it could go; that understanding was built gradually over the years. First, it supported experimentation applications. Later on in its life, I started launching fake experiments because I wanted it to create metrics for me. That was the first time I began to see how this metrics tooling could be impactful for other applications. A few other data scientists took this further by building a library called Metrics Query Language, which is the name of our library at Transform now. It didn’t get much traction at Airbnb, but it allows you to pull metrics into Jupyter Notebooks and is now a pretty central part of our product.

A bit of history: at Airbnb, I switched over to work primarily on ML applications. A few others took that metrics store idea much further with Minerva. Ultimately, they built tools to build Airflow pipelines, create Hive jobs, serve metrics to the data warehouse, export metrics to Druid, etc. — a gradual iteration from the original metrics repo to Minerva. Along the way, I saw a bunch of reasons why that wouldn’t generalize. When I was at Branch, I tried to build something similar, and I realized how much work it took. I had been spoiled at Airbnb with all the great tooling and great people I was working with. It’s much harder when you’re on a team of five folks trying to build out the entire data stack to solve your problems, and there weren’t any tools we could buy. So I had this realization that this tool needed to exist, and nobody was building it in a generalized, public way. I was also a little bit burned out.

I decided to leave Branch to go and think about this idea. I had a few other ideas at the time — things in feature repositories, ML applications, and analytics tooling. Then I realized I had basically spent the last seven years thinking about different variants of building derived data and serving it to different applications. The piece that was missing and broadly needed was the metrics tooling we had. I talked to my old manager and mentor at Airbnb, James Mayfield, who was the director of product for infrastructure. Originally, I went to him for advice, and then we started talking about it. James and I also worked closely with our third co-founder, Paul Yang (who had worked with James back at Facebook in 2008). The three of us outlined the product and business ideas at the end of 2019 / early 2020. We formally started the company in January 2020, which was a wild time. By March, we were a fully remote company trying to recruit our first employee. Well, nobody wanted to switch jobs, so that was a tough time to start. But the fact that we got through that period was a good sign.

On Metrics Store

A metrics store has four key components: semantics, performance, governance, and interfaces.

Starting with semantics: these are the abstractions you choose to capture what a metric is. We see a metric as an abstract concept because it can be aggregated to different granularities, and you can do different things to it. To put that in perspective with an example from Airbnb: we tracked bookings, and that’s a metric. But there are a bunch of different ways to look at bookings. Maybe you want year-over-year bookings, maybe you want bookings filtered by country, maybe you want year-to-date bookings, etc. One way to do this is to write SQL to define all those variants, but there are most likely hundreds of metrics at small companies and thousands at large companies, with different levels of importance: from the metrics everyone consumes to the one metric someone made one time for one analysis. How you capture those abstractions is really important because it allows you to reuse logic across different variants of the same metric. Honestly, the primary thing we took away from Airbnb’s metrics store is the challenge of semantics — it’s hard to change them once you’ve built something. So we took the initial design of Transform’s semantics very seriously.
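As a rough illustration of that abstraction (an invented schema, not Transform’s exact syntax), a single declarative definition like the one below is what lets tooling derive every variant, from year-over-year to bookings by country, without rewriting SQL for each:

```yaml
# Hypothetical metric definition: one source of truth for "bookings".
metric:
  name: bookings
  description: Total confirmed bookings
  type: count
  source: fact_bookings      # hypothetical warehouse table
  measure: booking_id        # what gets counted
  time_dimension: booked_at  # enables YoY, YTD, cumulative variants
  dimensions:                # enables "bookings by country", etc.
    - country
    - product_type
```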

The second piece is performance: companies with large datasets just want metrics quickly. You can’t have some SQL query that goes to a normalized source table and builds out a metric by aggregating hundreds of billions of rows. On the other side, at a startup with smaller datasets, you still want it to be fast because you want to answer questions quickly and move on. The performance challenges across different companies vary widely, but ultimately, the goal is to serve metrics quickly. The question then becomes: “Do I calculate this in advance, or do I calculate this on the fly?” If you know what you want, you can calculate it in advance and serve that — that’s frequently how companies serve metrics to different applications. If you don’t know what you want, you have to do it on the fly. The challenge then is: “How do I reuse the datasets I’m building, whether on the fly or statically? Can I merge statically built datasets with dynamically built ones to get the results I’m looking for quickly?”
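Here is a simple SQL sketch of that trade-off, with hypothetical table names: pre-aggregate the variants you know you need, then let ad hoc questions reuse the rollup instead of re-scanning the raw fact table.

```sql
-- (1) Calculated in advance: a nightly rollup of a known variant.
CREATE TABLE bookings_daily_by_country AS
SELECT CAST(booked_at AS DATE) AS ds, country, COUNT(*) AS bookings
FROM fact_bookings
GROUP BY 1, 2;

-- (2) Calculated on the fly: an ad hoc cut (year-to-date by country)
-- that reuses the pre-built rollup rather than the raw rows.
SELECT country, SUM(bookings) AS bookings_ytd
FROM bookings_daily_by_country
WHERE ds >= DATE_TRUNC('year', CURRENT_DATE)
GROUP BY 1;
```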

The third piece is governance: there are a lot of interesting things around lifecycle management, tiering of metrics, ownership, workflows, etc.

The last one is interfaces: For a metrics store to truly be a single source of truth for all of your metric definitions, you need to be able to pull those metrics into every single place that you would want to consume them. Otherwise, you have to go around this tool. And it’s not a single source of truth if you have to go around it because you’re expressing logic in some other place.

On Transform’s Capabilities

Our Metrics Framework, Metrics Catalog, and Metrics API are geared towards solving the four pillars mentioned above.

Source: https://blog.transform.co/product-focus/metrics-framework/

The purpose of the Metrics Framework is to lay out the definitions of metrics and their semantics, then prepare to serve those metrics efficiently through various interfaces. Some amount of governance comes built in because metrics in our framework are defined in YAML and checked into Git, so the standard code review process a company follows applies a layer of technical governance to the metric definitions.

Source: https://blog.transform.co/product-focus/metrics-catalog/

Metrics Catalog is geared towards governance and interfaces. I think there’s a unique attribute of metrics that is undervalued and not widely discussed: a metric is fairly stable across a company’s history. Metrics don’t move around. Their definitions might change, but they iterate slowly and eventually get fairly locked down (especially a company’s most important metrics). That makes them an interesting vehicle for capturing institutional knowledge over time — the ability to say this thing happened at this moment in time. Done well, this can probably save data organizations an unbelievable amount of time.

  • At a basic level, the catalog shows the definition of the metric, the owner of the metric, some governance, etc. It also gives the ability to ask basic questions of the data, slice metrics by different dimensions, or apply year-over-year, year-to-date, or cumulative-sum views. That has the potential to dramatically reduce the repetitive task of answering: “What happened in the past?”

  • On the other side, it stretches over into interfaces. We can ask basic questions of the metrics in the UI — what are bookings by country, and so on? — and non-technical consumers can ask those. Because we have the semantics of how these metrics are defined, we can construct the SQL statements (sketched below) and serve those datasets to consumers who may have a limited understanding of SQL or no knowledge of where the data comes from. It’s a nice, safe experience with guardrails for asking these questions, all built on top of the Metrics API.
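For instance, and purely as an illustration against the hypothetical definition sketched earlier, this is the kind of SQL a metrics store might construct when a catalog user asks for bookings by country:

```sql
-- Generated from the metric's semantics; the consumer never writes this.
SELECT
  country,
  COUNT(booking_id) AS bookings
FROM fact_bookings
GROUP BY country;
```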

Source: https://blog.transform.co/product-focus/metrics-api/

For our metrics store to be a true single source of truth, it has to connect to every possible tool. So we built a Metrics API to connect our foundational pieces to many different tools. Sitting at the core is a GraphQL API, and we’ve built various clients around that. We have some open-source React components that allow companies to build front-ends on top of our Metrics API — so they can surface metrics in our catalog, their own internal tools, and connections to other tools. The GraphQL interface is low-level: if you’re pulling directly from it, you can do just about anything from a software engineering perspective. However, the end consumers are primarily data analysts, and for them, we have three foundational interfaces:

  1. A command-line tool to develop metrics.

  2. A Python interface to pull metrics into Jupyter Notebooks or whatever Python environments they consume from (see the sketch after this list).

  3. A JDBC interface to connect directly to BI tools so that they can write API requests within SQL. Semi-technical people can safely query metrics and pull them into their existing workflows.
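As a hedged sketch of that Python path (invented endpoint, GraphQL schema, and client names, not Transform’s actual MQL API), the analyst experience is conceptually: ask for a metric plus dimensions, and get a DataFrame back with all metric logic resolved by the metrics store.

```python
# Hypothetical metrics-API client -- illustrative only.
import pandas as pd
import requests

METRICS_API = "https://metrics.example.com/graphql"  # hypothetical endpoint

def query_metric(metric: str, dimensions: list[str],
                 start: str, end: str) -> pd.DataFrame:
    """Issue a GraphQL query for one metric and return rows as a DataFrame."""
    query = """
      query($metric: String!, $dims: [String!]!, $start: String!, $end: String!) {
        metricValues(metric: $metric, dimensions: $dims, start: $start, end: $end) {
          rows
        }
      }
    """
    variables = {"metric": metric, "dims": dimensions,
                 "start": start, "end": end}
    resp = requests.post(METRICS_API,
                         json={"query": query, "variables": variables})
    resp.raise_for_status()
    return pd.DataFrame(resp.json()["data"]["metricValues"]["rows"])

# Analyst-side usage in a notebook:
df = query_metric("bookings", ["country"], "2021-01-01", "2021-06-30")
```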

On The Modern Data Stack

Source: https://blog.transform.co/data-talks/history-of-the-metrics-store/

Because of those components I just named, there are obviously all of the different data warehouses we need to be able to connect to and read from. We don’t have any of our own storage or compute. We write queries against the company’s underlying data warehouse and serve the results to their downstream tools. Downstream, we connect to just about everything companies want to use: BI tools, product analytics, experimentation, operational analytics, etc. The metrics store plays an interesting role in the modern data stack (MDS) because so much of the MDS is about interoperability. There is good interoperability across the tools being built in the MDS, but there’s little consistency in the logic being applied. The purpose of the metrics store is to apply logic consistently — to let companies interoperate between all of these tools while still having correct metric definitions and the data pipelines to perform denormalization and serve data downstream.

The biggest part of our future is continuing to amplify the MDS by focusing on interoperability. That means continuing to work on generic interfaces for consuming from our tool. We expose 100% of what’s in our tool; all of our metadata can be read through the two GraphQL APIs that we have. On the other end of this is how flexible our framework can be in expressing all of the different metrics that companies want. These metrics are hard to define. Take a conversion metric: if you ask five data analysts to define one, you will end up with at best three or four distinct definitions. How do you turn that into an abstraction that allows people to say: “Hey, this is a conversion metric that happens between these two events within seven days”? The core of what our tool does is push better interoperability, better interfaces, and better flexibility of our framework so it can connect to other tools.
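As for what such an abstraction might look like (again an invented schema, not Transform’s exact syntax), a conversion metric could be declared as two events and a window, rather than five subtly different hand-written SQL queries:

```yaml
# Hypothetical conversion metric: one shared definition instead of
# several analyst-specific variants.
metric:
  name: signup_to_booking_conversion
  type: conversion
  entity: user_id            # the subject being converted
  base_event: signup
  conversion_event: first_booking
  window: 7d                 # count conversions only within 7 days
```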

On Hiring

Hiring is by far the most important thing for a founder. From the early days, James, Paul, and I thought a lot about the culture and the type of environment we wanted to work in. We had a lot of both positive and negative experiences between us, and we wanted to take everything we had learned from the environments we’d been in and make Transform a really great place to work. That, in my mind, is the foundation of building a great culture and a great team. We laid out our values as three founders even before talking in depth about the product or the company. We spent a lot of time talking about the values and ways of working we wanted to instill in the company. It’s probably one of the most impactful things we did because it allowed us to recruit our first few employees — who ultimately helped us continually reshape those values and ways of working as we grew. People really want to understand what they’re getting into, so setting your values and ways of working intentionally and early lets you recruit the right people and build an environment where they will be successful and want to protect that environment. This is incredibly important and underrated at most startups, where people just assume that if you get great people together, they will make a great culture. That might be true, but I don’t want to take that risk. I’d rather be intentional about it.

Talent leads to more talent. The best talent wants to work with the best talent, so you have to find really great people from the beginning. The most underrated skill for a founder is whether they can recruit amazing people. You absolutely have to decide what the hiring bar looks like. The decision to lower that bar will impact your future recruiting, to the extent that hiring somebody for a short-term need might have longer-term consequences that prevent you from hiring the right people in the future. When the right-seeming person is in front of you and you have such a need for the role, it’s tempting to lower your technical or culture bar and hire them. Hiring great people and keeping the bar high is what allows you to continue hiring great people. I’ve been in environments where this didn’t happen, and it’s something we’re working really hard on to maintain an exceptional level of talent at Transform.

Hiring is a strategy. You have to come up with a plan for who you want to hire and when, and then you need to execute it. If you don’t, you will most likely find yourself in a situation where you just need to hire someone. This includes defining roles before you need to fill them, getting feedback on those roles, collecting leads, and speaking informally with folks in different spaces to learn before you even go out to hire. Once you get to the right place, narrow your focus and hire the right person. If you’ve done the work upfront, you should have a pool of candidates and be able to assess them.

On Finding Early Traction

Nobody knew what a metrics store or a metrics layer was in 2019. When we first started talking to companies — before we raised any money, and even before James and Paul left Airbnb — I spoke to 50+ companies. The feedback ranged from “I don’t understand what this is or how it’s different from my existing tools” to “I saw the tool Airbnb had, and it’s amazing. I desperately want this.” A personal network is the greatest way to bootstrap early design partnerships.

The other part is understanding what those relationships are. You shouldn’t be optimizing for making money or selling your product vision to them; so much of the relationship has to be getting their feedback and listening. One of the hardest things to do as a founder is to balance two types of conversations:

  1. You want to sell your vision and ask them to buy/invest in your product.

  2. You ask questions to get their opinions and thoughts.

Ultimately, you will find people who believe in what you’re doing and identify those who have product-vision fit — when you build something that fits a vision both you and they believe in. Then they will choose to invest in it. Acknowledging that is incredibly important.

On Fundraising

As much as possible, especially in the early days, you ideally shouldn’t put a time bound on when you go out to fundraise and think about taking money. So many people talk to investors and decide in a matter of weeks whether they’re going to work with this person for the next 10+ years. That, to me, is absolutely crazy. The way I thought about our fundraising: I need to pitch my company and share ideas, but I also need to find somebody who believes in the values, the mission, and the culture of Transform as much as we do. The longer I spend with them talking about ideas, the more I can see whether they are truly committed — whether they are a good culture fit with the company (just like an employee or a co-founder would be).

We’ve found that in Index and Redpoint. We talked to many different great firms, but the relationships we have with Bryan Offutt and Shardul Shah at Index and Patrick Chase and Tomasz Tunguz at Redpoint are special. We couldn’t imagine having a better relationship with investors. Ultimately, the founder is going to do the majority of the work to make the company successful. But there are moments when you really need your investors to give you some advice, or you just need them to be there to help with some difficult situation. Taking the time to find the right person who you want to be the first person to call is the most important thing.

Show Notes

  • (01:51) Nick shared the formative experiences of his childhood — moving between different schools, becoming interested in math, and graduating from UCLA at the age of 19.

  • (05:45) Nick recalled working as a quant analyst focused on emerging market debt at BlackRock.

  • (09:57) Nick went over his decision to join Airbnb as a data scientist on their growth team in 2014.

  • (12:17) Nick discussed how data science could be used to drive community growth on the Airbnb platform.

  • (16:35) Nick led the data architecture design and experimentation platform for Airbnb Trips, one of Airbnb’s biggest product launches in 2016.

  • (20:40) Nick provided insights on attributes of exceptional data science talent, given his time interviewing hundreds of candidates to build a data science team from 20 to 85+.

  • (23:50) Nick went over his process of leveling up his product management skillset — leading Airbnb’s Machine Learning teams and growing the data organization significantly.

  • (26:56) Nick emphasized the importance of flexibility in his work routine.

  • (29:27) Nick unpacked the technical and organizational challenges of designing and fostering the adoption of Bighead, Airbnb’s internal framework-agnostic, end-to-end platform for machine learning.

  • (34:54) Nick recalled his decision to leave Airbnb and become the Head of Data at Branch, which delivers world-class financial services to the mobile generation.

  • (37:24) Nick unpacked key takeaways from his Bay Area AI meetup in 2019 called “ML Infrastructure at an Early Stage Startup” related to his work at Branch.

  • (40:55) Nick discussed his decision to pursue a startup idea in the analytics space rather than the ML space.

  • (43:36) Nick shared the founding story of Transform, whose mission is to make data accessible by way of a metrics store.

  • (49:54) Nick walked through the four key capabilities of a metrics store: semantics, performance, governance, and interfaces + introduced Metrics Framework (Transform’s capability to create company-wide alignment around key metrics that scale with an organization through a unified framework).

  • (55:58) Nick unpacked Metrics Catalog — Transform’s capability to eliminate repetitive tasks by giving everyone a single place to collaborate, annotate data charts, and view personalized data feeds.

  • (59:57) Nick dissected Metrics API — Transform’s capability to generate a set of APIs to integrate metrics into any other enterprise tools for enriched data, dimensional modeling, and increased flexibility.

  • (01:02:41) Nick explained how a metrics store fits into the modern data stack.

  • (01:05:57) Nick shared valuable hiring lessons on finding talent that fits Transform’s cultural values.

  • (01:12:27) Nick shared the hurdles his team had to go through while finding early design partners for Transform.

  • (01:15:38) Nick shared upcoming go-to-market initiatives that he’s most excited about for Transform.

  • (01:17:46) Nick shared fundraising advice for founders currently seeking the right investors for their startups.

  • (01:20:45) Closing segment.

Notes

My conversation with Nick was recorded back in July 2021. Since then, many things have happened at Transform.

About the show

Datacast features long-form, in-depth conversations with practitioners and researchers in the data community to walk through their professional journeys and unpack the lessons learned along the way. I invite guests coming from a wide range of career paths — from scientists and analysts to founders and investors — to analyze the case for using data in the real world and extract their mental models (“the WHY and the HOW”) behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.

Datacast is produced and edited by James Le. Get in touch with feedback or guest suggestions by emailing khanhle.1013@gmail.com.

Subscribe by searching for Datacast wherever you get podcasts.

If you’re new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.
