The 97th episode of Datacast is my conversation with Cody Coleman — the Founder and CEO of Coactive AI.

Our wide-ranging conversation touches on his remarkable childhood growing up in poverty and finding a few people who have made big differences in his story; his academic experience at MIT studying EE & CS; his industry experience interning at Google and working at Jump Trading; his Ph.D. work on data selection for deep learning at Stanford; his current journey with Coactive AI; key developments for the Data-Centric AI community; similarities between being a researcher and a founder; and much more.

Please enjoy my conversation with Cody!

Listen to the show on (1) Spotify, (2) Google Podcasts, (3) Apple Podcasts, (4) Stitcher, (5) RadioPublic, and (6) TuneIn

Key Takeaways

Here are the highlights from my conversation with Cody:

On His Remarkable Upbringing

My father left before I was born. I do not even know his name. My mom was actually in prison when I was born. As soon as she gave birth to me, I had to be put into foster care. Then I was adopted by my maternal grandparents. They had big hearts for trying to adopt me and my siblings, but they really did not have much left to give us. Basically, we grew up in poverty my entire life with welfare and social security.

About a year after my mom was in prison, her charges got dropped. My grandparents took her in, but that became a constant source of drama. While in prison, my mom was declared insane by the FBI. Over 24 years in the house, we racked up 65 police reports for all sorts of crazy stuff my mom did that you cannot even imagine. It is hard actually to convey how chaotic and ridiculous life was. I want to share some stories about my mom to give a sense of her character.

Once she went to prison and my grandparents adopted us, she did not care about me or my siblings. There was never even a notion of caring for my personal health. She loved animals and dreamed of opening her own pet store. We had a ridiculous number of pets, like 11 cats and 14 dogs living in our tiny house. I was sleeping in the living room at the time, and there was not even room for a bed for me. On top of that, I am allergic to cats and dogs, so I was constantly sick whenever I was at home.

Another story that exemplifies this: with the constant drama she had with my grandparents, she would get fed up and leave. I remember the first time my mom left. As I woke up one morning, she came to me and said: “Cody, I am leaving for New Mexico. You are smart, and you will figure things out on your own.” She went down the stairs, left the house, and was gone for months at a time.

My grandparents had become jaded based on everything that was happening with my mother. They also had a completely different worldview since they grew up in the Great Depression. My grandfather had a second-grade education, while my grandmother had a fourth-grade education. Going to college or doing well in school meant nothing to them. I would come home and tell them I did well in school, and no one really cared. With the chaos that was going on and everyone trying to survive, I was left on my own to figure everything out.

Source: https://characterlab.org/tips-of-the-week/resilience/

Things changed in the latter half of middle school. I had an older brother, Sean, who is 18 years older and left our family when I was born. He later came back into my life to provide support. He helped me build my first computer. I still remember he was the first person who believed in me and was my first role model in a sense. I think that is part of why I do computer science now.

In high school, I had an amazing trigonometry teacher, Chantel Smith, who became my unofficial adopted mom. She saw that my braces were falling apart and took me to her kids’ orthodontist. She paid for me to go to driving school so I could get out of the house away from the cats and dogs that were making me sick. She even bought food for me. Then other amazing high school teachers banded together and helped me out as well.

On Going to MIT

I did not know anything about applying to colleges. I remember going through the Common Application, looking at the dropdown for different things to major in, and seeing one called Computer Science. As mentioned above, I loved computers because of my brother Sean. I was a weird kid growing up and wanted to be a scientist. So I decided to opt for computer science as my prospective major.

After that, looking up on the college board to find the best schools for computer science, I figured there was probably no chance I would get in, maybe a small percentage, so why not just go for it? That is when I first found out about MIT. No one where I grew up knew about MIT or any of these places. Literally, my grandmother did not realize that MIT was a thing until a year after I started my freshman year.

I was the first student from my high school to ever go to MIT. I did not realize how big a deal it was. I was just naively optimistic and got myself in over my head. I could not even get to MIT. I was working at the county library at the time. My boss’s parents were driving up to Boston, so I hitched a ride with them to actually get to MIT and start my degree there. I will never forget lying in an extended twin-sized bed. I am 6'4", so my feet were hanging out at the end of the bed. I was staring up at the ceiling and realizing the gravity of the situation that I had gotten myself into. I am this kid from a poor public high school, which was ranked 300 out of 322 schools, in a middle-of-nowhere town in South Jersey. Now, I was at MIT, which was apparently the best school for engineering in the world.

The initial thing hit me, and I was like: “Oh dear God, what have I gotten myself into? What am I possibly doing here? How am I going to figure it out?” When I left high school, my friends were not as supportive. One of my best friends at the time told me that the only reason I got into MIT was that I am black. That explanation caused me a lot of fear, so I had an intense imposter syndrome going into MIT.

However, two things helped define my time at MIT to get over that immense fear:

No one in Boston or at MIT knew anything about me growing up. It is one of the perks of being the first one from your high school to ever go to MIT. I realized I had a blank slate to define who I was and what I was capable of. I would take advantage of that and think about putting my best foot forward and being the person I ultimately want to be. Effectively, that helped me mentally let go of my past and keep the past from clouding my vision as I move forward.
MIT had all kinds of resources: office hours, recitations, tutorials, etc. If there was anyone who needed these programs, it was definitely me. So I was not afraid to take advantage of them. I also did not over-subscribe myself. I focused on getting the basic classes out of the way, doing the average, and not trying to do too much. Many people fell into this trap at MIT, where they ended up burning themselves out. I stayed a steady course, and that ultimately ended up working out. Not only did I graduate from MIT, but I did pretty well. I had a 4.9 out of 5.0 GPA for my Bachelor of Science and a 5.0 GPA for my Master of Engineering. I was also part of the Engineering Honor Society Tau Beta Pi and the president of the Electrical Engineering & Computer Science Honor Society Eta Kappa Nu for two years, during my senior year and my Master’s.

That whole process was just a bit of head down — being humble about things and not afraid to ask questions. There were also many opportunities like internships and traveling abroad. I studied abroad for an entire year at Cambridge University, which opened up my world in terms of thinking about my career and life. I used to think my dream job was being a game designer and working for Epic Games. Then I realized that there is so much you can do with computer science, such as web development, data science, machine learning, etc. It felt like jumping off a cliff into an amazing experience that showed me the world.

On His Favorite Classes at MIT

Source: http://web.mit.edu/founders/www/2.a16.html

There were a bunch of different classes I enjoyed for different reasons:

I will never forget 6.002 at MIT, which is a circuits class. I did electrical engineering and computer science, so I was split between the two. I ended up enjoying the circuits class a lot because it had a whole notion of engineering intuition. Operating in that mode worked well for thinking about how these things piece together when things are not perfect. I found it quite interesting to go from imperfect pieces to a workable outcome. I remember the final exam was a 3-hour exam, and I finished it in an hour and a half. I got a ridiculously high grade in that class, so that was really enjoyable in terms of my sense of accomplishment.
I also loved Founder’s Journey. This is a class at MIT for engineers to learn about entrepreneurship. Ever since I was a little kid, I actually dreamed of starting my own business one day. I remember talking with my grandmother at one point, and I will never forget what she said: “They will never let a black man start his own company.” For some reason, that stuck with me. This class demystified the process of starting one’s own company in a sense, where you got to hear other entrepreneurs talking about their stories, all the unexpected things they went through to create their companies, and that there is no magic formula to it. I realized that there is nothing inherently special about the process of entrepreneurship, which was super eye-opening for me.
Patrick Winston had a class called The Human Intelligence Enterprise, which I think back on a lot. It is basically stories with Patrick about research and life in general. It is similar to Founder’s Journey, where I got to hear how someone thinks about the world.

On Presiding Over Eta Kappa Nu at MIT

Source: https://www.facebook.com/mithkn/

At the beginning of MIT, despite all the imposter syndrome, I always had this notion of having big goals. I had the secret goal of getting a 5.0 GPA at MIT. I worked really hard, so getting into Eta Kappa Nu and receiving the invitation letter was an awesome achievement. There was an election immediately after the induction ceremony. Since no one was stepping up to be president, I decided to run for it. Surprisingly, everyone voted for me. I think I also ran unopposed since no one else wanted the job.

At the time, there was also Tau Beta Pi, which was the other engineering honor society. They had a ridiculous amount of resources, such as a spring career fair at MIT with participation from big companies. Despite those resources, I felt like Tau Beta Pi had too much. On the other hand, Eta Kappa Nu did a ton of stuff for the department, even though we did not have nearly the same amount of funding. We ran the tutoring program for all the advanced CS courses, a resume book for folks to get their resumes out there, and the Underground Guide — a peer review system for courses (students would review their courses after they had completed them; then someone would aggregate these reviews and create the students’ view of the courses).

As Eta Kappa Nu was dwindling, I came in to help figure things out. We actually grew the membership of the group overall. We continued all the programs mentioned above. We even started a women’s outreach initiative led by Chelsea Finn (now a faculty member at Stanford). Coming into MIT not knowing anything about Computer Science, I thought it was a huge problem not knowing which path to pursue after getting a CS degree — working in big tech, working in startups, doing research, or being a consultant. I organized a panel with alumni and people connected to MIT to have these career paths represented and help students like me who did not know how to navigate this process.

It was a very tough organization to run since we were very scrappy. We did not have the same amount of funding as other honor societies. We also required people to do a lot of work, such as a minimum of 14 hours of service. But we continued to grow larger. We even helped design the EE & CS student lounge at MIT, where we used to hold our meetings. It was a super rewarding experience that left a lasting change at MIT.

On Advancing Online Education at the MIT Office of Digital Learning

MIT is big into undergraduate research. There was this whole movement around online courses. I remember they had appointed someone to be the Director of the Office of Digital Learning, Sanjay Sarma. I emailed him to chat and share my ideas on how digital learning could help people like me at MIT and abroad. He responded quickly and offered me a research assistantship. I accepted immediately since education was so powerful in my life and transformed my trajectory. Making that more accessible to people really resonated with me.

The Office of Digital Learning was very much similar to a startup at the time. There was me, a PostDoc, a Ph.D. student, and my faculty advisor Isaac Chuang. The four of us tried to figure out how to conduct research in this new space of digital learning. We wrote the first paper on Massive Open Online Courses. We worked with edX and had to deal with undocumented code. We collaborated with Harvard’s School of Education to publish papers identifying struggling students and figuring out how to best intervene. That was the first time when data was made available about education. Before that, learning only happened in person, so it was hard to get fine-grained details. But now, thanks to the digital interface, you can track everything and collect fine-grained data, which led to various cool data science and ML problems.

On Interning at Google

After my freshman year, I applied to Google’s BOLD immersion program for all the non-technical internships at Google. It was kind of odd since I am a CS major at MIT. But I applied for it because Google’s Sales office in Boston held an event I attended. The BOLD program was quite cool. It was a super diverse group of people. I was in New York City when I got selected for that program.

For my first Google internship in Mountain View, I was a weird intern because I had done the BOLD program before. They had me as part of both the BOLD and Software Engineering internship programs. I was in a team that interfaces these two things — the intersection of engineering and business. For local search quality, the business goal was to drive traffic to businesses with listings on Google. But at the same time, some people would falsely claim a listing to take over businesses.

That was my first introduction to machine learning since they tried to detect spam with a classifier (whether something is a fraudulent listing or not) — doing everything from feature engineering to model retraining.
That was also my first introduction to the programming language R for statistics because my manager exclusively used R.
We were using a data system at Google called Dremel, which later became BigQuery. There were no resources on the team about it, so I ended up becoming a Dremel expert who created training sessions to teach people how to use Dremel.

Source: https://research.google.com/pubs/pub36632.html

For my second Google internship in Zurich, I was in their APM (Associate Product Management) program focusing on YouTube analytics. They were in the process of switching from the number of views to watch time as a metric for their recommendation algorithm. I was responsible for thinking about how to showcase that metric to both users on the front page and creators who create content. I needed to think about how to do this on a global scale, especially in terms of translation into different languages. I also worked with different engineers, designers, QA folks, etc. I learned how much work went into creating a product and how to think through every detail of that process.

On Working As A Data Scientist at Jump Trading

My Master’s of Engineering Advisor advised me: “You only have so many data points in your life to figure out what you ultimately want to do. To figure out those experiences, you should spread those data points as far as possible to explore a lot of the space.” Being a data scientist and working at Jump Trading in Chicago was super new, so I decided to go there. I also wanted to stay close to home as my grandmother was sick at the time, but it turned out to be a great experience in many ways.

Source: https://www.linkedin.com/company/jump-trading

I observed that data and ML in production in finance were far ahead of the tech industry. It made sense because when you think about tying data and algorithms to business outcomes, you can literally see whether the algorithm is making you money or losing you money. I was part of the Vendor Data Group at Jump. We were building a system that brought data from different sources under one roof. This whole process was fascinating because every data source is different. A lot of data scientists struggle with this issue all the time. We also had to think about data quality.

The first project I worked on was an ETL pipeline, where I built out the entire history of stock exchanges from multiple different vendors and merged this together. I effectively built a distributed compute thing from scratch with custom caching and dynamic work allocation. For the second project, I created a decorator-based thing that would run sanity checks for the ETL pipeline to ensure that the data was not messed up. When I came out to the Bay Area and saw projects like Great Expectations for data quality monitoring, I realized all these tech companies have the same problem I once faced at Jump Trading.
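To make the decorator idea concrete, here is a minimal sketch of what decorator-based sanity checks around an ETL step could look like. It is not the system built at Jump Trading; the names `sanity_check`, `not_empty`, `positive_prices`, and `merge_vendor_prices`, as well as the row format, are hypothetical and chosen only for illustration.

```python
import functools


class DataQualityError(ValueError):
    """Raised when an ETL step returns rows that fail a sanity check."""


def sanity_check(*checks):
    """Decorator factory: run every check on the rows a step returns."""
    def decorator(step):
        @functools.wraps(step)
        def wrapper(*args, **kwargs):
            rows = step(*args, **kwargs)
            for check in checks:
                problem = check(rows)  # a check returns None when the data looks fine
                if problem:
                    raise DataQualityError(f"{step.__name__}: {problem}")
            return rows
        return wrapper
    return decorator


def not_empty(rows):
    return None if rows else "no rows returned"


def positive_prices(rows):
    bad = [r for r in rows if r["price"] <= 0]
    return f"{len(bad)} rows with non-positive price" if bad else None


@sanity_check(not_empty, positive_prices)
def merge_vendor_prices(vendor_a, vendor_b):
    """Merge two vendor feeds keyed by (symbol, date); vendor_a wins on conflict."""
    merged = {(r["symbol"], r["date"]): r for r in vendor_b}
    merged.update({(r["symbol"], r["date"]): r for r in vendor_a})
    return sorted(merged.values(), key=lambda r: (r["symbol"], r["date"]))
```

The appeal of this shape is that the checks live next to the step they guard, so a bad vendor file fails loudly at the step that produced it instead of corrupting everything downstream.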

On Doing a Ph.D. at Stanford University

Doing my Master’s of Engineering at MIT with my advisor Isaac Chuang made me realize my curiosity as a kid of just exploring ideas. I actually never thought of doing a Ph.D. because it seemed like people who get Ph.D. degrees are geniuses. Isaac convinced me to apply to 4 schools (Stanford, Berkeley, MIT, and UT Austin). To my surprise, I got into all of them. Isaac told me that the hard thing about a Ph.D. program is getting in. Just because I get in does not mean I have to go, and just because I go does not mean I have to finish.

Source: https://stvp.stanford.edu/?attachment_id=28217

I deferred the Ph.D. to work at Jump for a year because of my grandmother. Ultimately, this led to an interesting decision process. While at Jump, I did really well, and they wanted to move me to the most profitable team at the firm. My original goal was to be able to have a comfortable job where I could afford things that I needed and do the things that I wanted. Effectively when I was working at Jump, I had that life. After a lot of self-reflection, I realized the three things I wanted from life:

To be challenged and constantly grow.
To be in a supportive environment that helps me grow and accelerates my growth.
To have a broader positive impact on the world.

From working in finance, I got the challenge piece and the support piece, but the broader positive impact thing was a bit more questionable. I saw that doing the Ph.D. at Stanford was an opportunity to have all three pieces.

On DAWNBench

Source: https://dawn.cs.stanford.edu/benchmark/about.html

The whole goal of the DAWN project at Stanford was to democratize AI. We had been at this point where there was a lot of success with large deep learning models. But it was just super expensive. The people who were publishing in this space only worked at big tech companies like Google and Facebook. So the DAWN group is composed of faculty members from a systems background (Matei Zaharia, Peter Bailis, Chris Ré, Kunle Olukotun) who came together to figure out how to take the systems, tools, and infrastructure in places like Google and Facebook and bring them down to a level that anyone can use. That resonated with me because, ultimately, the thing I love about Computer Science is that with an Internet connection and a laptop, you can create something that impacts the lives of thousands or millions of people around the world, whereas other forms of engineering require hundreds of thousands of dollars of expensive equipment. With deep learning, it seemed like computer science was going in that direction, where you need to have a $100,000 machine to be able to do anything. So the DAWN project tried to bring that down.

To this end, DAWNBench focused on how to be more efficient with training and inference using deep learning models from a performance standpoint. We did not just focus on algorithmic changes but also brought in software and hardware changes to help people navigate this complex space of optimizations via different tools. Ultimately, it was super successful in getting people to focus on improving efficiency in terms of time and cost: we saw training time on ImageNet drop by over 400x and costs drop by 20x.

On MLPerf

Source: https://ieeexplore.ieee.org/abstract/document/9001257

MLPerf is the spiritual successor to DAWNBench in a sense. When we started, there were not many benchmarks for system performance of deep learning. MLPerf started by getting together the leading benchmarks at that point. Besides DAWNBench, Baidu had DeepBench (kernel-level benchmarking), Harvard had Fathom, and Google had TensorFlow benchmarks. We basically came together to take the best pieces of those benchmarks, scale them up to more tasks, and get broader industry support so that MLPerf could become the industry standard.

It has been amazing to see how much impact DAWNBench has had and MLPerf has grown! I am proud to pass the ML systems benchmarking torch to MLPerf: https://t.co/vMpu6Y1TkN
— Cody Coleman (@codyaustun) January 13, 2020

We started with a small set of 4 organizations that met in a conference room once a week. We came up with the initial straw person for those benchmarks and presented that to a larger audience. Going from those 4 groups, we branched out to representatives from different companies to create something that would be representative of different perspectives in the field. Since the initial meetings, it has received much support from hardware manufacturers like NVIDIA, Intel, and Cerebras Systems, and has truly achieved the vision of becoming the industry standard. Companies submitted results to the benchmark, and now MLPerf has gone to a nonprofit called MLCommons. We started with benchmarking system performance but now are thinking about data as well as ML in production.

On His Research

Data selection methods like active learning and core-set selection techniques don't have to be computationally expensive! In recent work for #ICLR2020, we show how to speed up data selection by up to 41.9x: https://t.co/kAQFl0ZIbG
— Cody Coleman (@codyaustun) April 23, 2020

“Selection via Proxy: Efficient Data Selection for Deep Learning” ties back to the mission of the DAWN project, which was to democratize ML. On one side, it was very expensive at the time to train deep learning models and make inferences with them. How can we be more efficient with computational resources? But at the same time, the other big blocker was the amount of data we needed. ImageNet was the gold standard, and academics used datasets that were created for them. But in practice, you do not have that and must create your own dataset. I realized that we were stuck with this mentality of purely thinking about data quantity, which was the general wisdom at the time. That was when I became interested in whether we can be smarter about the data we pick and label to make it more accessible to smaller companies.

Source: https://openreview.net/forum?id=HJg2b0VYDr

There is excellent work in the active learning and core-set selection literature on selecting the most valuable data points. But when you think about the scale of modern datasets and models, rather than a few thousand examples, we are talking about millions or billions of unlabeled examples and models that take days to train. Can we scale down these models to get a useful signal to select the most valuable data points without running over every single data point at every iteration? It turned out that it worked. Smaller proxy models give you a selection signal over the data points that is highly correlated with the signal from large models.
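As a rough illustration of the idea of selecting with a cheap proxy, the sketch below ranks an unlabeled pool by the predictive entropy of a small model and picks the most uncertain examples for labeling. Maximum-entropy sampling is used here as one possible selection criterion, and `toy_proxy` is a hypothetical stand-in for a small trained network; this is not the code from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_via_proxy(pool, proxy_predict, budget):
    """Return indices of the `budget` examples the proxy is least certain about."""
    scored = [(entropy(proxy_predict(x)), i) for i, x in enumerate(pool)]
    # Highest entropy first; ties broken by index so the result is deterministic.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in scored[:budget]]


def toy_proxy(x):
    """Hypothetical tiny model: a logistic score on a single feature."""
    p = 1.0 / (1.0 + math.exp(-x))
    return [p, 1.0 - p]


pool = [-4.0, -0.2, 0.1, 3.0, 0.0]
chosen = select_via_proxy(pool, toy_proxy, budget=2)
print(chosen)  # indices of the two points closest to the decision boundary
```

The selected examples would then be labeled and used to train the large target model, so the expensive model never has to be retrained just to decide what to label next.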

Can active learning scale to millions (potentially billions) of examples? Yes! We propose Similarity search for Efficient Active Learning and Search (SEALS) to restrict the candidates considered in each round and vastly reduce the computational complexity: https://t.co/wnCHMzXege
— Cody Coleman (@codyaustun) July 2, 2020

“Similarity Search for Efficient Active Learning and Search of Rare Concepts” (SEALS) built on the same goal that I was talking about with Selection via Proxy — rather than throwing a ton of data into a model, can we be more intelligent about it? While Selection via Proxy improves the amount of computation that we do per example and reduces the amount of computation in-between selection rounds, it still requires you to scan over all of the data. When I was working on SEALS, I was at Facebook — where the scale of data was billions of examples. The process of scanning over a billion examples between each round of selecting data points became computationally infeasible. Even with a small model, it became too hard for Selection via Proxy to work at that scale.

Existing active learning approaches such as core-set selection scale at least linearly with the size of the unlabeled data, potentially quadratically. Ultimately, we need these methods to scale sub-linearly. That was the original inspiration for SEALS. I realized that we have gotten pretty good with pre-trained models and embedding data, where we can learn really good representations from unstructured data. These representations take data points with similar features and cluster them together. If we are looking for some new concept, even if the model was not trained on it, those data points are clustered together in a small portion of the overall space. Especially in web-scale data, rather than trying to search globally over all the data to find a specific concept that we care about, we should instead start with positive examples and gradually grow our search space from there.

Source: https://arxiv.org/pdf/2007.00077.pdf

The surprising thing is that because these concepts are well clustered by these representations, we only need to look at a very small fraction of the overall dataset to find relevant examples and train an accurate classifier. We can grow the search space by doing a similarity search, which is really fast with modern hashing or clustering-based methods. Rather than scaling quadratically or linearly, we scale sub-linearly with the size of the unlabeled data. It is now as easy to work with a dataset with billions of examples as to work with a dataset with a million or a few thousand unlabeled examples. You can accomplish this from your laptop in a fast and responsive experience, which is perfect for human-in-the-loop setups.
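The loop described above can be sketched in a few lines: start from a labeled positive example, restrict the candidate pool to the nearest neighbors of whatever has been labeled so far, and grow that pool after every round. Everything here is a simplified assumption. `nearest_neighbors` is a brute-force stand-in for a real approximate similarity index (which is what would make each lookup sub-linear), and `oracle` and `score` are hypothetical placeholders for the human labeler and the retrained classifier.

```python
import math


def nearest_neighbors(embeddings, query_idx, k):
    """Brute-force k-NN; a real system would use an approximate index instead."""
    q = embeddings[query_idx]
    dists = [(math.dist(q, e), i) for i, e in enumerate(embeddings) if i != query_idx]
    dists.sort()
    return [i for _, i in dists[:k]]


def seals_rounds(embeddings, seeds, oracle, score, k, per_round, rounds):
    """Label only examples reachable through neighbors of already-labeled ones."""
    labeled = {i: oracle(i) for i in seeds}
    candidates = set()
    for i in seeds:
        candidates.update(nearest_neighbors(embeddings, i, k))
    candidates.difference_update(labeled)
    for _ in range(rounds):
        if not candidates:
            break
        # Pick the highest-scoring candidates; ties broken by index.
        picked = sorted(candidates, key=lambda i: (-score(i, labeled), i))[:per_round]
        for i in picked:
            labeled[i] = oracle(i)
            candidates.update(nearest_neighbors(embeddings, i, k))  # grow the pool
        candidates.difference_update(labeled)
    return labeled


# Toy data: indices 0-3 form the rare concept, indices 4-7 are unrelated.
points = [(0, 0), (0.1, 0), (0, 0.2), (0.3, 0.3),
          (10, 10), (10.1, 10), (9.9, 10.2), (10, 9.8)]


def is_positive(i):
    return i < 4


def closeness(i, labeled):
    """Toy score: negative distance to the closest labeled positive."""
    positives = [j for j, y in labeled.items() if y]
    return -min(math.dist(points[i], points[j]) for j in positives)


result = seals_rounds(points, [0], is_positive, closeness, k=2, per_round=1, rounds=2)
```

In this toy run the far-away cluster is never scored or labeled, which is the point: the work per round depends on the size of the neighborhood, not on the size of the whole dataset.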

On Being Inspired By His Advisors

Congrats to my student @codyaustun (with @pbailis) on defending his PhD today! Cody did amazing work improving the resource and data efficiency of deep learning, including widely used benchmarks (DAWNBench/MLPerf), perf analysis, and new 10-1000x faster algorithms (SVP & SEALS). pic.twitter.com/KWH3qWI5YH
— Matei Zaharia (@matei_zaharia) June 3, 2021

I was very fortunate to have the opportunity to do my Ph.D. at Stanford and be part of the DAWN project because it is such an entrepreneurial environment. It almost felt like a part of the tenure process at Stanford is to start a company. All the faculty members ended up starting a company in addition to doing research. The nice thing about the DAWN project and working on more systems-related research is that you really focus on problems in the industry, but you effectively have more freedom to think about them on a broader scale — going from symptoms you hear at different companies in the industry to proposing a central solution that solves all these symptoms and gets to the core problem.

Matei and Peter were super great. I felt like I was getting an introductory course in entrepreneurship and the process of starting a company. There are just a ton of operational things you have to do. In particular, I saw Peter getting increasingly busy since he started his company while I was a Ph.D. student. Via osmosis, I observed how he navigated that process of being an early-stage founder.

On Data-Centric AI

Our workshop on Data Centric AI was accepted at NeurIPS 2021! Stay tuned for details #NeurIPS2021 https://t.co/SFpwoScflW
— Cody Coleman (@codyaustun) August 2, 2021

My research on Selection via Proxy and SEALS was very much in the space of Data-Centric AI. However, there was not really a community for that. I always felt isolated as a systems person thinking about data. The amazing thing about the Data-Centric AI workshop at NeurIPS 2021 and the DataPerf benchmark suite is that I (and many others) no longer feel alone since these data problems come up everywhere. We saw people from a systems background, a traditional ML background, and even an HCI background getting involved. A quote I remember super vividly from D. Sculley is: “Given the notion of HCI (Human-Computer Interaction), it almost felt like we need a new field around data called Human-Data Interaction.” I think that is an interesting idea since we will see more efforts to enable people to interact with data and facilitate the creation of applications powered by AI that are using data under the hood.

The DataPerf benchmark suite is just an awesome initiative. We have academics from Stanford, Harvard, and ETH Zurich, and industry folks from Meta, Google, Landing AI, and Coactive AI coming together to define the industry standard that aligns people around different components of the data lifecycle we are creating for intelligent applications.

On Founding Coactive AI

As I talked to many data scientists and analysts working with image and video data, really all forms of unstructured data, one particularly hard thing is metadata. We all have some rough information about an image, such as file size, file name, etc., but the semantically meaningful information is captured in the pixels of that image. There is a whole gap in solving metadata problems and fixing issues in the data pipeline for those problems, especially for a large amount of data.

Coming out of a Ph.D. and doing ML systems from the ground up, I was almost too much in the Ph.D. zone. Terminologies like active learning or self-supervised learning are meaningless in a sense to industry practitioners. How can I bridge that gap between the technology I have spent my time working on during my Ph.D. and their day-to-day work? That was a critical lesson for me: realizing that the world I had been in was a very specific and niche research world, and that I needed empathy for data practitioners out there who have not spent 5 years of their life reading NeurIPS papers.

On Defining Early-Stage Culture

During the process of starting Coactive AI, I talked with a lot of founders about their biggest challenges. A surprising thing that blew me away is that many people, especially technical founders, undervalued HR problems. And when I talked with successful founders, they said that these HR problems were their most painful mistakes. I remember talking to an awesome CEO of a multi-billion dollar company, who said that 90% of his job is HR-related things: Should they hire a chief revenue officer? What is their compensation philosophy? What do they think about the organization of the company? How do they deal with cultural issues? I was surprised to hear that his toughest problems are HR, while the technical engineering problems (my bread-and-butter for the past 5 years) were not that difficult. As a founder, I must resist the urge to dive into technical issues and think holistically about people and culture.

Source: https://github.com/CoactiveAI/culture

As a result, we created our culture first and open-sourced it on GitHub, which shows our guiding principles as a company. We put that into every piece of our process, from interviewing to onboarding, which becomes a huge differentiator for us and ultimately gives me peace of mind. My worst fear is not that the company will not be successful. My worst fear is waking up in the morning and dreading going to work because of a toxic culture. It has been great thinking about creating an environment where I am excited to work with everyone on the team.

On Being A Founder versus Being A Researcher

There are many similarities between being a researcher and being a founder. Fundamentally, in both roles you take an idea and turn it into a solution. In research, you start with a hypothesis and systematically test it with the scientific method. In entrepreneurship, you have the lean startup methodology to iterate on your problem statement. That mechanical process of taking an idea from zero to one is similar in both research and startups. You also have to think about how to present your work to the broader world. Additionally, every Ph.D. student goes through a process of figuring out a research direction, similar to how a founder goes through the process of figuring out product-market fit.

Look what came in the mail today. Hopefully, I'll stop having nightmares about paper deadlines soon! pic.twitter.com/LWL1uXv3RH
— Cody Coleman (@codyaustun) October 30, 2021

The one big difference is that the Ph.D. feels binary and scary in a sense. You either get the Ph.D. at the end, or you do not, and all of your success depends on your advisor. With a startup, you can talk to different folks and have more flexibility. Every brick that you place down adds value to the company.

~ New Podcast ~🎙️#Datacast E97 features @codyaustun. We discussed:
- Escaping Poverty
- Digital Learning
- ML Systems Benchmark
- Data-Centric AI

Cody has the most remarkable story out of all the guests I've had on the show. Enjoy!🧵https://t.co/Tf8p6H3rXt
— James (@le_james94) August 2, 2022

Show Notes

(01:49) Cody shared his upbringing in New Jersey, his childhood interest in science and technology, and the few people who have made big differences in his story.
(09:35) Cody went over his academic experience studying Electrical Engineering and Computer Science at MIT.
(17:51) Cody recalled his favorite classes taken at MIT.
(22:43) Cody talked about his engagement in serving as the president of MIT’s chapter of Eta Kappa Nu Honor Society and advancing online education at the MIT Office of Digital Learning.
(31:25) Cody is bullish on the future of digital learning.
(35:43) Cody expanded on his internships with Google throughout his time at MIT — doing local search quality and YouTube analytics.
(42:31) Cody described the challenges of dealing with high-frequency trading data from his one year working as a junior data scientist at the Vendor Data Group of Jump Trading in Chicago.
(46:50) Cody reflected on his decision to embark on a Ph.D. journey in Computer Science at Stanford University.
(51:54) Cody mentioned his participation in the DAWN project, specifically DAWNBench, an end-to-end deep learning benchmark and competition.
(54:21) Cody unpacked the evolution of MLPerf, an industry-standard benchmark for the training and inference performance of ML models.
(56:52) Cody walked through the motivation and empirical work in his paper “Selection via Proxy: Efficient Data Selection for Deep Learning.”
(59:34) Cody discussed his paper “Similarity Search for Efficient Active Learning and Search of Rare Concepts.”
(01:06:32) Cody shared his learnings about bringing ML from research to industry from his advisors, Matei Zaharia and Peter Bailis — who were both academics and startup founders simultaneously.
(01:09:19) Cody went over key trends in the emerging Data-Centric AI community — given his involvement with the Data-Centric AI workshop at NeurIPS 2021 and the DataPerf benchmark suite.
(01:12:19) Cody shared lessons learned about finding product-market fit as the founder of Coactive AI — which brings unstructured data into the world of SQL and the big data tools that teams already love.
(01:15:34) Cody emphasized the importance of focusing on the HR function and defining cultural guiding principles for any early-stage startup founder.
(01:21:05) Cody provided his perspective on the differences and similarities between being a researcher and a founder.
(01:23:47) Closing segment.

Mentioned Content

Talk

“Digging Deeper: How a Few Extra Moments Can Change Lives” (TEDxStanford 2017)
“Data Selection for Data-Centric AI: Data Quality over Quantity” (MLOps Meetup 2021)
“Data Selection for Data-Centric AI” (Stanford MLSys 2022)

People

Matei Zaharia (Cody’s Ph.D. Advisor, Co-Creator of Apache Spark, Co-Founder of Databricks)
Fei-Fei Li (Professor of Computer Science at Stanford, Creator of ImageNet Dataset)
Michael Bernstein (Professor of Computer Science at Stanford with a focus on Human-Computer Interaction)

Books

“No Rules Rules: Netflix and the Culture of Reinvention” (by Reed Hastings and Erin Meyer)
“What You Do Is Who You Are: How to Create Your Business Culture” (by Ben Horowitz)
“The Inner Game of Tennis: The Classic Guide to the Mental Side of Peak Performance” (by W. Timothy Gallwey)

Notes

My conversation with Cody was recorded back in January 2022. Since then, many things have happened at Coactive AI. I’d recommend:

Attending Cody’s upcoming talk at Snorkel’s The Future of Data-Centric AI.
Reviewing the DataPerf workshop at ICML 2022.
Reading the CoactiveAI blog post on bringing UI props to MLOps.
Watching Cody’s CBS News interview back in February 2022.

About the show

Datacast features long-form, in-depth conversations with practitioners and researchers in the data community to walk through their professional journeys and unpack the lessons learned along the way. I invite guests coming from a wide range of career paths — from scientists and analysts to founders and investors — to analyze the case for using data in the real world and extract their mental models (“the WHY and the HOW”) behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.

Datacast is produced and edited by James Le. Get in touch with feedback or guest suggestions by emailing khanhle.1013@gmail.com.

Subscribe by searching for Datacast wherever you get podcasts or click one of the links below:

If you’re new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.

Related Episodes