James Le

February 13, 2024

Datacast

Datacast Episode 132: Big Data Engineering, Data Culture from First Principles, and Reimagined Metadata with Suresh Srinivas

Suresh Srinivas was the Chief Architect of Uber’s data platform, responsible for all data initiatives at the company, including the Databook, Data Quality, and Data Lineage initiatives. Suresh was part of the original team that built Hadoop at Yahoo! and co-founded Hortonworks, which developed and supported open-source software to manage big data and associated processing.

He is leading the OpenMetadata Project to build Metadata APIs & specifications and a single place to discover, collaborate, and get your data right.

James Le

January 22, 2024

Datacast

Datacast Episode 131: Data Infrastructure for Consumer Platforms, Algorithmic Governance, and Responsible AI with Krishna Gade

Krishna Gade is the founder and CEO of Fiddler AI, an AI Observability startup that helps AI-forward organizations build trusted AI solutions and connect model outcomes to business KPIs. Fiddler addresses problems in model monitoring, explainability, analytics, and fairness.

An entrepreneur and engineering leader with strong technical experience in creating scalable platforms and delightful products, Krishna previously held senior engineering leadership roles at Facebook, Pinterest, Twitter, and Microsoft. At Facebook, Krishna led the News Feed Ranking Platform that created the infrastructure for ranking content and powered use-cases like Facebook Stories and recommendations like People You May Know, Groups You Should Join, etc. Krishna’s team built Facebook’s explainability features like ‘Why am I seeing this?’ which helped bring much-needed algorithmic transparency and, thereby, accountability to the News Feed for both internal and external users.

James Le

September 4, 2023

Datacast

Datacast Episode 125: The Next Wave of Developer Platforms, Data Products, and Software Infrastructure with Sakib Dadi

Sakib Dadi is a vice president at Bessemer Venture Partners in their San Francisco office, where he primarily focuses on early-stage investments in developer platforms, data products, and software infrastructure. He has been involved with several of Bessemer’s investments, including Prefect, Coiled, Arcion, Periscope Data, Okera, Guild Education, and Sila Nanotechnologies. Before joining Bessemer, Sakib worked in product at Viagogo, an international marketplace for buying and selling tickets for live events.

James Le

March 15, 2023

Datacast

Datacast Episode 110: Wisdom in Building Data Infrastructure, Lessons from Open-Source Development, The Missing README, and The Future of Data Engineering with Chris Riccomini

Chris Riccomini is an engineer, author, investor, and advisor. He has worked on infrastructure as an engineer and manager for about 15 years at PayPal, LinkedIn, and WePay. He was involved in open source as the original author of Apache Samza and an early contributor to Apache Airflow. He has also written a book with Dmitriy Ryaboy called The Missing README, a guide for software engineers. Lately, he has been investing in startups in the data space.

James Le

November 10, 2022

Datacast

Datacast Episode 101: Scaling Data Engineering, Building Data Teams, and Managed Data Stack with Tarush Aggarwal

Tarush Aggarwal is one of the leading experts in leveraging data for exponential growth, with over ten years of experience in the field.

After graduating with a degree in Computer Engineering from Carnegie Mellon in 2011, he became the first data engineer on the analytics team at Salesforce.com. Data was in its infancy, and the log metric framework he built was critical in allowing Salesforce to analyze data across customers and provide benchmarks across different industries and verticals.

Most recently, Tarush led Data for WeWork. WeWork leveraged data to grow 10x in 3 years, supporting a footprint of 800+ offices in 120+ cities in 23+ countries with over 12,000 employees, making WeWork one of the fastest-growing companies in the world. He scaled the data org from 2 to 100+, and their unique approach allowed them to stay lean while supporting every functional area of the business. In 2019, he moved to China to help establish WeWork’s Asia operations and focus on the hyper-growing Chinese market.

James Le

July 4, 2022

Conference

What I Learned From Tecton's apply() 2022 Conference

Back in May, I attended apply(), Tecton’s second annual virtual event for data and ML teams to discuss the practical data engineering challenges faced when building ML for the real world. There were talks on best practice development patterns, tools of choice, and emerging architectures to successfully build and manage production ML applications.

This long-form article dissects content from 14 sessions and lightning talks that I found most useful from attending apply(). These talks cover 3 major areas: industry trends, production use cases, and open-source libraries. Let’s dive in!

James Le

March 17, 2022

Datacast

Datacast Episode 86: Risk Management, Open-Source Governance, and Negative Engineering with Jeremiah Lowin

Jeremiah Lowin is the Founder & CEO of Prefect, a dataflow automation company. Before starting Prefect, Jeremiah gained extensive experience in all aspects of the modern data stack as a director of risk management, machine learning researcher, and data scientist at a number of institutional investment firms. Today, he lives with his wife and two sons in Washington, DC.

James Le

March 4, 2022

Conference

What I Learned From Attending Tecton apply(meetup) 2022

Last month, I attended another apply(meetup), Tecton’s follow-up virtual event in its ML data engineering conference series. For context, I have written recaps for both of their 2021 events, including the inaugural conference and the follow-up meetup. The content below covers my learnings, ranging from model calibration and ranking systems to real-time analytics and online feature stores.

James Le

March 2, 2022

Datacast

Datacast Episode 85: Ad Exchange, Stream Processing, and Data Discovery with Shinji Kim

Shinji Kim is the Founder & CEO of Select Star, an intelligent data discovery platform that helps you understand your data. Previously, she was the CEO of Concord Systems, an NYC-based data infrastructure startup acquired by Akamai Technologies in 2016. She led the development of Akamai’s Internet-of-Things data platform for real-time messaging, log processing, and edge computing.

Prior to Concord, Shinji was the first Product Manager hired at Yieldmo, where she led the Ad Format Lab, A/B testing, and yield optimization. Before Yieldmo, she analyzed data and built enterprise applications at Deloitte Consulting, Facebook, Sun Microsystems, and Barclays Capital.

Shinji studied Software Engineering at the University of Waterloo and General Management at Stanford GSB. She also advises early-stage startups on product strategy, customer development, and company building.

James Le

December 27, 2021

Datacast

Datacast Episode 77: Delivering Modern Data Engineering with Einat Orr

Einat Orr is the CEO and Co-founder of Treeverse, the company behind lakeFS, an open-source platform that delivers resilience and manageability to object-storage-based data lakes. She received her PhD in Mathematics from Tel Aviv University, focusing on optimization in graph theory. Einat previously led several engineering organizations, most recently as the CTO at SimilarWeb.