Data Engineering

Datacast Episode 131: Data Infrastructure for Consumer Platforms, Algorithmic Governance, and Responsible AI with Krishna Gade

Krishna Gade is the founder and CEO of Fiddler AI, an AI Observability startup that helps AI-forward organizations build trusted AI solutions and connect model outcomes to business KPIs. Fiddler addresses problems in model monitoring, explainability, analytics, and fairness.

An entrepreneur and engineering leader with strong technical experience in creating scalable platforms and delightful products, Krishna previously held senior engineering leadership roles at Facebook, Pinterest, Twitter, and Microsoft. At Facebook, Krishna led the News Feed Ranking Platform, which created the infrastructure for ranking content and powered use cases like Facebook Stories and recommendations like People You May Know and Groups You Should Join. Krishna’s team built Facebook’s explainability features like ‘Why am I seeing this?’, which helped bring much-needed algorithmic transparency and, thereby, accountability to the News Feed for both internal and external users.

Datacast Episode 110: Wisdom in Building Data Infrastructure, Lessons from Open-Source Development, The Missing README, and The Future of Data Engineering with Chris Riccomini

Chris Riccomini is an engineer, author, investor, and advisor. He has worked on infrastructure as an engineer and manager for about 15 years at PayPal, LinkedIn, and WePay. He was involved in open source as the original author of Apache Samza and an early contributor to Apache Airflow. He has also written a book with Dmitriy Ryaboy called The Missing README, a guide for software engineers. Lately, he has been investing in startups in the data space.

Datacast Episode 101: Scaling Data Engineering, Building Data Teams, and Managed Data Stack with Tarush Aggarwal

Tarush Aggarwal is one of the leading experts in leveraging data for exponential growth, with over ten years of experience in the field.

After graduating with a degree in Computer Engineering from Carnegie Mellon in 2011, he became the first data engineer on the analytics team at Salesforce.com. Data was in its infancy, and the log metric framework he built was critical in allowing Salesforce to analyze data across customers and provide benchmarks across different industries and verticals.

Most recently, Tarush led Data for WeWork. WeWork leveraged data to grow 10x in 3 years, supporting a footprint of 800+ offices in 120+ cities in 23+ countries with over 12,000 employees, making WeWork one of the fastest-growing companies in the world. He scaled the data org from 2 to 100+, and their unique approach allowed them to stay lean while supporting every functional area of the business. In 2019, he moved to China to help establish WeWork’s Asia operations and focus on the hyper-growing Chinese market.

What I Learned From Tecton's apply() 2022 Conference

Back in May, I attended apply(), Tecton’s second annual virtual event for data and ML teams to discuss the practical data engineering challenges faced when building ML for the real world. There were talks on best-practice development patterns, tools of choice, and emerging architectures for building and managing production ML applications.

This long-form article dissects the 14 sessions and lightning talks from apply() that I found most useful. These talks cover three major areas: industry trends, production use cases, and open-source libraries. Let’s dive in!

Datacast Episode 86: Risk Management, Open-Source Governance, and Negative Engineering with Jeremiah Lowin

Jeremiah Lowin is the Founder & CEO of Prefect, a dataflow automation company. Before starting Prefect, Jeremiah gained extensive experience in all aspects of the modern data stack as a director of risk management, machine learning researcher, and data scientist at a number of institutional investment firms. Today, he lives with his wife and two sons in Washington, DC.

What I Learned From Attending Tecton apply(meetup) 2022

Last month, I attended another apply(meetup), the follow-up virtual event in Tecton’s ML data engineering conference series. For context, I have written recaps for both of their 2021 events: the inaugural conference and the follow-up meetup. The content below covers my learnings, ranging from model calibration and ranking systems to real-time analytics and online feature stores.

Datacast Episode 85: Ad Exchange, Stream Processing, and Data Discovery with Shinji Kim

Shinji Kim is the Founder & CEO of Select Star, an intelligent data discovery platform that helps you understand your data. Previously, she was the CEO of Concord Systems, an NYC-based data infrastructure startup acquired by Akamai Technologies in 2016. She led the development of Akamai’s Internet-of-Things data platform for real-time messaging, log processing, and edge computing.

Prior to Concord, Shinji was the first Product Manager hired at Yieldmo, where she led the Ad Format Lab, A/B testing, and yield optimization. Before Yieldmo, she analyzed data and built enterprise applications at Deloitte Consulting, Facebook, Sun Microsystems, and Barclays Capital.

Shinji studied Software Engineering at the University of Waterloo and General Management at Stanford GSB. She also advises early-stage startups on product strategy, customer development, and company building.