Data Quality

Datacast Episode 133: Full Data Stack Observability with Salma Bakouk

Salma Bakouk is the CEO and co-founder of Sifflet, a Full Data Stack Observability platform. Before Sifflet, Salma was an Executive Director at Goldman Sachs in Sales & Trading in Asia, leading key Data & Analytics initiatives. Salma holds an Engineering Degree from École Centrale Paris in Applied Mathematics and a Master's in Statistics and Data Science.

Datacast Episode 127: Data Intelligence for Insurance Transformation with Heather Wentworth

Heather Wentworth is the Chief Data Officer of Accelerant, an MGA-enabling insurtech focused on improving how risk is exchanged across the insurance ecosystem. She is passionate about delivering innovative solutions and business models that improve business outcomes and drive revenue growth. She was nominated as one of the top 100 Insurance Innovators of 2022 by Lightico.

Datacast Episode 119: Experimentation Culture, Immutable Data Warehouse, The Data Collaboration Problem, and The Rise of Data Contracts with Chad Sanderson

Chad Sanderson was the Product Lead for Convoy's Data Platform team, which includes the data warehouse, streaming, BI & visualization, experimentation, machine learning, and data discovery.

Previously, he worked on Microsoft's AI Platform team and led data initiatives at SEPHORA and Subway. He has built feature stores, experimentation platforms, metrics layers, streaming platforms, analytics tools, data discovery systems, and workflow development platforms.

His love of the data space has also allowed him to implement open-source and SaaS products (early and late-stage) and build cutting-edge technology from the ground up.

Datacast Episode 118: Overcoming Hardships, Confident Learning, Dataset Improvement, and The Ph.D. Rapper with Curtis Northcutt

Curtis Northcutt is an American computer scientist and entrepreneur focusing on AI to empower people. He is the CEO and Co-Founder of Cleanlab, building next-generation data-centric AI and open-source technologies that enable AI to work with real-world, messy data.

He completed his Ph.D. at MIT, where he invented confident learning to automatically find label issues in any dataset. Curtis received the MIT thesis award, NSF Fellowship, and Goldwater Scholarship for his work. Before Cleanlab, he worked in AI research teams at Google, Oculus, Amazon, Facebook, Microsoft, and NASA.

Datacast Episode 92: Analytics Engineering, Locally Optimistic, and Marketing-Mix Modeling with Michael Kaminsky

Michael Kaminsky is the co-founder of Recast, a marketing optimization platform, and the co-founder of Analytics Engineers Club, a training course for data analysts looking to improve their engineering skills. He is passionate about helping organizations “make better decisions faster.” He has experience applying econometric research methods to environmental economics, child welfare policy, and medical treatment efficacy. In his spare time, he studies Spanish, reads, and pets dogs around Mexico City.

What I Learned From Convergence 2022

Last week, I attended Comet ML’s Convergence virtual event. The event featured presentations from data science and machine learning experts, who shared their best practices and insights on developing and implementing enterprise ML strategies. The talks covered emerging tools, approaches, and workflows that can help you effectively manage an ML project from start to finish.

In this blog recap, I will dissect content from the event’s technical talks, covering a wide range of topics from testing models in production and data quality assessment to operational ML and minimum viable model.

Datacast Episode 76: Modern Data Collaboration and Social Entrepreneurship with Prukalpa Sankar

Prukalpa Sankar is the Co-founder of Atlan. Atlan is a modern data collaboration workspace (like GitHub for engineering or Figma for design). By acting as a virtual hub for data assets ranging from tables and dashboards to models and code, Atlan enables teams to create a single source of truth for all their data assets and collaborate across the modern data stack through deep integrations with tools like Slack, BI tools, data science tools, and more. As a pioneer in the space, Atlan was recognized by Gartner as a Cool Vendor in DataOps, as one of the top 3 companies globally.

Prukalpa previously co-founded SocialCops, a world-leading data-for-good company recognized as a New York Times Global Visionary and a World Economic Forum Tech Pioneer. SocialCops is behind landmark data projects, including India’s National Data Platform and SDGs global monitoring in collaboration with the United Nations.

Prukalpa was awarded the Economic Times Emerging Entrepreneur for the Year and recognized in multiple lists such as Forbes 30u30, Fortune 40u40, and Top 10 CNBC Young Business Women 2016.

What I Learned From Attending Tecton apply(meetup) 2021

Last month, I attended apply(), the follow-up virtual event in Tecton’s ML data engineering conference series. I’ve previously written a recap of their inaugural event, a whirlwind tour of wide-ranging topics such as feature stores, ML platforms, and research on data engineering. In this shorter post, I would like to share content from the main talks and lightning talks presented at the community meetup. Topics include ML systems research, ML observability, streaming architecture, and more.

Datacast Episode 72: Folding Data with Gleb Mezhanskiy

Gleb Mezhanskiy is the CEO & Co-founder of Datafold, a data observability platform that helps companies unlock growth through more effective and reliable use of their analytical data. As a founding member of the data teams at Autodesk and Lyft and the Head of Product at Phantom Auto, Gleb has built some of the world's largest and most sophisticated data platforms and has developed tooling to improve productivity and data quality in organizations with hundreds of data users.