Datacast Episode 81: Research, Engineering, and Product in Machine Learning with Aarti Bagul

Aarti Bagul is a machine learning engineer at Snorkel AI. Before Snorkel, she worked closely with Andrew Ng in various capacities: (1) at AI Fund helping build ML companies from scratch internally and investing in ML companies, (2) as an ML engineer at his startup Landing AI, (3) as head TA for his deep learning class CS230, and (4) as an assistant in his research lab at Stanford.

Aarti graduated with a master’s in Computer Science from Stanford, where she participated in the Threshold Venture and Greylock X fellowships. Before Stanford, she got her bachelor’s in Computer Science and Computer Engineering from NYU with the highest honors. During her time at NYU, she worked in David Sontag’s lab on machine learning applications to clinical medicine and at Microsoft Research as a research intern for John Langford (where she contributed to Vowpal Wabbit, an open-source project).

Datacast Episode 80: Creating The Sense of Sight with Alberto Rizzoli

Alberto Rizzoli is the co-founder of V7, a platform for deep learning teams to manage training data workflows and create image recognition AI. V7 is used by AI-first companies and enterprises, including Honeywell, Merck, General Electric, and MIT.

Alberto founded his first startup at age 19 and made the MakerFaire’s 20 under 20 list. In 2015, he began working on AI with Simon Edwardson while studying under Ray Kurzweil, leading to the creation of the first engine capable of running large deep neural networks on smartphone CPUs. Later, this project became Aipoly, a startup that helped the blind identify over 3 billion objects to date using their phones.

Alberto's work on AI earned him an award from and a personal audience with Italian President Sergio Mattarella, as well as Italy’s Premio Gentile for Science and Innovation. V7's underlying technology won the CES Best of Innovation in 2017 and 2018.

What I Learned From The Modern Data Stack Conference 2021

Back in September 2021, I attended the second annual Modern Data Stack Conference, Fivetran’s community-focused event that brings together hundreds of data analysts, data engineers, and data leaders to share the impact and experiences of next-generation analytics. The presenters shared the transformations they experienced with their analytics teams, the new insights and tooling they enabled, and the best practices they employ to drive insights across their organizations.

In this long-form blog recap, I will dissect content from 14 sessions that I found most useful from the conference. These talks are broken down into 4 categories tailored to 4 personas: data engineers, data analysts, product managers, and data team leads. Let’s dive in!

Datacast Episode 79: Analytics Culture, Digital Contracting, and Data Angels with Jessica Cherny

Jessica Cherny currently runs the data analytics team at Ironclad, a Series D digital contracting startup. As part of her job, Jessica builds out the data analytics function and provides analytics and data insights to inform business decisions for the product, customer success, engineering, sales, marketing, and operations organizations. Prior to joining Ironclad, she graduated from UC Berkeley as part of the university’s first cohort of Data Science majors.

Outside of work, Jessica is passionate about data mentorship and founded Data Angels -- a Slack community of women in data that provides resources, support, job opportunities, education, and community to its members.

Datacast Episode 78: Open-Source Investing and Data Product Management with Julia Schottenstein

Julia Schottenstein is a Product Manager at dbt Labs, the maker of the popular open-source project dbt. Before joining dbt Labs, Julia was an investor at New Enterprise Associates, where she invested in infrastructure, developer tools, open-source, and data startups. She currently sits on the boards of Sentry and Metabase while being an active angel investor. Julia graduated from Stanford University with degrees in Computer Science and Management Science & Engineering.

2021 Annual Review: The Year of Presence

The end of the year is usually a time for reflection, gratitude, and planning. With our inboxes quiet and our schedules free of meetings, we can afford to take a moment and think about what we achieved and what’s next. This is my 8th straight year writing an annual review, and I want to switch up the format this time by looking at critical areas such as work, productivity, health, relationships, and learning. Because it’s important to acknowledge both what went well and what could have gone better, I’ll make sure to highlight areas where I want to improve.

What I Learned From The Future of Data-Centric AI 2021

Last September, I attended Snorkel AI’s The Future of Data-Centric AI. This summit connects experts on data-centric AI from academia, research, and industry to explore the shift from a model-centric practice to a data-centric approach to building AI. There were talks discussing the challenges, solutions, and ideas to make AI practical, both now and in the future.

In this blog recap, I will dissect content from the conference’s session talks, covering a wide range of topics from weak supervision and fine-grained error analysis to MLOps design principles and data-centric AI case studies.

What I Learned From The Open Source Data Stack Conference 2021

In late September 2021, I attended the Open Source Data Stack Conference, the first-ever conference dedicated to building a modern data stack using open-source data solutions. The emergence of the modern data stack has brought a rapid spike in the number of data tools an organization can use to drive better decision-making. Open-source software helps you control the end-to-end flow of customer data throughout your organization to guarantee data auditability, allow data governance, support consumer data privacy, and enable productive engineering workflows.

In this blog recap, I will dissect content from the conference’s session talks, each being a building block of the open-source data stack to demonstrate how teams can build a data stack that reflects their needs.

Datacast Episode 76: Modern Data Collaboration and Social Entrepreneurship with Prukalpa Sankar

Prukalpa Sankar is the co-founder of Atlan, a modern data collaboration workspace (like GitHub for engineering or Figma for design). By acting as a virtual hub for data assets ranging from tables and dashboards to models and code, Atlan enables teams to create a single source of truth for all their data assets and collaborate across the modern data stack through deep integrations with tools like Slack, BI tools, data science tools, and more. As a pioneer in the space, Atlan was recognized by Gartner as a Cool Vendor in DataOps, as one of the top 3 companies globally.

Prukalpa previously co-founded SocialCops, a world-leading data-for-good company recognized as a New York Times Global Visionary and a World Economic Forum Tech Pioneer. SocialCops is behind landmark data projects, including India’s National Data Platform and SDGs global monitoring in collaboration with the United Nations.

Prukalpa was awarded the Economic Times Emerging Entrepreneur for the Year and recognized in multiple lists such as Forbes 30u30, Fortune 40u40, and Top 10 CNBC Young Business Women 2016.