This is a guest article written by Han Wang and Kevin Kho, in collaboration with James Le. Han is a Staff Machine Learning Engineer at Lyft, where he serves as a Tech Lead of the ML Platform. He is also the founder of the Fugue Project. Kevin is an Open Source Engineer at Prefect, a workflow orchestration framework, and a contributor to Fugue. Opinions presented are their own and not the views of their employers.
Datacast Episode 68: Threat Intelligence, Venture Stamina, and Data Investing with Sarah Catanzaro
Sarah Catanzaro is a Partner at Amplify Partners, where she focuses on investing in and advising high potential startups in machine intelligence, data management, and distributed systems. Her investments at Amplify include startups like RunwayML, Maze Design, OctoML, and Metaphor Data, among others. Sarah also has several years of experience defining data strategy and leading data science teams at startups and in the defense/intelligence sector, including roles at Mattermark, Palantir, Cyveillance, and the Center for Advanced Defense Studies.
Datacast Episode 67: Model Observability, AI Ethics, and ML Infrastructure Ecosystem with Aparna Dhinakaran
Aparna Dhinakaran is the Chief Product Officer at Arize AI, a startup focused on ML Observability. She was previously an ML engineer at Uber, Apple, and TubeMogul (acquired by Adobe). During her time at Uber, she built several core ML Infrastructure platforms, including Michelangelo. She has a bachelor’s from Berkeley's Electrical Engineering and Computer Science program, where she published research with Berkeley's AI Research group. She is on a leave of absence from the Computer Vision Ph.D. program at Cornell University.
What I Learned From Attending REWORK MLOps and ML Fairness Summits
Last month, I attended two great summits organized by REWORK: the MLOps summit, which explored how to optimize the ML lifecycle and streamline ML pipelines for better production, and the ML Fairness summit, which explored strategies to ensure ML models are accountable and fair in order to build secure and responsible AI. As a previous attendee of REWORK’s in-person summit, I have always enjoyed the unique mix of academia and industry, which enables attendees to meet with AI pioneers at the forefront of research and explore real-world case studies to discover the business value of AI.
In this long-form blog recap, I will dissect content from the talks that I found most useful at the two summits. The post covers 10 talks that range from automated data labeling and pipeline optimization to model fairness and responsible AI at scale.
Datacast Episode 66: Monitoring Models in Production with Emeli Dral
Emeli Dral is a Co-founder and CTO at Evidently AI, a startup developing tools to analyze and monitor the performance of machine learning models. Earlier, she co-founded an industrial AI startup and served as the Chief Data Scientist at Yandex Data Factory. She led over 50 applied ML projects for various industries - from banking to manufacturing. Emeli is also a data science lecturer at St. Petersburg State Management School and Harbour.Space University. She is a co-author of the Machine Learning and Data Analysis curriculum at Coursera with over 100,000 students. She also co-founded Data Mining in Action, the largest open data science course in Russia.
Datacast Episode 65: Chaos Theory, High-Frequency Trading, and Experimentations at Scale with David Sweet
David Sweet was a quantitative trader at GETCO, where he used experimental methods to tune trading strategies, and a machine learning engineer at Instagram, where he experimented on a large-scale recommender system. He is currently writing a book called "Tuning Up," an extension of lectures given at NYU Stern on tuning high-frequency trading systems. Before working in the industry, he received a Ph.D. in Physics and published research in Physical Review Letters and Nature. The latter publication, an experiment demonstrating chaos in geometrical optics, has become a source of inspiration for computer graphics artists, undergraduate Physics instructors, and an exhibit called TetraSphere at the Museum of Mathematics in New York City.
Datacast Episode 64: Improving Access to High-Quality Data with Fabiana Clemente
Fabiana Clemente is a Data Scientist with a background that ranges from Business Intelligence to Big Data Development and IoT architecture. Throughout her professional career, she has led state-of-the-art projects in global companies and startups. She has an academic background in Applied Maths and an MSc in Data Management, combined with nanodegrees in Deep Learning and Secure and Private AI.
As YData’s Co-Founder, she combines Data Privacy with Deep Learning as her main field of work and research, with the mission to unlock data with privacy by design. She also aims to inspire more women to follow in her footsteps and join the tech community.
What I Learned From Attending REWORK AI Applications Summit 2021
Last month, I attended REWORK’s AI Applications Virtual Summit, which explored machine learning tools and techniques to improve the financial, retail, and insurance experience. As a previous attendee of REWORK’s in-person summit, I have always enjoyed the unique mix of academia and industry, which enables attendees to meet with AI pioneers at the forefront of research and explore real-world case studies to discover the business value of AI.
In this long-form blog recap, I will dissect content from the talks that I found most useful from attending the summit. The post consists of 13 talks that are divided into 3 sections: (1) AI in Finance and RegTech, (2) AI in Retail and Marketing, and (3) AI in Insurance.
Datacast Episode 63: Real-World Transfer Learning with Azin Asgarian
Azin Asgarian is currently an applied research scientist on Georgian’s R&D team, where she works with companies to help adopt applied research techniques to overcome business challenges. Azin holds a Master of Science in Computer Science from the University of Toronto and a Bachelor of Computer Science from the University of Tehran. Before joining Georgian, Azin was a research assistant at the University of Toronto and part of the Computer Vision Group, where she worked at the intersection of Machine Learning, Transfer Learning, and Computer Vision. In addition, due to her interest in healthcare, she has worked on various healthcare projects as a research assistant at University Health Network.
What I Learned From Attending Tecton's apply() Conference
Last week, I attended apply(), Tecton’s first-ever conference that brought together industry thought leaders and practitioners from over 30 organizations to share and discuss ML data engineering’s current and future state. The complexity of ML data engineering is the most significant barrier between most data teams and transforming their applications and user experiences with operational ML.
In this long-form blog recap, I will dissect content from the 23 sessions and lightning talks that I found most useful at apply(). These talks cover everything from the rise of feature stores and the evolution of MLOps to novel techniques and scalable platform design. Let’s dive in!