Datacast’s 30th episode is my conversation with Parul Pandey, a Data Science Evangelist at H2O.ai. Listen to learn about her background in Electrical Engineering, her journey into technical writing, our discussion of topics ranging from ML interpretability to data visualization, her current work at H2O.ai, her involvement with the data community in Hyderabad and the global community of women in Machine Learning and Data Science, and much more.
Parul Pandey is a Data Science Evangelist at H2O.ai. She combines data science, evangelism, and community in her work. Her emphasis is on breaking down data science jargon for a wider audience. Prior to H2O.ai, she worked with Tata Power India, applying Machine Learning and Analytics to the pressing problem of load shedding in India. She is also an active writer and speaker and has contributed to various national and international publications, including Towards Data Science, Analytics Vidhya, KDnuggets, and DataCamp.
Show Notes
(2:12) Parul talked about her educational background, studying Electrical Engineering at the National Institute of Technology, Hamirpur.
(3:18) Parul worked as a Business Analyst at Tata Power India for 7 years.
(4:29) Parul talked about her initial interests in writing about data science and machine learning on Medium.
(6:30) Parul discussed her first blog series “A Guide to Machine Learning in R for Beginners” — which covers the Fundamentals of ML, Intro to R, Distributions and EDA in R, Linear Regression, Logistic Regression, and Decision Trees.
(8:02) In reference to her articles on data visualization, Parul talked about matplotlib, seaborn, and plotly as the main visualization libraries she uses, in addition to Tableau for building dashboards.
(10:11) Parul shared her thoughts on the state of Machine Learning interpretability, in reference to her articles on this topic.
(13:54) Parul discussed the advantages of using JupyterLab over Jupyter Notebook.
(17:30) Parul discussed the common challenges of bringing recommendation systems from prototype into production. (Read her two articles about recommendation systems: an overview of different approaches, and an overview of the process of designing and building a recommendation system pipeline.)
(21:00) Parul went in-depth into her NLP project called “Building a Simple Chatbot from Scratch in Python (using NLTK).”
(23:26) Parul continued this chatbot project with a 2-part series on building a conversational chatbot with the Rasa stack and Python and deploying it on Slack.
(28:15) Parul went over her Satellite Imagery Analysis with Python piece, which examines the vegetation cover of a region with the help of satellite data.
(32:22) Parul talked about the process of Recreating Gapminder in Tableau: A Humble Tribute to Hans Rosling.
(35:17) Parul discussed her project Music Genre Classification, which shows how to analyze an audio/music signal in Python.
(39:20) Parul went over her tutorials on Computer Vision: (1) Face Detection with Python using OpenCV and (2) Image Segmentation with Python’s scikit-image module.
(42:01) Parul unpacked her tutorial “Predicting the Future with Facebook’s Prophet” — a forecasting model to predict the number of views for her Medium articles.
(44:58) Parul has been working as a Data Science Evangelist at H2O.ai since July 2019.
(47:04) Parul described Flow — H2O’s web-based interface (Read her tutorial here).
(49:23) Parul described Driverless AI — H2O’s product that automates the challenging and repetitive tasks in applied data science (Read her tutorial here).
(52:39) Parul described AutoML — H2O’s automation of the end-to-end process of applying ML to real-world problems (Read her tutorial here).
(57:07) Parul shared her secret sauce for effective data visualization and storytelling, as illustrated in her analysis of the 2019 Kaggle Survey to figure out women’s representation in machine learning and data science.
(01:02:02) Parul described the data science community in Hyderabad, through her lens as an organizer of the Hyderabad chapter of Women in Machine Learning and Data Science.
(01:05:45) Parul was recognized as one of LinkedIn’s Top Voices 2019 in the Software Development category.
(01:10:30) Closing segment.
Her Contact Info
Her Recommended Resources
“Interpretable Machine Learning: A Guide for Making Black Box Models Explainable” by Christoph Molnar
“Towards A Rigorous Science of Interpretable Machine Learning” by Finale Doshi-Velez and Been Kim
Parul’s Compilation of Data Visualization articles
Parul’s Programming with Python articles
“Factfulness” by Hans Rosling