Datacast’s 13th episode is my conversation with Martina Pugliese, data science lead at Mallzee, an Edinburgh-based startup known as “Tinder for Fashion.” Give it a listen to learn about her academic background in Physics, her opinions on the relationship between the technical side and the scientific side of data science, her thoughts on the lack of women in scientific fields, and much more.
Martina is a physicist and works as Data Science Lead at Mallzee, based in Edinburgh (Scotland). She loves looking at data regardless of topic or area and believes the most enjoyable thing in Data Science is analyzing your data, finding what you need to answer your question, and producing facts out of it. Throughout her education and job experience, she has worked with data in epidemic dynamics, linguistics, and fashion. She also loves producing hand-crafted data visualizations and keeps studying and improving, whether it’s about Machine Learning or leadership topics.
Show Notes
(2:20) Martina recalled her experience getting Bachelor’s, Master’s, and Ph.D. degrees in Physics from Sapienza Università di Roma.
(3:35) Martina discussed her Ph.D. thesis, in which she applied the study of Complex Systems to Linguistics and examined how natural language evolves over time.
(6:04) Martina talked about her experience doing the S2DS bootcamp in London after finishing her Ph.D.
(7:10) Martina gave her reasons for moving to the UK while looking for a tech job.
(8:05) Martina talked about the importance of software engineering from her time working for the education company Twig World.
(10:07) Martina discussed her current job as a Data Scientist at Mallzee, also known as “Tinder for Fashion.”
(11:46) Martina briefly went over her work in recommendation systems, data analytics, and statistical modeling during her first two years at Mallzee.
(13:50) Martina explained the unique features of fashion that make it a fertile field to do data science work.
(17:30) Martina emphasized the importance of communication for a data scientist.
(22:20) Martina talked about her transition to the Data Science Lead role at Mallzee, which she has held since 2017.
(24:26) Martina gave her opinion about the relationship between the technical side and the scientific side of data science. (“Data Science Down the Line”)
(26:54) Martina talked about the importance of learning statistics for people coming from an engineering background who want to get into data science.
(30:30) Martina explained the analogy in her post “Don’t make recipes out of them,” where she compared doing data science to cooking.
(35:00) Martina discussed the fundamental problem in academia that is making it lose its appeal to young, talented individuals. (“Scientific publishing”)
(38:48) Martina talked about her experience organizing the PyData Edinburgh Meetup.
(42:04) Martina advocated for contributing to conversations to raise awareness about women in scientific fields. (“Women amaze”)
(46:45) Martina discussed her project, which uses the corpora in the NLTK library to analyze how the number of types (distinct words) grows with text size. (“The growth of vocabulary in different languages”)
(50:52) Martina mentioned her attempt to learn D3 to visualize data. (“Rallying into D3”)
(53:20) Martina moved on to her project analyzing tags on Stack Overflow. (“Stackoverflow Tags”)
(59:30) Martina talked about using TensorFlow for her deep learning project. (“TensorFlow: Create the training set for the object detection”)
(01:03:23) Martina gave her prediction on how Data Science and Machine Learning will evolve in the next couple of years.
(01:07:08) Closing segments.
Her Contact Info
Her Recommended Resources
Scott Murray’s “Interactive Data Visualization For The Web”
Elijah Meeks’s “D3.js In Action”
Malcolm Maclean’s “D3 Tips and Tricks”
Trevor Hastie, Robert Tibshirani, Jerome Friedman’s “The Elements of Statistical Learning”
François Chollet’s “Deep Learning with Python”