Datacast Episode 66: Monitoring Models in Production with Emeli Dral
The 66th episode of Datacast is my conversation with Emeli Dral, the co-founder and CTO at Evidently AI, a startup developing tools to analyze and monitor the performance of machine learning models.
Our wide-ranging conversation touches on her educational background in Applied Mathematics and Computer Science; her work on recommendation systems and applied ML at Yandex; her popular teaching materials online and in-person; her startup on Industrial AI; her current journey with Evidently to tackle the model monitoring space; and much more.
Please enjoy my conversation with Emeli!
Listen to the show on (1) Spotify, (2) Apple Podcasts, (3) Google Podcasts, (4) TuneIn, (5) RadioPublic, (6) Stitcher, and (7) iHeart Radio.
Key Takeaways
Here are highlights from my conversation with Emeli:
On Studying Applied Math in College
When choosing universities, there were pretty much two options for me. The first option was to go to a top technical university in Russia, such as St. Petersburg State or Moscow State University. The second option was to enroll in a university with great international connections and many students from different countries. I am from a multicultural family (my father is African, and my mother is Russian). I was very interested in meeting people from different cultures, so I went for the second option. That was the best decision of my life.
The Peoples’ Friendship University of Russia is not a technical university, but it has many nice mathematical courses. As a result, I got a solid mathematics background but lacked modern computer science courses. I compensated for that by enrolling at Yandex School of Data Analysis.
My favorite math classes were all connected to statistics because I love the idea of making decisions based on data. Notable courses included probability theory, introductory statistics, and financial mathematics.
On Getting Into Computer Science
During the third year of my bachelor’s degree, I started to think about my future job. I searched for different options at different companies and finally decided that I needed to go to Yandex because it’s one of the best IT companies in Russia. I figured that I lacked experience in computer science and programming because I hadn’t participated in any programming competitions or worked on any programming projects. Therefore, I started to think about getting more experience and becoming a better engineer to at least have a chance at Yandex.
I found out that they had Master’s programs, so I prepared really, really hard. I spent three months studying everything I needed to get into the school. It’s a competitive program, and I thought I had only a tiny chance because I was not from the best technical university. But I decided to try anyway, and it worked out. I was super happy. That was the second best decision of my life.
In 2010, Yandex had only two programs: one on data science/analysis and one on computer science. I thought that I needed to focus on the engineering courses, so I went for the computer science program.
On Working as a Software Developer
My first job, at Rambler, was very stressful for me. I wasn’t sure that I was good enough to start my career as a software engineer. My previous internship had been in data analysis, and I was worried about my engineering skills. My first big task was to work with distributed systems: I had to learn how to write MapReduce jobs, write Hive queries, use distributed filesystems, and other similar things. I was so scared that I would break something.
Eventually, I figured out that it’s okay to learn on the fly while working. If I want to stay in this profession for a long time, I need to be comfortable with this learning process.
I still remember everything I learned about distributed computation. In the beginning, I was really scared to run distributed MapReduce jobs: if something went wrong, the whole cluster would break, and everybody would know it was my fault. But later on, the process became more experimental, and I enjoyed it.
On Building E-Commerce Recommendation Systems at Yandex
After finishing Yandex School of Data Analysis, I knew that Yandex would be my home company. I love the Yandex culture; even while working at Rambler, I kept in touch with people from Yandex. So when they walked me through the challenges and problem statements they were tackling in recommender systems, I was so happy to join.
There were many expectations for Yandex’s system: the response time should be short, and the solution should be stable enough to be handed over to the engineering team. This was when I learned how to write production-grade code and good tests, create a stable system with many fallbacks, and design a nice database schema.
I also learned about the straightforward connection between the quality of ML models and business KPIs. It’s vital to know how the solution impacts real users and aligns with the right metrics.
On Applied Machine Learning at Yandex Data Factory
I was super proud that I got invited to join Yandex Data Factory. It was a new department focused on applied ML for different businesses, and it brought in many experienced data scientists from inside Yandex. I was not sure that I was good enough to join the team, so I badly wanted to prove that I was capable.
In my first project, we worked with one of the biggest communications companies in Russia.
They were trying to solve a churn prediction/prevention problem: they had developed an ML model to predict which customers would leave the service so that they could retain them. They wanted to compare the quality of our model with their internal ones.
At the time, I felt that I needed to be in this competition to represent Yandex. So I spent way too much time on this project. We ended up winning the competition.
The second project was more interesting from a business point of view.
I worked with an engineering company that was trying to solve a human resources problem. They had a lot of engineering interns who would leave in under a year. The company lost quite a lot of money training these interns, so it definitely wanted to retain them as long as possible.
They gave us their internal data and asked us to build ML models to predict resignations. In the beginning, many thought it’d be impossible to predict who would leave the company because there could be many different motivations. So I invested a lot of time learning the domain area, iterating through various dataset versions, examining individual features, etc.
At the very end, after 5–6 iterations, our final model correctly predicted 26 out of 50 resignations on an unseen dataset. This was a huge success, and our client was surprised that ML actually worked.
On Challenges in Industrial AI
Industrial AI differs from online services significantly. While working for any online company, you have access to user-generated data. Events are associated with users and can be aggregated by user IDs, making it easy to create features, build training datasets, and proceed with the ML lifecycle.
When it comes to manufacturing, you don’t have the data associated directly with the objects.
Let’s say you are trying to predict surface defects on steel. Pretty much all the data you have comes from sensors placed elsewhere in the process, and there’s no direct connection between the sensor data and the objects. You need to do many aggregations (in most cases, by time) to make these associations. If you fail at this step, it’d be impossible to continue the project.
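To make the aggregation step concrete, here is a minimal pandas sketch; the column names, timestamps, and batch layout are all hypothetical stand-ins for real plant data. It assigns each time-stamped sensor reading to the production batch running at that moment, then aggregates the readings into per-batch features:

```python
import pandas as pd

# Hypothetical setup: sensor readings carry timestamps but no batch IDs.
sensors = pd.DataFrame({
    "ts": pd.to_datetime(["2021-03-01 10:01", "2021-03-01 10:04", "2021-03-01 10:12"]),
    "temperature": [1520.0, 1534.5, 1498.2],
})
batches = pd.DataFrame({
    "batch_start": pd.to_datetime(["2021-03-01 10:00", "2021-03-01 10:10"]),
    "batch_id": ["B-001", "B-002"],
})

# merge_asof matches each reading to the most recent batch start;
# both frames must be sorted by their time keys.
readings = pd.merge_asof(
    sensors.sort_values("ts"),
    batches.sort_values("batch_start"),
    left_on="ts",
    right_on="batch_start",
)

# Aggregate the sensor signal per batch into candidate training features.
features = readings.groupby("batch_id")["temperature"].agg(["mean", "max", "min"])
print(features)
```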
Thus, I believe that for industrial AI, it’s super important to learn the data by heart: understanding how the data was generated, how the attributes relate to each other, whether there are any time gaps, etc.
Furthermore, it’s even more important to dig deep into the domain area. Without understanding how steel production works (for example), it’s impossible to come up with problem statements and start the project.
On Making Coursera Courses
This happened when Coursera went after the Russian market. I was invited to work on the “Machine Learning and Data Analysis” Coursera specialization by Konstantin Vorontsov, an ML instructor at Yandex School of Data Analysis and a faculty member at Moscow Institute of Physics and Technology. I prepared seminars with hands-on Python programming and applied projects.
In the beginning, nobody actually knew how much time we would end up investing in this project. It took us over a year to create this specialization. It became a second job for us. We had to really plan how to structure the materials and synchronize content across modules. I learned how to work in teams very efficiently.
It’s also not something that can be 100% perfected; there will always be something that could be done better. So it’s important to figure out when it’s “good enough” and launch the product then.
It’s important to have thick skin. When you publish something on the Internet, you should be ready to get feedback.
On Founding Evidently AI
At Mechanica AI, our production ML systems needed to work correctly; if something failed in production, money would be lost. Therefore, we set up the monitoring scheme from scratch for every project because there were various parts of the solution that we needed to pay attention to.
In some cases, we needed to monitor our data sources, as they were unstable due to broken sensors.
In other cases, we needed to monitor our pipeline, as there were different feature engineering steps inside a data pipeline.
I figured out that there was no general solution for ML monitoring. So I thought: why not try to build a product that can look after ML models in production? That’s how Evidently AI was born.
On Model Monitoring
Most companies start monitoring their ML models only after their first major failure. I think that’s a mistake: they should prepare for such a scenario and monitor models from the beginning. It’s also important to understand that an ML-based service is different from other services; thus, it’s crucial to monitor service health, response time, memory/GPU usage, etc.
The data layer is part of the solution, so I would even say that when something is wrong with your models, it’s most likely due to the input data. Therefore, it’s even more crucial to monitor the input data. It’s important to analyze your specific case to determine where your models can break and choose the appropriate monitoring strategy.
When it comes to data quality and data integrity, many things can happen to your models.
For instance, you can lose access to data (your data sources might be broken), and your models might lose very important signals.
Or, for example, your ML models might use data from a CRM system. If a CRM user changes the data schema, your models will break because they now receive input in the new schema. Such things happen frequently, especially with 3rd-party sources, so you need data integrity monitoring for them.
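As an illustration, a minimal integrity check along these lines might validate each incoming batch against the schema and null rates seen at training time; the expected columns, types, and thresholds below are hypothetical:

```python
import pandas as pd

# Hypothetical expectations, derived from the training data.
EXPECTED_COLUMNS = {"customer_id": "int64", "monthly_spend": "float64", "plan": "object"}

def check_integrity(batch: pd.DataFrame) -> list:
    issues = []
    # 1. Schema check: a renamed or dropped upstream field shows up here first.
    missing = set(EXPECTED_COLUMNS) - set(batch.columns)
    if missing:
        issues.append(f"missing columns: {sorted(missing)}")
    # 2. Type check: silent type changes often follow upstream edits.
    for col, dtype in EXPECTED_COLUMNS.items():
        if col in batch.columns and str(batch[col].dtype) != dtype:
            issues.append(f"{col}: expected {dtype}, got {batch[col].dtype}")
    # 3. Missing-value check: broken sources often arrive as runs of nulls.
    for col, share in batch.isna().mean().items():
        if share > 0.2:  # threshold is an assumption; tune it per feature
            issues.append(f"{col}: {share:.0%} nulls")
    return issues
```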
Issues like data drift and concept drift occur when the distributions of your features and targets change. If your models aren’t prepared for that, they will degrade or break.
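One common way to catch such drift on a numerical feature is a two-sample statistical test against a reference window (the Kolmogorov–Smirnov test below is among the tests Evidently itself applies). A minimal sketch on synthetic data:

```python
import numpy as np
from scipy.stats import ks_2samp

# Synthetic stand-ins: training data vs. a shifted production window.
rng = np.random.default_rng(42)
reference = rng.normal(loc=0.0, scale=1.0, size=1_000)
production = rng.normal(loc=0.5, scale=1.0, size=1_000)

# The KS test compares the two empirical distributions.
statistic, p_value = ks_2samp(reference, production)
if p_value < 0.05:  # the significance level is a common but arbitrary choice
    print(f"drift detected (KS={statistic:.3f}, p={p_value:.4f})")
```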
On Open-Source Roadmap
My co-founder, Elena, and I have discussed open-source a lot. It’s hard to be a perfectionist because we are building in public. If something doesn’t work, then everybody is going to see it. So it was a hard decision for us.
But when you build something in public, you get feedback very early and can test hypotheses fast.
When you build a monitoring system to evaluate other systems, it’s better for users to understand how it actually works, what problems it accounts for, and what its weaknesses are. Real-world ML systems impact people’s lives directly (healthcare, finance, social services). No one actually has the whole picture of all possible problems that can happen. When we build a monitoring tool in public, we can aggregate the experiences of many different engineers and business specialists. For us, that was the biggest reason to build Evidently AI in public.
I faced the challenges of monitoring ML models in production at Mechanica and Yandex Data Factory, so we prioritized those issues as the base of our roadmap. I also spent a lot of time talking with potential users from various companies, and we figured out that data drift and concept drift are the most top-of-mind issues for them. Therefore, we further prioritize the roadmap based on user feedback.
On The Data Community in Moscow
The data community in Russia is large and young. We have a lot of young data scientists who are just finishing school. They are very active and enthusiastic.
Another fun fact about Russian data scientists: we are really good at implementing things, even if it means reinventing the wheel (“reimplementing bicycles,” as the Russian idiom goes). We like to reimplement things ourselves, and sometimes that’s not very efficient.
Show Notes
(02:07) Emeli shared her educational background, getting degrees in Applied Mathematics and Informatics from the Peoples’ Friendship University of Russia in the early 2010s.
(04:24) Emeli went over her experience getting a Master’s Degree at Yandex School of Data Analysis.
(07:06) Emeli reflected on lessons learned from her first job out of university working as a Software Developer at Rambler, one of the biggest Russian web portals.
(09:33) Emeli walked through her first year as a Data Scientist developing e-commerce recommendation systems at Yandex.
(13:38) Emeli discussed core projects accomplished as the Chief Data Scientist at Yandex Data Factory, Yandex’s end-to-end data platform.
(17:52) Emeli shared her learnings transitioning from an IC to a manager role.
(19:21) Emeli mentioned key components of success for industrial AI, given her time as the co-founder and Chief Data Scientist at Mechanica AI.
(22:40) Emeli dissected the makings of her Coursera specializations — “Machine Learning and Data Analysis” and “Big Data Essentials.”
(26:14) Emeli discussed her teaching activities at Moscow Institute of Physics and Technology, Yandex School of Data Analysis, Harbour.Space, and Graduate School of Management — St. Petersburg State University.
(30:12) Emeli shared the story behind the founding of Evidently AI, which is building a human interface to machine learning, so that companies can trust, monitor, and improve the performance of their AI solutions.
(32:32) Emeli explained the concept of model monitoring and exposed the monitoring gap in the enterprise (read Part 1 and Part 2 of the Monitoring series).
(34:13) Emeli looked at possible data quality and integrity issues while proposing how to track them (read Part 3, Part 4, and Part 5 of the Monitoring series).
(36:47) Emeli revealed the pros and cons of building an open-source product.
(39:13) Emeli talked about prioritizing product roadmap for Evidently AI.
(41:24) Emeli described the data community in Moscow.
(42:03) Closing segment.
Emeli’s Contact Info
Evidently AI’s Resources
Mentioned Content
Blog Posts
ML Monitoring, Part 1: What Is It and How It Differs? (Aug 2020)
ML Monitoring, Part 2: Who Should Care and What We Are Missing? (Aug 2020)
ML Monitoring, Part 3: What Can Go Wrong With Your Data? (Sep 2020)
ML Monitoring, Part 4: How To Track Data Quality and Data Integrity? (Oct 2020)
ML Monitoring, Part 5: Why Should You Care About Data And Concept Drift? (Nov 2020)
ML Monitoring, Part 6: Can You Build a Machine Learning Model to Monitor Another Model? (April 2021)
Courses
People
Yann LeCun (Professor at NYU, Chief AI Scientist at Facebook)
Tomas Mikolov (the creator of Word2Vec, ex-scientist at Google and Facebook)
Andrew Ng (Professor at Stanford, Co-Founder of Google Brain, Coursera, and Landing AI, Ex-Chief Scientist at Baidu)
Book
“The Elements of Statistical Learning” (by Trevor Hastie, Robert Tibshirani, and Jerome Friedman)
New Updates
Since the podcast was recorded, a lot has happened at Evidently! You can use the open-source tool (https://github.com/evidentlyai/evidently) to generate a variety of interactive reports on ML model performance and integrate it into your pipelines using JSON profiles.
This monitoring tutorial is a great showcase of what can go wrong with your models in production and how to keep an eye on them: https://evidentlyai.com/blog/tutorial-1-model-analytics-in-production.
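As a quick taste of the library, here is a minimal sketch assuming the Report API from later Evidently releases; the API has changed across versions (the JSON profiles mentioned above were an earlier interface), so check the current docs. A toy dataset stands in for your own reference and production data:

```python
from sklearn import datasets
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# Toy data: the first half acts as the reference, the second as "production."
iris = datasets.load_iris(as_frame=True).frame
reference = iris.iloc[:75]
current = iris.iloc[75:]

# Compare the production window against the reference for data drift.
report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)

report.save_html("data_drift_report.html")  # interactive HTML report
drift_summary = report.as_dict()            # JSON-friendly output for pipelines
```

The HTML file is the interactive report for humans; the dict/JSON output is what you would wire into an automated pipeline check.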
About the show
Datacast features long-form conversations with practitioners and researchers in the data community to walk through their professional journey and unpack the lessons learned along the way. I invite guests coming from a wide range of career paths — from scientists and analysts to founders and investors — to analyze the case for using data in the real world and extract their mental models (“the WHY”) behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.
Datacast is produced and edited by James Le. Get in touch with feedback or guest suggestions by emailing khanhle.1013@gmail.com.
Subscribe by searching for Datacast wherever you get podcasts or click one of the links below:
If you’re new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.