How to detect and mitigate machine learning model drift
Some words, in certain contexts, have inherently pleasant connotations. Take the term drift, for example. You might think of drifting off to sleep on a relaxing Sunday afternoon or a colorful fall leaf drifting down a river.
When it comes to machine learning (ML) models, however, there are no positive associations with the word drift. In fact, quite the opposite. Model drift can significantly degrade the performance of an ML system over time — not to mention the anxiety it can cause for data scientists.
Understanding what model drift is, how to detect it (easy to say, tricky to do) and how to mitigate it when it occurs is essential to maintaining your model's integrity and reliability.
What is model drift?
When the data an ML model was trained on becomes outdated or no longer represents current real-world conditions, model drift can occur. The result is that the model's performance declines as these data changes degrade its prediction capabilities.
Consider an ML model created to detect spam emails. The tactics spammers use change over time, so a model trained using spam email examples from several years ago wouldn't necessarily be able to detect a recent spam email. The model's prediction power has degraded due to model drift.
There are a few types of drift. Here are two of the most common to watch for in your ML models:
- Concept drift: This type of drift occurs when the relationship between the model's inputs and the outcome it predicts changes. In the spam email detection model above, for example, the characteristics that mark an email as spam changed over time, so the patterns the model learned no longer identify spam reliably.
- Data drift: This occurs when the distribution of the input data itself changes over time, for example due to seasonal shifts or changes in consumer preferences. Consider an ML model that was trained to predict the likelihood of a customer buying a product based on their age and income. This model could become less accurate when the distribution of the customers' ages and incomes changes significantly over time.
Measurements to detect model drift
Comparing the values your ML model predicts to the actual values observed afterward is a commonly used method for detecting model drift. If the model's accuracy decreases over time, with its predictions deviating farther and farther from the actual values, drift is a likely cause.
There are a number of metrics that can be used to measure model accuracy, with certain ones being more relevant than others, depending on the situation.
F1 score
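One of the most commonly used metrics to measure model accuracy, F1 score combines precision and recall into a single number (their harmonic mean). With this metric, 1 indicates perfect precision and recall, and 0 indicates that the model made no correct positive predictions. A single low score does not prove drift on its own. The signal to watch for is a sustained decline: if your model's F1 score on recent data falls noticeably below the score it achieved when it was deployed, your model may be drifting.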
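To make the comparison concrete, here is a minimal Python sketch that computes F1 for a baseline period and a recent period and flags a drop. The labels, the `f1_has_dropped` helper and the 0.05 tolerance are hypothetical choices for illustration, not values from this article. In practice you would pick a tolerance that suits your use case.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the F1 score for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    if true_pos == 0:
        return 0.0  # no correct positive predictions, so F1 is 0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


def f1_has_dropped(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """Flag a drop larger than `tolerance` (an assumed, tunable threshold)."""
    return baseline - current > tolerance


# Hypothetical labels: 1 = spam, 0 = not spam.
baseline_true = [1, 1, 1, 0, 0, 0, 1, 0]
baseline_pred = [1, 1, 1, 0, 0, 0, 1, 0]  # predictions at deployment time
recent_true = [1, 1, 1, 0, 0, 0, 1, 0]
recent_pred = [1, 0, 0, 0, 1, 0, 1, 0]    # predictions on recent emails

baseline_f1 = f1_score(baseline_true, baseline_pred)
recent_f1 = f1_score(recent_true, recent_pred)
print(f"baseline F1: {baseline_f1:.3f}, recent F1: {recent_f1:.3f}")
print("possible drift:", f1_has_dropped(baseline_f1, recent_f1))
```

Computing the score over regular time windows, rather than once, makes it easier to tell a sustained decline from a single noisy batch.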
Kolmogorov-Smirnov (K-S) test
This metric, commonly used to determine whether a data sample comes from a specific population or whether two data samples come from the same population, can also be used to detect ML model drift.
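By comparing the cumulative distributions of two datasets, such as a feature's values in the training data and in recent production data, you can determine whether they plausibly come from the same distribution. If the test finds a statistically significant difference between them, the data has shifted, which is a warning sign that your model is drifting.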
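As a rough illustration, the two-sample K-S statistic is the largest gap between the two empirical cumulative distributions. The sketch below computes it in plain Python and compares it with the usual large-sample critical value at a 5% significance level. The age values are made up, and the function names are hypothetical. The critical-value formula is an approximation that works best with larger samples than the ones shown here.

```python
import math
from bisect import bisect_right
from typing import Sequence


def ks_statistic(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


def ks_critical_value(n: int, m: int, c_alpha: float = 1.358) -> float:
    """Approximate critical value; 1.358 corresponds to a 5% significance level."""
    return c_alpha * math.sqrt((n + m) / (n * m))


# Hypothetical customer ages: training data vs. recent production data.
training_ages = [float(age) for age in range(20, 60)]
current_ages = [float(age) for age in range(35, 75)]

d_stat = ks_statistic(training_ages, current_ages)
d_crit = ks_critical_value(len(training_ages), len(current_ages))
print(f"K-S statistic: {d_stat:.3f}, critical value: {d_crit:.3f}")
print("distributions differ:", d_stat > d_crit)
```

If SciPy is available, `scipy.stats.ks_2samp` performs the same test and also returns a p-value, which is usually more convenient than a hand-rolled version.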
Population stability index (PSI)
This statistical measure is used to determine how much the distribution of a variable has changed either over time or between two samples. Applied to an ML model, it measures how different (if at all) the current data is compared to the model's training data.
A high PSI value indicates that there is a significant difference in the two datasets, which suggests that model drift has occurred.
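A minimal sketch of the calculation, assuming equal-width bins derived from the training data: PSI sums `(actual - expected) * ln(actual / expected)` over the bins, where each term uses the share of records falling in that bin. The bin count, the small `eps` floor for empty bins and the sample values are illustrative assumptions. A commonly cited rule of thumb treats values below about 0.1 as stable and values above about 0.25 as a significant shift, but these cutoffs are conventions rather than strict rules.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a reference sample (expected) and a newer sample (actual)."""
    low, high = min(expected), max(expected)
    width = (high - low) / bins

    def bin_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            index = int((value - low) / width) if width > 0 else 0
            index = min(max(index, 0), bins - 1)  # out-of-range values go to edge bins
            counts[index] += 1
        # Floor each share at eps so empty bins do not cause log(0) or division by zero.
        return [max(count / len(values), eps) for count in counts]

    expected_shares = bin_shares(expected)
    actual_shares = bin_shares(actual)
    return sum(
        (a - e) * math.log(a / e) for e, a in zip(expected_shares, actual_shares)
    )


# Hypothetical customer ages: training data vs. recent production data.
training_ages = [float(age) for age in range(20, 60)]
current_ages = [float(age) for age in range(35, 75)]

psi_same = population_stability_index(training_ages, training_ages)
psi_shifted = population_stability_index(training_ages, current_ages)
print(f"PSI (same data): {psi_same:.3f}")
print(f"PSI (shifted data): {psi_shifted:.3f}")
```

The value of `eps` strongly affects the result when bins are empty, so it is worth keeping it fixed when comparing PSI values over time.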
How to address model drift
When it comes to mitigating model drift, there's no one-size-fits-all method. There are, however, a number of tactics you can employ to help ensure your model remains accurate over time.
Use high-quality training data
In some cases, performance challenges aren't caused by drift, but rather by the quality (or lack thereof) of the data used to train the model. For example, using biased training data can lead to inaccurate model output. That's why using clean source data that's as free of bias as possible is crucial for ensuring your model's accuracy and reliability over time.
Retrain your model
Let's say you've trained your model using the best quality data possible. Unfortunately, due to situations beyond your control, that data can still become outdated or no longer represent real-world conditions. In such cases, retraining your model will be necessary.
There are a number of retraining approaches you can use, depending on the nature of the drift. If the data is outdated, you should use only new data to retrain your model. If, however, the old data is still valid and wouldn't necessarily be causing the drift, you can use both the old data and new data to retrain your model.
If your model is prone to continual drift, you can use an online learning approach, in which an ML model learns in real time from streaming data. This method helps the model remain up to date because it's constantly learning from new data as it arrives.
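To show the idea of learning from a stream, here is a small sketch of a logistic regression model that updates its weights one example at a time. The `OnlineLogisticRegression` class, its `learn_one` method, the learning rate and the toy stream are all hypothetical and written for this illustration. They are not the interface of any particular library. The stream deliberately flips its labelling rule halfway through so you can see the model adapt.

```python
import math
from typing import List, Sequence


class OnlineLogisticRegression:
    """Toy binary classifier updated with one gradient step per example."""

    def __init__(self, n_features: int, learning_rate: float = 0.1) -> None:
        self.weights: List[float] = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x: Sequence[float]) -> float:
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() numerically safe
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x: Sequence[float], y: int) -> None:
        error = self.predict_proba(x) - y  # gradient of log loss w.r.t. z
        self.weights = [
            w - self.learning_rate * error * xi for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error


model = OnlineLogisticRegression(n_features=1)

# Phase 1: a positive feature value means class 1.
for i in range(200):
    x = [1.0] if i % 2 == 0 else [-1.0]
    model.learn_one(x, 1 if x[0] > 0 else 0)
p_before = model.predict_proba([1.0])

# Phase 2: the rule flips, and the model keeps learning from the stream.
for i in range(400):
    x = [1.0] if i % 2 == 0 else [-1.0]
    model.learn_one(x, 0 if x[0] > 0 else 1)
p_after = model.predict_proba([1.0])

print(f"P(class 1 | x=1.0) before the change: {p_before:.3f}")
print(f"P(class 1 | x=1.0) after the change:  {p_after:.3f}")
```

Some libraries offer incremental training out of the box. In scikit-learn, for instance, estimators such as `SGDClassifier` expose a `partial_fit` method for this purpose.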
Tune your model
If you retrain your model but it's still drifting, consider tuning it. For example, if your model allows for weighting of the data, you can tune it using all available data with higher weights assigned to recent data to ensure the model puts more emphasis on it.
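One possible way to weight recent data more heavily is exponential decay, where a record's weight halves every fixed number of days. The sketch below is an assumption-laden example: the `recency_weights` helper, the 90-day half-life and the sample values are hypothetical, and other weighting schemes may suit your data better.

```python
from typing import List, Sequence


def recency_weights(
    ages_in_days: Sequence[float], half_life_days: float = 90.0
) -> List[float]:
    """Weight of 1.0 for brand-new records, halving every `half_life_days`."""
    return [0.5 ** (age / half_life_days) for age in ages_in_days]


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Average in which higher-weight records count for more."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical records: how old each one is and the value it carries.
ages = [0.0, 90.0, 180.0]
values = [10.0, 20.0, 40.0]

weights = recency_weights(ages)
plain_average = sum(values) / len(values)
recent_leaning_average = weighted_mean(values, weights)

print("weights:", weights)
print(f"plain average: {plain_average:.2f}")
print(f"recency-weighted average: {recent_leaning_average:.2f}")
```

Many training APIs accept per-record weights directly. In scikit-learn, for example, a number of estimators take a `sample_weight` argument in `fit`, so the list returned by `recency_weights` could be passed there.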
Other ways to tune your model include experimenting with different features, optimizing the hyperparameters or using different model architectures to help update your model and keep it in line with new data.
Maintaining model accuracy with humans-in-the-loop
Model drift can have a significant impact on the performance of your ML system, causing it to become less accurate and less effective over time.
When it comes to preventing, detecting and mitigating drift, keeping humans-in-the-loop is essential. From the start of an ML project, human intelligence is required to identify errors or bias in the training data and to correct them. Once your model is running in the real world, people are needed to regularly monitor and evaluate its performance to ensure the model remains true to its purpose. Should drift occur, human intervention is necessary to determine where it's coming from and the steps needed to correct it.
Working with an experienced partner can help. TELUS International harnesses the intelligence, skills and cultural knowledge of our global community of contributors to insert human judgment into the AI process. Reach out to our team of experts to learn more about our AI Data Solutions.