Model Drift in Machine Learning

Nwaamaka Iduwe
Oct 14, 2022
3 min read

Also known as model decay, model drift is a problem in machine learning where the accuracy of a model declines due to factors affecting the variables, the data used, or the model itself. As a result, the model’s predictions become less reliable over time.

TYPES OF MODEL DRIFT

To build a model, you first decide what problem you are trying to solve and how you want to solve it. This means assigning variables to your model in the form of dependent variables and independent variables. Usually, the dependent (target) variable represents the problem you aim to solve, whereas the independent variables are the data you use to solve it. As Heraclitus is credited with saying around 500 B.C., the only constant in life is change, and this holds true even for Artificial Intelligence. As the world evolves, the definitions of different variables are also likely to change. New data emerges, and external factors change the relationship between variables. A change in the dependent (target) variable leads to a type of model drift called concept drift. In contrast, a change in the independent variables leads to the other type, called data drift.
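The distinction between the two types can also be written in probabilistic terms. The notation below is an assumption added for illustration and is not part of the original article: x stands for the independent variables, y for the dependent variable, and P_t for the distribution at time t. It is one common formalization, and other sources draw the line slightly differently.

```latex
\begin{align*}
\text{Concept drift:}\quad & P_{t}(y \mid x) \neq P_{t+\Delta}(y \mid x) \\
\text{Data drift:}\quad & P_{t}(x) \neq P_{t+\Delta}(x)
\end{align*}
```

Read this way, concept drift means the same inputs now map to different outcomes, while data drift means the inputs themselves no longer look like the data the model was trained on.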

1. Concept Drift. As mentioned, this type of model drift occurs when the properties of the dependent variable change, which changes the relationship between the dependent and independent variables so that the model loses its predictive power.

There are three types of concept drift: gradual, sudden, and recurring.

Gradual Concept Drift: As the name implies, this type of drift happens over time. A good example is inflation. The prices of goods (the independent variable) rise over time, which is what we call inflation. The character of that inflation can itself change over time, becoming, for example, hyperinflation or stagflation.

Sudden Concept Drift: As the name implies, this happens abruptly. A good example is the COVID-19 pandemic, which caused people around the world to suddenly change their behavior towards things like transportation and recreational activities as they became more averse to them and as governments put lockdowns in place. At the same time, they bought more home products such as food, toiletries, and medicines.

Image Source: Harshil Patel via Censius

Recurring Concept Drift: As the name implies, this happens repeatedly. A good example is the change in consumer shopping frequency and habits in the last quarter of every year, as the world celebrates Halloween, Thanksgiving, Black Friday, and Christmas.

Image Source: Clean.io

2. Data Drift. Briefly explained, this happens when the data used in the model experiences a change in its properties. This data can also be referred to as the independent variables, since the target variable depends on it. A good example is a change in consumer preference. Post-COVID, consumer preference has shifted towards healthier options such as vegan food, less processed commodities, and fitness equipment, e.g., the trending treadmill desk.

Image Source: Hassen et al, 2021 via Science Direct
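One way to check a single numeric feature for data drift is to compare its distribution in the training data with its distribution in recent data. The sketch below uses the Population Stability Index (PSI), which the original article does not mention; the function name, the bin count, and the synthetic "spend" samples are all assumptions made for illustration.

```python
import math
import random


def population_stability_index(reference, current, bins=10):
    """Compare two samples of one numeric feature using PSI (illustrative sketch)."""
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 if every reference value is identical.
    width = (hi - lo) / bins or 1.0

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = int((v - lo) / width)
            idx = max(0, min(bins - 1, idx))
            counts[idx] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_frac = bin_fractions(reference)
    cur_frac = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


# Hypothetical feature: weekly spend, drawn from synthetic distributions.
rng = random.Random(0)
training_spend = [rng.gauss(50, 10) for _ in range(5000)]
similar_spend = [rng.gauss(50, 10) for _ in range(5000)]
shifted_spend = [rng.gauss(60, 10) for _ in range(5000)]

psi_similar = population_stability_index(training_spend, similar_spend)
psi_shifted = population_stability_index(training_spend, shifted_spend)
print(f"PSI, same distribution:    {psi_similar:.4f}")
print(f"PSI, shifted distribution: {psi_shifted:.4f}")
```

A frequently quoted rule of thumb treats a PSI below about 0.1 as stable and above about 0.25 as a significant shift, but these cut-offs are conventions rather than guarantees and should be tuned to the data at hand.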

Apart from a change in the relationship between the variables used within a model, Harshil Patel of Censius notes that model drift can also come about through:

· Lack of Data Credibility
· Inaccuracy of Data Source
· Poor Data Engineering

MINIMISING MODEL DRIFT

1. Observe Your Model. The simplest and quickest way to minimize model drift is to keep checking for it. Regularly check the properties of the data to ensure that they still correlate as they should and that the definition of the target problem has not changed. It also helps to set performance metrics for your model so that you can regularly compare its performance against them (Dilmegani, 2022).
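Comparing live performance against a fixed metric can be sketched as a rolling accuracy check. The class name, the 0.90 baseline, the 0.05 tolerance, and the window size below are hypothetical values chosen for illustration, not recommendations from the article.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks rolling accuracy and flags when it falls below a baseline."""

    def __init__(self, baseline_accuracy, tolerance=0.05, window_size=200):
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        # Only the most recent `window_size` outcomes are kept.
        self.outcomes = deque(maxlen=window_size)

    def record(self, prediction, actual):
        self.outcomes.append(1 if prediction == actual else 0)

    def rolling_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self):
        # Wait until the window is full so a few early errors do not raise a flag.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.rolling_accuracy() < self.baseline - self.tolerance


monitor = AccuracyMonitor(baseline_accuracy=0.90, tolerance=0.05, window_size=100)

# Phase 1: the model is right 9 times out of 10.
for i in range(100):
    monitor.record(prediction=1, actual=1 if i % 10 else 0)
healthy_flag = monitor.needs_attention()

# Phase 2: the model is right only half of the time.
for i in range(100):
    monitor.record(prediction=1, actual=1 if i % 2 else 0)
degraded_flag = monitor.needs_attention()

print(healthy_flag, degraded_flag)
```

In practice the true labels often arrive with a delay, so a check like this usually runs on whatever labelled feedback is available rather than on every prediction.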

There are also drift detection algorithms, such as Adaptive Windowing (ADWIN), the Drift Detection Method (DDM), and the Early Drift Detection Method (EDDM), as well as platforms you can use to monitor your model, such as Google Cloud AI Platform and ZenML.
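To give a feel for how a detector such as DDM works, here is a simplified sketch written from scratch rather than taken from any library. It follows the usual description of the method: track the running error rate p and its standard deviation s, remember the lowest p + s seen so far, and raise a warning or a drift signal when the current p + s exceeds that minimum by two or three standard deviations. The class name, the return values, and the synthetic error stream are assumptions for illustration.

```python
import math


class SimpleDDM:
    """Simplified Drift Detection Method over a stream of 0/1 prediction errors."""

    def __init__(self, min_samples=30, warning_level=2.0, drift_level=3.0):
        self.min_samples = min_samples
        self.warning_level = warning_level
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        self.n = 0
        self.p = 0.0  # running error rate
        self.p_min = float("inf")
        self.s_min = float("inf")

    def update(self, error):
        """error is 1 for a wrong prediction, 0 for a correct one."""
        self.n += 1
        self.p += (error - self.p) / self.n  # incremental mean
        s = math.sqrt(self.p * (1 - self.p) / self.n)

        if self.n < self.min_samples:
            return "stable"

        # Remember the best (lowest) p + s observed so far.
        if self.p + s <= self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, s

        if self.p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()  # start fresh after signalling drift
            return "drift"
        if self.p + s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "stable"


# Synthetic stream: 10% errors for 500 predictions, then 50% errors.
detector = SimpleDDM()
first_drift_at = None
for i in range(1000):
    if i < 500:
        error = 1 if i % 10 == 0 else 0
    else:
        error = 1 if i % 2 == 0 else 0
    if detector.update(error) == "drift" and first_drift_at is None:
        first_drift_at = i

print("first drift signalled at prediction", first_drift_at)
```

The warning level is useful as an early cue to start collecting fresh training data before the drift level is reached.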

2. Train and Tune Your Model. If you detect model drift, you can remedy it by retraining your model so that the drift is accounted for. Be sure to use the most recent data for this so that your model is as up-to-date as possible. Also, ensure that your data and data sources are credible.
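The effect of retraining on recent data can be shown with a deliberately tiny example. The sketch below fits a one-feature straight line by ordinary least squares; the data is synthetic and noise-free, and the function names and the window size of 80 are hypothetical choices made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def mean_abs_error(model, xs, ys):
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)


def retrain_on_recent(xs, ys, window):
    """Refit using only the most recent `window` observations."""
    return fit_line(xs[-window:], ys[-window:])


# Synthetic history: the relationship changes after the first 100 observations.
xs = [i % 20 for i in range(200)]
ys = [2 * x + 1 if i < 100 else 3 * x + 5 for i, x in enumerate(xs)]

# Hold out the last 20 observations to stand in for "new" data.
history_x, history_y = xs[:180], ys[:180]
new_x, new_y = xs[180:], ys[180:]

stale_model = fit_line(history_x[:100], history_y[:100])  # trained before the change
fresh_model = retrain_on_recent(history_x, history_y, window=80)

stale_error = mean_abs_error(stale_model, new_x, new_y)
fresh_error = mean_abs_error(fresh_model, new_x, new_y)
print(f"stale model error: {stale_error:.2f}")
print(f"fresh model error: {fresh_error:.2f}")
```

Choosing the window is a trade-off: too long and it still contains pre-drift data, too short and the model has too little to learn from.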

You can also design your model to learn in real time from data fed to it as it arrives, an approach known as online learning (Dilmegani, 2022).
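A minimal sketch of learning in real time is a linear model that updates its parameters after every observation using stochastic gradient descent. The class name, the learning rate, and the synthetic stream below are assumptions for illustration; production systems would typically rely on an established library rather than hand-written updates.

```python
class OnlineLinearModel:
    """One-feature linear model updated one observation at a time (SGD)."""

    def __init__(self, learning_rate=0.1):
        self.weight = 0.0
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.weight * x + self.bias

    def learn_one(self, x, y):
        # Gradient step on the squared error for this single observation.
        error = self.predict(x) - y
        self.weight -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error


model = OnlineLinearModel(learning_rate=0.1)

# Regime 1: the stream follows y = 2x + 1.
for i in range(3000):
    x = (i % 10) / 10
    model.learn_one(x, 2 * x + 1)
after_first_regime = (model.weight, model.bias)

# Regime 2: the relationship changes to y = -x + 3 and the model keeps learning.
for i in range(3000):
    x = (i % 10) / 10
    model.learn_one(x, -x + 3)
after_second_regime = (model.weight, model.bias)

print("after regime 1:", after_first_regime)
print("after regime 2:", after_second_regime)
```

Because the model never stops updating, it follows the new relationship without a separate retraining step, at the cost of also being more sensitive to noisy or bad incoming data.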

CONCLUSION

In this article, we discussed model drift by looking at its definition, types, causes, and how we can minimize it. Model drift is a common problem in machine learning, and it can be remedied with regular monitoring and retraining.

This write-up was originally published by me on Medium, and you can find it here:

https://medium.com/@nwaamaka_iduwe/model-drift-in-machine-learning-47575f48bfcc
