Data & analytics models have an expiry date – how can we update them?


By continually assuring the currency, accuracy, and relevancy of business-critical data and analytics models, organisations can better anticipate and solve whatever market and environmental challenges they may encounter. However, perishable data and analytics models make it much harder to predict and respond to unexpected events, such as sudden shifts in demand for products and services, in the price or availability of raw materials, or in consumer sentiment.

Businesses that keep their data and analytics models fresher can therefore substantially increase not only their chances of survival, but also their chances of capturing a larger share of revenue and profits in the process. When a business can generate deeper and more accurate insights, it can provide more value internally and externally.

For example, a convenience store chain that identifies which products are selling fastest in its stores during the pandemic can make sure it has enough of those goods in stock to meet demand. It can also make decisions about product placement, for example putting those items near the checkouts so that customers spend as little time in the store as possible, which could also increase purchases of those products. But understanding data and model perishability goes far beyond traditional measures such as age or recency. So, what data counts as current, accurate and relevant?


Current enough data

Current data reflects the most recent changes that could have a significant impact on the business. These might include the loosening or tightening of COVID-19 lockdowns, or a call on social media for a protest near the business. Advanced artificial intelligence (AI) techniques such as machine learning can help find such data, for example by identifying which data sources were used to generate existing models.

Current models are continually refined and use advanced AI to check themselves against “virtual twins” of the real world, so they are continuously learning rather than being trained only once. This reduces the risk of models becoming obsolete, as has happened to many predictive models that were built on pre-COVID consumer or employee behaviour and could not adapt to the new circumstances.
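The difference between a model trained once and one that keeps learning can be sketched in a few lines of Python. This is a minimal illustration, not the method any particular product uses: the demand figures are invented, and an exponentially weighted moving average (with a hypothetical smoothing weight `alpha`) stands in for the far richer continuous-learning techniques described above.

```python
def train_once(history):
    """Static model: the mean of the training data, never updated afterwards."""
    return sum(history) / len(history)


class ContinuousModel:
    """Exponentially weighted moving average, updated on every new observation."""

    def __init__(self, initial, alpha=0.3):
        self.estimate = initial
        self.alpha = alpha  # hypothetical weight given to each new observation

    def update(self, observed):
        # Move the estimate a fraction of the way towards the new observation.
        self.estimate += self.alpha * (observed - self.estimate)
        return self.estimate


# Invented weekly demand figures: stable, then a sudden shift.
before_shift = [100, 102, 98, 101, 99]
after_shift = [160, 158, 162, 161, 159]

static_estimate = train_once(before_shift)
live_model = ContinuousModel(initial=static_estimate)
for demand in after_shift:
    live_model.update(demand)

# Error of each model against the most recent week of demand.
static_error = abs(after_shift[-1] - static_estimate)
live_error = abs(after_shift[-1] - live_model.estimate)
print(f"static error: {static_error:.1f}, continuous error: {live_error:.1f}")
```

After the shift, the static estimate stays at the old level while the continuously updated one moves most of the way towards the new level, so its error on the latest week is much smaller.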


Accurate enough data

This is data that has been cleansed and validated to ensure it comes from a reliable source, has not been compromised and is in a usable format. This is especially important for data in non-traditional forms, such as unstructured data, or from newer sources such as social media or the Internet of Things (IoT). Such data can often be the source of important insights, such as when mobile phone location tracking data is combined with COVID-19 testing data to better track the spread of the disease and new infections.
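As a rough sketch of what validation can look like in practice, the Python below checks each incoming record for a trusted source, the required fields and a usable format. The source names, field names and sample records are all hypothetical, and the sketch does not attempt to detect tampering, which would need measures such as checksums or signatures.

```python
from datetime import datetime

# Hypothetical list of sources the business has decided to trust.
TRUSTED_SOURCES = {"pos_system", "iot_sensor"}
REQUIRED_FIELDS = ("source", "timestamp", "value")


def validate_record(record):
    """Return a list of problems; an empty list means the record is usable."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in record]
    if problems:
        return problems  # the remaining checks need every field to be present

    if record["source"] not in TRUSTED_SOURCES:
        problems.append(f"untrusted source: {record['source']}")

    try:
        datetime.fromisoformat(record["timestamp"])
    except (TypeError, ValueError):
        problems.append("timestamp is not in ISO 8601 format")

    value = record["value"]
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        problems.append("value is not numeric")

    return problems


# Invented sample records for illustration.
records = [
    {"source": "pos_system", "timestamp": "2021-03-01T09:30:00", "value": 42},
    {"source": "social_scrape", "timestamp": "2021-03-01T09:31:00", "value": 7},
    {"source": "iot_sensor", "timestamp": "yesterday", "value": "n/a"},
    {"source": "iot_sensor", "value": 3.5},
]

clean_records = [r for r in records if not validate_record(r)]
for r in records:
    print(r.get("source"), validate_record(r))
```

Only the first record passes all checks; the others are reported with the specific reason they were rejected, which makes it easier to fix the feed rather than silently dropping data.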

Accurate models have not only been tested for accuracy under current conditions; they can also use advanced AI to provide more accurate predictions. If a prediction is low-confidence but could have massive consequences, such as an outflow of millennials from urban areas due to COVID-19, a business could get a low-cost jump on competitors by being the first to plan for such a trend. Timing matters too: an analytics model that “discovers” weekly or seasonal patterns in financial trading is of little use if competitors have already found those patterns and adjusted their own trades to account for them. Accurate models have therefore also been trained to disregard patterns in the data once they cease to be relevant.


Relevant enough data

This refers to data that is significant enough to have a meaningful impact on a model's predictions of future conditions or on the steps it recommends in response. As an example, before the “Me Too” movement, an insensitive tweet by a CEO might not have been considered a relevant data point to track. Today, the boycotts prompted by such a tweet could drive material changes in revenue, market share, and brand value. Relevant data can be drawn from anywhere, within or outside the organisation, and it can help the business be the first to sense and respond to change.

By developing and refining relevant models, organisations can determine, with the help of machine learning, which data is most insightful and disregard less useful data. Even more importantly, this saves cost and time, particularly by eliminating the effort spent on training models on irrelevant data.
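One very simple way to illustrate separating insightful data from less useful data is to score each candidate input by how strongly it moves with the outcome of interest. The Python sketch below uses plain Pearson correlation as a stand-in for the machine learning techniques mentioned above; the feature names, figures and the 0.5 threshold are all invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_relevant(features, target, threshold=0.5):
    """Keep features whose absolute correlation with the target meets the threshold."""
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    return {name: score for name, score in scores.items() if score >= threshold}


# Invented data: weekly sales and three candidate inputs.
weekly_sales = [10, 20, 30, 40, 50]
candidate_features = {
    "footfall": [100, 210, 290, 405, 500],  # rises with sales
    "shelf_colour_code": [3, 1, 4, 1, 5],   # only weakly related to sales
    "store_count": [7, 7, 7, 7, 7],         # constant, so carries no signal
}

relevant = select_relevant(candidate_features, weekly_sales)
print(sorted(relevant))
```

Here only `footfall` clears the threshold, so a model would be trained on that input alone. Correlation misses non-linear and combined effects, so a real selection process would need something more robust.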