Machine Learning: The Trade-Off Between Prediction and Interpretation

Vachan Anand
4 min read · Mar 8, 2021

Ever since machine learning and artificial intelligence took a leap in popularity, we have tended to think of M.L. as a piece of software/code that can predict the future given input data — forecasting the price of bitcoin, for instance, or recommending the products a user is most likely to buy. As fascinating as such capabilities may sound, one downside is that practitioners new to M.L. use it like a black box and hence do not make use of its full potential.

In this blog, I will try to point out a few mistakes that I made in my early years of using machine learning and explain why I believe avoiding them could speed up the learning curve for training a good M.L. model. At the time of writing, despite having a few years of experience in the field, I still consider myself an amateur, since learning "Machine Learning" is an iterative process, just like the learning of a model by the machine (witty, huh? haha). So here we go!

When we start to model a machine learning problem, we generally focus on approximating a function that is as close as possible to the true underlying function. We do so by reducing an error term, typically the Mean Squared Error (M.S.E.) for regression problems. Now, there are multiple ways to reduce this error term, and one of the most common is to increase the complexity of the model. We can do so in many ways, but the most common are as follows:

  • Introducing interactions between the features (see the sketch below)
  • Using more sophisticated models like neural networks
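
As a minimal illustration of the first option, here is a sketch using scikit-learn; the data, coefficients, and feature count are assumptions made purely for illustration. When the true function contains an interaction term, adding interaction features lets a linear model capture it and lowers the test M.S.E.:

```python
# Compare a plain linear model against one with interaction features.
# Synthetic data: the true function contains the interaction x0 * x1.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 2))
y = 3 * X[:, 0] + 2 * X[:, 1] + 5 * X[:, 0] * X[:, 1] + rng.normal(0, 0.1, 500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

plain = LinearRegression().fit(X_train, y_train)
# interaction_only=True adds the x0*x1 term without the squared terms.
interact = make_pipeline(
    PolynomialFeatures(degree=2, interaction_only=True, include_bias=False),
    LinearRegression(),
).fit(X_train, y_train)

print("MSE, plain features:      ", mean_squared_error(y_test, plain.predict(X_test)))
print("MSE, interaction features:", mean_squared_error(y_test, interact.predict(X_test)))
```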

Even though this seems like a good solution at first, and in some cases it may well be a step towards solving the problem, I personally believe that in the real world it is often not something we should reach for blindly.

You may ask why?

I'm glad you asked. Well, there are several reasons for it, and some of the most important ones are as follows:

1. The curse of constraints

One of the major issues when working on a real-world problem is how many resources the business has to complete a given task. These resources are generally not very flexible, and hence the models we create to solve a business problem have to abide by these constraints. Some of these constraints are:

  • Processing Power to train a model
  • Storage Capacity to store the data and model
  • Time taken to make a prediction
  • Budget of a project to provision resources on the cloud
  • and many more…

Now, with a complex model such as a deep neural network, we would require a huge amount of data to prevent overfitting, and to process such a huge volume of data we might need better processing power and hence might need to provision more resources. Subsequently, more resources for the model would imply fewer resources for other aspects of the project, so the decision is not as easy as it sounded earlier.
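
To make these trade-offs concrete, here is a rough sketch of how one might measure a couple of them, comparing a plain linear model with a small neural network. The data, model sizes, and timings are illustrative assumptions, not a benchmark:

```python
# Measure training time, prediction latency, and serialized model size
# for two models of very different complexity (illustrative data only).
import pickle
import time

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 20))
y = X @ rng.normal(size=20) + rng.normal(0, 0.1, 5000)

for model in (LinearRegression(),
              MLPRegressor(hidden_layer_sizes=(256, 256), max_iter=200)):
    start = time.perf_counter()
    model.fit(X, y)
    train_time = time.perf_counter() - start

    start = time.perf_counter()
    model.predict(X[:1])
    predict_time = time.perf_counter() - start

    size_kb = len(pickle.dumps(model)) / 1024  # rough proxy for storage cost
    print(f"{type(model).__name__}: train {train_time:.2f}s, "
          f"predict {predict_time * 1000:.2f}ms, size {size_kb:.0f} KB")
```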

2. The compromise between predictability and interpretability

In an ideal situation, let us assume that the business has infinite resources, so we can train and use any model we desire. Even with no constraint on resources, we should still be hesitant to use a complex model, for one simple reason: a complex model is difficult for us to understand. This is an issue because, when using a machine learning model, a business not only requires the model to make predictions as accurately as possible, but also requires the analyst to understand the major contributing factors that drive those predictions. This is important because once a business understands the driving forces, it can focus on those factors to improve its key performance indicators (KPIs).

In my opinion, the question of predictability vs. interpretability is one of the major ones a data scientist should address. To clarify, by no means do I imply that interpretability is more important than the predictability of the model. However, I intend to shed light on the forgotten half of machine learning models, i.e. interpretability. Nevertheless, the balance between interpretability and predictability also depends on the problem we are trying to solve. For instance, in an autonomous car we might focus more on predictability, as the prime concern for such a use case is a safe ride for the passenger. On the other hand, to predict the price of land, interpretability is equally important, because as a business we also want to understand which features of a property might be driving prices in the real estate market.
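
As a small illustration of the real estate example: with a plain linear model, the fitted coefficients directly show how much each feature moves the predicted price. The feature names and data below are made-up assumptions, used only to show the idea:

```python
# Fit a linear model on toy "property price" data and read off the
# coefficients as a simple form of interpretability.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
features = ["area_sqm", "bedrooms", "distance_to_cbd_km"]  # hypothetical names
X = rng.uniform(0, 1, size=(300, 3))
# Assumed ground truth: area raises the price, distance from the city lowers it.
y = 400 * X[:, 0] + 50 * X[:, 1] - 120 * X[:, 2] + rng.normal(0, 10, 300)

model = LinearRegression().fit(X, y)
for name, coef in zip(features, model.coef_):
    print(f"{name}: {coef:+.1f}")
```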

Furthermore, focusing on interpretability not only helps us answer business questions, but as data scientists it also helps us in multiple ways, such as:

  • Narrowing down the focus of the analysis to a particular area (see the sketch after this list). This in turn not only helps us improve the predictability of the model but also reduces the resources it uses, such as processing power, storage, etc.
  • Improving data pipelines. With a better understanding of which data maps well to the prediction problem, we can streamline pipelines and reduce the load on storage if necessary.
  • Lastly, helping us as data scientists model interactions between features, as we would have a better understanding of their effect on the prediction.
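
For the first point, one common way to narrow the focus is to measure how much the model's performance degrades when each feature is shuffled. Below is a sketch using scikit-learn's permutation_importance on synthetic data where, by construction, only two of the ten features matter:

```python
# Rank features by permutation importance to narrow the analysis down
# to the features that actually drive predictions (illustrative data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = 4 * X[:, 0] + 2 * X[:, 3] + rng.normal(0, 0.1, 1000)  # only features 0 and 3 matter

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for idx in result.importances_mean.argsort()[::-1][:3]:
    print(f"feature_{idx}: importance {result.importances_mean[idx]:.3f}")
```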

Finally, now that we understand why machine learning is not only about prediction, in future blogs we shall look at how to interpret a model to improve our understanding of the business problem. If you liked this blog and would like to read my other blogs, click here.

See you next time. Until then, Keep Rocking!


Vachan Anand

A consultant with an interest in Data Science, Data Engineering and Cloud Technology.