My approach to solving (almost) any machine learning problem

In this article, I’ll detail the technique I use to solve almost any AI / machine learning project. I can already hear you screaming behind your screen "there is no magic approach to ML", and you’d be right!

I would say that this technique applies to 90% of my projects. Without further ado, here’s the approach:

  1. Find a machine learning competition with a problem close to the one you want to solve
  2. Find the winning team’s solution
  3. Adapt this solution to your problem

Before detailing each step, I’d like to point out that the winning team’s solution is never used as is, mostly because your problem will not be perfectly identical to the competition’s.

1 - Finding a matching machine learning competition

To do so, I recommend using Google or Kaggle Past Solutions (disclaimer: I created Kaggle Past Solutions precisely for this reason; it’s open-source and free).

How do you find a matching competition? Let’s use an example. I was recently working on a project to predict the inventory in a warehouse. Since we want to predict future orders, the keywords here are "time series" and "forecast". Looking up those keywords in Kaggle Past Solutions yields, among other results, the Corporación Favorita Grocery Sales Forecasting competition ("Can you accurately predict sales for a large grocery chain?").

This competition seems similar enough. Indeed, orders from a warehouse must be correlated with sales, so with the same kind of input we might be able to get good results. Now let’s move on to choosing a solution.

2 - Finding the winning team’s solution

This step should really be rephrased as "finding a good winning solution". Whether a solution is good mostly depends on the quality of the write-up and of the linked code.

In our example, the first-place solution is of very good quality: it quotes its sources, contains a good explanation, and the linked code is concise and readable.

3 - Adapting this solution to your problem

The solution will serve as a baseline. You will know which techniques to use (convolutions, gradient boosting, LSTMs) and which parts of the data to exploit. While you will not get a working solution right away, you will at least know that you are using the right tools and the right data.
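To make the idea of a baseline concrete, here is a minimal sketch of a gradient-boosting forecasting baseline on synthetic data. Everything in it is illustrative: the series, the lag-feature construction, and the hyperparameters are my own assumptions, not the competition solution.

```python
# Minimal gradient-boosting forecasting baseline (synthetic data, illustrative only).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
# Fake daily "sales": a weekly-ish cycle plus noise.
sales = 100 + 10 * np.sin(np.arange(200) / 7) + rng.normal(0, 2, size=200)

# Lag features: predict today's value from the previous 7 days.
n_lags = 7
X = np.column_stack([sales[i:len(sales) - n_lags + i] for i in range(n_lags)])
y = sales[n_lags:]

# Chronological split: never train on the future.
split = 150
model = GradientBoostingRegressor(n_estimators=100, random_state=0)
model.fit(X[:split], y[:split])
preds = model.predict(X[split:])
mae = float(np.mean(np.abs(preds - y[split:])))
print(f"test MAE: {mae:.2f}")
```

The point is not the model itself but the scaffolding: once a baseline like this runs end to end, you can swap in the techniques from the winning solution piece by piece.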

Moreover, you can draw inspiration and learn both from the techniques used (models, feature engineering, ensembling) and from the code itself.

In our example, the solution uses an ensemble of LSTMs and gradient boosting. Reading the code, I also learned about pandas.read_csv’s converters argument.
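For readers who haven’t met it, converters maps a column name to a function applied to each raw value while the file is parsed. A small sketch (the CSV content and column names below are invented for illustration):

```python
import io
import pandas as pd

# Invented sample data; "unit_sales" is an illustrative column name.
csv = io.StringIO("item,unit_sales\napple,3.0\nbanana,-1.0\n")

# converters applies the function to each raw string value at parse time;
# here, negative sales (returns) are clipped to zero.
df = pd.read_csv(csv, converters={"unit_sales": lambda v: max(float(v), 0.0)})
print(df["unit_sales"].tolist())  # → [3.0, 0.0]
```

This kind of parse-time cleanup avoids a separate post-processing pass over the DataFrame.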

What’s next

At this point you should have a very good baseline, built on the shoulders of Kaggle masters. From here, it’s all about improving your model. And if you get good results, don’t forget to thank the authors of the solution.