How a Recommendation Engine Works

The processes behind the power of data-driven recommendation engines.
Salik Khan

You make regular trips to a local supermarket to buy groceries. After some time, the store clerk gets to know you, what you tend to buy, and what you like. One day, to be more helpful and make an extra buck, the clerk suggests a few more items based on your purchase history, the sections of the store you’ve been browsing, what you already have in your cart, and what other people like you have bought.

The clerk has successfully recommended products that you might need but wouldn’t have noticed otherwise.

A recommendation engine works on the same principle as the clerk: it gathers data, analyzes it, filters it, and suggests products you might be interested in.

Built on machine learning algorithms and combined with loyalty programmes, dynamic pricing, email marketing, and much more, product recommendation engines form a major part of an eCommerce strategy known as personalization.

I’ve summarised the complete process of a recommendation engine in three phases.


1. Data Collection

The data collected is either explicit or implicit. Data that is entered directly, such as a customer’s personal information, product descriptions, ratings, and comments, falls under explicit data, while user activity and interactions on the store are categorized as implicit data.

Explicit information about your visitors, such as gender, age, and other demographics, is used to maintain a user profile. Likewise, information about products, such as descriptions, tags, categories, reviews, and comments, is collected for product profiling.

Every visitor performs some actions and interacts with your store by viewing and clicking on products, making searches, adding products to cart, and completing purchases. This implicit data is collected to track user activity.

All this information is collected, transformed and stored in databases.

Users have different preferences, buying patterns, likes, and dislikes, resulting in unique datasets. Over time, as you feed it more data, the engine gets smarter and produces recommendations that are more likely to engage customers and make a sale.
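
As a rough illustration of the collection step, the sketch below keeps explicit data in a user profile and folds implicit events (views, cart additions, purchases) into per-user interest scores. The `UserProfile` and `Event` types, the `ACTION_WEIGHTS` values, and the `build_interest_scores` helper are all assumptions made for this example, not part of any particular engine.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class UserProfile:
    """Explicit data: information the visitor entered directly."""
    user_id: str
    age: int
    gender: str


@dataclass
class Event:
    """Implicit data: one tracked interaction with the store."""
    user_id: str
    product_id: str
    action: str  # "view", "add_to_cart" or "purchase"


# Assumed weights: stronger buying signals count for more.
ACTION_WEIGHTS = {"view": 1.0, "add_to_cart": 3.0, "purchase": 5.0}


def build_interest_scores(events):
    """Aggregate raw events into {user_id: {product_id: score}}."""
    scores = defaultdict(lambda: defaultdict(float))
    for event in events:
        weight = ACTION_WEIGHTS.get(event.action, 0.0)  # unknown actions add nothing
        scores[event.user_id][event.product_id] += weight
    return scores


# Toy data, invented for the example.
profiles = {"emma": UserProfile("emma", 29, "female")}
events = [
    Event("emma", "skateboard", "view"),
    Event("emma", "skateboard", "add_to_cart"),
    Event("emma", "helmet", "view"),
    Event("bob", "helmet", "purchase"),
]
scores = build_interest_scores(events)
print(dict(scores["emma"]))  # {'skateboard': 4.0, 'helmet': 1.0}
```

In a real store the events would be written to a database rather than held in a list, but the shape of the data stays much the same.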


Amazon tracks every activity of its users and feeds the data to its recommendation engine. According to McKinsey, 35% of what consumers purchase on Amazon comes from product recommendations.

2. Data Analysis

Once the data has been collected, transformed, and stored, it is analyzed to determine what kind of system can be implemented on the type of data obtained.

This analysis informs the choice of machine learning algorithms to develop and deploy, so that they perform well on the data available.

  • Batch Analysis
    Recommendation engines that use batch analysis take historical data as input and train a model on it. These systems perform best with large datasets and may underperform if there is too little data.

    Batch processing systems process data periodically, say every day or at whatever interval testing shows works best.

    You need a good volume of visitors to your store to get the most out of this technique. In my research and experience, batch analysis performs best once you have around 10k visitors per day.

    Such recommendation engines are comparatively cost-efficient, as they run on predefined resources such as storage and memory.

  • Real-time Analysis
    Recommendation engines that provide real-time analysis process data continually. These systems keep track of user activity and update the machine learning model every second.

    With every single interaction on your store, the system is updated and provides better-informed product recommendations.

    Such systems suit stores where the number of visitors per minute is not very large; with heavy traffic, they require high-performance resources to keep running.

There is no right or wrong approach; it all depends on your requirements, resources, and constraints. Given the right conditions, both approaches accomplish the task efficiently.
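
To make the contrast concrete, here is a minimal sketch in which the "model" is nothing more than a count of products bought together. `batch_train` rebuilds the counts from the full order history on a schedule, while `RealTimeModel.update` adjusts them after every order. Both names and the toy orders are assumptions for illustration; a production model would be far richer.

```python
from collections import Counter
from itertools import combinations


def pairs(order):
    """All unordered product pairs in one order, in a stable order."""
    return combinations(sorted(set(order)), 2)


def batch_train(order_history):
    """Batch analysis: rebuild the model from all historical orders."""
    counts = Counter()
    for order in order_history:
        counts.update(pairs(order))
    return counts


class RealTimeModel:
    """Real-time analysis: keep the model current after each interaction."""

    def __init__(self):
        self.counts = Counter()

    def update(self, order):
        self.counts.update(pairs(order))


# Toy order history, invented for the example.
history = [
    ["skateboard", "helmet"],
    ["skateboard", "helmet", "gloves"],
]

nightly = batch_train(history)   # e.g. run once a day

live = RealTimeModel()
for order in history:            # e.g. run as each order arrives
    live.update(order)

print(nightly[("helmet", "skateboard")])  # 2
print(live.counts == nightly)             # True
```

Both routes end up with the same counts here. The difference is when the work happens: one large scheduled job on fixed resources, or a small update on every interaction that has to keep pace with traffic.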

3. Data Filtering

Filtering is the part where the real magic happens. The recommendation engine takes data as input and produces personalized product recommendations.

The following filtering strategies are commonly deployed in recommendation engines:

  • Collaborative Filtering
    In collaborative filtering, recommendations are produced on the basis of users’ histories: what similar users liked, or which items tend to be bought or rated together. Two techniques are used in CF, individually or in combination:

    1. User-Based
      Imagine you want to recommend movies to a user, ‘Bob’. An existing customer, ‘John’, belongs to a similar age group, gender, and locality as Bob, and has rated or bought many of the same movies. The movies John has rated highly or bought are therefore more likely to appeal to Bob.

      A list of candidate movies is obtained by matching Bob’s profile with John’s.

      Next, filter out the movies Bob has already seen and recommend the remaining ones. These recommended movies are more likely to make a sale than other movies.

    2. Item-Based
      Imagine your customer ‘Emma’ has placed a skateboard in her cart. You know that other customers who bought a skateboard also bought a helmet, gloves, and shoes, which gives you a list of items frequently bought along with skateboards.

      Filter out the products Emma has already bought, and you have a list of recommended items she is likely to buy along with the skateboard, enabling you to increase her basket value.

  • Content-based Filtering
    In content-based filtering, you focus on which products are similar to the ones your customer ‘Abraham’ likes. For instance, you know Abraham has already bought movies like Inception, The Departed, The Prestige, Deadpool, The Hangover, Avengers, Spider-Man, and Venom.

    This indicates that he likes thrillers, comedies, action, and Marvel comics, and is a fan of Leonardo DiCaprio. You can now recommend movies with a similar genre or featuring the stars he is a fan of.

    Filter out all the movies he has already bought, and recommend the remaining ones, such as Thor, Shutter Island, Iron Man, and 22 Jump Street.


    According to McKinsey, 75% of what users watch on Netflix comes from product recommendations.

  • Hybrid Model
    As the name suggests, a hybrid model leverages the strengths of both approaches to deliver more relevant and personalized recommendations. You can combine collaborative and content-based filtering to design a custom system that performs better than either model on its own.

    Hybrid models tend to offset the disadvantages of each filtering strategy and can produce a better ROI on your recommendation engine.
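
The three strategies can be sketched together in a few lines. This is a toy illustration under stated assumptions, not how any production engine works: the purchase data and tags are invented, user similarity is plain Jaccard overlap of purchase sets, the content score is the share of a movie's tags the customer already likes, and `alpha` is a hypothetical blending weight for the hybrid.

```python
from collections import Counter

# Toy data, invented for the example.
purchases = {
    "abraham": {"Inception", "The Departed", "Avengers"},
    "john": {"Inception", "The Departed", "Shutter Island"},
    "emma": {"Avengers", "Thor", "Inception"},
}
tags = {
    "Inception": {"thriller", "dicaprio"},
    "The Departed": {"thriller", "dicaprio"},
    "Shutter Island": {"thriller", "dicaprio"},
    "Avengers": {"action", "marvel"},
    "Thor": {"action", "marvel"},
    "22 Jump Street": {"comedy", "action"},
}


def jaccard(a, b):
    """Overlap of two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def collaborative_scores(user, purchases):
    """User-based CF: items bought by similar users, weighted by similarity."""
    owned = purchases[user]
    scores, total = Counter(), 0.0
    for other, items in purchases.items():
        if other == user:
            continue
        sim = jaccard(owned, items)
        total += sim
        for item in items - owned:      # skip what the user already has
            scores[item] += sim
    return {item: s / total for item, s in scores.items()} if total else {}


def content_scores(user, purchases, tags):
    """Content-based: share of each unseen item's tags the user already likes."""
    owned = purchases[user]
    liked = set().union(*(tags.get(item, set()) for item in owned))
    return {
        item: len(item_tags & liked) / len(item_tags)
        for item, item_tags in tags.items()
        if item not in owned and item_tags
    }


def recommend(user, purchases, tags, alpha=0.5, top_n=3):
    """Hybrid: weighted blend of the collaborative and content scores."""
    cf = collaborative_scores(user, purchases)
    cb = content_scores(user, purchases, tags)
    blended = {
        item: alpha * cf.get(item, 0.0) + (1 - alpha) * cb.get(item, 0.0)
        for item in set(cf) | set(cb)
    }
    # Highest score first; ties broken alphabetically for a stable result.
    return sorted(blended, key=lambda item: (-blended[item], item))[:top_n]


print(recommend("abraham", purchases, tags))
# ['Shutter Island', 'Thor', '22 Jump Street']
```

Setting `alpha=1.0` gives pure collaborative filtering and `alpha=0.0` gives pure content-based filtering. Note that 22 Jump Street is recommended even though nobody in the toy data has bought it: the content score covers items with no purchase history, which is one of the gaps a hybrid is meant to fill.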

According to Business Insider, around 49% of consumers bought products they weren’t planning to buy after receiving a personalized recommendation.


In research cited by Dynamic Yield, 92% of marketers and executives agreed on the value of personalization.

If you need a recommendation system, you typically have two options: build your own or buy an existing service or product.

Developing your own is expensive, but it saves you from relying heavily on another company to run a core part of your business. Netflix spent years building and improving its recommendation engine, which cost it a fortune.

If you have the resources, you should build your own; if not, there are a number of reliable, efficient, and cost-effective solutions on the market that you can opt for, such as Personide.

Personide offers a mix of all these techniques, combined and customized, to produce product recommendations that can increase your store’s revenue.

I hope you now have a clear idea of how a recommendation engine works and know what a salesperson selling you a recommendation engine is talking about.