On the Segmentify Blog, we have been posting about the value of recommender systems for ecommerce, listing best practices and sharing our findings with 20,000+ customers. On the other hand, we realised that our customers are also interested in the technical aspects of recommender systems and recommendation engines, so we kindly asked academics to start a series of posts on these aspects. This is the first post, an introduction from Assoc. Prof. Mehmet Güray Güler and Mehmet Cabir Akkoyunlu of Yıldız Technical University. More to come for science-freak marketers and ecommerce enthusiasts.
An Introduction to Recommender Systems
In general, the aim of recommender systems, or recommendation engines, is to infer meaningful recommendations that attract users’ attention from the vast amount of data the users generate. The book recommendations of Amazon and the movie recommendations of Netflix are two well-known examples. On Netflix, for example, a huge amount of data is generated by users rating movies on a scale of 1 to 5. Then, by compiling the relations between the users and the products, the system finds meaningful intersections and recommends movies to the users. Recommender systems can be grouped into three main categories: collaborative filtering, which depends on historical interactions; content-based filtering, which depends on profile properties; and hybrid techniques, which use both.
The Types of Recommender Systems
The collaborative filtering (CF) technique uses a matrix built from the previous interactions of the users. For the Netflix example, one side of the matrix, say the rows, represents the users and the other side, say the columns, represents the movies. Each entry in the matrix is simply the rating that a user gave to a movie. Collaborative filtering can be divided into three types according to its solution technique. In neighbourhood-based CF, the active users are grouped according to their similarities, and the items that similar users are most likely to be interested in are recommended. In item-based CF, the matching is done through the similarity of the items. This technique gives fast results, especially in online systems. In model-based CF, the parameters of an underlying statistical model of the user ratings are estimated and then used for recommendation.
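To make the item-based idea concrete, here is a minimal sketch in Python. The rating matrix, the function names and the choice of cosine similarity are illustrative assumptions, not a description of any production system. Unrated entries are stored as 0 and treated as plain zeros in the similarity, which is a simplification.

```python
from math import sqrt

# Hypothetical toy data: rows are users, columns are movies, 0 means "not rated".
ratings = [
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
]


def column(matrix, j):
    """Return all users' ratings for movie j."""
    return [row[j] for row in matrix]


def cosine(a, b):
    """Cosine similarity between two equally long rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_item_based(matrix, user, item):
    """Predict a rating as the similarity-weighted average of the
    ratings this user already gave to other movies."""
    target = column(matrix, item)
    weighted_sum = 0.0
    weight_total = 0.0
    for other, rating in enumerate(matrix[user]):
        if other == item or rating == 0:
            continue  # skip the target movie and movies the user has not rated
        similarity = cosine(target, column(matrix, other))
        weighted_sum += similarity * rating
        weight_total += similarity
    return weighted_sum / weight_total if weight_total > 0 else 0.0


# User 0 has not rated movie 2; estimate the missing entry.
prediction = predict_item_based(ratings, user=0, item=2)
```

In this toy matrix, movie 2 is rated most like movie 3, which user 0 rated low, so the estimate is pulled towards the lower end of the scale.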
Content-based filtering proposes recommendations according to the contents of the items. These can be the comments made about the movies, user feedback and the written contents of the items. In general, similarities established with statistical techniques are used to make the recommendations. Bayesian classification, the k-means algorithm, decision trees and neural networks are some of the techniques used in content-based filtering.
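As a rough illustration of recommending by content, the sketch below compares item descriptions with a simple bag-of-words cosine similarity. This is a deliberately simpler stand-in for the techniques listed above, and the movie titles, descriptions and function names are all made up for the example.

```python
from collections import Counter
from math import sqrt

# Hypothetical catalogue: title -> short written description.
items = {
    "Space Saga": "space adventure with aliens and starships",
    "Galaxy Wars": "aliens invade in an epic space battle",
    "Love in Paris": "romantic comedy set in paris",
}


def bag_of_words(text):
    """Count how often each lowercase word appears in the text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(liked_title, catalog):
    """Rank the other items by how similar their description is
    to the description of an item the user liked."""
    profile = bag_of_words(catalog[liked_title])
    scores = {
        title: cosine(profile, bag_of_words(text))
        for title, text in catalog.items()
        if title != liked_title
    }
    return sorted(scores, key=scores.get, reverse=True)


ranking = recommend("Space Saga", items)
```

Here "Galaxy Wars" ranks first because it shares the words "space" and "aliens" with the liked item, while "Love in Paris" shares none.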
The hybrid methods, as the name implies, combine collaborative and content-based filtering. For example, there are several cases where grouping the items by content first and then applying collaborative filtering gives better results than using simple CF techniques alone.
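One possible way to combine the two ideas is sketched below. It assumes that a content-based step has already assigned each movie to a group (a hypothetical genre label stands in for that step here), and then compares users only on the movies inside that group. The data, the names and the similarity formula are illustrative assumptions.

```python
# Hypothetical output of a content-based grouping step: title -> group.
genres = {
    "Space Saga": "sci-fi",
    "Galaxy Wars": "sci-fi",
    "Love in Paris": "romance",
    "Rome Holiday": "romance",
}

# Hypothetical ratings: user -> {title: rating on a scale of 1 to 5}.
ratings = {
    "ana": {"Space Saga": 5, "Love in Paris": 2},
    "ben": {"Space Saga": 4, "Galaxy Wars": 5, "Love in Paris": 1},
    "cem": {"Space Saga": 2, "Love in Paris": 5, "Rome Holiday": 4},
    "dee": {"Space Saga": 5, "Galaxy Wars": 3},
}


def predict_hybrid(user, item):
    """Collaborative prediction in which user similarity is measured
    only on items from the same content group as the target item."""
    group = {title for title, genre in genres.items() if genre == genres[item]}
    mine = ratings[user]
    weighted_sum = 0.0
    weight_total = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        shared = [t for t in group if t in mine and t in theirs]
        if not shared:
            continue  # no overlap inside the group, so no evidence of similarity
        mean_diff = sum(abs(mine[t] - theirs[t]) for t in shared) / len(shared)
        weight = 1.0 / (1.0 + mean_diff)  # closer tastes give a larger weight
        weighted_sum += weight * theirs[item]
        weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else None


prediction = predict_hybrid("ana", "Galaxy Wars")
```

Restricting the comparison to the group means that ana's taste in romance does not influence whom she is matched with for a science-fiction recommendation.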
And Some Tricky Points
One of the main issues in recommender systems is the sparsity of the matrix. For the Netflix example, it is impossible for every user to rate every movie, so most entries in the matrix are empty. Another issue is making recommendations for new items, which have no ratings yet. However, there are several simulation-based recommendation systems that rely on item contents to handle this case. The final issue is fraud, i.e., some users can inflate the ratings of their own items or lower the ratings of a competitor’s items.
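Sparsity is easy to quantify as the share of empty entries in the rating matrix. The following short sketch uses a made-up matrix in which 0 marks a missing rating; the function name is hypothetical.

```python
# Hypothetical toy data: rows are users, columns are movies, 0 means "not rated".
ratings = [
    [5, 0, 0, 1],
    [0, 0, 4, 0],
    [0, 2, 0, 0],
]


def sparsity(matrix):
    """Fraction of entries in the matrix that hold no rating."""
    total = sum(len(row) for row in matrix)
    missing = sum(1 for row in matrix for value in row if value == 0)
    return missing / total


share_missing = sparsity(ratings)  # 8 of the 12 entries are empty
```

Even in this tiny example two thirds of the entries are empty, and the share only grows as users and items are added, since each user rates just a small part of the catalogue.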
Recommender systems are now at the very center of many ecommerce websites. While a good recommender system can increase the revenue generated by the system, a bad one can reduce it significantly. Hence, the owners of the websites should carefully design and employ their recommendation engines.