Creating Item Recommendations by Finding Users with Similar Taste

Let's say you have bought some books on programming (nerd), read them, and then rated them. Now, let's say a few other people have done the same thing. As an online bookstore owner, it would be in my best interest to find the users who rated those books the way you did (i.e. they have the same taste as you) and recommend the books they liked that you haven't read yet.

A really simple way to find how similar your reviews are to another person's is to essentially plot your reviews against Reviewer X's reviews and calculate the Euclidean Distance between your numerical review scores and the other reviewer's. With Euclidean Distance, the smaller the distance between the two points is, the more similar your taste is to the other user's.

The key here is to take a list of reviews and calculate a numerical score, which we can call a Similarity Score, that you can use to measure how the reviews of other users stack up against your own.
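To make the arithmetic concrete, here is the calculation written out. The symbols are my own notation, not something from the project: S is the set of titles both users have reviewed, and a_i and b_i are the two users' ratings for title i. Note that the code in this post skips the square root and works with the sum of squared differences directly, then flips it so that a higher number means more similar.

```latex
\[
d(a, b) = \sqrt{\sum_{i \in S} (a_i - b_i)^2},
\qquad
\text{score}(a, b) = \frac{1}{1 + \sum_{i \in S} (a_i - b_i)^2}
\]

% Example: you rate two shared books 5 and 3, Reviewer X rates them 4 and 1.
\[
\text{score} = \frac{1}{1 + (5 - 4)^2 + (3 - 1)^2} = \frac{1}{6} \approx 0.167
\]
```

Identical ratings give a score of 1, and the score drops toward 0 as the ratings drift apart.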

This code will calculate the Similarity Score between two users.

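using System;
using System.Collections.Generic;
using System.Linq;

// Scores how alike two reviewers' ratings are, based on the titles both have reviewed.
public class EuclideanDistance : SimilarityScore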
{
  private readonly Reviewer CompareTo;
  private readonly Reviewer CompareWith;

  public EuclideanDistance(Reviewer compareTo, Reviewer compareWith)
  {
    CompareTo = compareTo;
    CompareWith = compareWith;
  }

  public double Score()
  {
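    // Only titles reviewed by both users can be compared.
    var similarTitles = FindSharedItems();
    if (similarTitles.Count == 0)
    {
      return 0.0;
    }

    // Sum the squared rating differences (the squared distance; no square root is taken).
    double sumOfSquares = similarTitles.Sum(title =>
      Math.Pow(CompareTo.Reviews[title] - CompareWith.Reviews[title], 2));

    // Identical ratings score 1; the score falls toward 0 as the ratings diverge.
    return Math.Round(1 / (1 + sumOfSquares), 3);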
  }

  public List<string> FindSharedItems()
  {
    return (from r in CompareTo.Reviews where 
      CompareWith.Reviews.ContainsKey(r.Key) select r.Key).ToList();
  }
}

Once you have calculated the Similarity Score for each of the other users in the system, you can use those scores to weight and average their reviews of items that you haven't read, in an attempt to make the best possible recommendation. This will work with any type of item or review: as long as two reviewers have something in common, a score will be calculated.
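Here is a rough sketch of what that weighting and averaging step could look like. It is not code from the project: WeightedRecommendations, Similarity, and Recommend are names I made up for illustration, and I am assuming each user's reviews are a Dictionary<string, double> of title to rating. The similarity calculation is restated (without the rounding) so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the original project.
public static class WeightedRecommendations
{
  // Same scoring idea as EuclideanDistance.Score(), minus the rounding.
  public static double Similarity(
    Dictionary<string, double> mine, Dictionary<string, double> theirs)
  {
    var shared = mine.Keys.Where(t => theirs.ContainsKey(t)).ToList();
    if (shared.Count == 0)
    {
      return 0.0;
    }

    double sumOfSquares = shared.Sum(t => Math.Pow(mine[t] - theirs[t], 2));
    return 1 / (1 + sumOfSquares);
  }

  // Predicts a rating for every title I have not reviewed, best guess first.
  public static List<KeyValuePair<string, double>> Recommend(
    Dictionary<string, double> mine,
    IEnumerable<Dictionary<string, double>> others)
  {
    var weightedTotals = new Dictionary<string, double>();
    var similarityTotals = new Dictionary<string, double>();

    foreach (var theirs in others)
    {
      double similarity = Similarity(mine, theirs);
      if (similarity <= 0)
      {
        continue; // nothing in common, so this reviewer gets no vote
      }

      // Only look at titles I have not reviewed yet.
      foreach (var review in theirs.Where(r => !mine.ContainsKey(r.Key)))
      {
        weightedTotals.TryGetValue(review.Key, out double total);
        similarityTotals.TryGetValue(review.Key, out double weight);
        weightedTotals[review.Key] = total + review.Value * similarity;
        similarityTotals[review.Key] = weight + similarity;
      }
    }

    // Weighted average: divide each weighted total by the sum of the weights used.
    return weightedTotals
      .Select(t => new KeyValuePair<string, double>(
        t.Key, t.Value / similarityTotals[t.Key]))
      .OrderByDescending(p => p.Value)
      .ToList();
  }
}
```

Dividing by the sum of the similarity scores keeps a title from ranking higher just because more people reviewed it. One weakness of this sketch is that a title rated by a single, barely similar reviewer can still land at the top of the list, so a real site would probably want a minimum number of reviewers or a minimum similarity before trusting a prediction.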

This is a really simple example of machine learning. Using a simple algorithm (or a more complex one if you actually have a site that makes money), we can intelligently guess at what our customers are looking for based on data from other customers. Building intelligence into applications can significantly improve the user's experience by turning advertisement or recommendation space into something that can benefit the user, which will in turn improve your bank account.

This is just the tip of the iceberg. You can find the rest of the project on GitHub.

Creative Commons License
