Friday, May 27, 2011

Evaluating Mahout based Recommender Implementations

Mahout lets you evaluate the recommendations you generate against the actual preference values.
In Mahout's recommender evaluators, part of the real preference data set is held back as test data. These test preferences are excluded from the training data set (the actual data set minus the test data set), which is the only input fed to the recommender under evaluation. The recommender then estimates preferences for the test data, and these estimates are compared with the actual values in the data set.
Mahout provides two evaluators for this:

1.       Average Absolute Difference Evaluator
                The average of the absolute differences between the actual and estimated preferences is calculated. The lower the value, the better the recommendations: a lower value means the estimated preferences differed from the actual preferences by a smaller amount. A value of 0 means the estimated and actual preferences are identical, that is, perfect recommendations.

2.       Root Mean Square Evaluator
                Here the score is the square root of the average of the squared differences between the actual and estimated preferences. In this evaluation too, the lower the score, the better the recommendations, and 0 means perfect recommendations.
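To make the two scores concrete, here is a minimal plain-Java sketch that computes both of them by hand. It does not use Mahout at all: the class name, method names and the sample ratings are made up for illustration, and Mahout's own evaluators may differ in detail.

```java
public class EvaluationMetricsSketch {

    // Mean of |actual - estimated| over all test preferences.
    public static double averageAbsoluteDifference(double[] actual, double[] estimated) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - estimated[i]);
        }
        return sum / actual.length;
    }

    // Square root of the mean of (actual - estimated)^2 over all test preferences.
    public static double rootMeanSquare(double[] actual, double[] estimated) {
        double sumOfSquares = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - estimated[i];
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares / actual.length);
    }

    public static void main(String[] args) {
        // Hypothetical ratings, invented for this example only.
        double[] actual = {4.0, 3.0, 5.0, 2.0};
        double[] estimated = {3.0, 3.0, 4.0, 4.0};
        System.out.println("Average absolute difference: "
                + averageAbsoluteDifference(actual, estimated));
        System.out.println("Root mean square: "
                + rootMeanSquare(actual, estimated));
    }
}
```

For these sample values the differences are 1, 0, 1 and 2, so the average absolute difference is 1.0 and the root mean square is the square root of 1.5, roughly 1.22. The root mean square comes out higher because squaring gives the single large miss more weight.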
The code snippet below shows a sample recommender evaluator.

Recommender Evaluator


import java.io.File;
import java.io.IOException;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.RMSRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.TanimotoCoefficientSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;
import org.apache.mahout.common.RandomUtils;

public class RecommenderEvaluvator {

      private static int neighbourhoodSize = 7;

      public static void main(String args[]) {
            String recsFile = "D://inputData.txt";
            /*Create a RecommenderBuilder that overrides the buildRecommender method.
            This builder is one of the parameters of the RecommenderEvaluator - evaluate method*/
            RecommenderBuilder userSimRecBuilder = new RecommenderBuilder() {
                  public Recommender buildRecommender(DataModel model) throws TasteException {
                        //The similarity algorithm used in your recommender
                        UserSimilarity userSimilarity = new TanimotoCoefficientSimilarity(model);
                        /*The neighborhood algorithm used in your recommender;
                         not required if you are evaluating item based recommendations*/
                        UserNeighborhood neighborhood = new NearestNUserNeighborhood(neighbourhoodSize, userSimilarity, model);
                        //The recommender used in your real implementation
                        return new GenericBooleanPrefUserBasedRecommender(model, neighborhood, userSimilarity);
                  }
            };
            try {
                  //Use this only in unit tests and examples that must guarantee repeatable results
                  RandomUtils.useTestSeed();
                  //Create the data model to be passed to the RecommenderEvaluator - evaluate method
                  FileDataModel dataModel = new FileDataModel(new File(recsFile));
                  /*Create a RecommenderEvaluator to run the evaluation;
                  you can use AverageAbsoluteDifferenceRecommenderEvaluator() as well*/
                  RecommenderEvaluator evaluator = new RMSRecommenderEvaluator();
                  //Obtain the user similarity evaluation score
                  double userSimEvaluationScore = evaluator.evaluate(userSimRecBuilder, null, dataModel, 0.7, 1.0);
                  System.out.println("User Similarity Evaluation score : " + userSimEvaluationScore);
            } catch (IOException e) {
                  e.printStackTrace();
            } catch (TasteException e) {
                  e.printStackTrace();
            }
      }
}

Let us look in a bit more detail at some important lines of code in the above example.

The evaluator relies heavily on randomness to choose the test data. Calling RandomUtils.useTestSeed() ensures that the evaluator makes the same random choices every time. Use this call only in unit tests or examples that must produce the same evaluation results on every run. Never use it in your real code.
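The effect of a fixed seed can be illustrated without Mahout. The sketch below uses only java.util.Random and is an analogy for what a fixed test seed buys you, not a description of how RandomUtils is implemented. The class name, method name and seed value are arbitrary choices for this example.

```java
import java.util.Arrays;
import java.util.Random;

public class FixedSeedSketch {

    // Draws n pseudo-random doubles from a generator created with the given seed.
    public static double[] draw(long seed, int n) {
        Random random = new Random(seed);
        double[] values = new double[n];
        for (int i = 0; i < n; i++) {
            values[i] = random.nextDouble();
        }
        return values;
    }

    public static void main(String[] args) {
        // The seed 42 is arbitrary; any fixed value gives a repeatable sequence.
        double[] first = draw(42L, 3);
        double[] second = draw(42L, 3);
        // Same seed, same sequence, so the same test data would be picked on every run.
        System.out.println(Arrays.equals(first, second)); // prints true
    }
}
```

Because the same seed always yields the same sequence, any train/test split driven by it is repeatable. That is useful in a unit test, and it is also why a fixed seed does not belong in production code.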

The core evaluation happens in the evaluator.evaluate() method. Let us look at each of the five parameters used here.
The first parameter is the RecommenderBuilder you created by overriding the buildRecommender() method in your evaluator.
The second parameter, null, is a placeholder for the DataModelBuilder. Null selects the default, which is fine as long as your recommender does not use a specialized implementation of DataModel.
The third parameter is the DataModel that holds the input data.
The fourth and fifth parameters control how much of the input data is used. The fifth (last) parameter is the share of the input data considered for the evaluation; 1.0 means 100% of the input data is used. The fourth parameter is the share of that data used to train the algorithm; 0.7 means 70% of the selected data trains the algorithm and the remaining 30% is used for the test. On a real data set with a huge volume of data, we would normally use only a small percentage of the input to re-evaluate the recommender after each minor code change. In such a case choose a small portion of the total data set, say 10%, by passing 0.1 as the last parameter. There is a slight loss in accuracy, but it is a good approach for test driven development.
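The interplay of the training percentage and the evaluation percentage can be sketched in plain Java. This is a simplified illustration under stated assumptions, not Mahout's code: preferences are represented by integer indexes, and the class and method names are hypothetical. Mahout's own selection logic may differ in detail, for instance in whether it samples users or individual preferences.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class HoldOutSplitSketch {

    // Result of the split: preferences used for training and those held back for testing.
    public static final class Split {
        public final List<Integer> training = new ArrayList<>();
        public final List<Integer> test = new ArrayList<>();
    }

    // Considers each preference with probability evaluationPercentage, then sends
    // a considered preference to training with probability trainingPercentage.
    public static Split split(int preferenceCount, double trainingPercentage,
                              double evaluationPercentage, Random random) {
        Split result = new Split();
        for (int pref = 0; pref < preferenceCount; pref++) {
            if (random.nextDouble() >= evaluationPercentage) {
                continue; // left out of this evaluation run entirely
            }
            if (random.nextDouble() < trainingPercentage) {
                result.training.add(pref);
            } else {
                result.test.add(pref);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Seed 7 and the count 10,000 are arbitrary values for this illustration.
        Split all = split(10_000, 0.7, 1.0, new Random(7L));
        Split tenth = split(10_000, 0.7, 0.1, new Random(7L));
        System.out.println(all.training.size() + " training, " + all.test.size() + " test");
        System.out.println(tenth.training.size() + " training, " + tenth.test.size() + " test");
    }
}
```

With an evaluation percentage of 1.0 every preference lands in either the training or the test list, at roughly 70% to 30%. With 0.1 only about a tenth of the preferences are touched at all, which is what makes the quick re-evaluation runs cheap.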


  1. How do you apply this concept to evaluate the accuracy of distributed recommendations?

  2. Hi Jonathan

    Based on my experience, this is just a metric. The real accuracy of recommendations can be determined by data and business analysts. :) All our projects were designed based on our BAs' analysis of various similarity algorithms. We chose the best one based on their feedback.

  3. Hey Bejoy, thanks for this.

    Could you tell me what neighborhood size means and why it is significant?

    The Javadoc says it is capped at the number of users in the data model. Also, it cannot be less than 1, or else it throws a Taste exception.

    Does it mean that the user should expect a minimum of one recommendation?
