Gab41

Gab41 is Lab41's blog exploring data science, machine learning, and artificial intelligence. Geek out with us!

TPS Report for Recommender Systems? Yeah, That Would be Great.

Tiffany Jaya
Published in Gab41
9 min read, Mar 13, 2016

Predictive Performance Metrics

Root Mean Squared Error and Mean Absolute Error
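
For reference, the two metrics named in this heading can be written as follows. The notation is an assumption made here for illustration: \hat{y}_i is the predicted rating and y_i the actual rating for the i-th (user, movie) pair, with N pairs in total.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2},
\qquad
\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right|
```

Because RMSE squares each error before averaging, a few large misses raise it more than they raise MAE.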

Implementation of RMSE and MAE in Spark

# Manual RMSE. Each input row is a (user_id, movie_id, rating) tuple.
from math import sqrt
from operator import add

# Key both RDDs by (user_id, movie_id) so they can be joined.
# Indexing is used instead of tuple-unpacking lambdas, which only work in Python 2.
y_predicted_reformat = y_predicted.map(lambda row: ((row[0], row[1]), row[2]))
y_actual_reformat = y_actual.map(lambda row: ((row[0], row[1]), row[2]))

# After the join, each record is ((user_id, movie_id), (predicted_rating, actual_rating)).
ratings_diff_sq = y_predicted_reformat.join(y_actual_reformat) \
    .map(lambda kv: (kv[1][0] - kv[1][1]) ** 2)

sum_ratings_diff_sq = ratings_diff_sq.reduce(add)
num = ratings_diff_sq.count()
average_prediction_error = sum_ratings_diff_sq / float(num)
rmse = sqrt(average_prediction_error)
# RMSE and MAE through MLlib's RegressionMetrics.
from pyspark.mllib.evaluation import RegressionMetrics

# RegressionMetrics expects an RDD of (prediction, observation) pairs,
# so keep only the joined values and drop the (user_id, movie_id) key.
# Assumes each (user_id, movie_id) pair appears at most once in each RDD.
predicted_and_actual_ratings = y_predicted_reformat.join(y_actual_reformat) \
    .values()

metrics = RegressionMetrics(predicted_and_actual_ratings)
rmse = metrics.rootMeanSquaredError
mae = metrics.meanAbsoluteError
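
To sanity-check the Spark results on a small sample, the same two metrics can be computed in plain Python. This is a minimal sketch rather than part of the Spark pipeline: the helper name `rmse_and_mae` and the ratings below are made up for illustration.

```python
from math import sqrt


def rmse_and_mae(predicted, actual):
    """Return (RMSE, MAE) for two equal-length sequences of ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [p - a for p, a in zip(predicted, actual)]
    n = float(len(errors))
    rmse = sqrt(sum(e ** 2 for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    return rmse, mae


# Hypothetical ratings, for illustration only.
predicted = [4.0, 3.0, 5.0, 1.0]
actual = [5.0, 3.0, 3.0, 1.0]
rmse, mae = rmse_and_mae(predicted, actual)
```

Running the Spark code and this helper on the same handful of ratings should give matching values, which is a quick way to catch a join that silently dropped rows.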

Classification Performance Metrics

Accuracy, Precision, Recall and F-score
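
For reference, these metrics are usually defined from the four confusion-matrix counts for a given class. The symbols are an assumption made here for illustration: TP, FP, TN, and FN are the numbers of true positives, false positives, true negatives, and false negatives, and \beta is the weight that the F-score gives to recall relative to precision.

```latex
\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN},
\qquad
\mathrm{Precision} = \frac{TP}{TP + FP},
\qquad
\mathrm{Recall} = \frac{TP}{TP + FN}

F_\beta = \frac{(1 + \beta^2)\,\mathrm{Precision}\cdot\mathrm{Recall}}{\beta^2\,\mathrm{Precision} + \mathrm{Recall}}
```

With \beta = 1 the F-score reduces to the harmonic mean of precision and recall, commonly called F1.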

Implementation of Precision, Recall, and F-score in Spark

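from pyspark.mllib.evaluation import MulticlassMetrics

# predicted_and_actual_classifications is an RDD of (prediction, label) pairs,
# where 1.0 marks a good movie and 0.0 a bad one.
metrics = MulticlassMetrics(predicted_and_actual_classifications)
confusion_matrix = metrics.confusionMatrix().toArray()

# Overall metrics. The no-argument forms come from the Spark 1.x API;
# newer releases may require a label argument.
precision = metrics.precision()
recall = metrics.recall()
f1 = metrics.fMeasure()

# Per-class metrics. The second fMeasure argument is beta (1.0 gives F1).
precision_for_bad_movies = metrics.precision(0.0)
precision_for_good_movies = metrics.precision(1.0)
recall_for_bad_movies = metrics.recall(0.0)
recall_for_good_movies = metrics.recall(1.0)
f1_for_bad_movies = metrics.fMeasure(0.0, 1.0)
f1_for_good_movies = metrics.fMeasure(1.0, 1.0)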
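
The per-class numbers can be cross-checked without Spark on a small sample. The following is a minimal sketch under stated assumptions: the helper name `precision_recall_f` and the labels are hypothetical, and a metric whose denominator is zero is reported as 0.0, which is a choice made here and not necessarily what MLlib does.

```python
def precision_recall_f(predicted, actual, label, beta=1.0):
    """Per-label precision, recall, and F-beta from paired class labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == label and a == label)
    fp = sum(1 for p, a in pairs if p == label and a != label)
    fn = sum(1 for p, a in pairs if p != label and a == label)
    # Report 0.0 when a denominator is zero (an assumption of this sketch).
    precision = tp / float(tp + fp) if tp + fp else 0.0
    recall = tp / float(tp + fn) if tp + fn else 0.0
    denom = beta ** 2 * precision + recall
    f_beta = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
    return precision, recall, f_beta


# Hypothetical labels: 1.0 = good movie, 0.0 = bad movie.
predicted = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
actual = [1.0, 1.0, 0.0, 0.0, 0.0, 1.0]
good = precision_recall_f(predicted, actual, 1.0)
bad = precision_recall_f(predicted, actual, 0.0)
```

Here `good` and `bad` hold the (precision, recall, F1) triples for the two classes, in the same order as the per-class calls in the Spark snippet.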

Conclusion

Written by Tiffany Jaya

Passionate in communicating big data insights effectively to non-tech users. Data Vis Aficionado. UX Design Practitioner.
