Creating Personalized Experiences: A Deep Dive into TensorFlow Recommenders


In the realm of recommendation systems, building a comprehensive and effective system can be quite challenging for both newcomers and experienced professionals. It involves multiple intricate steps, such as acquiring a dataset, creating embedding representations, and writing a fair amount of non-trivial code. To simplify this complex procedure, TensorFlow has introduced an open-source package called TensorFlow Recommenders. In this article, we will delve into the concept behind TensorFlow Recommenders and explore its implementation, highlighting how easily and efficiently we can set up a recommendation system. Let’s begin by outlining the key points we’ll cover:

Points to be Discussed

  1. What is TensorFlow Recommenders?
  2. Retrieval System
  3. Implementing TensorFlow Recommenders

What is TensorFlow Recommenders?

TensorFlow Recommenders (TFRS) is an open-source TensorFlow package designed to streamline the creation, evaluation, and deployment of advanced recommender models. Built on TensorFlow 2.x, TFRS lets us build and evaluate flexible candidate nomination models. It allows seamless integration of item, user, and context information into recommendation models, enabling us to train multi-task models that optimize several recommendation objectives simultaneously. The resulting models can then be served efficiently with TensorFlow Serving.

Retrieval System

In many recommender systems, the objective is to surface a handful of top recommendations from a vast pool of potential candidates. The retrieval stage of a recommender system addresses exactly this challenge: identifying an initial shortlist of promising candidates out of a very large candidate pool. Thankfully, TensorFlow Recommenders simplifies this process through the construction of two-tower retrieval models. These models operate in two steps:

  1. Converting user input into an embedding
  2. Identifying the best options in the embedding space

TFRS utilizes TensorFlow 2.x and Keras to build these models, making them both familiar and user-friendly. While the framework is designed to be modular, allowing for easy customization of specific layers and metrics, it also functions seamlessly as a whole, ensuring optimal compatibility among individual components.
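
Under the hood, both steps boil down to simple vector math: the query tower turns its input into an embedding, and retrieval scores every candidate embedding against it with a dot product, keeping the highest-scoring items. The sketch below uses plain NumPy with made-up embeddings purely to illustrate the idea; it is not TFRS code:

import numpy as np
 
# Made-up embeddings: one query and five candidates in a 4-dimensional space.
query_embedding = np.array([0.2, 0.7, 0.1, 0.5])
candidate_embeddings = np.random.rand(5, 4)
 
# Score every candidate against the query with a dot product...
scores = candidate_embeddings @ query_embedding
 
# ...and keep the best-scoring candidates as the shortlist.
top_2 = np.argsort(scores)[::-1][:2]
print(top_2, scores[top_2])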

Implementing TensorFlow Recommenders

To provide a practical demonstration of how to leverage TensorFlow Recommenders, let’s explore a basic use case based on TensorFlow’s official implementation. In this scenario, we will train a model for movie recommendations using the MovieLens dataset. The dataset contains information about the movies users have watched and the ratings they have assigned to those movies.

Our objective is to build a model that can predict which movies a user is likely to watch and which they are not. We will employ a widely used and effective design known as the two-tower model: a neural network composed of two sub-models that learn representations for queries and candidates separately. The score for a given query-candidate pair is the dot product of the outputs of these two towers, as depicted in the animation below.

Two-Tower Analogy

On the query side, the inputs can vary, ranging from user ids and search queries to timestamps. On the candidate side, we can consider movie titles, descriptions, synopses, and lists of starring actors. For the purpose of this example, we will keep things simple and use user ids for the query tower and movie titles for the candidate tower.

Let’s proceed by setting up our environment, installing the necessary dependencies, and importing the required libraries:

!pip install -q tensorflow-recommenders
!pip install -q --upgrade tensorflow-datasets
 
import tensorflow_datasets as tfds
import tensorflow_recommenders as tfrs
 
import numpy as np
import tensorflow as tf
 
from typing import Dict, Text
import pprint

Next, we will prepare the data using the MovieLens dataset from TensorFlow Datasets. For simplicity, we will only use the user_id and movie_title features:

# Ratings data
rating = tfds.load('movielens/100k-ratings', split='train')
# Features of all the movies
movies = tfds.load('movielens/100k-movies', split='train')
 
# Limiting the features
rating = rating.map(lambda x: {'movie_title': x['movie_title'], 'user_id': x['user_id']})
movies = movies.map(lambda x: x['movie_title'])
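
Before moving on, it can be helpful to peek at a single record from each dataset to confirm which features remain; this inspection step is an addition to the original walkthrough and uses the pprint module imported earlier:

# Print one example from each dataset to verify the remaining features.
for example in rating.take(1):
    pprint.pprint(example)
 
for title in movies.take(1):
    pprint.pprint(title)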

To implement the two-tower model, we need a user tower that maps user_ids into a high-dimensional vector space and a movie tower that does the same for movie titles. As a first step, we build vocabularies that translate the raw string ids into integer indices, which the Keras embedding layers defined later will consume:

user_id_vocabulary = tf.keras.layers.experimental.preprocessing.StringLookup(mask_token=None)
user_id_vocabulary.adapt(rating.map(lambda x: x['user_id']))
 
movies_title_vocabulary = tf.keras.layers.experimental.preprocessing.StringLookup(mask_token=None)
movies_title_vocabulary.adapt(movies)
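
As a quick sanity check (an optional step, not part of the original flow), we can confirm that the vocabularies translate raw string ids into integer indices, which is exactly what the embedding layers defined below expect:

# A raw user id is mapped to an integer index; index 0 is an out-of-vocabulary
# bucket, since mask_token=None means no index is reserved for masking.
print(user_id_vocabulary(['42']))
 
# Total vocabulary size: the number of known ids plus the OOV bucket.
print(user_id_vocabulary.vocab_size())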

Now, let’s define the class that represents our recommendation model. It contains two methods: __init__() and compute_loss(). The __init__() method sets up the primary components of the model: the user representation, the movie representation, and the retrieval task. The compute_loss() method defines how the loss is computed during training:

class MovieLensModel(tfrs.Model):
 
  def __init__(
      self,
      user_model: tf.keras.Model,
      movie_model: tf.keras.Model,
      task: tfrs.tasks.Retrieval):
    super().__init__()
 
    # Set up user and movie representations.
    self.user_model = user_model
    self.movie_model = movie_model
 
    # Set up a retrieval task.
    self.task = task
 
  def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:
    # Define how the loss is computed.
 
    user_embeddings = self.user_model(features["user_id"])
    movie_embeddings = self.movie_model(features["movie_title"])
 
    return self.task(user_embeddings, movie_embeddings)

Next, we define the user model and the movie model as Keras Sequential models, and the retrieval task using TFRS:

users_model = tf.keras.Sequential([
    user_id_vocabulary,
    tf.keras.layers.Embedding(user_id_vocabulary.vocab_size(), 64)])
movie_model = tf.keras.Sequential([
    movies_title_vocabulary,
    tf.keras.layers.Embedding(movies_title_vocabulary.vocab_size(), 64)])
 
task = tfrs.tasks.Retrieval(metrics=tfrs.metrics.FactorizedTopK(
    movies.batch(128).map(movie_model)))

With the model, user model, movie model, and retrieval task in place, we can create, compile, and train the retrieval model:

model = MovieLensModel(users_model, movie_model, task)
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.5))
model.fit(rating.batch(4096), epochs=3)
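
The snippet above trains on every rating in the dataset. To get a feel for retrieval quality on unseen interactions, a common variation (not part of the original walkthrough) is to split the 100k ratings before training and call model.evaluate() on the held-out portion, which reports the FactorizedTopK metrics configured on the task. A minimal sketch, assuming an 80/20 split:

# Hold out 20% of the ratings for evaluation instead of training on everything.
shuffled = rating.shuffle(100_000, seed=42, reshuffle_each_iteration=False)
train = shuffled.take(80_000)
test = shuffled.skip(80_000)
 
model.fit(train.batch(4096), epochs=3)
 
# Top-k categorical accuracy: how often the movie a user actually watched
# appears among the model's top-k candidates.
metrics = model.evaluate(test.batch(4096), return_dict=True)
print(metrics['factorized_top_k/top_100_categorical_accuracy'])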

Validation and Recommendations

To validate the recommendations made by the model, we can utilize the TFRS BruteForce layer. This layer is indexed with candidate representations that have already been computed, enabling us to find the top movies in response to a query by calculating the query-candidate score for all available candidates. Here’s an example of how it can be done:

recommends = tfrs.layers.factorized_top_k.BruteForce(model.user_model)
recommends.index_from_dataset(movies.batch(100).map(lambda title: (title, model.movie_model(title))))
 
id_ = input('Enter the user_id: ')
_, titles = recommends(np.array([str(id_)]))
print('Top recommendations for user', id_, 'are:', titles[0, :3])
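
Because the BruteForce layer is a regular Keras layer, the whole retrieval index can be exported as a SavedModel and served, for example with TensorFlow Serving, which is how the serving point from the introduction plays out in practice. A minimal sketch (the directory name is just a placeholder):

import os
import tempfile
 
# Export the index as a SavedModel so it can be served outside this process.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'retrieval_index')
    tf.saved_model.save(recommends, path)
 
    # Load it back (TensorFlow Serving would do the equivalent) and query it
    # exactly like the in-memory layer.
    loaded = tf.saved_model.load(path)
    _, titles = loaded(tf.constant(['42']))
    print('Top recommendations for user 42:', titles[0, :3])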

Conclusion

In this article, we have covered the essential steps involved in building a movie recommendation system from user ratings. We imported the data and narrowed it down to the features we needed, created vocabulary and embedding representations using Keras preprocessing layers, defined a TFRS model class along with the user and movie models and the retrieval task, and finally combined all the components in MovieLensModel, trained the model, and made inferences. This post provided an end-to-end overview of how to build a recommendation system with TensorFlow Recommenders.