Overview

  • Recommender systems are powerful tools in modern applications, helping to personalize content for users. However, they are susceptible to various biases that can distort their effectiveness and fairness. This article delves into five key biases in recommender systems and explores technical solutions to mitigate them.

Biases in Recommender Systems

Clickbait Bias

Problem: Recommender systems that rely on clicks as positive signals tend to favor sensational content (clickbait) which attracts clicks but offers low value.

Solution: Weighted Logistic Regression

  • Standard Logistic Regression: Uses clicks as positives and no-clicks as negatives.
  • Weighted Logistic Regression: Weights positive examples (clicks) by watch time to prioritize content with longer watch times.
  • Implementation: weight each clicked example by its watch time while non-clicks keep unit weight. The learned odds then approximate expected watch time, \(e^{\text{logit}(u, v)} \approx \mathbb{E}[\text{watch_time}(u, v)]\), so ranking by the logit favors content that retains viewers rather than content that merely attracts clicks (a minimal sketch follows below).
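
In practice this amounts to per-example sample weights. A minimal sketch, assuming synthetic features and hypothetical `clicked` / `watch_time` arrays, using scikit-learn's `sample_weight` to carry the watch-time weight on positives:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: one row per impression.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))                     # impression features
clicked = rng.integers(0, 2, size=1000)            # 1 = click, 0 = no click
watch_time = np.where(clicked == 1, rng.exponential(120, size=1000), 0.0)

# Weighted logistic regression: positives weighted by watch time,
# negatives keep unit weight, so the learned odds track watch time
# rather than raw click probability.
sample_weight = np.where(clicked == 1, watch_time, 1.0)

model = LogisticRegression()
model.fit(X, clicked, sample_weight=sample_weight)

# At serving time, exp(logit) roughly approximates expected watch time,
# which down-ranks clickbait that is clicked but quickly abandoned.
scores = model.decision_function(X)
expected_watch = np.exp(scores)
```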

Duration Bias

Problem: Models favor longer videos simply because they have longer watch times, not necessarily because they are more relevant.

Solution: Quantile-based Watch-Time Prediction

  • Duration Quantiles: Bucket videos by their lengths.
  • Watch Time Quantiles: Bucket watch times within each duration bucket.
  • Implementation: \(\text{video_quantile} = \text{Quantile}(\text{duration}, \text{num_quantiles})\) and \(\text{watch_quantile} = \text{Quantile}(\text{watch_time}, \text{num_quantiles})\), with watch-time quantiles computed within each duration bucket.
    • Train the model to predict \(\text{watch_quantile}\) given \(\text{video_quantile}\) (and other features), ensuring a fair comparison across different video lengths (a rough sketch follows below).
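
A rough sketch of the bucketing step with pandas, using synthetic data and hypothetical `duration` / `watch_time` columns; the downstream model that consumes the quantile labels is omitted:

```python
import numpy as np
import pandas as pd

num_quantiles = 10

# Hypothetical interaction log: one row per (user, video) watch event.
df = pd.DataFrame({
    "duration": np.random.exponential(300, size=5000),    # video length (s)
    "watch_time": np.random.exponential(120, size=5000),  # time watched (s)
})

# 1. Bucket videos into duration quantiles.
df["video_quantile"] = pd.qcut(df["duration"], q=num_quantiles, labels=False)

# 2. Within each duration bucket, bucket watch times into quantiles.
df["watch_quantile"] = (
    df.groupby("video_quantile")["watch_time"]
      .transform(lambda s: pd.qcut(s, q=num_quantiles, labels=False, duplicates="drop"))
)

# The model is then trained to predict watch_quantile (conditioned on
# video_quantile and other features), so a short video watched to the end
# can score as high as a long video watched to the end.
```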

Position Bias

Problem: Users are more likely to click items at the top of a list due to their position, not necessarily their relevance.

Solutions:

  • Inverse Position Weighting: \(w_i = \frac{1}{\text{position_bias}(i)}\) Weight training samples inversely to their position bias to reduce the influence of top positions.

  • Result Randomization: Randomly re-rank top items for a subset of users to estimate position bias through engagement changes, though this can negatively impact user experience.

  • Intervention Harvesting: Use historical engagement data from different model versions to estimate position bias without additional interventions.

  • Google’s Rules of Machine Learning, Rule #36: include the rank as a feature during training: \(\text{features} = [\text{other_features}, \text{rank}]\). At inference, set the rank feature to a single default value (e.g., -1) so every candidate is scored as if shown in the same position, decoupling predicted relevance from where the item happened to be displayed (see the sketch after this list).
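
A toy sketch combining two of the strategies above, inverse position weighting and rank-as-a-feature, on made-up logged data with an assumed position-bias curve (in a real system the curve would come from randomization or intervention harvesting):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000

# Hypothetical logged data: item features, the position each item was
# shown at (1..10), and whether it was clicked.
features = rng.normal(size=(n, 6))
position = rng.integers(1, 11, size=n)
clicked = rng.integers(0, 2, size=n)

# Assumed examination probability per position (the position-bias curve).
bias_curve = 1.0 / np.arange(1, 11)
propensity = bias_curve[position - 1]

# --- Inverse position weighting ---------------------------------------
# Clicks from low positions get a larger weight (1 / propensity), so the
# model is not dominated by clicks that top slots collect "for free".
sample_weight = np.where(clicked == 1, 1.0 / propensity, 1.0)

# --- Rank as a feature (Rules of ML #36 style) -------------------------
# Train with the logged rank appended as a feature ...
X_train = np.hstack([features, position.reshape(-1, 1)])
model = LogisticRegression().fit(X_train, clicked, sample_weight=sample_weight)

# ... and score candidates with the rank feature fixed to a default value
# at inference, so position no longer leaks into the prediction.
default_rank = -1
X_serve = np.hstack([features, np.full((n, 1), default_rank)])
scores = model.predict_proba(X_serve)[:, 1]
```

In practice a team would usually validate each correction separately (or confirm that combining them does not double-count the position effect) before shipping.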

Popularity Bias

Problem: The model favors popular items due to their higher interaction rates, ignoring potentially more relevant but less popular items.

Solution: Logit Adjustment for Popularity

  • Adjustment Formula: \(\text{logit}_{\text{adj}}(u, v) = \text{logit}(u, v) - \log P(v)\), where \(P(v)\) is the item’s popularity (e.g., its share of all interactions). Subtracting \(\log P(v)\) normalizes the predicted odds by popularity, balancing popular and niche items (sketched below).
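
A minimal sketch of the adjustment, assuming a hypothetical candidate set with known raw logits and empirical popularity shares:

```python
import numpy as np

# Hypothetical raw scores from the model for one user over four candidates.
logits = np.array([2.1, 1.8, 0.9, 0.4])

# Item popularity P(v): empirical share of all interactions each item
# receives (assumed known from training logs).
popularity = np.array([0.50, 0.30, 0.15, 0.05])

# Logit adjustment: subtract log-popularity so an item's score reflects
# how much *more* this user likes it than the average user does.
adjusted = logits - np.log(popularity)

# Ranking by adjusted logits promotes niche items the user genuinely
# prefers over items that are merely globally popular.
ranking = np.argsort(-adjusted)
print(ranking)
```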

Single-Interest Bias

Problem: Models overemphasize the most frequent user interest, neglecting the user’s diverse preferences.

Solution: Platt Scaling

  • Calibration Technique: Platt scaling maps the model’s raw scores to calibrated probabilities using a sigmoid with learned parameters.
  • Formula: \(P(y = 1 | x) = \frac{1}{1 + \exp(A \cdot f(x) + B)}\) where \(A\) and \(B\) are learned parameters on a hold-out set.

  • Outcome: Improved recommendation diversity and alignment with actual user preferences, quantified using metrics like KL divergence.
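
A minimal sketch of fitting Platt scaling on a hold-out set, implemented as a one-dimensional logistic regression over the raw score (synthetic data); scikit-learn's `CalibratedClassifierCV(method="sigmoid")` wraps the same idea around an existing classifier:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical hold-out data: raw model scores f(x) and true labels.
rng = np.random.default_rng(2)
raw_scores = rng.normal(size=1000)
labels = (rng.random(1000) < 1.0 / (1.0 + np.exp(-2.0 * raw_scores + 0.5))).astype(int)

# Platt scaling is a 1-D logistic regression on the raw score: it learns
# A and B in  P(y = 1 | x) = 1 / (1 + exp(A * f(x) + B)).
calibrator = LogisticRegression()
calibrator.fit(raw_scores.reshape(-1, 1), labels)

# sklearn fits sigmoid(w*s + b) = 1 / (1 + exp(-(w*s + b))), so in Platt's
# parameterization A = -w and B = -b.
A = -calibrator.coef_[0, 0]
B = -calibrator.intercept_[0]

# Calibrated probabilities for new raw scores.
new_scores = np.array([-1.0, 0.0, 1.0]).reshape(-1, 1)
calibrated = calibrator.predict_proba(new_scores)[:, 1]
print(A, B, calibrated)
```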

Position Bias: An In-Depth Look

Overview

Position bias occurs when items displayed at the top of a list (e.g., search results, recommendation list) are more likely to be clicked on or selected than items displayed lower down, regardless of relevance. This bias results from several factors:

  • Attention Bias: Users focus more on prominently positioned items.
  • Trust Bias: Users assume top items are more trustworthy or higher quality.
  • Default Bias: Users choose top items because they are easier to access.
  • Feedback Bias: Higher-ranked items get more feedback, reinforcing their position.
  • Information Overload: Users rely on heuristics like position or popularity to make quick decisions.

Measuring Position Bias

Examine User Engagement by Position

  • Analyze click-through rate (CTR) or conversion rate across different positions.
  • Plot CTR or conversion rate for each position to identify trends.
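
A quick pandas sketch of this step, assuming a hypothetical impression log with `position` and `clicked` columns:

```python
import pandas as pd

# Hypothetical impression log: the position an item was shown at and
# whether it was clicked.
log = pd.DataFrame({
    "position": [1, 1, 2, 2, 3, 3, 3, 4, 5, 5],
    "clicked":  [1, 0, 1, 0, 0, 1, 0, 0, 0, 1],
})

# CTR per position: a steep drop-off that persists after controlling for
# relevance is a signature of position bias.
ctr_by_position = log.groupby("position")["clicked"].mean()
print(ctr_by_position)
# ctr_by_position.plot(kind="bar")  # visualize the trend
```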

Calculate Position-Based Metrics

  • Use metrics like discounted cumulative gain (DCG) and expected reciprocal rank (ERR) to evaluate position effects.
  • DCG Calculation: \(\text{DCG} = \sum_{i=1}^{N} \frac{\text{relevance}_i}{\log_2(i+1)}\)
  • ERR Calculation: \(\text{ERR} = \sum_{i=1}^{N} \frac{1}{i} \left(\prod_{j=1}^{i-1} (1-R_j)\right) R_i\) where \(R_i\) is the probability of relevance at position \(i\).
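
Both metrics are straightforward to compute directly from the formulas above; a small sketch with made-up relevance grades and probabilities:

```python
import numpy as np

def dcg(relevance):
    """DCG = sum_i relevance_i / log2(i + 1), positions indexed from 1."""
    relevance = np.asarray(relevance, dtype=float)
    positions = np.arange(1, len(relevance) + 1)
    return np.sum(relevance / np.log2(positions + 1))

def err(relevance_probs):
    """ERR = sum_i (1/i) * prod_{j<i}(1 - R_j) * R_i, with R_i in [0, 1]."""
    total, not_stopped = 0.0, 1.0
    for i, r in enumerate(relevance_probs, start=1):
        total += not_stopped * r / i
        not_stopped *= (1.0 - r)
    return total

# Graded relevance for DCG, relevance probabilities for ERR.
print(dcg([3, 2, 3, 0, 1]))
print(err([0.9, 0.6, 0.3, 0.1]))
```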

Randomize Item Positions

  • Randomly shuffle the displayed order for a subset of users and measure how engagement changes with position.
  • This is not recommended as a routine approach: it degrades the user experience for the affected traffic and can be quite costly.

Control for Item Popularity

  • Group items by popularity or quality and compare engagement metrics within each group.
  • Use propensity score matching to create balanced user groups and compare metrics.
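
A sketch of the grouping idea with synthetic data and hypothetical column names; propensity score matching would be a further refinement on top of this within-group comparison:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 10_000

# Hypothetical impression log with a per-item popularity signal.
log = pd.DataFrame({
    "item_popularity": rng.exponential(1.0, size=n),
    "position": rng.integers(1, 11, size=n),
    "clicked": rng.integers(0, 2, size=n),
})

# Group items into popularity quartiles, then compare CTR by position
# *within* each group: differences that remain inside a popularity tier
# are attributable to position rather than to item quality.
log["popularity_bucket"] = pd.qcut(log["item_popularity"], q=4, labels=False)
ctr = log.groupby(["popularity_bucket", "position"])["clicked"].mean().unstack()
print(ctr)
```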

Use Counterfactual Evaluation

  • Simulate outcomes of different strategies and compare to baseline.
  • Use A/B testing or multi-armed bandit algorithms to test strategies in a live environment.

Infer Position Bias via Expectation Maximization (EM)

  • Model each logged click as the product of an examination probability that depends only on the position and a relevance probability that depends only on the item (the examination hypothesis).
  • Use EM to alternately infer the hidden examination/relevance variables and re-estimate both sets of parameters; the estimated examination probabilities are the position bias.
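
A self-contained toy sketch of this EM procedure under the examination hypothesis (click = examined × relevant), with synthetic impressions and item-level relevance standing in for per query-document relevance:

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic log: each impression has an item id, a position (0..K-1), and
# a click drawn from P(click) = theta[position] * gamma[item].
K, n_items, n_impr = 10, 50, 100_000
true_theta = 1.0 / np.arange(1, K + 1)
true_gamma = rng.uniform(0.1, 0.8, size=n_items)

pos = rng.integers(0, K, size=n_impr)
item = rng.integers(0, n_items, size=n_impr)
click = rng.random(n_impr) < true_theta[pos] * true_gamma[item]

# EM: alternate between inferring the hidden examination/relevance
# variables (E-step) and re-estimating theta and gamma (M-step).
theta = np.full(K, 0.5)
gamma = np.full(n_items, 0.5)

for _ in range(50):
    t, g = theta[pos], gamma[item]
    # E-step: posterior that the slot was examined / the item was relevant;
    # a click implies both are 1.
    p_no_click = 1.0 - t * g
    p_exam = np.where(click, 1.0, t * (1.0 - g) / p_no_click)
    p_rel = np.where(click, 1.0, (1.0 - t) * g / p_no_click)
    # M-step: average the posteriors per position and per item.
    theta = np.bincount(pos, weights=p_exam, minlength=K) / np.bincount(pos, minlength=K)
    gamma = np.bincount(item, weights=p_rel, minlength=n_items) / np.bincount(item, minlength=n_items)

# theta and gamma are identifiable only up to a shared scale, so compare
# the bias curve relative to position 1.
print(np.round(theta / theta[0], 2))
print(np.round(true_theta / true_theta[0], 2))
```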

FairPairs and RandPair

  • FairPairs: Swap items at positions \(k\) and \(k+1\) to introduce randomness and reduce bias.
  • RandPair: Swap the first item with a randomly selected item, introducing more aggressive randomness.
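
A small sketch of the two interventions on a toy ranking; the swapped positions are logged and position bias is then estimated from the resulting engagement differences:

```python
import random

def fairpairs(ranking, rng=random):
    """Randomly swap disjoint adjacent pairs (the FairPairs intervention)."""
    ranking = list(ranking)
    start = rng.choice([0, 1])   # pair items starting at an even or odd slot
    for i in range(start, len(ranking) - 1, 2):
        if rng.random() < 0.5:
            ranking[i], ranking[i + 1] = ranking[i + 1], ranking[i]
    return ranking

def randpair(ranking, rng=random):
    """Swap the top item with one uniformly chosen item (the RandPair intervention)."""
    ranking = list(ranking)
    k = rng.randrange(1, len(ranking))
    ranking[0], ranking[k] = ranking[k], ranking[0]
    return ranking

original = ["a", "b", "c", "d", "e", "f"]
print(fairpairs(original))
print(randpair(original))
```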

Mitigation Strategies for Position Bias

  • Add Randomness: Randomize item positions to reduce bias.
  • De-bias Logged Data: Use inverse propensity scores to weight training data based on position bias.
  • Include Positional Features: Add positional features to the model to account for position impact.

Conclusion

Addressing biases in recommender systems is crucial for their fairness and effectiveness. Techniques such as weighted logistic regression, quantile-based watch-time prediction, and Platt scaling can mitigate various biases. Understanding and mitigating position bias through methods like FairPairs, RandPair, and expectation maximization further enhances recommender system performance. Continuous research and innovation are essential to keep these systems unbiased and user-centric.