Causal Effect of Query Complexity on Product Relevance

Causal inference study of title-query overlap and relevance (Amazon KDD Cup 2022)

Summary

We estimate the causal effect of title-query token overlap on exact-match relevance across three locales (US, ES, JP). Using adjustment estimators (IPW, AIPW) and machine-learning estimators (DRLearner, CausalForestDML) augmented with SBERT embeddings, the adjusted ATEs converge around 0.16–0.17, indicating a positive effect of overlap on relevance that is stable across estimators.
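To make the adjustment step concrete, here is a minimal sketch of IPW and AIPW on synthetic data. It is not the study's pipeline: the data-generating process, the single binary confounder `x`, and all function names are assumptions made for illustration, and the nuisance models are simple per-stratum averages rather than fitted classifiers. The true effect is set to 0.16 only so that the output is easy to compare against a known value.

```python
import random

def simulate(n=20000, seed=0):
    """Synthetic rows (x, t, y): binary confounder, treatment, outcome (all assumed)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        # Confounding: x raises both the chance of treatment and the outcome.
        t = 1 if rng.random() < (0.7 if x else 0.3) else 0
        p_y = 0.2 + 0.16 * t + 0.3 * x  # true ATE is 0.16 by construction
        y = 1 if rng.random() < p_y else 0
        rows.append((x, t, y))
    return rows

def mean(values):
    return sum(values) / len(values)

def fit_nuisances(rows):
    """Per-stratum estimates of e(x) = P(T=1 | x) and mu_t(x) = E[Y | T=t, x]."""
    e, mu = {}, {}
    for x in {r[0] for r in rows}:
        stratum = [r for r in rows if r[0] == x]
        e[x] = mean([t for _, t, _ in stratum])
        for arm in (0, 1):
            mu[(arm, x)] = mean([y for _, t, y in stratum if t == arm])
    return e, mu

def naive_diff(rows):
    """Unadjusted difference in outcome means between treated and control."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)

def ipw_ate(rows, e):
    """Inverse probability weighting: reweight outcomes by 1 / propensity."""
    return mean([t * y / e[x] - (1 - t) * y / (1 - e[x]) for x, t, y in rows])

def aipw_ate(rows, e, mu):
    """Augmented IPW: outcome-model prediction plus a weighted residual correction."""
    scores = []
    for x, t, y in rows:
        m1, m0 = mu[(1, x)], mu[(0, x)]
        scores.append(m1 - m0
                      + t * (y - m1) / e[x]
                      - (1 - t) * (y - m0) / (1 - e[x]))
    return mean(scores)

rows = simulate()
e, mu = fit_nuisances(rows)
naive = naive_diff(rows)
ipw = ipw_ate(rows, e)
aipw = aipw_ate(rows, e, mu)
print(f"naive={naive:.3f}  ipw={ipw:.3f}  aipw={aipw:.3f}")
```

In this toy setup the unadjusted difference is biased upward by the confounder, while both adjusted estimators land near the simulated effect. Because the nuisance models here are saturated per-stratum averages, IPW and AIPW coincide up to floating-point error; they would generally differ once the propensity and outcome models are fitted with parametric or machine-learning methods.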

My role & tools

Role: analysis, modeling, and evaluation
Tools: PyTorch, scikit-learn, EconML (DRLearner, CausalForestDML), IPW and AIPW estimators
Data: Amazon KDD Cup 2022 (query-product pairs, locales)
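
The summary does not spell out how the overlap treatment was derived from the query-product pairs. One plausible construction, sketched below, is a binary flag based on the share of query tokens found in the title. The whitespace tokenizer, the 0.5 threshold, the function names, and the example pairs are all hypothetical and not taken from the project.

```python
def tokens(text):
    """Lowercase and split on whitespace (an assumed tokenizer, not the study's)."""
    return set(text.lower().split())

def overlap_fraction(query, title):
    """Share of distinct query tokens that also appear in the title."""
    q = tokens(query)
    if not q:
        return 0.0
    return len(q & tokens(title)) / len(q)

def treatment(query, title, threshold=0.5):
    """Binary treatment: 1 if at least `threshold` of the query tokens are in the title."""
    return int(overlap_fraction(query, title) >= threshold)

# Hypothetical query-title pairs, for illustration only.
pairs = [
    ("wireless mouse", "Logitech Wireless Mouse M185"),
    ("usb c cable", "Leather Wallet for Men"),
]
flags = [treatment(q, t) for q, t in pairs]
print(flags)
```

Whitespace splitting is only reasonable for US and ES text; Japanese has no spaces between words, so the JP locale would need a proper word segmenter before any token overlap can be computed.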

Outcome

Adjusted estimates show a consistent 15–17 percentage point increase in exact-match probability when the title shares tokens with the query; adding SBERT embeddings improves precision modestly. Next steps: held-out validation, sensitivity analyses, and production A/B testing.