GROUPON, INC. PRIVILEGED AND CONFIDENTIAL – DO NOT DISTRIBUTE
Deep Learning @ Groupon
Applications within Relevance and Ranking
AI Summit San Francisco, 19-20th September 2018
The Team
Bojan Babic is a Senior Engineer at Groupon working on core
search and relevance in both personalized deals search and
deal recommendations
@bojanbabic
Joaquin A. Delgado, PhD. is currently serving as Director of
Machine Learning at Groupon, working on search and
recommender systems for local e-commerce. Previously, he
was Director at Verizon and CTO of Lending Club and
AdBrite. He also worked at Yahoo! and Oracle
@joaquind
Who are we?
Groupon - Dynamic Marketplace
Building Daily Habit
Four Distinct Customer Journeys
Examples of use of DL @ Groupon
● Search
● Browse
● Home Feed
● Similar Deals to Consider
Search
Using Query Understanding to improve recall for long-tail queries
Query Similarity
● TF-IDF - bag-of-words approach
○ Sparse representation
○ Never considers queries similar unless they share terms
○ Queries that share terms but differ in meaning still become candidates
■ “nail clippers” vs “la clippers”
● Random walk in a bipartite graph of queries and categories
○ No guarantee that similar queries have the same search results
● Doc2Vec - get the k closest queries in the embedding space (PV-DM)
○ Improved recall for tail queries
○ Better overall precision
○ Examples:
■ “sony playstation” -> “playstation 4”, “ps4”, “psp”
[Figure: bipartite graph connecting queries (Query_1 ... Query_n) to categories (Cat_1 ... Cat_m)]
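The two TF-IDF failure modes listed on this slide can be reproduced with a few lines of code. The following is a minimal sketch in plain Python, not Groupon's implementation: the smoothed IDF formula, the whitespace tokenizer and the function names are assumptions made for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(queries):
    """Return one sparse TF-IDF vector (dict term -> weight) per query."""
    tokenized = [q.lower().split() for q in queries]
    n = len(tokenized)
    # Document frequency: in how many queries each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption) keeps every weight strictly positive.
        vectors.append({t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1) for t in tf})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

queries = ["nail clippers", "la clippers", "sony playstation", "ps4"]
nail, la, sony, ps4 = tfidf_vectors(queries)

# Unrelated queries that share the term "clippers" get a positive score.
shared_term_score = cosine(nail, la)
# Related queries with no term in common get a score of exactly zero.
no_shared_term_score = cosine(sony, ps4)
```

Dense embeddings such as Doc2Vec avoid both problems because similarity no longer depends on literal term overlap.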
Model Validation
Browse
Leveraging deal classification to guide the customer through Groupon’s vast catalog
Complex deal taxonomy
Learned Taxonomy
Hyperparameters
● batch size: 64
● epochs: 30
● sequence length: 200 words
● dropout: 0.2
What we tried:
● K-Means on the dense vector representation of the deal description
What worked:
● CNN
● LSTM
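A 1-D convolution over text slides a filter across consecutive word embeddings, so each filter acts roughly like an n-gram detector. The sketch below illustrates the idea in plain Python; the two-dimensional embeddings and the filter weights are hand-set, hypothetical values (in a trained CNN both would be learned), and taking the maximum response stands in for max-over-time pooling.

```python
def conv1d_max(embedded, kernel):
    """Slide `kernel` (width x dim) over `embedded` (length x dim).

    Returns the strongest filter response and the position where it occurs.
    """
    width = len(kernel)
    best_score, best_pos = float("-inf"), -1
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        # Element-wise product of the filter and the window, summed up.
        score = sum(
            w * x
            for k_row, x_row in zip(kernel, window)
            for w, x in zip(k_row, x_row)
        )
        if score > best_score:
            best_score, best_pos = score, start
    return best_score, best_pos

# Hypothetical 2-dimensional word embeddings, hand-set for illustration only.
EMB = {
    "deep": [1.0, 0.0],
    "tissue": [0.0, 1.0],
    "massage": [1.0, 1.0],
    "for": [0.0, 0.0],
    "two": [0.0, 0.0],
}

tokens = "massage for two deep tissue".split()
embedded = [EMB[t] for t in tokens]

# A width-2 filter built to respond most strongly to the bigram "deep tissue".
bigram_filter = [[1.0, -1.0], [-1.0, 1.0]]
score, position = conv1d_max(embedded, bigram_filter)
detected_bigram = tokens[position:position + 2]
```

Because the filter only ever sees a window of two tokens, it captures local patterns; an LSTM, by contrast, carries state across the whole sequence.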
Home Feed
Understanding how images influence purchases in the recommended feed
Image Propensity to Purchase
● Questions being asked:
○ Are certain deal images more attractive to
customers than others?
○ Do images influence purchases?
● We use a Convolutional Neural Network (CNN) to train a
model to predict an Image Propensity to Purchase (IPP)
● The target class is a binary purchase/no-purchase label
● We later use the precomputed IPP as a feature in our
proprietary learning-to-rank algorithm in the Home Feed
and other places
It is well known that a picture is worth a thousand words and, at Groupon, images play a fundamental role in the marketing of deals.
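The data flow described on this slide can be sketched as two small steps: turning deal views into labeled training pairs, and later attaching the precomputed IPP score to a deal's ranking features. This is only an illustration under stated assumptions: the `DealView` record, the feature names and the example scores are hypothetical, and neither the CNN nor the proprietary ranker is modeled here.

```python
from collections import namedtuple

# Hypothetical log record: one row per deal view.
DealView = namedtuple("DealView", ["image_id", "purchased"])

def build_training_pairs(views):
    """Turn deal views into <image, target> pairs with a binary purchase label."""
    return [(v.image_id, 1 if v.purchased else 0) for v in views]

def add_ipp_feature(features, image_id, ipp_by_image, default=0.0):
    """Return a copy of a deal's ranking features extended with its precomputed IPP."""
    enriched = dict(features)
    enriched["ipp"] = ipp_by_image.get(image_id, default)
    return enriched

views = [DealView("img_a", True), DealView("img_a", False), DealView("img_b", False)]
pairs = build_training_pairs(views)

# Hypothetical scores standing in for the output of a CNN trained offline.
ipp_by_image = {"img_a": 0.62, "img_b": 0.08}
ranking_features = add_ipp_feature({"price": 25.0}, "img_a", ipp_by_image)
```

Precomputing the score keeps image inference out of the request path, so the ranker only performs a lookup at serving time.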
Similar Deals to Consider
Recommending Similar Deals by Leveraging User Session Information
Recommending Similar Deals
● Content-based similarity
○ Never considers deals similar unless they share terms
○ Deals that share terms but differ in meaning still become
candidates
● Collaborative Filtering
○ Sparse representation, requires Matrix Factorization
○ Only captures similarity based on what other users
purchased, without considering context.
● Deal2Vec - get the k-nearest-neighbor deals in the embedding space
○ Build deal embeddings using customer sessions that resulted
in purchases
○ This considers context: customers’ journeys
○ Beyond co-purchases, deal2vec has proven to be an
important source of candidates for deal similarity
○ Can be used in several customer touch-points
■ When browsing: Similar deals to consider
■ Post-purchase: Customers who bought X have also
bought Y
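The slides do not spell out how Deal2Vec is trained. One common approach, assumed here, is to treat each purchase session as a sentence and each deal as a word, then generate word2vec-style skip-gram pairs. The sketch below shows only that pair-generation step; the deal IDs, the `window` parameter and the function name are hypothetical.

```python
def skipgram_pairs(session, window=2):
    """Return (target_deal, context_deal) pairs at most `window` positions apart."""
    pairs = []
    for i, target in enumerate(session):
        lo = max(0, i - window)
        hi = min(len(session), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, session[j]))
    return pairs

# Hypothetical sessions, each an ordered list of deal IDs that ended in a purchase.
sessions = [
    ["spa_day", "massage", "facial"],
    ["pizza", "burger"],
]

training_pairs = [p for s in sessions for p in skipgram_pairs(s, window=1)]
```

Deals that appear near each other in many sessions end up with nearby embeddings, which is how the customer's journey, and not only the final co-purchase, shapes similarity.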
Conclusions
● All-in on replacing traditional feature engineering with the respective embedding representations
● Expanding Deep Learning reach within Groupon to other areas (e.g. mobile: credit card detection)
● More work on automating feature discovery and model parameter tuning
References
1. Comparative Study of CNN and RNN for Natural Language Processing, Wenpeng Yin, Katharina
Kann, Mo Yu and Hinrich Schütze, IBM 2017
2. Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising, Yahoo! 2016
3. The Evolution of a Real-World Recommender System, Pinterest 2016
4. Deep Neural Networks for YouTube Recommendations, ACM 2016
5. Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix Factorization Techniques for
Recommender Systems. Computer 42, 8 (August 2009), 30-37.
6. Yann LeCun, Patrick Haffner, Léon Bottou, and Yoshua Bengio. 1999. Object Recognition with
Gradient-Based Learning. In Shape, Contour and Grouping in Computer Vision, David A.
Forsyth, Joseph L. Mundy, Vito Di Gesù, and Roberto Cipolla (Eds.). Springer-Verlag, London,
UK, 319-.
We need your help!
relevance-jobs@groupon.com
Q & A

Editor's Notes

  1. A collection of techniques that we have applied in a large-scale, sophisticated recommendation system. We showcase what we tried, the lessons learned and how our thinking evolved.
  2. We serve inventory from over 1 million merchants to 50 million customers. The business is organized in verticals: local, goods, tickets and getaways, to name a few. Groupon has pumped $20 billion into local businesses.
  3. Our group works on building a daily habit.
  4. Search uses Query Understanding. Browse uses Deal Classification. Home Feed uses Image Processing. Post-Purchase uses Deal Similarity.
  5. Vanilla examples do not work in production: there is a steep learning curve from basic examples to real production applications in ranking. Convolutions work well on 1-D vectors (text) as well as on 2-D vector representations (images). Instead of edges, curves and diagonals, 1-D convolutions detect n-grams. In fact, conv nets can be used on any data with spatial patterns. Feature engineering can only get you so far. We covered first-level features such as age, gender and location, and calculated features such as propensity to a category or to travel, but there are latent features that we simply do not have access to. That is where embeddings kick in. Still, getting from word2vec to applying sequence models in recommendation systems is not trivial. This talk aims to bridge that gap by showcasing a set of applications that cover the four main areas the recommender system at Groupon cares about: search, browse, home feed and post-purchase.
  6. Make sure that the model's performance does not deteriorate with subsequent model releases. We need robust validation. Intrinsic validation requires a curated list of analogies.
  7. 1-D convolutions on text vs. 2-D convolutions on images.
  8. Taxonomies are very important for recall. Our taxonomies have 1500 nodes. Having multiple taxonomies adds complexity: new partnerships require incorporating new orthogonal taxonomies (e.g. food delivery restaurants, place has wifi). Bad taxonomy mapping can have a big impact on the business, so we need safety measures to suppress human errors. Human errors: a missed opportunity and/or a misclassified deal.
  9. The corpus consisted of the title, short description, highlights and fine print. Why not CNNs? Model accuracy varies with the length of the input sequence: CNNs focus on small regions in order to extract important classification features, while LSTMs consider the whole input sequence. The embedding is calculated by taking the output of the LSTM cell at the last time step, multiplying it by another weight matrix, normalizing it and passing it through a softmax classifier with 1500 classes. We tried a whole range of hyperparameters, but settled on batch size 64, sequence length 200 and dropout 0.2. At scoring time, we use the target (category) embedding to compute the cosine similarity with the vector representation of the input.
  10. We use a Convolutional Neural Network (CNN) to train a model using deal views <image, target> to predict an image’s propensity to purchase (IPP). The target class is a binary purchase/no-purchase label based on the customer’s decision made after viewing the deal. We use the precomputed IPP as a feature in our proprietary learning-to-rank algorithm used for ranking deals in the Home Feed.
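
Note 9 describes scoring a deal by cosine similarity between its vector representation and the category embeddings. A minimal sketch of that scoring step follows, in plain Python. The three-dimensional vectors and category names are hypothetical; real embeddings would come from the trained LSTM model and span roughly 1500 categories.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def best_category(deal_vec, category_embeddings):
    """Return (category, cosine similarity) for the closest category embedding."""
    deal_unit = l2_normalize(deal_vec)
    scored = {
        name: sum(a * b for a, b in zip(deal_unit, l2_normalize(emb)))
        for name, emb in category_embeddings.items()
    }
    name = max(scored, key=scored.get)
    return name, scored[name]

# Hypothetical embeddings, hand-set for illustration only.
category_embeddings = {
    "massage": [0.9, 0.1, 0.0],
    "restaurants": [0.0, 0.2, 0.9],
}
deal_vector = [0.8, 0.3, 0.1]
category, similarity = best_category(deal_vector, category_embeddings)
```

A low best-match similarity could also serve as a signal that a human-assigned category deserves review, which fits the safety measures mentioned in note 8.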