Confidential Material / © 2020 Chegg, Inc. / All Rights Reserved
Developing Recommendation System to provide a Personalized Learning experience at Chegg
Sanghamitra Deb
Staff Data Scientist
Chegg Inc
Outline
• Recommendations at Chegg
• Organizing Content: Knowledge Graph
• Deep Dive: Content Classification
• Cross Product Recommendations
• Takeaways
Recommendations at Chegg
The goal of recommendations at Chegg is to provide the best possible
learning experience to students. This is fueled by high-quality
content.
Recommender systems provide the backbone for surfacing the most
relevant content to a student. Organizing content into a knowledge
graph and detecting patterns in student behavior help us
personalize the student experience.
Recommendations at Chegg
Chegg Study Home Page
Multiple services: textbook rentals, question answering, online tutoring, flashcards, writing, math solver, etc.
Knowledge Graph
Levels: Subject > Course > Concept > Sub-concepts
Example nodes: Physics (subject); Electricity and Magnetism, Mechanics, Quantum Physics (courses); Velocity
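The hierarchy on this slide can be sketched as a small tree. This is only an illustration: the Node class and the paths helper are hypothetical names, and placing Velocity as a concept under Mechanics is an assumption, since the slide does not say which course it belongs to.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One node in the hierarchy: a subject, course, concept or sub-concept."""
    name: str
    level: str
    children: list[Node] = field(default_factory=list)

    def add(self, name: str, level: str) -> Node:
        """Attach a child node and return it so calls can be chained."""
        child = Node(name, level)
        self.children.append(child)
        return child


def paths(node: Node, prefix: tuple = ()) -> list:
    """Return every root-to-leaf path as a tuple of node names."""
    here = prefix + (node.name,)
    if not node.children:
        return [here]
    result = []
    for child in node.children:
        result.extend(paths(child, here))
    return result


physics = Node("Physics", "subject")
physics.add("Electricity and Magnetism", "course")
mechanics = physics.add("Mechanics", "course")
physics.add("Quantum Physics", "course")
# Assumption: the slide does not state which course Velocity sits under.
mechanics.add("Velocity", "concept")

for path in paths(physics):
    print(" > ".join(path))
```

Each printed path is one route from the subject down to the most specific node available, which is the kind of label a piece of content or a user could be attached to.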
Connecting content to the Knowledge Graph
Levels: Subject > Course > Concept > Sub-concepts
Machine Learning Classifiers connect content to the graph. Example content:
• Question: "A rightward-moving bicycle increases its speed from 2.0 m/s to 12.0 m/s. Is the bicycle accelerating?"
• Mitosis: "a type of cell division that results in two daughter cells each having the same number and kind of chromosomes as the parent nucleus, typical of ordinary tissue growth."
• Writing tools: "Get your physics paper checked by an expert"
Connecting users to the Knowledge Graph
Levels: Subject > Course > Concept > Sub-concepts
Machine Learning Classifiers
• "A rightward-moving bicycle increases its speed from 2.0 m/s to 12.0 m/s. Is the bicycle accelerating?": Physics 101, Acceleration
• Writing tools: "Do you need help writing a physics paper?"
Edges are created between users and
• Biology: Mitosis
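One way to picture user edges is sketched below. The slide text is cut off before it says what the edges connect to, so this assumes that a user is linked to the graph nodes of the content they engaged with. The ids, the engagement log and the build_user_edges helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical output of the content classifiers: content id -> KG nodes.
content_to_nodes = {
    "question:bicycle": ["Physics 101", "Acceleration"],
    "flashcard:mitosis": ["Biology", "Mitosis"],
}

# Hypothetical engagement log: (user id, content id) pairs.
engagements = [
    ("user_1", "question:bicycle"),
    ("user_1", "flashcard:mitosis"),
    ("user_2", "question:bicycle"),
    ("user_1", "question:bicycle"),
]


def build_user_edges(engagements, content_to_nodes):
    """Count how often each user touched content attached to each KG node."""
    edges = defaultdict(lambda: defaultdict(int))
    for user, content in engagements:
        # Content the classifiers could not place contributes no edges.
        for node in content_to_nodes.get(content, []):
            edges[user][node] += 1
    return {user: dict(nodes) for user, nodes in edges.items()}


user_edges = build_user_edges(engagements, content_to_nodes)
print(user_edges["user_1"])
```

The counts act as simple edge weights: a node a user keeps coming back to ends up with a heavier edge than one they touched once.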
Content Classification Pipeline
Text Pre-processing
• Reduces noise
• Ensures quality
• Improves overall performance
Collecting Training Data
• Training data collection: examples of the classes that we are trying to model
• Model performance is directly correlated with the quality of the training data
Model Building
• Model selection
• Architecture
• Parameter tuning
Model Evaluation
• Offline: SME
• Online: Student
Classification Problem
Assigning decks to Courses
• Decks are lists of cards grouped together by students
for studying.
• There are several thousand courses. A course is typically
more granular than a subject but less granular than a
concept.
Modeling Approaches
• TF-IDF features with an SVM classifier
Pros:
  • Gives decent performance on small training data.
  • Straightforward training pipeline.
Cons:
  • Does not do well for subjects dominated by symbols.
  • Including both word-based and character-based features makes the token space and the model extremely large.
• Character-based CNN
  • Can handle out-of-vocabulary words, which makes it particularly suitable for raw user-generated text.
  • Works for multiple languages.
  • Model size is small because the tokens are limited to the number of characters (~70). This makes real-life deployments easier and faster.
  • Networks with convolutional and pooling layers are useful for classification tasks in which we expect to find strong local clues regarding class membership.
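The point about a vocabulary of roughly 70 characters can be made concrete with a small encoder. This is a minimal sketch, not the encoder used in the talk: the exact alphabet (lowercase letters, digits, ASCII punctuation, space and newline), the padding and unknown ids, and the encode helper are all assumptions.

```python
import string

# Assumed alphabet: 26 letters + 10 digits + 32 punctuation marks + space + newline.
ALPHABET = string.ascii_lowercase + string.digits + string.punctuation + " \n"
PAD, UNK = 0, 1  # ids reserved for padding and for characters outside the alphabet
CHAR_TO_ID = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def encode(text: str, max_len: int = 32) -> list:
    """Map text to a fixed-length list of character ids (truncate, then pad)."""
    ids = [CHAR_TO_ID.get(ch, UNK) for ch in text.lower()[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


print(len(ALPHABET))                  # size of the character vocabulary
print(encode("Mitosis", max_len=10))
print(encode("zygoblast", max_len=10))  # a made-up word still encodes without UNK
```

Because every word is spelled from the same small set of characters, a word never seen in training still maps to known ids, and the embedding table stays tiny compared with a word-level vocabulary.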
CNN Model Architecture
[Figure: CNN architecture diagram. Labels: Feature Length, Convolutions, Convolution & pool layer (2 layers of convolution & pooling), GlobalMaxPool1D, DenseLayer, Dropout, PReLU, Norm]
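The role of the convolution and global max pooling steps can be illustrated on a toy signal. This is a pure-Python sketch under simplifying assumptions: a single scalar channel and one hand-picked kernel instead of learned filters over character embeddings, and a plain ReLU where the diagram lists PReLU. The function names are hypothetical.

```python
def conv1d(sequence, kernel):
    """Slide `kernel` over `sequence` (no padding, stride 1)."""
    width = len(kernel)
    return [
        sum(kernel[j] * sequence[i + j] for j in range(width))
        for i in range(len(sequence) - width + 1)
    ]


def relu(values):
    """Zero out negative activations (stand-in for the PReLU in the diagram)."""
    return [max(0.0, v) for v in values]


def global_max_pool(values):
    """Keep only the strongest activation, wherever it occurred."""
    return max(values)


# The same local pattern (1, 2, 1) placed at two different offsets.
kernel = [1.0, 2.0, 1.0]
early = [1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
late = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0]

for signal in (early, late):
    print(global_max_pool(relu(conv1d(signal, kernel))))
```

Both signals produce the same pooled feature, which is the sense in which convolution plus pooling picks up a strong local clue regardless of where it appears in the text.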
Multi-task Modeling
Two tasks
• Similarity between card front and back: Card Front and Card Back each go through a CNN Model, followed by a Similarity Function, Cross Entropy Loss, and Output.
• Classification of courses: the Card goes through a CNN Model, followed by a Softmax over the # of courses, Cross Entropy Loss, and Output.
Adding another task improves the accuracy by a few percent.
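A combined objective for the two tasks might look like the sketch below. The slide names a similarity function and two cross entropy losses but does not say which similarity function is used or how the losses are combined, so the cosine similarity, the sigmoid, the weighted sum and the weight parameter are all assumptions made for illustration.

```python
import math


def softmax_cross_entropy(logits, target):
    """Cross entropy of a softmax over `logits` against class index `target`."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[target]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def similarity_cross_entropy(front_vec, back_vec, is_pair):
    """Binary cross entropy on a sigmoid of the front/back cosine similarity."""
    prob = 1.0 / (1.0 + math.exp(-cosine(front_vec, back_vec)))
    return -math.log(prob if is_pair else 1.0 - prob)


def multitask_loss(course_logits, course, front_vec, back_vec, is_pair, weight=0.5):
    """Course loss plus a weighted similarity loss; `weight` is an assumed knob."""
    course_loss = softmax_cross_entropy(course_logits, course)
    pair_loss = similarity_cross_entropy(front_vec, back_vec, is_pair)
    return course_loss + weight * pair_loss


# Toy example: 3 courses, 2-dimensional encodings of a card's front and back.
print(multitask_loss([2.0, 0.5, -1.0], 0, [1.0, 0.0], [0.8, 0.6], True))
```

In a real model the logits and the two vectors would come from the CNN encoders, and the gradient of this single scalar would update both tasks at once.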
Model Performance
Top-3 accuracy: 73% on offline test data.
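For reference, top-3 accuracy counts a prediction as correct when the true course is among the three highest-scoring courses. The sketch below shows the computation on made-up scores; the data and the top_k_accuracy helper are illustrative and unrelated to the 73% figure above.

```python
def top_k_accuracy(scores, labels, k=3):
    """Fraction of examples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


# Made-up classifier scores for 4 decks over 5 courses.
scores = [
    [0.10, 0.50, 0.20, 0.15, 0.05],
    [0.40, 0.30, 0.10, 0.15, 0.05],
    [0.05, 0.10, 0.15, 0.30, 0.40],
    [0.25, 0.20, 0.30, 0.15, 0.10],
]
labels = [2, 4, 3, 0]

print(top_k_accuracy(scores, labels, k=3))  # 0.75
print(top_k_accuracy(scores, labels, k=1))  # 0.0
```

The gap between the two numbers shows why top-3 is a forgiving metric when there are thousands of closely related courses.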
Challenges
• Imbalanced training data
• Some classes have too few training examples
Solutions
• Collect more training data.
• Use rule-based techniques to augment training data.
Cross Product Recommendations!
Cold Start Problem: Users often use one product, such as Chegg Study, and may
only browse other products, such as Chegg Practice or Flash Cards.
Solutions (personalized):
• Content Filtering: Use the KG to determine the courses, concepts and sub-concepts
that users are currently studying, and recommend trending content in those
categories.
• Text Similarity: Recommend content similar to what users have engaged with, using
in-house language models optimized for Chegg content.
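The content filtering idea can be sketched as follows: take the concepts a user is studying in one product and rank items from another product that carry the same concepts by recent popularity. The catalog, the view log, the product names used as keys and the recommend helper are hypothetical, and using raw view counts as the trending signal is an assumption.

```python
from collections import Counter

# Hypothetical catalog: content id -> (product, KG concept).
catalog = {
    "fc_kinematics": ("Flash Cards", "Acceleration"),
    "fc_newton": ("Flash Cards", "Acceleration"),
    "fc_mitosis": ("Flash Cards", "Mitosis"),
    "pr_kinematics": ("Chegg Practice", "Acceleration"),
}

# Hypothetical recent views across all users, used as the trending signal.
recent_views = [
    "fc_newton", "fc_newton", "fc_kinematics",
    "fc_mitosis", "fc_mitosis", "fc_mitosis", "pr_kinematics",
]


def recommend(user_concepts, product, catalog, recent_views, top_n=2):
    """Rank items of `product` tagged with the user's concepts by view count."""
    views = Counter(recent_views)
    candidates = [
        item for item, (item_product, concept) in catalog.items()
        if item_product == product and concept in user_concepts
    ]
    # Most viewed first; ties broken alphabetically for a stable order.
    return sorted(candidates, key=lambda item: (-views[item], item))[:top_n]


# A user who studies Acceleration in one product gets flashcard suggestions.
print(recommend({"Acceleration"}, "Flash Cards", catalog, recent_views))
```

The user needs no history in the target product: the concepts they study elsewhere are enough to pick candidates, which is how the KG eases the cold start.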
Takeaways
• Content drives recommendations
  • High quality
  • Relevant
• Organizing the content into a Knowledge Graph (KG) facilitates content-based
recommendations.
• Accuracy of classifiers is important: models are constantly iterated on, even
for a gain of a few percent.
• The KG helps connect students with courses and concepts, which helps with
personalized recommendations.
• Cross-product recommendations are possible through the KG.
• Cold-start problems are made easier.
Questions

More Related Content

What's hot

The Most Pressing Amazon Operations Challenges — and How to Address Them
The Most Pressing Amazon Operations Challenges — and How to Address ThemThe Most Pressing Amazon Operations Challenges — and How to Address Them
The Most Pressing Amazon Operations Challenges — and How to Address Them
Tinuiti
 
Google Display Network Tutorial | Google Display Ads | Google Ads | Digital M...
Google Display Network Tutorial | Google Display Ads | Google Ads | Digital M...Google Display Network Tutorial | Google Display Ads | Google Ads | Digital M...
Google Display Network Tutorial | Google Display Ads | Google Ads | Digital M...
Simplilearn
 
Netflix
Netflix Netflix
Google shopping campaigns presentation
Google shopping campaigns presentationGoogle shopping campaigns presentation
Google shopping campaigns presentation
Bogdan Ch
 
Best Buy – Showrooming
Best Buy – ShowroomingBest Buy – Showrooming
Best Buy – Showrooming
George Giannoulis
 
Personalization at Netflix - Making Stories Travel
Personalization at Netflix -  Making Stories Travel Personalization at Netflix -  Making Stories Travel
Personalization at Netflix - Making Stories Travel
Sudeep Das, Ph.D.
 
Netflix - Strategy management
Netflix - Strategy managementNetflix - Strategy management
Netflix - Strategy management
Mario Clement
 
Netflix Business Model - Nine Elements
Netflix Business Model - Nine ElementsNetflix Business Model - Nine Elements
Netflix Business Model - Nine Elements
Giovanna Correa
 
How to Engage New-to-Brand Amazon Customers in 2023
How to Engage New-to-Brand Amazon Customers in 2023How to Engage New-to-Brand Amazon Customers in 2023
How to Engage New-to-Brand Amazon Customers in 2023
Tinuiti
 
Rosewood Hotels and Resorts: Branding to increase Customer Profitability and ...
Rosewood Hotels and Resorts: Branding to increase Customer Profitability and ...Rosewood Hotels and Resorts: Branding to increase Customer Profitability and ...
Rosewood Hotels and Resorts: Branding to increase Customer Profitability and ...
Pallabh Bhura
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender Systems
Benjamin Le
 
Generative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging ApplicationsGenerative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging Applications
Kyuhwan Jung
 
Procter & Gamble Supply Chain Finance Case
Procter & Gamble Supply Chain Finance CaseProcter & Gamble Supply Chain Finance Case
Procter & Gamble Supply Chain Finance Case
YASSER ELSEDAWY
 
SEO exam - 50 Questions with answers
SEO exam - 50 Questions with answersSEO exam - 50 Questions with answers
SEO exam - 50 Questions with answers
Silvia Alongi
 
DIEVO Google SA360 Admixer
DIEVO Google SA360 AdmixerDIEVO Google SA360 Admixer
DIEVO Google SA360 Admixer
DIEVO
 
Netflix – A Game Changer in Internet streaming media
Netflix – A Game Changer in Internet streaming mediaNetflix – A Game Changer in Internet streaming media
Netflix – A Game Changer in Internet streaming media
Ashish Arora
 
Optimizely Product Vision: The Future of Experimentation
Optimizely Product Vision: The Future of ExperimentationOptimizely Product Vision: The Future of Experimentation
Optimizely Product Vision: The Future of Experimentation
Optimizely
 
Global supply chain case study team8_submit v2
Global supply chain case study team8_submit v2Global supply chain case study team8_submit v2
Global supply chain case study team8_submit v2
Meghan Histand
 
Data council SF 2020 Building a Personalized Messaging System at Netflix
Data council SF 2020 Building a Personalized Messaging System at NetflixData council SF 2020 Building a Personalized Messaging System at Netflix
Data council SF 2020 Building a Personalized Messaging System at Netflix
Grace T. Huang
 
Netflix
NetflixNetflix
Netflix
Ritika Beria
 

What's hot (20)

The Most Pressing Amazon Operations Challenges — and How to Address Them
The Most Pressing Amazon Operations Challenges — and How to Address ThemThe Most Pressing Amazon Operations Challenges — and How to Address Them
The Most Pressing Amazon Operations Challenges — and How to Address Them
 
Google Display Network Tutorial | Google Display Ads | Google Ads | Digital M...
Google Display Network Tutorial | Google Display Ads | Google Ads | Digital M...Google Display Network Tutorial | Google Display Ads | Google Ads | Digital M...
Google Display Network Tutorial | Google Display Ads | Google Ads | Digital M...
 
Netflix
Netflix Netflix
Netflix
 
Google shopping campaigns presentation
Google shopping campaigns presentationGoogle shopping campaigns presentation
Google shopping campaigns presentation
 
Best Buy – Showrooming
Best Buy – ShowroomingBest Buy – Showrooming
Best Buy – Showrooming
 
Personalization at Netflix - Making Stories Travel
Personalization at Netflix -  Making Stories Travel Personalization at Netflix -  Making Stories Travel
Personalization at Netflix - Making Stories Travel
 
Netflix - Strategy management
Netflix - Strategy managementNetflix - Strategy management
Netflix - Strategy management
 
Netflix Business Model - Nine Elements
Netflix Business Model - Nine ElementsNetflix Business Model - Nine Elements
Netflix Business Model - Nine Elements
 
How to Engage New-to-Brand Amazon Customers in 2023
How to Engage New-to-Brand Amazon Customers in 2023How to Engage New-to-Brand Amazon Customers in 2023
How to Engage New-to-Brand Amazon Customers in 2023
 
Rosewood Hotels and Resorts: Branding to increase Customer Profitability and ...
Rosewood Hotels and Resorts: Branding to increase Customer Profitability and ...Rosewood Hotels and Resorts: Branding to increase Customer Profitability and ...
Rosewood Hotels and Resorts: Branding to increase Customer Profitability and ...
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender Systems
 
Generative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging ApplicationsGenerative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging Applications
 
Procter & Gamble Supply Chain Finance Case
Procter & Gamble Supply Chain Finance CaseProcter & Gamble Supply Chain Finance Case
Procter & Gamble Supply Chain Finance Case
 
SEO exam - 50 Questions with answers
SEO exam - 50 Questions with answersSEO exam - 50 Questions with answers
SEO exam - 50 Questions with answers
 
DIEVO Google SA360 Admixer
DIEVO Google SA360 AdmixerDIEVO Google SA360 Admixer
DIEVO Google SA360 Admixer
 
Netflix – A Game Changer in Internet streaming media
Netflix – A Game Changer in Internet streaming mediaNetflix – A Game Changer in Internet streaming media
Netflix – A Game Changer in Internet streaming media
 
Optimizely Product Vision: The Future of Experimentation
Optimizely Product Vision: The Future of ExperimentationOptimizely Product Vision: The Future of Experimentation
Optimizely Product Vision: The Future of Experimentation
 
Global supply chain case study team8_submit v2
Global supply chain case study team8_submit v2Global supply chain case study team8_submit v2
Global supply chain case study team8_submit v2
 
Data council SF 2020 Building a Personalized Messaging System at Netflix
Data council SF 2020 Building a Personalized Messaging System at NetflixData council SF 2020 Building a Personalized Messaging System at Netflix
Data council SF 2020 Building a Personalized Messaging System at Netflix
 
Netflix
NetflixNetflix
Netflix
 

Similar to Developing Recommendation System to provide a Personalized Learning experience at Chegg

Using weak supervision and transfer learning techniques to build knowledge gr...
Using weak supervision and transfer learning techniques to build knowledge gr...Using weak supervision and transfer learning techniques to build knowledge gr...
Using weak supervision and transfer learning techniques to build knowledge gr...
Paris Women in Machine Learning and Data Science
 
cache teaching analogy dataa naylatics Download PDF(Updated Curriculum in Bo...
cache teaching  analogy dataa naylatics Download PDF(Updated Curriculum in Bo...cache teaching  analogy dataa naylatics Download PDF(Updated Curriculum in Bo...
cache teaching analogy dataa naylatics Download PDF(Updated Curriculum in Bo...
Mayurkumarpatil1
 
3Edge Corporate Presentation
3Edge Corporate Presentation3Edge Corporate Presentation
3Edge Corporate Presentation
3Edge
 
Discovering the New SuccessFactors LMS Admin Features
Discovering the New SuccessFactors LMS Admin FeaturesDiscovering the New SuccessFactors LMS Admin Features
Discovering the New SuccessFactors LMS Admin Features
Ashton Plusquellec
 
Supervised learning
Supervised learningSupervised learning
Supervised learning
ankit_ppt
 
Data Science and Practical Application Course
Data Science and Practical Application CourseData Science and Practical Application Course
Data Science and Practical Application Course
Object Automation
 
OpenEd 2013: Designing Open Badges and an Open Course to Enhance and Extend...
OpenEd  2013: Designing Open Badges and an Open Course  to Enhance and Extend...OpenEd  2013: Designing Open Badges and an Open Course  to Enhance and Extend...
OpenEd 2013: Designing Open Badges and an Open Course to Enhance and Extend...
Dan Randall
 
Adhyyan presentation.pptx
Adhyyan presentation.pptxAdhyyan presentation.pptx
Adhyyan presentation.pptx
RashmiM58
 
"Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f...
"Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f..."Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f...
"Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f...
Edge AI and Vision Alliance
 
Applied Machine Learning Course - Jodie Zhu (WeCloudData)
Applied Machine Learning Course - Jodie Zhu (WeCloudData)Applied Machine Learning Course - Jodie Zhu (WeCloudData)
Applied Machine Learning Course - Jodie Zhu (WeCloudData)
WeCloudData
 
ML_Internship Presentation_Infidata_2021.pptx
ML_Internship Presentation_Infidata_2021.pptxML_Internship Presentation_Infidata_2021.pptx
ML_Internship Presentation_Infidata_2021.pptx
AltafSMT
 
Introduction to EMA highlights
Introduction to EMA highlightsIntroduction to EMA highlights
Introduction to EMA highlights
Nick Bunyan
 
Nagacv
NagacvNagacv
Monika_Bansal_resume
Monika_Bansal_resumeMonika_Bansal_resume
Monika_Bansal_resume
Monika Bansal
 
Bridging the Divide: High Technology in Low-resource Settings -- an update (S...
Bridging the Divide: High Technology in Low-resource Settings -- an update (S...Bridging the Divide: High Technology in Low-resource Settings -- an update (S...
Bridging the Divide: High Technology in Low-resource Settings -- an update (S...
James BonTempo
 
NLP and Machine Learning for non-experts
NLP and Machine Learning for non-expertsNLP and Machine Learning for non-experts
NLP and Machine Learning for non-experts
Sanghamitra Deb
 
Bringing Blackboard to Bath
Bringing Blackboard to BathBringing Blackboard to Bath
Bringing Blackboard to Bath
kateboardman
 
altafppt.pptx
altafppt.pptxaltafppt.pptx
altafppt.pptx
AltafSMT
 
Improving the student experience using digital insights
Improving the student experience using digital insightsImproving the student experience using digital insights
Improving the student experience using digital insights
Jisc
 
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Lucidworks
 

Similar to Developing Recommendation System to provide a Personalized Learning experience at Chegg (20)

Using weak supervision and transfer learning techniques to build knowledge gr...
Using weak supervision and transfer learning techniques to build knowledge gr...Using weak supervision and transfer learning techniques to build knowledge gr...
Using weak supervision and transfer learning techniques to build knowledge gr...
 
cache teaching analogy dataa naylatics Download PDF(Updated Curriculum in Bo...
cache teaching  analogy dataa naylatics Download PDF(Updated Curriculum in Bo...cache teaching  analogy dataa naylatics Download PDF(Updated Curriculum in Bo...
cache teaching analogy dataa naylatics Download PDF(Updated Curriculum in Bo...
 
3Edge Corporate Presentation
3Edge Corporate Presentation3Edge Corporate Presentation
3Edge Corporate Presentation
 
Discovering the New SuccessFactors LMS Admin Features
Discovering the New SuccessFactors LMS Admin FeaturesDiscovering the New SuccessFactors LMS Admin Features
Discovering the New SuccessFactors LMS Admin Features
 
Supervised learning
Supervised learningSupervised learning
Supervised learning
 
Data Science and Practical Application Course
Data Science and Practical Application CourseData Science and Practical Application Course
Data Science and Practical Application Course
 
OpenEd 2013: Designing Open Badges and an Open Course to Enhance and Extend...
OpenEd  2013: Designing Open Badges and an Open Course  to Enhance and Extend...OpenEd  2013: Designing Open Badges and an Open Course  to Enhance and Extend...
OpenEd 2013: Designing Open Badges and an Open Course to Enhance and Extend...
 
Adhyyan presentation.pptx
Adhyyan presentation.pptxAdhyyan presentation.pptx
Adhyyan presentation.pptx
 
"Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f...
"Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f..."Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f...
"Solving Vision Tasks Using Deep Learning: An Introduction," a Presentation f...
 
Applied Machine Learning Course - Jodie Zhu (WeCloudData)
Applied Machine Learning Course - Jodie Zhu (WeCloudData)Applied Machine Learning Course - Jodie Zhu (WeCloudData)
Applied Machine Learning Course - Jodie Zhu (WeCloudData)
 
ML_Internship Presentation_Infidata_2021.pptx
ML_Internship Presentation_Infidata_2021.pptxML_Internship Presentation_Infidata_2021.pptx
ML_Internship Presentation_Infidata_2021.pptx
 
Introduction to EMA highlights
Introduction to EMA highlightsIntroduction to EMA highlights
Introduction to EMA highlights
 
Nagacv
NagacvNagacv
Nagacv
 
Monika_Bansal_resume
Monika_Bansal_resumeMonika_Bansal_resume
Monika_Bansal_resume
 
Bridging the Divide: High Technology in Low-resource Settings -- an update (S...
Bridging the Divide: High Technology in Low-resource Settings -- an update (S...Bridging the Divide: High Technology in Low-resource Settings -- an update (S...
Bridging the Divide: High Technology in Low-resource Settings -- an update (S...
 
NLP and Machine Learning for non-experts
NLP and Machine Learning for non-expertsNLP and Machine Learning for non-experts
NLP and Machine Learning for non-experts
 
Bringing Blackboard to Bath
Bringing Blackboard to BathBringing Blackboard to Bath
Bringing Blackboard to Bath
 
altafppt.pptx
altafppt.pptxaltafppt.pptx
altafppt.pptx
 
Improving the student experience using digital insights
Improving the student experience using digital insightsImproving the student experience using digital insights
Improving the student experience using digital insights
 
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
 

More from Sanghamitra Deb

odsc_2023.pdf
odsc_2023.pdfodsc_2023.pdf
odsc_2023.pdf
Sanghamitra Deb
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learning
Sanghamitra Deb
 
Intro to ml_2021
Intro to ml_2021Intro to ml_2021
Intro to ml_2021
Sanghamitra Deb
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
Sanghamitra Deb
 
NLP Classifier Models & Metrics
NLP Classifier Models & MetricsNLP Classifier Models & Metrics
NLP Classifier Models & Metrics
Sanghamitra Deb
 
NLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsNLP and Deep Learning for non_experts
NLP and Deep Learning for non_experts
Sanghamitra Deb
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
Sanghamitra Deb
 
Democratizing NLP content modeling with transfer learning using GPUs
Democratizing NLP content modeling with transfer learning using GPUsDemocratizing NLP content modeling with transfer learning using GPUs
Democratizing NLP content modeling with transfer learning using GPUs
Sanghamitra Deb
 
Natural Language Comprehension: Human Machine Collaboration.
Natural Language Comprehension: Human Machine Collaboration.Natural Language Comprehension: Human Machine Collaboration.
Natural Language Comprehension: Human Machine Collaboration.
Sanghamitra Deb
 
Data day2017
Data day2017Data day2017
Data day2017
Sanghamitra Deb
 
Extracting knowledgebase from text
Extracting knowledgebase from textExtracting knowledgebase from text
Extracting knowledgebase from text
Sanghamitra Deb
 
Extracting medical attributes and finding relations
Extracting medical attributes and finding relationsExtracting medical attributes and finding relations
Extracting medical attributes and finding relations
Sanghamitra Deb
 
From Rocket Science to Data Science
From Rocket Science to Data ScienceFrom Rocket Science to Data Science
From Rocket Science to Data Science
Sanghamitra Deb
 
Understanding Product Attributes from Reviews
Understanding Product Attributes from ReviewsUnderstanding Product Attributes from Reviews
Understanding Product Attributes from Reviews
Sanghamitra Deb
 

More from Sanghamitra Deb (14)

odsc_2023.pdf
odsc_2023.pdfodsc_2023.pdf
odsc_2023.pdf
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learning
 
Intro to ml_2021
Intro to ml_2021Intro to ml_2021
Intro to ml_2021
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
 
NLP Classifier Models & Metrics
NLP Classifier Models & MetricsNLP Classifier Models & Metrics
NLP Classifier Models & Metrics
 
NLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsNLP and Deep Learning for non_experts
NLP and Deep Learning for non_experts
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
Democratizing NLP content modeling with transfer learning using GPUs
Democratizing NLP content modeling with transfer learning using GPUsDemocratizing NLP content modeling with transfer learning using GPUs
Democratizing NLP content modeling with transfer learning using GPUs
 
Natural Language Comprehension: Human Machine Collaboration.
Natural Language Comprehension: Human Machine Collaboration.Natural Language Comprehension: Human Machine Collaboration.
Natural Language Comprehension: Human Machine Collaboration.
 
Data day2017
Data day2017Data day2017
Data day2017
 
Extracting knowledgebase from text
Extracting knowledgebase from textExtracting knowledgebase from text
Extracting knowledgebase from text
 
Extracting medical attributes and finding relations
Extracting medical attributes and finding relationsExtracting medical attributes and finding relations
Extracting medical attributes and finding relations
 
From Rocket Science to Data Science
From Rocket Science to Data ScienceFrom Rocket Science to Data Science
From Rocket Science to Data Science
 
Understanding Product Attributes from Reviews
Understanding Product Attributes from ReviewsUnderstanding Product Attributes from Reviews
Understanding Product Attributes from Reviews
 

Recently uploaded

Post init hook in the odoo 17 ERP Module
Post init hook in the  odoo 17 ERP ModulePost init hook in the  odoo 17 ERP Module
Post init hook in the odoo 17 ERP Module
Celine George
 
(T.L.E.) Agriculture: "Ornamental Plants"
(T.L.E.) Agriculture: "Ornamental Plants"(T.L.E.) Agriculture: "Ornamental Plants"
(T.L.E.) Agriculture: "Ornamental Plants"
MJDuyan
 
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptx
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptxScience-9-Lesson-1-The Bohr Model-NLC.pptx pptx
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptx
Catherine Dela Cruz
 
Ethiopia and Eritrea Eritrea's journey has been marked by resilience and dete...
Ethiopia and Eritrea Eritrea's journey has been marked by resilience and dete...Ethiopia and Eritrea Eritrea's journey has been marked by resilience and dete...
Ethiopia and Eritrea Eritrea's journey has been marked by resilience and dete...
biruktesfaye27
 
The Rise of the Digital Telecommunication Marketplace.pptx
The Rise of the Digital Telecommunication Marketplace.pptxThe Rise of the Digital Telecommunication Marketplace.pptx
The Rise of the Digital Telecommunication Marketplace.pptx
PriyaKumari928991
 
Library news letter Kitengesa Uganda June 2024
Library news letter Kitengesa Uganda June 2024Library news letter Kitengesa Uganda June 2024
Library news letter Kitengesa Uganda June 2024
Friends of African Village Libraries
 
220711130082 Srabanti Bag Internet Resources For Natural Science
220711130082 Srabanti Bag Internet Resources For Natural Science220711130082 Srabanti Bag Internet Resources For Natural Science
220711130082 Srabanti Bag Internet Resources For Natural Science
Kalna College
 
Accounting for Restricted Grants When and How To Record Properly
Accounting for Restricted Grants  When and How To Record ProperlyAccounting for Restricted Grants  When and How To Record Properly
Accounting for Restricted Grants When and How To Record Properly
TechSoup
 
Diversity Quiz Finals by Quiz Club, IIT Kanpur
Diversity Quiz Finals by Quiz Club, IIT KanpurDiversity Quiz Finals by Quiz Club, IIT Kanpur
Diversity Quiz Finals by Quiz Club, IIT Kanpur
Quiz Club IIT Kanpur
 
220711130100 udita Chakraborty Aims and objectives of national policy on inf...
220711130100 udita Chakraborty  Aims and objectives of national policy on inf...220711130100 udita Chakraborty  Aims and objectives of national policy on inf...
220711130100 udita Chakraborty Aims and objectives of national policy on inf...
Kalna College
 
How to stay relevant as a cyber professional: Skills, trends and career paths...
How to stay relevant as a cyber professional: Skills, trends and career paths...How to stay relevant as a cyber professional: Skills, trends and career paths...
How to stay relevant as a cyber professional: Skills, trends and career paths...
Infosec
 
Creating Images and Videos through AI.pptx
Creating Images and Videos through AI.pptxCreating Images and Videos through AI.pptx
Creating Images and Videos through AI.pptx
Forum of Blended Learning
 
nutrition in plants chapter 1 class 7...
nutrition in plants chapter 1 class 7...nutrition in plants chapter 1 class 7...
nutrition in plants chapter 1 class 7...
chaudharyreet2244
 
Decolonizing Universal Design for Learning
Decolonizing Universal Design for LearningDecolonizing Universal Design for Learning
Decolonizing Universal Design for Learning
Frederic Fovet
 
Opportunity scholarships and the schools that receive them
Opportunity scholarships and the schools that receive themOpportunity scholarships and the schools that receive them
Opportunity scholarships and the schools that receive them
EducationNC
 
Brand Guideline of Bashundhara A4 Paper - 2024
Brand Guideline of Bashundhara A4 Paper - 2024Brand Guideline of Bashundhara A4 Paper - 2024
Brand Guideline of Bashundhara A4 Paper - 2024
khabri85
 
78 Microsoft-Publisher - Sirin Sultana Bora.pptx
78 Microsoft-Publisher - Sirin Sultana Bora.pptx78 Microsoft-Publisher - Sirin Sultana Bora.pptx
78 Microsoft-Publisher - Sirin Sultana Bora.pptx
Kalna College
 
Erasmus + DISSEMINATION ACTIVITIES Croatia
Erasmus + DISSEMINATION ACTIVITIES CroatiaErasmus + DISSEMINATION ACTIVITIES Croatia
Erasmus + DISSEMINATION ACTIVITIES Croatia
whatchangedhowreflec
 
Non-Verbal Communication for Tech Professionals
Non-Verbal Communication for Tech ProfessionalsNon-Verbal Communication for Tech Professionals
Non-Verbal Communication for Tech Professionals
MattVassar1
 
IoT (Internet of Things) introduction Notes.pdf
IoT (Internet of Things) introduction Notes.pdfIoT (Internet of Things) introduction Notes.pdf
IoT (Internet of Things) introduction Notes.pdf
roshanranjit222
 


Developing Recommendation System to provide a Personalized Learning experience at Chegg

  • 1. Confidential Material / © 2020 Chegg, Inc. / All Rights Reserved. Developing Recommendation System to provide a Personalized Learning experience at Chegg. Sanghamitra Deb, Staff Data Scientist, Chegg Inc.
  • 2. Outline
      • Recommendations at Chegg
      • Organizing Content: Knowledge Graph
      • Deep Dive: Content Classifications
      • Cross Product Recommendations
      • Takeaways
  • 3. Recommendations at Chegg. The goal of recommendations at Chegg is to provide the best possible learning experience to students. This is fueled by high-quality content. Recommender systems provide a backbone to surface the most relevant content to a student. Organizing content into a knowledge graph and detecting patterns in student behavior help us personalize the student experience.
  • 4. Recommendations at Chegg: Chegg Study Home Page. Multiple services: textbook rentals, question answering, online tutoring, flashcards, writing, math solver, etc.
  • 5. Knowledge Graph. Hierarchy: Subject → Course → Concept → Sub-concepts. Example: Physics (subject); Electricity and Magnetism, Mechanics, Quantum Physics (courses); Velocity (concept).
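The hierarchy on this slide can be sketched as a small tree. This is a minimal illustration, not Chegg's implementation: the `Node` class and `paths` helper are hypothetical names, and placing Velocity under Mechanics is an assumption, since the slide does not say which course the concept belongs to.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """One node of the hierarchy: subject, course, concept, or sub-concept."""
    name: str
    level: str
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child


def paths(node: Node, prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield every root-to-leaf path as a tuple of node names."""
    here = prefix + (node.name,)
    if not node.children:
        yield here
    for child in node.children:
        yield from paths(child, here)


physics = Node("Physics", "subject")
physics.add(Node("Electricity and Magnetism", "course"))
mechanics = physics.add(Node("Mechanics", "course"))
physics.add(Node("Quantum Physics", "course"))
mechanics.add(Node("Velocity", "concept"))  # assumed parent course

for path in paths(physics):
    print(" > ".join(path))
```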
  • 6. Connecting content to the Knowledge Graph. Machine learning classifiers link content to the Subject → Course → Concept → Sub-concepts hierarchy. Example content:
      • "A rightward-moving bicycle increases its speed from 2.0 m/s to 12.0 m/s. Is the bicycle accelerating?"
      • Mitosis: a type of cell division that results in two daughter cells each having the same number and kind of chromosomes as the parent nucleus, typical of ordinary tissue growth.
      • Writing tools: "Get your physics paper checked by an expert"
  • 7. Connecting users to the Knowledge Graph. Hierarchy: Subject → Course → Concept → Sub-concepts. Machine Learning Classifiers. Example nodes: Physics 101, Acceleration; Biology, Mitosis. Example content:
      • "A rightward-moving bicycle increases its speed from 2.0 m/s to 12.0 m/s. Is the bicycle accelerating?"
      • Writing tools: "Do you need help writing a physics paper?"
    Diagram caption (incomplete in the source): "Edges are created between users and"
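One way to picture user-to-node edges that change as students interact with content is a weighted edge table. The sketch below is an assumption throughout: the `UserGraphEdges` class, the decay factor, and the update rule are hypothetical and are not described in the slides, which only say that edges connect users to knowledge graph nodes.

```python
from collections import defaultdict


class UserGraphEdges:
    """Weighted edges between users and knowledge graph nodes (hypothetical)."""

    def __init__(self, decay=0.5):
        self.decay = decay                  # assumed down-weighting of older activity
        self.weights = defaultdict(float)   # (user, node) -> edge weight

    def record(self, user, node):
        """Decay the user's existing edges, then reinforce the node just touched."""
        for key in list(self.weights):
            if key[0] == user:
                self.weights[key] *= self.decay
        self.weights[(user, node)] += 1.0

    def top_nodes(self, user, k=3):
        """Return the user's k strongest nodes, strongest first."""
        edges = [(n, w) for (u, n), w in self.weights.items() if u == user]
        edges.sort(key=lambda edge: edge[1], reverse=True)
        return [n for n, _ in edges[:k]]


graph = UserGraphEdges()
graph.record("student-1", "Mitosis")
graph.record("student-1", "Acceleration")
graph.record("student-1", "Acceleration")
print(graph.top_nodes("student-1"))
```

Recent interactions dominate here because every new event halves the older weights; a production system could just as well use time windows or learned weights.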
  • 8. Content Classification Pipeline
      • Text Pre-processing: reduces noise; ensures quality; improves overall performance.
      • Collecting Training Data: training data collection, i.e. examples of the classes that we are trying to model; model performance is directly correlated with the quality of the training data.
      • Model Building: model selection; architecture; parameter tuning.
      • Model Evaluation: offline (SME); online (student).
  • 9. Classification Problem: Assigning decks to Courses
      • Decks are lists of cards grouped together by students for studying.
      • There are several thousand courses; a course is typically more granular than a subject but less granular than a concept.
  • 10. Modeling Approaches
      • TF-IDF features with an SVM classifier
          • Pros: gives decent performance on small training data; straightforward training pipeline.
          • Cons: does not do well for subjects dominated by symbols; including word- and character-based features makes the token space and the model extremely large.
      • Character-based CNN
          • Has the ability to deal with out-of-vocabulary words. This makes it particularly suitable for user-generated raw text.
          • Works for multiple languages.
          • Model size is small, since the tokens are limited to the number of characters (~70). This makes real-life deployments easier and faster.
          • Networks with convolutional and pooling layers are useful for classification tasks in which we expect to find strong local clues regarding class membership.
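The point about a token space of roughly 70 characters can be illustrated with a small character encoder. This is a sketch under assumptions: the slide does not list the alphabet, so the 70-character set below (lowercase letters, digits, ASCII punctuation, space and newline) and the names `ALPHABET` and `encode` are illustrative choices only.

```python
import string

# Assumed alphabet: 26 letters + 10 digits + 32 punctuation marks + space + newline = 70.
ALPHABET = string.ascii_lowercase + string.digits + string.punctuation + " \n"
# Index 0 is reserved for padding and for characters outside the alphabet.
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def encode(text, max_len=64):
    """Map text to a fixed-length list of character indices."""
    ids = [CHAR_TO_INDEX.get(ch, 0) for ch in text.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))


print(len(ALPHABET))
print(encode("Is the bicycle accelerating?", max_len=32))
```

Because every string maps onto the same small index set, misspellings and symbols never fall out of vocabulary, and the embedding table stays tiny compared with a word-level vocabulary.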
  • 11. CNN Model Architecture. Diagram labels: Feature Length, Convolutions, Convolution & pool layer (2 layers of convolution & pooling), GlobalMaxPool1D, Dense layer, Dropout, PReLU, Norm.
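To make the convolution, pooling and global max pooling labels concrete, here is a toy forward pass in plain Python. It is only a sketch: the kernels, the pooling size of 2, and the use of ReLU in place of the PReLU named on the slide are assumptions, and the dense, dropout and normalization layers are omitted.

```python
def conv1d(seq, kernel):
    """Valid 1D cross-correlation of a sequence with a kernel."""
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]


def relu(xs):
    """Stand-in activation; the slide names PReLU."""
    return [max(0.0, x) for x in xs]


def max_pool1d(xs, size=2):
    """Non-overlapping max pooling."""
    return [max(xs[i:i + size]) for i in range(0, len(xs) - size + 1, size)]


def forward(seq, kernel1, kernel2):
    """Two convolution and pooling layers followed by global max pooling."""
    h = max_pool1d(relu(conv1d(seq, kernel1)))
    h = max_pool1d(relu(conv1d(h, kernel2)))
    return max(h)  # global max pool: one number per filter


signal = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
shifted = [0.0] + signal[:-1]
bump = [1.0, 2.0, 1.0]

# A single convolution with global max pooling fires equally wherever the pattern sits.
print(max(conv1d(signal, bump)), max(conv1d(shifted, bump)))
print(forward(signal, bump, [1.0, -1.0]))
```

The first print shows why this family of models suits local clues: the filter response to a pattern does not depend on its position once the maximum is taken over the whole sequence.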
  • 12. Multi-task Modeling. Two tasks:
      • Similarity between card front and back: the card front and the card back each pass through the CNN model, followed by a similarity function and a cross-entropy loss.
      • Classification of courses: the card passes through the CNN model into a softmax over the number of courses, with a cross-entropy loss.
    Adding another task improves the accuracy by a few percent.
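A joint objective for the two tasks could look like the sketch below. Several details are assumptions rather than the talk's method: the similarity task is written as an in-batch softmax over dot products (each front should score highest against its own back), the two losses are summed with a hypothetical `weight` parameter, and the embeddings and logits are hand-written numbers standing in for CNN outputs.

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax probability of the target index."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def similarity_loss(fronts, backs):
    """Front i should match back i; other backs in the batch act as negatives."""
    losses = [cross_entropy([dot(f, b) for b in backs], i)
              for i, f in enumerate(fronts)]
    return sum(losses) / len(losses)


def multitask_loss(fronts, backs, course_logits, course_labels, weight=1.0):
    """Course classification loss plus weighted front/back similarity loss."""
    cls = sum(cross_entropy(l, y) for l, y in zip(course_logits, course_labels))
    cls /= len(course_labels)
    return cls + weight * similarity_loss(fronts, backs)


fronts = [[1.0, 0.0], [0.0, 1.0]]          # stand-ins for card-front embeddings
backs = [[1.0, 0.0], [0.0, 1.0]]           # stand-ins for card-back embeddings
course_logits = [[2.0, 0.1, 0.1], [0.1, 0.2, 1.5]]
course_labels = [0, 2]
print(multitask_loss(fronts, backs, course_logits, course_labels))
```

Optimizing both terms at once means the shared CNN weights receive gradients from the matching task as well as the course labels, which is the mechanism by which an auxiliary task can help the main one.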
  • 13. Model Performance. Top-3 accuracy: 73% on offline test data.
      • Challenges: imbalanced training data; some classes have too few training examples.
      • Solutions: collect more training data; use rule-based techniques to augment the training data.
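Top-3 accuracy counts a prediction as correct when the true course is among the three highest-scoring classes. A minimal sketch of the metric follows; the function name `top_k_accuracy` and the toy scores are illustrative and unrelated to the reported 73%.

```python
def top_k_accuracy(scores, labels, k=3):
    """Fraction of examples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


# Toy scores over four classes for three examples (made-up numbers).
scores = [
    [0.10, 0.50, 0.20, 0.20],
    [0.70, 0.10, 0.10, 0.10],
    [0.05, 0.05, 0.10, 0.80],
]
labels = [2, 3, 0]
print(top_k_accuracy(scores, labels, k=3))
```

With several thousand courses, a top-k metric is a natural fit when the product can show a short list of candidate courses rather than a single guess.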
  • 14. Cross Product Recommendations
      • Cold start problem: users often use one product, such as Chegg Study, and may only browse other products, such as Chegg Practice or Flash Cards.
      • Solutions (personalized):
          • Content filtering: use the KG to determine the courses, concepts and sub-concepts that users are currently studying, and recommend trending content in that category.
          • Text similarity: based on users' content engagement, using in-house language models optimized for Chegg content.
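The content filtering idea can be sketched as a lookup: take the knowledge graph nodes a user is currently studying, then rank content from other products in those nodes by popularity. Everything concrete here is hypothetical: the `recommend` function, the catalog fields, the item ids and the view counts are invented for illustration, with view count standing in for "trending".

```python
def recommend(user_nodes, catalog, exclude_product=None, k=2):
    """Return ids of the k most-viewed items in the user's nodes from other products."""
    candidates = [
        item for item in catalog
        if item["node"] in user_nodes and item["product"] != exclude_product
    ]
    candidates.sort(key=lambda item: item["views"], reverse=True)
    return [item["id"] for item in candidates[:k]]


# Hypothetical catalog; product names come from the slide, the rest is made up.
catalog = [
    {"id": "deck-1", "product": "Flash Cards", "node": "Acceleration", "views": 120},
    {"id": "quiz-7", "product": "Chegg Practice", "node": "Acceleration", "views": 300},
    {"id": "qa-3", "product": "Chegg Study", "node": "Acceleration", "views": 900},
    {"id": "deck-9", "product": "Flash Cards", "node": "Mitosis", "views": 500},
]

# A Chegg Study user currently studying Acceleration has no history in other products.
print(recommend({"Acceleration"}, catalog, exclude_product="Chegg Study"))
```

The user needs no interaction history in the target product, only an edge to a knowledge graph node, which is why this approach eases the cold start problem.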
  • 15. Takeaways
      • Content drives recommendations: high quality, relevant.
      • Organizing the content into a Knowledge Graph (KG) facilitates content-based recommendations.
      • Accuracy of classifiers is important: models are constantly iterated on, even for a gain of a few percent.
      • The KG helps connect students with courses and concepts, which helps with personalized recommendations.
      • Cross-product recommendations are possible through the KG.
      • Cold start problems are made easier.
  • 16. Questions

Editor's Notes

  1. I am going to talk about personalizing the learning experience at Chegg using recommendation systems.
  2. Here is an outline of the presentation.
  3. Chegg is a centralized learning platform where students come to learn concepts required for academic performance, job interviews, or other activities. The goal of any recommender system (RS) is to present content that is high quality and relevant, i.e., to show students what they want to study. For example, if we know from past user interactions that a student has a data analyst job interview, we show the student content related to learning "SQL".
  4. This is an example of the student experience at Chegg. A student logs in and finds suggestions in Mechanical Engineering and Chemistry. As you can see, this model suggests textbook solutions to users based on their past behavior, accompanied by the message "based on your progress". Another example is a concept-based recommendation module in Study, which is placed below an expert answer that the student is viewing. I wanted to use this slide to give you a look at our content. As you can see, most of our content is academic material.
  5. Now I will segue into how this content is organized. We have built a knowledge graph that represents a hierarchy of subjects, courses, and concepts. The nodes in this graph are provided by subject matter experts. We constantly iterate on this graph as we get suggestions for more nodes and edges. The machine learning component comes in when we create edges between concept nodes and content. How does this look?
  6. Here is an example of how we connect content from different products to the nodes of the knowledge graph.
  7. When users interact with the content, we are able to connect users to a node of the knowledge graph. Since user interactions change constantly over time, the edges between users and KG nodes are constantly updated.
  8. Let's now do a deep dive into content classification, since that is the backbone of all the recommendations here.
  9. Convolution and pooling layers are good at picking up signatures at the n-gram level, i.e., they are able to pick up when certain phrases are indicative of certain class memberships.
  10. The two layers ensure that the correlations between n-grams are picked up at two different scales.
  11. We define two different tasks for optimization. One of them is to match the front of the card with the back of the card. We use the CNN model defined in the previous slide, with the dot product as the similarity function and a cross-entropy loss. For the classification problem, we feed the CNN model into a softmax layer to predict the courses. Both tasks are optimized simultaneously.