尊敬的 微信汇率:1円 ≈ 0.046166 元 支付宝汇率:1円 ≈ 0.046257元 [退出登录]
SlideShare a Scribd company logo
How to fine-tune a
Large Language Model
Durgesh Gupta
Lack of etiquette and manners is a huge turn off.
KnolX Etiquettes
 Punctuality
Join the session 5 minutes prior to the session start time. We start on
time and conclude on time!
 Feedback
Make sure to submit a constructive feedback for all sessions as it is very
helpful for the presenter.
 Silent Mode
Keep your mobile devices in silent mode, feel free to move out of session
in case you need to attend an urgent call.
 Avoid Disturbance
Avoid unwanted chit chat during the session.
1. What is Fine-tuning
2. Pre-trained Model Vs Fine-tuned Model
3. What is Pre-training?
4. Limitation of pre-trained base models
5. Advantage of fine-tuning your own LLM
6. What is Instruction fine-tuning
7. Data Preparation
8. Approach to fine-tuning
9. PEFT: Parameter Efficient fine-tuning
10. Error Analysis
11. Sample Training Code
01
What is Fine-tuning?
 Finetuning is tweaking the model’s parameters to make it suitable for
performing a specific task.
 We can fine-tune a pre-trained model or in simple words, train to
perform a specific task such as sentiment analysis, text generation,
finding document similarity, etc.
 What fine-tuning does for the model?
− Gets model to learn the data, rather than just get access to it.
− Steers the model to more consistent outputs
− Reduce hallucinations
− Customizes the model to a specific use case.
What is Fine-tuning?
02
Pre-trained Model Vs Fine-tuned Model
 No data to get started
 Smaller upfront cost
 No technical/training knowledge
 Connect data through retrieval (RAG)
 More Generic data fits
 Hallucinations
 RAG misses or gets incorrect data.
Pre-trained Model
 Domain specific data required
 Involves Upfront compute cost
 Needs technical expertise.
 Use RAG too (More Secure)
 More high-quality domain specific data
 Learn new information
 Able to correct incorrect information
Fine-tuned Model
Note: Less cost afterwards if smaller model
03
Training Model to learn text-generation
 Training LLMs from scratch is known as pre-training.
 It is a technique in which a large language model is trained on a vast amount of
unlabeled text.
 Utilizing the concept of Self-Supervised Learning, model masks a word and tries
to predict the next word with the help of the preceding words.
 Pre-training, it is a technique in which the model learns to predict the next word
in the text.
 Example: I am a data scientist.
− The model can create its own labeled data from this sentence like:
Text Label
I am
I am a
I am a data
I am a data scientist
04
Limitation of Pre-trained Model
 Contextual Understanding: Difficult differentiating context.
 Generating Misinformation: May generate incorrect or misleading
information.
 Lack of Creativity: Creativity based on mimicking patterns.
 Hallucination: Generates text that is erroneous, nonsensical, or
detached from reality.
05
Benefit of fine-tuning your own LLM
 Performance
− Less Hallucination
− Increase Consistency
− Reduce unwanted information
 Privacy
− On Prem
− Prevent Leakage
− No breaches
 Reliability
− Control Uptime
− Lower Latency
− Increased Transparency
− Greater Control
Impact of fine-tuning on the model
 Behavior Change
− Learning to respond more consistently
− Learning to focus, e.g., moderation
− Teasing out capability, e.g., better at conversation
 Gain Knowledge
− Increasing knowledge of new specific concepts.
− Correcting old incorrect information
06
What is instruction fine-tuning?
 Instruction fine-tuning is a specialized technique to tailor large language
models to perform specific tasks based on explicit instructions.
 It refers to the process of further training LLMs on a dataset consisting of
instruction, output pairs in a supervised fashion, which bridges the gap
between the next-word prediction objective of LLMs and the users'
objective of having LLMs adhere to human instructions.
 Teaches model to behave more like a chat bot.
 Better user interface for model interaction
− Increased AI adoption, from the thousands of researchers to million of
people
 Can access model pre-existing knowledge.
Instruction following datasets
 Some existing data is ready as-in online:
− FAQ's
− Customer Support Conversation
− Slack Messages
07
Data Selection Criteria
 Higher Quality
 Diversity
 Real
 More
Better
 Lower Quality
 Homogeneity
 Generated
 Less
Worse
Steps to prepare your data
1. Collect instruction-response pairs
2. Concatenate pairs (add prompt template, if required)
3. Tokenization: Pad, Truncate
4. Split into train/test
Tokenization
 Tokenization is the process of splitting text into individual units,
typically words or sub words.
 This step is crucial for the model to understand the structure of
the text.
 In languages like English, tokenization is relatively
straightforward, as words are typically separated by spaces.
Tokenization
This is an input text.
[CLS] This is an input text . [SEP]
101 2023 2003 1037 7953 2058 1012 102
ENCODING
08
Approach To Fine-tune LLM
 Figure out the task.
 Data collection related to the task: input/output pairs.
 Data generation, if required
 Fine tune a small model e.g., 50M-1B
 Vary the amount of data you give your model.
 Evaluate the model performance.
 Collect more data to improve.
 Increase task complexity
 Increase the model size for performance.
The steps for fine-tuning the Large Language Model are:
Fine-tuning Lifecycle
09
PEFT: Parameter Efficient Fine Tuning
 PEFT stands for Parameter Efficient Fine-tuning.
 ML models are essentially complex mathematical equations with
numerous coefficients or weights.
 These coefficients are responsible for the model behavior and make it
capable of learning from data.
 During training of ML models, we adjust these coefficients to minimize
errors and make accurate predictions.
 In case of LLMs, which can have billions of parameters, and changing all
of them during training can be computationally expensive and memory-
intensive.
 PEFT, as a subset of fine-tuning, takes the parameter efficiency seriously.
 Instead of altering all the coefficients of the model, PEFT selects a subset
of them.
 It helps us significantly reducing the computational and memory
requirements.
PEFT: Parameter Efficient Fine Tuning
PEFT: Parameter Efficient Fine Tuning
 LoRA (Low-Rank Adoption):
− It is a technique exploits the fact that some weights have more
significant impacts than others. In LoRA, the large weight matrix is
divided into two smaller matrices by factorization.
− We reduce the number of coefficients that need adjustment, making the
fine-tuning process more efficient.
 QLoRA (Quantization + Low-Rank Adoption):
− Quantization involves converting high-precision floating-point coefficients
into lower-precision representations, such as 4-bit integers.
− Quantization offers a solution by reducing the precision of these
coefficients.
− For instance, a 32-bit floating-point number can be represented as a 4-
bit integer within a specific range. This conversion significantly shrinks
the memory footprint.
LoRA and QLoRA for Coefficient Selection
10
Evaluating Generative AI model
 Huaman Evaluation: Human Expert Evaluation is most reliable.
 Test Data- Good test data is crucial
− High Quality
− Accurate
− Generalize
− Not seen in training data
 Elo Rankings
− Ranking of the top LLMs based on their Elo scores.
− The Elo scores are computed from the results of A/B tests, wherein the
LLMs are pitted against each other in a series of games.
− The ranking system employed is based on the Elo Rating System.
Evaluating Generative Models are Notoriously difficult !
Error Analysis
• Understand the base model behaviour before finetuning
• Categorize errors: iterate on data to fix these problems in data.
Category Example with Problem Example Fixed
Misspelling Your kidney is healthy,
but you lever is sick, get
your lever examined
Your kidney is healthy,
but your liver is sick
Too Long Diabetes is less likely
when you eat a healthy
diet makes diabetes less
likely, making …......
Diabetes is less likely
when you eat a healthy
diet
Repetitive Medical LLMs can save
healthcare workers time
and money and time and
money and time and
money.
Medical LLMs can save
healthcare workers time
and money
11
How to fine-tune and develop your own large language model.pptx

More Related Content

What's hot

Large Language Models, Data & APIs - Integrating Generative AI Power into you...
Large Language Models, Data & APIs - Integrating Generative AI Power into you...Large Language Models, Data & APIs - Integrating Generative AI Power into you...
Large Language Models, Data & APIs - Integrating Generative AI Power into you...
NETUserGroupBern
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
Ding Li
 
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroOpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
Numenta
 
LLMs_talk_March23.pdf
LLMs_talk_March23.pdfLLMs_talk_March23.pdf
LLMs_talk_March23.pdf
ChaoYang81
 
Implications of GPT-3
Implications of GPT-3Implications of GPT-3
Implications of GPT-3
Raven Jiang
 
A Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptxA Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptx
SaiPragnaKancheti
 
Generative Models and ChatGPT
Generative Models and ChatGPTGenerative Models and ChatGPT
Generative Models and ChatGPT
Loic Merckel
 
Customizing LLMs
Customizing LLMsCustomizing LLMs
Customizing LLMs
Jim Steele
 
Transfer Learning: An overview
Transfer Learning: An overviewTransfer Learning: An overview
Transfer Learning: An overview
jins0618
 
1909 BERT: why-and-how (CODE SEMINAR)
1909 BERT: why-and-how (CODE SEMINAR)1909 BERT: why-and-how (CODE SEMINAR)
1909 BERT: why-and-how (CODE SEMINAR)
WarNik Chow
 
Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Neural Language Generation Head to Toe
Neural Language Generation Head to Toe
Hady Elsahar
 
LLMs Bootcamp
LLMs BootcampLLMs Bootcamp
LLMs Bootcamp
Fiza987241
 
Large Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdfLarge Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdf
David Rostcheck
 
A brief primer on OpenAI's GPT-3
A brief primer on OpenAI's GPT-3A brief primer on OpenAI's GPT-3
A brief primer on OpenAI's GPT-3
Ishan Jain
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga Petrova
Alexey Grigorev
 
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
ssuser4edc93
 
Large Language Models.pdf
Large Language Models.pdfLarge Language Models.pdf
Large Language Models.pdf
BLINXAI
 
Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...
Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...
Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...
SlideTeam
 
LLM Healthcare.pdf
LLM Healthcare.pdfLLM Healthcare.pdf
LLM Healthcare.pdf
ATPowr
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers
Arvind Devaraj
 

What's hot (20)

Large Language Models, Data & APIs - Integrating Generative AI Power into you...
Large Language Models, Data & APIs - Integrating Generative AI Power into you...Large Language Models, Data & APIs - Integrating Generative AI Power into you...
Large Language Models, Data & APIs - Integrating Generative AI Power into you...
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
 
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve OmohundroOpenAI’s GPT 3 Language Model - guest Steve Omohundro
OpenAI’s GPT 3 Language Model - guest Steve Omohundro
 
LLMs_talk_March23.pdf
LLMs_talk_March23.pdfLLMs_talk_March23.pdf
LLMs_talk_March23.pdf
 
Implications of GPT-3
Implications of GPT-3Implications of GPT-3
Implications of GPT-3
 
A Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptxA Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptx
 
Generative Models and ChatGPT
Generative Models and ChatGPTGenerative Models and ChatGPT
Generative Models and ChatGPT
 
Customizing LLMs
Customizing LLMsCustomizing LLMs
Customizing LLMs
 
Transfer Learning: An overview
Transfer Learning: An overviewTransfer Learning: An overview
Transfer Learning: An overview
 
1909 BERT: why-and-how (CODE SEMINAR)
1909 BERT: why-and-how (CODE SEMINAR)1909 BERT: why-and-how (CODE SEMINAR)
1909 BERT: why-and-how (CODE SEMINAR)
 
Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Neural Language Generation Head to Toe
Neural Language Generation Head to Toe
 
LLMs Bootcamp
LLMs BootcampLLMs Bootcamp
LLMs Bootcamp
 
Large Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdfLarge Language Models - Chat AI.pdf
Large Language Models - Chat AI.pdf
 
A brief primer on OpenAI's GPT-3
A brief primer on OpenAI's GPT-3A brief primer on OpenAI's GPT-3
A brief primer on OpenAI's GPT-3
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga Petrova
 
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...How Does Generative AI Actually Work? (a quick semi-technical introduction to...
How Does Generative AI Actually Work? (a quick semi-technical introduction to...
 
Large Language Models.pdf
Large Language Models.pdfLarge Language Models.pdf
Large Language Models.pdf
 
Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...
Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...
Reinforcement Learning In AI Powerpoint Presentation Slide Templates Complete...
 
LLM Healthcare.pdf
LLM Healthcare.pdfLLM Healthcare.pdf
LLM Healthcare.pdf
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers
 

Similar to How to fine-tune and develop your own large language model.pptx

A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
thanhdowork
 
10 Limitations of Large Language Models and Mitigation Options
10 Limitations of Large Language Models and Mitigation Options10 Limitations of Large Language Models and Mitigation Options
10 Limitations of Large Language Models and Mitigation Options
Mihai Criveti
 
Weak Supervision.pdf
Weak Supervision.pdfWeak Supervision.pdf
Weak Supervision.pdf
StephenLeo7
 
Supervised learning
Supervised learningSupervised learning
Supervised learning
ankit_ppt
 
Fine-tuning Pre-Trained Models for Generative AI Applications
Fine-tuning Pre-Trained Models for Generative AI ApplicationsFine-tuning Pre-Trained Models for Generative AI Applications
Fine-tuning Pre-Trained Models for Generative AI Applications
Benjaminlapid1
 
LLMs for the “GPU-Poor” - Franck Nijimbere.pdf
LLMs for the “GPU-Poor” - Franck Nijimbere.pdfLLMs for the “GPU-Poor” - Franck Nijimbere.pdf
LLMs for the “GPU-Poor” - Franck Nijimbere.pdf
GDG Bujumbura
 
SentimentAnalysisofTwitterProductReviewsDocument.pdf
SentimentAnalysisofTwitterProductReviewsDocument.pdfSentimentAnalysisofTwitterProductReviewsDocument.pdf
SentimentAnalysisofTwitterProductReviewsDocument.pdf
DevinSohi
 
Google machine learning engineer exam dumps 2022
Google machine learning engineer exam dumps 2022Google machine learning engineer exam dumps 2022
Google machine learning engineer exam dumps 2022
SkillCertProExams
 
Introduction to ML (Machine Learning)
Introduction to ML (Machine Learning)Introduction to ML (Machine Learning)
Introduction to ML (Machine Learning)
SwatiTripathi44
 
artificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdfartificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdf
tt4765690
 
Training language models to follow instructions with human feedback (Instruct...
Training language models to follow instructions with human feedback (Instruct...Training language models to follow instructions with human feedback (Instruct...
Training language models to follow instructions with human feedback (Instruct...
Rama Irsheidat
 
Deep learning for NLP
Deep learning for NLPDeep learning for NLP
Deep learning for NLP
Shishir Choudhary
 
Barga Data Science lecture 10
Barga Data Science lecture 10Barga Data Science lecture 10
Barga Data Science lecture 10
Roger Barga
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learning
Sanghamitra Deb
 
Afternoons with Azure - Azure Machine Learning
Afternoons with Azure - Azure Machine Learning Afternoons with Azure - Azure Machine Learning
Afternoons with Azure - Azure Machine Learning
CCG
 
DataScientist Job : Between Myths and Reality.pdf
DataScientist Job : Between Myths and Reality.pdfDataScientist Job : Between Myths and Reality.pdf
DataScientist Job : Between Myths and Reality.pdf
Jedha Bootcamp
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or reality
Awantik Das
 
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP MeetupDealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Yves Peirsman
 
Understanding Mahout classification documentation
Understanding Mahout  classification documentationUnderstanding Mahout  classification documentation
Understanding Mahout classification documentation
Naveen Kumar
 
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
CS, NcState
 

Similar to How to fine-tune and develop your own large language model.pptx (20)

A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
 
10 Limitations of Large Language Models and Mitigation Options
10 Limitations of Large Language Models and Mitigation Options10 Limitations of Large Language Models and Mitigation Options
10 Limitations of Large Language Models and Mitigation Options
 
Weak Supervision.pdf
Weak Supervision.pdfWeak Supervision.pdf
Weak Supervision.pdf
 
Supervised learning
Supervised learningSupervised learning
Supervised learning
 
Fine-tuning Pre-Trained Models for Generative AI Applications
Fine-tuning Pre-Trained Models for Generative AI ApplicationsFine-tuning Pre-Trained Models for Generative AI Applications
Fine-tuning Pre-Trained Models for Generative AI Applications
 
LLMs for the “GPU-Poor” - Franck Nijimbere.pdf
LLMs for the “GPU-Poor” - Franck Nijimbere.pdfLLMs for the “GPU-Poor” - Franck Nijimbere.pdf
LLMs for the “GPU-Poor” - Franck Nijimbere.pdf
 
SentimentAnalysisofTwitterProductReviewsDocument.pdf
SentimentAnalysisofTwitterProductReviewsDocument.pdfSentimentAnalysisofTwitterProductReviewsDocument.pdf
SentimentAnalysisofTwitterProductReviewsDocument.pdf
 
Google machine learning engineer exam dumps 2022
Google machine learning engineer exam dumps 2022Google machine learning engineer exam dumps 2022
Google machine learning engineer exam dumps 2022
 
Introduction to ML (Machine Learning)
Introduction to ML (Machine Learning)Introduction to ML (Machine Learning)
Introduction to ML (Machine Learning)
 
artificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdfartificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdf
 
Training language models to follow instructions with human feedback (Instruct...
Training language models to follow instructions with human feedback (Instruct...Training language models to follow instructions with human feedback (Instruct...
Training language models to follow instructions with human feedback (Instruct...
 
Deep learning for NLP
Deep learning for NLPDeep learning for NLP
Deep learning for NLP
 
Barga Data Science lecture 10
Barga Data Science lecture 10Barga Data Science lecture 10
Barga Data Science lecture 10
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learning
 
Afternoons with Azure - Azure Machine Learning
Afternoons with Azure - Azure Machine Learning Afternoons with Azure - Azure Machine Learning
Afternoons with Azure - Azure Machine Learning
 
DataScientist Job : Between Myths and Reality.pdf
DataScientist Job : Between Myths and Reality.pdfDataScientist Job : Between Myths and Reality.pdf
DataScientist Job : Between Myths and Reality.pdf
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or reality
 
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP MeetupDealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
 
Understanding Mahout classification documentation
Understanding Mahout  classification documentationUnderstanding Mahout  classification documentation
Understanding Mahout classification documentation
 
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
 

More from Knoldus Inc.

Insights Unveiled Test Reporting and Observability Excellence
Insights Unveiled Test Reporting and Observability ExcellenceInsights Unveiled Test Reporting and Observability Excellence
Insights Unveiled Test Reporting and Observability Excellence
Knoldus Inc.
 
Introduction to Splunk Presentation (DevOps)
Introduction to Splunk Presentation (DevOps)Introduction to Splunk Presentation (DevOps)
Introduction to Splunk Presentation (DevOps)
Knoldus Inc.
 
Code Camp - Data Profiling and Quality Analysis Framework
Code Camp - Data Profiling and Quality Analysis FrameworkCode Camp - Data Profiling and Quality Analysis Framework
Code Camp - Data Profiling and Quality Analysis Framework
Knoldus Inc.
 
AWS: Messaging Services in AWS Presentation
AWS: Messaging Services in AWS PresentationAWS: Messaging Services in AWS Presentation
AWS: Messaging Services in AWS Presentation
Knoldus Inc.
 
Amazon Cognito: A Primer on Authentication and Authorization
Amazon Cognito: A Primer on Authentication and AuthorizationAmazon Cognito: A Primer on Authentication and Authorization
Amazon Cognito: A Primer on Authentication and Authorization
Knoldus Inc.
 
ZIO Http A Functional Approach to Scalable and Type-Safe Web Development
ZIO Http A Functional Approach to Scalable and Type-Safe Web DevelopmentZIO Http A Functional Approach to Scalable and Type-Safe Web Development
ZIO Http A Functional Approach to Scalable and Type-Safe Web Development
Knoldus Inc.
 
Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.
Knoldus Inc.
 
Facilitation Skills - When to Use and Why.pptx
Facilitation Skills - When to Use and Why.pptxFacilitation Skills - When to Use and Why.pptx
Facilitation Skills - When to Use and Why.pptx
Knoldus Inc.
 
Performance Testing at Scale Techniques for High-Volume Services
Performance Testing at Scale Techniques for High-Volume ServicesPerformance Testing at Scale Techniques for High-Volume Services
Performance Testing at Scale Techniques for High-Volume Services
Knoldus Inc.
 
Snowflake and its features (Presentation)
Snowflake and its features (Presentation)Snowflake and its features (Presentation)
Snowflake and its features (Presentation)
Knoldus Inc.
 
Terratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructureTerratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructure
Knoldus Inc.
 
Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)
Knoldus Inc.
 
Secure practices with dot net services.pptx
Secure practices with dot net services.pptxSecure practices with dot net services.pptx
Secure practices with dot net services.pptx
Knoldus Inc.
 
Distributed Cache with dot microservices
Distributed Cache with dot microservicesDistributed Cache with dot microservices
Distributed Cache with dot microservices
Knoldus Inc.
 
Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)
Knoldus Inc.
 
Using InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in JmeterUsing InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in Jmeter
Knoldus Inc.
 
Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)
Knoldus Inc.
 
Stakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) PresentationStakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) Presentation
Knoldus Inc.
 
Introduction To Kaniko (DevOps) Presentation
Introduction To Kaniko (DevOps) PresentationIntroduction To Kaniko (DevOps) Presentation
Introduction To Kaniko (DevOps) Presentation
Knoldus Inc.
 
Efficient Test Environments with Infrastructure as Code (IaC)
Efficient Test Environments with Infrastructure as Code (IaC)Efficient Test Environments with Infrastructure as Code (IaC)
Efficient Test Environments with Infrastructure as Code (IaC)
Knoldus Inc.
 

More from Knoldus Inc. (20)

Insights Unveiled Test Reporting and Observability Excellence
Insights Unveiled Test Reporting and Observability ExcellenceInsights Unveiled Test Reporting and Observability Excellence
Insights Unveiled Test Reporting and Observability Excellence
 
Introduction to Splunk Presentation (DevOps)
Introduction to Splunk Presentation (DevOps)Introduction to Splunk Presentation (DevOps)
Introduction to Splunk Presentation (DevOps)
 
Code Camp - Data Profiling and Quality Analysis Framework
Code Camp - Data Profiling and Quality Analysis FrameworkCode Camp - Data Profiling and Quality Analysis Framework
Code Camp - Data Profiling and Quality Analysis Framework
 
AWS: Messaging Services in AWS Presentation
AWS: Messaging Services in AWS PresentationAWS: Messaging Services in AWS Presentation
AWS: Messaging Services in AWS Presentation
 
Amazon Cognito: A Primer on Authentication and Authorization
Amazon Cognito: A Primer on Authentication and AuthorizationAmazon Cognito: A Primer on Authentication and Authorization
Amazon Cognito: A Primer on Authentication and Authorization
 
ZIO Http A Functional Approach to Scalable and Type-Safe Web Development
ZIO Http A Functional Approach to Scalable and Type-Safe Web DevelopmentZIO Http A Functional Approach to Scalable and Type-Safe Web Development
ZIO Http A Functional Approach to Scalable and Type-Safe Web Development
 
Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.
 
Facilitation Skills - When to Use and Why.pptx
Facilitation Skills - When to Use and Why.pptxFacilitation Skills - When to Use and Why.pptx
Facilitation Skills - When to Use and Why.pptx
 
Performance Testing at Scale Techniques for High-Volume Services
Performance Testing at Scale Techniques for High-Volume ServicesPerformance Testing at Scale Techniques for High-Volume Services
Performance Testing at Scale Techniques for High-Volume Services
 
Snowflake and its features (Presentation)
Snowflake and its features (Presentation)Snowflake and its features (Presentation)
Snowflake and its features (Presentation)
 
Terratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructureTerratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructure
 
Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)
 
Secure practices with dot net services.pptx
Secure practices with dot net services.pptxSecure practices with dot net services.pptx
Secure practices with dot net services.pptx
 
Distributed Cache with dot microservices
Distributed Cache with dot microservicesDistributed Cache with dot microservices
Distributed Cache with dot microservices
 
Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)
 
Using InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in JmeterUsing InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in Jmeter
 
Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)
 
Stakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) PresentationStakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) Presentation
 
Introduction To Kaniko (DevOps) Presentation
Introduction To Kaniko (DevOps) PresentationIntroduction To Kaniko (DevOps) Presentation
Introduction To Kaniko (DevOps) Presentation
 
Efficient Test Environments with Infrastructure as Code (IaC)
Efficient Test Environments with Infrastructure as Code (IaC)Efficient Test Environments with Infrastructure as Code (IaC)
Efficient Test Environments with Infrastructure as Code (IaC)
 

Recently uploaded

Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving
 
An All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS MarketAn All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS Market
ScyllaDB
 
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdfLee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
leebarnesutopia
 
Discover the Unseen: Tailored Recommendation of Unwatched Content
Discover the Unseen: Tailored Recommendation of Unwatched ContentDiscover the Unseen: Tailored Recommendation of Unwatched Content
Discover the Unseen: Tailored Recommendation of Unwatched Content
ScyllaDB
 
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
DanBrown980551
 
New ThousandEyes Product Features and Release Highlights: June 2024
New ThousandEyes Product Features and Release Highlights: June 2024New ThousandEyes Product Features and Release Highlights: June 2024
New ThousandEyes Product Features and Release Highlights: June 2024
ThousandEyes
 
MongoDB vs ScyllaDB: Tractian’s Experience with Real-Time ML
MongoDB vs ScyllaDB: Tractian’s Experience with Real-Time MLMongoDB vs ScyllaDB: Tractian’s Experience with Real-Time ML
MongoDB vs ScyllaDB: Tractian’s Experience with Real-Time ML
ScyllaDB
 
Elasticity vs. State? Exploring Kafka Streams Cassandra State Store
Elasticity vs. State? Exploring Kafka Streams Cassandra State StoreElasticity vs. State? Exploring Kafka Streams Cassandra State Store
Elasticity vs. State? Exploring Kafka Streams Cassandra State Store
ScyllaDB
 
Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!
Ortus Solutions, Corp
 
Cyber Recovery Wargame
Cyber Recovery WargameCyber Recovery Wargame
Cyber Recovery Wargame
Databarracks
 
Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...
Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...
Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...
anilsa9823
 
Day 4 - Excel Automation and Data Manipulation
Day 4 - Excel Automation and Data ManipulationDay 4 - Excel Automation and Data Manipulation
Day 4 - Excel Automation and Data Manipulation
UiPathCommunity
 
Real-Time Persisted Events at Supercell
Real-Time Persisted Events at  SupercellReal-Time Persisted Events at  Supercell
Real-Time Persisted Events at Supercell
ScyllaDB
 
Multivendor cloud production with VSF TR-11 - there and back again
Multivendor cloud production with VSF TR-11 - there and back againMultivendor cloud production with VSF TR-11 - there and back again
Multivendor cloud production with VSF TR-11 - there and back again
Kieran Kunhya
 
ThousandEyes New Product Features and Release Highlights: June 2024
ThousandEyes New Product Features and Release Highlights: June 2024ThousandEyes New Product Features and Release Highlights: June 2024
ThousandEyes New Product Features and Release Highlights: June 2024
ThousandEyes
 
Day 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio FundamentalsDay 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio Fundamentals
UiPathCommunity
 
Session 1 - Intro to Robotic Process Automation.pdf
Session 1 - Intro to Robotic Process Automation.pdfSession 1 - Intro to Robotic Process Automation.pdf
Session 1 - Intro to Robotic Process Automation.pdf
UiPathCommunity
 
Building a Semantic Layer of your Data Platform
Building a Semantic Layer of your Data PlatformBuilding a Semantic Layer of your Data Platform
Building a Semantic Layer of your Data Platform
Enterprise Knowledge
 
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDBScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
ScyllaDB
 
Introduction to ThousandEyes AMER Webinar
Introduction  to ThousandEyes AMER WebinarIntroduction  to ThousandEyes AMER Webinar
Introduction to ThousandEyes AMER Webinar
ThousandEyes
 

Recently uploaded (20)

Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
 
An All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS MarketAn All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS Market
 
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdfLee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
 
Discover the Unseen: Tailored Recommendation of Unwatched Content
Discover the Unseen: Tailored Recommendation of Unwatched ContentDiscover the Unseen: Tailored Recommendation of Unwatched Content
Discover the Unseen: Tailored Recommendation of Unwatched Content
 
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
 
New ThousandEyes Product Features and Release Highlights: June 2024
New ThousandEyes Product Features and Release Highlights: June 2024New ThousandEyes Product Features and Release Highlights: June 2024
New ThousandEyes Product Features and Release Highlights: June 2024
 
MongoDB vs ScyllaDB: Tractian’s Experience with Real-Time ML
MongoDB vs ScyllaDB: Tractian’s Experience with Real-Time MLMongoDB vs ScyllaDB: Tractian’s Experience with Real-Time ML
MongoDB vs ScyllaDB: Tractian’s Experience with Real-Time ML
 
Elasticity vs. State? Exploring Kafka Streams Cassandra State Store
Elasticity vs. State? Exploring Kafka Streams Cassandra State StoreElasticity vs. State? Exploring Kafka Streams Cassandra State Store
Elasticity vs. State? Exploring Kafka Streams Cassandra State Store
 
Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!
 
Cyber Recovery Wargame
Cyber Recovery WargameCyber Recovery Wargame
Cyber Recovery Wargame
 
Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...
Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...
Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...
 
Day 4 - Excel Automation and Data Manipulation
Day 4 - Excel Automation and Data ManipulationDay 4 - Excel Automation and Data Manipulation
Day 4 - Excel Automation and Data Manipulation
 
Real-Time Persisted Events at Supercell
Real-Time Persisted Events at  SupercellReal-Time Persisted Events at  Supercell
Real-Time Persisted Events at Supercell
 
Multivendor cloud production with VSF TR-11 - there and back again
Multivendor cloud production with VSF TR-11 - there and back againMultivendor cloud production with VSF TR-11 - there and back again
Multivendor cloud production with VSF TR-11 - there and back again
 
ThousandEyes New Product Features and Release Highlights: June 2024
ThousandEyes New Product Features and Release Highlights: June 2024ThousandEyes New Product Features and Release Highlights: June 2024
ThousandEyes New Product Features and Release Highlights: June 2024
 
Day 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio FundamentalsDay 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio Fundamentals
 
Session 1 - Intro to Robotic Process Automation.pdf
Session 1 - Intro to Robotic Process Automation.pdfSession 1 - Intro to Robotic Process Automation.pdf
Session 1 - Intro to Robotic Process Automation.pdf
 
Building a Semantic Layer of your Data Platform
Building a Semantic Layer of your Data PlatformBuilding a Semantic Layer of your Data Platform
Building a Semantic Layer of your Data Platform
 
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDBScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
 
Introduction to ThousandEyes AMER Webinar
Introduction  to ThousandEyes AMER WebinarIntroduction  to ThousandEyes AMER Webinar
Introduction to ThousandEyes AMER Webinar
 

How to fine-tune and develop your own large language model.pptx

  • 1. How to fine-tune a Large Language Model Durgesh Gupta
  • 2. Lack of etiquette and manners is a huge turn off. KnolX Etiquettes  Punctuality Join the session 5 minutes prior to the session start time. We start on time and conclude on time!  Feedback Make sure to submit a constructive feedback for all sessions as it is very helpful for the presenter.  Silent Mode Keep your mobile devices in silent mode, feel free to move out of session in case you need to attend an urgent call.  Avoid Disturbance Avoid unwanted chit chat during the session.
  • 3. 1. What is Fine-tuning 2. Pre-trained Model Vs Fine-tuned Model 3. What is Pre-training? 4. Limitation of pre-trained base models 5. Advantage of fine-tuning your own LLM 6. What is Instruction fine-tuning 7. Data Preparation 8. Approach to fine-tuning 9. PEFT: Parameter Efficient fine-tuning 10. Error Analysis 11. Sample Training Code
  • 4. 01
  • 5. What is Fine-tuning?  Finetuning is tweaking the model’s parameters to make it suitable for performing a specific task.  We can fine-tune a pre-trained model or in simple words, train to perform a specific task such as sentiment analysis, text generation, finding document similarity, etc.  What fine-tuning does for the model? − Gets model to learn the data, rather than just get access to it. − Steers the model to more consistent outputs − Reduce hallucinations − Customizes the model to a specific use case.
  • 7. 02
  • 8. Pre-trained Model Vs Fine-tuned Model  No data to get started  Smaller upfront cost  No technical/training knowledge  Connect data through retrieval (RAG)  More Generic data fits  Hallucinations  RAG misses or gets incorrect data. Pre-trained Model  Domain specific data required  Involves Upfront compute cost  Needs technical expertise.  Use RAG too (More Secure)  More high-quality domain specific data  Learn new information  Able to correct incorrect information Fine-tuned Model Note: Less cost afterwards if smaller model
  • 9. 03
  • 10. Training Model to learn text-generation  Training LLMs from scratch is known as pre-training.  It is a technique in which a large language model is trained on a vast amount of unlabeled text.  Utilizing the concept of Self-Supervised Learning, model masks a word and tries to predict the next word with the help of the preceding words.  Pre-training, it is a technique in which the model learns to predict the next word in the text.  Example: I am a data scientist. − The model can create its own labeled data from this sentence like: Text Label I am I am a I am a data I am a data scientist
  • 11. 04
  • 12. Limitation of Pre-trained Model  Contextual Understanding: Difficult differentiating context.  Generating Misinformation: May generate incorrect or misleading information.  Lack of Creativity: Creativity based on mimicking patterns.  Hallucination: Generates text that is erroneous, nonsensical, or detached from reality.
  • 13. 05
  • 14. Benefit of fine-tuning your own LLM  Performance − Less Hallucination − Increase Consistency − Reduce unwanted information  Privacy − On Prem − Prevent Leakage − No breaches  Reliability − Control Uptime − Lower Latency − Increased Transparency − Greater Control
  • 15. Impact of fine-tuning on the model  Behavior Change − Learning to respond more consistently − Learning to focus, e.g., moderation − Teasing out capability, e.g., better at conversation  Gain Knowledge − Increasing knowledge of new specific concepts. − Correcting old incorrect information
  • 16. 06
  • 17. What is instruction fine-tuning?  Instruction fine-tuning is a specialized technique to tailor large language models to perform specific tasks based on explicit instructions.  It refers to the process of further training LLMs on a dataset consisting of instruction, output pairs in a supervised fashion, which bridges the gap between the next-word prediction objective of LLMs and the users' objective of having LLMs adhere to human instructions.  Teaches model to behave more like a chat bot.  Better user interface for model interaction − Increased AI adoption, from the thousands of researchers to million of people  Can access model pre-existing knowledge.
  • 18. Instruction following datasets  Some existing data is ready as-in online: − FAQ's − Customer Support Conversation − Slack Messages
  • 19. 07
  • 20. Data Selection Criteria  Higher Quality  Diversity  Real  More Better  Lower Quality  Homogeneity  Generated  Less Worse
  • 21. Steps to prepare your data 1. Collect instruction-response pairs 2. Concatenate pairs (add prompt template, if required) 3. Tokenization: Pad, Truncate 4. Split into train/test
  • 22. Tokenization  Tokenization is the process of splitting text into individual units, typically words or sub words.  This step is crucial for the model to understand the structure of the text.  In languages like English, tokenization is relatively straightforward, as words are typically separated by spaces.
  • 23. Tokenization This is an input text. [CLS] This is an input text . [SEP] 101 2023 2003 1037 7953 2058 1012 102 ENCODING
  • 24. 08
  • 25. Approach To Fine-tune LLM  Figure out the task.  Data collection related to the task: input/output pairs.  Data generation, if required  Fine tune a small model e.g., 50M-1B  Vary the amount of data you give your model.  Evaluate the model performance.  Collect more data to improve.  Increase task complexity  Increase the model size for performance. The steps for fine-tuning the Large Language Model are:
  • 27. 09
  • 28. PEFT: Parameter Efficient Fine Tuning  PEFT stands for Parameter Efficient Fine-tuning.  ML models are essentially complex mathematical equations with numerous coefficients or weights.  These coefficients are responsible for the model behavior and make it capable of learning from data.  During training of ML models, we adjust these coefficients to minimize errors and make accurate predictions.  In case of LLMs, which can have billions of parameters, and changing all of them during training can be computationally expensive and memory- intensive.  PEFT, as a subset of fine-tuning, takes the parameter efficiency seriously.  Instead of altering all the coefficients of the model, PEFT selects a subset of them.  It helps us significantly reducing the computational and memory requirements.
  • 30. PEFT: Parameter Efficient Fine Tuning  LoRA (Low-Rank Adoption): − It is a technique exploits the fact that some weights have more significant impacts than others. In LoRA, the large weight matrix is divided into two smaller matrices by factorization. − We reduce the number of coefficients that need adjustment, making the fine-tuning process more efficient.  QLoRA (Quantization + Low-Rank Adoption): − Quantization involves converting high-precision floating-point coefficients into lower-precision representations, such as 4-bit integers. − Quantization offers a solution by reducing the precision of these coefficients. − For instance, a 32-bit floating-point number can be represented as a 4- bit integer within a specific range. This conversion significantly shrinks the memory footprint. LoRA and QLoRA for Coefficient Selection
  • 31. 10
  • 32. Evaluating Generative AI model  Huaman Evaluation: Human Expert Evaluation is most reliable.  Test Data- Good test data is crucial − High Quality − Accurate − Generalize − Not seen in training data  Elo Rankings − Ranking of the top LLMs based on their Elo scores. − The Elo scores are computed from the results of A/B tests, wherein the LLMs are pitted against each other in a series of games. − The ranking system employed is based on the Elo Rating System. Evaluating Generative Models are Notoriously difficult !
  • 33. Error Analysis • Understand the base model behaviour before finetuning • Categorize errors: iterate on data to fix these problems in data. Category Example with Problem Example Fixed Misspelling Your kidney is healthy, but you lever is sick, get your lever examined Your kidney is healthy, but your liver is sick Too Long Diabetes is less likely when you eat a healthy diet makes diabetes less likely, making …...... Diabetes is less likely when you eat a healthy diet Repetitive Medical LLMs can save healthcare workers time and money and time and money and time and money. Medical LLMs can save healthcare workers time and money
  • 34. 11
  翻译: