A Comprehensive Review of Large Language Models for Code Generation
Presented By: Sai Pragna Kancheti
INTRODUCTION:
• ChatGPT-like chatbots have become popular in recent times. These chatbots are
general-purpose natural language processing tools that use artificial
intelligence to generate text after a user enters a prompt.
• Although these chatbots are built for general-purpose use, they are also good at
generating code from user prompts using Large Language Models.
• In this presentation, we systematically review Large Language Models for code
generation based on user prompts.
• At the end, based on the results, we present some insights for further
research in this direction.
What are LLMs?
• A large language model is a more advanced kind of language model that is
trained on vast volumes of text data using deep learning techniques.
• These models can generate human-like text and perform a variety of natural
language processing tasks.
• The complexity of a language model can range from simple n-gram models to
more complex neural network models.
• Examples: GPT-3 (Generative Pre-trained Transformer 3), BERT (Bidirectional
Encoder Representations from Transformers), RoBERTa (Robustly Optimized
BERT Pretraining Approach), etc.
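To make the "simple n-gram" end of that range concrete, here is a minimal sketch of a bigram model in Python. The toy corpus and the function names `train_bigram` and `next_token_probs` are illustrative assumptions, not part of the presentation.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Normalize the raw counts for `prev` into a probability distribution."""
    total = sum(counts[prev].values())
    return {tok: c / total for tok, c in counts[prev].items()}


# Hypothetical toy corpus: one tokenized line of Python.
corpus = "def add ( a , b ) : return a + b".split()
model = train_bigram(corpus)
probs = next_token_probs(model, "a")
```

A neural language model replaces the count table with a learned function of the whole preceding context, which is what lets it scale to the tasks listed above.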
LLMs for code generation
• Recent models excel at tasks like code completion and code synthesis
from natural language descriptions.
• One such promising line of work is that of Austin et al. (2021), which has
demonstrated significant progress toward AI-based programming aid.
• One of the largest of these models, Codex (Chen et al., 2021), has been
deployed as an in-IDE developer assistant that automatically generates code
based on the user's context in the real-world production tool GitHub Copilot.
• Despite the enormous success of large language models of code, the most
powerful models are not publicly accessible.
LLMs for code generation
Some of the existing models of code, their sizes, and their availability
(open source or not) are shown in the figure.
Challenges with the available LLMs for code generation
• Although these models perform well at generating code from user prompts, the
following challenges need to be addressed for further development in this area:
• There was no large open-source language model trained almost exclusively on
code from multiple programming languages.
• Powerful models that are publicly accessible are lacking.
• The internals of the most powerful models cannot be accessed.
• This limits how widely these models can be applied to code generation tasks and
inhibits research in this field for low-resource organizations.
PRETRAINING METHODS
Types of Pretraining Methods
Left-to-Right Language Models
• Auto-regressive, left-to-right language models predict the probability of a
token given the sequence of tokens that precede it.
• The sequential, left-to-right operation of these models is especially useful for
program generation tasks such as code auto-completion.
• However, because code is often not written in a single left-to-right pass,
using context that appears "after" the point of generation is difficult.
• Examples: CodeParrot, GPT-Neo, GPT-J (6B), Codex (12B), GPT-NeoX (20B),
and Google's 137B model (Austin et al., 2021).
• Models of this type are the ones considered in this review.
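The left-to-right factorization can be sketched in a few lines of Python: the probability of a sequence is the product of each token's probability given its prefix. The conditional model `toy_cond_prob` and its four-token vocabulary are hypothetical stand-ins for a trained network, chosen only so the sketch runs on its own.

```python
import math


def sequence_log_prob(tokens, cond_prob):
    """Sum log p(token_t | tokens before t), scanning left to right."""
    total = 0.0
    for t, token in enumerate(tokens):
        prefix = tokens[:t]  # only earlier tokens are visible to the model
        total += math.log(cond_prob(token, prefix))
    return total


def toy_cond_prob(token, prefix):
    """Hypothetical model over the vocabulary {x, =, 1, <eos>}."""
    if prefix and prefix[-1] == "x":
        # After "x", an assignment is most likely; the rest share 0.3.
        return 0.7 if token == "=" else 0.1
    return 0.25  # otherwise uniform over the four tokens


log_p = sequence_log_prob(["x", "=", "1"], toy_cond_prob)
```

Because `prefix` never includes later tokens, the sketch also shows the limitation noted above: nothing to the right of the generation point can influence a prediction.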
Masked Language Models
• While auto-regressive language models are powerful for modeling the
probability of sequences, their unidirectional nature makes them less suitable
for producing effective whole-sequence representations for downstream tasks
such as classification.
• One popular bidirectional objective used widely in representation learning is
masked language modeling, where the aim is to predict masked pieces of text
based on the surrounding context.
• Examples: CodeBERT (125M) and CuBERT (345M).
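The masking step behind this objective can be sketched as follows. This is a simplified illustration rather than the exact recipe of CodeBERT or CuBERT: the `[MASK]` string, the masking probability, and the function name `mask_tokens` are assumptions made for the example.

```python
import random

MASK = "[MASK]"  # assumed placeholder token


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens; return the masked input and the targets."""
    rng = random.Random(seed)
    masked, labels = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels[i] = tok  # the model must recover this from both sides
        else:
            masked.append(tok)
    return masked, labels


tokens = "def add ( a , b ) : return a + b".split()
masked, labels = mask_tokens(tokens, mask_prob=0.3, seed=1)
```

Unlike the left-to-right setting, the model sees the tokens on both sides of every masked position, which is what makes the learned representations bidirectional.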
Encoder-decoder Models
• An encoder-decoder model first uses an encoder to encode an input
sequence, and then uses a left-to-right LM to decode an output sequence
conditioned on the input sequence.
• Popular pretraining objectives include masked span prediction, where the
input sequence is randomly masked with multiple masks and the output
sequence is the masked contents in order, and denoising sequence
reconstruction, where the input is a corrupted sequence and the output is the
original sequence.
• These pretrained models are useful in many sequence-to-sequence tasks.
• Examples: CodeT5 (220M) and PLBART (406M).
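Masked span prediction can be illustrated by building one input/output pair by hand. In this sketch the spans are passed in explicitly instead of being sampled at random, and the sentinel names `<mask_0>`, `<mask_1>` are hypothetical; real models define their own special tokens.

```python
def mask_spans(tokens, spans):
    """Replace each span with a sentinel; the target lists the hidden contents.

    `spans` must be sorted, non-overlapping (start, end) pairs, end exclusive.
    """
    source, target = [], []
    cursor = 0
    for n, (start, end) in enumerate(spans):
        sentinel = f"<mask_{n}>"
        source.extend(tokens[cursor:start])  # visible text before the span
        source.append(sentinel)              # the span collapses to one token
        target.append(sentinel)              # the target repeats the sentinel...
        target.extend(tokens[start:end])     # ...followed by the hidden tokens
        cursor = end
    source.extend(tokens[cursor:])
    return source, target


tokens = "def add ( a , b ) : return a + b".split()
source, target = mask_spans(tokens, [(1, 2), (9, 12)])
```

The encoder reads `source` and the decoder is trained to emit `target` left to right, so the masked contents appear in order, as described above.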
COMPARED MODELS
Existing Models
• Codex: Codex is a Large Language Model (LLM) that has been fine-tuned on
publicly available Python code from GitHub.
• This model builds on GPT-3 because of its substantial proficiency in generating Python
programs. Despite being considerably smaller than GPT-3, with a total of 12 billion
parameters, Codex still exhibits remarkable performance.
• GPT-Neo: GPT-Neo is a series of large language models that have been trained
on the Pile dataset.
• These models, similar to GPT-3, are available in different sizes, including 125M, 1.3B,
and 2.7B parameter versions.
• The GPT-Neo 2.7B version, in particular, is a Transformer model that has been
developed based on EleutherAI's recreation of the GPT-3 architecture.
Existing Models
• GPT-J: GPT-J, developed by EleutherAI, is an open-source model with 6 billion
parameters, trained on the Pile dataset.
• It largely adheres to the GPT-2 architecture and is reported as the best-performing
publicly available Transformer language model in terms of zero-shot performance
on a range of downstream tasks.
• CodeParrot: CodeParrot is a model based on GPT-2, with 1.5 billion
parameters, which has been fine-tuned on publicly accessible code from
GitHub for the purpose of generating Python code.
Introduced model: PolyCoder
• To overcome the challenges of the available LLMs for code generation, a new
model, PolyCoder, is introduced. It has 2.7 billion parameters and is trained on
a diverse range of repositories sourced from GitHub, covering 12 distinct
programming languages, as shown in the table.
PolyCoder’s Training
• PolyCoder uses the GPT-2 model architecture.
• To investigate the effect of model size scaling, it was trained at three
different model sizes: 2.7 billion, 400 million, and 160 million parameters,
with the largest 2.7B model matching GPT-Neo's capacity to allow a fair
comparison.
• The 2.7 billion parameter model is a 32-layer, 2,560-dimensional Transformer
model with a maximum context window of 2,048 tokens, and it was trained
using a batch size of 128 sequences (262K tokens) for a total of 150K steps.
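The batch and step figures above can be cross-checked with a short calculation. The total below is an upper bound under the assumption that every sequence fills the full 2,048-token context window; the variable names are chosen for this sketch only.

```python
# Figures quoted on the slide for the 2.7B PolyCoder model.
batch_sequences = 128      # sequences per batch
context_window = 2048      # maximum tokens per sequence
training_steps = 150_000   # optimizer steps

# Tokens in one batch if every sequence is full length.
tokens_per_batch = batch_sequences * context_window

# Tokens processed over the whole run under the same assumption.
total_tokens = tokens_per_batch * training_steps

print(f"{tokens_per_batch:,} tokens per batch")
print(f"{total_tokens:,} tokens over training")
```

The first product is 262,144, which matches the "262K tokens" quoted for the batch size, and the second comes to roughly 39 billion tokens for the full run.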
PolyCoder’s Training
• The following table compares design decisions and hyperparameters in
training different models of code.
PolyCoder’s Training
• The following figure shows the training and validation loss during the
150K-step training process.
Results
Results of Extrinsic Evaluations:
• Among the current models, PolyCoder performs less effectively than the comparably
sized GPT-Neo and even the smaller Codex 300M. Overall, PolyCoder ranks behind
Codex and GPT-Neo/J, but outperforms CodeParrot.
• Despite being trained exclusively on code, PolyCoder lags behind a model of similar
size, GPT-Neo 2.7B, which was trained on the Pile, a mix of both code and natural
language texts.
• This finding implies that future studies could benefit from mixing code from diverse
programming languages with natural language text.
Results of Extrinsic Evaluations:
• The following table shows the results of different models on the HumanEval
benchmark, and the number of different types of tokens seen during the
training process.
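The slides do not say which metric the table reports. HumanEval results are commonly given as pass@k, using the unbiased estimator introduced with Codex (Chen et al., 2021), so the sketch below assumes that metric. The function name `pass_at_k` and the sample counts in the usage line are illustrative.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate the chance that at least one of k samples passes the tests.

    n: samples generated for a problem, c: samples that passed, k: budget.
    """
    if n - c < k:
        # Fewer than k failing samples exist, so any k draws include a pass.
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical problem: 10 samples generated, 3 of them pass.
estimate = pass_at_k(n=10, c=3, k=1)
```

The benchmark score is then the mean of this estimate over all problems.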
Results of Intrinsic Evaluations
• Interestingly, PolyCoder surpasses Codex and all other models on the C
language. Among open-source models only, PolyCoder outperforms the
similarly sized GPT-Neo 2.7B in C, JavaScript, Rust, Scala, and TypeScript.
• In the remaining 11 languages apart from C, all open-source models, including
the newly introduced PolyCoder, perform significantly worse (higher
perplexity) than Codex.
• This observation could imply that for languages where larger models don't yield extra
benefits, training the model solely on code might be sufficient or even slightly more
advantageous than training on a combination of natural language and code.
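Perplexity, the intrinsic metric used here, is the exponential of the average negative log-likelihood a model assigns to each token, so lower is better. A minimal sketch, assuming natural-log token probabilities are already available; the two probability lists are made-up values for illustration.

```python
import math


def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood (natural log) per token."""
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Hypothetical models scoring the same four tokens.
uncertain = [math.log(0.25)] * 4  # spreads its guess over four options
confident = [math.log(0.9)] * 4   # nearly sure of every token

ppl_uncertain = perplexity(uncertain)
ppl_confident = perplexity(confident)
```

A model that assigns each token probability 1/4 has a perplexity of about 4, meaning it is as uncertain as a uniform choice among four tokens.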
Conclusions
• We have presented the results of a systematic evaluation of large language models for
code. The findings generally indicate that performance improves with bigger models
and longer training.
• Based on the results, we infer that GPT-Neo's superior performance over PolyCoder in
certain languages suggests that training on both natural language text and code can
enhance code modeling.
• However, it is noteworthy that for the C programming language, PolyCoder
outperforms all models, including Codex, by achieving a lower perplexity.
Thank You

A Comprehensive Review of Large Language Models for.pptx

  • 1. A Comprehensive Review of Large Language Models for Code Generation Presented By: Sai Pragna Kancheti
  • 2. INTRODUCTION:
      • Chatbots such as ChatGPT have recently become popular. They are general-purpose natural language processing tools that use artificial intelligence to generate text in response to a user's prompt.
      • Although these chatbots are built for general use, the large language models behind them are also good at generating code from user prompts.
      • This presentation systematically reviews large language models for code generation based on user prompts.
      • It closes with insights, drawn from the results, for further research in this direction.
  • 3. What are LLMs?
      • A large language model is a more advanced kind of language model, trained on vast volumes of text data using deep learning techniques.
      • These models can generate human-like text and perform a variety of natural language processing tasks.
      • The complexity of a language model can range from simple n-gram models to more complex neural network models.
      • Examples: GPT-3 (Generative Pre-trained Transformer 3), BERT (Bidirectional Encoder Representations from Transformers), RoBERTa (Robustly Optimized BERT Pretraining Approach), etc.
  • 4. LLMs for code generation
      • Recent models excel at tasks such as code completion and code synthesis from natural language descriptions.
      • One promising recent model is that of Austin et al. (2021), which has demonstrated significant progress toward AI-based programming assistance.
      • One of the largest of these models, Codex (Chen et al., 2021), has been deployed as an in-IDE developer assistant that automatically generates code based on the user's context in the real-world production tool GitHub Copilot.
      • Despite the enormous success of large language models of code, the most powerful models are not publicly accessible.
  • 5. LLMs for code generation
      • The figure shows some existing models of code, their sizes, and their availability (open source or not).
  • 6. Challenges with the available LLMs for code generation
      • Although these models perform well at generating code from user prompts, the following challenges must be addressed before the field can develop further:
      • There was no large open-source language model trained almost exclusively on code from multiple programming languages.
      • Powerful models are not publicly accessible.
      • The models' internals cannot be accessed.
      • This prevents these models from being applied to code generation tasks and inhibits research in this field for low-resource organizations.
  • 9. Left-to-Right Language Models
      • Auto-regressive, left-to-right language models predict the probability of each token given the sequence of tokens that precedes it.
      • This sequential, left-to-right operation is especially useful for program generation tasks such as code auto-completion.
      • However, because code is often not written in a single left-to-right pass, these models have difficulty using context that appears "after" the point of generation.
      • Examples: CodeParrot, GPT-Neo, GPT-J (6B), Codex (12B), GPT-NeoX (20B), and Google's 137B model (Austin et al., 2021).
      • These are the types of models considered in this review.
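The left-to-right factorization on this slide can be sketched with a toy model. The bigram table below is invented purely for illustration and does not come from any of the reviewed models; a real left-to-right model conditions on the whole prefix rather than only the previous token.

```python
import math

# Hypothetical toy probabilities P(next | previous); the values are made up.
BIGRAM = {
    ("<s>", "def"): 0.5,
    ("def", "f"): 0.2,
    ("f", "("): 0.9,
    ("(", ")"): 0.6,
    (")", ":"): 0.8,
}

def sequence_log_prob(tokens, table, unseen=1e-6):
    """Sum log P(token | previous token), scanning strictly left to right."""
    total = 0.0
    prev = "<s>"  # assumed start-of-sequence marker
    for tok in tokens:
        # Pairs missing from the table get a small fallback probability.
        total += math.log(table.get((prev, tok), unseen))
        prev = tok
    return total

tokens = ["def", "f", "(", ")", ":"]
log_p = sequence_log_prob(tokens, BIGRAM)
print(f"log P = {log_p:.4f}, P = {math.exp(log_p):.4f}")
```

Each factor only looks backwards, which is why such a model can complete code after the cursor but cannot directly use tokens that come later in the file.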
  • 10. Masked Language Models
      • While auto-regressive language models are powerful for modeling the probability of sequences, their unidirectional nature makes them less suitable for producing effective whole-sequence representations for downstream tasks such as classification.
      • One popular bidirectional objective, widely used in representation learning, is masked language modeling, where the aim is to predict masked pieces of text from the surrounding context.
      • Examples: CodeBERT (125M) and CuBERT (345M).
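A minimal sketch of how training inputs for a masked objective might be built. The `<mask>` string, the masking probability, and the function name are illustrative assumptions, not details taken from CodeBERT or CuBERT, and real recipes usually add refinements that are omitted here.

```python
import random

MASK = "<mask>"  # assumed placeholder token

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Return (masked_input, targets); targets maps position -> original token."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = tok   # the model must recover this token
        else:
            masked.append(tok)
    return masked, targets

tokens = "def add ( a , b ) : return a + b".split()
masked, targets = mask_tokens(tokens, mask_prob=0.3, seed=1)
print(masked)
print(targets)
```

The model sees tokens on both sides of every masked position, which is the bidirectional context the slide refers to.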
  • 11. Encoder-decoder Models
      • An encoder-decoder model first uses an encoder to encode an input sequence, and then uses a left-to-right LM to decode an output sequence conditioned on the input sequence.
      • Popular pretraining objectives include masked span prediction, where the input sequence is randomly masked with multiple masks and the output sequence is the masked contents in order, and denoising sequence reconstruction, where the input is a corrupted sequence and the output is the original sequence.
      • These pretrained models are useful in many sequence-to-sequence tasks.
      • Examples: CodeT5 (220M) and PLBART (406M).
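Masked span prediction can be illustrated as follows. This is a sketch under assumptions: the `<extra_id_N>` sentinel naming and the hand-picked spans are illustrative choices, whereas actual pretraining samples the spans at random.

```python
def corrupt_spans(tokens, spans):
    """Replace each span with a sentinel; return (encoder_input, decoder_target).

    spans: sorted, non-overlapping, half-open (start, end) index pairs.
    """
    source, target = [], []
    cursor = 0
    for n, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{n}>"          # assumed sentinel format
        source.extend(tokens[cursor:start])   # keep the unmasked prefix
        source.append(sentinel)               # one mask stands in for the span
        target.append(sentinel)
        target.extend(tokens[start:end])      # masked contents, in order
        cursor = end
    source.extend(tokens[cursor:])
    return source, target

tokens = "def add ( a , b ) : return a + b".split()
src, tgt = corrupt_spans(tokens, spans=[(1, 2), (8, 11)])
print(" ".join(src))
print(" ".join(tgt))
```

The encoder reads the corrupted sequence and the decoder emits the target left to right, so one mask can stand for a span of any length.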
  • 13. Existing Models
      • Codex: Codex is a large language model (LLM) fine-tuned on publicly available Python code from GitHub.
      • It is built on GPT-3, chosen for its substantial proficiency in producing Python programs. Although much smaller than GPT-3, at 12 billion parameters, Codex still performs remarkably well.
      • GPT-Neo: GPT-Neo is a series of large language models trained on the Pile dataset.
      • Like GPT-3, these models come in different sizes, including 125M, 1.3B, and 2.7B parameter versions.
      • GPT-Neo 2.7B, in particular, is a transformer model based on EleutherAI's replication of the GPT-3 architecture.
  • 14. Existing Models
      • GPT-J: GPT-J, developed by EleutherAI, is an open-source model with 6 billion parameters, trained on the Pile dataset.
      • It largely follows the GPT-2 architecture and stands out as the best-performing publicly available transformer language model in terms of zero-shot performance on a range of downstream tasks.
      • CodeParrot: CodeParrot is a GPT-2-based model with 1.5 billion parameters, fine-tuned on publicly available code from GitHub to generate Python code.
  • 15. Introduced model: PolyCoder
      • To overcome the challenges of the available LLMs for code generation, a new model, PolyCoder, is introduced. It has 2.7 billion parameters and is trained on a diverse set of repositories from GitHub covering 12 programming languages, as shown in the table.
  • 16. PolyCoder’s Training
      • PolyCoder uses the GPT-2 model architecture.
      • To investigate the effect of scaling model size, it was trained at three sizes: 2.7 billion, 400 million, and 160 million parameters. The largest, 2.7B, model matches GPT-Neo's capacity to allow a fair comparison.
      • The 2.7 billion parameter model is a 32-layer, 2,560-dimensional Transformer with a maximum context window of 2,048 tokens. It was trained with a batch size of 128 sequences (262K tokens) for a total of 150K steps.
  • 17. PolyCoder’s Training
      • The following table compares the design decisions and hyperparameters used in training different models of code.
  • 18. PolyCoder’s Training
      • The following figure shows the training and validation loss over the 150K-step training process.
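The training figures quoted for the 2.7B model can be checked with a short calculation. It assumes every sequence in a batch fills the whole context window, so the total is an upper bound on the tokens processed and counts any repeated passes over the data.

```python
context_window = 2048      # tokens per sequence (from the slide)
batch_sequences = 128      # sequences per batch (from the slide)
training_steps = 150_000   # training steps (from the slide)

tokens_per_batch = batch_sequences * context_window
total_tokens = tokens_per_batch * training_steps

print(f"tokens per batch: {tokens_per_batch:,}")
print(f"tokens over training: {total_tokens:,}")
```

The per-batch figure works out to 262,144, which is the "262K tokens" on the slide, and the run as a whole processes roughly 39 billion tokens under the stated assumption.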
  • 20. Results of Extrinsic Evaluations
      • Among the current models, PolyCoder performs less effectively than the comparably sized GPT-Neo and even the smaller Codex 300M. Overall, PolyCoder ranks behind Codex and GPT-Neo/J but outperforms CodeParrot.
      • Despite being trained exclusively on code, PolyCoder lags behind a model of similar size, GPT-Neo 2.7B, which was trained on the Pile, a mix of code and natural language text.
      • This finding suggests that future work could benefit from training on a mix of code from diverse programming languages and natural language text.
  • 21. Results of Extrinsic Evaluations
      • The following table shows the results of different models on the HumanEval benchmark, along with the number of tokens of each type seen during training.
  • 22. Results of Intrinsic Evaluations
      • Interestingly, PolyCoder surpasses Codex and all other models on the C language. Among open-source models only, PolyCoder outperforms the similarly sized GPT-Neo 2.7B in C, JavaScript, Rust, Scala, and TypeScript.
      • In the 11 languages other than C, all open-source models, including the newly introduced PolyCoder, perform significantly worse (higher perplexity) than Codex.
      • This observation could imply that, for languages where larger models do not yield extra benefits, training solely on code might be sufficient or even slightly more advantageous than training on a combination of natural language and code.
  • 23. Conclusions
      • We have presented the results of a systematic evaluation of large language models for code. The findings generally indicate that performance improves with larger models and longer training.
      • GPT-Neo's superior performance over PolyCoder in certain languages suggests that training on both natural language text and code can improve code modeling.
      • However, it is noteworthy that on the C programming language PolyCoder outperforms all models, including Codex, by achieving a lower perplexity.
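Since the intrinsic results are reported as perplexity, a small sketch of the usual definition may help: the exponential of the average negative log-likelihood per token, where lower is better. The per-token probabilities below are hypothetical and are not measurements from any of the models discussed.

```python
import math

def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood (natural log) per token."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# Hypothetical probabilities two models assign to the same four tokens.
model_a = [math.log(p) for p in (0.5, 0.5, 0.25, 0.25)]
model_b = [math.log(p) for p in (0.25, 0.25, 0.125, 0.125)]

print(perplexity(model_a), perplexity(model_b))
```

Here the first model assigns higher probability to every token and therefore has the lower perplexity, which is the sense in which one model "outperforms" another in the intrinsic evaluation.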