尊敬的 微信汇率:1円 ≈ 0.046215 元 支付宝汇率:1円 ≈ 0.046306元 [退出登录]
SlideShare a Scribd company logo
20IT011-COMPILER DESIGN
Text Book :
Alfred V Aho, Monica S Lam, Ravi Sethi and Jeffrey D Ullman, “Compilers - Principles,
Techniques and Tools”, Pearson Education, 2nd Edition, 2018.
COURSE OUTCOME
CO1: Ability to understand the functionalities of the different phases of compiler.
CO2: Ability to understand syntax-directed translation and run-time environment.
CO3: Ability to implement code optimization techniques.
CO4: Ability to apply different parsing algorithms to develop the parsers for a
given grammar.
CO5: Ability to design a scanner and a parser using LEX and YACC tools.
UNIT I
INTRODUCTION TO COMPILERS
UNIT-I SYLLABUS
Structure of a compiler – Lexical Analysis – Role
of Lexical Analyzer -Input Buffering –
Specification of Tokens – Recognition of Tokens-
Lex – Finite Automata-Regular Expressions to
Automata – Minimizing DFA.
STRUCTURE OF A COMPILER
• Definition: A program that accept as input a program text in a certain language
and produces as output a program text in another language (Grune et all)
• Compiler is a translator program that reads a program written in one language -
the source language- and translates it into an equivalent program in another
language-the target language.
COMPILATION PROCESS
PHASES OF A COMPILER
Lexical Analysis
• In a compiler, linear analysis is called lexical analysis or scanning.
• It takes source code as input
• It reads one character at a time and convert into lexemes( represent in the
form of tokens)
Example: position = initial + rate * 60
✓ The identifier position.
✓ The assignment symbol =.
✓ The identifier initial.
✓ The plus sign.
✓ The identifier rate.
✓ The multiplication sign.
✓ The number 60.
<Id,Position> <=,> <Id,initial> <+,> <Id,rate><*,> <num,60>
Contd..
• Needs to record each id attribute: keep a symbol table
• Lexical Analysis eliminates white spaces, etc..
Syntax Analysis
• It is called as parsing or Syntax Analysis
• It takes token as input and generate parse tree as output.
• The parser checks that the expression made by tokens is
syntactically correct or not.
• Context free grammars formalize the rules and guide syntax analysis.
Semantic Analysis
• It checks the parse tree follows the rules of language
• It keeps tracks of identifiers, types and expressions
• Output is annotated as tree syntax
Intermediate Code Generator
• After semantic analysis, some compilers generate an explicit
intermediate representation of the source program.
• This representation should be easy to produce and easy to translate
into the target program.
Code Optimization
• It is used to improve the intermediate code
• So the output can run fast and take less space
• It removes unnecessary lines of code and arrange the sequence in
order to speed up the execution.
Code Generation
• The final phase of the compiler is the generation of target code,
consisting of relocatable machine code or assembly code.
• The intermediate instructions are each translated into sequence of
machine instructions that perform the same task.
Symbol table Manager
• An essential function of a compiler is to record the identifiers used in
the source program and collect its information.
• A symbol table is a data structure containing a record for each
identifier with fields for attributes.(such as, its type, its scope, if
procedure names then the number and type of arguments etc.,)
• The data structure allows to find the record for each identifier and
store or retrieve data from that record quickly.
Example: sum=initial + value * 10 1 sum
2 initial
3 value
4 ….
Error Handling and Reporting
• Each phase can encounter errors. After detecting an error, a phase must deal that
error, so that compilation can proceed ,allowing further errors to be detected.
• Lexical phase can detect error where the characters remaining in the input do not
form any token.
• The syntax and semantic phases handle large fraction of errors. The stream of
tokens violates the syntax rules are determined by syntax phase.
• During semantic, the compiler tries to detect constructs that have the right
syntactic structure but no meaning to the operation involved.
Role of Lexical Analyser
Role of lexical Analyzer
• As the first phase of a compiler, the main task of the lexical analyzer is
to read the input characters of the source program, group them into
lexemes, and produce as output a sequence of tokens for each lexeme
in the source program.
• The stream of tokens is sent to the parser for syntax analysis.
• It is common for the lexical analyzer to interact with the symbol table
as well.
• When the lexical analyzer discovers a lexeme constituting an
identifier, it needs to enter that lexeme into the symbol table.
Contd..
• In some cases, information regarding the kind of identifier may be
read from the symbol table by the lexical analyzer to assist it in
determining the proper token it must pass to the parser.
Interaction between the lexical analyzer and the parser
• The interaction is implemented by having the parser call the lexical
analyzer.
• The call, suggested by the getNextToken command, causes the
lexical analyzer to read characters from its input until it can identify
the next lexeme and produce for it the next token, which it returns to
the parser.
• Since the lexical analyzer is the part of the compiler that reads the
source text, it may perform certain other tasks besides identification of
lexemes
Contd..
• One such task is stripping out comments and whitespace (blank,
newline, tab, and perhaps other characters that are used to separate
tokens in the input).
• Another task is correlating error messages generated by the compiler
with the source program.
• lexical analyzers are divided into a cascade of two processes:
• Scanning consists of the simple processes that do not require tokenization of the
input, such as deletion of comments and compaction of consecutive whitespace
characters into one.
• Lexical analysis proper is the more complex portion, where the scanner produces the
sequence of tokens as output.
Lexical Analysis and Parsing
• Simplicity of design – avoid complexity to work together .
• Improving Compiler Efficiency - A separate lexical analyzer allows us to apply
specialized techniques that serve only the lexical task, not the job of parsing. In
addition, specialized buffering techniques for reading input characters can speed
up the compiler significantly.
• Enhancing Compiler Portability -Input-device-specific peculiarities can be
restricted to the lexical analyzer.
Tokens, Patterns, Lexemes
• A token is a pair consisting of a token name and an optional
attribute value.
• A pattern is a description of the form that the lexemes of a token
may take.
• A lexeme is a sequence of characters in the source program that
matches the pattern for a token and is identified by the lexical analyzer
as an instance of that token.
Example
Attributes for tokens
Lexical Errors
• Some errors are out of power of lexical analyzer to recognize:
✓fi(a==f(x))..
• However it may be able to recognize errors like:
✓d=2r
• Such errors are recognized when no pattern for tokens matches a character
sequence
Example
Error Recovery
• Panic Mode: Successive characters are ignored until we reach to a well
formed token
• Delete one character from the remaining input
• Insert a missing character into the remaining input
• Replace a character by another character
• Transpose two adjacent characters
INPUT BUFFERING
Input Buffering
• Input buffer helps to find the correct lexeme.
• Sometimes lexical analyzer needs to look ahead some symbols to decide about
the token to return
- in C language: we need to look after -,= or < to decide what token to return
- in FORTRAN: DO 5 I=1.25
• We need to introduce a two buffer scheme to handle large look-aheads safely
Buffer Pairs
Two pointers to the input are maintained:
I. Pointer lexemeBegin, marks the beginning of the current lexeme,
whose extent we are attempting to determine.
II. Pointer forward scans ahead until a pattern match is found
Sentinels
• A sentinel is a special character or token used to mark the end of
a sequence.
• In the context of a lexical analyzer, a sentinel can be utilized to
indicate the end of the input source code or the end of a specific
token.
QUIZ
1. ______________ is the heart of the compiler.
2.what will be the next phase of Intermediate code generator?
3. What is the output of Lexical analyzer?
4. Token consist of pairs, what are they?
5. What is the command ,the parser generate to lexical analyzer?
6. What is the drawback in 1-Buffer?
7. What the two pointers used in Buffering technique?
8. What are the Back-end in Phases of compiler?
SPECIFICATION OF TOKENS
SPECIFICATION OF TOKENS
• In theory of compilation regular expressions are used to formalize the
specification of tokens.
• The specification of tokens depends on the pattern of the lexeme.
Example:
letter(letter+digit)*
• Each regular expression is a pattern specifying the form of strings.
Contd..
I) Alphabet, Strings, Languages
II) Operations on Language
III) Regular Expression
IV) Regular Definition
Strings and Languages
Strings and Languages
Symbol:
• A symbol is an abstract entity that we shall not define formally.
Examples:
• Letters: A|B|C…..|Z |a|b….|z
• Digit:0,1,2…9
• Special Characters
Contd..
Alphabet:
• Alphabet is a non-empty finite set of symbols. it is denoted by the
symbol Ɛ(Sigma).
Example:
• Ɛ={0,1} is the binary alphabet which consisting of digits
• Ɛ={a,b,c} is an alphabet which consisting of letters.
• ASCII is an alphabet ;it is used in many softwares.
• Unicode is an alphabet , which includes 1,00,000 characters.
Contd..
String:
• A String is defined as finite sequence of symbols over an alphabet Ɛ .
It is denoted as ‘w’.
• In language theory, the terms “sentence” and “word” are often used as
synonyms for “string”
Examples:
1.Alphabet Ɛ={ a, b }
Strings: w = {a, b, aa, ab, ba, bb, aaa, aab,……..}
2. Ɛ={ 0, 1 }
Strings: w = { 0, 1, 00, 01, 10, 11, 000, 001, 010, 011,100,101,110,111,
0000,....}
Different Notation used in a string
• Length
• Prefix:- any number of leading symbols
• Proper Prefix :- any number of leading symbols without € and
original string
• Suffix:- tailing symbols
• Proper Suffix:- any number of trailing symbols without € and
original string
• Substring
OPERATIONS ON LANGUAGES
Operations on Languages
REGULAR EXPRESSION
Definition
- A regular expression over Ɛ can be defined as
• ɸ is a regular expression for empty set
• Ɛ is a regular expression for null string {Ɛ}
• If ‘a’ is a symbol in Ɛ then ‘a’ is RE for the set {a}.
• If R and S are two RE then
• UNION of two R+E is a RE
• CONCATENATION of two RS is a RE
• KLEEN CLOSURE of any R* S* is a RE
Identity Rule or Algebric rule
EXAMPLE
Example
EXAMPLES
1) Design a RE to accept all possible combinations of a’s and b’s over
Ɛ={ a, b}
2) Design a RE which Starts with 1 and end with 0.

More Related Content

Similar to An Introduction to the Compiler Designss

Chapter 1.pptx
Chapter 1.pptxChapter 1.pptx
Chapter 1.pptx
NesredinTeshome1
 
Compiler Construction
Compiler ConstructionCompiler Construction
Compiler Construction
Sarmad Ali
 
Compiler Design.pptx
Compiler Design.pptxCompiler Design.pptx
Compiler Design.pptx
SouvikRoy149
 
Lexical Analysis
Lexical AnalysisLexical Analysis
Lexical Analysis
Nayemid4676
 
The Phases of a Compiler
The Phases of a CompilerThe Phases of a Compiler
The Phases of a Compiler
Radhika Talaviya
 
Lexical Analysis.pdf
Lexical Analysis.pdfLexical Analysis.pdf
Lexical Analysis.pdf
BiswanathSethi2
 
Compiler Design Material
Compiler Design MaterialCompiler Design Material
Compiler Design Material
Dr. C.V. Suresh Babu
 
Compier Design_Unit I.ppt
Compier Design_Unit I.pptCompier Design_Unit I.ppt
Compier Design_Unit I.ppt
sivaganesh293
 
Compier Design_Unit I.ppt
Compier Design_Unit I.pptCompier Design_Unit I.ppt
Compier Design_Unit I.ppt
sivaganesh293
 
Phases of Compiler.pptx
Phases of Compiler.pptxPhases of Compiler.pptx
Phases of Compiler.pptx
ssuser3b4934
 
Plc part 2
Plc  part 2Plc  part 2
Plc part 2
Taymoor Nazmy
 
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptxCOMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
MUSHAMHARIKIRAN6737
 
chapter4 end.pptx
chapter4 end.pptxchapter4 end.pptx
chapter4 end.pptx
Maryam522887
 
Compier Design_Unit I_SRM.ppt
Compier Design_Unit I_SRM.pptCompier Design_Unit I_SRM.ppt
Compier Design_Unit I_SRM.ppt
Apoorv Diwan
 
CD U1-5.pptx
CD U1-5.pptxCD U1-5.pptx
CD U1-5.pptx
Himajanaidu2
 
Compiler an overview
Compiler  an overviewCompiler  an overview
Compiler an overview
amudha arul
 
LANGUAGE TRANSLATOR
LANGUAGE TRANSLATORLANGUAGE TRANSLATOR
3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf
TANZINTANZINA
 
what is compiler and five phases of compiler
what is compiler and five phases of compilerwhat is compiler and five phases of compiler
what is compiler and five phases of compiler
adilmehmood93
 
10600122065_Animesh mani (CD).pdf
10600122065_Animesh mani (CD).pdf10600122065_Animesh mani (CD).pdf
10600122065_Animesh mani (CD).pdf
AnimeshMani4
 

Similar to An Introduction to the Compiler Designss (20)

Chapter 1.pptx
Chapter 1.pptxChapter 1.pptx
Chapter 1.pptx
 
Compiler Construction
Compiler ConstructionCompiler Construction
Compiler Construction
 
Compiler Design.pptx
Compiler Design.pptxCompiler Design.pptx
Compiler Design.pptx
 
Lexical Analysis
Lexical AnalysisLexical Analysis
Lexical Analysis
 
The Phases of a Compiler
The Phases of a CompilerThe Phases of a Compiler
The Phases of a Compiler
 
Lexical Analysis.pdf
Lexical Analysis.pdfLexical Analysis.pdf
Lexical Analysis.pdf
 
Compiler Design Material
Compiler Design MaterialCompiler Design Material
Compiler Design Material
 
Compier Design_Unit I.ppt
Compier Design_Unit I.pptCompier Design_Unit I.ppt
Compier Design_Unit I.ppt
 
Compier Design_Unit I.ppt
Compier Design_Unit I.pptCompier Design_Unit I.ppt
Compier Design_Unit I.ppt
 
Phases of Compiler.pptx
Phases of Compiler.pptxPhases of Compiler.pptx
Phases of Compiler.pptx
 
Plc part 2
Plc  part 2Plc  part 2
Plc part 2
 
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptxCOMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
 
chapter4 end.pptx
chapter4 end.pptxchapter4 end.pptx
chapter4 end.pptx
 
Compier Design_Unit I_SRM.ppt
Compier Design_Unit I_SRM.pptCompier Design_Unit I_SRM.ppt
Compier Design_Unit I_SRM.ppt
 
CD U1-5.pptx
CD U1-5.pptxCD U1-5.pptx
CD U1-5.pptx
 
Compiler an overview
Compiler  an overviewCompiler  an overview
Compiler an overview
 
LANGUAGE TRANSLATOR
LANGUAGE TRANSLATORLANGUAGE TRANSLATOR
LANGUAGE TRANSLATOR
 
3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf
 
what is compiler and five phases of compiler
what is compiler and five phases of compilerwhat is compiler and five phases of compiler
what is compiler and five phases of compiler
 
10600122065_Animesh mani (CD).pdf
10600122065_Animesh mani (CD).pdf10600122065_Animesh mani (CD).pdf
10600122065_Animesh mani (CD).pdf
 

Recently uploaded

一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
nonods
 
Lateral load-resisting systems in buildings.pptx
Lateral load-resisting systems in buildings.pptxLateral load-resisting systems in buildings.pptx
Lateral load-resisting systems in buildings.pptx
DebendraDevKhanal1
 
INTRODUCTION TO ARTIFICIAL INTELLIGENCE BASIC
INTRODUCTION TO ARTIFICIAL INTELLIGENCE BASICINTRODUCTION TO ARTIFICIAL INTELLIGENCE BASIC
INTRODUCTION TO ARTIFICIAL INTELLIGENCE BASIC
GOKULKANNANMMECLECTC
 
FUNDAMENTALS OF MECHANICAL ENGINEERING.pdf
FUNDAMENTALS OF MECHANICAL ENGINEERING.pdfFUNDAMENTALS OF MECHANICAL ENGINEERING.pdf
FUNDAMENTALS OF MECHANICAL ENGINEERING.pdf
EMERSON EDUARDO RODRIGUES
 
Asymmetrical Repulsion Magnet Motor Ratio 6-7.pdf
Asymmetrical Repulsion Magnet Motor Ratio 6-7.pdfAsymmetrical Repulsion Magnet Motor Ratio 6-7.pdf
Asymmetrical Repulsion Magnet Motor Ratio 6-7.pdf
felixwold
 
Sri Guru Hargobind Ji - Bandi Chor Guru.pdf
Sri Guru Hargobind Ji - Bandi Chor Guru.pdfSri Guru Hargobind Ji - Bandi Chor Guru.pdf
Sri Guru Hargobind Ji - Bandi Chor Guru.pdf
Balvir Singh
 
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
IJCNCJournal
 
Online train ticket booking system project.pdf
Online train ticket booking system project.pdfOnline train ticket booking system project.pdf
Online train ticket booking system project.pdf
Kamal Acharya
 
Technological Innovation Management And Entrepreneurship-1.pdf
Technological Innovation Management And Entrepreneurship-1.pdfTechnological Innovation Management And Entrepreneurship-1.pdf
Technological Innovation Management And Entrepreneurship-1.pdf
tanujaharish2
 
openshift technical overview - Flow of openshift containerisatoin
openshift technical overview - Flow of openshift containerisatoinopenshift technical overview - Flow of openshift containerisatoin
openshift technical overview - Flow of openshift containerisatoin
snaprevwdev
 
East Carolina University diploma. ECU diploma
East Carolina University diploma. ECU diplomaEast Carolina University diploma. ECU diploma
East Carolina University diploma. ECU diploma
College diploma
 
Butterfly Valves Manufacturer (LBF Series).pdf
Butterfly Valves Manufacturer (LBF Series).pdfButterfly Valves Manufacturer (LBF Series).pdf
Butterfly Valves Manufacturer (LBF Series).pdf
Lubi Valves
 
Introduction to Computer Networks & OSI MODEL.ppt
Introduction to Computer Networks & OSI MODEL.pptIntroduction to Computer Networks & OSI MODEL.ppt
Introduction to Computer Networks & OSI MODEL.ppt
Dwarkadas J Sanghvi College of Engineering
 
My Airframe Metallic Design Capability Studies..pdf
My Airframe Metallic Design Capability Studies..pdfMy Airframe Metallic Design Capability Studies..pdf
My Airframe Metallic Design Capability Studies..pdf
Geoffrey Wardle. MSc. MSc. Snr.MAIAA
 
An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...
An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...
An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...
DharmaBanothu
 
comptia-security-sy0-701-exam-objectives-(5-0).pdf
comptia-security-sy0-701-exam-objectives-(5-0).pdfcomptia-security-sy0-701-exam-objectives-(5-0).pdf
comptia-security-sy0-701-exam-objectives-(5-0).pdf
foxlyon
 
Call Girls Goa (india) ☎️ +91-7426014248 Goa Call Girl
Call Girls Goa (india) ☎️ +91-7426014248 Goa Call GirlCall Girls Goa (india) ☎️ +91-7426014248 Goa Call Girl
Call Girls Goa (india) ☎️ +91-7426014248 Goa Call Girl
sapna sharmap11
 
🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...
🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...
🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...
dulbh kashyap
 
This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...
DharmaBanothu
 
Introduction to Artificial Intelligence.
Introduction to Artificial Intelligence.Introduction to Artificial Intelligence.
Introduction to Artificial Intelligence.
supriyaDicholkar1
 

Recently uploaded (20)

一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
一比一原版(psu学位证书)美国匹兹堡州立大学毕业证如何办理
 
Lateral load-resisting systems in buildings.pptx
Lateral load-resisting systems in buildings.pptxLateral load-resisting systems in buildings.pptx
Lateral load-resisting systems in buildings.pptx
 
INTRODUCTION TO ARTIFICIAL INTELLIGENCE BASIC
INTRODUCTION TO ARTIFICIAL INTELLIGENCE BASICINTRODUCTION TO ARTIFICIAL INTELLIGENCE BASIC
INTRODUCTION TO ARTIFICIAL INTELLIGENCE BASIC
 
FUNDAMENTALS OF MECHANICAL ENGINEERING.pdf
FUNDAMENTALS OF MECHANICAL ENGINEERING.pdfFUNDAMENTALS OF MECHANICAL ENGINEERING.pdf
FUNDAMENTALS OF MECHANICAL ENGINEERING.pdf
 
Asymmetrical Repulsion Magnet Motor Ratio 6-7.pdf
Asymmetrical Repulsion Magnet Motor Ratio 6-7.pdfAsymmetrical Repulsion Magnet Motor Ratio 6-7.pdf
Asymmetrical Repulsion Magnet Motor Ratio 6-7.pdf
 
Sri Guru Hargobind Ji - Bandi Chor Guru.pdf
Sri Guru Hargobind Ji - Bandi Chor Guru.pdfSri Guru Hargobind Ji - Bandi Chor Guru.pdf
Sri Guru Hargobind Ji - Bandi Chor Guru.pdf
 
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
Particle Swarm Optimization–Long Short-Term Memory based Channel Estimation w...
 
Online train ticket booking system project.pdf
Online train ticket booking system project.pdfOnline train ticket booking system project.pdf
Online train ticket booking system project.pdf
 
Technological Innovation Management And Entrepreneurship-1.pdf
Technological Innovation Management And Entrepreneurship-1.pdfTechnological Innovation Management And Entrepreneurship-1.pdf
Technological Innovation Management And Entrepreneurship-1.pdf
 
openshift technical overview - Flow of openshift containerisatoin
openshift technical overview - Flow of openshift containerisatoinopenshift technical overview - Flow of openshift containerisatoin
openshift technical overview - Flow of openshift containerisatoin
 
East Carolina University diploma. ECU diploma
East Carolina University diploma. ECU diplomaEast Carolina University diploma. ECU diploma
East Carolina University diploma. ECU diploma
 
Butterfly Valves Manufacturer (LBF Series).pdf
Butterfly Valves Manufacturer (LBF Series).pdfButterfly Valves Manufacturer (LBF Series).pdf
Butterfly Valves Manufacturer (LBF Series).pdf
 
Introduction to Computer Networks & OSI MODEL.ppt
Introduction to Computer Networks & OSI MODEL.pptIntroduction to Computer Networks & OSI MODEL.ppt
Introduction to Computer Networks & OSI MODEL.ppt
 
My Airframe Metallic Design Capability Studies..pdf
My Airframe Metallic Design Capability Studies..pdfMy Airframe Metallic Design Capability Studies..pdf
My Airframe Metallic Design Capability Studies..pdf
 
An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...
An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...
An In-Depth Exploration of Natural Language Processing: Evolution, Applicatio...
 
comptia-security-sy0-701-exam-objectives-(5-0).pdf
comptia-security-sy0-701-exam-objectives-(5-0).pdfcomptia-security-sy0-701-exam-objectives-(5-0).pdf
comptia-security-sy0-701-exam-objectives-(5-0).pdf
 
Call Girls Goa (india) ☎️ +91-7426014248 Goa Call Girl
Call Girls Goa (india) ☎️ +91-7426014248 Goa Call GirlCall Girls Goa (india) ☎️ +91-7426014248 Goa Call Girl
Call Girls Goa (india) ☎️ +91-7426014248 Goa Call Girl
 
🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...
🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...
🚺ANJALI MEHTA High Profile Call Girls Ahmedabad 💯Call Us 🔝 9352988975 🔝💃Top C...
 
This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...
 
Introduction to Artificial Intelligence.
Introduction to Artificial Intelligence.Introduction to Artificial Intelligence.
Introduction to Artificial Intelligence.
 

An Introduction to the Compiler Designss

  • 1. 20IT011-COMPILER DESIGN Text Book : Alfred V Aho, Monica S Lam, Ravi Sethi and Jeffrey D Ullman, “Compilers - Principles, Techniques and Tools”, Pearson Education, 2nd Edition, 2018.
  • 2. COURSE OUTCOME CO1: Ability to understand the functionalities of the different phases of compiler. CO2: Ability to understand syntax-directed translation and run-time environment. CO3: Ability to implement code optimization techniques. CO4: Ability to apply different parsing algorithms to develop the parsers for a given grammar. CO5: Ability to design a scanner and a parser using LEX and YACC tools.
  • 4. UNIT-I SYLLABUS Structure of a compiler – Lexical Analysis – Role of Lexical Analyzer -Input Buffering – Specification of Tokens – Recognition of Tokens- Lex – Finite Automata-Regular Expressions to Automata – Minimizing DFA.
  • 5. STRUCTURE OF A COMPILER • Definition: A program that accept as input a program text in a certain language and produces as output a program text in another language (Grune et all) • Compiler is a translator program that reads a program written in one language - the source language- and translates it into an equivalent program in another language-the target language.
  • 7. PHASES OF A COMPILER
  • 8. Lexical Analysis • In a compiler, linear analysis is called lexical analysis or scanning. • It takes source code as input • It reads one character at a time and convert into lexemes( represent in the form of tokens) Example: position = initial + rate * 60 ✓ The identifier position. ✓ The assignment symbol =. ✓ The identifier initial. ✓ The plus sign. ✓ The identifier rate. ✓ The multiplication sign. ✓ The number 60. <Id,Position> <=,> <Id,initial> <+,> <Id,rate><*,> <num,60>
  • 9. Contd.. • Needs to record each id attribute: keep a symbol table • Lexical Analysis eliminates white spaces, etc..
  • 10. Syntax Analysis • It is called as parsing or Syntax Analysis • It takes token as input and generate parse tree as output. • The parser checks that the expression made by tokens is syntactically correct or not. • Context free grammars formalize the rules and guide syntax analysis.
  • 11. Semantic Analysis • It checks the parse tree follows the rules of language • It keeps tracks of identifiers, types and expressions • Output is annotated as tree syntax
  • 12. Intermediate Code Generator • After semantic analysis, some compilers generate an explicit intermediate representation of the source program. • This representation should be easy to produce and easy to translate into the target program.
  • 13. Code Optimization • It is used to improve the intermediate code • So the output can run fast and take less space • It removes unnecessary lines of code and arrange the sequence in order to speed up the execution.
  • 14. Code Generation • The final phase of the compiler is the generation of target code, consisting of relocatable machine code or assembly code. • The intermediate instructions are each translated into sequence of machine instructions that perform the same task.
  • 15. Symbol table Manager • An essential function of a compiler is to record the identifiers used in the source program and collect its information. • A symbol table is a data structure containing a record for each identifier with fields for attributes.(such as, its type, its scope, if procedure names then the number and type of arguments etc.,) • The data structure allows to find the record for each identifier and store or retrieve data from that record quickly. Example: sum=initial + value * 10 1 sum 2 initial 3 value 4 ….
  • 16. Error Handling and Reporting • Each phase can encounter errors. After detecting an error, a phase must deal that error, so that compilation can proceed ,allowing further errors to be detected. • Lexical phase can detect error where the characters remaining in the input do not form any token. • The syntax and semantic phases handle large fraction of errors. The stream of tokens violates the syntax rules are determined by syntax phase. • During semantic, the compiler tries to detect constructs that have the right syntactic structure but no meaning to the operation involved.
  • 17. Role of Lexical Analyser
  • 18. Role of lexical Analyzer • As the first phase of a compiler, the main task of the lexical analyzer is to read the input characters of the source program, group them into lexemes, and produce as output a sequence of tokens for each lexeme in the source program. • The stream of tokens is sent to the parser for syntax analysis. • It is common for the lexical analyzer to interact with the symbol table as well. • When the lexical analyzer discovers a lexeme constituting an identifier, it needs to enter that lexeme into the symbol table.
  • 19. Contd.. • In some cases, information regarding the kind of identifier may be read from the symbol table by the lexical analyzer to assist it in determining the proper token it must pass to the parser.
  • 20. Interaction between the lexical analyzer and the parser • The interaction is implemented by having the parser call the lexical analyzer. • The call, suggested by the getNextToken command, causes the lexical analyzer to read characters from its input until it can identify the next lexeme and produce for it the next token, which it returns to the parser. • Since the lexical analyzer is the part of the compiler that reads the source text, it may perform certain other tasks besides identification of lexemes
  • 21. Contd.. • One such task is stripping out comments and whitespace (blank, newline, tab, and perhaps other characters that are used to separate tokens in the input). • Another task is correlating error messages generated by the compiler with the source program. • lexical analyzers are divided into a cascade of two processes: • Scanning consists of the simple processes that do not require tokenization of the input, such as deletion of comments and compaction of consecutive whitespace characters into one. • Lexical analysis proper is the more complex portion, where the scanner produces the sequence of tokens as output.
  • 22. Lexical Analysis and Parsing • Simplicity of design – avoid complexity to work together . • Improving Compiler Efficiency - A separate lexical analyzer allows us to apply specialized techniques that serve only the lexical task, not the job of parsing. In addition, specialized buffering techniques for reading input characters can speed up the compiler significantly. • Enhancing Compiler Portability -Input-device-specific peculiarities can be restricted to the lexical analyzer.
  • 23. Tokens, Patterns, Lexemes • A token is a pair consisting of a token name and an optional attribute value. • A pattern is a description of the form that the lexemes of a token may take. • A lexeme is a sequence of characters in the source program that matches the pattern for a token and is identified by the lexical analyzer as an instance of that token.
  • 26. Lexical Errors • Some errors are out of power of lexical analyzer to recognize: ✓fi(a==f(x)).. • However it may be able to recognize errors like: ✓d=2r • Such errors are recognized when no pattern for tokens matches a character sequence
  • 28. Error Recovery • Panic Mode: Successive characters are ignored until we reach to a well formed token • Delete one character from the remaining input • Insert a missing character into the remaining input • Replace a character by another character • Transpose two adjacent characters
  • 30. Input Buffering • Input buffer helps to find the correct lexeme. • Sometimes lexical analyzer needs to look ahead some symbols to decide about the token to return - in C language: we need to look after -,= or < to decide what token to return - in FORTRAN: DO 5 I=1.25 • We need to introduce a two buffer scheme to handle large look-aheads safely
  • 31. Buffer Pairs Two pointers to the input are maintained: I. Pointer lexemeBegin, marks the beginning of the current lexeme, whose extent we are attempting to determine. II. Pointer forward scans ahead until a pattern match is found
  • 32. Sentinels • A sentinel is a special character or token used to mark the end of a sequence. • In the context of a lexical analyzer, a sentinel can be utilized to indicate the end of the input source code or the end of a specific token.
  • 33.
  • 34. QUIZ 1. ______________ is the heart of the compiler. 2.what will be the next phase of Intermediate code generator? 3. What is the output of Lexical analyzer? 4. Token consist of pairs, what are they? 5. What is the command ,the parser generate to lexical analyzer? 6. What is the drawback in 1-Buffer? 7. What the two pointers used in Buffering technique? 8. What are the Back-end in Phases of compiler?
  • 36. SPECIFICATION OF TOKENS • In theory of compilation regular expressions are used to formalize the specification of tokens. • The specification of tokens depends on the pattern of the lexeme. Example: letter(letter+digit)* • Each regular expression is a pattern specifying the form of strings.
  • 37. Contd.. I) Alphabet, Strings, Languages II) Operations on Language III) Regular Expression IV) Regular Definition
  • 39. Strings and Languages Symbol: • A symbol is an abstract entity that we shall not define formally. Examples: • Letters: A|B|C…..|Z |a|b….|z • Digit:0,1,2…9 • Special Characters
  • 40. Contd.. Alphabet: • Alphabet is a non-empty finite set of symbols. it is denoted by the symbol Ɛ(Sigma). Example: • Ɛ={0,1} is the binary alphabet which consisting of digits • Ɛ={a,b,c} is an alphabet which consisting of letters. • ASCII is an alphabet ;it is used in many softwares. • Unicode is an alphabet , which includes 1,00,000 characters.
  • 41. Contd.. String: • A String is defined as finite sequence of symbols over an alphabet Ɛ . It is denoted as ‘w’. • In language theory, the terms “sentence” and “word” are often used as synonyms for “string” Examples: 1.Alphabet Ɛ={ a, b } Strings: w = {a, b, aa, ab, ba, bb, aaa, aab,……..} 2. Ɛ={ 0, 1 } Strings: w = { 0, 1, 00, 01, 10, 11, 000, 001, 010, 011,100,101,110,111, 0000,....}
  • 42. Different Notation used in a string • Length • Prefix:- any number of leading symbols • Proper Prefix :- any number of leading symbols without € and original string • Suffix:- tailing symbols • Proper Suffix:- any number of trailing symbols without € and original string • Substring
  • 46. Definition - A regular expression over Ɛ can be defined as • ɸ is a regular expression for empty set • Ɛ is a regular expression for null string {Ɛ} • If ‘a’ is a symbol in Ɛ then ‘a’ is RE for the set {a}. • If R and S are two RE then • UNION of two R+E is a RE • CONCATENATION of two RS is a RE • KLEEN CLOSURE of any R* S* is a RE
  • 47. Identity Rule or Algebric rule
  • 50. EXAMPLES 1) Design a RE to accept all possible combinations of a’s and b’s over Ɛ={ a, b} 2) Design a RE which Starts with 1 and end with 0.
  翻译: