UNIT-1
PART-A
Overview of Compilation
COMPILER
A compiler is a program that reads a program in one language (the
source language) and translates it into an equivalent program in
another language (the target language).
The translation process should also report the presence of errors
in the source program.

source program → Compiler → target program
                    ↓
              error messages
COMPILER
There are two parts of compilation:
The analysis part breaks up the source program into constituent
pieces and creates an intermediate representation of the source
program.
The synthesis part constructs the desired target program from the
intermediate representation.
Phases of Compiler
Lexical analyzer
• The Lexical Analyzer reads the source program character by
  character and returns the tokens of the source program.
• A token describes a pattern of characters having the same meaning
  in the source program (such as identifiers, operators, keywords,
  numbers, delimiters and so on).
• Ex: newval := oldval + 12  =>  tokens:
      newval   identifier
      :=       assignment operator
      oldval   identifier
      +        add operator
      12       a number
• Puts information about identifiers into the symbol table.
• Regular expressions are used to describe tokens (lexical
  constructs).
Syntax analyzer
• The syntax of a language is specified by a context free grammar
  (CFG).
• The rules in a CFG are mostly recursive.
• A syntax analyzer checks whether a given program satisfies the
  rules implied by a CFG or not.
• If it satisfies, the syntax analyzer creates a parse tree for the
  given program.
Syntax analyzer
Ex:
We use BNF (Backus Naur Form) to specify a CFG
assgstmt -> identifier := expression
expression -> identifier
expression -> number
expression -> expression + expression
Semantic analyzer
• A semantic analyzer checks the source program for semantic errors
  and collects the type information for the code generation.
• Type-checking is an important part of the semantic analyzer.
• Normally, semantic information cannot be represented by a
  context-free language used in syntax analyzers.
• Context-free grammars used in the syntax analysis are integrated
  with attributes (semantic rules); the result is a syntax-directed
  translation (attribute grammars).
Semantic analyzer
Ex:
newval := oldval + 12
The type of the identifier newval must match with the type of
the expression (oldval + 12).
Intermediate code generation
A compiler may produce explicit intermediate codes representing
the source program.
These intermediate codes are generally machine-independent
(architecture independent), but the level of intermediate codes
is close to the level of machine codes.
Intermediate code generation
Ex:
newval := oldval * fact + 1
id1 := id2 * id3 + 1
MULT id2,id3,temp1      Intermediate Codes (Quadruples)
ADD  temp1,#1,temp2
MOV  temp2,,id1
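One way to hold such quadruples in a compiler's data structures is
as simple records. A minimal sketch in C (the field names
op/arg1/arg2/result are an assumption of this text version, not
from these slides):

struct quad { const char *op, *arg1, *arg2, *result; };

struct quad code[] = {
    /* the three quadruples for newval := oldval * fact + 1 */
    { "MULT", "id2",   "id3", "temp1" },
    { "ADD",  "temp1", "#1",  "temp2" },
    { "MOV",  "temp2", "",    "id1"   },  /* second argument unused */
};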
Code optimizer
The code optimizer improves the code produced by the intermediate
code generator in terms of time and space.
Ex:
MULT id2,id3,temp1
ADD  temp1,#1,id1
Code generator
Produces the target language for a specific architecture.
The target program is normally a relocatable object file containing
the machine codes.
Ex:
(assume that we have an architecture with instructions where at
least one of the operands is a machine register)
MOVE id2,R1
MULT id3,R1
ADD  #1,R1
MOVE R1,id1
Compiler construction tools
• A number of tools have been developed, variously called
  compiler-compilers, compiler generators or translator writing
  systems.
• The input for these systems may contain
  1. a description of the source language.
  2. a description of what output is to be generated.
  3. a description of the target machine.
Compiler construction tools
The principal aids provided by a compiler-compiler are:
1. Scanner Generator
2. Parser generator
3. Facilities for code generation
Lexical Analyzer
• The Role of the Lexical Analyzer
• The 1st phase of the compiler.
• Reads input characters and produces as output the sequence of
  tokens that the parser uses for syntax analysis.
• It can work either as a separate module or as a submodule.
Tokens, Patterns and Lexemes
• Lexeme: a sequence of characters in the source program that is
  matched against the pattern for a token.
• Pattern: the rule associated with each set of strings is called a
  pattern.
• A lexeme is matched against a pattern to generate a token.
• Token: a token is a word which describes the lexeme in the source
  program. It is generated when a lexeme is matched against a
  pattern.
3 Methods of constructing Lexical Analyzer
1. Using a Lexical Code Generator
• Tools are available that take regular expressions as input and
  generate code for them. These tools can be used to generate
  lexical analyzer code from R.E.s.
• An example of such a tool is LEX for Unix.
3 Methods of constructing Lexical Analyzer
R.E LEX.YY.C
LEX.YY.C a.out(lexical)
tokens
source
program
Lex generator
compiler
a.out
Compiler and Translator
• A compiler is a form of translator that translates a program
  written in one language, "The Source Language", to an equivalent
  program in a second language, "The Target Language" or "The
  Object Language".
Compiler and Translator
prog in source language → compiler → prog in target language
                             ↓
                     errors & diagnostics
• Assemblers, Compilers and Interpreters are all specific
  translators.
• Assemblers translate assembly language (opcodes or mnemonics)
  into machine code.
• Interpreters
  – interpret the statements of the high-level language program as
    they are encountered.
  – produce the output of a statement as it is interpreted.
Languages involved in Compiler
3 languages are involved in a compiler:
1. Source language: read by the compiler.
2. Target or object language: the language into which the
   compiler/translator translates the source.
3. Host language: the language in which the compiler is written.
Advantages of Compiler
• Conciseness, which improves programmer productivity.
• Semantic restriction.
• Translate and analyze the H.L.L. (source program) only once, then
  run the equivalent machine code produced by the compiler.
Disadvantages of Compiler
• Slower speed (of the translation step).
• Size of the compiler and the compiled code.
• Debugging involves all source code.
Interpreter versus Compiler
• A compiler translates to machine code.
• An interpreter executes the source program directly.
• Variant: interpretation of intermediate code. The compiler
  generates code for a "virtual machine" (VM), and the VM is
  simulated by software.
Single-Pass Compilers
• Phases work in an interleaved way.
• (Static structure of a single-pass compiler.)
Multi-Pass Compilers
• Phases are separate "programs", which run sequentially.
• Why multi-pass?
  – If memory is scarce (irrelevant today)
  – If the language is complex
  – If portability is important
• Often two-pass compilers.
Loader/Link-Editor:
Performs 2 functions:
i.  Loading: relocatable code is converted to absolute code, i.e.
    placed at its specified position.
ii. Link-editing: makes a single program from several files of
    relocatable machine code.
    – The files may be the output of different compilations.
    – They may be library files or routines provided by the system.
    – The result is an executable file.
Preprocessor
Produces input to the compiler. A preprocessor may perform:
i.  Macro processing: shorthand for longer constructs.
ii. File inclusion: separate modules can be used by including their
    files, e.g. #include <iostream>.
iii. Rational preprocessors: give support for additional facilities
     which are not included in the compiler itself.
iv. Language extension: may include extra capabilities.
Major Types of Compiler
1. Self-resident compiler: generates target code for the same
   machine (the host).
2. Cross compiler: generates target code for a machine other than
   the host.
Phases and passes
• In logical terms a compiler is thought of as consisting of stages
  and phases.
• Physically it is made up of passes.
• The compiler has one pass for each time the source code, or a
  representation of it, is read.
• Many compilers have just a single pass, so that the complete
  compilation process is performed while the code is read once.
• The various phases described will therefore be executed in
  parallel.
• Earlier compilers had a large number of passes, typically due to
  the limited memory space available.
• Modern compilers are single pass, since memory space is not
  usually a problem.
Use of tools
• The 2 main types of tools used in compiler production are:
1. A lexical analyzer generator
   – Takes as input the lexical structure of a language, which
     defines how its tokens are made up from characters.
   – Produces as output a lexical analyzer (a program in C, for
     example) for the language.
   – Example: the Unix lexical analyzer generator Lex.
2. A syntax analyzer generator
   – Takes as input the syntactical definition of a language.
   – Produces as output a syntax analyzer (a program in C, for
     example) for the language.
   – The most widely known is the Unix-based YACC (Yet Another
     Compiler-Compiler), used in conjunction with Lex.
   – Bison: a public domain version.
Applications of compiler techniques
• Compiler technology is useful for a more general class of
  applications.
• Many programs share the basic properties of compilers: they read
  textual input, organize it into a hierarchical structure and then
  process the structure.
• An understanding of how programming language compilers are
  designed and organized can make it easier to implement these
  compiler-like applications as well.
• More importantly, tools designed for compiler writing, such as
  lexical analyzer generators and parser generators, can make it
  vastly easier to implement such applications.
• Thus, compiler techniques are an important piece of knowledge for
  computer science engineers.
• Examples:
  – Document processing: LaTeX, HTML, XML
  – User interfaces: interactive applications, file systems,
    databases
  – Natural language treatment
  – Automata theory, language theory
Interpreters
• Instead of producing a target program as a translation, an
  interpreter performs the operations implied by the source
  program.
Differences between compiler and Interpreter

No.  COMPILER                            INTERPRETER
1    It takes the entire program as      It takes a single instruction
     input                               as input
2    Intermediate object code is         No intermediate object code
     generated                           is generated
3    Conditional control statements      Conditional control statements
     execute faster                      execute slower
4    Memory requirement is more          Memory requirement is less
5    Program need not be compiled        Every time, the higher level
     every time                          program is converted into a
                                         lower level program
6    Errors are displayed after the      Errors are displayed for every
     entire program is checked           instruction interpreted (if any)
7    Ex: C compiler                      Ex: BASIC
Types of compilers
1. Incremental compiler:
It is a compiler that performs recompilation of only the modified
source rather than compiling the whole source program.
Features:
1. It tracks the dependencies between the output and the source
   program.
2. It produces the same result as a full recompilation.
3. It performs less work than a full recompilation.
4. The process of incremental compilation is effective for
   maintenance.
Types of compilers
2. Cross compiler:
Basically there exist 3 languages:
1. the source language, i.e. the application program.
2. the target language, in which the machine code is written.
3. the implementation language, in which the compiler is written.
All these 3 languages may be different. In other words, there may
be a compiler which runs on one machine and produces the target
code for another machine. Such a compiler is called a cross
compiler.
Types of compilers
To represent a cross compiler, a T-diagram is drawn as follows:
(Figure: a T shape labelled with the source language S on the left,
the target language T on the right, and the implementation language
I at the bottom.)
• CROSS COMPILER: for source language L the target language N is
  generated, which runs on machine M.
(Figure: a compiler from L to N written in S, fed through a compiler
from S to M running on machine M, yields a compiler from L to N
that runs on M.)
Bootstrapping Compilers and T-diagrams
• The rules for T-diagrams are very simple. A compiler written in
  some language "C" (which could be anything, from machine code on
  up) that translates programs in language A to language B is drawn
  as a T with A on the left, B on the right, and C at the bottom.
Bootstrapping Compilers and T-diagrams
• Now suppose you have a machine that can directly run HP
machine code, and a compiler from ML to HP machine code,
and you want to get an ML compiler running on different
machine code P. You can start by writing an ML-to-P
compiler in ML, and compile that to get an ML-to-P compiler
in HP:
Bootstrapping Compilers and T-diagrams
From there, feed the new ML-to-P compiler to itself, running on
the HP machine, and you end up with an ML-to-P compiler
that runs on a P machine!
Lexical Analysis
• A lexical analyser or scanner is a program that groups sequences
  of characters into lexemes, and outputs (to the syntax analyser)
  a sequence of tokens. Here:
• (a) Tokens are symbolic names for the entities that make up the
  text of the program; e.g. IF for the keyword if, and ID for any
  identifier. These make up the output of the lexical analyser.
The role of lexical analyzer
source program → Lexical Analyzer → token → Parser → to semantic
analysis
(The parser requests each token via getNextToken; both components
consult the symbol table.)
Tokens, Patterns and Lexemes
• A token is a pair of a token name and an optional token value.
• A pattern is a description of the form that the lexemes of a
  token may take.
• A lexeme is a sequence of characters in the source program that
  matches the pattern for a token.
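As a concrete data-structure sketch in C (an assumption of this
text version; the field and type names below are not from the
slides), a token can be represented as exactly such a pair:

/* a token: symbolic name plus an optional attribute value */
struct token {
    int name;                 /* e.g. ID, NUMBER, RELOP, ... */
    union {
        int   num;            /* value of a NUMBER token */
        void *sym;            /* symbol-table entry of an ID */
    } value;
};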
Example

Token       Informal description              Sample lexemes
if          characters i, f                   if
else        characters e, l, s, e             else
comparison  < or > or <= or >= or == or !=    <=, !=
id          letter followed by letters        pi, score, D2
            and digits
number      any numeric constant              3.14159, 0, 6.02e23
literal     anything but ", surrounded by "s  "core dumped"

Ex: printf("total = %d\n", score);
Attributes for tokens
• E = M * C ** 2
– <id, pointer to symbol table entry for E>
– <assign-op>
– <id, pointer to symbol table entry for M>
– <mult-op>
– <id, pointer to symbol table entry for C>
– <exp-op>
– <number, integer value 2>
Lexical errors
• Some errors are beyond the power of the lexical analyzer to
  recognize:
  – fi (a == f(x)) …
• However, it may be able to recognize errors like:
  – d = 2r
• Such errors are recognized when no pattern for tokens matches a
  character sequence.
Specification of tokens
• In the theory of compilation, regular expressions are used to
  formalize the specification of tokens.
• Regular expressions are a means for specifying regular languages.
• Example:
  letter_ (letter_ | digit)*
• Each regular expression is a pattern specifying the form of
  strings.
Regular expressions
• ε is a regular expression, L(ε) = {ε}
• If a is a symbol in Σ, then a is a regular expression, L(a) = {a}
• (r) | (s) is a regular expression denoting the language
  L(r) ∪ L(s)
• (r)(s) is a regular expression denoting the language L(r)L(s)
• (r)* is a regular expression denoting (L(r))*
• (r) is a regular expression denoting L(r)
Regular definitions
d1 -> r1
d2 -> r2
…
dn -> rn
• Example:
  letter_ -> A | B | … | Z | a | b | … | z | _
  digit   -> 0 | 1 | … | 9
  id      -> letter_ (letter_ | digit)*
Extensions
• One or more instances: (r)+
• Zero or one instance: r?
• Character classes: [abc]
• Example:
  letter_ -> [A-Za-z_]
  digit   -> [0-9]
  id      -> letter_ (letter_ | digit)*
Recognition of tokens
• The starting point is the language grammar, to understand the
  tokens:
stmt -> if expr then stmt
      | if expr then stmt else stmt
      | ε
expr -> term relop term
      | term
term -> id
      | number
Recognition of tokens (cont.)
• The next step is to formalize the patterns:
digit  -> [0-9]
digits -> digit+
number -> digits (\.digits)? (E[+-]? digits)?
letter -> [A-Za-z_]
id     -> letter (letter | digit)*
if     -> if
then   -> then
else   -> else
relop  -> < | > | <= | >= | = | <>
• We also need to handle whitespace:
ws -> (blank | tab | newline)+
Transition diagrams
• Transition diagram for relop
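Hand-coding that diagram is straightforward. A minimal C sketch (an
assumption of this text version, not code from the slides), using
token names like LT/LE that match the later Lex example:

#include <stddef.h>

enum relop_tok { LT, LE, NE, EQ, GT, GE, RELOP_ERR };

/* scan a relational operator at s; store its length in *len */
enum relop_tok scan_relop(const char *s, size_t *len) {
    switch (s[0]) {
    case '<':
        if (s[1] == '=') { *len = 2; return LE; }   /* <= */
        if (s[1] == '>') { *len = 2; return NE; }   /* <> */
        *len = 1; return LT;                        /* <  */
    case '=':
        *len = 1; return EQ;                        /* =  */
    case '>':
        if (s[1] == '=') { *len = 2; return GE; }   /* >= */
        *len = 1; return GT;                        /* >  */
    }
    return RELOP_ERR;
}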
Transition diagrams (cont.)
• Transition diagram for reserved words and identifiers
Transition diagrams (cont.)
• Transition diagram for unsigned numbers
Transition diagrams (cont.)
• Transition diagram for whitespace
Lexical Analyzer Generator - Lex
Lex source program lex.l → Lex compiler → lex.yy.c
lex.yy.c → C compiler → a.out
input stream → a.out → sequence of tokens
LEX Example
%{
/* definitions of manifest constants
LT,LE, EQ, NE, GT,GE,
IF, THEN, ELSE, ID, NUMBER, RELOP */
%}
/* regular definitions
[ tn]
delim
ws
letter
digit
id
{delim}+
[A-Za-z]
[0-9]
{letter}({letter}|{digit})*
number {digit}+(.{digit}+)?(E[+-]?{digit}+)?
LEX Example
%%
{ws}      {/* no action and no return */}
if        {return(IF);}
then      {return(THEN);}
else      {return(ELSE);}
{id}      {yylval = (int) installID(); return(ID);}
{number}  {yylval = (int) installNum(); return(NUMBER);}
…
LEX Example
int installID() {/* function to install the lexeme, whose first
   character is pointed to by yytext and whose length is yyleng,
   into the symbol table, and return a pointer thereto */
}

int installNum() {/* similar to installID, but puts numerical
   constants into a separate table */
}
Finite Automata
• Regular expressions = specification
• Finite automata = implementation
• A finite automaton consists of
  – An input alphabet Σ
  – A set of states S
  – A start state n
  – A set of accepting states F ⊆ S
  – A set of transitions: state →(input) state
Finite Automata
• The transition
    s1 →(a) s2
  is read: in state s1, on input "a", go to state s2.
• If at end of input:
  – if in an accepting state => accept, otherwise => reject
• If no transition is possible => reject
Finite Automata State Graphs
(Figure: the graphical conventions. A circle is a state; an arrow
into a circle marks the start state; a double circle is an
accepting state; a labelled arrow a is a transition.)
Example
• Alphabet: {0, 1}
• What language does this recognize?
(Figure: a DFA over {0,1}; its transitions are not reproduced in
this text version.)
And Another Example
• Alphabet still {0, 1}
(Figure: an automaton with two transitions labelled 1 leaving the
same state.)
• The operation of the automaton is not completely defined by the
  input
  – On input "11" the automaton could be in either state
Epsilon Moves
• Another kind of transition: ε-moves
    A →(ε) B
• The machine can move from state A to state B without reading
  input.
Deterministic and Nondeterministic Automata
• Deterministic Finite Automata (DFA)
  – One transition per input per state
  – No ε-moves
• Nondeterministic Finite Automata (NFA)
  – Can have multiple transitions for one input in a given state
  – Can have ε-moves
• Finite automata have finite memory
  – Need only to encode the current state
Execution of Finite Automata
• A DFA can take only one path through the state graph
  – Completely determined by the input
• NFAs can choose
  – Whether to make ε-moves
  – Which of multiple transitions for a single input to take
Acceptance of NFAs
• Input: 1 0 1
• An NFA can get into multiple states
(Figure: an NFA over {0,1} processing the input 1 0 1, occupying
several states at once.)
• Rule: an NFA accepts if it can get into a final state
NFA vs. DFA (1)
• NFAs and DFAs recognize the same set of languages (the regular
  languages)
• DFAs are easier to implement
  – There are no choices to consider
NFA vs. DFA (2)
• For a given language the NFA can be simpler than the DFA
(Figure: an NFA and the corresponding, larger DFA for the same
language over {0,1}.)
• The DFA can be exponentially larger than the NFA
Regular Expressions to Finite Automata
• High-level sketch:
  Lexical Specification → Regular expressions → NFA → DFA →
  Table-driven Implementation of DFA
Regular Expressions to NFA (1)
• For each kind of rexp, define an NFA
  – Notation: NFA for rexp A
• For ε: a start state with a single ε-transition to an accepting
  state.
• For input a: a start state with a single a-transition to an
  accepting state.
Regular Expressions to NFA (2)
• For AB: connect the accepting state of A's NFA by an ε-transition
  to the start state of B's NFA.
• For A | B: add a new start state with ε-transitions to the NFAs
  for A and B, and ε-transitions from both of their accepting
  states to a new accepting state.
Regular Expressions to NFA (3)
• For A*: add a new start state and a new accepting state, with
  ε-transitions that allow skipping A's NFA entirely or looping
  back through it.
Example of RegExp -> NFA conversion
• Consider the regular expression
    (1 | 0)*1
• The NFA is:
(Figure: a Thompson-style NFA with states A through J. States B
through G form the (1|0)* loop, with C --1--> E and D --0--> F; a
final 1-transition leads through I to the accepting state J.)
Next
  Lexical Specification → Regular expressions → NFA → DFA →
  Table-driven Implementation of DFA
NFA to DFA. The Trick
• Simulate the NFA
• Each state of the resulting DFA
  = a non-empty subset of states of the NFA
• Start state
  = the set of NFA states reachable through ε-moves from the NFA
    start state
• Add a transition S →(a) S' to the DFA iff
  – S' is the set of NFA states reachable from the states in S
    after seeing the input a
    • considering ε-moves as well
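The core of this simulation is an ε-closure plus a move step over a
set of states. A minimal C sketch (an assumption of this text
version; the state numbering of the (1|0)*1 NFA is hypothetical,
loosely following the A..J figure):

#include <stdio.h>

enum { A, B, C, D, E, F, G, H, I, J, N };   /* NFA states as bits */

unsigned eps[N];     /* eps[s] = states reachable from s by one ε-move */
unsigned mv[N][2];   /* mv[s][c] = states reachable from s on input c  */

/* add ε-reachable states until nothing changes (fixed point) */
unsigned closure(unsigned set) {
    unsigned prev;
    do {
        prev = set;
        for (int s = 0; s < N; s++)
            if (set & (1u << s)) set |= eps[s];
    } while (set != prev);
    return set;
}

int accepts(const char *w) {
    unsigned cur = closure(1u << A);
    for (; *w; w++) {
        unsigned next = 0;
        for (int s = 0; s < N; s++)
            if (cur & (1u << s)) next |= mv[s][*w - '0'];
        cur = closure(next);
    }
    return (cur >> J) & 1u;         /* accept iff J is reachable */
}

int main(void) {
    /* Thompson-style NFA for (1|0)*1, hypothetical numbering */
    eps[A] = 1u<<B | 1u<<H;  eps[B] = 1u<<C | 1u<<D;
    eps[E] = 1u<<G;          eps[F] = 1u<<G;
    eps[G] = 1u<<B | 1u<<H;  eps[I] = 1u<<J;
    mv[C][1] = 1u << E;      /* C --1--> E */
    mv[D][0] = 1u << F;      /* D --0--> F */
    mv[H][1] = 1u << I;      /* H --1--> I */
    printf("%d %d\n", accepts("0101"), accepts("0110"));  /* 1 0 */
    return 0;
}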
NFA -> DFA Example
(Figure: applying the construction to the NFA for (1|0)*1 gives a
DFA whose states are the NFA-state sets {A,B,C,D,H,I},
{F,G,A,B,C,D,H,I} and {E,J,G,A,B,C,D,H,I}, with transitions on
0 and 1 between them.)
NFA to DFA. Remark
• An NFA may be in many states at any time
• How many different states?
• If there are N states, the NFA must be in some subset of those N
  states
• How many non-empty subsets are there?
  – 2^N - 1, i.e. finitely many, but exponentially many
Implementation
• A DFA can be implemented by a 2D table T
  – One dimension is "states"
  – The other dimension is "input symbols"
  – For every transition Si →(a) Sk define T[i,a] = k
• DFA "execution"
  – If in state Si on input a, read T[i,a] = k and go to state Sk
  – Very efficient
Table Implementation of a DFA
(Figure: a three-state DFA with states S, T, U.)

        0   1
   S    T   U
   T    T   U
   U    T   U
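In C, executing that table takes only a few lines. A minimal sketch
(an assumption of this text version; the slide does not say which
state accepts, so U is chosen arbitrarily here):

enum { S_, T_, U_ };                 /* states of the example DFA */

/* delta[state][input - '0'] = next state, per the table above */
static const int delta[3][2] = {
    { T_, U_ },   /* S */
    { T_, U_ },   /* T */
    { T_, U_ },   /* U */
};

int run_dfa(const char *input) {
    int state = S_;
    for (int i = 0; input[i]; i++)
        state = delta[state][input[i] - '0'];
    return state == U_;              /* accept iff we end in U (assumed) */
}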
Implementation (Cont.)
• NFA -> DFA conversion is at the heart of tools such as flex or
  jflex
• But DFAs can be huge
• In practice, flex-like tools trade off speed for space in the
  choice of NFA and DFA representations
UNIT-1
PART-B
Top Down Parsing
Top-Down Parsing
• The parse tree is created top to bottom.
• Top-down parser
– Recursive-Descent Parsing
• Backtracking is needed (If a choice of a production rule
does not work, we backtrack to try other alternatives.)
• It is a general parsing technique, but not widely used.
• Not efficient
Top-Down Parsing
– Predictive Parsing
• no backtracking
• efficient
• needs a special form of grammars (LL(1) grammars).
• Recursive Predictive Parsing is a special form of
Recursive Descent parsing without backtracking.
• Non-Recursive (Table Driven) Predictive Parser is also
known as LL(1) parser.
Recursive-Descent Parsing (uses Backtracking)
• Backtracking is needed.
• It tries to find the left-most derivation.
    S -> aBc
    B -> bc | b
  input: abc
• Trying B -> bc first: after matching a, bc consumes the rest of
  the input, so the final c of S -> aBc fails; backtrack, and
  B -> b then succeeds.
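A minimal C sketch of this backtracking (an assumption of this text
version, not code from the slides). Note that B's choice can only
be judged after S tries its final c, so S drives the retry:

#include <stdio.h>

const char *inp;          /* input string */
int pos;                  /* current position */

int term(char c) { return inp[pos] == c ? (pos++, 1) : 0; }

int B(int alt) {          /* try B's alternatives by index */
    if (alt == 0) return term('b') && term('c');  /* B -> bc */
    return term('b');                             /* B -> b  */
}

int S(void) {             /* S -> aBc, retrying B on failure */
    for (int alt = 0; alt < 2; alt++) {
        pos = 0;                                  /* backtrack */
        if (term('a') && B(alt) && term('c') && inp[pos] == '\0')
            return 1;
    }
    return 0;
}

int main(void) {
    inp = "abc";
    printf("%d\n", S());  /* prints 1: alt 0 fails at the final c,
                             alt 1 accepts */
    return 0;
}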
Predictive Parser
  a grammar --(eliminate left recursion)--> --(left factor)-->
  a grammar suitable for predictive parsing (an LL(1) grammar)
  (no 100% guarantee)
• When re-writing a non-terminal in a derivation step, a
  predictive parser can uniquely choose a production rule by just
  looking at the current symbol in the input string.
    A -> α1 | ... | αn     input: ... a .......
                                      ↑ current token
Predictive Parser (example)
stmt -> if ...... |
        while ...... |
        begin ...... |
        for .....
• When we are trying to write the non-terminal stmt, if the current
  token is if, we have to choose the first production rule.
• When we are trying to write the non-terminal stmt, we can
  uniquely choose the production rule by just looking at the
  current token.
• We eliminate the left recursion in the grammar and left factor
  it. But it still may not be suitable for predictive parsing (not
  an LL(1) grammar).
Recursive Predictive Parsing
• Each non-terminal corresponds to a procedure.
Ex: A -> aBb (this is the only production rule for A)
proc A {
  - match the current token with a, and move to the next token;
  - call B;
  - match the current token with b, and move to the next token;
}
Recursive Predictive Parsing (cont.)
A -> aBb | bAB
proc A {
  case of the current token {
    'a': - match the current token with a, and move to the next
           token;
         - call B;
         - match the current token with b, and move to the next
           token;
    'b': - match the current token with b, and move to the next
           token;
         - call A;
         - call B;
  }
}
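The same procedure in actual C might look like the sketch below (an
assumption of this text version; current_token, next_token(),
error() and B() are hypothetical helpers supplied elsewhere):

int current_token;              /* assumed global lookahead token     */

int next_token(void);           /* hypothetical scanner interface     */
void error(const char *msg);    /* hypothetical error reporter        */
void B(void);                   /* parsing procedure for B, elsewhere */

void match(int t) {
    if (current_token == t) current_token = next_token();
    else error("unexpected token");
}

void A(void) {                  /* A -> aBb | bAB */
    switch (current_token) {
    case 'a': match('a'); B(); match('b'); break;
    case 'b': match('b'); A(); B(); break;
    default:  error("A: no applicable production");
    }
}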
Recursive Predictive Parsing (cont.)
• When to apply ε-productions?
    A -> aA | bB | ε
• If all other productions fail, we should apply an ε-production.
  For example, if the current token is not a or b, we may apply the
  ε-production.
• Most correct choice: we should apply an ε-production for a
  non-terminal A when the current token is in the follow set of A
  (the terminals that can follow A in sentential forms).
Non-Recursive Predictive Parsing -- LL(1) Parser
• Non-recursive predictive parsing is table-driven parsing.
• It is a top-down parser.
• It is also known as LL(1) parser.
(Figure: the parser reads from an input buffer, maintains a stack,
consults a parsing table, and emits its output.)
LL(1) Parser
input buffer
– our string to be parsed. We will assume that its end is marked
  with a special symbol $.
output
– a production rule representing a step of the derivation sequence
  (left-most derivation) of the string in the input buffer.
stack
– contains the grammar symbols
– at the bottom of the stack, there is a special end marker
  symbol $.
– initially the stack contains only the symbol $ and the starting
  symbol S  ($S <- initial stack)
– when the stack is emptied (i.e. only $ is left in the stack),
  parsing is completed.
LL(1) Parser
parsing table
– a two-dimensional array M[A,a]
– each row is a non-terminal symbol
– each column is a terminal symbol or the special symbol $
– each entry holds a production rule.
LL(1) Parser – Parser Actions
• The symbol at the top of the stack (say X) and the current symbol
  in the input string (say a) determine the parser action.
• There are four possible parser actions.
1. If X and a are $ => parser halts (successful completion)
2. If X and a are the same terminal symbol (different from $)
   => parser pops X from the stack, and moves to the next symbol in
      the input buffer.
LL(1) Parser – Parser Actions
3. If X is a non-terminal
   => parser looks at the parsing table entry M[X,a]. If M[X,a]
      holds a production rule X -> Y1Y2...Yk, it pops X from the
      stack and pushes Yk, Yk-1, ..., Y1 onto the stack. The parser
      also outputs the production rule X -> Y1Y2...Yk to represent
      a step of the derivation.
4. None of the above => error
   – all empty entries in the parsing table are errors.
   – if X is a terminal symbol different from a, this is also an
     error case.
LL(1) Parser – Example1
S -> aBa
B -> bB | ε

LL(1) parsing table:
        a          b          $
  S   S -> aBa
  B   B -> ε     B -> bB

LL(1) Parser – Example1
stack    input    output
$S       abba$    S -> aBa
$aBa     abba$
$aB      bba$     B -> bB
$aBb     bba$
$aB      ba$      B -> bB
$aBb     ba$
$aB      a$       B -> ε
$a       a$
$        $        accept, successful completion
LL(1) Parser – Example1 (cont.)
Outputs: S -> aBa   B -> bB   B -> bB   B -> ε
Derivation (left-most): S => aBa => abBa => abbBa => abba
(Figure: the parse tree, with S at the root, children a B a, and
B expanding to b B twice before ending in ε.)
LL(1) Parser – Example2
E  -> TE'
E' -> +TE' | ε
T  -> FT'
T' -> *FT' | ε
F  -> (E) | id

       id         +            *            (          )          $
  E    E -> TE'                             E -> TE'
  E'              E' -> +TE'                           E' -> ε    E' -> ε
  T    T -> FT'                             T -> FT'
  T'              T' -> ε     T' -> *FT'               T' -> ε    T' -> ε
  F    F -> id                              F -> (E)
LL(1) Parser – Example2
stack      input     output
$E         id+id$    E -> TE'
$E'T       id+id$    T -> FT'
$E'T'F     id+id$    F -> id
$E'T'id    id+id$
$E'T'      +id$      T' -> ε
$E'        +id$      E' -> +TE'
$E'T+      +id$
$E'T       id$       T -> FT'
$E'T'F     id$       F -> id
$E'T'id    id$
$E'T'      $         T' -> ε
$E'        $         E' -> ε
$          $         accept
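The whole algorithm fits in a short, table-driven loop. Below is a
minimal C sketch for the Example 1 grammar (an assumption of this
text version; the table is hard-coded as a function, and ε is an
empty right-hand side):

#include <stdio.h>
#include <string.h>

/* M[A,a] for S -> aBa, B -> bB | ε; NULL marks an error entry */
const char *table(char X, char a) {
    if (X == 'S' && a == 'a') return "aBa";
    if (X == 'B' && a == 'b') return "bB";
    if (X == 'B' && a == 'a') return "";     /* B -> ε */
    return NULL;
}

int parse(const char *input) {
    char stack[64] = "$S";                   /* $ marker + start symbol */
    int top = 1, i = 0;
    for (;;) {
        char X = stack[top], a = input[i];
        if (X == '$' && a == '$') return 1;  /* accept */
        if (X == a) { top--; i++; continue; }/* match a terminal */
        if (X == 'S' || X == 'B') {          /* expand a non-terminal */
            const char *rhs = table(X, a);
            if (rhs == NULL) return 0;       /* empty table entry */
            top--;
            for (int k = (int)strlen(rhs) - 1; k >= 0; k--)
                stack[++top] = rhs[k];       /* push RHS in reverse */
        } else return 0;                     /* terminal mismatch */
    }
}

int main(void) {
    printf("%d %d\n", parse("abba$"), parse("abb$"));  /* 1 0 */
    return 0;
}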
Constructing LL(1) Parsing Tables
• Two functions are used in the construction of LL(1) parsing
  tables: FIRST and FOLLOW.
• FIRST(α) is the set of the terminal symbols which occur as first
  symbols in strings derived from α, where α is any string of
  grammar symbols.
• If α derives ε, then ε is also in FIRST(α).
• FOLLOW(A) is the set of the terminals which occur immediately
  after (follow) the non-terminal A in strings derived from the
  starting symbol.
  – a terminal a is in FOLLOW(A) if S =>* αAaβ
  – $ is in FOLLOW(A) if S =>* αA
Compute FIRST for Any String X
• If X is a terminal symbol: FIRST(X) = {X}.
• If X is ε: FIRST(X) = {ε}.
• If X is a non-terminal symbol and X -> ε is a production rule:
  ε is in FIRST(X).
• If X is a non-terminal symbol with a production rule
  X -> Y1Y2..Yn, or X is the string Y1Y2..Yn:
  – if a terminal a is in FIRST(Yi) and ε is in all FIRST(Yj) for
    j = 1,...,i-1, then a is in FIRST(X).
  – if ε is in all FIRST(Yj) for j = 1,...,n, then ε is in
    FIRST(X).
FIRST Example
E  -> TE'
E' -> +TE' | ε
T  -> FT'
T' -> *FT' | ε
F  -> (E) | id

FIRST(F)  = {(, id}      FIRST(TE')  = {(, id}
FIRST(T') = {*, ε}       FIRST(+TE') = {+}
FIRST(T)  = {(, id}      FIRST(ε)    = {ε}
FIRST(E') = {+, ε}       FIRST(FT')  = {(, id}
FIRST(E)  = {(, id}      FIRST(*FT') = {*}
                         FIRST((E))  = {(}
                         FIRST(id)   = {id}
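These sets can be computed mechanically by iterating the rules
above to a fixed point. A minimal C sketch for this grammar (an
assumption of this text version; terminals are bit positions,
non-terminals are tagged with an offset):

#include <stdio.h>

enum { ID, PLUS, STAR, LPAR, RPAR, EPS };        /* terminal bits  */
enum { E, EP, T, TP, F, NNT };                   /* non-terminals  */
#define NT(x) ((x) + 100)                        /* tag non-terminals */

typedef struct { int lhs, len, rhs[3]; } Prod;
static const Prod g[] = {
    { E,  2, { NT(T), NT(EP) } },                /* E  -> T E'   */
    { EP, 3, { PLUS, NT(T), NT(EP) } },          /* E' -> + T E' */
    { EP, 0, { 0 } },                            /* E' -> ε      */
    { T,  2, { NT(F), NT(TP) } },                /* T  -> F T'   */
    { TP, 3, { STAR, NT(F), NT(TP) } },          /* T' -> * F T' */
    { TP, 0, { 0 } },                            /* T' -> ε      */
    { F,  3, { LPAR, NT(E), RPAR } },            /* F  -> ( E )  */
    { F,  1, { ID } },                           /* F  -> id     */
};
static unsigned first[NNT];

static unsigned sym_first(int s) {               /* FIRST of one symbol */
    return s >= 100 ? first[s - 100] : 1u << s;
}

int main(void) {
    for (int changed = 1; changed; ) {           /* iterate to fixpoint */
        changed = 0;
        for (unsigned p = 0; p < sizeof g / sizeof *g; p++) {
            unsigned add = 0;
            int i;
            for (i = 0; i < g[p].len; i++) {
                unsigned f = sym_first(g[p].rhs[i]);
                add |= f & ~(1u << EPS);         /* FIRST(Yi) minus ε */
                if (!(f & (1u << EPS))) break;   /* Yi not nullable   */
            }
            if (i == g[p].len) add |= 1u << EPS; /* all Yj nullable   */
            if ((first[g[p].lhs] | add) != first[g[p].lhs]) {
                first[g[p].lhs] |= add;
                changed = 1;
            }
        }
    }
    /* prints 0x9: bits ID and LPAR, i.e. FIRST(E) = {(, id} */
    printf("FIRST(E) = %#x\n", first[E]);
    return 0;
}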
Compute FOLLOW (for non-terminals)
• If S is the start symbol: $ is in FOLLOW(S).
• If A -> αBβ is a production rule:
  everything in FIRST(β) except ε is in FOLLOW(B).
• If ( A -> αB is a production rule ) or
  ( A -> αBβ is a production rule and ε is in FIRST(β) ):
  everything in FOLLOW(A) is in FOLLOW(B).
We apply these rules until nothing more can be added to any follow
set.
FOLLOW Example
E  -> TE'
E' -> +TE' | ε
T  -> FT'
T' -> *FT' | ε
F  -> (E) | id

FOLLOW(E)  = { $, ) }
FOLLOW(E') = { $, ) }
FOLLOW(T)  = { +, ), $ }
FOLLOW(T') = { +, ), $ }
FOLLOW(F)  = { +, *, ), $ }
Constructing LL(1) Parsing Table -- Algorithm
• for each production rule A -> α of a grammar G:
  – for each terminal a in FIRST(α),
    add A -> α to M[A,a]
  – if ε is in FIRST(α),
    for each terminal a in FOLLOW(A) add A -> α to M[A,a]
  – if ε is in FIRST(α) and $ is in FOLLOW(A),
    add A -> α to M[A,$]
• All other undefined entries of the parsing table are error
  entries.
Constructing LL(1) Parsing Table -- Example
E  -> TE'   FIRST(TE') = {(, id}
            => E -> TE' into M[E,(] and M[E,id]
E' -> +TE'  FIRST(+TE') = {+}
            => E' -> +TE' into M[E',+]
E' -> ε     FIRST(ε) = {ε} => none,
            but since ε is in FIRST(ε) and FOLLOW(E') = {$, )}
            => E' -> ε into M[E',$] and M[E',)]
T  -> FT'   FIRST(FT') = {(, id}
            => T -> FT' into M[T,(] and M[T,id]
T' -> *FT'  FIRST(*FT') = {*}
            => T' -> *FT' into M[T',*]
T' -> ε     FIRST(ε) = {ε} => none,
            but since ε is in FIRST(ε) and FOLLOW(T') = {$, ), +}
            => T' -> ε into M[T',$], M[T',)] and M[T',+]
F  -> (E)   FIRST((E)) = {(}
            => F -> (E) into M[F,(]
F  -> id    FIRST(id) = {id}
            => F -> id into M[F,id]
LL(1) Grammars
• A grammar whose parsing table has no multiply-defined entries is
  said to be an LL(1) grammar.
    L   : input scanned from left to right
    L   : left-most derivation
    (1) : one input symbol used as a lookahead symbol to determine
          the parser action
• The parsing table of a grammar may contain more than one
  production rule in an entry. In this case, we say that it is not
  an LL(1) grammar.
A Grammar which is not LL(1)
S -> iCtSE | a
E -> eS | ε
C -> b

FIRST(iCtSE) = {i}     FOLLOW(S) = { $, e }
FIRST(a)     = {a}     FOLLOW(E) = { $, e }
FIRST(eS)    = {e}     FOLLOW(C) = { t }
FIRST(ε)     = {ε}
FIRST(b)     = {b}

       a         b        e          i             t    $
  S    S -> a                        S -> iCtSE
  E                       E -> eS                       E -> ε
                          E -> ε
  C              C -> b

Two production rules for M[E,e].  Problem => ambiguity.
A Grammar which is not LL(1) (cont.)
• What do we have to do if the resulting parsing table contains
  multiply defined entries?
  – If we didn't eliminate left recursion, eliminate the left
    recursion in the grammar.
  – If the grammar is not left factored, we have to left factor the
    grammar.
  – If its (the new grammar's) parsing table still contains
    multiply defined entries, that grammar is ambiguous or it is
    inherently not an LL(1) grammar.
A Grammar which is not LL(1) (cont.)
• A left recursive grammar cannot be an LL(1) grammar.
  – A -> Aα | β
    => any terminal that appears in FIRST(β) also appears in
       FIRST(Aα), because Aα => βα.
    => if β is ε, any terminal that appears in FIRST(α) also
       appears in FIRST(Aα) and FOLLOW(A).
• If a grammar is not left factored, it cannot be an LL(1)
  grammar.
  – A -> αβ1 | αβ2
    => any terminal that appears in FIRST(αβ1) also appears in
       FIRST(αβ2).
• An ambiguous grammar cannot be an LL(1) grammar.
Properties of LL(1) Grammars
• A grammar G is LL(1) if and only if the following conditions hold
  for every two distinct production rules A -> α and A -> β:
  1. Both α and β cannot derive strings starting with the same
     terminals.
  2. At most one of α and β can derive ε.
  3. If β can derive ε, then α cannot derive any string starting
     with a terminal in FOLLOW(A).
Error Recovery in Predictive Parsing
• An error may occur in predictive parsing (LL(1) parsing)
  – if the terminal symbol on the top of the stack does not match
    the current input symbol.
  – if the top of the stack is a non-terminal A, the current input
    symbol is a, and the parsing table entry M[A,a] is empty.
• What should the parser do in an error case?
  – The parser should be able to give an error message (as
    meaningful an error message as possible).
  – It should recover from that error case, and it should be able
    to continue parsing the rest of the input.
Error Recovery Techniques
• Panic-Mode Error Recovery
  – Skipping the input symbols until a synchronizing token is
    found.
• Phrase-Level Error Recovery
  – Each empty entry in the parsing table is filled with a pointer
    to a specific error routine to handle that error case.
• Error-Productions
  – If we have a good idea of the common errors that might be
    encountered, we can augment the grammar with productions that
    generate erroneous constructs.
  – When an error production is used by the parser, we can generate
    appropriate error diagnostics.
  – Since it is almost impossible to know all the errors that can
    be made by the programmers, this method is not practical.
Error Recovery Techniques
• Global-Correction
  – Ideally, we would like a compiler to make as few changes as
    possible in processing incorrect inputs.
  – We have to globally analyze the input to find the error.
  – This is an expensive method, and it is not used in practice.
Panic-Mode Error Recovery in LL(1) Parsing
• In panic-mode error recovery, we skip all the input symbols until
  a synchronizing token is found.
• What is the synchronizing token?
  – All the terminal symbols in the follow set of a non-terminal
    can be used as a synchronizing token set for that non-terminal.
• So, a simple panic-mode error recovery for LL(1) parsing:
  – All the empty entries are marked as sync to indicate that the
    parser will skip all the input symbols until a symbol in the
    follow set of the non-terminal A which is on top of the stack.
    Then the parser pops that non-terminal A from the stack, and
    parsing continues from that state.
  – To handle unmatched terminal symbols, the parser pops the
    unmatched terminal symbol from the stack and issues an error
    message saying that the unmatched terminal was inserted.
Panic-Mode Error Recovery - Example
S -> AbS | e | ε
A -> a | cAd

FOLLOW(S) = {$}
FOLLOW(A) = {b, d}

       a          b      c           d      e         $
  S    S -> AbS   sync   S -> AbS    sync   S -> e    S -> ε
  A    A -> a     sync   A -> cAd    sync   sync      sync

stack    input    output
$S       aab$     S -> AbS
$SbA     aab$     A -> a
$Sba     aab$
$Sb      ab$      Error: missing b, inserted
$S       ab$      S -> AbS
$SbA     ab$      A -> a
$Sba     ab$
$Sb      b$
$S       $        S -> ε
$        $        accept

stack    input    output
$S       ceadb$   S -> AbS
$SbA     ceadb$   A -> cAd
$SbdAc   ceadb$
$SbdA    eadb$    Error: unexpected e (illegal A)
                  (Remove all input tokens until the first b or d,
                   then pop A)
$Sbd     db$
$Sb      b$
$S       $        S -> ε
$        $        accept
Phrase-Level Error Recovery
• Each empty entry in the parsing table is filled with a pointer to
  a special error routine which will handle that error case.
• These error routines may:
  – change, insert, or delete input symbols.
  – issue appropriate error messages.
  – pop items from the stack.
• We should be careful when we design these error routines, because
  we may put the parser into an infinite loop.
Syntax Analyzer
• The Syntax Analyzer creates the syntactic structure of the given
  source program.
• This syntactic structure is mostly a parse tree.
• The Syntax Analyzer is also known as the parser.
• The syntax of a programming language is described by a
  context-free grammar (CFG). We will use BNF (Backus-Naur Form)
  notation in the description of CFGs.
• The syntax analyzer (parser) checks whether a given source
  program satisfies the rules implied by a context-free grammar or
  not.
  – If it satisfies, the parser creates the parse tree of that
    program.
  – Otherwise the parser gives the error messages.
Syntax Analyzer
• A context-free grammar
– gives a precise syntactic specification of a programming
language.
– the design of the grammar is an initial phase of the
design of a compiler.
– a grammar can be directly converted into a parser by
some tools.
Parser
source program → Lexical Analyzer → token → Parser → parse tree
(the parser requests tokens with getNextToken)
• The parser works on a stream of tokens.
• The smallest item is a token.
Parsers (cont.)
• We categorize the parsers into two groups:
1. Top-Down Parser
– the parse tree is created top to bottom, starting from the root.
2. Bottom-Up Parser
– the parse tree is created bottom to top, starting from the leaves
• Both top-down and bottom-up parsers scan the input from left to
right (one symbol at a time).
• Efficient top-down and bottom-up parsers can be implemented
only for sub-classes of context-free grammars.
– LL for top-down parsing
– LR for bottom-up parsing
Context-Free Grammars
• Inherently recursive structures of a programming language are
  defined by a context-free grammar.
• In a context-free grammar, we have:
  – A finite set of terminals (in our case, this will be the set of
    tokens)
  – A finite set of non-terminals (syntactic variables)
  – A finite set of production rules of the following form:
      A -> α   where A is a non-terminal and α is a string of
               terminals and non-terminals (including the empty
               string)
  – A start symbol (one of the non-terminal symbols)
Context-Free Grammars
| E – E | E * E | E / E | - E
• Example:
E  E + E
E  ( E )
E  id
Derivations
E => E+E
• E+E derives from E
  – we can replace E by E+E
  – to be able to do this, we have to have a production rule
    E -> E+E in our grammar.
E => E+E => id+E => id+id
• A sequence of replacements of non-terminal symbols is called a
  derivation of id+id from E.
Derivations
• In general, a derivation step is
    αAβ => αγβ   if there is a production rule A -> γ in our
                 grammar, where α and β are arbitrary strings of
                 terminal and non-terminal symbols
  α1 => α2 => ... => αn   (αn derives from α1, or α1 derives αn)
• =>  : derives in one step
• =>* : derives in zero or more steps
• =>+ : derives in one or more steps
CFG - Terminology
• L(G) is the language of G (the language generated by G), which is
  a set of sentences.
• A sentence of L(G) is a string of terminal symbols of G.
• If S is the start symbol of G, then ω is a sentence of L(G) iff
  S =>+ ω, where ω is a string of terminals of G.
• If G is a context-free grammar, L(G) is a context-free language.
• Two grammars are equivalent if they produce the same language.
• S =>* α
  – If α contains non-terminals, it is called a sentential form
    of G.
  – If α does not contain non-terminals, it is called a sentence
    of G.
Derivation Example
E => -E => -(E) => -(E+E) => -(id+E) => -(id+id)
OR
E => -E => -(E) => -(E+E) => -(E+id) => -(id+id)
• At each derivation step, we can choose any of the non-terminals
  in the sentential form of G for the replacement.
• If we always choose the left-most non-terminal in each derivation
  step, this derivation is called a left-most derivation.
• If we always choose the right-most non-terminal in each
  derivation step, this derivation is called a right-most
  derivation.
Left-Most and Right-Most Derivations
Left-Most Derivation:
E =>lm -E =>lm -(E) =>lm -(E+E) =>lm -(id+E) =>lm -(id+id)
Right-Most Derivation:
E =>rm -E =>rm -(E) =>rm -(E+E) =>rm -(E+id) =>rm -(id+id)
• We will see that top-down parsers try to find the left-most
  derivation of the given source program.
• We will see that bottom-up parsers try to find the right-most
  derivation of the given source program in the reverse order.
Parse Tree
• Inner nodes of a parse tree are non-terminal symbols.
• The leaves of a parse tree are terminal symbols.
• A parse tree can be seen as a graphical representation of a
  derivation.
(Figure: the parse tree for
E => -E => -(E) => -(E+E) => -(id+E) => -(id+id),
grown one derivation step at a time: the root E gets children
- and E; that E gets children ( E ); the inner E gets children
E + E; and the two inner Es finally become id.)
Ambiguity
E => E+E => id+E => id+E*E => id+id*E => id+id*id
E => E*E => E+E*E => id+E*E => id+id*E => id+id*id
(Figure: the two distinct parse trees for id+id*id, one grouping
id+(id*id), the other grouping (id+id)*id.)
• A grammar that produces more than one parse tree for a sentence
  is called an ambiguous grammar.
Ambiguity (cont.)
• For most parsers, the grammar must be unambiguous.
• Unambiguous grammar
  => unique selection of the parse tree for a sentence
• We should eliminate the ambiguity in the grammar during the
  design phase of the compiler.
• An unambiguous grammar should be written to eliminate the
  ambiguity.
• We have to prefer one of the parse trees of a sentence (generated
  by an ambiguous grammar) and disambiguate the grammar to restrict
  parsing to this choice.
Ambiguity (cont.)
stmt -> if expr then stmt |
        if expr then stmt else stmt | otherstmts

if E1 then if E2 then S1 else S2

(Figure: parse tree 1 attaches the else to the outer if:
if E1 then (if E2 then S1) else S2.
Parse tree 2 attaches the else to the inner if:
if E1 then (if E2 then S1 else S2).)
Ambiguity (cont.)
• We prefer the second parse tree (else matches with the closest
  if).
• So, we have to disambiguate our grammar to reflect this choice.
• The unambiguous grammar will be:
stmt -> matchedstmt | unmatchedstmt
matchedstmt -> if expr then matchedstmt else matchedstmt
             | otherstmts
unmatchedstmt -> if expr then stmt |
                 if expr then matchedstmt else unmatchedstmt
Ambiguity – Operator Precedence
• Ambiguous grammars (because of ambiguous operators) can be
  disambiguated according to the precedence and associativity
  rules.
E -> E+E | E*E | E^E | id | (E)
   ⇓ disambiguate the grammar, with precedence:
       ^ (right to left)
       * (left to right)
       + (left to right)
E -> E+T | T
T -> T*F | F
F -> G^F | G
G -> id | (E)
Left Recursion
• A grammar is left recursive if it has a non-terminal A such that
  there is a derivation
    A =>+ Aα    for some string α.
• Top-down parsing techniques cannot handle left-recursive
  grammars.
• So, we have to convert our left-recursive grammar into an
  equivalent grammar which is not left-recursive.
• The left-recursion may appear in a single step of the derivation
  (immediate left-recursion), or may appear in more than one step
  of the derivation.
Immediate Left-Recursion
A -> Aα | β    where β does not start with A
   ⇓ eliminate immediate left recursion
A  -> βA'
A' -> αA' | ε    (an equivalent grammar)

In general,
A -> Aα1 | ... | Aαm | β1 | ... | βn
     where β1 ... βn do not start with A
   ⇓ eliminate immediate left recursion
A  -> β1A' | ... | βnA'
A' -> α1A' | ... | αmA' | ε    (an equivalent grammar)
Immediate Left-Recursion -- Example
E -> E+T | T
T -> T*F | F
F -> id | (E)
   ⇓ eliminate immediate left recursion
E  -> T E'
E' -> +T E' | ε
T  -> F T'
T' -> *F T' | ε
F  -> id | (E)
Left-Recursion -- Problem
• A grammar may not be immediately left-recursive, but it still can
  be left-recursive.
• By just eliminating the immediate left-recursion, we may not get
  a grammar which is not left-recursive.
S -> Aa | b
A -> Sc | d
This grammar is not immediately left-recursive, but it is still
left-recursive:
    S => Aa => Sca    or    A => Sc => Aac
cause left-recursion.
• So, we have to eliminate all left-recursions from our grammar.
Eliminate Left-Recursion -- Algorithm
- Arrange non-terminals in some order: A1 ... An
- for i from 1 to n do {
    - for j from 1 to i-1 do {
        replace each production
          Ai -> Aj γ
        by
          Ai -> α1 γ | ... | αk γ
        where Aj -> α1 | ... | αk
      }
    - eliminate immediate left-recursions among the Ai productions
  }
Eliminate Left-Recursion -- Example
S -> Aa | b
A -> Ac | Sd | f
- Order of non-terminals: S, A
for S:
  - we do not enter the inner loop.
  - there is no immediate left recursion in S.
for A:
  - Replace A -> Sd with A -> Aad | bd
    So, we will have A -> Ac | Aad | bd | f
  - Eliminate the immediate left-recursion in A:
      A  -> bdA' | fA'
      A' -> cA' | adA' | ε
Eliminate Left-Recursion -- Example
So, the resulting equivalent grammar which is not left-recursive
is:
S  -> Aa | b
A  -> bdA' | fA'
A' -> cA' | adA' | ε
Eliminate Left-Recursion – Example2
S -> Aa | b
A -> Ac | Sd | f
- Order of non-terminals: A, S
for A:
  - we do not enter the inner loop.
  - Eliminate the immediate left-recursion in A:
      A  -> SdA' | fA'
      A' -> cA' | ε
for S:
  - Replace S -> Aa with S -> SdA'a | fA'a
    So, we will have S -> SdA'a | fA'a | b
  - Eliminate the immediate left-recursion in S:
      S  -> fA'aS' | bS'
      S' -> dA'aS' | ε
So, the resulting equivalent grammar which is not left-recursive
is:
S  -> fA'aS' | bS'
S' -> dA'aS' | ε
A  -> SdA' | fA'
A' -> cA' | ε
Left-Factoring
• A predictive parser (a top-down parser without backtracking)
  insists that the grammar must be left-factored.
  grammar -> a new equivalent grammar suitable for predictive
  parsing
stmt -> if expr then stmt else stmt |
        if expr then stmt
• When we see if, we cannot know which production rule to choose to
  re-write stmt in the derivation.
Left-Factoring (cont.)
• In general,
    A -> αβ1 | αβ2
  where α is non-empty and the first symbols of β1 and β2 (if they
  have one) are different.
• When processing α we cannot know whether to expand
    A to αβ1 or
    A to αβ2
• But, if we re-write the grammar as follows
    A  -> αA'
    A' -> β1 | β2
  we can immediately expand A to αA'.
Left-Factoring -- Algorithm
• For each non-terminal A with two or more alternatives (production
  rules) with a common non-empty prefix, let us say
    A -> αβ1 | ... | αβn | γ1 | ... | γm
  convert it into
    A  -> αA' | γ1 | ... | γm
    A' -> β1 | ... | βn
Left-Factoring – Example1
A -> abB | aB | cdg | cdeB | cdfB
   ⇓
A  -> aA' | cdg | cdeB | cdfB
A' -> bB | B
   ⇓
A   -> aA' | cdA''
A'  -> bB | B
A'' -> g | eB | fB
Left-Factoring – Example2
A -> ad | a | ab | abc | b
   ⇓
A  -> aA' | b
A' -> d | ε | b | bc
   ⇓
A   -> aA' | b
A'  -> d | ε | bA''
A'' -> ε | c
Unit-2
Bottom-Up Parsing
Bottom-Up Parsing
• A bottom-up parser creates the parse tree of the
given input starting from leaves towards the root.
• A bottom-up parser tries to find the right-most
derivation of the given input in the reverse order.
S  ...   (the right-most derivation of )
 (the bottom-up parser finds the right-most derivation in
the reverse order)
• Bottom-up parsing is also known as shift-reduce
parsing because its two main actions are shift and
reduce.
– At each shift action, the current symbol in the input string is pushed
to a stack.
Bottom-Up Parsing
  – At each reduction step, the symbols at the top of the stack
    (this symbol sequence is the right side of a production) will
    be replaced by the non-terminal at the left side of that
    production.
  – There are also two more actions: accept and error.
Shift-Reduce Parsing
• A shift-reduce parser tries to reduce the given input string to
  the starting symbol.
    a string  --(reduced to)-->  the starting symbol
• At each reduction step, a substring of the input matching the
  right side of a production rule is replaced by the non-terminal
  at the left side of that production rule.
• If the substring is chosen correctly, the right-most derivation
  of that string is created in the reverse order.
    Rightmost derivation:        S =>*rm ω
    Shift-reduce parser finds:   ω <=rm ... <=rm S
Shift-Reduce Parsing -- Example
S -> aABb
A -> aA | a
B -> bB | b

input string:  aaabb
   reduction   aaAbb   (A -> a)
               aAbb    (A -> aA)
               aABb    (B -> b)
               S       (S -> aABb)

S =>rm aABb =>rm aAbb =>rm aaAbb =>rm aaabb
These are the right sentential forms.
Handle
• How do we know which substring is to be replaced at each
  reduction step?
• Informally, a handle of a string is a substring that matches the
  right side of a production rule.
  – But not every substring that matches the right side of a
    production rule is a handle.
• A handle of a right sentential form γ (= αβω) is a production
  rule A -> β and a position of γ where the string β may be found
  and replaced by A to produce the previous right-sentential form
  in a rightmost derivation of γ:
    S =>*rm αAω =>rm αβω
• If the grammar is unambiguous, then every right-sentential form
  of the grammar has exactly one handle.
• We will see that ω is a string of terminals.
Handle Pruning
• Then find a handle An-1n-1 inn-1,
and replace n-1 in by An-1 to getn-2.
• Repeat this, until we reach S.
• A right-most derivation in reverse can be obtained by
handle-pruning.
rm rm rm rm rm
S=0  1  2  ...  n-1  n=
input string
• Start from n, find a handle Ann in n,
and replace n in by An to getn-1.
A Shift-Reduce Parser
E  E+T |T
T  T*F |F
F  (E) | id
Right-Most Derivation of id+id*id
E  E+T  E+T*F  E+T*id  E+F*id
 E+id*id  T+id*id  F+id*id  id+id*id
Right-Most Sentential Form
id+id*id
F+id*id
T+id*id
E+id*id
Reducing Production
F  id
T  F
E  T
F  id
T  F
F  id
T  T*F
E  E+T
Handles are red and underlined in the right-sentential
E+F*id
E+T*id
E+T*F
E+T
E
forms.
A Stack Implementation of A Shift-Reduce Parser
• There are four possible actions of a shift-reduce parser:
  1. Shift: the next input symbol is shifted onto the top of the stack.
  2. Reduce: replace the handle on the top of the stack by the non-
     terminal on the left side of the production.
  3. Accept: successful completion of parsing.
  4. Error: the parser discovers a syntax error and calls an error recovery
     routine.
• Initially the stack contains only the end-marker $.
• The end of the input string is marked by the end-
  marker $.
A Stack Implementation of A Shift-Reduce Parser
Stack      Input        Action
$          id+id*id$    shift
$id        +id*id$      reduce by F → id
$F         +id*id$      reduce by T → F
$T         +id*id$      reduce by E → T
$E         +id*id$      shift
$E+        id*id$       shift
$E+id      *id$         reduce by F → id
$E+F       *id$         reduce by T → F
$E+T       *id$         shift
$E+T*      id$          shift
$E+T*id    $            reduce by F → id
$E+T*F     $            reduce by T → T*F
$E+T       $            reduce by E → E+T
$E         $            accept

Parse tree (nodes numbered in order of creation):
  E8 → E3 + T7      E3 → T2,  T2 → F1,  F1 → id
  T7 → T5 * F6      T5 → F4,  F4 → id,  F6 → id
Conflicts During Shift-Reduce Parsing
• There are context-free grammars for which shift-reduce
  parsers cannot be used.
• The stack contents and the next input symbol may not be
  enough to decide the action:
  – shift/reduce conflict: the parser cannot decide whether to shift or to reduce.
  – reduce/reduce conflict: the parser cannot decide which of several
    reductions to make.
• If a shift-reduce parser cannot be used for a grammar,
  that grammar is called a non-LR(k) grammar.

  LR(k):  Left-to-right scanning, Right-most derivation (in reverse),
          k symbols of lookahead.

• An ambiguous grammar can never be an LR grammar.
Shift-Reduce Parsers
• There are two main categories of shift-reduce
  parsers:
  1. Operator-Precedence Parser
     – simple, but handles only a small class of grammars.
  2. LR-Parsers
     – cover a wide range of grammars.
     • SLR – simple LR parser
     • LR – most general LR parser
     • LALR – intermediate LR parser (lookahead LR parser)

  (The grammar classes nest: SLR ⊂ LALR ⊂ LR ⊂ CFG.)
Operator-Precedence Parser
• Operator grammars form a small but important class of grammars.
  – We can have an efficient operator-precedence parser (a shift-
    reduce parser) for an operator grammar.
• In an operator grammar, no production rule can
  have:
  – ε on the right side, or
  – two adjacent non-terminals on the right side.
• Ex:
  E → AB            E → EOE           E → E+E | E*E |
  A → a             E → id                E/E | id
  B → b             O → + | * | /
  not an operator   not an operator   operator grammar
  grammar           grammar
Precedence Relations
• In operator-precedence parsing, we define three disjoint
  precedence relations between certain pairs of terminals:

  a <. b    b has higher precedence than a
  a =· b    b has the same precedence as a
  a .> b    b has lower precedence than a

• The determination of the correct precedence relations
  between terminals is based on the traditional notions of
  associativity and precedence of operators. (Unary minus
  causes a problem.)
Using Operator-Precedence Relations
• The intention of the precedence relations is to
  find the handle of a right-sentential form, with
  <. marking the left end,
  =· appearing in the interior of the handle, and
  .> marking the right end.
• In the input string $a1a2...an$, we insert the
  precedence relations between the pairs of
  terminals (wherever the precedence relation holds
  between the terminals in that pair).
Using Operator-Precedence Relations
E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id

The partial operator-precedence table for this grammar:

        id     +      *      $
  id           .>     .>     .>
  +     <.     .>     <.     .>
  *     <.     .>     .>     .>
  $     <.     <.     <.

• Then the input string id+id*id with the precedence
  relations inserted will be:
      $ <. id .> + <. id .> * <. id .> $
To Find the Handles
1. Scan the string from the left end until the first .> is
   encountered.
2. Then scan backwards (to the left) over any =· until a
   <. is encountered.
3. The handle contains everything to the left of the first .>
   and to the right of the <. encountered in step 2.
E  id
E  id
E  id
$ id + id * id $
$ E + id * id $
$ E + E * id $
$ E + E * .E $
E  E+E
E  E*E
$ E + E $
$ <. id .> + <. id .> * <. id .> $
$ <. + <. id .> * <. id .> $
$ <. + <. * <. id .> $
$ <. + <. * .>$
$ <. + .> $
$ $ $ E $
Operator-Precedence Parsing Algorithm
• The input string is w$, the initial stack contains only $, and a table
  holds the precedence relations between certain terminals.
  (A runnable sketch of this loop follows.)

Algorithm:
  set p to point to the first symbol of w$ ;
  repeat forever
    if ( $ is on top of the stack and p points to $ ) then
      return
    else {
      let a be the topmost terminal symbol on the stack and
      let b be the symbol pointed to by p;
      if ( a <. b or a =· b ) then {        /* SHIFT */
        push b onto the stack;
        advance p to the next input symbol;
      }
      else if ( a .> b ) then               /* REDUCE */
        repeat pop stack
        until ( the top-of-stack terminal is related
                by <. to the terminal most recently popped );
      else error();
    }
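A minimal Python sketch of the loop above, driven by the partial table for
id, +, * and $. The stack here holds terminals only (non-terminals never
influence the decisions), which is a simplification of the full parser:

  # PREC[a][b] is the relation between stack terminal a and input terminal b
  PREC = {
      'id': {'+': '.>', '*': '.>', '$': '.>'},
      '+':  {'id': '<.', '+': '.>', '*': '<.', '$': '.>'},
      '*':  {'id': '<.', '+': '.>', '*': '.>', '$': '.>'},
      '$':  {'id': '<.', '+': '<.', '*': '<.'},
  }

  def parse(tokens):
      stack, p = ['$'], 0
      while True:
          a, b = stack[-1], tokens[p]
          if a == '$' and b == '$':
              print('accept'); return
          rel = PREC[a].get(b)
          if rel in ('<.', '=·'):              # SHIFT
              stack.append(b); p += 1
              print(f'shift {b}')
          elif rel == '.>':                    # REDUCE
              top = stack.pop()                # pop until top <. last popped
              while PREC[stack[-1]].get(top) != '<.':
                  top = stack.pop()
              print(f'reduce (handle ends with {top})')
          else:
              print(f'error between {a} and {b}'); return

  parse(['id', '+', 'id', '*', 'id', '$'])

Running it on id+id*id produces exactly the shift/reduce sequence shown in
the example that follows.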
Operator-Precedence Parsing Algorithm -- Example
stack    input        relation     action
$        id+id*id$    $ <. id      shift
$id      +id*id$      id .> +      reduce by E → id
$        +id*id$                   shift
$+       id*id$                    shift
$+id     *id$         id .> *      reduce by E → id
$+       *id$                      shift
$+*      id$                       shift
$+*id    $            id .> $      reduce by E → id
$+*      $            * .> $       reduce by E → E*E
$+       $            + .> $       reduce by E → E+E
$        $                         accept
How to Create Operator-Precedence Relations
• We use the associativity and precedence relations among operators.
1. If operator O1 has higher precedence than operator O2:
     O1 .> O2  and  O2 <. O1
2. If operator O1 and operator O2 have equal precedence:
     they are left-associative   ⇒  O1 .> O2  and  O2 .> O1
     they are right-associative  ⇒  O1 <. O2  and  O2 <. O1
3. For all operators O:
     O <. id,   id .> O,   O <. (,   ( <. O,
     O .> ),    ) .> O,    O .> $,   $ <. O
4. Also:
     ( =· )     $ <. (     id .> )    ) .> $
     ( <. (     $ <. id    id .> $    ) .> )
     ( <. id
Operator-Precedence Relations
        +     -     *     /     ^     id    (     )     $
  +     .>    .>    <.    <.    <.    <.    <.    .>    .>
  -     .>    .>    <.    <.    <.    <.    <.    .>    .>
  *     .>    .>    .>    .>    <.    <.    <.    .>    .>
  /     .>    .>    .>    .>    <.    <.    <.    .>    .>
  ^     .>    .>    .>    .>    <.    <.    <.    .>    .>
  id    .>    .>    .>    .>    .>                .>    .>
  (     <.    <.    <.    <.    <.    <.    <.    =·
  )     .>    .>    .>    .>    .>                .>    .>
  $     <.    <.    <.    <.    <.    <.    <.

(Blank entries are errors. A sketch that derives the operator-vs-operator
entries of this table from rules 1 and 2 follows.)
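A small Python sketch that mechanically generates the operator-vs-operator
part of the table from precedence levels and associativity (rules 1 and 2
above). The level numbers are an arbitrary but conventional choice:

  PRECEDENCE = {'+': 1, '-': 1, '*': 2, '/': 2, '^': 3}
  ASSOC = {'+': 'left', '-': 'left', '*': 'left', '/': 'left', '^': 'right'}

  def relation(o1, o2):
      """Relation o1 ? o2 when o1 is on the stack and o2 is the next input."""
      if PRECEDENCE[o1] > PRECEDENCE[o2]:
          return '.>'
      if PRECEDENCE[o1] < PRECEDENCE[o2]:
          return '<.'
      # equal precedence: decided by associativity (rule 2)
      return '.>' if ASSOC[o1] == 'left' else '<.'

  for o1 in PRECEDENCE:
      print(o1, {o2: relation(o1, o2) for o2 in PRECEDENCE})
  # e.g. relation('+', '*') == '<.' and relation('*', '+') == '.>',
  # and relation('^', '^') == '<.' because ^ is right-associative,
  # matching the corresponding rows of the table above.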
Handling Unary Minus
• Operator-precedence parsing cannot handle the unary
  minus when we also have the binary minus in our
  grammar.
• The best approach to this problem is to let the lexical
  analyzer handle it:
  – The lexical analyzer will return two different operators for the unary
    minus and the binary minus.
  – The lexical analyzer will need lookahead to distinguish the binary minus
    from the unary minus.
• Then, we make
     O <. unary-minus     for any operator O
     unary-minus .> O     if unary-minus has higher precedence than O
     unary-minus <. O     if unary-minus has lower (or equal)
                          precedence than O
Precedence Functions
• Compilers using operator-precedence parsers do
  not need to store the table of precedence
  relations.
• The table can be encoded by two precedence
  functions f and g that map terminal symbols to
  integers: for symbols a and b,
    f(a) < g(b) whenever a <. b
    f(a) = g(b) whenever a =· b
    f(a) > g(b) whenever a .> b
  (An example encoding follows.)
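A Python sketch of precedence functions for the id/+/*/$ sub-table above.
These particular integer values are the classical textbook choice; they are
one consistent encoding among many, and the check below verifies them:

  f = {'id': 4, '+': 2, '*': 4, '$': 0}
  g = {'id': 5, '+': 1, '*': 3, '$': 0}

  RELATIONS = {  # the partial table from the earlier slides
      ('id', '+'): '.>', ('id', '*'): '.>', ('id', '$'): '.>',
      ('+', 'id'): '<.', ('+', '+'): '.>', ('+', '*'): '<.', ('+', '$'): '.>',
      ('*', 'id'): '<.', ('*', '+'): '.>', ('*', '*'): '.>', ('*', '$'): '.>',
      ('$', 'id'): '<.', ('$', '+'): '<.', ('$', '*'): '<.',
  }

  for (a, b), rel in RELATIONS.items():
      ok = {'<.': f[a] < g[b], '=·': f[a] == g[b], '.>': f[a] > g[b]}[rel]
      assert ok, (a, b, rel)
  print("f and g encode every relation in the table")

The parser then compares f(a) with g(b) instead of looking up the table,
trading a little generality (error entries are lost) for space.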
Disadvantages of Operator Precedence Parsing
• Disadvantages:
– It cannot handle the unary minus (the lexical analyzer should
handle the unary minus).
– Small class of grammars.
– Difficult to decide which language is recognized by the
grammar.
• Advantages:
– simple
– powerful enough for expressions in programming languages
Error Recovery in Operator-Precedence Parsing
Error Cases:
1. No relation holds between the terminal on the top of
stack and the next input symbol.
2. A handle is found (reduction step), but there is no
production with this handle as a right side
Error Recovery:
1. Each empty entry is filled with a pointer to an error
   routine.
2. Decide which right-hand side the popped handle "looks
   like", and try to recover from that situation.
LR Parsers
• The most powerful (yet efficient) shift-reduce
  parsing method is LR(k) parsing:
    L – left-to-right scanning
    R – right-most derivation (in reverse)
    k – number of lookahead symbols (when k is omitted, it is 1)
LR Parsers
• LR parsing is attractive because:
  – LR parsing is the most general non-backtracking shift-reduce
    parsing method, yet it is still efficient.
  – The class of grammars that can be parsed using LR
    methods is a proper superset of the class of grammars
    that can be parsed with predictive parsers:
        LL(1) grammars ⊂ LR(1) grammars
  – An LR parser can detect a syntactic error as soon as it is
    possible to do so in a left-to-right scan of the input.
LR Parsers
• LR-Parsers
  – cover a wide range of grammars.
  – SLR – simple LR parser
  – CLR – most general LR parser
  – LALR – intermediate LR parser (lookahead LR parser)
  – SLR, CLR and LALR parsers work the same way (they use the same
    algorithm); only their parsing tables are different.
LR Parsing Algorithm
An LR parser has:
  – a stack holding alternating states and grammar symbols,
        S0 X1 S1 ... Xm Sm    (Sm on top),
  – an input buffer a1 ... ai ... an $, and
  – a parsing table with two parts:
      • an Action table, indexed by states and terminals (including $);
        each entry is one of four different actions, and
      • a Goto table, indexed by states and non-terminals; each entry is
        a state number.
The parser also produces an output: the sequence of reductions performed.
A Configuration of the LR Parsing Algorithm
• A configuration of an LR parser is:

      ( S0 X1 S1 ... Xm Sm,  ai ai+1 ... an $ )
            Stack               Rest of Input

• Sm and ai decide the parser action by consulting the
  parsing action table. (The initial stack contains just S0.)
• A configuration of an LR parser represents the right-
  sentential form:
      X1 ... Xm ai ai+1 ... an $
Actions of an LR-Parser
1. shift s -- shifts the next input symbol and the state s onto
   the stack:
   ( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) → ( S0 X1 S1 ... Xm Sm ai s, ai+1 ... an $ )
2. reduce A → β (or rn, where n is a production number)
   – pop 2r items from the stack, where r = |β|;
   – then push A and s, where s = goto[sm-r, A]:
   ( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) → ( S0 X1 S1 ... Xm-r Sm-r A s, ai ... an $ )
   – The output is the reducing production: reduce A → β
3. Accept -- parsing successfully completed.
4. Error -- the parser detected an error (an empty entry in the
   action table).
Reduce Action
• Pop 2r items from the stack, where r = |β| and
  β = Y1Y2...Yr;
• then push A and s, where s = goto[sm-r, A]:

  ( S0 X1 S1 ... Xm-r Sm-r Y1 Sm-r+1 ... Yr Sm, ai ai+1 ... an $ )
      → ( S0 X1 S1 ... Xm-r Sm-r A s, ai ai+1 ... an $ )

• In fact, Y1Y2...Yr is a handle:
  X1 ... Xm-r A ai ... an $ ⇒ X1 ... Xm-r Y1...Yr ai ai+1 ... an $
state id + * ( ) $ E T F
0 s5 s4 1 2 3
1 s6 acc
2 r2 s7 r2 r2
3 r4 r4 r4 r4
4 s5 s4 8 2 3
5 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10
8 s6 s11
9 r1 s7 r1 r1
10 r3 r3 r3 r3
11 r5 r5 r5 r5
(SLR) Parsing Tables for Expression Grammar
Action Table Goto Table
1) E  E+T
2) E  T
3) T  T*F
4)T  F 5)
F  (E)
6) F  id
Actions of an (S)LR-Parser -- Example
stack        input        action                output
0            id*id+id$    shift 5
0id5         *id+id$      reduce by F → id      F → id
0F3          *id+id$      reduce by T → F       T → F
0T2          *id+id$      shift 7
0T2*7        id+id$       shift 5
0T2*7id5     +id$         reduce by F → id      F → id
0T2*7F10     +id$         reduce by T → T*F     T → T*F
0T2          +id$         reduce by E → T       E → T
0E1          +id$         shift 6
0E1+6        id$          shift 5
0E1+6id5     $            reduce by F → id      F → id
0E1+6F3      $            reduce by T → F       T → F
0E1+6T9      $            reduce by E → E+T     E → E+T
0E1          $            accept
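A Python sketch of the table-driven LR loop, with the SLR Action/Goto tables
above encoded as dictionaries. The stack holds states only (grammar symbols
are redundant for the decisions), which is a common simplification:

  # Productions 1-6: E->E+T, E->T, T->T*F, T->F, F->(E), F->id
  PRODS = {1: ('E', 3), 2: ('E', 1), 3: ('T', 3),
           4: ('T', 1), 5: ('F', 3), 6: ('F', 1)}   # (head, |right side|)
  ACTION = {
      (0,'id'):('s',5), (0,'('):('s',4), (1,'+'):('s',6), (1,'$'):('acc',0),
      (2,'+'):('r',2), (2,'*'):('s',7), (2,')'):('r',2), (2,'$'):('r',2),
      (3,'+'):('r',4), (3,'*'):('r',4), (3,')'):('r',4), (3,'$'):('r',4),
      (4,'id'):('s',5), (4,'('):('s',4),
      (5,'+'):('r',6), (5,'*'):('r',6), (5,')'):('r',6), (5,'$'):('r',6),
      (6,'id'):('s',5), (6,'('):('s',4), (7,'id'):('s',5), (7,'('):('s',4),
      (8,'+'):('s',6), (8,')'):('s',11),
      (9,'+'):('r',1), (9,'*'):('s',7), (9,')'):('r',1), (9,'$'):('r',1),
      (10,'+'):('r',3), (10,'*'):('r',3), (10,')'):('r',3), (10,'$'):('r',3),
      (11,'+'):('r',5), (11,'*'):('r',5), (11,')'):('r',5), (11,'$'):('r',5),
  }
  GOTO = {(0,'E'):1,(0,'T'):2,(0,'F'):3,(4,'E'):8,(4,'T'):2,(4,'F'):3,
          (6,'T'):9,(6,'F'):3,(7,'F'):10}

  def parse(tokens):
      stack, i = [0], 0                     # stack of states
      while True:
          kind, n = ACTION.get((stack[-1], tokens[i]), ('err', 0))
          if kind == 's':
              stack.append(n); i += 1       # shift: push state n
          elif kind == 'r':
              head, size = PRODS[n]
              del stack[len(stack) - size:] # pop one state per symbol of beta
              stack.append(GOTO[(stack[-1], head)])
              print(f'reduce by production {n} ({head})')
          elif kind == 'acc':
              print('accept'); return
          else:
              print('error'); return

  parse(['id', '*', 'id', '+', 'id', '$'])

Running it prints the same reduction sequence as the trace above
(F → id, T → F, ..., E → E+T, accept).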
Constructing SLR Parsing Tables – LR(0) Item
• An LR(0) item of a grammar G is a production of G with a dot at
  some position of the right side.
• Ex: A → aBb    Possible LR(0) items:  A → .aBb
     (four different possibilities)     A → a.Bb
                                        A → aB.b
                                        A → aBb.
• Sets of LR(0) items will be the states of the action and goto tables of
  the SLR parser.
• A collection of sets of LR(0) items (the canonical LR(0) collection)
  is the basis for constructing SLR parsers.
• Augmented Grammar:
  G' is G with a new production rule S' → S, where S' is the new
  starting symbol.
The Closure Operation
• If I is a set of LR(0) items for a grammar G,
  then closure(I) is the set of LR(0) items
  constructed from I by two rules (a sketch follows):
  1. Initially, every LR(0) item in I is added to closure(I).
  2. If A → α.Bβ is in closure(I) and B → γ is a production rule
     of G, then B → .γ will be in closure(I).
  We apply rule 2 until no more new LR(0) items can
  be added to closure(I).
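A Python sketch of these two rules. An item is represented as a tuple
(head, body, dot position); the grammar encoding is an illustrative choice:

  GRAMMAR = {               # the expression grammar, augmented
      "E'": [('E',)],
      'E': [('E', '+', 'T'), ('T',)],
      'T': [('T', '*', 'F'), ('F',)],
      'F': [('(', 'E', ')'), ('id',)],
  }

  def closure(items):
      result = set(items)
      changed = True
      while changed:                 # apply rule 2 until a fixpoint
          changed = False
          for head, body, dot in list(result):
              if dot < len(body) and body[dot] in GRAMMAR:  # dot before B
                  for prod in GRAMMAR[body[dot]]:
                      item = (body[dot], prod, 0)           # B -> .gamma
                      if item not in result:
                          result.add(item); changed = True
      return frozenset(result)

  I = closure({("E'", ('E',), 0)})
  print(len(I), "items")    # 7, the items of the example below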
The Closure Operation -- Example
E’  E closure({E’  .E}) =
{ E’  .E
E  .E+T
E  .T
T  .T*F
T  .F
E  E+T
kernel items
E  T
T  T*F
T  F
F  (E)
F  id F  .(E)
F  .id }
Goto Operation
• If I is a set of LR(0) items and X is a grammar symbol (terminal or
  non-terminal), then goto(I,X) is defined as follows:
  – If A → α.Xβ is in I, then every item in closure({A → αX.β}) will be
    in goto(I,X).
Example:
  I = { E' → .E, E → .E+T, E → .T,
        T → .T*F, T → .F,
        F → .(E), F → .id }
  goto(I,E) = { E' → E., E → E.+T }
  goto(I,T) = { E → T., T → T.*F }
  goto(I,F) = { T → F. }
  goto(I,() = { F → (.E), E → .E+T, E → .T, T → .T*F, T → .F,
                F → .(E), F → .id }
  goto(I,id) = { F → id. }
Construction of The Canonical LR(0) Collection
• To create the SLR parsing tables for a grammar G, we
  create the canonical LR(0) collection of the augmented grammar G'.
• Algorithm (a runnable sketch follows):
    C is { closure({S' → .S}) }
    repeat the following until no more sets of LR(0) items can be added to C:
      for each I in C and each grammar symbol X
        if goto(I,X) is not empty and not in C
          add goto(I,X) to C
• The goto function is a DFA on the sets in C.
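A self-contained Python sketch of goto and the canonical collection; it
repeats the closure from the earlier sketch so that it runs standalone:

  GRAMMAR = {"E'": [('E',)], 'E': [('E','+','T'), ('T',)],
             'T': [('T','*','F'), ('F',)], 'F': [('(','E',')'), ('id',)]}

  def closure(items):                      # as in the earlier sketch
      result, changed = set(items), True
      while changed:
          changed = False
          for head, body, dot in list(result):
              if dot < len(body) and body[dot] in GRAMMAR:
                  for prod in GRAMMAR[body[dot]]:
                      item = (body[dot], prod, 0)
                      if item not in result:
                          result.add(item); changed = True
      return frozenset(result)

  def goto(I, X):
      moved = {(h, b, d + 1) for (h, b, d) in I if d < len(b) and b[d] == X}
      return closure(moved) if moved else frozenset()

  def canonical_collection():
      symbols = {'E', 'T', 'F', '+', '*', '(', ')', 'id'}
      C = [closure({("E'", ('E',), 0)})]   # I0
      changed = True
      while changed:
          changed = False
          for I in list(C):
              for X in symbols:
                  J = goto(I, X)
                  if J and J not in C:
                      C.append(J); changed = True
      return C

  print(len(canonical_collection()), "sets of items")  # 12: I0 ... I11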
The Canonical LR(0) Collection -- Example
Transition diagram (DFA) of the goto function for the expression grammar;
the states are I0 ... I11 and the edges are:

  goto(I0,E)=I1   goto(I0,T)=I2   goto(I0,F)=I3   goto(I0,()=I4   goto(I0,id)=I5
  goto(I1,+)=I6   goto(I2,*)=I7
  goto(I4,E)=I8   goto(I4,T)=I2   goto(I4,F)=I3   goto(I4,()=I4   goto(I4,id)=I5
  goto(I6,T)=I9   goto(I6,F)=I3   goto(I6,()=I4   goto(I6,id)=I5
  goto(I7,F)=I10  goto(I7,()=I4   goto(I7,id)=I5
  goto(I8,+)=I6   goto(I8,))=I11  goto(I9,*)=I7
Constructing the SLR Parsing Table
(for an augmented grammar G')
1. Construct the canonical collection of sets of
   LR(0) items for G':  C = {I0,...,In}
2. Create the parsing action table as follows:
   • If a is a terminal, A → α.aβ is in Ii and goto(Ii,a) = Ij, then
     action[i,a] is shift j.
   • If A → α. is in Ii, then action[i,a] is reduce A → α for all a
     in FOLLOW(A), where A ≠ S'.
   • If S' → S. is in Ii, then action[i,$] is accept.
   • If any conflicting actions are generated by these rules, the
     grammar is not SLR(1).
3. Create the parsing goto table:
   • for all non-terminals A, if goto(Ii,A) = Ij then goto[i,A] = j
4. All entries not defined by (2) and (3) are
   errors.
5. The initial state of the parser is the one containing S' → .S.
state id + * ( ) $ E T F
0 s5 s4 1 2 3
1 s6 acc
2 r2 s7 r2 r2
3 r4 r4 r4 r4
4 s5 s4 8 2 3
5 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10
8 s6 s11
9 r1 s7 r1 r1
10 r3 r3 r3 r3
11 r5 r5 r5 r5
Parsing Tables of Expression Grammar
Action Table Goto Table
SLR(1) Grammar
• An LR parser using SLR(1) parsing tables for a
  grammar G is called the SLR(1) parser for
  G.
• If a grammar G has an SLR(1) parsing table, it
  is called an SLR(1) grammar (or SLR grammar
  for short).
• Every SLR grammar is unambiguous, but not
  every unambiguous grammar is an SLR
  grammar.
shift/reduce and reduce/reduce conflicts
• If a state does not know whether it will make a shift
operation or reduction for a terminal, we say that
there is a shift/reduce conflict.
• If a state does not know whether it will make a
reduction operation using the production rule i or j
for a terminal, we say that there is a reduce/reduce
conflict.
• If the SLR parsing table of a grammar G has a conflict,
we say that the grammar is not an SLR grammar.
Conflict Example
S' → S      S → L=R | R      L → *R | id      R → L

I0: S' → .S      I1: S' → S.       I6: S → L=.R      I9: S → L=R.
    S → .L=R     I2: S → L.=R          R → .L
    S → .R           R → L.            L → .*R
    L → .*R      I3: S → R.            L → .id
    L → .id      I4: L → *.R       I7: L → *R.
    R → .L           R → .L         I8: R → L.
                     L → .*R
                     L → .id
                 I5: L → id.

Problem: in state I2, on input '=':
    shift 6            (from the item S → L.=R)
    reduce by R → L    (since FOLLOW(R) = {=, $} contains '=')
⇒ shift/reduce conflict
Conflict Example 2
S → AaAb        I0: S' → .S
S → BbBa            S → .AaAb
A → ε               S → .BbBa
B → ε               A → .
                    B → .

Problem: FOLLOW(A) = {a, b} and FOLLOW(B) = {a, b}

  on input a:  reduce by A → ε or reduce by B → ε  ⇒ reduce/reduce conflict
  on input b:  reduce by A → ε or reduce by B → ε  ⇒ reduce/reduce conflict
Constructing Canonical LR(1) Parsing Tables
• In the SLR method, state i makes a reduction by A → α
  when the current token is a:
  – if A → α. is in Ii and a is in FOLLOW(A).
• In some situations, A cannot be followed by the
  terminal a in a right-sentential form when α
  and state i are on the top of the stack. This means
  that making the reduction in this case is not correct.

  S → AaAb        S ⇒ AaAb ⇒ Aab ⇒ ab
  S → BbBa        S ⇒ BbBa ⇒ Bba ⇒ ba
  A → ε
  B → ε           (e.g., at the beginning of the input, reducing by A → ε
                   is valid only on lookahead a, and by B → ε only on b,
                   even though FOLLOW(A) = FOLLOW(B) = {a, b})
LR(1) Item
• To avoid some of these invalid reductions, the states need
  to carry more information.
• Extra information is put into a state by including a
  terminal symbol as a second component of each item.
• An LR(1) item is:
      A → α.β, a
  where a is the lookahead of the LR(1) item
  (a is a terminal or the end-marker $).
LR(1) Item (cont.)
• When β (in the LR(1) item A → α.β, a) is not
  empty, the lookahead does not have any effect.
• When β is empty (A → α., a), we do the
  reduction by A → α only if the next input symbol
  is a (not for every terminal in FOLLOW(A)).
• A state will contain   A → α., a1      where {a1,...,an} ⊆ FOLLOW(A)
                         ...
                         A → α., an
Canonical Collection of Sets of LR(1) Items
• The construction of the canonical collection of
  sets of LR(1) items is similar to the
  construction of the canonical collection of
  sets of LR(0) items, except that the closure and goto
  operations work a little differently.

closure(I) is (where I is a set of LR(1) items):
  – every LR(1) item in I is in closure(I);
  – if A → α.Bβ, a is in closure(I) and B → γ is a production rule of G,
    then B → .γ, b will be in closure(I) for each terminal b in
    FIRST(βa).
goto operation
• If I is a set of LR(1) items and X is a grammar
  symbol (terminal or non-terminal), then
  goto(I,X) is defined as follows:
  – If A → α.Xβ, a is in I,
    then every item in closure({A → αX.β, a}) will be in
    goto(I,X).
Construction of the Canonical LR(1) Collection
• Algorithm:
    C is { closure({S' → .S, $}) }
    repeat the following until no more sets of LR(1) items can be added to C:
      for each I in C and each grammar symbol X
        if goto(I,X) is not empty and not in C
          add goto(I,X) to C
• The goto function is a DFA on the sets in C.
A Short Notation for Sets of LR(1) Items
• A set of LR(1) items containing the items
      A → α.β, a1
      ...
      A → α.β, an
  can be written as
      A → α.β, a1/a2/.../an
Canonical LR(1) Collection -- Example
S → AaAb      I0: S' → .S, $        I1: S' → S., $
S → BbBa          S → .AaAb, $      I2: S → A.aAb, $
A → ε             S → .BbBa, $      I3: S → B.bBa, $
B → ε             A → ., a
                  B → ., b

I4: S → Aa.Ab, $      I6: S → AaA.b, $      I8: S → AaAb., $
    A → ., b
I5: S → Bb.Ba, $      I7: S → BbB.a, $      I9: S → BbBa., $
    B → ., a

(transitions: I0 --S--> I1, I0 --A--> I2, I0 --B--> I3, I2 --a--> I4,
 I3 --b--> I5, I4 --A--> I6, I5 --B--> I7, I6 --b--> I8, I7 --a--> I9)
Canonical LR(1) Collection -- Example 2
S' → S
1) S → L=R     I0: S' → .S, $         I4: L → *.R, $/=      I9:  S → L=R., $
2) S → R           S → .L=R, $            R → .L, $/=       I10: R → L., $
3) L → *R          S → .R, $              L → .*R, $/=      I11: L → *.R, $
4) L → id          L → .*R, $/=           L → .id, $/=           R → .L, $
5) R → L           L → .id, $/=       I5: L → id., $/=           L → .*R, $
                   R → .L, $          I6: S → L=.R, $            L → .id, $
               I1: S' → S., $             R → .L, $         I12: L → id., $
               I2: S → L.=R, $            L → .*R, $        I13: L → *R., $
                   R → L., $              L → .id, $
               I3: S → R., $          I7: L → *R., $/=
                                      I8: R → L., $/=

(transitions: I0: S→I1, L→I2, R→I3, *→I4, id→I5;   I2: =→I6;
 I4: R→I7, L→I8, *→I4, id→I5;   I6: R→I9, L→I10, *→I11, id→I12;
 I11: R→I13, L→I10, *→I11, id→I12)

States with the same core:  I4 and I11,  I5 and I12,  I7 and I13,  I8 and I10.
Construction of LR(1) Parsing Tables
1. Construct the canonical collection of sets of
   LR(1) items for G':  C = {I0,...,In}
2. Create the parsing action table as follows:
   • If a is a terminal, A → α.aβ, b is in Ii and goto(Ii,a) = Ij, then
     action[i,a] is shift j.
   • If A → α., a is in Ii, then action[i,a] is reduce A → α, where
     A ≠ S'.
   • If S' → S., $ is in Ii, then action[i,$] is accept.
   • If any conflicting actions are generated by these rules, the
     grammar is not LR(1).
3. Create the parsing goto table:
   • for all non-terminals A, if goto(Ii,A) = Ij then goto[i,A] = j
4. All entries not defined by (2) and (3) are
   errors.
5. The initial state of the parser is the one containing S' → .S, $.
LR(1) Parsing Tables -- (for Example 2)

state   id    *     =     $     S   L   R
0       s5    s4                1   2   3
1                   acc
2                   s6    r5
3                         r2
4       s5    s4                    8   7
5                   r4    r4
6       s12   s11                   10  9
7                   r3    r3
8                   r5    r5
9                         r1
10                        r5
11      s12   s11                   10  13
12                        r4
13                        r3

no shift/reduce or reduce/reduce conflict
⇒ so, it is an LR(1) grammar
LALR Parsing Tables
• LALR stands for LookAhead LR.
• LALR parsers are often used in practice because LALR
parsing tables are smaller than LR(1) parsing tables.
• The number of states in the SLR and LALR parsing tables for
  a grammar G is the same.
• But LALR parsers recognize more grammars than SLR
  parsers.
• yacc creates a LALR parser for the given grammar.
• A state of LALR parser will be again a set of LR(1) items.
Creating LALR Parsing Tables
                            shrink the number of states
  Canonical LR(1) Parser  ----------------------------->  LALR Parser

• This shrinking process may introduce a
  reduce/reduce conflict in the resulting LALR
  parser (in which case the grammar is NOT LALR).
• But this shrinking process does not produce a
  shift/reduce conflict.
The Core of A Set of LR(1) Items
• The core of a set of LR(1) items is the set of the first components
  of its items (i.e., the items with their lookaheads dropped).
  Ex:  S → L.=R, $               S → L.=R
       R → L., $     ⇒  core:    R → L.
• We find the states (sets of LR(1) items) in a canonical LR(1)
  parser with the same core, and then merge them into a single
  state.
  I1: L → id., =     A new state:  I12: L → id., =/$
  I2: L → id., $
  ⇒ they have the same core, so merge them
The Core of A Set of LR(1) Items
• We will do this for all states of a canonical
LR(1) parser to get the states of the LALR
parser.
• In fact, the number of the states of the LALR
parser for a grammar will be equal to the
number of states of the SLR parser for that
grammar.
Creation of LALR Parsing Tables
• Create the canonical LR(1) collection of the sets of LR(1)
  items for the given grammar.
• Find each core; find all sets having that same core; replace
  those sets having the same core with a single set which is
  their union:
      C = {I0,...,In}  →  C' = {J1,...,Jm}  where m ≤ n
• Create the parsing tables (action and goto tables) in the same way as
  for an LR(1) parser (a merging sketch follows):
  – Note that if J = I1 ∪ ... ∪ Ik, then since I1,...,Ik have the same core,
    the cores of goto(I1,X),...,goto(Ik,X) must also be the same.
  – So goto(J,X) = K, where K is the union of all sets of items having the
    same core as goto(I1,X).
• If no conflict is introduced, the grammar is an LALR(1)
  grammar. (We may only introduce reduce/reduce
  conflicts; we cannot introduce a shift/reduce conflict.)
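A small Python sketch of the core-merging step. Here an LR(1) item is a
(head, body, dot, lookahead) tuple, a state is a frozenset of items, and the
core simply drops the lookahead; the two toy states below are reduced
fragments of I4 and I11 from Example 2, not the full states:

  from collections import defaultdict

  def core(state):
      return frozenset((h, b, d) for (h, b, d, _) in state)

  def merge_by_core(states):
      groups = defaultdict(set)
      for state in states:            # union all states sharing a core
          groups[core(state)] |= state
      return [frozenset(s) for s in groups.values()]

  I4  = frozenset({('L', ('*', 'R'), 1, '$'), ('L', ('*', 'R'), 1, '=')})
  I11 = frozenset({('L', ('*', 'R'), 1, '$')})
  print(len(merge_by_core([I4, I11])))   # 1: they collapse into I411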
Shift/Reduce Conflict
• We claimed that we cannot introduce a shift/reduce conflict during the
  shrinking process that creates the states of an LALR parser.
• Assume that we could introduce a shift/reduce conflict. In that case, a
  state of the LALR parser would have to contain:
      A → α., a     and     B → β.aγ, b
• This means that some state of the canonical LR(1) parser must contain:
      A → α., a     and     B → β.aγ, c
  But this state already has a shift/reduce conflict, i.e. the original
  canonical LR(1) parser had a conflict.
  (The reason: the shift operation does not depend on lookaheads.)
Reduce/Reduce Conflict
• But we may introduce a reduce/reduce conflict during the shrinking
  process that creates the states of an LALR parser:

  I1: A → α., a        I2: A → α., b
      B → β., b            B → β., c

  merged:  I12: A → α., a/b
                B → β., b/c    ⇒ reduce/reduce conflict (on lookahead b)
Canonical LALR(1) Collection -- Example 2
S' → S
1) S → L=R     I0: S' → .S, $           I411: L → *.R, $/=     I9: S → L=R., $
2) S → R           S → .L=R, $                R → .L, $/=
3) L → *R          S → .R, $                  L → .*R, $/=
4) L → id          L → .*R, $/=               L → .id, $/=
5) R → L           L → .id, $/=         I512: L → id., $/=
                   R → .L, $            I6:   S → L=.R, $
               I1: S' → S., $                 R → .L, $
               I2: S → L.=R, $                L → .*R, $
                   R → L., $                  L → .id, $
               I3: S → R., $            I713: L → *R., $/=
                                        I810: R → L., $/=

(transitions: I0: S→I1, L→I2, R→I3, *→I411, id→I512;   I2: =→I6;
 I411: R→I713, L→I810, *→I411, id→I512;
 I6: R→I9, L→I810, *→I411, id→I512)

Same cores merged:  I4/I11 → I411,  I5/I12 → I512,  I7/I13 → I713,
I8/I10 → I810.
LALR(1) Parsing Tables -- (for Example 2)

state   id    *     =     $     S   L   R
0       s5    s4                1   2   3
1                   acc
2                   s6    r5
3                         r2
4       s5    s4                    8   7
5                   r4    r4
6       s12   s11                   10  9
7                   r3    r3
8                   r5    r5
9                         r1

no shift/reduce or reduce/reduce conflict
⇒ so, it is an LALR(1) grammar

(States 4, 5, 7 and 8 stand for the merged states I411, I512, I713 and
I810, so the shifts s11 and s12 also lead into merged states.)
Using Ambiguous Grammars
• All grammars used in the construction of LR parsing tables must be
  unambiguous.
• Can we create LR parsing tables for ambiguous grammars?
  – Yes, but they will have conflicts.
  – We can resolve these conflicts in favor of one of the actions to
    disambiguate the grammar.
  – In the end, we will again have an unambiguous grammar.
• Why do we want to use an ambiguous grammar?
  – Some ambiguous grammars are much more natural, and a
    corresponding unambiguous grammar can be very complex.
  – Usage of an ambiguous grammar may eliminate unnecessary reductions.
• Ex.
                                     E → E+T | T
  E → E+E | E*E | (E) | id    ⇔     T → T*F | F
                                     F → (E) | id
Sets of LR(0) Items for the Ambiguous Grammar

I0: E' → .E       I1: E' → E.       I2: E → (.E)      I3: E → id.
    E → .E+E          E → E.+E          E → .E+E
    E → .E*E          E → E.*E          E → .E*E
    E → .(E)                            E → .(E)
    E → .id                             E → .id

I4: E → E+.E      I5: E → E*.E      I6: E → (E.)
    E → .E+E          E → .E+E          E → E.+E
    E → .E*E          E → .E*E          E → E.*E
    E → .(E)          E → .(E)
    E → .id           E → .id

I7: E → E+E.      I8: E → E*E.      I9: E → (E).
    E → E.+E          E → E.+E
    E → E.*E          E → E.*E

(transitions: I0: E→I1, (→I2, id→I3;   I1: +→I4, *→I5;
 I2: E→I6, (→I2, id→I3;   I4: E→I7, (→I2, id→I3;
 I5: E→I8, (→I2, id→I3;   I6: )→I9, +→I4, *→I5;
 I7: +→I4, *→I5;   I8: +→I4, *→I5)
SLR Parsing Tables for the Ambiguous Grammar
FOLLOW(E) = { $, +, *, ) }
State I7 has shift/reduce conflicts on the symbols + and *
(I7 is reached via I0 --E--> I1 --+--> I4 --E--> I7):

  when the current token is +:
      shift   ⇒ + is right-associative
      reduce  ⇒ + is left-associative
  when the current token is *:
      shift   ⇒ * has higher precedence than +
      reduce  ⇒ + has higher precedence than *

State I8 likewise has shift/reduce conflicts on + and *
(I8 is reached via I0 --E--> I1 --*--> I5 --E--> I8):

  when the current token is *:
      shift   ⇒ * is right-associative
      reduce  ⇒ * is left-associative
  when the current token is +:
      shift   ⇒ + has higher precedence than *
      reduce  ⇒ * has higher precedence than +
id + * ( ) $ E
0 s3 s2 1
1 s4 s5 acc
2 s3 s2 6
3 r4 r4 r4 r4
4 s3 s2 7
5 s3 s2 8
6 s4 s5 s9
7 r1 s5 r1 r1
8 r2 r2 r2 r2
9 r3 r3 r3 r3
SLR-Parsing Tables for Ambiguous Grammar
Action Goto
Error Recovery in LR Parsing
• An LR parser will detect an error when it consults the
parsing action table and finds an error entry. All empty
entries in the action table are error entries.
• Errors are never detected by consulting the goto table.
• An LR parser will announce error as soon as there is no
valid continuation for the scanned portion of the input.
• A canonical LR parser (LR(1) parser) will never make
even a single reduction before announcing an error.
• The SLR and LALR parsers may make several reductions
before announcing an error.
• But, all LR parsers (LR(1), LALR and SLR parsers) will
never shift an erroneous input symbol onto the stack.
Panic Mode Error Recovery in LR Parsing
• Scan down the stack until a state s with a goto on a
particular nonterminal A is found. (Get rid of
everything from the stack before this state s).
• Discard zero or more input symbols until a symbol a is
  found that can legitimately follow A.
  – The symbol a is simply in FOLLOW(A), but this may not work for all
    situations.
• The parser stacks the nonterminal A and the state
  goto[s,A], and it resumes normal parsing.
• This nonterminal A normally represents a major program
  construct (there can be more than one choice for A):
  – stmt, expr, block, ...
Phrase-Level Error Recovery in LR Parsing
• Each empty entry in the action table is marked
with a specific error routine.
• An error routine reflects the error that the user
most likely will make in that case.
• An error routine inserts the symbols into the
stack or the input (or it deletes the symbols
from the stack and the input, or it can do both
insertion and deletion).
– missing operand
– unbalanced right parenthesis
UNIT-3
PART-A
Semantic analysis
Syntax Analysis Example
a := b + c * 100
The seven tokens are grouped into a parse tree:

  assignment stmt
  ├─ identifier: a
  ├─ :=
  └─ expression
     ├─ expression: identifier b
     ├─ +
     └─ expression: c * 100
Example of Parse Tree
Given the grammar:
  list  → list + digit                            (2.2)
  list  → list - digit                            (2.3)
  list  → digit                                   (2.4)
  digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9   (2.5)

What is the parse tree for 9-5+2?

           list
         /   |   \
      list   +   digit
     /  |  \        |
   list -  digit    2
     |       |
   digit     5
     |
     9
Abstract Syntax Tree (AST)
The AST is a condensed/simplified/abstract form of
the parse tree in which:
1. Operators are directly associated with interior nodes
(non-terminals)
2. Chains of single productions are collapsed.
3. Terminals that have no attributes are ignored, i.e.,
the corresponding leaf nodes are discarded.
Abstract and Concrete Trees
  Parse (concrete) tree:           Abstract syntax tree:

           list                            +
         /   |   \                        / \
      list   +   digit                   -   2
     /  |  \        |                   / \
   list -  digit    2                  9   5
     |       |
   digit     5
     |
     9
Advantages of the AST Representation
• Convenient representation for semantic analysis and
  intermediate-language (IL) generation
• Useful for building other programming-language tools, e.g., a
  syntax-directed editor
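A minimal Python sketch of the AST idea: two node classes and the tree for
9-5+2 from the previous slide, plus an evaluator as one example of the
"semantic analysis over the AST" the bullet points describe. The class and
function names are illustrative:

  class Num:
      def __init__(self, value): self.value = value

  class BinOp:
      def __init__(self, op, left, right):
          self.op, self.left, self.right = op, left, right

  # AST for 9-5+2: chains of single productions and attribute-free
  # terminals collapse into just two operator nodes and three leaves.
  ast = BinOp('+', BinOp('-', Num(9), Num(5)), Num(2))

  def evaluate(node):
      if isinstance(node, Num):
          return node.value
      ops = {'+': lambda a, b: a + b, '-': lambda a, b: a - b}
      return ops[node.op](evaluate(node.left), evaluate(node.right))

  print(evaluate(ast))   # 6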
Syntax-Directed Translation (SDT)
Syntax-directed translation is a method of translating a string into a
sequence of actions by attaching one such action to each rule of a grammar.
A syntax-directed translation is defined by augmenting the CFG: a translation
rule is defined for each production. A translation rule defines the translation
of the left-hand-side non-terminal.
Syntax-Directed Definitions and Translation Schemes
A. Syntax-Directed Definitions:
   • give high-level specifications for translations;
   • hide many implementation details, such as the order of evaluation of
     semantic actions;
   • we associate a production rule with a set of semantic actions, and we
     do not say when they will be evaluated.
B. Translation Schemes:
   • indicate the order of evaluation of the semantic actions associated
     with a production rule;
   • in other words, translation schemes give some information about
     implementation details.
Example Syntax-Directed Definition
term ::= ID
  { term.place := ID.place ; term.code := "" }
term1 ::= term2 * ID
  { term1.place := newtemp( );
    term1.code := term2.code || ID.code ||
                  gen(term1.place ':=' term2.place '*' ID.place) }
expr ::= term
  { expr.place := term.place ; expr.code := term.code }
expr1 ::= expr2 + term
  { expr1.place := newtemp( );
    expr1.code := expr2.code || term.code ||
                  gen(expr1.place ':=' expr2.place '+' term.place) }
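A Python sketch in the spirit of the definition above, run over a hand-built
tuple AST instead of during parsing. newtemp and gen are simplified
stand-ins for the slide's helper routines, and the place/code pair mirrors
the two attributes of each non-terminal:

  class SDD:
      def __init__(self): self.count = 0

      def newtemp(self):
          self.count += 1
          return f't{self.count}'

      def gen(self, place, op, left, right):
          return [f'{place} := {left} {op} {right}']

      def expr(self, node):
          """Returns (place, code); leaves play the role of term ::= ID."""
          if isinstance(node, str):          # an identifier leaf
              return node, []
          op, l, r = node
          lp, lc = self.expr(l)
          rp, rc = self.expr(r)
          place = self.newtemp()
          return place, lc + rc + self.gen(place, op, lp, rp)

  place, code = SDD().expr(('+', 'a', ('*', 'b', 'c')))
  print('\n'.join(code))   # t1 := b * c   then   t2 := a + t1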
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx
COMPILER DESIGN PPTS.pptx

More Related Content

Similar to COMPILER DESIGN PPTS.pptx

Pros and cons of c as a compiler language
  Pros and cons of c as a compiler language  Pros and cons of c as a compiler language
Pros and cons of c as a compiler language
Ashok Raj
 
Compiler construction tools
Compiler construction toolsCompiler construction tools
Compiler construction tools
Akhil Kaushik
 
Chapter1pdf__2021_11_23_10_53_20.pdf
Chapter1pdf__2021_11_23_10_53_20.pdfChapter1pdf__2021_11_23_10_53_20.pdf
Chapter1pdf__2021_11_23_10_53_20.pdf
DrIsikoIsaac
 
Concept of compiler in details
Concept of compiler in detailsConcept of compiler in details
Concept of compiler in details
kazi_aihtesham
 
4_5802928814682016556.pptx
4_5802928814682016556.pptx4_5802928814682016556.pptx
4_5802928814682016556.pptx
AshenafiGirma5
 
Introduction to compiler
Introduction to compilerIntroduction to compiler
Introduction to compiler
Abha Damani
 
CD - CH1 - Introduction to compiler design.pptx
CD - CH1 - Introduction to compiler design.pptxCD - CH1 - Introduction to compiler design.pptx
CD - CH1 - Introduction to compiler design.pptx
ZiyadMohammed17
 
Cd ch1 - introduction
Cd   ch1 - introductionCd   ch1 - introduction
Cd ch1 - introduction
mengistu23
 
System software module 4 presentation file
System software module 4 presentation fileSystem software module 4 presentation file
System software module 4 presentation file
jithujithin657
 
Presentation1
Presentation1Presentation1
Presentation1
Zarin Tasnim
 
Presentation1
Presentation1Presentation1
Presentation1
Zarin Tasnim
 
COMPILER CONSTRUCTION KU 1.pptx
COMPILER CONSTRUCTION KU 1.pptxCOMPILER CONSTRUCTION KU 1.pptx
COMPILER CONSTRUCTION KU 1.pptx
Rossy719186
 
1._Introduction_.pptx
1._Introduction_.pptx1._Introduction_.pptx
1._Introduction_.pptx
Anbarasan Radhakrishnan R
 
Chapter 2 Program language translation.pptx
Chapter 2 Program language translation.pptxChapter 2 Program language translation.pptx
Chapter 2 Program language translation.pptx
dawod yimer
 
Unit 1.pptx
Unit 1.pptxUnit 1.pptx
Unit 1.pptx
NISHASOMSCS113
 
Lecture 01 introduction to compiler
Lecture 01 introduction to compilerLecture 01 introduction to compiler
Lecture 01 introduction to compiler
Iffat Anjum
 
unit1pdf__2021_12_14_12_37_34.pdf
unit1pdf__2021_12_14_12_37_34.pdfunit1pdf__2021_12_14_12_37_34.pdf
unit1pdf__2021_12_14_12_37_34.pdf
DrIsikoIsaac
 
Chapter-1.pptx compiler Design Course Material
Chapter-1.pptx compiler Design Course MaterialChapter-1.pptx compiler Design Course Material
Chapter-1.pptx compiler Design Course Material
gadisaAdamu
 
Introduction to Compiler Construction
Introduction to Compiler Construction Introduction to Compiler Construction
Introduction to Compiler Construction
Sarmad Ali
 
Compilers.pptx
Compilers.pptxCompilers.pptx
Compilers.pptx
MohammedMohammed578197
 

Similar to COMPILER DESIGN PPTS.pptx (20)

Pros and cons of c as a compiler language
  Pros and cons of c as a compiler language  Pros and cons of c as a compiler language
Pros and cons of c as a compiler language
 
Compiler construction tools
Compiler construction toolsCompiler construction tools
Compiler construction tools
 
Chapter1pdf__2021_11_23_10_53_20.pdf
Chapter1pdf__2021_11_23_10_53_20.pdfChapter1pdf__2021_11_23_10_53_20.pdf
Chapter1pdf__2021_11_23_10_53_20.pdf
 
Concept of compiler in details
Concept of compiler in detailsConcept of compiler in details
Concept of compiler in details
 
4_5802928814682016556.pptx
4_5802928814682016556.pptx4_5802928814682016556.pptx
4_5802928814682016556.pptx
 
Introduction to compiler
Introduction to compilerIntroduction to compiler
Introduction to compiler
 
CD - CH1 - Introduction to compiler design.pptx
CD - CH1 - Introduction to compiler design.pptxCD - CH1 - Introduction to compiler design.pptx
CD - CH1 - Introduction to compiler design.pptx
 
Cd ch1 - introduction
Cd   ch1 - introductionCd   ch1 - introduction
Cd ch1 - introduction
 
System software module 4 presentation file
System software module 4 presentation fileSystem software module 4 presentation file
System software module 4 presentation file
 
Presentation1
Presentation1Presentation1
Presentation1
 
Presentation1
Presentation1Presentation1
Presentation1
 
COMPILER CONSTRUCTION KU 1.pptx
COMPILER CONSTRUCTION KU 1.pptxCOMPILER CONSTRUCTION KU 1.pptx
COMPILER CONSTRUCTION KU 1.pptx
 
1._Introduction_.pptx
1._Introduction_.pptx1._Introduction_.pptx
1._Introduction_.pptx
 
Chapter 2 Program language translation.pptx
Chapter 2 Program language translation.pptxChapter 2 Program language translation.pptx
Chapter 2 Program language translation.pptx
 
Unit 1.pptx
Unit 1.pptxUnit 1.pptx
Unit 1.pptx
 
Lecture 01 introduction to compiler
Lecture 01 introduction to compilerLecture 01 introduction to compiler
Lecture 01 introduction to compiler
 
unit1pdf__2021_12_14_12_37_34.pdf
unit1pdf__2021_12_14_12_37_34.pdfunit1pdf__2021_12_14_12_37_34.pdf
unit1pdf__2021_12_14_12_37_34.pdf
 
Chapter-1.pptx compiler Design Course Material
Chapter-1.pptx compiler Design Course MaterialChapter-1.pptx compiler Design Course Material
Chapter-1.pptx compiler Design Course Material
 
Introduction to Compiler Construction
Introduction to Compiler Construction Introduction to Compiler Construction
Introduction to Compiler Construction
 
Compilers.pptx
Compilers.pptxCompilers.pptx
Compilers.pptx
 

More from MUSHAMHARIKIRAN6737

184544seminarneuralink-210514063641 (5).pdf
184544seminarneuralink-210514063641 (5).pdf184544seminarneuralink-210514063641 (5).pdf
184544seminarneuralink-210514063641 (5).pdf
MUSHAMHARIKIRAN6737
 
Basic Introduction Part 1 (1).pdf
Basic Introduction Part 1 (1).pdfBasic Introduction Part 1 (1).pdf
Basic Introduction Part 1 (1).pdf
MUSHAMHARIKIRAN6737
 
FSD JavaScript Part 11.pdf
FSD JavaScript Part 11.pdfFSD JavaScript Part 11.pdf
FSD JavaScript Part 11.pdf
MUSHAMHARIKIRAN6737
 
kiran chandu ppt.pptx
kiran chandu ppt.pptxkiran chandu ppt.pptx
kiran chandu ppt.pptx
MUSHAMHARIKIRAN6737
 
Design is a blueprint for the development and implementation.docx
Design is a blueprint for the development and implementation.docxDesign is a blueprint for the development and implementation.docx
Design is a blueprint for the development and implementation.docx
MUSHAMHARIKIRAN6737
 
A_Research_Paper_on_College_Management_S.pdf
A_Research_Paper_on_College_Management_S.pdfA_Research_Paper_on_College_Management_S.pdf
A_Research_Paper_on_College_Management_S.pdf
MUSHAMHARIKIRAN6737
 
Mini Project Abstract Review Sample PPT Template (2).pptx
Mini Project Abstract Review Sample PPT Template (2).pptxMini Project Abstract Review Sample PPT Template (2).pptx
Mini Project Abstract Review Sample PPT Template (2).pptx
MUSHAMHARIKIRAN6737
 

More from MUSHAMHARIKIRAN6737 (7)

184544seminarneuralink-210514063641 (5).pdf
184544seminarneuralink-210514063641 (5).pdf184544seminarneuralink-210514063641 (5).pdf
184544seminarneuralink-210514063641 (5).pdf
 
Basic Introduction Part 1 (1).pdf
Basic Introduction Part 1 (1).pdfBasic Introduction Part 1 (1).pdf
Basic Introduction Part 1 (1).pdf
 
FSD JavaScript Part 11.pdf
FSD JavaScript Part 11.pdfFSD JavaScript Part 11.pdf
FSD JavaScript Part 11.pdf
 
kiran chandu ppt.pptx
kiran chandu ppt.pptxkiran chandu ppt.pptx
kiran chandu ppt.pptx
 
Design is a blueprint for the development and implementation.docx
Design is a blueprint for the development and implementation.docxDesign is a blueprint for the development and implementation.docx
Design is a blueprint for the development and implementation.docx
 
A_Research_Paper_on_College_Management_S.pdf
A_Research_Paper_on_College_Management_S.pdfA_Research_Paper_on_College_Management_S.pdf
A_Research_Paper_on_College_Management_S.pdf
 
Mini Project Abstract Review Sample PPT Template (2).pptx
Mini Project Abstract Review Sample PPT Template (2).pptxMini Project Abstract Review Sample PPT Template (2).pptx
Mini Project Abstract Review Sample PPT Template (2).pptx
 

Recently uploaded

bryophytes.pptx bsc botany honours second semester
bryophytes.pptx bsc botany honours  second semesterbryophytes.pptx bsc botany honours  second semester
bryophytes.pptx bsc botany honours second semester
Sarojini38
 
How to Create a Stage or a Pipeline in Odoo 17 CRM
How to Create a Stage or a Pipeline in Odoo 17 CRMHow to Create a Stage or a Pipeline in Odoo 17 CRM
How to Create a Stage or a Pipeline in Odoo 17 CRM
Celine George
 
Observational Learning
Observational Learning Observational Learning
Observational Learning
sanamushtaq922
 
How to Create User Notification in Odoo 17
How to Create User Notification in Odoo 17How to Create User Notification in Odoo 17
How to Create User Notification in Odoo 17
Celine George
 
Opportunity scholarships and the schools that receive them
Opportunity scholarships and the schools that receive themOpportunity scholarships and the schools that receive them
Opportunity scholarships and the schools that receive them
EducationNC
 
The basics of sentences session 8pptx.pptx
The basics of sentences session 8pptx.pptxThe basics of sentences session 8pptx.pptx
The basics of sentences session 8pptx.pptx
heathfieldcps1
 
CapTechTalks Webinar Slides June 2024 Donovan Wright.pptx
CapTechTalks Webinar Slides June 2024 Donovan Wright.pptxCapTechTalks Webinar Slides June 2024 Donovan Wright.pptx
CapTechTalks Webinar Slides June 2024 Donovan Wright.pptx
CapitolTechU
 
Non-Verbal Communication for Tech Professionals
Non-Verbal Communication for Tech ProfessionalsNon-Verbal Communication for Tech Professionals
Non-Verbal Communication for Tech Professionals
MattVassar1
 
Library news letter Kitengesa Uganda June 2024
Library news letter Kitengesa Uganda June 2024Library news letter Kitengesa Uganda June 2024
Library news letter Kitengesa Uganda June 2024
Friends of African Village Libraries
 
How to stay relevant as a cyber professional: Skills, trends and career paths...
How to stay relevant as a cyber professional: Skills, trends and career paths...How to stay relevant as a cyber professional: Skills, trends and career paths...
How to stay relevant as a cyber professional: Skills, trends and career paths...
Infosec
 
Erasmus + DISSEMINATION ACTIVITIES Croatia
Erasmus + DISSEMINATION ACTIVITIES CroatiaErasmus + DISSEMINATION ACTIVITIES Croatia
Erasmus + DISSEMINATION ACTIVITIES Croatia
whatchangedhowreflec
 
Creativity for Innovation and Speechmaking
Creativity for Innovation and SpeechmakingCreativity for Innovation and Speechmaking
Creativity for Innovation and Speechmaking
MattVassar1
 
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptx
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptxScience-9-Lesson-1-The Bohr Model-NLC.pptx pptx
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptx
Catherine Dela Cruz
 
220711130097 Tulip Samanta Concept of Information and Communication Technology
220711130097 Tulip Samanta Concept of Information and Communication Technology220711130097 Tulip Samanta Concept of Information and Communication Technology
220711130097 Tulip Samanta Concept of Information and Communication Technology
Kalna College
 
The Science of Learning: implications for modern teaching
The Science of Learning: implications for modern teachingThe Science of Learning: implications for modern teaching
The Science of Learning: implications for modern teaching
Derek Wenmoth
 
Creating Images and Videos through AI.pptx
Creating Images and Videos through AI.pptxCreating Images and Videos through AI.pptx
Creating Images and Videos through AI.pptx
Forum of Blended Learning
 
8+8+8 Rule Of Time Management For Better Productivity
8+8+8 Rule Of Time Management For Better Productivity8+8+8 Rule Of Time Management For Better Productivity
8+8+8 Rule Of Time Management For Better Productivity
RuchiRathor2
 
IoT (Internet of Things) introduction Notes.pdf
IoT (Internet of Things) introduction Notes.pdfIoT (Internet of Things) introduction Notes.pdf
IoT (Internet of Things) introduction Notes.pdf
roshanranjit222
 
BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...
BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...
BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...
Nguyen Thanh Tu Collection
 
What are the new features in the Fleet Odoo 17
What are the new features in the Fleet Odoo 17What are the new features in the Fleet Odoo 17
What are the new features in the Fleet Odoo 17
Celine George
 

Recently uploaded (20)

bryophytes.pptx bsc botany honours second semester
bryophytes.pptx bsc botany honours  second semesterbryophytes.pptx bsc botany honours  second semester
bryophytes.pptx bsc botany honours second semester
 
How to Create a Stage or a Pipeline in Odoo 17 CRM
How to Create a Stage or a Pipeline in Odoo 17 CRMHow to Create a Stage or a Pipeline in Odoo 17 CRM
How to Create a Stage or a Pipeline in Odoo 17 CRM
 
Observational Learning
Observational Learning Observational Learning
Observational Learning
 
How to Create User Notification in Odoo 17
How to Create User Notification in Odoo 17How to Create User Notification in Odoo 17
How to Create User Notification in Odoo 17
 
Opportunity scholarships and the schools that receive them
Opportunity scholarships and the schools that receive themOpportunity scholarships and the schools that receive them
Opportunity scholarships and the schools that receive them
 
The basics of sentences session 8pptx.pptx
The basics of sentences session 8pptx.pptxThe basics of sentences session 8pptx.pptx
The basics of sentences session 8pptx.pptx
 
CapTechTalks Webinar Slides June 2024 Donovan Wright.pptx
CapTechTalks Webinar Slides June 2024 Donovan Wright.pptxCapTechTalks Webinar Slides June 2024 Donovan Wright.pptx
CapTechTalks Webinar Slides June 2024 Donovan Wright.pptx
 
Non-Verbal Communication for Tech Professionals
Non-Verbal Communication for Tech ProfessionalsNon-Verbal Communication for Tech Professionals
Non-Verbal Communication for Tech Professionals
 
Library news letter Kitengesa Uganda June 2024
Library news letter Kitengesa Uganda June 2024Library news letter Kitengesa Uganda June 2024
Library news letter Kitengesa Uganda June 2024
 
How to stay relevant as a cyber professional: Skills, trends and career paths...
How to stay relevant as a cyber professional: Skills, trends and career paths...How to stay relevant as a cyber professional: Skills, trends and career paths...
How to stay relevant as a cyber professional: Skills, trends and career paths...
 
Erasmus + DISSEMINATION ACTIVITIES Croatia
Erasmus + DISSEMINATION ACTIVITIES CroatiaErasmus + DISSEMINATION ACTIVITIES Croatia
Erasmus + DISSEMINATION ACTIVITIES Croatia
 
Creativity for Innovation and Speechmaking
Creativity for Innovation and SpeechmakingCreativity for Innovation and Speechmaking
Creativity for Innovation and Speechmaking
 
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptx
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptxScience-9-Lesson-1-The Bohr Model-NLC.pptx pptx
Science-9-Lesson-1-The Bohr Model-NLC.pptx pptx
 
220711130097 Tulip Samanta Concept of Information and Communication Technology
220711130097 Tulip Samanta Concept of Information and Communication Technology220711130097 Tulip Samanta Concept of Information and Communication Technology
220711130097 Tulip Samanta Concept of Information and Communication Technology
 
The Science of Learning: implications for modern teaching
The Science of Learning: implications for modern teachingThe Science of Learning: implications for modern teaching
The Science of Learning: implications for modern teaching
 
Creating Images and Videos through AI.pptx
Creating Images and Videos through AI.pptxCreating Images and Videos through AI.pptx
Creating Images and Videos through AI.pptx
 
8+8+8 Rule Of Time Management For Better Productivity
8+8+8 Rule Of Time Management For Better Productivity8+8+8 Rule Of Time Management For Better Productivity
8+8+8 Rule Of Time Management For Better Productivity
 
IoT (Internet of Things) introduction Notes.pdf
IoT (Internet of Things) introduction Notes.pdfIoT (Internet of Things) introduction Notes.pdf
IoT (Internet of Things) introduction Notes.pdf
 
BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...
BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...
BỘ BÀI TẬP TEST THEO UNIT - FORM 2025 - TIẾNG ANH 12 GLOBAL SUCCESS - KÌ 1 (B...
 
What are the new features in the Fleet Odoo 17
What are the new features in the Fleet Odoo 17What are the new features in the Fleet Odoo 17
What are the new features in the Fleet Odoo 17
 

COMPILER DESIGN PPTS.pptx

  • 19. Compiler construction tools: the input for these systems may contain 1. a description of the source language, 2. a description of the output to be generated, and 3. a description of the target machine.
  • 20. Compiler construction tools The principal aids provided by a compiler-compiler are: 1. a scanner generator, 2. a parser generator, and 3. facilities for code generation.
  • 21. Lexical Analyzer • The Role of the Lexical Analyzer: the first phase of the compiler. It reads input characters and produces as output the sequence of tokens that the parser uses for syntax analysis. It can work either as a separate module or as a sub-module of the parser.
  • 22. Tokens, Patterns and Lexemes • Lexeme: a sequence of characters in the source program that is matched against the pattern for a token. • Pattern: the rule associated with each set of strings. A lexeme is matched against a pattern to generate a token. • Token: a word that describes the lexeme in the source program; it is generated when a lexeme matches a pattern.
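To make the token/lexeme/pattern distinction concrete, here is a minimal C sketch of the token record a scanner might return; the type and field names (TokenKind, symtab_index, and so on) are illustrative assumptions, not taken from the slides.

    /* Illustrative sketch: a token pairs a token name (kind) with an
       optional attribute; the lexeme is the matched source text. */
    enum TokenKind { TK_ID, TK_NUMBER, TK_ASSIGN, TK_PLUS, TK_EOF };

    struct Token {
        enum TokenKind kind;   /* token name, e.g. TK_ID            */
        char lexeme[64];       /* matched characters, e.g. "newval" */
        int  symtab_index;     /* attribute: symbol-table entry for identifiers */
    };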
  • 23. 3 Methods of constructing a Lexical Analyzer • 1. Using a lexical code generator • Tools are available that take regular expressions as input and generate code for them; these tools can be used to generate lexical-analyzer code from regular expressions. • An example of such a tool is LEX for Unix.
  • 24. 3 Methods of constructing a Lexical Analyzer: R.E. → Lex generator → lex.yy.c; lex.yy.c → C compiler → a.out; source program → a.out (lexical analyzer) → tokens
  • 25. Compiler and Translator • A compiler is a form of translator that translates a program written in one language, the "Source Language", to an equivalent program in a second language, the "Target Language" or the "Object Language".
  • 26. Compiler and Translator: program in source language → compiler → program in target language, with errors & diagnostics reported along the way. Assemblers, compilers and interpreters are all specific translators.
  • 27. Assemblers: assembly language (opcodes or mnemonics) → assembler → machine code. • Interpreters – interpret the statements of a high-level-language program as they are encountered, and produce the output of each statement as it is interpreted.
  • 28. Languages involved in a Compiler Three languages are involved in a compiler: • 1. Source language: read by the compiler • 2. Target or object language: the language into which the compiler/translator translates • 3. Host language: the language in which the compiler is written
  • 29. Advantages of a Compiler • Conciseness, which improves programmer productivity • Semantic restriction (checking) • The high-level source program is translated and analyzed only once; the equivalent machine code produced by the compiler is then simply run.
  • 30. Disadvantages of a Compiler • Slower turnaround (translation must finish before the program runs) • Size of the compiler and of the compiled code • Debugging involves all of the source code. Interpreter versus compiler: the compiler translates to machine code, the interpreter executes the source directly.
  • 32. Variant: Interpretation of intermediate code Compiler generates code for a "virtual machine" (VM) VM is simulated by software
  • 33. Single-Pass Compilers Phases work in an interleaved way
  • 34. Static Structure of a (Single-Pass) Compiler
  • 35. Multi-Pass Compilers Phases are separate "Programs", which run sequentially
  • 36. Why multi-pass? • If memory is scarce (irrelevant today) • If the language is complex • If portability is important
  • 38. Loader/Link-Editor: performs two functions. • i. Loading: relocatable code is converted to absolute code, i.e. placed at its specified position. • ii. Link-editing: makes a single program from several files of relocatable machine code. The files may be the output of different compilations, or library files or routines provided by the system. The result is an executable file.
  • 39. Preprocessor Produces the input to the compiler. It may perform: i. Macro processing: shorthand for longer constructs. ii. File inclusion: separate modules can be used by including their files, e.g. #include <iostream>. iii. Rational preprocessing: support for additional facilities that are not included in the compiler itself. iv. Language extension: may add extra capabilities.
  • 40. Major Types of Compiler • 1. Self-resident compiler: generates target code for the same machine (the host). • 2. Cross compiler: generates target code for a machine other than the host.
  • 41. Phases and passes • In logical terms a compiler is thought of as consisting of stages and phases • Physically it is made up of passes • The compiler has one pass for each time the source code, or a representation of it, is read • Many compilers have just a single pass, so that the complete compilation process is performed while the code is read once
  • 42. Phases and passes • The various phases described will therefore be executed in parallel • Earlier compilers had a large number of passes, typically due to the limited memory space available • Modern compilers are single pass, since memory space is not usually a problem
  • 43. Use of tools • The two main types of tools used in compiler production are: • 1. a lexical analyzer generator • Takes as input the lexical structure of a language, which defines how its tokens are made up from characters • Produces as output a lexical analyzer (a program in C, for example) for the language • The Unix lexical analyzer generator Lex is an example • 2. a syntax analyzer generator
  • 44. Use of tools • Takes as input the syntactical definition of a language • Produces as output a syntax analyzer (a program in C, for example) for the language • The most widely known is the Unix-based YACC (Yet Another Compiler-Compiler), used in conjunction with Lex. • Bison: a public-domain version
  • 45. Applications of compiler techniques • Compiler technology is useful for a more general class of applications • Many programs share the basic properties of compilers: they read textual input, organize it into a hierarchical structure and then process the structure • An understanding of how programming-language compilers are designed and organized can make it easier to implement these compiler-like applications as well • More importantly, tools designed for compiler writing, such as lexical analyzer
  • 46. Applications of compiler techniques • generators and parser generators, can make it vastly easier to implement such applications • Thus compiler techniques are important knowledge for computer science engineers • Examples: • Document processing: LaTeX, HTML, XML • User interfaces: interactive applications, file systems, databases • Natural language treatment • Automata theory, la…
  • 47. Interpreters • Interpreters: Instead of producing a target program as a translation, an interpreter • performs the operations implied by the source program
  • 48. Differences between compiler and interpreter
COMPILER: 1. takes the entire program as input; 2. intermediate object code is generated; 3. conditional control statements execute faster; 4. memory requirement is more; 5. the program need not be compiled every time; 6. errors are displayed after the entire program is checked; 7. Ex: C compiler.
INTERPRETER: 1. takes a single instruction as input; 2. no intermediate object code is generated; 3. conditional control statements execute slower; 4. memory requirement is less; 5. the higher-level program is converted into a lower-level form every time it runs; 6. errors are displayed for every instruction interpreted (if any); 7. Ex: BASIC.
  • 49. Types of compilers 1. Incremental compiler: a compiler that performs recompilation of only the modified source rather than compiling the whole source program. Features: 1. It tracks the dependencies between the output and the source program. 2. It produces the same result as a full recompilation. 3. It performs less work than a full recompilation. 4. Incremental compilation is effective for maintenance.
  • 50. Types of compilers 2. Cross compiler: basically there exist 3 languages: 1. the source language, i.e. the application program; 2. the target language, in which the machine code is written; 3. the implementation language, in which the compiler is written. All these 3 languages may be different. In other words, there may be a compiler which runs on one machine and produces target code for another machine. Such a compiler is called a cross compiler.
  • 51. Types of compilers To represent a cross compiler, a T diagram is drawn: the source language S and the target language T form the arms of the T, and the implementation language I its base. • CROSS COMPILER: for source language L, target code in language N is generated, which runs on machine M. (The T-diagram figure is lost in this transcript.)
  • 52. Bootstrapping Compilers and T-diagrams • The rules for T-diagrams are very simple. A compiler written in some language "C" (which could be anything from machine code on up) that translates programs in language A to language B looks like this:
  • 53. Bootstrapping Compilers and T-diagrams • Now suppose you have a machine that can directly run HP machine code, and a compiler from ML to HP machine code, and you want to get an ML compiler running on a different machine code P. You can start by writing an ML-to-P compiler in ML, and compile that to get an ML-to-P compiler in HP:
  • 54. Bootstrapping Compilers and T-diagrams From there, feed the new ML-to-P compiler to itself, running on the HP machine, and you end up with an ML-to-P compiler that runs on a P machine!
  • 55. Lexical Analysis • A lexical analyser or scanner is a program that groups sequences of characters into lexemes, and outputs (to the syntax analyser) a sequence of tokens. Here: • (a) Tokens are symbolic names for the entities that make up the text of the program; e.g. if for the keyword if, and id for any identifier. These make up the output of the lexical analyser.
  • 56. The role of the lexical analyzer: source program → Lexical Analyzer —token→ Parser, with the parser requesting tokens via getNextToken; both consult the symbol table, and the parser's output goes on to semantic analysis.
  • 57. Tokens, Patterns and Lexemes • A token is a pair of a token name and an optional token value • A pattern is a description of the form that the lexemes of a token may take • A lexeme is a sequence of characters in the source program that matches the pattern for a token
  • 58. Example
Token — informal description — sample lexemes:
if — characters i, f — if
else — characters e, l, s, e — else
comparison — < or > or <= or >= or == or != — <=, !=
id — letter followed by letters and digits — pi, score, D2
number — any numeric constant — 3.14159, 0, 6.02e23
literal — anything but " surrounded by " — "core dumped"
e.g. printf("total = %d\n", score);
  • 59. Attributes for tokens • E = M * C ** 2 – <id, pointer to symbol table entry for E> – <assign-op> – <id, pointer to symbol table entry for M> – <mult-op> – <id, pointer to symbol table entry for C> – <exp-op> – <number, integer value 2>
  • 60. Lexical errors • Some errors are beyond the power of the lexical analyzer to recognize: – fi (a == f(x)) … • However it may be able to recognize errors like: – d = 2r • Such errors are recognized when no pattern for tokens matches a character sequence
  • 61. Specification of tokens • In the theory of compilation, regular expressions are used to formalize the specification of tokens • Regular expressions are a means for specifying regular languages • Example: letter_(letter_ | digit)* • Each regular expression is a pattern specifying the form of strings
  • 62. Regular expressions • ε is a regular expression, L(ε) = {ε} • If a is a symbol in ∑ then a is a regular expression, L(a) = {a} • (r) | (s) is a regular expression denoting the language L(r) ∪ L(s) • (r)(s) is a regular expression denoting the language L(r)L(s) • (r)* is a regular expression denoting (L(r))* • (r) is a regular expression denoting L(r)
  • 63. Regular definitions d1 -> r1 d2 -> r2 … dn -> rn • Example: letter_ -> A | B | … | Z | a | b | … | z | _ digit -> 0 | 1 | … | 9 id -> letter_ (letter_ | digit)*
  • 64. Extensions • One or more instances: (r)+ • Zero or one instance: r? • Character classes: [abc] • Example: – letter_ -> [A-Za-z_] – digit -> [0-9] – id -> letter_(letter_|digit)*
  • 65. Recognition of tokens • Starting point is the language grammar to understand the tokens: stmt -> if expr then stmt | if expr then stmt else stmt | Ɛ expr -> term relop term | term term -> id | number
  • 66. Recognition of tokens (cont.) • The next step is to formalize the patterns: digit -> [0-9] digits -> digit+ number -> digits(.digits)?(E[+-]?digits)? letter -> [A-Za-z_] id -> letter(letter|digit)* if -> if then -> then else -> else relop -> < | > | <= | >= | = | <> • We also need to handle whitespace: ws -> (blank | tab | newline)+
  • 68. Transition diagrams (cont.) • Transition diagram for reserved words and identifiers
  • 69. Transition diagrams (cont.) • Transition diagram for unsigned numbers
  • 70. Transition diagrams (cont.) • Transition diagram for whitespace
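A hand-coded scanner corresponding to these transition diagrams can be sketched in C as below. This is an illustration under simplifying assumptions (numbers are plain digit strings, the token codes and helper names are invented here), not the deck's own code; there are no buffer-overflow checks in this sketch.

    #include <ctype.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical token codes; real codes would come from the parser. */
    enum { IF = 256, THEN, ELSE, ID, NUMBER, RELOP, END };

    static FILE *src;

    /* getNextToken walks the transition diagrams: skip whitespace, then
       try reserved words / identifiers, numbers, and relational operators. */
    int getNextToken(char *lexeme)
    {
        int c = fgetc(src);

        /* ws -> (blank | tab | newline)+ : no token is returned */
        while (c == ' ' || c == '\t' || c == '\n')
            c = fgetc(src);
        if (c == EOF) return END;

        if (isalpha(c) || c == '_') {        /* id -> letter (letter|digit)* */
            int n = 0;
            do { lexeme[n++] = c; c = fgetc(src); } while (isalnum(c) || c == '_');
            ungetc(c, src);                  /* retract the lookahead character */
            lexeme[n] = '\0';
            if (strcmp(lexeme, "if") == 0)   return IF;   /* reserved words are */
            if (strcmp(lexeme, "then") == 0) return THEN; /* checked after the  */
            if (strcmp(lexeme, "else") == 0) return ELSE; /* diagram accepts    */
            return ID;
        }
        if (isdigit(c)) {                    /* digits -> digit+ (simplified) */
            int n = 0;
            do { lexeme[n++] = c; c = fgetc(src); } while (isdigit(c));
            ungetc(c, src);
            lexeme[n] = '\0';
            return NUMBER;
        }
        if (c == '<' || c == '>' || c == '=') {  /* relop -> < | <= | <> | > | >= | = */
            int d = fgetc(src);
            lexeme[0] = c; lexeme[1] = '\0';
            if (c == '<' && (d == '=' || d == '>')) { lexeme[1] = d; lexeme[2] = '\0'; }
            else if (c == '>' && d == '=')          { lexeme[1] = d; lexeme[2] = '\0'; }
            else ungetc(d, src);             /* retract: single-character relop */
            return RELOP;
        }
        lexeme[0] = c; lexeme[1] = '\0';
        return c;                            /* any other single-character token */
    }

    int main(void)
    {
        char lexeme[128];
        src = stdin;
        for (int t = getNextToken(lexeme); t != END; t = getNextToken(lexeme))
            printf("token %d  lexeme '%s'\n", t, lexeme);
        return 0;
    }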
  • 71. Lexical Analyzer Generator - Lex: Lex source program lex.l → Lex compiler → lex.yy.c; lex.yy.c → C compiler → a.out; input stream → a.out → sequence of tokens
  • 72. LEX Example %{ /* definitions of manifest constants LT, LE, EQ, NE, GT, GE, IF, THEN, ELSE, ID, NUMBER, RELOP */ %} /* regular definitions */ delim [ \t\n] ws {delim}+ letter [A-Za-z] digit [0-9] id {letter}({letter}|{digit})* number {digit}+(\.{digit}+)?(E[+-]?{digit}+)?
  • 73. LEX Example %% {ws} {/* no action and no return */} if {return(IF);} then {return(THEN);} else {return(ELSE);} {id} {yylval = (int) installID(); return(ID);} {number} {yylval = (int) installNum(); return(NUMBER);} …
  • 74. LEX Example int installID() {/* function to install the lexeme, whose first character is pointed to by yytext and whose length is yyleng, into the symbol table, and return a pointer thereto */} int installNum() {/* similar to installID, but puts numerical constants into a separate table */}
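A minimal driver for the specification above might look as follows; it assumes only the standard lex interface (yylex returns the next token code, 0 at end of input, and yytext holds the current lexeme), and would be compiled together with the generated lex.yy.c.

    #include <stdio.h>

    /* Provided by the scanner that lex generates from the spec above. */
    extern int yylex(void);
    extern char *yytext;

    int main(void)
    {
        int tok;
        while ((tok = yylex()) != 0)      /* yylex returns 0 at end of input */
            printf("token code %d, lexeme \"%s\"\n", tok, yytext);
        return 0;
    }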
  • 75. Finite Automata • Regular expressions = specification • Finite automata = implementation • A finite automaton consists of – an input alphabet Σ – a set of states S – a start state n – a set of accepting states F ⊆ S – a set of transitions state →input state
  • 76. Finite Automata • Transition s1 →a s2 • is read: in state s1 on input "a" go to state s2 • At end of input – if in an accepting state => accept, otherwise => reject • If no transition is possible => reject
  • 77. Finite Automata State Graphs • A state • The start state • An accepting state • A transition on input a (the corresponding node-and-arrow drawings are lost in this transcript)
  • 78. Example • Alphabet {0,1} • What language does this recognize? (the state diagram is lost in this transcript)
  • 79. And Another Example • Alphabet still {0, 1} • The operation of the automaton is not completely defined by the input – on input "11" the automaton could be in either state (diagram lost in this transcript)
  • 80. Epsilon Moves • Another kind of transition: ε-moves • The machine can move from state A to state B without reading input
  • 81. Deterministic and Nondeterministic Automata • Deterministic Finite Automata (DFA) – one transition per input per state – no ε-moves • Nondeterministic Finite Automata (NFA) – can have multiple transitions for one input in a given state – can have ε-moves • Finite automata have finite memory – need only to encode the current state
  • 82. Execution of Finite Automata • A DFA can take only one path through the state graph – completely determined by the input • NFAs can choose – whether to make ε-moves – which of multiple transitions for a single input to take
  • 83. Acceptance of NFAs • An NFA can get into multiple states on the same input • Rule: an NFA accepts if it can get into a final state (the example diagram is lost in this transcript)
  • 84. NFA vs. DFA (1) • NFAs and DFAs recognize the same set of languages (regular languages) • DFAs are easier to implement – There are no choices to consider 84
  • 85. NFA vs. DFA (2) • For a given language the NFA can be simpler than the DFA (the example NFA and DFA over {0,1} are lost in this transcript) • A DFA can be exponentially larger than an NFA
  • 86. Regular Expressions to Finite Automata • High-level sketch: Lexical Specification → Regular expressions → NFA → DFA → Table-driven implementation of DFA
  • 87. Regular Expressions to NFA (1) • For each kind of rexp, define an NFA – notation: NFA for rexp A • For ε: a single ε-transition from the start state to an accepting state • For input a: a single transition on a from the start state to an accepting state
  • 88. Regular Expressions to NFA (2) • For AB: connect the accepting state of A to the start state of B with an ε-transition • For A | B: a new start state with ε-transitions into A and B, and ε-transitions from their accepting states into a new accepting state
  • 89. Regular Expressions to NFA (3) • For A*: a new start state with an ε-transition into A and an ε-transition to a new accepting state, plus ε-transitions from A's accepting state back to A's start and on to the new accepting state
  • 90. Example of RegExp -> NFA conversion • Consider the regular expression (1 | 0)*1 • The resulting NFA has states A through J (the diagram is lost in this transcript)
  • 92. NFA to DFA. The Trick • Simulate the NFA • Each state of the resulting DFA = a non-empty subset of states of the NFA • Start state = the set of NFA states reachable through ε-moves from the NFA start state • Add a transition S →a S' to the DFA iff S' is the set of NFA states reachable from the states in S after seeing the input a, considering ε-moves as well
  • 93. NFA -> DFA Example • Applying the subset construction to the NFA for (1|0)*1 with states A–J yields DFA states ABCDHI, FGABCDHI and EJGABCDHI, with transitions on 0 and 1 between these subsets (the diagram is lost in this transcript)
  • 94. NFA to DFA. Remark • An NFA may be in many states at any time • How many different states? • If there are N states, the NFA must be in some subset of those N states • How many non-empty subsets are there? – 2^N − 1, i.e. finitely many, but exponentially many
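The subset idea can be seen directly by simulating an NFA with a bit mask of active states. The sketch below uses a small hand-built NFA for (1|0)*1 (two states and no ε-moves, so ε-closure is trivial); the state numbering and code are our own illustration, not from the slides.

    #include <stdio.h>

    /* NFA for (1|0)*1:  state 0 on '0' -> {0}, on '1' -> {0,1};
       state 1 is accepting.  A set of NFA states is a bit mask. */
    #define NSTATES 2
    #define ACCEPT_MASK (1u << 1)

    /* delta[state][symbol] = bit mask of successor states */
    static const unsigned delta[NSTATES][2] = {
        /* state 0 */ { 1u << 0, (1u << 0) | (1u << 1) },
        /* state 1 */ { 0, 0 },
    };

    int nfa_accepts(const char *input)
    {
        unsigned cur = 1u << 0;                  /* start in {0} */
        for (const char *p = input; *p; p++) {
            unsigned next = 0;
            for (int s = 0; s < NSTATES; s++)
                if (cur & (1u << s))
                    next |= delta[s][*p - '0'];  /* union over all active states */
            cur = next;
        }
        return (cur & ACCEPT_MASK) != 0;         /* accept if a final state is active */
    }

    int main(void)
    {
        printf("%d %d %d\n", nfa_accepts("11"), nfa_accepts("10"), nfa_accepts("0101"));
        return 0;  /* prints 1 0 1 */
    }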
  • 95. Implementation • A DFA can be implemented by a 2D table T – one dimension is "states" – the other dimension is "input symbols" – for every transition Si →a Sk define T[i,a] = k • DFA "execution" – if in state Si on input a, read T[i,a] = k and go to state Sk – very efficient
  • 96. Table Implementation of a DFA — rows are states, columns are inputs 0 and 1: S: 0→T, 1→U; T: 0→T, 1→U; U: 0→T, 1→U
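Here is a C rendering of this table-driven execution, using the S/T/U table above with the states encoded as 0, 1, 2; a sketch for illustration only.

    #include <stdio.h>

    /* trans[state][symbol] is the table above: every state moves
       to T (1) on input 0 and to U (2) on input 1. */
    static const int trans[3][2] = {
        /* S */ { 1, 2 },
        /* T */ { 1, 2 },
        /* U */ { 1, 2 },
    };

    /* run returns the state reached from start state S on the given input */
    int run(const char *input)
    {
        int state = 0;                        /* start in S */
        for (const char *p = input; *p; p++)
            state = trans[state][*p - '0'];   /* one table lookup per symbol */
        return state;
    }

    int main(void)
    {
        printf("final state: %d\n", run("0101"));  /* S->T->U->T->U, prints 2 */
        return 0;
    }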
  • 97. Implementation (Cont.) • NFA -> DFA conversion is at the heart of tools such as flex or jflex • But, DFAs can be huge • In practice, flex-like tools trade off speed for space in the choice of NFA and DFA representations 97
  • 99. Top-Down Parsing • The parse tree is created top to bottom. • Top-down parser – Recursive-Descent Parsing • Backtracking is needed (If a choice of a production rule does not work, we backtrack to try other alternatives.) • It is a general parsing technique, but not widely used. • Not efficient
  • 100. Top-Down Parsing – Predictive Parsing • no backtracking • efficient • needs a special form of grammars (LL(1) grammars). • Recursive Predictive Parsing is a special form of Recursive Descent parsing without backtracking. • Non-Recursive (Table Driven) Predictive Parser is also known as LL(1) parser.
  • 101. Recursive-Descent Parsing (uses Backtracking) • Backtracking is needed. • It tries to find the left-most derivation. S → aBc B → bc | b input: abc — trying B → bc consumes b and c and leaves nothing for the final c of S → aBc; this fails, so the parser backtracks and tries B → b, which succeeds.
  • 102. Predictive Parser a grammar → (eliminate left recursion, left factor) → a grammar suitable for predictive parsing (an LL(1) grammar); there is no 100% guarantee. • When re-writing a non-terminal in a derivation step, a predictive parser can uniquely choose a production rule by just looking at the current symbol in the input string. A → α1 | ... | αn input: ... a ....... (current token)
  • 103. Predictive Parser (example) stmt → if ...... | while ...... | begin ...... | for ..... • When we are trying to expand the non-terminal stmt, if the current token is if we have to choose the first production rule. • We can uniquely choose the production rule by just looking at the current token. • We eliminate the left recursion in the grammar, and left factor it. But it may still not be suitable for predictive parsing (not an LL(1) grammar).
  • 104. Recursive Predictive Parsing • Each non-terminal corresponds to a procedure. Ex: A → aBb (this is the only production rule for A) proc A { - match the current token with a, and move to the next token; - call B; - match the current token with b, and move to the next token; }
  • 105. Recursive Predictive Parsing (cont.) A → aBb | bAB proc A { case of the current token { 'a': - match the current token with a, and move to the next token; - call B; - match the current token with b, and move to the next token; 'b': - match the current token with b, and move to the next token; - call A; - call B; } }
  • 106. Recursive Predictive Parsing (cont.) • When to apply ε-productions: A → aA | bB | ε • If all other productions fail, we should apply an ε-production. For example, if the current token is not a or b, we may apply the ε-production. • Most correct choice: we should apply an ε-production for a non-terminal A when the current token is in the follow set of A (the terminals that can follow A in sentential forms).
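As a concrete sketch, here is a recursive predictive parser in C for the small grammar S → aBa, B → bB | ε used in the LL(1) examples below; B applies its ε-production exactly when the lookahead is in FOLLOW(B) = {a}. The helper names (match, error) are our own.

    #include <stdio.h>
    #include <stdlib.h>

    static const char *input;   /* current position in the token stream */

    static void error(void) { printf("syntax error at '%c'\n", *input); exit(1); }

    static void match(char t)
    {
        if (*input == t) input++;   /* consume the expected token */
        else error();
    }

    static void B(void)
    {
        if (*input == 'b') {        /* B -> b B */
            match('b');
            B();
        }
        /* else: B -> epsilon, taken when lookahead is in FOLLOW(B) = {a} */
    }

    static void S(void)
    {
        match('a');                 /* S -> a B a */
        B();
        match('a');
    }

    int main(void)
    {
        input = "abba";
        S();
        if (*input == '\0') printf("accepted\n");
        else error();
        return 0;
    }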
  • 107. Non-Recursive Predictive Parsing -- LL(1) Parser • Non-recursive predictive parsing is a table-driven parser. • It is a top-down parser. • It is also known as an LL(1) parser. Structure: the input buffer and a stack feed the non-recursive predictive parser, which consults the parsing table and emits the output.
  • 108. LL(1) Parser input buffer – our string to be parsed. We will assume that its end is marked with a special symbol $. output – a production rule representing a step of the derivation sequence (left-most derivation) of the string in the input buffer. stack – contains the grammar symbols – at the bottom of the stack, there is a special end marker symbol $. – initially the stack contains only the symbol $ and the starting symbol S ($S is the initial stack). – when the stack is emptied (i.e. only $ is left in the stack), parsing is completed.
  • 109. LL(1) Parser parsing table – a two-dimensional array M[A,a] – each row is a non-terminal symbol – each column is a terminal symbol or the special symbol $ – each entry holds a production rule.
  • 110. LL(1) Parser – Parser Actions • The symbol at the top of the stack (say X) and the current symbol in the input string (say a) determine the parser action. • There are four possible parser actions. 1. If X and a are $ → parser halts (successful completion) 2. If X and a are the same terminal symbol (different from $) → parser pops X from the stack, and moves to the next symbol in the input buffer.
  • 111. LL(1) Parser – Parser Actions 3. If X is a non-terminal → parser looks at the parsing table entry M[X,a]. If M[X,a] holds a production rule X→Y1Y2...Yk, it pops X from the stack and pushes Yk,Yk-1,...,Y1 onto the stack. The parser also outputs the production rule X→Y1Y2...Yk to represent a step of the derivation. 4. none of the above → error – all empty entries in the parsing table are errors. – If X is a terminal symbol different from a, this is also an error case.
  • 112. LL(1) Parser – Example1 S → aBa B → bB | ε LL(1) parsing table: M[S,a] = S → aBa; M[B,a] = B → ε; M[B,b] = B → bB (all other entries are errors)
  • 113. LL(1) Parser – Example1
stack / input / output:
$S / abba$ / S → aBa
$aBa / abba$ / –
$aB / bba$ / B → bB
$aBb / bba$ / –
$aB / ba$ / B → bB
$aBb / ba$ / –
$aB / a$ / B → ε
$a / a$ / –
$ / $ / accept, successful completion
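The same trace can be reproduced by a small table-driven implementation; the following C sketch hard-codes the parsing table above as a function and represents grammar symbols as single characters (an illustrative simplification, not the deck's own code).

    #include <stdio.h>
    #include <string.h>

    /* Non-recursive LL(1) parser sketch for  S -> aBa,  B -> bB | epsilon.
       'S','B' are non-terminals, 'a','b' terminals, '$' the end marker.
       The table entries are the right-hand sides to push ("" = epsilon). */
    static const char *table(char X, char a)
    {
        if (X == 'S' && a == 'a') return "aBa";  /* M[S,a] = S -> aBa */
        if (X == 'B' && a == 'a') return "";     /* M[B,a] = B -> eps */
        if (X == 'B' && a == 'b') return "bB";   /* M[B,b] = B -> bB  */
        return NULL;                             /* empty entry: error */
    }

    int parse(const char *w)                     /* w must end with '$' */
    {
        char stack[100] = "$S";                  /* $ at bottom, start symbol on top */
        int top = 1;
        const char *ip = w;

        while (1) {
            char X = stack[top], a = *ip;
            if (X == '$' && a == '$') return 1;          /* accept */
            if (X == 'a' || X == 'b' || X == '$') {      /* X is terminal */
                if (X != a) return 0;                    /* mismatch: error */
                top--; ip++;                             /* pop and advance */
            } else {                                     /* X is non-terminal */
                const char *rhs = table(X, a);
                if (!rhs) return 0;                      /* empty entry: error */
                printf("%c -> %s\n", X, *rhs ? rhs : "eps");
                top--;                                   /* pop X ...            */
                for (int i = strlen(rhs) - 1; i >= 0; i--)
                    stack[++top] = rhs[i];               /* ... push RHS reversed */
            }
        }
    }

    int main(void)
    {
        printf(parse("abba$") ? "accept\n" : "reject\n");
        return 0;
    }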
  • 114. LL(1) Parser – Example1 (cont.) Outputs: S → aBa, B → bB, B → bB, B → ε. Derivation (left-most): S ⇒ aBa ⇒ abBa ⇒ abbBa ⇒ abba. Parse tree: root S with children a, B, a; each B expands to b and another B, and the last B derives ε.
  • 115. LL(1) Parser – Example2 E → TE' E' → +TE' | ε T → FT' T' → *FT' | ε F → (E) | id
Parsing table (rows: non-terminals; columns: id + * ( ) $):
E: M[E,id] = E → TE'; M[E,(] = E → TE'
E': M[E',+] = E' → +TE'; M[E',)] = E' → ε; M[E',$] = E' → ε
T: M[T,id] = T → FT'; M[T,(] = T → FT'
T': M[T',+] = T' → ε; M[T',*] = T' → *FT'; M[T',)] = T' → ε; M[T',$] = T' → ε
F: M[F,id] = F → id; M[F,(] = F → (E)
  • 116. LL(1) Parser – Example2 (input id+id$)
stack / input / output:
$E / id+id$ / E → TE'
$E'T / id+id$ / T → FT'
$E'T'F / id+id$ / F → id
$E'T'id / id+id$ / –
$E'T' / +id$ / T' → ε
$E' / +id$ / E' → +TE'
$E'T+ / +id$ / –
$E'T / id$ / T → FT'
$E'T'F / id$ / F → id
$E'T'id / id$ / –
$E'T' / $ / T' → ε
$E' / $ / E' → ε
$ / $ / accept
  • 117. Constructing LL(1) Parsing Tables • Two functions are used in the construction of LL(1) parsing tables: FIRST and FOLLOW. • FIRST(α) is the set of the terminal symbols which occur as first symbols in strings derived from α, where α is any string of grammar symbols. • If α derives ε, then ε is also in FIRST(α). • FOLLOW(A) is the set of the terminals which occur immediately after (follow) the non-terminal A in strings derived from the starting symbol: – a terminal a is in FOLLOW(A) if S ⇒* αAaβ – $ is in FOLLOW(A) if S ⇒* αA
  • 118. Compute FIRST for Any String X • If X is a terminal symbol → FIRST(X) = {X} • If X is ε → FIRST(X) = {ε} • If X is a non-terminal symbol and X → ε is a production rule → ε is in FIRST(X) • If X is a non-terminal symbol and X → Y1Y2..Yn is a production rule (or X is the string Y1Y2..Yn): – if a terminal a is in FIRST(Yi) and ε is in all FIRST(Yj) for j=1,...,i-1, then a is in FIRST(X) – if ε is in all FIRST(Yj) for j=1,...,n, then ε is in FIRST(X)
  • 119. FIRST Example E → TE' E' → +TE' | ε T → FT' T' → *FT' | ε F → (E) | id
FIRST(F) = {(, id} FIRST(TE') = {(, id}
FIRST(T') = {*, ε} FIRST(+TE') = {+}
FIRST(T) = {(, id} FIRST(ε) = {ε}
FIRST(FT') = {(, id} FIRST(E') = {+, ε}
FIRST(E) = {(, id} FIRST(*FT') = {*}
FIRST((E)) = {(} FIRST(id) = {id}
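These sets can be computed mechanically by iterating the rules of slide 118 to a fixed point. The C sketch below does this for the expression grammar, tracking nullability separately from the FIRST bit masks; the encodings and names are our own, and the printed sets can be checked against the slide.

    #include <stdio.h>

    enum { E, Ep, T, Tp, F, NT = 5 };                 /* non-terminals */
    enum { t_id, t_plus, t_star, t_lp, t_rp, TERM = 5 };

    /* productions: lhs, length, rhs symbols (terminals offset by NT) */
    static const int prods[][6] = {
        { E,  2, T, Ep },
        { Ep, 3, NT + t_plus, T, Ep },
        { Ep, 0 },                                    /* E' -> epsilon */
        { T,  2, F, Tp },
        { Tp, 3, NT + t_star, F, Tp },
        { Tp, 0 },                                    /* T' -> epsilon */
        { F,  3, NT + t_lp, E, NT + t_rp },
        { F,  1, NT + t_id },
    };
    #define NPRODS (int)(sizeof prods / sizeof prods[0])

    int main(void)
    {
        unsigned first[NT] = { 0 };
        int nullable[NT] = { 0 };
        int changed = 1;

        while (changed) {                             /* iterate to a fixed point */
            changed = 0;
            for (int p = 0; p < NPRODS; p++) {
                int A = prods[p][0], len = prods[p][1];
                unsigned before = first[A];
                int allNullable = 1, wasNullable = nullable[A];
                for (int i = 0; i < len && allNullable; i++) {
                    int X = prods[p][2 + i];
                    if (X >= NT) {                    /* terminal: FIRST(X) = {X} */
                        first[A] |= 1u << (X - NT);
                        allNullable = 0;
                    } else {                          /* non-terminal */
                        first[A] |= first[X];
                        allNullable = nullable[X];
                    }
                }
                if (allNullable) nullable[A] = 1;
                if (first[A] != before || nullable[A] != wasNullable) changed = 1;
            }
        }

        static const char *nt[] = { "E", "E'", "T", "T'", "F" };
        static const char *tm[] = { "id", "+", "*", "(", ")" };
        for (int A = 0; A < NT; A++) {
            printf("FIRST(%s) = {", nt[A]);
            for (int a = 0; a < TERM; a++)
                if (first[A] & (1u << a)) printf(" %s", tm[a]);
            printf(nullable[A] ? " eps }\n" : " }\n");
        }
        return 0;
    }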
  • 120. Compute FOLLOW (for non-terminals) • If S is the start symbol → $ is in FOLLOW(S) • If A → αBβ is a production rule → everything in FIRST(β) except ε is in FOLLOW(B) • If (A → αB is a production rule) or (A → αBβ is a production rule and ε is in FIRST(β)) → everything in FOLLOW(A) is in FOLLOW(B). We apply these rules until nothing more can be added to any follow set.
  • 121. FOLLOW Example E → TE' E' → +TE' | ε T → FT' T' → *FT' | ε F → (E) | id FOLLOW(E) = {$, )} FOLLOW(E') = {$, )} FOLLOW(T) = {+, ), $} FOLLOW(T') = {+, ), $} FOLLOW(F) = {+, *, ), $}
  • 122. Constructing LL(1) Parsing Table -- Algorithm • for each production rule A → α of a grammar G: – for each terminal a in FIRST(α), add A → α to M[A,a] – if ε is in FIRST(α), for each terminal a in FOLLOW(A) add A → α to M[A,a] – if ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A,$] • All other undefined entries of the parsing table are error entries.
  • 123. Constructing LL(1) Parsing Table -- Example
E → TE': FIRST(TE') = {(, id} → E → TE' into M[E,(] and M[E,id]
E' → +TE': FIRST(+TE') = {+} → E' → +TE' into M[E',+]
E' → ε: FIRST(ε) = {ε}, and FOLLOW(E') = {$, )} → E' → ε into M[E',$] and M[E',)]
T → FT': FIRST(FT') = {(, id} → T → FT' into M[T,(] and M[T,id]
T' → *FT': FIRST(*FT') = {*} → T' → *FT' into M[T',*]
T' → ε: FIRST(ε) = {ε}, and FOLLOW(T') = {$, ), +} → T' → ε into M[T',$], M[T',)] and M[T',+]
F → (E): FIRST((E)) = {(} → F → (E) into M[F,(]
F → id: FIRST(id) = {id} → F → id into M[F,id]
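The construction algorithm itself is a short loop once FIRST and FOLLOW are available. The sketch below hard-codes the FIRST-of-right-hand-side and FOLLOW sets from the worked example above and fills M; since the only ε-productions here have right-hand side ε itself, the FOLLOW rule is the only one that fires for them (a simplification we rely on).

    #include <stdio.h>

    #define B(x) (1u << (x))
    enum { id, pl, st, lp, rp, dollar, NTOK };        /* terminals      */
    enum { E, Ep, T, Tp, F, NNT };                    /* non-terminals  */

    struct Prod { int lhs; const char *text; unsigned firstAlpha; int eps; };
    static const struct Prod prods[] = {
        { E,  "E->TE'",   B(lp)|B(id), 0 },
        { Ep, "E'->+TE'", B(pl),       0 },
        { Ep, "E'->eps",  0,           1 },
        { T,  "T->FT'",   B(lp)|B(id), 0 },
        { Tp, "T'->*FT'", B(st),       0 },
        { Tp, "T'->eps",  0,           1 },
        { F,  "F->(E)",   B(lp),       0 },
        { F,  "F->id",    B(id),       0 },
    };
    static const unsigned follow[NNT] = {
        B(rp)|B(dollar),             /* FOLLOW(E)  */
        B(rp)|B(dollar),             /* FOLLOW(E') */
        B(pl)|B(rp)|B(dollar),       /* FOLLOW(T)  */
        B(pl)|B(rp)|B(dollar),       /* FOLLOW(T') */
        B(pl)|B(st)|B(rp)|B(dollar), /* FOLLOW(F)  */
    };

    int main(void)
    {
        const char *M[NNT][NTOK] = { { 0 } };
        for (unsigned p = 0; p < sizeof prods / sizeof prods[0]; p++) {
            /* add A->alpha to M[A,a] for a in FIRST(alpha), or for
               a in FOLLOW(A) when eps is in FIRST(alpha)            */
            unsigned where = prods[p].eps ? follow[prods[p].lhs]
                                          : prods[p].firstAlpha;
            for (int a = 0; a < NTOK; a++)
                if (where & B(a))
                    M[prods[p].lhs][a] = prods[p].text;
        }
        static const char *tok[] = { "id", "+", "*", "(", ")", "$" };
        static const char *nt[]  = { "E", "E'", "T", "T'", "F" };
        for (int A = 0; A < NNT; A++)
            for (int a = 0; a < NTOK; a++)
                if (M[A][a]) printf("M[%s,%s] = %s\n", nt[A], tok[a], M[A][a]);
        return 0;
    }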
  • 124. LL(1) Grammars • A grammar whose parsing table has no multiply-defined entries is said to be an LL(1) grammar. LL(1): the input is scanned from left to right, a left-most derivation is produced, and one input symbol is used as a lookahead symbol to determine the parser action. • The parsing table of a grammar may contain more than one production rule in an entry. In this case, we say that it is not an LL(1) grammar.
  • 125. A Grammar which is not LL(1) S → iCtSE | a E → eS | ε C → b FOLLOW(S) = {$, e} FOLLOW(E) = {$, e} FOLLOW(C) = {t} FIRST(iCtSE) = {i} FIRST(a) = {a} FIRST(eS) = {e} FIRST(ε) = {ε} FIRST(b) = {b}
Parsing table: M[S,a] = S → a; M[S,i] = S → iCtSE; M[E,e] = { E → eS, E → ε } — two production rules for M[E,e]; M[E,$] = E → ε; M[C,b] = C → b
Problem → ambiguity
  • 126. A Grammar which is not LL(1) (cont.) • What do we have to do if the resulting parsing table contains multiply defined entries? – If we didn't eliminate left recursion, eliminate the left recursion in the grammar. – If the grammar is not left factored, we have to left factor the grammar. – If its (new grammar's) parsing table still contains multiply defined entries, that grammar is ambiguous or it is inherently not an LL(1) grammar.
  • 127. A Grammar which is not LL(1) (cont.) • A left recursive grammar cannot be an LL(1) grammar. – A → Aα | β → any terminal that appears in FIRST(β) also appears in FIRST(Aα), because Aα ⇒ βα. → If β is ε, any terminal that appears in FIRST(α) also appears in FIRST(Aα) and FOLLOW(A). • If a grammar is not left factored, it cannot be an LL(1) grammar: • A → αβ1 | αβ2 → any terminal that appears in FIRST(αβ1) also appears in FIRST(αβ2). • An ambiguous grammar cannot be an LL(1) grammar.
  • 128. Properties of LL(1) Grammars • A grammar G is LL(1) if and only if the following conditions hold for every two distinct production rules A → α and A → β: 1. Both α and β cannot derive strings starting with the same terminals. 2. At most one of α and β can derive ε. 3. If β can derive ε, then α cannot derive any string starting with a terminal in FOLLOW(A).
  • 129. Error Recovery in Predictive Parsing • An error may occur in predictive parsing (LL(1) parsing) – if the terminal symbol on the top of the stack does not match the current input symbol, or – if the top of the stack is a non-terminal A, the current input symbol is a, and the parsing table entry M[A,a] is empty. • What should the parser do in an error case? – The parser should be able to give an error message (as meaningful an error message as possible). – It should recover from that error case, and it should be able to continue parsing the rest of the input.
  • 130. Error Recovery Techniques • Panic-Mode Error Recovery – skipping the input symbols until a synchronizing token is found. • Phrase-Level Error Recovery – each empty entry in the parsing table is filled with a pointer to a specific error routine to take care of that error case. • Error-Productions – If we have a good idea of the common errors that might be encountered, we can augment the grammar with productions that generate erroneous constructs. – When an error production is used by the parser, we can generate appropriate error diagnostics. – Since it is almost impossible to know all the errors that can be made by programmers, this method is not practical.
  • 131. Error Recovery Techniques • Global-Correction – Ideally, we would like a compiler to make as few changes as possible in processing incorrect inputs. – We have to globally analyze the input to find the error. – This is an expensive method, and it is not used in practice.
  • 132. Panic-Mode Error Recovery in LL(1) Parsing • In panic-mode error recovery, we skip all the input symbols until a synchronizing token is found. • What is the synchronizing token? – All the terminal symbols in the follow set of a non-terminal can be used as a synchronizing token set for that non-terminal. • So, a simple panic-mode error recovery for LL(1) parsing: – All the empty entries are marked as sync to indicate that the parser will skip all the input symbols until a symbol in the follow set of the non-terminal A which is on the top of the stack is seen. Then the parser pops that non-terminal A from the stack. Parsing continues from that state. – To handle unmatched terminal symbols, the parser pops the unmatched terminal symbol from the stack and issues an error message saying that the unmatched terminal was inserted.
  • 133. Panic-Mode Error Recovery - Example S → AbS | e | ε A → a | cAd FOLLOW(S) = {$} FOLLOW(A) = {b, d}
Parsing table: M[S,a] = M[S,c] = S → AbS; M[S,e] = S → e; M[S,$] = S → ε; M[S,b] = M[S,d] = sync; M[A,a] = A → a; M[A,c] = A → cAd; M[A,b] = M[A,d] = sync.
Trace for input aab$: $S / aab$ / S → AbS; $SbA / aab$ / A → a; $Sba / aab$; $Sb / ab$ / Error: missing b, inserted (pop b); $S / ab$ / S → AbS; $SbA / ab$ / A → a; $Sba / ab$; $Sb / b$; $S / $ / S → ε; $ / $ / accept.
Trace for input ceadb$: $S / ceadb$ / S → AbS; $SbA / ceadb$ / A → cAd; $SbdAc / ceadb$; $SbdA / eadb$ / Error: unexpected e (illegal A) — remove all input tokens until the first b or d, pop A; $Sbd / db$; $Sb / b$; $S / $ / S → ε; $ / $ / accept.
  • 134. Phrase-Level Error Recovery • Each empty entry in the parsing table is filled with a pointer to a special error routine which will take care of that error case. • These error routines may: – change, insert, or delete input symbols – issue appropriate error messages – pop items from the stack. • We should be careful when we design these error routines, because we may put the parser into an infinite loop.
  • 135. Syntax Analyzer • Syntax Analyzer creates the syntactic structure of the given source program. • This syntactic structure is mostly a parse tree. • Syntax Analyzer is also known as parser. • The syntax of a programming language is described by a context-free grammar (CFG). We will use BNF (Backus-Naur Form) notation in the description of CFGs. • The syntax analyzer (parser) checks whether a given source program satisfies the rules implied by a context-free grammar or not. – If it satisfies, the parser creates the parse tree of that program. – Otherwise the parser gives the error messages.
  • 136. Syntax Analyzer • A context-free grammar – gives a precise syntactic specification of a programming language. – the design of the grammar is an initial phase of the design of a compiler. – a grammar can be directly converted into a parser by some tools.
  • 137. Parser: source program → Lexical Analyzer —token→ Parser —parse tree→ (rest of the front end); the parser requests tokens with getNextToken. • The parser works on a stream of tokens. • The smallest item is a token.
  • 138. Parsers (cont.) • We categorize the parsers into two groups: 1. Top-Down Parser – the parse tree is created top to bottom, starting from the root. 2. Bottom-Up Parser – the parse tree is created bottom to top, starting from the leaves. • Both top-down and bottom-up parsers scan the input from left to right (one symbol at a time). • Efficient top-down and bottom-up parsers can be implemented only for sub-classes of context-free grammars. – LL for top-down parsing – LR for bottom-up parsing
  • 139. Context-Free Grammars • Inherently recursive structures of a programming language are defined by a context-free grammar. • In a context-free grammar, we have: – A finite set of terminals (in our case, this will be the set of tokens) – A finite set of non-terminals (syntactic variables) – A finite set of production rules of the form • A → α where A is a non-terminal and α is a string of terminals and non-terminals (possibly the empty string) – A start symbol (one of the non-terminal symbols)
  • 140. Context-Free Grammars • Example: E → E + E | E – E | E * E | E / E | - E E → ( E ) E → id
  • 141. Derivations E ⇒ E+E • E+E derives from E – we can replace E by E+E – to be able to do this, we have to have a production rule E → E+E in our grammar. E ⇒ E+E ⇒ id+E ⇒ id+id • A sequence of replacements of non-terminal symbols is called a derivation of id+id from E.
  • 142. Derivations • In general a derivation step is αAβ ⇒ αγβ if there is a production rule A → γ in our grammar, where α and β are arbitrary strings of terminal and non-terminal symbols. α1 ⇒ α2 ⇒ ... ⇒ αn (αn derives from α1, or α1 derives αn) • ⇒ : derives in one step • ⇒* : derives in zero or more steps • ⇒+ : derives in one or more steps
  • 143. CFG - Terminology • L(G) is the language of G (the language generated by G), which is a set of sentences. • A sentence of L(G) is a string of terminal symbols of G. • If S is the start symbol of G, then ω is a sentence of L(G) iff S ⇒+ ω where ω is a string of terminals of G. • If G is a context-free grammar, L(G) is a context-free language. • Two grammars are equivalent if they produce the same language. • S ⇒* α – If α contains non-terminals, it is called a sentential form of G. – If α does not contain non-terminals, it is called a sentence of G.
  • 144. Derivation Example E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id) OR E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(E+id) ⇒ -(id+id) • At each derivation step, we can choose any of the non-terminals in the sentential form of G for the replacement. • If we always choose the left-most non-terminal in each derivation step, the derivation is called a left-most derivation. • If we always choose the right-most non-terminal in each derivation step, the derivation is called a right-most derivation.
  • 145. Left-Most and Right-Most Derivations Left-Most Derivation: E ⇒lm -E ⇒lm -(E) ⇒lm -(E+E) ⇒lm -(id+E) ⇒lm -(id+id) Right-Most Derivation: E ⇒rm -E ⇒rm -(E) ⇒rm -(E+E) ⇒rm -(E+id) ⇒rm -(id+id) • We will see that top-down parsers try to find the left-most derivation of the given source program. • We will see that bottom-up parsers try to find the right-most derivation of the given source program in the reverse order.
  • 146. Parse Tree • Inner nodes of a parse tree are non-terminal symbols. • The leaves of a parse tree are terminal symbols. • A parse tree can be seen as a graphical representation of a derivation. E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id) (the corresponding sequence of growing parse trees is lost in this transcript)
  • 147. Ambiguity E ⇒ E+E ⇒ id+E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id E ⇒ E*E ⇒ E+E*E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id • A grammar that produces more than one parse tree for a sentence is called an ambiguous grammar (the two parse trees for id+id*id are lost in this transcript).
  • 148. Ambiguity (cont.) • For most parsers, the grammar must be unambiguous. • unambiguous grammar → unique selection of the parse tree for a sentence • We should eliminate the ambiguity in the grammar during the design phase of the compiler. • An unambiguous grammar should be written to eliminate the ambiguity. • We have to prefer one of the parse trees of a sentence (generated by an ambiguous grammar) to disambiguate that grammar, and restrict the grammar to this choice.
  • 149. Ambiguity (cont.) stmt → if expr then stmt | if expr then stmt else stmt | otherstmts if E1 then if E2 then S1 else S2 • Parse tree 1: the else matches the outer if — if E1 then (if E2 then S1) else S2 • Parse tree 2: the else matches the inner if — if E1 then (if E2 then S1 else S2)
  • 150. Ambiguity • We prefer the second parse tree (else matches the closest if). • So, we have to disambiguate our grammar to reflect this choice. • The unambiguous grammar will be: stmt → matchedstmt | unmatchedstmt matchedstmt → if expr then matchedstmt else matchedstmt | otherstmts unmatchedstmt → if expr then stmt | if expr then matchedstmt else unmatchedstmt
  • 151. Ambiguity – Operator Precedence • Ambiguous grammars (because of ambiguous operators) can be disambiguated according to the precedence and associativity rules. E → E+E | E*E | E^E | id | (E) → disambiguate the grammar with precedence: ^ (right to left), * (left to right), + (left to right) → E → E+T | T T → T*F | F F → G^F | G G → id | (E)
  • 152. Left Recursion • A grammar is left recursive if it has a non-terminal A such that there is a derivation A ⇒+ Aα for some string α. • Top-down parsing techniques cannot handle left-recursive grammars. • So, we have to convert our left-recursive grammar into an equivalent grammar which is not left-recursive. • The left-recursion may appear in a single step of the derivation (immediate left-recursion), or may appear in more than one step of the derivation.
  • 153. Immediate Left-Recursion A → Aα | β (where β does not start with A) → eliminate immediate left recursion: A → βA' A' → αA' | ε (an equivalent grammar) In general, A → Aα1 | ... | Aαm | β1 | ... | βn (where β1 ... βn do not start with A) → eliminate immediate left recursion: A → β1A' | ... | βnA' A' → α1A' | ... | αmA' | ε (an equivalent grammar)
  • 154. Immediate Left-Recursion -- Example E → E+T | T T → T*F | F F → id | (E) → eliminate immediate left recursion → E → TE' E' → +TE' | ε T → FT' T' → *FT' | ε F → id | (E)
  • 155. Left-Recursion -- Problem • A grammar may not be immediately left-recursive, but it still can be left-recursive. • By just eliminating the immediate left-recursion, we may not get a grammar which is not left-recursive. S → Aa | b A → Sc | d This grammar is not immediately left-recursive, but it is still left-recursive: S ⇒ Aa ⇒ Sca, or A ⇒ Sc ⇒ Aac, causes a left-recursion. • So, we have to eliminate all left-recursions from our grammar.
  • 156. Eliminate Left-Recursion -- Algorithm - Arrange non-terminals in some order: A1 ... An - for i from 1 to n do { - for j from 1 to i-1 do { replace each production Ai → Ajγ by Ai → α1γ | ... | αkγ, where Aj → α1 | ... | αk are the current Aj productions } - eliminate immediate left-recursions among the Ai productions }
  • 157. Eliminate Left-Recursion -- Example S → Aa | b A → Ac | Sd | f - Order of non-terminals: S, A for S: - we do not enter the inner loop. - there is no immediate left recursion in S. for A: - Replace A → Sd with A → Aad | bd So, we will have A → Ac | Aad | bd | f - Eliminate the immediate left-recursion in A: A → bdA' | fA' A' → cA' | adA' | ε
  • 158. Eliminate Left-Recursion -- Example So, the resulting equivalent grammar which is not left-recursive is: S → Aa | b A → bdA' | fA' A' → cA' | adA' | ε
  • 159. Eliminate Left-Recursion – Example2 S → Aa | b A → Ac | Sd | f - Order of non-terminals: A, S for A: - we do not enter the inner loop. - Eliminate the immediate left-recursion in A: A → SdA' | fA' A' → cA' | ε
  • 160. Eliminate Left-Recursion – Example2 for S: - Replace S → Aa with S → SdA'a | fA'a So, we will have S → SdA'a | fA'a | b - Eliminate the immediate left-recursion in S: S → fA'aS' | bS' S' → dA'aS' | ε So, the resulting equivalent grammar which is not left-recursive is: S → fA'aS' | bS' S' → dA'aS' | ε A → SdA' | fA' A' → cA' | ε
  • 161. Left-Factoring • A predictive parser (a top-down parser without backtracking) insists that the grammar must be left-factored. grammar → a new equivalent grammar suitable for predictive parsing stmt → if expr then stmt else stmt | if expr then stmt • When we see if, we cannot know which production rule to choose to re-write stmt in the derivation.
  • 162. Left-Factoring (cont.) • In general, A → αβ1 | αβ2 where α is non-empty and the first symbols of β1 and β2 (if they have one) are different. • When processing α we cannot know whether to expand A to αβ1 or to αβ2. • But, if we re-write the grammar as follows: A → αA' A' → β1 | β2 then we can immediately expand A to αA'.
  • 163. Left-Factoring -- Algorithm • For each non-terminal A with two or more alternatives (production rules) with a common non-empty prefix, let us say A → αβ1 | ... | αβn | γ1 | ... | γm convert it into A → αA' | γ1 | ... | γm A' → β1 | ... | βn
  • 164. Left-Factoring – Example1 A → abB | aB | cdg | cdeB | cdfB → A → aA' | cdg | cdeB | cdfB A' → bB | B → A → aA' | cdA'' A' → bB | B A'' → g | eB | fB
  • 165. Left-Factoring – Example2 A → ad | a | ab | abc | b → A → aA' | b A' → d | ε | b | bc → A → aA' | b A' → d | ε | bA'' A'' → ε | c
  • 197. Unit-2 Bottom-Up Parsing
  • 198. Bottom-Up Parsing • A bottom-up parser creates the parse tree of the given input starting from the leaves towards the root. • A bottom-up parser tries to find the right-most derivation of the given input in the reverse order. S ⇒ ... ⇒ ω (the right-most derivation of ω; the bottom-up parser finds this derivation in the reverse order) • Bottom-up parsing is also known as shift-reduce parsing because its two main actions are shift and reduce. – At each shift action, the current symbol in the input string is pushed onto a stack.
  • 199. Bottom-Up Parsing – At each reduction step, the symbols at the top of the stack (this symbol sequence is the right side of a production) will be replaced by the non-terminal at the left side of that production. – There are also two more actions: accept and error.
  • 200. Shift-Reduce Parsing • A shift-reduce parser tries to reduce the given input string to the starting symbol: a string → (reduced to) → the starting symbol. • At each reduction step, a substring of the input matching the right side of a production rule is replaced by the non-terminal at the left side of that production rule. • If the substring is chosen correctly, the right-most derivation of that string is created in the reverse order. Rightmost derivation: S ⇒*rm ω. The shift-reduce parser finds: ω ⇐rm ... ⇐rm S (the same derivation, in reverse).
  • 201. Shift-Reduce Parsing -- Example S → aABb A → aA | a B → bB | b input string: aaabb → aaAbb → aAbb → aABb → S (reduction) Right-most derivation: S ⇒rm aABb ⇒rm aAbb ⇒rm aaAbb ⇒rm aaabb (right sentential forms) • How do we know which substring to replace at each reduction step?
  • 202. Handle • Informally, a handle of a string is a substring that matches the right side of a production rule. – But not every substring that matches the right side of a production rule is a handle. • A handle of a right sentential form γ (= αβω) is a production rule A → β and a position in γ where the string β may be found and replaced by A to produce the previous right-sentential form in a right-most derivation of γ: S ⇒*rm αAω ⇒rm αβω • If the grammar is unambiguous, then every right-sentential form of the grammar has exactly one handle. • We will see that ω is a string of terminals.
  • 203. Handle Pruning • A right-most derivation in reverse can be obtained by handle pruning: S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm ... ⇒rm γn-1 ⇒rm γn = ω (the input string) • Start from γn; find a handle An → βn in γn, and replace βn by An to get γn-1. • Then find a handle An-1 → βn-1 in γn-1, and replace βn-1 by An-1 to get γn-2. • Repeat this until we reach S.
  • 204. A Shift-Reduce Parser
Grammar: E → E+T | T    T → T*F | F    F → (E) | id
Right-most derivation of id+id*id:
E ⇒ E+T ⇒ E+T*F ⇒ E+T*id ⇒ E+F*id ⇒ E+id*id ⇒ T+id*id ⇒ F+id*id ⇒ id+id*id

Right-Most Sentential Form    Reducing Production
id+id*id                      F → id
F+id*id                       T → F
T+id*id                       E → T
E+id*id                       F → id
E+F*id                        T → F
E+T*id                        F → id
E+T*F                         T → T*F
E+T                           E → E+T
E
(At each step the handle is the substring matching the right side of the reducing production; the handles were shown in red and underlined on the original slide.)
  • 205. A Stack Implementation of A Shift-Reduce Parser • There are four possible actions of a shift-reduce parser: 1. Shift: the next input symbol is shifted onto the top of the stack. 2. Reduce: replace the handle on the top of the stack by the non-terminal. 3. Accept: successful completion of parsing. 4. Error: the parser discovers a syntax error and calls an error recovery routine. • The initial stack contains only the end-marker $. • The end of the input string is also marked by the end-marker $.
  • 206. A Stack Implementation of A Shift-Reduce Parser
Stack      Input       Action
$          id+id*id$   shift
$id        +id*id$     reduce by F → id
$F         +id*id$     reduce by T → F
$T         +id*id$     reduce by E → T
$E         +id*id$     shift
$E+        id*id$      shift
$E+id      *id$        reduce by F → id
$E+F       *id$        reduce by T → F
$E+T       *id$        shift
$E+T*      id$         shift
$E+T*id    $           reduce by F → id
$E+T*F     $           reduce by T → T*F
$E+T       $           reduce by E → E+T
$E         $           accept
(The slide also shows the resulting parse tree; the node numbers 1–8 on it give the order in which the tree is built.)
  • 207. Conflicts During Shift-Reduce Parsing • There are context-free grammars for which shift-reduce parsers cannot be used. • The stack contents and the next input symbol may not be enough to decide the action: – shift/reduce conflict: the parser cannot decide whether to shift or to reduce. – reduce/reduce conflict: the parser cannot decide which of several reductions to make. • If a shift-reduce parser cannot be used for a grammar, that grammar is called a non-LR(k) grammar (LR(k): Left-to-right scanning, Right-most derivation, k-symbol lookahead). • An ambiguous grammar can never be an LR grammar.
  • 208. Shift-Reduce Parsers • There are two main categories of shift-reduce parsers: 1. Operator-Precedence Parser – simple, but works only for a small class of grammars. 2. LR-Parsers – cover a wide range of grammars. • SLR – simple LR parser • LR – the most general LR parser • LALR – intermediate LR parser (lookahead LR parser) The grammar classes nest as SLR ⊂ LALR ⊂ LR ⊂ CFG.
  • 209. Operator-Precedence Parser • Operator grammar – a small but important class of grammars – we may have an efficient operator-precedence parser (a shift-reduce parser) for an operator grammar. • In an operator grammar, no production rule can have: – ε at the right side – two adjacent non-terminals at the right side. • Ex: E→AB, A→a, B→b (not an operator grammar); E→EOE, E→id, O→+|*|/ (not an operator grammar); E→E+E | E*E | E/E | id (operator grammar).
  • 210. Precedence Relations • In operator-precedence parsing, we define three disjoint precedence relations between certain pairs of terminals: a <. b (b has higher precedence than a), a =· b (b has the same precedence as a), a .> b (b has lower precedence than a). • The determination of the correct precedence relations between terminals is based on the traditional notions of associativity and precedence of operators. (Unary minus causes a problem.)
  • 211. Using Operator-Precedence Relations • The intention of the precedence relations is to find the handle of a right-sentential form, with <. marking the left end, =· appearing in the interior of the handle, and .> marking the right end. • In an input string $a1a2...an$, we insert the precedence relation between each pair of adjacent terminals for which the relation holds.
  • 212. Using Operator-Precedence Relations
E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id
The partial operator-precedence table for this grammar:
        id     +     *     $
id            .>    .>    .>
+      <.     .>    <.    .>
*      <.     .>    .>    .>
$      <.     <.    <.
Then the input string id+id*id with the precedence relations inserted is: $ <. id .> + <. id .> * <. id .> $
  • 213. To Find The Handles 1. Scan the string from the left end until the first .> is encountered. 2. Then scan backwards (to the left) over any =· until a <. is encountered. 3. The handle is everything to the left of the first .> and to the right of the <. encountered.
Form with relations                     Reduce by   Result
$ <. id .> + <. id .> * <. id .> $      E → id      $ E + id * id $
$ <. + <. id .> * <. id .> $            E → id      $ E + E * id $
$ <. + <. * <. id .> $                  E → id      $ E + E * E $
$ <. + <. * .> $                        E → E*E     $ E + E $
$ <. + .> $                             E → E+E     $ E $
$ $                                                 accept
  • 214. Operator-Precedence Parsing Algorithm • The input string is w$, the initial stack contains only $, and a table holds the precedence relations between certain terminals. Algorithm: set p to point to the first symbol of w$; repeat forever: if ( $ is on top of the stack and p points to $ ) then return; else { let a be the topmost terminal symbol on the stack and let b be the symbol pointed to by p; if ( a <. b or a =· b ) then { /* SHIFT */
  • 215. Operator-Precedence Parsing Algorithm push b onto the stack; advance p to the next input symbol; } else if ( a .> b ) then /* REDUCE */ repeat pop the stack until ( the topmost terminal on the stack is related by <. to the terminal most recently popped ); else error(); }
  • 216. Operator-Precedence Parsing Algorithm -- Example
stack    input        relation     action
$        id+id*id$    $ <. id      shift
$id      +id*id$      id .> +      reduce E → id
$        +id*id$                   shift
$+       id*id$                    shift
$+id     *id$         id .> *      reduce E → id
$+       *id$                      shift
$+*      id$                       shift
$+*id    $            id .> $      reduce E → id
$+*      $            * .> $       reduce E → E*E
$+       $            + .> $       reduce E → E+E
$        $                         accept
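The algorithm above fits in a few lines of code. The sketch below implements it for a cut-down relation table covering only id, + and *; the PREC encoding and the function name are illustrative, not from the slides, and the reduce step only locates the handle's terminals rather than also matching them against production right sides.

PREC = {  # PREC[a][b] is the relation between stack terminal a and input b
    "id": {"+": ">", "*": ">", "$": ">"},
    "+":  {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*":  {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$":  {"id": "<", "+": "<", "*": "<"},
}

def op_precedence_parse(tokens):
    stack = ["$"]
    p = 0
    while True:
        a, b = stack[-1], tokens[p]
        if a == "$" and b == "$":
            return True                      # accept
        rel = PREC.get(a, {}).get(b)
        if rel in ("<", "="):                # a <. b or a =. b: shift
            stack.append(b)
            p += 1
        elif rel == ">":                     # a .> b: reduce
            popped = stack.pop()             # pop until top <. most recently popped
            while PREC[stack[-1]].get(popped) != "<":
                popped = stack.pop()
            print("reduced a handle ending with", popped)
        else:
            return False                     # empty entry: error

print(op_precedence_parse(["id", "+", "id", "*", "id", "$"]))  # True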
  • 217. How to Create Operator-Precedence Relations • We use the associativity and precedence relations among operators. 1. If operator O1 has higher precedence than operator O2 → O1 .> O2 and O2 <. O1. 2. If operator O1 and operator O2 have equal precedence: they are left-associative → O1 .> O2 and O2 .> O1; they are right-associative → O1 <. O2 and O2 <. O1. 3. For all operators O: O <. id, id .> O, O <. (, ( <. O, O .> ), ) .> O, O .> $, and $ <. O. 4. Also: ( =· )   $ <. (   $ <. id   ( <. (   ( <. id   id .> $   id .> )   ) .> $   ) .> )
  • 218. Operator-Precedence Relations
      +    -    *    /    ^    id   (    )    $
+    .>   .>   <.   <.   <.   <.   <.   .>   .>
-    .>   .>   <.   <.   <.   <.   <.   .>   .>
*    .>   .>   .>   .>   <.   <.   <.   .>   .>
/    .>   .>   .>   .>   <.   <.   <.   .>   .>
^    .>   .>   .>   .>   <.   <.   <.   .>   .>
id   .>   .>   .>   .>   .>             .>   .>
(    <.   <.   <.   <.   <.   <.   <.   =·
)    .>   .>   .>   .>   .>             .>   .>
$    <.   <.   <.   <.   <.   <.   <.
  • 219. Handling Unary Minus • Operator-precedence parsing cannot handle the unary minus when we also have the binary minus in our grammar. • The best approach to solve this problem is to let the lexical analyzer handle it: – The lexical analyzer will return two different operators for the unary minus and the binary minus. – The lexical analyzer will need a lookahead to distinguish the binary minus from the unary minus. • Then, for any operator O we make: O <. unary-minus and unary-minus .> O if unary-minus has higher precedence than O; unary-minus <. O if unary-minus has lower (or equal) precedence than O.
  • 220. Precedence Functions • Compilers using operator precedence parsers do not need to store the table of precedence relations. • The table can be encoded by two precedence functions f and g that map terminal symbols to integers. • For symbols a and b. f(a) < g(b) whenever a <. b f(a) = g(b) whenever a =· b f(a) > g(b) whenever a .> b
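For instance, one consistent assignment of f and g for the +, *, (, ), id, $ portion of the earlier table is the classic one below (other assignments also work); relation() simply recovers the relation from the two functions. The encoding is illustrative.

f = {"+": 2, "*": 4, "id": 4, "(": 0, ")": 4, "$": 0}
g = {"+": 1, "*": 3, "id": 5, "(": 5, ")": 0, "$": 0}

def relation(a, b):
    """Recover the precedence relation between terminals a and b."""
    if f[a] < g[b]: return "<."
    if f[a] > g[b]: return ".>"
    return "=·"

print(relation("+", "*"))   # <.  (* has higher precedence than +)
print(relation("+", "+"))   # .>  (+ is left-associative)
print(relation("(", ")"))   # =·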
  • 221. Disadvantages of Operator Precedence Parsing • Disadvantages: – It cannot handle the unary minus (the lexical analyzer should handle the unary minus). – Small class of grammars. – Difficult to decide which language is recognized by the grammar. • Advantages: – simple – powerful enough for expressions in programming languages
  • 222. Error Recovery in Operator-Precedence Parsing Error cases: 1. No relation holds between the terminal on the top of the stack and the next input symbol. 2. A handle is found (reduction step), but there is no production with this handle as a right side. Error recovery: 1. Each empty entry is filled with a pointer to an error routine. 2. The parser decides which right-hand side the popped handle "looks like", and tries to recover from that situation.
  • 223. LR Parsers • The most powerful (yet efficient) shift-reduce parsing method is LR(k) parsing: Left-to-right scanning, Right-most derivation, k-symbol lookahead (when k is omitted, it is 1).
  • 224. LR Parsers • LR parsing is attractive because: – LR parsing is the most general non-backtracking shift-reduce parsing method, yet it is still efficient. – The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive parsers: LL(1)-Grammars ⊂ LR(1)-Grammars. – An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.
  • 225. LR Parsers • LR parsers cover a wide range of grammars: – SLR – simple LR parser – CLR – the most general LR parser – LALR – intermediate LR parser (lookahead LR parser) – SLR, CLR and LALR work the same way (they use the same parsing algorithm); only their parsing tables are different.
  • 226. LR Parsing Algorithm (figure: the LR parsing model — a stack holding S0 X1 S1 ... Xm-1 Sm-1 Xm Sm, the input a1 ... ai ... an $, and a driver that consults an Action table (rows: states; columns: terminals and $; entries: one of four actions) and a Goto table (rows: states; columns: non-terminals; each entry is a state number) while producing the output.)
  • 227. A Configuration of LR Parsing Algorithm • A configuration of an LR parser is: ( So X1 S1 ... Xm Sm, ai ai+1 ... an $ ) — the stack and the rest of the input. • Sm and ai decide the parser action by consulting the parsing action table. (The initial stack contains just So.) • A configuration of an LR parser represents the right-sentential form: X1 ... Xm ai ai+1 ... an $
  • 228. Actions of An LR-Parser 1. shift s -- shifts the next input symbol and the state s onto the stack: ( So X1 S1 ... Xm Sm, ai ai+1 ... an $ ) → ( So X1 S1 ... Xm Sm ai s, ai+1 ... an $ ) 2. reduce A→β (or rn where n is a production number) – pop 2|β| items from the stack (where r = |β|); – then push A and s where s = goto[sm-r, A]: ( So X1 S1 ... Xm Sm, ai ai+1 ... an $ ) → ( So X1 S1 ... Xm-r Sm-r A s, ai ai+1 ... an $ ) – the output is the reducing production reduce A→β 3. Accept – parsing successfully completed. 4. Error -- the parser detected an error (an empty entry in the action table).
  • 229. Reduce Action • Pop 2|β| items from the stack; let us assume that β = Y1Y2...Yr. • Then push A and s where s = goto[sm-r, A]: ( So X1 S1 ... Xm-r Sm-r Y1 Sm-r+1 ... Yr Sm, ai ai+1 ... an $ ) → ( So X1 S1 ... Xm-r Sm-r A s, ai ai+1 ... an $ ) • In fact, Y1Y2...Yr is a handle: X1 ... Xm-r A ai ... an $ ⇐rm X1 ... Xm-r Y1...Yr ai ai+1 ... an $
  • 230. (SLR) Parsing Tables for Expression Grammar
1) E → E+T   2) E → T   3) T → T*F   4) T → F   5) F → (E)   6) F → id

             Action Table                     Goto Table
state    id     +     *     (     )     $     E    T    F
0        s5                s4                 1    2    3
1               s6                     acc
2               r2    s7         r2    r2
3               r4    r4         r4    r4
4        s5                s4                 8    2    3
5               r6    r6         r6    r6
6        s5                s4                      9    3
7        s5                s4                           10
8               s6               s11
9               r1    s7         r1    r1
10              r3    r3         r3    r3
11              r5    r5         r5    r5
  • 231. Actions of A (S)LR-Parser -- Example
stack         input        action              output
0             id*id+id$    shift 5
0id5          *id+id$      reduce by F→id      F→id
0F3           *id+id$      reduce by T→F       T→F
0T2           *id+id$      shift 7
0T2*7         id+id$       shift 5
0T2*7id5      +id$         reduce by F→id      F→id
0T2*7F10      +id$         reduce by T→T*F     T→T*F
0T2           +id$         reduce by E→T       E→T
  • 232. Actions of A (S)LR-Parser -- Example
0E1           +id$         shift 6
0E1+6         id$          shift 5
0E1+6id5      $            reduce by F→id      F→id
0E1+6F3       $            reduce by T→F       T→F
0E1+6T9       $            reduce by E→E+T     E→E+T
0E1           $            accept
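The generic LR driver that replays this trace is short; below is a runnable sketch using the SLR table above. The dict encodings (ACTION, GOTO, PRODS) and the state-only stack are illustrative choices, not the slides' notation.

PRODS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),   # production number -> (lhs, |rhs|)
         4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {
    (0, "id"): "s5", (0, "("): "s4",
    (1, "+"): "s6", (1, "$"): "acc",
    (2, "+"): "r2", (2, "*"): "s7", (2, ")"): "r2", (2, "$"): "r2",
    (3, "+"): "r4", (3, "*"): "r4", (3, ")"): "r4", (3, "$"): "r4",
    (4, "id"): "s5", (4, "("): "s4",
    (5, "+"): "r6", (5, "*"): "r6", (5, ")"): "r6", (5, "$"): "r6",
    (6, "id"): "s5", (6, "("): "s4",
    (7, "id"): "s5", (7, "("): "s4",
    (8, "+"): "s6", (8, ")"): "s11",
    (9, "+"): "r1", (9, "*"): "s7", (9, ")"): "r1", (9, "$"): "r1",
    (10, "+"): "r3", (10, "*"): "r3", (10, ")"): "r3", (10, "$"): "r3",
    (11, "+"): "r5", (11, "*"): "r5", (11, ")"): "r5", (11, "$"): "r5",
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3, (4, "E"): 8, (4, "T"): 2,
        (4, "F"): 3, (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}

def lr_parse(tokens):
    stack = [0]                                  # state stack only
    i = 0
    while True:
        act = ACTION.get((stack[-1], tokens[i]))
        if act is None:
            return False                         # empty entry: error
        if act == "acc":
            return True
        if act[0] == "s":                        # shift: push the new state
            stack.append(int(act[1:]))
            i += 1
        else:                                    # reduce by production n
            lhs, size = PRODS[int(act[1:])]
            del stack[len(stack) - size:]        # pop |rhs| states
            stack.append(GOTO[(stack[-1], lhs)])
            print("reduce by production", act[1:])

print(lr_parse(["id", "*", "id", "+", "id", "$"]))  # True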
  • 233. Constructing SLR Parsing Tables – LR(0) Item • An LR(0) item of a grammar G is a production of G with a dot at some position of the right side. • Ex: A → aBb gives four possible LR(0) items: A → .aBb, A → a.Bb, A → aB.b, A → aBb. • Sets of LR(0) items will be the states of the action and goto tables of the SLR parser. • A collection of sets of LR(0) items (the canonical LR(0) collection) is the basis for constructing SLR parsers. • Augmented grammar: G' is G with a new production rule S' → S, where S' is the new starting symbol.
  • 234. The Closure Operation • If I is a set of LR(0) items for a grammar G, then closure(I) is the set of LR(0) items constructed from I by two rules: 1. Initially, every LR(0) item in I is added to closure(I). 2. If A → α.Bβ is in closure(I) and B → γ is a production rule of G, then B → .γ will be in closure(I). We apply this rule until no more new LR(0) items can be added to closure(I).
  • 235. The Closure Operation -- Example
Grammar: E' → E, E → E+T, E → T, T → T*F, T → F, F → (E), F → id
closure({E' → .E}) = { E' → .E (the kernel item), E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id }
  • 236. Goto Operation • If I is a set of LR(0) items and X is a grammar symbol (terminal or non-terminal), then goto(I,X) is defined as follows: – If A → α.Xβ is in I, then every item in closure({A → αX.β}) will be in goto(I,X). Example: I = { E' → .E, E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id } goto(I,E) = { E' → E., E → E.+T } goto(I,T) = { E → T., T → T.*F } goto(I,F) = { T → F. } goto(I,() = { F → (.E), E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id } goto(I,id) = { F → id. }
  • 237. Construction of The Canonical LR(0) Collection • To create the SLR parsing tables for a grammar G, we create the canonical LR(0) collection of the augmented grammar G'. • Algorithm: C is { closure({S' → .S}) }; repeat the following until no more sets of LR(0) items can be added to C: for each I in C and each grammar symbol X, if goto(I,X) is not empty and not in C, add goto(I,X) to C. • The goto function is a DFA on the sets in C.
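A runnable sketch of closure, goto, and this worklist algorithm for the augmented expression grammar. Items are (lhs, rhs, dot) triples; the GRAMMAR list and the frozenset encoding are illustrative choices.

GRAMMAR = [("E'", ("E",)), ("E", ("E", "+", "T")), ("E", ("T",)),
           ("T", ("T", "*", "F")), ("T", ("F",)),
           ("F", ("(", "E", ")")), ("F", ("id",))]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in NONTERMS:
                for l2, r2 in GRAMMAR:                   # add B -> .gamma
                    if l2 == rhs[dot] and (l2, r2, 0) not in items:
                        items.add((l2, r2, 0))
                        changed = True
    return frozenset(items)

def goto(items, x):
    moved = {(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == x}
    return closure(moved)

def canonical_collection():
    start = closure({("E'", ("E",), 0)})
    states, worklist = {start}, [start]
    while worklist:
        i = worklist.pop()
        for x in {r[d] for _, r, d in i if d < len(r)}:
            j = goto(i, x)
            if j and j not in states:
                states.add(j); worklist.append(j)
    return states

print(len(canonical_collection()))  # 12 item sets: I0..I11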
  • 238. The Canonical LR(0) Collection -- Example (figure: the canonical LR(0) item sets I0–I11 for the augmented expression grammar; see the transition diagram on the next slide.)
  • 239. Transition Diagram (DFA) of Goto Function (figure: the goto DFA over the item sets I0–I11 of the expression grammar, with transitions labeled by the grammar symbols E, T, F, id, +, *, ( and ); for example I0 →E I1, I1 →+ I6, I6 →T I9, I9 →* I7, I7 →F I10, I0 →( I4, I4 →E I8, I8 →) I11.)
  • 240. Constructing SLR Parsing Table (of an augmented grammar G') 1. Construct the canonical collection of sets of LR(0) items for G': C = {I0,...,In} 2. Create the parsing action table as follows: • If a is a terminal, A → α.aβ is in Ii and goto(Ii,a) = Ij, then action[i,a] is shift j. • If A → α. is in Ii, then action[i,a] is reduce A → α for all a in FOLLOW(A), where A ≠ S'. • If S' → S. is in Ii, then action[i,$] is accept. • If any conflicting actions are generated by these rules, the grammar is not SLR(1).
  • 241. Constructing SLR Parsing Table 3. Create the parsing goto table: • for all non-terminals A, if goto(Ii,A) = Ij, then goto[i,A] = j 4. All entries not defined by (2) and (3) are errors. 5. The initial state of the parser is the one containing S' → .S.
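A sketch of these table-filling rules, building on the closure/goto sketch earlier (it reuses that goto function and NONTERMS). The FOLLOW sets are assumed to be given as a dict, and _set raises on any conflict, signalling that the grammar is not SLR(1); the encoding is illustrative.

def _set(table, key, val):
    # Two different actions for the same (state, symbol) entry: not SLR(1).
    if key in table and table[key] != val:
        raise ValueError(f"conflict at {key}: grammar is not SLR(1)")
    table[key] = val

def build_slr(states, follow, start="E'"):
    index = {s: i for i, s in enumerate(states)}   # number the item sets I0..In
    action, goto_tab = {}, {}
    for s in states:
        i = index[s]
        for lhs, rhs, dot in s:
            if dot < len(rhs):                     # A -> alpha . X beta
                x = rhs[dot]
                j = index[goto(s, x)]
                if x in NONTERMS:
                    goto_tab[i, x] = j                      # rule 3
                else:
                    _set(action, (i, x), ("shift", j))      # rule 2, shift
            elif lhs == start:
                _set(action, (i, "$"), ("accept",))         # rule 2, accept
            else:
                for a in follow[lhs]:                       # rule 2, reduce
                    _set(action, (i, a), ("reduce", lhs, rhs))
    return action, goto_tab

# usage sketch, assuming a FOLLOW dict for the grammar:
# action, goto_tab = build_slr(list(canonical_collection()), FOLLOW)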
  • 242. Parsing Tables of Expression Grammar
             Action Table                     Goto Table
state    id     +     *     (     )     $     E    T    F
0        s5                s4                 1    2    3
1               s6                     acc
2               r2    s7         r2    r2
3               r4    r4         r4    r4
4        s5                s4                 8    2    3
5               r6    r6         r6    r6
6        s5                s4                      9    3
7        s5                s4                           10
8               s6               s11
9               r1    s7         r1    r1
10              r3    r3         r3    r3
11              r5    r5         r5    r5
  • 243. SLR(1) Grammar • An LR parser using SLR(1) parsing tables for a grammar G is called the SLR(1) parser for G. • If a grammar G has an SLR(1) parsing table, it is called an SLR(1) grammar (or SLR grammar in short). • Every SLR grammar is unambiguous, but not every unambiguous grammar is an SLR grammar.
  • 244. shift/reduce and reduce/reduce conflicts • If a state does not know whether it should make a shift operation or a reduction for a terminal, we say that there is a shift/reduce conflict. • If a state does not know whether it should make a reduction using production rule i or j for a terminal, we say that there is a reduce/reduce conflict. • If the SLR parsing table of a grammar G has a conflict, we say that the grammar is not an SLR grammar.
  • 245. Conflict Example
Grammar: S → L=R | R    L → *R | id    R → L
I0: S' → .S, S → .L=R, S → .R, L → .*R, L → .id, R → .L
I1: S' → S.
I2: S → L.=R, R → L.
I3: S → R.
I4: L → *.R, R → .L, L → .*R, L → .id
I5: L → id.
I6: S → L=.R, R → .L, L → .*R, L → .id
I7: L → *R.
I8: R → L.
I9: S → L=R.
Problem: FOLLOW(R) = {=, $}. In I2 on the token =, the table gets both shift 6 (from S → L.=R) and reduce by R → L (since = is in FOLLOW(R)) — a shift/reduce conflict.
  • 246. Conflict Example 2
Grammar: S → AaAb | BbBa    A → ε    B → ε
I0: S' → .S, S → .AaAb, S → .BbBa, A → ., B → .
Problem: FOLLOW(A) = {a,b} and FOLLOW(B) = {a,b}. In I0, on the token a: reduce by A → ε or reduce by B → ε — a reduce/reduce conflict; on the token b: reduce by A → ε or reduce by B → ε — a reduce/reduce conflict.
  • 247. Constructing Canonical LR(1) Parsing Tables • In the SLR method, state i makes a reduction by A → α when the current token is a if A → α. is in Ii and a is in FOLLOW(A). • In some situations, however, A cannot actually be followed by the terminal a in a right-sentential form when α and the state i are on the top of the stack; making the reduction in such a case is not correct. • Ex: for S → AaAb | BbBa, A → ε, B → ε we have S ⇒ AaAb ⇒ Aab ⇒ ab and S ⇒ BbBa ⇒ Bba ⇒ ba; in these right-sentential forms the first A is followed only by a and the first B only by b, even though FOLLOW(A) = FOLLOW(B) = {a,b}.
  • 248. LR(1) Item • To avoid some of these invalid reductions, the states need to carry more information. • Extra information is put into a state by including a terminal symbol as a second component in an item. • An LR(1) item is: A → α.β, a where a is the lookahead of the LR(1) item (a is a terminal or the end-marker $).
  • 249. LR(1) Item (cont.) • When β (in the LR(1) item A → α.β, a) is not empty, the lookahead does not have any effect. • When β is empty (A → α., a), we do the reduction by A → α only if the next input symbol is a (not for every terminal in FOLLOW(A)). • A state will contain A → α., a1 ... A → α., an where {a1,...,an} ⊆ FOLLOW(A).
  • 250. Canonical Collection of Sets of LR(1) Items • The construction of the canonical collection of the sets of LR(1) items is similar to the construction of the canonical collection of the sets of LR(0) items, except that the closure and goto operations work a little differently. • closure(I) is (where I is a set of LR(1) items): – every LR(1) item in I is in closure(I) – if A → α.Bβ, a is in closure(I) and B → γ is a production rule of G, then B → .γ, b will be in closure(I) for each terminal b in FIRST(βa).
  • 251. goto operation • If I is a set of LR(1) items and X is a grammar symbol (terminal or non-terminal), then goto(I,X) is defined as follows: – If A → α.Xβ, a is in I, then every item in closure({A → αX.β, a}) will be in goto(I,X).
  • 252. Construction of The Canonical LR(1) Collection • Algorithm: C is { closure({S' → .S, $}) }; repeat the following until no more sets of LR(1) items can be added to C: for each I in C and each grammar symbol X, if goto(I,X) is not empty and not in C, add goto(I,X) to C. • The goto function is a DFA on the sets in C.
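The only change from the LR(0) closure sketched earlier is the lookahead that gets attached: for A → α.Bβ, a we add B → .γ, b for every b in FIRST(βa). A sketch, assuming a first() helper over symbol sequences (not shown) and the same (lhs, rhs) grammar encoding as before; items are now 4-tuples ending with the lookahead.

def closure_lr1(items, grammar, nonterms, first):
    items = set(items)                    # items: (lhs, rhs, dot, lookahead)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot, a in list(items):
            if dot < len(rhs) and rhs[dot] in nonterms:
                beta = rhs[dot + 1:]
                for b in first(beta + (a,)):          # FIRST(beta a)
                    for l2, r2 in grammar:
                        if l2 == rhs[dot] and (l2, r2, 0, b) not in items:
                            items.add((l2, r2, 0, b))
                            changed = True
    return frozenset(items)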
  • 253. A Short Notation for The Sets of LR(1) Items • A set of LR(1) items containing the items A → α.β, a1 ... A → α.β, an can be written as A → α.β, a1/a2/.../an
  • 254. Canonical LR(1) Collection -- Example
Grammar: S → AaAb | BbBa    A → ε    B → ε
I0: S' → .S, $ ; S → .AaAb, $ ; S → .BbBa, $ ; A → ., a ; B → ., b
I1: S' → S., $
I2: S → A.aAb, $
I3: S → B.bBa, $
I4: S → Aa.Ab, $ ; A → ., b
I5: S → Bb.Ba, $ ; B → ., a
I6: S → AaA.b, $
I7: S → BbB.a, $
I8: S → AaAb., $
I9: S → BbBa., $
Now the ε-reductions carry distinct lookaheads (A → ., a vs. B → ., b in I0), so the reduce/reduce conflicts of the SLR construction disappear.
  • 255. Canonical LR(1) Collection – Example 2
Grammar: S' → S   1) S → L=R   2) S → R   3) L → *R   4) L → id   5) R → L
I0: S' → .S, $ ; S → .L=R, $ ; S → .R, $ ; L → .*R, $/= ; L → .id, $/= ; R → .L, $
I1: S' → S., $
I2: S → L.=R, $ ; R → L., $
I3: S → R., $
I4: L → *.R, $/= ; R → .L, $/= ; L → .*R, $/= ; L → .id, $/=
I5: L → id., $/=
I6: S → L=.R, $ ; R → .L, $ ; L → .*R, $ ; L → .id, $
I7: L → *R., $/=
I8: R → L., $/=
I9: S → L=R., $
I10: R → L., $
I11: L → *.R, $ ; R → .L, $ ; L → .*R, $ ; L → .id, $
I12: L → id., $
I13: L → *R., $
(I4 and I11, I5 and I12, I7 and I13, I8 and I10 have the same cores.)
  • 256. Construction of LR(1) Parsing Tables 1. Construct the canonical collection of sets of LR(1) items for G': C = {I0,...,In} 2. Create the parsing action table as follows: • If a is a terminal, A → α.aβ, b is in Ii and goto(Ii,a) = Ij, then action[i,a] is shift j. • If A → α., a is in Ii, then action[i,a] is reduce A → α, where A ≠ S'. • If S' → S., $ is in Ii, then action[i,$] is accept. • If any conflicting actions are generated by these rules, the grammar is not LR(1).
  • 257. Construction of LR(1) Parsing Tables 3. Create the parsing goto table: • for all non-terminals A, if goto(Ii,A) = Ij, then goto[i,A] = j 4. All entries not defined by (2) and (3) are errors. 5. The initial state of the parser is the one containing S' → .S, $.
  • 258. LR(1) Parsing Tables – (for Example 2)
state    id     *      =      $       S    L    R
0        s5     s4                    1    2    3
1                             acc
2                      s6     r5
3                             r2
4        s5     s4                         8    7
5                      r4     r4
6        s12    s11                        10   9
7                      r3     r3
8                      r5     r5
9                             r1
10                            r5
11       s12    s11                        10   13
12                            r4
13                            r3
No shift/reduce or reduce/reduce conflict → so it is an LR(1) grammar.
  • 259. LALR Parsing Tables • LALR stands for LookAhead LR. • LALR parsers are often used in practice because LALR parsing tables are smaller than LR(1) parsing tables. • The number of states in the SLR and LALR parsing tables for a grammar G is the same. • But LALR parsers recognize more grammars than SLR parsers. • yacc creates an LALR parser for the given grammar. • A state of an LALR parser will again be a set of LR(1) items.
  • 260. Creating LALR Parsing Tables • Canonical LR(1) Parser → (shrink the number of states) → LALR Parser • This shrink process may introduce a reduce/reduce conflict in the resulting LALR parser (in which case the grammar is NOT LALR). • But this shrink process cannot produce a shift/reduce conflict.
  • 261. The Core of A Set of LR(1) Items • The core of a set of LR(1) items is the set of its first components. Ex: the items S → L.=R, $ and R → L., $ have the core { S → L.=R, R → L. } • We find the states (sets of LR(1) items) in a canonical LR(1) parser with the same cores, and merge them into a single state. I1: L → id., = and I2: L → id., $ have the same core; merge them into the new state I12: L → id., =/$
  • 262. The Core of A Set of LR(1) Items • We will do this for all states of a canonical LR(1) parser to get the states of the LALR parser. • In fact, the number of the states of the LALR parser for a grammar will be equal to the number of states of the SLR parser for that grammar.
  • 263. Creation of LALR Parsing Tables • Create the canonical LR(1) collection of the sets of LR(1) items for the given grammar. • Find each core; find all sets having that same core; replace those sets with a single set which is their union: C = {I0,...,In} → C' = {J1,...,Jm} where m ≤ n • Create the parsing tables (action and goto tables) in the same way as for an LR(1) parser. – Note that if J = I1 ∪ ... ∪ Ik, then since I1,...,Ik have the same cores, the cores of goto(I1,X),...,goto(Ik,X) must also be the same. – So goto(J,X) = K, where K is the union of all sets of items having the same core as goto(I1,X). • If no conflict is introduced, the grammar is an LALR(1) grammar. (We may only introduce reduce/reduce conflicts; we cannot introduce a shift/reduce conflict.)
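The shrink step itself is mechanical: group LR(1) states by their core and union each group. A sketch over the 4-tuple items of the LR(1) sketch above; detecting any freshly introduced reduce/reduce conflict is left to the table construction that follows.

from collections import defaultdict

def merge_by_core(lr1_states):
    groups = defaultdict(set)
    for state in lr1_states:                 # state: frozenset of (lhs, rhs, dot, la)
        core = frozenset((l, r, d) for l, r, d, _ in state)
        groups[core] |= state                # union the items of same-core states
    return [frozenset(s) for s in groups.values()]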
  • 264. Shift/Reduce Conflict • We claimed that we cannot introduce a shift/reduce conflict during the shrink process for the creation of the states of an LALR parser. • Assume that we could. Then a state of the LALR parser must have: A → α., a and B → β.aγ, b • This means that some state of the canonical LR(1) parser must have: A → α., a and B → β.aγ, c But that state already has a shift/reduce conflict, i.e. the original canonical LR(1) parser had the conflict. (The reason: the shift operation does not depend on lookaheads.)
  • 265. Reduce/Reduce Conflict • But we may introduce a reduce/reduce conflict during the shrink process for the creation of the states of an LALR parser: I1: A → α., a ; B → β., b and I2: A → α., b ; B → β., c merge into I12: A → α., a/b ; B → β., b/c — a reduce/reduce conflict (on the token b).
  • 266. Canonical LALR(1) Collection – Example 2
Grammar: S' → S   1) S → L=R   2) S → R   3) L → *R   4) L → id   5) R → L
I0: S' → .S, $ ; S → .L=R, $ ; S → .R, $ ; L → .*R, $/= ; L → .id, $/= ; R → .L, $
I1: S' → S., $
I2: S → L.=R, $ ; R → L., $
I3: S → R., $
I411: L → *.R, $/= ; R → .L, $/= ; L → .*R, $/= ; L → .id, $/=
I512: L → id., $/=
I6: S → L=.R, $ ; R → .L, $ ; L → .*R, $ ; L → .id, $
I713: L → *R., $/=
I810: R → L., $/=
I9: S → L=R., $
(The same-core pairs I4/I11, I5/I12, I7/I13 and I8/I10 of the canonical LR(1) collection have been merged.)
  • 267. LALR(1) Parsing Tables – (for Example 2)
state    id     *      =      $       S    L    R
0        s5     s4                    1    2    3
1                             acc
2                      s6     r5
3                             r2
4        s5     s4                         8    7
5                      r4     r4
6        s12    s11                        10   9
7                      r3     r3
8                      r5     r5
9                             r1
No shift/reduce or reduce/reduce conflict → so it is an LALR(1) grammar.
  • 268. Using Ambiguous Grammars • All grammars used in the construction of LR parsing tables must be unambiguous. • Can we create LR parsing tables for ambiguous grammars? – Yes, but they will have conflicts. – We can resolve these conflicts in favor of one of the actions to disambiguate the grammar. – At the end, we will again have an unambiguous grammar. • Why do we want to use an ambiguous grammar? – Some ambiguous grammars are more natural, and the corresponding unambiguous grammar can be much more complex. – Usage of an ambiguous grammar may also eliminate unnecessary reductions. • Ex: E → E+E | E*E | (E) | id instead of E → E+T | T, T → T*F | F, F → (E) | id
  • 269. Sets of LR(0) Items for the Ambiguous Grammar
I0: E' → .E ; E → .E+E ; E → .E*E ; E → .(E) ; E → .id
I1: E' → E. ; E → E.+E ; E → E.*E
I2: E → (.E) ; E → .E+E ; E → .E*E ; E → .(E) ; E → .id
I3: E → id.
I4: E → E+.E ; E → .E+E ; E → .E*E ; E → .(E) ; E → .id
I5: E → E*.E ; E → .E+E ; E → .E*E ; E → .(E) ; E → .id
I6: E → (E.) ; E → E.+E ; E → E.*E
I7: E → E+E. ; E → E.+E ; E → E.*E
I8: E → E*E. ; E → E.+E ; E → E.*E
I9: E → (E).
  • 270. SLR-Parsing Tables for the Ambiguous Grammar FOLLOW(E) = { $, +, *, ) } State I7 (reached by I0 →E I1 →+ I4 →E I7) has shift/reduce conflicts for the symbols + and *: when the current token is +, choosing shift makes + right-associative, choosing reduce makes + left-associative; when the current token is *, choosing shift gives * higher precedence than +, choosing reduce gives + higher precedence than *.
  • 271. SLR-Parsing Tables for the Ambiguous Grammar FOLLOW(E) = { $, +, *, ) } State I8 (reached by I0 →E I1 →* I5 →E I8) has shift/reduce conflicts for the symbols + and *: when the current token is *, choosing shift makes * right-associative, choosing reduce makes * left-associative; when the current token is +, choosing shift gives + higher precedence than *, choosing reduce gives * higher precedence than +.
  • 272. SLR-Parsing Tables for the Ambiguous Grammar (conflicts resolved: + and * left-associative, * with higher precedence than +)
             Action                         Goto
state    id     +     *     (     )     $      E
0        s3                s2                  1
1               s4    s5               acc
2        s3                s2                  6
3               r4    r4         r4    r4
4        s3                s2                  7
5        s3                s2                  8
6               s4    s5         s9
7               r1    s5         r1    r1
8               r2    r2         r2    r2
9               r3    r3         r3    r3
  • 273. Error Recovery in LR Parsing • An LR parser will detect an error when it consults the parsing action table and finds an error entry. All empty entries in the action table are error entries. • Errors are never detected by consulting the goto table. • An LR parser will announce error as soon as there is no valid continuation for the scanned portion of the input. • A canonical LR parser (LR(1) parser) will never make even a single reduction before announcing an error. • The SLR and LALR parsers may make several reductions before announcing an error. • But, all LR parsers (LR(1), LALR and SLR parsers) will never shift an erroneous input symbol onto the stack.
  • 274. Panic Mode Error Recovery in LR Parsing • Scan down the stack until a state s with a goto on a particular nonterminal A is found (get rid of everything from the stack before this state s). • Discard zero or more input symbols until a symbol a is found that can legitimately follow A. – The simplest choice is a symbol in FOLLOW(A), but this may not work for all situations. • The parser pushes the nonterminal A and the state goto[s,A] onto the stack, and resumes normal parsing. • This nonterminal A is normally a major program piece (there can be more than one choice for A): stmt, expr, block, ...
  • 275. Phrase-Level Error Recovery in LR Parsing • Each empty entry in the action table is marked with a specific error routine. • An error routine reflects the error that the user most likely will make in that case. • An error routine inserts the symbols into the stack or the input (or it deletes the symbols from the stack and the input, or it can do both insertion and deletion). – missing operand – unbalanced right parenthesis
  • 277. Syntax Analysis Example a := b + c * 100 • The seven tokens are grouped into a parse tree. (figure: a parse tree with an assignment-stmt node at the root, whose children are the identifier a, the token :=, and an expression node; the expression expands to expression + expression, covering b and c*100.)
  • 278. Example of Parse Tree Given the grammar: list → list + digit (2.2) list → list - digit (2.3) list → digit (2.4) digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 (2.5) What is the parse tree for 9-5+2?
          list
        /   |   \
     list   +   digit
    /  |  \        |
 list  -  digit    2
   |        |
 digit      5
   |
   9
  • 279. Abstract Syntax Tree (AST) The AST is a condensed/simplified/abstract form of the parse tree in which: 1. Operators are directly associated with interior nodes (non-terminals). 2. Chains of single productions are collapsed. 3. Terminals that have no attributes are ignored, i.e., the corresponding leaf nodes are discarded.
  • 280. Abstract and Concrete Trees (figure: the concrete parse tree for 9-5+2, as on the previous slide, next to its abstract syntax tree: a + node at the root with children - and 2, where the - node has children 9 and 5.)
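A minimal sketch of such an AST for 9-5+2, with operators at interior nodes and the single-production chains collapsed away; the Node shape is an illustrative choice.

class Node:
    """An AST node: interior nodes hold an operator, leaves hold a value."""
    def __init__(self, op, left=None, right=None):
        self.op, self.left, self.right = op, left, right

    def __repr__(self):
        if self.left is None:
            return str(self.op)              # leaf
        return f"({self.left} {self.op} {self.right})"

# built bottom-up: first 9-5, then (9-5)+2, mirroring the reductions
ast = Node("+", Node("-", Node(9), Node(5)), Node(2))
print(ast)   # ((9 - 5) + 2)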
  • 281. Advantages of the AST Representation • Convenient representation for semantic analysis and intermediate-language (IL) generation • Useful for building other programming language tools, e.g., a syntax-directed editor
  • 282. Syntax Directed Translation (SDT) Syntax-directed translation is a method of translating a string into a sequence of actions by attaching one such action to each rule of a grammar. A syntax-directed translation is defined by augmenting the CFG: a translation rule is defined for each production. A translation rule defines the translation of the left-hand side non-terminal.
  • 283. Syntax-Directed Definitions and Translation Schemes A. Syntax-Directed Definitions: • give high-level specifications for translations • hide many implementation details, such as the order of evaluation of semantic actions • we associate a production rule with a set of semantic actions, and we do not say when they will be evaluated. B. Translation Schemes: • indicate the order of evaluation of the semantic actions associated with a production rule • in other words, translation schemes give some information about implementation details.
  • 284. Example Syntax-Directed Definition
term ::= ID             { term.place := ID.place ; term.code := "" }
term1 ::= term2 * ID    { term1.place := newtemp( ) ;
                          term1.code := term2.code || gen(term1.place ':=' term2.place '*' ID.place) }
expr ::= term           { expr.place := term.place ; expr.code := term.code }
expr1 ::= expr2 + term  { expr1.place := newtemp( ) ;
                          expr1.code := expr2.code || term.code || gen(expr1.place ':=' expr2.place '+' term.place) }
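A sketch of how this definition evaluates, with each grammar rule turned into a function that returns the (place, code) attribute pair. newtemp() and gen() are the helpers the definition assumes, implemented here in an illustrative way.

_counter = 0

def newtemp():
    # Returns a fresh temporary name t1, t2, ...
    global _counter
    _counter += 1
    return f"t{_counter}"

def gen(*parts):
    # Emits one three-address instruction as a single-element code list.
    return [" ".join(parts)]

def term_id(name):                       # term ::= ID
    return name, []                      # term.place := ID.place ; term.code := ""

def term_mul(term, id_name):             # term1 ::= term2 * ID
    place, code = term
    t = newtemp()                        # term1.place := newtemp()
    return t, code + gen(t, ":=", place, "*", id_name)

def expr_add(expr, term):                # expr1 ::= expr2 + term
    (p1, c1), (p2, c2) = expr, term
    t = newtemp()                        # expr1.place := newtemp()
    return t, c1 + c2 + gen(t, ":=", p1, "+", p2)

# a + b * c  -->  t1 := b * c ; t2 := a + t1
place, code = expr_add(term_id("a"), term_mul(term_id("b"), "c"))
print("\n".join(code))                   # the result is in `place`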