LEXICAL ANALYSER (SCANNER)
• Lexical analysis is the first phase of a compiler. It takes source code, already modified
by language preprocessors, that is written in the form of sentences. The lexical analyzer
breaks this text into a series of tokens, discarding any whitespace and comments
in the source code.
• The lexical analyzer is also known as a scanner. It converts
the high-level input program into a sequence of tokens.
• If the lexical analyzer finds a token invalid, it generates an error. The lexical
analyzer works closely with the syntax analyzer: it reads character streams from the
source code, checks for legal tokens, and passes each token to the syntax analyzer
on demand.
• Tokens
A lexeme is the sequence of characters in the source that makes up one token. There
are predefined rules for every lexeme to be identified as a valid token. These
rules are expressed as patterns. A pattern describes what
can be a token, and patterns are defined by means of regular expressions.
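To make the link between patterns and regular expressions concrete, here is a minimal Python sketch. The token names and the three patterns are illustrative assumptions, not taken from any particular language definition.

```python
import re

# Hypothetical token names, each paired with the regular expression
# that defines its pattern.
PATTERNS = {
    "NUMBER": r"\d+",
    "IDENTIFIER": r"[A-Za-z_][A-Za-z0-9_]*",
    "ASSIGN": r"=",
}

def matching_tokens(lexeme):
    """Return the names of all patterns that match the whole lexeme."""
    return [name for name, regex in PATTERNS.items()
            if re.fullmatch(regex, lexeme)]

print(matching_tokens("count1"))  # ['IDENTIFIER']
print(matching_tokens("273"))     # ['NUMBER']
print(matching_tokens("1count"))  # [] : no pattern matches, so not a valid token
```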
• Lexical analysis can be implemented with a deterministic finite automaton (DFA).
• The output is a sequence of tokens that is sent to the parser for syntax analysis.
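As a rough illustration of the DFA idea, the sketch below hand-codes a small automaton that accepts identifiers (a letter or underscore followed by letters, digits, or underscores). The state names and the transition table are assumptions made for this example; a real scanner would combine the automata of all its token patterns.

```python
# Hypothetical DFA for identifiers, written as an explicit transition table.
START, IN_ID, DEAD = "start", "in_id", "dead"
ACCEPTING = {IN_ID}

def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch == "_" or (ch.isascii() and ch.isalpha()):
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"

TRANSITIONS = {
    (START, "letter"): IN_ID,
    (IN_ID, "letter"): IN_ID,
    (IN_ID, "digit"): IN_ID,
}

def accepts(text):
    """Run the DFA over text; any missing transition leads to the dead state."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)), DEAD)
    return state in ACCEPTING

print(accepts("abs_zero_Kelvin"))  # True
print(accepts("9lives"))           # False
```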
• What is a token?
A lexical token is a sequence of characters that can be treated as a unit in the grammar
of a programming language.
Examples of tokens:
• Type tokens (id, number, real, . . . )
• Punctuation tokens ('(', ')', ';', . . . )
• Alphabetic tokens (keywords such as if, void, return, . . . )
Keywords: for, while, if, etc.
Identifiers: variable names, function names, etc.
Operators: '+', '++', '-', etc.
Separators: ',', ';', etc.
Examples of non-tokens:
• Comments, preprocessor directives, macros, blanks, tabs, newlines, etc.
Lexeme: the sequence of input characters that is matched by a pattern and makes up
a single token is called a lexeme.
e.g. “float”, “abs_zero_Kelvin”, “=”, “-”, “273”, “;”
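The lexemes listed above would come from a statement such as float abs_zero_Kelvin = -273; and the following sketch pairs each lexeme with a token class. The class names and the short keyword list are assumptions chosen for illustration.

```python
import re

# Hypothetical token classes; order matters, so keywords are tried
# before identifiers.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:float|int|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("NUMBER", r"\d+"),
    ("OPERATOR", r"[=+\-]"),
    ("SEPARATOR", r"[;,]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def lexemes(source):
    """Return (token class, lexeme) pairs, dropping whitespace.

    Characters that match no pattern are silently skipped in this sketch.
    """
    return [(m.lastgroup, m.group())
            for m in MASTER.finditer(source)
            if m.lastgroup != "SKIP"]

for kind, text in lexemes("float abs_zero_Kelvin = -273;"):
    print(kind, text)
```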
How the lexical analyzer functions
1. Tokenization, i.e. dividing the program into valid tokens.
2. Removes whitespace characters.
3. Removes comments.
4. Helps generate error messages by providing row numbers and column
numbers.
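The four functions above can be sketched in a single pass. This is a minimal illustration that assumes C-style // comments and a simplified set of token classes (WORD, NUMBER, SYMBOL); it is not a complete scanner.

```python
import re

TOKEN_RE = re.compile(r"""
    (?P<COMMENT>//[^\n]*)          # single-line comment, discarded
  | (?P<NEWLINE>\n)                # tracked so later tokens get the right row
  | (?P<SPACE>[ \t]+)              # blanks and tabs, discarded
  | (?P<WORD>[A-Za-z_]\w*)         # keywords and identifiers
  | (?P<NUMBER>\d+)
  | (?P<SYMBOL>[^\sA-Za-z_0-9])    # any other single visible character
""", re.VERBOSE)

def tokenize(source):
    """Return (kind, lexeme, row, column) tuples with 1-based positions."""
    row, line_start = 1, 0
    tokens = []
    for m in TOKEN_RE.finditer(source):
        kind = m.lastgroup
        if kind == "NEWLINE":
            row += 1
            line_start = m.end()
        elif kind not in ("COMMENT", "SPACE"):
            tokens.append((kind, m.group(), row, m.start() - line_start + 1))
    return tokens

for token in tokenize("a = 1 // set a\nb = a;"):
    print(token)
```

Storing the row and column with each token is what lets a later phase report an error at the right place in the source.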
Suppose we pass a statement through the lexical analyzer:
a = b + c ;
It will generate a token sequence like this:
id=id+id;
where each id refers to its variable in the symbol table, which holds all of that variable's details.
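A minimal sketch of this mapping, assuming the symbol table is a plain dictionary from identifier name to table index (real compilers store much richer entries):

```python
import re

IDENTIFIER = r"[A-Za-z_]\w*"

def to_token_stream(statement):
    """Replace every identifier by an ('id', index) token.

    The index points at the identifier's entry in a hypothetical
    symbol table, modelled here as a dict of name -> index.
    """
    symbol_table = {}
    stream = []
    for lexeme in re.findall(rf"{IDENTIFIER}|\d+|\S", statement):
        if re.fullmatch(IDENTIFIER, lexeme):
            index = symbol_table.setdefault(lexeme, len(symbol_table))
            stream.append(("id", index))
        else:
            stream.append((lexeme, None))
    return stream, symbol_table

stream, table = to_token_stream("a = b + c ;")
print("".join(kind for kind, _ in stream))  # id=id+id;
print(table)                                # {'a': 0, 'b': 1, 'c': 2}
```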
For example, consider the program
int main()
{
  // 2 variables
  int a, b;
  a = 10;
  return 0;
}
All the valid tokens are:
'int' 'main' '(' ')' '{' 'int' 'a' ',' 'b' ';' 'a' '=' '10' ';'
'return' '0' ';' '}'
The comment is omitted because it is not a token.
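The token list for this program can be reproduced with a small two-step sketch: strip the comment first, then split what remains into lexemes. The regular expressions are simplifying assumptions (for instance, they would mishandle // inside a string literal).

```python
import re

PROGRAM = """int main()
{
  // 2 variables
  int a, b;
  a = 10;
  return 0;
}
"""

def valid_tokens(source):
    """Drop // comments, then return the remaining lexemes in order."""
    without_comments = re.sub(r"//[^\n]*", "", source)
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", without_comments)

tokens = valid_tokens(PROGRAM)
print(tokens)
print(len(tokens))  # 18
```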
Exercise 1:
Count the number of tokens:
int main()
{
int a = 10, b = 20;
printf("sum is :%d", a + b );
return 0;
}
Answer: Total number of tokens: 27.
Exercise 2:
Count the number of tokens:
int max(int i);
• The lexical analyzer first reads int, finds it valid, and accepts it as a token.
• Next it reads max and accepts it as an identifier token (a function name here), then reads ( as a token.
• int is again a token, then i is another token, followed by ) and finally ;
Answer: Total number of tokens: 7
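Both answers can be checked with a short counting sketch. It assumes that a string literal counts as one token and that every punctuation mark is a token on its own; multi-character operators such as ++ would be split by this simplified pattern, but neither exercise contains one.

```python
import re

# One alternative per lexeme shape: string literal, word, number, single symbol.
TOKEN_RE = re.compile(r'"(?:[^"\\\n]|\\.)*"|[A-Za-z_]\w*|\d+|[^\s\w]')

EXERCISE_1 = '''int main()
{
  int a = 10, b = 20;
  printf("sum is :%d", a + b );
  return 0;
}
'''
EXERCISE_2 = "int max(int i);"

def count_tokens(source):
    """Count the lexemes found by TOKEN_RE; whitespace is never matched."""
    return len(TOKEN_RE.findall(source))

print(count_tokens(EXERCISE_1))  # 27
print(count_tokens(EXERCISE_2))  # 7
```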
Basic Terminologies
What’s a lexeme?
A lexeme is a sequence of characters in the source program that matches the
pattern of a token. It is nothing but an instance of a token.
What’s a token?
A token in compiler design is a sequence of characters that represents a unit of
information in the source program.
What is a pattern?
A pattern is a description of the form that the lexemes of a token may take. In the case of a keyword
used as a token, the pattern is simply the sequence of characters that forms the keyword.
Roles of the lexical analyzer:
The lexical analyzer performs the following tasks:
• Identifies tokens and enters them into the symbol table
• Removes whitespace and comments from the source program
• Correlates error messages with the source program
• Expands macros if they are found in the source program
• Reads input characters from the source program
Lexical Errors
A character sequence that cannot be scanned into any valid token is
a lexical error.
• Lexical errors are not very common, but they should be handled by the scanner.
• Misspellings of identifiers, operators, and keywords are considered lexical errors.
• Generally, a lexical error is caused by the appearance of some illegal character, mostly
at the beginning of a token.
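As a rough sketch of how a scanner might report an illegal character, the example below raises an error when no pattern matches at the current position. The LexicalError class and the small set of accepted symbols are assumptions made for illustration.

```python
import re

class LexicalError(Exception):
    """Hypothetical error type raised when no token pattern matches."""

# Whitespace, words, numbers, and a small set of accepted symbols.
TOKEN_RE = re.compile(r"\s+|[A-Za-z_]\w*|\d+|[-+*/=<>!;,(){}]")

def scan(source):
    """Return the list of lexemes, or raise LexicalError at the first bad character."""
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise LexicalError(
                f"illegal character {source[pos]!r} at offset {pos}")
        if not m.group().isspace():
            tokens.append(m.group())
        pos = m.end()
    return tokens

print(scan("a = b + c;"))
try:
    scan("a = b @ c;")
except LexicalError as err:
    print(err)  # illegal character '@' at offset 6
```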
