Structure of the LEX program

 

In this article, we will cover the basic concepts of LEX and YACC in compiler design and the structure of a LEX program.

Introduction to LEX:

Lex and Yacc are tools designed for writers of compilers and interpreters.

Lex and Yacc help us write programs that transform structured input. In programs with structured input, two tasks occur again and again:

  1. Dividing the input into meaningful units (tokens).
  2. Establishing or discovering the relationship among the tokens.
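To make the first task concrete, here is a minimal hand-written sketch in C of what "dividing the input into tokens" can look like. It is not what Lex generates; the function name `next_token` and the choice to treat whitespace as the only separator are assumptions made for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: finds the next whitespace-delimited token in s,
   starting the search at offset pos. On success it stores the token's
   start offset and length and returns 1; it returns 0 when nothing but
   whitespace (or the end of the string) remains. */
int next_token(const char *s, size_t pos, size_t *start, size_t *len)
{
    while (s[pos] != '\0' && isspace((unsigned char)s[pos]))
        pos++;                      /* skip separators */
    if (s[pos] == '\0')
        return 0;                   /* no more tokens */

    *start = pos;
    while (s[pos] != '\0' && !isspace((unsigned char)s[pos]))
        pos++;                      /* consume the token */
    *len = pos - *start;
    return 1;
}
```

Calling `next_token` repeatedly, each time passing `start + len` from the previous call as `pos`, walks through the tokens one by one. Lex replaces this kind of hand-written loop with a specification made of regular expressions.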

Two rules to remember about Lex:

  1. Lex will always match the longest (number of characters) token possible.
    Ex: Input: abc
    Then [a-z]+ matches abc rather than a or ab or bc.
  2. If two or more possible tokens are of the same length, then the token with the regular expression that is defined first in the lex specification is favored.
    Ex:
    [a-z]+ {printf("Hello");}
    hit {printf("World");}
    Input: hit Output: Hello
    Both patterns match all three characters of hit, so the rule listed first wins.
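The two rules can be sketched in plain C as a selection over candidate matches. This is only an illustration of the selection logic, not how a Lex-generated scanner is implemented (generated scanners use a state machine). The matcher functions, the `rules` table, and `pick_rule` are hypothetical names, and the three stand-in patterns ([a-z]+, the literal word hit, and [0-9]+) are assumptions chosen for the example.

```c
#include <stddef.h>
#include <string.h>

/* Each hypothetical matcher returns how many leading characters of s
   its pattern matches; 0 means no match. */
static size_t match_lower(const char *s)   /* stands in for [a-z]+ */
{
    size_t n = 0;
    while (s[n] >= 'a' && s[n] <= 'z')
        n++;
    return n;
}

static size_t match_hit(const char *s)     /* stands in for the literal hit */
{
    return strncmp(s, "hit", 3) == 0 ? 3 : 0;
}

static size_t match_digits(const char *s)  /* stands in for [0-9]+ */
{
    size_t n = 0;
    while (s[n] >= '0' && s[n] <= '9')
        n++;
    return n;
}

/* Rules in the order they would appear in the specification. */
typedef size_t (*matcher_fn)(const char *);
static const matcher_fn rules[] = { match_lower, match_hit, match_digits };

/* Returns the index of the winning rule for the text at s, or -1 if no
   rule matches. The length of the winning match is stored in *len. */
int pick_rule(const char *s, size_t *len)
{
    int best = -1;
    size_t count = sizeof rules / sizeof rules[0];

    *len = 0;
    for (size_t i = 0; i < count; i++) {
        size_t n = rules[i](s);
        /* Strictly greater: a later rule wins only with a longer match,
           so equal-length ties go to the rule listed first. */
        if (n > *len) {
            *len = n;
            best = (int)i;
        }
    }
    return best;
}
```

With this table, `pick_rule("hit", &len)` returns 0 (the [a-z]+ stand-in) with `len` set to 3: `match_hit` also reports 3 characters, but it is listed later and a tie never displaces an earlier rule.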

The Structure of LEX:

%{
Definition section
%}

%%
Rules section
%%

User Subroutine section  

The Definition section is the place to define macros (named patterns) and include header files written in C. Any C code written between %{ and %} is copied verbatim into the generated source file.

The Rules section is the most important section. Each rule is made up of two parts: a pattern and an action, separated by whitespace. Patterns are simply regular expressions, and actions are C code. When the lexer that lex generates sees text in the input matching a given pattern, it executes the associated action. This section begins at the first %% and ends at the second %%.

The User Subroutine section is where all the required procedures are defined. It typically contains main along with any other C statements and functions, all of which are copied verbatim to the generated source file. These functions usually contain code called by the actions in the rules section.

Sample LEX program to recognize numbers

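%{
#include <stdio.h>
%}

%%
[0-9]+ { printf("Saw an integer: %s\n", yytext); /* yytext holds the matched text */ }
. { ; /* ignore any other single character */ }
%%

int main(void)
{
    printf("Enter some input that consists of an integer number\n");
    yylex(); /* run the generated lexer until end of input */
    return 0;
}

/* Called by the lexer at end of input; returning 1 means there is no more input. */
int yywrap(void)
{
    return 1;
}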

Output:

Running Lex program:

[student@localhost ~]$ lex 1a.l

[student@localhost ~]$ cc lex.yy.c

[student@localhost ~]$ ./a.out

Enter some input that consists of an integer number

hello 2345

Saw an integer: 2345

Explanation:

The first line runs lex over the lex specification and generates a file, lex.yy.c, which contains C code for the lexer.

The second line compiles the C file into an executable named a.out.

The third line runs the executable.
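For comparison, the behavior of the two rules in the sample program can be approximated by a hand-written C function. This is a rough sketch, not the code found in lex.yy.c. The name `scan_integers` and its buffer-based interface are assumptions made so the result is easy to inspect, and unlike the "." rule it also skips newlines silently.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the two rules: report each run of digits and
   silently skip every other character. Report lines are written into out
   (at most outsz bytes, NUL-terminated when outsz > 0). Returns the
   number of integers seen. */
int scan_integers(const char *in, char *out, size_t outsz)
{
    int count = 0;
    size_t used = 0;
    size_t i = 0;

    if (outsz > 0)
        out[0] = '\0';

    while (in[i] != '\0') {
        if (in[i] < '0' || in[i] > '9') {
            i++;                    /* like the "." rule: ignore */
            continue;
        }
        size_t start = i;
        while (in[i] >= '0' && in[i] <= '9')
            i++;                    /* take the longest run, like [0-9]+ */

        if (used < outsz) {
            int n = snprintf(out + used, outsz - used,
                             "Saw an integer: %.*s\n",
                             (int)(i - start), in + start);
            if (n > 0)
                used += (size_t)n;  /* once full, later reports are dropped */
        }
        count++;
    }
    return count;
}
```

The inner loop is the part the longest-match rule takes care of in Lex: it keeps consuming digits until the run ends, so 2345 is reported once rather than as four separate digits.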

Summary:

This article discussed the structure of a LEX program: the definition, rules, and user subroutine sections, along with the two matching rules Lex follows.
