LEX

LEX translates a set of regular expression specifications (given as input in input_file.l) into a C implementation of a corresponding finite state machine (lex.yy.c). This C program, when compiled, yields an executable lexical analyzer.

Lex and Flex (Fast Lexical Analyzer Generator) scanner:
https://stackoverflow.com/questions/32658943/using-lex-creating-scanner

Why use LEX:
https://stackoverflow.com/questions/1820103/why-use-lexical-analyzers

Syntax of a LEX program:

1) Declarations

The declarations section consists of two parts: auxiliary declarations and regular definitions.

a) Auxiliary Declarations

The auxiliary declarations are written between %{ and %} and are copied as-is by LEX to the output file lex.yy.c. This C code consists of instructions to the C compiler and is not processed by the LEX tool.

%{
C declarations and includes
%}

For example:

%{
#include <stdio.h>
#include <string.h>
int i = 0;
%}

b) Regular Definitions

Given a pattern such as letter(letter|num)+, how would LEX know what letter is? Regular definitions answer this by giving names to patterns. Names are case-sensitive, and a name is referenced inside a pattern by wrapping it in braces, as in {letter}.

letter [a-zA-Z]
num    [0-9]

With these definitions the pattern above is written {letter}({letter}|{num})+.

2) Transition Rules

- Specify the regex (regular expressions) to be recognized.
- The rules section starts and ends with %%.
- Each rule in a LEX program consists of two parts:
  1. The pattern to be matched
  2. The corresponding action to be executed

%%
\n    yylineno++;
"#include<"{letter}+(\.{letter}*)?">"    printf("preprocessor");
%%

3) User Routines

LEX generates C code for the rules specified in the rules section and places this code into a single function called yylex(). Any C code written after the second %%, such as a main() that calls yylex(), is copied to lex.yy.c unchanged.

RUN:

lex try.l
gcc lex.yy.c
./a.out file.c

Passing file.c (the actual source code) to a.out makes the scanner generate the tokens. This assumes that main() opens the file named in argv[1] and assigns it to yyin; by default yylex() reads from standard input. If the program does not define main() and yywrap() itself, link with the LEX library (gcc lex.yy.c -ll, or -lfl for flex).
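
To make the "C implementation of a finite state machine" idea concrete, here is a small hand-written sketch of a state machine for the pattern {letter}({letter}|{num})+ from the regular definitions. It is not the code LEX actually emits (generated scanners are table-driven and much larger); the names state, step and match_prefix are made up for this illustration. Like a LEX scanner, it reports the longest prefix of the input that matches.

```c
#include <stddef.h>

/* Hypothetical states for a hand-written DFA; not real LEX output. */
enum state { START, ONE_LETTER, ACCEPT, REJECT };

static int is_letter(unsigned char c)   /* letter [a-zA-Z] */
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static int is_num(unsigned char c)      /* num [0-9] */
{
    return c >= '0' && c <= '9';
}

/* One transition of the DFA for {letter}({letter}|{num})+ */
static enum state step(enum state s, unsigned char c)
{
    switch (s) {
    case START:      /* the first character must be a letter */
        return is_letter(c) ? ONE_LETTER : REJECT;
    case ONE_LETTER: /* the "+" needs at least one more letter or digit */
    case ACCEPT:     /* further letters or digits keep the match going */
        return (is_letter(c) || is_num(c)) ? ACCEPT : REJECT;
    default:
        return REJECT;
    }
}

/* Length of the longest prefix of text that matches, or 0 if none does. */
size_t match_prefix(const char *text)
{
    enum state s = START;
    size_t i = 0, longest = 0;

    while (text[i] != '\0') {
        s = step(s, (unsigned char)text[i]);
        if (s == REJECT)
            break;
        i++;
        if (s == ACCEPT)
            longest = i;   /* remember the last accepting position */
    }
    return longest;
}
```

For instance, match_prefix("ab1 = 2") returns 3 because the machine consumes "ab1" and stops at the space, while match_prefix("a") returns 0 because the pattern as written requires at least two characters.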