Introduction to Readers

Racket is not just a programming language; it is a base on which you can build your own language or environment. That language uses the syntax you specify to perform the operations you want. In short, you are the designer of the language environment you set up. If you want programs in your language to consist only of 'dot' and 'dash', as in Morse code, you can do that. Of course, you first need to define what the combinations of dots and dashes mean.

To achieve this, a Racket program is first parsed by a reader. If you start a file with "#lang racket", the default reader is used. However, the reader can be extended: with the #reader form we can specify our own reader. As a demonstration, let us write a reader that takes an arithmetic expression of the form "3+6/5*9" and turns it into a Racket expression, so that Racket can evaluate it.

STEP 1: Open a new Racket file and state the language you want to use. We want to use racket, so the first line of the file is

    #lang racket

STEP 2: We need to define two functions, read and read-syntax. First provide these two functions, then define them.

    (provide read read-syntax)

    (define (read in)
      (syntax->datum (read-syntax #f in)))

    (define (read-syntax src in)
      (skip-whitespace in)
      (read-arith src in))

Usually, read is meant for parsing data, whereas read-syntax is used for parsing programs. That is, read-syntax returns a syntax object: the datum that read would return, wrapped together with information about its source, such as the source name, line, column and position.

STEP 3: Let us explore a new function, regexp-match. This function matches a regular expression against a string or, as here, against an input port. The syntax of regular expressions is described at http://docs.racket-lang.org/reference/regexp.html

    (define (skip-whitespace in)
      (regexp-match #px"^\\s*" in))

In the above function, the pattern #px"^\\s*" is matched against the input port in. Here, #px marks the pattern as a pregexp. The syntax of pregexp (#px) and regexp (#rx) patterns differs slightly; pregexp syntax is largely compatible with Perl.
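To see what skip-whitespace does to a port, here is a minimal sketch. The string port and the names in, skipped and remaining are made up for this illustration; only skip-whitespace itself comes from the tutorial.

```racket
#lang racket

;; Same definition as in Step 3.
(define (skip-whitespace in)
  (regexp-match #px"^\\s*" in))

;; A string port stands in for the file being read (illustrative only).
(define in (open-input-string "   3+4"))

;; Matching against a port consumes the matched bytes, so the three
;; leading spaces are removed from the port here.
(define skipped (skip-whitespace in))

;; Whatever is left on the port is the expression itself.
(define remaining (port->string in))
```

Because the match is made against a port rather than a string, the result in skipped holds byte strings, and the matched whitespace is no longer available to later reads from in.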
Following #px is the pattern itself, in quotes. ^ matches the beginning of the input (or, in multi-line mode, the point just after a newline). \\s matches a whitespace character, and * means zero or more repetitions of it. Thus skip-whitespace reads, and thereby consumes, any whitespace at the start of the input.

STEP 4: The next step is to get the location of the object being read. One function that helps here is port-next-location. It returns three values: the line number, column number and position of the next character to be read, each either an integer or #f.

    (define-values (line col pos) (port-next-location in))

STEP 5a: Next we check the sanity of the syntax entered by the user with a regular expression, and store the result in expr-match.

    (define expr-match
      (regexp-match #px"^([a-z]|[0-9]+)(?:[-+*/]([a-z]|[0-9]+))*(?![-+*/])" in))

Let's see what is going on here. The input in is matched against the pattern #px"^([a-z]|[0-9]+)(?:[-+*/]([a-z]|[0-9]+))*(?![-+*/])". The pattern says that the match must begin at the start of the input with an operand: either a single lowercase letter, or a number. The + applies only to [0-9], so a number is one or more digits. The operand may be followed by an operator, one of +, -, * or /, and another operand of the same kind. The * after the group means that this operator-plus-operand sequence may occur zero or more times. Finally, the character right after the match must not be an operator.

The group that starts with (?: is a non-capturing group: it groups the operator and operand so that * can repeat them, but its match is not reported separately. The group that starts with (?! is a negative lookahead: it succeeds only if the enclosed pattern does not match at that point, and it consumes no input. Plain parentheses ( ) are capturing groups; the substrings they match are returned as the following items of the result list. If the pattern does not match, #f is returned.

Go ahead and play with this pattern. Since expr-match is a value rather than a function, call regexp-match with the same pattern and the string "3+4*5" in place of in, and see what is returned. Try a few more strings containing arithmetic expressions. Note the items returned. Do these values match your intuition?

STEP 5b: datum->syntax is an important function when constructing a reader.
It converts the datum v to a syntax object and attaches the location information to it. The location is given as a list or a vector of the form (list source-name line column position span).

    (define (to-syntax v delta span-str)
      (datum->syntax #f v (make-srcloc delta span-str)))

    (define (make-srcloc delta span-str)
      (and line
           (vector src line (+ col delta) (+ pos delta)
                   (string-length span-str))))

Note that make-srcloc refers to src, line, col and pos, so these definitions only work where those names are bound. That is why they end up inside read-arith in Step 7.

STEP 6: Now let us try to make sense of everything. First look at the function below, and see how much of it you can understand on your own.

    (define (parse-expr s delta)
      (match (or (regexp-match #rx"^(.*?)([+-])(.*)$" s)
                 (regexp-match #rx"^(.*?)([*/])(.*)$" s))
        [(list _ a-str op-str b-str)
         (define a-len (string-length a-str))
         (define a (parse-expr a-str delta))
         (define b (parse-expr b-str (+ delta 1 a-len)))
         (define op (to-syntax (string->symbol op-str)
                               (+ delta a-len) op-str))
         (to-syntax (list op a b) delta s)]
        [else (to-syntax (or (string->number s)
                             (string->symbol s))
                         delta s)]))

Let me start by giving you some hints. The first part is a form called match. You can read more about match at http://docs.racket-lang.org/reference/match.html. match takes a value and a sequence of clauses, and returns the result of the first clause whose pattern matches the value. For example, (match '(1 2 3) [(list a b c) 4] [(list a b) 2]) returns 4, because the three-element list matches the first pattern. With a two-element list, (match '(1 2) [(list a b c) 4] [(list a b) 2]) returns 2.

Now let's look at the use of match in our function. First, or evaluates its first expression. If the result is anything other than #f, or returns it without evaluating the rest; if it is #f, or moves on to the next expression. I have already explained regexp-match in detail. The first regexp-match splits the string at its first + or -, so that products and quotients stay together inside the two pieces. Only if the string contains no + or - does the second regexp-match apply; it splits the string at its first * or /.
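The two patterns in parse-expr can be tried on plain strings. The sketch below does that; split-once, add-sub-rx and mul-div-rx are hypothetical names introduced only for this illustration, and split-once copies just the or and match structure of parse-expr without building any syntax objects.

```racket
#lang racket

;; The two patterns used by parse-expr.
(define add-sub-rx #rx"^(.*?)([+-])(.*)$")
(define mul-div-rx #rx"^(.*?)([*/])(.*)$")

;; split-once (hypothetical helper): returns the left operand, operator
;; and right operand as strings, or #f when s contains no operator.
(define (split-once s)
  (match (or (regexp-match add-sub-rx s)
             (regexp-match mul-div-rx s))
    [(list _ a-str op-str b-str) (list a-str op-str b-str)]
    [_ #f]))

(split-once "4*3+9")  ; '("4*3" "+" "9")  + is tried before *
(split-once "4*3")    ; '("4" "*" "3")    no + or -, so the second pattern applies
(split-once "9")      ; #f                a lone operand
(split-once "1-2-3")  ; '("1" "-" "2-3")  the split is at the first operator
```

The last call shows a consequence of splitting at the first operator: the remainder "2-3" is parsed as a unit, so "1-2-3" is grouped as 1-(2-3).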
Try these expressions on their own to get a better feel for them. When one of the patterns matches, regexp-match returns a list of four strings, which the pattern (list _ a-str op-str b-str) takes apart: _ ignores the complete match, and the remaining three are the left operand, the operator and the right operand. Both operands are parsed recursively, and the result is (to-syntax (list op a b) delta s). If neither pattern matches, regexp-match returns #f, the else clause applies, and the value returned is (to-syntax (or (string->number s) (string->symbol s)) delta s). For example, "4*3+9" is split at + into "4*3" and "9"; "4*3" is split in turn at *; and the result is the syntax for (+ (* 4 3) 9). You will have to look at this function in a little more detail to see how the expression is parsed recursively and converted to Racket syntax.

STEP 7: Finally, let us combine everything. Wrap Steps 4 through 6 in another function called read-arith:

    (define (read-arith src in)
      ;; the definitions from Steps 4 through 6 go here
      (unless expr-match
        (raise-read-error "bad arithmetic syntax"
                          src line col pos
                          (and pos (- (file-position in) pos))))
      (parse-expr (bytes->string/utf-8 (car expr-match)) 0))

Here we first check whether the input is a valid arithmetic expression, and raise a read error if it is not. Otherwise we parse the expression. Note that raise-read-error is provided by the syntax/readerr library, so the file also needs (require syntax/readerr).

STEP 8: Check that your reader works. Save the above file as "myreader.rkt". In the interactions window, type

    #reader"myreader.rkt" 4*3+9

and execute it.

Note: For your projects, a number of you have asked how to use the normal symbols +, etc. to add and subtract quaternions. You can use rename-out to change the names under which the functions you have defined are provided. For example, see the code below. Name a file qsys.rkt and write the following code in it.

    (module <ignored> racket
      (provide (except-out (all-from-out racket) + *)
               (rename-out (+q +) (*q *)))
      (define (+q x y) (+ x y 1000))
      (define (*q x y) (* x y 1000)))

Notice that we provide all the functions from racket except + and *, and provide our own functions +q and *q under the names + and *. In a different file, you can write

    #lang racket
    (require "qsys.rkt")
    (+ 2 3)
    (* 2 3)

and run the file.
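Returning to the reader itself, Steps 2 through 7 might be assembled into one file roughly as follows. This is a sketch under two assumptions: that to-syntax, make-srcloc and parse-expr are defined inside read-arith, since they need src, line, col and pos in scope, and that (require syntax/readerr) is added because raise-read-error is provided by that library.

```racket
#lang racket
(require syntax/readerr)   ; provides raise-read-error

(provide read read-syntax)

;; read returns a plain datum: the result of read-syntax
;; with the location information stripped off.
(define (read in)
  (syntax->datum (read-syntax #f in)))

(define (read-syntax src in)
  (skip-whitespace in)
  (read-arith src in))

(define (skip-whitespace in)
  (regexp-match #px"^\\s*" in))

(define (read-arith src in)
  ;; Step 4: location of the next character to be read.
  (define-values (line col pos) (port-next-location in))
  ;; Step 5: consume the expression, or get #f if it is malformed.
  (define expr-match
    (regexp-match
     #px"^([a-z]|[0-9]+)(?:[-+*/]([a-z]|[0-9]+))*(?![-+*/])"
     in))

  ;; Wrap a datum with its source location.
  (define (to-syntax v delta span-str)
    (datum->syntax #f v (make-srcloc delta span-str)))
  (define (make-srcloc delta span-str)
    (and line
         (vector src line (+ col delta) (+ pos delta)
                 (string-length span-str))))

  ;; Step 6: split at an operator and parse both sides recursively.
  (define (parse-expr s delta)
    (match (or (regexp-match #rx"^(.*?)([+-])(.*)$" s)
               (regexp-match #rx"^(.*?)([*/])(.*)$" s))
      [(list _ a-str op-str b-str)
       (define a-len (string-length a-str))
       (define a (parse-expr a-str delta))
       (define b (parse-expr b-str (+ delta 1 a-len)))
       (define op (to-syntax (string->symbol op-str)
                             (+ delta a-len) op-str))
       (to-syntax (list op a b) delta s)]
      [else (to-syntax (or (string->number s)
                           (string->symbol s))
                       delta s)]))

  ;; Step 7: reject bad input, otherwise parse what was matched.
  (unless expr-match
    (raise-read-error "bad arithmetic syntax"
                      src line col pos
                      (and pos (- (file-position in) pos))))
  (parse-expr (bytes->string/utf-8 (car expr-match)) 0))
```

With these definitions, calling (read (open-input-string "4*3+9")) inside the module should give the datum '(+ (* 4 3) 9), which is what Racket evaluates when the same text follows #reader"myreader.rkt".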
