6/12/2019
ANTLR4, .NET Core 2.1, and C#: Using the Visitor - Full Boar LLC
ANTLR4, .NET Core 2.1, and C#: Using the Visitor
.NET Core, Antlr, Antlr4, C#, DSL, Parsing
October 16, 2018 by Michael Jay
In the first post in this ANTLR4 series we went over setting up the tooling and tested everything with a simple
grammar file. This post will focus on using ANTLR4 to generate the C# classes needed to implement a simple
visitor.
This post will use the grammar file created in the previous post. You can work your way through that post or simply
download the source code from GitHub. Whatever works for you.
Before getting too far into the code, it’s probably a good idea to
understand a bit more about how ANTLR works. Grammar files
are used by ANTLR to generate a lexer and a parser. The lexer
turns the raw input into a token stream. The parser then
validates the token stream and generates a syntax tree. If you
want to do more than validate the input, you must traverse the
syntax tree using one of the two mechanisms ANTLR provides:
listeners and visitors.
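To make the lexer stage concrete, here is a minimal hand-written sketch of turning raw input into a token stream. It is not the code ANTLR generates. The LexerSketch class, its Tokenize method, and the regular expression are hypothetical, and the token patterns (DIGIT as one or more digits, OPERATOR as one of + - * /) are assumptions, since the actual lexer rules live in the grammar file from the previous post.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical illustration only: ANTLR generates its own lexer from the grammar.
public static class LexerSketch
{
    // Token type names mirror the tokens referenced in this post.
    // The patterns themselves are assumptions, not the grammar's actual rules.
    private static readonly string[] TokenTypes = { "DIGIT", "OPERATOR", "LPAREN", "RPAREN" };

    private static readonly Regex TokenPattern = new Regex(
        @"(?<DIGIT>[0-9]+)|(?<OPERATOR>[-+*/])|(?<LPAREN>\()|(?<RPAREN>\))|(?<WS>\s+)");

    public static List<(string Type, string Text)> Tokenize(string input)
    {
        var tokens = new List<(string Type, string Text)>();
        int position = 0;

        while (position < input.Length)
        {
            Match match = TokenPattern.Match(input, position);

            // Anything that is not a known token at the current position is an error.
            if (!match.Success || match.Index != position)
            {
                throw new FormatException(
                    $"Unexpected character '{input[position]}' at position {position}");
            }

            // Whitespace matches the WS group and is skipped: WS is not in TokenTypes.
            foreach (string type in TokenTypes)
            {
                if (match.Groups[type].Success)
                {
                    tokens.Add((type, match.Value));
                }
            }

            position += match.Length;
        }

        return tokens;
    }
}
```

Calling LexerSketch.Tokenize("(1 + 2)") yields five tokens in order: LPAREN, DIGIT, OPERATOR, DIGIT, RPAREN. The parser stage would then check that sequence against the grammar’s parser rules.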
By default ANTLR4 will generate the files necessary to provide you with a base listener in the language of your
choice. While the listener approach is perfectly acceptable, I find the visitor pattern to be a better fit for most of my
use cases. You can find a more detailed comparison here.
Generating C# Files With ANTLR4
In the previous post we generated the Java files for our grammar using the antlr4 batch file we created. It turns out
generating C# instead of Java is as simple as using one of the many command line options the ANTLR4 tool supports
(you can find a complete listing here).
Let’s create a batch file named gencsharp.bat in the same directory as our Calculator.g4 file:
REM gencsharp.bat
antlr4 -Dlanguage=CSharp -o csharp Calculator.g4 -no-listener -visitor
Let’s break down the second line piece by piece:

antlr4 – This is the batch file we set up in the first part of this series

-Dlanguage=CSharp – This option tells the ANTLR4 tool to ignore whatever language is configured in the
grammar file and target the specified language instead

-o csharp – Tells ANTLR4 to put the output in a directory named csharp

-no-listener and -visitor – Together, these tell ANTLR4 to generate a visitor instead of a listener
https://fullboarllc.com/antlr4-dotnet-core-visitor/
To simplify generating the Java files, let’s go ahead and set up a batch file for that as well, named genjava.bat:
REM genjava.bat
antlr4 -Dlanguage=Java -o java Calculator.g4
Notice that we did not provide the -no-listener and -visitor command line options for
the Java version. The TestRig uses the default output generated by the ANTLR4
tooling.
The Java files will come in handy when we want to use the TestRig to test new features in our grammar files.
At this point you can generate the C# files by simply running the gencsharp.bat file we created above. When
ANTLR4 has finished running you should see the following files in your csharp folder:
[Figure: csharp folder contents after running gencsharp.bat]
Setting up Visual Studio
When it comes to developing solutions using ANTLR4 you have a lot of options. You can develop a C# project using
Visual Studio, VS Code, Rider, or the text editor of your choice and the command line. In order to help ease into
things, this post will use Visual Studio. However, of the aforementioned options, Visual Studio has the weakest
support for working with ANTLR4 grammar files. In later posts we’ll look at some of the other options.
Step 1: Create a New .NET Core Console Application
If you’re researching how to use ANTLR4 with C# and .NET Core, odds are you don’t need much help with this step.
Long story short:
1. Launch Visual Studio
2. Select File > New > Project...
3. Select Console App (.NET Core) as the project type
4. Name it Calculator
5. Click on OK
Step 2: Add the ANTLR4 Generated C# files to Your Project
Copy the .cs files generated by ANTLR4 (CalculatorLexer.cs, CalculatorParser.cs, CalculatorVisitor.cs,
CalculatorBaseVisitor.cs) and paste them into your Visual Studio project folder. While not necessary, I created a
Parsing folder in my project for these files. The generated CalculatorVisitor.cs declares the ICalculatorVisitor
interface, so I renamed that file to ICalculatorVisitor.cs in order to adhere to C# naming
conventions: I recommend you do the same. This is what my solution looks like:
[Figure: ANTLR4 Project Folder Structure]
For now, ignore CalculatorVisitor.cs and ThrowingErrorListener.cs. They will be added shortly.
Implementing the Visitor
With the C# files generated and added to your project you are finally ready to do something interesting with the
grammar we first created in the previous post.

The visitor implemented here is not intended to be used in production code. It’s a
simplified example to be used for learning. Feel free to provide feedback, but let’s not
focus on how elegant (or not) the visitor is.
The base visitor generated by ANTLR4 exposes one method per parser rule in the grammar file (the lowercase rules:
operand and expression). These methods can be overridden in a derived class and provide an insertion point for
your logic. In our case we only have two methods to override: VisitExpression and VisitOperand. It is common to
have one class derived from the base visitor class per rule implementation. Given the simplicity of this example, we’ll
stick to a single class.
Implementing VisitOperand
Let’s first address the implementation of VisitOperand. Here is the code for that method in its entirety:
public override int VisitOperand([NotNull] OperandContext context)
{
    ITerminalNode digit = context.DIGIT();
    return digit != null
        ? int.Parse(digit.GetText())
        : HandleGroup(context.operand(), context.OPERATOR());
}
For reference, here is the associated parser rule:
operand: DIGIT | LPAREN operand (OPERATOR operand)+ RPAREN;
Based on our grammar, we know that operand will either be a DIGIT or a group containing multiple operators and
operands (e.g. (1+2+3)). We can use that knowledge to determine how to best walk the syntax tree. If
context.DIGIT() returns something other than null, then we know that we have a DIGIT. Otherwise we can assume
that we have a group we need to deal with. The names generated by ANTLR4 are a bit deceiving. Both
context.operand() and context.OPERATOR() appear to be single values; however, both return arrays.
Handling Groups
Handling the DIGIT case is straightforward: convert the string value into an integer value and return it. Group
handling is a bit more complex. For that we’ll implement the HandleGroup method:
private int HandleGroup(OperandContext[] operandCtxs, ITerminalNode[] operatorNodes)
{
    List<int> operands = operandCtxs.Select(Visit).ToList();
    Queue<string> operators = new Queue<string>(operatorNodes.Select(o => o.GetText()));
    return operands.Aggregate((a, c) => _funcMap[operators.Dequeue()](a, c));
}
The first line of the method converts the operand nodes to a list of integers. LINQ’s Select maps each
OperandContext through Visit, whose default implementation will eventually call the VisitOperand method we
implemented above.
Operator nodes are all terminal nodes. That means those nodes represent leaves in the syntax tree: no need to visit
them. Once again LINQ handles the mapping, this time from each node to its text. We know that we’ll want to use
each operator once and only once. A Queue provides a simple way to keep track of which operators we have and
have not used.
Lastly we need to reduce the list of operands and the queue of operators down to a single calculated value. That can
be accomplished with LINQ’s equivalent of a reduce method: Aggregate. The last line of the method combines the
accumulator with the next operand using a function map keyed off of the operator. The result is stored in the
accumulator. This process repeats until all of the operands have been reduced down to one value.
This simple implementation does not have error handling, nor does it properly handle
operator precedence. Please don’t try to use it in production code. Your boss won’t
be happy.
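To see the left-to-right behavior in isolation, here is a small self-contained sketch of the same Queue-plus-Aggregate technique, with plain integers standing in for visited operand nodes. The AggregateSketch class and its Reduce method are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for HandleGroup that takes already-evaluated operands.
public static class AggregateSketch
{
    // Maps each operator symbol to the function that applies it.
    private static readonly Dictionary<string, Func<int, int, int>> FuncMap =
        new Dictionary<string, Func<int, int, int>>
        {
            {"+", (a, b) => a + b},
            {"-", (a, b) => a - b},
            {"*", (a, b) => a * b},
            {"/", (a, b) => a / b}
        };

    public static int Reduce(IEnumerable<int> operands, IEnumerable<string> operatorSymbols)
    {
        // Each operator is dequeued exactly once, in the order it appeared.
        var operators = new Queue<string>(operatorSymbols);

        // Aggregate seeds the accumulator with the first operand, then folds in
        // each remaining operand strictly left to right.
        return operands.Aggregate((acc, next) => FuncMap[operators.Dequeue()](acc, next));
    }
}
```

With this sketch, AggregateSketch.Reduce(new[] { 1, 2, 3 }, new[] { "+", "*" }) evaluates (1 + 2) * 3 and returns 9, whereas conventional precedence would give 7. That is the precedence limitation noted above.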
Implementing VisitExpression
Referencing the parser rules for expression and operand you can see that they have a few things in common:
expression: operand (OPERATOR operand)+;
operand: DIGIT | LPAREN operand (OPERATOR operand)+ RPAREN;
It seems fairly reasonable that we can leverage the code we wrote to handle groups in VisitOperand here as well.
Let’s just do that and move on:
public override int VisitExpression([NotNull] ExpressionContext context)
{
    return HandleGroup(context.operand(), context.OPERATOR());
}
At this point we have fully implemented our very basic visitor. While the code doesn’t reflect production-ready
practices, it should be more than enough to get you started.
The Complete CalculatorVisitor Class
As mentioned in the introduction, you can download the project source code from GitHub. If you would prefer not to
bother with that, here is the complete listing for the CalculatorVisitor.cs file:
using System;
using System.Collections.Generic;
using System.Linq;
using Antlr4.Runtime.Misc;
using Antlr4.Runtime.Tree;
using OperandContext = CalculatorParser.OperandContext;
using ExpressionContext = CalculatorParser.ExpressionContext;

namespace Calculator.Parsing
{
    internal class CalculatorVisitor : CalculatorBaseVisitor<int>
    {
        #region Member Variables
        // Maps each operator symbol to the function that applies it.
        private readonly Dictionary<string, Func<int, int, int>> _funcMap =
            new Dictionary<string, Func<int, int, int>>
            {
                {"+", (a, b) => a + b},
                {"-", (a, b) => a - b},
                {"*", (a, b) => a * b},
                {"/", (a, b) => a / b}
            };
        #endregion

        #region Base Class Overrides
        public override int VisitExpression([NotNull] ExpressionContext context)
        {
            return HandleGroup(context.operand(), context.OPERATOR());
        }

        public override int VisitOperand([NotNull] OperandContext context)
        {
            // An operand is either a single DIGIT or a parenthesized group.
            ITerminalNode digit = context.DIGIT();
            return digit != null
                ? int.Parse(digit.GetText())
                : HandleGroup(context.operand(), context.OPERATOR());
        }
        #endregion

        #region Utility Methods
        private int HandleGroup(OperandContext[] operandCtxs, ITerminalNode[] operatorNodes)
        {
            // Evaluate every operand, then apply the operators left to right.
            List<int> operands = operandCtxs.Select(Visit).ToList();
            Queue<string> operators = new Queue<string>(operatorNodes.Select(o => o.GetText()));
            return operands.Aggregate((a, c) => _funcMap[operators.Dequeue()](a, c));
        }
        #endregion
    }
}
Putting the Visitor to Use
The last thing we need to do is to actually put the CalculatorVisitor class to use in some sort of meaningful way.
Using the visitor is straightforward. First, use the lexer to tokenize the input stream. Second, use the parser to build
the syntax tree. Finally, visit the syntax tree.
For a complete listing of Program.cs please visit the GitHub repository. The relevant code is listed below:
private int EvaluateInput(string input)
{
    CalculatorLexer lexer = new CalculatorLexer(new AntlrInputStream(input));
    lexer.RemoveErrorListeners();
    lexer.AddErrorListener(new ThrowingErrorListener());

    CalculatorParser parser = new CalculatorParser(new CommonTokenStream(lexer));
    parser.RemoveErrorListeners();
    parser.AddErrorListener(new ThrowingErrorListener());

    return new CalculatorVisitor().Visit(parser.expression());
}
Assuming all has gone well, you should be able to build the program, run it, and calculate the result of simple
expressions.
-2, yeah… that looks right
Helpful Links
While I do my best to provide useful information, you should probably supplement what I’ve written above with some
additional information:
Getting started with ANTLR in C#
Quick Starter on Parser Grammars – No Past Experience Required
 Copyright 2019 Full Boar LLC