BNF

Dr. Milica Barjaktarovic
Computers are stupid… aka very literal
•If I type rm * .out in Unix, Unix will remove ALL my files in the
current directory, although I meant to type rm *.out (remove only
the files ending in .out)
•A computer understands ONLY the rules it knows about
•A computer reads code and takes EVERY letter, every comma, every
dot, every space, … ***everything!*** verbatim, and checks whether
it matches a rule the computer knows
•If the code looks ok according to such a rule, the computer assigns
it the meaning that the rule prescribes
•How do we deal with that?
–By following the rules precisely
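To see how literal the reading is, here is a small Python sketch of the rm example. It only imitates the shell: shlex splits the command into words, and fnmatch stands in for wildcard matching. The file names are made up for illustration.

```python
import fnmatch
import shlex

# Hypothetical directory listing, used only for illustration.
files = ["report.txt", "prog.c", "prog.out"]

def targets(command):
    """Return the listed files matched by the patterns in `command`."""
    patterns = shlex.split(command)[1:]  # drop the command name "rm"
    matched = {name for p in patterns for name in fnmatch.filter(files, p)}
    return sorted(matched)

print(shlex.split("rm * .out"))  # ['rm', '*', '.out']  (two patterns)
print(shlex.split("rm *.out"))   # ['rm', '*.out']      (one pattern)
print(targets("rm * .out"))      # ['prog.c', 'prog.out', 'report.txt']
print(targets("rm *.out"))       # ['prog.out']
```

One extra space turns one pattern into two, and the pattern * alone matches every file in the listing.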
Syntax and Semantics
•Syntax: what it must look like
•Semantics: what it means
•For example, both of these commands have proper syntax, but
very different semantics:
–rm * .out
–rm *.out
•For example: a sentence in very simple English has syntax: noun
verb object period
–“Alice studies the book.” is a sentence that has the proper syntax.
The semantics is…
–“Bob studies.” is a sentence that doesn’t have proper syntax
according to this simple English.
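The simple-English rule above (noun verb object period) can be checked mechanically. The sketch below is one possible checker; the word lists are assumptions made up for this example, and the function says nothing about what a sentence means.

```python
# Hypothetical vocabulary for the "very simple English" of this slide.
NOUNS = {"Alice", "Bob"}
VERBS = {"studies", "reads"}
OBJECTS = {"the book", "the map"}

def has_proper_syntax(sentence):
    """Check the rule: sentence = noun verb object period."""
    if not sentence.endswith("."):
        return False
    words = sentence[:-1].split()
    if len(words) < 3:
        return False  # something is missing, e.g. the object
    noun, verb, obj = words[0], words[1], " ".join(words[2:])
    return noun in NOUNS and verb in VERBS and obj in OBJECTS

print(has_proper_syntax("Alice studies the book."))  # True
print(has_proper_syntax("Bob studies."))             # False
```

The checker only looks at the form. Whether "Alice studies the map." is a sensible thing to say is a question of semantics.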
Ok then what?
•If the computer is so particular and we have to be so clear and so
precise in how we give it orders, we need to have some means to
communicate clearly – with each other and with the computer
•Is English clear and unambiguous enough for communicating
with a computer?
–Hm….
•Solution: formalize the communication (i.e. express things more
mathematically)
–BNF, EBNF, …
What is BNF for
•A way to formally (i.e. unambiguously) express syntax and/or
behavior of – anything! programming languages, natural
languages, procedures, serial numbers, equipment behavior, etc.
•Roughly speaking:
–A BNF description is a set of rules (called productions)
–Each rule has a single symbol, called a non-terminal, on the left
side; the right side defines it as a sequence of terminals and/or
other non-terminals
–So, start from the top (i.e. the start symbol) and keep on
expanding the non-terminals by substituting their definitions,
until only terminals remain
–This is somewhat similar to deriving equations in regular math.
Difference: a non-terminal can have several alternatives in BNF,
and each choice leads to a different expansion
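The expansion process can be sketched in Python. The grammar below is a made-up toy (the simple-English sentence from the earlier slide), stored as a dictionary. The function expand is a hypothetical helper that substitutes every alternative of every non-terminal until only terminals remain. It assumes a grammar without recursion, so the set of results is finite.

```python
from itertools import product

# Toy grammar: non-terminal -> list of alternatives (each a list of symbols).
# Any symbol that is not a key of GRAMMAR is treated as a terminal.
GRAMMAR = {
    "sentence": [["noun", "verb", "object", "."]],
    "noun": [["Alice"], ["Bob"]],
    "verb": [["studies"]],
    "object": [["the book"], ["the map"]],
}

def expand(symbol):
    """Yield every list of terminals that can be derived from `symbol`."""
    if symbol not in GRAMMAR:  # terminal: nothing left to substitute
        yield [symbol]
        return
    for alternative in GRAMMAR[symbol]:
        # Expand each symbol of the alternative, then combine the results.
        expansions = [list(expand(s)) for s in alternative]
        for parts in product(*expansions):
            yield [token for part in parts for token in part]

for tokens in expand("sentence"):
    print(" ".join(tokens).replace(" .", "."))
```

Starting from the start symbol sentence, the two choices for noun and the two choices for object give four different sentences.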
BNF definition
•Roughly speaking, a BNF grammar is a set of rules (i.e.
productions) looking like this:
–non-terminal_1 = some combo of terminals and non-terminals
–non-terminal_2 = some combo of terminals and non-terminals
–…
–There is one rule, i.e. one production, for each non-terminal; and
rules can be recursive, have choices, optional parts, repeated parts,
etc.
–Terminals are the lexemes and tokens, i.e. the small lexical units
that cannot be expanded any further
•Formal syntax and postal address example:
–http://en.wikipedia.org/wiki/Backus-Naur_form#Introduction
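As an illustration of a recursive rule with a choice, take the made-up grammar <number> ::= <digit> <number> | <digit>. The Python sketch below writes one function per non-terminal (a style usually called recursive descent). The function names are assumptions for this example only.

```python
DIGITS = set("0123456789")

def match_digit(text, pos):
    """<digit> ::= "0" | "1" | ... | "9". Return the next position or None."""
    if pos < len(text) and text[pos] in DIGITS:
        return pos + 1
    return None

def match_number(text, pos):
    """<number> ::= <digit> <number> | <digit>"""
    after_digit = match_digit(text, pos)
    if after_digit is None:
        return None
    rest = match_number(text, after_digit)  # first alternative (recursive)
    return rest if rest is not None else after_digit  # second alternative

def is_number(text):
    """True if the whole text can be derived from <number>."""
    return match_number(text, 0) == len(text)

print(is_number("2024"))  # True
print(is_number("20a4"))  # False
print(is_number(""))      # False
```

The rule for <number> refers to itself, and the function match_number calls itself in the same place.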
BNF vs EBNF
•EBNF has shortcuts to define repeats, optional parts, etc. Notice
that there are two alternative notations for those features: a
bracket style (for example { } and [ ]) and a suffix style (for
example *, + and ?).
Symbol   Meaning
()       Parentheses. Used to group several elements, so they are
         treated as one single token
?        Any token followed by ? occurs 0 or 1 times
*        Any token followed by * can occur 0 or more times
+        Any token followed by + can occur 1 or more times
.        Any character/token can occur one time
~        Any character/token following the ~ may not occur at the
         current place
..       Between two characters .. spans a range which accepts every
         character between both boundaries inclusive
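Regular expressions use very similar suffix operators, so Python's re module gives a quick way to try them out. This is only an analogy, not the grammar notation itself: a character class such as [a-z] plays the role of the .. range, and [^"] plays the role of ~. The pattern names are made up for this sketch.

```python
import re

# A letter followed by zero or more letters or digits (range and *).
IDENT = re.compile(r"[A-Za-z][A-Za-z0-9]*")
# An optional sign followed by one or more digits (? and +).
INTEGER = re.compile(r"[+-]?[0-9]+")
# A quote, any characters except a quote, a quote (negation, like ~).
STRING = re.compile(r'"[^"]*"')
# One or more repetitions of the group "ab" (grouping with parentheses).
PAIRS = re.compile(r"(ab)+")

def matches(pattern, text):
    """True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None

print(matches(IDENT, "x25"))          # True
print(matches(IDENT, "25x"))          # False
print(matches(INTEGER, "-42"))        # True
print(matches(INTEGER, "-"))          # False
print(matches(STRING, '"hi there"'))  # True
print(matches(PAIRS, "ababab"))       # True
print(matches(PAIRS, "aba"))          # False
```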
Usage                      Notation
definition                 =
concatenation              ,
termination                ;
separation                 |
option                     [ ... ]
repetition                 { ... }
grouping                   ( ... )
double quotation marks     " ... "
single quotation marks     ' ... '
comment                    (* ... *)
special sequence           ? ... ?
exception                  -
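To get a feel for the bracket-style notation, here is a minimal Python sketch with one small helper per construct: lit for a quoted terminal, seq for concatenation, alt for separation, opt for [ ... ] and rep for { ... }. The helper names are made up for this example. The matching is greedy and does not backtrack, so this is a teaching toy rather than a general parser. It encodes the made-up rule integer = [ "-" ] , digit , { digit } ;

```python
def lit(s):
    """Terminal string "..." : match exactly the text s."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None

def seq(*parts):
    """Concatenation a , b : every part must match, one after another."""
    def match(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return match

def alt(*choices):
    """Separation a | b : the first alternative that matches wins."""
    def match(text, pos):
        for choice in choices:
            end = choice(text, pos)
            if end is not None:
                return end
        return None
    return match

def opt(part):
    """Option [ a ] : match a if possible, otherwise match nothing."""
    def match(text, pos):
        end = part(text, pos)
        return pos if end is None else end
    return match

def rep(part):
    """Repetition { a } : match a zero or more times."""
    def match(text, pos):
        while True:
            end = part(text, pos)
            if end is None or end == pos:
                return pos
            pos = end
    return match

# integer = [ "-" ] , digit , { digit } ;
digit = alt(*(lit(d) for d in "0123456789"))
integer = seq(opt(lit("-")), digit, rep(digit))

def is_integer(text):
    """True if the whole text can be derived from integer."""
    return integer(text, 0) == len(text)

print(is_integer("-42"))  # True
print(is_integer("42"))   # True
print(is_integer("-"))    # False
print(is_integer("4-2"))  # False
```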
–http://www.cs.umd.edu/class/spring2002/cmsc214/Tutorial/ebnf.html
({} [] etc. notation)
–http://www.antlr.org/wiki/display/ANTLR3/Quick+Starter+on+Parser+Grammars+-+No+Past+Experience+Required
(+ * ? etc. notation)
–http://odin.himinbi.org/bytewise_ebnf/ebnf_spec.html (comparison of
the alternative notations)
- Sandwich example:
“A sandwich consists of a lower slice of bread, mustard or mayonnaise;
optional lettuce, an optional slice of tomato; two to four slices of either
bologna, salami, or ham (in any combination); one or more slices of
cheese, and a top slice of bread.”
This translates to:
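sandwich ::=
lower_slice
( mustard | mayonnaise )
lettuce? tomato?
( bologna | salami | ham ) {2,4}
cheese+
top_slice
(Parentheses group the alternatives. Square brackets would mark the
group as optional, but the description requires mustard or mayonnaise.)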
It can also be written as shown below (for a slightly different sandwich,
with one to four slices of cheese):
<sandwich> ::=
<lowerbreadslice>
( <mustard> | <mayo> )
[ <lettuce> ]
[ <tomato> ]
( <bologna> | <salami> | <ham> ) {2-4}
( <cheese> ) {1-4}
<topbreadslice>
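The sandwich description can be tested against concrete sandwiches. The Python sketch below writes the first version of the rule (one or more slices of cheese) as a regular expression over ingredient names separated by spaces. The ingredient names and the function is_sandwich are assumptions for this example, with "bread" standing for both the lower and the top slice.

```python
import re

# One ingredient per word, in order from bottom to top.
SANDWICH = re.compile(
    r"bread "                        # lower slice
    r"(mustard|mayonnaise) "         # exactly one of the two
    r"(lettuce )?"                   # optional
    r"(tomato )?"                    # optional
    r"((bologna|salami|ham) ){2,4}"  # two to four slices, any combination
    r"(cheese )+"                    # one or more slices
    r"bread"                         # top slice
)

def is_sandwich(ingredients):
    """True if the list of ingredients follows the sandwich rule."""
    return SANDWICH.fullmatch(" ".join(ingredients)) is not None

good = ["bread", "mustard", "lettuce", "ham", "salami", "cheese", "bread"]
one_meat = ["bread", "mayonnaise", "ham", "cheese", "bread"]
no_spread = ["bread", "ham", "ham", "cheese", "bread"]

print(is_sandwich(good))       # True
print(is_sandwich(one_meat))   # False: needs two to four slices of meat
print(is_sandwich(no_spread))  # False: mustard or mayonnaise is required
```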
Examples
The following links contain BNF specs of real-life and realistic applications:
–http://www.garshol.priv.no/download/text/bnf.html#id2.2. simple parsers
for numerals
–http://www.ugrad.cs.ubc.ca/~cs126/Homepage/tutorials/tutOne.html simple
calculators, serial numbers spec, etc.
–http://www.csm.astate.edu/~rossa/cs3543/bnf.html http, calculator, etc.
–http://courses.cs.vt.edu/~cs1104/BNF/BNF.samples.html “specs on the
fly” applet and demo
–http://www.w3.org/Addressing/URL/5_BNF.html URL
BNF in Real Life
•The syntax of most (if not all) programming languages is specified
in BNF or one of its variants, because a standard needs an
unambiguous definition
–http://cui.unige.ch/isi/bnf/
–Java http://cui.unige.ch/isi/bnf/JAVA/AJAVA.html
–Java http://cui.unige.ch/isi/bnf/JAVA/BNFindex.html
–http://lists.canonical.org/pipermail/kragen-hacks/1999October/000201.html
–C http://www.cs.man.ac.uk/~pjj/bnf/c_syntax.bnf
–Algol http://www.lrz-muenchen.de/~bernhard/Algol-BNF.html
–3APL http://www.cs.uu.nl/3apl/bnf.pdf
•Most documents that are standardized, or aspire to be, specify
their syntax in BNF
–Google “BNF for TCP or RFC”
–e.g. OPM
http://sdm.lbl.gov/OPM/DM_TOOLS/OPM/OPM_4.1/OPM_ST/node8.html
–Internet protocols use Augmented BNF (ABNF), i.e. a modified, more
compact version of BNF: http://www.unix.com.ua/rfc/rfc2234.html
http://xml.resource.org/public/rfc/html/rfc2234.html
–SQL functions are described in BNF:
http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r0003509.htm
–http://www.w3.org/TR/REC-PICS-services-961031 Rating service
–http://www.computer.org/portal/cms_docs_ieeecs/ieeecs/education/csidc/2001ProjectReports/Karlsruhe.pdf
Remote control system
–AMQP bug discussion: https://jira.amqp.org/jira/browse/AMQP63
–ZOE language spec:
http://radio.weblogs.com/0101039/stories/misc/bnfGrammarForZoeSpecification.html
PS –
[RFC 2119] IETF (Internet Engineering Task Force). RFC 2119: Key words for use in
RFCs to Indicate Requirement Levels. S. Bradner. 1997.
Prof. Nancy Reed’s notes with examples from
Sebesta book for ICS313
–http://www2.hawaii.edu/~nreed/ics313/lectures/03syntax.pdf