Assessment Item J2 – Scanner and parser design – CS431
Skill being assessed: Ability of the student to use results from formal language
theory and algorithmic principles to inform design decisions in compiler
construction.
Program outcome to which this skill is mapped: (j) An ability to apply
mathematical foundations, algorithmic principles, and computer science theory in
the modeling and design of computer-based systems in a way that demonstrates
comprehension of the tradeoffs involved in design choices.
Performance Assessment Abstract: Designing a compiler is a complex endeavor that
benefits from many areas of computer science, including formal language theory
and algorithmic principles. For example, the techniques used in the two initial
stages of a compiler have a solid mathematical foundation. In fact, scanners and
parsers are fully based on formal models of computation called regular expressions
(or regex's) and context-free grammars (or CFG's), respectively. As you know, each
regex or CFG represents a language (or set of strings/token sequences) that it
generates. However, these models have different computational powers. For
example, some languages can be generated by one or more CFG's but cannot be
generated by any regex. Furthermore, every language that is generated by a regex
can also be generated by at least one CFG. In short, CFG's are a more powerful
computational model than regex's, since the former can do everything that the latter
can do, and more. Similarly, there are several interesting sub-classes of CFG's, such
as LL(1) and LR(1) grammars, with different computational powers. For us, the
most relevant result of formal language theory is that regex's, LL(1) grammars,
LR(1) grammars, and general CFG's form a list of strictly increasing computational
power: each model can do everything that the one before it can do, and
more. Since CFG's are the most powerful, why not use them for
both scanning and parsing? Why even separate these two stages? The reason, of
course, is that there exist many trade-offs that apply to compiler design. First, there
are several stakeholders to consider, including the programming language designer,
the compiler writer, the other system programmers, the application programmer,
and the end user of the compiled applications. Second, each stakeholder may focus
on different and often conflicting features such as: run time, memory requirements,
maintenance, ease of use, user friendliness of error messages, etc. Third, several of
these features are relevant to most of the software systems involved: the
compiler itself, the source code it takes as input, the operating system it
relies on, and the executable code it outputs. Finally, consider the following facts:
Fact 1: Regex's can be used to build scanners that run in O(n) time, where n is the total
number of characters in the source program file.
Fact 2: Parsers that can handle any CFG run in O(n^3) time in the worst case, where n is the
number of tokens produced by the scanner.
Fact 3: Parsers that can handle LL(1) or LR(1) grammars run in O(n) time, where n is the
number of tokens in the output of the scanner, but LR(1) parsers are harder to implement.
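To make Facts 1 and 3 concrete, here is a minimal Python sketch, not a reference implementation. It assumes a hypothetical toy language of integer arithmetic with +, *, and parentheses. The names scan, Parser, and evaluate are illustrative and do not come from the course materials. The scanner examines each character a constant number of times, and the recursive-descent parser chooses every production by looking at a single token, as an LL(1) parser would.

```python
DIGITS = "0123456789"
OPERATORS = "+*()"


def scan(text):
    """Scanner for the regular token classes NUM = [0-9]+ and OP = [+*()].

    Each character is examined a constant number of times, so the running
    time is linear in the number of characters.
    """
    tokens, i, n = [], 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in DIGITS:
            j = i
            while j < n and text[j] in DIGITS:
                j += 1
            tokens.append(("NUM", text[i:j]))
            i = j
        elif ch in OPERATORS:
            tokens.append((ch, ch))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at offset {i}")
    return tokens


class Parser:
    """Recursive-descent parser for a toy grammar (an assumption for this sketch):

        E -> T ('+' T)*
        T -> F ('*' F)*
        F -> NUM | '(' E ')'

    One token of lookahead is enough to choose every production.
    """

    def __init__(self, tokens):
        self.tokens = tokens + [("EOF", "")]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0]

    def expect(self, kind):
        found, text = self.tokens[self.pos]
        if found != kind:
            raise SyntaxError(f"expected {kind}, found {found}")
        self.pos += 1
        return text

    def expr(self):
        value = self.term()
        while self.peek() == "+":
            self.expect("+")
            value += self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() == "*":
            self.expect("*")
            value *= self.factor()
        return value

    def factor(self):
        if self.peek() == "(":
            self.expect("(")
            value = self.expr()
            self.expect(")")
            return value
        return int(self.expect("NUM"))


def evaluate(text):
    """Scan, then parse and evaluate; each token is consumed exactly once."""
    parser = Parser(scan(text))
    value = parser.expr()
    parser.expect("EOF")
    return value


print(evaluate("2 + 3 * (4 + 1)"))  # prints 17
```

In this sketch the scanner only recognizes flat tokens, while the arbitrarily nested parentheses are checked by the parser, since matching nested brackets is a standard example of a language that no regex can describe.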
Now, use the foregoing trade-offs and facts to write a well-structured and
grammatically correct essay that answers the following questions as precisely as
possible:
1. Why is scanning typically separated from parsing? You must state and justify
exactly three DISTINCT advantages of this common practice in compiler
design. An alternative approach would be to use grammars for both scanning
and parsing, which could then be combined into a single phase. But then, why
even use regex's?
2. If you were to write your own parser-generating tool (like yacc or JavaCC),
would you support LR(1) grammars or only LL(1) grammars? Remember to
justify your answer fully and to consider the relevant stakeholders and
trade-offs. There is no single correct answer. Any choice is acceptable as long as it is
backed by a detailed consideration of pros and cons of each alternative in
relation to your goals for this tool. You must state these goals as part of your
answer.
Make sure to clearly separate your two answers. For each one, you must use precise
phrasing and justify each claim with a cogent argument. Your essay should be about
two single-spaced pages long and must not exceed that length. If you consult and use
sources to write your essay, you must include full references to these sources at the
end of your essay. However, references are not included in the page count.
Rubric for Evaluation
Criterion: Scanning versus parsing
Exemplary: Student clearly stated three distinct advantages to differentiating the
scanning and parsing phases. All three advantages are extremely well justified based
(partly) on the given trade-offs and facts.
Satisfactory: Student stated three distinct advantages to differentiating the
scanning and parsing phases. All three advantages are reasonably well justified based
(partly) on the given trade-offs and facts.
Marginal: Student stated three advantages but two of them are mostly different
formulations of the same one or the justification for some of them is lacking in
clarity or detail.
Deficient: Student did not articulate more than one advantage, or they did not use
the given trade-offs and facts, or the poor quality of the write-up makes it hard to
identify the purported advantages or to understand their justification.
Criterion: LL(1) grammars versus LR(1) grammars
Exemplary: Student made a clear choice of which type of grammar to support and
justified that choice superbly by weighing their own goals, the pros and cons of the
alternatives, and thus coming to a fully justified conclusion.
Satisfactory: Student made a clear choice of which type of grammar to support and
justified that choice reasonably well by weighing their own goals, the pros and cons
of the alternatives, and coming to a sufficiently justified conclusion.
Marginal: Student made a clear choice of which type of grammar to support but this
choice is based on a somewhat limited consideration of the pros and cons of the
alternatives, or student did not state their goals in clear enough language.
Deficient: Student did not make a clear choice, or did not state their goals at all,
or did not justify their choice with a logical argument based on the pros and cons of
the alternatives and their goals.