TDDC74 Programming: Abstraction and Modelling
Supplement Document
SICP, Chapter 02
Contents
1 Overview: SICP 02 – Data Abstraction
2 SICP 2.1-2.3
  2.1 Abstract Interfaces, Barriers, & ADTs
  2.2 Pairs
  2.3 Pretty-printing CONS Structures
  2.4 Box-and-pointer Diagrams
  2.5 Symbols
    2.5.1 Symbols versus Strings
    2.5.2 quote and cons/list
    2.5.3 Diagramming quoted structures
  2.6 Equality
3 Vocabulary
1 Overview: SICP 02 – Data Abstraction
This document contains supplemental information for PRAM; this material should be studied and
understood in addition to the material in SICP, Chapter 02.
As you work on SICP, chapter 02, keep in mind that the main point is to learn about data abstraction. In order to explore this issue, it is necessary to learn and master the particular details of how
to construct and manipulate different types of compound data in Scheme. Therefore, of course,
we spend time on these details. However, it is important for students to recall that although the
specific details vary from language to language, the main principles are still valid and useful.
2 SICP 2.1-2.3
2.1 Abstract Interfaces, Barriers, & ADTs
See SICP 2.1 and definitions in the appendix.
2.2 Pairs
Students often find pair-structures and pair-operations confusing – especially the difference(s)
between cons and list. Using box-and-pointer notation will really help solidify an understanding
here.
Note! Section 2.1.1 of SICP (“Example: Arithmetic Operations for Rational Numbers”) introduces
important material about pair data-structures. Be sure to read that material.
• The empty list is written as '() – you will see it without the quote when it is returned as a
value.
• The cdr of a pair is the rest of the structure; it is not the “second element.”
(cdr (list 1 2 3 4)) does not return 2; it returns (2 3 4).
• The empty list is not a pair; it is an error to ask for the car or cdr of the empty list.
• The list? predicate returns #f for improper lists:
(list? (cons 1 2)) → #f
• An improper list is a structure in which the cdr of the last pair is something other than ()
(the empty list)
• Similarly, a proper list is a structure in which the cdr of the last pair is () (the empty list)
• Note! cons and list create new cons-cells for each pair-element. Thus, the two calls to cons
below create two separate cons cells.
(list (cons 1 2) (cons 1 2))
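The points in the list above can be checked at the prompt. The following is a minimal sketch; the names xs, p1 and p2 are hypothetical. It uses eq? on pairs only to show that the two cons cells are separate objects (identity is discussed properly in SICP, chapter 03).

```scheme
(define xs (list 1 2 3 4))

(cdr xs)              ; (2 3 4): the rest of the list, not the second element
(car (cdr xs))        ; 2: the second element is the car of the cdr

(list? (cons 1 2))    ; #f: the cdr of the last pair is 2, not the empty list
(list? (list 1 2))    ; #t: a proper list
(pair? '())           ; #f: the empty list is not a pair

(define p1 (cons 1 2))
(define p2 (cons 1 2))
(equal? p1 p2)        ; #t: the two structures have the same contents
(eq? p1 p2)           ; #f: but they are two separate cons cells
```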
2.3 Pretty-printing CONS Structures
The “pretty printing” of cons structures can be confusing – especially when students implement
their own versions of some primitives and see different results. This is particularly true for expressions that use cons more than once.
(cons 1 (cons 2 3)) → (1 . (2 . 3))   ; should print
(cons 1 (cons 2 3)) → (1 2 . 3)       ; does print
(cons (cons 1 2) (cons 2 3)) → ((1 . 2) . (2 . 3))   ; should print
(cons (cons 1 2) (cons 2 3)) → ((1 . 2) 2 . 3)       ; does print
For each of the two pairs of expressions above, the first line is what should be displayed,
based on the logic of forming improper lists. However, in order to increase readability, Scheme
“pretty prints” a more compact notation. The printing rule: if the cdr of a pair (the item to the
right of the dot) is itself a pair, leave out that dot and the parentheses that enclose the cdr.
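To see the full dotted structure that the printer abbreviates, one can build the unabbreviated notation as a string. This is only a sketch: dotted is a hypothetical helper, not a Scheme primitive, and it assumes the structure contains only numbers, symbols, pairs and the empty list.

```scheme
;; Hypothetical helper: return the fully dotted notation of x as a string,
;; writing every pair as (car . cdr) with no abbreviation.
(define (dotted x)
  (cond ((pair? x)
         (string-append "(" (dotted (car x)) " . " (dotted (cdr x)) ")"))
        ((null? x) "()")
        ((number? x) (number->string x))
        ((symbol? x) (symbol->string x))
        (else "?")))   ; anything else is outside the scope of this sketch

(dotted (cons 1 (cons 2 3)))            ; "(1 . (2 . 3))"
(dotted (cons (cons 1 2) (cons 2 3)))   ; "((1 . 2) . (2 . 3))"
(dotted (list 1 2))                     ; "(1 . (2 . ()))"
```

Comparing these strings with what Scheme itself prints for the same expressions shows exactly which dots and parentheses the printing rule removes.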
2.4 Box-and-pointer Diagrams
Just as we had Substitution diagramming for the Substitution Model of Evaluation, so we also
introduce a diagramming technique for visualizing pair structures of returned values.
Note! These diagrams are very important and will be used throughout the rest of the course.
Some things to keep in mind when creating box-and-pointer diagrams:
• When creating a box-and-pointer diagram, start by diagramming the backbone.
• Keep in mind: box-and-pointer diagrams are a visualization of the structure of returned values.
Before creating a diagram, scan the expression to ensure it is well-formed! The following
expression will generate an error:
(list 1 (2 3)) → ERROR
Remember, cons and list follow the usual rules of evaluation: evaluate expressions and
subexpressions for their values before returning the resulting structure. So, in the example above,
Scheme will choke on the expression (2 3), because 2 is not a procedure and cannot be applied to 3.
Warning: if we ask you to diagram a cons or list structure that generates an error, we expect
“error” and an explanation as an answer!
• Warning! It is very important to remember to include a pointer to the structure. For the next
expression, there is an arrow pointing to the box-and-pointer diagram for the cons of 1 and
2:
(cons 1 2)
Similarly, in the expression below, there would be an arrow pointing from bar to the
box-and-pointer diagram for the list whose two elements are the procedures cons and list:
(define bar (list cons list))
Finally, what is the box-and-pointer diagram for the returned value of evaluating the following
expression?
((car bar) 2 3) → ???
This issue will become extremely important in SICP 3, when we need to visualize how
data-structures are modified!
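The evaluation rule described above also means that procedures can be stored in a list like any other value, and that a malformed subexpression can be repaired in more than one way. A minimal sketch, using the hypothetical name ops (rather than bar) so that the exercise above is left open:

```scheme
(define ops (list + *))    ; ops points to a list whose elements are two procedures

((car ops) 2 3)            ; the car of ops is +, so this evaluates (+ 2 3): 5
((car (cdr ops)) 2 3)      ; the second element is *, so this evaluates (* 2 3): 6

(list 1 (list 2 3))        ; well-formed: the inner list is built first, giving (1 (2 3))
(list 1 '(2 3))            ; also well-formed: the quote prevents evaluation of (2 3)
```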
2.5 Symbols
A symbol is a quoted (that is, unevaluated) group of alphabetical characters.
(symbol? 'a) → #t
(symbol? 'abaababdb) → #t
(symbol? 'one) → #t
(symbol? '1) → #f
(symbol? '(1 2)) → #f
Such a group of characters is symbolic when it is used to refer to something else, as in the case of
a variable. This is potentially confusing, since the creation of a name/value binding is the one case
where a symbol is used without a quotation. This is because the creation of those bindings treats
the names as symbols rather than evaluating them. In other words, when Scheme evaluates this
expression:
(define quux 32)
It is treating quux as a symbol (rather than evaluating it).
However, such names will not technically test true with the symbol? predicate: the unquoted name
is evaluated to its value (here 32) before symbol? is applied.
(symbol? quux) → #f
The only elements that will test true as symbols are those where we explicitly use quotation:
(symbol? (quote quux)) → #t
(symbol? 'quux) → #t
This holds even when we test a name whose value is a symbol, because that value was itself
created with quote:
(define quux (quote blah))   ; note!
(symbol? quux) → #t          ; note!
There are a number of issues about symbols and quoting to observe:
• Creation of box-and-pointer diagrams
• Symbols versus strings
• quote and cons/list
Scheme expects that a set of characters without quote (or parentheses) is a name for a binding. If
the binding exists, evaluating that name returns the value bound to the name. If the binding does
not exist, an error is signaled.
(define foo 42)
foo → 42
(quote foo) → foo
baz → error: reference to undefined identifier: baz
(quote baz) → baz
Numbers are self-evaluating; it doesn’t hurt to quote them, but there is no need to do so:
3 → 3
'3 → 3
2.5.1 Symbols versus Strings
Although symbols and strings can look much the same on the screen, they are different
data-types and have different characteristics!
The main point about symbols is that they “point to something else.” Thus, variables are symbolic
names – the symbol (name) actually refers to something else. In the case of variables, the name is
just a convenient way to refer to the thing of interest, whether it is a single value or a procedure
or something else. When we deal with symbols, we are not interested in the “parts” of the symbol
– we are interested in accessing and using what the symbol points to. In fact, technically, a Scheme
symbol does not have “parts”, even though it may be displayed on the screen in such a way that
it looks like a sequence of characters.
A string is a particular sequence of characters. When we identify some sequence of characters as a
string, as in "word", we are creating a different data-type.
'word is not the same as "word"
(symbol? 'one) → #t
(symbol? "one") → #f
(string? 'one) → #f
(string? "one") → #t
• Symbols and strings are different data-types – which means they support different kinds of
operations (as do 'one and 1)
• Symbols and strings are both made up of characters
• The characters that make up a symbol are not decomposable; on the other hand, there are
many useful string-operations (separating into separate characters, concatenating, etc.)
• In a given scope, a name (“symbol”) can be bound to only one value at a time; there can
be many identical strings
• In R5RS Scheme, symbols are not case-sensitive: 'Abc, 'abc and 'ABC all denote the same
symbol (some newer implementations, such as Racket, do distinguish case)
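The difference in supported operations can be sketched with a few standard procedures. Strings can be measured and concatenated; a symbol has to be converted to a string first before its characters can be examined.

```scheme
(string-length "word")                 ; 4: strings can be taken apart and measured
(string-append "wo" "rd")              ; "word": strings can be concatenated

(symbol->string 'word)                 ; "word": explicit conversion from symbol to string
(string->symbol "word")                ; word: explicit conversion from string to symbol

(eq? 'word (string->symbol "word"))    ; #t: both are the one symbol named word
(symbol? (symbol->string 'word))       ; #f: the result of the conversion is a string
```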
Students often find the difference between strings and symbols confusing. They see a symbol, such
as foo and automatically “read” it as consisting of a string of characters.
But consider the following:
"123" → ?
123 → ?
The number is also made of characters, yet we usually understand that it is a number.
This course makes very little use of strings or string processing; strings appear mostly as error
messages in procedures:
(define (foobaz positive-input)
  (if (<= positive-input 0)
      "ERROR: input a positive number"
      (* positive-input positive-input)))
Yes, a procedure can return a string – just as it can return a number or list or any other type of
value.
Note that in SICP, chapter 2, there is mention of the primitive display. We will use this and other
“display” primitives in SICP 03.
For now, all you need to know is that display is related to aspects of non-functional programming.
In other words, in the example of foobaz above, the procedure is able to return a string as a value.
In this case, foobaz is a typical functional machine: returning a value. As a side effect, Scheme is
nice enough to display the returned value on the screen.
This side effect is not functional. Why? Because it changes the state of the screen – and this change
lasts beyond the end of the evaluation of the procedure foobaz.
When we want to directly control what appears – and how it appears – on a display screen, then we
will start using non-functional commands, such as display. We will study and use these in part 3
of the course.
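The distinction between returning a value and displaying it can be sketched as follows. The definition of foobaz is repeated so that the example is self-contained; the name result is hypothetical.

```scheme
(define (foobaz positive-input)
  (if (<= positive-input 0)
      "ERROR: input a positive number"
      (* positive-input positive-input)))

;; Functional use: the value is returned and bound to a name.
(define result (foobaz 3))     ; result is now bound to 9
(define message (foobaz -1))   ; message is now bound to the error string

;; Non-functional use: display changes what is on the screen.
(display result)               ; the characters 9 appear on the screen
(newline)
```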
2.5.2 quote and cons/list
It is essential to remember that quote and cons / list do not operate the same way!!!
• Scheme takes the argument to quote without evaluating it – and then creates the appropriate
structure
• Scheme applies list and cons to their arguments – and those arguments are evaluated for their
values before being used by cons or list to create a structure
(quote (1 (2 3)))      ; this is NOT equivalent
(list 1 (2 3))         ; to this

(quote (1 (2 3)))      ; this is NOT equivalent
(list 1 (list 2 3))    ; to this

(quote (1 (2 3)))      ; this is equivalent to what is RETURNED
(list 1 (list 2 3))    ; by this
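The difference becomes visible as soon as the elements are names rather than numbers. A minimal sketch, in which the bindings a and b are hypothetical:

```scheme
(define a 10)
(define b 20)

(list a b)        ; (10 20): list evaluates its arguments, so the values are used
(quote (a b))     ; (a b): quote does not evaluate anything, so the symbols remain
(list 'a 'b)      ; (a b): quoting each argument gives the same result as above

;; For numbers the returned structures are the same, as stated above:
(equal? (quote (1 (2 3))) (list 1 (list 2 3)))   ; #t
```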
2.5.3 Diagramming quoted structures
• Structures created with quote do have a list-structure – and it can be diagrammed with
box-and-pointer.
• Note! The quote is what you use to tell Scheme not to evaluate something; it isn’t part of the
output. So, do not:
– quote printed results
– insert quotes in box and pointer diagrams
Test yourself by diagramming the following:
(cons '(a) '(b)) → ((a) b)
(define foo '(list +))
(car foo) → ???
(define bar (list list +))
(car bar) → ???
(define baz (list 'list +))
(car baz) → ???
2.6 Equality
Just as there is an equality predicate for numbers, there is one for symbols and one for lists:
(= 3 3) → #t
(eq? 'foo 'foo) → #t
(eq? 'foo 'bar) → #f
(equal? '(1 2) '(1 2)) → #t
(equal? '(1 2) '(1 3)) → #f
(equal? '(1 (2 3)) '(1 (2 3))) → #t
(equal? '(1 2 3) '(1 (2 3))) → #f
For now, please use only the three following tests for equality – and in the following ways:
• = works on numbers
• eq? works on symbols
• equal? works on lists
In general, we will check that code isn’t simply using equal? (or another too-general test of
equality) everywhere.
Note that there is some deep subtlety about the nature of equality in general – and about the
difference between equality and identity; we will explore this further in SICP, chapter 03.
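As an illustration of choosing the test that matches the data-type, here is a minimal sketch of two list-searching procedures. The names contains-symbol? and count-number are hypothetical; the first compares symbols with eq? and the second compares numbers with =.

```scheme
;; Hypothetical helper: is the symbol sym an element of the list items?
(define (contains-symbol? sym items)
  (cond ((null? items) #f)
        ((eq? sym (car items)) #t)                  ; symbols are compared with eq?
        (else (contains-symbol? sym (cdr items)))))

(contains-symbol? 'pear '(apple pear plum))   ; #t
(contains-symbol? 'kiwi '(apple pear plum))   ; #f

;; Hypothetical helper: how many times does the number n occur in items?
(define (count-number n items)
  (cond ((null? items) 0)
        ((= n (car items))                          ; numbers are compared with =
         (+ 1 (count-number n (cdr items))))
        (else (count-number n (cdr items)))))

(count-number 3 (list 1 3 5 3))   ; 2
```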
3 Vocabulary
Abstract Data Type (ADT) In its simplest terms, an ADT is a “compound” form of data
created out of primitive (or built-in) data-types and operations. More importantly, an ADT
allows us to separate how the data is used from how the data is represented. This, in turn,
makes it much easier to change underlying implementations for debugging, optimization, and
the like.
Abstract Interface The collection of constructors, selectors, and predicates by which we define a
particular data-type. This collection functions as an interface to the functionality implemented “below” the interface. The abstract interface is also known as an abstraction barrier.
Accessor One of the procedure-types (constructor, selector, etc.) making up the abstract interface.
Accumulator A recursive procedure that “accumulates” its results (typically by “consing up a
result”).
Cons cell The structure that results from applying cons to two arguments. The value associated
with the first part of the cons cell can be returned with car – and the value associated with
the rest of the cons cell can be returned with cdr. Loosely, cons cells and pairs are often used
as synonyms.
Constructor A procedure for creating data of a specific data-type.
Data We usually think of data as the “stuff” that is operated upon by procedures. However, we
can also define data in terms of an abstract interface: the operations we use to manipulate
them – and the formal relationships we specify between arguments and values.
Data-type A specific abstract interface for one type of data.
Filter A procedure for separating data that meets some criteria from other data. Note: in SICP,
to “filter” means to return items that match the predicate-argument – if, for example, one
applies filter to the predicate odd? and a list of numbers, then the SICP version of filter
returns the odd numbers as the value. In other texts and code, it is common for filter to
return the items that do not match the argument-predicate (that is, to “filter out” the items
that match the argument-predicate).
List Since the cdr of a single pair can point to another pair, pairs can be chained into a sequence
(the “backbone”). Technically, a list is a pair structure in which the cdr of the final pair of
the backbone cons sequence is nil (the empty list). (If there is only a single cons cell, then
this applies to the cdr of that cell.) In everyday language, the term list is not usually used
to refer to a single pair, but rather to a series of cons cells.
Note: an improper list is one in which the cdr of the final pair of the backbone sequence is
non-nil; an improper list tests false for list?
Map One-to-one transformation. Note also Scheme’s map procedure.
Pair A fundamental two-part compound data-structure in Scheme, created with cons (for “construct”).
Also known as a cons cell.
Predicate (or “recognizer”) A predicate tests its argument(s) and returns true (#t) or false
(#f).
(number? 2) → #t
(pair? 2) → #f
(symbol? 2) → #f
(pair? (cons 1 2)) → #t
(symbol? 'two) → #t
Selector A procedure for selecting different “parts” of data of a specific data-type.
Symbol A symbol is a particular data-type that is created by quoting – or preventing the evaluation
– of alphabetical characters. In more general terms, a symbol stands for something else, as in
the case of a variable name that is bound to a value.
Tree A self-similar data-structure in which branches consist of “leaves” or sub-branches (which
may consist of leaves or sub-branches). A particular kind of tree is a binary tree, in which
each branch consists of exactly two leaves or sub-branches.
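Several of the terms above (abstract interface, constructor, selector, predicate) can be seen together in a small sketch based on the rational-number example of SICP 2.1.1. The names make-rat, numer, denom, add-rat and equal-rat? follow SICP; one-half and one-third are hypothetical. As in the first version in the book, results are not reduced to lowest terms.

```scheme
;; Below the abstraction barrier: the constructor and selectors are the
;; only procedures that know a rational number is represented as a pair.
(define (make-rat n d) (cons n d))   ; constructor
(define (numer r) (car r))           ; selector
(define (denom r) (cdr r))           ; selector

;; Above the barrier: these procedures use only make-rat, numer and denom,
;; so the representation could change without touching them.
(define (add-rat x y)
  (make-rat (+ (* (numer x) (denom y))
               (* (numer y) (denom x)))
            (* (denom x) (denom y))))

(define (equal-rat? x y)             ; predicate
  (= (* (numer x) (denom y))
     (* (numer y) (denom x))))

(define one-half (make-rat 1 2))
(define one-third (make-rat 1 3))

(numer (add-rat one-half one-third))          ; 5
(denom (add-rat one-half one-third))          ; 6
(equal-rat? (make-rat 2 4) one-half)          ; #t
```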