AI Memo No. 452 January 1978
THE REVISED REPORT ON
SCHEME
A DIALECT OF LISPby
Guy Lewis Steele Jr.* and Gerald Jay Sussman
SCHEME is a dialect of LISP. It is an expression-oriented, applicative order,
interpreter-based language which allows one to manipulate programs as
data. It differs from most current dialects of LISP in that it closes all lambda-expressions in the
environment of their definition or declaration, rather than in the
execution environment. This has the consequence that variables are
normally lexically scoped, as in ALGOL. However,
in contrast with ALGOL, SCHEME treats procedures as a first-class data type.
They can be the values of variables, the returned values of procedures,
and components of data structures. Another difference from LISP is that SCHEME is
implemented in such a way that tail-recursions execute without net
growth of the interpreter stack. The effect of this is that a procedure
call behaves like a GOTO, and thus procedure calls can be
used to implement iterations, as in PLASMA.
Here we give a complete “user manual” for the SCHEME language. Some features described here were not documented in the original report on SCHEME (for instance particular macros). Other features have been added, changed, or deleted as our understanding of certain language issues evolved. Annotations to the manual describe the motivations for these changes.
This report describes research done at the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology. Support for the laboratory’s artificial intelligence research is provided in part by the Advanced Research Projects Agency of the Department of Defense under Office of Naval Research contract N00014-75-C-0643.
* NSF Fellow
{{THE
REVISED REPORT ON SCHEME A DIALECT OF LISP © 1978 by Massachussetts
Institute of Technology, and this transcription, licensed
CC
BY-NC 4.0
(Attribution-NonCommercial 4.0 International).
Transcription by
Roger Turner: links, Contents page, and {{transcriber notes}}
added.}}
Steele and Sussman The Revised Report on SCHEME
ASET Has
Disappeared}BLOCK
Exploits Applicative Order}EVALUATE Has
Disappeared}EVALUATE!UNINTERRUPTIBLY
Has Disappeared}FEXPRs Are Okay by
Us}FUNCALL is a
Pain}AND}OR}SCHEME programs are represented as LISP s-expressions. The evaluator interprets these s-expressions in a specified way. This specification constitutes the definition of the language.
The definition of SCHEME is a little fuzzy around the edges. This is because of the inherent extensibility of LISP-like languages {Note LISP Is a Ball of Mud}. We can define a few essential features which constitute the “kernel” of the language, and also enumerate several syntactic and semantic extensions which are convenient and normally included in a given implementation. The existence of a mechanism for such extensions is a part of the kernel of SCHEME; however, any particular such extension is not necessarily part of the kernel.
For those who like this sort of thing, here is the BNF for SCHEME programs. {Note LISP BNF}:
<form> ::= <non-symbol atom> | <identifier> | <magic> | <combination>
<identifier> ::= <atomic symbol>
<magic> ::= <lambda-expression>
| (QUOTE <s-expression> )
| (IF <form> <form> <form> )
| (IF <form> <form> )
| (LABELS ( <labels list> ) <body> )
| (DEFINE <identifier> <lambda-expression> )
| (DEFINE <identifier> ( <identifier list> ) <form> )
| (DEFINE ( <identifier> <identifier list> ) <form> )
| (ASET' <identifier> <form> )
| (FLUIDBIND ( <fluidbind list> ) <form> )
| (FLUID <identifier> )
| (FLUIDSET' <identifier > <form> )
| (CATCH <identifier> <form> )
| <syntactic extension>
<lambda-expression> ::= (LAMBDA ( <identifier list > ) <body>)
<identifier list> ::= <empty> | <identifier> <identifier list>
<body> ::= <form>
<labels list> ::= <empty>
| ( <identifier> <lambda-expression> ) <labels list>
<fluidbind list> ::= <empty> | ( <identifier > <form> ) <fluidbind list>
<combination> ::= ( <form list> )
<form list> ::= <form> | <form) <form list>
<syntactic extension> ::= ( <magic word> . <s-expression> )
(magic word> ::= <atomic symbol>
<non-symbol atom> ::= <number> | <array> | <string> | ...
Atoms which are not atomic symbols (identifiers) evaluate to
themselves. Typical examples of such atoms are numbers, arrays, and
strings (character arrays). Symbols are treated as identifiers or
variables. They may be lexically bound by lambda-expressions. There is a
global environment containing values for (some) free variables. Many of
the variables in this global environment initially have as their values
primitive operations such as, for example, CAR,
CONS, and PLUS. SCHEME
differs from most LISP systems in that the atom
car is not itself an operation (in the sense of being an invocable
object, e.g. a valid first argument to APPLY), but only has
one as a value when considered as an identifier.
Non-atomic forms are divided by the evaluator into two classes:
combinations and “magic (special) forms”. The BNF given above is
ambiguous; any magic form can also be parsed as a combination. The
evaluator always treats an ambiguous case as a magic form. Magic forms
are recognized by the presence of a “magic (reserved) word” in the car
position of the form. All non-atomic forms which are not magic forms are
considered to be combinations. The system has a small initial set of
magic words; there is also a mechanism for creating new ones {Note FUNCALL
is a Pain}.
A combination is considered to be a list of subforms. These subforms are all evaluated. The first value must be a procedure; it is applied to the other values to get the value of the combination. There are four important points here:
The procedure position is always evaluated just like any other position. (This is why the primitive operators are the values of global identifiers.)
The procedure is never “re-evaluated”; if the first subform fails to evaluate to an applicable procedure, it is an error. Thus, unlike most LISP systems, SCHEME]{.sc} always evaluates the first subform of a combination exactly once.
The arguments are all completely evaluated before the procedure is applied; that is, SCHEME, like most LISP systems, is an applicative-order language. Many SCHEME programs exploit this fact.
The argument forms (and procedure form) may in principle be evaluated in any order. This is unlike the usual LISP left-to-right order. (All SCHEME interpreters implemented so far have in fact performed left-to-right evaluation, but we do not wish programs to depend on this fact. Indeed, there are some reasons why a clever interpreter might want to evaluate them right-to-left, e.g. to get things on a stack in the correct order.)
The magic forms in this section are all part of the kernel of SCHEME and so must exist in any SCHEME implementation.
(LAMBDA ( <identifier list> ) <body> )
Lambda-expressions evaluate to procedures. Unlike most LISP systems, SCHEME does not
consider a lambda-expression (an s-expression whose car is the atom
LAMBDA) to be a procedure. A lambda-expression only
evaluates to a procedure. A lambda-expression should be thought
of as a partial description of a procedure; a procedure
and a description of it are conceptually distinct objects. A
lambda-expression must be “closed” (associated with an environment) to
produce a procedure object. Evaluation of a lambda-expression performs
such a closure operation.
The resulting procedure takes as many arguments as there are
identifiers in the identifier list of the lambda-expression. When the
procedure is eventually invoked, the intuitive effect is that the
evaluation of the procedure call is equivalent to the evaluation of the
<body> in an environment consisting of (a) the
environment in which the lambda-expression had been evaluated to produce
the procedure, plus (b) the pairing of the identifiers of the
<identifier list> with the arguments supplied to the
procedure. The pairings (b) take precedence over the environment (a),
and to prevent confusion no identifier may appear twice in the
<identifier list>. The net effect is to implement
ALGOL-style lexical scoping [Naur],
and to “solve the funarg problem” [Moses].
(IF <predicate> <consequent> <alternative>)
This is a primitive conditional operator. The predicate form is
evaluated. If the result is non-NIL {Note IF
Is Data-Dependent}, then the consequent is evaluated, and otherwise
the alternative is evaluated. The resulting value (if there is one) is
the value of the IF form.
(IF <predicate> <consequent>)
As above, but if the predicate evaluates to NIL, then
NIL is the value of the IF form. (As a matter
of style, this is usually used only when the value of the
IF form doesn’t matter, for example, when the consequent is
intended to cause a side effect.)
(QUOTE <s-expression>)
As in LISP, this quotes the argument form so
that it will be passed verbatim as data; the value of this form is
<s-expression>. If a SCHEME
implementation has the MacLISP
read-macro-character feature, then the abbreviation 'FOO
may be used instead of (QUOTE FOO).
(LABELS ( <labels list> ) <body> )
where<labels list> ::= <empty> | (<identifier> <lambda-expression>) <labels list>This has the effect of evaluating the <body> in an
environment where all the identifiers (which, as for
LAMBDA, must all be distinct) in the labels list evaluate
to the values of the respective lambda-expressions. Furthermore, the
procedures which are the values of the lambda-expressions are themselves
closed in that environment, and not in the outer environment; this
allows the procedures to call themselves and each other
recursively. For example, consider a procedure which counts all the
atoms in a list structure recursively to all levels, but which doesn’t
count the NILs which terminate lists (but NILs
in the car of a list count). In order to perform this we define two
mutually recursive procedures, one to count the car and one to count the
cdr, as follows:
(DEFINE COUNT
(LAMBDA (L)
(LABELS ((COUNTCAR
(LAMBDA (L)
(IF (ATOM L) 1
(+ (COUNTCAR (CAR L))
(COUNTCDR (CDR L))))))
(COUNTCDR
(LAMBDA (L)
(IF (ATOM L)
(IF (NULL L) 0 1)
(+ (COUNTCAR (CAR L))
(COUNTCDR (CDR L))))))
(COUNTCDR L))))
(We have decided not to use the traditional LISP LABEL primitive in SCHEME because it is difficult to define several
mutually recursive procedures using only LABEL. Although
LABELS is a little more complicated than
LABEL, it is considerably more convenient. Contrast this
design decision with the choice of IF over the more
traditional COND, where the definitional simplicity of
IF outweighed the somewhat greater convenience of
COND.)
These magic forms produce side effects in the environment.
(DEFINE <identifier> <lambda-expression> )
This is used for defining a procedure in the “global environment”
permanently, as opposed to LABELS, which is used for
temporary procedure definitions in a local environment.
DEFINE takes a name and a lambda-expression; it evaluates
the lambda-expression in the global environment and
causes the result to be the global value of the
identifier. (DEFINE may perform other
implementation-dependent operations as well, such as keeping track of
defined procedures for an editor. For this reason it is the preferred
way to define a globally available procedure.)
(DEFINE <identifier> ( <identifier list> ) <form list> )(DEFINE ( <identifier> <identifier list> ) <form list> )These alternative syntaxes permitted by DEFINE are
equivalent to:
(DEFINE <identifier>
(LAMBDA ( <identifier list> )
(BLOCK <form list> )))
where BLOCK is a syntactic extension defined below. For
example, these three definitions are equivalent:
(DEFINE CIRCULATE (LAMBDA (X) (RPLACD X X)))
(DEFINE CIRCULATE (X) (RPLACD X X))
(DEFINE (CIRCULATE X) (RPLACD X X))
These forms are provided to support stylistic diversity.
(ASET' <identifier> <form>)
This is analogous to the LISP primitive SETQ. For
example, to define a cell [Smith and Hewitt], we may use
ASET' as follows:
(DEFINE CONS-CELL
(LAMBDA (CONTENTS)
(LABELS ((THE-CELL
(LAMBDA (MSG)
(IF (EQ MSG 'CONTENTS?) CONTENTS
(IF (EQ MSG 'CELL?) 'YES
(IF (EQ (CAR MSG) '<-)
(BLOCK (ASET' CONTENTS (CADR MSG))
THE-CELL)
(ERROR '|UNRECOGNIZED MESSAGE - CELL|
MSG
'WRNG-TYPE-ARG)))))))
THE-CELL)))
Note that ASET' may be used on global identifiers as
well as locally bound identifiers {Note ASET
Has Disappeared}.
These magic forms implement escape objects and fluid (dynamic) variables. They are not a part of the essential kernel. For a further explication of their semantics in terms of kernel primitives, see [Imperative].
(FLUIDBIND ( <fluidbind list> ) <form> )<fluidbind list> ::= <empty> | (<identifier> <form>) <fluidbind list>This evaluates the <form> in the environment of
the FLUIDBIND form, with a dynamic
environment to which dynamic bindings of the identifiers in the
<fluidbind list> have been added. Any procedure
dynamically called by the form, even if not lexically apparent to the
FLUIDBIND form, will see this dynamic environment (unless
modified by further FLUIDBINDs, of course). The dynamic
environment is restored on return from the form.
Most LISP systems use a dynamic environment
for all variables. A SCHEME which implements
FLUIDBIND provides two distinct environments. The fluid
variable named FOO is completely unrelated to a normal
lexical variable named FOO {Note Global
Fluid Environment}, and the access mechanisms for the two are
distinct.
(FLUID <identifier>)
The value of this form is the value of the
<identifier> in the current dynamic environment. In
SCHEME implementations which have the MacLISP read-macro-character feature,
(FLUID FOO) may be abbreviated to •FOO.
(FLUIDSET' <identifier> <form> )
The value of the <form> is assigned to the
<identifier> in the current dynamic environment.
(STATIC <identifier>)
The value of this is the value of the lexical identifier; writing
this is the same as writing just <identifier> {Note
What
Use Is It?}.
(CATCH <identifier> <form> )
This evaluates the form in an environment where the identifier is
bound to an “escape object” [Landin]
[Reynolds].
This is a strange object which can be invoked as if it were a procedure
of one argument. When the escape object is so invoked, then control
proceeds as if the CATCH expression had returned with the
supplied argument as its value {Note Multiple
Throw}.
If both CATCH and FLUIDBIND are
implemented, then their semantics are intertwined. When the escape
object is called, then the dynamic environment is restored to the one
which was current at the time the CATCH form was evaluated
{Note Environment
Symmetry}.
For a contorted example, consider the following obscure definition of
SQRT (Sussman’s least favorite style/Steele’s favorite; but
see [SCHEME]):
(DEFINE SQRT
(LAMBDA (X EPSILON)
((LAMBDA (ANS TAG GO)
(CATCH RETURN
(BLOCK
(CATCH M (ASET' TAG M)) ;CREATE PROG TAG
(IF (< (ABS (-$ (*$ ANS ANS) X)) EPSILON) ;CAMGE
(RETURN ANS)) ;POPJ
(ASET' ANS (//$ (+$ (//$ X ANS) ANS) 2.0)) ;MOVEM
(GO TAG)))) ;JRST
1.0
NIL
(LAMBDA (F) (F NIL)))))
This example differs slightly from the version given in [SCHEME]; notice the forms
(RETURN ANS) and (GO TAG).
As another example, we can define a THROW function,
which may then be used with CATCH much as it is in MacLISP [Moon] (except that in
MacLISP the tag is written after the body of the
CATCH, not before):
(DEFINE THROW (LAMBDA (TAG RESULT) (TAG RESULT)))
An example of its use:
(CATCH LOSE
(MAPCAR (LAMBDA (X) (IF (MINUSP X) (THROW LOSE NIL) (SQRT X)))
NUMLIST))
Indeed, note the similarity between THROW and the
definition of GO in the first example.
SCHEME has a syntactic extension mechanism which provides a way to define an identifier to be a magic word, and to associate a function with that word. The function accepts the magic form as an argument, and produces a new form; this new form is then evaluated in place of the original (magic) form. This is precisely the same as the MacLISP macro facility.
Some standard syntactic extensions are provided by the system for
convenience in ordinary programming. They are distinguished from other
magic words in that they are semantically defined in terms of others
rather than being primitive {Note FEXPRs
Are Okay by Us}. For expository purposes they are described here in
a pattern-matching/production-rule kind of language. The matching is on
s-expression structure, not on character string syntax, and takes
advantage of the definition of list notation:
(A B C) = (A . (B . (C . NIL))). Thus the pattern
(x . r) matches (A B C), with
x = A and r = (B C). The ordering of the
“productions” is significant; the first one which matches is to be
used.
(BLOCK x1 x2
... xn)
(BLOCK x) -> x
(BLOCK x . r) -> ((LAMBDA (A B) (B)) x (LAMBDA () (BLOCK . r)))
BLOCK sequentially evaluates the subforms
xi from left to right. For
example:
(BLOCK (ASET' X 43) (PRINT X) (+ X 1))
returns 44 after setting X to
43 and then printing it {Note BLOCK
Exploits Applicative Order}.
(LET ((v1
x1) (v2
x2) ... (vn
xn)) . body)
-> ((LAMBDA (v1
v2
... vn) (BLOCK . body)) x1
x2
... xn)LET provides a convenient syntax for binding several
variables to corresponding quantities. It allows the forms for the
quantities to appear textually adjacent to their corresponding
variables. Notice that the variables are all bound simultaneously, not
sequentially, and that the initialization forms
xi may be evaluated in any order. For
convenience, LET also supplies a BLOCK around
the forms constituting its body.
(DO ((v1 x1
s1) ... (vn
xn
sn)) (test . done) . body)
-> (LET ((A1 (LAMBDA () x1)) (B1 (LAMBDA (v1
... vn) s1) ... (An (LAMBDA () xn)) (Bn (LAMBDA (v1
... vn) sn)) (TS (LAMBDA (v1
... vn) test)) (DN (LAMBDA (v1
... vn) (BLOCK . done))) (BD (LAMBDA (v1
... vn) (BLOCK . body)))) (LABELS ((LOOP (LAMBDA (Z1 ... Zn) (IF (TS Z1 ... Zn) (DN Z1 ... Zn) (BLOCK (BD Z1 ... Zn) (LOOP (B1 Z1 ... Zn) ... (Bn Z1 ... Zn))))))) (LOOP (A1) ... (An))))This is essentially the MacLISP “new-style”
DO loop [Moon]. The variables
vi are bound to the values of the corresponding
xi, and stepped in parallel after every
execution of the body by the si (by “step” we
mean “set to the value of”, not “increment by”). If an
si is omitted, vi is
assumed; this results in the variable not being stepped. If in addition
xi, is omitted, NIL is assumed. The
loop terminates when the test evaluates non-NIL; it is
evaluated before each execution of the body. When this occurs, the
done part is evaluated as a BLOCK.
The complexity of the definition shown is due to an effort to avoid
conflict of variable names, as for BLOCK. The auxiliary
variables Ai, Bi, and Zi must be
generated to produce as many as are needed, but they need not be chosen
different from all variables appearing in xi,
si, body, etc.
The iteration is effected entirely by procedure calls. In this manner
the definition of DO exploits the tail-recursive properties
of SCHEME [SCHEME] [Imperative].
As an example, here is a definition of a function to find the length of a list:
(DEFINE (LENGTH X)
(DO ((Z X (CDR Z))
(N 0 (+ N 1)))
((NULL Z) N)))
The initializations forms xi may be evaluated
in any order, and on each iteration the stepping form
si may be evaluated in any order. This differs
from the MacLISP definition of DO.
For example, this definition of NREVERSE (destructively
reverse a list) would work in MacLISP but not
necessarily in SCHEME:
(DEFINE NREVERSE (X)
(DO ((A X (CDR A))
(B NIL (RPLACD A B)))
((NULL A) B)))
This definition depends on the CDR occurring before the
RPLACD. In SCHEME we must instead
write:
(DEFINE NREVERSE (X)
(DO ((A X (PROG1 (CDR A) (RPLACD A B)))
(B NIL A))
((NULL A) B)))
where by (PROG1 x y) We mean
((LAMBDA (P Q) (BLOCK (Q) P)) x (LAMBDA () y)) (but
PROG1 is not really a defined SCHEME primitive).
Note also that the effect of an ASET' on a
DO variable does not survive to the next iteration; this
differs from using SETQ on a DO variable in
MacLISP.
(ITERATE name ((v1
e1) ... (vn
en)) . body)
-> (LABELS ((name (LAMBDA (v1
... vn) (BLOCK . body)))) (name e1
... en))This defines a looping construct more general than DO.
For example, consider a function to sort out a list of s-expressions
into atoms and lists:
(DEFINE COLLATE
(LAMBDA (X)
(ITERATE COL
((Z X) (ATOMS NIL) (LISTS NIL))
(IF (NULL Z)
(LIST ATOMS LISTS)
(IF (ATOM (CAR Z) )
(COL (CDR Z) (CONS (CAR Z) ATOMS) LISTS)
(COL (CDR Z) ATOMS (CONS (CAR Z)
LISTS)))))))
We have found many situations involving loops where there may be more
than one condition on which to exit and/or more than one condition to
iterate, where DO is too restrictive but
ITERATE suffices. Notice that because each loop has a name,
one can specify from an inner loop that the next iteration of any outer
loop is to occur. Here is a function very similar to the one used in one
SCHEME implementation for variable lookup: there
are two lists of lists, one containing names and the other values.
(DEFINE (LOOKUP NAME VARS VALUES)
(ITERATE MAJOR-LOOP
((VARS-BACKBONE VARS)
(VALUES-BACKBONE VALUES))
(IF (NULL VARS-BACKBONE)
NIL
(ITERATE MINOR-LOOP
((VARS-RIB (CAR VARS-BACKBONE))
(VALUES-RIB (CAR VALUES-BACKBONE)))
(IF (NULL VARS-RIB)
(MAJOR-LOOP (CDR VARS-BACKBONE)
(CDR VALUES-BACKBONE))
(IF (EQ (CAR VARS-RIB) NAME)
VALUES-RIB
(MINOR-LOOP (CDR VARS-RIB)
(CDR VALUES-RIB)))))
(We had originally wanted to call this construct LOOP,
but see {Note FUNCALL
is a Pain}. Compare this with looping constructs appearing in [Hewitt].)
It happens that ITERATE is a misleading name; the
construct can actually be used for recursion (“true” recursion, as
opposed to tail-recursion) as well. If the name is invoked
from a non-tail-recursive invoked situation, the argument evaluation in
which the call is embedded is not aborted. It just so happens that we
have found ITERATE useful primarily to implement
complicated iterations. One can draw the rough analogy
ITERATE : LABELS :: LET : LAMBDA.
(TEST pred in alt)
-> ((LAMBDA (P F A) (IF P ((F) P) (A)))
pred
(LAMBDA () fn)
(LAMBDA () alt))
The predicate is evaluated; if its value is non-NIL then
the form fn should evaluate to a procedure of one argument,
which is then invoked on the value of the predicate. Otherwise the
alternative alt is evaluated.
This construct is of occasional use with LISP
“predicates” which return a “useful” non-NIL value. For the
consequent of an IF to get at the non-NIL
value of the predicate, one might first bind a variable to the value of
the predicate, and this variable would then be visible to the
alternative as well. With TEST, the use of the variable is
restricted to the consequent.
An example:
(TEST (ASSQ VARIABLE ENVIRONMENT)
CDR
(GLOBALVALUE VARIABLE))
Using an a-list to represent an environment, one wants to use the cdr
of the result of ASSQ if it is non-NIL; but if
it is NIL, then the variable was not in the environment,
and one must look elsewhere.
(COND (p1 .
e1) ... (pn .
en))
(COND) -> 'NIL
(COND (p) . r) -> ((LAMBDA (V R) (IF V V (R)))
p
(LAMBDA () (COND . r)))
(COND (p => f) . r) -> (TEST p f (COND . r))
(COND (p . e) . r) -> (IF p (BLOCK . e) (COND . r ))
This COND is a superset of the MacLISP COND. As in MacLISP, singleton clauses return the value of the
predicate if it is non-NIL, and clauses with two or more
forms treat the first as the predicate and the rest as constituents of a
BLOCK, thus evaluating them in order.
The extension to the MacLISP
COND made in SCHEME is flagged by
the atom =>. (It cannot be confused with the more
general case of two BLOCK constituents because having the
atom => as the first element of a BLOCK is
not useful.) In this situation the form f following the
=> should have as its value a function of one argument;
if the predicate p is non-NIL, this function
is determined and invoked on the value returned by the predicate. This
is useful for the common situation encountered in LISP:
(COND ((SETQ IT (GET X 'PROPERTY)) (HACK IT))
...)
which in SCHEME can be rendered without using
a variable global to the COND:
(COND ((GET X 'PROPERTY)
=> (LAMBDA (IT) (HACK IT)))
...)
or, in this specific instance, simply as:
(COND ((GET X 'PROPERTY) => HACK)
...)
(OR x1 x2
... xn)
(OR) -> 'NIL
(OR x) -> x
(OR x . r) -> (COND (x) (T (OR . r)))
This standard LISP primitive evaluates the
forms xi in order, returning the first
non-NIL value (and ignoring all following forms). If all
forms produce NIL, then NIL is returned {Note
Tail-Recursive
OR}.
(AND x1 x2
... xn)
(AND) -> 'T
(AND x) -> x
(AND x . r) -> (COND (x) (x (AND . r)))
This standard LISP primitive evaluates the
forms xi in order. If any form produces
NIL, then NIL is returned, and succeeding
forms xi are ignored. If all forms produce
non-NIL values, the value of the last is returned {Note Tail-Recursive
AND}.
(AMAPCAR f x1
x2
... xn))
-> (DO ((FN f) (V1 x1
(CDR V1)) (V2 x2
(CDR V2)) ... (Vn xn
(CDR Vn)) (Q 'NIL (CONS (FN (CAR V1) (CAR V2) ... (CAR Vn)) Q))) ((OR (NULL V1) (NULL V2) ... (NULL Vn)) (NREVERSE Q)))AMAPCAR is analogous to the MacLISP MAPCAR function. The function
f, a function of n arguments, is mapped
simultaneously down the lists x1,
x2, ..., xn; that is,
f is applied to tuples of successive elements of the lists.
The values returned by f are collected and returned as a
list. Note that AMAPCAR of a fixed number of arguments
could easily be written as a function in SCHEME.
It is a syntactic extension only so that it may accommodate any number
of arguments, which saves the trouble of defining an entire set of
primitive functions AMAPCAR1, AMAPCAR2, …
where AMAPCARn takes n+1 arguments.
(AMAPLIST f x1
x2
... xn))
-> (DO ((FN f) (V1 x1
(CDR V1)) (V2 x2
(CDR V2)) ... (Vn xn
(CDR Vn)) (Q 'NIL (CONS (FN V1 V2 ... Vn) Q))) ((OR (NULL V1) (NULL V2) ... (NULL Vn)) (NREVERSE Q)))AMAPLIST is analogous to the MacLISP MAPLIST function. The function
f, a function of n arguments, iis applied to
tuples of successive tails of the lists. The values returned by
f are collected and returned as a list.
(AMAPC f x1
x2
... xn))
-> (DO ((FN f) (X1 x1
(CDR X1)) (X2 x2
(CDR X2)) ... (Xn xn
(CDR Xn))) ((OR (NULL X1) (NULL X2) ... (NULL Xn)) 'NIL) (FN (CAR X1) (CAR X2) ... (CAR Xn)))AMAPC is analogous to the MacLISP MAPC function. The function
f, a function of n arguments, is mapped
simultaneously down the lists x1,
x2, ..., xn; that is,
f is applied to tuples of successive elements of the lists.
Thus AMAPC is similar to AMAPCAR, except that
no values are expected from f; therefore f
need not be a function, but may be any procedure.
(PROG varlist s1
s2
... sn)
The SCHEME PROG is like the
ordinary LISP PROG. There is no
simple way to describe the transformation of PROG syntax
into SCHEME primitives. The basic idea is that a
large LABELS statement is created, with a labelled
procedure (of zero arguments) for each PROG statement. Each
statement is transformed in such a way that each one that “drops
through” is made to call the labelled procedure for the succeeding
statement; each appearance of (GO tag) is converted to a
call on the labelled procedure for the statement following the tag; and
each appearance of (RETURN value) is replaced by value.
Practical experience with SCHEME has shown
that PROG is almost never used. It is usually more
convenient just to write the corresponding LABELS directly.
This allows one to write LABELS procedures which take
arguments, which tends to clarify the flow of data [Imperative].
The rest of this section (FSUBRs) applies only to the PDP-10 MacLISP implementation of SCHEME .
FSUBRs
As a user convenience, the PDP-10 MacLISP implementation of SCHEME treats FSUBRs specially; any FSUBR provided by
the MacLISP system is automatically a SCHEME primitive (but user FSUBRs are not). Of course,
if the FSUBR tries to evaluate some form obtained from its argument, the
variable references will not refer to SCHEME
variables. As a special case, the SCHEME
syntactic extension AARRAYCALL is provided to get the
effect of the MacLISP FSUBR
ARRAYCALL.
A SCHEME implementation should have one or
more ways for the user to extend the inventory of magic words. The
methods provided will vary from implementation to implementation. The
following primitive (SCHMAC) is provided in the PDP-10 MacLISP implementation
of SCHEME.
(SCHMAC name pattern body)
After execution of this form, a syntactic extension keyed on the atom
name is defined. When a form (name . rest) is
to be evaluated, rest is matched against the pattern, which
is a (possibly “dotted”) list of variables. The body is
then evaluated in an environment where the variables in the pattern have
as values the corresponding parts of rest. This should
result in a form to be evaluated in place of the form
(name . rest).
The body is not necessarily SCHEME code, but rather code in the same meta-language used to write the evaluator. In the PDP-10 MacLISP SCHEME implementation, the body is MacLISP code.
As an example, here is a definition of TEST:
(SCHMAC TEST (PRED FN ALT)
(LIST '(LAMBDA (P F A) (IF P ((F) P) (A)))
PRED
(LIST 'LAMBDA '() FN)
(LIST 'LAMBDA '() ALT)))
The body of a SCHMAC almost always performs a complicated consing-up
of a program structure. Often one needs to make a copy of a standard
structure, with a few values filled in. To make this easier, SCHEME provides an “unquoting quote” feature. An
expression of the form "<s-expression> is just like
'<s-expression>, except that sub-expressions preceded
by “,” or “@” represent expressions whose
values are to be made part of (a copy of) the s-expression at that
point. A “,” denotes simple inclusion, while
“@” denotes “splicing” or “segment” inclusion. (Compare
this with the treatment of lists with embedded forms in MUDDLE [Galley and
Pfister], which in turn inspired the !” syntax of CONNIVER [McDermott and Sussman], from
which SCHEME’s " syntax is
derived.) Using this, one can define TEST as follows:
(SCHMAC TEST (PRED FN ALT)
"((LAMBDA (P F A) (IF P ((F) P) (A)))
,PRED
(LAMBDA () ,FN)
(LAMBDA () ,ALT)))
Similarly, LET can be defined as:
(SCHMAC LET (DEFNS . BODY)
"((LAMBDA ,(MAPCAR 'CAR DEFNS)
(BLOCK . ,BODY))
@(MAPCAR 'CADR DEFNS)))
One could also write (BLOCK @BODY) instead of
(BLOCK . ,BODY).
Notice the use of (MAPCAR 'CAR DEFNS) rather than
(AMAPCAR CAR DEFNS), and recall that, as stated above, the
body of a SCHMAC is MacLISP code,
not SCHEME code. Consider too this definition of
COND:
(SCHMAC COND CLAUSES
(COND ((NULL CLAUSES) ''NIL)
((NULL (CDAR CLAUSES))
"((LAMBDA (V R) (IF V V R))
,(CAAR CLAUSES)
(LAMBDA () (COND . ,(CDR CLAUSES)))))
((EQ (CADAR CLAUSES) '=>)
"(TEST ,(CAAR CLAUSES) ,(CADDAR CLAUSES)
(COND . ,(CDR CLAUSES))))
(T "(IF ,(CAAR CLAUSES)
(BLOCK . ,(CDAR CLAUSES))
(COND . ,(CDR CLAUSES))))))
We have used COND to define COND! The
definition is not circular, however; the MacLISP
COND is being used to define the SCHEME COND, and indeed the two have
slightly different semantics. The definition would have
been circular had we written (COND (V) (R)) instead of
(IF V V R), for the latter is part of the generated SCHEME code.
(MACRO name pattern body)
This is just like SCHMAC, except that body
is SCHEME code rather than MacLISP code. While macros defined with
SCHMAC run only in a MacLISP
implementation of SCHEME, those defined with
MACRO should be completely transportable. (We described
SCHMAC first to emphasize the fact that macros are
conceptually part of the interpreter, and so conceptually written in the
meta-language. It so happens, however, that SCHEME is a good meta-language for SCHEME, and so introducing this meta-circularity
provides no serious problems. Contrast this with writing PL/I macros in PL/I!)
The example of defining COND using SCHMAC
above would be circular if we changed the word SCHMAC to
MACRO. However, we can avoid this by avoiding the use of
COND in the definition:
(MACRO COND CLAUSES
(IF (NULL CLAUSES) ''NIL
(IF (NULL (CDAR CLAUSES))
"((LAMBDA (V R) (IF V V R))
,(CAAR CLAUSES)
(LAMBDA () (COND . ,(CDR CLAUSES) )))
(IF (EQ (CADAR CLAUSES) '=>)
"(TEST ,(CAAR CLAUSES)
,(CADDAR CLAUSES)
(COND . ,(CDR CLAUSES)))
"(IF ,(CAAR CLAUSES)
(BLOCK . ,(CDAR CLAUSES))
(COND . ,(CDR CLAUSES)))))))
We strongly encourage the use of MACRO instead of
SCHMAC in practice so that macro definitions will not be
dependent on the properties of a specific implementation.
All the usual MacLISP SUBRs are available in
SCHEME as procedures which are the values of
global variables. The particular primitives CONS,
CAR, CDR, ATOM, and
EQ are part of the kernel of SCHEME! Others, such as +,
//, =, EQUAL,
RPLACA, RPLACD, etc. are quite convenient to
have.
Although there is no way in SCHEME to write a
LEXPR (a function of a variable number of arguments), MacLISP LSUBRs are also available to the SCHEME user. One may wish to regard these as syntactic
extensions in much the same Way AMAPCAR is; for example,
LIST may be thought of as a syntactic extension such
that:
(LIST) -> 'NIL
(LIST x . r) -> (CONS x (LIST . r ))
Below we also describe some additional primitive functions provided with SCHEME. Their names do not have any special syntactic properties in the way that the magic words for syntactic extensions described in the previous section do. However, they do deal with the underlying implementation, and so could not be programmed directly by the user were they not provided as primitives.
The following primitive functions (PROCP and
ENCLOSE) are part of the kernel of SCHEME.
(PROCP thing)
This is a predicate which is true of procedures, and not of anything
else. Thus if (PROCP x) is true, then it is safe to invoke
the value of x.
More precisely, if PROCP returns a non-NIL
value, then the value describes the number of arguments accepted by the
procedure. For SCHEME procedures this will be an
integer, the number of arguments. For primitive functions, this may be
implementation-dependent; in the PDP-10 MacLISP implementation of SCHEME, PROCP of an LSUBR returns the
MacLISP ARGS property for that
LSUBR. If an object given to PROCP is a procedure but the
number of arguments it requires cannot be determined for some reason,
then PROCP returns T.
(ENCLOSE fnrep envrep)
ENCLOSE takes two s-expressions, one representing the
code for a procedure, and the other representing the (lexical)
environment in which the procedure is to run. ENCLOSE
returns a (closed) procedure which can be invoked.
The representation of the code is the standard s-expression description (a lambda-expression). The representation of the environment is an association list (a-list) of the traditional kind:
((var1 . value1) (var2 . value2) ...)
NIL represents the global lexical environment.
This description of ENCLOSE should not be construed as
describing how the implementation of SCHEME
represents either environment or code internally. Indeed,
ENCLOSE could be as simple as CONS, or as
complicated as a compiler. All that ENCLOSE guarantees to
do is to compute a procedure object given a description of its desired
behavior. The description must be in the prescribed form; but the result
may be in any form convenient to the implementation, as long as it
satisfies the predicate PROCP {Note EVALUATE
Has Disappeared} {Note S-expressions
Are Not Functions}.
As an example, we can write APPLY using
ENCLOSE. One way is to generate a lot of names for the
arguments involved:
(DEFINE APPLY
(LAMBDA (FN ARGS)
(LET ((VARS (AMAPCAR (LAMBDA (X) (GENSYM)) ARGS))
(FNVAR (GENSYM)))
((ENCLOSE "(LAMBDA () (,FNVAR @VARS))
(CONS (CONS FNVAR FN)
(AMAPCAR CONS VARS ARGS)))))))
Here a procedure which will call the procedure FN on the
required number of arguments is enclosed in an environment with all the
variables bound to the appropriate values. For those who don’t like
GENSYM, here is another way to do it:
(DEFINE APPLY
(LAMBDA (FN ARGS)
(DO ((TAIL 'A "(COR ,TAIL))
(REFS NIL (CONS "(CAR ,TAIL) REFS))
(COUNT ARGS (CDR COUNT)))
((NULL COUNT)
((ENCLOSE "(LAMBDA (F A) (F @(REVERSE REFS))) NIL)
FN ARGS)))))
In this version we create a series of forms (CAR A),
(CAR (CDR A)), (CAR (CDR (CDR A))), … to be
used to access the arguments. (In a way, these are distinct names for
the arguments in the same way that the gensyms were for the first
version.) The values FN and ARGS are passed in
as arguments to the enclosed procedure, rather than giving a non-nil
environment representation to ENCLOSE.
As another example, we define a function called
*LAMBDA:
(DEFINE (*LAMBDA VARS BODY)
(ENCLOSE "(LAMBDA ,VARS ,BODY) NIL))
Writing (*LAMBDA '(X Y) '(FOO Y X)) is just like writing
(LAMBDA (X Y) (FOO Y X)). However, if there are any free
variables in the supplied body, then *LAMBDA will cause
them to refer to the global environment, not the current one. We cannot
in general simulate LAMBDA by using *LAMBDA,
because SCHEME (purposefully) does not provide a
general way to get a representation of the current environment. We
could, of course, require the user to give *LAMBDA a
representation of the current environment, but this hardly seems
worthwhile.
The following primitive functions allow for multiprocessing. We do not pretend that they are “right” in any sense, and are not particularly attached to these specific definitions. They are not part of the kernel of SCHEME. (Their primary use in practice is for bootstrapping SCHEME by creating an initial process for the top-level user interface loop.)
There are no primitives for process synchronization, as we have no
good theory of how best to do this. However, in the PDP-10 MacLISP implementation
of SCHEME we guarantee that SUBRs and LSUBRs
execute in an uninterruptible fashion; that is, such functions can be
considered “atomic” for synchronization purposes. The user is invited to
exploit this fact to invent his own synchronization primitives {Note EVALUATE!UNINTERRUPTIBLY
Has Disappeared}.
(CREATE!PROCESS proc)
This is the process generator for multiprocessing. It takes one
argument, a procedure of no arguments. If the procedure ever terminates.
the entire process automatically terminates. The value of
CREATE!PROCESS iS a process ID for the newly generated
process. Note that the newly created process will not actually run until
it is explicitly started. When started, the procedure will be invoked
(with no arguments), and the process will run “in parallel” with all
other active processes.
(START!PROCESS procid)
This takes one argument, a process id, and starts up or resumes that process, which then runs.
(STOP!PROCESS procid)
This also takes a process id, but stops the process. The stopped
process may be continued from where it was stopped by using
START!PROCESS again on it; The global variable
**PROCESS** always contains the process id of the currently
running process; thus a process can stop itself by doing
(STOP!PROCESS **PROCESS**).
(TERMINATE)
This primitive stops and kills the process which invokes it. The
process may not be resumed by START!PROCESS. Some other
process is selected to run. If the last process is terminated, SCHEME automatically prints a warning message, and
then creates a new process running the standard SCHEME “read-eval-print” (actually “read–stick (LAMBDA
() .) around–enclose in top-level environment–invoke–print”) loop.
An example of the use of TERMINATE:
(TERMINATE)
ASET Has Disappeared} ^The more general primitive ASET described in [SCHEME] has been removed from the SCHEME language. Although the case of a general
evaluated expression for the variable name causes no real semantic
difficulty (it can be viewed as a syntactic extension representing a
large CASE statement, as pointed out in [Declarative]), it can be
confusing to the reader. Moreover, in two years we have not found a use
for it. Therefore we have replaced ASET with
ASET', which requires the name of the variable to be
modified to appear manifestly.
We confess to being “cute” when we say that the name of the primitive
is ASET'. We have not changed the implementation at all,
but merely require that the first argument be quoted. The form
(ASET' FOO BAR) is parsed by the MacLISP reader as (ASET (QUOTE FOO) BAR).
Of course, a different implementation of SCHEME
might actually take ASET' as a single name. We apologize
for this nonsense.
BLOCK Exploits Applicative Order} ^The definition shown for BLOCK exploits the applicative
order of evaluation of SCHEME to perform this
{Note Normal
Order Loses}. It does not depend on left-to-right
evaluation of arguments to functions! Notice also that in
(BLOCK x . r) -> ((LAMBDA (A B) (B)) x (LAMBDA () (BLOCK . r)))
there can be no conflict between the auxiliary variables
A and B and any variables occurring in
x and r. It is thus unnecessary to choose
variables different from any others appearing in the code. In this
respect this definition is an improvement over the one given in [Imperative]. This trick
(which is actually a deep property of the lexical scoping rules) is used
in a general way in most of the definitions of syntactic extensions: one
wraps all the “user code” in lambda-expressions in the outer
environment, passes them in bound to internal names, and then invokes
them as necessary within the internal code for the definition.
One may think of an escape object as being “closed” with respect to a dynamic environment (and here we mean not only fluid variables but the chain of pending procedure calls) in much the same way that an ordinary procedure is closed with respect to a lexical environment. Just as a procedure cannot execute properly except in conjunction with a static environment of the appropriate form, so an escape object cannot properly resume control except in a dynamic environment of the appropriate form.
EVALUATE Has Disappeared} ^The EVALUATE primitive described in [SCHEME] has been removed from the language. We
discovered (the hard way) that the straightforward implementation of
EVALUATE (evaluate the given expression in the current
environment) destroys referential transparency. We then altered it to
evaluate the expression in the top-level environment, but were still
disturbed by the extent to which one is tied to a particular
representation of a procedure to be executed.
We eventually invented an ENCLOSE of one argument (a
lambda- expression), which enclosed the procedure in the top-level
environment. This allowed one to remove the dependence on representation
by making a procedure, and then to pass the procedure around for a while
before invoking it. We had no provision for closing in an arbitrary
environment, because we did not want to provide the user with direct
access to environments as data objects. The excellent idea of allowing
ENCLOSE to accept a representation of an
environment was suggested to us by R. M. Fano.
EVALUATE!UNINTERRUPTIBLY Has Disappeared} ^The EVALUATE!UNINTERRUPTIBLY primitive described in [SCHEME] has been removed from the language. This
primitive was half a joke, and we have since discovered that it had a
serious flaw in its definition, namely that the scope of the
uninterruptibility is lexical. This worked in our limited examples only
by virtue of the fact that SUBRs were atomic operations. In general,
this primitive is worse than useless for synchronization purposes.
Synchronization is clearly a dynamic and not a static phenomenon. We
have no good theory of synchronization (primitives for this were
included in [SCHEME] primarily to show that it could be done,
however kludgily), and so have defined no replacement for
EVALUATE!UNINTERRUPTIBLY. We apologize for any confusion
this mistake may have caused.
While the syntactic extensions are defined in terms of other
constructs, they need not be implemented in terms of them. For example,
in the current PDP-10 MacLISP implementation of SCHEME, BLOCK is actually implemented in
the same way IF and QUOTE are, rather than as
a macro in terms of LAMBDA. This was done purely to speed
up the interpreter. The compiler still uses the macro definition (though
we could change that too if warranted). The point is that the user
doesn’t have to know about all this.
It is somewhat an accident that magic forms look like procedure calls
(see also {Note FUNCALL
is a Pain}): a name appearing in the car of a list may represent
either a procedure or a magic word, but not both. (We could, for
example, say that magic forms are distinguished by a magic word in the
cadr of a list, thus allowing forms such as
(FOO := (+ FOO 1)), where := is a magic word
for assignment. PLASMA [Smith and Hewitt] allowed just this
ability with its “italics” or “reserved word” feature.) Thanks to this
accident many LISP interpreters store the magic
function definition in the place where an ordinary procedure definition
is stored. A special marker (traditionally EXPR/SUBR or FEXPR/FSUBR)
distinguishes ordinary functions from magic ones. This allows the lookup
for a magic word definition and an ordinary value to be simultaneous,
thus speeding up the implementation; it is purely an engineering trick
and not a semantic essence. However, this trick has led to a
generalization wherein QUOTE and COND are
regarded as functions on an equal basis with CAR and
CONS; to be sure, they take their arguments in a funny way
- unevaluated - but they are still regarded as functions. This leads to
all manner of confusion, which has its roots in a confusion between a
procedure and its representation.
It is helpful to consider a simple thought experiment. Let us
postulate a toy language called “Number-LISP”.
Programs in this language are written as s-expressions, as usual; the
kernel primitives LAMBDA, IF,
LABELS, etc. are all present. However, the primitive
functions CONS, CAR, and CDR are
absent; one has only +, -, *,
//, and =. QUOTE is
not available; the only constants one can write are
numbers.
Now Number-LISP can be used to perform all
kinds of arithmetic, but it is clearly a poor language in which to write
a LISP interpreter. Now consider the magic form
processors and syntactic extension functions for Number-LISP. They are
procedures on s-expressions or functions from s-expressions to
s-expressions which transform one form into another. Whatever processes
IF or LABELS or BLOCK is clearly.
not a Number-LISP procedure, because it must
deal with the text of a Number-LISP
procedure, not just the data to be operated on by that procedure. The
IF-processor (a “FEXPR”) for Number-LISP must be coded in the
meta-language for Number-LISP,
whatever that may be.
Now it is one of the great features of ordinary LISP that it can serve as its own meta-language. This provides great power, but also permits great confusion. If the implementation allows mixing of levels of definition, we must keep them separate in our minds. For this reason we don’t mind using the “FEXPR hack” to implement syntactic forms, but we do mind thinking of them as functions just like EXPRS.
FUNCALL is a Pain} 1, 2, 3, 4The ambiguity between magic forms and combinations could be
eliminated by reserving a special subclass of lists to represent
combinations, and allowing all others to represent magic forms. For
example, we might say that all lists beginning with the atom
CALL are combinations. Then we would write
(CALL CONS A B) rather than (CONS A B). One
could then have a procedure named LAMBDA, for example;
there could be no confusion between (LAMBDA (A) (B A)) and
(CALL LAMBDA (CALL A) (CALL B A)), as there would be
between (LAMBDA (A) (B A)) as a combination and as a magic
form denoting a procedure. Notice that CALL is intended to
be merely a syntactic marker, like LAMBDA or
IF, and not a function as FUNCALL is in MacLISP [Moon].
If this CALL convention were adopted, there could be no
confusion between combinations and other kinds of forms. Not all
expressions would have meaningful interpretations; for example
(FOO A B) would not mean anything (certainly not a call to
the function FOO, which would be written as
(CALL FOO A B)). The space of meaningful s-expressions
would be a very sparse subset of all s-expressions, rather than a dense
one. It would also make writing SCHEME code very
clumsy. (These two facts are of course correlated.) Combinations occur
about as often as all other non-atomic forms put together; we would like
to write as little as possible to denote a call. As in traditional LISP, we agree to tolerate the ambiguity in SCHEME as the price of notational convenience. Indeed,
this ambiguity is sometimes exploited; it is convenient not to have to
know whether AMAPCAR is a function or a magic word.
This compromise does lead to difficulties, however. For example, we had wanted to define an iteration feature:
(LOOP name varspecs body)
Unfortunately, there is a great deal of existing code written in SCHEME of the form:
(LABELS ((LOOP (LAMBDA ... (LOOP ...) ...)))
(LOOP ...))
because LOOP has become a standard name for use in a
LABELS procedure which implements an iteration (see, for
example, our definition of DO!). If LOOP were
to become a new magic word, then all this existing code would no longer
work. We were therefore forced to name it ITERATE instead
(after verifying that no existing code used the name
ITERATE for another purpose!).
There would have been no problem if all this code had been written as:
(LABELS ((LOOP (LAMBDA ... (CALL LOOP ...) ...)))
(CALL LOOP ...))
To this extent SCHEME has unfortunately, despite our best intentions, inherited a certain amount of referential opacity.
There is a question as to the meaning of the global fluid environment. In the PDP-10 MacLISP implementation of SCHEME, the global lexical and fluid environments coincide, but this was an arbitrary choice of convenience influenced by the structure of MacLISP. We recommend that the two global environments be kept distinct.
IF Is Data-Dependent} ^We should note that the usefulness of the definition of
IF explicitly depends on the particular kinds of data types
and the particular primitive functions available; we expect to use
IF with primitive predicates such as ATOM and
EQ. This is in contrast to other kernel forms such as
LAMBDA and LABELS expressions, whose semantics
are independent of the data.
We erred in [SCHEME] when we stated that a practical
interpreter must have a little of each of call-by-value and call-by-name
in it. The argument was roughly that a call-by-name interpreter must
become call-by- value when a primitive operator is to be applied, and a
call-by-value interpreter must have some primitive conditional such as
IF. We did mention the trick of eliminating IF
in a call-by-name interpreter by defining predicates to return
(LAMBDA (X Y) X) for TRUE and (LAMBDA (X Y) Y)
for FALSE, whereupon one typically writes:
((= A B) <do this if TRUE> <do this if FALSE>)
but noted that it depends critically on the use of normal order evaluation.
What we had not fully understood at that point was the trick of
simulating call-by-name in terms of call-by-value by using
lambda-expressions (our use of it in the TRY!TWO!THINGS!IN!PARALLEL
example notwithstanding!); this trick was described generally in [Imperative]. A special case
of this trick is to define the primitive predicates in a call-by-value
interpreter to return (LAMBDA (X Y) (X)) for TRUE and
(LAMBDA (X Y) (Y)) for FALSE. Then one can write things
like:
((= A B)
(LAMBDA () <do this if TRUE>)
(LAMBDA () <do this if FALSE>))
and so eliminate a call-by-name-like magic form such as
IF. One can make the dependence of the conditional on the
primitive data operations even more explicit by defining predicates not
to return any particular value, but to require two “continuations” as
arguments, of which it will invoke one:
(= A B
(LAMBDA () <do this if TRUE>)
(LAMBDA () <do this if FALSE>))
We were correct when we said that a practical interpreter must have
call-by-name to some extent in that there must be some way to designate
two pieces of as yet uninterpreted program text of which only one is to
be evaluated. We simply did not notice that LAMBDA provides
this ability, and so a separate primitive such as IF is not
necessary. We have chosen to retain IF in the language
because it is traditional, because its implementation is easy to
understand, and because it allows us to take advantage of many existing
predicates in the host language in the PDP-10
MacLISP implementation.
These rules refer to the following rules for LISP s-expressions:
<atomic symbol>::= <alphanumeric string> <letter> <alphanumeric string>
<alphanumeric string) ::= <empty> |
<alphanumeric character> <alphanumeric string>
<alphanumeric character> ::= <letter> | <digit> | / <character>
<letter> ::= A | B | ... | Y | Z | * | $ | % | ...
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<number> ::= <VERY implementation dependent>
<string> ::= " <character string> "
<character string> ::= ‹empty> | <literal character> <character string>
<literal character> ::= <any character except " or /> | / <character>
(In the PDP-10 MacLISP implementation of SCHEME, we use the " character for
another purpose because PDP-10 MacLISP does not have a string data type. We mention
strings only as a familiar example other than numbers of an atomic data
type other than identifiers.)
In addition, we assume the usual interchangeability of list notation
and dot notation for s-expressions: (A B C) =
(A . (B . (C . NIL))). Thus the pattern ( (COND (A B) (T C)). It
is the s-expression representation we care about, not particular
character strings.
LISP is extensible in two ways. First, there
is the simple ability to define new functions; these are used in a way
that is both syntactically and semantically identical to built-in
primitive functions like CAR. This is in contrast to
“algebraic” programming languages, which either make an arbitrary
syntactic distinction between addition, say, and a user function, or
wander into the quagmire of extensible parsers. Second, there is a
uniform macro facility for transforming one syntactic form into another;
this facility is based on internal data structures rather than external
character-string syntax.
Joel Moses (private communication) once made some remarks on the difference between LISP and APL, which we paraphrase here: “APL is like a diamond. It has a beautiful crystal structure; all of its parts are related in a uniform and elegant way. But if you try to extend this structure in any way – even by adding another diamond – you get an ugly kludge. LISP, on the other hand, is like a ball of mud. You can add any amount of mud to it (e.g. MICRO-PLANNER or CONNIVER) and it still looks like a ball of mud!”
A full implementation of SCHEME allows this
to work even if the CATCH has already been “returned from”;
that is, the escape object can be used to “return from the catch”
several times. However, we allow the possibility that an implementation
may allow the escape object to be invoked only from within the execution
of the form inside the catch. (This is essentially the restriction that
MacLISP makes on its CATCH
construct [Moon].) This
restriction permits a stack discipline for the allocation of
continuations, and also greatly simplifies the flow analysis problem for
a compiler [RABBIT].
Our definition of BLOCK exploits the fact that SCHEME is an applicative-order (call-by-value)
language in order to enforce sequencing. Sussman has proved that one
cannot do a similar thing in a normal-order (call-by-name) language:
Theorem: Normal order, as such, is incapable of
enforcing sequencing (whereas applicative order is) in the form of the
BLOCK construct.
(Informal) proof: The essence of
(BLOCK a b) is that a is evaluated before
b, and that the value of b is the value of the
BLOCK (or, more correctly, the value or meaning of the
BLOCK is independent of a; a is
executed only for its side effects). Now we know that if
(BLOCK a b) has any value at all, it can be found by using
normal order (normal order terminates if any order does). Now suppose
that the computation of a does not terminate, but the
computation of b does. Then (BLOCK a b) must
terminate under normal order, because the value of the block is the
value of b; but this contradicts the requirement that
a finish before b is calculated. QED
This is an informal indication that normal order is less useful (or at least less powerful) in a programming language than applicative order. We also noted in [SCHEME] that normal order iterations tend to consume more space than applicative order iterations, because of the buildup of thunk structure. Given that one can simulate normal order in applicative order by explicitly creating closures [Imperative], there seems to be little to recommend normal order over applicative order in a practical programming language.
The notes are ordered alphabetically by name, not in order of reference within the text.
Recall that a lambda-expression (i.e. an s-expression whose car is
the atomic symbol LAMBDA) is not itself a valid procedure.
It is necessary to ENCLOSE it in order to invoke it.
Moreover, the particular representations we have chosen for
procedures and environments are arbitrary. In principle, one could have
several kinds of ENCLOSE, each transforming instances of a
particular representation into procedures. For example, someone might
actually want to implement a primitive ALGOL-ENCLOSE,
taking a string and a 2-by-N array representing code and environment for
an ALGOL procedure:
(ALGOL-ENCLOSE "integer procedure fact(n); value n: integer n:
fact := if n=0 then 1 else n*fact(n-1)"
NULL-ARRAY)
This could return a factorial function completely acceptable to SCHEME. Of course, the implementation of the primitive
ALGOL-ENCLOSE would have to know about the internal
representations of procedures used by the implementation of SCHEME; but this is hidden from the user.
Similarly, one could have APL-ENCLOSE,
BASIC-ENCLOSE, COBOL-ENCLOSE,
FORTRAN-ENCLOSE, RPG-ENCLOSE …
AND} ^The definition of AND has three rules, not only for the
same reasons OR does, but because AND is not a
precise dual to OR. OR, on failure, returns
NIL, but AND does not just return
non-NIL on success. It must return the non-NIL
thing returned by its last form.
OR} ^We might have defined OR with only two rules:
(OR) -> 'NIL
(OR x . r) -> (COND (x) (T (OR . r)))
However, because of the way OR is sometimes used, it is
a technical convenience to be able to guarantee to the user that the
last form in an OR is evaluated without an extra “stack
frame”; that is, a function called as the last form in an
OR will be invoked tail-recursively. For example:
(DEFINE NOT-ALL-NIL-P
(LAMBDA (X)
(LABELS ((LOOP
(LAMBDA (Z)
(OR (CAR Z) (LOOP (CDR Z))))))
(LOOP X))))
executes iteratively in [SCHEME]{.sc, but would not execute
iteratively if the two-rule definition of OR were used.
We should perhaps say instead that <identifier> is
treated the same as (STATIC <identifier>). The
STATIC construction is included in SCHEME primarily for pedagogical purposes, to provide
symmetry, to (FLUID <identifier>). The fact that lone
atomic symbols are interpreted as lexical variables rather than dynamic
ones is in some sense arbitrary. Some critics of SCHEME [personal communications] have expressed a
certain horror that there are two kinds of variables, perhaps imagining
some confusion in the interpretation of simple identifiers. We can have
as many kinds of variables as we like (though we have so far discovered
only two kinds of any great use), as long as we can distinguish them. In
SCHEME we distinguish them with a special
marker, such as STATIC or FLUID; then, as a
convenience, we prescribe that simple atomic symbols, not marked by such
a keyword, shall also be interpreted as lexical variables, because that
is the kind we use most often in SCHEME. We
could as easily have defined simple symbols to be interpreted as fluid
variables, or for that matter as constants (as numbers and strings are).
We could also have prescribed a different method of distinguishing
types, e.g. “all variables beginning with I, J, K, L, M, or N shall be
fluid”. (This is not as silly as it sounds. A fairly wide-spread LISP convention is to spell global variables with
leading and trailing *, as in *FOO*, and some
programmers have wished that the compiler would automatically treat
variables so spelled as SPECIAL.) Indeed, given the
read-macro-character facility, we effectively have the syntactic rule
“all variables beginning with • shall be fluid”. We have settled on the
current definition of SCHEME as being the most
convenient both to implement and to use.
Compare the use of syntactic markers and read-macro-characters to the
constructions <GVAL X> = ,X = “global
value of X” and <LVAL X> =
.X = “local value of X” in MUDDLE [Galley and
Pfister]. Indeed, in MUDDLE a simple atomic
symbol is regarded as a constant, not as an identifier.
All this suggests another solution to the problem posed in {Note FUNCALL
is a Pain} (the confusion of magic forms with combinations). The
real problem is distinguishing a magic word from a variable. Let us
abbreviate (STATIC FOO) to ≡FOO, just as
(FLUID FOO) can be abbreviated as •FOO. Then
(≡LOOP A B) would have to be a call on
the function LOOP, and not a magic form. Similarly, we
could write (MAGICWORD COND) instead of COND,
and invent an abbreviation for that too. This all raises as many
problems as it solves by becoming too clumsy; but then again, maybe it
isn’t asking too much to require the user to write all magic words in
boldface (as in the ALGOL reference language) or
in italics (as in an early version of PLASMA [Smith and Hewitt])…
Comments by Carl Hewitt and Berthold Horn were of considerable value in preparing this paper. Ed Barton (who is also helping to maintain the PDP-10 MacLISP SCHEME implementation) made important contributions to the revisions of the language definition.
Steele, Guy Lewis Jr. LAMBDA: The Ultimate Declarative. AI Memo 379. MIT AI Lab (Cambridge, November 1976).
{{HTML transcription: research.scheme.org/lambda-papers/. }}
Galley, S.W. and Pfister, Greg. The MDL
Language. Programming Technology Division Document SYS.11.01.
Project MAC, MIT (Cambridge, November 1975).
{{See also The MDL Programming
Language] (transcription at ReadtheDocs.io) }}
Hewitt, Carl. Viewing Control Structures as Patterns of Passing Messages AI Journal 8, 3 (June 1977), 323-364.
Steele, Guy Lewis Jr., and Sussman, Gerald Jay. LAMBDA: The Ultimate Imperative. AI Lab Memo 353. MIT (Cambridge, March 1976).
{{HTML transcription: research.scheme.org/lambda-papers/. }}
Landin, Peter J. A Correspondence between ALGOL 60 and Church’s Lambda-Notation. CACM 8, 2-3 (February and March 1965).
McDermott, Drew V. and Sussman, Gerald Jay. The CONNIVER Reference
Manual. AI Memo 295a 259a. MIT AI Lab (Cambridge,
January 1974).
Moon, David A. MACLISP Reference Manual, Revision 0. Project MAC, MIT (Cambridge, April 1974).
Moses, Joel. The Function of FUNCTION in LISP. AI Memo 199, MIT AI Lab (Cambridge, June 1970).
Naur, Peter (ed.), et al. Revised Report on the Algorithmic Language ALGOL 60. Comm. ACM 6, 1 (January 1963), 1-20.
Steele, Guy Lewis Jr. Compiler Optimization Based on Viewing LAMBDA as Rename plus Goto. S.M. thesis. MIT (Cambridge, May 1977).
{{See research.scheme.org/lambda-papers/.
See also Compiler Optimization Based on Viewing LAMBDA as RENAME
plus GOTO in Artificial
Intelligence, an MIT Perspective, Volume 2 (MIT Press 1979)
pp401-431. }}
Reynolds, John C. Definitional Interpreters for Higher Order Programming Languages. ACM Conference Proceedings 1972.
Sussman, Gerald Jay, and Steele, Guy L. Jr. SCHEME: An Interpreter for Extended Lambda Calculus. AI Memo 349. MIT AI Lab (Cambridge, December 1975).
{{HTML transcription: research.scheme.org/lambda-papers/.
Republished with notes as
Sussman, G.J., Steele, G.L. Scheme:
A Interpreter for Extended Lambda Calculus. Higher-Order and
Symbolic Computation 11, 405–439 (1998).
https://doi.org/10.1023/A:1010035624696
See also: Sussman, G.J., Steele, G.L. The First Report on Scheme
Revisited. Higher-Order and Symbolic Computation 11, 399–404
(1998). https://doi.org/10.1023/A:1010079421970 }}
Smith, Brian C. and Hewitt, Carl. A PLASMA Primer (draft). MIT AI Lab (Cambridge, October 1975).