This report describes research done at the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology. Support for the laboratory’s artificial intelligence research is provided in part by the Advanced Research Projects Agency of the Department of Defense under Office of Naval Research Contract number N00014-80-C-0505.
© Copyright by the Massachusetts Institute of Technology; Cambridge, Mass. 02139. All rights reserved.
The Lisp Machine manual describes both the language and the "operating system" of the Lisp Machine. The language, a dialect of Lisp called Zetalisp, is completely documented by this manual. The software environment and operating-system-like parts of the system contain many things which are still in a state of flux. This manual confines itself primarily to the stabler parts of the system, and does not address the window system and user interface at all. That documentation will be released as a separate volume at a later time.
Any comments, suggestions, or criticisms will be welcomed. Please send Arpa network mail to BUG-LMMAN@MIT-AI.
Those not on the Arpanet may send U.S. mail to
Daniel L. Weinreb or David A. Moon
Room 926
545 Technology Square
Cambridge, Mass. 02139
The Lisp Machine is a product of the efforts of many people too numerous to list here and of the unique environment of the M.I.T. Artificial Intelligence Laboratory.
Portions of this manual were written by Richard Stallman, Mike McMahon, and Alan Bawden. The chapter on the LOOP iteration macro is a reprint of Laboratory for Computer Science memo TM-169, by Glenn Burke.
The Lisp Machine is a new computer system designed to provide a high performance and economical implementation of the Lisp language. It is a personal computation system, which means that processors and main memories are not time-multiplexed: when using a Lisp Machine, you get your own processor and memory system for the duration of the session. It is designed this way to relieve the problems of the running of large Lisp programs on time-sharing systems. Everything on the Lisp Machine is written in Lisp, including all system programs; there is never any need to program in machine language. The system is highly interactive.
The Lisp Machine executes a new dialect of Lisp called Zetalisp, developed at the M.I.T. Artificial Intelligence Laboratory for use in artificial intelligence research and related fields. It is closely related to the Maclisp dialect, and attempts to maintain a good degree of compatibility with Maclisp, while also providing many improvements and new features. Maclisp, in turn, is based on Lisp 1.5.
This document is the reference manual for the Zetalisp language. This document is not a tutorial, and it sometimes refers to functions and concepts that are not explained until later in the manual. It is assumed that you have a basic working knowledge of some Lisp dialect; you will be able to figure out the rest of the language from this manual.
There are also facilities explained in this manual that are not really part of the Lisp language. Some of these are subroutine packages of general use, and others are tools used in writing programs. However, the Lisp Machine window system, and the major utility programs, are not documented here.
The manual starts out with an explanation of the language. Chapter object-chapter explains the different primitive types of Lisp object, and presents some basic predicate functions for testing types. Chapter evaluator-chapter explains the process of evaluation, which is the heart of the Lisp language. Chapter flow-chapter introduces the basic Lisp control structures.
The next several chapters explain the details of the various primitive data-types of the language, and the functions that deal with them. Chapter cons-chapter deals with conses and the higher-level structures that can be built out of them, such as trees, lists, association lists, and property lists. Chapter symbol-chapter deals with symbols, chapter number-chapter with the various kinds of numbers, and chapter array-chapter with arrays. Chapter string-chapter explains character strings, which are a special kind of array.
After this there are some chapters that explain more about functions, function-calling, and related matters. Chapter function-chapter presents all the kinds of functions in the language, explains function-specs, and tells how to manipulate definitions of functions. Chapters closure-chapter and stack-group-chapter discuss closures and stack-groups, two facilities useful for creating coroutines and other advanced control and access structures.
Next, a few lower-level issues are dealt with. Chapter locative-chapter explains locatives, which are a kind of pointer to memory cells. Chapter subprimitive-chapter explains the "subprimitive" functions, which are primarily useful for implementation of the Lisp language itself and the Lisp Machine’s "operating system". Chapter area-chapter discusses areas, which give you control over storage allocation and locality of reference.
Chapter compiler-chapter discusses the Lisp compiler, which converts Lisp programs into "machine language". Chapter macros-chapter explains the Lisp macro facility, which allows users to write their own extensions to Lisp, extending both the interpreter and the compiler. The next two chapters go into detail about two such extensions, one that provides a powerful iteration control structure (chapter loop-chapter), and one that provides a powerful data structure facility (chapter defstruct-chapter).
Chapter flavor-chapter documents flavors, a language facility to provide generic functions using the paradigm used in Smalltalk and the Actor families of languages, called "object-oriented programming" or "message passing". Flavors are widely used by the system programs of the Lisp Machine, as well as being available to the user as a language feature.
Chapter io-chapter explains the Lisp Machine’s Input/Output system, including streams and the printed representation of Lisp objects. Chapter pathname-chapter documents how to deal with pathnames (the names of files).
Chapter package-chapter describes the package system, which allows many name spaces within a single Lisp environment. Chapter system-chapter documents the "system" facility, which helps you create and maintain programs that reside in many files.
Chapter process-chapter discusses the facilities for multiple processes and how to write programs that use concurrent computation. Chapter error-chapter explains how exceptional conditions (errors) can be handled by programs, handled by users, and debugged. Chapter code-chapter explains the instruction set of the Lisp Machine, and tells you how to examine the output of the compiler. Chapter query-chapter documents some functions for querying the user, chapter time-chapter explains some functions for manipulating dates and times, and chapter misc-chapter contains other miscellaneous functions and facilities.
There are several conventions of notation, and various points that should be understood before reading the manual to avoid confusion. This section explains those conventions.
The symbol "=>" will be used to indicate evaluation in examples. Thus, when you see "foo => nil", this means the same thing as "the result of evaluating foo is (or would have been) nil".
The symbol "==>" will be used to indicate macro expansion in examples. Thus, when you see "(foo bar) ==> (aref bar 0)", this means the same thing as "the result of macro-expanding (foo bar) is (or would have been) (aref bar 0)".
A typical description of a Lisp function looks like this:
function-name arg1 arg2 &optional arg3 (arg4 (foo 3))
The function-name function adds together arg1 and arg2, and then multiplies the result by arg3. If arg3 is not provided, the multiplication isn’t done. function-name then returns a list whose first element is this result and whose second element is arg4.
Examples:
(function-name 3 4) => (7 4)
(function-name 1 2 2 'bar) => (6 bar)
Note the use of fonts (typefaces). The name of the function is in bold-face in the first line of the description, and the arguments are in italics. Within the text, printed representations of Lisp objects are in a different bold-face font, such as (+ foo 56), and argument references are italicized, such as arg1 and arg2. A different, fixed-width font, such as function-name, is used for Lisp examples that are set off from the text.
The word "&optional" in the list of arguments tells you that all of the arguments past this point are optional. The default value can be specified explicitly, as with arg4 whose default value is the result of evaluating the form (foo 3). If no default value is specified, it is the symbol nil. This syntax is used in lambda-lists in the language, which are explained in lambda-list. Argument lists may also contain "&rest", which is part of the same syntax.
The descriptions of special forms and macros look like this:
This evaluates form three times and returns the result of the third evaluation.
This evaluates the forms with the symbol foo bound to nil.
It expands as follows:
(with-foo-bound-to-nil form1 form2 ...)
==> (let ((foo nil)) form1 form2 ...)
Since special forms and macros are the mechanism by which the syntax of Lisp is extended, their descriptions must describe both their syntax and their semantics; functions follow a simple consistent set of rules, but each special form is idiosyncratic. The syntax is displayed on the first line of the description using the following conventions. Italicized words are names of parts of the form which are referred to in the descriptive text. They are not arguments, even though they resemble the italicized words in the first line of a function description. Parentheses ("( )") stand for themselves. Square brackets ("[ ]") indicate that what they enclose is optional. Ellipses ("...") indicate that the subform (italicized word or parenthesized list) which precedes them may be repeated any number of times (possibly no times at all). Curly brackets followed by ellipses ("{ }...") indicate that what they enclose may be repeated any number of times. Thus the first line of the description of a special form is a "template" for what an instance of that special form would look like, with the surrounding parentheses removed.
The syntax of some special forms is sufficiently complicated
that it does not fit comfortably into this style; the first line of the
description of such a special form contains only the name, and the syntax is
given by example in the body of the description.
The semantics of a special form includes not only what it "does for a living", but also which subforms are evaluated and what the returned value is. Usually this will be clarified with one or more examples.
A convention used by many special forms is that all of their subforms after the first few are described as "body...". This means that the remaining subforms constitute the "body" of this special form; they are Lisp forms which are evaluated one after another in some environment established by the special form.
This ridiculous special form exhibits all of the syntactic features:
This twiddles the parameters of frob, which defaults to default-frob
if not specified. Each parameter is the name of one of the adjustable parameters of
a frob; each value is what value to set that parameter to. Any number
of parameter/value pairs may be specified. If any options are specified,
they are keywords which select which safety checks to override while twiddling
the parameters. If neither frob nor any options are specified, the
list of them may be omitted and the form may begin directly with the first
parameter name.
frob and the values are evaluated; the parameters and options are syntactic keywords and not evaluated. The returned value is the frob whose parameters were adjusted. An error is signalled if any safety checks are violated.
Methods, the message-passing equivalent of ordinary Lisp’s functions, are described in this style:
This is the documentation of the effect of sending a message named message-name, with arguments arg1, arg2, and arg3, to an instance of flavor flavor-name.
Descriptions of variables ("special" or "global" variables) look like this:
The variable typical-variable has a typical value....
Most numbers shown are in octal radix (base eight). Spelled out numbers and numbers followed by a decimal point are in decimal. This is because, by default, Zetalisp types out numbers in base 8; don’t be surprised by this. If you wish to change it, see the documentation on the variables ibase and base (ibase-var).
All uses of the phrase "Lisp reader", unless further qualified, refer to the part of Lisp which reads characters from I/O streams (the read function), and not the person reading this manual.
There are several terms which are used widely in other references on Lisp, but are not used much in this document since they have become largely obsolete and misleading. For the benefit of those who may have seen them before, they are: "S-expression", which means a Lisp object; "Dotted pair", which means a cons; and "Atom", which means, roughly, symbols and numbers and sometimes other things, but not conses. The terms "list" and "tree" are defined in list-and-tree.
The characters acute accent (') (also called "single quote") and semicolon (;) have special meanings when typed to Lisp; they are examples of what are called macro characters. Though the mechanism of macro characters is not of immediate interest to the new user, it is important to understand the effect of these two, which are used in the examples.
When the Lisp reader encounters a "'", it reads in the next Lisp object and encloses it in a quote special form. That is, 'foo-symbol turns into (quote foo-symbol), and '(cons 'a 'b) turns into (quote (cons (quote a) (quote b))). The reason for this is that "quote" would otherwise have to be typed in very frequently, and would look ugly.
The semicolon is used as a commenting character. When the
Lisp reader sees one, the remainder of the line is
discarded.
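As a small illustration of these two macro characters, the forms below can be typed to Lisp one at a time. This is only a sketch: the symbols a and b are arbitrary, and the values noted in the comments follow from the reading rules just described.

```lisp
;; 'a is read exactly as (quote a), so both forms yield the symbol a.
(eq 'a (quote a))          ; => t
;; Quoting an already quoted object exposes the list the reader built.
(car ''a)                  ; => quote
(equal ''a '(quote a))     ; => t
(cons 'a 'b)               ; the rest of this line is discarded by the reader
```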
The character "/" is used for quoting strange characters so that they are not interpreted in their usual way by the Lisp reader, but rather are treated the way normal alphabetic characters are treated. So, for example, in order to give a "/" to the reader, you must type "//", the first "/" quoting the second one. When a character is preceded by a "/" it is said to be slashified. Slashifying also turns off the effects of macro characters such as "'" and ";".
The following characters also have special meanings,
and may not be used in symbols without slashification. These characters
are explained in detail in the section on printed-representation
(reader).
"
Double-quote delimits character strings.
#
Number-sign introduces miscellaneous reader macros.
`
Backquote is used to construct list structure.
,
Comma is used in conjunction with backquote.
:
Colon is the package prefix.
|
Characters between pairs of vertical-bars are quoted.
⊗
Circle-cross lets you type in characters using their octal codes.
All Lisp code in this manual is written in lower case. In fact, the reader turns all symbols into upper-case, and consequently everything prints out in upper case. You may write programs in whichever case you prefer.
You will see various symbols that have the colon (:) character in their names. By convention, all "keyword" symbols in the Lisp Machine system have names starting with a colon. The colon character is not actually part of the print name, but is a package prefix indicating that the symbol belongs to the package with a null name, which means the user package. So, when you print such a symbol, you won’t see the colon if the current package is user. However, you should always type in the colons where the manual tells you to. This is all explained in chapter package-chapter; until you read that, just make believe that the colons are part of the names of the symbols, and don’t worry that they sometimes don’t get printed out for keyword symbols.
This manual documents a number of internal functions and variables, which can be identified by the "si:" prefix in their names. The "si" stands for "system internals". These functions and variables are documented here because they are things you sometimes need to know about. However, they are considered internal to the system and their behavior is not as guaranteed as that of everything else. They may be changed in the future.
Zetalisp is descended from Maclisp, and a good deal of effort was expended to try to allow Maclisp programs to run in Zetalisp. Throughout the manual, there are notes about differences between the dialects. For the new user, it is important to note that many functions herein exist solely for Maclisp compatibility; they should not be used in new programs. Such functions are clearly marked in the text.
The Lisp Machine character set is not quite the same as that used on I.T.S. or on Multics; it is described in full detail elsewhere in the manual. The important thing to note for now is that the character "newline" is the same as "return", and is represented by the number 215 octal. (This number should not be built into any programs.)
When the text speaks of "typing Control-Q" (for example), this means to hold down the CTRL key on the keyboard (either of the two), and, while holding it down, to strike the "Q" key. Similarly, to type "Meta-P", hold down either of the META keys and strike "P". To type "Control-Meta-T" hold down both CTRL and META. Unlike ASCII, there are no "control characters" in the character set; Control and Meta are merely things that can be typed on the keyboard.
Many of the functions refer to "areas". The area feature is only of interest to writers of large systems, and can be safely disregarded by the casual user. It is described in chapter area-chapter.
This section enumerates some of the various different primitive types of objects in Zetalisp. The types explained below include symbols, conses, various types of numbers, two kinds of compiled code objects, locatives, arrays, stack groups, and closures. With each is given the associated symbolic name, which is returned by the function data-type (data-type-fun).
A symbol (these are sometimes called "atoms" or "atomic symbols" by other texts) has a print name, a binding, a definition, a property list, and a package.
The print name is a string, which may be obtained by the function get-pname (get-pname-fun). This string serves as the printed representation (see printer) of the symbol. Each symbol has a binding (sometimes also called the "value"), which may be any Lisp object. It is also referred to sometimes as the "contents of the value cell", since internally every symbol has a cell called the value cell which holds the binding. It is accessed by the symeval function (symeval-fun), and updated by the set function (set-fun). (That is, given a symbol, you use symeval to find out what its binding is, and use set to change its binding.) Each symbol has a definition, which may also be any Lisp object. It is also referred to as the "contents of the function cell", since internally every symbol has a cell called the function cell which holds the definition. The definition can be accessed by the fsymeval function (fsymeval-fun), and updated with fset (fset-fun), although usually the functions fdefinition and fdefine are employed (fdefine-fun).
The property list is a list of an even number of elements; it can be accessed directly by plist (plist-fun), and updated directly by setplist (setplist-fun), although usually the functions get, putprop, and remprop (get-fun) are used. The property list is used to associate any number of additional attributes with a symbol; these are attributes not used frequently enough to deserve their own cells as the value and definition do. Symbols also have a package cell, which indicates which "package" of names the symbol belongs to. This is explained further in the section on packages (chapter package-chapter) and can be disregarded by the casual user.
The primitive function for creating symbols is make-symbol (make-symbol-fun), although most symbols are created by read, intern, or fasload (which call make-symbol themselves).
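The following sketch shows the value cell and make-symbol at work. The symbol foo and the print name "foo" are arbitrary choices for illustration; only set, ordinary evaluation of a symbol, and make-symbol are used.

```lisp
;; set stores a new binding in the value cell of the symbol foo.
(set 'foo 4)
foo                                            ; evaluating foo fetches it => 4
;; make-symbol creates a fresh symbol each time, even for the same print name.
(symbolp (make-symbol "foo"))                  ; => t
(eq (make-symbol "foo") (make-symbol "foo"))   ; => nil
```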
A cons is an object that cares about two other objects, arbitrarily named the car and the cdr. These objects can be accessed with car and cdr (car-fun), and updated with rplaca and rplacd (rplaca-fun). The primitive function for creating conses is cons (cons-fun).
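A brief sketch of these five functions working on a single cons; the variable name x and the symbols stored in it are arbitrary.

```lisp
(setq x (cons 'a 'b))   ; a new cons whose car is a and whose cdr is b
(car x)                 ; => a
(cdr x)                 ; => b
(rplaca x 'c)           ; replace the car of that same cons
(rplacd x 'd)           ; replace its cdr
(car x)                 ; => c
(cdr x)                 ; => d
```

Note that rplaca and rplacd alter the existing cons rather than creating a new one, so any other reference to the same cons sees the change.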
There are several kinds of numbers in Zetalisp. Fixnums represent integers in the range of -2^23 to 2^23-1. Bignums represent integers of arbitrary size, but they are more expensive to use than fixnums because they occupy storage and are slower. The system automatically converts between fixnums and bignums as required. Flonums are floating-point numbers. Small-flonums are another kind of floating-point number, with less range and precision, but less computational overhead. Other types of numbers are likely to be added in the future. See number for full details of these types and the conversions between them.
The usual form of compiled, executable code is a Lisp object called a "Function Entry Frame" or "FEF". A FEF contains the code for one function. This is analogous to what Maclisp calls a "subr pointer". FEFs are produced by the Lisp Compiler (compiler), and are usually found as the definitions of symbols. The printed representation of a FEF includes its name, so that it can be identified. Another Lisp object which represents executable code is a "micro-code entry". These are the microcoded primitive functions of the Lisp system, and user functions compiled into microcode.
About the only useful thing to do with any of these compiled code objects is to apply it to arguments. However, some functions are provided for examining such objects, for user convenience. See arglist (arglist-fun), args-info (args-info-fun), describe (describe-fun), and disassemble (disassemble-fun).
A locative (see locative) is a kind of pointer to a single memory cell anywhere in the system. The contents of this cell can be accessed by cdr (see cdr-fun) and updated by rplacd (see rplacd-fun).
An array (see array) is a set of cells indexed by a tuple of integer subscripts. The contents of the cells may be accessed and changed individually. There are several types of arrays. Some have cells which may contain any object, while others (numeric arrays) may only contain small positive numbers. Strings are a type of array; the elements are 8-bit unsigned numbers which encode characters.
A list is not a primitive data type, but rather a data structure made up out of conses and the symbol nil. See list-and-tree.
A predicate is a function which tests for some condition involving its arguments and returns the symbol t if the condition is true, or the symbol nil if it is not true. Most of the following predicates are for testing what data type an object has; some other general-purpose predicates are also explained.
By convention, the names of predicates usually end in the letter "p" (which stands for "predicate").
The following predicates are for testing data types. These predicates return t if the argument is of the type indicated by the name of the function, nil if it is of some other type.
symbolp returns t if its argument is a symbol, otherwise nil.
nsymbolp returns nil if its argument is a symbol, otherwise t.
listp returns t if its argument is a cons, otherwise nil. Note that this means (listp nil) is nil even though nil is the empty list. [This may be changed in the future.]
nlistp returns t if its argument is anything besides a cons, otherwise nil. nlistp is identical to atom, and so (nlistp nil) returns t. [This may be changed in the future, if and when listp is changed.]
The predicate atom returns t if its argument is not a cons, otherwise nil.
numberp returns t if its argument is any kind of number, otherwise nil.
fixp returns t if its argument is a fixed-point number, i.e. a fixnum or a bignum, otherwise nil.
floatp returns t if its argument is a floating-point number, i.e. a flonum or a small flonum, otherwise nil.
fixnump returns t if its argument is a fixnum, otherwise nil.
bigp returns t if arg is a bignum, otherwise nil.
flonump returns t if arg is a (large) flonum, otherwise nil.
small-floatp returns t if arg is a small flonum, otherwise nil.
stringp returns t if its argument is a string, otherwise nil.
arrayp returns t if its argument is an array, otherwise nil. Note that strings are arrays.
functionp returns t if its argument is a function (essentially, something that is acceptable as the first argument to apply), otherwise it returns nil.
In addition to interpreted, compiled, and microcoded functions, functionp is true of closures, select-methods (see select-method), and symbols whose function definition is functionp. functionp is not true of objects which can be called as functions but are not normally thought of as functions: arrays, stack groups, entities, and instances. If allow-special-forms is specified and non-nil, then functionp will be true of macros and special-form functions (those with quoted arguments). Normally functionp returns nil for these since they do not behave like functions.
As a special case, functionp of a symbol whose function definition is an array returns t, because in this case the array is being used as a function rather than as an object.
subrp returns t if its argument is any compiled code object, otherwise nil. The Lisp Machine system doesn’t use the term "subr", but the name of this function comes from Maclisp.
closurep returns t if its argument is a closure, otherwise nil.
entityp returns t if its argument is an entity, otherwise nil. See entity for information about "entities".
locativep returns t if its argument is a locative, otherwise nil.
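A few of the simpler predicates above, applied to sample objects. This is an illustrative sketch; the arguments are arbitrary, and each value noted follows from the descriptions given.

```lisp
(symbolp 'foo)          ; => t
(symbolp "foo")         ; => nil, a string is not a symbol
(listp (cons 'a 'b))    ; => t
(atom (cons 'a 'b))     ; => nil
(atom 'foo)             ; => t
(atom nil)              ; => t, since nil is not a cons
(numberp 5)             ; => t
(stringp "foo")         ; => t
(arrayp "foo")          ; => t, since strings are arrays
```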
typep is really two different functions. With one argument, typep is not really a predicate; it returns a symbol describing the type of its argument. With two arguments, typep is a predicate which returns t if arg is of type type, and nil otherwise. Note that an object can be "of" more than one type, since one type can be a subset of another.
The symbols that can be returned by typep of one argument are:
:symbol
arg is a symbol.
:fixnum
arg is a fixnum (not a bignum).
:bignum
arg is a bignum.
:flonum
arg is a flonum (not a small-flonum).
:small-flonum
arg is a small flonum.
:list
arg is a cons.
:locative
arg is a locative pointer (see locative).
:compiled-function
arg is the machine code for a compiled function (sometimes called a FEF).
:microcode-function
arg is a function written in microcode.
:closure
arg is a closure (see closure).
:select-method
arg is a select-method table (see select-method).
:stack-group
arg is a stack-group (see stack-group).
:string
arg is a string.
:array
arg is an array that is not a string.
:random
Returned for any built-in data type that does not fit into one of the above categories.
foo
An object of user-defined data type foo (any symbol). The primitive type of the object could be array, instance, or entity. See Named Structures, named-structure, and Flavors, flavor.
The type argument to typep of two arguments can be any of the above keyword symbols (except for :random), the name of a user-defined data type (either a named structure or a flavor), or one of the following additional symbols:
:atom
Any atom (as determined by the atom predicate).
:fix
Any kind of fixed-point number (fixnum or bignum).
:float
Any kind of floating-point number (flonum or small-flonum).
:number
Any kind of number.
:instance
An instance of any flavor. See flavor.
:entity
An entity. typep of one argument returns the name of the particular user-defined type of the entity, rather than :entity.
See also data-type, data-type-fun.
Note that (typep nil) => :symbol, and (typep nil ':list) => nil; the latter may be changed.
The following functions are some other general purpose predicates.
(eq x y) => t if and only if x and y are the same object. It should be noted that things that print the same are not necessarily eq to each other. In particular, numbers with the same value need not be eq, and two similar lists are usually not eq.
Examples:
(eq 'a 'b) => nil
(eq 'a 'a) => t
(eq (cons 'a 'b) (cons 'a 'b)) => nil
(setq x (cons 'a 'b)) (eq x x) => t
Note that in Zetalisp equal fixnums are eq; this is not true in Maclisp. Equality does not imply eq-ness for other types of numbers. To compare numbers, use =; see =-fun.
(neq x y) = (not (eq x y)). This is provided simply as an abbreviation for typing convenience.
The equal predicate returns t if its arguments are similar (isomorphic) objects (cf. eq). Two numbers are equal if they have the same value and type (for example, a flonum is never equal to a fixnum, even if = is true of them). For conses, equal is defined recursively as the two car’s being equal and the two cdr’s being equal. Two strings are equal if they have the same length, and the characters composing them are the same; see string-equal, string-equal-fun. Alphabetic case is ignored (but see alphabetic-case-affects-string-comparison, alphabetic-case-affects-string-comparison-var). All other objects are equal if and only if they are eq. Thus equal could have been defined by:
(defun equal (x y)
  (cond ((eq x y) t)
        ((neq (typep x) (typep y)) nil)
        ((numberp x) (= x y))
        ((stringp x) (string-equal x y))
        ((listp x) (and (equal (car x) (car y))
                        (equal (cdr x) (cdr y))))))
As a consequence of the above definition, it can be seen that equal may compute forever when applied to looped list structure. In addition, eq always implies equal; that is, if (eq a b) then (equal a b). An intuitive definition of equal (which is not quite correct) is that two objects are equal if they look the same when printed out. For example:
(setq a '(1 2 3))
(setq b '(1 2 3))
(eq a b) => nil
(equal a b) => t
(equal "Foo" "foo") => t
not returns t if x is nil, else nil.
null is the same as not; both functions are included for the sake of clarity. Use null to check whether something is nil; use not to invert the sense of a logical value. Even though Lisp uses the symbol nil to represent falseness, you shouldn’t make understanding of your program depend on this fortuitously. For example, one often writes:
(cond ((not (null lst)) ... )
( ... ))
rather than
(cond (lst ... )
( ... ))
There is no loss of efficiency, since these will compile into exactly the same instructions.
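The same style in a complete function might look like the sketch below. The function count-items is hypothetical and is defined here only for illustration; it is not part of the system.

```lisp
;; count-items is a made-up example: it counts the elements of a list.
;; (not (null lst)) states explicitly that the test is for a non-empty list.
(defun count-items (lst)
  (cond ((not (null lst)) (+ 1 (count-items (cdr lst))))
        (t 0)))

(count-items '(a b c))   ; => 3
(count-items nil)        ; => 0
```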
The following is a complete description of the actions taken by the evaluator, given a form to evaluate.
If form is a number, the result is form.
If form is a string, the result is form.
If form is a symbol, the result is the binding of form. If form is unbound, an error is signalled. The way symbols are bound is explained in variable-section below.
If form is not any of the above types, and is not a list, an error is signalled.
In all remaining cases, form is a list. The evaluator examines the car of the list to figure out what to do next. There are three possibilities: this form may be a special form, a macro form, or a plain-old function form. Conceptually, the evaluator knows specially about all the symbols whose appearance in the car of a form makes that form a special form, but the way the evaluator actually works is as follows. If the car of the form is a symbol, the evaluator finds the object in the function cell of the symbol (see symbol) and starts all over as if that object had been the car of the list. If the car isn’t a symbol, then if it’s a cons whose car is the symbol macro, then this is a macro form; if it is a "special function" (see special-function) then this is a special form; otherwise, it should be a regular function, and this is a function form.
If form is a special form, then it is handled accordingly; each special form works differently. All of them are documented in this manual. The internal workings of special forms are explained in more detail on special-function, but this hardly ever affects you.
If form is a macro form, then the macro is expanded as explained in chapter macros-chapter.
If form is a function form, it calls for the application of a function to arguments. The car of the form is a function or the name of a function. The cdr of the form is a list of subforms. Each subform is evaluated, sequentially. The values produced by evaluating the subforms are called the "arguments" to the function. The function is then applied to those arguments. Whatever results the function returns are the values of the original form.
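As a rough illustration of the rules above, the following forms could be typed at a Lisp listener. The variable x and the value given to it are assumptions made only for this sketch; the comments show the value each form should produce according to the description in this section.

```lisp
(setq x 3)                   ;Give the symbol x a value (hypothetical variable).

12                           ;A number evaluates to itself => 12
"hello"                      ;A string evaluates to itself => "hello"
x                            ;A symbol evaluates to its binding => 3

(+ x (* 2 5))                ;Function form: x => 3, (* 2 5) => 10,
                             ;then + is applied to 3 and 10 => 13
(list x (+ x 1) "done")      ;Subforms are evaluated in order => (3 4 "done")
```

In the last two forms the car names a regular function, so each subform in the cdr is evaluated first and the function is then applied to the resulting arguments.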
There is a lot more to be said about evaluation. The way variables work and the ways in which they are manipulated, including the binding of arguments, are explained in variable-section. A basic explanation of functions is in function-section. The way functions can return more than one value is explained in multiple-value. The description of all of the kinds of functions, and the means by which they are manipulated, is in chapter function-chapter. Macros are explained in chapter macros-chapter. The evalhook facility, which lets you do something arbitrary whenever the evaluator is invoked, is explained in evalhook-section. Special forms are described all over the manual; each special form is in the section on the facility it is part of.
In Zetalisp, variables are implemented using symbols. Symbols are used for many things in the language, such as naming functions, naming special forms, and being keywords; they are also useful to programs written in Lisp, as parts of data structures. But when the evaluator is given a symbol, it treats it as a variable, using the value cell to hold the value of the variable. If you evaluate a symbol, you get back the contents of the symbol’s value cell.
There are two different ways of changing the value of a variable. One is to set the variable. Setting a variable changes its value to a new Lisp object, and the previous value of the variable is forgotten. Setting of variables is usually done with the setq special form.
The other way to change the value of a variable is with binding (also called "lambda-binding"). When a variable is bound, its old value is first saved away, and then the value of the variable is made to be the new Lisp object. When the binding is undone, the saved value is restored to be the value of the variable. Bindings are always followed by unbindings. The way this is enforced is that binding is only done by special forms that are defined to bind some variables, then evaluate some subforms, and then unbind those variables. So the variables are all unbound when the form is finished. This means that the evaluation of the form doesn’t disturb the values of the variables that are bound; whatever their old value was, before the evaluation of the form, gets restored when the evaluation of the form is completed. If such a form is exited by a non-local exit of any kind, such as *throw (see *throw-fun) or return (see return-fun), the bindings are undone whenever the form is exited.
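The difference between setting and binding can be sketched with a small example. The variable name depth is hypothetical and chosen only for illustration; the values in the comments follow from the description above.

```lisp
(setq depth 0)                 ;Set: depth now holds 0.

(let ((depth 5))               ;Bind: the old value 0 is saved away.
  (setq depth (+ depth 1))     ;Setting inside the binding changes the
  depth)                       ;current binding only => 6

depth                          ;The binding was undone => 0

(setq depth 1)                 ;Set again: the old value 0 is forgotten.
depth                          ;=> 1
```

The setq inside the let does not survive the let, because what it changed was the binding that the let later undid.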
The simplest construct for binding variables is the let special form. The do and prog special forms can also bind variables, in the same way let does, but they also control the flow of the program and so are explained elsewhere (see do-fun). let* is just a sequential version of let; the other special forms below are only used for esoteric purposes.
Binding is an important part of the process of applying interpreted functions to arguments. This is explained in the next section.
When a Lisp function is compiled, the compiler understands the use of symbols as variables. However, the compiled code generated by the compiler does not actually use symbols to represent variables. Rather, the compiler converts the references to variables within the program into more efficient references, that do not involve symbols at all. A variable that has been changed by the compiler so that it is not implemented as a symbol is called a "local" variable. When a local variable is bound, a memory cell is allocated in a hidden, internal place (the Lisp control stack) and the value of the variable is stored in this cell. You cannot use a local variable without first binding it; you can only use a local variable inside of a special form that binds that variable. Local variables do not have any "top level" value; they do not even exist outside of the form that binds them.
The variables which are associated with symbols (the kind which are used by non-compiled programs) are called "special" variables.
Local variables and special variables do not behave quite the same way, because "binding" means different things for the two of them. Binding a special variable saves the old value away and then uses the value cell of the symbol to hold the new value, as explained above. Binding a local variable, however, does not do anything to the symbol. In fact, it creates a new memory cell to hold the value, i.e., a new local variable.
Thus, if you compile a function, it may do different things after it has been compiled. Here is an example:
(setq a 2)              ;Set the variable a to the value 2.
(defun foo ()           ;Define a function named foo.
  (let ((a 5))          ;Bind the symbol a to the value 5.
    (bar)))             ;Call the function bar.
(defun bar ()           ;Define a function named bar.
  a)                    ;It just returns the value of the variable a.
(foo) => 5              ;Calling foo returns 5.
(compile 'foo)          ;Now compile foo.
(foo) => 2              ;This time, calling foo returns 2.
This is a very bad thing, because the compiler is only supposed to speed things up, without changing what the function does. Why did the function foo do something different when it was compiled? Because a was converted from a special variable into a local variable. After foo was compiled, it no longer had any effect on the value cell of the symbol a, and so the symbol retained its old contents, namely 2.
In most uses of variables in Lisp programs, this problem doesn’t come up. The reason it happened here is that the function bar refers to the symbol a without first binding a to anything. A reference to a variable that you didn’t bind yourself is called a free reference; in this example, bar makes a free reference to a.
We mentioned above that you can’t use a local variable without first binding it. Another way to say this is that you can’t ever have a free reference to a local variable. If you try to do so, the compiler will complain. In order for our functions to work, the compiler must be told not to convert a into a local variable; a must remain a special variable. Normally, when a function is compiled, all variables in it are made to be "local". You can stop the compiler from making a variable local by "declaring" to the compiler that the variable is "special". When the compiler sees references to a variable that has been declared special, it uses the symbol itself as the variable instead of making a local variable.
Variables can be declared by the special forms defvar and defconst (see below), or by explicit compiler declarations (see special-fun). The most common use of special variables is as "global" variables: variables used by many different functions throughout a program, that have top-level values.
Had bar been compiled, the compiler would have seen the free reference and printed a warning message: Warning: a declared special. It would have automatically declared a to be special and proceeded with the compilation. It knows that free references mean that special declarations are needed. But when a function is compiled that binds a variable that you want to be treated as a special variable but that you have not explicitly declared, there is, in general, no way for the compiler to automatically detect what has happened, and it will produce incorrect output. So you must always provide declarations for all variables that you want to be treated as special variables.
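One way the earlier foo and bar example might be repaired is sketched below. It assumes that the declaration is made with defvar, one of the declaring forms mentioned above, and that the defvar is seen before foo is compiled; with a declared special, the binding made in foo stays visible to bar whether or not foo is compiled.

```lisp
(defvar a 2)            ;Declare a special and give it the value 2.

(defun foo ()
  (let ((a 5))          ;a is special, so this binds the symbol's value cell.
    (bar)))

(defun bar ()
  a)                    ;Free reference to the special variable a.

(foo)                   ;=> 5
(compile 'foo)          ;Compiling no longer changes the behavior.
(foo)                   ;=> 5
a                       ;=> 2, the binding was undone when foo returned.
```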
When you declare a variable to be special using declare rather than local-declare, the declaration is "global"; that is, it applies wherever that variable name is seen. After fuzz has been declared special using declare, all following uses of fuzz will be treated by the compiler as references to the same special variable. Such variables are called "global variables", because any function can use them; their scope is not limited to one function. The special forms defvar and defconst are useful for creating global variables; not only do they declare the variable special, but they also provide a place to specify its initial value, and a place to add documentation. In addition, since the names of these special forms start with "def" and since they are used at the top-level of files, the Lisp Machine editor can find them easily.
Here are the special forms used for setting variables.
The setq special form is used to set the value of a variable or of many variables. The first value is evaluated, and the first variable is set to the result. Then the second value is evaluated, the second variable is set to the result, and so on for all the variable/value pairs. setq returns the last value, i.e., the result of the evaluation of its last subform.
Example:
(setq x (+ 3 2 1) y (cons x nil))
x is set to 6, y is set to (6), and the setq form returns (6). Note that the first variable was set before the second value form was evaluated, allowing that form to use the new value of x.
A psetq form is just like a setq form, except that the variables are set "in parallel"; first all of the value forms are evaluated, and then the variables are set to the resulting values.
Example:
(setq a 1)
(setq b 2)
(psetq a b b a)
a => 2
b => 1
Here are the special forms used for binding variables.
let is used to bind some variables to some objects, and evaluate some forms (the "body") in the context of those bindings. A let form looks like
(let ((var1 vform1) (var2 vform2) ...) bform1 bform2 ...)
When this form is evaluated, first the vforms (the values) are evaluated. Then the vars are bound to the values returned by the corresponding vforms. Thus the bindings happen in parallel; all the vforms are evaluated before any of the vars are bound. Finally, the bforms (the body) are evaluated sequentially, the old values of the variables are restored, and the result of the last bform is returned.
You may omit the vform from a let clause, in which case it is as if the vform were nil: the variable is bound to nil. Furthermore, you may replace the entire clause (the list of the variable and form) with just the variable, which also means that the variable gets bound to nil. Example:
(let ((a (+ 3 3)) (b 'foo) (c) d) ...)
Within the body, a is bound to 6, b is bound to foo, c is bound to nil, and d is bound to nil.
let* is the same as let except that the binding is sequential. Each var is bound to the value of its vform before the next vform is evaluated. This is useful when the computation of a vform depends on the value of a variable bound in an earlier vform. Example:
(let* ((a (+ 1 2)) (b (+ a a))) ...)
Within the body, a is bound to 3 and b is bound to 6.
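The contrast between parallel and sequential binding shows up most clearly when a vform mentions a variable that the same form also binds. The following sketch uses an assumed top-level value of 1 for a; the results follow from the descriptions of let and let* above.

```lisp
(setq a 1)                     ;Top-level value of a (assumed for the example).

(let ((a 10)
      (b a))                   ;Both vforms are evaluated before any binding,
  (list a b))                  ;so b gets the outer a => (10 1)

(let* ((a 10)
       (b a))                  ;a is bound to 10 before this vform is
  (list a b))                  ;evaluated, so b gets the new a => (10 10)

a                              ;=> 1 once both forms have finished
```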
let-if is a variant of let in which the binding of variables is conditional. The variables must all be special variables. The let-if special form, typically written as
(let-if cond ((var-1 val-1) (var-2 val-2)...) body-form1 body-form2...)
first evaluates the predicate form cond. If the result is non-nil, the value forms val-1, val-2, etc. are evaluated and then the variables var-1, var-2, etc. are bound to them. If the result is nil, the vars and vals are ignored. Finally the body forms are evaluated.
let-globally is similar in form to let (see let-fun). The difference is that let-globally does not bind the variables; instead, it saves the old values and sets the variables, and sets up an unwind-protect (see unwind-protect-fun) to set them back. The important difference between let-globally and let is that when the current stack group (see stack-group) co-calls some other stack group, the old values of the variables are not restored. Thus let-globally makes the new values visible in all stack groups and processes that don’t bind the variables themselves, not just the current stack group.
progv is a special form to provide the user with extra control over binding. It binds a list of special variables to a list of values, and then evaluates some forms. The lists of special variables and values are computed quantities; this is what makes progv different from let, prog, and do.
progv first evaluates symbol-list and value-list, and then binds each symbol to the corresponding value. If too few values are supplied, the remaining symbols are bound to nil. If too many values are supplied, the excess values are ignored.
After the symbols have been bound to the values, the body forms are evaluated, and finally the symbols’ bindings are undone. The result returned is the value of the last form in the body.
Example:
(setq a 'foo b 'bar)
(progv (list a b 'b) (list b)
  (list a b foo bar))
    => (foo nil bar nil)
During the evaluation of the body of this progv, foo is bound to bar, bar is bound to nil, b is bound to nil, and a retains its top-level value foo.
progw is a somewhat modified kind of progv. Like progv, it only works for special variables. First, vars-and-val-forms-form is evaluated. Its value should be a list that looks like the first subform of a let:
((var1 val-form-1) (var2 val-form-2) ...)
Each element of this list is processed in turn, by evaluating the val-form, and binding the var to the resulting value. Finally, the body forms are evaluated sequentially, the bindings are undone, and the result of the last form is returned. Note that the bindings are sequential, not parallel.
This is a very unusual special form because of the way the evaluator is called on the result of an evaluation. Thus progw is mainly useful for implementing special forms and for functions part of whose contract is that they call the interpreter. For an example of the latter, see sys:*break-bindings* (sys:*break-bindings*-var); break implements this by using progw.
Here are the special forms for defining special variables.
defvar is the recommended way to declare the use of a global variable in a program. Placed at top level in a file,
(defvar variable)
declares variable special for the sake of compilation, and records its location for the sake of the editor so that you can ask to see where the variable is defined. If a second subform is supplied,
(defvar variable initial-value)
variable is initialized to the result of evaluating the form initial-value unless it already has a value, in which case it keeps that value. initial-value is not evaluated unless it is used; this is useful if it does something expensive like creating a large data structure.
defvar should be used only at top level, never in function definitions, and only for global variables (those used by more than one function). (defvar foo 'bar) is roughly equivalent to
(declare (special foo))
(if (not (boundp 'foo))
    (setq foo 'bar))
(defvar variable initial-value documentation)
allows you to include a documentation string which describes what the variable is for or how it is to be used. Using such a documentation string is even better than commenting the use of the variable, because the documentation string is accessible to system programs that can show the documentation to you while you are using the machine.
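The effect of the "unless it already has a value" rule can be sketched as follows. The variable name request-count and its documentation string are hypothetical, and the sketch assumes the forms are loaded in the ordinary way, for instance when a file is loaded a second time.

```lisp
(defvar request-count 0
  "Number of requests handled so far.")   ;First evaluation: initialized to 0.

(setq request-count 7)                    ;The program updates its state.

(defvar request-count 0
  "Number of requests handled so far.")   ;Evaluated again: the variable already
                                          ;has a value, so it is left alone.

request-count                             ;=> 7
```

This is what makes it safe to reload a file of definitions without losing the state that the running program has accumulated in its global variables.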
If defvar is used in a patch file (see patch-facility), or is a single form (not a region) evaluated with the editor’s compile/evaluate from buffer commands, then if there is an initial-value the variable is always set to it, regardless of whether it is already bound.
defconst is the same as defvar except that if an initial value is given the variable is always set to it regardless of whether it is already bound. The rationale for this is that defvar declares a global variable, whose value is initialized to something but will then be changed by the functions that use it to maintain some state. On the other hand, defconst declares a constant, whose value will never be changed by the normal operation of the program, only by changes to the program.
defconst always sets the variable to the specified value so that if, while developing or debugging the program, you change your mind about what the constant value should be, and then you evaluate the defconst form again, the variable will get the new value.
It is not the intent of defconst to declare that the value of variable will never change; for example, defconst is not license to the compiler to build assumptions about the value of variable into programs being compiled.
In the description of evaluation on description-of-evaluation, we said that evaluation of a function form works by applying the function to the results of evaluating the argument subforms. What is a function, and what does it mean to apply it? In Zetalisp there are many kinds of functions, and applying them may do many different kinds of things. For full details, see function-functions. Here we will explain the most basic kinds of functions and how they work. In particular, this section explains lambda lists and all their important features.
The simplest kind of user-defined function is the lambda-expression, which is a list that looks like:
(lambda lambda-list body1 body2...)
The first element of the lambda-expression is the symbol lambda; the second element is a list called the lambda list, and the rest of the elements are called the body. The lambda list, in its simplest form, is just a list of variables. Assuming that this simple form is being used, here is what happens when a lambda expression is applied to some arguments. First, the number of arguments and the number of variables in the lambda list must be the same, or else an error is signalled. Each variable is bound to the corresponding argument value. Then the forms of the body are evaluated sequentially. After this, the bindings are all undone, and the value of the last form in the body is returned.
This may sound something like the description of let, above. The most important difference is that the lambda-expression is not a form at all; if you try to evaluate a lambda-expression, you will get told that lambda is not a defined function. The lambda-expression is a function, not a form. A let form gets evaluated, and the values to which the variables are bound come from the evaluation of some subforms inside the let form; a lambda-expression gets applied, and the values are the arguments to which it is applied.
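The parallel between the two can be sketched by writing the same computation both ways. In the second form the lambda-expression appears as the car of a function form, which is one of the places where the evaluator accepts a function; the variable names x and y are arbitrary.

```lisp
(let ((x 3) (y 4))             ;The values come from subforms of the let.
  (+ x y))                     ;=> 7

((lambda (x y) (+ x y))        ;The lambda-expression is applied to the
 3 4)                          ;arguments 3 and 4 => 7
```

In both cases x and y are bound, the body is evaluated, and the bindings are undone before the value 7 is returned.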
The variables in the lambda list are sometimes called parameters, by analogy with other languages. Some other terminologies would refer to these as formal parameters, and to arguments as actual parameters.
Lambda lists can have more complex structure than simply being a list of variables. There are additional features accessible by using certain keywords (which start with &) and/or lists as elements of the lambda list.
The principal weakness of the simple lambda lists is that any function written with one must only take a certain, fixed number of arguments. As we know, many very useful functions, such as list, append, +, and so on, accept a varying number of arguments. Maclisp solved this problem by the use of lexprs and lsubrs, which were somewhat inelegant since the parameters had to be referred to by numbers instead of names (e.g., (arg 3)). (For compatibility reasons, Zetalisp supports lexprs, but they should not be used in new programs.) Simple lambda lists also require that arguments be matched with parameters by their position in the sequence. This makes calls hard to read when there are a great many arguments. Keyword parameters enable the use of other styles of call which are more readable.
In general, a function in Zetalisp has zero or more positional parameters, followed if desired by a single rest parameter, followed by zero or more keyword parameters. The positional and keyword parameters may be required or optional, but all the optional parameters must follow all the required ones. The required/optional distinction does not apply to the rest parameter.
The caller must provide enough arguments so that each of the required parameters gets bound, but he may provide extra arguments, for some of the optional parameters. Also, if there is a rest parameter, he can provide as many extra arguments as he wants, and the rest parameter will be bound to a list of all these extras. Optional parameters may have a default-form, which is a form to be evaluated to produce the default value for the parameter if no argument is supplied.
Positional parameters are matched with arguments by the position of
the arguments in the argument list. Keyword parameters are matched
with their arguments by matching the keyword name; the arguments need
not appear in the same order as the parameters. If an optional
positional argument is omitted, then no further arguments can be
present. Keyword parameters allow the caller to decide independently
for each one whether to specify it.
Here is the exact explanation of how this all works. When apply (the primitive function that applies functions to arguments) matches up the arguments with the parameters, it uses the following algorithm:
The positional parameters are dealt with first.
The first required positional parameter is bound to the first argument. apply continues to bind successive required positional parameters to the successive arguments. If, during this process, there are no arguments left but there are still some required parameters (positional or keyword) which have not been bound yet, it is an error ("too few arguments").
Next, after all required parameters are handled, apply continues with the optional positional parameters, if any. It binds each successive parameter to the next argument. If, during this process, there are no arguments left, each remaining optional parameter’s default-form is evaluated, and the parameter is bound to it. This is done one parameter at a time; that is, first one default-form is evaluated, and then the parameter is bound to it, then the next default-form is evaluated, and so on. This allows the default for an argument to depend on the previous argument.
Now, if there are no remaining parameters (rest or keyword), and there are no
remaining arguments, we are finished. If there are no more parameters
but there are still some arguments remaining, an error is caused ("too
many arguments"). If parameters remain, all the remaining arguments
are used for both the rest parameter, if any, and the keyword
parameters.
First, if there is a rest parameter, it is bound to a list of all the remaining arguments. If there are no remaining arguments, it gets bound to nil.
If there are keyword parameters, the same remaining arguments are
used to bind them, as follows.
The arguments for the keyword parameters are treated as a list of alternating keyword symbols and associated values. Each symbol is matched with the keyword parameter names, and the matching keyword parameter is bound to the value which follows the symbol. All the remaining arguments are treated in this way. Since the arguments are usually obtained by evaluation, those arguments which are keyword symbols are typically quoted in the call; but they do not have to be.
The keyword symbols are compared by means of eq, which means they must be specified in the correct package. The keyword symbol for a parameter has the same print name as the parameter, but resides in the keyword package regardless of what package the parameter name itself resides in. (You can specify the keyword symbol explicitly in the lambda list if you must; see below.)
If any keyword parameter has not received a value when all the
arguments have been processed, this is an error if the parameter is
required. If it is optional, the default-form for the parameter is
evaluated and the parameter is bound to its value.
There may be a keyword symbol among the arguments which does not match any
keyword parameter name. The function itself specifies whether this is
an error. If it is not an error, then the non-matching symbols and
their associated values are ignored. The function can access these
symbols and values through the rest parameter, if there is one.
It is common for a function to check only for certain keywords, and pass its rest parameter to another function using lexpr-funcall; then that function will check for the keywords that concern it.
The way you express which parameters are required, optional, and rest is by means of specially recognized symbols, which are called &-keywords, in the lambda list. All such symbols’ print names begin with the character "&". A list of all such symbols is the value of the symbol lambda-list-keywords. The keywords used here are &key, &optional and &rest. The way they are used is best explained by means of examples; the following are typical lambda lists, followed by descriptions of which parameters are positional, rest or keyword; and required or optional.
(a b c)
a, b, and c are all required and positional. The function must be passed three arguments.
(a b &optional c)
a and b are required, c is optional. All three are positional. The function may be passed either two or three arguments.
(&optional a b c)
a, b, and c are all optional and positional. The function may be passed any number of arguments between zero and three, inclusive.
(&rest a)
a is a rest parameter. The function may be passed any number of arguments.
(a b &optional c d &rest e)
a and b are required positional, c and d are optional positional, and e is rest. The function may be passed two or more arguments.
(&key a b)
a and b are both required keyword parameters. A typical call would look like
(foo ':b 69 ':a '(some elements))
This illustrates that the parameters can be matched in either order.
(&key a &optional b)
a is required keyword, and b is optional keyword. The sample call above would be legal for this function also; so would
(foo ':a '(some elements))
which doesn’t specify b.
(x &optional y &rest z &key a b)
x is required positional, y is optional positional, z is rest, and a and b are optional keyword. One or more arguments are allowed. One or two arguments specify only the positional parameters. Arguments beyond the second specify both the rest parameter and the keyword parameters, so that
(foo 1 2 ':b '(a list))
specifies 1 for x, 2 for y, (:b (a list)) for z, and (a list) for b. It does not specify a.
z is rest, and a, b and c are optional keyword parameters. &allow-other-keys says that absolutely any keyword symbols may appear among the arguments; these symbols and the values that follow them have no effect on the keyword parameters, but do become part of the value of z.
This is equivalent to (&rest z). So, for that matter, is the previous example, if the function does not use the values of a, b and c.
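To make the interaction of positional, rest, and keyword parameters concrete, here is a small sketch. The function foo and its body are hypothetical; its lambda list has one required positional parameter, one optional positional parameter, a rest parameter, and two optional keyword parameters, and the body simply returns all five values in a list so that the matching can be seen.

```lisp
(defun foo (x &optional y &rest z &key a b)
  (list x y z a b))            ;Return every parameter for inspection.

(foo 1)                        ;=> (1 nil nil nil nil)
(foo 1 2)                      ;=> (1 2 nil nil nil)
(foo 1 2 ':b '(a list))        ;=> (1 2 (:b (a list)) nil (a list))
(foo 1 2 ':a 3 ':b 4)          ;=> (1 2 (:a 3 :b 4) 3 4)
```

Note that the arguments after the second appear twice: once as the value of the rest parameter z, and once distributed among the keyword parameters.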
In all of the cases above, the default-form for each optional parameter is nil. To specify your own default forms, instead of putting a symbol as the element of a lambda list, put in a list whose first element is the symbol (the parameter itself) and whose second element is the default-form. Only optional parameters may have default forms; required parameters are never defaulted, and rest parameters always default to nil. For example:
(a &optional (b 3))
The default-form for b is 3. a is a required parameter, and so it doesn’t have a default form.
a’s default-form is 'foo, b’s is nil, and c’s is (symeval a). Note that if the function whose lambda list this is were called on no arguments, a would be bound to the symbol foo, and c would be bound to the binding of the symbol foo; this illustrates the fact that each variable is bound immediately after its default-form is evaluated, and so later default-forms may take advantage of earlier parameters in the lambda list. b and d would be bound to nil.
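A default-form that refers to an earlier parameter might be used as in the following sketch. The function name span and its parameters are hypothetical; the default for end is computed from start, which has already been bound by the time the default-form is evaluated.

```lisp
(defun span (start &optional (end (+ start 10)) &rest extras)
  (list start end extras))     ;Return every parameter for inspection.

(span 1)                       ;end is defaulted from start => (1 11 nil)
(span 1 5)                     ;end is supplied             => (1 5 nil)
(span 1 5 'a 'b)               ;extras collects the rest    => (1 5 (a b))
```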
For a keyword parameter, you normally specify the variable name, and the keyword proper is usually computed from it. You can specify the keyword symbol independently if you need to. To do this, use a two-level list instead of a symbol: ((keyword variable)). The top level of list can also contain a default value and supplied-p variable, for optional arguments.
The function with this argument list will accept two keywords foo:a and foo:b, which will set variables a and b.
Occasionally it is important to know whether a certain optional parameter was defaulted or not. You can’t tell from just examining its value, since if the value is the default value, there’s no way to tell whether the caller passed that value explicitly, or whether the caller didn’t pass any value and the parameter was defaulted. The way to tell for sure is to put a third element into the list: the third element should be a variable (a symbol), and that variable is bound to nil if the parameter was not passed by the caller (and so was defaulted), or t if the parameter was passed. The new variable is called a "supplied-p" variable; it is bound to t if the parameter is supplied. For example:
The default-form for b is 3, and the "supplied-p" variable for b is c. If the function is called with one argument, b will be bound to 3 and c will be bound to nil. If the function is called with two arguments, b will be bound to the value that was passed by the caller (which might be 3), and c will be bound to t.
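A supplied-p variable could be used along the following lines. The function show-b and the variable name b-supplied are hypothetical; the point of the sketch is that the two calls produce the same value for b but can still be told apart.

```lisp
(defun show-b (a &optional (b 3 b-supplied))
  (list a b b-supplied))       ;Return the parameters and the supplied-p flag.

(show-b 1)                     ;b was defaulted        => (1 3 nil)
(show-b 1 3)                   ;b was passed, and is 3 => (1 3 t)
(show-b 1 4)                   ;b was passed, and is 4 => (1 4 t)
```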
It is possible to specify a keyword parameter’s symbol independently of its parameter name. To do this, use two nested lists to specify the parameter. The outer list is the one which can contain the default-form and supplied-p variable, if the parameter is optional. The first element of this list, instead of a symbol, is again a list, whose elements are the keyword symbol and the parameter variable name. For example:
This is equivalent to (&key a &optional (b t)).
This allows a keyword which the user will know under the name :base, without making the parameter shadow the value of base, which is used for printing numbers.
It is also possible to include, in the lambda list, some other symbols, which are bound to the values of their default-forms upon entry to the function. These are not parameters, and they are never bound to arguments; they just get bound, as if they appeared in a let form. (Whether you use these aux-variables or bind the variables with let is a stylistic decision.) To include such symbols, put them after any parameters, preceded by the &-keyword &aux. Examples:
d, e, and f are bound, when the function is called, to nil, 5, and a cons of the first argument and 5. Note that aux-variables are bound sequentially rather than in parallel.
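The behavior of aux-variables can be sketched by writing the same function twice, once with &aux and once with let*. Both function names are hypothetical, and let* rather than let is used in the second version because aux-variables are bound sequentially.

```lisp
(defun aux-demo (a &aux d (e 5) (f (cons a e)))
  (list d e f))                ;d, e and f are never bound to arguments.

(defun let-demo (a)
  (let* ((d nil)
         (e 5)
         (f (cons a e)))       ;Sequential, so f may use e.
    (list d e f)))

(aux-demo 1)                   ;=> (nil 5 (1 . 5))
(let-demo 1)                   ;=> (nil 5 (1 . 5))
```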
It is important to realize that the list of arguments to which a rest-parameter is bound is set up in whatever way is most efficiently implemented, rather than in the way that is most convenient for the function receiving the arguments. It is not guaranteed to be a "real" list. Sometimes the rest-args list is stored in the function-calling stack, and loses its validity when the function returns. If a rest-argument is to be returned or made part of permanent list-structure, it must first be copied (see copylist, page copylist-fun), as you must always assume that it is one of these special lists. The system will not detect the error of omitting to copy a rest-argument; you will simply find that you have a value which seems to change behind your back. At other times the rest-args list will be an argument that was given to apply; therefore it is not safe to rplaca this list as you may modify permanent data structure. An attempt to rplacd a rest-args list will be unsafe in this case, while in the first case it would cause an error, since lists in the stack are impossible to rplacd.
There are some other keywords in addition to those mentioned here. See lambda-list-keywords for a complete list. You only need to know about &optional and &rest in order to understand this manual.
Lambda lists provide "positional" arguments: the meaning of an argument comes from its position in the lambda list. For example, the first argument to cons is the object that will be the car of the new cons. Sometimes it is desirable to use "keyword" arguments, in which the meaning of an argument comes from a "keyword" symbol that tells the callee which argument this is. While lambda lists do not provide keyword arguments directly, there is a convention for functions that want arguments passed to them in the keyword fashion. The convention is that the function takes a rest-argument, whose value is a list of alternating keyword symbols and argument values. If cons were written as a keyword-style function, then instead of saying
(cons 4 (foo))
you could say either of
(cons ':car 4 ':cdr (foo))
or
(cons ':cdr (foo) ':car 4)
assuming the keyword symbols were :car
and :cdr
. Keyword symbols
are always in the keyword package,
and so their printed representations always start with a colon; the reason
for this is given in chapter package-chapter.
This use of keyword arguments is only a convention; it is not built into
the function-calling mechanism of the language. Your function must
contain Lisp programming to take apart the rest parameter and make sense
of the keywords and values. The special form keyword-extract
(see
keyword-extract-fun) may be useful for this.
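A minimal sketch of this convention, written without keyword-extract: the hypothetical function keyword-cons below (the name and its local variables are assumptions made for illustration) walks its rest-argument two elements at a time and picks out the values that follow :car and :cdr.

```lisp
;; Hypothetical keyword-style version of cons, for illustration only.
;; The rest-argument is a list of alternating keywords and values.
(defun keyword-cons (&rest keys)
  (do ((l keys (cddr l))       ; step over one keyword/value pair
       (the-car nil)
       (the-cdr nil))
      ((null l) (cons the-car the-cdr))
    (cond ((eq (car l) ':car) (setq the-car (cadr l)))
          ((eq (car l) ':cdr) (setq the-cdr (cadr l))))))

(keyword-cons ':car 4 ':cdr 5)   ; => (4 . 5)
(keyword-cons ':cdr 5 ':car 4)   ; => (4 . 5)
```

Unrecognized keywords are silently ignored in this sketch; a real function would probably want to signal an error instead.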
This section describes some functions and special forms. Some are parts of the evaluator, or closely related to it. Some have to do specifically with issues discussed above such as keyword arguments. Some are just fundamental Lisp forms that are very important.
(eval x)
evaluates x, and returns the result.
Example:
(setq x 43 foo 'bar)
(eval (list 'cons x 'foo))
    => (43 . bar)
It is unusual to explicitly call eval
, since usually
evaluation is done implicitly. If you are writing a simple Lisp program and
explicitly calling eval
, you are probably doing something wrong.
eval
is primarily useful in programs which deal with Lisp itself,
rather than programs about knowledge or mathematics or games.
Also, if you are only interested in getting at the value of a
symbol (that is, the contents of the symbol’s value cell), then you
should use the primitive function symeval
(see symeval-fun).
Note: the actual name of the compiled code for eval is "si:*eval";
this is because use of the evalhook feature binds the function cell of eval.
If you don’t understand this, you can safely ignore it.
Note: unlike Maclisp, eval
never takes a second argument; there
are no "binding context pointers" in Zetalisp.
They are replaced by Closures (see closure).
(apply f arglist)
applies the function f to the list of
arguments arglist. arglist should be a list; f can be any function.
Examples:
(setq fred '+) (apply fred '(1 2)) => 3
(setq fred '-) (apply fred '(1 2)) => -1
(apply 'cons '((+ 2 3) 4)) =>
        ((+ 2 3) . 4)   not (5 . 4)
Of course, arglist may be nil
.
Note: unlike Maclisp, apply
never takes a third argument; there
are no "binding context pointers" in Zetalisp.
Compare apply
with funcall
and eval
.
(funcall f a1 a2 ... an)
applies the
function f to the arguments a1, a2, ..., an.
f may be neither a special form nor a macro; this would not be meaningful.
Example:
(cons 1 2) => (1 . 2)
(setq cons 'plus)
(funcall cons 1 2) => 3
This shows that the use of the symbol cons
as the name of a function
and the use of that symbol as the name of a variable do not interact.
The cons
form invokes the function named cons
.
The funcall
form evaluates the variable and gets the symbol plus
,
which is the name of a different function.
lexpr-funcall
is like a cross between apply
and funcall
.
(lexpr-funcall f a1 a2 ... an l)
applies the
function f
to the arguments a1 through an followed by the elements of
the list l. Note that since it treats its last argument specially,
lexpr-funcall
requires at least two arguments.
Examples:
(lexpr-funcall 'plus 1 1 1 '(1 1 1)) => 6

(defun report-error (&rest args)
  (lexpr-funcall (function format) error-output args))
lexpr-funcall
with two arguments does the same thing as apply
.
Note: the Maclisp functions subrcall
, lsubrcall
, and arraycall
are not needed on the Lisp Machine; funcall
is just as efficient.
arraycall
is provided for compatibility; it ignores its first
subform (the Maclisp array type) and is otherwise identical to aref
.
subrcall
and lsubrcall
are not provided.
call
offers a very general way of controlling what arguments you
pass to a function. You can provide either individual arguments a la
funcall
or lists of arguments a la apply
, in any order. In
addition, you can make some of the arguments optional. If the
function is not prepared to accept all the arguments you specify, no
error occurs if the excess arguments are optional ones. Instead, the
excess arguments are simply not passed to the function.
The argument-specs are alternating keywords (or lists of keywords)
and values. Each keyword or list of keywords says what to do with the
value that follows. If a value happens to require no keywords,
provide ()
as a list of keywords for it.
Two keywords are presently defined: :optional
and :spread
.
:spread
says that the following value is a list of arguments.
Otherwise it is a single argument. :optional
says that all the
following arguments are optional. It is not necessary to specify
:optional
with all the following argument-specs, because it is
sticky.
Example:
(call #'foo () x ':spread y '(:optional :spread) z () w)
The arguments passed to foo
are the value of x
, the
elements of the value of y
, the elements of the value of
z
, and the value of w
. The function foo
must be
prepared to accept all the arguments which come from x
and
y
, but if it does not want the rest, they are ignored.
(quote x)
simply returns x. It is useful specifically
because x is not evaluated; the quote
is how you make a form
that returns an arbitrary Lisp object. quote
is used to include
constants in a form.
Examples:
(quote x) => x
(setq x (quote (some list)))
x => (some list)
Since quote
is so useful but somewhat cumbersome to type, the reader normally
converts any form preceded by a single quote ('
) character into a quote
form.
For example,
(setq x '(some list))
is converted by read into
(setq x (quote (some list)))
(function f) means different things depending on whether f is a function
or the name of a function. (Note that in neither case is f evaluated.)
The name of a function is a symbol or a function-spec list
(see function-spec). A function is typically a list whose car
is the symbol lambda; however, there are several other kinds
of functions available (see kinds-of-functions).
If you want to pass an anonymous function as an argument to a function,
you could just use quote
; for example:
(mapc (quote (lambda (x) (car x))) some-list)
This works fine as far as the evaluator is concerned. However, the
compiler cannot tell that the first argument is going to be used as a
function; for all it knows, mapc
will treat its first argument as a
piece of list structure, asking for its car
and cdr
and so forth. So
the compiler cannot compile the function; it must pass the
lambda-expression unmodified. This means that the function will not get
compiled, which will make it execute more slowly than it might
otherwise.
The function
special form is one way to tell the compiler that
it can go ahead and compile the lambda-expression. You just use
the symbol function
instead of quote
:
(mapc (function (lambda (x) (car x))) some-list)
This will cause the compiler to generate code such that mapc
will be
passed a compiled-code object as its first argument.
That’s what the compiler does with a function
special form whose
subform f is a function. The evaluator, when given such a form,
just returns f; that is, it treats function
just like quote
.
To ease typing, the reader converts #'thing
into (function thing)
.
So #'
is similar to '
except that it produces a
function
form instead of a quote
form. So the above form
could be written as
(mapc #'(lambda (x) (car x)) some-list)
If f is not a function but the name of a function
(typically a symbol, but in general any kind of function spec), then
function
returns the definition of f; it is like fdefinition
except that it is a special form instead of a function, and so
(function fred) is like (fdefinition 'fred) which is like (fsymeval 'fred)
since fred
is a symbol.
function
is the same for the compiler and the interpreter when
f is the name of a function.
Another way of explaining function
is that it causes f to be
treated the same way as it would as the car of a form. Evaluating
the form (f arg1 arg2...)
uses the function definition
of f if it is a symbol, and otherwise expects f to be a list
which is a lambda-expression. Note that the car of a form may not be
a non-symbol function spec, to avoid difficult-to-read code. A call to a
function named by such a spec can instead be written as
(funcall (function spec) args...)
You should be careful about whether you use #'
or '
. Suppose
you have a program with a variable x
whose value is assumed to
contain a function that gets called on some arguments. If you want that
variable to be the car
function, there are two things you could say:
(setq x 'car)
or
(setq x #'car)
The former causes the value of x
to be the symbol car
, whereas
the latter causes the value of x
to be the function object found in
the function cell of car
. When the time comes to call the function
(the program does (funcall x ...)
), either of these two will work
(because if you use a symbol as a function, the contents of the symbol’s
function cell is used as the function, as explained in the beginning of
this chapter). The former case is a bit slower, because the function
call has to indirect through the symbol, but it allows the function
to be redefined, traced (see trace-fun), or advised (see advise-fun).
The latter case, while faster, picks up the function definition out of
the symbol car
and does not see any later changes to it.
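The difference can be seen by redefining a function after capturing it both ways. In this sketch, greet, by-name, and by-definition are hypothetical names used only for illustration:

```lisp
(defun greet () 'hello)

(setq by-name 'greet)          ; the symbol greet
(setq by-definition #'greet)   ; the current definition of greet

(defun greet () 'goodbye)      ; redefine greet

(funcall by-name)              ; => goodbye  (indirects through the symbol)
(funcall by-definition)        ; => hello    (still the old definition)
```

If the program is expected to notice redefinition, tracing, or advising, the quoted symbol is the appropriate choice.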
The other way to tell the compiler that an argument that is a lambda
expression should be compiled is for the function that takes the
function as an argument to use the &functional
keyword in its
lambda list; see lambda-list-keywords. The basic system functions that
take functions as arguments, such as map
and sort
, have
this &functional
keyword and hence quoted lambda-expressions
given to them will be recognized as functions by the compiler.
In fact, mapc
uses &functional
and so the example given above
is bogus; in the particular case of the first argument to the function mapc
,
quote
and function
are synonymous. It is good style to use function
(or #'
) anyway, to make the intent of the program completely clear.
Takes no arguments and returns nil
.
Takes no arguments and returns t
.
Takes any number of arguments and returns nil
. This is often useful
as a "dummy" function; if you are calling a function that takes a function
as an argument, and you want to pass one that doesn’t do anything and
won’t mind being called with any argument pattern, use this.
comment
ignores its form and returns the symbol comment
.
Example:
(defun foo (x)
  (cond ((null x) 0)
        (t (comment x has something in it)
           (1+ (foo (cdr x))))))
Usually it is preferable to comment code using the semicolon-macro feature of the standard input syntax. This allows the user to add comments to the code which are ignored by the Lisp reader.
Example:
(defun foo (x)
  (cond ((null x) 0)
        (t (1+ (foo (cdr x))))     ;x has something in it
        ))
A problem with such comments is that they are discarded when the form is read into Lisp. If the function is read into Lisp, modified, and printed out again, the comment will be lost. However, this style of operation is hardly ever used; usually the source of a function is kept in an editor buffer and any changes are made to the buffer, rather than the actual list structure of the function. Thus, this is not a real problem.
The body forms are evaluated in order from left to right and the value
of the last one is returned.
progn
is the primitive control structure construct for "compound
statements". Although lambda-expressions, cond
forms, do
forms, and
many other control structure forms use progn
implicitly, that is,
they allow multiple forms in their bodies, there are occasions when
one needs to evaluate a number of forms for their side-effects and
make them appear to be a single form.
Example:
(foo (cdr a)
     (progn (setq b (extract frob))
            (car b))
     (cadr b))
(When form1 is 'compile
, the progn
form has a special meaning
to the compiler. This is discussed on progn-quote-compile-discussion.)
prog1
is similar to progn
, but it returns the value of its first form rather
than its last.
It is most commonly used to evaluate an expression with side effects, and return
a value which must be computed before the side effects happen.
Example:
(setq x (prog1 y (setq y x)))
interchanges the values of the variables x and y. prog1
never
returns multiple values.
prog2
is similar to progn
and prog1
, but it returns the value of its
second form. It is included largely for compatibility with old programs.
See also bind
(bind-fun), which is a
subprimitive that gives you maximal control over binding.
The following three functions (arg
, setarg
, and listify
)
exist only for compatibility with Maclisp lexprs. To write functions
that can accept variable numbers of arguments, use the &optional
and
&rest
keywords (see function-section).
(arg nil)
, when evaluated during the application of
a lexpr, gives the number of arguments supplied to that
lexpr.
This is primarily a debugging aid, since lexprs also receive their number of arguments
as the value of their lambda
-variable.
(arg i)
, when evaluated during the application of a lexpr, gives the value of the
i’th argument to the lexpr. i must be a fixnum in this case. It is an error if i is less than 1 or greater than the number
of arguments supplied to the lexpr.
Example:
(defun foo nargs            ;define a lexpr foo.
  (print (arg 2))           ;print the second argument.
  (+ (arg 1)                ;return the sum of the first
     (arg (- nargs 1))))    ;and next to last arguments.
setarg
is used only during the application of a lexpr.
(setarg i x)
sets the
lexpr’s i’th argument to x.
i must be greater than zero
and not greater than the number of arguments passed to the lexpr.
After (setarg i x)
has been done, (arg i)
will return x.
(listify n)
manufactures a list of n of the
arguments of a lexpr. With a positive argument n, it returns a
list of the first n arguments of the lexpr. With a negative
argument n, it returns a list of the last (abs n)
arguments of the lexpr. Basically, it works as if defined as follows:
(defun listify (n)
  (cond ((minusp n)
         (listify1 (arg nil) (+ (arg nil) n 1)))
        (t
         (listify1 n 1)) ))

(defun listify1 (n m)      ; auxiliary function.
  (do ((i n (1- i))
       (result nil (cons (arg i) result)))
      ((< i m) result) ))
The Lisp Machine includes a facility by which the evaluation of a form
can produce more than one value. When a function needs to return more
than one result to its caller, multiple values are a cleaner way of
doing this than returning a list of the values or setq
’ing special
variables to the extra values. In most Lisp function calls, multiple
values are not used. Special syntax is required both to produce
multiple values and to receive them.
The primitive for producing multiple values is values
, which takes
any number of arguments and returns that many values. If the last form
in the body of a function is a values
with three arguments, then
a call to that function will return three values.
The other primitive for producing multiple values is return
, which when
given more than one argument returns all its arguments as the values of
the prog
or do
from which it is returning. The variant
return-from
also can produce multiple values. Many system functions
produce multiple values, but they all do it via the values
and return
primitives.
The special forms for receiving multiple values are multiple-value
,
multiple-value-bind
, and multiple-value-list
. These consist of
a form and an indication of where to put the values returned by that form.
With the first two of these, the caller requests a certain number of
returned values. If fewer values are returned than the number requested,
then it is exactly as if the rest of the values were present and had the
value nil
. If too many values are returned, the rest of the values
are ignored. This has the advantage that you don’t have to pay attention
to extra values if you don’t care about them, but it has the disadvantage
that error-checking similar to that done for function calling is not present.
Returns multiple values, its arguments. This is the primitive
function for producing multiple values. It is legal to call values
with
no arguments; it returns no values in that case.
Returns multiple values, the elements of the list. (values-list '(a b c))
is the same as (values 'a 'b 'c)
.
list may be nil
, the empty list, which causes no values to be returned.
return
and its variants can only be used within the do
and
prog
special forms and their variants, and so they are explained on
return-fun.
multiple-value
is a special
form used for calling a function which
is expected to return more than one value.
form is evaluated, and the variables
are set (not lambda-bound) to the values returned by form. If more values
are returned than there are variables, then the extra values
are ignored. If there are more variables than values returned,
extra values of nil
are supplied. If nil
appears in the var-list,
then the corresponding value is ignored (you can’t use nil
as a variable.)
Example:
(multiple-value (symbol already-there-p) (intern "goo"))
In addition to its first value (the symbol), intern
returns a second
value, which is t
if the symbol returned as the first value was
already interned, or else nil
if intern
had to create it. So if
the symbol goo
was already known, the variable already-there-p
will be set to t
, otherwise it will be set to nil
. The third value
returned by intern
will be ignored.
multiple-value
is usually used for effect rather than for value; however,
its value is defined to be the first of the values returned by form.
multiple-value-bind is similar to multiple-value, but locally binds the variables which
receive the values, rather than setting them, and has a body: a set of forms
which are evaluated with these local bindings in effect.
First form is evaluated. Then the variables are
bound to the values returned by form. Then the body forms
are evaluated sequentially, the bindings are undone, and the result
of the last body form is returned.
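A brief sketch of multiple-value-bind, including the padding and truncation rules described earlier. The function first-and-rest is hypothetical, defined here only so that there is something that returns two values:

```lisp
;; Hypothetical function returning two values: the car and cdr of a list.
(defun first-and-rest (l)
  (values (car l) (cdr l)))

;; Two variables for two values.
(multiple-value-bind (head tail) (first-and-rest '(a b c))
  (list head tail))            ; => (a (b c))

;; Three variables for two values: the extra variable is bound to nil.
(multiple-value-bind (head tail extra) (first-and-rest '(a b c))
  (list head tail extra))      ; => (a (b c) nil)

;; One variable for two values: the second value is ignored.
(multiple-value-bind (head) (first-and-rest '(a b c))
  head)                        ; => a
```

Because the variables are bound rather than set, none of head, tail, or extra is visible once each form has been evaluated.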
multiple-value-list
evaluates form, and returns a list of
the values it returned. This is useful when you don’t know how many values
to expect.
Example:
(setq a (multiple-value-list (intern "goo")))
a => (goo nil #<Package User>)
This is similar to the example of multiple-value
above; a
will be set
to a list of three elements, the three values returned by intern
.
Due to the syntactic structure of Lisp, it is often the case that the value
of a certain form is the value of a sub-form of it. For example, the
value of a cond
is the value of the last form in the selected clause.
In most such cases, if the sub-form produces multiple values, the original
form will also produce all of those values. This passing-back of
multiple values of course has no effect unless eventually one of the
special forms for receiving multiple values is reached.
The exact rule governing passing-back of multiple values is as follows:
If X is a form, and Y is a sub-form of X, then if the value
of Y is unconditionally returned as the value of X, with no
intervening computation, then all the multiple values returned by Y
are returned by X. In all other cases, multiple values or only
single values may be returned at the discretion of the implementation;
users should not depend on whatever way it happens to work, as it
may change in the future or in other implementations. We do not guarantee
non-transmission of multiple values because such a guarantee would
not be very useful and the efficiency cost of enforcing it would be
high. Even setq
’ing a variable to the result of a form, then
returning the value of that variable might be made to pass multiple
values by an optimizing compiler which realized that the setq
ing of
the variable was unnecessary.
Note that use of a form as an argument to a function never receives
multiple values from that form. That is, if the form (foo (bar))
is evaluated and the call to bar
returns many values, foo
will
still only be called on one argument (namely, the first value returned),
rather than being called on all the values returned. We choose not to
generate several separate arguments from the several values, because
this would make the source code obscure; it would not be syntactically
obvious that a single form does not correspond to a single argument.
Instead, the first value of a form is used as the argument and the
remaining values are discarded. Receiving of multiple values is done
only with the above-mentioned special forms.
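As a small illustration of these rules (two-values is a hypothetical function defined only for this sketch):

```lisp
(defun two-values ()
  (values 1 2))

(multiple-value-list (two-values))           ; => (1 2)

;; progn unconditionally returns the value of its last form,
;; so both values are passed back.
(multiple-value-list (progn (two-values)))   ; => (1 2)

;; As an argument to a function, only the first value is used.
(list (two-values))                          ; => (1)
```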
For clarity, descriptions of the interaction of several common special forms with multiple values follow. This can all be deduced from the rule given above. Note well that when it says that multiple values are not returned, it really means that they may or may not be returned, and you should not write any programs that depend on which way it works.
The body of a defun
or a lambda
, and variations such as the
body of a function, the body of a let
, etc., pass back multiple
values from the last form in the body.
eval
, apply
, funcall
, and lexpr-funcall
pass back multiple values from the function called.
progn
passes back multiple values from its last form. progv
and
progw
do so also. prog1
and prog2
, however, do not pass
back multiple values.
Multiple values are passed back from the last subform of an and
or or
form,
but not from previous forms since the return is conditional. Remember
that multiple values are only passed back when the value of a sub-form
is unconditionally returned from the containing form. For example,
consider the form (or (foo) (bar))
. If foo
returns a non-nil
first value, then only that value will be returned as the value of the
form. But if it returns nil
(as its first value), then or
returns whatever values the call to bar
returns.
cond
passes back multiple values from the last form in the
selected clause, but not if the clause is only one form long (i.e., the
returned value is the value of the predicate) since the return is
conditional. This rule applies even to the last clause, where the return
is not really conditional (the implementation is allowed to pass
or not to pass multiple values in this case, and so you shouldn’t
depend on what it does). t
should be used as the predicate of the
last clause if multiple values are desired, to make it clear to the
compiler (and any human readers of the code!) that the return is not
conditional.
The variants of cond
such as if
, select
, selectq
, and
dispatch
pass back multiple values from the last form in the
selected clause.
The number of values returned by prog
depends on the return
form
used to return from the prog
. (If a prog
drops off the end it
just returns a single nil
.) If return
is given two or more
subforms, then prog
will return as many values as the return
has
subforms. However, if the return
has only one subform, then the
prog
will return all of the values returned by that one subform.
do
behaves like prog
with respect to return
.
All the values of the last exit-form are returned.
unwind-protect
passes back multiple values from its protected form.
*catch
does not pass back multiple values from the last form
in its body, because it is defined to return its
own second value (see *catch-fun) to tell you whether the *catch
form was exited normally or abnormally. This is sometimes inconvenient
when you want to propagate back multiple values but you also want to wrap
a *catch
around some forms. Usually people get around this problem
by enclosing the *catch
in a prog
and using return
to pass
out the multiple values, return
ing through the *catch
. This is
inelegant, but we don’t know anything that’s much better.
Lisp provides a variety of structures for flow of control.
Function application is the basic method for construction of programs. Operations are written as the application of a function to its arguments. Usually, Lisp programs are written as a large collection of small functions, each of which implements a simple operation. These functions operate by calling one another, and so larger operations are defined in terms of smaller ones.
A function may always call itself in Lisp. The calling of a function by itself is known as recursion; it is analogous to mathematical induction.
The performing of an action repeatedly (usually with some
changes between repetitions) is called iteration, and is provided as
a basic control structure in most languages. The do statement of
PL/I, the for statement of ALGOL/60, and so on are examples of
iteration primitives. Lisp provides two general iteration facilities:
do
and loop
, as well as a variety of special-purpose iteration
facilities. (loop
is sufficiently complex that it is explained
in its own chapter later in the manual; see loop-fun.) There is also
a very general construct to allow the traditional "goto" control structure,
called prog
.
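To make the contrast between recursion and iteration concrete, here is a sketch of one operation, counting the elements of a list, written both ways. The names count-recursively and count-iteratively are hypothetical:

```lisp
;; Recursive version: the function calls itself on the cdr of the list.
(defun count-recursively (l)
  (cond ((null l) 0)
        (t (1+ (count-recursively (cdr l))))))

;; Iterative version: do steps two variables until the list runs out.
(defun count-iteratively (l)
  (do ((tail l (cdr tail))
       (n 0 (1+ n)))
      ((null tail) n)))

(count-recursively '(a b c))   ; => 3
(count-iteratively '(a b c))   ; => 3
```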
A conditional construct is one which allows a program
to make a decision, and do one thing or another based on some logical
condition. Lisp provides the simple one-way conditionals and
and or
,
the simple two-way conditional if
, and more general multi-way
conditionals such as cond
and selectq
.
The choice of which form to use in any particular situation is a matter
of personal taste and style.
There are some non-local exit control structures, analogous
to the leave, exit, and escape constructs in many modern
languages.
The general ones are *catch
and *throw
; there is also return
and its variants, used for exiting the iteration constructs do
, loop
,
and prog
.
Zetalisp also provides a coroutine capability, explained in the section on stack-groups (stack-group), and a multiple-process facility (see process). There is also a facility for generic function calling using message passing; see flavor.
if
is the simplest conditional form. The "if-then" form looks like:
(if predicate-form then-form)
predicate-form is evaluated, and if the result is non-nil
, the
then-form is evaluated and its result is returned. Otherwise, nil
is returned.
In the "if-then-else" form, it looks like
(if predicate-form then-form else-form)
predicate-form is evaluated, and if the result is non-nil
, the
then-form is evaluated and its result is returned. Otherwise, the
else-form is evaluated and its result is returned.
If there are more than three subforms, if
assumes you want more than
one else-form; they are evaluated sequentially and the result of the
last one is returned, if the predicate returns nil
. There is
disagreement as to whether this constitutes good programming style or
not.
The cond
special form consists of the symbol cond
followed by
several clauses. Each clause consists of a predicate form, called
the antecedent, followed by zero or more consequent forms.
(cond (antecedent consequent consequent...)
      (antecedent)
      (antecedent consequent ...)
      ... )
The idea is that each clause represents a case which is selected if its antecedent is satisfied and the antecedents of all preceding clauses were not satisfied. When a clause is selected, its consequent forms are evaluated.
cond
processes its clauses in order from left to right. First,
the antecedent of the current clause is evaluated. If the result is
nil
, cond
advances to the next clause. Otherwise, the cdr of
the clause is treated as a list of consequent forms which are
evaluated in order from left to right. After evaluating the
consequents, cond
returns without inspecting any remaining
clauses. The value of the cond
special form is the value of the
last consequent evaluated, or the value of the antecedent if there
were no consequents in the clause. If cond
runs out of clauses,
that is, if every antecedent evaluates to nil
, and thus no case is
selected, the value of the cond
is nil
.
Example:
(cond ((zerop x)    ;First clause:
       (+ y 3))     ; (zerop x) is the antecedent.
                    ; (+ y 3) is the consequent.
      ((null y)     ;A clause with 2 consequents:
       (setq y 4)   ; this
       (cons x z))  ; and this.
      (z)           ;A clause with no consequents: the antecedent is
                    ; just z.  If z is non-nil, it will be returned.
      (t            ;An antecedent of t
       105)         ; is always satisfied.
      )             ;This is the end of the cond.
cond-every
has the same syntax as cond
, but executes every clause whose
predicate is satisfied, not just the first. If a predicate is the symbol
otherwise
, it is satisfied if and only if no preceding predicate is
satisfied. The value returned
is the value of the last consequent form in the last clause whose predicate
is satisfied. Multiple values are not returned.
and
evaluates the forms one at a time,
from left to right. If any form evaluates to nil
, and
immediately returns nil
without evaluating the remaining
forms. If all the forms evaluate to non-nil
values, and
returns
the value of the last form.
and
can be used in two different ways. You can use it as a logical
and
function, because it returns a true value only if all of its
arguments are true. So you can use it as a predicate:
(if (and socrates-is-a-person all-people-are-mortal) (setq socrates-is-mortal t))
Because the order of evaluation is well-defined, you can do
(if (and (boundp 'x) (eq x 'foo)) (setq y 'bar))
knowing that the x
in the eq
form will not be evaluated if x
is found to be unbound.
You can also use and
as a simple conditional form:
(and (setq temp (assq x y)) (rplacd temp z))
(and bright-day glorious-day (princ "It is a bright and glorious day."))
Note: (and) => t
, which is the identity for the and
operation.
or
evaluates the forms one by one from left to right.
If a form evaluates to nil
, or
proceeds to evaluate the
next form. If there are no more forms, or
returns nil
.
But if a form evaluates to a non-nil
value, or
immediately returns
that value without evaluating any remaining forms.
As with and
, or
can be used either as a logical or
function,
or as a conditional.
(or it-is-fish it-is-fowl (print "It is neither fish nor fowl."))
Note: (or) => nil
, the identity for this operation.
selectq
is a conditional which chooses one of its clauses to execute
by comparing the value of a form against various constants, which are
typically keyword symbols.
Its form is as follows:
(selectq key-form
  (test consequent consequent ...)
  (test consequent consequent ...)
  (test consequent consequent ...)
  ...)
The first thing selectq
does is to evaluate key-form; call the resulting
value key. Then selectq
considers
each of the clauses in turn. If key matches the clause’s
test, the consequents of this
clause are evaluated, and selectq
returns the value of the last
consequent. If there are no matches, selectq
returns nil
.
A test may be any of:
a symbol
If the key is eq to the symbol, it matches.
a number
If the key is eq to the number, it matches.
Only small numbers (fixnums) will work.
a list
If the key is eq to one of the elements of the list,
then it matches. The elements of the list should be symbols
or fixnums.
t or otherwise
The symbols t and otherwise are special keywords which match anything.
Either symbol may be used; it makes no difference.
t is mainly for compatibility with Maclisp’s caseq construct.
To be useful, this should be the last clause in the selectq.
Note that the tests are not evaluated; if you want them to
be evaluated use select
rather than selectq
.
Example:
(selectq x
  (foo (do-this))
  (bar (do-that))
  ((baz quux mum) (do-the-other-thing))
  (otherwise (ferror nil "Never heard of ~S" x)))
is equivalent to
(cond ((eq x 'foo) (do-this))
      ((eq x 'bar) (do-that))
      ((memq x '(baz quux mum)) (do-the-other-thing))
      (t (ferror nil "Never heard of ~S" x)))
Also see defselect
(defselect-fun), a special form for defining a function
whose body is like a selectq
.
select
is the same as selectq
, except that the elements of the
tests are evaluated before they are used.
This creates a syntactic ambiguity: if (bar baz) is seen as the
first element of a clause, is it a list of two forms, or is it one
form? select
interprets it as a list of two forms. If you
want to have a clause whose test is a single form, and that form
is a list, you have to write it as a list of one form.
Example:
(select (frob x) (foo 1) ((bar baz) 2) (((current-frob)) 4) (otherwise 3))
is equivalent to
(let ((var (frob x))) (cond ((eq var foo) 1) ((or (eq var bar) (eq var baz)) 2) ((eq var (current-frob)) 4) (t 3)))
selector
is the same as select
, except that you get to specify the function
used for the comparison instead of eq
. For example,
(selector (frob x) equal (('(one . two)) (frob-one x)) (('(three . four)) (frob-three x)) (otherwise (frob-any x)))
is equivalent to
(let ((var (frob x))) (cond ((equal var '(one . two)) (frob-one x)) ((equal var '(three . four)) (frob-three x)) (t (frob-any x))))
(dispatch byte-specifier number clauses...)
is the same
as select
(not selectq
), but the key is obtained by evaluating
(ldb byte-specifier number)
.
byte-specifier and number are both evaluated. Byte specifiers
and ldb are explained elsewhere (see ldb-fun).
Example:
(princ (dispatch 0202 cat-type (0 "Siamese.") (1 "Persian.") (2 "Alley.") (3 (ferror nil "~S is not a known cat type." cat-type))))
It is not necessary to include all possible values of the byte which will be dispatched on.
selectq-every
has the same syntax as selectq
, but, like
cond-every
, executes every selected clause instead of just the first
one. If an otherwise
clause is present, it is selected if and only
if no preceding clause is selected. The value returned is the value of
the last form in the last selected clause. Multiple values are not
returned. Example:
(selectq-every animal ((cat dog) (setq legs 4)) ((bird man) (setq legs 2)) ((cat bird) (put-in-oven animal)) ((cat dog man) (beware-of animal)))
The caseq
special form is provided for Maclisp compatibility. It
is exactly the same as selectq
. This is not perfectly compatible
with Maclisp, because selectq
accepts otherwise
as well as t
where caseq
would not accept otherwise
, and because Maclisp
does some error-checking that selectq
does not. Maclisp programs
that use caseq
will work correctly so long as they don’t use the
symbol otherwise
as the key.
The do
special form provides a simple generalized iteration facility,
with an arbitrary number of "index variables" whose values are saved
when the do is entered and restored when it is left; i.e., they are
bound by the do. The index variables are used in the iteration
performed by do
. At the beginning, they are initialized to
specified values, and then at the end of each trip around the loop the
values of the index variables are changed according to specified
rules. do
allows the programmer to specify a predicate which
determines when the iteration will terminate. The value to be
returned as the result of the form may, optionally, be specified.
do
comes in two varieties.
The more general, so-called "new-style" do
looks like:
(do ((var init repeat) ...) (end-test exit-form ...) body...)
The first item in the form is a list of zero or more index variable
specifiers. Each index variable specifier is a list of the name of a
variable var, an initial value form init, and a repeat value form repeat.
If init is omitted, the var is initialized to nil. If repeat is
omitted, the var is not changed between repetitions.
An index variable specifier can also be just the name of a variable,
rather than a list. In this case, the variable has an initial value of
nil
, and is not changed between repetitions.
All assignment to the index variables is done in parallel. At the
beginning of the first iteration, all the init forms are evaluated,
then the vars are bound to the values of the init forms, their
old values being saved in the usual way. Note that the init forms
are evaluated before the vars are bound, i.e., lexically
outside of the do. At the beginning of each succeeding
iteration those vars that have repeat forms get set to the
values of their respective repeat forms. Note that all the
repeat forms are evaluated before any of the vars is set.
The second element of the do
-form is a list of an end-testing
predicate form end-test, and zero or more forms, called the
exit-forms. This resembles a cond
clause. At the beginning of
each iteration, after processing of the variable specifiers, the
end-test is evaluated. If the result is nil
, execution proceeds
with the body of the do
. If the result is not nil
, the
exit-forms are evaluated from left to right and then do
returns.
The value of the do
is the value of the last exit-form, or
nil
if there were no exit-forms (not the value of the
end-test as you might expect by analogy with cond
).
Note that the end-test gets evaluated before the first time the body
is evaluated. do
first initializes the variables from the init
forms, then it checks the end-test, then it processes the body, then
it deals with the repeat forms, then it tests the end-test
again, and so on. If the end-test
returns a non-nil
value the
first time, then the body will never be processed.
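The order of events described above can be seen in a small sketch. The function names sum-below and count-only are invented for this illustration and are not part of the system.

```lisp
;; sum-below is a hypothetical helper: it adds up 0, 1, ..., n-1.
(defun sum-below (n)
  (do ((i 0 (1+ i))
       (sum 0 (+ sum i)))     ;both repeat forms see the old values
      ((= i n) sum)))         ;the exit-form supplies the value; no body needed

;; count-only is also hypothetical: it has an end-test but no exit-forms.
(defun count-only (n)
  (do ((i 0 (1+ i)))
      ((= i n))))
```

(sum-below 5) returns 10, and (sum-below 0) returns 0 because the end-test is already true before the body or the repeat forms are processed. (count-only 3) returns nil, since there are no exit-forms; the value of the end-test is not returned.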
If the second element of the form is nil
, there is no end-test
nor exit-forms, and the body of the do
is executed only
once. In this type of do
it is an error to have repeats. This
type of do
is no more powerful than let
; it is obsolete
and provided only for Maclisp compatibility.
If the second element of the form is (nil)
, the end-test is
never true and there are no exit-forms. The body of the do
is executed over and over. The infinite loop can be terminated by use
of return
or *throw
.
If a return
special form is evaluated inside the body of a do
,
then the do
immediately stops, unbinds its variables, and returns
the values given to return
. See return-fun for more details
about return
and its variants. go
special forms (see go-fun)
and prog
-tags can also be used inside the body of a do
and they mean the same
thing that they do inside prog
forms, but we
discourage their use since they complicate the control structure in
a hard-to-understand way.
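As a sketch of an endless do that is left by means of return (the function name first-power-of-two-above is made up for this example):

```lisp
;; first-power-of-two-above is a hypothetical name used only for this sketch.
(defun first-power-of-two-above (limit)
  (do ((p 1 (* p 2)))
      (nil)                          ;no end-test: loop until return is done
    (if (> p limit) (return p))))
```

(first-power-of-two-above 10) returns 16. The second element of the do is (nil), so only the return in the body can stop the iteration.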
The other, so-called "old-style" do
looks like:
(do var init repeat end-test body...)
The first time through the loop var gets the value of the init form;
the remaining times through the loop it gets the value of the repeat form,
which is re-evaluated each time. Note that the init form is evaluated
before var is bound, i.e., lexically outside of the do.
Each time around the loop, after var is set,
end-test is evaluated. If it is non-nil
, the do
finishes
and returns nil
. If the end-test evaluated to nil
, the body of
the loop is executed. As with the new-style do, return
and go
may be used in the body, and they have the same meaning.
Examples of the older variety of do:
(setq n (array-length foo-array))
(do i 0 (1+ i) (= i n)
  (aset 0 foo-array i))          ;zeroes out the array foo-array

(do zz x (cdr zz) (or (null zz) (zerop (f (car zz)))))
                   ; this applies f to each element of x
                   ; continuously until f returns zero.
                   ; Note that the do has no body.
return forms are often useful to do simple searches:
(do i 0 (1+ i) (= i n)           ; Iterate over the length of foo-array.
  (and (= (aref foo-array i) 5)  ; If we find an element which
                                 ; equals 5,
       (return i)))              ; then return its index.
Examples of the new form of do:
(do ((i 0 (1+ i))         ; This is just the same as the above example,
     (n (array-length foo-array)))
    ((= i n))             ; but written as a new-style do.
  (aset 0 foo-array i))   ; Note how the setq is avoided.

(do ((z list (cdr z))     ; z starts as list and is cdr’ed each time.
     (y other-list)       ; y starts as other-list, and is unchanged by the do.
     (x)                  ; x starts as nil and is not changed by the do.
     w)                   ; w starts as nil and is not changed by the do.
    (nil)                 ; The end-test is nil, so this is an infinite loop.
  body)                   ; Presumably the body uses return somewhere.
The construction
(do ((x e (cdr x)) (oldx x x)) ((null x)) body)
exploits parallel assignment to index variables. On the first
iteration, the value of oldx
is whatever value x
had before
the do
was entered. On succeeding iterations, oldx
contains
the value that x
had on the previous iteration.
In either form of do
, the body may contain no forms at all.
Very often an iterative algorithm can be most clearly expressed entirely
in the repeats and exit-forms of a new-style do
,
and the body is empty.
(do ((x x (cdr x))
     (y y (cdr y))
     (z nil (cons (f x y) z))) ;exploits parallel assignment.
    ((or (null x) (null y))
     (nreverse z))             ;typical use of nreverse.
    )                          ;no do-body required.
is like (maplist 'f x y) (see maplist-fun).
Also see loop
(loop-fun), a general iteration facility based on a keyword
syntax rather than a list-structure syntax.
In a word, do*
is to do
as prog*
is to prog
.
do* works like do but binds and steps the variables sequentially
instead of in parallel. This means that the init form for one
variable can use the values of previous variables. The repeat forms
refer to the new values of previous variables instead of their old
values. Here is an example:
(do* ((x xlist (cdr x))
      (y (car x) (car x)))
     ((null x))
  (print (list x y)))
On each iteration, y’s value will be the car
of x. By
comparison, with do
, this would get an error on entry since x
would not have an old value yet.
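The sequential binding and stepping can be sketched with a function that records the value y has on each trip around the loop. The name heads-of-tails is hypothetical, chosen only for this illustration.

```lisp
;; heads-of-tails is a hypothetical name; it collects the value that y
;; has during each execution of the body.
(defun heads-of-tails (xlist)
  (do* ((x xlist (cdr x))
        (y (car x) (car x))          ;uses the x just bound or just stepped
        (seen nil))                  ;no repeat form: changed only by the body
       ((null x) (nreverse seen))
    (setq seen (cons y seen))))
```

(heads-of-tails '(a b c)) returns (a b c): on every iteration y is the car of the current x, because y's init and repeat forms are evaluated after x has received its new value.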
Sometimes one do
is contained inside the body of an outer do
.
The return
function always returns from the innermost surrounding
do
, but sometimes you want to return from an outer do
while
within an inner do
. You can do this by giving the outer do
a
name. You use do-named
instead of do
for the outer do
, and
use return-from
(see return-from-fun), specifying that name, to
return from the do-named
.
The syntax of do-named is like do except that the symbol do-named is
immediately followed by the name, which should be a symbol.
Example:
(do-named george ((a 1 (1+ a)) (d 'foo)) ((> a 4) 7) (do ((c b (cdr c))) ((null c)) ... (return-from george (cons b d)) ...))
If the symbol t
is used as the name, then it will be made
"invisible" to return
s; that is, return
s inside that do-named
will return to the next outermost level whose name is not t
.
(return-from t ...)
will return from a do-named
named t
. This
feature is not intended to be used by user-written code; it is for
macros to expand into.
If the symbol nil
is used as the name, it is as if this were a
regular do
. Not having a name is the same as being named nil
.
prog
s and loop
s can have names just as do
s can. Since the
same functions are used to return from all of these forms, all of these
names are in the same name-space; a return
returns from the
innermost enclosing iteration form, no matter which of these it is, and
so you need to use names if you nest any of them within any other and
want to return to an outer one from inside an inner one.
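The rule that return leaves only the innermost enclosing iteration form can be sketched as follows. The function count-rows-containing is invented for this example; it is not a system function.

```lisp
;; count-rows-containing is a hypothetical helper: rows is a list of
;; lists, and the result is the number of rows that contain item.
(defun count-rows-containing (item rows)
  (do ((r rows (cdr r))
       (count 0))
      ((null r) count)
    (do ((c (car r) (cdr c)))
        ((null c))
      (cond ((eq (car c) item)
             (setq count (1+ count))
             (return nil))))))       ;exits the inner do only
```

(count-rows-containing 'x '((a x) (b) (x x))) returns 2. The return stops the scan of the current row, and the outer do then goes on to the next row; leaving the outer do from inside the inner one would require a name and return-from.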
do*-named offers a combination of the features of do* and
those of do-named.
dotimes
is a convenient abbreviation for the most common integer iteration.
dotimes
performs body
the number of times given by the value of count, with index bound
to 0
, 1
, etc. on successive iterations.
Example:
(dotimes (i (// m n)) (frob i))
is equivalent to:
(do ((i 0 (1+ i)) (count (// m n))) ((>= i count)) (frob i))
except that the name count
is not used. Note that i
takes on
values starting at zero rather than one, and that it stops before taking
the value (// m n)
rather than after. You can use return
and
go
and prog
-tags inside the body, as with do
. dotimes
forms return nil
unless returned from explicitly with return
.
For example:
(dotimes (i 5) (if (eq (aref a i) 'foo) (return i)))
This form searches the array that is the value of a
, looking for
the symbol foo
. It returns the fixnum index of the first element
of a
that is foo
, or else nil
if none of the elements
are foo
.
dolist
is a convenient abbreviation for the most common list iteration.
dolist
performs body
once for each element in the list which is the value of list, with
item bound to the successive elements.
Example:
(dolist (item (frobs foo)) (mung item))
is equivalent to:
(do ((lst (frobs foo) (cdr lst)) (item)) ((null lst)) (setq item (car lst)) (mung item))
except that the name lst
is not used.
You can use return
and go
and prog
-tags inside the body, as with do
.
dolist
forms return nil
unless returned from explicitly with return
.
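Two short sketches of dolist follow, one that runs to the end of the list and one that leaves early with return. The names sum-list and first-negative are hypothetical.

```lisp
;; sum-list is a hypothetical helper: the dolist itself returns nil,
;; so the total is kept in a variable bound outside it.
(defun sum-list (numbers)
  (let ((total 0))
    (dolist (n numbers)
      (setq total (+ total n)))
    total))

;; first-negative is also hypothetical: return supplies the value
;; of the dolist when a negative number is found.
(defun first-negative (numbers)
  (dolist (n numbers)
    (if (minusp n) (return n))))
```

(sum-list '(1 2 3)) returns 6. (first-negative '(1 -2 -3)) returns -2, while (first-negative '(1 2)) returns nil because the dolist runs to completion.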
keyword-extract
is an aid to writing functions which take keyword arguments
in the standard fashion. The form
(keyword-extract key-list iteration-var keywords flags other-clauses...)
will parse the keywords out into local variables of the function. key-list
is a form which evaluates to the list of keyword arguments; it is generally the
function’s &rest
argument. iteration-var is a variable used to iterate
over the list; sometimes other-clauses will use the form
(car (setq iteration-var (cdr iteration-var)))
to extract the next element of the list. (Note that this is not the same as pop
,
because it does the car
after the cdr
, not before.)
keywords defines the symbols which are keywords to be followed by an argument.
Each element of keywords is either the name of a local variable which receives
the argument and is also the keyword, or a list of the keyword and the variable, for
use when they are different or the keyword is not to go in the keyword package.
Thus if keywords is (foo (ugh bletch) bar)
then the keywords recognized
will be :foo
, ugh
, and :bar
. If :foo
is specified its argument
will be stored into foo
. If :bar
is specified its argument will be stored
into bar
. If ugh
is specified its argument will be stored into bletch
.
Note that keyword-extract
does not bind these local variables; it assumes you
will have done that somewhere else in the code that contains the keyword-extract
form.
flags defines the symbols which are keywords not followed by an argument.
If a flag is seen its corresponding variable is set to t
. (You are assumed to
have initialized it to nil
when you bound it with let
or &aux
.)
As in keywords, an element of flags may be either a variable from
which the keyword is deduced, or a list of the keyword and the variable.
If there are any other-clauses, they are selectq
clauses selecting on the
keyword being processed. These clauses are for handling any keywords that
are not handled by the keywords and flags elements.
These can be used to do special processing of certain keywords
for which simply storing the argument into a variable is not good enough. After the
other-clauses there will be an otherwise
clause to complain about any
undefined keywords found in key-list.
You can also use the &key
lambda-list keyword to create functions that take
keyword arguments; see &key.
prog
is a special form which provides temporary variables,
sequential evaluation of forms, and a "goto" facility. A typical prog
looks like:
(prog (var1 var2 (var3 init3) var4 (var5 init5))
 tag1
    statement1
    statement2
 tag2
    statement3
    . . .
    )
The first subform of a prog
is a list of variables, each of which
may optionally have an initialization form. The first thing evaluation
of a prog
form does is to evaluate all of the init forms. Then
each variable that had an init form is bound to its value, and the
variables that did not have an init form are bound to nil
.
Example:
(prog ((a t) b (c 5) (d (car '(zz . pp)))) <body> )
The initial value of a
is t
, that of b
is nil
, that of
c
is the fixnum 5, and that of d
is the symbol zz
. The
binding and initialization of the variables is done in parallel;
that is, all the initial values are computed before any of the variables
are changed. prog*
(see prog*-fun) is the same as prog
except that this initialization is sequential rather than parallel.
The part of a prog
after the variable list is called the
body. Each element of the body is either a symbol, in which case it
is called a tag, or anything else (almost always a list), in which
case it is called a statement.
After prog
binds the variables, it processes each form in
its body sequentially. tags are skipped over. statements are
evaluated, and their returned values discarded. If the end of the body
is reached, the prog
returns nil
. However, two special forms
may be used in prog
bodies to alter the flow of control. If
(return x)
is evaluated, prog
stops processing its body,
evaluates x, and returns the result. If (go tag)
is
evaluated, prog
jumps to the part of the body labelled with the
tag, where processing of the body is continued. tag is not
evaluated. return
and go
and their variants are explained
fully below.
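A minimal sketch of a prog that uses one tag, go, and return is shown here. The function name length-by-prog and the tag name again are made up for the illustration.

```lisp
;; length-by-prog is a hypothetical helper that counts the elements of l.
(defun length-by-prog (l)
  (prog ((n 0))                      ;n is bound to 0 on entry
   again                             ;a tag: skipped over when reached
     (cond ((null l) (return n)))    ;leave the prog, returning n
     (setq n (1+ n)
           l (cdr l))
     (go again)))                    ;jump back to the tag
```

(length-by-prog '(a b c)) returns 3. The same computation is usually clearer when written with do, as recommended below.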
The compiler requires that go
and return
forms be
lexically within the scope of the prog
; it is not possible for a
function called from inside a prog
body to return
to the
prog
. That is, the return
or go
must be inside the prog
itself, not inside a function called by the prog
. (This restriction
happens not to be enforced in the interpreter, but since all programs
are eventually compiled, the convention should be adhered to. The
restriction will be imposed in future implementations of the
interpreter.)
See also the do
special form, which uses a body similar to
prog
. The do
, *catch
, and *throw
special forms are
included in Zetalisp as an attempt to encourage goto-less programming
style, which often leads to more readable, more easily maintained code.
Programmers are encouraged to use these forms instead of prog
wherever reasonable.
If the first subform of a prog
is a non-nil
symbol (rather than
a variable list), it is the name of the prog
, and return-from
(see return-from-fun) can be used to return from it. See
do-named
, do-named-fun.
Example:
(prog (x y z)  ;x, y, z are prog variables - temporaries.
   (setq y (car w) z (cdr w))     ;w is a free variable.
loop
   (cond ((null y) (return x))
         ((null z) (go err)))
rejoin
   (setq x (cons (cons (car y) (car z))
                 x))
   (setq y (cdr y)
         z (cdr z))
   (go loop)
err
   (break are-you-sure? t)
   (setq z y)
   (go rejoin))
The prog*
special form is almost the same as prog
. The only
difference is that the binding and initialization of the temporary
variables is done sequentially, so each one can depend on the
previous ones. For example,
(prog* ((y z) (x (car y))) (return x))
returns the car of the value of z
.
The go
special form is used to do a "go-to" within the
body of a do
or a prog
. The tag must be a symbol.
It is not evaluated. go
transfers control to the point in the body labelled by a
tag eq
to the one given. If there is no such tag in the body, the
bodies of lexically containing prog
s and do
s (if any) are examined as well.
If no tag is found, an error is signalled.
Example:
(prog (x y z)
   (setq x some frob)
loop
   do something
   (if some predicate (go endtag))
   do something more
   (if (minusp x) (go loop))
endtag
   (return z))
return
is used to exit from a prog
-like special form (prog
,
prog*
, do
, do-named
, dotimes
, dolist
, loop
,
etc.) The value forms are evaluated, and the resulting values are
returned by the prog
as its values.
In addition, break
(see break-fun) recognizes the typed-in form
(return value)
specially. If this form is typed at a
break
, value will be evaluated and returned as the value of
break
. If not specially recognized by break
,
and not inside a prog
-like form, return
will cause an
error.
Example:
(do ((x x (cdr x)) (n 0 (* n 2))) ((null x) n) (cond ((atom (car x)) (setq n (1+ n))) ((memq (caar x) '(sys boom bleah)) (return n))))
Note that the return
form is very unusual: it does not ever
return a value itself, in the conventional sense. It isn’t useful to
write (setq a (return 3))
, because when the return
form is
evaluated, the containing do
or prog
is immediately exited,
and the setq
never happens.
A return
form may not appear as an argument to a
regular function, but only at the top level of a prog
or do
, or
within certain special forms such as conditionals which are within a
prog
or do
. A return
as an argument to a regular function would
be not only useless but possibly meaningless. The compiler does not
bother to know how to compile it correctly in all cases. The same is true of
go
.
return
can also be used with multiple arguments, to return multiple values
from a prog
or do
. For example,
(defun assqn (x table) (do ((l table (cdr l)) (n 0 (1+ n))) ((null l) nil) (if (eq (caar l) x) (return (car l) n))))
This function is like assq
, but it returns an additional value
which is the index in the table of the entry it found.
However, if you use return
with only one subform, then the prog
or do
will return all of the values returned by that subform. That
is, if you do
(prog () ... (return (foo 2)))
and the function foo
returns many values, then the prog
will return
all of those values. In fact, this means that
(return (values form1 form2 form3))
is the same as
(return form1 form2 form3)
It is legal to write simply (return)
, which will return from the prog
without returning any values.
See multiple-value for more information.
return-from evaluates the value forms and returns the resulting values
from the innermost containing prog-like special form whose
name is name. See the description of do-named (do-named-fun)
in which named dos and progs are explained.
return-list is like return except
that the prog returns all of the elements of list; if
list has more than one element, the prog does a multiple-value
return.
To direct the returned values to a prog
or do-named
of a specific
name, use
(return-from name (values-list list))
.
Also see defunp
(defunp-fun), a variant of defun
that incorporates a prog
into the function body.
*catch
is a special form used with the *throw
function to do
non-local exits. First tag is evaluated; the result is called the "tag"
of the *catch
. Then the body forms are evaluated sequentially,
and the value of the last form is returned. However, if,
during the evaluation of the body, the
function *throw
is called with the same tag as the tag of the
*catch
, then the evaluation of the body is aborted, and the
*catch
form immediately returns the value that was the second
argument to *throw
without further evaluating the current body form or
the rest of the body.
The tags are used to match up *throws with *catches.
(*catch 'foo form)
will catch a (*throw 'foo form)
but
not a (*throw 'bar form)
. It is an error if *throw
is done
when there is no suitable *catch
(or catch-all
; see below).
The values t
and nil
for tag are special: a *catch
whose
tag is one of these values will catch throws to any tag. These are only
for internal use by unwind-protect
and catch-all
respectively.
The only difference between t
and nil
is in the error checking;
t
implies that after a "cleanup handler" is executed control will be
thrown again to the same tag, therefore it is an error if a specific
catch for this tag does not exist higher up in the stack. With nil
,
the error check isn’t done.
*catch
returns up to four values; trailing null values are not
returned for reasons of microcode simplicity, but the values not
returned will default to nil
if they are received with the
multiple-value
or multiple-value-bind
special forms.
If the catch completes normally,
the first value is the value of form and the second is nil
.
If a *throw
occurs, the first value is the second argument to
*throw
, and the second value is the first argument to *throw
,
the tag thrown to. The third and fourth values are the third and fourth
arguments to *unwind-stack
(see *unwind-stack-fun)
if that was used in place of *throw
; otherwise these values are nil
.
To summarize, the four values returned by *catch
are the value,
the tag, the active-frame-count, and the action.
Example:
(*catch 'negative
        (mapcar (function (lambda (x)
                            (cond ((minusp x)
                                   (*throw 'negative x))
                                  (t (f x)))))
                y))
which returns a list of f
of each element of y
if they are all
positive, otherwise the first negative member of y
.
Note that *catch
returns its own extra values, and so it does not
propagate multiple values back from the last form.
*throw
is used with *catch
as a structured non-local exit mechanism.
(*throw tag x)
throws the value of x back to the most recent *catch
labelled with tag or t
or nil
. Other *catches
are skipped over.
Both x and tag are evaluated, unlike the Maclisp throw
function.
The values t
, nil
, and 0
for tag are reserved and used
for internal purposes. nil
may not be used, because it would cause
an ambiguity in the returned values of *catch
. t
may only be
used with *unwind-stack
. 0
and nil
are used internally when
returning out of an unwind-protect
.
See the description of *catch
for further details.
catch
and throw
are provided only for Maclisp compatibility.
(catch form tag)
is the same as (*catch 'tag form)
,
and (throw form tag)
is the same as (*throw 'tag form)
.
The forms of catch
and throw
without tags are not supported.
*unwind-stack is a generalization of *throw provided for program-manipulating
programs such as the error handler.
tag and value are the same as the corresponding arguments to
*throw
.
A tag of t
invokes a special feature whereby the entire stack is
unwound, and then the function action is called (see below). During
this process unwind-protect
s receive control, but catch-all
s do
not. This feature is provided for the benefit of system programs which
want to unwind a stack completely.
active-frame-count, if non-nil
, is the number of frames
to be unwound. The definition of a "frame" is implementation-dependent.
If this counts down to zero before a suitable *catch
is found, the *unwind-stack
terminates and
that frame returns value to whoever called it.
This is similar to Maclisp’s freturn
function.
If action is non-nil
, whenever the *unwind-stack
would be
ready to terminate (either due to active-frame-count or due to
tag being caught as in *throw
), instead action is called
with one argument, value. If tag is t
, meaning throw out
the whole way, then the function action is not allowed to return.
Otherwise the function action may return and its value will be
returned instead of value from the *catch, or from an arbitrary
function if active-frame-count is in use. In this case the
*catch
does not return multiple values as it normally does when
thrown to. Note that it is often useful for action to be a
stack-group.
Note that if both active-frame-count and action are nil
,
*unwind-stack
is identical to *throw
.
Sometimes it is necessary to evaluate a form and make sure that certain side-effects take place after the form is evaluated; a typical example is:
(progn (turn-on-water-faucet) (hairy-function 3 nil 'foo) (turn-off-water-faucet))
The non-local exit facility of Lisp creates a situation in which
the above code won’t work, however: if hairy-function
should
do a *throw
to a *catch
which is outside of the progn
form, then (turn-off-water-faucet)
will never be evaluated
(and the faucet will presumably be left running).
This is particularly likely if hairy-function
gets an error
and the user tells the error-handler to give up and flush the computation.
In order to allow the above program to work, it can
be rewritten using unwind-protect
as follows:
(unwind-protect (progn (turn-on-water-faucet) (hairy-function 3 nil 'foo)) (turn-off-water-faucet))
If hairy-function
does a *throw
which attempts to quit
out of the evaluation of the unwind-protect
, the
(turn-off-water-faucet)
form will be evaluated in between
the time of the *throw
and the time at which the *catch
returns.
If the progn
returns normally, then the (turn-off-water-faucet)
is evaluated, and the unwind-protect
returns the result of the progn
.
The general form of unwind-protect
looks like
(unwind-protect protected-form cleanup-form1 cleanup-form2 ...)
protected-form is evaluated, and when it returns or when it
attempts to quit out of the unwind-protect
, the cleanup-forms
are evaluated. The value of the unwind-protect
is the value of
protected-form.
Multiple values returned by the protected-form are propagated back
through the unwind-protect
.
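The normal-return case can be sketched as follows. The variable faucet-log and the function run-with-faucet are hypothetical names standing in for the faucet functions used above.

```lisp
;; faucet-log is a hypothetical record of what has happened so far.
(defvar faucet-log nil)

;; run-with-faucet is a hypothetical function: the protected form
;; returns the symbol washed, and the cleanup form runs afterwards.
(defun run-with-faucet ()
  (setq faucet-log nil)
  (unwind-protect
      (progn (push 'on faucet-log)
             'washed)
    (push 'off faucet-log)))
```

(run-with-faucet) returns washed, the value of the protected form, and afterwards faucet-log is (off on): the cleanup form was evaluated after the protected form, and its value was discarded.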
The cleanup forms are run in the variable-binding environment that you
would expect: that is, variables bound outside the scope of the
unwind-protect
special form can be accessed, but variables bound
inside the protected-form can’t be. In other words, the stack is
unwound to the point just outside the protected-form, then the
cleanup handler is run, and then the stack is unwound some more.
(catch-all form)
is like (*catch some-tag form)
except that it will catch a
*throw
to any tag at all. Since the tag thrown to
is the second returned value, the caller of catch-all
may continue
throwing to that tag if he wants. The one thing that catch-all
will not catch is a *unwind-stack
with a tag of t
.
catch-all
is a macro which expands into *catch
with a tag of nil
.
If you think you want this, most likely you are mistaken and you really
want unwind-protect
.
Mapping is a type of iteration in which a function is successively applied to pieces of a list. There are several options for the way in which the pieces of the list are chosen and for what is done with the results returned by the applications of the function.
For example, mapcar
operates on successive elements of the list.
As it goes down the list, it calls the function giving it an element
of the list as its one argument: first the car
, then the
cadr
, then the caddr
, etc., continuing until the end of the
list is reached. The value returned by mapcar
is a list of the
results of the successive calls to the function. An example of the
use of mapcar
would be mapcar
’ing the function abs
over
the list (1 -2 -4.5 6.0e15 -4.2)
, which would be written as
(mapcar (function abs) '(1 -2 -4.5 6.0e15 -4.2))
.
The result is (1 2 4.5 6.0e15 4.2).
In general, the mapping functions take any number of arguments. For example,
(mapcar f x1 x2 ... xn)
In this case f must be a function of n arguments.
mapcar
will proceed
down the lists x1, x2, ..., xn in parallel.
The first argument to f will
come from x1, the second from x2, etc.
The iteration stops as soon as any of the lists is exhausted.
(If there are no lists at all, then there are no lists to be exhausted,
so the function will be called over and over. This is an
obscure way to write an infinite loop. It is supported for
consistency.) If you want to call a function of many arguments
where one of the arguments successively takes on the values of the elements
of a list and the other arguments are constant, you can use a circular
list for the other arguments to mapcar
. The function circular-list
is useful for creating such lists; see circular-list.
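A sketch of mapcar applied to two lists in parallel follows. The function names add-lists and pair-up are invented for the example.

```lisp
;; add-lists is a hypothetical helper: + is called with one element
;; from each list.
(defun add-lists (x y)
  (mapcar (function +) x y))

;; pair-up is also hypothetical: it shows that iteration stops as
;; soon as either list is exhausted.
(defun pair-up (x y)
  (mapcar (function list) x y))
```

(add-lists '(1 2 3) '(10 20 30)) returns (11 22 33). (pair-up '(a b c) '(1 2)) returns ((a 1) (b 2)); the element c is never used because the second list runs out first.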
There are five other mapping functions besides mapcar
. maplist
is like mapcar
except that the function is applied to the list and
successive cdr’s of that list rather than to successive elements of the
list. map
and mapc
are like maplist
and mapcar
respectively, except that they don’t return any useful value. These
functions are used when the function is being called merely for its
side-effects, rather than its returned values. mapcan
and
mapcon
are like mapcar
and maplist
respectively, except
that they combine the results of the function using nconc
instead
of list
. That is, mapcon
could have been defined by
(defun mapcon (f x y) (apply 'nconc (maplist f x y)))
Of course, this definition is less general than the real one.
Sometimes a do
or a straightforward recursion is preferable to a
map; however, the mapping functions should be used wherever they
naturally apply because this increases the clarity of the code.
Often f will be a lambda-expression, rather than a symbol; for example,
(mapcar (function (lambda (x) (cons x something))) some-list)
The functional argument to a mapping function must be a function acceptable
to apply; it cannot be a macro or the name of a special form.
Here is a table showing the relations between the six map functions.
                              applies function to

                        |  successive  |   successive  |
                        |   sublists   |    elements   |
         ---------------+--------------+---------------+
             its own    |              |               |
             second     |     map      |     mapc      |
            argument    |              |               |
         ---------------+--------------+---------------+
           list of the  |              |               |
returns     function    |    maplist   |    mapcar     |
             results    |              |               |
         ---------------+--------------+---------------+
           nconc of the |              |               |
            function    |    mapcon    |    mapcan     |
             results    |              |               |
         ---------------+--------------+---------------+
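The four value-returning entries of the table can be compared on the same argument. The function names below are hypothetical; each one simply wraps a single mapping function.

```lisp
;; All four names are invented for this sketch.
(defun double-each (l)      ;mapcar: elements in, list of results out
  (mapcar (function (lambda (e) (list e e))) l))

(defun all-tails (l)        ;maplist: sublists in, list of results out
  (maplist (function (lambda (tail) tail)) l))

(defun double-flat (l)      ;mapcan: elements in, results nconc'ed together
  (mapcan (function (lambda (e) (list e e))) l))

(defun tail-lengths (l)     ;mapcon: sublists in, results nconc'ed together
  (mapcon (function (lambda (tail) (list (length tail)))) l))
```

Given the list (a b c), double-each returns ((a a) (b b) (c c)), all-tails returns ((a b c) (b c) (c)), double-flat returns (a a b b c c), and tail-lengths returns (3 2 1). Because mapcan and mapcon use nconc, the function should return freshly made lists, as list does here.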
There are also functions (mapatoms and mapatoms-all) for mapping over all symbols in certain packages. See the explanation of packages (package).
You can also do what the mapping functions do in a different way by using loop. See loop-fun.
This chapter discusses functions that manipulate conses, and higher-level structures made up of conses such as lists and trees. It also discusses hash tables and resources, which are related facilities.
A cons is a primitive Lisp data object that is extremely simple: it knows about two other objects, called its car and its cdr.
A list is recursively defined to be the symbol nil, or a cons whose cdr is a list. A typical list is a chain of conses: the cdr of each is the next cons in the chain, and the cdr of the last one is the symbol nil. The cars of each of these conses are called the elements of the list. A list has one element for each cons; the empty list, nil, has no elements at all. Here are the printed representations of some typical lists:
(foo bar)          ;This list has two elements.
(a (b c d) e)      ;This list has three elements.
Note that the second list has three elements: a, (b c d), and e. The symbols b, c, and d are not elements of the list itself. (They are elements of the list which is the second element of the original list.)
A "dotted list" is like a list except that the cdr of the last cons does not have to be nil. This name comes from the printed representation, which includes a "dot" character. Here is an example:
(a b . c)
This "dotted list" is made of two conses. The car of the first cons is the symbol a, and the cdr of the first cons is the second cons. The car of the second cons is the symbol b, and the cdr of the second cons is the symbol c.
A tree is any data structure made up of conses whose cars and cdrs are other conses. The following are all printed representations of trees:
(foo . bar)
((a . b) (c . d))
((a . b) (c d e f (g . 5) s) (7 . 4))
These definitions are not mutually exclusive. Consider a cons whose car is a and whose cdr is (b (c d) e). Its printed representation is
(a b (c d) e)
It can be thought of and treated as a cons, or as a list of four elements, or as a tree containing six conses. You can even think of it as a "dotted list" whose last cons just happens to have nil as a cdr. Thus, lists and "dotted lists" and trees are not fundamental data types; they are just ways of thinking about structures of conses.
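The counts quoted above can be checked with a small sketch. count-conses is a name invented for this example; it is not a system function.

```lisp
;; count-conses is a hypothetical helper, not part of the system.
(defun count-conses (x)
  (cond ((atom x) 0)                       ;an atom contains no conses
        (t (+ 1                            ;this cons itself,
              (count-conses (car x))       ;plus those under its car
              (count-conses (cdr x))))))   ;and those under its cdr

(length '(a b (c d) e))         ; => 4, viewed as a list
(count-conses '(a b (c d) e))   ; => 6, viewed as a tree
(count-conses '(a b . c))       ; => 2, the "dotted list" shown earlier
```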
A circular list is like a list except that the cdr of the last cons, instead of being nil, is the first cons of the list. This means that the conses are all hooked together in a ring, with the cdr of each cons being the next cons in the ring. While these are perfectly good Lisp objects, and there are functions to deal with them, many other functions will have trouble with them. Functions that expect lists as their arguments often iterate down the chain of conses waiting to see a nil, and when handed a circular list this can cause them to compute forever. The printer (see print-fun) is one of these functions; if you try to print a circular list the printer will never stop producing text. You have to be careful what you do with circular lists.
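A minimal sketch of making a ring by hand and looking at it safely. The variable name ring is invented for the example. Only individual elements are returned, so nothing circular is ever printed.

```lisp
(let ((ring (list 'a 'b 'c)))
  (rplacd (last ring) ring)      ;the last cons now points back to the first
  ;; nth simply follows cdrs, so indices past 2 wrap around the ring.
  (list (nth 0 ring) (nth 3 ring) (nth 4 ring)))
;; => (a a b)
```

Calling length on ring, or returning ring itself to the printer, would be an instance of the non-terminating behavior described above.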
The Lisp Machine internally uses a storage scheme called "cdr coding" to represent conses. This scheme is intended to reduce the amount of storage used in lists. The use of cdr-coding is invisible to programs except in terms of storage efficiency; programs will work the same way whether or not lists are cdr-coded. Several of the functions below mention how they deal with cdr-coding. You can completely ignore all this if you want. However, if you are writing a program that allocates a lot of conses and you are concerned with storage efficiency, you may want to learn about the cdr-coded representation and how to control it. The cdr-coding scheme is discussed in cdr-code.
Returns the car of x.
Example:
(car '(a b c)) => a
Returns the cdr of x.
Example:
(cdr '(a b c)) => (b c)
Officially car and cdr are only applicable to conses and locatives. However, as a matter of convenience, car and cdr of nil return nil.
All of the compositions of up to four car’s and cdr’s are defined as functions in their own right. The names of these functions begin with "c" and end with "r", and in between is a sequence of "a"’s and "d"’s corresponding to the composition performed by the function.
Example:
(cddadr x) is the same as (cdr (cdr (car (cdr x))))
The error checking for these functions is exactly the same as for car and cdr above.
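For a concrete case, here is a sketch with an argument invented for the example. The letters between "c" and "r" are read right to left, just as the nested calls are evaluated from the inside out.

```lisp
(let ((x '(a (b c d e) f)))
  (list (cadr x)                      ;(car (cdr x))  => (b c d e)
        (cddadr x)                    ;               => (d e)
        (cdr (cdr (car (cdr x))))     ;the same thing => (d e)
        (cadr '(a))))                 ;(car nil)      => nil
;; => ((b c d e) (d e) (d e) nil)
```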
cons is the primitive function to create a new cons, whose car is x and whose cdr is y.
Examples:
(cons 'a 'b) => (a . b)
(cons 'a (cons 'b (cons 'c nil))) => (a b c)
(cons 'a '(b c d)) => (a b c d)
(ncons x) is the same as (cons x nil). The name of the function is from "nil-cons".
xcons ("exchanged cons") is like cons except that the order of the arguments is reversed.
Example:
(xcons 'a 'b) => (b . a)
This function creates a cons in a specific area. (Areas are an advanced feature of storage management, explained in chapter area-chapter; if you aren’t interested in them, you can safely skip all this stuff.) The first two arguments are the same as the two arguments to cons, and the third is the number of the area in which to create the cons.
Example:
(cons-in-area 'a 'b my-area) => (a . b)
(ncons-in-area x area-number) = (cons-in-area x nil area-number)
(xcons-in-area x y area-number) = (cons-in-area y x area-number)
The backquote reader macro facility is also generally useful for creating list structure, especially mostly-constant list structure, or forms constructed by plugging variables into a template. It is documented in the chapter on macros; see macro.
car-location returns a locative pointer to the cell containing the car of cons.
Note: there is no cdr-location function; it is difficult because of the cdr-coding scheme (see cdr-code).
length returns the length of list. The length of a list is the number of elements in it.
Examples:
(length nil) => 0
(length '(a b c d)) => 4
(length '(a (b c) d)) => 3
length could have been defined by:
(defun length (x)
    (cond ((atom x) 0)
          ((1+ (length (cdr x)))) ))
or by:
(defun length (x)
    (do ((n 0 (1+ n))
         (y x (cdr y)))
        ((atom y) n) ))
except that it is an error to take length of a non-nil atom.
These functions take a list as an argument, and return the first, second, etc. element of the list. first is identical to car, second is identical to cadr, and so on. The reason these names are provided is that they make more sense when you are thinking of the argument as a list rather than just as a cons.
restn returns the rest of the elements of a list, starting with element n (counting the first element as the zeroth). Thus rest1 is identical to cdr, rest2 is identical to cddr, and so on. The reason these names are provided is that they make more sense when you are thinking of the argument as a list rather than just as a cons.
(nth n list) returns the n’th element of list, where the zeroth element is the car of the list.
Examples:
(nth 1 '(foo bar gack)) => bar
(nth 3 '(foo bar gack)) => nil
If n is greater than or equal to the length of the list, nil is returned.
Note: this is not the same as the InterLisp function called nth, which is similar to but not exactly the same as the Lisp Machine function nthcdr. Also, some people have used macros and functions called nth of their own in their Maclisp programs, which may not work the same way; be careful.
nth could have been defined by:
(defun nth (n list)
    (do ((i n (1- i))
         (l list (cdr l)))
        ((zerop i) (car l))))
(nthcdr n list) cdrs list n times, and returns the result.
Examples:
(nthcdr 0 '(a b c)) => (a b c)
(nthcdr 2 '(a b c)) => (c)
In other words, it returns the n’th cdr of the list. If n is greater than or equal to the length of the list, nil is returned.
This is similar to InterLisp’s function nth, except that the InterLisp function is one-based instead of zero-based; see the InterLisp manual for details.
nthcdr could have been defined by:
(defun nthcdr (n list)
    (do ((i 0 (1+ i))
         (list list (cdr list)))
        ((= i n) list)))
last returns the last cons of list. If list is nil, it returns nil. Note that last is unfortunately not analogous to first (first returns the first element of a list, but last doesn’t return the last element of a list); this is a historical artifact.
Example:
(setq x '(a b c d))
(last x) => (d)
(rplacd (last x) '(e f))
x => (a b c d e f)
last could have been defined by:
(defun last (x)
    (cond ((atom x) x)
          ((atom (cdr x)) x)
          ((last (cdr x))) ))
list constructs and returns a list of its arguments.
Example:
(list 3 4 'a (car '(b . c)) (+ 6 -2)) => (3 4 a b 4)
list could have been defined by:
(defun list (&rest args)
    (let ((list (make-list (length args))))
      (do ((l list (cdr l))
           (a args (cdr a)))
          ((null a) list)
        (rplaca l (car a)))))
list* is like list except that the last cons of the constructed list is "dotted". It must be given at least one argument.
Example:
(list* 'a 'b 'c 'd) => (a b c . d)
This is like
(cons 'a (cons 'b (cons 'c 'd)))
More examples:
(list* 'a 'b) => (a . b)
(list* 'a) => a
list-in-area is exactly the same as list except that it takes an extra argument, an area number, and creates the list in that area.
list*-in-area is exactly the same as list* except that it takes an extra argument, an area number, and creates the list in that area.
This creates and returns a list containing length elements. length should be a fixnum. options are alternating keywords and values. The keywords may be either of the following:
:area
The value specifies in which area (see area) the list should be created. It should be either an area number (a fixnum), or nil to mean the default area.
:initial-value
The elements of the list will all be this value. It defaults to nil.
make-list always creates a cdr-coded list (see cdr-code).
Examples:
(make-list 3) => (nil nil nil)
(make-list 4 ':initial-value 7) => (7 7 7 7)
When make-list was originally implemented, it took exactly two arguments: the area and the length. This obsolete form is still supported so that old programs will continue to work, but the new keyword-argument form is preferred.
circular-list constructs a circular list whose elements are args, repeated infinitely. circular-list is the same as list except that the list itself is used as the last cdr, instead of nil. circular-list is especially useful with mapcar, as in the expression
(mapcar (function +) foo (circular-list 5))
which adds each element of foo to 5. circular-list could have been defined by:
(defun circular-list (&rest elements)
    (setq elements (copylist* elements))
    (rplacd (last elements) elements)
    elements)
Returns a list which is equal to list, but not eq. copylist does not copy any elements of the list: only the conses of the list itself. The returned list is fully cdr-coded (see cdr-code) to minimize storage. If the list is "dotted", that is, (cdr (last list)) is a non-nil atom, this will be true of the returned list also. You may optionally specify the area in which to create the new copy.
This is the same as copylist except that the last cons of the resulting list is never cdr-coded (see cdr-code). This makes for increased efficiency if you nconc something onto the list later.
copyalist is for copying association lists (see assoc-lists-section). The list is copied, as in copylist. In addition, each element of list which is a cons is replaced in the copy by a new cons with the same car and cdr. You may optionally specify the area in which to create the new copy.
copytree copies all the conses of a tree and makes a new tree with the same fringe.
reverse creates a new list whose elements are the elements of list taken in reverse order. reverse does not modify its argument, unlike nreverse which is faster but does modify its argument. The list created by reverse is not cdr-coded.
Example:
(reverse '(a b (c d) e)) => (e (c d) b a)
reverse could have been defined by:
(defun reverse (x)
    (do ((l x (cdr l))          ; scan down argument,
         (r nil                 ; putting each element
            (cons (car l) r)))  ; into list, until
        ((null l) r)))          ; no more elements.
nreverse reverses its argument, which should be a list. The argument is destroyed by rplacd’s all through the list (cf reverse).
Example:
(nreverse '(a b c)) => (c b a)
nreverse could have been defined by:
(defun nreverse (x)
    (cond ((null x) nil)
          ((nreverse1 x nil))))
(defun nreverse1 (x y)          ;auxiliary function
    (cond ((null (cdr x)) (rplacd x y))
          ((nreverse1 (cdr x) (rplacd x y)))))
          ;; this last call depends on order of argument evaluation.
Currently, nreverse does something inefficient with cdr-coded (see cdr-code) lists, because it just uses rplacd in the straightforward way. This may be fixed someday. In the meantime reverse might be preferable in some cases.
The arguments to append are lists. The result is a list which is the concatenation of the arguments. The arguments are not changed (cf nconc).
Example:
(append '(a b c) '(d e f) nil '(g)) => (a b c d e f g)
append makes copies of the conses of all the lists it is given, except for the last one. So the new list will share the conses of the last argument to append, but all of the other conses will be newly created. Only the lists are copied, not the elements of the lists.
A version of append which only accepts two arguments could have been defined by:
(defun append2 (x y)
    (cond ((null x) y)
          ((cons (car x) (append2 (cdr x) y)) )))
The generalization to any number of arguments could then be made (relying on car of nil being nil):
(defun append (&rest args)
    (if (< (length args) 2) (car args)
        (append2 (car args)
                 (apply (function append) (cdr args)))))
These definitions do not express the full functionality of append; the real definition minimizes storage utilization by cdr-coding (see cdr-code) the list it produces, using cdr-next except at the end where a full node is used to link to the last argument, unless the last argument is nil in which case cdr-nil is used.
To copy a list, use copylist (see copylist-fun); the old practice of using append to copy lists is unclear and obsolete.
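The sharing of the last argument can be observed with eq. This is a sketch; the variable names are invented for the example.

```lisp
(let* ((x (list 'a 'b))
       (y (list 'c 'd))
       (z (append x y)))
  (list z
        (eq (cddr z) y)     ;the tail of z is y itself, not a copy
        (eq z x)            ;the conses of x were copied
        x))                 ;and x is unchanged
;; => ((a b c d) t nil (a b))
```

One consequence is that a later rplaca or rplacd on y would also be visible through z.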
nconc takes lists as arguments. It returns a list which is the arguments concatenated together. The arguments are changed, rather than copied. (cf append, append-fun)
Example:
(setq x '(a b c))
(setq y '(d e f))
(nconc x y) => (a b c d e f)
x => (a b c d e f)
Note that the value of x is now different, since its last cons has been rplacd’d to the value of y. If the nconc form were evaluated again, it would yield a piece of "circular" list structure, whose printed representation would be (a b c d e f d e f d e f ...), repeating forever.
nconc could have been defined by:
(defun nconc (x y)               ;for simplicity, this definition
    (cond ((null x) y)           ;only works for 2 arguments.
          (t (rplacd (last x) y) ;hook y onto x
             x)))                ;and return the modified x.
(nreconc x y) is exactly the same as (nconc (nreverse x) y) except that it is more efficient. Both x and y should be lists.
nreconc could have been defined by:
(defun nreconc (x y)
    (cond ((null x) y)
          ((nreverse1 x y)) ))
using the same nreverse1 as above.
This creates and returns a list with the same elements as list, excepting the last element.
Examples:
(butlast '(a b c d)) => (a b c)
(butlast '((a b) (c d))) => ((a b))
(butlast '(a)) => nil
(butlast nil) => nil
The name is from the phrase "all elements but the last".
This is the destructive version of butlast; it changes the cdr of the second-to-last cons of the list to nil. If there is no second-to-last cons (that is, if the list has fewer than two elements) it returns nil.
Examples:
(setq foo '(a b c d))
(nbutlast foo) => (a b c)
foo => (a b c)
(nbutlast '(a)) => nil
firstn returns a list of length n, whose elements are the first n elements of list. If list is fewer than n elements long, the remaining elements of the returned list will be nil.
Example:
(firstn 2 '(a b c d)) => (a b)
(firstn 0 '(a b c d)) => nil
(firstn 6 '(a b c d)) => (a b c d nil nil)
Returns a "tail" of list, i.e., one of the conses that makes up list, or nil. (nleft n list) returns the last n elements of list. If n is too large, nleft will return list.
(nleft n list tail) takes cdr of list enough times that taking n more cdrs would yield tail, and returns that. You can see that when tail is nil this is the same as the two-argument case. If tail is not eq to any tail of list, nleft will return nil.
list should be a list, and sublist should be one of the conses that make up list. ldiff (meaning "list difference") will return a new list, whose elements are those elements of list that appear before sublist.
Examples:
(setq x '(a b c d e))
(setq y (cdddr x)) => (d e)
(ldiff x y) => (a b c)
but
(ldiff '(a b c d) '(c d)) => (a b c d)
since the sublist was not eq to any part of the list.
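A common way to obtain a suitable sublist is to search for it, since the searching functions return one of the conses of the list they are given. The following sketch (variable names invented for the example) splits a list at the first d:

```lisp
(let* ((x (list 'a 'b 'c 'd 'e))
       (tail (member 'd x))            ;tail is one of the conses of x
       (front (ldiff x tail)))         ;new list of the elements before tail
  (list front
        tail
        (equal (append front tail) x)))  ;the two pieces make up x again
;; => ((a b c) (d e) t)
```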
The functions rplaca and rplacd are used to make alterations in already-existing list structure; that is, to change the cars and cdrs of existing conses. The structure is not copied but is physically altered; hence caution should be exercised when using these functions, as strange side-effects can occur if portions of list structure become shared unbeknownst to the programmer. The nconc, nreverse, nreconc, and nbutlast functions already described, and the delq family described later, have the same property.
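The kind of side-effect meant here can be seen in a small sketch, in which two lists happen to share a tail. The variable names are invented for the example.

```lisp
(let* ((tail (list 'c 'd))
       (x (cons 'a tail))      ;x is (a c d)
       (y (cons 'b tail)))     ;y is (b c d), sharing tail with x
  (rplaca tail 'z)             ;intended, say, as a change to x only
  (list x y))
;; => ((a z d) (b z d))
```

Both lists change, because there is only one cons holding what used to be c.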
(rplaca x y) changes the car of x to y and returns (the modified) x. x must be a cons or a locative. y may be any Lisp object.
Example:
(setq g '(a b c))
(rplaca (cdr g) 'd) => (d c)
Now g => (a d c)
(rplacd x y) changes the cdr of x to y and returns (the modified) x. x must be a cons or a locative. y may be any Lisp object.
Example:
(setq x '(a b c))
(rplacd x 'd) => (a . d)
Now x => (a . d)
(subst new old tree) substitutes new for all occurrences of old in tree, and returns the modified copy of tree. The original tree is unchanged, as subst recursively copies all of tree replacing elements equal to old as it goes.
Example:
(subst 'Tempest 'Hurricane
       '(Shakespeare wrote (The Hurricane)))
    => (Shakespeare wrote (The Tempest))
subst could have been defined by:
(defun subst (new old tree)
    (cond ((equal tree old) new)            ;if item equal to old, replace.
          ((atom tree) tree)                ;if no substructure, return arg.
          ((cons (subst new old (car tree)) ;otherwise recurse.
                 (subst new old (cdr tree))))))
Note that this function is not "destructive"; that is, it does not change the car or cdr of any already-existing list structure.
To copy a tree, use copytree (see copytree-fun); the old practice of using subst to copy trees is unclear and obsolete.
Note: certain details of subst may be changed in the future. It may be changed to use eq rather than equal for the comparison, and it may substitute only in cars, not in cdrs. This is still being discussed.
nsubst is a destructive version of subst. The list structure of tree is altered by replacing each occurrence of old with new. nsubst could have been defined as
(defun nsubst (new old tree)
    (cond ((eq tree old) new)     ;If item eq to old, replace.
          ((atom tree) tree)      ;If no substructure, return arg.
          (t                      ;Otherwise, recurse.
             (rplaca tree (nsubst new old (car tree)))
             (rplacd tree (nsubst new old (cdr tree)))
             tree)))
sublis makes substitutions for symbols in a tree. The first argument to sublis is an association list (see assoc-lists-section). The second argument is the tree in which substitutions are to be made. sublis looks at all symbols in the fringe of the tree; if a symbol appears in the association list, occurrences of it are replaced by the object it is associated with. The argument is not modified; new conses are created where necessary and only where necessary, so the newly created tree shares as much of its substructure as possible with the old. For example, if no substitutions are made, the result is just the old tree.
Example:
(sublis '((x . 100) (z . zprime))
        '(plus x (minus g z x p) 4))
    => (plus 100 (minus g zprime 100 p) 4)
sublis could have been defined by:
(defun sublis (alist sexp)
    (cond ((atom sexp)
           (let ((tem (assq sexp alist)))
             (if tem (cdr tem) sexp)))
          ((let ((car (sublis alist (car sexp)))
                 (cdr (sublis alist (cdr sexp))))
             (if (and (eq (car sexp) car) (eq (cdr sexp) cdr))
                 sexp
                 (cons car cdr))))))
nsublis is like sublis but changes the original tree instead of creating a new one.
nsublis could have been defined by:
(defun nsublis (alist tree)
    (cond ((atom tree)
           (let ((tem (assq tree alist)))
             (if tem (cdr tem) tree)))
          (t (rplaca tree (nsublis alist (car tree)))
             (rplacd tree (nsublis alist (cdr tree)))
             tree)))
This section explains the internal data format used to store conses inside the Lisp Machine. Casual users don’t have to worry about this; you can skip this section if you want. It is only important to read this section if you require extra storage efficiency in your program.
The usual and obvious internal representation of conses in any implementation of Lisp is as a pair of pointers, contiguous in memory. If we call the amount of storage that it takes to store a Lisp pointer a "word", then conses normally occupy two words. One word (say it’s the first) holds the car, and the other word (say it’s the second) holds the cdr. To get the car or cdr of a list, you just reference this memory location, and to change the car or cdr, you just store into this memory location.
Very often, conses are used to store lists. If the above representation is used, a list of n elements requires two times n words of memory: n to hold the pointers to the elements of the list, and n to point to the next cons or to nil. To optimize this particular case of using conses, the Lisp Machine uses a storage representation called "cdr coding" to store lists. The basic goal is to allow a list of n elements to be stored in only n locations, while allowing conses that are not parts of lists to be stored in the usual way.
The way it works is that there is an extra two-bit field in every word of memory, called the "cdr-code" field. There are three meaningful values that this field can have, which are called cdr-normal, cdr-next, and cdr-nil. The regular, non-compact way to store a cons is by two contiguous words, the first of which holds the car and the second of which holds the cdr. In this case, the cdr code of the first word is cdr-normal. (The cdr code of the second word doesn’t matter; as we will see, it is never looked at.) The cons is represented by a pointer to the first of the two words. When a list of n elements is stored in the most compact way, pointers to the n elements occupy n contiguous memory locations. The cdr codes of all these locations are cdr-next, except the last location whose cdr code is cdr-nil. The list is represented as a pointer to the first of the n words.
Now, how are the basic operations on conses defined to work based on this data structure? Finding the car is easy: you just read the contents of the location addressed by the pointer. Finding the cdr is more complex. First you must read the contents of the location addressed by the pointer, and inspect the cdr-code you find there. If the code is cdr-normal, then you add one to the pointer, read the location it addresses, and return the contents of that location; that is, you read the second of the two words. If the code is cdr-next, you add one to the pointer, and simply return that pointer without doing any more reading; that is, you return a pointer to the next word in the n-word block. If the code is cdr-nil, you simply return nil.
If you examine these rules, you will find that they work fine even if you mix the two kinds of storage representation within the same list. There’s no problem with doing that.
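The rules for car and cdr can be acted out in a toy model written in Lisp itself. This is only a sketch under invented assumptions, not the real implementation: "memory" is an ordinary list of words, each word is a cons of a cdr-code and its contents, a "pointer" is an index into that list (or nil), and all the sim- names are hypothetical.

```lisp
(defun sim-car (memory ptr)
  (cdr (nth ptr memory)))                    ;just read the addressed word

(defun sim-cdr (memory ptr)
  (let ((code (car (nth ptr memory))))       ;inspect the cdr-code
    (cond ((eq code 'cdr-normal)
           (cdr (nth (1+ ptr) memory)))      ;read the second of the two words
          ((eq code 'cdr-next) (1+ ptr))     ;pointer to the next word
          (t nil))))                         ;cdr-nil

(defun sim-elements (memory ptr)
  ;; Collect the cars reached by taking cdrs until nil is returned.
  (cond ((null ptr) nil)
        (t (cons (sim-car memory ptr)
                 (sim-elements memory (sim-cdr memory ptr))))))

(defun sim-demo ()
  (let ((memory (list (cons 'cdr-next 'a)    ;word 0: compact list (a b c)
                      (cons 'cdr-next 'b)    ;word 1
                      (cons 'cdr-nil 'c)     ;word 2
                      (cons 'cdr-normal 'x)  ;word 3: ordinary cons, car is x
                      (cons nil 0))))        ;word 4: its cdr, a pointer to word 0
    (list (sim-elements memory 0)
          (sim-elements memory 3))))

(sim-demo)   ; => ((a b c) (x a b c))
```

The second result is a list that mixes the two representations: an ordinary two-word cons whose cdr is a compact list.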
How about changing the structure? Like car, rplaca is very easy; you just store into the location addressed by the pointer. To do an rplacd you must read the location addressed by the pointer and examine the cdr code. If the code is cdr-normal, you just store into the location one greater than that addressed by the pointer; that is, you store into the second word of the two words. But if the cdr-code is cdr-next or cdr-nil, there is a problem: there is no memory cell that is storing the cdr of the cons. That is the cell that has been optimized out; it just doesn’t exist.
This problem is dealt with by the use of "invisible pointers". An invisible pointer is a special kind of pointer, recognized by its data type (Lisp Machine pointers include a data type field as well as an address field). The way they work is that when the Lisp Machine reads a word from memory, if that word is an invisible pointer then it proceeds to read the word pointed to by the invisible pointer and use that word instead of the invisible pointer itself. Similarly, when it writes to a location, it first reads the location, and if it contains an invisible pointer then it writes to the location addressed by the invisible pointer instead. (This is a somewhat simplified explanation; actually there are several kinds of invisible pointer that are interpreted in different ways at different times, used for things other than the cdr coding scheme.)
Here’s how to do an rplacd when the cdr code is cdr-next or cdr-nil. Call the location addressed by the first argument to rplacd l. First, you allocate two contiguous words (in the same area that l points to). Then you store the old contents of l (the car of the cons) and the second argument to rplacd (the new cdr of the cons) into these two words. You set the cdr-code of the first of the two words to cdr-normal. Then you write an invisible pointer, pointing at the first of the two words, into location l. (It doesn’t matter what the cdr-code of this word is, since the invisible pointer data type is checked first, as we will see.)
Now, whenever any operation is done to the cons (car, cdr, rplaca, or rplacd), the initial reading of the word pointed to by the Lisp pointer that represents the cons will find an invisible pointer in the addressed cell. When the invisible pointer is seen, the address it contains is used in place of the original address. So the newly-allocated two-word cons will be used for any operation done on the original object.
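The rplacd procedure can be sketched in a toy model as well. Again this is an illustration under invented assumptions rather than the real mechanism: each word is a list (cdr-code data-type contents), a pointer is an index into the list called memory (or nil), new words are simply added at the end of memory with no notion of areas, and all the toy- names are hypothetical.

```lisp
(defun toy-follow (memory ptr)
  ;; Return the address that really holds the cons,
  ;; chasing any invisible pointers found on the way.
  (let ((word (nth ptr memory)))
    (cond ((eq (cadr word) 'invisible)
           (toy-follow memory (caddr word)))
          (t ptr))))

(defun toy-car (memory ptr)
  (caddr (nth (toy-follow memory ptr) memory)))

(defun toy-cdr (memory ptr)
  (let* ((p (toy-follow memory ptr))
         (code (car (nth p memory))))
    (cond ((eq code 'cdr-normal) (caddr (nth (1+ p) memory)))
          ((eq code 'cdr-next) (1+ p))
          (t nil))))

(defun toy-rplacd (memory ptr new-cdr)
  (let* ((p (toy-follow memory ptr))
         (word (nth p memory)))
    (cond ((eq (car word) 'cdr-normal)
           ;; Easy case: store into the second of the two words.
           (rplaca (cddr (nth (1+ p) memory)) new-cdr))
          (t
           ;; cdr-next or cdr-nil: allocate two fresh words holding the
           ;; old car and the new cdr, then overwrite the old word with
           ;; an invisible pointer to the first of them.
           (let ((new (length memory)))
             (nconc memory
                    (list (list 'cdr-normal 'object (caddr word))
                          (list nil 'object new-cdr)))
             (rplaca (cdr word) 'invisible)
             (rplaca (cddr word) new))))
    ptr))

(defun toy-elements (memory ptr)
  (cond ((null ptr) nil)
        (t (cons (toy-car memory ptr)
                 (toy-elements memory (toy-cdr memory ptr))))))

(defun toy-demo ()
  (let ((memory (list (list 'cdr-next 'object 'a)     ;word 0: compact (a b c)
                      (list 'cdr-next 'object 'b)     ;word 1
                      (list 'cdr-nil 'object 'c))))   ;word 2
    (toy-rplacd memory 0 2)           ;make the cdr of the first cons be (c)
    (list (toy-elements memory 0)     ;the list as seen through word 0
          (toy-elements memory 1)     ;the old tail is unaffected
          (length memory))))          ;two extra words were allocated

(toy-demo)   ; => ((a c) (b c) 5)
```

The three-word list now occupies five words, which is the storage cost discussed in the following paragraph.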
Why is any of this important to users? In fact, it is all invisible to you; everything works the same way whether or not compact representation is used, from the point of view of the semantics of the language. That is, the only difference that any of this makes is a difference in efficiency. The compact representation is more efficient in most cases. However, if the conses are going to get rplacd’ed, then invisible pointers will be created, extra memory will be allocated, and the compact representation will be seen to degrade storage efficiency rather than improve it. Also, accesses that go through invisible pointers are somewhat slower, since more memory references are needed. So if you care a lot about storage efficiency, you should be careful about which lists get stored in which representations.
You should try to use the normal representation for those data structures that will be subject to rplacding operations, including nconc and nreverse, and the compact representation for other structures. The functions cons, xcons, ncons, and their area variants make conses in the normal representation. The functions list, list*, list-in-area, make-list, and append use the compact representation. The other list-creating functions, including read, currently make normal lists, although this might get changed. Some functions, such as sort, take special care to operate efficiently on compact lists (sort effectively treats them as arrays). nreverse is rather slow on compact lists, currently, since it simple-mindedly uses rplacd, but this will be changed.
(copylist x) is a suitable way to copy a list, converting it into compact form (see copylist-fun).
Zetalisp includes functions which simplify the maintenance of tabular data structures of several varieties. The simplest is a plain list of items, which models (approximately) the concept of a set. There are functions to add (cons), remove (delete, delq, del, del-if, del-if-not, remove, remq, rem, rem-if, rem-if-not), and search for (member, memq, mem) items in a list. Set union, intersection, and difference functions can be easily written using these.
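One way such set functions might be written is sketched below, using member for the search. The names my-union, my-intersection, and my-difference are invented for this example and are not system functions.

```lisp
(defun my-union (a b)
  (cond ((null a) b)
        ((member (car a) b) (my-union (cdr a) b))         ;already in b
        (t (cons (car a) (my-union (cdr a) b)))))

(defun my-intersection (a b)
  (cond ((null a) nil)
        ((member (car a) b)
         (cons (car a) (my-intersection (cdr a) b)))      ;in both
        (t (my-intersection (cdr a) b))))

(defun my-difference (a b)
  (cond ((null a) nil)
        ((member (car a) b) (my-difference (cdr a) b))    ;drop it
        (t (cons (car a) (my-difference (cdr a) b)))))

(my-union '(a b c) '(b c d))          ; => (a b c d)
(my-intersection '(a b c) '(b c d))   ; => (b c)
(my-difference '(a b c) '(b c d))     ; => (a)
```

Each call searches b once per element of a, so these are only reasonable for short lists.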
Association lists are very commonly used. An association list is a list of conses. The car of each cons is a "key" and the cdr is a "datum", or a list of associated data. The functions assoc, assq, ass, memass, and rassoc may be used to retrieve the data, given the key. For example,
((tweety . bird) (sylvester . cat))
is an association list with two elements. Given a symbol representing the name of an animal, a program can use it to retrieve what kind of animal this is.
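A lookup in this association list might look like the following sketch. The variable name animals is invented for the example.

```lisp
(let ((animals '((tweety . bird) (sylvester . cat))))
  (list (assoc 'sylvester animals)       ;the whole pair for the key
        (cdr (assoc 'tweety animals))    ;just the datum
        (assoc 'bugs animals)))          ;no such key
;; => ((sylvester . cat) bird nil)
```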
Structured records can be stored as association lists or as stereotyped cons-structures where each element of the structure has a certain car-cdr path associated with it. However, these are better implemented using structure macros (see defstruct).
Simple list-structure is very convenient, but may not be efficient enough for large data bases because it takes a long time to search a long list. Zetalisp includes hash table facilities for more efficient but more complex tables (see hash-table), and a hashing function (sxhash) to aid users in constructing their own facilities.
(memq item list) returns nil if item is not one of the elements of list. Otherwise, it returns the sublist of list beginning with the first occurrence of item; that is, it returns the first cons of the list whose car is item. The comparison is made by eq. Because memq returns nil if it doesn’t find anything, and something non-nil if it finds something, it is often used as a predicate.
Examples:
(memq 'a '(1 2 3 4)) => nil
(memq 'a '(g (x a y) c a d e a f)) => (a d e a f)
Note that the value returned by memq is eq to the portion of the list beginning with a. Thus rplaca on the result of memq may be used, if you first check to make sure memq did not return nil.
Example:
(let ((sublist (memq x z)))     ;Search for x in the list z.
  (if (not (null sublist))      ;If it is found,
      (rplaca sublist y)))      ;Replace it with y.
memq could have been defined by:
(defun memq (item list)
    (cond ((null list) nil)
          ((eq item (car list)) list)
          (t (memq item (cdr list))) ))
memq is hand-coded in microcode and therefore especially fast.
member is like memq, except equal is used for the comparison, instead of eq.
member could have been defined by:
(defun member (item list)
    (cond ((null list) nil)
          ((equal item (car list)) list)
          (t (member item (cdr list))) ))
mem is the same as memq except that it takes an extra argument which should be a predicate of two arguments, which is used for the comparison instead of eq. (mem 'eq a b) is the same as (memq a b). (mem 'equal a b) is the same as (member a b).
mem is usually used with equality predicates other than eq and equal, such as =, char-equal or string-equal. It can also be used with non-commutative predicates. The predicate is called with item as its first argument and the element of list as its second argument, so
(mem #'< 4 list)
finds the first element in list for which (< 4 x) is true; that is, it finds the first element greater than 4.
find-position-in-list looks down list for an element which is eq to item, like memq. However, it returns the numeric index in the list at which it found the first occurrence of item, or nil if it did not find it at all. This function is sort of the complement of nth (see nth-fun); like nth, it is zero-based.
Examples:
(find-position-in-list 'a '(a b c)) => 0
(find-position-in-list 'c '(a b c)) => 2
(find-position-in-list 'e '(a b c)) => nil
find-position-in-list-equal is exactly the same as find-position-in-list, except that the comparison is done with equal instead of eq.
Returns t if sublist is a sublist of list (i.e., one of the conses that makes up list). Otherwise returns nil. Another way to look at this is that tailp returns t if (nthcdr n list) is sublist, for some value of n.
tailp could have been defined by:
(defun tailp (sublist list)
    (do list list (cdr list) (null list)
      (if (eq sublist list)
          (return t))))
(delq item list) returns the list with all occurrences of item removed. eq is used for the comparison. The argument list is actually modified (rplacd’ed) when instances of item are spliced out. delq should be used for value, not for effect. That is, use
(setq a (delq 'b a))
rather than
(delq 'b a)
These two are not equivalent when the first element of the value of a is b.
(delq item list n) is like (delq item list) except only the first n instances of item are deleted. n is allowed to be zero. If n is greater than or equal to the number of occurrences of item in the list, all occurrences of item in the list will be deleted.
Example:
(delq 'a '(b a c (a b) d a e)) => (b c (a b) d e)
delq could have been defined by:
(defun delq (item list &optional (n -1))
  (cond ((or (atom list) (zerop n)) list)
        ((eq item (car list))
         (delq item (cdr list) (1- n)))
        (t (rplacd list (delq item (cdr list) n)))))
If the third argument (n) is not supplied, it defaults to -1, which is effectively infinity since it can be decremented any number of times without reaching zero.
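The following sketch shows why delq must be used for value. It copies the definition above under the hypothetical name my-delq, so that the system's delq is left alone; the variable a is local to the example.

```lisp
;; my-delq is a hypothetical name; the body is the definition shown above.
(defun my-delq (item list &optional (n -1))
  (cond ((or (atom list) (zerop n)) list)
        ((eq item (car list))
         (my-delq item (cdr list) (1- n)))
        (t (rplacd list (my-delq item (cdr list) n)))))

(let ((a (list 'b 'a 'c)))
  (my-delq 'b a)              ; value discarded: a is still (b a c)
  (setq a (my-delq 'b a))     ; a is now (a c)
  a)                          ; => (a c)

(my-delq 'a (list 'a 'b 'a 'c 'a) 2)   ; => (b c a)
```

The first cons of the list cannot be spliced out by rplacd, since nothing inside the list points to it; only the returned value reflects its removal.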
delete is the same as delq except that equal is used for the comparison instead of eq.
del is the same as delq except that it takes an extra argument, which should be a predicate of two arguments and which is used for the comparison instead of eq. (del 'eq a b) is the same as (delq a b). (cf. mem, mem-fun)
remq is similar to delq, except that the list is not altered; rather, a new list is returned.
Examples:
(setq x '(a b c d e f))
(remq 'b x) => (a c d e f)
x => (a b c d e f)
(remq 'b '(a b c b a b) 2) => (a c a b)
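A non-destructive counterpart of the delq definition could be sketched by consing where delq would rplacd. The name my-remq is hypothetical. Whether the real remq shares the unexamined tail with its argument once n instances have been removed is not stated here; this sketch does share it, which is an assumption.

```lisp
;; Illustrative sketch only; my-remq is a hypothetical name.
(defun my-remq (item list &optional (n -1))
  (cond ((or (atom list) (zerop n)) list)   ; remaining tail is shared (assumption)
        ((eq item (car list))
         (my-remq item (cdr list) (1- n)))
        ;; cons a fresh cell instead of altering the argument
        (t (cons (car list) (my-remq item (cdr list) n)))))

(my-remq 'b '(a b c d e f))       ; => (a c d e f)
(my-remq 'b '(a b c b a b) 2)     ; => (a c a b)
```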
remove is the same as remq except that equal is used for the comparison instead of eq.
rem is the same as remq except that it takes an extra argument, which should be a predicate of two arguments and which is used for the comparison instead of eq. (rem 'eq a b) is the same as (remq a b). (cf. mem, mem-fun)
predicate should be a function of one argument. A new list is made by applying predicate to all of the elements of list and removing the ones for which the predicate returns nil. One of this function’s names (rem-if-not) means "remove if this condition is not true"; i.e., it keeps the elements for which predicate is true. The other name (subset) refers to the function’s action if list is considered to represent a mathematical set.
predicate should be a function of one argument. A new list is made by applying predicate to all of the elements of list and removing the ones for which the predicate returns non-nil. One of this function’s names (rem-if) means "remove if this condition is true". The other name (subset-not) refers to the function’s action if list is considered to represent a mathematical set.
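The two filtering functions described above can be sketched as follows. The names my-rem-if-not and my-rem-if are hypothetical, and the argument order (predicate first, then list) is an assumption of this sketch.

```lisp
;; Illustrative sketches only; both names are hypothetical.
(defun my-rem-if-not (predicate list)
  (cond ((null list) nil)
        ((funcall predicate (car list))        ; keep elements satisfying predicate
         (cons (car list) (my-rem-if-not predicate (cdr list))))
        (t (my-rem-if-not predicate (cdr list)))))

(defun my-rem-if (predicate list)
  (cond ((null list) nil)
        ((funcall predicate (car list))        ; drop elements satisfying predicate
         (my-rem-if predicate (cdr list)))
        (t (cons (car list) (my-rem-if predicate (cdr list))))))

(my-rem-if-not #'numberp '(a 1 b 2 c))   ; => (1 2)
(my-rem-if #'numberp '(a 1 b 2 c))       ; => (a b c)
```

Both sketches build a new list and leave their argument untouched.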
del-if is just like rem-if except that it modifies list rather than creating a new list.
del-if-not is just like rem-if-not except that it modifies list rather than creating a new list.
every returns t if predicate returns non-nil when applied to every element of list, or nil if predicate returns nil for some element. If step-function is present, it replaces cdr as the function used to get to the next element of the list; cddr is a typical function to use here.
some returns the tail of list whose car is the first element for which predicate returns non-nil, or nil if predicate returns nil for every element. If step-function is present, it replaces cdr as the function used to get to the next element of the list; cddr is a typical function to use here.
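The effect of the step-function can be seen in a sketch of some. The name my-some is hypothetical, and the argument order used here (list, predicate, optional step-function) is an assumption of the sketch.

```lisp
;; Illustrative sketch only; my-some is a hypothetical name.
(defun my-some (list predicate &optional (step-function #'cdr))
  (do ((tail list (funcall step-function tail)))
      ((null tail) nil)
    (if (funcall predicate (car tail)) (return tail))))

(my-some '(a 1 b 2) #'numberp)           ; => (1 b 2)
(my-some '(a 1 b 2) #'numberp #'cddr)    ; => nil
```

With cddr as the step-function only every other element (here a and b) is examined, which is convenient for lists of alternating keys and values.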
(assq item alist) looks up item in the association list (list of conses) alist. The value is the first cons whose car is eq to item, or nil if there is none such.
Examples:
(assq 'r '((a . b) (c . d) (r . x) (s . y) (r . z))) => (r . x)
(assq 'fooo '((foo . bar) (zoo . goo))) => nil
(assq 'b '((a b c) (b c d) (x y z))) => (b c d)
It is okay to rplacd the result of assq as long as it is not nil, if your intention is to "update" the "table" that was assq’s second argument.
Example:
(setq values '((x . 100) (y . 200) (z . 50)))
(assq 'y values) => (y . 200)
(rplacd (assq 'y values) 201)
(assq 'y values) => (y . 201) now
A typical trick is to say (cdr (assq x y)). Since the cdr of nil is guaranteed to be nil, this yields nil if no pair is found (or if a pair is found whose cdr is nil).
assq could have been defined by:
(defun assq (item list)
  (cond ((null list) nil)
        ((eq item (caar list)) (car list))
        ((assq item (cdr list)))))
assoc is like assq except that the comparison uses equal instead of eq.
Example:
(assoc '(a b) '((x . y) ((a b) . 7) ((c . d) . e))) => ((a b) . 7)
assoc could have been defined by:
(defun assoc (item list)
  (cond ((null list) nil)
        ((equal item (caar list)) (car list))
        ((assoc item (cdr list)))))
ass is the same as assq except that it takes an extra argument, which should be a predicate of two arguments and which is used for the comparison instead of eq. (ass 'eq a b) is the same as (assq a b). (cf. mem, mem-fun) As with mem, you may use non-commutative predicates; the first argument to the predicate is item and the second is the key of the element of alist.
memass searches alist just like ass, but returns the portion of the list beginning with the pair containing item, rather than the pair itself. (car (memass x y z)) = (ass x y z). (cf. mem, mem-fun) As with mem, you may use non-commutative predicates; the first argument to the predicate is item and the second is the key of the element of alist.
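By analogy with the definition of assq, ass and memass could be sketched as follows. The names my-ass and my-memass are hypothetical; these are illustrations of the relationship between the two functions, not the system's definitions.

```lisp
;; Illustrative sketches only; both names are hypothetical.
(defun my-ass (predicate item alist)
  (cond ((null alist) nil)
        ;; item is the first argument, the key of the pair the second
        ((funcall predicate item (caar alist)) (car alist))
        (t (my-ass predicate item (cdr alist)))))

(defun my-memass (predicate item alist)
  (cond ((null alist) nil)
        ((funcall predicate item (caar alist)) alist)   ; the tail, not the pair
        (t (my-memass predicate item (cdr alist)))))

(my-ass #'< 4 '((1 . a) (5 . b) (9 . c)))      ; => (5 . b)
(my-memass #'< 4 '((1 . a) (5 . b) (9 . c)))   ; => ((5 . b) (9 . c))
```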
rassq means "reverse assq". It is like assq, but it tries to find an element of alist whose cdr (not car) is eq to item. rassq could have been defined by:
(defun rassq (item in-list)
  (do l in-list (cdr l) (null l)
    (and (eq item (cdar l))
         (return (car l)))))
rassoc is to rassq as assoc is to assq. That is, it finds an element whose cdr is equal to item.
rass is to rassq as ass is to assq. That is, it takes a predicate to be used instead of eq. (cf. mem, mem-fun) As with mem, you may use non-commutative predicates; the first argument to the predicate is item and the second is the cdr of the element of alist.
(sassq item alist fcn) is like (assq item alist) except that if item is not found in alist, instead of returning nil, sassq calls the function fcn with no arguments. sassq could have been defined by:
(defun sassq (item alist fcn)
  (or (assq item alist)
      (apply fcn nil)))
sassq and sassoc (see below) are of limited use. These are primarily leftovers from Lisp 1.5.
(sassoc item alist fcn) is like (assoc item alist) except that if item is not found in alist, instead of returning nil, sassoc calls the function fcn with no arguments. sassoc could have been defined by:
(defun sassoc (item alist fcn)
  (or (assoc item alist)
      (apply fcn nil)))
pairlis takes two lists and makes an association list which associates elements of the first list with corresponding elements of the second list.
Example:
(pairlis '(beef clams kitty) '(roast fried yu-shiang))
   => ((beef . roast) (clams . fried) (kitty . yu-shiang))
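A sketch of pairlis follows. The name my-pairlis is hypothetical. What the real function does when the two lists differ in length is not stated here; stopping at the end of the shorter list is an assumption of this sketch.

```lisp
;; Illustrative sketch only; my-pairlis is a hypothetical name.
(defun my-pairlis (cars cdrs)
  (cond ((or (null cars) (null cdrs)) nil)     ; stop at the shorter list (assumption)
        (t (cons (cons (car cars) (car cdrs))
                 (my-pairlis (cdr cars) (cdr cdrs))))))

(my-pairlis '(beef clams kitty) '(roast fried yu-shiang))
;; => ((beef . roast) (clams . fried) (kitty . yu-shiang))
```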
From time immemorial, Lisp has had a kind of tabular data structure called a property list (plist for short). A property list contains zero or more entries; each entry associates from a keyword symbol (called the indicator) to a Lisp object (called the value or, sometimes, the property). There are no duplications among the indicators; a property-list can only have one property at a time with a given name.
This is very similar to an association list. The difference is that a property list is an object with a unique identity; the operations for adding and removing property-list entries are side-effecting operations which alter the property-list rather than making a new one. An association list with no entries would be the empty list (), i.e., the symbol nil. There is only one empty list, so all empty association lists are the same object. Each empty property-list is a separate and distinct object.
The implementation of a property list is a memory cell containing a list with an even number (possibly zero) of elements. Each pair of elements constitutes a property; the first of the pair is the indicator and the second is the value. The memory cell is there to give the property list a unique identity and to provide for side-effecting operations.
The term "property list" is sometimes incorrectly used to refer to the list of entries inside the property list, rather than the property list itself. This is regrettable and confusing.
How do we deal with "memory cells" in Lisp; i.e., what kind of Lisp object is a property list? Rather than being a distinct primitive data type, a property list can exist in one of three forms:
1. A property list can be a cons whose cdr is the list of entries and whose car is not used by the property-list functions, and so is available to the user to store something.
2. The system associates a property list with every symbol (see symbol-plist-section). A symbol can be used where a property list is expected; the property-list primitives will automatically find the symbol’s property list and use it.
3. A property list can be a memory cell in the middle of some data structure, such as a list, an array, an instance, or a defstruct. An arbitrary memory cell of this kind is named by a locative (see locative). Such locatives are typically created with the locf special form (see locf-fun).
Property lists of the first kind are called "disembodied" property lists because they are not associated with a symbol or other data structure. The way to create a disembodied property list is (ncons nil), or (ncons data) to store data in the car of the property list.
Here is an example of the list of entries inside the property list of a symbol named b1 which is being used by a program which deals with blocks:
(color blue on b6 associated-with (b2 b3 b4))
There are three properties, and so the list has six elements. The first property’s indicator is the symbol color, and its value is the symbol blue. One says that "the value of b1’s color property is blue", or, informally, that "b1’s color property is blue." The program is probably representing the information that the block represented by b1 is painted blue. Similarly, it is probably representing in the rest of the property list that block b1 is on top of block b6, and that b1 is associated with blocks b2, b3, and b4.
get looks up plist’s indicator property. If it finds such a property, it returns the value; otherwise, it returns nil. If plist is a symbol, the symbol’s associated property list is used. For example, if the property list of foo is (baz 3), then
(get 'foo 'baz) => 3
(get 'foo 'zoo) => nil
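For a disembodied property list, the search that get performs can be sketched as a walk down the entries two at a time. The name my-get is hypothetical; the sketch handles only the cons form of a property list, not symbols or locatives.

```lisp
;; Illustrative sketch only; my-get is a hypothetical name.
;; plist is a disembodied property list: a cons whose cdr is the list of entries.
(defun my-get (plist indicator)
  (do ((l (cdr plist) (cddr l)))            ; step over indicator and value together
      ((null l) nil)
    (if (eq (car l) indicator) (return (cadr l)))))

(my-get (list nil 'baz 3) 'baz)   ; => 3
(my-get (list nil 'baz 3) 'zoo)   ; => nil
```

As with get itself, a result of nil does not distinguish a missing property from a property whose value is nil.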
getl is like get, except that the second argument is a list of indicators. getl searches down plist for any of the indicators in indicator-list, until it finds a property whose indicator is one of the elements of indicator-list. If plist is a symbol, the symbol’s associated property list is used. getl returns the portion of the list inside plist beginning with the first such property that it found. So the car of the returned list is an indicator, and the cadr is the property value. If none of the indicators on indicator-list are on the property list, getl returns nil. For example, if the property list of foo were
(bar (1 2 3) baz (3 2 1) color blue height six-two)
then
(getl 'foo '(baz height)) => (baz (3 2 1) color blue height six-two)
When more than one of the indicators in indicator-list is present in plist, which one getl returns depends on the order of the properties. This is the only thing that depends on that order. The order maintained by putprop and defprop is not defined (their behavior with respect to order is not guaranteed and may be changed without notice).
This gives plist an indicator-property of x. After this is done, (get plist indicator) will return x. If plist is a symbol, the symbol’s associated property list is used.
Example:
(putprop 'Nixon 'not 'crook)
defprop is a form of putprop with "unevaluated arguments", which is sometimes more convenient for typing. Normally it doesn’t make sense to use a property list rather than a symbol as the first (or plist) argument.
Example:
(defprop foo bar next-to)
is the same as
(putprop 'foo 'bar 'next-to)
This removes plist’s indicator property, by splicing it out of the property list. It returns that portion of the list inside plist of which the former indicator-property was the car. The car of what remprop returns is what get would have returned with the same arguments. If plist is a symbol, the symbol’s associated property list is used. For example, if the property list of foo was
(color blue height six-three near-to bar)
then
(remprop 'foo 'height) => (six-three near-to bar)
and foo’s property list would be
(color blue near-to bar)
If plist has no indicator-property, then remprop has no side-effect and returns nil.
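The splicing described above can be sketched for a disembodied property list. The name my-remprop is hypothetical; symbols and locatives are not handled, and the sketch assumes a well-formed list with an even number of entries.

```lisp
;; Illustrative sketch only; my-remprop is a hypothetical name.
;; prev is always the cons just before an indicator: first the
;; property-list cell itself, afterwards the cons holding the previous value.
(defun my-remprop (plist indicator)
  (do ((prev plist (cddr prev)))
      ((null (cdr prev)) nil)               ; no such property: no side-effect
    (cond ((eq (cadr prev) indicator)
           (let ((found (cddr prev)))       ; list whose car is the value
             (rplacd prev (cdr found))      ; splice out indicator and value
             (return found))))))

(let ((plist (list nil 'color 'blue 'height 'six-three 'near-to 'bar)))
  (my-remprop plist 'height)     ; returns (six-three near-to bar)
  (cdr plist))                   ; => (color blue near-to bar)
```

Because the cell in front of the entries is a real cons, even the first property can be spliced out by rplacd; this is the side-effecting behavior that the memory cell exists to provide.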
There is a mixin flavor, called si:property-list-mixin, that provides messages that do things analogous to what the above functions do. [Currently, the above functions do not work on flavor instances, but this will be fixed.]
A hash table is a Lisp object that works something like a property list. Each hash table has a set of entries, each of which associates a particular key with a particular value (or sequence of values). The basic functions that deal with hash tables can create entries, delete entries, and find the value that is associated with a given key. Finding the value is very fast even if there are many entries, because hashing is used; this is an important advantage of hash tables over property lists. Hashing is explained in hash-section.
A given hash table stores a fixed number of values for each key; by default, there is only one value. Each time you specify a new value or sequence of values, the old one(s) are lost.
Hash tables come in two kinds, the difference being whether the keys are compared using eq or using equal. In other words, there are hash tables which hash on Lisp objects (using eq) and there are hash tables which hash on trees (using equal). The following discussion refers to the eq kind of hash table; the other kind is described later, and works analogously.
Hash tables of the first kind are created with the function make-hash-table, which takes various options. New entries are added to hash tables with the puthash function. To look up a key and find the associated value(s), the gethash function is used. To remove an entry, use remhash.
Here is a simple example.
(setq a (make-hash-table))
(puthash 'color 'brown a)
(puthash 'name 'fred a)
(gethash 'color a) => brown
(gethash 'name a) => fred
In this example, the symbols color and name are being used as keys, and the symbols brown and fred are being used as the associated values. The hash table remembers one value for each key, since we did not specify otherwise, and has two items in it, one of which associates from color to brown, and the other of which associates from name to fred.
Keys do not have to be symbols; they can be any Lisp object. Likewise values can be any Lisp object. The Lisp function eq is used to compare keys, rather than equal. This means that keys are really objects, but it means that it is not reasonable to use numbers other than fixnums as keys.
When a hash table is first created, it has a size, which is the maximum number of entries it can hold. Usually the actual capacity of the table is somewhat less, since the hashing is not perfectly collision-free. With the maximum possible bad luck, the capacity could be very much less, but this rarely happens. If so many entries are added that the capacity is exceeded, the hash table will automatically grow, and the entries will be rehashed (new hash values will be recomputed, and everything will be rearranged so that the fast hash lookup still works). This is transparent to the caller; it all happens automatically.
The describe function (see describe-fun) prints a variety of useful information when applied to a hash table.
This hash table facility is similar to the hasharray facility of Interlisp, and some of the function names are the same. However, it is not compatible. The exact details and the order of arguments are designed to be consistent with the rest of Zetalisp rather than with Interlisp. For instance, the order of arguments to maphash is different, we do not have the Interlisp "system hash table", and we do not have the Interlisp restriction that keys and values may not be nil. Note, however, that the order of arguments to gethash, puthash, and remhash is not consistent with Zetalisp’s get, putprop, and remprop, either. This is an unfortunate result of the haphazard historical development of Lisp.
If the calling program is using multiprocessing, it must be careful to make sure that there are never two processes both referencing the hash table at the same time. There is no locking built into hash tables; if you have two processes that both want to reference the same hash table, you must arrange mutual exclusion yourself by using a lock or some other means. Even two processes just doing gethash on the same hash table must synchronize themselves, because gethash may be forced by garbage collection to rehash the table. Don’t worry about this if you don’t use multiprocessing; but if you do use multiprocessing, you will have a lot of trouble if you don’t understand this.
Hash tables are implemented with a special kind of array. arrayp of a hash table will return t. However, it is not recommended to use ordinary array operations on a hash table. Hash tables should be manipulated only with the functions described below.
This section documents the functions for eq hash tables, which use objects as keys and associate other objects with them.
This creates a new hash table. Valid option keywords are:
:size
Sets the initial size of the hash table, in entries, as a fixnum. The default is 100 (octal). The actual size is rounded up from the size you specify to the next size that is "good" for the hashing algorithm. You won’t necessarily be able to store this many entries into the table before the max-search-distance criterion (see below) is reached; but except in the case of extreme bad luck you will be able to store almost this many.
:number-of-values
Specifies how many values to associate with each key. The default is one.
:area
Specifies the area in which the hash table should be created. This is just like the :area option to make-array (see make-array-fun). Defaults to nil (i.e., default-cons-area).
:rehash-function
Specifies the function to be used for rehashing when the table becomes full. Defaults to the internal rehashing function that does the usual thing. If you want to write your own rehashing function, you will have to understand all the internals of how hash tables work. These internals are not documented here, as the best way to learn them is to read the source code.
:rehash-size
Specifies how much to increase the size of the hash table when it becomes full. This can be a fixnum which is the number of entries to add, or it can be a flonum which is the ratio of the new size to the old size. The default is 1.3, which causes the table to be made 30% bigger each time it has to grow.
:max-search-distance
Sets a maximum for how long a search you are willing to accept, to find an entry. The default is 8. If you add an entry and it turns out to be necessary to search more than this far for a place to put it, the hash table is enlarged and rehashed. With any luck, the search will not be as long then.
:actual-size
Specifies exactly the size for the hash table. Hash tables used by the microcode for flavor method lookup must be a power of two in size. This differs from :size in that :size is rounded up to a nearly prime number, but :actual-size is used exactly as specified. :actual-size overrides :size.
Find the entry in hash-table whose key is key, and return the associated value. If there is no such entry, return nil. Returns a second value, which is t if an entry was found or nil if there is no entry for key in this table.
Returns also a third value, a list which overlays the hash table entry. Its car is the key; the remaining elements are the values in the entry. This is how you can access values other than the first, if the hash table contains more than one value per entry.
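Since nil is a legitimate value, the first value of gethash alone cannot tell an entry whose value is nil from a missing entry; the second value can. A minimal sketch follows, assuming multiple-value-bind is used to receive the values; the name gethash-or-default is hypothetical.

```lisp
;; Illustrative sketch only; gethash-or-default is a hypothetical name.
(defun gethash-or-default (key hash-table default)
  ;; found is the second value of gethash: non-nil only if an entry exists.
  ;; Any further values returned by gethash are ignored here.
  (multiple-value-bind (value found) (gethash key hash-table)
    (if found value default)))
```

With the table a from the simple example earlier in this chapter, (gethash-or-default 'color a 'unknown) would return brown, while a key that was never stored would yield unknown.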
Create an entry associating key to value; if there is already an entry for key, then replace the value of that entry with value. Returns value. The hash table automatically grows if necessary.
If the hash table associates more than one value with each key, the remaining values in the entry are taken from extra-values.
Remove any entry for key in hash-table. Returns t if there was an entry or nil if there was not.
This specifies new value(s) for key like puthash, but returns values describing the previous state of the entry, just like gethash. In particular, it returns the previous (replaced) associated value as the first value, and returns t as the second value if the entry existed previously.
For each entry in hash-table, call function on two arguments: the key of the entry and the value of the entry.
If the hash table has more than one value per key, all the values, in order, are supplied as arguments, with the corresponding key.
Remove all the entries from hash-table. Returns the hash table itself.
This section documents the functions for equal hash tables, which use trees as keys and associate objects with them. The function to make one is slightly different from make-hash-table because the implementations of the two kinds of hash table differ, but analogous operations are provided.
This creates a new hash table of the equal kind. Valid option keywords are:
:size
Sets the initial size of the hash table, in entries, as a fixnum. The default is 100 (octal). The actual size is rounded up from the size you specify to the next "good" size. You won’t necessarily be able to store this many entries into the table before it overflows and becomes bigger; but except in the case of extreme bad luck you will be able to store almost this many.
:area
Specifies the area in which the hash table should be created. This is just like the :area option to make-array (see make-array-fun). Defaults to nil (i.e., default-cons-area).
:rehash-threshold
Specifies how full the table can be before it must grow. This is typically a flonum. The default is 0.8, i.e., 80%.
:growth-factor
Specifies how much to increase the size of the hash table when it becomes full. This is a flonum which is the ratio of the new size to the old size. The default is 1.3, which causes the table to be made 30% bigger each time it has to grow.
Find the entry in hash-table whose key is equal to key, and return the associated value. If there is no such entry, return nil. Returns a second value, which is t if an entry was found or nil if there is no entry for key in this table.
Create an entry associating key to value; if there is already an entry for key, then replace the value of that entry with value. Returns value. If adding an entry to the hash table exceeds its rehash threshold, it is grown and rehashed so that searching does not become too slow.
Remove any entry for key in hash-table. Returns t if there was an entry or nil if there was not.
This does the same thing as puthash-equal, but returns different values. If there was already an entry in hash-table whose key was key, then it returns the old associated value as its first returned value, and t as its second returned value. Otherwise it returns two values, nil and nil.
For each entry in hash-table, call function on two arguments: the key of the entry and the value of the entry.
Remove all the entries from hash-table. Returns the hash table itself.
The eq type hash tables actually hash on the address of the representation of the object. When the copying garbage collector changes the addresses of objects, it lets the hash facility know so that gethash will rehash the table based on the new object addresses.
There will eventually be an option to make-hash-table which tells it to make a "non-GC-protecting" hash table. This is a special kind of hash table with the property that if one of its keys becomes "garbage", i.e., is an object not known about by anything other than the hash table, then the entry for that key will be silently removed from the table. When these exist they will be documented in this section.
Hashing is a technique used in algorithms to provide fast retrieval of data in large tables. A function, known as a "hash function", is created, which takes an object that might be used as a key, and produces a number associated with that key. This number, or some function of it, can be used to specify where in a table to look for the datum associated with the key. It is always possible for two different objects to "hash to the same value"; that is, for the hash function to return the same number for two distinct objects. Good hash functions are designed to minimize this by evenly distributing their results over the range of possible numbers. However, hash table algorithms must still deal with this problem by providing a secondary search, sometimes known as a rehash. For more information, consult a textbook on computer algorithms.
sxhash computes a hash code of a tree, and returns it as a fixnum. A property of sxhash is that (equal x y) always implies (= (sxhash x) (sxhash y)). The number returned by sxhash is always a non-negative fixnum, possibly a large one. sxhash tries to compute its hash code in such a way that common permutations of an object, such as interchanging two elements of a list or changing one character in a string, will always change the hash code.
Here is an example of how to use sxhash in maintaining hash tables of trees:
(defun knownp (x &aux i bkt)    ;look up x in the table
  (setq i (abs (remainder (sxhash x) 176)))
      ;The remainder should be reasonably randomized.
  (setq bkt (aref table i))
      ;bkt is thus a list of all those expressions that
      ;hash into the same number as does x.
  (member x bkt))
      ;member compares with equal, as sxhash does.
To write an "intern" for trees, one could
(defun sintern (x &aux bkt i tem)
  (setq i (abs (remainder (sxhash x) 2^n-1)))
      ;2^n-1 stands for a power of 2 minus one.
      ;This is a good choice to randomize the
      ;result of the remainder operation.
  (setq bkt (aref table i))
  (cond ((setq tem (member x bkt))
         (car tem))
        (t (aset (cons x bkt) table i)
           x)))
sxhash provides what is called "hashing on equal"; that is, two objects that are equal are considered to be "the same" by sxhash. In particular, if two strings differ only in alphabetic case, sxhash will return the same thing for both of them because they are equal. The value returned by sxhash does not depend on the value of alphabetic-case-affects-string-comparison (see alphabetic-case-affects-string-comparison-var).
Therefore, sxhash is useful for retrieving data when two keys that are not the same object but are equal are considered the same. If you consider two such keys to be different, then you need "hashing on eq", where two different objects are always considered different. In some Lisp implementations, there is an easy way to create a hash function that hashes on eq, namely, by returning the virtual address of the storage associated with the object. But in other implementations, of which Zetalisp is one, this doesn’t work, because the address associated with an object can be changed by the relocating garbage collector. The hash tables created by make-hash-table deal with this problem by using the appropriate subprimitives so that they interface correctly with the garbage collector. If you need a hash table that hashes on eq, it is already provided; if you need an eq hash function for some other reason, you must build it yourself, either using the provided eq hash table facility or carefully using subprimitives.
Several functions are provided for sorting arrays and lists. These functions use algorithms which always terminate no matter what sorting predicate is used, provided only that the predicate always terminates. The main sorting functions are not stable; that is, equal items may not stay in their original order. If you want a stable sort, use the stable versions. But if you don’t care about stability, don’t use them since stable algorithms are significantly slower.
After sorting, the argument (be it list or array) has been rearranged internally so as to be completely ordered. In the case of an array argument, this is accomplished by permuting the elements of the array, while in the list case, the list is reordered by rplacd’s in the same manner as nreverse. Thus if the argument should not be clobbered, the user must sort a copy of the argument, obtainable by fillarray or copylist, as appropriate. Furthermore, sort of a list is like delq in that it should not be used for effect; the result is conceptually the same as the argument but in fact is a different Lisp object.
Should the comparison predicate cause an error, such as a wrong type argument error, the state of the list or array being sorted is undefined. However, if the error is corrected the sort will, of course, proceed correctly.
The sorting package is smart about compact lists; it sorts compact sublists as if they were arrays. See cdr-code for an explanation of compact lists, and A. I. Memo 587 by Guy L. Steele Jr. for an explanation of the sorting algorithm.
The first argument to sort is an array or a list. The second is a predicate, which must be applicable to all the objects in the array or list. The predicate should take two arguments, and return non-nil if and only if the first argument is strictly less than the second (in some appropriate sense).
The sort function proceeds to sort the contents of the array or list under the ordering imposed by the predicate, and returns the array or list modified into sorted order. Note that since sorting requires many comparisons, and thus many calls to the predicate, sorting will be much faster if the predicate is a compiled function rather than interpreted.
Example:
(defun mostcar (x)
  (cond ((symbolp x) x)
        ((mostcar (car x)))))

(sort 'fooarray
      (function (lambda (x y)
        (alphalessp (mostcar x) (mostcar y)))))
If fooarray contained these items before the sort:
(Tokens (The lion sleeps tonight))
(Carpenters (Close to you))
((Rolling Stones) (Brown sugar))
((Beach Boys) (I get around))
(Beatles (I want to hold your hand))
then after the sort fooarray would contain:
((Beach Boys) (I get around))
(Beatles (I want to hold your hand))
(Carpenters (Close to you))
((Rolling Stones) (Brown sugar))
(Tokens (The lion sleeps tonight))
When sort is given a list, it may change the order of the conses of the list (using rplacd), and so it cannot be used merely for side-effect; only the returned value of sort will be the sorted list. This will mess up the original list; if you need both the original list and the sorted list, you must copy the original and sort the copy (see copylist, copylist-fun).
Sorting an array just moves the elements of the array into different places, and so sorting an array for side-effect only is all right.
If the argument to sort is an array with a fill pointer, note that, like most functions, sort considers the active length of the array to be the length, and so only the active part of the array will be sorted (see array-active-length, array-active-length-fun).
sortcar is the same as sort except that the predicate is applied to the cars of the elements of x, instead of directly to the elements of x. Example:
(sortcar '((3 . dog) (1 . cat) (2 . bird)) #'<)
   => ((1 . cat) (2 . bird) (3 . dog))
Remember that sortcar, when given a list, may change the order of the conses of the list (using rplacd), and so it cannot be used merely for side-effect; only the returned value of sortcar will be the sorted list.
stable-sort is like sort, but if two elements of x are equal, i.e. predicate returns nil when applied to them in either order, then those two elements will remain in their original order.
stable-sortcar is like sortcar, but if two elements of x are equal, i.e. predicate returns nil when applied to their cars in either order, then those two elements will remain in their original order.
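For instance, assuming stable-sortcar takes the same arguments as sortcar, records whose keys are equal should come out in the same relative order in which they went in:

```lisp
(stable-sortcar (list '(2 . a) '(1 . b) '(2 . c) '(1 . d)) #'<)
;; => ((1 . b) (1 . d) (2 . a) (2 . c))
;; b still precedes d, and a still precedes c.
```

With plain sortcar, the relative order of (1 . b) and (1 . d) would not be guaranteed.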
sort-grouped-array considers its array argument to be composed of records of group-size elements each. These records are considered as units, and are sorted with respect to one another. The predicate is applied to the first element of each record; so the first elements act as the keys on which the records are sorted.
This is like sort-grouped-array except that the predicate is applied to four arguments: an array, an index into that array, a second array, and an index into the second array. predicate should consider each index as the subscript of the first element of a record in the corresponding array, and compare the two records. This is more general than sort-grouped-array since the function can get at all of the elements of the relevant records, instead of only the first element.
Storage allocation is handled differently by different computer systems. In many languages, the programmer must spend a lot of time thinking about when variables and storage units are allocated and deallocated. In Lisp, freeing of allocated storage is normally done automatically by the Lisp system; when an object is no longer accessible to the Lisp environment, it is garbage collected. This relieves the programmer of a great burden, and makes writing programs much easier.
However, automatic freeing of storage incurs an expense: more computer resources must be devoted to the garbage collector. If a program is designed to allocate temporary storage, which is then left as garbage, more of the computer must be devoted to the collection of garbage; this expense can be high. In some cases, the programmer may decide that it is worth putting up with the inconvenience of having to free storage under program control, rather than letting the system do it automatically, in order to prevent a great deal of overhead from the garbage collector.
It usually is not worth worrying about freeing of storage when the units of storage are very small things such as conses or small arrays. Numbers are not a problem, either; fixnums and small flonums do not occupy storage, and the system has a special way of garbage-collecting the other kinds of numbers with low overhead. But when a program allocates and then gives up very large objects at a high rate (or large objects at a very high rate), it can be very worthwhile to keep track of that one kind of object manually. Within the Lisp Machine system, there are several programs that are in this position. The Chaosnet software allocates and frees "packets", which are moderately large, at a very high rate. The window system allocates and frees certain kinds of windows, which are very large, moderately often. Both of these programs manage their objects manually, keeping track of when they are no longer used.
When we say that a program "manually frees" storage, it does not really mean that the storage is freed in the same sense that the garbage collector frees storage. Instead, a list of unused objects is kept. When a new object is desired, the program first looks on the list to see if there is one around already, and if there is it uses it. Only if the list is empty does it actually allocate a new one. When the program is finished with the object, it returns it to this list.
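The scheme just described can be sketched by hand as follows. All of the names are hypothetical, and the sketch ignores interrupts, which the resource facility described below takes care of.

```lisp
;; A hand-rolled list of unused objects.
(defvar free-temp-buffers nil)

(defun get-temp-buffer ()
  (cond ((null free-temp-buffers)
         (make-array 1000))                   ;list is empty: really allocate
        (t (prog1 (car free-temp-buffers)     ;otherwise reuse an old object
                  (setq free-temp-buffers (cdr free-temp-buffers))))))

(defun return-temp-buffer (buffer)
  ;; "Freeing" just puts the object back on the list.
  (setq free-temp-buffers (cons buffer free-temp-buffers)))
```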
The functions and special forms in this section perform the above function. The set of objects forming each such list is called a "resource"; for example, there might be a Chaosnet packet resource. defresource defines a new resource; allocate-resource allocates one of the objects; deallocate-resource frees one of the objects (putting it back on the list); and using-resource temporarily allocates an object and then frees it.
The defresource special form is used to define a new resource. The form looks like this:
(defresource name parameters keyword value keyword value ...)
name should be a symbol; it is the name of the resource and gets a defresource property of the internal data structure representing the resource.
parameters is a lambda-list giving names and default values (if &optional is used) of parameters to an object of this type. For example, if one had a resource of two-dimensional arrays to be used as temporary storage in a calculation, the resource would typically have two parameters, the number of rows and the number of columns. In the simplest case parameters is ().
The keyword options control how the objects of the resource are made and kept track of. The following keywords are allowed:
:constructor
The value is either a form or the name of a function. It is responsible for making an object, and will be used when someone tries to allocate an object from the resource and no suitable free objects exist. If the value is a form, it may access the parameters as variables. If it is a function, it is given the internal data structure for the resource and any supplied parameters as its arguments; it will need to default any unsupplied optional parameters. This keyword is required.
:initial-copies
The value is a number (or nil, which means 0). This many objects will be made as part of the evaluation of the defresource; this is useful to set up a pool of free objects during loading of a program. The default is to make no initial copies.
If initial copies are made and there are parameters, all the parameters must be &optional and the initial copies will have the default values of the parameters.
:finder
The value is a form or a function as with :constructor and sees the same arguments. If this option is specified, the resource system does not keep track of the objects. Instead, the finder must do so. It will be called inside a without-interrupts and must find a usable object somehow and return it.
:matcher
The value is a form or a function as with :constructor. In addition to the parameters, a form here may access the variable object (in the current package). A function gets the object as its second argument, after the data structure and before the parameters. The job of the matcher is to make sure that the object matches the specified parameters. If no matcher is supplied, the system will remember the values of the parameters (including optional ones that defaulted) that were used to construct the object, and will assume that it matches those particular values for all time. The comparison is done with equal (not eq). The matcher is called inside a without-interrupts.
:checker
The value is a form or a function, as above. In addition to the parameters, a form here may access the variables object and in-use-p (in the current package). A function receives these as its second and third arguments, after the data structure and before the parameters. The job of the checker is to determine whether the object is safe to allocate. If no checker is supplied, the default checker looks only at in-use-p; if the object has been allocated and not freed it is not safe to allocate, otherwise it is. The checker is called inside a without-interrupts.
If these options are used with forms (rather than functions), the forms get compiled into functions as part of the expansion of defresource. These functions are given names like (:property resource-name si:resource-constructor); these names are not guaranteed not to change in the future.
Most of the options are not used in typical cases. Here is an example:
(defresource two-dimensional-array (rows columns)
    :constructor (make-array (list rows columns)))
Suppose the array was usually going to be 100 by 100, and you wanted to preallocate one during loading of the program so that the first time you needed an array you wouldn’t have to spend the time to create one. You might simply put
(using-resource (foo two-dimensional-array 100 100) )
after your defresource, which would allocate a 100 by 100 array and then immediately free it. Alternatively you could:
(defresource two-dimensional-array (&optional (rows 100) (columns 100))
    :constructor (make-array (list rows columns))
    :initial-copies 1)
Here is an example of how you might use the :matcher option. Suppose you wanted to have a resource of two-dimensional arrays, as above, except that when you allocate one you don’t care about the exact size, as long as it is big enough. Furthermore you realize that you are going to have a lot of different sizes and if you always allocated one of exactly the right size, you would allocate a lot of different arrays and would not reuse a pre-existing array very often. So you might:
(defresource sloppy-two-dimensional-array (rows columns)
    :constructor (make-array (list rows columns))
    :matcher (and (≥ (array-dimension-n 1 object) rows)
                  (≥ (array-dimension-n 2 object) columns)))
Allocate an object from the resource specified by name. The various forms and/or functions given as options to defresource, together with any parameters given to allocate-resource, control how a suitable object is found and whether a new one has to be constructed or an old one can be reused.
Note that the using-resource special form is usually what you want to use, rather than allocate-resource itself; see below.
Free the object resource, returning it to the free-object list of the resource specified by name.
Forget all of the objects being remembered by the resource specified by name. Future calls to allocate-resource will create new objects. This function is useful if something about the resource has been changed incompatibly, such that the old objects are no longer usable. If an object of the resource is in use when clear-resource is called, an error will be signalled when that object is deallocated.
The body forms are evaluated sequentially with variable bound to an object allocated from the resource named resource, using the given parameters. The parameters (if any) are evaluated, but resource is not.
using-resource is often more convenient than calling allocate-resource and deallocate-resource. Furthermore it is careful to free the object when the body is exited, whether it returns normally or via *throw. This is done by using unwind-protect; see unwind-protect-fun.
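Roughly speaking, a using-resource form behaves like an allocation and a deallocation wrapped in an unwind-protect. The following is only a sketch of the idea and not the actual expansion; it assumes that allocate-resource and deallocate-resource are ordinary functions taking the resource name as their first argument, and fill-in-table is a hypothetical user function.

```lisp
(using-resource (temp two-dimensional-array 100 100)
  (fill-in-table temp))

;; behaves approximately like:

(let ((temp (allocate-resource 'two-dimensional-array 100 100)))
  (unwind-protect
    (fill-in-table temp)
    ;; Run on normal return and on *throw alike.
    (deallocate-resource 'two-dimensional-array temp)))
```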
Here is an example of the use of resources:
(defresource huge-16b-array (&optional (size 1000))
    :constructor (make-array size ':type 'art-16b))
(defun do-complex-computation (x y)
    (using-resource (temp-array huge-16b-array)
        ...                          ;Within the body, the array can be used.
        (aset 5 temp-array i)
        ...))                        ;The array is returned at the end.
Each symbol has associated with it a value cell, which refers to one Lisp object. This object is called the symbol’s binding or value, since it is what you get when you evaluate the symbol. The binding of symbols to values allows symbols to be used as the implementation of variables in programs.
The value cell can also be empty, referring to no Lisp object, in which case the symbol is said to be unbound. This is the initial state of a symbol when it is created. An attempt to evaluate an unbound symbol causes an error.
Symbols are often used as special variables. Variables and how they work are described in variable-section. The symbols nil and t are always bound to themselves; they may not be assigned, bound, or otherwise used as variables. Attempting to change the value of nil or t (usually) causes an error.
The functions described here work on symbols, not variables in general. This means that the functions below won’t work if you try to use them on local variables.
set is the primitive for assignment of symbols. The symbol’s value is changed to value; value may be any Lisp object. set returns value.
Example:
(set (cond ((eq a b) 'c) (t 'd)) 'foo)
will either set c to foo or set d to foo.
symeval is the basic primitive for retrieving a symbol’s value. (symeval sym) returns sym’s current binding. This is the function called by eval when it is given a symbol to evaluate. If the symbol is unbound, then symeval causes an error.
boundp returns t if sym is bound; otherwise, it returns nil.
makunbound causes sym to become unbound.
Example:
(setq a 1)
a => 1
(makunbound 'a)
a => causes an error.
makunbound returns its argument.
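The following short session shows these functions working together. The symbols a and name are arbitrary; note that set evaluates its first argument, so it can assign a symbol that is computed at run time.

```lisp
(set 'a 3)           ; => 3
(symeval 'a)         ; => 3
(boundp 'a)          ; => t

(setq name 'a)       ;name holds the symbol a
(set name 5)         ;assigns a, not name
(symeval name)       ; => 5
a                    ; => 5

(makunbound 'a)      ; => a
(boundp 'a)          ; => nil
```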
value-cell-location returns a locative pointer to sym’s value cell. See the section on locatives (locative). It is preferable to write
(locf (symeval sym))
instead of calling this function explicitly.
This is actually the internal value cell; there can also be an external value cell. For details, see the section on closures (closure).
Note: the function value-cell-location works on symbols that get converted to local variables (see variable-section); the compiler knows about it specially when its argument is a quoted symbol which is the name of a local variable. It returns a pointer to the cell that holds the value of the local variable.
Every symbol also has associated with it a function cell. The function cell is similar to the value cell; it refers to a Lisp object. When a function is referred to by name, that is, when a symbol is applied or appears as the car of a form to be evaluated, that symbol’s function cell is used to find its definition, the functional object which is to be applied. For example, when evaluating (+ 5 6), the evaluator looks in +’s function cell to find the definition of +, in this case a FEF containing a compiled program, to apply to 5 and 6.
Maclisp does not have function cells; instead, it looks for special properties on the property list. This is one of the major incompatibilities between the two dialects.
Like the value cell, a function cell can be empty, and it can be bound or assigned. (However, to bind a function cell you must use the bind subprimitive; see bind-fun.)
The following functions are analogous to the value-cell-related
functions in the previous section.
fsymeval returns sym’s definition, the contents of its function cell. If the function cell is empty, fsymeval causes an error.
fset stores definition, which may be any Lisp object, into sym’s function cell. It returns definition.
fboundp returns nil if sym’s function cell is empty, i.e. sym is undefined. Otherwise it returns t.
fmakunbound causes sym to be undefined, i.e. its function cell to be empty. It returns sym.
function-cell-location returns a locative pointer to sym’s function cell. See the section on locatives (locative). It is preferable to write
(locf (fsymeval sym))
rather than calling this function explicitly.
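As an illustration, the function cell of one symbol can be copied into another. The names double and twice are hypothetical.

```lisp
(defun double (x) (* x 2))

(fboundp 'double)                  ; => t
(fset 'twice (fsymeval 'double))   ;twice now has the same definition
(twice 3)                          ; => 6
(fmakunbound 'twice)               ; => twice
(fboundp 'twice)                   ; => nil
(fboundp 'double)                  ; => t, double is unaffected
```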
Since functions are the basic building block of Lisp programs, the system provides a variety of facilities for dealing with functions. Refer to chapter function-chapter for details.
Every symbol has an associated property list. See plist for documentation of property lists. When a symbol is created, its property list is initially empty.
The Lisp language itself does not use a symbol’s property list for anything. (This was not true in older Lisp implementations, where the print-name, value-cell, and function-cell of a symbol were kept on its property list.) However, various system programs use the property list to associate information with the symbol. For instance, the editor uses the property list of a symbol which is the name of a function to remember where it has the source code for that function, and the compiler uses the property list of a symbol which is the name of a special form to remember how to compile that special form.
Because of the existence of print-name, value, function, and package cells, none of the Maclisp system property names (expr, fexpr, macro, array, subr, lsubr, fsubr, and in former times value and pname) exist in Zetalisp.
This returns the list which represents the property list of sym. Note that this is not the property list itself; you cannot do get on it.
This sets the list which represents the property list of sym to list. setplist is to be used with caution (or not at all), since property lists sometimes contain internal system properties, which are used by many useful system functions. Also it is inadvisable to have the property lists of two different symbols be eq, since the shared list structure will cause unexpected effects on one symbol if putprop or remprop is done to the other.
This returns a locative pointer to the location of sym’s property-list cell. This locative pointer is equally valid as sym itself, as a handle on sym’s property list.
Every symbol has an associated string called the print-name, or pname for short. This string is used as the external representation of the symbol: if the string is typed in to read, it is read as a reference to that symbol (if it is interned), and if the symbol is printed, print types out the print-name. For more information, see the sections on the reader (see reader) and printer (see printer).
This returns the print-name of the symbol sym.
Example:
(get-pname 'xyz) => "xyz"
This predicate returns t if the two symbols sym1 and sym2 have equal print-names; that is, if their printed representation is the same. Upper and lower case letters are normally considered the same. If either or both of the arguments is a string instead of a symbol, then that string is used in place of the print-name. samepnamep is useful for determining if two symbols would be the same except that they are in different packages (see package).
Examples:
(samepnamep 'xyz (maknam '(x y z))) => t
(samepnamep 'xyz (maknam '(w x y))) => nil
(samepnamep 'xyz "xyz") => t
This is the same function as string-equal (see string-equal-fun). samepnamep is provided mainly so that you can write programs that will work in Maclisp as well as Zetalisp; in new programs, you should just use string-equal.
Every symbol has a package cell which is used, for interned symbols, to point to the package which the symbol belongs to. For an uninterned symbol, the package cell contains nil. For information about packages in general, see the chapter on packages, package. For information about package cells, see symbol-package-cell-discussion.
The functions in this section are primitives for creating symbols. However, before discussing them, it is important to point out that most symbols are created by a higher-level mechanism, namely the reader and the intern function. Nearly all symbols in Lisp are created by virtue of the reader’s having seen a sequence of input characters that looked like the printed representation of a symbol. When the reader sees such a printed representation, it calls intern (see intern-fun), which looks up the sequence of characters in a big table and sees whether any symbol with this print-name already exists. If it does, read uses the already-existing symbol. If it does not, then intern creates a new symbol and puts it into the table, and read uses that new symbol.
A symbol that has been put into such a table is called an interned symbol. Interned symbols are normally created automatically; the first time someone (such as the reader) asks for a symbol with a given print-name that symbol is automatically created.
These tables are called packages. In Zetalisp, interned symbols are the province of the package system. Although interned symbols are the most commonly used, they will not be discussed further here. For more information, turn to the chapter on packages (package).
An uninterned symbol is a symbol used simply as a data object, with no special cataloging. An uninterned symbol prints the same as an interned symbol with the same print-name, but cannot be read back in.
The following functions can be used to create uninterned symbols explicitly.
This creates a new uninterned symbol, whose print-name is the string pname. The value and function bindings will be unbound and the property list will be empty. If permanent-p is specified, it is assumed that the symbol is going to be interned and probably kept around forever; in this case it and its pname will be put in the proper areas. If permanent-p is nil (the default), the symbol goes in the default area and the pname is not copied. permanent-p is mostly for the use of intern itself.
Examples:
(setq a (make-symbol "foo")) => foo
(symeval a) => ERROR!
Note that the symbol is not interned; it is simply created and returned.
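The difference between such a symbol and an interned symbol of the same name can be seen with eq and samepnamep. This is a small sketch; the variable a is arbitrary.

```lisp
(setq a (make-symbol "foo"))
(eq a 'foo)                  ; => nil, not the interned symbol foo
(samepnamep a 'foo)          ; => t, the print-names are the same
(eq a (make-symbol "foo"))   ; => nil, each call makes a new symbol
```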
This returns a new uninterned symbol with the same print-name as sym. If copy-props is non-nil, then the value and function-definition of the new symbol will be the same as those of sym, and the property list of the new symbol will be a copy of sym’s. If copy-props is nil, then the new symbol will be unbound and undefined, and its property list will be empty.
gensym invents a print-name, and creates a new symbol with that print-name. It returns the new, uninterned symbol.
The invented print-name is a character prefix (the value of si:*gensym-prefix) followed by the decimal representation of a number (the value of si:*gensym-counter), e.g. "g0001". The number is increased by one every time gensym is called.
If the argument x is present and is a fixnum, then si:*gensym-counter is set to x. If x is a string or a symbol, then si:*gensym-prefix is set to the first character of the string or of the symbol’s print-name. After handling the argument, gensym creates a symbol as it would with no argument.
Examples:
if   (gensym) => g0007
then (gensym 'foo) => f0008
     (gensym 32.) => f0032
     (gensym) => f0033
Note that the number is in decimal and always has four digits, and the prefix is always one character.
gensym is usually used to create a symbol which should not normally be seen by the user, and whose print-name is unimportant, except to allow easy distinction by eye between two such symbols. The optional argument is rarely supplied. The name comes from "generate symbol", and the symbols produced by it are often called "gensyms".
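A gensym is an ordinary symbol in every respect except that it is not interned, so it can be used wherever a fresh, private symbol is wanted. A minimal sketch, with a hypothetical variable temp:

```lisp
(setq temp (gensym))         ;a fresh uninterned symbol, printed like g0034
(set temp 'some-value)       ;it has a value cell of its own
(symeval temp)               ; => some-value
(eq temp (gensym))           ; => nil, every call creates a new symbol
```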
Zetalisp includes several types of numbers, with different characteristics. Most numeric functions will accept any type of numbers as arguments and do the right thing. That is to say, they are generic. In Maclisp, there are generic numeric functions (like plus) and there are specific numeric functions (like +) which only operate on a certain type, and are much more efficient. In Zetalisp, this distinction does not exist; both function names exist for compatibility but they are identical. The microprogrammed structure of the machine makes it possible to have only the generic functions without loss of efficiency.
The types of numbers in Zetalisp are:
Fixnums are 24-bit 2’s complement binary integers. These are the "preferred, most efficient" type of number.
Bignums are arbitrary-precision binary integers.
Flonums are floating-point numbers. They have a mantissa of 32 bits and an exponent of 11 bits, providing a precision of about 9 digits and a range of about 10^300. Stable rounding is employed.
Small flonums are another form of floating-point number, with a mantissa of 18 bits and an exponent of 7 bits, providing a precision of about 5 digits and a range of about 10^19. Stable rounding is employed. Small flonums are useful because, like fixnums, and unlike flonums, they don’t require any storage. Computing with small flonums is more efficient than with regular flonums because the operations are faster and consing overhead is eliminated.
Generally, Lisp objects have a unique identity; each exists, independent of any other, and you can use the eq predicate to determine whether two references are to the same object or not. Numbers are the exception to this rule; they don’t work this way. The following function may return either t or nil. Its behavior is considered undefined, but as this manual is written it returns t when interpreted but nil when compiled.
(defun foo ()
    (let ((x (float 5)))
      (eq x (car (cons x nil)))))
This is very strange from the point of view of Lisp’s usual object semantics, but the implementation works this way, in order to gain efficiency, and on the grounds that identity testing of numbers is not really an interesting thing to do. So, the rule is that the result of applying eq to numbers is undefined, and may return either t or nil at will. If you want to compare the values of two numbers, use = (see =-fun).
Fixnums and small flonums are exceptions to this rule; some system code knows that eq works on fixnums used to represent characters or small integers, and uses memq or assq on them. eq works as well as = as an equality test for fixnums. Small flonums that are = tend to be eq also, but it is unwise to depend on this.
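In practice the rule comes down to the following. The results shown follow from the text above; the eq test on flonums is deliberately left without a result.

```lisp
(= 5 5.0)              ; => t, = compares numeric values
(eq 3 3)               ; => t, eq is reliable on fixnums
(memq 3 '(1 2 3))      ; => (3), so memq and assq are safe on fixnums
(eq 5.0 5.0)           ;undefined: may return t or nil
(= 5.0 5.0)            ; => t
```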
The distinction between fixnums and bignums is largely transparent to the user. The user simply computes with integers, and the system represents some as fixnums and the rest (less efficiently) as bignums. The system automatically converts back and forth between fixnums and bignums based solely on the size of the integer. There are a few "low level" functions which only work on fixnums; this fact is noted in their documentation. Also when using eq on numbers the user needs to be aware of the fixnum/bignum distinction.
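The automatic conversion can be observed with the bigp predicate. This sketch assumes the 24-bit fixnum representation described above, in which the largest fixnum is 2^23 - 1 = 8388607.

```lisp
(setq n 8388607.)          ;the trailing point makes this decimal
(bigp n)                   ; => nil, still a fixnum
(bigp (+ n 1))             ; => t, the sum no longer fits in a fixnum
(bigp (- (+ n 1) 1))       ; => nil, converted back automatically
```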
Integer computations cannot "overflow", except for division by zero, since bignums can be of arbitrary size. Floating-point computations can get exponent overflow or underflow, if the result is too large or small to be represented. Exponent overflow always signals an error. Exponent underflow normally signals an error, and assumes 0.0 as the answer if the user says to proceed from the error. However, if the value of the variable zunderflow is non-nil, the error is skipped and computation proceeds with 0.0 in place of the result that was too small.
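For example, given a flonum range of about 10^300, the product below is too small to represent. This sketch assumes that zunderflow is an ordinary special variable that can be bound with let.

```lisp
(* 1.0e-200 1.0e-200)        ;exponent underflow: normally signals an error

(let ((zunderflow t))
  (* 1.0e-200 1.0e-200))     ; => 0.0, the error is skipped
```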
When an arithmetic function of more than one argument is given arguments of different numeric types, uniform coercion rules are followed to convert the arguments to a common type, which is also the type of the result (for functions which return a number). When an integer meets a small flonum or a flonum, the result is a small flonum or a flonum (respectively). When a small flonum meets a regular flonum, the result is a regular flonum.
Thus if the constants in a numerical algorithm are written as small flonums (assuming this provides adequate precision), and if the input is a small flonum, the computation will be done in small-flonum mode and the result will be a small flonum, while if the input is a large flonum the computations will be done in full precision and the result will be a flonum.
Zetalisp never automatically converts between flonums and small flonums, in the way it automatically converts between fixnums and bignums, since this would lead either to inefficiency or to unexpected numerical inaccuracies. (When a small flonum meets a flonum, the result is a flonum, but if you use only one type, all the results will be of the same type too.) This means that a small-flonum computation can get an exponent overflow error even when the result could have been represented as a large flonum.
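The coercion rules can be summarized by a few cases:

```lisp
(+ 1 2.0)          ; => 3.0, integer with flonum gives a flonum
(+ 1 2.0s0)        ; => 3.0s0, integer with small flonum gives a small flonum
(+ 2.0s0 2.0)      ; => 4.0, small flonum with flonum gives a flonum
(+ 2.0s0 2.0s0)    ; => 4.0s0, small flonums stay small flonums
```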
Floating-point numbers retain only a certain number of bits of precision; therefore, the results of computations are only approximate. Large flonums have 31 bits and small flonums have 17 bits, not counting the sign. The method of approximation is "stable rounding". The result of an arithmetic operation will be the flonum which is closest to the exact value. If the exact result falls precisely halfway between two flonums, the result will be rounded down if the least-significant bit is 0, or up if the least-significant bit is 1. This choice is arbitrary but insures that no systematic bias is introduced.
Integer addition, subtraction, and multiplication always produce an exact result. Integer division, on the other hand, returns an integer rather than the exact rational-number result. The quotient is truncated towards zero rather than rounded. The exact rule is that if A is divided by B, yielding a quotient of C and a remainder of D, then A = B * C + D exactly. D is either zero or the same sign as A. Thus the absolute value of C is less than or equal to the true quotient of the absolute values of A and B. This is compatible with Maclisp and most computer hardware. However, it has the serious problem that it does not obey the rule that if A divided by B yields a quotient of C and a remainder of D, then dividing A + k * B by B will yield a quotient of C + k and a remainder of D for all integer k. The lack of this property sometimes makes regular integer division hard to use. New functions that implement a different kind of division, that obeys this rule, will be implemented in the future.
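The following cases illustrate both the exact rule and the property that fails. They assume the remainder function, which is not described in this section, and the // function described later in this chapter.

```lisp
(// 7 2)            ; => 3
(remainder 7 2)     ; => 1      7 = 2*3 + 1
(// -7 2)           ; => -3, truncated towards zero
(remainder -7 2)    ; => -1     -7 = 2*(-3) + (-1)

;; The shifting rule fails across zero.  Take A = -1, B = 2, k = 1:
(// -1 2)           ; => 0
(remainder -1 2)    ; => -1
(// 1 2)            ; => 0, but the rule would require 0 + 1 = 1
(remainder 1 2)     ; => 1, but the rule would require -1
```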
Unlike Maclisp, Zetalisp does not have number declarations in the compiler. Note that because fixnums and small flonums require no associated storage they are as efficient as declared numbers in Maclisp. Bignums and (large) flonums are less efficient, however bignum and flonum intermediate results are garbage collected in a special way that avoids the overhead of the full garbage collector.
The different types of numbers can be distinguished by their printed representations. A leading or embedded (but not trailing) decimal point, and/or an exponent separated by "e", indicates a flonum. If a number has an exponent separated by "s", it is a small flonum. Small flonums require a special indicator so that naive users will not accidentally compute with the lesser precision. Fixnums and bignums have similar printed representations since there is no numerical value that has a choice of whether to be a fixnum or a bignum; an integer is a bignum if and only if its magnitude is too big for a fixnum. See the examples on flonum-examples, in the description of what the reader understands.
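A few sample representations, classified according to the rules just given (the particular values are arbitrary):

```lisp
105          ;integer
105.         ;integer: a trailing point does not make a flonum
105.0        ;flonum: embedded decimal point
.5           ;flonum: leading decimal point
1.5e9        ;flonum: exponent separated by "e"
1.5s9        ;small flonum: exponent separated by "s"
```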
Returns t if x is zero. Otherwise it returns nil. If x is not a number, zerop causes an error. For flonums, this only returns t for exactly 0.0 or 0.0s0; there is no "fuzz".
Returns t if its argument is a positive number, strictly greater than zero. Otherwise it returns nil. If x is not a number, plusp causes an error.
Returns t if its argument is a negative number, strictly less than zero. Otherwise it returns nil. If x is not a number, minusp causes an error.
Returns t if number is odd, otherwise nil. If number is not a fixnum or a bignum, oddp causes an error.
Returns t if number is even, otherwise nil. If number is not a fixnum or a bignum, evenp causes an error.
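Some representative cases of these predicates:

```lisp
(zerop 0)        ; => t
(zerop 0.0)      ; => t
(zerop 0.001)    ; => nil, there is no "fuzz"
(plusp 0)        ; => nil, zero is not strictly greater than zero
(minusp -3)      ; => t
(oddp 3)         ; => t
(evenp 0)        ; => t
(oddp 3.0)       ;error: not a fixnum or a bignum
```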
signp is used to test the sign of a number. It is present only for Maclisp compatibility, and is not recommended for use in new programs. signp returns t if x is a number which satisfies the test, nil if it is not a number or does not meet the test. test is not evaluated, but x is. test can be one of the following:
l     x < 0
le    x ≤ 0
e     x = 0
n     x ≠ 0
ge    x ≥ 0
g     x > 0
Examples:
(signp ge 12) => t
(signp le 12) => nil
(signp n 0) => nil
(signp g 'foo) => nil
See also the data-type predicates fixp, floatp, bigp, small-floatp, and numberp (fixp-fun).
All of these functions require that their arguments be numbers, and signal an error if given a non-number. They work on all types of numbers, automatically performing any required coercions (as opposed to Maclisp in which generally only the spelled-out names work for all kinds of numbers).
Returns t if x and y are numerically equal. An integer can be = to a flonum.
greaterp compares its arguments from left to right. If any argument is not greater than the next, greaterp returns nil. But if the arguments are monotonically strictly decreasing, the result is t.
Examples:
(greaterp 4 3) => t
(greaterp 4 3 2 1 0) => t
(greaterp 4 3 1 2 0) => nil
≥ compares its arguments from left to right. If any argument is less than the next, ≥ returns nil. But if the arguments are monotonically decreasing or equal, the result is t.
lessp compares its arguments from left to right. If any argument is not less than the next, lessp returns nil. But if the arguments are monotonically strictly increasing, the result is t.
Examples:
(lessp 3 4) => t
(lessp 1 1) => nil
(lessp 0 1 2 3 4) => t
(lessp 0 1 3 2 4) => nil
≤ compares its arguments from left to right. If any argument is greater than the next, ≤ returns nil. But if the arguments are monotonically increasing or equal, the result is t.
Returns t if x is not numerically equal to y, and nil otherwise.
max returns the largest of its arguments.
Example:
(max 1 3 2) => 3
max requires at least one argument.
min returns the smallest of its arguments.
Example:
(min 1 3 2) => 1
min requires at least one argument.
All of these functions require that their arguments be numbers, and signal an error if given a non-number. They work on all types of numbers, automatically performing any required coercions (as opposed to Maclisp, in which generally only the spelled-out versions work for all kinds of numbers, and the "$" versions are needed for flonums).
Returns the sum of its arguments. If there are no arguments, it returns 0, which is the identity for this operation.
Returns its first argument minus all of the rest of its arguments.
Returns the negative of x.
Examples:
(minus 1) => -1
(minus -3.0) => 3.0
With only one argument, - is the same as minus; it returns the negative of its argument. With more than one argument, - is the same as difference; it returns its first argument minus all of the rest of its arguments.
Returns |x|, the absolute value of the number x. abs could have been defined by:
(defun abs (x)
  (cond ((minusp x) (minus x))
        (t x)))
Returns the product of its arguments. If there are no arguments, it returns 1, which is the identity for this operation.
Returns the first argument divided by all of the rest of its arguments.
The name of this function is written // rather than / because / is the quoting character in Lisp syntax and must be doubled.
With more than one argument, // is the same as quotient; it returns the first argument divided by all of the rest of its arguments. With only one argument, (// x) is the same as (// 1 x).
The exact rules for the meaning of the quotient and remainder of two integers are given on division-rule; this explains why the rules used for integer division are not correct for all applications.
Examples:
(// 3 2) => 1 ;Fixnum division truncates.
(// 3 -2) => -1
(// -3 2) => -1
(// -3 -2) => 1
(// 3 2.0) => 1.5
(// 3 2.0s0) => 1.5s0
(// 4 2) => 2
(// 12. 2. 3.) => 2
(// 4.0) => .25
Returns the remainder of x divided by y. x and y must be integers (fixnums or bignums). The exact rules for the meaning of the quotient and remainder of two integers are given on division-rule.
Examples:
(\ 3 2) => 1
(\ -3 2) => -1
(\ 3 -2) => 1
(\ -3 -2) => -1
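The quotient and remainder examples above can be reproduced with a small model. This is a sketch in Python rather than Zetalisp, and it is inferred only from the examples shown (quotient truncated toward zero, remainder taking the sign of the dividend); the authoritative rule is the one given on division-rule. The names quotient and remainder are used here for the Python functions only.

```python
def quotient(x: int, y: int) -> int:
    """Integer quotient truncated toward zero, matching the // examples."""
    q = abs(x) // abs(y)
    # The result is negative exactly when the operands have opposite signs.
    return q if (x < 0) == (y < 0) else -q


def remainder(x: int, y: int) -> int:
    """Remainder consistent with quotient: x = y * quotient + remainder."""
    return x - y * quotient(x, y)
```

Note that Python's own // operator rounds toward negative infinity, which is why the model does not use it directly on signed operands.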
(sub1 x) is the same as (difference x 1). Note that the short name may be confusing: (1- x) does not mean 1-x; rather, it means x-1.
Returns the greatest common divisor of all its arguments. The arguments must be integers (fixnums or bignums).
Returns x raised to the y’th power. The result is an integer if both arguments are integers (even if y is negative!) and floating-point if either x or y or both is floating-point. If the exponent is an integer, a repeated-squaring algorithm is used, while if the exponent is floating-point the result is (exp (* y (log x))).
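The two evaluation strategies mentioned above can be sketched as follows. This is a Python illustration, not the system's code; the function names expt_int and expt_float are hypothetical, and the behavior for a negative integer exponent is not modeled because the text above does not specify the result.

```python
import math


def expt_int(x: int, y: int) -> int:
    """Integer power by repeated squaring (non-negative y only)."""
    if y < 0:
        raise ValueError("negative integer exponents are not modeled here")
    result = 1
    base = x
    while y > 0:
        if y & 1:          # this bit of the exponent contributes a factor
            result *= base
        base *= base       # square for the next bit
        y >>= 1
    return result


def expt_float(x: float, y: float) -> float:
    """Floating-point power computed as exp(y * log(x)); requires x > 0."""
    return math.exp(y * math.log(x))
```

Repeated squaring needs a number of multiplications proportional to the number of bits in y, rather than to y itself.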
Returns the square root of x.
Integer square-root. x must be an integer; the result is the greatest integer less than or equal to the exact square root of x.
These are the internal micro-coded arithmetic functions. There is no reason why anyone should need to write code with these explicitly, since the compiler knows how to generate the appropriate code for plus, +, etc. These names are only here for Maclisp compatibility.
These functions are only for floating-point arguments; if given an integer they will convert it to a flonum. If given a small-flonum, they will return a small-flonum [currently this is not true of most of them, but it will be fixed in the future].
Returns e raised to the x’th power, where e is the base of natural logarithms.
Returns the natural logarithm of x.
Returns the sine of x, where x is expressed in radians.
Returns the sine of x, where x is expressed in degrees.
Returns the cosine of x, where x is expressed in radians.
Returns the cosine of x, where x is expressed in degrees.
Returns the angle, in radians, whose tangent is y/x. atan always returns a non-negative number between zero and 2π.
Returns the angle, in radians, whose tangent is y/x. atan2 always returns a number between -π and π.
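The difference between the two result ranges can be illustrated with a short Python sketch. It assumes, as the descriptions above suggest, that both functions take y and then x; the Python function names simply mirror the Lisp ones and are not part of any library.

```python
import math


def atan2(y: float, x: float) -> float:
    """Angle whose tangent is y/x, in the range -pi to pi."""
    return math.atan2(y, x)


def atan(y: float, x: float) -> float:
    """Same angle, shifted by a full turn when negative (zero to 2*pi)."""
    return math.atan2(y, x) % (2 * math.pi)
```

For a point on the negative y axis, for instance, the first function gives -π/2 while the second gives 3π/2.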
These functions are provided to allow specific conversions of data types to be forced, when desired.
Converts x from a flonum (or small-flonum) to an integer, truncating towards negative infinity. The result is a fixnum or a bignum as appropriate. If x is already a fixnum or a bignum, it is returned unchanged.
Converts x from a flonum (or small-flonum) to an integer, rounding to the nearest integer. If x is exactly halfway between two integers, this rounds up (towards positive infinity). fixr could have been defined by:
(defun fixr (x)
  (if (fixp x) x (fix (+ x 0.5))))
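A Python model of the two conversions makes the rounding directions concrete. It is only a sketch of the behavior described above (fix truncating towards negative infinity, fixr adding 0.5 first); Python integers stand in for fixnums and bignums.

```python
import math


def fix(x):
    """Convert to an integer, truncating towards negative infinity."""
    return x if isinstance(x, int) else math.floor(x)


def fixr(x):
    """Convert to the nearest integer; exact halves round upward."""
    return x if isinstance(x, int) else fix(x + 0.5)
```

Under this model -2.5 rounds to -2 rather than -3, because halfway cases move towards positive infinity.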
Converts any kind of number to a flonum.
Converts any kind of number to a small flonum.
Except for lsh and rot, these functions operate on both fixnums and bignums. lsh and rot have an inherent word-length limitation and hence only operate on 24-bit fixnums. Negative numbers are operated on in their 2’s-complement representation.
Returns the bit-wise logical inclusive or of its arguments. At least one argument is required.
Example:
(logior 4002 67) => 4067
Returns the bit-wise logical exclusive or of its arguments. At least one argument is required.
Example:
(logxor 2531 7777) => 5246
Returns the bit-wise logical and of its arguments. At least one argument is required.
Examples:
(logand 3456 707) => 406
(logand 3456 -100) => 3400
Returns the logical complement of number. This is the same as logxor’ing number with -1.
Example:
(lognot 3456) => -3457
boole is the generalization of logand, logior, and logxor. fn should be a fixnum between 0 and 17 octal inclusive; it controls the function which is computed. If the binary representation of fn is abcd (a is the most significant bit, d the least) then the truth table for the Boolean operation is as follows:
       y
   | 0  1
---------
  0| a  c
x  |
  1| b  d
If boole has more than three arguments, it is associated left to right; thus,
(boole fn x y z) = (boole fn (boole fn x y) z)
With two arguments, the result of boole is simply its second argument. At least two arguments are required.
Examples:
(boole 1 x y) = (logand x y)
(boole 6 x y) = (logxor x y)
(boole 2 x y) = (logand (lognot x) y)
logand, logior, and logxor are usually preferred over the equivalent forms of boole, to avoid putting magic numbers in the program.
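The truth-table encoding can be checked with a short Python model. This is a sketch built directly from the table above, where bits a, b, c, and d of fn select the result for the (x, y) bit pairs (0,0), (1,0), (0,1), and (1,1); the helper name boole2 is hypothetical.

```python
from functools import reduce


def boole2(fn: int, x: int, y: int) -> int:
    """Apply the two-operand Boolean function numbered fn (0 to 15)."""
    result = 0
    if fn & 0b1000:            # a: result bit is 1 where x=0, y=0
        result |= ~x & ~y
    if fn & 0b0100:            # b: result bit is 1 where x=1, y=0
        result |= x & ~y
    if fn & 0b0010:            # c: result bit is 1 where x=0, y=1
        result |= ~x & y
    if fn & 0b0001:            # d: result bit is 1 where x=1, y=1
        result |= x & y
    return result


def boole(fn: int, first: int, *rest: int) -> int:
    """Left-associative extension; a single operand is returned as is."""
    return reduce(lambda acc, y: boole2(fn, acc, y), rest, first)
```

Python integers behave as unbounded two's-complement numbers under these operators, so negative operands work as well.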
bit-test is a predicate which returns t if any of the bits designated by the 1’s in x are 1’s in y. bit-test is implemented as a macro which expands as follows:
(bit-test x y) ==> (not (zerop (logand x y)))
Returns x shifted left y bits if y is positive or zero, or x shifted right |y| bits if y is negative. Zero bits are shifted in (at either end) to fill unused positions. x and y must be fixnums. (In some applications you may find ash useful for shifting bignums; see below.)
Examples:
(lsh 4 1) => 10 ;(octal)
(lsh 14 -2) => 3
(lsh -1 1) => -2
Shifts x arithmetically left y bits if y is positive, or right -y bits if y is negative. Unused positions are filled by zeroes from the right, and by copies of the sign bit from the left. Thus, unlike lsh, the sign of the result is always the same as the sign of x. If x is a fixnum or a bignum, this is a shifting operation. If x is a flonum, this does scaling (multiplication by a power of two), rather than actually shifting any bits.
Returns x rotated left y bits if y is positive or zero, or x rotated right |y| bits if y is negative. The rotation considers x as a 24-bit number (unlike Maclisp, which considers x to be a 36-bit number in both the pdp-10 and Multics implementations). x and y must be fixnums. (There is no function for rotating bignums.)
Examples:
(rot 1 2) => 4
(rot 1 -2) => 20000000
(rot -1 7) => -1
(rot 15 24.) => 15
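The 24-bit behavior of lsh and rot can be modeled in Python as below. This is a sketch under the assumption that a fixnum is a 24-bit two's-complement quantity, as stated above; the helper to_signed and the constants are hypothetical names, and octal literals are written with Python's 0o prefix.

```python
BITS = 24
MASK = (1 << BITS) - 1


def to_signed(u: int) -> int:
    """Reinterpret the low 24 bits of u as a signed fixnum."""
    u &= MASK
    return u - (1 << BITS) if u & (1 << (BITS - 1)) else u


def lsh(x: int, y: int) -> int:
    """Logical shift: zero bits enter at either end of the 24-bit word."""
    u = x & MASK
    u = (u << y) & MASK if y >= 0 else u >> -y
    return to_signed(u)


def rot(x: int, y: int) -> int:
    """Rotate the 24-bit word left by y bits (right if y is negative)."""
    u = x & MASK
    k = y % BITS                       # a right rotation is a left rotation by 24-|y|
    u = ((u << k) | (u >> (BITS - k))) & MASK
    return to_signed(u)
```

The model reproduces the examples given for both functions, for instance a right rotation of 1 by two bits yields octal 20000000.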
This returns the number of significant bits in |x|. x may be a fixnum or a bignum. Its sign is ignored. The result is the least integer strictly greater than the base-2 logarithm of |x|.
Examples:
(haulong 0) => 0
(haulong 3) => 2
(haulong -7) => 3
Returns the high n bits of the binary representation of |x|, or the low -n bits if n is negative. x may be a fixnum or a bignum; its sign is ignored. haipart could have been defined by:
(defun haipart (x n)
  (setq x (abs x))
  (if (minusp n)
      (logand x (1- (ash 1 (- n))))
      (ash x (min (- n (haulong x)) 0))))
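For readers who want to experiment, the definition above translates almost line for line into Python. This is an illustrative model, with Python's arbitrary-precision integers standing in for fixnums and bignums; it is not part of the Lisp Machine system.

```python
def haulong(x: int) -> int:
    """Number of significant bits in |x|."""
    return abs(x).bit_length()


def haipart(x: int, n: int) -> int:
    """High n bits of |x|, or the low -n bits if n is negative."""
    x = abs(x)
    if n < 0:
        return x & ((1 << -n) - 1)
    # Shift right by however many bits exceed n; never shift left.
    shift = min(n - haulong(x), 0)
    return x >> -shift
```

When n is larger than the number of significant bits, the shift is zero and |x| is returned unchanged.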
Several functions are provided for dealing with an arbitrary-width field of contiguous bits appearing anywhere in an integer (a fixnum or a bignum). Such a contiguous set of bits is called a byte. Note that we are not using the term byte to mean eight bits, but rather any number of bits within a number. These functions use numbers called byte specifiers to designate a specific byte position within any word. Byte specifiers are fixnums whose two lowest octal digits represent the size of the byte, and whose higher (usually two, but sometimes more) octal digits represent the position of the byte within a number, counting from the right in bits. A position of zero means that the byte is at the right end of the number. For example, the byte-specifier 0010 (i.e., 10 octal) refers to the lowest eight bits of a word, and the byte-specifier 1010 refers to the next eight bits. These byte-specifiers will be stylized below as ppss. The maximum value of the ss digits is 27 (octal), since a byte must fit in a fixnum although bytes can be loaded from and deposited into bignums. (Bytes are always positive numbers.) The format of byte-specifiers is taken from the pdp-10 byte instructions.
ppss specifies a byte of num to be extracted. The ss bits of the byte starting at bit pp are the lowest ss bits in the returned value, and the rest of the bits in the returned value are zero. The name of the function, ldb, means "load byte". num may be a fixnum or a bignum. The returned value is always a fixnum.
Example:
(ldb 0306 4567) => 56
This is like ldb except that instead of using a byte specifier, the position and size are passed as separate arguments. The argument order is not analogous to that of ldb so that load-byte can be compatible with Maclisp.
ldb-test is a predicate which returns t if any of the bits designated by the byte specifier ppss are 1’s in y. That is, it returns t if the designated field is non-zero. ldb-test is implemented as a macro which expands as follows:
(ldb-test ppss y) ==> (not (zerop (ldb ppss y)))
This is similar to ldb; however, the specified byte of num is returned as a number in position pp of the returned word, instead of position 0 as with ldb. num must be a fixnum.
Example:
(mask-field 0306 4567) => 560
Returns a number which is the same as num except in the bits specified by ppss. The low ss bits of byte are placed in those bits. byte is interpreted as being right-justified, as if it were the result of ldb. num may be a fixnum or a bignum. The name means "deposit byte".
Example:
(dpb 23 0306 4567) => 4237
This is like dpb except that instead of using a byte specifier, the position and size are passed as separate arguments. The argument order is not analogous to that of dpb so that deposit-byte can be compatible with Maclisp.
This is like dpb, except that byte is not taken to be right-justified; the ppss bits of byte are used for the ppss bits of the result, with the rest of the bits taken from num. num must be a fixnum.
Example:
(deposit-field 230 0306 4567) => 4237
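The four byte functions and the ppss encoding can be modeled compactly in Python. This is a sketch based on the descriptions and examples above; the helper split_ppss is a hypothetical name, octal literals use Python's 0o prefix, and the fixnum-only restrictions on some arguments are not enforced.

```python
def split_ppss(ppss: int):
    """Split a byte specifier into (position, size): ss is the low two octal digits."""
    return ppss >> 6, ppss & 0o77


def ldb(ppss: int, num: int) -> int:
    pos, size = split_ppss(ppss)
    return (num >> pos) & ((1 << size) - 1)


def mask_field(ppss: int, num: int) -> int:
    pos, size = split_ppss(ppss)
    return num & (((1 << size) - 1) << pos)       # byte stays in place


def dpb(byte: int, ppss: int, num: int) -> int:
    pos, size = split_ppss(ppss)
    mask = (1 << size) - 1
    return (num & ~(mask << pos)) | ((byte & mask) << pos)   # byte is right-justified


def deposit_field(byte: int, ppss: int, num: int) -> int:
    pos, size = split_ppss(ppss)
    mask = ((1 << size) - 1) << pos
    return (num & ~mask) | (byte & mask)          # byte is already in position
```

In this model mask_field is to ldb what deposit_field is to dpb: the first of each pair leaves the byte at position pp, while the second works with it right-justified.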
The behavior of the following two functions depends on the size of fixnums, and so functions using them may not work the same way on future implementations of Zetalisp. Their names start with "%" because they are more like machine-level subprimitives than the previous functions.
%logldb is like ldb except that it only loads out of fixnums and allows a byte size of 30 (octal), i.e., all 24 bits of the fixnum including the sign bit.
%logdpb is like dpb except that it only deposits into fixnums. Using this to change the sign-bit will leave the result as a fixnum, while dpb would produce a bignum result for arithmetic correctness. %logdpb is good for manipulating fixnum bit-masks such as are used in some internal system tables and data-structures.
The functions in this section provide a pseudo-random number generator facility. The basic function you use is random, which returns a new pseudo-random number each time it is called. Between calls, its state is saved in a data object called a random-array. Usually there is only one random-array; however, if you want to create a reproducible series of pseudo-random numbers, and be able to reset the state to control when the series starts over, then you need some of the other functions here.
(random) returns a random fixnum, positive or negative. If arg is present, a fixnum between 0 and arg minus 1 inclusive is returned. If random-array is present, the given array is used instead of the default one (see below). Otherwise, the default random-array is used (and is created if it doesn’t already exist). The algorithm is executed inside a without-interrupts (see without-interrupts-fun) so two processes can use the same random-array without colliding.
A random-array consists of an array of numbers, and two pointers into the array. The pointers circulate around the array; each time a random number is requested, both pointers are advanced by one, wrapping around at the end of the array. Thus, the distance forward from the first pointer to the second pointer, allowing for wraparound, stays the same. Let the length of the array be length and the distance between the pointers be offset. To generate a new random number, each pointer is set to its old value plus one, modulo length. Then the two elements of the array addressed by the pointers are added together; the sum is stored back into the array at the location where the second pointer points, and is returned as the random number after being normalized into the right range.
This algorithm produces well-distributed random numbers if length and offset are chosen carefully, so that the polynomial x^length+x^offset+1 is irreducible over the mod-2 integers. The system uses 71 and 35.
The contents of the array of numbers should be initialized to anything moderately random, to make the algorithm work. The contents get initialized by a simple random number generator, based on a number called the seed. The initial value of the seed is set when the random-array is created, and it can be changed. To have several different controllable resettable sources of random numbers, you can create your own random-arrays. If you don’t care about reproducibility of sequences, just use random without the random-array argument.
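The two-pointer scheme described above can be sketched in Python as follows. Several details are assumptions made only for illustration: the class name RandomArray, the multiplier and increment of the "simple" generator that fills the cells, keeping sums modulo 2^24, and normalizing into a range by taking a remainder. Only the pointer movement, the add-and-store step, and the default length and offset come from the text.

```python
WORD = 1 << 24  # assumption: sums are kept modulo the 24-bit fixnum size


class RandomArray:
    """Toy model of a random-array: cells plus two circulating pointers."""

    def __init__(self, length=71, offset=35, seed=1):
        self.length = length
        self.offset = offset
        self.seed = seed
        self.initialize()

    def initialize(self, new_seed=None):
        """Refill the cells from the seed and reset both pointers."""
        if new_seed is not None:
            self.seed = new_seed
        state = self.seed
        self.cells = []
        for _ in range(self.length):
            # Hypothetical simple generator, used only to fill the cells.
            state = (state * 1103515245 + 12345) % (1 << 31)
            self.cells.append((state >> 7) % WORD)
        self.first = 0
        self.second = self.offset % self.length

    def next(self, limit=None):
        """Advance both pointers, add the two cells, store and return the sum."""
        self.first = (self.first + 1) % self.length
        self.second = (self.second + 1) % self.length
        total = (self.cells[self.first] + self.cells[self.second]) % WORD
        self.cells[self.second] = total
        return total if limit is None else total % limit
```

Calling initialize again without a new seed restarts the same series, which is the reproducibility property the text describes.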
Creates, initializes, and returns a random-array. length is the length of the array. offset is the distance between the pointers and should be an integer less than length. seed is the initial value of the seed, and should be a fixnum. This calls si:random-initialize on the random array before returning it.
array must be a random-array, such as is created by si:random-create-array. If new-seed is provided, it should be a fixnum, and the seed is set to it. si:random-initialize reinitializes the contents of the array from the seed (calling random changes the contents of the array and the pointers, but not the seed).
Sometimes it is desirable to have a form of arithmetic which has no overflow checking (which would produce bignums), and truncates results to the word size of the machine. In Zetalisp, this is provided by the following set of functions. Their answers are only correct modulo 2^24. These functions should not be used for "efficiency"; they are probably less efficient than the functions which do check for overflow. They are intended for algorithms which require this sort of arithmetic, such as hash functions and pseudo-random number generation.
Returns the sum of x and y modulo 2^24. Both arguments must be fixnums.
Returns the difference of x and y modulo 2^24. Both arguments must be fixnums.
Returns the product of x and y modulo 2^24. Both arguments must be fixnums.
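The wraparound behavior of these three functions can be modeled in Python, whose integers never overflow, by explicitly reducing each result to a signed 24-bit value. The function names plus24, difference24, and times24 are hypothetical stand-ins for the Lisp functions, and the signed interpretation of the result is an assumption consistent with fixnums being two's-complement.

```python
BITS = 24


def wrap24(n: int) -> int:
    """Reduce n modulo 2^24 and reinterpret it as a signed 24-bit value."""
    n &= (1 << BITS) - 1
    return n - (1 << BITS) if n >= 1 << (BITS - 1) else n


def plus24(x: int, y: int) -> int:
    return wrap24(x + y)


def difference24(x: int, y: int) -> int:
    return wrap24(x - y)


def times24(x: int, y: int) -> int:
    return wrap24(x * y)
```

Adding 1 to the largest positive value wraps around to the most negative one, which is the kind of result a hash function or pseudo-random generator may rely on.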
These peculiar functions are useful in programs that don’t want to use bignums for one reason or another. They should usually be avoided, as they are difficult to use and understand, and they depend on special numbers of bits and on the use of two’s-complement notation.
Returns bits 24 through 46 (the most significant half) of the product of num1 and num2. If you call this and %24-bit-times on the same arguments num1 and num2, regarding them as integers, you can combine the results into a double-precision product. If num1 and num2 are regarded as two’s-complement fractions, -1 ≤ num < 1, %multiply-fractions returns 1/2 of their correct product as a fraction. (The name of this function isn’t too great.)
Divides the double-precision number given by the first two arguments by the third argument, and returns the single-precision quotient. Causes an error on division by zero or if the quotient won’t fit in single precision.
Divides the double-precision number given by the first two arguments by the third argument, and returns the remainder. Causes an error on division by zero.
high24 and low24, which must be fixnums, are concatenated to produce a 48-bit unsigned positive integer. A flonum containing the same value is constructed and returned. Note that only the 31 most-significant bits are retained (after removal of leading zeroes). This function is mainly for the benefit of read.
An array is a Lisp object that consists of a group of cells, each of which may contain an object. The individual cells are selected by numerical subscripts.
The dimensionality of an array (or, the number of dimensions which the array has) is the number of subscripts used to refer to one of the elements of the array. The dimensionality may be any integer from one to seven, inclusive.
The lowest value for any subscript is zero; the highest value is a property of the array. Each dimension has a size, which is the lowest number which is too great to be used as a subscript. For example, in a one-dimensional array of five elements, the size of the one and only dimension is five, and the acceptable values of the subscript are zero, one, two, three, and four.
The most basic primitive functions for handling arrays are: make-array, which is used for the creation of arrays, aref, which is used for examining the contents of arrays, and aset, which is used for storing into arrays.
An array is a regular Lisp object, and it is common for an array to be the binding of a symbol, or the car or cdr of a cons, or, in fact, an element of an array. There are many functions, described in this chapter, which take arrays as arguments and perform useful operations on them.
Another way of handling arrays, inherited from Maclisp, is to treat them as functions. In this case each array has a name, which is a symbol whose function definition is the array. Zetalisp supports this style by allowing an array to be applied to arguments, as if it were a function. The arguments are treated as subscripts and the array is referenced appropriately. The store special form (see store-fun) is also supported. This kind of array referencing is considered to be obsolete, and is slower than the usual kind. It should not be used in new programs.
There are many types of arrays. Some types of arrays can hold Lisp objects of any type; the other types of arrays can only hold fixnums or flonums. The array types are known by a set of symbols whose names begin with "art-" (for ARray Type).
The most commonly used type is called art-q. An art-q array simply holds Lisp objects of any type.
Similar to the art-q type is the art-q-list. Like the art-q, its elements may be any Lisp object. The difference is that the art-q-list array "doubles" as a list; the function g-l-p will take an art-q-list array and return a list whose elements are those of the array, and whose actual substance is that of the array. If you rplaca elements of the list, the corresponding element of the array will change, and if you store into the array, the corresponding element of the list will change the same way. An attempt to rplacd the list will cause an error, since arrays cannot implement that operation.
There is a set of types called art-1b, art-2b, art-4b, art-8b, and art-16b; these names are short for "1 bit", "2 bits", and so on. Each element of an art-nb array is a non-negative fixnum, and only the least significant n bits are remembered in the array; all of the others are discarded. Thus art-1b arrays store only 0 and 1, and if you store a 5 into an art-2b array and look at it later, you will find a 1 rather than a 5.
These arrays are used when it is known beforehand that the fixnums which will be stored are non-negative and limited in size to a certain number of bits. Their advantage over the art-q array is that they occupy less storage, because more than one element of the array is kept in a single machine word. (For example, 32 elements of an art-1b array or 2 elements of an art-16b array will fit into one word.)
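The truncation and packing of art-nb elements can be pictured with a small Python model. The function names are hypothetical, the 32-bit word size is taken from the figures above, and the right-to-left placement of elements within a word follows the remark made in the discussion of displaced arrays later in this chapter; the real storage layout may differ in details not covered here.

```python
WORD_BITS = 32


def elements_per_word(n: int) -> int:
    """How many n-bit elements fit in one 32-bit word."""
    return WORD_BITS // n


def stored_value(value: int, n: int) -> int:
    """What an art-nb array remembers: only the low n bits."""
    return value & ((1 << n) - 1)


def pack_word(elements, n: int) -> int:
    """Pack up to one word's worth of elements, first element rightmost."""
    if len(elements) > elements_per_word(n):
        raise ValueError("too many elements for one word")
    word = 0
    for i, element in enumerate(elements):
        word |= stored_value(element, n) << (i * n)
    return word
```

Storing 5 into a 2-bit element keeps only binary 01, which is why a 1 is read back.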
There are also art-32b arrays which have 32 bits per element. Since fixnums only have 24 bits anyway, these are the same as art-q arrays except that they only hold fixnums. They do not behave consistently with the other "bit" array types, and generally they should not be used.
Character strings are implemented by the art-string array type. This type acts similarly to the art-8b; its elements must be fixnums, of which only the least significant eight bits are stored. However, many important system functions, including read, print, and eval, treat art-string arrays very differently from the other kinds of arrays. These arrays are usually called strings, and chapter string-chapter of this manual deals with functions that manipulate them.
An art-fat-string array is a character string with wider characters, containing 16 bits rather than 8 bits. The extra bits are ignored by string operations, such as comparison, on these strings; typically they are used to hold font information.
An art-half-fix array contains half-size fixnums. Each element of the array is a signed 16-bit integer; the range is from -32768 to 32767 inclusive.
The art-float array type is a special-purpose type whose elements are flonums. When storing into such an array the value (any kind of number) will be converted to a flonum, using the float function (see float-fun). The advantage of storing flonums in an art-float array rather than an art-q array is that the numbers in an art-float array are not true Lisp objects. Instead the array remembers the numerical value, and when it is aref’ed creates a Lisp object (a flonum) to hold the value. Because the system does special storage management for bignums and flonums that are intermediate results, the use of art-float arrays can save a lot of work for the garbage-collector and hence greatly increase performance. An intermediate result is a Lisp object passed as an argument, stored in a local variable, or returned as the value of a function, but not stored into a global variable, a non-art-float array, or list structure. art-float arrays also provide a locality of reference advantage over art-q arrays containing flonums, since the flonums are contained in the array rather than being separate objects probably on different pages of memory.
The art-fps-float array type is another special-purpose type whose elements are flonums. The internal format of this array is compatible with the pdp11/VAX single-precision floating-point format. The primary purpose of this array type is to interface with the FPS array processor, which can transfer data directly in and out of such an array.
When storing into an art-fps-float array any kind of number may be stored. It will be rounded off to the 24-bit precision of the pdp11. If the magnitude of the number is too large, the largest valid floating-point number will be stored. If the magnitude is too small, zero will be stored. When reading from an art-fps-float array, a new flonum is created containing the value, just as with an art-float array.
There are three types of arrays which exist only for the implementation of stack groups; these types are called art-stack-group-head, art-special-pdl, and art-reg-pdl. Their elements may be any Lisp object; their use is explained in the section on stack groups (see stack-group).
Currently, multi-dimensional arrays are stored in column-major order rather than row-major order as in Maclisp. Row-major order means that successive memory locations differ in the last subscript, while column-major order means that successive memory locations differ in the first subscript. This has an effect on paging performance when using large arrays; if you want to reference every element in a multi-dimensional array and move linearly through memory to improve locality of reference, you must vary the first subscript fastest rather than the last.
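The effect of column-major storage on traversal order can be seen in a short Python sketch. The function name column_major_index is hypothetical, and the formula is the standard one for column-major layouts (first subscript varies fastest), used here as a model of the description above rather than as the system's actual addressing code.

```python
def column_major_index(subscripts, dimensions) -> int:
    """Linear offset of an element when the first subscript varies fastest."""
    index = 0
    stride = 1
    for subscript, size in zip(subscripts, dimensions):
        if not 0 <= subscript < size:
            raise IndexError("subscript out of range")
        index += subscript * stride
        stride *= size          # later subscripts move in larger steps
    return index


# Visiting a 3 by 4 array with the first subscript in the inner loop
# touches storage locations 0, 1, 2, ... in order.
dims = (3, 4)
order = [column_major_index((i, j), dims) for j in range(4) for i in range(3)]
```

With the loops nested the other way around, successive references would be three locations apart instead of adjacent.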
The value of array-types is a list of all of the array type symbols such as art-q, art-4b, art-string and so on. The values of these symbols are internal array type code numbers for the corresponding type.
Given an internal numeric array-type code, returns the symbolic name of that type.
array-elements-per-q is an association list (see alist) which associates each array type symbol with the number of array elements stored in one word, for an array of that type. If the value is negative, it is instead the number of words per array element, for arrays whose elements are more than one word long.
Given the internal array-type code number, returns the number of array elements stored in one word, for an array of that type. If the value is negative, it is instead the number of words per array element, for arrays whose elements are more than one word long.
The value of array-bits-per-element is an association list (see alist) which associates each array type symbol with the number of bits of unsigned number it can hold, or nil if it can hold Lisp objects. This can be used to tell whether an array can hold Lisp objects or not.
Given the internal array-type code number, returns the number of bits per cell for unsigned numeric arrays, or nil for a type of array that can contain Lisp objects.
Given an array, returns the number of bits that fit in an element of that array. For arrays that can hold general Lisp objects, the result is 24., assuming you will be storing unsigned fixnums in the array.
Any array may have an array leader. An array leader is like a one-dimensional art-q array which is attached to the main array. So an array which has a leader acts like two arrays joined together. The leader can be stored into and examined by a special set of functions, different from those used for the main array: array-leader and store-array-leader. The leader is always one-dimensional, and always can hold any kind of Lisp object, regardless of the type or dimensionality of the main part of the array.
Very often the main part of an array will be a homogeneous set of objects, while the leader will be used to remember a few associated non-homogeneous pieces of data. In this case the leader is not used like an array; each slot is used differently from the others. Explicit numeric subscripts should not be used for the leader elements of such an array; instead the leader should be described by a defstruct (see defstruct-fun).
By convention, element 0 of the array leader of an array is used to hold the number of elements in the array that are "active" in some sense. When the zeroth element is used this way, it is called a fill pointer. Many array-processing functions recognize the fill pointer. For instance, if a string (an array of type art-string) has seven elements, but its fill pointer contains the value five, then only elements zero through four of the string are considered to be "active"; the string’s printed representation will be five characters long, string-searching functions will stop after the fifth element, etc.
The system does not provide a way to turn off the fill-pointer convention; any array that has a leader must reserve element 0 for the fill pointer or avoid using many of the array functions.
Leader element 1 is used in conjunction with the "named structure" feature to associate a "data type" with the array; see named-structure. Element 1 is only treated specially if the array is flagged as a named structure.
The following explanation of displaced arrays is probably not of interest to a beginner; the section may be passed over without losing the continuity of the manual.
Normally, an array is represented as a small amount of header information, followed by the contents of the array. However, sometimes it is desirable to have the header information removed from the actual contents. One such occasion is when the contents of the array must be located in a special part of the Lisp Machine’s address space, such as the area used for the control of input/output devices, or the bitmap memory which generates the TV image. Displaced arrays are also used to reference certain special system tables, which are at fixed addresses so the microcode can access them easily.
If you give make-array a fixnum or a locative as the value of the :displaced-to option, it will create a displaced array referring to that location of virtual memory and its successors. References to elements of the displaced array will access that part of storage, and return the contents; the regular aref and aset functions are used. If the array is one whose elements are Lisp objects, caution should be used: if the region of address space does not contain typed Lisp objects, the integrity of the storage system and the garbage collector could be damaged. If the array is one whose elements are bytes (such as an art-4b type), then there is no problem. It is important to know, in this case, that the elements of such arrays are allocated from the right to the left within the 32-bit words.
It is also possible to have an array whose contents, instead of being located at a fixed place in virtual memory, are defined to be those of another array. Such an array is called an indirect array, and is created by giving make-array an array as the value of the :displaced-to option. The effects of this are simple if both arrays have the same type; the two arrays share all elements. An object stored in a certain element of one can be retrieved from the corresponding element of the other. This, by itself, is not very useful. However, if the arrays have different dimensionality, the manner of accessing the elements differs. Thus, if a one-dimensional array of nine elements is indirected to a two-dimensional array of three elements by three, the elements can be accessed in either a one-dimensional or a two-dimensional manner. Weird effects can be produced if the new array is of a different type than the old array; this is not generally recommended. Indirecting an art-mb array to an art-nb array will do the "obvious" thing. For instance, if m is 4 and n is 1, each element of the first array will contain four bits from the second array, in right-to-left order.
It is also possible to create an indirect array in such a way
that when an attempt is made to reference it or store into it, a
constant number is added to the subscript given. This number is called
the index-offset, and is specified at the time the indirect array
is created, by giving a fixnum to make-array
as the value of the :displaced-index-offset
option.
Similarly, the length of the indirect array need not be the full length of
the array it indirects to; it can be smaller.
The nsubstring
function (see nsubstring-fun) creates such
arrays. When using index offsets with multi-dimensional arrays, there
is only one index offset; it is added in to the "linearized" subscript
which is the result of multiplying each subscript by an appropriate
coefficient and adding them together.
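As a minimal sketch of an index-offset, assuming two ordinary art-q arrays with the illustrative names whole and part, a three-element indirect array can be laid over elements 2 through 4 of a six-element array:

```lisp
(setq whole (make-array 6))
(setq part (make-array 3 ':displaced-to whole
                         ':displaced-index-offset 2))
(aset 'x whole 2)
(aref part 0) => x        ; subscript 0 plus index-offset 2
(aset 'y part 2)
(aref whole 4) => y       ; subscript 2 plus index-offset 2
```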
make-array is the primitive function for making arrays. dimensions should be a list of fixnums which are the dimensions of the array; the length of the list will be the dimensionality of the array. For convenience when making a one-dimensional array, the single dimension may be provided as a fixnum rather than a list of one fixnum.
options are alternating keywords and values. The keywords may be any of the following:
:area
The value specifies in which area (see area) the array should be created.
It should be either an area number (a fixnum), or nil
to mean the
default area.
:type
The value should be a symbolic name of an array type; the most
common of these is art-q
, which is the default. The elements of the array are
initialized according to the type: if the array is of a type whose
elements may only be fixnums or flonums, then every element of the array will
initially be 0
or 0.0
; otherwise, every element will initially be
nil
. See the description of array types on array-type.
The value of the option may also be the value of a symbol which is an array type name
(that is, an internal numeric array type code).
:displaced-to
If this is not nil
, then the array will be a displaced array.
If the value is a fixnum or a locative, make-array
will create a
regular displaced array which refers to the specified section of virtual
address space.
If the value is an array, make-array
will create
an indirect array (see indirect-array).
:leader-length
The value should be a fixnum. The array will have a leader with that
many elements. The elements of the leader will be initialized to nil
unless the :leader-list
option is given (see below).
:leader-list
The value should be a list. Call the number of elements in the list n.
The first n elements of the leader will be initialized from successive
elements of this list. If the :leader-length
option is not specified,
then the length of the leader will be n. If the :leader-length
option is given, and its value is greater than n, then the nth
and following leader elements will be initialized to nil
. If its value
is less than n, an error is signalled. The leader elements are
filled in forward order; that is, the car
of the list will be stored
in leader element 0
, the cadr
in element 1
, and so on.
:displaced-index-offset
If this is present, the value of the :displaced-to
option should be an
array, and the value should be a non-negative fixnum; it is made to be the
index-offset of the created indirect array. (See index-offset.)
:named-structure-symbol
If this is not nil
, it is a symbol to
be stored in the named-structure cell of the array. The array
will be tagged as a named structure (see named-structure.) If
the array has a leader, then this symbol will be stored in leader element
1
regardless of the value of the :leader-list
option. If the array
does not have a leader, then this symbol will be stored in array element zero.
Examples:
;; Create a one-dimensional array of five elements.
(make-array 5)
;; Create a two-dimensional array,
;; three by four, with four-bit elements.
(make-array '(3 4) ':type 'art-4b)
;; Create an array with a three-element leader.
(make-array 5 ':leader-length 3)
;; Create an array with a leader, providing
;; initial values for the leader elements.
(setq a (make-array 100 ':type 'art-1b
                        ':leader-list '(t nil)))
(array-leader a 0) => t
(array-leader a 1) => nil
;; Create a named-structure with five leader
;; elements, initializing some of them.
(setq b (make-array 20 ':leader-length 5
                       ':leader-list '(0 nil foo)
                       ':named-structure-symbol 'bar))
(array-leader b 0) => 0
(array-leader b 1) => bar
(array-leader b 2) => foo
(array-leader b 3) => nil
(array-leader b 4) => nil
make-array
returns the newly-created array, and also
returns, as a second value, the number of words allocated in the process
of creating the array, i.e., the %structure-total-size
of the array.
When make-array
was originally implemented, it took its arguments
in the following fixed pattern:
(make-array area type dimensions &optional displaced-to leader displaced-index-offset named-structure-symbol)
leader was a combination of the :leader-length
and :leader-list
options, and the list was in reverse order.
This obsolete form is still supported so that old programs will continue
to work, but the new keyword-argument form is preferred.
Returns the element of array selected by the subscripts. The subscripts must be fixnums and their number must match the dimensionality of array.
These are obsolete versions of aref
that only work for one, two, or three
dimensional arrays, respectively. There is no reason ever to use them.
Stores x into the element of array selected by the subscripts. The subscripts must be fixnums and their number must match the dimensionality of array. The returned value is x.
These are obsolete versions of aset
that only work for one, two, or three
dimensional arrays, respectively. There is no reason ever to use them.
Returns a locative pointer to the element-cell of array selected by the subscripts. The subscripts must be fixnums and their number must match the dimensionality of array. See the explanation of locatives in locative.
These are obsolete versions of aloc
that only work for one, two, or three
dimensional arrays, respectively. There is no reason ever to use them.
The compiler turns aref
into ar-1
, ar-2
, etc., according
to the number of subscripts specified, turns aset
into as-1
,
as-2
, etc., and turns aloc
into ap-1
, ap-2
, etc.
For arrays with more than 3 dimensions the compiler uses the slightly less
efficient form since the special routines only exist for 1, 2, and 3 dimensions.
There is no reason for any program to call ar-1
, as-1
, ar-2
, etc.,
explicitly; they are documented because there used to be such a reason, and
many old programs use these functions. New programs should use aref
,
aset
, and aloc
.
A related function, provided only for Maclisp compatibility, is
arraycall
(arraycall-fun).
array should be an array with a leader, and i should be a
fixnum. This returns the i’th element of array’s leader.
This is analogous to aref
.
array should be an array with a leader, and i should be a
fixnum. x may be any object. x is stored in the i’th element
of array’s leader. store-array-leader
returns x.
This is analogous to aset
.
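A short sketch of the two leader functions together. The argument order shown for store-array-leader (the object, then the array, then the index) is assumed by analogy with aset; the variable name a is illustrative.

```lisp
(setq a (make-array 5 ':leader-length 2))
(array-leader a 0) => nil              ; leader elements start out as nil
(store-array-leader 'foo a 1) => foo   ; returns the object stored
(array-leader a 1) => foo
```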
Returns the symbolic type of array.
Example:
(setq a (make-array '(3 5)))
(array-type a) => art-q
array may be any array. This returns the total number
of elements in array. For a one-dimensional array,
this is one greater than the maximum allowable subscript.
(But if fill pointers are being used, you may want to use
array-active-length
.)
Example:
(array-length (make-array 3)) => 3
(array-length (make-array '(3 5)))
=> 17 ;octal, which is 15. decimal
If array does not have a fill pointer, then this returns whatever
(array-length array)
would have. If array does have a
fill pointer, array-active-length
returns it. See the general
explanation of the use of fill pointers, on fill-pointer.
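The difference between the two lengths can be sketched as below. The example assumes the convention that leader element 0 holds the fill pointer, and the variable names a and b are illustrative.

```lisp
(setq a (make-array 5 ':leader-list '(2)))  ; fill pointer of 2
(array-length a) => 5
(array-active-length a) => 2
(setq b (make-array 5))                     ; no leader, so no fill pointer
(array-active-length b) => 5
```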
Returns the dimensionality of array. Note that the name of the function includes a "#", which must be slashified if you want to be able to read your program in Maclisp. (It doesn’t need to be slashified for the Zetalisp reader, which is smarter.)
Example:
(array-#-dims (make-array '(3 5))) => 2
array may be any kind of array, and n should be a fixnum.
If n is between 1 and the dimensionality of array,
this returns the n’th dimension of array. If n is 0
,
this returns the length of the leader of array; if array has no
leader it returns nil
. If n is any other value, this
returns nil
.
Examples:
(setq a (make-array '(3 5) ':leader-length 7))
(array-dimension-n 1 a) => 3
(array-dimension-n 2 a) => 5
(array-dimension-n 3 a) => nil
(array-dimension-n 0 a) => 7
array-dimensions
returns a list whose elements are the dimensions
of array.
Example:
(setq a (make-array '(3 5)))
(array-dimensions a) => (3 5)
Note: the list returned by (array-dimensions x)
is
equal to the cdr of the list returned by (arraydims x)
.
array may be any array; it also may be a symbol whose
function cell contains an array, for Maclisp compatibility (see maclisp-array).
arraydims
returns a list whose first element is the symbolic name of
the type of array, and whose remaining elements are its dimensions.
Example:
(setq a (make-array '(3 5)))
(arraydims a) => (art-q 3 5)
array-in-bounds-p checks whether subscripts is a legal
set of subscripts for array, and returns t
if they
are; otherwise it returns nil
.
array may be any kind of array.
The predicate array-displaced-p returns t
if array is any kind of displaced array
(including an indirect array). Otherwise it returns nil
.
array may be any kind of array.
The predicate array-indirect-p returns t
if array is an indirect array.
Otherwise it returns nil
.
array may be any kind of array.
The predicate array-indexed-p returns t
if array is an indirect array with an index-offset.
Otherwise it returns nil
.
array may be any array. The predicate array-has-leader-p returns t
if array
has a leader; otherwise it returns nil
.
array may be any array. array-leader-length returns the length of array’s leader
if it has one, or nil
if it does not.
If array is a one-dimensional array, its size is
changed to be new-size. If array has more than one
dimension, its size (array-length
) is changed to new-size
by changing only the last dimension.
If array is made smaller, the extra elements are lost; if array
is made bigger, the new elements are initialized in the same fashion as
make-array
(see make-array-fun) would initialize them: either to nil
or 0
,
depending on the type of array.
Example:
(setq a (make-array 5))
(aset 'foo a 4)
(aref a 4) => foo
(adjust-array-size a 2)
(aref a 4) => an error occurs
If the size of the array is being increased,
adjust-array-size
may have to allocate a new array somewhere. In
that case, it alters array so that references to it will be made to
the new array instead, by means of "invisible pointers" (see
structure-forward
, structure-forward-fun).
adjust-array-size
will return this new array if it creates one, and
otherwise it will return array. Be careful to be consistent about
using the returned result of adjust-array-size
, because you may end
up holding two arrays which are not the same (i.e., not eq
), but
which share the same contents.
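One way to stay consistent is always to store the returned value back into the variable that holds the array. A minimal sketch, with an illustrative variable name and an art-q array so that new elements are nil:

```lisp
(setq a (make-array 5))
(setq a (adjust-array-size a 7))   ; keep whatever array comes back
(array-length a) => 7
(aref a 6) => nil                  ; new elements initialized as by make-array
```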
array-grow
creates a new array of the same type as array,
with the specified dimensions. Those elements of array that
are still in bounds are copied into the new array. The elements of
the new array that are not in the bounds of array are initialized
to nil
or 0
as appropriate. If array has a leader, the new
array will have a copy of it. array-grow
returns the new array
and also forwards array to it, like adjust-array-size
.
Unlike adjust-array-size
, array-grow
always creates a new array
rather than growing or shrinking the array in place. But array-grow
of a multi-dimensional array can change all the subscripts and move
the elements around in memory to keep each element at the same logical
place in the array.
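The following sketch grows a two-by-two array to three by four. It assumes that array-grow takes the array followed by the new dimensions as separate arguments; the variable names are illustrative.

```lisp
(setq a (make-array '(2 2)))
(aset 'x a 1 1)
(setq b (array-grow a 3 4))
(array-dimensions b) => (3 4)
(aref b 1 1) => x      ; kept at the same logical place
(aref b 2 3) => nil    ; outside the old bounds, so initialized to nil
```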
This peculiar function attempts to return array to free storage.
If it is displaced, this returns the displaced array itself, not the
data that the array points to. Currently return-array
does nothing if the array is
not at the end of its region, i.e., if it was not the most recently allocated
non-list object in its area. This will eventually be renamed to
reclaim
, when it works for other objects than arrays.
If you still have any references to array anywhere in the Lisp world
after this function returns, the garbage collector can get a fatal error
if it sees them. Since the form that calls this function must get the
array from somewhere, it may not be clear how to legally call return-array
.
One of the few ways to do it is as follows:
(defun func ()
  (let ((array (make-array 100)))
    ...
    (return-array (prog1 array (setq array nil)))))
so that the variable array
does not refer to the array when return-array
is called. You should only call this function if you know what you are doing;
otherwise the garbage collector can get fatal errors. Be careful.
These functions manipulate art-q-list
arrays, which were
introduced on art-q-list-var.
array should be an art-q-list
array. This returns
a list which shares the storage of array.
Example:
(setq a (make-array 4 ':type 'art-q-list))
(aref a 0) => nil
(setq b (g-l-p a)) => (nil nil nil nil)
(rplaca b t)
b => (t nil nil nil)
(aref a 0) => t
(aset 30 a 2)
b => (t nil 30 nil)
The following two functions work strangely, in the same way that store
does, and should not be used in new programs.
The argument array-ref is ignored, but should be a reference
to an art-q-list
array by applying the array to subscripts (rather
than by aref
). This returns a list object which
is a portion of the "list" of the array, beginning with the last
element of the last array which has been called as a function.
get-locative-pointer-into-array
is
similar to get-list-pointer-into-array
, except that it returns a
locative, and doesn’t require the array to be art-q-list
.
Use aloc
instead of this function in new programs.
array must be a one-dimensional array which has a fill pointer, and x may
be any object. array-push
attempts to store x in the element
of the array designated by the fill pointer, and increase the fill pointer
by one. If the fill pointer does not designate an element of the array (specifically,
when it gets too big), it is unaffected and array-push
returns nil
;
otherwise, the two actions (storing and incrementing) happen uninterruptibly,
and array-push
returns the former value of the fill pointer,
i.e., the array index in which it stored x. If the array is of type
art-q-list
, an operation similar to nconc
has taken place,
in that the element has been added to the list by changing the cdr of
the formerly last element. The cdr coding is updated to ensure this.
array-push-extend
is just like array-push
except
that if the fill pointer gets too large, the array is grown
to fit the new element; i.e., it never "fails" the way array-push
does,
and so never returns nil
. extension is the number of
elements to be added to the array if it needs to be grown. It defaults
to something reasonable, based on the size of the array.
array must be a one-dimensional array which has a fill pointer.
The fill pointer is decreased by one, and the array element
designated by the new value of the fill pointer is returned.
If the new value does not designate any element of the array
(specifically, if it had already reached zero), an error is caused.
The two operations (decrementing and array referencing) happen
uninterruptibly. If the array is of type art-q-list
, an operation
similar to nbutlast
has taken place. The cdr coding is
updated to ensure this.
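Used together, array-push and array-pop treat an array as a stack. The sketch below assumes that the fill pointer is leader element 0, that array-push takes the array and then the object, and that array-pop takes only the array; the variable name stack is illustrative.

```lisp
(setq stack (make-array 3 ':leader-list '(0)))  ; empty: fill pointer is 0
(array-push stack 'a) => 0      ; index at which a was stored
(array-push stack 'b) => 1
(array-push stack 'c) => 2
(array-push stack 'd) => nil    ; no room; fill pointer unaffected
(array-active-length stack) => 3
(array-pop stack) => c
(array-active-length stack) => 2
```

In the same situation array-push-extend would grow the array instead of returning nil.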
array may be any type of array, or, for Maclisp
compatibility, a symbol whose function cell contains an array. There
are two forms of this function, depending on the type of x.
If x is a list, then fillarray
fills up array with
the elements of x. If x is too short to fill up all of
array, then the last element of x is used to fill the
remaining elements of array. If x is too long, the extra
elements are ignored. If x is nil
(the empty list), array
is filled with the default initial value for its array type (nil
or 0
).
If x is an array (or, for Maclisp compatibility, a symbol
whose function cell contains an array), then the elements of array are
filled up from the elements of x. If x is too small, then
the extra elements of array are not affected.
If array is multi-dimensional, the elements are accessed
in row-major order: the last subscript varies the most quickly.
The same is true of x if it is an array.
fillarray
returns array.
array may be any type of array, or, for Maclisp
compatibility, a symbol whose function cell contains an array.
listarray
creates and returns a list whose elements are those of
array. If limit is present, it should be a fixnum, and only
the first limit (if there are more than that many) elements of
array are used, and so the maximum length of the returned list is
limit.
If array is multi-dimensional, the elements are accessed
in row-major order: the last subscript varies the most quickly.
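A brief sketch of fillarray and listarray, including the row-major treatment of a two-dimensional array. The call forms (array first, then the list or limit) are assumed, and the variable names are illustrative.

```lisp
(setq a (make-array 5))
(fillarray a '(1 2 3))
(listarray a) => (1 2 3 3 3)      ; last element of the list repeated
(listarray a 2) => (1 2)          ; limit of 2
(fillarray a nil)
(listarray a) => (nil nil nil nil nil)

(setq m (make-array '(2 3)))
(fillarray m '(1 2 3 4 5 6))
(aref m 0 2) => 3                 ; last subscript varies most quickly
(aref m 1 0) => 4
(listarray m) => (1 2 3 4 5 6)
```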
array may be any type of array, or, for Maclisp
compatibility, a symbol whose function cell contains an array.
list-array-leader
creates and returns a list whose elements are those of
array’s leader. If limit is present, it should be a fixnum, and only
the first limit (if there are more than that many) elements of
array’s leader are used, and so the maximum length of the returned list is
limit. If array has no leader, nil
is returned.
from and to must be arrays. The contents of from
are copied into the contents of to, element by element.
If to is shorter than from,
the rest of from is ignored. If from is shorter than
to, the rest of to is filled with nil
if it
is a q-type array, or 0 if it is a numeric array or a string,
or 0.0 if it is a flonum array.
This function always returns t
.
Note that even if from or to has a leader, the whole array is used; the convention that leader element 0 is the "active" length of the array is not used by this function. The leader itself is not copied.
copy-array-contents
works on multi-dimensional arrays. The elements of from
and to are accessed by "linearized" subscripts, and column-major order is used,
i.e., the first subscript varies fastest (opposite from fillarray
).
This is just like copy-array-contents
, but the leader of from
(if any) is also copied into to. copy-array-contents
copies only
the main part of the array.
The portion of the array from-array with indices greater than or
equal to from-start and less than from-end is copied into
the portion of the array to-array with indices greater than or
equal to to-start and less than to-end, element by element.
If there are more elements in the selected portion of to-array
than in the selected portion of from-array, the extra elements
are filled with the default value as by copy-array-contents
.
If there are more elements in the selected portion of from-array,
the extra ones are ignored. Multi-dimensional arrays are treated
the same way as copy-array-contents
treats them.
This function always returns t
.
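The copying functions can be sketched on one-dimensional art-q arrays as follows. The argument order for copy-array-portion is assumed to be the source array with its two bounds, then the destination array with its two bounds; the variable names are illustrative.

```lisp
(setq from (make-array 4))
(fillarray from '(a b c d))
(setq to (make-array 6))
(fillarray to '(1 2 3 4 5 6))
(copy-array-portion from 1 3 to 0 2) => t
(listarray to) => (b c 3 4 5 6)        ; elements 1 and 2 of from copied
(copy-array-contents from to) => t
(listarray to) => (a b c d nil nil)    ; rest of to filled with nil
```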
from-array and to-array must be two-dimensional arrays
of bits or bytes (art-1b
, art-2b
, art-4b
, art-8b
,
art-16b
, or art-32b
). bitblt
copies a rectangular portion of from-array
into a rectangular portion of to-array. The value stored
can be a Boolean function of the new value and the value already there,
under the control of alu (see below). This function is most commonly used
in connection with raster images for TV displays.
The top-left corner of the source rectangle is (aref from-array
from-x from-y)
. The top-left corner of the destination
rectangle is (aref to-array to-x to-y)
. width
and height are the dimensions of both rectangles. If width
or height is zero, bitblt
does nothing.
from-array and to-array are allowed to be the same array.
bitblt
normally traverses the arrays in increasing order of x
and y subscripts. If width is negative, then (abs width)
is used as the width, but the processing of the x direction is done
backwards, starting with the highest value of x and working down.
If height is negative it is treated analogously. When
bitblt
’ing an array to itself, when the two rectangles overlap, it
may be necessary to work backwards to achieve the desired effect, such
as shifting the entire array upwards by a certain number of rows. Note
that negativity of width or height does not affect the
(x,y) coordinates specified by the arguments, which are still the
top-left corner even if bitblt
starts at some other corner.
If the two arrays are of different types, bitblt
works bit-wise
and not element-wise. That is, if you bitblt
from an art-2b
array into an art-4b
array, then two elements of the from-array
will correspond to one element of the to-array.
If bitblt
goes outside the bounds of the source array, it wraps
around. This allows such operations as the replication of a small
stipple pattern through a large array. If bitblt
goes outside
the bounds of the destination array, it signals an error.
If src is an element of the source rectangle, and dst
is the corresponding element of the destination rectangle, then
bitblt
changes the value of dst to
(boole alu src dst)
. See the boole
function (boole-fun). There are symbolic names for some of the
most useful alu functions; they are tv:alu-seta
(plain
copy), tv:alu-ior
(inclusive or), tv:alu-xor
(exclusive
or), and tv:alu-andca
(and with complement of source).
bitblt
is written in highly-optimized microcode and goes very much
faster than the same thing written with ordinary aref
and aset
operations would. Unfortunately this causes bitblt
to have a couple
of strange restrictions. Wrap-around does not work correctly if
from-array is an indirect array with an index-offset. bitblt
will signal an error if the first dimensions of from-array
and to-array are not both integral multiples of the machine word
length. For art-1b
arrays, the first dimension must be a multiple
of 32., for art-2b
arrays it must be a multiple of 16., etc.
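A small sketch of a plain copy followed by an exclusive-or. It assumes the argument order alu, width, height, from-array, from-x, from-y, to-array, to-x, to-y, and it picks a first dimension of 32. so that the word-length restriction on art-1b arrays is met. The variable names src and dst are illustrative.

```lisp
(setq src (make-array '(32. 4) ':type 'art-1b))
(setq dst (make-array '(32. 4) ':type 'art-1b))
(aset 1 src 0 0)
;; Copy an 8 by 4 rectangle from the corner of src to the corner of dst.
(bitblt tv:alu-seta 8. 4 src 0 0 dst 0 0)
(aref dst 0 0) => 1
;; XOR the same rectangle onto dst again; equal bits cancel.
(bitblt tv:alu-xor 8. 4 src 0 0 dst 0 0)
(aref dst 0 0) => 0
```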
The functions in this section perform some useful matrix operations.
The matrices are represented as two-dimensional Lisp arrays.
These functions are part of the mathematics package rather than
the kernel array system, hence the "math:
" in the names.
Multiplies matrix-1 by matrix-2. If matrix-3 is supplied,
multiply-matrices
stores the results into matrix-3 and returns
matrix-3; otherwise it creates an array to contain the answer and
returns that. All matrices must be two-dimensional arrays, and the first
dimension of matrix-2 must equal the second dimension of matrix-1.
math:invert-matrix computes the inverse of matrix. If into-matrix is supplied,
stores the result into it and returns it; otherwise it creates an array
to hold the result, and returns that. matrix must be two-dimensional
and square. The Gauss-Jordan algorithm with partial pivoting is used.
Note: if you want to solve a set of simultaneous equations, you should
not use this function; use math:decompose
and math:solve
(see below).
math:transpose-matrix transposes matrix. If into-matrix is supplied, stores the result into it and returns it; otherwise it creates an array to hold the result, and returns that. matrix must be a two-dimensional array. into-matrix, if provided, must be two-dimensional and have sufficient dimensions to hold the transpose of matrix.
math:determinant returns the determinant of matrix. matrix must be a two-dimensional square matrix.
The next two functions are used to solve sets of simultaneous linear
equations. math:decompose
takes a matrix holding the coefficients of the
equations and produces the LU decomposition; this decomposition can then
be passed to math:solve
along with a vector of right-hand sides
to get the values of the variables. If you want to solve the same
equations for many different sets of right-hand side values, you only need to call
math:decompose
once. In terms of the argument names used below, these
two functions exist to solve the vector equation A x = b
for x. A is a matrix. b and x are vectors.
Computes the LU decomposition of matrix a. If lu is non-nil
,
stores the result into it and returns it; otherwise it creates an array
to hold the result, and returns that. The lower triangle of lu, with
ones added along the diagonal, is L, and the upper triangle of lu is
U, such that the product of L and U is a. Gaussian elimination with
partial pivoting is used. The lu array is permuted by rows according
to the permutation array ps, which is also produced by this function;
if the argument ps is supplied, the permutation array is stored into it;
otherwise, an array is created to hold it. This function returns two values:
the LU decomposition and the permutation array.
This function takes the LU decomposition and associated permutation
array produced by math:decompose
, and solves the set of simultaneous
equations defined by the original matrix a and the right-hand sides
in the vector b. If x is supplied, the solutions
are stored into it and it is returned; otherwise, an array is
created to hold the solutions and that is returned. b must
be a one-dimensional array.
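As a hedged sketch of the two functions used together, the following solves the pair of equations 2x + y = 3 and x + 3y = 5. The argument order (lu, ps, b) for math:solve and the use of the multiple-value special form to receive both values from math:decompose are assumptions; the variable names are illustrative.

```lisp
(setq a (make-array '(2 2)))
(math:fill-2d-array a '((2.0 1.0) (1.0 3.0)))   ; coefficient matrix A
(setq b (make-array 2))
(fillarray b '(3.0 5.0))                        ; right-hand sides
(multiple-value (lu ps) (math:decompose a))     ; decompose once
(setq x (math:solve lu ps b))
;; x now holds the solution vector; its two elements should be
;; close to 0.8 and 1.4, up to flonum rounding.
```

To solve for another right-hand side, only the math:solve call need be repeated with the same lu and ps.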
Returns a list of lists containing the values in array, which must be a two-dimensional array. There is one element for each row; each element is a list of the values in that row.
This is the opposite of math:list-2d-array
. list should be a
list of lists, with each element being a list corresponding to a row.
array’s elements are stored from the list. Unlike fillarray
(see fillarray-fun), if list is not long enough,
math:fill-2d-array
"wraps around", starting over at the beginning.
The lists which are elements of list also work this way.
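The round trip and the wrap-around behavior can be sketched as below, assuming that math:fill-2d-array takes the array and then the list, and that the first subscript selects the row. The variable name is illustrative.

```lisp
(setq a (make-array '(2 3)))
(math:fill-2d-array a '((1 2 3) (4 5 6)))
(math:list-2d-array a) => ((1 2 3) (4 5 6))
;; A list that is too short wraps around, both by rows and within a row.
(math:fill-2d-array a '((1 2)))
(math:list-2d-array a) => ((1 2 1) (1 2 1))
```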
A plane is an array whose bounds, in each dimension, are plus-infinity and minus-infinity; all integers are legal as indices. Planes are distinguished not by size and shape, but by number of dimensions alone. When a plane is created, a default value must be specified. At that moment, every component of the plane has that value. As you can’t ever change more than a finite number of components, only a finite region of the plane need actually be stored.
The regular array accessing functions don’t work on planes.
You can use make-plane
to create a plane,
plane-aref
or plane-ref
to get the value of a component, and
plane-aset
or plane-store
to store into a component.
array-#-dims
will work on a plane.
A plane is actually stored as an array with a leader.
The array corresponds to a rectangular, aligned region of the plane,
containing all the components in which a plane-store
has been done
(and others, in general, which have never been altered).
The lowest-coordinate corner of that rectangular region is
given by the plane-origin
in the array leader.
The highest coordinate corner can be found by adding the plane-origin
to the array-dimensions
of the array.
The plane-default
is the contents of all the
elements of the plane which are not actually stored in the array.
The plane-extension
is the amount to extend a plane by in any direction
when the plane needs to be extended. The default is 32.
If you never use any negative indices, then the plane-origin
will
be all zeroes and you can use regular array functions, such as aref
and aset
,
to access the portion of the plane which is actually stored. This can be
useful to speed up certain algorithms. In this case you can even use the
bitblt
function on a two-dimensional plane of bits or bytes,
provided you don’t change the plane-extension
to a number that is not
a multiple of 32.
Creates and returns a plane. rank is the number of dimensions. options is a list of alternating keyword symbols and values. The allowed keywords are:
:type
The array type symbol (e.g., art-1b
) specifying the type of the array
out of which the plane is made.
:default-value
The default component value as explained above.
:extension
The amount by which to extend the plane, as explained above.
Example:
(make-plane 2 ':type 'art-4b ':default-value 3)
creates a two-dimensional plane of type art-4b
, with default value 3
.
A list of numbers, giving the lowest coordinate values actually stored.
This is the contents of the infinite number of plane elements which are not actually stored.
The amount to extend the plane by in any direction when plane-store
is done
outside of the currently-stored portion.
These two functions return the contents of a specified element of a plane.
They differ only in the way they take their arguments; plane-aref
wants
the subscripts as arguments, while plane-ref
wants a list of subscripts.
These two functions store datum into the specified element of a plane,
extending it if necessary, and return datum.
They differ only in the way they take their arguments; plane-aset
wants
the subscripts as arguments, while plane-store
wants a list of subscripts.
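A minimal sketch of reading and storing components of a plane. It assumes the argument orders suggested by the descriptions above (plane then subscripts for the reading functions, datum first for the storing functions); the variable name p is illustrative.

```lisp
(setq p (make-plane 2 ':type 'art-4b ':default-value 3))
(plane-aref p -5 7) => 3              ; never stored, so the default
(plane-aset 1 p -5 7) => 1            ; extends the plane if necessary
(plane-aref p -5 7) => 1
(plane-ref p '(-5 7)) => 1            ; same element, subscripts as a list
(plane-store 2 p '(100. 100.)) => 2
(plane-aref p 0 0) => 3               ; still the default
```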
The functions in this section are provided only for Maclisp compatibility, and should not be used in new programs.
Fixnum arrays do not exist (however, see Zetalisp’s
small-positive-integer arrays). Flonum arrays exist but you do not use
them in the same way; no declarations are required or allowed.
"Un-garbage-collected" arrays do not exist.
Readtables and obarrays are represented as arrays, but, unlike in Maclisp, special
array types are not used. See the descriptions
of read
(read-fun) and intern
(intern-fun) for
information about readtables and obarrays (packages).
There are no "dead" arrays, nor are Multics "external" arrays provided.
The arraycall
function exists for compatibility
but should not be used (see aref
, aref-fun.)
Subscripts are always checked for validity, regardless of the value
of *rset
and whether the code is compiled or not.
However, in a multi-dimensional array, an error is only caused
if the subscripts would have resulted in a reference to storage
outside of the array. For example, if you have a 2 by 7 array and refer
to an element with subscripts 3 and 1, no error will
be caused despite the fact that the reference is invalid;
but if you refer to element 1 by 100, an error will be caused.
In other words, subscript errors will be caught if and only if
they refer to storage outside the array; some errors are undetected,
but they will only clobber some other element of the same array
rather than clobbering something completely unpredictable.
Currently, multi-dimensional arrays are stored in column-major order rather than row-major order as in Maclisp. See column-major for further discussion of this issue.
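The subscript-checking rule above can be illustrated by computing the "linearized" subscript by hand. The function below is a hypothetical helper, not part of the system; it assumes column-major storage, in which the coefficient of each subscript is the product of the preceding dimensions.

```lisp
;; Hypothetical helper: column-major linearized subscript.
(defun linearized-subscript (dimensions subscripts)
  (do ((dims dimensions (cdr dims))
       (subs subscripts (cdr subs))
       (coefficient 1 (* coefficient (car dims)))
       (index 0 (+ index (* coefficient (car subs)))))
      ((null dims) index)))

;; A 2 by 7 array has fourteen elements, with linearized subscripts 0 to 13.
(linearized-subscript '(2 7) '(3 1)) => 5
(linearized-subscript '(2 7) '(1 100.))
  => 311      ;octal, which is 201. decimal
```

The first result falls inside the array's storage, so the invalid reference goes undetected; the second falls outside it, so an error is signalled.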
loadarrays
and dumparrays
are not provided. However,
arrays can be put into "QFASL" files; see fasdump.
The *rearray
function is not provided, since not all
of its functionality is available in Zetalisp.
The most common uses can be replaced by adjust-array-size
.
In Maclisp, arrays are usually kept on the array
property
of symbols, and the symbols are used instead of the arrays. In order
to provide some degree of compatibility for this manner of using
arrays, the array
, *array
, and store
functions are
provided, and when arrays are applied to arguments, the arguments are
treated as subscripts and apply
returns the corresponding element
of the array.
This creates an art-q
type array in default-array-area
with the given dimensions. (That is, dims is given
to make-array
as its first argument.) type is ignored.
If symbol is nil
, the array is returned; otherwise,
the array is put in the function cell of symbol, and symbol
is returned.
This is just like array
, except that all of the arguments
are evaluated.
store
stores x into the
specified array element. array-ref should be a form which
references an array by calling it as a function (aref
forms are not
acceptable). First x is evaluated, then array-ref is
evaluated, and then the value of x is stored into the array cell
last referenced by a function call, presumably the one in array-ref.
This is just like store
, but it is not
a special form; this is because the arguments are in the other
order. This function only exists for the compiler to compile the
store
special form into, and should never be used by programs.
(arraycall t array sub1 sub2...)
is the same
as (aref array sub1 sub2...)
. It exists for
Maclisp compatibility.
Strings are a type of array which represent a sequence of
characters. The printed representation of a string is its characters
enclosed in quotation marks, for example "foo bar"
. Strings are
constants, that is, evaluating a string returns that string. Strings
are the right data type to use for text-processing.
Strings are arrays of type art-string
, where each element
holds an eight-bit unsigned fixnum. This is because characters are
represented as fixnums, and for fundamental characters only eight bits are
used. A string can also be an array of type art-fat-string
, where each
element holds a sixteen-bit unsigned fixnum; the extra bits allow for
multiple fonts or an expanded character set.
The way characters work, including multiple fonts and the extra bits from the keyboard, is explained in character-set. Note that you can type in the fixnums that represent characters using "#/" and "#\"; for example, #/f reads in as the fixnum that represents the character "f", and #\return reads in as the fixnum that represents the special "return" character. See sharp-slash for details of this syntax.
The functions described in this section provide a variety of useful operations on strings. In place of a string, most of these functions will accept a symbol or a fixnum as an argument, and will coerce it into a string. Given a symbol, its print name, which is a string, will be used. Given a fixnum, a one-character string containing the character designated by that fixnum will be used. Several of the functions actually work on any type of one-dimensional array and may be useful for other than string processing; these are the functions such as substring and string-length which do not depend on the elements of the string being characters.
Since strings are arrays, the usual array-referencing function aref is used to extract the characters of the string as fixnums. For example,
(aref "frob" 1) => 162 ;lower-case r
Note that the character at the beginning of the string is element zero of the array (rather than one); as usual in Zetalisp, everything is zero-based.
It is also legal to store into strings (using aset). As with rplaca on lists, this changes the actual object; one must be careful to understand where side-effects will propagate to.
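A small sketch of how such a side-effect propagates. The variable names are only for illustration, and string-append (described later in this section) is used here to obtain a fresh copy that is safe to modify.

```lisp
(setq s1 (string-append "frob"))   ; a fresh copy of the string
(setq s2 s1)                       ; s2 is the same object, not a copy
(aset #/g s1 0)                    ; store the character g into element 0
s2                                 ; => "grob"
```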
When you are making strings that you intend to change later, you probably
want to create an array with a fill-pointer (see fill-pointer) so that
you can change the length of the string as well as the contents.
The length of a string is always computed using array-active-length, so that if a string has a fill-pointer, its value will be used as the length.
character coerces x to a single character, represented as a fixnum. If x is a number, it is returned. If x is a string or an array, its first element is returned. If x is a symbol, the first character of its pname is returned. Otherwise, an error occurs. The way characters are represented as fixnums is explained in character-set.
This is the primitive for comparing characters for equality; many of the string functions call it. ch1 and ch2 must be fixnums. The result is t if the characters are equal ignoring case and font, otherwise nil. %%ch-char is the byte-specifier for the portion of a character which excludes the font information.
This is the primitive for comparing characters for order; many of the string functions call it. ch1 and ch2 must be fixnums. The result is t if ch1 comes before ch2 ignoring case and font, otherwise nil. Details of the ordering of characters are in character-set.
This variable is normally nil. If it is t, char-equal, char-lessp, and the string searching and comparison functions will distinguish between upper-case and lower-case letters. If it is nil, lower-case characters behave as if they were the same character but in upper-case. It is all right to bind this to t around a string operation, but changing its global value to t will break many system functions and user interfaces and so is not recommended.
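The recommended usage can be sketched as follows, assuming the variable being described is alphabetic-case-affects-string-comparison. The binding is in effect only during the body of the let, so the global value stays nil.

```lisp
(string-equal "Foo" "foo")                               ; => t
(let ((alphabetic-case-affects-string-comparison t))     ; bind, do not setq
  (string-equal "Foo" "foo"))                            ; => nil
(string-equal "Foo" "foo")                               ; => t, binding undone
```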
If ch, which must be a fixnum, is a lower-case alphabetic character, its upper-case form is returned; otherwise, ch itself is returned. If font information is present, it is preserved.
If ch, which must be a fixnum, is an upper-case alphabetic character, its lower-case form is returned; otherwise, ch itself is returned. If font information is present, it is preserved.
Returns a copy of string, with all lower case alphabetic characters replaced by the corresponding upper case characters.
Returns a copy of string, with all upper case alphabetic characters replaced by the corresponding lower case characters.
string coerces x into a string. Most of the string functions apply this to their string arguments. If x is a string (or any array), it is returned. If x is a symbol, its pname is returned. If x is a non-negative fixnum less than 400 octal, a one-character-long string containing it is created and returned. If x is a pathname (see pathname), the "string for printing" is returned. Otherwise, an error is signalled.
If you want to get the printed representation of an object into the form of a string, this function is not what you should use. You can use format, passing a first argument of nil (see format-fun). You might also want to use with-output-to-string (see with-output-to-string-fun).
string-length returns the number of characters in string. This is 1 if string is a number, the array-active-length (see array-active-length-fun) if string is an array, or the array-active-length of the pname if string is a symbol.
string-equal compares two strings, returning t if they are equal and nil if they are not. The comparison ignores the extra "font" bits in 16-bit strings and ignores alphabetic case. equal calls string-equal if applied to two strings.
The optional arguments idx1 and idx2 are the starting indices into the strings. The optional arguments lim1 and lim2 are the final indices; the comparison stops just before the final index. lim1 and lim2 default to the lengths of the strings. These arguments are provided so that you can efficiently compare substrings.
Examples:
(string-equal "Foo" "foo") => t
(string-equal "foo" "bar") => nil
(string-equal "element" "select" 0 1 3 4) => t
%string-equal is the microcode primitive which string-equal calls. It returns t if the count characters of string1 starting at idx1 are char-equal to the count characters of string2 starting at idx2, or nil if the characters are not equal or if count runs off the length of either array.
Instead of a fixnum, count may also be nil. In this case, %string-equal compares the substring from idx1 to (string-length string1) against the substring from idx2 to (string-length string2). If the lengths of these substrings differ, then they are not equal and nil is returned.
Note that string1 and string2 must really be strings; the usual coercion of symbols and fixnums to strings is not performed. This function is documented because certain programs which require high efficiency and are willing to pay the price of less generality may want to use %string-equal in place of string-equal.
Examples:
To compare the two strings foo and bar:
(%string-equal foo 0 bar 0 nil)
To see if the string foo starts with the characters "bar":
(%string-equal foo 0 "bar" 0 3)
string-lessp compares two strings using dictionary order (as defined by char-lessp). The result is t if string1 is the lesser, or nil if they are equal or string2 is the lesser.
string-compare compares two strings using dictionary order (as defined by char-lessp). The arguments are interpreted as in string-equal. The result is 0 if the strings are equal, a negative number if string1 is less than string2, or a positive number if string1 is greater than string2. If the strings are not equal, the absolute value of the number returned is one greater than the index (in string1) where the first difference occurred.
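The following values are worked out from the rule just stated and are offered as a sketch rather than as output taken from a running system. In the second and third forms the first difference is at index 2, so the magnitude of the result is 3.

```lisp
(string-compare "abc" "abc")   ; => 0, the strings are equal
(string-compare "abc" "abd")   ; => -3, string1 is less; difference at index 2
(string-compare "abd" "abc")   ; => 3, string1 is greater; difference at index 2
(string-compare "b" "a")       ; => 1, difference at index 0
```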
This extracts a substring of string, starting at the character specified by start and going up to but not including the character specified by end. start and end are 0-origin indices. The length of the returned string is end minus start. If end is not specified it defaults to the length of string. The area in which the result is to be consed may be optionally specified.
Example:
(substring "Nebuchadnezzar" 4 8) => "chad"
nsubstring is the same as substring except that the substring is not copied; instead an indirect array (see indirect-array) is created which shares part of the argument string. Modifying one string will modify the other.
Note that nsubstring does not necessarily use less storage than substring; an nsubstring of any length uses at least as much storage as a substring 12 characters long. So you shouldn’t use this just "for efficiency"; it is intended for uses in which it is important to have a substring which, if modified, will cause the original string to be modified too.
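A minimal sketch of the sharing behavior, using illustrative variable names. string-append is used only to get a fresh string to modify.

```lisp
(setq whole (string-append "Nebuchadnezzar"))  ; a fresh copy to modify
(setq copy (substring whole 4 8))              ; "chad", an independent copy
(setq part (nsubstring whole 4 8))             ; "chad", shares storage with whole
(aset #/C part 0)                              ; modify the shared substring
whole                                          ; => "NebuChadnezzar"
copy                                           ; => "chad", unaffected
```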
Any number of strings are copied and concatenated into a single string. With a single argument, string-append simply copies it. If the first argument is an array, the result will be an array of the same type. Thus string-append can be used to copy and concatenate any type of 1-dimensional array.
Example:
(string-append #/! "foo" #/!) => "!foo!"
string-nconc is like string-append except that instead of making a new string containing the concatenation of its arguments, string-nconc modifies its first argument. modified-string must have a fill-pointer so that additional characters can be tacked onto it. Compare this with array-push-extend (array-push-extend-fun). The value of string-nconc is modified-string or a new, longer copy of it; in the latter case the original copy is forwarded to the new copy (see adjust-array-size, adjust-array-size-fun). Unlike nconc, string-nconc with more than two arguments modifies only its first argument, not every argument but the last.
This returns a substring of string, with all characters in char-set stripped off of the beginning and end. char-set is a set of characters, which can be represented as a list of characters or a string of characters.
Examples:
(string-trim '(#\sp) " Dr. No ") => "Dr. No"
(string-trim "ab" "abbafooabb") => "foo"
This returns a substring of string, with all characters in char-set stripped off of the beginning. char-set is a set of characters, which can be represented as a list of characters or a string of characters.
This returns a substring of string, with all characters in char-set stripped off of the end. char-set is a set of characters, which can be represented as a list of characters or a string of characters.
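By analogy with the string-trim examples, the one-sided versions would behave as sketched below. This assumes the two functions just described are named string-left-trim and string-right-trim; the results follow from the descriptions above.

```lisp
(string-left-trim "ab" "abbafooabb")     ; => "fooabb"
(string-right-trim "ab" "abbafooabb")    ; => "abbafoo"
(string-left-trim '(#\sp) " Dr. No ")    ; => "Dr. No "
```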
Returns a copy of string with the order of characters reversed. This will reverse a 1-dimensional array of any type.
Returns string with the order of characters reversed, smashing the original string, rather than creating a new one. If string is a number, it is simply returned without consing up a string. This will reverse a 1-dimensional array of any type.
string-pluralize returns a string containing the plural of the word in the argument string. Any added characters go in the same case as the last character of string.
Examples:
(string-pluralize "event") => "events"
(string-pluralize "Man") => "Men"
(string-pluralize "Can") => "Cans"
(string-pluralize "key") => "keys"
(string-pluralize "TRY") => "TRIES"
For words with multiple plural forms depending on the meaning, string-pluralize cannot always do the right thing.
string-search-char searches through string starting at the index from, which defaults to the beginning, and returns the index of the first character which is char-equal to char, or nil if none is found. If the to argument is supplied, it is used in place of (string-length string) to limit the extent of the search.
Example:
(string-search-char #/a "banana") => 1
%string-search-char is the microcode primitive which string-search-char and other functions call. string must be an array and char, from, and to must be fixnums. Except for this lack of type-coercion, and the fact that none of the arguments is optional, %string-search-char is the same as string-search-char. This function is documented for the benefit of those who require the maximum possible efficiency in string searching.
string-search-not-char searches through string starting at the index from, which defaults to the beginning, and returns the index of the first character which is not char-equal to char, or nil if none is found. If the to argument is supplied, it is used in place of (string-length string) to limit the extent of the search.
Example:
(string-search-not-char #/b "banana") => 1
string-search searches for the string key in the string string. The search begins at from, which defaults to the beginning of string. The value returned is the index of the first character of the first instance of key, or nil if none is found. If the to argument is supplied, it is used in place of (string-length string) to limit the extent of the search.
Examples:
(string-search "an" "banana") => 1
(string-search "an" "banana" 2) => 3
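The from argument makes it easy to step through every occurrence of key. The function below is a hypothetical helper written for this illustration, not part of the system; it restarts each search one character past the previous match, so overlapping occurrences are counted.

```lisp
;; count-occurrences is a hypothetical name used only in this sketch.
(defun count-occurrences (key string)
  (do ((i (string-search key string)
          (string-search key string (1+ i)))   ; resume just past the last match
       (n 0 (1+ n)))                           ; one more occurrence seen
      ((null i) n)))                           ; no further match: return the count

(count-occurrences "an" "banana")              ; => 2
```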
string-search-set searches through string looking for a character which is in char-set. The search begins at the index from, which defaults to the beginning. It returns the index of the first character which is char-equal to some element of char-set, or nil if none is found. If the to argument is supplied, it is used in place of (string-length string) to limit the extent of the search. char-set is a set of characters, which can be represented as a list of characters or a string of characters.
Examples:
(string-search-set '(#/n #/o) "banana") => 2
(string-search-set "no" "banana") => 2
string-search-not-set searches through string looking for a character which is not in char-set. The search begins at the index from, which defaults to the beginning. It returns the index of the first character which is not char-equal to any element of char-set, or nil if none is found. If the to argument is supplied, it is used in place of (string-length string) to limit the extent of the search. char-set is a set of characters, which can be represented as a list of characters or a string of characters.
Example:
(string-search-not-set '(#/a #/b) "banana") => 2
string-reverse-search-char searches through string in reverse order, starting from the index one less than from, which defaults to the length of string, and returns the index of the first character which is char-equal to char, or nil if none is found. Note that the index returned is from the beginning of the string, although the search starts from the end. If the to argument is supplied, it limits the extent of the search.
Example:
(string-reverse-search-char #/n "banana") => 4
string-reverse-search-not-char searches through string in reverse order, starting from the index one less than from, which defaults to the length of string, and returns the index of the first character which is not char-equal to char, or nil if none is found. Note that the index returned is from the beginning of the string, although the search starts from the end. If the to argument is supplied, it limits the extent of the search.
Example:
(string-reverse-search-not-char #/a "banana") => 4
string-reverse-search searches for the string key in the string string. The search proceeds in reverse order, starting from the index one less than from, which defaults to the length of string, and returns the index of the first (leftmost) character of the first instance found, or nil if none is found. Note that the index returned is from the beginning of the string, although the search starts from the end. The from condition, restated, is that the instance of key found is the rightmost one whose rightmost character is before the from’th character of string. If the to argument is supplied, it limits the extent of the search.
Example:
(string-reverse-search "na" "banana") => 4
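To make the from condition concrete, here is a sketch of the same searches with from supplied. The results are worked out from the description above: with from equal to 4, the instance of "na" at index 4 is excluded because its rightmost character is not before the 4th character.

```lisp
(string-reverse-search "na" "banana")         ; => 4, from defaults to the length
(string-reverse-search "na" "banana" 4)       ; => 2, the instance ending at index 3
(string-reverse-search-char #/n "banana" 4)   ; => 2, search starts at index 3
```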
string-reverse-search-set searches through string in reverse order, starting from the index one less than from, which defaults to the length of string, and returns the index of the first character which is char-equal to some element of char-set, or nil if none is found. Note that the index returned is from the beginning of the string, although the search starts from the end. If the to argument is supplied, it limits the extent of the search. char-set is a set of characters, which can be represented as a list of characters or a string of characters.
Example:
(string-reverse-search-set "ab" "banana") => 5
string-reverse-search-not-set searches through string in reverse order, starting from the index one less than from, which defaults to the length of string, and returns the index of the first character which is not char-equal to any element of char-set, or nil if none is found. Note that the index returned is from the beginning of the string, although the search starts from the end. If the to argument is supplied, it limits the extent of the search. char-set is a set of characters, which can be represented as a list of characters or a string of characters.
Example:
(string-reverse-search-not-set '(#/a #/n) "banana") => 0
See also intern (intern-fun), which given a string will return "the" symbol with that print name.
The special forms in this section allow you to create I/O streams which input from or output to a string rather than a real I/O device. See streams for documentation of I/O streams.
The form
(with-input-from-string (var string) body)
evaluates the forms in body with the variable var bound to a stream which reads characters from the string which is the value of the form string. The value of the special form is the value of the last form in its body.
The stream is a function that only works inside the with-input-from-string special form, so be careful what you do with it. You cannot use it after control leaves the body, and you cannot nest two with-input-from-string special forms and use both streams since the special-variable bindings associated with the streams will conflict. It is done this way to avoid any allocation of memory.
After string you may optionally specify two additional "arguments". The first is index:
(with-input-from-string (var string index) body)
uses index as the starting index into the string, and sets index to the index of the first character not read when with-input-from-string returns. If the whole string is read, it will be set to the length of the string. Since index is updated it may not be a general expression; it must be a variable or a setf-able reference. The index is not updated in the event of an abnormal exit from the body, such as a *throw. The value of index is not updated until with-input-from-string returns, so you can’t use its value within the body to see how far the reading has gotten.
Currently, use of the index feature prevents multiple values from being returned out of the body.
(with-input-from-string (var string index limit) body)
uses the value of the form limit, if the value is not nil, in place of the length of the string. If you want to specify a limit but not an index, write nil for index.
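A minimal sketch of the forms described above. It assumes that read and tyi accept the stream as their first argument, and the variable idx is introduced only for illustration. The final value of idx is what the description of index implies, not output taken from a running system.

```lisp
;; Read one Lisp object from a string.
(with-input-from-string (s "(a b) c")
  (read s))                        ; => (a b)

;; Read two characters and remember where reading stopped.
(setq idx 0)
(with-input-from-string (s "frob" idx)
  (tyi s)                          ; reads the character f
  (tyi s))                         ; reads the character r; => 162
idx                                ; => 2, the first character not read
```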
This special form provides a variety of ways to send output to a string through an I/O stream.
(with-output-to-string (var) body)
evaluates the forms in body with var bound to a stream which saves the characters output to it in a string. The value of the special form is the string.
(with-output-to-string (var string) body)
will append its output to the string which is the value of the form string.
(This is like the string-nconc function; see string-nconc-fun.) The value returned is the value of the last form in the body, rather than the string. Multiple values are not returned. string must have an array-leader; element 0 of the array-leader will be used as the fill-pointer. If string is too small to contain all the output, adjust-array-size will be used to make it bigger.
(with-output-to-string (var string index) body)
is similar to the above except that index is a variable or setf-able reference which contains the index of the next character to be stored into. It must be initialized outside the with-output-to-string and will be updated upon normal exit. The value of index is not updated until with-output-to-string returns, so you can’t use its value within the body to see how far the writing has gotten. The presence of index means that string is not required to have a fill-pointer; if it does have one it will be updated.
The stream is a "downward closure" simulated with special variables, so be careful what you do with it. You cannot use it after control leaves the body, and you cannot nest two with-output-to-string special forms and use both streams since the special-variable bindings associated with the streams will conflict. It is done this way to avoid any allocation of memory.
It is OK to use a with-input-from-string and with-output-to-string nested within one another, so long as there is only one of each.
Another way of doing output to a string is to use the format facility (see format-fun).
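The two approaches can be compared in a short sketch. It assumes that princ accepts the stream as its second argument and that format supports the ~A directive; both are assumptions as far as this section is concerned.

```lisp
;; Collect output sent to a stream into a new string.
(with-output-to-string (s)
  (princ "abc" s)
  (princ "def" s))                    ; => "abcdef"

;; The same kind of result using format with a first argument of nil.
(format nil "~A and ~A" "abc" "def")  ; => "abc and def"
```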
The following functions are provided primarily for Maclisp compatibility.
(alphalessp string1 string2) is equivalent to (string-lessp string1 string2).
Returns the index’th character of string as a symbol. Note that 1-origin indexing is used. This function is mainly for Maclisp compatibility; aref should be used to index into strings (however, aref will not coerce symbols or numbers into strings).
Returns the index’th character of string as a fixnum. Note that 1-origin indexing is used. This function is mainly for Maclisp compatibility; aref should be used to index into strings (however, aref will not coerce symbols or numbers into strings).
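The difference in index origin can be seen in the sketch below, which assumes the two functions just described are the Maclisp getchar and getcharn. The fixnum shown is the one given for lower-case r in the aref example earlier in this section.

```lisp
(aref "frob" 1)        ; => 162, zero-origin: element 1 is r
(getcharn "frob" 2)    ; => 162, one-origin: the 2nd character is r
(getchar "frob" 2)     ; => r, the same character as a symbol
```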
ascii is like character, but returns a symbol whose printname is the character instead of returning a fixnum.
Examples:
(ascii 101) => A
(ascii 56) => /.
The symbol returned is interned in the current package (see package).
maknam returns an uninterned symbol whose print-name is a string made up of the characters in char-list.
Example:
(maknam '(a b #/0 d)) => ab0d
implode is like maknam except that the returned symbol is interned in the current package.
The samepnamep function is also provided; see samepnamep-fun.
Functions are the basic building blocks of Lisp programs. This chapter describes the functions in Zetalisp that are used to manipulate functions. It also explains how to manipulate special forms and macros.
This chapter contains internal details intended for those writing programs to manipulate programs as well as material suitable for the beginner. Feel free to skip sections that look complicated or uninteresting when reading this for the first time.
There are many different kinds of functions in Zetalisp. Here are the printed representations of examples of some of them:
foo
(lambda (x) (car (last x)))
(named-lambda foo (x) (car (last x)))
(subst (x) (car (last x)))
#<dtp-fef-pointer append 1424771>
#<dtp-u-entry last 270>
#<dtp-closure 1477464>
We will examine these and other types of functions in detail later in this chapter. There is one thing they all have in common: a function is a Lisp object that can be applied to arguments. All of the above objects may be applied to some arguments and will return a value. Functions are Lisp objects and so can be manipulated in all the usual ways; you can pass them as arguments, return them as values, and make other Lisp objects refer to them.
The name of a function does not have to be a symbol. Various kinds of lists describe other places where a function can be found. A Lisp object which describes a place to find a function is called a function spec. ("Spec" is short for "specification".) Here are the printed representations of some typical function specs:
foo
(:property foo bar)
(:method tv:graphics-mixin :draw-line)
(:internal foo 1)
(:within foo bar)
(:location #<dtp-locative 7435216>)
Function specs have two purposes: they specify a place to remember a function, and they serve to name functions. The most common kind of function spec is a symbol, which specifies that the function cell of the symbol is the place to remember the function. We will see all the kinds of function spec, and what they mean, shortly. Function specs are not the same thing as functions. You cannot, in general, apply a function spec to arguments. The time to use a function spec is when you want to do something to the function, such as define it, look at its definition, or compile it.
Some kinds of functions remember their own names, and some don’t. The "name" remembered by a function can be any kind of function spec, although it is usually a symbol. In the examples of functions in the previous section, the one starting with the symbol named-lambda, the one whose printed representation included dtp-fef-pointer, and the dtp-u-entry remembered names (the function specs foo, append, and last respectively). The others didn’t remember their names.
To define a function spec means to make that function spec remember a given function. This is done with the fdefine function; you give fdefine a function spec and a function, and fdefine remembers the function in the place specified by the function spec. The function associated with a function spec is called the definition of the function spec. A single function can be the definition of more than one function spec at the same time, or of no function specs.
To define a function means to create a new function, and define a given function spec as that new function. This is what the defun special form does. Several other special forms, such as defmethod (defmethod-fun) and defselect (defselect-fun) do this too.
These special forms that define functions usually take a function spec, create a function whose name is that function spec, and then define that function spec to be the newly-created function. Most function definitions are done this way, and so usually if you go to a function spec and see what function is there, the function’s name will be the same as the function spec. However, if you define a function named foo with defun, and then define the symbol bar to be this same function, the name of the function is unaffected; both foo and bar are defined to be the same function, and the name of that function is foo, not bar.
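The foo and bar situation can be sketched as follows. This assumes that fsymeval (see fsymeval-fun) returns the contents of a symbol's function cell; the body of foo is borrowed from the examples of functions earlier in this chapter.

```lisp
(defun foo (x)
  (car (last x)))                  ; create a function named foo

(fdefine 'bar (fsymeval 'foo))     ; define bar as that same function

(foo '(a b c))                     ; => c
(bar '(a b c))                     ; => c, the same function, still named foo
```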
A function spec’s definition in general consists of a basic definition surrounded by encapsulations. Both the basic definition and the encapsulations are functions, but of recognizably different kinds. What defun creates is a basic definition, and usually that is all there is. Encapsulations are made by function-altering functions such as trace and advise. When the function is called, the entire definition, which includes the tracing and advice, is used. If the function is "redefined" with defun, only the basic definition is changed; the encapsulations are left in place. See the section on encapsulations, encapsulate.
A function spec is a Lisp object of one of the following types:
a symbol
The function is remembered in the function cell of the symbol. See fsymeval-fun for an explanation of function cells and the primitive functions to manipulate them.
(:property symbol property)
The function is remembered on the property list of the symbol; doing (get symbol property) would return the function. Storing functions on property lists is a frequently-used technique for dispatching (that is, deciding at run-time which function to call, on the basis of input data).
(:method flavor-name message)
(:method flavor-name method-type message)
The function is remembered inside internal data structures of the flavor system. See the chapter on flavors (flavor) for details.
(:handler flavor-name message)
This is a name for the function actually called when a message message is sent to an instance of the flavor flavor-name. The difference between :handler and :method is that the handler may be a method inherited from some other flavor or a combined method automatically written by the flavor system. Methods are what you define in source files; handlers are not. Note that redefining or encapsulating a handler affects only the named flavor, not any other flavors built out of it. Thus :handler function specs are often used with trace (see trace-fun) and advise (see advise-fun).
(:location pointer)
The function is stored in the cdr of pointer, which may be a locative or a list. This is for pointing at an arbitrary place which there is no other way to describe. This form of function spec isn’t useful in defun (and related special forms) because the reader has no printed representation for locative pointers and always creates new lists; these function specs are intended for programs that manipulate functions (see programs-that-manipulate-functions).
(:within within-function function-to-affect)
This refers to the meaning of the symbol function-to-affect, but only where it occurs in the text of the definition of within-function. If you define this function spec as anything but the symbol function-to-affect itself, then that symbol is replaced throughout the definition of within-function by a new symbol which is then defined as you specify. See the section on function encapsulation (encapsulate) for more information.
(:internal function-spec number)
Some Lisp functions contain internal functions, created by (function (lambda ...)) forms. These internal functions need names when compiled, but they do not have symbols as names; instead they are named by :internal function-specs. function-spec is the containing function. number is a sequence number; the first internal function the compiler comes across in a given function will be numbered 0, the next 1, etc. Internal functions are remembered inside the FEF of their containing function.
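As an illustration of the numbering, consider the hypothetical function below, which contains two internal functions. The names in the comments are what the rule above suggests the compiler would assign, assuming it comes across the internal functions in textual order.

```lisp
;; adjust-both is a hypothetical function used only in this sketch.
(defun adjust-both (list-a list-b)
  (append (mapcar (function (lambda (x) (1+ x))) list-a)    ; (:internal adjust-both 0)
          (mapcar (function (lambda (x) (1- x))) list-b)))  ; (:internal adjust-both 1)
```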
Here is an example of the use of a function spec which is not a symbol:
(defun (:property foo bar-maker) (thing &optional kind)
  (set-the 'bar thing (make-bar 'foo thing kind)))
This puts a function on foo’s bar-maker property. Now you can say
(funcall (get 'foo 'bar-maker) 'baz)
Unlike the other kinds of function spec, a symbol can be used as a function. If you apply a symbol to arguments, the symbol’s function definition is used instead. If the definition of the first symbol is another symbol, the definition of the second symbol is used, and so on, any number of times. But this is an exception; in general, you can’t apply function specs to arguments.
A keyword symbol which identifies a kind of function spec (that is, one which may appear in the car of a list which is a function spec) is marked by a sys:function-spec-handler property whose value is a function which implements the various manipulations on function specs of that type. The interface to this function is internal and not documented in this manual.
For compatibility with Maclisp, the function-defining special forms defun, macro, and defselect (and other defining forms built out of them, such as defunp and defmacro) will also accept a list
(symbol property)
as a function name. This is translated into
(:property symbol property)
symbol must not be one of the keyword symbols which identify a function spec, since that would be ambiguous.
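For instance, the two definitions below should be equivalent. The symbol foo, the property bar-maker, and the simplified body are used only for illustration.

```lisp
;; Maclisp-style function name: a list of a symbol and a property.
(defun (foo bar-maker) (thing)
  (list 'bar thing))

;; The function spec it is translated into.
(defun (:property foo bar-maker) (thing)
  (list 'bar thing))

(funcall (get 'foo 'bar-maker) 'baz)   ; => (bar baz)
```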
defun is the usual way of defining a function which is part of a program. A defun form looks like:
(defun name lambda-list body...)
name is the function spec you wish to define as a function. The lambda-list is a list of the names to give to the arguments of the function. Actually, it is a little more general than that; it can contain lambda-list keywords such as &optional and &rest. (These keywords are explained in lambda-list and other keywords are explained in lambda-list-keywords.) See additional-defun-explanation for some additional syntactic features of defun.
defun creates a list which looks like
(named-lambda name lambda-list body...)
and puts it in the function cell of name. name is now defined as a function and can be called by other forms.
Examples:
(defun addone (x)
  (1+ x))
(defun foo (a &optional (b 5) c &rest e &aux j)
  (setq j (+ (addone a) b))
  (cond ((not (null c))
         (cons j e))
        (t j)))
addone is a function which expects a number as an argument, and returns a number one larger. foo is a complicated function which takes one required argument, two optional arguments, and any number of additional arguments which are given to the function as a list named e.
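A few sample calls may help show how the arguments are distributed. The results below are worked out by hand from the definitions of addone and foo given above, which this sketch assumes have already been evaluated.

```lisp
(foo 1)            ; => 7, b defaults to 5 and c to nil
(foo 1 2)          ; => 4, c is nil so j is returned
(foo 1 2 3)        ; => (4), c is non-nil and e is nil
(foo 1 2 3 4 5)    ; => (4 4 5), e is the list (4 5)
```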
A declaration (a list starting with declare) can appear as the first element of the body. It is equivalent to a local-declare (see local-declare-fun) surrounding the entire defun form. For example,
(defun foo (x)
  (declare (special x))
  (bar))                ;bar uses x free.
is equivalent to and preferable to
(local-declare ((special x)) (defun foo (x) (bar)))
(It is preferable because the editor expects the open parenthesis of a top-level function definition to be the first character on a line, which isn’t possible in the second form without incorrect indentation.)
A documentation string can also appear as the first element of the body
(following the declaration, if there is one). (It shouldn’t be the only
thing in the body; otherwise it is the value returned by the function
and so is not interpreted as documentation. A string as an element of a
body other than the last element is only evaluated for side-effect, and
since evaluation of strings has no side effects, they aren’t useful in
this position to do any computation, so they are interpreted as
documentation.) This documentation string becomes part of the
function’s debugging info and can be obtained with the function
documentation (see documentation-fun). The first line of the
string should be a complete sentence which makes sense read by itself,
since there are two editor commands to get at the documentation, one of
which is "brief" and prints only the first line. Example:
(defun my-append (&rest lists)
  "Like append but copies all the lists.
This is like the Lisp function append, except that
append copies all lists except the last, whereas
this function copies all of its arguments
including the last one."
  ...)
Usually when a function uses prog, the prog form is
the entire body of the function; the definition of such a function
looks like (defun name arglist (prog varlist ...)).
Although the use of prog is generally discouraged, prog fans
may want to use this special form.
For convenience, the defunp macro can be used to produce such definitions.
A defunp form such as
(defunp fctn (args) form1 form2 ... formn)
expands into
(defun fctn (args) (prog () form1 form2 ... (return formn)))
You can think of defunp as being like defun except that you can
return out of the middle of the function’s body.
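As a rough illustration of that expansion rule, the sketch below shows a hypothetical function sign-name (not from the manual) written out in the expanded defun/prog form, with the defunp form it would correspond to given in a comment. Only the expanded form is code here, so the sketch does not depend on defunp being available.

```lisp
;; With defunp this could be written as:
;;   (defunp sign-name (n)
;;     (cond ((minusp n) (return 'negative)))
;;     (cond ((zerop n) (return 'zero)))
;;     'positive)
;; Following the expansion rule, the body is wrapped in a prog with an
;; empty variable list and the last form is wrapped in return:
(defun sign-name (n)
  (prog ()
    (cond ((minusp n) (return 'negative)))
    (cond ((zerop n) (return 'zero)))
    (return 'positive)))
```

The first two forms return out of the middle of the body; the final return supplies the value when neither test succeeds.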
For more information on defining functions, and other ways of doing so, see function-defining.
Here is a list of the various things a user (as opposed to a program) is likely to want to do to a function. In all cases, you specify a function spec to say where to find the function.
To print out the definition of the function spec with indentation to
make it legible, use grindef (see grindef-fun). This works only
for interpreted functions. If the definition is a compiled function, it
can’t be printed out as Lisp code, but its compiled code can be printed
by the disassemble function (see disassemble-fun).
To find out about how to call the function, you can ask to see its
documentation, or its argument names. (The argument names are usually
chosen to have mnemonic significance for the caller). Use arglist
(arglist-fun) to see the argument names and documentation
(documentation-fun) to see the documentation string.
There are also editor commands for doing these things: the CTRL/SHIFT/D
and META/SHIFT/D
commands are for looking at a function’s
documentation, and CTRL/SHIFT/A
is for looking at an argument
list. CTRL/SHIFT/A
does not ask for the function name; it acts on
the function which is called by the innermost expression which the
cursor is inside. Usually this is the function which will be called
by the form you are in the process of writing.
You can see the function’s debugging info alist by means of the function
debugging-info
(see debugging-info-fun).
When you are debugging, you can use trace
(see trace-fun) to
obtain a printout or a break loop whenever the function is called. You
can use breakon
(see breakon-fun) to cause the error handler to
be entered whenever the function is called; from there, you can step
through further function calls and returns. You can customize the
definition of the function, either temporarily or permanently, using
advise
(see advise-fun).
There are many kinds of functions in Zetalisp. This section briefly describes each kind of function. Note that a function is also a piece of data and can be passed as an argument, returned, put in a list, and so forth.
Before we start classifying the functions, we’ll first discuss something
about how the evaluator works. As we said in the basic description of
evaluation on description-of-evaluation, when the evaluator is given
a list whose first element is a symbol, the form may be a function form,
a special form, or a macro form. If the definition of the symbol is a
function, then the function is just applied to the result of evaluating
the rest of the subforms. If the definition is a cons whose car is
macro, then it is a macro form; these are explained in macro.
What about special forms?
Conceptually, the evaluator knows specially about all special forms
(that’s why they’re called that). However, the Zetalisp
implementation actually uses the definition of symbols that name special
forms as places to hold pieces of the evaluator. The definitions of
such symbols as prog, do, and, and or actually hold Lisp
objects, which we will call special functions. Each of these
functions is the part of the Lisp interpreter that knows how to deal
with that special form. Normally you don’t have to know about this;
it’s just part of the hidden internals of how the evaluator works.
However, if you try to add encapsulations to a special form such as
and, knowing this will help you understand the behavior you will
get.
Special functions are written like regular functions except that the
keywords &quote and &eval (see lambda-list-keywords) are used
to make some of the arguments be "quoted" arguments. The evaluator
looks at the pattern in which arguments to the special function are
"quoted" or not, and it calls the special function in a special way: for
each regular argument, it passes the result of evaluating the
corresponding subform, but for each "quoted" argument, it passes the
subform itself without evaluating it first. For example, cond
works by having a special function that takes a "quoted" &rest argument;
when this function is called it is passed a list of cond clauses
as its argument.
If you apply or funcall a special function yourself, you have to understand
what the special form is going to do with its arguments; it is likely
to call eval
on parts of them. This is different from applying
a regular function, which is passed argument values rather than Lisp
expressions.
Defining your own special form, by using &quote yourself, can be
done; it is a way to extend the Lisp language. Macros are another way
of extending the Lisp language. It is preferable to implement language
extensions as macros rather than special forms, because macros directly
define a Lisp-to-Lisp translation and therefore can be understood by
both the interpreter and the compiler. Special forms, on the other
hand, only extend the interpreter. The compiler has to be modified in
an ad hoc way to understand each new special form so that code using
it can be compiled. Since all real programs are eventually compiled,
writing your own special functions is strongly discouraged.
(In fact, many of the special forms in Zetalisp are actually implemented as macros, rather than as special functions. They’re implemented this way because it’s easier to write a macro than to write both a new special function and a new ad hoc module in the compiler. However, they’re sometimes documented in this manual as special forms, rather than macros, because you should not in any way depend on the way they are implemented; they might get changed in the future to be special functions, if there was some reason to do so.)
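As a small sketch of the preferred approach, the following defines a language extension as a macro rather than as a special function. The names unless-zero and double-unless-zero are hypothetical and chosen only for illustration. Because the macro is a Lisp-to-Lisp translation into cond, both the interpreter and the compiler can handle it without modification.

```lisp
;; unless-zero is a hypothetical control construct: it evaluates the body
;; forms only when n is not zero, and otherwise returns nil.
(defmacro unless-zero (n &body body)
  `(cond ((zerop ,n) nil)
         (t ,@body)))

;; A caller of the macro; the form (* 2 n) is not evaluated when n is zero.
(defun double-unless-zero (n)
  (unless-zero n (* 2 n)))
```

No change to the evaluator or the compiler is needed for code like double-unless-zero to work, which is the advantage the text describes.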
There are four kinds of functions, classified by how they work.
First, there are interpreted functions: you define them with
defun, they are represented as list structure, and they are
interpreted by the Lisp evaluator.
Secondly, there are compiled functions: they are defined
by compile
or by loading a qfasl file, they are represented by a
special Lisp data type, and they are executed directly by the microcode.
Similar to compiled functions are microcode functions, which are written
in microcode (either by hand or by the micro-compiler) and executed directly
by the hardware.
Thirdly, there are various types of Lisp object which can be applied to
arguments, but when they are applied they dig up another function
somewhere and apply it instead. These include dtp-select-method,
closures, instances, and entities.
Finally, there are various types of Lisp object which, when used as functions, do something special related to the specific data type. These include arrays and stack-groups.
An interpreted function is a piece of list structure which represents a program according to the rules of the Lisp interpreter. Unlike other kinds of functions, an interpreted function can be printed out and read back in (it has a printed representation that the reader understands), can be pretty-printed (see grindef-fun), and can be opened up and examined with the usual functions for list-structure manipulation.
There are four kinds of interpreted functions: lambdas,
named-lambdas, substs, and named-substs. A lambda function is the
simplest kind. It is a list that looks like this:
(lambda lambda-list form1 form2...)
The symbol lambda
identifies this list as a lambda
function. lambda-list is a description of what arguments the
function takes; see lambda-list for details. The forms
make up the body of the function. When the function is called,
the argument variables are bound to the values of the arguments
as described by lambda-list, and then the forms in the body are
evaluated, one by one. The value of the function is the value of its
last form.
A named-lambda
is like a lambda
but contains an extra element in
which the system remembers the function’s name, documentation, and other
information. Having the function’s name there allows the error handler
and other tools to give the user more information. This is the kind of
function that defun
creates. A named-lambda
function looks
like this:
(named-lambda name lambda-list body forms...)
If the name slot contains a symbol, it is the function’s name.
Otherwise it is a list whose car is the name and whose cdr is the
function’s debugging information alist. See debugging-info,
debugging-info-fun. Note that the name need not be a symbol;
it can be any function spec. For example,
(defun (foo bar) (x) (car (reverse x)))
will give foo
a bar
property whose value is
(named-lambda ((:property foo bar)) (x) (car (reverse x)))
A subst
is just like a lambda
as far as the interpreter is concerned.
It is a list that looks like this:
(subst lambda-list form1 form2...)
The difference between a subst
and a lambda
is the way they are
handled by the compiler. A call to a normal function is compiled as a
closed subroutine; the compiler generates code to compute the
values of the arguments and then apply the function to those values. A
call to a subst
is compiled as an open subroutine; the compiler
incorporates the body forms of the subst
into the function being
compiled, substituting the argument forms for references to the
variables in the subst’s lambda-list. This is a simple-minded
but useful facility for open or in-line coded functions.
It is simple-minded because the argument forms can be evaluated multiple
times or out of order, and so the semantics of a subst may not be the
same in the interpreter and the compiler. substs are described more fully on
defsubst-fun, with the explanation of defsubst.
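The multiple-evaluation hazard can be sketched as follows. This is an emulation under stated assumptions, not the compiler's actual output: square-open is a hypothetical macro that substitutes its argument form textually, standing in for what open-coding a subst is described as doing, and square-closed is an ordinary function. All four names are invented for this illustration.

```lisp
;; Closed subroutine: the argument form is evaluated once and its value
;; is passed to the function.
(defun square-closed (x)
  (* x x))

;; Hypothetical stand-in for an open-coded subst: the argument form is
;; substituted for each reference to x.
(defmacro square-open (x)
  `(* ,x ,x))

(defun closed-result ()
  (let ((n 2))
    (square-closed (incf n))))   ; (incf n) runs once: 3 times 3

(defun open-result ()
  (let ((n 2))
    (square-open (incf n))))     ; becomes (* (incf n) (incf n)): 3 times 4
```

Argument forms with side effects are therefore the case to watch for when a function is turned into a subst.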
A named-subst
is the same as a subst
except that it has a name
just as a named-lambda
does. It looks like
(named-subst name lambda-list form1 form2 ...)
where name is interpreted the same way as in a named-lambda.
There are two kinds of compiled functions: macrocoded functions
and microcoded functions. The Lisp compiler converts lambda
and named-lambda
functions into macrocoded functions. A
macrocoded function’s printed representation looks like:
#<dtp-fef-pointer append 1424771>
This type of Lisp object is also called a "Function Entry Frame", or "FEF" for short. Like "car" and "cdr", the name is historical in origin and doesn’t really mean anything. The object contains Lisp Machine machine code that does the computation expressed by the function; it also contains a description of the arguments accepted, any constants required, the name, documentation, and other things. Unlike Maclisp "subr-objects", macrocoded functions are full-fledged objects and can be passed as arguments, stored in data structure, and applied to arguments.
The printed representation of a microcoded function looks like:
#<dtp-u-entry last 270>
Most microcoded functions are basic Lisp primitives or subprimitives written in Lisp Machine microcode. You can also convert your own macrocode functions into microcode functions in some circumstances, using the micro-compiler.
A closure is a kind of function which contains another function and a
set of special variable bindings. When the closure is applied, it
puts the bindings into effect and then applies the other function. When
that returns, the closure bindings are removed. Closures are made with
the function closure. See closure for more information.
Entities are slightly different from closures; see entity.
A select-method (dtp-select-method) is an a-list of symbols and
functions. When one is called the first argument is looked up in the
a-list to find the particular function to be called. This function is
applied to the rest of the arguments. The a-list may have a list of
symbols in place of a symbol, in which case the associated function is
called if the first argument is any of the symbols on the list. If
cdr of last of the a-list is not nil, it is a default
handler function, which gets called if the message key is not found in
the a-list. Select-methods can be created with the defselect
special form (see defselect-fun).
An instance is a message-receiving object which has some state and a table of message-handling functions (called methods). Refer to the chapter on flavors (flavor) for further information.
An array can be used as a function. The arguments to the array are
the indices and the value is the contents of the element of the
array. This works this way for Maclisp compatibility and is not recommended usage.
Use aref
(aref-fun) instead.
A stack group can be called as a function. This is one way to pass control to another stack group. See stack-group.
defun
is a special form which is put in a program to define a
function. defsubst
and macro
are others.
This section explains how these special forms work, how
they relate to the different kinds of functions, and how they interface to the
rest of the function-manipulation system.
Function-defining special forms typically take as arguments a function
spec and a description of the function to be made, usually in the form
of a list of argument names and some forms which constitute the body of
the function. They construct a function, give it the function spec as
its name, and define the function spec to be the new function.
Different special forms make different kinds of functions. defun
makes a named-lambda
function, and defsubst
makes a named-subst
function. macro
makes a macro; though the macro definition is not
really a function, it is like a function as far as definition handling
is concerned.
These special forms are used in writing programs because the function
names and bodies are constants. Programs that define functions usually
want to compute the functions and their names, so they use fdefine.
See fdefine-fun.
All of these function-defining special forms alter only the basic definition of the function spec. Encapsulations are preserved. See encapsulate.
The special forms only create interpreted functions. There is no special way of defining a compiled function. Compiled functions are made by compiling interpreted ones. The same special form which defines the interpreted function, when processed by the compiler, yields the compiled function. See compiler for details.
Note that the editor understands these and other "defining" special forms
(e.g., defmethod, defvar, defmacro, and defstruct)
to some extent, so that when you ask for the definition of something, the editor
can find it in its source file and show it to you. The general convention
is that anything which is used at top level (not inside a function)
and starts with def should be a special form for defining things
and should be understood by the editor. defprop is an exception.
The defun special form (and the defunp macro which expands into a
defun) is used for creating ordinary interpreted functions (see defun-fun).
For Maclisp compatibility, a type symbol may be inserted between
name and lambda-list in the defun
form. The following types
are understood:
expr
The same as no type.
fexpr
&quote and &rest are prefixed to the lambda list.
macro
A macro is defined instead of a normal function.
If lambda-list is a non-nil symbol instead of a list,
the function is recognized as a Maclisp lexpr and it is converted
in such a way that the arg, setarg, and listify functions
can be used to access its arguments (see arg-fun).
The defsubst special form is used to create substitutable functions. It
is used just like defun but produces a list starting with named-subst
instead of one starting with named-lambda. The named-subst function
acts just like the corresponding named-lambda function when applied,
but it can also be open-coded (incorporated into its callers) by the compiler.
See defsubst-fun for full information.
The macro special form is the primitive means of creating a macro.
It gives a function spec a definition which is a macro definition rather
than an actual function. A macro is not a function because it cannot be
applied, but it can appear as the car of a form to be evaluated.
Most macros are created with the more powerful defmacro special form.
See macro.
The defselect
special form defines a select-method function. See defselect-fun.
Unlike the above special forms, the next two (deff and def)
do not create new functions. They simply serve as hints to the editor
that a function is being stored into a function spec here, and therefore
if someone asks for the source code of the definition of that function spec,
this is the place to look for it.
If a function is created in some strange way, wrapping a def
special
form around the code that creates it informs the editor of the connection.
The form
(def function-spec form1 form2...)
simply evaluates the forms form1, form2, etc. It is assumed that these forms will create or obtain a function somehow, and make it the definition of function-spec.
Alternatively, you could put (def function-spec)
in
front of or anywhere near the forms which define the function. The
editor only uses it to tell which line to put the cursor on.
deff is a simplified version of def. It
evaluates the form definition-creator, which should produce a function,
and makes that function the definition of function-spec, which is not
evaluated. deff is used for giving a function spec a definition which is not obtainable
with the specific defining forms such as defun and macro.
For example,
(deff foo 'bar)
will make foo equivalent to bar, with an indirection so that if
bar changes foo will likewise change;
(deff foo (function bar))
copies the definition of bar into foo with no indirection, so that
further changes to bar will have no effect on foo.
This macro turns into nil, doing nothing. It exists for the sake of the
listing generation program, which uses it to declare names of special forms
which define objects (such as functions) that should cross-reference.
This function is used by defun and the compiler to convert
Maclisp-style lexpr, fexpr, and macro defuns to Zetalisp
definitions. x should be the cdr of a (defun ...) form.
defun-compatibility will return a corresponding (defun ...) or
(macro ...) form, in the usual Zetalisp format. You shouldn’t
ever need to call this yourself.
defselect
defines a function which is a select-method. This
function contains a table of subfunctions; when it is called, the first
argument, a symbol on the keyword package called the message name,
is looked up in the table to determine which subfunction to call. Each
subfunction can take a different number of arguments, and have a
different pattern of &optional
and &rest
arguments.
defselect
is useful for a variety of "dispatching" jobs. By analogy
with the more general message passing facilities described in flavor,
the subfunctions are sometimes called methods and the first argument
is sometimes called a message.
The special form looks like
(defselect (function-spec default-handler no-which-operations)
  (message-name (args...) body...)
  (message-name (args...) body...)
  ...)
function-spec is the name of the function to be defined.
default-handler is optional; it must be a symbol and is a function which
gets called if the select-method is called with an unknown message. If
default-handler is unsupplied or nil, then an error occurs if an unknown
message is sent. If no-which-operations is non-nil, the
:which-operations method which would normally be supplied automatically is
suppressed. The :which-operations method takes no arguments and returns a
list of all the message names in the defselect.
If function-spec is a symbol, and default-handler and no-which-operations
are not supplied, then the first subform of the defselect
may be just function-spec
by itself, not enclosed in a list.
The remaining subforms in a defselect
define methods. message-name is the
message name, or a list of several message names if several messages are to be handled
by the same subfunction. args is a lambda-list; it should not include the first
argument, which is the message name. body is the body of the
function.
A method subform can instead look like:
(message-name . symbol)
In this case, symbol is the name of a function which is to be called when the message-name message is received. It will be called with the same arguments as the select-method, including the message symbol itself.
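A minimal sketch of a defselect, assuming the syntax given above: shape-op and its message names are hypothetical, chosen only for illustration, and the form has not been checked on a Lisp Machine. The first subform is just the symbol, since no default handler is supplied.

```lisp
;; shape-op dispatches on its first argument, the message name.
(defselect shape-op
  ;; One message name, two arguments after the message.
  (:area (width height)
    (* width height))
  ;; Two message names handled by the same subfunction.
  ((:perimeter :circumference) (width height)
    (* 2 (+ width height))))

;; Sample calls; the message name is passed as the first argument.
(shape-op ':area 3 4)
(shape-op ':perimeter 3 4)
(shape-op ':which-operations)   ; supplied automatically by defselect
```

Because no default handler is given, sending shape-op a message name other than these would signal an error, as described above.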
This section documents all the keywords that may appear in the "lambda-list" (argument list) (see lambda-list) of a function, a macro, or a special form. Some of them are allowed everywhere, while others are only allowed in one of these contexts; those are so indicated.
The value of this variable is a list of all of the allowed "&" keywords. Some of these are obsolete and don’t do anything; the remaining ones are listed below.
&optional
Separates the required arguments of a function from the optional arguments. See lambda-list.
&rest
Separates the required and optional arguments of a function from the rest argument. There may be only one rest argument. See &rest for full information about rest arguments. See lambda-list.
&key
Separates the positional arguments and rest argument of a function from the keyword arguments. See lambda-list.
&allow-other-keys
In a function which accepts keyword arguments, says that keywords which are not recognized are allowed. They and the corresponding values are ignored, as far as keyword arguments are concerned, but they do become part of the rest argument, if there is one.
&aux
Separates the arguments of a function from the auxiliary variables.
Following &aux
you can put entries of the form
(variable initial-value-form)
or just variable if you want it initialized to nil
or don’t care what the initial
value is.
&special
Declares the following arguments and/or auxiliary variables to be special within the scope of this function.
&local
Turns off a preceding &special
for the variables which follow.
&functional
Preceding an argument, tells the compiler that the value of this argument will be
a function. When a caller of this function is compiled, if it passes a quoted
constant argument which looks like a function (a list beginning with the symbol
lambda) the compiler will know that it is intended to be a function rather
than a list that happens to start with that symbol, and will compile it.
&quote
Declares that the following arguments are not to be evaluated. This is how you create a special function. See the caveats about special forms, on special-form-caveat.
&eval
Turns off a preceding &quote for the arguments which follow.
&list-of
This is for macros defined by defmacro
only. Refer to &list-of.
&body
This is for macros defined by defmacro only. It is similar to &rest,
but declares to grindef and the code-formatting module of the editor that
the body forms of a special form follow and should be indented accordingly.
Refer to &body.
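The interaction of &rest, &key, and &allow-other-keys described in this list can be sketched with a small function. The name point-args and its keywords are hypothetical; the values in the comments follow from the descriptions above rather than from a session on a Lisp Machine.

```lisp
;; point-args accepts the keywords :x and :y, tolerates any others, and
;; also collects the whole argument list in the rest argument all.
(defun point-args (&rest all &key (x 0) (y 0) &allow-other-keys)
  (list x y all))

;; point-args-demo is a hypothetical helper collecting two sample calls.
(defun point-args-demo ()
  (list (point-args)                        ; (0 0 nil)
        (point-args ':y 2 ':color 'red)))   ; (0 2 (:y 2 :color red))
```

The unrecognized :color keyword is ignored as far as the keyword arguments are concerned, but it and its value still appear in the rest argument.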
fdefine function-spec definition &optional (carefully nil) (no-query nil)
This is the primitive which defun and everything else in the system
uses to change the definition of a function spec. If carefully is
non-nil, which it usually should be, then only the basic definition
is changed, the previous basic definition is saved if possible (see
undefun, undefun-fun), and any encapsulations of the function such
as tracing and advice are carried over from the old definition to the
new definition. carefully also causes the user to be queried if the
function spec is being redefined by a file different from the one that
defined it originally. However, this warning is suppressed if either the
argument no-query is non-nil, or if the global variable
inhibit-fdefine-warnings is t.
If fdefine
is called while a file is being loaded, it records what
file the function definition came from so that the editor can find the
source code.
If function-spec was already defined as a
function, and carefully is non-nil
, the function-spec’s
:previous-definition
property is used to save the previous
definition. If the previous definition is an interpreted function, it
is also saved on the :previous-expr-definition
property. These
properties are used by the undefun
function (undefun-fun), which
restores the previous definition, and the uncompile
function
(uncompile-fun), which restores the previous interpreted definition.
The properties for different kinds of function specs are stored in
different places; when a function spec is a symbol its properties are
stored on the symbol’s property list.
defun
and the other function-defining special forms all supply t
for carefully and nil
or nothing for no-query. Operations
which construct encapsulations, such as trace
, are the only ones
which use nil
for carefully.
This variable is normally nil. Setting it to t prevents
fdefine from warning you and asking about questionable function definitions such as
a function being redefined by a different file than the one that defined it originally,
or a symbol that belongs to one package being defined by a file that
belongs to a different package. Setting it to :just-warn allows
the warnings to be printed out, but prevents the queries from happening;
it assumes that your answer is "yes", i.e., that it is all right to
redefine the function.
While loading a file, this is the generic-pathname for the file.
The rest of the time it is nil
. fdefine
uses this to
remember what file defines each function.
This function is obsolete. It is equivalent to
(fdefine symbol definition t force-flag)
This returns t
if function-spec has a definition, or nil
if
it does not.
This returns function-spec’s definition. If it has none, an error occurs.
This returns a locative pointing at the cell which contains
function-spec’s definition. For some kinds of function specs,
though not for symbols, this can cause data structure to be created to
hold a definition. For example, if function-spec is of the
:property
kind, then an entry may have to be added to the property
list if it isn’t already there. In practice, you should write (locf
(fdefinition function-spec))
instead of calling this function
explicitly.
Removes the definition of function-spec. For symbols this
is equivalent to fmakunbound.
If the function is encapsulated, fundefine removes both the
basic definition and the encapsulations. Some types of function specs
(:location for example) do not implement fundefine.
fundefine
on a :within
function spec removes the replacement
of function-to-affect, putting the definition of within-function
back to its normal state. fundefine
on a :method
function spec
removes the method completely, so that future messages will be handled
by some other method (see the flavor chapter).
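The definition primitives above might be used together along the following lines. This is an unverified sketch: triple is a hypothetical symbol, and the definition is given as list structure, which is how interpreted functions are represented.

```lisp
;; Give the symbol triple a basic definition, passing t for carefully
;; as defun is described as doing.
(fdefine 'triple '(lambda (x) (* 3 x)) t)

;; triple can now be called like any other function.
(triple 4)

;; Retrieve the current definition of the function spec.
(fdefinition 'triple)

;; Remove the definition again, along with any encapsulations.
(fundefine 'triple)
```

In a program whose function names and bodies are constants, defun remains the usual choice; fdefine is meant for code that computes the name or the definition.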
Returns the value of the indicator property of function-spec,
or nil
if it doesn’t have such a property.
Gives function-spec an indicator property whose value is value.
If function-spec has a saved previous basic definition, this
interchanges the current and previous basic definitions,
leaving the encapsulations alone.
This undoes the effect of a defun, compile, etc.
See also uncompile (uncompile-fun).
These functions take a function as argument and return information about that function. Some also accept a function spec and operate on its definition. The others do not accept function specs in general but do accept a symbol as standing for its definition. (Note that a symbol is a function as well as a function spec).
Given a function or a function spec, this finds its documentation
string, which is stored in various different places depending on the
kind of function. If there is no documentation, nil
is returned.
This returns the debugging info alist of function, or nil
if it has none.
arglist is given a function or a function spec, and returns its best
guess at the nature of the function’s lambda-list. It can also
return a second value which is a list of descriptive names for the
values returned by the function.
If function is a symbol, arglist of its function definition is used.
If the function is an actual lambda-expression,
its cadr, the lambda-list, is returned. But if function
is compiled, arglist attempts to reconstruct the lambda-list of the original
definition, using whatever debugging information was saved by the compiler.
Sometimes the actual names of the bound variables are not available, and
arglist uses the symbol si:*unknown* for these. Also, sometimes
the initialization of an optional parameter is too complicated
for arglist to reconstruct; for these it returns the symbol
si:*hairy*.
Some functions’ real argument lists are not what would be most
descriptive to a user. A function may take a &rest argument for
technical reasons even though there are standard meanings for the first
element of that argument. For such cases, the definition of the
function can specify, with a local declaration, a value to be returned
when the user asks about the argument list. Example:
(defun foo (&rest rest-arg)
  (declare (arglist x y &rest z))
  .....)
real-flag allows the caller of arglist to say that
the real argument list should be used even if a declared argument list exists.
Note that while normally declares are only for the compiler’s benefit,
this kind of declare affects all functions, including interpreted functions.
arglist
cannot be relied upon to return the exactly
correct answer, since some of the information may have been lost.
Programs interested in how many and what kind of arguments there are
should use args-info
instead. In general arglist
is to be used for documentation purposes, not for reconstructing
the original source code of the function.
When a function returns multiple values, it is useful to give
the values names so that the caller can be reminded which value is
which. By means of a return-list
declaration in the function’s
definition, entirely analogous to the arglist
declaration above,
you can specify a list of mnemonic names for the returned values. This
list will be returned by arglist
as the second value.
(arglist 'arglist)
=> (function &optional real-flag) and (arglist return-list)
args-info
returns a fixnum called the "numeric argument descriptor"
of the function, which describes the way the function takes arguments.
This descriptor is used internally by the microcode, the evaluator, and
the compiler. function can be a function or a function spec.
The information is stored in various bits and byte fields in the
fixnum, which are referenced by the symbolic names shown below.
By the usual Lisp Machine convention, those starting with a single "%"
are bit-masks (meant to be logand’ed or bit-test’ed with the number), and those
starting with "%%" are byte descriptors (meant to be used with ldb
or ldb-test).
Here are the fields:
%%arg-desc-min-args
This is the minimum number of arguments which may be passed to this function, i.e., the number of "required" parameters.
%%arg-desc-max-args
This is the maximum number of arguments which may be passed to this function, i.e., the sum of the number of "required" parameters and the number of "optional" parameters. If there is a rest argument, this is not really the maximum number of arguments which may be passed; an arbitrarily-large number of arguments is permitted, subject to limitations on the maximum size of a stack frame (about 200 words).
%arg-desc-evaled-rest
If this bit is set, the function has a "rest" argument, and it is not "quoted".
%arg-desc-quoted-rest
If this bit is set, the function has a "rest" argument, and it is "quoted". Most special forms have this bit.
%arg-desc-fef-quote-hair
If this bit is set, there are some quoted arguments other than the "rest" argument (if any), and the pattern of quoting is too complicated to describe here. The ADL (Argument Description List) in the FEF should be consulted. This is only for special forms.
%arg-desc-interpreted
This function is not a compiled-code object, and a numeric argument descriptor cannot be computed. Usually args-info will not return this bit, although %args-info will.
%arg-desc-fef-bind-hair
There is argument initialization, or something else too complicated to describe here. The ADL (Argument Description List) in the FEF should be consulted.
Note that %arg-desc-quoted-rest and %arg-desc-evaled-rest cannot both be set.
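The fields above might be decoded as in the following sketch. describe-arg-counts is a hypothetical helper written for this illustration, not a system function; it uses only args-info, ldb, and bit-test with the symbolic names listed above.

```lisp
;; Hypothetical helper: print a summary of how FUNCTION takes
;; its arguments, then return the raw descriptor.
(defun describe-arg-counts (function)
  (let ((info (args-info function)))
    ;; "%%" names are byte descriptors, so they are used with LDB.
    (format t "~&Minimum args: ~D" (ldb %%arg-desc-min-args info))
    (format t "~&Maximum args: ~D" (ldb %%arg-desc-max-args info))
    ;; "%" names are bit-masks, so they are used with BIT-TEST.
    (cond ((bit-test %arg-desc-evaled-rest info)
           (format t "~&Has an evaluated rest argument"))
          ((bit-test %arg-desc-quoted-rest info)
           (format t "~&Has a quoted rest argument")))
    info))
```

For a function defined with two required parameters and one optional parameter, this should print a minimum of 2 and a maximum of 3.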
%args-info is an internal function; it is like args-info
but does not work for interpreted functions. Also, function
must be a function, not a function spec. It exists because it has
to be in the microcode anyway, for apply and the basic
function-calling mechanism.
The definition of a function spec actually has two parts: the basic
definition, and encapsulations. The basic definition is what
functions like defun create, and encapsulations are additions made by
trace or advise to the basic definition. The purpose of making
the encapsulation a separate object is to keep track of what was made by
defun and what was made by trace. If defun is done a second
time, it replaces the old basic definition with a new one while leaving
the encapsulations alone.
Only advanced users should ever need to use encapsulations directly via
the primitives explained in this section. The most common things to do
with encapsulations are provided as higher-level, easier-to-use features:
trace (see trace-fun) and advise (see advise-fun).
The way the basic definition and the encapsulations are defined is that the actual definition of the function spec is the outermost encapsulation; this contains the next encapsulation, and so on. The innermost encapsulation contains the basic definition. The way this containing is done is as follows. An encapsulation is actually a function whose debugging info alist contains an element of the form
(si:encapsulated-definition uninterned-symbol encapsulation-type)
The presence of such an element in the debugging info alist
is how you recognize a function to be an encapsulation. An encapsulation
is usually an interpreted function (a list starting with named-lambda)
but it can be a compiled function also, if the application which
created it wants to compile it.
uninterned-symbol’s function definition is the thing that the encapsulation contains, usually the basic definition of the function spec. Or it can be another encapsulation, which has in it another debugging info item containing another uninterned symbol. Eventually you get to a function which is not an encapsulation; it does not have the sort of debugging info item which encapsulations all have. That function is the basic definition of the function spec.
Literally speaking, the definition of the function spec is the
outermost encapsulation, period. The basic definition is not the
definition. If you are asking for the definition of the function spec
because you want to apply it, the outermost encapsulation is exactly
what you want. But the basic definition can be found mechanically
from the definition, by following the debugging info alists. So it
makes sense to think of it as a part of the definition. In regard to
the function-defining special forms such as defun, it is
convenient to think of the encapsulations as connecting between the
function spec and its basic definition.
An encapsulation is created with the macro si:encapsulate.
A call to si:encapsulate looks like
(si:encapsulate function-spec outer-function type body-form extra-debugging-info)
All the subforms of this macro are evaluated. In fact, the macro could almost be replaced with an ordinary function, except for the way body-form is handled.
function-spec evaluates to the function spec whose definition the
new encapsulation should become. outer-function is another function
spec, which should often be the same one. Its only purpose is to be
used in any error messages from si:encapsulate.
type evaluates to a symbol which identifies the purpose of the
encapsulation; it says what the application is. For example, it could
be advise or trace. The list of possible types is defined by
the system because encapsulations are supposed to be kept in an order
according to their type (see si:encapsulation-standard-order,
si:encapsulation-standard-order-var). type should have an
si:encapsulation-grind-function property which tells grindef what to
do with an encapsulation of this type.
body-form is a form which evaluates to the body of the
encapsulation-definition, the code to be executed when it is called.
Backquote is typically used for this expression; see backquote.
si:encapsulate is a macro because, while body-form is
being evaluated, the variable si:encapsulated-function is bound to a
list of the form (function uninterned-symbol), referring to the
uninterned symbol used to hold the prior definition of
function-spec. If si:encapsulate were a function, body-form
would just get evaluated normally by the evaluator before si:encapsulate
ever got invoked, and so there would be no opportunity to bind
si:encapsulated-function.
The form body-form should contain (apply ,si:encapsulated-function arglist)
somewhere if the encapsulation is
to live up to its name and truly serve to encapsulate the original
definition. (The variable arglist is bound by some of the code
which the si:encapsulate macro produces automatically. When the
body of the encapsulation is run, arglist’s value will be the list of
the arguments which the encapsulation received.)
extra-debugging-info evaluates to a list of extra items to put into
the debugging info alist of the encapsulation function (besides the one
starting with si:encapsulated-definition which every encapsulation
must have). Some applications find this useful for recording
information about the encapsulation for their own later use.
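To show how these arguments fit together, here is a minimal sketch of an encapsulation that counts calls to a function. The type call-counter, the variable *foo-call-count*, and the function foo are all hypothetical names invented for this example. The sketch assumes that a new type may be registered by appending it to si:encapsulation-standard-order, and it omits the si:encapsulation-grind-function property that a real application should supply.

```lisp
;; Hypothetical example: count the calls made to FOO.
(defvar *foo-call-count* 0)

(defun foo (x) (* x 2))           ; stand-in basic definition

;; Register the hypothetical type.  It is appended last, so
;; encapsulations of this type are kept outermost.
(setq si:encapsulation-standard-order
      (append si:encapsulation-standard-order '(call-counter)))

;; Backquote builds the body; the comma inserts the value that
;; SI:ENCAPSULATED-FUNCTION has while body-form is evaluated.
(si:encapsulate 'foo 'foo 'call-counter
  `(progn (setq *foo-call-count* (1+ *foo-call-count*))
          (apply ,si:encapsulated-function arglist)))
```

After this, a call such as (foo 3) should still return 6, since the body applies the prior definition to the original arguments, while also incrementing the counter.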
When a special function is encapsulated, the encapsulation is itself a
special function with the same argument quoting pattern. (Not all quoting
patterns can be handled; if a particular special form’s quoting pattern
cannot be handled, si:encapsulate signals an error.) Therefore,
when the outermost encapsulation is started, each argument has been
evaluated or not as appropriate. Because each encapsulation calls the
prior definition with apply, no further evaluation takes place, and the
basic definition of the special form also finds the arguments evaluated
or not as appropriate. The basic definition may call eval on some
of these arguments or parts of them; the encapsulations should not.
Macros cannot be encapsulated, but their expander functions can be; if
the definition of function-spec is a macro, then si:encapsulate
automatically encapsulates the expander function instead. In this case,
the definition of the uninterned symbol is the original macro
definition, not just the original expander function.
It would not work for the encapsulation to apply the macro definition.
So during the evaluation of body-form, si:encapsulated-function is bound
to the form (cdr (function uninterned-symbol)), which extracts the
expander function from the prior definition of the macro.
Because only the expander function is actually encapsulated, the encapsulation does not see the evaluation or compilation of the expansion itself. The value returned by the encapsulation is the expansion of the macro call, not the value computed by the expansion.
It is possible for one function to have multiple encapsulations, created by different subsystems. In this case, the order of encapsulations is independent of the order in which they were made. It depends instead on their types. All possible encapsulation types have a total order and a new encapsulation is put in the right place among the existing encapsulations according to its type and their types.
The value of the variable si:encapsulation-standard-order is a list of the allowed encapsulation types, in the order that the encapsulations are supposed to be kept in (innermost encapsulations first). If you want to add new kinds of encapsulations, you should add another symbol to this list. Initially its value is
(advise breakon trace si:rename-within)
advise encapsulations are used to hold advice (see advise-fun).
breakon encapsulations are used for implementing breakon (see breakon-fun).
trace encapsulations are used for implementing tracing (see trace-fun).
si:rename-within encapsulations are used to record the fact that
function specs of the form (:within within-function altered-function)
have been defined. The encapsulation goes on within-function (see
rename-within-section for more information).
Every symbol used as an encapsulation type must be on the list
si:encapsulation-standard-order. In addition, it should have an
si:encapsulation-grind-function property whose value is a function that
grindef will call to process encapsulations of that type. This
function need not take care of printing the encapsulated function
because grindef will do that itself. But it should print any
information about the encapsulation itself which the user ought to see.
Refer to the code for the grind function for advise to see how to
write one.
To find the right place in the ordering to insert a new encapsulation,
it is necessary to parse existing ones. This is done with the function
si:unencapsulate-function-spec.
This takes one function spec and returns another. If the original function spec is undefined, or has only a basic definition (that is, its definition is not an encapsulation), then the original function spec is returned unchanged.
If the definition of function-spec is an encapsulation, then its debugging info is examined to find the uninterned symbol which holds the encapsulated definition, and also the encapsulation type. If the encapsulation is of a type which is to be skipped over, the uninterned symbol replaces the original function spec and the process repeats.
The value returned is the uninterned symbol from inside the last encapsulation skipped. This uninterned symbol is the first one which does not have a definition which is an encapsulation that should be skipped. Or the value can be function-spec if function-spec’s definition is not an encapsulation which should be skipped.
The types of encapsulations to be skipped over are specified by
encapsulation-types. This can be a list of the types to be
skipped, or nil meaning skip all encapsulations (this is the default).
Skipping all encapsulations means returning the uninterned symbol which
holds the basic definition of function-spec. That is, the definition of
the function spec returned is the basic definition of the function spec
supplied. Thus,
(fdefinition (si:unencapsulate-function-spec 'foo))
returns the basic definition of foo, and
(fdefine (si:unencapsulate-function-spec 'foo) 'bar)
sets the basic definition (just like using fdefine with carefully
supplied as t).
encapsulation-types can also be a symbol, which should be an
encapsulation type; then we skip all types which are supposed to come
outside of the specified type. For example, if
encapsulation-types is trace, then we skip all types of
encapsulations that come outside of trace encapsulations, but we
do not skip trace encapsulations themselves. The result is a
function spec which is where the trace encapsulation ought to
be, if there is one. Either the definition of this function spec is a
trace encapsulation, or there is no trace encapsulation
anywhere in the definition of function-spec, and this function
spec is where it would belong if there were one. For example,
(let ((tem (si:unencapsulate-function-spec spec 'trace)))
  (and (eq tem (si:unencapsulate-function-spec tem '(trace)))
       (si:encapsulate tem spec 'trace `(...body...))))
finds the place where a trace encapsulation ought to go, and
makes one unless there is already one there.
(let ((tem (si:unencapsulate-function-spec spec 'trace)))
  (fdefine tem (fdefinition (si:unencapsulate-function-spec tem '(trace)))))
eliminates any trace encapsulation by replacing it by whatever it
encapsulates. (If there is no trace encapsulation, this code
changes nothing.)
These examples show how a subsystem can insert its own type of
encapsulation in the proper sequence without knowing the names of any
other types of encapsulations. Only the variable
si:encapsulation-standard-order, which is used by
si:unencapsulate-function-spec, knows the order.
One special kind of encapsulation is the type si:rename-within. This
encapsulation goes around a definition in which renamings of functions
have been done.
How is this used?
If you define, advise, or trace (:within foo bar), then bar
gets renamed to altered-bar-within-foo wherever it is called from
foo, and foo gets a si:rename-within encapsulation to
record the fact. The purpose of the encapsulation is to enable
various parts of the system to do what seems natural to the user.
For example, grindef (see grindef-fun) notices the
encapsulation, and so knows to print bar instead of
altered-bar-within-foo, when grinding the definition of foo.
Also, if you redefine foo, or trace or advise it, the new
definition gets the same renaming done (bar replaced by
altered-bar-within-foo). To make this work, everyone who alters
part of a function definition should pass the new part of the
definition through the function si:rename-within-new-definition-maybe.
si:rename-within-new-definition-maybe takes new-structure, which is
going to become a part of the definition of function-spec, and performs
on it the replacements described by the si:rename-within encapsulation
in the definition of function-spec, if there is one. The altered
(copied) list structure is returned.
It is not necessary to call this function yourself when you replace
the basic definition because fdefine with carefully
supplied as t does it for you. si:encapsulate does this
to the body of the new encapsulation. So you only need to call
si:rename-within-new-definition-maybe yourself if you are rplac’ing
part of the definition.
For proper results, function-spec must be the outer-level function
spec. That is, the value returned by si:unencapsulate-function-spec
is not the right thing to use. It will have had one or more
encapsulations stripped off, including the si:rename-within
encapsulation if any, and so no renamings will be done.
A closure is a type of Lisp functional object useful
for implementing certain advanced access and control structures.
Closures give you more explicit control over the
environment, by allowing you to save the environment created
by the entering of a dynamic contour (i.e., a lambda, do, prog,
progv, let, or any of several other special forms), and then use that
environment elsewhere, even after the contour has been exited.
There is a view of lambda-binding which we will use in this
section because it makes it easier to explain what closures do. In
this view, when a variable is bound, a new value cell is created for it.
The old value cell is saved away somewhere and is inaccessible. Any
references to the variable will get the contents of the new value cell,
and any setq’s will change the contents of the new value cell.
When the binding is undone, the new value cell goes away, and the old
value cell, along with its contents, is restored.
For example, consider the following sequence of Lisp forms:
(setq a 3)
(let ((a 10))
  (print (+ a 6)))
(print a)
Initially there is a value cell for a, and the setq form makes
the contents of that value cell be 3. Then the let form is evaluated.
a is bound to 10: the old value cell, which still contains a 3, is
saved away, and a new value cell is created with 10 as its contents.
The reference to a inside the let evaluates to the current binding
of a, which is the contents of its current value cell, namely 10.
So 16 is printed. Then the binding is undone, discarding
the new value cell, and restoring the old value cell which still
contains a 3. The final print prints out a 3.
The form (closure var-list function), where
var-list is a list of variables and function is any function,
creates and returns a closure. When this closure is applied to some
arguments, all of the value cells of the variables on var-list are
saved away, and the value cells that those variables had at the time
closure was called (that is, at the time the closure was created)
are made to be the value cells of the symbols. Then function is
applied to the arguments. (This paragraph is somewhat complex, but it
completely describes the operation of closures; if you don’t understand
it, come back and read it again after reading the next two paragraphs.)
Here is another, lower level explanation. The closure object stores several things inside of it. First, it saves the function. Secondly, for each variable in var-list, it remembers what that variable’s value cell was when the closure was created. Then when the closure is called as a function, it first temporarily restores the value cells it has remembered inside the closure, and then applies function to the same arguments to which the closure itself was applied. When the function returns, the value cells are restored to be as they were before the closure was called.
Now, if we evaluate the form
(setq a (let ((x 3)) (closure '(x) 'frob)))
what happens is that a new value cell is created for x, and its
contents is a fixnum 3. Then a closure is created, which remembers
the function frob, the symbol x, and that value cell.
Finally the old value cell of x is restored, and the closure is
returned. Notice that the new value cell is still around, because
it is still known about by the closure. When the closure is applied,
say by doing (funcall a 7), this value cell will be restored and the
value of x will be 3 again. If frob uses x as a free variable, it will
see 3 as the value.
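To make this concrete, here is a small sketch in which frob is given a definition. The text does not say what frob does, so the definition below (adding its argument to the free variable x) and the top-level value 100 are assumptions made only for illustration.

```lisp
(declare (special x))

;; Assumed definition: FROB uses X as a free variable.
(defun frob (y)
  (+ x y))

(setq x 100)                      ; top-level value cell of X

(setq a (let ((x 3))
          (closure '(x) 'frob)))  ; remembers the cell holding 3

(funcall a 7)                     ; FROB sees X = 3, so this is 10
(frob 7)                          ; called directly, X = 100: 107
```

The direct call is unaffected because the closure's value cell is only in effect while the closure itself is running.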
A closure can be made around any function, using any form
which evaluates to a function. The form could evaluate to a
lambda expression, as in '(lambda () x), or to a compiled function,
as would (function (lambda () x)). In the example above, the
form is 'frob and it evaluates to the symbol frob. A
symbol is also a good function. It is usually better to close around
a symbol which is the name of the desired function, so that the
closure points to the symbol. Then, if the symbol is redefined, the
closure will use the new definition. If you actually prefer that the
closure continue to use the old definition which was current when the
closure was made, then close around the definition of the symbol rather
than the symbol itself. In the above example, that would be done by
(closure '(x) (function frob))
Because of the way closures are implemented, the variables to be
closed over must not get turned into "local variables" by the compiler.
Therefore, all such variables must be declared special. This can be
done with an explicit declare (see declare-fun), with a special form
such as defvar (see defvar-fun), or with let-closed
(see let-closed-fun). In simple cases, a local-declare around the
binding will do the job. Usually the compiler can tell when a special
declaration is missing, but in the case of making a closure the compiler
detects this after already acting on the assumption that the variable is
local, by which time it is too late to fix things. The compiler will
warn you if this happens.
In Zetalisp’s implementation of closures,
lambda-binding never really allocates any storage to create new value
cells. Value cells are only created by the closure function
itself, when they are needed. Thus, implementors of large systems need
not worry about storage allocation overhead from this mechanism if they
are not using closures.
Zetalisp closures are not closures in the true sense, as they do not save the whole variable-binding environment; however, most of that environment is irrelevant, and the explicit declaration of which variables are to be closed allows the implementation to have high efficiency. They also allow the programmer to explicitly choose for each variable whether it is to be bound at the point of call or bound at the point of definition (i.e., creation of the closure), a choice which is not conveniently available in other languages. In addition the program is clearer because the intended effect of the closure is made manifest by listing the variables to be affected.
The implementation of closures (which it is not usually necessary for
you to understand) involves two kinds of value cells. Every symbol has an
internal value cell, which is where its value is normally stored.
When a variable is closed over by a closure, the variable gets an
external value cell to hold its value. The external value cells
behave according to the lambda-binding model used earlier in this
section. The value in the external value cell is found through the
usual access mechanisms (such as evaluating the symbol, calling
symeval, etc.), because the internal value cell is made to contain
an invisible pointer to the external value cell currently in effect.
A symbol will use such an invisible pointer whenever its current value
cell is a value cell that some closure is remembering; at other times,
there won’t be an invisible pointer, and the value will just reside in the
internal value cell.
One thing we can do with closures is to implement a generator, which
is a kind of function which is called successively to obtain successive
elements of a sequence.
We will implement a function make-list-generator, which takes a list,
and returns a generator which will return successive
elements of the list. When it gets to the end it should return nil.
The problem is that in between calls to the generator, the generator must somehow remember where it is up to in the list. Since all of its bindings are undone when it is exited, it cannot save this information in a bound variable. It could save it in a global variable, but the problem is that if we want to have more than one list generator at a time, they will all try to use the same global variable and get in each other’s way.
Here is how we can use closures to solve the problem:
(defun make-list-generator (l)
  (declare (special l))
  (closure '(l)
           (function (lambda ()
                       (prog1 (car l)
                              (setq l (cdr l)))))))
Now we can make as many list generators as we like; they won’t get
in each other’s way because each has its own (external) value cell for l.
Each of these value cells was created when the make-list-generator
function was entered, and the value cells are remembered by the closures.
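A short session with two generators shows that they advance independently. The variable names gen1 and gen2 are introduced only for this illustration; the definition of make-list-generator is repeated so that the example stands alone.

```lisp
(defun make-list-generator (l)
  (declare (special l))
  (closure '(l)
           (function (lambda ()
                       (prog1 (car l)
                              (setq l (cdr l)))))))

(setq gen1 (make-list-generator '(a b c)))
(setq gen2 (make-list-generator '(1 2)))

(funcall gen1)   ; => a
(funcall gen2)   ; => 1
(funcall gen1)   ; => b
(funcall gen2)   ; => 2
(funcall gen2)   ; => nil, since gen2 has reached the end
(funcall gen1)   ; => c, since gen1 is unaffected by gen2
```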
The following form uses closures to create an advanced accessing environment:
(declare (special a b))
(defun foo ()
  (setq a 5))
(defun bar ()
  (cons a b))
(let ((a 1)
      (b 1))
  (setq x (closure '(a b) 'foo))
  (setq y (closure '(a b) 'bar)))
When the let is entered, new value cells are created for the symbols
a and b, and two closures are created that both point to those
value cells. If we do (funcall x), the function foo will
be run, and it will change the contents of the remembered value cell
of a to 5. If we then do (funcall y), the function bar
will return (5 . 1). This shows that the value cell of a seen
by the closure y is the same value cell seen by the closure x. The
top-level value cell of a is unaffected.
closure creates and returns a closure of function over the variables in var-list. Note that all variables on var-list must be declared special if the function is to compile correctly.
To test whether an object is a closure, use the closurep predicate
(see closurep-fun). The typep function will return the symbol
closure if given a closure. (typep x 'closure) is equivalent
to (closurep x).
symeval-in-closure returns the binding of symbol in the environment of
closure; that is, it returns what you would get if you restored the value
cells known about by closure and then evaluated symbol.
This allows you to "look around inside" a closure.
If symbol is not closed over by closure, this is just like symeval.
set-in-closure sets the binding of symbol in the environment of closure
to x; that is, it does what would happen if you restored the value cells
known about by closure and then set symbol to x.
This allows you to change the contents of the value cells known about
by a closure.
If symbol is not closed over by closure, this is just like set.
locate-in-closure returns the location of the place in closure where
the saved value of symbol is stored. An equivalent form is
(locf (symeval-in-closure closure symbol)).
closure-alist returns an alist of (symbol . value) pairs describing the
bindings which the closure performs when it is called. This list is not
the same one that is actually stored in the closure; that one contains
pointers to value cells rather than symbols, and closure-alist
translates them back to symbols so you can understand them. As a result,
clobbering part of this list will not change the closure.
closure-function returns the closed function from closure. This is the
function which was the second argument to closure when the closure was
created.
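The following sketch exercises these functions on a list generator like the one defined earlier in this section. The variable gen and the list contents are illustrative only, and the definition of make-list-generator is repeated so that the example stands alone.

```lisp
(defun make-list-generator (l)
  (declare (special l))
  (closure '(l)
           (function (lambda ()
                       (prog1 (car l)
                              (setq l (cdr l)))))))

(setq gen (make-list-generator '(a b c)))

(funcall gen)                     ; => a
(symeval-in-closure gen 'l)       ; => (b c), the rest of the list

;; Replace the list the generator is walking through.
(set-in-closure gen 'l '(x y))
(funcall gen)                     ; => x

(closure-alist gen)               ; => ((l y)), that is ((l . (y)))
(closure-function gen)            ; the function that was closed over
```

Because closure-alist returns a translated copy, modifying its result would not change what gen returns next; set-in-closure is the way to do that.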
When using closures, it is very common to bind a set of variables with
initial values, and then make a closure over those variables. Furthermore
the variables must be declared as "special" for the compiler. let-closed
is a special form which does all of this. It is best described by example:
(let-closed ((a 5) b (c 'x))
(function (lambda () ...)))
macro-expands into
(local-declare ((special a b c))
(let ((a 5) b (c 'x))
(closure '(a b c)
(function (lambda () ...)))))
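As a concrete use of let-closed, here is a sketch of a counter generator. The names make-counter, n, c1, and c2 are hypothetical and chosen only for this example.

```lisp
;; Each call to MAKE-COUNTER binds N afresh and closes over it,
;; so every counter gets its own value cell.
(defun make-counter ()
  (let-closed ((n 0))
    (function (lambda ()
                (setq n (1+ n))))))

(setq c1 (make-counter))
(setq c2 (make-counter))

(funcall c1)   ; => 1
(funcall c1)   ; => 2
(funcall c2)   ; => 1, since c2 counts independently of c1
```

No separate special declaration is needed, since let-closed supplies it as part of its expansion.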
An entity is almost the same thing as a closure; the data type is
nominally different but an entity behaves just like a closure when
applied. The difference is that some system functions, such as
print, operate on them differently. When print sees a closure,
it prints the closure in a standard way. When print sees an entity,
it calls the entity to ask the entity to print itself.
To some degree, entities are made obsolete by flavors (see flavor). The use of entities as message-receiving objects is explained in flavor-entity.
The entity function returns a newly constructed entity. This function
is just like the function closure except that it returns an entity
instead of a closure.
To test whether an object is an entity, use the entityp predicate
(see entityp-fun). The functions symeval-in-closure,
closure-alist, closure-function, etc. also operate on entities.
A stack group (usually abbreviated "SG") is a type of Lisp object useful for implementation of certain advanced control structures such as coroutines and generators. Processes, which are a kind of coroutine, are built on top of stack groups (see process). A stack group represents a computation and its internal state, including the Lisp stack.
At any time, the computation being performed by the Lisp Machine is associated with one stack group, called the current or running stack group. The operation of making some stack group be the current stack group is called a resumption or a stack group switch; the previously running stack group is said to have resumed the new stack group. The resume operation has two parts: first, the state of the running computation is saved away inside the current stack group, and secondly the state saved in the new stack group is restored, and the new stack group is made current. Then the computation of the new stack group resumes its course.
The stack group itself holds a great deal of state information.
It contains the control stack, or "regular PDL". The control stack is
what you are shown by the backtracing commands of the error handler
(Control-B, Meta-B, and Control-Meta-B); it remembers the function which
is running, its caller, its caller’s caller, etc., and
the point of execution of each function (the "return addresses" of each
function). A stack group also contains the environment stack, or
"special PDL". This contains all of the values saved by
lambda-binding. The name "stack group" derives from the existence
of these two stacks. Finally, the stack group contains various internal
state information (contents of machine registers and so on).
When the state of the current
stack group is saved away, all of its bindings are undone,
and when the state is restored, the bindings are put back.
Note that although bindings are temporarily undone, unwind-protect
handlers are not run by a stack-group switch (see let-globally,
let-globally-fun).
Each stack group is a separate environment for purposes of function calling, throwing, dynamic variable binding, and condition signalling. All stack groups run in the same address space, thus they share the same Lisp data and the same global (not lambda-bound) variables.
When a new stack group is created, it is empty: it doesn’t contain the state of any computation, so it can’t be resumed. In order to get things going, the stack group must be set to an initial state. This is done by "presetting" the stack group. To preset a stack group, you supply a function and a set of arguments. The stack group is placed in such a state that when it is first resumed, this function will be called with those arguments. The function is called the "initial" function of the stack group.
The interesting thing that happens to stack groups is that they resume each other. When one stack group resumes a second stack group, the current state of Lisp execution is saved away in the first stack group, and is restored from the second stack group. Resuming is also called "switching stack groups".
At any time, there is one stack group associated with the current computation; it is called the current stack group. The computations associated with other stack groups have their states saved away in memory, and they are not computing. So the only stack group that can do anything at all, in particular resuming other stack groups, is the current one.
You can look at things from the point of view of one computation. Suppose it is running along, and it resumes some stack group. Its state is saved away into the current stack group, and the computation associated with the one it called starts up. The original computation lies dormant in the original stack group, while other computations go around resuming each other, until finally the original stack group is resumed by someone. Then the computation is restored from the stack group and gets to run again.
There are several ways that the current stack group can resume other stack groups. This section describes all of them.
Associated with each stack group is a resumer. The resumer is nil
or another stack group. Some forms of resuming examine and alter the
resumer of some stack groups.
Resuming has another ability: it can transmit a Lisp object from the old stack group to the new stack group. Each stack group specifies a value to transmit whenever it resumes another stack group; whenever a stack group is resumed, it receives a value.
In the descriptions below, let c stand for the current stack group, s stand for some other stack group, and x stand for any arbitrary Lisp object.
Stack groups can be used as functions. They accept one argument. If c calls s as a function with one argument x, then s is resumed, and the object transmitted is x. When c is resumed (usually, but not necessarily, by s), the object transmitted by that resumption will be returned as the value of the call to s. This is one of the simple ways to resume a stack group: call it as a function. The value you transmit is the argument to the function, and the value you receive is the value returned from the function. Furthermore, this form of resuming sets s’s resumer to be c.
Another way to resume a stack group is to use stack-group-return. Rather than allowing you to specify which stack group to resume, this function always resumes the resumer of the current stack group. Thus, this is a good way to resume whoever it was who resumed you, assuming he did it by function-calling. stack-group-return takes one argument which is the object to transmit. It returns when someone resumes the current stack group, and returns one value, the object that was transmitted by that resumption. stack-group-return does not affect the resumer of any stack group.
The most fundamental way to do resuming is with stack-group-resume, which takes two arguments: the stack group, and a value to transmit. It returns when someone resumes the current stack group, returning the value that was transmitted by that resumption, and does not affect any stack group’s resumer.
If the initial function of c attempts to return a value x, the regular kind of Lisp function return cannot take place, since the function did not have any caller (it got there when the stack group was initialized). So instead of normal function returning, a "stack group return" happens. c’s resumer is resumed, and the value transmitted is x. c is left in a state ("exhausted") from which it cannot be resumed again; any attempt to resume it will signal an error. Presetting it will make it work again.
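As a minimal sketch of the voluntary forms described above, the following generator hands back the elements of a list one at a time. The names list-generator and generator-demo are hypothetical, invented for illustration; make-stack-group and stack-group-preset are described later in this chapter.

```lisp
;; Hypothetical example; list-generator and generator-demo are not
;; system functions.
(defun list-generator (items)
  ;; Initial function of the generator stack group.  Each call to
  ;; stack-group-return transmits one element to this stack group's
  ;; resumer, and returns when the generator is resumed again.
  (dolist (item items)
    (stack-group-return item))
  ;; Returning from the initial function transmits this value to the
  ;; resumer and leaves the stack group exhausted.
  'done)

(defun generator-demo ()
  (let ((sg (make-stack-group "list-generator")))
    (stack-group-preset sg #'list-generator '(a b c))
    ;; Calling sg as a function resumes it and sets its resumer to the
    ;; current stack group, so each stack-group-return comes back here.
    ;; The argument nil is the value transmitted to the generator,
    ;; which ignores it.
    (list (funcall sg nil) (funcall sg nil)
          (funcall sg nil) (funcall sg nil))))
```

Under the rules given above, (generator-demo) should return (a b c done). A fifth call to sg would signal an error, since by then sg is exhausted.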
Those are the "voluntary" forms of stack group switch; a resumption happens because the computation said it should. There are also two "involuntary" forms, in which another stack group is resumed without the explicit request of the running program.
If an error occurs, the current stack group resumes the error handler stack group. The value transmitted is partially descriptive of the error, and the error handler looks inside the saved state of the erring stack group to get the rest of the information. The error handler recovers from the error by changing the saved state of the erring stack group and then resuming it.
When certain events occur, typically a 1-second clock tick, a sequence break occurs. This forces the current stack group to resume a special stack group called the scheduler (see scheduler). The scheduler implements processes by resuming, one after another, the stack group of each process that is ready to run.
The binding of this variable is the resumer of the current stack group.
The value of sys:%current-stack-group is the stack group which is currently running. A program can use this variable to get its hands on its own stack group.
A stack group has a state, which controls what it will do when it is resumed. The code number for the state is returned by the function sys:sg-current-state. This number will be the value of one of the following symbols. Only the states actually used by the current system are documented here; some other codes are defined but not used.
sys:sg-state-active
The stack group is the current one.
sys:sg-state-resumable
The stack group is waiting to be resumed, at which time it will pick up its saved machine state and continue doing what it was doing before.
sys:sg-state-awaiting-return
The stack group called some other stack group as a function. When it is resumed, it will return from that function call.
sys:sg-state-awaiting-initial-call
The stack group has been preset (see below) but has never been called. When it is resumed, it will call its initial function with the preset arguments.
sys:sg-state-exhausted
The stack group’s initial function has returned. It cannot be resumed.
sys:sg-state-awaiting-error-recovery
When a stack group gets an error it goes into this state, which prevents anything from happening to it until the error handler has looked at it. In the meantime it cannot be resumed.
sys:sg-state-invoke-call-on-return
When the stack group is resumed, it will call a function. The function and arguments are already set up on the stack. The debugger uses this to force the stack group being debugged to do things.
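The state codes can be decoded by comparing the number against the values of the symbols listed above. The following is only a sketch: describe-sg-state is a hypothetical name, and it assumes that sys:sg-current-state takes the stack group as its single argument, which this section does not state explicitly.

```lisp
;; Hypothetical helper; describe-sg-state is not a system function.
;; Assumption: sys:sg-current-state is called with the stack group
;; whose state is wanted.
(defun describe-sg-state (sg)
  (let ((state (sys:sg-current-state sg)))
    (cond ((= state sys:sg-state-active)
           "the current stack group")
          ((= state sys:sg-state-resumable)
           "waiting to be resumed")
          ((= state sys:sg-state-awaiting-return)
           "called another stack group as a function")
          ((= state sys:sg-state-awaiting-initial-call)
           "preset but never called")
          ((= state sys:sg-state-exhausted)
           "exhausted; must be preset before it can run again")
          ((= state sys:sg-state-awaiting-error-recovery)
           "waiting for the error handler")
          ((= state sys:sg-state-invoke-call-on-return)
           "will call a function when resumed")
          (t "a state code not documented here"))))
```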
make-stack-group creates and returns a new stack group. name may be any symbol or string; it is used in the stack group’s printed representation. options is a list of alternating keywords and values. The options are rarely needed; most calls to make-stack-group don’t need any options at all. The options are:
:sg-area
The area in which to create the stack group structure itself. Defaults to the default area (the value of default-cons-area).
:regular-pdl-area
The area in which to create the regular PDL. Note that this may not be just any area; only certain areas will do, because regular PDLs are cached in a hardware device called the pdl buffer. The default is sys:pdl-area.
:special-pdl-area
The area in which to create the special PDL. Defaults to the default area (the value of default-cons-area).
:regular-pdl-size
Length of the regular PDL to be created. Defaults to 3000.
:special-pdl-size
Length of the special PDL to be created. Defaults to 2000.
:swap-sv-on-call-out
:swap-sv-of-sg-that-calls-me
These flags default to 1. If these are 0, the system does not maintain separate binding environments for each stack group. You do not want to use this feature.
:trap-enable
This determines what to do if a microcode error occurs. If it is 1 the system tries to handle the error; if it is 0 the machine halts. Defaults to 1.
:safe
If this flag is 1 (the default), a strict call-return discipline among stack-groups is enforced. If 0, no restriction on stack-group switching is imposed.
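For the uncommon case where options are wanted, a call might look like the sketch below. make-deep-stack-group is a hypothetical helper, and the sizes are arbitrary illustrative values (twice the documented defaults); the sketch assumes the options are passed as one list argument, as the description above says.

```lisp
;; Hypothetical helper; make-deep-stack-group is not a system function.
(defun make-deep-stack-group (name)
  ;; options is a single list of alternating keywords and values.
  ;; 6000 and 4000 are twice the documented defaults of 3000 and 2000.
  (make-stack-group name
                    '(:regular-pdl-size 6000
                      :special-pdl-size 4000)))
```

A stack group that runs a deeply recursive initial function is the kind of case where larger PDLs might be worth requesting.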
stack-group-preset sets up stack-group so that when it is resumed, function will be applied to arguments within the stack group. Both stacks are made empty; all saved state in the stack group is destroyed. stack-group-preset is typically used to initialize a stack group just after it is made, but it may be done to any stack group at any time. Doing this to a stack group which is not exhausted will destroy its present state without properly cleaning up by running unwind-protects.
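The following sketch shows a stack group being reused after it has become exhausted. The names one-shot and preset-demo are hypothetical.

```lisp
;; Hypothetical example; one-shot and preset-demo are not system
;; functions.
(defun one-shot (tag)
  ;; Returns immediately, so the stack group running it becomes
  ;; exhausted as soon as the value has been transmitted.
  (list 'finished tag))

(defun preset-demo ()
  (let ((sg (make-stack-group "one-shot")))
    (stack-group-preset sg #'one-shot 'first-run)
    (let ((v1 (funcall sg nil)))      ; sg is exhausted after this call
      ;; Resuming sg now would signal an error.  Presetting it again
      ;; discards the old state and makes it resumable.
      (stack-group-preset sg #'one-shot 'second-run)
      (list v1 (funcall sg nil)))))
```

Given the behavior described in this chapter, (preset-demo) should return ((finished first-run) (finished second-run)).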
stack-group-resume resumes s, transmitting the value x. No stack group’s resumer is affected.
stack-group-return resumes the current stack group’s resumer, transmitting the value x. No stack group’s resumer is affected.
Evaluates the variable symbol in the binding environment of sg. If frame is not nil, it evaluates symbol in the binding environment of execution in that frame. (A frame is an index in the stack group’s regular pdl.)
Two values are returned: the symbol’s value, and a locative to where the value is stored. If as-if-current is not nil, the locative points to where the value would be stored if sg were running. This may be different from where the value is stored now; for example, the current binding in stack group sg is stored in symbol’s value cell when sg is running, but is probably stored in sg’s special pdl when sg is not running. as-if-current makes no difference if sg actually is the current stack group.
If symbol is unbound in the specified stack group and frame, this will get an unbound-variable error.
There are a large number of functions in the sys: and eh: packages for manipulating the internal details of stack groups. These are not documented here as they are not necessary for most users or even system programmers to know about. Refer to the file SYS: LMWIN; EH LISP for them.
Because each stack group has its own set of dynamic bindings, a stack group will not inherit its creator’s value of terminal-io (see terminal-io-var), nor its caller’s, unless you make special provision for this. The terminal-io a stack group gets by default is a "background" stream which does not normally expect to be used. If it is used, it will turn into a "background window" which will request the user’s attention. Usually this happens because an error message is being printed on the stream. [This will all be explained in the window system documentation.]
If you write a program that uses multiple stack groups, and you want them all to do input and output to the terminal, you should pass the value of terminal-io to the top-level function of each stack group as part of the stack-group-preset, and that function should bind the variable terminal-io.
Another technique is to use a closure as the top-level function of a stack group. This closure can bind terminal-io and any other variables that are desired to be shared between the stack group and its creator.
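Both techniques can be sketched as follows. All four function names are hypothetical, and the sketch assumes the usual calling sequence (closure var-list function) described in the chapter on closures.

```lisp
;; Hypothetical example; none of these are system functions.

;; Technique 1: pass the stream as an argument and bind terminal-io.
(defun talker (stream message)
  (let ((terminal-io stream))     ; terminal-io is special: a dynamic binding
    (print message terminal-io)
    (stack-group-return t)))

(defun start-talker (sg message)
  ;; The creator's current value of terminal-io is passed explicitly
  ;; as one of the preset arguments.
  (stack-group-preset sg #'talker terminal-io message)
  (funcall sg nil))

;; Technique 2: use a closure over terminal-io as the top-level function.
(defun talker-body ()
  (print 'hello terminal-io)
  (stack-group-return t))

(defun start-talker-with-closure (sg)
  ;; closure captures the creator's binding of terminal-io, so the
  ;; stack group shares it whenever the closure is running.
  (stack-group-preset sg (closure '(terminal-io) #'talker-body))
  (funcall sg nil))
```

The closure form is convenient when several variables are to be shared, since they can all be named in the one list.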
The canonical coroutine example is the so-called samefringe problem: Given two trees, determine whether they contain the same atoms in the same order, ignoring parenthesis structure. A better way of saying this is, given two binary trees built out of conses, determine whether the sequence of atoms on the fringes of the trees is the same, ignoring differences in the arrangement of the internal skeletons of the two trees. Following the usual rule for trees, nil in the cdr of a cons is to be ignored.
One way of solving this problem is to use generator coroutines. We make a generator for each tree. Each time the generator is called it returns the next element of the fringe of its tree. After the generator has examined the entire tree, it returns a special "exhausted" flag. The generator is most naturally written as a recursive function. The use of coroutines, i.e. stack groups, allows the two generators to recurse separately on two different control stacks without having to coordinate with each other.
The program is very simple. Constructing it in the usual bottom-up style, we first write a recursive function which takes a tree and stack-group-returns each element of its fringe. The stack-group-return is how the generator coroutine delivers its output. We could easily test this function by changing stack-group-return to print and trying it on some examples.
(defun fringe (tree)
  (cond ((atom tree) (stack-group-return tree))
        (t (fringe (car tree))
           (if (not (null (cdr tree)))
               (fringe (cdr tree))))))
Now we package this function inside another, which takes care of returning the special "exhausted" flag.
(defun fringe1 (tree exhausted)
  (fringe tree)
  exhausted)
The samefringe function takes the two trees as arguments and returns t or nil. It creates two stack groups to act as the two generator coroutines, presets them to run the fringe1 function, then goes into a loop comparing the two fringes. The value is nil if a difference is discovered, or t if they are still the same when the end is reached.
(defun samefringe (tree1 tree2)
  (let ((sg1 (make-stack-group "samefringe1"))
        (sg2 (make-stack-group "samefringe2"))
        (exhausted (ncons nil)))
    (stack-group-preset sg1 #'fringe1 tree1 exhausted)
    (stack-group-preset sg2 #'fringe1 tree2 exhausted)
    (do ((v1) (v2)) (nil)
      (setq v1 (funcall sg1 nil)
            v2 (funcall sg2 nil))
      (cond ((neq v1 v2) (return nil))
            ((eq v1 exhausted) (return t))))))
Now we test it on a couple of examples.
(samefringe '(a b c) '(a (b c))) => t
(samefringe '(a b c) '(a b c d)) => nil
The problem with this is that a stack group is quite a large object, and we make two of them every time we compare two fringes. This is a lot of unnecessary overhead. It can easily be eliminated with a modest amount of explicit storage allocation, using the resource facility (see defresource-fun). While we’re at it, we can avoid making the exhausted flag fresh each time; its only important property is that it not be an atom.
(defresource samefringe-coroutine ()
  :constructor (make-stack-group "for-samefringe"))

(defvar exhausted-flag (ncons nil))

(defun samefringe (tree1 tree2)
  (using-resource (sg1 samefringe-coroutine)
    (using-resource (sg2 samefringe-coroutine)
      (stack-group-preset sg1 #'fringe1 tree1 exhausted-flag)
      (stack-group-preset sg2 #'fringe1 tree2 exhausted-flag)
      (do ((v1) (v2)) (nil)
        (setq v1 (funcall sg1 nil)
              v2 (funcall sg2 nil))
        (cond ((neq v1 v2) (return nil))
              ((eq v1 exhausted-flag) (return t)))))))
Now we can compare the fringes of two trees with no allocation of memory whatsoever.
A locative is a type of Lisp object used as a pointer to a cell. Locatives are inherently a more "low level" construct than most Lisp objects; they require some knowledge of the nature of the Lisp implementation. Most programmers will never need them.
A cell is a machine word that can hold a (pointer to a) Lisp object. For example, a symbol has five cells: the print name cell, the value cell, the function cell, the property list cell, and the package cell. The value cell holds (a pointer to) the binding of the symbol, and so on. Also, an array leader of length n has n cells, and an art-q array of n elements has n cells. (Numeric arrays do not have cells in this sense.) A locative is an object that points to a cell; it lets you refer to a cell, so that you can examine or alter its contents.
There is a set of functions that create locatives to cells; the functions are documented with the kind of object to which they create a pointer. See ap-1, ap-leader, car-location, value-cell-location, etc. The macro locf (see locf-fun) can be used to convert a form that accesses a cell to one that creates a locative pointer to that cell: for example,
(locf (fsymeval x)) ==> (function-cell-location x)
locf is very convenient because it saves the writer and reader of a program from having to remember the names of all the functions that create locatives.
Either of the functions car and cdr (see car-fun) may be given a locative, and will return the contents of the cell at which the locative points.
For example, (car (value-cell-location x)) is the same as (symeval x).
Similarly, either of the functions rplaca and rplacd may be used to store an object into the cell at which a locative points.
For example, (rplaca (value-cell-location x) y) is the same as (set x y).
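Because car and rplaca work on any locative, one function can operate on cells of quite different kinds. The sketch below increments the contents of whatever cell it is handed; hit-count, increment-cell, and locative-demo are hypothetical names.

```lisp
;; Hypothetical example; hit-count, increment-cell, and locative-demo
;; are names invented for illustration.
(defvar hit-count 0)                  ; special, so it has a value cell

(defun increment-cell (loc)
  ;; car reads the cell the locative points to; rplaca stores into it.
  (rplaca loc (1+ (car loc))))

(defun locative-demo ()
  (let ((hit-count 0)
        (pair (cons 10 20)))
    ;; Same effect as (setq hit-count (1+ hit-count)).
    (increment-cell (value-cell-location 'hit-count))
    ;; Same effect as (rplaca pair (1+ (car pair))).
    (increment-cell (car-location pair))
    (list hit-count (car pair))))
```

(locative-demo) should return (1 11): the value cell of hit-count and the car cell of the cons have each been incremented once.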
If you mix locatives and lists, then it matters whether you use car and rplaca or cdr and rplacd, and care is required. For example, the following function takes advantage of value-cell-location to cons up a list in forward order without special-case code. The first time through the loop, the rplacd is equivalent to (setq res ...); on later times through the loop the rplacd tacks an additional cons onto the end of the list.
(defun simplified-version-of-mapcar (fcn lst)
  (do ((lst lst (cdr lst))
       (res nil)
       (loc (value-cell-location 'res)))
      ((null lst) res)
    (rplacd loc
            (setq loc (ncons (funcall fcn (car lst)))))))
You might expect this not to work if it was compiled and res was not declared special, since non-special compiled variables are not represented as symbols. However, the compiler arranges for it to work anyway, by recognizing value-cell-location of the name of a local variable, and compiling it as something other than a call to the value-cell-location function.
Subprimitives are functions which are not intended to be used by the average program, only by "system programs". They allow one to manipulate the environment at a level lower than normal Lisp. They are described in this chapter. Subprimitives usually have names which start with a % character. The "primitives" described in other sections of the manual typically use subprimitives to accomplish their work. The subprimitives take the place of machine language in other systems, to some extent. Subprimitives are normally hand-coded in microcode.
There is plenty of stuff in this chapter that is not fully
explained; there are terms that are undefined, there are forward references,
and so on. Furthermore, most of what is in here is considered subject
to change without notice. In fact, this chapter does not exactly belong
in this manual, but in some other more low-level manual. Since the latter
manual does not exist, it is here for the interim.
Subprimitives by their very nature cannot do full checking. Improper use of subprimitives can destroy the environment. Subprimitives come in varying degrees of dangerousness. Those without a % sign in their name cannot destroy the environment, but are dependent on "internal" details of the Lisp implementation. The ones whose names start with a % sign can violate system conventions if used improperly. The subprimitives are documented here since they need to be documented somewhere, but this manual does not document all the things you need to know in order to use them. Still other subprimitives are not documented here because they are very specialized. Most of these are never used explicitly by a programmer; the compiler inserts them into the program to perform operations which are expressed differently in the source code.
The most common problem you can cause using subprimitives, though
by no means the only one, is to create illegal pointers: pointers
that are, for one reason or another, according to storage conventions,
not allowed to exist. The storage conventions are not documented;
as we said, you have to be an expert to correctly use a lot of the functions
in this chapter. If you create such an illegal pointer, it probably will
not be detected immediately, but later on parts of the system may see it,
notice that it is illegal, and (probably) halt the Lisp Machine.
In a certain sense car, cdr, rplaca, and rplacd are subprimitives. If these are given a locative instead of a list, they will access or modify the cell addressed by the locative without regard to what object the cell is inside. Subprimitives can be used to create locatives to strange places.
data-type returns a symbol which is the name for the internal data-type of the "pointer" which represents arg. Note that some types as seen by the user are not distinguished from each other at this level, and some user types may be represented by more than one internal type. For example, dtp-extended-number is the symbol that data-type would return for either a flonum or a bignum, even though those two types are quite different.
The typep function (typep-fun) is a higher-level primitive which is more useful in most cases; normal programs should always use typep rather than data-type.
Some of these type codes are internal tag fields that are never
used in pointers that represent Lisp objects at all, but they are
documented here anyway.
dtp-symbol
The object is a symbol.
dtp-fix
The object is a fixnum; the numeric value is contained in the address field of the pointer.
dtp-small-flonum
The object is a small flonum; the numeric value is contained in the address field of the pointer.
dtp-extended-number
The object is a flonum or a bignum. This value will also be used for future numeric types.
dtp-list
The object is a cons.
dtp-locative
The object is a locative pointer.
dtp-array-pointer
The object is an array.
dtp-fef-pointer
The object is a compiled function.
dtp-u-entry
The object is a microcode entry.
dtp-closure
The object is a closure; see closure.
dtp-stack-group
The object is a stack-group; see stack-group.
dtp-instance
The object is an instance of a flavor, i.e. an "active object". See flavor.
dtp-entity
The object is an entity; see entity.
dtp-select-method
The object is a "select-method"; see select-method.
dtp-header
An internal type used to mark the first word of a multi-word structure.
dtp-array-header
An internal type used in arrays.
dtp-symbol-header
An internal type used to mark the first word of a symbol.
dtp-instance-header
An internal type used to mark the first word of an instance.
dtp-null
Nothing to do with nil. This is used in unbound value and function cells.
dtp-trap
The zero data-type, which is not used. Leaving it unused is intended to help detect microcode bugs.
dtp-free
This type is used to fill free storage, to catch wild references.
dtp-external-value-cell-pointer
An "invisible pointer" used for external value cells, which are part of the closure mechanism (see closure), and used by compiled code to address value and function cells.
dtp-header-forward
An "invisible pointer" used to indicate that the structure containing it has been moved elsewhere. The "header word" of the structure is replaced by one of these invisible pointers. See the function structure-forward (structure-forward-fun).
dtp-body-forward
An "invisible pointer" used to indicate that the structure containing it has been moved elsewhere. This points to the word containing the header-forward, which points to the new copy of the structure.
dtp-one-q-forward
An "invisible pointer" used to indicate that the single cell containing it has been moved elsewhere.
dtp-gc-forward
This is used by the copying garbage collector to flag the obsolete copy of an object; it points to the new copy.
The value of q-data-types is a list of all of the symbolic names for data types described above under data-type. These are the symbols whose print names begin with "dtp-". The values of these symbols are the internal numeric data-type codes for the various types.
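The relation between data-type, the dtp- symbols, and their numeric values can be seen with a small helper. This is only a sketch; show-data-type is a hypothetical name. The numeric codes themselves are deliberately not shown, since this section does not list them.

```lisp
;; Hypothetical helper; show-data-type is not a system function.
(defun show-data-type (x)
  (let ((name (data-type x)))
    ;; name is a symbol such as dtp-fix; the value of that symbol is
    ;; the internal numeric data-type code.
    (list name (symeval name))))

(show-data-type 'foo)      ; first element is dtp-symbol
(show-data-type 17)        ; first element is dtp-fix
(show-data-type '(a b))    ; first element is dtp-list
```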
Given the internal numeric data-type code, returns the corresponding symbolic name. This "function" is actually an array.
An invisible pointer is a kind of pointer that does not represent a Lisp object, but just resides in memory. There are several kinds of invisible pointer, and there are various rules about where they may or may not appear. The basic property of an invisible pointer is that if the Lisp Machine reads a word of memory and finds an invisible pointer there, instead of seeing the invisible pointer as the result of the read, it does a second read, at the location addressed by the invisible pointer, and returns that as the result instead. Writing behaves in a similar fashion. When the Lisp Machine writes a word of memory it first checks to see if that word contains an invisible pointer; if so it goes to the location pointed to by the invisible pointer and tries to write there instead. Many subprimitives that read and write memory do not do this checking.
The simplest kind of invisible pointer has the data type code dtp-one-q-forward. It is used to forward a single word of memory to someplace else. The invisible pointers with data types dtp-header-forward and dtp-body-forward are used for moving whole Lisp objects (such as cons cells or arrays) somewhere else. The dtp-external-value-cell-pointer is very similar to the dtp-one-q-forward; the difference is that it is not "invisible" to the operation of binding. If the (internal) value cell of a symbol contains a dtp-external-value-cell-pointer that points to some other word (the external value cell), then symeval or set operations on the symbol will consider the pointer to be invisible and use the external value cell, but binding the symbol will save away the dtp-external-value-cell-pointer itself, and store the new value into the internal value cell of the symbol. This is how closures are implemented.
dtp-gc-forward is not an invisible pointer at all; it only appears in "old space" and will never be seen by any program other than the garbage collector. When an object is found not to be garbage, and the garbage collector moves it from "old space" to "new space", a dtp-gc-forward is left behind to point to the new copy of the object. This ensures that other references to the same object get the same new copy.
structure-forward causes references to old-object to actually reference new-object, by storing invisible pointers in old-object. It returns old-object.
An example of the use of structure-forward is adjust-array-size. If the array is being made bigger and cannot be expanded in place, a new array is allocated, the contents are copied, and the old array is structure-forwarded to the new one. This forwarding ensures that pointers to the old array, or to cells within it, continue to work. When the garbage collector goes to copy the old array, it notices the forwarding and uses the new array as the copy; thus the overhead of forwarding disappears eventually if garbage collection is in use.
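The effect on a program can be sketched as follows. forwarding-demo is a hypothetical name, and the sketch assumes that (make-array n) creates a one-dimensional art-q array and that adjust-array-size takes the array and the new size; consult the array chapter for the exact calling sequences in your system.

```lisp
;; Hypothetical example; forwarding-demo is not a system function.
;; Assumes (make-array n) makes a one-dimensional art-q array.
(defun forwarding-demo ()
  (let* ((a (make-array 4))
         (same-a a))                ; a second reference to the same array
    (aset 'kept a 0)
    ;; The array may be expanded in place, or a new array may be
    ;; allocated and the old one structure-forwarded to it.
    (adjust-array-size a 500)
    ;; Either way, the old pointers are expected to keep working.
    (aset 'added same-a 400)
    (list (aref same-a 0) (aref a 400))))
```

If forwarding behaves as described above, the function is expected to return (kept added) regardless of whether the array was moved.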
Normally returns object, but if object has been structure-forward’ed, returns the object at the end of the chain of forwardings. If object is not exactly an object, but a locative to a cell in the middle of an object, a locative to the corresponding cell in the latest copy of the object will be returned.
This alters from-symbol so that it always has the same value as to-symbol, by sharing its value cell. A dtp-one-q-forward invisible pointer is stored into from-symbol’s value cell. Do not do this while from-symbol is lambda-bound, as the microcode does not bother to check for that case and something bad will happen when from-symbol gets unbound. The microcode check is omitted to speed up binding and unbinding.
To forward one arbitrary cell to another (rather than specifically one value cell to another), given two locatives do
(%p-store-tag-and-pointer locative1 dtp-one-q-forward locative2)
loc is a locative to a cell. Normally loc is returned, but if the cell has been forwarded, this follows the chain of forwardings and returns a locative to the final cell. If the cell is part of a structure which has been forwarded, the chain of structure forwardings is followed, too. If evcp-p is t, external value cell pointers are followed; if it is nil they are not.
It should again be emphasized that improper use of these functions can damage or destroy the Lisp environment. It is possible to create pointers with illegal data-type, pointers to non-existent objects, and pointers to untyped storage which will completely confuse the garbage collector.
Returns the data-type field of x, as a fixnum.
Returns the pointer field of x, as a fixnum. For most types, this is dangerous since the garbage collector can copy the object and change its address.
%make-pointer makes up a pointer, with data-type in the data-type field and pointer in the pointer field, and returns it. data-type should be an internal numeric data-type code; these are the values of the symbols that start with dtp-. pointer may be any object; its pointer field is used. This is most commonly used for changing the type of a pointer. Do not use this to make pointers which are not allowed to be in the machine, such as dtp-null, invisible pointers, etc.
This returns a pointer with data-type in the data-type field, and pointer plus offset in the pointer field. The data-type and pointer arguments are like those of %make-pointer; offset may be any object but is usually a fixnum. The types of the arguments are not checked; their pointer fields are simply added together. This is useful for constructing locative pointers into the middle of an object. However, note that it is illegal to have a pointer to untyped data, such as the inside of a FEF or a numeric array.
Returns a fixnum which is pointer-1 minus pointer-2. No type checks are made. For the result to be meaningful, the two pointers must point into the same object, so that their difference cannot change as a result of garbage collection.
%find-structure-header finds the structure into which pointer points, by searching backward for a header. It is a basic low-level function used by such things as the garbage collector. pointer is normally a locative, but its data-type is ignored. Note that it is illegal to point into an "unboxed" portion of a structure, for instance the middle of a numeric array.
In structure space, the "containing structure" of a pointer is well-defined by system storage conventions. In list space, it is considered to be the contiguous, cdr-coded segment of list surrounding the location pointed to. If a cons of the list has been copied out by rplacd, the contiguous list includes that pair and ends at that point.
%find-structure-leader is identical to %find-structure-header, except that if the structure is an array with a leader, this returns a locative pointer to the leader-header, rather than returning the array-pointer itself. Thus the result of %find-structure-leader is always the lowest address in the structure. This is the one used internally by the garbage collector.
Returns the number of "boxed Q’s" in object. This is the number of words at the front of the structure which contain normal Lisp objects. Some structures, for example FEFs and numeric arrays, contain additional "unboxed Q’s" following their "boxed Q’s". Note that the boxed size of a PDL (either regular or special) does not include Q’s above the current top of the PDL. Those locations are boxed, but their contents are considered garbage and are not protected by the garbage collector.
Returns the total number of words occupied by the representation of object, including boxed Q’s, unboxed Q’s, and garbage Q’s off the ends of PDLs.
%allocate-and-initialize is the subprimitive for creating most structured-type objects. area is the area in which it is to be created, as a fixnum or a symbol. size is the number of words to be allocated. The value returned points to the first word allocated, and has data-type data-type. Uninterruptibly, the words allocated are initialized so that storage conventions are preserved at all times. The first word, the header, is initialized to have header-type in its data-type field and header in its pointer field. The second word is initialized to second-word. The remaining words are initialized to nil. The flag bits of all words are set to 0. The cdr codes of all words except the last are set to cdr-next; the cdr code of the last word is set to cdr-nil. It is probably a bad idea to rely on this.
The basic functions for creating list-type objects are cons and make-list; no special subprimitive is needed. Closures, entities, and select-methods are based on lists, but there is no primitive for creating them. To create one, create a list and then use %make-pointer to change the data type from dtp-list to the desired type.
This is the subprimitive for creating arrays, called only by make-array. It is different from %allocate-and-initialize because arrays have a more complicated header structure.
This is the basic locking primitive. pointer is a locative to a cell which is uninterruptibly read and written. If the contents of the cell is eq to old, then it is replaced by new and t is returned. Otherwise, nil is returned and the contents of the cell is not changed.
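A simple lock can be built on this primitive along the following lines. This is a sketch under assumptions: the name of the primitive is not given in the text above, and the code assumes it is %store-conditional called as (%store-conditional pointer old new); it also assumes process-allow-schedule and ferror are available as in the rest of the system. demo-lock, seize-demo-lock, and release-demo-lock are hypothetical names.

```lisp
;; Hypothetical example.  Assumption: the locking primitive described
;; above is %store-conditional, called as (pointer old new).
(defvar demo-lock nil)            ; nil means free; otherwise the owner

(defun seize-demo-lock (owner)
  ;; Try to change the cell from nil to owner.  If some other owner
  ;; holds the lock, let other processes run and then try again.
  (do ()
      ((%store-conditional (value-cell-location 'demo-lock) nil owner))
    (process-allow-schedule)))

(defun release-demo-lock (owner)
  ;; Only the current owner can change the cell back to nil.
  (or (%store-conditional (value-cell-location 'demo-lock) owner nil)
      (ferror nil "~S does not hold the lock" owner)))
```

Because the read and the write happen uninterruptibly, two processes cannot both see the cell as nil and both store into it.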
Returns the contents of the register at the specified Unibus address, as a fixnum. You must specify a full 18-bit address. This is guaranteed to read the location only once. Since the Lisp Machine Unibus does not support byte operations, this always references a 16-bit word, and so address will normally be an even number.
Writes the 16-bit number data at the specified Unibus address, exactly once.
Returns the contents of the register at the specified Xbus address. io-offset is an offset into the I/O portion of Xbus physical address space. This is guaranteed to read the location exactly once. The returned value can be either a fixnum or a bignum.
Writes data, which can be a fixnum or a bignum, into the register at the specified Xbus address. io-offset is an offset into the I/O portion of Xbus physical address space. This is guaranteed to write the location exactly once.
Does (%xbus-write w-loc w-data), but first synchronizes to within about one microsecond of a certain condition. The synchronization is achieved by looping until
(= (logand (%xbus-read sync-loc) sync-mask) sync-value)
is false, then looping until it is true, then looping delay times. Thus the write happens a specified delay after the leading edge of the synchronization condition. The number of microseconds of delay is roughly one third of delay.
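The sequence of steps can be modelled in Lisp as shown below. This is purely illustrative: the real subprimitive is microcoded, and Lisp code cannot reproduce its timing. xbus-write-sync-sketch is a hypothetical name.

```lisp
;; Illustrative model only; xbus-write-sync-sketch is a hypothetical
;; name and does not achieve the timing of the real subprimitive.
(defun xbus-write-sync-sketch
       (w-loc w-data delay sync-loc sync-mask sync-value)
  ;; 1. Loop until the condition is false.
  (do ()
      ((not (= (logand (%xbus-read sync-loc) sync-mask) sync-value))))
  ;; 2. Loop until the condition is true: this is the leading edge.
  (do ()
      ((= (logand (%xbus-read sync-loc) sync-mask) sync-value)))
  ;; 3. Loop delay times.
  (dotimes (i delay))
  ;; 4. Perform the write.
  (%xbus-write w-loc w-data))
```

Waiting first for the condition to be false is what guarantees that the write follows a fresh leading edge rather than a condition that happened to be true already.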
Stops the machine.
This checks the cell pointed to by base-pointer for a forwarding pointer. Having followed forwarding pointers to the real structure pointed to, it adds offset to the resulting forwarded base-pointer and returns the contents of that location.
There is no %p-contents, since car performs that operation.
Given a pointer to a memory location containing a pointer which isn’t allowed to be "in the machine" (typically an invisible pointer) this function returns the contents of the location as a dtp-locative. It changes the disallowed data type to dtp-locative so that you can safely look at it and see what it points to.
This checks the cell pointed to by base-pointer for a forwarding pointer. Having followed forwarding pointers to the real structure pointed to, it adds offset to the resulting forwarded base-pointer, fetches the contents of that location, and returns it with the data type changed to dtp-locative in case it was a type which isn’t allowed to be "in the machine" (typically an invisible pointer). This can be used, for example, to analyze the dtp-external-value-cell-pointer pointers in a FEF, which are used by the compiled code to reference value cells and function cells of symbols.
value is stored into the data-type and pointer fields of the location addressed by pointer. The cdr-code and flag-bit fields remain unchanged. value is returned.
This checks the cell pointed to by base-pointer for a forwarding pointer. Having followed forwarding pointers to the real structure pointed to, it adds offset to the resulting forwarded base-pointer, and stores value into the data-type and pointer fields of that location. The cdr-code and flag-bit fields remain unchanged. value is returned.
Creates a Q by taking 8 bits from miscfields and 24 bits from pntrfield, and stores that into the location addressed by pointer. The low 5 bits of miscfields become the data-type, the next bit becomes the flag-bit, and the top two bits become the cdr-code. This is a good way to store a forwarding pointer from one structure to another (for example).
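The 8-bit miscfields value can be assembled from its three parts by shifting each into the position described above. A minimal sketch; make-miscfields is a hypothetical helper, not a system function.

```lisp
;; Hypothetical helper: pack the tag fields into an 8-bit miscfields value.
(defun make-miscfields (cdr-code flag-bit data-type)
  (+ (ash cdr-code 6)     ; top two bits
     (ash flag-bit 5)     ; next bit
     data-type))          ; low 5 bits
```

The result would then be passed as the miscfields argument, with the 24-bit address supplied separately as pntrfield.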
This is like ldb but gets a byte from the location addressed by pointer. Note that you can load bytes out of the data type etc. bits, not just the pointer field, and that the word loaded out of need not be a fixnum. The result returned is always a fixnum.
This checks the cell pointed to by base-pointer for a forwarding pointer. Having followed forwarding pointers to the real structure pointed to, the byte specified by ppss is loaded from the contents of the location addressed by the forwarded base-pointer plus offset, and returned as a fixnum. This is the way to reference byte fields within a structure without violating system storage conventions.
The value, a fixnum, is stored into the byte selected by ppss in the word addressed by pointer. nil is returned. You can use this to alter data types, cdr codes, etc.
This checks the cell pointed to by base-pointer for a forwarding pointer. Having followed forwarding pointers to the real structure pointed to, the value is stored into the byte specified by ppss in the location addressed by the forwarded base-pointer plus offset. nil is returned. This is the way to alter unboxed data within a structure without violating system storage conventions.
This is similar to %p-ldb, except that the selected byte is returned in its original position within the word instead of right-aligned.
This is similar to %p-ldb-offset, except that the selected byte is returned in its original position within the word instead of right-aligned.
This is similar to %p-dpb, except that the selected byte is stored from the corresponding bits of value rather than the right-aligned bits.
This is similar to %p-dpb-offset, except that the selected byte is stored from the corresponding bits of value rather than the right-aligned bits.
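The difference between a right-aligned byte and a byte left in its original position can be seen in a model that works on plain integers instead of memory words. The sketch below assumes the usual Zetalisp byte-specifier convention, in which ppss is read as four octal digits: pp gives the position of the byte and ss its size. The function names are hypothetical.

```lisp
;; Model on integers only; the real subprimitives operate on memory words.
(defun ppss-position (ppss) (ash ppss -6))
(defun ppss-size (ppss) (logand ppss #o77))

;; Right-aligned result, in the manner of %p-ldb.
(defun model-ldb (ppss word)
  (logand (ash word (- (ppss-position ppss)))
          (1- (ash 1 (ppss-size ppss)))))

;; Byte left in place, in the manner of the mask-field variants.
(defun model-mask-field (ppss word)
  (logand word
          (ash (1- (ash 1 (ppss-size ppss))) (ppss-position ppss))))
```

With the byte specifier #o0603 (3 bits starting at bit 6) and the word #o1234, the first function gives 2 and the second gives #o200.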
Extracts the pointer field of the contents of the location addressed by pointer and returns it as a fixnum.
Extracts the data-type field of the contents of the location addressed by pointer and returns it as a fixnum.
Extracts the cdr-code field of the contents of the location addressed by pointer and returns it as a fixnum.
Extracts the flag-bit field of the contents of the location addressed by pointer and returns it as a fixnum.
Clobbers the pointer field of the location addressed by pointer to value, and returns value.
Clobbers the data-type field of the location addressed by pointer to value, and returns value.
Clobbers the cdr-code field of the location addressed by pointer to value, and returns value.
Clobbers the flag-bit field of the location addressed by pointer to value, and returns value.
Returns a locative pointer to its caller’s stack frame. This function is not defined in the interpreted Lisp environment; it only works in compiled code. Since it turns into a "misc" instruction, the "caller’s stack frame" really means "the frame for the FEF that executed the %stack-frame-pointer instruction".
The following special variables have values which define the most important attributes of the way Lisp data structures are laid out in storage. In addition to the variables documented here, there are many others which are more specialized. They are not documented in this manual since they are in the system package rather than the global package. The variables whose names start with %% are byte specifiers, intended to be used with subprimitives such as %p-ldb. If you change the value of any of these variables, you will probably bring the machine to a crashing halt.
The field of a memory word which contains the flag-bit. In most data structures this bit is not used by the system and is available for the user.
The field of a memory word which contains the data-type code. See data-type-fun.
The field of a memory word which contains the pointer address, or immediate data.
The field of a memory word which contains the part of the address that lies within a single page.
The concatenation of the %%q-data-type and %%q-pointer fields.
The field of a memory word which contains the tag fields, %%q-cdr-code and %%q-flag-bit.
The concatenation of all fields of a memory word except for %%q-pointer.
The concatenation of all fields of a memory word except for %%q-cdr-code.
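As an illustration of how these byte specifiers combine with %p-ldb, the following sketch collects the four basic fields of a word into a list. The function q-fields is hypothetical, and it assumes that %p-ldb takes the byte specifier first and the pointer second, as ldb does.

```lisp
;; Hypothetical helper: return (cdr-code flag-bit data-type pointer)
;; for the word addressed by pointer.
(defun q-fields (pointer)
  (list (%p-ldb %%q-cdr-code pointer)
        (%p-ldb %%q-flag-bit pointer)
        (%p-ldb %%q-data-type pointer)
        (%p-ldb %%q-pointer pointer)))
```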
These subprimitives can be used (carefully!) to call a function with the number of arguments variable at run time. They only work in compiled code and are not defined in the interpreted Lisp environment. The preferred higher-level primitive is lexpr-funcall (lexpr-funcall-fun).
Starts a call to function. n-adi-pairs is the number of pairs of additional information words already %push’ed; normally this should be 0. destination is where to put the result; the useful values are 0 for the value to be ignored, 1 for the value to go onto the stack, 3 for the value to be the last argument to the previous open call block, and 4 for the value to be returned from this frame.
Pushes value onto the stack. Use this to push the arguments.
Causes the call to happen.
Pops the top value off of the stack and returns it as its value. Use this to recover the result from a call made by %open-call-block with a destination of 1.
Call this before doing a sequence of %push’s or %open-call-blocks which will add n-words to the current frame. This subprimitive checks that the frame will not exceed the maximum legal frame size, which is 255 words including all overhead. This limit is dictated by the way stack frames are linked together. If the frame is going to exceed the legal limit, %assure-pdl-room will signal an error.
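Taken together, these subprimitives might be used as in the following sketch, which calls a function on the elements of a list. It is meant for compiled code only. The name call-on-list is hypothetical, the function that causes the call to happen is assumed to be named %activate-open-call-block, and the allowance of 4 words for call-block overhead is a guess rather than a documented figure.

```lisp
;; Sketch only; in practice lexpr-funcall is the preferred way to do this.
(defun call-on-list (function args)
  (%assure-pdl-room (+ 4 (length args)))  ; 4 words of overhead is a guess
  (%open-call-block function 0 1)         ; destination 1: value goes onto the stack
  (do ((l args (cdr l)))
      ((null l))
    (%push (car l)))                      ; push each argument in order
  (%activate-open-call-block)             ; make the call happen
  (%pop))                                 ; recover the value from the stack
```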
Binds the cell pointed to by locative to x, in the caller’s environment. This function is not defined in the interpreted Lisp environment; it only works from compiled code. Since it turns into an instruction, the "caller’s environment" really means "the binding block for the stack frame that executed the bind instruction". The preferred higher-level primitives which turn into this are let (let-fun), let-if (let-if-fun), and progv (progv-fun). [This will be renamed to %bind in the future.]
[Someday this may discuss how it works.]
This variable contains bits which control various disk usage features.
Bit 0 (the least significant bit) enables read-compares after disk read operations. This causes a considerable slowdown, so it is rarely used.
Bit 1 enables read-compares after disk write operations.
Bit 2 enables the multiple page swap-out feature. When this is enabled, as it is by default, each time a page is swapped out, up to 20 contiguous pages will also be written out to the disk if they have been modified. This greatly improves swapping performance.
Bit 3 controls the multiple page swap-in feature, which is also on by default. This feature causes pages to be swapped in in groups; each time a page is needed, several contiguous pages are swapped in in the same disk operation. The number of pages swapped in can be specified for each area using si:set-swap-recommendations-of-area.
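The bit assignments above can be summarized by a function that builds a value for the variable from four flags. This is a sketch only: make-disk-switches is a hypothetical helper, and it assumes that a set bit means the corresponding feature is enabled.

```lisp
;; Hypothetical helper: compose the disk-usage control bits.
(defun make-disk-switches (read-compare-reads read-compare-writes
                           multiple-swap-out multiple-swap-in)
  (logior (if read-compare-reads 1 0)          ; bit 0
          (if read-compare-writes 2 0)         ; bit 1
          (if multiple-swap-out 4 0)           ; bit 2
          (if multiple-swap-in (ash 1 3) 0)))  ; bit 3
```

Under these assumptions the default configuration, with both multiple-page features on and no read-compares, corresponds to bits 2 and 3 being set.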
Specifies that pages of area area-number should be swapped in in groups of recommendation at a time. This recommendation is used only if the multiple page swap-in feature is enabled.
Generally, the more memory a machine has, the higher the swap recommendations should be to get optimum performance.
Specifies the swap-in recommendation of all areas at once.
If wire-p is t, the page containing address is wired-down; that is, it cannot be paged-out. If wire-p is nil, the page ceases to be wired-down.
(si:unwire-page address) is the same as (si:wire-page address nil).
Makes sure that the storage which represents object is in main memory. Any pages which have been swapped out to disk are read in, using as few disk operations as possible. Consecutive disk pages are transferred together, taking advantage of the full speed of the disk. If object is large, this will be much faster than bringing the pages in one at a time on demand. The storage occupied by object is defined by the %find-structure-leader and %structure-total-size subprimitives.
This is a version of sys:page-in-structure which can bring in a portion of an array. from and to are lists of subscripts; if they are shorter than the dimensionality of array, the remaining subscripts are assumed to be zero.
Any pages in the range of address space starting at address and continuing for n-words which have been swapped out to disk are read in with as few disk operations as possible.
All swapped-out pages of the specified region or area are brought into main memory.
These are similar to the above, except that they take pages out of main memory rather than bringing them in. Actually, they only mark the pages as having priority for replacement by others. Use these operations when you are done with a large object, to make the virtual memory system prefer reclaiming that object’s memory over swapping something else out.
The page hash table entry for the page containing virtual-address is found and altered as specified. t is returned if it was found, nil if it was not (presumably the page is swapped out). swap-status and access-status-and-meta-bits can be nil if those fields are not to be changed. This doesn’t make any error checks; you can really screw things up if you call it with the wrong arguments.
This makes the hashing function for the page hash table available to the user.
This is used when adjusting the size of real memory available to the machine. It adds an entry for the page frame at physical-address to the page hash table, with virtual address -1, swap status flushable, and map status 120 (read only). This doesn’t make error checks; you can really screw things up if you call it with the wrong arguments.
If there is a page in the page frame at physical-address, it is swapped out and its entry is deleted from the page hash table, making that page frame unavailable for swapping in of pages in the future. This doesn’t make error checks; you can really screw things up if you call it with the wrong arguments.
Loads virtual memory from the partition named by the concatenation of the two 16-bit arguments, and starts executing it. The name 0 refers to the default load (the one the machine loads when it is started up). This is the primitive used by disk-restore (see disk-restore-fun).
Copies virtual memory into the partition named by the concatenation of the two 16-bit arguments (0 means the default), then restarts the world, as if it had just been restored. The physical-mem-size argument should come from %sys-com-memory-size in system-communication-area. This is the primitive used by disk-save (see disk-save-fun).
These functions deal with things like what closures deal with: the distinction between internal and external value cells and control over how they work.
This is the primitive that could be used by closure. First, if any of the symbols in list-of-symbols has no external value cell, a new external value cell is created for it, with the contents of the internal value cell. Then a list of locatives, twice as long as list-of-symbols, is created and returned. The elements are grouped in pairs: pointers to the internal and external value cells, respectively, of each of the symbols. closure could have been defined by:
(defun closure (variables function)
  (%make-pointer dtp-closure
     (cons function (sys:%binding-instances variables))))
This function is the primitive operation that invocation of closures could use. It takes a list such as sys:%binding-instances returns, and for each pair of elements in the list, it "adds" a binding to the current stack frame, in the same manner that the bind function (which should be called %bind) does. These bindings remain in effect until the frame returns or is unwound.
sys:%using-binding-instances checks for redundant bindings and ignores them. (A binding is redundant if the symbol is already bound to the desired external value cell). This check avoids excessive growth of the special pdl in some cases and is also made by the microcode which invokes closures, entities, and instances.
Returns the contents of the internal value cell of symbol. dtp-one-q-forward pointers are considered invisible, as usual, but dtp-external-value-cell-pointers are not; this function can return a dtp-external-value-cell-pointer. Such pointers will be considered invisible as soon as they leave the "inside of the machine", meaning internal registers and the stack.
The following variables’ values actually reside in the scratchpad memory of the processor. They are put there by dtp-one-q-forward invisible pointers. The values of these variables are used by the microcode. Many of these variables are highly internal and you shouldn’t expect to understand them.
This is the version number of the currently-loaded microcode, obtained from the version number of the microcode source file.
Size of micro-code-entry-area and related areas.
default-cons-area is documented on default-cons-area-var.
The area number of the area where bignums and flonums are consed. Normally this variable contains the value of sys:extra-pdl-area, which enables the "temporary storage" feature for numbers, saving garbage collection overhead.
sys:%current-stack-group and sys:%current-stack-group-previous-stack-group are documented on sys:%current-stack-group-var.
The sg-state of the currently-running stack group.
The argument list of the currently-running stack group.
The number of arguments to the currently-running stack group.
The microcode address of the most recent error trap.
The function which is called when the machine starts up. Normally this is the definition of si:lisp-top-level.
The stack group in which the machine starts up.
The stack group which receives control when a microcode-detected error occurs. This stack group cleans up, signals the appropriate condition, or assigns a stack group to run the debugger on the erring stack group.
The stack group which receives control when a sequence break occurs.
A fixnum which is the virtual address which maps to the Unibus location of the Chaosnet interface.
A fixnum which is the inclusive lower bound of the region of virtual memory subject to the MAR feature (see mar).
A fixnum which is the inclusive upper bound of the region of virtual memory subject to the MAR feature (see mar).
If non-nil, you can write into read-only areas. This is used by fasload.
self is documented on self-var.
inhibit-scheduling-flag is documented on inhibit-scheduling-flag-var.
If non-nil, the scavenger is turned off. The scavenger is the quasi-asynchronous portion of the garbage collector, which normally runs during consing operations.
Incremented whenever a new region is allocated.
Incremented whenever a new page is allocated.
t while the scavenger is running, nil when there are no pointers to oldspace.
A fixnum which is incremented whenever the garbage collector flips, converting one or more regions from newspace to oldspace. If this number has changed, the %pointer of an object may have changed.
A fixnum which is the virtual address of the TV buffer location of the run-light which lights up when the disk is active. This plus 2 is the address of the run-light for the processor. This minus 2 is the address of the run-light for the garbage collector.
A fixnum which contains the high 24 bits of the name of the disk partition from which virtual memory was booted. Used to create the greeting message.
Configuration of the disk being used for paging. Don’t change these!
A fixnum which controls extra disk error-checking. Bit 0 enables read-compare after a read, bit 1 enables read-compare after a write. Normally this is 0.
Used for communication between the window system and the microcoded graphics primitives.
The next four have to do with a metering system which is not yet documented in this manual.
t if the metering system is turned on for all stack-groups.
A temporary buffer used by the metering system.
Where the metering system writes its next block of results on the disk.
The number of disk blocks remaining for recording of metering information.
A list of all of the above symbols (and any others added after this documentation was written).
Returns the contents of the microcode meter named name, which can be a fixnum or a bignum. name must be one of the symbols listed below.
Writes value, a fixnum or a bignum, into the microcode meter named name. name must be one of the symbols listed below.
The microcode meters are as follows:
The number of times transmission on the Chaosnet was aborted, either by a collision or because the receiver was busy.
Internal state of the garbage collection algorithm.
The number of TV frames per clock sequence break. The default value is 67 which causes clock sequence breaks to happen about once per second.
The number of times the first-level virtual-memory map was invalid and had to be reloaded from the page hash table.
The number of times the second-level virtual-memory map was invalid and had to be reloaded from the page hash table.
The number of times the virtual address map was reloaded to contain only "meta bits", not an actual physical address.
The number of read references to the pdl buffer which were virtual memory references that trapped.
The number of write references to the pdl buffer which were virtual memory references that trapped.
The number of virtual memory references which trapped in case they should have gone to the pdl buffer, but turned out to be real memory references after all (and therefore were needlessly slowed down.)
The number of pages read from the disk.
The number of pages written to the disk.
The number of fresh (newly-consed) pages created in core, which would have otherwise been read from the disk.
The number of paging read operations; this can be smaller than the number of disk pages read when more than one page at a time is read.
The number of paging write operations; this can be smaller than the number of disk pages written when more than one page at a time is written.
The number of times a page was used after being read in before it was needed.
The number of times a page was read in before it was needed, but got evicted before it was ever used.
The number of times the machine waited for a page to finish being written out in order to evict the page.
The number of times the machine waited for a page to finish being written out in order to do something else with the disk.
The time spent waiting for the disk, in microseconds. This can be used to distinguish paging time from running time when measuring and optimizing the performance of programs.
The number of recoverable disk errors.
The number of times the disk seek mechanism was recalibrated, usually as part of error recovery.
The number of disk errors which were corrected through the error correcting code.
The number of times a read compare was done, no disk error occurred, but the data on disk did not match the data in memory.
The number of times a disk read was done over because after the read a read compare was done and did not succeed (either it got an error or the data on disk did not match the data in memory).
The number of times a disk write was done over because after the write a read compare was done and did not succeed (either it got an error or the data on disk did not match the data in memory).
Address of the next entry to be written in the disk error log. The function si:print-disk-error-log (see si:print-disk-error-log-fun) prints this log.
The number of times the page ager set an age trap on a page, to determine whether it was being referenced.
The number of times the page ager saw that a page still had an age trap and hence made it "flushable", a candidate for eviction from main memory.
A number from 0 to 3 which controls how long a page must remain unreferenced before it becomes a candidate for eviction from main memory.
The number of pages inspected by the page replacement algorithm.
The number of times no evictable page was found and extra aging had to be done.
A list of all of the above symbols (and any others added after this documentation was written).
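Since most of these meters are running counts, a common way to use them is to read a meter before and after some computation and take the difference. The following sketch assumes that the reading function described above is named read-meter and takes the meter's symbol as its argument; meter-delta is a hypothetical helper.

```lisp
;; Hypothetical helper: how much did the meter named name advance
;; while function (a function of no arguments) was running?
(defun meter-delta (name function)
  (let ((before (read-meter name)))
    (funcall function)
    (- (read-meter name) before)))
```

Applied to the disk wait time meter, for instance, this would give the number of microseconds the computation spent waiting for the disk.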
Storage in the Lisp Machine is divided into areas. Each area contains related objects, of any type. Areas are intended to give the user control over the paging behavior of his program, among other things. By putting related data together, locality can be greatly increased. Whenever a new object is created the area to be used can optionally be specified. For example, instead of using cons you can use cons-in-area (see cons-in-area-fun). Object-creating functions which take keyword arguments generally accept a :area argument. You can also control which area is used by binding default-cons-area (see default-cons-area-var); most functions that allocate storage use the value of this variable, by default, to specify the area to use.
There is a default Working Storage area which collects those objects which the user has not chosen to control explicitly.
Areas also give the user a handle to control the garbage collector. Some areas can be declared to be "static", which means that they change slowly and the garbage collector should not attempt to reclaim any space in them. This can eliminate a lot of useless copying. A "static" area can be explicitly garbage-collected at infrequent intervals when it is believed that that might be worthwhile.
Each area can potentially have a different storage discipline, a different paging algorithm, and even a different data representation. The microcode will dispatch on an attribute of the area at the appropriate times. The structure of the machine makes the performance cost of these features negligible; information about areas is stored in extra bits in the memory mapping hardware where it can be quickly dispatched on by the microcode; these dispatches usually have to be done anyway to make the garbage collector work, and to implement invisible pointers. This feature is not currently used by the system, except for the list/structure distinction described below.
Each area has a name and a number. The name is a symbol whose value is the number. The number is an index into various internal tables. Normally the name is treated as a special variable, so the number is what is given as an argument to a function that takes an area as an argument. Thus, areas are not Lisp objects; you cannot pass an area itself as an argument to a function; you just pass its number. There is a maximum number of areas (set at cold-load generation time); you can only have that many areas before the various internal tables overflow. Currently (as this manual is written) the limit is 256 areas, of which 64 already exist when you start.
The storage of an area consists of one or more regions. Each region is a contiguous section of address space with certain homogeneous properties. The most important of these is the data representation type. A given region can only store one type. The two types that exist now are list and structure. A list is anything made out of conses (a closure for instance). A structure is anything made out of a block of memory with a header at the front; symbols, strings, arrays, instances, compiled functions, etc. Since lists and structures cannot be stored in the same region, they cannot be on the same page. It is necessary to know about this when using areas to increase locality of reference.
When you create an area, one region is created initially. When you try to allocate memory to hold an object in some area, the system tries to find a region that has the right data representation type to hold this object, and that has enough room for it to fit. If there isn’t any such region, it makes a new one (or signals an error; see the :size option to make-area, below). The size of the new region is an attribute of the area (controllable by the :region-size option to make-area).
If regions are too large, memory may get taken up by a region and never used. If regions are too small, the system may run out of regions because regions, like areas, are defined by internal tables that have a fixed size (set at cold-load generation time). Currently (as this manual is written) the limit is 256 regions, of which about 90 already exist when you start. (If you’re wondering why the limit on regions isn’t higher than the limit on areas, as it clearly ought to be, it’s just because both limits have to be multiples of 256 for internal reasons, and 256 regions seem to be enough.)
The value of this variable is the number of the area in which objects are created by default. It is initially the number of working-storage-area. Giving nil where an area is required uses the value of default-cons-area.
Note that to put objects into an area other than working-storage-area you can either bind this variable or use functions such as cons-in-area (see cons-in-area-fun) which take the area as an explicit argument.
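The two approaches can be sketched as follows. The function names are hypothetical, cons-in-area is assumed to take the area number as its last argument, and the second function relies on append consing its copy in whatever area default-cons-area names at the time.

```lisp
;; Explicit area argument.
(defun cons-pair-in-area (area x y)
  (cons-in-area x y area))

;; Binding default-cons-area around ordinary consing.
(defun copy-list-into-area (area list)
  (let ((default-cons-area area))
    (append list nil)))     ; copies the top level of list
```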
Creates a new area, whose name and attributes are specified by the keywords. You must specify a symbol as a name; the symbol will be setq’ed to the area-number of the new area, and that number will also be returned, so that you can use make-area as the initialization of a defvar. The arguments are taken in pairs, the first being a keyword and the second a "value" for that keyword. The last three keywords documented herein are in the nature of subprimitives; like the stuff in chapter subprimitive-chapter, their meaning is system-dependent and is not documented here. The following keywords exist:
A symbol which will be the name of the area. This item is required.
The maximum allowed size of the area, in words. Defaults to infinite. If the number of words allocated to the area reaches this size, attempting to cons an object in the area will signal an error.
The approximate size, in words, for regions within this area. The default is the area size if a :size argument was given, otherwise a suitable medium size. Note that if you specify :size and not :region-size, the area will have exactly one region. When making an area which will be very big, it is desirable to make the region size larger than the default region size to avoid creating very many regions and possibly overflowing the system’s fixed-size region tables.
The type of object to be contained in the area’s initial region. The argument to this keyword can be :list, :structure, or a numeric code. :structure is the default. If you are only going to cons lists in your area, you should specify :list so you don’t get a useless structure region.
The type of garbage-collection to be employed. The choices are :dynamic (which is the default) and :static. :static means that the area will not be copied by the garbage collector, and nothing in the area or pointed to by the area will ever be reclaimed, unless a garbage collection of this area is manually requested.
With an argument of t, causes the area to be made read-only. Defaults to nil. If an area is read-only, then any attempt to change anything in it (altering a data object in the area, or creating a new object in the area) will signal an error unless sys:%inhibit-read-only (see sys:%inhibit-read-only-var) is bound to a non-nil value.
With an argument of t, makes the area suitable for storing regular-pdls of stack-groups. This is a special attribute due to the pdl-buffer hardware. Defaults to nil. Areas for which this is nil may not be used to store regular-pdls. Areas for which this is t are relatively slow to access; all references to pages in the area will take page faults to check whether the referenced location is really in the pdl-buffer.
Lets you specify the map bits explicitly, overriding the specification from the other keywords. This is for special hacks only.
Lets you specify the space type explicitly, overriding the specification from the other keywords. This is for special hacks only.
Lets you override the scavenge-enable bit explicitly. This is an internal flag related to the garbage collector. Don’t mess with this!
With an argument of t, adds this area to the list of areas which are displayed by default by the room function (see room-fun).
Example:
(make-area ':name 'foo-area ':gc ':dynamic ':representation ':list)
area may be the name or the number of an area. Various attributes of the area are printed.
The value of area-list is a list of the names of all existing areas. This list shares storage with the internal area name table, so you should not change it.
Returns the number of the area to which pointer points, or nil if it does not point within any known area. The data-type of pointer is ignored.
Returns the number of the region to which pointer points, or nil if it does not point within any known region. The data-type of pointer is ignored. (This information is generally not very interesting to users; it is important only inside the system.)
Given an area number, returns the name. This "function" is actually an array.
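These two operations compose naturally to find the name of the area an object lives in. The sketch below assumes that the function returning an area number is named %area-number and that the name table is called area-name; area-name-of itself is a hypothetical helper.

```lisp
;; Hypothetical helper: name of the area containing object, or nil
;; if object does not point within any known area.
(defun area-name-of (object)
  (let ((number (%area-number object)))
    (and number (area-name number))))
```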
See also cons-in-area (cons-in-area-fun), list-in-area (list-in-area-fun), and room (room-fun).
This section lists the names of some of the areas and tells what they are for. Only the ones of the most interest to a user are listed; there are many others.
This is the normal value of default-cons-area. Most working data are consed in this area.
This area is to be used for "permanent" data, which will (almost) never become garbage. Unlike working-storage-area, the contents of this area are not continually copied by the garbage collector; it is a static area.
Print-names of symbols are stored in this area.
This area contains most of the symbols in the Lisp world, except t and nil, which are in a different place for historical reasons.
This area contains packages, principally the hash tables with which intern keeps track of symbols.
FEFs (compiled functions) are put here by the compiler and by fasload.
This area holds the property lists of symbols.
The purpose of the Lisp compiler is to convert Lisp functions into programs in the Lisp Machine’s instruction set, so that they will run more quickly and take up less storage. Compiled functions are represented in Lisp by FEFs (Function Entry Frames), which contain machine code as well as various other information. The printed representation of a FEF is
#<DTP-FEF-POINTER address name>
If you want to understand the output of the compiler, refer to understanding-compiled-code.
There are three ways to invoke the compiler from the Lisp
Machine. First, you may have an interpreted function in the Lisp
environment which you would like to compile. The function compile
is used to do this. Second, you may have code in an editor buffer
which you would like to compile. The Zwei editor has commands
to read code into Lisp and compile it.
Third, you may have a program (a group of function definitions and other forms) written in a
file on the file system. The compiler can translate this file into a
QFASL file. Loading in the QFASL file is almost the same as reading in the source
file; the difference is that the functions defined in the file will be
defined as compiled functions instead of interpreted functions.
The qc-file function is used for translating source files into QFASL files.
If definition is supplied, it should be a lambda-expression.
Otherwise function-spec (this is usually a symbol, but see function-spec for details)
should be defined as an interpreted function and
its definition will be used as the lambda-expression to be compiled.
The compiler converts the lambda-expression into a FEF, saves the
lambda-expression as the :previous-expr-definition and
:previous-definition properties of function-spec if it is a symbol, and changes
function-spec’s definition to be the FEF. (See fdefine, fdefine-fun.)
(Actually, if function-spec is not defined as a lambda-expression,
and function-spec is a symbol, compile will try to find a lambda-expression in
the :previous-expr-definition property of function-spec and use that
instead.)
If symbol is not defined as an interpreted function and it
has a :previous-expr-definition property, then uncompile
will restore the function cell from the value of the property.
(Otherwise, uncompile does nothing and returns "Not compiled".)
This "undoes" the effect of compile. See also undefun, undefun-fun.
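As a minimal sketch of the round trip described above, one might type the following at Lisp top level. The function square is a hypothetical example, not part of the system.

```lisp
;; SQUARE is a hypothetical function, used only for illustration.
(defun square (x)
  (* x x))

;; Convert the interpreted definition into a FEF.  The old definition
;; is saved on the :previous-expr-definition property of SQUARE.
(compile 'square)

;; Restore the interpreted definition from that property.
(uncompile 'square)
```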
qc-file takes a formidable number of arguments, but normally only one
argument is supplied.
The file filename is given to the compiler, and the output of the
compiler is written to a file whose name is filename except with a
file type of "QFASL". The input format for files to the compiler is
described in compiler-input-section.
Macro definitions, subst definitions, and special declarations created during
the compilation are undone when the compilation is finished.
The optional arguments allow certain modifications to the standard procedure. output-file lets you change where the output is written. package lets you specify in what package the source file is to be read. Normally the system knows, or asks interactively, and you need not supply this argument. load-flag and in-core-flag are incomprehensible; you don’t want to use them. functions-defined and file-local-declarations are for compiling multiple files as if they were one. dont-set-default-p suppresses the changing of the default file name to filename that normally occurs.
Normally, a form is read from the file and processed and then another
form is read and processed, and so on. But if
read-then-process-flag is non-nil, the whole source file is read
before any of it is processed. This is not done by default; it has the
problem that compile-time reader-macros defined in the file will not
work properly.
qc-file-load compiles a file and then loads in the resulting QFASL file.
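A typical call supplies only the file name. The sketch below assumes a hypothetical source file called "myprog"; the exact file name syntax depends on the file system that holds the source.

```lisp
;; "myprog" is a hypothetical file name, used only for illustration.

;; Compile the source file; the output goes to a file of type QFASL.
(qc-file "myprog")

;; Compile the source file and then load the resulting QFASL file.
(qc-file-load "myprog")
```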
See also the disassemble function (disassemble-fun), which lists the instructions
of a compiled function in symbolic form.
The purpose of qc-file is to take a file and produce
a translated version which does the same thing as the original except
that the functions are compiled. qc-file reads through the input
file, processing the forms in it one by one. For each form, suitable
binary output is sent to the QFASL file so that when the QFASL file is
loaded the effect of that source form will be reproduced. The differences
between source files and QFASL files are that QFASL files are in a compressed
binary form which reads much faster (but cannot be edited), and that
function definitions in QFASL files have been translated from Lisp forms
to FEFs.
So, if the source contains a (defun ...) form at top level,
then when the QFASL file is loaded, the function will be defined as a
compiled function. If the source file contains a form which is not of a
type known specially to the compiler, then that form (encoded in QFASL
format) will be output "directly" into the QFASL file, so that when the
QFASL file is loaded that form will be evaluated. Thus, if the source
file contains (setq x 3), then the compiler will put in the QFASL
file instructions to set x to 3 at load time (that is, when
the QFASL file is loaded into the Lisp environment). It happens that QFASL
files have a specific way to setq a symbol. For a more general form,
the QFASL file would contain instructions to recreate the list structure
of a form and then call eval on it.
Sometimes we want to put things in the file that are not merely meant to be translated into QFASL form. One such occasion is top level macro definitions; the macros must actually get defined within the compiler in order for the compiler to be able to expand them at compile time. So when a macro form is seen, it should (sometimes) be evaluated at compile time, and should (sometimes) be put into the QFASL file.
Another thing we sometimes want to put in a file is compiler declarations. These are forms which should be evaluated at compile time to tell the compiler something. They should not be put into the QFASL file, unless they are useful for working incrementally on the functions in the file, compiling them one by one from the editor.
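Therefore, a facility exists to allow the user to tell the compiler just what to do with a form. One might want a form to be put into the QFASL file (translated), or not; evaluated within the compiler, or not; and evaluated if the file is read directly into Lisp, or not.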
Two forms are recognized by the compiler to allow this. The less
general, old-fashioned one is declare; the completely
general one is eval-when.
An eval-when form looks like
(eval-when times-list form1 form2 ...)
The times-list may contain one or more of the symbols load, compile, or eval.
If load is present, the forms are written into the QFASL file
to be evaluated when the QFASL file is loaded (except that defun forms
will put the compiled definition into the QFASL file instead).
If compile is present, the forms are evaluated in the compiler.
If eval is present, the forms are evaluated when read into Lisp;
this is because eval-when is defined as a special form in Lisp. (The
compiler ignores eval in the times-list.)
For example,
(eval-when (compile eval) (macro foo (x) (cadr x)))
would define foo as a macro in the compiler and when the file
is read in interpreted, but not when the QFASL file is fasloaded.
For the rest of this section, we will use lists such as are
given to eval-when, e.g. (load eval), (load compile), etc.,
to describe when forms are evaluated.
A declare form looks like (declare form1 form2 ...).
declare is defined in Lisp as a special form which does nothing;
so the forms within a declare are not evaluated at eval time.
The compiler does the following upon finding form within
a declare: if form is a call to either special
or unspecial, form is treated as (load compile);
otherwise it is treated as (compile).
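The rules for eval-when and declare can be seen together in a short fragment of a source file. This is only a sketch: the names trace-depth, make-doubling-form, and double are hypothetical, and the comments restate the rules given above rather than describing any additional behavior.

```lisp
;; Hypothetical fragment of a source file given to qc-file.

;; Treated as (load compile): the variable is special in the compiler,
;; and the declaration is also written into the QFASL file.
(declare (special trace-depth))

;; Treated as (compile): evaluated in the compiler only.
(declare (setq open-code-map-switch t))

;; Evaluated in the compiler, and when the source file is read into
;; Lisp, but not written into the QFASL file.
(eval-when (compile eval)
  (defun make-doubling-form (x)
    (list 'plus x x)))

;; Written into the QFASL file (as a compiled definition, since this
;; is a defun), and defined when the source file is read into Lisp.
(eval-when (load eval)
  (defun double (n)
    (plus n n)))
```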
If a form is enclosed in neither an eval-when nor a declare,
then the times at which it will be evaluated depend on the form.
The following table summarizes at what times evaluation will take
place for any given form seen at top level by the compiler.
(eval-when times-list form1 ...)
        times-list
(declare (special ...)) or (declare (unspecial ...))
        (load compile)
(declare anything-else)
        (compile)
(special ...) or (unspecial ...)
        (load compile eval)
(macro ...) or (defmacro ...) or (defsubst ...)
        (load compile eval)
(comment ...)
        Ignored at all times.
(compiler-let ((var val) ...) body...)
        Processes the body in its normal fashion, but
        at (compile eval) time, the indicated
        variable bindings are in effect. These variables will typically
        affect the operation of the compiler or of macros. See compiler-let-discussion.
(local-declare (decl decl ...) body...)
        Processes the body in its normal fashion, with the indicated
        declarations added to the front of the list which is the value
        of local-declarations.
(defflavor ...) or (defstruct ...)
        (load compile eval)
(defun ...) or (defmethod ...) or (defselect ...)
        (load eval), but at load time what is processed is not this form
        itself, but the result of compiling it.
anything-else
        (load eval)
Sometimes a macro wants to return more than one form for the compiler top level to see (and to be evaluated). The following facility is provided for such macros. If a form
(progn (quote compile) form1 form2 ...)
is seen at the compiler top level, all of the forms are processed as if they had been at
compiler top level. (Of course, in the interpreter they
will all be evaluated, and the (quote compile) will harmlessly
evaluate to the symbol compile and be ignored.)
See progn-quote-compile-discussion for additional discussion of this.
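As an illustration, here is a sketch of a macro that returns several forms for the compiler top level to process. The macro define-counter and the names hit-count and note-hit are hypothetical; the expansion is built with list so that its structure is explicit.

```lisp
;; DEFINE-COUNTER is a hypothetical macro, used only for illustration.
;; It expands into a (progn 'compile ...) form so that each of the
;; three forms is processed as if it had appeared at top level.
(defmacro define-counter (variable function)
  (list 'progn ''compile
        (list 'declare (list 'special variable))
        (list 'setq variable 0)
        (list 'defun function '()
              (list 'setq variable (list '1+ variable)))))

;; (define-counter hit-count note-hit) expands into
;;   (progn 'compile
;;          (declare (special hit-count))
;;          (setq hit-count 0)
;;          (defun note-hit () (setq hit-count (1+ hit-count))))
```

Without the (quote compile) marker, the compiler would treat the whole progn as a single ordinary form rather than looking at the declare and the defun individually.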
When seen by the interpreter, if one of the times is the symbol eval
then the body forms are evaluated; otherwise eval-when does nothing.
But when seen by the compiler, this special form does the special things described above.
When seen by the interpreter, declare does nothing, and returns the symbol declare.
But when seen by the compiler, this special form does the special things described above.
There is also a different use of declare, used in conjunction with the arglist
function (see arglist-fun).
This section describes functions meant to be called during
compilation, and variables meant to be set or bound during compilation,
by using declare or local-declare.
A local-declare form looks like
(local-declare (decl1 decl2 ...) form1 form2 ...)
Each decl is consed onto the list local-declarations while
the forms are being evaluated (in the interpreter) or compiled
(in the compiler). There are two uses for this. First, it can be
used to pass information from outer macros to inner macros. Secondly,
the compiler will specially interpret certain decls as local
declarations, which only apply to the compilations of the forms.
It understands the following forms:
(special var1 var2 ...)
        The variables var1, var2, etc. will be treated as special variables during the compilation of the forms.
(unspecial var1 var2 ...)
        The variables var1, var2, etc. will be treated as local variables during the compilation of the forms.
(arglist . arglist)
        Putting this local declaration around a defun saves arglist as the
        argument list of the function, to be used instead of its lambda-list if
        anyone asks what its arguments are. This is purely documentation.
(return-list . values)
        Putting this local declaration around a defun saves values as the
        return values list of the function, to be used if anyone asks what values
        it returns. This is purely documentation.
(def name . definition)
        name will be defined for the compiler during the compilation
        of the forms. The compiler uses this to keep track of macros and
        open-codable functions (defsubsts) defined in the file being compiled.
        Note that the cddr of this item is a function.
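For instance, a special declaration can be confined to a pair of functions. In this sketch the variable current-depth and the functions descend and ascend are hypothetical names chosen for illustration.

```lisp
;; CURRENT-DEPTH is treated as a special variable only while these two
;; (hypothetical) functions are being compiled.
(local-declare ((special current-depth))
  (defun descend ()
    (setq current-depth (1+ current-depth)))
  (defun ascend ()
    (setq current-depth (1- current-depth))))
```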
special declares each variable to be "special" for the compiler.
unspecial removes any "special" declarations of the variables for the compiler.
The next three declarations are primarily for Maclisp compatibility.
*expr declares each symbol to be the name of a function. In addition it prevents these functions from appearing in the list of functions referenced but not defined printed at the end of the compilation.
*lexpr declares each symbol to be the name of a function. In addition it prevents these functions from appearing in the list of functions referenced but not defined printed at the end of the compilation.
*fexpr declares each symbol to be the name of a special form. In addition it prevents these names from appearing in the list of functions referenced but not defined printed at the end of the compilation.
There are some advertised variables whose compile-time values affect the operation of the compiler. The user may set these variables by including in his file forms such as
(declare (setq open-code-map-switch t))
If run-in-maclisp-switch is non-nil, the compiler will try to warn the
user about any constructs which will not work in Maclisp. By no means
will all Lisp Machine system functions not built in to Maclisp be
cause for warnings; only those which could not be written by the user
in Maclisp (for example, make-array, value-cell-location, etc.).
Also, lambda-list keywords such as &optional and initialized prog
variables will be mentioned. This switch also inhibits the warnings
for obsolete Maclisp functions.
The default value of this variable is nil.
If obsolete-function-warning-switch is non-nil, the compiler will try to warn
the user whenever an "obsolete" Maclisp-compatibility function such as
maknam or samepnamep is used. The default value is t.
If allow-variables-in-function-position-switch is non-nil, the compiler allows the use of
the name of a variable in function position to mean that the
variable’s value should be funcall’d. This is for compatibility
with old Maclisp programs. The default value of this variable is nil.
If open-code-map-switch is non-nil, the compiler will attempt
to produce inline code for the mapping functions (mapc, mapcar, etc.,
but not mapatoms) if the function being mapped is an anonymous
lambda-expression. This allows that function to reference
the local variables of the enclosing function without the need for special
declarations.
The generated code is also more efficient. The default value is t.
If all-special-switch is non-nil, the compiler regards all variables
as special, regardless of how they were declared. This provides
compatibility with the interpreter at the cost of efficiency.
The default is nil.
If inhibit-style-warning-switch is non-nil, all compiler style-checking is
turned off. Style checking is used to issue obsolete function
warnings and won’t-run-in-Maclisp warnings, and other sorts of
warnings. The default value is nil. See also the
inhibit-style-warnings macro, which acts on one level only of an
expression.
Syntactically identical to let, compiler-let allows
compiler switches to be bound locally at compile time, during the
processing of the body forms.
Example:
(compiler-let ((open-code-map-switch nil))
  (map (function (lambda (x) ...)) foo))
will prevent the compiler from open-coding the map.
When interpreted, compiler-let is equivalent to let. This
is so that global switches which affect the behavior of macro
expanders can be bound locally.
By controlling the compile-time values of the variables run-in-maclisp-switch,
obsolete-function-warning-switch, and inhibit-style-warning-switch (explained
above), you can enable or disable some of the warning messages of the compiler.
The following special form is also useful:
inhibit-style-warnings prevents the compiler from performing style-checking on the top level of form. Style-checking will still be done on the arguments of form. Both obsolete function warnings and won’t-run-in-Maclisp warnings are done by means of the style-checking mechanism, so, for example,
(setq bar (inhibit-style-warnings (value-cell-location foo)))
will not warn that value-cell-location will not work in Maclisp, but
(inhibit-style-warnings (setq bar (value-cell-location foo)))
will warn, since inhibit-style-warnings applies only to the top
level of the form inside it (in this case, to the setq).
Sometimes functions take arguments that they deliberately do not use.
Normally the compiler warns you if your program binds a variable that it
never references. In order to disable this warning for variables that
you know you are not going to use, there are two things you can do. The
first thing is to name the variables ignore or ignored. The
compiler will not complain if a variable by one of these names is not
used. Furthermore, by special dispensation, it is all right to have
more than one variable in a lambda-list that has one of these names.
The other thing you can do is simply use the variable, for effect
(ignoring its value), at the front of the function. Example:
(defun the-function (list fraz-name fraz-size)
  fraz-size        ; This argument is not used.
  ...)
This has the advantage that arglist (see arglist-fun) will return
a more meaningful argument list for the function, rather than returning
something with ignores in it.
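The first technique, naming the unused variable ignore, might look like the following sketch. The function name-and-size is hypothetical.

```lisp
;; NAME-AND-SIZE is a hypothetical function.  Its first argument is
;; deliberately unused, so it is named IGNORE and the compiler will
;; not complain about it.
(defun name-and-size (ignore fraz-name fraz-size)
  (list fraz-name fraz-size))
```

The cost of this technique is the one noted above: anyone asking for the argument list sees ignore rather than a meaningful name.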
The following function is useful for requesting compiler warnings in
certain esoteric cases. Normally, the compiler notices whenever any
function x uses (calls) any other function y; it makes notes of
all these uses, and then warns you at the end of the compilation if the
function y got called but was neither defined nor declared (by
*expr, see *expr-fun). This usually does what you want, but
sometimes there is no way the compiler can tell that a certain function is being
used. Suppose that instead of x’s containing any forms that call
y, x simply stores y away in a data structure somewhere, and
someplace else in the program that data structure is accessed and
funcall is done on it. There is no way that the compiler can see
that this is going to happen, and so it can’t notice the function usage,
and so it can’t create a warning message. In order to make such
warnings happen, you can explicitly call the following function at
compile-time.
what is a symbol that is being used as a function. by may be any function spec.
compiler:function-referenced must be
called at compile-time while a compilation is in progress. It tells the
compiler that the function what is referenced by by. When the compilation
is finished, if the function what has not been defined, the compiler will
issue a warning to the effect that by referred to the function what,
which was never defined.
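The situation described above might look like the following sketch. All of the names (click-handler, install-handler, run-handler, handle-click) are hypothetical. The argument order (what first, then by) is an assumption taken from the order in which the description introduces the arguments, and the call is wrapped in eval-when so that it happens at compile time.

```lisp
(declare (special click-handler))

;; INSTALL-HANDLER stores the symbol HANDLE-CLICK away without
;; calling it, so the compiler cannot see that it is used.
(defun install-handler ()
  (setq click-handler 'handle-click))

;; Somewhere else, the stored function is finally called.
(defun run-handler (x)
  (funcall click-handler x))

;; Tell the compiler explicitly, at compile time, that
;; INSTALL-HANDLER refers to the function HANDLE-CLICK.
(eval-when (compile)
  (compiler:function-referenced 'handle-click 'install-handler))
```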
The compiler:make-obsolete special form declares a function to be obsolete; code that calls it
will get a compiler warning, under the control of obsolete-function-warning-switch.
This is used by the compiler to mark as obsolete some Maclisp functions which exist in
Zetalisp but should not be used in new programs. It can also
be useful when maintaining a large system, as a reminder that a function has
become obsolete and usage of it should be phased out. An example of an
obsolete-function declaration is:
(compiler:make-obsolete create-mumblefrotz "use MUMBLIFY with the :FROTZ option instead")
The compiler stores optimizers for source code on property lists so as
to make it easy for the user to add them. An optimizer can be used to
transform code into an equivalent but more efficient form (for
example, (eq obj nil) is transformed into (null obj),
which can be compiled better). An optimizer can also be used to
tell the compiler how to compile a special form. For example,
in the interpreter do is a special form, implemented by a function
which takes quoted arguments and calls eval. In the compiler,
do is expanded in a macro-like way by an optimizer, into
equivalent Lisp code using prog, cond, and go, which
the compiler understands.
The compiler finds the optimizers to apply to a form by looking for
the compiler:optimizers property of the symbol which is the
car of the form. The value of this property should be a list of
optimizers, each of which must be a function of one argument. The
compiler tries each optimizer in turn, passing the form to be
optimized as the argument. An optimizer which returns the original
form unchanged (eq to the argument) has "done nothing", and
the next optimizer is tried. If the optimizer returns anything else,
it has "done something", and the whole process starts over again.
Only after all the optimizers
have been tried and have done nothing is an ordinary macro definition
processed. This is so that the macro definitions, which will be seen
by the interpreter, can be overridden for the compiler by optimizers.
Optimizers should not be used to define new language features, because they only take effect in the compiler; the interpreter (that is, the evaluator) doesn’t know about optimizers. So an optimizer should not change the effect of a form; it should produce another form that does the same thing, possibly faster or with less memory or something. That is why they are called optimizers. If you want to actually change the form to do something else, you should be using macros.
compiler:add-optimizer puts optimizer on function’s optimizers list if it isn’t there already. optimizer is the name of an optimization function, and function is the name of the function whose calls are to be processed. Neither is evaluated.
(compiler:add-optimizer function optimizer optimize-into-1 optimize-into-2...)
also remembers optimize-into-1, etc., as
names of functions which may be called in place of function as a result
of the optimization.
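The optimizer protocol can be sketched as follows. The function twice and its optimizer twice-optimizer are hypothetical. The optimizer rewrites a call only when the argument is a symbol, so that the argument is never evaluated twice; otherwise it returns the original form, which signals that it has "done nothing".

```lisp
;; TWICE is a hypothetical function, used only for illustration.
(defun twice (x)
  (times 2 x))

;; A hypothetical optimizer for calls to TWICE.  It takes the whole
;; form as its one argument.  (twice n) becomes (plus n n); any other
;; call is returned unchanged (eq to the argument).
(defun twice-optimizer (form)
  (cond ((symbolp (cadr form))
         (list 'plus (cadr form) (cadr form)))
        (t form)))

;; Install the optimizer.  Neither name is evaluated.  PLUS is noted
;; as a function that may be called in place of TWICE.
(compiler:add-optimizer twice twice-optimizer plus)
```

Because the rewritten form computes the same value as the original call, the interpreter and the compiler still agree on what (twice n) means, as the text requires of an optimizer.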
Certain programs are intended to be run both in Maclisp and in
Zetalisp. Their source files need some special conventions. For
example, all special declarations must be enclosed in declares,
so that the Maclisp compiler will see them. The main issue is that many
functions and special forms of Zetalisp do not exist in
Maclisp. It is suggested that you turn on run-in-maclisp-switch in
such files, which will warn you about a lot of problems that your program
may have if you try to run it in Maclisp.
The macro-character combination "#Q" causes the object that follows it
to be visible only when compiling for Zetalisp. The combination
"#M" causes the following object to be visible only when compiling for
Maclisp. These work both on subexpressions of the objects in the file,
and at top level in the file. To conditionalize top-level objects,
however, it is better to put the macros if-for-lispm and
if-for-maclisp around them. (You can only put these around a single
object.) The if-for-lispm macro turns off run-in-maclisp-switch
within its object, preventing spurious warnings from the compiler. The
#Q macro-character cannot do this, since it can be used to
conditionalize any S-expression, not just a top-level form.
To allow a file to detect what environment it is being compiled in, the following macros are provided:
If (if-for-lispm form) is seen at the top level of
the compiler, form is passed to the compiler top level if
the output of the compiler is a QFASL file intended for Zetalisp.
If the Zetalisp interpreter sees this it will evaluate form
(the macro expands into form).
If (if-for-maclisp form) is seen at the top level of
the compiler, form is passed to the compiler top level if
the output of the compiler is a FASL file intended for Maclisp
(e.g. if the compiler is COMPLR).
If the Zetalisp interpreter sees this it will ignore it
(the macro expands into nil).
If (if-for-maclisp-else-lispm form1 form2) is seen at the top level of
the compiler, form1 is passed to the compiler top level if
the output of the compiler is a FASL file intended for Maclisp;
otherwise form2 is passed to the compiler top level.
In Zetalisp, (if-in-lispm form) causes form
to be evaluated; in Maclisp, form is ignored.
In Maclisp, (if-in-maclisp form) causes form
to be evaluated; in Zetalisp, form is ignored.
When you have two definitions of one function, one
conditionalized for one machine and one for the other, put them next to
each other in the source file with the second "(defun" indented by
one space, and the editor will put both function definitions on the
screen when you ask to edit that function.
In order to make sure that those macros are defined when reading the file into the Maclisp compiler, you must make the file start with a prelude, which should look like:
(declare (cond ((not (status feature lispm)) (load '|AI: LISPM2; CONDIT|))))
This will do nothing when you compile the program on the Lisp Machine.
If you compile it with the Maclisp compiler, it will load in definitions
of the above macros, so that they will be available to your
program. The form (status feature lispm) is generally useful in
other ways; it evaluates to t when evaluated on the Lisp Machine and
to nil when evaluated in Maclisp.
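Putting these conventions together, the start of a file meant for both systems might look like the sketch below. The functions host-kind and describe-host are hypothetical; the prelude is the one given above.

```lisp
;; Prelude: load the conditionalization macros when in Maclisp.
(declare (cond ((not (status feature lispm))
                (load '|AI: LISPM2; CONDIT|))))

;; Two definitions of the hypothetical function HOST-KIND,
;; one for each system.
(if-for-lispm
  (defun host-kind () 'lisp-machine))

(if-for-maclisp
  (defun host-kind () 'maclisp))

;; #Q and #M conditionalize a subexpression rather than a whole
;; top-level form.
(defun describe-host ()
  (print #Q "Zetalisp" #M "Maclisp"))
```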
It is possible to make a QFASL file containing data, rather than a compiled program. This can be useful to speed up loading of a data structure into the machine, as compared with reading in printed representations. Also, certain data structures such as arrays do not have a convenient printed representation as text, but can be saved in QFASL files. For example, the system stores fonts this way. Each font is in a QFASL file (on the LMFONT directory) which contains the data structures for that font. When the file is loaded, the symbol which is the name of the font gets set to the array which represents the font. Putting data into a QFASL file is often referred to as "fasdumping the data".
In compiled programs, the constants are saved in the QFASL file in this way.
The compiler optimizes by making constants which are equal
become eq when the file is loaded. This does not happen when you make a data file yourself;
identity of objects is preserved. Note that when a QFASL file is loaded,
objects that were eq when the file was written are still eq; this does not
normally happen with text files.
The following types of objects can be represented in QFASL files: symbols (but uninterned symbols will be interned when the file is loaded), numbers of all kinds, lists, strings, arrays of all kinds, instances, and FEFs.
When an instance is fasdumped (put into a QFASL file), it is sent a :fasd-form
message, which must return a Lisp form which, when evaluated, will recreate the
equivalent of that instance. This is because instances are often part of a large
data structure, and simply fasdumping all of the instance variables and making
a new instance with those same values is unlikely to work. Instances remain
eq; the :fasd-form message is only sent the first time a particular instance
is encountered during writing of a QFASL file. If the instance does
not accept the :fasd-form message, it cannot be fasdumped.
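A flavor might handle the :fasd-form message along the following lines. This is only a sketch: the flavor point is hypothetical, and it assumes the defflavor, defmethod, and make-instance facilities of the flavor system as documented elsewhere in this manual.

```lisp
;; POINT is a hypothetical flavor with two instance variables, both
;; of which may be initialized when an instance is made.
(defflavor point (x y) ()
  :initable-instance-variables)

;; Return a form which, when evaluated at load time, makes an
;; instance equivalent to this one.
(defmethod (point :fasd-form) ()
  (list 'make-instance ''point ':x x ':y y))
```

For a point whose x is 3 and whose y is 4, the method returns the form (make-instance 'point ':x 3 ':y 4).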
Writes a QFASL file named filename which contains the value of symbol.
When the file is loaded, symbol will be setq’ed to the same value.
filename is parsed with the same defaults that load and qc-file use.
The file type defaults to "qfasl".
Writes the font named name into a QFASL file with the appropriate name (on the LMFONT directory).
This is a way to dump a complex data structure into a QFASL file. The values,
the function definitions, and some of the properties of certain symbols are put into
the QFASL file in such a way that when the file is loaded the symbols will
be setqed, fdefined, and putproped appropriately. The user can
control what happens to symbols discovered in the data structures being fasdumped.
filename is the name of the file to be written. It is parsed with
the same defaults that load and qc-file use. The file type
defaults to "qfasl".
symbols is a list of symbols to be processed. properties is a list of properties which are to be fasdumped if they are found on the symbols. dump-values-p and dump-functions-p control whether the values and function definitions are also dumped.
new-symbol-function is called whenever a new symbol is found in the
structure being dumped. It can do nothing, or it can add the symbol to the
list to be processed by calling compiler:fasd-symbol-push. The value
returned by new-symbol-function is ignored.
If eval is handed a list whose car is a symbol, then eval
inspects the definition of the symbol to find out what to do. If the
definition is a cons, and the car of the cons is the symbol
macro, then the definition (i.e. that cons) is called a macro.
The cdr of the cons should be a function of one argument.
eval applies the function to the form it was originally given,
takes whatever is returned, and evaluates that in lieu of the
original form.
Here is a simple example. Suppose the definition of the symbol first is
(macro lambda (x) (list 'car (cadr x)))
This thing is a macro: it is a cons whose car is the symbol
macro. What happens if we try to evaluate a form (first '(a b c))?
Well, eval sees that it has a list whose car is a symbol
(namely, first), so it looks at the definition of the symbol and
sees that it is a cons whose car is macro; the definition is
a macro.
eval takes the cdr of the cons, which is supposed to be the
macro’s expander function, and calls it providing as an argument the
original form that eval was handed. So it calls
(lambda (x) (list 'car (cadr x)))
with argument (first '(a b c)).
Whatever this returns is the expansion of the macro call. It will
be evaluated in place of the original form.
In this case, x is bound to (first '(a b c)), (cadr x)
evaluates to '(a b c), and (list 'car (cadr x)) evaluates to
(car '(a b c)), which is the expansion. eval now evaluates the
expansion. (car '(a b c)) returns a, and so the result is that
(first '(a b c)) returns a.
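The two steps that eval performs can be imitated by hand. In the sketch below, the expander is written as an ordinary function so that it can be called directly; expand-first and eval-macro-form are hypothetical names, and this is an illustration of the mechanism rather than the evaluator's actual code.

```lisp
;; The expander function of the macro, written as an ordinary
;; function.  Its one argument is the whole macro form.
(defun expand-first (x)
  (list 'car (cadr x)))

;; What eval does with a macro form: apply the expander to the whole
;; form, then evaluate the expansion in place of the original form.
(defun eval-macro-form (form)
  (eval (expand-first form)))

;; (expand-first '(first '(a b c)))     returns (car '(a b c))
;; (eval-macro-form '(first '(a b c)))  returns a
```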
What have we done? We have defined a macro called first. What
the macro does is to translate the form to some other form. Our
translation is very simple: it just translates forms that look like
(first x) into (car x), for any form x.
We can do much more
interesting things with macros, but first we will show how
to define a macro.
The primitive special form for defining macros is macro.
A macro definition looks like this:
(macro name (arg) body)
name can be any function spec. arg must be a variable. body is a sequence of Lisp forms that expand the macro; the last form should return the expansion.
To define our first
macro, we would say
(macro first (x) (list 'car (cadr x)))
Here are some more simple examples of macros. Suppose we want
any form that looks like (addone x) to be translated into
(plus 1 x). To define a macro to do this we would say
(macro addone (x) (list 'plus '1 (cadr x)))
Now say we wanted a macro which would translate (increment x)
into (setq x (1+ x)). This would be:
(macro increment (x) (list 'setq (cadr x) (list '1+ (cadr x))))
Of course, this macro is of limited usefulness. The reason is that the
form in the cadr of the increment form had better be a symbol.
If you tried (increment (car x)), it would be translated into
(setq (car x) (1+ (car x))), and setq would complain.
(If you’re interested in how to fix this problem, see setf (setf-fun);
but this is irrelevant to how macros work.)
You can see from this discussion that macros are very different
from functions. A function would not be able to tell what kind of
subforms are around in a call to itself; they get evaluated before the
function ever sees them. However, a macro gets to look at the whole
form and see just what is going on there. Macros are not functions;
if first is defined as a macro, it is not meaningful to apply
first to arguments. A macro does not take arguments at all; its
expander function takes a Lisp form and turns it into another Lisp
form.
The purpose of functions is to compute; the purpose of macros is to translate. Macros are used for a variety of purposes, the most common being extensions to the Lisp language. For example, Lisp is powerful enough to express many different control structures, but it does not provide every control structure anyone might ever possibly want. Instead, if a user wants some kind of control structure with a syntax that is not provided, he can translate it into some form that Lisp does know about.
For example, someone might want a limited iteration construct which increments a variable by one until it exceeds a limit (like the FOR statement of the BASIC language). He might want it to look like
(for a 1 100 (print a) (print (* a a)))
To get this, he could write a macro to translate it into
(do a 1 (1+ a) (> a 100) (print a) (print (* a a)))
A macro to do this could be defined with
(macro for (x)
  (cons 'do
        (cons (cadr x)
              (cons (caddr x)
                    (cons (list '1+ (cadr x))
                          (cons (list '> (cadr x) (cadddr x))
                                (cddddr x)))))))
Now he has defined his own new control structure primitive, and it will act just as if it were a special form provided by Lisp itself.
The main problem with the definition for the for macro is
that it is verbose and clumsy. If it is that hard to write a macro
to do a simple specialized iteration construct, one would wonder how
anyone could write macros of any real sophistication.
There are two things that make the definition so inelegant.
One is that the programmer must write things like "(cadr x)"
and "(cddddr x)" to refer to the parts of the form he wants
to do things with. The other problem is that the long chains of calls
to the list and cons functions are very hard to read.
Two features are provided to solve these two problems.
The defmacro macro solves the former, and the "backquote" (`)
reader macro solves the latter.
Instead of referring to the parts of our form by "(cadr x)"
and such, we would like to give names to the various pieces of the form,
and somehow have the (cadr x) automatically generated. This is done
by a macro called defmacro. It is easiest to explain what defmacro does
by showing an example. Here is how you would write the for macro
using defmacro:
(defmacro for (var lower upper . body)
  (cons 'do
        (cons var
              (cons lower
                    (cons (list '1+ var)
                          (cons (list '> var upper)
                                body))))))
The (var lower upper . body) is a pattern to match against the body of the form (to be more precise, to match against the cdr of the argument to the macro’s expander function). If defmacro tries to match the two lists
(var lower upper . body)
and
(a 1 100 (print a) (print (* a a)))
var will get bound to the symbol a, lower to the fixnum 1, upper to the fixnum 100, and body to the list ((print a) (print (* a a))). Then inside the body of the defmacro, var, lower, upper, and body are variables, bound to the matching parts of the macro form.
defmacro is a general purpose macro-defining macro. A defmacro form looks like
(defmacro name pattern . body)
The pattern may be anything made up out of symbols and conses. It is matched against the body of the macro form; both pattern and the form are car’ed and cdr’ed identically, and whenever a non-nil symbol is hit in pattern, the symbol is bound to the corresponding part of the form. All of the symbols in pattern can be used as variables within body. name is the name of the macro to be defined; it can be any function spec (see function-spec). body is evaluated with these bindings in effect, and its result is returned to the evaluator as the expansion of the macro.
Note that the pattern need not be a list the way a lambda-list must. In the above example, the pattern was a "dotted list", since the symbol body was supposed to match the cddddr of the macro form.
If we wanted a new iteration form, like for except that our example would look like
(for a (1 100) (print a) (print (* a a)))
(just because we thought that was a nicer syntax), then we could do it merely by modifying the pattern of the defmacro above; the new pattern would be (var (lower upper) . body).
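The pattern matching described above can be tried out on its own. The following is a minimal sketch that assumes a Common Lisp system rather than Zetalisp; destructuring-bind is Common Lisp's standalone form of this kind of matching and is not described in this manual, and the two function names are hypothetical. Each function matches the cdr of a macro form against one of the two patterns discussed:

```lisp
;; Assumption: Common Lisp, not Zetalisp. DESTRUCTURING-BIND stands in
;; for the matching that defmacro performs on the cdr of the macro form.
(defun match-for-form (macro-form)
  "Match (for var lower upper . body); return (var lower upper body)."
  (destructuring-bind (var lower upper . body) (cdr macro-form)
    (list var lower upper body)))

(defun match-nested-for-form (macro-form)
  "Match the alternative syntax (for var (lower upper) . body)."
  (destructuring-bind (var (lower upper) . body) (cdr macro-form)
    (list var lower upper body)))
```

In both cases body ends up bound to the list of remaining forms, because the dotted tail of the pattern matches whatever is left of the macro form.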
Here is how we would write our other examples using defmacro:
(defmacro first (the-list) (list 'car the-list))
(defmacro addone (form) (list 'plus '1 form))
(defmacro increment (symbol) (list 'setq symbol (list '1+ symbol)))
All of these were very simple macros and have very simple patterns, but these examples show that we can replace the (cadr x) with a readable mnemonic name such as the-list or symbol, which makes the program clearer, and enables documentation facilities such as the arglist function to describe the syntax of the special form defined by the macro.
There is another version of defmacro which defines displacing macros (see displacing-macro). defmacro has other, more complex features; see defmacro-hair.
Now we deal with the other problem: the long strings of calls to cons and list. This problem is relieved by introducing some new characters that are special to the Lisp reader. Just as the single-quote character makes it easier to type things of the form (quote x), so will some more new special characters make it easier to type forms that create new list structure. The functionality provided by these characters is called the backquote facility.
The backquote facility is used by giving a backquote character (`), followed by a form. If the form does not contain any use of the comma character, the backquote acts just like a single quote: it creates a form which, when evaluated, produces the form following the backquote. For example,
'(a b c) => (a b c)
`(a b c) => (a b c)
So in the simple cases, backquote is just like the regular single-quote macro. The way to get it to do interesting things is to include a comma somewhere inside of the form following the backquote. The comma is followed by a form, and that form gets evaluated even though it is inside the backquote. For example,
(setq b 1)
`(a b c) => (a b c)
`(a ,b c) => (a 1 c)
`(abc ,(+ b 4) ,(- b 1) (def ,b)) => (abc 5 0 (def 1))
In other words, backquote quotes everything except things preceded by a comma; those things get evaluated.
A list following a backquote can be thought of as a template for some new list structure. The parts of the list that are preceded by commas are forms that fill in slots in the template; everything else is just constant structure that will appear in the result. This is usually what you want in the body of a macro; some of the form generated by the macro is constant, the same thing on every invocation of the macro. Other parts are different every time the macro is called, often being functions of the form that the macro appeared in (the "arguments" of the macro). The latter parts are the ones for which you would use the comma. Several examples of this sort of use follow.
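The template behavior can be checked directly. This is a small sketch assuming a Common Lisp system, whose backquote reads these simple cases the same way; template-demo is a hypothetical name used only for illustration:

```lisp
;; Assumption: Common Lisp. Only the forms preceded by a comma are
;; evaluated; the rest of each template is constant structure.
(defun template-demo (b)
  "Return three lists built from backquote templates."
  (list `(a b c)
        `(a ,b c)
        `(abc ,(+ b 4) ,(- b 1) (def ,b))))
```

Calling (template-demo 1) fills the slots with the values computed from b, while the symbols a, c, abc, and def appear unchanged in the result.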
When the reader sees the `(a ,b c) it is actually generating a form such as (list 'a b 'c). The actual form generated may use list, cons, append, or whatever might be a good idea; you should never have to concern yourself with what it actually turns into. All you need to care about is what it evaluates to. Actually, it doesn’t use the regular functions cons, list, and so forth, but uses special ones instead so that the grinder can recognize a form which was created with the backquote syntax, and print it using backquote so that it looks like what you typed in. You should never write any program that depends on this, anyway, because backquote makes no guarantees about how it does what it does. In particular, in some circumstances it may decide to create constant forms, that will cause sharing of list structure at run time, or it may decide to create forms that will create new list structure at run time. For example, if the reader sees `(r ,nil), it may produce the same thing as (cons 'r nil), or '(r . nil). Be careful that your program does not depend on which of these it does.
This is generally found to be pretty confusing by most people; the best way to explain further seems to be with examples. Here is how we would write our three simple macros using both the defmacro and backquote facilities.
(defmacro first (the-list) `(car ,the-list))
(defmacro addone (form) `(plus 1 ,form))
(defmacro increment (symbol) `(setq ,symbol (1+ ,symbol)))
To finally demonstrate how easy it is to define macros with these two facilities, here is the final form of the for macro.
(defmacro for (var lower upper . body)
  `(do ,var ,lower (1+ ,var) (> ,var ,upper) . ,body))
Look at how much simpler that is than the original definition. Also, look how closely it resembles the code it is producing. The functionality of the for really stands right out when written this way.
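For readers who want to run the for macro, here is a sketch of a Common Lisp rendering. It is an adaptation, not the definition given above: Common Lisp has no old-style do, so the expansion uses the new-style do, and &body with ",@" replaces the dotted pattern. The helper squares-up-to is a hypothetical name.

```lisp
;; Assumption: Common Lisp. The old-style (do var init step test ...)
;; form is not available there, so this expands into a new-style DO.
(defmacro for (var lower upper &body body)
  `(do ((,var ,lower (1+ ,var)))
       ((> ,var ,upper))
     ,@body))

(defun squares-up-to (n)
  "Collect the squares of 1..N using the FOR macro."
  (let ((result nil))
    (for a 1 n
      (push (* a a) result))
    (nreverse result)))
```

As in the definition above, the upper form is evaluated again on every iteration, since it is substituted directly into the end test.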
If a comma inside a backquote form is followed by an "atsign" character (@), it has a special meaning. The ",@" should be followed by a form whose value is a list; then each of the elements of the list is put into the list being created by the backquote. In other words, instead of generating a call to the cons function, backquote generates a call to append. For example, if a is bound to (x y z), then `(1 ,a 2) would evaluate to (1 (x y z) 2), but `(1 ,@a 2) would evaluate to (1 x y z 2).
Here is an example of a macro definition that uses the ",@" construction. Suppose you wanted to extend Lisp by adding a kind of special form called repeat-forever, which evaluates all of its subforms repeatedly. One way to implement this would be to expand
(repeat-forever form1 form2 form3)
into
(prog () a form1 form2 form3 (go a))
You could define the macro by
(defmacro repeat-forever body
  `(prog () a ,@body (go a)))
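The difference between "," and ",@", and the repeat-forever macro itself, can be exercised with the sketch below. It assumes a Common Lisp system, where prog, go, and ",@" behave as described here; the lambda-list is written (&body body) in the Common Lisp manner, and splice-demo and count-to are hypothetical names.

```lisp
;; Assumption: Common Lisp.
(defun splice-demo (a)
  "Contrast inserting a list as one element with splicing its elements."
  (list `(1 ,a 2) `(1 ,@a 2)))

(defmacro repeat-forever (&body body)
  `(prog () a ,@body (go a)))

(defun count-to (limit)
  "Loop until N reaches LIMIT; RETURN exits the PROG the macro generated."
  (let ((n 0))
    (repeat-forever
      (incf n)
      (when (>= n limit) (return n)))))
```

Note that the loop in count-to terminates only because return exits the prog that the macro expanded into.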
A similar construct is ",." (comma, dot). This means the same thing as ",@" except that the list which is the value of the following form may be freely smashed; backquote uses nconc rather than append. This should of course be used with caution.
Backquote does not make any guarantees about what parts of the structure it shares and what parts it copies. You should not do destructive operations such as nconc on the results of backquote forms such as
`(,a b c d)
since backquote might choose to implement this as
(cons a '(b c d))
and nconc would smash the constant. On the other hand, it would be safe to nconc the result of
`(a b ,c ,d)
since there is nothing this could expand into that does not involve making a new list, such as
(list 'a 'b c d)
Backquote of course guarantees not to do any destructive operations (rplaca, rplacd, nconc) on the components of the structure it builds, unless the ",." syntax is used.
Advanced macro writers sometimes write "macro-defining macros": forms which expand into forms which, when evaluated, define macros. In such macros it is often useful to use nested backquote constructs. The following example illustrates the use of nested backquotes in the writing of macro-defining macros.
This example is a very simple version of defstruct (see defstruct-fun). You should first understand the basic description of defstruct before proceeding with this example. The defstruct below does not accept any options, and only allows the simplest kind of items; that is, it only allows forms like
(defstruct (name) item1 item2 item3 item4 ...)
We would like this form to expand into
(progn 'compile
       (defmacro item1 (x) `(aref ,x 0))
       (defmacro item2 (x) `(aref ,x 1))
       (defmacro item3 (x) `(aref ,x 2))
       (defmacro item4 (x) `(aref ,x 3))
       ...)
(The meaning of the (progn 'compile ...) is discussed on progn-quote-compile-page.) Here is the macro to perform the expansion:
(defmacro defstruct ((name) . items)
  (do ((item-list items (cdr item-list))
       (ans nil)
       (i 0 (1+ i)))
      ((null item-list)
       `(progn 'compile . ,(nreverse ans)))
    (setq ans
          (cons `(defmacro ,(car item-list) (x)
                   `(aref ,x ,',i))
                ans))))
The interesting part of this definition is the body of the (inner) defmacro form:
`(aref ,x ,',i)
Instead of using this backquote construction, we could have written
(list 'aref x ,i)
That is, the ",'," acts like a comma which matches the outer backquote, while the "," preceding the "x" matches with the inner backquote. Thus, the symbol i is evaluated when the defstruct form is expanded, whereas the symbol x is evaluated when the accessor macros are expanded.
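The nested backquote can be watched at work with the following sketch, which assumes a Common Lisp system. Two things differ from the definition above and are assumptions of the sketch: the macro is given the hypothetical name def-simple-struct so that it does not redefine the standard defstruct, and a plain progn is used in place of (progn 'compile ...). The structure point and its accessors are likewise made up for illustration.

```lisp
;; Assumption: Common Lisp; DEF-SIMPLE-STRUCT is a hypothetical name.
(defmacro def-simple-struct ((name) &rest items)
  (declare (ignore name))
  (do ((item-list items (cdr item-list))
       (ans nil)
       (i 0 (1+ i)))
      ((null item-list)
       `(progn ,@(nreverse ans)))
    (setq ans
          ;; The comma before x belongs to the inner backquote;
          ;; the comma-quote-comma before i reaches the outer one.
          (cons `(defmacro ,(car item-list) (x)
                   `(aref ,x ,',i))
                ans))))

(def-simple-struct (point) point-x point-y point-z)

(defun second-coordinate (p)
  "Use one of the generated accessor macros."
  (point-y p))
```

Expanding (point-z v) by hand gives (aref v 2): the index was fixed when def-simple-struct was expanded, while v is substituted only when the accessor is expanded.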
Backquote can be useful in situations other than the writing of macros. Whenever there is a piece of list structure to be consed up, most of which is constant, the use of backquote can make the program considerably clearer.
A substitutable function is a function which is open coded by the compiler. It is like any other function when applied, but it can be expanded instead, and in that regard resembles a macro.
defsubst is used for defining substitutable functions. It is used just like defun,
(defsubst name lambda-list . body)
and does almost the same thing. It defines a function which executes identically to the one which a similar call to defun would define. The difference comes when a function which calls this one is compiled. Then, the call will be open-coded by substituting the substitutable function’s definition into the code being compiled. The function itself looks like (named-subst name lambda-list . body). Such a function is called a subst. For example, if we define
(defsubst square (x) (* x x))
(defun foo (a b) (square (+ a b)))
then if foo is used interpreted, square will work just as if it had been defined by defun. If foo is compiled, however, the squaring will be substituted into it and it will compile just like
(defun foo (a b) (* (+ a b) (+ a b)))
square’s definition would be
(named-subst square (x) (* x x))
(The internal formats of substs and named-substs are explained in subst.)
A similar square could be defined as a macro, with
(defmacro square (x) `(* ,x ,x))
In general, anything that is implemented as a subst can be re-implemented as a macro, just by changing the defsubst to a defmacro and putting in the appropriate backquote and commas. The disadvantage of macros is that they are not functions, and so cannot be applied to arguments. Their advantage is that they can do much more powerful things than substs can. This is also a disadvantage since macros provide more ways to get into trouble. If something can be implemented either as a macro or as a subst, it is generally better to make it a subst.
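Common Lisp has no defsubst, but the same trade-off can be illustrated there. The sketch below is an analogy under that assumption, not a description of defsubst: declaiming a function inline permits the compiler to open-code calls to it while leaving it an ordinary function. The name squares is hypothetical.

```lisp
;; Assumption: Common Lisp. An INLINE declamation is the nearest
;; standard analogue of a subst; the compiler may open-code calls,
;; yet SQUARE remains a function that can be applied.
(declaim (inline square))
(defun square (x)
  (* x x))

(defun foo (a b)
  (square (+ a b)))

;; A macro version of SQUARE could not be passed to MAPCAR like this.
(defun squares (numbers)
  (mapcar #'square numbers))
```

One difference from the substitution described in this section is that an inlined function call in Common Lisp still evaluates each argument form exactly once.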
The lambda-list of a subst may contain &optional and &rest, but no other lambda-list keywords. If there is a rest-argument, it is replaced in the body with an explicit call to list:
(defsubst append-to-foo (&rest args) (setq foo (append args foo)))
(append-to-foo x y z)
expands to
(setq foo (append (list x y z) foo))
Rest arguments in substs are most useful with lexpr-funcall, because of an optimization that is done:
(defsubst xhack (&rest indices) (lexpr-funcall 'xfun xarg1 indices))
(xhack a (car b))
is equivalent to
(xfun xarg1 a (car b))
If xfun is itself a subst, it will be expanded in turn.
You will notice that the substitution performed is very simple and takes no care about the possibility of computing an argument twice when it really ought to be computed once. For instance, in the current implementation, the functions
(defsubst reverse-cons (x y) (cons y x))
(defsubst in-order (a b c) (and (< a b) (< b c)))
would present problems. When compiled, because of the substitution a call to reverse-cons would evaluate its arguments in the wrong order, and a call to in-order could evaluate its second argument twice. This will be fixed at some point in the future, but for now the writer of defsubsts must be cautious. Also all occurrences of the argument names in the body are replaced with the argument forms, wherever they appear. Thus an argument name should not be used in the body for anything else, such as a function name or a symbol in a constant.
As with defun, name can be any function spec.
There are many useful techniques for writing macros. Over the years, Lisp programmers have discovered techniques that most programmers find useful, and have identified pitfalls that must be avoided. This section discusses some of these techniques, and illustrates them with examples.
The most important thing to keep in mind as you learn to write macros is that the first thing you should do is figure out what the macro form is supposed to expand into, and only then should you start to actually write the code of the macro. If you have a firm grasp of what the generated Lisp program is supposed to look like, from the start, you will find the macro much easier to write.
In general any macro that can be written as a substitutable function (see defsubst-fun) should be written as one, not as a macro, for several reasons: substitutable functions are easier to write and to read; they can be passed as functional arguments (for example, you can pass them to mapcar); and there are some subtleties that can occur in macro definitions that need not be worried about in substitutable functions. A macro can be a substitutable function only if it has exactly the semantics of a function, rather than a special form. The macros we will see in this section are not semantically like functions; they must be written as macros.
One of the most common errors in writing macros is best illustrated by example. Suppose we wanted to write dolist (see dolist-fun) as a macro that expanded into a do (see do-fun). The first step, as always, is to figure out what the expansion should look like. Let’s pick a representative example form, and figure out what its expansion should be. Here is a typical dolist form.
(dolist (element (append a b))
  (push element *big-list*)
  (foo element 3))
We want to create a do form that does the thing that the above dolist form says to do. That is the basic goal of the macro: it must expand into code that does the same thing that the original code says to do, but it should be in terms of existing Lisp constructs. The do form might look like this:
(do ((list (append a b) (cdr list))
     (element))
    ((null list))
  (setq element (car list))
  (push element *big-list*)
  (foo element 3))
Now we could start writing the macro that would generate this code, and in general convert any dolist into a do, in an analogous way.
However, there is a problem with the above scheme for expanding the dolist. The above expansion works fine. But what if the input form had been the following:
(dolist (list (append a b))
  (push list *big-list*)
  (foo list 3))
This is just like the form we saw above, except that the user happened to decide to name the looping variable list rather than element. The corresponding expansion would be:
(do ((list (append a b) (cdr list))
     (list))
    ((null list))
  (setq list (car list))
  (push list *big-list*)
  (foo list 3))
This doesn’t work at all! In fact, this is not even a valid program, since it contains a do that uses the same variable in two different iteration clauses.
Here’s another example that causes trouble:
(let ((list nil))
  (dolist (element (append a b))
    (push element list)
    (foo list 3)))
If you work out the expansion of this form, you will see that there are two variables named list, and that the user meant to refer to the outer one but the generated code for the push actually uses the inner one.
The problem here is an accidental name conflict. This can happen in any macro that has to create a new variable. If that variable ever appears in a context in which user code might access it, then you have to worry that it might conflict with some other name that the user is using for his own program.
One way to avoid this problem is to choose a name that is very unlikely to be picked by the user, simply by choosing an unusual name. This will probably work, but it is inelegant since there is no guarantee that the user won’t just happen to choose the same name. The way to really avoid the name conflict is to use an uninterned symbol as the variable in the generated code. The function gensym (see gensym-fun) is useful for creating such symbols.
Here is the expansion of the original form, using an uninterned symbol created by gensym.
(do ((g0005 (append a b) (cdr g0005))
     (element))
    ((null g0005))
  (setq element (car g0005))
  (push element *big-list*)
  (foo element 3))
This is the right kind of thing to expand into. Now that we understand how the expansion works, we are ready to actually write the macro. Here it is:
(defmacro dolist ((var form) . body)
  (let ((dummy (gensym)))
    `(do ((,dummy ,form (cdr ,dummy))
          (,var))
         ((null ,dummy))
       (setq ,var (car ,dummy))
       . ,body)))
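The two troublesome inputs shown earlier can be run against a gensym-based definition. The sketch below assumes a Common Lisp system and therefore uses the hypothetical name my-dolist, since the standard dolist may not be redefined there; collect-reversed and sum-items are also made-up names.

```lisp
;; Assumption: Common Lisp; MY-DOLIST is a hypothetical name.
(defmacro my-dolist ((var form) &body body)
  (let ((dummy (gensym)))
    `(do ((,dummy ,form (cdr ,dummy))
          (,var))
         ((null ,dummy))
       (setq ,var (car ,dummy))
       ,@body)))

(defun collect-reversed (a b)
  "The user's own variable LIST does not collide with the expansion."
  (let ((list nil))
    (my-dolist (element (append a b))
      (push element list))
    list))

(defun sum-items (items)
  "The looping variable itself may be named LIST without harm."
  (let ((total 0))
    (my-dolist (list items)
      (incf total list))
    total))
```

Because the generated variable is uninterned, no name the user can type will ever refer to it.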
Many system macros do not use gensym for the internal variables in their expansions. Instead they use symbols whose print names begin and end with a dot. This provides meaningful names for these variables when looking at the generated code and when looking at the state of a computation in the error-handler. However, this convention means that users should avoid naming variables this way.
A related problem occurs when you write a macro that expands into a prog (or a do, or something that expands into prog or do) behind the user’s back (unlike dolist, which is documented to be like do). Consider the error-restart special form (see error-restart-fun); suppose we wanted to implement it as a macro that expands into a prog. If it expanded into a plain-old prog, then the following (contrived) Lisp program would not behave correctly:
(prog ()
   (setq a 3)
   (error-restart
     (cond ((> a 10)
            (return 5))
           ((> a 4)
            (cerror nil t 'lose "You lose."))))
   (setq b 7))
The problem is that the return would return from the error-restart instead of the prog. The way to avoid this problem is to use a named prog whose name is t. The name t is special in that it is invisible to the return function. If we write error-restart as a macro that expands into a prog named t, then the return will pass right through the error-restart form and return from the prog, as it ought to.
In general, when a macro expands into a prog or a do around the user’s code, the prog or do should be named t so that return forms in the user code will return to the right place, unless the macro is documented as generating a prog/do-like form which may be exited with return.
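Common Lisp has no prog named t, so the technique cannot be shown there literally. The sketch below illustrates the same goal under that assumption by a different means: the hypothetical macro repeat-times expands into tagbody, which provides go tags but establishes nothing for return to exit, so a return in the user's code passes through to the surrounding prog. The function first-above is also a made-up name.

```lisp
;; Assumption: Common Lisp. TAGBODY is used instead of a prog named t;
;; REPEAT-TIMES and FIRST-ABOVE are hypothetical names.
(defmacro repeat-times (count-form &body body)
  (let ((counter (gensym "COUNTER"))
        (top (gensym "TOP"))
        (done (gensym "DONE")))
    `(let ((,counter ,count-form))
       (tagbody
          ,top
          (when (<= ,counter 0) (go ,done))
          (progn ,@body)          ; PROGN keeps body atoms from being tags
          (decf ,counter)
          (go ,top)
          ,done))))

(defun first-above (limit)
  "Count upward for at most 20 steps, returning the first value above LIMIT."
  (prog ((a 0))
     (repeat-times 20
       (incf a)
       (when (> a limit) (return a)))   ; exits the PROG, not the macro
     (return :none)))
```

Had repeat-times expanded into an ordinary do or prog, the inner return would merely have left the loop and first-above would always have returned :none.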
Sometimes a macro wants to do several different things when its expansion is evaluated. Another way to say this is that sometimes a macro wants to expand into several things, all of which should happen sequentially at run time (not macro-expand time). For example, suppose you wanted to implement defconst (see defconst-fun) as a macro. defconst must do two things: declare the variable to be special, and set the variable to its initial value. (We will implement a simplified defconst that only does these two things, and doesn’t have any options.) What should a defconst form expand into? Well, what we would like is for an appearance of
(defconst a (+ 4 b))
in a file to be the same thing as the appearance of the following two forms:
(declare (special a))
(setq a (+ 4 b))
However, because of the way that macros work, they only expand into one form, not two. So we need to have a defconst form expand into one form that is just like having two forms in the file.
There is such a form. It looks like this:
(progn 'compile
       (declare (special a))
       (setq a (+ 4 b)))
In interpreted Lisp, it is easy to see what happens here. This is a progn special form, and so all its subforms are evaluated, in turn. First the form 'compile is evaluated. The result is the symbol compile; this value is not used, and evaluation of 'compile has no side-effects, so the 'compile subform is effectively ignored. Then the declare form and the setq form are evaluated, and so each of them happens, in turn. So far, so good.
The interesting thing is the way this form is treated by the compiler. The compiler specially recognizes any progn form at top level in a file whose first subform is 'compile. When it sees such a form, it processes each of the remaining subforms of the progn just as if that form had appeared at top level in the file. So the compiler behaves exactly as if it had encountered the declare form at top level, and then encountered the setq form at top level, even though neither of those forms was actually at top-level (they were both inside the progn). This feature of the compiler is provided specifically for the benefit of macros that want to expand into several things.
Here is the macro definition:
(defmacro defconst (variable init-form)
  `(progn 'compile
          (declare (special ,variable))
          (setq ,variable ,init-form)))
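A runnable counterpart can be sketched for Common Lisp, with some stated differences. In that language the subforms of a top-level progn are already treated as top-level forms, so no 'compile marker is needed, and declaim takes the place of the top-level declare. The macro name my-defconst and the variables *a* and *b* are hypothetical.

```lisp
;; Assumption: Common Lisp. A plain top-level PROGN replaces
;; (progn 'compile ...), and DECLAIM replaces the top-level declare.
(defmacro my-defconst (variable init-form)
  `(progn
     (declaim (special ,variable))
     (setq ,variable ,init-form)))

(my-defconst *b* 3)
(my-defconst *a* (+ 4 *b*))
```

The single form produced by the macro thus has the same effect as writing the declaration and the assignment separately at top level.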
Here is another example of a form that wants to expand into several things. We will implement a special form called define-command, which is intended to be used in order to define commands in some interactive user subsystem. For each command, there are two things provided by the define-command form: a function that executes the command, and a text string that contains the documentation for the command (in order to provide an on-line interactive documentation feature). This macro is a simplified version of a macro that is actually used in the Zwei editor. Suppose that in this subsystem, commands are always functions of no arguments, documentation strings are placed on the help property of the name of the command, and the names of all commands are put onto a list. A typical call to define-command would look like:
(define-command move-to-top
  "This command moves you to the top."
  (do ()
      ((at-the-top-p))
    (move-up-one)))
This could expand into:
(progn 'compile
       (defprop move-to-top
                "This command moves you to the top."
                help)
       (push 'move-to-top *command-name-list*)
       (defun move-to-top ()
         (do ()
             ((at-the-top-p))
           (move-up-one))))
The define-command expands into three forms. The first one sets up the documentation string and the second one puts the command name onto the list of all command names. The third one is the defun that actually defines the function itself. Note that the defprop and push happen at load-time (when the file is loaded); the function, of course, also gets defined at load time. (See the description of eval-when (eval-when-fun) for more discussion of the differences between compile time, load time, and eval time.)
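The text gives the expansion but not the macro. One possible definition is sketched below for a Common Lisp system; it is an illustration under assumptions, not the definition used in Zwei. Since defprop is not available in Common Lisp, (setf (get ...)) stores the help property, and a plain progn is used. The parameter name doc-string and the placeholder command body are hypothetical.

```lisp
;; Assumption: Common Lisp. (SETF GET) stands in for defprop.
(defvar *command-name-list* nil
  "Names of all commands defined so far.")

(defmacro define-command (name doc-string &body body)
  `(progn
     (setf (get ',name 'help) ,doc-string)
     (push ',name *command-name-list*)
     (defun ,name () ,@body)))

(define-command move-to-top
  "This command moves you to the top."
  :at-top)   ; placeholder body; the editor primitives are not defined here
```

After the define-command form is loaded, the command can be called as a function, its documentation can be fetched from the help property, and its name appears on the command list.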
This technique makes Lisp a powerful language in which to implement your own language. When you write a large system in Lisp, frequently you can make things much more convenient and clear by using macros to extend Lisp into a customized language for your application. In the above example, we have created a little language extension: a new special form that defines commands for our system. It lets the writer of the system put his documentation strings right next to the code that they document, so that the two can be updated and maintained together. The way that the Lisp environment works, with load-time evaluation able to build data structures, lets the documentation data base and the list of commands be constructed automatically.
There is a particular kind of macro that is very useful for many applications. This is a macro that you place "around" some Lisp code, in order to make the evaluation of that code happen in some context. For a very simple example, we could define a macro called with-output-in-base, that executes the forms within its body with any output of numbers that is done defaulting to a specified base.
(defmacro with-output-in-base ((base-form) &body body)
  `(let ((base ,base-form))
     . ,body))
A typical use of this macro might look like:
(with-output-in-base (*default-base*)
  (print x)
  (print y))
which would expand into
(let ((base *default-base*))
  (print x)
  (print y))
This example is too trivial to be very useful; it is intended to demonstrate some stylistic issues. There are some special forms in Zetalisp that are similar to this macro; see with-open-file (with-open-file-fun) and with-input-from-string (with-input-from-string-fun), for example.
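The same macro can be tried in Common Lisp with one substitution, which is an assumption of this sketch: the variable controlling the output radix there is *print-base* rather than base. The helper number-in-base is a hypothetical name.

```lisp
;; Assumption: Common Lisp, where the radix variable is *PRINT-BASE*.
(defmacro with-output-in-base ((base-form) &body body)
  `(let ((*print-base* ,base-form))
     ,@body))

(defun number-in-base (number base)
  "Return NUMBER printed in BASE as a string."
  (with-output-in-base (base)
    (princ-to-string number)))
```

For example, (number-in-base 5 2) yields the string "101", and the previous radix is restored when the body finishes.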
The really interesting thing, of course, is that you can define your own such special forms for your own specialized applications. One very powerful application of this technique was used in a system that manipulates and solves the Rubik’s cube puzzle. The system heavily uses a special form called with-front-and-top, whose meaning is "evaluate this code in a context in which this specified face of the cube is considered the front face, and this other specified face is considered the top face".
The first thing to keep in mind when you write this sort of macro is that you can make your macro much clearer to people who might read your program if you conform to a set of loose standards of syntactic style. By convention, the names of such special forms start with "with-". This seems to be a clear way of expressing the concept that we are setting up a context; the meaning of the special form is "do this stuff with the following things true". Another convention is that any "parameters" to the special form should appear in a list that is the first subform of the special form, and that the rest of the subforms should make up a body of forms that are evaluated sequentially with the last one returned. All of the examples cited above work this way. In our with-output-in-base example, there was one parameter (the base), which appears as the first (and only) element of a list that is the first subform of the special form. The extra level of parentheses in the printed representation serves to separate the "parameter" forms from the "body" forms so that it is textually apparent which is which; it also provides a convenient way to provide default parameters (a good example is the with-input-from-string special form (with-input-from-string-fun), which takes two required and two optional "parameters"). Another convention/technique is to use the &body keyword in the defmacro to tell the editor how to correctly indent the special form (see &body).
The other thing to keep in mind is that control can leave the special form either by the last form’s returning, or by a non-local exit (that is, something doing a *throw). You should write the special form in such a way that everything will be cleaned up appropriately no matter which way control exits. In our with-output-in-base example, there is no problem, because non-local exits undo lambda-bindings. However, in even slightly more complicated cases, an unwind-protect form (see unwind-protect-fun) is needed: the macro must expand into an unwind-protect that surrounds the body, with "cleanup" forms that undo the context-setting-up that the macro did. For example, using-resource (see using-resource-fun) is implemented as a macro that does an allocate-resource and then performs the body inside of an unwind-protect that has a deallocate-resource in its "cleanup" forms. This way the allocated resource item will be deallocated whenever control leaves the using-resource special form.
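The shape of such a macro is sketched below for a Common Lisp system, using catch and throw for the non-local exit. Everything named here is hypothetical: the variable *depth* stands for some piece of context that is not a lambda-binding, and with-depth-raised is a made-up macro that sets it up and restores it.

```lisp
;; Assumption: Common Lisp; all names are hypothetical.
(defvar *depth* 0
  "Some context that the macro changes and must restore.")

(defmacro with-depth-raised ((amount-form) &body body)
  (let ((amount (gensym "AMOUNT")))
    `(let ((,amount ,amount-form))
       (incf *depth* ,amount)
       (unwind-protect
            (progn ,@body)
         ;; Cleanup form: runs on normal return and on non-local exit.
         (decf *depth* ,amount)))))

(defun escape-from-inside ()
  "Leave the body by a throw; the cleanup form still runs."
  (catch 'out
    (with-depth-raised (2)
      (throw 'out *depth*))))
```

The throw carries out the value that *depth* had inside the body, and by the time catch returns the cleanup form has put *depth* back to its earlier value.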
In any macro, you should always pay attention to the problem of multiple or out-of-order evaluation of user subforms. Here is an example of a macro with such a problem. This macro defines a special form with two subforms. The first is a reference, and the second is a form. The special form is defined to create a cons whose car and cdr are both the value of the second subform, and then to set the reference to be that cons. Here is a possible definition:
(defmacro test (reference form) `(setf ,reference (cons ,form ,form)))
Simple cases will work all right:
(test foo 3) ==> (setf foo (cons 3 3))
But a more complex example, in which the subform has side effects, can produce surprising results:
(test foo (setq x (1+ x)))
  ==> (setf foo (cons (setq x (1+ x)) (setq x (1+ x))))
The resulting code evaluates the setq form twice, and so x is increased by two instead of by one. A better definition of test that avoids this problem is:
(defmacro test (reference form)
  (let ((value (gensym)))
    `(let ((,value ,form))
       (setf ,reference (cons ,value ,value)))))
With this definition, the expansion works as follows:
(test foo (setq x (1+ x)))
  ==> (let ((g0005 (setq x (1+ x))))
        (setf foo (cons g0005 g0005)))
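The two definitions can be compared side by side. The sketch assumes a Common Lisp system and renames the macros test-twice and test-once (hypothetical names) so that both can exist at the same time; count-evaluations is also made up.

```lisp
;; Assumption: Common Lisp; the macro names are hypothetical.
(defmacro test-twice (reference form)   ; the faulty definition
  `(setf ,reference (cons ,form ,form)))

(defmacro test-once (reference form)    ; the corrected definition
  (let ((value (gensym)))
    `(let ((,value ,form))
       (setf ,reference (cons ,value ,value)))))

(defun count-evaluations ()
  "Return how far each counter advances, and the last cons stored."
  (let ((x 0) (y 0) (foo nil))
    (test-once foo (setq x (1+ x)))
    (test-twice foo (setq y (1+ y)))
    (list x y foo)))
```

With test-once the counter x advances by one; with test-twice the counter y advances by two, and the car and cdr of the resulting cons differ.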
In general, when you define a new special form that has some forms as its subforms, you have to be careful about just when those forms get evaluated. If you aren’t careful, they can get evaluated more than once, or in an unexpected order, and this can be semantically significant if the forms have side-effects. There’s nothing fundamentally wrong with multiple or out-of-order evaluation if that is really what you want and if it is what you document your special form to do. However, it is very common for special forms to simply behave like functions, and when they are doing things like what functions do, it’s natural to expect them to be function-like in the evaluation of their subforms. Function forms have their subforms evaluated, each only once, in left-to-right order, and special forms that are similar to function forms should try to work that way too for clarity and consistency.
There is a tool that makes it easier for you to follow the principle explained above. It is a macro called once-only. It is most easily explained by example. The way you would write test using once-only is as follows:
(defmacro test (reference form)
  (once-only (form)
    `(setf ,reference (cons ,form ,form))))
This defines test in such a way that the form is only evaluated once, and references to form inside the macro body refer to that value. once-only automatically introduces a lambda-binding of a generated symbol to hold the value of the form. Actually, it is more clever than that; it avoids introducing the lambda-binding for forms whose evaluation is trivial and may be repeated without harm or cost, such as numbers, symbols, and quoted structure. This is just an optimization that helps produce more efficient code.
The once-only macro makes it easier to follow the principle, but it does not completely or automatically solve the problems of multiple and out-of-order evaluation. It is just a tool that can solve some of the problems some of the time; it is not a panacea.
The following description attempts to explain what once-only does, but it is a lot easier to use once-only by imitating the example above than by trying to understand once-only’s rather tricky definition.
A once-only
form looks like
(once-only var-list form1 form2 ...)
var-list is a list of variables. The forms are a Lisp program,
that presumably uses the values of those variables. When the form
resulting from the expansion of the once-only is evaluated, the first
thing it does is to inspect the values of each of the variables in
var-list; these values are assumed to be Lisp forms. For each of the
variables, it binds that variable either to its current value, if the
current value is a trivial form, or to a generated symbol. Next,
once-only evaluates the forms, in this new binding environment, and
when they have been evaluated it undoes the bindings. The result of the
evaluation of the last form is presumed to be a Lisp form, typically
the expansion of a macro. If all of the variables had been bound to
trivial forms, then once-only just returns that result. Otherwise,
once-only returns the result wrapped in a lambda-combination that binds
the generated symbols to the result of evaluating the respective
non-trivial forms.
The effect is that the program produced by evaluating the once-only
form is coded in such a way that each of the forms which were the
values of the variables in var-list is evaluated only once, unless
evaluating that form has no side-effects. At the same time, no
unnecessary lambda-binding appears in this program, but the body of the
once-only is not cluttered up with extraneous code to decide whether or
not to introduce lambda-binding in the program it constructs.
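As a rough sketch of what once-only arranges for you, here is a hand-written version of test that binds a generated symbol to the value of form itself. It is written in portable Lisp rather than with once-only; the names test-by-hand, *counter*, and *result* are hypothetical, and unlike once-only this version always introduces the binding, even when form is trivial.

```lisp
;; Hand-written counterpart of the once-only example (hypothetical name).
;; The value of FORM is computed once and held in a generated symbol.
(defmacro test-by-hand (reference form)
  (let ((value (gensym)))
    `(let ((,value ,form))
       (setf ,reference (cons ,value ,value)))))

;; Usage: the side-effect in the second subform happens only once.
(defvar *counter* 0)
(defvar *result* nil)
(test-by-hand *result* (setq *counter* (+ *counter* 1)))
;; *result* is now (1 . 1) and *counter* is 1.
```

Had the macro simply substituted form twice into the cons, the counter would have been incremented twice and the car and cdr of the result would differ.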
Caution! A number of system macros, setf for example, fail to
follow this convention. Unexpected multiple evaluation and out-of-order
evaluation can occur with them. This was done for the sake of efficiency,
is prominently mentioned in the documentation of these macros, and will
be fixed in the future. It would be best not to compromise the semantic
simplicity of your own macros in this way.
A useful technique for building language extensions is to define
programming constructs that employ two special forms, one of which is
used inside the body of the other. Here is a simple example. There
are two special forms. The outer one is called with-collection,
and the inner one is called collect. collect takes one
subform, which it evaluates; with-collection just has a body, whose
forms it evaluates sequentially. with-collection returns a list of
all of the values that were given to collect during the evaluation
of the with-collection’s body. For example,
(with-collection (dotimes (i 5) (collect i))) => (0 1 2 3 4)
Remembering the first piece of advice we gave about macros, the next thing to do is to figure out what the expansion looks like. Here is how the above example could expand:
(let ((g0005 nil)) (dotimes (i 5) (push i g0005)) (nreverse g0005))
Now, how do we write the definition of the macros? Well,
with-collection is pretty easy:
(defmacro with-collection (&body body)
  (let ((var (gensym)))
    `(let ((,var nil))
       ,@body
       (nreverse ,var))))
The hard part is writing collect. Let’s try it:
(defmacro collect (argument) `(push ,argument ,var))
Note that something unusual is going on here: collect is using the
variable var freely. It is depending on the binding that takes
place in the body of with-collection in order to get access to the
value of var. Unfortunately, that binding took place when
with-collection got expanded; with-collection’s expander
function bound var, and it got unbound when the expander function
was done. By the time the collect form gets expanded, var has
long since been unbound. The macro definitions above do not work.
Somehow the expander function of with-collection has to communicate
with the expander function of collect to pass over the generated
symbol.
The only way for with-collection to convey information to the
expander function of collect is for it to expand into something
that passes that information. What we can do is to define a special
variable (which we will call *collect-variable*), and have
with-collection expand into a form that binds this variable to the
name of the variable that the collect should use. Now, consider
how this works in the interpreter. The evaluator will first see the
with-collection form, and call in the expander function to expand
it. The expander function creates the expansion, and returns to the
evaluator, which then evaluates the expansion. The expansion includes
in it a let form to bind *collect-variable* to the generated
symbol. When the evaluator sees this let form during the evaluation
of the expansion of the with-collection form, it will set up the
binding and recursively evaluate the body of the let. Now, during
the evaluation of the body of the let, our special variable is
bound, and if the expander function of collect gets run, it will be
able to see the value of *collect-variable* and incorporate the
generated symbol into its own expansion.
Writing the macros this way is not quite right. It works fine
interpreted, but the problem is that it does not work when we try to
compile Lisp code that uses these special forms. When code is being
compiled, there isn’t any interpreter to do the binding in our new
let form; macro expansion is done at compile time, but generated
code does not get run until the results of the compilation are loaded
and run. The way to fix our definitions is to use compiler-let
instead of let. compiler-let (see compiler-let-fun) is a
special form that exists specifically to do the sort of thing we are
trying to do here. compiler-let is identical to let as far as
the interpreter is concerned, so changing our let to a
compiler-let won’t affect the behavior in the interpreter; it will
continue to work. When the compiler encounters a compiler-let, however,
it actually performs the bindings that the compiler-let specifies,
and proceeds to compile the body of the compiler-let with all of
those bindings in effect. In other words, it acts as the interpreter
would.
Here’s the right way to write these macros:
(defvar *collect-variable*)

(defmacro with-collection (&body body)
  (let ((var (gensym)))
    `(let ((,var nil))
       (compiler-let ((*collect-variable* ',var))
         . ,body)
       (nreverse ,var))))

(defmacro collect (argument)
  `(push ,argument ,*collect-variable*))
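For comparison only, here is a sketch of a different technique that reaches the same result in dialects that provide macrolet (Common Lisp does; whether your system does is an assumption). Instead of passing the generated symbol through a special variable, the outer macro defines collect locally with the symbol already spliced in. The names with-collection-local and first-five are hypothetical.

```lisp
;; Variant using MACROLET (assumed available, as in Common Lisp).
;; COLLECT is defined only inside the body, and the generated symbol
;; is spliced directly into its definition.
(defmacro with-collection-local (&body body)
  (let ((var (gensym)))
    `(let ((,var nil))
       (macrolet ((collect (argument)
                    `(push ,argument ,',var)))
         ,@body)
       (nreverse ,var))))

(defun first-five ()
  (with-collection-local
    (dotimes (i 5)
      (collect i))))
;; (first-five) => (0 1 2 3 4)
```

The trade-off is that collect has no meaning outside the body of the outer form, whereas the special-variable version leaves collect as an ordinary global macro.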
The technique of defining functions to be used during macro expansion
deserves explicit mention here. It may not occur to you, but a macro
expander function is a Lisp program like any other Lisp program, and it
can benefit in all the usual ways by being broken down into a
collection of functions that do various parts of its work. Usually
macro expander functions are pretty simple Lisp programs that take
things apart and put them together slightly differently and such, but
some macros are quite complex and do a lot of work. Several features
of Zetalisp, including flavors, loop, and defstruct,
are implemented using very complex macros, which, like any complex
well-written Lisp program, are broken down into modular functions. You should
keep this in mind if you ever invent an advanced language extension
or ever find yourself writing a five-page expander function.
A particular thing to note is that any functions used by macro-expander
functions must be available at compile-time. You can make a function
available at compile time by surrounding its defining form with an
(eval-when (compile load eval) ...); see eval-when-fun for more
details. Doing this means that at compile time the definition of the
function will be interpreted, not compiled, and hence will run more
slowly. Another approach is to separate macro definitions and the
functions they call during expansion into a separate file, often called
a "defs" (definitions) file. This file defines all the macros but does
not use any of them. It can be separately compiled and loaded up
before compiling the main part of the program, which uses the macros.
The system facility (see system-system) helps keep these
various files straight, compiling and loading things in the right order.
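A minimal sketch of the first approach follows. The helper make-increment-form, the macro bump, and the variable *total* are hypothetical names invented for the example; the only point is that the helper is wrapped in eval-when so that it exists when bump is expanded during compilation.

```lisp
;; Helper used at macro-expansion time (hypothetical name).  Wrapping it
;; in EVAL-WHEN makes the definition available while compiling this file.
(eval-when (compile load eval)
  (defun make-increment-form (variable amount)
    (list 'setq variable (list '+ variable amount))))

;; The expander function just delegates to the helper.
(defmacro bump (variable amount)
  (make-increment-form variable amount))

(defvar *total* 1)
(bump *total* 5)
;; *total* is now 6
```

Because the helper is an ordinary function, it can also be called and tested by hand, for example (make-increment-form 'y 2) returns (setq y (+ y 2)).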
mexp goes into a loop in which it reads forms and sequentially
expands them, printing out the result of each expansion (using
the grinder (see grind) to improve readability). It terminates
when it reads an atom (anything that is not a cons). If you type
in a form which is not a macro form, there will be no expansions
and so it will not type anything out, but just prompt you for
another form. This allows you to see what your macros are
expanding into, without actually evaluating the result of the expansion.
Every time the evaluator sees a macro form, it must
call the macro to expand the form. If this expansion always
happens the same way, then it is wasteful to expand the whole
form every time it is reached; why not just expand it once?
A macro is passed the macro form itself, and so it can change
the car and cdr of the form to something else by using rplaca
and rplacd! This way the first time the macro is expanded,
the expansion will be put where the macro form used to be, and the
next time that form is seen, it will already be expanded. A macro that
does this is called a displacing macro, since it displaces
the macro form with its expansion.
The major problem with this is that the Lisp form
gets changed by its evaluation. If you were to write a program
which used such a macro, call grindef to look at it,
then run the program and call grindef again, you would
see the expanded macro the second time. Presumably the reason
the macro is there at all is that it makes the program look nicer;
we would like to prevent the unnecessary expansions, but still let
grindef display the program in its more attractive form.
This is done with the function displace.
Another thing to worry about with displacing macros is that if you change the definition of a displacing macro, then your new definition will not take effect in any form that has already been displaced. If you redefine a displacing macro, an existing form using the macro will use the new definition only if the form has never been evaluated.
form must be a list. displace replaces the car and cdr of form so
that it looks like:
(si:displaced original-form expansion)
original-form is equal to form but has a different
top-level cons so that the replacing mentioned above doesn’t
affect it. si:displaced is a macro, which returns
the caddr of its own macro form. So when the si:displaced
form is given to the evaluator, it "expands" to expansion.
displace returns expansion.
The grinder knows specially about si:displaced forms,
and will grind such a form as if it had seen the original-form
instead of the si:displaced form.
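The list surgery described above can be sketched in a few lines. This is not the system's definition of displace; displace-sketch is a hypothetical name, and the symbol displaced-marker merely stands in for si:displaced.

```lisp
;; Sketch of the list surgery performed by displace (hypothetical name).
;; DISPLACED-MARKER stands in for si:displaced.
(defun displace-sketch (form expansion)
  (let ((original-form (cons (car form) (cdr form)))) ; fresh top-level cons
    (rplaca form 'displaced-marker)
    (rplacd form (list original-form expansion))
    expansion))

(defvar *form* (list 'addone 'x))
(displace-sketch *form* (list 'plus '1 'x))
;; *form* => (displaced-marker (addone x) (plus 1 x))
```

Note that the same cons that used to begin (addone x) now begins the displaced form, which is why every piece of code holding a pointer to the macro form sees the change.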
So if we wanted to rewrite our addone macro as a displacing
macro, instead of writing
(macro addone (x) (list 'plus '1 (cadr x)))
we would write
(macro addone (x) (displace x (list 'plus '1 (cadr x))))
Of course, we really want to use defmacro to define
most macros. Since there is no way to get at the original macro form itself
from inside the body of a defmacro, another version of it is
provided:
defmacro-displace is just like defmacro except that
it defines a displacing macro, using the displace function.
Now we can write the displacing version of addone as
(defmacro-displace addone (val) (list 'plus '1 val))
All we have changed in this example is the defmacro into
defmacro-displace. addone is now a displacing macro.
The pattern in a defmacro is more like the lambda-list
of a normal function than has been revealed above. It is allowed to
contain certain &-keywords.
&optional is followed by variable, (variable),
(variable default), or (variable default present-p),
exactly the same as in a function. Note that
default is still a form to be evaluated, even though variable
is not being bound to the value of a form. variable does not have
to be a symbol; it can be a pattern. In this case the first form is
disallowed because it is syntactically ambiguous. The pattern must be
enclosed in a singleton list. If variable is a pattern, default
can be evaluated more than once.
Using &rest is the same as using a dotted list as the pattern,
except that it may be easier to read and leaves a place to put &aux.
&aux is the same in a macro as in a function, and has nothing to do
with pattern matching.
defmacro has a couple of additional keywords not allowed in functions.
&body is identical to &rest except that it informs the editor and the grinder
that the remaining subforms constitute a "body" rather than "arguments"
and should be indented accordingly.
&list-of pattern requires the corresponding position of the
form being translated to contain a list (or nil). It
matches pattern against each element of that list. Each variable
in pattern is bound to a list of the corresponding values in each element
of the list matched by the &list-of. This may be clarified by an
example. Suppose we want to be able to say things like
(send-commands (aref turtle-table i)
  (forward 100)
  (beep)
  (left 90)
  (pen 'down 'red)
  (forward 50)
  (pen 'up))
We could define a send-commands macro as follows:
(defmacro send-commands (object &body &list-of (command . arguments))
  `(let ((o ,object))
     . ,(mapcar #'(lambda (com args) `(send o ',com . ,args))
                command arguments)))
Note that this example uses &body together with &list-of, so you
don’t see the list itself; the list is just the rest of the macro-form.
You can combine &optional and &list-of. Consider the following example:
(defmacro print-let (x &optional &list-of ((vars vals) '((base 10.) (*nopoint t))))
  `((lambda (,@vars) (print ,x))
    ,@vals))

(print-let foo) ==>
((lambda (base *nopoint) (print foo))
 12 t)

(print-let foo ((bar 3))) ==>
((lambda (bar) (print foo))
 3)
In this example we aren’t using &body or anything like it, so you do see
the list itself; that is why you see parentheses around the (bar 3).
The following two functions are provided to allow the user to control expansion of macros; they are often useful for the writer of advanced macro systems, and in tools that want to examine and understand code which may contain macros.
If form is a macro form, this expands it (once)
and returns the expanded form. Otherwise it just
returns form. macroexpand-1 expands defsubst
function forms as well as macro forms.
If form is a macro form, this expands it repeatedly
until it is not a macro form, and returns the final expansion.
Otherwise, it just returns form. macroexpand expands defsubst
function forms as well as macro forms.
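The difference between the two functions shows up when a macro expands into another macro form. The following sketch uses two toy macros with hypothetical names, and uses + where older code in this chapter uses plus.

```lisp
;; Two toy macros (hypothetical names); OUTER-ADDONE expands into a
;; use of INNER-ADDONE.
(defmacro inner-addone (x) (list '+ 1 x))
(defmacro outer-addone (x) (list 'inner-addone x))

(defvar *one-step* (macroexpand-1 '(outer-addone y)))
(defvar *all-steps* (macroexpand '(outer-addone y)))
;; *one-step*  => (inner-addone y)
;; *all-steps* => (+ 1 y)
```

Neither function looks inside the subforms of the result; only the outermost form is examined.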
In Lisp, a variable is something that can remember one piece of data. The main operations on a variable are to recover that piece of data, and to change it. These might be called access and update. The concept of variables named by symbols, explained in variable-section, can be generalized to any storage location that can remember one piece of data, no matter how that location is named.
For each kind of generalized variable, there are typically two functions
which implement the conceptual access and update operations. For
example, symeval accesses a symbol’s value cell, and set updates
it. array-leader accesses the contents of an array leader element, and
store-array-leader updates it. car accesses the car of a cons,
and rplaca updates it.
Rather than thinking of this as two functions, which operate on a storage
location somehow deduced from their arguments, we can shift our point of
view and think of the access function as a name for the storage
location. Thus (symeval 'foo) is a name for the value of foo, and
(aref a 105) is a name for the 105th element of the array a.
Rather than having to remember the update function associated with each
access function, we adopt a uniform way of updating storage locations named
in this way, using the setf special form. This is analogous to the
way we use the setq special form to convert the name of a variable
(which is also a form which accesses it) into a form which updates it.
setf is particularly useful in combination with structure-accessing
macros, such as those created with defstruct, because the knowledge of the
representation of the structure is embedded inside the macro, and the programmer
shouldn’t have to know what it is in order to alter an element of the structure.
setf is actually a macro which expands into the appropriate update function.
It has a database, explained below, which maps access functions to
update functions.
setf takes a form which accesses something, and "inverts"
it to produce a corresponding form to update the thing.
A setf expands into an update form, which stores the result of evaluating
the form value into the place referenced by the access-form.
Examples:
(setf (array-leader foo 3) 'bar) ==> (store-array-leader 'bar foo 3)
(setf a 3) ==> (setq a 3)
(setf (plist 'a) '(foo bar)) ==> (setplist 'a '(foo bar))
(setf (aref q 2) 56) ==> (aset 56 q 2)
(setf (cadr w) x) ==> (rplaca (cdr w) x)
If access-form invokes a macro or a substitutable function, then
setf expands the access-form and starts over again. This lets you
use setf together with defstruct accessor macros.
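Here is a small sketch of setf in use on list structure, with each update annotated with the update function it corresponds to. The function name setf-demo is hypothetical.

```lisp
(defun setf-demo ()
  (let ((w (list 'a 'b 'c))
        (cell (cons 1 2)))
    (setf (cadr w) 'x)     ; same effect as (rplaca (cdr w) 'x)
    (setf (car cell) 10)   ; same effect as (rplaca cell 10)
    (setf (cdr cell) 20)   ; same effect as (rplacd cell 20)
    (list w cell)))
;; (setf-demo) => ((a x c) (10 . 20))
```

The caller never has to name rplaca or rplacd; the access form alone says which location is meant.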
For the sake of efficiency, the code produced by setf
does not preserve order of evaluation of the argument forms. This is only a problem
if the argument forms have interacting side-effects. For example,
if you evaluate
(setq x 3)
(setf (aref a x) (setq x 4))
then the form might set element 3 or element 4 of the array.
We do not guarantee which one it will do; don’t just try it and see
and then depend on it, because it is subject to change without notice.
Furthermore, the value produced by setf depends on the structure
type and is not guaranteed; setf should be used for side effect
only.
Besides the access and update conceptual operations on variables, there
is a third basic operation, which we might call locate. Given the name of
a storage cell, the locate operation will return the address of that cell
as a locative pointer (see locative). This locative pointer is a kind of
name for the variable which is a first-class Lisp data object. It can be
passed as an argument to a function which operates on any kind of variable,
regardless of how it is named. It can be used to bind the variable, using
the bind subprimitive (see bind-fun).
Of course this can only work on variables whose implementation is really to store their value in a memory cell. A variable with an update operation that encrypts the value and an access operation that decrypts it could not have the locate operation, since the value per se is not actually stored anywhere.
locf takes a form which accesses some cell, and produces
a corresponding form to create a locative pointer to that cell.
Examples:
(locf (array-leader foo 3)) ==> (ap-leader foo 3)
(locf a) ==> (value-cell-location 'a)
(locf (plist 'a)) ==> (property-cell-location 'a)
(locf (aref q 2)) ==> (aloc q 2)
If access-form invokes a macro or a substitutable function, then
locf expands the access-form and starts over again. This lets you
use locf together with defstruct accessor macros.
Both setf and locf work by means of property lists.
When the form (setf (aref q 2) 56) is expanded, setf looks
for the setf property of the symbol aref. The value of the
setf property of a symbol should be a cons whose car
is a pattern to be matched with the access-form, and whose cdr
is the corresponding update-form, with the symbol si:val in
place of the value to be stored. The setf property of aref
is a cons whose car is (aref array . subscripts) and whose
cdr is (aset si:val array . subscripts). If the transformation which
setf is to do cannot be expressed as a simple pattern, an arbitrary
function may be used: When the form (setf (foo bar) baz)
is being expanded, if the setf property of foo is a symbol,
the function definition of that symbol will be applied to two arguments,
(foo bar) and baz, and the result will be taken to be the
expansion of the setf.
Similarly, the locf function uses the locf property, whose value
is analogous. For example, the locf property
of aref is a cons whose car is (aref array . subscripts)
and whose cdr is (aloc array . subscripts). There is no si:val
in the case of locf.
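To make the idea of "inverting" an access form concrete, here is a toy inverter. It is only a sketch: it hard-codes three access functions in a cond instead of consulting setf properties and patterns, and the name invert-access-form is hypothetical. It builds the update form as list structure and does not evaluate anything.

```lisp
;; Toy inverter (hypothetical name), covering three access functions.
;; It only builds the update form; it does not evaluate anything.
(defun invert-access-form (access-form value-form)
  (let ((function-name (car access-form))
        (args (cdr access-form)))
    (cond ((eq function-name 'car)
           (list 'rplaca (car args) value-form))
          ((eq function-name 'cdr)
           (list 'rplacd (car args) value-form))
          ((eq function-name 'aref)
           (cons 'aset (cons value-form args)))
          (t nil))))                    ; no update form known

;; (invert-access-form '(aref q 2) 56)    => (aset 56 q 2)
;; (invert-access-form '(car (cdr w)) 'x) => (rplaca (cdr w) x)
```

A table-driven version, like the property-list scheme described above, has the advantage that new access functions can be added without touching the inverter itself.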
Increments the value of a generalized variable. (incf ref) increments
the value of ref by 1. (incf ref amount) adds amount
to ref and stores the sum back into ref.
incf expands into a setf form, so ref can be anything that
setf understands as its access-form. This also means that you
should not depend on the returned value of an incf form.
You must take great care with incf because it may evaluate
parts of ref more than once. For example,
(incf (car (mumble)))
  ==> (setf (car (mumble)) (1+ (car (mumble))))
  ==> (rplaca (mumble) (1+ (car (mumble))))
The mumble function is called more than once, which may be
significantly inefficient if mumble is expensive, and which may be
downright wrong if mumble has side-effects. The same problem
can come up with the decf, push, and pop macros (see below).
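One way to sidestep the problem is to evaluate the troublesome subform yourself, once, and hand incf a simple variable reference. The following is a minimal sketch; mumble here is a stand-in defined only for the example, and the names *mumble-calls*, *mumble-cell*, and increment-mumble-car are hypothetical.

```lisp
(defvar *mumble-calls* 0)          ; counts how often MUMBLE runs
(defvar *mumble-cell* (list 10))   ; the cons that MUMBLE returns

(defun mumble ()
  (setq *mumble-calls* (+ *mumble-calls* 1))
  *mumble-cell*)

(defun increment-mumble-car ()
  (let ((cell (mumble)))           ; MUMBLE is called exactly once, here
    (incf (car cell))))

(increment-mumble-car)
;; *mumble-calls* is now 1 and (car *mumble-cell*) is now 11.
```

The same trick works for decf, push, and pop: bind the result of the expensive or side-effecting subform to a local variable first.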
Decrements the value of a generalized variable. (decf ref) decrements
the value of ref by 1. (decf ref amount) subtracts amount
from ref and stores the difference back into ref.
decf expands into a setf form, so ref can be anything that
setf understands as its access-form. This also means that you
should not depend on the returned value of a decf form.
Adds an item to the front of a list which is stored in a generalized variable.
(push item ref) creates a new cons whose car is the result of evaluating item
and whose cdr is the contents of ref, and stores the new cons
into ref.
The form
(push (hairy-function x y z) variable)
replaces the commonly-used construct
(setq variable (cons (hairy-function x y z) variable))
and is intended to be more explicit and esthetic.
All the caveats that apply to incf apply to push as well:
forms within ref may be evaluated more than once. The returned value
of push is not defined.
Removes an element from the front of a list which is stored in a generalized variable.
(pop ref) finds the cons in ref, stores the cdr of the cons back into ref,
and returns the car of the cons.
Example:
(setq x '(a b c))
(pop x) => a
x => (b c)
All the caveats that apply to incf apply to pop as well:
forms within ref may be evaluated more than once.
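Taken together, push and pop let you treat any list-valued variable as a stack. Here is a short sketch; the function name stack-demo is hypothetical, and the code relies only on the side-effect of push, never on its returned value.

```lisp
(defun stack-demo ()
  (let ((stack nil))
    (push 'a stack)                ; stack is now (a)
    (push 'b stack)                ; stack is now (b a)
    (let ((top (pop stack)))       ; removes and returns b
      (list top stack))))
;; (stack-demo) => (b (a))
```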
loop is a Lisp macro which provides a programmable
iteration facility. The same loop module operates compatibly in
Zetalisp, Maclisp (PDP-10 and Multics), and NIL, and a
moderately compatible package is under development for the MDL
programming environment. loop was inspired by the "FOR"
facility of CLISP in InterLisp; however, it is not compatible and
differs in several details.
The general approach is that a form introduced by the word
loop generates a single program loop, into which a large variety
of features can be incorporated. The loop consists of some
initialization (prologue) code, a body which may be executed
several times, and some exit (epilogue) code. Variables may be
declared local to the loop. The features are concerned with loop
variables, deciding when to end the iteration, putting user-written
code into the loop, returning a value from the construct, and
iterating a variable through various real or virtual sets of values.
The loop form consists of a series of clauses, each
introduced by a keyword symbol. Forms appearing in or implied by the
clauses of a loop form are classed as those to be executed as
initialization code, body code, and/or exit code; within each part of
the template that loop fills in, they are executed strictly in
the order implied by the original composition. Thus, just as in
ordinary Lisp code, side-effects may be used, and one piece of code
may depend on following another for its proper operation. This is the
principal philosophical difference from InterLisp’s "FOR" facility.
Note that loop forms are intended to look like stylized
English rather than Lisp code. There is a notably low density of
parentheses, and many of the keywords are accepted in several
synonymous forms to allow writing of more euphonious and grammatical
English. Some find this notation verbose and distasteful, while
others find it flexible and convenient. The former are invited to
stick to do.
Here are some examples to illustrate the use of loop.
(defun print-elements-of-list (list-of-elements) (loop for element in list-of-elements do (print element)))
The above function prints each element in its argument, which
should be a list. It returns nil.
(defun gather-alist-entries (list-of-pairs) (loop for pair in list-of-pairs collect (car pair)))
gather-alist-entries takes an association list and
returns a list of the "keys"; that is,
(gather-alist-entries '((foo 1 2) (bar 259) (baz)))
returns (foo bar baz).
(defun extract-interesting-numbers (start-value end-value) (loop for number from start-value to end-value when (interesting-p number) collect number))
The above function takes two arguments, which should be
fixnums, and returns a list of all the numbers in that range
(inclusive) which satisfy the predicate interesting-p.
(defun find-maximum-element (an-array) (loop for i from 0 below (array-dimension-n 1 an-array) maximize (aref an-array i)))
find-maximum-element returns the maximum of the elements
of its argument, a one-dimensional array. For Maclisp, aref
could be a macro which turns into either funcall or
arraycall depending on what is known about the type of the array.
(defun my-remove (object list) (loop for element in list unless (equal object element) collect element))
my-remove is like the Lisp function delete, except
that it copies the list rather than destructively splicing out
elements. This is similar, although not identical, to the
Zetalisp function remove.
(defun find-frob (list) (loop for element in list when (frobp element) return element finally (ferror nil "No frob found in the list ~S" list)))
This returns the first element of its list argument which
satisfies the predicate frobp. If none is found, an error is
generated.
Internally, loop constructs a prog which includes
variable bindings, pre-iteration (initialization) code,
post-iteration (exit) code, the body of the iteration, and stepping
of variables of iteration to their next values (which happens on
every iteration after executing the body).
A clause consists of the keyword symbol and any Lisp
forms and keywords which it deals with. For example,
(loop for x in l do (print x)),
contains two clauses, "for x in l" and "do (print x)".
Certain of the parts of the clause will be described as being
expressions, e.g., (print x) in the above. An expression can
be a single Lisp form, or a series of forms implicitly collected with
progn. An expression is terminated by the next following atom,
which is taken to be a keyword. This syntax allows only the first
form in an expression to be atomic, but makes misspelled keywords
more easily detectable.
loop uses print-name equality to compare keywords so
that loop forms may be written without package prefixes; in
Lisp implementations that do not have packages, eq is used for
comparison.
Bindings and iteration variable steppings may be performed either sequentially or in parallel, which affects how the stepping of one iteration variable may depend on the value of another. The syntax for distinguishing the two will be described with the corresponding clauses. When a set of things is "in parallel", all of the bindings produced will be performed in parallel by a single lambda binding. Subsequent bindings will be performed inside of that binding environment.
These clauses all create a variable of iteration, which
is bound locally to the loop and takes on a new value on each
successive iteration. Note that if more than one iteration-driving
clause is used in the same loop, several variables are created which
all step together through their values; when any of the iterations
terminates, the entire loop terminates. Nested iterations are not
generated; for those, you need a second loop form in the body of
the loop. In order to not produce strange interactions, iteration
driving clauses are required to precede any clauses which produce
"body" code: that is, all except those which produce prologue or
epilogue code (initially and finally), bindings
(with), the named clause, and the iteration termination
clauses (while and until).
Clauses which drive the iteration may be arranged to perform
their testing and stepping either in series or in parallel. They are
by default grouped in series, which allows the stepping computation of
one clause to use the just-computed values of the iteration variables
of previous clauses. They may be made to step "in parallel", as is
the case with the do special form, by "joining" the iteration
clauses with the keyword and. The form this typically takes is
something like
(loop ... for x = (f) and for y = init then (g x) ...)
which sets x to (f) on every iteration, and binds y
to the value of init for the first iteration, and on every
iteration thereafter sets it to (g x), where x still has
the value from the previous iteration. Thus, if the calls to
f and g are not order-dependent, this would be best
written as
(loop ... for y = init then (g x) for x = (f) ...)
because, as a general rule, parallel stepping has more overhead than sequential stepping. Similarly, the example
(loop for sublist on some-list and for previous = 'undefined then sublist ...)
which is equivalent to the do construct
(do ((sublist some-list (cdr sublist)) (previous 'undefined sublist)) ((null sublist) ...) ...)
in terms of stepping, would be better written as
(loop for previous = 'undefined then sublist for sublist on some-list ...)
When iteration driving clauses are joined with and, if
the token following the and is not a keyword which introduces an
iteration driving clause, it is assumed to be the same as the keyword
which introduced the most recent clause; thus, the above example
showing parallel stepping could have been written as
(loop for sublist on some-list and previous = 'undefined then sublist ...)
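To see that the parallel and the reordered sequential forms really do step the same way, here is a sketch that fills in the elided bodies with a collect clause. The function names pairs-in-parallel and pairs-in-series are hypothetical, and the collect body is an addition made only so that the stepping becomes visible.

```lisp
;; Parallel stepping: PREVIOUS receives the old value of SUBLIST.
(defun pairs-in-parallel (some-list)
  (loop for sublist on some-list
        and previous = 'undefined then sublist
        collect (list previous (car sublist))))

;; Sequential stepping, with the clauses reordered so that PREVIOUS is
;; stepped before SUBLIST and therefore still sees the old value.
(defun pairs-in-series (some-list)
  (loop for previous = 'undefined then sublist
        for sublist on some-list
        collect (list previous (car sublist))))

;; Both return ((undefined a) ((a b c) b) ((b c) c)) for '(a b c).
```

Writing the clauses in the other sequential order, with sublist stepped first, would make previous see the already-advanced sublist, which is the situation the and keyword exists to avoid.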
The order of evaluation in iteration-driving clauses is that those expressions which are only evaluated once are evaluated in order at the beginning of the form, during the variable-binding phase, while those expressions which are evaluated each time around the loop are evaluated in order in the body.
One common and simple iteration driving clause is repeat:
repeat expression
This evaluates expression (during the variable binding phase),
and causes the loop to iterate that many times.
expression is expected to evaluate to a fixnum. If
expression evaluates to a zero or negative result, the body code
will not be executed.
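A minimal sketch of repeat in use follows; the function name make-stars is hypothetical.

```lisp
;; Build a list containing COUNT copies of the symbol *.
(defun make-stars (count)
  (loop repeat count collect '*))
;; (make-stars 3) => (* * *)
;; (make-stars 0) => nil
```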
All remaining iteration driving clauses are subdispatches of
the keyword for, which is synonymous with as.
In all of them a variable of iteration is specified. Note that,
in general, if an iteration driving clause implicitly supplies an
endtest, the value of this iteration variable as the loop is exited
(i.e., when the epilogue code is run) is undefined. (This is
discussed in more detail in section
loop-iteration-framework-section.)
Here are all of the varieties of for clauses. Optional
parts are enclosed in curly brackets. The data-types as used
here are discussed fully in section loop-data-type-section.
for var {data-type} in expr1 {by expr2}
This iterates over each of the elements in the list expr1. If
the by subclause is present, expr2 is evaluated once
on entry to the loop
to supply the function to be used to fetch successive sublists,
instead of cdr.
for var {data-type} on expr1 {by expr2}
This is like the previous for format, except that var is
set to successive sublists of the list instead of successive elements.
Note that since var will always be a list, it is
not meaningful to specify a data-type unless var is
a destructuring pattern, as described in the section on
destructuring, loop-destructuring-page. Note also that
loop uses a null rather than an atom test to
implement both this and the preceding clause.
for var {data-type} = expr
On each iteration, expr is evaluated and var is set to the result.
for var {data-type} = expr1 then expr2
var is bound to expr1 when the loop is entered, and set to expr2 (re-evaluated) at all but the first iteration. Since expr1 is evaluated during the binding phase, it cannot reference other iteration variables set before it; for that, use the following:
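A minimal sketch of the = ... then form, using an until clause (described later in this chapter) to stop the iteration; the starting value and limit are arbitrary:

```lisp
;; x is bound to 1 on entry, then doubled before each later iteration.
(loop for x = 1 then (* x 2)
      until (> x 16)
      collect x)                 ; => (1 2 4 8 16)
```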
for var {data-type} first expr1 then expr2
This sets var to expr1 on the first iteration, and to expr2 (re-evaluated) on each succeeding iteration. The evaluation of both expressions is performed inside of the loop binding environment, before the loop body. This allows the first value of var to come from the first value of some other iteration variable, allowing such constructs as
(loop for term in poly for ans first (car term) then (gcd ans (car term)) finally (return ans))
for var {data-type} from expr1 {to expr2} {by expr3}
This performs numeric iteration. var is initialized to expr1, and on each succeeding iteration is incremented by expr3 (default 1). If the to phrase is given, the iteration terminates when var becomes greater than expr2. Each of the expressions is evaluated only once, and the to and by phrases may be written in either order. downto may be used instead of to, in which case var is decremented by the step value, and the endtest is adjusted accordingly. If below is used instead of to, or above instead of downto, the iteration will be terminated before expr2 is reached, rather than after. Note that the to variant appropriate for the direction of stepping must be used for the endtest to be formed correctly; i.e., the code will not work if expr3 is negative or zero. If no limit-specifying clause is given, then the direction of the stepping may be specified as being decreasing by using downfrom instead of from. upfrom may also be used instead of from; it forces the stepping direction to be increasing. The data-type defaults to fixnum.
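The limit keywords can be compared side by side in a short sketch; the bounds are arbitrary and the results follow from the rules just stated:

```lisp
;; to: the limit is inclusive; by gives the (positive) step.
(loop for i from 1 to 10 by 3 collect i)      ; => (1 4 7 10)

;; below: the iteration stops before the limit is reached.
(loop for i from 0 below 3 collect i)         ; => (0 1 2)

;; downto: the variable is decremented by the step; the limit is inclusive.
(loop for i from 5 downto 1 by 2 collect i)   ; => (5 3 1)

;; above: decrementing, stopping before the limit is reached.
(loop for i from 3 above 0 collect i)         ; => (3 2 1)
```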
for var {data-type} being expr and its path ...
for var {data-type} being {each|the} path ...
This provides a user-definable iteration facility. path names the manner in which the iteration is to be performed. The ellipsis indicates where various path dependent preposition/expression pairs may appear. See the section on Iteration Paths (iteration-path-page) for complete documentation.
The with keyword may be used to establish initial bindings, that is, variables which are local to the loop but are only set once, rather than on each iteration. The with clause looks like:
with var1 {data-type} {= expr1}
{and var2 {data-type} {= expr2}}...
If no expr is given, the variable is initialized to the appropriate value for its data type, usually nil.
with bindings linked by and are performed in parallel; those not linked are performed sequentially. That is,
(loop with a = (foo) and b = (bar) and c ...)
binds the variables like
((lambda (a b c) ...) (foo) (bar) nil)
whereas
(loop with a = (foo) with b = (bar a) with c ...)
binds the variables like
((lambda (a) ((lambda (b) ((lambda (c) ...) nil)) (bar a))) (foo))
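The practical difference between the two binding styles shows up when one initial value mentions another variable. In this sketch the variable names and the outer let are only for illustration:

```lisp
;; Sequential: the second with is bound inside the first,
;; so its initial value can refer to a.
(loop with a = 1
      with b = (+ a 1)
      return (list a b))         ; => (1 2)

;; Parallel: both initial values are computed before either variable
;; is bound, so (+ a 1) refers to the a of the surrounding let.
(let ((a 10))
  (loop with a = 1 and b = (+ a 1)
        return (list a b)))      ; => (1 11)
```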
All expr’s in with clauses are evaluated in the order they are written, in lambda expressions surrounding the generated prog. The loop expression
(loop with a = xa and b = xb with c = xc for d = xd then (f d) and e = xe then (g e d) for p in xp with q = xq ...)
produces the following binding contour, where t1 is a loop-generated temporary:
((lambda (a b)
   ((lambda (c)
      ((lambda (d e)
         ((lambda (p t1)
            ((lambda (q) ...)
             xq))
          nil xp))
       xd xe))
    xc))
 xa xb)
Because all expressions in with clauses are evaluated during the variable binding phase, they are best placed near the front of the loop form for stylistic reasons.
For binding more than one variable with no particular initialization, one may use the construct
with variable-list {data-type-list} {and ...}
as in
with (i j k t1 t2) (fixnum fixnum fixnum) ...
A slightly shorter way of writing this is
with (i j k) fixnum and (t1 t2) ...
These are cases of destructuring which loop handles specially; destructuring and data type keywords are discussed in sections loop-destructuring-section and loop-data-type-section.
Occasionally there are various implementational reasons for a variable not to be given a local type declaration. If this is necessary, the nodeclare clause may be used:
nodeclare variable-list
The variables in variable-list are noted by loop as not requiring local type declarations. Consider the following:
(declare (special k) (fixnum k)) (defun foo (l) (loop for x in l as k fixnum = (f x) ...))
If k did not have the fixnum data-type keyword given for it, then loop would bind it to nil, and some compilers would complain. On the other hand, the fixnum keyword also produces a local fixnum declaration for k; since k is special, some compilers will complain (or error out). The solution is to do:
(defun foo (l) (loop nodeclare (k) for x in l as k fixnum = (f x) ...))
which tells loop not to make that local declaration. The nodeclare clause must come before any reference to the variables so noted. Positioning it incorrectly will cause this clause to not take effect, and may not be diagnosed.
initially expression
This puts expression into the prologue of the iteration. It will be evaluated before any other initialization code other than the initial bindings. For the sake of good style, the initially clause should therefore be placed after any with clauses but before the main body of the loop.
finally expression
This puts expression into the epilogue of the loop, which is evaluated when the iteration terminates (other than by an explicit return). For stylistic reasons, then, this clause should appear last in the loop body. Note that certain clauses may generate code which terminates the iteration without running the epilogue code; this behavior is noted with those clauses. Most notable of these are those described in the section aggregated-boolean-tests-section, Aggregated Boolean Tests. This clause may be used to cause the loop to return values in a non-standard way:
(loop for n in l sum n into the-sum count t into the-count finally (return (quotient the-sum the-count)))
do expression
doing expression
expression is evaluated each time through the loop, as shown in the print-elements-of-list example on print-elements-of-list-example.
The following clauses accumulate a return value for the iteration in some manner. The general form is
type-of-collection expr {data-type} {into var}
where type-of-collection is a loop keyword, and expr is the thing being "accumulated" somehow. If no into is specified, then the accumulation will be returned when the loop terminates. If there is an into, then when the epilogue of the loop is reached, var (a variable automatically bound locally in the loop) will have been set to the accumulated result and may be used by the epilogue code. In this way, a user may accumulate and somehow pass back multiple values from a single loop, or use them during the loop. It is safe to reference these variables during the loop, but they should not be modified until the epilogue code of the loop is reached.
For example,
(loop for x in list collect (foo x) into foo-list collect (bar x) into bar-list collect (baz x) into baz-list finally (return (list foo-list bar-list baz-list)))
has the same effect as
(do ((g0001 list (cdr g0001))
     (x) (foo-list) (bar-list) (baz-list))
    ((null g0001)
     (list (nreverse foo-list)
           (nreverse bar-list)
           (nreverse baz-list)))
  (setq x (car g0001))
  (setq foo-list (cons (foo x) foo-list))
  (setq bar-list (cons (bar x) bar-list))
  (setq baz-list (cons (baz x) baz-list)))
except that loop arranges to form the lists in the correct order, obviating the nreverses at the end, and allowing the lists to be examined during the computation.
collect expr {into var}
collecting ...
This causes the values of expr on each iteration to be collected into a list.
nconc expr {into var}
nconcing ...
append ...
appending ...
These are like collect, but the results are nconced or appended together as appropriate.
(loop for i from 1 to 3 nconc (list i (* i i))) => (1 1 2 4 3 9)
count expr {into var} {data-type}
counting ...
If expr evaluates non-nil, a counter is incremented. The data-type defaults to fixnum.
sum expr {data-type} {into var}
summing ...
Evaluates expr on each iteration, and accumulates the sum of all the values. data-type defaults to number, which for all practical purposes is notype. Note that specifying data-type implies that both the sum and the number being summed (the value of expr) will be of that type.
maximize expr {data-type} {into var}
minimize ...
Computes the maximum (or minimum) of expr over all iterations. data-type defaults to number. Note that if the loop iterates zero times, or if conditionalization prevents the code of this clause from being executed, the result will be meaningless. If loop can determine that the arithmetic being performed is not contagious (by virtue of data-type being fixnum, flonum, or small-flonum), then it may choose to code this by doing an arithmetic comparison rather than calling either max or min. As with the sum clause, specifying data-type implies that both the result of the max or min operation and the value being maximized or minimized will be of that type.
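The numeric accumulation clauses can be illustrated with a few one-line sketches; the input lists are arbitrary and no data-type keywords are given, so the defaults described above apply:

```lisp
;; count: the counter is incremented whenever the expression is non-nil.
(loop for x in '(1 a 2 b 3) count (numberp x))   ; => 3

;; sum: the values of the expression are added together.
(loop for x in '(1 2 3 4) sum x)                 ; => 10

;; maximize and minimize over a non-empty list.
(loop for x in '(3 9 2) maximize x)              ; => 9
(loop for x in '(3 9 2) minimize x)              ; => 2
```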
Not only may there be multiple accumulations in a loop, but a single accumulation may come from multiple places within the same loop form. Obviously, the types of the collection must be compatible. collect, nconc, and append may all be mixed, as may sum and count, and maximize and minimize. For example,
(loop for x in '(a b c) for y in '((1 2) (3 4) (5 6)) collect x append y) => (a 1 2 b 3 4 c 5 6)
The following computes the average of the entries in the list list-of-frobs:
(loop for x in list-of-frobs count t into count-var sum x into sum-var finally (return (quotient sum-var count-var)))
The following clauses may be used to provide additional control over when the iteration gets terminated, possibly causing exit code (due to finally) to be performed and possibly returning a value (e.g., from collect).
while expr
If expr evaluates to nil, the loop is exited, performing exit code (if any), and returning any accumulated value. The test is placed in the body of the loop where it is written. It may appear between sequential for clauses.
until expr
Identical to while (not expr).
This may be needed, for example, to step through a strange data structure, as in
(loop until (top-of-concept-tree? concept) for concept = expr then (superior-concept concept) ...)
Note that the placement of the until clause before the for clause is valid in this case because of the definition of this particular variant of for, which binds concept to its first value rather than setting it from inside the loop.
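A simpler sketch of while and until than the concept-tree example, with an arbitrary list and threshold; in both cases the value collected so far is returned when the test ends the loop:

```lisp
;; The loop exits the first time the while test yields nil.
(loop for x in '(1 2 3 4 5)
      while (< x 4)
      collect x)                 ; => (1 2 3)

;; The same iteration written with until and the opposite test.
(loop for x in '(1 2 3 4 5)
      until (> x 3)
      collect x)                 ; => (1 2 3)
```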
The following may also be of use in terminating the iteration:
(loop-finish) causes the iteration to terminate "normally", the same as implicit termination by an iteration driving clause, or by the use of while or until; the epilogue code (if any) will be run, and any implicitly collected result will be returned as the value of the loop.
For example,
(loop for x in '(1 2 3 4 5 6) collect x do (cond ((= x 4) (loop-finish)))) => (1 2 3 4)
This particular example would be better written as until (= x 4) in place of the do clause.
All of these clauses perform some test, and may immediately terminate the iteration depending on the result of that test.
always expr
Causes the loop to return t if expr always evaluates non-null. If expr evaluates to nil, the loop immediately returns nil, without running the epilogue code (if any, as specified with the finally clause); otherwise, t will be returned when the loop finishes, after the epilogue code has been run.
never expr
Causes the loop to return t if expr never evaluates non-null. This is equivalent to always (not expr).
thereis expr
If expr evaluates non-nil, then the iteration is terminated and that value is returned, without running the epilogue code.
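The three aggregated boolean tests can be compared in a short sketch; the lists and predicates are arbitrary choices for illustration:

```lisp
;; always: t only if the test succeeds on every iteration.
(loop for x in '(2 4 6) always (evenp x))          ; => t
(loop for x in '(2 4 5) always (evenp x))          ; => nil

;; never: t only if the test fails on every iteration.
(loop for x in '(1 3 5) never (evenp x))           ; => t

;; thereis: returns the first non-nil value of its expression.
(loop for x in '(a 7 b 8)
      thereis (and (numberp x) x))                 ; => 7
```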
These clauses may be used to "conditionalize" the following clause. They may precede any of the side-effecting or value-producing clauses, such as do, collect, always, or return.
when expr
if expr
If expr evaluates to nil, the following clause will be skipped, otherwise not.
unless expr
This is equivalent to when (not expr).
Multiple conditionalization clauses may appear in sequence. If one test fails, then any following tests in the immediate sequence, and the clause being conditionalized, are skipped.
Multiple clauses may be conditionalized under the same test by joining them with and, as in
(loop for i from a to b when (zerop (remainder i 3)) collect i and do (print i))
which returns a list of all multiples of 3 from a to b (inclusive) and prints them as they are being collected.
If-then-else conditionals may be written using the else keyword, as in
(loop for i from a to b when (oddp i) collect i into odd-numbers else collect i into even-numbers)
Multiple clauses may appear in an else-phrase, using and to join them in the same way as above.
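A self-contained variant of the odd/even example, with concrete bounds and a finally clause added here so that both lists are returned; the variable names odds and evens are made up for the sketch:

```lisp
(loop for i from 1 to 6
      when (oddp i)
        collect i into odds
      else
        collect i into evens
      finally (return (list odds evens)))   ; => ((1 3 5) (2 4 6))
```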
Conditionals may be nested. For example,
(loop for i from a to b when (zerop (remainder i 3)) do (print i) and when (zerop (remainder i 2)) collect i)
returns a list of all multiples of 6 from a to b, and prints all multiples of 3 from a to b.
When else is used with nested conditionals, the "dangling else" ambiguity is resolved by matching the else with the innermost when not already matched with an else. Here is a complicated example.
(loop for x in l
      when (atom x)
        when (memq x *distinguished-symbols*)
          do (process1 x)
        else do (process2 x)
      else when (memq (car x) *special-prefixes*)
             collect (process3 (car x) (cdr x))
             and do (memoize x)
           else do (process4 x))
Useful with the conditionalization clauses is the return clause, which causes an explicit return of its "argument" as the value of the iteration, bypassing any epilogue code. That is,
when expr1 return expr2
is equivalent to
when expr1 do (return expr2)
Conditionalization of one of the "aggregated boolean value" clauses simply causes the test which would cause the iteration to terminate early not to be performed unless the condition succeeds. For example,
(loop for x in l when (significant-p x) do (print x) (princ "is significant.") and thereis (extra-special-significant-p x))
does not make the extra-special-significant-p check unless the significant-p check succeeds.
The format of a conditionalized clause is typically something like
when expr1 keyword expr2
If expr2 is the keyword it, then a variable is generated to hold the value of expr1, and that variable gets substituted for expr2. Thus, the composition
when expr return it
is equivalent to the clause
thereis expr
and one may collect all non-null values in an iteration by saying
when expression collect it
If multiple clauses are joined with and, the it keyword may only be used in the first. If multiple whens, unlesses, and/or ifs occur in sequence, the value substituted for it will be that of the last test performed. The it keyword is not recognized in an else-phrase.
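A brief sketch of the it keyword, using an arbitrary association list; in each case it stands for the value just produced by the when test:

```lisp
;; Collect every non-nil cdr. (b) is the same as (b . nil), so it is skipped.
(loop for pair in '((a . 1) (b) (c . 3))
      when (cdr pair) collect it)           ; => (1 3)

;; Return the first non-nil cdr.
(loop for pair in '((a) (b . 2) (c . 3))
      when (cdr pair) return it)            ; => 2
```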
named name
This gives the prog which loop generates a name of name, so that one may use the return-from form to return explicitly out of that particular loop:
(loop named sue ... do (loop ... do (return-from sue value) ...) ...)
The return-from form shown causes value to be immediately returned as the value of the outer loop. Only one name may be given to any particular loop construct. This feature does not exist in the Maclisp version of loop, since Maclisp does not support "named progs".
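A concrete version of the schematic example above might look like the following sketch. The function find-row and the loop name rows-loop are hypothetical names chosen for illustration:

```lisp
;; find-row (hypothetical): return the first row of rows that contains item.
;; The inner loop returns from the outer, named loop as soon as a match is
;; found; if no row matches, the outer loop finishes normally and yields nil.
(defun find-row (item rows)
  (loop named rows-loop
        for row in rows
        do (loop for x in row
                 when (eq x item)
                   do (return-from rows-loop row))))

(find-row 'c '((a b) (c d) (e)))   ; => (c d)
(find-row 'z '((a b) (c d)))       ; => nil
```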
return expression
Immediately returns the value of expression as the value of the loop, without running the epilogue code. This is most useful with some sort of conditionalization, as discussed in the previous section. Unlike most of the other clauses, return is not considered to "generate body code", so it is allowed to occur between iteration clauses, as in
(loop for entry in list when (not (numberp entry)) return (error ...) as frob = (times entry 2) ...)
If one instead desires the loop to have some return value when it finishes normally, one may place a call to the return function in the epilogue (with the finally clause, loop-finally-clause).
define-loop-macro keyword
May be used to make keyword, a loop keyword (such as for), into a Lisp macro which may introduce a loop form.
For example, after evaluating
(define-loop-macro for),
one may now write an iteration as
(for i from 1 below n do ...)
This facility exists primarily for diehard users of a predecessor of loop. Its unconstrained use is not recommended, as it tends to decrease the transportability of the code and needlessly uses up a function name.
In many of the clause descriptions, an optional data-type is shown. A data-type in this sense is an atomic symbol, and is recognizable as such by loop. These are used for declaration and initialization purposes; for example, in
(loop for x in l maximize x flonum into the-max sum x flonum into the-sum ...)
the flonum data-type keyword for the maximize clause says that the result of the max operation, and its "argument" (x), will both be flonums; hence loop may choose to code this operation specially since it knows there can be no contagious arithmetic. The flonum data-type keyword for the sum clause behaves similarly, and in addition causes the-sum to be correctly initialized to 0.0 rather than 0. The flonum keywords will also cause the variables the-max and the-sum to be declared to be flonum, in implementations where such a declaration exists. In general, a numeric data-type more specific than number, whether explicitly specified or defaulted, is considered by loop to be license to generate code using type-specific arithmetic functions where reasonable. The following data-type keywords are recognized by loop (others may be defined; for that, consult the source code):
fixnum
An implementation-dependent limited range integer.
flonum
An implementation-dependent limited precision floating point number.
small-flonum
This is recognized in the Zetalisp implementation only, where its only significance is for initialization purposes, since no such declaration exists.
integer
Any integer (no range restriction).
number
Any number.
notype
Unspecified type (i.e., anything else).
Note that explicit specification of a non-numeric type for an operation which is numeric (such as the summing clause) may cause a variable to be initialized to nil when it should be 0.
If local data-type declarations must be inhibited, one can use the nodeclare clause, which is described on loop-nodeclare-clause.
Destructuring provides one with the ability to "simultaneously" assign or bind multiple variables to components of some data structure. Typically this is used with list structure. For example,
(loop with (foo . bar) = '(a b c) ...)
has the effect of binding foo to a and bar to (b c).
loop’s destructuring support is intended to parallel if not augment that provided by the host Lisp implementation, with a goal of minimally providing destructuring over list structure patterns. Thus, in Lisp implementations with no system destructuring support at all, one may still use list-structure patterns as loop iteration variables, and in with bindings. In NIL, loop also supports destructuring over vectors.
One may specify the data types of the components of a pattern by using a corresponding pattern of the data type keywords in place of a single data type keyword. This syntax remains unambiguous because wherever a data type keyword is possible, a loop keyword is the only other possibility. Thus, if one wants to do
(loop for x in l as i fixnum = (car x) and j fixnum = (cadr x) and k fixnum = (cddr x) ...)
and no reference to x is needed, one may instead write
(loop for (i j . k) (fixnum fixnum . fixnum) in l ...)
To allow some abbreviation of the data type pattern, an atomic component of the data type pattern is considered to state that all components of the corresponding part of the variable pattern are of that type. That is, the previous form could be written as
(loop for (i j . k) fixnum in l ...)
This generality allows binding of multiple typed variables in a reasonably concise manner, as in
(loop with (a b c) and (i j k) fixnum ...)
which binds a, b, and c to nil and i, j, and k to 0 for use as temporaries during the iteration, and declares i, j, and k to be fixnums for the benefit of the compiler.
(defun map-over-properties (fn symbol) (loop for (propname propval) on (plist symbol) by 'cddr do (funcall fn symbol propname propval)))
maps fn over the properties on symbol, giving it arguments of the symbol, the property name, and the value of that property.
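Two smaller sketches of destructuring in a for clause, without data-type patterns; the variable names and list contents are arbitrary:

```lisp
;; A dotted pattern takes each element apart into its car and cdr.
(loop for (name . value) in '((a . 1) (b . 2))
      collect (list name value))            ; => ((a 1) (b 2))

;; A list pattern binds the first and second elements of each sublist.
(loop for (x y) in '((1 2) (3 4))
      sum (* x y))                          ; => 14
```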
In Lisp implementations where loop performs its own destructuring, notably Multics Maclisp and Zetalisp, one can cause loop to use already provided destructuring support instead:
This variable only exists in loop implementations in Lisps which do not provide destructuring support in the default environment. It is by default nil. If changed, then loop will behave as it does in Lisps which do provide destructuring support: destructuring binding will be performed using let, and destructuring assignment will be performed using desetq. Presumably if one’s personalized environment supplies these macros, then one should set this variable to t; there is, however, little (if any) efficiency loss if this is not done.
This section describes the way loop constructs iterations. It is necessary if you will be writing your own iteration paths, and may be useful in clarifying what loop does with its input.
loop considers the act of stepping to have four possible parts. Each iteration-driving clause has some or all of these four parts, which are executed in this order:
This is an endtest which determines if it is safe to step to the next value of the iteration variable.
Variables which get "stepped". This is internally manipulated as a list of the form (var1 val1 var2 val2 ...); all of those variables are stepped in parallel, meaning that all of the vals are evaluated before any of the vars are set.
Sometimes you can’t see if you are done until you step to the next value; that is, the endtest is a function of the stepped-to value.
Other things which need to be stepped. This is typically used for internal variables which are more conveniently stepped here, or to set up iteration variables which are functions of some internal variable(s) which are actually driving the iteration. This is a list like steps, but the variables in it do not get stepped in parallel.
The above alone is actually insufficient in just about all iteration driving clauses which loop handles. What is missing is that in most cases the stepping and testing for the first time through the loop is different from that of all other times. So, what loop deals with is two four-tuples as above; one for the first iteration, and one for the rest. The first may be thought of as describing code which immediately precedes the loop in the prog, and the second as following the body code; in fact, loop does just this, but severely perturbs it in order to reduce code duplication. Two lists of forms are constructed in parallel: one is the first-iteration endtests and steps, the other the remaining-iterations endtests and steps. These lists have dummy entries in them so that identical expressions will appear in the same position in both. When loop is done parsing all of the clauses, these lists get merged back together such that corresponding identical expressions in both lists are not duplicated unless they are "simple" and it is worth doing.
Thus, one may get some duplicated code if one has multiple iterations. Alternatively, loop may decide to use and test a flag variable which indicates whether one iteration has been performed. In general, sequential iterations have less overhead than parallel iterations, both from the inherent overhead of stepping multiple variables in parallel, and from the standpoint of potential code duplication.
One other point which must be noted about parallel stepping is that although the user iteration variables are guaranteed to be stepped in parallel, the placement of the endtest for any particular iteration may be either before or after the stepping. A notable case of this is
(loop for i from 1 to 3 and dummy = (print 'foo) collect i) => (1 2 3)
but prints foo four times. Certain other constructs, such as for var on, may or may not do this depending on the particular construction.
This problem also means that it may not be safe to examine an iteration variable in the epilogue of the loop form. As a general rule, if an iteration driving clause implicitly supplies an endtest, then one cannot know the state of the iteration variable when the loop terminates. Although one can guess on the basis of whether the iteration variable itself holds the data upon which the endtest is based, that guess may be wrong. Thus,
(loop for subl on expr ... finally (f subl))
is incorrect, but
(loop as frob = expr while (g frob) ... finally (f frob))
is safe because the endtest is explicitly dissociated from the stepping.
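As a reminder of what sequential and parallel stepping mean for the values the body sees, the following sketch steps the same two variables both ways. The variables and bounds are arbitrary, and only values observed inside the body are shown, since those do not depend on where the endtest is placed:

```lisp
;; Sequential (for ... for): y is stepped after x, so it sees the new x.
(loop for x from 1 to 3
      for y = 0 then x
      collect (list x y))        ; => ((1 0) (2 2) (3 3))

;; Parallel (for ... and): the new values are all computed before any
;; variable is set, so y sees the previous x.
(loop for x from 1 to 3
      and y = 0 then x
      collect (list x y))        ; => ((1 0) (2 1) (3 2))
```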
Iteration paths provide a mechanism for user extension of iteration-driving clauses. The interface is constrained so that the definition of a path need not depend on much of the internals of loop. The typical form of an iteration path is
for var {data-type} being {each|the} pathname {preposition1 expr1}...
pathname is an atomic symbol which is defined as a loop path function. The usage and defaulting of data-type is up to the path function. Any number of preposition/expression pairs may be present; the prepositions allowable for any particular path are defined by that path. For example,
(loop for x being the array-elements of my-array from 1 to 10 ...)
To enhance readability, pathnames are usually defined in both the singular and plural forms; this particular example could have been written as
(loop for x being each array-element of my-array from 1 to 10 ...)
Another format, which is not so generally applicable, is
for var {data-type} being expr0 and its pathname {preposition1 expr1}...
In this format, var takes on the value of expr0 the first time through the loop. Support for this format is usually limited to paths which step through some data structure, such as the "superiors" of something. Thus, we can hypothesize the cdrs path, such that
(loop for x being the cdrs of '(a b c . d) collect x) => ((b c . d) (c . d) d)
but
(loop for x being '(a b c . d) and its cdrs collect x) => ((a b c . d) (b c . d) (c . d) d)
To satisfy the anthropomorphic among you, his, her, or their may be substituted for the its keyword, as may each. Egocentricity is not condoned. Some example uses of iteration paths are shown in section predefined-paths-section.
Very often, iteration paths step internal variables which the user does not specify, such as an index into some data-structure. Although in most cases the user does not wish to be concerned with such low-level matters, it is occasionally useful to have a handle on such things. loop provides an additional syntax with which one may provide a variable name to be used as an "internal" variable by an iteration path, with the using "prepositional phrase". The using phrase is placed with the other phrases associated with the path, and contains any number of keyword/variable-name pairs:
(loop for x being the array-elements of a using (index i) ...)
which says that the variable i should be used to hold the index of the array being stepped through. The particular keywords which may be used are defined by the iteration path; the index keyword is recognized by all loop sequence paths (section loop-sequence-section). Note that any individual using phrase applies to only one path; it is parsed along with the "prepositional phrases". It is an error if the path does not call for a variable using that keyword.
By special dispensation, if a pathname is not recognized, then the default-loop-path path will be invoked upon a syntactic transformation of the original input. Essentially, the loop fragment
for var being frob
is taken as if it were
for var being default-loop-path in frob
and
for var being expr and its frob ...
is taken as if it were
for var being expr and its default-loop-path in frob
Thus, this "undefined pathname hook" only works if the default-loop-path path is defined. Obviously, the use of this "hook" is competitive, since only one such hook may be in use, and the potential for syntactic ambiguity exists if frob is the name of a defined iteration path. This feature is not for casual use; it is intended for use by large systems which wish to use a special syntax for some feature they provide.
loop comes with two pre-defined iteration path functions; one implements a mapatoms-like iteration path facility, and the other is used for defining iteration paths for stepping through sequences.
The interned-symbols iteration path is like a mapatoms for loop.
(loop for sym being interned-symbols ...)
iterates over all of the symbols in the current package and its superiors (or, in Maclisp, the current obarray). This is the same set of symbols which mapatoms iterates over, although not necessarily in the same order. The particular package to look in may be specified as in
(loop for sym being the interned-symbols in package ...)
which is like giving a second argument to mapatoms.
In Lisp implementations with some sort of hierarchical package structure such as Zetalisp, one may restrict the iteration to be over just the package specified and not its superiors, by using the local-interned-symbols path:
(loop for sym being the local-interned-symbols {in package} ...)
Example:
(defun my-apropos (sub-string &optional (pkg package))
  (loop for x being the interned-symbols in pkg
        when (string-search sub-string x)
        when (or (boundp x) (fboundp x) (plist x))
        do (print-interesting-info x)))
In the Zetalisp and NIL implementations of loop, a package specified with the in preposition may be anything acceptable to the pkg-find-package function. The code generated by this path will contain calls to internal loop functions, with the effect that it will be transparent to changes to the implementation of packages. In the Maclisp implementation, the obarray must be an array pointer, not a symbol with an array property.
One very common form of iteration is that over the elements of some object which is accessible by means of an integer index. loop defines an iteration path function for doing this in a general way, and provides a simple interface to allow users to define iteration paths for various kinds of "indexable" data.
(define-loop-sequence-path path-name-or-names
fetch-fun size-fun
sequence-type default-var-type)
path-name-or-names is either an atomic path name or list of path names. fetch-fun is a function of two arguments: the sequence, and the index of the item to be fetched. (Indexing is assumed to be zero-origined.) size-fun is a function of one argument, the sequence; it should return the number of elements in the sequence. sequence-type is the name of the data-type of the sequence, and default-var-type the name of the data-type of the elements of the sequence. These last two items are optional.
The Zetalisp implementation of loop utilizes the Zetalisp array manipulation primitives to define both array-element and array-elements as iteration paths:
(define-loop-sequence-path (array-element array-elements)
aref array-active-length)
Then, the loop clause
for var being the array-elements of array
will step var over the elements of array, starting from 0. The sequence path function also accepts in as a synonym for of.
The range and stepping of the iteration may be specified with the use of all of the same keywords which are accepted by the loop arithmetic stepper (for var from ...); they are by, to, downto, from, downfrom, below, and above, and are interpreted in the same manner. Thus,
(loop for var being the array-elements of array from 1 by 2 ...)
steps var over all of the odd elements of array, and
(loop for var being the array-elements of array downto 0 ...)
steps in "reverse" order.
(define-loop-sequence-path (vector-elements vector-element)
    vref vector-length notype notype)
is how the vector-elements iteration path can be defined in NIL
(which it is). One can then do such things as
(defun cons-a-lot (item &restv other-items)
  (and other-items
       (loop for x being the vector-elements of other-items
             collect (cons item x))))
All such sequence iteration paths allow one to specify the
variable to be used as the index variable, by use of the index
keyword with the using prepositional phrase, as described (with
an example) on loop-using-crock.
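The following is a minimal sketch of how a user-defined sequence path and
the index keyword might be combined. The path names string-element and
string-elements and the function char-position are hypothetical names
chosen for illustration; the sketch assumes the Zetalisp implementation,
where aref may be applied to a string and string-length returns its size.

```lisp
; Hypothetical path names.  aref is the fetch function (sequence, index)
; and string-length is the size function (sequence).
(define-loop-sequence-path (string-element string-elements)
    aref string-length)

; Return the zero-origin index of the first character in str that
; matches ch, or nil if there is none.  The using phrase names the
; index variable i, which would otherwise be an internal variable.
(defun char-position (ch str)
  (loop for c being the string-elements of str using (index i)
        when (char-equal c ch) return i))
```

Because the last two arguments to define-loop-sequence-path are optional,
the sketch omits the sequence type and the default element type.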
This section and the next are of interest only to those who
wish to define their own iteration paths.
A loop iteration clause (e.g. a for or as clause) produces, in
addition to the code which defines the iteration (section
loop-iteration-framework-section), variables which must be bound,
and pre-iteration (prologue) code. This breakdown allows a
user-interface to loop which does not have to depend on or know
about the internals of loop. To complete this separation, the
iteration path mechanism parses the clause before giving it to the
user function which will return those items. A function to
generate code for a path may be declared to loop with the
define-loop-path function:
(define-loop-path pathname-or-names path-function
list-of-allowable-prepositions
datum-1 datum-2 ...)
This defines path-function to be the handler for the path(s)
pathname-or-names, which may be either a symbol or a list of
symbols. Such a handler should follow the conventions described
below. The datum-i are optional; they are passed in to
path-function as a list.
The handler will be called with the following arguments:
path-name
    The name of the path which caused the path function to be invoked.
variable
    The "iteration variable".
data-type
    The data type supplied with the iteration variable, or nil if
    none was supplied.
prepositional-phrases
    This is a list with entries of the form (preposition
    expression), in the order in which they were collected. This may
    also include some supplied implicitly (e.g. an of phrase when
    the iteration is inclusive, and an in phrase for the
    default-loop-path path); the ordering will show the order of
    evaluation which should be followed for the expressions.
inclusive?
    This is t if variable should have the starting point of
    the path as its value on the first iteration (by virtue of being
    specified with syntax like for var being expr and its
    pathname), nil otherwise. When t, expr will appear in
    prepositional-phrases with the of preposition; for example,
    for x being foo and its cdrs gets prepositional-phrases of
    ((of foo)).
allowed-prepositions
    This is the list of allowable prepositions declared for the
    pathname that caused the path function to be invoked. It and data
    (immediately below) may be used by the path function such that a
    single function may handle similar paths.
data
    This is the list of "data" declared for the pathname that caused
    the path function to be invoked. It may, for instance, contain a
    canonicalized pathname, or a set of functions or flags to aid the
    path function in determining what to do. In this way, the same
    path function may be able to handle different paths.
The handler should return a list of either six or ten elements:
variable-bindings
    This is a list of variables which need to be bound. The entries in it
    may be of the form variable, (variable expression),
    or (variable expression data-type). Note that it is
    the responsibility of the handler to make sure the iteration variable
    gets bound. All of these variables will be bound in parallel;
    if initialization of one depends on others, it should be done with a
    setq in the prologue-forms. Returning only the variable
    without any initialization expression is not allowed if the variable
    is a destructuring pattern.
prologue-forms
    This is a list of forms which should be included in the loop
    prologue.
the four items of the iteration specification
    These are the four items described in section
    loop-iteration-framework-section, loop-iteration-framework-page:
    pre-step-endtest, steps, post-step-endtest, and pseudo-steps.
another four items of iteration specification
    If these four items are given, they apply to the first iteration,
    and the previous four apply to all succeeding iterations;
    otherwise, the previous four apply to all iterations.
Here are the routines which are used by loop to compare
keywords for equality. In all cases, a token may be any Lisp
object, but a keyword is expected to be an atomic symbol. In
certain implementations these functions may be implemented as macros.
si:loop-tequal token keyword
    This is the loop token comparison function. token is any Lisp
    object; keyword is the keyword it is to be compared
    against. It returns t if they represent the same token,
    comparing in a manner appropriate for the implementation.
si:loop-tmember token keyword-list
    The member variant of si:loop-tequal.
si:loop-tassoc token keyword-alist
    The assoc variant of si:loop-tequal.
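As an illustration of token comparison inside a path function, the sketch
below looks up the expression supplied with a given preposition. The
function name find-prep-expression is hypothetical. The sketch assumes
only what is stated above: that the prepositional phrases arrive as a list
of (preposition expression) entries, and that si:loop-tequal takes the
token first and the keyword second.

```lisp
; Return the expression paired with the first entry of prep-phrases
; whose preposition matches keyword, or nil if there is no such entry.
; Note that an entry whose expression is nil cannot be distinguished
; from a missing entry by this sketch.
(defun find-prep-expression (keyword prep-phrases)
  (do ((l prep-phrases (cdr l)))
      ((null l) nil)
    (cond ((si:loop-tequal (caar l) keyword)
           (return (cadar l))))))
```

A handler might call (find-prep-expression 'of prep-phrases) rather than
comparing the preposition with eq, so that the comparison is done in
whatever manner is appropriate for the implementation.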
If an iteration path function desires to make an internal
variable accessible to the user, it should call the following function
instead of gensym:
si:loop-named-variable keyword
    This should only be called from within an iteration path function. If
    keyword has been specified in a using phrase for this
    path, the corresponding variable is returned; otherwise, gensym
    is called and that new symbol returned. Within a given path function,
    this routine should only be called once for any given keyword.
If the user specifies a using preposition containing any keywords
for which the path function does not call si:loop-named-variable,
loop will inform the user of the error.
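A minimal sketch of how a path function might choose its internal
variables follows. The helper name indexed-path-variables is hypothetical,
and the sketch assumes that the keyword index is the one the path wishes
to expose; the helper is meant to be called only from within a path
function, as si:loop-named-variable requires.

```lisp
; Return a list of three variables for a sequence-like path:
; the sequence variable, the index variable, and the size variable.
; Only the index variable may be named by the user with a using
; phrase; the other two are always fresh gensyms.
(defun indexed-path-variables ()
  (list (gensym)                          ; sequence variable, internal
        (si:loop-named-variable 'index)   ; index variable, user-nameable
        (gensym)))                        ; size variable, internal
```

With such a handler, a clause ending in using (index i) would cause i to
be used as the index variable, while omitting the using phrase would leave
the index variable as an internal symbol. The helper calls
si:loop-named-variable exactly once for the keyword, in keeping with the
restriction stated above.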
Here is an example function which defines the string-characters
iteration path. This path steps a variable through all of the
characters of a string. It accepts the format
(loop for var being the string-characters of str ...)
The function is defined to handle the path by
(define-loop-path string-characters string-chars-path (of))
Here is the function:
(defun string-chars-path (path-name variable data-type
                          prep-phrases inclusive?
                          allowed-prepositions data
                          &aux (bindings nil)
                               (prologue nil)
                               (string-var (gensym))
                               (index-var (gensym))
                               (size-var (gensym)))
   allowed-prepositions data ; unused variables
   ; To iterate over the characters of a string, we need
   ; to save the string, save the size of the string,
   ; step an index variable through that range, setting
   ; the user's variable to the character at that index.
   ; Default the data-type of the user's variable:
   (cond ((null data-type) (setq data-type 'fixnum)))
   ; We support exactly one "preposition", which is
   ; required, so this check suffices:
   (cond ((null prep-phrases)
          (ferror nil "OF missing in ~S iteration path of ~S"
                  path-name variable)))
   ; We do not support "inclusive" iteration:
   (cond ((not (null inclusive?))
          (ferror nil
                  "Inclusive stepping not supported in ~S path ~
                   of ~S (prep phrases = ~:S)"
                  path-name variable prep-phrases)))
   ; Set up the bindings
   (setq bindings (list (list variable nil data-type)
                        (list string-var (cadar prep-phrases))
                        (list index-var 0 'fixnum)
                        (list size-var 0 'fixnum)))
   ; Now set the size variable
   (setq prologue (list `(setq ,size-var (string-length ,string-var))))
   ; and return the appropriate stuff, explained below.
   (list bindings
         prologue
         `(= ,index-var ,size-var)
         nil
         nil
         ; char-n is the NIL string referencing primitive.
         ; In Zetalisp, aref could be used instead.
         (list variable `(char-n ,string-var ,index-var)
               index-var `(1+ ,index-var))))
The first element of the returned list is the bindings. The
second is a list of forms to be placed in the prologue. The
remaining elements specify how the iteration is to be performed. This
example is a particularly simple case, for two reasons: the actual
"variable of iteration", index-var, is purely internal
(being gensymmed), and the stepping of it (1+) is such
that it may be performed safely without an endtest. Thus
index-var may be stepped immediately after the setting of the
user's variable, causing the iteration specification for the first
iteration to be identical to the iteration specification for all
remaining iterations. This is advantageous from the standpoi