Excerpts from Forth Programmer's Handbook #1
About the Forth Programming Language
The Forth programming language was originally developed in the early 1970s by Charles H. Moore, at the National Radio Astronomy Observatory. Forth was used at several NRAO installations for controlling radio telescopes and associated scientific instruments, as well as for high-speed data acquisition and graphical analysis. Today Forth is used world-wide by people seeking maximum flexibility and efficiency in a wide variety of application areas.
About This Book
The Forth Programmer's Handbook provides a detailed technical reference for programmers and engineers who are developing software using ANS-compliant versions of Forth provided by FORTH, Inc. or other vendors. It features ANS Forth (ANSI X3.215:1994, the standard adopted in 1994, or ISO/IEC 15145:1997) and many extensions commonly in use; some information in this book is taken directly from the official ANS Forth document.
This book assumes the reader has a general knowledge of programming principles and practices, and general familiarity with computer hardware and software systems. If you are learning Forth for the first time, we suggest you begin with Starting Forth, an excellent introductory tutorial. It includes many exercises you can try that will actually run on your system.
The following reference materials may be of use to the reader of this manual.
Additional recommended publications are listed in Appendix C, "Bibliography" on page 223, along with other sources of information about Forth.
How to Proceed
If you are not already familiar with Forth, begin by reading Starting Forth, working its examples with your Forth system. Use this book for technical details about the features discussed in Starting Forth, and to assist you as you move on to more ambitious programming challenges.
This Forth Programmer's Handbook provides a reference source for the most common features of the integrated software development systems based on the Forth programming language. We assume at least an elementary knowledge of the Forth language, consistent with having studied Starting Forth and attended a Forth programming course, or the equivalent. If you are new to Forth, we encourage you to begin by reading Starting Forth carefully, working the problems at the end of each chapter.
This book is primarily intended to describe how a programmer can use Forth to solve problems. This is a rather different goal from explaining how Forth works, but it is a practical necessity for the new user of a Forth system. This manual is also organised to serve experienced programmers who need to check some point quickly.
We highly recommend that you spend time examining the Forth source code supplied with your system, along with its documentation. Forth was designed to be highly readable, and the source code offers many examples of good usage and programming practice.
This manual does not attempt to cover all Forth commands. Indeed, no book can do that -- Forth is an extensible system, and no two implementations need or use identical components. What we can do is provide a detailed exposition of the most valuable and most commonly used features and facilities of the fundamental system from which your application begins.
FORTH, Inc. provides development environments for a growing number of computer systems and embedded microprocessors. Since hardware is unique for each computer, it is not feasible for this document to cover every feature of every system supported. The Forth Programmer's Handbook presents features common to ANS Forth and to the most common extensions found in all FORTH, Inc. systems. When discussing hardware-specific features, particularly dictionary structure, high-level object format, database management, and device drivers, an idealised model of a Forth system is used. Separate product documentation provides implementation details and descriptions of features specific to that system.
In this manual, typefaces are used as follows:
Many of the following topics are treated in tutorial form in Starting Forth. This section highlights special considerations arising from the actual implementation of a system. More detailed technical discussions of subjects covered here will be found in later sections of this book, especially Section 2. Appendix A, "Glossary & Notation" on page 203 provides supplementary definitions of many of the terms used in this manual, as well as a detailed description of the notation conventions.
1.1.1 Definitions of Terms
Forth allows any kind of ASCII string (except one containing spaces) to be a valid name, and this introduces some ambiguities in references. For instance, Forth calls subroutines words, but word could also mean an addressable unit of memory. To resolve this, we use the following conventions:
The dictionary contains all the executable routines (or words) that make up a Forth system. System routines are entries predefined in the dictionary that become available when the system is booted. Electives are optionally compiled after booting. User-defined words are entries the user adds. In a multi-user configuration, system and elective definitions are available to all users, whereas user-defined words are available only to the user who defines them. Otherwise, there are no differences in size, speed, or structure. You may make user words available to other users simply by loading them with the other electives.
The basic form of the most common type of word definition is:
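As an illustration of this form, a colon definition begins with a colon and the new name, lists previously defined words, and ends with a semicolon. The names DOUBLED and QUADRUPLED below are hypothetical, chosen only for this sketch.

```forth
\ Illustrative names only; not taken from the handbook.
: DOUBLED     ( n -- 2n )  2 * ;
: QUADRUPLED  ( n -- 4n )  DOUBLED DOUBLED ;   \ built from the word above

5 QUADRUPLED .   \ displays 20
```

Once DOUBLED has been compiled, it is used in exactly the same way as any system-supplied word.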
The dictionary is the fundamental mechanism by which Forth allocates memory and performs symbol table operations. Because the dictionary serves so many purposes, be sure you understand how to use it. You may wish to review the related material in Starting Forth.
The dictionary is a linked list of variable-length entries, each of which is a Forth word and its definition. In most implementations, the dictionary grows toward high memory; the discussion in this section will assume it does. Each dictionary entry points to the entry that logically precedes it (see Figure 1). The address of the next available cell at the end of the dictionary is put on the stack by the word
Figure 1. The "top" of a dictionary.
Dictionary entries are not necessarily contiguous. For example, in cross-compilers used to construct programs for embedded systems, the searchable portion of the dictionary (name, link, and a pointer to the content -- see Figure 2) may reside in a host computer, and the actual content may reside in a target image being constructed in the host computer's memory for later downloading or for burning into PROM.
The dictionary is searched by sequentially matching names in source text against names compiled in the dictionary. On some systems, the search is sped up by providing more than one chain of definitions, with the entries linked in logical sequences that do not necessarily reflect their physical location. The Forth text interpreter selects one of these chains to search; the selection mechanism is implementation dependent, and may include two or more chains in a programmer-controlled order (see Section 3.6 and Starting Forth). The search follows the selected chain until a match is found or the end of the chain is reached. Because the latest definition will be found first, this organisation permits words to be redefined, a technique that is frequently useful.
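Redefinition can be tried directly at the keyboard. The following is a minimal sketch with hypothetical names (LIMIT and OLD-LIMIT); it assumes a system in which definitions are bound at compile time, as is usual.

```forth
\ Hypothetical names, for illustration only.
: LIMIT      ( -- n )  10 ;
: OLD-LIMIT  ( -- n )  LIMIT ;   \ compiled against the first LIMIT
: LIMIT      ( -- n )  20 ;      \ redefinition; many systems print a warning

LIMIT .       \ displays 20: the search finds the newest LIMIT first
OLD-LIMIT .   \ displays 10: already-compiled references are unchanged
```

The earlier LIMIT is still in the dictionary; it is simply hidden from later searches by the newer entry of the same name.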
The ANS Forth term for one of these chains is word list. A word list is a subset of the dictionary containing words for some special purpose. There usually are several word lists present in a system and these are normally available to all users on a re-entrant basis.
Figure 2. Logical structure of the Forth dictionary
The essential structure of dictionary entries is the same for all words, and is diagrammed in Figure 2. The link cell contains the location of the preceding entry. This speeds up searches, which start at the recent end of the dictionary and work backwards to the older end. By this process, the most recent definition of a word is always found. In a developed application, where the user is dealing with the highest level of the program, this process optimises search time.
The name field in a dictionary entry contains the count of characters in the full name, followed by some number of characters in the name. The count (and, thus, the longest allowable name length) usually is limited to 31 characters. Any characters other than space, backspace, and carriage return can be used as part of a name field.
Some systems are case sensitive and others are not; see your product documentation for details. To avoid problems and to maximise the transportability of code, the names of the words provided in a standard system are defined in all upper-case letters and should always be referred to in all upper-case letters when using them in subsequent definitions. When defining and using new names, it is important to be consistent; always refer to a name using exactly the same case(s) in which it was defined. Also, in systems that are case sensitive, avoid creating names that differ only in their use of case; such code will not be transportable to a case-insensitive system.
Figure 3. Structural details of a typical dictionary entry
Although the order of the fields in a dictionary entry is arranged in each implementation to optimise each machine's dictionary search, Figure 3 shows a general model. There will always be a link field, a name field, and a code field. The code field contains a pointer to the run-time code to be executed when this definition is invoked. There is often a parameter field of variable length, containing references to data needed when this definition executes. There may also be a locate field, containing information about where this word is defined in source code. When developing programs for embedded systems, this structure may exist only on the host, with a parameter field containing a pointer to the actual executable portion being constructed in the target image.
In addition, usually there are several control bits to control the type and use of the definition. Since the longest name field in most systems has 31 characters, requiring only five bits to express a count, the control bits are often found in the byte containing the count. The most important control bit is called the precedence bit. A word whose precedence bit is set executes at compile time. The precedence bit is set by the word
Another common control bit is the smudge bit. A word whose smudge bit is set is invisible to a dictionary search. This bit is set by the compiler when starting to compile a high-level
The code field, pointing to the run-time code for a definition, causes different behaviours depending on the type of word being defined. In some implementation strategies, the code field is not required, or contains the code itself.
The cells (if any) after the code field address are called the parameter field, which is of variable length.
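The effect of the precedence bit can be observed with a small experiment. This sketch uses the ANS Forth word IMMEDIATE, which marks the most recent definition for execution at compile time; the names TALLY, NOTE, and DEMO are hypothetical.

```forth
VARIABLE TALLY   0 TALLY !          \ hypothetical counter for this sketch
: NOTE  ( -- )  1 TALLY +! ; IMMEDIATE

: DEMO  ( -- n )  NOTE NOTE 42 ;    \ NOTE runs twice while DEMO is compiled

TALLY @ .   \ displays 2
DEMO .      \ displays 42
TALLY @ .   \ still 2: NOTE was executed, not compiled into DEMO
```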
1.1.3 Data Stack
Every Forth system contains at least one data stack. In a multitasked system, each task may have its own data stack. The stack is a cell-wide, push-down LIFO (last-in, first-out) list; its purpose is to contain numeric operands for Forth commands. Commands commonly expect their input parameters on this stack and leave their output results there. The stack's size is indefinite. Usually it is located at a relatively high memory address and grows downward towards areas allocated for other purposes; see your product documentation for your system's particular layout. The data stack rarely grows beyond 10 to 20 entries in a well-written application.
When numbers are pushed onto or popped off the stack, the remaining numbers are not moved. Instead, a pointer is adjusted to indicate the last used cell in a static memory array. On most implementations, the top-of-stack pointer is kept in a register.
Stacks typically extend toward low memory for reasons of implementation efficiency, but this is by no means required or universally true. On implementations on which the stack grows toward low memory, a push operation involves decrementing the stack pointer, while a pop involves incrementing it.
A number encountered by the text interpreter will be converted to binary and pushed onto the stack. Forth data objects such as
(a) pushes the number 12 on the stack; (b) pushes 2400 over it (see Figure 4); (c) executes the multiply routine
Figure 4. Items on the data stack
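Steps (a) to (c) can be followed interactively. The phrase below is an assumption based on that description, with a final display step added so the result can be seen; it is written one token per line only to leave room for comments.

```forth
12        \ (a) push 12
2400      \ (b) push 2400 on top of it
*         \ (c) multiply: both operands are replaced by their product
.         \ remove the product from the stack and display it: 28800
```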
The standard Forth dictionary provides words for simple manipulation of single- and double-length operands on the stack:
The push-down stack simplifies the internal structure of Forth and produces naturally re-entrant routines. Passing parameters via the stack means fewer variables must be named, reducing the amount of memory required for named variables (as well as reducing the programmer's associated housekeeping).
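A short sketch of parameter passing on the stack, using only the standard words DUP, SWAP, *, and +. The names SQUARED and SUM-OF-SQUARES are hypothetical; note that neither definition needs a named variable.

```forth
: SQUARED         ( n -- n*n )        DUP * ;
: SUM-OF-SQUARES  ( a b -- a*a+b*b )  SQUARED SWAP SQUARED + ;

3 4 SUM-OF-SQUARES .   \ displays 25
```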
A pointer to the top (i.e., the latest entry) of the user's stack is maintained by the system. There is also a pointer to the "bottom" of the stack, so that stack-empty or underflow conditions can be detected, and to aid in clearing the stack if an abort condition is detected.
Most Forth systems check for stack underflow only after executing (or attempting to execute) a word from the input stream. Underflows that occur in the process of execution will not be detected at that time (see Figure 5).
The usual result of a detected stack underflow is the message:
followed by a system abort.
1.1.4 Return Stack
Every Forth system also has a return stack. In a multitasked system, each task has its own return stack, usually located above its data stack in memory. Like the data stack, the return stack is a cell-wide LIFO list. It is used for system functions, but may also be accessed directly by an application program. It serves the following purposes:
Because the return stack has multiple uses, care must be exercised to avoid conflicts when accessing it directly.
There are no commands for directly manipulating the return stack, except those for moving one or two parameters between the data stack and the return stack.
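As a minimal sketch of such a transfer, the standard words >R and R> can park a value on the return stack while the items beneath it are worked on. The name ADD-BELOW is hypothetical. Because the system also keeps return addresses on this stack, every >R is matched by an R> before the definition ends.

```forth
: ADD-BELOW  ( a b c -- a+b c )
   >R        \ move c to the return stack
   +         \ add a and b
   R> ;      \ bring c back before the definition ends

10 20 5 ADD-BELOW . .   \ displays 5 30
```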
The maximum size of the return stack for each task is specified at the time the task is defined, and remains fixed during operation; a typical size is 128 cells.
1.1.5 Text Interpreter
The text interpreter serves these critical functions:
The operator's terminal is the default text source. The keyboard input interrupt handler will accept characters into a text buffer called the terminal input buffer until a user event occurs, such as a Return or Enter key press, function key press, mouse click, etc. When such an event is detected, the text interpreter will process the text in the buffer. If interpretation is from source code on disk, it is buffered separately in an implementation-dependent fashion. In general, the region of text that the text interpreter is currently parsing is called the parse area.
Text interpretation repeats the following steps until the parse area is exhausted or an error has occurred:
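As a rough sketch of such a loop, the conventional sequence (parse a space-delimited name, look it up in the dictionary, execute it if found, otherwise try to convert it to a number) can be written in Forth itself. This is an illustration under stated assumptions, not the implementation used by any particular system: INTERPRET-SKETCH is a hypothetical name, compilation state and negative numbers are ignored, and THROW and D>S come from the optional Exception and Double-Number word sets.

```forth
: INTERPRET-SKETCH  ( i*x -- j*x )
   BEGIN
      BL WORD DUP C@                 \ parse the next name; its length is 0 at end of the parse area
   WHILE
      FIND IF                        \ found in the dictionary?
         EXECUTE                     \ yes: execute it
      ELSE
         COUNT 0 0 2SWAP >NUMBER     \ no: try to convert it as an unsigned number
         NIP IF  2DROP -13 THROW  THEN   \ unconverted characters remain: undefined word
         D>S                         \ keep the single-cell result on the stack
      THEN
   REPEAT
   DROP ;                            \ discard the address of the empty name

INTERPRET-SKETCH 12 2400 *
.   \ displays 28800
```

The rest of the line after INTERPRET-SKETCH is processed by the sketch rather than by the system's own text interpreter, which resumes on the following line.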
The Forth language was designed from first principles to support an interactive development style. By developing a very simple application in this section, we will show how this style translates into practice.
The general process of developing a program in Forth is consistent with the recommended development practices of top-down design and bottom-up coding and testing. However, Forth adds another element: extreme modularity. You don't write page after page of code and then try to figure out why it doesn't work; instead, you write a few very brief definitions and then exercise them, one by one.
Suppose we are designing a washing machine. The overall, highest-level definition might be:
The colon indicates that a new word is being defined; following it is the name of that new word,
Figure 6. Example of a control program that runs a washing machine
Typically, we design the highest-level routines first. This approach leads to conceptually correct solutions with a minimum of effort. In Forth, words must be compiled before they can be referenced. Thus, a listing begins with the most primitive definitions and ends with the highest-level words. If the higher-level words are entered first, lower-level routines are added above them in the listing.
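The ordering rule can be seen in a small stand-in listing. Everything here is hypothetical: the lower-level words are stubs that only display and count their calls rather than drive hardware, and the sequence of steps in WASHER is assumed for illustration.

```forth
\ Stand-in definitions: each just reports and counts its call.
VARIABLE STEPS   0 STEPS !
: STEP    ( -- )  1 STEPS +! ;

\ Lower-level words come first in the listing ...
: WASH    ( -- )  STEP ." wash " ;
: RINSE   ( -- )  STEP ." rinse " ;
: SPIN    ( -- )  STEP ." spin " ;

\ ... and the highest-level word comes last.
: WASHER  ( -- )  WASH SPIN RINSE SPIN ;

WASHER   \ displays: wash spin rinse spin
```

Moving the definition of WASHER above WASH would fail at compile time, because WASH would not yet be in the dictionary.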
Figure 6 shows a complete listing of the washing machine example. This is a typical Forth block of source code. Comments are in parentheses. In this example, lines 1 to 3 define named constants, with hex values representing hardware port addresses. Lines 5 to 15 define, in sequence, the application words that perform the work.
The code in this example is nearly self-documenting; the few comments show the parameters being passed to certain words. Forth allows as many comments as desired, with no penalty in object code size or performance.
it is obvious what
When you wonder how
Reading further, one finds that
Even from this simple example, it may be clear that Forth is not so much a language, as a tool for building application-oriented command sets. The definition of
Because Forth is extensible, Forth programmers write collections of words that apply to the problem at hand. The power of Forth, which is simple and universal to begin with, grows rapidly as words are defined in terms of previously defined words. Each successive, newer word becomes more powerful and more specific. The final program becomes as readable as you wish to make it.
When developing this program, you would follow your top-down logic, as described above. But when the time comes to test it, you see the real convenience of Forth's interactivity.
If your hardware is available, your first step would be to see if it works. Even without the code in Figure 6, you could read and write the hardware registers by typing phrases such as:
This would read the water-level register at 7010H and display its value. And you could type:
to see if the valve opens and closes.
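The same style of interactive probing can be rehearsed on a system with no hardware attached by pointing the standard byte-access words C! and C@ at ordinary memory. In this sketch FAKE-VALVE is a hypothetical one-byte buffer standing in for a valve register; on real hardware the address would be the port address instead.

```forth
CREATE FAKE-VALVE  1 CHARS ALLOT   \ hypothetical stand-in for a valve register

1 FAKE-VALVE C!     \ "open" the simulated valve
FAKE-VALVE C@ .     \ displays 1
0 FAKE-VALVE C!     \ "close" it again
FAKE-VALVE C@ .     \ displays 0
```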
If the hardware is unavailable, you might temporarily re-define the words
You can load your block of source (as described in Section 3.3.3), whereupon all its definitions are available for testing. You can further exercise your I/O by typing phrases such as:
to see what happens. Then you can exercise your low-level words, such as:
and so on, until your highest-level words are tested.
As you work, you can use any of the additional programmer aids described in Section 2.1.5. You can also easily change your code and re-load it. But your main ally is the intrinsically interactive nature of Forth itself.