# GENER8 — Text Preprocessor User Manual for Release 8.0

Program Dated 25 Apr 2016  /  Document Dated 25 Apr 2016

Summary: GENER8 is perfect for generating documents with a lot of repetition, or repetition with variations, like a series of HTML files for a Web site. GENER8 lets you include files, define and use macros, perform arithmetic, and output different text conditionally. While its original inspiration was the C preprocessor, GENER8 was written from the ground up to work with any text at all, and it requires no programming knowledge.

## Overview

GENER8 takes one or more input files and processes them to produce an output file. Along the way it can perform text substitution and paste other files or the results of system commands into the output.

This is ideal for producing series of files that are partly boilerplate, like pages in a Web site. It’s also terrific when you have several versions of a document that are mostly the same but have a few differences. And even if you’re producing just one document, if it’s got lots of repetition it may be worth your while to use macros to cut down the amount of typing you do.

## Getting Started

The GENER8 package consists of a suite of programs written in the AWK language. You do not need to understand the AWK language to use them.

### System Requirements

GENER8 requires the free program GAWK (GNU AWK). If GAWK isn’t already on your system, you can find a free copy at GNU.org. Windows users can get the latest version of the same port I use, part of the GNUWin32 project. Users of any operating system can download the source code and compile it for themselves.

Be sure to use GNU AWK (GAWK). GENER8 uses several features of GAWK that are not present in some other implementations. GENER8 has been tested with GAWK 3.0.4 and 3.1.0 in 32-bit and 64-bit Windows, but it should work on any operating system with GAWK 3.0.4 or later.

### Installation

There is no special installation process. Simply place gener8.awk in any convenient directory.

GENER8 can work just fine in a different directory from your input files. In that case you may find it convenient to set the AWKPATH environment variable.

### Evaluation, License, and Warranty

GENER8 is fully functional as you downloaded it, but it is not freeware. If you evaluate it for 30 days and intend to keep using it, please register it at http://oakroadsystems.com/pub/sharware/gener8buy.htm. The fee is modest, and you’ll have the satisfaction of supporting the shareware movement, which makes useful programs available at low cost.

You are welcome to distribute GENER8, whether you have registered it yourself or not, as long as you distribute it complete, unaltered, and without charge. Please see the file license.txt for full license and warranty information.

Whether you have registered GENER8 or not, I disclaim any ownership or royalty in any output files that you produce by running GENER8.

## Running the Program

GENER8 is a command-line utility:

gawk -f gener8.awk {options} inputfiles... >outputfile

gawk is the GNU AWK program; it might be called something different on your system. Please make sure you are using GNU AWK, because other AWK variants may not have all the features that GENER8 relies on.

-f gener8.awk specifies the GENER8 program. If gener8.awk is not in your current directory, as probably it is not, you have two choices: give its full path after -f, or set the AWKPATH environment variable so that GAWK can find it.

After any options, specify one or more input files. GENER8 will process them in the order specified and will write the results to the specified output file. Both input and output files may be in other directories; just specify appropriate paths.

The input files contain text, directives, and macro calls.

### Options

The options are, well, optional. But each one that you specify must be in lower case, with -v before each option. All the options must precede the input file(s).
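The ordering rules are easy to get wrong in a build script, so here is a minimal Python sketch that assembles the command line in the required order. The helper name build_command and its parameters are hypothetical and not part of GENER8; the sketch only arranges the arguments and does not run anything.

```python
def build_command(inputs, options=None, gawk="gawk", program="gener8.awk"):
    """Return an argument list: gawk -f program, each option as -v name=value, then the inputs."""
    command = [gawk, "-f", program]
    for name, value in (options or {}).items():
        # Option names must be lower case, and every option needs its own -v.
        command += ["-v", f"{name.lower()}={value}"]
    # Input files always come after all the options.
    command += list(inputs)
    return command
```

When you actually run the resulting command, redirect its standard output to the output file, as in the command line shown earlier.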

#### -v stderr=file

Normally any error messages are written to the standard error stream (which is usually your screen), but this option defines a file for error messages. If any messages are actually written, they will overwrite the file; otherwise the file is untouched.

#### -v maxerrors=n

Sometimes a single mistake — misspelling a macro name in a #define directive, for instance — can trigger a whole cascade of error messages. Therefore, by default GENER8 will display only the first five errors to the standard error output. The input file(s) are processed to the end, and if debugging is in effect then all error messages are still written to the output file.

You may want to set a different limit on the number of errors that are displayed. To do this, set the maxerrors option to the desired number. If you set maxerrors=0, GENER8 will display all errors, however many there may be.
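As an illustration only, the display limit can be modelled in a few lines of Python. The function name errors_to_display is hypothetical; the sketch assumes nothing beyond the rule stated above (default of five, and 0 meaning no limit).

```python
def errors_to_display(errors, maxerrors=5):
    """Return the error messages that would be shown on the standard error stream."""
    errors = list(errors)
    if maxerrors == 0:
        # A limit of 0 means "show everything".
        return errors
    return errors[:maxerrors]
```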

#### -v debug=n

This option sets initial debugging mode to 0, 1, or 2; the default is 0 (no debug information).

Debugging mode can also be set or changed by the #debug directive in any input file. Please see that section for more information about debugging.

#### -v picky=n

This option sets initial macro pickiness to 0, 1, or 2; the default is 1 (treat undefined macros as defined but empty).

Macro pickiness can also be set or changed by the #picky directive in any input file. Please see that section for more information about macro pickiness.

#### -v target=path/file and -v home=path/file

These options are intended primarily for use when you’re using GENER8 to create HTML files for a Web site. You specify -v target as the path and filename of the output file on your disk, and -v home as the path and filename of the site’s home page on your disk. If you use these options, either specify absolute locations for the files or specify locations relative to the current file.

Use forward slashes to separate directories, even in Windows.

GENER8 uses these options, if present, to set up the predefined macros HOME, RELHOME, TARGET, TARGETDIR, and TARGETNAME.

#### -v tocmin=minlevel and -v tocmax=maxlevel

These options are used to implement the two-pass processing of the TOCB macro, and should not be entered on the command line by the user.

### Environment Variables

Two optional environment variables specify search paths. You may wish to use them instead of specifying full paths on some file names.

#### AWKPATH: Locations of Programs

If you specify just -f gener8.awk on the command line, without a path, GAWK normally looks for the program in the current directory. You can use the AWKPATH environment variable to tell GAWK where to look for AWK programs. Example for Windows:

set AWKPATH=.;d:/my/code;c:/util

Example for UNIX (bash):

export AWKPATH=.:~/mybin/progs

For UNIX, use forward slashes in directories and separate them with colons; for Windows use forward slashes (not backslashes) but separate directories with semicolons.

The period (.) at the start of the two examples above stands for the current directory.

GAWK examines AWKPATH only to look for AWK programs. Input files on the command line must have explicit paths if they are not in the current directory.

#### INCLUDE: Locations of Include Files

If your input files contain #include directives, you may want to keep the include files in one or more directories other than the current directory. And if you do keep some include files elsewhere, you probably want to specify the search path in one place, not specify paths for the included files one by one.

That one place is the INCLUDE environment variable. If it is defined, then whenever GENER8 finds an #include in an input file, and no path is given for the file to be included, and the file isn’t in the current directory, then GENER8 will look for the include file in the directories specified in your INCLUDE environment variable.

For Windows, specify paths with / or \ and separate them with semicolons. Example:

d:/my/code;c:/util

For other systems, specify paths with / and separate them with colons, like this:

~/mybin/progs

There is no need to specify the current directory in the search path. If you #include a plain file with no path, GENER8 will always look for it in the current directory first.

If you prefer to set the include path right within your input file, use the #includepath directive.

### Return Values (ERRORLEVEL)

GENER8 normally returns the value 0 if the program finished normally and 1 if GENER8 found errors like bad macro calls or missing include files. Any other value comes from the gawk program itself.

Windows command-line programmers can access the returned value with the IF ERRORLEVEL statement.
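Users who drive GENER8 from a script rather than a batch file can test the same value there. The following Python sketch is one way to do it; run_gener8 and describe_exit are hypothetical helper names, and the command list is assumed to be a complete gawk command line of the form shown under Running the Program.

```python
import subprocess

def run_gener8(command, output_path):
    """Run the given gawk command, writing standard output to output_path; return the exit value."""
    with open(output_path, "w") as out:
        return subprocess.run(command, stdout=out).returncode

def describe_exit(code):
    """Interpret the exit value according to the rules stated above."""
    if code == 0:
        return "finished normally"
    if code == 1:
        return "GENER8 found errors"
    return "value came from gawk itself"
```

A build script might stop when describe_exit reports anything other than a normal finish.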

## # Directives

GENER8 honors a number of directives, as listed below. If the first non-blank character on the line is #, the line is recognized as a directive. If you need to start a regular text line with a # character, use the HTML character reference &#35; instead.

Every directive must be completed on one line. If you need additional lines, use a trailing \ character to continue on the next line.

You use directives to control or change what is written to the output file, but the directive itself is never written to the output file.

### #debug

If you get output that seems wrong, you can turn on debugging to see in detail what is going on. This should help you correct your input so that you get the output you want.

Debugging information, such as macro substitution and lines ignored because of #if directives and #ifdef directives, gets written to the output file in sequence with the regular output.

The bare #debug directive turns debugging mode on. You can have finer control by putting a number after the directive:

• #debug 1 turns on debugging as described above.
• #debug 2 turns on debugging, and also tells GENER8 to stop immediately when it finds an error.
• #debug 0 turns debugging off.

The initial value is #debug 0, unless you set a different value on the command line. You can have multiple #debug directives in an input file, so that you can debug only a small section of input and not have to cope with voluminous debugging output.

When debugging is on (debugging level 1 or 2), GENER8 writes error messages to the standard error stream as usual but also writes a copy of the same error messages to the output file.

### #define and #freeze and #undef

Please see the section on defining macros.

### #if and Friends: Conditional Processing

The directives #if, #ifdef, #ifndef, #elif, #elifdef, #elifndef, #else, and #endif work together to let you determine whether to process blocks of lines or not, based on some conditions. For instance, you might have something like this in your input file:

#if coursenum == 200
Statistics
#elif coursenum == 201
Calculus
     . . .
#endif

If you have defined the macro coursenum to equal 200, GENER8 will write Statistics to the output file and ignore everything else till the #endif; if coursenum is 201, GENER8 will write Calculus to the output file and ignore everything else till the #endif; and so forth.

Here is the complete pattern of a group of conditional directives:

• an #if or #ifdef or #ifndef directive
• a block of lines to be written and/or directives to be executed if that condition is true
• (optional) one or more #elif or #elifdef or #elifndef directives, each followed by a block to be processed if that condition is true
• (optional) an #else directive followed by a block to be processed if no previous condition in this group is true
• an #endif directive

Out of any group of conditional directives, as soon as one condition is found to be true, its block is processed and everything else until the #endif is ignored. If there is more than one true condition in an #if-#elif series, only the first true one will be processed.
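The "first true condition wins" rule can be modelled in a few lines of Python. This is only a sketch of the selection logic described above, with conditions already reduced to true or false; select_block is a hypothetical name, not something GENER8 provides.

```python
def select_block(branches, else_block=None):
    """branches is a list of (condition, block) pairs in source order.

    Return the block of the first true condition, or else_block
    (None when there is no #else) if no condition is true.
    """
    for condition, block in branches:
        if condition:
            # Later true conditions are never reached.
            return block
    return else_block
```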

Anything can be inside conditional blocks, including other directives, even more conditional directives. In other words, conditional blocks can be nested.

Conditional processing is most useful when you have a number of decisions in the input file that depend on a small number of conditions. A useful technique is to put the macro definitions in a separate file. In this example you’d have one file for the Statistics definitions, one for the Calculus definitions, and so on. Then when you are building an output file, on your command line you would list the appropriate file of macro definitions, followed by your main input file. For instance:

gawk -f gener8.awk stat.def syllabus >statsyllabus.htm

#### #if expression and #elif expression

These directives contain conditions to be tested. The form is fairly loose: any expression that ultimately evaluates to true (nonzero) or false (zero). Expressions are described under the ARITH macro below.

If there’s any error in the expression, GENER8 will print an error message to the standard error stream and ignore the directive, which may cause additional errors further down the line.

Any macro calls that appear after #if or #elif on the line will be evaluated in the same way as macros inside (#ARITH#). Any macro names that appear will be expanded but their contents won’t be evaluated; in that case, a pure number 0 is false and anything else, including text strings, is true.

Example:

#if 0

ignores everything until the next #elif, #else, or #endif.

Example:

#if coursenum

depends on whether a macro named coursenum is defined. If there is no such macro, then coursenum is a pure text string. Text strings evaluate to true, not to false as you might expect. If there is such a macro, its contents are pasted but not evaluated: numeric 0 is treated as false and anything else, including text, is treated as true.

To avoid this kind of confusion with macros that may or may not be defined, you may want to stick to #ifdef directives, or use expressions like 0+macroname and 1-macroname to force an undefined macro to be treated as 0.

If the macro was defined as 25>100, the #if will nevertheless be true because the macro is pasted as text, not evaluated. To avoid this kind of confusion with macros that contain expressions, either evaluate the macro on the #if line by enclosing it in (#…#), or define the macro with the #freeze directive and the ARITH macro.

Example:

#if (#coursenum#)

evaluates the contents of coursenum. A non-numeric result or a nonzero numeric result is treated as true; a result of numeric 0 is treated as false. Any of the following would return a result of true: abc, 1, 25<100. Any of the following return false: 50-50, 25>100.

Example:

#if coursenum == 200

tests whether the definition of macro coursenum is the three characters 200. If coursenum has exactly that definition, the following block (to the next #elif or #else or #endif) will be executed; if coursenum has some other definition, even 100+100, the following block will be ignored.

Example:

#if coursenum+0 == 200

converts the contents of coursenum to a number. Effectively, this tests whether the definition of macro coursenum begins with something that looks like the number 200, including 200xyz, 2e2nonsense, and so forth. If coursenum is 100+100, the string-to-number conversion stops at the first non-numeric character, the plus sign, so the test is for 100==200, which is false.

Example:

#if (#coursenum#) == 200

tests whether the macro coursenum is an expression that evaluates to 200, including 200, 100+100, 800xyz/4, and so forth.

If conditionals don’t seem to be going as you expect, try turning on debugging.
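To make the difference between pasted text and converted numbers concrete, here is a Python sketch of the two rules used in the examples above: the leading-number conversion behind coursenum+0, and the "pure number 0 is false, anything else is true" test. The names to_number and is_true are hypothetical, the regular expression is an approximation of AWK's string-to-number conversion, and the empty-string case is not modelled.

```python
import re

# An optional sign, digits with an optional fraction, and an optional exponent.
NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"

def to_number(text):
    """Convert the leading numeric part of text; text with no leading number counts as 0."""
    match = re.match(r"\s*" + NUMBER, text)
    return float(match.group(0)) if match else 0.0

def is_true(text):
    """A pure number equal to 0 is false; everything else, including text, is true."""
    if re.fullmatch(r"\s*" + NUMBER + r"\s*", text):
        return float(text) != 0
    return True
```

Under this model, to_number("100+100") is 100, so a test against 200 fails, while is_true("25>100") is true because the text is pasted and never evaluated.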

#### #ifdef macroname and #ifndef macroname

It can be handy to use a macro as a simple switch: you take one set of actions if the macro is defined and another (or no actions) if it is not defined. These directives let you test whether a macro is defined. It makes no difference whether the macro definition used #define or #freeze.

These tests are most appropriate for empty macros, though you can test any macro name with them.

The DEFINED macro makes the same test as #ifdef, but can be used in expressions.

#### #elifdef macroname and #elifndef macroname

These directives let you test whether a macro is defined if some preceding condition is not true, without nesting.

Example: Suppose you give some people your phone number and some your e-mail address, but nobody gets both. (Admittedly, this is a contrived example.) You could code your contact information like this:

#ifdef phone
    You can phone me at (#phone#).
#elifdef email
    You can e-mail me at (#email#).
#else
    Well, there's always tin cans and a string.
#endif

#### #else and #endif

These directives have already been explained: the block following the optional #else is processed if no preceding condition was true, and the required #endif marks the end of the conditional group.

Anything on the line after #else or #endif is ignored (treated as a comment).

### #include file

When GENER8 reads this directive in an input file, it suspends processing the current file, opens the named file and processes it, then after reaching the end of the named file continues with the input file that was being processed.

You may use macro calls on the #include line to specify part or all of file. In fact, the entire #include line can be created inside a macro.

If file includes a path specification, GENER8 will look only in the specified location. If file does not specify a path, GENER8 will look first in the current directory and then in order in the directories (if any) specified in the INCLUDE environment variable. If GENER8 can’t locate the file, it prints an error message to the standard error stream and aborts all processing.
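The search order just described can be sketched in Python as follows. This models the documented rules only and is not GENER8's own code; resolve_include and its parameters are hypothetical names, and returning None stands in for the error that aborts processing.

```python
import os

def resolve_include(name, include_path="", sep=os.pathsep, cwd="."):
    """Return the path of the file that would be included, or None if it cannot be found."""
    if os.path.dirname(name):
        # A path was given: look only in that location.
        candidate = os.path.join(cwd, name)
        return candidate if os.path.isfile(candidate) else None
    # No path given: current directory first, then each INCLUDE directory in order.
    directories = [cwd] + [d for d in include_path.split(sep) if d]
    for directory in directories:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

The sep parameter defaults to the platform's own path-list separator, which matches the semicolon (Windows) and colon (other systems) convention described for the INCLUDE variable.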

The value of the FILENAME macro does not change while processing an included file: it always refers to the current input file from the command line. The value of the INCLUDEFILE macro does change when an included file is opened or closed.

The included file may itself contain another #include directive, and so on down the line. The limit to these nested includes depends on how many open files your system allows.

### #include !command

When GENER8 reads this directive in an input file, it executes the command, which might be a program name with arguments or a shell command. You may use macro calls to specify all or part of the command. In fact, the entire #include line can be created inside a macro.

GENER8 intercepts any output from the command and processes it just as though it had come from an input file; therefore any macro calls and directives in the command output are processed. This is the difference from the otherwise similar SYSTEM macro, which also executes a system command: the SYSTEM macro lets the system command write directly to the output file with no processing by GENER8.

If the command generates no output, GENER8 prints an error message to the standard error stream and aborts all processing.

### #includepath path;path;...

While you can specify a search path for include files with the INCLUDE variable, it’s awkward to specify an environment variable in a set of make commands. So it may be more convenient to specify the include path right in the input file.

Specify one path or multiple paths separated with semicolons (colons for UNIX systems). Any \ characters within the paths will be changed to /. Don’t specify an empty path or a single period to indicate current directory; GENER8 will always look in the current directory before searching the include path.

The #includepath directive overrides any path that may be specified in the environment variable. Therefore, if you want to include any previously defined paths, use the value of the INCLUDE variable, like this:

#includepath f:\somewhere\faraway;(#ENV INCLUDE#)

GENER8 sets its internal copy of the INCLUDE variable to the contents of the #includepath directive. Therefore a pattern like the above will work whether the previous path was set in a previous #includepath directive, or in the environment before calling GENER8.

### #info text

Sometimes you need to display a piece of information while GENER8 is processing source files. For example, maybe you’re not sure how a macro is being expanded. But you don’t want to embed this kind of debugging information in your output file.

Use the #info directive. Any macros on the line will be expanded (and #info itself can be the result of a macro expansion), and then GENER8 will display the source file name, the line number, and the text on the console, without writing anything to the output file.

### #macrosep characters

By default, macro arguments are separated from each other and from the macro name by a run of one or more spaces. That might be inconvenient in two contexts: if a lot of your macro arguments contain spaces, or if you reflow text and macro calls end up split across a line. The solution is to redefine the macro separator.

#macrosep takes one argument, the macro separator characters to be defined in addition to a run of spaces. For example,

#macrosep \+

would define the macro separator as one or more spaces, or a backslash or plus character possibly with one or more spaces on either side. The corresponding regular expression is /  *| *[\\+] */. (GENER8 automatically escapes characters from the #macrosep directive that have special meaning in regular expressions.)
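For readers who want to experiment with the separator, the following Python sketch builds an equivalent pattern and splits a string with it. It is an approximation under stated assumptions: separator_regex is a hypothetical name, and the two alternatives are written in the opposite order from the AWK expression because Python takes the first alternative that matches, whereas AWK takes the longest match.

```python
import re

def separator_regex(chars=""):
    """Build a separator: a run of spaces, or one of chars with optional spaces around it."""
    if not chars:
        return re.compile(r" +")
    # Escape each character so it is taken literally inside the character class.
    char_class = "".join(re.escape(c) for c in chars)
    # The character alternative comes first so that " + " is consumed as one separator.
    return re.compile(rf" *[{char_class}] *| +")

def split_arguments(text, chars=""):
    """Split macro call text into its arguments using the separator."""
    return separator_regex(chars).split(text)
```

With the separator characters backslash and plus, the text a + b\c  d splits into the four arguments a, b, c, and d.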

### #macrosepregexp regularexpression

You can customize the macro argument separator completely by specifying a regular expression. For example,

#macrosepregexp [;:\-]

would separate macro arguments with a single semicolon, colon, or hyphen.

#macrosepregexp \.\.\.| *- *

would separate macro arguments by either a string of three dots, or a single hyphen possibly preceded or followed by spaces.

When specifying a regular expression, you must escape any characters that need escaping: GENER8 passes your regular expression unchanged to the split( ) function of AWK.

### #picky

Most people like to know if they have used a macro without defining it, or entered an incomplete macro definition. In these cases, GENER8 normally displays a warning message in the standard error stream, then treats the macro as defined but empty. The #picky directive lets you alter this behavior:

• #picky 0 suppresses these messages.
• #picky 1 displays the messages but doesn’t count them as errors. Any undefined macro is expanded as a zero-length string.
• #picky 2 displays the messages and does treat them as errors. Any undefined macro is expanded as the text MACRO ERROR, and any #if directive using an undefined macro generates an error message.
• #picky prev returns to the previous level of pickiness. This lets you put a #picky directive in a file to apply to a short section of code, then return to whatever value may have been set on the command line or in a previous file without knowing what that level was.

The initial value is #picky 1, unless you specify a different value on the command line. You can have multiple #picky directives in an input file.

When the macro pickiness level is 2, any macro error counts against the maxerrors quota (if set) and will affect the return value passed to the operating system.
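The three levels can be summarized in a small Python sketch. It is a model of the description above rather than GENER8's own logic; expand_undefined is a hypothetical name, and the empty expansion at level 0 is an assumption, since the text above says only that the messages are suppressed.

```python
def expand_undefined(name, picky=1):
    """Return (expansion, message_shown, counts_as_error) for an undefined macro."""
    message_shown = picky >= 1        # level 0 suppresses the message
    counts_as_error = picky == 2      # only level 2 treats it as an error
    # Level 2 expands to a visible marker; levels 0 and 1 expand to nothing (assumed for 0).
    expansion = "MACRO ERROR" if picky == 2 else ""
    return expansion, message_shown, counts_as_error
```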

### #tocif expression

When a table of contents is being generated, you may not want particular headers to appear in it. For instance, if the table of contents itself has a header, you probably don’t want that header in the table. #tocif lets you control this.

If expression evaluates to 0, subsequent headers won’t be included in the table of contents. If expression evaluates to nonzero, headers will be included. (Undefined macros are treated as text, which evaluates to nonzero.)

### #tocinsertli text

When a table of contents is being generated, you may occasionally need to insert some text, such as a class= or style= attribute, in one or more of the generated <li> tags.

Any macro calls on the line will be evaluated, and the resulting text will be stored. When the table of contents is generated, the text will be placed just before the > of the <li> tag. To stop the insertion, use a #tocinsertli directive with no text, or only spaces.

## Macros

A macro is a bit of stored text for later processing. Pretty much any sequence of characters that is used several times (perhaps with variants) is a candidate for making into a macro. When you call a macro, GENER8 inserts the macro text at that point in the output file. You can define your own macros, and GENER8 has a few macros predefined for you.

A macro can be simple unvarying text, or it can contain placeholders for arguments that are supplied when you call the macro. For example, if you are creating an HTML table and you want most cells to be centered horizontally and vertically, you can define a macro that contains the repetitive HTML coding with a placeholder for the cell contents.

### Defining a Macro

#### #define macroname  definition

#define tells GENER8 to store the definition under the given name as a string of text characters for later use. (To store the result of an expression, see the #freeze directive.)

The macro name may contain any character except a space. Macro names are case sensitive, meaning that abc and Abc are different macros. If a macro with the same name already exists, even a predefined macro, GENER8 discards the old definition.

The macro definition may contain any characters at all, but if a percent sign occurs before *, ?, or a digit, you must code it as %% because a single % sign looks like a macro argument. Spaces at the beginning or end of the macro definition are discarded, but spaces within the macro definition are preserved.

The macro definition must be on one line. If it is too long to fit comfortably on a line, use the \ character and then continue the definition on the next line. This does not generate a line break in the output; if you want a line break you code it as \n.

The macro definition must contain something. If you want to define an empty macro, use the EMPTY macro. Exception: You can set macro pickiness to accept empty macro definitions.

Example:  You are preparing a document that will be modified each quarter and used again. The identification of the quarter occurs many times in the document, and you would like to be able to change that just once without searching through the document to find all occurrences. (Also you worry about typos and want to make sure that all mentions are identical.) Code the macro definition like this:

#define qtr third quarter of 2002

Then everywhere in the document you would write (#qtr#) instead of third quarter of 2002. Come October, you simply change the macro definition to reference the fourth quarter.

#### Defining a Macro to Expect Arguments

You can define a macro with placeholders for text to be supplied later, when the macro is used. (Presumably it will be different text with different uses.)

For example, suppose you’re creating an HTML table and you want most cells to be centered horizontally and vertically. This means that you need to code them as

<td align=center valign=middle>contents</td>

You can define a macro that contains the HTML coding with a placeholder for the cell contents, like this:

#define cell <td align=center \
valign=middle>%*</td>

The %* says “whatever text is supplied with the macro call, insert it here.” You might call the macro like this:

<tr>(#cell 45#)(#cell 88#)(#cell 133#)</tr>

Your macro definition doesn’t need to treat the text of the macro call as a big lump, but can deal with the individual arguments. Here’s an example:

#define myhref <small><a href="%1%2">%2</a></small>

This is handy when I want to create a link in my document but type the link address only once. For instance, I might call this macro in this fashion:

found at (#myhref http://www. gnu.org#).

Then I have made a proper link to http://www.gnu.org, but the visible text of the link is just gnu.org.

In fact, you can specify placeholders for multiple arguments in the definition of a macro:

• %1 through %9 stand for the first through ninth arguments in the macro call. It’s rare but legal to use, say, argument %3 without using %2; this can happen for instance if you change your macro definition after coding a lot of calls to the macro.

A macro can have only nine numbered arguments. The sequence %10 means the first argument followed by a zero character.

• %* and %? stand for whatever text is left in the macro call after the numbered arguments have been extracted. For instance, if your macro definition contains %1, %2, and %*, then %* stands for all the text in the macro call after the second argument.

All macro arguments are separated when the macro is first read, and the unused ones are put back together, separated by a single space, for %* and %?. With the default macro separator, that means that any runs of spaces are collapsed to a single space. (If you want spaces not to collapse, code them as double underscores.) If you have used the #macrosep directive or the #macrosepregexp directive, then the argument delimiters are all changed to single spaces in the %* or %? text.

The distinction between %* and %? is this: use %* when there must be additional text in the macro call after the numbered arguments; use %? when there may be additional text in the macro call after the numbered arguments. Never use %* and %? in the same macro definition.

If the macro definition contains %1 and %3 but not %2, the second argument in the macro call is discarded, not used in %* or %?.

• %% stands for a literal % sign. If you want the text %3 in the output of a macro, you must code it as %%3 so that it doesn’t look like a macro argument. If you want %% in the output, code it as %%%%.

A percent sign is only special before *, ?, %, or a digit. Before any other character you don’t have to code the percent sign specially, but you can if you want.

For an example with %?, consider this macro from my personal macro file for Web pages:

#define copyright %? Copyright &copy; %1 \
Stan Brown, Oak Road Systems

This macro definition says that when the copyright macro is called, it must contain one argument, which gets placed after the copyright symbol. But the call may contain extra text, which if present gets placed before the word Copyright. Here are two sample calls of this macro:

(#copyright 2002#)
(#copyright 2002 portions of this page are#)

Earlier in this section, the example with table cells used %* alone in a macro definition. Here’s an example using %* and numbered arguments:

#define ti83pic <img src="(#pics#)%2.gif" \
%* width=200 height=%1>

I put many screen shots from the TI-83 in my Web pages for my students. All of the images are 200 pixels wide, but they have varying heights (%1) and of course different filenames (%2). In addition, each one needs some alternative text (%*), and some need special alignment. I might call this macro like this:

(#ti83pic 136 tdist alt="t distribution"#)
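The placeholder rules in this section can be modelled with a short Python sketch. It is an illustration under assumptions, not GENER8's implementation: expand_macro is a hypothetical name, arguments are split on runs of spaces (the default separator), nested macro calls are left untouched, and the difference between %* (extra text required) and %? (extra text optional) is not enforced.

```python
import re

# A percent sign is special only before %, *, ?, or a digit from 1 to 9.
PLACEHOLDER = re.compile(r"%(%|\*|\?|[1-9])")

def expand_macro(definition, call_text):
    """Substitute the arguments in call_text into the placeholders of definition."""
    args = call_text.split()
    numbers = [int(m.group(1)) for m in PLACEHOLDER.finditer(definition)
               if m.group(1).isdigit()]
    highest = max(numbers, default=0)
    # Everything after the highest numbered argument is rejoined with single spaces.
    rest = " ".join(args[highest:])

    def substitute(match):
        token = match.group(1)
        if token == "%":
            return "%"                      # %% is a literal percent sign
        if token in "*?":
            return rest
        index = int(token)
        return args[index - 1] if index <= len(args) else ""

    return PLACEHOLDER.sub(substitute, definition)
```

Applied to the myhref definition with the call text http://www. gnu.org, this produces a link whose address is http://www.gnu.org and whose visible text is gnu.org, because %1 and %2 are pasted together with nothing between them.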

#### Defining Nested Macros

You can call one macro inside the definition of another. For example, in creating this document I defined a macro pcode to make a one-line “paragraph” consisting of a line of code, and a second macro ex to show the word Example followed by a line of code. It makes sense to define ex in terms of pcode, like this:

#define ex <p class="brk">Example:</p>(#pcode %*#)
#define pcode <pre class="codeline">%*</pre>

When you nest macro definitions in this way, GENER8 simply stores each definition as an unrelated text string. Inner macros are not evaluated until the outer macro is called. In the example above, note that ex calls pcode before pcode has been defined. That’s perfectly legitimate: GENER8 doesn’t pay any attention to the contents of ex until it is called in the document. As long as the inner macro pcode is defined by then, everything is fine.

When you call a macro whose definition contains a call to another macro, GENER8 completely evaluates the inner macro call before the outer macro call. This lets you do things like (#chap(#chapnum#)#), where chapnum is a macro that will be defined later. If chapnum currently has a value of 14, then the macro call is a call to macro chap14.

Example:

#define sqrt (#ARITH %%9.4f %1^.5#)

This macro computes the square root of a number and displays it to four decimal places. You need a double percent sign, %%, for the formatting argument to an ARITH macro, because the outer macro is expanded first and you don’t want %9 to be taken as a macro argument. For example, the value of (#sqrt 1127#) is “ 33.5708”.

Example:

#define coursenum 200
#define course MATH(#coursenum#)

At this point the definition of the course macro is not MATH200; it is MATH(#coursenum#).

#define coursenum 105
The prerequisites for (#course#) are

The text The prerequisites for MATH105 are will be written to the output file.

#### #freeze macroname  definition

But suppose you want to define one macro in terms of the current value of another, regardless of how the inner macro might be redefined later? This is where #freeze comes in. Unlike #define, #freeze evaluates any macro calls in the definition right away. Modifying the previous example,

#define coursenum 200
#freeze course MATH(#coursenum#)

At this point GENER8 evaluates (#coursenum#) and stores the text MATH200 as the definition of the course macro. Any later changes in coursenum have no effect on course. For example, suppose these two lines occur later in the input file:

#define coursenum 105
The prerequisites for (#course#) are

The text The prerequisites for MATH200 are will be written to the output file. With #freeze, changes to inner macros don’t affect the outer macro.

When you are nesting macros, use #define to make the outer macro change with the changes in the inner macro; use #freeze to freeze the outer macro and make it invariable with changes in the inner macro. When you are not nesting macros — when the definition doesn’t contain any macro calls — it doesn’t matter whether you use #define or #freeze.
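The difference between #define and #freeze can be modeled in a few lines of Python. This is only a sketch of the behavior described above, not GENER8's own code (GENER8 is written in AWK). The helper names expand, define, and freeze are hypothetical, as is the second macro name frozen_course, and the sketch handles only calls without arguments.

```python
import re

# Matches an argument-free macro call such as (#coursenum#).
CALL = re.compile(r"\(#(\w+)#\)")

def expand(text, macros):
    """Replace (#name#) calls with stored definitions until none remain."""
    while CALL.search(text):
        text = CALL.sub(lambda m: macros[m.group(1)], text)
    return text

def define(macros, name, definition):
    # Stored verbatim; inner calls are expanded only when the macro is called.
    macros[name] = definition

def freeze(macros, name, definition):
    # Inner calls are expanded once, at definition time.
    macros[name] = expand(definition, macros)

macros = {}
define(macros, "coursenum", "200")
define(macros, "course", "MATH(#coursenum#)")
freeze(macros, "frozen_course", "MATH(#coursenum#)")
define(macros, "coursenum", "105")   # later redefinition of the inner macro

print(expand("The prerequisites for (#course#) are", macros))
print(expand("The prerequisites for (#frozen_course#) are", macros))
```

The first line printed follows the redefinition (MATH105); the second keeps the value that was current when the macro was frozen (MATH200).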

#freeze is also useful to store the result of an expression, particularly an expensive one like a test for the existence of a file. If you write

#define gotit (#EXISTS somefile.htm#)

then the macro definition contains a call to EXISTS, and every time you write (#gotit#) a system call will check for the existence of the file. On the other hand, if you write

#freeze gotit (#EXISTS somefile.htm#)

then the result of EXISTS is “frozen” as a 1 or 0 in the definition of gotit.

You can play this same game with complicated expressions that don’t actually change. Just wrap them in (#ARITH…#) and freeze them. This expression isn’t complicated, but illustrates the technique:

#freeze myval (#ARITH 88*44#)

#### Defining an Empty Macro: EMPTY

When you define a macro, you need to give it a definition. But sometimes you define a macro just to act as a switch to be used with an #ifdef directive or #ifndef directive. With such a macro, all you care about is whether it has been defined.

To define a macro with an empty definition, use the special code (#EMPTY#) for the definition. As an alternative, you can set macro pickiness to accept empty macro definitions.

Under special circumstances, you might want to call the EMPTY macro with arguments. Consider this example:

#ifdef something
#define pageref (See page %1.)
#else
#define pageref (#EMPTY %?#)
#endif

Suppose that you have page references scattered through your document. If something is defined, you want to display all of them; otherwise you want to suppress them all. It would be cumbersome to bracket each page reference with an #ifdef-#endif pair. Instead, you code them all in the form (#pageref 162#) and define the macro to emit no text if something is not defined. But you need to eat up the argument to the pageref macro; hence the %? in the definition. (%* or %1 would work as well in this case, but %? will always work.)

The EMPTY macro is special in that it simply ignores any arguments. Also, unlike other predefined macros, it cannot be redefined.

#### Removing a Macro Definition: #undef macroname

If you want a macro to be defined for part of a document but not defined for the rest, use the #undef directive to remove the definition.

#undef doesn’t care whether the macro was originally defined with #define or #freeze, but it is an error to undefine a macro that is not presently defined one way or the other.

### Using a Macro

To call a macro, simply specify its name surrounded by (# and #). For instance, if you have defined the macro qtr to contain the text second quarter of 2002, you might write a sentence like this:

Profits rebounded in the (#qtr#).

As you see from the example, a macro call need not be on a line by itself.

If you call a macro that was never defined, GENER8 displays a warning message and then treats the macro as defined but empty. You can change that behavior with the #picky directive.

#### Macro Arguments

Separate any macro arguments from the macro name and each other by spaces. If you want to have a space inside a macro argument, code it as __ (two underscores); if you want to force a line break in the output file somewhere inside a macro argument, code it as \n. (If you actually want a double underscore, code it as _\_. If you actually want _\_, you’re out of luck.)

You can change the argument separator from a space to almost anything you like. If you often want spaces inside your macro arguments, this may be a better solution than the __ hack. See the #macrosep directive.

A macro with all its arguments need not be alone on a line, but everything from the opening (# to the closing #) must be on one line. If the macro call is too long to fit comfortably on an input line, use \ and continue it to one or more additional lines. The \ continuation may occur in the middle of an argument or between arguments.

If you’re using an editor that reflows text, a macro that you code on a line may be split when you reflow the text. To avoid this, pick a different argument separator so that there are no spaces within the macro. See the #macrosep directive.

GENER8 will check that you have supplied the proper number of arguments according to the macro definition. For example, suppose a macro is defined with %1 through %4 but no higher argument numbers:

• If the definition contains %4 and neither %? nor %*, the call must have exactly four arguments.
• If the definition contains %4 and also %?, the call must have at least four arguments.
• If the definition contains %4 and also %*, the call must have at least five arguments.
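The three rules above can be expressed as a small Python check. This is a sketch of the rules as stated, not GENER8's actual code. The function name check_arg_count is hypothetical, argument numbers are assumed to be single digits, and the choice to let %* win when a definition contains both %* and %? is my assumption, since the manual does not cover that case.

```python
import re

def check_arg_count(definition, supplied):
    """Return True if a call with `supplied` arguments fits the definition."""
    # %% is a literal percent sign, so remove it before looking for arguments.
    body = definition.replace("%%", "")
    # Assumption: argument references are %1 through %9 (one digit).
    numbers = [int(n) for n in re.findall(r"%(\d)", body)]
    highest = max(numbers, default=0)
    if "%*" in body:
        return supplied > highest      # at least one argument beyond the highest
    if "%?" in body:
        return supplied >= highest     # extra arguments allowed, not required
    return supplied == highest         # exact count required

ti83pic = '<img src="(#pics#)%2.gif" %* width=200 height=%1>'
print(check_arg_count(ti83pic, 3))   # height, filename, plus text for %*
print(check_arg_count(ti83pic, 2))   # nothing left over for %*
```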

#### Nesting Macro Calls

You can use a macro call in an argument to another macro call. In this case, the inner macro is expanded first, before the outer macro is analyzed.

### Predefined Utility Macros

The following macros are automatically defined for you when GENER8 starts up. If you define any of them yourself, your definition will replace the original definition.

#### ARITH format  expression

GENER8 performs arithmetic, string, and logical operations and pastes the numeric or string result in the output. The expression consists of numeric and string operands connected by the operators listed below. Spaces are optional between operators and operands. You can use parentheses to specify the order of operations.

##### Operand and Result Types

GENER8 uses AWK’s logic in treating operands and expressions as strings or numerics. 1234 is always numeric, and 2e2 is always numeric (with a value of 200). But 1234z is a string.

Strings and numerics are automatically converted where appropriate:

• Binary operations + - * / ^ are done in floating point. The result will have an appropriate number of decimal places, or if it’s a whole number it will have no decimal places and no decimal point.

String operands are converted to numeric. Only the numeric characters before the first non-numeric get used; if there aren’t any, then the string is converted to 0. For example, the strings 200xyz and 2e2zzz and 200.0 would all become 200 if converted to numeric.

Unary + and ! are not arithmetic operators, and string arguments are not converted. +abc is abc, and !abc is 0, not 1 as you might expect.

• String concatenations return a string value, naturally enough.
• Logical and relational operators all return a value of 0 (false) or 1 (true).

The relationals can provide some surprises: if either operand is a string, both are treated as strings. 1234>98 is true because both are numeric, but 1234z>98 is false because they are compared as strings.

You can force conversions where you need to:

• To convert a string to a number, add 0.
• To convert a number to a string, concatenate "" (an empty string).
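For readers who don't know AWK's conversion rules, here is a rough Python model of the behavior described above: a string's numeric value is its leading numeric prefix (or 0), and a comparison is numeric only when both operands look entirely numeric. The function names are hypothetical, and the regular expression is my approximation of what AWK accepts as a number.

```python
import re

# Optional sign, digits with optional decimal part, optional exponent.
NUMERIC_PREFIX = re.compile(r"\s*[-+]?(\d+\.?\d*|\.\d+)([eE][-+]?\d+)?")

def to_number(text):
    """Numeric value of the leading numeric prefix, or 0 if there is none."""
    m = NUMERIC_PREFIX.match(text)
    return float(m.group(0)) if m else 0.0

def looks_numeric(text):
    """True only if the whole operand is a number."""
    return NUMERIC_PREFIX.fullmatch(text) is not None

def greater_than(a, b):
    """Model of the > operator: numeric if both are numeric, else string."""
    if looks_numeric(a) and looks_numeric(b):
        return int(to_number(a) > to_number(b))
    return int(a > b)

for operand in ("200xyz", "2e2zzz", "200.0", "abc"):
    print(operand, "->", to_number(operand))
print(greater_than("1234", "98"), greater_than("1234z", "98"))
```

The last line shows the surprise mentioned above: the same digits compare differently once one operand is a string.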

##### Format String

You can specify a format string before the expression. Use standard AWK format (printf style) strings to specify the conversion from number to string. For example,

(#ARITH 7/4#)

displays as 1.75, but

(#ARITH %08.3f 7/4#)

displays as 0001.750. Perhaps you want a rounded answer; then you would use

(#ARITH %.0f 7/4#)

which displays as 2 (no decimal point). (%d and %i don’t round; they truncate results to an integer.)
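If you want to try out a format string before putting it in a macro call, Python's % operator follows the same printf conventions for these simple cases. The helper name arith_format is hypothetical, and this is only a stand-in for what ARITH does with the format argument.

```python
def arith_format(fmt, value):
    """Apply a printf-style format string to a number."""
    return fmt % value

value = 7 / 4
print(arith_format("%08.3f", value))  # zero-padded, width 8, three decimals
print(arith_format("%.0f", value))    # rounded to the nearest whole number
print(arith_format("%d", value))      # truncated, not rounded
```

The three lines printed are 0001.750, 2, and 1, which shows the difference between rounding with %.0f and truncating with %d.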

##### Operators

The grammar of expressions follows, from highest priority (evaluated first) to lowest priority (evaluated last).

operand
• any sequence of characters ended by a blank or one of the operators listed below.
• any sequence of any characters enclosed by " characters. Use \" to include a " character within the string.
• a macro name, which is automatically replaced by the text of the macro but not expanded. That means that if macro mac is defined as 2+2, then (#ARITH mac+5#) equals 7, not 9. If you want a macro expanded, not just treated as a text string, call the macro — (#ARITH (#mac#)+5#) does equal 9. If you want to use a text string that happens to equal the name of a macro, put it in quotes. The value of mac is the string 2+2, but the value of "mac" is mac.
( )
Any expression inside parentheses is evaluated first and counts as an operand.
^ and unary + - !
exponentiation: 2^5 is 32, and -2^4 is −16. The ! operator is “not”: !0 is 1 and ! turns any nonzero value into 0. Caution: ! applied to a text string returns 0, not 1.
binary + - * / %
% is the modulus: 17%3 is 2. * / and % are done left to right before + and -.
string concatenation
Two values next to each other with no operator are concatenated. For example, 2+4 5 is 65.
 == != < <= > >=
Relational operators return 0 for false or 1 for true. Comparisons are done as numerics only if both operands are numeric. For example, 134xxx >98 is false (0) because the operands are compared as strings.
~ !~
“matches” (contains) and “does not match”. The second operand is treated as a regular expression. For example, 123zonk ~ 3z is true (1) but 123zonk ~ "^3z" is false (0) because 123zonk doesn’t begin with 3z.

Caution! The parser is easily fooled when the regular expression contains characters that are also GENER8 operators. For instance, ^abc looks like a defective exponentiation, and [019][0-9] looks like a subtraction. If you run into problems like this, enclose the regular expression in double quotes to force it to be parsed as a string. "^abc" and "[019][0-9]" are both parsed correctly.

 && ||
Logical “and” and “or” are evaluated left to right with equal priority and return 0 (false) or 1 (true). Unlike AWK, GENER8 always evaluates both operands.

#### DATE format

Convert the specified date and optional time — or the current date and time, if no date is specified — to any desired format.

##### Input Date Formats

If you specify a date in the macro, it must be in the same format as your system date. More precisely, it must be in the same format that you have told GENER8 is your system date format. See the DATE_SYSFORMAT macro, below.

While this might seem restrictive, the alternative was to specify both input and output formats in the macro, with different possibilities available for the two. Most people always write their dates in a given format, and a simple macro call lets you set that format if it’s different from what your system does.

In all system date formats, years can be two or four digits, and days and months can be one or two digits. Two-digit years 70 to 99 will have 1900 added; 00 to 69 will have 2000 added.

Although the supported date input formats all use the hyphen (-) as separator within the date, your input dates can equally well use a period (.) or slash (/) as separator.

The time, if specified, can be separated from the date by a T or by one or more spaces. The time can be in the form hh, hh:mm, or hh:mm:ss, with or without leading zeroes. If the time is on a 12-hour clock, add a space and AM or PM in upper or lower case.
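The input rules in this section can be sketched in Python as follows. This is an illustration of the rules as written, not a translation of GENER8's isodate( ) function. The function name parse_input_date is hypothetical, and the treatment of 12 AM as midnight and 12 PM as noon is my assumption.

```python
import re
from datetime import datetime

def parse_input_date(text, sysformat="y-m-d"):
    """Parse a date and optional time according to a system date format."""
    order = re.split(r"[-./]", sysformat.lower())       # e.g. ['m', 'd', 'y']
    m = re.fullmatch(
        r"\s*(\d+)[-./](\d+)[-./](\d+)"                  # date; -, . or / all accepted
        r"(?:(?:T|\s+)(\d{1,2})(?::(\d{1,2}))?(?::(\d{1,2}))?"  # hh, hh:mm or hh:mm:ss
        r"(?:\s+([AaPp][Mm]))?)?\s*",                    # optional AM or PM
        text)
    if not m:
        raise ValueError("unrecognized date: %r" % text)
    fields = dict(zip(order, (int(g) for g in m.group(1, 2, 3))))
    year = fields["y"]
    if year < 100:                                       # two-digit year
        year += 1900 if year >= 70 else 2000
    hour, minute, second = (int(g) if g else 0 for g in m.group(4, 5, 6))
    ampm = m.group(7)
    if ampm:                                             # assumption: 12 AM is 00, 12 PM is 12
        hour = hour % 12 + (12 if ampm.lower() == "pm" else 0)
    return datetime(year, fields["m"], fields["d"], hour, minute, second)

print(parse_input_date("11/22/12 2:26 pm", "m-d-y").isoformat())
```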

##### Output Date Formats

You have complete freedom in the output format that you specify. Although you can use strftime’s completely general formatting codes, most likely you’ll find one of the following keywords gives you the format you want.

• trad = Apr 25, 2016.
• traditional = April 25, 2016.
• custom1 = Apr 25 for current year, or Apr 25, 2016 (same as trad) for any other year.
• mil = 25 Apr 2016.
• milopt = 25 Apr for current year, or 25 Apr 2016 (same as mil) for any other year.
• custom2 = Apr 25 for current year, or 25 Apr 2016 (same as mil) for any other year.
• iso = 2016-04-25.
• isofull = 2016-04-25T21:12:00.
• timestamp = “raw” format, suitable for comparisons, in the format 1461633120. This is the number of seconds since the epoch, which for Windows systems was the start of 1 January 1970.

The author is open to creating new keywords, particularly for people who have registered the software.

The keyword format strings are not case sensitive.

The three-letter abbreviations Jun, Jul, and Sep are changed to June, July, and Sept. See the DATE_MONTHS4 macro if you want to stick with three-letter months.

If you want non-breaking spaces (&nbsp;) instead of regular spaces, include nbsp anywhere in one of the keyword formats; for example, isofullnbsp or isonbspfull. (If you want some other character in place of the spaces, use the GSUB macro on the result of the DATE macro.)

Examples:

(#DATE custom1#)

formats the current date, in the format Dec 24.

(#DATE isofull 11/22/12 2:26 pm#)

formats the indicated date and time as 2012-11-22T14:26:00, if your system date format is m-d-y.

##### Output Date Formats via strftime

In addition to the predefined formats above, you can use any format acceptable to the strftime( ) function in AWK.

strftime has several dozen format strings, too many to list here. You can find them in the GAWK manual, which is available in many places on the Web.

Please observe these rules:

• strftime format strings are case sensitive.
• Because the format argument to the DATE macro must be one argument, you must code any spaces in the format string as double underscores.
• The three-letter abbreviations Jun, Jul, and Sep are changed to June, July, and Sept. See the DATE_MONTHS4 macro if you want to stick with three-letter months.

For example, the format string for the abbreviated weekday and month/day/year is %a %D, so you would enter it like this:

(#DATE %a__%D#)

Since no date was given in the macro, GENER8 will format the system date. When this document was last updated, the result was Sat 12/24/16.

#### #define DATE_MONTHS4 0_or_1

By default, if you select an output date format with a three-letter month, GENER8 will change Jun, Jul, and Sep to the four-letter abbreviations June, July, and Sept.

If you want to stick with three-letter months, include this line in your input file:

#define DATE_MONTHS4 0

If you want this to apply to every file, you can easily edit the gener8.awk file. Simply change 1 to 0 in the program line

macroStore("DATE_MONTHS4", "1")

You can change back and forth within a single file, simply by redefining DATE_MONTHS4.
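The effect of the switch can be pictured with this small Python model. It is a sketch, not GENER8's code: the names month_abbrev and mil_format are hypothetical, the months4 parameter stands in for the DATE_MONTHS4 macro, and the example uses a two-digit day because the manual doesn't say how single-digit days are displayed.

```python
from datetime import date

MONTHS3 = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
LONGER = {"Jun": "June", "Jul": "July", "Sep": "Sept"}

def month_abbrev(month, months4=True):
    """Month abbreviation, lengthened for Jun, Jul, Sep unless months4 is off."""
    name = MONTHS3[month - 1]
    return LONGER.get(name, name) if months4 else name

def mil_format(d, months4=True):
    """Day, month abbreviation, and year, in the style of the mil keyword."""
    return "%d %s %d" % (d.day, month_abbrev(d.month, months4), d.year)

print(mil_format(date(2016, 9, 25)))                  # DATE_MONTHS4 left at 1
print(mil_format(date(2016, 9, 25), months4=False))   # DATE_MONTHS4 set to 0
print(mil_format(date(2016, 4, 25)))                  # Apr is never changed
```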

#### #define DATE_SYSFORMAT format

If you use the DATE macro or the FILEDATE macro, GENER8 has to know what date format you are using, or what date format your system is using. By default, GENER8 assumes that the system date format is y-m-d. You can change that by defining the DATE_SYSFORMAT macro with one of these values:

• m-d-y: The system date, and your dates in the DATE macro, are in the form month-day-year, month/day/year, or month.day.year.
• d-m-y: The system date, and your dates in the DATE macro, are in the form day-month-year, day/month/year, or day.month.year.
• y-m-d: The system date, and your dates in the DATE macro, are in the form year-month-day, year/month/day, or year.month.day.

You can use . or / in the format string in place of -, if you wish.

Caution! You’re storing text into a macro. Like any other macro, the text doesn’t actually get examined until it’s used. An invalid system format will be caught when you call a DATE macro or FILEDATE macro. A valid format that doesn’t match the actual date format in your system may or may not get caught; you might just get bogus dates.

These rules apply to all input dates and times, regardless of your system date format:

• Years can be two or four digits.
• Day and month may be one or two digits: leading zeroes are allowed but not required.
• Dots, dashes, and slashes are all acceptable as separators for year, month, and day, no matter which separator you use in the format string.
• The format specifies date format only. All time formats are accepted in any case. For the rules for input times, see Input Date Formats.

If your system uses a different date format, or you want to use a different format for entering dates in the DATE macro, redefine the system date format in your input file. For example:

#define DATE_SYSFORMAT m-d-y

says that input dates will be in the form month-day-year, month/day/year, or month.day.year. You can change the format multiple times within any file.

If you want to change the default for every file, you can easily edit the gener8.awk file. Simply change the date format in the

macroStore("DATE_SYSFORMAT", "y-m-d")

line to one of the other supported formats.

The author will gladly add other input date formats for registered users. Or you could do it yourself: just edit the isodate( ) function in the gener8.awk file.

#### DEFINED macroname

If the named macro is defined (including an empty definition), DEFINED is replaced with 1; otherwise it is replaced with 0.

This is handy for #if directives where you want to check whether something is defined or some other condition is true. While that can be done with nested #ifdef directives and #if directives, it’s easier to do it on one line using a call to the DEFINED macro.

#### EMPTY

This macro (which may be called with or without arguments) emits nothing. It is useful when you need an empty macro.

Unlike the other predefined macros, EMPTY cannot be redefined.

#### ENV variable

This macro is replaced with the value of the named environment variable. If the variable is not defined in the environment, the macro is replaced with nothing (and no message is displayed).

Caution! Not all environment variables in Windows are upper case. For instance, in Windows 7 I have ComSpec, not COMSPEC. Although the Windows command line doesn’t care about the case of environment-variable names, AWK does.

#### EXISTS file

This macro is replaced with 1 if the named file exists, and 0 if it does not. The file path can use forward or backward slashes.

In Windows operating systems, the macro finds only real files, not folders (directories). I don’t know what happens in other operating systems, but I’d be grateful for information.

#### FILEDATE file format

GENER8 queries the system for the last-modified date of the file, reformats the file date according to the format you specify, and pastes the result into the output.

The possible output formats are the same as for the DATE macro, including strftime format strings. If you don’t specify a format, trad will be used.

Example:

(#FILEDATE (#FILENAME#) iso#)

will find the date when the current input file was last modified and format it in ISO format, such as 2016-04-25.

System dependencies:

• This function should work on all Windows XP and later Windows systems, as well as UNIX. If there’s a problem in UNIX, edit the UNIX command line in the filedate( ) function in the gener8.awk file. If you have a confirmed working edit, please send it to me for inclusion in GENER8.
• GENER8 uses the DATE_SYSFORMAT macro to interpret the file date that it gets from the system. If this doesn’t match the actual system date format, FILEDATE will return an error (if you’re lucky), or garbage.

#### FILESIZE format filespec

GENER8 queries the system for the size of the file in bytes, then writes that size to your output file in a format that you specify.

There is no need for quotes around a filespec that contains spaces, though they seem to do no harm. If the system can’t find the file, or if it’s a directory, the size will be zero.

The format is one or two characters:

1. B, K, M, or G for file size in bytes, kilobytes (1024 bytes), megabytes (1024² bytes), or gigabytes (1024³ bytes). For all except bytes, the suffix KB, MB, or GB will be added. (Some people like B for bytes, others want to spell out the word.)
2. If the second character is a digit, the size will be displayed to that number of decimal places. If it is a comma, the number is displayed with a comma every three digits. You can omit this second character, in which case the display is a plain whole number.

If the second character is a comma or is omitted, and the first character was K, M, or G, then the number displayed will be the nearest whole number of that unit. For example, 512–1535 bytes with format K would display as 1 KB.

Examples, using the size of another file:

(#FILESIZE B grepman.htm#) bytes = 299240 bytes

(#FILESIZE B, grepman.htm#) bytes = 299,240 bytes

(#FILESIZE K grepman.htm#) = 292 KB

(#FILESIZE K, grepman.htm#) = 292 KB

(#FILESIZE M grepman.htm#) = 0 MB

(#FILESIZE M3 grepman.htm#) = 0.285 MB
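The arithmetic behind those examples can be reproduced with a short Python function. This is a sketch of the formatting rules described above, not the filesize( ) function in gener8.awk. The name format_filesize is hypothetical, and rounding halves upward is my reading of the 512–1535 example.

```python
UNITS = {
    "B": (1, ""),
    "K": (1024, " KB"),
    "M": (1024 ** 2, " MB"),
    "G": (1024 ** 3, " GB"),
}

def format_filesize(size_bytes, fmt):
    """Format a byte count using a one- or two-character FILESIZE format."""
    divisor, suffix = UNITS[fmt[0]]
    value = size_bytes / divisor
    modifier = fmt[1:2]                       # second character, or "" if omitted
    if modifier.isdigit():
        text = "%.*f" % (int(modifier), value)       # fixed number of decimals
    elif modifier == ",":
        text = "{:,}".format(int(value + 0.5))       # nearest whole number, with commas
    else:
        text = "%d" % int(value + 0.5)               # nearest whole number
    return text + suffix

for fmt in ("B", "B,", "K", "K,", "M", "M3"):
    print(fmt, "->", format_filesize(299240, fmt))
```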

System dependency: I believe I have a correct UNIX command for getting file size in bytes. If you’re not getting correct answers, edit the UNIX line in the filesize( ) function in the gener8.awk file. If you have a confirmed correction, please send it to me for inclusion in GENER8 in the future.

#### GDEL old how text

This macro deletes the part of text that matches the regular expression old. If there is no match, text will be placed in the output unchanged.

how determines which occurrence(s) of old get deleted. If how is G or g, every occurrence will be deleted. If how is an unsigned number, only that occurrence will be deleted. If how is anything else, the first occurrence will be deleted.

old must not contain any spaces. As usual, you can use a double underscore (__) as a stand-in for a space. text may contain spaces.

GDEL is useful for manipulating file names. For instance, this gives the base name part of the current file name, without extension:

(#GDEL \.[^.]*$ 1 (#FILENAME#)#)

#### GSUB old new how text

This macro performs text substitution. If the regular expression old matches the text or part of it, it will be replaced with new. If there is no match, text will be placed in the output unchanged.

old is a regular expression. It may not contain spaces, but you can use a double underscore (__) as a stand-in for a space. old may contain parentheses to delimit subexpressions, and you can then address these in new by \1, \2, and so forth.

new may not contain spaces, but you can use a double underscore (__) as a stand-in for a space. new may contain special characters: \& to indicate the entire substring of text that was matched by old, and \1, \2, etc. to indicate the part of text matched by the first, second, etc. parenthesized subexpression in old.

new is required. There is no way to delete text using GSUB; use the GDEL macro instead.

how determines which occurrence(s) of old get replaced. If how is G or g, every occurrence will be replaced. If how is an unsigned number, only that occurrence will be replaced. If how is anything else, the first occurrence will be replaced.

GSUB is useful for all sorts of manipulations. For instance, this replaces all occurrences of two hyphens in the text from the macro with the em dash character:

(#GSUB -- &#8212; g (#somemacro#)#)

This puts a comma after the thousands part of a number, but doesn’t put a comma if the number has no thousands part:

(#GSUB ([1-9][0-9]+)([0-9][0-9][0-9])$ \1,\2 1 (#somemacro#)#)

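It would be cleaner with ([1-9][0-9]+)([0-9]{3}) for old, but for some reason that doesn’t work, even with backslashes to escape the brace characters. (One possible cause: GAWK 3.x recognizes interval expressions such as {3} only when it is run with the --re-interval or --posix option.)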
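The way GSUB's how argument selects occurrences can be modeled in Python. This is a sketch under stated assumptions, not GENER8's implementation: the function name gsub is hypothetical, and Python's re module is not identical to GAWK's regular expressions, although the patterns used here behave the same in both.

```python
import re

def gsub(old, new, how, text):
    """Replace all, the nth, or the first match of `old` in `text`."""
    pattern = re.compile(old.replace("__", " "))      # __ stands in for a space
    # Python writes "the whole match" as \g<0> where GSUB uses \&.
    template = new.replace("__", " ").replace(r"\&", r"\g<0>")
    if how in ("g", "G"):
        return pattern.sub(template, text)            # every occurrence
    # An unsigned number picks that occurrence; anything else means the first.
    target = int(how) if how.isdecimal() and int(how) > 0 else 1
    seen = 0

    def replace_nth(match):
        nonlocal seen
        seen += 1
        return match.expand(template) if seen == target else match.group(0)

    return pattern.sub(replace_nth, text)

print(gsub("--", "&#8212;", "g", "a--b--c"))
print(gsub(r"([1-9][0-9]+)([0-9][0-9][0-9])$", r"\1,\2", "1", "1234567"))
print(gsub("a", "X", "2", "banana"))
```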

#### IIF condition iftrue iffalse

This macro is an inline version of the #if-#else-#endif sequence. condition is evaluated, using the logic of (#ARITH condition#), to determine whether it’s true (nonzero) or false (zero). Non-numeric text is considered true, unless it matches a macro name. Macros in condition are expanded just as they are within ARITH.

If condition is true (nonzero or text), IIF returns iftrue; if condition is false (numeric zero), IIF returns iffalse. iftrue and iffalse may themselves be macro calls, as long as they return values that don’t contain macro-argument separator characters. iftrue and iffalse are not evaluated unless they are actually macro calls, so if either of them happens to match a macro name without (#...#) it will still be treated as ordinary text.

Sometimes you want to produce text if a certain condition is true and nothing if the condition is false. In that case, simply omit iffalse.

Examples:

#define zonk 45
(#IIF zonk yes no#) (#IIF zonk==45 yes no#) (#IIF zonk==1 yes zonk#)

will place yes yes zonk in the output. Notice that the third one is zonk, not 45, because iftrue and iffalse aren’t evaluated.

#define n 15
My sample size was (#n#) individual(#IIF n==1 . s.#)

will place this text in the output:

My sample size was 15 individuals.

You could get the same effect, slightly shorter, like this:

My sample size was (#n#) individual(#IIF n!=1 s#).
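The true/false rule for IIF can be sketched in Python like this. The sketch assumes the condition has already been evaluated to a text result, as ARITH would do; the function name iif and that division of labor are my assumptions, not a description of GENER8's internals.

```python
def iif(condition_value, iftrue, iffalse=""):
    """Return iftrue unless the evaluated condition is a numeric zero."""
    try:
        is_true = float(condition_value) != 0
    except ValueError:
        is_true = True            # non-numeric text counts as true
    return iftrue if is_true else iffalse

def sample_sentence(n):
    # int(n != 1) plays the role of the evaluated condition n!=1.
    plural = iif(str(int(n != 1)), "s")
    return "My sample size was %d individual%s." % (n, plural)

print(sample_sentence(15))
print(sample_sentence(1))
```

Omitting the third argument gives the "text or nothing" behavior: the second call prints the singular form with no suffix.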

#### IIFDEF name ifdefined ifundefined

This is a shortcut version of (#IIF (#DEFINED name#) ifdefined ifundefined#).

If name is a user-defined or predefined macro, IIFDEF returns ifdefined; otherwise, IIFDEF returns ifundefined (or empty text if ifundefined is omitted).

#### LOWER text

converts text to lower case.

#### REGINC macroname and REGPRE macroname

Fairly often you need to number or letter things in a document.  REGINC (“register increment”) and REGPRE (“register preincrement”) let you maintain any number of separate counters and update them automatically. (Counters are simply specialized macros.)

A counter can be a regular number, an upper-case letter, or a lower-case letter. Letter series end at Z or z; number series have no practical upper limit. You can have multiple counters going at once: counters are completely independent of each other.

Before using a counter, first set it to its initial value with #define or #freeze, like any other macro.

If you use a counter without giving it an initial value, what happens depends on the latest #picky directive. If picky=2, GENER8 displays an error message and pastes MACRO ERROR in the output. Otherwise, GENER8 supplies an initial value of 0.

To use a counter, pass its name as argument to a REGINC or REGPRE macro. The two macros work identically, with one exception:

• REGINC pastes the current value of the counter and then immediately increments it for future use. If you use the counter again without incrementing it, it will be one unit past the previously displayed value.
• REGPRE increments the counter and then pastes the new value. If you use the counter again without incrementing it, it will be the same as the previously displayed value.

For example, if you want to identify sections of a document as A, B, C, and so on, assign the value A to a macro, using #define. You might use the name secnum, like this:

#define secnum A

Now you use REGINC in your section heads, like this:

(#REGINC secnum#). Fruits
(#REGINC secnum#). Vegetables
(#REGINC secnum#). Grains and Cereals
(#REGINC secnum#). Dairy

The sections will be lettered A, B, C, and so on.

REGINC increments the counter immediately after using it. This lets you start the counter off at its intended initial value, but it prevents you from reusing the value of the counter. For example, if you had subsections A1, A2, A3 under section A, you would be unable to use (#secnum#) to re-display the A part of that, because secnum has already been updated to B.

For these situations, there is the REGPRE macro, which increments first and then displays. The quirk here is that the initial value must be one before the actual initial value you want to see. (@ comes before A, and ` comes before a. 0, of course, comes before 1.)

Here’s how that would play out, using REGPRE. I’ve also defined a macro for the whole subsection number, A1, A2, and so on:

#define secnum @
#define subsec (#secnum#)(#REGPRE subsecnum#)
(#REGPRE secnum#). Fruits
    #define subsecnum 0
    (#subsec#). Berries
    (#subsec#). Melons
    (#subsec#). Stone Fruits
(#REGPRE secnum#). Vegetables
    #define subsecnum 0
(#REGPRE secnum#). Grains and Cereals
    #define subsecnum 0
(#REGPRE secnum#). Dairy
    #define subsecnum 0

There’s a way around the repeated #define subsecnum 0 lines; see the REGSET macro, below.

Occasionally you may want to increment a counter without displaying it. To do this, wrap the REGINC call inside a call of the EMPTY macro, like this:

(#EMPTY (#REGINC secnum#)#)
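The difference between REGINC and REGPRE described in this section can be modeled in Python. This is a sketch, not GENER8's code: the function names and the counters dictionary are hypothetical, and raising an error past Z or z is my assumption, since the manual says only that letter series end there.

```python
def next_value(value):
    """Successor of a counter value: next number, or next character code."""
    if value.isdigit():
        return str(int(value) + 1)
    if value in ("Z", "z"):
        raise ValueError("letter series end at Z or z")   # assumed behavior
    return chr(ord(value) + 1)        # '@' gives 'A', '`' gives 'a'

def reginc(counters, name):
    """Return the current value, then increment the counter."""
    current = counters[name]
    counters[name] = next_value(current)
    return current

def regpre(counters, name):
    """Increment the counter, then return the new value."""
    counters[name] = next_value(counters[name])
    return counters[name]

counters = {"post": "A", "pre": "@"}
print(reginc(counters, "post"), counters["post"])   # displayed value, stored value
print(regpre(counters, "pre"), counters["pre"])     # displayed value, stored value
```

With REGINC the stored value is already one step ahead of what was displayed. With REGPRE the two agree, which is why the counter can be re-displayed afterward.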

#### REGSET macroname value

In the example for REGPRE, you had to reset the counter for the subsection number every time you started a new section. That’s kind of tedious. Wouldn’t it be better to have a macro that shows the section number and also resets the subsection number?

REGSET to the rescue! REGSET is similar to #define or #freeze, but it doesn’t have to be on a line by itself. Here’s that example rewritten in shorter form by using REGSET:

#define secnum @
#define subsec (#secnum#)(#REGPRE subsecnum#)
(#REGPRE secnum#)(#REGSET subsecnum 0#). Fruits
    (#subsec#). Berries
    (#subsec#). Melons
    (#subsec#). Stone Fruits
(#REGPRE secnum#)(#REGSET subsecnum 0#). Vegetables
(#REGPRE secnum#)(#REGSET subsecnum 0#). Grains and Cereals
(#REGPRE secnum#)(#REGSET subsecnum 0#). Dairy

But there’s still some repetition that can be squeezed out. Here sec is the section number as it appears on the header line, and secnum is the underlying counter. Similarly, subsec is the section+subsection number that appears on the header line, and subsecnum is the underlying counter for the subsection:

#define secnum @
#define sec    (#REGPRE secnum#)(#REGSET subsecnum 0#)
#define subsec (#secnum#)(#REGPRE subsecnum#)
(#sec#). Fruits
    (#subsec#). Berries
    (#subsec#). Melons
    (#subsec#). Stone Fruits
(#sec#). Vegetables
(#sec#). Grains and Cereals
(#sec#). Dairy

The sec macro increments the secnum counter and pastes the new value, then sets the subsecnum counter to 0. The subsec macro pastes the current value of secnum, then increments the subsecnum counter and pastes its new value. The result looks like this:

A. Fruits
    A1. Berries
    A2. Melons
    A3. Stone Fruits
B. Vegetables
C. Grains and Cereals
D. Dairy
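To see why the sec and subsec macros produce that outline, here is the same bookkeeping written as a Python model. It is an illustration only: the function names mirror the macro names, bump is a hypothetical stand-in for REGPRE, and the assignment inside sec stands in for REGSET.

```python
counters = {"secnum": "@", "subsecnum": "0"}

def bump(name):
    """REGPRE-style step: increment the counter, then return the new value."""
    value = counters[name]
    counters[name] = str(int(value) + 1) if value.isdigit() else chr(ord(value) + 1)
    return counters[name]

def sec():
    number = bump("secnum")
    counters["subsecnum"] = "0"       # the REGSET step: restart subsections
    return number

def subsec():
    return counters["secnum"] + bump("subsecnum")

outline = [
    sec() + ". Fruits",
    "    " + subsec() + ". Berries",
    "    " + subsec() + ". Melons",
    "    " + subsec() + ". Stone Fruits",
    sec() + ". Vegetables",
    sec() + ". Grains and Cereals",
    sec() + ". Dairy",
]
print("\n".join(outline))
```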

#### SYSTEM command

GENER8 expands any inner macro calls and passes the command to your operating system. Anything the command writes to the standard output stream goes directly into the output file, with no intervention by GENER8. (Compare to the #include !command directive, where GENER8 reads and processes the output of the command, expanding macro calls and so forth.)

Example:

(#SYSTEM gawk -f maketoc.awk -v indent=-1 (#FILENAME#)#)

will use the maketoc.awk program to create a table of contents for the current input file. (maketoc was a bonus with older registered versions of GENER8, replaced in GENER8 7.0 by the built-in TOCB macro.)

While it’s legal to call the SYSTEM macro on the same line as other text, the results can be confusing. If the SYSTEM macro occurs on a line with any other text, the command runs before the rest of the text is processed. After the command has run, GENER8 processes the rest of the line, less the SYSTEM macro call and its arguments. This hardly ever matters, but if it does you may want to use the SYSTEMINLINE macro instead, or make sure that the call of the SYSTEM macro is on a line by itself.

#### SYSTEMINLINE command

GENER8 also expands any inner macros and passes the command to your operating system. However, unlike the SYSTEM macro, with the SYSTEMINLINE macro GENER8 intercepts the output of the command. GENER8 replaces the macro call with the first (or only) line of output from the command; any further output from the command is discarded.

Example:

This file is \
(#SYSTEMINLINE echo %@filesize[(#FILENAME#)]#) bytes long.

In 4DOS, %@filesize[...] returns the file size in bytes; naturally the command would be different on other operating systems. Suppose the current file is 32487 bytes long. Then GENER8 will read the output 32487 from the command and substitute it for the macro call:

This file is 32487 bytes long.

(That was just a historical example. Beginning in GENER8 8.0, the FILESIZE macro gets file sizes for you.)
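The "first line only" rule can be pictured with a short Python sketch. This is an illustration under stated assumptions, not GENER8's own code: the function name `first_line_of_output` is hypothetical, the command is a stand-in that simply prints two lines, and returning an empty string when the command prints nothing is my assumption rather than documented behavior.

```python
import subprocess
import sys


def first_line_of_output(command):
    """Run a command and return only the first line it writes to stdout."""
    result = subprocess.run(command, capture_output=True, text=True)
    lines = result.stdout.splitlines()
    # Keep the first line (if any); everything after it is discarded.
    return lines[0] if lines else ""


# A stand-in command that prints two lines; only the first one is kept.
command = [sys.executable, "-c", "print(32487); print('ignored')"]
size = first_line_of_output(command)
print(f"This file is {size} bytes long.")
```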

#### TOCB minlevel maxlevel awkexe awkprog ul_identifier

This macro implements tables of contents in HTML documents.

All the <hminlevel> through <hmaxlevel> tags from files listed on the gawk command line are gathered for the table of contents. Although the start and end tags need not be on the same line, the attribute id= must be on the same line as the start tag. (name= won't work.)

The TOCB macro is replaced with <ul> lists, nested as many levels deep as the headers require. Each header tag becomes an <li>, and the text becomes a link to the actual header within the document. This document itself provides a sample of a generated table of contents.

GENER8 builds the table of contents in two passes:

1. GENER8 reprocesses the input file(s) into a temporary file, %TEMP%\gener8.toc. All directives are honored, and all macros are expanded. (TOCB itself is ignored, of course. The predefined macros TOCMIN and TOCMAX return nonzero values in this pass.) The temporary file will contain only <hn> or <Hn> tags.

The #tocif directive and #tocinsertli directive are effective only during this pass.

2. GENER8 reads the temporary file and converts each header to a line of the form
<li><a href="#id">header text</a></li>

This pass also inserts <ul> and </ul> as appropriate for the hierarchy of the headers.
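The second pass can be sketched in Python as follows. Treat it as a rough model only: the function `build_toc`, the sample headers, and their id values are all hypothetical, the sketch assumes each header sits entirely on one line, and the exact placement of the <ul> and </ul> tags in real GENER8 output may differ.

```python
import re

# Matches a header start tag with an id attribute, plus its text,
# when the whole header sits on one line (a simplifying assumption).
HEADER = re.compile(
    r'<h([1-6])[^>]*\bid="([^"]+)"[^>]*>(.*?)</h\1>', re.IGNORECASE
)


def build_toc(lines, minlevel, maxlevel):
    """Turn header lines into <li> entries wrapped in nested <ul> lists."""
    out = []
    depth = minlevel - 1          # no list is open yet
    for line in lines:
        match = HEADER.search(line)
        if not match:
            continue
        level = int(match.group(1))
        if not minlevel <= level <= maxlevel:
            continue
        while depth < level:      # going deeper: open lists
            out.append("<ul>")
            depth += 1
        while depth > level:      # coming back up: close lists
            out.append("</ul>")
            depth -= 1
        out.append(f'<li><a href="#{match.group(2)}">{match.group(3)}</a></li>')
    while depth >= minlevel:      # close whatever is still open
        out.append("</ul>")
        depth -= 1
    return out


sample = [
    '<h2 id="overview">Overview</h2>',
    '<h3 id="install">Installation</h3>',
    '<h5 id="deep">Too deep</h5>',
    '<h2 id="usage">Usage</h2>',
]
toc = build_toc(sample, 2, 3)
print("\n".join(toc))
```

With minlevel 2 and maxlevel 3, the <h5> header is skipped, and the <h3> entry lands in a second-level list between the two <h2> entries.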

You can style the table of contents as you wish through CSS. A common technique is to place <div id="TOC"> before TOCB and </div> after TOCB; you then style #TOC ul, #TOC li ul, and so on in your CSS. Or instead, you can specify an id or a class attribute for the main <ul> through the optional fifth argument to TOCB; see below.

Caution: If you specify an id= or class= on the header tag preceding the table, you can’t use that in CSS to style the table of contents. The <ul> tag is not a child of the <hn> tag, as far as HTML and CSS are concerned. To style the entries in the table of contents, either use the optional fifth argument to TOCB below, or enclose the whole business in a <div> with a unique identifier, as suggested above.

TOCB takes four or five arguments. The first two are the minimum and maximum <hn> levels to generate entries in the table. Usually you want minlevel to be 2 and maxlevel to be 3 or 4, but all values 1 to 6 are accepted.

The third argument is the path and name of the AWK or GAWK executable, the same as on the command line. (GENER8 has no way to determine this.) If your PATH environment variable is set properly, you need not specify the whole path, and awk or gawk is sufficient.

The fourth argument is the path and name of gener8.awk itself, just as it would appear after -f on the command line, but without the -f.

The optional fifth argument will be added to the initial <ul> in the generated table of contents. This lets you give the main list an id or class attribute that will then become part of your CSS selectors, thus avoiding the <div> wrapper mentioned above.

Recommendation: Put the TOCB macro on a line by itself. If you have it on a line with other text, that text will be written to your final document after the generated table of contents.

Example:

(#TOCB 2 4 gawk gener8.awk#)

will generate a table of contents from all the <h2>, <h3>, and <h4> tags. It’s assumed that the PATH variable includes the directory where GAWK.EXE is located, and the AWKPATH environment variable includes the directory where gener8.awk is located.

Example:

(#TOCB 2 3 c:\utils\text\gawk c:/utils/gener8/gener8.awk id="TOC"#)

will generate a table of contents from all the <h2> and <h3> tags. The paths of the GAWK program and of gener8.awk are given explicitly — notice that the latter has forward slashes. The first tag in the generated table of contents will be <ul id="TOC">.

#### TOCF minlevel maxlevel awkexe awkprog sourcefile ul_identifier

This macro is exactly the same as the TOCB macro, except that TOCF reads header tags from sourcefile, not from the current file set as TOCB does. This can be useful when each chapter of a book has its own table of contents and you want to pull together an overall table of contents in a separate file.

#### TOCMAX, TOCMIN

The TOCMIN and TOCMAX macros let you determine whether GENER8 is in normal processing, or scanning for headers while expanding a TOCB macro or TOCF macro. This can be helpful for debugging complex logic.

During normal processing, both macros return 0. During the scan for headers, TOCMIN and TOCMAX return the minlevel and maxlevel arguments from the TOCB or TOCF that triggered the scan.

#### UPPER text

converts text to upper case.

### Predefined Filename Macros

#### FILENAME

translates to the name of the file currently being read from the command line, converted to lower case. If the file on the command line contains a path, it will be part of FILENAME.

The value of (#FILENAME#) does not change when GENER8 finds an #include directive and reads the included file. Even within an include file, (#FILENAME#) still expands to the name of the input file named on the command line. If you name multiple input files on the command line, (#FILENAME#) will change as GENER8 finishes processing one input file and starts on the next.

Compare to the INCLUDEFILE macro.

#### HOME

If you specify the -v home option on the command line, macro HOME will be set to the given path and file, translated to lower case. If you don’t specify the option, HOME is undefined.

#### INCLUDEFILE

translates to the name of the file currently being read, whether from the command line or because of an #include directive, converted to lower case. If a path was specified, it will be part of INCLUDEFILE.

There’s only one difference between INCLUDEFILE and the FILENAME macro: FILENAME changes only when GENER8 begins processing the next file named on the command line, and INCLUDEFILE changes when a file named on the command line or in an #include directive is opened or closed.
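One way to picture the difference is as a stack of open files. The Python sketch below is a toy model, not GENER8's actual bookkeeping: the class `FileTracker`, its method names, and the file names are all hypothetical.

```python
class FileTracker:
    """Toy model of how FILENAME and INCLUDEFILE change."""

    def __init__(self):
        self.filename = None      # file named on the command line
        self.stack = []           # files currently open, innermost last

    def open_input(self, name):
        # A new command-line file: both macros change.
        self.filename = name.lower()
        self.stack = [self.filename]

    def open_include(self, name):
        # An #include: only INCLUDEFILE changes.
        self.stack.append(name.lower())

    def close_include(self):
        # End of an included file: INCLUDEFILE reverts to the includer.
        self.stack.pop()

    @property
    def includefile(self):
        return self.stack[-1]


tracker = FileTracker()
tracker.open_input("Page1.htm")
tracker.open_include("header.inc")
inside = (tracker.filename, tracker.includefile)
tracker.close_include()
after = (tracker.filename, tracker.includefile)
print(inside, after)
```

While header.inc is open, the two values differ; once it closes, both name the command-line file again.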

#### RELHOME

If you specify the -v home option and the -v target option on the command line, macro RELHOME will be set to the relative URL from the target to the home page, translated to lower case. If you specify only one of those options or neither, RELHOME is undefined.

Most Web pages contain a link to the site’s home page. It’s best to write those links as relative URLs, so that you can test all the links on your own computer before you upload the site to your Web server. RELHOME makes that easy. For example:

<a href="(#RELHOME#)">Home</a>
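To show what "the relative URL from the target to the home page" means, here is a small Python sketch. The manual does not say how GENER8 computes RELHOME, so this is only one plausible way to get such a value: the function `rel_home` and the sample paths are hypothetical, and the sketch assumes both paths are given in the same form (both relative, or both from the same root).

```python
import posixpath


def rel_home(home, target):
    """Relative URL from the page `target` to the page `home`."""
    home = home.replace("\\", "/").lower()
    target = target.replace("\\", "/").lower()
    # A link is resolved relative to the directory holding the target.
    return posixpath.relpath(home, start=posixpath.dirname(target) or ".")


deep = rel_home("site/index.htm", "site/docs/manual/page.htm")
same = rel_home("INDEX.HTM", "about.htm")
print(deep, same)
```

A page two directories below the home page gets a link that climbs two levels, while a page in the same directory gets just the file name.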

#### TARGET, TARGETDIR, TARGETNAME

The three macros TARGET, TARGETDIR, and TARGETNAME are defined only if you specify the -v target option on the command line. If you do, TARGET is the full path and filename as defined in that option, translated to lower case; TARGETDIR is the path part of TARGET, including a trailing /; and TARGETNAME is the name (and extension) part of TARGET.
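The split can be sketched in a few lines of Python. This is an illustration, not GENER8's code: the function `split_target` and the sample path are hypothetical, the sketch assumes the path uses forward slashes, and giving an empty directory part when the target has no path is my assumption.

```python
def split_target(target):
    """Return (TARGET, TARGETDIR, TARGETNAME) for a -v target value."""
    full = target.lower()
    # Split just after the last forward slash, so the directory
    # part keeps its trailing "/".
    cut = full.rfind("/") + 1
    return full, full[:cut], full[cut:]


parts = split_target("C:/Web/Docs/Manual.htm")
print(parts)
```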

## Special Processing of Input Lines

### Comment Lines

There are two ways to tell GENER8 to ignore lines and not write them to the output file (or not process them, if they’re directives):

• For single-line comments, put the text

<!-- ignore -->

anywhere on the line. GENER8 checks for this marker after pasting any continued lines together, so the entire logical line is ignored.

• For a block of comments, use an #if directive:

#if 0
All these lines will be
ignored, not written to the
output file.
#endif

There is no way to tell GENER8 to write only part of a line to the output file.

### Continuation Lines

There are two situations where you may need continuation lines:

If you put a \ character at the very end of a line, GENER8 will remove the \ and paste the following line to the end before doing any other processing.

Example:

#define row <tr><td align=center>%1</td>\
<td align=center>%2</td><td>%*</td></tr>

Here, a long macro definition is split into two input lines for convenience in editing.

Be careful with spaces in continued lines! If you want a space where the lines were joined, you must provide one either before the \, as in the example below, or at the start of the next input line. This example is properly coded to get spaces between the words:

You want all \
of these lines \
to be joined in one \
line in the output file.

If you actually want a backslash at the end of a line, code it as \\. GENER8 will translate it to a single \ but will not append the next line to this one.
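The joining rule, including the doubled backslash, can be modeled in Python like this. It is a minimal sketch under stated assumptions, not GENER8's implementation: the function `join_continued` is hypothetical, and the sketch does not try to handle three or more backslashes at the end of a line.

```python
def join_continued(lines):
    """Paste together physical lines that end with a single backslash."""
    logical = []
    pending = ""
    for line in lines:
        if line.endswith("\\\\"):
            # Doubled backslash: keep one, and do not continue the line.
            logical.append(pending + line[:-1])
            pending = ""
        elif line.endswith("\\"):
            # Single backslash: drop it and wait for the next line.
            pending += line[:-1]
        else:
            logical.append(pending + line)
            pending = ""
    if pending:
        logical.append(pending)   # input ended on a continued line
    return logical


sample = [
    "You want all \\",
    "of these lines \\",
    "to be joined in one \\",
    "line in the output file.",
    "C:\\TEMP\\\\",
]
joined = join_continued(sample)
print("\n".join(joined))
```

Note that the space before each backslash in the sample survives the join, which is why the words do not run together.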

### Making # Non-Special

The # character is special in two contexts:

• When # is the first non-blank character on a line, it marks the line as a directive.
• The sequences (# and #) mark the start and end of a macro call.

If you need a # in the output file in either of these two contexts, code it as \# to remove the special meaning and have GENER8 output it as plain text. In fact, GENER8 will change every \# sequence to plain #, so if you want a # character output as text it’s always safe to code it as \#.

Example:

\#1. Put out the cat.
\#2. Lock all doors.
\#3. Turn off lights.
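In code terms, the escape does two jobs: it keeps the line from looking like a directive, and it is later reduced to a plain #. The Python sketch below models just those two steps. The function names `is_directive` and `unescape_hash` are hypothetical, and the order in which GENER8 really performs these checks is an assumption.

```python
def is_directive(line):
    """True when the first non-blank character is an unescaped #."""
    return line.lstrip().startswith("#")


def unescape_hash(line):
    """Change every backslash-# sequence to a plain #."""
    return line.replace("\\#", "#")


line = "\\#1. Put out the cat."
treated_as_directive = is_directive(line)
written = unescape_hash(line)
print(treated_as_directive, written)
```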