5.11 The Dangers Of FunnelWeb

Like many tools that are general and flexible, FunnelWeb can be used in a variety of ways, both good and bad. One of the original appeals of the literate approach to programming for Knuth, the inventor of literate programming, was that it allows the programmer to describe the target program bottom up, top down, side to side, or chaotically if desired.

The flexibility of the literate style of programming leaves much room for bad documentation as well as good documentation. Years of experience with FunnelWeb have revealed the following stylistic pitfalls, which the experienced FunnelWeb user should take care to avoid. (Note: The fact that these faults are listed here does not mean that the author has eliminated them in his own work. Rather, it is mainly the author's own mistakes that have resulted in this list being compiled. The author immediately confesses to several of the faults listed here, most notably that of Pavlov documentation.)

Spaghetti organization: By far the worst problem that arises in connection with the literate style occurs where the programmer has used the literate tool to completely scramble the program so that the program is described and laid out in an unordered, undisciplined "stream of consciousness". In such cases the programmer may be using the literate style as a crutch to avoid having to think about structuring the presentation.

Boring organization: At the other extreme, a program may be organized in such a strict way that it is essentially laid out in the order most "desired" by the target programming language. For example, each macro might contain a single procedure, with all the macros being called by a macro connected to a file at the top. In many cases a boring structure may be entirely appropriate, but the programmer should be warned that it is easy to slip into such a normative style, largely forgetting the descriptive structural power that FunnelWeb provides.

Poor random access: Using FunnelWeb, it is quite possible to write programs like novels --- to be read from cover to cover. Sometimes the story is very exciting, with data structures making dashing triumphs and optimized code bringing the story to a satisfying conclusion. These programs can be works of art. Unfortunately, without careful construction, such "novel-programs" can become very hard to access randomly by (say) a maintenance programmer who wishes only to dive in and fix a specific problem. If the entire program is scrambled for sequential exposition, it can be hard to find the parts relating to a single function. Somehow a balance must be struck in the document between the needs of the sequential and of the random-access reader. This balance will depend on the intended use of the program.

Too-interdependent documentation: Sometimes, when editing a program written using FunnelWeb, one knows how to modify the program, but one is unsure of how to update the surrounding documentation! The documentation may be woven into such a network of facts that it seems that changing a small piece of code could invalidate many pieces of documentation scattered throughout the document. The documentation becomes a big tar pit in which movement is impossible. For example, if you have talked about a particular data structure invariant throughout a document, changing that invariant in a small way could mean having to update all the documentation without touching much code. In such cases, the documentation is too interdependent. This could be symptomatic of an excessively interconnected program, or of an excessively verbose or redundant documenting style. In any case, a balance must be struck between the conversational style that encourages redundancy (by mentioning things many times) and the normalized database approach where each fact is given at only one point, and the reader is left to figure out the implications throughout the document.

Pavlov documentation: By placing so much emphasis on the documentation, FunnelWeb naturally provides slots where documentation "should" go. For example, a FunnelWeb user may feel that there is a rather unpleasant gap between a @C marker and the following macro. In many cases no commentary is needed, and the zone is better left blank rather than being filled with the kind of uninformative waffle one often finds filling the slots of structured documentation written according to a military standard (e.g. MIL-STD-2167A). (Note: This is not a criticism of 2167A, only of the way it is sometimes used.) The lesson is to add documentation only when it adds something. The lesson in Strunk and White[Strunk79] (p. 23) holds for program documentation as it does for other writing: "Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts. This requires not that the writer make all his sentences short, or that he avoid all detail and treat his subjects only in outline, but that every word tell."

Duplicate documentation: Where the programmer is generating product files that must exist on their own within the entire programming environment (e.g. the case of a programmer in a team who is using FunnelWeb for his own benefit but must generate (say) commented Ada specification package files) there is a tendency for the comments in the target code to duplicate the commentary in the FunnelWeb text. This may or may not be a problem, depending on the situation. However, if this is happening, it is certainly worth the programmer spending some time deciding if one or other of the FunnelWeb or inline-comment documentation should be discarded. In many cases, a mixture can be used, with the FunnelWeb documentation referring the reader to the inline comments where they are present. For example:

@A Here is the header comment for the list package
specification. The reader should read these comments
carefully as they define a list. There is no need to
duplicate the comments in this text.

@$@<Specification package comments@>==@{@-
-- LIST PACKAGE
-- ============
-- * A LIST consists of zero or more ITEMS.
-- * The items are numbered 1 to N where N
--   is the number of items in the list.
-- * If the list is non-empty, item 1
--   is called the HEAD of the list.
-- * If the list is non-empty, item N
--   is called the TAIL of the list.
-- ...
@}

Overdocumenting: Another evil that can arise when using FunnelWeb is to over-document the target program. In some of Knuth's earlier (e.g. 1984) examples of literate programming, each variable is given its own description and each piece of code has a detailed explanation. This level of analysis, while justified for tricky tracts of code, is probably not warranted for most of the code that constitutes most programs. Such over-commenting can even have the detrimental effect of obscuring the code, making it hard to understand because it is so scattered (see "spaghetti organization" earlier). It is up to the user to decide when a stretch of just a few lines of code should be pulled to bits and analysed and when it is clearer to leave it alone.

In the case where there are a few rather tricky lines of code, a detailed explanation may be appropriate. The following example contains a solution to a problem outlined in section 16.3 of the book "The Science of Programming" by David Gries[Gries81].

@C@<Calculation of the longest plateau in b@>

This section contains a solution to a problem
outlined in section 16.3 of the book @/The
Science of Programming@/ by David Gries[Gries81].

@D Given a sorted array @{b[1..N]@} of
integers, we wish to determine the @/length@/ of
the longest run of identically valued elements in
the array. This problem is defined by the
following precondition and postcondition.

@$@<Precondition@>==@{/* Pre: sorted(b). */@}
@$@<Postcondition@>==@{@-
/* Post: sorted(b) and p is the length */
/* of the longest run in b[1..N].      */@}

@D We approach the problem by scanning through
the array one element at a time, maintaining a
useful invariant through each iteration. A loop
variable array index @{i@} is created for this
purpose. The bound function is @{N-i@}. Here is
the invariant.

@$@<Invariant@>==@{@-
/* Invariant: sorted(b) and 1<=i<=N and           */
/*            p is len of longest run in b[1..i]. */@}

@D Establishing the invariant above in the
initial, degenerate case is easy.

@$@<Establish plateau loop invariant@>@{i=1; p=1;@}

@D At this stage, we have the following loop
structure. Note that when both the invariant and
@{i == N@} are true, the postcondition holds and
the loop can terminate.

@$@<p=len(longest run in sorted b[1..N])@>@{@-
@<Precondition@>
@<Establish plateau loop invariant@>
while (i != N)
  {
   @<Invariant@>
   @<Loop body@>
  }
@<Postcondition@>
@}

@D Now there remains only the loop body whose
sole task is to increase @{i@} (and so decrease
the value of the bound function) while maintaining
the invariant. If @{p@} is the length of the
longest run seen so far (i.e. in @{b[1..i]@}), then,
because the array is sorted, the extension of our
array range to @{b[1..i+1]@} can only result in
an increase in @{p@} if the new element
terminates a run of length @{p+1@}. The increase
can be at most 1. Because the array is sorted, we
need only compare the endpoints of this possible
run to see if it exists. This is performed as
shown below.

@$@<Loop body@>==@{i++; if (b[i] == b[i-p]) p++;@}
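
For readers who want to try the algorithm outside FunnelWeb, the following is a minimal sketch of roughly what the macros above expand to once assembled into a single C function. The function name longest_plateau, the parameter n (standing in for N), and the convention of leaving b[0] unused so that the indices match the 1-based array b[1..N] are all assumptions made for this illustration; they do not come from the example itself. The loop body increments p only when the two endpoints of the candidate run of length p+1 are equal, which is the comparison the explanation above describes.

```c
#include <assert.h>

/*
 * Sketch only: longest_plateau is a hypothetical name, not from the text.
 * Assumes b[1..n] is sorted and n >= 1; b[0] is unused padding so that the
 * indices match the 1-based array used in the derivation above.
 */
int longest_plateau(const int b[], int n)
{
    int i = 1;   /* Establish the invariant: b[1..1] has a run of length 1. */
    int p = 1;

    assert(n >= 1);

    while (i != n) {
        /* Invariant: 1 <= i <= n and p is the longest run in b[1..i]. */
        i++;
        /* Sorted input: a run of length p+1 ends at i iff its endpoints match. */
        if (b[i] == b[i - p])
            p++;
    }
    return p;
}
```

Under these assumptions, calling the function on an array holding 1, 1, 2, 2, 2, 3 in positions 1 to 6 (with n = 6) returns 3, the length of the run of 2s.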

Where the code is more obvious, it is often better to let the code speak for itself.

@C The following function compares two C
strings and returns TRUE iff they are identical.

@$@<Function comp@>==@{@-
bool comp(const char *p, const char *q)
{
 while (TRUE)
   {
    if (*p != *q  ) return FALSE;
    if (*p == '\0') return TRUE;
    p++; q++;
   }
}
@}