Section 6: Contextual Replace

By now you're familiar with option R - Replace characters on the Main Menu. You know that you can use Replace characters to dramatically modify text. Contextual Replace adds two more tools to the Replace characters workbench: pattern strings and on and off strings. A contextual transformation chapter can use either the pattern string tool, the on or off string tool, or both.

Part 1: Overview of the Tools

In basic Replace, each transformation rule pairs a find string with a change to string. In contextual Replace, each transformation rule has three parts: the find string, the pattern string, and the change to string. Pattern strings consist of special characters (called pattern codes) that provide you with wild card functions. One pattern code means "find this letter whether it's upper- or lowercase." Another pattern code means "find any digit." We explore all thirty-four pattern codes in great detail in Parts 3 and 4. You can combine the various pattern codes to do some pretty nifty things.

Suppose you want to change lowercase letters to uppercase letters. With basic Replace, all your finding and changing must be explicit; you must write 26 transformation rules--like the LCUC transformation chapter on the BEXtras disk. Find a and change to A; find b and change to B, and so on through Z. Contextual Replace's pattern strings enable you to write a single transformation rule that finds every lowercase letter and changes it to its uppercase equivalent.
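For readers who think in code, here is a rough Python analogy of the difference. Python is not part of BEX, and the function names are hypothetical; the sketch only contrasts 26 explicit rules with one rule that covers a whole class of characters.

```python
import string

# Basic Replace analogy: one explicit find/change-to pair per letter.
basic_rules = [(lower, lower.upper()) for lower in string.ascii_lowercase]

def apply_basic_rules(text, rules):
    """Apply each explicit rule in order, over the whole text."""
    for find, change_to in rules:
        text = text.replace(find, change_to)
    return text

def apply_single_pattern_rule(text):
    """Contextual analogy: one rule that matches any lowercase letter."""
    return "".join(
        ch.upper() if ch in string.ascii_lowercase else ch for ch in text
    )

sample = "Grade 2 braille"
explicit_result = apply_basic_rules(sample, basic_rules)
pattern_result = apply_single_pattern_rule(sample)
```

Both approaches give the same text; the second simply states the intent once instead of 26 times.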

In contextual Replace, the find string alone does not specify what's found. It's the combination of the find string and the pattern string that defines what's found and where the change to string is placed in the target chapter.

On and off strings allow you to selectively Replace material within a chapter. You decide on a group of characters that serve as a "green light" or beginning point to start replacing. Once the program encounters this on string, it starts executing all the transformation rules you've supplied. You can also define a string of characters that serves as a "red light" or stopping point; this off string halts execution of the transformation rules. When the program encounters the off string, it doesn't stop altogether. Rather, Replace characters continues to search through the text for another on string that would reactivate execution of the transformation rules. You may define the on and off strings as anything you wish; they could be characters naturally part of the data you're transforming, or they could be characters that you insert for the specific purpose of switching replacement off and on.

As a quick example, you could define BEX's center-and-underline command, $$h as your on string, and BEX's paragraph ( $p ) indicator as your off string. Combining these on and off strings with the single case-changing transformation rule mentioned above, you can use Replace characters to make all your headings uppercase.
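The switching behavior can be sketched in Python. This is an illustration only, not how BEX is implemented. It assumes that replacement starts switched off whenever an on string is defined, that both strings are non-empty, and that the on and off strings themselves pass through unchanged. The function name transform_when_on is hypothetical.

```python
def transform_when_on(text, on_string, off_string, transform):
    """Apply `transform` only between an on string and the next off string."""
    if not on_string or not off_string:
        raise ValueError("this sketch needs non-empty on and off strings")
    out = []
    pos = 0
    active = False  # assumption: replacement starts switched off
    while pos < len(text):
        marker = off_string if active else on_string
        nxt = text.find(marker, pos)
        if nxt == -1:
            segment, end = text[pos:], len(text)
        else:
            segment, end = text[pos:nxt], nxt + len(marker)
        out.append(transform(segment) if active else segment)
        if nxt != -1:
            out.append(marker)  # the switch strings are copied unchanged
            active = not active
        pos = end
    return "".join(out)

text = " $$h Part one $p body text $$h Part two $p more text"
result = transform_when_on(text, "$$h", " $p ", str.upper)
```

Here str.upper stands in for the single case-changing rule, so only the two headings come out in uppercase.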

The transformation chapter that accomplishes the feat in the above example is just 22 characters long. But this brevity has its costs: Contextual Replace's pattern codes are quite cryptic. In fact, contextual Replace is almost a programming language. Like any language, you need to learn the vocabulary and the syntax to become a fluent user. And as with all language acquisition, learning about contextual Replace is not a linear process. You may find yourself reading and rereading this Section, and wondering if you will ever understand it. Then one day--whoosh!--it all makes sense.

Allow enough time for the learning process

Unlike some of the other material in the BEX Dox, you must read this Section front to back to make the most of the information provided here. Here's why: As you begin learning any language, you only know a few words and a few ways to put the words together. It's difficult to be eloquent until you have a range of vocabulary and syntax under your belt. In the following pages, we provide many elementary samples to assist you in understanding contextual Replace's vocabulary and syntax. Don't let the simplicity of these samples mislead you--once you have a full understanding of contextual Replace, you are indeed a Master of BEX.

Part 2: The Elements of Contextual Transformation Chapters

In User Level Section 8, we provided a sample of typing changes directly that illuminates how basic transformation chapters are structured. The same process should provide insight into how contextual transformation chapters are put together. But before you can start typing contextual changes directly, you need to learn a little bit about on and off strings and pattern codes.

As mentioned briefly in Part 1, on and off strings allow you to activate and deactivate the execution of transformation rules within a chapter. With basic Replace, you define a unique character to serve as a terminator. The terminator defines the end of the find and change to strings. This is also true in contextual Replace. Additionally, you may define two unique strings that serve as the signals to start and stop replacement. You don't have to define on and off strings: When you enter no characters as on or off strings, then all transformation rules are executed for the entire chapter.

The on and off strings may be any length, from 1 to 80 characters. Generally, the on and off strings are different from each other. In any one transformation chapter, the characters you define as on and off strings cannot be replaced. (When you're a real hotshot you can break these rules--details in Part 5.)

As mentioned in Part 1, contextual transformation rules consist of three parts: find, pattern, and change to strings. Each pattern string is made up of pattern codes; each code has a specific meaning. Each character in the find string is paired with one pattern code in the pattern string. The combination of the find string character and the pattern code defines what happens during replacement. An example of a pattern code is the lowercase letter x. When a find string character is paired with the lowercase x pattern code, it means match exactly this find string character and remove it in the target chapter. In other words, when you use this pattern code, your contextual transformation rules function exactly like basic transformation rules.

Typing Contextual Changes Directly

The best way to learn about the elements of contextual Replace is to just dive right in and try it. Here's the first task: In the QUANDARY chapter, change almost every appearance of the word blind to the words vision impaired. Don't change blind to vision impaired when the word appears in a heading. Here's how you type those changes directly:
Main: R
Replace
Chapter: QUANDARY <CR>
Chapter: <CR>
Target chapter: QUANDARY-C <CR>
Use transformation chapter: <CR>
Enter terminator: #
So far, this dialog is familiar. Press R, supply source and target chapter names, press <CR> at the Use transformation chapter: prompt to type changes directly, and establish which character serves as the terminator, in this case, the number sign.

When you enter your single terminator character at the first Find: prompt, you signal BEX to begin contextual Replace. Here's how it looks:
Find: #
Contextual replace
On string: <space>$p <space>#
Off string: $$c#
BEX responds with Contextual Replace and prompts you to define your on and off strings. We don't want to change the word blind to vision impaired when it appears in a heading. The headings in the QUANDARY chapter all begin with $$c and end with a paragraph ( $p ) indicator. We want to turn replacement off after the $$c and turn it back on at the start of the following paragraph. That's why we define the on string as ( $p ) and the off string as $$c. Here's the rest of the dialogue:
Find: blind#
Pattern: xxxxx#
Change to: vision impaired#
Find: #
Pattern: #
Continue? Y <CR>
Chapter QUANDARY done
Replaced 9 times
Save transformation chapter: QT <CR>

Once the on and off strings are defined, you begin to create transformation rules. The transformation rule begins with a find string. In this case, we're finding the word blind. Then comes the pattern string. Each character in the find string is paired with the pattern code character in the same relative position. The first find string character pairs with the first pattern code; the second find string character pairs with the second pattern code, and so forth. In this sample, each letter in the word blind is paired with the pattern code lowercase x. There are five letters in the word blind, so there are five lowercase x pattern codes.

The change to string functions exactly as it does in basic Replace. It defines what the find string is replaced by; in this case, the words vision impaired. The combination of the find character and the pattern code defines what BEX finds and changes in your chapter.

As always, the change to string ends with the terminator. When you press your single terminator character at both the Find: and Pattern: prompts, you signal the end of the list of rules. Our sample has only one transformation rule, so we press the number sign terminator at the Find: and Pattern: prompts. As with basic Replace, BEX prompts Continue? Y and gives you one last chance to chicken out or switch disks. Press <CR> here to start executing the rules.

The structure of contextual Transformation chapters

As with basic Replace, you have an opportunity to save the list of transformation rules in a transformation chapter. In the sample, we save the rules in a chapter named QT. When you edit the QT chapter, you find all the characters you typed between the Use transformation chapter: and Continue? Y prompts. It looks like this:

## $p #$$c#blind#xxxxx#vision impaired###
All contextual transformation chapters begin with two terminators. Between the second and third terminator are the characters that serve as the on string. The off string consists of the characters between the third and fourth terminator. After these four terminators, the list of transformation rules starts.

Each contextual transformation rule consists of the find string, a terminator, the pattern string, a terminator, the change to string, and one more terminator. The list of rules in a contextual transformation chapter always ends with three terminators in a row. The total number of terminators in a contextual transformation chapter is always divisible by three.
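To make this layout concrete, here is a small Python sketch that takes the QT chapter apart. It is an illustration based only on the structure described above. The function name parse_contextual_chapter is hypothetical, and the sketch assumes that the terminator never appears inside a string.

```python
def parse_contextual_chapter(raw):
    """Split a contextual transformation chapter into its parts."""
    terminator = raw[0]
    if raw[1] != terminator:
        raise ValueError("a contextual chapter starts with two terminators")
    fields = raw.split(terminator)
    # fields[0] and fields[1] are the empty strings before and between
    # the two opening terminators.
    on_string, off_string = fields[2], fields[3]
    rules = []
    body = fields[4:]
    for i in range(0, len(body) - 2, 3):
        find, pattern, change_to = body[i:i + 3]
        if find == "" and pattern == "":
            break  # an empty find and pattern end the list of rules
        rules.append((find, pattern, change_to))
    return terminator, on_string, off_string, rules

qt = "## $p #$$c#blind#xxxxx#vision impaired###"
terminator, on_string, off_string, rules = parse_contextual_chapter(qt)
terminator_count = qt.count(terminator)
```

For the QT chapter the count comes to nine terminators: four in the header, three for the single rule, and two more to close the list.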

When your pattern string contains all lowercase x pattern codes, then Replace only finds characters that exactly match the find string, removes them from the target chapter, and replaces them with the change to string. (Exactly what basic Replace does.) With the QT sample, we have only taken advantage of one of contextual Replace's tools: the ability to switch replacement on and off within a chapter.

We now change gears and discuss pattern codes in greater detail. We return to the topic of on and off strings again in Part 5.

Part 3: Pattern Codes in Detail

So far, the only pattern code we've discussed is lowercase x. If that were the only pattern code, then contextual Replace wouldn't be much of an improvement over basic Replace. But lowercase x is just one of 34 pattern codes that together provide you with a flexible, if cryptic, way of manipulating text.

Three pattern code groups

Every pattern string must contain some number of pattern codes. The pattern codes divide into three groups: departing codes, boundary codes, and specials. How the four specials work is a little complicated--we defer a complete explanation until Part 4.

The fifteen departing pattern codes and fifteen boundary pattern codes are drawn from the same set of fifteen letters. When the letter is lowercase, it's a departing pattern code. When the letter is uppercase, it's a boundary pattern code. Some of the fifteen letters are mnemonic, but we ran out of clever names halfway through. Part 8 and the Thick Reference Card provide an alphabetical listing of all pattern codes, so don't try too hard to memorize them all.

In the sample in Part 2, the QT contextual transformation chapter pairs a departing pattern code with every character in the find string: the lowercase letter x. The find string characters are removed from the target chapter.

When you pair a find string character with a departing pattern code, then the character in the source chapter that satisfies the combination of find and pattern strings is removed from the target chapter. The departing pattern codes show BEX where to place the change to string in the target chapter. When you pair a find string character with a boundary pattern code, then the character in the source chapter that satisfies the combination of find and pattern strings remains in the target chapter. The boundary pattern codes allow you to specify a context for replacement--one of the reasons we call this option contextual Replace.

Caution! The only significant spaces in this Section are those indicated by <space>. Any other spaces are there for ease of reading.

Here's a single transformation rule that illustrates the difference between departing and boundary pattern codes. The task is to change the Roman numeral II to the Arabic digit 2 whenever the text discusses literary braille. The terminator is the number sign:
Find: Grade <space> II#
Pattern: XXXXXXxx#
Change to: 2#
The first six characters of the find string are paired with uppercase X boundary codes in the pattern string. The only time that Roman numeral II is replaced with Arabic digit 2 is when the six characters preceding the Roman numeral are exactly uppercase G, lowercase r, lowercase a, lowercase d, lowercase e, <space>. Clever readers will note that a basic Replace rule that finds Grade II and changes it to Grade 2 accomplishes the same task. But stay tuned! We return to this issue again shortly.
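If you know regular expressions, a boundary code behaves roughly like a lookbehind or lookahead: it must match, but it is not consumed. The Python sketch below is only an analogy for the "II to 2" rule, not BEX syntax; the function name is hypothetical.

```python
import re

def grade_ii_to_2(text):
    # (?<=Grade ) plays the role of the six boundary codes XXXXXX:
    # those characters must be present, but they stay in the text.
    # The bare II plays the role of the two departing codes xx.
    return re.sub(r"(?<=Grade )II", "2", text)

changed = grade_ii_to_2("Grade II braille was unknown to Henry II.")
untouched = grade_ii_to_2("grade II")
```

The second call leaves its text alone, because the boundary is exact and the lowercase g does not match.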

The contiguous departers requirement

One rule governs where boundary pattern codes may appear in the pattern string. Boundary codes cannot interrupt a string of departing pattern codes. Boundary codes can appear as the initial characters in the pattern string, as the final characters in the pattern string, or both initial and final. But they cannot appear in the middle of a string of departing pattern codes. Departing pattern codes must be contiguous. That is, departing pattern codes must touch in unbroken sequence. The departing pattern codes define where BEX places the change to string in the target chapter. If the departing pattern codes were interrupted by boundary codes, BEX wouldn't know where the change to string should go. The "II to 2" rule above shows initial boundary codes; the other possibilities are shown in some of the many samples that lie ahead.

Find and pattern strings are the same length

Each pattern code is paired with one character in the find string. Therefore, the find strings and the pattern strings must be the same number of characters. When you type contextual changes directly, BEX refuses to accept a pattern string that's shorter or longer than the find string.
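The two requirements just described, equal lengths and contiguous departing codes, can be checked mechanically. The Python sketch below is a hypothetical checker, not BEX's own validation. It treats any lowercase letter as a departing code and ignores the four special codes, which are not covered until Part 4.

```python
def check_pattern(find, pattern):
    """Return a list of problems found in a find/pattern pair."""
    problems = []
    if len(find) != len(pattern):
        problems.append("find and pattern strings differ in length")
    # Positions of the departing (lowercase) codes in the pattern string.
    departing = [i for i, code in enumerate(pattern) if code.islower()]
    if departing and departing[-1] - departing[0] + 1 != len(departing):
        problems.append("departing codes are not contiguous")
    return problems

ok = check_pattern("Grade II", "XXXXXXxx")
interrupted = check_pattern(' "A" ', "BxLxB")
too_short = check_pattern("blind", "xxxx")
```

The pattern string BxLxB fails because the boundary code L sits between two departing codes.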

With this basic information under your belt, you're ready for the guided tour through the pattern codes. In the rest of this Part, we examine each pattern code in detail. Some pattern codes are pretty obvious; we just describe what they do. Many other pattern codes are more subtle. To make their function clear, we provide samples of where you could use the pattern code under discussion. Some of the samples show a lowercase departing pattern code, others show an uppercase boundary pattern code. Remember, each of the fifteen pattern code letters can be used either way.

Pattern code B: Blank Space

This pattern code stands for just one character, the space. (Think of the letter b in the word blank.) It's actually a redundant pattern code, since you can also specify the space by pairing a space in your find string with the exact pattern code X or x. We included this pattern code to help make your transformation rules a little more legible, as you see in subsequent samples.

Pattern code W: Total Wild Card

The pattern code letter W means every possible character you can have in a BEX chapter. (Think of the letter w in the word wild.) Technically, this means every ASCII character from zero to 127; this includes all control characters, the space character, all the digits and letters, all punctuation and symbols. Because the W pattern code stands for every possible character, it really does not matter which find string character you pair it to.

Analyze the data before you write the rules

Before you can write successful transformation rules, you must analyze the patterns inherent in the data you're changing. Both contextual and basic Replace execute your transformation rules in order, one BEX page at a time. The program compares the first rule with each and every character. When it finds a match, then the change to string characters are inserted in the text. Once all the characters are compared with the first transformation rule, BEX repeats the process with the second transformation rule. It's crucial to consider how the initial rules in a transformation chapter may change the patterns inherent in the data you're replacing. We return to this topic again after we provide a sample using the lowercase w wildcard.

Sample: deleting BEX format commands

Your BEXtras disk contains a chapter named RESUME that's stuffed full of various $$ commands. The task here is to delete every BEX $$ format command from this chapter. This is possible because BEX format commands follow a pattern. They all begin with two dollar signs. They all end with a space. (We assume here that the text is formatted with BEX's new-line ( $l ) and paragraph ( $p ) indicators. Later on in this Part, we discuss how to cope when you format text with hard <CR> s.)

BEX format commands can vary in length. In between the two dollar signs and the space, there are some number of lowercase letters and frequently some digits as well. Some format commands contain punctuation symbols, like the plus sign, minus sign, or asterisk. We write six transformation rules to find and delete format commands from three to seven characters long.
Main: R
Replace
Chapter: RESUME <CR>
Chapter: <CR>
Target Chapter: 1RESUME-NO$ <CR>
Use transformation chapter: <CR>
Enter terminator: #
Find: #
Contextual replace
On string: #
Off string: #
Find: <space> $$<space>#
Pattern: BxxB#
Change to: -#
Find: $$#
Pattern: xx#
Change to: <space> $$#
Find: $$c <space>#
Pattern: xxwB#
Change to: #
Find: $$p5<space>#
Pattern: xxwwB#
Change to: #
Find: $$ml4<space>#
Pattern: xxwwwB#
Change to: #
Find:<space>#
Pattern: xxwwwwB#
Change to: #
Find: $$ml+24<space>#
Pattern: xxwwwwwB#
Change to: #
Find: #
Pattern: #
Continue? Y <CR>
Starting to replace
Replaced 119 times
Save transformation chapter: STRIP DOLLAR <CR>

This sample shows that a contextual transformation chapter may contain more than one pattern code. In fact, a contextual transformation chapter could contain many combinations of all 34 pattern codes.

The first rule's find string is BEX's move-to-the-next tab command, <space>$$<space>. This is the only $$ command that doesn't follow the pattern described above. We replace the tab with a hyphen, since we still want to have some characters separating the text that was formatted with tabs.

The second rule ensures that all subsequent $$ commands follow a regular pattern, that is, they all start with two dollar signs and end with a space. Although we urge you to precede and follow every $$ command with a space, it's also possible to jam a series of $$ commands together. All the other rules in this transformation chapter depend on a boundary space after the $$ command. As it happens, the RESUME chapter contains this text:
... long-range planning of the firm.<space> $$mr-8<space> $p <space> NORTH FARM COOPERATIVE ...
If the second rule did not insert a space before every $$, then a $$ command jammed directly against the one that follows could not be changed, since it does not end with a space.

The next five rules specify every possible format command, starting with three-character-long commands and working up to seven-character-long commands. The third rule's find string is one possible format command that's three characters long: $$c. The pattern string begins with two lowercase x pattern codes that define exactly two dollar signs as departing characters. While the next departing pattern code, lowercase w, is paired with the specific character c, because it's a total wild card, this rule also matches $$b, $$d, $$h, $$r, and $$z. The last pattern code is a boundary uppercase B, standing for a space. (Uppercase X means the same thing; the B code here improves readability.) Since we enter the number sign terminator at the Change to: prompt, the change to string is empty. Since the initial dollar signs and the next character are replaced by nothing, they're deleted from the target chapter.

The fourth rule's find string shows the $$p5 format command. Since the p5 characters are paired with two lowercase w pattern codes, this rule also matches $$a3, $$vn, $$ub, and any other format command that's four characters long. The fifth, sixth, and last rules use the same technique to find and delete format commands from five to seven characters in length.

When you edit the RESUME-NO$ chapter, you find many extra spaces. These spaces separated $$ commands in the original RESUME chapter, and are "thrown away" when BEX prints. You could use the SP2 transformation chapter on your BEXtras disk to delete all the extra spaces. (In Part 7, we explain how SP2 works.) Once you add more pattern codes to your repertory, it's possible to rewrite these rules to not create those extra spaces--we provide an example of this later on in this Part.

Reversing the order would delete characters

The order of the transformation rules in the STRIP DOLLAR chapter is crucial. If you reversed the order of the rules, searching first for seven-character format commands, next for six-character, and so on, you could delete portions of your text in addition to the $$ commands. Suppose the RESUME chapter contained the following characters:
...<space> $l <space> $$mr8 <space> A <space> mail-order ...
If the third transformation rule was:
Find: $$ml+24<space>#
Pattern: xxwwwwwB#
Change to: #
then the line of text would get mangled to:
...<space>$l <space><space> mail-order <space> firm ...

Here's why: The departing pattern code w matches every possible character. The transformation rule really says, delete two dollar signs and the next five characters, as long as the space boundary exists. It just so happens that $$mr8<space>A<space> satisfies this rule. There are five characters between the initial two dollar signs and the <space> after the article A.

Because the STRIP DOLLAR transformation chapter begins with short $$ commands and finishes with long ones, it avoids this sort of problem. The rule that matches five-character $$ commands precedes the rule that matches seven-character $$ commands. Since Replace characters executes rules in order, the five-character command $$mr8 is matched and deleted before Replace starts looking for seven-character format commands. By the time Replace characters tries to match the seven-character rule, the $$mr8 <space> A <space> text no longer exists.
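The effect of rule order can be reproduced with regular expressions standing in for the wild card rules. This Python sketch is an analogy, not BEX's matching engine; strip_rule and apply_in_order are hypothetical names. Each regular expression mimics a pattern string of two x codes, some number of w codes, and a boundary B.

```python
import re

def strip_rule(wild_count):
    """Regex analogy of the pattern string xx + (w * wild_count) + B."""
    # The dot matches any character (DOTALL), like the w wild card.
    # The lookahead for a space is the boundary, which is not consumed.
    return re.compile(r"\$\$.{%d}(?= )" % wild_count, re.DOTALL)

def apply_in_order(text, wild_counts):
    """Run each rule over the whole text before starting the next rule."""
    for count in wild_counts:
        text = strip_rule(count).sub("", text)
    return text

line = " $l $$mr8 A mail-order firm"
short_first = apply_in_order(line, [1, 2, 3, 4, 5])
long_first = apply_in_order(line, [5, 4, 3, 2, 1])
```

With the short rules first, only $$mr8 is removed. With the long rule first, the article A disappears along with the command.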

Pattern code I: Ignoring Capitalization

You pair the pattern code I with a letter in your find string. Pattern code I means: find exactly this letter in two situations, when the find string letter is uppercase and when it's lowercase. (Think of the letter i in the word ignore.) When the pattern code is uppercase I, then its find string partner is a boundary, and remains in the target chapter. When the pattern code is lowercase i, then its find string partner departs. Here's how we modify the "II to 2" rule shown earlier in this Part to make it more general (the terminator is number sign again):
Find: Grade <space> II#
Pattern: IIIIIBxx#
Change to: 2#
This rule finds and changes the Roman numeral II in a variety of contexts. Because the boundary code I ignores capitalization, Grade II, grade II, and GRADE II all satisfy the combination of find and pattern strings.
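In regular expression terms, each I code acts like a two-letter character class. The Python sketch below is an analogy for the rule above, with a hypothetical function name. The captured group is written back unchanged to imitate boundary characters that stay in the target chapter.

```python
import re

def any_grade_ii_to_2(text):
    # [Gg][Rr][Aa][Dd][Ee] mirrors the five I codes, the space mirrors B,
    # and the literal II mirrors the two exact departing codes xx.
    return re.sub(r"([Gg][Rr][Aa][Dd][Ee] )II", r"\g<1>2", text)

result = any_grade_ii_to_2("Grade II, grade II, GRADE II, grade ii")
```

The last item is left alone, since the Roman numeral itself is still paired with exact codes and must be uppercase.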

Pattern codes S, U, and L: Generic Letters

These three pattern codes focus on the letters of the alphabet. The pattern code S stands for any letter of the alphabet, as long as it's lowercase. (Think of the letter s in the word small.) The pattern code U is the opposite of S; U stands for any letter of the alphabet that's uppercase. (Think of the letter u in the word uppercase.) Finally, the pattern code L stands for 52 characters: any letter of the alphabet, regardless of capitalization. (Think of the letter l in the word letter.) The following sample illustrates what a time-saver the S, U, and L pattern codes can be.

Sample: Inserting underlining for single letters

The task here is one we actually faced while revising this edition of the BEX Dox. In the earlier edition, when we were discussing an individual letter, we would enclose that letter in double quotes. In this edition, we italicize a single letter. We didn't want to make all those changes manually! The following sample transformation rules delete the quotes from around a single letter, and insert BEX's underlining commands, $$ub and $$uf.

The number sign serves as the terminator once more. The first rule changes the opening double quote:
Find: <space> "A" <space>#
Pattern: BxLXB#
Change to: $$ub<space>#
We're assuming that a quoted single letter, together with any touching punctuation, is preceded and followed by a space. The find string shows the letter A, but since it's paired with the boundary code L, this rule actually applies to every possible letter, regardless of capitalization. With basic Replace, we would have had to write 52 rules to accomplish the same task. It's important to include both the opening and closing quotes, because we only want to change double quotes around single letters. Only one pattern code is departing; the double quote is removed from the target chapter. The change to string consists of the start underlining command.

This rule illustrates the contiguous departers requirement. Because departing pattern codes must touch, a pattern string of BxLxB is not allowed. When you want to modify text on both sides of a given group of characters, you have to work on one side first, and then the other.

Now that we've dealt with the opening double quote, the second rule handles the closing double quote.
Find: $$ub<space>A"<space>#
Pattern: XXXXBLxB#
Change to: <space>$$uf#
Replace characters always executes the transformation rules in the order you enter them. Therefore, after the first rule, the emphasized letters no longer have opening quotes--they have the begin underlining format command, $$ub. The find string for the second rule reflects this change. The pattern string again contains a single departing code, paired to the closing double quote. The change to string does not need to include a space after the $$uf, because the space after the double quote is paired with a boundary pattern code, B, which is not changed in the target chapter. This space is still present in the target chapter, where it serves as the space required after every BEX format command.
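The two-step approach can be imitated in Python, with lookbehinds and lookaheads standing in for the boundary codes. This is a sketch of the idea rather than BEX's behavior, and the function name is hypothetical. Note how the second substitution looks for the $$ub that the first one inserted.

```python
import re

def underline_quoted_letters(text):
    # Rule 1 (pattern BxLXB): the opening quote departs, replaced by "$$ub ".
    text = re.sub(r'(?<= )"(?=[A-Za-z]" )', "$$ub ", text)
    # Rule 2 (pattern XXXXBLxB): the closing quote departs, replaced by " $$uf".
    text = re.sub(r'(?<=\$\$ub [A-Za-z])"(?= )', " $$uf", text)
    return text

result = underline_quoted_letters('Press "A" now, not "AB" here.')
```

The quoted pair "AB" is untouched, because both rules insist on a single letter between the quotes.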

These two transformation rules accomplish the task at hand for many occurrences of quoted letters. The only exceptions are when punctuation touches the quoted letter. One could write a series of rules like this:
Find: <space>"A,"<space>#
Pattern: BxLXXB#
Change to: $$ub<space>#
making one rule for every possible punctuation that touches the quote marks. In addition to the comma shown, you would need rules for the period, exclamation point, question mark, colon, left and right parentheses, left and right brackets, semicolon, and several more. But (as you're probably not surprised to discover at this point) contextual Replace has one pattern code for this situation.

Pattern code P: punctuation and symbols

The pattern code P stands for punctuation and symbols. (Think of the letter p in the word punctuation.) Its definition is actually a process of elimination: P means anything that's not a letter, not a digit, not a control character, and not a space. Note that many of the symbols on this list are not what your English teacher taught you as "punctuation." For the record, a complete list of the characters that match the P pattern code appears in Part 8.
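The process of elimination is easy to carry out in Python. The sketch below derives a candidate list purely from the definition above. It assumes that ASCII 127 (delete) counts as a control character; the list in Part 8 remains the authority. The name matches_p_code is hypothetical.

```python
import string

def matches_p_code(ch):
    """True when ch is ASCII and not a letter, digit, control character, or space."""
    code = ord(ch)
    if code > 127:
        return False
    is_control = code < 32 or code == 127  # assumption: DEL is a control character
    return not (ch.isalpha() or ch.isdigit() or is_control or ch == " ")

P_CHARACTERS = "".join(chr(c) for c in range(128) if matches_p_code(chr(c)))
```

Under these assumptions the result is the same set of 32 characters that Python's string.punctuation contains.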

Sample: Inserting underlining for single letters, continued

Returning to our underlining letters transformation chapter, the P pattern code is used to match various combinations of punctuation touching quotations. The next two rules place the begin and finish underlining when a symbol of enclosure--parentheses, brackets, etc.--precedes the opening quote:
Find: <space>("A"<space>#
Pattern: BPxLXB#
Change to: <control-T>$$ub<space>#
Find: (<control-T>$$ub<space>A"<space>#
Pattern: PXXXXXBLxB#
Change to: <space>$$uf#
The <control-T> touching token is discussed fully in Section 5, Part 1; it can replace the initial or final space in a BEX format command. Because we only want to underline the letter, and not the punctuation that precedes it, we need to begin underlining after the initial punctuation. But if we inserted <space> $$ub <space> there would be one space between the initial punctuation and the letter. We replace the initial space with <control-T> so we can start underlining within a BEX word. BEX throws away the final space in the underlining command when it's executed.

The next two rules cope with punctuation that appears to the right of the closing quote:
Find: <space>"A".<space>#
Pattern: BxLXPB#
Change to: $$ub<space>#
Find: <space>$$ub<space> A".<space>#
Pattern: BXXXXBLxPB#
Change to: <control-T>$$uf<control-T>#
The first <control-T> ensures that there's no space between the letter and final punctuation. The second <control-T> signals the end of the $$ command, so that the translator turns on again and correctly translates the punctuation. Remember, although the find string shows a period after the closing quote, when paired with the boundary pattern code P, it matches a semicolon, right parenthesis, or any of the other characters from the long list above.

Two more rules cope with the remaining possibility, when the punctuation appears between the quoted letter and the closing quote:
Find: <space>"A,"<space>#
Pattern: BxLPXB#
Change to: $$ub<space>#
Find: <space>$$ub<space>A."<space>#
Pattern: BXXXXBLPxB#
Change to: <space>$$uf#

We don't bother with <control-T> s here, because we use the $$sp command so BEX suppresses underlining of final periods, commas, semicolons, and colons. A more elegant transformation chapter that accomplishes the same task with fewer rules is discussed in Part 6.

Pattern code D: Delimiting BEX "Words"

The D pattern code stands for two characters: either <CR> or <space>. (Think of the letter d in the word delimiter.) A <CR> or space is how BEX defines a "word" in the Editor. So far, our examples have assumed you format your text with BEX's new-line ( $l ) and paragraph ( $p ) indicators. When you use hard <CR> s to format your text, then the B blank pattern code does not sufficiently define a word. For example, the last word on a line begins with a space but ends with <CR>. To make the STRIP DOLLAR transformation chapter work correctly when your text contains hard <CR> s, substitute the D pattern code for the B pattern code.

Pattern code E: Everything Except Space or <CR>

The E pattern code is the opposite of the D pattern code. E stands for every character except two: <space> and <CR>. (Think of the letter e in the words everything but.) When this code is departing, lowercase e, it's a succinct way to delete words.

Sample: Deleting BEX format commands, revisited

When we introduced the wild card pattern code W, we showed six rules that deleted BEX format commands. Now that we have the D pattern code in our repertory, we can write transformation rules that find BEX format commands ending with either <space> or <CR>. And now that the E pattern code is at our disposal, we can delete all BEX format commands with just four rules.
Find: <space>$$<space>#
Pattern: BxxB#
Change to: -#
Find: $$x#
Pattern: XXe#
Change to: #
Find: $$<space>#
Pattern: xxb#
Change to: #
Find: $$<CR>#
Pattern: xxD#
Change to: #
The first rule finds the move-to-the-next tab command, inserting a hyphen to separate what was separated with tabs. The second rule finds any character following two dollar signs except a <space> or <CR>. BEX repeatedly executes this rule, deleting one character at a time from all format commands. Once all the characters between the two dollar signs and the delimiter are gone, the third and fourth rules clean up what's left. The third rule deletes what's left when the format command was not at the end of a line. The fourth rule deletes the two dollar signs left when the format command ended with <CR>, but leaves the <CR> itself. Try these rules on the RESUME chapter and see what happens!

Pattern code Q: Defining Real Words

The Q pattern code is a combination of the D and the P pattern codes. It stands for space, <CR>, or any punctuation or symbol. Although BEX treats the punctuation touching a word as part of the word for Editor cursor movement, many times you don't want that punctuation considered as part of the word when replacing.

Sample: Expanding keyboard shortcuts

As we promised in User Level Section 8, contextual Replace makes it much easier to develop keyboard shortcuts for your data entry. Suppose you type the three characters s-b to stand for the word SlotBuster and the two letters vi to stand for the two words visually impaired. The following basic Replace rules would cause problems:
Find: s-b#
Change to: SlotBuster#
Find: vi#
Change to: visually <space> impaired#
When your original text has a sentence like The villainous has-been twirled his mustache, then the above rules create the nonsensical the visually impairedllainous haSlotBustereen twirled his mustache.

Aha! you think, I'll just modify the rules to include a space before and after the shortcuts, like this:
Find: <space> s-b <space>#
Change to: <space> SlotBuster <space>#
Find: <space> vi <space>#
Change to: <space> visually <space> impaired <space>#
but then neither keyboard shortcut is matched in this sentence: The "s-b" card is useful to people who are vi.

The contextual Replace pattern code Q comes to the rescue, as follows:
Find: (Vi)#
Pattern: QIxQ#
Change to: isually <space> impaired#
Find: (s-b)#
Pattern: QxxxQ#
Change to: SlotBuster#
Although the find strings show left and right parentheses, when paired with the Q pattern code, these rules find and change vi and s-b in a host of different contexts, whenever the shortcut letters appear as a word. By pairing the initial letter v with the I boundary code, we match the shortcut Vi at the start of a sentence. The first letter of the shortcut remains in the target chapter, with the same case as it had in the source chapter. Villainous and has-been don't satisfy the Q pattern code, so you don't end up with nonsense. However, if your text contains Roman numeral six, you're in trouble. As always, it's important to know your data before you start writing rules.
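For readers who think in regular expressions, the contrast between the basic rules and the Q rules can be sketched in Python. This is only an analogy and not how BEX works internally: the function names are hypothetical, and the lookarounds approximate Q by treating any non-alphanumeric character as a word edge. Unlike Q, they also accept the very start or end of the text as an edge.

```python
import re

def naive_expand(text):
    # Analogue of the basic Replace rules: blind substring substitution.
    return text.replace("s-b", "SlotBuster").replace("vi", "visually impaired")

# Assumption: "not a letter or digit" stands in for the Q pattern code.
EDGE_BEFORE = r"(?<![A-Za-z0-9])"
EDGE_AFTER = r"(?![A-Za-z0-9])"

def contextual_expand(text):
    # Analogue of the (s-b) / QxxxQ rule: the edges stay, the shortcut departs.
    text = re.sub(EDGE_BEFORE + r"s-b" + EDGE_AFTER, "SlotBuster", text)
    # Analogue of the (Vi) / QIxQ rule: the first letter keeps its case.
    text = re.sub(EDGE_BEFORE + r"([Vv])i" + EDGE_AFTER,
                  r"\1isually impaired", text)
    return text

if __name__ == "__main__":
    print(naive_expand("The villainous has-been twirled his mustache"))
    print(contextual_expand('The "s-b" card is useful to people who are vi.'))
```

As with the BEX rules, a lowercase Roman numeral vi standing alone would still be expanded, so knowing your data matters here too.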

Pattern codes N and A: Numbers

The N pattern code stands for the digits zero through nine. (Think of the n in the word numeral.) The A pattern code stands for 62 characters: the ten digits plus any letter, lowercase or uppercase. (Think of the letter a in the word alpha-numeric.) For those who like formulas, pattern code L plus pattern code N equals pattern code A.

Pattern code O: Wild Card Minus One

The pattern code letter O is a very handy code indeed. It stands for every possible character except one--the find string character it's paired with. (Think of the letter o in the words other than.) When pattern code O is paired with the dollar sign, then it matches every character except the dollar sign.

The sample task here is again drawn from our experience at RDC. For formatting our large print output, we use JustText, an embedded command typesetting program for the Macintosh Plus. All its commands begin with left brace and end with right brace. The JustText commands vary greatly in length; some are five characters long, while others are twenty characters long. We usually prepare our text totally with BEX, using contextual Replace to put the JustText commands where they should go. Occasionally, however, we do data entry directly in JustText. When we bring this data back to the Apple, most of the JustText commands are irrelevant.

The transformation chapter that strips out the JustText commands begins with several rules that change the few meaningful JustText commands back to their BEX equivalents--we won't bother to show you those. It's the last two rules in the transformation chapter that show off the power of the o pattern code:
Find: {}#
Pattern: Xo#
Change to: #
Find: {}#
Pattern: xx#
Change to: #
The first rule matches the beginning of a JustText command. The initial left brace is paired with uppercase X, so it remains in the target chapter. The right brace, when paired with departing o, means: delete any character except the right brace. This rule is executed repeatedly, so it deletes each character in the JustText command until all that's left are the initial and final braces. The second rule deletes this residue.
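The net effect of these two rules can be modeled in a few lines of Python. This is a sketch under stated assumptions: the function name is hypothetical, and the loop imitates only the result of executing the first rule repeatedly, not the order in which BEX actually scans the chapter.

```python
def strip_brace_commands(text):
    """Model of the two JustText rules; returns (new_text, deletions)."""
    deletions = 0
    i = 0
    # Rule 1 analogue: "{" stays (uppercase X); the next character departs
    # unless it is "}" (departing o paired with the right brace).
    while i < len(text) - 1:
        if text[i] == "{" and text[i + 1] != "}":
            text = text[:i + 1] + text[i + 2:]
            deletions += 1
        else:
            i += 1
    # Rule 2 analogue: delete the residual "{}" pairs.
    return text.replace("{}", ""), deletions

if __name__ == "__main__":
    print(strip_brace_commands("Hello {f12}world{b} again"))
```

The deletion count makes the cost visible: every character inside a command is removed by a separate application of the first rule.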

Pattern code C: Control Characters

The C pattern code stands for any control character--technically speaking, the ASCII values zero to 31 plus ASCII 127. (Think of the letter c in the word control.) In terms of characters you are likely to enter in a BEX chapter, the C code includes <ESC>, <CR>, <control-T>, <control-S>, <ASCII 30> (the discretionary line-break), <ASCII 31> (the discretionary hyphen), and <DEL>.

The C code can be very useful when you are working with files from other computers that arrive at your Apple full of printer control commands. Suppose you have a file like this; the only control character you want to preserve is <CR>. Assuming that all printer control commands are two characters long, you can reformat the text with these rules:
Find: <CR>#
Pattern: x#
Change to: <space>$l <space>#
Find: <ESC> O#
Pattern: cw#
Change to: #
Find: <space>$l <space>#
Pattern: xxxx#
Change to: <CR>#
These rules demonstrate another important technique for Replace characters, both basic and contextual: temporarily protecting a character from being changed. The first rule changes <CR> to the new-line ( $l ) indicator. The only control characters that remain are ones you don't want, so the second rule deletes them. The third rule reverses the action of the first rule, changing the new-line indicators back to hard <CR> s.
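The protect, delete, restore sequence translates directly into other tools. Here is a minimal Python sketch of the same three steps; the function name is hypothetical, and the character class is our reading of "ASCII values zero to 31 plus ASCII 127". Like the BEX rules, it assumes every printer command is exactly two characters long and that the placeholder does not already occur in the text.

```python
import re

CR = "\r"
PLACEHOLDER = " $l "  # <space>$l<space>, the new-line indicator

def strip_printer_codes(text):
    # Step 1: protect every <CR> by changing it to the new-line indicator.
    text = text.replace(CR, PLACEHOLDER)
    # Step 2: delete any remaining control character plus the one
    # character that follows it (the "cw" pattern string).
    text = re.sub(r"[\x00-\x1f\x7f].", "", text, flags=re.S)
    # Step 3: restore the protected <CR>s.
    return text.replace(PLACEHOLDER, CR)

if __name__ == "__main__":
    sample = "line one\r\x1bEbold\x1bF text\r"
    print(repr(strip_printer_codes(sample)))
```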

Summary

We've now defined the fifteen departing and boundary pattern codes. We've also provided samples for most of them. Now it's time to dive into the mysterious world of the specials.

Part 4: The Special Pattern Codes

The four specials "break the rules" we've established so far. But in return for this inconsistency, you get the ability to do even niftier things. The special pattern codes are the caret, uppercase Z, uppercase Y, and the current terminator. Let's look at these in reverse order.

The Pattern String Shortcut

In the very first contextual Replace sample, the pattern string consisted of all lowercase x pattern codes. Just to make your life a little easier, there's a shorter way to indicate that you want an exact, departing match for every character in your find string. Instead of entering a lowercase x for every character in your find string, enter your single terminator character at the Pattern: prompt. The original rule looked like this:
Find: blind#
Pattern: xxxxx#
Change to: vision <space> impaired#
The next sample functions identically to this rule, using the pattern string shortcut:
Find: blind#
Pattern: #
Change to: vision <space> impaired#

Pattern codes Y and Z: Case-changing Specials

The uppercase letters Y and Z are like boundary pattern codes with one difference. All the boundary codes result in no change in your target chapter. Y and Z create a subtle change in your target chapters. When the combination of a find string character and the pattern code Y or Z is satisfied, the find string character changes its case. Uppercase Y stands for any letter, just like the letter code L; but when the find string partner is a lowercase letter, then that letter is uppercase in your target chapter. Uppercase Z stands for any letter; when the find string partner is an uppercase letter, it becomes lowercase in your target chapter.

() Caution! The Y and Z do not automatically reverse the case of the find string partner letter. Y forces an uppercase letter, and Z forces a lowercase letter. With the exception of their case-changing abilities, Y and Z follow the rules for boundary pattern codes.

Pattern code caret: Insert Change To String Here

Up until now, all the pattern strings we've shown contain at least one departing character. Sometimes, however, you don't want to remove any of the characters in your source chapter that satisfy the combination of find string and pattern string. The caret ^ character means: insert the change to string characters at this point. With this special pattern code, you can make changes without removing anything.

Now that the full truth is out, we can state the full requirements for pattern strings. A pattern string cannot consist solely of boundary pattern codes. A pattern string can contain any number of boundary pattern codes, plus either exactly one caret, or at least one departing character. When a pattern string does not contain the caret, then the number of characters in the pattern string is exactly the same as in the find string. When a pattern string contains a caret, then the pattern string is one character longer than the find string.

You cannot have more than one caret in any one pattern string, because the caret shows BEX where to insert the change to string. If you had two carets, BEX would not know where to put the new characters. Any one pattern string cannot contain both a caret and any departing pattern codes. When you find yourself wanting to mix and match this way, you need to write two separate rules.
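The requirements above are mechanical enough to express as a small checker. The following Python sketch is illustrative only: the function name is hypothetical, it treats any lowercase character as departing and everything else except the caret as boundary (it does not verify that each letter is one of the valid pattern codes), and the order of the checks is our assumption rather than BEX's. The two quoted BEX messages are the ones described under Pattern String Errors; the other two strings are invented for this sketch.

```python
def check_pattern(find, pattern):
    """Return "ok" or a short description of the first problem found."""
    if pattern == "":
        return "ok"  # the shortcut: an exact, departing match throughout
    carets = pattern.count("^")
    codes = pattern.replace("^", "")
    has_departing = any(c.islower() for c in codes)
    if carets > 1:
        return "more than one caret"               # hypothetical wording
    if carets == 1 and has_departing:
        return "caret mixed with departing codes"  # hypothetical wording
    if carets == 0 and not has_departing:
        return "Pattern string needs caret or one departing code"
    if len(codes) != len(find):
        return "Character counts are not equal"
    return "ok"

if __name__ == "__main__":
    for find, pattern in [("4321.", "N^NNNQ"), ("A", "Z"), ("ab", "x")]:
        print(find, pattern, "->", check_pattern(find, pattern))
```

Counting the codes without the caret is the same test as saying a caret pattern is one character longer than its find string.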

Sample: Inserting commas in long numbers

You've written a report with many large dollar amounts and other numbers in it, but you realize that you've left out all the commas between the hundreds and the thousands places. You can insert the comma using the caret code (the number sign is the terminator):
Find: 4321.#
Pattern: N^NNNQ#
Change to: ,#
This rule matches any four digits, followed by punctuation or a delimiter, and inserts a comma between the first and second digits. This takes care of a lot of cases. It doesn't matter whether the numbers are whole or followed by two decimal places or whether they're preceded by a dollar sign. However, this rule only works for numbers in the thousands: when your data contains millions and billions, you need to write more rules.

But at the end of your report, there's a list of phone numbers. You don't want commas between the fourth and fifth digits in those. You see that what makes these numbers unique is the hyphen between the third and fourth digits. The O pattern code specifies a match for every character except its find string partner--in this case, the hyphen:
Find: -4321.#
Pattern: ON^NNNQ#
Change to: ,#
That's better! Now the rule matches four digits followed by punctuation or a delimiter, and inserts a comma between the first and second digits, unless the first digit is preceded by a hyphen.
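A rough regular-expression counterpart of this rule, written in Python, may help you see what each pattern code contributes. It is a sketch and not BEX itself: the function name is hypothetical, "any character that is not a letter or digit" approximates Q, and the lookbehind plays the part of O paired with the hyphen.

```python
import re

def insert_thousands_comma(text):
    # (?<=[^-])   some character other than a hyphen comes first   (O)
    # (\d)        the digit that the comma follows                  (N^)
    # (?=\d{3}    three more digits                                 (NNN)
    #  [^A-Za-z0-9])  then punctuation or a delimiter               (Q)
    return re.sub(r"(?<=[^-])(\d)(?=\d{3}[^A-Za-z0-9])", r"\1,", text)

if __name__ == "__main__":
    print(insert_thousands_comma("The total was $4321. Call 555-1234 today."))
    print(insert_thousands_comma("1234567 "))
```

The second call shows the limitation mentioned earlier: a seven-digit number gets only one comma, so millions need additional rules.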

Sample: Uppercase to lowercase

You can make great use of the caret when the change to string is empty. The following single rule accomplishes the same task as the 106-character UCLC basic transformation chapter on your BEXtras disk.
Find: A#
Pattern: ^Z#
Change to: #
The find string contains the letter A. When paired with the boundary pattern code Z, this means find every letter and make it lowercase. Because a pattern string cannot consist totally of boundary codes, we precede the Z with the caret, meaning "insert the change to characters here," then pull the sneaky trick of creating an empty change to string. No text is inserted; the target chapter contains all lowercase letters.

Part 5: On and Off Strings in Detail

With basic Replace characters, you cannot selectively change material within a chapter. Instead, you have to break out your selections into separate chapters, Replace as required, then merge (or clipboard) the changed text back into your original.

Contextual Replace's on and off strings save you a lot of extra work. In Part 2, we illustrated on and off strings by changing one word everywhere except in a heading, which began with $$c and finished with ( $p ). That's a good example of on and off strings that naturally occur in your text. The $$c characters are in your text to create a centered heading; the ( $p ) is there to make a new paragraph. By defining ( $p ) as the on string and $$c as the off string, you're getting double duty from those characters.

It's also possible to explicitly insert characters whose only purpose is switching Replace off and on. You want these characters to be very distinctive; three percent signs %%% makes a good on string, and three ampersands &&& makes a good off string. In Part 7, we discuss the FIX TEXT contextual transformation chapter, which reformats textfiles into BEX chapters. When your textfile contains tabular material, FIX TEXT would cause havoc. As supplied on your BEXtras disk, FIX TEXT does not have on or off strings. When reformatting tabular textfiles, insert explicit on and off strings in the data and in FIX TEXT.

Explicit, Non-printing On and Off Strings

BEX's Grade 2 translator recognizes BEX's own underlining commands. So $$ub both begins underlining in print, and marks the beginning of braille italics. Likewise, $$uf finishes underlining and ends braille italics. When the translator encounters $$ub, it begins placing braille italics signs (dot 4-6, the period in screen braille) before every word. When the translator encounters $$uf, it stops placing italics signs. When the underlined passage is greater than three words, then the first word gets a double italics sign (two dot 4-6's or two periods in screen braille) and the last underlined word has a single italics sign.

Braille italics are used to represent various typeface changes in inkprint. But with some books, the braille reader must be able to distinguish between three typefaces: plain, italic, and bold. Placement of boldface indicators follows the same rule as italics indicators. Three or fewer bold words each begin with the single boldface indicator _. underbar, period. The first word in a passage longer than three begins with the double boldface indicator _.. underbar, two periods, and then the single boldface indicator precedes the last bold word.

Two of BEX's specific printer commands control boldface in print: $$eb begins boldface and $$ec returns to regular print. But the Grade 2 translator can't use $$eb and $$ec to place boldface signs. When the book you're transcribing contains extensive text that requires boldface indicators, it's time-consuming to place them all by hand. Combining the Grade 2 translator plus off and on strings, contextual Replace can automatically place boldface indicators.

It's a three-step process. Enter $$eb and $$ec in your inkprint text. Then use basic Replace to transform $$eb to $$ub $$eb and $$ec to $$uf $$ec. The translator ignores the $$eb and $$ec commands; but because the $$ub and $$uf commands are there, it places italics signs as always.

The last step is using a contextual Replace transformation chapter with an on string of $$ub $$eb and an off string of $$uf $$ec. This transformation chapter changes the italics indicator . to the boldface indicator _. (underbar, period). The on and off strings prevent every italic sign from changing to boldface. This means that the only time italics signs are changed to boldface signs is when the italics signs appear between $$ub $$eb and $$uf $$ec.

BEX's formatter knows that braillers can neither underline nor do boldface. When you print your final grade 2 chapters to the braille previewer or an actual brailler, BEX suppresses the action of all $$ub, $$uf, $$eb, and $$ec format commands.

Sample: Automatic braille boldface indicators

To get automatic braille boldface indicators, you must transform your data twice. In your inkprint data entry, type $$eb before boldface text and $$ec when you return to plain text. Before translation, use this basic Replace transformation:
Enter terminator: #
Find: <space> $$eb <space>#
Change to: <space>$$ub $$eb <space>#
Find: <space> $$ec <space>#
Change to: <space>$$uf $$ec <space>#
Now you are ready to translate your inkprint into grade 2 or grade 1 braille. The translator places italics signs appropriately between all $$ub and $$uf underlining commands.

The following contextual transformation chapter changes the italics signs to boldface signs.
Enter terminator: #
Find: #
Contextual replace
On string: $$ub $$eb#
Off string: $$uf $$ec#
Find: <space>.#
Pattern: B^X#
Change to: _#
Find: #
Pattern: #
Continue? Y <CR>
The first find string character, <space>, when paired with a boundary B, defines the context at the beginning of a word. The period, when paired with boundary pattern code X, defines the first character of an italicized word in braille. There may be two italics signs or only one, but we actually don't care: we just want to insert the underbar that changes the italics sign to a boldface sign.
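To see the gating effect of the on and off strings outside BEX, consider this Python sketch. It is an illustration under stated assumptions, not a description of BEX's internals: the function name is hypothetical, and the regular expression only handles passages that have both an on string and a matching off string.

```python
import re

ON_STRING = "$$ub $$eb"
OFF_STRING = "$$uf $$ec"

def italics_to_bold(braille):
    # Find each stretch of text between the on string and the next off string.
    span = re.compile(
        re.escape(ON_STRING) + r"(.*?)" + re.escape(OFF_STRING), re.S)

    def fix(match):
        # Analogue of find <space>. with pattern B^X and change to _ :
        # an underbar is inserted between the space and the period.
        inner = match.group(1).replace(" .", " _.")
        return ON_STRING + inner + OFF_STRING

    return span.sub(fix, braille)

if __name__ == "__main__":
    sample = "read .this $$ub $$eb .bold .words $$uf $$ec and .that"
    print(italics_to_bold(sample))
```

The italics signs outside the on and off strings are untouched. A double italics sign inside them becomes _.. because only one underbar is inserted, which is the double boldface indicator.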

The pattern code meanings are for print data, which is important to keep in mind when writing contextual rules for braille data. For example, the P punctuation pattern code does not match braille punctuation, where a comma is represented with a screen braille digit 1 and the period is screen braille digit 4.

() Warning! When you include the w wildcard pattern code in a transformation chapter that uses on and off strings, it's easy to "eat" your on and off strings by mistake. Suppose your off string is #stop and one rule contains three w departing pattern codes in a row. When Replace characters encounters <space># it satisfies the rule, and the first two characters of your off string depart--and your replacement never turns off! The solution is to avoid the w wildcard. In this situation, it's safe to substitute the o departing pattern code paired with number sign. That way the off string is not affected by the replacement.

Off Strings that Overlap Data

While we don't recommend you find and change your exact on or off strings in the same replacement, you can modify text that's partly the same as your off strings. As BEX replaces characters, it works a character at a time. Suppose your off string is the first five letters of the alphabet: ABCDE. You can find and change <space> ABC because BEX hasn't encountered the full off string yet. But if your change deletes the A, B, or C, you can find yourself in deep trouble, because the full off string no longer exists. You have to be careful when pulling this trick; we only use this technique to insert material.

Sample: turning off underlining when $$h ends with ( $l )

BEX's center-and-underline command, $$h, can underline for quite a while if you forget to end the heading with a paragraph ( $p ) indicator. Using a transformation rule that partly overlaps the off string, you can insert the command to end underlining. Here's how:
Enter terminator: #
Find: #
Contextual replace
On string: $$h#
Off string: <space>$l <space>#
Find: d <space>$l#
Pattern: WbXX#
Change to: <space>$$uf<space>#
Find: #
Pattern: #
Continue? Y <CR>
Starting to replace ...
The first character in the find string, when paired with the boundary pattern code W, stands in for the last character of the heading. The next three characters are three-fourths of the off string; because the final <space> is not part of the find string, BEX has not yet encountered the full off string, and the transformation is possible.

Changing a Basic Transformation Chapter to Contextual

The transformation chapter named MAKE CON on your BEXtras disk helps you transform a basic Replace transformation chapter into a contextual transformation chapter. It places its own on and off strings, which are the same character, then removes them at the end. MAKE CON clearly shows just how far you can push contextual Replace! We won't attempt to explain how it works, but it does work.

Here's how you use it. Suppose you have a basic Replace chapter named HARPO whose terminator is vertical bar. As supplied on the BEXtras disk, MAKE CON assumes that the basic Replace terminator is vertical bar. Edit HARPO and insert two characters at the beginning: <control-A> <space>

Quit HARPO, and run Replace characters. HARPO is your source chapter, the target chapter is named ZEPPO and the transformation chapter is MAKE CON. Once the transformation is finished, edit ZEPPO. Delete the <control-A> <space> at the start, and insert three terminators there. Advance to the end and add two terminators. Presto! ZEPPO is now a contextual transformation chapter.

When your basic Replace chapter uses a different terminator character, then modify MAKE CON. MAKE CON'S own terminator is slash. When your basic Replace terminator is any character except slash, simply use basic Replace to change vertical bar to your terminator.

Part 6: Using Contextual Replace

Congratulations! You've plowed your way through some pretty heavy material. In this Part of Section 6, we discuss various techniques to help you make the most of contextual Replace. First, we discuss BEX's error checking of pattern strings. Then we provide guidelines for preventing data salad. We present some thoughts on "elegance" in transformation chapters: how to get the most done with the fewest rules. Finally, we offer some hints on creating contextual transformation chapters in the Editor, and making hard copy versions of contextual transformation chapters under development.

Here's a general hint to make contextual Replace more enjoyable: Obtain a memory expansion card, and configure an extended disk system with RAM drives. While RAM drives are always fun, they're especially useful when you're creating contextual transformation chapters. When BEX is reading and writing to a RAM drive, it only takes a minute or two to test the reliability of a transformation chapter. The less time required for you to test, the more likely that you actually perform the tests that guarantee smooth sailing.

Add comments to the end of your transformation chapters

When BEX encounters the three terminators that define the end of a contextual transformation chapter, BEX stops looking for more transformation rules. You can insert any text you wish after the three terminators. It's not always readily apparent what purpose a contextual transformation chapter serves, especially when it's been three months since you wrote it. We always add a sentence or two describing the function of the transformation chapter, as well as when it's best used, and the date it was last changed.

Planning is All

Even when you think you're a real contextual hotshot, it's possible to do truly idiotic things. While we were writing this Section, we thought we'd made a rule to automatically place BEX's underlining commands around single letters. It looked like this:
Enter terminator: #
Find: #
Contextual replace
On string: #
Off string: #
Find: <space>$p <space>#
Pattern: #
Change to: <control-P>#
Find: <space>$l <space>#
Pattern: #
Change to: <control-L>#
Find: <space> A <space>#
Pattern: Q^LQ#
Change to: $$ub<space>#
Find: $$ub<space> A <space>#
Pattern: XXXXBLQ^#
Change to: <space>$$uf<space>#
Find: $$p-1<space>#
Pattern: #
Change to: $$p-1<space>#
Find: $$p-1
<space>
#
Pattern: #
Change to: $$p-1
<space>#
Find: <control-L>#
Pattern: #
Change to: <space>$l <space>#
Find: <control-P>#
Pattern: #
Change to: <space>$p <space>#
Find: #
Pattern: #
The first two and last two rules work fine. The first two temporarily change the new-line and paragraph indicators to control characters, so the ( $l ) and ( $p ) are out of harm's way for the bulk of the rules. The last two rules undo the first two, transforming <control-P> and <control-L> back to ( $p ) and ( $l ). When you want to put particular strings of characters on a protected shelf during the Replace process, control characters are a great intermediate.

But the rules in between the first and last two did not work at all as we'd planned. Every time the letter a appeared as an article, as in "this move was a mistake," the article was underlined. Additionally, when a word ended with apostrophe s or apostrophe t, the last letter was split off and underlined. The third rule's pattern code of Q^LQ# was the source of the problem. We thought one rule could catch both single letters alone and single letters in quotes. But the final two characters in a word like don't also satisfied the pattern code. Fortunately, we used different source and target chapter names, so we just sighed and laughed at ourselves. (In fact, we were so embarrassed that we just made the changes manually. Everyone has a bad day sometime.)

The moral is: Never underestimate the power of contextual Replace to change something you never considered. When you're developing a contextual transformation chapter, never use the same name for source and target. Only when you've tested your chapter with a broad variety of data is it safe to use the S naming method.

Pattern String Errors

When you start out writing contextual transformation chapters, it's easiest to type changes directly. BEX prompts you for the find, pattern, and change to strings, so you're less likely to get lost. BEX also provides some rudimentary error-checking for your pattern codes. This error-checking happens both when you type changes directly and when BEX is executing rules from a transformation chapter on disk. How you recover from the error depends on when it occurs.

When you directly type a character that's not one of the 34 valid pattern codes, then BEX beeps twice, prompts Illegal character in pattern string and throws away both the find and bogus pattern strings you've just entered. BEX reprompts Find: to give you a chance to do it right.

When you directly type the wrong number of pattern codes in your pattern string, then BEX beeps, prompts Character counts are not equal and again discards the most recent find and pattern strings. Then BEX reprompts Find: to give you another chance. Finally, when your pattern string contains all boundary codes, but it doesn't contain the caret, then BEX beeps twice, prompts Pattern string needs caret or one departing code and ignores the most recent find and pattern strings. Again, BEX reprompts Find: and waits for your next attempt.

Enter intentional error to get a second chance

You can take advantage of these error-checking routines when you realize that your find or pattern strings are simply wrong. Suppose you enter the wrong characters in the find string; type <space><terminator> as a pattern string and BEX rebels, giving you an opportunity to try again.

When errors occur with a transformation chapter from disk

Whenever BEX finds problems with a pattern string, it always issues these error messages--even when the problems occur with a transformation chapter loaded from disk. When you first begin to write contextual transformation chapters in the Editor, you're guaranteed to run into this situation at least once. When your pattern strings are too long, too short, or contain invalid characters, then BEX issues one of the error messages, right in the middle of the act of replacing. Any and all correct transformation rules are executed for each BEX page until BEX encounters the faulty pattern string. Then BEX saves the first page, and goes on to the next page, again reporting the error when it encounters the faulty pattern string. In short, BEX executes as many rules as it can, saves the target chapter, and reports the number of times replaced.

An example may make this clearer. Suppose you created the following DUMB TRANS transformation chapter:

#### #b#@#99#===# ###
The first rule changes every <space> to an at-sign. The second rule's pattern string does not contain any of the 34 valid pattern codes. You ask BEX to use this DUMB TRANS chapter to Replace characters in the RESUME chapter. Once you specify the different target chapter, like RESUME-TEST, and commence replacing, you hear an enormous number of clicks as BEX changes all the spaces to at-signs. Then you hear two beeps and the Illegal character in pattern string error message. Next, you hear and see a BEX program function that's usually hidden from you; BEX saving the binary file of the target chapter page. The last thing you hear is Replaced 408 times after BEX has saved the imperfectly replaced RESUME-TEST chapter. If the RESUME chapter had two pages, you would hear the error message twice.

BEX issues this error message at the exact point when it tries to execute the transformation rule with the faulty pattern string. It's instructive to take a look at these half-baked target chapters; you may be able to see exactly where you erred. Once you're comfortable with editing transformation chapters, you can edit a faulty chapter and insert two extra terminators after a rule to end the transformation chapter prematurely. Run Replace again; if you don't get the error messages, then delete the extra terminators and insert them in the next rule.

The fastest way to type changes directly

Whenever you specify the Ready chapter by its right bracket name, BEX assumes that the Ready chapter contains data, even if it's empty. You can take advantage of this fact to quickly create a contextual transformation chapter. Specify the Ready chapter as both source and target chapter and type your changes directly. After you press <CR> at the Continue? Y prompt, BEX almost instantly tells you Replaced 0 times and allows you to save the transformation chapter. You can save it to disk, or even in the Ready chapter, and then edit the transformation chapter you've just created.

Elegance and efficiency in transformation chapters

The time Replace characters requires to process your chapters depends on the number and nature of rules in your transformation chapter. You want to have as few rules as possible, but you also want to transform as many characters as possible with each rule.

When we introduced the L and P pattern codes in Part 3, we described transformation rules that changed single letters in quotes to underlined single letters. The transformation rules there more-or-less got the job done, but there's a more elegant solution. The following set of rules is shorter and handles every combination of punctuation touching the quoted letter, even when the source text contains ("a," "b," and "c").
Enter terminator: #
Contextual Replace
On string: #
Off string: #
Find: "a"<space>#
Pattern: xLXB#
Change to: <control-T>$$ub<space>#
Find: "a".#
Pattern: xLXP#
Change to: <control-T>$$ub<space>#
Find: "a."<space>#
Pattern: xLPXB#
Change to: <control-T>$$ub<space>#
Find: <space><control-T>#
Pattern: xx#
Change to: <space>#
Find: $$ub<space>a"<space>#
Pattern: XXXXBLxB#
Change to: <space>#
Find: $$ub<space>a".#
Pattern: XXXXBLxP#
Change to: <control-T>$$uf<control-T>#
Find: $$ub<space>a."<space>#
Pattern: XXXXBLPxB#
Change to: <space>#
Find: #
Pattern: #

The first three rules change any initial quote touching a single letter to <control-T>$$ub<space> even though many situations won't require the initial touching token. The point here is that when you don't need the touching token, it's obvious. When underlining does not begin mid-word, the data looks like <space><control-T> and the fourth rule deletes this unnecessary <control-T>. The moral is: when a few rules can insert more than you need, you can then delete the "extra" later on in the transformation chapter.

The fifth, sixth, and seventh rules demonstrate find and pattern strings that search for the minimum required. It doesn't really matter whether the underlining begins mid-word or not, so the find string for these three rules begins with the initial dollar sign of $$ub.

Also in Part 3, we demonstrated the o pattern code with two rules that delete all JustText typesetting commands. These commands always begin with left brace and end with right brace. The two rules delete every character between a left brace and a right brace, then delete the left brace-right brace pair that's left. While this approach is remarkably economical, it's also quite slow. Replace characters ends up comparing every character in the source chapter against the transformation rule at least four times, so it requires quite a while to strip out the commands.

A much more efficient procedure would use a few more rules:
Find: {}}}}}#
Pattern: Xooooo#
Change to: #
Find: {}}}#
Pattern: Xooo#
Change to: #
Find: {}#
Pattern: Xo#
Change to: #
Find: {}#
Pattern: xx#
Change to: #
The first rule deletes a six-character JustText command in one swipe; the second deletes a four-character command; the third deletes any commands that are left, and the fourth rule again deletes the residual braces. The second version accomplishes the same task in around half the time of the first.
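
When you want to check the result of these rules outside BEX, the same "delete everything from a left brace through the next right brace" idea can be sketched with a regular expression. This is only an illustration in Python, not part of BEX: the function name is our own invention, the sample commands are made up, and the sketch assumes that commands never nest and that every left brace has a matching right brace.

```python
import re

# Hypothetical helper, not part of BEX.
# Assumption: commands do not nest, and each left brace has a closing right brace.
BRACE_COMMAND = re.compile(r"\{[^}]*\}")


def strip_brace_commands(text):
    """Delete each left brace, everything up to the next right brace, and that brace."""
    return BRACE_COMMAND.sub("", text)


if __name__ == "__main__":
    # The two commands below are invented samples, not real typesetting codes.
    print(strip_brace_commands("{abc12}Chapter One{xy} begins here."))
```

Comparing this output against the target chapter is one way to confirm that a set of stripping rules caught every command.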

Creating Contextual Transformation Chapters in the Editor

Whenever you type changes directly, make a habit of saving the transformation chapter, then editing it. (It's a good time to add the comments we mentioned above.) The more exposure you have to these chapters, the easier it becomes to create them from scratch. As we mentioned in Part 2, contextual transformation chapters contain every keystroke entered between the Enter terminator: and Continue? Y prompts. Whenever possible, use <CR> as your terminator. It makes moving around in the Editor much simpler since you can control-G and control-R between strings. To move to the next rule, you enter control-A 3 control-L.

The one essential prerequisite is knowing how many terminators you need. Every transformation rule has three terminators. The total number of terminators in a contextual transformation chapter is three times the number of rules, plus six. Four of that extra six are at the start of the chapter, and the last two finish up the list of rules.

Here's a quick review of the structure of a contextual transformation chapter. All contextual transformation chapters begin with two terminators. The on string, if you have one, goes between the second and third terminators. The off string, if you have one, goes between the third and fourth terminator. After these four terminators, the list of transformation rules starts.

Every rule follows the pattern of find string, terminator, pattern string, terminator, change to string, terminator. Remember the pattern string shortcut: an empty pattern string specifies an exact departing match for all characters in the find string.

The list of rules always ends with at least three terminators in a row. When your last change to string is empty, then the chapter ends with four terminators in a row. When your last pattern string uses the single terminator shortcut, and your last change to string is empty, then your chapter ends with five terminators in a row.
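
The structure just reviewed can be sketched as a small parser. This is a minimal Python illustration written for this discussion, not a BEX utility: the function name is hypothetical, and it assumes the terminator character never appears as data inside a string.

```python
def parse_contextual_chapter(chapter):
    """Split a contextual transformation chapter into its terminator, on string,
    off string, and a list of (find, pattern, change_to) rules."""
    if len(chapter) < 2 or chapter[0] != chapter[1]:
        raise ValueError("a contextual chapter begins with two terminators")
    terminator = chapter[0]
    # Everything after the two opening terminators, cut at each terminator.
    fields = chapter[2:].split(terminator)
    if len(fields) < 3:
        raise ValueError("missing the on string and off string terminators")
    on_string, off_string = fields[0], fields[1]
    rest = fields[2:]
    rules = []
    i = 0
    # An empty find string marks the end of the list of rules; each of the
    # three strings in a rule must be followed by its own terminator.
    while i + 3 < len(rest) and rest[i] != "":
        rules.append((rest[i], rest[i + 1], rest[i + 2]))
        i += 3
    return terminator, on_string, off_string, rules


if __name__ == "__main__":
    # One rule using the pattern string shortcut (empty pattern string).
    print(parse_contextual_chapter("||@@text|@@lines|a||B|||"))
```

An empty pattern string shows up as an empty middle item in the rule, which matches the shortcut described above.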

Sample: counting terminators in transformation chapters

Basic Replace characters can help you be sure that you have the correct number of terminators. In this sample, we are creating a WHITE contextual transformation chapter; its terminator is the vertical bar:
Main: R
Replace
Chapter: WHITE <CR>
Chapter: <CR>
Target chapter: WHITE <CR>
Use transformation chapter: <CR>
Enter terminator: <CR>
Find: |<CR>
Change to: |<CR>
Find: <CR>
Continue? Y <CR>
Starting to replace ...
All this replacement does is change the vertical bar terminator to itself. When it's finished, BEX announces Replaced # times and lets you save the transformation chapter. That # corresponds to the number of terminators in the transformation chapter; when it is divisible by three, then you have probably put terminators in all the right places.
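
The arithmetic behind this check is short enough to sketch in Python. The function name is hypothetical, and like the Replace trick itself, the sketch assumes the terminator never appears as data; a passing count means the terminators are probably right, not certainly right.

```python
def count_rules(chapter):
    """Return the number of rules implied by the terminator count, or None
    when the count cannot be three times the number of rules plus six."""
    if not chapter:
        return None
    terminator = chapter[0]          # the first character names the terminator
    count = chapter.count(terminator)
    extra = count - 6                # four opening terminators plus two closing
    if extra < 0 or extra % 3 != 0:
        return None
    return extra // 3


if __name__ == "__main__":
    print(count_rules("||||a|x|b|||"))   # nine terminators
    print(count_rules("||||a|x|b||"))    # eight terminators: one is missing
```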

Use <CR> as terminator while developing chapters

Even if you need to use the <CR> character as data in your rules, you can write a "development" version of the transformation chapter that uses <CR> as terminator. In addition to making it easier to move around in the Editor, when your terminator is <CR> it's much easier to make a hard copy printout of your work-in-progress.

Here's how: in your development version, use a different unique character to stand for <CR>. Suppose your transformation chapter does not contain any tildes. You put tilde in the development version wherever you want <CR> in your final transformation chapter. Once all your development work is complete, you use basic Replace characters on the development chapter to create the final version. In this case, you change the <CR> terminator to another character, like vertical bar, then change tilde to <CR>. Suppose your development chapter is named DEV and WHITE will be the name for the final transformation chapter. Once the DEV chapter is ready, you go through this dialogue:
Main: R
Replace
Chapter: DEV <CR>
Chapter: <CR>
Target chapter: WHITE <CR>
Use transformation chapter: <CR>
Enter terminator: #
Find: <CR>#
Change to: |#
Find: ~#
Change to: <CR>#
Find: #
Continue? Y <CR>
Starting to replace ...

Don't get dizzy! Using Replace characters to help you create contextual transformation chapters is so self-referential that you can feel the ground tilting beneath your feet.
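
For readers who find it easier to see the DEV-to-WHITE conversion as two character substitutions, here is a minimal Python sketch. It is an illustration only: the function name and its parameters are our own, and the tilde and vertical bar are simply the characters chosen in the sample above. The sketch makes both substitutions in a single pass, so a <CR> created from a tilde is never mistaken for a terminator.

```python
def dev_to_final(dev_chapter, stand_in="~", final_terminator="|"):
    """Turn a development chapter (terminator <CR>, stand-in for data <CR>)
    into the final chapter (new terminator, real <CR> as data)."""
    if final_terminator in dev_chapter:
        raise ValueError("the final terminator already appears in the chapter")
    # One pass: every <CR> becomes the final terminator,
    # and every stand-in character becomes a real <CR>.
    table = str.maketrans({"\r": final_terminator, stand_in: "\r"})
    return dev_chapter.translate(table)


if __name__ == "__main__":
    dev = "\r\r\r\ra~\r\rb\r\r\r"
    print(repr(dev_to_final(dev)))
```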

Finally, the Clipboard can be very helpful while creating contextual transformation chapters in the Editor. You can type the find string, then copy and insert it as the basis of the pattern string. Overwrite each of the find string letters with the pattern code you've chosen. When you write a series of rules that address slightly different boundary contexts, copy one rule then use it as the template for the next rule.

Making Hard copy Printouts

Because the pattern codes are so cryptic, it's often difficult to hold all the interactions in your head. In our experience, it's a lot easier to do your design work in a hard copy medium. As we mentioned in the STRIP DOLLAR sample in Part 3, the order of the transformation rules is often crucial to their success. When you're working in hard copy, you can think up all the possible rules, then shuffle them around to get the optimal order.

To make useful hard copy print or braille editions of your transformation chapters, you must change your terminator into a two-character sequence: a unique character plus <CR>. The <CR> makes each string a new line. The unique character alerts you to an empty string, which would otherwise be a blank line. When <CR> is used for data, then you must replace it with a different, printable character. For example, when your transformation chapter contains neither the vertical bar nor the lowercase letter z, change every appearance of <CR> as data to lowercase z. Then change your terminator character to vertical bar followed by <CR>.

You must also change any other control characters in your rules to printable characters, or you may find your printer doing some pretty strange things. When the spaces in your rules are important, change the space character to a printable character as well.

When your transformation chapters contain $$ commands, you have to "defuse" them in order to make a useful hard copy. Inserting $$z at the beginning of the transformation chapter does the trick. Your printer or brailler still receives the hard <CR> at the end of each string, so each string appears on a separate line.

Sample: printable version of FIX TEXT

Here's how you make a printable version of the FIX TEXT contextual transformation chapter on your BEXtras disk. It's a pretty stinky sample, since FIX TEXT contains <CR> s as data, several other control characters, and BEX $$ commands. (We have an ulterior motive for picking this sample; in Part 7 of this Section, we analyze how FIX TEXT works.)

First, edit the chapter and see what's there. To make a printable version, you must replace non-printing characters with printing characters. But you must choose printing characters that are not already present in the transformation chapter. If you don't, you won't be able to make sense of the printout.

The terminator is slash. There are four control characters: <CR>, <control-J>, <control-H>, and <control-T>. The simplest system is to substitute the plain letter for the control letter. Locate for lowercase m, j, h, and t: you get four beeps because those letters aren't in FIX TEXT. Great! That means you can substitute m for <CR>, j for <control-J>, h for <control-H>, and t for <control-T>.

There are many meaningful spaces in FIX TEXT. A good way to represent spaces is with the underbar. First, check to see if it's already used. When you locate for the underbar you get several hits, so that won't work. Try the asterisk character--none are present in FIX TEXT, so you can use the asterisk to represent spaces. You're ready to type these changes:
Main: R
Replace
Chapter: FIX TEXT <CR>
Chapter: <CR>
Target chapter: PRINTABLE FT <CR>
Use transformation chapter: <CR>
Enter terminator: <control-H>
Warning! Arrow keys used as data!
Enter terminator: #
Find: <CR>#
Change to: m#
Find: <control-H>#
Change to: h#
Find: <control-T>#
Change to: t#
Find: <control-J>#
Change to: j#
Find: /#
Change to: /<CR>#
Find: <space>#
Change to: *#
Find: #
Continue? Y <CR>
Starting to replace
Replaced 155 times
Save transformation chapter: SHOW CODES <CR>
Remembering back to User Level Section 8, the left and right arrow keys are generally used to correct your strings as you type them directly. When you need to include <control-H> and <control-U> in your strings, you press the left arrow at the Enter terminator: prompt, as shown above. From that point, you have to type your rules perfectly, since you can't use the left and right arrow to fix mistakes.

Before you can actually print the PRINTABLE FT chapter, you must edit it and insert the $$z command at the very beginning.
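
The two steps in this sample, checking that the stand-in characters are unused and then substituting them, can be sketched in Python. This is an illustration under stated assumptions, not a BEX tool: the function name is hypothetical, and the substitution table simply copies the choices made above (m, h, t, j, and asterisk, with slash as terminator). It does not insert the $$z command; that step is left to the Editor as described.

```python
# Stand-ins chosen in the sample above: <CR>, <control-H>, <control-T>,
# <control-J>, and space.
SUBSTITUTES = {"\r": "m", "\x08": "h", "\x14": "t", "\n": "j", " ": "*"}


def make_printable(chapter, terminator="/", substitutes=SUBSTITUTES):
    """Replace non-printing characters with printable stand-ins and put
    each string on its own line by following the terminator with <CR>."""
    # The Locate step: every stand-in must be absent from the chapter.
    clashes = sorted(set(substitutes.values()) & set(chapter))
    if clashes:
        raise ValueError("stand-ins already present in chapter: %r" % clashes)
    mapping = dict(substitutes)
    mapping[terminator] = terminator + "\r"
    # A single pass, so the <CR> added after each terminator is not turned into m.
    return chapter.translate(str.maketrans(mapping))


if __name__ == "__main__":
    print(repr(make_printable("////\r\r/xx/ ///")))
```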

Part 7: Inside FIX TEXT and SP2

FIX TEXT and SP2 are two contextual transformation chapters on your BEXtras disk. In this Part, we examine what they do and how they do it.

SP2 deletes all occurrences of two spaces, except when they appear at the end of a sentence. FIX TEXT reformats data that's been created by Input through slot and Read textfile. This data generally is formatted as if it's been printed to disk; more details on Input through slot are available in User Level Section 12. Textfiles are discussed in Learner Level Section 12 and User Level Section 10.

Inside FIX TEXT

Copy the FIX TEXT transformation chapter before you edit it, so you don't have to worry if you change something by mistake. When you have a hard copy print or braille device, print out the PRINTABLE FT chapter you created by following the instructions in Part 6.

What FIX TEXT does

FIX TEXT changes a blank line plus many spaces to lines beginning with the BEX format command $$c; changes two <CR> s to a BEX paragraph indicator; changes a single <CR> to a space; changes two spaces to one; and attempts to place BEX's underline begin and finish commands where needed.

Most of its work could be accomplished with plain Replace characters. However, since the rules for placing underlining commands had to be Contextual, all of FIX TEXT had to be written in contextual form. As supplied on the BEXtras disk, FIX TEXT does not contain on or off strings. After we examine how the rules work in detail, we discuss when on and off strings would be appropriate.

Rule-by-rule through FIX TEXT

The 16 rules in FIX TEXT are divided into two broad categories. The first six rules deal with <CR> s and spaces. In these rules, each find string is followed by two slashes; the pattern string shortcut that means an exact, departing match for every find string character.

The first rule deletes linefeeds, also known as <control-J> s. Many IBM programs end each line with two characters, <CR><control-J>, while most Apple programs end each line with just <CR>. By deleting the <control-J> s first, subsequent rules apply equally to both types of line endings.

The second rule changes two <CR> s followed by 11 spaces into two <CR> s followed by BEX's $$c centering command. We settled on 11 spaces as a good solution through trial and error. If the number of spaces is too small, then the rule would be satisfied by the beginning of every paragraph. When your source material contains deeply indented paragraphs then FIX TEXT would think every line should be centered. But if the number is too large, then only very short headings satisfy the rule, and you would need to place the $$c commands manually.

Rule three changes three <CR> s to two <CR> s, and rule four turns two <CR> s into BEX's paragraph ( $p ) indicator. We assume here that the source text always has at least one blank line between paragraphs. Now that we've defined the paragraphs, all other <CR> s are superfluous; rule five changes any <CR> s that are left into a space.

The sixth rule changes two spaces to one space. When you want to have two spaces at the end of sentences, use the SP2 transformation chapter on the BEXtras disk after you've used FIX TEXT.
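
The order of these six rules matters, and a short Python sketch makes the dependence visible. Several things here are assumptions rather than facts about FIX TEXT: we assume each rule makes one pass over the whole text before the next rule starts, and the exact change to strings for rules two and four (the spacing around $$c and $p) are our guesses from the description above, not copies of the chapter on the BEXtras disk.

```python
CR = "\r"

# (find, change to) pairs in the order described above.
# The spellings of the $$c and $p replacements are assumptions.
WHITESPACE_RULES = [
    ("\n", ""),                                 # 1: delete linefeeds
    (CR + CR + " " * 11, CR + CR + "$$c "),     # 2: blank line plus 11 spaces
    (CR + CR + CR, CR + CR),                    # 3: three <CR>s to two
    (CR + CR, " $p "),                          # 4: two <CR>s to paragraph
    (CR, " "),                                  # 5: any remaining <CR> to space
    ("  ", " "),                                # 6: two spaces to one
]


def apply_in_order(text, rules=WHITESPACE_RULES):
    """Apply each rule once, in order, over the whole text."""
    for find, change_to in rules:
        text = text.replace(find, change_to)
    return text


if __name__ == "__main__":
    sample = "one\r\ntwo\r\n\r\nthree"
    print(repr(apply_in_order(sample)))
```

Swapping rules four and five in the list shows why the order is crucial: once every <CR> has become a space, there is nothing left to mark a paragraph.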

The remaining ten rules cope with underlining and are fully contextual. To make sense of these rules, you must first analyze what underlined text looks like. We assume that the source computer accomplishes underlining in the same way BEX does: print a character, then a backspace <control-H> command, then the underbar character. The combination of <control-H> and underbar creates the underlining.

There are three possible contexts: when underlining begins; when underlining ends, and underlining "midstream."

Understanding midstream underlining provides the key: <control-H>, underbar, character, <control-H>, underbar, character, over and over. We want to delete the <control-H> s and underlines in this context.

The pattern for the beginning of underlining is defined as not like midstream underlining. When underlining starts, the first four characters are a character that's not an underbar; then the underlined character; <control-H>; underbar. We want to place BEX's $$ub underline begin command here.

When underlining ends, the final characters are: last underlined character; <control-H>; underbar; then a word delimiter, like <space>, <CR>, or punctuation. We want to place BEX's $$uf underline finish command here. If the word delimiter is not a <space>, then we want to place <control-T> s around the underline finish command. (The <control-T> touching token is explained in Section 5, Part 1. It can replace the initial and final spaces in BEX $$ format commands.)

Rule seven searches for the beginning of underlining: it inserts dollar, dollar, lowercase u, lowercase b, <space> when the data does not follow the midstream pattern. The combination of the find and pattern strings specifies any character except underbar, followed by three boundary characters: the total wildcard W, then exactly <control-H> and underbar. The caret appears after the boundary O, so the command is inserted before the first underlined character.

The seventh rule inserts the command right next to punctuation. Suppose you underline the first syllable of the word despair enclosed in parentheses; your source data looks like this:
(d <control-H>_e <control-H>_s <control-H>_pair)
the first four characters match the seventh rule, so your data is transformed to
($$ub<space>d <control-H>_e <control-H>_s <control-H>_pair)
The eighth rule changes the $$ub<space> to <control-T>$$ub<control-T>. The initial <control-T> means BEX's formatter recognizes the underline begin command; the final <control-T> means the Grade 2 translator recognizes the end of the $$ command, so the following characters are translated.

Rule nine deletes all <control-H>-underbar pairs that occur midstream. The initial <control-H> and underbar are paired with lowercase x, so they depart from the target chapter. Because the boundary patterns are total wildcard followed by exactly <control-H> underbar, this rule does not match the last underlined character in a word.

The tenth rule matches the last underlined character in a word, because the boundary character in the pattern string is D, standing for space or <CR>. The <control-H> and underbar are replaced with space, dollar, dollar, lowercase u, lowercase f. BEX's underline finish command requires a final space or <CR>; it's supplied by the character in the source chapter that matched boundary code D.

At this point, the only situation where <control-H>-underbar pairs remain in the source chapter is the pattern of last underlined character, <control-H>, underbar, punctuation. Rule 11 matches this situation. The question mark in the find string is paired with the E boundary code, meaning match everything except a <space> or <CR>. The <control-H> and underbar that depart are replaced with the underline finish command, preceded and followed by the <control-T> touching token.

Rule 12 gets rid of underline begin immediately followed by underline finish. This pattern appears when the formatter that did the underlining in the source chapter suppressed underlining for the space between words.

The remaining four rules undo the effect of rule 11 for four punctuation marks. We don't need to bother with the touching token for period, comma, semicolon, or colon because BEX's selective punctuation format command $$sp prevents underlining these four punctuation marks when they appear at the end of a word.
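
The three underlining contexts (begin, midstream, and end) can also be sketched in Python with a regular expression. This is not a transcription of FIX TEXT's ten rules: it is a simplified illustration that assumes each underlined character is stored as character, <control-H>, underbar, and it leaves out the clean-up done by rule 12 and by the final four punctuation rules. The function name is hypothetical.

```python
import re

TOUCH = "\x14"   # <control-T>, the touching token

# A run of underlined characters: character, <control-H>, underbar, repeated.
UNDERLINED_RUN = re.compile(r"(?:[^_\x08]\x08_)+")


def convert_underlining(text):
    """Wrap each run of backspace-underbar underlining in $$ub and $$uf."""
    def replace_run(match):
        letters = match.group(0)[::3]        # keep every third character
        # Treat the start and end of the text like a space (an assumption).
        before = text[match.start() - 1] if match.start() > 0 else " "
        after = text[match.end()] if match.end() < len(text) else " "
        # After a space or <CR>, plain $$ub<space>; otherwise touching tokens.
        begin = "$$ub " if before in (" ", "\r") else TOUCH + "$$ub" + TOUCH
        # Before a space or <CR>, that character supplies $$uf's final space.
        finish = " $$uf" if after in (" ", "\r") else TOUCH + "$$uf" + TOUCH
        return begin + letters + finish
    return UNDERLINED_RUN.sub(replace_run, text)


if __name__ == "__main__":
    print(repr(convert_underlining("(d\x08_e\x08_s\x08_pair)")))
    print(repr(convert_underlining("a w\x08_o\x08_r\x08_d\x08_ here")))
```

Because the sketch treats every run separately, a phrase whose spaces are not underlined comes out as several begin and finish pairs, which is exactly the situation rule 12 exists to tidy up.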

Turn Off FIX TEXT for Tables

When your source text contains tables or columns, FIX TEXT's rules would create a disaster. Suppose you have a table with 6 columns and 15 rows printed to an 80-character carriage width. If it's single-spaced, FIX TEXT changes the <CR> s that define the end of each line to a space--and presto, you lose the line-oriented structure of the data. FIX TEXT would also change the multiple spaces between each column into a single space. Since FIX TEXT is a contextual transformation chapter, you can simply switch off the execution of its rules for the line-oriented portions of the text.

Since FIX TEXT as supplied on the BEXtras disk does not contain on or off strings, you can insert whatever characters you prefer. The terminator in FIX TEXT is the slash. The first slash in the chapter announces "Slash is the terminator." The second slash announces, "This is a contextual rather than plain transformation chapter." The third and fourth slashes are place holders for on and off strings respectively.

For the sake of example, choose @@text for the on string and @@lines for the off string. Edit a copy of FIX TEXT. Place your cursor on the third slash; press control-I, type the six characters @@text and press right arrow. Now press control-I, type the seven characters @@lines and press right arrow again.

Place these on and off strings where appropriate in your source chapter before using FIX TEXT. Whenever a transformation chapter contains on or off strings, Replace begins off. None of the rules are executed until the program encounters the first on string. So you must insert @@text at the first occurrence of non-tabular material in your source chapter. When you encounter a table, put @@lines immediately before it and @@text immediately after it. Now you can use Replace characters on the modified source chapter, with the modified FIX TEXT as your transformation chapter; it won't destroy the format of tables.
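
The switching behavior described here can be sketched in Python. This is a minimal illustration with hypothetical names, not BEX's own logic: it assumes both strings are non-empty, that Replace begins off, and that the on and off strings themselves pass through untouched (which fits the later step of deleting them with a separate transformation chapter).

```python
def replace_when_on(text, on_string, off_string, transform):
    """Apply transform only to the text between an on string and the next
    off string; everything else is copied unchanged."""
    if not on_string or not off_string:
        raise ValueError("this sketch needs both an on string and an off string")
    pieces = []
    pos = 0
    active = False                      # Replace begins off
    while pos < len(text):
        marker = off_string if active else on_string
        hit = text.find(marker, pos)
        stop = len(text) if hit == -1 else hit
        chunk = text[pos:stop]
        pieces.append(transform(chunk) if active else chunk)
        if hit == -1:
            break
        pieces.append(marker)           # the switch string itself is kept
        pos = hit + len(marker)
        active = not active
    return "".join(pieces)


if __name__ == "__main__":
    sample = "ab@@textcd@@linesef@@textgh"
    # str.upper stands in for "execute all the transformation rules".
    print(replace_when_on(sample, "@@text", "@@lines", str.upper))
```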

To really automate the process, you write a transformation chapter specifically designed to reformat tables; in this chapter, make @@lines the on string and @@text the off string. The last step is removing @@lines and @@text from the transformed chapter. Naturally, you write a transformation chapter that just deletes the on and off strings.

When to Use SP2 and What It Does

Correctly formatted braille and typeset print always contains just one space at the end of a sentence. On the other hand, typewritten print contains two spaces at the end of a sentence. When you enter text with two spaces, it's child's play to delete the extra space before you make braille or typeset text. But when your original text contains just one space after a sentence, it's massively boring to insert the extra space by hand to make correctly formatted print.

The contextual transformation chapter named SP2 is on the BEXtras disk. (We originally thought of calling it SPACE THE FINAL FRONTIER, but that's a lot to type.) Use SP2 whenever you want to create appropriately formatted typewriter-like print output. The rules in this chapter delete all occurrences of two spaces, then add two spaces after sentences, then delete two spaces after abbreviations.

Before you examine SP2 in the Editor, make a copy. Its terminator is <CR>. To make a print or braille hard copy, follow the suggestions in Part 6; replace space with some printable character. The discussion that follows assumes you have hard copy in hand.

Before we analyze each of the 24 rules in this chapter, let's sit back and think about how sentences appear in text. Since we're going to be talking about sentences a lot, we use a shorthand: S1 means the first sentence, and S2 means the second. SP2 only deals with the spaces that come between one sentence and the next sentence, or in shorthand, the spaces between S1 and S2. When S1 and S2 are separated by a <CR>, ( $l ), or ( $p ), then the number of spaces between them is irrelevant. Any number of spaces at the end of S1 just become trailing spaces at the end of a line. What other contexts can exist for this transition?

The first character in S2 is either an uppercase letter or punctuation, like a single or double quote. It's unlikely that S2 begins with a digit, as most style books frown on this. (When a number comes first in a sentence, most style books recommend you spell out the number in words.) The last character in S1 is always punctuation.

However, we can't define that punctuation with the P pattern code, because that code includes a number of symbols that can appear within a sentence. The percent sign and the asterisk are just two samples of symbols frequently found in the middle of a sentence. Without the P code, we must develop a list of possible sentence-ending punctuation--and that's why SP2 contains so many rules.

The final analytical challenge concerns abbreviations. The period character does double duty. Most sentences end with period, but period often appears inside a sentence in abbreviations like Mrs. and St. SP2 tries to define several patterns of letters that abbreviations can follow; SP2 attempts to delete two spaces when they follow these patterns and end with a period.

The next-to-last sentence in the previous paragraph is a sample of a situation where SP2 fails. The period serves as the end of the St. abbreviation and as the end of the sentence. There are several other contexts where SP2 is overwhelmed: a computer program is no match for a human being when it comes to subtle tasks like deciding where a sentence ends. We point out these problems as we examine SP2 in detail.

Rule-by-rule through SP2

The terminator in SP2 is <CR>, making it easy to move through the rules in the Editor. Although SP2 is contextual, it does not use on or off strings. The first rule changes two spaces to one: all subsequent rules insert an extra space where appropriate. Rules 2 through 15 parallel rules 16 through 26. The first bunch define S2 as beginning with an uppercase letter; the second bunch define S2 as beginning with punctuation followed by uppercase letter.

Rule two defines the most common arrangement of characters between S1 and S2: period, space, uppercase letter. The single departing pattern code b is changed to two spaces. The character preceding the period could also be punctuation, for example, the British method of placing the period outside quoted material. Rule 16 is parallel to rule 2--S1 is the same and S2 changes. For rule 16 S2 begins with punctuation followed by uppercase letter.

Rule three defines S1 as ending with question mark instead of period; rule 17 shows S1 ending with question mark and S2 beginning with punctuation. Rules four and 18 show S1 ending with exclamation mark.

Rules five through 12 and 19 through 26 specify various sentences that end with two punctuation characters.

Rules 13 through 15 are the abbreviation-handlers. Thirteen deletes the extra space after the honorary Mrs. regardless of the case of the letters r and s. Rule 14 is more general: the pattern codes match any two-letter abbreviation where the first letter is uppercase. So in addition to Dr. this rule matches MR., St., AV., Ln. and literally hundreds of others. Because the first pattern code is boundary Q, this rule matches an abbreviation with touching punctuation. Rule 15 matches a single uppercase letter, like those found in lettered outlines.

There are many possible abbreviations, and these three rules don't come anywhere near matching all of them. But it's important to compare the benefits gained with the effort required to write hundreds of rules. When SP2 fails, all that happens is there's an extra space in your text, which is not life-threatening.

The final rule matches S1's that end with BEX's underline finish command. The space before the first dollar sign is printed, but the space after the lowercase f is not. The final rule adds that extra space.
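
The overall strategy of SP2 (squeeze, then add, then take back) can be sketched in Python with regular expressions. This is a rough illustration, not a rule-for-rule copy of SP2: the function name is hypothetical, the lists of closing and opening punctuation are our own short guesses, and the underline finish case handled by SP2's final rule is left out.

```python
import re

# S1 ends with . ? or ! plus an optional closing quote or parenthesis;
# S2 begins with an optional opening quote or parenthesis, then uppercase.
SENTENCE_GAP = re.compile(r"""([.?!]["')]?) (?=["'(]?[A-Z])""")

# Mrs. in any case of r and s, any two-letter abbreviation with an
# uppercase first letter, or a single uppercase letter.
ABBREVIATION = re.compile(r"\b(M[Rr][Ss]\.|[A-Z][A-Za-z]\.|[A-Z]\.)  ")


def sp2_like(text):
    """Leave two spaces between sentences and one space everywhere else."""
    text = text.replace("  ", " ")            # first: two spaces to one
    text = SENTENCE_GAP.sub(r"\1  ", text)    # then: add a space after sentences
    text = ABBREVIATION.sub(r"\1 ", text)     # last: take it back after abbreviations
    return text


if __name__ == "__main__":
    print(sp2_like("Dr. Smith met Mrs. Jones. They talked."))
```

Like SP2, the sketch is fooled when an abbreviation also ends a sentence; it removes the extra space there too.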

Part 8: Contextual Replace Reference

Enter terminator at first Find: prompt to get into contextual Replace. Number of terminators in contextual Replace transformation chapter must be divisible by three. Total number of terminators is three times the number of transformation rules plus six. First two characters are terminators, then (optionally) on string, terminator, off string, terminator. When on string is present, then contextual Replace begins off. End list of rules with three terminators.

Lowercase Departing Pattern String Codes

Uppercase Boundary Pattern Codes

Special Pattern Codes

Pattern string shortcut - Entering terminator alone at Pattern string: prompt means an exact match, equivalent to pattern code of all lowercase x

Punctuation as defined by pattern code P

(When the Echo uses a different term from standard practice, that term is included in parentheses):