if yr cmptr cn rd ths...

By Shay Addams (Computer Entertainment, August 1985, pages 24-27 & 76-77).

Software companies mobilize their syntax as the Parser War heats up.

In war, weapons are only as good as the battle they're used in. For example, taking a lance to the Battle of Normandy might have provided our valiant men with a good laugh but it wouldn't have been particularly effective. In the Parser Wars - the war between the software companies making interactive text programs - words and how well they're understood are the weapons. Primitive parsers can only understand two-word commands ("get sword", e.g.) and often even that is too much for them. The quest to invent a parser that understands complex, complete sentences goes on apace. The pen may be mightier than the sword, but in this war, the parser has to be mightier than the "get sword."

The parser is the part of the program that reads a player's typed-in command by comparing the words with those that the programmer has included in the game's vocabulary. If it finds a match, the parser then sends numbers representing those particular words to another section of the program so it can respond with the appropriate message or picture which is stored in the data base on the disk. Loosely defined, the parser is the lifeblood of a text program.
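
As a rough illustration of the lookup just described, here is a minimal Python sketch. The vocabulary, the word numbers and the function name tokenize are all invented for this example; the article does not say how any particular game numbers its words.

```python
# Hypothetical vocabulary: each known word maps to a number.
# Synonyms share a number so the rest of the program sees a single meaning.
VOCABULARY = {
    "get": 1, "take": 1,
    "drop": 2,
    "sword": 100,
    "rock": 101,
}

def tokenize(command):
    """Return the numbers for every word in the command found in the vocabulary."""
    numbers = []
    for word in command.lower().split():
        if word in VOCABULARY:
            numbers.append(VOCABULARY[word])
    return numbers

print(tokenize("Get sword"))
print(tokenize("take the rock"))
```

In this sketch, words that are not in the vocabulary (such as "the") are simply skipped, and the list of numbers is what would be handed on to the part of the program that picks a response.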

In Cambridge, Massachusetts, Spinnaker's engineers leaped into a foxhole, where "they practically locked themselves in a room for months and only came out for air," according to Seth Godin, who spearheaded Spinnaker's line of Telarium adventures. On the opposite coast, Synapse programmers hammered away at an adventure-specific language called BTZ (Better Than Zork) for 18 months, half of which they devoted to working on the parser. (See CE, June, 'Building a Better Zork.') Projects by other companies quickly escalated this software version of the arms race, which erupted into a full-scale Parser War when the resulting games finally met head-to-head on the battleshelves of Software City, Crazy Eddie's and Toys 'R' Us. Why so much effort to build a better parser? For one, an intelligent parser is the hallmark of the most successful company in the field, Infocom. Equally important, the latest vogue in adventure games stresses interaction and conversation with characters over the typical object-manipulation involved in "look rock" and "go taxi" adventures.

PARSER, SPEAK MY TONGUE

"From day one," Godin recalls, "we decided to go with as sophisticated a parser as we could, because it's important to simulate realistic character interaction as closely as possible." This meant developing a full-sentence parser that, like Infocom's state-of-the-art counterpart, could "understand" more parts of speech than the elementary two-word parsers that restrict the player to nouns and verbs. The parsers in the early Telarium releases, games like Fahrenheit 451 and Dragonworld, met this criterion, but were comparatively slow when dealing with multiple commands like "get the rock then go west." Godin says they're continually upgrading the parser with each new game.

"We're improving the parser so the player can effortlessly talk to the people in the game." He says the latest model can deal with "at least ten parts of speech including adjectives, nouns, verbs, and direct and indirect objects. Right now, our parser can find seven different words and act on them: 'Mr. Burns, how many unaccompanied guests entered the building on Thursday?' This is important, because our new programs require you to ask questions of the characters. In The Nine Princes of Amber [now under development and based on Roger Zelazny's popular sci-fi series] everything hinges on character interaction; there's no inventory." Like the other Telarium adventures, it will be a graphic adventure.

Brian Fargo, the 22-year-old programmer whose development house, Interplay Productions, created Mindshadow and Tracer Sanction for Activision, agrees on the significance of character interaction in adventures. "We're shooting for a more sophisticated parser that enables you to say things such as 'who are you, where is the ice cream?' to characters." He and an associate designed the current parser, which understands about 250 nouns and 200 verbs as well as prepositions and indirect objects. (A parser that understands indirect objects permits the player to say things like "hit rock with flint" and "give money to writer.") Fargo elaborates: "We're expanding the number of combinations of parts of speech the parser will understand, as well as boosting the vocabulary." Another big part of the parser's job is error handling: how the parser responds to a command it doesn't understand. If you say "turn the wheel" and the program says, "Please rephrase that," you don't know whether it doesn't understand the word "turn" or the word "wheel." The more courteous parsers usually identify the unknown word, sparing a player at least some of the frustration usually associated with playing an adventure game.
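
The difference between a curt parser and a courteous one can be sketched in a few lines of Python. This is only an assumed illustration: the word list, the function names and the wording of the replies are hypothetical, not taken from any of the games mentioned.

```python
# Hypothetical list of words this toy game understands ("wheel" is missing).
KNOWN_WORDS = {"turn", "the", "get", "sword"}

def first_unknown_word(command):
    """Return the first word not in KNOWN_WORDS, or None if all are known."""
    for raw in command.lower().split():
        word = raw.strip(".,!?")  # ignore trailing punctuation
        if word and word not in KNOWN_WORDS:
            return word
    return None

def respond(command):
    """Name the offending word instead of saying 'Please rephrase that.'"""
    unknown = first_unknown_word(command)
    if unknown is None:
        return "OK."
    return f"I don't know the word '{unknown}'."

print(respond("Turn the wheel."))
```

Here the player learns that "wheel" is the problem and can try a synonym, rather than guessing which half of the command failed.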

Another familiar frustrating situation arises when an adventurer discovers he can't use a word that the game just used in a description. The game may say, "You see a cowering accountant," but when you type in "Kill the cowering accountant," the program responds "I don't know the word 'cowering'." That's because the original message "You see a cowering accountant" is stored in the program's data base (which holds all the game's descriptive passages) but isn't part of the game's vocabulary, which is stored in another section called the word tables. The parser works by comparing each word in your command with those in the word tables, not those in the data base of descriptive text. In other words, even the best parser doesn't always have the foggiest idea what it just said.
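
One way a designer might catch this mismatch before the player does is to compare the descriptive text against the word tables. The Python sketch below assumes invented data and an invented helper name, words_missing_from_table; the article does not say that any company used such a check.

```python
import string

# Hypothetical descriptive text (the "data base") and word table.
DESCRIPTIONS = ["You see a cowering accountant."]
WORD_TABLE = {"you", "see", "a", "kill", "the", "accountant"}

def words_missing_from_table(descriptions, word_table):
    """List words the game prints but the parser would not recognize."""
    missing = set()
    for text in descriptions:
        for raw in text.lower().split():
            word = raw.strip(string.punctuation)
            if word and word not in word_table:
                missing.add(word)
    return sorted(missing)

print(words_missing_from_table(DESCRIPTIONS, WORD_TABLE))
```

With this sample data the only word reported is "cowering", the same word the parser in the example above refuses to accept.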

PENGUIN'S NEW PARSER

Penguin Software has been turning out adventure games regularly since releasing Transylvania in 1981. Appropriately, Penguin's first adventure to incorporate a full-sentence parser will be the sequel, Crimson Crown: Transylvania II. The new parser was over a year in the works and can cope with many more parts of speech in various combinations than could the original. According to Penguin President Mark Pelczarski, this expands by "at least 10% the types of problems a designer can put in a game." Prepositions, which make it possible to say "look under the rock" or "behind" it rather than simply "look rock," are especially important in this regard. Pelczarski concedes the now-raging Parser War is one reason for Penguin's upgrade to a full-sentence parser, saying "I suppose it is a selling point," but he says the main reason for doing so was "to take away the computer, make it more transparent, so people could use standard English and concentrate on the adventure."

Pelczarski worked for several months on conceptual groundwork before Jeffrey Jay joined the staff. A college student who converted The Quest for the Atari and Commodore 64 before going to work full-time, Jay "got involved in May of 1984, when we were deciding how we wanted it to handle things and started designing it on paper. By September, JJ had started on the actual coding, and the programming itself was disgustingly simple." Crimson Crown also introduces Penguin's adventure language, COMPREHEND. Written in assembly, it's an adventure-specific language with keywords that function like BASIC's "AND," "FOR," and "IF". "It makes it a lot easier to write the games," explains Pelczarski. "With COMPREHEND, the game is actually written with AppleWrite or another Apple word processor." COMPREHEND, which houses the parser, turns that file into the final adventure. As Jay sees it, "We've removed the programming aspect from the writing of the game, so that when it's being created you can concentrate on the game itself. The programming is taken care of." A key strength of COMPREHEND, shared by Spinnaker's SAL and other adventure-specific languages, is that it enables Penguin to release a new game for several computers without doing lengthy conversions from one language to another. A custom interpreter for each system translates the COMPREHEND program into whatever language is needed for the different machines.

SIERRA STRIKES BACK

Sierra, whose classic graphic adventures always relied on a simple two-word parser, upgraded their parser when they released King's Quest in the summer of 1984. "With the success of Infocom, quite frankly, we came up with a full-sentence parser to remain competitive," John Williams explains. "It was a mixed bag for us. With a two-word parser you have more control over the player, because he can't type in as many oddball or unanticipated combinations of words. The advantage of the full-sentence parser is that people think in full sentences, not two-word phrases, so it's more natural." Sierra drafted Arthur Abrahams, then immersed in programming submarine systems, for the eleven-month project of engineering a full-sentence parser. The final product could handle indirect objects, so a player could say things like "give food to the man." In previous adventures, a player had to say "give food," then type in "to man" after the program asked him who he wanted to give it to.
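
To make the indirect-object idea concrete, here is a small Python sketch that pulls a verb, a direct object, a preposition and an indirect object out of a command. The word lists and the parse function are assumptions made for illustration; this is not Sierra's parser.

```python
# Hypothetical word lists for a toy parser.
ARTICLES = {"the", "a", "an"}
PREPOSITIONS = {"to", "with"}

def parse(command):
    """Split a command into verb, direct object, preposition, indirect object."""
    words = [w for w in command.lower().split() if w not in ARTICLES]
    result = {"verb": None, "direct": None, "prep": None, "indirect": None}
    if not words:
        return result
    result["verb"] = words[0]
    rest = words[1:]
    for i, word in enumerate(rest):
        if word in PREPOSITIONS:
            # Everything before the preposition is the direct object,
            # everything after it is the indirect object.
            result["prep"] = word
            result["direct"] = " ".join(rest[:i]) or None
            result["indirect"] = " ".join(rest[i + 1:]) or None
            return result
    result["direct"] = " ".join(rest) or None
    return result

print(parse("give food to the man"))
print(parse("give food"))
```

When the indirect object comes back empty, as in the second call, a game in the older style would have to stop and ask the player who should receive the food.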

Abrahams eventually returned to the sea, but Sierra's staff improved on the parser for King's Quest II: Romancing the Throne. "It understood adjectives before," says Williams, "but not nearly as well as now. And one of the biggest changes is in the use of the word 'and': it was ignored by the previous parser, but now you can say 'Get this and that' or 'Get this and kill the creature'." These are referred to as multiple commands, which essentially are a time-saving convenience that has been part of the Infocom games since the original Zork. (Even the Scott Adams adventures, which still use a two-word parser, now accept multiple commands for giving directions: North, East, East, etc.) "Every time we write a game we try to outdo ourselves," Williams says, "so you can expect to see improvements in the parser of our next adventures."
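
A minimal Python sketch of how multiple commands might be broken apart follows. The verb list, the separator words and the rule that a chunk without a verb reuses the previous verb are all assumptions for this example, not a description of Sierra's or Infocom's code.

```python
# Hypothetical verb list and separator words.
VERBS = {"get", "kill", "go"}
SEPARATORS = {"and", "then"}

def split_commands(line):
    """Break one typed line into a list of single commands."""
    chunks, current = [], []
    for word in line.lower().split():
        if word in SEPARATORS:
            if current:
                chunks.append(current)
            current = []
        else:
            current.append(word)
    if current:
        chunks.append(current)

    commands = []
    last_verb = None
    for chunk in chunks:
        if chunk[0] in VERBS:
            last_verb = chunk[0]
        elif last_verb is not None:
            # "get sword and lamp": the second chunk borrows the verb "get".
            chunk = [last_verb] + chunk
        commands.append(" ".join(chunk))
    return commands

print(split_commands("get sword and lamp"))
print(split_commands("get the rock then go west"))
```

Each resulting command can then be handed to the ordinary single-command parser, which is why the feature is mainly a convenience for the player.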

Sierra has been in the graphic adventure business since Roberta Williams created the genre in 1980 with Mystery House, so their development of a full-sentence parser was predictable. What surprised most "adventure-watchers" was the emergence of Imagic and Synapse as serious contenders in the fight. Imagic, like Activision, had racked up high scores in the now-battered Atari VCS market, but their first graphic adventures, I, Damiano and Another Bow (marketed by Bantam) have high-calibre parsers that give them a shot at victory in this battle zone. The latter, a Sherlock Holmes tale, uses a conventional parser, but writer/designer Peter Golden says "we found from the Holmes experience a way to create D," yet another adventure-specific language. "It took Mark Klein at least a year to develop D and its parser, which uses a modified Eliza-type routine that figures out what the input is and comes up with an appropriate response."

Eliza is a 1960s all-text program that imitates a psychologist. If you type in "I hate my sister," Eliza will find the keywords 'hate' and 'sister', then ask "Why do you hate your sister?" It doesn't really understand what you're saying the way a traditional parser or an orthodox Freudian does; it just turns your words around. A conventional parser gives up if it doesn't understand a command, and may or may not tell you which word is not in its vocabulary. The new Imagic parser doesn't stop there, but searches the command for various keywords. "We wrote sentences that were appropriate for the keywords," which the program displays rather than repeating the phrase 'I don't understand that.' Currently completing a novel, Peter Golden wrote short fiction for ten years before enlisting in Imagic's "adventure army" and feels that "the parsing must be more intelligent if interactive fiction is to ever reach the mass market - if the art is to be pushed to its limit."
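
The keyword fallback Golden describes can be sketched roughly as follows in Python. The keywords and the canned sentences are invented for illustration and do not come from Imagic's games or from the D language, whose internals the article does not describe.

```python
# Hypothetical keyword table: (keyword, sentence to display).
KEYWORD_RESPONSES = [
    ("watson", "Watson looks up from his newspaper."),
    ("pipe", "Holmes taps his pipe thoughtfully."),
]
FALLBACK = "I don't understand that."

def keyword_reply(command):
    """Return the canned sentence for the first keyword found in the command."""
    words = {w.strip(".,!?'\"") for w in command.lower().split()}
    for keyword, reply in KEYWORD_RESPONSES:
        if keyword in words:
            return reply
    return FALLBACK

print(keyword_reply("Watson, what do you make of this?"))
print(keyword_reply("juggle the teacups"))
```

In a real game this routine would presumably run only after the conventional parser had failed, so that the stock refusal appears only when no keyword matches at all.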

The gifted parser in the Synapse series of Electronic Novels works along the same lines. According to Richard Sanford, "No parser can handle the language and resolve all ambiguities, so we came up with something else, a keyword concept based on certain built-in filters. We do as much of that [conventional parsing] as possible, but we may only have to parse part of a command and leave the rest to the filters. They look to see how a phrase involving a keyword object might tend to have meaning in this fictional world (the context of the game)." In Mindwheel and Essex, the first games in the series, this keyword approach enables the reader/player to converse with other characters in ways no other adventure has ever permitted. You can elicit reasonable responses to questions like: "Thug, what lies east of here?" for example, or "Singer, how can you help me?" "There are aspects of the Eliza approach that we've used as part of an eclectic combination of tools," Sanford continues, "and we couldn't do it without BTZ, which is responsible for codifying and assembling - pairing nouns with verbs, checking for prepositions and conditional modifiers such as adjectives and adverbs." Sanford says "the parser is still being refined, and will continue to be refined. Right now, there are certain ambiguities with combinations in compound sentences. We're cleaning up things like this and trying to make it even smoother."

INFOCOM - ON THE DEFENSIVE FOR THE FIRST TIME?

"You call that parsing?" snorts Marc Blank, chief architect of Infocom's parser and co-author of ZIL, the first adventure-specific language. He dubs Synapse's innovation "the Big Lie parser: it hardly understands anything, but it fools you into thinking it does." Blank dismisses the keyword approach as just more "bells and whistles, like adding graphics to adventures...it's superficially a slick-looking thing that isn't even close to ours, though it's definitely better than Spinnaker's, for instance." (Could he possibly be rankled at rumors that BTZ, the Synapse language, is an acronym for "Better Than Zork"? After all, Synapse is the first and only company to have been seen as a challenge to Infocom's dominance.)

Asking Marc Blank about Infocom's plans for its parser is like asking the CIA about its plans for Nicaragua - he refuses to part with a single clue, Invisi or otherwise. Denying plans to completely overhaul their parsing in light of Synapse's innovative technique ("We saw nothing there that looked like it was worth emulating"), Blank says that "we're making a lot of changes that add up. We make improvements for every game to meet the requirements of the game, and there have been improvements that we've made over time that we've fitted back into all the old games. The current Zork I has a new parser with a larger vocabulary and improvements in the variety of sentences it can understand. Hitchhiker's and Cutthroats had a lot of new features." Both feature much more character interaction than earlier Infocom games, a key reason for the changes. "In Hitchhiker's Bugblatter Beast problem," Blank points out, "the parser understood many, many different ways of trying to solve that puzzle."

And while you wrack your brain trying to unravel puzzles like that one, combat-fatigued programmers across America are building better parsers that will make it easier and more fun for you to do so. The intensity of war always produces our most dramatic technological advances, and the Parser War is no different. So as Infocom and Synapse cross swords in the all-text arena and a half-dozen other outfits continue to clash in the province of graphic games, the only sure winner will be you, the adventurer.

In the beginning was the word

In the Fifties, artificial intelligence research at MIT focused on natural language systems that would enable computers to manipulate human knowledge and information. Applications dealt with things like language translation. The heart of the process was the parser, a sub-routine that resolved a sentence like "Nyet, comrade" into its grammatical parts so another part of the program could quickly convert it into "No way, Jose." Not so coincidentally, William Crowther, who wrote the first adventure, Adventure, and the Zork crew studied at MIT.

Crowther's two-word parser of the mid-'70s was outpaced by Zork's full-sentence parser in 1977, which introduced multiple commands ("get the sword then kill the messenger") to adventure games. Until recently, Infocom had a virtual monopoly on the full-sentence parser. Now, however, it is becoming as ubiquitous as rocks in a cave. It's unrealistic to rate parsers side-by-side on a purely technical basis, for each must be considered in the context of the game. "If it's not important for a specific game to understand adjectives or adverbs," Imagic's Peter Golden amplifies, "then there's no need for the parser to understand them." It must, however, be able to identify and deal with adjectives, adverbs, prepositions and indirect objects, in order to truly qualify as a full-sentence parser: "look in the red box" and "carefully give the box to the skinhead." In this area, Infocom's parser is still top gun - although it apparently lacks the potential for character interaction inherent in the Synapse system.

Another important consideration is the size of a game's vocabulary: the more words in the game's vocabulary, the fewer "synonym searches" the player will have to conduct to find the right words to solve a particular puzzle. So even though current adventure games already have larger vocabularies than a New York cabbie, the heat is on to expand them even more. King's Quest II, with its 500-word dictionary, surpasses the original Sierra game by 150 words. Penguin's 1981 Transylvania has a 300-word vocabulary, but the 1985 sequel will understand 800-1,000 words. Activision's Mindshadow has a vocabulary in the neighborhood of 500 words, Imagic's Holmes knows 2,000, while Synapse's Electronic Novels range from 1,200-1,500 words.

Infocom games vary from title to title and the 1,000-word vocabulary of Sorcerer is the largest so far. (Blank claims he could cram in 10,000 if necessary.) Telarium's Dragonworld has a vocabulary of around 600 words, but Seth Godin says that "if a game calls for a 1,000-word vocabulary, we'll give it one." Indeed, the only technical restriction to the size of a game's vocabulary is the amount of a computer's RAM, and with 128K fast becoming the standard for home computers and 512K on the horizon, parsers of the future may be able to "thumb through" an unabridged dictionary, and learn new words and phrases while you play.

Thanks to André St-Aubin for transcribing and donating this article.

Last revised: Thu Aug 26 00:06:21 EDT 1999 / Peter Scheyen <Peter@Scheyen.com>