Highlights of my computer-programming career -- by Eugene Reimer 2011-Mar-20

One of the highlights of my career was the enhanced-dt80.  Toward the end of their era, dumb-terminals became micro-processor-based.  The Datamedia DT80 was powered by an Intel 8085;  it emulated 4 different dumb-terminals, one of which was the popular DEC VT100.  The 8085 was an 8080-compatible single-chip CPU that ran from a single +5V supply;  it sold for $5 each in quantities of 100, making low-cost motherboards possible, which led to micro-processors replacing dedicated circuitry in many applications including dumb terminals, smart appliances, etc.  Here's a photo and brief description of the Datamedia DT80 (scroll down).

At the University of Manitoba we felt the need for an enhanced dumb-terminal that would, together with our homegrown Mantes system, enable full-screen editing, and I took on the project of writing the code to do this.  Larry McNish designed the hardware modifications;  Bill Reid wrote or enhanced the 8080 assembler;  I wrote the 8080 code (to a programmer the 8085 was just another 8080;  only to the hardware-designer were they different).  The first version of our enhanced-dt80 came out in 1983 or 1984, the final version in 1985.  Approximately 1,000 of these enhanced-dt80s were produced.  They were used at the University of Manitoba and at Brandon University.  It was a wonderful terminal;  too bad that by the time it was finished it was also on its way to being obsolete.  (By 1985, PCs were becoming so affordable that dumb-terminals would soon be obsolete;  that trend was already obvious in 1985, although it hadn't been when we started the project a few years earlier, which shows how rapidly things changed in those years.)

Writing the 8080-code was computer-programming at its finest -- the code ran on a naked 8085, no operating-system;  we can also say my code included an operating-system, albeit an extremely simple one.  Not that the code for serial-port I/O was altogether simple.  (Some day I'll rant about RS-232 as an extreme example of needless complexity, using 25 pins for a path that is one bit wide;  however it's at the UART that the horrors truly abound:  there the programmer sees a path some 10 bits wide, give or take about 5 or 6, with start-, data-, parity-, and stop-bits, and different numbers of each, being the stuff that nightmares are made of:)  My code was ultimately burned onto 5 ROM chips;  for testing, however, we had an 8085-emulator with RAM, a serial-port for downloading, and some debugging features.

As a text-editor, one of its innovations was a rectangular-region cut-and-paste, in addition to the usual line-oriented cut-and-paste;  it is invaluable when editing tabular data.  It was my own invention, although I subsequently learned it had already been invented elsewhere.  My current text-editor jed also has it (most text-editors don't), and I could hardly live without it.
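To make the idea concrete, here is a minimal Python sketch of a rectangular cut and paste on a list of text lines.  It is an illustration only, not the 8080 code of the enhanced-dt80;  the function names and the sample table are made up for the example.

```python
def cut_rectangle(lines, top, bottom, left, right):
    """Cut columns [left, right) out of rows top..bottom (inclusive).

    Returns (remaining_lines, cut_block).
    """
    remaining, block = [], []
    for i, line in enumerate(lines):
        if top <= i <= bottom:
            padded = line.ljust(right)      # pad short lines so the block stays rectangular
            block.append(padded[left:right])
            remaining.append((padded[:left] + padded[right:]).rstrip())
        else:
            remaining.append(line)
    return remaining, block


def paste_rectangle(lines, block, top, left):
    """Insert block at column `left`, starting at row `top`, shifting text right."""
    result = list(lines)
    while len(result) < top + len(block):   # extend the text if the block runs past the end
        result.append("")
    for offset, piece in enumerate(block):
        row = result[top + offset].ljust(left)
        result[top + offset] = row[:left] + piece + row[left:]
    return result


# Hypothetical tabular data: remove the "qty" column, then put it back.
table = [
    "name   qty  price",
    "bolt    10   0.25",
    "nut     12   0.10",
]
rest, qty_column = cut_rectangle(table, 0, 2, 7, 12)
restored = paste_rectangle(rest, qty_column, 0, 7)
```

A line-oriented cut would have taken whole rows;  here only the chosen columns of each row move, which is what makes the operation useful for tables.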

As an "operating-system" it included two program-development tools:  a Forth-interpreter and a G-code emulator.  The latter was both a space-saving measure and an extensibility feature;  I was up against the 20KB limit on the size of the code, since our hardware design allowed for 20KB of ROM and 12KB of RAM, the upper half of the 64KB addressable space being used for memory-mapped I/O.  That meant that to add a new feature needing 100 bytes of code, I first had to eliminate 100 bytes of code elsewhere.  A great deal of space was taken up by the testing and jumping code that implemented the myriad escape-sequences for the 4 different terminal-emulations, and I was reluctant to throw away support for those terminal emulations;  however by rewriting them in G-code, I reduced their size by many hundreds of bytes, making room for the remaining features I wanted to add.  Incidentally "G" stood for Graph, as a G-program was best described as a directed-graph.  A G-program would only solve the high-level part of a problem, ultimately handing over, with a mode-change, to either Forth- or 8080-code.  The G-machine was good at top-down recursive-descent parsing, and parsers written in G-code were remarkably compact.

I remember fondly the 3 days of inventing and implementing the "G-machine":  on the first day I wrote the emulator and produced a paper listing;  the next morning I read what I'd written, rewrote it to be better and to make the G-instructions smaller, and produced a paper listing;  the next morning I read version-2, rewrote it to be better and to make the G-instructions smaller, and produced a paper listing;  the next morning I read version-3 and pronounced it good.  The first version had 3-byte "nodes", the second had 2-byte nodes, the third had mostly one-byte nodes.

The Forth-interpreter used bytecodes;  such a routine was usually a good deal smaller than an equivalent routine of 8080-instructions;  by how much depended on the sorts of things it did (I no longer have any hard data).  I remember that a flaw in the condition-code-setting and conditional-jump instructions offered by the Intel 8080 made it a challenge to implement proper signed numbers in Forth;  in those days I was capable of solving such problems;  I was then in my early- to mid-30's.
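The text above does not say which flaw is meant.  One well-known gap is that the 8080's flag register has no overflow flag, so the sign bit of a subtraction result is not enough to decide a signed comparison.  Assuming that was the issue, the Python sketch below models 16-bit cells and shows the naive test failing, along with one common workaround (flip both sign bits, then compare as unsigned).  It is an illustration, not a reconstruction of the original Forth code.

```python
MASK = 0xFFFF   # 16-bit cell, as in a typical Forth on an 8-bit CPU (assumption)
SIGN = 0x8000   # the sign bit of a 16-bit two's-complement value


def to_signed(x):
    """Interpret a 16-bit pattern as a two's-complement signed number."""
    return x - 0x10000 if x & SIGN else x


def naive_less(a, b):
    """Subtract and look only at the sign bit of the result.

    This is wrong whenever the subtraction overflows the signed range.
    """
    return bool(((a - b) & MASK) & SIGN)


def signed_less(a, b):
    """Flip both sign bits, then compare as unsigned values.

    Flipping the sign bit maps the signed order onto the unsigned order,
    so the carry (borrow) from an unsigned compare gives the right answer.
    """
    return (a ^ SIGN) < (b ^ SIGN)


# 32767 < -1 is false, but the naive test says it is true:
a, b = 0x7FFF, 0xFFFF
naive_answer = naive_less(a, b)
correct_answer = signed_less(a, b)
```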

Another highlight of my career was the "Wanted Person System" that I worked on as part of a foursome from Winnipeg who went to Los Angeles to build this system as part of an SHL Systemhouse contract with Los Angeles County.  Pat Smith, formerly of the U of Manitoba, recruited me for the job, the University gave me a year's leave of absence, and that's how I came to be there between June of 1986 and June of 1987.  Our system was called CWS or Countywide Warrant System.  First, I got to write statistics-gathering programs in PL/I that counted every conceivable attribute of the 1.2 million wanted persons in their Southern California database.  Our system, like the one it replaced, served the "wanted person" look-up needs of some 250+ police-forces in that sprawling metropolis commonly known as Los Angeles, which turns out to be some 250+ separate cities.  Many of them, like Seal Beach where I lived, were small as cities go;  Los Angeles itself was the largest at some 2.6 million people, and the overall population was around 10 or 15 million.  I was astonished to learn that the number of wanted persons in their database exceeded the population of Manitoba!

My research included reading about NYSIIS (New York State Identification and Intelligence System), best known for the phonetic NYSIIS-name-coding algorithm they devised, which out-performs the traditional Soundex name-coding.  I was also impressed by their approach to a goodness-of-match measure for a fuzzy-search algorithm, which inspired the Bayes-Law-based generalization and the methodology I used in constructing the six kinds of search for the Los Angeles system.  It provided a solid foundation for such searches on name and/or description.  That method together with the stats I'd gathered over several months made it possible for me to write the "name-search" code in S/370-Assembler in one week when the "demo" we'd promised was drawing near.  I recall Gord Tallas being alarmed that I hadn't begun to write the program when the demo was only days away:-)  After producing the finished product for that demo, my part of the one-year project was done, so I got to goof off for the rest of the year. 

Those had been three highly productive months on my part.  Probably the most productive 3 months of my entire career.  Besides the "LAPIS" name-search already mentioned, I had also implemented a perfect-hash-finding program loosely based on a paper co-authored by Gord Cormack on the finding of perfect (meaning collision-free) hash-functions, so that all table-lookups done by the online program would be collision-free.  There were many such tables, some tiny, some like the common-surnames and common-firstnames tables having something like 1024 or 4096 entries each.
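As a rough illustration of what "finding a perfect hash" means, the Python sketch below simply tries one multiplier after another until every key lands in its own slot.  This brute-force search is an assumption made for the example;  it is not the method of the paper mentioned above, nor the program written for CWS, and the surnames are an arbitrary sample rather than the real table.

```python
def slot_for(key, multiplier, table_size):
    """Hash an ASCII key into 0..table_size-1 with a multiplier-dependent function."""
    h = 2166136261                       # arbitrary non-zero start value
    for byte in key.encode("ascii"):
        h = ((h ^ byte) * multiplier) & 0xFFFFFFFF
    return (h >> 16) % table_size        # use the better-mixed upper bits


def find_perfect_multiplier(keys, table_size, max_multiplier=200001):
    """Return the first odd multiplier giving a collision-free table, else None."""
    for multiplier in range(3, max_multiplier, 2):
        slots = {slot_for(key, multiplier, table_size) for key in keys}
        if len(slots) == len(keys):      # every key got a distinct slot
            return multiplier
    return None


# Hypothetical sample of common surnames; the real tables held 1024 or 4096 entries.
surnames = ["SMITH", "JOHNSON", "WILLIAMS", "BROWN",
            "JONES", "GARCIA", "MILLER", "DAVIS"]
multiplier = find_perfect_multiplier(surnames, 16)
```

The search is done once, offline;  the online program then needs only the chosen multiplier and a single probe per lookup.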

I'd also implemented a bit-wise compression method so that 260+ byte person-records seen by our COBOL programs were stored in half that space;  the constraints imposed by the underlying VSAM files accessed by IMS forced our compression to produce fixed-size records, ruling out most compression methods;  and so I wrote macros in S/370-Assembler to do bit-wise compression, packing 10-valued numeric-data into 4-bits, 32-valued character-data into 5-bits, and so on;  things like eye-colour and hair-colour (my documentation had to spell colour the Yank way:) did better with 3-byte COBOL values going into 3 and 2 bits.  Gender was originally 2-valued so it went into one-bit but that turned out to be an expensive mistake, on LA-County's part.  They later had to fly me to Los Angeles to convert to a two-bit gender.  It was a non-trivial change because gender was part of the VSAM key for one copy of the data.  There were 6 copies of the data, so that each of the 6 kinds of search could meet the performance objectives.  Our system had to be able to run many transactions per second (on whatever the biggest IBM mainframe was in those days);  I no longer remember the exact number.  That gender-expanding job was another case of mostly goofing-off time;  I'd been forced to quote a worst-case time-estimate, but then completed the job in half a day...
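The packing scheme can be sketched in a few lines of Python.  The field names, their order, and the sample values below are hypothetical;  the real work was done by S/370-Assembler macros, and the point here is only that each field gets a fixed number of bits, so the packed record always has the same size.

```python
# (name, width in bits): widths follow the examples in the text above;
# the names and the order are assumptions made for this sketch.
LAYOUT = [
    ("digit", 4),    # 10-valued numeric data
    ("letter", 5),   # 32-valued character data
    ("eye", 3),      # eye-colour code
    ("hair", 2),     # hair-colour code
    ("gender", 1),   # the original one-bit gender
]


def pack(values, layout):
    """Pack the named fields into one integer, first field in the highest bits."""
    word = 0
    for name, width in layout:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word, layout):
    """Recover the named fields from a packed integer."""
    values = {}
    for name, width in reversed(layout):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


record = {"digit": 7, "letter": 19, "eye": 5, "hair": 2, "gender": 1}
packed = pack(record, LAYOUT)
total_bits = sum(width for _, width in LAYOUT)   # fixed size, whatever the values
```

Widening gender to two bits in this sketch means changing one entry in the layout;  in the real system the field was part of a VSAM key, which is why the change was non-trivial.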

My 9 months of goofing off was interrupted by a few bits of work.  One of which was writing 6 trivial COBOL programs that a co-worker had struggled with for 3 months without success.  His "nearly finished" attempt had thousands of lines in each of the 6 programs -- to this day I have no idea what they did.  It was about an hour's work;  each of the COBOL programs was 3 lines of COBOL (not counting the preamble that's the same in every COBOL program).  For him it ought to have been 5 hours, with 4 of them for him to study the system documentation to learn which were the key fields for each of the 6 kinds of search, whereas I already knew that.  The reason for these 6 trivial programs was that if a key-field was being modified then instead of a modify call, 2 separate calls were made to delete and insert the record.  That same co-worker had been my assistant on the stats-gathering part of the job, and that left me with a low opinion on the value of having an assistant, as it took considerably more of my time to get him to do something than it took to do it myself.  And then after spending about a hundred times longer than was reasonable on a program to sort the surnames and give me the N most common ones, his output would have bizarre flaws such as the same name showing up twice, once space-padded and once NUL-padded.  How anyone can for such a simple problem write something complicated enough to have bugs like that still boggles my mind.

That was one of the 2 occasions where I've written in COBOL.  The other was more interesting, and also amusing.  This time I was a subcontractor to Systemhouse who were doing a job for MTS (Manitoba Telephone).  The specs called for the programs to be written in COBOL, but they had performance-requirements that couldn't be met by COBOL programmers, and that's why they hired me.  So I wrote in COBOL something resembling a bitmap for memory-management.  COBOL has no bit operations, so I wrote them in a way that made mockery of their requirement that all programs be in COBOL, and therein lies the humour:-)
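The text above does not say how the bit operations were written.  One way to get them in a language without bit operators is to use nothing but integer division, remainder, addition and subtraction, as in the Python sketch below.  This is a guess at the general technique, not the MTS code, and the function names are invented for the example.

```python
def bit_is_set(word, position):
    """Test one bit using only integer division and remainder."""
    return (word // (2 ** position)) % 2 == 1


def set_bit(word, position):
    """Turn a bit on by adding its weight, unless it is already on."""
    return word if bit_is_set(word, position) else word + 2 ** position


def clear_bit(word, position):
    """Turn a bit off by subtracting its weight, unless it is already off."""
    return word - 2 ** position if bit_is_set(word, position) else word


def first_free(word, width):
    """Return the lowest clear bit (a free block in the bitmap), or None if full."""
    for position in range(width):
        if not bit_is_set(word, position):
            return position
    return None


# Mark blocks 0, 1 and 3 as allocated, then ask for the first free block.
bitmap = 0
for block in (0, 1, 3):
    bitmap = set_bit(bitmap, block)
free_block = first_free(bitmap, 8)
```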

I recall another small job that led me to have doubts about my fellow-man: someone had sent me a large file on 4 reels of "unlabelled" tape, and I was trying to copy the file to DASD.  That meant running a job that needed 4 consecutive tape-mounts to be done correctly (with labelled tapes the I/O clerk could screw up and the system would have him/her do it again, but with unlabelled tape you were at the I/O clerk's mercy), and I was starting to think that was never going to happen.  However after 8 hours came a shift-change and then it became possible.

The highlight of my student-era summer-jobs was the summer of 1974 at Chalk River Nuclear Laboratories where AECL (Atomic Energy of Canada) has its research (and medical-isotope-making) reactors.  In those days AECL also had a research facility in Pinawa Manitoba, however a research project being done by Mechanical Engineers from CRNL and McGill University (Michel J Pettigrew and Michael P Paidoussis) was looking for a software person familiar with the mathematics of mechanical-vibrations (no kidding) and that was the best fit for me within AECL.  It was a remarkable stroke of luck as it turned out to be a wonderful summer job, where I got paid to do things that were such fun I felt as though I should be paying them.  Prof Paidoussis of McGill was solving a problem using a symbolic-algebra program, and I was providing an independent second method to serve as a check (these nuclear scientists are extremely careful).  My method involved doing more of the math by hand (I spent several days solving integrals as in 1st-year Calculus), then using a software-package that I first had to complete, written in Fortran, to do matrix-operations without all the "bookkeeping" and the passing of all those extra parameters for declared dimensions and actually-in-use dimensions that come with multi-dimensional Fortran arrays.  (Although I knew the machine-language of the CDC-6600 processor we used at Chalk River, there was no reason to use it for anything I wrote there.)  This project sent me on my first ever "business trip" when I went to Montreal to confer with Professor Paidoussis at McGill and stayed at the Queen Elizabeth hotel.  As I recall, it also got me published, as one of 6 co-authors of a paper published in an academic journal.  Not every summer job has perks like that. 

I cannot find any mention of the publication I'm thinking of, although in Michael Paidoussis's CV we find a less formal publication:  E Reimer, L Kates & M P Paidoussis,  "Computer Modelling of the Dynamic Behaviour of a String of Fuel Bundles in Axial Flow; Part A: Response to Steady Hydrodynamic Forces",  CRNL-1340, Atomic Energy of Canada, 1975.  I never met co-author Louis Kates, the U of Waterloo co-op-program student who began, and wrote most of, the Fortran-based software-package;  however I did meet Michael Paidoussis at McGill and I remember enjoying his Greek cigarettes.

The next summer I worked at MGCC (Manitoba Government Computer Centre, later called MDS or Manitoba Data Services), along with about half a dozen Computer-Science students from U of Manitoba.  It was a different sort of learning-experience; they hired us but had nothing for us to do (working there was like retirement except it paid better), so we found our own projects.  Three of us wrote variants of listm, a TSO-command to display the members of a PDS (partitioned-dataset).  My main accomplishment was a pair of S/370 Assembly Language macros called EXCHPGM and CHPGM which made convenient the use of IBM's lowest-level access-method known as EXCP (execute channel-program).  They were typically used to construct a channel-program for track-at-a-time reading, as in the VTOC-reading utility I wrote at the University of Manitoba a few years later.  Reading an entire track at once provides a 50-fold performance improvement versus block-at-a-time reading for MVS VTOC (Volume Table of Contents) or Catalog or PDS-directory records which are unblocked -- although these days the "caching" does essentially the same thing automatically.

PS:  An email from Gord Tallas on 2008-06-10 informs me that the CWS system we built in 1986 is still in use, however they're now thinking of having the assembly-language LAPIS part rewritten in a high-level language.  Which reminds me that we'd chosen the acronym LAPIS (probably for Los Angeles Police Information System) for that "name-search" part of CWS.  LAPIS, as in the semi-precious stone, was a winner of an acronym in my opinion.  I don't remember whose brainwave it was except that it wasn't mine.  I'm thinking that C would have been entirely suitable back then too although it wasn't one of the choices.  Mind you, not having macros would have made it seem like going to a lower-level language rather than a higher:-)

PS:  Here's info on CWS and a few other related systems being used by Los Angeles County: www.la-sheriff.org/divisions/tsdiv/record_id/ri_ovrview.html.

PS:  Highlights from the PC-era are in programs/zz-collection.htm.

Send your questions, suggestions, etc to ereimer@shaw.ca.