October 26, 2017

HP 2116 and the power fail auto restart

Here is the “Power Fail / Auto Restart Diagnostic Reference Manual” of 1977. There is not much to see: the feature sat on a board, was only “factory installable”, and cost 4,000 German marks at the time, quite some money.
   The HP 2116 and similar minicomputers were built for the real world, not so much for the office or the scientific world. They had many interfaces for connecting to measuring and medical instruments, to relay switches, to sensors and time standards – to the technical world.
   At first the operating system was BCS, the Basic Control System, and it didn’t even use the interrupt system. (Look here for “hard-wired, prioritized interrupt vectors”.) To explain interrupt processing, I first have to recall what a “call to subroutine” was.
   Calling a subroutine was (and still is) very practical for routine matters that were not implemented in hardware instructions.
   When the computer encountered a jump-to-subroutine command, all it did was jump there – like a plain jump command, by altering the program counter accordingly – while remembering the address the program came from. At the end of the subroutine a “return” took the computer back to the program counter at the time of the call (plus one) for normal program continuation. Did I explain that? The jump subroutine command had to know where to jump to, of course, and that was fixed by the assembler or compiler and the relocating loader beforehand – but never mind.
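   The call and return just described can be sketched on a toy machine. This is only an illustration: the functions `jsb` and `ret`, the addresses, and the memory size are made up here. The sketch assumes the return address is parked in the first word of the subroutine, which is how the JSB instruction of the HP 2100 family worked as far as I know.

```python
# Toy model of a "jump to subroutine": memory is a list of words,
# pc is the program counter. All names and addresses are illustrative.

def jsb(memory, pc, target):
    """Call: store the return address (pc + 1) in the subroutine's
    first word, then continue at the word after it."""
    memory[target] = pc + 1
    return target + 1          # new program counter

def ret(memory, target):
    """Return: jump indirectly through the saved return address."""
    return memory[target]      # new program counter

memory = [0] * 32
call_site = 5                  # address of the call instruction
subroutine = 20                # entry word of the subroutine

pc = jsb(memory, call_site, subroutine)
pc_in_subroutine = pc          # first real instruction of the subroutine
pc = ret(memory, subroutine)
pc_after_return = pc           # back at the call site plus one
```

   Keeping the return address inside the subroutine itself is simple, but it also means the subroutine cannot safely be entered a second time before it has returned.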
   An interrupt is, as I defined it, an “involuntary jump subroutine”. A program is happily running along, and suddenly, at a time determined not by the program but by an external event, the computer jumps to a subroutine, more specifically to a fixed firmware address. (Later, interrupts interrupted simply by breaking through the memory protect boundary that protected the operating system. The principle was the same.)
   Turning on the interrupt system in the early seventies was the decisive step in “computering”, I think. Until then all we had done was calculate, rechnen in German. With interrupts the computer became more than a fully predetermined machine, because you could never tell exactly when an interrupt would occur. To this day this is the reason why debugging a system becomes ever more difficult. We HP “systems analysts”, however, were hired for that and knew the machines inside out.

The first thing a subroutine or an interrupt handler, called a driver, had to do was save the original contents of all the registers it would take the liberty of using itself. Often you saved all registers, to be on the safe side; there weren’t that many. Then you quickly turned off the interrupt system, to be sure you yourself didn’t get disturbed at work. Any further interrupt had to wait a little.
   When you re-emerged (returned) from the interrupt handling driver, you restored the registers to the values you had inherited and turned interrupts back on.
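   The driver discipline above (save, mask, work, restore, unmask) can be sketched in a few lines. The `Cpu` class, its two registers, and the interrupt sources `"tty"` and `"clock"` are assumptions for illustration, not a model of the real HP 2116 hardware.

```python
# Toy interrupt handler: save registers, mask interrupts, work, restore.
# The Cpu class and its register names are assumptions for illustration.

class Cpu:
    def __init__(self):
        self.regs = {"A": 0, "B": 0}
        self.interrupts_enabled = True
        self.pending = []          # interrupts that arrived while masked
        self.log = []              # order in which sources were served

    def raise_interrupt(self, source):
        if self.interrupts_enabled:
            self.handle(source)
        else:
            self.pending.append(source)    # has to wait a little

    def handle(self, source):
        saved = dict(self.regs)            # 1. save the registers
        self.interrupts_enabled = False    # 2. do not get disturbed
        self.regs["A"] = 999               # 3. the driver uses A freely
        self.log.append(source)
        if source == "tty":
            self.raise_interrupt("clock")  # arrives mid-handler: queued
        self.regs = saved                  # 4. restore what we inherited
        self.interrupts_enabled = True     # 5. interrupts back on
        while self.pending:                # now serve whoever waited
            self.handle(self.pending.pop(0))

cpu = Cpu()
cpu.regs["A"] = 42
cpu.raise_interrupt("tty")
```

   After the run the interrupted program finds its register A unchanged, and the clock interrupt that arrived during the Teletype handler was served only afterwards.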
   Some years later you saved the caller’s registers on a stack (first on the HP 35 and DEC’s PDP 11) or at least in a place that only you would have access to at the time of restore. Thus these routines became “reentrant” and could be used by many programs “at the same time”. When Microsoft decided to go from MS-DOS to Windows, drivers became reentrant and the mess started: blue screens popped up like lightning at sundown whenever drivers messed up.
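   Why the stack matters can be shown with a small sketch. Both routines are hypothetical, and the `depth` parameter merely simulates the routine being entered again before it has returned.

```python
# Why a fixed save area is not reentrant, and a stack is.
# Both routines are hypothetical; "depth" simulates nested entry.

fixed_save_area = None   # one shared slot, as in the early drivers

def routine_fixed(value, depth):
    global fixed_save_area
    fixed_save_area = value                    # save caller's register
    if depth > 0:
        routine_fixed(value + 100, depth - 1)  # entered again before returning
    return fixed_save_area                     # restore: may be clobbered

stack = []

def routine_stack(value, depth):
    stack.append(value)                        # push caller's register
    if depth > 0:
        routine_stack(value + 100, depth - 1)
    return stack.pop()                         # pop: always our own value

restored_fixed = routine_fixed(1, 1)   # nested entry overwrote the slot
restored_stack = routine_stack(1, 1)   # each entry got its own slot
```

   With the fixed slot the outer caller gets back the inner caller’s value; with the stack each entry restores exactly what it saved.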

And now finally to the power fail – auto restart.
   When a computer loses main power it does not work any more. Lights go off. When power is restored, lights go back on, but not so computers. They have lost all registers and would not know where to continue. Not even a modern PC automatically restarts – see the poor discussion here. They are “personal” computers, they expect you to constantly “wife” or “man” them, and so they don’t like to run unattended – as minicomputers could and did, monitoring a lab full of instruments.
   So someone got the idea to trigger an interrupt when external AC power ran out, that is, sank below a threshold. Internally computers run on DC, and DC can be stored in batteries and is nicely smoothed by capacitors in the system, so it holds on longer than the external AC power and dies more slowly. In fact there is enough time for the power fail interrupt handling driver to save the whole status of the machine, the registers.
   If an HP 2116 with power fail auto restart enabled and the proper software in memory lost power, it went down like any other machine. However, when power was restored, it lit up again and continued exactly where it had been shut down. You might have lost some outside data in the meantime, or failed to react to an alarm, but that was up to each program to recover. If you needed to bridge more time, you got yourself an uninterruptible power supply.
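   The whole cycle can be sketched as follows. This is a toy model under stated assumptions: memory keeps its contents through the outage (as core memory did), registers do not, and the `Machine` class, the `SAVE_AREA` address, and the “state is complete” flag word are inventions for illustration, not the actual HP layout.

```python
# Toy power fail / auto restart. Memory survives the outage, registers
# do not. Layout and names are assumptions for illustration.

SAVE_AREA = 0                    # hypothetical address of the saved state
REG_NAMES = ("A", "B", "P")      # P stands in for the program counter

class Machine:
    def __init__(self):
        self.memory = [0] * 16                    # survives power loss
        self.regs = {name: 0 for name in REG_NAMES}   # lost on power loss

    def power_fail_interrupt(self):
        # Runs on the residual DC while AC is already gone.
        for i, name in enumerate(REG_NAMES):
            self.memory[SAVE_AREA + i] = self.regs[name]
        self.memory[SAVE_AREA + 3] = 1            # flag: state is complete

    def power_off(self):
        self.regs = {name: 0 for name in REG_NAMES}   # registers are gone

    def power_on(self):
        if self.memory[SAVE_AREA + 3] == 1:       # auto restart possible
            for i, name in enumerate(REG_NAMES):
                self.regs[name] = self.memory[SAVE_AREA + i]
            self.memory[SAVE_AREA + 3] = 0        # consume the flag
            return True
        return False                              # cold start instead

m = Machine()
m.regs = {"A": 7, "B": 9, "P": 12}
m.power_fail_interrupt()
m.power_off()
resumed = m.power_on()
```

   The flag word is written last, so a handler that ran out of time before finishing would leave no flag, and the machine would then do a cold start rather than resume from a half-saved state.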

Two power fail auto restart stories,
   first the hardware story.
   One of us had found out that the expensive option (4,000 DM, about €2,000) was already built in. It came with all HP 2116s. At that time such a “solution” was found in quite a few cases: IBM had built-in memory extensions, very expensive as well – though I can’t prove that. When you paid for them they were activated.
   To activate the hidden, merely sleeping HP power fail auto restart option you had to pull out a board – I think the main processing board –, cover one contact with Scotch tape so as to insulate it, and plug the board back in. So we more or less routinely activated power fail auto restart.
   Hewlett-Packard probably did not want this trick to become widespread, or was ashamed of it, so they required the option to be retrofitted only by returning the system to the factory. As the unit was as big as a dishwasher, that was quite an inconvenient procedure, let alone the time lost.
   We software and hardware service people never took money for the option.
   A later article by John S. Elward about the HP 21MX, “The Million-Word Minicomputer Main Memory” (2 MByte), in the HP Journal of October 1974, describes the “Dynamic Mapping System” and already shows a completely different hardware structure: http://www.hpl.hp.com/hpjournal/pdfs/IssuePDFs/1974-10.pdf#page=19. Time had progressed to non-volatile and much more memory.

To end, here’s the second, the software story: “The 2116 as sleeping beauty or Running the Real Time System in Halt mode”
   The first multiprogramming operating system by HP was the RTE, the Real-Time Executive. We got it back in Milan in 1970, where HP had concentrated some ten analysts under my guidance. Work permits for the HP headquarters in Geneva, Switzerland, were too difficult to get. (Only after our stay in Cupertino did we come to Geneva.)
   This RTE, with its tough name, was a horrible and huge piece of software. It could naturally be used only with a disc drive (or a drum) as support and, as I said, showed crashes whose cause was really difficult to find, due to the loss of predetermination explained above. But it was used in important missions, it was promised to be robust under any circumstances, and diagnostic stops were not allowed any more.
   A “memory protect boundary” protected the RTE from being jumped (I’d say dived) into by a misguided program. Program and data were not separated; you could – I guess as today – execute data and write right into a program, which you did only with the utmost care, for instance when time was extremely short. I remember the most brilliant piece of software, a driver deep in the Time Share Basic operating system that communicated with 32 Teletype terminals, asking who had interrupted and with what character. But again I digress …
   The RTE when it had nothing to do – after all the real world was less hectic in those days, we think today – idled by a jump to itself, sat at the same program counter location with lights (apparently steadily) on and running, waiting for the next event, the next interrupt. You could see that it idled.
   The power fail auto restart, however, had a test command to trigger the failure process even without a natural power loss. You could turn off power by software. When you did, the auto restart process should bring you back to real life, but only after a second or so. In the meantime the computer showed a halt, and in fact didn’t do anything useful. If it didn’t come back up a moment later, the restart process was bad.
   So I programmed a virtual death into RTE’s idle loop. This made the computer look as if it were standing still. However, when you entered a command on the console Teletype addressed to the (apparently suspended) RTE, it reacted like an awakening sleeping beauty. It did what you commanded, just very, very slowly and apparently in Stop mode.
   I don’t know if someone today can imagine the amazement of everybody who saw that at the time, sorry.
