Appendix to Chapter 19 — The Eternal Algorithm, Recursive Variant

^{Estimated time of reading: 12-13 minutes}

“The routine that calls itself”

‍

The iterative variant of binary search — the one presented in the chapter — uses a loop: adjust the bounds, recalculate the midpoint, repeat. It is direct, economical, without surprises.

But there is another way, more elegant mathematically: recursion. Instead of adjusting the bounds and returning to the beginning of the loop, the routine calls itself with new bounds. Each call solves a smaller problem, until the answer is trivial: either we have found it, or the space has been exhausted. It is the same algorithm, only instead of a loop that revolves, you have a function that mirrors itself.

In high-level languages, recursion is simple: the compiler takes care of the stack, the context, the return. In Assembler, you must do everything yourself. And that is precisely why it is instructive: you see with your own eyes how each recursive call deposits its small universe on the stack — registers, parameters, return address — and how, upon returning, each level lifts its universe back up, untouched.

MACRO-11: Natural recursion

The PDP-11 might have been designed with recursion in mind. The hardware stack, with auto-decrement on push and auto-increment on pop, means that each recursive call is merely a sequence of MOVs onto the stack followed by a JSR. On return, RTS PC restores everything automatically.

The interface: four parameters on the stack — the table address, the key, the lower bound (LOW) and the upper bound (HIGH). On the first call, LOW is 1 and HIGH is N. The result is returned in R0: the index found, or 0 if the element does not exist.

The base case: if LOW > HIGH, the space has been exhausted; return 0. Otherwise, calculate MID, compare, and call oneself with adjusted bounds. Note that the table address and the key are retransmitted identically at every level — only LOW and HIGH change.

A PDP-11 detail that can drive you to distraction if you are unaware of it: the stack offsets are in octal. 10(SP) means octal 10, that is, decimal 8. 12(SP) is decimal 10, 14(SP) is decimal 12, and so on. Once you count in octal, the stack becomes a perfect map.

The initial call:

; INITIAL CALL:

;

MOV N,-(SP) ; PUSH HIGH = N

MOV #1,-(SP) ; PUSH LOW = 1

MOV KEY,-(SP) ; PUSH KEY

MOV #TABLE,-(SP) ; PUSH TABLE ADDRESS

JSR PC,RBSRCH ; CALL

ADD #10,SP ; CLEAN STACK (4 PARAMS * 2 = 10 OCTAL)

; R0 = INDEX (1..N) OR 0

The recursive routine:

;=======================================================

; RECURSIVE BINARY SEARCH - MACRO-11 / PDP-11 / RT-11

;

; PARAMETERS ON STACK (PUSHED RIGHT TO LEFT):

; 2(SP) = TABLE ADDRESS (pointer to sorted word array)

; 4(SP) = KEY (search value)

; 6(SP) = LOW (lower bound, 1-based)

; 10(SP) = HIGH (upper bound, 1-based)

; (10 octal = 8 decimal)

;

; RETURNS: R0 = INDEX (1..N) IF FOUND, 0 IF NOT

;

; FIRST CALL:

; MOV HIGH,-(SP) ; push HIGH (=N)

; MOV LOW,-(SP) ; push LOW (=1)

; MOV KEY,-(SP) ; push KEY

; MOV #TABLE,-(SP) ; push TABLE address

; JSR PC,RBSRCH

; ADD #10,SP ; clean 4 params (octal 10 = 8)

; ; R0 = result

;=======================================================

RBSRCH: MOV R1,-(SP) ; SAVE REGISTERS

MOV R2,-(SP)

MOV R3,-(SP)

MOV R4,-(SP)

;

; LOAD PARAMETERS

; 4 saved regs + return addr = 5 words = 12 octal offset

;

MOV 12(SP),R1 ; R1 = TABLE ADDRESS

MOV 14(SP),R2 ; R2 = KEY

MOV 16(SP),R3 ; R3 = LOW

MOV 20(SP),R4 ; R4 = HIGH

;

; BASE CASE: LOW > HIGH? => NOT FOUND

;

CMP R3,R4 ; LOW > HIGH?

BGT RNFND ; YES: RETURN 0

;

; MID = (LOW + HIGH) / 2

;

MOV R3,R0 ; R0 = LOW

ADD R4,R0 ; R0 = LOW + HIGH

ASR R0 ; R0 = (LOW+HIGH)/2 = MID

;

; COMPUTE ADDRESS OF TABLE(MID)

;

MOV R0,-(SP) ; SAVE MID ON STACK

DEC R0 ; R0 = MID - 1

ASL R0 ; R0 = (MID-1) * 2 (WORD SIZE)

ADD R1,R0 ; R0 = ADDR OF TABLE(MID)

;

; COMPARE KEY WITH TABLE(MID)

;

CMP R2,(R0) ; KEY : TABLE(MID)

BLT RLEFT ; KEY < TABLE(MID)

BGT RRIGHT ; KEY > TABLE(MID)

;

; FOUND: RETURN MID

;

MOV (SP)+,R0 ; R0 = MID (POP)

BR RDONE

;

; KEY < TABLE(MID): RECURSE WITH HIGH = MID - 1

;

RLEFT: MOV (SP)+,R0 ; R0 = MID (POP)

DEC R0 ; R0 = MID - 1

;

; PUSH PARAMS FOR RECURSIVE CALL (RIGHT TO LEFT)

;

MOV R0,-(SP) ; PUSH NEW HIGH (MID-1)

MOV R3,-(SP) ; PUSH LOW (UNCHANGED)

MOV R2,-(SP) ; PUSH KEY

MOV R1,-(SP) ; PUSH TABLE ADDRESS

JSR PC,RBSRCH ; RECURSE!

ADD #10,SP ; CLEAN 4 PARAMS (OCTAL 10 = 8)

BR RDONE ; R0 = RESULT FROM RECURSION

;

; KEY > TABLE(MID): RECURSE WITH LOW = MID + 1

;

RRIGHT: MOV (SP)+,R0 ; R0 = MID (POP)

INC R0 ; R0 = MID + 1

;

; PUSH PARAMS FOR RECURSIVE CALL (RIGHT TO LEFT)

;

MOV R4,-(SP) ; PUSH HIGH (UNCHANGED)

MOV R0,-(SP) ; PUSH NEW LOW (MID+1)

MOV R2,-(SP) ; PUSH KEY

MOV R1,-(SP) ; PUSH TABLE ADDRESS

JSR PC,RBSRCH ; RECURSE!

ADD #10,SP ; CLEAN 4 PARAMS (OCTAL 10 = 8)

BR RDONE ; R0 = RESULT FROM RECURSION

;

; NOT FOUND: RETURN 0

;

RNFND: CLR R0

;

; RESTORE AND RETURN

;

RDONE: MOV (SP)+,R4 ; RESTORE REGISTERS

MOV (SP)+,R3

MOV (SP)+,R2

MOV (SP)+,R1

RTS PC ; RETURN (R0 = RESULT)

;

.END

ASSEMBLER IBM System/360: Recursion built by hand

On the /360, recursion is not impossible — but no one hands it to you on a plate. There is no hardware stack. There is the Save Area — a block of 18 fullwords in which each routine saves its registers. The Save Areas are chained: each one points to the preceding one (backward) and to the next (forward). It is like a chain of railway wagons, each with its place in the consist.

The problem: in a normal call, each routine has a single, static Save Area. If the routine calls itself, the second invocation overwrites the Save Area of the first — and upon return, the registers are lost. Disaster.

The solution: a simulated stack. I pre-allocated a memory area (STACK) with 20 frames of 96 bytes each — 18 fullwords for the Save Area plus 6 fullwords for local variables and the parameter list. A pointer (STKPTR) keeps track of the current frame. On each recursive call, the pointer advances by 96 bytes; on return, it recedes. Twenty levels are sufficient for 2 to the power of 20, that is, over one million elements — more than any disk in 1972 could hold.

The key instruction is BALR R14,R15 — Branch And Link Register: it saves the return address in R14 and branches to the address in R15. It is the equivalent of JSR on the PDP-11, except that it does not use the stack — it uses a register. Everything else — saving the context, passing parameters, restoration — is done by hand, with care and discipline.

RBSRCH CSECT

*=============================================================

* RECURSIVE BINARY SEARCH - IBM SYSTEM/360

* OS/360 STANDARD LINKAGE WITH DYNAMIC SAVE AREAS

* PARAMETERS VIA R1 (STANDARD OS/360):

* +0 A(TABLE) - ADDRESS OF SORTED FULLWORD ARRAY

* +4 A(KEY) - ADDRESS OF FULLWORD: SEARCH KEY

* +8 A(LOW) - ADDRESS OF FULLWORD: LOWER BOUND

* +12 A(HIGH) - ADDRESS OF FULLWORD: UPPER BOUND

* +16 A(IPOS) - ADDRESS OF FULLWORD: RESULT INDEX

* RETURN CODE IN R15: 0=FOUND, 4=NOT FOUND

* NOTE: EACH RECURSIVE CALL USES A FRAME FROM STACK AREA.

* REGISTER R11 IS THE STACK POINTER, ADVANCING BY

* FRAME SIZE (96 BYTES) ON EACH CALL.

* MAX DEPTH = 20 (ENOUGH FOR 2^20 = 1M ELEMENTS)

*=============================================================

FSIZE EQU 96 FRAME SIZE (18F SAVE + 6F WORK)

*--- STANDARD ENTRY LINKAGE ---

STM R14,R12,12(R13) SAVE CALLER'S REGISTERS

LR R12,R15 ESTABLISH BASE REGISTER

USING RBSRCH,R12

*--- ALLOCATE FRAME FROM STACK ---

L R11,=A(STKPTR) R11 -> CURRENT STACK PTR

L R10,0(,R11) R10 = CURRENT FRAME ADDRESS

LA R9,FSIZE(,R10) R9 = NEXT FRAME ADDRESS

ST R9,0(,R11) ADVANCE STACK POINTER

*--- CHAIN SAVE AREAS ---

ST R13,4(,R10) BACKWARD CHAIN

ST R10,8(,R13) FORWARD CHAIN

LR R13,R10 R13 -> OUR FRAME

*--- LOAD PARAMETERS ---

L R2,0(,R1) R2 = A(TABLE)

L R3,4(,R1) R3 = A(KEY)

L R4,8(,R1) R4 = A(LOW)

L R5,12(,R1) R5 = A(HIGH)

L R6,16(,R1) R6 = A(IPOS)

LA R6,0(,R6) CLEAR HI BIT IF LAST PARM

L R7,0(,R3) R7 = KEY VALUE

L R8,0(,R4) R8 = LOW VALUE

L R9,0(,R5) R9 = HIGH VALUE

*--- BASE CASE: LOW > HIGH? ---

CR R8,R9 LOW > HIGH?

BH RNOTFND YES: NOT FOUND

*--- MID = (LOW + HIGH) / 2 ---

LR R10,R8 R10 = LOW

AR R10,R9 R10 = LOW + HIGH

SRA R10,1 R10 = MID

*--- COMPARE KEY WITH TABLE(MID) ---

LR R4,R10 R4 = MID

BCTR R4,0 R4 = MID - 1

SLA R4,2 R4 = (MID-1) * 4

C R7,0(R4,R2) KEY : TABLE(MID)

BL RLEFT KEY < TABLE(MID)

BH RRIGHT KEY > TABLE(MID)

*--- FOUND: IPOS = MID, RC = 0 ---

ST R10,0(,R6) IPOS = MID

SR R15,R15 RC = 0

B REXIT

*--- KEY < TABLE(MID): RECURSE(LOW, MID-1) ---

RLEFT LR R5,R10 R5 = MID

BCTR R5,0 R5 = MID - 1

ST R8,72(,R13) STORE LOW IN WORK AREA

ST R5,76(,R13) STORE NEW HIGH (MID-1)

* BUILD PARAMETER LIST IN WORK AREA

L R1,=A(SAVTBL) RELOAD A(TABLE) FROM SAVE

ST R1,80(,R13) PARM+0: A(TABLE)

L R1,=A(SAVKEY)

ST R1,84(,R13) PARM+4: A(KEY)

LA R1,72(,R13)

ST R1,88(,R13) PARM+8: A(LOW) -> WORK+0

LA R1,76(,R13)

ST R1,92(,R13) PARM+12: A(HIGH) -> WORK+4

* NOTE: IPOS ADDRESS STAYS IN R6, PASSED THROUGH

B RCALL

*--- KEY > TABLE(MID): RECURSE(MID+1, HIGH) ---

RRIGHT LA R5,1(,R10) R5 = MID + 1

ST R5,72(,R13) STORE NEW LOW (MID+1)

ST R9,76(,R13) STORE HIGH (UNCHANGED)

L R1,=A(SAVTBL)

ST R1,80(,R13) PARM+0: A(TABLE)

L R1,=A(SAVKEY)

ST R1,84(,R13) PARM+4: A(KEY)

LA R1,72(,R13)

ST R1,88(,R13) PARM+8: A(LOW) -> WORK+0

LA R1,76(,R13)

ST R1,92(,R13) PARM+12: A(HIGH) -> WORK+4

*--- COMMON RECURSIVE CALL ---

RCALL LA R1,80(,R13) R1 -> PARAMETER LIST

* SET LAST PARM INDICATOR

L R5,92(,R13) LOAD LAST PARM ADDR

O R5,=X'80000000' SET HI BIT

ST R5,92(,R13) STORE BACK

L R15,=A(RBSRCH) R15 = ENTRY POINT

BALR R14,R15 CALL SELF (RECURSE!)

* R15 = RETURN CODE FROM RECURSIVE CALL

B REXIT

*--- NOT FOUND: IPOS = 0, RC = 4 ---

RNOTFND SR R10,R10

ST R10,0(,R6) IPOS = 0

LA R15,4 RC = 4

*--- STANDARD EXIT: DEALLOCATE FRAME ---

REXIT L R11,=A(STKPTR) RESTORE STACK POINTER

L R10,0(,R11)

S R10,=F'96' POP ONE FRAME

ST R10,0(,R11)

L R13,4(,R13) RESTORE CALLER'S R13

L R14,12(,R13) RESTORE R14

LM R0,R12,20(R13) RESTORE R0-R12

BR R14 RETURN

*--- CONSTANTS AND WORK AREAS ---

SAVTBL DS A SAVED A(TABLE) FOR RECURSION

SAVKEY DS A SAVED A(KEY) FOR RECURSION

STKPTR DC A(STACK) CURRENT STACK POINTER

STACK DS 20CL96 STACK: 20 FRAMES * 96 BYTES

* (MAX DEPTH 20 = 2**20 ELEMS)

R0 EQU 0

R1 EQU 1

R2 EQU 2

R3 EQU 3

R4 EQU 4

R5 EQU 5

R6 EQU 6

R7 EQU 7

R8 EQU 8

R9 EQU 9

R10 EQU 10

R11 EQU 11

R12 EQU 12

R13 EQU 13

R14 EQU 14

R15 EQU 15

END RBSRCH

Two machines, the same idea

On the PDP-11, recursion flows naturally: the hardware stack makes everything transparent. Each call is merely a JSR, each return merely an RTS. The machine was conceived for this.

On the /360, recursion is like pitching a tent by hand in a storm: it can be done, but you must know where to drive each stake. The simulated stack, the pre-allocated frames, the chain of Save Areas — all are constructions that the PDP-11 had for free in its hardware.

But the algorithm? The algorithm is the same. Divide, compare, call yourself with a smaller problem. The same logic, regardless of whether the stack is made of silicon or of code written by hand.

The eternal algorithm takes no notice of architecture. It simply works.

‍

← Back Home Chapters →