Structure and Interpretation of Computer Programs — Book + MIT 6.001 + Berkeley CS61A Summary & Notes [WIP]

31.07.2022 — notes, summary — 18 min read

This is my summary and notes from Structure and Interpretation of Computer Programs by Harold Abelson & Gerald Jay Sussman . Please use the link if you decide to buy the book after reading this as it contains my affiliate code. Thanks.

SICP uses Scheme (a dialect of Lisp).

Every computer program is a model, hatched in the mind, of a real or mental process. These processes arise from human experience or thought and are only partially understood at any time — leading to the constant evolution of programs.
Avoid complexities of control and concentrate on organizing the data to reflect the real structure of the world being modeled.
It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures.
— Alan J. Perlis
Computer science (CS) isn’t about computers (that’s electrical engineering) and it isn’t primarily a science (things are invented more than they are discovered).
CS is partly a form of engineering (concerned with building reliable, efficient mechanisms, but in software instead of metal) and partly an art form (using programming as a medium for creative expression).

Chapter 1: Building Abstractions with Procedures

The acts of the mind, wherein it exerts its power over simple ideas, are chiefly these three:
Combining several simple ideas into one compound one, and thus all complex ideas are made.
Bringing two ideas, whether simple or complex, together, and setting them by one another so as to take a view of them at once, without uniting them into one, by which it gets all its ideas of relations.
The third is separating them from all other ideas that accompany them in their real existence: this is called abstraction, and thus all its general ideas are made. — John Locke, An Essay Concerning Human Understanding (1690)

Declarative knowledge talks about what is true. It makes statements of fact that can be used to reason about things.
√x = y
such that:
y² = x and y >= 0
Imperative knowledge talks about "how to" knowledge.
To find the √x:
1. Make a guess G
2. Is G² close enough to x?
3. If no, average G and x / G
4. If yes, done.
A procedure is a description of a series of specific, mechanical steps to be followed in order to deduce a particular value associated with some problem, using a predefined set of operations.
A process is the actual sequence of steps within the computer that cause the "how to" knowledge to evolve. By evaluating an expression that applies the procedure to some values, the "how to" knowledge is used to find the value associated with a particular instantiation of the problem by transforming one set of facts into a new set.
Procedures (dormant description) are a pattern of rules written in a programming language that directs a process (active).
Computational processes manipulate data as they evolve, as defined in a procedure.
Well designed computational systems have a modular design — allowing individual parts to be constructed, replaced or debugged separately.
The constraints imposed in building large software systems are the limitations of the human mind; opposed to the physical systems' constraints in other kinds of engineering.

flowchart LR subgraph Memory subgraph ProcessA [Process A] DataA[/Data A/] end subgraph ProcessB [Process B] DataB[/Data B/] end end style ProcessA fill:#FFFFFF,stroke:#000000,stroke-width:1px style ProcessB fill:#FFFFFF,stroke:#000000,stroke-width:1px style DataA fill:#A0C0FF,stroke:#000000,stroke-width:1px style DataB fill:#A0C0FF,stroke:#000000,stroke-width:1px subgraph Disk ["Memory"] subgraph ProgramX Procedure1(Procedure 1)-->Procedure2(Procedure 2) Procedure1-->Procedure3(Procedure 3) end end style ProgramX fill:#FFFFFF,stroke:#000000,stroke-width:1px style Procedure1 fill:#6ED3BE,stroke:#000000,stroke-width:1px style Procedure2 fill:#9ACE81,stroke:#000000,stroke-width:1px style Procedure3 fill:#9ACE81,stroke:#000000,stroke-width:1px ProgramX-->ProcessA ProgramX-->ProcessB

1.1 The Elements of Programming

Two kinds of elements in programming:
- Data: Stuff to be manipulated (dormant).
- Procedures: Description of the rules for manipulating data.
  Lisp treats procedures as data
Besides instructing the computer to perform tasks, powerful programming languages also serve as a framework to organize ideas about processes.
Every powerful language has 3 mechanisms for building complex ideas from simple ones:
- Primitive expressions: represents the simplest elements the language is concerned with. Eg:
  - Primitive procedures
  - Primitive data (Numbers, etc)
- Means of combination: by which compound elements are built from primitives. Eg:
  - Combinations
  - Construction of compound data
- Means of abstraction: by which compound elements can be named and treated as primitives. Eg:
  - Definitions
  - Data abstraction
Abstraction lets us talk more nearly in a problem’s own terms and less in terms of the computer’s mechanisms or capabilities.
The idea of isolating the use of a procedure from its actual implementation is used to control complexity of large systems.
The key to understanding complicated things is to know what not to look at, what not compute, and what not to think.

flowchart TB subgraph PrimitiveC [Someone's primitive] subgraph Abstraction1 [AbstractionX] subgraph Combination1 [Combination 1] PrimitiveA(Primitive A) PrimitiveB(Primitive B) end end end style Combination1 fill:#FFFFFF,stroke:#000000,stroke-width:1px style PrimitiveA fill:#F49666,stroke:#000000,stroke-width:1px style PrimitiveB fill:#A0C0FF,stroke:#000000,stroke-width:1px style PrimitiveC fill:#6ED3BE,stroke:#000000,stroke-width:1px

Features of a programming language

Vocabulary: a set of words that are used to build procedures. These will be the basic elements of computation, the fundamental representations of information and the fundamental procedures that we use to describe all other procedures.
Syntax: a set of rules for legally connecting elements together — i.e, how to build more complex parts of a procedure out of more basic ones.
Semantics: a set of rules for deducing the meaning associated with (simple/compound) elements.

1.1.1 Expressions

Self-evaluating expressions are the simplest class of primitives — their value is the object or expression itself. Examples are numbers, strings, and boolean.

Combinations are compound expressions that applies input expressions to an expression representing a procedure (eg: + or *)

1(+ 137 349) ; Combinations are formed by delimiting a list of expressions within parentheses in order to denote procedure application

Combinations can be nested — an operator or operand can itself be another combination.

1(define (a-plus-abs-b a b)
2   ((if (> b 0) + -)
3      a
4      b
5   ))

The convention of placing the operator to the left of the operands is known as prefix notation.
Prefix notation accomodates procedures that take arbitrary number of arguments.
```
1(+ 21 35 12 7)
2
3(+ 25 4)
```
No ambiguity can arise, because the operator is always the leftmost element and the entire combination is delimited by the parentheses.
Prefix notation extends in a straightforward way to allow combinations to be nested:
```
1(+ (* 3 5) (- 10 6))
```
The lisp interpreter runs in a read-evaluate-print loop (REPL).
In lisp, every expression has a value and every value has a type.

Syntax for a combination

An open parenthesis,
An expression whose value is a procedure (operator),
A list of expressions (operands), and
A close parenthesis.

1(procedure_expression operand_expression1 operand_expression2)

Semantics of a combination

The value of a combination is obtained by applying the procedure specified by the operator to the arguments that are the values of the operands.

1.1.2 Naming and the Environment

In scheme, objects are named with define. The names identifies a variable whose value is the object. The last sentence separates putting a value in a variable and giving it a name.
```
1(define size 2) ; name (size) is used to refer to the computational object (2)
```
define is Scheme's simplest means of abstraction — it allows the usage of simple names to refer to the results of compound operations.
Complex programs are constructed by building, step by step, computational objects of increasing complexity.
The environment is the memory the interpreter uses to hold the association between names and objects.

1.1.3 Evaluating Combinations

The interpreter follows a procedure to evaluate combinations:
- Evaluate all the subexpressions of the combination (in any order).
- Apply the procedure that is the value of the leftmost subexpression (the operator) to the arguments that are the values of the other subexpressions (the operands).
The evaluation rule is recursion — To evaluate a combination, the interpreter must first evaluate the elements of the combination, and so on.
Paretheses in Lisp is used for combinations. They are a way to write 2d combination trees as a linear character string.

Example:

1(* (+ 2
2      (* 4 6))
3   (+ 3 5 7))

Lisp evaluation rule

If self-evaluating, return value (base case).
If a name, return value associated with name in environment (base case).
If a special form (like define), use its own evaluation rule.
If a combination, then use the evaluation procedure described above.

Lisp doesn't make artificial distinctions between elements that are primitive to the language and custom built elements.

`Define` special evaluation rule

Evaluate only the last sub-expression,
Pair the value with the name symbol in the environment.

1(define name_symbol last_expression)

1.1.4 Compound Procedures

Procedure definition is a powerful abstraction technique by which a compound operation can be given a name and then referred to as a unit.

General form of a procedure definition

1(define (⟨name⟩ ⟨formal parameters⟩) ⟨body⟩)

The ⟨name⟩ is a symbol to be associated with the procedure definition in the environment.
The ⟨formal parameters⟩ are the names used within the body of the procedure to refer to the corresponding arguments of the procedure.
The ⟨body⟩ is an expression that will yield the value of the procedure when it's applied.
The ⟨name⟩ and the ⟨formal parameters⟩ are grouped within parentheses, just as they would be in an actual call to the procedure being defined.
The ⟨body⟩ is not evaluated when the procedure is evaluated, it's evaluated when the procedure is applied.
If the ⟨body⟩ contains more than one expression, each is evaluated in sequence and the value of the last one is returned.
Evaluating the definition creates a compound procedure and associates it with the name. The last sentence separates creating a procedure and giving it a name. Because it's possible to create a naemeless procedures or to give an existing procedure a name.

The argument’s name comes from the function’s definition; the argument’s value comes from the invocation.

1(define elvis +) ; Even primitive procedure symnols are just names bound to a procedure value
2(elvis 5 10)

Terminology

The formal parameter is the name of the argument inside the function body.
The actual argument expression is the expression used in the invocation (procedure_name (+ 2 3)).
The actual argument value is the value of the argument in the invocation (procedure_name 5).

Lambda expression

1(define ⟨name⟩ (lambda (⟨formal parameters⟩) ⟨body⟩))

The general form is syntatic sugar for a lambda expression.
Syntatic sugar is having more convenient surface forms for typing something.
The lambda symbol is Lisp's way of saying "make a procedure".
The semantics says that the value of a lambda expression is a procedure object. It's some internal representation of the procedure that includes information about the formal parameters and the body.

Example:

1(lambda (x y) (/ (+ x y) 2))

1.1.5 The Substitution Model for Procedure Application

The interpreter follows the same process when evaluating combinations with either compound or primitive procedure.
The substitution model for procedure application:
To apply a procedure to arguments, evaluate the body of the procedure with each formal parameter replaced by the corresponding argument.
The purpose of the substitution is to help reason about procedure application, not to provide a description of how the interpreter really works.
Typical interpreters do not apply procedures by manipulating the text of a procedure to substitute values for the formal parameters. In practice, the “substitution” is accomplished by using a local environment for the formal parameters.
There are more elaborate models — the substitution model is only a way to get started thinking formally about the evaluation process.
In general, when modeling phenomena in science and engineering, simplified incomplete models are first used. As things are examined in greater detail, these simple models become inadequate and must be replaced by more refined models.

Normal Order

Fully expand and then reduce — I.e Operand expressions are substituted for parameters until an expression involving only primitive operators is obtained, only then is evaluation performed.

Applicative Order

Evaluate the arguments and then apply — Operand expressions are evaluated and then applied to a procedure.

Lisp uses applicative-order evaluation because:

Additional efficiency obtained from avoiding multiple evaluations of expressions.
Normal-order evaluation becomes much more complicated to deal with outside the realm of procedures that can be modeled by substitution.

In functional programming, the same answer is obtained regardless of order of evaluation.

1.1.6 Conditional Expressions and Predicates

Conditional expression enables case analysis.

The general form of a conditional expression is:

1(cond (⟨p₁⟩ ⟨e₁⟩)
2   (⟨p₂⟩ ⟨e₂⟩)
3   ...
4   (⟨pₙ₋₁⟩ ⟨eₙ₋₁⟩)
5   (else ⟨eₙ⟩)
6)

It consists of the symbol cond followed by parenthesized pairs of expressions called clauses.
The first expression in each clause is a predicate — i.e, an expression whose value is interpreted as either true or false.
The second expression in each clause is called a consequent expression.

Example:

1(define (abs x)
2   (cond ((> x 0) x)
3         ((= x 0) 0)
4         ((< x 0) (- x))))

Conditional expressions evaluation rule:

The predicate ⟨p₁⟩ is evaluated first.
If its value is false, then ⟨p₂⟩ is evaluated.
This process continues until a predicate is found whose value is true or the special symbol else is found, in which case the interpreter returns the value of the corresponding consequent expression ⟨e⟩ as the value of the conditional expression.
If nothing matches, the value of the conditional expression is undefined.

If expressions

The general form for the special form if (a restricted type of conditional that allows only two cases in a case analysis) is:

1(if ⟨predicate⟩ ⟨consequent⟩ ⟨alternative⟩)

If expressions evaluation rule

If the ⟨predicate⟩ evaluates to a true value, the interpreter then evaluates the ⟨consequent⟩ and returns its value.
Otherwise it evaluates the ⟨alternative⟩ and returns its value.

Logical operators

In addition to primitive predicates such as <, =, and >, there are logical composition operations, which enables the construction of compound predicates:
- (and ⟨e₁⟩ ⟨e₂⟩ ... ⟨eₙ⟩)
- (or ⟨e₁⟩ ⟨e₂⟩ ... ⟨eₙ⟩)
- (not ⟨e₁⟩)
Notice that and & or are special forms, not procedures, because the subexpressions are not necessarily all evaluated (because of short circuiting).
not is an ordinary procedure.

Exercise 1.3

Define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers:

1(define (sum-of-square-biggest x y z)
2         (cond ((and (> x z) (> y z)) (sum-of-square x y))
3               ((and (> y x) (> z x)) (sum-of-square y z))
4               ((and (> x y) (> z y)) (sum-of-square x z))
5 ))
6
7 (define (sum-of-square x y)
8    (+ (* x x) (* y y))
9 )

1.1.7 Example: Square Roots by Newton’s Method

The newton method of successive approximations can be used to calculate the square root.

To find the √x:
Make a guess y
Is y² close enough to x?
If no, improve guess by averaging y and x / y
If yes, done.

1Example: Find √2
2Initial guess: 1
3
4Guess       Quotient              Average                 Improved guess
51           (2/1) = 2             ((2 + 1)/2)             1.5
61.5         (2/1.5) = 1.3333      ((1.3333 + 1.5)/2)      1.4167
71.4167      (2/1.4167) = 1.4118   ((1.4118 + 1.4167)/2)   1.4142
81.4142 ...

One useful way of thinking about defining procedures is as a means of generalizing a common pattern of operations — i.e an abstraction. This is captured by giving a name to the part of the pattern that changes with each instantiation; identifying that name as a formal parameter; and then capturing that pattern as the body of a lambda expression.

Stages in capturing common patterns of computation

Identify modules or stages of process
Capture each module within a procedural abstraction
Constrcut a procedure to control the interactions between the modules
Repeat the process within each module as necessary

1(define (square x)
2   (* x x))
3
4(define (average x y)
5   (/ (+ x y) 2))
6
7(define (improve guess randicand)
8   (average guess (/ randicand guess)))
9
10(define tolerance 0.001)
11
12; The idea is to improve the answer until it is close enough so that its square differs from the radicand by less than a predetermined tolerance.
13(define (good-enough? guess randicand)
14   (< (abs (- (square guess)
15            randicand))
16      tolerance))
17
18(define (sqrt-iter guess randicand)
19   (if (good-enough? guess randicand)
20      guess
21      (sqrt-iter (improve guess randicand) randicand)
22   )
23)
24
25(define (sqrt randicand)
26   (sqrt-iter 1.0 randicand))

...I think people who really believe that you design everything before you implement it basically are people who haven’t designed very many things.
The real power is that you can pretend that you’ve made the decision and then later on figure out which one is right, which decision you ought to have made. And when you can do that, you have the best of both worlds.

— Abelson

The way in which you would construct a recursive process is by wishful thinking. You have to believe.

— Sussman

1.1.8 Procedures as Black-Box Abstractions

In the good-enough? procedure above, square is an abstraction of a procedure (aka procedural abstraction). At this level of abstraction, any procedure that computes the square is equally good.

Local names

The choice of names for the procedure’s formal parameters should not matter to the user of the procedure. The following procedures should not be distinguishable:
```
1(define (square x) (* x x))
2(define (square y) (* y y))
```
Formal parameters are bound variables — Their names don't matter.
The procedure definition binds its formal parameters.
A bound variable is a variable that was previously free, but has been bound to a specific value or set of values called domain of discourse or universe
Free variable refers to variables used in a function that are neither local variables nor parameters of that function.
If a variable is not bound, it's free.
The set of expressions for which a binding defines a name is called the scope of that name.
Formal parameters scope is the procedure's body.
Whether a variable is free or bound is relative to the object in question.

In the definition of good-enough? above, guess and x are bound variables but <, -, abs, and square are free.

Internal definitions and block structure

In the sqrt procedure above, the auxiliary procedures can be private and internal to it:

1(define (sqrt randicand)
2
3   ; Notice: No need to pass randicand
4   (define (good-enough? guess) ...)
5
6   (define (improve guess) ...)
7
8   (define (sqrt-iter guess) ...)
9
10   (sqrt-iter 1.0)
11)

Putting a definition in the body of a procedure makes it local to that procedure. This nesting is called block structure.

By internalizing auxiliary procedures, we can often eliminate bindings by allowing variables to remain free.

Lexical scoping

Scheme uses lexical scoping. Lexical scoping dictates that free variables in a procedure are taken to refer to bindings made by enclosing procedure definitions; that is, they are looked up in the environment in which the procedure was defined.

1.2 Procedures and the Processes They Generate

To become experts, we must learn to visualize the processes generated by various types of procedures. Only after we have developed such a skill can we learn to reliably construct programs that exhibit the desired behavior.

A procedure is a pattern for the local evolution of a computational process. It specifies how each stage of the process is built upon the previous stage.

1.2.1 Linear Recursion & Iteration — Finding the Factorial

Feature	Linear recursive process	Linear iterative process
Definition	A recursive process is characterized by a chain of deferred operations.	An iterative process is one whose state can be summarized by a fixed number of state variables, together with a fixed rule that describes how the state variables should be updated as the process moves from state to state and an (optional) end test that specifies conditions under which the process should terminate.
State	There is some additional “hidden” information, maintained by the interpreter and not contained in the program variables, which indicates “where the process is” in negotiating the chain of deferred operations.	The program variables provide a complete description of the state of the process at any point. If the computation is stopped between steps, all that is needed to resume the conputation is to supply the interpreter with the values of the program variables.
Shape	The substitution model reveals a "shape" of expansion followed by contraction. The expansion occurs as the process builds up a chain of deferred operations. The length of the deferred operations is proportional to `n`. The contraction occurs as the operations are actually performed.	The substitution model reveals a "shape" with no expansion/contraction.
Number of required steps	Grows linearly with `n`	Grows linearly with `n`
Space	Grows linearly with `n` (i.e proportional to `n`) The interpreter has to keep track of the deferred operations.	Constant

Factorial function definition:

n! = n (n - 1) (n - 2) ... 3 2 1

Linear Recursion

Rules to find n!:

When n > 1:
n! = n * (n - 1)!
When n = 1:
1! = 1

The rules translate directly into a linear recursive procedure:

1(define (factorial n)
2   (if (= n 1)
3      1
4      (* n (factorial (- n 1)))
5   )
6)

Visualizing the process of computing 6! using the substitution model:

1(factorial 6)
2(6 * (factorial 5)) ; Expansion starts (deferred chain of multiplications)
3(6 * (5 * (factorial 4)))
4(6 * (5 * (4 * (factorial 3))))
5(6 * (5 * (4 * (3 * (factorial 2)))))
6(6 * (5 * (4 * (3 * (2 * (factorial 1)))))) ; Expansion ends
7(6 * (5 * (4 * (3 * (2 * 1))))) ; Contraction starts (operations are actually performed)
8(6 * (5 * (4 * (3 * 2))))
9(6 * (5 * (4 * 6)))
10(6 * (5 * 24))
11(6 * 120) ; Contraction ends
12720

Linear Iteration

Rules to find n!:

1product = 1; Running product
2counter = 1; Range: [1, n]

Iterate

1product = (* counter product)
2counter = (+ counter 1)

Until

1(> counter n)

n! = Final value of product

The rules translate directly into a linear iterative procedure:

1(define (factorial n)
2   (define (factorial-iterate product counter)
3      (if (> counter n)
4          product
5          (factorial-iterate (* counter product)
6                             (+ counter 1)
7          )
8      )
9   )
10
11   (iterate 1 1)
12)

Visualizing the process of computing 6! using the substitution model:

1(factorial 6)
2(factorial-iterate 1 1)
3(factorial-iterate 1 2)
4(factorial-iterate 2 3)
5(factorial-iterate 6 4)
6(factorial-iterate 24 5)
7(factorial-iterate 120 6)
8(factorial-iterate 720 7)
9720

!> A recursive procedure refers to the syntactic fact that a procedure definition refers (either directly/indirectly) to the procedure itself.

A **recursive process** refers to how the process evolves, not about the syntax of how a procedure is written.

The `factorial-iterate` is a recursive procedure that generates an iterative process: Its state is captured completely by its three state variables (counter, product, n), and an interpreter need keep track of only three variables in order to execute the process.

Common languages like `C` have a defect: A recursive procedure always consumes an amount of memory that grows with the number of procedure calls, even when the process described is, in principle, iterative. These languages resort to special-purpose "looping constructs" like `while`, `for`.

Exercise 1.9

1; Procedure#1
2(define (+ a b)
3   (if (= a 0)
4       b 
5       (inc (+ (dec a) b))
6   )
7)
8
9; Procedure#2
10(define (+ a b)
11   (if (= a 0)
12       b
13       (+ (dec a)
14          (inc b)
15       )
16   )
17)

The substitution model shows that procedure#1 generates a recursive process.

1(+ 4 5)
2(inc (+ 3 5))
3(inc (inc (+ 2 5)))
4(inc (inc (inc (+ 1 5))))
5(inc (inc (inc (inc (+ 0 5)))))
6(inc (inc (inc (inc 5))))
7(inc (inc (inc 6)))
8(inc (inc 7))
9(inc 8)
109

The substitution model shows that procedure#2 generates an iterative process.
```
1(+ 4 5)
2(+ 3 6)
3(+ 2 7)
4(+ 1 8)
5(+ 0 9)
69
```

Exercise 1.10

1; Ackermann’s function
2(define (A x y)
3   (cond ((= y 0) 0)
4         ((= x 0) (* 2 y))
5         ((= y 1) 2)
6         (else (A (- x 1) (A x (- y 1))))
7   )
8)
9
10(define (f n) (A 0 n))
11
12(define (g n) (A 1 n))
13
14(define (h n) (A 2 n))
15
16(define (k n) (* 5 n n))

What are the values of the following expressions?
- (A 1 10) = 1024
- (A 2 4) = (A 1 16) = 65536
- (A 3 3) = (A 2 4) = 65536
Give concise mathematical definitions for the functions computed by the procedures f, g, and h for positive integer values of n.
- (f n) computes 2n, because (A 0 n) => (* 2 n).
- (g n) computes 2ⁿ, because (A 1 n) => (A 0 (A 1 (- n 1))) => (f (g (- n 1))).
- (h n) computes ⁿ2 (tetration), because (A 2 n) => (A 1 (A 2 (- n 1))) => (g (h (- n 1))).
- (k n) computes 5n^2, as stated in the exercise.

1.2.2 Tree Recursion — Finding Fibonacci Numbers

With tree recursion, the procedure invokes itself more than once, causing the process to evolve in the shape of a tree.

Fibonacci numbers can be defined by the rule:

E.g: 0, 1, 1, 2, 3, 5, 8, 13, 21, ...

Expressed as a recursive procedure for computing Fibonacci numbers:

1(define (fib n)
2   (cond
3      ((= n 0) 0)
4      ((= n 1) 1)
5      (else (+ (fib (- n 1)) (fib (- n 2))))
6   )
7)

The diagram below shows the tree-recursive process generated in computing (fib 5). Notice that the branches split into two at each level (except at the bottom); this reflects the fact that the fib procedure calls itself twice each time it is invoked.

The number of times the procedure will compute (fib 1) or (fib 0) (the number of leaves in the above tree, in general) is precisely Fib(n + 1).
The process uses a number of steps that grows exponentially with the input.
The space required grows linearly with the input, because we need keep track only of which nodes are above us in the tree at any point in the computation.
In general, the number of steps required by a tree-recursive process will be proportional to the number of nodes in the tree, while the space required will be proportional to the maximum depth of the tree.
Tree recursion is a natural and powerful tool when considering processes that operate on hierarchically structured data rather than numbers. The interpreter itself evaluates expressions using a tree-recursive process.

Linear Iteration — Finding Fibonacci Numbers

The idea is to use a pair of integers a and b, initialized to Fib(1) = 1 and Fib(0) = 0, and to repeatedly apply the simultaneous transformations:

a ← a + b,
b ← a.

After applying this transformation n times, a and b will be equal, respectively, to Fib(n + 1) and Fib(n).

1(define (fib n)
2   (fib-iter 1 0 n)
3)
4
5(define (fib-iter a b count)
6   (if (= count 0)
7       b
8       (fib-iter (+ a b) a (- count 1))
9   )
10)

Tree-recursive processes can be useful in helping us to understand and design programs. For instance, although the first fib procedure is much less efficient than the second one, it's more straightforward, being little more than a translation into Lisp of the definition of the Fibonacci sequence.

1.2.3 Orders of Growth

processes can differ considerably in the rates at which they consume computational resources. One convenient way to describe this difference is to use the notion of order of growth to obtain a gross measure of the resources required by a process as the inputs become larger.

Let n be a parameter that measures the size of the problem, and let R(n) be the amount of resources the process requires for a problem of size n.

In our previous examples we took n to be the number for which a given function is to be computed, but there are other possibilities. For instance, if our goal is to compute an approximation to the square root of a number, we might take n to be the number of digits accuracy required. For matrix multiplication we might take n to be the number of rows in the matrices. In general there are a number of properties of the problem with respect to which it will be desirable to analyze a given process.

Similarly, R(n) might measure the number of internal storage registers used, the number of elementary machine operations performed, and so on. In computers that do only a fixed number of operations at a time, the time required will be proportional to the number of elementary machine operations performed.

We say that R(n) has order of growth (f(n)), written R(n) = Θ(f(n)) (pronounced “theta of f(n)”), if there are positive constants k1 and k2 independent of n such that k1 * f(n) ≤ R(n) ≤ k2 * f(n) for any sufficiently large value of n. (In other words, for large n, the value R(n) is sandwiched between [k1 * f(n)] and [k2 * f(n)].)

For instance, with the linear recursive process for computing factorial described in Section 1.2.1 the number of steps grows proportionally to the input n. Thus, the steps required for this process grows as Θ(n). We also saw that the space required grows as Θ(n). For the iterative factorial, the number of steps is still Θ(n) but the space is Θ(1).

The tree-recursive Fibonacci computation requires Θ(ϕn) steps and space Θ(n), where ϕ is the golden ratio.

Orders of growth provide only a crude description of the behavior of a process. For example, a process requiring n² steps and a process requiring 1000n² steps and a process requiring 3n² + 10n + 17 steps all haveΘ(n²) order of growth.

On the other hand, order of growth provides a useful indication of how we may expect the behavior of the process to change as we change the size of the problem. For a Θ(n) (linear) process, doubling the size will roughly double the amount of resources used. For an exponential process, each increment in problem size will multiply the resource utilization by a constant factor. In the remainder of Section 1.2 we will examine two algorithms whose order of growth is logarithmic, so that doubling the problem size increases the resource requirement by a constant amount.

1.2.4 Exponentiation

A procedure that takes as arguments a base b and a positive integer exponent n and computes b^n.

Recursively:

bⁿ = b * b^{n - 1},
b⁰ = 1

Which translates into the procedure below (that generates a linear recursive process)

1(define (expt b n)
2   (if (= n 0)
3       1
4       (* b (expt b (- n 1)))
5   )
6)

Iteratively:

1(define product 1)
2(define counter n) ; n is the exponent

Then, iterate:

1(define product (* b product)) ; b is the base
2(define counter (- counter 1))

Until:

1(= counter 0)

Which translates into the procedure below (that generates a linear iterative process)

1(define (expt b n)
2   (expt-iter b n 1)
3)
4
5(define (expt-iter b counter product)
6   (if (= counter 0)
7      product
8      (expt-iter  b
9                  (- counter 1)
10                  (* b product)
11      )
12   )
13)

Successive squaring can be used to reduce the number of steps.
Example: b^8
Instead of:
b (b (b (b (b (b (b * b))))))
It can be computed using three multiplications
b⁸ = b⁴ b⁴ = ((b²)²)²
b² = b b

Successive squaring can be generalized to work with odd powers too

bⁿ = (b^n/2)² if n is even;
bⁿ = b * b^n-1 if n is odd.

Expressed as a procedure:

1(define (even? n)
2   (= (remainder n 2) 0)
3)
4
5(define (fast-expt b n)
6   (cond ((= n 0) 1)
7         ((even? n) (square (fast-expt b (/ n 2))))
8         (else (* b (fast-expt b (- n 1)))))
9)

The process evolved by fast-expt grows logarithmically with n in both space and number of steps.
- To see this, observe that computing b²ⁿ using fast-expt requires only one more multiplication than computing bⁿ.
- The size of the exponent we can compute therefore doubles (approximately) with every new multiplication that are allowed.
- Thus, the number of multiplications required for an exponent of n grows about as fast as the logarithm of n to the base 2.
- The process has O(log n) growth.
The difference between O(log n) growth and O(n) growth becomes striking as n becomes large. For example, fast-expt for n = 1000 requires only 14 multiplications.

Exercise 1.16

Design a procedure that evolves an iterative exponentiation process that uses successive squaring and uses a logarithmic number of steps, as does fast-expt.

(Hint: Using the observation that (b^n/2)² = (b²)^n/2, keep, along with the exponent n and the base b, an additional state variable a, and define the state transformation in such a way that the product abⁿ is unchanged from state to state. At the beginning of the process a is taken to be 1, and the answer is given by the value of a at the end of the process. In general, the technique of defining an invariant quantity that remains unchanged from state to state is a powerful way to think about the design of iterative algorithms.)

bⁿ = (b²)^n/2 (Using exponent product rule)
bⁿ = b¹ * b^n-1 (Using exponent product rule)

bⁿ can be expressed as: abⁿ
Where `a = 1` initially.

When n is even:
abⁿ = a(b²)^n/2 = a^'(b^')^n'
a^' = `a`
b^' = b²
n^' = `n/2`

When n is odd:
abⁿ = abb^n-1 = (ab)b^n-1 = a^'(b^')^n'
a^' = `ab`
b^' = `b`
n^' = `n - 1`

1(define (even? n)
2   (= (remainder n 2) 0)
3)
4
5(define (expt b n)
6   (define (expt-iter a b n)
7      (cond
8         ((= n 0) a)
9         ((even? n) (expt-iter a (* b b) (/n 2)))
10         (else (expt-iter (* a b) b (- n 1)))
11      )
12   )
13   (expt-iter 1 b n)
14)

Exercise 1.17 — Recursive Russian peasant method of multiplication

Recursive, successive doubling: Θ(logn) time, Θ(logn) space.

Case: When a is 1: a * b = b

Case: When a is odd:
a * b = [(a - 1) * b] + b

Case: When a is even:
a * b = [(a / 2) * b] * 2

1(define (* a b)
2   (cond
3      ((= a 1) b)
4      ((even? a) (double (* (half a) b)))
5      (else (+ b (* (- a 1) b)))
6   )
7)
8
9(define (even? n)
10   (= (remainder n 2) 0)
11)
12
13(define (double n)
14   (+ n n)
15)
16
17(define (half n)
18   (/ n 2)
19)

Exercise 1.18 — Iterative Russian peasant method of multiplication

a * b can be expressed as: m + a * b
Where m = 0 initially.

When a is even:
a * b = m + b/2 * a/2 * b = m' + a' + b'
m' = m + b/2
a' = a/2
b' = b

When a is odd:
a * b = m + b * (a-1) * b = m' + a' + b'
m' = m + b
a' = a-1
b' = b

When a is one, the answer is equal to m.

Iterative, successive doubling: Θ(logn) time, Θ(1) space.

1(define (* a b)
2   (define (*-iter m new-a)
3      (cond
4         ((= new-a 1) m)
5         ((even? new-a) (*-iter (+ m (double b)) (half new-a)))
6         (else (*-iter (+ m b) (- new-a 1)))
7      )
8   )
9   (*-iter 0 a)
10)
11
12(define (even? n)
13   (= (remainder n 2) 0)
14)
15
16(define (double n)
17   (+ n n)
18)
19
20(define (half n)
21   (/ n 2)
22)

Exercise 1.19

TODO()

1.2.5 Greatest Common Divisors

The greatest common divisor (GCD) of two integers b and c is the largest integer that divides both b and c with no remainder.
Eg: GCD(16, 28) = 4

Naive algorithm for GCD

Factor the two integer arguments and search for common factors.

Euclidean algorithm for GCD

If r is the remainder when b is divided by c, then the common divisors of b and c are precisely the same as the common divisors of c and r.

GCD(b, c) = GCD(c, r)

This is used to successively reduce the problem of computing a GCD to the problem of computing the GCD of smaller and smaller pairs of integers. Eg:

1; GCD(206, 40) =
2(gcd 206 40)
3(gcd 40 6)
4(gcd 6 4)
5(gcd 4 2)
6(gcd 2 0)
72

This always reduces to a pair where the second number is zero, and the other number is the GCD of the original pair.

Expressed as a procedure:

1(define (gcd a b)
2   (if (= b 0)
3      a
4      (gcd b (remainder a b))
5   )
6)

This generates an iterative process, whose number of steps grows as the logarithm of the numbers involved.

1.2.6 Example: Testing for Primality

Searching for divisors

If d is a divisor of n, then so is n/d. But d & n/d cannot both be greater than sqrt(n).
The end test for find-divisor is based on the fact that if n is not prime it must have a divisor less than or equal to sqrt(n).

1; n is prime IFF n is its own smallest divisor.
2(define (prime? n)
3   (= n (smallest-divisor n))
4)
5
6; finds the smallest integral divisor (greater than 1) of a given number n.
7(define (smallest-divisor n) (find-divisor n 2))
8
9; testing n for divisibility by successive integers [2, 3, 4 ... sqrt(n))
10(define (find-divisor n test-divisor)
11   (cond ((> (square test-divisor) n) n)
12         ((divides? test-divisor n) test-divisor)
13         (else (find-divisor n (+ test-divisor 1)))
14   )
15)
16
17(define (divides? a b) (= (remainder b a) 0))

The number of steps required to identify n as prime will have order of growth O(sqrt(n)).

The Fermat test

Fermat’s Little Theorem: If n is a prime number and b is any positive integer less than n, then b raised to the n^th power is congruent to b modulo n. I.e:
bⁿ modulo n = b modulo n

However, if b < n, then b modulo n = b. Therefore, this simplifes to:
bⁿ modulo n = b

If n is not prime, then, in general, most of the numbers b < n will not satisfy the above relation. This leads to the following algorithm for testing primality:

Fermat test:
Given a number n, pick a random number b such that b < n.
Compute bⁿ modulo n.
If the result is not equal to b, then n is certainly not prime.
If it is b, then chances are good that n is prime.
Repeat test using another random number to improve confidence that n is prime.

1; runs fermat-test multiple times as specified by the times parameter
2(define (fast-prime? n times)
3   (cond ((= times 0) true)
4         ((fermat-test n) (fast-prime? n (- times 1)))
5         (else false)
6   )
7)
8
9(define (fermat-test n)
10   (define (try-it a)
11      (= (expmod a n n) a))
12   
13   ; Random range: [0, arg)
14   (try-it (+ 1 (random (- n 1))))
15)
16
17; computes the exponential of a number modulo another number
18(define (expmod base exp m)
19   ; uses successive squaring — like the fast-expt procedure
20   (cond ((= exp 0) 1)
21         ((even? exp)
22            (remainder
23               (square (expmod base (/ exp 2) m))
24               m
25            )
26         )
27         (else
28            (remainder
29               (* base (expmod base (- exp 1) m))
30               m
31            )
32         )
33   )
34)

e reduction steps in the cases where the exponent e is greater than 1 are based on the fact that, for any integers x, y, and m, we can find the remainder of x times y modulo m by computing separately the remainders of x modulo m and y modulo m, multiplying these, and then taking the remainder of the result modulom. For instance, in the case where e is even, we compute the remainder of be=2 modulo m, square this, and take the remainder modulo m. is technique is useful because it means we can perform our computation without ever having to deal with numbers much larger than m.

Probabilistic methods

if n ever fails the Fermat test, we can be certain that n is not prime. But the fact that n passes the test, while an extremely strong indication, is still not a guarantee that n is prime. What we would like to say is that for any number n, if we perform the test enough times and find that n always passes the test, then the probability of error in our primality test can be made as small as we like.

Unfortunately, this assertion is not quite correct. ere do exist numbers that fool the Fermat test: numbers n that are not prime and yet have the property that an is congruent to a modulo n for all integers a < n. Such numbers are extremely rare, so the Fermat test is quite reliable in practice.

Numbers that fool the Fermat test are called Carmichael numbers, and lile is known about them other than that they are extremely rare. ere are 255 Carmichael numbers below 100,000,000. e smallest few are 561, 1105, 1729, 2465, 2821, and 6601. In testing primality of very large numbers chosen at random, the chance of stumbling upon a value that fools the Fermat test is less than the chance that cosmic radiation will cause the computer to make an error in carrying out a “correct” algorithm. Considering an algorithm to be inadequate for the first reason but not for the second illustrates the difference between mathematics and engineering.

Structure and Interpretation of Computer Programs — Book + MIT 6.001 + Berkeley CS61A Summary & Notes [WIP]

Chapter 1: Building Abstractions with Procedures

1.1 The Elements of Programming

Features of a programming language

1.1.1 Expressions

Syntax for a combination

Semantics of a combination

1.1.2 Naming and the Environment

1.1.3 Evaluating Combinations

Lisp evaluation rule

Define special evaluation rule

1.1.4 Compound Procedures

General form of a procedure definition

Terminology

Lambda expression

1.1.5 The Substitution Model for Procedure Application

Normal Order

Applicative Order

1.1.6 Conditional Expressions and Predicates

Conditional expressions evaluation rule:

If expressions

If expressions evaluation rule

Logical operators

Exercise 1.3

1.1.7 Example: Square Roots by Newton’s Method

1.1.8 Procedures as Black-Box Abstractions

Local names

Internal definitions and block structure

Lexical scoping

1.2 Procedures and the Processes They Generate

1.2.1 Linear Recursion & Iteration — Finding the Factorial

Linear Recursion

Linear Iteration

Exercise 1.9

Exercise 1.10

1.2.2 Tree Recursion — Finding Fibonacci Numbers

Linear Iteration — Finding Fibonacci Numbers

1.2.3 Orders of Growth

1.2.4 Exponentiation

Exercise 1.16

Exercise 1.17 — Recursive Russian peasant method of multiplication

Exercise 1.18 — Iterative Russian peasant method of multiplication

Exercise 1.19

1.2.5 Greatest Common Divisors

Naive algorithm for GCD

Euclidean algorithm for GCD

1.2.6 Example: Testing for Primality

Searching for divisors

The Fermat test

Probabilistic methods

`Define` special evaluation rule