This is my summary and notes from Structure and Interpretation of Computer Programs by Harold Abelson & Gerald Jay Sussman . Please use the link if you decide to buy the book after reading this as it contains my affiliate code. Thanks.
SICP uses Scheme (a dialect of Lisp).
Every computer program is a model, hatched in the mind, of a real or mental process. These processes arise from human experience or thought and are only partially understood at any time — leading to the constant evolution of programs.
Avoid complexities of control and concentrate on organizing the data to reflect the real structure of the world being modeled.
It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures.
— Alan J. Perlis
Computer science (CS) isn’t about computers (that’s electrical engineering) and it isn’t primarily a science (things are invented more than they are discovered).
CS is partly a form of engineering (concerned with building reliable, efficient mechanisms, but in software instead of metal) and partly an art form (using programming as a medium for creative expression).
The acts of the mind, wherein it exerts its power over simple ideas, are chiefly these three:
- Combining several simple ideas into one compound one, and thus all complex ideas are made.
- Bringing two ideas, whether simple or complex, together, and setting them by one another so as to take a view of them at once, without uniting them into one, by which it gets all its ideas of relations.
- The third is separating them from all other ideas that accompany them in their real existence: this is called abstraction, and thus all its general ideas are made. — John Locke, An Essay Concerning Human Understanding (1690)
√x = y
y² = x and y >= 0
To find the √x:
- Make a guess G
- Is G² close enough to x?
- If no, average G and x / G
- If yes, done.
Lisp treats procedures as data
The key to understanding complicated things is to know what not to look at, what not compute, and what not to think.
Vocabulary: a set of words that are used to build procedures. These will be the basic elements of computation, the fundamental representations of information and the fundamental procedures that we use to describe all other procedures.
Syntax: a set of rules for legally connecting elements together — i.e, how to build more complex parts of a procedure out of more basic ones.
Semantics: a set of rules for deducing the meaning associated with (simple/compound) elements.
Self-evaluating expressions are the simplest class of primitives — their value is the object or expression itself. Examples are numbers, strings, and boolean.
Combinations are compound expressions that applies input expressions to an expression representing a procedure (eg:
1(+ 137 349) ; Combinations are formed by delimiting a list of expressions within parentheses in order to denote procedure application
Combinations can be nested — an operator or operand can itself be another combination.
1(define (a-plus-abs-b a b)2 ((if (> b 0) + -)3 a4 b5 ))
The convention of placing the operator to the left of the operands is known as prefix notation.
Prefix notation accomodates procedures that take arbitrary number of arguments.
1(+ 21 35 12 7)23(+ 25 4)
No ambiguity can arise, because the operator is always the leftmost element and the entire combination is delimited by the parentheses.
Prefix notation extends in a straightforward way to allow combinations to be nested:
1(+ (* 3 5) (- 10 6))
The lisp interpreter runs in a read-evaluate-print loop (REPL).
In lisp, every expression has a value and every value has a type.
1(procedure_expression operand_expression1 operand_expression2)
The value of a combination is obtained by applying the procedure specified by the operator to the arguments that are the values of the operands.
In scheme, objects are named with
define. The names identifies a variable whose value is the object. The last sentence separates putting a value in a variable and giving it a name.
1(define size 2) ; name (size) is used to refer to the computational object (2)
define is Scheme's simplest means of abstraction — it allows the usage of simple names to refer to the results of compound operations.
Complex programs are constructed by building, step by step, computational objects of increasing complexity.
The environment is the memory the interpreter uses to hold the association between names and objects.
The interpreter follows a procedure to evaluate combinations:
operator) to the arguments that are the values of the other subexpressions (the
The evaluation rule is recursion — To evaluate a combination, the interpreter must first evaluate the elements of the combination, and so on.
Paretheses in Lisp is used for combinations. They are a way to write 2d combination trees as a linear character string.
1(* (+ 22 (* 4 6))3 (+ 3 5 7))
define), use its own evaluation rule.
Lisp doesn't make artificial distinctions between elements that are primitive to the language and custom built elements.
Definespecial evaluation rule
1(define name_symbol last_expression)
Procedure definition is a powerful abstraction technique by which a compound operation can be given a name and then referred to as a unit.
1(define (⟨name⟩ ⟨formal parameters⟩) ⟨body⟩)
⟨name⟩is a symbol to be associated with the procedure definition in the environment.
⟨formal parameters⟩are the names used within the body of the procedure to refer to the corresponding arguments of the procedure.
⟨body⟩is an expression that will yield the value of the procedure when it's applied.
⟨formal parameters⟩are grouped within parentheses, just as they would be in an actual call to the procedure being defined.
⟨body⟩is not evaluated when the procedure is evaluated, it's evaluated when the procedure is applied.
⟨body⟩contains more than one expression, each is evaluated in sequence and the value of the last one is returned.
The argument’s name comes from the function’s definition; the argument’s value comes from the invocation.
1(define elvis +) ; Even primitive procedure symnols are just names bound to a procedure value2(elvis 5 10)
(procedure_name (+ 2 3)).
1(define ⟨name⟩ (lambda (⟨formal parameters⟩) ⟨body⟩))
The general form is syntatic sugar for a lambda expression.
Syntatic sugar is having more convenient surface forms for typing something.
The lambda symbol is Lisp's way of saying "make a procedure".
The semantics says that the value of a lambda expression is a procedure object. It's some internal representation of the procedure that includes information about the formal parameters and the body.
1(lambda (x y) (/ (+ x y) 2))
The interpreter follows the same process when evaluating combinations with either compound or primitive procedure.
The substitution model for procedure application:
To apply a procedure to arguments, evaluate the body of the procedure with each formal parameter replaced by the corresponding argument.
The purpose of the substitution is to help reason about procedure application, not to provide a description of how the interpreter really works.
Typical interpreters do not apply procedures by manipulating the text of a procedure to substitute values for the formal parameters. In practice, the “substitution” is accomplished by using a local environment for the formal parameters.
There are more elaborate models — the substitution model is only a way to get started thinking formally about the evaluation process.
In general, when modeling phenomena in science and engineering, simplified incomplete models are first used. As things are examined in greater detail, these simple models become inadequate and must be replaced by more refined models.
Fully expand and then reduce — I.e Operand expressions are substituted for parameters until an expression involving only primitive operators is obtained, only then is evaluation performed.
Evaluate the arguments and then apply — Operand expressions are evaluated and then applied to a procedure.
Lisp uses applicative-order evaluation because:
In functional programming, the same answer is obtained regardless of order of evaluation.
Conditional expression enables case analysis.
The general form of a conditional expression is:
1(cond (⟨p₁⟩ ⟨e₁⟩)2 (⟨p₂⟩ ⟨e₂⟩)3 ...4 (⟨pₙ₋₁⟩ ⟨eₙ₋₁⟩)5 (else ⟨eₙ⟩)6)
It consists of the symbol
cond followed by parenthesized pairs of expressions called clauses.
The first expression in each clause is a predicate — i.e, an expression whose value is interpreted as either true or false.
The second expression in each clause is called a consequent expression.
1(define (abs x)2 (cond ((> x 0) x)3 ((= x 0) 0)4 ((< x 0) (- x))))
⟨p₁⟩is evaluated first.
elseis found, in which case the interpreter returns the value of the corresponding consequent expression
⟨e⟩as the value of the conditional expression.
The general form for the special form
if (a restricted type of conditional that allows only two cases in a case analysis) is:
1(if ⟨predicate⟩ ⟨consequent⟩ ⟨alternative⟩)
⟨predicate⟩evaluates to a true value, the interpreter then evaluates the
⟨consequent⟩and returns its value.
⟨alternative⟩and returns its value.
In addition to primitive predicates such as
>, there are logical composition operations, which enables the construction of compound predicates:
(and ⟨e₁⟩ ⟨e₂⟩ ... ⟨eₙ⟩)
(or ⟨e₁⟩ ⟨e₂⟩ ... ⟨eₙ⟩)
or are special forms, not procedures, because the subexpressions are not necessarily all evaluated (because of short circuiting).
not is an ordinary procedure.
Define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers:
1(define (sum-of-square-biggest x y z)2 (cond ((and (> x z) (> y z)) (sum-of-square x y))3 ((and (> y x) (> z x)) (sum-of-square y z))4 ((and (> x y) (> z y)) (sum-of-square x z))5 ))67 (define (sum-of-square x y)8 (+ (* x x) (* y y))9 )
The newton method of successive approximations can be used to calculate the square root.
To find the √x:
- Make a guess
y²close enough to
- If no, improve guess by averaging
x / y
- If yes, done.
1Example: Find √22Initial guess: 134Guess Quotient Average Improved guess51 (2/1) = 2 ((2 + 1)/2) 1.561.5 (2/1.5) = 1.3333 ((1.3333 + 1.5)/2) 1.416771.4167 (2/1.4167) = 1.4118 ((1.4118 + 1.4167)/2) 1.414281.4142 ...
One useful way of thinking about defining procedures is as a means of generalizing a common pattern of operations — i.e an abstraction. This is captured by giving a name to the part of the pattern that changes with each instantiation; identifying that name as a formal parameter; and then capturing that pattern as the body of a lambda expression.
Stages in capturing common patterns of computation
1(define (square x)2 (* x x))34(define (average x y)5 (/ (+ x y) 2))67(define (improve guess randicand)8 (average guess (/ randicand guess)))910(define tolerance 0.001)1112; The idea is to improve the answer until it is close enough so that its square differs from the radicand by less than a predetermined tolerance.13(define (good-enough? guess randicand)14 (< (abs (- (square guess)15 randicand))16 tolerance))1718(define (sqrt-iter guess randicand)19 (if (good-enough? guess randicand)20 guess21 (sqrt-iter (improve guess randicand) randicand)22 )23)2425(define (sqrt randicand)26 (sqrt-iter 1.0 randicand))
...I think people who really believe that you design everything before you implement it basically are people who haven’t designed very many things.
The real power is that you can pretend that you’ve made the decision and then later on figure out which one is right, which decision you ought to have made. And when you can do that, you have the best of both worlds.
The way in which you would construct a recursive process is by wishful thinking. You have to believe.
good-enough? procedure above,
square is an abstraction of a procedure (aka procedural abstraction). At this level of abstraction, any procedure that computes the square is equally good.
The choice of names for the procedure’s formal parameters should not matter to the user of the procedure. The following procedures should not be distinguishable:
1(define (square x) (* x x))2(define (square y) (* y y))
Formal parameters are bound variables — Their names don't matter.
The procedure definition binds its formal parameters.
A bound variable is a variable that was previously free, but has been bound to a specific value or set of values called domain of discourse or universe
Free variable refers to variables used in a function that are neither local variables nor parameters of that function.
If a variable is not bound, it's free.
The set of expressions for which a binding defines a name is called the scope of that name.
Formal parameters scope is the procedure's body.
Whether a variable is free or bound is relative to the object in question.
In the definition of
good-enough?above, guess and x are bound variables but <, -, abs, and square are free.
sqrt procedure above, the auxiliary procedures can be private and internal to it:
1(define (sqrt randicand)23 ; Notice: No need to pass randicand4 (define (good-enough? guess) ...)56 (define (improve guess) ...)78 (define (sqrt-iter guess) ...)910 (sqrt-iter 1.0)11)
Putting a definition in the body of a procedure makes it local to that procedure. This nesting is called block structure.
By internalizing auxiliary procedures, we can often eliminate bindings by allowing variables to remain free.
Scheme uses lexical scoping. Lexical scoping dictates that free variables in a procedure are taken to refer to bindings made by enclosing procedure definitions; that is, they are looked up in the environment in which the procedure was defined.
To become experts, we must learn to visualize the processes generated by various types of procedures. Only after we have developed such a skill can we learn to reliably construct programs that exhibit the desired behavior.
A procedure is a pattern for the local evolution of a computational process. It specifies how each stage of the process is built upon the previous stage.
|Feature||Linear recursive process||Linear iterative process|
|Definition||A recursive process is characterized by a chain of deferred operations.||An iterative process is one whose state can be summarized by a fixed number of state variables, together with a fixed rule that describes how the state variables should be updated as the process moves from state to state and an (optional) end test that specifies conditions under which the process should terminate.|
|State||There is some additional “hidden” information, maintained by the interpreter and not contained in the program variables, which indicates “where the process is” in negotiating the chain of deferred operations.||The program variables provide a complete description of the state of the process at any point.|
If the computation is stopped between steps, all that is needed to resume the conputation is to supply the interpreter with the values of the program variables.
|Shape||The substitution model reveals a "shape" of expansion followed by contraction.|
The expansion occurs as the process builds up a chain of deferred operations.
The length of the deferred operations is proportional to
The contraction occurs as the operations are actually performed.
|The substitution model reveals a "shape" with no expansion/contraction.|
|Number of required steps||Grows linearly with ||Grows linearly with |
|Space||Grows linearly with |
The interpreter has to keep track of the deferred operations.
Factorial function definition:
n! = n (n - 1) (n - 2) ... 3 2 1
Rules to find
n > 1:
n! = n * (n - 1)!
n = 1:
1! = 1
The rules translate directly into a linear recursive procedure:
1(define (factorial n)2 (if (= n 1)3 14 (* n (factorial (- n 1)))5 )6)
Visualizing the process of computing
6! using the substitution model:
1(factorial 6)2(6 * (factorial 5)) ; Expansion starts (deferred chain of multiplications)3(6 * (5 * (factorial 4)))4(6 * (5 * (4 * (factorial 3))))5(6 * (5 * (4 * (3 * (factorial 2)))))6(6 * (5 * (4 * (3 * (2 * (factorial 1)))))) ; Expansion ends7(6 * (5 * (4 * (3 * (2 * 1))))) ; Contraction starts (operations are actually performed)8(6 * (5 * (4 * (3 * 2))))9(6 * (5 * (4 * 6)))10(6 * (5 * 24))11(6 * 120) ; Contraction ends12720
Rules to find
1product = 1; Running product2counter = 1; Range: [1, n]
1product = (* counter product)2counter = (+ counter 1)
1(> counter n)
n! = Final value of
The rules translate directly into a linear iterative procedure:
1(define (factorial n)2 (define (factorial-iterate product counter)3 (if (> counter n)4 product5 (factorial-iterate (* counter product)6 (+ counter 1)7 )8 )9 )1011 (iterate 1 1)12)
Visualizing the process of computing
6! using the substitution model:
1(factorial 6)2(factorial-iterate 1 1)3(factorial-iterate 1 2)4(factorial-iterate 2 3)5(factorial-iterate 6 4)6(factorial-iterate 24 5)7(factorial-iterate 120 6)8(factorial-iterate 720 7)9720
!> A recursive procedure refers to the syntactic fact that a procedure definition refers (either directly/indirectly) to the procedure itself.
1; Procedure#12(define (+ a b)3 (if (= a 0)4 b5 (inc (+ (dec a) b))6 )7)89; Procedure#210(define (+ a b)11 (if (= a 0)12 b13 (+ (dec a)14 (inc b)15 )16 )17)
The substitution model shows that
procedure#1 generates a recursive process.
1(+ 4 5)2(inc (+ 3 5))3(inc (inc (+ 2 5)))4(inc (inc (inc (+ 1 5))))5(inc (inc (inc (inc (+ 0 5)))))6(inc (inc (inc (inc 5))))7(inc (inc (inc 6)))8(inc (inc 7))9(inc 8)109
The substitution model shows that
procedure#2 generates an iterative process.
1(+ 4 5)2(+ 3 6)3(+ 2 7)4(+ 1 8)5(+ 0 9)69
1; Ackermann’s function2(define (A x y)3 (cond ((= y 0) 0)4 ((= x 0) (* 2 y))5 ((= y 1) 2)6 (else (A (- x 1) (A x (- y 1))))7 )8)910(define (f n) (A 0 n))1112(define (g n) (A 1 n))1314(define (h n) (A 2 n))1516(define (k n) (* 5 n n))
What are the values of the following expressions?
(A 1 10)= 1024
(A 2 4)= (A 1 16) = 65536
(A 3 3)= (A 2 4) = 65536
Give concise mathematical definitions for the functions computed by the procedures
h for positive integer values of
(f n)computes 2n, because
(A 0 n)=>
(* 2 n).
(g n)computes 2n, because
(A 1 n)=>
(A 0 (A 1 (- n 1)))=>
(f (g (- n 1))).
(h n)computes n2 (tetration), because
(A 2 n)=>
(A 1 (A 2 (- n 1)))=>
(g (h (- n 1))).
5n^2, as stated in the exercise.
With tree recursion, the procedure invokes itself more than once, causing the process to evolve in the shape of a tree.
Fibonacci numbers can be defined by the rule:
E.g: 0, 1, 1, 2, 3, 5, 8, 13, 21, ...
Expressed as a recursive procedure for computing Fibonacci numbers:
1(define (fib n)2 (cond3 ((= n 0) 0)4 ((= n 1) 1)5 (else (+ (fib (- n 1)) (fib (- n 2))))6 )7)
The diagram below shows the tree-recursive process generated in computing
(fib 5). Notice that the branches split into two at each level (except at the bottom); this reflects the fact that the
fib procedure calls itself twice each time it is invoked.
(fib 0)(the number of leaves in the above tree, in general) is precisely
Fib(n + 1).
The idea is to use a pair of integers
b, initialized to
Fib(1) = 1 and
Fib(0) = 0, and to repeatedly apply the simultaneous transformations:
a ← a + b,
b ← a.
After applying this transformation
b will be equal, respectively, to
Fib(n + 1) and
1(define (fib n)2 (fib-iter 1 0 n)3)45(define (fib-iter a b count)6 (if (= count 0)7 b8 (fib-iter (+ a b) a (- count 1))9 )10)
Tree-recursive processes can be useful in helping us to understand and design programs. For instance, although the first
fib procedure is much less efficient than the second one, it's more straightforward, being little more than a translation into Lisp of the definition of the Fibonacci sequence.
processes can differ considerably in the rates at which they consume computational resources. One convenient way to describe this difference is to use the notion of order of growth to obtain a gross measure of the resources required by a process as the inputs become larger.
Let n be a parameter that measures the size of the problem, and let R(n) be the amount of resources the process requires for a problem of size n.
In our previous examples we took n to be the number for which a given function is to be computed, but there are other possibilities. For instance, if our goal is to compute an approximation to the square root of a number, we might take n to be the number of digits accuracy required. For matrix multiplication we might take n to be the number of rows in the matrices. In general there are a number of properties of the problem with respect to which it will be desirable to analyze a given process.
R(n) might measure the number of internal storage registers
used, the number of elementary machine operations performed, and so
on. In computers that do only a fixed number of operations at a time, the
time required will be proportional to the number of elementary machine
We say that
R(n) has order of growth
R(n) = Θ(f(n))
(pronounced “theta of
f(n)”), if there are positive constants k1 and k2
independent of n such that
k1 * f(n) ≤ R(n) ≤ k2 * f(n) for any sufficiently
large value of n. (In other words, for large n, the value R(n) is sandwiched
[k1 * f(n)] and
[k2 * f(n)].)
For instance, with the linear recursive process for computing factorial described in Section 1.2.1 the number of steps grows proportionally to the input n. Thus, the steps required for this process grows as Θ(n). We also saw that the space required grows as Θ(n). For the iterative factorial, the number of steps is still Θ(n) but the space is Θ(1).
The tree-recursive Fibonacci computation requires Θ(ϕn) steps and space Θ(n), where ϕ is the golden ratio.
Orders of growth provide only a crude description of the behavior of a process. For example, a process requiring n² steps and a process requiring 1000n² steps and a process requiring 3n² + 10n + 17 steps all haveΘ(n²) order of growth.
On the other hand, order of growth provides a useful indication of how we may expect the behavior of the process to change as we change the size of the problem. For a Θ(n) (linear) process, doubling the size will roughly double the amount of resources used. For an exponential process, each increment in problem size will multiply the resource utilization by a constant factor. In the remainder of Section 1.2 we will examine two algorithms whose order of growth is logarithmic, so that doubling the problem size increases the resource requirement by a constant amount.
A procedure that takes as arguments a base
b and a positive integer exponent
n and computes
bn = b * bn - 1,
b0 = 1
Which translates into the procedure below (that generates a linear recursive process)
1(define (expt b n)2 (if (= n 0)3 14 (* b (expt b (- n 1)))5 )6)
1(define product 1)2(define counter n) ; n is the exponent
1(define product (* b product)) ; b is the base2(define counter (- counter 1))
1(= counter 0)
Which translates into the procedure below (that generates a linear iterative process)
1(define (expt b n)2 (expt-iter b n 1)3)45(define (expt-iter b counter product)6 (if (= counter 0)7 product8 (expt-iter b9 (- counter 1)10 (* b product)11 )12 )13)
Successive squaring can be used to reduce the number of steps.
b (b (b (b (b (b (b * b))))))
It can be computed using three multiplications
b8 = b4 b4 = ((b2)2)2
b2 = b b
Successive squaring can be generalized to work with odd powers too
bn = (bn/2)2 if n is even;
bn = b * bn-1 if n is odd.
Expressed as a procedure:
1(define (even? n)2 (= (remainder n 2) 0)3)45(define (fast-expt b n)6 (cond ((= n 0) 1)7 ((even? n) (square (fast-expt b (/ n 2))))8 (else (* b (fast-expt b (- n 1)))))9)
The process evolved by
fast-expt grows logarithmically with
n in both space and number of steps.
fast-exptrequires only one more multiplication than computing bn.
ngrows about as fast as the logarithm of
The difference between
O(log n) growth and
O(n) growth becomes striking as
n becomes large. For example,
n = 1000 requires only 14 multiplications.
Design a procedure that evolves an iterative exponentiation process that uses successive squaring and uses a logarithmic number of steps, as does
(Hint: Using the observation that
(b2)n/2, keep, along with the exponent
n and the base
b, an additional state variable
a, and define the state transformation in such a way that the product
abn is unchanged from state to state. At the beginning of the process
a is taken to be 1, and the answer is given by the value of
a at the end of the process. In general, the technique of defining an invariant quantity that remains unchanged from state to state is a powerful way to think about the design of iterative algorithms.)
(b2)n/2(Using exponent product rule)
b1 * bn-1(Using exponent product rule)
bncan be expressed as:
Where `a = 1` initially.
n'= `n - 1`
1(define (even? n)2 (= (remainder n 2) 0)3)45(define (expt b n)6 (define (expt-iter a b n)7 (cond8 ((= n 0) a)9 ((even? n) (expt-iter a (* b b) (/n 2)))10 (else (expt-iter (* a b) b (- n 1)))11 )12 )13 (expt-iter 1 b n)14)
Recursive, successive doubling:
a * b=
a * b=
[(a - 1) * b] + b
a * b=
[(a / 2) * b] * 2
1(define (* a b)2 (cond3 ((= a 1) b)4 ((even? a) (double (* (half a) b)))5 (else (+ b (* (- a 1) b)))6 )7)89(define (even? n)10 (= (remainder n 2) 0)11)1213(define (double n)14 (+ n n)15)1617(define (half n)18 (/ n 2)19)
a * bcan be expressed as:
m + a * b
m = 0initially.
a * b=
m + b/2 * a/2 * b=
m' + a' + b'
m + b/2
a * b=
m + b * (a-1) * b=
m' + a' + b'
m + b
a is one, the answer is equal to
Iterative, successive doubling: Θ(logn) time, Θ(1) space.
1(define (* a b)2 (define (*-iter m new-a)3 (cond4 ((= new-a 1) m)5 ((even? new-a) (*-iter (+ m (double b)) (half new-a)))6 (else (*-iter (+ m b) (- new-a 1)))7 )8 )9 (*-iter 0 a)10)1112(define (even? n)13 (= (remainder n 2) 0)14)1516(define (double n)17 (+ n n)18)1920(define (half n)21 (/ n 2)22)
The greatest common divisor (GCD) of two integers
c is the largest integer that divides both
c with no remainder.
Eg: GCD(16, 28) = 4
Factor the two integer arguments and search for common factors.
r is the remainder when
b is divided by
c, then the common divisors of
c are precisely the same as the common divisors of
GCD(b, c) = GCD(c, r)
This is used to successively reduce the problem of computing a GCD to the problem of computing the GCD of smaller and smaller pairs of integers. Eg:
1; GCD(206, 40) =2(gcd 206 40)3(gcd 40 6)4(gcd 6 4)5(gcd 4 2)6(gcd 2 0)72
This always reduces to a pair where the second number is zero, and the other number is the GCD of the original pair.
Expressed as a procedure:
1(define (gcd a b)2 (if (= b 0)3 a4 (gcd b (remainder a b))5 )6)
This generates an iterative process, whose number of steps grows as the logarithm of the numbers involved.
dis a divisor of
n, then so is
n/dcannot both be greater than
The end test for
find-divisoris based on the fact that if
nis not prime it must have a divisor less than or equal to
1; n is prime IFF n is its own smallest divisor.2(define (prime? n)3 (= n (smallest-divisor n))4)56; finds the smallest integral divisor (greater than 1) of a given number n.7(define (smallest-divisor n) (find-divisor n 2))89; testing n for divisibility by successive integers [2, 3, 4 ... sqrt(n))10(define (find-divisor n test-divisor)11 (cond ((> (square test-divisor) n) n)12 ((divides? test-divisor n) test-divisor)13 (else (find-divisor n (+ test-divisor 1)))14 )15)1617(define (divides? a b) (= (remainder b a) 0))
The number of steps required to identify
n as prime will have order of growth
Fermat’s Little Theorem: If
nis a prime number and
bis any positive integer less than
braised to the
nth power is congruent to
bn modulo n = b modulo n
b < n, then
b modulo n = b. Therefore, this simplifes to:
bn modulo n = b
n is not prime, then, in general, most of the numbers
b < n will not satisfy the above relation. This leads to the following algorithm for testing primality:
- Given a number
n, pick a random number
b < n.
- If the result is not equal to
nis certainly not prime.
- If it is
b, then chances are good that
- Repeat test using another random number to improve confidence that
1; runs fermat-test multiple times as specified by the times parameter2(define (fast-prime? n times)3 (cond ((= times 0) true)4 ((fermat-test n) (fast-prime? n (- times 1)))5 (else false)6 )7)89(define (fermat-test n)10 (define (try-it a)11 (= (expmod a n n) a))1213 ; Random range: [0, arg)14 (try-it (+ 1 (random (- n 1))))15)1617; computes the exponential of a number modulo another number18(define (expmod base exp m)19 ; uses successive squaring — like the fast-expt procedure20 (cond ((= exp 0) 1)21 ((even? exp)22 (remainder23 (square (expmod base (/ exp 2) m))24 m25 )26 )27 (else28 (remainder29 (* base (expmod base (- exp 1) m))30 m31 )32 )33 )34)
e reduction steps in the cases where the exponent e is greater than 1 are based on the fact that, for any integers x, y, and m, we can find the remainder of x times y modulo m by computing separately the remainders of x modulo m and y modulo m, multiplying these, and then taking the remainder of the result modulom. For instance, in the case where e is even, we compute the remainder of be=2 modulo m, square this, and take the remainder modulo m. is technique is useful because it means we can perform our computation without ever having to deal with numbers much larger than m.
if n ever fails the Fermat test, we can be certain that n is not prime. But the fact that n passes the test, while an extremely strong indication, is still not a guarantee that n is prime. What we would like to say is that for any number n, if we perform the test enough times and find that n always passes the test, then the probability of error in our primality test can be made as small as we like.
Unfortunately, this assertion is not quite correct. ere do exist numbers that fool the Fermat test: numbers n that are not prime and yet have the property that an is congruent to a modulo n for all integers a < n. Such numbers are extremely rare, so the Fermat test is quite reliable in practice.
Numbers that fool the Fermat test are called Carmichael numbers, and lile is known about them other than that they are extremely rare. ere are 255 Carmichael numbers below 100,000,000. e smallest few are 561, 1105, 1729, 2465, 2821, and 6601. In testing primality of very large numbers chosen at random, the chance of stumbling upon a value that fools the Fermat test is less than the chance that cosmic radiation will cause the computer to make an error in carrying out a “correct” algorithm. Considering an algorithm to be inadequate for the first reason but not for the second illustrates the difference between mathematics and engineering.