Programming Methodologies
Programming methodologies deal with different methods of designing programs, and studying them will teach you how to program efficiently. This book restricts itself to the basics of programming in C and C++, assuming that you are familiar with the syntax of C and C++ and can write, debug and run programs in them. Discussions in this chapter outline the importance of structuring not only the data pertaining to the solution of a problem but also the programs that operate on that data.
Data is the basic entity or fact that is used in a calculation or manipulation process. There are two types of data: numerical and alphanumerical. Integers and floating-point numbers are of numerical data type, and strings are of alphanumerical data type. Data may be a single value or a set of values, and it is to be organized in a particular fashion. This organization, or structuring, of data will have a profound impact on the efficiency of the program.
1.1. AN INTRODUCTION TO DATA STRUCTURE
A data structure is the structural representation of logical relationships between elements of data. In other words, a data structure is a way of organizing data items by considering their relationship to each other.
A data structure mainly specifies the structured organization of data, by providing accessing methods with the correct degree of associativity. The data structure affects the design of both the structural and functional aspects of a program.
Algorithm + Data Structure = Program
Data structures are the building blocks of a program; the selection of a particular data structure helps the programmer to design more efficient programs, as the complexity and volume of the problems solved by computers are steadily increasing day by day. Programmers have to strive hard to solve these problems. If a problem is analyzed and divided into sub-problems, the task will be much easier, i.e., divide, conquer and combine.
A complex problem usually cannot be divided and programmed as a set of modules unless its solution is structured or organized. This is because when we divide a big problem into sub-problems, these sub-problems will be programmed by different programmers or groups of programmers, and all the programmers should follow a standard structural method to make the integration of these modules easy and efficient. Such hierarchical structuring of program modules and sub-modules not only reduces the complexity and controls the flow of program statements but also promotes the proper structuring of information. By choosing a particular structure (or data structure) for the data items, certain data items become closely related while others lose their relations.
The representation of a particular data structure in the memory of a computer is called a storage structure. That is, a data structure should be represented in such a way that it achieves maximum efficiency. A data structure can be represented in both the main and auxiliary memory of the computer. A storage structure representation in auxiliary memory is often called a file structure.
It is clear from the above discussion that the data structure and the operations on the organized data items can together solve a problem using a computer:

Data structure = Organized data + Operations
1.2. ALGORITHM
An algorithm is a finite, step-by-step sequence of instructions to solve a well-defined computational problem. That is, in practice, to solve any complex real-life problem, first we have to define the problem; the second step is to design an algorithm to solve that problem.
Writing and executing programs and then optimizing them may be effective for small programs. Optimization of a program is directly concerned with algorithm design. But for a large program, each part of the program must be well organized before writing the program. There are a few steps of refinement involved when a problem is converted into a program; this method is called the stepwise refinement method. There are two approaches to algorithm design: top-down and bottom-up algorithm design.
1.3. STEPWISE REFINEMENT TECHNIQUES
We can write an informal algorithm if we have an appropriate mathematical model for a problem. The initial version of the algorithm will contain general statements, i.e., informal instructions. Then we convert this informal algorithm into a formal algorithm, that is, into more definite instructions, by partially applying the syntax and semantics of a programming language. Finally, a program can be developed by converting the formal algorithm with the help of a programming language manual.
From the above discussion it is clear that there are several steps to reach a program from a mathematical model, and in every step there is a refinement (or conversion). This process, in which we go through several stages of formalization until we arrive at a program whose meaning is formally defined by a programming language manual, is called the stepwise refinement technique.
1. In the first stage, modeling, we try to represent the problem using an appropriate mathematical model such as a graph or tree. At this stage, the solution to the problem is an algorithm expressed very informally.

2. At the next stage, the algorithm is written in pseudo-language (or as a formal algorithm), that is, a mixture of programming language constructs and less formal English statements. The operations to be performed on the various types of data become fixed.

3. In the final stage we choose an implementation for each abstract data type and write the procedures for the various operations on that type. The remaining informal statements in the pseudo-language algorithm are replaced by C/C++ (or any programming language) code.
The following sections discuss different programming methodologies used to design a program.
STRUCTURED PROGRAMMING
Structured programming is a programming style known by several names: procedural decomposition, structured programming, etc. Structured programming is not programming with structures, but programming using the following types of code structures:
1. Sequences of sequentially executed statements
2. Conditional execution of statements (i.e., "if" statements)
3. Looping or iteration (i.e., "for", "do...while", and "while" statements)
4. Structured subroutine calls (i.e., functions)
In particular, the following language usage is forbidden:

• "goto" statements
• "break" or "continue" out of the middle of loops
• Multiple exit points from a function/procedure/subroutine (i.e., multiple "return" statements)
• Multiple entry points into a function/procedure/subroutine/method
In this style of programming there is a great risk that implementation details of many data structures have to be shared between functions, and thus globally exposed. This in turn tempts other functions to use these implementation details, thereby creating unwanted dependencies between different parts of the program.
The main disadvantage is that all decisions made from the start of the project depend directly or indirectly on the high-level specification of the application. It is a well-known fact that this specification tends to change over time. When that happens, there is a great risk that large parts of the application need to be rewritten.
1.8. ANALYSIS OF ALGORITHM
After designing an algorithm, it has to be checked and its correctness needs to be predicted; this is done by analyzing the algorithm. The algorithm can be analyzed by tracing all step-by-step instructions, reading the algorithm for logical correctness, and testing it on some data using mathematical techniques to prove it correct. Another type of analysis is to analyze the simplicity of the algorithm; that is, to design the algorithm in a simple way so that it becomes easier to implement. However, the simplest and most straightforward way of solving a problem may sometimes not be the best one. Moreover, there may be more than one algorithm to solve a problem. The choice of a particular algorithm depends on the following performance analyses and measurements:

1. Space complexity
2. Time complexity
1.8.1. SPACE COMPLEXITY
The space complexity of an algorithm or program is the amount of memory it needs to run to completion. Some of the reasons for studying space complexity are:
1. If the program is to run on a multi-user system, it may be required to specify the amount of memory to be allocated to the program.
2. We may be interested to know in advance whether sufficient memory is available to run the program.
3. There may be several possible solutions with different space requirements.
4. It can be used to estimate the size of the largest problem that a program can solve.

The space needed by a program consists of the following components.
• Instruction space: the space needed to store the executable version of the program; it is fixed.

• Data space: the space needed to store all constants and variable values; it has the following components:

(a) Space needed by constants and simple variables. This space is fixed.
(b) Space needed by fixed-size structured variables, such as arrays and structures.
(c) Dynamically allocated space. This space usually varies.
• Environment stack space: the space needed to store the information required to resume suspended (partially completed) functions. Each time a function is invoked, the following data are saved on the environment stack:

(a) The return address, i.e., the point from which execution has to resume after completion of the called function.
(b) The values of all local variables and the values of the formal parameters of the function being invoked.
The amount of space needed by a recursive function is called the recursion stack space. For each recursive function, this space depends on the space needed by the local variables and the formal parameters. In addition, it depends on the maximum depth of the recursion, i.e., the maximum number of nested recursive calls.
1.8.2. TIME COMPLEXITY
The time complexity of an algorithm or a program is the amount of time it needs to run to completion. The exact time will depend on the implementation of the algorithm, the programming language, the optimizing capabilities of the compiler used, the CPU speed, other hardware characteristics/specifications and so on. To measure the time complexity accurately, we would have to count all sorts of operations performed in an algorithm. If we knew the time for each of the primitive operations performed on a given computer, we could easily compute the time taken by an algorithm to complete its execution; but this time will vary from machine to machine. By analyzing an algorithm alone, it is hard to come up with an exact time requirement: to find the exact time complexity, we would need to know the exact instructions executed by the hardware and the time required for each instruction. The time complexity also depends on the amount of data input to an algorithm. But we can calculate the order of magnitude of the time required.
That is, our intention is to estimate the execution time of an algorithm irrespective of the machine on which it will be used. Here, the more sophisticated method is to identify the key operations and count how many such operations are performed until the program completes its execution. A key operation in our algorithm is an operation that takes maximum time among all possible operations in the algorithm. Such an abstract, theoretical approach is not only useful for discussing and comparing algorithms, but is also useful for improving solutions to practical problems. The time complexity can now be expressed as a function of the number of key operations performed. Before we go ahead with our discussion, it is important to understand the growth rate analysis of an algorithm.
Functions that involve 'n' as an exponent, e.g., 2^n, n^n, n!, are called exponential functions; algorithms with exponential running time are too slow except for small input sizes. Functions whose growth is less than or equal to n^c (where 'c' is a constant), e.g., n^3, n^2, n log2 n, n, log2 n, are said to be polynomial. Algorithms with polynomial time can solve reasonably sized problems if the constant in the exponent is small.
When we analyze an algorithm, the result depends on the input data; there are three cases:

1. Best case
2. Average case
3. Worst case

The best case is the amount of time a program might be expected to take on the best possible input data. The average case is the amount of time a program might be expected to take on typical (or average) input data. The worst case is the amount of time a program would take on the worst possible input configuration.
BIG "OH" NOTATION
Big Oh is a characteristic scheme that measures properties of algorithm complexity: performance and/or memory requirements. The algorithm complexity can be determined by eliminating constant factors in the analysis of the algorithm. Clearly, the complexity function f(n) of an algorithm increases as 'n' increases.
Let us find the algorithm complexity by analyzing the sequential search algorithm. In sequential search we simply try to match the target value against each value in memory. This process continues until we find a match or finish scanning all the elements in the array. If the array contains 'n' elements, the maximum possible number of comparisons with the target value will be 'n', i.e., the worst case: the target value is found at the nth position of the array.

f(n) = n

That is, the worst case is when the algorithm requires the maximum number of iterations or steps to search and find the target value in the array.
The best case is when the number of steps is as small as possible. If the target value is found at the first position of the array (i.e., we need to compare the target value with only one element), we have found the element by executing only one iteration (or the least possible number of statements):

f(n) = 1
The average case falls between these two extremes (i.e., best and worst). If the target value is found at the (n/2)th position, on average we need to compare the target value with only half of the elements in the array, so

f(n) = n/2
The complexity function f(n) of an algorithm increases as 'n' increases. The function f(n) = O(n) can be read as "f of n is big Oh of n" or as "f(n) is of the order of n". The total running time (or time complexity) includes the initializations and the several other iterative statements through the loop.
The generalized form of the theorem is

f(n) = c_k n^k + c_{k-1} n^(k-1) + c_{k-2} n^(k-2) + ... + c_2 n^2 + c_1 n + c_0

where the constant c_k > 0. Then

f(n) = O(n^k)
Based on the time complexity representation of the big Oh notation, algorithms can be categorized as:

1. Constant time, O(1)
2. Logarithmic time, O(log n)
3. Linear time, O(n)
4. Polynomial time, O(n^c)
5. Exponential time, O(c^n)

LIMITATION OF BIG "OH" NOTATION
Big Oh notation has the following two basic limitations:

1. It contains no effort to improve the programming methodology. Big Oh notation does not discuss ways and means to improve the efficiency of the program, but it helps to analyze and calculate the efficiency (by finding the time complexity) of the program.

2. It does not exhibit the potential of the constants. For example, suppose one algorithm takes 1000n^2 time to execute and another takes n^3 time. The first algorithm is O(n^2), which implies that it will take less time than the other algorithm, which is O(n^3). However, in actual execution the second algorithm will be faster for n < 1000.
We will analyze and design the problems in data structures. As we have discussed, to develop a program for an algorithm, we should select an appropriate data structure for that algorithm.
Asymptotic Analysis
A programmer usually has a choice of data structures and algorithms to use. Choosing the best one for a particular job involves, among other factors, two important measures:

- Time complexity: how much time will the program take?
- Space complexity: how much storage will the program need?
A programmer will sometimes seek a tradeoff between space
and time complexity. For example, a programmer might choose a data structure
that requires a lot of storage in order to reduce the computation time. There
is an element of art in making such tradeoffs, but the programmer must make the
choice from an informed point of view. The programmer must have some verifiable
basis on which to make the selection of a data structure or algorithm. Complexity
analysis provides such a basis.
Complexity refers to the rate at which the
storage or time grows as a function of the problem size. The absolute growth
depends on the machine used to execute the program, the compiler used to
construct the program, and many other factors. We would like to have a way of
describing the inherent complexity of a program (or piece of a program),
independent of machine/compiler considerations. This means that we must not try
to describe the absolute time or storage needed. We must instead concentrate on
a "proportionality" approach, expressing the complexity in terms of
its relationship to some known function. This type of analysis is known
as asymptotic analysis.
Asymptotic analysis is based on the idea that as the problem
size grows, the complexity can be described as a simple proportionality to some
known function. This idea is incorporated in the "Big Oh" notation
for asymptotic performance.
Definition: T(n) = O(f(n)) if and only if there are constants c0 and n0 such that T(n) <= c0 f(n) for all n >= n0.
The expression "T(n) =
O(f(n))" is read as "T of n is in Big Oh of f of n." Big Oh is
sometimes said to describe an "upper-bound" on the complexity. Other
forms of asymptotic analysis ("Big Omega", "Little Oh",
"Theta") are similar in spirit to Big Oh, but will not be discussed
in this handout.
If a function T(n) = O(f(n)), then eventually the value c f(n) will exceed the value of T(n) for some constant c. "Eventually" means "after n exceeds some value." Does this really mean anything useful? We might say (correctly) that n^2 + 2n = O(n^25), but we don't get a lot of information from that; n^25 is simply too big. When we use Big Oh analysis, we usually choose the function f(n) to be as small as possible and still satisfy the definition of Big Oh. Thus, it is more meaningful to say that n^2 + 2n = O(n^2); this tells us something about the growth pattern of the function n^2 + 2n, namely that the n^2 term will dominate the growth as n increases. The following functions are often encountered in computer science Big Oh analysis:
- T(n)
= O(1). This is called constant growth. T(n) does not grow at all as a
function of n, it is a constant. It is pronounced "Big Oh of
one." For example, array access has this characteristic. A[i] takes the same time independent of the size of
the array A.
- T(n)
= O(lg(n)). This is called logarithmic growth. T(n) grows proportional to
the base 2 logarithm of n. Actually, the base doesn't matter, it's just
traditional to use base-2 in computer science. It is pronounced "Big
Oh of log n." For example, binary search has this characteristic.
- T(n)
= O(n). This is called linear growth. T(n) grows linearly with n. It is
pronounced "Big Oh of n." For example, looping over all the
elements in a one-dimensional array would be an O(n) operation.
- T(n)
= O(n log n). This is called "n log n" growth. T(n) grows
proportional to n times the base 2 logarithm of n. It is pronounced
"Big Oh of n log n." For example, Merge Sort has this
characteristic. In fact, no sorting algorithm that uses comparison between
elements can be faster than n log n.
- T(n) = O(n^k). This is called polynomial growth. T(n) grows proportional to the k-th power of n. We rarely consider algorithms that run in time O(n^k) where k is greater than 5, because such algorithms are very slow. For example, selection sort is an O(n^2) algorithm. It is pronounced "Big Oh of n squared."
- T(n) = O(2^n). This is called exponential growth. T(n) grows exponentially. It is pronounced "Big Oh of 2 to the n." Exponential growth is the most-feared growth pattern in computer science; algorithms that grow this way are basically useless for anything but very small problems.
The growth patterns above have been listed in order of increasing "size." That is,

O(1), O(lg(n)), O(n), O(n lg(n)), O(n^2), O(n^3), ..., O(2^n).
Note that it is not true that if f(n) = O(g(n))
then g(n) = O(f(n)). The "=" sign does not mean equality in the usual
algebraic sense --- that's why some people say "f(n) is in Big Oh of g(n)" and we never say
"f(n) equals Big Oh of
g(n)."
Suppose we have a program that takes some
constant amount of time to set up, then grows linearly with the problem size n.
The constant time might be used to prompt the user for a filename and open the
file. Neither of these operations are dependent on the amount of data in the
file. After these setup operations, we read the data from the file and do
something with it (say print it). The amount of time required to read the file
is certainly proportional to the amount of data in the file. We let n be the
amount of data. This program has time complexity of O(n). To see this, let's
assume that the setup time is really long, say 500 time units. Let's also
assume that the time taken to read the data is 10n, 10 time units for each data
point read. The following graph shows the function 500 + 10n plotted against n,
the problem size. Also shown are the functions n and 20n. Note that the function n will never be larger than the function 500 + 10n, no matter how large n gets. However, there are constants c0 and n0 such that 500 + 10n <= c0 n when n >= n0. One choice for these constants is c0 = 20 and n0 = 50. Therefore, 500 + 10n = O(n). There are, of course, other choices for c0 and n0: any value of c0 > 10 will work, provided n0 is chosen large enough (n0 >= 500/(c0 - 10)).
Recursive and Iterative algorithms
Introduction
In Practical One, you will experiment with examples of SML functions: fac, fib, gcd and power. These are "obviously correct"; that is, the definitions of the functions are directly based on well-known mathematical relationships, and so we should have reasonable confidence in our programming efforts. The issue of correctness is of fundamental importance in software engineering, and will be addressed as the CS201 course progresses. However, in this note, we shall tend to take correctness for granted and worry instead about another important issue: efficiency.
Given some problem to be solved, we need to
devise an algorithm,
that is, a sequence of computational steps that transforms an input (i.e., a
representation of a problem) into an appropriate output (i.e., a representation
of the solution). When looking at the efficiency of algorithms, we are concerned
with their resource requirements. In general, the two important resource measurements are the amount of computer memory, and the amount of computer time, required by an implementation of an algorithm. For many problems, there are
straightforwardly-expressed algorithms whose resource demands are sufficiently
enormous that we need to find more subtle algorithms needing less resources.
There is no guarantee that such algorithms will exist: usually it can be shown
that problems cannot be solved using any less than some minimal level of
resource.
Measuring resource requirements
If we have devised an algorithm, it is possible to implement it by writing a computer program and then to assess its efficiency by
might measure the amount of memory space required, or the processor time
required, to compute the outputs. Note that the latter measure is not usually
the same as the actual time required, since the processor is likely to be doing
other things as well as running the algorithm. You will have a little
experience of experimentation in Practical One, where the time function will be used to measure the processor
time required to evaluate various functions on various arguments. In some
cases, this will yield unexpected observations worthy of further explanation.
The basic problems with an experimental approach
to checking efficiency (and correctness, for that matter) are threefold. First,
it is necessary to implement the algorithm: this may be time-consuming and
error-prone. Second, details of the implementation may affect efficiency, and
obscure the essential behaviour of the algorithm. Third, it is not normally
possible to test the algorithm on all possible inputs, which may mean missing
good or bad special cases.
Analysing resource requirements
As an alternative to experimentation, we shall
be studying the analysis of algorithms as one thread of the course, that is, studying
methods for predicting the resources that an algorithm requires. Analysing even
simple algorithms can be challenging, and may require the use of non-trivial
mathematics. The immediate goal of analyses is to find simple means of expressing
the important characteristics of algorithms’ resource requirements, suppressing
unnecessary details. Before embarking on an introduction to some of the
technical tools required for analysis of algorithms, this note will review the
algorithms from Practical One, to see what can be predicted about their
resource requirements. To keep things simple, we shall restrict our attention
to the computation time required, and not worry about memory space. In general,
when writing down algorithms, we do not have to use a particular programming
language, but can use an appropriate mix of notation and language that permits
a rigorous description. For the algorithms here, SML descriptions are as good
as any (and, of course, have the added advantage that they can be directly
executed by a computer).
As Practical 1 will reveal, the basic problem
with an analytic approach to predicting the performance of an algorithm is that
we usually analyse a simplified computational model, which may abstract away
features that, in practice, dominate the performance.
Analysis of factorial algorithm
The factorial algorithm can be expressed using
simple recursion:
fun fact 0 = 1
  | fact n = n * fact (n - 1);
Looking at the computation that has to be done,
we might identify three things that will consume time: integer multiplication,
integer subtraction and recursive function calls. There are other possibilities
but, if we try to take every little detail into account, we are unlikely to get
any kind of simple analysis. Let us ignore the cost of making recursive calls,
and suppose that the cost of a multiplication is M and that of a subtraction is S. We can then define a function T(n), meaning "the time required by the algorithm to compute n!", in a very similar form to that of the actual algorithm:

T(0) = C
T(n) = M + S + T(n - 1)

(where C is the constant cost of the base case). From this, T(n) = (M + S)n + C. If we regard M and S as being constants, this expression indicates that the time to compute n! is proportional
to n so, for example,
computing (2n)! will take twice as
long as computing n! does. The acid test of
such an analysis is to check its predictions against observed reality. Times
reported from the SML function call time fac [512,1024,2048,4096] were 2, 11, 53 and 228 — more than quadrupling
as the value of n is doubled. To explain
this discrepancy, we must both check our idealised analysis for flaws and also
consider whether the SML implementation might be introducing unexpected
behaviour.
In this case, one obvious problem is an
over-simplistic view of the cost of performing arithmetic. Regarding the
subtraction time S as a constant is
reasonable here, since n is a ‘normal size’
integer, i.e., fits into 16 or 32 bits, and so n-1 can be computed in one processor step. However,
this is not the situation for multiplication. Employing some knowledge of the
problem to be solved, we discover that n! grows very rapidly as n increases: Stirling's approximation for the factorial function reveals that n! grows at roughly the same rate as n^n, so the number of bits needed to represent n! (that is, lg n!, where lg is the base-two logarithm function) grows at roughly the same rate as lg n^n = n lg n. Thus, it is not reasonable to charge a constant time for every multiplication: the number of bits used for the right-hand operand in n * fact (n-1), roughly proportional to (n - 1) lg (n - 1), will rapidly become larger than 16, 32 or whatever limit our processor has on normal integers.
Had we implemented the algorithm in C, we could
not have used multiplication in such a cavalier fashion, since integers are not
allowed to be bigger than the size that the processor can handle easily. To
further understand our SML experiment, we need to establish how much time such
a long multiplication takes, in terms of individual processor steps. Our SML
system employs a straightforward algorithm (similar to long multiplication as
learnt at school, in fact). This means that the time to multiply a normal size
integer by a b-bit long integer is
proportional to b. Thus, a more accurate
expression for T(n) would be:
T(0) = C
T(n) = S + M (n - 1) lg (n - 1) + T(n - 1)

(where M is still some constant). Using appropriate solution techniques, we can establish that T(n) ∝ n^2 lg n. For example, (2n)! will take just over four times as long to compute as n! does. This corresponds
rather better to the experimental results. Normally, we should hope to be able
to analyse algorithms without having to probe details of possible
implementations, because we assume that they will use ‘reasonable’
computational operations. Here, the SML system was being a little
‘unreasonable’ perhaps, since it was suggesting to us that an expensive
operation was available at a modest charge.
Analysis of Fibonacci algorithms
In the obvious Fibonacci algorithm, there are
two conspicuous things that consume time: integer addition and recursive
function calls. Letting A be a constant representing the time required for a simple
addition, we can write down a function T(n) meaning "the time required by the algorithm to compute the n-th Fibonacci number":

T(0) = C
T(1) = C
T(n) = A + T(n - 1) + T(n - 2)
Using appropriate solution techniques, we
discover (to our slight horror) that T(n) is roughly proportional to 2n.
This is a very fast rate of growth, and helps to
explain why our SML implementations run very slowly, even for modest values
of n. The problem is the
pair of recursive calls, which duplicate much work. A little thought leads to
the alternative fastfib algorithm
that eliminates one of the recursions, and so has:

T(n) = A + T(n - 1)
which is of similar style to the earlier
factorial time analysis. Thus, the time requirement is proportional to n — a dramatic improvement. Again, as with the
factorial algorithm, we might worry about whether the addition operations can
be done simply. However, this is a lesser concern when just comparing the two
different Fibonacci algorithms, since both perform similar kinds of addition
operations.
(Highlights of) analysis of greatest common divisor algorithm
Letting D be a constant representing the time required for
a division (needed to perform the mod operation), we can write down a
function T(m,n) meaning “the time required by the algorithm to
compute the greatest common divisor of m and n":

T(m, 0) = C
T(m, n) = D + T(n, m mod n),  for n > 0
Here, the form of the recursion makes it rather
harder to derive an expression for T(m,n) in terms of m, n and D. The essential question
is: how long will the sequence of recursive calls be? It is rather hard to work
this out, but it turns out that the following is true:

if the evaluation of gcd(m, n) requires k recursive calls, then n >= F_k

where F_k is the k-th Fibonacci number. If we indulge in some
study of Fibonacci numbers, and then do some algebraic manipulation, it is
possible to eventually establish that, at worst, the time required to compute
gcd(m,n) is roughly
proportional to lg n. This backs up
experimental observations that the algorithm is fast. [Health warning: the full details of this analysis are far beyond
the scope of the CS201 course.]
(Exercise on) analysis of powering algorithms
Write down ‘time required’ functions for both
the power and fastpower algorithms. You should be able to derive a
non-recursive expression for the power time function fairly easily. [Hint: similar to the one for
factorial.] Can you derive a non-recursive expression for the fastpower time function? (If not, await a lecture in the
near future.) As with the factorial algorithm, we also have to think about the
size of integers involved. When we do this, it turns out that there is not much
difference between the performance of the two algorithms, as you may have
discovered experimentally. With the SML implementation, there is a further
twist: if the binary representations of intermediate values contain lots of
zeroes (e.g., when computing powers of two), long multiplication is faster; this
helps the fastpower algorithm.
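As a starting point for the exercise, here is one common formulation of the two powering algorithms, sketched in Python rather than SML, with multiplications counted; the exact shape of the notes' `fastpower` may differ slightly.

```python
def power(x, n):
    """Linear powering; returns (x**n, multiplications used): n of them."""
    if n == 0:
        return 1, 0
    y, c = power(x, n - 1)
    return x * y, c + 1

def fastpower(x, n):
    """Powering by repeated squaring; O(lg n) multiplications."""
    if n == 0:
        return 1, 0
    y, c = fastpower(x * x, n // 2)   # one multiplication squares x
    if n % 2 == 1:
        return x * y, c + 2           # the squaring plus one extra multiply
    return y, c + 1                   # just the squaring
```

For n = 10, `power` uses 10 multiplications and `fastpower` only 6; the catch, as noted above, is that `fastpower` multiplies larger intermediate values.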
Thus far, we have analysed various algorithms in
a fairly informal way, starting from first principles each time. In the
remainder of this note, we draw on our earlier experience in order to establish
some general concepts that are useful whenever algorithms are being analysed
and compared.
There are three main things to be introduced:
standard simplifications to remove unnecessary detail from algorithms under
consideration; some mathematical notation to express concisely the resource
requirements of algorithms; and some algebraic techniques for manipulating
formulae expressing resource requirements. We look briefly at the first two of
these, and defer the algebraic techniques until later in the course.
Simplifying input detail
To quantify the resource requirements of an
algorithm, we should really define a function that maps each possible input to
a prediction of the algorithm’s resource requirement on that input. However,
for all but the most trivial types of input, this is a daunting undertaking
because of the number and variety of different inputs possible. Therefore, we
normally distill the inputs to get a simpler characterisation: the input size. Usually, when viewed at a low enough level of
abstraction, the size of a ‘reasonable’ binary representation of the input is a
good measure. This allows some comparison of the resource requirements of
algorithms from widely different problem areas. However, within specific
problem areas, it is often clearer to define input size at an appropriate
higher level.
The insertion of the word “reasonable” above is
designed to avoid occasions where we appear to be solving large problems
cheaply but where we are, in fact, solving small problems expensively, because
the algorithm input is represented in a way far more complicated than
necessary. As a rough guide to what is reasonable, we can be guided by a
fundamental result from information theory: if a data type has k different values, each of which is equally
likely to occur as inputs, then n lg k bits are necessary and
sufficient to represent a sequence of n inputs. As an indication that this result is not
unexpected, consider numbering the different values from 0 to k - 1, and recall that lg k bits are necessary and sufficient to represent
these integers uniquely. (As an indication that the result is not trivial,
observe that if some values occur more frequently than others we may transmit
messages more efficiently by using a code with fewer bits for more frequent
characters.)
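The "numbering the values" argument can be made concrete in a line or two of Python (the choice of k = 26, as for uniformly random letters, is just an example):

```python
import math

# k equally likely values can be numbered 0 .. k-1, and ceil(lg k) bits
# suffice to write each such number in binary; a sequence of n values
# therefore needs about n * lg k bits.
k = 26                                    # e.g. uniformly random letters
bits_per_value = math.ceil(math.log2(k))  # 5 bits cover 0 .. 25
sequence_bits = 100 * bits_per_value      # roughly n lg k for n = 100
```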
After defining the input size measure for an
algorithm, we define a function that maps each input size to a predicted
resource requirement ‘on that input size’. The trouble is that there is usually
a range of requirements: some inputs of a given size may be very cheap to deal
with, others may be very expensive. Normally, we determine the requirements of
the worst possible input of each size in order to characterise the behaviour of
an algorithm. Sometimes however, where it is clear that an algorithm is
normally far better than a worst case analysis suggests, we determine the
average requirement over all the inputs of a given size. The analysis of the
quick sort algorithm later in the course will give an example of this.
Simplifying
computational detail
To predict the time required for the execution
of an algorithm, we shall attempt to predict how many
‘elementary computational operations’ are performed. The working definition of
“elementary” will be that an operation requires some constant amount of time to
carry out, that is, the time does not vary with the size of the operands.
Everyday examples include adding two 64-bit real numbers, comparing two 32-bit
integers, or moving a 16-bit integer from one location to another. Outlawed
examples include adding two arbitrary length integers, or searching an
arbitrary-length list for an item of interest. To avoid our analyses becoming
dependent on implementation details, we shall not exactly quantify the constant
amounts of time required for elementary operations. This simplification still
allows us to examine the basic behaviour of algorithms, and to compare
alternative algorithms.
As the informal analyses given earlier may have
suggested, we shall make a further abstraction to simplify matters: only
the rate of growth of the time requirement with respect to increasing input size is
of interest. Therefore, instead of aiming to derive a fairly precise formula to
express an algorithm’s time requirements, e.g., T(n) = 10n^2 - 2n + 9 elementary operations where n is the input size, we shall only aim to
establish the dominant, highest-order term, here 10n^2. Moreover, we shall ignore the constant
coefficient, here 10, since constant factors are less significant than the rate
of growth in determining efficiency for large input sizes. Given these
simplifications, we can express formulae for resource requirements concisely
using ‘big-O’ notation. The formula

    T(n) = O(f(n))

read “T(n) is of order f(n)”, is defined to mean:
there are two constants, c and n0, such that

    T(n) ≤ c f(n)  for all n ≥ n0

(we assume that none of c, n0, T(n) or f(n) is ever negative). In
the example above where T(n) = 10n^2 - 2n + 9, we just write T(n) = O(n^2). Here, f(n) = n^2, and c = 10 and n0 = 5 are suitable constants. Note
that T(n) does not have to grow as fast as f(n); saying T(n) = O(f(n)) merely asserts that T doesn’t grow any faster than f. Usually, the functions f(n) that we see are of the form n^k (where k is a constant), lg n or 2^n, or some multiplicative combination of these, such as n^2 lg n or n 2^n. Note that lg n grows more slowly than any function of the
form n^k, for every k > 0 (even when k < 1), and 2^n grows more quickly than any function of
the form n^k. One useful, and
slightly abused, special case is f(n) = 1: the formula T(n) = O(1) is used as shorthand
for “T(n) is bounded by some constant for all n”.
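The witness constants for the running example can be checked mechanically; this short Python sketch just evaluates the cost function from the text:

```python
def T(n):
    """The example cost function from the text: 10n^2 - 2n + 9."""
    return 10 * n * n - 2 * n + 9

# With c = 10 and n0 = 5 we have T(n) <= 10 * n^2 for all n >= 5,
# witnessing T(n) = O(n^2).  Note that n0 = 5 is tight for c = 10:
# T(4) = 161 exceeds 10 * 4^2 = 160.
```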
We usually consider that one algorithm is more
efficient than another algorithm if its time requirement has a lower rate of
growth. Note that the more efficient algorithm may not be faster for some input sizes, but it
will be faster for all large enough inputs. When comparing algorithms on the
basis of big-O notation formulae, we must take care that the hidden n0 thresholds and the hidden constant factors c are not outrageously large.
Remember that, for example, an O(n)-time algorithm
requiring 1000000n operations is slower than
an O(n^2) one requiring n^2 operations unless n > 1000000.
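The arithmetic behind that warning is easy to verify; `linear_ops` and `quadratic_ops` below are hypothetical operation counts matching the example in the text:

```python
def linear_ops(n):
    """Operation count of the hypothetical O(n) algorithm."""
    return 1000000 * n

def quadratic_ops(n):
    """Operation count of the hypothetical O(n^2) algorithm."""
    return n * n

# The two counts cross exactly at n = 1000000: below it the O(n^2)
# algorithm is cheaper, above it the O(n) algorithm wins.
```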
Deriving formulae for
time requirements
The examples given earlier should indicate that
it may be fairly straightforward to derive a recursive definition for the time
requirement of a given algorithm. Solving such definitions to obtain an
algebraic formula for the complexity is sometimes a taxing mathematical
challenge. However, many examples are straightforward, and different algorithms
may often be treated using similar methods. We will return to look at some of
these techniques later in the course. For the moment, as an exercise, we derive O-expressions for the rates of growth of the
various recursively defined functions, counting nodes and leaves of trees,
introduced in Lecture Note 1.
Full Trees
The number of leaves in a full binary tree of
height n is given by the
equations:

    B(0) = 1
    B(n+1) = 2 B(n)

We want to derive an algebraic expression
for B(n). By induction on n, we can show that B(n) = 2^n. The first equation
gives the base case, n = 0. Using the second
equation and the induction hypothesis, we can derive the induction step:

    B(n+1) = 2 B(n) = 2 · 2^n = 2^(n+1)

So, B(n) is exponential: B(n) = O(2^n).
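As a quick sanity check, the recurrence for a full binary tree (taking the usual definition B(0) = 1 with each extra level doubling the leaf count) can be transcribed into Python and compared against the closed form:

```python
def B(n):
    """Leaves in a full binary tree of height n, straight from the recurrence."""
    return 1 if n == 0 else 2 * B(n - 1)

# The closed form derived by induction: B(n) = 2^n.
```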
Spines
The equations for the number of leaves in a
spine,

    S(0) = 1
    S(n+1) = S(n) + 1

have a clear solution: S(n) = n + 1. So S(n) = O(n), a linear function; a formal proof would
again use induction.
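The same check works for the spine recurrence (assuming S(0) = 1 with one extra leaf per level):

```python
def S(n):
    """Leaves in a spine of height n, straight from the recurrence."""
    return 1 if n == 0 else S(n - 1) + 1

# The closed form: S(n) = n + 1, a linear function.
```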