MCS-177 Project: Orders of Growth
First draft due: Thursday, March 18, at the beginning of lab
Final draft due: Tuesday, March 23, at 9am
Recommended preparation and
schedule
Problem
Does making a small change in a procedure make any difference in how
the time the procedure takes varies with the size of the problem? In
this project you will make some changes to the
sum-of-divisors procedure introduced in section 3.3 and
explore what effect these changes have on the procedure itself and on
procedures which call it.
In particular, you will write a second version of
sum-of-divisors.Using the techniques you learned in class,
you should do a theoretical analysis of each version to produce
predictions about how the time the computer takes to do a computation
changes with the size of the computation. Next, you will measure the
time the computer takes to do computations of varying sizes. (This is
known as empirical testing.) Then you can compare your theoretical
predictions with your empirical results. You will also write a
procedure that uses sum-of-divisors, and do similar analyses.
Extra deadline and groups
You should submit the procedure for Exercise 3.6,
sum-of-divisors, in class on Monday, March 3. We will then put
this lab project on hold for the week of March 3-7, working on other
topics instead. We will return to this lab project the following week.
However, because the test on March 6 covers chapter 3, and part 5 of
this lab project would provide you with extra practice on chapter 3
material, you may want to do that part at this point as well.
On Wednesday, March 5, you will choose
a partner from your lab section so as to work on the remaining portions of
the lab in groups of two (with perhaps one group of three, should that
prove necessary due to an odd number of students).
We warn you that working in a group of two does not mean that your
work load will be halved; in fact, some groups may find that the
amount of hassle will effectively increase the work. Our intention is
that you gain experience in working in teams, since teamwork is very
common in today's workplace. You should have a clear understanding of
each other's responsibilities, and be dependable and punctual in
carrying them out. Also, be sure that neither partner monopolizes the
work, but rather that you share it as equally as possible. Finally, we
strongly encourage you to set up a group meeting with a lab instructor
after your first or second day of working together in lab,
so that we can assess how the project is going and, if need be,
suggest strategies for more successfully working together.
Project Report
Note that each group is expected to turn in both a first draft and a
final draft of their report. The first draft will be peer-reviewed in
lab on March 20. It will also be graded by the lab instructor as if
it were the final draft, with clear indications of why we counted off
what we did, which you can use to improve your final report. Your
grade will be the grade you receive for your final report. Note also
that the lateness policy described in the course description applies
to the final report; in particular, no late first drafts will be
accepted, since we are giving you this as an opportunity for improving
your final report and they need to be corrected rapidly.
As we said in the first project, we will assign a specific audience
for each project report. This report might be called
scientific in nature, since the emphasis should be less on
the writing of the procedures, and more on the theoretical and
empirical analysis that you do of those procedures. When you write
the project report, you may assume that your audience is generally
knowledgeable about computer science and Scheme, but not about what
you did.
After a short description the procedures you wrote (including full
copies of those procedures and a brief description of how you tested
that they were accurate), your lab report should concentrate on
answering the central questions for this project. In this case, you
have two sets of procedures. In each set, there are two procedures
which produce the same result when given the same data. Your job is to
compare the amount of time the computer takes to run the procedures,
using both theoretical analyses and empirical results.
In writing this report, you will want to include not only the
procedures you used and the measurements you made, but a clear
analysis of the asymptotic orders of growth (in big
Theta-notation) for each of the two methods of doing each of
the two computations. In particular, for each method, you will need to
- (Theoretical results) Explain the order of growth for each method
by looking at the procedures themselves.
- (Experimental results) Clearly present time measurements for how
long each procedure takes as a function of n.
- (Compare the two) Explain how well the experimental measurements
match up with your theoretical results. If they differ, do they differ
by much? What would explain the differences?
Use tables and/or graphs to more clearly show your results. Keep in
mind that tables (or graphs) are most clear if headings (or axes) are
clearly labeled with both the variable being measured and the
units. Also, if your goal is to compare the results of two
experiments, it's best if the data from both can be placed in the same
table or graph. It's confusing for the reader if the data appear on
separate pages.
You might also want to make some additional predictions that go
beyond your measurements. Suppose you were interested in pursuing the
abundant/deficient ratio study well beyond n=3200, perhaps to
n=102400. How
long would you expect each method to take? Would you want to experimentally
verify this? A harder (optional) question is, how large an n would
the computer be able to handle in a day? a week? a year?
Finally, there are three documents you should consult when writing
up your project report:
Computer Exercises
-
Do exercise 3.6 on page 60, if you haven't already. Be sure to compare
low2 to n rather than comparing low to
the square root of n. The square root method will not work
reliably due to round off errors; the computer might determine that
the square root of 9 is 2.99999 instead of 3.00000. To get the text's
definitions of sum-of-divisors and divides?, copy
the procedures in the following file into your definitions window and
execute them:
~mc27/labs/sum-divisors/sum-of-divisors.scm
- What is the running time of the book's version of
sum-of-divisors expressed in big Theta-notation? What
is the running time (again expressed in big
Theta-notation) of the version you wrote for Exercise
3.6?
- You will next do some timings of these two versions of
sum-of-divisors. To load into Scheme a procedure called
run-time for timing how long computations take, copy the
following expression:
(load "~mc27/public/time.scm")
into your definition window and execute it. Now you should time how
long it takes to find the sum of divisors of 100, 1000, 10000, 100000,
and 1000000. You should do the same timing repeatedly, at least for
the smaller numbers, to get some idea how precisely repeatable the
timings are. To illustrate how you would do a timing, here is a sample
timing of the sum-of-divisors of 100:
> (run-time sum-of-divisors 100)
;Time: 0.0008151735270379337
217
>
The time is given in seconds; e.g., in the above example, the evaluation
took a little less than thousandth of a second. The time reported
will not agree well with your own impression of the time taken, because
it doesn't include time the computer spends doing other things, and also
because for fast evaluations (such as the above) the evaluation is repeated
until at least a second has elapsed and an average time for the repetitions
is reported.
What is the ratio of times from each measurement to the next? (That
is: What is the ratio of times going from 100 to 1000? From 1000 to 10000?
And so forth.) How does this compare with your expectation? (Note:
When answering questions like these in your project write-up, be sure to
back up your answers with adequate arguments. In other words, give reasons,
probably in the form of big Theta arguments, which explain your
expectations.)
- Now evaluate the improved definition from exercise 3.6 and
repeat the measurements. Again, what are the time ratios, and how do
they compare with your prediction?
- One use for the sum-of-divisors procedure other than
finding perfect numbers is to measure the relative frequency of
abundant and deficient
numbers. As mentioned in section 3.3, very few numbers are perfect. The
remaining numbers either have a sum of divisors that is more than twice
the number, in which case they are called abundant, or less than twice
the number, in which case they are called deficient. The empirical question
we'd like to ask is, are these two kinds of numbers equally common, or
does one predominate? If so, by how much? The usual mathematical way to
formulate this is to count how many numbers less than or equal to n
are abundant and how many are deficient, and take the ratio. Then repeat
this for larger and larger values of n. If the ratio approaches
closer and closer to some limiting value as n increases, then this
limiting ratio tells us the relative frequency of abundant and deficient
numbers. For example, if the ratio approaches a limit of 1, then abundant
and deficient numbers are equally frequent. On the other hand, a limiting
value of 1/2 would mean that there are roughly twice as many deficient
numbers as abundant numbers.
- Write a procedure that generates an iterative process for computing
this ratio, given n. Most of the work can be done by a subsidiary
procedure that keeps count of how many abundant numbers it has found and
how many deficient numbers it has found as it scans the range from 1 to
n.
Once the entire range has been scanned, one count can be divided by the
other. Test your procedure to be sure that it works.
- Suppose that you are using the text's definition of
sum-of-divisors with your procedure for computing the ratio,
and double the size of the range being tested. What is the running
time of the procedure expressed in big Theta-notation? By
roughly what factor do you expect the number of divides?
tests (and hence time) to increase? How about if you use the
sum-of-divisors procedure from exercise 3.6 with your ratio
finding procedure? (You can use the same general style of reasoning as
on page 81 to produce determine the running time expressed in big
Theta notation.)
- Now measure how long your procedure takes to find the ratio for
n equal to 200, 400, 800, 1600 and 3200, and calculate the ratio
of each consecutive pair of times. Do this with each of the two definitions
for
sum-of-divisors. How well do your measurements agree with
your predictions? Do the abundant/deficient ratios seem to approach some
simple limiting value? If so, what?
Back to
MCS 177
home page