MCS-177 Project: Orders of Growth

Due: Friday, October 21, at the beginning of class

Recommended preparation and schedule

Problem

Does making a small change in a procedure make any difference in how the time the procedure takes varies with the size of the problem? In this project you will make some changes to the sum-of-divisors procedure introduced in section 3.3 and explore what effect these changes have on the procedure itself and on procedures which call it.

In particular, you will write a second version of sum-of-divisors.Using the techniques you learned in class, you should do a theoretical analysis of each version to produce predictions about how the time the computer takes to do a computation changes with the size of the computation. Next, you will measure the time the computer takes to do computations of varying sizes. (This is known as empirical testing.) Then you can compare your theoretical predictions with your empirical results. You will also write a procedure that uses sum-of-divisors, and do similar analyses.

In class, we'll talk about how to present your timing data both graphically and by using a table as you try to confirm your theoretical run-time predictions. For the project report, while we recommend you present the same data both graphically and in a table (for practice and to get feedback), you are only required to present the data in one of the two formats.

Extra deadline and groups

You should submit the procedure for Step 2 of the lab, the code for the aprocedure which computes the ratio of abundant and deficient numbers, in class on Wednesday, October 5. You'll then have time to do the less rote parts of the labs during the lab period.

On this lab, you will work in groups of two (with perhaps one group of three, should that prove necessary due to an odd number of students). You may wish to choose a partner who is both in your lab section and your lecture. You should not work with the same partner twice during the course.

We warn you that working in a group of two does not mean that your work load will be halved; in fact, some groups may find that the amount of hassle will effectively increase the work. Our intention is that you gain experience in working in teams, since teamwork is very common in today's workplace. You should have a clear understanding of each other's responsibilities, and be dependable and punctual in carrying them out. Also, be sure that neither partner monopolizes the work, but rather that you share it as equally as possible. Finally, we strongly encourage you to set up a group meeting with a lab instructor after your first or second day of working together in lab, so that we can assess how the project is going and, if need be, suggest strategies for more successfully working together.

Project Report

Note that each group is expected to turn in both a first draft and a final draft of their report. The first draft will be peer-reviewed in class on October 17. It will also be graded by the lab instructor as if it were the final draft, with clear indications of why we counted off what we did, which you can use to improve your final report. Your grade will be the grade you receive for your final report. Note also that the lateness policy described in the course description applies to the final report; in particular, no late first drafts will be accepted, since we are giving you this as an opportunity for improving your final report and they need to be corrected rapidly.

As we said in the first project, we will assign a specific audience for each project report. This report might be called scientific in nature, since the emphasis should be less on the writing of the procedures, and more on the theoretical and empirical analysis that you do of those procedures. When you write the project report, you may assume that your audience is generally knowledgeable about computer science and Scheme, but not about what you did.

After a short description the procedures you wrote (including full copies of those procedures and a brief description of how you tested that they were accurate), your lab report should concentrate on answering the central questions for this project. In this case, you have two sets of procedures. In each set, there are two procedures which produce the same result when given the same data. Your job is to compare the amount of time the computer takes to run the procedures, using both theoretical analyses and empirical results.

In writing this report, you will want to include not only the procedures you used and the measurements you made, but a clear analysis of the asymptotic orders of growth (in big Theta-notation) for each of the two methods of doing each of the two computations. In particular, for each method, you will need to

Use tables and/or graphs to more clearly show your results. Keep in mind that tables (or graphs) are most clear if headings (or axes) are clearly labeled with both the variable being measured and the units. Also, if your goal is to compare the results of two experiments, it's best if the data from both can be placed in the same table or graph. It's confusing for the reader if the data appear on separate pages.

You might also want to make some additional predictions that go beyond your measurements. Suppose you were interested in pursuing the abundant/deficient ratio study well beyond n=3200, perhaps to n=102400. How long would you expect each method to take? Would you want to experimentally verify this? A harder (optional) question is, how large an n would the computer be able to handle in a day? a week? a year?

Finally, there are three documents you should consult when writing up your project report:

Project Exercises

  1. Do exercise 3.6 on page 60, if you haven't already. Be sure to compare low2 to n rather than comparing low to the square root of n. The square root method will not work reliably due to round off errors; the computer might determine that the square root of 9 is 2.99999 instead of 3.00000. To get the text's definitions of sum-of-divisors and divides?, copy the procedures in the following file into your definitions window and execute them:
      ~mc27/labs/sum-divisors/sum-of-divisors.scm
  2. (Do the programming for this exercise outside of your lab hour. During lab, skip to the next part.)
    One use for the sum-of-divisors procedure other than finding perfect numbers is to measure the relative frequency of abundant and deficient numbers. As mentioned in section 3.3, very few numbers are perfect. The remaining numbers either have a sum of divisors that is more than twice the number, in which case they are called abundant, or less than twice the number, in which case they are called deficient. The empirical question we'd like to ask is, are these two kinds of numbers equally common, or does one predominate? If so, by how much? The usual mathematical way to formulate this is to count how many numbers less than or equal to n are abundant and how many are deficient, and take the ratio. Then repeat this for larger and larger values of n. If the ratio approaches closer and closer to some limiting value as n increases, then this limiting ratio tells us the relative frequency of abundant and deficient numbers. For example, if the ratio approaches a limit of 1, then abundant and deficient numbers are equally frequent. On the other hand, a limiting value of 1/2 would mean that there are roughly twice as many deficient numbers as abundant numbers.
    Write a procedure that generates an iterative process for computing this ratio, given n. Most of the work can be done by a subsidiary procedure that keeps count of how many abundant numbers it has found and how many deficient numbers it has found as it scans the range from 1 to n. Once the entire range has been scanned, one count can be divided by the other. Test your procedure to be sure that it works.

    Do the abundant/deficient ratios seem to approach some simple limiting value? If so, what?

  3. What is the running time of the book's version of sum-of-divisors expressed in big Theta-notation? What is the running time (again expressed in big Theta-notation) of the version you wrote for Exercise 3.6?
  4. You will next do some timings of these two versions of sum-of-divisors. To load into Scheme a procedure called run-time for timing how long computations take, copy the following expression:
      (load "~mc27/public/time.scm")
    into your definition window and execute it. Now you should time how long it takes to find the sum of divisors of 100, 1000, 10000, 100000, and 1000000. You should do the same timing repeatedly, at least for the smaller numbers, to get some idea how precisely repeatable the timings are. To illustrate how you would do a timing, here is a sample timing of the sum-of-divisors of 100:
    > (run-time sum-of-divisors 100) 

    ;Time: 0.0008151735270379337  
    217
    >
    The time is given in seconds; e.g., in the above example, the evaluation took a little less than thousandth of a second. The time reported will not agree well with your own impression of the time taken, because it doesn't include time the computer spends doing other things, and also because for fast evaluations (such as the above) the evaluation is repeated until at least a second has elapsed and an average time for the repetitions is reported.

    What is the ratio of times from each measurement to the next? (That is: What is the ratio of times going from 100 to 1000? From 1000 to 10000? And so forth.) How does this compare with your expectation? (Note: When answering questions like these in your project write-up, be sure to back up your answers with adequate arguments. In other words, give reasons, probably in the form of big Theta arguments, which explain your expectations.)

  5. Now evaluate the improved definition from exercise 3.6 and repeat the measurements. Again, what are the time ratios, and how do they compare with your prediction?
  6. Suppose that you are using the text's definition of sum-of-divisors with your procedure for computing the ratio of abundant/deficient numbers. Evaluate the running time of your procedure to complete this ratio using big Theta notation. You can use the same general style of reasoning as on page 81, particularly the first complete paragraph. (If you don't require this analysis, you are likely in error, so be sure to ask questions!) Repeat your analysis, this time assuming you called the sum-of-divisors procedure from exercise 3.6 from your ratio finding procedure?
  7. Here you'll measure how long your procedure takes to find the ratio for n equal to 200, 400, 800, 1600 and 3200, and calculate the ratio of each consecutive pair of times. Before you take data, what do you expect the ratios to be given your big-Theta analysis. Now, take time measurements with each of the two definitions for sum-of-divisors. How well do your measurements agree with your predictions?

Back to MCS 177 home page