Formatting Paragraphs

MCS-178: Introduction to Computer Science II
Max Hailperin, Fall 2001

Due: October 23rd, at the beginning of class

Pre-lab

In this lab, you will be applying the technique of memoization to the particular problem of breaking a paragraph into lines. Since this is covered in section 12.6 of the textbook, you should read that section.

Important reminders

Be sure to test your procedures individually as you go, rather than only testing them as they are used by the ultimate paragraph formatting procedure. You are expected to report on how you test your procedures in your report, so be sure to keep track as you go.

Also, please remember that the code from the book is available on the web at http://www.gustavus.edu/~max/concabs/code/, so that you don't need to type in the definitions that are in the book.

In Lab

  1. Do exercises 12.20 and 12.21, from page 408-409.

  2. Do exercises 12.22 and 12.23, from page 409.

  3. Do exercise 12.24, from page 410.

  4. Do exercise 12.25 from page 411. As a source of test data, http://www.gustavus.edu/~max/concabs/code/paragraph.scm contains a definition of words as a list of words (i.e., a list of strings). Start by trying to use the first 15 words from that list, breaking them into lines of maximum length 15 characters. To get the first 15 words (or any number) from the words list, you can use the first-elements-of procedure from chapter 7, which is in http://www.gustavus.edu/~max/concabs/code/chap7.scm. See how long it takes to break these first 15 words into lines. You can view the result of the line-breaking more visually by applying the procedure display-lines to the result; this procedure is defined in the paragraph.scm file. It takes a list of list of strings, one per line, such as make-lines should be producing.

    Now try scaling the maximum line length up from 15, and (independently) try scaling up the number of words beyond 15. How rapidly does the time increase?

  5. Do exercise 12.26 on page 412. Make sure that your new procedure produces the same results as the old one. Now compare its performance with that of the non-memoized version, again increasing both the maximum line length and number of words. In addition to comparing performance in the same range of problem sizes, see how the memoized version performs on problem sizes that were too big to test with the non-memoized version.

  6. Take careful quantitative performance measurements, using the timing procedure in the file ~mc27/public/time.scm; this is the one you used in the MCS177 lab concerning perfect numbers. You could also make your own variation on the timing procedure if you wanted somewhat different features. Don't include the actual displaying of the lines in what you time. After all, the displaying time should be the same, regardless of the method used to calculate the line breaks. It is a matter of general experimental methodology to not measure more than what you are interested in, because you would let in more sources of extraneous variation ("noise") without gaining any benefit.

Postlab

Write up a lab report that explains the final product plus any conclusions you have drawn from your data. It would be good to present quantitative data graphically (by, for instance, plotting running time as a function of number of words) to make it easy to interpret. (You can use semi-log and regular graph paper found in http://www.gac.edu/~wolfe/courses/graph-paper/.) You should attempt to compare the performance empirically of the two versions of the program (one with and one without memoization) using your experimental results.