Formatting Paragraphs
MCS-178: Introduction to Computer Science II
Max Hailperin, Fall 2001
Due: October 23rd, at the beginning of class
Pre-lab
In this lab, you will be applying the technique of memoization to the
particular problem of breaking a paragraph into lines. Since this is
covered in section 12.6 of the textbook, you should read that section.
Important reminders
Be sure to test your procedures individually as you go, rather than
only testing them as they are used by the ultimate paragraph
formatting procedure. You are expected to report on how you test your
procedures in your report, so be sure to keep track as you go.
Also, please remember that the code from the book is available on the web
at
http://www.gustavus.edu/~max/concabs/code/,
so that you don't need to type in the definitions that are in the book.
In Lab
-
Do exercises 12.20 and 12.21, from page 408-409.
-
Do exercises 12.22 and 12.23, from page 409.
-
Do exercise 12.24, from page 410.
-
Do exercise 12.25 from page 411. As a source of test data,
http://www.gustavus.edu/~max/concabs/code/paragraph.scm
contains a definition of
words as a list of words (i.e.,
a list of strings). Start by trying to use the first 15 words from
that list, breaking them into lines of maximum length 15 characters.
To get the first 15 words (or any number) from the words
list, you can use the first-elements-of procedure from
chapter 7, which is in
http://www.gustavus.edu/~max/concabs/code/chap7.scm. See how long it
takes to break these first 15 words into lines. You can view the
result of the line-breaking more visually by applying the procedure
display-lines to the result; this procedure is defined in
the paragraph.scm file. It takes a list of list of strings,
one per line, such as make-lines should be producing.
Now try scaling the maximum line length up from 15, and
(independently) try scaling up the number of words beyond 15. How
rapidly does the time increase?
-
Do exercise 12.26 on page 412. Make sure that your new procedure
produces the same results as the old one. Now compare its performance
with that of the non-memoized version, again increasing both the
maximum line length and number of words. In addition to comparing
performance in the same range of problem sizes, see how the memoized
version performs on problem sizes that were too big to test with the
non-memoized version.
- Take careful quantitative performance measurements, using the
timing procedure in the file ~mc27/public/time.scm;
this is the one you used in the MCS177 lab concerning perfect numbers.
You could also make your own variation on the timing procedure if you
wanted somewhat different features. Don't include
the actual displaying of the lines in what you time. After all, the
displaying time should be the same, regardless of the method used to
calculate the line breaks. It is a matter of general experimental
methodology to not measure more than what you are interested in,
because you would let in more sources of extraneous variation
("noise") without gaining any benefit.
Postlab
Write up a lab report that explains the final product
plus any conclusions you have drawn from your data. It would be good
to present quantitative data graphically (by, for instance, plotting
running time as a function of number of words) to make it easy to
interpret. (You can use semi-log and regular graph paper found in http://www.gac.edu/~wolfe/courses/graph-paper/.)
You should attempt to compare the performance empirically of the two
versions of the program (one with and one without memoization) using
your experimental results.