Lab 2

MCS 394: Topics in Computer Science
Karl Knight, Spring 2000

Due: Monday, March 13, the beginning of class

Briefly stated, your assignment is to implement the ID3 algorithm as described in Table 3.1 on page 56 of the textbook, and to run some experiments with the algorithm on data I will provide for you.

More specifically, I am giving you several files that will help you in your work. They are all located in the following directory, which you should probably copy to your work space:

         ~karl/www-docs/courses/mcs394-s00/labs/lab2/decision-trees/

Your tasks:

  1. Implement the ID3 algorithm as described on page 56 of Mitchell. I recommend that you write a procedure called id3 that takes a list of training examples and returns the decision tree that ID3 produces for that list. In particular, if you use all of the tennis-data in tennis-data.scm, you should get the tree in Figure 3.1 on page 53 of Mitchell.

    One remark: I recommend that you use a fairly simple, concrete representation for your decision tree. For example, my representation for the decision tree in Figure 3.1 is:

            (outlook (sunny (humidity (high no)
                                      (normal yes)))
                     (overcast yes)
                     (rain (wind (strong no)
                                 (weak yes))))
    

  2. You should write a procedure classify which takes a decision tree and an example, and returns the target classification that the decision tree gives for the example.

  3. Since the house voting data has missing attribute values, you should extend your ID3 procedure to handle them, using one of the methods described on page 75 of Mitchell.

  4. Finally, you should test your ID3 procedure on the house voting data, using various fractions of the data as training examples and seeing how well the resulting tree classifies the non-training examples. Remember that the procedure random-fractional-list in utility.scm will be helpful for this task.
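
ID3 chooses the attribute to split on by information gain, so it may help to check your arithmetic on class counts before wiring it into id3. The following is only a minimal sketch: the names log-base-2, entropy, and information-gain are hypothetical and do not come from the lab files, and the procedures work on lists of class counts rather than on whatever example format tennis-data.scm uses.

```scheme
;; Hypothetical helpers for checking ID3 arithmetic; not part of the lab files.

;; Logarithm base 2, built from the natural logarithm.
(define (log-base-2 x)
  (/ (log x) (log 2)))

;; Entropy (in bits) of a collection described by its class counts,
;; e.g. (9 5) for 9 positive and 5 negative examples.
(define (entropy counts)
  (let ((total (apply + counts)))
    (let loop ((cs counts) (sum 0))
      (cond ((null? cs) sum)
            ((= (car cs) 0) (loop (cdr cs) sum)) ; 0 log 0 is taken as 0
            (else
             (let ((p (/ (car cs) total)))
               (loop (cdr cs) (- sum (* p (log-base-2 p))))))))))

;; Information gain of a split. parent-counts are the class counts before
;; the split; subset-counts is a list with one count list per attribute value.
(define (information-gain parent-counts subset-counts)
  (let ((total (apply + parent-counts)))
    (let loop ((subsets subset-counts) (weighted 0))
      (if (null? subsets)
          (- (entropy parent-counts) weighted)
          (let ((size (apply + (car subsets))))
            (loop (cdr subsets)
                  (+ weighted
                     (* (/ size total) (entropy (car subsets))))))))))
```

For the 14 tennis examples (9 yes, 5 no), (entropy '(9 5)) comes out to about 0.940, and splitting on wind (weak: 6 yes, 2 no; strong: 3 yes, 3 no) gives (information-gain '(9 5) '((6 2) (3 3))) of about 0.048, which agrees with the figures worked out in Mitchell.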
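
The nested-list tree representation recommended in task 1 can be walked with a short recursive procedure. The sketch below is one possible way to do it, not the required solution. It assumes that a leaf is a bare symbol, that an internal node has the form (attribute (value subtree) ...), and that an example is an association list such as ((outlook sunny) (humidity high)). That example format is an assumption made for illustration, since the actual format in tennis-data.scm may differ. The names example-tree and classify-sketch are hypothetical.

```scheme
;; The tree from Figure 3.1, in the representation shown in task 1.
(define example-tree
  '(outlook (sunny (humidity (high no)
                             (normal yes)))
            (overcast yes)
            (rain (wind (strong no)
                        (weak yes)))))

;; Walk the tree for one example. The example is ASSUMED to be an
;; association list of (attribute value) pairs; adapt the lookup to the
;; real example format. Returns the symbol unknown when the example lacks
;; the tested attribute or has a value with no matching branch.
(define (classify-sketch tree example)
  (if (symbol? tree)
      tree                                        ; leaf: the classification
      (let ((entry (assq (car tree) example)))    ; look up the tested attribute
        (if (not entry)
            'unknown
            (let ((branch (assq (cadr entry) (cdr tree))))
              (if branch
                  (classify-sketch (cadr branch) example)
                  'unknown))))))
```

With these assumptions, (classify-sketch example-tree '((outlook sunny) (humidity high) (wind weak))) returns no, and any example whose outlook is overcast returns yes. Returning unknown for an unmatched value is a design choice made here for simplicity; the methods for missing attribute values on page 75 of Mitchell are better options for task 3.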