Project 5: Global Sequence Alignment
Started: Tuesday 4/5; Due: Friday 4/15
Overview
This assignment is also taken from the projects on the book's web site. In this case, the assignment is
the Global Sequence Alignment. Be sure to read my comments below
for some clarification and suggestions.
You should work on this project individually.
Getting started
You should get started by reading through Global Sequence Alignment and work through one or two examples by hand. In other words, come up with some short DNA sequences and compute their edit distance using the table method illustrated in the write-up for the sequences AACAGTTACC and TAAGGTCA. You do not have to hand this in, and you may work with a partner on this part.
Required structure
Instead of giving you a template Java file to work from, we are going to have you create the code more or less from scratch. However, in order to enforce some uniformity (which will make the grading easier), we request that you do certain things:
- You should (just like in Project 2) have a copy of a directory that contains the contents of the sequence directory described in the write-up at the book's web site. I'm assuming you'll put all new files you create in this same directory.
- The name of the main class should be "EditDistance".
- You should fill in the readme.txt file.
-
Your final program should function as described at the bottom of the book site's (you choose which of the two output styles), with the following slight
difference: you would get the described output by typing
java EditDistance < example10.txt.
Submission
You should submit two files: EditDistance.java and readme.txt as described in our instructions on submitting code document.
Gradesheet
- Implementation (70 pts)
- Dynamic programming to compute Edit Distance correctly (30 pts)
- Prints the alignment of two strings correctly (20 pts)
- Measures the run-time correctly (10 pts)
- Documentation/Comments (5 pts)
- Choice of variable names, indentation, other style issues (5 pts)
- Questions in Readme.txt (30 pts)
- Alignment explanation (10 pts)
- Additional test data (5 pts)
- edit distance and run-time (5 pts)
- Answer for a, b and largest N (10 pts)