Notation

• A Kotlin string can be treated as an indexed sequence of characters.
• Individual characters in a Kotlin string can be accessed through a nonnegative integer index using the [] notation.
• For a string $$X$$, we write $$X[i..j)$$ to denote the substring of $$X$$ consisting of all characters from position $$i$$ to position $$j-1$$ inclusive. (Note that this is not Kotlin notation; it is simply our convenient notation for talking about substrings.)
• For example, if $$X$$ is AFRICA, then $$X[1..4)$$ is FRI and $$X[0..6)$$ is AFRICA itself.
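In Kotlin itself, the half-open range $$X[i..j)$$ lines up with `String.substring(startIndex, endIndex)`, whose end index is exclusive. The following is a minimal sketch; the helper names `sub` and `charAt` are ours, chosen for illustration, and are not part of Kotlin.

```kotlin
// Hypothetical helper: returns X[i..j), i.e. the characters at positions i, i+1, ..., j-1.
// It relies on String.substring(startIndex, endIndex), where endIndex is exclusive.
fun sub(x: String, i: Int, j: Int): String = x.substring(i, j)

// Hypothetical helper: the [] notation returns the single Char at a nonnegative index.
fun charAt(x: String, i: Int): Char = x[i]
```

With X = "AFRICA", `sub(X, 1, 4)` gives FRI and `sub(X, 0, 6)` gives AFRICA, in agreement with the examples above.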

Definitions

• If one deletes characters at certain positions from a string $$X$$, what remains is called a subsequence of $$X$$.
• For example, deleting the characters at positions 1 and 4 from AFRICA leaves us with the string ARIA. So we may say that ARIA is a subsequence of AFRICA.
• Deleting no characters is also permitted. So, for instance, AFRICA is considered a subsequence of itself.
• A sequence $$Z$$ is a common subsequence of sequences $$X$$ and $$Y$$ if $$Z$$ is a subsequence of both $$X$$ and $$Y$$.
• For instance, DIN is a common subsequence of DYNAMICPROGRAMMING and DIVIDEANDCONQUER.
• A longest common subsequence (LCS) of sequences $$X$$ and $$Y$$ is a common subsequence of $$X$$ and $$Y$$ of maximum possible length.
• For instance, DYNAMICPROGRAMMING and DIVIDEANDCONQUER have DICOR as an LCS. This is because DICOR is a common subsequence of both and no common subsequence of length 6 exists.
• DICON is another LCS of the same two strings, so an LCS need not be unique.
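The definitions above can be checked mechanically. Below is a small sketch of a subsequence test that scans $$X$$ once and greedily matches the characters of $$Z$$ in order; the function names are our own choice.

```kotlin
// Returns true if z can be obtained from x by deleting zero or more characters.
fun isSubsequence(z: String, x: String): Boolean {
    var k = 0 // number of characters of z matched so far
    for (c in x) {
        if (k < z.length && z[k] == c) k++
    }
    return k == z.length
}

// z is a common subsequence of x and y if it is a subsequence of both.
fun isCommonSubsequence(z: String, x: String, y: String): Boolean =
    isSubsequence(z, x) && isSubsequence(z, y)
```

For instance, `isSubsequence("ARIA", "AFRICA")` holds, and `isCommonSubsequence` accepts DIN, DICOR, and DICON for the pair DYNAMICPROGRAMMING and DIVIDEANDCONQUER. Note that this only checks the "common subsequence" part of the definition; it says nothing about maximality.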

Problem

• Let two sequences $$X=X_0X_1\dots X_{m-1}$$ and $$Y=Y_0Y_1\dots Y_{n-1}$$ be given. We want to find an LCS of $$X$$ and $$Y$$.
• Trying to solve the problem in a straightforward manner, we could do the following:

    keep track of the longest common subsequence found so far
    for each subsequence Z of X do
        if Z is also a subsequence of Y then
            update the longest common subsequence found so far if Z is longer
• However, $$X$$ has $$2^m$$ subsequences (one for each set of deleted positions), and this algorithm examines all of them. Even though checking a single candidate against $$Y$$ takes only $$O(n)$$ time, the total running time is exponential in $$m$$. We need a better way.
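To make the cost concrete, here is a sketch of the brute-force approach in Kotlin. It assumes that each subsequence of $$X$$ is encoded as a bitmask over the positions of $$X$$, which is our own choice of enumeration and limits the sketch to strings of at most 30 characters; the function names are ours as well.

```kotlin
// Returns true if z can be obtained from x by deleting zero or more characters.
fun isSubsequence(z: String, x: String): Boolean {
    var k = 0
    for (c in x) {
        if (k < z.length && z[k] == c) k++
    }
    return k == z.length
}

// Brute force: enumerate all 2^m subsequences of x and keep the longest one
// that is also a subsequence of y. Assumption: m <= 30 so that masks fit in an Int.
fun bruteForceLcs(x: String, y: String): String {
    require(x.length <= 30) { "x is too long for bitmask enumeration" }
    var best = ""
    for (mask in 0 until (1 shl x.length)) {
        // Bit i of mask is 1 exactly when position i of x is kept.
        val z = StringBuilder()
        for (i in x.indices) {
            if (((mask shr i) and 1) == 1) z.append(x[i])
        }
        if (z.length > best.length && isSubsequence(z.toString(), y)) {
            best = z.toString()
        }
    }
    return best
}
```

The outer loop alone runs $$2^m$$ times, so doubling the work requires only one extra character in $$X$$.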

Solution by dynamic programming

• For $$0\le i \le m$$ and $$0\le j \le n$$, let OPT$$(i, j)$$ be the length of an LCS of $$X[i..m)$$ and $$Y[j..n)$$. When $$i=m$$ or $$j=n$$, one of the two suffixes is empty, so OPT$$(i, j)=0$$.
• We seek OPT$$(0, 0)$$.

Optimal substructure property

• Suppose $$0\le i<m$$ and $$0\le j<n$$, and let $$Z=Z[0..k)$$ be an LCS of $$X[i..m)$$ and $$Y[j..n)$$.
• If $$X_i=Y_j$$, then necessarily $$X_i=Z_0$$. We can show that $$Z[1..k)$$ is an LCS of $$X[i+1 .. m)$$ and $$Y[j+1 .. n)$$.
• If $$X_i\ne Y_j$$, then $$X_i\ne Z_0$$ or $$Y_j\ne Z_0$$.
• If $$X_i\ne Z_0$$, we can show that $$Z$$ is an LCS of $$X[i+1..m)$$ and $$Y[j..n)$$.
• If $$Y_j\ne Z_0$$, we can show that $$Z$$ is an LCS of $$X[i..m)$$ and $$Y[j+1..n)$$.
• We know that one of the above cases must occur.
• This gives us the following recurrence. $$\mbox{OPT}(i, j) = \left\{ \begin{array}{ll} 0, & \mbox{if } i=m \mbox{ or } j=n \\ \mbox{OPT}(i+1, j+1) + 1, & \mbox{if } 0 \le i < m,\ 0 \le j < n, \mbox{ and } X_i = Y_j \\ \max\{\,\mbox{OPT}(i,j+1),\ \mbox{OPT}(i+1,j)\,\}, & \mbox{if } 0\le i < m,\ 0\le j < n, \mbox{ and } X_i\ne Y_j \end{array} \right.$$
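The recurrence can be evaluated directly by recursion, provided each OPT$$(i, j)$$ is stored the first time it is computed. The sketch below does this with a memo table; the names `lcsLength`, `opt`, and `memo` are ours, and the recursion depth can reach $$m+n$$, so this form is only meant for moderately sized inputs.

```kotlin
// Top-down evaluation of the recurrence for OPT(i, j), with memoization.
fun lcsLength(x: String, y: String): Int {
    val m = x.length
    val n = y.length
    // memo[i][j] == -1 means OPT(i, j) has not been computed yet.
    val memo = Array(m + 1) { IntArray(n + 1) { -1 } }

    fun opt(i: Int, j: Int): Int {
        if (i == m || j == n) return 0          // an empty suffix has LCS length 0
        if (memo[i][j] != -1) return memo[i][j] // already computed
        val value = if (x[i] == y[j]) {
            opt(i + 1, j + 1) + 1
        } else {
            maxOf(opt(i, j + 1), opt(i + 1, j))
        }
        memo[i][j] = value
        return value
    }

    return opt(0, 0)
}
```

Each of the $$(m+1)(n+1)$$ entries is computed at most once in constant time, so the running time is $$O(mn)$$.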

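Example: the OPT$$(i, j)$$ values for $$X$$ = APRIL (rows, $$i=0,\dots,5$$) and $$Y$$ = MARCH (columns, $$j=0,\dots,5$$). The last row and the last column, marked -, correspond to $$i=m$$ and $$j=n$$.

       M  A  R  C  H  -
    A  2  2  1  0  0  0
    P  1  1  1  0  0  0
    R  1  1  1  0  0  0
    I  0  0  0  0  0  0
    L  0  0  0  0  0  0
    -  0  0  0  0  0  0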

LCS algorithm

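• [Step 1] Fill in a table of OPT$$(\cdot, \cdot)$$ values. Since OPT$$(i, j)$$ depends on OPT$$(i+1, j+1)$$, OPT$$(i, j+1)$$, and OPT$$(i+1, j)$$, the table must be filled starting from the last row and last column and moving toward OPT$$(0, 0)$$. Within that constraint, this can be done row-by-row, column-by-column, or diagonal-by-diagonal.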
• [Step 2] Find the LCS by following maximizer pointers, starting from OPT$$(0, 0)$$.
• Note: The maximizers can be found by pre-computation and kept in a table in Step 1, or they can be computed on the fly in Step 2.
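The two steps can be sketched in Kotlin as follows. This version fills the table row-by-row from the bottom and computes the maximizers on the fly in Step 2 rather than storing them; the function and variable names are ours, and ties between the two options in the max are broken arbitrarily (here in favor of advancing $$j$$), so a different tie-breaking rule may return a different LCS of the same length.

```kotlin
// Returns one LCS of x and y using the OPT table and an on-the-fly traceback.
fun lcs(x: String, y: String): String {
    val m = x.length
    val n = y.length

    // Step 1: opt[i][j] = length of an LCS of x[i..m) and y[j..n).
    // Row m and column n keep their initial value 0 (the base case).
    val opt = Array(m + 1) { IntArray(n + 1) }
    for (i in m - 1 downTo 0) {
        for (j in n - 1 downTo 0) {
            opt[i][j] = if (x[i] == y[j]) {
                opt[i + 1][j + 1] + 1
            } else {
                maxOf(opt[i][j + 1], opt[i + 1][j])
            }
        }
    }

    // Step 2: walk from (0, 0), recomputing the maximizer at each cell.
    val z = StringBuilder()
    var i = 0
    var j = 0
    while (i < m && j < n) {
        when {
            x[i] == y[j] -> { z.append(x[i]); i++; j++ } // matched character joins the LCS
            opt[i][j + 1] >= opt[i + 1][j] -> j++         // maximizer is OPT(i, j+1)
            else -> i++                                   // maximizer is OPT(i+1, j)
        }
    }
    return z.toString()
}
```

Step 1 takes $$O(mn)$$ time and space, and Step 2 takes $$O(m+n)$$ additional time. Storing the maximizers in Step 1 instead would trade extra memory for slightly simpler traceback code.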