- A Kotlin string behaves like an array of characters: it is an indexed sequence of `Char` values.
- Individual characters in a Kotlin string can be accessed through a nonnegative integer index using the [] notation.
- For a string \(X\), we write \(X[i..j)\) to denote the substring of \(X\) consisting of all characters from position \(i\) to position \(j-1\) inclusive. (Note that this is not Kotlin notation; it is our own convenient notation for talking about substrings.)
- For example, if \(X\) is `AFRICA`, then \(X[1..4)\) is `FRI` and \(X[0..6)\) is `AFRICA` itself.
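
The half-open notation \(X[i..j)\) lines up with Kotlin's `substring(startIndex, endIndex)`, which also excludes the end index. A minimal sketch, where the helper name `sub` is our own and exists only for illustration:

```kotlin
// Hypothetical helper (not part of Kotlin): X[i..j) in our notation.
fun sub(x: String, i: Int, j: Int): String = x.substring(i, j)

fun main() {
    val x = "AFRICA"

    // Indexing with [] yields a Char; positions start at 0.
    println(x[0])          // A

    // substring excludes the end index, just like X[i..j).
    println(sub(x, 1, 4))  // FRI
    println(sub(x, 0, 6))  // AFRICA
}
```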

- If one deletes characters at certain positions from a string \(X\), what remains is called a **subsequence** of \(X\).
- For example, deleting the characters at positions 1 and 4 from `AFRICA` leaves us with the string `ARIA`. So we may say that `ARIA` is a subsequence of `AFRICA`.
- Deleting no characters is also permitted. So, for instance, `AFRICA` is considered a subsequence of itself.
- A sequence \(Z\) is a **common subsequence** of sequences \(X\) and \(Y\) if \(Z\) is a subsequence of both \(X\) and \(Y\).
- For instance, `DIN` is a common subsequence of `DYNAMICPROGRAMMING` and `DIVIDEANDCONQUER`.
- A **longest common subsequence** (LCS) of sequences \(X\) and \(Y\) is a common subsequence of \(X\) and \(Y\) of maximum possible length.
- For instance, `DYNAMICPROGRAMMING` and `DIVIDEANDCONQUER` have `DICOR` as an LCS. This is because `DICOR` is their common subsequence and no common subsequence of length 6 exists. `DICON` is another LCS, so LCSs are not unique.
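
The definitions of subsequence and common subsequence can be checked mechanically. The sketch below is one possible way to do it, using a single left-to-right scan; the function names `isSubsequence` and `isCommonSubsequence` are our own choices, not something fixed by these notes.

```kotlin
// True if z can be obtained from x by deleting zero or more characters.
fun isSubsequence(z: String, x: String): Boolean {
    var k = 0  // position in z of the next character still to be matched
    for (c in x) {
        if (k < z.length && z[k] == c) k++
    }
    return k == z.length
}

// True if z is a subsequence of both x and y.
fun isCommonSubsequence(z: String, x: String, y: String): Boolean =
    isSubsequence(z, x) && isSubsequence(z, y)

fun main() {
    val x = "DYNAMICPROGRAMMING"
    val y = "DIVIDEANDCONQUER"
    println(isSubsequence("ARIA", "AFRICA"))     // true
    println(isSubsequence("AFRICA", "AFRICA"))   // true
    println(isCommonSubsequence("DIN", x, y))    // true
    println(isCommonSubsequence("DICOR", x, y))  // true
    println(isCommonSubsequence("DICON", x, y))  // true
}
```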

- Let two sequences \(X=X_0X_1\dots X_{m-1}\) and \(Y=Y_0Y_1\dots Y_{n-1}\) be given. We want to find an LCS of \(X\) and \(Y\).
Trying to solve the problem in a straightforward manner, we could do the following:

  - keep track of the longest common subsequence found so far
  - for each subsequence \(Z\) of \(X\): if \(Z\) is also a subsequence of \(Y\), update the longest common subsequence found so far if needed

However, \(X\) alone has up to \(2^m\) subsequences (one for each subset of its positions), and each of them must be checked against \(Y\), so the running time grows exponentially in \(m\). We need a better way.
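
To make the cost of the straightforward approach concrete, here is a rough sketch of it. It enumerates the subsequences of \(X\) as bitmasks over positions, so it is only usable for very short strings. The names `bruteForceLcs` and `isSubsequence` and the length limit of 20 are assumptions made for this illustration.

```kotlin
// True if z can be obtained from x by deleting zero or more characters.
fun isSubsequence(z: String, x: String): Boolean {
    var k = 0
    for (c in x) {
        if (k < z.length && z[k] == c) k++
    }
    return k == z.length
}

// Brute force: try all 2^m subsets of positions of x and keep the longest
// resulting string that is also a subsequence of y.
fun bruteForceLcs(x: String, y: String): String {
    require(x.length <= 20) { "exponential time: only for very short x" }
    var best = ""
    for (mask in 0 until (1 shl x.length)) {
        // Keep exactly the characters of x whose position bit is set in mask.
        val z = x.filterIndexed { i, _ -> ((mask shr i) and 1) == 1 }
        if (z.length > best.length && isSubsequence(z, y)) best = z
    }
    return best
}

fun main() {
    println(bruteForceLcs("APRIL", "MARCH"))  // AR
}
```

Even for \(m = 20\) the loop already runs about a million times, which is the motivation for the table-based method that follows.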

- For \(0\le i \le m\) and \(0\le j \le n\), let OPT\((i, j)\) be the length of an LCS of \(X[i..m)\) and \(Y[j..n)\).
- We seek OPT\((0, 0)\).

- Suppose \(0\le i < m\), \(0\le j < n\), and \(Z=Z[0..k)\) is an LCS of \(X[i..m)\) and \(Y[j..n)\).
- If \(X_i=Y_j\), then necessarily \(X_i=Z_0\). We can show that \(Z[1..k)\) is an LCS of \(X[i+1 .. m)\) and \(Y[j+1 .. n)\).
- If \(X_i\ne Y_j\), then \(X_i\ne Z_0\) or \(Y_j\ne Z_0\).
- If \(X_i\ne Z_0\), we can show that \(Z\) is an LCS of \(X[i+1..m)\) and \(Y[j..n)\).
- If \(Y_j\ne Z_0\), we can show that \(Z\) is an LCS of \(X[i..m)\) and \(Y[j+1..n)\).

- We know that one of the above cases must occur.
- This gives us the following recurrence. \[ \mbox{OPT}(i, j) = \left\{ \begin{array}{ll} 0, & \mbox{if $i=m$ or $j=n$} \\ \mbox{OPT}(i+1, j+1) + 1, & \mbox{if $0 \le i < m$, and $0 \le j < n$, and $X_i = Y_j$} \\ \max\{\,\mbox{OPT}(i,j+1),\ \mbox{OPT}(i+1,j)\,\}, & \mbox{if $0\le i < m$, and $0\le j < n$, and $X_i\ne Y_j$} \end{array} \right. \]
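
The recurrence can be transcribed almost line by line into a recursive function with memoization. This is only a sketch: the name `lcsLength` and the use of `-1` as a "not yet computed" marker are our own choices, and the recursion depth grows with \(m + n\), so it suits short strings.

```kotlin
// Length of an LCS of x and y, computed top-down from the recurrence.
fun lcsLength(x: String, y: String): Int {
    val m = x.length
    val n = y.length
    // memo[i][j] caches OPT(i, j); -1 means "not computed yet".
    val memo = Array(m + 1) { IntArray(n + 1) { -1 } }

    fun opt(i: Int, j: Int): Int {
        if (i == m || j == n) return 0           // base case of the recurrence
        if (memo[i][j] != -1) return memo[i][j]
        val value =
            if (x[i] == y[j]) opt(i + 1, j + 1) + 1
            else maxOf(opt(i, j + 1), opt(i + 1, j))
        memo[i][j] = value
        return value
    }

    return opt(0, 0)
}

fun main() {
    println(lcsLength("APRIL", "MARCH"))                            // 2
    println(lcsLength("DYNAMICPROGRAMMING", "DIVIDEANDCONQUER"))    // 5
}
```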

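The table below shows the OPT values for the strings `APRIL` (rows) and `MARCH` (columns); the unlabeled last row and column hold the base-case zeros.

|   | M | A | R | C | H |   |
|---|---|---|---|---|---|---|
| A | 2 | 2 | 1 | 0 | 0 | 0 |
| P | 1 | 1 | 1 | 0 | 0 | 0 |
| R | 1 | 1 | 1 | 0 | 0 | 0 |
| I | 0 | 0 | 0 | 0 | 0 | 0 |
| L | 0 | 0 | 0 | 0 | 0 | 0 |
|   | 0 | 0 | 0 | 0 | 0 | 0 |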

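- [Step 1] Fill in a table of OPT\((\cdot, \cdot)\) values. This can be done row-by-row, column-by-column, or diagonal-by-diagonal, as long as OPT\((i+1, j+1)\), OPT\((i, j+1)\), and OPT\((i+1, j)\) are computed before OPT\((i, j)\).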
- [Step 2] Find the LCS by following maximizer pointers, starting from OPT\((0, 0)\).
**Note:** The maximizers can be found by pre-computation and kept in a table in Step 1, or they can be computed on the fly in Step 2.
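
The two steps might look as follows in Kotlin. This sketch fills the table row by row starting from the last row, and in Step 2 it recomputes the maximizer at each cell on the fly rather than storing pointers. The names `lcsTable` and `lcs` are assumptions, and when both choices tie, this version arbitrarily moves right, so it returns just one of possibly several LCSs.

```kotlin
// Step 1: opt[i][j] = OPT(i, j), filled from the last row and column backwards.
fun lcsTable(x: String, y: String): Array<IntArray> {
    val m = x.length
    val n = y.length
    val opt = Array(m + 1) { IntArray(n + 1) }  // row m and column n stay 0
    for (i in m - 1 downTo 0) {
        for (j in n - 1 downTo 0) {
            opt[i][j] =
                if (x[i] == y[j]) opt[i + 1][j + 1] + 1
                else maxOf(opt[i][j + 1], opt[i + 1][j])
        }
    }
    return opt
}

// Step 2: walk from (0, 0), choosing the maximizer at each cell on the fly.
fun lcs(x: String, y: String): String {
    val opt = lcsTable(x, y)
    val z = StringBuilder()
    var i = 0
    var j = 0
    while (i < x.length && j < y.length) {
        when {
            x[i] == y[j] -> { z.append(x[i]); i++; j++ }  // matched character
            opt[i][j + 1] >= opt[i + 1][j] -> j++         // maximizer is OPT(i, j+1)
            else -> i++                                   // maximizer is OPT(i+1, j)
        }
    }
    return z.toString()
}

fun main() {
    println(lcsTable("APRIL", "MARCH")[0].toList())  // [2, 2, 1, 0, 0, 0]
    println(lcs("APRIL", "MARCH"))                   // AR
}
```

Both steps together take time proportional to \(mn\) for the table plus \(m + n\) for the walk.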