DP4: Longest Increasing Subsequence
San Skulrattanakulchai
October 12, 2018
Definitions
- DPV Ch 6.2
- Let \(A: a_1, a_2, \ldots, a_n\) and \(B: b_1, b_2, \ldots, b_k\) be sequences of positive integers.
- We say that \(B\) is a subsequence of \(A\) if there exists an increasing function \(i: \{1,2,\ldots,k\} \to\{1,2,\ldots,n\}\) such that \(b_j=a_{i(j)}\) for all \(1\le j\le k\).
- If \(B\) is also increasing, i.e., \(b_j < b_{j'}\) whenever \(j < j'\), then \(B\) is said to be an increasing subsequence of \(A\).
- A longest increasing subsequence (LIS) of \(A\) is an increasing subsequence of \(A\) of maximum length.
Examples
- Let \(A\) be the sequence \(20, 50, 30, 10, 40\).
- Then \(50, 10, 40\) is a subsequence of \(A\) but it is not increasing.
- The sequence \(10, 40\) is an increasing subsequence of \(A\) but not the longest.
- Finally, the sequence \(20, 30, 40\) is an LIS of \(A\).
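The definitions can be checked mechanically on the example above. The following is a minimal sketch; the helper names `is_subsequence` and `is_increasing` are illustrative and do not come from the lecture.

```python
def is_subsequence(b, a):
    """Return True if list b can be obtained from list a by deleting elements."""
    it = iter(a)
    # `x in it` advances the iterator until x is found, so matches must
    # occur at strictly increasing positions of a.
    return all(x in it for x in b)


def is_increasing(b):
    """Return True if b is strictly increasing."""
    return all(x < y for x, y in zip(b, b[1:]))


A = [20, 50, 30, 10, 40]
for B in ([50, 10, 40], [10, 40], [20, 30, 40]):
    print(B, is_subsequence(B, A), is_increasing(B))
```

Note that these helpers only verify a candidate; they say nothing about whether it has maximum length.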
Longest Increasing Subsequence Problem
- Given a sequence \(A = a_1, a_2, \ldots, a_n\) of integers, find an LIS of \(A\).
- We’ll develop a dynamic programming solution.
- Define \(m(i)\), for each \(1\le i \le n\), to be the length of a longest increasing subsequence of \(a_1, a_2, \ldots, a_i\) that has \(a_i\) as the last element of the subsequence.
- We seek \(\max \{\, m(i) : 1\le i\le n \,\}\).
Optimal Substructure Property
- If \(i=1\), then the only subsequence ending at \(a_1\) is \(a_1\) itself. Thus, \(m(1)=1\).
- Now suppose \(1< i \le n\). Fix such an \(i\). Let \(B: b_1, b_2, \ldots, b_k\) be an LIS of \(a_1, a_2, \ldots, a_i\) subject to \(b_k=a_i\). Either \(k = 1\) or \(k > 1\).
- If \(k = 1\), then \(B\) consists of \(a_i\) alone, so \(m(i) = 1\).
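- If \(k > 1\), then \(b_{k-1} = a_j\) for some \(j\) with \(1\le j\le i-1\) and \(a_j < a_i\), and \(b_1, b_2, \ldots, b_{k-1}\) must be an LIS of \(a_1, a_2, \ldots, a_j\) subject to \(b_{k-1}=a_j\). Otherwise, a longer increasing subsequence ending at \(a_j\) could be extended by \(a_i\) to give an increasing subsequence ending at \(a_i\) that is longer than \(B\), a contradiction.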
Recurrence
- The above reasoning shows \(m(i)\) satisfies the recurrence \[
m(i) =
\left\{
\begin{array}{ll}
1, & \mbox{if $i = 1$} \\
\max \{\,1, m(j)+1: 1\le j< i \mbox{ and } a_j<a_i\,\}, & \mbox{if $1<i\le n$}
\end{array}
\right.
\]
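As a rough illustration, the recurrence translates almost line by line into a double loop. This is a sketch only: the function name `lis_lengths` is an assumption, and the code uses 0-based indices, so `m[i]` here corresponds to \(m(i+1)\) in the notation above.

```python
def lis_lengths(a):
    """Return the table m, where m[i] is the length of a longest
    increasing subsequence of a[0..i] that ends with a[i]."""
    n = len(a)
    m = [1] * n  # a[i] alone is always an increasing subsequence
    for i in range(n):
        for j in range(i):
            if a[j] < a[i]:
                # extend the best subsequence ending at a[j] by a[i]
                m[i] = max(m[i], m[j] + 1)
    return m


print(lis_lengths([20, 50, 30, 10, 40]))  # [1, 2, 2, 1, 3]
```

On the example sequence \(20, 50, 30, 10, 40\), the maximum entry is 3, matching the LIS \(20, 30, 40\).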
Algorithm
- The algorithm consists of 3 steps.
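- [Step 1] Fill in the table of \(m(\cdot)\) values in order of increasing \(i\), together with a companion table of maximizers: for each \(i\), an index \(j\) that attains the maximum in the recurrence, or no index if \(m(i)=1\).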
- [Step 2] Scan through the \(m(\cdot)\) table to find a maximum value. Say \(m(i)\) is the maximum.
- [Step 3] Starting from the \(i\)th entry of the table of maximizers stored in Step 1, where \(i\) is the index found in Step 2, follow the maximizers as back pointers to recover an LIS in reverse order.
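The three steps might be put together as follows. This is a sketch under stated assumptions: the names `longest_increasing_subsequence` and `prev` (the table of maximizers) are illustrative, indices are 0-based, and `None` marks an entry with no predecessor.

```python
def longest_increasing_subsequence(a):
    """Return one longest strictly increasing subsequence of the list a."""
    n = len(a)
    if n == 0:
        return []

    # Step 1: fill in the m table and the companion table of maximizers.
    m = [1] * n
    prev = [None] * n  # prev[i] = index j that attains m[i], if any
    for i in range(n):
        for j in range(i):
            if a[j] < a[i] and m[j] + 1 > m[i]:
                m[i] = m[j] + 1
                prev[i] = j

    # Step 2: scan the m table for an index holding a maximum value.
    end = max(range(n), key=lambda i: m[i])

    # Step 3: follow the maximizers as back pointers, then reverse.
    result = []
    i = end
    while i is not None:
        result.append(a[i])
        i = prev[i]
    result.reverse()
    return result


print(longest_increasing_subsequence([20, 50, 30, 10, 40]))  # [20, 30, 40]
```

When several subsequences have maximum length, this sketch returns just one of them; which one depends on how ties are broken in Steps 1 and 2.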
Running Time
- Step 1 takes time \(O(n^2)\) since we fill in 2 tables each of size \(O(n)\), and each table entry takes time \(O(n)\) to compute.
- Step 2 takes time \(O(n)\) since the size of the table is \(O(n)\).
- Step 3 takes time \(O(n)\) since there are \(O(n)\) maximizers and we spend \(O(1)\) time per maximizer.
- Therefore, total running time is \(O(n^2)\).
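To make the \(O(n^2)\) bound for Step 1 concrete, one can count how many pairs \((j, i)\) with \(j < i\) the double loop examines. The helper `count_comparisons` below is a hypothetical name used only for this illustration.

```python
def count_comparisons(n):
    """Count the (j, i) pairs with j < i examined when filling a table of size n."""
    count = 0
    for i in range(n):
        for j in range(i):
            count += 1
    return count


# The count equals n(n-1)/2, which grows quadratically in n.
for n in (5, 10, 100):
    print(n, count_comparisons(n), n * (n - 1) // 2)
```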
Note
- This problem can be modeled as finding a longest path in a dag (DPV Ch 5.2).
- The recurrence can also be written simply as \[
m(i) = 1 + \max \{\,0, m(j) : 1\le j <i \mbox{ and } a_j < a_i\,\}
\mbox{ for all $1\le i\le n$}.
\]
- We can also define \(m(i)\) so that \(a_i\) is the starting element of the LIS instead of the ending element. Doing so will give a backward recurrence.
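The backward variant mentioned in the last point can be sketched as follows. Here \(s(i)\) is a symbol introduced only for this illustration: it denotes the length of a longest increasing subsequence of \(a_i, a_{i+1}, \ldots, a_n\) that starts with \(a_i\), so that \(s(i) = 1 + \max \{\,0, s(j) : i < j \le n \mbox{ and } a_i < a_j\,\}\). The function name `lis_lengths_backward` is likewise an assumption, and indices in the code are 0-based.

```python
def lis_lengths_backward(a):
    """Return the table s, where s[i] is the length of a longest
    increasing subsequence of a[i..n-1] that starts with a[i]."""
    n = len(a)
    s = [1] * n
    # s[i] depends on later entries, so fill the table from right to left.
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            if a[i] < a[j]:
                s[i] = max(s[i], s[j] + 1)
    return s


print(lis_lengths_backward([20, 50, 30, 10, 40]))  # [3, 1, 2, 2, 1]
```

The maximum entry is again 3 on the example sequence, and following maximizers from that entry would yield the LIS in forward order without a final reversal.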