Longest subarray whose sum <= k

by e2718281828459045

Longest subarray whose sum \leq k
Given an array A of n numbers and a key k, find the longest subarray of A for which the subarray sum is less than or equal to k.

This problem is from the book “Elements of Programming Interviews”.

This problem has a rather easy quadratic time complexity algorithm: check each pair of indices i < j, where 0 \leq i < j \leq n-1 to see if the sum of the subarray A[i..j] is no greater than k, and return the pair with the largest difference j-i+1. To avoid repeatedly computing the subarray sum, one can first preprocess the array to get an array S of partial sums, where S[i] = \sum_{j = 0}^{i-1} A[j], for i = 1,\cdots, n, and by definition, S[0] = 0. Then the subarray sum can be obtained as A[i..j] = S[j+1] - S[i]. The preprocessing takes linear time and linear space.

In the following we describe an algorithm with time complexity O(n \log n).

Algorithm LONGESTSUBARRAY(A[0..n-1], k)
Input: An array A[0..n-1] and a real number k
Output: A pair of indices i, j, maximizing j - i + 1, subject to \sum_{t = i}^j A[t] \leq k
1. Compute the array S[0..n] of partial sums, where S[i] is the prefix sum of the first i numbers in A. S[0] \leftarrow 0.
2. (l, r) \leftarrow (0,0)
3. M \leftarrow \emptyset
4. smallest \leftarrow +\infty
5. for i \leftarrow n down to 0
    if S[i] < {\rm smallest}
        M \leftarrow M \cup \{(S[i], i)\}
        {\rm smallest} \leftarrow S[i]
6. {\rm current} \leftarrow -\infty
7. for i \leftarrow 0 up to n-1
    if S[i] > {\rm current}
        Use binary search to look for the rightmost element (S[t], t) in M such that S[t] - S[i] \leq k. If such an element exists, update (l, r) to (i, t) if t-i \geq r - l
        {\rm current} \leftarrow S[i]
8. return (l,r)

Correctness of the Algorithm
The key is to observe the following two facts.
Claim: (1) If two indices r < r' satisfies that S[r] \geq S[r'], then r cannot appear in an optimum solution; (2) If two indices l < l' satisfies that S[l] \geq S[l'], then l' cannot appear in an optimum solution.
Proof: For any index i, 0 \leq i \leq n, note that the subarray sum of A[i..r'] is less than the subarray sum of A[i..r], and the length of the subarray $A[i..r’]$ is greater than the length of the subarray A[i..r] (although both length might be negative, but the statement still holds). Therefore, (i,r') is preferable over (i, r). This is also the reason why we compute the strictly increasing sequence of M.
The second statement follows a similar reasoning.

Time Complexity and Space Complexity
Step 1, computing the prefix sums, takes O(n) time. Step 4, computing the array M, takes O(n) time. Step 7 takes O(n \log n) time as each iteration takes O(\log n) time.