The other day I wanted to compare the speed of different implementations of the dynamic programming solution to the 0-1 knapsack problem. There are three natural candidates:

  • with a python dictionary
  • with a python list
  • with a numpy array and vectorized operations

The solutions are evaluated on data from the Discrete Optimization course. First we look at each implementation, then at the benchmark results.
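To make the snippets below runnable end to end, here is a small loader for the course's data files. The format assumed here (first line "item_count capacity", then one "value weight" pair per line) matches the course's instances as far as I remember, but treat it as an assumption:

from collections import namedtuple

# Same definition as used by the solutions below.
Item = namedtuple("Item", ["value", "weight"])

def parse_input(path):
    """Load one knapsack instance: the first line holds the item count
    and the capacity, every following line holds one value/weight pair."""
    with open(path) as f:
        lines = f.read().split("\n")
    item_count, capacity = map(int, lines[0].split())
    items = [
        Item(*map(int, line.split()))
        for line in lines[1 : item_count + 1]
    ]
    return items, capacity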

Solution with a python dictionary

This is the textbook version of the DP algorithm, but with a dictionary. Every time we add a new item, we look at the best previous result without the new item and the best result with the new item, and take the maximum.

from collections import namedtuple

Item = namedtuple("Item", ["value", "weight"])

def dp_dict(items, capacity):
    """DP with a dictionary"""
    table = {}
    # Base case: with zero items considered, every capacity is worth 0.
    for k in range(capacity + 1):
        table[(k, 0)] = 0

    for i in range(len(items)):
        for k in range(capacity + 1):
            if items[i].weight > k:
                # The item does not fit: carry over the previous best.
                table[(k, i + 1)] = table[(k, i)]
            else:
                # Take the better of skipping the item or taking it.
                table[(k, i + 1)] = max(
                    table[(k, i)], table[(k - items[i].weight, i)] + items[i].value
                )
    return table[(capacity, len(items))]
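A quick sanity check on a toy instance (values and weights made up for illustration):

items = [Item(60, 10), Item(100, 20), Item(120, 30)]
print(dp_dict(items, capacity=50))  # 220: the second and third item together are optimal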

Solution with a python list

Even more common is the implementation with the classic DP array.

def dp_array(items, capacity):
    """DP with an array"""
    # dp[k][i] holds the best value for capacity k using the first i items.
    dp = [[0] * (len(items) + 1) for _ in range(capacity + 1)]

    for i in range(len(items)):
        for k in range(capacity + 1):
            if items[i].weight > k:
                # The item does not fit: carry over the previous best.
                dp[k][i + 1] = dp[k][i]
            else:
                # Take the better of skipping the item or taking it.
                dp[k][i + 1] = max(
                    dp[k][i], dp[k - items[i].weight][i] + items[i].value
                )
    return dp[capacity][len(items)]
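On the toy instance from above the two implementations should of course agree, which is a cheap way to catch an off-by-one in either one:

assert dp_array(items, capacity=50) == dp_dict(items, capacity=50) == 220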

Solution with a numpy array

Here we use numpy vectorized operations, which replace the inner Python loop over capacities with whole-column operations executed in optimized native code.

import numpy as np

def dp_vectorization(items, capacity):
    """Dynamic programming algorithm with numpy array and vectorization"""
    dp = np.zeros((capacity + 1, len(items) + 1), dtype=np.uint32)

    for i in range(len(items)):
        w = items[i].weight  # assumed positive, otherwise the slices break
        # Capacities below the item's weight: carry over the previous column.
        dp[:w, i + 1] = dp[:w, i]
        # Value at every capacity where the item fits, if we take it.
        take = dp[:-w, i] + items[i].value
        dp[w:, i + 1] = np.maximum(dp[w:, i], take)
    return dp[capacity, len(items)]
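To gain confidence that the vectorized version matches the plain array version, a randomized cross-check like this one can be run (the instance sizes are chosen arbitrarily small so the loop stays fast):

import random

for _ in range(100):
    items = [
        Item(random.randint(1, 100), random.randint(1, 30))
        for _ in range(random.randint(1, 15))
    ]
    capacity = random.randint(1, 100)
    assert dp_vectorization(items, capacity) == dp_array(items, capacity)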

Benchmark

Each test case was run 5 times and the results were averaged; times are in seconds. The file name encodes the number of items in the test file. For the larger test cases, dp_dict and dp_array are missing (marked with /) because their RAM usage makes them infeasible; the numpy representation is much more memory efficient. We could go further and keep only a two-column table, or even a single row, to optimize memory (a sketch follows the table), but here I just wanted to see the speed gain. The difference between ks_400_0 and ks_1000_0 is mainly in the capacity: the former has a capacity of almost 10 million, while the latter has a capacity of 100000 and therefore a much smaller table.
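A harness along these lines can produce such averages; this is only a sketch of the methodology, not the original script, and the file path is just an example:

import time

def benchmark(solver, items, capacity, runs=5):
    """Average wall-clock time of solver over several runs."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        solver(items, capacity)
        total += time.perf_counter() - start
    return total / runs

items, capacity = parse_input("data/ks_19_0")  # example path
print(benchmark(dp_vectorization, items, capacity))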

file         dp_dict          dp_array         dp_vectorization
ks_19_0      0.232 ± 0.003    0.105 ± 0.001    0.002 ± 0.000
ks_30_0      1.362 ± 0.022    0.519 ± 0.007    0.007 ± 0.001
ks_60_0      3.383 ± 0.109    1.372 ± 0.088    0.014 ± 0.001
ks_100_0     6.205 ± 0.125    1.557 ± 0.035    0.026 ± 0.001
ks_200_0     13.751 ± 0.151   2.912 ± 0.025    0.072 ± 0.001
ks_400_0     /                /                58.614 ± 0.160
ks_1000_0    /                /                0.510 ± 0.005
ks_10000_0   /                /                351.774 ± 6.184
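As for the memory optimization mentioned above: since each column only reads the previous one, the full table can be collapsed into a single row updated in place. A minimal sketch of that idea (this is not the code that was benchmarked):

import numpy as np

def dp_one_row(items, capacity):
    """0-1 knapsack value with O(capacity) memory: one row updated in place."""
    dp = np.zeros(capacity + 1, dtype=np.uint32)
    for item in items:
        # "take" is built from the old row before anything is written back,
        # so the in-place update stays correct (weights assumed positive).
        take = dp[: -item.weight] + item.value
        dp[item.weight :] = np.maximum(dp[item.weight :], take)
    return dp[capacity]

The trade-off is that this only yields the optimal value; reconstructing which items were taken requires the full table.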

The speed gain is considerable. I also like how elegant the vectorized solution is.