The other day I wanted to compare the speed of different implementations of the dynamic programming solution to the 0-1 knapsack problem. There are three natural candidates:

  • with a python dictionary
  • with a python list
  • with a numpy array and vectorized operations

The solutions are evaluated on data from the Discrete Optimization course. First we look at each implementation, then at the benchmark results.
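To make the snippets below runnable end to end, here is a small loader for the course's data files. The format assumed here (first line "item_count capacity", then one "value weight" pair per line) matches the course's instances as far as I remember, but treat it as an assumption:

from collections import namedtuple

# Same definition as used by the solutions below.
Item = namedtuple("Item", ["value", "weight"])

def parse_input(path):
    """Load one knapsack instance: the first line holds the item count
    and the capacity, every following line holds one value/weight pair."""
    with open(path) as f:
        lines = f.read().split("\n")
    item_count, capacity = map(int, lines[0].split())
    items = [
        Item(*map(int, line.split()))
        for line in lines[1 : item_count + 1]
    ]
    return items, capacity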

Solution with a python dictionary

This is the textbook version of the DP algorithm, but with a dictionary. Every time we add a new item, we look at the best previous result without the new item and the best result with the new item, and take the maximum.

from collections import namedtuple

Item = namedtuple("Item", ["value", "weight"])

def dp_dict(items, capacity):
    """DP with a dictionary"""
    table = {}
    # Base case: with zero items considered, every capacity is worth 0.
    for k in range(capacity + 1):
        table[(k, 0)] = 0

    for i in range(len(items)):
        for k in range(capacity + 1):
            if items[i].weight > k:
                # The item does not fit: carry over the previous best.
                table[(k, i + 1)] = table[(k, i)]
            else:
                # Take the better of skipping the item or taking it.
                table[(k, i + 1)] = max(
                    table[(k, i)], table[(k - items[i].weight, i)] + items[i].value
                )
    return table[(capacity, len(items))]
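A quick sanity check on a toy instance (values and weights made up for illustration):

items = [Item(60, 10), Item(100, 20), Item(120, 30)]
print(dp_dict(items, capacity=50))  # 220: the second and third item together are optimal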

Solution with a python list

Even more common is the implementation with the classic DP array.

def dp_array(items, capacity):
    """DP with an array"""
    # dp[k][i] holds the best value for capacity k using the first i items.
    dp = [[0] * (len(items) + 1) for _ in range(capacity + 1)]

    for i in range(len(items)):
        for k in range(capacity + 1):
            if items[i].weight > k:
                # The item does not fit: carry over the previous best.
                dp[k][i + 1] = dp[k][i]
            else:
                # Take the better of skipping the item or taking it.
                dp[k][i + 1] = max(
                    dp[k][i], dp[k - items[i].weight][i] + items[i].value
                )
    return dp[capacity][len(items)]
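On the toy instance from above the two implementations should of course agree, which is a cheap way to catch an off-by-one in either one:

assert dp_array(items, capacity=50) == dp_dict(items, capacity=50) == 220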

Solution with a numpy array

Here we use numpy vectorized operations, which replace the inner Python loop over capacities with whole-column operations executed in optimized native code.

import numpy as np

def dp_vectorization(items, capacity):
    """Dynamic programming algorithm with numpy array and vectorization"""
    dp = np.zeros((capacity + 1, len(items) + 1), dtype=np.uint32)

    for i in range(len(items)):
        w = items[i].weight  # assumed positive, otherwise the slices break
        # Capacities below the item's weight: carry over the previous column.
        dp[:w, i + 1] = dp[:w, i]
        # Value at every capacity where the item fits, if we take it.
        take = dp[:-w, i] + items[i].value
        dp[w:, i + 1] = np.maximum(dp[w:, i], take)
    return dp[capacity, len(items)]
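To gain confidence that the vectorized version matches the plain array version, a randomized cross-check like this one can be run (the instance sizes are chosen arbitrarily small so the loop stays fast):

import random

for _ in range(100):
    items = [
        Item(random.randint(1, 100), random.randint(1, 30))
        for _ in range(random.randint(1, 15))
    ]
    capacity = random.randint(1, 100)
    assert dp_vectorization(items, capacity) == dp_array(items, capacity)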

Benchmark

Each test case was run 5 times and the results were averaged; times are in seconds. The file name encodes the number of items in the test file. For the larger test cases, dp_dict and dp_array are missing (marked with /) because their RAM usage makes them infeasible; the numpy representation is much more memory efficient. We could go further and keep only a two-column table, or even a single row, to optimize memory (a sketch follows the table), but here I just wanted to see the speed gain. The difference between ks_400_0 and ks_1000_0 is mainly in the capacity: the former has a capacity of almost 10 million, while the latter has a capacity of 100000 and therefore a much smaller table.
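A harness along these lines can produce such averages; this is only a sketch of the methodology, not the original script, and the file path is just an example:

import time

def benchmark(solver, items, capacity, runs=5):
    """Average wall-clock time of solver over several runs."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        solver(items, capacity)
        total += time.perf_counter() - start
    return total / runs

items, capacity = parse_input("data/ks_19_0")  # example path
print(benchmark(dp_vectorization, items, capacity))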

file         dp_dict          dp_array         dp_vectorization
ks_19_0      0.232 ± 0.003    0.105 ± 0.001    0.002 ± 0.000
ks_30_0      1.362 ± 0.022    0.519 ± 0.007    0.007 ± 0.001
ks_60_0      3.383 ± 0.109    1.372 ± 0.088    0.014 ± 0.001
ks_100_0     6.205 ± 0.125    1.557 ± 0.035    0.026 ± 0.001
ks_200_0     13.751 ± 0.151   2.912 ± 0.025    0.072 ± 0.001
ks_400_0     /                /                58.614 ± 0.160
ks_1000_0    /                /                0.510 ± 0.005
ks_10000_0   /                /                351.774 ± 6.184
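As for the memory optimization mentioned above: since each column only reads the previous one, the full table can be collapsed into a single row updated in place. A minimal sketch of that idea (this is not the code that was benchmarked):

import numpy as np

def dp_one_row(items, capacity):
    """0-1 knapsack value with O(capacity) memory: one row updated in place."""
    dp = np.zeros(capacity + 1, dtype=np.uint32)
    for item in items:
        # "take" is built from the old row before anything is written back,
        # so the in-place update stays correct (weights assumed positive).
        take = dp[: -item.weight] + item.value
        dp[item.weight :] = np.maximum(dp[item.weight :], take)
    return dp[capacity]

The trade-off is that this only yields the optimal value; reconstructing which items were taken requires the full table.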

The speed gain is considerable. I also like how elegant the vectorized solution is.