The other day I wanted to compare the speed of different implementations of the dynamic-programming (DP) solution to the 0-1 knapsack problem. There are three natural implementations:
- with a python dictionary
- with a python list
- with a numpy array and vectorized operations
The solutions are evaluated on the data from the Discrete Optimization course. First we look at each implementation, then at the benchmark results.
Solution with a python dictionary
This is the textbook version of the DP algorithm, but with a dictionary. Every time we add a new item, we look at the previous best result without the new item and the best result with the new item, and take the maximum.
```python
from collections import namedtuple

Item = namedtuple("Item", ["value", "weight"])


def dp_dict(items, capacity):
    """DP with a dictionary"""
    table = {}  # avoid shadowing the built-in dict
    for k in range(capacity + 1):
        table[(k, 0)] = 0

    for i in range(len(items)):
        for k in range(capacity + 1):
            if items[i].weight > k:
                table[(k, i + 1)] = table[(k, i)]
            else:
                table[(k, i + 1)] = max(
                    table[(k, i)], table[(k - items[i].weight, i)] + items[i].value
                )
    return table[(capacity, len(items))]
```
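For concreteness, here is a self-contained toy run of the same recurrence (the items and capacity are invented for illustration). Keeping the full table also lets us recover *which* items were chosen by walking it backwards:

```python
from collections import namedtuple

Item = namedtuple("Item", ["value", "weight"])


def dp_dict(items, capacity):
    """Same dictionary recurrence, returning the whole table."""
    table = {(k, 0): 0 for k in range(capacity + 1)}
    for i, item in enumerate(items):
        for k in range(capacity + 1):
            if item.weight > k:
                table[(k, i + 1)] = table[(k, i)]
            else:
                table[(k, i + 1)] = max(
                    table[(k, i)], table[(k - item.weight, i)] + item.value
                )
    return table


# Toy instance: capacity 10, three invented items.
items = [Item(value=8, weight=4), Item(value=10, weight=5), Item(value=15, weight=8)]
table = dp_dict(items, 10)
print(table[(10, len(items))])  # best achievable value: 18

# Recover the chosen items by walking the table backwards.
chosen = []
k = 10
for i in range(len(items), 0, -1):
    if table[(k, i)] != table[(k, i - 1)]:  # item i-1 was taken
        chosen.append(i - 1)
        k -= items[i - 1].weight
print(sorted(chosen))  # indices of the selected items: [0, 1]
```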
Solution with a python list
Even more common is the implementation with the classic DP array.
```python
def dp_array(items, capacity):
    """DP with an array"""
    dp = [[0] * (len(items) + 1) for _ in range(capacity + 1)]

    for i in range(len(items)):
        for k in range(capacity + 1):
            if items[i].weight > k:
                dp[k][i + 1] = dp[k][i]
            else:
                dp[k][i + 1] = max(
                    dp[k][i], dp[k - items[i].weight][i] + items[i].value
                )
    return dp[capacity][len(items)]
```
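As a quick sanity check, a self-contained run of this version on a small invented instance, compared against brute force over all subsets (feasible only because the toy instance is tiny):

```python
from collections import namedtuple
from itertools import combinations

Item = namedtuple("Item", ["value", "weight"])


def dp_array(items, capacity):
    """Classic 2D list DP."""
    dp = [[0] * (len(items) + 1) for _ in range(capacity + 1)]
    for i, item in enumerate(items):
        for k in range(capacity + 1):
            if item.weight > k:
                dp[k][i + 1] = dp[k][i]
            else:
                dp[k][i + 1] = max(dp[k][i], dp[k - item.weight][i] + item.value)
    return dp[capacity][len(items)]


def brute_force(items, capacity):
    """Try every subset; exponential, for validation only."""
    best = 0
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            if sum(it.weight for it in combo) <= capacity:
                best = max(best, sum(it.value for it in combo))
    return best


items = [Item(value=6, weight=2), Item(value=5, weight=3), Item(value=9, weight=5)]
assert dp_array(items, 7) == brute_force(items, 7)
print(dp_array(items, 7))  # 15: items 0 and 2 fill the capacity exactly
```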
Solution with a numpy array
Here we use numpy vectorized operations, which run the inner capacity loop in optimized compiled code instead of interpreted Python.
```python
import numpy as np


def dp_vectorization(items, capacity):
    """DP with a numpy array and vectorized operations (assumes positive weights)"""
    dp = np.zeros((capacity + 1, len(items) + 1), dtype=np.uint32)

    for i in range(len(items)):
        w, v = items[i].weight, items[i].value
        # Capacities below the item's weight cannot include it: copy the column.
        dp[:w, i + 1] = dp[:w, i]
        # For the remaining capacities, compare skipping vs. taking the item.
        take = dp[:-w, i] + v
        dp[w:, i + 1] = np.maximum(dp[w:, i], take)
    return int(dp[capacity, len(items)])
```
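To see what the sliced update does for a single item, here is a tiny standalone illustration (all numbers invented). `prev` plays the role of the previous column; the two slice assignments split it into "item doesn't fit" and "skip vs. take":

```python
import numpy as np

weight, value = 3, 10  # one toy item
prev = np.array([0, 0, 4, 4, 4, 7, 7], dtype=np.uint32)  # column for capacities 0..6

new = np.empty_like(prev)
new[:weight] = prev[:weight]  # capacities 0..2: the item cannot fit
# capacities 3..6: best of skipping the item vs. taking it on top of prev[k - weight]
new[weight:] = np.maximum(prev[weight:], prev[:-weight] + value)
print(new)  # [ 0  0  4 10 10 14 14]
```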
Benchmark
Each test case was run 5 times and the results were averaged. The file name encodes the number of items in the test file. For the larger test cases, we can see that `dp_dict` and `dp_array` time out because they exhaust the available RAM. The numpy representation is much more memory efficient. We could keep only two columns of the table to optimize memory further, but here I just wanted to see the speed gain. The difference between `ks_400_0` and `ks_1000_0` is mainly in the capacity: the former has a capacity of almost 10 million, while the latter has a capacity of 100,000, which takes far less space.
| file | dp_dict (s) | dp_array (s) | dp_vectorization (s) |
|---|---|---|---|
| ks_19_0 | 0.232 ± 0.003 | 0.105 ± 0.001 | 0.002 ± 0.000 |
| ks_30_0 | 1.362 ± 0.022 | 0.519 ± 0.007 | 0.007 ± 0.001 |
| ks_60_0 | 3.383 ± 0.109 | 1.372 ± 0.088 | 0.014 ± 0.001 |
| ks_100_0 | 6.205 ± 0.125 | 1.557 ± 0.035 | 0.026 ± 0.001 |
| ks_200_0 | 13.751 ± 0.151 | 2.912 ± 0.025 | 0.072 ± 0.001 |
| ks_400_0 | / | / | 58.614 ± 0.160 |
| ks_1000_0 | / | / | 0.510 ± 0.005 |
| ks_10000_0 | / | / | 351.774 ± 6.184 |
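The memory optimization mentioned above can be pushed even further: instead of two columns, a single rolling column suffices if we iterate the capacities downwards, so that each item is counted at most once. A sketch (not the benchmarked code; the toy instance is invented):

```python
from collections import namedtuple

Item = namedtuple("Item", ["value", "weight"])


def dp_one_column(items, capacity):
    """0-1 knapsack with O(capacity) memory: one rolling column.

    Iterating k downwards ensures dp[k - weight] still refers to the
    previous item's column, so each item is used at most once.
    """
    dp = [0] * (capacity + 1)
    for item in items:
        for k in range(capacity, item.weight - 1, -1):
            dp[k] = max(dp[k], dp[k - item.weight] + item.value)
    return dp[capacity]


items = [Item(value=8, weight=4), Item(value=10, weight=5), Item(value=15, weight=8)]
print(dp_one_column(items, 10))  # 18
```

This trades away the ability to backtrack the chosen items, but removes the dependence of memory on the number of items entirely.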
The speed gain is considerable. I also like how elegant the vectorized solution is.