You don't need to store the whole 2D array of dp values. Your algorithm only needs the current and previous rows. The computational complexity isn't changed by using only 2 rows, but the space complexity is so in practice, less memory requirement will probably give a performance boost.