yes I agree with you, there are tricky corner cases where this overflows. I ran into the same bug, working on a 32-bit architecture one counter-example is v = (1, 2)
(that is v = 2^32 + 2
, v_1 = 1
, v_0
= 2), the normalization step D1 computes d = floor((b-1)/v_1)
that is 0xffffffff
with b = 2^32
. But then d * v = (2^32-1) * (2^32 + 2) = 2^64 + 2^32 - 2 = (1, 0, 0xfffffffe)
with an overflow.
Would d = ceil(((b/2)/v_1))
work?