From a pure performance standpoint, printing the time at each iteration is very costly, so you might want to use another way to keep track of your iterations. Might I suggest tqdm?
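As a rough sketch of what that could look like: wrapping the loop's iterable in tqdm gives a progress bar with elapsed time and iteration rate, without a print call on every pass. The `run` function and its loop body are placeholders I made up for illustration, not your code, and the fallback only exists so the snippet still runs if tqdm is not installed.

```python
try:
    from tqdm import tqdm  # third-party package: pip install tqdm
except ImportError:
    # Fallback so the sketch still runs without tqdm: no progress bar at all.
    def tqdm(iterable, **kwargs):
        return iterable


def run(n_iterations):
    """Hypothetical workload: sum of squares, standing in for the real loop body."""
    total = 0
    # tqdm wraps the iterable and refreshes a single progress line on stderr
    # at a throttled rate instead of printing once per iteration.
    for i in tqdm(range(n_iterations), desc="iterations"):
        total += i * i
    return total


result = run(100_000)
```

The bar is only redrawn a few times per second by default, so the overhead should stay small compared to printing a timestamp in every iteration.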
If you want to do a lot of algebra fast, numpy is your best bet, though jax could be better if you want to run on a GPU. For instance, to generate 10,000,000 (ten million) random vectors with values between 0 and 1 (like you did here) and overwrite the first component of each with its last one in a single vectorized operation, you could do:
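import numpy as np
import time
# Time the whole operation: random generation plus the column copy.
t0 = time.perf_counter()
# Ten million random 4-component vectors with values in [0, 1).
vectors = np.random.random(size=(10_000_000, 4))
# Overwrite column 0 with column 3 for every vector at once, with no Python loop.
vectors[:, 0] = vectors[:, 3]
print(time.perf_counter() - t0)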
This code snippet takes about 220 ms on my own computer, which is not particularly fancy.
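If what you actually need is the length (Euclidean norm) of each vector, the same vectorized approach applies. Here is a minimal sketch, assuming the vectors are stored as rows of a 2-D array; the helper name `vector_lengths` is mine, and I have not timed this variant.

```python
import numpy as np


def vector_lengths(vectors):
    """Return the Euclidean length of each row of a 2-D array."""
    # axis=1 computes one norm per row instead of a single norm for the whole matrix.
    return np.linalg.norm(vectors, axis=1)


# Seeded generator so the run is reproducible; values are in [0, 1).
rng = np.random.default_rng(seed=0)
vectors = rng.random(size=(1_000_000, 4))
lengths = vector_lengths(vectors)
```

`lengths` is a 1-D array with one entry per vector, computed in compiled code rather than in a Python loop.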
EDIT: I changed my code snippet to reflect what yours does; it's a better example.