Using map_batches instead of map_elements runs fast on my data (more than 4 million rows), since the function is called on the whole column at once rather than once per row.
df = df.with_columns(pl.col("b").map_batches(lambda x: x.to_numpy().transpose(0, 2, 1)))