Late answer here. I ran into the same issue, where it would use both of my 2 GPUs, so I made a simple shell utility that lets you specify the number of servers you want to launch across a specific port range. It works very well!
https://github.com/theodufort/ollama-server-scaler
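
For anyone who just wants the general idea, here is a minimal sketch of launching several servers over a port range. This is not the code from the repo above: `plan_servers` is a hypothetical helper name, and the sketch assumes NVIDIA GPUs (pinned with `CUDA_VISIBLE_DEVICES`) and that each server is an `ollama serve` process bound through the `OLLAMA_HOST` environment variable. It only prints the commands, so you can check the port and GPU assignment before starting anything.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the linked repo.
# Prints one launch command per server: ports count up from a base port,
# and GPUs are assigned round-robin (assumes NVIDIA GPUs).
plan_servers() {
  count=$1      # number of servers to launch
  base_port=$2  # first port in the range
  gpu_count=$3  # number of GPUs to spread servers over
  i=0
  while [ "$i" -lt "$count" ]; do
    gpu=$((i % gpu_count))
    port=$((base_port + i))
    echo "CUDA_VISIBLE_DEVICES=$gpu OLLAMA_HOST=127.0.0.1:$port ollama serve"
    i=$((i + 1))
  done
}

# Dry run: 2 servers starting at port 11434, one per GPU.
plan_servers 2 11434 2
```

Each printed line can then be run in its own terminal (or with `&` appended) to start that server, and clients just point at the matching port.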