I did something similar for a Cassandra cluster running on K8s behind a load balancer. The solution was to write a custom LoadBalancingPolicy, where you can build the query plan however you like. Then you can set the speculative execution (speculative retry) delay to 0 or some small value to get the effect you want.
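To give an idea of what "build the query plan as you wish" can mean, here is a minimal, driver-free sketch. It is not my actual policy and not the driver's API: SketchNode and QueryPlanSketch are made-up names, and the ordering (live local-DC nodes first, live remote-DC nodes as fallback) is just one assumed choice. In driver 4.x the equivalent logic would live in your policy's newQueryPlan method, which returns a Queue of nodes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the driver's Node type, for illustration only.
final class SketchNode {
    final String address;
    final String datacenter;
    final boolean up;

    SketchNode(String address, String datacenter, boolean up) {
        this.address = address;
        this.datacenter = datacenter;
        this.up = up;
    }
}

final class QueryPlanSketch {
    // Assumed ordering: live nodes in the local DC first, then live nodes
    // in other DCs. Nodes marked down are left out of the plan entirely.
    static Queue<SketchNode> buildPlan(List<SketchNode> nodes, String localDc) {
        List<SketchNode> local = new ArrayList<>();
        List<SketchNode> remote = new ArrayList<>();
        for (SketchNode node : nodes) {
            if (!node.up) {
                continue;
            }
            if (node.datacenter.equals(localDc)) {
                local.add(node);
            } else {
                remote.add(node);
            }
        }
        // Shuffle within each group so load is spread across equivalent nodes.
        Collections.shuffle(local);
        Collections.shuffle(remote);
        Queue<SketchNode> plan = new ArrayDeque<>(local);
        plan.addAll(remote);
        return plan;
    }
}
```

With a delay of 0 or close to it, the speculative executions then walk this queue almost in parallel, so the order you put the nodes in is what decides who gets tried next.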
Also, do you have an application in another DC as well? In that case, you can delegate retries to the client of your application: the app in DC A reports that it is unavailable because Cassandra in that DC is unavailable, and the client then retries the call against the app in DC B.
The driver docs propose this kind of setup in https://docs.datastax.com/en/developer/java-driver/4.17/manual/core/load_balancing/index.html#built-in-policies .
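On the client side, the failover between the two apps could look roughly like the sketch below. This is only an illustration under my own assumptions: DcFailoverSketch and callWithFailover are hypothetical names, the endpoints are plain strings, and "unavailable" is modelled as any RuntimeException thrown by the call. A real client would probably only fail over on specific errors (for example a 503 from the app).

```java
import java.util.List;
import java.util.function.Function;

final class DcFailoverSketch {
    // Tries each endpoint in order (e.g. the app in DC A, then the app in DC B)
    // and returns the first successful result.
    static <T> T callWithFailover(List<String> endpoints, Function<String, T> call) {
        RuntimeException last = null;
        for (String endpoint : endpoints) {
            try {
                return call.apply(endpoint);
            } catch (RuntimeException e) {
                // Assumption: any failure means this DC's app (or its Cassandra)
                // is unavailable, so move on to the next DC.
                last = e;
            }
        }
        throw new IllegalStateException("all datacenters failed", last);
    }
}
```

The point of doing it this way is that each app keeps talking only to the Cassandra nodes in its own DC, and the cross-DC decision is made once, at the edge, instead of inside the driver.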