79252590

Date: 2024-12-04 20:16:51
Score: 1.5

Regarding the Ray integration question, I think Ray Serve is a good fit for this use case: serving online requests in parallel, where each request involves some computation. It is a general-purpose framework for running multiple replicas of your request-handling logic, and those replicas can be scaled out across a Ray cluster.

In addition, Ray Serve supports resource allocation, so you should be able to specify the GPU resources that each replica needs.
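To make the replica and GPU settings concrete, here is a minimal sketch of what such a deployment might look like. The `Scorer` class, its `score` method, the `values` payload field, and the `build_app` helper are hypothetical names made up for illustration, and the replica and GPU counts are placeholder values; only `serve.deployment`, `num_replicas`, `ray_actor_options`, `bind`, and `serve.run` come from Ray Serve itself.

```python
# Deployment settings: two replicas, each reserving one GPU.
# These values are illustrative; adjust them to what the cluster really has.
DEPLOYMENT_OPTIONS = {
    "num_replicas": 2,
    "ray_actor_options": {"num_gpus": 1},
}


class Scorer:
    """Hypothetical request handler; swap score() for the real computation."""

    def score(self, values):
        # Placeholder computation standing in for model inference.
        return sum(values) / len(values)

    async def __call__(self, request):
        # For HTTP traffic, Ray Serve passes in a Starlette Request object.
        payload = await request.json()
        return {"score": self.score(payload["values"])}


def build_app():
    # Imported lazily so the handler logic above can be exercised
    # on a machine that does not have Ray installed.
    from ray import serve

    # Wrap the plain class as a Serve deployment with the options above.
    deployment = serve.deployment(**DEPLOYMENT_OPTIONS)(Scorer)
    return deployment.bind()


# To deploy (assumes `pip install "ray[serve]"` and at least 2 GPUs available):
#   from ray import serve
#   serve.run(build_app())
```

With these settings Ray Serve would start two copies of `Scorer` and only schedule each one where a GPU is free, so if the cluster has fewer GPUs than requested, the replicas will wait for resources instead of starting.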

Reasons:
  • No code block (0.5):
  • Low reputation (1):
Posted by: myan