The default for max_ongoing_requests was changed to 5 (down from 100) in a recent release, which is the most likely cause of the issue. Try raising it on the deployment, for example:
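    from fastapi import FastAPI, WebSocket
    from ray import serve

    app = FastAPI()

    # Raise the per-replica cap on concurrent (ongoing) requests.
    @serve.deployment(max_ongoing_requests=100)
    @serve.ingress(app)
    class FastAPIDeployment:
        def __init__(self):
            self.connections = {}

        @app.websocket("/ws")
        async def websocket_endpoint(self, websocket: WebSocket):
            await websocket.accept()
            # Connection handling logic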
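
To see why a low cap can matter for long-lived connections, here is a rough, self-contained sketch. It is not Ray Serve's implementation: it uses an asyncio.Semaphore as a stand-in for max_ongoing_requests and assumes each open WebSocket holds one slot for as long as it stays open. The function name peak_concurrency and its parameters are made up for illustration.

```python
import asyncio


async def peak_concurrency(cap: int, clients: int, hold_seconds: float = 0.05) -> int:
    """Return the highest number of simulated connections active at once."""
    # Stand-in for max_ongoing_requests (assumption, not Ray's actual mechanism).
    slots = asyncio.Semaphore(cap)
    active = 0
    peak = 0

    async def connection() -> None:
        nonlocal active, peak
        async with slots:  # wait here until a slot is free
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(hold_seconds)  # simulate a socket held open
            active -= 1

    await asyncio.gather(*(connection() for _ in range(clients)))
    return peak


if __name__ == "__main__":
    # 20 simulated clients against a cap of 5, then against a cap of 100.
    print(asyncio.run(peak_concurrency(cap=5, clients=20)))
    print(asyncio.run(peak_concurrency(cap=100, clients=20)))
```

With the lower cap, the extra clients wait until earlier connections finish, which for WebSockets that stay open can look like new connections hanging. If that matches what you are seeing, raising max_ongoing_requests (or adding replicas) is the thing to try.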
Reference docs: https://docs.ray.io/en/latest/serve/configure-serve-deployment.html#configurable-parameters