You have to implement this yourself. If you use Python, you can take these inference requests as an example: https://github.com/vllm-project/vllm/blob/main/benchmarks/backend_request_func.py