So you're running a FastAPI ML service that does CPU-intensive SVD computations on the main thread, and you're seeing occasional throttling. You've already scaled up Gunicorn workers, and now you're wondering whether to move the work to a separate thread, and what the best practices are, right?
Ah, CPU-bound work in FastAPI can be tricky! Threads might not help much because of Python’s GIL, but let me show you a few approaches we’ve seen work well.
First option: ProcessPoolExecutor (quick fix)
For lighter workloads, offload to a separate process:
```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

import numpy as np

# Create the pool once at module level so worker processes are reused
# across requests, instead of paying process startup on every call
executor = ProcessPoolExecutor()

def _compute_svd(matrix: np.ndarray):
    return np.linalg.svd(matrix, full_matrices=False)

async def svd_with_fallback(matrix: np.ndarray):
    # run_in_executor ships the matrix to a worker process, sidestepping the GIL
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, _compute_svd, matrix)
```
Pros: Simple, uses multiple cores.
Cons: Overhead for large matrices (serialization costs).
If you're already hitting CPU limits, a job queue (like Celery) might be better for scaling.
Second option: Celery + Redis (production-grade)
```python
import pickle

import numpy as np
from celery import Celery

celery = Celery("svd_worker", broker="redis://localhost:6379/0")  # URL illustrative

@celery.task
def async_svd(matrix_serialized: bytes) -> bytes:
    # Pickled bytes because Celery's default JSON serializer can't carry NumPy arrays
    matrix = pickle.loads(matrix_serialized)
    U, S, V = np.linalg.svd(matrix)
    return pickle.dumps((U, S, V))
```
Pros: Decouples compute from API, scales horizontally.
Cons: Adds Redis/RabbitMQ as a dependency.
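A side note on the pickle step: `np.save`/`np.load` into an in-memory buffer gives a wire format that carries dtype and shape in the `.npy` header and refuses arbitrary pickles, which is safer for payloads arriving over Redis. The helper names here are mine:

```python
import io

import numpy as np

def encode_matrix(matrix: np.ndarray) -> bytes:
    # np.save embeds dtype and shape in the .npy header
    buf = io.BytesIO()
    np.save(buf, matrix, allow_pickle=False)
    return buf.getvalue()

def decode_matrix(payload: bytes) -> np.ndarray:
    # allow_pickle=False rejects object arrays, so an untrusted
    # payload can't trigger arbitrary code on load
    return np.load(io.BytesIO(payload), allow_pickle=False)
```

The task would then take `encode_matrix(...)` output as its argument and call `decode_matrix` first thing.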
And if you're optimizing Gunicorn further, i.e. sticking with more workers:

```shell
gunicorn -w $(nproc) -k uvicorn.workers.UvicornWorker ...
```
Match workers to CPU cores. Check for throttling in htop; if it's kernel-level, pinning workers with taskset might help.
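A quick way to see how many cores the process can actually use (and whether taskset or similar pinning has narrowed it) from inside Python; `sched_getaffinity` is Linux-only, hence the fallback:

```python
import os

def available_cores() -> int:
    # sched_getaffinity reflects taskset/affinity restrictions;
    # cpu_count reports the machine total regardless of pinning
    try:
        return len(os.sched_getaffinity(0))
    except AttributeError:  # not available on macOS/Windows
        return os.cpu_count() or 1

print(available_cores())
```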
You might also be wondering about low-hanging performance optimizations.
For SVD specifically, you could try Numba (if numerical stability allows):

```python
import numpy as np
from numba import njit

@njit
def svd_fast(matrix):
    # Numba supports np.linalg.svd in nopython mode for 2-D contiguous arrays
    return np.linalg.svd(matrix)  # JIT-compiled
```
But test thoroughly! NumPy's SVD already dispatches to optimized BLAS/LAPACK, so the JIT may buy you little here.
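One optimization that needs no extra dependency: make sure you're on the economy SVD. For an (m, n) matrix with m much larger than n, `full_matrices=False` shrinks U from (m, m) to (m, n), which cuts both compute and memory while still reconstructing the matrix exactly:

```python
import numpy as np

M = np.random.rand(1000, 20)  # tall matrix: m >> n

U_full, S, Vt = np.linalg.svd(M)                         # U_full is (1000, 1000)
U_econ, S2, Vt2 = np.linalg.svd(M, full_matrices=False)  # U_econ is (1000, 20)

# The economy factors still reconstruct M
assert np.allclose(U_econ @ np.diag(S2) @ Vt2, M)
```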
I'd start with ProcessPoolExecutor; it's the least invasive. If you're still throttled, Celery's the way to go. Want me to dive deeper into any of these?