Did you compile llama_cpp with GPU support? If yes, did you export the resulting libllama.so?
llama-cpp-python not using NVIDIA GPU CUDA
https://medium.com/@piyushbatra1999/installing-llama-cpp-python-with-nvidia-gpu-acceleration-on-windows-a-short-guide-0dfac475002d I hope this helps you.
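
Before rebuilding, it may be worth checking whether the llama-cpp-python build you already have reports GPU offload support. The snippet below is only a rough sketch: it assumes your installed version exposes the low-level `llama_supports_gpu_offload` binding (older releases may not), and the helper name `gpu_offload_supported` is made up for this example. It returns `None` when it cannot tell.

```python
def gpu_offload_supported():
    """Return True/False if llama-cpp-python reports GPU offload support,
    or None if the package or the binding is unavailable."""
    try:
        import llama_cpp
    except Exception:
        # Package not installed, or its shared library failed to load.
        return None

    # Assumption: this binding exists in the installed release.
    check = getattr(llama_cpp, "llama_supports_gpu_offload", None)
    if check is None:
        return None
    return bool(check())


if __name__ == "__main__":
    print("GPU offload supported:", gpu_offload_supported())
```

If this prints `False`, the package was most likely built CPU-only, so passing `n_gpu_layers` when loading a model would have no effect until you reinstall with GPU support enabled.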