Have you considered a multithreaded approach? You don’t give details of your implementation, but if the pages can be processed independently of one another, I’d guess you could hand each page (or a batch of pages) to its own thread to speed things up. You could try pthreads. Regards.
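To make the pthreads idea a bit more concrete, here is a minimal sketch. I don't know what your code looks like, so `Page`, `process_page` and `process_pages_parallel` are made-up placeholders, and the per-page work is a dummy. Rather than one thread per page, it splits the pages into contiguous slices across a fixed number of threads, which is usually cheaper when there are many pages. It only helps if the pages really are independent.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical page type: stands in for whatever your code processes. */
typedef struct {
    int id;
    int result; /* filled in by process_page */
} Page;

/* Hypothetical per-page work; replace with your real processing.
   It must not modify shared state without locking. */
static void process_page(Page *p) {
    p->result = p->id * 2;
}

/* The half-open range [begin, end) of pages handled by one thread. */
typedef struct {
    Page *pages;
    size_t begin;
    size_t end;
} Slice;

static void *worker(void *arg) {
    Slice *s = arg;
    for (size_t i = s->begin; i < s->end; i++)
        process_page(&s->pages[i]);
    return NULL;
}

/* Processes n_pages pages using up to n_threads threads.
   Returns 0 on success, -1 if allocation or thread creation failed
   (in which case some pages may be left unprocessed). */
int process_pages_parallel(Page *pages, size_t n_pages, size_t n_threads) {
    if (n_pages == 0) return 0;
    if (n_threads == 0) n_threads = 1;
    if (n_threads > n_pages) n_threads = n_pages;

    pthread_t *tids = malloc(n_threads * sizeof *tids);
    Slice *slices = malloc(n_threads * sizeof *slices);
    if (!tids || !slices) {
        free(tids);
        free(slices);
        return -1;
    }

    /* Spread the pages as evenly as possible over the threads. */
    size_t chunk = n_pages / n_threads;
    size_t extra = n_pages % n_threads;
    size_t begin = 0, started = 0;
    int status = 0;

    for (size_t t = 0; t < n_threads; t++) {
        size_t len = chunk + (t < extra ? 1 : 0);
        slices[t] = (Slice){ pages, begin, begin + len };
        if (pthread_create(&tids[t], NULL, worker, &slices[t]) != 0) {
            status = -1;
            break;
        }
        started++;
        begin += len;
    }

    /* Wait for every thread that was actually started. */
    for (size_t t = 0; t < started; t++)
        pthread_join(tids[t], NULL);

    free(tids);
    free(slices);
    return status;
}
```

You would call it as something like `process_pages_parallel(pages, n_pages, 4)` and build with `-pthread` on gcc or clang. A thread count close to the number of CPU cores is a reasonable starting point, but measure before assuming it is faster.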