It seems like you're looking for a "one-in-all" answer. Maybe reworking/redoing one of your initial attempts may get you the answer, but I'm a fan of breaking things up. Personally, I have a work requirement related to expense tracking, so I've been researching OCR for mobile and found:
https://github.com/a7medev/react-native-ml-kit or the NPM link
With the extracted text, you could easily run a cheap/free server (AWS free-tier, Google Cloud free-tier, Heroku cheap) with a mini LLM and pass the extracted text and a text prompt to a server to get the heavy load off the user's mobile device.
Consider whether you truly want everything on a mobile device.
Even after quantizing a model, you'll still be looking at about 50-100MB of size alone (just for the model) which is a pretty large app. I believe Android's Google store has a limit of 150MB and then you have to do some funky file splitting (I think).