79323787

Date: 2025-01-02 13:06:08
Score: 3

I'm assuming your files are not machine-readable; if they were, you could scrape the text from them directly.

Have you tried pytesseract? https://pypi.org/project/pytesseract/
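In case it helps, here is a minimal sketch of what that could look like, assuming your documents are (or can be exported as) image files in one folder. The folder layout, the `ocr_folder` helper, and the suffix list are my own placeholders, not anything from your setup. `pytesseract.image_to_string` is the library's real entry point, and it needs the Tesseract binary installed on the machine.

```python
from pathlib import Path

# Assumed set of image types; adjust to whatever your scans actually are.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def ocr_image(path):
    """OCR a single image file with pytesseract and return the text."""
    # Imported lazily so the folder-walking logic can be used (and tested)
    # on machines where the OCR stack is not installed.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        return pytesseract.image_to_string(img)


def ocr_folder(folder, ocr=ocr_image):
    """Return {file name: recognised text} for every image in `folder`.

    `ocr` is any callable taking a path and returning text; it defaults
    to the pytesseract-based function above.
    """
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = ocr(path)
    return results
```

You would call it as `ocr_folder("scans")`, where `"scans"` is a hypothetical folder name. Scanned PDFs would have to be converted to images first, since pytesseract works on images.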

Would it be an option to add a step where you first batch-convert the documents to .md, and only then extract the data and load it into Excel?
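If that route is viable, the second half could be sketched roughly like this. I don't know what your documents contain, so the `key: value` line layout, the function names, and the choice of CSV (which Excel opens directly) are all assumptions on my part; treat it as a starting point rather than a finished solution.

```python
import csv
import re
from pathlib import Path

# Hypothetical layout: each converted .md file holds lines such as
# "Invoice: 123". Heading lines starting with "#" are skipped.
FIELD_RE = re.compile(r"^\s*([^:#\n]+?)\s*:\s*(.+?)\s*$")


def parse_fields(markdown_text):
    """Collect 'key: value' lines from one document into a dict."""
    fields = {}
    for line in markdown_text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def markdown_folder_to_csv(folder, csv_path):
    """Write one CSV row per .md file in `folder` and return the header."""
    rows = []
    for path in sorted(Path(folder).glob("*.md")):
        row = {"file": path.name}
        row.update(parse_fields(path.read_text(encoding="utf-8")))
        rows.append(row)
    # Header is the union of all keys, in first-seen order.
    header = list(dict.fromkeys(key for row in rows for key in row))
    # "utf-8-sig" writes a BOM, which helps Excel detect UTF-8.
    with open(csv_path, "w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.DictWriter(handle, fieldnames=header, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return header
```

Fields missing from a given file are left as empty cells, so documents with slightly different contents still end up in one sheet.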

Reasons:
  • Whitelisted phrase (-1): Have you tried
  • Low length (0.5):
  • No code block (0.5):
  • Ends in question mark (2):
  • Low reputation (1):
Posted by: Vittorio Distefano