79165979

Date: 2024-11-07 10:24:51
Score: 6.5 🚩
Natty: 4

I am facing a similar issue extracting text content from PDFs with complex layouts. The PDFs are not large; each is about 2 to 3 pages.

Thanks to @Davide Fiocco, I was able to find a better solution for my project.

However, I have a few follow-up questions:

The reason I need to use cURL is that I must develop this project in pure JavaScript, without npm packages such as pdfjs-dist, canvas, or openai.

Currently, I am attempting to convert PDFs to images using the PDF.co API and then send the images to OpenAI endpoints using fetch. However, I would prefer a solution that doesn’t require conversion to images. Again, the PDF layout is quite complex.
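
For context, the image-based step described above might look roughly like the sketch below, using only the built-in fetch and no npm packages. It assumes the page images are already available as publicly reachable URLs (for example, the output of the PDF-to-image conversion) and that they are sent to the OpenAI Chat Completions endpoint as `image_url` content parts. The function names, the prompt text, and the model name `gpt-4o` are placeholders chosen for illustration, not something confirmed by the original thread.

```javascript
// Sketch only: send page images to the OpenAI Chat Completions endpoint
// with plain fetch (no npm packages).
// Assumptions: pageImageUrls are publicly reachable image URLs, and the
// model name "gpt-4o" is a placeholder for whichever vision model is used.
const OPENAI_CHAT_URL = "https://api.openai.com/v1/chat/completions";

// Builds the arguments for fetch() without performing any network call,
// so the payload can be inspected or logged before sending.
function buildVisionRequest(pageImageUrls, apiKey, model = "gpt-4o") {
  const content = [
    {
      type: "text",
      text: "Extract all text from these pages, preserving reading order.",
    },
    // One image_url part per page image.
    ...pageImageUrls.map((url) => ({ type: "image_url", image_url: { url } })),
  ];

  return {
    url: OPENAI_CHAT_URL,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages: [{ role: "user", content }] }),
    },
  };
}

// Sends the request and returns the text of the first completion choice.
async function extractTextFromPages(pageImageUrls, apiKey) {
  const { url, options } = buildVisionRequest(pageImageUrls, apiKey);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`OpenAI request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

Keeping the payload construction separate from the network call makes it easy to compare the JSON body against an equivalent cURL command while debugging.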

Reasons:
  • Blacklisted phrase (0.5): Thanks
  • Blacklisted phrase (0.5): How can I
  • Blacklisted phrase (0.5): I need
  • Long answer (-0.5):
  • No code block (0.5):
  • Me too answer (2.5): I am facing a similar issue
  • Contains question mark (0.5):
  • User mentioned (1): @Davide
  • Low reputation (1):
Posted by: R_H