Reports

I am facing a similar issue extracting text content from PDFs with complex layouts. The PDFs are not large; each is about 2 to 3 pages.

Thanks to @Davide Fiocco, I was able to find a better solution for my project.

However, I have a few follow-up questions:

1. Can I implement the same process using cURL, i.e. raw HTTP requests? I can’t use pre-compiled SDKs.
2. Does OpenAI review my uploaded PDFs at any point?
3. How can I calculate the token count when I upload a PDF?

The reason I need to use cURL-style requests is that I must develop this project in pure JavaScript, without npm packages such as pdfjs-dist, canvas, or openai.

Currently, I am attempting to convert PDFs to images using the PDF.co API and then send the images to OpenAI endpoints using fetch. However, I would prefer a solution that doesn’t require conversion to images. Again, the PDF layout is quite complex.
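To make the no-SDK part concrete, here is a minimal sketch of what I have in mind for sending the PDF itself rather than images. It only uses the built-in fetch, FormData and Blob (browsers and Node 18+) and posts the file to the OpenAI Files endpoint (`POST /v1/files`, multipart fields `purpose` and `file`). The helper names `buildPdfUploadRequest` and `uploadPdf` are hypothetical, and the `"assistants"` purpose is an assumption on my side; the right value depends on which endpoint consumes the file afterwards.

```javascript
// Sketch only: upload a PDF to the OpenAI Files endpoint without any SDK.
// Relies on the global fetch, FormData and Blob (browsers, Node 18+).

const OPENAI_FILES_URL = "https://api.openai.com/v1/files";

// Hypothetical helper: packages the PDF bytes into a fetch request description.
// `purpose` defaults to "assistants", which is an assumption for this sketch.
function buildPdfUploadRequest(pdfBytes, filename, apiKey, purpose = "assistants") {
  const form = new FormData();
  form.append("purpose", purpose);
  form.append("file", new Blob([pdfBytes], { type: "application/pdf" }), filename);

  return {
    url: OPENAI_FILES_URL,
    options: {
      method: "POST",
      // No Content-Type header here: fetch adds the multipart boundary itself.
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    },
  };
}

// Hypothetical helper: performs the upload and returns the parsed JSON response.
async function uploadPdf(pdfBytes, filename, apiKey) {
  const { url, options } = buildPdfUploadRequest(pdfBytes, filename, apiKey);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Upload failed: ${response.status} ${await response.text()}`);
  }
  return response.json();
}
```

As far as I can tell, the same request from the command line would be a `curl` call to that URL with an `Authorization: Bearer` header and two `-F` form fields (`purpose` and `file=@yourfile.pdf`). The response should contain the file `id`, which would then be referenced in a follow-up request.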

Reasons:

Blacklisted phrase (0.5): Thanks
Blacklisted phrase (0.5): How can I
Blacklisted phrase (0.5): I need
Long answer (-0.5):
No code block (0.5):
Me too answer (2.5): I am facing a similar issue
Contains question mark (0.5):
User mentioned (1): @Davide
Low reputation (1):

Posted by: R_H

79165979