79723438

Date: 2025-08-02 15:01:57
Score: 2

As @Tilman Hausherr pointed out, you embed a subsetted Unicode font, but I think the font is subsetted correctly. Please share your PDF so it can be examined.

I suspect you are seeing Arabic/Hebrew words in the wrong order and/or the characters within those words reversed. Is that what you mean by "garbled"?

PDFBox doesn't support RTL scripts, so for RTL text you need a third-party library for BiDi reordering. There is a good discussion of this topic, with a solution, here: Writing Arabic with PDFBOX with correct characters presentation form without being separated
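To illustrate what "BiDi reordering" means, here is a minimal sketch that uses only the JDK's `java.text.Bidi` class to turn a logical-order string into visual order before it is written to the PDF. The class name `BidiSketch` and the method `toVisualOrder` are made up for this example. It does not do Arabic shaping (presentation forms) or character mirroring, so for Arabic you would still need a library as discussed in the linked question.

```java
import java.text.Bidi;

// Hypothetical helper, not part of PDFBox: reorders logical text into visual order.
final class BidiSketch {

    static String toVisualOrder(String logical) {
        if (logical.isEmpty()) {
            return logical;
        }
        // Base direction is taken from the first strong character, LTR if there is none.
        Bidi bidi = new Bidi(logical, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        int runCount = bidi.getRunCount();
        byte[] levels = new byte[runCount];
        String[] runs = new String[runCount];
        for (int i = 0; i < runCount; i++) {
            int level = bidi.getRunLevel(i);
            String run = logical.substring(bidi.getRunStart(i), bidi.getRunLimit(i));
            levels[i] = (byte) level;
            // Runs with an odd embedding level are right-to-left: reverse their characters.
            runs[i] = (level % 2 == 1) ? new StringBuilder(run).reverse().toString() : run;
        }
        // Put the runs themselves into visual order according to their levels.
        Bidi.reorderVisually(levels, 0, runs, 0, runCount);
        return String.join("", runs);
    }

    public static void main(String[] args) {
        // Latin text followed by two Hebrew letters (alef, bet) in logical order.
        System.out.println(toVisualOrder("abc \u05D0\u05D1"));
    }
}
```

The result of `toVisualOrder` is what you would pass to `showText` (or set as the field value), assuming the viewer then draws the glyphs strictly left to right.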

---

As for displaying text properly on macOS:

The default value of text fields (/V value in the Text field dictionary) is constructed using explicit UTF_16BE encoding.

Text in the default appearance stream is created by Unicode code points and the Java default character encoding.

You can compare the /V value in the field dictionary with the text inside the default appearance stream of the field (in angle brackets before the Tj operator).
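To make that comparison easier, the sketch below prints the hex bytes a given string produces under UTF-16BE and under the JVM's default charset, so you can match them against the hex strings you see in the PDF. The class name `EncodingCheck` and the helper `hexOf` are made up for this example, and the sample text is just two Hebrew letters. Note that a UTF-16BE text string in a PDF normally starts with the byte order mark `FEFF`, which this helper does not add.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for inspecting how a string is encoded on this machine.
final class EncodingCheck {

    // Returns the bytes of `text` in the given charset as upper-case hex.
    static String hexOf(String text, Charset charset) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        String sample = "\u05D0\u05D1"; // Hebrew alef, bet
        Charset platform = Charset.defaultCharset();
        System.out.println("Default charset: " + platform);
        System.out.println("UTF-16BE bytes:  " + hexOf(sample, StandardCharsets.UTF_16BE));
        System.out.println("Default bytes:   " + hexOf(sample, platform));
    }
}
```

Running this on both Windows and macOS shows whether the default charset differs between the two machines. A charset that cannot represent the characters (ISO-8859-1 for Hebrew, for instance) yields `3F` bytes, i.e. question marks.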

So my guess is that the Java/system default encodings differ between Windows and macOS. Another possibility is that the PDF viewer on macOS ignores the appearance streams of AcroForm (text) fields, but I very much doubt that.

Reasons:
  • Long answer (-1):
  • No code block (0.5):
  • Contains question mark (0.5):
  • User mentioned (1): @pointed
  • Low reputation (1):
Posted by: Alexey Gagarinov