79308888

Date: 2024-12-26 07:15:00
Score: 0.5

I ran into this issue myself. I am the author of a program (really more of a mod for a piece of free software, so I can't just rewrite its core) that is sensitive to the BOM. I ran some tests and asked a lot of questions. I needed a cmd script that would process one specific text file after it had been encoded as UTF-8, so that my program would work correctly with Cyrillic.

The best answer I got was to use something like this:

powershell -Command "(gc '%CD%\myfile.txt') "^
...
"| Out-File -encoding utf8 '%CD%\myfile.txt'"
powershell "(get-content '%CD%\myfile.txt' -Encoding Byte) | select -skip 3 | set-content '%CD%\myfile.txt' -Encoding Byte"

I by no means claim authorship of this method; thanks a lot to js2010 for the hint.

I think this is good enough in general. I checked: with the BOM the program would not start at all, and after removing it the program started from a directory with a Latin name. For Cyrillic, however, this did not work, I think because the program itself doesn't support the UTF-8 representation of Cyrillic.
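
For reference, here is what the BOM-stripping step does at the byte level, written as a small Python sketch rather than as the cmd one-liner above. The function name strip_utf8_bom is hypothetical. Unlike a blind "select -skip 3", it first checks that the file really starts with the UTF-8 BOM bytes (EF BB BF), so a file without a BOM is left untouched.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM; other input is returned unchanged."""
    if data.startswith(codecs.BOM_UTF8):  # BOM_UTF8 == b'\xef\xbb\xbf'
        return data[len(codecs.BOM_UTF8):]
    return data


# Illustrative input: a BOM followed by UTF-8 encoded Cyrillic text.
sample = codecs.BOM_UTF8 + "Привет".encode("utf-8")
cleaned = strip_utf8_bom(sample)
print(len(sample), len(cleaned))  # the cleaned data is 3 bytes shorter
```

Skipping three bytes unconditionally would cut off the first characters of a file that has no BOM, which is why the check may be worth having if the script can run more than once on the same file.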

The only thing that truly solved my problem was:

chcp 1251
powershell -Command "(gc '%CD%\myfile.txt') "^
...
"| Out-File -encoding default '%CD%\myfile.txt'"

After setting chcp 1251, the program finally understood Cyrillic. The file became corrupted as far as Windows Notepad is concerned, for some reason, but it is perfectly readable for my program. In this situation, default appears to pick up the previously set value. In effect, we have widened cmd's ASCII to an ANSI code page and removed the BOM. If you need a set of extra characters other than Cyrillic, you can use chcp 1252 or any other code page.
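
To show what this conversion does to the bytes, here is a Python sketch under some assumptions: the function name utf8_to_cp1251 is hypothetical, and the input is assumed to be UTF-8 with or without a BOM. It also illustrates one possible reason for the corruption in Notepad, which I have not verified for this case: if a viewer reads Windows-1251 bytes as Windows-1252, Cyrillic letters turn into accented Latin ones.

```python
import codecs


def utf8_to_cp1251(data: bytes) -> bytes:
    """Decode UTF-8 (skipping a BOM if present) and re-encode as Windows-1251."""
    text = data.decode("utf-8-sig")  # "utf-8-sig" drops a leading BOM if there is one
    return text.encode("cp1251")     # raises UnicodeEncodeError for characters outside cp1251


# Illustrative input: a BOM followed by UTF-8 encoded Cyrillic text.
source = codecs.BOM_UTF8 + "Привет".encode("utf-8")
converted = utf8_to_cp1251(source)

print(converted)                          # one byte per Cyrillic letter, no BOM
print(ascii(converted.decode("cp1252")))  # the same bytes misread as Windows-1252
```

The result is a single-byte encoding with no BOM, which matches what the program accepted. The trade-off is that the file no longer says which code page it uses, so every reader has to guess or be told.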

I hope this solves your problem.

Reasons:
  • Blacklisted phrase (0.5): thanks
  • Blacklisted phrase (0.5): I need
  • Long answer (-1):
  • Has code block (-0.5):
  • Low reputation (1):
Posted by: Verity Freedom