This can be fixed by changing raw = raw.decode('utf-8') to raw = raw.decode('utf-8-sig'). With the utf-8-sig codec, a leading byte order mark (BOM) is treated as an encoding signature and stripped during decoding instead of being kept as part of your file's content, so U+FEFF should no longer be printed. You can learn more about utf-8-sig in the Python documentation for the codecs module.
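To see the difference between the two codecs, here is a minimal sketch. The sample bytes and the variable names (raw_bytes, with_bom, without_bom, no_bom) are made up for illustration and are not taken from your code; in your case the bytes would come from wherever raw is read.

```python
import codecs

# Hypothetical sample data: a UTF-8 BOM followed by the text "hello".
raw_bytes = codecs.BOM_UTF8 + "hello".encode("utf-8")

# Plain utf-8 keeps the BOM as a leading U+FEFF character.
with_bom = raw_bytes.decode("utf-8")

# utf-8-sig drops a leading BOM if one is present.
without_bom = raw_bytes.decode("utf-8-sig")

# utf-8-sig also accepts input that has no BOM at all.
no_bom = "hello".encode("utf-8").decode("utf-8-sig")

print(repr(with_bom))
print(repr(without_bom))
print(repr(no_bom))
```

Because utf-8-sig works whether or not the BOM is there, it should be a safe replacement even if only some of your files start with one.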