UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa7 in position 3: invalid start byte

Hello, the OBJ importer in Blender 2.8-2.92 seems unable to import an .obj whose name contains certain characters (Russian?). It gives UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa7 in position 3: invalid start byte. Blender 2.79 works fine. Any idea how to get around this? Here’s a test example.

The file works here with 2.93 (current master). Best to report this in the bug tracker.

I just tried 2.93 alpha and it still doesn’t work. I’m confused.

It seems that your .obj/.mtl files are not UTF-8 encoded. Try to convert the encoding to UTF-8 in an editor like Notepad++.

Yes, this would work, but it’s not realistic for me. I’ve asked a few others to try importing this .obj in Blender 2.80-2.93, and it works properly for all of them. Even my 2.79 works fine. How come it doesn’t work on my 2.80-2.93? What’s the cause?

Edit: This guess was not correct. It turns out that the importer handles decoding errors more strictly on Windows than on Linux in current versions of Blender.

That would most likely be because import_obj.py doesn’t enforce the expected encoding in all places. This would result in the use of the encoding based on the system locale which may or may not match what the file is actually using as encoding. I would have to take a closer look at this though.

If that is the case this should be fixed in Blender. However, your file would still have to be encoded as UTF-8 as it is currently not valid.

You’re saying that since the file isn’t UTF-8 encoded, technically it shouldn’t work for others either. But for some reason, it does. And you’re going to fix it so that it only imports UTF-8 encoded files?

I see. Basically, if I want a quick workaround, I can just set the OS system encoding to match the file. But even if it can open the file, Blender still won’t recognize those characters. So in the long run I still have to change the files (both .obj and .mtl) to UTF-8?

But it still doesn’t explain why my Blender 2.79 is able to open that file on my machine, though. Yes, that’s what I’m doing with Notepad++, but I’ve got far too many files to do it by hand, unless there’s a way to batch re-encode them as UTF-8 with Python.
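For the batch question, here is a minimal sketch of re-encoding a folder of .obj/.mtl files with plain Python instead of Notepad++. The source encoding is an assumption: the thread never states it, and `cp1251` is used only because the file names were guessed to be Russian. The names `SOURCE_ENCODING`, `reencode_to_utf8` and `reencode_folder` are made up for this example. Try it on a copy of the files first.

```python
from pathlib import Path

# ASSUMPTION: the legacy encoding of the files. The thread does not state it;
# replace "cp1251" with whatever encoding the files were really saved in.
SOURCE_ENCODING = "cp1251"


def reencode_to_utf8(path, source_encoding=SOURCE_ENCODING):
    """Rewrite one file as UTF-8. Return True if the file was changed."""
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, leave it untouched
    except UnicodeDecodeError:
        pass
    # A wrong guess either raises here or silently produces garbled text,
    # so check a converted file by eye before converting everything.
    text = raw.decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
    return True


def reencode_folder(folder, source_encoding=SOURCE_ENCODING):
    """Convert every .obj and .mtl file under `folder`; return changed paths."""
    changed = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".obj", ".mtl"}:
            if reencode_to_utf8(path, source_encoding):
                changed.append(path)
    return changed
```

Calling `reencode_folder("path/to/models")` (a placeholder path) returns the list of files it rewrote. Files that already decode as UTF-8 are skipped, so running it twice is harmless.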

I realize that you need a proper fix to this, but if the workaround is “Convert to UTF-8” in Notepad++ then it is possible you could script the prepending of the UTF-8 byte order mark - 0xEF,0xBB,0xBF - to the start of those files. Might work.

Could you please create a bug report through Help > Report a Bug?

Additionally, open Blender’s installation directory and double-click blender_debug_log.cmd. This will start Blender in debug mode and create log files. Try to import the *.obj again. Once it fails to import, close Blender. Windows Explorer should open and show you up to two files, a debug log and the system information. Add them to your bug report.

Ok. After I open blender_debug_log.cmd, I should press enter to open Blender right? This will open my default Blender 2.79, is there a way to point it to a different version?

AFAICT, that “blender_debug_log.cmd” will only ever open the blender that is in the same directory as it.

Ok, found out why, I renamed the executable. Changed it back and now it works. Thanks.


Now, just in case doing so might have screwed up your prior testing of this on different versions, note that there was a related issue that was only fixed in 2.93. In a nutshell, on Windows only, some Unicode characters could trigger a decoding error in earlier versions. That was fixed quite recently.


That shouldn’t be the case here. I only renamed the 2.80 executable, but this also happens with 2.83.13/2.92/2.93.


Most likely a developer needs to reproduce this on their system.

The importer was written to handle non-UTF-8 files. This works for me on Linux. If it fails on other platforms, that needs to be investigated.

I was under the impression that we only support UTF-8, especially given that os.fsdecode (applies the strict error handling) and .decode with explicit utf-8 encoding are used. I would be surprised if this works with all other encodings on Windows. I will check the ticket today.

As expected, this works on neither Linux nor Windows, since the name of the *.mtl can’t be decoded as UTF-8.

On Linux it just fails gracefully. It passes the decoding step but ignores the *.mtl because it can’t find the file with the improperly encoded name. Importing the model is still successful.

On Windows it fails in os.fsdecode for the *.mtl name, as it uses errors='strict' internally.
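To make the platform difference concrete, here is a small sketch of strict versus lenient decoding in plain Python. It is not the importer’s code. The byte string is a made-up file name chosen so that the bad byte 0xa7 sits at position 3, and `surrogateescape` is used as the lenient handler because that is what Python’s file system decoding uses on POSIX systems.

```python
# Hypothetical .mtl name as raw bytes: 0xa7 is not a valid UTF-8 start byte.
name_bytes = b"abc\xa7.mtl"


def decode_strict(data):
    """Decode as UTF-8 and raise on any invalid byte (the default handler)."""
    return data.decode("utf-8")  # same as errors="strict"


def decode_lenient(data):
    """Decode as UTF-8, smuggling invalid bytes through as lone surrogates."""
    return data.decode("utf-8", errors="surrogateescape")


try:
    decode_strict(name_bytes)
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = str(exc)

lenient_name = decode_lenient(name_bytes)

print(strict_error)        # the message raised by the strict decode
print(repr(lenient_name))  # decodes, but the name contains a lone surrogate
```

The strict call raises the same kind of “invalid start byte” error reported at the top of the thread. The lenient call succeeds, but the resulting name does not match any real file, which fits the Linux behaviour described above where the *.mtl is silently skipped.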

Since there is no code that attempts to detect the encoding and use that to interpret the file, anything other than UTF-8 can’t be properly imported. At best it fails gracefully, ignoring the parts it can’t decode, as it does on Linux.
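For illustration only, this is roughly what a “try to detect the encoding” step could look like. Nothing like it exists in the importer according to the posts above. The function name `decode_with_fallback` and the candidate list are assumptions, and trying encodings in order is a crude heuristic rather than real detection, because single-byte encodings such as `cp1251` accept almost any input.

```python
def decode_with_fallback(data, candidates=("utf-8", "cp1251")):
    """Return (text, encoding_used), trying each candidate encoding in order.

    ASSUMPTION: the candidate list is a guess; put the most specific
    encoding first, since permissive ones rarely fail.
    """
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Nothing fit: keep going with replacement characters instead of raising.
    return data.decode("utf-8", errors="replace"), "utf-8 (lossy)"


# A Russian word saved in a legacy code page is not valid UTF-8,
# so the second candidate is the one that succeeds.
text, used = decode_with_fallback("привет".encode("cp1251"))
print(text, used)
```

Keeping UTF-8 first means correctly encoded files are never misread. The fallback only matters for files that would otherwise fail.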

In conclusion, the importer doesn’t handle encoding errors as gracefully on Windows as it does on Linux. The .obj/.mtl files are still required to be UTF-8 encoded.