UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa7 in position 3: invalid start byte

tiancaipipi110 · March 20, 2021, 12:33pm

Hello, the OBJ importer in Blender 2.8 to 2.92 seems unable to import an .obj whose name contains certain characters (Russian?). It fails with UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa7 in position 3: invalid start byte. Blender 2.79 works fine. Any idea how to get around this? Here’s a test example.

ideasman42 · March 20, 2021, 1:01pm

The file works here with 2.93 (current master), best report this in the bug tracker.

tiancaipipi110 · March 20, 2021, 2:59pm

I just tried 2.93 alpha and it still doesn’t work. I’m confused.

Robert · March 20, 2021, 3:31pm

It seems that your .obj/.mtl files are not UTF-8 encoded. Try to convert the encoding to UTF-8 in an editor like Notepad++.
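Before converting anything, it can help to find out which files are affected. The following is a minimal sketch, not part of Blender or its importer: the helper names `is_valid_utf8` and `find_non_utf8` are made up for illustration, and it assumes the files sit in one folder tree with .obj/.mtl extensions.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8(folder, suffixes=(".obj", ".mtl")):
    """List files under `folder` whose contents are not valid UTF-8.

    Hypothetical helper: only the file contents are checked here,
    not the file names on disk.
    """
    return sorted(
        path
        for path in Path(folder).rglob("*")
        if path.is_file()
        and path.suffix.lower() in suffixes
        and not is_valid_utf8(path.read_bytes())
    )
```

Calling `find_non_utf8("path/to/models")` returns the paths that would need converting; pure-ASCII files pass the check because ASCII is a subset of UTF-8.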

tiancaipipi110 · March 20, 2021, 3:57pm

Yes, this would work, but it's not realistic for me. I’ve asked a few others to try importing this .obj in Blender 2.80-2.93, and it works properly for all of them. Even my 2.79 works fine. How come it doesn’t work on my 2.80-2.93? What’s the cause?

Robert · March 20, 2021, 4:14pm

Edit: This guess was not correct. It turns out that the importer is handling decoding errors more strictly on Windows than on Linux in current versions of Blender.

That would most likely be because import_obj.py doesn’t enforce the expected encoding in all places. This would result in the use of the encoding based on the system locale which may or may not match what the file is actually using as encoding. I would have to take a closer look at this though.

If that is the case this should be fixed in Blender. However, your file would still have to be encoded as UTF-8 as it is currently not valid.

tiancaipipi110 · March 20, 2021, 4:49pm

You’re saying that since the file isn’t UTF-8 encoded, technically it shouldn’t work for others either. But for some reason, it does. And you’re going to fix it so that it only imports UTF-8 encoded files?

tiancaipipi110 · March 20, 2021, 5:20pm

I see. Basically, if I want a quick workaround, I can just set the OS system encoding to match the file. But even if it can open the file, Blender still won’t recognize those characters. So in the long run I still have to change the files (both .obj and .mtl) to UTF-8?

tiancaipipi110 · March 20, 2021, 5:31pm

But it still doesn’t explain why my Blender 2.79 is able to open that file on my machine, though. Yes, that’s what I’m doing with Notepad++, but I’ve got far too many files to do it by hand, unless there’s a way to batch re-encode them to UTF-8 with Python.
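A batch conversion along these lines can be scripted with the Python standard library alone. This is a rough sketch under stated assumptions: `reencode_to_utf8` and `convert_folder` are hypothetical names, and the source encoding is not known from the thread, so the caller has to supply it (for example "cp1251" if the names really are Russian). Work on a backup copy, since the files are overwritten in place.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding):
    """Rewrite `path` as UTF-8 and return True if the file was changed.

    `source_encoding` is the caller's guess at the current encoding.
    A wrong guess may raise UnicodeDecodeError or silently produce
    garbled names, so inspect the result on a few files first.
    """
    path = Path(path)
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, leave it untouched
    except UnicodeDecodeError:
        pass
    text = raw.decode(source_encoding)
    # Bytes in, bytes out: line endings are preserved as they were.
    path.write_bytes(text.encode("utf-8"))
    return True


def convert_folder(folder, source_encoding, suffixes=(".obj", ".mtl")):
    """Convert every matching file under `folder`; return the changed paths."""
    changed = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            if reencode_to_utf8(path, source_encoding):
                changed.append(path)
    return changed
```

For example, `convert_folder("models_backup", "cp1251")` would convert a copy of the models, assuming that encoding is the right one. The sketch only rewrites file contents; if the .mtl file names on disk contain the same characters, they still need to match the names written in the .obj.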

Harleya · March 20, 2021, 5:51pm

I realize that you need a proper fix to this, but if the workaround is “Convert to UTF-8” in Notepad++ then it is possible you could script the prepending of the UTF-8 byte order mark - 0xEF,0xBB,0xBF - to the start of those files. Might work.
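Whether a byte order mark alone helps can be checked on a few bytes before scripting it across files. In the sketch below the sample bytes are hypothetical (chosen to reproduce the error from the thread title) and `decodes_as_utf8` is an illustrative helper. Since the BOM is only a marker and leaves the following bytes unchanged, this particular sample still fails to decode with the BOM in front.

```python
import codecs

# Hypothetical bytes that are not valid UTF-8 (0xa7 at position 3).
raw = b"abc\xa7"

# Prepend the UTF-8 byte order mark: 0xEF, 0xBB, 0xBF.
with_bom = codecs.BOM_UTF8 + raw


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if `data` is valid UTF-8, with or without a leading BOM."""
    try:
        data.decode("utf-8-sig")  # "utf-8-sig" skips a leading BOM if present
    except UnicodeDecodeError:
        return False
    return True


print(decodes_as_utf8(raw))
print(decodes_as_utf8(with_bom))
```

If both checks report failure for a real file, the file has to be transcoded from its actual encoding rather than just marked.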

Robert · March 20, 2021, 6:23pm

Could you please create a bug report through Help > Report a Bug?

Additionally, open Blender’s installation directory and double click on blender_debug_log.cmd. This will start Blender in debug mode and create log files. Try to import the *.obj again. Once it fails to import, close Blender. Windows Explorer should open and show you up to two files, a debug log and the system information. Add them to your bug report.

tiancaipipi110 · March 20, 2021, 6:40pm

Ok. After I open blender_debug_log.cmd, I should press enter to open Blender right? This will open my default Blender 2.79, is there a way to point it to a different version?

Harleya · March 20, 2021, 6:49pm

AFAICT, that “blender_debug_log.cmd” will only ever open the blender that is in the same directory as it.

tiancaipipi110 · March 20, 2021, 7:06pm

Ok, found out why, I renamed the executable. Changed it back and now it works. Thanks.

Harleya · March 20, 2021, 7:15pm

Now, just in case doing so might have screwed up your prior testing of this on different versions, note that there was a related issue to this that was only fixed in 2.93. In a nutshell users on Windows (only) could have some unicode characters give an error on decoding in earlier versions, fixed quite recently.

tiancaipipi110 · March 21, 2021, 4:36am

It shouldn’t be the case here. I only renamed 2.80 executable, but this is also happening to 2.83.13/2.92/2.93.

ideasman42 · March 21, 2021, 7:07am

Most likely a developer needs to redo this on their system.

The importer was written to handle non-utf8 files. This works for me on Linux, if it fails on other platforms - that needs to be investigated.

Robert · March 21, 2021, 10:42am

I was under the impression that we only support UTF-8, especially given that os.fsdecode (applies the strict error handling) and .decode with explicit utf-8 encoding are used. I would be surprised if this works with all other encodings on Windows. I will check the ticket today.

Robert · March 21, 2021, 12:35pm

As expected, this works on neither Linux nor Windows, since the name of the *.mtl can’t be decoded as UTF-8.

On Linux it just fails gracefully. It passes the decoding step but ignores the *.mtl because it can’t find the file with the improperly encoded name. Importing the model is still successful.

On Windows it fails in os.fsdecode for the *.mtl name, as it uses errors='strict' internally.

Since there is no code that attempts to detect the encoding and use it to interpret the file, anything other than UTF-8 can’t be properly imported. At best it fails gracefully, ignoring the parts it can’t decode, as it does on Linux.
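The difference between the two behaviours can be reproduced in plain Python, outside Blender. This is only an illustration of Python's decoding error handlers, not the importer's actual code; the sample bytes are hypothetical. Strict decoding raises the error from the thread title, while the surrogateescape handler (which os.fsdecode uses on Linux) lets the bytes through as placeholder characters that can be turned back into the original bytes.

```python
# Hypothetical file-name bytes that are not valid UTF-8.
name = b"abc\xa7"

# Strict decoding (the default for bytes.decode) raises on the bad byte.
try:
    name.decode("utf-8")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = str(exc)

# surrogateescape maps each undecodable byte to a lone surrogate
# instead of raising, so decoding "passes" with a mangled name.
escaped = name.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes.
round_trip = escaped.encode("utf-8", errors="surrogateescape")

print(strict_error)
print(repr(escaped))
print(round_trip == name)
```

The escaped name is a valid Python string but not the name the author intended, which fits the behaviour described above: decoding succeeds, yet the *.mtl lookup finds no matching file.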

Robert · March 21, 2021, 12:52pm

In conclusion, the importer doesn’t handle encoding errors as gracefully on Windows as it does on Linux. The .obj/.mtl files are still required to be UTF-8 encoded.
