Shrinking the daily builds

Oh, I did not realize that machine code was so different even with the slightest change in source code. Very useful tools indeed. Radare2 also has a diff tool with patching mechanisms:

https://www.radare.org/r/

Just using 7zip with maximum compression (LZMA2, 1536MB dictionary) shrank one of the recent builds (blender-2.80-f12040e088b6-win64) almost in half, from 119MB to 71MB. And you can put it into an SFX (self-extracting) archive if you are not sure whether the user has a suitable unpacker installed.
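For reference, this is roughly how that could be scripted (a sketch assuming the 7z command-line tool is on PATH; the build folder name is just an example, and the 1536MB dictionary needs a lot of RAM):

```python
import subprocess

BUILD_DIR = "blender-2.80-f12040e088b6-win64"  # example folder to pack

# Plain 7z archive: LZMA2, maximum compression level, 1536 MB dictionary.
subprocess.run(
    ["7z", "a", "-t7z", "-m0=lzma2", "-mx=9", "-md=1536m",
     BUILD_DIR + ".7z", BUILD_DIR],
    check=True,
)

# Same settings as a self-extracting archive (-sfx), so no unpacker is needed.
subprocess.run(
    ["7z", "a", "-t7z", "-m0=lzma2", "-mx=9", "-md=1536m", "-sfx",
     BUILD_DIR + ".exe", BUILD_DIR],
    check=True,
)
```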

The biggest annoyance (at least on Windows) with manually downloading the daily builds is that Chrome (and probably other web browsers) first feeds the whole ZIP file to your anti-virus for analysis (that long, unexplained delay when the download reaches 100% before you’re allowed to actually open it), and then when you go to extract it, every single file is fed individually to the anti-virus for inspection (the reason the extraction rate approaches zero after a few seconds), which adds a minute or two to the process.

This overhead greatly exceeds the time required to actually download the file (or at least the time to download the extra 25MB that weaker compression adds).

Doing a local build including the “make update” through git seems to avoid this, which is a big reason it’s so fast (relatively speaking).

Wow, I somehow overlooked that there were replies to this thread!

The problem with rsync (if I understand the concept correctly) is that it could place a much higher burden on the build server, as it would have to generate patches on the fly for every download instead of only once after a new build is released. The other issue is that rsync patches looked like they would be nowhere near as size-efficient as patches generated by xdelta3 or Courgette.
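For comparison, this is roughly how a per-build patch could be produced once on the server with xdelta3 and applied on the client (a sketch assuming the xdelta3 binary is installed; the file names are made up):

```python
import subprocess

OLD = "blender-old.zip"    # hypothetical previous daily build
NEW = "blender-new.zip"    # hypothetical latest daily build
PATCH = "old-to-new.xd3"   # generated once per build, then served to everyone

# Server side: encode the delta once (-e), using the old build as the source (-s).
subprocess.run(["xdelta3", "-e", "-s", OLD, NEW, PATCH], check=True)

# Client side: decode (-d) the patch against the local old build to recreate the new one.
subprocess.run(["xdelta3", "-d", "-s", OLD, PATCH, NEW], check=True)
```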

Interesting tool! I just did a quick check, but it looks like radare uses a diff algorithm very similar to rsync.

7zip is certainly better than the standard zip algorithm. I’m not sure why the Blender Foundation stuck with zip; probably for the wider compatibility, since most OSes support it by default.

That delay is one of the reasons I started looking into a patch-based solution. The antivirus/antimalware overhead is a pain. I wish there were a way to mark file sources as “trusted” to bypass this sometimes. IIRC, this is currently only an option for files downloaded from Microsoft’s Store.

Windows doesn’t support it out of the box, while it does support zip files, so we leaned toward compatibility there. For my graphicall builds I decided not to care and used 7z, and I never heard a complaint. That being said, GA has far fewer users than the main Blender site, and given they are looking for specialized builds, odds are they are more advanced users.


In the age of fiber and 100mbit - 1gbit internet speeds becoming commonplace, I don’t know that it is really that big of a deal to fuss over shaving a few megabytes off of a download, especially when compared to the availability of the decompression tools in the various systems.


In the age of fiber and 100mbit - 1gbit internet speeds becoming commonplace, I don’t know that it is really that big of a deal to fuss over shaving a few megabytes off of a download, especially when compared to the availability of the decompression tools in the various systems.

What he said! I realize that any hint of inefficiency drives some people nuts, but in terms of bang for the buck, there is not much gain in building and maintaining the infrastructure to do this.

Woot! I was looking over my notes on this and found another set of test results. I made a crude Courgette directory patcher in Python and was able to use that code to shrink the size of an update down even more than with xdelta3. The results for a patch to go from dc3b5024be1a to a1ae04d15a9f:

  • uncompressed patch size: 13.5 MiB
  • patch compressed using zip (ultra deflate): 8.45 MiB
  • patch compressed using 7zip (ultra LZMA2): 5.8 MiB

One major drawback: making the original uncompressed patch directory took almost 15 minutes on a 4th-gen i5, though part of this is because my directory patcher script runs single-threaded.
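The driver was conceptually something like the sketch below (not the original script; it uses xdelta3 as a stand-in for the per-file delta tool and a process pool, which would also address the single-threaded bottleneck; the paths are made up):

```python
import os
import subprocess
from concurrent.futures import ProcessPoolExecutor

OLD_DIR = "blender-dc3b5024be1a"     # hypothetical unpacked old build
NEW_DIR = "blender-a1ae04d15a9f"     # hypothetical unpacked new build
PATCH_DIR = "patch-dc3b50-to-a1ae04"

def make_patch(rel_path):
    """Create a per-file delta for one file that exists in both builds."""
    old = os.path.join(OLD_DIR, rel_path)
    new = os.path.join(NEW_DIR, rel_path)
    out = os.path.join(PATCH_DIR, rel_path + ".xd3")
    os.makedirs(os.path.dirname(out), exist_ok=True)
    subprocess.run(["xdelta3", "-e", "-f", "-s", old, new, out], check=True)
    return rel_path

def relative_files(root):
    """Yield every file path under root, relative to root."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            yield os.path.relpath(os.path.join(dirpath, name), root)

if __name__ == "__main__":
    common = sorted(set(relative_files(OLD_DIR)) & set(relative_files(NEW_DIR)))
    # New files would simply be copied and deleted files recorded; omitted here.
    with ProcessPoolExecutor() as pool:
        for done in pool.map(make_patch, common):
            print("patched", done)
```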

@kilon Sorry for the late update on this. I did find parts of this experiment, but I couldn’t find my notes on the binary sampling idea. Instead I found some Python code that did a slow byte-by-byte pass through the data, using operations like file.seek() and file.tell(), to search for a chunk of data matching an expected hash value. If there is interest I can look into uploading the code, but it seemed pretty hackish. On the other hand, I also found this neat infographic I made to document Bencode (updated to use a more recent torrent, as the original torrent it was based on seems to have disappeared).
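The core of that byte-by-byte search was something along these lines (a reconstruction rather than the original code; the chunk size, hash, and file name below are placeholders):

```python
import hashlib

CHUNK_SIZE = 4096                        # hypothetical chunk length
EXPECTED = "placeholder_sha1_hexdigest"  # hash of the chunk being hunted for

def find_chunk(path, expected_hex, chunk_size=CHUNK_SIZE):
    """Slide through the file one byte at a time and return the offset of the
    first chunk whose SHA-1 matches the expected digest, or None if not found."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            f.seek(offset)
            chunk = f.read(chunk_size)
            if len(chunk) < chunk_size:
                return None               # ran out of data before finding a match
            if hashlib.sha1(chunk).hexdigest() == expected_hex:
                return offset             # f.tell() - chunk_size gives the same value
            offset += 1                   # advance a single byte and try again

print(find_chunk("blender.exe", EXPECTED))  # hypothetical file to scan
```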

Inexpensive and uncapped 100mbit to 1gbit speeds are not commonplace unless you live near a large metropolitan area (at least this is the case for a big chunk of North America).

I am feeling a sense of deja vu here, but to lay out some other advantages to this approach (besides a shorter download time):

  • potentially massive bandwidth savings (assuming you don’t run into Jevons’ paradox), as you are reducing file sizes by up to 95 percent
  • large potential savings in storage space for daily builds; this would make it easier to host multiple builds per OS (and open up the possibility of letting new users “roll back” to a previous build if the most recent build was DOA)
  • the entire project could be hosted by anyone with access to the daily builds (although a third-party host’s patch server would need to download a new daily build every day to make patches from)
  • if you are just looking to test out a more recent build (say your current build is several days old), it’s potentially much faster to do updates (or rollbacks) this way than even building from source, as you wouldn’t have to update local repos and wait for your build environment to load itself into memory, open the project, and compile everything
  • this approach would not be tied to the Blender project; potentially any software project releasing daily/nightly builds could take advantage of a setup like this

As the simplest conclusion: for the daily builds it would be very useful to have a simple “update to latest” button inside Blender, which downloads only the files changed since the previous build. It would save a lot of time and bandwidth.
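Something like the client-side logic below would be enough, assuming the build server published a per-file hash manifest for each build (no such manifest exists today; the URLs, manifest format, and install path are made up):

```python
import hashlib
import json
import os
import urllib.request

MANIFEST_URL = "https://example.org/daily/latest/manifest.json"  # made-up URL
FILES_URL = "https://example.org/daily/latest/files/"            # made-up URL
INSTALL_DIR = "C:/blender-daily"                                 # example install path

def sha256(path):
    """Hash a local file so it can be compared against the manifest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

# Assumed manifest format: relative path -> expected hash, e.g. {"blender.exe": "ab12..."}.
with urllib.request.urlopen(MANIFEST_URL) as resp:
    manifest = json.load(resp)

for rel_path, expected in manifest.items():
    local = os.path.join(INSTALL_DIR, rel_path)
    if os.path.exists(local) and sha256(local) == expected:
        continue  # file unchanged since the previous build, skip the download
    os.makedirs(os.path.dirname(local) or INSTALL_DIR, exist_ok=True)
    urllib.request.urlretrieve(FILES_URL + rel_path, local)
    print("updated", rel_path)
```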


There is a nice script, BlenderUpdaterCLI, which can be used for this. It also has a complete video tutorial on how to use it. But I agree it would be great if Blender had this integrated.


Hi, I have been waiting for such an updater for a long time.
It is not only the file size; you also save several steps: opening the builds download page, downloading, unzipping, changing the desktop shortcut, deleting the old build, and deleting the zip.
For example, I use the Vivaldi browser snapshot (beta): start the update, it downloads only the difference, auto-restarts, and it’s ready (Windows only).
This should work for release builds too, except maybe for major upgrades like 2.8 > 3.0.

Cheers, mib