Network Render Test with Flamenco

Hello again!

So, we have succeeded in running Flamenco on Linux on two PCs, as you suggested, to check that everything works fine on our network. The problem is that all our workstations and in-house software run on Windows, so we still need a solution for the “workers” not being able to access the job storage path. We suspect the issue is in the “worker windows installer”; what are your thoughts about it? If you don’t have spare time to look into this issue, we would be happy to do so ourselves, but we haven’t found the source code of the “worker installer”. Can you share it or tell us where to look for it?

P.S. I’m the one who tweeted you last Monday. :wink:

Extending your comment, this is the error in the command shell of flamenco-worker with two computers on Windows:

2020-06-10 14:56:33,328  WARNING flamenco_worker.runner.TaskRunner Command 0 of task 5ee0d876e5b84e87e6b7f458 was not succesful, aborting task.
2020-06-10 14:56:33,329     INFO flamenco_worker.runner.TaskRunner Aggregated task timing info: {}
2020-06-10 14:56:33,331    ERROR flamenco_worker.worker.FlamencoWorker Task 5ee0d876e5b84e87e6b7f458 failed
2020-06-10 14:56:33,334     INFO flamenco_worker.upstream_update_queue.TaskUpdateQueue Pushing task update to Manager, queue size is 2
2020-06-10 14:56:33,354     INFO flamenco_worker.upstream_update_queue.TaskUpdateQueue Pushing task update to Manager, queue size is 1
2020-06-10 14:56:36,435     INFO flamenco_worker.worker.FlamencoWorker.fetch_task Received task: 5ee0d876e5b84e87e6b7f459
2020-06-10 14:56:36,435     INFO flamenco_worker.worker.FlamencoWorker Task changed status to active, pushing to manager
2020-06-10 14:56:36,439     INFO flamenco_worker.worker.FlamencoWorker Updating task 5ee0d876e5b84e87e6b7f459 with status 'active' and activity Activity(activity='', current_command_idx=0, task_progress_percentage=0, command_progress_percentage=0, metrics={'timing': {}})
2020-06-10 14:56:36,450  WARNING flamenco_worker.commands.blender_render.(task_id=5ee0d876e5b84e87e6b7f459, command_idx=0) Error in settings: blender_cmd 'C:/Program' does not exist
2020-06-10 14:56:36,451     INFO flamenco_worker.worker.FlamencoWorker Task changed status to failed, pushing to manager
2020-06-10 14:56:36,452     INFO flamenco_worker.worker.FlamencoWorker Updating task 5ee0d876e5b84e87e6b7f459 with status 'failed' and activity Activity(activity="blender_render.(task_id=5ee0d876e5b84e87e6b7f459, command_idx=0): Invalid settings: blender_cmd 'C:/Program' does not exist", current_command_idx=0, task_progress_percentage=0, command_progress_percentage=0, metrics={'timing': {}})
2020-06-10 14:56:36,461  WARNING flamenco_worker.runner.TaskRunner Command 0 of task 5ee0d876e5b84e87e6b7f459 was not succesful, aborting task.
2020-06-10 14:56:36,463     INFO flamenco_worker.runner.TaskRunner Aggregated task timing info: {}
2020-06-10 14:56:36,465    ERROR flamenco_worker.worker.FlamencoWorker Task 5ee0d876e5b84e87e6b7f459 failed
2020-06-10 14:56:36,466     INFO flamenco_worker.upstream_update_queue.TaskUpdateQueue Pushing task update to Manager, queue size is 2
2020-06-10 14:56:36,482     INFO flamenco_worker.upstream_update_queue.TaskUpdateQueue Pushing task update to Manager, queue size is 1
2020-06-10 14:56:51,470     INFO flamenco_worker.worker.FlamencoWorker Updating task 5ee0d876e5b84e87e6b7f459 with status '' and activity Activity(activity='', current_command_idx=0, task_progress_percentage=0, command_progress_percentage=0, metrics={'timing': {}})
2020-06-10 14:56:51,481     INFO flamenco_worker.upstream_update_queue.TaskUpdateQueue Pushing task update to Manager, queue size is 1

This error occurred on Windows with the same configuration that worked on Linux, with the paths changed to the correct Windows ones, of course.


When using spaces in a filename, you need to quote that filename. So configure the Blender path as "C:/Program Files/blablabla/blender.exe" --factory-startup --other-options, including the quotes.
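To see why the quotes matter, here is a minimal sketch. It assumes the Worker splits the configured command shell-style, the way Python's `shlex.split` does; the thread does not show the Worker's actual code, so treat that as an assumption. The helper name `executable_of` and the exact Blender path are made up for illustration.

```python
import shlex


def executable_of(command: str) -> str:
    """Return the first token of a shell-style command line (hypothetical helper)."""
    return shlex.split(command)[0]


# Illustrative path only; substitute your real Blender location.
unquoted = 'C:/Program Files/Blender Foundation/Blender 2.82/blender.exe --factory-startup'
quoted = '"C:/Program Files/Blender Foundation/Blender 2.82/blender.exe" --factory-startup'

# Without quotes, the first space ends the executable name.
print(executable_of(unquoted))  # C:/Program

# With quotes, the whole path stays together as one token.
print(executable_of(quoted))    # C:/Program Files/Blender Foundation/Blender 2.82/blender.exe
```

The unquoted result, `C:/Program`, is the same string the Worker complains about in the log above.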

What is this “worker windows installer”? On the Flamenco download page there is only the ZIP file that you just unzip to get the worker. There is no installer.


You are right, there is no installer. What I meant is that everything in the zip file is compiled.

It may look that way, but most of the code is just Python sources inside one big zip file (so a zip within a zip).

I finally solved the problem by changing the path of blender.exe to “C:/PROGRA~1/BlenderFoundation/Blender_2.82/blender.exe”. Note that there are no spaces, as you said. With that, everything worked OK.
Anyway, it is a little confusing that the suggested installation path of Blender on Windows is under “Program Files/Blender Foundation/Blender XX/blender.exe” while the Flamenco worker doesn’t support spaces in the path.

:joy: :joy: :joy: sorry

That is confusing indeed, and would be nasty if this was intentional. Fortunately, the note in the documentation that states that paths with spaces should be quoted indicates that this is not :wink:

Flamenco has been made primarily for the Blender Animation Studio. As such, it has seen much less use on Windows in general, and hasn’t been used on Windows at all by its developers. Windows is so different from Linux and macOS that I’m not surprised Flamenco has some issues there. I’m confident that, given a properly detailed bug report on how to reproduce an issue with a minimal amount of effort, and enough development time, every issue can be fixed.


Hi, I just started setting up a small render farm here at work; it’s made of 4 PCs running Windows.
I can get the renders running all right on some computers, so the setup is mostly fine. However, if one computer struggles to connect to the NAS, it gives a failed task, as it should. What concerns me is that if I manually fix the error on the worker (usually by retyping my credentials to connect to the network), this particular worker gets ignored for the rest of the job. It connects fine to the manager, but no more tasks are dispatched to it. The only solution I have found so far is to cancel the whole job and start over, which can become tricky for big jobs.
Generally speaking I really enjoy the Flamenco web interface, but what is lacking in my opinion is a way to manage and check the status of the different workers.
Also, perhaps a way to manually set different tile sizes for each worker? I have an i9+2080ti that works best at 256px tile size in FHD and 4 i7+1060 that work best at 64px tile size.
I’m not sure if this is the right place to post these bugs/suggestions, or if I should go to the RCS or file a bug report.


This is intentional. When a Worker fails a particular task type on a particular job, it gets blacklisted. This is to prevent situations where a Worker has a bit too little memory to render a particular shot, crashes as soon as the shot is loaded, marks the task as failed, and races through all the tasks until there are so many failures that the entire job is cancelled.

The Worker will only be blacklisted for that particular task type. So, if rendering is too heavy for it, it can still help moving files around, running ffmpeg to convert image sequences to a video file, etc.
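As a rough illustration of the idea (not the Manager's actual implementation, which is not shown in this thread), a blacklist keyed on worker, job and task type could look like the sketch below. The class, method names and task type names are all hypothetical.

```python
from collections import defaultdict


class Blacklist:
    """Toy per-worker blacklist of (job, task type) pairs. Hypothetical sketch."""

    def __init__(self):
        # worker name -> set of (job_id, task_type) pairs it may no longer run
        self._entries = defaultdict(set)

    def add(self, worker, job_id, task_type):
        """Record that this worker failed this task type on this job."""
        self._entries[worker].add((job_id, task_type))

    def remove(self, worker, job_id, task_type):
        """Lift one entry, like clicking the red cross in the Manager UI."""
        self._entries[worker].discard((job_id, task_type))

    def is_blocked(self, worker, job_id, task_type):
        """True if the worker should not be handed this kind of task."""
        return (job_id, task_type) in self._entries[worker]


blacklist = Blacklist()
blacklist.add("worker-1", "job-a", "blender-render")

# Rendering for job-a is blocked, but other task types and other jobs are not.
print(blacklist.is_blocked("worker-1", "job-a", "blender-render"))  # True
print(blacklist.is_blocked("worker-1", "job-a", "video-encoding"))  # False
print(blacklist.is_blocked("worker-1", "job-b", "blender-render"))  # False
```

The point of keying on both job and task type is that one bad shot does not take the worker out of the farm entirely.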

You can prevent this by configuring the pre-task check on the Worker. Add something like this to your flamenco-worker.cfg:

[pre_task_check]
write.0 = /render/flamenco/output
read.0 = /render/flamenco/input

This makes the Worker check whether it can write to /render/flamenco/output and read from /render/flamenco/input, before asking the Manager for a task. Both are optional, so use whatever you want. You can also add more directories there, with write.1, write.2, etc.
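If you want to test those paths yourself before starting the Worker, something along these lines may help. It is only a sketch of the kind of check described above, not the Worker's own code; the function name `failed_checks` is made up, and it assumes the `read.N` / `write.N` key naming shown in the config snippet.

```python
import configparser
import os
import tempfile


def failed_checks(cfg_text):
    """Return a list of problems found for the [pre_task_check] paths (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section("pre_task_check"):
        return []

    problems = []
    for key, path in parser["pre_task_check"].items():
        kind = key.split(".")[0]  # "read.0" -> "read", "write.1" -> "write"
        if kind == "read":
            if not os.access(path, os.R_OK):
                problems.append(f"cannot read {path}")
        elif kind == "write":
            try:
                # Creating (and automatically deleting) a temp file proves write access.
                with tempfile.TemporaryFile(dir=path):
                    pass
            except OSError:
                problems.append(f"cannot write to {path}")
    return problems


example = """
[pre_task_check]
write.0 = /render/flamenco/output
read.0 = /render/flamenco/input
"""
for problem in failed_checks(example):
    print(problem)
```

On a machine where the share is not mounted, this prints one line per unreachable path; an empty result means all configured paths were usable at that moment.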

In the Workers overview of Flamenco Manager, click on the little triangle to open up the worker details. There you’ll find the blacklisted jobs & task types for that worker. Click on the red cross to remove a blacklist entry.


Thank you Sybren for this answer, it makes sense.
I think my issue is that I can’t access my Flamenco Manager via my browser. (I have tried Chrome and Edge on Windows for sure, and I’m pretty confident I gave Firefox a go as well.) When I click on my manager’s URL in the Flamenco managers webpage:


It opens this page :
But I can’t do anything on this as it keeps relentlessly reloading this same page every second.
I also remember having this issue while setting up the manager for the render farm, but I think I managed to bypass this by editing the cfg file by hand.

I have also just come across another issue. I have migrated all my workers and the manager to Blender 2.92, and now my job is stuck on “Queued” on the Flamenco webpage and I can’t see the progress, even though it keeps rendering as usual and at the end ffmpeg compiles the video correctly. Any idea why that would be?

The refreshing issue is indeed problematic, and AFAIK it works fine on Firefox. Please double-check.

You are right, the constant refresh issue is gone when using Firefox. I had never seen this page before, so I guess I hadn’t given Firefox a try. My apologies, it’s great to know!
However, the render jobs I have launched since upgrading to 2.92 are still marked as “Queued” even though they are actually completed. (The upgrade is not the only thing that changed since my previous renders from a couple of weeks ago. We also moved the whole office, so the IP addresses are different, but I think I’ve changed all the config files accordingly.)

You’ll have to check the logs as sent from Worker to Manager, as well as the logging output of the Manager itself. That should give you more info.


Sorry to bother you again. I’m not even sure this is the right place to come for help,
but I really can’t get my head around what’s going on and I need to launch a big render by the end of the week.
My manager is stuck in a loop like this :


My workers give me this error :
2021-03-03 10:17:27,930 WARNING flamenco_worker.worker.FlamencoWorker.fetch_task Error fetching new task, will retry in 10 seconds: HTTPConnectionPool(host='192.168.1.29', port=8083): Read timed out. (read timeout=3)
I’ve tried removing all my projects from the cloud, setting up a new manager with an empty project, but it still tells me that I have 12511 Tasks and 35269 Upstream Queue.
If I start a new render via Flamenco/Blender, the manager refuses to give tasks to the workers, and I see the above task number incremented by the number of frames I have asked to render.
It worked “fine” before (apart from the fact that the jobs were stuck on queued); today it’s not rendering at all.
Any idea what I could do? I wouldn’t mind creating a new Blender Cloud account if it’s the only solution…

Please don’t post screenshots of text. Just paste it, or pipe it to a file and attach that. You can’t search through screenshots, or copy-paste text from them.

Your worker can’t connect to the Manager. This is not solved by doing anything on Blender Cloud, as it’s a problem that’s local to your network. Is the Manager advertising the correct URL to the workers? Does the machine the Manager runs on have a static IP address? Or is the DHCP server set up to always give it the same IP address? This is a necessity for a stable connection – the Manager isn’t equipped to deal with changing IPs.
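A quick way to check basic reachability from a worker machine is to try opening a TCP connection to the Manager's address. The sketch below is a generic check using only the Python standard library, not part of Flamenco; `can_reach` is a made-up name, and the host and port in the usage comment are simply the ones from the log line above.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures and timeouts.
        return False


# Example usage on a worker machine (address taken from the log above):
#   print(can_reach("192.168.1.29", 8083))
```

Note that this only tells you whether the port accepts connections. A "Read timed out" message usually means the connection was opened but no response arrived in time, so a `True` result here combined with that error would point at a Manager that is reachable but too busy or stuck to answer.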

Removing projects from the Cloud doesn’t tell your Manager to forget what it has queued. This is like demolishing a house, and expecting the post office to burn any letters that were queued up for delivery there. Run flamenco-manager -purgequeue to purge the Manager’s outgoing queue.


My apologies for the screenshots, I will keep that in mind, I thought it would be more readable this way.
I finally managed to get the farm up and running again. Apparently the workers were clogging the manager with tasks (sorry for not being more precise; in my haste to figure things out I did not take the time to save logs). I’m still not sure what I did wrong to get things so messed up. The network and fixed IP addresses were all fine, I could ping all the computers, and they could all access their shared folders.
The -purgequeue command did purge a few tasks, but I had to run it something like twenty times before the upstream queue was actually empty, as tasks kept coming back by the thousands. I purged 4000, the queue appeared empty in the Manager interface, then as soon as I started the manager again, 3800 came back. In the end it did not make things work any better.
What did the trick was erasing all the Flamenco-worker folders from my workers and replacing them with a fresh Flamenco-worker folder (I made sure to keep the same IP address). Now all the workers can connect to the manager, and I was able to start my render just before the weekend. Thanks to Blender Cloud I’ll be able to keep an eye on my renders from the comfort of my home :wink:
I hope my endeavours can be of help to someone else someday.

Hey guys, I’m fighting with the same problem!

Error in settings: blender_cmd '{blender}' does not exist

If anyone can explain this problem to me, that would be great!

check it!


Flamenco does support spaces in a path, but you have to quote the path. This is necessary to distinguish a “space in executable path” from a “space to separate executable from arguments”.

Won’t work:

C:\Program Files\Blender Foundation\Blender X.XX\blender.exe --factory-startup -noaudio

Will work:

"C:\Program Files\Blender Foundation\Blender X.XX\blender.exe" --factory-startup -noaudio

This is the same as you see when you create a shortcut on your desktop to Blender. It’ll have the path "C:\Program Files\Blender Foundation\Blender X.XX\blender.exe" including the quotes.
