Distributed Bucket Rendering - Thoughts...
  • alexyork
  • Posted: 12 June 2008 01:41 PM
  • Location: london, UK
  • Total Posts: 178
  • Joined: 24 August 2006 06:34 AM

Hi all,

First of all, apologies for the long post. However, the fact that DBR fails for so many of us makes this a valid and, I feel, important thing to discuss and try to solve.

As others have reported here (and elsewhere) many (most? all?!) of us are having issues with MR crashing during DBR. Sometimes when you hit render everything works, sometimes MR crashes half way, sometimes it crashes at the “cleanup” stage. Whatever the case it’s extremely unstable and makes using DBR in production on a tight deadline (or any deadline for that matter) completely unpredictable and therefore unfeasible/unusable.

to be clear before we start, for me at least it is MR that is crashing - “fatal error”, “unhandled exception” etc, NOT MAX. However the MR error window prevents you from using MAX once it’s popped up, so you cannot manually jump into the framebuffer and save out the finished image. upon accepting that MR has died a nasty death the whole load closes down, perhaps with the option to save a backup version, perhaps not… again this is another inconsistency with the crash.

to be clearer, the crash for me when using DBR can occur either immediately after render launch, during a render at any point or after a render when “cleanup” is in progress. Interestingly the last situation (cleanup crash) is the most common for me. the crash using DBR occurs using GI, FG, both or none, using BSP2, the original BSP, any of the other BSP settings. All scenes render perfectly on the master machine.

my setup (for reference):

3 machines total including master server machine.
All three machines are the same barebones (Shuttle SX38P2Pro, same motherboard, PSU, same RAM brand etc).
2 of them are Q9450 CPU (quad core), 8GB RAM, 500GB HDD, GeForce8600GTS.
1 of them (the main server machine) is QX9650 (quad core), 8GB RAM, 1TB HDD, GeForce8600GTS.
(Basically all three are the same - 12 buckets in total).
Gigabit ethernet LAN using Netgear 5-port switch with 10m CAT5E cables.
MAX 2009 Design 64-Bit on master, nodes used only for DBR obviously so just raysat running on those.
Windows XP 64-bit on all 3 machines.

I propose that we all chip in with our tips on setting up network/MAX/MR for DBR in the hopes that there is a possible “best practice” to get this to work as often as possible (or even all the time!).

I will start:

- hardware - for the best compatibility try to use machines that are all the same or similar spec. the most important being that they are all Intel or all AMD (i think procedural textures and other things like this render differently across CPU types so it’s important to use all Intel or all AMD) - all three of my nodes are identical down to graphics card, RAM and CPU so this should not be a factor.

- windows version - make sure all your nodes are running the same version of windows, pretty obvious. all of my nodes are on XP64 so this should not be a factor.

- same version of max - pretty obvious! they should all be the same version number and all 32 or all 64 bit.

- memory - the obvious first step is to ensure your nodes have enough RAM to cope with whatever scene you are throwing at them. If your “main” machine has 8GB but your nodes have only 4GB and your scene requires, say, 6GB to render, your DBR is going to fail. I have 8GB on all three of my nodes, all the same speed, so RAM should not be a factor.

- network speed and quality - i have a Gigabit LAN with a brand new Netgear 5-port switch and CAT5E cables (short 10m cables). speed and network reliability should not be a factor.

- general network performance in windows - make sure you can actually read and write and generally talk to all your nodes within windows itself. make sure each machine can “see” and access (read AND write) to your “main” machine (the server). this is the case with my setup so should not be a factor.
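
The network check in the last item can also be scripted. Below is a minimal sketch, assuming Python is available on the master machine. It only tests whether a TCP connection to each node can be opened on a given port; it says nothing about file sharing or read/write permissions. The port is deliberately left as a command-line argument because the port your raysat service listens on is not stated in this thread, and the function name is made up for illustration.

```python
import socket
import sys


def check_nodes(hosts, port, timeout=2.0):
    """Return {host: bool}: True if a TCP connection to host:port succeeded."""
    results = {}
    for host in hosts:
        try:
            # create_connection resolves the name and completes the TCP handshake
            with socket.create_connection((host, port), timeout=timeout):
                results[host] = True
        except OSError:
            # Covers refused connections, timeouts, unreachable hosts
            # and names that do not resolve.
            results[host] = False
    return results


if __name__ == "__main__":
    # Usage (port and hosts are whatever applies to your own setup):
    #   python check_nodes.py PORT HOST [HOST ...]
    if len(sys.argv) >= 3:
        port = int(sys.argv[1])
        for host, ok in check_nodes(sys.argv[2:], port).items():
            print(f"{host}: {'reachable' if ok else 'NOT reachable'}")
    else:
        print("usage: check_nodes.py PORT HOST [HOST ...]")
```

A node that shows as not reachable here points at a firewall, service or cabling problem rather than at the scene itself.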

now, as far as i’m concerned at this point everything should just work perfectly using DBR. all the nodes are the same hardware and software config, all on a fast, reliable network and all talking to each other just fine. if the scene renders on the server machine it should render exactly the same on the nodes, no?

there are a few things I’m considering that might cause the inconsistencies with DBR success:

- softimage XSI (with MR of course) used to (often) require you to set up a separate “render” user account in windows in order to let DBR work properly. perhaps doing something like this for DBR with MR under MAX might help the situation? I doubt it but it’s a possibility.

- this “cleanup” stage. not sure what this actually is or why when DBR fails it is often at this stage (the whole render completes perfectly, then the “cleanup” begins and boom, MR crashes). if it’s only MR that’s crashed here why is MAX not responding afterwards? Would it not be possible at this stage to somehow prohibit that message from blocking MAX so you could actually manually jump into the frame buffer and save out the finished image?

- network cable quality - for sure good quality cables are a must, but i would imagine CAT5E would be more than adequate for this?

- mr’s “processing” options - those three checkboxes and the memory value. Could somebody indicate if these would make a difference at all or are likely to be a factor here?

- maps and other “info” that needs to be sent to each node - surely DBR is purely a “CPU-based” network rendering solution, so maps and other scene items and their paths should not be an issue here?

I look forward to hearing your thoughts on this as it’s been an issue for me since MAX version 9 (didn’t even try it before that) making DBR ultimately useless for me and many others it seems.

Cheers,



alex york
for and on behalf of Atelier York | Bespoke Architectural Visualization
http://www.atelieryork.co.uk
MentalRayTips Twitter Feed

Replies: 0
  • alexyork
  • Posted: 12 June 2008 01:47 PM

additional - make sure firewall is either disabled on all nodes (including master) or at least has exceptions for raysat. all disabled on my setup so should not be a factor.



Replies: 0
  • alexyork
  • Posted: 12 June 2008 01:49 PM

additional - max is up to date with latest hotfix. viewcube is disabled.



Replies: 0
  • Zolren
  • Posted: 12 June 2008 06:20 PM

As per my previous post in another thread about DBR and Mental Ray crashing, I have now been crash free for the past 48 hours.

What seems to have made the difference is turning off the firewall on my other two PCs. BTW: I re-enabled the firewall on the other two PCs, but since I am using Norton 360 version 2.0, I used the My Network feature to set the Trust level of my main PC to fully trusted on the other two (nodes). Since doing so, I have not had a single MR crash in about 48 hours.

I am still using BSP 10 / 20 and no longer using BSP2 since I was getting MR crashes in BSP2 mode but that was before I changed the firewall settings and haven’t tested this again.

All 3 PCs have the same MB and processor (Q6600) but different graphics cards and RAM configurations. I am running over a Gigabit network.



Replies: 0
  • alexyork
  • Posted: 12 June 2008 06:40 PM

that’s good to know, however windows firewall is disabled on all of my nodes and I am not using BSP2 on certain scenes, yet DBR still crashes. the problem therefore may be helped by removing firewall, not using BSP2 or a combination of both, however it will not necessarily fix it altogether for certain people.

it would be very useful if you could test BSP2 again so we can rule that out.



Replies: 0

- windows version - make sure all your nodes are running the same version of windows, pretty obvious. all of my nodes are on XP64 so this should not be a factor.

- same version of max - pretty obvious! they should all be the same version number and all 32 or all 64 bit.

Not necessarily.

I’m using 32-bit Vista with a 64-bit slave PC. No large problems. Just the basic errors every now and then.



"If I see that damn teapot one more time....”

3ds Max 6.5 - Max 2012
Windows 7 Intel Core 2 Quad Q9400 2.13 GHZ
64-Bit, 8 GB RAM

Video Card: Galaxy GeForce 210 (Nvidia)

My Website: http://drmoore3d.com

Replies: 0
  • Meelis
  • Posted: 01 September 2008 09:20 AM

Just a question:

Has anybody had this problem when rendering large scenes with DBR: the last few buckets take hours to complete in the Final Gather computing phase? For example, yesterday I started rendering a scene at about 15:00, and by 17:00 approximately 80% of the Final Gather computation was finished. I came back to work today at 10:00 and it was at 98.9% with one bucket left! So the first 80% took 2 hours and the next 18.9% took 17 hours!!! I canceled the render because nothing was really going on (0% CPU was in use) and zoomed the view in a little, as the final bucket was near the edge, on a glass object reflecting the pool (obviously quite a complicated part to compute). Now I am rendering the scene again with the pool and the glass object outside the picture, but I still have five last buckets somewhere else that just won’t finish. It’s been five hours since the render started, and about 2 of those hours went on completing 2 of the last 5 buckets. Yesterday, when I rendered almost the same frame, exactly the same part completed just fine.
So I think this always happens, even with smaller files, although it is harder to notice because the computation takes less time. It becomes clearer when the scene is extremely simple: the last buckets take drastically more time to complete, more than rendering the whole scene without DBR. Maybe it is a network issue, but then why is it always the last few buckets?
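
To put the slowdown in numbers, here is a quick Python sketch using the clock times and percentages from this post. The calendar dates are an assumption taken from the post date (render started on 31 August 2008, checked on 1 September), and the function name is made up for illustration.

```python
from datetime import datetime


def percent_per_hour(start, end, pct_start, pct_end):
    """Average Final Gather progress in percentage points per hour."""
    hours = (end - start).total_seconds() / 3600.0
    return (pct_end - pct_start) / hours


# Dates are assumed from the post date; times and percentages are from the post.
started = datetime(2008, 8, 31, 15, 0)
evening = datetime(2008, 8, 31, 17, 0)
morning = datetime(2008, 9, 1, 10, 0)

early = percent_per_hour(started, evening, 0.0, 80.0)
late = percent_per_hour(evening, morning, 80.0, 98.9)

print(f"first two hours: {early:.1f} points/hour")
print(f"overnight:       {late:.2f} points/hour")
print(f"slowdown:        about {early / late:.0f}x")
```

That is a drop from 40 points per hour to roughly 1.1, a slowdown of about 36x.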

By the way, I get lots of DBR crashes too, either at the beginning or in the cleanup phase. At the beginning it usually happens if I have rendered a scene before or have canceled one. The only thing that helps then is to reset my computer. I repeat: a reset helps, not just a reboot. A few times I have even seen the blue screen of death. This happens only when cleanup is finished and Max seems to be refreshing the open viewports.



Replies: 0
  • nirsul
  • Posted: 01 September 2008 11:39 AM

In my case, I must say that after using MR for some years now, I recently shifted to VRay as I couldn’t stand the memory-related crashes (DBR or not).

Nir



Replies: 0

Has anyone got a step-by-step guide on how to set up DBR? I have some high-quality rendering to do, which keeps crashing when rendered on one machine, so I was hoping that DBR would be the way to spread the load over 4 PCs.

Help greatly appreciated!



Replies: 0

Hi Mark,

It’s really simple. Install 3ds Max on your machines and go back to your master machine. In the render dialog, choose Processing and scroll down. Enable “Distributed Bucket Rendering” and add the IPs of your clients. After adding them, you can select the clients by clicking on them (they turn blue), and now render.
You’ll notice the translation time is now longer and it takes some time before rendering starts. Then enjoy rendering on all machines. This is quite awesome ;).



Regards Matthias

----
C2Q 9650, 8GB Ram
3dsmax 8, 2009, 2010 and 2011

Replies: 0
  • Meelis
  • Posted: 25 March 2009 07:00 AM

MatthiasGose 25 March 2009 09:33 AM

Hi Mark,

It’s really simple. Install 3ds Max on your machines and go back to your master machine. In the render dialog, choose Processing and scroll down. Enable “Distributed Bucket Rendering” and add the IPs of your clients. After adding them, you can select the clients by clicking on them (they turn blue), and now render.
You’ll notice the translation time is now longer and it takes some time before rendering starts. Then enjoy rendering on all machines. This is quite awesome ;).

Few things to remember:

1. Do not set up your render farm over wifi. It is of course possible to render that way, but a wifi signal can be unstable, and when one of the satellites loses its connection your machine crashes. Not just your render but the whole of 3ds Max could crash, so if you haven’t saved your work it would really hurt.
2. Information about your satellites is stored in the max.rayhosts file, which is located in your Max installation folder under “mentalray”.
3. DBR can be tricky. It crashes in every imaginable situation. If you have lots of maps, shaders etc. in your scene, you have to make them available to your satellites.
4. It is useful to open the mental ray messages window to check the progress of your rendering (not just errors). When nothing seems to have been happening for a long time, you may find the reason there.
5. Remember that more satellites of course means quicker rendering, but also more chances of a crash. A problem with one of the satellites usually means blowing up the whole thing.
6. All of the PCs should have approximately the same hardware. One weak machine may hold up your rendering, as the other PCs have to wait for it.
7. Simple scenes or scenes with low quality settings may render quicker without DBR.
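
Points 5 to 7 can be illustrated with a toy model. This is only a sketch under assumptions that are not from this thread: each bucket is handed to whichever core becomes free first, a bucket's cost is a made-up number of work units, and a core's speed is work units per second. It does not model mental ray's actual scheduler, network transfer or translation time.

```python
import heapq


def render_time(bucket_costs, core_speeds):
    """Hand buckets, in order, to whichever core is free first.

    Returns the time at which the last bucket finishes.
    """
    # Min-heap of (time the core becomes free, core index)
    free_at = [(0.0, i) for i in range(len(core_speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for cost in bucket_costs:
        t, i = heapq.heappop(free_at)      # earliest available core
        done = t + cost / core_speeds[i]   # when this bucket completes on it
        finish = max(finish, done)
        heapq.heappush(free_at, (done, i))
    return finish


# Made-up workloads: 48 equal buckets, and 47 equal buckets plus one hard one.
even = [1.0] * 48
one_hard = [1.0] * 47 + [20.0]

print(render_time(even, [1.0] * 4))       # one quad-core machine
print(render_time(even, [1.0] * 12))      # three quad-core machines
print(render_time(one_hard, [1.0] * 4))   # one hard bucket, one machine
print(render_time(one_hard, [1.0] * 12))  # one hard bucket, three machines
```

In this model, tripling the cores cuts the even workload to a third of the time, but with a single expensive bucket the total is dominated by that bucket and the extra machines help much less. Lowering some entries in `core_speeds` shows the effect of a weak node in the same way.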

But all in all, DBR is still very useful. Depending on the complexity of your scene, 4 PCs may shorten your rendering time not just 4 times but sometimes even more.



Replies: 0