

Wednesday, July 30, 2008

Filed under: Post Production

Compressor 3: A “Virtual Cluster” Makes an Impressive Speed Boost

Scott Bates | 07/30

Faster encoding for the rest of us

Many of us do our encode/transcode work in Compressor on the same single machine we edited on in FCP, or are about to author our DVD on. The point being that many of us don’t have the luxury of raw extra computing power just sitting in a stack next to our desk, waiting to become a Qmaster cluster for distributed encoding. However, it turns out there’s a great pro tip for Compressor users that basically lets you do distributed encoding to yourself.

Earlier this year at the Worldwide Developers Conference (WWDC), Apple introduced the next version of Mac OS X, due next year (10.6, aka Snow Leopard). One of the primary goals outlined for the new OS is to greatly improve the utilization of all these “multi-core” CPUs we have running in our Macs these days. The reason is that while today’s OS and applications are in fact multithreaded and do try to take advantage of all that horsepower, they’re not optimized to the degree we’d all like, and thus plenty of CPU cycles go unused.

The default installation of FCP/Compressor sets up Qmaster on your machine with only one instance of the process “CompressorTranscoderX”, the engine that drives encoding/transcoding from Compressor. While Compressor is a multithreaded application, it tries to play well with others and does the grunt work as a background process, where it’s actually rather passive about the amount of CPU it chooses to eat up. As a result, it’s only modestly effective at spreading those threads across your multiple CPUs/cores. So the fundamental tip here is to create additional instances of the “CompressorTranscoderX” process, enough to match the number of cores your machine has, and then let Qmaster divvy up the task across all of them. The result is that you force better usage of your CPUs, and in turn you’ll see a very impressive bump in speed. Your mileage will vary depending on how many CPUs/cores you have inside your box, but in my tests I was able to shave a whopping 30% to 60% off the encode time.
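
If you want a quick sanity check on how many instances to ask for, the “one instance per core” rule can be sketched in a few lines of Python. This is only an illustration and not part of Compressor or Qmaster: the function name `suggested_instances` and its `reserve` parameter are made up for this example, and `os.cpu_count()` reports logical CPUs, which may be higher than the physical core count on machines with hyper-threading.

```python
import os


def suggested_instances(logical_cores, reserve=0):
    """Suggest one transcoder instance per core.

    `reserve` is a hypothetical knob for holding some cores back;
    the result is never lower than 1.
    """
    if logical_cores is None or logical_cores < 1:
        # os.cpu_count() can return None when the count is unknown.
        return 1
    return max(1, logical_cores - reserve)


if __name__ == "__main__":
    cores = os.cpu_count()
    print("Logical cores reported:", cores)
    print("Suggested number of instances:", suggested_instances(cores))
```

On an 8-core machine this suggests 8 instances; calling `suggested_instances(8, reserve=2)` suggests 6, if you would rather hold a couple of cores back.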

Clustering amongst yourself:

• Open System Preferences, and in the bottom row you’ll find a preference panel for “Apple Qmaster”.

[Image: The Qmaster Preference Panel]

• Under the “Setup” Tab, choose ‘Services only’ for the Share this computer as: setting.

• Select the Compressor service, ‘Share’ should be checked, and since we are only clustering with ourselves, there’s no need to check ‘Managed’.

• Then the most important step: click the “Options for selected server” button. This presents a pulldown selector for the ‘Number of Instances’.

[Image: the ‘Number of Instances’ pulldown]

• From the pulldown, select the highest number available; it should match the total number of cores your machine has.

• Finally, click the “Start Sharing” button at the bottom of the preference panel.

The panel will gray out as the services start up on your machine. This may take a minute the first time you do this. Once the services are up and running, the final panel should look like this:

[Image: the Qmaster preference panel with sharing started]

Taking advantage of yourself

The next time you are in Compressor and hit the “Submit…” button to kick off an encode, there’s a little checkbox in the submission panel you need to tick to take advantage of your new processing power.

[Image: the Compressor submission panel]

In the submission panel, the ‘Cluster’ pulldown should still be set to ‘This Computer’, as we didn’t set up Qmaster with a multi-computer cluster. We’re still just encoding on “this computer” only. To the right of the pulldown is the magic checkbox. Be sure to check “Include unmanaged services on other computers”. That’s it. Simple, eh?!

You can double-check that all is running as planned by pulling up the Activity Monitor application in your Utilities folder. Sort by CPU and you should now see multiple “CompressorTranscoderX” processes at the top of the list during an encode. If you had checked here before setting up your virtual cluster, there would have been just one of these processes listed.

[Image: Activity Monitor showing multiple “CompressorTranscoderX” processes]
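
The same check can be done from a script. Here is a minimal sketch, assuming the process name “CompressorTranscoderX” mentioned in this article and the standard `ps` command that ships with Mac OS X. The helper names `count_instances` and `running_instances` are hypothetical, and the parsing assumes `ps` prints one command (possibly with its full path) per line.

```python
import subprocess

# Process name as given in this article.
PROCESS_NAME = "CompressorTranscoderX"


def count_instances(ps_output, name=PROCESS_NAME):
    """Count lines of `ps` output whose command basename equals `name`."""
    count = 0
    for line in ps_output.splitlines():
        command = line.strip()
        # Keep only the last path component, so a full path still matches.
        if command.rsplit("/", 1)[-1] == name:
            count += 1
    return count


def running_instances(name=PROCESS_NAME):
    """Run `ps` and count matching processes (assumes a Unix-like system)."""
    # -A lists every process; -o comm prints only the command column.
    result = subprocess.run(
        ["ps", "-A", "-o", "comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_instances(result.stdout, name)
```

Calling `running_instances()` during an encode should return the number of instances you selected in the Qmaster panel, or fewer if not all of them engage.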

Putting it to the test

To see how much improvement I could get by setting up a virtual cluster, I took two different machines and encoded a 10-minute uncompressed 1080p24 clip, outputting it with the default AppleTV setting (1280x720, H.264). The first machine in my test is a first-gen MacBook Pro, so it has a Core Duo (2 cores) inside. The second machine is a shiny Mac Pro with dual quad-core Xeons (8 total cores).


[Image: encode times on both machines, with and without the virtual cluster]

So the MacBook Pro finished 28 minutes faster, or roughly 30% faster than it would have otherwise. The Mac Pro cut its time by more than half: it shaved 32 minutes off its already comparatively faster time, an improvement of a whopping 62%. One interesting note on the 8-core Mac Pro, though: for this test, only 4 instances of Compressor engaged.
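
For anyone who wants to run the same comparison on their own machine, the arithmetic behind those percentages can be sketched as follows. The function names are made up for this example, and the baseline times printed here are back-calculated from the minutes saved and the rounded percentages quoted above, so treat them as rough estimates rather than measured values.

```python
def percent_saved(minutes_before, minutes_after):
    """Percentage of the original encode time that was saved."""
    return 100.0 * (minutes_before - minutes_after) / minutes_before


def implied_baseline(minutes_saved, percent):
    """Estimate the original encode time from minutes saved and percent saved."""
    return minutes_saved / (percent / 100.0)


if __name__ == "__main__":
    # (machine, minutes saved, percent saved) as quoted in the article.
    for machine, saved, pct in [("MacBook Pro", 28, 30), ("Mac Pro", 32, 62)]:
        before = implied_baseline(saved, pct)
        after = before - saved
        print(f"{machine}: roughly {before:.0f} min before, {after:.0f} min after")
```

To check your own results, time one encode without the virtual cluster and one with it, then pass both times to `percent_saved`.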

Try this at home

The setup I described here assumes you are on a single machine and not necessarily connected to a network with other machines that might also be configured similarly. If you try this in an office environment, you may find yourself inadvertently stealing your cube neighbor’s CPU cycles. For those scenarios, try setting yourself up as a “Quick Cluster” in the Qmaster preference panel, then in Compressor choose your Quick Cluster from the pulldown menu rather than “This Computer”, and do not check the box for unmanaged services on other machines. This way you force your machine to keep to itself.


I definitely enjoy seeing all 8 cores of a Mac Pro sweat on a big render. I use the Qmaster Cluster with 8 instances and submit to the cluster from Compressor. Something to keep in mind is the speed of your target drive, because what Qmaster does is not only split up your processing to eight different processes, but it splits up the files it writes to as well. That’s fine until the processing is done and you have 8 different files that make up one clip and Qmaster starts rewriting the file in one piece, thus hogging your target drive’s read and write speed. The moral of the story is write to the fastest thing you’ve got then do a file copy and you’ll probably save time.

Posted on 08/02 at 11:35 PM


Did you get this to work directly from FCP, or did you create an intermediate QT movie? I cannot get it to happen without the intermediate on either my old 2x2 G5 or MBP.

Apple forum searches turned up more questions than answers.

Posted on 08/19 at 02:36 PM


The “Options for selected server” button is grayed out on my Mac Pro. Any idea why? The rendering option will allow me to pick a number.

Posted on 08/30 at 11:24 AM


Make sure that Sharing is OFF, you can’t change options when Sharing is active.

Posted on 08/30 at 02:17 PM


It’s not that. Sharing is not on. I’m using the first FCS, not 2… but should that matter?

Posted on 08/30 at 08:48 PM


Thanks for the article. After reading it, I was quite excited.

I followed the directions for my 8 core Mac Pro. The transcode time nearly dropped to half of its usual time (55 GB DVCPro HD 1080i60 to a ProRes HQ 1080i60 transcode).

However, after the transcoding finished, all the segments had to be assembled into the new file, resulting in the overall process taking 150%+ compared with the original, out-of-the-box behavior…

Using the described technique to export via Compressor from within FCP failed almost immediately after submission in Compressor.

Mac OS X 10.5.4, octo core 2.8 GHz Mac Pro (early 2008), 10 GB RAM, SATA RAID, nVidia GeForce 8800 GT graphics, Compressor 3.0.3, FCP 6.0.4.

It _was_ nice to see all 8 cores running at 90%, or more… Go figure…

Posted on 09/20 at 10:33 AM


something to consider is not using all 8 cores on the octo. i have tested all the combinations, and for my octo(early 2008), with 16GB of ram, 6 cores is the best numbers.

with 8, you pay the price of having more files to put back together, and it actually takes longer.

with 6 i got the fastest encode times from submit to actually done. which was still about 50% faster than not using a cluster.

Posted by eric james wood  on  03/24  at  11:07 AM



Copyright © 2012, HD Expo, LLC a division of Diversified Business Communications. DBA Createasphere

All rights reserved. HD EXPO, High Def EXPO, Createasphere, E-Tech, Entertainment Technology Exposition, 3D Production Workshop, VariCamp, P2 Camp, ColorCamp 101, and Lighting, Filters & Gels for HD are all trademarks of HD Expo, LLC.
