
Q&A: Delivering Good Spatial Video

I hear that the new Final Cut Pro 11 supports Spatial video editing! What’s that?

Spatial video is stereoscopic 3D video shot with a “normal” field of view, like most flat 2D videos, and stored in the MV-HEVC format. At its core, though, it’s still just stereoscopic 3D, so it can also be viewed on another head-mounted device like a Meta Quest, or even a 3D monitor. You can also deliver one angle of a Spatial video to a regular 2D platform, so experimentation with this newer format is relatively low-risk. Spatial is also supported in DaVinci Resolve 19.1.

The author, excited, at Apple Park, in stereoscopic 3D
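
If you ever need to confirm that a file really is Spatial, AVFoundation can tell you. Here’s a minimal sketch in Swift, assuming macOS 14 / iOS 17 or later, where the stereo multiview media characteristic is available; the file path is just a placeholder.

```swift
import AVFoundation

// Minimal sketch: check whether a movie file contains MV-HEVC stereo video.
// Assumes macOS 14 / iOS 17 or later, where AVFoundation exposes the
// "contains stereo multiview video" characteristic used by Spatial clips.
func isSpatialVideo(at url: URL) async throws -> Bool {
    let asset = AVURLAsset(url: url)
    let stereoTracks = try await asset.loadTracks(
        withMediaCharacteristic: .containsStereoMultiviewVideo
    )
    return !stereoTracks.isEmpty
}

// Usage, with a placeholder path:
// let spatial = try await isSpatialVideo(at: URL(fileURLWithPath: "/path/to/clip.mov"))
```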

Does Spatial also mean a fuzzy border?

Not always. Most people watching Spatial will be using an Apple Vision Pro, and if a Spatial video has the correct metadata, it will usually be shown with a fuzzy border. That fuzziness can help avoid Stereo Window Violations, where objects approach the edge of frame and become uncomfortable, so in general, it’s a pretty good idea. However, a fuzzy border is not used for Spatial videos shown on Vimeo, nor is it used in Disney+ when watching 3D movies.

Does Spatial mean the same as Immersive?

Nope. Spatial has a narrow field of view, while Immersive (usually) has a 180° field of view. Immersive is extremely impressive, but it requires the very best cameras, huge resolutions, demanding delivery systems, and a whole different style of filmmaking. Spatial is far easier to handle, so let’s focus there for now.

What can you shoot Spatial videos on?

The simplest workflow is to use an iPhone 15 Pro or Pro Max, or any kind of iPhone 16. In the Camera app, switch to Spatial, make sure you’re in landscape mode, and you’re away. Right now, the maximum resolution is 1080p, and the only frame rate is 30fps. One major limitation is that one of your two angles is taken from a crop of the ultra-wide lens, making that angle softer, and noisier in low light.

Although the 16 Pro and Pro Max shifted to a 48MP ultra-wide sensor, the video quality hasn’t changed. My best guess is that it’s not possible to take a direct 1:1 pixel feed from the center of this sensor, but instead, a binned 4K feed is being captured, and then cropped down. Bummer.

Can third-party apps on an iPhone do any better?

Yes, but with some restrictions. Although some third-party apps do enable 4K Spatial recording, which does look sharper, those apps either don’t enable stabilization at all (Spatialify, SpatialCamera), or the stabilization can differ between the two lenses (Spatial Camera). In a shot with significant movement, the disparity between the two angles can make it unwatchable, but to be fair, jitter can do the same.

Clockwise from top, here’s SpatialCamera, Spatialify, and Spatial Camera

It’s important to note that resolution isn’t everything, and a good camera shooting 1080p will look nicer to most viewers than oversharpened 4K with a constrained bitrate. Right now, all the current iOS options have a regular, sharp, “phone” look. If Log with Spatial recording arrives (as Spatial Camera’s developer has offered), it could make a huge difference in quality.

Are there any good lenses you can use with mirrorless cameras?

Very few. Canon have a new lens for Spatial recording, and a couple more for 180° Immersive recording. I’m planning to take a look at that system soon, and it’s the only option Apple’s actively recommending.

The Canon Spatial lens on the R7 — look for an in-depth review soon

Because the APS-C sensor Canon R7 tops out at 4K across, shared between the left and right angles, the maximum resolution is 1080p Spatial video, though you can expect it to look much better than most iPhone footage. There is more vertical headroom, so you could potentially deliver a taller, squarer image than 16:9 if you wanted to.
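
The arithmetic is simple: a 3840-pixel-wide 4K frame split between two eyes leaves 3840 ÷ 2 = 1920 pixels per eye, exactly the width of a 1920×1080 frame.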

I have a Micro-Four-Thirds body — any lenses for that?

Lumix made a 3D lens over a decade ago, but it was only ever intended for stills. If you put some tape over the pins, you can use it to record video with a modern body like a GH6 or GH7 at 5760×4320 @ 30fps or 5728×3024 @ 59.94fps. If you do go this route, realise that the minimum focus is 60cm with a fixed aperture of ƒ/12, and though this is not a modern lens, it’s not bad if you sharpen it up a little.

The ancient, hard-to-find Lumix 3D lens

Outdoors, you can actually get some decent results, with a resolution of around 2400px across per eye after processing, and you can deliver square output if you want to. Right now, the preferred native square output from FCP 11 is 2200x2220px, which actually works out OK. Here’s some demo footage from this camera. If you’re on an Apple Vision Pro with visionOS 2.2 or later, click the title to open it in 3D. If that fails, click the Vimeo logo to open in Safari, then choose OPEN in the top right:

Are there any other 3D cameras you can recommend?

Not yet. Acer showed the SpatialLabs Eyes Stereo Camera back in June, promising delivery in September, but we haven’t seen it yet. It’s 4K per eye and relatively cheap at $550, but very few people have actually used one. Sample clips look good, though, and it’s trivial to download them and use them in FCP 11 today.

Demo footage from the SpatialLabs Eyes works in FCP 11 today

Though there are a couple of other cameras available, they have significant flaws. If you’re considering combining footage from two separate cameras, know that it’s a tricky process with many potential issues, including sync difficulties, vertical disparity, and general workflow complexity.

Can you shoot Spatial videos just like flat 2D videos?

There are a few limitations. You really should stay level, and if you want to move the camera, you’ll get the best results if you move slowly, as if in a dream. A gimbal (or better, a slider) is an excellent idea, because you may not be able to rely on in-camera stabilization to work correctly, or at all.

One important factor is that you shouldn’t place objects too close to the camera, because they’re difficult for a viewer to focus on. If you’re shooting on an iPhone, this is made worse because the two lenses have different minimum focus distances. The Lumix 3D lens can’t focus closely either.

Where should I place objects in the scene?

Ideally, build a frame with your subject about 1 meter from the camera, with a decent amount of distance between them and the background. Anything deep in the distance won’t look very 3D, and it’s a good idea to place something in the foreground for contrast.

By its nature, 3D video will be closer to reality than 2D video — you’re capturing a volume, not just a flat image, and it’ll take experience to know what works well and what doesn’t on any particular camera. As the distance between a camera’s two lenses increases, the apparent depth effect increases, though if you push it too far, distant objects will look like models. Therefore, you don’t necessarily need or want a camera with a large, human-like inter-ocular distance — work with the sweet spot of your setup.
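
As a rough rule of thumb for a parallel rig, on-screen disparity scales with the lens separation divided by the subject distance: double the gap between the lenses and you roughly double the apparent depth, and a subject at 1 meter shows about twice the disparity of one at 2 meters.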

OK, I’ve got my shots. What’s the best way to review my footage in an Apple Vision Pro?

If you shot everything on your iPhone, and you have iCloud Photos turned on, they’ll sync automatically. Another workflow is to pull all the videos into a folder on your Mac, use File Sharing to share that folder to your local network, then connect to it with the Files app on your Apple Vision Pro.

What special things do I need to know when importing clips into FCP 11?

When you bring in Spatial video clips from an iPhone, Final Cut Pro will recognize them and put a small icon on their thumbnail, and in the Info panel the “Stereoscopic Conform” field will have been set to “Side by Side”.

Look to the Info Inspector to find Stereoscopic Conform, then choose Side by Side

However, if you’ve already brought Spatial clips into an FCP 10.x library, they will not be tagged as “Side by Side” and you must make sure all these clips are tagged, or the second eye will be ignored and you’ll get a 2D clip.

If you have shot clips on any other kind of camera, you just need to tag them as side by side. If your camera produces two separate video files for left and right eyes, it’s probably a good idea to pre-process those clips to a side-by-side presentation with Compressor first.
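
To make that presentation concrete, here’s a minimal Swift sketch (not the Compressor workflow itself) that packs a single left/right frame pair into one double-wide image, which is the layout FCP expects from a clip tagged Side by Side.

```swift
import CoreGraphics

// Minimal sketch: pack one left-eye and one right-eye frame into a single
// double-wide "side by side" image. This is a conceptual illustration,
// not a replacement for batch-processing whole clips in Compressor.
func packSideBySide(left: CGImage, right: CGImage) -> CGImage? {
    let width = left.width + right.width
    let height = max(left.height, right.height)
    guard let context = CGContext(
        data: nil,
        width: width,
        height: height,
        bitsPerComponent: 8,
        bytesPerRow: 0,
        space: CGColorSpaceCreateDeviceRGB(),
        bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue
    ) else { return nil }

    // Left eye fills the left half, right eye fills the right half.
    context.draw(left, in: CGRect(x: 0, y: 0, width: left.width, height: left.height))
    context.draw(right, in: CGRect(x: left.width, y: 0, width: right.width, height: right.height))
    return context.makeImage()
}
```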

How can I see those clips in 3D?

In the Viewer, under its View menu, look for Show Stereoscopic As. You can choose to view just one eye, both of them side by side, to superimpose the two angles, or to use Anaglyph mode with red/cyan glasses. One of the Anaglyph modes is usually the best way to judge the depth position of an object, as you can quickly see a positional difference between the red and cyan.

Using Anaglyph Monochrome mode, you can see cyan on the right edge of near objects and red on the right edge of far objects — Anaglyph Outline works well too
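
If you’re curious how an anaglyph is put together, the idea is simple: one eye supplies the red channel and the other supplies green and blue (cyan). Here’s a minimal Core Image sketch of that idea in Swift; it’s not FCP’s implementation, and it assumes red-over-the-left-eye glasses.

```swift
import CoreImage

// Minimal sketch of a red/cyan anaglyph: keep red from the left eye,
// green and blue from the right eye, then add the two together.
// Conceptual illustration only, not how FCP composites its viewer.
func makeAnaglyph(left: CIImage, right: CIImage) -> CIImage {
    // Keep only the red channel from the left eye (zero out green and blue).
    let redOnly = left.applyingFilter("CIColorMatrix", parameters: [
        "inputGVector": CIVector(x: 0, y: 0, z: 0, w: 0),
        "inputBVector": CIVector(x: 0, y: 0, z: 0, w: 0)
    ])
    // Keep only the green and blue (cyan) channels from the right eye.
    let cyanOnly = right.applyingFilter("CIColorMatrix", parameters: [
        "inputRVector": CIVector(x: 0, y: 0, z: 0, w: 0)
    ])
    // Sum the two; depth shows up as red/cyan fringes you can read with paper glasses.
    return redOnly.applyingFilter("CIAdditionCompositing", parameters: [
        kCIInputBackgroundImageKey: cyanOnly
    ])
}
```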

Can’t I watch my video in 3D in my Apple Vision Pro while I’m editing it?

Not yet, and there’s no built-in way to monitor in 3D while you shoot either. Obviously a live link to Apple Vision Pro would be hugely useful, so submit a feature request and it might happen sooner. For now, use red/cyan glasses while editing, and export to a file when you want to preview in your Apple Vision Pro.

Can I control where things sit in 3D?

Yes, using Convergence, in the new Stereoscopic section found in the Video Inspector. This control offsets the two eyes from one another to control their apparent distance from the viewer. A member of the FCP team recommends that, since the iPhone’s two lenses are parallel and do not converge, you should start by setting convergence to approximately 1.5. That sits the video back behind the screen a little (making it more comfortable to watch), while a negative value would place it in front.

As shot, Convergence 0, Anaglyph mode

Ideally, your subject in one shot should have a similar apparent convergence to the subject of the next shot, or you’ll force your viewers to refocus. Note that you can drag on the numbers next to the Convergence sliders to move them a lot further (±10) than the sliders themselves (±3).

Convergence set to 1.5, Anaglyph mode
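
Under the hood, convergence is just a pair of opposing horizontal shifts applied to the two eyes. Here’s a tiny conceptual sketch in Swift; the pixel units and the sign convention are my assumptions, not FCP’s actual numbers.

```swift
import CoreGraphics

// Conceptual sketch only: convergence shifts the two eyes horizontally in
// opposite directions. Here a positive value seats the scene behind the
// screen plane, a negative value pulls it toward the viewer. The units
// (pixels) and sign convention are assumptions; FCP's slider uses its own scale.
func applyConvergence(left: CGPoint, right: CGPoint,
                      convergencePixels: CGFloat) -> (left: CGPoint, right: CGPoint) {
    let half = convergencePixels / 2
    // Shifting the left-eye image left and the right-eye image right increases
    // positive (uncrossed) parallax, pushing the whole scene back behind the screen.
    return (CGPoint(x: left.x - half, y: left.y),
            CGPoint(x: right.x + half, y: right.y))
}
```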

While Convergence adjusts everything in a shot at the same time, it’s possible to place individual elements in 3D space by using the new Magnetic Mask to cut objects out of their shots, and then applying different Convergence values.

Does all footage need the same treatment?

Footage from different cameras may need different convergence values to create the same apparent look, and if you’re using an odd setup, you may need some pretty extreme numbers here. To deal with clips shot on my old Lumix 3D lens, for example, I need a convergence value of nearly 20. How can I get there?

Yes, this is a ColorChecker on a wheelie bin, but more importantly, the Lumix 3D lens needs a large convergence adjustment

Because the sliders don’t go far enough, I need to apply convergence of 10, with Swap Eyes checked (because dual-image lenses record the images in “cross-eyed” format) and then create a compound clip. I can then add another 8-10 convergence to the Compound Clip (without swapping eyes) to get me close to where I need to be.

However, if you need a large baseline convergence shift, a better workflow is probably to use adjustment layers. Convergence changes on an adjustment layer affect all clips beneath it, allowing me to use two adjustment layers to set a convergence shift for the whole timeline, and then make further individual convergence adjustments on each clip.

Do I really need to worry about convergence?

Yes. Uncontrolled convergence becomes uncomfortable, and you don’t want titles, for example, to clash with other objects in the frame, which will happen if they appear to be at the same position in depth. Also, if anything could be overlaid on your video, it’s probably going to sit at a convergence of 0, so it’s going to look a bit weird if your titles appear to be in front of things that they’re clearly behind.

Finally, while you absolutely can throw things toward the viewer, it’s a party trick. Don’t do it too often, and don’t do it for long.

Can I crop, scale, and transform clips?

You can, but you’ll need to use a free effect to do so. If you activate and then use the built-in Transform controls, you’ll be adjusting the side-by-side double-wide frame. That’s helpful for some technical tasks, such as correcting a vertical disparity between the two eyes, but not helpful if you want to crop both eyes in the same way. Instead, download and install the free Alex4D Transform, which lets you transform any clip, or even rotate it in 3D. Any Motion-made effect will work, but this one’s great.

OK, I’m done with the edit. How can I export it?

Access the Share menu, then choose the new preset for Apple Vision Pro. If you want to send it to your Apple Vision Pro for preview, send it straight to iCloud and then find it in the Files app, or send it anywhere and then AirDrop it. Leave the default metadata options (45° Field of View, 19.2mm Baseline) if you’ve shot on iPhone, and be sure to use 8-bit rather than 10-bit if you’re uploading to Vimeo. (Currently, only 8-bit files are detected as Spatial.)

Vimeo? OK, but what about YouTube?

YouTube has been openly hostile to the Apple Vision Pro. Not only do they still not have a native app, but their legal threats have seen the best existing app (Juno) removed from the App Store. While YouTube does support 3D video, they don’t support native Spatial workflows yet. Instead, export your video as a regular H.264 video — this will give you a Full Side by Side video in a double-wide frame, 7680×2160 or 3840×1080.

Send this file to Handbrake, add “frame-packing=3” to the Additional Options field on the Video tab (this flags the stream as side-by-side 3D), and don’t change the video dimensions. Start the re-encode, then upload the output to YouTube. The regular 2D versions will process first, and the 3D versions will eventually become available after that. Be patient while you wait for the highest-resolution 3D to process, and check the video with your red/cyan glasses. Here’s the result:

I’ve got an Apple Vision Pro here. Do you have any sample videos for me to check out?

Yes — lots and lots of clips, mostly shot on iPhone with the third-party app Spatial Camera. Some shots are still, most are moving; some edits use transitions, some don’t; most clips are 4K while some (marked) are 1080p; some are close to the camera while others are further away.

None of this is intended as narrative, but it should be useful for anyone planning their shots or considering making a travel video in Spatial. Watch for the intentional mistakes that I’ve left in to show you what not to do! There’s a thumb at the edge of frame that’s not visible with the fuzzy border in the Photos app, but can be seen on Vimeo. Some shots are simply too hard to converge, because the stabilization on the two angles is out of sync. In other shots, the “bobbing” movement from not walking like a ninja can be somewhat unpleasant. But overall, there’s a definite sense of “presence” here that you don’t get from flat 2D video, more like a memory than a snapshot.

The best way to view these on the Apple Vision Pro is to update to the latest visionOS 2.2, which allows you to click the title of a video or click the Vimeo link to open it in 3D. Right now, Vimeo’s Apple Vision Pro app isn’t perfect, and can’t load a folder. (If you’re still on an older version of the OS, navigate to the video’s page in Safari, then choose the “OPEN” link at the top right.)

Hampstead’s autumn/fall foliage (comfortable, but the very first shot moves quickly):

The well-known Hampton Court Palace (you may know it as a shooting location for Bridgerton):

A walk in the Cotswolds, in the English countryside — with some resolution comparisons, plus a rogue thumb in one angle:

Here are a few more, but I don’t want to flood the page with embeds — please check out the folder, though you may need to view individual links in Safari and then choose OPEN to see them in 3D:
https://vimeo.com/user/1116072/folder/22963118

I want to deliver to 2D and 3D. Is that possible?

Yes, you can deliver a single eye from your 3D Spatial video as a flat 2D video. That could be a clean 4K image if you used a third-party iOS app, but it might only be a 1080p image if you’ve used something else. Another option is to dual-shoot. I shot a whole lot of video at the Final Cut Pro Creative Summit on my GH6 (shooting 2D) with my iPhone mounted on top (shooting Spatial 3D).

Here’s a Spatial multicam clip of Michael Cioni, in which the second angle is just a regular 2D flat clip — it works well

These matched shots can be combined into a multicam, edited once, and then you simply need to choose the angle you want to show (2D or 3D) in your final output. However, a major issue with this idea is eyelines, if your subjects ever talk straight to the camera. Short of a beam splitter, there’s no way for interviewees to look into two cameras at the same time.

What’s next?

There hasn’t been a truly new frontier in video for some time, and Spatial is one of the few new things you can explore safely, without compromising your mainstream video outputs. Immersive is great, but it’s a whole separate beast needing a whole new pipeline. Spatial is something you can deliver as an add-on, wowing your clients in a whole new way.

Watch for more camera reviews, editing tips and advanced workflows here over the next few months. 
