The Tech Retreat is an annual four-day conference (plus Monday bonus session) for HD / Video / cinema geeks, sponsored by the Hollywood Post Alliance. Day 4 offered a SMPTE update, lens metadata, lightfield cameras, new encoding paradigms, Ethernet AVB and video, upconverting to 4K, and the REDRAY player.
Herewith, my transcribed-as-it-happened notes and screengrabs from today's session; please excuse the typos and the abbreviated explanations. (I'll follow up with demo room stuff and the post-retreat treat later this weekend.)
SMPTE Update (including the economics of standards)
Peter Symes, SMPTE
Very thriving community; more standards work going on. User sessions December 2011, asking "what do you want out of SMPTE?" Led to 48-bit RGB, 48fps timecode. Another session will be held this year.
Technical committee on cinema sound; study of UHDTV; working on processes for software-based standards (not just written-on-paper standards, but dynamically updatable standards); study group on media production system network architecture.
New communications: outcome reports from each meeting block; summarize all in-process projects (139 at present); available to all:
SMPTE Digital Library: SMPTE Journal from 1916 onwards (free to Professional members); Conference Proceedings and Standards (discounted to members). library.smpte.org. Fully searchable and cross-referenced. We don't necessarily have all the older conference proceedings; if you have a copy of an older doc and would consider giving it to us, let us know.
The economics of standards: Not well understood. "Standards are just for manufacturers; users can't make a difference; expensive." Not true! Standards reduce repetitive R&D, increase interoperability. For users? Wider choices of vendors and products, ability to mix 'n' match. Value for everyone: manufacturers don't need to build everything, small companies can enter a niche (more competition; products not viable for large manufacturers).
Participation: yes, it's best to be there. But you can participate remotely, too; all SMPTE standards meetings have voice / web remote access.
Why should users participate? Do you think manufacturers understand what you really need? You don't get good standards without user input. Those who don't participate get the standards they deserve!
"Why do we have to pay for the standards we helped to create?" The benefit of participation is better standards; selling access to standards is an important element of SMPTE revenue.
To participate: Professional membership, $125/year, or pay a one-off fee; $250/year extra ($200 until 1 April) to participate in person (keeps the tire-kickers and whiners out). Exemptions for volunteers. VoIP or call in for free.
New Developments in Camera & Lens Positioning Metadata Capture and Their Applications for Matchmove, CGI and Compositing
Mike Sippel, Director of Engineering, Fletcher Camera
Lens metadata: lens ID including serial #, focal length (and zoom range), focus distance, iris.
Camera positioning metadata: GPS location, pitch/roll/yaw, relative and absolute positions to other references.
Originally written by hand on camera report; not always legible, not always done, not always as detailed as we'd like. Typically only shot-based, not frame-based (doesn't track parameters as things change). Must be transcribed and managed separately from the picture data (this is a BIG problem, often isn't passed along).
Lens metadata: Cooke/i, Arri LDS (Lens Data System), Canon EF. Lens data embedded in raw files. Frame-based, dynamic data. Always on, no user action required. All-internal electronics with no external encoders needed. Negligible additional cost in money and bandwidth.
Sensors inside the lens convert iris, zoom, focus positional data to onboard logic, which translates code values to f-stop, distance, focal length data, fed to camera through lens-mount contacts.
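A rough sketch of what that translation step could look like: a raw encoder code is mapped to a physical value by interpolating over a calibration table, and the result goes into a per-frame record. Everything here is an assumption for illustration (the table values, the `interpolate` helper, the `LensFrameMetadata` fields); it is not how Cooke, Arri, or Canon actually implement it.

```python
from bisect import bisect_right
from dataclasses import dataclass

# Hypothetical calibration table: (raw encoder code, focus distance in metres).
# The values are made up; a real lens would carry its own factory data.
FOCUS_TABLE = [(0, 0.5), (1000, 1.0), (2000, 2.0), (3000, 5.0), (4000, 20.0)]


def interpolate(table, code):
    """Linearly interpolate a physical value from a sorted (code, value) table."""
    codes = [c for c, _ in table]
    if code <= codes[0]:
        return table[0][1]      # clamp below the first entry
    if code >= codes[-1]:
        return table[-1][1]     # clamp above the last entry
    i = bisect_right(codes, code)
    (c0, v0), (c1, v1) = table[i - 1], table[i]
    return v0 + (v1 - v0) * (code - c0) / (c1 - c0)


@dataclass
class LensFrameMetadata:
    """One record per frame (hypothetical layout, fields taken from the notes)."""
    frame: int
    serial: str
    focal_length_mm: float
    focus_distance_m: float
    f_stop: float


def record_for_frame(frame, serial, focal_length_mm, focus_code, f_stop):
    """Build the per-frame record from a raw focus encoder code."""
    focus_m = interpolate(FOCUS_TABLE, focus_code)
    return LensFrameMetadata(frame, serial, focal_length_mm, focus_m, f_stop)


if __name__ == "__main__":
    print(record_for_frame(1, "SN-0000", 50.0, 1500, 2.8))
```

The point of the per-frame record is that a focus pull or zoom during the shot shows up as changing values frame by frame, which a handwritten camera report can't capture.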
Why? VFX. There are typically 400-500 VFX shots even in a non-special-effects film. In the new "Planet of the Apes", almost every shot is a VFX shot. Matchmove, CGI, compositing.
Lens distortion: barrel, pincushion, or wave (mustache) distortion.
And these are just the three simplest ones! Important for VFX: distortions have to match between lens image and composited elements. "Unwarping" software used to undistort source images.
Typically a checkerboard or other distortion chart is shot to characterize a lens.
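To make the three distortion types concrete, here's a minimal sketch using the common polynomial radial model (a scale factor of 1 + k1·r² + k2·r⁴ on normalized image coordinates). This is a standard textbook model, not necessarily what any particular unwarping package uses, and the coefficient values are made up. In this convention a negative k1 gives barrel, a positive k1 gives pincushion, and a negative k1 with a positive k2 gives the mustache shape (barrel near the center, pincushion toward the edges).

```python
def distort(x, y, k1, k2=0.0):
    """Apply polynomial radial distortion to normalized image coordinates."""
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * scale, y * scale


def undistort(xd, yd, k1, k2=0.0, iterations=20):
    """Invert distort() by fixed-point iteration.

    The polynomial has no general closed-form inverse, so we start from the
    distorted point and repeatedly divide by the scale at the current guess.
    This converges for the mild distortions assumed here.
    """
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / scale, yd / scale
    return x, y


if __name__ == "__main__":
    # Made-up coefficients, purely for illustration.
    print("barrel    ", distort(1.0, 0.0, k1=-0.1))
    print("pincushion", distort(1.0, 0.0, k1=0.1))
    print("mustache  ", distort(0.5, 0.0, k1=-0.2, k2=0.15),
          distort(1.3, 0.0, k1=-0.2, k2=0.15))
    xd, yd = distort(0.5, 0.3, k1=-0.1, k2=0.02)
    print("round trip", undistort(xd, yd, k1=-0.1, k2=0.02))
```

In practice the coefficients would come from fitting the model to the shot chart; `undistort` stands in for the unwarping step, and `distort` is what you'd apply to a rendered element so it matches the lens image.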
When generating CGI, the artists need to guess at distances, angles, positions unless detailed metadata available. It can take several passes (rendered every time) to refine these guesses. With metadata, the artist will know the position and angle data; iris and focal distance lets 'em render proper focus / bokeh as well, all in the first pass.
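As an illustration of why iris and focus distance matter for rendering focus and bokeh, here's a small sketch using the standard thin-lens blur-circle formula. The function name and the example numbers are mine, not from the talk, and a real renderer would do considerably more than this.

```python
def blur_circle_mm(focal_length_mm, f_stop, focus_distance_mm, subject_distance_mm):
    """Diameter of the blur circle on the sensor for an out-of-focus point.

    Thin-lens approximation: c = A * f / (s_f - f) * |s - s_f| / s,
    where A = f / N is the aperture diameter, s_f the focus distance,
    and s the distance to the point being rendered.
    """
    aperture_mm = focal_length_mm / f_stop
    magnification = focal_length_mm / (focus_distance_mm - focal_length_mm)
    defocus = abs(subject_distance_mm - focus_distance_mm) / subject_distance_mm
    return aperture_mm * magnification * defocus


if __name__ == "__main__":
    # Example values are assumptions: 50mm lens at f/2, focused at 2 m.
    for subject_m in (1.0, 2.0, 4.0):
        c = blur_circle_mm(50.0, 2.0, 2000.0, subject_m * 1000.0)
        print(f"subject at {subject_m} m: blur circle {c:.3f} mm")
```

All three inputs (focal length, iris, focus distance) are exactly the values the lens reports per frame, so the CG element's defocus can follow a focus pull without guessing.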
Typical matchmove workflow: characterize lens distortion, correct distortion in plate, track scene and solve, composite in CGI elements, re-distort. First three steps greatly aided with lens metadata.
No Panavision on the list yet, but Panavision is working on it.
Summary: the lenses have metadata at no additional cost, the metadata travels in the raw files with no separate conveyance required, improved accuracy of CGI geometry, reduces VFX guesswork. With ACES IIF, metadata can be preserved through the entire post process.
Lens metadata does NOT replace having an on-set VFX data wrangler.
The Design of a Lightfield Camera
Siegfried Foessel, Fraunhofer Institut
Lightfield: each object emits or reflects light, reflection can be specular or diffuse. All light rays combined form a lightfield.
Conventional cameras collect light from one beam of directions. At each point of the focal plane, several light rays are integrated. The aperture determines how many rays are integrated; if it's large, there's blur from having more rays.
A pinhole camera can capture the lightfield at a specific point (the pinhole). But the hole can't be infinitely small (intensity loss); the larger the hole, the less sharp the image.
Microlens array (MLA). Complete lightfield can't be captured by a sparse representation.
Or can use a camera array.
First experiment: one camera with 2-axis positioning system, 17x17 positions.
Tests creating depth maps:
Designed microlens array. What dimensions are needed? Try a multiplicity of 5: each object is imaged 5x5 times on sensor (oversampling). Used a 5 Mpixel industrial camera with array in front of sensor.
Image captured into computer for processing, got 1-3 fps on a 24-core computer for a VGA-sized final image. If you only have a small number of images you get artifacts:
You need a fixed focus, fixed aperture on main lens, otherwise you need to recalibrate everything.
Several challenges: assembling MLA on sensor at an exactly defined separation distance; calibration; real-time computation. Needs dense sampling. High-res video a challenge. New strategy: use camera arrays instead of microlens array in the future. Motivation: each camera can capture high-res data. Challenges: need lots of cameras, cameras need to be small (mobile phone cams). Built small test camera using embedded Linux; version 1 at NAB should shoot 1 fps, version 2 will shoot 30-60 fps. Cameras capture images, send to central processing over IP, images get computed and output.
Conclusions: new creative opportunities in post. Microlens cameras are small and compact, but compromise between resolution and flexibility; hard to calibrate, low res depth maps. Multi-camera arrays more promising.
Qs: what about handling the masses of data? Looking at low complexity coding for compression, 4:1 to 8:1. Resolution of depth map? Depends on triangulation from image data (multiple views, 5x5 on our system).
Image res? About 1/3 of the sensor res (on our multiplicity-of-5 array), so if sensor res is 3K, final image map res is 1K.
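On the depth-map answer above: the textbook triangulation relation for two parallel views is depth = focal length × baseline / disparity. Here's a minimal sketch of it, with made-up numbers; it is the generic stereo formula, not Fraunhofer's actual processing. It also shows why a small baseline (as between neighboring microlens views) gives coarse depth steps.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth of a point seen in two parallel views: Z = f * B / d."""
    if disparity_px <= 0:
        return float("inf")  # no measurable shift: point is effectively at infinity
    return focal_length_px * baseline_m / disparity_px


def depth_step(depth_m, focal_length_px, baseline_m, disparity_step_px=1.0):
    """Approximate depth change for one disparity step: dZ = Z^2 / (f * B) * dd."""
    return depth_m * depth_m / (focal_length_px * baseline_m) * disparity_step_px


if __name__ == "__main__":
    # Assumed values: 1000-pixel focal length, two baselines to compare.
    for baseline_m in (0.01, 0.10):
        z = depth_from_disparity(1000.0, baseline_m, 5.0)
        dz = depth_step(2.0, 1000.0, baseline_m)
        print(f"baseline {baseline_m} m: 5 px disparity = {z:.1f} m, "
              f"depth step at 2 m = {dz:.3f} m")
```

With more views (5x5 in the system described), each pair contributes its own disparity estimate, which is where the extra robustness comes from.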
Next: Encoding; Steve Lampen on video over AVB; upconverting to 4K, and REDRAY