From SMPTE updates to the speed of future Ethernet and a really-big announcement from Belden’s Steve Lampen, plus the post-retreat treat, which – unusually – is not a Mark Schubin confection.
[Updated 13:35 PST: Final]
The HPA Tech Retreat is an annual gathering of some of the sharpest minds in post, production, broadcast, and distribution. It’s an intensive look at the challenges facing the industry today and in the near future, more relentlessly real-world than a SMPTE conference, less commercial than an NAB or an IBC. The participants are people in the trenches: DPs, editors, colorists, studio heads, post-house heads, academics, and engineers. There’s no better place to take the pulse of the industry and get a glimpse of where things are headed; the stuff under discussion here and the tech seen in the demo room tend to surface in the wider world over the course of the following year or two, at least if past years are any guide.
I’m here on a press pass, frantically trying to transcribe as much as I can. I’m posting my notes pretty much as-is, with minor editing and probably with horrendous typos; the intent is to give you the essence of the event as it happens. Please excuse the grammatical and spelling infelicities that follow.
Note: I’ll use “Q:” for a question from the audience. Sometimes the question isn’t a question, but a statement or clarification. I still label the speaker as “Q”.
(More coverage: Day 1, Day 2, Day 3.)
SMPTE Update – Barbara Lange, SMPTE & Wendy Aylsworth
The party last night was so good, why let it end?
We’re getting ready to celebrate our Centennial. Interesting to see where we’ve come from… SMPTE and HPA are merging, by this spring we’ll have consolidated formally.
A couple of events: fundraising to invest in the organization (especially Walt Disney and Panasonic and Dolby, $1M so far). A book coming out, and a historical documentary film.
At the tech Emmys, SMPTE won the Philo T. Farnsworth award, Larry Thorpe won the Charles Jenkins lifetime award.
Martinis first thing in the morning are good for the constitution!
Active sections all over the world, and HPA goes along for the ride. DC section’s “Bits by the Bay” in May; Australia will do the 2015 Persistence of Vision conference.
New standards being developed all the time, including IMF work, HDR, immersive audio, etc. Next standards meetings at Altera in San Jose in March, then in Australia in July.
Education: Tech Summit in Cinema at NAB in April; Entertainment Tech at Stanford; SMPTE Forum in Berlin, SMPTE Tech Conference in LA, IBC. Working with HPA on the HPA Awards and next year’s Tech Retreat. SMPTE Professional Development Academy online. The Journal continues; hope to return to monthly, and have HPA articles too.
New event at the fall conference: HPA Student Film Festival
Get involved: join, attend, connect, partner. Please participate!
AXF: Now a Standard – Merrill Weiss, Merrill Weiss Group
Archive eXchange Format. Layered context within an IT system, sitting between AXF-aware applications and block-level or filesystem-based storage.
AXF objects are identical on all media types, with small storage differences on certain media, like a header on tapes that isn’t needed on rotating disks. A file tree structure shows relationships between files in an AXF object. Metadata in XML. Inside an AXF object, Binary Structure Container. Containers define elements and structural metadata is carried redundantly. Lightweight filesystem within AXF; the file tree structure is carried in both header and footer for redundancy and resilience.
Spanning in AXF: Spanned sets allow references from one object to media in another. Some content is too big for a single medium, so spanning allows fragments to be concatenated across multiple media. UUIDs used as identifiers and links.
Versioning allows “collected sets”, with add/delete/replace operations. A “product object” is the result of the operations for a given version.
Unlimited file sizes, unlimited # of files, any type of file, any type of media. Enables streaming to/from cloud.
Resilient structure with redundant data and cryptographic hashes.
Abstracts storage from a given medium.
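As a rough illustration of the “redundant structural metadata plus cryptographic hashes” idea, here’s a minimal sketch of how a reader could verify an object’s payload files against a simple manifest. The XML element names and hash values below are hypothetical placeholders of my own, not the AXF schema.

```python
# Illustrative only: a toy manifest check in the spirit of AXF's per-file
# hashes and XML structural metadata. Element names and the example hashes
# are placeholders, not taken from the SMPTE AXF standard.
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path

EXAMPLE_MANIFEST = """<object id="urn:uuid:00000000-0000-0000-0000-000000000000">
  <file path="reel1/A001C002.mxf" sha256="(placeholder)"/>
  <file path="reel1/A001C002.xml" sha256="(placeholder)"/>
</object>"""

def verify(manifest_xml: str, root_dir: Path) -> bool:
    """Recompute each payload file's hash and compare it to the manifest."""
    ok = True
    for f in ET.fromstring(manifest_xml).iter("file"):
        digest = hashlib.sha256((root_dir / f.get("path")).read_bytes()).hexdigest()
        if digest != f.get("sha256"):
            print("hash mismatch:", f.get("path"))
            ok = False
    return ok
```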
Field-proven at 25 companies so far, 50 petabytes being stored.
Works so well for storage, can we use it on-set? A new version of AXF is being developed to start on-set; AXF Bundles allow for the file tree / manifest to be on one medium then point to others, such as P2 cards used for recording camera original files. Sidecar metadata files.
Paper on AXF in the SMPTE Journal Jan/Feb 2014.
Q: How do you hand off access to AXF objects? A: AXF doesn’t tell you about management of keys, etc. You can choose to encrypt AXF files and objects, and store keys in the metadata, but that’s orthogonal to the AXF storage standard. For the long term, maybe you want to store a description of the decoder so future generations can open the files… what if the English language changes? [That’s long term!]
Q: Are there presently, or will there ever be, open-source implementations of the AXF structures? A: AXF is designed to be useful on any archive system, useful on any computer. One of the exercises is to develop utilities for a variety of operating systems, able to read the directory structure. You can find the files if you can read the file structures (XML).
Enhancing the Creative Palette while Preserving the Intent from Camera to Consumer – Robin Atkins, Dolby Laboratories
HDR and wide color gamut. Many screens, one master? It’s what we want, but is it a reality? What do we pick for a master, and how do we present it on all these different screens?
The one master: easier to map down than up. Master using largest color and DR range, then compress down as needed for lesser gamuts and ranges on target displays.
How do you do that? Integrate a color management process on every device: maps from master color/DR into target display’s range. Preserve artistic intent across all these different displays. Take source metadata, convert to a working color space that supports mapping, color volume mapping (minimizing effect on skintones), convert to output color space (primaries and EOTF).
What if different monitors in the production pipeline have different capabilities, and allow each person to use the full data range? Same thing: color mapping on-board or in a feeder box (like a LUT box). Or have all your reference monitors the same.
Combining SDR with HDR: use an invertible color management process to move the SDR content to HDR.
Adapting the mapping from metadata: nice if you have it, but always have a default fallback, e.g., S-curving highlights for a 10,000 nit signal into a 4,000 nit display. Better if you have the source metadata, though. You can even do this on a scene-by-scene basis, so you don’t compress highlights so much if you know the scene doesn’t have a lot of very bright content.
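To make the metadata-driven idea concrete, here’s a toy roll-off curve of my own (not Dolby’s color volume mapping): it passes mid-tones through, eases the scene’s highlights into the display’s peak, and leaves a scene alone if its metadata says it never exceeds the display’s capability.

```python
import numpy as np

def rolloff(lum, scene_max, display_max, knee=0.8):
    """Toy highlight compression: pass through below the knee, then ease
    scene_max down to display_max with a smooth (sine) shoulder.
    Not Dolby's mapping -- just an illustration of metadata-driven roll-off."""
    lum = np.asarray(lum, dtype=float)
    if scene_max <= display_max:
        return lum                      # nothing to compress for this scene
    k = knee * display_max              # knee point in nits
    out = lum.copy()
    hi = lum > k
    t = np.clip((lum[hi] - k) / (scene_max - k), 0.0, 1.0)   # 0..1 above the knee
    out[hi] = k + (display_max - k) * np.sin(t * np.pi / 2)
    return out

# A 10,000-nit-mastered scene squeezed into a 4,000-nit display;
# a scene whose metadata says it only reaches 3,000 nits is left alone.
print(rolloff([100, 2000, 10000], scene_max=10000, display_max=4000))
print(rolloff([100, 2000, 3000],  scene_max=3000,  display_max=4000))
```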
PLUGE calibration with HDR. SMPTE 2084 is the PQ curve (Perceptual Quantization), designed for a dark viewing environment. You don’t have traditional brightness/contrast knobs, since the display maps code value directly to absolute light output:
L = EOTF(V)
but computing
L = EOTF(a*V + b)
where a acts as contrast and b as brightness, gives you those controls back to adjust for brighter viewing environments.
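For reference, a small sketch of that idea using the published ST 2084 PQ constants; applying a and b to the code value before the EOTF is exactly the adjustment described above, while the specific a/b values in the example are just illustrative.

```python
import numpy as np

# SMPTE ST 2084 (PQ) constants
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_eotf(v):
    """Map a normalized PQ code value V (0..1) to absolute luminance in nits."""
    v = np.clip(np.asarray(v, dtype=float), 0.0, 1.0)
    vp = v ** (1.0 / M2)
    return 10000.0 * (np.maximum(vp - C1, 0.0) / (C2 - C3 * vp)) ** (1.0 / M1)

def adjusted(v, a=1.0, b=0.0):
    """L = EOTF(a*V + b): 'a' acts as a contrast control and 'b' as brightness,
    recovering knob-style adjustment for brighter viewing environments."""
    return pq_eotf(a * np.asarray(v, dtype=float) + b)

v = np.array([0.0, 0.25, 0.5, 0.75])
print(adjusted(v))                 # reference dark-surround rendering
print(adjusted(v, a=0.98, b=0.02)) # illustrative: slight lift for a brighter room
```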
MPEG-H and Immersive Sound – Robert Bleidt, Fraunhofer DMT
New TV immersive audio system developed by Fraunhofer. Viewers will be able to personalize the sound; audio tailored for listening environment and hardware.
Personalization: MPEG-H includes audio objects (not just channel-based audio), so perhaps you might adjust dialog level independently of other sound elements. Tested at Wimbledon with a 24dB range of adjustment; 84% thought it was useful. Many wanted 8dB louder, many wanted 8dB quieter than the BBC default level. NASCAR demo: listen to driver’s sound, crowd noise, pit crew radio, alternate languages, etc. Real-time decoding and rendering. B’caster can control the options and volume ranges allowed.
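A minimal sketch of the personalization concept (not the MPEG-H renderer or its API): mix a dialog object over a bed, with the user’s dialog gain clamped to a broadcaster-allowed window. Assumption: the quoted 24 dB range is taken as ±12 dB around the broadcaster’s default level.

```python
import numpy as np

def personalize(dialog, bed, user_gain_db, allowed_range_db=24.0):
    """Toy object-based mix: apply the user's dialog gain, clamped to the
    broadcaster-allowed window, then sum dialog + bed. Concept only, not the
    MPEG-H renderer; the +/-12 dB clamp assumes the quoted 24 dB range is
    centered on the broadcaster's default level."""
    half = allowed_range_db / 2.0
    gain_db = float(np.clip(user_gain_db, -half, +half))
    gain = 10.0 ** (gain_db / 20.0)               # dB to linear
    return gain * np.asarray(dialog, float) + np.asarray(bed, float)

# e.g. a viewer who wants dialog 8 dB louder than the default mix
mixed = personalize(dialog=np.zeros(48000), bed=np.zeros(48000), user_gain_db=8.0)
```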
Contrasting approaches: cinema vs. living room. Cinema is focused on dynamic (movable) objects; home is more focused on levels and balances.
Best bang for buck is 7.1 audio plus 4 high speakers in corners. We support up to 22.2 sound, 128 objects, higher-order ambisonics.
Only your most enthusiastic consumers will put in 7.1 + 4 high speakers. What about mainstream? 3D soundbar (sound bars around the entire screen) helped.
Immersive audio: 7.1+4, 3D soundbar, higher-order ambisonics.
Improving the experience: multi-platform loudness control via AC-3 metadata allowed in legacy mode. Loudness measurement built into decoder. Adjustment of dynamic range to suit the environment and device capabilities.
We can correct for misplaced speakers to some degree.
How do we get this into TV without a forklift upgrade? 4-stage implementation in existing SDI plants: 1, replace AC-3 with MPEG-H; 2, add objects; 3, add immersive sound; 4, add dynamic objects.
We’ve tested in the field, built hardware for demos and tests. 2nd phase is to prep for new media delivery. Encoding immersive movies.
The standard progresses; experiments continue, visit us at NAB for a very compelling demo.
Lightfield Capture and Post Update – Siegfried Foessel, Fraunhofer IIS
Capture light rays at different positions and reconstruct the lightfield. Real scenes can be modeled like CGI; you can create virtual cameras. Many approaches exist.
We are using the camera array because it’s very easy to reconfigure.
Stop-motion production with a 4-camera array. The camera array stays unmoved; all moves / refocuses are done in post [clip shown with boom, dolly, focus pull on a Lego set]. In post you create a sort of 3D surface in space that you can then move the camera around.
Then we did live productions, starting with smartphones, then industrial cameras, maybe GoPros [clip showing industrial camera array, alongside Sony F3 (via beamsplitter), on-set, being used like a “real” camera for a theatrical production. ]
New user interface. Had a GUI with a 2D position and a focus slider, now have an Avid plugin as a special effects filter: you can keyframe 2D position, zoom, focus, depth of field along the timeline [live demo of the Avid interface].
Live-action short now in post, will show at NAB Digital Cinema Summit, but still a lot of things to do.
Automated QC for Files panel
Moderator: Mike Christmann, Flying Eye
Henry Gu, GIC
Chad Rounsavall, Nexidia
Howard Twine, VidCheck
Mike: Why QC in file-based workflow? Why automated QC? Who is already using? [maybe 30%]. Who is using more than one tool?
Why QC? IRT 2014 MXF Plugfest: 91.8% interoperability (so just over 8% went wrong), better than in previous years. So out of 100 files in playout, 8 wouldn’t play. And that’s with only OP Atom and OP1a, and only 4 codecs. Also Netflix: a file redelivery rate of 7%, vs. 1% in tape-based media. Clearly we need QC.
Automated: so much media, so many versions. Can traditional model scale up? Used to be 4 farmers on a cart, then 4 farmers driving combine harvesters (100x more efficient), next maybe a robotic tractor.
Chad: We do audio analysis. QC is not any different as an end goal. Automated QC best practices: you should confirm you’re receiving what you should be, and you protect your business model.
Henry: We make tools for packaging IMF deliveries. #1, can you play it back? Used to be easy, NTSC or PAL, now so many formats. #2, Integrity in the assets. QC is not easy; hard to hire people passionate about it!
Howard: We’re interested in the structure of the essence, video levels, file integrity, integrity of essence. Great success with DPP. We’ve developed tests for PSE, added corrections for it, allows QC engineers to do more.
Mike: Which issues best addressed with automation, which not?
Howard: You have to use common sense. You can’t take a new system out of the box and expect everything from it. Has it got 5.1, are there dropouts, etc.? OK, but no QC tool tells you the image is upside-down. Apply it intelligently at different parts of the process.
Henry: Once you QC one version, other versions should be consistent. Best part for the computer is detecting dropped frames, audio dropout, etc.
Chad: We should use as much as possible, but you’ll never automate all of it. We check closed-caption files.
Mike: Your experiences? It’s all about semantics; people use the same words for different things. The EBU periodic table, Howard can explain it.
Howard: It’s a stick in the ground, leveling nomenclature. One vendor’s “macroblock errors” may equal another’s “blockiness”. EBU periodic table of criteria goes a long way to quantify each of those areas.
Henry: With legacy workflow, may have older terms, nice to have a standard.
Chad: Standardization of language is critical. We have to be on the same page to describe things.
Mike: Biggest challenges in automating QC?
Henry: “No, I don’t like it, it’s taking my job!” Education: the computer will do the routine stuff and leave you the interesting bits to do. Human eyeballs are still needed.
Chad: Profile definition, really getting to the margin of the pass/fail rate. That’s the biggest time expenditure, defining the profiles.
Howard: Getting up and running: where do you start? If it’s Netflix or DPP, we have built-in templates, but they don’t handle everyone’s case. Don’t put a square peg in a round hole: is this a sensible place to add QC?
Chad: No one person has the skillset for all content QC. We interrogate languages; how many people can tell you that all the language tracks are present and correct?
Howard: Educational side of things; people deploy VidChecker to ensure deliverable, then move it upstream, editors can check audio levels for CALM compliance.
Henry: People get a new format, can’t handle it in QC, we have people 24/7 to help with updates.
Howard: The ease of integrating a new QC engine is crucial, ensure provenance of a file end-to-end.
Q: What’s the most effective education? Howard: Depends on audience. Ease of use of a UI is very important, being able to identify what and where a problem is, in plain language. Have a nomenclature dictionary built in, “what does a macroblock error mean?”
Q: I have 8 areas for QC. For a mike in the shot or noise in the shot, automation is out. So what’s left for automated QC is file validation. Qualifying a system end-to-end is a form of QC. A lexicon to describe these is essential; with marketing speak and buzzwords it’s not possible to make rational choices.
Q: Can you look at plugfest errors and find out what went wrong so we don’t screw up again? A: Report online, read it for what went wrong.
Q: Looking for efficiency, QC helps; looking for tools to do more with less, get more stuff done. Whatever tool is appropriate.
Q: We use a number of QC tools, many false positives, any way we can feed back the false positives to the client to warn them so we don’t keep trying to fix it? A: You should reject it. Send us the false positives so we can eliminate them from the tools.
Image Analysis for Enhanced Quality – Dirk Hildebrandt, Wavelet Beam
Looking at noise and picture quality: analyze every video to deliver best quality. Noise depends on the camera used, the compression format. We like it for “film look” and for unifying sources. We don’t like it for bandwidth, overall quality, etc. De-noising: separate noise from content, so you don’t just smooth stuff. Integrated de-noising in JPEG2000 workflow, encode and decode, using knowledge of the compression. Separate noise automatically, do it dynamically, equalize over spectrum. De-noising for uncompressed RAW is the best. Compression transforms noise, changes it, leads to de-noising artifacts and requires more cleanup.
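For a sense of what “separate noise from content” means in a wavelet domain, here’s a generic textbook-style soft-thresholding sketch with PyWavelets. This is an illustration of the general technique, not Wavelet Beam’s product algorithm.

```python
import numpy as np
import pywt  # PyWavelets

def denoise(img, wavelet="db2", level=3, k=3.0):
    """Generic wavelet de-noise of a grayscale frame (2D array): estimate the
    noise floor from the finest detail band, soft-threshold the detail
    coefficients, and reconstruct. Not Wavelet Beam's algorithm."""
    coeffs = pywt.wavedec2(np.asarray(img, dtype=float), wavelet, level=level)
    # Robust noise estimate from the finest diagonal detail band (MAD / 0.6745)
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thresh = k * sigma
    cleaned = [coeffs[0]] + [
        tuple(pywt.threshold(band, thresh, mode="soft") for band in detail)
        for detail in coeffs[1:]
    ]
    return pywt.waverec2(cleaned, wavelet)
```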
Necessary adoption in metadata workflows. IMF, Frameworks for Interoperable Media Services (FIMS).
IMF input, decode, analyze, de-noise with profiles (and rights: is de-noising allowed?), scaling, encoding. Content owner can define de-noising rights, needed since de-noising applied in the past often damaged picture, so customers are leery.
The whole processing chain can be described in a FIMS transform. Metadata information system. Additional context info based on UUID. Future-proofed GPU-based content workflow to de-noise (www.GPUalliance.com). This can be virtualized in the cloud for flexibility.
Summary: noise leads to compression artifacts. We can de-noise via FIMS transform services. Uses de-noising profiles and rights. GPU and cloud solutions.
Video Optimization – Geoff Tully, Beamr
Mobile video: “You’re gonna break the Internet with that thing!” Optimizing video: reducing the file size / bit-rate to deliver higher quality on constrained connections. Not simple; different videos compress differently. In theory, you can optimize for each video, but efficiencies vary even within scenes. The costs of optimized compression can consume any bandwidth savings. Beamr is a post-encode optimizer, re-encoding previously encoded video. Goes frame-by-frame, compressing each one as much as possible while meeting a quality metric. Doesn’t change GOP structure, chapter markers, etc.
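Conceptually, a post-encode optimizer of this kind does something like the loop below: squeeze each frame until the quality metric says stop. The encode_frame() and quality() callables are hypothetical placeholders, not Beamr’s API, and real systems must respect GOP / inter-frame prediction rather than treating frames independently.

```python
# Conceptual sketch of a post-encode optimizer: for each already-encoded frame,
# search for the smallest re-encode that still meets a quality threshold.
# encode_frame() and quality() are hypothetical placeholders, not Beamr's API.
# (A real optimizer must also honor GOP structure and prediction dependencies.)

def optimize_frame(decoded_frame, encode_frame, quality, qp_range=range(20, 46),
                   min_quality=0.98):
    best = None
    for qp in qp_range:                      # coarser QP -> smaller frame
        candidate = encode_frame(decoded_frame, qp=qp)
        if quality(decoded_frame, candidate) >= min_quality:
            best = candidate                 # still meets the quality target
        else:
            break                            # quality floor hit; keep last good one
    return best  # caller keeps the original encoded frame if best is None
```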
Beamr used in M-Go: source to encode to Beamr to packaging. Works in local servers / shared NAS as well as on AWS. Very low data rates don’t show optimization (below 1000 for SD files); high rates save up to 42% bit-rate. M-Go found they can delete some of the delivery profiles and not do as many different compressions. Better video with less buffering, faster start times. Average bit rate is a bit higher, because fewer streams are used and there’s less switching to low-quality streams caused by unoptimized bursts of data.
Proof of concept shown at CES with UHD HEVC media, more to be shown at NAB.
Q: So you expect to be able to get 30% out of an HEVC file? A: We took 18Mbit/sec and got it down quite a bit. See it at NAB.
Q: Moscow University has a new H.264 compression implementation, saves 50%. Which non-optimized compression was used? A: It was all H.264. Possibly Vanguard. Interesting to note, for lots of Blu-Ray, not a different kind of codec. What we’ve done is automate the video quality measurement so we can fully optimize each frame in the movie. No human compressionist will hand-optimize every frame.
Q: A handful of AVC and HEVC compressors out there. Integration? A: We’re talking to a number of them about integrating it into the encoding process. Conversations are going on, especially in HEVC.
Q: Does it work as well for both CBR and 2-pass VBR? A: We’re working on elementary streams, we don’t care if it was CBR or VBR coming in. It’ll be VBR going out. I can’t compare them by the numbers.
Utilizing Fingerprinting Technology for Shot Matching: Benefits & Challenges – Mike Witte, Vobile & Steve Klevatt, Warner Bros. MPI
Mike: Vobile VDNA for content identification, we derive frame-level fingerprints. Used since 2007 by studios, networks, etc. New app: ReMatch.
Steve: Used ReMatch to conform 4 seasons of a TV show (“Family Matters”). No EDLs, no timecode, all we had were the images. Lorimar didn’t cut the neg on 1st 4 seasons. Used cloud-based ReMatch to programmatically recut SD assets into new HD and 4K masters. We recut 75 shows from ’89-’98 in two months (one episode/day).
Had to be frame-accurate, report confidence tiers, do fingerprinting on-site but analysis in the cloud, quick turnaround (2-3 days per episode regardless of load – the power of cloud), small files, must send hashes over the wire instead of entire frames, handle cadence (reports in reference timespace, not source).
Start with raw footage (cam orig and reference cut), fingerprint on-site. After 2-3 days in the cloud, get a plaintext CSV file to derive EDLs, use thumbnails to visually confirm matches between source and cut. Used SD reference and film dailies. SD material converted to 23.98 via Teranex, cleaned up for a better match. At 720p vs. 720p, Vobile matched 95% or more. Get a scan EDL, a conform EDL, and a CMX EDL; scan your selects at UHD, finish at UHD, release as UHD and convert to HD.
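A crude sketch of the matching idea: compare each reference-cut frame fingerprint to its nearest neighbor among the dailies and keep only confident matches, from which an EDL could then be derived. The fingerprints here are generic stand-ins (any perceptual hash reduced to a numeric vector), not Vobile’s VDNA, and the confidence measure is illustrative.

```python
# Conceptual shot-matching sketch, not Vobile VDNA: nearest-neighbor matching
# of per-frame fingerprint vectors with a simple confidence threshold.
import numpy as np

def match_frames(ref_fps, src_fps, min_confidence=0.95):
    """ref_fps, src_fps: arrays of shape (n_frames, fp_len), values in 0..1.
    Returns (ref_index, src_index, confidence) tuples for confident matches."""
    matches = []
    for i, fp in enumerate(ref_fps):
        dists = np.abs(src_fps - fp).mean(axis=1)   # mean per-element distance
        j = int(np.argmin(dists))
        confidence = 1.0 - float(dists[j])
        if confidence >= min_confidence:
            matches.append((i, j, confidence))      # feeds the conform EDL
    return matches
```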
[Sample report shown.]
Time saving was better than expected. Cloud-based, so it was infinitely scalable. And we wound up with EDLs, CSV files, PDFs, and MP4s to go back into the archives.
Lower costs over substitute workflows, no need to cut original neg, etc.
Other potential uses for ReMatch: stock footage databases, location scouting (match a location photo to other media), building libraries of reference material.
Q: How robust is the fingerprinting if, say, you’re working from a damaged print? A: Very robust; developed in anti-piracy world, so all sorts of transforms are dealt with.
Q: What about the other 5%? A: We held our nose and uprezzed.
Tomorrow’s Ethernet: 25, 50, 100 Gbps? – Warren Belkin, Arista
Ethernet developments have an impact on media. File-based workflows are about things getting bigger; you have to move things faster. On the b’cast side, with uncompressed video we’ll soon be beyond 10Gig, whether it’s SMPTE 2022-6 or whatever. Move away from bespoke gear; take advantage of hyperscale cloud stuff built in volume and thus lower cost.
When you start getting into 4K video, you’re above 10Gig. [Survey of audience: everyone has 10Gig, 2 have 40Gig, nobody has 100Gig today].
40Gig may go big in 2016, but PCI-E NICs can’t get above 32 Gig. In server rooms, 100Gig will be big.
Today: 10Gig $1200/port, 40Gig $4660/port, 100Gig $12,000/port. Linear scaling of price with speed (improvement over previous years). 40 (and 100?) Gig use multiple lanes of 10Gig.
40 and 100 need more SerDes (serializers/deserializers), consume more power. Need to find optimal configuration.
July 2014: 25gEthernet.org, the 25 & 50 Gb Ethernet consortium forming a de-facto standard, now 802.3by within IEEE. 25G is a single-lane tech, size and power closer to 10Gig. Price should be around 1.5x 10Gig. 50 Gig is 2 lanes, 2.25x the cost of 10 Gig. 25 and 50 Gig are the best price/performance.
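Doing the arithmetic on the quoted per-port prices (with the 25G and 50G figures estimated from the stated 1.5x and 2.25x multipliers) shows why 25 and 50 Gig come out ahead on price/performance:

```python
# Cost per Gb/s from the per-port figures quoted above; 25G and 50G use the
# estimated 1.5x and 2.25x multiples of the 10Gig price.
ports = {
    "10G":  (10,  1200),
    "25G":  (25,  1200 * 1.5),    # ~ $1800 (estimated)
    "40G":  (40,  4660),
    "50G":  (50,  1200 * 2.25),   # ~ $2700 (estimated)
    "100G": (100, 12000),
}
for name, (gbps, price) in ports.items():
    print(f"{name:>4}: ${price / gbps:.0f} per Gb/s")
# 10G ~$120, 25G ~$72, 40G ~$117, 50G ~$54, 100G ~$120 -> 25/50 Gig win on $/Gb/s
```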
QSFP 28, same size as QSFP 40, will support 100Gig, 3.5 watts/port. 12 ports of 100Gig on a single rack unit today.
Would you rather see a SMPTE 2022-6 camera with 4 10Gb fibers out or one 25Gb fiber out?
Products expected in 12-18 months.
A Really-Big Announcement – Steve Lampen, Belden
We took bets on what this talk was going to be about:
1: Steve comes out of closet. 2: Steve hired by Gepco. 3: Steve retires. 3.5: Belden folds when Steve retires… [etc.] Runner-up: Belden acquires Avid. 10: Belden announces wireless 10, 25, 40 Gb called CloudAx. Which is it?
Significant improvements in cables will take 5 years or more. The start of that process is to look at testing to 12 GHz: 30 factory batches of cables, testing for attenuation, return loss, expected distance for -40dB @ 1/2 clock (Nyquist limit).
Here’s what we’ve found: existing coax cables seem to work well out to 12 GHz. There are spikes at characteristic dimensions of the shield braid; we can redesign the braid to minimize the spikes. So here’s the deal:
Use the cables you already have.
Cables from 1990 are still good. They were designed for 400 MHz SD, but they seem to work for 12 GHz 4K.
Everyone at Belden was excited to think we’re gonna sell new cables for more money. But here’s 75 meters of 1649A at 12 GHz: [chart shown]
So this is a cheap upgrade!
Q: What about connectors, use the same? A: Connectors are critical, ours work up to 6GHz, above that you’ll need new ones from Kings, Cambridge, etc. for BNCs. DINs are something else.
Post-Retreat Treat: Silver Linings: How the Great Depression Led to Cloud Computing – Rich Welsh, Sundog Tools
Let’s go back to Charles Babbage. He developed a plan for a mechanical computer. Federico Luigi Menabrea wrote this up in Italian, Lady Ada Lovelace translated it and made notes; her notes describe the first computer program, using punchcards – not invented by Jacquard, but by Bouchon and de Vaucanson. Babbage’s machine wasn’t built in his time, but it had a printer, branching based on results… and the lightbulb wasn’t even around yet.
1884: Hollerith started the Tabulating Machine Company, leasing the tabulators instead of selling them. Used for the 1896 census. Merged with Bundy and Charles Ranlett Flint, handed off to Watson, became IBM. Around then we had the Great Depression. In the Depression, nobody would lease machines. In WWII, Rejewski invented a cryptologic bombe, used at Bletchley by Turing to crack Enigma codes. Aiken was working in the US with Grace Hopper, working on COBOL.
Lovelace thought computers were good for more than just math stuff. Turing started to develop computer science theory; the Turing Machine concept, and everything we have today follows that model.
In 1952 IBM had a 90% market share. They decided to move into electronic computers. 1960s, Seymour Cray on cost/performance, John McCarthy on shared utility (and coined the term AI). Flops/$ vs $/CPU isn’t linear; pricier CPUs aren’t always proportionally performant. Parkhill saw that powerful computers get exorbitantly expensive; we should use shared resources.
What killed this was microprocessors, which completely upset the existing cost/performance curves. The microprocessor and personal computer explosion led to the decline of the mainframe.
Enid Mumford, a sociologist looking at people interacting with tech around 1980: it doesn’t matter if the system works perfectly; if people use it, things will fail (example: passwords).
So Jeff Bezos started Amazon in 1994, now the predominant cloud provider. Favaloro’s document at Compaq may have been the 1st use of “cloud” according to Wikipedia, but a chap in the SMPTE Journal used it first. In 2003, the concept of the Amazon Elastic Cloud, developed in 2005. Pratt & Crosby created the Xen hypervisor, now the most deployed in the world (17x more than all other hypervisors).
If you look at these developments, we went through mainframes, then PCs, then mobiles / smartphones. Are we moving into the age of utility computing? Gartner hype cycle: in 2006, no cloud; 2008, on the rise toward the peak of inflated expectations; 2010, on the peak; 2014, in the trough of disillusionment.
But we’re about to climb the slope of enlightenment towards the plateau of productivity.
Disclosure: I’m attending the Tech Retreat on a press pass, which gets me free entrance into the event. I’m paying for my own travel, local transport, hotel, and meals (other than those on-site as part of the event). There is no material relationship between me and the Hollywood Post Alliance or any of the companies mentioned, and no one has provided me with payments, bribes, blandishments, or other considerations for a favorable mention… aside from the press pass, that is.