Saturday, September 4, 2010

Why H.264 is disqualified from being a web standard

In short: H.264 can never become a standard for web video as long as the patents are not released according to W3C patent policies.

The MPEG-LA consortium has so far shown no interest whatsoever in releasing its patents in a W3C-compatible way. Thus the question is answered: H.264 is not even a candidate for becoming a web standard. It can't win that race, since it's not even in the running!

The W3C patent policy

The goal of this policy is to assure that Recommendations produced under this policy can be implemented on a Royalty-Free (RF) basis.

You see, it's not a case of Mozilla and Opera being obnoxious. In fact they are only fighting for the same thing as the W3C: an open web. Video should be open just like HTML, CSS and the DOM are open.

Yes, the W3C mandates that all standardized web technologies should be free for everyone and for all types of usage, and that, as far as they are affected by patents, the owners of those patents legally commit to not stopping such usage.

But H.264 is implemented natively in the browser

Yes, it is. In fact there is no law against implementing anything that is not a standard or being considered for standardization. Google implements Flash, within the browser, but that does not make Flash a web standard.

H.264 is usable and, since the drivers are mature, technically still the best option for delivery on mobile platforms (more on that later). In fact, if the MPEG-LA consortium would consider releasing its patents in a W3C-friendly way, it would be an excellent web standard. I don't see that on the horizon, though.

The fact remains that H.264 is a proprietary, patented and closed technology. Some vendors have bought themselves the right to use that technology and others perhaps could, but that is not the kind of freedom web standards should be made of. I find it very ironic that people fighting for free and open web standards for markup and stylesheets, for scripting and for graphics (PNG, SVG, etc) and for net neutrality and universal access are so quick to sell out their ideals when it comes to video.

Since H.264 video is implemented natively in some browsers, we can do stuff with it that we otherwise perhaps could not. But there is still precious little we can do that could not be done in Flash. Really. At least when you look at the end result, not at how it's done.

Don't get me wrong, I like having native video in the browser, but native does not equal open.

An aside: The bigger issue

There is one ideal solution to this problem, of course. The USA should change its patent system, which is flawed and broken beyond usefulness. Patents are granted for user interface ideas, algorithms and all kinds of obvious stuff.

If I were an American I'd write to my congressman and ask him or her what they are doing about this. And if I were not content with the answer I'd vote for somebody else, and I'd let everybody know I did. If the USA changed its laws, most of the world would follow.

However, since a change in US patent laws is not going to happen soon, we are stuck in this mess for the foreseeable future. So what do we really do about it?

Could the MPEG-LA consortium be persuaded to change its mind?

Here is an idea: Let's have Hixie add H.264 to the HTML5 spec and release that spec in such a way as to start the W3C patent clock. That would mean that any patent holder who feels that their patent is being infringed must protest.

There could be two outcomes. The MPEG-LA could show its true colors and protest, or it might succumb to the pressure and actually change its policies. The first alternative would perhaps silence everyone who thinks H.264 is free enough; the second alternative would really make H.264 free enough!

I doubt Hixie would include H.264 in the spec in order to float a balloon like this, though. But it's a fun thought.

The real solution: Solve problems that can be solved

The one strong argument in favor of H.264 is hardware acceleration, especially on mobile platforms like phones, netbooks and pads. But bringing VP8 to a comparable state is within our grasp. The hardware acceleration problem can be solved and it is an easier problem to solve than flawed US patent laws or changing the minds of stubborn MPEG-LA patent bureaucrats.

In order to understand this we must consider two things: What exactly is hardware acceleration and what is the expected lifespan of a web standard compared to the lifespan of current chip sets?

I'll start with the former. Video codecs will probably improve over the next couple of years, regardless of them being standardized. Smart people will conjure up better ways to reduce file size while increasing quality, or at least improving one of the two without hurting the other too much.

The question thus becomes: has H.264 been implemented in the layout of the transistors of modern GPUs in such a way as to make any other algorithm, or any variation of the algorithm, impossible? That is, are the calculations required to encode or decode H.264 implemented in silicon in every minute detail, and will electrons flow from transistor to transistor in a sequence that exactly matches H.264 encoding or decoding?

If that's the case, we have really dug ourselves into a hole. If that's the case, we've made it impossible to improve anything at all! Since new ideas cannot use the GPU, they are doomed to be bad ideas!

But since it still takes a whole lot of code to actually write an H.264 encoder or decoder, the answer is of course no. Hardware acceleration of H.264 is not a magic black box.

A GPU is just a slightly different processor, optimized for some kinds of arithmetic that a normal CPU is not. There is no magic to it. It's just a layout of transistors. In the 1980s and early 1990s most CPUs could not do floating point arithmetic efficiently. One had to buy a separate piece of silicon to get that (the 8087, the 287 and the 387). IBM recently introduced a CPU that has a core for decimal (sic!) arithmetic and does cryptography in the hardware.

It's actually not about doing some stuff in the hardware as opposed to other stuff in software. Last time I looked, the CPU was a piece of hardware! It's a matter of letting the right piece of hardware perform the kinds of computational stuff it does best. It's a matter of writing and compiling your programs to use the integer part of the CPU when that's appropriate, the floating point part when that's appropriate and the GPU when that is the most effective solution.

There is no technical barrier preventing VP8 or Ogg/Theora, or indeed any other software, from using the GPU. In fact, Microsoft is using the GPU to speed up core JavaScript arithmetic in Chakra. That's just one example of modern programs using the power of the GPU to do calculations that are not graphics related at all. So if that's possible, what says it's impossible to move arithmetic calculations to the GPU in the case of video that is not H.264 encoded?

Mozilla has gotten CPU usage when decoding Ogg/Theora on the Nokia N900 down from 100% to just 20%. And the main thing preventing that number from dropping further is the fact that the sound is decoded only in the CPU. But that's an obstacle that can be overcome as well.

Lack of so-called hardware support for Ogg/Theora or WebM is in fact not really a hardware problem, but a software problem. The decoders (and encoders) have not been written in such a way as to optimally harness the arithmetic power of the GPU – yet! I expect this to change rapidly, though.

But maybe current hardware has been made with H.264 in mind, making it impossible for VP8 to fully catch up? Well, if the web industry showed clear support for the VP8 codec, AMD, NVIDIA and Intel would soon make some alterations to their transistor layouts in the next generation of chip sets, making the playing field even.

In a very short time we will see WebM video implementations that move enough calculation to the GPU to make it usable in portable devices, using today's silicon. But for the sake of argument, let's suppose that watching WebM video would drain the battery of your cell phone 10-20 percent more than H.264. How bad is that? It is still within a reasonable limit, I say. And HTML5 still lets you provide H.264 as progressive enhancement to any client. But what's being argued (at least in this article) is what we should consider as a baseline, what can become a true standard for web video.
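
To make the baseline-plus-enhancement idea concrete, here is a minimal sketch of how a page could offer H.264 first and fall back to WebM. It only models the selection logic: the file names, the MIME strings, the pickSource helper and the two pretend clients are my own illustrative assumptions, not anything taken from a spec or a browser. In a real page the canPlay argument could be a wrapper around the video element's canPlayType() method, which returns the same three values.

```typescript
// One encoding of a clip. The url values are hypothetical file names.
interface VideoSource {
  url: string;
  type: string; // MIME type, optionally with a codecs parameter
}

// Same return values as HTMLMediaElement.canPlayType():
// "" means "cannot play", "maybe" and "probably" mean "worth trying".
type CanPlay = (type: string) => "" | "maybe" | "probably";

// Return the first source, in author order, that the client does not reject.
function pickSource(sources: VideoSource[], canPlay: CanPlay): VideoSource | undefined {
  for (const source of sources) {
    if (canPlay(source.type) !== "") {
      return source;
    }
  }
  return undefined;
}

// H.264 is listed first as the enhancement, WebM last as the open baseline.
const sources: VideoSource[] = [
  { url: "clip.mp4", type: 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"' },
  { url: "clip.webm", type: 'video/webm; codecs="vp8, vorbis"' },
];

// Two pretend clients (assumptions for illustration only).
const h264Only: CanPlay = (t) => (t.indexOf("video/mp4") === 0 ? "probably" : "");
const webmOnly: CanPlay = (t) => (t.indexOf("video/webm") === 0 ? "probably" : "");

const forH264Client = pickSource(sources, h264Only); // the clip.mp4 entry
const forWebmClient = pickSource(sources, webmOnly); // the clip.webm entry
```

A client that supports both formats gets whichever is listed first, so the author of the page decides which format counts as the enhancement and which as the fallback.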

Let me say this as emphatically as I possibly can. Even if H.264 could be considered somewhat better than VP8 from a technical point of view, that still is not a good enough reason to let go of our freedom. Anyone who values a slight short-term technological advantage over long-term freedom needs a reality check and an ethical wake-up call!

What about submarine patents?

Microsoft and Apple keep talking about submarine patents, claiming they are a hazard to everyone implementing Ogg/Theora or WebM, and the MPEG-LA would like everyone to believe that it will soon crack down on the VP8 codec used in WebM video. Since not everyone smells the FUD, let's argue about this for a while.

If VP8 is indeed trespassing on H.264 patents, does that mean that anyone implementing a VP8 encoder or decoder can be sued? Could Microsoft be sued? Could Apple?

The premise for such a thought is that the patents for H.264 not only stipulate algorithms but also prohibit anyone licensing those patents from making any kind of alteration, not only to individual patents but to the exact combination of those patents.

This is thus a legal variation of the hardware argument. It stipulates a lock-in mechanism to H.264 that prevents any kind of experimentation or improvement. All by itself that would be a bulletproof case against H.264. Who would like to lock the web into such a solution?

But of course this is not the case. One may use individual algorithms from H.264 together with new or altered algorithms. Anything else would be plain stupid!

And since Apple and Microsoft are licensees of the MPEG-LA patent pool (as well as contributors to it, although Apple has not really contributed as much as Microsoft has), they are authorized to use those patents. They have bought themselves the right to write software that uses those patents! So even if we admit – for the sake of argument – that the VP8 codec indeed does infringe on H.264, what risk does that pose to Apple or Microsoft? None whatsoever!

If Mozilla and Opera are willing to take the risk of implementing VP8 without licensing anything from MPEG-LA, what risk is that to Apple? In what way is that a threat to Microsoft? Since they have bought themselves the right to use all MPEG-LA patents, that risk is absolutely zero.

Bottom line: MPEG-LA will not sue Apple if they implement the VP8 codec. Nor will they sue Microsoft.

(Of course, one option for Apple would be to let anyone submit any driver they'd like to iOS. If it were a truly open platform, we would see a WebM-enabled version of Mobile Safari tomorrow, without Apple lifting a finger, without Apple programmers having to write a single line of code!)

H.264 advocates cannot have their cake and eat it too

On one hand we hear that VP8 is so similar to H.264 that it probably infringes on the patents guarding that codec. On the other hand we hear that it is so vastly different that we cannot get hardware decoding. But which one is it?

If the algorithms are so similar that there is a patent infringement going on, it goes without saying that GPU-accelerated decoding of VP8 video cannot be hard to implement. If that's the case, the silicon has been wired to do these exact calculations.

If, on the other hand, the algorithms are so different that decent GPU acceleration is impossible, what makes anyone think that the MPEG-LA could sue you for using them?

I wish H.264 advocates would choose which of these two dangers we are supposed to be afraid of, because they are mutually exclusive.

Another pair of mutually exclusive claims: on one hand, MPEG-LA supposedly owns so many patents that it is virtually impossible to write a video codec that does not infringe on them; on the other, there is the fear that some third party, participating neither in WebM or Theora video nor in the MPEG-LA, holds patents in secret, waiting for someone to implement them. A Paul Allen, but with an actual case. A troll with infinite patience that will strike just when WebM has taken off.

But if VP8 is so akin to H.264 that it infringes on its patents, what space would that leave for this third-party troll? Very little, I'd say.

Once again, I am not saying that either of these propositions is true. In fact I believe them both to be untrue. But I wish that H.264 advocates would agree on one argument, when mere logic dictates that one being true by definition means that the other one is not.

What kind of power do Apple and Microsoft wield within MPEG-LA?

Speaking of lawsuits, the MPEG-LA is a consortium and it must act according to the will of its members. So if Microsoft and Apple really care about open video, I have a suggestion for them. Use your muscle within that consortium, which you are part of, and convince your fellow members that truly open video is a good thing™. Convince them to release H.264 in a W3C patent policy compliant way. Show us that you are submitting such proposals to the board, show us that you are arguing the case. Only then will your opinion be worthy of consideration.

Until that happens, H.264 cannot be a web standard. Until that happens, it cannot in fact even be considered for standardization.