Saturday, September 4, 2010

Why H.264 is disqualified from being a web standard

In short: H.264 can never become a standard for web video as long as the patents are not released according to W3C patent policies.

The MPEG-LA consortium has so far shown no interest whatsoever in releasing its patents in a W3C-compatible way. Thus the question is answered: H.264 is not even a candidate for becoming a web standard. It can't win the race, since it's not even in the running!

The W3C patent policy

The goal of this policy is to assure that Recommendations produced under this policy can be implemented on a Royalty-Free (RF) basis.

You see, it's not a case of Mozilla and Opera being obnoxious. In fact, they are only fighting for the same thing the W3C is: an open web. Video should be open, just like HTML, CSS and the DOM are open.

Yes, the W3C mandates that all standardized web technologies be free for everyone and for all types of usage, and that, insofar as those technologies are affected by patents, the owners of those patents legally commit to not blocking such usage.

But H.264 is implemented natively in the browser

Yes, it is. In fact, there is no law against implementing something that is not a standard or being considered for standardization. Google bundles Flash with its browser, but that does not make Flash a web standard.

H.264 is usable, and since its drivers are mature it is technically still the best option for delivery on mobile platforms (more on that later). In fact, were the MPEG-LA consortium to release its patents in a W3C-friendly way, H.264 would make an excellent web standard. I don't see that on the horizon, though.

The fact remains that H.264 is a proprietary, patented and closed technology. Some vendors have bought themselves the right to use it, and others perhaps could, but that is not the kind of freedom web standards should be made of. I find it deeply ironic that people who fight for free and open web standards for markup and stylesheets, for scripting and for graphics (PNG, SVG, etc.), and for net neutrality and universal access are so quick to sell out their ideals when it comes to video.

Since H.264 video is implemented natively in some browsers, we can do things with it that we otherwise perhaps could not. But there is still precious little we can do that could not be done in Flash. Really. At least when you look at the end result, not at how it's done.

Don't get me wrong, I like having native video in the browser, but native does not equal open.

An aside: The bigger issue

There is, of course, one ideal solution to this problem: the USA should change its patent system, which is flawed and broken beyond usefulness. Patents are granted for user-interface ideas, algorithms and all kinds of obvious things.

If I were an American I'd write my congressman and ask him or her what they are doing about this. And if I weren't content with the answer, I'd vote for somebody else, and I'd let everybody know I did. If the USA changed its laws, most of the world would follow.

However, since a change in US patent law is not going to happen soon, we are stuck in this mess for the foreseeable future. So what do we actually do about it?

Could the MPEG-LA consortium be persuaded to change its mind?

Here is an idea: let's have Hixie add H.264 to the HTML5 spec and release that spec in a way that starts the W3C patent clock. That would mean that any patent holder who feels their patent is being infringed must protest.

There could be two outcomes: the MPEG-LA could show its true colors and protest, or it might succumb to the pressure and actually change its policies. The first would perhaps silence everyone who thinks H.264 is free enough; the second would actually make H.264 free enough!

I doubt Hixie would include H.264 in the spec just to float a trial balloon like this, though. But it's a fun thought.

The real solution: Solve problems that can be solved

The one strong argument in favor of H.264 is hardware acceleration, especially on mobile platforms like phones, netbooks and pads. But bringing VP8 to a comparable state is within our grasp. The hardware acceleration problem can be solved and it is an easier problem to solve than flawed US patent laws or changing the minds of stubborn MPEG-LA patent bureaucrats.

In order to understand this we must consider two things: What exactly is hardware acceleration and what is the expected lifespan of a web standard compared to the lifespan of current chip sets?

I'll start with the former. Video codecs will probably improve over the next couple of years, regardless of them being standardized. Smart people will conjure up better ways to reduce file size while increasing quality, or at least improving one of the two without hurting the other too much.

The question thus becomes: has H.264 been implemented in the transistor layout of modern GPUs in such a way as to make any other algorithm, or any variation of the algorithm, impossible? That is, are the calculations required to encode or decode H.264 implemented in silicon in every minute detail, and will electrons flow from transistor to transistor in a sequence that exactly matches H.264 encoding or decoding?

If that's the case, we have really dug ourselves into a hole. If that's the case, we've made it impossible to improve anything at all! Since new ideas can not use the GPU, they are doomed to be bad ideas!

But since it still takes a whole lot of code to actually write an H.264 encoder or decoder, the answer is of course no. Hardware acceleration of H.264 is not a magic black box.

A GPU is just a slightly different processor, optimized for kinds of arithmetic that a normal CPU is not. There is no magic to it; it's just a layout of transistors. In the 80s and early 90s most CPUs could not do floating-point arithmetic efficiently. One had to buy a separate piece of silicon to get that (the 8087, the 287 and the 387). IBM recently introduced a CPU that has a core for decimal (sic!) arithmetic and does cryptography in hardware.

It's actually not about doing some things in hardware as opposed to others in software. Last time I looked, the CPU was a piece of hardware! It's a matter of letting the right piece of hardware perform the kinds of computation it does best: writing and compiling your programs to use the integer part of the CPU when that's appropriate, the floating-point part when that's appropriate, and the GPU when that is the most effective solution.
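To make "the kind of computation it does best" concrete, consider the block transform at the heart of both H.264 and VP8. The sketch below (Python/NumPy, illustrative only: real codecs use integer approximations of transforms like this, and the 8×8 size is just an example) shows that a 2-D DCT boils down to dense matrix multiplies, which is exactly the kind of regular, branch-free, parallel arithmetic a GPU is built for:

```python
import numpy as np

def dct_matrix(size=8):
    # Orthonormal DCT-II basis matrix. Actual codecs ship integer
    # approximations of transforms in this family.
    n = np.arange(size)
    m = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * size))
    m[0, :] /= np.sqrt(2)
    return m * np.sqrt(2.0 / size)

def dct2(block):
    # A separable 2-D transform is just two dense matrix multiplies:
    # no branches, no data dependencies between output rows.
    d = dct_matrix(block.shape[0])
    return d @ block @ d.T

def idct2(coeffs):
    # The inverse uses the transposed basis, since the matrix is orthonormal.
    d = dct_matrix(coeffs.shape[0])
    return d.T @ coeffs @ d

block = np.arange(64, dtype=float).reshape(8, 8)
assert np.allclose(idct2(dct2(block)), block)  # the transform round-trips
```

Nothing in this arithmetic cares whether the bitstream around it is H.264 or VP8; it is generic matrix math that any sufficiently programmable silicon can run.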

There is no technical barrier preventing VP8 or Ogg Theora, or indeed any other software, from using the GPU. In fact, Microsoft is using the GPU to speed up core JavaScript arithmetic in Chakra. That's just one example of modern programs using the power of the GPU for calculations that are not graphics-related at all. So if that's possible, why would it be impossible to move arithmetic to the GPU for video that is not H.264-encoded?
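As one illustration of how much of video decoding is plain data-parallel arithmetic, here is the YUV-to-RGB color conversion every decoder performs as its final step, sketched in Python/NumPy (the BT.601 full-range coefficients are standard; the code itself is only a sketch, not taken from any actual decoder). Every pixel is computed independently of every other, so this maps directly onto a GPU shader or OpenCL kernel:

```python
import numpy as np

def yuv_to_rgb(y, u, v):
    # BT.601 full-range YUV -> RGB: a handful of multiply-adds per pixel,
    # with no dependency between pixels -- an embarrassingly parallel map.
    y = y.astype(float)
    u = u.astype(float) - 128.0
    v = v.astype(float) - 128.0
    r = y + 1.402 * v
    g = y - 0.344136 * u - 0.714136 * v
    b = y + 1.772 * u
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255).astype(np.uint8)

# Mid-grey luma with neutral chroma decodes to mid-grey RGB.
y = np.full((2, 2), 128, dtype=np.uint8)
u = v = np.full((2, 2), 128, dtype=np.uint8)
print(yuv_to_rgb(y, u, v)[0, 0])  # [128 128 128]
```

This particular step is codec-agnostic: H.264, VP8 and Theora all produce YUV frames, so accelerating it helps any of them equally.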

Mozilla has gotten CPU usage when decoding Ogg Theora down from 100% on the Nokia N900 to just 20%. The main thing preventing that number from dropping further is that the sound is still decoded on the CPU alone. But that's an obstacle that can be overcome as well.

The lack of so-called hardware support for Ogg Theora or WebM is in fact not really a hardware problem but a software problem. The decoders (and encoders) have simply not yet been written to optimally harness the arithmetic power of the GPU. I expect this to change rapidly, though.

But maybe current hardware has been made with H.264 in mind, making it impossible for VP8 to fully catch up? Well, if the web industry shows clear support for the VP8 codec, AMD, NVIDIA and Intel will soon make the necessary alterations to their transistor layouts in the next generation of chipsets, leveling the playing field.

In a very short time we will see WebM video implementations that move enough calculation to the GPU to make it usable on portable devices, using today's silicon. But for the sake of argument, let's suppose that watching WebM video would drain the battery of your cell phone 10-20 percent faster than H.264. How bad is that? It is still within a reasonable limit, I say. And HTML5 still lets you provide H.264 as a progressive enhancement to any client. But what's being argued (at least in this article) is what we should consider the baseline: what can become a true standard for web video.
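A back-of-the-envelope calculation puts that 10-20 percent figure in perspective (every number below is a made-up assumption, chosen only to show the shape of the reasoning): decoding is only one part of a phone's power budget during playback, so a less efficient decoder hurts total battery life by less than the raw decode penalty suggests:

```python
# All figures below are hypothetical assumptions for illustration only.
decode_share = 0.5     # assumed: half of playback power goes to video decoding
decode_penalty = 0.15  # assumed: WebM decode draws 15% more power than H.264

# Only the decoding share of the power budget gets more expensive.
extra_total_power = decode_share * decode_penalty
# Battery life scales inversely with total power draw.
playback_time_ratio = 1.0 / (1.0 + extra_total_power)

print(f"total power up {extra_total_power:.1%}, "
      f"playback time down {1 - playback_time_ratio:.1%}")
# -> total power up 7.5%, playback time down 7.0%
```

Under these assumed numbers, a 15% decode penalty costs roughly 7% of playback time, which is the sort of trade-off the paragraph above calls reasonable.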

Let me say this as emphatically as I possibly can: even if H.264 could be considered somewhat better than VP8 from a technical point of view, that is still not a good enough reason to let go of our freedom. Anyone who values a slight short-term technological advantage over long-term freedom needs a reality check and an ethical wake-up call!

What about submarine patents?

Microsoft and Apple keep talking about submarine patents, claiming they are a hazard to everyone implementing Ogg Theora or WebM, and the MPEG-LA would like everyone to believe that it will soon smack down the VP8 codec used in WebM video. Since not everyone smells the FUD, let's argue about this for a while.

If VP8 is indeed trespassing on H.264 patents, does that mean that anyone implementing a VP8 encoder or decoder can be sued? Could Microsoft be sued? Could Apple?

The premise of such a thought is that the patents for H.264 not only cover algorithms, but also prohibit anyone licensing those patents from making any kind of alteration, not only to the individually patented algorithms but to the exact combination of them.

This is thus a legal variation of the hardware argument. It stipulates a lock-in to H.264 that prevents any kind of experimentation or improvement. All by itself, that would be a bulletproof case against H.264. Who would want to lock the web into such a solution?

But of course this is not the case. One may use individual algorithms from H.264 together with new or altered algorithms. Anything else would be plain stupid!

And since Apple and Microsoft are licensees of the MPEG-LA patent pool (as well as contributors to it, although Apple has not contributed nearly as much as Microsoft), they are authorized to use those patents. They have bought themselves the right to write software that uses those patents! So even if we admit, for the sake of argument, that the VP8 codec does infringe on H.264, what risk does that pose to Apple or Microsoft? None whatsoever!

If Mozilla and Opera are willing to take the risk of implementing VP8 without licensing anything from the MPEG-LA, what risk is that to Apple? In what way is that a threat to Microsoft? Having bought the right to use all MPEG-LA patents, their risk is absolutely zero.

Bottom line: MPEG-LA will not sue Apple if they implement the VP8 codec. Nor will they sue Microsoft.

(Of course, one option for Apple would be to let anyone submit any driver they'd like to iOS. If it were a truly open platform, we would see a WebM-enabled version of Mobile Safari tomorrow, without Apple lifting a finger, without Apple's programmers writing a single line of code!)

H.264 advocates cannot have their cake and eat it too

On the one hand we hear that VP8 is so similar to H.264 that it probably infringes on the patents guarding that codec. On the other hand we hear that it is so vastly different that we cannot get hardware decoding. Which is it?

If the algorithms are so similar that patent infringement is taking place, it goes without saying that GPU-accelerated decoding of VP8-encoded video cannot be hard to implement; in that case, the silicon has already been wired to do these exact calculations.

If, on the other hand, the algorithms are so different that decent GPU acceleration is impossible, what makes anyone think the MPEG-LA could sue you for using them?

I wish H.264 advocates would choose which of these two dangers we are supposed to be afraid of, because they are mutually exclusive.

Another example of mutually exclusive claims: on one side, that the MPEG-LA supposedly owns so many patents that it is virtually impossible to write a video codec that does not infringe on them; on the other, the fear that there might be some third party, participating neither in WebM or Theora video nor in the MPEG-LA, that holds patents in secret, waiting for someone to infringe them. A Paul Allen, but with an actual case. A troll with infinite patience that will strike just when WebM has taken off.

But if VP8 is so akin to H.264 that it infringes on its patents, what space would that leave for this third-party troll? Very little, I'd say.

Once again, I am not saying that one of these propositions is true. In fact I believe both to be untrue. But I wish H.264 advocates would agree on one argument, when mere logic dictates that if one is true, the other by definition is not.

What kind of power do Apple and Microsoft wield within the MPEG-LA?

Speaking of lawsuits: the MPEG-LA is a consortium, and it must act according to the will of its members. So if Microsoft and Apple really cared about open video, I have a suggestion for them. Use your muscle within the consortium you are part of, and convince your fellow members that truly open video is a good thing™. Convince them to release H.264 in a way that complies with the W3C patent policy. Show us that you are submitting such proposals to the board; show us that you are arguing the case. Only then will your opinion be worthy of consideration.

Until that happens, H.264 cannot be a web standard. Until that happens, it cannot in fact even be considered for standardization.

20 comments:

  1. Its a setup of course. Free for now, but if you all get suckered in enough, well then it costs.

    Gee what a unique business approach they have.

    Oh and if WebM gets too popular we will sue the crap out of anyone that even utters the words "Webm" because we patented the concept of thinking for oneself of new ideas.

  2. So internet users are to reject the "free stuff now" for "somewhat more free stuff next year"

    Can I just say something ?

    HAHAHHHAHHAAAAAHAHAHHAHAHAHAHAHAAAAAAHAAAAHAHHAHAHAHAHAHAHAHAHA

  3. Standards are stupid! Even with standards we still need to code stuff unique to a particular implementation.

    Your stupid for thinking that we need standards...

    poo poo to you !

  4. There is a huge confusion between "Programmable GPU" and "Hardware Video Decoding".

    For instance, in a NVidia graphics card you may find a GPU and a dedicated video hardware called "PureVideo".
    http://www.nvidia.com/page/purevideo.html

    In the Apple A4 you have a GPU (PowerVR SGX535), a CPU (Cortex-A8), and a video decoding unit (PowerVR VXD375).
    http://www.imgtec.com/powervr/powervr-vxd.asp
    (Carefully note the supported codec list on those 2 pages)

    While it's not proven that you can't do video decoding on the programmable GPU using OpenCL, or whatever shaders, there is to my knowledge no existing implementation for x264. Even less for WebM. So assuming that Programmable GPU are being widely used to do h264 video decoding is wrong.

    This confusion is more important than it seems: you can't write any piece of software to customize the Hardware Video Decoding Unit. ie, you can't add WebM support magically. It is really a magical black box for software developers.

    Lars Gunther wrote:
    > Lack of so called hardware support for ogg/theora or WebM is in fact not really a hardware problem, but a software problem.

    My conclusion is: This assertion looks suspiciously wrong.

    And... I forgot to add. There is currently no shipping Video Decoding unit that supports WebM. End of the story.

  5. While some of your points may prove to be accurate, only time will tell. But I think you're wrong on gpu acceleration being mutually exclusive with patentability. Video decoding isn't a single opaque and somewhat magic task. It's a succession of baby steps. Some of those baby steps are accelerated by some H.264 implementations. Those steps can be unique to H.264 while the steps in common with WebM can be patented. This is the nature of video decoding that makes this duality possible.
    But you're correct that in reality some parts of WebM can be accelerated. The problem is that no one currently has an accelerated implementation for x86 or arm architecture. This means WebM is late by a couple of years to be where H.264 currently is.

  6. Didn't know there was a race for a codec standard in HTML5 to start with...

  7. ZZzzzzzzzzz.. what huh? Oh, yeah. Free as in freedom! Wolverines! Stallman for president! But not really because the presidency is a proprietary function of a non-open government asserting control on our lives and stuff. Can I have some open source money now?

  8. a little update about that codec and it's licensing: http://www.mpegla.com/Lists/MPEG%20LA%20News%20List/Attachments/231/n-10-08-26.pdf

  9. According to the arugment .mp3 will never become a web standard either.... oh wait...

    W3C is clueless. 30 tech companies hold patents to .h264, they all use it because each of 30 tech companies have offered there best solutions to the same challenge. To the point where they are an official ISO standard.

    If it's W3C goal is to slow progress so Unix users can keep up, then I'm fine with going against the standard. Outside of that all the patents have been fully paid in full if you own a copy of MacOS and Windows... which make up what % of the desktop users?

    That's right that apache server in the corner with the SVGA output can't play an HD movie, what will I ever do!

  10. Evidently I've managed to draw the attention of the fanboy squad. I have decided to keep the low-quality comments, but I won't bother to answer them.

    @jon: Citations. Nothing you can't find on Google. As for hardware accelerated VP8 being worked on, I recommend looking at the bug trackers for Firefox and Chromium.

    @Pierre: I did for a while plan to go into the details about media processors versus the actual GPU. The fact remains, though, that Mozilla has been able to get Ogg Theora playing on the N900 using only 20% CPU, most of which is audio decoding, simply by altering their software to utilize other silicon.

    @Yann: I've not denied that H.264 currently has better implementations. In fact I think that is quite clear in my article. I am however saying that it is reasonable for VP8 to catch up during the next 12 months. If you want to push something to an iGadget today, use H.264 (and pay the license if you are charging for the service). But I am discussing how to keep the web open and free for tomorrow.

    @everybody: Nobody has come up with a real argument against my main thesis, that the W3C patent policy is not compatible with H.264 licensing terms. You may bikeshed all you want about particular details. It still won't affect the main argument.

  11. I don't really understand why the Html5 standards group would consider the VP8/h.264 codecs in the first place when there are codecs like Dirac (http://diracvideo.org/) which seem to have a philosphy much more in line with keeping things free and open across the board.

    http://www.bbc.co.uk/rd/projects/dirac/licensing.shtml
    http://diracvideo.org/

  12. LOL you do know that the worlds fastest VP8 codebase is the free ffmpeg FFVP8 right ?

    and you also know that google themselves tell the dev's to use that ffmpeg code-base for the most accurate and upto date API right ?

    so your also aware that this very same worlds fastest VP8 code is written By the key x264 dev's and they used much of their existing x264 to suit to get this faster speed.... or written their new code to use and speed it up here see

    http://x264dev.multimedia.cx/
    07/23/2010 (4:01 pm)
    Announcing the world’s fastest VP8 decoder: ffvp8
    Filed under: VP8,ffmpeg,google,speed ::

    "I’ve been working the past few weeks to help finish up the ffmpeg VP8 decoder, the first community implementation of On2′s VP8 video format. Now that I’ve written a thousand or two lines of assembly code and optimized a good bit of the C code, I’d like to look back at VP8 and comment on a variety of things — both good and bad — that slipped the net the first time, along with things that have changed since the time of that blog post...."

  13. whats the matter lars ,dont you like true facts on your blog so you delete them within seconds as they dont match your perceptions or bios

  14. http://x264dev.multimedia.cx/?p=486
    VP8: a retrospective
    Filed under: DCT,VP8,speed ::
    I’ve been working the past few weeks to help finish up the ffmpeg VP8 decoder, the first community implementation of On2′s VP8 video format. Now that I’ve written a thousand or two lines of assembly code and optimized a good bit of the C code, I’d like to look back at VP8 and comment on a variety of things — both good and bad — that slipped the net the first time, along with things that have changed since the time of that blog post...."

  15. Dark Shikari Says:
    July 30th, 2010 at 1:30 pm
    @IgorC

    If I write for a VP8 encoder, that encoder will be called x264, and there will be a commandline option –vp8.

  16. http://forum.doom9.org/showthread.php?p=1402771#post1402771
    "LoRd_MuldeR
    Software Developer 25th May 2010, 18:44

    Quote:
    Originally Posted by swg
    Basically OpenCL would replace the assembly part, openCL allows for massively parallel operations to happen on GPU, CPU or any supported hardware and would be optimized accordingly. end quote

    It has been explained a dozen times why the naive idea that you only need to throw your existing code on the GPU to get a massive speed-up is more than wrong

    Summery: Writing code that actually runs fast on the GPU isn't trivial at all.

    Inventing new algorithms that are suitable for GPU is even harder. Switching from the CPU to the GPU often requires finding completely new solutions for your problems - which needs a whole lot of work! And in some cases the problem is inherently sequential and thus will never run (efficiently) on the GPU.

    Last but not least, moving only small parts of a software to the GPU isn't reasonable (speed-wise), because transferring data between the host memory and the GPU memory has a huge delay...

    The fact that all the "GPU encoders" available on market only reach fast encoding speed by sacrificing quality shows that GPU's aren't that great

  17. @Rehan Khan: Dirac has been discussed on the WHATWG mailing list as well as in the HTML5 WG. The main argument against it seems to be that it's very new and has very few people backing it.

    @pip99: As I've said above, I've not deleted any comments. I would certainly not delete them because they contained facts that seem to contradict my position or opinions that I did not agree with.

    What I was referring to was the fact that some comments have diminished the reading value of this blog by being infantile and providing neither arguments nor facts. (BTW, I read all the arguments you are linking to long ago...)

  18. pip99's comment got caught in Blogspot's spam filter. I noticed that by coincidence today.

  19. Note: Here is Chromium working on GPU accelerated Video: http://code.google.com/p/chromium/issues/detail?id=53714

    Yes, it will happen!
