[GSoC] GPU hardware assisted H.264 decoding via OpenGL GLSL shaders - developers only

Rudd Offline
Junior Member
Posts: 14
Joined: Mar 2008
Reputation: 0
Post: #1
Hey all,

My name is Robert Rudd and I thought that I'd make an open post discussing my GSoC 2008 project for XBMC.

First of all, here's the wiki entry on it: GSoC - GPU Assisted Video Decoding. It's slightly out of date. I must admit I wrote it in a bit of a rush, based on a general understanding of video decoding, so it's somewhat inaccurate with regard to H.264 (which, for example, doesn't even use the IDCT). I'll try to update it in the coming days.

Right now I'm looking into an implementation using OpenGL and GLSL (OpenGL Shading Language) shaders. I'm rather optimistic about the potential here, though there are a few challenges to deal with. The latency of CPU<->GPU transfers will need to be handled. Also, if I want to do only motion compensation but not intra-prediction, it will cause some problems, as there are macroblock types in H.264 that can intra-predict off of P/B macroblocks.

Luckily, there has been some previous work done in this area. Three things worth checking out in respect to this are:



(Sadly, I don't know if everyone will be able to access these papers. I get a subscription through my school)
Real-time high definition H.264 video decode using the Xbox 360 GPU:
http://spiedl.aip.org/getabs/servlet/Get...s&gifs=yes

This paper describes the implementation Microsoft did for the Xbox 360. They made a GPU assisted decoder using HLSL.

Performance evaluation of H.264/AVC decoding and visualization using the GPU:
http://spiedl.aip.org/getabs/servlet/Get...s&gifs=yes

This one has a slightly more academic flavor, but it outlines some techniques for motion compensation on a generic GPU. The results seem slightly embellished, but there are some good takeaways.

GPU-Accelerated Dirac Video Codec:
http://www.cs.rug.nl/~wladimir/sc-cuda/

Dirac is another advanced codec that has appeared recently. It is based on the wavelet transform. The codec above offers GPU acceleration of motion compensation and the forward/inverse wavelet transform via CUDA. I believe one of their GSoC projects is to reimplement the decoder using OpenGL/GLSL.



So, that's the story so far. Up to this point I've been trying to give myself a stronger footing in OpenGL/GLSL, as well as a good understanding of the FFmpeg codec. I'd appreciate any thoughts people may have on this process, as well as any informative links.

Smile

Google Summer of Code Student Developer for XBMC
GSoC Project 2008: GPU Assisted Video Decoding

[Image: Gsoc2008logo.png]
(This post was last modified: 2009-01-21 18:42 by Gamester17.)
Gamester17 Offline
Team-XBMC Forum Moderator
Posts: 10,523
Joined: Sep 2003
Reputation: 9
Location: Sweden
Lightbulb  VA API for Gallium 3D concept
Post: #2
This other GSoC student (for the X.org project) is also trying to implement GPU hardware-accelerated video decoding this summer, but for MPEG-2 only, by adding an XvMC front-end to the Gallium 3D framework. The end result, when finished, should be that any hardware-specific Gallium 3D back-end device driver that supports XvMC will be able to take advantage of this, as long as the MPEG-2 software video decoder supports XvMC. You should keep up with his blog for reference:
http://www.bitblit.org/gsoc/gallium3d_xvmc.shtml

You could not, however, do exactly this for the H.264 video codec, as XvMC does not support it, nor is there today any hardware-specific Gallium 3D back-end device driver that offers any type of API for H.264 decoding. What you could do is something similar in concept: replace the XvMC front-end with a VA API front-end, and for the hardware-specific Gallium 3D back-end use OpenGL GLSL shaders to do parts of the H.264 decoding. If you could do that, you would have taken the first step toward platform-independent H.264 hardware-accelerated video decoding on the GPU. The second step would be to add VA API support to the software H.264 video decoder, so that it offloads the parts that could possibly be accelerated on a GPU, like motion compensation and intra-prediction; in XBMC's case that software decoder is FFmpeg.

OFF-TOPIC: this type of architecture design using the Gallium 3D framework should also leave the option open to later replace or complement those OpenGL GLSL shader back-ends with other methods of hardware acceleration (for example Gallium 3D back-ends for Broadcom or Sigma Designs hardware decoder chips, or the built-in bitstream processors in a modern GPU). Such a design would still use the same VA API front-end into the Gallium 3D driver framework, so there should not be any difference on the software decoder side.

VA API (Video Acceleration API):
http://www.freedesktop.org/wiki/Software/vaapi
http://wiki.multimedia.cx/index.php?titl...ration_API
http://en.wikipedia.org/wiki/Video_Acceleration_API

Gallium 3D:
http://www.tungstengraphics.com/wiki/ind.../Gallium3D
http://en.wikipedia.org/wiki/Gallium_3D

Yet another GSoC student is trying to achieve GPU hardware-accelerated video decoding via GLSL for OpenGL (but for the Dirac codec):
http://code.google.com/soc/2008/dirac/ap...71C6F4785F
http://www.diracvideo.org/wiki/index.php...ction=edit

Best of luck to you Robert, I wish you happy coding this summer!

Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
(This post was last modified: 2009-01-21 18:42 by Gamester17.)
malloc Offline
Team-XBMC Developer
Posts: 1,062
Joined: May 2004
Reputation: 0
Post: #3
It may be easier to use CUDA as a stepping stone. But maybe you don't have access to that kind of hardware.

dizzey Offline
Member
Posts: 73
Joined: Jul 2007
Reputation: 0
Post: #4
Are there any status updates on this? I know that when I'm working on larger things I don't like to stop and make status reports, but it would be fun to hear.
Decided which framework yet, or even managed to test some code?
bb10 Offline
Member
Posts: 87
Joined: Feb 2008
Reputation: 0
Post: #5
http://xbmc.svn.sourceforge.net/viewvc/x...sion=13976

Wink
dizzey Offline
Member
Posts: 73
Joined: Jul 2007
Reputation: 0
Post: #6
nice
Rudd Offline
Junior Member
Posts: 14
Joined: Mar 2008
Reputation: 0
Post: #7
dizzey Wrote:Are there any status updates on this? I know that when I'm working on larger things I don't like to stop and make status reports, but it would be fun to hear.
Decided which framework yet, or even managed to test some code?

Thanks for showing interest Big Grin. As the post above shows, I just made my first commit, as the midterm period for GSoC is upon us. I've decided to do it in OpenGL/GLSL simply to be the most compatible: CUDA, while extremely cool, is only available for the GeForce 8 series, and the other GPGPU options (BrookGPU, etc.) weren't very robust. The GPU decoder is set up as follows:

  1. N (I haven't decided just how many) macroblocks are entropy decoded and placed in a buffer. This is similar to the FFMPEG code in decode_slice2 at the moment.
  2. For B/P slices, a special function is called for motion compensation that:
    1. Uploads any reference frames not already present on the card (most likely I frames)
    2. Renders a quad for each macroblock/block/sub-block partition, depending on the type; the motion vector, as well as the index of the reference picture, is transferred as texture coordinates
    3. Runs a series of shader passes to accomplish motion compensation
  3. I slices are decoded on the CPU as normal.
  4. The residual data is uploaded to the GPU and decoded.
  5. The decoded residual is combined with the motion-compensated prediction to form the final picture.
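As a CPU-side reference for step 2, the fullpel case that the quad-per-partition shader passes have to reproduce can be sketched as follows. This is my own illustrative code, not XBMC or FFmpeg source; the frame layout, function name, and edge clamping are assumptions (clamping mirrors H.264's edge extension of reference pictures):

```c
#include <stdint.h>

/* Clamp an integer coordinate into [lo, hi]. */
static int clampi(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Fullpel luma motion compensation for one partition (CPU reference).
 * ref/dst: luma planes of size w x h with stride w
 * (bx, by): top-left corner of the partition in the destination
 * bw x bh:  partition size (16x16 down to 4x4 in H.264)
 * (mvx, mvy): motion vector in full-pel units */
void mc_fullpel_luma(uint8_t *dst, const uint8_t *ref,
                     int w, int h,
                     int bx, int by, int bw, int bh,
                     int mvx, int mvy)
{
    for (int y = 0; y < bh; y++) {
        for (int x = 0; x < bw; x++) {
            /* Source sample, clamped to the picture edge. */
            int sx = clampi(bx + x + mvx, 0, w - 1);
            int sy = clampi(by + y + mvy, 0, h - 1);
            dst[(by + y) * w + (bx + x)] = ref[sy * w + sx];
        }
    }
}
```

On the GPU, the inner lookup becomes a single texture fetch: the motion vector offsets the texture coordinates of the quad's vertices, and clamping comes for free from GL_CLAMP_TO_EDGE, which is part of why fullpel MC maps so naturally to a fragment shader.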



Ideally, the GPU portions (2, 4) will be done in parallel with the CPU portions, for maximum effect. As for progress, this is essentially what I've done/still need to do:


  1. DONE
    1. Upload needed reference pictures DONE
    2. Renders a quad for each block/subblock DONE
    3. Write shaders for:
      • fullpel motion compensation DONE
      • halfpel(offcenter) IN PROGRESS
      • halfpel(center) IN PROGRESS
      • quarterpel(offcenter) IN PROGRESS
      • quarterpel(center-diagonal) IN PROGRESS
      • quarterpel(center-nondiagonal) IN PROGRESS
      • chroma motion compensation NOT STARTED
      • *maybe* shaders used in the xbox GPU paper NOT STARTED?
    4. Write tests to compare GPU mocomp to FFMPEG's NOT STARTED
  2. N/A
    1. upload residual data to GPU DONE
    2. Write shaders for:
      • 4x4 AC/DC inverse transform DONE
      • 8x8 AC/DC inverse transform IN PROGRESS
      • chroma inverse transforms NOT STARTED
    3. Write tests to compare GPU residual to FFMPEG's NOT STARTED
  3. NOT STARTED


After the above, I also have some work to do just to get it compatible with all the different types of H.264 streams (dealing with things like field-based streams and the like). The shaders in the commit above are a simple implementation I did just to get started. They only work for a single reference picture, so my next step is to fix them up to work with a properly sized decoded picture buffer. Also, after finishing up these shaders, I hope to do some profiling to see if all this work is actually doing any good Big Grin

Rudd Offline
Junior Member
Posts: 14
Joined: Mar 2008
Reputation: 0
Post: #8
Oh, cool, I have a signature Big Grin

malloc Offline
Team-XBMC Developer
Posts: 1,062
Joined: May 2004
Reputation: 0
Post: #9
Rudd Wrote:
  • *maybe* shaders used in the xbox GPU paper NOT STARTED?

Unsure or just don't want to tell us?

Rudd Offline
Junior Member
Posts: 14
Joined: Mar 2008
Reputation: 0
Post: #10
malloc Wrote:Unsure or just don't want to tell us?

Just unsure if I'll get around to it. I'd like to finish the other stuff beforehand.

malloc Offline
Team-XBMC Developer
Posts: 1,062
Joined: May 2004
Reputation: 0
Post: #11
Actually I was referring to the question mark at the end (unsure if you've started?), not the *maybe*. I jest. I'm sure that was an accident and you're not just trying to keep us on the edge of our seats.

Gamester17 Offline
Team-XBMC Forum Moderator
Posts: 10,523
Joined: Sep 2003
Reputation: 9
Location: Sweden
Thumbs Up  Great to see that you are making progress!
Post: #12
What do you think will be the minimum OpenGL ARB (GLSL) version in the hardware requirement? OpenGL 1.4 + GLSL, 1.5 + GLSL, or 2.0?
http://en.wikipedia.org/wiki/GLSL
http://en.wikipedia.org/wiki/OpenGL_ARB

Rudd Wrote:I've decided to do it in OpenGL/GLSL simply to be the most compatible.
Will you be using an existing framework like VAAPI and/or Gallium, or? Confused

Rudd Wrote:
  • *maybe* shaders used in the xbox GPU
But the XDK and Xbox natively only support DirectX/Direct3D/HLSL, right? ...though I understand that GLSL shaders can be converted to HLSL shaders (and vice versa), and there are free software tools out there that help you do that conversion.

PS! Have you had time to follow the code progress of these other related GSoC projects?:
http://www.bitblit.org/gsoc/g3dvl/index.shtml
http://code.google.com/soc/2008/dirac/ap...71C6F4785F

Nerd

Rudd Offline
Junior Member
Posts: 14
Joined: Mar 2008
Reputation: 0
Post: #13
Gamester17 Wrote:What do you think will be the minimum OpenGL ARB (GLSL) version in the hardware requirement? OpenGL 1.4 + GLSL, 1.5 + GLSL, or 2.0?
http://en.wikipedia.org/wiki/GLSL
http://en.wikipedia.org/wiki/OpenGL_ARB

I believe it's going to be a strict OpenGL 2.0 requirement; however, I'm still reworking how I do some things, so it might end up as OpenGL 1.5 + GLSL. I'm still not sure exactly what OpenGL functionality I will end up using.

Gamester17 Wrote:Will you be using an existing framework like VAAPI and/or Gallium, or? Confused

I'm doing it right now directly as a modification to FFmpeg's H.264 decoder. I'll try to keep the code not too tied to FFmpeg's constructs, so as to perhaps port it to VA API or something similar in the future; however, I felt this would provide the most immediate results. Also, VA API seems like it would be a bit out of my scope, as it would entail learning and implementing VA API support in FFmpeg, as well as implementing various VA API functionality in the Gallium drivers, to actually get a working GPU-assisted decoder.



Gamester17 Wrote:But the XDK and Xbox nativly only support DirextX/Direct3D/HLSL, right?, ...though I understand that GLSL shaders can be converted to HLSL shaders (and vice versa) and there are free software tools out there that help you do that convertion.

PS! Have you had time to follow the code progress of these other related GSoC projects?:

Oh, well, unfortunately the Xbox GPU paper doesn't go as far as actually sharing their HLSL code Big Grin. It only outlines the algorithm they used for their shaders, so I'd have to implement them myself in GLSL (and there shouldn't be any reason why an algorithm used in HLSL shaders couldn't be used for GLSL shaders). It'd be more items I'd have to test/compare against FFmpeg's CPU implementation, so I'm not sure if I will get around to it.

I have been following the Gallium project, and the progress he's making has been very good. I believe he's finished the softpipe implementation and is going to be working on adding it to the hardware driver soon. I try to peek into the Dirac mailing lists occasionally, but I don't know if that project has any other sort of public presence.

Gamester17 Offline
Team-XBMC Forum Moderator
Posts: 10,523
Joined: Sep 2003
Reputation: 9
Location: Sweden
Post: #14
Rudd Wrote:I believe it's going to be a strict OpenGL 2.0 requirement; however, I'm still reworking how I do some things, so it might end up as OpenGL 1.5 + GLSL. I'm still not sure exactly what OpenGL functionality I will end up using.
The reason I ask is that the Mac Mini just features an Intel GMA950 graphics controller, which only supports OpenGL + GLSL, and the Mac Mini has recently become quite a popular platform for XBMC (both as XBMC for Linux and XBMC for Mac). I fully understand if you originally aim at OpenGL 2.0 as the scope for this project, but it would be great if, later (after and outside of the original scope), detection and software fallback could be added for any GLSL extensions not supported by the GPU hardware (thus automatically running any decoding processes not supported by the GPU on the CPU instead).
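Detection along those lines is straightforward at startup: query the extension string before compiling any shaders, and take the CPU path if a required extension is missing. A sketch, where the token-safe matcher is my own helper and only glGetString/GL_EXTENSIONS are real OpenGL (a plain strstr would falsely match "GL_ARB_shader" inside "GL_ARB_shader_objects"):

```c
#include <string.h>

/* Token-safe search of a space-separated OpenGL extension list.
 * Returns 1 only when `name` appears as a whole token. */
int has_gl_extension(const char *extlist, const char *name)
{
    size_t len = strlen(name);
    const char *p = extlist;

    while ((p = strstr(p, name)) != NULL) {
        int starts = (p == extlist) || (p[-1] == ' ');
        int ends   = (p[len] == '\0') || (p[len] == ' ');
        if (starts && ends)
            return 1;
        p += len;  /* partial match, keep scanning */
    }
    return 0;
}

/* At init time, with a current GL context, one would do roughly:
 *
 *   const char *exts = (const char *)glGetString(GL_EXTENSIONS);
 *   if (!has_gl_extension(exts, "GL_ARB_fragment_shader"))
 *       use_cpu_decoder();   // hypothetical fallback hook
 */
```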


Rudd Wrote:I'm doing it right now directly as a modification to FFmpeg's H.264 decoder. I'll try to keep the code not too tied to FFmpeg's constructs, so as to perhaps port it to VA API or something similar in the future; however, I felt this would provide the most immediate results.
Cool, that is probably the smartest way to go given the time allotted by Google Summer of Code to produce some kind of result that will be usable by the end-user.

I'm not sure, but I think another related GSoC project is the generic frame-level multithreading support effort for FFmpeg. It will not be usable by you as it is now, but when that project is complete and fully integrated into the main FFmpeg SVN (hopefully soon after this year's GSoC period is done), that H.264 decoder code could maybe be used to more effectively multithread the decoding and move some processes to the GPU that way.
http://code.google.com/soc/2008/ffmpeg/a...705A5D5DBB
http://wiki.multimedia.cx/index.php?titl...ng_support
The 'development-in-progress' files for that project will sooner or later appear in the FFmpeg Google Summer of Code repository:
svn://svn.mplayerhq.hu/soc/
http://svn.mplayerhq.hu/soc/
(This post was last modified: 2008-07-08 16:22 by Gamester17.)
WiSo Offline
Team-XBMC Developer
Posts: 2,695
Joined: Oct 2003
Reputation: 0
Location: Germany
Post: #15
Gamester17 Wrote:The reason I ask is that the Mac Mini just features an Intel GMA950 graphics controller, which only supports OpenGL + GLSL, and the Mac Mini has recently become quite a popular platform for XBMC (both as XBMC for Linux and XBMC for Mac).

and for Windows on Mac also Cool
