[GSoC] GPU hardware assisted H.264 decoding via OpenGL GLSL shaders - developers only

spiff Offline
Retired Developer
Posts: 12,386
Joined: Nov 2003
Post: #76
i have read the article. it's all generalities, hand waving, "experiences" and no details (guess why).
malloc Offline
Team-XBMC Developer
Posts: 1,062
Joined: May 2004
Reputation: 0
Post: #77
no results section with pretty graphs?

Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
spiff Offline
Retired Developer
Posts: 12,386
Joined: Nov 2003
Post: #78
oh sorry forgot about those; yeah, pretty graphs are included :)
CapnBry Offline
Fan
Posts: 406
Joined: Oct 2004
Reputation: 0
Location: Tampa, FL USA
Post: #79
I saw a presentation on GPU-assisted video decoding at GDC many years back where there were plenty of awesome graphs about how fast it could be. Most of the presentation was about how the guy wrote the DCT in a shader, the main problem at the time being that fragment programs were limited to 28 instructions. The rest of the presentation was theory on how GPUs could totally do it some day. Waste of an hour was what it was.

Why is it that every article about this never ends in "and here's the code to do it!"? :)
Gamester17 Offline
Team-XBMC Forum Moderator
Posts: 10,523
Joined: Sep 2003
Reputation: 10
Location: Sweden
Not sure if this will help spark any ideas, but it is interesting news nevertheless
Post: #80
Gamester17 Wrote:This other GSoC student (for the X.org project) is trying this summer to implement GPU hardware-accelerated video decoding of MPEG-2 by adding XvMC front-end support to the Gallium 3D framework. The end result, when finished, should be that any hardware-specific Gallium 3D back-end device driver that supports XvMC will be able to take advantage of this, as long as the MPEG-2 software video decoder supports XvMC. You should keep up with his blog for reference:

http://www.bitblit.org/gsoc/gallium3d_xvmc.shtml
It sounds as if Younes Manton (the GSoC student for the X.org project) is making quite good progress with his project to use OpenGL GLSL shaders to accelerate video decoding. Check out his blog post from the day before yesterday:
http://bitblitter.blogspot.com/2009/01/y...using.html
Quote:Yes I'm still decoding video using shaders

It's been a while since I've said much about my video decoding efforts, but there are two pieces of good news to share. Both are improvements to Nouveau in general, not specific to video decoding.

First, we can now load 1080p clips. Thanks to a very small addition to Gallium and a few lines of code in the Nouveau winsys, a lot of brittle code was removed from the state tracker and memory allocations for incoming data are now dynamic and only done as necessary. The basic situation is we allocate a frame-sized buffer, map it, fill it, unmap it, and use it. On the next frame we map it again, fill it again, and so on. But what if the GPU is still processing the first frame? The second time we attempt to map it the driver will have to stall and wait until the GPU is done before it can let us overwrite the contents of the buffer.

But do we have to wait? Not really, we don't need the previous contents of the buffer, we're going to overwrite the whole thing anyway, so we just need a buffer that we can map immediately. To get around this we were allocating N buffers at startup and rotating between them; filling buffer 0, then 1, and so on, which reduced the likelihood of hitting a busy buffer. The problem with that is obvious: for high res video we need a ton of extra space, most of it not being used most of the time. Now if we try to map a busy buffer, the driver will allocate a new buffer under the covers if possible and point our buffer to it, deleting the old buffer when the GPU is done with it. If the GPU is fast enough and processes buffers before you attempt to map them again, everything is good and you'll have the minimum number of buffers at any given time. If not, you'll get new buffers as necessary, in the worst case until you run out of memory, in which case you'll get stalls when mapping. The best of both worlds.
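
A toy model of the renaming scheme described in the paragraph above may make the trade-off easier to see. This is only a sketch under stated assumptions, not code from Nouveau or Gallium: the names toy_pool, toy_map, toy_submit and toy_gpu_finish and the MAX_BUFFERS limit are all hypothetical, "GPU busy" is just a flag the caller flips by hand, and reusing idle storage stands in for the driver deleting an old buffer and allocating a new one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BUFFERS 4            /* hypothetical stand-in for "out of memory" */

struct toy_pool {
    bool   gpu_busy[MAX_BUFFERS]; /* true while the simulated GPU still reads it */
    size_t allocated;             /* pieces of storage created so far */
    size_t current;               /* storage the user-visible handle points at */
    size_t stalls;                /* how often mapping had to wait for the GPU */
};

void toy_init(struct toy_pool *p)
{
    for (size_t i = 0; i < MAX_BUFFERS; i++)
        p->gpu_busy[i] = false;
    p->allocated = 1;             /* one frame-sized buffer to start with */
    p->current = 0;
    p->stalls = 0;
}

/* Hand the current storage to the simulated GPU. */
void toy_submit(struct toy_pool *p)
{
    p->gpu_busy[p->current] = true;
}

/* The simulated GPU has finished reading the given storage. */
void toy_gpu_finish(struct toy_pool *p, size_t index)
{
    assert(index < p->allocated);
    p->gpu_busy[index] = false;
}

/* Map the handle for writing and return the storage index backing it. */
size_t toy_map(struct toy_pool *p)
{
    if (!p->gpu_busy[p->current])
        return p->current;        /* GPU kept up: no new storage needed */

    /* Prefer storage the GPU has already finished with. */
    for (size_t i = 0; i < p->allocated; i++) {
        if (!p->gpu_busy[i]) {
            p->current = i;
            return i;
        }
    }

    /* Everything is busy: grow the pool if memory allows. */
    if (p->allocated < MAX_BUFFERS) {
        p->current = p->allocated++;
        return p->current;
    }

    /* Worst case: out of memory, so wait for the GPU (simulated here). */
    p->stalls++;
    p->gpu_busy[p->current] = false;
    return p->current;
}
```

In this model, mapping a busy handle first takes fresh storage and only counts a stall once all MAX_BUFFERS pieces of storage are busy at the same time. If the simulated GPU finishes early, the pool stays at one or two buffers, which is the "minimum number of buffers" behaviour the post describes.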

The second bit of good news is that we've managed to figure out how to use swizzled surfaces, which gave a very large performance boost. Up to now we've been using linear surfaces everywhere, which are not very cache or prefetch friendly. Rendering to swizzled surfaces during the motion compensation stage lets my modest AthlonXP 1.5 GHz + GeForce 6200 machine handle 720p with plenty of CPU to spare. 1080p still bogs the GPU down, but the reason for that is pretty clear: we still render to a linear back buffer and copy to a linear front buffer. We can't swizzle our back or front buffers, so the next step will be to figure out how to get tiled surfaces working, which are similar, but can be used for back and front buffers. Hopefully soon we can tile the X front buffer and DRI back buffers and get a good speed boost everywhere, but because of the way tiled surfaces seem to work (on NV40 at least) I suspect it will require a complete memory manager to do it neatly.

http://nouveau.freedesktop.org/wiki/Surface_Layouts
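
To illustrate why a swizzled layout can be friendlier to caches than a linear one, here is a minimal sketch comparing the two address calculations. It assumes a Z-order (Morton) swizzle, which is one common scheme; the exact layout used by NV40 hardware is described on the wiki page linked above and may differ. The function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Linear layout: rows are stored one after another. */
uint32_t linear_offset(uint32_t x, uint32_t y, uint32_t width)
{
    return y * width + x;
}

/* Spread the low 16 bits of v so that a zero bit sits between each of them. */
uint32_t spread_bits(uint32_t v)
{
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

/* Z-order (Morton) layout: interleave the bits of x and y,
 * with x in the even bit positions and y in the odd ones. */
uint32_t morton_offset(uint32_t x, uint32_t y)
{
    assert(x < 65536u && y < 65536u);   /* 16 bits per coordinate */
    return spread_bits(x) | (spread_bits(y) << 1);
}
```

In the linear layout the element directly below (x, y) is a full row (width elements) away. In the Z-order layout every aligned 2x2 block occupies four consecutive offsets, so texels that are neighbours on screen tend to be neighbours in memory as well.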

Beyond that there are still a few big optimizations that we can implement for video decoding (conditional tex fetching, optimized block copying, smarter vertex pos/texcoord generation, etc), but the big boost we got from swizzling gives me a lot of optimism that using shaders for at least part of the decoding process can be a big win. It probably won't beat dedicated hardware, but for formats not supported by hardware, or for decoding more than one stream at a time, we can probably do a lot of neat things in time.

I've also been looking at VDPAU, which seems like a nice API but will require a lot of work to support on cards that don't have dedicated hardware. More on that later maybe.

...and in related news it now also seems like the Gallium 3D framework will be merged into the Mesa 3D mainline code sooner rather than later:

http://www.phoronix.com/scan.php?page=ne...&px=Njk4OA
Quote:Gallium3D To Enter Mainline Mesa Code
Posted by Michael Larabel on January 12, 2009

As we shared late last week, Mesa 7.3 is getting ready for release with the first release candidate having arrived. Mesa 7.3 will feature improved GLSL 1.20 support, support for the Graphics Execution Manager, and Direct Rendering Infrastructure 2 integration. The stabilized version of Mesa 7.3 will then go to make Mesa 7.4.

Beyond Mesa 7.4 we have learned some details as to what's next: merging Gallium3D to Mesa's master branch. Gallium3D, the new graphics architecture developed by Tungsten Graphics, has been in development for quite a while but is nearing a point of stabilization. If all goes according to plan, Gallium3D will see the light of day in Mesa 7.5. Brian Paul announced on the Mesa3D development mailing list that the gallium-0.2 branch will be merged to master following the Mesa 7.4 branching.
digitalhigh Offline
Skilled Skinner
Posts: 1,468
Joined: Oct 2005
Reputation: 100
Location: Milwaukee, WI
Post: #81
Nice. A little bit of it was beyond me, but I see why you're saying he's making progress.

So a realistic solution isn't that far off, eh? Perhaps a couple of months?

I'm not just a semi-decent skinner...I also sing for a band. Check us out.

http://www.corruptable.com/main.html
Gamester17 Offline
Team-XBMC Forum Moderator
Posts: 10,523
Joined: Sep 2003
Reputation: 10
Location: Sweden
Post: #82
digitalhigh Wrote:So a realistic solution isn't that far off, eh? Perhaps a couple of months?
Please understand that Younes Manton's (the GSoC student for the X.org) project has nothing to do with XBMC, nor anything directly to do with something that will help XBMC to accelerate H264 video decoding.

Younes Manton is only working on XvMC support for Gallium 3D (XvMC only supports MPEG-2, there are only a few drivers for Gallium 3D, and none of those drivers are mature). And no, his work is not just a couple of months away from being usable by normal users anyway.

I think you misunderstood my intent with that post. I only posted it as a reference so that skilled developers such as Rudd (or someone picking up where Rudd left off) could get a few more ideas that they might be able to use to further this development of GPU-assisted H.264 decoding via OpenGL shaders.

PS! @everyone, please do not try to turn this into a discussion about VDPAU; such posts in this thread from non-developers will be deleted.

digitalhigh Offline
Skilled Skinner
Posts: 1,468
Joined: Oct 2005
Reputation: 100
Location: Milwaukee, WI
Post: #83
Gamester17 Wrote:Please understand that Younes Manton's (the GSoC student for the X.org) project has nothing to do with XBMC, nor anything directly to do with something that will help XBMC to accelerate H264 video decoding.

Younes Manton is only working on XvMC support for Gallium 3D (XvMC only supports MPEG-2, there are only a few drivers for Gallium 3D, and none of those drivers are mature). And no, his work is not just a couple of months away from being usable by normal users anyway.

I think you misunderstood my intent with that post. I only posted it as a reference so that skilled developers such as Rudd (or someone picking up where Rudd left off) could get a few more ideas that they might be able to use to further this development of GPU-assisted H.264 decoding via OpenGL shaders.

PS! @everyone, please do not try to turn this into a discussion about VDPAU; such posts in this thread from non-developers will be deleted.

I understand. It's been a while since I've browsed this thread...I forgot who was originally working on this feature for XBMC, and that the blog post you gave was a progress report on the original development.

Duh...

kasbah Offline
Junior Member
Posts: 10
Joined: Jun 2009
Reputation: 0
Post: #84
@ Rudd
I am interested in the work you did for this as I am planning a University project to do H264 decoding using shaders. Anything you did at all would be useful. Please get in touch.
(This post was last modified: 2009-06-30 10:58 by kasbah.)
Gamester17 Offline
Team-XBMC Forum Moderator
Posts: 10,523
Joined: Sep 2003
Reputation: 10
Location: Sweden
Post: #85
kasbah Wrote:@ Rudd
I am interested in the work you did for this as I am planning a University project to do H264 decoding using shaders. Anything you did at all would be useful. Please get in touch.
Hi kasbah, Robert (Rudd) has unfortunately gone M.I.A. and has not been in touch with us at XBMC in many months.

You could try to contact him yourself if you like via the mail address... EDIT:(althekiller) Removed, I'll PM it.

Note! The source code of the work that was in progress is still available in the XBMC SVN:
Code:
# svn checkout http://xbmc.svn.sourceforge.net/svnroot/xbmc/branches/gsoc-2008-rudd

PS! The VA API (Video Acceleration API) is now the most interesting option for abstracting this, now that FFmpeg supports it:
http://en.wikipedia.org/wiki/VAAPI

;)

(This post was last modified: 2009-06-24 18:13 by althekiller.)
bb10 Offline
Member
Posts: 88
Joined: Feb 2008
Reputation: 0
Post: #86
Gamester17 Wrote:PS! The VA API (Video Acceleration API) is now the most interesting option for abstracting this, now that FFmpeg supports it:
http://en.wikipedia.org/wiki/VAAPI

;)
Unfortunately, that would limit it to Linux only.
kasbah Offline
Junior Member
Posts: 10
Joined: Jun 2009
Reputation: 0
Post: #87
Thank you for your help! I will try and contact him or at least try to see if there is something interesting in his code.
kasbah Offline
Junior Member
Posts: 10
Joined: Jun 2009
Reputation: 0
Post: #88
anyone ever manage to compile rudd's modified ffmpeg? if anyone would be willing to have a look with me I would be most grateful.

Code:
svn checkout http://xbmc.svn.sourceforge.net/svnroot/xbmc/branches/gsoc-2008-rudd/sources/dvdplayer/ffmpeg

you need glut and glew development headers:

in debian:
libglew1.5-dev
freeglut3-dev or libglut3-dev

there seems to be something wrong in the way it copies the "decoded picture buffer"; it errors with:
Code:
gpu/h264gpu.c|201| error: ‘Picture’ has no member named ‘gpu_dpb’

I can find no reference to gpu_dpb anywhere. Only dpb_tex which is the picture data as a texture.
Stip Offline
Junior Member
Posts: 4
Joined: Apr 2010
Reputation: 0
Post: #89
Hello!

I'm also interested in Rudd's partial OpenGL implementation, and will likely study his code. I am in a university project as well, and will be looking into implementing h.264 (and hopefully the SVC extension of it) on GPUs using OpenGL ES. I will use ES, as I'd like to target systems on a chip, such as the one found in the BeagleBoard.

Even though an OpenGL ES implementation will likely be slightly different than an OpenGL implementation, the algorithms involved should probably be pretty similar.

Good luck, kasbah, if you intend to attempt continuing Rudd's work!
george_k Offline
Junior Member
Posts: 2
Joined: Jan 2010
Reputation: 0
Post: #90
Saw this on Technical Note TN2267 for OS X 10.6.3 and thought it might be interesting:

Quote:The Video Decode Acceleration framework is a C programming interface providing low-level access to the H.264 decoding capabilities of compatible GPUs such as the NVIDIA GeForce 9400M, GeForce 320M or GeForce GT 330M. It is intended for use by advanced developers who specifically need hardware accelerated decode of video frames.

http://developer.apple.com/mac/library/t...n2267.html
(This post was last modified: 2010-04-29 06:25 by george_k.)