GPU assisted video decoding in XBMC, like motion compensation, idct, and deblocking?
#46
i don't ever get why everyone says xbmc doesn't do hd. it may not do 1080i videos, but i have 720p movies that play just fine and look great. (a couple of imax ones and the final fantasy movie.)
Reply
#47
the current software/hardware cannot keep up with the bitrates of true hd video (720p (1280x720, 60fps) or 1080i (1920x1080, 30fps)).

it can keep up with some carefully encoded 1280x720 24/29fps video (divx as an example). these still look great, heck even 960x540 24/29fps looks good, but not close to as good as 720p or 1080i.

people would like to see the software (video decoders) improved to obtain 720p or 1080i without dropping frames.

hate to state this over and over, but i hope this clears this issue up some.
Reply
#48
apparently, real-time decoding of wmv-hd 720p on the xbox has already been done. i don't think we're talking about crappy xvid hd stuff here; i'm talking 5 mbps+ video.

check out this site:
http://research.microsoft.com/~jackysh/

of interest:

project: hdox – high definition wmv decoding over xbox

· real-time playback of wmv encoded hd video on the first generation of commercial xbox.

· precision control for motion compensation on 8-bit alu

· gpu assembly optimization

project: gaxel – gpu accelerated video encoding and decoding

· efficient motion estimation on gpu

· efficient implementation of dct/idct on gpu

the guy is published in ieee:

previously, i was with internet media group at microsoft research asia. i have been working on various video processing and delivery related projects, ranging from efficient mpeg-2 to wmv transcoding, gpu accelerated video encoding and decoding to mpeg-2 transport stream wrapper for wmv bit stream, collaborative peer-to-peer streaming framework for scalable media. significant results were achieved, including for example, 4~7 real-time transcoding speed for standard definition video, and real-time high definition wmv video playback on xbox.



very interesting:

http://research.microsoft.com/~jackys....mes.pdf


i'd be willing to implement the above abstract, with help of course, pm me if you want to try.

if anyone is a friend of gamester's, let him know about hdox; maybe the xbe is available somewhere, or there's more info to be had.

later...
Reply
#49
the article you linked to isn't the one you want.

you want the gpu assisted video decoding one (article 5).

and it doesn't give enough detail imo to allow efficient implementation. the idea is to move just the mocomp onto the gpu (as that saves readbacks by the cpu) but it's not a trivial process at all.

i emailed some of the contributors to that article when it appeared in ieee, but did not hear back from them.

feel free to do what you can to develop this, just don't expect it to be easy - the article is light on the details where the action is really going to occur.

cheers,
jonathan
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
Reply
#50
wow! it's interesting to see that someone out there has actually accomplished gpu assisted wmv-hd decoding in real time! even though no one is working on this, it's encouraging to know that it is technically possible, so anyone who takes on the task knows in advance that it is feasible!

i assume that a similar method could also be applied to avc decoding?
Reply
#51
a data point, for those interested...
but before i provide it, a suggestion on metrics.

people keep throwing around resolutions, when the more fundamental metric is "macroblocks per second": how many macroblocks can be decoded and displayed without incurring frame drops. this metric encompasses all three parameters (width, height, and frame rate) and is easily computed:
width * height / (16*16) * fps

i find that my v1.0 xbox can play back mpeg4 asp (without qpel or gmc, with up to 2 b-frames) encodes without *any* frame drops up to about 66,000 macroblocks per second using 'recent' builds.

to put that 66k number in perspective

1280 * 528 / (16*16) *25 = 66000
1280 * 544 / (16*16) *24 = 65280 (2.35:1 film)
960 * 720 / (16*16) *24 = 64800 (4x3 ar)
960 * 544 / (16*16) *30 = 61200 ('half res' hdtv)
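
The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the formula from the post; the function name and the `XBOX_LIMIT` constant are made up for illustration, and the 66,000 figure is the empirical number quoted above, not a measured hardware constant.

```python
def macroblocks_per_second(width, height, fps, mb_size=16):
    """Macroblock rate for a frame size and frame rate (16x16 macroblocks)."""
    return width * height / (mb_size * mb_size) * fps

# Empirical v1.0 Xbox limit quoted in the post (hypothetical constant name).
XBOX_LIMIT = 66_000

modes = [
    (1280, 528, 25),
    (1280, 544, 24),
    (960, 720, 24),
    (960, 544, 30),
    (1280, 720, 24),
]

for width, height, fps in modes:
    rate = macroblocks_per_second(width, height, fps)
    verdict = "within limit" if rate <= XBOX_LIMIT else "over limit"
    print(f"{width}x{height} @ {fps} fps: {rate:.0f} MB/s ({verdict})")
```

Note that the plain division assumes both dimensions are multiples of 16, which holds for every mode listed here; a frame such as 1920x1080 would need its height rounded up to 1088 first.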

note that when doing the above, i use hardware overlay video and digital audio out, in order to offload the ac3 audio decoding. all internal servers are disabled (including fan speed control). i play the large avis and mkvs from the local disk, wrapped in <1gb segmented rars to get around the fatx size limitation. i haven't been able to do it from dvd-rs, as i find the stock drive can't keep up with the bit rate peaks, and even a 16mb cache gets depleted. i have had limited success via smb: the extra processing overhead is enough to push some video 'over the edge'.

the 'half-res hdtv' above leaves sufficient cpu margin to do audio decoding and smb streaming as well.

generally speaking 1280*720*24 seems problematic. at 86,400 macroblocks per second, you will have dropped frames during high motion segments unless you starve it for bit rate (in which case you get artifacts); i find neither acceptable. (the point here is quality, right?) if there was some way to 'pipeline' the video decoding, so that it could use excess cpu during 'easy' segments to get a head start on the 'hard' segments... but that would probably require ram we don't have.
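
To put a rough number on the RAM concern, here is a back-of-the-envelope sketch of what a decode-ahead queue would cost. It assumes decoded frames are stored as planar 4:2:0 (YV12, 12 bits per pixel), and the 16 MB budget is a purely hypothetical figure; the original Xbox has 64 MB of RAM in total, shared with everything else.

```python
def yv12_frame_bytes(width, height):
    """Bytes for one decoded 4:2:0 frame: full-res luma plus two quarter-res chroma planes."""
    return width * height * 3 // 2

def lookahead_frames(budget_bytes, width, height):
    """How many whole decoded frames fit into the given memory budget."""
    return budget_bytes // yv12_frame_bytes(width, height)

budget = 16 * 1024 * 1024  # hypothetical 16 MB set aside for decoded frames
frames = lookahead_frames(budget, 1280, 720)
print(f"one 1280x720 frame: {yv12_frame_bytes(1280, 720)} bytes")
print(f"frames of head start: {frames} (about {frames / 24:.2f} s at 24 fps)")
```

Under these assumptions a quarter of the machine's memory buys only about half a second of head start, which fits the suspicion that the RAM isn't there.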

i'm still experimenting...
-i want to study b-frame impact
-i suspect carefully applied gmc might help
-mplayer has some interesting decoder options to be tried

and then there is the question of what the mpeg*2* decoder can handle on this platform...



Reply
#52
bitrate is extremely important. the quantizer will still do its blocking, but a high bitrate versus a low one will affect decoding capability.
Reply
#53
I would like to once again bring up this thread and the discussion about GPU accelerated video decoding, if I may...

More and more articles are popping up on the internet about using the GPU (Graphics Processing Unit) to accelerate almost any mathematical function. Today I was referred to a new article on tomshardware.com named "The hidden potential in your graphics card: A supercomputer?", which says that a company called Peakstream not only claims to have a proof of concept for this, but has actually announced a new software platform (presumably an SDK with libraries) for C/C++ that lets any application take advantage of the superior floating-point performance of the GPU. Peakstream, as a commercial company, will sell its software tools as closed source, making them useless for XBMC. However, the concept is very interesting and once again it sparks my imagination, so I did some quick googling again and this is what I found:

This one specifically could possibly be interesting for Xbox/XBMC:
http://www.cse.cuhk.edu.hk/~ttwong/demo/...wtgpu.html
It includes an open source C++ example that implements the DWT (Discrete Wavelet Transform) on an Nvidia GPU for the JPEG2000 codec library JasPer. It also includes a standalone DWT-GPU C++ class (source code with an example program) that supports both FDWT (Forward Discrete Wavelet Transform) and IDWT (Inverse Discrete Wavelet Transform). A problem for us, though, might be that they only made it for and tested it with nVidia GeForce FX (and newer) graphics adapters, so maybe it could be used on an older nVidia GPU with some tweaking, but then again maybe not? In any case, even if it were possible, it would probably have to be added specifically per codec (i.e. to the FFmpeg libavcodec or libmpeg2 source code for each codec that we want to support this in the DVDPlayer and/or MPlayer), right?

The above application could possibly be used both for accelerated video decoding and for visual effects on textures in the GUI/skin(?)

Another similar project with available open source C++ code which also might be interesting is BrookGPU (a sourceforge.net project). Brook for GPUs is a compiler and runtime implementation of the Brook stream programming language for modern graphics hardware. The goals of the project are to demonstrate general-purpose programming on GPUs, to provide a useful tool for developers who want to run applications on GPUs, and to research the stream language programming model, streaming applications, and system implementations.

In theory the Xbox GPU should be able to help off-load the CPU with some MPEG-2 (and possibly MPEG-4) decoding tasks using its floating point (alpha blending in the pixel/vertex shader?) pipeline:
- Half-Pel Motion Compensation (1/2 pel mocomp) for MPEG-2
- Quarter-Pel Motion Compensation (1/4 pel mocomp) for MPEG-4
- Precision control for motion compensation on 8-bit ALU
- DCT/iDCT
- Deinterlacing
- Frame-rate conversion
- Post-processing filters
- GPU assembly optimization
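
To make the first item in the list concrete, here is a minimal CPU-side sketch of MPEG-2 style half-pel motion compensation: the prediction is the rounded average of the two or four neighbouring reference samples. The function name and data layout are made up for illustration, and the code does no bounds checking. A GPU's bilinear texture filter computes a similar average, but not necessarily with the same rounding, which is presumably why "precision control" appears as its own item.

```python
def half_pel_predict(ref, x2, y2, width, height):
    """Fetch a width x height block from ref at half-pel position (x2/2, y2/2).

    ref is a list of rows of 8-bit samples; x2 and y2 are non-negative
    coordinates in half-pel units. The caller must keep the block (plus one
    extra row/column when interpolating) inside the reference frame.
    """
    x0, y0 = x2 >> 1, y2 >> 1   # integer sample position
    hx, hy = x2 & 1, y2 & 1     # 1 if a half-sample offset is needed
    block = []
    for y in range(height):
        row = []
        for x in range(width):
            a = ref[y0 + y][x0 + x]
            b = ref[y0 + y][x0 + x + hx]
            c = ref[y0 + y + hy][x0 + x]
            d = ref[y0 + y + hy][x0 + x + hx]
            # With hx == hy == 0 this returns a unchanged; with one offset set
            # it equals (a + b + 1) >> 1; with both set it is the 4-tap average.
            row.append((a + b + c + d + 2) >> 2)
        block.append(row)
    return block

reference = [
    [0, 10, 20],
    [40, 50, 60],
    [80, 90, 100],
]
print(half_pel_predict(reference, 0, 0, 2, 2))  # full-pel copy
print(half_pel_predict(reference, 1, 0, 2, 2))  # horizontal half-pel
print(half_pel_predict(reference, 1, 1, 2, 2))  # diagonal half-pel
```

The inner loop is small and regular, but it runs for every predicted sample of every frame, which is why it is a natural candidate for off-loading.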

Another interesting example (though probably not applicable to XBMC on the Xbox, as it is designed for OpenGL?) is in this PDF:
http://www.vis.uni-stuttgart.de/ger/rese...wavelet%22
It describes a method for calculating wavelets using OpenGL (however, it contains only the mathematics and some timings, not source code). Using GLSL (the OpenGL Shading Language) one could possibly write a GPU program that calculates the wavelet transform of any image.

Even a Microsoft software engineer claims to have achieved GPU accelerated video decoding of high definition MPEG-2 (Transport Stream) and WMVHD (WMV9 HD) in real time on the original Xbox. See the "Research Statement" and "Projects Highlight (Previous Projects)" on http://research.microsoft.com/~jackysh/ ...hell, maybe a programmer should be so bold as to e-mail this Guobin (Jacky) Shen person, tell him about XBMC and ask him nicely for some advice or a referral to any other programmer that might be willing to assist.
He has posted an interesting paper on the topic (and linked to a few others):
http://research.microsoft.com/~jackysh/publication.htm


PS! I would think that JMarshall, as a mathematician, would at the very least find all this a bit interesting.

...well, maybe some day, one can always dream.


PPS! I have of course posted about this idea previously in this (very long) thread (but no programmer ever followed through with the research or any development). Here is a little reminder of those ideas, which could still be valid(?):
http://www.gpgpu.org
http://216.239.57.104/search?q=cache:npW...pdf+&hl=en
http://www.cs.washington.edu/homes/oskin...ro2002.pdf
http://ieeexplore.ieee.org/xpl/freeabs_a...mber=30783
http://www.tomshardware.com/hardnews/200...35943.html
http://www.tomshardware.com/hardnews/200...61353.html

FYI: after researching, one thing that does not seem possible is the previously mentioned DXVA API. It is not available on the Xbox because Microsoft does not provide these API libraries (or a DXVA SDK) for the Xbox.
Reply
#54
Good research Gamester!

HD playback on the good old Xbox would kick so much ass and blow everyone's mind. Currently you need a 2 GHz PC just to play back MS' HD trailers in 720p, and 2.5 GHz for 1080p.

Imagine exchanging the stock drive for an HD DVD or even Blu-ray drive and being able to play back the next-gen formats. I'm sure that would get some press hehe (yes, I know this would need a lot of work for menus again).
I really dislike the PS3, and the Xbox 360 with its external drive. Also, it doesn't look like it's ever going to be able to play homebrew from the hard drive.
So XBMC forever baby!

Like you say, one can always dream.
Reply
#55
I've gotten the impression that the xbmc mplayer build uses the mplayer directx video driver. From reading the mplayer docs, I've wondered about the following drivers...
Code:
winvidix (Windows only)
       Windows frontend for VIDIX
cvidix
       Generic and platform independent VIDIX frontend, can even run in
===>   a text console with nVidia cards.
vidix
       VIDIX (VIDeo Interface for *niX) is an interface  to  the  video
       acceleration  features  of  different graphics cards.  Very fast
       video output driver on cards that support it.
          <subdevice>
               Explicitly choose the VIDIX  subdevice  driver  to  use.
               Available   subdevice   drivers  are  cyberblade_vid.so,
               mach64_vid.so,       mga_crtc2_vid.so,       mga_vid.so,
===>           nvidia_vid.so,         pm3_vid.so,        radeon_vid.so,
               rage128_vid.so, sis_vid.so and unichrome_vid.so.
Might that provide a path to doing motion compensation on the GPU?
Reply
#56
I have tried recording and playing back DVB satellite TS streams (MPEG-2),
1280x720p with AC3 audio. The audio plays fine... the video plays back at
15 to 20 frames per second... lots of dropped frames... but it is almost playable.
It seems to me that with some tweaking it could be done.

bump
Reply
#57
This might be completely silly but doesn't the built in DVD player use motion compensation? Would it be possible to link to the appropriate functions without violating the applicable laws?

I know almost nothing about assembly level programming and I know it would not be easy to identify which parts do the motion compensation but if the code is there it may be worth taking advantage of.

David
Reply
#58
So is the major obstacle to implementing more GPU assisted operations in XBMC video playback the lack of published documentation for the NV2A GPU? I've seen a lot of semiconductor NDAs (non-disclosure agreements) in my line of work, and they all have one thing in common - they have an expiration date. What if the documentation for the NV2A is no longer under NDA and it's just a matter of finding the right person to provide it to us? Does anybody know anybody at Nvidia? I'm going to start sniffing around to see what I can find out.
Reply
#59
I guess that's not the main problem. The NV2A is very similar to a GeForce3.
I guess the problem is that you need to code an H.264 decoder that executes on the GPU, and neither part is an easy feat: coding an H.264 decoder is a huge task that requires a profound knowledge of video, and doing it on the GPU requires even more than that, namely the ability to hand-code the GPU, as there is no compiler that converts asm or C into shader code.
Reply
#60
Plus, the accuracy of the GPU is not that flash (9 bits/channel including sign). Just the colour space conversion requires lookup textures to create a decent looking conversion (the HQ pixel shader), as otherwise the computation (rather trivial, effectively a single matrix multiplication) loses accuracy due to the intermediate storage.
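
As a rough illustration of that accuracy point, the sketch below performs the YCbCr to RGB conversion as a matrix multiplication and optionally rounds every intermediate product to a coarse step. The BT.601 video-range coefficients are an assumption (the matrix actually used is not stated here), and the `step` rounding is a crude stand-in for limited intermediate precision, not a model of the NV2A's real arithmetic.

```python
# BT.601 video-range YCbCr -> RGB coefficients (assumed for illustration).
MATRIX = [
    [1.164, 0.000, 1.596],    # R
    [1.164, -0.392, -0.813],  # G
    [1.164, 2.017, 0.000],    # B
]

def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))

def ycbcr_to_rgb(y, cb, cr, step=None):
    """Convert one pixel; if step is set, round each product to a multiple of step."""
    vec = [y - 16, cb - 128, cr - 128]
    rgb = []
    for row in MATRIX:
        total = 0.0
        for coeff, comp in zip(row, vec):
            term = coeff * comp
            if step is not None:
                term = round(term / step) * step  # hypothetical low-precision storage
            total += term
        rgb.append(clamp8(total))
    return tuple(rgb)

# Count how many sampled pixels change when intermediates are rounded to whole units.
samples = [(y, cb, cr) for y in range(16, 236, 20)
           for cb in range(16, 241, 32) for cr in range(16, 241, 32)]
changed = sum(ycbcr_to_rgb(*p) != ycbcr_to_rgb(*p, step=1.0) for p in samples)
print(f"{changed} of {len(samples)} sampled pixels differ with rounded intermediates")
```

Each rounded product can be off by up to half a unit, so three of them can shift a channel by a level or two, which is visible as banding in smooth gradients.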

Cheers,
Jonathan
Reply
