No multithreaded Video decoding on XBMC INTREPID version (in-depth testing)
#46
mr_raider Wrote: I noticed CPU throttling management sucks in Intrepid. I get dropped frames when using software upscaling if the CPU mode is set to "ondemand". The CPU load doesn't go anywhere near 100%, and it stays throttled and doesn't kick up to speed even if frames are dropped.

If I set the power mode to "performance", CPU usage drops and the frame drops stop.

I think software upscaling is a whole different story, because it can run in parallel with other processes.

I think that even if ffmpeg is running multithreaded, the video decoding part is only running single-threaded. It was like that for a long time with ffmpeg on Linux. I read that only recently did we get FRAME-LEVEL multithreading for h264 streams; previously there was only slice-based parallelization, which does not work with new x264-encoded streams.

http://www.nabble.com/ffmpeg-SoC---Frame...03474.html
http://lists.mplayerhq.hu/pipermail/ffmp...48070.html

The first question is: is XBMC using an ffmpeg with frame-level multithreading enabled or not?
The other question is: whatever the answer to the first one, why is performance so much worse on newer kernels than on the 2.6.24 Hardy kernel?
Reply
#47
Did you do what I said? In top it's blatantly obvious that there are two threads decoding.
Reply
#48
althekiller Wrote: Did you do what I said? In top it's blatantly obvious that there are two threads decoding.

Yes, I did. With the killa sample it is 98% / 15% in top after pressing 1, whether powernowd is running or not.
And that is on a 3 GHz Core 2 Duo E8400 CPU! In Windows, Media Player Classic (which uses ffmpeg, by the way) loads both cores evenly at 60-70% each.

I'm not saying that decoding as a whole isn't multithreaded, but I think the h264 decoding part is not. Sure, sound decoding and other stuff run on different threads.
Reply
#49
No.. We have the CABAC patch applied in XBMC's ffmpeg. Therefore it should be using two threads for decoding h264.

This CABAC patch is applied on top of the ffmpeg tree.
Reply
#50
tslayer Wrote: No.. We have the CABAC patch applied in XBMC's ffmpeg. Therefore it should be using two threads for decoding h264.

Thanks for the info. It is just what I thought, because on the Hardy kernel I don't have this problem. I have no dropped frames there. But on Intrepid a lot of people are seeing this performance issue, and I thought it might be because the decoding is running single-threaded.
Reply
#51
Why do you keep comparing to Windows apps? Give us results from mplayer and ffplay. In top you want to be looking at the process list: when you press play you'll see ~5 more threads spawn for xbmc.bin, and two of them will be using a lot of CPU (for h264 only). Audio decoding is negligible in comparison.
Reply
#52
Sorry for mentioning Windows. I came from the Windows world but I will never go back. I just picked a player that uses ffmpeg, but I could have said XBMC under Windows or XBMC under Hardy, because there it all works fine. But a lot of people with new hardware would like to use the new kernels, and that will be more and more the case in the future, and so far it does not work OK on Intrepid.
Reply
#53
Right but I'm fairly certain you're looking in the wrong place.

Give this a read.

http://www.phoronix.com/scan.php?page=ar...2008&num=1
Reply
#54
So you are saying that the only problem is that new kernels run XBMC slower? So we just need a faster CPU to run it?

I wonder if anyone has been able to play the killa sample without framerate drops on Intrepid, and what hardware they are using.

Because the 3 GHz Core Duo is way too slow for this. We might need a 4 GHz processor...
Reply
#55
Well, I was going to test on my Intrepid box but didn't have that sample and was seeing even loads on other flicks. Since this is apparently such a big deal I will take a copy of the clip over tonight and check it out to see if this occurs on the Intrepid box I recently built. I am pretty sure I tested this previously though when I was testing CPUs so I will be surprised if I see big drops at 3ghz...
Reply
#56
Here is the killa sample:

http://rapidshare.com/files/82525583/kil...4.mkv.html

Anyway, is this the ffmpeg patch included in XBMC?

http://code.google.com/p/google-summer-o...z&can=2&q=
Reply
#57
My Q6800 can play killa with 5 frames dropped (these are from the initial opening of the file, I think) at stock clock.

The AMD 5000+ has no chance at stock. None. It's a 2.6 GHz processor, and I've got it ramped up to 3.25 GHz and it's still dropping ~150 frames.
Reply
#58
motd2k Wrote: Edit2: Found a workaround. Line 152 in xbmc/cores/dvdplayer/DVDCodecs/Video/DVDVideoCodecFFmpeg.cpp reads like this...
PHP Code:
int num_threads = std::min(/*MAX_THREADS*/g_cpuInfo.getCPUCount());
Change it to this...
PHP Code:
int num_threads = std::min(/*MAX_THREADS*/g_cpuInfo.getCPUCount()+2);
and you should find it using much more CPU. I also found a fairly significant reduction in dropped frames.

I tried this and it actually helps a lot. Now I only get a few frame drops in the killa sample. The Dark Knight BD rip actually plays without a problem. :D
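
What the workaround amounts to is "CPU count plus a couple of extra threads, up to some cap". A minimal sketch of that idea follows, for anyone who wants to experiment with other values. It is not the XBMC code: `pick_decoder_threads`, `extra_threads` and `max_threads` are hypothetical names, and the cap of 8 used in the note below is an assumed value, not one taken from the source file.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not part of XBMC): decoder thread count chosen as
// "CPU count plus a few extra", clamped to the range [1, max_threads].
int pick_decoder_threads(int cpu_count, int extra_threads, int max_threads) {
    assert(max_threads >= 1);  // the cap itself must allow at least one thread
    return std::max(1, std::min(max_threads, cpu_count + extra_threads));
}
```

On a dual-core machine with an assumed cap of 8, `pick_decoder_threads(2, 0, 8)` gives 2 and `pick_decoder_threads(2, 2, 8)` gives 4, which corresponds to the unpatched and patched cases.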

I noticed a strange thing: while playing back the killa sample I checked in top (pressing 1, then H to show threads). The two significant threads of xbmc.bin share CPU time almost evenly, BUT in the per-CPU utilization at the top of the screen I still see uneven numbers. It is as if the cores were taking different amounts of the two threads: one core runs all of one thread plus some of the other, and the other core only gets a small part of one thread. It is really strange.

This compile-time change might tell the devs something about which config options we can play with to get better sharing across cores and reach the performance of the old kernels.

Screenshot of TOP with the strange situation (Threads 49%-45% ; CPUs 99% 13%)

I think the kernel tries to use as few cores as possible, leaving the rest idle so that they can be downclocked. That is perfectly reasonable if we are using the kernel for normal tasks: a 4-core or 8-core processor then uses only as many cores as are needed for a given task. But for video decoding I think it's not good. We want the cores evenly loaded, or better, each of the two main threads staying exclusively on its own core.

Maybe there is a kernel option for this, letting the user choose how to optimize CPU core usage. I think this could be the difference between the Hardy and Intrepid kernels: Hardy did not have this so-called "energy optimized" CPU usage. (I already tried disabling powernowd, and also SpeedStep in the BIOS, with no change.)
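
The idea of keeping a thread on one specific core can be tried from user space on Linux with `sched_setaffinity`, without any kernel option. The following is only a sketch for experimenting, not something XBMC does: `first_allowed_cpu` and `pin_current_thread` are hypothetical helper names, and whether pinning actually reduces dropped frames is untested here.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for the CPU_* macros and sched_setaffinity on glibc
#endif
#include <cassert>
#include <sched.h>

// Returns the lowest-numbered CPU the calling thread may run on, or -1 on error.
int first_allowed_cpu() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &set)) return cpu;
    return -1;
}

// Restricts the calling thread (pid 0 = "this thread") to a single CPU.
// Returns true on success, false if the kernel rejects the request.
bool pin_current_thread(int cpu) {
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}
```

In a decoder this would have to be called from inside each decoding thread, with a different CPU number per thread.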
Reply
#59
I tested it with 8.10 on a 2.5Ghz AMD X2. I have no HD content other than the "killa sampla" you guys seem to love.

For SD (MPeg-2), there is a slight imbalance of load between the two CPUs, but it's not flagrant, i.e. maybe 60%/20%. It does vary a lot, and XBMC reports similar numbers to top, with slight delay.

I have 4-6 threads spawned for xbmc.bin, but the first one is at 50% CPU, the second at 10-15%, and the last two are minimal.

For the killa sampla, the imbalance is more obvious (i.e. 90/10) by the end of the clip.

Since I don't view any h264 content, I am hesitant to recompile, as I finally got a build that works properly with PulseAudio. Let someone else be the guinea pig.

By the way, disabling CPU throttling helps a lot in the dropped frames department.
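
To check which throttling mode a core is actually in ("ondemand", "performance", ...), the governor can be read from the cpufreq sysfs file. A minimal sketch, assuming the standard `/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor` location; `governor_from_text` and `read_governor` are hypothetical helper names, and the file may not exist on systems without a cpufreq driver.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Extracts the governor name (first whitespace-delimited token) from the
// contents of a scaling_governor file, e.g. "ondemand\n" -> "ondemand".
std::string governor_from_text(const std::string& text) {
    std::istringstream in(text);
    std::string name;
    in >> name;  // leaves name empty if there is no token
    return name;
}

// Reads the current cpufreq governor for one CPU from sysfs.
// Returns an empty string if the file is missing or unreadable.
std::string read_governor(int cpu) {
    assert(cpu >= 0);
    std::ifstream file("/sys/devices/system/cpu/cpu" + std::to_string(cpu) +
                       "/cpufreq/scaling_governor");
    std::stringstream contents;
    contents << file.rdbuf();  // stays empty if the file could not be opened
    return governor_from_text(contents.str());
}
```

Calling `read_governor(0)` and `read_governor(1)` before and during playback shows whether both cores are really in the mode you expect.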
Reply
#60
alanwww1 Wrote: I think the kernel tries to use as few cores as possible, leaving the rest idle so that they can be downclocked. That is perfectly reasonable if we are using the kernel for normal tasks: a 4-core or 8-core processor then uses only as many cores as are needed for a given task. But for video decoding I think it's not good. We want the cores evenly loaded, or better, each of the two main threads staying exclusively on its own core.

Maybe there is a kernel option for this, letting the user choose how to optimize CPU core usage. I think this could be the difference between the Hardy and Intrepid kernels: Hardy did not have this so-called "energy optimized" CPU usage. (I already tried disabling powernowd, and also SpeedStep in the BIOS, with no change.)

No. Please get ahold of some operating systems and computer architecture textbooks and give them a read before making any more outlandish assumptions. At least read up on the new Linux process scheduler (it's called the Completely Fair Scheduler, or CFS). It was enabled by Ubuntu in 2.6.26 IIRC. It's not the job of the sch...

...my rant may have sparked an idea as to the cause. If the scheduler takes cache coherence into account, and there is significant spatial locality, it may stick both threads on the same core in an attempt to avoid having the same data in multiple caches. Off to read myself...
Reply