Compile faster Mplayer.dll with different parameters
#16
wizboy11,

can you upload the H.264 files? I would like to quickly test them on a DX1480.
Reply
#17
I've added 3 more CFLAGS and made it ~1fps faster. I also went against my better judgment and used -mfpmath=sse in it, but it doesn't really make a difference. I'm trusting the GCC guide on that:
Quote:The resulting code should be considerably faster in the majority of cases and avoid the numerical instability problems of 387 code, but may break some existing code that expects temporaries to be 80bit.

But that's not what made it 1fps faster. I added these flags:
Code:
-falign-functions -frename-registers -ffloat-store

I'm not really sure whether one flag slows it down while another speeds it up, but I know this build is faster than the previous ones. And it didn't get much bigger either. This one came in at around 4.99 MB.

Link:
Fast Mplayer 3

If you want to test using the same files I used I'll upload them for ya. They're packaged in 7z files with the conf file necessary to run the benchmark. Just follow elupus's directions for it:
Quote: neat. could you try running the benchmark switch on it? turn on debug log, then add a .conf file with same name as the file you play ie filename.avi.conf in same dir as file.

file should just contain these two lines.
benchmark=1
nosound=1

then look at xbmc logfile after playback, it should give you timings for the playback.
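The two-line .conf file described above can also be generated with a small script. This is only a sketch: the helper name `write_benchmark_conf` is made up for illustration, and it assumes nothing beyond what the quote says (same name as the video plus `.conf`, same directory, two lines).

```python
from pathlib import Path


def write_benchmark_conf(video_path):
    """Create '<video file name>.conf' next to the video (hypothetical helper).

    The file contains only the two lines from the quoted instructions.
    """
    video = Path(video_path)
    # filename.avi -> filename.avi.conf, in the same directory as the video
    conf = video.with_name(video.name + ".conf")
    conf.write_text("benchmark=1\nnosound=1\n")
    return conf
```

Calling `write_benchmark_conf("filename.avi")` would leave `filename.avi.conf` beside the video; the debug log still has to be switched on in XBMC separately.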
BTW, you can tell I like anime from them :) (but seriously, they're the only short things I had lying around, both OPs)
h264-File1
h264-file2

I'm done testing today. Whole thing is basically trial and error. So fvck'n annoying but I guess someone has to do it :D
Thanks.

Hopefully my efforts will pay off one day and this will get integrated into SVN. :D
Reply
#18
Wizboy11,

Dude!!! 5-7% improvement for free!!! Way to go :)

a) the latest one you compiled is indeed the fastest

b) your benchmarks differ too much between runs. Mine are much closer to each other! (Did you turn file caching off? File caching affects playback performance very negatively and may make your fps vary wildly.)

c) a must! Definitely a significant improvement over the standard mplayer.dll


d) I ended up not using your files for the benchmark, as XBMC is capped at 60fps max and those files played at that framerate most of the time. I used two different 720p files, a .mov and a .mkv

e) benchmarks were run with the 9-7-2k7 build from t3ch, using the original mplayer.dll and the three builds that you provided; each movie was run twice.

here they are:

Mplayer Orig


oceans_13-tlr2a_720p.mov

BENCHMARKs: VC: 109.048s VO: 17.666s A: 0.000s Sys: 4.697s = 131.410s
BENCHMARK%: VC: 82.9829% VO: 13.4432% A: 0.0000% Sys: 3.5740% = 100.0000%

BENCHMARKs: VC: 108.994s VO: 17.677s A: 0.000s Sys: 4.655s = 131.326s
BENCHMARK%: VC: 82.9949% VO: 13.4602% A: 0.0000% Sys: 3.5448% = 100.0000%

BornToKill_h.264_720p.mkv

BENCHMARKs: VC: 40.955s VO: 4.512s A: 0.000s Sys: 1.136s = 46.603s
BENCHMARK%: VC: 87.8806% VO: 9.6815% A: 0.0000% Sys: 2.4379% = 100.0000%

BENCHMARKs: VC: 40.823s VO: 4.499s A: 0.000s Sys: 1.157s = 46.479s
BENCHMARK%: VC: 87.8316% VO: 9.6802% A: 0.0000% Sys: 2.4883% = 100.0000%


Improvement over original Mplayer.dll: (oceans_13-tlr2a_720p.mov) : 0.00%
Improvement over original Mplayer.dll: (BornToKill_h.264_720p.mkv) : 0.00%


Mplayer fast2 ( i387 )


oceans_13-tlr2a_720p.mov

BENCHMARKs: VC: 102.447s VO: 17.698s A: 0.000s Sys: 4.666s = 124.811s
BENCHMARK%: VC: 82.0818% VO: 14.1799% A: 0.0000% Sys: 3.7383% = 100.0000%

BENCHMARKs: VC: 102.227s VO: 17.635s A: 0.000s Sys: 4.671s = 124.533s
BENCHMARK%: VC: 82.0882% VO: 14.1612% A: 0.0000% Sys: 3.7506% = 100.0000%

BornToKill_h.264_720p.mkv

BENCHMARKs: VC: 37.542s VO: 4.528s A: 0.000s Sys: 1.145s = 43.215s
BENCHMARK%: VC: 86.8724% VO: 10.4774% A: 0.0000% Sys: 2.6502% = 100.0000%

BENCHMARKs: VC: 37.575s VO: 4.512s A: 0.000s Sys: 1.163s = 43.250s
BENCHMARK%: VC: 86.8776% VO: 10.4332% A: 0.0000% Sys: 2.6892% = 100.0000%


Improvement over original Mplayer.dll: (oceans_13-tlr2a_720p.mov) : 5.17%
Improvement over original Mplayer.dll: (BornToKill_h.264_720p.mkv) : 7.02%




Mplayer fpmath ( SSE )


oceans_13-tlr2a_720p.mov

BENCHMARKs: VC: 102.349s VO: 17.681s A: 0.000s Sys: 4.713s = 124.744s
BENCHMARK%: VC: 82.0476% VO: 14.1742% A: 0.0000% Sys: 3.7783% = 100.0000%


BENCHMARKs: VC: 101.291s VO: 17.529s A: 0.000s Sys: 4.709s = 123.529s
BENCHMARK%: VC: 81.9977% VO: 14.1900% A: 0.0000% Sys: 3.8122% = 100.0000%

BornToKill_h.264_720p.mkv

BENCHMARKs: VC: 37.502s VO: 4.528s A: 0.000s Sys: 1.147s = 43.177s
BENCHMARK%: VC: 86.8571% VO: 10.4865% A: 0.0000% Sys: 2.6564% = 100.0000%

BENCHMARKs: VC: 37.541s VO: 4.533s A: 0.000s Sys: 1.110s = 43.184s
BENCHMARK%: VC: 86.9335% VO: 10.4965% A: 0.0000% Sys: 2.5699% = 100.0000%


Improvement over original Mplayer.dll: (oceans_13-tlr2a_720p.mov) : 5.93%
Improvement over original Mplayer.dll: (BornToKill_h.264_720p.mkv) : 7.10%




Mplayer fast3 ( SSE )


oceans_13-tlr2a_720p.mov

BENCHMARKs: VC: 101.430s VO: 17.774s A: 0.000s Sys: 4.676s = 123.880s
BENCHMARK%: VC: 81.8778% VO: 14.3478% A: 0.0000% Sys: 3.7743% = 100.0000%

BENCHMARKs: VC: 101.187s VO: 17.630s A: 0.000s Sys: 4.686s = 123.502s
BENCHMARK%: VC: 81.9313% VO: 14.2747% A: 0.0000% Sys: 3.7940% = 100.0000%

BornToKill_h.264_720p.mkv

BENCHMARKs: VC: 37.565s VO: 4.532s A: 0.000s Sys: 1.167s = 43.264s
BENCHMARK%: VC: 86.8288% VO: 10.4748% A: 0.0000% Sys: 2.6964% = 100.0000%

BENCHMARKs: VC: 37.566s VO: 4.528s A: 0.000s Sys: 1.154s = 43.248s
BENCHMARK%: VC: 86.8609% VO: 10.4697% A: 0.0000% Sys: 2.6694% = 100.0000%


Improvement over original Mplayer.dll: (oceans_13-tlr2a_720p.mov) : 5.95%
Improvement over original Mplayer.dll: (BornToKill_h.264_720p.mkv) : 6.95%
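For anyone who wants to redo the arithmetic from their own logs, here is a rough sketch. It assumes the log lines look exactly like the `BENCHMARKs:` lines above, and it compares the fastest run of each build, which appears to be consistent with the percentages listed (give or take the last digit). The function names are made up for illustration.

```python
import re

# Matches the total at the end of a "BENCHMARKs:" line, e.g. "... = 131.326s"
TOTAL_RE = re.compile(r"^BENCHMARKs:.*=\s*([0-9.]+)s\s*$")


def total_seconds(line):
    """Return the total playback time from one log line, or None if it is not a timing line."""
    match = TOTAL_RE.match(line.strip())
    return float(match.group(1)) if match else None


def best_total(lines):
    """Fastest (smallest) total among the runs of one build."""
    totals = [t for t in map(total_seconds, lines) if t is not None]
    return min(totals)


def improvement_percent(baseline_lines, candidate_lines):
    """Time saved by the candidate build, as a percentage of the baseline."""
    base = best_total(baseline_lines)
    cand = best_total(candidate_lines)
    return (base - cand) / base * 100.0


orig = [
    "BENCHMARKs: VC: 109.048s VO: 17.666s A: 0.000s Sys: 4.697s = 131.410s",
    "BENCHMARKs: VC: 108.994s VO: 17.677s A: 0.000s Sys: 4.655s = 131.326s",
]
fast2 = [
    "BENCHMARKs: VC: 102.447s VO: 17.698s A: 0.000s Sys: 4.666s = 124.811s",
    "BENCHMARKs: VC: 102.227s VO: 17.635s A: 0.000s Sys: 4.671s = 124.533s",
]
print(f"{improvement_percent(orig, fast2):.2f}%")  # prints 5.17%
```

The `BENCHMARK%:` lines are ignored on purpose, since they only restate each run's breakdown as percentages.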
Reply
#19
On a side note, I get this error in the log with every mplayer.dll build when trying to play the .mov:

DEBUG: msg: Compiler did not align stack variables. Libavcodec has been miscompiled and may be very slow or crash. This is not a bug in libavcodec, but in the compiler. Do not report crashes to FFmpeg developers.
Reply
#20
Thanks.
Well, they were probably capped because of your faster processor. On the regular 733 MHz processor they stay at ~30fps. Lucky you, right? :D

Seems -mfpmath=sse is faster, going by your benchmarks. I'll just go with that.
Fast Mplayer #3 is the fastest, it would seem.

Also thanks for the tip about disabling caching. I didn't do that on mine. I'll try it when I test any new builds I might have (but that's enough for tonight for me :p)

BTW if you want to compile these yourself, I believe that these instructions still work:
mplayer compile instructions

Just before you type make, copy and paste the config.mak file from the 7z file and overwrite the old one in the mplayer directory. You can even change the flags on these lines:
Code:
OPTFLAGS = -march=pentium3 -mmmx -msse -mfpmath=sse -Os -pipe -fomit-frame-pointer -ffast-math -m32 -mpreferred-stack-boundary=4 -malign-double -falign-functions -frename-registers -ffloat-store
EXTRA_INC = -I/c/XBMC/mplayer/xbmcsys -I/c/XBMC/mplayer/xbmcsys/xbmc_vobsub -D_XBOX -DXBMC_VOBSUB -march=pentium3 -mmmx -msse -mfpmath=sse -pipe -fomit-frame-pointer -ffast-math -m32 -mpreferred-stack-boundary=4 -malign-double -falign-functions -frename-registers -ffloat-store
OPTFLAGS = -I../libvo -I../../libvo  -fno-PIC -march=pentium3 -mmmx -msse -mfpmath=sse -Os -pipe -m32 -mpreferred-stack-boundary=4 -malign-double -ffast-math -fomit-frame-pointer -falign-functions -frename-registers -ffloat-store -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 $(EXTRA_INC)

I have some redundant stuff in it. I really don't need to type -mmmx or -msse since -march=pentium3 already implies them. In fact it probably isn't a good idea :p. But it doesn't really make a difference. These were the flags I used to compile Fast Mplayer #3, which is the fastest one so far. Just make the flags match up across the three lines, though I really don't think that's necessary (I did it just to be safe, since I wasn't sure which line the compiler would use when).
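To illustrate the redundancy point, here is a small sketch that drops flags already implied by -march=pentium3 and removes exact duplicates. The `IMPLIED_BY_MARCH` table and the `simplify_flags` name are assumptions made for this example; the table covers only the two flags mentioned in this post, not everything GCC enables for that CPU.

```python
# Flags that -march=pentium3 already turns on (only the two discussed here).
IMPLIED_BY_MARCH = {"-march=pentium3": {"-mmmx", "-msse"}}


def simplify_flags(flags):
    """Remove flags implied by a -march option and any repeated flags, keeping order."""
    tokens = flags.split()
    implied = set()
    for token in tokens:
        implied |= IMPLIED_BY_MARCH.get(token, set())
    seen = set()
    kept = []
    for token in tokens:
        if token in implied or token in seen:
            continue
        seen.add(token)
        kept.append(token)
    return " ".join(kept)


optflags = (
    "-march=pentium3 -mmmx -msse -mfpmath=sse -Os -pipe -fomit-frame-pointer "
    "-ffast-math -m32 -mpreferred-stack-boundary=4 -malign-double "
    "-falign-functions -frename-registers -ffloat-store"
)
print(simplify_flags(optflags))
```

Note that -mfpmath=sse is kept: it selects which unit does the floating-point math, which is a separate choice from enabling the SSE instructions.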

BTW, would someone please put the link for Fast Mplayer #3 in the OP?
Keep people up to date, ya know? Same with the flags in this post. I have no idea why I can't edit posts. I don't see why that would be a problem, but apparently that's what the mods see fit to do. ;)

Thanks :)
Reply
#21
Could any coder explain why this made it faster? And does it have an impact on the visual quality?

If there's no setback with it, it seems strange it hasn't happened before.

Good job, trial and error "rUlEz" I guess :-)
Reply
#22
For some reason, when I compile it with my options I get this:
Code:
12:10:23 M: 34058240   DEBUG:   msg: Compiler did not align stack variables. Libavcodec has been miscompiled
                             and may be very slow or crash. This is not a bug in libavcodec,
                             but in the compiler. Do not report crashes to FFmpeg developers.

I believe that someone already said this too. From what I can tell, I know it's not running slower (in fact it's running faster) and it's not crashing so as far as I know it should be safe.

This message does not appear when I don't change the compiler options.

I'm not sure what it means by not aligning stack variables. I'll google it but I'm not sure I'll understand that either. Anyone care to enlighten?
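As far as I can tell, the message is about memory addresses: an address is "16-byte aligned" when it is an exact multiple of 16, which aligned SSE loads and stores require. The sketch below only shows that arithmetic; it is not libavcodec's actual check, and the function name and example addresses are made up for illustration.

```python
def is_aligned(address, boundary=16):
    """True if the address is a multiple of the boundary (boundary must be a power of two)."""
    return (address & (boundary - 1)) == 0


# -mpreferred-stack-boundary=4 asks GCC for a 2**4 = 16 byte stack boundary.
assert 2 ** 4 == 16

print(is_aligned(0x1000))  # True: 0x1000 is a multiple of 16
print(is_aligned(0x1008))  # False: 8 bytes past a 16-byte boundary
```

If a stack variable that was supposed to sit on a 16-byte boundary ends up at an address like the second one, that would presumably be the situation the warning describes.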

Thanks :D

Quote:And does it have an impact on the visual quality?
It shouldn't, AFAIK, but I'm no coder nor an expert. If anything happened, changing -O4 (which defaults to -O3 since -O4 doesn't exist) to -Os made it more stable by disabling optimizations.

I'm not a coder but some applications just run faster with smaller files or less optimizations. Not all optimizations are good ones, as seen in this case.

I'm using this as my basis:
Quote:Note that -O2 is regarded as safer than "-O3", and "-O3" can often be a counter-productive attempt at optimization. On computers with limited cache and/or memory, "-Os" may provide better performance in some cases through smaller binaries, although it is slower when using the OpenSSL library with small keys (DSA keys with less than 2048 bits on VIA C3-2, 1200 MHz and 64 kb on-die cache).

And I tried -O2 and it was slower than -Os. Since the Xbox is a prime example of limited cache and memory, I guess that might be why.

Taken from Safe Cflags:
Cflag Page
I realize that the page is for safe cflags for Gentoo but still, it's a good source of information.
Reply
#23
Fast Mplayer version 4:
Link (fast mplayer v4)

I recompiled xvidcore.dll with -Os and got a smaller file (~580 compared to ~700). Also replaced the other things that went along with xvidcore.dll (like xvidcore.dll.a and xvidcore.a) in \mplayer\xbmcsys\lib.

Probably will have absolutely no purpose whatsoever (since xvid files really don't need speeding up) but I'm trying to save space.

I'm not sure if mplayer uses them at all but it gave me a slightly smaller mplayer.dll so I'm gonna guess and say yes.

I also changed the version.h file so that it properly displays the rev. number on the system info page instead of saying UNKNOWN. Also changed mplayer name in same file to Fast Mplayer TEST.

Thanks. :cool2:

SVN time soon? :D
Reply
#24
gronne Wrote:Could any coder explain why this made it faster?
it's probably faster because the smaller code fits better into the cpu cache and/or the conditional jump prediction works better on this code.
i'm guessing tho.
Reply
#25
wizboy11, first: great work! Now, why don't Xvid files need speeding up? Look at the frame drops on an unlimited (not using plugh's version) Xvid at 1280x540. Any speed-up would result in less dropping.

I was wondering whether your experiments can also be applied to the dvdplayer decoder (and DivX codec), because that one is much faster than mplayer (but still lacks subtitle support), so it's my preferred (or rather required) decoder for HD Xvid/DivX files.
Reply
#26
we use lavc for xvid.
Reply
#27
Question 
Is xvidcore any faster/better than lavcodec for Xvid video decoding in MPlayer these days?

PS! Reason we removed xvidcore was a bug that caused loads of early/late/skipped frames

Huh
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
Reply
#28
xvidcodec is slower, (and worse in reliability)

The speedup is probably because speed-critical code now fits into the L1 (or L2) cache; with full optimizations in use, the compiler unrolls a lot of loops, which makes the code take a lot more space.
Reply
#29
elupus Wrote:xvidcodec is slower, (and worse in reliability)

I didn't say I made it use the xvid codec. :)
I just said that I updated it with a fresh and smaller build. So me updating it had absolutely no effect on xvid playback.

Why did the mplayer dll get smaller though?
Reply
#30
I meant the xvidcore lib. That shouldn't be linked in, so I'd be very surprised if that affected the size of mplayer.dll
Reply