High CPU usage & Video stuttering while streaming
#16
(2014-08-12, 19:20)Milhouse Wrote: I'm not sure if OpenELEC has swap enabled by default on the Pi - pretty sure it doesn't. My testing above was without swap, 720 GUI and low colour textures.

I think swap is desirable on a 256M Pi.
#17
I can confirm that with the recent builds of OE my old 256mb pi is unusable. It has extreme lag and hangs during the playback of videos and also randomly while just navigating the UI.

I am going to try Raspbmc for this old pi as I see that there is a current thread on that forum discussing this exact issue. The newer builds of Raspbmc have swap enabled by default.
#18
(2014-08-12, 19:11)popcornmix Wrote: Okay, I can play that stream. I'll see if I can reproduce the high cpu (it's 2% currently).

Update: It's up to 14% - that looks like a bug.

Do you think it's fixable? It happens with any stream (video and audio) so for now I've disabled filecache.
#19
(2014-08-15, 14:25)darzur Wrote: Do you think it's fixable? It happens with any stream (video and audio) so for now I've disabled filecache.

Are you confirming that disabling filecache solves the high cpu - even when playing for long periods?

I spent a while stepping through the filecache code, and the problem didn't seem to be within filecache itself (*).
Filecache calls the shoutcast handler which does some bad things - e.g. compiling regular expressions every time it reads from the stream (sometimes reading only one byte).
It also calls curl to do the network stream reading (which is something I'd prefer not to dig into as that's a huge library I know nothing about).

However if you are confirming that disabling filecache is a complete solution, then that suggests (likely rather than definitely) that the problem is not in shoutcast handler or curl,
but in filecache, so may be worth a closer look.

(*) strangely the act of attaching the debugger and halting and restarting execution made the high cpu disappear for a while. That's obviously inconvenient for catching the bug in the debugger.
I suspect that halting execution allows the network stack to fill up, and that avoids the high cpu.
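
To illustrate the regular expression point only: a minimal sketch in Python (XBMC itself is C++). The pattern and the function names are hypothetical and are not taken from the shoutcast handler; the sketch just contrasts rebuilding a pattern on every read with compiling it once.

```python
import re

# Hypothetical pattern for a "StreamTitle='...';" metadata field.
# Compiled once, when the module is loaded.
TITLE_RE = re.compile(rb"StreamTitle='(.*?)';")

def title_per_read(chunk: bytes):
    # Anti-pattern: the pattern is rebuilt on every call, i.e. on every read.
    match = re.compile(rb"StreamTitle='(.*?)';").search(chunk)
    return match.group(1) if match else None

def title_precompiled(chunk: bytes):
    # Same result, but reuses the pattern compiled above.
    match = TITLE_RE.search(chunk)
    return match.group(1) if match else None
```

Python's re module keeps its own cache of recently compiled patterns, so the difference is smaller there than it would be in C++ code that rebuilds the expression for each read.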
#20
Just an update to my previous post. I made an image of Raspbmc on the 256mb pi with 112 gpu mem and it fixed all my issues.

I have no problems with OpenElec on any of my 512mb pi's, but for some reason the newer builds of OE on my old pi's are unusable.
#21
I've noticed a 100% CPU utilization issue (only on the Raspberry Pi) when you have indexed content and you play back a song. After stopping the song and playing a video I noticed that the CPU utilization doesn't drop. It stays at or near 100%. Try removing your index by renaming the database folder and rebooting.

Note: You shouldn't have a swap enabled for the Pi, due to it being run off a flash card.
#22
(2014-08-15, 14:52)lowridin_guy Wrote: I have no problems with OpenElec on any of my 512mb pi's, but for some reason the newer builds of OE on my old pi's are unusable.

You could try the latest OpenELEC Helix build on your 256MB Pi - build #0815 onwards has the command /usr/bin/configrpi256.sh which will:
  1. Enable 128MB swap (on next boot, as long as your /storage is SD or USB)
  2. Increase vm.swappiness from 10 to 20 (using /storage/.config/autostart.sh)
  3. Set gpu_mem_256 to 112 in /flash/config.txt
This, in theory, is similar to the Raspbian/256MB configuration, and it would be interesting to see if it makes any difference (better/worse). This command only needs to be run once, and the effect will persist across reboots.

To disable swap, just delete /storage/.config/swap.conf and reboot. Run /usr/bin/configrpi256.sh (and reboot) to re-enable swap.

Edit: Re-testing the video that wouldn't play here (without swap, when simulating a 256MB Pi) I'm now able to play the video without any problem after running /usr/bin/configrpi256.sh while simulating a 256MB Pi.

I can see (using bcmstat.sh cgx) that about 30% of swap is being used which means the video is slower to start than it is on a 512MB Pi, but once it begins playing there is no stuttering or buffering, so that's a result. :)

During playback, CPU is a fairly constant 25% (1GHz ARM, 500MHz Core, 600MHz SDRAM) and network RX rate is about 400KB/s (wired ethernet).

The cache (according to the OSD) is 1MB at 100% and never drops below 100%. What's odd (and slightly OT) is that according to xbmc.log, the cacheMemBufferSize is 2MB, and when it was 20MB (on a 512MB Pi), the OSD only showed 10MB @ 100%, so it looks like only half the available cache memory allocation is being used.

<readbufferfactor> is the system default, 4.0.
Texture Cache Maintenance Utility: Preload your texture cache for optimal UI performance. Remotely manage media libraries. Purge unused artwork to free up space. Find missing media. Configurable QA check to highlight metadata issues. Aid in diagnosis of library and cache related problems.
#23
(2014-08-16, 06:33)Jimbo99 Wrote: Note: You shouldn't have a swap enabled for the Pi, due to it being run off a flash card.

While generally speaking, yes, a high number of writes will eventually wear out an SD Card (or USB stick, or SSD), for the vast majority of users enabling a small amount of swap is unlikely to cause a problem.

Almost all Flash-based memory storage devices now support wear-levelling of one form or another (dynamic wear-levelling, in the case of most SD cards/USB sticks), and these Flash-based devices will last at least 2 years of constant writes, so it's very, very unlikely that anyone will wear out their Flash storage simply by enabling swap. Also bear in mind the swap won't be written all of the time - it will only be used when required - and not all of the swap will be used etc., so the impact should be minimal (probably less than all of the re-writing that occurs with SQLite databases).
#24
(2014-08-15, 14:41)popcornmix Wrote:
(2014-08-15, 14:25)darzur Wrote: Do you think it's fixable? It happens with any stream (video and audio) so for now I've disabled filecache.
Are you confirming that disabling filecache solves the high cpu - even when playing for long periods?

Yes, I can confirm that. Disabling filecache solves this problem. I can play streaming content for hours and CPU usage stays low.
#25
(2014-08-16, 06:58)Milhouse Wrote: You could try the latest OpenELEC Helix build on your 256MB Pi - build #0815 onwards has the command /usr/bin/configrpi256.sh which will:
  1. Enable 128MB swap (on next boot, as long as your /storage is SD or USB)
  2. Increase vm.swappiness from 10 to 20 (using /storage/.config/autostart.sh)
  3. Set gpu_mem_256 to 112 in /flash/config.txt

Edit: Re-testing the video that wouldn't play here (without swap, when simulating a 256MB Pi) I'm now able to play the video without any problem after running /usr/bin/configrpi256.sh while simulating a 256MB Pi.

I can see (using bcmstat.sh cgx) that about 30% of swap is being used which means the video is slower to start than it is on a 512MB Pi, but once it begins playing there is no stuttering or buffering, so that's a result. :)

During playback, CPU is a fairly constant 25% (1GHz ARM, 500MHz Core, 600MHz SDRAM) and network RX rate is about 400KB/s (wired ethernet).

Hello,

First of all, thank you for your effort on maintaining OE for 256 Rpis.

I've done some testing with the setup you mentioned:

- Fresh install of 815 build
- 112 gpu mem
- Swap enabled

Testing the same video now I can play it without issues, using context controls and skipping perfectly like in previous OE Gotham or in Frodo.

However, CPU usage is around 40% instead of 25% for me:

Code:
Time          ARM     Core     H264  Core Temp (Max)   IRQ/s      RX B/s      TX B/s  GPUMem Free   %user   %nice %system   %idle %iowait    %irq  %s/irq  %total  Memory Free/Used(SwUse)
========  =======  =======  =======  ===============  ======  ==========  ==========  ===========  ======  ======  ======  ======  ======  ======  ======  ======  =======================
14:17:14  1000Mhz   500Mhz   250Mhz  56.22C (56.22C)     628     460,931       9,937   51M ( 55%)   20.90   31.36   41.81    0.00    0.00    0.00    5.23  100.00  174,432 kB/35.1%( 2.2%)
14:17:16   700Mhz   250Mhz   250Mhz  55.69C (56.22C)     797     567,192      10,830   51M ( 55%)   20.91    2.27   10.46   55.01    0.00    0.00    4.09   44.99  174,120 kB/35.2%( 2.2%)
14:17:18   700Mhz   500Mhz   250Mhz  55.69C (56.22C)     771     491,472      10,223   51M ( 55%)   21.02    1.37    9.59   56.65    0.00    0.00    4.57   43.35  173,688 kB/35.3%( 2.2%)
14:17:20   700Mhz   250Mhz   250Mhz  55.15C (56.22C)     628     360,744       5,831   51M ( 55%)   19.17    2.28    7.30   59.80    0.00    0.00    2.74   40.20  173,680 kB/35.3%( 2.2%)
14:17:23   700Mhz   250Mhz   250Mhz  55.15C (56.22C)     689     422,416       7,530   51M ( 55%)   20.28    1.80   10.81   57.67    0.00    0.00    3.15   42.33  173,792 kB/35.3%( 2.2%)
14:17:25   700Mhz   250Mhz   250Mhz  55.15C (56.22C)     671     415,263       6,545   51M ( 55%)   19.73    2.69   10.76   56.51    0.00    0.00    2.69   43.49  173,840 kB/35.3%( 2.2%)
14:17:27   700Mhz   250Mhz   250Mhz  55.15C (56.22C)     711     452,414       8,533   51M ( 55%)   21.76    3.84   13.65   52.04    0.00    0.00    3.84   47.96  173,972 kB/35.2%( 2.2%)
14:17:29   700Mhz   250Mhz   250Mhz  55.15C (56.22C)     660     385,506       6,858   51M ( 55%)   18.42    3.14    9.43   59.75    0.00    0.00    1.80   40.25  173,524 kB/35.4%( 2.2%)
14:17:32   700Mhz   250Mhz   250Mhz  55.15C (56.22C)     603     310,559       5,599   51M ( 55%)   18.98    2.71    9.49   59.65    0.00    0.00    1.81   40.35  173,780 kB/35.3%( 2.2%)
14:17:34   699Mhz   250Mhz   250Mhz  55.15C (56.22C)     648     384,395       6,625   51M ( 55%)   18.28    3.20    8.23   60.79    0.00    0.00    1.83   39.21  173,812 kB/35.3%( 2.2%)

As you can see, swap is barely used.
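
For anyone wanting to average the %total column rather than eyeball it, here is a rough Python sketch. It assumes the column layout shown in the output above (eight two-decimal fields ending in %total) and uses three of the rows from that output. The first row, at 100% total, looks like a start-up sample and is left out, which is an assumption.

```python
import re

# Three rows copied from the bcmstat.sh output above.
SAMPLE = """\
14:17:16   700Mhz   250Mhz   250Mhz  55.69C (56.22C)     797     567,192      10,830   51M ( 55%)   20.91    2.27   10.46   55.01    0.00    0.00    4.09   44.99  174,120 kB/35.2%( 2.2%)
14:17:18   700Mhz   500Mhz   250Mhz  55.69C (56.22C)     771     491,472      10,223   51M ( 55%)   21.02    1.37    9.59   56.65    0.00    0.00    4.57   43.35  173,688 kB/35.3%( 2.2%)
14:17:20   700Mhz   250Mhz   250Mhz  55.15C (56.22C)     628     360,744       5,831   51M ( 55%)   19.17    2.28    7.30   59.80    0.00    0.00    2.74   40.20  173,680 kB/35.3%( 2.2%)
"""

# Only the CPU columns are plain numbers with exactly two decimals.
TWO_DECIMALS = re.compile(r"^\d+\.\d{2}$")

def total_cpu(line):
    """Return the %total value of one data row, or None for other lines."""
    fields = [f for f in line.split() if TWO_DECIMALS.match(f)]
    # Assumed order: %user %nice %system %idle %iowait %irq %s/irq %total
    return float(fields[7]) if len(fields) == 8 else None

def mean_total(text):
    """Average %total over every data row found in the text."""
    values = [v for v in map(total_cpu, text.splitlines()) if v is not None]
    return sum(values) / len(values)

print(round(mean_total(SAMPLE), 1))
```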

I've also noticed that the GUI is now somewhat blurry. I guess it is being resized to 720p.

By the way, if you run your config256 script, gpu mem does not change automatically to 112 since the line is still commented out in config.txt.

Best regards.
#26
(2014-08-16, 16:28)jesus225 Wrote: However, CPU usage is around 40% instead of 25% for me:

That's because you're dynamically changing frequency between 700MHz and 1000MHz. I've added "force_turbo=1" to my /flash/config.txt so that my Pi runs constantly at 1000MHz.
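
As a back-of-the-envelope check, the sketch below rescales a utilisation figure from one clock speed to another. It assumes the workload needs a fixed number of CPU cycles per second, which is only a rough approximation (it ignores memory and I/O effects); the function name is made up for the example.

```python
def scale_cpu_percent(percent, measured_mhz, reference_mhz):
    """Rescale a CPU utilisation figure to a different clock speed.

    Assumes the same number of cycles per second is needed at either
    clock, so utilisation is inversely proportional to frequency.
    """
    return percent * measured_mhz / reference_mhz

# Roughly 40% at 700MHz, expressed as a share of a 1000MHz clock.
print(scale_cpu_percent(40.0, 700, 1000))
```

Under that assumption, around 40% at 700MHz is about 28% of a 1000MHz clock, which is in the same region as the 25% measured with force_turbo=1.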

(2014-08-16, 16:28)jesus225 Wrote: As you can see, swap is barely used.

OK but it is used, which may now mean the difference between working and not working.

I guess the amount of swap used may vary from system to system depending on how much free RAM there is to begin with (dependent on addons and other background services). I also had high colour textures enabled which could have saved me a few MB if disabled. GUI resolution I had set to "Auto", and seemed to be running at 720 judging by the obvious blur. My artwork textures are also very large (1080 fanartres, 720 imageres) which could also waste a few MB.

This is the bcmstat.sh output during my testing, with playback commencing at ~06:04:51 ending at ~06:19:57.

Check your /storage/.config/autostart.sh has had vm.swappiness=20 added to it correctly - run "sysctl vm.swappiness" at the command line, it should be set to 20.

Also, I'm testing with #0815b (newclock4) which may behave slightly differently to #0815 (newclock3).

(2014-08-16, 16:28)jesus225 Wrote: By the way, if you run your config256 script, gpu mem does not change automatically to 112 since the line is still commented out in config.txt.
Hmm, yeah. I'll fix that in the next build. Thanks.
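
For illustration, one way to handle a commented-out setting is sketched below in Python. This is not the code of configrpi256.sh; the function name is hypothetical, and it only shows the idea of replacing a key=value line whether or not it starts with "#", and appending the line if it is missing.

```python
import re

def set_config_value(text, key, value):
    """Set key=value in config.txt-style text.

    Replaces the first matching line, commented out or not; if there is
    no such line, appends one at the end.
    """
    pattern = re.compile(
        r"^[ \t]*#?[ \t]*" + re.escape(key) + r"[ \t]*=.*$", re.MULTILINE
    )
    line = f"{key}={value}"
    if pattern.search(text):
        return pattern.sub(lambda m: line, text, count=1)
    separator = "" if text.endswith("\n") or not text else "\n"
    return text + separator + line + "\n"

print(set_config_value("# gpu_mem_256=100\n", "gpu_mem_256", 112), end="")
```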
#27
(2014-08-16, 17:11)Milhouse Wrote: I guess the amount of swap used may vary from system to system depending on how much free RAM there is to begin with (dependent on addons and other background services). I also had high colour textures enabled which could have saved me a few MB if disabled. GUI resolution I had set to "Auto", and seemed to be running at 720 judging by the obvious blur. My artwork textures are also very large (1080 fanartres, 720 imageres) which could also waste a few MB.

This is the bcmstat.sh output during my testing, with playback commencing at ~06:04:51 ending at ~06:19:57.

Check your /storage/.config/autostart.sh has had vm.swappiness=20 added to it correctly - run "sysctl vm.swappiness" at the command line, it should be set to 20.

Also, I'm testing with #0815b (newclock4) which may behave slightly differently to #0815 (newclock3).

I checked swappiness in my setup:

Code:
OpenELEC:~ # sysctl vm.swappiness
vm.swappiness = 20
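
The same check can be scripted. The Python sketch below parses a line in the format sysctl printed above, and also reads the value directly from /proc/sys/vm/swappiness, where Linux exposes it as a plain integer. The function names are made up for the example.

```python
from pathlib import Path

def parse_sysctl_line(line):
    """Parse a line such as 'vm.swappiness = 20' into ('vm.swappiness', 20)."""
    name, _, value = line.partition("=")
    return name.strip(), int(value.strip())

def current_swappiness(path="/proc/sys/vm/swappiness"):
    """Read the live value; the file holds a single integer."""
    return int(Path(path).read_text().strip())

print(parse_sysctl_line("vm.swappiness = 20"))
```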

As you said, I believe the amount of swap that is being used depends on background services as well as custom settings. Sometimes the same video needs 3% of swap and other times it needs 10%, especially if OE has been running for a long time.

Currently, my screen resolution is only 1680x1050 so I changed the limit from "Auto" to "Unlimited" in order to get rid of the blur and the performance is still good. I don't know if people with 1080p screens would notice lack of performance running GUI at full HD.

Does your newclock4 build include RPI256 optimizations as well?

Best regards.
#28
(2014-08-17, 13:30)jesus225 Wrote: Does your newclock4 build include RPI256 optimizations as well?

Yes.
#29
Is anyone else able to confirm if the swap optimisations improve performance on OpenELEC when using a 256MB Pi?

If you're running a standard 4.0.x/4.1.x build of OpenELEC and don't wish to experiment with a Helix test build, the optimisation script can be installed as follows:

Code:
wget http://www.nmacleod.com/public/oebuild/configrpi256.sh -P /storage && chmod +x /storage/configrpi256.sh

then to run it (this only needs to be done the once):

Code:
/storage/configrpi256.sh

and reboot.

If you wish to monitor your Pi, I'd suggest installing bcmstat.sh:

Code:
curl -Ls https://raw.githubusercontent.com/MilhouseVH/bcmstat/master/bcmstat.sh -o /storage/bcmstat.sh && chmod +x /storage/bcmstat.sh

then to run it:

Code:
/storage/bcmstat.sh cgxd10

To revert the configrpi256.sh optimisations:
  • Delete /storage/.config/swap.conf
  • Remove the vm.swappiness entry from /storage/.config/autostart.sh (or delete this file if vm.swappiness is the only entry)
  • Change your gpu_mem_256 setting in /flash/config.txt to whatever value you were using previously (assuming it wasn't 112)
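
The second revert step could be scripted along these lines. This is a Python sketch only: it assumes the entry is a single line containing the text "vm.swappiness" (for example "sysctl -w vm.swappiness=20"; the exact line the script writes is not shown in this thread), and the function name is hypothetical.

```python
def strip_swappiness(autostart_text):
    """Return the autostart.sh text without any vm.swappiness line.

    An empty string means nothing else was in the file, so the file
    itself could be deleted.
    """
    kept = [
        line for line in autostart_text.splitlines()
        if "vm.swappiness" not in line
    ]
    return "\n".join(kept) + ("\n" if kept else "")

print(repr(strip_swappiness("#!/bin/sh\nsysctl -w vm.swappiness=20\n")))
```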
#30
Whenever free RAM runs low, xbmc freezes and ssh almost freezes. Reducing vm.min_free_kbytes does delay it, but it is not a fix, because several things can cause the low memory error; starting the Rio2 movie from the complete bluray iso is one good example.

I have a 256MB pi. For me a 128MB swapfile together with vm.swappiness = 100 works great. I agree it feels a bit weird to configure for maximal swap, but higher swappiness does seem to work better.

Without overclocking, the best setting for me is buffermode 3 in advancedsettings.xml. This disables filecache. It gives me ability to play almost all videos in my library.
Streaming very high bitrate movies, such as the movie Nonstop or Rio2, 45GB+ complete blurays, is a challenge. The list below shows my optimal settings for streaming.

1) Used nfs instead of smb because it needs less cpu
2) Did not add an nfs source in xbmc because it connects using tcp (udp needs less cpu).
3) Mounted the nfs share using the linux mount command; the settings to use are noatime, udp protocol and 32kB or larger rsize.
4) Enabled as much overclocking as possible in /flash/config.txt
5) Increased iBufferSize and min_chunk_size in File.cpp to 256KB
6) Increased READ_CACHE_CHUNK_SIZE in FileCache.cpp to 256KB
7) Increased FFMPEG_FILE_BUFFER_SIZE in DVDDemuxFFmpeg.h to 256kB
8) Added READ_CACHED and READ_CHUNKED to if(fp->Open(filename)) in DVDInputStreamBluray.cpp (see DVDInputStreamFile.cpp)
9) buffermode 1, 3MB cachemembuffersize and 99 readbufferfactor
10) Reduced m_messageQueue.SetMaxDataSize to 15MB in DVDPlayerVideo.cpp (see OMXPlayerVideo.cpp)
11) Increased m_messageQueue.SetMaxDataSize to 15MB in OMXPlayerVideo.cpp
12) sysctl net.ipv4.udp_rmem_min = 262144
13) sysctl vm.swappiness = 100
14) 128MB swapfile
15) Removed most of the Sleep occurrences in the OMXPlayer.cpp process while-loop.
16) Decreased thread priority for OMXPlayer.cpp to lowest possible.

With these settings I can stream 45GB bluray movies stored in RAR archives with DTS-HD MA 5.1 audio using hdmi passthrough.
I have Codecinfo open, and it reports VQ and AQ never go below 50% when nfs streaming the "DTS HD-Master Audio Sound Check 5.1 (Lossless)" sample downloaded here: http://www.demo-world.eu/trailers/high-d...ailers.php

My experience when using audio hdmi passthrough (so the pi does not need to decode DTS audio): If not overclocked, buffermode 3 is better. With overclocking, buffermode 1 works better. For the high bitrate movies, the pi needs to be overclocked for it to have the extra cpu and memory bandwidth that the filecache consumes. And for streaming the same high bitrate movies, the filecache is needed to keep the OMXPlayer messagequeue filled.

It would be great if the player could put data in the messagequeue faster without needing to use the FileCache. I suspect it would be better to eliminate the FileCache, but for some reason that I have not understood, the player does not seem to fetch data and put it into the messagequeue fast enough when it needs to wait for every nfs read io to complete. When filecache is enabled, the player seems to repeatedly loop and try to fetch data from the filecache, even if it is empty. This wastes cpu cycles that are better spent by the filecache filling itself, so I changed the OMXPlayer thread to the lowest possible priority and also removed most of the sleeps. Yes, I'm sure it is not optimal, and yes, it is very hackish, but it gives me better performance with high bitrate movies.

config.txt
(my pi seems to be overclock friendly)

gpu_mem_256=127
arm_freq=1130
core_freq=550
h264_freq=314
isp_freq=314
v3d_freq=314
sdram_freq=700
over_voltage=8
over_voltage_sdram=8
hdmi_mode=16
hdmi_group=1
hdmi_drive=2
hdmi_force_hotplug=1
hdmi_channel_map=8
dts_whole_frames=1
program_serial_random=1
config_hdmi_boost=2
init_emmc_clock=0x20c85580
emmc_pll_core=1
hvs_priority=13052
force_turbo=1
force_pwm_open=1
pause_burst_frames=1
avoid_fix_ts=1


advancedsettings.xml:

<advancedsettings>
  <fanartres>1080</fanartres>
  <imageres>720</imageres>
  <showexitbutton>false</showexitbutton>
  <cputempcommand>cputemp</cputempcommand>
  <gputempcommand>gputemp</gputempcommand>
  <videolibrary>
    <cleanonupdate>true</cleanonupdate>
  </videolibrary>
  <samba>
    <clienttimeout>30</clienttimeout>
  </samba>
  <libass>
    <styleoverrides>
      <style name="Shadow" value="0" />
      <style name="Blur" value="0" />
      <style name="BorderStyle" value="0" />
    </styleoverrides>
  </libass>
  <network>
    <curlclienttimeout>30</curlclienttimeout>
    <buffermode>1</buffermode>
    <cachemembuffersize>3145728</cachemembuffersize>
    <readbufferfactor>99</readbufferfactor>
  </network>
</advancedsettings>
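
The cachemembuffersize value is given in bytes, so the 3MB above is 3 * 1024 * 1024 = 3145728. For illustration only, a short Python sketch that builds just the three cache-related entries of the network section from a size in MB; the helper name is made up and this is not part of any build.

```python
import xml.etree.ElementTree as ET

def network_settings(buffermode, cache_mb, readbufferfactor):
    """Build a minimal advancedsettings.xml holding only the cache entries."""
    root = ET.Element("advancedsettings")
    network = ET.SubElement(root, "network")
    ET.SubElement(network, "buffermode").text = str(buffermode)
    # The buffer size is written in bytes, so convert from MB.
    ET.SubElement(network, "cachemembuffersize").text = str(cache_mb * 1024 * 1024)
    ET.SubElement(network, "readbufferfactor").text = str(readbufferfactor)
    return ET.tostring(root, encoding="unicode")

print(network_settings(1, 3, 99))
```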

You can try my build here: https://copy.com/VVqwl0r16Pmc
The patchfile for my Openelec 4.2 Gotham build is here: https://copy.com/jGofzg3Nk9q2

cd Openelec.tv
git checkout e8e995001fe71
patch -p1 < mypatchfile