VideoDatabase load speed and CFileItem
#1
While working on the hdflagging branch, there was a brief period of time where the performance hit associated with loading the extra stream information was discussed. Some generous xbmc users sent me their large databases (~1,000 movies) for performance testing.

What I found is that in loading the CFileItem list in CVideoDatabase::GetMoviesByWhere in ~5000ms, time was spent roughly in this proportion:
Code:
20%       CVideoInfoTag movie = GetDetailsForMovie(m_pDS);
...
71%       CFileItemPtr pItem(new CFileItem(movie));  
...
08%       m_pDS->next();
So the actual reading from the database only takes 20% of the load time, but the innocent looking CFileItem(VideoInfoTag &) was taking 70%. The dataset::next was a hotspot addressed by the sqlite scrolling patch and is now down to 4%, which still may be a little high but this thread is about CFileItem()!

What in the ctor is taking so much time? 90+% of it is spent in FillInDefaultIcon(). In fact, a call to FillInDefaultIcon:
2ms: smb://user:pass@server/path/to/file/name.avi
7.5ms: stack://smb://user:pass@server/path/to/file/name-cd1.avi, smb://user:pass@server/path/to/file/name-cd2.avi

For comparison, at 60Hz you only get 16.6ms to render a frame so 7.5ms is a huge number. The problem boils down to things like IsAudio() and IsVideo() etc taking too long, and these functions aren't just called here; they're even called from frame updates.

So a quick fix would be to have CFileItem(VideoInfoTag &) set it's default icon directly, since we know we've got a video or else there wouldn't be a VideoInfoTag. That reduces the load time on this test database from ~5s to ~2s. Delaying SetCachedVideoThumb() until it is needed reduces it further to 1.2s but that's beyond the scope of what I wanted to discuss.

Another quick performance increase would be to rearrange the if/else structure of FillInDefaultIcon() so that the statements are in descending result set size. Playlists are probably the smallest number of items anyone has but is always checked first over more common file types like audio and video. Should at least move IsPlaylist and IsXBE down below common media types.

Ok this is getting long but to sum up:
-- CFileItem(VideoInfoTag &) takes forever
-- Can we directly set the default icon instead of using FillInDefaultIcon()?
-- IsVideo, IsAudio et al take much longer than they should
Reply
#2
I talked with spiff this morning about this and he came up with the brilliant idea of checking for tag existence first to determine file type. Here are some results of those tests:

Baseline
Video database "bravo", 855 movies roughly 20% url-style:
4900ms

Case 1
An experiment was carried out to determine if changing IsVideo(et al) to use if HasVideoInfoTag() return true would increase performance:
4358ms

Case 2
An experiment was carried out to determine if Case 1 were modified to crossuse tags would increase performance. Crossuse of tags means IsVideo would check all 3 tag types and return true false false accordingly.
3551ms

Case 3
These 3 types (Music, Video, Picture) can be quickly determined in these cases, so let's move them ahead of other file types like playlist and xbe in FillInDefaultIcon():
Order Audio/Picture/Video (worst case): 2675ms
Order Video/Audio/Picture (best case): 2661ms

Adding these 3 lines of code to the 3 Is* functions has minimal performance impact to file mode, they're just NULL pointer checks, but can greatly increase the performance when working in library mode.

For reference here's the changes made for case 3
Reply
#3
Performance boosts are always appreciated Smile thanks for looking into this.
Please attach to a trac ticket (as a patch) so it doesn't get missed.

Thanks
TheUni
Reply
#4
I thought there would be more discussion about the various ways of modifying/caching/refactoring, but spiff went and jumped ahead of all that and went right to the best solution.

Also up for discussion is if these calls should be in that constructor:
FillInDefaultIcon();
SetCachedVideoThumb();
SetInvalid();

This is the only constructor that calls the first two functions. They are called later in a more generic way from the list they're loaded into. They can be safely removed, as checked from all callers (videodatabase, queueing an item, finding a music video, and linking a movie to tv show).

SetInvalid() I don't even know why that is called. Of course they're invalid they don't even have a layout until later. This function call does nothing and will never do anything so I'd recommend it be removed too.

I've also created a ticket at this time.
Reply
#5
digging through the commit logs, that is partly due to legacy and partly due to me blindly c&p'ing code some 2 years ago Smile they should definitely go
Reply
#6
We can use a variable (type AUDIO, VIDEO, FILELIST ) to store what type of fileitem its is before we call this function "FillInDefaultIcon". Inside this function we should check that variable.

We are calling CUtil::GetExtension(m_strPath, strExtension) too many times.

IsAudio, IsVideo, etc should be using that variable ..
Reply
#7
SetInvalid() should definitely go - it should already be done by Reset() or whatever it is anyway.

Mind reviewing the other constructors there for similar shenanigans?

In the future we may have to do what CrashX suggests - evolved probably to handle other circumstances (eg TVSHOW, MOVIE, MUSICVIDEO, <arbitrary type defined by user> etc.)

I'd ideally like to get rid of the specialisation of some stuff anyway (the *InfoTag stuff) and move towards using properties for everything - it'll mean a whole heap of code can be culled, though it does have the performance loss associated with a string key based map, which should either be changed to a hashmap or alternatively an integer based map with translation function. Hashmap will be the most convenient, assuming the hash function is fast enough.

Lastly, to get around all the GetExtension() shenanigans, we want to move CFileItem to use CURL as the storage for it's path. Any path-based manipulation will then be far faster (and can be moved out of that horrid CUtil class).

Volunteers for all the above most welcome Smile

Cheers,
Jonathan
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.


Image
Reply
#8
CrashX Wrote:We can use a variable (type AUDIO, VIDEO, FILELIST ) to store what type of fileitem its is before we call this function "FillInDefaultIcon". Inside this function we should check that variable.

We are calling CUtil::GetExtension(m_strPath, strExtension) too many times.

IsAudio, IsVideo, etc should be using that variable ..
I thought that at first too, but this seems to solve at least this problem without adding more members and additional complexity. The only issue I see with an enum like that is that really those functions are the only place the info is needed and it shouldn't be called multiple times (or at least not in any appreciable multiplicity) so you're still going to get the one time cost of figuring it out but not make any huge gains in use. I had already checked for places where the Is* are called on a per-frame basis and it doesn't appear to be the case so really there's just a load and then some time-insensitive cases of single use.

The GetExtension is really a killer due to needing to split CUrl()s recursively and, while it seems like a good idea on the surface, I don't think that using a CUrl for a filename would solve much because a CUrl holding stack://smb://,smb:// doesn't deep parse. You still get a protocol of stack:// and a filename of smb://,smb:// so you still need the recursive calls. I could be wrong but that's how it looks on the surface.

I also did a cursory glance of the other constructors but didn't profile them like I did for the CVideoInfoTag& case. They look ok but then again so would this case if I hadn't measured it.
Reply
#9
CapnBry:-

It will be called multiple times if one of the checks fails. ie IsVideo(). Also their is cost of using functions to look up this information. You should see a slightly better performance with this fix than yours. If I have the time, I will post a patch .. Hopefully you can compare the results with 1000+ database .. I am curious on the result ..

We shouldn't determine something twice when we have already done so. I agree in some cases it is pointless if we use it only once but it won't hurt to do it now than later on. Those functions are public hence their is a chance it will be called.
Reply
#10
Indeed, stacks are horrible things. Fortunately they're an edge case for the most part, except for people who keep lots of movies in the same folder.

Using CURL is nice primarily as it takes care of things such as options without us having to re-parse the URL all the time.

Are you using an actual profiler (if so which one?) or "poor-mans" profiling btw?

Cheers,
Jonathan
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.


Image
Reply
#11
Poor-mans profiling using QueryPerformanceCounter assigning function calls to buckets. I don't have any cool whizbang profilers. I've tried a bunch of sampling profilers in the past and never found them to be worth a damn, and the instrumenting ones are too expensive for me.

The enum type thing could be better but honestly it seemed like a lot of more typing than I wanted to get into. Also you still have to do the same number of initial checks so the only performance benefit would be from things that call Is* functions over and over, which doesn't seem like a lot of. It is the better solution in design, but again I didn't want to make a big change to the way the class worked. I did this while on conference calls at work today Smile

EDIT: Oh and if you want the database I used to test I can post it for you. It's not even mine.
Reply
#12
Exact same experience I've had with profilers, yeah. And the same method Wink

And yeah - most of the stuff I mentioned was focusing on the performance of the mere mortals who have to read and maintain it, rather than the performance of the code in use Smile

Cheers,
Jonathan
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.


Image
Reply
#13
CapnBry:-

Wow too many Is functions to convert ... You idea will be best for now ... Huh
Reply
#14
Yeah see the thing is there are really just like 5 types, but then there's location methods like IsLocal IsRemote IsInternetStream etc. So you can't really set 1 type and cover all the Is* functions, since they represent different things. That's why I was all "errrrrr this solves the immediate problem!"
Reply
#15
Similarly, it may be a good idea to order the Is* calls in FillInDefaultIcon() by the most likely first.

I can order them, and will probably do so once your patch is applied.
Reply

Logout Mark Read Team Forum Stats Members Help
VideoDatabase load speed and CFileItem0