Login at Kodi Home

Donger · 2013-04-07, 21:24

Hi There,

I think this is a great initiative, and it will be helpful in the final future state. I'm not a SQLlite expert by any means, but do have some experience in DB/query tuning - hopefully these ORCL/MSS concepts also apply here.

Couple of comments suggestions that may trigger some thoughts:

1) The number of joins on some of the views could be problematic, as previously stated. We would need to have indexes on the 'join' fields, and not necessarily the primary/surrogate keys.

2) In general we want to avoid upper functions in the 'where' clauses; this type of SQL will automatically result in a table scan (reducing performance). Better to do the upper on the actual variable before the SQL is fired; stuff like function-based indexes is an option also.

3) Beware of the 'monolithic view' or the 'one view fits all' scenario. Big fat views make it easier to code for the SQL writer (he/she only needs to know one view definition); however, as the view gets bigger over time, the performance cost of the joins outweighs the benefits.

4) Since we're in a 'read many write once' type of database, we can be agressive with the indexing, or even the denormalization for the data model for core user cases. My opinion: performance is way more important than disk space.

5) Assuming design changes go through, would existing data be migrated or recreated/rescraped?

It would be a good idea to get a few core test cases documented, so they can be traced to determine whether tuning changes help/hurt. What I would want to see is the actual SQL that is being fired (and not necessarily the view definitions), because the values in the bind variables are important.

Unsure about next steps, but I hope this helps. Excellent work!

Donger

Milhouse · 2013-04-07, 22:44

Interesting point about "read many write once" - is the XBMC database, in fact, a warehouse? Should (could) there be a transactional database that is written to by front end code, and a denomarlised warehouse for reading? How would data be transferred from the transactional database to the warehouse, a periodic background ETL task? Is this over thinking things? Smile

jmarshall · 2013-04-08, 00:58

Yes, that is essentially exactly what garbear's stuff on the picture and games libraries do - they use BSON to store objects in a table for fast retrieval, and everything else is available on other normalized plain-text tables for filtering and the like.

The disadvantage is it's not user-editable. I guess one could get around that problem by having a completely normalized database with the BSON tables sitting on top that are regenerated as needed.

A further thing to consider is exactly where to stop. For example, movies have a cast list. Should the cast list be in the BSON object for that movie? What about the actor's image? What about their ID, or (in future) their bio... Defining where to stop is not necessarily straight forward.

**Klojum** · 2013-04-08, 08:25

I don't know about other XBMC/OpenELEC users, but in general I have no interest at all in actor's images or their personal info. If I ever do, then I'll do some googling or pay IMDB.com a visit. Since the XBMC box usually has a link to the internet already, why not retrieve the info from IMDB.com only on a 'need to know' basis, and not for all movies during scraping?

Currently, the thumbs of actors are also stored per movie. Resulting in a popular actor having his/her identical photos stored what.. 10,20,30 times or more. I know some XBMC users have gigabytes and terabytes in storage at their disposal these days, but optimization in that area would be welcome. Perhaps a separate folder in the thumbnail cache could store distinct thumbnails of each actor/actress.

I know from experience that documentation is generally the least popular item among programmers. But would there be a possibility to create (and maintain) some functional requirements document for XBMC? Perhaps I haven't looked yet in the right places, but so far I have not found this document.

I used to be a programmer, I am now a software tester, but testing can only be done properly if you know what the test object is supposed to do. This document could come in handy in, for example: "What happens if I delete a movie's source? Will all movies using that source be removed from the database? And all subsequent database info, and/or disk files/thumbs?"

jmarshall · (This post was last modified: 2013-04-08, 10:27 by jmarshall.)

They're only stored on your local filesystem in that way (on export), not in the cache, where there's only a single image per actor.

As for documentation, it depends on the question you're asking. The answer to your question, for instance is "you're prompted whether to remove all movies in the source from the database. If you answer yes, it will remove all information from the database except for file-specific information (settings, playcount etc.) which is generally kept around unless Clean Library is run and the files themselves are removed. No obviously won't. It won't remove textures from the texture cache - they're designed to be cleaned up periodically."

This is quite different documentation than development documentation or user documentation. The problem, as always is finding someone with time who is prepared to coordinate it. All they need do is ask questions and collate the answers - devs are always willing to write down the answers.