GSOC 2013: Improved Database Layout
#1
Star 
Hello all,

I recently graduated from the Indiana University of PA with degrees in computer science and mathematics. I am starting grad school in the fall at Binghamton University. I am very much looking forward to my participation in the GSOC 2013. My original proposal follows:

Summary: Improved database layout - The existing SQL database layout for music and videos is steeped in history, not all of it good. The layout is non-optimal for the storage system that XBMC needs for data. The need is for a fast store from which movie information can be retrieved. It's mostly static information, and there's a lot of interlinked information (m:n maps) that need to be associated with each item and displayed when a list is fetched. i.e. either the information is grabbed in a single query, whereby you have a bunch of unnormalised data (e.g. a list of genres for each movie item) or the information is grabbed through multiple queries per item, whereby you can have everything nicely normalised. A hybrid solution might be best - some work has been started on this as part of garbear's piccture library project. Various other suggestions and notes are available on the forums and wiki.

How will I achieve this: The forum suggestions linked to in the proposal seem like the best starting point and would accomplish many of the improvements desired. A particular benefit of this is it allows XBMC to continue using a SQLite database while greatly increasing the databases ability to handle sharing the data to other clients.

Goals:
API methods to: define a type of artwork, define a type of media (Home Movies, etc.), add movies to one or more sets
Get a working version of Frodo using DecK's database idea (or a modified version thereof)

After some discussion with members of Team XBMC, I am going to be working on a stand-alone application, which will act as a proof-of-concept and testbed for the new database and corresponding API.
Reply
#2
Star 
One thing I would like to see is standardized support for cdart, banners and logos.

And also a change in some of the music fields as we no longer use allmusic as default which is what the original design is based on
Reply
#3
Standardizing artwork was one of the primary reasons for me wanting to get involved on the database project.

We are going to define certain "archetypal" nodes for the (e.g. video) database. For example: source, scraper, art, videos type, etc.

Then, we can define any arbitrary number of these nodes. For artwork, say, we can add to the artwork node table "banner", "logo", "discart", etc. And then the user, in the future, could add any more of these as desired. Consequently, we can have any type of art easily associated with anything else. So, for example, sets could have posters and fanart as well as logos and banners.

Hopefully this moves us towards the standardization you meant.
Reply
#4
Uhm isnt that already kinda possible? I mean atm we can add any artwork we want for videos. It is just music that is left behind. Also check some of the latest changes to the music db.
Read/follow the forum rules.
For troubleshooting and bug reporting, read this first
Interested in seeing some YouTube videos about Kodi? Go here and subscribe
Reply
#5
Adamsey
I've been off the grid lately, real life is getting in the way.. But please feel free to reach out to me if you need anything.
Reply
#6
(2013-05-29, 06:41)Martijn Wrote: Uhm isnt that already kinda possible? I mean atm we can add any artwork we want for videos. It is just music that is left behind. Also check some of the latest changes to the music db.

Yes, to my knowledge it is. It was unclear, but I was using video as an example, since this is what I have been looking at most recently. I think the new database schema will, though, make all of this a lot more normalized, even for the videos where the functionality is existing. However, while the way different types of artwork are declared/stored in the database will change, this isn't really the big improvement that I think will come from the new schema. Much better will be the ability to declare new types, such as "Documentaries" or "Home Movies" rather than the monolithic "TV Shows" and "Movies"; also, adding arbitrary attributes such as (off the top of my head) "cinematographer(s)" or (to give a music example) "producer(s)" rather than having fixed values of just "director", "artist", "actor", etc.

Like I said though, I've been thinking a lot more about the schema changes for the video db most recently, so I welcome any suggestions towards improving the music and picture schemas.

As the community bonding period is beginning, I'm looking forward to this feedback from the community. Also, I'm taking my time to get familiar recent updates, both to the music db and with the picture db, as in garbear's branch.

P.S. -- DecK, thank you for your offer and also your inspiration for my proposal.
Reply
#7
IMO it'll be good to have one generic media table which holds common/shared meta data so that in future we can have mixed content type lists etc with a single query. This table should ideally contain all basic info used in basic list views (name, abstract, thumb, ...). It might require to implement some sort of lazy-loading into our VFS items to dynamically load related child nodes/objects/data on demand, rather then aggregating everything on before hand even if f.e. nobody is scrolling to the item and would use/see this information.
Looking forward to your db schema.
Reply
#8
(2013-05-29, 07:25)adamsey Wrote: As the community bonding period is beginning, I'm looking forward to this feedback from the community. Also, I'm taking my time to get familiar recent updates, both to the music db and with the picture db, as in garbear's branch.

P.S. -- DecK, thank you for your offer and also your inspiration for my proposal.

I think you're on the right track and I would like to see ability of making more video sections with it's own filters instead of everything being put in Movies:

Movies
-recently added
-by genre
-by year....
Documentaries
-recently added
-by rating
-by folder
-by date...

It would be nice if video clips could also be scanned into library so that kids can get to them without browsing folder structure and so that you can add your own plot/descriptions via NFO files if you choose so.
My skins:

Amber
Quartz

Reply
#9
Wink 
Congratulations and thanks for taking this on.

A generic category of videos, 3 or 4 levels deep, would be quite useful for Home Movies and other miscellaneous groups of videos. Similar to TV Shows (Series>Seasons>Episodes) or Music Videos ( Artist>Albums>songs/videos) but allowing text fields at all levels. Actually, it could be exactly like Music Videos (structure) but with a scanner that extracts meta data locally, from file name or nfo's.

Will the new structure accommodate this?
Reply
#10
Make it way more abstract than tvshows or movies or whatever.
Make music,video and whatever the same structure and you can assign what it actualy is. i.e. a video file can be marked as tv show and documentary. Make it freely custimisable what they are (something like tags).
Read/follow the forum rules.
For troubleshooting and bug reporting, read this first
Interested in seeing some YouTube videos about Kodi? Go here and subscribe
Reply
#11
My 2 cents: Is there need for separate video and music databases with separate schemas? Wouldn't it be better to have a sinlge media database to allow arbitrary user-configurable relations between entries?

I might want to look up a movie's theme song from its database entry (browsing the movie library) as well as from the album's (which might just be a collection of theme songs) entry in the music library. I might also want to access movie stills (pictures) from the movie's information page or based on actors/characters shown on the picture. The actor may also be a music artist at the same time, so I might want to find that same movie still based on a song I'm listening to. If XBMC should support an ebook library in the future, I might want to find the book a movie was based on. I'd possibly want to find prequels/sequels to a movie or tv-show spinoffs and movies that feature a certain song or songs by an artist I have just another song of, and so on...

I think it's impossible to anticipate what cross-relations a user will expect from the database in the future, so it's best not to introduce unnecessary barriers.
Reply
#12
(2013-05-29, 16:56)LongMan Wrote: Congratulations and thanks for taking this on.

A generic category of videos, 3 or 4 levels deep, would be quite useful for Home Movies and other miscellaneous groups of videos. Similar to TV Shows (Series>Seasons>Episodes) or Music Videos ( Artist>Albums>songs/videos) but allowing text fields at all levels. Actually, it could be exactly like Music Videos (structure) but with a scanner that extracts meta data locally, from file name or nfo's.

Will the new structure accommodate this?

The table for these collections will have columns for both ID and Parent ID, so they should be able to be arbitrarily nested, in that there will be collections (e.g. "Movies") which can have sub-collections, sub-sub-collections, etc.

(2013-05-29, 18:22)don.corleone Wrote: My 2 cents: Is there need for separate video and music databases with separate schemas? Wouldn't it be better to have a sinlge media database to allow arbitrary user-configurable relations between entries?

I might want to look up a movie's theme song from its database entry (browsing the movie library) as well as from the album's (which might just be a collection of theme songs) entry in the music library. I might also want to access movie stills (pictures) from the movie's information page or based on actors/characters shown on the picture. The actor may also be a music artist at the same time, so I might want to find that same movie still based on a song I'm listening to. If XBMC should support an ebook library in the future, I might want to find the book a movie was based on. I'd possibly want to find prequels/sequels to a movie or tv-show spinoffs and movies that feature a certain song or songs by an artist I have just another song of, and so on...

I think it's impossible to anticipate what cross-relations a user will expect from the database in the future, so it's best not to introduce unnecessary barriers.

Truly, thank you so much for this comment. I seem to have been thinking about this all wrong, and, in fact, misreading some of the proposals for the new schema. What you have suggested will be exactly the case: all of the content, collections, etc. for any types of media will be all together in the same schema. It should then be able to support arbitrary links between content, like we can now link a TV show and a movie, for example.

I am particularly fond of your idea of ebooks and linking them to movies, and I will do my best not to impose any arbitrary restrictions on what links can exist.
Reply
#13
(2013-05-29, 11:30)da-anda Wrote: It might require to implement some sort of lazy-loading into our VFS items to dynamically load related child nodes/objects/data on demand, rather then aggregating everything on before hand even if f.e. nobody is scrolling to the item and would use/see this information.

I don't think this can be stressed enough. Low power devices like the Raspberry Pi - which are perfectly capable of rendering most media content - often struggle with large libraries because of the lack of effective lazy loading. The Pi makes for a good XBMC client but is really only let down by the GUI/library performance.

IMHO, all future GUI/database development should be smoke tested on a Raspberry Pi - if it runs like a dog, it needs more work. A 1 or 2-second delay on a quad-core 3GHz+ x86 box might go unnoticed, but that could translate into 10+ seconds (or even much longer) on a Pi, so the Pi can often be an early warning of potential performance problems/bottlenecks.

/Pi haters flame on... Wink
Texture Cache Maintenance Utility: Preload your texture cache for optimal UI performance. Remotely manage media libraries. Purge unused artwork to free up space. Find missing media. Configurable QA check to highlight metadata issues. Aid in diagnosis of library and cache related problems.
Reply
#14
(2013-05-29, 22:21)MilhouseVH Wrote: /Pi haters flame on... Wink

The weak CPU stress test is actually always done to some degree these days. It's necessary for the Apple TV and any Android devices like the Pivos, which are admittedly faster than the Pi, but still EXTREMELY slow compared to a typical i5 or i3 Intel CPU.
Reply
#15
According to some of my tests, the main problem of rpi and all flash based distribution is much more the slowness of the disk than cpu for database.

The more random access the slower it will be and such a complex multi row link thing will be slow Smile But the benefits are so good over that.

I guess for those platform the only performance solution would be to find some kind of cache system.
Reply

Logout Mark Read Team Forum Stats Members Help
GSOC 2013: Improved Database Layout1