Help to rewrite the whole SQL Database? - Printable Version +- Kodi Community Forum (https://forum.kodi.tv) +-- Forum: Development (https://forum.kodi.tv/forumdisplay.php?fid=32) +--- Forum: Kodi Application (https://forum.kodi.tv/forumdisplay.php?fid=93) +--- Thread: Help to rewrite the whole SQL Database? (/showthread.php?tid=128054) Pages:
1
2
|
Help to rewrite the whole SQL Database? - MatsF - 2012-04-07 Hi, I have since a couple of years used the XBMC database directly (through ODBC) to
I have noticed a couple of things with it.
I have been waiting for Eden to see if you were going to change the structure of the database but now I can see that it is not done any major changes. My question is simply if you are interested in me rewriting the whole database and (among a lot of other things) make it normalized? That is having the right tables, the right connections between the tables and the right fields in the right tables. I will not go into much details if you are not interested but some of the differences would be:
(I think) there is a smart way of creating new versions of a database in XBMC. That is one of the reason why a total rewrite wouldn't be that hard to implement. And it is often impossible to do all the coding changes in the same time but that is also easy to use. For all the reading and hopefully a lot of the writing (depending on how SQLlite/MySQL is handling that, and that I don't know) of the database it is possible to use SQL views that looks exactly as the old database looks like, making the transformations easy and the code-changing a process that can be done whenever all the different programmers have the time to do it. This is not any criticism of how the database looks right now. It is very common that the database gets like this after some years of using when building a lot of new features onto the first version. In my profession I have been working with MS SQL Server for over 15 years. /Mats RE: Help to rewrite the whole SQL Database? - jmarshall - 2012-04-07 You're allowed to critique the current video database - we don't get offended easily, and likely know the faults better than anyone Compare though, with the other databases - the video database is somewhat unusual in this respect - it is how it is primarily due to faults with the ancient version of sqlite that we used back in the xbox days that couldn't alter tables (thus the c## columns were used to allow simple extension, particularly while the library was under rapid development). We're not interested in just replacing the existing layout with another layout. We are interested in replacing the layout with something that allows generality (quite possibly at the cost of a certain amount of unnormalised data - after all, normalisation is not always appropriate.) There's been several aborted attempts at this. I believe the best last version was this: http://wiki.xbmc.org/index.php?title=Database_Schema_4.0a. Related thread is here: http://forum.xbmc.org/showthread.php?tid=73831&highlight=Database+4.0 In particular, what we're after is allowing the support of arbitrary types of content, arbitrary collections of that content (with attached metadata) and arbitrary (where it makes sense) metadata fields for content or collections. Further, all layers in the library hierarchy (filtered lists and the like) must allow retrieval using ideally a single query to get all information required for initial display of list and sorting. Additional information may then be retrieved with individual queries if necessary. Take a look through the thread/wiki I linked above as a starting point. Once you've absorbed that, if you're still interested, let us know! Cheers, Jonathan RE: Help to rewrite the whole SQL Database? - MatsF - 2012-04-07 I am glad that you are not easily offended And I agree, there are always reasons why a database looks like it does and the C##-columns isa perfect example of that now that you explained it. And my interest is of course also to make a layout that works with all of your current and future ideas of what XBMC should hold. I have read what you asked me to (although one of the links didn't work but a search fixed that). In my opinion, what you are trying to do is kind of the opposite of what an SQL database are made for. The database structure you are planning creates:
Instead what I think you should do is to make a structured normalized database (that of course can take care of everyhing you need regarding your current and future ideas) and then make the complex queries in the few cases where it is needed. It will both be a faster, more general way to do it than your plan and also a lot more easy to program using that database structure. And if you for some reason that I don't see right now want/need that not logical structure then it can be created with views using Union Select. Disclaimer: Obviously you know a lot more about XBMC and what your current and future ideas are so I can just answer on what i have seen so far and maybe there is a showstopper somewhere for my idea that I just don't know about right now. If you still don't agree with me and you think it is worth the time, please explain what your idea of the structure will solve in a better way (or maybe is unsolvable with my suggestion), because I have looked at this for a couple of hours now and I don´t see that at all. Not when I look at the database structure and definitely not when looking at the examples in the other thread . Or maybe I´m just old fashioned ;-) /Mats RE: Help to rewrite the whole SQL Database? - MatsF - 2012-04-07 This part disappeared from the above post for the simple reason that I forgot to paste it in before posting I thought EAV/CR was mainly for when you have a lot of different columns and most of them were never used. I also thought that EAV/CR more or less isn't used anymore and that XML-fields were used instead (where it was needed). This is from Wikipedia and about when EAV is better to use than standard :
In my opinion neither of them applies to XBMCs database. RE: Help to rewrite the whole SQL Database? - jmarshall - 2012-04-08 I suspect you hit the wrong page. We're not wanting EAV/CR at all - that was basically an attempt at something even more general which was thankfully dialed back quite a bit! Here's the correct link: http://wiki.xbmc.org/index.php?title=Database_Schema_4.0a The reason we're looking at genericity (i.e. general "content" table that applies to all video types, and a general "collections" table that applies to natural groupings of that content (tvshows/documentaries/anime/seasons etc.) is that we want the database to be as flexible as possible - there's no points having an episode-specific table and a movie-specific table along with an anime-episodes or documentaries or mini-series or.... - when most of the fields remain the same. Further, we don't want to restrict the many:many mappings for attributes (genres, actors, directors, tags, keywords etc.) too much, as we almost guarantee there'll be another one we need. Whilst we could use a separate table+linktable(s) per attribute mapping, the tables will essentially be identical, and thus we'd need a further map (or set of functions) to generate the queries. Mind you, this is IMO a minor point compared to the basic idea of content-genericity. (Note that currently we have neither: we have 4 different content tables, and about 7 attribute tables with thus 28 link tables - as you can appreciate, things quickly get out of hand...) Cheers, Jonathan RE: Help to rewrite the whole SQL Database? - MatsF - 2012-04-08 I tried the same way (with your first links) again, starting with your link that didnt work, and did a search. Yesterday 4.0 came up (without me noticing the lack of "a") and this time 4.0a came first... But your link to the forum was for 4.0. I have tried to search for 4.0a but cant find any thread for that. Is there any? RE: Help to rewrite the whole SQL Database? - MatsF - 2012-04-08 I had some time today so I started to look at 4.0a, although I don’t know what has been said of the database structure and the other things I wrote down, maybe everything is gone through already.... Then I got a bit carried away. What I am trying to say is that I don’t know if you even are interested in what I have to say, and if you aren’t then it is no idea for me to put down any more hours in this. When I realized that, I stopped working, and I am not finished. I liked the 4.0a a lot more than the one I looked at yesterday Here are some suggestions, both small and large, in no particular order. I just wrote down all of the things I saw while looking through it. There should be a lot of “maybe”, IMO and “for example” that I didn’t wrote in the text below:
A diagram with small changes (mostly it is your 4.0a). So tell me what, if anything, we do next about this. /Mats RE: Help to rewrite the whole SQL Database? - MatsF - 2012-04-09 I don't know why the diagram doesnt show up. But here it is as a link. RE: Help to rewrite the whole SQL Database? - jmarshall - 2012-04-09 Wow - nice work. Lots to get through, but you put the effort in, so I can at least do the same! 1. Will keep it in mind - a nice to have for debugging, I can see - not sure it needs to be in absolutely every table, but definitely useful in the content table and perhaps files/path. 2. Not sure what you mean here? 3. Yes, we definitely want online/offline status to be apparent. Storing it in the db is fine as theoretically at least our file-finders should pick up if they're available or absent. However, we do need to be careful about how we handle files that are deleted - we should be able to tell this pretty easy though (folder is present but file isn't -> flag as deleted). Perhaps a status field? 4. Part of the reason it's in the file table is because then it works in the current file view (even when not scanned) - in the future we can assume everything will be scanned, so in this case it's user-content information (per-user, so it would have to be linked to the profile in some way) 5. We don't store when things were played (play history) we just store a number of times played. This could be changed if we had a play history (again tied to a profile). 6. I think we probably need a type field in the content table? That way we can pick out any shows/albums etc. simply by SELECT FROM content WHERE type="movie" (suitably normalized). 7. Agreed - it does get a little tricky with our VFS though as it's not necessarily heirarchical, so combining file + path to get actual playable item might be tricky. 8. Yup - agreed. One thing you've dropped (which personally I agree with) is that with this layout you can really only assign a scraper to a particular source, rather than a path. Currently you can assign a scraper to any path (and switch scrapers half way down the tree). Personally I don't see a problem with restricting this - my guess is only "power users" even know about this feature. 9. Could do, yes. 10. I guess potentially this could be per-profile (it stores client-specific things with respect to a shared library) 11. It could be in the file table, yup. 12. I dunno what consensus is here, but I have no issue renaming stuff so they're global names wrt to the db. 13. See 6. 14/15/16 Kinda tangential to this. The new DB will be developed independently from XBMC (for ease and sanity) as much as possible. I doubt we'll ever completely separate them out into separate executables, however, though there may well be a separate server build available. 17. I guess this is normalisation - paths typically have more than one file in them, and we'd want to be able to quickly grab all the files by path (or technically, all the content by path). Actually, this brings up quite a separate issue. It would be useful to be able to define virtual file items (typically a folder) that represent content. i.e. a better mapping from a nicely cleaned up/stacked down etc. file list to content. We currently do this via the basePath/parentPath members of the movie table (likely some of the c## fields). I think the next step forward is refining the schema, and ensuring that everything we want to be able to do is easily queried, plus that it gives us the flexibility we need. Then get up a sample db running that allows queries to be tested out and further refinement. I'm quite happy to start coding something up at that point. Cheers, Jonathan RE: Help to rewrite the whole SQL Database? - MatsF - 2012-04-09 Thank you for that
Now I have seen that you have done a really nice job with the new structure and have done a lot more than I could see when I started this thread. But I still think I can be of help, the question is do you need or want it? If I can be of any assistance in that project, just let me know. /Mats RE: Help to rewrite the whole SQL Database? - jmarshall - 2012-04-09 That was quick 2. Do you mean dropping idCollection from the content table and instead just using the collectionlinkcontent link table? I'm not sure under which circumstances a piece of content can be in more than one collection (assuming collection here means something like "TV Show") - I guess technically some episodes can be in 2 "different" seasons (the season they air during, and "specials"). 3. Agreed - the only reason for more than just Online/Offline is that in some cases we'll know that not only the item is offline, but it's actually most likely deleted (we'd mark this differently in the UI). Either way, it's the same basic idea. 4. The current db (i.e. in Eden) has playCount in the files table. Personally I think this needs to be a profile-level setting. Possibly also we may wish to consider storing play history (i.e. a list of dates when the file was played - atm we just have playcount and lastPlayed), though I suspect this is getting a little fine-grained for most users. We definitely need the profiles in the database and most things associated with the profile (user rating, playcount, settings possibly tied to that rather than just to the file/content). 5. What I meant is we don't store the history - we do store lastPlayed in the Eden database (in the files table). 6. Hmm, still not sure I get this one. Documentaries would be a sub-collection of TV Shows, but would there be links between the collection "TV Shows" and the content? eg. if we have 2 tv shows "Planet Earth" and "Mad Men" then one is a documentary and one is just a TV Show (no subcategory). How do you see the link tables being setup between the "TV Shows" and "Documentaries" items, the shows, and the episodes in the show? 17. Actually, this is referring to playable items pretty much (episodes, movies, music videos, clips etc) - often movies for instance are actually a folder on disk (or a virtual folder such as a rar or zip) or could be a folder of folders (such as a dvd or bluray folder layout). When the user is browsing their files, they're not really interested in the layout of the files for that movie beyond the fact that the folder item represents it. They don't want to have to click into the folder then click on the movie, they just want it to play when the click on it immediately. Thus, we need something similar to the files table representing these virtual items on the disk - it would have idPath and idContent, possibly idFile and other information (eg hash info so we can detect when things inside that folder changed). As for where to next, this is currently in limbo due to no-one pushing it forward, so if you'd like to be involved, then great - as I say, it seems to me that the first step is refining the schema to the point we think it has everything we need (we'll likely not know for sure until it's been implemented) and then get a test framework up and running. Cheers, Jonathan RE: Help to rewrite the whole SQL Database? - da-anda - 2012-04-09 sorry - I haven't read the 4 huge posts of you guys yet, I just wanted to mention that I'd really love to work on a new DB layout. Developing domain driven (DDD) for some years now, I have a pretty good sense for abstracting and generalizing data. I also know about the requirements of initially simple queries for basic media listing + on demand queries. The php framework I'm coding with has a nice concept for this - it's called "lazy loading". Basically you have a "repository" (a class that returns objects of a certain type, abstracting the sql stuff from the application layer) that fetches raw DB data from the primary object table and returns it as ready to use objects. Object properties that are relations to other tables can either be loaded eager (so instantly on creating the object by triggering subqueries) or lazy, which means that the object property itself only holds a proxy class that will load the according data/object on demand/first use. This is pretty nice to work with and I think that's also how it should work in XBMC on the long run, because it would easily allow the desired generalized DB schema. RE: Help to rewrite the whole SQL Database? - topfs2 - 2012-04-09 (2012-04-09, 15:03)da-anda Wrote: sorry - I haven't read the 4 huge posts of you guys yet, I just wanted to mention that I'd really love to work on a new DB layout. Developing domain driven (DDD) for some years now, I have a pretty good sense for abstracting and generalizing data. I also know about the requirements of initially simple queries for basic media listing + on demand queries. The php framework I'm coding with has a nice concept for this - it's called "lazy loading". Basically you have a "repository" (a class that returns objects of a certain type, abstracting the sql stuff from the application layer) that fetches raw DB data from the primary object table and returns it as ready to use objects. Object properties that are relations to other tables can either be loaded eager (so instantly on creating the object by triggering subqueries) or lazy, which means that the object property itself only holds a proxy class that will load the according data/object on demand/first use. This is pretty nice to work with and I think that's also how it should work in XBMC on the long run, because it would easily allow the desired generalized DB schema. Must say that having a way to handle linking to other databases is a great thing, that would allow us to for example link a fileitem to a movie on themoviedb and thus just have a form of cache which acts themoviedb, that way we could schedule updates from themoviedb as needed but also if a fileitem is wrongly classed as a movie we can just remove that link. RE: Help to rewrite the whole SQL Database? - jmarshall - 2012-04-10 Another idea is just dropping SQL and using a document store (eg Kyoto Cabinet with a JSON document for value). Though we'd lose the experience of those that deal with SQL, we might gain with a c++ style interface (map:reduce implementation, cursors etc.) RE: Help to rewrite the whole SQL Database? - xbmcfanboy0 - 2012-04-10 I have experience with creating a data abstraction layer within VB6 in a manner similar to that described by da-anda. Just this weekend I was telling some old co-workers about how I felt xbmc could benefit from such a framework. Although I know very little c++, this topic interests me and gives me something to do to in c++. |