Help to rewrite the whole SQL Database?

  Thread Rating:
  • 2 Votes - 4.5 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Post Reply
jmarshall Offline
Team-XBMC Developer
Posts: 24,523
Joined: Oct 2003
Reputation: 138
Post: #11
That was quick Smile

2. Do you mean dropping idCollection from the content table and instead just using the collectionlinkcontent link table? I'm not sure under which circumstances a piece of content can be in more than one collection (assuming collection here means something like "TV Show") - I guess technically some episodes can be in 2 "different" seasons (the season they air during, and "specials").

3. Agreed - the only reason for more than just Online/Offline is that in some cases we'll know that not only the item is offline, but it's actually most likely deleted (we'd mark this differently in the UI). Either way, it's the same basic idea.

4. The current db (i.e. in Eden) has playCount in the files table. Personally I think this needs to be a profile-level setting. Possibly also we may wish to consider storing play history (i.e. a list of dates when the file was played - atm we just have playcount and lastPlayed), though I suspect this is getting a little fine-grained for most users. We definitely need the profiles in the database and most things associated with the profile (user rating, playcount, settings possibly tied to that rather than just to the file/content).

5. What I meant is we don't store the history - we do store lastPlayed in the Eden database (in the files table).

6. Hmm, still not sure I get this one. Documentaries would be a sub-collection of TV Shows, but would there be links between the collection "TV Shows" and the content? eg. if we have 2 tv shows "Planet Earth" and "Mad Men" then one is a documentary and one is just a TV Show (no subcategory). How do you see the link tables being setup between the "TV Shows" and "Documentaries" items, the shows, and the episodes in the show?

17. Actually, this is referring to playable items pretty much (episodes, movies, music videos, clips etc) - often movies for instance are actually a folder on disk (or a virtual folder such as a rar or zip) or could be a folder of folders (such as a dvd or bluray folder layout). When the user is browsing their files, they're not really interested in the layout of the files for that movie beyond the fact that the folder item represents it. They don't want to have to click into the folder then click on the movie, they just want it to play when the click on it immediately. Thus, we need something similar to the files table representing these virtual items on the disk - it would have idPath and idContent, possibly idFile and other information (eg hash info so we can detect when things inside that folder changed).

As for where to next, this is currently in limbo due to no-one pushing it forward, so if you'd like to be involved, then great - as I say, it seems to me that the first step is refining the schema to the point we think it has everything we need (we'll likely not know for sure until it's been implemented) and then get a test framework up and running.

Cheers,
Jonathan

Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.


[Image: badge.gif]
find quote
da-anda Offline
Team-XBMC Member
Posts: 1,382
Joined: Jun 2009
Reputation: 27
Location: germany
Post: #12
sorry - I haven't read the 4 huge posts of you guys yet, I just wanted to mention that I'd really love to work on a new DB layout. Developing domain driven (DDD) for some years now, I have a pretty good sense for abstracting and generalizing data. I also know about the requirements of initially simple queries for basic media listing + on demand queries. The php framework I'm coding with has a nice concept for this - it's called "lazy loading". Basically you have a "repository" (a class that returns objects of a certain type, abstracting the sql stuff from the application layer) that fetches raw DB data from the primary object table and returns it as ready to use objects. Object properties that are relations to other tables can either be loaded eager (so instantly on creating the object by triggering subqueries) or lazy, which means that the object property itself only holds a proxy class that will load the according data/object on demand/first use. This is pretty nice to work with and I think that's also how it should work in XBMC on the long run, because it would easily allow the desired generalized DB schema.
find quote
topfs2 Offline
Team-XBMC Developer
Posts: 3,825
Joined: Dec 2007
Reputation: 8
Post: #13
(2012-04-09 15:03)da-anda Wrote:  sorry - I haven't read the 4 huge posts of you guys yet, I just wanted to mention that I'd really love to work on a new DB layout. Developing domain driven (DDD) for some years now, I have a pretty good sense for abstracting and generalizing data. I also know about the requirements of initially simple queries for basic media listing + on demand queries. The php framework I'm coding with has a nice concept for this - it's called "lazy loading". Basically you have a "repository" (a class that returns objects of a certain type, abstracting the sql stuff from the application layer) that fetches raw DB data from the primary object table and returns it as ready to use objects. Object properties that are relations to other tables can either be loaded eager (so instantly on creating the object by triggering subqueries) or lazy, which means that the object property itself only holds a proxy class that will load the according data/object on demand/first use. This is pretty nice to work with and I think that's also how it should work in XBMC on the long run, because it would easily allow the desired generalized DB schema.

Must say that having a way to handle linking to other databases is a great thing, that would allow us to for example link a fileitem to a movie on themoviedb and thus just have a form of cache which acts themoviedb, that way we could schedule updates from themoviedb as needed but also if a fileitem is wrongly classed as a movie we can just remove that link.

If you have problems please read this before posting

Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.


[Image: badge.gif]

"Well Im gonna download the code and look at it a bit but I'm certainly not a really good C/C++ programer but I'd help as much as I can, I mostly write in C#."
find quote
jmarshall Offline
Team-XBMC Developer
Posts: 24,523
Joined: Oct 2003
Reputation: 138
Post: #14
Another idea is just dropping SQL and using a document store (eg Kyoto Cabinet with a JSON document for value). Though we'd lose the experience of those that deal with SQL, we might gain with a c++ style interface (map:reduce implementation, cursors etc.)

Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.


[Image: badge.gif]
find quote
xbmcfanboy0 Offline
Junior Member
Posts: 34
Joined: Mar 2012
Reputation: 4
Post: #15
I have experience with creating a data abstraction layer within VB6 in a manner similar to that described by da-anda. Just this weekend I was telling some old co-workers about how I felt xbmc could benefit from such a framework. Although I know very little c++, this topic interests me and gives me something to do to in c++.
find quote
sebak Offline
Junior Member
Posts: 31
Joined: Mar 2010
Reputation: 0
Location: Belgium
Post: #16
(2012-04-09 21:19)topfs2 Wrote:  
(2012-04-09 15:03)da-anda Wrote:  sorry - I haven't read the 4 huge posts of you guys yet, I just wanted to mention that I'd really love to work on a new DB layout. Developing domain driven (DDD) for some years now, I have a pretty good sense for abstracting and generalizing data. I also know about the requirements of initially simple queries for basic media listing + on demand queries. The php framework I'm coding with has a nice concept for this - it's called "lazy loading". Basically you have a "repository" (a class that returns objects of a certain type, abstracting the sql stuff from the application layer) that fetches raw DB data from the primary object table and returns it as ready to use objects. Object properties that are relations to other tables can either be loaded eager (so instantly on creating the object by triggering subqueries) or lazy, which means that the object property itself only holds a proxy class that will load the according data/object on demand/first use. This is pretty nice to work with and I think that's also how it should work in XBMC on the long run, because it would easily allow the desired generalized DB schema.

Must say that having a way to handle linking to other databases is a great thing, that would allow us to for example link a fileitem to a movie on themoviedb and thus just have a form of cache which acts themoviedb, that way we could schedule updates from themoviedb as needed but also if a fileitem is wrongly classed as a movie we can just remove that link.

da-anda is talking about an Object-relational mapper (I'm guessing doctrine, which is a php implementation of the java orm), I'm pretty sure there must be good c++ implementations around. However, I don't really see how that helps with linking to other databases...
find quote
Noodle Offline
Senior Member
Posts: 159
Joined: Jan 2005
Reputation: 0
Post: #17
Obviously, MatsF is a great enterprise application designer. Lot of good ideas, for example: more meaningful column names, DateCreated and DateChanged for every table.

But I have one concern about delete flag. If this design is for a separate XBMC database management project for large/huge amount of content, and historical data is required, then delete flag (Delete date maybe better for debug purpose) is useful. But most XBMC user use it for home media center, unless XBMC team can provide good database maintain tools/scripts, a database full with out-of-date data could be pretty mess. Many of user is looking for a tool to clean up Thumbnails already. I can imaging, soon, user will start to looking for database clean up tool.

Just my 2 cents.
find quote
MatsF Offline
Junior Member
Posts: 17
Joined: Jan 2011
Reputation: 0
Post: #18
(2012-04-09 11:21)jmarshall Wrote:  That was quick Smile
Well this answer was a little bit slower. I had some other things needed to be done....

(2012-04-09 11:21)jmarshall Wrote:  2. Do you mean dropping idCollection from the content table and instead just using the collectionlinkcontent link table? I'm not sure under which circumstances a piece of content can be in more than one collection (assuming collection here means something like "TV Show") - I guess technically some episodes can be in 2 "different" seasons (the season they air during, and "specials").
No, i didn't mean that. What I meant was that in the proposal it is 2 links from Content.idCollection. One directly to table Collection and one to the table CollectionLinkContent and the later is the actual N:N link between Content and Collection. So what I just am trying to say is that you don't need the first link, the one between Content and directly to Collection that is according to the text primary collection, If you look at the diagram you can see that there is only the (what I think) wrong one showed there, the 1:N between Content and Collection. What I think should be right is to move that to just be between Content and CollectionLinkContent and then it will be a N:N.

(2012-04-09 11:21)jmarshall Wrote:  3. Agreed - the only reason for more than just Online/Offline is that in some cases we'll know that not only the item is offline, but it's actually most likely deleted (we'd mark this differently in the UI). Either way, it's the same basic idea.
It is nice when we agree Smile

(2012-04-09 11:21)jmarshall Wrote:  4. The current db (i.e. in Eden) has playCount in the files table. Personally I think this needs to be a profile-level setting. Possibly also we may wish to consider storing play history (i.e. a list of dates when the file was played - atm we just have playcount and lastPlayed), though I suspect this is getting a little fine-grained for most users. We definitely need the profiles in the database and most things associated with the profile (user rating, playcount, settings possibly tied to that rather than just to the file/content).
5. What I meant is we don't store the history - we do store lastPlayed in the Eden database (in the files table).
My bad, both in thinking and writing.i will rewrite:
Yes,and in 4.0a you store lastPlayed in Settings. And yes, I do agree that it should be profile level wherever you decide to put it, I dont know where I said otherwise, but I have never ment anything other than that.

[playCount] You have to create a new table for playCount to work with count(). I still think it is a good idea to have it. (My own example: I have just started to look at West Wing again and I would have loved to be able to see if it was 2006 or 2007 I watched it the first time). But that is maybe because I am older so I don´t remember exactly when I saw it and I dont remember a lot of things in the show either ;-) ).
[Watched] Instead, the question of where to put Watched is just a question about what function you want to have in XBMC (my previous answer). Although if you want it to be that the Content is watched and not the file (which I still think makes the best function in XBMC), then you have to put it in a new table holding idProfile, idContent and Watched (T/F), I didnt think of that, but it was not my meaning to make the Watched field without Profile-dependence. And the same actually apply if you put it in File.

(2012-04-09 11:21)jmarshall Wrote:  6. Hmm, still not sure I get this one. Documentaries would be a sub-collection of TV Shows, but would there be links between the collection "TV Shows" and the content? eg. if we have 2 tv shows "Planet Earth" and "Mad Men" then one is a documentary and one is just a TV Show (no subcategory). How do you see the link tables being setup between the "TV Shows" and "Documentaries" items, the shows, and the episodes in the show?

No, sometimes I am explaining really bad, sorry for that. And my suggestion is probably not even needed, it is just for help in making a query a little bit easier. What I meant was that the field TopLevel just was to be used for the content that has child-information (episodes or songs). It has nothing to do with subcollections (like Documentaries) more than it is easier to get all the Documentaries if the field TopLevel is there also.

The Show-episode links are the easy one, There you already have the table ContentLinkContent.
And then when you need the episodes you just query the Content and ContentLinkContent where ContentLinkContent.idParent = (the idContent of the show) AND Content.TopLevel = False (and probably some season-filter also).

It is just a little bit easier, nothing more than that.

(2012-04-09 11:21)jmarshall Wrote:  17. Actually, this is referring to playable items pretty much (episodes, movies, music videos, clips etc) - often movies for instance are actually a folder on disk (or a virtual folder such as a rar or zip) or could be a folder of folders (such as a dvd or bluray folder layout). When the user is browsing their files, they're not really interested in the layout of the files for that movie beyond the fact that the folder item represents it. They don't want to have to click into the folder then click on the movie, they just want it to play when the click on it immediately. Thus, we need something similar to the files table representing these virtual items on the disk - it would have idPath and idContent, possibly idFile and other information (eg hash info so we can detect when things inside that folder changed).

To me you already have that information in the File so obviously there is my turn to not understand exactly what you mean.
Aha, maybe I understand it and this is for file browsing only. Then you either put that information also in the table File or you create a new table for Content and put the information there for those Contents that needs it.

(2012-04-09 11:21)jmarshall Wrote:  As for where to next, this is currently in limbo due to no-one pushing it forward, so if you'd like to be involved, then great - as I say, it seems to me that the first step is refining the schema to the point we think it has everything we need (we'll likely not know for sure until it's been implemented) and then get a test framework up and running.

Yes I do want to be involved, if you see that I can be of any use.

Can I push it forward, and if so how?

Obviously I can´t do the refining until you tell me everything you need (or if 4.0a is mostly everything you need, then it is fine).
Also, when we fill the test framework with data then some things will change, I guarantee Smile

The test framework is that going to be in SQLite or something else? If I will be involved then I probably will create a local mirror with Microsoft SQL Server because that is the one I know inside out....

So how will we get there?

/Mats
find quote
topfs2 Offline
Team-XBMC Developer
Posts: 3,825
Joined: Dec 2007
Reputation: 8
Post: #19
(2012-04-10 13:50)sebak Wrote:  
(2012-04-09 21:19)topfs2 Wrote:  
(2012-04-09 15:03)da-anda Wrote:  sorry - I haven't read the 4 huge posts of you guys yet, I just wanted to mention that I'd really love to work on a new DB layout. Developing domain driven (DDD) for some years now, I have a pretty good sense for abstracting and generalizing data. I also know about the requirements of initially simple queries for basic media listing + on demand queries. The php framework I'm coding with has a nice concept for this - it's called "lazy loading". Basically you have a "repository" (a class that returns objects of a certain type, abstracting the sql stuff from the application layer) that fetches raw DB data from the primary object table and returns it as ready to use objects. Object properties that are relations to other tables can either be loaded eager (so instantly on creating the object by triggering subqueries) or lazy, which means that the object property itself only holds a proxy class that will load the according data/object on demand/first use. This is pretty nice to work with and I think that's also how it should work in XBMC on the long run, because it would easily allow the desired generalized DB schema.

Must say that having a way to handle linking to other databases is a great thing, that would allow us to for example link a fileitem to a movie on themoviedb and thus just have a form of cache which acts themoviedb, that way we could schedule updates from themoviedb as needed but also if a fileitem is wrongly classed as a movie we can just remove that link.

da-anda is talking about an Object-relational mapper (I'm guessing doctrine, which is a php implementation of the java orm), I'm pretty sure there must be good c++ implementations around. However, I don't really see how that helps with linking to other databases...

Not saying his does but the idea to link to other databases is a good thing to consider. i.e. follow the semantical web ideals. So a file on your harddrive is imdb/tt0096754, much as its done in rdf etc.

If you have problems please read this before posting

Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.


[Image: badge.gif]

"Well Im gonna download the code and look at it a bit but I'm certainly not a really good C/C++ programer but I'd help as much as I can, I mostly write in C#."
find quote
da-anda Offline
Team-XBMC Member
Posts: 1,382
Joined: Jun 2009
Reputation: 27
Location: germany
Post: #20
(2012-04-11 15:38)topfs2 Wrote:  
(2012-04-10 13:50)sebak Wrote:  da-anda is talking about an Object-relational mapper (I'm guessing doctrine, which is a php implementation of the java orm), I'm pretty sure there must be good c++ implementations around. However, I don't really see how that helps with linking to other databases...

Not saying his does but the idea to link to other databases is a good thing to consider. i.e. follow the semantical web ideals. So a file on your harddrive is imdb/tt0096754, much as its done in rdf etc.

yes, I was talking about a ORM, but not doctrine - it's a custom implementation for a CMS I'm using (Extbase-Framework for TYPO3).

@relation to other DBs
using proxy objects as properties would allow that easily I guess.
find quote