2008-07-02, 14:21
Because there was a rash of reports about duplicate tvshow entries due to the same show being found in many locations, including multiple path sources, I decided to try to remove the reliance on the path for a tvshow entry in the database.
Unfortunately, I spent a considerable amount of time on a dead end. It never really worked correctly. I had to keep making more hacks to work around the times when a path was required. So, I decided to dump all that work and rethink it.
My first thought was to add a hash field to the tvshow table and hash all the fields together into some unique id per show. The first issue was that I really didn't want to have to change the schema, as it's an annoyance. You need to update the database version and have it update itself on next load, etc. The other issue was that it's possible to have the same show in two locations and use two different scrapers, thus potentially making the hash value different. So, I abandoned that idea after getting it partially working.
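To illustrate the scraper problem with the hash idea, here is a minimal Python sketch. The field names (title, plot, year), the sample data, and the show_hash helper are all hypothetical and not the actual tvshow schema. The only point is that any difference in the scraped metadata changes the hash, while the title stays the same.

```python
import hashlib

def show_hash(fields: dict) -> str:
    """Hash all fields of a show record together (hypothetical helper)."""
    # Sort keys so the result does not depend on dict ordering.
    joined = "\x1f".join(f"{k}={fields[k]}" for k in sorted(fields))
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()

# Same show found in two paths, scraped by two different scrapers (made-up data).
from_scraper_a = {"title": "Battlestar Galactica", "plot": "A fleet flees the Cylons.", "year": "2004"}
from_scraper_b = {"title": "Battlestar Galactica", "plot": "Survivors search for Earth.", "year": "2004"}

hashes_match = show_hash(from_scraper_a) == show_hash(from_scraper_b)
titles_match = from_scraper_a["title"] == from_scraper_b["title"]
print(hashes_match, titles_match)  # False True
```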
Last night, while staring at the database in Sqlite Spy after a few beers, I thought why not just key off the title?! It's simple enough. I expect that would always be the same, regardless of the scraper used. So, I spent a few hours and got that working!
Duplicate entries are "stacked" into a single show, using the show info from the first tv show entry in the database. (It's actually the lowest id, which is typically the first one added.) The episode counts and watch counts are just tallied up for the additional entries. Everything else remains the same. Very little code had to change. It was mostly just some nested SQL queries. There were no changes to the scanner. The database still has duplicate entries by path, but they are just hidden from the display.
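For anyone who wants to picture the stacking, here is a rough sketch using Python's sqlite3 module. The table layout and column names are simplified and hypothetical (the real schema differs, and these are not the actual queries from the patch). It only shows the general shape: take the lowest id per title for the show info, and sum the counts over every entry sharing that title.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table; the real tvshow schema differs.
conn.execute("""
    CREATE TABLE tvshow (
        id INTEGER PRIMARY KEY,
        title TEXT,
        path TEXT,
        plot TEXT,
        episodes INTEGER,
        watched INTEGER
    )
""")
conn.executemany(
    "INSERT INTO tvshow (title, path, plot, episodes, watched) VALUES (?, ?, ?, ?, ?)",
    [
        ("Show A", "/disk1/Show A", "plot from first entry", 10, 4),
        ("Show B", "/disk1/Show B", "only entry", 6, 6),
        ("Show A", "/disk2/Show A", "plot from second entry", 5, 1),
    ],
)

# Nested queries: keep the row with the lowest id per title, and tally
# the counts over every row that shares that title.
STACKED = """
    SELECT s.id, s.title, s.plot,
           (SELECT SUM(d.episodes) FROM tvshow d WHERE d.title = s.title) AS episodes,
           (SELECT SUM(d.watched) FROM tvshow d WHERE d.title = s.title) AS watched
    FROM tvshow s
    WHERE s.id = (SELECT MIN(m.id) FROM tvshow m WHERE m.title = s.title)
    ORDER BY s.title
"""
stacked = conn.execute(STACKED).fetchall()
for row in stacked:
    print(row)
```

The underlying rows are untouched, so the duplicate entries by path remain in the table. Only the query used for display collapses them.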
So far, it works perfectly. I can navigate tvshows using any criteria. Seasons get stacked if episodes are split across paths. I need to do some more testing, but I don't expect anything to break since all the work is done in the database.
The one issue I thought of, because I encountered it myself, was if the scanner incorrectly names a show, it could get stacked into the show that's correctly named. This didn't happen to me while working on this stacking solution, but before. I had episodes from the original Battlestar Galactica and the new BSG both in folders named Battlestar Galactica but they were in different paths. The scanner thought they were the same show. It even skipped the episodes beyond season one for the new BSG as there's only one season of the original. I had to manually refresh the duplicate which was incorrectly titled. (And the only way to tell which was which was to enter one of them and play an episode.) After that it found the rest of the episodes and all was good.
My idea to combat this issue is to make this a setting in the video library, "Stack tv shows by Title" or something like that. It'll default to off. The user can then correct any issues and turn it on.
What's the consensus? Does anyone see any other issues with keying the stacking only off the Title, other than the issue I mentioned?