Question about TV.com scraping
#1
Hi!
(Hopefully this post is in the right place so the right people can read it)

I'm working on a little app to organize the TV shows on my computer, and I keep finding myself reinventing the wheel (xbmc in this case). Hopefully my little app will complement xbmc and not just replicate what it does.

I've been reading the tv.com scraping xml file, but I can't really understand how you guys managed to go from a season number and an episode number to a series episode number (overall episode number).

For example, if you see Prison Break's episode list, episode 2x22 shows up as episode 44.
Also, how did you manage to skip "special" episodes or episodes that were "unaired"? I'm trying to find the most elegant solution, and thought you guys could point me in the right direction.

Anyway, thank you for your time.

KodeK
#2
skipping is handled by the regexps.

as for going number -> season/episode, the logic is simple: just make sure each season starts on ep 1.
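
A minimal sketch of that logic in Python, assuming you already have a per-season episode count with specials and unaired entries filtered out. The names `to_overall`, `from_overall` and `season_lengths` are made up for illustration and are not taken from XBMC or the tv.com scraper.

```python
def to_overall(season, episode, season_lengths):
    """Convert (season, episode) to an overall episode number.

    season_lengths[i] is the number of regular episodes in season i + 1
    (hypothetical input, assumed to exclude specials/unaired entries).
    """
    if not 1 <= season <= len(season_lengths):
        raise ValueError("unknown season")
    if not 1 <= episode <= season_lengths[season - 1]:
        raise ValueError("episode out of range for that season")
    # Every season restarts at episode 1, so add up all earlier seasons.
    return sum(season_lengths[:season - 1]) + episode


def from_overall(overall, season_lengths):
    """Convert an overall episode number back to (season, episode)."""
    if overall < 1:
        raise ValueError("overall episode numbers start at 1")
    remaining = overall
    for season, length in enumerate(season_lengths, start=1):
        if remaining <= length:
            return season, remaining
        remaining -= length
    raise ValueError("overall number is past the last known episode")
```

With two 22-episode seasons (an illustrative list; 22 for season 1 is what the 2x22 -> 44 example in the first post implies), `to_overall(2, 22, [22, 22])` gives 44 and `from_overall(44, [22, 22])` gives `(2, 22)`.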
#3
That makes sense. Thanks!

One thing that's not clear, though, is the skipping. I noticed that on Prison Break there's a "special" episode that's not counted as an actual episode (the list goes from episode n, to the special, to n+1), but on MythBusters the specials do count as episodes (episode n, special, episode n+2). Does XBMC take this into consideration? If so, how?
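
To make the two conventions concrete, here is a rough Python sketch. It is only an illustration of the numbering difference, not how XBMC or tv.com does it; `number_episodes` and the `specials_count` flag are hypothetical names, and the flag would have to be chosen per show by hand.

```python
def number_episodes(entries, specials_count):
    """Assign episode numbers to a listing in air order.

    entries: list of (title, is_special) tuples.
    specials_count: True if a special uses up an episode number
    (n, special, n+2), False if it is skipped (n, special, n+1).
    Specials themselves are returned with number None in both cases.
    """
    numbered = []
    counter = 0
    for title, is_special in entries:
        if is_special and not specials_count:
            # Special is listed but does not advance the counter.
            numbered.append((None, title))
            continue
        counter += 1
        numbered.append((None if is_special else counter, title))
    return numbered
```

For a listing of "A", a special, then "B", `specials_count=False` numbers the regular episodes 1 and 2, while `specials_count=True` numbers them 1 and 3.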

Thanks again.

KodeK
#4
that is not handled unfortunately. hard to find a general logic :/
#5
More on this (I think :) )

Can it handle specials in the DB, such as the Black Adder specials that are categorized with Season 4 on tv.com?
http://www.tv.com/the-black-adder/show/4...dropdown;3

And the Little Britain Specials
http://www.tv.com/little-britain/show/16...=nav_bar;2

I will add this info to the manual if it makes sense.
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
#6
I noticed that for quite a lot of episodes, some of the cast are missing from the XBMC database even though they are listed on tv.com as a star/guest star for that episode.
Even after refreshing the episode info they don't show up.

http://www.tv.com/csi/unbearable/episode...mmary.html , a couple of people are missing from this episode (CSI 5x14 Unbearable). 'Palmer Davis' and 'Sara Foster' for instance don't show up in the database as cast for this episode.

It seems the scraper is missing something, but I can't really figure out why it's going wrong.
#7
I posted this somewhere else and haven't got an answer in a few weeks so I'm trying it here.

I love the new interface and the TV show support. It works for many of my TV shows, but for many others the episodes do NOT seem to be recognized.

I'm following the wiki's naming format:

e.g. \\.....\Penn and Teller\Season 4\01 - Boy Scouts.ISO

Now, I'm using ISO files (as well as some wmv files), but that shouldn't be an issue because it works perfectly for some shows (e.g. 24).

I'm just trying to find out if there is a manual way to enter TV show episode information into the database so that the episodes show up in library mode (currently they are ignored).

Thanks.
#8
Played around with this more and I was able to get the shows that are on TV.com to show up. Interestingly, the naming convention from the wiki doesn't seem to work for some reason (e.g. 01 - name of show). What DID work was the following:

Penn and Teller/Season 1/Penn and Teller Bullshit - S01E01 - Name Of Show.ISO
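
For anyone wondering why the second form is easier to pick up: a name with an explicit season/episode marker can be matched with a simple regexp, while a bare "01 - ..." carries no season at all in the file name. Below is a small Python sketch of that idea. The two patterns and the `parse_episode` helper are my own assumptions for illustration and are not the regexps XBMC actually uses.

```python
import re

# Hypothetical patterns, tried in order: "S01E01" style, then "3x21" style.
PATTERNS = [
    re.compile(r"[Ss](\d{1,2})[Ee](\d{1,2})"),
    re.compile(r"(?<!\d)(\d{1,2})x(\d{1,2})(?!\d)"),
]


def parse_episode(filename):
    """Return (season, episode) found in a file name, or None."""
    for pattern in PATTERNS:
        match = pattern.search(filename)
        if match:
            return int(match.group(1)), int(match.group(2))
    return None
```

Here `parse_episode("Penn and Teller Bullshit - S01E01 - Name Of Show.ISO")` returns `(1, 1)`, whereas `parse_episode("01 - Boy Scouts.ISO")` returns `None` because neither pattern finds a season number in the name.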

I'd still like to find a way to add shows that are NOT on the tv.com site (e.g. they seem to be missing the third season of High Stakes Poker, so how do I manually add that to the database?)

Any ideas?
#9
You have to remember that the TV scraper and database are in the early stages. I don't believe there is a way to manually add a TV show yet.

Maybe I can check into the star/guest stars not being picked up but I can't promise anything. I got lucky with the IMDB.xml scraper but my luck may have been used up.

J_K_M_A_N
#10
I fixed the missing actors. The scraper was looking for a comma after each actor/role, but the last actor/role in each listing doesn't have a comma. I will submit a patch.
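
Roughly what was going on, shown as a Python sketch rather than the actual scraper XML. The cast string, both patterns and the helper names are placeholders I made up to show the effect; the real tv.com markup and the real patch look different.

```python
import re

# Placeholder cast listing in "Name (Role), Name (Role)" form.
CAST = "Actor One (Role One), Actor Two (Role Two), Actor Three (Role Three)"

# Requires a trailing comma, so the last entry is never matched.
STRICT = re.compile(r"\s*([^,(]+?)\s*\(([^)]*)\),")

# Accepts either a comma or the end of the string after each entry.
LENIENT = re.compile(r"\s*([^,(]+?)\s*\(([^)]*)\)(?:,|$)")


def cast_strict(text):
    """Return (actor, role) pairs using the comma-only pattern."""
    return STRICT.findall(text)


def cast_lenient(text):
    """Return (actor, role) pairs, including the final entry."""
    return LENIENT.findall(text)
```

On the placeholder string, `cast_strict(CAST)` finds only the first two pairs, while `cast_lenient(CAST)` also returns `("Actor Three", "Role Three")`.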

J_K_M_A_N
#11
fix is in svn.
#12
Thanks a lot, J_K_M_A_N.
I really love the tv-database, it is great to be able to check if some guest-actors can be found in other shows you have.
Reply
#13
Issue: season 9 ep. 3 of Top Gear isn't detected, presumably because it has the status of "special" at TV.com (although it has both a season and an ep. number).
See here
I am trying to get my head around this, and I presume it is the same reason other specials do not show up?
I guess the question is whether this is fixable through regexps, or whether it is simply non-functional like the other specials (which don't have the season/ep. identifier).

Little Britain
Black adder
for comparison
#14
fixable, but not sure if that's something we want to do. how's the general status of this? (this is why backends like zsori are so much better than scraping a web page)
#15
I have another problem with the tv.com scraper.
I have a directory "the.office.us" with all episodes of "The Office" in the structure "the.office.us/season.3/the.office.us.3x21.product.recall.avi".
When I first added the series to the database, I had to change the search from "the.office.us" to "The Office".
Now when I do a "scan for new content" on the dir, or try to get the episode info for the last episode added, the pop-up dialog comes up and disappears almost immediately.
The logs give an error like:

15:30:31 M: 31539200 INFO: Loading skin file: DialogVideoScan.xml
15:30:31 M: 30965760 DEBUG: CVideoInfoScanner::Process - Starting scan
15:30:32 M: 30965760 INFO: Get URL: http://www.tv.com/search.php?stype=progr...g=tv_shows
15:30:32 M: 30924800 ERROR: IMDB: Unable to parse web site
15:30:32 M: 30965760 DEBUG: CVideoInfoScanner::DoScan - Finished dir: smb://192.168.1.134/series$/archive/the.office.us/
15:30:32 M: 30978048 DEBUG: CVideoInfoScanner::Process - Finished scan

Only via a total refresh of all episodes (and again choosing "The Office" as tvshow), it lets me add new episodes.

It seems it doesn't remember the show I chose, and instead uses the dir name as the search term on tv.com.