Scraping for thumbnails avoiding overloading
#1
Hi all,
I am developing a plugin that fetches and plays videos from a website. Nothing special, but I have run into the following problem:

I have a plain-text list of the shows (about 100) that I can easily get from http://www.site.com/video/shows. Now I would like to get the thumbnail for each show, but the thumbnails are in another location: http://www.site.com/shows

So I have this function (pseudocode):

Code:
def get_shows():
    url = "http://www.site.com/video/shows"
    load(url)
    shows_list = all_shows
    for show in shows_list:
        show.title = grab_show_title
        show.link = grab_show_link
        show.thumbnail = grab_show_thumbnail(by_show_title)


and this one to return the thumbnail:

Code:
def grab_show_thumbnail(title):
    # one extra page load per show
    url = "http://www.site.com/" + title
    thumbnail = get_show_thumbnail
    return thumbnail

As you can imagine, this is a very expensive task: it takes one extra request per show (about 100 in total), and that is repeated every time the plugin is started.

Here are my two questions:

1) Do you know a more efficient way to accomplish this task?
2) Is it possible to modify the function like this?


Code:
def get_shows():
    url = "http://www.site.com/video/shows"
    load(url)
    shows_list = all_shows
    for show in shows_list:
        show.title = grab_show_title
        show.link = grab_show_link
        # proposed change: skip the download when the thumbnail is already available
        if thumbnail_is_in_cache or thumbnail_already_stored_somewhere:
            do nothing
        else:
            show.thumbnail = grab_show_thumbnail(by_show_title)
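
The "already stored somewhere" check could be sketched as a small on-disk cache, as below. This is only an illustration under assumptions: CACHE_DIR, cache_path, get_thumbnail and the download callable are hypothetical names that are not part of the plugin or of any Kodi API, and a real plugin would keep the files in its own data folder rather than the system temp directory.

```python
import hashlib
import os
import tempfile

# Hypothetical cache folder (assumption); a real plugin would use its own data directory.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "show_thumbnails")


def cache_path(title):
    """Map a show title to a stable file name inside CACHE_DIR."""
    digest = hashlib.sha256(title.encode("utf-8")).hexdigest()
    return os.path.join(CACHE_DIR, digest + ".jpg")


def get_thumbnail(title, download):
    """Return the local thumbnail path, calling download(title) only on a cache miss.

    download is a hypothetical callable that returns the image bytes for a show.
    """
    path = cache_path(title)
    if not os.path.exists(path):
        # Cache miss: fetch the image once and store it for later plugin runs.
        os.makedirs(CACHE_DIR, exist_ok=True)
        with open(path, "wb") as handle:
            handle.write(download(title))
    return path
```

With this shape, the loop would call get_thumbnail(show.title, grab_show_thumbnail) and hit the website only for shows whose image has not been saved yet.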


Thanks for any feedback, and keep up the good work.

Cheers,
Goph
#2
Solved.

I used the Common plugin cache script and its cacheFunction function.

In pseudocode:

Code:
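import StorageServer

# second argument: cache lifetime in hours
cache = StorageServer.StorageServer("tablename", 24)

def get_shows():
    url = "http://www.site.com/video/shows"
    load(url)
    shows_list = all_shows
    for show in shows_list:
        show.title = grab_show_title
        show.link = grab_show_link
        # the cached result is reused until it expires; the site is only hit on a cache miss
        show.thumbnail = cache.cacheFunction(grab_show_thumbnail, by_show_title)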
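
For readers unfamiliar with the pattern, the idea behind a call like cacheFunction can be illustrated with a small in-memory stand-in. This is not the add-on's implementation: TimedCache, cache_function and the clock parameter are hypothetical names made up for this sketch, and it assumes only that a cached result is reused until its lifetime in hours has passed.

```python
import time


class TimedCache:
    """Hypothetical in-memory stand-in for a cache whose entries expire after some hours."""

    def __init__(self, hours, clock=time.time):
        self.lifetime = hours * 3600  # lifetime in seconds
        self.clock = clock            # injectable clock, which makes expiry easy to test
        self.entries = {}             # (function name, args) -> (stored_at, value)

    def cache_function(self, func, *args):
        """Return the cached result of func(*args), recomputing it once it has expired."""
        key = (func.__name__, args)
        hit = self.entries.get(key)
        if hit is not None and self.clock() - hit[0] < self.lifetime:
            return hit[1]
        value = func(*args)
        self.entries[key] = (self.clock(), value)
        return value
```

Used as cache.cache_function(grab_show_thumbnail, title), the thumbnail page for a given title would be fetched at most once per lifetime window instead of on every plugin start.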