Icefilms (Icefilms.info) Addon Development Thread

Thread Closed
anarchintosh Offline
Fan
Posts: 550
Joined: Jul 2010
Reputation: 4
Post: #1
per XBMC's Official Piracy Policy
> > > FOR ALL FUTURE DISCUSSION RELATING TO ICEFILMS AND THIS ADD-ON SEE: XBMC HUB < < <




DEVELOPERS ONLY!

If you are a user wondering how to install the addon/get support for errors please go to the Release Thread.
This thread is for those who are trying to develop the addon, through patches, testing, art or other means of contributing.

Development is done on Github:
https://github.com/icefilms-xbmc

To collaborate on Github, you need to open an account with them and know/learn how to use Git, which is quite easy.
All contributions are welcome; anyone who wants to work on the code should just tell me or another Icefilms XBMC team member the name of their Github account.

The code is quite messy.

Git + GitHub is used for sharing development code.
SVN + Google Code is used to publish the latest stable release of the addon to users through the anarchintosh-projects repository.
(This post was last modified: 2012-01-07 11:29 by Ned Scott.)
anarchintosh Offline
Fan
Posts: 550
Joined: Jul 2010
Reputation: 4
Post: #2
Here are some development-related threads.
http://forum.xbmc.org/showthread.php?p=676892
http://forum.xbmc.org/showthread.php?tid=90257
http://forum.xbmc.org/showthread.php?tid=90002
http://forum.xbmc.org/showthread.php?tid=88843
http://forum.xbmc.org/showthread.php?tid=89959
http://forum.xbmc.org/showthread.php?tid=89960
http://forum.xbmc.org/showthread.php?tid=89845
http://forum.xbmc.org/showthread.php?tid=87759
http://forum.xbmc.org/showthread.php?tid=89245
http://forum.xbmc.org/showthread.php?tid=88722
(This post was last modified: 2011-01-12 00:01 by anarchintosh.)
anarchintosh Offline
Fan
Posts: 550
Joined: Jul 2010
Reputation: 4
Post: #3
dangerFlakes Wrote:While I cannot promise anything, I would consider learning to help this project out. I'll start reading up, but again don't rely on me at all, I'm a slow learner. Haha.

Quick question, I'm not too familiar with all the features of Xbmc, but once you start pulling metadata, could this addon conceivably run in library mode?

I would assume it would take a loooong time to scrape, but is it possible to essentially just have all the data stored locally (poster pics, fan art, etc) and just stream the content?

This would be great for my situation, as my htpc is basically a streaming box with a large unused harddrive. Used to be into the server stuff and library managing, but I don't have the time or will to do all that again

Do not worry about being a slow learner! I only scraped by (excuse the pun) through studying the Voinage plugin guide and other addons.
Here are a few links and comments to help get you started:
Crucial Voinage tutorial. This is old and rather outdated, but it is still the backbone of many decent addons, including the Icefilms addon.

The Python documentation is really good. I mostly just found the bits I needed from it by googling for specific problems.

There's a good new guide from chippyash, but it's probably a bit too much info for someone just introduced to Python.

I also recommend this addon for lubetube.com as a simple addon whose code is fairly easy to get your head round. To get the addon, click "download master as tar.gz", then unarchive it and rename the folder to plugin.video.lubetube

For developing I use a virtual OS (Windows 7 running in VMware), just to keep my development environment separate from my nice clean computer.
PM me if you want to do this...

Unfortunately, plugins cannot run in library mode, but with metadata they can get close to looking like it.

This is my concept for Icefilms metadata:
Metadata is very, very small; it would probably amount to 30MB at the absolute most for the whole of Icefilms, especially without fanart. My idea is that since everyone needs the same metadata for Icefilms, why not have one person scrape the info from all the relevant IMDb and TVDB pages, put it all in a .zip, and upload it to megaup, updating it once a week? The addon would then pull down the metadata .zip, unzip it, and use that metadata.

It would involve some nifty use of Python, but it would eliminate annoyingly slow scraping, overburdening IMDb and TVDB, and having to support the scraping code for other people. The only downside is that things newly added to Ice won't have art until up to a week later, but honestly that's really not that bad.
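
The unzip step of this concept could be sketched roughly as follows. This is only an illustration under assumptions: the function name, the folder layout inside the archive, and the use of current Python (the addon itself ran on Python 2 at the time) are all hypothetical, and the download itself is left out so the sketch works on archive bytes obtained by any means.

```python
import io
import os
import zipfile


def unpack_metadata(zip_bytes, cache_dir):
    """Extract a metadata archive (given as bytes) into cache_dir.

    Hypothetical helper: returns the sorted list of extracted member names.
    """
    os.makedirs(cache_dir, exist_ok=True)
    root = os.path.realpath(cache_dir)
    extracted = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for member in archive.namelist():
            # Skip entries whose path would land outside the cache folder.
            target = os.path.realpath(os.path.join(root, member))
            if not target.startswith(root + os.sep):
                continue
            archive.extract(member, root)
            extracted.append(member)
    return sorted(extracted)
```

The addon would call something like this once after fetching the weekly archive, then read metadata from the cache folder instead of scraping.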

My concept for dealing with 2shared is to use Selenium Remote, which is programmable from Python. However, this is a horrible method that introduces operating system dependencies.
(This post was last modified: 2011-01-12 16:02 by anarchintosh.)
dangerFlakes Offline
Senior Member
Posts: 141
Joined: Sep 2010
Reputation: 0
Post: #4
Well IMDb offers their entire database for download in plain text: http://www.imdb.com/interfaces

Seems like we could just get the IDs from icefilms and compare and pull from the entire db when needed. That way, no weekly scraping the entire site. Just replace when needed.

EDIT!: for some reason I was thinking you were hosting files :/ NM, yeah your plan sounds good. haha.
(This post was last modified: 2011-01-12 20:00 by dangerFlakes.)
BlueCop Offline
bipedal omnivore
Posts: 1,657
Joined: May 2004
Reputation: 70
Post: #5
is anyone in communication with the icefilms administrators?

why not just scrape their site for the meta-data?

An API to access their database of content would be ideal, because they have metadata and a poster for all the content I looked at. You would only have to access a single source and wouldn't have to build a database that needs updating. If the info were grabbed from the site, you would always have it for new episodes as soon as they are posted.

I have just started looking at this site and I have to say it is pretty impressive amount of content. Nice work!
anarchintosh Offline
Fan
Posts: 550
Joined: Jul 2010
Reputation: 4
Post: #6
@dangerflakes
that doesn't provide links to images......
themoviedb.org api looks really good, it allows you to use imdb id:
http://api.themoviedb.org/2.1/methods/Movie.imdbLookup
http://api.themoviedb.org/2.1/methods/Movie.getImages
thetvdb.com API would work by searching names of tv shows / episodes. icefilms content has name and date, so there would be very few false matches...

@bluecop
thanks!
http://forum.icefilms.info/viewtopic.php?f=35&t=17201
this is the thread i started in their forum, and i've had a few chats with staff. unfortunately i insulted one.

although i haven't asked, i don't think they'd be willing to implement an API.... they see the addon as a neat little extra for xbmc users, not a viable way of using their site.

They don't have many servers, so they might be worried about bandwidth....
the mirror pages embed content served by imdb anyway.
this is why i was going to use the imdb links to get images.

@everyone
What about the addon making API calls to moviedb and tvdb?
This data would have to be cached, or they will ban the API key if it's using too much of their bandwidth.
Since Icefilms contains tons of content right off the bat, I could pre-package a .zip with loads of scraped metadata that would be automatically unzipped into a cache folder on first run. The addon could then add to that cache folder from calls to the moviedb and tvdb APIs whenever it does not find metadata files for a film. This would handle updating when new episodes/films come out.
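
A minimal sketch of that cache-first lookup, assuming one metadata file per title named after its IMDb id. The function name, the file naming, and the `fetch` callable (a stand-in for the real moviedb or tvdb request) are hypothetical, and the code targets current Python rather than the Python 2 the addon used.

```python
import os


def get_metadata(imdb_id, cache_dir, fetch):
    """Return metadata text for imdb_id, preferring the local cache.

    fetch is a hypothetical callable standing in for the remote API
    request; it takes an id and returns text, or None if nothing matched.
    """
    path = os.path.join(cache_dir, "tt%s.xml" % imdb_id)
    if os.path.exists(path):
        # Cache hit: either pre-packaged or saved by an earlier call.
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    data = fetch(imdb_id)
    if data is not None:
        # Cache miss: store the response so the API is hit only once.
        os.makedirs(cache_dir, exist_ok=True)
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(data)
    return data
```

Because the pre-packaged files and the files written here share one folder and one naming scheme, the lookup does not need to know where a cached entry came from.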

Is this excessive? I know my posts are. :rolleyes:
(This post was last modified: 2011-01-16 20:07 by anarchintosh.)
daledude Offline
Member
Posts: 85
Joined: Oct 2009
Reputation: 0
Post: #7
I hacked some code into your default.py to scrape movie info (thumbs and plot) from themoviedb and cache it using sqlite. I didn't put in any comments ;). I'm just throwing you the patch below. Hopefully it will inspire you for the TV listings. Note that my code doesn't cache "not found" responses from themoviedb, and you must put your API key in tmdb_apikey.

Code:
--- default.py.orig     2011-01-16 19:48:45.252329743 -0600
+++ default.py  2011-01-17 00:59:28.290630000 -0600
@@ -2,11 +2,12 @@

#Icefilms.info v0.6.3 - anarchintosh 27/12/2010
# very convoluted code.
+from pysqlite2 import dbapi2 as sqlite
import sys,os
import mechanize,cStringIO
import urllib,urllib2,re,cookielib,html2text
import xbmc,xbmcplugin,xbmcgui,xbmcaddon,StringIO
-from BeautifulSoup import BeautifulSoup
+from BeautifulSoup import BeautifulSoup, BeautifulStoneSoup



@@ -35,6 +36,7 @@


icedatapath = 'special://profile/addon_data/plugin.video.icefilms'
+movies_cache_db = xbmcpath(icedatapath, 'movies_cache')
translatedicedatapath = xbmcpath(icedatapath,'')
art = icepath+'/resources/art'
megacookie = xbmcpath(icedatapath,'cookies.lwp')
@@ -82,6 +84,7 @@

#useful global strings:
iceurl = 'http://www.icefilms.info/'
+tmdb_apikey = ''

def openfile(filename):
      fh = open(filename, 'r')
@@ -539,13 +542,57 @@
                 return name

def MOVIEINDEX(url):
-        link=GetURL(url)
-# below is the original scraper that ignores HD tags.
-#        match=re.compile('<img class=star><a href=/(.+?)>(.+?)</a>').findall(link)
-        match=re.compile('<img class=star><a href=/(.+?)>(.+?)<br>').findall(link)
-        for url,name in match:
-                name=CLEANUP(name)
-                addDir(name,iceurl+url,100,'')
+    html = GetURL(url)
+    BS = BeautifulSoup( html )
+    video_list = BS.find("span", { "class" : "list" } )
+    links = iter(video_list.findAll("a"))
+    for id_link in links:
+        imdb_id = id_link["id"]
+        video_info = get_video_info_by_imdb(imdb_id)
+        if video_info is not None:
+            plot = video_info.overview.string
+            try:
+                video_thumb = video_info.find("image", { "type": "poster", "size": "mid" } )["url"]
+            except:
+                video_thumb = ""
+        else:
+            plot = ""
+            video_thumb = ""
+
+        video_link = links.next()
+        name = CLEANUP(video_link.string)
+        url = video_link["href"]
+        #print "addDir(%s,%s,%d,%s)" % (name,iceurl+url,100,video_thumb)
+        addDir(name,iceurl+url,100,video_thumb, plot, imdb_id)
+        
+
+
+# returns a BeautifulSoup object
+def get_video_info_by_imdb(imdb_id):
+    con = sqlite.connect(movies_cache_db)
+    cur = con.cursor()
+    cur.execute("CREATE TABLE IF NOT EXISTS video_info (imdb_id text, name TEXT, xml BLOB, UNIQUE(imdb_id), UNIQUE(name));")
+    cur.execute("SELECT xml FROM video_info WHERE imdb_id = ?", (imdb_id,))
+    video = cur.fetchone()
+    if video is not None:
+        #print "FROM CACHE"
+        video_info = BeautifulStoneSoup(video[0]).find("movie")
+    else:
+        #print "CREATING CACHE"
+        video_xml_data = urllib2.urlopen("http://api.themoviedb.org/2.1/Movie.imdbLookup/en/xml/"+tmdb_apikey+"/tt"+imdb_id).read()
+        video_info = BeautifulStoneSoup(video_xml_data)
+        results = video_info.find("opensearch:totalresults").contents[0]
+        if int(results) == 0:
+            return None
+        video_info = video_info.find("movie")
+        cur.execute("INSERT INTO video_info VALUES (?,?,?)", (imdb_id, video_info.nameTag.string, video_xml_data))
+        con.commit()
+
+    cur.close()
+    con.close()
+    return video_info
+
+

def TVINDEX(url):
         link=GetURL(url)
@@ -889,11 +936,11 @@
         return ok


-def addDir(name,url,mode,iconimage):
+def addDir(name,url,mode,iconimage,plot="", imdb_id=""):
         u=sys.argv[0]+"?url="+urllib.quote_plus(url)+"&mode="+str(mode)+"&name="+urllib.quote_plus(name)
         ok=True
         liz=xbmcgui.ListItem(name, iconImage="DefaultFolder.png", thumbnailImage=iconimage)
-        liz.setInfo( type="Video", infoLabels={ "Title": name } )
+        liz.setInfo( type="Video", infoLabels={ "Title": name, "Plot": plot.encode( "utf-8" ), "Code": "tt"+imdb_id } )
         ok=xbmcplugin.addDirectoryItem(handle=int(sys.argv[1]),url=u,listitem=liz,isFolder=True)
         return ok
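
The post above notes that "not found" responses are not cached, so every listing repeats the API call for titles themoviedb does not know. One possible way to close that gap is sketched below; it is not part of the patch. The function name, the two-column table, and the `fetch` callable are hypothetical, and it uses the standard-library sqlite3 module on current Python rather than pysqlite2.

```python
import sqlite3
from contextlib import closing


def cached_lookup(db_path, imdb_id, fetch):
    """Return cached XML for imdb_id, or None if the title is known to be missing.

    fetch is a hypothetical callable wrapping the remote API call; it
    returns the XML text, or None when the API reports no match.
    """
    # closing() guarantees the connection is released on every return path.
    with closing(sqlite3.connect(db_path)) as con:
        con.execute(
            "CREATE TABLE IF NOT EXISTS video_info "
            "(imdb_id TEXT PRIMARY KEY, xml TEXT)"
        )
        row = con.execute(
            "SELECT xml FROM video_info WHERE imdb_id = ?", (imdb_id,)
        ).fetchone()
        if row is not None:
            return row[0]  # may be None: a cached "not found"
        xml = fetch(imdb_id)
        # Store the result even when it is None, so misses are remembered.
        con.execute("INSERT INTO video_info VALUES (?, ?)", (imdb_id, xml))
        con.commit()
        return xml
```

A real version would probably want cached misses to expire after a while, since a title missing today may be added to themoviedb later.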
anarchintosh Offline
Fan
Posts: 550
Joined: Jul 2010
Reputation: 4
Post: #8
Hey thanks for that, use of sqlite is clever. Should help to form an initial draft of metadata support.

metadata concept:
Code:
#get_movie meta (return details as list)
    #(folder is named the same as film name)

    #check if folder exists

    #if folder exists:
        #add metadata and imgpath to list

    #if folder does not exist:
        #create folder
        #call moviedb api, get meta+imgpath
        #download xml, parse for image link and download image
        #save xml + image to folder.

    #if metadetails are retrieved, add plot, rating, path to image etc to list.
    #use list when adding item as directory and playing source.

If this is too slow, I will use an sqlite database (as opposed to the original .xml files) plus a folder full of images, with the image filenames derived from the image URLs stored in sqlite.
This would still allow importing an initial cache made by me.
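
One way to read "filenames derived from the image URLs" is to hash each URL into a fixed-length name, so the database only needs to store the URL. The sketch below shows that reading under assumptions: the function name, the choice of SHA-1, and the `.jpg` fallback are all hypothetical, and it is written for current Python.

```python
import hashlib
import os
from urllib.parse import urlsplit


def image_filename(image_url):
    """Map an image URL to a stable, filesystem-safe local filename."""
    # Keep the original extension when the URL path has one.
    extension = os.path.splitext(urlsplit(image_url).path)[1].lower() or ".jpg"
    digest = hashlib.sha1(image_url.encode("utf-8")).hexdigest()
    return digest + extension
```

The same URL always maps to the same filename, so checking whether an image is already cached is a single file-exists test in the images folder.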
(This post was last modified: 2011-01-17 20:39 by anarchintosh.)
dangerFlakes Offline
Senior Member
Posts: 141
Joined: Sep 2010
Reputation: 0
Post: #9
daledude Wrote:I hacked some code into your default.py to scrape movie info (thumbs and plot) from themoviedb and cache it using sqlite. I didn't put in any comments ;). I'm just throwing you the patch below. Hopefully it will inspire you for the TV listings. Note that my code doesn't cache "not found" responses from themoviedb, and you must put your API key in tmdb_apikey.

Just curious, what parts of the default.py are you replacing with the above? Wanted to try it out.
daledude Offline
Member
Posts: 85
Joined: Oct 2009
Reputation: 0
Post: #10
dangerFlakes Wrote:Just curious, what parts of the default.py are you replacing with the above? Wanted to try it out.

NOTE: You must put a themoviedb api key in the tmdb_apikey variable otherwise it won't work.

Paste that code into a file called default.py.patch in the same dir as default.py and run this command. Make a copy of the original default.py first just in case.

Code:
patch -p0 < default.py.patch
(This post was last modified: 2011-01-18 02:25 by daledude.)