
Eldorado · (This post was last modified: 2011-10-24, 18:38 by Eldorado.)

I thought I would get the ball rolling on the next piece of the proposed Video Falcon project: a common metadata script

I've put the initial version on git - https://github.com/Eldorados/xbmc-metautils

This version was pulled from the master branch of Icefilms, I've basically put everything into its own script folder and set it up to stand on its own

Basic functionality:
* addon calls this module with an episode/movie imdb id or title
* module looks in its database for metadata for the required episode/movie
* if not found it scrapes a site for the required metadata, adds it to the database and returns it
* if it is found, it simply returns it from the database
* search by name and return a list of possible matches
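
The lookup flow in the list above can be sketched roughly like this. It is only an illustration and not the module's actual code: the table layout, `fake_scrape` and `get_cached_meta` are made-up names, and an in-memory SQLite database stands in for the real cache.

```python
import json
import sqlite3


def fake_scrape(imdb_id):
    # Hypothetical stand-in for a real TMDB/TheTVDB request.
    return {'imdb_id': imdb_id, 'name': 'Horrible Bosses', 'watched': 6}


def get_cached_meta(db, imdb_id, scrape=fake_scrape):
    """Return metadata from the cache, scraping and storing it on a miss."""
    db.execute('CREATE TABLE IF NOT EXISTS meta '
               '(imdb_id TEXT PRIMARY KEY, data TEXT)')
    row = db.execute('SELECT data FROM meta WHERE imdb_id = ?',
                     (imdb_id,)).fetchone()
    if row is not None:
        return json.loads(row[0])            # found: return it from the database
    meta = scrape(imdb_id)                   # not found: scrape it...
    db.execute('INSERT INTO meta VALUES (?, ?)',
               (imdb_id, json.dumps(meta)))  # ...and add it to the database
    db.commit()
    return meta


db = sqlite3.connect(':memory:')
first = get_cached_meta(db, 'tt1499658')     # scraped and cached
second = get_cached_meta(db, 'tt1499658')    # served from the cache
```

The second call never reaches the scraper, which is the whole point of keeping a shared cache.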

How to

When searching by just movie/tv show name I recommend passing in as clean a name as possible, stripping anything that is not a part of the actual name. E.g. many sites display 'The Hangover (2009)'; you need to strip (2009) from the name and pass the year in separately
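
One way to do that clean-up, as a rough sketch: `split_title_year` is a made-up helper and not part of the module, and whether the year should be passed as a string or a number is an assumption here.

```python
import re


def split_title_year(raw_title):
    """Split 'The Hangover (2009)' into ('The Hangover', '2009')."""
    match = re.match(r'^(.*?)\s*\((\d{4})\)\s*$', raw_title)
    if match:
        return match.group(1), match.group(2)
    return raw_title.strip(), None   # no trailing year found


movie_name, year = split_title_year('The Hangover (2009)')
# meta = metaget.get_meta('movie', movie_name, year=year)
```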

Initialize:

Code:
metaget=metahandlers.MetaData()

You can set a different cache path with path='<addon data path>', but using the default is recommended

If you wish to download the covers to the cache folders, in preparation for releasing a meta data zip pack to users, specify the preparezip=True option

Code:
metaget=metahandlers.MetaData(preparezip=True)

Search for movie:

Code:
# Search by IMDB ID:
meta = metaget.get_meta('movie', movie_name, imdb_id=imdb_id)

# Search by TMDB ID:
meta = metaget.get_meta('movie', movie_name, tmdb_id=tmdb_id)

# Search by movie name + year:
meta = metaget.get_meta('movie', movie_name, year=year)

Search for tv series:

Code:
# Search by IMDB ID:
meta = metaget.get_meta('tvshow', tvshow_name, imdb_id=imdb_id)

# Search by name:
meta = metaget.get_meta('tvshow', tvshow_name)

Search for tv show season covers:

By this point you *should* have the imdb id of the tv show, if you don't have it then no results will be returned

get_seasons() returns a dictionary for each season found in order

Code:
season_list = [1,2,3]

seasons = metaget.get_seasons(imdb_id, season_list)

Search for tv show episode:

Code:
season_num = 1

episode_num = 1

episode=metaget.get_episode_meta(imdb_id, season_num, episode_num)

Update watched status:
Toggles the watched status in the DB between 6 (unwatched) and 7 (watched), depending on the current value

Code:
# Movie:
metaget.change_watched('movie', movie_name, imdb_id, year)

# TV show:
metaget.change_watched('tvshow', tvshow_name, imdb_id)

# Episode:
metaget.change_watched('episode', episode_name, imdb_id, season)
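
The toggle itself boils down to something like the following. This is just a sketch of the 6/7 flip described above, not the code inside change_watched(); `toggle_overlay` and the two constants are made-up names.

```python
UNWATCHED = 6
WATCHED = 7


def toggle_overlay(current):
    """Flip the overlay value between 6 (unwatched) and 7 (watched)."""
    return UNWATCHED if current == WATCHED else WATCHED
```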

Search for a movie by name and return a list of possible matches:

Code:
search_meta = metaget.search_movies(movie_name)

Returns an array of dictionaries with the following data:

- IMDB ID

- TMDB ID

- Name

- Year
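
If you then want to narrow the list down yourself, something along these lines could work. It is a sketch only: the dictionary key names ('imdb_id', 'tmdb_id', 'name', 'year') and the sample data are assumptions, so check what search_movies() really returns before relying on them.

```python
def pick_match(results, year=None):
    """Return the first result matching the year, else the first result."""
    for item in results:
        if year is not None and str(item.get('year')) == str(year):
            return item
    return results[0] if results else None


# Hypothetical sample of what search_movies() might return.
sample = [
    {'imdb_id': 'tt0000001', 'tmdb_id': '1', 'name': 'Example', 'year': '1999'},
    {'imdb_id': 'tt0000002', 'tmdb_id': '2', 'name': 'Example', 'year': '2011'},
]
best = pick_match(sample, year=2011)
```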

Meta Data being collected

Movies:

Code:
IMDB ID

TMDB ID

Title

Writer

Director

Tagline

Cast & Role

Rating

Duration

Plot

MPAA Rating

Premiered

Year

Genre

Studio

Trailer URL

Thumb URL

Cover URL

Backdrop/Fanart URL

Overlay (watched status)

TV Shows:

Code:
IMDB ID

TheTVDB ID

Title

Rating

Duration

Plot

MPAA Rating

Premiered

Genre

Studio

Cast & Role

Trailer URL

Thumb URL

Cover URL

Backdrop/Fanart URL

Overlay (watched status)

Seasons:

Code:
IMDB ID

TheTVDB ID

Season #

Cover URL

Overlay (watched status)

Episodes:

Code:
IMDB ID

TheTVDB ID

Episode ID

Season #

Episode #

Title

Director

Writer

Plot

Rating

Premiered

Poster URL

Overlay (watched status)

To Do's

- LOTS of code cleanup/optimization
- add more meta - director, writers, cast DONE
- fleshing out methods and how they are called
- fix unicode problems, possibly integrating with t0mm0.common DONE
- metacontainers needs attention
- create a metacontainer zip file to optionally download instead of creating a blank DB ala icefilms

All welcome who are wanting to help!

t0mm0 · 2011-09-09, 18:33

Eldorado Wrote:I thought I would get the ball rolling on getting the next piece in the proposed Video Falcon project - common meta data script

I've put the initial version on git - https://github.com/Eldorados/xbmc-metautils

This version was pulled from the master branch of Icefilms, I've basically put everything into it's own script folder and set it up to stand on it's own

There are some to-do's left in the code as well as quite a bit of Icefilms specific coding, once that stuff is cleaned up I'm optimistic that it could be very close to being release ready

Scraping of metadata for movies (TMDB) and tvshows (TheTVDB) is currently working using IMDB ID's, as an enhancement it would be nice to add ability to search based on simply movie/tv show name

All welcome who are wanting to help!

hi eldorado!

is there a specification somewhere for what this bit is meant to do? i'm not very familiar with the metadata stuff so would be interested in what function this bit fills!

t0mm0

rogerthis · 2011-09-09, 19:50

Eldorado, could you give an example of how to execute this in an addon? I had a quick look at the code and I don't know where to start.

Eldorado · 2011-09-09, 20:10

Hi guys, I haven't dug into it yet to do up an example.. though the master branch of Icefilms currently uses it, so maybe a good place to look if you want to jump in?

Basically it needs work before it can be used in any sort of way outside of Icefilms; as I said, it has quite a bit of Icefilms specific coding

t0mm0, this was another item on the Video Falcon list:

http://forum.xbmc.org/showthread.php?tid=99384

Quote:script.module.metahandlers
Metahandlers will be the module that can get and cache metadata.
As well as build/download and install metacontainers of pre-packaged metadata for sites (you provide it with a list of all content, and it will pre-make a cache of metadata for that list).

t0mm0 · 2011-09-09, 22:37

Eldorado Wrote:t0mm0, this was another item on the Video Falcon list:

http://forum.xbmc.org/showthread.php?tid=99384

so am i right in thinking what it needs to do is...

addon calls this module with a episode/movie imdb id or title
module looks in its database for metadata for the required episode/movie
if not found it scrapes a site for the required metadata, adds it to the database and returns it
if it is found, it simply returns it from the database

is it possible to use the existing xbmc scraper modules rather than having separately maintained ones? (just asking the question - i know nothing about metadata in xbmc)

i assume the intention is to maintain a central database so that if a movie is added from one addon its metadata will be available from all others using the module?

seems what is really needed is to be able to add stuff to the main xbmc library. there is also the hack that is doing the rounds at the moment with creating loads of strm files which is trying to solve the same problem i guess?

also there is the mention of building pre-packaged metadata bundles - this sounds like a nightmare to me but maybe there is a particular use?

there should probably be a definition of what metadata is required. does it include posters/thumbs for example? maybe this would also be a good place to track watched status (especially as it would work across addons) while we can't do it in xbmc?

(i always find it better to try and define what something is supposed to do before writing code - might save rewriting it too much later. i hope the questions above aren't too silly - as i say i don't know anything about metadata in xbmc)

t0mm0

ps. eldorado you need to add .pyo (and .pyc while you are at it) files to your .gitignore file in this repo!

slyi · 2011-09-10, 18:20

I was also tinkering with meta data updates using asynchronous methods for icefilms see demo on http://dl.dropbox.com/u/6589941/asyncmet...ncmeta.zip

I think a generic system should only download what's requested at the time and be text only (no images), as these are better stored online rather than filling the limited hd of embedded devices, apple tv etc...

I'd be interested in helping on this as well, can you provide a sample that works with ice films v12?

Eldorado · 2011-09-11, 19:41

t0mm0 Wrote:so am i right in thinking what it needs to do is...

addon calls this module with a episode/movie imdb id or title

module looks in its database for metadata for the required episode/movie

if not found it scrapes a site for the required metadata, adds it to the database and returns it

if it is found, it simply returns it from the database

I think you nailed it here, pretty much what I was thinking the main functions should be, I'll add this to my op

t0mm0 Wrote:is it possible to use the existing xbmc scraper modules rather than having separately maintained ones? (just asking the question - i know nothing about metadata in xbmc)

Very good question and one I've been asking myself too!

Hoping someone can jump in with the knowledge to give a yay or nay, as you're right, it's very redundant and quite a bit of extra work to write and maintain a separate scraper

t0mm0 Wrote:i assume the intention is to maintain a central database so that if a movie is added from one addon its metadata will be available from all others using the module?

seems what is really needed is to be able to add stuff to the main xbmc library. there is also the hack that is doing the rounds at the moment with creating loads of strm files which is trying to solve the same problem i guess?

also there is the mention of building pre-packaged metadata bundles - this sounds like a nightmare to me but maybe there is a particular use?

there should probably be a definition of what metadata is required. does it include posters/thumbs for example? maybe this would also be a good place to track watched status (especially as it would wok across addons) while we can't do it in xbmc?

(i always find it better to try and define what something is supposed to do before writing code - might save rewriting it too much later. i hope the questions above aren't too silly - as i say i don't know anything about metadata in xbmc )

t0mm0

All good points. Initially I was thinking to get this module running on its own first, keeping all the functionality it currently performs for Icefilms: pulling in all metadata (plot, genre, cast, thumbnail etc.) and storing it in a local cache accessible by any addon. Set those as its initial boundaries, then work towards defining what phase 2/enhancements should be, e.g. as you said, adding to the main library, pre-packaged meta containers etc

t0mm0 Wrote:ps. eldorado you need to add .pyo (and .pyc while you are at it) files to your .gitignore file in this repo!

Eeee.. I usually make sure I don't copy those files :)

Eldorado · 2011-09-11, 19:50

slyi Wrote:I was also tinkering with meta data updates using asynchronous methods for icefilms see demo on http://dl.dropbox.com/u/6589941/asyncmet...ncmeta.zip

I think a generic system should only download whats requested at the time and be text only (no images) as these are better stored online rather filling the limited hd of embedded devices apple tv etc...

I'd be interested in helping on this aswell, can you provide a sample that works with ice films v12?

I'm not sure the user would like a system that has to re-scrape every time you pull up a list of movies, and I'm assuming nor would a site such as TMDB

The Apple TV has I believe 2gig storage space?

Perhaps an option between saving just text vs text & images?

The code I have posted only works with the master branch of Icefilms due to the number of changes, Anarchintosh had said it was 95% complete, don't see any notes on what is left to do..

If you need a v12 version you can simply pull it from the current addon folder, will need to modify to remove all the icefilms specific logic

anarchintosh · 2011-09-14, 18:21

asynchronous updates are only realistic if you only use low quality thumbnail cover art instead of high quality cover art, which is a bit of a sad tradeoff.

Eldorado · (This post was last modified: 2011-09-24, 00:11 by Eldorado.)

I've done some small updates

- removed all (that I could find) icefilms specific coding, which at a quick glance appeared to be scraping the icefilms site for metadata if an IMDB id did not exist. Something like this might be useful for other sites, so it's worth keeping in mind for updates

- small changes to use getAddonInfo('path') and sqlite3

Below is a quick example on how to scrape for a movie or tv show, the metadata will be stored in a sql db in the addon_data folder

Code:
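import xbmc
from metautils import metahandlers, metacontainers

# Cache location inside the addon_data folder
metapath = xbmc.translatePath('special://profile/addon_data/script.module.metautils/meta_cache')

# preparezip=True also downloads the covers into the cache folders
metaget = metahandlers.MetaData(metapath, preparezip=True)

# Arguments as used at this stage: IMDB id, type, name
meta = metaget.get_meta('tt1499658', 'movie', 'Horrible Bosses')

print(meta)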

Output:

Code:
{'rating': 8.1999999999999993, 'genres': u'Comedy', 'name': u'Horrible Bosses', 'tmdb_id': u'51540', 'plot': 'Starring : \nJennifer Aniston, Jason Bateman, Charlie Day, Jason Sudeikis, Colin Farrell\n\nPlot : \nAfter three friends realize that their bosses are standing in the way of their happiness, they come up with a murderous plot, hoping to better their lives.', 'mpaa': u'R', 'studios': u'New Line Cinema', 'premiered': u'2011-07-08', 'imdb_id': u'tt1499658', 'imgs_prepacked': u'true', 'cover_url': u'http://cf1.imgobject.com/posters/9d8/4e258b037b9aa11b5c0009d8/horrible-bosses-cover.jpg', 'duration': 100, 'watched': 6, 'thumb_url': u'http://cf1.imgobject.com/posters/9d8/4e258b037b9aa11b5c0009d8/horrible-bosses-thumb.jpg', 'trailer_url': u'http://www.youtube.com/watch?v=mh9cG5dzs-U', 'backdrop_url': u'http://cf1.imgobject.com/backdrops/079/4dae1a515e73d67899000079/horrible-bosses-original.jpg'}

Still quite a bit left to do which mainly consists of code cleanup

Also looking for info on the possibility of using the existing TMDB and TVDB scrapers so that a second set does not need to be maintained

edit - found my answer, no can do currently : http://forum.xbmc.org/showthread.php?tid...ht=scraper

k_zeon · 2011-09-24, 00:29

hey Eldorado.

how would you integrate this into an addon?
ie where would you put the call to get the info.. and how would XBMC know there is data to show.

thanks

Eldorado · (This post was last modified: 2011-09-24, 04:15 by Eldorado.)

k_zeon Wrote:hey Eldorado.

how would you intergrate this into an addon.
ie where would you put the call to get the info.. and how would XBMC know there is data to show.

thanks

The call would be made as you are adding either directories or video items. Currently you must have an IMDB id for it to scrape

Using what is returned you can put it into a new dict with the proper labels

eg.

liz = xbmcgui.ListItem()
infoLabels = {}
infoLabels['genre'] = str(meta['genres'])
infoLabels['duration'] = str(meta['duration'])
infoLabels['premiered'] = str(meta['premiered'])
infoLabels['studio'] = meta['studios']
infoLabels['mpaa'] = str(meta['mpaa'])
infoLabels['code'] = str(meta['imdb_id'])
infoLabels['rating'] = float(meta['rating'])

liz.setInfo(type="Video", infoLabels=infoLabels)

Or if you are using t0mm0's common library it looks like you can just pass infoLabels in:

add_video_item({'url': url}, infoLabels, img=thumb)

This is still very much in dev so just use for testing for now
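
To avoid a KeyError when a field is missing from the returned metadata, the mapping could be wrapped in a small helper. This is a sketch only: `build_info_labels` is a made-up name, and the key names on the right are simply taken from the sample output earlier in the thread.

```python
def build_info_labels(meta):
    """Map a metadata dict onto XBMC infoLabel names, skipping missing keys."""
    mapping = {           # infoLabel name -> key in the returned metadata
        'genre': 'genres',
        'duration': 'duration',
        'premiered': 'premiered',
        'studio': 'studios',
        'mpaa': 'mpaa',
        'code': 'imdb_id',
    }
    labels = {}
    for label, key in mapping.items():
        if meta.get(key) is not None:
            labels[label] = str(meta[key])
    if meta.get('rating') is not None:
        labels['rating'] = float(meta['rating'])
    return labels


sample = {'genres': 'Comedy', 'duration': 100, 'imdb_id': 'tt1499658',
          'rating': 8.2}
info_labels = build_info_labels(sample)
# liz.setInfo(type="Video", infoLabels=info_labels)
```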

k_zeon · 2011-09-24, 08:15

Eldorado Wrote:The call would be as you are adding either directories or video items, currently you must have a imdb id for it to scrape

Using what is returned you can put it into a new dict with the proper labels

eg.

liz = xbmcgui.ListItem()
infoLabels = []
infoLabels['genre'] = str(meta['genres'])
infoLabels['duration'] = str(meta['duration'])
infoLabels['premiered'] = str(meta['premiered'])
infoLabels['studio'] = meta['studios']
infoLabels['mpaa'] = str(meta['mpaa'])
infoLabels['code'] = str(meta['imdb_id'])
infoLabels['rating'] = float(meta['rating'])

liz.setInfo(type="Video", infoLabels=infoLabels)

Or if you are using t0mmo's common library it looks like you can just pass infoLabels in:

add_video_item({'url': url},{infoLabels},img=thumb)

This is still very much in dev so just use for testing for now

so basically

add_video_item({'url': url},{genre='Action',duration='102 mins',premiered='xxxxx' etc },img=thumb)

or would it be slightly different? haven't tried it yet.

Eldorado · 2011-09-24, 15:24

k_zeon Wrote:so basically

add_video_item({'url': url},{genre='Action',duration='102 mins',premiered='xxxxx' etc },img=thumb)

of would it be slightly different. havent tried it yet.

Just a slight correction:

add_video_item({'url': url},{'genre': 'Action','duration':'102 mins','premiered': 'xxxxx' etc },img=thumb)

k_zeon · 2011-09-24, 18:56

Eldorado Wrote:Just a slight correction:

add_video_item({'url': url},{'genre': 'Action','duration':'102 mins','premiered': 'xxxxx' etc },img=thumb)

ahh thanks.

Does this mean that I would need to scrape an IMDB number for each movie the first time round?
The new TVShack has an IMDB number once you click into the page, so what I would need to do is:
1. Get the webpage and scrape the IMDB number.
2. Use metautils to get the movie information.
3. Then call add_directory and place all the info as you mentioned for each movie.

If the movie is found it would scrape the info. Next time, if the data is already there, would it skip scraping the info?
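
For step 1, pulling the IMDB number out of a page could look something like this. It is a rough sketch: the sample HTML is made up, `find_imdb_id` is a hypothetical helper, and it assumes the page links to IMDB using the usual imdb.com/title/tt... URL form.

```python
import re


def find_imdb_id(html):
    """Return the first IMDB title id (e.g. 'tt1499658') linked in the page."""
    match = re.search(r'imdb\.com/title/(tt\d+)', html)
    return match.group(1) if match else None


page = '<a href="http://www.imdb.com/title/tt1499658/">IMDB</a>'
imdb_id = find_imdb_id(page)
```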