Kodi Community Forum


I'm developing a new addon; it should be rather quick and simple:

http://www.redlettermedia.com

Code will be up on my repo soon: https://github.com/Eldorados/eldorado-xbmc-addons

All the videos are hosted on blip.tv, which has a very nice API and a wiki on how to use it

In case anyone is looking to create a full blip.tv addon - http://wiki.blip.tv

As usual, I have some issues I need help with!

1. Can someone help explain the issue with pulling in this page - http://redlettermedia.com/films/

In Python I get a 404 error, but of course the page loads fine in a web browser

If I try http://redlettermedia.com/films instead (notice no trailing / ) I get a short HTML snippet which basically points to the real page

Not sure what I'm missing and why, and more importantly what to look for when I encounter this

2. I'm slowly learning regex, this situation is bugging me..

I need to do an OR and grab the video id at the same time, the embed source could be formatted in one of two ways:

src="http://blip.tv/play/<video id>
or
src="http://a.blip.tv/api.swf#<video id>

I've tried:
videos = re.compile('<embed.+?(src="http://blip.tv/play/(.+?)"|src="http://a.blip.tv/api.swf#(.+?)")', re.DOTALL).findall(html)

But I get an array back of either:
('src="http://blip.tv/play/huARgqrHZwA%2Em4v"', 'huARgqrHZwA%2Em4v', '')
or
('src="http://a.blip.tv/api.swf#huARgtC2IwA"', '', 'huARgtC2IwA')

Obviously I only want the ID portion, I know my brackets are the cause of the extra results, but I'm not sure how to format the OR correctly

Almost forgot..

Anyone care to give some quick JSON tips?

I may/may not use it for this addon but would like to get a better understanding of it

Blip.tv gives you this: http://blip.tv/players/episode/7FCBxNIHA...&version=2

Yet any tries I've made so far have not worked, I was thinking I could pull it in with a command like this:

response = urllib2.urlopen(url)
jsondata = simplejson.loads(response.read())

But no luck..

Eldorado Wrote:2. I'm slowly learning regex, this situation is bugging me..

I need to do an OR and grab the video id at the same time, the embed source could be formatted in one of two ways:

src="http://blip.tv/play/<video id>
or
src="http://a.blip.tv/api.swf#<video id>

I've tried:
videos = re.compile('<embed.+?(src="http://blip.tv/play/(.+?)"|src="http://a.blip.tv/api.swf#(.+?)")', re.DOTALL).findall(html)

But I get an array back of either:
('src="http://blip.tv/play/huARgqrHZwA%2Em4v"', 'huARgqrHZwA%2Em4v', '')
or
('src="http://a.blip.tv/api.swf#huARgtC2IwA"', '', 'huARgtC2IwA')

Obviously I only want the ID portion, I know my brackets are the cause of the extra results, but I'm not sure how to format the OR correctly

You make your regex far too complicated and both OR cases still have a lot in common. I obviously didn't test it so you might have to adjust some stuff but it might help you get it the way you want:

Code:
<embed.+?src="http://[a.]{0,2}blip.tv/[^#/]*[#/]{1}([^"]*)"

You may notice that the "[a.]{0,2}" part is not 100% correct but it certainly covers both of the cases you mentioned. Furthermore you may have noticed that I'm no fan of ".*?" because that's not supported by every regex implementation (but it is by python). What I prefer is to use something like "[^#/]*" which means "get every character that is neither a # nor a /". So in your case "[^#/]*" will cover both "play" and "api.swf".
As I only used one set of ( and ) your match array should only contain one entry, which is the <video id> you are seeking.
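
For anyone who wants to keep the OR explicit, a non-capturing group `(?:...)` is one way to do it. The following is only a sketch in Python 3, not tested against the real site: the sample `html` string (including the `type` attribute) is made up around the two IDs quoted above, and `EMBED_ID` is just an illustrative name.

```python
import re

# Sample embed tags built from the two IDs quoted earlier in the thread.
# The surrounding markup is an assumption, not copied from the real pages.
html = (
    '<embed type="application/x-shockwave-flash" '
    'src="http://blip.tv/play/huARgqrHZwA%2Em4v">\n'
    '<embed src="http://a.blip.tv/api.swf#huARgtC2IwA">'
)

# (?:...) groups without capturing, so findall() returns only the ID.
# Literal dots are escaped and the optional "a." host prefix is spelled out.
EMBED_ID = re.compile(
    r'<embed[^>]+?src="http://(?:a\.)?blip\.tv/(?:play/|api\.swf#)([^"]+)"'
)

video_ids = EMBED_ID.findall(html)
print(video_ids)
```

Because there is only one capturing group, `findall()` gives back a flat list of IDs rather than tuples with empty slots.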

Eldorado Wrote:Anyone care to give some quick JSON tips?
response = urllib2.urlopen(url)
jsondata = simplejson.loads(response.read())

But no luck..

I'm not sure if it's proper, but what I've been doing is trimming the response down so that it contains just the JSON

Code:
import urllib2

try:
    import json
except ImportError:
    import simplejson as json

url = 'http://blip.tv/players/episode/7FCBxNIHAA?skin=json&callback=foo&version=2'
headers = {'User-agent' : 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20100101 Firefox/6.0'}

# Send a browser-like User-agent with the request
req = urllib2.Request(url, None, headers)
response = urllib2.urlopen(req)
link = response.read()
response.close()

# Slice off the callback wrapper so only the JSON remains
data = json.loads(link[5:-4])

Another tip: to list the keys instead of reading through the whole JSON string, you can do this:

Code:
>>> data.keys()
[u'Post', u'User']
>>> data['Post'].keys()
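
A fixed slice like `[5:-4]` only works while the callback name and the trailing characters stay the same, so here is a rough sketch of trimming the wrapper by looking for the parentheses instead. It is written for Python 3 and runs on a made-up `sample` string: the real blip.tv payload is not shown in this thread, so its shape here (a one-item list with `Post` and `User` keys) is an assumption, and `strip_jsonp` is a hypothetical helper name.

```python
import json


def strip_jsonp(text):
    """Return the text between the first '(' and the last ')'."""
    start = text.index('(') + 1
    end = text.rindex(')')
    return text[start:end]


# Hypothetical response body; only the key names come from the thread.
sample = 'foo([{"Post": {"title": "example"}, "User": {"login": "example"}}]);\n'

payload = json.loads(strip_jsonp(sample))
# If the JSON is a one-item list, take the object inside it.
data = payload[0] if isinstance(payload, list) else payload
print(sorted(data.keys()))
```

With this sample the `[5:-4]` slice would give the same object, but the helper keeps working if the callback name changes length.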

Eldorado Wrote:As usual, I have some issues I need help with!

1. Can someone help explain the issue with pulling in this page - http://redlettermedia.com/films/

In Python I get a 404 error, but of course the page loads fine in a web browser

If I try http://redlettermedia.com/films instead (notice no trailing / ) I get a short HTML snippet which basically points to the real page

Not sure what I'm missing and why, and more importantly what to look for when I encounter this

looks like a pretty broken site - in a browser it gives a 404 too after a bunch of weird 302s (look at the headers in Firebug or Chrome dev tools) so in this case a 404 is not an error.

Montellese Wrote:You make your regex far too complicated and both OR cases still have a lot in common. I obviously didn't test it so you might have to adjust some stuff but it might help you get it the way you want:

Code:
<embed.+?src="http://[a.]{0,2}blip.tv/[^#/]*[#/]{1}([^"]*)"

You may notice that the "[a.]{0,2}" part is not 100% correct but it certainly covers both of the cases you mentioned. Furthermore you may have noticed that I'm no fan of ".*?" because that's not supported by every regex implementation (but it is by python). What I prefer is to use something like "[^#/]*" which means "get every character that is neither a # nor a /". So in your case "[^#/]*" will cover both "play" and "api.swf".
As I only used one set of ( and ) your match array should only contain one entry, which is the <video id> you are seeking.

I really need to take some sort of regex course

Tested it on a few pages and seems to work very well, thanks!

Eldorado Wrote:I really need to take some sort of regex course

Tested it on a few pages and seems to work very well, thanks!

Well, I would just read up on the basics, and then it's just a matter of experience. It took me quite a while to be able to write regular expressions on the fly as well, so no worries

t0mm0 Wrote:looks like a pretty broken site - in a browser it gives a 404 too after a bunch of weird 302s (look at the headers in Firebug or Chrome dev tools) so in this case a 404 is not an error.

Odd, the page loads ok for me in Firefox, don't see anything in firebug..

Are you unable to get to the page at all? Even by navigating via the home page: http://www.redlettermedia.com -> click on Feature Films

Edit - Ah I see it in Chrome, if I'm not mistaken it appears to be this file giving the error:
http://redlettermedia.com/wp-includes/js...mation.gif

Is there any way to still pull in the rest of the page, ignoring this error?

Eldorado Wrote:Odd, the page loads ok for me in Firefox, don't see anything in firebug..

Are you unable to get to the page at all? Even by navigating via the home page: http://www.redlettermedia.com -> click on Feature Films

Edit - Ah I see it in Chrome, if I'm not mistaken it appears to be this file giving the error:
http://redlettermedia.com/wp-includes/js...mation.gif

you aren't loading the gif from python. the page itself returns 404 when it should return 200

Eldorado Wrote:Is there any way to still pull in the rest of the page, ignoring this error?

yes though if you are using my common stuff then there is no way to do that currently.

edit: actually, of course that should work with my common stuff too

t0mm0 Wrote:you aren't loading the gif from python. the page itself returns 404 when it should return 200

yes though if you are using my common stuff then there is no way to do that currently.

edit: actually, of course that should work with my common stuff too

Ok odd, that img seems to be the only error that pops up for me.. otherwise in any browser that page loads fine..

Eldorado Wrote:Ok odd, that img seems to be the only error that pops up for me.. otherwise in any browser that page loads fine..

that's because browsers render the page whatever the http status code. if you look at the headers you will see the status code is 404
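
To do the same from Python, you can catch the error and read the body anyway, since the HTTP error object is also a readable response. Below is a minimal sketch in Python 3 (`urllib.request`; in Python 2 the `urllib2.HTTPError` exception can be read the same way). It does not touch the real site: `Always404` is a made-up local stand-in server that sends real HTML with a 404 status, and `fetch_ignoring_status` is a hypothetical helper name.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class Always404(BaseHTTPRequestHandler):
    """Stand-in for the misbehaving site: real HTML, wrong status code."""

    def do_GET(self):
        body = b"<html><body>films list</body></html>"
        self.send_response(404)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


# Opener with no proxies, so the local demo request is not routed elsewhere.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_ignoring_status(url):
    """Return (status, body) even when the server answers with an error code."""
    try:
        with opener.open(url) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # HTTPError doubles as a response object, so the body is still readable.
        return err.code, err.read()


server = HTTPServer(("127.0.0.1", 0), Always404)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    port = server.server_address[1]
    status, html = fetch_ignoring_status("http://127.0.0.1:%d/films/" % port)
finally:
    server.shutdown()
    server.server_close()

print(status, html)
```

The returned status is still worth checking, because a genuine 404 page would come back through the same path.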

t0mm0 Wrote:that's because browsers render the page whatever the http status code. if you look at the headers you will see the status code is 404

Light bulb just turned on

Hey, as a big RLM fan I wondered if you ever finished this addon or got it to a useable state? Or is there another addon that would better display RLM shows??

EDIT: Just found the Blip videos app. I'm assuming this is what I'm looking for??

I did, check my repo link in the first post, I updated the addon just a few weeks ago

I'm planning on submitting it to the xbmc official repo at some point


Eldorado

Eldorado

Eldorado

Montellese

divingmule

t0mm0

Eldorado

Montellese

Eldorado

t0mm0

Eldorado

t0mm0

Eldorado

peanutismint

Eldorado