
stacked · (This post was last modified: 2009-08-30, 21:24 by stacked.)

sansat Wrote:Thanks Stacked. I was only looking at the first regex, which matched in both working and non-working videos; I did not look into the second one. Do we need the second one for any videos? I am not downloading the videos, so maybe I will remove it and see if it helps.

The second regex uses flashvideodownloader.org to extract the direct video url (*.flv) from the dailymotion site. So you do need it.

Quote:Thanks for your guidance. Regarding multiple URLs, my requirement was to scrape one URL first and, if it does not return a value, then scrape a second URL. When I add the code below for each URL it does not work, so I was not sure whether we can scrape several URLs in one definition (for example, in def PARTS(url), using the code below twice):

Code:
req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()

Thanks for your guidance and tips, as they really point me in the right direction. As you said, this site is very inconsistent, but since it is the first site I am trying, working through various scenarios will hopefully help me create other plugins faster.

The code below checks for matches in part 1 and then part 2
http://pastebin.com/m6119b0ae
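
A minimal sketch of the "scrape the first URL, fall back to the second" idea discussed above. This is not the code behind the pastebin link. The helper names `fetch` and `first_match` are hypothetical, and the sketch assumes Python 3 (`urllib.request`), whereas the code in this thread uses Python 2's `urllib2`.

```python
import re
import urllib.request

USER_AGENT = ('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) '
              'Gecko/2008092417 Firefox/3.0.3')


def fetch(url):
    """Download a page and return its body as text (hypothetical helper)."""
    req = urllib.request.Request(url, headers={'User-Agent': USER_AGENT})
    with urllib.request.urlopen(req) as response:
        return response.read().decode('utf-8', 'replace')


def first_match(urls, pattern, fetch_page=fetch):
    """Scrape each URL in order and return (url, matches) for the first page
    where the pattern finds something, or (None, []) if none do."""
    regex = re.compile(pattern)
    for url in urls:
        try:
            page = fetch_page(url)
        except Exception as err:
            # Network or HTTP error: report it and move on to the next URL.
            print('fetch failed for %s: %s' % (url, err))
            continue
        matches = regex.findall(page)
        if matches:
            return url, matches
    return None, []
```

Passing `fetch_page` as a parameter makes it possible to try the logic on saved page text without touching the network.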

...I really hope your plugin is done because I don't feel like writing any more code :D

sansat · 2009-08-31, 05:08

Ya I do understand :)

I changed the code for parts a bit

Code:
('1 of (.+?)')

to

Code:
('1 of (\d+)')

as the first pattern would capture only "1" when there are 10 or 12 parts: a lazy (.+?) at the very end of a pattern stops after a single character.
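
A small illustration of the difference between the two patterns. The sample string is made up for the example and is not taken from the site.

```python
import re

text = 'Part 1 of 12'  # hypothetical page fragment

# A lazy (.+?) with nothing after it stops at the first character it can.
lazy = re.findall(r'1 of (.+?)', text)

# (\d+) is greedy over digits, so it captures the whole number.
digits = re.findall(r'1 of (\d+)', text)

print(lazy)    # ['1']
print(digits)  # ['12']
```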

With your guidance while writing this plugin, I have got the hang of the flow from categories to parts. The video section is the one I would still like to understand: no code needed :), just how it works.

Quote:The second regex uses flashvideodownloader.org to extract the direct video url (*.flv) from the dailymotion site. So you do need it.

Since the second regex is needed and the download link is unavailable for some shows, we see parts without videos in xbmc. So how do they play in the browser online? Is there any solution for such scenarios, or will we have to live with it?

Also is there any logic to get parts for shows which do not have "part 1 of " or "part 2 of" in any of its pages and just have prev and next references?

Thanks

stacked · 2009-08-31, 08:05

sansat Wrote:Ya I do understand

Since the second regex is needed and the download link is unavailable for some shows, we see parts without videos in xbmc. So how do they play in the browser online? Is there any solution for such scenarios, or will we have to live with it?

Because the website plays the video directly through the dailymotion flash player. In xbmc you are using flashvideodownloader.org to get the video url. Try using a different site to extract the url for you.

Quote:
Also is there any logic to get parts for shows which do not have "part 1 of " or "part 2 of" in any of its pages and just have prev and next references?

I can't think of any.

~~Voinage~~ · 2009-08-31, 15:44

Please stop using sites to grab your urls.

Just adapt the below code to give you the direct blobby.

Code:
#DAILYMOTION
        try:
                # Pull the video id out of the embedded player tag
                daily=re.compile('<param name="movie" value="http://www.dailymotion.com/swf/(.+?)"/>').findall(link)
                # Fetch the dailymotion video page for that id
                req = urllib2.Request('http://www.dailymotion.com/video/%s'%daily[0])
                req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14')
                response = urllib2.urlopen(req).read()
                # Extract the percent-encoded video url from the page source
                match=re.compile('url=rev=.+?&uid=.+?&lang=en&callback=.+?&preview=.+?&video=(.+?)%40%40spark').findall(response)
                addLink(name,"http://www.dailymotion.com"+urllib.unquote(match[0]),"")
        # Any failure (no match, network error) is silently ignored
        except: pass

sansat · 2009-09-01, 19:35

Thanks stacked,..

Thanks Voinage for this new solution.

When I add the code it does not display video in the parts section - its empty.

Below is the code

http://pastebin.com/f339d6ecc

Please let me know if I have missed something.

Thanks

sansat · 2009-09-02, 15:54

Any update on above message ?

sansat · 2009-09-02, 19:13

I removed / from below code

Code:
daily=re.compile('<param name="movie" value="http://www.dailymotion.com/swf/(.+?)"/>').findall(link)

to

Code:
daily=re.compile('<param name="movie" value="http://www.dailymotion.com/swf/(.+?)">').findall(link)

and it passes the daily[0] value, but it still does not display the videos in the parts list (it is still empty), so I am not sure what else has to change in the remaining code to make it work:

Code:
req = urllib2.Request('http://www.dailymotion.com/video/%s'%daily[0])

req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14')

response = urllib2.urlopen(req).read()

match=re.compile('url=rev=.+?&uid=.+?&lang=en&callback=.+?&preview=.+?&video=(.+?)%40%40spark').findall(response)

addLink(name,"http://www.dailymotion.com"+urllib.unquote(match[0]),"")

please let me know

Thanks

stacked · (This post was last modified: 2009-09-02, 20:53 by stacked.)

sansat Wrote:I removed / from below code

Code:
daily=re.compile('<param name="movie" value="http://www.dailymotion.com/swf/(.+?)"/>').findall(link)

to

Code:
daily=re.compile('<param name="movie" value="http://www.dailymotion.com/swf/(.+?)">').findall(link)

and it passes the daily[0] value, but it still does not display the videos in the parts list (it is still empty), so I am not sure what else has to change in the remaining code to make it work:

Code:
req = urllib2.Request('http://www.dailymotion.com/video/%s'%daily[0])
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14')
response = urllib2.urlopen(req).read()
match=re.compile('url=rev=.+?&uid=.+?&lang=en&callback=.+?&preview=.+?&video=(.+?)%40%40spark').findall(response)
addLink(name,"http://www.dailymotion.com"+urllib.unquote(match[0]),"")

please let me know

Thanks

I know where the problem is but I want you to figure it out...

Check if match is returning a value. If not, there is something wrong with the regex. Go to a dailymotion page, look at the source and confirm that the regex is correct. Once you're done with that, check if the url you are passing to addLink is accessible in your browser. Remember to print things out if you don't know what's going on.
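
The "print things out" advice could look something like the sketch below. `debug_findall` is a hypothetical helper name, not part of any plugin API. It simply reports whether a pattern matched, so that an empty result is visible rather than hidden behind a bare `except: pass`.

```python
import re


def debug_findall(label, pattern, text):
    """Run re.findall and print what happened before returning the matches."""
    matches = re.findall(pattern, text)
    if matches:
        print('%s: %d match(es), first = %r' % (label, len(matches), matches[0]))
    else:
        # Show the start of the input so the pattern can be compared with it.
        print('%s: no match; text starts with %r' % (label, text[:200]))
    return matches
```

For example, calling `debug_findall('video', 'video=(.+?)%40%40spark', response)` in place of the plain `findall` call would show straight away whether the video regex is the step that fails.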

sansat · 2009-09-02, 21:07

Thanks for replying, I was going in the same direction as you have mentioned and have changed code

from

Code:
match=re.compile('url=rev=.+?&uid=.+?&lang=en&callback=.+?&preview=.+?&video=(.+?)%40%40spark').findall(response)

to

Code:
match=re.compile('url=rev=.+?&lang=en&callback=.+?&preview=.+?&video=(.+?)%40%40spark').findall(response)

and it displays the videos under parts but it does not play the videos:

Below is the log:

http://pastebin.com/d54ad799d

Anyway will keep you posted

Thanks

sansat · 2009-09-02, 23:10

As an update - I changed below code also

from

Code:
addLink(name,"http://www.dailymotion.com"+urllib.unquote(match[0]),"")

to

Code:
addLink(name,urllib.unquote(match[0]),"")

And it plays the video, which suggests the unquoted value is already a full url. Some of the parts which were shown as empty are still empty, so now I have to see what url is being passed for those videos.
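
One way to avoid guessing whether the host prefix is needed is to let `urljoin` decide. This is only a sketch: `playable_url` is a hypothetical name, the sample inputs in the note below are made up, and it is written for Python 3 (in Python 2 the equivalents are `urllib.unquote` and `urlparse.urljoin`).

```python
from urllib.parse import unquote, urljoin

BASE = 'http://www.dailymotion.com'


def playable_url(escaped):
    """Decode a percent-encoded url and make it absolute against BASE."""
    decoded = unquote(escaped)
    # urljoin returns `decoded` unchanged when it already has a scheme and
    # host, and resolves it against BASE when it is only a path.
    return urljoin(BASE, decoded)
```

With this, an input such as `http%3A%2F%2Fexample.com%2Fa.flv` is returned as `http://example.com/a.flv`, while `%2Fget%2Fa.flv` becomes `http://www.dailymotion.com/get/a.flv`.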

will keep you posted.

Thanks

sansat · (This post was last modified: 2009-09-03, 00:32 by sansat.)

OK, now I need your suggestion on this, as I am not able to figure it out :)

With the code below, ids 13909, 14169 and 14159 work, but 13872, 13867, 13863 and 13851 do not. I am not able to see the source of the non-working videos on dailymotion, as it says access denied (see the link below), yet they play online.
Non-working video:
http://www.dailymotion.com/video/xa1etr

for 14169 working video - below link from dailymotion works -
http://www.dailymotion.com/video/k5Sgqp6J3QrO9r1awhc

Code:
import urllib2,urllib,re

url='http://www.filmicity.in/videos.php?id=13909'

req = urllib2.Request(url)

req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')

response = urllib2.urlopen(req)

link=response.read()

response.close()

print url

daily=re.compile('<param name="movie" value="http://www.dailymotion.com/swf/(.+?)">').findall(link)

print daily[0]

req = urllib2.Request('http://www.dailymotion.com/video/%s'%daily[0])

req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14')

response = urllib2.urlopen(req).read()

match=re.compile('url=rev=.+?&lang=en&callback=.+?&preview=.+?&video=(.+?)%40%40spark').findall(response)

print match[0]

#addLink(name,"http://www.dailymotion.com"+urllib.unquote(match[0]),"")

Thanks

stacked · (This post was last modified: 2009-09-03, 03:33 by stacked.)

That's odd. The embed video works but the link to the actual video doesn't.

works:
http://www.dailymotion.com/swf/xa1etr

doesn't:
http://www.dailymotion.com/video/xa1etr

sansat · 2009-09-03, 18:42

Ya, so for that site we may have to live with that limitation, as I am not able to figure out any other method.

Also, most of the videos in the movies section have parts, but the source does not mention the number of parts, so we can't play those videos either, as I can't figure out a way around that.

Thanks

sansat · 2009-09-04, 22:40

Well, I found a way to display parts for shows that do not have "part 1 of" or "part 2 of" in the source: I set a default value of 15 for the number of parts, so it lists 15 parts by default with consecutive videos under each part. This way I am able to play at least 60% of the videos on this site. Of the remainder, some are google videos, which I still need to work on, and the others are the ones we cannot find a fix for: they play online but not through the source.
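
A rough sketch of the default-count idea, assuming the count is read with the '1 of (\d+)' pattern from earlier in the thread. `part_count` is a hypothetical helper, and how each part maps to a video url is not shown because the post does not spell it out.

```python
import re

DEFAULT_PARTS = 15  # fallback used when the page does not state a count


def part_count(page, default=DEFAULT_PARTS):
    """Return the N from '1 of N' if the page states it, else the default."""
    found = re.findall(r'1 of (\d+)', page)
    return int(found[0]) if found else default
```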

Will keep you posted.

sansat · 2009-09-05, 00:00

For google video in below code, I am not able to get the id

Code:
import urllib2,urllib,re

url='http://www.filmicity.in/videos.php?id=4684'

req = urllib2.Request(url)

req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')

response = urllib2.urlopen(req)

link=response.read()

response.close()

print url

google=re.compile('<embed style="width:625px; height:509px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=(.+?)" flashvars=""> </embed>').findall(link)

print google

#addLink('Play '+name,url,"")

If, instead of capturing just the id, I put (.+?) on the whole url as below, I get a result, but I need only the id. Is there anything I am missing?

Code:
import urllib2,urllib,re

url='http://www.filmicity.in/videos.php?id=4684'

req = urllib2.Request(url)

req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')

response = urllib2.urlopen(req)

link=response.read()

response.close()

print url

google=re.compile('<embed style="width:625px; height:509px;" id="VideoPlayback" type="application/x-shockwave-flash" src="(.+?)" flashvars=""> </embed>').findall(link)

print google

#addLink('Play '+name,url,"")
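
One possible reason the first pattern returns nothing: inside a regex an unescaped `?` is a quantifier, so `swf?docId=` means "sw", an optional "f", then "docId=", and the literal `?` in the url is never matched. The sketch below shows the effect on a made-up embed tag. The tag and the id 12345 are illustrative only and are not copied from the site.

```python
import re

# Hypothetical embed tag, shaped like the one the patterns above target.
html = ('<embed id="VideoPlayback" '
        'src="http://video.google.com/googleplayer.swf?docId=12345" '
        'flashvars=""> </embed>')

# Unescaped: "f?" makes the "f" optional and the literal "?" never matches.
unescaped = re.findall(r'googleplayer.swf?docId=(.+?)"', html)

# Escaped: "\." and "\?" match the literal characters in the url.
escaped = re.findall(r'googleplayer\.swf\?docId=(.+?)"', html)

print(unescaped)  # []
print(escaped)    # ['12345']
```

`re.escape` can also be used to escape a literal url fragment before adding the capture group to it.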

Also from source of the google video, I am not able to figure out the actual url which I can pass through urllib.unquote()

http://video.google.com/videoplay?docid=...0520526787

Please let me know

Thanks