Help with scraping once rather than twice

mikey1234
Right, I made a plugin that scrapes EPG info, but to make it easier for myself I wrote two functions that scrape the same info, so it is slow. I have <epg1>"(uk)" to show in the title and <epg2>"(uk2)", which is the description, to show in fanart, but the plugin scrapes the page twice because I wrote two get functions (getuk and getuk2).

They are essentially the same function. This one builds the full now/next string:

Code:
import re
import urllib2

def getuk2(link):
    # Pull the listings page for this channel.
    req = urllib2.Request('http://www.locatetv.com/uk/listings/'+link)
    req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
    response = urllib2.urlopen(req)
    html = response.read()
    response.close()
    # findall returns one tuple per programme; [0] is on now, [1] is on next.
    match = re.compile('<a class="pickable" href="/tv/(.+?)/(.+?)</a><p>(.+?)</p></div></li>').findall(html)
    nowtitle = match[0][0]
    nowdesc = match[0][2]
    nexttitle = match[1][0]
    nextdesc = match[1][2]
    # Tidy the titles: hyphens to spaces, then drop any leftover '">...<' markup.
    nowtitle = nowtitle.replace("-", " ")
    nowtitle = re.sub("\">.+?<", "", nowtitle)
    nexttitle = nexttitle.replace("-", " ")
    nexttitle = re.sub("\">.+?<", "", nexttitle)
    # Strip apostrophes from the descriptions.
    nowdesc = nowdesc.replace("'", "")
    nextdesc = nextdesc.replace("'", "")
    return "[b][NOW]%s[/b]\n%s\n\n[b][NEXT] - %s[/b]\n%s" % (nowtitle, nowdesc, nexttitle, nextdesc)

This one returns only the nowtitle, but it still has to pull the page again:

Code:
def getuk(link):
    # Same page pull as getuk2.
    req = urllib2.Request('http://www.locatetv.com/uk/listings/'+link)
    req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
    response = urllib2.urlopen(req)
    html = response.read()
    response.close()
    match = re.compile('<a class="pickable" href="/tv/(.+?)/(.+?)</a><p>(.+?)</p></div></li>').findall(html)
    # Only the title of the programme that is on now is used here.
    nowtitle = match[0][0]
    nowtitle = nowtitle.replace("-", " ")
    nowtitle = re.sub("\">.+?<", "", nowtitle)
    return "   -   %s" % (nowtitle)
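
Since both functions share the same request and the same regex, the parsing could be split from the formatting, so that one parsed result feeds both the title and the fanart text. The following is only a rough sketch under that assumption: parse_listing, clean_title, format_title and format_fanart are made-up names, not part of the plugin, and the HTML is passed in as a string so the page pull can stay wherever it is now.

```python
import re

# Same pattern as in getuk and getuk2.
LISTING_RE = re.compile('<a class="pickable" href="/tv/(.+?)/(.+?)</a><p>(.+?)</p></div></li>')


def clean_title(raw):
    # Hypothetical helper: same title tidy-up as the original functions.
    title = raw.replace("-", " ")
    return re.sub("\">.+?<", "", title)


def parse_listing(html):
    # Hypothetical helper: parse the page once and keep all four values.
    match = LISTING_RE.findall(html)
    return {
        "nowtitle": clean_title(match[0][0]),
        "nowdesc": match[0][2].replace("'", ""),
        "nexttitle": clean_title(match[1][0]),
        "nextdesc": match[1][2].replace("'", ""),
    }


def format_title(info):
    # Same output as getuk, built from the already parsed values.
    return "   -   %s" % info["nowtitle"]


def format_fanart(info):
    # Same output as getuk2, built from the already parsed values.
    return "[b][NOW]%s[/b]\n%s\n\n[b][NEXT] - %s[/b]\n%s" % (
        info["nowtitle"], info["nowdesc"], info["nexttitle"], info["nextdesc"])
```

With something like this, the page would be read once, passed to parse_listing once, and the resulting dict handed to both format_title and format_fanart. Like the originals, parse_listing raises IndexError if fewer than two programmes are found.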

What can I change below to get the uk2 string but show only the nowtitle part of it, rather than everything? Can I just add something at the end, like [0].string = nowtitle?

I have tried, but it doesn't work.

Code:
try:
    if item('uk2'):
        if item('uk2')[0].string > 1:
            name += getuk2(item('uk2')[0].string)
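
If the uk and uk2 tags keep being handled by separate calls, remembering the downloaded page per link would stop the second call from hitting the site again. This is a minimal sketch, not the plugin's actual code: make_cached_fetcher is a hypothetical name, and fetch stands for whatever function does the real page pull (for example the urllib2 request from getuk2).

```python
def make_cached_fetcher(fetch):
    # fetch is assumed to take a link and return the page HTML.
    cache = {}

    def cached(link):
        # Only call the real fetch the first time a link is seen.
        if link not in cache:
            cache[link] = fetch(link)
        return cache[link]

    return cached
```

The cache here lives only as long as the returned function does, so it would need to be rebuilt on each run of the plugin, otherwise the now/next info would go stale.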

Hope this makes sense to you guys. Once I have sorted this, I can release the plugin with EPG info.