[RELEASE] EPG Scraper for Switzerland, Germany, and Austria
#1
Hello to all :)

I have written an EPG scraper that produces XMLTV files from the
source tv.search.ch. The files can be used in a wide range of applications
that need XMLTV data (MythTV and others).

http://code.google.com/p/epg-swiss/

- runs on Linux
- is written in shell script
- provides EPG data for over 270 channels
- At the moment only fast-mode scraping is supported.

1.) Download the beta script from the project URL
2.) Install the needed software (lynx, snarf, awk, grep, sed)
3.) Extract the tar file and change to the directory tvsearch
4.) Open setup.cfg in an editor of your choice
5.) Every channel with a # in front will not be scraped
6.) Do not delete or insert lines in this configuration file
7.) Save your configuration
8.) Scrape the EPG data with the following command

./epg-swiss.sh fast 2

This command extracts the EPG data for the selected channels for 3 days
and stores it as epg.xml inside the xml folder.
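
Step 2 can be checked before the first run. This is only a minimal sketch and not part of epg-swiss; the function name missing_tools is made up for this example, and the tool list is simply the one from step 2.

```shell
#!/bin/sh
# Print the names of the given commands that are not found on PATH.
# (missing_tools is a hypothetical helper, not part of epg-swiss.)
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space before printing.
    echo "${missing# }"
}

# Tool list taken from step 2 of the instructions.
need=$(missing_tools lynx snarf awk grep sed)
if [ -n "$need" ]; then
    echo "Please install: $need" >&2
else
    echo "All required tools found."
fi
```

If anything is reported as missing, install it with your distribution's package manager before running epg-swiss.sh.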


Feedback would be nice ....
PS: This is a beta release

Greetings from Switzerland
Hans
#2
http://code.google.com/p/epg-swiss/downl...r&can=2&q=


New release ;-)

You can choose from over 250 channels with EPG data ...

Regards
Hans
#3
OK
After 6 weeks of programming I am releasing my XMLTV scraper for
the website http://tv.search.ch/

The scraper now supports both fast and slow mode.
There are over 250 channels for which you can produce XMLTV listings.

Here is the link to the new script

http://code.google.com/p/epg-swiss/downl...r&can=2&q=

Regards
Hans
#4
@linuxluemmel
Wow, your tool works great, nice work!

Do you have an idea how to use your script's output in TVHeadEnd?
#5
panzaeron Wrote: @linuxluemmel
Wow, your tool works great, nice work!

Do you have an idea how to use your script's output in TVHeadEnd?

Sorry, but MythTV is the only one I know well ... I guess there is someone here you can ask about XMLTV files and TVHeadEnd.

Which channels do you use?
I use almost only German channels ...
Regards Hans
#6
@linuxluemmel
I only use the German channels too.

Hmm, if I understand the TVHeadEnd documentation correctly, the output of an XMLTV grabber is parsed internally:
Code:
Tvheadend can parse the output from an xmltv grabber. It executes the grabber directly and parse the information internally. It's important that you configure your xmltv grabber to take use of its cache (this should default be on) or it might cause excessive burden on the server if you stop and start tvheadend often.

"Channel mapping"
Tvheadend will use the channel icon URL referred in the xmltv output as the channel icon. This icon is visible in the web ui, and also forwarded externally via the output modules where applicable.

Due to the fact that there may be differences between how channels are named in xmltv and DVB tvheadend utilizes a channel matching heuristic.
If more than 10 consecutive events (i.e programs) matches between the EPG received from DVB and the xmltv EPG, the channels are said to match.

It will also match a channel from xmltv to the rest of the system if the channel names matches exactly.

"Transfer of event information"
Once a channel has been matched all events will be transferred to the internal EPG.
If there is any conflicting information between the DVB EPG and XMLTV EPG the DVB EPG will always take precedence.

"Configuration"
Two global configuration statements are used to configure xmltv:

xmltvgrabber = <path> (optional)
Path to the xmltv grabber, e.g '/usr/bin/tv_grab_se_swedb'.
If not specified, xmltv will be disabled.

xmltvinterval = <seconds> (optional)
Specifies the time, in seconds, between executions of the xmltv grabber.
This defaults to 43200 (12 hours).

So I can't use your script, is that right?
#7
If a script produces a valid XMLTV-formatted file but is not a "proper" XMLTV grabber, you can use the following "grabber" together with Tvheadend to load that file into Tvheadend:

http://code.google.com/p/tv-grab-file/so...b_file?r=2

tv_grab_file is an extremely simple shell script that presents itself to Tvheadend as a grabber, but in fact just passes an XMLTV-formatted file through as its output.
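
To illustrate the idea, here is a rough sketch of how such a pass-through grabber could look. It is not the actual tv_grab_file source. It assumes the caller probes the script with the usual XMLTV options --description and --capabilities; the function name grab_file and the file argument are made up for this example, and the last lines are only a demo with a throwaway file standing in for epg.xml.

```shell
#!/bin/sh
# Sketch of a file-based "grabber": it answers the XMLTV
# --description and --capabilities queries and otherwise prints
# an existing XMLTV file unchanged.
# (grab_file and its first argument are hypothetical.)
grab_file() {
    xmltv_file=$1
    shift
    case "${1:-}" in
        --description)  echo "Pass an existing XMLTV file through (sketch)" ;;
        --capabilities) echo "baseline" ;;
        *)              cat "$xmltv_file" ;;
    esac
}

# Demo: a throwaway file stands in for xml/epg.xml.
demo_file=$(mktemp)
printf '%s\n' '<tv></tv>' > "$demo_file"
grab_file "$demo_file" --description
grab_file "$demo_file"
rm -f "$demo_file"
```

In a real wrapper the file path would be fixed inside the script, and the script itself would be installed under a tv_grab_* name so it can be found as a grabber.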

I hope this helps.
#8
Thanks for the info.
I guess panzaeron will have fun. But there may be a little problem:
the generated xmltv-ids are hard-coded inside the script.
#9
That is not a big problem, because within Tvheadend you can map the xmltv-id to a channel by hand, or, when the name of the channel (so not the ID) matches, it might map it automatically.

It only uses the xmltv-id internally but presents the user with the channel name linked to that ID.
#10
Rigolo Wrote: That is not a big problem, because within Tvheadend you can map the xmltv-id to a channel by hand, or, when the name of the channel (so not the ID) matches, it might map it automatically.

It only uses the xmltv-id internally but presents the user with the channel name linked to that ID.

I tested it with the tv_grab_file wrapper and it works, but linking a channel to its xmltv-id is not as easy as it could be. In TVHeadEnd, every XMLTV channel that comes in through tv_grab_file is displayed only with its channel id (for example: epgs123), not with the channel name. I use the tvsearch folder with the icons inside to find the channel id: I search for the correct picture, and its file name is the id. With this id I can link the channel to the id in the XMLTV channel list.

For use with TVHeadEnd it would be much better if your great tvsearch scraper could act as a "real" XMLTV grabber. But even without this, and with a little work of my own (channel -> id linking), it works great.
#11
OK, as soon as I have time ... I guess I can add a switch for this.
At the moment I am developing an XBMC addon, which is very time-consuming.
Since this week I have been coding like crazy ;-)
BTW, do you use slow mode or fast mode?
Compared to the tv_search grabber from xmltv, my script runs like
a Ferrari ;-)
Hans
#12
OK fellows :-)
In the future it will be possible to get the script via svn checkout.
Cheers Hans
#13
Does anyone have my tool running on Debian Squeeze?
I have a bug report on the project page about a script error on Debian:
http://code.google.com/p/epg-swiss/issues/detail?id=3
#14
I have released 0.9H for testing via svn ...

svn checkout http://epg-swiss.googlecode.com/svn/trunk/ epg-swiss-read-only

Feedback would be nice
Hans
#15
I created a patch against version 0.9H so that it now works with Debian 6.0.

http://epg-swiss.googlecode.com/files/pa...ebian.diff

Output on Ubuntu:
-rw-r--r-- 1 root root 32460 2010-08-17 22:25 chan280_0.html

Output on Debian:
-rw-r--r-- 1 root root 32460 2010 08-17 22:25 chan280_0.html

Inside the script I used a statement like ls -alv html | awk '{print $8} ' to get all the file names in a directory.

Because Debian prints the date as shown above, the file name moves from the 8th to the 9th field, so you have to use {print $9} to get the same result.
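
One possible way to avoid depending on the column layout of ls at all is to let the shell expand the file names itself. This is only a sketch of an alternative, not what the patch does; the function name list_names is made up for this example. Unlike ls -a, the glob skips hidden entries as well as . and .. entries.

```shell
#!/bin/sh
# Print the entry names of a directory, one per line, without
# parsing the columns of `ls -l`.
# (list_names is a hypothetical helper, not part of epg-swiss.)
list_names() {
    dir=$1
    for path in "$dir"/*; do
        # An unmatched glob stays literal; skip it.
        [ -e "$path" ] || continue
        basename "$path"
    done
}

# Demo with a throwaway directory standing in for the html folder.
demo_dir=$(mktemp -d)
touch "$demo_dir/chan280_0.html" "$demo_dir/chan281_0.html"
list_names "$demo_dir"
rm -rf "$demo_dir"
```

Because the names never pass through the date columns, the same code should behave identically on Ubuntu and Debian.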
