XMLTV Epg Grabber
#16
It's a tool that grabs EPG data for your channels from websites or XML sources, and merges the different sources into one xmltv file.
You can, for instance, use it with the TVGuide program addon for XBMC, and connect each channel with a .strm file or a stream from favorites.

More info here: http://www.webgrabplus.com/
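
To make the "merge into one xmltv file" idea concrete, here is a minimal Python sketch. It is only an illustration of the concept, not how WebGrab+Plus works internally; the function name `merge_xmltv` and the two tiny sample documents are made up for the example.

```python
import xml.etree.ElementTree as ET


def merge_xmltv(documents):
    """Merge several XMLTV documents (given as strings) into one <tv> element."""
    merged = ET.Element("tv")
    channels, programmes = [], []
    seen_channel_ids = set()
    for text in documents:
        root = ET.fromstring(text)
        for channel in root.findall("channel"):
            channel_id = channel.get("id")
            # Keep only the first definition of each channel id.
            if channel_id not in seen_channel_ids:
                seen_channel_ids.add(channel_id)
                channels.append(channel)
        programmes.extend(root.findall("programme"))
    # XMLTV lists all <channel> elements before the <programme> elements.
    merged.extend(channels)
    merged.extend(programmes)
    return merged


# Hypothetical sample data, just to exercise the function.
source_a = """<tv>
  <channel id="one"><display-name>One</display-name></channel>
  <programme channel="one" start="20130126200000 +0100"><title>News</title></programme>
</tv>"""
source_b = """<tv>
  <channel id="two"><display-name>Two</display-name></channel>
  <programme channel="two" start="20130126210000 +0100"><title>Film</title></programme>
</tv>"""

merged = merge_xmltv([source_a, source_b])
print(ET.tostring(merged, encoding="unicode"))
```

The sketch keeps channels first and programmes after, and drops repeated channel ids, which is probably what you want when two sources cover the same channel.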

Great tool
HTPC Server - Windows 8.1 + XBMC Helix | Intel QuadCore, 4GB RAM, 4 TB SATA
Intel NUC D54250WYK - Windows 8.1 + XBMC Helix | Logitech Harmony 750
Samsung UE8005 | Bluetooth keyboard & mousepad
Reply
#17
Greetings... I'm having the same problem as gborri mentioned above:

Code:
Unhandled Exception: System.TypeLoadException: A type load exception has occurred.
[ERROR] FATAL UNHANDLED EXCEPTION: System.TypeLoadException: A type load exception has occurred.

Ubuntu 12.04 - Mono 2.10.8.1 - WebGrab 1.1.1

Any guess? Thank you...
Reply
#18
Got it working...
I forgot the double quotes in the command:

mono WebGrab+Plus.exe "/your_path_to_WebGrab_directory"


just after the executable...

Then I got an error message about a missing library, which I solved by installing mono-complete with apt-get install mono-complete.
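
If you launch the grabber from a script rather than typing the command, a rough Python sketch like the one below sidesteps the quoting question, because the path is passed as its own argument. The helper name `build_webgrab_command` is invented here, and the path is the same placeholder as in the command above; that quoting matters mostly for paths with spaces is my assumption.

```python
import shlex


def build_webgrab_command(config_dir, exe="WebGrab+Plus.exe"):
    """Return the argument list for: mono WebGrab+Plus.exe "<config_dir>"."""
    # An argument list needs no shell quoting, even if the path has spaces.
    return ["mono", exe, config_dir]


command = build_webgrab_command("/your_path_to_WebGrab_directory")
# shlex.join shows how the same command would have to be quoted in a shell.
print(shlex.join(command))
# To actually run it (requires mono to be installed):
#   import subprocess; subprocess.run(command, check=True)
```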

Thanks...
Reply
#19
Thanks Zazza, your suggestion has resolved my problem.

I have a question for WG++Maker.
I would like to add episode info to the ini file for guidatv.sky.it: my problem is that I would like it in either xmltv_ns or onscreen format.
How can I make that happen?

thanks
Giovanni
Reply
#20
(2013-01-24, 13:50)gborri Wrote: I would like to add episode info to the ini file for guidatv.sky.it [...] in either xmltv_ns or onscreen format. How can I make that happen?

To get episode/season numbers use this guidatv.sky.it.ini file I've modified.

Code:
**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
* @Site: guidatv.sky.it
* @MinSWversion: V0
*   none
* @Revision 2 - [30/08/2011] Willy De Wilde/Jan van Straaten
*   added credits/category and production date
* @TESTING - [26/01/2013] Zazza
*   added episode-num in xmltv_ns format
* @Remarks:
*   none
* @header_end
**------------------------------------------------------------------------------------------------

site {url=guidatv.sky.it|timezone=UTC+01:00|maxdays=7.1|cultureinfo=it-IT|charset=UTF-8|titlematchfactor=90|episodesystem=xmltv_ns}
url_index{url|http://guidatv.sky.it/guidatv/canale/|channel|.shtml}
urldate.format {daycounter|0}
*
index_urlshow {url ()||<a href="||">|</li>}
*
index_showsplit.scrub {multi ()|<p class="ora">|||<li class="dispari">}
index_date.scrub {single(force)|<p class="giorno">||h.|<p class="tools">}
index_start.scrub {single|||</p>}
index_title.scrub {single(separator=" - " include=first)|<strong>||</strong>|</li>}
*
* enable the next two lines to create a channel list
*index_site_channel.scrub {multi|<ul id="clup-menu-bar"|class="">|</a>|</ul><!-- end clup-menu-bar -->}
*index_site_id.scrub {multi|<ul id="clup-menu-bar"|weekChannel=|" class=|"</ul><!-- end clup-menu-bar -->}
*
title.scrub {single(separator=" - " include=first)|<title>|||</title>}
subtitle.scrub {multi(separator=" - " exclude=first)|<title>||</title>}

temp_1.scrub {single(exclude="protagonista")|<div class="__description">||</div>|</div>}
temp_2.scrub {multi|<div class="__description">||</div>|</div>}
temp_3.scrub {multi|<div class="testo">||</div>|</div>}

* find season

temp_4.scrub {multi(separator=" " include=first)|<div class="__pilat">||</div>|</div>}
temp_4.modify {remove('temp_4' not "")|\'}
temp_4.modify {calculate(not "" format=F0)|'temp_4' 1 -}

* find episode

temp_5.scrub {single|<div class="__pilat">| Stagione Ep.| -}
temp_5.modify {calculate(format=F0)|'temp_5' 1 -}

episode.modify {addend('temp_4' = "")|...}
episode.modify {addend('temp_4' not "")|'temp_4'}
episode.modify {addend('temp_4' not "")|.'temp_5'.}

description.scrub {single|__pilat">||</div>|</div>}

category.scrub {single|<h5>Informazioni</h5>|<strong>Genere</strong>: |<br />|<br />}
rating.scrub {single|<h5>Informazioni</h5>|<img style="display:inline"|</p>|</p>}
director.scrub {single(separator=", con " include=first)|<div class="testo">|Regia di |; |</div>}
actor.scrub {single(separator=", con " exclude=first)|<div class="testo">|Regia di |; |</div>}
*
*
* operations:
subtitle.modify {remove|Sky.it}
description.modify {addend(null)|'temp_1'}
description.modify {addend(null)|'temp_2'}
description.modify {addend(null)|'temp_3'}
description.modify {remove|<span style="font-weight: bold;">}
description.modify {remove|<span style="font-style: italic;">}
description.modify {remove|<font face="Arial">}
description.modify {remove|<span style="FONT-WEIGHT: bold; FONT-STYLE: italic">}
description.modify {remove|<span style="FONT-STYLE: italic">}
description.modify {remove|<span style="FONT-WEIGHT: bold">}
description.modify {cleanup}
rating.modify {replace(~ "per tutti")|'rating'|per tutti}
rating.modify {replace(~ "bambini accompagnati")|'rating'|bambini accompagnati}
rating.modify {replace(~ "V.M. 12")|'rating'|12+}
rating.modify {replace(~ "V.M. 14")|'rating'|14+}
productiondate.modify {calculate(format=F0)|'description' 1 *}
productiondate.modify {remove(0)|'productiondate'}
description.modify {remove|Regia di 'director', }
description.modify {remove|con 'actor'; }
actor.modify {replace|,|\|}

Please test it and let me know if it works inside a frontend. Bear in mind that the HTML source page does not always show season and episode numbers. In that case the episode-num element is filled with the string "...".
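
For anyone unfamiliar with the xmltv_ns format, the idea behind the season/episode lines in the ini can be sketched in Python as follows. This is only an illustration, not part of WebGrab+Plus: the function name `xmltv_ns` is made up, and the handling of a missing episode number is my own assumption. xmltv_ns counts from zero, which is why 1 is subtracted from both numbers.

```python
def xmltv_ns(season=None, episode=None):
    """Build an xmltv_ns episode-num string from 1-based season/episode numbers."""
    if season is None:
        # Same fallback as the ini: no season found gives the placeholder "...".
        return "..."
    season_part = str(season - 1)
    # Assumption: a missing episode number leaves the middle field empty.
    episode_part = "" if episode is None else str(episode - 1)
    # Layout is "season.episode.part"; the part field is left empty here.
    return f"{season_part}.{episode_part}."


print(xmltv_ns(2, 5))
print(xmltv_ns())
```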

Thank you.
Reply
#21
(2013-01-26, 17:21)Zazza Wrote: To get episode/season numbers use this guidatv.sky.it.ini file I've modified. [...]

Thanks Zazza,

in the meantime I have tried a solution and it seems to work. I'll look at your code and let you know if I have done something different.
I know that episode/season info is not always there, but if WebGrab+Plus is fast enough to grab incrementally, I hope to have a better chance of getting it.

I have a problem (I think you do too): the Italian letters à, ò, ì etc. are not recognized correctly. Have you found a solution?

thanks
Giovanni
Reply
#22
Is it possible to rework the file so that it contains information from an XBMC scraper (a scraper is an XML file, so it would probably be easy to reuse), for example for movies?
And I would love trakt.tv integration, so that looking at the file I could see which movies I have already seen and don't need to record.
Reply
#23
Hi Giovanni,
the main problem with this script is the source page used to scrape information. Pages from guidatv.sky.it are often erratic, out of date (EPG data in the source page doesn't match the updated info inside the JavaScript code in the same page) and often incomplete (missing episode/season details). XMLTV does a better job (because it uses other SKY sources) but it's DAMN SLOW, while WebGrab+ is pretty fast. For now I have switched from WebGrab+ to xmltv (tv_grab_it) with the xmltv2vdr Perl script, and set up a crontab for scheduled EPG updates, but I'm open to any suggestions...

P.S. : I'll take a look at the accented letters problem...
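
One common cause of this kind of symptom is a charset mismatch, i.e. text stored as UTF-8 but read with a different encoding. Whether that is what happens with guidatv.sky.it is only a guess on my part, but this small Python sketch shows what the effect looks like (the sample string is made up):

```python
original = "Novità, però, così"

# Bytes as a UTF-8 page would deliver them.
raw_bytes = original.encode("utf-8")

# Decoding with the wrong charset turns each accented letter into two characters.
wrong = raw_bytes.decode("latin-1")
# Decoding with the matching charset gives the text back unchanged.
right = raw_bytes.decode("utf-8")

print(repr(wrong))
print(right)
```

If the output you get looks like the first line, it is worth checking which encoding the page really uses and which one your frontend assumes when reading the xmltv file.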
Reply
#24
(2013-01-29, 03:36)Zazza Wrote: XMLTV does a better job (because it uses other SKY sources) but it's DAMN SLOW, while WebGrab+ is pretty fast. [...]

I'm interested in WebGrab+ for 2 more reasons: 1. the description, which xmltv doesn't get; 2. IMDb integration (I've not tried it yet).

Giovanni
Reply
#25
As of today there is also another place to get support for WebGrab+Plus.

Visit its new website http://www.webgrabplus.com/

See you there WG++Maker --- Jan
Reply
#26
Hello

I've been trying all night to get this working but I can't seem to figure it out. I'm trying to get EPG from the Balkan sites tvprofil.net and mojtv.hr. I've set up my Linux installation and config file according to the manual, but when I run mono and actually try to grab the EPG I keep getting these weird errors:

Quote: Channel HTV1 site -- TVPROFIL.NET -- update mode full

Debugging information SiteIni; UrlIndex builder
SiteIni entry :
urldate format type: datestring, value: |yyyy-MM-dd

UrlIndex created:
http://tvprofil.net/xmltv/data/htv1.hr/2...il.net.xml
Unable to update Channel HTV1

System.TypeInitializationException: An exception was thrown by the type initializer for WebGrab.PostProcess ---> System.NullReferenceException: Object reference not set to an instance of an object
at WebGrab.PostProcess..cctor () [0x00000] in :0
--- End of inner exception stack trace ---
at WebGrab.Scrub.ScrubDateAndLogo (System.String index, WebGrab.SiteIni scrubstrings) [0x00000] in :0
at WebGrab.Program.UpdateChannel (System.String strIndex, WebGrab.ChannelToUpdate Chan, WebGrab.XmlTarget xTarget) [0x00000] in :0
at WebGrab.Program.Main (System.String[] args) [0x00000] in :0

Every last one of my channels gets this error. Anyone know what the problem might be?
Reply
#27
(2013-02-21, 22:56)Sajk Wrote: I'm trying to get EPG from the Balkan sites tvprofil.net and mojtv.hr [...] Every last one of my channels gets this error. Anyone know what the problem might be?

I don't know how it works on Linux, but you should have the .ini files in the same folder as WebGrab++.config.xml, i.e. in your case mojtv.hr.ini and tvprofil.net.ini.
You also need to create the WebGrab++.config.xml configuration, in which you list the channels you want TV listings for.
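
A quick way to check that folder layout is a small Python sketch like this one. The helper name `missing_files` is invented for the example, and the temporary folder only stands in for a real WebGrab+Plus directory; point it at your own folder instead.

```python
import tempfile
from pathlib import Path


def missing_files(config_dir, site_inis):
    """Return the expected files that are not present in config_dir."""
    folder = Path(config_dir)
    expected = ["WebGrab++.config.xml", *site_inis]
    return [name for name in expected if not (folder / name).is_file()]


# Demo with a throwaway folder that lacks one of the two site ini files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "WebGrab++.config.xml").write_text("placeholder")
    (Path(tmp) / "tvprofil.net.ini").write_text("placeholder")
    result = missing_files(tmp, ["tvprofil.net.ini", "mojtv.hr.ini"])
    print(result)
```

An empty list means the config file and every site ini you named are in the folder.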
Reply
#28
English please
Reply
#29
Hi all. First time poster long time lurker.

Just a quick, excited post to say that I've managed to get this to work on OS X Lion!!!

Having tested all the xmltv grabbers out there over the past month, I settled on WG++. The problem was I have a Mac and my XBMC setup is on a Raspberry Pi running Xbian.

I tried running WG++ on the Pi but Mono has issues with the hard-float architecture so no joy.

I could get it to run in a Windows 7 environment running VirtualBox on my Mac but I wanted it automated and I didn't fancy having VirtualBox running all the time.

Anyway, long story short. There is an OS X version of Mono available here and after much tinkering I now have it running via the OS X command line.
This required a slight deviation from the Linux installation instructions and if anyone is interested I can post the details of what I did.

During my journey, I also got XMLTV to run automatically on the Pi so can help anyone wanting to go down that route if needed too.

Many thanks for a great site. I'm off to download my EPG.

Cheers


Rich
Reply
#30
Oh yes please post them ! (Osx)
Reply
