Mythtv's xmltv grabber for Malaysia channels
Ever since I got mythtv up and running months ago, I have always wanted to use the Electronic Program Guide (EPG) feature. Unfortunately, getting tv schedules in a format understandable by mythtv (i.e. xmltv) is not so easy.
From a bit of googling, I found 2 (non-)solutions. The first one involves using tvxb through wine to grab tv schedules from Astro through screenscraping. Apparently, it doesn't work anymore, as the tvxb site is showing the following message:
All Astro satellite channels (No longer works - needs updating. 2008/10/12)
The other solution is a perl script written by Shahada Abubakar that also screenscrapes the Astro listing. Like the first one, it has stopped working, due to the flaky nature of screenscraping.
Of course, the googling and testing were just unnecessary foreplay. I was set from the beginning on coming up with my own solution anyway. With the help of wonderful python libraries such as BeautifulSoup and lxml, I wrote an xmltv grabber that:
can screenscrape either Astro or The Star listings for channels rtm1, rtm2, tv3, ntv7, 8tv, and tv9
is functioning as of 2008-12-31
Here's the script: grabmy.py
To get it to work, install the requirements first:
easy_install BeautifulSoup lxml httplib2 python_dateutil
Then, run the script to generate an xmltv file:
python grabmy.py -f my.xml
Feed mythbackend with the file:
mythfilldatabase --file 1 my.xml
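For readers who have not seen the format before, here is a rough sketch of the kind of XMLTV document that mythfilldatabase consumes. This is not the code from grabmy.py; the function name build_xmltv, the generator name, and the sample programme titles are made up for illustration, and the +0800 offset assumes Malaysian local time.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta


def build_xmltv(programmes, tz_offset="+0800"):
    """Build a minimal XMLTV tree.

    programmes: list of (channel_id, start datetime, minutes, title).
    build_xmltv is a hypothetical helper, not part of grabmy.py.
    """
    tv = ET.Element("tv", {"generator-info-name": "xmltv-sketch"})

    # XMLTV lists every <channel> before the first <programme>.
    for channel_id in sorted({p[0] for p in programmes}):
        channel = ET.SubElement(tv, "channel", {"id": channel_id})
        ET.SubElement(channel, "display-name").text = channel_id

    for channel_id, start, minutes, title in programmes:
        stop = start + timedelta(minutes=minutes)
        programme = ET.SubElement(tv, "programme", {
            # XMLTV timestamps look like "20081231200000 +0800".
            "start": start.strftime("%Y%m%d%H%M%S") + " " + tz_offset,
            "stop": stop.strftime("%Y%m%d%H%M%S") + " " + tz_offset,
            "channel": channel_id,
        })
        ET.SubElement(programme, "title").text = title
    return tv


# Made-up listing data, only to show the shape of the output.
example = [
    ("rtm1", datetime(2008, 12, 31, 20, 0), 30, "Example News"),
    ("tv3", datetime(2008, 12, 31, 23, 45), 30, "Example Late Show"),
]
print(ET.tostring(build_xmltv(example), encoding="unicode"))
```

The channel attribute on each programme is what mythfilldatabase matches against the XMLTV ID configured for a channel in mythtv-setup.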
And finally, here's the EPG in its full glory if you channel-flip at 2am:

Bravo... great job... I'm downloading Python now.. will test it right after this....
Comment by YF Chin — Jan 10, 2009 3:57:42 PM | #
Dear Sayap, I have a problem with "easy_install BeautifulSoup lxml httplib2 python_dateutil". It fails to install. It tried to connect to the internet to search for and install some packages, but failed with an error stating that the setup script does not exist. Please help. I am using Kubuntu 7.10. Your help is much appreciated.
Comment by YF Chin — Jan 14, 2009 7:05:10 PM | #
Seems like you do not have setuptools installed yet, which provides the easy_install script. Install it by:
You may also want to install the following packages that provide the header files needed to compile BeautifulSoup/lxml:
Finally, easy_install the dependencies one by one so that it is easier to tell which one fails in case something goes wrong:
Comment by sayap — Jan 14, 2009 10:17:22 PM | #
Thanks Sayap... Everything is installed, except that feeding mythfilldatabase with the generated xml file fails. I got this error message:
linuxmce@dcerouter:~/Desktop/Myth TV Plugin/Malaysia MYTHTV Programs Guides$ sudo mythfilldatabase --file 1 my.xml
[sudo] password for linuxmce:
missing or invalid parameters for --file option
2009-01-16 01:38:55.132 DataDirect: Deleting temporary files
Please help... again.. THANK YOU VERY MUCH....
Comment by YF Chin — Jan 16, 2009 5:55:00 PM | #
You are probably running a different version of mythtv. Try this instead:
Comment by sayap — Jan 16, 2009 9:03:07 PM | #
Thanks SAYAP... it runs now... But the tv guide does not show up in mythtv. Here is the output in the terminal:
2009-01-17 09:32:54.014 Using runtime prefix = /usr
2009-01-17 09:32:54.030 New DB connection, total: 1
2009-01-17 09:32:54.034 Connected to database 'mythconverg' at host: localhost
2009-01-17 09:32:54.069 New DB connection, total: 2
2009-01-17 09:32:54.069 Connected to database 'mythconverg' at host: localhost
2009-01-17 09:32:54.119 Updating icons for sourceid: 1
2009-01-17 09:32:54.121 New DB connection, total: 3
2009-01-17 09:32:54.121 Connected to database 'mythconverg' at host: localhost
Unknown xmltv channel identifier: 8tv
Skipping channel.
Unknown xmltv channel identifier: ntv7
Skipping channel.
Unknown xmltv channel identifier: rtm1
Skipping channel.
Unknown xmltv channel identifier: rtm2
Skipping channel.
Unknown xmltv channel identifier: tv3
Skipping channel.
Unknown xmltv channel identifier: tv9
Skipping channel.
Updated programs: 0 Unchanged programs: 0
2009-01-17 09:32:54.129 New DB connection, total: 4
2009-01-17 09:32:54.129 Connected to database 'mythconverg' at host: localhost
2009-01-17 09:32:54.131 Adjusting program database end times.
2009-01-17 09:32:54.132 0 replacements made
2009-01-17 09:32:54.132 Marking generic episodes.
2009-01-17 09:32:54.132 Found 0
2009-01-17 09:32:54.133 Marking repeats.
2009-01-17 09:32:54.135 Found 0
2009-01-17 09:32:54.135 Unmarking new episode rebroadcast repeats.
2009-01-17 09:32:54.135 Found 0
2009-01-17 09:32:54.135 Marking episode first showings.
2009-01-17 09:32:54.136 Found 0
2009-01-17 09:32:54.136 Marking episode last showings.
2009-01-17 09:32:54.137 Found 0
2009-01-17 09:32:54.138
===============================================================
| Attempting to contact the master backend for rescheduling.  |
| If the master is not running, rescheduling will happen when |
| the master backend is restarted.                            |
===============================================================
2009-01-17 09:32:54.142 Connecting to backend server: 192.168.80.1:6543 (try 1 of 5)
2009-01-17 09:32:54.143 Using protocol version 31
2009-01-17 09:32:54.151 mythfilldatabase run complete.
2009-01-17 09:32:54.152 DataDirect: Deleting temporary files
I guess I made a mistake in the mythtv setup. Can you please help me?
Comment by YF CHIN — Jan 17, 2009 10:00:00 AM | #
Ah, I think you didn't assign any XMLTV IDs when setting up the channels. Just run mythtv-setup, go to each of the channels, and assign the matching XMLTV ID from this list: rtm1, rtm2, tv3, ntv7, 8tv, tv9
Those XMLTV IDs are hardcoded into the grabber script, so you have to match them exactly :)
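To see exactly which IDs a generated file uses before typing them into mythtv-setup, a small check like the following may help. It is only a sketch: the helper names channel_ids and report are made up here, and it assumes the standard XMLTV layout in which channel elements carry an id attribute and programme elements carry a channel attribute.

```python
import xml.etree.ElementTree as ET

# The six IDs named in this post; they must match mythtv-setup exactly.
EXPECTED_IDS = {"rtm1", "rtm2", "tv3", "ntv7", "8tv", "tv9"}


def channel_ids(xml_text):
    """Return (declared, used): IDs on <channel> and on <programme>."""
    root = ET.fromstring(xml_text)
    declared = {c.get("id") for c in root.findall("channel")}
    used = {p.get("channel") for p in root.findall("programme")}
    return declared, used


def report(xml_text, expected=EXPECTED_IDS):
    """Compare the IDs found in an XMLTV document with the expected set."""
    declared, used = channel_ids(xml_text)
    return {
        # Expected IDs that the file never declares.
        "missing": sorted(expected - declared),
        # IDs in the file that are not in the expected set.
        "unexpected": sorted((declared | used) - expected),
    }
```

With a file on disk, something like report(open("my.xml").read()) would list any mismatch. It only inspects the file; whether each MythTV channel has the same ID assigned still has to be checked in mythtv-setup.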
Comment by sayap — Jan 21, 2009 8:48:11 AM | #
Dear Yap, I noticed that the program guide layout has changed a bit, but the data is still not complete. Is there a certain pre-set time for the data upload? And, Gong Xi Fa Chai.... Happy Chinese New Year....
Comment by YF CHIN — Jan 22, 2009 5:07:10 PM | #
By default, the script only gets the program guide for the current day. You can pass the "-n" parameter to change that, e.g.
python grabmy.py -f my.xml -n 2
will get the guides for today and tomorrow. Anything larger than 2 doesn't do much good, since The Star usually only publishes the program guide one day ahead. So, you have to schedule the script to run periodically. For me, I just set up a cron job to run grabmy.py followed by mythfilldatabase every midnight.
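The nightly routine described above (grabmy.py followed by mythfilldatabase) could be wrapped in a small script along these lines. This is a sketch rather than my actual setup: the function names, the file name nightly_epg.py, and the paths in the crontab comment are placeholders, and some MythTV versions expect different --file arguments.

```python
import subprocess


def build_commands(xml_path="my.xml", days=2, source_id="1"):
    """Return the two commands from this post as argument lists."""
    grab = ["python", "grabmy.py", "-f", xml_path, "-n", str(days)]
    # Assumption: this --file form matches your MythTV version.
    fill = ["mythfilldatabase", "--file", source_id, xml_path]
    return [grab, fill]


def run_all(commands, runner=subprocess.check_call):
    """Run the commands in order.

    check_call raises CalledProcessError on a non-zero exit code, so
    mythfilldatabase is not run if the grabber fails.
    """
    for command in commands:
        runner(command)


# Hypothetical crontab entry (edit with `crontab -e`) that runs a file
# containing the code above plus a final `run_all(build_commands())`
# line every day at midnight:
#
#   0 0 * * * cd /path/to/grabber && python nightly_epg.py
```

The five crontab fields are minute, hour, day of month, month, and day of week, so "0 0 * * *" means 00:00 every day.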
Happy Chinese New Year :)
Comment by sayap — Jan 24, 2009 9:50:53 AM | #
Hi Ms. Yap, Pardon me... I am new to Linux. Can you show me how exactly to do the "cron job" to get mythfilldatabase update every midnite automatically?
Comment by YF Chin — Feb 19, 2009 4:38:47 PM | #
Guys…. if you look at the Astro website (http://www.astro.com.my) they now publish their complete tv listings (albeit channel by channel) in XML / RSS format. For example TV1 is at http://www.astro.com.my/channels/rtm1/ with the RSS feed being at http://www.astro.com.my/rss/channels.asp?sid=M038. Although I am not as technical as many on this forum, surely this must make life easier?
Chaggy
Comment by chaggy looga — Aug 13, 2009 11:42:38 AM | #
when do you exactly run this script? i'm sorry but i have just installed mythtv and i am still googling the way to make it up and running.
Comment by bitto — Feb 21, 2010 3:39:02 PM | #
hi, its me again. i've tried to run the script, but i got these. i don't know whether its common. any insight would do.
bitto@bitto:~$ python grabmy.py -f my.xml
Traceback (most recent call last):
  File "grabmy.py", line 236, in <module>
    main()
  File "grabmy.py", line 225, in main
    for elem in grabber.grab(date + timedelta(i), params_dict):
  File "grabmy.py", line 102, in grab
    html = self.get_html(date, kwargs)
  File "grabmy.py", line 63, in get_html
    return BeautifulSoup(content)
  File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1499, in __init__
  File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1230, in __init__
  File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1263, in _feed
  File "/usr/lib/python2.6/HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "/usr/lib/python2.6/HTMLParser.py", line 148, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.6/HTMLParser.py", line 226, in parse_starttag
    endpos = self.check_for_whole_start_tag(i)
  File "/usr/lib/python2.6/HTMLParser.py", line 301, in check_for_whole_start_tag
    self.error("malformed start tag")
  File "/usr/lib/python2.6/HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
HTMLParser.HTMLParseError: malformed start tag, at line 830, column 36
Comment by bitto — Feb 21, 2010 4:47:00 PM | #
@Chaggy,
I actually started writing this script by parsing the tv listing on the Astro website. However, their listing was less accurate and less complete than the listing on The Star website, so the script uses the latter as the default.
Since it still works, I don't really care about RSS.
@bitto,
I just checked the EPG (for the first time in 6 months) and it still works fine. Maybe your connection went down while the script was running, and that's why it complained about a malformed start tag.
Comment by sayap — Mar 11, 2010 11:38:20 PM | #
the script works fine now (as in no errors anymore). however, why do i not get an EPG like yours? maybe you can reply through my email.
Many thanks
Comment by bitto — Mar 23, 2010 1:34:08 PM | #
@bitto,
Good to hear that the script works now. After generating the .xml file, you need to use mythfilldatabase to consume it, e.g.
mythfilldatabase --file 1 my.xml
or
mythfilldatabase --file 1 -1 my.xml
If there is still no EPG after that, you can send me the log from mythfilldatabase to diagnose (although I have moved from mythtv to freevo :D).
Comment by sayap — Apr 6, 2010 10:41:00 PM | #
If anyone is still interested in this, I have an updated script that works as of January 2011 at home.abubakar.net/pubdocs/, under astro_xmltv.
Comment by shahada abubakar — Feb 7, 2011 10:45:48 PM | #