Over the last couple of weeks we've been working on getting events onto ZiTYZEN. We believe that more content drives more traffic, and since the content of an event calendar changes on a daily basis, it also brings more returning users.
Our initial approach was to buy the events from other sources, so that we would have less work to do. We contacted Kultunaut, the largest event calendar in Denmark and the event calendar supplier to several other websites. But for some reason they didn't reply.
Another possible approach was to make an HTML/XML parser for each place that had a website with events, and simply collect the content ourselves. I must admit I was very sceptical of this approach, as it could potentially be an extremely large task.
After waiting a few days for Kultunaut to reply, Pia eventually persuaded me to give it a try. So we started off by finding the places which had events and an RSS-feed, as this seemed to be the easiest way to start.
It turned out to be pretty easy and fast to make parsers for the 10(!) places in Denmark which had an RSS-feed.
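To give an idea of why the RSS case is the easy one, here is a minimal sketch in Python. It is not our actual parser code: the sample feed, the `parse_events` function and the field names are made up for illustration, and it assumes a plain RSS 2.0 feed where every event is an `<item>` with a title, a link and a date.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content; a real parser would download this from the venue's site.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example venue</title>
    <item>
      <title>Metallica</title>
      <link>http://example.com/events/1</link>
      <pubDate>Wed, 22 Jul 2009 20:00:00 +0200</pubDate>
    </item>
    <item>
      <title>Jazz night</title>
      <link>http://example.com/events/2</link>
      <pubDate>Fri, 24 Jul 2009 21:00:00 +0200</pubDate>
    </item>
  </channel>
</rss>"""


def parse_events(feed_xml):
    """Return one dict per <item> in an RSS 2.0 feed given as a string."""
    root = ET.fromstring(feed_xml)
    events = []
    for item in root.iter("item"):
        events.append({
            # findtext returns None for a missing tag, so fall back to "".
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "date": (item.findtext("pubDate") or "").strip(),
        })
    return events


events = parse_events(SAMPLE_FEED)
```

Because the feed is already structured, the parser is little more than a loop over the items. The HTML cases need more work per site, since each layout is different.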
Since creating the RSS parsers was pretty easy, I went on to identify 4 different scenarios for websites with events, and for each of these scenarios I created a template, so that developing the actual parser would be much easier.
So far I've written more than 30 parsers, and each one takes roughly 30-60 minutes to create.
The beauty of it is that it runs completely automatically using a cron job. The cron job also removes duplicate material and allows places to slightly change the headlines for their events, e.g. Metallica -> Metallica (Few tickets left). Furthermore, it detects whether a website is down or has changed its layout, and then lets me know.
Well, maybe the best part is that we are independent: we can display the events in any way we choose and wherever we choose (as long as it's on ZiTYZEN).
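One way the duplicate handling could work is sketched below in Python. This is an assumption-laden illustration and not a description of our real matching rules: it assumes that an event is identified by place, date and a normalised headline, and that headline changes come as a remark in parentheses at the end. The names `normalize_title` and `merge_events` are hypothetical.

```python
import re


def normalize_title(title):
    """Drop a trailing remark in parentheses and normalise case and spacing."""
    stripped = re.sub(r"\s*\([^)]*\)\s*$", "", title)
    return " ".join(stripped.lower().split())


def merge_events(stored, scraped):
    """Merge freshly scraped events into the stored ones.

    stored maps (place, date, normalised title) to an event dict;
    scraped is a list of event dicts. An event that is already stored
    is overwritten, so the newest headline wins and no duplicate appears.
    """
    merged = dict(stored)
    for event in scraped:
        key = (event["place"], event["date"], normalize_title(event["title"]))
        merged[key] = event
    return merged


# First run of the job: the event is new.
first_run = merge_events({}, [
    {"place": "Forum", "date": "2009-07-22", "title": "Metallica"},
])

# Later run: the venue has changed the headline slightly.
second_run = merge_events(first_run, [
    {"place": "Forum", "date": "2009-07-22", "title": "Metallica (Few tickets left)"},
])
```

After the second run there is still only one Metallica event stored, now carrying the updated headline. A rule this simple would miss other kinds of headline edits, so a real job would probably need something more forgiving.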
Wednesday, March 4, 2009