For me, one reason to consider Clojure was that modern web standards keep getting more complicated, so implementing and maintaining them takes a lot of time, for example for Common Lisp. Even though I think this is not a good thing, I can't do much about it, so I have to get used to it somehow: either put in more work or use something else. I could use ABCL or jScheme, but Clojure is, as I already wrote, a very interesting Lisp dialect, and I see no reason not to use it.
Today I tried to parse feeds with Clojure. There are a lot of feed parser libraries for Java out there, but surprisingly, most of them seem to have disadvantages or are no longer maintained. I decided to use ROME. It has a good QuickStart, which already covers most of what I want to do so far (just for testing purposes), and there is also JavaDoc for it.
OK, first I start Clojure. I have already set the $CLASSPATH variable to an appropriate value, and I use rlwrap. Note that I will break some lines to fit them into the WordPress layout:
    $ rlwrap java -cp $CLASSPATH clojure.lang.Repl
    Clojure
    user=>
Then I import the classes I need:
    user=> (import '(com.sun.syndication.feed.synd SyndFeed SyndContentImpl SyndEntryImpl))
    nil
    user=> (import '(com.sun.syndication.io SyndFeedInput XmlReader))
    nil
    user=> (import '(java.net URL))
    nil
    user=>
Then I download the feed:
    user=> (def input (new SyndFeedInput))
    #'user/input
    user=> (def feed (. input build (new XmlReader (new URL "http://matthias.benkard.de/journal/feed/"))))
    #'user/feed
    user=>
I hope Matthias won't be peeved that I am raising his costs ;-)
So, now that we have fetched the feed, we can go on to get the first entry and its contents:
    user=> (def entries (. feed getEntries))
    #'user/entries
    user=> (def entry0 (. entries get 0))
    #'user/entry0
    user=> (def content (. entry0 getContents))
    #'user/content
    user=> (def content0 (. content get 0))
    #'user/content0
    user=>
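The steps so far can be collected into one function. This is only a minimal sketch: it assumes the ROME jar is on the classpath as in the session above, it uses the shorter `(Class. args)` and `(.method object)` interop forms instead of `new` and `.`, and `first-content-value` is a name I made up for illustration.

```clojure
;; Sketch only: assumes the ROME jar (com.sun.syndication.*) is on the
;; classpath. `first-content-value` is a hypothetical helper name.
(import '(com.sun.syndication.io SyndFeedInput XmlReader))
(import '(java.net URL))

(defn first-content-value
  "Fetches the feed at url-string and returns the value of the first
  content element of its first entry, or nil if there is none."
  [url-string]
  (let [feed    (.build (SyndFeedInput.) (XmlReader. (URL. url-string)))
        ;; getEntries and getContents return java.util.List objects,
        ;; so `first` works on them and yields nil when they are empty.
        entry   (first (.getEntries feed))
        content (when entry (first (.getContents entry)))]
    (when content
      (.getValue content))))
```

Calling `(first-content-value "http://matthias.benkard.de/journal/feed/")` should then return the same string as the step-by-step version, without leaving intermediate vars in the `user` namespace.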
So far, so good. Now let's see what the content contains:
    user=> (. content0 getValue)
    "\r\n <div xmlns=\"http://www.w3.org/1999/xhtml\">\r\n \r\n<p><a href=\"http://freitag.de/\">Der Freitag</a> ist laut dem <a href=\"http://www.spiegelfechter.com/wordpress/475/%e2%80%9eder-freitag%e2%80%9c-auferstanden-aus-ruinen\">Spiegelfechter</a> die „letzte ‚links-intellektuelle‘ Wochenzeitung“. Ist er ein Printmedium, das es sich zu abonnieren lohnt? Vielleicht gar das einzige? \r\n</p>\r\n </div>\r\n "
    user=>
This looks good. It seems to be the content of http://matthias.benkard.de/journal/63. The description of the feed itself can be read in the same way:
    user=> (. feed getDescription)
    "\n Geschwafel eines libertärsozialistischen Geeks\n "
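The strings that come back from the feed carry leading and trailing whitespace. A minimal sketch for stripping it, assuming plain `java.lang.String` values; `clean-text` is a hypothetical helper of mine, not something ROME provides:

```clojure
;; Sketch only: `clean-text` is a hypothetical helper, not part of ROME.
(defn clean-text
  "Returns s with leading and trailing whitespace removed, or nil for nil."
  [s]
  (when s
    ;; String.trim drops leading and trailing characters up to U+0020,
    ;; which covers spaces, \r and \n.
    (.trim s)))

(clean-text "\n Geschwafel eines libertärsozialistischen Geeks\n ")
;; => "Geschwafel eines libertärsozialistischen Geeks"
```

With the feed from above, `(clean-text (. feed getDescription))` would give the description without the surrounding line breaks.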