I’ve been playing with SQL Anywhere’s support for web services. What is particularly cool is that you can call out to (or consume) a web service right from the database server using a stored procedure. Since I have been playing with blogs, and setting up rss, I wondered if I could consume RSS feeds via the database server. It turns out this is very easy. The following functon pulls the rss feed from digg.com:
CREATE FUNCTION DBA.getdiggrss() RETURNS long VARCHAR url 'http://www.digg.com/rss/index.xml'
Once that is done, I can retrieve the rss as follows:
SELECT DBA.getdiggrss() FROM DUMMY;
Easy… However, XML is not necessarily the most readble output. It would be nice if I could shred it into its component pieces and view the relevant ones. This requires that I look at the XML and determine the tags I wish to look at. Once I identify them, I can use the OpenXML(…) function to build a resultset that is much more readable:
SELECT * FROM OPENXML( DBA.getdiggrss(), '/rss/channel/item', 1, '<root xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:digg="http://digg.com/docs/diggrss/"/>' ) WITH( title long VARCHAR 'title', link long VARCHAR 'link', postdescription long VARCHAR 'description', diggcount INTEGER './/digg:diggCount', commentcount INTEGER './/digg:commentCount', pubdate long VARCHAR 'pubDate', permalink long VARCHAR 'guid')
Basically, I pull all the namespaces from the XML file so OPENXML() knows how to process it. Then I tell OPENXML that I want to examine all of the “item” elements (which equate to particular RSS feed entries). I use the WITH clause to define which fields of the “item” elements I want returned in my resultset, what their column names should be and what their datatypes should be.
Not too difficult, and potentially very powerful. I think next I will create an event now that will periodically pull down the RSS and shred it and then store the information in a table, which I can then create a full text index on so I can do local searching of the blog feed. In fact, if I add several feeds, I will be able to search across them with a single query.