Tuesday, 10 August 2010

PostgreSQL and xpath

Wow.

It all started with me being a lazy developer. I had to store some figures and rather than create a new database table I decided to reuse an existing one. Only problem was that the existing table only had a field of type text for storing these figures (power readings). No worries, I thought, I'll just encode it as XML and use PHP to extract the info when I need it.

Sounds ok so far? Well, I guess I could have gone down that path but it broke my general way of doing things. I wanted to be able to summerize data inside the database where the data lived rather than externally.

So I began looking at what options there were for working with XML inside postgreSQL. Well if you do a simple search for these terms you find yourself looking at the manual for version 8.2 . I soon discovered that XML support in 8.2 was limited at best with third party add-ons required to be useful.

When I took at look at the manual for 8.3 though, it became aparant that I could do XML manipulation as part of a regular SQL query! Imagine that. Now, that lovely XML data that I had sneakily placed into a text field could behave like all the other data that had been given its own columns.

Try this on for size as a scary looking but way cool SQL query for postgreSQL:
SELECT
date_trunc('month'::text, lo.ti_start)::date AS "Reading Taken",
cl.cl_name AS "Client", (xpath('/load/board/text()'::text, lo.ti_data::xml)::text[])[1] AS "DB",
avg((xpath('/load/reading/phase[@units=''amps'']/text()'::text, lo.ti_data::xml)::text[]::real[])[1])::numeric(6,1) AS "Phase1 (A)", avg((xpath('/load/reading/phase[@units=''amps'']/text()'::text, lo.ti_data::xml)::text[]::real[])[2])::numeric(6,1) AS "Phase2 (A)", avg((xpath('/load/reading/phase[@units=''amps'']/text()'::text, lo.ti_data::xml)::text[]::real[])[3])::numeric(6,1) AS "Phase3 (A)", avg((xpath('/load/reading/phase[@units=''kw'']/text()'::text, lo.ti_data::xml)::text[]::real[])[1])::numeric(6,2) AS "Phase1 (kW)", avg((xpath('/load/reading/phase[@units=''kw'']/text()'::text, lo.ti_data::xml)::text[]::real[])[2])::numeric(6,2) AS "Phase2 (kW)", avg((xpath('/load/reading/phase[@units=''kw'']/text()'::text, lo.ti_data::xml)::text[]::real[])[3])::numeric(6,2) AS "Phase3 (kW)"
FROM netdb.load lo, netdb.client cl
WHERE lo.se_type = 39 AND lo.cl_id = cl.cl_id AND lo.cl_id = 3
GROUP BY date_trunc('month'::text, lo.ti_start)::date, cl.cl_name, (xpath('/load/board/text()'::text, lo.ti_data::xml)::text[])[1];

As you can see, with the xpath function, you can treat any text block as XML and then manipulate, select or group by the results.

This is such a powerful idea that I'm still geeking out about it. For a long time I had seen XML formats as a data storage format, throw in XPath and XQuery and you have somethign that could potentially replace a RDBMS. Now think about XML inside your RDBMS and you get the best of both worlds. Better still, you can slowly add XML into your existing database without having to retool all your existing code.

What a fantastic idea.

Look forward to doing lots of cool things with XML and postgreSQL.

Read this knol that I wrote on this subject.

No comments: