Building a Document Management System: Part 2

After building the basic network protocol for our document store, we find something lacking: the documents we work with carry metadata. We need to record what kind of document each one is (e.g. invoice, order) and which attributes it has (e.g. customer number, order number, invoice number).

The Atom Publishing Protocol provides something called “collections” and something called “categories”. At first glance these seem to map nicely to “document type” and “document attributes”. Unfortunately, categories seem to be more like tags, whereas for attributes I need key-value pairs. And from the current Atom draft I don’t fully understand how to create collections. In our document storage application the client defines the document type and should also be able to create new types on the fly with minimal hassle.

Just to clarify: We have some documents like this:

* Doc1, Invoice, customerid:12345, invoiceid:23456, orderid:345678
* Doc2, Invoice, customerid:12345, invoiceid:2912345, orderid:345678
* Doc3, Order, customerid:12345, orderid:345678
* Doc4, Offer, customerid:12345, offerid:345566
* Doc5, ProductPhoto, productid:901234
* Doc6, PizzaOrder
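
To make the shape of this metadata concrete, here is a minimal sketch in Python. The `Document` class and the `known_categories` helper are names made up for illustration, not part of the actual document store; the data is the example list above.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """One stored document: an id, a client-defined type, and key-value attributes."""
    doc_id: str
    category: str  # the document type, e.g. "Invoice"
    attributes: dict = field(default_factory=dict)


# The six example documents from the list above.
DOCUMENTS = [
    Document("Doc1", "Invoice", {"customerid": "12345", "invoiceid": "23456", "orderid": "345678"}),
    Document("Doc2", "Invoice", {"customerid": "12345", "invoiceid": "2912345", "orderid": "345678"}),
    Document("Doc3", "Order", {"customerid": "12345", "orderid": "345678"}),
    Document("Doc4", "Offer", {"customerid": "12345", "offerid": "345566"}),
    Document("Doc5", "ProductPhoto", {"productid": "901234"}),
    Document("Doc6", "PizzaOrder"),
]


def known_categories(documents):
    """Return the document types in use; a type exists as soon as one document carries it."""
    return sorted({doc.category for doc in documents})
```

Note that nothing declares the types up front: a new type such as PizzaOrder comes into existence simply because a client stores a document with that type.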

Maybe I just didn’t understand how to map this to Atom. For now I decided to go with custom HTTP headers:

X-de.hudora-attributes: {"customerid": "12345", "invoiceid": "23456"}
X-de.hudora-category: Invoice
X-de.hudora-timestamp: 2007-06-22

We use JSON to encode the attributes and a plain string to encode the document type. I also found no way for a client to POST a Last-Modified date to the server, so I crafted my own header. But I guess there is a better way.
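
As a rough sketch of how a client and server might build and read these headers in Python: the header names are the ones shown above, while the function names `build_metadata_headers` and `parse_metadata_headers` are hypothetical and only serve the illustration.

```python
import json


def build_metadata_headers(category, attributes, timestamp):
    """Client side: encode type, attributes and timestamp as custom HTTP headers."""
    return {
        "X-de.hudora-category": category,
        # Attributes travel as a JSON object; sort_keys keeps the output stable.
        "X-de.hudora-attributes": json.dumps(attributes, sort_keys=True),
        "X-de.hudora-timestamp": timestamp,
    }


def parse_metadata_headers(headers):
    """Server side: decode the same headers; missing attributes become an empty dict."""
    category = headers.get("X-de.hudora-category", "")
    attributes = json.loads(headers.get("X-de.hudora-attributes", "{}"))
    timestamp = headers.get("X-de.hudora-timestamp")
    return category, attributes, timestamp
```

The resulting dictionary could be handed to any HTTP client library as extra request headers. A document without attributes, like the PizzaOrder above, would simply send an empty JSON object or omit the header.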

Now that we can store attributes on the server, we need a way to retrieve them. We define a /documents/search/{attrname}/{attrvalue}/ resource, represented by Atom-formatted documents.

GET /documents/search/customerid/12345/ HTTP/1.1

This gives us an Atom feed with all documents where customerid=12345:

HTTP/1.0 200 OK
Content-Type: application/atom+xml;charset=utf-8
Content-Length: 22128

<feed xmlns="http://www.w3.org/2005/Atom">
  <author><name>HUODORA DoDoStore Search for customerid=12345</name></author>
  <link href="http://.../documents/search/customerid/12345/" rel="self"/>
  <entry>
    <title>Document 1 (2007-06-21)</title>
    <link href="http://.../document/703...ea1/" type="text/plain"/>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml" name="703...ea1">...</div>
    </content>
  </entry>
  ...
</feed>
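
The search resource itself boils down to a URL scheme plus an exact-match filter over the stored attributes. Here is a minimal sketch in Python, assuming the attributes are held in a plain dictionary keyed by document id; the function names and the in-memory `EXAMPLE_STORE` are made up for this illustration and say nothing about how the real server stores its data.

```python
from urllib.parse import quote, unquote

# Hypothetical in-memory store: document id -> attributes (taken from the example list).
EXAMPLE_STORE = {
    "Doc1": {"customerid": "12345", "invoiceid": "23456", "orderid": "345678"},
    "Doc2": {"customerid": "12345", "invoiceid": "2912345", "orderid": "345678"},
    "Doc3": {"customerid": "12345", "orderid": "345678"},
    "Doc4": {"customerid": "12345", "offerid": "345566"},
    "Doc5": {"productid": "901234"},
    "Doc6": {},
}


def search_path(attrname, attrvalue):
    """Build /documents/search/{attrname}/{attrvalue}/ with both parts URL-escaped."""
    return "/documents/search/%s/%s/" % (quote(attrname, safe=""), quote(attrvalue, safe=""))


def parse_search_path(path):
    """Return (attrname, attrvalue) for a search path, or None if the path does not match."""
    parts = path.strip("/").split("/")
    if len(parts) != 4 or parts[:2] != ["documents", "search"]:
        return None
    return unquote(parts[2]), unquote(parts[3])


def search(store, attrname, attrvalue):
    """Return the ids of all documents whose attribute attrname equals attrvalue."""
    return [doc_id for doc_id, attrs in store.items() if attrs.get(attrname) == attrvalue]
```

Each id returned by `search` would then become one entry in the Atom feed.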

Now we still need a way to represent the attributes in our Atom entries. One simple way is to just use XHTML to represent the data, which keeps it readable in a browser. Call it a microformat. We drop the following XHTML into our Atom content elements:

<content type="xhtml">
  <div xmlns="http://www.w3.org/1999/xhtml" name="703...ea1">
    <dl class="attributes">
      <dt class="customerid">customerid
        <a class="customerid" ...>...</a>
      </dt>
      <dt class="invoiceid">invoiceid
        <a class="invoiceid" ...>...</a>
      </dt>
    </dl>
  </div>
</content>
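
Generating and reading such a definition list is straightforward with Python's standard `xml.etree.ElementTree` module. The sketch below is an illustration under assumptions, not the markup the server actually emits: it puts each value into a plain `dd` element rather than a link, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML elements with a default namespace instead of an ns0: prefix.
ET.register_namespace("", XHTML_NS)


def attributes_to_xhtml(doc_id, attributes):
    """Render attributes as <div name=...><dl class="attributes"> with dt/dd pairs."""
    div = ET.Element("{%s}div" % XHTML_NS, {"name": doc_id})
    dl = ET.SubElement(div, "{%s}dl" % XHTML_NS, {"class": "attributes"})
    for key in sorted(attributes):
        dt = ET.SubElement(dl, "{%s}dt" % XHTML_NS, {"class": key})
        dt.text = key
        # Assumption of this sketch: the value lives in a dd with the same class.
        dd = ET.SubElement(dl, "{%s}dd" % XHTML_NS, {"class": key})
        dd.text = attributes[key]
    return div


def xhtml_to_attributes(div):
    """Read the dt/dd pairs back into a dictionary."""
    dl = div.find("{%s}dl" % XHTML_NS)
    names = [dt.text for dt in dl.findall("{%s}dt" % XHTML_NS)]
    values = [dd.text for dd in dl.findall("{%s}dd" % XHTML_NS)]
    return dict(zip(names, values))
```

Calling `ET.tostring(attributes_to_xhtml(doc_id, attrs), encoding="unicode")` yields a string that can be placed inside an Atom content element of type xhtml, and a client can recover the key-value pairs with `xhtml_to_attributes(ET.fromstring(...))`.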


