Rethink databases

The classic way we use databases is shaped by the amount of storage available for data. It’s time to rethink that pattern, now that the cost of storage is falling so fast that it seems bound to converge on free, or nearly free, at some point.

Consider the following record in a database:

residence Oldebroek

The traditional pattern for updating this record is to replace the value in place:

residence Oldebroek → Mastenbroek

Why have we always done this? Well, in fact we haven’t. When storage was still just paper, we slipped a new sheet into the folder, and that became the valid record. As a bonus, you kept a history over time.


Now that the cost of storage is no longer a problem, we should go back to that paradigm. The above example would translate to:

residence point in time 1 Oldebroek
residence point in time 2 Mastenbroek

Note that we added a notion of time to the record so that we can work out which version of the record currently prevails. Think for a moment about what this means for your application: you simply store a new fact at a point in time, without erasing the past.
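To make the idea concrete, here is a minimal sketch in Python of an append-only store. The names `Fact`, `FactLog`, `record` and `as_of` are made up for this illustration and do not come from any particular product. Points in time are plain integers, as in the example above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Fact:
    """One immutable fact: an attribute had a value from a point in time."""
    attribute: str
    point_in_time: int
    value: str


class FactLog:
    """Hypothetical append-only store: facts are added, never overwritten."""

    def __init__(self) -> None:
        self._facts: List[Fact] = []

    def record(self, attribute: str, point_in_time: int, value: str) -> None:
        # Storing a new fact never touches the facts that are already there.
        self._facts.append(Fact(attribute, point_in_time, value))

    def as_of(self, attribute: str, point_in_time: int) -> Optional[str]:
        """Return the value that prevailed at the given point in time."""
        candidates = [
            f for f in self._facts
            if f.attribute == attribute and f.point_in_time <= point_in_time
        ]
        if not candidates:
            return None
        return max(candidates, key=lambda f: f.point_in_time).value

    def history(self, attribute: str) -> List[Fact]:
        """Return every fact recorded for an attribute, oldest first."""
        matching = [f for f in self._facts if f.attribute == attribute]
        return sorted(matching, key=lambda f: f.point_in_time)


log = FactLog()
log.record("residence", 1, "Oldebroek")
log.record("residence", 2, "Mastenbroek")
```

Asking `log.as_of("residence", 1)` still answers with the old residence, while `log.as_of("residence", 2)` gives the new one. The past stays available because nothing was replaced.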

You can implement this paradigm in your existing database by adding it to your design, or you can use database technology that does this by design. You might find Sesame or Apache Jena interesting. Alternatively, you could look into Datomic.
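If you go the first route and add it to an existing relational design, it could look roughly like the sketch below. It uses Python's built-in `sqlite3` module with an in-memory database. The table `person_fact`, its columns and the two helper functions are assumptions made up for this example, not a prescribed schema.

```python
import sqlite3

# In-memory database, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person_fact (
        person_id     INTEGER NOT NULL,
        attribute     TEXT    NOT NULL,
        point_in_time INTEGER NOT NULL,
        value         TEXT    NOT NULL,
        PRIMARY KEY (person_id, attribute, point_in_time)
    )
    """
)


def record_fact(person_id, attribute, point_in_time, value):
    """Insert a new fact; existing rows are never updated or deleted."""
    conn.execute(
        "INSERT INTO person_fact (person_id, attribute, point_in_time, value) "
        "VALUES (?, ?, ?, ?)",
        (person_id, attribute, point_in_time, value),
    )


def value_as_of(person_id, attribute, point_in_time):
    """Return the most recent value at or before the given point in time."""
    row = conn.execute(
        """
        SELECT value
        FROM person_fact
        WHERE person_id = ? AND attribute = ? AND point_in_time <= ?
        ORDER BY point_in_time DESC
        LIMIT 1
        """,
        (person_id, attribute, point_in_time),
    ).fetchone()
    return row[0] if row else None


record_fact(1, "residence", 1, "Oldebroek")
record_fact(1, "residence", 2, "Mastenbroek")
```

The only change compared to a classic design is that the point in time becomes part of the key, so an update turns into an insert and the "current" value is whatever row has the latest point in time.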

Interested in Datomic?

  • We have a tech day planned for 25/2; you can sign up via https://www.avisi.nl/nl/evenementen/
  • Student at the HAN? We have a tostitalk about Datomic planned for 25/2 in Lokaal 99. Sign up via the form at https://www.avisi.nl/nl/evenementen/.

 


One thought on “Rethink databases”

  1. I have some doubts about whether the concept of time-dependency is new to database design. In my work I have used time-dependent storage and retrieval for quite some time (from the dinosaur days of COBOL and IDMS up to now). On some occasions, even double-layer time-dependencies are applied, e.g. to reconstruct a person’s residence at a point in time as seen from a given viewpoint in time. This is almost standard procedure in insurance claim systems, but it can sometimes be very challenging for application programmers.
    As for RDF: I was extremely pleased to see a revival of Peter Chen’s classic ERD in the RDF triples. This truly proves the man’s genius.
