Saturday 19 July 2014

System Time Temporal Tables

Following on from the temporal tables overview lets first look at the concept of System Time Temporal tables.

So we remember that temporal tables are all about keeping the history of changes to data, and what is nice about System Time Temporal tables, is that this is entirely a DB stack concept, it is ( or at least can be ) entirely transparent to the application stack.

But I am getting ahead of myself, lets go a bit slower.

If data is going to have a lifetime, then by definition we need a start time and end time, but what values we choose for those times is still up for debate. The first and most obvious choice is to use the time at which we insert data as the start time, and the time at which we change that data to be its end point. That choice of using the times at which the data was touched is what we mean when we talk about System Temporal Tables.

At this point, it does not take a rocket scientist to spot that DB2 can and should be able to help us out a lot here, since it will have very intimate knowledge of when data is changed. And so it is. At an application level, we can do all our CRUD operations as usual oblivious to the temporal nature of this data, and DB2 will automatically make copies of our data when it has changed, and set the start timestamp and end timestamp accordingly.

But to get this right, we will first have to give DB2 the hook points it needs to do the heavy lifting for us, we will need four things

  1. A column for the start timestamp 
  2. A column for the end timestamp 
  3. A placeholder Timestamp variable for DB2 to use when updating rows. 
  4. A table to record history in Which we are be in a position to supply to DB2 when we create our table.

CREATE TABLE tb_YourTable
(
  -- your data
  my_data INT NOT NULL,  
  -- A place for DB2 to add a start date
  sys_start TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW BEGIN,
  -- A place for DB2 to add an end date
  sys_end TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW END,
  -- A place for DB2 to save a transaction time when a row is updated, hidden 'cause we dont care about this
  trans_start TIMESTAMP(12) GENERATED ALWAYS AS TRANSACTION START ID IMPLICITLY HIDDEN,
  -- Bind our timestams to DB2 concept of Sytem_Time
  PERIOD SYSTEM_TIME(sys_start,sys_end)
);
-- make a duplicate table for our update history
CREATE TABLE tb_YourTableHistory LIKE tb_YourTable;
-- tell DB2 to bind our table to the history tables
ALTER TABLE tb_YourTable ADD VERSIONING USE HISTORY TABLE tb_YourTableHistory;
So what are the interesting thing that we can pick up from this syntax above : 

First there is no magic behind-the-scenes schema being created, we are in control of all of this, for both the table we are interested in and and the table used to keep history - nice!

There is just some extra syntax that gives DB2 the knowledge to bind the concept of temporal tables to the schema we have created. This is how we give DB2 permission to use our hook points ie the columns that we have created that it can use for the SYSTEM_TIME and the history table we have created which it can use to populate our history.

Clearly we do have to play by some rules regarding the schema, first our TIMESTAMP columns must be TIMESTAMP(12) and our history table must have exactly the same schema as our temporal table ( I like how easy LIKE make this, very DRY )

So now we have a temporal table set up, what happens when we add content to it ?

Well lets insert some data, just like we would if we did not know about this whole temporal thing..

insert into tb_YourTable(my_data) values(1)
tb_YourTable
------------
MY_DATA     SYS_START                        SYS_END
----------- -------------------------------- --------------------------------
          1 2014-07-19-12.14.16.100732000000 9999-12-30-00.00.00.000000000000

Not too surprisingly, we have data ... but equally unsurprising nothing in our history table....

tb_YourTableHistory
-------------------
MY_DATA     SYS_START                        SYS_END
----------- -------------------------------- --------------------------------

right now lets do an update to this table

update tb_YourTable set my_data=2
 tb_YourTable
 ------------
 MY_DATA     SYS_START                        SYS_END
----------- -------------------------------- --------------------------------
          2 2014-07-19-12.17.21.539897000000 9999-12-30-00.00.00.000000000000

We now see that our update took place as expected, well almost... notice that the start time was also updated.

Also if we look at our history table :

 tb_YourTableHisory
 ------------------
  MY_DATA     SYS_START                        SYS_END
----------- -------------------------------- --------------------------------
          1 2014-07-19-12.14.16.100732000000 2014-07-19-12.17.21.539897000000

We can see that it now has some data in it, where the histories start time is exactly the same as our tables original start value and the histories end time is the same as our updates new start value. ie the full time that the old data was valid for.

okay, well that worked as expected.... what about a delete ?

delete tb_YourTable where my_data=2
tb_YourTable
------------
MY_DATA     SYS_START                        SYS_END
----------- -------------------------------- --------------------------------

tb_YourTableHistory
-------------------
MY_DATA     SYS_START                        SYS_END
----------- -------------------------------- --------------------------------
          1 2014-07-19-12.14.16.100732000000 2014-07-19-12.17.21.539897000000
          2 2014-07-19-12.17.21.539897000000 2014-07-19-12.34.11.079309000000

Okay, so "magically" getting our history into our history table is all well and good, but if we don't read it, then who cares... so lets try and query some data from our history, by running a query on our now empty table.

select * from tb_YourTable for system_time as of '2014-07-19-12.14.17'
tb_YourTable
------------
MY_DATA     SYS_START                        SYS_END
----------- -------------------------------- --------------------------------
          1 2014-07-19-12.14.16.100732000000 2014-07-19-12.17.21.539897000000  

Notice how we use the "for system_time as of" expression to pull data out of the history table while referencing our data table ( which right now is empty ).

Also notice how the SYS_END time is not some value trying to identify infinity, so we know that the value we have pulled was only true for this time period. There are a few more ways in which we can pull data from our history ( or current table for that matter )

select ... FOR SYSTEM_TIME FROM ... TO ...
select ... FOR SYSTEM_TIME BETWEEN ... AND ...

eg:

select * from tb_YourTable for system_time between '2014-07-19-12.14.15' and '2014-07-20-00.00.00'
tb_YourTable
------------
MY_DATA     SYS_START                        SYS_END
----------- -------------------------------- --------------------------------
          1 2014-07-19-12.14.16.100732000000 2014-07-19-12.17.21.539897000000
          2 2014-07-19-12.17.21.539897000000 2014-07-19-12.34.11.079309000000

sweet...

hmmm but did I not lead this article off with the promise that the application stack would be able to take advantage of all this temporal goodness without changes.. and so far, to get any real value.. we have to learn how to query in a new way ... hmmm, did I lie ?

No wait there is more :) With the use of a nice register "Current Temporal System_Time"

set current temporal system_Time = '2014-07-19-12.14.16.100732000000'

Then run a temporal agnostic query

select * from tb_YourTable
tb_YourTable
------------
MY_DATA     SYS_START                        SYS_END
----------- -------------------------------- --------------------------------
          1 2014-07-19-12.14.16.100732000000 2014-07-19-12.17.21.539897000000

ohhh did we just do a time shift from under our application nose .. nice

To turn it off, we simply set the time to NULL



set current temporal system_time = NULL
All well and good, but if you cast you mind back to my overview, I spoke about the salary scenario, and if you are astute, you would know that this current solution would not actually solve that problem.. for that we need to look at the concept of Business Time Temporal Tables... but that is another post for the future.

No comments:

Post a Comment