Sunday, 31 August 2014

Business Time Temporal Tables

When we last left off, I told you about all the goodness that is System Time Temproal Tables and how they exist to deal with tracking changes over time, but not burden your application with this complexity -- but then left with the annoying fact that it did in fact not address the very first situation that brought this all up - getting a raise after the payroll run - but before the start of the next month, meaning reports run on the current month would still be incorrect as they would reflect details only intended to be relevant for the next month.

Really You could not tell?

Well yes and no... no, because it would be possible for people to run queries which target the time-frame before the payroll run was done -- but realistically in typical situations like this there are going to be many many of these types of "reports" or "procedures" that have the base assumption that if you run a query in a month boundary it should be correct - or simply, if I run query now agnostic of time, it should be correct for now -- now should be implicit when I don't specify time.

It would not be fun being the one who would have to go to all the other teams and tell them that they need to update their products to bind queries to specific dates -- moreover, then deal with complexity of the fact that payroll day will move days from month to month because of those pesky weekends, implying that we would have to create some abstraction everyone could use to find out when payroll happened that month -- its all getting very messy and exhausting ... and this is just a silly fictitious example.

There, must be a better way...

and there is.. enter Business Time Temporal Tables, which is conceptually similar to the System Time Temporal Tables we have spoken about before, but this time DB2 is handing back a bit of control to us, and equally we are taking on a bit more responsibility for this concept of history change.

The extra control that we are given is that in business time, we are given the responsibility for identifying the time at which we want the effect of a change of data to take effect. We are able to post-date our change.

Enough waffle.. lets see it in action

First we need to create our table... but this time, since the dates that we are going to be passing in are in the control of the application, there is no good reason to have a duplicate version history table.. so not too surprisingly when we go the BUSINESS_TIME route, there is less framework we need to put in place

CREATE TABLE tb_Salary ( 
  id              INT NOT NULL, -- employee id
  salary          INT NOT NULL,  -- the data  
  bus_start       DATE NOT NULL, -- create space for our start time
  bus_end         DATE NOT NULL, -- create space for our end time
  PERIOD BUSINESS_TIME(bus_start, bus_end), -- bind the above to the concept of the business time period
  PRIMARY KEY( BUSINESS_TIME WITHOUT OVERLAPS) -- make sure all time frames are unique
)

So what we have again is our table with our raw data, employee number and salary, and then two dates that represent our period. Like when we look at the System Time tables, there is no hidden magic in the scheme regarding these times, but rather we need to explicitly tell DB2 that these are the columns that should be use for the period range with the PERIOD BUSINESS_TIME clause, and then for a bit of data sanity, we constrain our table to guarantee to have a unique time range for each row.

Adding some data into our table is as expected, where we must explicitly define our time ranges

insert into tb_Salary 
    values(1, 100, '2014-01-01', '2015-01-01');

select * from tb_Salary 
    where id = 1 
    order by BUS_START asc;

ID          SALARY      BUS_START  BUS_END
----------- ----------- ---------- ----------
          1         100 01/01/2014 01/01/2015

  1 record(s) selected.

Lets now give our hard working employer a raise, but only make it effective at the beginning of the next month, assuming we are in the 8th month now.

--give a raise ( a normal update )
update tb_Salary
    for portion of BUSINESS_TIME from '2014-09-01' to '2015-01-01'
    set salary = 150
    where id = 1;


select * from tb_Salary 
    where id = 1 
    order by BUS_START asc;

ID          SALARY      BUS_START  BUS_END
----------- ----------- ---------- ----------
          1         100 01/01/2014 09/01/2014
          1         150 09/01/2014 01/01/2015

  2 record(s) selected.

Nice .. so now we see that two rows exist since there are now - from a business perspective - two salaries, depending on the time frame that you are interested in. DB2 also takes care of the "heaving lifting" of splitting the one row into two, and appropriately adjusting the date ranges to be correct within the constraint we added to our table that all date ranges must be unique.

On the topic of row splits.. its feels pretty intuitive that a row split would happen for an update... but what about a delete... what if our employee decides to take an unpaid sabbatical for a month... could we simply remove that period without having to concern ourselves with the complexity of the number of rows that might need to be affected... let see

--takes a sabbatical ( removing from the table and getting a row split )
delete from tb_Salary
    for portion of BUSINESS_TIME 
    from '2014-10-01' to '2014-11-01'
    where id = 1;
 
select * from tb_Salary 
    where id = 1 
    order by BUS_START asc;

ID          SALARY      BUS_START  BUS_END
----------- ----------- ---------- ----------
          1         100 01/01/2014 09/01/2014
          1         150 09/01/2014 10/01/2014
          1         150 11/01/2014 01/01/2015

  3 record(s) selected.

ohh its almost like someone else is doing all the hard work for us here :)

Now HR is able to run queries on the salaries for a specific payroll not not be affected by raises for the next period

select * from tb_Salary 
    for BUSINESS_TIME as OF '2014-01-01 
    where id = 1';

ID          SALARY      BUS_START  BUS_END
----------- ----------- ---------- ----------
          1         100 01/01/2014 09/01/2014

  1 record(s) selected.
  
  --( during sabbatical )
 select * from tb_Salary
    for BUSINESS_TIME as OF '2014-10-01'
    where id = 1;
    
 ID          SALARY      BUS_START  BUS_END
----------- ----------- ---------- ----------

  0 record(s) selected.
 

Or now even to be able other interesting questions like, what the salaries have been for a given period

select * from tb_Salary 
    for BUSINESS_TIME FROM '2014-01-01' TO '2015-01-01' 
    where id = 1;

ID          SALARY      BUS_START  BUS_END
----------- ----------- ---------- ----------
          1         150 09/01/2014 10/01/2014
          1         100 01/01/2014 09/01/2014
          1         150 11/01/2014 01/01/2015

  3 record(s) selected.

hmmm but wait... now we have business context history that we did not have with System Time Temporal Tables.. but we have lost all the goodness of transactional history for any changes.. so while we can see from a business perspective that someone was paid a certain value in a given date range, we don't have the same degree of auditing control that we had with System Time temporal tables in that we would be able to tell exactly when that salary edit took place ( i.e. was that salary change effectively part of backpay).

I am one of those people who tend to not understand the concept of having a cake and not being able to eat it.

Where is my cake?

The good news is the edible cake is in our next article, where we "introduce" Bitemproal tables... I say "introduce" because -- well you already know it all, but you just have to learn the new fancy term for it ;)

No comments:

Post a Comment