Maintaining record history in database

|
| By Webner

When you are building a web-based multi/single tenant system or a desktop system which has a database at its backend to maintain records, keeping the history of records proves to be very useful in the long run from information and security perspectives. History of records means keeping track of which piece of data was modified when and by whom (and for what reason if possible). This helps in report generation (like which features of the system are getting modified more often than others). This also helps in figuring out who made the change when something goes wrong by mistake or intentionally.

There are various levels of maintaining the history and based on your requirements you can decide what table structure you need. For understanding purpose we are taking this database table that maintains product information:

These are some options for maintaining historical information:

1. Basic history keeping: In this case all you are interested in is knowing when a certain change was made and who made the change. For example in above table if a new product is added and you want to know when that product was added and who added it you can add 2 more columns to it – created_by and date_of_creation. In created_by you can keep user id of the user who created the product and in date_of_creation you can keep date on which record was first inserted in the table.

2. Intermediate history keeping: Basic history keeping makes it better but in future if someone changes the data of a product (like change in name or price), you will have no way to know who made the change and when and for what reason. To solve this problem there are 2 options. First one is that created_by should be renamed to edited_by and date_of_creation should be renamed to last_modified_timestamp. Now initially when record is created you can keep record creator’s id in edited_by and current date+time in last_modified_timestamp. In future if some other user changes this information you should change edited_by for same record to user id of the user who is making the change and again update last_modified_timestamp to current timestamp. That will make sure you can always see who modified the product recently and at what date and time. But since we have only 1 set of fields so we lose the information of who created the product originally and at what date. So the second option is to leave created_by and date_of_creation fields intact and add 2 new columns edited_by and last_modified_timestamp separately to store this information. Then you will have information related to change in the product information as well as of its original creation. In addition you can create a not null column reason_of_change and force the user editing the data to fill it.

3. Advanced history keeping: Previous step made it better but there are still some shortcomings. A product may go through a series of changes. But since we have created edited_by and last_modified_timestamp columns in Product table itself so we cannot store information of multiple historical changes in these columns for single product. An ugly way to achieve this will be to store comma separated information in the columns to keep ids of all the users who made changes in the edited_by column and their corresponding timestamps also comma separated in the last_modified_timestamp column. I have seen this type of data unfortunately and I do not recommend this way. It makes it useless since you cannot run SQL queries on this data to fetch records for a specific date or date range or even for a specific user id. You need to use tricks to split the values on comma before using them and more such issues. A better way is to create another table with name product_history like this:

So this table gives us the flexibility to store multiple records for single product id and we can keep track of all the changes that happen to the product in its lifecycle.

4. One Step Further: I would like to close this post with one more addition to the previous solution. The previous solution does not tell us which field was modified by the user, what the previous value was and what the new value is. For that, one concept is, we can add 3 more columns to the Product_History table to get following structure of this table:

But generally in one transaction user can change multiple field values so a better approach will be to keep the entire previous record in the history table like this:

Leave a Reply

Your email address will not be published. Required fields are marked *