Qlik Replicate – Task Metadata

Jul 2, 2020

Recently I was working with a client on a specific issue that led me to probe various aspects of Qlik Replicate tasks, attempting to understand how Qlik Replicate works under the hood. While I did not manage to achieve what I originally set out to do, I did get a look at some of the inner workings of Qlik Replicate tasks, and that’s what I’d like to share today.

While looking into these aspects I contacted Qlik Support to try to garner further information about what exactly I was looking at. Qlik Support were not overly forthcoming, and suggested that information on these low-level elements would require a Professional Services engagement. As such, the detail here is pure speculation; it may be of assistance, and equally it may not. I also take no responsibility for any damage you may cause. Proceed with absolute caution, in your development environment!

When we create a new task in the Qlik Replicate console, a corresponding folder is created on the Qlik Replicate disk, at the location the original installer defined. In my case this location is D:\Attunity\Replicate\data\tasks

Initially, when we create a task, a folder of the same name is created at that disk location. It will contain a single file, {TASK_NAME}.repo, which holds the metadata of the task such as the task name, source data endpoint and target data endpoint.

If a task has been run previously, the task folder will hold a variety of other folders, files and databases, as follows:

Sorter (folder)

Contains files with names like:

ars_swap_tr_0000000000003.tswp
ars_swap_tr_0000000000004.tswp

StateManager (folder)

Contains files with names like:

ars_saved_state_000001.sts
ars_saved_state_000002.sts

{TASK_NAME}.repo

dynamic_metadata.sqlite

notifications.sqlite

task_audit.sqlite

task_errors.sqlite

task_tables.sqlite
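To make the inventory above easier to check against your own environment, here is a small Python sketch that walks a task folder and lists every artifact it contains. The example path in the comment is an assumption based on my install location; adjust it for yours.

```python
import os

def list_task_artifacts(task_dir):
    """Walk a Replicate task folder and return the relative path of every file."""
    artifacts = []
    for root, _dirs, files in os.walk(task_dir):
        for name in files:
            artifacts.append(os.path.relpath(os.path.join(root, name), task_dir))
    return sorted(artifacts)

# Example (path is an assumption based on my install):
# for f in list_task_artifacts(r"D:\Attunity\Replicate\data\tasks\MY_TASK"):
#     print(f)
```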

So, each of these artifacts is generated once a task has been initiated from the Qlik Replicate console. When creating a new task, you will see that you only have the option to run the task; you cannot resume processing. The Qlik Replicate platform references the above files and databases to determine where it was in the transaction log, and subsequently where to “start” from. I will give a slightly deeper explanation of what these artifacts are, which will generally make sense when we think about how a Qlik Replicate task streams data from multitudes of tables and needs to keep some positional reference on them.

Some further details:

Sorter:

The Sorter folder contains .TSWP files. While these are not entirely human readable when opened with Notepad, we can see the transaction identifier present.

While Replicate is running, the .TSWP files seem to be updated intermittently. When Replicate crashes, the .TSWP files are written out, so it would appear this is how Replicate knows where it last was in the binary logs.
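Because the .TSWP files are binary, a quick way to hunt for the embedded transaction identifiers is to pull out the printable ASCII runs, much like the Unix strings tool. A minimal sketch; the example path in the comment is an assumption from my install:

```python
import re

def extract_strings(path, min_len=4):
    """Return runs of printable ASCII of at least min_len bytes,
    similar to the Unix `strings` utility."""
    with open(path, "rb") as f:
        data = f.read()
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, data)]

# Example (path is an assumption based on my install):
# print(extract_strings(
#     r"D:\Attunity\Replicate\data\tasks\MY_TASK\Sorter\ars_swap_tr_0000000000003.tswp"))
```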

StateManager:

The StateManager folder contains .STS files which, like the .TSWP files, are not entirely readable.

These files appear to contain information on the task, its source connection, its target connection, and what looks like a transaction identifier again.

The Replicate User and Reference Guide says these “locally stored checkpoint” files are referenced when you have corrupt swap files, presumably the files found in the Sorter folder. The .STS files are referenced when starting a task using the Recovery option under Advanced Run Options in the console.

{TASK_NAME}.repo

This file appears to be the repository of the task, holding all the information about how it executes. I suspect that when exporting a task to JSON, this is where the data is read from. Again, it is difficult to glean much when opening it with Notepad, but you get the general idea.

Handy Tip:

Another backdoor way of retrieving the full metadata from the global repository for all your tasks is to open CMD (run as administrator) and navigate to:

C:\Program Files\Attunity\Replicate\Bin 

Then execute:

repctl -d D:\Attunity\Replicate\Data exportRepository

If you then navigate to D:\Attunity\Replicate\Data\imports and open the file Replication_Definition.json, you will see all the attributes of all the tasks within the global repository.
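Once you have Replication_Definition.json, it is easy to pull the task names out programmatically. The key layout used below ("cmd.replication_definition" → "tasks" → "task" → "name") is an assumption based on the export I inspected, so verify it against your own file:

```python
import json

def list_task_names(export_path):
    """Return the name of every task found in a repctl exportRepository dump.

    NOTE: the key layout used here ("cmd.replication_definition" ->
    "tasks" -> "task" -> "name") is an assumption from the export I
    inspected; verify it against your own Replication_Definition.json.
    """
    with open(export_path, encoding="utf-8") as f:
        repo = json.load(f)
    tasks = repo.get("cmd.replication_definition", {}).get("tasks", [])
    return [t.get("task", {}).get("name") for t in tasks]
```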

DynamicMetadata.sqlite

The DynamicMetadata SQLite database appears to capture each table in the task and the transaction sequence for the respective tables. There are three tables within the database:

capture_run_log:

record_mapping:

table_definitions:

Notifications.sqlite

In my case this database is empty, with no tables at all, so I am not sure of its purpose; potentially Replicate creates the tables when there is a “notification”?

TaskAudit.sqlite

This database presumably contains audit-related information. There are two tables within:

sqlite_sequence:

This appears to be SQLite’s own internal bookkeeping table rather than anything Replicate-specific: any table declared with an AUTOINCREMENT primary key gets a row here recording the largest rowid issued so far.
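You can see this behaviour with any SQLite database; SQLite maintains the table itself whenever AUTOINCREMENT is used. The table and column names below are invented purely for the demonstration:

```python
import sqlite3

# sqlite_sequence is maintained by SQLite itself: any table created with an
# AUTOINCREMENT primary key gets a row recording the largest rowid issued.
# The table/column names here are invented purely for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_audit (id INTEGER PRIMARY KEY AUTOINCREMENT, msg TEXT)")
conn.execute("INSERT INTO demo_audit (msg) VALUES ('started')")
conn.execute("INSERT INTO demo_audit (msg) VALUES ('finished')")
print(dict(conn.execute("SELECT name, seq FROM sqlite_sequence")))
# -> {'demo_audit': 2}
```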

task_audit:

This table seems to contain an audit trail for each of the tables in the task. Each table has a corresponding time and status. Unfortunately, I don’t know what the event_type or status_code keys translate to.
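Since the column meanings are unknown, the safest way to explore is a generic SELECT *. This sketch returns the column names along with the rows so you can eyeball the event_type and status_code values yourself; the example path in the comment is an assumption from my install:

```python
import sqlite3

def dump_table(db_path, table, limit=20):
    """Return (column_names, rows) for the first `limit` rows of a table.
    Useful for eyeballing task_audit without knowing its schema up front."""
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute("SELECT * FROM %s LIMIT ?" % table, (limit,))
        cols = [d[0] for d in cur.description]
        return cols, cur.fetchall()
    finally:
        conn.close()

# Example (path is an assumption based on my install):
# cols, rows = dump_table(
#     r"D:\Attunity\Replicate\data\tasks\MY_TASK\task_audit.sqlite", "task_audit")
```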

TaskErrors.sqlite

This database contains a single table named task_errors. In my case I do not have any errors, so I am unsure what the output would look like.

TaskTables.sqlite

When reviewing the contents of this database it is quite apparent where this data ends up – well, with the table_status table anyway. The row count estimates and the insert and update counts are displayed within the Replicate console.

CDC_status:

Load_status:

Table_status:

Well that’s all for today! Although I don’t know what all of these things are, hopefully this might be of assistance for someone out there!

Thomas – MacGyver of code.

Thomas blogs about big data, reporting platforms, data warehousing and the systems behind them. Read some of his other blogs here.

Do you know we run a Business Intelligence Meetup group in Wellington?
