Poor data quality, format changes and unreachable data sources are just a few examples of runtime problems that can wreak havoc on a seemingly robust data pipeline. Good data pipeline design needs to address these unexpected situations.
CloverDX 5.6 introduces a new tool to allow you to handle errors that occur when running your transformations. CTL – our language for writing your own mapping or business logic – now includes a try-catch statement that improves your ability to capture and handle runtime errors.
Try-catch statements are familiar to most programmers as they are part of many common programming languages including C#, Java, Python etc. For the non-programmer, a try-catch statement allows you to wrap potentially dangerous code in a try block, and it something goes wrong, to manage that condition in the catch block. The basic structure of the statement looks like this:
try {
// potentially dangerous code
} catch {
// handle errors and react gracefully
}
In CloverDX prior to version 5.6 we used callback functions like transformOnError to implement similar functionality. While the end result was similar, try-catch allows for more natural handling of errors and the exceptions provide more information about the error than what was available with callback functions. Error-handling callbacks are still supported, so if you have older graphs that rely on them, they will continue running without any changes.
Let’s now have a look at how to use exceptions on a simple example – a graph that is loading a set of transaction data files into a warehouse.
The input files come from multiple places, perhaps different sales regions. Most might look like this:
But there are a few sales regions that produce account numbers that have an embedded “-“
Our target warehouse for this data requires the account number to be numeric, so mapping from the source data structure to warehouse data structure must include code that removes the dash. The CTL code for this might look like this:
function integer transform() {
$out.0.trans_id = $in.0.Transaction_id;
$out.0.trans_type = $in.0.Transaction_type;
$out.0.amount = $in.0.Amount;
$out.0.trans_time = $in.0.Time;
$out.0.account_id = str2long(replace($in.0.Account_number, "-", ""));
return ALL;
}
We’ve mapped all out source fields to our target fields, and we have taken care to examine the incoming account number and remove a “-“ if it occurs before converting it to a number.
Based on what we know about the data so far, the code above will handle the difference in account number format.
However, let’s consider a case where input data is malformed, and the account number contains other non-numeric characters. If we try to run the above code with an account number like that – “1000#96312406” for example – we’ll get following error in our graph log:
In such cases we want to capture the error and reject the input record since we cannot assume that the account number will be correct if the non-numeric characters will be removed.
We can easily use try-catch block to catch any error that occurs during the value conversion:
function integer transform() {
$out.0.trans_id = $in.0.Transaction_id;
$out.0.trans_type = $in.0.Transaction_type;
$out.0.amount = $in.0.Amount;
$out.0.trans_time = $in.0.Time;
try {
$out.0.account_id = str2long(replace($in.0.Account_number, "-", ""));
} catch (CTLException e) {
$out.1.* = $in.0.*;
$out.1.error_message = "Error converting account_id.";
return 1;
}
return 0;
}
Since in such cases we always want to know that such an invalid value occurred in the source data, we have connected the second output port of the Reformat component to a FlatFileWriter so that we can capture the error message. We use “error” metadata on this port – same metadata as on input, but with one additional field called error_message that can be used to capture error. Our graph looks like this after this change:
In the catch statement we then copy all data from input port to the error output and write our custom error message. We also need to make sure that at the end of the transform function we only write data into the first port instead of the second (error) port.
When we now run our graph, all records with invalid account numbers will be rejected:
In the catch statement we also have access to a CTLException record which provides additional details about the error that can be useful when handling the exception – for example more detailed error message, location in code and more. See more details in our documentation here.
In our example, we can provide more descriptive error message like this:
$out.1.error_message = "Error converting account_id: " + e.message;