Performance tuning: How to troubleshoot database-related performance issues in CloverDX

Performance is undoubtedly one of the key factors when running data transformations. In this article, we will look at how to troubleshoot database-related performance issues in CloverDX’s DatabaseReader or DatabaseWriter components.

Step 1: Tweak component configuration

Let’s start by looking at the configuration of the DatabaseReader and DatabaseWriter components:

DatabaseReader

In the DatabaseReader component, the property that you should pay attention to is the Fetch size.

Fetch size controls how many records are fetched from the database in one call. The default fetch size is 20; however, when processing large amounts of data, this size can cause too many roundtrips to the database, resulting in longer run times. Setting the appropriate fetch size might considerably increase the database read performance. Note that there is no generally recommended fetch size; the value needs to be adjusted based on the data volume and might require the trial-and-error approach to find the optimal value.

DatabaseWriter

In the DatabaseWriter the properties that can affect the write performance are the Batch mode, Batch size, and the Commit size.

Batch mode, which enables the processing of data in batches as opposed to processing individual records one by one, is disabled by default in the component. If your database supports batching, enabling the Batch mode in the DatabaseWriter component and setting the appropriate Batch size might help increase the data write performance, because it reduces the number of roundtrips to the database.

Hand in hand with the Batch size also goes the Commit size, which should be adjusted accordingly (i.e., either set to the same value as the batch size or set to multiples of the batch size).

Similarly to the Fetch size in the DatabaseReader component, there is no universally recommended batch size or commit size, and it might require multiple tests to find the best values that provide optimal performance.

Step 2: Use the correct JDBC driver

Using an incorrect driver to connect to your database might also negatively impact performance.

Keep away from MS SQL legacy jTDS driver

When connecting to an MS SQL database, we advise against using the jTDS driver unless necessary. This driver stopped being actively developed 10 years ago, and it might cause performance issues and other unexpected behavior.

In CloverDX the jTDS driver is kept only for backward compatibility, and in the list of bundled drivers it is present under “Microsoft SQL Server Legacy”. If possible, switch to the other supported MS SQL Server driver in the list of drivers.

With other distributors, you might want to verify that you are using an appropriate JDBC driver for your database distributor and version combination. To find the version of the bundled JDBC drivers within CloverDX, navigate to:

For local projects: in CloverDX Designer install directory under plugins\com.cloveretl.gui_<version number>\lib\plugins\org.jetel.jdbc\lib

For Server projects: in CloverDX Server install directory under webapps\clover\WEB-INF\plugins\org.jetel.jdbc\lib

If you would like to test performance with newer drivers, you can load custom drivers in CloverDX Designer; see our documentation for more information.

Step 3: Optimize job design

If the job performance seems to deteriorate due to extensive data volume, consider changing the design of your job to process the data in multiple batches.

Another option to consider would be implementing parallel processing across multiple worker processes to split the data read or write operations into multiple processes running simultaneously. See our documentation for more information on parallel processing.

Step 4: Consider Bulk Writers

If the performance of the DatabaseWriter component does not seem optimal even after performing the optimization steps above, you can try using one of our bulk writer components as another troubleshooting point (see our documentation for more information).

These components use their respective native clients to establish the database connection and load the data. The native clients need to be installed either where CloverDX Server is installed (for server projects) or where CloverDX Designer is installed (for local projects).

Because the bulk writers rely on 3rd party software, though, we would not recommend implementing these components in jobs that are intended to be run regularly. It might rather be used as a one-time solution or as part of the troubleshooting process to see if the load performance would differ from the load performance in the DatabaseWriter component.

Step 5: Look at server and network configuration

When processing large amounts of data, the resource consumption and network load on the database server and on the machine where CloverDX Server or Designer are running will be affected. You might want to review if the servers and network are optimized to withstand such a load.

Further troubleshooting

To see if the performance issues originate in CloverDX or not, try replicating the data read or load process from either the related database management platform (e.g., MS SQL Management Studio for MSSQL) or from a 3rd party software like DBeaver. To create the same conditions, the management platform or the 3rd party software should be installed on the same machine where CloverDX is running (for server projects) or where CloverDX Designer is installed (for local projects), and, ideally, with the same JDBC drivers as used in the CloverDX job.

If the result of the test points to an issue external to CloverDX, consult the issue with your database administrator to research the issue further. 

Should the result of this test should confirm that the issue lies on CloverDX side, feel free to contact our CloverDX Support team to help review the performance.

To expedite the research process please provide the following:

  • Job execution log files
  • Support package if working in a Server project
  • Designer log files if working in a local project. 
  • If possible, also provide the job files in the .grf, .sgrf, or .jbf form (please remove any sensitive information from the files before attaching).

More from Tech Blog

Visit CloverDX Blog

Read On