Custom Code Processing

Overview

Custom Code lets users write data processing logic in Java, package it as a jar, and upload it to CloudCanal. CloudCanal then calls this code automatically during the Full Data and Incremental stages to perform data transformations.

Custom Code calls are located in the middle of CloudCanal's overall task processing chain, as shown in the following diagram:

Custom Code In CloudCanal

Scenarios

Custom Code is mainly intended for Data Migration or Synchronization scenarios that CloudCanal cannot yet handle with standard features. Such scenarios tend to be flexible, carry specific business semantics, and involve some complexity.

Some scenarios are listed below for reference:

  • Data transformation
    • Data masking, optionally combined with business-specific encryption and decryption algorithms
    • Time zone conversion
  • Data cleansing
    • Outlier and null handling
    • Missing value completion
    • Data normalization
  • Real-time wide table construction
    • Joining a fact table with dimension tables
  • Data aggregation
    • Aggregate data shards
    • Cross-region dataset centralization
  • Business logic processing
    • Complex data transformation resulting from business architecture upgrades
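
As an illustration of the data transformation scenarios above, the following is a minimal sketch of two helpers: one for masking and one for time zone conversion. The class and method names are hypothetical and are not part of CloudCanal; a real Custom Code class would call logic like this from its processing method.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

/** Illustrative helpers only; the names are hypothetical, not part of CloudCanal. */
final class TransformSketch {

    private TransformSketch() {
    }

    /** Replaces every character except the last {@code visible} ones with '*'. */
    static String maskKeepLast(String value, int visible) {
        if (value == null) {
            return null;
        }
        int length = value.length();
        // Clamp so that odd inputs (negative or oversized "visible") cannot break substring().
        int masked = Math.min(length, Math.max(0, length - visible));
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < masked; i++) {
            sb.append('*');
        }
        return sb.append(value.substring(masked)).toString();
    }

    /** Interprets a zone-less timestamp in zone {@code from} and returns the wall-clock time in zone {@code to}. */
    static LocalDateTime convertZone(LocalDateTime ts, ZoneId from, ZoneId to) {
        ZonedDateTime source = ts.atZone(from);
        return source.withZoneSameInstant(to).toLocalDateTime();
    }
}
```

For example, `TransformSketch.maskKeepLast("13812345678", 4)` keeps only the last four digits visible, and `convertZone` shifts a UTC timestamp by eight hours when the target zone is `Asia/Shanghai`.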

Steps

Development

  • Develop in a Java IDE such as IntelliJ IDEA or Eclipse.
  • CloudCanal provides a base project, cloudcanal-data-process.
  • The cloudcanal-data-process project includes several examples that you can adapt to your needs.
  • Custom Code classes must implement the required interfaces so that CloudCanal can call them. (Screenshot: custom_code_processing_1)
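
The shape of such a class can be sketched as follows. The `RowProcessor` interface here is a hypothetical stand-in written for this example; the real interface, its method names, and its data types come from the cloudcanal-data-process project and will differ. The sketch only shows the general pattern: implement an interface, receive rows, and return the rows to pass downstream.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the interface CloudCanal expects; NOT the real SDK interface. */
interface RowProcessor {
    /** Receives a batch of rows and returns the rows to pass downstream. */
    List<Map<String, Object>> process(List<Map<String, Object>> rows);
}

/** Example processor: drops rows without an "id" and fills a missing "status" column. */
final class DefaultStatusProcessor implements RowProcessor {

    @Override
    public List<Map<String, Object>> process(List<Map<String, Object>> rows) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            if (row.get("id") == null) {
                continue; // data cleansing: skip rows that have no key
            }
            // Copy the row so the caller's input is left unmodified.
            Map<String, Object> copy = new HashMap<>(row);
            // Missing value completion: set a default when the column is absent or null.
            copy.putIfAbsent("status", "UNKNOWN");
            out.add(copy);
        }
        return out;
    }
}
```

Returning a new list rather than mutating the input keeps the processor easy to test in isolation before it is packaged and uploaded.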

Packaging

  • Modify the packaging metadata. (Screenshot: custom_code_processing_2)

  • Go to the project directory and run the packaging command.

    % pwd
    /Users/zylicfc/source/product/cloudcanal/cloudcanal-data-process
    % mvn -Dtest -DfailIfNoTests=false -Dmaven.javadoc.skip=true -Dmaven.compile.fork=true clean package
  • After the command finishes, the jar package is available in the corresponding directory. (Screenshot: Jar File Path)

DataJob creation

  • Other steps omitted...

  • On the Columns page, click the Upload Custom Code button in the top right corner. (Screenshot: custom_code_processing_3)

  • The DataJob runs automatically. (Screenshot: custom_code_processing_4)

Debugging

Remote Debug

  • Refer to the Custom Code Debugging documentation for detailed steps.
  • Once the DataJob is running, execution pauses at the breakpoints set in the IDE. (Screenshot: custom_code_debug_3)
  • CloudCanal provides a fixed log file (custom_processor.log) for logs printed from Custom Code.

  • For specific steps, please refer to the Print Log in Custom Code document.

  • After it takes effect, you can see the log content in the Console (Details > Log > custom_processor.log). (Screenshot: custom_code_log_2)

  • You can also go to the DataTask log directory to view the full log file. (Screenshot: custom_code_log_1)
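
The idea of writing to a dedicated log through a logger looked up by name can be sketched as below. This is only an illustration under two assumptions: the logger name `custom_processor` is inferred from the log file name and may not be the actual name, and `java.util.logging` is used only to keep the sketch free of dependencies, so the logging framework used by the project may differ. The Print Log in Custom Code document remains the authoritative source.

```java
import java.util.logging.Logger;

/** Illustrative only; the logger name and the logging framework are assumptions. */
final class LoggingSketch {

    // Assumed name, inferred from the log file name; check the linked document for the real one.
    static final String LOGGER_NAME = "custom_processor";

    // Looked up by name rather than by class, so output is routed according to that name.
    static final Logger CUSTOM_LOG = Logger.getLogger(LOGGER_NAME);

    private LoggingSketch() {
    }

    /** Builds a compact, greppable message for one changed column. */
    static String describe(long id, String column, Object value) {
        return "row id=" + id + " column=" + column + " value=" + value;
    }

    public static void main(String[] args) {
        CUSTOM_LOG.info(describe(42L, "status", "UNKNOWN"));
    }
}
```

If a logger is obtained with a different name than the one the platform routes to the dedicated file, its output will not appear there, which is why the logger name is the first thing to check when logs are missing.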

Code Update

  • Details > Custom Code [Management] > [View] (Screenshot: custom_code_processing_5)

  • Upload, Activate, and Restart (Screenshot: custom_code_processing_6)

FAQ

The code package is not found

  • Restart the DataJob on the console, because jar package distribution is triggered by the Start operation.
  • If that does not help, check whether the jar package exists in the sidecar container or in the node directory /home/clougence/cloudcanal/datahandle. (Screenshot: Customer Code On Node)
  • You can also upload the code package manually (rename the code package according to the prompts in the error log if needed).
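
The directory check above is normally done with a shell on the node, but it can also be sketched in Java. The class below is a hypothetical helper, not part of CloudCanal; it lists the `.jar` files in a directory, defaulting to the path mentioned above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for checking whether a code package reached a directory. */
final class JarCheck {

    private JarCheck() {
    }

    /** Returns the sorted names of the *.jar files directly inside {@code dir}; empty if the directory is missing. */
    static List<String> listJars(Path dir) {
        List<String> names = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            return names;
        }
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path p : stream) {
                names.add(p.getFileName().toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // The default is the node directory named in the FAQ; pass another path as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/home/clougence/cloudcanal/datahandle");
        List<String> jars = listJars(dir);
        System.out.println(jars.isEmpty() ? "no jar found in " + dir : "found: " + jars);
    }
}
```

An empty result suggests that distribution did not happen, in which case restarting the DataJob or uploading the package manually are the next steps.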

The log is not printed

  • Check that the logger name is correct. (Screenshot: custom_code_log_2)

Custom Code doesn't work in Verification and Correction

  • The Verification and Correction DataJob does not currently support Custom Code.

Summary

This article briefly introduces CloudCanal Custom Code, covering code development, packaging, DataJob creation, debugging, code updates, and troubleshooting.