Change Data Capture connector
A couple of weeks ago, Red Hat celebrated its 2020 summit. The summit is one of the most important events in the IT industry and Syndesis (through its product branded by Red Hat, Fuse Online) was there too!
One of the most recurrent pattern we’ve observed during different sessions was about the usage of changes happening on an application data layer. Those changes are streamed to a topic and provided as event to be consumed by integrations. We’ll see how Syndesis and Debezium, a Change Data Capture software, are a powerful combination that will simplify all that work for you.
Change Data Capture
A Change Data Capture (shortly CDC
) software is a combined set of technologies that tail the database log journal in order to capture the changes happening on any table/collection of the database and present as a stream of events.
These events can be therefore consumed by any application, in our case, any integration that is interested in the changes happening on certain tables.
One of the coolest project on the scene is without any doubt, Debezium, an open source CDC framework. Debezium supports many database, either relational or NOSQL databases.
Debezium architecture
We won’t enter in deep details, but let’s have a quick look on how the Debezium ecosystem works:
Figure 1. Debezium high level architecture
There are two mode supported, the embedded
mode and the kafka
mode. The embedded
is a lighter version that can be used in experiments or if you don’t have strict mission critical requirements. The kafka
mode is where we’re mostly interested as it use Apache Kafka
ecosystem to stream events of a Kafka
broker using a Kafka Connect
connector.
Basically, once Debezium is up and running, you can create a connector in order to “tail” any database change and stream changes to a kafka
topic.
Debezium connector
As the changes are streamed to a kafka
topic, it’s easy to think we can leverage a Kafka connector and consume them within an integration. It’s a fair approach and it will work smoothly: it was the approach shown during the summit.
However, this approach has a couple of limitations:
- You must know beforehand the event schema (the table/collection structure expected)
- The event schema mixes structure and meta information (such as the operation)
Some time ago we started a Debezium connector that solved those limitations. We give to citizen integrator
the possibility to automatically discover the schema structure while creating the integration. We also provide the operation as a message header, therefore making it easy to use it during the integration composition (ie, through a conditional flow).
Use case scenario
In order to show how this connector works, let’s use a microservice decoupling approach described in this blog post: this time we’ll use the low code
superpowers of Syndesis though!
In short, we have a User
microservice that must be notified when a new Order
is created: the CDC will capture the changes happening on Order
and Syndesis will take care to call an API exposed by User
. The code and detailed step to execute this example are provided in this github repo.
The first thing to do is to select the topic where the Order
changes will be streamed.
Figure 2. Subscribe to a table change
Then we will filter those actions we’re interested: in our case we want to add an Order
to User
list when a new Order
is created, and delete it when it’s deleted. A conditional flow will help us to define both flows.
Figure 3. Filter CREATE and DELETE events only
In the CREATE
condition branch we will select the addOrder
endpoint provided by User
API.
Figure 4. Update user when adding an Order
Once we choose the operation we must map correctly the fields coming from the event with the fields expected by the API call.
Figure 5. Data mapping between event and API call
As soon as the integration is up and running, it will be easy to perform some API request and see how the User
is updated when any Order
is created or deleted.
Yes, the two microservices are loosely coupled! And, yes! we performed that decoupling without writing a line of code!
Development status
The actual development can be considered as a POC as it is actually based on MySQL
database only. However, given the great interest that CDC software is having we are targeting to work during next release and make it a stable development! We’d love to hear any feedback about. You’re invited to try and let us know!