Data Shapes
A key feature of any integration platform is managing data formats transparently between the source and the destination channel. Syndesis, together with the visual data mapper bundled with it, simplifies this through the definition of a datashape.
A datashape describes the format of an inbound or outbound message and lets the user map each data property in an integration step. In other words, it lets you transform the input and output data of an integration on the fly.
JSON Descriptor
When you develop a new Syndesis connector you must provide a JSON descriptor file that defines several properties of the source/destination involved. The data shapes are defined by the descriptor.inputDataShape and descriptor.outputDataShape properties. Both use the same format but, as the names suggest, one describes the data the connector takes as input and the other the data it produces as output.
{
  "actions": [
    {
      "actionType":
      ...
      "descriptor": {
        ...
        "inputDataShape": {
          "kind":
          ...
        },
        "outputDataShape": {
          "kind":
          ...
        },
First, let’s look at a complete datashape definition (the example uses an inputDataShape, but the same applies to the output):
"inputDataShape": {
  "name": "my data shape",
  "description": "my data shape description",
  "kind": "java",
  "type": "io.syndesis.connector.mycomponent.MyModel",
  "specification": "used in json-schema only",
  "collectionClass": "java.util.ArrayList",
  "collectionType": "List",
  "metadata": {
    "variant": "collection"
  },
  "variants": [
    {
      "kind": "java",
      "metadata": {
        "compression": "true",
        "variant": "element"
      },
      "type": "io.syndesis.connector.mycomponent.MyModelSplit"
    }
  ]
},
Don’t worry: the above is the full definition, and in most cases you won’t need to set all of these properties. Let’s briefly go through what each one means.
The kind parameter is the most important, as Syndesis uses it to convert any input/output message to the specified kind. It is the only required field when declaring a datashape, so most of the time you will end up setting just this and the specification field when configuring your connector. The following values are allowed:
- any: open specification (see the ANY and NONE datashapes section for more info)
- java: used in conjunction with type to define the Java class to convert the data to
- json-schema: used in conjunction with specification to define the expected JSON schema
- json-instance: used to represent a generic JSON instance
- xml-schema: used in conjunction with specification to define the expected XML schema
- xml-schema-inspected: used in conjunction with specification to define the expected XML schema, inspecting any included resource type
- xml-instance: used to represent a generic XML instance
- none: used if no data is expected
The type is used only when the kind specified is java, and it must contain the fully qualified name (package and class) of the expected Java type. The collection settings (collectionType and collectionClass) are used when you expect a Java collection of elements: you can specify the interface and the concrete implementation to use.
metadata and variants are used to specify configuration that helps the split and aggregate EIP features. By setting the variant entry in metadata to either element or collection, you declare explicitly whether the expected data is a single element or a collection of elements, sparing the “guesswork” at runtime. In addition, the variants declaration lets the split feature work correctly by telling it how to split the original message (i.e., through another java class model).
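As an illustration of the most common case described above, here is a minimal sketch of the two datashape entries as they would appear under descriptor, setting only kind and specification on the input and declaring that no output is produced. The schema content (an object with a single name property) is purely hypothetical, and the specification is assumed to be carried as an escaped JSON string:

```json
{
  "inputDataShape": {
    "kind": "json-schema",
    "specification": "{\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\"}}}"
  },
  "outputDataShape": {
    "kind": "none"
  }
}
```

Everything else (name, description, metadata, variants) is left out here because only kind is required.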
ANY and NONE datashapes
The any datashape deserves a special mention, as this kind is quite useful when you don’t know the expected data model beforehand. When you define it, Syndesis presents an additional user interface asking for the expected datashape. This is particularly useful because it lets the user select the data model expected at runtime.
The none kind is useful when you don’t expect any data at all (typically in the input data shapes of source connectors).
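For example, a source-style action that takes no input and leaves the choice of the output model to the user could be sketched as follows. This is an illustrative fragment, not taken from an existing connector:

```json
{
  "inputDataShape": {
    "kind": "none"
  },
  "outputDataShape": {
    "kind": "any"
  }
}
```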
Static vs Dynamic datashape
The example above shows a “static” datashape configuration, which stays the same once it has been deployed to your platform. Often this is not enough, as your data shape varies depending on the parameters submitted by the end user. The dynamic tag comes to the rescue!
{
  "actions": [
    {
      "descriptor": {
        ...
        "inputDataShape": {...},
        "outputDataShape": {...}
      },
      ...
      "pattern": "To",
      "tags": [
        "dynamic"
      ]
When this is set, you’re instructing the Syndesis GUI to retrieve the meta information dynamically, including the datashapes, according to the parameters the user submits in each step of the integration configuration. The GUI triggers a call to the server, which forwards the request to meta, the component that ultimately knows how to retrieve such information (see backend architecture diagram).
Let’s now see how to develop such an extension and how to retrieve metadata dynamically in Syndesis.
Development example
To keep things simple, let’s continue with the same example used in the connector development guideline.
We expect our integration to handle input coming from any source, in the format expected by the collection provided by the user. So we’ll dynamically define a json-schema whose specification is read directly from the database at runtime. The expected output is a generic json-instance, as our producer can perform several different operations.
{
  "actions": [
    {
      "actionType": "connector",
      "descriptor": {
        "componentScheme": "mongodb3",
        ...
        "inputDataShape": {
          "kind": "json-schema"
        },
        "outputDataShape": {
          "kind": "json-instance"
        },
        ...
      "pattern": "To",
      "tags": [
        "dynamic"
      ]
As we’ve marked the connector as dynamic, we have to tell the platform how to retrieve the meta information. We need to create a simple file next to the main JSON descriptor, under
META-INF/syndesis/connector/meta/mongodb3
This file contains a simple configuration pointing to the right Java class:
class=io.syndesis.connector.mongo.meta.MongoDBMetadataRetrieval
The class has to extend io.syndesis.connector.support.verifier.api.ComponentMetadataRetrieval and implement a single method, whose goal is either to retrieve the meta information you need or to adapt the information already provided by the upstream Camel component you’re extending. Where possible, prefer the latter strategy, leaving to the platform (Camel in our case) the duty of retrieving such meta information.
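import java.util.Map;
import org.apache.camel.CamelContext;
import org.apache.camel.component.extension.MetaDataExtension;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import io.syndesis.common.model.DataShape;
import io.syndesis.common.model.DataShapeKinds;
import io.syndesis.connector.support.verifier.api.ComponentMetadataRetrieval;
import io.syndesis.connector.support.verifier.api.SyndesisMetadata;

public final class MongoDBMetadataRetrieval extends ComponentMetadataRetrieval {
    private static final Logger LOGGER = LoggerFactory.getLogger(MongoDBMetadataRetrieval.class);

    @Override
    protected SyndesisMetadata adapt(CamelContext context, String componentId, String actionId, Map<String, Object> properties, MetaDataExtension.MetaData metadata) {
        // JSON schema already retrieved by the upstream camel-mongodb component
        String jsonPayload = metadata.getPayload(String.class);
        LOGGER.debug("Adapting meta retrieved by upstream component {}", jsonPayload);
        DataShape jsonSchemaDataShape = new DataShape.Builder()
            .name(String.format("%s.%s", properties.get("database"), properties.get("collection")))
            .description(String.format("Schema validator for %s collection", properties.get("collection")))
            .kind(DataShapeKinds.JSON_SCHEMA)
            .specification(jsonPayload)
            .build();
        // Use the same datashape for both input and output
        return SyndesisMetadata.of(jsonSchemaDataShape);
    }
}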
The example above should make this concrete. Our goal in Syndesis is to adapt the meta information coming from the upstream platform: in our case, the payload carried by the metadata parameter is a JSON schema that the camel-mongodb component fetches on our behalf. It uses the database and collection values in the properties to run the query and return the expected json-schema validator. Of course, this business logic may differ for each component, but the principle is the same: get the meta information about the source/destination, parse it here, and adapt it to the Syndesis data model.
As we’ve delegated the complex part to the platform, all that remains is to adapt the format. You can see that we’re setting a meaningful name and description, the kind as json-schema, and the retrieved specification. Finally, we call SyndesisMetadata.of(...), which sets the datashape for both input and output. With this example you should now be able to adapt the datashape dynamically for any kind of connector, as each of them will likely have a different data structure.
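To make the result concrete, here is roughly what the datashape built by adapt(...) would look like in JSON form. It assumes the user configured a hypothetical database named test and a collection named people, and that the upstream component returned the small schema shown. All values are illustrative, and the JSON rendering of the Java object is an assumption based on the descriptor format shown earlier:

```json
{
  "name": "test.people",
  "description": "Schema validator for people collection",
  "kind": "json-schema",
  "specification": "{\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\"}}}"
}
```

The name and description come straight from the two String.format calls in the example, while the specification is whatever payload the upstream component returned.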