Copyright © 2007-2023 JumpMind, Inc
Version 3.14.9
Permission to use, copy, modify, and distribute this SymmetricDS User Guide for any purpose and without fee is hereby granted in perpetuity, provided that the above copyright notice and this paragraph appear in all copies.
Preface
This user guide introduces SymmetricDS and its features for data synchronization. It is intended for users, developers, and administrators who want to install the software, configure synchronization, and manage its operation. Thank you to all the members of the open source community whose feedback and contributions helped us build better software and documentation. This version of the guide was generated on 2023-08-25.
1. Introduction
SymmetricDS is open source software for database and file synchronization, with support for multi-master replication, filtered synchronization, and transformation. It uses web and database technologies to replicate change data as a scheduled or near real-time operation, and it includes an initial load feature for full data loads. The software was designed to scale for a large number of nodes, work across low-bandwidth connections, and withstand periods of network outage.
1.1. System Requirements
SymmetricDS is written in Java and requires a Java Runtime Environment (JRE) Standard Edition (SE) or Java Development Kit (JDK) Standard Edition (SE) version 8.0 or above. Most major operating systems and databases are supported. See the list of supported databases in the Database Compatibility section. The minimum operating system requirements are:
-
Java SE Runtime Environment 8 or above
-
Memory - 64 (MB) available
-
Disk - 256 (MB) available
The memory, disk, and CPU requirements increase with the number of connected clients and the amount of data being synchronized. The best way to size a server is to simulate synchronization in a lower environment and benchmark data loading. However, a rule of thumb for servers is one server-class CPU with 2 GB of memory for every 500 MB/hour of data transfer and 350 clients. Multiple servers can be used as a cluster behind a load balancer to achieve better performance and availability.
1.2. Overview
A node is responsible for synchronizing the data from a database or file system with other nodes in the network using HTTP. Nodes are assigned to one of the node Groups that are configured together as a unit. The node groups are linked together with Group Links to define either a push or pull communication. A pull causes one node to connect with other nodes and request changes that are waiting, while a push causes one node to connect with other nodes when it has changes to send.
Each node is connected to a database with a Java Database Connectivity (JDBC) driver using a connection URL, username, and password. While nodes can be separated across wide area networks, the database a node is connected to should be located nearby on a local area network for the best performance. Using its database connection, a node creates tables as a Data Model for configuration settings and runtime operations. The user populates configuration tables to define the synchronization and the runtime tables capture changes and track activity. The tables to sync can be located in any Catalog and Schema that are accessible from the connection, while the files to sync can be located in any directory that is accessible on the local server.
At startup, SymmetricDS looks for Node Properties Files and starts a node for each file it finds, which allows multiple nodes to run in the same instance and share resources. The property file for a node contains its external ID, node group, registration server URL, and database connection information. The external ID is the name for a node used to identify it from other nodes. One node is configured as the registration server where the master configuration is stored. When a node is started for the first time, it contacts the registration server using a registration process that sends its external ID and node group. In response, the node receives its configuration and a node password that must be sent as authentication during synchronization with other nodes.
1.3. Architecture
Each subsystem in the node is responsible for part of the data movement and is controlled through configuration. Data flows through the system in the following steps:
-
Capture into a runtime table at the source database
-
Route for delivery to target nodes and group into batches
-
Extract and transform into the rows, columns, and values needed for the outgoing batch
-
Send the outgoing batch to target nodes
-
Receive the incoming batch at the target node
-
Transform into the rows, columns, and values needed for the incoming batch
-
Load data and return an acknowledgment to the source node
- Capture
-
Change Data Capture (CDC) for tables uses database triggers that fire and record changes as comma-separated values into a runtime table called DATA. For file sync, a similar mechanism is used, except changes to the metadata about files are captured. The changes are recorded as insert, update, and delete event types. The subsystem installs and maintains triggers on tables based on the configuration provided by the user, and it can automatically detect schema changes on tables and regenerate triggers.
- Route
-
Routers run across new changes to determine which target nodes will receive the data. The user configures which routers to use and what criteria is used to match data, creating subsets of rows if needed. Changes are grouped into batches and assigned to target nodes in the DATA_EVENT and OUTGOING_BATCH tables.
- Extract
-
Changes are extracted from the runtime tables and prepared to be sent as an outgoing batch. If large objects are configured for streaming instead of capture, they are queried from the table. Special event types like "reload" for Initial Loads are also processed.
- Transform
-
If transformations are configured, they operate on the change data either during the extract phase at the source node or the load phase at the target node. The node’s database can be queried to enhance the data. Data is transformed into the tables, rows, columns, and values needed for either the outgoing or incoming batch.
- Outgoing
-
The synchronization sends batches to target nodes to be loaded. Multiple batches can be configured to send during a single synchronization. The status of the batch is updated on the OUTGOING_BATCH table as it processes. An acknowledgment is received from target nodes and recorded on the batch.
- Incoming
-
The synchronization receives batches from remote nodes and the data is loaded. The status of the batch is updated on the INCOMING_BATCH table as it processes. The resulting status of the batch is returned to the source node in an acknowledgment.
1.4. Features
SymmetricDS offers a rich set of features with flexible configuration for large scale deployment in a mixed environment with multiple systems.
-
Data Synchronization - Change data capture for relational databases and file synchronization for file systems can be periodic or near real-time, with an initial load feature to fully populate a node.
-
Central Management - Configure, monitor, and troubleshoot synchronization from a central location where conflicts and errors can be investigated and resolved.
-
Automatic Recovery - Data delivery is durable and low maintenance, withstanding periods of downtime and automatically recovering from a network outage.
-
Secure and Efficient - Communication uses a data protocol designed for low bandwidth networks and streamed over HTTPS for encrypted transfer.
-
Transformation - Manipulate data at multiple points to filter, subset, translate, merge, and enrich the data.
-
Conflict Management - Enforce consistency of two-way synchronization by configuring rules for automatic and manual resolution.
-
Extendable - Scripts and Java code can be configured to handle events, transform data, and create customized behavior.
-
Deployment Options - The software can be installed as a self-contained server that stands alone, deployed to a web application server, or embedded within an application.
1.5. Why SymmetricDS?
SymmetricDS is a feature-rich data synchronization solution that focuses on ease of use, openness, and flexibility. The software encourages interoperability and accessibility for users and developers with the availability of source code, an application programming interface (API), and a data model supported by documentation. Configuration includes a powerful set of options to define node topology, communication direction, transformation of data, and integration with external systems. Through scripts and Java code, the user can also extend functionality with custom behavior. With a central database for setup and runtime information, the user has one place to configure, manage, and troubleshoot synchronization, with changes taking immediate effect across the network.
The trigger-based data capture system is easy to understand and widely supported by database systems. Table synchronization can be setup by users and application developers without requiring a database administrator to modify the server. Triggers are database objects written in a procedural language, so they are open for examination, and include flexible configuration options for conditions and customization. Some overhead is associated with triggers, but they perform well for applications of online transaction processing, and their benefits of flexibility and maintenance outweigh the cost for most scenarios.
Using an architecture based on web server technology, many simultaneous requests can be handled at a central server, with proven deployments in production supporting more than ten thousand client nodes. Large networks of nodes can be grouped into tiers for more control and efficiency, with each group synchronizing data to the next tier. Data loading is durable and reliable by tracking batches in transactions and retrying of faults for automatic recovery, making it a low maintenance system.
1.6. License
SymmetricDS is free software licensed under the GNU General Public License (GPL) version 3.0. See http://www.gnu.org/licenses/gpl.html for the full text of the license. This project includes software developed by JumpMind (http://www.jumpmind.com/) and a community of multiple contributors. SymmetricDS is licensed to JumpMind as the copyright holder under one or more Contributor License Agreements. SymmetricDS and the SymmetricDS logos are trademarks of JumpMind.
2. Installation
SymmetricDS at its core is a web application. A SymmetricDS instance runs within the context of a web application container like Jetty or Tomcat, and uses web based protocols like HTTP to communicate with other instances.
An instance has one of the following installation options:
-
Standalone Installation - SymmetricDS is installed and run as a standalone process using the built-in Jetty web server. This is the simplest and recommended way to install an instance.
-
Web Archive (WAR) - A SymmetricDS web archive (WAR) file is deployed to an existing web application container that is separately installed, maintained and run.
-
Embedded - SymmetricDS is embedded within an existing application. In this option, a custom wrapper program is written that calls the SymmetricDS API to synchronize data.
2.1. Standalone Installation
The SymmetricDS standalone ZIP file can be downloaded from Sourceforge. It is installed by unzipping to the installation location.
The sym
command line utility starts a standalone instance of SymmetricDS using
the built-in Jetty web server to handle requests.
The web server can be configured by changing properties in the
conf/symmetric-server.properties
file.
The following example starts the server on the default port from the command line. SymmetricDS will automatically create a node for each Node Properties File configured in the engines directory.
bin/sym
To automatically start SymmetricDS when the machine boots, see Running as a Service.
2.2. Running as a Service
SymmetricDS can be configured to start automatically when the system boots, running as a Windows service or Linux/Unix daemon.
A wrapper process starts SymmetricDS and monitors it, so it can be restarted if it runs out of memory or exits unexpectedly.
The wrapper writes standard output and standard error to the logs/wrapper.log
file.
2.2.1. Running as a Windows Service
To install the service, run the following command as Administrator:
bin\sym_service.bat install
Most configuration changes do not require the service to be re-installed. To uninstall the service, run the following command as Administrator:
bin\sym_service.bat uninstall
To start and stop the service manually, run the following commands as Administrator:
bin\sym_service.bat start
bin\sym_service.bat stop
2.2.2. Running as a Linux/Unix daemon
An init script is written to the system /etc/init.d
directory.
Symbolic links are created for starting on run levels 2, 3, and 5 and stopping on run levels 0, 1, and 6.
To install the script, running the following command as root:
bin/sym_service install
Most configuration changes do not require the service to be re-installed. To uninstall the service, run the following command as root:
bin/sym_service uninstall
To start and stop the service manually, run the following commands:
bin/sym_service start
bin/sym_service stop
2.3. Clustering
A single SymmetricDS node can be deployed across a series of servers to cooperate as a cluster. A node can be clustered to provide load balancing and high availability.
Clustering is available in SymmetricDS Pro. |
Each node in the cluster shares the same database. A separate hardware or software load balancer is required to receive incoming requests and direct them to one of the backend nodes. Use the following steps to setup a cluster:
-
Set the
cluster.lock.enabled
property totrue
-
Optionally, set the
cluster.server.id
property to a unique name, otherwise the hostname will be used -
Set the
sync.url
property to the URL of the load balancer -
Set the
initial.load.use.extract.job.enabled
property tofalse
if using local staging -
Copy the engine properties,
security/keystore
, andconf/sym_service.conf
files to each installation -
Configure the load balancer for sticky sessions
With the cluster.lock.enabled
property set to true
, jobs will acquire an entry in the LOCK table to ensure that only
one instance of the job runs across the cluster. When a lock is acquired, a row is updated in the lock table with the time of the lock
and the server ID of the locking job. The locking server ID defaults to the host name, but it can specified with the cluster.server.id
property if nodes are running on the same server. Another instance of the job cannot acquire a lock until the locking instance releases
the lock and sets the lock time back to null. If an instance is terminated while the lock is still held, an instance with the same server ID
is allowed to re-acquire the lock. If the locking instance remains down, the lock can be broken after it expires, specified by the
cluster.lock.timeout.ms
property. Jobs refresh their lock periodically as they run, which prevents a lock from expiring due to a long run time.
The load balancer should be configured to use sticky sessions if the cluster will receive push synchronization. Push connections first request a reservation from the target node and then connect again using the reservation to push changes. Sticky sessions ensures that the push request is sent to the same server where the reservation is held.
Staging is writing batches to disk before sending over the network, which can use local disk or a shared network drive.
Staging can improve performance by reducing the time that resources are held open in the database and by extracting batches before they are served.
To use local staging in a cluster, disable the initial.load.use.extract.job.enabled
property so the initial load will extract
batches on the node serving the request, rather than extracting in the background on a different node.
To use shared staging in a cluster, set the staging.dir
property to the directory path of the network drive and enable the
cluster.staging.enabled
property so files are locked during use. With shared staging, the initial load extracts in the background on one node,
but batches can be served from any of the nodes in the cluster, which can improve performance.
When deploying nodes in a cluster to an application server like Tomcat or JBoss, the application server does NOT need any clustering of sessions configured.
2.4. Other Deployment Options
It is recommended that SymmetricDS is installed as a standalone service, however there are two other deployment options.
2.4.1. Web Archive (WAR)
This option means packaging a WAR file and deploying to your favorite
web server, like Apache Tomcat. It’s a little more work, but you
can configure the web server to do whatever you need. SymmetricDS can also
be embedded in an existing web application, if desired. As a web application archive, a WAR is deployed to an application server,
such as Tomcat, Jetty, or JBoss. The structure of the archive will have a web.xml
file in the WEB-INF
folder, an appropriately configured symmetric.properties
file in the WEB-INF/classes
folder,
and the required JAR files in the WEB-INF/lib
folder.
A war file can be generated using the standalone installation’s symadmin
utility and the
create-war
subcommand. The command requires the name of the war file to generate. It
essentially packages up the web directory, the conf directory and includes an optional
properties file. Note that if a properties file is included, it will be copied to
WEB-INF/classes/symmetric.properties. This is the same location conf/symmetric.properties
would have been copied to. The generated war distribution uses the same web.xml as the standalone
deployment.
bin/symadmin -p my-symmetric-ds.properties create-war /some/path/to/symmetric-ds.war
2.4.2. Embedded
This option means you must write a wrapper Java program that runs SymmetricDS. You would probably use Jetty web server, which is also embeddable. You could bring up an embedded database like Derby or H2. You could configure the web server, database, or SymmetricDS to do whatever you needed, but it’s also the most work of the three options discussed thus far.
The deployment model you choose depends on how much flexibility you need versus how easy you want it to be. Both Jetty and Tomcat are excellent, scalable web servers that compete with each other and have great performance. Most people choose either the Standalone or Web Archive with Tomcat 5.5 or 6. Deploying to Tomcat is a good middle-of-the-road decision that requires a little more work for more flexibility.
A Java application with the SymmetricDS Java Archive (JAR) library on its
classpath can use the SymmetricWebServer
to start the server.
import org.jumpmind.symmetric.SymmetricWebServer;
public class StartSymmetricEngine {
public static void main(String[] args) throws Exception {
SymmetricWebServer node = new SymmetricWebServer(
"classpath://my-application.properties", "conf/web_dir");
// this will create the database, sync triggers, start jobs running
node.start(8080);
// this will stop the node
node.stop();
}
This example starts the SymmetricDS server on port 8080.
The configuration properties file, my-application.properties
,
is packaged in the application to provide properties that override the SymmetricDS
default values. The second parameter to the constructor points to the web directory.
The default location is web
. In this example the web directory is located
at conf/web_dir
. The web.xml is expected to be found at conf/web_dir/WEB-INF/web.xml
.
2.4.3. Client Mode
This option runs the SymmetricDS engine without a web server, so it can initiate push and pull requests, but not receive them. Without the web server, there are no open ports listening for sync requests, which can help with security requirements.
The sym
command line utility has an option to start in client mode:
bin/sym --client
The conf/sym_service.conf
file has a parameter to start the service in client mode:
wrapper.app.parameter.3=--client
3. Setup
The first node created is called the Master Node. The master node is where configuration is managed. Other nodes typically register with the master node to get their initial configuration. After nodes are registered, additional configuration changes made at the master node are synchronized to the registered nodes automatically.
All nodes require a properties file that contains identity information and database connection information. SymmetricDS configuration needs to be inserted via a SQL script at the master node.
3.1. Node Properties File
Each node that is deployed to a server is represented by a properties file that allows it to connect to a database and
register with a parent node. Properties are configured in a file named
xxxxx.properties
. It is placed in the engines directory of the SymmetricDS install. The
file is usually named according to the engine.name, but it is not a
requirement.
To give a node its identity, the following properties are required. Any
other properties found in
conf/symmetric.properties
can be overridden for a specific engine in an engine’s properties file.
If the properties are changed in
conf/symmetric.properties
they will take effect across all engines deployed to the server.
You can use the variable $(hostName) to represent the host name of the machine when defining these properties
(for example, external.id=$(hostName) ). You can also access external id, engine name, node group id, sync URL, and registration URL in this manner.
(for example, engine.name=$(nodeGroupId)-$(externalId)).
|
You can also use BSH script for the external id, engine name, node group id, sync URL, and registration URL. Use back ticks to indicate the BSH expression, and note that only one BSH expression is supporter for a given property line. The script can be prefixed or suffixed with fixed text. For example, if you wish to based the external id off of just a part of the hostname (e.g., substring of hostName): external.id=store-`import org.apache.commons.lang.StringUtils; return StringUtils.substring(hostName,2,4);\` |
You can also set both environmental variables and system properties directly in the engine file. Environmental variables refer to external variables supplied by the Operating System, and system properties are variables that can be passed into the JVM. For example, if the environment variable you’re using is USERNAME, you would use $(USERNAME) in the engine file. Note, this is case sensitive. |
- engine.name
-
This is an arbitrary name that is used to access a specific engine using an HTTP URL. Each node configured in the engines directory must have a unique engine name. The engine name is also used for the domain name of registered JMX beans.
- group.id
-
The node group that this node is a member of. Synchronization is specified between node groups, which means you only need to specify it once for multiple nodes in the same group.
- external.id
-
The external id for this node has meaning to the user and provides integration into the system where it is deployed. For example, it might be a retail store number or a region number. The external id can be used in expressions for conditional and subset data synchronization. Behind the scenes, each node has a unique sequence number for tracking synchronization events. That makes it possible to assign the same external id to multiple nodes, if desired.
- sync.url
-
The URL where this node can be contacted for synchronization. At startup and during each heartbeat, the node updates its entry in the database with this URL. The sync url is of the format:
http://{hostname}:{port}/{webcontext}/sync/{engine.name}
The {webcontext} is blank for a standalone deployment. It will typically be the name of the war file for an application server deployment.
The {engine.name} can be left blank if there is only one engine deployed in a SymmetricDS server.
When a new node is first started, it is has no information about synchronizing. It contacts the registration server in order to join the network and receive its configuration. The configuration for all nodes is stored on the registration server, and the URL must be specified in the following property:
- registration.url
-
The URL where this node can connect for registration to receive its configuration. The registration server is part of SymmetricDS and is enabled as part of the deployment. This is typically equal to the value of the sync.url of the registration server.
Note that a registration server node is defined as one whose registration.url
is either blank or identical to itssync.url
.For a deployment where the database connection pool should be created using a JDBC driver, set the following properties:
- db.driver
-
The class name of the JDBC driver.
- db.url
-
The JDBC URL used to connect to the database.
- db.user
-
The database username, which is used to login, create, and update SymmetricDS tables.
- db.password
-
The password for the database user.
See Startup Parameters, for additional parameters that can be specified in the engine properties file.
3.2. Adding Nodes
Section Add Node talks about creating additional nodes.
3.3. Load Only Node
Load only nodes can be configured for nodes that are only designed to load data into the database. This prevents any triggers or runtime tables from being installed on this database but still allow data to be replicated to it.
-
No SymmetricDS runtime tables (SYM_*) installed inside the target database.
-
No SymmetricDS triggers installed inside the target database.
-
There is still support for bulk loaders if provided (MSSQL, Oracle, Postgres, MySQL for example).
-
Allows SymmetricDS to load data into dialects that are not fully supported yet or may not have full trigger support.
-
Can not be set up to capture changes
-
Requires a JDBC driver
-
May require table creation outside of SymmetricDS
-
Requires an additional database to be used for SymmetricDS runtime usage. H2 is used in the setup below but any other database could be used as the runtime database.
3.3.1. Setup Load Only Node
-
Begin by setting up a Node Properties File. You do not need to provide any of the db.* properties just yet as they will be adjusted in step 2.
-
Adjust the db.* properties in the load only engine properties file to utilize H2 for runtime information and configuration. The db.user and db.password values are required but do not have to contain values, an empty property for these is sufficient.
db.driver=org.h2.Driver
db.url=jdbc:h2:file:load-only;LOCK_TIMEOUT=60000
db.validation.query=select 1
db.user=
db.password=
If you have multiple load only nodes under a single installation of SymmetricDS you will need to provide a unique H2 database name for each in the db.url above (default is load-only). |
-
Add the following properties to identify the node as load only and provide connection information for your load only database.
load.only=true target.db.driver= target.db.url= target.db.user= target.db.password=
If additional database properties are needed you can add the prefix target. to any other existing SymmetricDS parameters Startup Parameters. |
-
Add the jar file containing the jdbc driver provided in the target.db.driver property to the /lib folder of SymmetricDS
-
Restart SymmetricDS
4. Configuration
Configuring SymmetricDS is the process of setting up your synchronization scenario.
4.1. Groups
In SymmetricDS, configuration rules are applied to groups of nodes versus individual nodes. A group is a categorization of nodes with similar synchronization needs. For example, in a synchronization scenario where a corporate office database is synchronized with field office databases, two node groups would be created, one for the corporate office database (Corporate), and one for the field office databases (Field_office). In the corporate group, there would be a single node and database. In the field_office group, there would be many nodes and databases, one for each field office. Configuration rules/elements are applied to the node group versus the individual nodes in order to simplify the configuration setup (no need to configure each individual field office node, just how the field office nodes sync with the corporate office node).
- Group ID
-
Unique identifier for the group.
- Description
-
Description of the group that is available through the console.
insert into SYM_NODE_GROUP
(node_group_id, description)
values ('store', 'A retail store node');
insert into SYM_NODE_GROUP
(node_group_id, description)
values ('corp', 'A corporate node');
4.2. Group Links
Group links define at a high level how data moves throughout your synchronization scenario. The group link defines which node groups will synchronize data to other node groups and within that exchange, which node group will initiate the conversation for that exchange.
- Source Group ID
-
The source group of the communication link.
- Link
-
Defines how the source and target groups will communicate.
Push [P] |
Indicates that nodes in the source node group will initiate communication over an HTTP PUT and push data to nodes in the target node group. |
Wait for Pull [W] |
Indicates nodes in the source node group will wait for a node in the target node group to connect via an HTTP GET and allow the nodes in the target node group to pull data from the nodes in the source node group. |
Route-only [R] |
Route-only indicates that the data isn’t exchanged between nodes in the source and nodes in the target node groups via SymmetricDS. This action type might be useful when using an XML publishing router or an audit table changes router. |
- Target Group ID
-
The target group of the communication link.
- Sync Configuration
-
Determines if configuration is also sent through this group link. This affects all SymmetricDS configuration tables except for sym_extract_request, sym_file_snapshot, sym_monitor_event, sym_node, sym_node_host, sym_node_security, sym_table_reload_request, and sym_table_reload_status. By default this is checked and configuration will communicate on this path. There are configurations that might cause configuration to continuously loop through the network. As a result, this might need to be unchecked for some links.
- Reversible
-
Allows the communication link to send in the reverse direction if specified on the channel. A push link can be overridden to pull and a pull link can be overridden to push using a setting on the channel.
insert into SYM_NODE_GROUP_LINK
(source_node_group_id, target_node_group_id, data_event_action)
values ('store', 'corp', 'P');
insert into SYM_NODE_GROUP_LINK
(source_node_group_id, target_node_group_id, data_event_action)
values ('corp', 'store', 'W');
4.3. Routers
Routers ride on top of group links. While a group link specifies that data should be moved from nodes in a source node group to nodes in a target node group, routers define more specifically which captured data from a source node should be sent to which specific nodes in a target node group, all within the context of the node group link.
- Router Id
-
Unique description of a specific router
- Group Link
-
The group link used for the source and target node groups of this router
- Router Type
-
The type of router. Standard router types are listed below. Custom routers can be configured as extension points.
Type | Description |
---|---|
default |
A router that sends all captured data to all nodes that belong to the target node group defined in the router. See Default Router |
column |
A router that compares old or new column values in a captured data row to a constant value or the value of a target node’s external id or node id. See Column Match Router |
audit |
A router that inserts into an automatically created audit table. It records captured changes to tables that it is linked to. See Audit Table Router |
java |
A router that executes a Java expression in order to select nodes to route to. The script can use the old and new column values. See [Java Router] |
lookuptable |
A router which can be configured to determine routing based on an existing or ancillary table specifically for the purpose of routing data. See Lookup Table Router |
subselect |
A router that executes a SQL expression against the database to select nodes to route to. This SQL expression can be passed values of old and new column values. See Subselect Router |
convertToReload |
When subselect router is too slow and there are too many tables or rows for lookuptable router, a convertToReload router can efficiently sub-set data by converting multiple changes into a reload batch. See ConvertToReload Router |
bsh |
A router that executes a Bean Shell script expression in order to select nodes to route to. The script can use the old and new column values. See Beanshell Router |
csv |
A router that sends data from a CSV file to a target table of the target node group defined in the router. See CSV Router |
dbf |
A router that sends data from a dBase generated DBF file to a target table of the target node group defined in the router. See DBF Router |
- Router Expression
-
An expression that is specific to the type of router that is configured in router type. See the documentation for each router for more details.
- Use Source Catalog/Schema
-
If set then the source catalog and source schema are sent to the target to be used to find the target table.
- Target Catalog
-
Optional name of catalog where a target table is located. If this field is unspecified, the catalog will be either the default catalog at the target node or the "source catalog name" from the table trigger, depending on how "use source catalog schema" is set for the router. Variables are substituted for
$(sourceNodeId)
,$(sourceExternalId)
,$(sourceNodeGroupId)
,$(targetNodeId)
,$(targetExternalId)
,$(targetNodeGroupId)
,$(sourceCatalogName)
, and$(sourceSchemaName)
. Parameter values can be substituted using$(name)
syntax. See Variables. - Target Schema
-
Optional name of schema where a target table is located. If this field is unspecified, the schema will be either the default schema at the target node or the "source schema name" from the table trigger, depending on how "use source catalog schema" is set for the router. Variables are substituted for
$(sourceNodeId)
,$(sourceExternalId)
,$(sourceNodeGroupId)
,$(targetNodeId)
,$(targetExternalId)
,$(targetNodeGroupId)
,$(sourceCatalogName)
, and$(sourceSchemaName)
. Parameter values can be substituted using$(name)
syntax. See Variables. - Sync on Update
-
Flag that indicates that this router should send updated rows from nodes in the source node group to nodes in the target node group.
- Sync on Insert
-
Flag that indicates that this router should send inserted rows from nodes in the source node group to nodes in the target node group.
- Sync on Delete
-
Flag that indicates that this router should send deleted rows from nodes in the source node group to nodes in the target node group.
- Target Table
-
Optional name for a target table. Only use this if the target table name is different than the source.
4.3.1. Router Types
Default Router
The simplest router is a router that sends all the data that is captured by its associated triggers to all the nodes that belong to the target node group defined in the router.
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, create_time,
last_update_time) values ('corp-2-store','corp', 'store',
current_timestamp, current_timestamp);
insert into SYM_TRIGGER_ROUTER
(trigger_id, router_id, initial_load_order, create_time,
last_update_time) values ('item', 'corp-2-store', 1, current_timestamp,
current_timestamp);
Column Match Router
Sometimes requirements may exist that require data to be routed based on the current value or the old value of a column in the table that is being routed. Column routers are configured by setting the router_type column on the ROUTER table to column and setting the router_expression column to an equality expression that represents the expected value of the column.
The first part of the expression is always the column name. The column name should always be defined in upper case. The upper case column name prefixed by OLD_ can be used for a comparison being done with the old column data value.
The second part of the expression can be a constant value, a token that represents another column, or a token that represents some other SymmetricDS concept. Token values always begin with a colon (:).
-
Consider a table that needs to be routed to all nodes in the target group only when a status column is set to 'READY TO SEND.'
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store-ok','corp', 'store', 'column', 'STATUS=READY TO SEND',
current_timestamp, current_timestamp);
-
Consider a table that needs to be routed to all nodes in the target group only when a status column changes values.
The use of OLD_STATUS, where the OLD_ prefix gives access to the old column value. |
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store-status','corp', 'store', 'column', 'STATUS!=:OLD_STATUS',
current_timestamp, current_timestamp);
Attributes on a NODE that can be referenced with the following tokens
|
-
Consider a table that needs to be routed to only nodes in the target group whose STORE_ID column matches the external id of a node.
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store-id','corp', 'store', 'column', 'STORE_ID=:EXTERNAL_ID',
current_timestamp, current_timestamp);
-
Consider a table that needs to be routed to a redirect node defined by its external id in the REGISTRATION_REDIRECT table.
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store-redirect','corp', 'store', 'column',
'STORE_ID=:REDIRECT_NODE', current_timestamp, current_timestamp);
-
More than one column may be configured in a router_expression. When more than one column is configured, all matches are added to the list of nodes to route to. The following is an example where the STORE_ID column may contain the STORE_ID to route to or the constant of ALL which indicates that all nodes should receive the update.
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store-multiple-matches','corp', 'store', 'column',
'STORE_ID=ALL or STORE_ID=:EXTERNAL_ID', current_timestamp,
current_timestamp);
-
The NULL keyword may be used to check if a column is null. If the column is null, then data will be routed to all nodes who qualify for the update. This following is an example where the STORE_ID column is used to route to a set of nodes who have a STORE_ID equal to their EXTERNAL_ID, or to all nodes if the STORE_ID is null.
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store-multiple-matches','corp', 'store', 'column',
'STORE_ID=NULL or STORE_ID=:EXTERNAL_ID', current_timestamp,
current_timestamp);
-
External data collected as part of the trigger firing (see External Select) can also be used as a virtual column in the router expression as well.
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store-multiple-matches','corp', 'store', 'column',
'EXTERNAL_DATA=:EXTERNAL_ID', current_timestamp,
current_timestamp);
Audit Table Router
This router audits captured data by recording the change in an audit table that the router creates and keeps up to date. The router creates a table named the same as the table for which data was captured with the suffix of _AUDIT. It will contain all of the same columns as the original table with the same data types only each column is nullable with no default values.
The following parameter must be set to true so that the audit table can be created. |
auto.config.database=true
AUDIT_ID |
the primary key of the table. |
AUDIT_TIME |
the time at which the change occurred. |
AUDIT_EVENT |
the DML type that happened to the row. |
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type, create_time,
last_update_time) values ('audit_at_corp','corp', 'local', 'audit',
current_timestamp, current_timestamp);
The audit router must be associated with a node group link of type 'R'. The 'R' stands for 'only routes to' (see Group Links). In the above example, we refer to a 'corp to local' group link. Here, local is a new node_group created for the audit router. No nodes belong to the 'local' node_group. If a trigger linked to an audit router fires on the corp node, a new audit table will be created at the corp node with the new data inserted. |
Lookup Table Router
A lookup table may contain the id of the node where data needs to be routed. This could be an existing table or an ancillary table that is added specifically for the purpose of routing data. Lookup table routers are configured by setting the router_type column on the ROUTER table to lookuptable and setting a list of configuration parameters in the router_expression column.
Each of the following configuration parameters are required.
|
The lookup table will be read into memory and cached for the duration of a routing pass for a single channel. |
Consider a table that needs to be routed to a specific store, but the data in the changing table only contains brand information. In this case, the STORE table may be used as a lookup table.
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store-ok','corp', 'store', 'lookuptable', 'LOOKUP_TABLE=STORE
KEY_COLUMN=BRAND_ID LOOKUP_KEY_COLUMN=BRAND_ID
EXTERNAL_ID_COLUMN=STORE_ID', current_timestamp, current_timestamp);
Subselect Router
Sometimes routing decisions need to be made based on data that is not in the current row being synchronized. A 'subselect' router can be used in these cases. A 'subselect' is configured with a router expression that is a SQL select statement which returns a result set of the node ids that need routed to. Column tokens can be used in the SQL expression and will be replaced with row column data.
The overhead of using this router type is high because the 'subselect' statement runs for each row that is routed. It should not be used for tables that have a lot of rows that are updated. It also has the disadvantage that if the data being relied on to determine the node id has been deleted before routing takes place, then no results would be returned and routing would not happen. |
The router expression you specify is appended to the following SQL statement in order to select the node ids:
select c.node_id
from sym_node c
where c.node_group_id=:NODE_GROUP_ID
and c.sync_enabled=1 and ...
The SQL statement has access to the following variables that are replaced before running:
:NODE_GROUP_ID |
The target node group ID that is configured for the router. |
:EXTERNAL_DATA |
The external data for current row, as configured by sym_trigger.external_select. |
:DATA_EVENT_TYPE |
The event type of either INSERT, UPDATE, or DELETE. |
:TABLE_NAME |
The table name for the current row. |
:COLUMN_NAME |
Variables named for each column name (in uppercase), which return the column value for the new row. |
:OLD_COLUMN_NAME |
Variables named for each column name (in uppercase and prefixed with OLD_), which return the column value for the old row. |
For an example, consider the case where an Order table and an OrderLineItem table need to be routed to a specific store. The Order table has a column named order_id and STORE_ID. A store node has an external_id that is equal to the STORE_ID on the Order table. OrderLineItem, however, only has a foreign key to its Order of order_id. To route OrderLineItems to the same nodes that the Order will be routed to, we need to reference the master Order record.
There are two possible ways to solve this in SymmetricDS.
-
Configure a 'subselect' router type (shown below).
-
Use an external select to capture the data via a trigger for use in a column match router, see External Select.
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store','corp', 'store', 'subselect', 'c.external_id in (select
STORE_ID from order where order_id=:ORDER_ID)', current_timestamp,
current_timestamp);
In this example that the parent row in Order must still exist at the moment of routing for the child rows (OrderLineItem) to route, since the select statement is run when routing is occurring, not when the change data is first captured. |
ConvertToReload Router
This router converts multiple change events into a single reload event, which can be used to sub-set data quickly when the subselect router is too slow or when the lookuptable router can’t handle the amount of lookup data. ConvertToReload inserts the primary key values for each row, along with a unique load ID, into a temporary table. The reload event uses the initial load SQL along with a join to the temporary table to retrieve the changes and sub-set data.
The router expression requires the name of the temporary table to use:
temptable=mytable_router
create table mytest (id integer, name varchar(50), primary key(id));
create table mytest_router (id integer, load_id integer, primary key(load_id, id));
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store','corp', 'store', 'convertToReload', 'temptable=mytest_router', current_timestamp,
current_timestamp);
Beanshell Router
When more flexibility is needed in the logic to choose the nodes to route to, then the a scripted router may be used. The currently available scripting language is Bean Shell. Bean Shell is a Java-like scripting language. Documentation for the Bean Shell scripting language can be found at http://www.beanshell.org .
The router type for a Bean Shell scripted router is 'bsh'. The router expression is a valid Bean Shell script that:
nodes |
Collection of org.jumpmind.symmetric.model.Node objects the router would route to normally. |
nodeIds |
Collection of node ids that the router would route to normally. You can just return this if you want the bsh router to behave like the default router. |
targetNodes |
Collection of org.jumpmind.symmetric.model.Node objects to be populated and returned. |
engine |
The instance of org.jumpmind.symmetric.ISymmetricEngine which has access to SymmetricDS services. |
Any Data Column |
Data column values are bound to the script evaluation as Java object representations of the column data. The columns are bound using the uppercase names of the columns. For example, a table with a store_id column will have a STORE_ID variable name available in Bean Shell script. When using file sync, columns from the FILE_SNAPSHOT table will be available in the script. |
Any Old Values |
Old Data column values are bound to the script evaluation as Java object representations of the column data. The columns are bound using the uppercase representations that are prefixed with 'OLD_'. For example, a table with a store_id column will have an OLD_STORE_ID variable name available in Bean Shell script representing the old value for the store_id before the change. When using file sync, columns from the FILE_SNAPSHOT table will be available in the script. |
targetNodes |
Collection of org.jumpmind.symmetric.model.Node objects that will be routed to. |
true |
All nodes should be routed |
false |
No nodes should be routed |
The last line of a bsh script is always the return value. |
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store-bsh','corp', 'store', 'bsh', 'targetNodes.add(STORE_ID +
"-" + WORKSTATION_NUMBER);', current_timestamp, current_timestamp);
The same could also be accomplished by simply returning the node id.
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store-bsh','corp', 'store', 'bsh', 'STORE_ID +
"-" + WORKSTATION_NUMBER', current_timestamp, current_timestamp);
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store-flag-changed','corp', 'store', 'bsh', 'FLAG != null
&& !FLAG.equals(OLD_FLAG)', current_timestamp,
current_timestamp);
Here we make use of OLD_, which provides access to the old column value. |
insert into SYM_ROUTER (router_id,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('corp-2-store-trimmed-station','corp', 'store', 'bsh', 'for
(org.jumpmind.symmetric.model.Node node : nodes) { if (STATION != null
&& node.getExternalId().equals(STATION.trim())) {
targetNodes.add(node.getNodeId()); } }', current_timestamp,
current_timestamp);
CSV Router
This router is used to route csv files to a specified target database table.
In your router expression you can tell the router to include a transaction ID for the routed data with INCLUDE_TRANSACTION_ID=true|false. Default is false.
insert into SYM_ROUTER (router_id, target_table_name,
source_node_group_id, target_node_group_id, router_type,
router_expression, create_time, last_update_time) values
('store-2-corp-csv','table1','store', 'corp', 'csv',
'INCLUDE_TRANSACTION_ID=true', current_timestamp, current_timestamp);
The CSV Router routes from file to database so a file trigger must be created to specify the directory or path that the file trigger should watch. (see File Triggers) |
DBF Router
This router is used to route dbf files that are generated from a dBase system to a specified target database table.
insert into SYM_ROUTER (router_id, target_table_name,
source_node_group_id, target_node_group_id, router_type,
create_time, last_update_time) values
('store-2-corp-dbf','table1','store', 'corp', 'dbf',
current_timestamp, current_timestamp);
The DBF Router routes from file to database so a file trigger must be created to specify the directory or path that the file trigger should watch. (see File Triggers) |
4.4. Channels
Once group links and routers are defined, configuration must be completed to specify which data (tables, file systems, etc.) should be synchronized over those links and routers. The next step in defining which specific data in the database is moved is to define logical groupings for that data. Channels define those logical groupings. As an example, a set of tables that hold customer data might be logically grouped together in a Customer channel. Sales, returns, tenders, etc. (transaction data) might be logically grouped into a transaction channel. A default channel is automatically created that all tables will fall into unless other channels are created and specified. The default channel is called 'default'.
Channels can be disabled, suspended, or scheduled as needed.
Transactions will NOT be preserved across channels so its important to setup channels to contain all tables that participate in a given transaction. |
- Channel ID
-
Identifier used through the system to identify a given channel.
- Processing Order
-
Numeric value to determine the order in which a channel will be processed. Channels will be processed in ascending order.
- Batch Algorithm
-
Batching is the grouping of data, by channel, to be transferred and committed at the client together.
Default |
All changes that happen in a transaction are guaranteed to be batched together. Multiple transactions will be batched and committed together until there is no more data to be sent or the max_batch_size is reached. The routing.max.batch.size.exceed.percent parameter is used to keep batch sizes within a percentage over the max_batch_size, or it can be set to zero for no limit. |
Transactional |
Batches will map directly to database transactions. If there are many small database transactions, then there will be many batches. The max_batch_size column has no effect. |
Nontransactional |
Multiple transactions will be batched and committed together until there is no more data to be sent or the max_batch_size is reached. The batch will be cut off at the max_batch_size regardless of whether it is in the middle of a transaction. |
- Max Batch Size
-
Specifies the maximum number of data events to process within a batch for this channel.
- Max Batch To Send
-
Specifies the maximum number of batches to send for a given channel during a 'synchronization' between two nodes. A 'synchronization' is equivalent to a push or a pull. For example, if there are 12 batches ready to be sent for a channel and max_batch_to_send is equal to 10, then only the first 10 batches will be sent even though 12 batches are ready.
- Max Data To Route
-
Specifies the maximum number of data rows to route for a channel at a time.
- Max KB/s
-
Specifies the maximum network transfer rate in kilobytes per second. Use zero to indicate unlimited. When throttling the channel, make sure the channel is on its own queue or within a queue of channels that are throttled at the same rate. This is currently only implemented when staging is enabled.
- Data Loader Types
-
Determines how data will be loaded into the target tables. These are used during an initial load or a reverse initial load. Data loaders do not always have to load into the target relational database. They can write to a file, a web service, or any other type of non-relational data source. Data loaders can also use other techniques to increase performance of data loads into the target relation database.
default |
Performs an insert first and if this fails will fall back to an update to load the data. |
ftp_localhost |
Sends the data in CSV format to a configured ftp location. These locations are setup in the TODO {SYM_HOME}/conf/ftp-extensions.xml |
bulk |
Assigns the appropriate bulk loader to this channel. Supported bulk loaders include: Microsoft SQL, PostgreSQL, MySQL and Amazon Redshift over S3. |
mongodb |
MongoDB data loader. |
Tables that should be data loaded should be configured to use this channel. Many times, a reload channel will be set to bulk load to increase the performance of an initial load. |
- Queue Name
-
Determines a queue that the channel will sync in. Channels with the same queue name are processed synchronously (one at a time) and channels on different queues are processed asynchronously (in parallel).
- Group Link Direction
-
For a node group link that is reversible, the channel can specify either "push" or "pull" to override the default group link communication. If this field is empty, the default group link communication is used.
- Enabled
-
Indicates whether the channel is enabled or disabled. If a channel is disabled, data is still captured for changes that occur on the source system, but it will not be routed and sent to the target until the channel is re-enabled.
- Reload Channel
-
Indicates whether a channel is available for initial loads and reverse initial loads.
- File Sync Channel
-
Indicates whether a channel is available for file synchronization.
- Use Old Data To Route
-
Indicates if the old data will be included for routing. Routing can then use this data for processing. Defaults to true.
- Use Row Data To Route
-
Indicates if the current data will be included for routing. Routing can then use this data for processing. Defaults to true.
- Use Primary Key (PK) Data to Route
-
Indicates if the primary key data will be include for routing. For example maybe a store ID is needed to apply logic on before sending to the appropriate target nodes. Defaults to true.
- Contains Lob or Wide Row Data
-
For Oracle, Tibero, Firebird, and Interbase, this setting can be enabled when change data capture exceeds the character limit. Oracle and Tibero have a character limit of 4000, while Firebird and Interbase have a character limit of 20000 for changes and 1000 for primary key values. Change data capture is first attempted to extract as character data for better performance, then it will automatically fall back to extract as a large object (LOB). Enable this setting when most changes captured on the channel need extracted as LOB or when the extraction is receiving a truncation error.
insert into SYM_CHANNEL (channel_id, processing_order, max_batch_size, max_batch_to_send,
extract_period_millis, batch_algorithm, enabled, description)
values ('item', 10, 1000, 10, 0, 'default', 1, 'Item and pricing data');
insert into SYM_CHANNEL (channel_id, processing_order, max_batch_size,
max_batch_to_send, extract_period_millis, batch_algorithm, enabled, description)
values ('sale_transaction', 1, 1000, 10, 60000,
'transactional', 1, 'retail sale transactions from register');
Channel Tips and Tricks
Increase performance by creating designated channels for tables that use LOB data types. For these channels be sure to check the "Table Contains Big Lobs" to increase performance.
|
4.5. Table Triggers
SymmetricDS captures synchronization data using database triggers. SymmetricDS' Triggers are defined in the TRIGGER table. Each record is used by SymmetricDS when generating database triggers. Database triggers are only generated when a trigger is associated with a ROUTER whose source_node_group_id matches the node group id of the current node.
When determining whether a data change has occurred or not, by default the triggers will record a change even if the data was updated to the same value(s) they were originally. For example, a data change will be captured if an update of one column in a row updated the value to the same value it already was. There is a global property, trigger.update.capture.changed.data.only.enabled (false by default), that allows you to override this behavior. When set to true, SymmetricDS will only capture a change if the data has truly changed (i.e., when the new column data is not equal to the old column data).
- Trigger Id
-
Unique identifier for a trigger.
- Source Catalog
-
Optional name for the catalog the configured table is in. If the name includes * then a wildcard match on the table name will be attempted. \ Wildcard names can include a list of names that are comma separated. The ! symbol may be used to indicate a NOT match condition. Parameter values can be substituted using
$(name)
syntax. See Variables. - Source Schema
-
Optional name for the schema a configured table is in. If the name includes * then a wildcard match on the table name will be attempted. Wildcard names can include a list of names that are comma separated. The ! symbol may be used to indicate a NOT match condition. Parameter values can be substituted using
$(name)
syntax. See Variables. - Source Table
-
The name of the source table that will have a trigger installed to watch for data changes. See Trigger Wildcards for using wildcards to specify multiple source tables. Parameter values can be substituted using
$(name)
syntax. See Variables. - Channel
-
The channel_id of the channel that data changes will flow through.
- Sync On Insert
-
Flag for installing an insert trigger.
- Sync On Update
-
Flag for installing an update trigger.
- Sync On Delete
-
Flag for installing a delete trigger.
- Reload Channel Id
-
The channel_id of the channel that will be used for initial loads.
- Sync Conditions
-
A procedure language expression included in the trigger text to determine whether a change is captured or not. Most platforms include the condition inside an "IF" statement, while SQL-Server includes the condition in a "WHERE" clause. Old and new values of a column can be referenced using "$(oldTriggerValue)" and "$(newTriggerValue)" aliases respectively. See Trigger Variables. For example, if a character column is named "STATUS" and the row should be captured when the value is "2", then the condition would be:
$(newTriggerValue).status = '2'
- Sync On Insert Condition
-
Conditional expression for the insert trigger to determine if a change is captured or not. See Sync Conditions.
- Sync On Update Condition
-
Conditional expression for the update trigger to determine if a change is captured or not. See Sync Conditions.
- Sync On Delete Condition
-
Conditional expression for the delete trigger to determine if a change is captured or not. See Sync Conditions.
- Custom Insert Trigger Text
-
Specify insert trigger text (SQL) to execute after the SymmetricDS trigger fires. This field is not applicable for H2, HSQLDB 1.x or Apache Derby.
- Custom Update Trigger Text
-
Specify update trigger text (SQL) to execute after the SymmetricDS trigger fires. This field is not applicable for H2, HSQLDB 1.x or Apache Derby.
- Custom Delete Trigger Text
-
Specify delete trigger text (SQL) to execute after the SymmetricDS trigger fires. This field is not applicable for H2, HSQLDB 1.x or Apache Derby.
- Sync On Incoming
-
Whether or not an incoming batch that loads data into this table should cause the triggers to capture changes. Changes are never sent back to the source node, but enabling this setting makes it possible to create a never-ending loop through intermediary nodes, such as A to B to C to A.
- Capture Row As LOB
-
For Oracle, Tibero, SQL Server, and DB2, row data that exceeds the character limit needs converted into a large object (LOB) for capture. Oracle and Tibero have a character limit of 4000, SQL Server has a nvarchar limit of 4000 and a varchar limit of 8000, while DB2 has a character limit of 32767. The trigger will try to automatically capture the row correctly, so only enable this setting if you are getting truncation errors during trigger creation or when changing data.
- Stream LOBs
-
Captures an empty placeholder for large object (LOB) data types when a row is changed, then queries for the LOB value later when the batch is extracted. If normal capturing of LOBs is not working, enabling this setting may work instead. When very large LOB data is involved, this setting can reduce the overhead of making changes in the database, but it usually results in worse performance of synchronization since it queries each row during extraction.
- Stream Row
-
Captures only the primary key values when the trigger fires, which can reduce overhead for tables with wide data or many columns. The data will be queried using the PK values when the batch is extracted. This results in worse performance of synchronization, but it can be used when triggers for all columns won’t install or when contention from triggers is too high.
- Capture Old Data
-
Indicates whether this trigger should capture and send the old data, which is the previous state of the row before the change. Enable this option if you need to access old data in custom trigger text, routing expression, or transform expression. Otherwise, disable this option for better performance.
- Handle Key Updates
-
For SQL-Server and Sybase, enable this setting to capture changes to the primary key. The trigger needs to do some additional work to handle changes to the primary key, so this setting is normally disabled.
- External Select
-
Specify a SQL select statement that returns a single row, single column result. It will be used in the generated database trigger to populate the EXTERNAL_DATA field on the data table.
- Excluded Column Names
-
Specify a comma-delimited list of columns that should not be synchronized from this table.
- Included Column Names
-
Specify a comma-delimited list of columns only should be synchronized from this table.
- Sync Key Names
-
Specify a comma-delimited list of columns that should be used as the key for synchronization operations. By default, if not specified, then the primary key of the table will be used.
- Channel Expression
-
An expression that will be used to capture the channel id in the trigger. This expression will only be used if the channel_id is set to 'dynamic'. The variable "$(schemaName)" can be used, which is replaced with the source schema of the table. See Variables.
insert into SYM_TRIGGER (trigger_id, source_table_name,
channel_id, last_update_time, create_time)
values ('item', 'item', 'item', current_timestamp, current_timestamp);
Multiple Triggers On A Table
Note that many databases allow for multiple triggers of the same type to be defined. Each database defines the order in which the triggers fire differently.
If you have additional triggers beyond those SymmetricDS installs on your table, please consult your database documentation to determine if
there will be issues with the ordering of the triggers.
|
Capture Changed Data
When determining whether a data change has occurred or not, by default the triggers will record a change even if the data was updated to the same value(s) they were originally. For example, a data change will be captured if an update of one column in a row updated the value to the same value it already was. There is a global property that allows you to override this behavior (defaults to false). trigger.update.capture.changed.data.only.enabled=false This property is currently only supported on MySQL, DB2, SQL Server, and Oracle. |
4.5.1. Trigger Wildcards
The source table name may be an expression that is used to match multiple table names. Special characters include the asterisk ('*') for wildcards, the comma (',') for multiple expressions, and the exclamation ('!') for negation.
-
Separate multiple table names or table name expressions with a comma.
-
Characters are always evaluated from left to right.
-
When a table match is made, the table is either added to or removed from the list of tables. If another trigger already exists for a table, then that table is not included in the wildcard match (the explicitly defined trigger entry take precedence).
-
System tables and any table names that start with the SymmetricDS table prefix will be excluded.
-
To negate the expression and exclude tables, start the expression with an exclamation.
-
Double up special characters to match a single literal character. (Use two asterisks to match a single asterisk.) The entire expression is processed as wildcarded when an odd number of consecutive special characters are found.
4.5.2. External Select
Occasionally, you may find that you need to capture and save away a piece of data present in another table when a trigger is firing. This data is typically needed for the purposes of determining where to 'route' the data to once routing takes place. Each trigger definition contains an optional "external select" field which can be used to specify the data to be captured. Once captured, this data is available during routing in DATA 's external_data field.
For these cases, place a SQL select statement which returns the data item you need for routing in external_select. See Trigger Variables for a list of variables available for use.
The external select SQL must return a single row, single column |
insert into SYM_TRIGGER
(trigger_id,source_table_name,channel_id,external_select, last_update_time,create_time)
values ('orderlineitem', 'orderlineitem','orderlineitem',
'select STORE_ID from order where order_id=$(curTriggerValue).$(curColumnPrefix)order_id',
current_timestamp, current_timestamp);
External select SQL statements should be used carefully as they will cause the trigger to run the additional SQL each time the trigger fires. |
Using an external select on the trigger is similar to using the 'subselect' router. The advantage of this approach over the 'subselect' approach is that it guards against the (somewhat unlikely) possibility that the master Order table row might have been deleted before routing has taken place. This external select solution also is a bit more efficient than the 'subselect' approach. |
4.5.3. Trigger Variables
The Sync Condition, External Select, and Custom Trigger Text configurations allow the user to provide procedure language text that is included inside the trigger. Variables can be used for configuration that works across different database platforms. When triggers are created, the variables are replaced with the syntax needed for that specific database.
$(newTriggerValue) |
New row alias for inserts and updates. For example, "$(newTriggerValue).MYCOLUMN" becomes ":new.MYCOLUMN" for an insert/update trigger on Oracle. |
$(oldTriggerValue) |
Old row alias for updates and deletes. For example, "$(oldTriggerValue).MYCOLUMN" becomes ":old.MYCOLUMN" for an update/delete trigger on Oracle. |
$(curTriggerValue) |
Current row alias for insert, updates, and deletes. This variable acts like $(newTriggerValue) for inserts and updates, and it acts like $(oldTriggerValue) for deletes. |
$(curColumnPrefix) |
Column prefix only used by H2 database. It is replaced with the NEW_ or OLD_ column prefix needed by H2. All other platforms will replace the variable with an empty string |
4.5.4. Load Only Triggers
Occasionally the decision of what data to load initially results in additional triggers. These triggers, known as load only triggers, are configured such that they do not capture any data changes. In other words, the sync on insert, sync on update, and sync on delete attributes of the trigger are all set to false.
-
Initial Loads
-
Reverse Initial Loads
-
Table Reloads
-
Creation of tables during initial loads
-
To load a read-only lookup table, for example. It could also be used to load a table that needs populated with example or default data.
-
Recovery of data for tables that have a single direction of synchronization. For example, a retail store records sales transactions that synchronize in one direction by trickling back to the central office. If the retail store needs to recover all the sales transactions from the central office, they can be sent are part of an initial load from the central office by setting up a load only trigger that "sync" in that direction.
The following SQL statement sets up a non-syncing dead Trigger that sends the sale_transaction table to the "store" Node Group from the "corp" Node Group during an initial load.
insert into sym_trigger (TRIGGER_ID,SOURCE_CATALOG_NAME,
SOURCE_SCHEMA_NAME,SOURCE_TABLE_NAME,CHANNEL_ID,
SYNC_ON_UPDATE,SYNC_ON_INSERT,SYNC_ON_DELETE,
SYNC_ON_INCOMING_BATCH,NAME_FOR_UPDATE_TRIGGER,
NAME_FOR_INSERT_TRIGGER,NAME_FOR_DELETE_TRIGGER,
SYNC_ON_UPDATE_CONDITION,SYNC_ON_INSERT_CONDITION,
SYNC_ON_DELETE_CONDITION,EXTERNAL_SELECT,
TX_ID_EXPRESSION,EXCLUDED_COLUMN_NAMES,
CREATE_TIME,LAST_UPDATE_BY,LAST_UPDATE_TIME)
values ('SALE_TRANSACTION_DEAD',null,null, 'SALE_TRANSACTION','transaction',
0,0,0,0,null,null,null,null,null,null,null,null,null,
current_timestamp,'demo',current_timestamp);
insert into sym_router (ROUTER_ID,TARGET_CATALOG_NAME,TARGET_SCHEMA_NAME,
TARGET_TABLE_NAME,SOURCE_NODE_GROUP_ID,TARGET_NODE_GROUP_ID,ROUTER_TYPE,
ROUTER_EXPRESSION,SYNC_ON_UPDATE,SYNC_ON_INSERT,SYNC_ON_DELETE,
CREATE_TIME,LAST_UPDATE_BY,LAST_UPDATE_TIME)
values ('CORP_2_STORE',null,null,null, 'corp','store',null,null,1,1,1,
current_timestamp,'demo',current_timestamp);
insert into sym_trigger_router (TRIGGER_ID,ROUTER_ID,INITIAL_LOAD_ORDER,
INITIAL_LOAD_SELECT,CREATE_TIME,LAST_UPDATE_BY,LAST_UPDATE_TIME)
values ('SALE_TRANSACTION_DEAD','CORP_2_REGION',100,null,
current_timestamp,'demo',current_timestamp);
4.6. Table Routing
The TRIGGER_ROUTER table is used to define which specific combinations of triggers and routers are needed for your configuration. The relationship between triggers and routers is many-to-many, so this table serves as the join table to define which combinations are valid, as well as to define settings available at the trigger-router level of granularity.
Three important controls can be configured for a specific Trigger / Router combination: Enabled, Initial Loads and Ping Back. The parameters for these can be found in the Trigger / Router mapping table, TRIGGER_ROUTER .
- Table Trigger
-
The table trigger determines the source of data.
- Router
-
The router determines where data will be sent.
- Initial Load Select
-
A SQL expression used in the WHERE clause of the SELECT that extracts the table during initial load. Leave blank to retrieve all rows. For correlated joins, reference the table using the "t" alias. Variables are substituted for
$(groupId)
,$(nodeId)
, and$(externalId)
. See Variables section for formatting and manipulation of variables. - Initial Load Delete
-
A SQL statement to run that clears the table on the target node at the beginning of the initial load before data is sent. This is usually a delete or truncate statement. Variables are substituted for
$(groupId)
,$(nodeId)
,$(externalId)
,$(sourceGroupId)
,$(sourceNodeId)
, and$(sourceExternalId)
. See Variables section for formatting and manipulation of variables. - Initial Load Order
-
Numeric position for this table in the initial load, sent in ascending numeric order. When two numeric values are the same, the ordering is based on foreign key constraints. Use a negative number to exclude the table from initial load.
- Enabled
-
When enabled, changes are captured and routed into batches for the table. When disabled, changes are no longer captured for the table and any outstanding changes are placed into batches with a status of "unrouted".
- Ping Back Enabled
-
When enabled, synchronized rows with the Sync On Incoming option set on a trigger will be routed back to the source node. By default, the data is not routed back to the source node (to prevent endless loops of data changes). A reason to turn this option on is when a transform changes some of the data and the transformed data needs to be returned to the source node.
4.7. File Triggers
In addition to supporting database synchronization, SymmetricDS also supports File Synchronization. Similar to database synchronization which allows configuring Table Triggers, SymmetricDS also supports setting up File Triggers. A file trigger is equivalent to specifying a directory structure or path that should be "watched" for files that need to be synchronized.
- File Trigger Id
-
Unique identifier for a trigger.
- Channel
-
The channel_id of the channel that data changes will flow through.
- Reload Channel Id
-
The channel_id of the channel that will be used for reloads.
- Base Directory
-
The base directory on the source node that files will be synchronized from.
- Recurse
-
Whether to synchronize child directories.
- Include Files
-
Wildcard-enabled (*), comma-separated list of file to include in synchronization.
- Exclude Files
-
Wildcard-enabled (*), comma-separated list of file to exclude from synchronization.
- Sync On Create
-
Whether to capture and send files when they are created.
- Sync On Modified
-
Whether to capture and send files when they are modified.
- Sync On Delete
-
Whether to capture and send files when they are deleted.
- Sync On Ctl File
-
Combined with sync_on_create, determines whether to capture and send files when a matching control file exists. The control file is a file of the same name with a '.ctl' extension appended to the end.
- Delete After Sync
-
Determines whether to delete the file after it has synced successfully.
- Before Copy Script
-
A beanshell script that is run at the target node right before the file copy to it’s destination directory.
- After Copy Script
-
A beanshell script that is run at the target node right after the file copy to it’s destination directory.
4.7.1. Operation
Not only is file synchronization configured similar to database synchronization, but it also operates in a very similar way. The file system is monitored for changes via a background job that tracks the file system changes (this parallels the use of triggers to monitor for changes when synchronizing database changes). When a change is detected it is written to the FILE_SNAPSHOT table. The file snapshot table represents the most recent known state of the monitored files. The file snapshot table has a SymmetricDS database trigger automatically installed on it so that when it is updated the changes are captured by SymmetricDS on an internal channel named filesync.
The changes to FILE_SNAPSHOT are then routed and batched by a file-synchronization-specific router that delegates to the configured router based on the FILE_TRIGGER_ROUTER configuration. The file sync router can make routing decisions based on the column data of the snapshot table, columns which contain attributes of the file like the name, path, size, and last modified time. Both old and new file snapshot data are also available. The router can, for example, parse the path or name of the file and use it as the node id to route to.
Batches of file snapshot changes are stored on the filesync channel in OUTGOING_BATCH. The existing SymmetricDS pull and push jobs ignore the filesync channel. Instead, they are processed by file-synchronization-specific push and pull jobs.
When transferring data, the file sync push and pull jobs build a zip file dynamically based on the batched snapshot data. The zip file contains a directory per batch. The directory name is the batch_id. A sync.bsh Bean Shell script is generated and placed in the root of each batch directory. The Bean Shell script contains the commands to copy or delete files at their file destination from an extracted zip in the staging directory on the target node. The zip file is downloaded in the case of a pull, or, in the case of a push, is uploaded as an HTTP multi-part attachment. Outgoing zip files are written and transferred from the outgoing staging directory. Incoming zip files are staged in the filesync_incoming staging directory by source node id. The filesync_incoming/{node_id} staging directory is cleared out before each subsequent delivery of files.
The acknowledgement of a batch happens the same way it is acknowledged in database synchronization. The client responds with an acknowledgement as part of the response during a file push or pull.
4.7.2. BeanShell Scripts
There are two types of Bean Shell scripts that can be leveraged to customize file synchronization behavior:
- Before copy script
-
This runs on delivery of a file before it is copied to it’s target location
- After copy script
-
This run on delivery of a file after it is copied to it’s target location
Each of these scripts have access to local variables that can be read or set to affect the behavior of copying files.
- targetBaseDir
-
The preset base directory as configured in file trigger or overwritten in file trigger router. This variable can be set by the before copy script to set a different target directory.
- targetFileName
-
The name of the file that is being synchronized. This variable can be overwritten by the before copy script to rename a file at the target.
- targetRelativeDir
-
The name of a directory relative to the target base directory to which the target file will be copied. The default value of this variable is the relative directory of the source. For example, if the source base directory is /src and the target base directory is /tgt and the file /src/subfolder/1.txt is changed, then the default targetRelativeDir will be subfolder. This variable can be overwritten by the before_copy_script to change the relative directory at the target. In the above example, if the variable is set to blank using the following script, then the target file will be copied to /tgt/1.txt.
targetRelativeDir = "";
- processFile
-
This is a variable that is set to true by default. A custom before copy script may process the file itself and set this variable to false to indicate that the file should NOT be copied to its target location.
- sourceFileName
-
This is the name of the file.
- sourceFilePath
-
This is the path where the file can be found relative to the batch directory.
- batchDir
-
This is the staging directory where the batch has been extracted. The batchDir + sourceFilePath + sourceFileName can be used to locate the extracted file.
- engine
-
This is the bound instance of the ISymmetricEngine that is processing a file. It gives access to all of the APIs available in SymmetricDS.
- sourceNodeId
-
This is a bound variable that represents the nodeId that is the source of the file.
- log
-
This is the bound instance of an org.slf4j.Logger that can be used to log to the SymmetricDS log file.
File file = new File(batchDir + "/" + sourceFilePath + "/" + sourceFileName);
if (file.exists()) {
String path = file.getAbsolutePath();
cp (path,"/backup/" + sourceFileName);
}
4.8. File Routing
The FILE_TRIGGER_ROUTER table is used to define which specific combinations of file triggers and routers are needed for your configuration. The relationship between file triggers and routers is many-to-many, so this table serves as the join table to define which combinations are valid, as well as to define settings available at the trigger-router level of granularity.
- File Triggers
-
The file trigger to link.
- Routers
-
The router to link.
- Target Base Directory
-
The base directory on the target node that files will be synchronized to.
- Conflict Strategy
-
The strategy to employ when a file has been modified at both the client and the server.
source_wins |
The source file will be used when a conflict occurs. |
target_wins |
The target file will be used when a conflict occurs. |
manual |
If a conflict occurs the batch will be put in ER (error) status and require manual intervention to resolve the issue. |
- Initial Enabled
-
Indicates whether this file trigger should be initial loaded.
- Enabled
-
Indicates whether this file trigger router is enabled or not.
4.9. Conflicts
Conflict detection is the act of determining if an insert, update or delete is in "conflict" due to the target data row not being consistent with the data at the source prior to the insert/update/delete. Without any overriding configuration, the system uses a detection of USE_CHANGED_DATA and a resolution of NEWER_WINS by default.
-
Detection - How to detect a conflict when loading data at the target.
-
Resolution - When a row is in conflict, what to do with it when loading at the target.
-
Ping Back - What data to capture at target during resolution, if any, that will be sent back to source.
Conflict detection and resolution strategies are configured in the CONFLICT table. They are configured at minimum for a specific NODE_GROUP_LINK . The configuration can also be specific to a CHANNEL and/or table.
Conflict detection is configured in the detect_type and detect_expression columns of CONFLICT. The value for detect_expression depends on the detect_type.
- Conflict Id
-
Unique identifier for a specific conflict detection setting.
- Group Link
-
References a node group link.
- Detection Type
-
Indicates the strategy to use for detecting conflicts during a dml action.
Conflicts are detected while data is being loaded into a target system. |
USE_PK_DATA |
Indicates that only the primary key is used to detect a conflict. If a row exists with the same primary key, then no conflict is detected during an update or a delete. Updates and deletes rows are resolved using only the primary key columns. If a row already exists during an insert then a conflict has been detected. |
USE_CHANGED_DATA |
Indicates that the primary key plus any data that has changed on the source system will be used to detect a conflict. If a row exists with the same old values on the target system as they were on the source system for the columns that have changed on the source system, then no conflict is detected during an update or a delete. If a row already exists during an insert then a conflict has been detected. |
USE_OLD_DATA |
Indicates that all of the old data values are used to detect a conflict. Old data is the data values of the row on the source system prior to the change. If a row exists with the same old values on the target system as they were on the source system, then no conflict is detected during an update or a delete. If a row already exists during an insert then a conflict has been detected. |
USE_TIMESTAMP |
Indicates that the primary key plus a timestamp column (as configured in detect_expression ) will indicate whether a conflict has occurred. If the target timestamp column is not equal to the old source timestamp column, then a conflict has been detected. If a row already exists during an insert then a conflict has been detected. You must specify the name of the column containing a timestamp to use in the detect_expression. |
USE_VERSION |
Indicates that the primary key plus a version column (as configured in detect_expression ) will indicate whether a conflict has occurred. If the target version column is not equal to the old source version column, then a conflict has been detected. If a row already exists during an insert then a conflict has been detected. You must specify the name of the column containing a version number to use in the detect_expression. |
- Detection Expression
-
An expression that provides additional information about the detection mechanism. If the detection mechanism is use_timestamp or use_version then this expression will be the name of the timestamp or version column. The detect_expression is also used to exclude certain column names from being used. For example, to exclude column1 and column2, the expression is "excluded_column_names=column1,column2".
- Resolution Type
-
The choice of how to resolve a detected conflict is configured via the resolve type.
NEWER_WINS |
Indicates that when a conflict is detected that the either the source or the target will win based on which side has the newer timestamp or higher version number. With USE_TIMESTAMP detection, the column specified in detect_expression is used, otherwise the time of capture is used. |
FALLBACK |
Indicates that when a conflict is detected the system should automatically apply the changes anyway. If the source operation was an insert, then an update will be attempted. If the source operation was an update and the row does not exist, then an insert will be attempted. If the source operation was a delete and the row does not exist, then the delete will be ignored. |
IGNORE |
Indicates that when a conflict is detected the system should automatically ignore the incoming change. Use IGNORE between two node groups in one direction, and FALLBACK in the other direction to establish which group wins a conflict. |
MANUAL |
Indicates that when a conflict is detected, the batch will remain in error until manual intervention occurs. A row in error is inserted into the INCOMING_ERROR table, which includes the conflict ID, old data, new data, and current data at the target. The user can specify the resolve data to use on the next load attempt. The resolve_ignore flag can also be used to indicate that the row should be ignored. |
To make a primary node group always win a conflict, use a "fallback" resolution on group links where primary is the source and an "ignore" resolution on group links where primary is the target. |
- Ping Back
-
For each configured conflict, you also have the ability to control if and how much "resolved" data is sent back to the node whose data change is in conflict.
SINGLE_ROW |
The resolved data of the single row in the batch that caused the conflict is sent back to the originating node. Recommended use with MANUAL resolution, so the resolved data is sent back to the originating node. |
REMAINING_ROWS |
The resolved data of the single row in the batch in conflict, along with the entire remainder of the batch, is sent back to the originating node. |
OFF |
No data is sent back to the originating node, even if the resolved data doesn’t match the data the node sent. Recommended use with resolution types that choose a winning row, including NEWER_WINS and when IGNORE and FALLBACK are used on opposing group links. |
- Resolve Changes Only
-
Indicates that when applying changes during an update that only data that has changed should be applied. Otherwise, all the columns will be updated. This really only applies to updates.
- Resolve Row Only
-
Ignore only the row in conflict when true, or ignore the entire batch when false. Used by IGNORE and NEWER_WINS resolvers. Recommended setting is true.
- Channel
-
Optional channel that this setting will be applied to.
- Target Catalog
-
Optional database catalog that the target table belongs to. Only use this if the target table is not in the default catalog.
- Target Schema
-
Optional database schema that the target table belongs to. Only use this if the target table is not in the default schema.
- Target Table
-
Optional database table that this setting will apply to. If left blank, the setting will be for any table in the channel (if set) and in the specified node group link.
Be aware that conflict detection will not detect changes to binary columns in the case where use_stream_lobs is true in the trigger for the table. In addition, some databases do not allow comparisons of binary columns whether use_stream_lobs is true or not. |
Some platforms do not support comparisons of binary columns. Conflicts in binary column values will not be detected on the following platforms: DB2, DERBY, ORACLE, and SQLSERVER. |
4.10. Transforms
Transforms allow you to manipulate data on a source node or target node, as the data is being loaded or extracted.
The source table must first be configured for synchronization (see Table Triggers) before it can be transformed. |
The source trigger captures data changes that are passed through transformations during extraction from the source and during loading at the target. If any enhancement with additional data is needed, an extract transform can access the source database, while a load transform can access the target database. A single row change may be processed by multiple transforms on the same group link, which transforms it into multiple target rows. Rows for the same target table do not merge, which means multiple transforms with the same target table results in multiple rows for that target table.
SymmetricDS stores its transformation configuration in two configuration tables, TRANSFORM_TABLE and TRANSFORM_COLUMN. Defining a transformation involves configuration in both tables, with the first table defining which source and destination tables are involved, and the second defining the columns involved in the transformation and the behavior of the data for those columns. We will explain the various options available in both tables and the various pre-defined transformation types.
To define a transformation, you will first define the source table and target table that applies to a particular transformation. The source and target tables, along with a unique identifier (the transform_id column) are defined in TRANSFORM_TABLE . In addition, you will specify the source_node_group_id and target_node_group_id to which the transform will apply, along with whether the transform should occur on the Extract step or the Load step (transform_point).
- Transform Id
-
Unique identifier of a specific transform.
- Group Link
-
The group link defining which direction the transform will process.
- Transform Point
-
Where this transform will occur. The options include:
EXTRACT |
The transform will execute while data is being extracted from the source. This means the transform will have access to the source’s database. |
LOAD |
The transform will execute while data is being loaded into the target. This means the transform will have access to the target’s database. |
- Column Policy
-
Indicates whether unspecified columns are passed thru or if all columns must be explicitly defined. The options include:
SPECIFIED |
Indicates that only the transform columns that are defined will be the ones that end up as part of the transformation. |
IMPLIED |
Indicates that if not specified, then columns from the source are passed through to the target. This is useful if you just want to map a table from one name to anther or from one schema to another. It is also useful if you want to transform a table, but also want to pass it through. You would define an implied transform from the source to the target and would not have to configure each column. |
- Source Catalog
-
Name of the catalog of the configured source table. This should only be set if Use Source Catalog/Schema or Target Catalog are set on the Router. Parameter values can be substituted using
$(name)
syntax. - Source Schema
-
Name of the schema for the configured source table. This should only be set if Use Source Catalog/Schema or Target Schema are set on the Router. Parameter values can be substituted using
$(name)
syntax. - Source Table
-
The name of the source table that will be transformed. Parameter values can be substituted using
$(name)
syntax. - Target Catalog
-
Optional name for the catalog a target target table is in. Only use this if the target table is not in the default catalog. Parameter values can be substituted using
$(name)
syntax. - Target Schema
-
Optional name of the schema a target target table is in. Only use this if the target table is not in the default schema. Parameter values can be substituted using
$(name)
syntax. - Target Table
-
The name of the target table. Parameter values can be substituted using
$(name)
syntax. - Update First
-
This option overrides the default behavior for an Insert operation. Instead of attempting the Insert first, SymmetricDS will always perform an Update first and then fall back to an Insert if that fails. Note that, by default, fall back logic always applies for Insert and Updates. Here, all you a specifying is whether to always do an Update first, which can have performance benefits under certain situations you may run into (see Operation Change).
- Delete Action
-
An action to take upon delete of a row.
DEL_ROW |
The delete results in a delete of the row as specified by the pk columns defined in the transformation configuration. |
UPDATE_COL |
The delete results in an update operation (see Operation Change) on the target which updates the specific rows and columns based on the defined transformation. |
NONE |
The delete results in no target changes. |
- Update Action
-
An action to take upon update of a row.
UPD_ROW |
The update performs normally. |
INS_ROW |
The update is transformed into an insert instead. |
DEL_ROW |
The update is transformed into a delete instead. |
NONE |
The update is ignored and no changes are made. |
- Transform Order
-
For a single source operation that is mapped to a transformation, there could be more than one target operation that takes place. You may control the order in which the target operations are applied through a configuration parameter defined for each source-target table combination. This might be important, for example, if the foreign key relationships on the target tables require you to execute the transformations in a particular order.
-
Copy a column from a source table to two (or more) target table columns,
-
Merge columns from two or more source tables into a single row in a target table,
-
Insert constants in columns in target tables based on source data synchronizations,
-
Insert multiple rows of data into a single target table based on one change in a source table,
-
Apply a Bean Shell script to achieve a custom transform when loading into the target database.
You must define columns for the transformation that are sufficient to fill in any primary key or other required data in the target table. |
insert into SYM_TRANSFORM_TABLE (
transform_id, source_node_group_id, target_node_group_id, transform_point, source_table_name,
target_table_name, update_action, delete_action, transform_order, column_policy, update_first,
last_update_by, last_update_time, create_time
) values (
'itemSellingPriceTransform', 'store', 'corp', 'EXTRACT', 'ITEM_SELLING_PRICE',
'ITEM_SELLING_PRICE', 'UPDATE_COL', 'DEL_ROW', 1, 'IMPLIED', 1,
'Documentation', current_timestamp, current_timestamp
);
4.10.1. Operation Change
By default the the "source operation" or "source DML type" (i.e., an insert, a delete, or an update), which initiated the transform execution will be the same operation applied to the target. There are two ways you can override this behavior.
Source Operation | Target Operation | Setting |
---|---|---|
INSERT |
UPDATE followed by INSERT if update was unsuccessful |
Update First |
DELETE |
UPDATE |
Delete Action is set to UPDATE_COL |
Update First
insert into SYM_TRANSFORM_TABLE (
transform_id, source_node_group_id, target_node_group_id, transform_point, source_table_name,
target_table_name, update_action, delete_action, transform_order, column_policy, update_first,
last_update_time, create_time
) values (
'update-first', 'store', 'corp', 'EXTRACT', 'ITEM_SELLING_PRICE',
'ITEM_SELLING_PRICE', 'UPDATE_COL', 'DEL_ROW', 1, 'IMPLIED', 1,
current_timestamp, current_timestamp
);
Delete Action
insert into SYM_TRANSFORM_TABLE (
transform_id, source_node_group_id, target_node_group_id, transform_point, source_table_name,
target_table_name, update_action, delete_action, transform_order, column_policy, update_first,
last_update_time, create_time
) values (
'delete-action-update-col', 'store', 'corp', 'EXTRACT', 'ITEM_SELLING_PRICE',
'ITEM_SELLING_PRICE', 'UPDATE_COL', 'UPDATE_COL', 2, 'IMPLIED', 0,
current_timestamp, current_timestamp
);
4.10.2. Columns
Transforms are not complete until the columns involved in the transformation have been defined. Typically there will be several columns defined for each transform, each of which will define a source column and a target column.
- PK
-
Indicates that this mapping is used to define the "primary key" for identifying the target row(s) (which may or may not be the true primary key of the target table). This is used to define the "where" clause when an Update or Delete on the target is occurring.
Unless the column policy is "IMPLIED" at least one row marked as a pk should be present for each transform_id. |
- Source
-
The source column name to be transformed.
- Target
-
The target column name to be transformed.
- Transform On
-
Defines whether this entry applies to source operations of Insert, Update, Delete, or All.
I |
Insert |
U |
Update |
D |
Delete |
* |
All |
- Type
-
The name of a specific type of transform, default type is "copy". See Transform Types for more information.
- Expression
-
An expression that is specific to the type of transform that is configured in transform_type. See Transform Types for more information.
- Order
-
In the event there are more than one columns to transform, this defines the relative order in which the transformations are applied.
4.10.3. Transform Types
There are several pre-defined transform types available in SymmetricDS. Additional ones can be defined by creating and configuring an extension point which implements the IColumnTransform interface. The pre-defined transform types include the following:
Copy Transform
This transformation type copies the source column value to the target column. This is the default behavior.
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', 'ITEM_ID', 'ITEM_ID', 1,
'copy', '', 1, current_timestamp, 'Documentation',
current_timestamp
);
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', 'STORE_ID', 'STORE_ID', 1,
'copy', '', 1, current_timestamp, 'Documentation',
current_timestamp
);
Remove Transform
This transformation type excludes the source column. This transform type is only valid for a table transformation type of 'IMPLIED' where all the columns from the source are automatically copied to the target.
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', '', 'COST', 1,
'remove', '', 1, current_timestamp, 'Documentation',
current_timestamp
);
Constant Transform
This transformation type allows you to map a constant value to the given target column. The constant itself is placed in transform expression.
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', 'PRICE', 'PRICE', 0,
'const', '10', 1, current_timestamp, 'Documentation',
current_timestamp
);
Variable Transform
This transformation type allows you to map a built-in dynamic variable to the given target column. The variable name is placed in transform expression. The following variables are available:
system_date |
current system date |
system_timestamp |
current system date and time using default timezone |
system_timestamp_utc |
current system date and time using UTC timezone |
source_node_id |
node id of the source (from the batch) |
target_node_id |
node id of the target (from the batch) |
source_node_id_from_data |
source_node_id value from sym_data (source of a captured synchronization data change) |
null |
null value |
old_column_value |
column’s old value prior to the DML operation. |
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', 'STORE_ID', 'STORE_ID', 0,
'variable', 'source_node_id', 1, current_timestamp, 'Documentation',
current_timestamp
);
Additive Transform
This transformation type is used for numeric data. It computes the change between the old and new values on the source and then adds the change to the existing value in the target column. That is, target = target + multiplier (source_new - source_old), where multiplier is a constant found in the transform expression (default is 1 if not specified).
If the source column changed from a 2 to a 4, the target column is currently 10, and the multiplier is 3, the effect of the transform will be to change the target column to a value of 16 ( 10+3*(4-2) ⇒ 16 ).
In the case of deletes, the new column value is considered 0 for the purposes of the calculation. |
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', 'PRICE', 'PRICE', 0,
'additive', '3', 1, current_timestamp, 'Documentation',
current_timestamp
);
Substring Transform
This transformation computes a substring of the source column data and uses the substring as the target column value. The transform expression can be a single integer ( n , the beginning index), or a pair of comma-separated integers ( n,m - the beginning and ending index). The transform behaves as the Java substring function would using the specified values in transform expression.
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', 'STORE_ID', 'STORE_ID', 0,
'substring', '0,5', 1, current_timestamp, 'Documentation',
current_timestamp
);
Left Transform
This transform copies the left most number of characters specified.
BLeft Transform
This transform copies the left most number of bytes specified.
Lookup Transform
This transformation determines the target column value by using a query, contained in transform expression to lookup the value in another table. The query must return a single row, and the first column of the query is used as the value. Your query references source column values by prefixing with a colon (e.g., :MY_COLUMN). Also, you can reference old values with :OLD_COLUMN and previously transformed columns (see transform order) with :TRM_COLUMN.
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', 'COST', 'COST', 0,
'lookup', 'select max(price) from sale_return_line_item
where item_id = :ITEM_ID', 1, current_timestamp, 'Documentation',
current_timestamp
);
Multiply Transform
This transformation allows for the creation of multiple rows in the target table based on the transform expression. This transform type can only be used on a primary key column. The transform expression is a SQL statement, similar to the lookup transform, except it can return multiple rows that result in multiple rows for the target table. The first column of the query is used as the value for the target column. The query can reference source column values by prefixing them with a colon (e.g., :MY_COLUMN).
BeanShell Script Transform
This transformation allows you to provide a BeanShell script in the transform expression and executes the script at the time of transformation. Beanshell transforms can return either a String value or an instance of NewAndOldValue. Some variables are provided to the script:
<COLUMN_NAME> |
The variable name is the source column name in uppercase of the row being changed (replace <COLUMN_NAME> with your column) |
currentValue |
The value of the current source column |
oldValue |
The old value of the source column for an updated row |
sqlTemplate |
org.jumpmind.db.sql.ISqlTemplate object for querying or updating the database |
channelId |
name of the channel on which the transformation is happening |
sourceNode |
org.jumpmind.symmetric.model.Node object that represents the node from where the data came |
sourceNodeId |
same as sourceNode.getNodeId() |
sourceNodeGroupId |
same as sourceNode.getNodeGroupId() |
sourceNodeExternalId |
same as sourceNode.getNodeExternalId() |
targetNode |
org.jumpmind.symmetric.model.Node object that represents the node where the data is being loaded. |
targetNodeId |
same as targetNode.getNodeId() |
targetNodeGroupId |
same as targetNode.getNodeGroupId() |
targetNodeExternalId |
same as targetNode.getNodeExternalId() |
transformColumn |
org.jumpmind.symmetric.io.data.transform.TransformColumn that is the transform configuration |
includeOn |
org.jumpmind.symmetric.io.data.transform.TransformColumn.IncludeOnType, same as transformColumn.getIncludeOn(), tells whether column transform is configured for all, insert, update, or delete |
sourceSchemaName |
source schema name that the transform matched |
sourceCatalogName |
source catalog name that the transform matched |
sourceTableName |
source table name that the transform matched |
transformedData |
org.jumpmind.symmetric.io.data.transform.TransformedData, the model object representing the outputted transformed data |
sourceDmlType |
org.jumpmind.symmetric.io.data.DataEventType that is the source row change type, either insert, update, or delete |
sourceDmlTypeString |
same as sourceDmlType.toString(), returning insert, update, or delete |
log |
org.slf4j.Logger, write to the log file |
context |
org.jumpmind.symmetric.io.data.DataContext containing internal variables and also acts like a Map for sharing variables between transforms for the current sync session |
bshContext |
java.util.Map, static map of variables to share between transforms |
engine |
org.jumpmind.symmetric.ISymmetricEngine, access to engine functions and services |
if (currentValue > oldValue) { return currentValue * .9; } else { return PRICE; }
if (currentValue != null && currentValue.length() == 0) { return org.jumpmind.symmetric.io.data.transform.NewAndOldValue(null, oldValue); } else { return currentValue; }
String newFilePath = PATH; String oldFilePath = null; if (transformedData.getOldSourceValues() != null) { oldFilePath = transformedData.getOldSourceValues().get("path"); } if (oldFilePath == null) { return newFilePath; } else { return oldFilePath; }
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', 'COST', 'COST', 0,
'bsh', 'if (currentValue > oldValue) { return currentValue * .9 } else { return PRICE }', 1, current_timestamp, 'Documentation',
current_timestamp
);
Identity Transform
This transformation allows you to insert into an identity column by letting the database compute a new identity, instead of loading an explicit value from a source database. This transform is needed on databases like SQL-Server and Sybase, which have an INSERT_IDENTITY option that is normally ON for normal data sync. By using the identity transform, the INSERT_IDENTITY is set to OFF, so the next value is generated by the database.
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', 'ITEM_ID', 'ITEM_ID', 0,
'identity', '', 1, current_timestamp, 'Documentation',
current_timestamp
);
Mathematical Transform
This transformation allows you to perform mathematical equations in the transform expression. Some variables are provided to the script:
#{COLUMN_NAME} |
A variable for a source column in the row, where the variable name is the column name in uppercase (replace COLUMN_NAME with your column name). |
#{currentValue} |
The value of the current source column |
#{oldValue} |
The old value of the source column for an updated row. |
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', 'COST', 'COST', 0,
'math', '#{currentValue} - #{oldValue} * #{PRICE}', 1, current_timestamp, 'Documentation',
current_timestamp
);
Copy If Changed
This transformation will copy the value to the target column if the source value has changed. More specifically, the copy will occur if the the old value of the source does not equal the new value.
IgnoreColumn |
If old and new values are equal, the COLUMN will be ignored |
{empty string} |
If old and new values are equal, the ROW will be ignored |
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', 'COST', 'COST', 0,
'copyIfChanged', 'IgnoreColumn', 1, current_timestamp, 'Documentation',
current_timestamp
);
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', 'COST', 'COST', 0,
'copyIfChanged', '', 2, current_timestamp, 'Documentation',
current_timestamp
);
Value Map Transform
This transformation allows for simple value substitutions through use of the transform expression. The transform expression should consist of a space separated list of value pairs of the format sourceValue=TargetValue. The column value is used to locate the correct sourceValue, and the transform will change the value into the corresponding targetValue. A sourceValue of * can be used to represent a default target value in the event that the sourceValue is not found. Otherwise, if no default value is found, the result will be null.
transform expression | source value | target value (result) |
---|---|---|
s1=t1 s2=t2 s3=t3 *=t4 |
s1 |
t1 |
s1=t1 s2=t2 s3=t3 *=t4 |
s2 |
t2 |
s1=t1 s2=t2 s3=t3 *=t4 |
s3 |
t3 |
s1=t1 s2=t2 s3=t3 *=t4 |
s4 |
t4 |
s1=t1 s2=t2 s3=t3 *=t4 |
s5 |
t4 |
s1=t1 s2=t2 s3=t3 *=t4 |
null |
t4 |
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) values (
'itemSellingPriceTransform', '*', 'COST', 'COST', 0,
'valueMap', 's1=t1 s2=t2 s3=t3 *=t4', 1, current_timestamp, 'Documentation',
current_timestamp
);
Clarion Date Time
Convert a Clarion date column with optional time column into a timestamp. Clarion dates are stored as the number of days since December 28, 1800, while Clarion times are stored as hundredths of a second since midnight, plus one. Use a source column of the Clarion date and a target column of the timestamp. If the Clarion time exists in a separate column it can optionally be provided through the transform expression to be included in the target timestamp column.
Columns To Rows
Convert column values from a single source row into a row per column value at the target. Two column mappings are needed to complete the work:
columnsToRowsKey |
Maps which source column is used |
column1=key1,column2=key2
columnsToRowsValue |
Maps the value |
changesOnly=true |
Convert only rows when the old and new values have changed |
ignoreNulls=true |
Convert only rows that are not null |
TODO Need SQL for this scenario
"fieldid" mapped as "columnsToRowsKey" with expression of "user1=1,user2=2" and column "color" mapped as "columnsToRowsValue" would convert a row with columns named "user1" and "user2" containing values "red" and "blue" into two rows with columns "fieldid" and "color" containing a row of "1" and "red" and a row of "2" and "blue".
isEmpty Transform
This transformation checks to see if a string is null or zero length. If it is empty the replacement value will be used. If no value is provided null will be used as a default replacement for empty values.
isBlank Transform
This transformation checks to see if a string is null or zero length after trimming white spaces. If it is blank the replacement value will be used. If no value is provided null will be used as a default replacement for blank values.
Null Value Transform
This transformation checks to see if the source value is null and if so replaces it with the provided value.
Deleted Columns Transform
For an update, this transform returns a comma-separated list of columns names that were set to null and previously not null.
Java Transform
Java Transform ('java'): Use Java code in the transform expression that is included in the transform method of a class that extends JavaColumnTransform. The class is compiled whenever the transform expression changes and kept in memory for runtime. The code must return a String for the new value of the column being mapped.
Some variables are provided to the code:
Variable Name | Java Type | Description |
---|---|---|
platform |
org.jumpmind.db.platform.IDatabasePlatform |
The platform for the database that this node is connected to |
context |
org.jumpmind.symmetric.io.data.DataContext |
The data cotext for the synchronization of the current batch |
column |
org.jumpmind.symmetric.io.data.transform.TransformColumn |
The transform column |
data |
org.jumpmind.symmetric.io.data.transform.TransformedData |
The transformed data |
sourceValues |
java.util.Map<java.lang.String, java.lang.String> |
The map of source values |
newValue |
java.lang.String |
The captured new value |
oldValue |
java.lang.String |
The captured old value |
if (sourceValues.containsKey("OLDKEY")) { return sourceValues.get("OLDKEY"); } else { return sourceValues.get("NEWKEY"); }
INSERT INTO SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order, last_update_time,
last_update_by, create_time
) VALUES (
'testjava', '*', 'NEWKEY', null, 0,
'java', 'if (sourceValues.containsKey("OLDKEY")) {
return sourceValues.get("OLDKEY");
} else {
return sourceValues.get("NEWKEY");
}', 0, current_timestamp, 'Documentation', current_timestamp);
4.10.4. Virtual Columns
Transforms provide the ability to create "virtual columns" which can pass data between nodes for use by other SymmetricDS processes.
-
Extract transform adds virtual column to be processed by a target load transform.
-
Extract transform adds virtual column to be processed by a target load filter.
-
Extract transform adds virtual column to be processed by a source router.
insert into SYM_TRANSFORM_TABLE (
transform_id, source_node_group_id, target_node_group_id, transform_point, source_table_name,
target_table_name, delete_action, transform_order, column_policy)
values (
'extractStoreItemSellingPriceTransform', 'store', 'corp', 'EXTRACT', 'ITEM_SELLING_PRICE',
'ITEM_SELLING_PRICE', 'DEL_ROW', 1, 'IMPLIED');
insert into SYM_TRANSFORM_TABLE (
transform_id, source_node_group_id, target_node_group_id, transform_point, source_table_name,
target_table_name, delete_action, transform_order, column_policy)
values (
'loadCorpItemSellingPriceTransform', 'corp', 'store', 'LOAD', 'ITEM_SELLING_PRICE',
'ITEM_SELLING_PRICE', 'DEL_ROW', 1, 'IMPLIED');
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order)
values (
'extractStoreItemSellingPriceTransform', '*', 'VIRTUAL_COL', 'COST', 0,
'lookup', 'select max(price) from sale_return_line_item
where item_id = :ITEM_ID', 1);
insert into SYM_TRANSFORM_COLUMN (
transform_id, include_on, target_column_name, source_column_name, pk,
transform_type, transform_expression, transform_order)
values (
'loadCorpItemSellingPriceTransform', '*', 'COST', 'VIRTUAL_COL', 0,
'copy', '', 1);
4.11. Load Filters
Load Filters are a way to take a specific action when a row of data is loaded by SymmetricDS at a destination database node.
Load filters run for each row of data being loaded. |
- Filter Id
-
The unique identifier for the load filter
- Group Link
-
The group link for with the load filter will be applied.
- Type
-
The type of load filter. Today only Bean Shell, Java, and SQL are supported ('BSH', 'Java', 'SQL').
- Target Table
-
The table on the target which the load filter will execute when changes occur on it.
Use the wildcard * to specify all tables configured through the group link. Partial table names in conjunction with a wildcard are NOT supported. If the wildcard is used it should be the only value. |
- Filter Order
-
The order in which load filters should execute if there are multiple scripts pertaining to the same source and target data.
- Filter On Update
-
Determines whether the load filter takes action (executes) on a database update statement.
- Filter On Insert
-
Determines whether the load filter takes action (executes) on a database insert statement.
- Filter On Delete
-
Determines whether the load filter takes action (executes) on a database delete statement.
- Fail On Error
-
Whether we should fail the batch if the filter fails.
- Target Catalog
-
The name of the target catalog for which you would like to watch for changes.
- Target Schema
-
The name of the target schema for which you would like to watch for changes.
4.11.1. Load Filter Scripts
Load filters are based on the execution of a script. You have the ability to set the execution point of the script at 6 different points. A script can be provided for one or more of these execution points.
Return Values
|
Before Write Script |
The script to execute before the database write occurs. |
After Write Script |
The script to execute after the database write occurs. |
Batch Complete Script |
The script to execute after the entire batch completes. |
Batch Commit Script |
The script to execute after the entire batch is committed. |
Batch Rollback Script |
The script to execute if the batch rolls back. |
Handle Error Script |
A script to execute if data cannot be processed. |
Variable | BSH | SQL | JAVA | Description |
---|---|---|---|---|
engine |
X |
The Symmetric engine object. |
||
COLUMN_NAME |
X |
X |
The source values for the row being inserted, updated or deleted. |
|
OLD_COLUMN_NAME |
X |
X |
The old values for the row being inserted, updated or deleted. |
|
context |
X |
X |
The data context object for the data being inserted, updated or deleted. . |
|
table |
X |
X |
The table object for the table being inserted, updated or deleted. |
|
data |
X |
X |
The |
|
error |
X |
X |
|
insert into sym_load_filter
(LOAD_FILTER_ID, LOAD_FILTER_TYPE, SOURCE_NODE_GROUP_ID,
TARGET_NODE_GROUP_ID, TARGET_CATALOG_NAME, TARGET_SCHEMA_NAME,
TARGET_TABLE_NAME, FILTER_ON_UPDATE, FILTER_ON_INSERT, FILTER_ON_DELETE,
BEFORE_WRITE_SCRIPT, AFTER_WRITE_SCRIPT, BATCH_COMPLETE_SCRIPT,
BATCH_COMMIT_SCRIPT, BATCH_ROLLBACK_SCRIPT, HANDLE_ERROR_SCRIPT,
CREATE_TIME, LAST_UPDATE_BY, LAST_UPDATE_TIME, LOAD_FILTER_ORDER,
FAIL_ON_ERROR) values
('SampleFilter','BSH','Client','Server',NULL,NULL,
'ITEM_SELLING_PRICE',1,1,1,'
if (OLD_COST > COST) {
// row will not be loaded
return false
} else {
// row will be loaded
return true
}
',
null,null,null,null,null,
sysdate,'Documentaion',sysdate,1,1);
insert into sym_load_filter
(LOAD_FILTER_ID, LOAD_FILTER_TYPE, SOURCE_NODE_GROUP_ID,
TARGET_NODE_GROUP_ID, TARGET_CATALOG_NAME, TARGET_SCHEMA_NAME,
TARGET_TABLE_NAME, FILTER_ON_UPDATE, FILTER_ON_INSERT, FILTER_ON_DELETE,
BEFORE_WRITE_SCRIPT, AFTER_WRITE_SCRIPT, BATCH_COMPLETE_SCRIPT,
BATCH_COMMIT_SCRIPT, BATCH_ROLLBACK_SCRIPT, HANDLE_ERROR_SCRIPT,
CREATE_TIME, LAST_UPDATE_BY, LAST_UPDATE_TIME, LOAD_FILTER_ORDER,
FAIL_ON_ERROR) values
('EmailErrorFilter','BSH','Client','Server',NULL,NULL,
'*',1,1,1,null,
null,null,null,null,'
authListener = new javax.mail.Authenticator() {
protected javax.mail.PasswordAuthentication getPasswordAuthentication() {
return new javax.mail.PasswordAuthentication(engine.getParameterService().getString("mail.smtp.username"),
engine.getParameterService().getString("mail.smtp.password"));
}
};
if (bsh.shared.mailMap == void) {
bsh.shared.mailMap = new HashMap();
}
String batchId = context.getBatch().getNodeBatchId();
String targetNodeId = context.getBatch().getTargetNodeId();
if (!bsh.shared.mailMap.containsKey(batchId)) {
bsh.shared.mailMap.put(batchId, Boolean.TRUE);
javax.mail.Session session = javax.mail.Session.getInstance
(engine.getParameterService().getAllParameters(), authListener);
javax.mail.internet.MimeMessage msg = new
javax.mail.internet.MimeMessage(session);
msg.setFrom(new javax.mail.internet.InternetAddress
(engine.getParameterService().getString("mail.smtp.from")));
msg.setRecipients(javax.mail.Message.RecipientType.TO,
engine.getParameterService().getString("mail.smtp.to"));
msg.setSubject("SymmetricDS - batch " + batchId + " is in error at node " + targetNodeId);
msg.setSentDate(new java.util.Date());
msg.setText(org.apache.commons.lang.exception.ExceptionUtils.
getFullStackTrace(error));
javax.mail.Transport.send(msg);
}',
sysdate,'Documentation',sysdate,1,1);
4.11.2. Custom Load Filters
Custom load filters can be created by implementing the IDatabaseWriterFilter, see IDatabaseWriterFilter for more information.
4.12. Extensions
Extensions are custom code written to a plug-in interface, which allows them to run inside the engine and change its default behavior. Saving extension code in the configuration has the advantage of dynamically running without deployment or restarting. Configured extensions are available to other nodes and move between environments when configuration is exported and imported.
- Extension Id
-
Identifier for a unique extension entry.
- Extension Type
-
Type of extension, either written in Java or BeanShell. Java extensions are compiled to bytecode on first use and may be compiled to native code by the Just-In-Time (JIT) compiler, giving them the best performance. BeanShell extensions are parsed on first use and interpreted at runtime, but they are easier to write because of loose typing and short-cuts with syntax.
Java |
Indicates that Java code is provided in the extension text. |
BSH |
Indicates that BeanShell code is provided in the extension text. Built-in variables are available for engine, sqlTemplate, and log. |
- Interface Name
-
The full class name for the interface implemented by the extension, including the package name. Only needed for extension type of BSH.
- Node Group Id
-
The node group where this extension will be active and run.
- Enabled
-
Whether or not the extension should be run.
- Extension Order
-
The order to register extensions when multiple extensions for the same interface exist.
- Extension Text
-
The code for the extension that will be compiled or interpreted at runtime.
Add a new transform type called "mask" that replaces all characters in a string with an asterisk except the last number of characters specified by the user in the expression. This BeanShell extension uses the ISingleValueColumnTransform interface and applies only to the "corp" node group.
insert into sym_extension
(extension_id, extension_type, interface_name, node_group_id, enabled,
extension_order, extension_text)
values
('mask', 'bsh',
'org.jumpmind.symmetric.io.data.transform.ISingleValueColumnTransform',
'corp', 1, 1, '
import org.apache.commons.lang.StringUtils;
isExtractColumnTransform() {
return true;
}
isLoadColumnTransform() {
return true;
}
transform(platform, context, column, data, sourceValues, newValue, oldValue) {
if (StringUtils.isNotBlank(newValue)) {
String expression = column.getTransformExpression();
if (StringUtils.isNotBlank(expression)) {
count = newValue.length() - Integer.parseInt(expression.trim());
return StringUtils.repeat("*", count) + newValue.substring(count);
}
}
return newValue;
}
');
IReloadListener
extension that disables foreign keys before a load and enables them after the load.insert into SYM_EXTENSION (EXTENSION_ID, EXTENSION_TYPE, INTERFACE_NAME, NODE_GROUP_ID, ENABLED, EXTENSION_ORDER, EXTENSION_TEXT, CREATE_TIME, LAST_UPDATE_BY, LAST_UPDATE_TIME) values ('disable ref integrity','java','org.jumpmind.symmetric.load.IReloadListener','ALL',1,1,'import org.jumpmind.db.sql.ISqlTransaction;
import org.jumpmind.symmetric.ISymmetricEngine;
import org.jumpmind.symmetric.ext.ISymmetricEngineAware;
import org.jumpmind.symmetric.load.IReloadListener;
import org.jumpmind.symmetric.model.Node;
public class MyReloadListener implements IReloadListener, ISymmetricEngineAware {
ISymmetricEngine engine;
@Override
public void setSymmetricEngine(ISymmetricEngine engine) {
this.engine = engine;
}
@Override
public void beforeReload(ISqlTransaction transaction, Node node, long loadId) {
TriggerHistory history = engine.getTriggerRouterService().findTriggerHistoryForGenericSync();
engine.getDataService().insertSqlEvent(transaction, history, "reload", node, "SET REFERENTIAL_INTEGRITY FALSE", true, loadId, "initial load");
}
@Override
public void afterReload(ISqlTransaction transaction, Node node, long loadId) {
TriggerHistory history = engine.getTriggerRouterService().findTriggerHistoryForGenericSync();
engine.getDataService().insertSqlEvent(transaction, history, "reload", node, "SET REFERENTIAL_INTEGRITY TRUE", true, loadId, "initial load");
}
}
',current_timestamp,'some user',current_timestamp);
Extensions Tips and Tricks
For BeanShell, implement only the methods needed from an interface, then write a special method of "invoke(method, args) {}"
that will be called for any unimplemented methods.
|
4.13. Jobs
In SymmetricDS, jobs are tasks that are scheduled to run by a job manager. These jobs do the majority of synchronization work for SymmetricDS. In addition to the built in jobs, you can create custom jobs in Configuration to run for specified node groups.
- Job Name
-
The Job Name is a unique identifier to refer to the job
- Job Type
-
The type of job. Possible types are listed below.
Type | Provided | Description |
---|---|---|
BSH |
x |
A job that will compile and run a beanshell script. |
JAVA |
x |
A job that will compile and run a java script. |
SQL |
x |
A job that will compile and run a sql script. |
- Requires Registration
-
Boolean value that specifies whether the engine needs to be registered for this job to run.
- Auto Start-Up
-
Boolean value that determines whether this job should auto start.
- Node Group
-
Targets the job at a specific node group id. To target all groups, use the value of 'ALL'.
- Job Expression
-
The payload of the job. For BSH jobs, this should be a beanshell script. For Java jobs, this should be Java code of a class which implements the IJob interface. For SQL jobs, this should be a sql script. BSH jobs are written in beanshell script (http://www.beanshell.org/). The beanshell has the following variables available:
-
engine - An instances of the current ISymmetricEngine
-
sqlTemplate - An instance of ISqlTemplate which can be used to run SQL statements.
-
log - An logger which can be used to write messages to the log.
-
This is an example BSH job:
// Configuration for this extension...
final String TABLE_NAME = "item";
final String CHANNEL_ID = "item";
final String ROUTER_ID = "corp_2_store";
// End Configuration.
String INSERT_RELOAD_DATA_TEMPLATE =
"insert into sym_data ( table_name, event_type, row_data, trigger_hist_id, channel_id, transaction_id, create_time)" +
" select t.source_table_name, 'R', 'item_id > ''1''', " +
" h.trigger_hist_id, t.channel_id, '1', current_timestamp" +
" from sym_trigger t inner join sym_trigger_router tr on" +
" t.trigger_id=tr.trigger_id inner join sym_trigger_hist h on" +
" h.trigger_hist_id=(select max(trigger_hist_id) from sym_trigger_hist" +
" where trigger_id=t.trigger_id)" +
" where channel_id=? and" +
" tr.router_id like ? and" +
" (t.source_table_name = ?)" +
" order by tr.initial_load_order asc;";
int updatedCount = sqlTemplate.update(INSERT_RELOAD_DATA_TEMPLATE, new Object[]{CHANNEL_ID, ROUTER_ID, TABLE_NAME});
if (updatedCount == 1) {
log.info("Issued reload for table '" + TABLE_NAME + "'");
} else {
throw new SymmetricException("SyncOnHeartbeatExtension is designed to insert exactly 1 sym_data row. Instead inserted " +
updatedCount + ". Check TABLE_NAME, CHANNEL_ID, ROUTER_ID parameters in the extension itself.", null);
}
This is the same job as a Java job. This might perform a little better, but you’ll need to have a JDK (not just JRE) available on your nodes for this to compile.
// Configuration for this extension...
final String TABLE_NAME = "item";
final String CHANNEL_ID = "item";
final String ROUTER_ID = "corp_2_store";
// End Configuration.
String INSERT_RELOAD_DATA_TEMPLATE =
"insert into sym_data ( table_name, event_type, row_data, trigger_hist_id, channel_id, transaction_id, create_time)" +
" select t.source_table_name, 'R', 'item_id > ''1''', " +
" h.trigger_hist_id, t.channel_id, '1', current_timestamp" +
" from sym_trigger t inner join sym_trigger_router tr on" +
" t.trigger_id=tr.trigger_id inner join sym_trigger_hist h on" +
" h.trigger_hist_id=(select max(trigger_hist_id) from sym_trigger_hist" +
" where trigger_id=t.trigger_id)" +
" where channel_id=? and" +
" tr.router_id like ? and" +
" (t.source_table_name = ?)" +
" order by tr.initial_load_order asc;";
log.info("TESTING\n\n\n");
int updatedCount = sqlTemplate.update(INSERT_RELOAD_DATA_TEMPLATE, CHANNEL_ID, ROUTER_ID, TABLE_NAME);
if (updatedCount == 1) {
log.info("Issued reload for table '" + TABLE_NAME + "'");
} else {
throw new org.jumpmind.symmetric.SymmetricException("SyncOnHeartbeatExtension is designed to insert exactly 1 sym_data row. Instead inserted " +
updatedCount + ". Check TABLE_NAME, CHANNEL_ID, ROUTER_ID parameters in the extension itself.");
}
This is an example of a SQL job that updates the heartbeat_time of the current node. Available tokens are: :NODE_ID and :NODE_GROUP_ID.
update sym_node_host set heartbeat_time = now() where node_id = :NODE_ID;
delete from item where item_id = 0;
insert into ITEM (ITEM_ID, NAME) values (0,'tesing');
- Schedule
-
Specifies how often this job should run.
insert into SYM_JOB ("job_name", "job_type", "requires_registration",
"job_expression", "default_schedule", "default_auto_start", "node_group_id")
values ('Java Job', 'JAVA', 1, <YOUR JOB EXPRESSION HERE>,'1000', 0, 'server');
4.14. Parameters
Parameters can be used to help tune and configure your SymmetricDS configuration. Parameters can be set for an individual node or for all nodes in your network.
See Parameter List for a complete list of parameters.
4.15. Mail Server
A mail server can be configured for sending email notifications.
- Target Nodes
-
The node group ID that will use this configuration.
- Hostname
-
The hostname or IP address of the mail server to contact for sending mail.
- Transport
-
The transport mechanism is either SMTP (Simple Mail Transfer Protocol) or SMTPS (encrypted with SSL).
- Port
-
The default port for SMTP is 25, while the default port for SMTPS is 465.
- Use StartTLS
-
After connecting over SMTP, the TLS protocol is used to encrypt content.
- Use Authentication
-
The mail server requires a login and password before email can be sent.
- Use SSL Authentication
-
The mail server requires SSL authentication (Gmail, AWS).
- User
-
The user login to use for authentication.
- Password
-
The login password to use for authentication.
4.16. Monitors
A monitor watches some part of the system for a problem, checking to see if the monitored value exceeds a threshold. (To be notified immediately of new monitor events, configure a notification.)
- Monitor ID
-
The monitor ID is a unique name to refer to the monitor.
- Node Group ID
-
The node group that will run this monitor. Use "ALL" to match all groups.
- External ID
-
The external ID of nodes that will run this monitor. Use "ALL" to match all nodes.
- Monitor Type
-
The monitor type is one of several built-in or custom types that run a specific check and return a numeric value that can be compared to a threshold value.
Type | Description |
---|---|
cpu |
Percentage from 0 to 100 of CPU usage for the server process. |
disk |
Percentage from 0 to 100 of disk usage (tmp folder staging area) available to the server process. |
memory |
Percentage from 0 to 100 of memory usage (tenured heap pool) available to the server process. |
batchError |
Number of incoming and outgoing batches in error. |
batchUnsent |
Number of outgoing batches waiting to be sent. |
dataUnrouted |
Number of change capture rows that are waiting to be batched and sent. |
dataGaps |
Number of active data gaps that are being checked during routing for data to commit. |
offlineNodes |
The number of nodes that are offline based on the last heartbeat time. The console.report.as.offline.minutes parameter controls how many minutes before a node is considered offline. |
log |
Number of entries found in the log for the specified severity level. |
block |
Number of seconds that a transaction has been blocked for. |
loadAverage |
Sum of the number of runnable entities queued to the available processors and the number of runnable entities running on the available processors averaged over a period of time. Not implemented for Windows. |
fileHandles |
Percentage from 0 to 100 of Operating System’s open file handles. Not implemented for Windows. |
job |
Number of jobs that are in error. This only applies to jobs that record statistics in the NODE_HOST_JOB_STATS table. The built-in jobs that write to this table are Routing, Purge Outgoing, Purge Incoming, and SyncTriggers. |
- Expression
-
An expression used by the monitor to set options specific to the monitor type. For batchError monitors, setting the expression to "notifyOnIncrease=true" will trigger an additional notification if the number of batches in error has increased since the last time the monitor job ran.
- Threshold
-
When this threshold value is reached or exceeded, an event is recorded.
- Run Period
-
The time in seconds of how often to run this monitor. The monitor job runs on a period also, so the monitor can only run as often as the monitor job.
- Run Count
-
The number of times to run the monitor before calculating an average value to compare against the threshold.
- Severity Level
-
The importance of this monitor event when it exceeds the threshold.
- Enabled
-
Whether or not this monitor is enabled to run.
4.17. Notifications
A notification sends a message to the user when a monitor event records a system problem. First configure a monitor to watch the system and record events with a specific severity level. Then, configure a notification to match the severity level and write to the log or send an email.
- Notification ID
-
The notification ID is a unique name to refer to the notification.
- Node Group ID
-
The node group that will run this monitor. Use "ALL" to match all groups.
- External ID
-
The external ID of nodes that will run this monitor. Use "ALL" to match all nodes.
- Notification Type
-
The notification type is either a built-in or custom type that is given the list of monitor events to send.
Type | Description |
---|---|
log |
The monitor events are written to the log using the same severity level. |
The monitor events are sent in an email to a list of recipients. Use the expression for the comma-separated list of email addresses. |
- Expression
-
Additional information to configure the notification type.
- Severity Level
-
Find monitor events that occur at this severity level or above.
- Enabled
-
Whether or not this notification is enabled to run.
5. Manage
This section will talk about how to manage and monitor SymmetricDS.
5.1. Nodes
5.1.1. Add Node
Multiple nodes can be hosted in a single SymmetricDS instance. SymmetricDS will start a node for each properties file it finds in the engines directory.
Additional nodes can be added to the same SymmetricDS instance that the master node is running in or they can be added to a different
SymmetricDS instance. Either way, you create additional nodes by creating an Node Properties File with the registration.url
set to
the sync.url
of the master node. When the SymmetricDS instance is restarted the new node will attempt to register with the master node.
For the new node to fully become a part of the synchronization network Registration must be opened.
5.1.2. Load Data
A load is the process of seeding tables at a target node with data from a source node. Instead of capturing data, data is selected from the source table using a SQL statement and then it is streamed to the client.
Initial loads, reverse initial loads, and table reloads can utilize the TABLE_RELOAD_REQUEST to request a load with a variety of options.
Initial Load (all tables)
Insert a row into TABLE_RELOAD_REQUEST containing the value 'ALL' for both the trigger_id and router_id.
insert into SYM_TABLE_RELOAD_REQUEST (target_node_id, source_node_id, trigger_id, router_id, create_time, last_update_time)
values ('store-001', 'corp-000', 'ALL', 'ALL', current_timestamp, current_timestamp);
Partial Load
Insert a row into TABLE_RELOAD_REQUEST for each trigger router combination to load.
insert into SYM_TABLE_RELOAD_REQUEST (target_node_id, source_node_id, trigger_id, router_id, create_time, last_update_time)
values ('store-001', 'corp-000', 'item_selling_price', 'corp_2_store', current_timestamp, current_timestamp);
insert into SYM_TABLE_RELOAD_REQUEST (target_node_id, source_node_id, trigger_id, router_id, create_time, last_update_time)
values ('store-001', 'corp-000', 'item', 'corp_2_store', current_timestamp, current_timestamp);
Reverse Initial Load (all tables)
Insert a row into TABLE_RELOAD_REQUEST with the proper source and target nodes for the direction of the load.
insert into SYM_TABLE_RELOAD_REQUEST (target_node_id, source_node_id, trigger_id, router_id, create_time, last_update_time)
values ('corp-000', 'store-001', 'ALL', 'ALL', current_timestamp, current_timestamp);
Load data and create target tables
Insert a row into TABLE_RELOAD_REQUEST and set the create_table to 1 to send a table creation prior to the load running.
insert into SYM_TABLE_RELOAD_REQUEST (target_node_id, source_node_id, trigger_id, router_id, create_time, create_table, last_update_time)
values ('corp-000', 'store-001', 'ALL', 'ALL', current_timestamp, 1, current_timestamp);
Load data and delete from target tables
Insert a row into TABLE_RELOAD_REQUEST and set the delete_first to 1 to delete all data in the target table prior to the load running.
insert into SYM_TABLE_RELOAD_REQUEST (target_node_id, source_node_id, trigger_id, router_id, create_time, delete_first, last_update_time)
values ('corp-000', 'store-001', 'ALL', 'ALL', current_timestamp, 1, current_timestamp);
Load data for a specific table with partial data
Insert a row into TABLE_RELOAD_REQUEST and set the reload_select to the where clause to run while extracting data. There are 3 variables available for replacement. See Variables.
-
$(groupId)
-
$(nodeId)
-
$(externalId)
insert into SYM_TABLE_RELOAD_REQUEST (target_node_id, source_node_id, trigger_id, router_id, create_time, reload_select, last_update_time)
values ('store-001', 'corp-000', 'item_selling_price', 'corp_2_store', current_timestamp, 'store_id=$(externalId)', current_timestamp);
Load table with custom SQL run before the load executes.
Insert a row into TABLE_RELOAD_REQUEST and set the before_custom_sql to run before the load runs. The %s variable is available as replacement for the table name.
insert into SYM_TABLE_RELOAD_REQUEST (target_node_id, source_node_id, trigger_id, router_id, create_time, before_custom_sql, last_update_time)
values ('store-001', 'corp-000', 'ALL', 'ALL', current_timestamp, 'truncate table %s', current_timestamp);
5.1.3. Control
Stopping a Node
Installed nodes are started automatically when the SymmetricDS server is started. An individual node instance can be stopped while other nodes continue to run.
From the command line, you can use JMX to stop a node. The following is an example. You would replace <engine name> with the name of the engine as found in the Node Properties File
bin/jmx --bean org.jumpmind.symmetric.<engine name>:name=Node --method stop
Uninstalling a Node
Uninstalling a node will remove all SymmetricDS database artifacts and delete the engine’s property file.
This can not be undone so be sure to uninstall with caution. |
From the command line you can use the symadmin utility to uninstall a node.
bin/symadmin --engine <engine name> uninstall
5.1.4. Registration
In order for a node to synchronize with other nodes it must be registered. When a node is registered, it downloads its configuration as well as references to the nodes that it should sync with.
A node is considered unregistered if it does not have an NODE_IDENTITY row. When a node is unregistered,
it will use the registration.url
defined in the Node Properties File to request registration.
The registration.url
of the new node is the sync.url
of the node that is being registered with.
Before a node is allowed to register, it must have an open registration. If there is no open registration, then a REGISTRATION_REQUEST is recorded.
You can open registration from the command line with the following command:
bin/symadmin open-registration --engine <engine name> <node group> <external id>
The <node group> and <external id> should match the group.id
and external.id
in the registering node’s Node Properties File.
Node registration is stored in the NODE and NODE_SECURITY tables. Nodes are only allowed to register if rows exist
for the registering node and the registration_enabled
flag is set to 1.
If the auto.registration
SymmetricDS parameter is set to true, then when a node attempts to register, the node will
automatically be accepted.
SymmetricDS allows you to have multiple nodes with the same external_id
. In order to enable this you must set
external.id.is.unique.enabled
to false.
5.1.5. Push Registration
If the client is unable to pull from the server, it can register by having the server push to it instead. The group link for server to client should be set to push.
You can open registration from the command line with the following command:
bin/symadmin open-registration --engine <engine name> <node group> <external id> <sync url>
The <node group>, <external id>, and <sync url> should match the group.id
, external.id
, and sync.url
in the registering node’s Node Properties File.
5.1.6. Initial Loads
Loading data for 3.8 and above has been modified, see Load Data.
When a load is requested it will either set the initial_load_enabled
or the reverse_initial_load_enabled
flag on the
appropriate NODE_SECURITY row.
When the Route Job runs next, it will create batches that represent the initial load. Batches will be created on the reload channel for each table that is defined by Table Triggers and linked by Table Routing in the direction that the load was requested. The default reload channel is the "reload" channel. At the same time reload batches are inserted, all previously pending batches for the node are marked as successfully sent.
Each table defined by Table Triggers and linked by Table Routing is represented by a reload OUTGOING_BATCH. The batches
are inserted in the defined order. If the initial_load_order
is the same then SymmetricDS tries to determine the order the
tables need to be loaded in automatically based on foreign key dependencies. A negative value for initial_load_order
in Table Routing will result
no reload batch being inserted.
If there are cyclical constraints, then foreign keys might need to be turned off or the initial load will need to be manually configured based on knowledge of how the data is structured.
A SQL statement is run against each table to get the data load
that will be streamed to the target node. The selected data is filtered
through the configured router for the table being loaded. If the data
set is going to be large, then SQL criteria can optionally be provided in initial_load_select
to pare down the data that is selected out of the database.
Note that if the parent node that a node is registering with is not a registration server node
(as can happen when using REGISTRATION_REDIRECT or when using multiple tiers) the parent node’s NODE_SECURITY
entry must exist at the parent node and have a non-null value for column initial_load_time . Nodes can’t be registered
to a non-registration-server node without this value being set one way or another (i.e., manually, or as a result
of an initial load occurring at the parent node).
|
Partial Initial Loads
An efficient way to select a subset of data from a table for an initial
load is to provide an initial_load_select
clause in Table Routing
. This clause, if present, is applied as a where
clause to the SQL used to
select the data to be loaded. The clause may use "t" as an alias for the table being loaded, if needed. The
$(externalId)
token can be used for subsetting the data in the where clause.
In cases where routing is done using a feature like
Subselect Router
, an
initial_load_select
clause matching the subselect’s criteria would be a more efficient
approach. Some routers will check to see if the
initial_load_select
clause is provided, and they will
not
execute assuming that the more optimal path is using the
initial_load_select
statement.
One example of the use of an initial load select would be if you wished
to only load data created more recently than the start of year 2011.
Say, for example, the column
created_time
contains the creation date. Your
initial_load_select
would read
created_time > ts {'2011-01-01 00:00:00.0000'}
(using whatever timestamp format works for your database). This then
gets applied as a
where
clause when selecting data from the table.
When providing an
initial_load_select
be sure to test out the criteria against production data in a query
browser. Do an explain plan to make sure you are properly using indexes.
|
Initial Load Extract In Background
By default, initial loads for a table are broken into multiple batches, with the size of batches based on the
max_batch_size
of the Channels for the reload channel being used.
Batches are pre-extracted to staging in the background, instead of waiting for a push or pull to extract them.
An estimated count of rows for the table are queried from the database statistics, or it will execute a count(*) query
from the table if statistics are not available.
The extract process creates batches based on the number of rows in the table divided by the max_batch_size
.
If the background job is disabled by setting initial.load.use.extract.job.enabled
to false,
then all data for a given table will be extracted into a single batch during a push or pull, regardless of channel settings.
For large tables, this can result in a batch that can take a long time to extract and load.
Reverse Initial Loads
Normal initial loads load data from the parent node to a client node. Occasionally, there may be need to do a one-time
initial load of data in the "reverse" direction. A reverse initial load is started by setting the reverse_initial_load_enabled
flag
on NODE_SECURITY.
Other Initial Load Settings
Initial Load Parameters
There are several parameters that can be used to modify the behavior of an initial load.
- auto.reload
-
A load is queued up to a node automatically when it registers.
- auto.reload.reverse
-
A reverse initial load is queued up for a node automatically when it registers.
- initial.load.delete.first / initial.load.delete.first.sql
-
By default, an initial load will not delete existing rows from a target table before loading the data. If a delete is desired, the parameter
initial.load.delete.first
can be set to true. If true, the command found ininitial.load.delete.first.sql
will be run on each table prior to loading the data. The default value forinitial.load.delete.first.sql
isdelete from %s
Note that individual reload batches are created that represent the deletes in the reverse order that load batches are created. All delete batches are inserted first. The
initial.load.delete.first.sql
can be overwritten at the TRIGGER_ROUTER level by entering ainitial_load_delete_stmt
. - initial.load.create.first
-
By default, an initial load will not create the table on the target if it doesn’t already exist. If the desired behavior is to create the table on the target if it is not present, set the parameter
intial.load.create.first
to true. SymmetricDS will attempt to create the table and indexes on the target database before doing the initial load. Note that individual create batches are created to represent each of the table creates.Sometimes when creating tables across different database platforms default values do not translate. You can turn off the use of default values during the table create by setting
create.table.without.defaults
.
5.1.7. Send
Events other than data changes can be sent to nodes, which include the following:
- SQL Scripts
-
Sql can be sent to be executed on a target node
- BSH Scripts
-
Beanshell scripts can be sent to be executed on a target node
- Table Schema
-
The schema the source node can be replicated to the target node individually
To create an event, you insert into DATA at the source database. Any column not specified is not required.
Column |
Value |
table_name |
Name of your table to be sent. |
event_type |
Use 'R' for reload, 'S' for SQL, 'C' for create table. |
row_data |
For reload event, specify the "where" clause to limit rows or use null for all rows. For SQL event, specify the SQL to run. |
old_data |
For Table Schema event, specify a comma separated list of options to modify what gets sent with the table schema request. |
trigger_hist_id |
Reference to the active entry in TRIGGER_HIST for your table. |
channel_id |
The channel to create batches on. |
create_time |
Timestamp when this event was created. |
node_list |
A comma separated list of node_ids to route batches to. |
Reload Event
A "reload" event will send rows for a table, either all rows or the rows you specify. You may need to re-send data when the change itself was not captured. For example, if changes occurred prior to installing triggers, or if the data at the destination was accidentally deleted, or if a batch was manually ignored
Be careful to send data for tables in an order that will satisfy foreign keys. |
insert into sym_data (node_list, table_name, event_type, row_data, trigger_hist_id, channel_id, create_time)
select '00001', source_table_name, 'R', '"tran_id=900"', trigger_hist_id, 'reload', current_timestamp
from sym_trigger_hist where source_table_name = 'sale_transaction' and inactive_time is null;
SQL Event
A "SQL" event runs SQL you specify at the target nodes.
insert into sym_data (node_list, table_name, event_type, row_data, trigger_hist_id, channel_id, create_time)
select '00001', source_table_name, 'S', '"delete from sale_transaction where day = ''2012-12-01''"', trigger_hist_id, 'reload', current_timestamp
from sym_trigger_hist where source_table_name = 'sale_transaction' and inactive_time is null;
BSH Event
A BSH event runs a BeanShell script you specify at the target nodes.
sourceNodeId |
Source node ID |
sourceNodeExternalId |
Source node external ID |
sourceNodeGroupId |
Source node group ID |
sourceNode |
Source node object of type org.jumpmind.symmetric.model.Node |
targetNodeId |
Target node ID |
targetNodeExternalId |
Target node external ID |
targetNodeGroupId |
Target node group ID |
targetNode |
Target node object of type org.jumpmind.symmetric.model.Node |
engine |
Current engine of type org.jumpmind.symmetric.ISymmetricEngine |
log |
Logger of type org.slf4j.Logger for writing messages to log |
insert into sym_data (node_list, table_name, event_type, row_data, trigger_hist_id, channel_id, create_time)
select '00001', source_table_name, 'B', 'log.info("Hello, world");', trigger_hist_id, 'config', current_timestamp
from sym_trigger_hist where source_table_name = 'sym_node' and inactive_time is null;
Table Schema
A "create" event will send the table definition as XML to the target nodes, where it is converted to database specific SQL that will either create the table if it’s missing or alter it to match the definition.
insert into sym_data (node_list, table_name, event_type, trigger_hist_id, channel_id, create_time)
select '00001', source_table_name, 'C', trigger_hist_id, 'reload', current_timestamp
from sym_trigger_hist where source_table_name = 'sale_transaction' and inactive_time is null;
There are three options that can be specified in the old_data column. These options should be comma separated. The options are excludeIndices, excludeForeignKeys, and excludeDefaults. The option excludeIndices will send the create table request without associated indices. The excludeForeignKeys option will send the create table request without the associated foreign key references. The excludeDefaults option will send the create table request without column defaults.
insert into sym_data (node_list, table_name, event_type, trigger_hist_id, channel_id, create_time, old_data)
select '00001', source_table_name, 'C', trigger_hist_id, 'reload', current_timestamp, 'deferIndices,deferForeignKeys'
from sym_trigger_hist where source_table_name = 'sale_transaction' and inactive_time is null;
5.2. Jobs
Most work done by SymmetricDS is initiated by jobs. Jobs are tasks that are started and scheduled by a job manager.
Jobs are enabled by the start.<name>.job
parameter.
Most jobs are enabled by default. The frequency at which a job runs is controlled by one of two parameters:
-
job.<name>.period.time.ms
-
job.<name>.cron
If a valid cron property exists in the configuration, then it will be used to schedule the job. Otherwise, the job manager will attempt to use the period.time.ms property.
The frequency of jobs can be configured in either the Node Properties File or in the PARAMETER table. When managed in PARAMETER table the frequency properties can be changed in the master node and when the updated settings sync to the other nodes in the system the job manager will restart the jobs at the new frequency settings.
SymmetricDS utilizes Spring’s CRON support, which includes seconds as the first parameter. This differs from the typical Unix-based
implementation, where the first parameter is usually minutes. For example,
*/15 * * * * *
means every 15 seconds, not every 15 minutes. See
Spring’s
documentation
for more details.
5.2.1. Route Job
The Route Job is responsible for creating outgoing batches of captured data that are targeted at specific nodes.
The job processes Channels, one at a time, reading up to Max Data To Route data rows which have not been routed.
The data is assigned to outgoing batches based on the Batch Algorithm defined for the channel. Note that, for the default and transactional algorithm Max Data To Route rows may be exceeded depending on the transaction boundaries.
An outgoing batch is initially created with a status of "RT". Data is assigned to the batch by inserting into data event. When a batch is complete, the batch is committed and the status is changed to "NE".
The route job will respect the Max Batch Size as configured in Channels. If the max batch size is reached before the end of a captured database transaction and the batch algorithm is set to something other than nontransactional the batch may exceed the specified max size.
The route job delegates to a router to decide which nodes need to receive the data. The correct router is looked up by
referencing the captured trigger_hist_id
in the DATA table and using Table Routing configuration.
After outgoing batches have been created by the Route Job, they need to be transported to the target node.
Data Gaps
The DATA to route is selected based on the values in the DATA_GAP table. For efficiency, DATA_GAP tracks gaps in the data ids in DATA table that have not yet been processed.
A gap while routing in DATA can occur because concurrently running transactions have not yet committed. They can also be caused by rolled back transactions.
Most of gaps are only temporarily and fill in at some point after routing and need to be picked up with the next routing run.
This table completely defines the entire range of data that can be routed at any point in time.
For a brand new instance of SymmetricDS, this table is empty and SymmetricDS creates a gap starting
from data id of zero and ending with a very large number (defined by routing.largest.gap.size
).
At the start of a route job, the list of valid gaps (gaps with status of 'GP') is collected, and each gap is evaluated in turn.
If a gap is sufficiently old (as defined by routing.stale.dataid.gap.time.ms
, SymmetricDS assumes that a transaction has
been rolled back and deletes the gap.
If the gap is not stale, then DATA_EVENT is searched for data ids present in the gap. If one or more data ids is
found in DATA_EVENT, then the current gap is deleted, and new gap(s) are created to represent the data ids still
missing in the gap’s range. This process is done for all gaps. If the very last gap contained data, a new
gap starting from the highest data id and ending at (highest data id + routing.largest.gap.size
) is then created.
This results in an updated gap list that can be used to select DATA for routing.
5.2.2. Push Job
The Push Job is responsible for assigning nodes that need to be pushed to individual threads. See Push Threads for more details.
The job sends Outgoing Batches to the target node using an HTTP PUT. By default an HTTP PUT buffers data at the client.
If large batches are going to be sent using the push job, then consider turning on http.push.stream.output.enabled
.
The push job is considered to be slightly more efficient than the Pull Job because it only needs to make a network connection if there are batches available to send.
In order to be more efficient, the push job sends an HTTP HEAD to request a reservation at the target node. If the target nodes responds and accepts the request, then the job issues the HTTP PUT with the data pay load in Data Format
5.2.3. Pull Job
The Pull Job is responsible for assigning nodes that need to be pulled to individual threads. See Pull Threads for more details.
The job expects to receive Incoming Batches from a source node using an HTTP GET.
5.2.4. Purge Outgoing Job
The Purge Outgoing Job is responsible to purging outgoing data that has successfully been loaded at the target and is older than
purge.retention.minutes
.
This job purges the following tables:
5.2.5. Purge Incoming Job
The Purge Incoming Job is responsible for purging the INCOMING_BATCH table.
5.2.6. Statistics Job
The Statistics Job flushes captured statistics to following tables:
It also purges the same tables based on the purge.stats.retention.minutes
parameter.
5.2.7. Sync Triggers Job
The Sync Triggers Job runs when a node is started and on the prescribed job schedule. The job checks for missing SymmetricDS database triggers and creates them. It also updates the SymmetricDS database triggers that have had a change to its configuration or the database table has had a change to its structure.
5.2.8. Heartbeat Job
The Heartbeat Job updates its own NODE_HOST row with a new heartbeat_time
so that it is synchronized to it’s created_at_node_id
node
to indicate that the node is online and healthy.
5.2.9. Watchdog Job
The Watchdog Job looks for nodes that have been offline for offline.node.detection.period.minutes
and disables them.
5.2.10. Stage Management Job
The Stage Management Job purges the staging area according to the stream.to.file.ttl.ms
parameter.
5.2.11. Refresh Cache Job
The Refresh Cache Job checks the last_update_time
on each cached configuration resource and determines if it needs to refresh the cached items.
This job is mostly relevant when using Clustering.
5.2.12. File Sync Tracker Job
The File System Tracker job is responsible for monitoring and recording the events of files being created, modified, or deleted. It records the current state of files to the FILE_SNAPSHOT table.
See File Synchronization for more details.
5.2.13. File Sync Pull Job
The File Sync Pull Job is responsible for assigning nodes that need to be pulled to individual threads.
See File Synchronization and Pull Threads for more details.
5.2.14. File Sync Push Job
The File Sync Push Job is responsible for assigning nodes that need to be pushed to individual threads.
See File Synchronization and Pull Threads for more details.
5.2.15. Initial Load Extract Job
The Initial Load Extract Job processes EXTRACT_REQUESTs. See Initial Load Extract In Background for more details.
5.3. Installed Triggers
SymmetricDS installs database triggers to capture changes in the DATA table. A record of the triggers that were installed and what columns are being captured is stored in the TRIGGER_HIST table. When data is captured in DATA it references the TRIGGER_HIST record that represented the trigger at the time data was captured.
This is necessary because if a trigger is rebuilt after columns are added or removed and data that was captured by the old trigger has not yet been delivered, we need a record of what columns were in play at the time the data had been captured.
The TRIGGER_HIST table records the reason a trigger was rebuilt. The following reasons are possible:
N |
New trigger that has not been created before |
S |
Schema changes in the table were detected |
C |
Configuration changes in Trigger |
T |
Trigger was missing |
A configuration entry in Trigger without any history in Trigger Hist results in a new trigger being created (N). The Trigger Hist stores a
hash of the underlying table, so any alteration to the table causes the trigger to be rebuilt (S). When the last_update_time
is changed
on the Trigger entry, the configuration change causes the trigger to be rebuilt ©. If an entry in Trigger Hist is missing the
corresponding database trigger, the trigger is created (T).
5.4. Outgoing Batches
Outgoing batches are delivered to the target node when the source node pushes or when the target node pulls.
A single push or pull connection is called a synchronization.
For one synchronization, each enabled channel will be processed. Channels are processed in the order defined by the Processing Order setting on the channel with two exceptions:
-
If there are reload channels available to be sent and the reload channels are not in error, then only reload channels will be sent
-
If a channel is in error it will be moved to the bottom of the list
When outgoing batches are selected for a node and a channel, the maximum number of batches that are extracted per synchronization is controlled by the Max Batch To Send setting on the channel.
There is also a setting that controls the max number of bytes to send in one synchronization. If SymmetricDS has extracted more than
the number of bytes configured by the transport.max.bytes.to.sync
parameter, then it will finish extracting the current batch and then finish
synchronization so the client has a chance to process and acknowledge the "big" batch. This may happen before the configured
Max Batch To Send has been reached.
When extracting a batch, data is first extracted to the Staging Area and then sent across the network from the Staging Area.
The staging area is used to minimize the amount of time a database connection is being used when streaming
over slower networks. The use of the staging area can be turned off by setting the stream.to.file.enabled
parameter.
5.4.1. Extract Frequency By Channel
The pull and push frequency cannot be adjusted by channel. If you want to adjust the frequency that outgoing batches for a specific channel are sent, you have two options:
-
Batches are extracted by channel at an interval controlled by the
extract_period_millis
in the Channels settings. Thelast_extract_time
is always recorded, by channel, on the NODE_CHANNEL_CTL table for the host node’s id. When the Pull and Push Job run, if the extract period has not passed according to the last extract time, then the channel will be skipped for this run. If theextract_period_millis
is set to zero, data extraction will happen every time the jobs run. -
SymmetricDS provides the ability to configure windows of time when synchronization is allowed. This is done using the NODE_GROUP_CHANNEL_WND table. A list of allowed time windows can be specified for a node group and a channel. If one or more windows exist, then data will only be extracted and transported if the time of day falls within the window of time specified. The configured times are always for the target node’s local time. If the
start_time
is greater than theend_time
, then the window crosses over to the next day.
5.4.2. Outgoing Sync Status
The status of outgoing synchronization can be queried at the source database.
The following query will show outgoing synchronization failures by node:
select count(*), node_id from sym_outgoing_batch
where error_flag=1 group by node_id;
The following query will show the number of data rows that have not been delivered to target nodes:
select sum(data_row_count), node_id from sym_outgoing_batch
where status != 'OK' group by node_id;
The following queries summed together give an approximation of the number of rows that have not been routed:
select sum(end_id-start_id) from sym_data_gap
where start_id < (select max(start_id) from sym_data_gap);
select count(*) from sym_data
where data_id >= (select max(start_id) from sym_data_gap);
5.4.3. Outgoing Batch Errors
By design, whenever SymmetricDS encounters an issue with synchronization, the batch containing the error is marked as being in an error state, and all subsequent batches on the same channel to the same node are not synchronized until the batch error is resolved.
SymmetricDS will retry the batch in error until the situation creating the error is resolved (or the data for the batch itself is changed). If the error is caused by network or database failures, then the error might eventually resolve itself when the network or database failures are resolved.
Analyzing and resolving issues can take place on the outgoing or incoming side. The techniques for analysis are slightly different in the two cases, however, due to the fact that the node with outgoing batch data also has the data and data events associated with the batch in the database. On the incoming node, however, all that is available is the incoming batch header and data present in an incoming error table.
Analyzing the Issue
The first step in analyzing the cause of a failed batch is to locate information about the data in the batch.
To locate batches in error, run the following SQL query:
select * from sym_outgoing_batch where error_flag=1;
Several useful pieces of information are available from this query:
-
The batch number of the failed batch, available in column
BATCH_ID
. -
The node to which the batch is being sent, available in column
NODE_ID
. -
The channel to which the batch belongs, available in column
CHANNEL_ID
. All subsequent batches on this channel to this node will be held until the error condition is resolved. -
The specific data id in the batch which is causing the failure, available in column
FAILED_DATA_ID
. -
Any SQL message, SQL State, and SQL Codes being returned during the synchronization attempt, available in columns
SQL_MESSAGE
,SQL_STATE
, andSQL_CODE
, respectively.
Using the error_flag on the batch table, as shown above, is more reliable than using the
status column. The status column can change from 'ER' to a different status temporarily as
the batch is retried.
|
The query above will also show you any recent batches that were originally in error and were changed to be manually skipped. See the end of Outgoing Batches for more details. |
To get a full picture of the batch, you can query for information representing the complete list of all data changes associated with the failed batch by joining DATA and DATA_EVENT, such as:
select * from sym_data where data_id in
(select data_id from sym_data_event where batch_id='XXXXXX');
where XXXXXX is the batch id of the failing batch.
This query returns a wealth of information about each data change in a batch, including:
-
The table involved in each data change, available in column
TABLE_NAME
, -
The event type (Update [U], Insert [I], or Delete [D]), available in column
EVENT_TYPE
, -
A comma separated list of the new data and (optionally) the old data, available in columns
ROW_DATA
andOLD_DATA
, respectively. -
The primary key data, available in column
PK_DATA
-
The channel id, trigger history information, transaction id if available, and other information.
More importantly, if you narrow your query to just the failed data id you can determine the exact data change that is causing the failure:
select * from sym_data where data_id in
(select failed_data_id from sym_outgoing_batch where batch_id='XXXXX'
and node_id='YYYYY');
where XXXXXX is the batch id and YYYYY is the node id of the batch that is failing.
The queries above usually yield enough information to be able to determine why a particular batch is failing.
Common reasons a batch might fail include:
-
The schema at the destination has a column that is not nullable yet the source has the column defined as nullable and a data change was sent with the column as null.
-
A foreign key constraint at the destination is preventing an insertion or update, which could be caused from data being deleted at the destination or the foreign key constraint is not in place at the source.
-
The data size of a column on the destination is smaller than the data size in the source, and data that is too large for the destination has been synced.
Resolving the Issue
Once you have decided upon the cause of the issue, you’ll have to decide the best course of action to fix the issue. If, for example, the problem is due to a database schema mismatch, one possible solution would be to alter the destination database in such a way that the SQL error no longer occurs. Whatever approach you take to remedy the issue, once you have made the change, on the next push or pull SymmetricDS will retry the batch and the channel’s data will start flowing again.
If you have instead decided that the batch itself is wrong, or does not need synchronized, or you wish to remove a particular data change from a batch, you do have the option of changing the data associated with the batch directly.
Be cautious when using the following two approaches to resolve synchronization issues. By far, the best approach to solving a synchronization error is to resolve what is truly causing the error at the destination database. Skipping a batch or removing a data id as discussed below should be your solution of last resort, since doing so results in differences between the source and destination databases. |
Now that you’ve read the warning, if you still want to change the batch data itself, you do have several options, including:
-
Causing SymmetricDS to skip the batch completely. This is accomplished by setting the batch’s status to 'IG' (ignore), as in:
update sym_outgoing_batch set status='IG' where batch_id='XXXXXX'
where XXXXXX is the failing batch. On the next pull or push, SymmetricDS will skip this batch since it now thinks the batch has already been synchronized. Note that you can still distinguish between successful batches and ones that you’ve artificially marked as 'OK', since the
error_flag
column on the failed batch will still be set to '1' (in error). -
Removing the failing data id from the batch by deleting the corresponding row in DATA_EVENT. Eliminating the data id from the list of data ids in the batch will cause future synchronization attempts of the batch to no longer include that particular data change as part of the batch. For example:
delete from sym_data_event where batch_id='XXXXXX' and data_id='YYYYYY' where XXXXXX is the failing batch and YYYYYY is the data id to longer be included in the batch.
After modifying the batch you will have to clear the Staging Area manually or wait for the staged version of the batch to timeout and clear itself.
5.5. Incoming Batches
Incoming batches are delivered to the target node when the source node pushes or when the target node pulls.
Incoming batches are written to the Staging Area first and then loaded. The use of the staging area can be turned off by setting the stream.to.file.enabled
parameter.
5.5.1. Incoming Sync Status
The status of incoming synchronization can be queried at the target database.
The following query will show incoming synchronization failures by node:
select count(*), node_id from sym_incoming_batch
where error_flag=1 group by node_id;
Client nodes update their heartbeat_time
in the NODE_HOST table. If a client node is online and actively syncing you can
monitor the NODE_HOST table at the server to find client nodes that are offline. Note that at times there could be more
than one NODE_HOST row per node_id
. This could be the case if the nodes are clustered or the host_name
changes.
The following query will give you nodes that have not synchronized in the last 24 hours. Note that the SQL might vary slightly for
some databases as some of the supported databases do not support current_timestamp
.
select node_id, host_name from sym_node_host
where heartbeat_time < current_timestamp-1;
5.5.2. Incoming Batch Errors
When a batch fails to load it is marked with an Error status.
Analyzing the Issue
Analysis using an incoming batch is different than that of outgoing batches.
For incoming batches, you will rely on two tables, INCOMING_BATCH and INCOMING_ERROR.
The first step in analyzing the cause of an incoming failed batch is to locate information about the batch, starting with INCOMING_BATCH . To locate batches in error, use:
select * from sym_incoming_batch where error_flag=1;
Several useful pieces of information are available from this query:
-
The batch number of the failed batch, available in column
BATCH_ID
. Note that this is the batch number of the outgoing batch on the outgoing node. -
The node the batch is being sent from, available in column
NODE_ID
. -
The channel to which the batch belongs, available in column
CHANNEL_ID
. All subsequent batches on this channel from this node will be held until the error condition is resolved. -
The data_id that was being processed when the batch failed, available in column
FAILED_DATA_ID
. -
Any SQL message, SQL State, and SQL Codes being returned during the synchronization attempt, available in columns
SQL_MESSAGE
,SQL_STATE
, andSQL_CODE
, respectively.
For incoming batches, we do not have data and data event entries in the database we can query. We do, however, have a table, INCOMING_ERROR, which provides some information about the batch.
select * from sym_incoming_error
where batch_id='XXXXXX' and node_id='YYYYY';
where XXXXXX is the batch id and YYYYY is the node id of the failing batch.
This query returns a wealth of information about each data change in a batch, including:
-
The table involved in each data change, available in column
TARGET_TABLE_NAME
, -
The event type (Update [U], Insert [I], or Delete [D]), available in column
EVENT_TYPE
, -
A comma separated list of the new data and (optionally) the old data, available in columns
ROW_DATA
andOLD_DATA
, respectively, -
The column names of the table, available in column
COLUMN_NAMES
, -
The primary key column names of the table, available in column
PK_COLUMN_NAMES
,
Resolving the Issue
For batches in error, from the incoming side you’ll also have to decide the best course of action to fix the issue.
Incoming batch errors that are in conflict can by fixed by taking advantage of two columns in INCOMING_ERROR which are examined each time
batches are processed. The first column, resolve_data
if filled in will be used in place of row_data
.
The second column, resolve_ignore
if set will cause this particular data item to be ignored and batch processing to continue. This is the same
two columns used when a manual conflict resolution strategy is chosen, as discussed in Conflicts.
5.6. Staging Area
SymmetricDS creates temporary extraction and data load files with the CSV payload of a synchronization when
the value of the stream.to.file.threshold.bytes
SymmetricDS property has been reached. Before reaching the threshold, files
are streamed to/from memory. The default threshold value is 0 bytes. This feature may be turned off by setting the stream.to.file.enabled
property to false.
SymmetricDS creates these temporary files in the directory specified by the java.io.tmpdir
Java System property.
The location of the temporary directory may be changed by setting the Java System property passed into the Java program at startup. For example,
-Djava.io.tmpdir=/home/.symmetricds/tmp
5.7. Pull Threads
Both the Pull Job and the File Sync Pull Job can be configured to pull multiple nodes in parallel. In order to take advantage of this the
pull.thread.per.server.count
or file.pull.thread.per.server.count
should be adjusted (from the default value of 1) to the number to the
number of concurrent pulls you want to occur per period on each SymmetricDS instance.
Pull activity is recorded in the NODE_COMMUNICATION table. This table is also used as a semaphore to lock pull activity across multiple servers in a cluster.
5.8. Push Threads
Both the Push Job and the File Sync Push Job can be configured to push multiple nodes in parallel. In order to take advantage of this the
push.thread.per.server.count
or file.push.thread.per.server.count
should be adjusted (from the default value of 1) to the number to the
number of concurrent pushes you want to occur per period on each SymmetricDS instance.
Push activity is recorded in the NODE_COMMUNICATION table. This table is also used as a semaphore to lock push activity across multiple servers in a cluster.
5.9. Monitors
When a Monitor is configured, it is run periodically to check the current value of a system metric and compare it to a threshold value.
Different monitor types can check the CPU usage, disk usage, memory usage, batch errors, outstanding batches, unrouted data, number
of data gaps, and job errors.
Custom monitor types can be created using Extensions that use the IMonitorType interface.
When the value returned from the check meets or exceeds the threshold value, a MONITOR_EVENT is recorded.
The MONITOR_EVENT table is synchronized on the "monitor" channel, which allows a central server to see events from remote nodes,
but this behavior can be disabled by setting the monitor.events.capture.enabled
parameter to false.
To be immediately notified of a monitor event, use Notifications to match on the severity level. Different notification type can send a message by writing to the log or sending an email. Custom notification types can be created using Extensions that use the INotificationType interface. In order to send email, the Mail Server should be configured.
5.10. Logging
The standalone SymmetricDS installation uses Log4J for logging. The configuration file is conf/log4j.xml
.
The log4j.xml
file has hints as to what logging can be enabled for useful, finer-grained logging.
There is a command line option to turn on preconfigured debugging levels. When the --debug
option is used the conf/debug-log4j.xml
is used instead of log4j.xml.
SymmetricDS proxies all of its logging through SLF4J. When deploying to an application server or if Log4J is not being leveraged, then the general rules for SLF4J logging apply.
6. Advanced Topics
This chapter focuses on a variety of topics, including deployment options, jobs, clustering, encryptions, synchronization control, and configuration of SymmetricDS.
6.1. Advanced Synchronization
6.1.1. Disabling Synchronization
All data loading may be disabled by setting the dataloader.enable
property to false. This has the effect of not allowing incoming
synchronizations, while allowing outgoing synchronizations. All data extractions may be disabled by setting the dataextractor.enable
property to false. These properties can be controlled by inserting into the master node’s PARAMETER
table. These properties affect every channel with the exception of the
'config' channel.
6.1.2. Bi-Directional Synchronization
SymmetricDS allows tables to be synchronized bi-directionally. Note that an outgoing
synchronization does not process changes during an incoming synchronization on the same node unless the trigger
was created with the sync_on_incoming_batch
flag set. If the sync_on_incoming_batch
flag
is set, then update loops are prevented by a feature that is available in most database dialects.
More specifically, during an incoming synchronization the source node_id
is put into a database session variable that is
available to the database trigger. Data events are not generated if the target node_id
on an outgoing synchronization is equal to the source node_id
.
By default, only the columns that changed will be updated in the target system.
Conflict resolution strategies can be configured for specific links and/or sets of tables.
6.1.3. Multi-Tiered Synchronization
There may be scenarios where data needs to flow through multiple tiers of nodes that are organized in a tree-like network with each tier requiring a different subset of data. For example, you may have a system where the lowest tier may be a computer or device located in a store. Those devices may connect to a server located physically at that store. Then the store server may communicate with a corporate server for example. In this case, the three tiers would be device, store, and corporate. Each tier is typically represented by a node group. Each node in the tier would belong to the node group representing that tier.
A node can only pull and push data to other nodes that are represented in the node’s NODE
table and in cases where that node’s sync_enabled
column is set to 1.
Because of this, a tree-like
hierarchy of nodes can be created by having only a subset of nodes belonging to the same node group represented at the different branches of the tree.
If auto registration is turned off, then this setup must occur manually by opening registration
for the desired nodes at the desired parent node and by configuring each node’s registration.url
to be the parent node’s URL.
The parent node is always tracked by the setting of the parent’s node_id
in the created_at_node_id
column of the new node.
When a node registers and downloads its configuration it is always provided the configuration for nodes
that might register with the node itself based on the Node Group Links defined in the parent node.
Registration Redirect
When deploying a multi-tiered system it may be advantageous to have only one registration server, even though the parent node of a registering node
could be any of a number of nodes in the system. In SymmetricDS the parent node is always the node that a child registers with. The
REGISTRATION_REDIRECT table allows a single node, usually the root server in the network, to
redirect registering nodes to their true parents. It does so based on a mapping found in the table of the external id (registrant_external_id
) to the parent’s node
id (registration_node_id
).
For example, if it is desired to have a series of regional servers that workstations at retail stores get assigned to based on their external_id
, the store number, then
you might insert into REGISTRATION_REDIRECT the store number as the registrant_external_id
and the node_id
of
the assigned region as the registration_node_id
. When a workstation at the store registers, the root server sends an HTTP redirect to the sync_url
of the node
that matches the registration_node_id
.
Please see Initial Loads for important details around initial loads and registration when using registration redirect. |
6.2. Offline Synchronization
6.2.1. Setup an Offline Node
Configuring a node as offline will still allow changes to be captured and batched for replication. However the push and/or pull jobs that are used to interact with this node will not use the standard http or https protocols to communicate with other nodes for changes. Instead the local file system will be used for replication. It is up to the user transport batch (*.csv) files to and from the node based on incoming or outgoing changes.
-
Turn on the offline push and pull jobs.
INSERT INTO sym_parameter
(external_id, node_group_id, param_key, param_value, create_time, last_update_by, last_update_time) VALUES
('ALL', 'ALL', 'start.offline.pull.job', 'true', current_timestamp, 'userid', current_timestamp);
INSERT INTO sym_parameter
(external_id, node_group_id, param_key, param_value, create_time, last_update_by, last_update_time) VALUES
('ALL', 'ALL', 'start.offline.push.job', 'true', current_timestamp, 'userid', current_timestamp);
-
Turn on the offline.node parameter for the node that should be offline.
INSERT INTO sym_parameter
(external_id, node_group_id, param_key, param_value, create_time, last_update_by, last_update_time) VALUES
('001', 'STORE', 'node.offline', 'true', current_timestamp, 'userid', current_timestamp);
-
Setting these parameters immediately affects the behavior of the push and pull jobs, so outgoing batches intended for the offline node are now written as files. Other nodes are unaffected and will continue to synchronize normally. All outstanding batches for this node are immediately written to files. As new changes occur going forward, any batches for this will also be written as files. In this example, two batches of data were waiting to sync, so they are written to files.
-
At the offline store node, the parameter immediately affects the behavior of the push and pull jobs, so outgoing batches intended for CORP are now written as files. Other nodes are unaffected. All outstanding batches for CORP are immediately written to files. As new changes occur going forward, any batches for CORP will also be written as files. In this example, two batches of data are written.
-
Finally, move the batch files to their respective incoming folder. After moving the files, the folders will contain the files depicted below.
6.2.2. Turn offline node online again
To configure the node online again simply remove the parameter entries from step 2 above. As an additional step to save resources the offline jobs can be stopped as well from step 1 above if there are not any nodes operating in an offline mode.
6.3. Encrypted Passwords
The db.user
and db.password
properties can be protected by encrypting them.
This encryption can be done two different ways.
The first option is to pass the text as an argument:
symadmin encrypt-text "text-to-encrypt"
The text-to-encrypt on the command line should be surrounded with double quotes if spaces or symbols are used. |
The second option is without an argument, and it will prompt you to enter text:
symadmin encrypt-text
Enter Text:
Encrypted text starts with "enc:" to differentiate it from plain text. See the Keystores section for an explanation of the encryption key.
6.4. Secure Transport
By specifying the "https" protocol for a URL, SymmetricDS will communicate over Secure Sockets Layer (SSL) for an encrypted transport. The following properties need to be set with "https" in the URL:
- sync.url
-
This is the URL of the current node, so if you want to force other nodes to communicate over SSL with this node, you specify "https" in the URL.
- registration.url
-
This is the URL where the node will connect for registration when it first starts up. To protect the registration with SSL, you specify "https" in the URL.
6.4.1. Standalone
The SymmetricDS service and the "sym" launch command use Jetty as an embedded web server.
Edit the conf/symmetric-server.properties
file to change port numbers
to listen on and whether or not to use HTTP and/or HTTPS.
http.enable=true
http.port=31415
https.enable=true
https.port=31417
6.4.2. Tomcat
If you deploy SymmetricDS to Apache Tomcat, it can be secured by editing the
TOMCAT_HOME/conf/server.xml
configuration file. There is already a line that can be uncommented
and changed to the following:
<Connector port="8443" protocol="HTTP/1.1" SSLEnabled="true"
maxThreads="150" scheme="https" secure="true"
clientAuth="false" sslProtocol="TLS"
keystoreFile="/symmetric-ds-1.x.x/security/keystore" />
6.4.3. Keystores
The security
subdirectory contains a private keystore and a trusted keystore. System properties are used to specify the location
of each keystore file and a password to protect it.
The system properties are set in the bin/setenv
(or bin\setenv.bat
on Windows) and conf/sym_service.conf
files.
Filename | Store Type | System Property for File | System Property for Password | Description |
---|---|---|---|---|
keystore |
PKCS12 |
sym.keystore.file |
javax.net.ssl.keyStorePassword |
Contains private encryption key and TLS certificate. |
cacerts |
JKS |
javax.net.ssl.trustStore |
javax.net.ssl.trustStorePassword |
Contains public certificates for trusted authorities who sign keys. |
The following entries in the keystore
file are used.
sym |
The TLS certificate used for handling incoming HTTPS communication. |
sym.secret |
The encryption key used for protecting secrets like database password. |
If an entry is missing when encryption is requested, SymmetricDS will automatically generate a random key for use.
It tries to use the strongest encryption algorithm and the largest key size available on the system.
If the keystore
file is missing, it will be created. Starting in SymmetricDS 3.14, it uses PKCS12 as the store type for new keystores,
but it is backwards compatible with the older JCEKS store type.
6.4.4. Finding Keystore Password
The keystores and each key entry is protected with a password. The default password is changeit
.
To obtain the current password, use the following steps:
-
Look in the
bin/setenv
(orbin\setenv.bat
on Windows) or theconf/sym_service.conf
files. (The password should be the same in both files.) -
For the password to
keystore
, look for thejavax.net.ssl.keyStorePassword
system property. -
For the password to
cacerts
, look for thejavax.net.ssl.trustStorePassword
system property. -
If the password starts with "obf:" then it is obfuscated. To obtain the cleartext password, use the following command from the
bin
subdirectory:
symadmin unobfuscate-text obf:cHVuYXRydmc=
-
If the password does not start with "obf:" then it is the cleartext password.
6.4.5. Changing Keystore Password
To change the keystore password, use the following steps:
-
Open a command prompt and navigate to the SymmetricDS installation.
-
In the
security
subdirectory, use the following commands to enter the old and new password for the keystore and each key entry.
keytool -keystore keystore -storepasswd
keytool -keystore keystore -alias sym -keypasswd
keytool -keystore keystore -alias sym.secret -keypasswd
-
Edit
bin/setenv
(orbin\setenv.bat
on Windows) andconf/sym_service.conf
files to update the new password.
-Djavax.net.ssl.keyStorePassword=changeit
-
Optionally, obfuscate the new password in the previous step to prevent casual observation.
syadmin obfuscate-text changeit
6.4.6. Generating Certificates
To generate new keys and install a server certificate, use the following steps:
-
Open a command prompt and navigate to the
security
subdirectory of SymmetricDS. -
Delete the old key pair and certificate.
keytool -keystore keystore -delete -alias sym
keytool -keystore cacerts -delete -alias sym
If you receive a message like, "Alias <sym> does not exist" - then the key entry does not exist and you can skip this step. |
-
Generate a new key pair. Note that the first name/last name (the "CN") must match the fully qualified hostname the client will be using to communcate to the server.
keytool -keystore keystore -alias sym -genkey -keyalg RSA -validity 10950
-
Export the certificate from the private keystore.
keytool -keystore keystore -export -alias sym -rfc -file sym.cer
-
Install the certificate into the trusted keystore.
keytool -keystore cacerts -import -alias sym -file sym.cer
-
Copy the
cacerts
file that is generated by this process to thesecurity
directory of each client’s SymmetricDS installation.
6.5. Java Management Extensions
Monitoring and administrative operations can be performed using Java Management Extensions (JMX).
SymmetricDS exposes JMX attributes and operations that can be accessed
from the jmx
command, Java’s jconsole, or a third party tool.
In order to use jconsole, you must enable JMX remote management in the JVM. You can edit the startup scripts to set the following system parameters.
-Dcom.sun.management.jmxremote.port=31417
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
More details about enabling JMX for JConsole can be found here.
Using the Java jconsole command, SymmetricDS is listed as a local process named SymmetricLauncher.
In jconsole, SymmetricDS appears under the MBeans tab under the name defined by the engine.name
property. The default value is SymmetricDS.
The management interfaces under SymmetricDS are organized as follows:
Node |
administrative operations |
Parameters |
access to properties set through the parameter service |
6.6. JMS Publishing
With the proper configuration SymmetricDS can publish XML messages of captured data changes to JMS during routing or transactionally while data loading synchronized data into a target database. The following explains how to publish to JMS during synchronization to the target database.
The XmlPublisherDatabaseWriterFilter is a IDatabaseWriterFilter that may be configured to publish specific tables as an XML message to a JMS provider. See Extensions for information on how to configure an extension point. If the publish to JMS fails, the batch will be marked in error, the loaded data for the batch will be rolled back and the batch will be retried during the next synchronization run.
The following is an example extension point configuration that will publish four tables in XML with a root tag of 'sale'. Each XML message will be grouped by the batch and the column names identified by the groupByColumnNames property which have the same values.
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:context="http://www.springframework.org/schema/context"
xsi:schemaLocation="http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
http://www.springframework.org/schema/context
http://www.springframework.org/schema/context/spring-context-3.0.xsd">
<bean id="configuration-publishingFilter"
class="org.jumpmind.symmetric.integrate.XmlPublisherDatabaseWriterFilter">
<property name="xmlTagNameToUseForGroup" value="sale"/>
<property name="tableNamesToPublishAsGroup">
<list>
<value>SALE_TX</value>
<value>SALE_LINE_ITEM</value>
<value>SALE_TAX</value>
<value>SALE_TOTAL</value>
</list>
</property>
<property name="groupByColumnNames">
<list>
<value>STORE_ID</value>
<value>BUSINESS_DAY</value>
<value>WORKSTATION_ID</value>
<value>TRANSACTION_ID</value>
</list>
</property>
<property name="publisher">
<bean class="org.jumpmind.symmetric.integrate.SimpleJmsPublisher">
<property name="jmsTemplate" ref="definedSpringJmsTemplate"/>
</bean>
</property>
</bean>
</beans>
The publisher property on the XmlPublisherDatabaseWriterFilter takes an interface of type IPublisher. The implementation demonstrated here is an implementation that publishes to JMS using Spring’s JMS template. Other implementations of IPublisher could easily publish the XML to other targets like an HTTP server, the file system or secure copy it to another server.
The above configuration will publish XML similar to the following:
<?xml version="1.0" encoding="UTF-8"?>
<sale xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
id="0012010-01-220031234" nodeid="00001" time="1264187704155">
<row entity="SALE_TX" dml="I">
<data key="STORE_ID">001</data>
<data key="BUSINESS_DAY">2010-01-22</data>
<data key="WORKSTATION_ID">003</data>
<data key="TRANSACTION_ID">1234</data>
<data key="CASHIER_ID">010110</data>
</row>
<row entity="SALE_LINE_ITEM" dml="I">
<data key="STORE_ID">001</data>
<data key="BUSINESS_DAY">2010-01-22</data>
<data key="WORKSTATION_ID">003</data>
<data key="TRANSACTION_ID">1234</data>
<data key="SKU">9999999</data>
<data key="PRICE">10.00</data>
<data key="DESC" xsi:nil="true"/>
</row>
<row entity="SALE_LINE_ITEM" dml="I">
<data key="STORE_ID">001</data>
<data key="BUSINESS_DAY">2010-01-22</data>
<data key="WORKSTATION_ID">003</data>
<data key="TRANSACTION_ID">1234</data>
<data key="SKU">9999999</data>
<data key="PRICE">10.00</data>
<data key="DESC" xsi:nil="true"/>
</row>
<row entity="SALE_TAX" dml="I">
<data key="STORE_ID">001</data>
<data key="BUSINESS_DAY">2010-01-22</data>
<data key="WORKSTATION_ID">003</data>
<data key="TRANSACTION_ID">1234</data>
<data key="AMOUNT">1.33</data>
</row>
<row entity="SALE_TOTAL" dml="I">
<data key="STORE_ID">001</data>
<data key="BUSINESS_DAY">2010-01-22</data>
<data key="WORKSTATION_ID">003</data>
<data key="TRANSACTION_ID">1234</data>
<data key="AMOUNT">21.33</data>
</row>
</sale>
To publish JMS messages during routing
the same pattern is valid, with the exception that the extension point would be the XmlPublisherDataRouter and the router
would be configured by setting the router_type
of a ROUTER to the Spring bean
name of the registered extension point. Of course, the router would need to be linked through TRIGGER_ROUTERs
to each TRIGGER table that needs published.
6.7. File Synchronization
SymmetricDS not only supports the synchronization of database tables, but it also supports the synchronization of files and folders from one node to another.
6.7.1. File Synchronization Overview
-
Monitoring one or more file system directory locations for file and folder changes
-
Support synchronizing a different target directory than the source directory
-
Use of wild card expressions to “include” or “exclude” files
-
Choice of whether to recurse into subfolders of monitored directories
-
Use of existing SymmetricDS routers to subset target nodes based on file and directory metadata
-
Ability to specify if files will be synchronized on creation, or deletion, and/or modification
-
Ability to specify the frequency with which file systems are monitored for changes
-
Ability to extend file synchronization through scripts that run before or after a file is copied to its source location
-
Support for bidirectional file synchronization
-
Like database synchronization, file synchronization is configured in a series of database tables. The configuration was designed to be similar to database synchronization in order to maintain consistency and to give database synchronization users a sense of familiarity.
For database synchronization, SymmetricDS uses Table Triggers to configure which tables will capture data for synchronization and Routers to designate which nodes will be the source of data changes and which nodes will receive the data changes. Table Routing links triggers to routers.
Likewise, for file synchronization, SymmetricDS uses File Triggers to designate which base directories will be monitored. Each entry in File Triggers designates one base directory to monitor for changes on the source system. The columns on File Triggers provide additional settings for choosing specific files in the base directory that will be monitored, and whether to recurse into subdirectories, etc. File triggers are linked to routers using File Routing. The file trigger router not only links the source and the target node groups, but it also optionally provides the ability to override the base directory name at the target. File Routing also provides a flag that indicates if the target node should be seeded with the files from the source node during SymmetricDS’s initial load process.
File synchronization does require a database for runtime information about the synchronization scenario. File Triggers will also need to be linked to an appropriate router like table triggers in order to complete the setup. |
H2 database works great as a small lightweight database to support file synchronization runtime information if you do not have a relational database readily available to support file sync. |
6.7.2. How File Synchronization Works
Not only is file synchronization configured similar to database synchronization, but it also operates in a very similar way. The file system is monitored for changes via a background job that tracks the file system changes (this parallels the use of triggers to monitor for changes when synchronizing database changes). When a change is detected it is written to the FILE_SNAPSHOT table. The file snapshot table represents the most recent known state of the monitored files. The file snapshot table has a SymmetricDS database trigger automatically installed on it so that when it is updated the changes are captured by SymmetricDS on an internal channel named filesync.
The changes to FILE_SNAPSHOT are then routed and batched by a file-synchronization-specific router that delegates to the configured router based on the File Routing configuration. The file sync router can make routing decisions based on the column data of the snapshot table, columns which contain attributes of the file like the name, path, size, and last modified time. Both old and new file snapshot data are also available. The router can, for example, parse the path or name of the file and use it as the node id to route to.
Batches of file snapshot changes are stored on the filesync channel in OUTGOING_BATCH. The existing SymmetricDS pull and push jobs ignore the filesync channel. Instead, they are processed by file-synchronization-specific push and pull jobs. These jobs, file sync tracker, file sync pull, and file sync push are turned off by default. They need to be started before any changes are processed.
When transferring data, the file sync push and pull jobs build a zip file dynamically based on the batched snapshot data. The zip file contains a directory per batch. The directory name is the batch_id. A sync.bsh Bean Shell script is generated and placed in the root of each batch directory. The Bean Shell script contains the commands to copy or delete files at their file destination from an extracted zip in the staging directory on the target node. The zip file is downloaded in the case of a pull, or, in the case of a push, is uploaded as an HTTP multi-part attachment. Outgoing zip files are written and transferred from the outgoing staging directory. Incoming zip files are staged in the filesync_incoming staging directory by source node id. The filesync_incoming/{node_id} staging directory is cleared out before each subsequent delivery of files.
The acknowledgement of a batch happens the same way it is acknowledged in database synchronization. The client responds with an acknowledgement as part of the response during a file push or pull.
6.8. Variables
Variables can be used throughout configuration with the $(variableName)
syntax. Check the documentation for each configuration
item to see which variables it supports. A substring of the variable value can be specified with a starting index and an optional
ending index. The first character is at index 0, and the end index is not included in the substring.
Therefore, the length of the substring will be end index minus start index.
$(variableName:start)
$(variableName:start:end)
Examples when $(externalId)
is set to 00001-002
:
$(externalId:0:5) - returns 00001
$(externalId:6) - returns 002
The variable value can be formatted using a format string
supported by java.lang.String.format()
.
$(variableName|format_string)
Examples when $(externalId)
is set to 1
:
$(variableName|%05d) - returns 00001
7. Developer
This chapter focuses on a variety of ways for developers to build upon and extend some of the existing features found within SymmetricDS.
7.1. Extension Points
SymmetricDS has a pluggable architecture that can be extended. A Java class that implements the appropriate extension point interface, can implement custom logic and change the behavior of SymmetricDS to suit special needs. All supported extension points extend the IExtensionPoint interface. The available extension points are documented in the following sections.
When SymmetricDS starts up, the ExtensionPointManager searches a Spring Framework context for classes that implement the IExtensionPoint interface, then creates and registers the class with the appropriate SymmetricDS component.
Extensions should be configured in the conf/symmetric-extensions.xml file as Spring beans. The jar file that contains the extension should be placed in the web/WEB-INF/lib directory.
If an extension point needs access to SymmetricDS services or needs to connect to the database it may implement the ISymmetricEngineAware interface in order to get a handle to the ISymmetricEngine.
The INodeGroupExtensionPoint interface may be optionally implemented to indicate that a registered extension point should only be registered with specific node groups.
/**
* Only apply this extension point to the 'root' node group.
*/
public String[] getNodeGroupIdsToApplyTo() {
return new String[] { "root" };
}
7.1.1. IParameterFilter
Parameter values can be specified in code using a parameter filter. Note that there can be only one parameter filter per engine instance. The IParameterFilter replaces the deprecated IRuntimeConfig from prior releases.
public class MyParameterFilter
implements IParameterFilter, INodeGroupExtensionPoint {
/**
* Only apply this filter to stores
*/
public String[] getNodeGroupIdsToApplyTo() {
return new String[] { "store" };
}
public String filterParameter(String key, String value) {
// look up a store number from an already existing properties file.
if (key.equals(ParameterConstants.EXTERNAL_ID)) {
return StoreProperties.getStoreProperties().
getProperty(StoreProperties.STORE_NUMBER);
}
return value;
}
public boolean isAutoRegister() {
return true;
}
}
7.1.2. IDatabaseWriterFilter
Data can be filtered or manipulated before it is loaded into the target database. A filter can change the data in a column, save it somewhere else or do something else with the data entirely. It can also specify by the return value of the function call that the data loader should continue on and load the data (by returning true) or ignore it (by returning false). One possible use of the filter, for example, might be to route credit card data to a secure database and blank it out as it loads into a less-restricted reporting database.
A DataContext is passed to each of the callback methods. A new context is created for each synchronization. The context provides a mechanism to share data during the load of a batch between different rows of data that are committed in a single database transaction.
The filter also provides callback methods for the batch lifecycle. The DatabaseWriterFilterAdapter may be used if not all methods are required.
A class implementing the IDatabaseWriterFilter interface is injected onto the DataLoaderService in order to receive callbacks when data is inserted, updated, or deleted.
public class MyFilter extends DatabaseWriterFilterAdapter {
@Override
public boolean beforeWrite(DataContext context, Table table, CsvData data) {
if (table.getName().equalsIgnoreCase("CREDIT_CARD_TENDER")
&& data.getDataEventType().equals(DataEventType.INSERT)) {
String[] parsedData = data.getParsedData(CsvData.ROW_DATA);
// blank out credit card number
parsedData[table.getColumnIndex("CREDIT_CARD_NUMBER")] = null;
}
return true;
}
}
The filter class should be specified in conf/symmetric-extensions.xml as follows.
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:context="http://www.springframework.org/schema/context"
xsi:schemaLocation="http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
http://www.springframework.org/schema/context
http://www.springframework.org/schema/context/spring-context-3.0.xsd">
<bean id="myFilter" class="com.mydomain.MyFilter"/>
</beans>
7.1.3. IDatabaseWriterErrorHandler
Implement this extension point to override how errors are handled. You can use this extension point to ignore rows that produce foreign key errors.
7.1.4. IDataLoaderFactory
Implement this extension point to provide a different implementation of the org.jumpmind.symmetric.io.data.IDataWriter that is used by the SymmetricDS data loader. Data loaders are configured for a channel. After this extension point is registered it can be activated for a CHANNEL by indicating the data loader name in the data_loader_type column.
SymmetricDS has two out of the box extensions of IDataLoaderFactory already implemented in its PostgresBulkDataLoaderFactory and OracleBulkDataLoaderFactory classes. These extension points implement bulk data loading capabilities for Oracle, Postgres and Greenplum dialects. See Appendix C. Database Notes for details.
Another possible use of this extension point is to route data to a NOSQL data sink.
7.1.5. IAcknowledgeEventListener
Implement this extension point to receive callback events when a batch is acknowledged. The callback for this listener happens at the point of extraction.
7.1.6. IReloadListener
Implement this extension point to listen in and take action before or after a reload is requested for a Node. The callback for this listener happens at the point of extraction.
7.1.7. IReloadVariableFilter
Implement this extension point to filter the SQL used by the initial load to query source tables and purge target tables. The extension receives the SQL and can replace variable names with values, which allows for adding new variables. The org.jumpmind.util.FormatUtils.replace() method can be used to find and replace variables.
7.1.8. ISyncUrlExtension
This extension point is used to select an appropriate URL based on the URI provided in the sync_url column of sym_node.
To use this extension point configure the sync_url for a node with the protocol of ext://beanName. The beanName is the name you give the extension point in the extension xml file.
7.1.9. IColumnTransform
This extension point allows custom column transformations to be created. There are a handful of out-of-the-box implementations. If any of these do not meet the column transformation needs of the application, then a custom transform can be created and registered. It can be activated by referencing the column transform’s name transform_type column of TRANSFORM_COLUMN
7.1.10. INodeIdCreator
This extension point allows SymmetricDS users to implement their own algorithms for how node ids and passwords are generated or selected during the registration process. There may be only one node creator per SymmetricDS instance (Please note that the node creator extension has replaced the node generator extension).
7.1.11. ITriggerCreationListener
Implement this extension point to get status callbacks during trigger creation.
7.1.12. IBatchAlgorithm
Implement this extension point and set the name of the Spring bean on the batch_algorithm column of the Channel table to use. This extension point gives fine grained control over how a channel is batched.
7.1.13. IDataRouter
Implement this extension point and set the name of the Spring bean on the router_type column of the Router table to use. This extension point gives the ability to programmatically decide which nodes data should be routed to.
7.1.14. IHeartbeatListener
Implement this extension point to get callbacks during the heartbeat job.
7.1.15. IOfflineClientListener
Implement this extension point to get callbacks for offline events on client nodes.
7.1.16. IOfflineServerListener
Implement this extension point to get callbacks for offline events detected on a server node during monitoring of client nodes.
7.1.17. INodePasswordFilter
Implement this extension point to intercept the saving and rendering of the node password.
7.2. Embedding in Android
SymmetricDS has its web-enabled, fault-tolerant, database synchronization software available on the Android mobile computing platform. The Android client follows all of the same concepts and brings to Android all of the same core SymmetricDS features as the full-featured, Java-based SymmetricDS client. The Android client is a little bit different in that it is not a stand-alone application, but is designed to be referenced as a library to run in-process with an Android application requiring synchronization for its SQLite database.
By using SymmetricDS, mobile application development is simplified, in that the mobile application developer can now focus solely on interacting with their local SQLite database. SymmetricDS takes care of capturing and moving data changes to and from a centralized database when the network is available
The same core libraries that are used for the SymmetricDS server are also used for Android. SymmetricDS’s overall footprint is reduced by eliminating a number of external dependencies in order to fit better on an Android device. The database access layer is abstracted so that the Android specific database access layer could be used. This allows SymmetricDS to be efficient in accessing the SQLite database on the Android device.
In order to convey how to use the SymmetricDS Android libraries, the example below will show how to integrate an Android application with the Quick-Start demo configuration. This Android application will create an embedded SQLite database, all SymmetricDS run-time tables, and configure the Quick-Start demo tables (ITEM, ITEM_SELLING_PRICE, SALE_TRANSACTION, SALE_RETURN_LINE_ITEM) for synchronization. The application has a basic user interface that can run queries on the SQLite database to demonstrate synchronization with a Quick-Start demo corp server.
Android Studio 3.0.1 and Android SDK 26 were used for this example. The example Android application can be cloned into Android Studio using the GitHub project located here: https://github.com/JumpMind/symmetric-android-client-demo
SymmetricDS for Android comes as a zip file of Java archives (jar files) that are required by the SymmetricDS client at runtime. The libs directory containing the necessary .jar files can be downloaded as a zip archive (symmetric-android-VERSION.zip) from the SymmetricDS downloads page. Alternatively, the libs directory can be generated by cloning the symmetric-ds GitHub repository and running the following command in the symmetric-assemble directory:
./gradlew androidDistZip
The first step to using SymmetricDS in an Android application is to unzip the jar files into a location where the project will recognize them. The latest Android SDK requires that these jar files be put into a libs directory under the app directory of the Android application project.
In order to sync properly, the Sync URL of the corp-000 node must be updated to use the IP address of host rather than localhost. Then, update the String REGISTRATION_URL in the DbProvider class of the Android project to the new Sync URL of the corp-000 node. |
Next, set up an Android Emulator or connect and Android device. This can be done by opening the Android Virtual Device Manager. Click New and follow the steps. The higher the Emulator’s API, the better.
Run your Android Application by pressing the Run button in Android Studio. When prompted, select the emulator you just created. Monitor the Console in Android Studio. Let the apk install on the emulator. Now watch the LogCat and wait as it attempts to register with your SymmetricDS Master Node.
The core functionality of SymmetricDS on Android is implemented by starting the SymmetricService class as an Android service. This requires building the SymmetricDS Android libraries using the steps mentioned above and adding them to your Android project’s dependencies.
The SymmetricService Intent is defined in the AndroidManifest.xml using the following XML snippet:
<service android:name="org.jumpmind.symmetric.android.SymmetricService" android:enabled="true" >
<intent-filter>
<action android:name="org.jumpmind.symmetric.android.SymmetricService" />
</intent-filter>
</service>
The SymmetricService Intent is started using the following java code:
Intent intent = new Intent(getContext(), SymmetricService.class);
// Replace extras with desired node configuration
intent.putExtra(SymmetricService.INTENTKEY_SQLITEOPENHELPER_REGISTRY_KEY, DATABASE_NAME);
intent.putExtra(SymmetricService.INTENTKEY_REGISTRATION_URL, REGISTRATION_URL);
intent.putExtra(SymmetricService.INTENTKEY_EXTERNAL_ID, NODE_ID);
intent.putExtra(SymmetricService.INTENTKEY_NODE_GROUP_ID, NODE_GROUP);
intent.putExtra(SymmetricService.INTENTKEY_START_IN_BACKGROUND, true);
Properties properties = new Properties();
// Put any additional SymmetricDS parameters into properties
intent.putExtra(SymmetricService.INTENTKEY_PROPERTIES, properties);
getContext().startService(intent);
7.3. Embedding in C/C++
A minimal implementation of the SymmetricDS client is written in C, which includes a shared library named "libsymclient" and a command line executable named "sym" for synchronizing a database. It currently only supports the SQLite database. The SymmetricDS C library and client are built from the following projects:
- symmetric-client-clib
-
This project contains most of the code and builds the libsymclient C library. It depends on libcurl, libsqlite3, and libcsv.
- symmetric-client-clib-test
-
This project links against the C library to runs unit tests. It also depends on the CUnit library.
- symmetric-client-native
-
This project links against the C library to build the sym executable.
The binaries are built using Eclipse CDT (C/C++ Development Tooling), which is an Integrated Developer Environment based on the Eclipse platform. A distribution of Eclipse CDT can be downloaded or an existing Eclipse installation can be updated to include the CDT. (See https://eclipse.org/cdt/ for information and downloads.) In the future, the projects above will switch to a general build system like Autotools for automating builds, but for now Eclipse is required.
The "sym" executable can be run from the command line and expects the "libsymclient.so" library to be installed on the system. If running from the project directories during development, the path to the library can be specified with the LD_LIBRARY_PATH environment variable on Linux, the DYLD_LIBRARY_PATH on Mac OS X, or PATH on Windows. The executable will look for a "symmetric.properties" file containing startup parameters in the user’s home directory or in the current directory:
LD_LIBRARY_PATH=../../symmetric-client-clib/Debug ./sym
It will also accept an argument of the path and filename of the properties file to use:
LD_LIBRARY_PATH=../../symmetric-client-clib/Debug ./sym /path/to/client.properties
The client uses Startup Parameters to connect to a database, identify itself, and register with a server to request synchronization. Here is an example client.properties file:
db.url=sqlite:file:test.db
group.id=store
external.id=003
registration.url=http://localhost:31415/sync/corp-000
The symmetric-client-native project is an example of how to use the SymEngine API provided by the C library. The C library uses an object-oriented pattern and follows the same naming conventions as the Java project. All symbol names in the C library are prefixed with "Sym". Each Java class is represented in C with a struct that contains member data and pointers to member functions. Here is an example C program that runs the SymmetricDS engine:
#include "libsymclient.h"
int main(int argCount, char **argValues) {
// Startup and runtime parameters
SymProperties *prop = SymProperties_new(NULL);
prop->put(prop, SYM_PARAMETER_DB_URL, "sqlite:file:data.db");
prop->put(prop, SYM_PARAMETER_GROUP_ID, "store");
prop->put(prop, SYM_PARAMETER_EXTERNAL_ID, "003");
prop->put(prop, SYM_PARAMETER_REGISTRATION_URL, "http://localhost:31415/sync/corp-000");
// Uncomment to read parameters from a file instead
//SymProperties *prop = SymProperties_newWithFile(NULL, fileName);
SymEngine *engine = SymEngine_new(NULL, prop);
// Connects to database, creates config/runtime tables and triggers
engine->start(engine);
// Pull changes from remote nodes
engine->pull(engine);
// Create batches of captured changes
engine->route(engine);
// Push changes to remote nodes
engine->push(engine);
// Create a heartbeat batch with current host information
engine->heartbeat(engine, 0);
// Purge old batch data that has successfully synced
engine->purge(engine);
// Clean up
engine->stop(engine);
engine->destroy(engine);
prop->destroy(prop);
return 0;
}
8. By Example
This chapter focuses on using examples for a variety of use cases with SymmetricDS.
8.1. Replicating CSV File Into a Database
8.1.1. SymmetricDS Setup
-
Create or choose a target table for your target node group. For this example, we will create a table, person, for the "corp" node group.
create table person(
id integer not null,
first_name varchar(50) not null,
last_name varchar(50) not null,
primary key(id)
);
-
Next, a file trigger should be created to watch for csv files at the specified base directory. You can also specify files that should be included to set off the trigger.
FileSync needs to be enabled for each node group to create file triggers. |
insert into SYM_FILE_TRIGGER (trigger_id, channel_id,
reload_channel_id, base_dir, recurse, includes_files,
create_time, last_update_time) values
('person_csv_trigger','filesync','filesync_reload',
<<BASE_DIR>>, 0, 'person*.csv',
current_timestamp, current_timestamp);
'BASE_DIR' should be changed to the directory you are looking in. |
-
After the file trigger is created, a router should be created so that the changes can be routed to the correct location. The communication link needs to be specified as well as the router type and target table. In this example, "store" is the source node group which communicates to the target node group,"corp". The router type is csv and person is the target table.
insert into SYM_ROUTER (router_id, target_table_name,
source_node_group_id, target_node_group_id, router_type,
create_time, last_update_time) values
('person_csv_router','person','store', 'corp', 'csv',
current_timestamp, current_timestamp);
-
Finally, a file routing link should be created between the file trigger and router.
insert into SYM_FILE_TRIGGER_ROUTER (trigger_id, router_id,
target_base_dir, create_time, last_update_time) values (
'person_csv_trigger', 'person_csv_router',
<<TARGET_BASE_DIR>>, current_timestamp, current_timestamp);
'TARGET_BASE_DIR' should be changed to your target base directory. |
-
Testing. You are now ready to test your csv to database routing. Add a csv file that matches your table specifications to your base directory or make a change to a pre-existing csv file in the directory. File sync tracker checks for changes in already sync’d files and runs every 5 minutes. File sync pull checks for new files to pull down and runs every 1 minute. Depending on the change you choose (changing a file or adding a file), wait the appropriate amount of time and then verify that the changes are shown in the target table.
select * from person;
Appendix A: Data Model
What follows is the complete SymmetricDS data model. Note that all tables are prepended with a configurable prefix so that multiple instances of SymmetricDS may coexist in the same database. The default prefix is sym_.
SymmetricDS configuration is entered by the user into the data model to control the behavior of what data is synchronized to which nodes.
At runtime, the configuration is used to capture data changes and route them to nodes. The data changes are placed together in a single unit called a batch that can be loaded by another node. Outgoing batches are delivered to nodes and acknowledged. Incoming batches are received and loaded. History is recorded for batch status changes and statistics.
A.1. CHANNEL
This table represents a category of data that can be synchronized independently of other channels. Channels allow control over the type of data flowing and prevents one type of synchronization from contending with another.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
CHANNEL_ID |
VARCHAR |
128 |
PK |
X |
A unique identifer, usually named something meaningful, like 'sales' or 'inventory'. |
|
PROCESSING_ORDER |
INTEGER |
1 |
X |
Order of sequence to process channel data. |
||
MAX_BATCH_SIZE |
INTEGER |
1000 |
X |
The maximum number of Data Events to process within a batch for this channel. |
||
MAX_BATCH_TO_SEND |
INTEGER |
60 |
X |
The maximum number of batches to send during a 'synchronization' between two nodes. A 'synchronization' is equivalent to a push or a pull. If there are 12 batches ready to be sent for a channel and max_batch_to_send is equal to 10, then only the first 10 batches will be sent. |
||
MAX_DATA_TO_ROUTE |
INTEGER |
100000 |
X |
The maximum number of data rows to route for a channel at a time. |
||
EXTRACT_PERIOD_MILLIS |
INTEGER |
0 |
X |
The minimum number of milliseconds allowed between attempts to extract data for targeted at a node_id. |
||
ENABLED |
TINYINT |
1 |
1 |
X |
Indicates whether channel is enabled or not. |
|
USE_OLD_DATA_TO_ROUTE |
TINYINT |
1 |
1 |
X |
Indicates whether to read the old data during routing. |
|
USE_ROW_DATA_TO_ROUTE |
TINYINT |
1 |
1 |
X |
Indicates whether to read the row data during routing. |
|
USE_PK_DATA_TO_ROUTE |
TINYINT |
1 |
1 |
X |
Indicates whether to read the pk data during routing. |
|
RELOAD_FLAG |
TINYINT |
1 |
0 |
X |
Indicates that this channel is used for reloads. |
|
FILE_SYNC_FLAG |
TINYINT |
1 |
0 |
X |
Indicates that this channel is used for file sync. |
|
CONTAINS_BIG_LOB |
TINYINT |
1 |
0 |
X |
Provides SymmetricDS a hint on how to treat captured data. Currently only supported by Oracle, Interbase and Firebird. If set to '0', then selects for routing and data extraction will be more efficient and lobs will be truncated at 4k in the trigger text. When it is set to '0' there is a 4k limit on the total size of a row and on the size of a LOB column. |
|
BATCH_ALGORITHM |
VARCHAR |
50 |
default |
X |
The algorithm to use when batching data on this channel. Possible values are: 'default', 'transactional', and 'nontransactional' |
|
DATA_LOADER_TYPE |
VARCHAR |
50 |
default |
X |
Identify the type of data loader this channel should use. Allows for the default dataloader to be swapped out via configuration for more efficient platform specific data loaders. |
|
DESCRIPTION |
VARCHAR |
255 |
Description on the type of data carried in this channel. |
|||
QUEUE |
VARCHAR |
25 |
default |
X |
User provided queue name for channel to operate on. Creates multi-threaded channels. Defaults to 'default' thread |
|
MAX_NETWORK_KBPS |
DECIMAL |
10,3 |
0.000 |
X |
The maximum network transfer rate in kilobytes per second. Zero or negative means unlimited. Channels running in serial or parallel can have an effect on how much bandwidth can be used and when a channel will be processed. This is currently only implemented when staging is enabled. |
|
DATA_EVENT_ACTION |
CHAR |
1 |
For a node group link with a data event action of B (both), select how to send changes to the target node group. (P = Push, W = Wait for Pull) |
|||
CREATE_TIME |
TIMESTAMP |
Timestamp when this entry was created. |
||||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
Timestamp when a user last updated this entry. |
A.2. CONFLICT
Defines how conflicts in row data should be handled during the load process.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
CONFLICT_ID |
VARCHAR |
50 |
PK |
X |
Unique identifier for a specific conflict detection setting. |
|
SOURCE_NODE_GROUP_ID |
VARCHAR |
50 |
FK |
X |
The source node group for which this setting will be applied to. References a node group link. |
|
TARGET_NODE_GROUP_ID |
VARCHAR |
50 |
FK |
X |
The target node group for which this setting will be applied to. References a node group link. |
|
TARGET_CHANNEL_ID |
VARCHAR |
128 |
Optional channel that this setting will be applied to. |
|||
TARGET_CATALOG_NAME |
VARCHAR |
255 |
Optional database catalog that the target table belongs to. Only use this if the target table is not in the default catalog. |
|||
TARGET_SCHEMA_NAME |
VARCHAR |
255 |
Optional database schema that the target table belongs to. Only use this if the target table is not in the default schema. |
|||
TARGET_TABLE_NAME |
VARCHAR |
255 |
Optional database table that this setting will apply to. If left blank, the setting will be for any table in the channel (if set) and in the specified node group link. |
|||
DETECT_TYPE |
VARCHAR |
128 |
X |
Indicates the strategy to use for detecting conflicts during a dml action. The possible values are: use_pk_data (manual, fallback, ignore), use_changed_data (manual, fallback, ignore), use_old_data (manual, fallback, ignore), use_timestamp (newer_wins), use_version (newer_wins) |
||
DETECT_EXPRESSION |
LONGVARCHAR |
An expression that provides additional information about the detection mechanism. If the detection mechanism is use_timestamp or use_version then this expression will be the name of the timestamp or version column. |
||||
RESOLVE_TYPE |
VARCHAR |
128 |
X |
Indicates the strategy for resolving update conflicts. The possible values differ based on the detect_type that is specified. |
||
PING_BACK |
VARCHAR |
128 |
X |
Indicates the strategy for sending resolved conflicts back to the source system. Possible values are: OFF, SINGLE_ROW, and REMAINING_ROWS. |
||
RESOLVE_CHANGES_ONLY |
TINYINT |
1 |
0 |
Indicates that when applying changes during an update that only data that has changed should be applied. Otherwise, all the columns will be updated. This really only applies to updates. |
||
RESOLVE_ROW_ONLY |
TINYINT |
1 |
0 |
Indicates that an action should take place for the entire batch if possible. This applies to a resolve type of 'ignore'. If a row is in conflict and the resolve type is 'ignore', then the entire batch will be ignored. |
||
CREATE_TIME |
TIMESTAMP |
X |
The date and time when this entry was created. |
|||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
X |
The date and time when a user last updated this entry. |
A.3. CONTEXT
Context variables used by runtime services on a single node
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
NAME |
VARCHAR |
80 |
PK |
X |
The name of the context variable. |
|
CONTEXT_VALUE |
LONGVARCHAR |
The value of the context variable. |
||||
CREATE_TIME |
TIMESTAMP |
Timestamp when this entry was created. |
||||
LAST_UPDATE_TIME |
TIMESTAMP |
Timestamp when emtry was last updated. |
A.4. DATA
The captured data change that occurred to a row in the database. Entries in data are created by database triggers.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
DATA_ID |
BIGINT |
PK |
X |
Unique identifier for a data. |
||
TABLE_NAME |
VARCHAR |
255 |
X |
The name of the table in which a change occurred that this entry records. |
||
EVENT_TYPE |
CHAR |
1 |
X |
The type of event captured by this entry. For triggers, this is the change that occurred, which is 'I' for insert, 'U' for update, or 'D' for delete. Other events include: 'R' for reloading the entire table (or subset of the table) to the node; 'S' for running dynamic SQL at the node, which is used for adhoc administration. |
||
ROW_DATA |
LONGVARCHAR |
The captured data change from the synchronized table. The column values are stored in comma-separated values (CSV) format. |
||||
PK_DATA |
LONGVARCHAR |
The primary key values of the captured data change from the synchronized table. This data is captured for updates and deletes. The primary key values are stored in comma-separated values (CSV) format. |
||||
OLD_DATA |
LONGVARCHAR |
The captured data values prior to the update. The column values are stored in CSV format. |
||||
TRIGGER_HIST_ID |
INTEGER |
X |
The foreign key to the trigger_hist entry that contains the primary key and column names for the table being synchronized. |
|||
CHANNEL_ID |
VARCHAR |
128 |
The channel that this data belongs to, such as 'prices' |
|||
TRANSACTION_ID |
VARCHAR |
255 |
An optional transaction identifier that links multiple data changes together as the same transaction. |
|||
SOURCE_NODE_ID |
VARCHAR |
50 |
If the data was inserted by a SymmetricDS data loader, then the id of the source node is record so that data is not re-routed back to it. |
|||
EXTERNAL_DATA |
VARCHAR |
50 |
A field that can be populated by a trigger that uses the EXTERNAL_SELECT |
|||
NODE_LIST |
VARCHAR |
255 |
A field that can be populated with a comma separated subset of node ids which will be the only nodes available to the router |
|||
IS_PREROUTED |
TINYINT |
1 |
0 |
X |
Set to true when routing should ignore this row because data_event and outgoing_batch rows are manually entered. |
|
CREATE_TIME |
TIMESTAMP |
Timestamp when this entry was created. |
A.5. DATA_EVENT
Each row represents the mapping between a data change that was captured and the batch that contains it. Entries in data_event are created as part of the routing process.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
DATA_ID |
BIGINT |
PK |
X |
Id of the data to be routed. |
||
BATCH_ID |
BIGINT |
PK |
X |
Id of the batch containing the data. |
||
CREATE_TIME |
TIMESTAMP |
Timestamp when this entry was created. |
A.6. DATA_GAP
Used only when routing.data.reader.type is set to 'gap.' Table that tracks gaps in the data table so that they may be processed efficiently, if data shows up. Gaps can show up in the data table if a database transaction is rolled back.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
START_ID |
BIGINT |
PK |
X |
The first missing data_id from the data table where a gap is detected. This could be the last data_id inserted plus one. |
||
END_ID |
BIGINT |
PK |
X |
The last missing data_id from the data table where a gap is detected. If the start_id is the last data_id inserted plus one, then this field is filled in with a -1. |
||
CREATE_TIME |
TIMESTAMP |
X |
Timestamp when this entry was created. |
|||
LAST_UPDATE_HOSTNAME |
VARCHAR |
255 |
The host who last updated this entry. |
A.7. EXTENSION
Dynamic extensions stored in the database that plug-in to the running engine and receive callbacks according to their interface.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
EXTENSION_ID |
VARCHAR |
50 |
PK |
X |
The unique id of the extension. |
|
EXTENSION_TYPE |
VARCHAR |
10 |
X |
The type of the extension. Types are 'java' and 'bsh' |
||
INTERFACE_NAME |
VARCHAR |
255 |
Name of interface, required for 'bsh' only. |
|||
NODE_GROUP_ID |
VARCHAR |
50 |
X |
Target the extension at a specific node group id. To target all groups, use the value of 'ALL'. |
||
ENABLED |
TINYINT |
1 |
1 |
X |
Whether or not the extension is enabled. |
|
EXTENSION_ORDER |
INTEGER |
1 |
X |
Specifies the order in which to install extensions when multiple extensions implement the same interface. |
||
EXTENSION_TEXT |
LONGVARCHAR |
The script or code of the extension. |
||||
CREATE_TIME |
TIMESTAMP |
Timestamp when this entry was created. |
||||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
Timestamp when a user last updated this entry. |
A.8. EXTRACT_REQUEST
This table is used internally to request the extract of initial loads asynchronously when the initial load extract job is enabled.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
REQUEST_ID |
BIGINT |
PK |
X |
Unique identifier for a request. |
||
SOURCE_NODE_ID |
VARCHAR |
50 |
default |
PK |
X |
Unique identifier for the node that will be the source of the extract. |
NODE_ID |
VARCHAR |
50 |
X |
The node_id of the batch being loaded. |
||
QUEUE |
VARCHAR |
128 |
The channel queue name of the batch being loaded. |
|||
STATUS |
CHAR |
2 |
NE, OK |
|||
START_BATCH_ID |
BIGINT |
X |
A load can be split across multiple batches. This is the first of N batches the load will be split across. |
|||
END_BATCH_ID |
BIGINT |
X |
This is the last of N batches the load will be split across. |
|||
TRIGGER_ID |
VARCHAR |
128 |
X |
Unique identifier for a trigger associated with the extract request. |
||
ROUTER_ID |
VARCHAR |
50 |
X |
Unique description of the router associated with the extract request. |
||
LOAD_ID |
BIGINT |
The load id associated with the extract request. |
||||
TABLE_NAME |
VARCHAR |
255 |
The table name for this extract request |
|||
EXTRACTED_ROWS |
BIGINT |
0 |
X |
The rows in this table that have been extracted to target |
||
EXTRACTED_MILLIS |
BIGINT |
0 |
X |
The time spent extracting this table |
||
TRANSFERRED_ROWS |
BIGINT |
0 |
X |
The rows in this table that have been transferred to target |
||
TRANSFERRED_MILLIS |
BIGINT |
0 |
X |
The time spent transfering this table |
||
LAST_TRANSFERRED_BATCH_ID |
BIGINT |
The last batch id that was successfully transferred to the target |
||||
LOADED_ROWS |
BIGINT |
0 |
X |
The rows in this table that have been loaded to the target |
||
LOADED_MILLIS |
BIGINT |
0 |
X |
The time spent loading this table |
||
LAST_LOADED_BATCH_ID |
BIGINT |
The last batch id that was successfully loaded on the target |
||||
TOTAL_ROWS |
BIGINT |
The rows in this table to be extracted |
||||
LOADED_TIME |
TIMESTAMP |
Timestamp when this table was loaded. |
||||
PARENT_REQUEST_ID |
BIGINT |
0 |
X |
Parent request_id that will actually handle the extract for this request. |
||
LAST_UPDATE_TIME |
TIMESTAMP |
Timestamp when a process last updated this entry. |
||||
CREATE_TIME |
TIMESTAMP |
Timestamp when this entry was created. |
A.9. FILE_INCOMING
As files are loaded from another node the file and source node are captured here for file sync to use to prevent file ping backs in bidirectional file synchronization.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
RELATIVE_DIR |
VARCHAR |
255 |
PK |
X |
The path to the file starting at the base_dir and excluding the file name itself. |
|
FILE_NAME |
VARCHAR |
260 |
PK |
X |
The name of the file that has been loaded. |
|
LAST_EVENT_TYPE |
CHAR |
1 |
X |
The type of event that caused the file to be loaded from another node. 'C' is for create, 'M' is for modified, and 'D' is for deleted. |
||
NODE_ID |
VARCHAR |
50 |
X |
The node_id of the source of the batch being loaded. |
||
FILE_MODIFIED_TIME |
BIGINT |
The last modified time of the file at the time the file was loaded. |
A.10. FILE_SNAPSHOT
Table used to capture file changes. Updates to the table are captured and routed according to the configured file trigger routers.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
TRIGGER_ID |
VARCHAR |
128 |
PK |
X |
The id of the trigger that caused this snapshot to be taken. |
|
ROUTER_ID |
VARCHAR |
50 |
PK |
X |
The id of the router that caused this snapshot to be taken. |
|
RELATIVE_DIR |
VARCHAR |
255 |
PK |
X |
The path to the file starting at the base_dir |
|
FILE_NAME |
VARCHAR |
260 |
PK |
X |
The name of the file that changed. |
|
CHANNEL_ID |
VARCHAR |
128 |
filesync |
X |
The channel_id of the channel that data changes will flow through. |
|
RELOAD_CHANNEL_ID |
VARCHAR |
128 |
filesync_reload |
X |
The channel_id of the channel that data changes will flow through. |
|
LAST_EVENT_TYPE |
CHAR |
1 |
X |
The type of event captured by this entry. 'C' is for create, 'M' is for modified, and 'D' is for deleted. |
||
CRC32_CHECKSUM |
BIGINT |
File checksum. Can be used to determine if file content has changed. |
||||
FILE_SIZE |
BIGINT |
The size in bytes of the file at the time this change was detected. |
||||
FILE_MODIFIED_TIME |
BIGINT |
The last modified time of the file at the time this change was detected. |
||||
LAST_UPDATE_TIME |
TIMESTAMP |
X |
Timestamp when a user last updated this entry. |
|||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
CREATE_TIME |
TIMESTAMP |
X |
Timestamp when this entry was created. |
A.11. FILE_TRIGGER
This table defines files or sets of files for which changes will be captured for file synchronization
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
TRIGGER_ID |
VARCHAR |
128 |
PK |
X |
Unique identifier for a trigger. |
|
CHANNEL_ID |
VARCHAR |
128 |
filesync |
X |
The channel_id of the channel that data changes will flow through. |
|
RELOAD_CHANNEL_ID |
VARCHAR |
128 |
filesync_reload |
X |
The channel_id of the channel that will be used for reloads. |
|
BASE_DIR |
VARCHAR |
255 |
X |
The base directory on the client that will be synchronized. |
||
RECURSE |
TINYINT |
1 |
1 |
X |
Whether to synchronize child directories. |
|
INCLUDES_FILES |
VARCHAR |
255 |
Wildcard-enabled, comma-separated list of file to include in synchronization. |
|||
EXCLUDES_FILES |
VARCHAR |
255 |
Wildcard-enabled, comma-separated list of file to exclude from synchronization. |
|||
SYNC_ON_CREATE |
TINYINT |
1 |
1 |
X |
Whether to capture and send files when they are created. |
|
SYNC_ON_MODIFIED |
TINYINT |
1 |
1 |
X |
Whether to capture and send files when they are modified. |
|
SYNC_ON_DELETE |
TINYINT |
1 |
1 |
X |
Whether to capture and remove files when they are deleted. |
|
SYNC_ON_CTL_FILE |
TINYINT |
1 |
0 |
X |
Combined with sync_on_create, determines whether to capture and send files when a matching control file exists. The control file is a file of the same name with a '.ctl' extension appended to the end. |
|
DELETE_AFTER_SYNC |
TINYINT |
1 |
0 |
X |
Determines whether to delete the file after it has synced successfully. |
|
BEFORE_COPY_SCRIPT |
LONGVARCHAR |
A bsh script that is run right before the file copy. |
||||
AFTER_COPY_SCRIPT |
LONGVARCHAR |
A bsh script that is run right after the file copy. |
||||
CREATE_TIME |
TIMESTAMP |
X |
Timestamp of when this entry was created. |
|||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
X |
Timestamp of when a user last updated this entry. |
|||
DESCRIPTION |
LONGVARCHAR |
Optional notes and comments for file_trigger |
A.12. FILE_TRIGGER_ROUTER
Maps a file trigger to a router.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
TRIGGER_ID |
VARCHAR |
128 |
PK FK |
X |
The id of a file trigger. |
|
ROUTER_ID |
VARCHAR |
50 |
PK FK |
X |
The id of a router. |
|
ENABLED |
TINYINT |
1 |
1 |
X |
Indicates whether this file trigger router is enabled or not. |
|
INITIAL_LOAD_ENABLED |
TINYINT |
1 |
1 |
X |
Indicates whether this file trigger should be initial loaded. |
|
TARGET_BASE_DIR |
VARCHAR |
255 |
The base directory on the destination that files will be synchronized to. |
|||
CONFLICT_STRATEGY |
VARCHAR |
128 |
source_wins |
X |
The strategy to employ when a file has been modified at both the client and the server. Possible values are: source_wins, target_wins, manual |
|
CREATE_TIME |
TIMESTAMP |
X |
Timestamp when this entry was created. |
|||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
X |
Timestamp when a user last updated this entry. |
|||
DESCRIPTION |
LONGVARCHAR |
Optional notes and comments for file_trigger_router |
A.13. GROUPLET
Deprecated in 3.9. This tables defines named groups to which nodes can belong to based on their external id. Grouplets are used to designate that synchronization should only affect an explicit subset of nodes in a node group.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
GROUPLET_ID |
VARCHAR |
50 |
PK |
X |
Unique identifier for the grouplet. |
|
GROUPLET_LINK_POLICY |
CHAR |
1 |
I |
X |
Specified whether the external ids in the grouplet_link are included in the group or excluded from the grouplet. In the case of excluded, the grouplet starts with all external ids and removes the excluded ones listed. Use 'I' for inclusive and 'E' for exclusive. |
|
DESCRIPTION |
VARCHAR |
255 |
A description of this grouplet. |
|||
CREATE_TIME |
TIMESTAMP |
X |
Timestamp when this entry was created. |
|||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
X |
Timestamp when a user last updated this entry. |
A.14. GROUPLET_LINK
Deprecated in 3.9. This tables defines nodes belong to a grouplet based on their external.id
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
GROUPLET_ID |
VARCHAR |
50 |
PK FK |
X |
Unique identifier for the grouplet. |
|
EXTERNAL_ID |
VARCHAR |
255 |
PK |
X |
Provides a means to select the nodes that belong to a grouplet. |
|
CREATE_TIME |
TIMESTAMP |
X |
Timestamp when this entry was created. |
|||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
X |
Timestamp when a user last updated this entry. |
A.15. INCOMING_BATCH
The incoming_batch is used for tracking the status of loading an outgoing_batch from another node. Data is loaded and commited at the batch level. The status of the incoming_batch is either successful (OK) or error (ER).
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
BATCH_ID |
BIGINT |
50 |
PK |
X |
The id of the outgoing_batch that is being loaded. |
|
NODE_ID |
VARCHAR |
50 |
PK |
X |
The node_id of the source of the batch being loaded. |
|
CHANNEL_ID |
VARCHAR |
128 |
The channel_id of the batch being loaded. |
|||
STATUS |
CHAR |
2 |
The current status of the batch can be loading (LD), successfully loaded (OK), in error (ER) or skipped (SK) |
|||
ERROR_FLAG |
TINYINT |
1 |
0 |
A flag that indicates that this batch was in error during the last synchornization attempt. |
||
SQL_STATE |
VARCHAR |
10 |
For a status of error (ER), this is the XOPEN or SQL 99 SQL State. |
|||
SQL_CODE |
INTEGER |
0 |
X |
For a status of error (ER), this is the error code from the database that is specific to the vendor. |
||
SQL_MESSAGE |
LONGVARCHAR |
For a status of error (ER), this is the error message that describes the error. |
||||
LAST_UPDATE_HOSTNAME |
VARCHAR |
255 |
The host name of the process that last did work on this batch. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
Timestamp when a process last updated this entry. |
||||
CREATE_TIME |
TIMESTAMP |
Timestamp when this entry was created. |
||||
SUMMARY |
VARCHAR |
255 |
A high level summary of what is included in a batch, often a list of table names. |
|||
IGNORE_COUNT |
INTEGER |
0 |
X |
The number of times a batch was ignored. |
||
BYTE_COUNT |
BIGINT |
0 |
X |
The number of bytes that were sent as part of this batch. |
||
LOAD_FLAG |
TINYINT |
1 |
0 |
A flag that indicates that this batch is part of an initial load. |
||
EXTRACT_COUNT |
INTEGER |
0 |
X |
The number of times this an attempt to extract this batch occurred. |
||
SENT_COUNT |
INTEGER |
0 |
X |
The number of times this batch was sent. A batch can be sent multiple times if an ACK is not received. |
||
LOAD_COUNT |
INTEGER |
0 |
X |
The number of times an attempt to load this batch occurred. |
||
RELOAD_ROW_COUNT |
INTEGER |
0 |
X |
The number of reloads in the batch, which include rows from any extract transforms. |
||
OTHER_ROW_COUNT |
INTEGER |
0 |
X |
The number of SQL, BSH, and DDL events in the batch, which include rows from any extract transforms. |
||
DATA_ROW_COUNT |
INTEGER |
0 |
X |
The total number of rows in the batch, which include rows from any extract transforms. |
||
EXTRACT_ROW_COUNT |
INTEGER |
0 |
X |
The total number of rows extracted from the capture log. |
||
LOAD_ROW_COUNT |
INTEGER |
0 |
X |
The total number of rows loaded into the database, which includes rows from any load transforms or filters. |
||
DATA_INSERT_ROW_COUNT |
INTEGER |
0 |
X |
The number of inserts in the batch, which include rows from any extract transforms. |
||
DATA_UPDATE_ROW_COUNT |
INTEGER |
0 |
X |
The number of updates in the batch, which include rows from any extract transforms. |
||
DATA_DELETE_ROW_COUNT |
INTEGER |
0 |
X |
The number of deletes in the batch, which include rows from any extract transforms. |
||
EXTRACT_INSERT_ROW_COUNT |
INTEGER |
0 |
X |
The number of inserts extracted from the capture log. |
||
EXTRACT_UPDATE_ROW_COUNT |
INTEGER |
0 |
X |
The number of updates extracted from the capture log. |
||
EXTRACT_DELETE_ROW_COUNT |
INTEGER |
0 |
X |
The number of deletes extracted from the capture log. |
||
LOAD_INSERT_ROW_COUNT |
INTEGER |
0 |
X |
The number of inserts loaded into the database, which includes rows from any load transforms or filters. |
||
LOAD_UPDATE_ROW_COUNT |
INTEGER |
0 |
X |
The number of updates loaded into the database, which includes rows from any load transforms or filters. |
||
LOAD_DELETE_ROW_COUNT |
INTEGER |
0 |
X |
The number of deletes loaded into the database, which includes rows from any load transforms or filters. |
||
NETWORK_MILLIS |
INTEGER |
0 |
X |
The number of milliseconds spent transfering this batch across the network. |
||
FILTER_MILLIS |
INTEGER |
0 |
X |
The number of milliseconds spent in filters processing data. |
||
LOAD_MILLIS |
INTEGER |
0 |
X |
The number of milliseconds spent loading the data into the target database. |
||
ROUTER_MILLIS |
INTEGER |
0 |
X |
The number of milliseconds spent creating this batch. |
||
EXTRACT_MILLIS |
INTEGER |
0 |
X |
The number of milliseconds spent extracting the data out of the source database. |
||
TRANSFORM_EXTRACT_MILLIS |
INTEGER |
0 |
X |
The number of milliseconds spent transforming the data on the extract side. |
||
TRANSFORM_LOAD_MILLIS |
INTEGER |
0 |
X |
The number of milliseconds spent transforming the data on the load side. |
||
LOAD_ID |
BIGINT |
An id that ties multiple batches together to identify them as being part of an initial load. |
||||
COMMON_FLAG |
TINYINT |
1 |
0 |
A flag that indicates that the data in this batch is shared by other nodes (they will have the same batch_id). Shared batches will be extracted to a common location. |
||
FALLBACK_INSERT_COUNT |
INTEGER |
0 |
X |
The number of times an update was turned into an insert because the data was not already in the target database. |
||
FALLBACK_UPDATE_COUNT |
INTEGER |
0 |
X |
The number of times an insert was turned into an update because a data row already existed in the target database. |
||
CONFLICT_WIN_COUNT |
INTEGER |
0 |
X |
Not implemented. The number of times a conflict was detected for a row and it was resolved as the winning row. |
||
CONFLICT_LOSE_COUNT |
INTEGER |
0 |
X |
Not implemented. The number of times a conflict was detected for a row and it was resolved as the losing row. |
||
IGNORE_ROW_COUNT |
INTEGER |
0 |
X |
The number of times a row was ignored. |
||
MISSING_DELETE_COUNT |
INTEGER |
0 |
X |
The number of times a delete did not affect the database because the row was already deleted. |
||
SKIP_COUNT |
INTEGER |
0 |
X |
The number of times a batch was sent and skipped because it had already been loaded according to incoming_batch. |
||
FAILED_ROW_NUMBER |
INTEGER |
0 |
X |
This numbered data event that failed as read from the CSV. |
||
FAILED_LINE_NUMBER |
INTEGER |
0 |
X |
The current line number in the CSV for this batch that failed. |
||
FAILED_DATA_ID |
BIGINT |
0 |
X |
For a status of error (ER), this is the data_id that was being processed when the batch failed. |
||
BULK_LOADER_FLAG |
TINYINT |
1 |
0 |
A flag that indicates that this batch did or did not use the bulk loader. |
A.16. INCOMING_ERROR
The captured data change that is in error for a batch. The user can tell the system what to do by updating the resolve columns. Entries in data_error are created when an incoming batch encounters an error.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
BATCH_ID |
BIGINT |
50 |
PK |
X |
The id of the outgoing_batch that is being loaded. |
|
NODE_ID |
VARCHAR |
50 |
PK |
X |
The node_id of the source of the batch being loaded. A node_id of -1 means that the batch was 'unrouted'. |
|
FAILED_ROW_NUMBER |
BIGINT |
PK |
X |
The row number in the batch that encountered an error when loading. |
||
FAILED_LINE_NUMBER |
BIGINT |
0 |
X |
The current line number in the CSV for this batch that failed. |
||
TARGET_CATALOG_NAME |
VARCHAR |
255 |
The catalog name for the table being loaded. |
|||
TARGET_SCHEMA_NAME |
VARCHAR |
255 |
The schema name for the table being loaded. |
|||
TARGET_TABLE_NAME |
VARCHAR |
255 |
X |
The table name for the table being loaded. |
||
EVENT_TYPE |
CHAR |
1 |
X |
The type of event captured by this entry. For triggers, this is the change that occurred, which is 'I' for insert, 'U' for update, or 'D' for delete. Other events include: 'R' for reloading the entire table (or subset of the table) to the node; 'S' for running dynamic SQL at the node, which is used for adhoc administration. |
||
BINARY_ENCODING |
VARCHAR |
10 |
HEX |
X |
The type of encoding the source system used for encoding binary data. |
|
COLUMN_NAMES |
LONGVARCHAR |
X |
The column names defined on the table. The column names are stored in comma-separated values (CSV) format. |
|||
PK_COLUMN_NAMES |
LONGVARCHAR |
X |
The primary key column names defined on the table. The column names are stored in comma-separated values (CSV) format. |
|||
ROW_DATA |
LONGVARCHAR |
The row data from the batch as captured from the source. The column values are stored in comma-separated values (CSV) format. |
||||
OLD_DATA |
LONGVARCHAR |
The old row data prior to update from the batch as captured from the source. The column values are stored in CSV format. |
||||
CUR_DATA |
LONGVARCHAR |
The current row data that caused the error to occur. The column values are stored in CSV format. |
||||
RESOLVE_DATA |
LONGVARCHAR |
The capture data change from the user that is used instead of row_data. This is useful when resolving a conflict manually by specifying the data that should load. |
||||
RESOLVE_IGNORE |
TINYINT |
1 |
0 |
Indication from the user that the row_data should be ignored and the batch can continue loading with the next row. |
||
CONFLICT_ID |
VARCHAR |
50 |
Unique identifier for the conflict detection setting that caused the error |
|||
CREATE_TIME |
TIMESTAMP |
Timestamp when this entry was created. |
||||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
X |
Timestamp when a user last updated this entry. |
A.17. JOB
Defines custom jobs that can be executed in the SymmetricDS instance.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
JOB_NAME |
VARCHAR |
50 |
PK |
X |
The unique name of the job |
|
JOB_TYPE |
VARCHAR |
10 |
X |
The type of the job. Valid values are 'sql', 'java' and 'bsh'. The system exclusively uses the 'built_in' job type. |
||
REQUIRES_REGISTRATION |
TINYINT |
1 |
1 |
X |
Does the engine need to be registered for this job to run? |
|
JOB_EXPRESSION |
LONGVARCHAR |
The payload of the job. For bsh (beanshell) jobs, this should be bsh script. For Java jobs, this should be Java code of a class which implements the IJob interface. For SQL jobs, this should be a sql script. |
||||
DESCRIPTION |
VARCHAR |
255 |
An optional description of the job for users of the system. |
|||
DEFAULT_SCHEDULE |
VARCHAR |
50 |
The schedule to use if no schedule parameter is found. Overridden by job.jobname.period.time.ms or job.jobname.cron. |
|||
DEFAULT_AUTO_START |
TINYINT |
1 |
1 |
X |
Determine if this job should auto start. Overridden by start.jobname.job. |
|
NODE_GROUP_ID |
VARCHAR |
50 |
X |
Target the job at a specific node group id. To target all groups, use the value of 'ALL'. |
||
IS_CLUSTERED |
TINYINT |
1 |
0 |
X |
Whether to acquire a cluster lock or not. |
|
CREATE_BY |
VARCHAR |
50 |
The user who created this entry. |
|||
CREATE_TIME |
TIMESTAMP |
Timestamp when this entry was created. |
||||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
Timestamp when a user last updated this entry. |
A.18. LOAD_FILTER
A table that allows you to dynamically define filters using bsh.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
LOAD_FILTER_ID |
VARCHAR |
50 |
PK |
X |
The id of the load filter. |
|
LOAD_FILTER_TYPE |
VARCHAR |
10 |
X |
The type of load filter. Possible values include: BSH, JAVA, SQL |
||
SOURCE_NODE_GROUP_ID |
VARCHAR |
50 |
X |
The source node group for the filter. |
||
TARGET_NODE_GROUP_ID |
VARCHAR |
50 |
X |
The destination node group for the filter. |
||
TARGET_CATALOG_NAME |
VARCHAR |
255 |
Optional name for the catalog the configured table is in. |
|||
TARGET_SCHEMA_NAME |
VARCHAR |
255 |
Optional name for the schema a configured table is in. |
|||
TARGET_TABLE_NAME |
VARCHAR |
255 |
The name of the target table that will trigger the bsh filter. |
|||
FILTER_ON_UPDATE |
TINYINT |
1 |
1 |
X |
Whether or not the filter should apply on an update. |
|
FILTER_ON_INSERT |
TINYINT |
1 |
1 |
X |
Whether or not the filter should apply on an insert. |
|
FILTER_ON_DELETE |
TINYINT |
1 |
1 |
X |
Whether or not the filter should apply on a delete. |
|
BEFORE_WRITE_SCRIPT |
LONGVARCHAR |
The script to apply before the write is completed. |
||||
AFTER_WRITE_SCRIPT |
LONGVARCHAR |
The script to apply after the write is completed. |
||||
BATCH_COMPLETE_SCRIPT |
LONGVARCHAR |
The script to apply on batch complete. |
||||
BATCH_COMMIT_SCRIPT |
LONGVARCHAR |
The script to apply on batch commit. |
||||
BATCH_ROLLBACK_SCRIPT |
LONGVARCHAR |
The script to apply on batch rollback. |
||||
HANDLE_ERROR_SCRIPT |
LONGVARCHAR |
The script to apply when data cannot be processed. |
||||
CREATE_TIME |
TIMESTAMP |
X |
Timestamp when this entry was created. |
|||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
X |
Timestamp when a user last updated this entry. |
|||
LOAD_FILTER_ORDER |
INTEGER |
1 |
X |
Specifies the order in which to apply load filters if more than one target operation occurs. |
||
FAIL_ON_ERROR |
TINYINT |
1 |
0 |
X |
Whether we should fail the batch if the filter fails. |
A.19. LOCK
Contains semaphores that are set when processes run, so that only one server can run a process at a time. Enable this feature by using the cluster.lock.enabled parameter.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
LOCK_ACTION |
VARCHAR |
50 |
PK |
X |
The process that needs a lock. |
|
LOCK_TYPE |
VARCHAR |
50 |
X |
Type of lock that indicates differently locking behavior. Types include cluster, exclusive, and shared. Cluster lock is used to allow one server to run at a time, but any process from the same server can overtake the lock, which avoids stalled processing. Exclusive lock is owned by one process, regardless of which server it is on, but another process can acquire the lock after lock_time is older than exclusive.lock.timeout.ms. Shared lock allows multiple processes to use the same lock, incrementing the shared_count, but requires no exclusive lock exists and prevents an exclusive lock. |
||
LOCKING_SERVER_ID |
VARCHAR |
255 |
The name of the server that currently has a lock. This is typically a host name, but it can be overridden using the -Druntime.symmetric.cluster.server.id=name System property. |
|||
LOCK_TIME |
TIMESTAMP |
The time a lock is aquired. Use the cluster.lock.timeout.ms to specify a lock timeout period. |
||||
SHARED_COUNT |
INTEGER |
0 |
X |
For a lock_type of SHARED, this is the number of processes sharing the same lock. After the shared_count drops to zero, a shared lock is removed. |
||
SHARED_ENABLE |
INTEGER |
0 |
X |
For a lock_type of SHARED, this flag set to 1 indicates that more processes can share the lock. If an exclusive lock is needed, the flag is set to 0 to prevent further shared locks from accumulating. |
||
LAST_LOCK_TIME |
TIMESTAMP |
Timestamp when a process last updated this entry. |
||||
LAST_LOCKING_SERVER_ID |
VARCHAR |
255 |
The server id of the process that last did work on this batch. |
A.20. MONITOR
Defines monitors that will run periodically to look for problems in the system.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
MONITOR_ID |
VARCHAR |
128 |
PK |
X |
Unique identifier for a monitor. |
|
NODE_GROUP_ID |
VARCHAR |
50 |
ALL |
X |
Target a specific node group to run this monitor. Target all groups, use a value of 'ALL'. |
|
EXTERNAL_ID |
VARCHAR |
255 |
ALL |
X |
Target a specific node by its external ID to run this monitor. Target all nodes, use a value of 'ALL'. |
|
TYPE |
VARCHAR |
50 |
X |
Monitor type to execute. Built-in types are cpu, disk, memory, batchError, batchUnsent, dataGap, dataUnrouted, and log. |
||
EXPRESSION |
LONGVARCHAR |
An expression used by the monitor to set options specific to the monitor type. |
||||
THRESHOLD |
BIGINT |
0 |
X |
The minimum value returned when the monitor runs that will cause a monitor event to be recorded. |
||
RUN_PERIOD |
INTEGER |
0 |
X |
Run this monitor periodically every number of seconds. |
||
RUN_COUNT |
INTEGER |
0 |
X |
Average the value across a number of runs before checking threshold. |
||
SEVERITY_LEVEL |
INTEGER |
0 |
X |
|||
ENABLED |
TINYINT |
1 |
0 |
X |
Whether or not this monitor is enabled for execution. |
|
CREATE_TIME |
TIMESTAMP |
Timestamp when this entry was created. |
||||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
Timestamp when a user last updated this entry. |
A.21. MONITOR_EVENT
Records an event of when a system problem occurred.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
MONITOR_ID |
VARCHAR |
128 |
PK |
X |
Unique identifier for a monitor that caused the event. |
|
NODE_ID |
VARCHAR |
50 |
PK |
X |
Unique identifier for the node that created the event. |
|
EVENT_TIME |
TIMESTAMP |
PK |
X |
Timestamp when the event was created. |
||
HOST_NAME |
VARCHAR |
60 |
Host name of the node that created the event. |
|||
TYPE |
VARCHAR |
50 |
X |
Monitor type that detected the value recorded. |
||
THRESHOLD |
BIGINT |
0 |
X |
Minimum value for the monitor to cause an event. |
||
EVENT_VALUE |
BIGINT |
0 |
X |
Actual value detected by the monitor. |
||
EVENT_COUNT |
INTEGER |
0 |
X |
Number of times this event has occurred and been updated. |
||
SEVERITY_LEVEL |
INTEGER |
0 |
X |
Severity level configured for the monitor. |
||
IS_RESOLVED |
TINYINT |
0 |
X |
Whether an event is resolved because its value dropped below the threshold. |
||
IS_NOTIFIED |
TINYINT |
0 |
X |
Whether a notification was run. |
||
DETAILS |
LONGVARCHAR |
JSON formatted details about the event |
||||
LAST_UPDATE_TIME |
TIMESTAMP |
Timestamp when the event was last updated. |
A.22. NODE
Representation of an instance of SymmetricDS that synchronizes data with one or more additional nodes. Each node has a unique identifier (nodeId) that is used when communicating, as well as a domain-specific identifier (externalId) that provides context within the local system.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
NODE_ID |
VARCHAR |
50 |
PK |
X |
A unique identifier for a node. |
|
NODE_GROUP_ID |
VARCHAR |
50 |
X |
The node group that this node belongs to, such as 'store'. |
||
EXTERNAL_ID |
VARCHAR |
255 |
X |
A domain-specific identifier for context within the local system. For example, the retail store number. |
||
SYNC_ENABLED |
TINYINT |
1 |
0 |
Indicates whether this node should be sent synchronization. Disabled nodes are ignored by the triggers, so no entries are made in data_event for the node. |
||
SYNC_URL |
VARCHAR |
255 |
The URL to contact the node for synchronization. |
|||
SCHEMA_VERSION |
VARCHAR |
50 |
The version of the database schema this node manages. Useful for specifying synchronization by version. |
|||
SYMMETRIC_VERSION |
VARCHAR |
50 |
The version of SymmetricDS running at this node. |
|||
CONFIG_VERSION |
VARCHAR |
50 |
The version of configuration running at this node. |
|||
DATABASE_TYPE |
VARCHAR |
50 |
The database product name at this node as reported by JDBC. |
|||
DATABASE_VERSION |
VARCHAR |
50 |
The database product version at this node as reported by JDBC. |
|||
DATABASE_NAME |
VARCHAR |
50 |
The database product name identified by SymmetricDS. |
|||
BATCH_TO_SEND_COUNT |
INTEGER |
0 |
The number of outgoing batches that have not yet been sent. This field is updated as part of the heartbeat job if the heartbeat.update.node.with.batch.status property is set to true. |
|||
BATCH_IN_ERROR_COUNT |
INTEGER |
0 |
The number of outgoing batches that are in error at this node. This field is updated as part of the heartbeat job if the heartbeat.update.node.with.batch.status property is set to true. |
|||
BATCH_LAST_SUCCESSFUL |
TIMESTAMP |
The last incoming or outgoing batch that was marked OK that is not on the config, heartbeat, or monitor channels. |
||||
DATA_ROWS_TO_SEND_COUNT |
INTEGER |
0 |
The number of of rows to replicated in batches. This field is updated as part of the heartbeat job if the heartbeat.update.node.with.batch.status property is set to true. |
|||
DATA_ROWS_LOADED_COUNT |
INTEGER |
0 |
The number of rows that has been successfully replicated since last purge. This field is updated as part of the heartbeat job if the heartbeat.update.node.with.batch.status property is set to true. |
|||
OLDEST_LOAD_TIME |
TIMESTAMP |
The oldest recorded sync time since the last purge. |
||||
MOST_RECENT_ACTIVE_TABLE |
VARCHAR |
The most active table to sync recently (time based on last 5 minutes and doubles until a match is found or until 80 min.)) |
||||
PURGE_OUTGOING_LAST_RUN_MS |
BIGINT |
0 |
The number of milliseconds that the purge job took on the last run. This field is updated as part of the heartbeat job if the heartbeat.update.node.with.batch.status property is set to true. |
|||
PURGE_OUTGOING_LAST_FINISH |
TIMESTAMP |
The time the purge outgoing last finished succesfully. This field is updated as part of the heartbeat job if the heartbeat.update.node.with.batch.status property is set to true. |
||||
PURGE_OUTGOING_AVERAGE_MS |
BIGINT |
The average time the purge outgoing has been run. This field is updated as part of the heartbeat job if the heartbeat.update.node.with.batch.status property is set to true. |
||||
ROUTING_LAST_RUN_MS |
BIGINT |
The time it took for the the last routing job to finish succesfully. This field is updated as part of the heartbeat job if the heartbeat.update.node.with.batch.status property is set to true. |
||||
ROUTING_LAST_FINISH |
TIMESTAMP |
The time the routing job last finished succesfully. This field is updated as part of the heartbeat job if the heartbeat.update.node.with.batch.status property is set to true. |
||||
ROUTING_AVERAGE_RUN_MS |
BIGINT |
0 |
The number of milliseconds that the routing job averages. This field is updated as part of the heartbeat job if the heartbeat.update.node.with.batch.status property is set to true. |
|||
SYM_DATA_SIZE |
BIGINT |
The current size of the sym_data table. This field is updated as part of the heartbeat job if the heartbeat.update.node.with.batch.status property is set to true. |
||||
CREATED_AT_NODE_ID |
VARCHAR |
50 |
The node_id of the node where this node was created. This is typically filled automatically with the node_id found in node_identity where registration was opened for the node. |
|||
DEPLOYMENT_TYPE |
VARCHAR |
50 |
An indicator as to the type of SymmetricDS software that is running. Possible values are, but not limited to: engine, standalone, war, professional, mobile |
|||
DEPLOYMENT_SUB_TYPE |
VARCHAR |
50 |
An indicator as to the deployment sub type of the node. Possible values are, but not limited to: load-only |
A.23. NODE_CHANNEL_CTL
Used to ignore or suspend a channel. A channel that is ignored will have its data_events batched and they will immediately be marked as 'OK' without sending them. A channel that is suspended is skipped when batching data_events.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
NODE_ID |
VARCHAR |
50 |
PK |
X |
Unique identifier for a node. |
|
CHANNEL_ID |
VARCHAR |
128 |
PK |
X |
The name of the channel_id that is being controlled. |
|
SUSPEND_ENABLED |
TINYINT |
1 |
0 |
Indicates if this channel is suspended, which prevents batches from being sent, although new batches can still be created. |
||
IGNORE_ENABLED |
TINYINT |
1 |
0 |
Indicates if this channel is ignored, which marks batches with a status of OK like they were actually processed. |
||
LAST_EXTRACT_TIME |
TIMESTAMP |
Record the last time data was extract for a node and a channel. |
A.24. NODE_COMMUNICATION
This table is used to coordinate communication with other nodes.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
NODE_ID |
VARCHAR |
50 |
PK |
X |
Unique identifier for a node. |
|
QUEUE |
VARCHAR |
25 |
default |
PK |
X |
The queue name to use in relation to the channel. |
COMMUNICATION_TYPE |
VARCHAR |
10 |
PK |
X |
The type of communication that is taking place with this node. Valid values are: PULL, PUSH |
|
LOCK_TIME |
TIMESTAMP |
The timestamp when this node was locked |
||||
LOCKING_SERVER_ID |
VARCHAR |
255 |
The name of the server that currently has a pull lock for the node. This is typically a host name, but it can be overridden using the -Druntime.symmetric.cluster.server.id=name System property. |
|||
LAST_LOCK_TIME |
TIMESTAMP |
The timestamp when this node was last locked |
||||
LAST_LOCK_MILLIS |
BIGINT |
0 |
The amount of time the last communication took. |
|||
SUCCESS_COUNT |
BIGINT |
0 |
The number of successive successful communication attempts. |
|||
FAIL_COUNT |
BIGINT |
0 |
The number of successive failed communication attempts. |
|||
SKIP_COUNT |
BIGINT |
0 |
The number of skipped communication attempts. |
|||
TOTAL_SUCCESS_COUNT |
BIGINT |
0 |
The total number of successful communication attempts with the node. |
|||
TOTAL_FAIL_COUNT |
BIGINT |
0 |
The total number of failed communication attempts with the node. |
|||
TOTAL_SUCCESS_MILLIS |
BIGINT |
0 |
The total amount of time spent during successful communication attempts with the node. |
|||
TOTAL_FAIL_MILLIS |
BIGINT |
0 |
The total amount of time spent during failed communication attempts with the node. |
|||
BATCH_TO_SEND_COUNT |
BIGINT |
0 |
The number of batches this node has queued for pull. |
|||
NODE_PRIORITY |
INTEGER |
0 |
Used to order nodes when initiating a pull operation. Can be used to move a node to the top of the list to pull from it as quickly as possible. |
A.25. NODE_GROUP
A category of Nodes that synchronizes data with one or more NodeGroups. A common use of NodeGroup is to describe a level in a hierarchy of data synchronization.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
NODE_GROUP_ID |
VARCHAR |
50 |
PK |
X |
Unique identifier for a node group, usually named something meaningful, like 'store' or 'warehouse'. |
|
DESCRIPTION |
VARCHAR |
255 |
A description of this node group. |
|||
CREATE_TIME |
TIMESTAMP |
Timestamp when this entry was created. |
||||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
Timestamp when a user last updated this entry. |
A.26. NODE_GROUP_CHANNEL_WND
An optional window of time for which a node group and channel will extract and send data.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
NODE_GROUP_ID |
VARCHAR |
50 |
PK |
X |
The node_group_id that this window applies to. |
|
CHANNEL_ID |
VARCHAR |
128 |
PK |
X |
The channel_id that this window applies to. |
|
START_TIME |
TIMESTAMP |
PK |
X |
The start time for the active window. |
||
END_TIME |
TIMESTAMP |
PK |
X |
The end time for the active window. Note that if the end_time is less than the start_time then the window crosses a day boundary. |
||
ENABLED |
TINYINT |
1 |
0 |
X |
Enable this window. If this is set to '0' then this window is ignored. |
A.27. NODE_GROUP_LINK
A source node_group sends its data updates to a target NodeGroup using a pull, push, or custom technique.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
SOURCE_NODE_GROUP_ID |
VARCHAR |
50 |
PK FK |
X |
The node group where data changes should be captured. |
|
TARGET_NODE_GROUP_ID |
VARCHAR |
50 |
PK FK |
X |
The node group where data changes will be sent. |
|
DATA_EVENT_ACTION |
CHAR |
1 |
W |
X |
The notification scheme used to send data changes to the target node group. (P = Push, W = Wait for Pull, B = Both Push and Wait for Pull (control from channel), R = Route-Only) |
|
SYNC_CONFIG_ENABLED |
TINYINT |
1 |
1 |
X |
Indicates whether configuration that has changed should be synchronized to target nodes on this link. Affects all SymmetricDS configuration tables except for sym_extract_request, sym_file_snapshot, sym_monitor_event, sym_node, sym_node_host, sym_node_security, sym_table_reload_request, and sym_table_reload_status. |
|
IS_REVERSIBLE |
TINYINT |
1 |
0 |
X |
Indicates if communication can work in reverse as specified on the channel. A reversible push link can be overridden to pull, and a reversible pull link can be overridden to push on the channel. |
|
CREATE_TIME |
TIMESTAMP |
Timestamp when this entry was created. |
||||
LAST_UPDATE_BY |
VARCHAR |
50 |
The user who last updated this entry. |
|||
LAST_UPDATE_TIME |
TIMESTAMP |
Timestamp when a user last updated this entry. |
A.28. NODE_HOST
Representation of an physical workstation or server that is hosting the SymmetricDS software. In a clustered environment there may be more than one entry per node in this table.
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
NODE_ID |
VARCHAR |
50 |
PK |
X |
A unique identifier for a node. |
|
HOST_NAME |
VARCHAR |
60 |
PK |
X |
The host name of a workstation or server. If more than one instance of SymmetricDS runs on the same server, then this value can be a 'server id' specified by -Druntime.symmetric.cluster.server.id |
|
INSTANCE_ID |
VARCHAR |
60 |
A unique identifer generated by each installation of SymmetricDS. Used to to safe-guard against multi-access to the sym tables. |
|||
IP_ADDRESS |
VARCHAR |
50 |
The ip address for the host. |
|||
OS_USER |
VARCHAR |
50 |
The user SymmetricDS is running under |
|||
OS_NAME |
VARCHAR |
50 |
The name of the OS |
|||
OS_ARCH |
VARCHAR |
50 |
The hardware architecture of the OS |
|||
OS_VERSION |
VARCHAR |
50 |
The version of the OS |
|||
AVAILABLE_PROCESSORS |
INTEGER |
0 |
The number of processors available to use. |
|||
FREE_MEMORY_BYTES |
BIGINT |
0 |
The amount of free memory available to the JVM. |
|||
TOTAL_MEMORY_BYTES |
BIGINT |
0 |
The amount of total memory available to the JVM. |
|||
MAX_MEMORY_BYTES |
BIGINT |
0 |
The max amount of memory available to the JVM. |
|||
JAVA_VERSION |
VARCHAR |
50 |
The version of java that SymmetricDS is running as. |
|||
JAVA_VENDOR |
VARCHAR |
255 |
The vendor of java that SymmetricDS is running as. |
|||
JDBC_VERSION |
VARCHAR |
255 |
The verision of the JDBC driver that is being used. |
|||
SYMMETRIC_VERSION |
VARCHAR |
50 |
The version of SymmetricDS running at this node. |
|||
TIMEZONE_OFFSET |
VARCHAR |
6 |
The time zone offset in RFC822 format at the time of the last heartbeat. |
|||
HEARTBEAT_TIME |
TIMESTAMP |
The last timestamp when the node sent a heartbeat, which is attempted every ten minutes by default. |
||||
LAST_RESTART_TIME |
TIMESTAMP |
X |
Timestamp when this instance was last restarted. |
|||
CREATE_TIME |
TIMESTAMP |
X |
Timestamp when this entry was created. |
A.29. NODE_HOST_CHANNEL_STATS
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
NODE_ID |
VARCHAR |
50 |
PK |
X |
A unique identifier for a node. |
|
HOST_NAME |
VARCHAR |
60 |
PK |
X |
The host name of a workstation or server. If more than one instance of SymmetricDS runs on the same server, then this value can be a 'server id' specified by -Druntime.symmetric.cluster.server.id |
|
CHANNEL_ID |
VARCHAR |
128 |
PK |
X |
The channel_id of the channel that data changes will flow through. |
|
START_TIME |
TIMESTAMP |
PK |
X |
The start time for the period which this row represents. |
||
END_TIME |
TIMESTAMP |
PK |
X |
The end time for the period which this row represents. |
||
DATA_ROUTED |
BIGINT |
0 |
Indicate the number of data rows that have been routed during this period. |
|||
DATA_UNROUTED |
BIGINT |
0 |
The amount of data that has not yet been routed at the time this stats row was recorded. |
|||
DATA_EVENT_INSERTED |
BIGINT |
0 |
Indicate the number of data rows that have been routed during this period. |
|||
DATA_EXTRACTED |
BIGINT |
0 |
The number of data rows that were extracted during this time period. |
|||
DATA_BYTES_EXTRACTED |
BIGINT |
0 |
The number of bytes that were extracted during this time period. |
|||
DATA_EXTRACTED_ERRORS |
BIGINT |
0 |
The number of errors that occurred during extraction during this time period. |
|||
DATA_BYTES_SENT |
BIGINT |
0 |
The number of bytes that were sent during this time period. |
|||
DATA_SENT |
BIGINT |
0 |
The number of rows that were sent during this time period. |
|||
DATA_SENT_ERRORS |
BIGINT |
0 |
The number of errors that occurred while sending during this time period. |
|||
DATA_LOADED |
BIGINT |
0 |
The number of rows that were loaded during this time period. |
|||
DATA_BYTES_LOADED |
BIGINT |
0 |
The number of bytes that were loaded during this time period. |
|||
DATA_LOADED_ERRORS |
BIGINT |
0 |
The number of errors that occurred while loading during this time period. |
|||
DATA_LOADED_OUTGOING |
BIGINT |
0 |
The number of rows that were acknowledged as loaded by another node during this time period. |
|||
DATA_BYTES_LOADED_OUTGOING |
BIGINT |
0 |
The number of bytes that were acknowledged as loaded by another node during this time period. |
|||
DATA_LOADED_OUTGOING_ERRORS |
BIGINT |
0 |
The number of errors that occurred while loading to another node during this time period. |
A.30. NODE_HOST_JOB_STATS
Name |
Type |
Size |
Default |
Keys |
Not Null |
Description |
NODE_ID |
VARCHAR |
50 |
PK |
X |
A unique identifier for a node. |
|
HOST_NAME |
VARCHAR |
60 |
PK |
X |
The host name of a workstation or server. If more than one instance of SymmetricDS run |