Toolsverse - Data Management Tools

The most common data integration, ETL, and data migration problems solved by the ETL Framework:

  • Ability to embed the ETL engine in your application and use only the components you need
  • Ability to use the XML-based language or Java to create ETL scenarios
  • Ability to extend functionality by adding plugin modules
  • Native support for a wide selection of SQL and Non-SQL data sources
  • High-performance data integration and ETL
  • Real-time and batch data movements and transformations
  • Any-to-any data migration

Easy to embed in your application

It is easy to embed the framework in your application. The framework is designed as a set of pluggable components such as connectors, IO engines, ETL drivers, metadata drivers, functional libraries and code generators.

Core components are required, but everything else can be removed. For example, if you only work with Oracle and MySQL databases and don't need FTP or SFTP access to the data or support for any file-based formats, you can keep the generic SQL connector and the Oracle and MySQL ETL drivers, and remove all other components.

You can configure and deploy the framework exactly according to your particular needs. The API is simple and easy to use.

Also, you are not required to actually "integrate" the framework into your application. There are command line executables for all major operating systems.

Usage and embedding scenarios:

  • Command line, headless application
  • Framework integrated into Java application
  • ETL process is executed as a Web service
  • Design, execute and schedule scenarios using Data Explorer

Example of embedding ETL framework in Java:
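As a rough, hypothetical sketch of what embedding might look like — the class and method names (EtlConfig, EtlFactory, EtlProcess, EtlResponse) are assumptions for illustration, not necessarily the framework's actual API:

```java
// Hypothetical sketch: class and method names are assumptions,
// not necessarily the framework's actual API.

// Initialize the engine configuration
EtlConfig config = new EtlConfig();
config.init();

// Parse an ETL scenario written in the XML-based language
EtlFactory factory = new EtlFactory();
Scenario scenario = factory.getScenario(config, "scenario.xml");

// Execute the scenario and check the result
EtlProcess process = new EtlProcess();
EtlResponse response = process.execute(config, scenario);

if (response.getException() != null) {
    // handle the error
}
```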

 

Learn more about ETL framework usage scenarios.

See more examples of embedding ETL framework.


Easy-to-Program, Rich API

You are in full control of how you create data integration and ETL scenarios. There is a high-level XML-based programming language, but there is also an open Java API. Scenarios can be created using the XML-based language, in Java, or, when using Data Explorer, without programming at all. The choice is yours.

Example of ETL scenario created using XML:
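As a hypothetical sketch only — the element names below are assumptions for illustration, not the framework's documented schema — a scenario that copies a table from a source connection to a destination connection might look like this:

```xml
<!-- Hypothetical sketch: element names are assumptions,
     not the framework's documented schema -->
<scenario>
    <name>Copy orders</name>
    <sources>
        <source name="orders">
            <extract>select * from orders</extract>
        </source>
    </sources>
    <destinations>
        <destination name="orders"/>
    </destinations>
</scenario>
```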

 

Example of ETL scenario created in Java:
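The same kind of scenario could be assembled programmatically. Again, a hypothetical sketch — the class and method names are assumptions, not necessarily the framework's actual API:

```java
// Hypothetical sketch: class and method names are assumptions,
// not necessarily the framework's actual API.
Scenario scenario = new Scenario();
scenario.setName("Copy orders");

// Define what to extract
Source source = new Source();
source.setName("orders");
source.setSqlToExtract("select * from orders");
scenario.addSource(source);

// Define where to load it
Destination destination = new Destination();
destination.setName("orders");
scenario.addDestination(destination);
```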

 

Learn more about ETL scenario language. 

Learn how to create data integration and data migration scenarios. 


Easy to extend

While the framework provides a comprehensive set of connectors, transformations and functions, there are times when you need to add something new. For example, you might need a native connector for your business application or support for exotic file formats.

Thanks to the pluggable architecture and open API, the ETL framework is easy to extend. We can help you write plugins of any kind. You are free to share your plugins with others, but you are not required to.


Data Connectivity

Out of the box, the ETL Framework comes with a wide selection of connectors for various SQL and Non-SQL data sources. Database connectors natively support database-specific SQL features (for example, Oracle PL/SQL), metadata catalogs, and connectivity options and parameters. Non-SQL connectors can write (and in most cases read) data and metadata in the most popular file-based formats. There is an open API, so it is possible to add new connectors as plugin modules. The built-in connectors include:

| Connector | Metadata | Native SQL parser and code generator | Write | Read |
|---|---|---|---|---|
| Oracle | Entire Oracle metadata catalog | Yes: PL/SQL, Oracle-specific SQL | Yes | Yes |
| MS SQL Server | Entire MS SQL Server metadata catalog | Yes: Transact-SQL, SQL Server-specific SQL | Yes | Yes |
| MySQL | Entire MySQL metadata catalog | Yes: MySQL stored procedure language, MySQL-specific SQL | Yes | Yes |
| PostgreSQL | Entire PostgreSQL metadata catalog | Yes: PL/pgSQL, PostgreSQL-specific SQL | Yes | Yes |
| DB2 | Entire DB2 metadata catalog | Yes: SQL PL, DB2-specific SQL | Yes | Yes |
| Sybase ASE | Entire Sybase ASE metadata catalog | Yes: T-SQL, Sybase ASE-specific SQL | Yes | Yes |
| Informix | Entire Informix metadata catalog | Yes: SPL, Informix-specific SQL | Yes | Yes |
| Generic JDBC | Only what the JDBC driver supports | No | Yes | Yes |
| Generic ODBC | Only what the ODBC driver supports | No | Yes | Yes |
| Excel ODBC | Yes | No | Yes | Yes |
| Access ODBC | Yes | No | Yes | Yes |
| XML | Yes | N/A | Yes | Yes |
| XML with transformation | Yes | N/A | Yes | Yes |
| JSON | Yes | N/A | Yes | Yes |
| Delimited text | Yes | N/A | Yes | Yes |
| Fixed-length text | Yes | N/A | Yes | Yes |
| Excel XLS | Yes | N/A | Yes | Yes |
| Excel XLSX | Yes | N/A | Yes | Yes |
| HTML | No | N/A | Yes | No |
| PDF | No | N/A | Yes | No |


I/O

Read, create and update your data regardless of location. In addition to IP-based JDBC connectivity, the ETL Framework can read and write data using the File, FTP, SFTP and HTTP protocols. The standard SMTP mail protocol is also supported. There is an open API, so it is possible to add new I/O processors as plugin modules.


High performance data integration and ETL

Some unique features of the ETL framework available to developers:

  • Automatic data type translation
  • Data streaming
  • Parallel extract and load
  • Real-time data movements and data transformations
  • Batch processing
  • Support for bind variables
  • Creating schema objects on the fly using the source as a pattern; for example, a table can be created automatically if it doesn't exist
  • Column transformations: calculate value, add column, exclude column, change data type
  • Dataset transformations: filter, sort, set operations, de-normalization, de-duplication, matrix transformations, XSLT transformations
  • Dimension transformations: add dimension, extract dimension
  • Data quality features and validation using SQL, JavaScript, other scripting languages and regular expressions
  • Support for pluggable scripting languages such as JavaScript and Ruby
  • Using database-specific features: language constructs, cursors, temporary tables, exception handling, external tools
  • CLOB and BLOB support
  • Common tasks such as executing OS commands, file-based operations (copy, delete, rename, etc.) and sending emails

Example of column-level transformations:
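As a hypothetical sketch — the element and attribute names below are assumptions for illustration, not the framework's documented schema — column-level transformations might be expressed like this:

```xml
<!-- Hypothetical sketch: element and attribute names are assumptions,
     not the framework's documented schema -->
<source name="employees">
    <extract>select * from employees</extract>
    <fields>
        <!-- calculate a value -->
        <field name="SALARY" function="SALARY * 1.1"/>
        <!-- add a column -->
        <field name="FULL_NAME" add="true" function="FIRST_NAME + ' ' + LAST_NAME"/>
        <!-- exclude a column -->
        <field name="SSN" exclude="true"/>
        <!-- change the data type -->
        <field name="HIRE_DATE" type="varchar"/>
    </fields>
</source>
```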

 

Read more about features of the ETL Framework.


Support for database specific features

While most transformations can be truly database-agnostic, there are situations in which you want to use features of the source or destination database, such as inline stored procedures and functions, cursors, temporary tables, and bulk extract and load.

The ETL framework hides the complexity of working with specific databases. It includes ETL drivers and complex transformations for the most widely used databases: Oracle, DB2, MS SQL Server, MySQL, PostgreSQL, Sybase ASE and Informix are supported natively.

In addition to these databases, the framework can work with any database for which a JDBC or ODBC driver exists. One of the unique features of the framework is the ability to automatically discover and use the data types and functions supported by the target database.

Example of using database-specific features in an ETL scenario:
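As a hypothetical sketch — the element names and the {VARIABLE} bind syntax are assumptions for illustration, not the framework's documented schema — a destination could run an inline PL/SQL block on the Oracle side:

```xml
<!-- Hypothetical sketch: element names and the {VARIABLE} bind
     syntax are assumptions, not the framework's documented schema -->
<destination name="orders">
    <load>
        begin
            insert into orders (order_id, status)
            values ({ORDER_ID}, {STATUS});
        exception when dup_val_on_index then
            update orders set status = {STATUS}
            where order_id = {ORDER_ID};
        end;
    </load>
</destination>
```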

 

See more examples 


Support for Non-SQL data sources

Using the ETL framework, you can read, write, create and stream a wide selection of Non-SQL data sources such as JSON, XML, text, Excel, etc. What's more, you can apply the same transformations to Non-SQL data sources as you would to SQL databases. In most cases, your ETL scenario will look exactly the same regardless of the data format.

Example of creating an Excel file from a database:
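As a hypothetical sketch — connection, connector and element names are assumptions for illustration, not the framework's documented schema:

```xml
<!-- Hypothetical sketch: element and attribute names are assumptions,
     not the framework's documented schema -->
<connections>
    <connection name="source" driver="oracle"/>
    <connection name="dest" connector="xlsx" url="employees.xlsx"/>
</connections>
<scenario>
    <source name="employees" connection="source">
        <extract>select * from employees</extract>
    </source>
    <destination name="employees" connection="dest"/>
</scenario>
```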

 

See more examples


Real time data movements and transformations

ETL Data Transformation

ETL framework supports real time data movements and transformations.

Real-time transformations are used when you want to keep data where it is, but need to aggregate information from multiple sources and perform complex transformations.

For example, if you want to produce a report that combines data from an OLTP Oracle database and an accounting system in MySQL, you can do that with the ETL Framework.

You can mix and match heterogeneous datasets by applying transformations such as set operations, de-normalization, de-duplication and much more. All of this takes just a dozen lines of Java or XML.


Data Quality and Validation

A crucial part of any good ETL process is ensuring data quality. The ETL framework includes inline (executed for each row) and post-process (executed for the entire dataset) validation tasks. Developers can use JavaScript or any other supported scripting language to write small validation programs that can reject a row, or an entire dataset, if there are problems with the data.

Validation example using JavaScript:
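A minimal sketch of an inline (per-row) validation check. The row object shape and the accept/reject convention are assumptions for illustration, not the framework's actual contract:

```javascript
// Minimal sketch of an inline (per-row) validation check.
// The row shape and the accept/reject convention are assumptions
// for illustration, not the framework's actual contract.
function validateRow(row) {
  // Reject the row if SALARY is missing or negative
  if (row.SALARY == null || row.SALARY < 0) {
    return { accepted: false, reason: "Invalid SALARY: " + row.SALARY };
  }
  // Reject the row if EMAIL does not look like an email address
  if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(row.EMAIL)) {
    return { accepted: false, reason: "Invalid EMAIL: " + row.EMAIL };
  }
  return { accepted: true };
}
```

A post-process validation would work the same way, except it would receive the entire dataset and could reject it as a whole.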


Any-to-any data migration

Data migration could not be easier. For example, the same scenario can be used to migrate data and schema objects from an Oracle database to MySQL, or from PostgreSQL to JSON files. If needed, schema objects such as tables, indexes and files are automatically created using the source as a pattern. It is also possible to stream data and migrate many sources to many destinations in parallel.

Example of the data migration scenario which creates tables and indexes on the fly and uses streaming and parallel load:
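As a hypothetical sketch — the attribute names (create, stream, parallel) are assumptions for illustration, not the framework's documented schema:

```xml
<!-- Hypothetical sketch: attribute names (create, stream, parallel)
     are assumptions, not the framework's documented schema -->
<scenario>
    <source name="orders" stream="true" parallel="true">
        <extract>select * from orders</extract>
    </source>
    <!-- create="true": build the destination table and indexes
         on the fly, using the source as a pattern -->
    <destination name="orders" create="true" parallel="true"/>
</scenario>
```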


Data Streaming

The ETL framework supports data streaming when reading and writing data. When streaming is enabled, you can move practically unlimited datasets from source to destination, even in a memory-constrained environment.

Enabling streaming is as easy as setting one attribute:
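A hypothetical sketch — the stream attribute name is an assumption for illustration, not the framework's documented schema:

```xml
<!-- Hypothetical sketch: the stream attribute name is an assumption -->
<source name="orders" stream="true">
    <extract>select * from orders</extract>
</source>
```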


Parallel execution

The ETL framework supports parallel execution on many levels. You can extract from sources in parallel, load data in parallel and execute individual scenarios in parallel. You can also set synchronization points. Once again, enabling parallel execution is as easy as setting one attribute in XML:
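A hypothetical sketch — the parallel attribute name is an assumption for illustration, not the framework's documented schema:

```xml
<!-- Hypothetical sketch: the parallel attribute name is an assumption -->
<sources>
    <source name="orders" parallel="true"/>
    <source name="order_items" parallel="true"/>
</sources>
```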


Ability to create schema objects automatically

An ETL scenario can be configured to automatically create schema objects such as tables, indexes and files.

For example, suppose an ETL scenario moves data from an MS SQL Server database to MySQL, and some tables in the SQL Server schema don't exist in MySQL.

If you want the framework to automatically create the missing destination tables, it is as easy as setting one attribute in XML. The source data types will be automatically translated to the destination data types (in this example, MS SQL Server to MySQL).

Example:
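A hypothetical sketch — the create attribute name is an assumption for illustration, not the framework's documented schema:

```xml
<!-- Hypothetical sketch: the create attribute name is an assumption.
     If the CUSTOMERS table does not exist in the destination, it is
     created automatically, using the source table as a pattern. -->
<destination name="customers" create="true"/>
```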


Batch processing and bind variables

When working with databases, two of the techniques available to developers are batch processing and bind variables.

Batch processing groups identical SQL statements into batches and executes them as one set, which significantly improves performance.

Using bind variables also improves performance, because the SQL is compiled (prepared) once and then reused with different parameter values (the bind variables).

Example of using batch processing and bind variables:
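A hypothetical sketch — the batch attribute and the {VARIABLE} bind syntax are assumptions for illustration, not the framework's documented schema:

```xml
<!-- Hypothetical sketch: the batch attribute and the {VARIABLE}
     bind syntax are assumptions. The insert is prepared once and
     executed in batches with different bind variable values. -->
<destination name="orders" batch="true">
    <load>
        insert into orders (order_id, status)
        values ({ORDER_ID}, {STATUS})
    </load>
</destination>
```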


The best way to create, execute and schedule ETL and data integration scenarios

The ETL framework is a powerful tool, and the best way to use it is together with Data Explorer – a fully integrated universal database development, data discovery, data integration and data migration IDE. Below is just a small sample of the features available when using Data Explorer:

  • Create and execute complex ETL and data integration scenarios using intuitive drag and drop interface
  • Execute ETL scenarios on the server
  • Schedule ETL scenarios using built-in cron based scheduler
  • Discover dependencies and metadata on the fly
  • Use security and audit features of the Data Explorer

Learn more about using Data Explorer for ETL and data integration development.


How ETL Framework works

Any data integration or data migration process can be described as extract-transform-load (ETL) or extract-load-transform (ELT). With that in mind, writing scenarios becomes a process of splitting the task into extracts, loads and transformations. A scenario glues them all together and adds purpose and logic.

The ETL engine that powers Toolsverse products uses an XML-based language to create scenarios. XML is a natural choice because it enforces a structure: loads follow extracts, and transformations follow extracts and loads. That said, you can choose to stream data, so that extract and load are combined into one logical operation, or run them in parallel, in which case the actual order of extracts and loads is not guaranteed.

The ETL engine makes it possible to concentrate on the task at hand without reinventing the wheel. It hides the complexity of data integration, so in most cases the same techniques can be applied when working with any SQL or Non-SQL data source. At the same time, it is possible to use the full power of the target database and do things like direct data loads, parallel extracts, etc.

The framework reads data sources, performs transformations and generates database specific SQL code which is executed within a transaction. When working with Non-SQL data sources the ETL framework uses pluggable connectors to read and write data. The framework consists of core components and plugins.

Core Components

  • ETL Process
  • ETL Factory
  • Extractor
  • Loader
  • Connection Factory
  • Transaction Monitor
  • Code Generator
  • Task Executor
  • Scripting Engine

Plugins

 

ETL Drivers
Oracle,
DB2,
MS SQL Server,
MySQL,
PostgreSQL,
Sybase ASE,
Informix,
Generic JDBC,
Generic File,
Other pluggable ETL drivers

Connectors
SQL,
JSON,
XML,
Delimited Text files,
Fixed-length text files,
Excel XLS,
Excel XLSX,
HTML,
PDF,
Other pluggable Connectors

Transformations
Column transformations: add column, exclude column, change column value, change column type, validate column, regexp replacement
Dataset transformations: filter, order, join, minus, intersect, union, pivot, de-normalize, transpose, matrix
Dimension transformations: add dimension, extract dimension
Other pluggable transformations

I/O Processors
File,
SFTP,
FTP,
HTTP,
SMTP,
Other pluggable IO processors

Functions
Common,
Oracle,
DB2,
MS SQL Server,
MySQL,
PostgreSQL,
Sybase ASE,
Informix,
Other pluggable functions

Tasks
File tasks: copy, move, delete, rename, zip, unzip
Other tasks: XSLT transformation, send email, SQL task, execute OS command
Other pluggable tasks

Metadata Drivers
Oracle, DB2, MS SQL Server, MySQL, PostgreSQL, Sybase ASE, Informix, Generic file, Excel XLS, Excel XLSX, JSON, Text, XML, Generic JDBC,
Other pluggable metadata drivers


Pricing, source code and distribution

The ETL Framework is a dual-licensed product. The core components are open source and provided under the LGPL license. A small subset of libraries is closed source and distributed under a commercial license. A single-developer license costs $100 (US) and includes libraries, source code, a test suite, build scripts and examples. A royalty-free distribution license is available, and the full source code of the ETL framework is available for purchase as well.

| Product | Download | Price | Buy |
|---|---|---|---|
| ETL Framework | Download | $100 per developer | Buy it now |
| ETL Framework with Royalty-free Distribution license | Download | $500 | Buy it now |
| ETL Framework with Full Source Code (Royalty-free Distribution license included) | Inquire | $1000 | Buy it now |


Support

Standard email support is available at no charge. We will answer your questions and help you set up and configure the ETL Framework. We offer premium support upon request. Read more about Toolsverse support options.


Buy Toolsverse ETL Framework

Buy ETL Framework Today

Compact, modular, high performance and embeddable ETL engine written in Java.