Difference between revisions of "Configuration file"

From Toolsverse Knowledge Base
Jump to: navigation, search
(Heading text)
 
(36 intermediate revisions by the same user not shown)
Line 1: Line 1:
== Heading text ==
+
== ETL Framework configuration file ==
ETL Framework configuration file
+
  
ETL configuration file tells ETL framework what connections to use and what ETL scenarios to run. It is located under APP_HOME/config.  
+
ETL configuration file is used to define ETL properties, what ETL scenarios to run and what connections to use. It is located under APP_HOME/config. The default name is etl_config.xml. You can change name of the ETL configuration file using -D property etl.config.name.  
  
For example: c:/etl/config/etl_config.xml.
+
Example: <pre>-Detl.config.name=c:/etl/config/test.xml</pre>
  
 
File contains the following elements:  
 
File contains the following elements:  
Line 11: Line 10:
 
This section includes Java system properties which can be used by ETL Framework.
 
This section includes Java system properties which can be used by ETL Framework.
  
Example:
+
'''Example:'''
 
<syntaxhighlight lang="xml">
 
<syntaxhighlight lang="xml">
 
<properties>
 
<properties>
Line 20: Line 19:
 
</syntaxhighlight>
 
</syntaxhighlight>
  
List of default properties:
+
==== List of default properties: ====
 +
{| class="wikitable"
 +
|-
 +
! Property !! Description !! Default value !! Example
 +
|-
 +
| log.step || Log step is a how many rows of the data set to skip until log the event. For example if "log step" property set to 10 while iterating through the rows the ETL process will log every 10 rows.
 +
Note: to log this kind of events the log level must be set to INFO in the log4j.properties located under APP_HOME/config
 +
|| 0 (never) ||<pre><log.step>777</log.step></pre>
 +
|-
 +
| cache || Defines the class which is used by ETL framework to cache values such as primary and foreign keys.
 +
 
 +
Note:  It must implement com.toolsverse.cache. Cache interface
 +
|| com.toolsverse.cache.SynchMemoryCache || <pre><cache>com.toolsverse.cache.MemoryCache</cache></pre>
 +
|-
 +
| connection.factory || Defines the class which is used to create connections used by ETL framework. Note: It must implement com.toolsverse.etl.core.connection.EtlConnectionFactory interface || com.toolsverse.etl.core.connection.EtlConnectionFactoryImpl || <pre><connection.factory>abc</connection.factory></pre>
 +
|-
 +
| oracle.oraclehome || The Oracle client home ||  || <pre><oracle.oraclehome>c:/oracle</oracle.oraclehome></pre>
 +
|-
 +
| db2.db2home || The DB2 client home || || <pre><db2.db2home>c:/db2</db2.db2home></pre>
 +
|-
 +
| mssql.mssqlhome || The MS SQL Server client home || || <pre><mssql.mssqlhome>c:/sqlserver</mssql.mssqlhome></pre>
 +
|-
 +
| sybase.sybasehome || The Sybase client home || || <pre><sybase.sybasehome>c:/ase</sybase.sybasehome></pre>
 +
|-
 +
| mysql.mysqlhome || The MySQL client home || || <pre><mysql.mysqlhome>c:/mysql</mysql.mysqlhome></pre>
 +
|-
 +
| postgres.postgreshome || The PostgreSQL client home || || <pre><postgres.postgreshome>c:/postgres</postgres.postgreshome></pre>
 +
|-
 +
| informix.informixhome || The Informix client home || || <pre><informix.home>c:/informix</informix.home></pre>
 +
|}
 +
 
 +
=== Connections ===
 +
In this section you describe SQL and Non SQL connections (for example Excel, XML or text files). There can be as many connections as you want, not necessary the once you are going to use when running particular ETL scenario. Just keep them here, you might need them next time.
 +
 
 +
'''Example:'''
 +
<syntaxhighlight lang="xml">
 +
<connections>
 +
      <connection alias="test javadb">
 +
        <driver>org.apache.derby.jdbc.EmbeddedDriver</driver>
 +
        <url>jdbc:derby:./data-test/javadb</url>
 +
      </connection>
 +
 
 +
  <connection alias="test oracle">
 +
      <driver>oracle.jdbc.driver.OracleDriver </driver>
 +
      <url>jdbc:oracle:thin:@localhost:1521:orcl1</url>
 +
      <userid>user</userid>
 +
      <password>password</password> 
 +
  </connection>
 +
</connections>
 +
</syntaxhighlight>
 +
 
 +
==== Attributes of the connection node: ====
 +
{| class="wikitable"
 +
|-
 +
! Name !! Description !! Attribute or node !! Example
 +
|-
 +
| alias || The alias name || Attribute || <pre>alias="test javadb"</pre>
 +
|-
 +
| driver || JDBC driver class name || Node || <pre><driver>org.apache.derby.jdbc.EmbeddedDriver</driver>
 +
 
 +
Note: Can be empty for non SQL connections such as Excel, text, XML</pre>
 +
|-
 +
| url || The URL || Node || <pre><url>jdbc:derby:./data-test/javadb</url></pre>
 +
|-
 +
| connector || The connector class name || Node || <pre>Possible values:
 +
• empty or com.toolsverse.etl.connector.sql.SqlConnector for database connections
 +
• com.toolsverse.etl.connector.excel.ExcelConnector for Excel (*.xls) connections
 +
• com.toolsverse.etl.connector.excel.ExcelXlsxConnector  for Excel (*.xlsx) connections
 +
• com.toolsverse.etl.connector.xml.XmlObjectConnector for XML connections
 +
• com.toolsverse.etl.connector.xml.XmlConnector for connections to the Toolsverse dataset XML
 +
• com.toolsverse.etl.connector.text.TextConnector for Delimited and Fixed Length Text connections
 +
• com.toolsverse.etl.connector.json.JsonConnector for JSON connections
 +
• com.toolsverse.etl.connector.html.HtmlConnector for HTML connections
 +
• com.toolsverse.etl.connector.pdf.PdfConnector for PDF connections
 +
</pre>
 +
|-
 +
| autocommit || The auto commit flag for SQL connections (default=false) || Node|| <pre><autocommit>true</autocommit></pre>
 +
|-
 +
| transport || The transport protocol for file-based connections: file,ftp,sftp,http (default=file) || Node|| <pre><transport>ftp</transport></pre>
 +
|-
 +
| userid|| The user name for JDBC connection or remote FTP or SFTP connections || Node|| <pre><userid>user</userid></pre>
 +
|-
 +
| password|| The password for JDBC connection or remote FTP or SFTP connections|| Node|| <pre><password>password</password></pre>
 +
|-
 +
| params|| The connection properties. Use ';' as a delimiter|| Node|| <pre><params>SERVER=ol_svr_custom;DB=etl</params></pre>
 +
|-
 +
| sql|| The SQL which will be executed when connection established|| Node|| <pre><sql>insert into test (abc) values ('123')</sql></pre>
 +
|}
 +
 
 +
=== SQL Connections ===
 +
 
 +
Create SQL connection if you need to extract or load data into database which supports SQL. SQL connection uses JDBC or ODBC driver. Driver and url nodes are required when configuring database connection. Other attributes such as userid, password, params, and sql are optional. Connector attribute can be either omitted or set to com.toolsverse.etl.connector.sql.SqlConnector.
 +
 
 +
'''Example of the SQL connection:'''
 +
<syntaxhighlight lang="xml">
 +
<connection alias="test oracle">
 +
  <driver>oracle.jdbc.driver.OracleDriver </driver>
 +
  <url>jdbc:oracle:thin:@localhost:1521:orcl1</url>
 +
  <userid>user</userid>
 +
  <password>password</password> 
 +
</connection>
 +
</syntaxhighlight>
 +
 
 +
If url points to the file or folder in the file system system variables can be used as a part of the url.
 +
 
 +
=== Non SQL Connections ===
 +
 
 +
Non SQL connection are used to read data from and create file-based data sources such as XML, JSON, Excel, etc. Url is a required node, everything else is optional. Use '''params''' node to define connection parameters such as delimited for text files, worksheet for excel, etc. Use ";" to separate properties.
 +
 
 +
'''Example of the Non SQL connection:'''
 +
<syntaxhighlight lang="xml">
 +
<connection alias="test excel">
 +
  <url>{app.data}/test.xls</url>
 +
  <connector> com.toolsverse.etl.connector.excel.ExcelConnector</connector>
 +
  <params>sheetname=Employee;date=MMddyy;datetime=MMddyyyy;time=HH:mm</params>
 +
</connection>
 +
</syntaxhighlight>
 +
 
 +
'''Common properties for all file-based connections:'''
 +
{| class="wikitable"
 +
|-
 +
! Property !! Description !! Example !! Default
 +
|-
 +
| date || Date format || date=MMddyyyy || MM/dd/yyyy
 +
|-
 +
| datetime|| Date+time format || datetime=MM/dd/yyyy HH:mm || MM/dd/yyyy HH:mm
 +
|-
 +
| time|| Time format || time=HH:mm:ss || HH:mm
 +
|-
 +
| encoding|| Character encoding || encoding=utf-8 || System default for everything, utf-8 for XML
 +
|-
 +
| encode || Encode additional dimension || encode=true || flase
 +
|}
 +
 
 +
'''Excel properties:'''
 +
{| class="wikitable"
 +
|-
 +
! Property !! Description !! Example !! Default
 +
|-
 +
| sheetname || Worksheet name || sheetname=Test || None
 +
|}
 +
 
 +
'''Toolsverse XML dataset properties:'''
 +
{| class="wikitable"
 +
|-
 +
! Property !! Description !! Example !! Default
 +
|-
 +
| xsl|| Name of the xlst file  || <pre>xsl={app.root.data}/schema/webrowset2dataset.xsl</pre> || None
 +
|-
 +
| xslfrom || Name of the xlst file used to transform from other XML format to XML dataset  || <pre>xslfrom={app.root.data}/schema/dataset2webrowset.xsl</pre> || None
 +
|-
 +
| xslto|| Name of the xlst file used to transform to other XML format from XML dataset  || <pre>xslfrom={app.root.data}/schema/webrowset2dataset.xsl</pre> || None
 +
|}
 +
 
 +
'''Delimited text file properties:'''
 +
{| class="wikitable"
 +
|-
 +
! Property !! Description !! Example !! Default
 +
|-
 +
| delimiter|| The field delimiter  || <pre>delimiter=';'</pre> || '|'
 +
|-
 +
| firstrow|| Use first row for data  || <pre>firstrow=false</pre> || False
 +
|-
 +
| metadata|| Store metadata in XML dataset format || <pre>metadata=false</pre> || False
 +
|-
 +
| charseparator || The character used to enclose string values into || charseparator='"' || None
 +
|-
 +
| lineseparator|| The separator between lines || <pre>
 +
lineseparator=w
 +
 
 +
Possible values:
 +
• s – os default
 +
• w – windows
 +
• u - unix</pre> || s
 +
|}
 +
 
 +
'''Filex-lenth text file properties (in addition to the properties above):'''
 +
 
 +
{| class="wikitable"
 +
|-
 +
! Property !! Description !! Example !! Default
 +
|-
 +
| fields || The length of the each field in the dataset  || <pre>fields='6;12;15;8'</pre> || None
 +
|}
 +
 
 +
'''PDF file properties:'''
 +
 
 +
{| class="wikitable"
 +
|-
 +
! Property !! Description !! Example !! Default
 +
|-
 +
| pagesize || The page size  || <pre>pagesize='A4'</pre> || None
 +
|}
 +
 
 +
=== Active connections ===
 +
In this section we describe source and destination connections which are going to be used when running particular ETL scenario. There can be multiple source and destination connections. Connection can have a name which must be referenced from the ETL scenario. The default name for the source connection is '''source''' and for destination is '''dest'''.
 +
 
 +
'''Example of the single source and destination connections with default names:'''
 +
<syntaxhighlight lang="xml">
 +
<active.connections>
 +
  <sourses>
 +
    <source alias="test javadb" />
 +
  </sourses>
 +
  <destination alias="test oracle"/>
 +
</active.connections>
 +
</syntaxhighlight>
 +
 
 +
'''Example of the multiple source and destination connections with user’s defined names:'''
 +
<syntaxhighlight lang="xml">
 +
<active.connections>
 +
  <sourses>
 +
    <source alias="test excel" name="excel" />
 +
    <source alias="test javadb" name="javadb" />
 +
  </sourses>
 +
  <destinations>
 +
    <destination alias="test oracle" name="oracle" />
 +
    <destination alias="test xyz" name="xyzcon" />
 +
  </destinations>
 +
</active.connections>
 +
</syntaxhighlight>
 +
 
 +
'''Attributes:'''
 +
{| class="wikitable"
 +
|-
 +
! Attribute !! Description !! Example
 +
|-
 +
| alias|| The name of the alias. Must be the same as in the '''connections''' section|| <pre><source alias="test javadb" /></pre>
 +
|-
 +
| name|| The name of the connection. Name must be referenced from the ETL scenario|| <pre><destination alias="test xyz" name="xyzcon" /></pre>
 +
|}
 +
 
 +
=== Execute (scenarios) ===
 +
This section contains ETL scenarios which should be executed. You can execute multiple scenarios one by one or in parallel. If one of the scenarios fails the rest will be terminated as well.  Each scenario can be executed using different action.
 +
 
 +
'''Example  (single scenario):'''
 +
<syntaxhighlight lang="xml">
 +
<execute>
 +
  <scenario name="test.xml" action="extract_load" />
 +
</execute>
 +
</syntaxhighlight>
 +
 
 +
'''Example  (multiple scenarios, different actions):'''
 +
<syntaxhighlight lang="xml">
 +
<execute>
 +
  <scenario name="test1.xml" action="extract" />
 +
  <scenario name="test2.xml" action="load" />
 +
</execute>
 +
</syntaxhighlight>
 +
 
 +
'''Example  (multiple scenarios, executed in parallel):'''
 +
<syntaxhighlight lang="xml">
 +
<execute>
 +
  <scenario name="test1.xml" action="extract_load" parallel="true"/>
 +
  <scenario name="test2.xml" action=" extract_load" parallel="true" />
 +
</execute>
 +
</syntaxhighlight>
 +
 
 +
'''Attributes:'''
 +
{| class="wikitable"
 +
|-
 +
! Attribute !! Description !! Example
 +
|-
 +
| name || Scenario file name.
 +
 
 +
Note: If scenario file name does not have a folder it is expected to be in the APP_HOME/data/scenario folder.
 +
|| <pre>name="test1.xml"</pre>
 +
|-
 +
| action || <pre>The ETL action. Possible actions:
 +
• extract – only extract
 +
• load – only load
 +
• extract_ load – extract and load</pre>
 +
|| <pre>action="extract"</pre>
 +
|-
 +
| parallel || If set to true the scenario will be executed in the separate thread. Makes sense when there is more than one scenario to execute || <pre>parallel="true"</pre>
 +
|}
 +
 
 +
== Example of the ETL configuration file ==
 +
 
 +
<syntaxhighlight lang="xml">
 +
<?xml version="1.0" encoding="UTF-8"?>
 +
<config>
 +
  <properties>
 +
      <log.step>1000</log.step>
 +
  </properties>
 +
 
 +
  <connections>
 +
      <connection alias="test excel">
 +
        <url>{app.data}/test.xls</url>
 +
        <connector> com.toolsverse.etl.connector.excel.ExcelConnector</connector>
 +
        <params>sheetname=Employee;date=MMddyy;datetime=MMddyyyy;time=HH:mm</params>
 +
      </connection>
 +
 
 +
      <connection alias="test javadb">
 +
        <driver>org.apache.derby.jdbc.EmbeddedDriver</driver>
 +
        <url>jdbc:derby:{app.data}/javadb</url>
 +
      </connection>
 +
     
 +
    <connection alias="test oracle">
 +
        <driver>oracle.jdbc.driver.OracleDriver </driver>
 +
        <url>jdbc:oracle:thin:@localhost:1521:orcl1</url>
 +
        <userid>user</userid>
 +
        <password>password</password> 
 +
    </connection>
 +
  </connections>
 +
 
 +
  <active.connections>
 +
      <sourses>
 +
        <source alias="test excel" name="excel" />
 +
        <source alias="test javadb" name="derby" />
 +
      </sourses>
 +
      <destination alias="test oracle"/>
 +
  </active.connections>
 +
  <execute>
 +
      <scenario name="test.xml" action="extract_load" />
 +
  </execute>
 +
</config>
 +
</syntaxhighlight>
 +
 
 +
In this example test.xml ETL scenario located under the {app.data}/scenario folder will be executed using extract_load action. Connections excel and derby which linked to the aliases test excel and test javadb will be used as a source connections. Connection linked to the alias test oracle will be used as a destination connection.  ETL framework is set to log every 1000 extracted or loaded records.

Latest revision as of 16:56, 5 August 2014

ETL Framework configuration file

ETL configuration file is used to define ETL properties, what ETL scenarios to run and what connections to use. It is located under APP_HOME/config. The default name is etl_config.xml. You can change name of the ETL configuration file using -D property etl.config.name.

Example:
-Detl.config.name=c:/etl/config/test.xml

File contains the following elements:

Properties

This section includes Java system properties which can be used by ETL Framework.

Example:

<properties>
   <log.step>777</log.step>
   <cache>com.toolsverse.cache.MemoryCache</cache>
   <oracle.oraclehome>c:/oracle</oracle.oraclehome>
</properties>

List of default properties:

Property Description Default value Example
log.step Log step is a how many rows of the data set to skip until log the event. For example if "log step" property set to 10 while iterating through the rows the ETL process will log every 10 rows.

Note: to log this kind of events the log level must be set to INFO in the log4j.properties located under APP_HOME/config

0 (never)
<log.step>777</log.step>
cache Defines the class which is used by ETL framework to cache values such as primary and foreign keys.

Note: It must implement com.toolsverse.cache. Cache interface

com.toolsverse.cache.SynchMemoryCache
<cache>com.toolsverse.cache.MemoryCache</cache>
connection.factory Defines the class which is used to create connections used by ETL framework. Note: It must implement com.toolsverse.etl.core.connection.EtlConnectionFactory interface com.toolsverse.etl.core.connection.EtlConnectionFactoryImpl
<connection.factory>abc</connection.factory>
oracle.oraclehome The Oracle client home
<oracle.oraclehome>c:/oracle</oracle.oraclehome>
db2.db2home The DB2 client home
<db2.db2home>c:/db2</db2.db2home>
mssql.mssqlhome The MS SQL Server client home
<mssql.mssqlhome>c:/sqlserver</mssql.mssqlhome>
sybase.sybasehome The Sybase client home
<sybase.sybasehome>c:/ase</sybase.sybasehome>
mysql.mysqlhome The MySQL client home
<mysql.mysqlhome>c:/mysql</mysql.mysqlhome>
postgres.postgreshome The PostgreSQL client home
<postgres.postgreshome>c:/postgres</postgres.postgreshome>
informix.informixhome The Informix client home
<informix.home>c:/informix</informix.home>

Connections

In this section you describe SQL and Non SQL connections (for example Excel, XML or text files). There can be as many connections as you want, not necessary the once you are going to use when running particular ETL scenario. Just keep them here, you might need them next time.

Example:

<connections>
      <connection alias="test javadb">
         <driver>org.apache.derby.jdbc.EmbeddedDriver</driver>
         <url>jdbc:derby:./data-test/javadb</url>
      </connection>
 
   <connection alias="test oracle">
      <driver>oracle.jdbc.driver.OracleDriver </driver>
      <url>jdbc:oracle:thin:@localhost:1521:orcl1</url>
      <userid>user</userid>
      <password>password</password>   
   </connection>
</connections>

Attributes of the connection node:

Name Description Attribute or node Example
alias The alias name Attribute
alias="test javadb"
driver JDBC driver class name Node
<driver>org.apache.derby.jdbc.EmbeddedDriver</driver>

Note: Can be empty for non SQL connections such as Excel, text, XML
url The URL Node
<url>jdbc:derby:./data-test/javadb</url>
connector The connector class name Node
Possible values:
• empty or com.toolsverse.etl.connector.sql.SqlConnector for database connections
• com.toolsverse.etl.connector.excel.ExcelConnector for Excel (*.xls) connections
• com.toolsverse.etl.connector.excel.ExcelXlsxConnector  for Excel (*.xlsx) connections
• com.toolsverse.etl.connector.xml.XmlObjectConnector for XML connections
• com.toolsverse.etl.connector.xml.XmlConnector for connections to the Toolsverse dataset XML
• com.toolsverse.etl.connector.text.TextConnector for Delimited and Fixed Length Text connections
• com.toolsverse.etl.connector.json.JsonConnector for JSON connections
• com.toolsverse.etl.connector.html.HtmlConnector for HTML connections
• com.toolsverse.etl.connector.pdf.PdfConnector for PDF connections
autocommit The auto commit flag for SQL connections (default=false) Node
<autocommit>true</autocommit>
transport The transport protocol for file-based connections: file,ftp,sftp,http (default=file) Node
<transport>ftp</transport>
userid The user name for JDBC connection or remote FTP or SFTP connections Node
<userid>user</userid>
password The password for JDBC connection or remote FTP or SFTP connections Node
<password>password</password>
params The connection properties. Use ';' as a delimiter Node
<params>SERVER=ol_svr_custom;DB=etl</params>
sql The SQL which will be executed when connection established Node
<sql>insert into test (abc) values ('123')</sql>

SQL Connections

Create SQL connection if you need to extract or load data into database which supports SQL. SQL connection uses JDBC or ODBC driver. Driver and url nodes are required when configuring database connection. Other attributes such as userid, password, params, and sql are optional. Connector attribute can be either omitted or set to com.toolsverse.etl.connector.sql.SqlConnector.

Example of the SQL connection:

<connection alias="test oracle">
   <driver>oracle.jdbc.driver.OracleDriver </driver>
   <url>jdbc:oracle:thin:@localhost:1521:orcl1</url>
   <userid>user</userid>
   <password>password</password>   
</connection>

If url points to the file or folder in the file system system variables can be used as a part of the url.

Non SQL Connections

Non SQL connection are used to read data from and create file-based data sources such as XML, JSON, Excel, etc. Url is a required node, everything else is optional. Use params node to define connection parameters such as delimited for text files, worksheet for excel, etc. Use ";" to separate properties.

Example of the Non SQL connection:

<connection alias="test excel">
   <url>{app.data}/test.xls</url>
   <connector> com.toolsverse.etl.connector.excel.ExcelConnector</connector>
   <params>sheetname=Employee;date=MMddyy;datetime=MMddyyyy;time=HH:mm</params>
</connection>

Common properties for all file-based connections:

Property Description Example Default
date Date format date=MMddyyyy MM/dd/yyyy
datetime Date+time format datetime=MM/dd/yyyy HH:mm MM/dd/yyyy HH:mm
time Time format time=HH:mm:ss HH:mm
encoding Character encoding encoding=utf-8 System default for everything, utf-8 for XML
encode Encode additional dimension encode=true flase

Excel properties:

Property Description Example Default
sheetname Worksheet name sheetname=Test None

Toolsverse XML dataset properties:

Property Description Example Default
xsl Name of the xlst file
xsl={app.root.data}/schema/webrowset2dataset.xsl
None
xslfrom Name of the xlst file used to transform from other XML format to XML dataset
xslfrom={app.root.data}/schema/dataset2webrowset.xsl
None
xslto Name of the xlst file used to transform to other XML format from XML dataset
xslfrom={app.root.data}/schema/webrowset2dataset.xsl
None

Delimited text file properties:

Property Description Example Default
delimiter The field delimiter
delimiter=';'
'
firstrow Use first row for data
firstrow=false
False
metadata Store metadata in XML dataset format
metadata=false
False
charseparator The character used to enclose string values into charseparator='"' None
lineseparator The separator between lines
lineseparator=w

Possible values:
• s – os default
• w – windows
• u - unix
s

Filex-lenth text file properties (in addition to the properties above):

Property Description Example Default
fields The length of the each field in the dataset
fields='6;12;15;8'
None

PDF file properties:

Property Description Example Default
pagesize The page size
pagesize='A4'
None

Active connections

In this section we describe source and destination connections which are going to be used when running particular ETL scenario. There can be multiple source and destination connections. Connection can have a name which must be referenced from the ETL scenario. The default name for the source connection is source and for destination is dest.

Example of the single source and destination connections with default names:

<active.connections>
  <sourses>
     <source alias="test javadb" />
  </sourses>
  <destination alias="test oracle"/> 
</active.connections>

Example of the multiple source and destination connections with user’s defined names:

<active.connections>
  <sourses>
     <source alias="test excel" name="excel" />
     <source alias="test javadb" name="javadb" />
  </sourses>
  <destinations> 
     <destination alias="test oracle" name="oracle" />
     <destination alias="test xyz" name="xyzcon" />
  </destinations> 
</active.connections>

Attributes:

Attribute Description Example
alias The name of the alias. Must be the same as in the connections section
<source alias="test javadb" />
name The name of the connection. Name must be referenced from the ETL scenario
<destination alias="test xyz" name="xyzcon" />

Execute (scenarios)

This section contains ETL scenarios which should be executed. You can execute multiple scenarios one by one or in parallel. If one of the scenarios fails the rest will be terminated as well. Each scenario can be executed using different action.

Example (single scenario):

<execute>
   <scenario name="test.xml" action="extract_load" />
</execute>

Example (multiple scenarios, different actions):

<execute>
   <scenario name="test1.xml" action="extract" />
   <scenario name="test2.xml" action="load" />
</execute>

Example (multiple scenarios, executed in parallel):

<execute>
   <scenario name="test1.xml" action="extract_load" parallel="true"/>
   <scenario name="test2.xml" action=" extract_load" parallel="true" />
</execute>

Attributes:

Attribute Description Example
name Scenario file name.

Note: If scenario file name does not have a folder it is expected to be in the APP_HOME/data/scenario folder.

name="test1.xml"
action
The ETL action. Possible actions:
• extract – only extract
• load – only load
• extract_ load – extract and load
action="extract"
parallel If set to true the scenario will be executed in the separate thread. Makes sense when there is more than one scenario to execute
parallel="true"

Example of the ETL configuration file

<?xml version="1.0" encoding="UTF-8"?>
<config>
   <properties>
      <log.step>1000</log.step>
   </properties>
 
   <connections>
      <connection alias="test excel">
         <url>{app.data}/test.xls</url>
         <connector> com.toolsverse.etl.connector.excel.ExcelConnector</connector>
         <params>sheetname=Employee;date=MMddyy;datetime=MMddyyyy;time=HH:mm</params>
      </connection>
 
      <connection alias="test javadb">
         <driver>org.apache.derby.jdbc.EmbeddedDriver</driver>
         <url>jdbc:derby:{app.data}/javadb</url>
      </connection>
 
     <connection alias="test oracle">
         <driver>oracle.jdbc.driver.OracleDriver </driver>
         <url>jdbc:oracle:thin:@localhost:1521:orcl1</url>
         <userid>user</userid>
         <password>password</password>   
     </connection>
   </connections>
 
   <active.connections>
      <sourses>
         <source alias="test excel" name="excel" />
         <source alias="test javadb" name="derby" />
      </sourses>
      <destination alias="test oracle"/> 
   </active.connections>
   <execute>
       <scenario name="test.xml" action="extract_load" />
   </execute>
</config>

In this example test.xml ETL scenario located under the {app.data}/scenario folder will be executed using extract_load action. Connections excel and derby which linked to the aliases test excel and test javadb will be used as a source connections. Connection linked to the alias test oracle will be used as a destination connection. ETL framework is set to log every 1000 extracted or loaded records.