Difference between revisions of "Configuration file"
(→Non SQL Connections) |
(→Non SQL Connections) |
||
Line 166: | Line 166: | ||
! Property !! Description !! Example !! Default | ! Property !! Description !! Example !! Default | ||
|- | |- | ||
− | | xsl|| Name of the xlst file || <pre>xsl={app.root.data}/schema/webrowset2dataset.xsl<pre> || None | + | | xsl|| Name of the xlst file || <pre>xsl={app.root.data}/schema/webrowset2dataset.xsl</pre> || None |
|- | |- | ||
− | | xslfrom || Name of the xlst file used to transform from other XML format to XML dataset || <pre>xslfrom={app.root.data}/schema/dataset2webrowset.xsl<pre> || None | + | | xslfrom || Name of the xlst file used to transform from other XML format to XML dataset || <pre>xslfrom={app.root.data}/schema/dataset2webrowset.xsl</pre> || None |
|- | |- | ||
− | | xslto|| Name of the xlst file used to transform to other XML format from XML dataset || <pre>xslfrom={app.root.data}/schema/webrowset2dataset.xsl<pre> || None | + | | xslto|| Name of the xlst file used to transform to other XML format from XML dataset || <pre>xslfrom={app.root.data}/schema/webrowset2dataset.xsl</pre> || None |
|} | |} | ||
Line 178: | Line 178: | ||
! Property !! Description !! Example !! Default | ! Property !! Description !! Example !! Default | ||
|- | |- | ||
− | | delimiter|| The field delimiter || <pre>delimiter=';'<pre> || '|' | + | | delimiter|| The field delimiter || <pre>delimiter=';'</pre> || '|' |
|- | |- | ||
− | | firstrow|| Use first row for data || <pre>firstrow=false<pre> || False | + | | firstrow|| Use first row for data || <pre>firstrow=false</pre> || False |
|- | |- | ||
− | | metadata|| Store metadata in XML dataset format || <pre>metadata=false<pre> || False | + | | metadata|| Store metadata in XML dataset format || <pre>metadata=false</pre> || False |
|- | |- | ||
| charseparator || The character used to enclose string values into || charseparator='"' || None | | charseparator || The character used to enclose string values into || charseparator='"' || None | ||
− | + | |- | |
+ | | lineseparator|| The separator between lines || lineseparator=w. | ||
+ | <pre> | ||
+ | Possible values: | ||
+ | • s – os default | ||
+ | • w – windows | ||
+ | • u - unix</pre> || s | ||
|} | |} |
Revision as of 14:53, 31 July 2014
Contents
ETL Framework configuration file
ETL configuration file tells ETL framework what connections to use and what ETL scenarios to run. It is located under APP_HOME/config.
For example: c:/etl/config/etl_config.xml.
File contains the following elements:
Properties
This section includes Java system properties which can be used by ETL Framework.
Example:
<properties> <log.step>777</log.step> <cache>com.toolsverse.cache.MemoryCache</cache> <oracle.oraclehome>c:/oracle</oracle.oraclehome> </properties>
List of default properties:
Property | Description | Default value | Example |
---|---|---|---|
log.step | Log step is a how many rows of the data set to skip until log the event. For example if "log step" property set to 10 while iterating through the rows the ETL process will log every 10 rows.
Note: to log this kind of events the log level must be set to INFO in the log4j.properties located under APP_HOME/config |
0 (never) | <log.step>777</log.step> |
cache | Defines the class which is used by ETL framework to cache values such as primary and foreign keys.
Note: It must implement com.toolsverse.cache. Cache interface |
com.toolsverse.cache.SynchMemoryCache | <cache>com.toolsverse.cache.MemoryCache</cache> |
connection.factory | Defines the class which is used to create connections used by ETL framework. Note: It must implement com.toolsverse.etl.core.connection.EtlConnectionFactory interface | com.toolsverse.etl.core.connection.EtlConnectionFactoryImpl | <connection.factory>abc</connection.factory> |
oracle.oraclehome | The Oracle client home | <oracle.oraclehome>c:/oracle</oracle.oraclehome> | |
db2.db2home | The DB2 client home | <db2.db2home>c:/db2</db2.db2home> | |
mssql.mssqlhome | The MS SQL Server client home | <mssql.mssqlhome>c:/sqlserver</mssql.mssqlhome> | |
sybase.sybasehome | The Sybase client home | <sybase.sybasehome>c:/ase</sybase.sybasehome> | |
mysql.mysqlhome | The MySQL client home | <mysql.mysqlhome>c:/mysql</mysql.mysqlhome> | |
postgres.postgreshome | The PostgreSQL client home | <postgres.postgreshome>c:/postgres</postgres.postgreshome> | |
informix.informixhome | The Informix client home | <informix.home>c:/informix</informix.home> |
Connections
In this section you describe SQL and Non SQL connections (for example Excel, XML or text files). There can be as many connections as you want, not necessary the once you are going to use when running particular ETL scenario. Just keep them here, you might need them next time.
Example:
<connections> <connection alias="test javadb"> <driver>org.apache.derby.jdbc.EmbeddedDriver</driver> <url>jdbc:derby:./data-test/javadb</url> </connection> <connection alias="test oracle"> <driver>oracle.jdbc.driver.OracleDriver </driver> <url>jdbc:oracle:thin:@localhost:1521:orcl1</url> <userid>user</userid> <password>password</password> </connection> </connections>
Attributes of the connection node:
Name | Description | Attribute or node | Example |
---|---|---|---|
alias | The alias name | Attribute | alias="test javadb" |
driver | JDBC driver class name | Node | <driver>org.apache.derby.jdbc.EmbeddedDriver</driver> Note: Can be empty for non SQL connections such as Excel, text, XML |
url | The URL | Node | <url>jdbc:derby:./data-test/javadb</url> |
connector | The connector class name | Node | Possible values: • empty or com.toolsverse.etl.connector.sql.SqlConnector for database connections • com.toolsverse.etl.connector.excel.ExcelConnector for Excel (*.xls) connections • com.toolsverse.etl.connector.excel.ExcelXlsxConnector for Excel (*.xlsx) connections • com.toolsverse.etl.connector.xml.XmlObjectConnector for XML connections • com.toolsverse.etl.connector.xml.XmlConnector for connections to the Toolsverse dataset XML • com.toolsverse.etl.connector.text.TextConnector for Delimited and Fixed Length Text connections • com.toolsverse.etl.connector.json.JsonConnector for JSON connections • com.toolsverse.etl.connector.html.HtmlConnector for HTML connections • com.toolsverse.etl.connector.pdf.PdfConnector for PDF connections |
autocommit | The auto commit flag for SQL connections (default=false) | Node | <autocommit>true</autocommit> |
transport | The transport protocol for file-based connections: file,ftp,sftp,http (default=file) | Node | <transport>ftp</transport> |
userid | The user name for JDBC connection or remote FTP or SFTP connections | Node | <userid>user</userid> |
password | The password for JDBC connection or remote FTP or SFTP connections | Node | <password>password</password> |
params | The connection properties. Use ';' as a delimiter | Node | <params>SERVER=ol_svr_custom;DB=etl</params> |
sql | The SQL which will be executed when connection established | Node | <sql>insert into test (abc) values ('123')</sql> |
SQL Connections
Create SQL connection if you need to extract or load data into database which supports SQL. SQL connection uses JDBC or ODBC driver. Driver and url nodes are required when configuring database connection. Other attributes such as userid, password, params, and sql are optional. Connector attribute can be either omitted or set to com.toolsverse.etl.connector.sql.SqlConnector.
Example of the SQL connection:
<connection alias="test oracle"> <driver>oracle.jdbc.driver.OracleDriver </driver> <url>jdbc:oracle:thin:@localhost:1521:orcl1</url> <userid>user</userid> <password>password</password> </connection>
If url points to the file or folder in the file system system variables can be used as a part of the url.
Non SQL Connections
Non SQL connection are used to read data from and create file-based data sources such as XML, JSON, Excel, etc. Url is a required node, everything else is optional. Use params node to define connection parameters such as delimited for text files, worksheet for excel, etc. Use ";" to separate properties.
Example of the Non SQL connection:
<connection alias="test excel"> <url>{app.data}/test.xls</url> <connector> com.toolsverse.etl.connector.excel.ExcelConnector</connector> <params>sheetname=Employee;date=MMddyy;datetime=MMddyyyy;time=HH:mm</params> </connection>
Common properties for all file-based connections:
Property | Description | Example | Default |
---|---|---|---|
date | Date format | date=MMddyyyy | MM/dd/yyyy |
datetime | Date+time format | datetime=MM/dd/yyyy HH:mm | MM/dd/yyyy HH:mm |
time | Time format | time=HH:mm:ss | HH:mm |
encoding | Character encoding | encoding=utf-8 | System default for everything, utf-8 for XML |
encode | Encode additional dimension | encode=true | flase |
Excel properties:
Property | Description | Example | Default |
---|---|---|---|
sheetname | Worksheet name | sheetname=Test | None |
Toolsverse XML dataset properties:
Property | Description | Example | Default |
---|---|---|---|
xsl | Name of the xlst file | xsl={app.root.data}/schema/webrowset2dataset.xsl |
None |
xslfrom | Name of the xlst file used to transform from other XML format to XML dataset | xslfrom={app.root.data}/schema/dataset2webrowset.xsl |
None |
xslto | Name of the xlst file used to transform to other XML format from XML dataset | xslfrom={app.root.data}/schema/webrowset2dataset.xsl |
None |
Delimited text file properties:
Property | Description | Example | Default |
---|---|---|---|
delimiter | The field delimiter | delimiter=';' |
' |
firstrow | Use first row for data | firstrow=false |
False |
metadata | Store metadata in XML dataset format | metadata=false |
False |
charseparator | The character used to enclose string values into | charseparator='"' | None |
lineseparator | The separator between lines | lineseparator=w.
Possible values: • s – os default • w – windows • u - unix|| s |