Difference between revisions of "Running ETL Scenarios"

From Toolsverse Knowledge Base
Latest revision as of 23:03, 8 February 2015

Embedding ETL scenario into existing Java application

There are three easy steps:

  1. Instantiate EtlConfig
  2. Instantiate EtlProcess
  3. Execute scenario

Example:

public static void main(String[] args)
{
        LoadScenarioAndConfigurationFromFile engine = new LoadScenarioAndConfigurationFromFile();
 
        try
        {
            // instantiates ETL configuration
            EtlConfig etlConfig = new EtlConfig();
 
            // set log level to INFO which increases verbosity of the etl engine
            Logger.setLevel(EtlLogger.class, Logger.INFO);
 
            // creates embedded ETL process
            EtlProcess etlProcess = new EtlProcess(EtlProcess.EtlMode.EMBEDDED);
 
            // print out framework version
            System.out.println(SystemConfig.instance().getTitle(
                    EtlConfig.DEFAULT_TITLE)
                    + " "
                    + SystemConfig.instance().getSystemProperty(
                            SystemConfig.VERSION));
 
            // Load the configuration, which contains the source and destination
            // connections and the ETL scenario name, then load and execute the
            // ETL scenario. If no full path is provided, the configuration file
            // test_etl_config.xml is expected to be under app_home/config.
            EtlResponse response = engine.loadConfigAndExecute(etlConfig,
                    "test_etl_config.xml", etlProcess);
 
            // print out formatted output from the ETL response
            System.out.println(engine.getMessage(response,
                    "Examples/Engine/db2file.xml"));
 
        }
        catch (Exception ex)
        {
            System.out.println(Utils.getStackTraceAsString(ex));
        }
 
        System.exit(0);
}

Look at other examples of embedding the ETL engine into a Java application.

Using Web service to run ETL scenario

To run an ETL scenario using the Web service, you need the ETL server installed, configured, and running. See http://www.toolsverse.com/products/data-explorer/docs/doc.html#_Installing_Server for how to install and configure the ETL server.

You will also need to specify the -Dnetwork.appserverurl JVM option.

Example:

-Dnetwork.appserverurl=http://host:port/dataexplorer/ide

You can set network.appserverurl in ETL_FRAMEWORK_HOME/config/etl.properties or set it programmatically:

network.appserverurl=http://localhost:8080/dataexplorer/ide
network.appupdateurl=http://www.toolsverse.com/api/services/CheckForUpdates
app.update.key=etl
app.name=etlprocess
app.title=ETL Framework
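
The programmatic route can be sketched with plain JVM system properties. This is a minimal sketch, not the framework's own API: the class ServerUrlDefaults and the method resolveServerUrl are hypothetical names, and the precedence used here (a -D option first, then the properties text, then a fallback) is an assumption made for illustration. It also assumes the framework picks the value up from the JVM system properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of the ETL Framework.
class ServerUrlDefaults {

    /** Property name taken from the -D option shown above. */
    static final String SERVER_URL_KEY = "network.appserverurl";

    /**
     * Resolves the server URL from text in etl.properties format.
     * Assumed precedence: JVM system property, then the file, then the fallback.
     */
    static String resolveServerUrl(String propertiesText, String fallback) {
        Properties fileProps = new Properties();
        try {
            fileProps.load(new StringReader(propertiesText));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }

        // a value supplied with -D on the command line wins
        String fromJvm = System.getProperty(SERVER_URL_KEY);
        if (fromJvm != null && !fromJvm.trim().isEmpty()) {
            return fromJvm.trim();
        }
        return fileProps.getProperty(SERVER_URL_KEY, fallback).trim();
    }

    public static void main(String[] args) {
        String text = "network.appserverurl=http://localhost:8080/dataexplorer/ide\n"
                + "app.name=etlprocess\n";
        String url = resolveServerUrl(text, "http://localhost:8080/dataexplorer/ide");

        // publish the resolved value as a JVM system property
        System.setProperty(SERVER_URL_KEY, url);
        System.out.println(SERVER_URL_KEY + "=" + url);
    }
}
```

The Web service example later on this page does the equivalent through SystemConfig, setting SystemConfig.SERVER_URL only when no value is configured.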


Once the server is configured, using the Web service is as easy as:

  1. Instantiating EtlConfig and adding all required connection information
  2. Instantiating and initializing EtlRequest
  3. Instantiating the EtlService interface using ServiceFactory and the appropriate dynamic proxy
  4. Calling EtlService#executeEtl

Example:

/**
 * Creates connection aliases for ETL process, sets scenario name, remotely executes ETL scenario.
 * 
 * @return ETL response
 * @throws Exception in case of any error
 */
private EtlResponse execute()
        throws Exception
{
        // initializes system config, loads properties
        SystemConfig.instance();
 
        // instantiates ETL config
        EtlConfig config = new EtlConfig();
 
        // initializes ETL config
        config.init();
 
        // creates source alias
        Alias source = new Alias();
        source.setName("Java DB");
        source.setUrl("jdbc:derby:{app.root.data}/demo/javadb");
        source.setJdbcDriverClass("org.apache.derby.jdbc.EmbeddedDriver");
 
        // creates destination alias
        Alias destination = new Alias();
        destination.setName("JSON files");
        destination
                .setConnectorClassName("com.toolsverse.etl.connector.json.JsonConnector");
        destination.setUrl("{app.root.data}/*.json");
 
        // adds aliases. ETL process will create connections from these aliases
        config.addAliasToMap(EtlConfig.SOURCE_CONNECTION_NAME, source);
        config.addAliasToMap(EtlConfig.DEST_CONNECTION_NAME, destination);
 
        // creates empty ETL scenario, sets ETL action. The remote ETL process
        // will load it from file at the run-time.
        Scenario scenario = new Scenario();
        scenario.setName("Examples/Engine/db2file.xml");
        scenario.setAction(EtlConfig.EXTRACT_LOAD);
 
        // creates ETL request using given config, scenario and log level
        EtlRequest request = new EtlRequest(config, scenario, Logger.INFO);
 
        // falls back to the local server URL when none is configured
        if (Utils.isNothing(SystemConfig.instance().getSystemProperty(
                SystemConfig.SERVER_URL)))
        {
            SystemConfig.instance().setSystemProperty(SystemConfig.SERVER_URL,
                    "http://localhost:8080/dataexplorer/ide");
        }
 
        // gets ETL service from the factory. The ServiceProxyWeb used as a
        // dynamic proxy
        EtlService etlService = ServiceFactory.getService(EtlService.class,
                ServiceProxyWeb.class.getName());
 
        // remotely executes ETL process
        return etlService.executeEtl(request);
}

Running ETL scenario using standalone executable

  1. Open the APP_HOME/config/etl_config.xml file in your favorite text editor.
  2. Add connections for the particular ETL scenario.
  3. Specify the connections to use and the scenarios to run.
  4. Save the file.
  5. Run the ETL executable, for example c:\etl\etl.exe on Windows.
  6. When it finishes, check the etl.log file located under APP_HOME/logs.

Example of the etl_config.xml:

<?xml version="1.0" encoding="UTF-8"?>
<config>
   <connections>
      <connection alias="test javadb">
         <driver>org.apache.derby.jdbc.EmbeddedDriver</driver>
         <url>jdbc:derby:{app.root.data}/demo/javadb</url>
      </connection>
      <connection alias="test oracle">
         <driver>oracle.jdbc.driver.OracleDriver</driver>
         <url>jdbc:oracle:thin:@localhost:1521:orcl1</url>
         <userid>user</userid>
         <password>password</password>   
         <params/>
      </connection>
   </connections>
 
   <active.connections>
      <sourses>
         <source alias="test javadb" />
      </sourses>
      <destination alias="test oracle"/> 
   </active.connections>
   <execute>
       <scenario name="move.xml" action="extract_load" />
   </execute>
</config>

In this example, the move.xml ETL scenario located under the {app.data}/scenario folder is executed using the extract_load action. The alias test javadb is used for the source connection and the alias test oracle for the destination.
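
Before running the executable, it can help to confirm that every alias named under active.connections is also defined under connections. The following is a minimal sketch using only the JDK's DOM parser. The class EtlConfigCheck and its method are hypothetical and not part of the ETL Framework. The element and attribute names are copied from the example above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the ETL Framework.
class EtlConfigCheck {

    /** Returns aliases referenced as source or destination but never defined. */
    static List<String> findUndefinedAliases(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            // aliases declared by <connection alias="..."> elements
            Set<String> defined = new HashSet<>();
            NodeList connections = doc.getElementsByTagName("connection");
            for (int i = 0; i < connections.getLength(); i++) {
                defined.add(((Element) connections.item(i)).getAttribute("alias"));
            }

            // aliases referenced by <source> and <destination> elements
            List<String> missing = new ArrayList<>();
            for (String tag : new String[] {"source", "destination"}) {
                NodeList refs = doc.getElementsByTagName(tag);
                for (int i = 0; i < refs.getLength(); i++) {
                    String alias = ((Element) refs.item(i)).getAttribute("alias");
                    if (!defined.contains(alias)) {
                        missing.add(alias);
                    }
                }
            }
            return missing;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Cannot parse configuration", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<config>"
                + "<connections>"
                + "<connection alias=\"test javadb\"/>"
                + "<connection alias=\"test oracle\"/>"
                + "</connections>"
                + "<active.connections>"
                + "<sourses><source alias=\"test javadb\"/></sourses>"
                + "<destination alias=\"test oracle\"/>"
                + "</active.connections>"
                + "</config>";
        System.out.println("Undefined aliases: " + findUndefinedAliases(xml));
    }
}
```

An empty list means every active connection refers to a defined alias. The check does not validate drivers, URLs, or the scenario file itself.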

Read more about the ETL Configuration File.

You can pass the properties file name (or any other -D property) as a command-line argument:

etl.exe -Dconfig.file.name=import_data.properties

You can also set the name of the XML configuration file (connections):

etl.exe -Detl.config.name=import_data_config.xml
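
If the executable is started from another Java program, the same command lines can be assembled with ProcessBuilder. This is a minimal sketch under stated assumptions: the class EtlLauncher, its buildCommand method, and the --run switch are hypothetical. Passing both properties in one invocation is assumed to work because any -D property is accepted. The meaning of the process exit code is not documented here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical launcher, not part of the ETL Framework.
class EtlLauncher {

    /** Builds the command: the executable followed by one -Dname=value per property. */
    static List<String> buildCommand(String executable, Map<String, String> properties) {
        List<String> command = new ArrayList<>();
        command.add(executable);
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            command.add("-D" + entry.getKey() + "=" + entry.getValue());
        }
        return command;
    }

    public static void main(String[] args) throws Exception {
        // LinkedHashMap keeps the properties in insertion order
        Map<String, String> props = new LinkedHashMap<>();
        props.put("config.file.name", "import_data.properties");
        props.put("etl.config.name", "import_data_config.xml");

        List<String> command = buildCommand("etl.exe", props);
        System.out.println(String.join(" ", command));

        // start the process only when explicitly requested
        if (args.length > 0 && args[0].equals("--run")) {
            Process process = new ProcessBuilder(command).inheritIO().start();
            System.out.println("Exit code: " + process.waitFor());
        }
    }
}
```

Without --run the program only prints the assembled command, which makes it safe to try on a machine where etl.exe is not installed.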

Finally, you can use the properties file to define the XML configuration file (connections):

# don't modify
network.appupdateurl=http://www.toolsverse.com/api/services/CheckForUpdates
app.update.key=etl
app.title=ETL Framework
etl.config.name=import_data_config.xml

Creating and running ETL scenario using Data Explorer

The best way to create, run, and schedule ETL, data integration, and data migration scenarios is to use Data Explorer (http://toolsverse.com/products/data-explorer/docs/doc.html#developetlscenarios), an integrated ETL IDE.