Difference between revisions of "Writing data integration and data migration scenarios"

From Toolsverse Knowledge Base
Jump to: navigation, search
Line 14: Line 14:
 
<?xml version="1.0" encoding="UTF-8"?>
 
<?xml version="1.0" encoding="UTF-8"?>
 
<scenario>
 
<scenario>
     <name>Automatically create tables and indexes</name>
+
     <name>Migrate data</name>
    <description>Creates tables if needed. Creates indexes when createing a table and adds suffix to the index name. Copies data</description>
+
     <script>migrate_date</script>
     <script>create_indexes</script>
+
 
     <driver>auto</driver>
 
     <driver>auto</driver>
 
     <sources>
 
     <sources>
 
           <source>
 
           <source>
 
               <name>employee</name>
 
               <name>employee</name>
              <tablename>EMPLOYEE</tablename> <!-- required for index creation -->
 
 
               <extract>
 
               <extract>
 
                     <sql>select * from employee</sql>
 
                     <sql>select * from employee</sql>
Line 29: Line 27:
 
           <source>
 
           <source>
 
               <name>emp_resume</name>
 
               <name>emp_resume</name>
              <tablename>EMP_RESUME</tablename> <!-- required for index creation -->
 
 
               <extract>
 
               <extract>
 
                     <sql>select * from emp_resume</sql>
 
                     <sql>select * from emp_resume</sql>
 
               </extract>
 
               </extract>
 
           </source>
 
           </source>
 
+
    </sources>
          <source>
+
              <name>emp_photo</name>
+
              <tablename>EMP_PHOTO</tablename> <!-- required for index creation -->
+
              <extract>
+
                    <sql>select * from emp_photo</sql>
+
              </extract>
+
          </source>
+
         
+
    </sources>
+
 
     <destinations>
 
     <destinations>
 
           <destination>
 
           <destination>
 
               <name>employee</name>
 
               <name>employee</name>
               <metadata indexes="true" suffix="_t"/> <!-- suffix is not a required attribute -->
+
               <metadata>true</metadata>
 
           </destination>
 
           </destination>
  
 
           <destination>
 
           <destination>
 
               <name>emp_resume</name>
 
               <name>emp_resume</name>
               <metadata indexes="true" suffix="_t"/> <!-- suffix is not a required attribute -->
+
               <metadata>true</metadata>
 
           </destination>
 
           </destination>
 
          <destination>
 
              <name>emp_photo</name>
 
              <metadata indexes="true" suffix="_t"/> <!-- suffix is not a required attribute -->
 
          </destination>
 
         
 
 
     </destinations>
 
     </destinations>
 
</scenario>
 
</scenario>
 
</syntaxhighlight>
 
</syntaxhighlight>
 +
 +
  
 
== Using scripting languages ==
 
== Using scripting languages ==
  
 
== Using SQL ==
 
== Using SQL ==

Revision as of 21:02, 27 July 2014

Basic stuff

Any data integration or data migration process can be described as an extract-transform-load (ETL) or extract-load-transform (ELT). With that is mind writing scenario becomes a process of splitting task on extracts, loads and transformations. Scenario glues them all together and adds purpose and logic.

[ETL engine] which powers Toolsverse products uses XML-based language to create scenarios. Before reading this page please take a look at ETL scenario language specification. XML was a natural choice because it enforces a structure: loads follow extracts and transformations follow extracts and loads. That said you can chose to stream data so extract and load combined in one logical operation and run things in parallel where actual order of extracts and loads is not guarantee.

ETL engine makes it possible to concentrate on task in hands without reinventing the wheel. It hides complexity of data integration so in most cases the same techniques and language artifacts are used to work with any SQL and Non SQL data sources. It is however possible to use the full power of the target database and do things like direct data load, anonymous SQL blocks, etc.

Simple data migration scenario

Let take a look at the simple data migration scenario:

<?xml version="1.0" encoding="UTF-8"?>
<scenario>
     <name>Migrate data</name>
     <script>migrate_date</script>
     <driver>auto</driver>
     <sources>
          <source>
               <name>employee</name>
               <extract>
                    <sql>select * from employee</sql>
               </extract>
          </source>
 
          <source>
               <name>emp_resume</name>
               <extract>
                    <sql>select * from emp_resume</sql>
               </extract>
          </source>
    </sources>
     <destinations>
          <destination>
               <name>employee</name>
               <metadata>true</metadata>
          </destination>
 
          <destination>
               <name>emp_resume</name>
               <metadata>true</metadata>
          </destination>
     </destinations>
</scenario>


Using scripting languages

Using SQL