Toolsverse - Data Management Tools

Use Cases

Project description: implement a repeatable migration process which copies a 200-table Microsoft SQL Server database to Oracle and HSQLDB databases

Challenges

Customer needed to develop a repeatable, high-performance ETL process which migrates data from the MS SQL Server database to the Oracle and HSQLDB databases. The ETL process copies data into existing Oracle and HSQLDB tables which include CLOB and BLOB columns. The average size of a source table was in the neighborhood of 10^5 rows, but there were ~30 tables with 10^6 rows or more.

Solution

Customer chose to use the ETL Framework and develop the ETL process entirely in Java, with no XML-based language involved. The key decision was to use the streaming ability of the ETL framework along with batch processing and bind variables. Customer developed an adaptive algorithm which controls how multiple extract and load threads are created by the ETL framework. Toolsverse helped the customer with examples and best practices. The ability to use bind variables was added to the framework at this customer's request.
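
For illustration only, here is a minimal sketch of the underlying batching-plus-bind-variables technique in plain JDBC, not the Toolsverse API; connection setup, table and column names are made up:

    import java.sql.*;

    public class BatchCopySketch {
        // Streams rows from a source table and loads them into an existing target
        // table using bind variables and JDBC batching to keep memory and SQL
        // parsing overhead low. Table and column names are placeholders.
        public static void copy(Connection source, Connection target) throws SQLException {
            target.setAutoCommit(false);
            String insert = "INSERT INTO target_table (id, payload, big_doc) VALUES (?, ?, ?)";
            try (Statement read = source.createStatement();
                 PreparedStatement write = target.prepareStatement(insert)) {
                read.setFetchSize(1000);                   // hint the driver to stream rows
                try (ResultSet rs = read.executeQuery(
                        "SELECT id, payload, big_doc FROM source_table")) {
                    int rows = 0;
                    while (rs.next()) {
                        write.setLong(1, rs.getLong(1));   // bind variables, no literal SQL per row
                        write.setString(2, rs.getString(2));
                        write.setCharacterStream(3, rs.getCharacterStream(3)); // CLOB as a stream
                        write.addBatch();
                        if (++rows % 1000 == 0) {
                            write.executeBatch();          // flush every 1000 rows
                        }
                    }
                    write.executeBatch();                  // flush the remainder
                }
                target.commit();
            }
        }
    }

The real framework adds thread management and adaptive sizing on top of this basic pattern.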


Project description: migrate a legacy application from a Microsoft SQL Server database to MySQL

Challenges

Customer needed to develop a data migration process to move data from a legacy application in Microsoft SQL Server to a cloud-based system with a MySQL database as the backend. One of the main requirements was that the ETL process must create tables in the destination MySQL database on the fly, using the source tables in Microsoft SQL Server as a pattern. Also, some destination columns had to be renamed and the values of some others had to be calculated.

Solution

Customer used the Data Explorer ETL edition to interactively develop ETL scenarios for this project using the ETL Designer. Customer used the ability of the ETL engine to create tables on the fly and built-in JavaScript to define functions which calculate column values. We added several usability enhancements to the ETL Designer at this customer's request.
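
As a rough illustration of the "create tables on the fly" idea (not the actual ETL engine code), the sketch below reads source column metadata through standard JDBC and emits a CREATE TABLE statement for the destination; real SQL Server to MySQL type mapping is more involved than shown here:

    import java.sql.*;

    public class CreateTableOnTheFlySketch {
        // Builds a CREATE TABLE statement for the destination database using the
        // source table as a pattern. The type mapping is deliberately naive and
        // only shows the overall approach.
        public static String buildCreateTable(Connection source, String table) throws SQLException {
            DatabaseMetaData meta = source.getMetaData();
            StringBuilder ddl = new StringBuilder("CREATE TABLE " + table + " (");
            try (ResultSet cols = meta.getColumns(null, null, table, "%")) {
                boolean first = true;
                while (cols.next()) {
                    if (!first) ddl.append(", ");
                    String typeName = cols.getString("TYPE_NAME");
                    ddl.append(cols.getString("COLUMN_NAME")).append(' ').append(typeName);
                    int size = cols.getInt("COLUMN_SIZE");
                    if (size > 0 && "VARCHAR".equalsIgnoreCase(typeName)) {
                        ddl.append('(').append(size).append(')');   // carry over the column size
                    }
                    first = false;
                }
            }
            return ddl.append(')').toString();
        }
    }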


Project description: implement a system which requests data from JSON-based Web services and updates an Oracle database

Challenges

Customer uses a system which requests data from external JSON-based Web services and produces CRUD updates for their cloud-based Oracle database. One response can include multiple updates which must be performed within a single transaction. Customer was using service-specific code to communicate with each Web service; they wanted a more unified approach, but most importantly they wanted to get out of the business of developing and supporting custom ETL code.

Solution

Toolsverse developed a solution which uses the ETL Framework deployed to an AWS instance running Linux, and we also created the ETL scenarios for this customer. The ETL is scheduled as a periodic cron job which can automatically notify their IT staff by email in case of any failure. We used the ability of the ETL framework to call Web services over HTTP and the built-in email notifications. JSON is one of the data formats natively supported by the ETL Framework.
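
A rough sketch of the overall flow, using the generic Java 11 HttpClient, Jackson and JDBC rather than the ETL Framework API; the endpoint URL and JSON field names are hypothetical:

    import java.net.URI;
    import java.net.http.*;
    import java.sql.*;
    import com.fasterxml.jackson.databind.*;

    public class WebServiceToOracleSketch {
        // Pulls one JSON response and applies every update it carries inside a
        // single database transaction, rolling back if any statement fails.
        public static void sync(Connection oracle) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(
                    URI.create("https://example.com/api/changes")).GET().build();
            String body = client.send(request, HttpResponse.BodyHandlers.ofString()).body();

            JsonNode updates = new ObjectMapper().readTree(body).get("updates");
            oracle.setAutoCommit(false);
            try (PreparedStatement update = oracle.prepareStatement(
                    "UPDATE customer SET status = ? WHERE customer_id = ?")) {
                for (JsonNode change : updates) {           // all updates from one response...
                    update.setString(1, change.get("status").asText());
                    update.setLong(2, change.get("id").asLong());
                    update.executeUpdate();
                }
                oracle.commit();                            // ...commit as a single transaction
            } catch (SQLException e) {
                oracle.rollback();
                throw e;
            }
        }
    }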


Project description: implement offline replication between Derby and DB2 databases

Challenges

Customer needed to implement offline, asymmetrical two-way replication between a Derby database used as the back end for their downtime solution and a DB2 database used for OLTP. They wanted a batch job executed periodically in their data center and controlled by their IT personnel. They were going to extract data into XML, send it over the wire and load it into the designated Derby database. Similarly, they wanted a completely automated and self-contained ETL process which can be initiated from the client to push data to the server when connectivity is restored.

Solution

Customer used the ETL Framework to implement the replication. They embedded the framework directly into the client part of the application and created a batch job to run in the data center. They used the ability of the ETL framework to extract and load data separately as two independent tasks. They chose the Toolsverse XML format for serialization since it includes both data and metadata, such as column data types. They also used the ability of the ETL framework to create zip files and copy files over FTP. The projected amount of data which needed to be extracted and loaded was in the neighborhood of hundreds of megabytes, so the ability of the framework to extract from multiple sources and load into multiple destinations in parallel threads was critical to success.
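
A simplified sketch of running per-table extracts in parallel threads, using plain java.util.concurrent rather than the framework's own thread management; extractTable is a hypothetical placeholder for the real extract step:

    import java.util.List;
    import java.util.concurrent.*;

    public class ParallelExtractSketch {
        // Runs one extract task per table on a bounded thread pool and waits for
        // all of them to finish. extractTable stands in for the real work
        // (query the source, serialize to XML, zip, push over FTP).
        public static void extractAll(List<String> tables, int threads) throws InterruptedException {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            for (String table : tables) {
                pool.submit(() -> extractTable(table));
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);   // wait for all extracts to complete
        }

        private static void extractTable(String table) {
            // placeholder: extract the table to XML, zip it and send it over FTP
            System.out.println("extracting " + table);
        }
    }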


Project description: implement a reporting system which produces Excel reports and sends them to a remote SFTP location

Challenges

Customer's IT personnel periodically created reports for upper management from the Oracle data warehouse. They were using Excel and were sending the reports directly to management by email. They wanted to make this process completely automated, and they wanted a Web-based dashboard with information about created reports and their status. They still wanted to use Excel as the report format.

Solution

Customer used the Data Explorer Enterprise edition deployed to a Tomcat app server to develop and schedule an ETL scenario which creates Excel reports by running extracts against the Oracle database. The scenario automatically updates an HTML dashboard directly on the Web server and notifies IT by email in case of any failure. Customer used the ability of the ETL engine to copy files over SFTP, so reports created on Tomcat are pushed to a Windows-based file server where they can be accessed by management. The ability to schedule and monitor ETL tasks was the key factor in choosing the Data Explorer Enterprise edition.
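
The product's built-in SFTP support handled the file push; purely as an illustration of that step, here is a generic JSch-based sketch where the host, credentials and paths are placeholders:

    import com.jcraft.jsch.*;

    public class SftpPushSketch {
        // Uploads a generated report to a remote file server over SFTP.
        // Host name, credentials and paths are placeholders.
        public static void push(String localReport, String remoteDir) throws JSchException, SftpException {
            JSch jsch = new JSch();
            Session session = jsch.getSession("reports", "fileserver.example.com", 22);
            session.setPassword("secret");                      // key-based auth in practice
            session.setConfig("StrictHostKeyChecking", "no");   // demo only; verify host keys in production
            session.connect();
            try {
                ChannelSftp sftp = (ChannelSftp) session.openChannel("sftp");
                sftp.connect();
                sftp.put(localReport, remoteDir);               // push the Excel report
                sftp.disconnect();
            } finally {
                session.disconnect();
            }
        }
    }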


Project description: implement a data integration process which aggregates multiple comma-delimited, XML and JSON files of different structure from several FTP sites, performs transformations, and creates a multi-dimensional JSON object which is then sent by email to the subscriber

Challenges

Customer needed a solution which would allow them to aggregate data from different heterogeneous file-based data sources. They wanted to create a single nested, multi-dimensional JSON object which they could send to the subscribers by email. Customer didn't want to stage data in a database, so they needed an in-memory solution which can perform set operations such as join, intersect and union on multiple datasets. They also needed a way to “assemble” a multi-dimensional JSON object from different pieces of information.

Solution

Customer used the Data Explorer Pro edition to design and create the ETL scenario. The scenario is executed by the ETL Framework deployed to a JBoss application server, which handles transactions, scheduling and error recovery. They used built-in transformations such as “join”, “intersect”, “union” and “add dimension” to assemble the object. They also used the ability of the ETL framework to stream data so it can easily handle large datasets. The framework was embedded in an existing J2EE application which, among other things, is responsible for emailing files to the subscribers.
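
As a toy illustration of assembling a nested ("multi-dimensional") JSON object from already-joined in-memory rows, similar in spirit to an "add dimension" step, here is a Jackson-based sketch; the field names are invented and this is not the built-in transformation itself:

    import java.util.*;
    import com.fasterxml.jackson.databind.*;
    import com.fasterxml.jackson.databind.node.*;

    public class NestedJsonSketch {
        // Groups flat child values under their parent key and emits one nested
        // JSON document.
        public static String assemble(Map<String, List<String>> ordersByCustomer) {
            ObjectMapper mapper = new ObjectMapper();
            ArrayNode customers = mapper.createArrayNode();
            for (Map.Entry<String, List<String>> entry : ordersByCustomer.entrySet()) {
                ObjectNode customer = customers.addObject();       // one node per customer
                customer.put("name", entry.getKey());
                ArrayNode orders = customer.putArray("orders");    // nested dimension
                entry.getValue().forEach(orders::add);
            }
            return customers.toPrettyString();
        }
    }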


Project description: implement a repeatable, interactive solution which can be used to read normalized data from an OLTP database and produce a denormalized report

Challenges

A database architect who works for a large university is responsible for creating relatively complicated reports for faculty members. Creating some of the reports involves writing SQL queries and denormalizing result sets so that multiple rows for the same combination of keys become a single row with multiple columns.

Solution

Customer is using the Data Explorer SQL edition to automate routine denormalization transformations. They save frequently used transformations and reuse them when needed. They run queries from Data Explorer and apply the built-in denormalization. After the transformation is applied, they export result sets to Excel. They also use the ability of Data Explorer to automatically split data into multiple files during export, so, for example, a different file can be created for each month of the year.
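
For illustration, a small sketch of the kind of denormalization involved (grouping rows that share a key and spreading the repeating values across one output row); it uses plain Java collections, not the Data Explorer transformation, and the sample data is made up:

    import java.util.*;

    public class DenormalizeSketch {
        // Turns rows like (studentId, course) into one entry per studentId with
        // a list of courses, i.e. the single wide row a report usually needs.
        public static Map<String, List<String>> pivot(List<String[]> rows) {
            Map<String, List<String>> byKey = new LinkedHashMap<>();
            for (String[] row : rows) {
                // row[0] is the key, row[1] is the repeating value
                byKey.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
            }
            return byKey;   // each entry becomes a single denormalized output row
        }

        public static void main(String[] args) {
            List<String[]> rows = List.of(
                new String[] {"1001", "Biology"},
                new String[] {"1001", "Chemistry"},
                new String[] {"1002", "History"});
            System.out.println(pivot(rows));   // {1001=[Biology, Chemistry], 1002=[History]}
        }
    }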


Project description: implement a repeatable ETL process which copies configuration data from the production database to the QA environment and from the golden schema to the production database

Challenges

One of the tasks which this medium-size development shop frequently faces is the need to copy configuration settings for their application back and forth between different databases. In most cases they need to merge data because there are existing referential constraints in the database which cannot be ignored.

Solution

Customer developed ETL scenarios for the different configuration domains. They manually execute the scenarios when needed from the Data Explorer Pro edition. When creating scenarios they use the ability of the framework to execute database-specific code (PL/SQL in this particular case). They also use column-level functions which can automatically generate new primary keys using sequences and update referential constraints accordingly.
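
As an illustration of the key-regeneration idea only (not the actual column-level functions), the sketch below pulls new primary keys from an Oracle sequence and keeps an old-to-new map so that dependent rows can later be re-pointed; the table and sequence names are invented:

    import java.sql.*;
    import java.util.*;

    public class KeyRemapSketch {
        // Assigns a fresh primary key from a sequence to every incoming
        // configuration row and records the old-to-new mapping so foreign keys
        // in dependent rows can be rewritten to preserve referential integrity.
        public static Map<Long, Long> remapKeys(Connection target, List<Long> oldIds) throws SQLException {
            Map<Long, Long> oldToNew = new HashMap<>();
            try (PreparedStatement next = target.prepareStatement(
                    "SELECT config_seq.NEXTVAL FROM dual")) {
                for (Long oldId : oldIds) {
                    try (ResultSet rs = next.executeQuery()) {
                        rs.next();
                        oldToNew.put(oldId, rs.getLong(1));   // new key for this row
                    }
                }
            }
            return oldToNew;   // later used to rewrite child foreign key columns
        }
    }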


Project description: an off-the-shelf solution which allows auditing which SQL scripts were executed, by whom, when, and with what outcome

Challenges

Customer needed a way to audit when particular SQL scripts are executed in particular environments. They wanted to know who executed a script and when, and to see the output log if it was available at the time of execution. Most importantly, they needed a replacement for the rather large selection of database development tools used across the organization. One of the requirements was that different developers should have different levels of access to the different environments, or no access at all. They also needed the ability to maintain a shared library of frequently used SQL scripts.

Solution

The only database development tool on the market with such capabilities is the Data Explorer Enterprise edition. Since the customer didn't care about ETL functionality, they requested a quote for the Data Explorer Enterprise without ETL. All editions of Data Explorer share the same components, so configuring the product for their particular needs was as easy as sending a new license. At the time it saved the customer a considerable amount of money, and they are currently considering upgrading to the full Data Explorer Enterprise for a fraction of the list price.


Project description: data feeds from a data warehouse

Challenges

Customer needed an easy way to create comma-delimited data feeds from their data warehouse. The data feeds include financial information for a legacy mainframe-based system and are also used to communicate with third-party systems.

Solution

Customer is using the Data Explorer Pro edition to create ETL scenarios, together with the ETL framework executed as a standalone batch process in the data center. Writing most ETL scenarios is as easy as writing a single SQL statement, since the framework takes care of the rest, including parsing and creating files, streaming, notifications, etc. The time needed to create and deploy scenarios was reduced significantly, and developer productivity improved since they don't need to write custom code for each new data feed.
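
To show what a single-statement feed amounts to, here is a bare-bones sketch of exporting a query result as a comma-delimited file in plain JDBC; the framework layers streaming, escaping, scheduling and notifications on top of this, and the query and file name below are placeholders:

    import java.io.*;
    import java.sql.*;

    public class CsvFeedSketch {
        // Runs one query and writes the result set as a comma-delimited file.
        // Quoting/escaping of values is omitted for brevity.
        public static void export(Connection dwh, String fileName) throws SQLException, IOException {
            try (Statement stmt = dwh.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT account_id, amount, posted_on FROM gl_feed");
                 PrintWriter out = new PrintWriter(new FileWriter(fileName))) {
                int cols = rs.getMetaData().getColumnCount();
                while (rs.next()) {
                    StringBuilder line = new StringBuilder();
                    for (int i = 1; i <= cols; i++) {
                        if (i > 1) line.append(',');
                        line.append(rs.getString(i));
                    }
                    out.println(line);
                }
            }
        }
    }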


Project description: classic data warehouse ETL

Challenges

For years the customer used a home-grown ETL process to populate the data warehouse from the OLTP database. They were using a combination of SQL scripts, shell scripts and Oracle SQL*Loader scenarios. It worked rather well for them until they reached a point where the ETL became really slow due to the amount of data which needed to be moved from the OLTP to the OLAP database daily. The designated time window became too small and they were facing constant delays. Customer was looking for a solution which would allow them to use commodity hardware and massively parallel processing for ETL.

Solution

Customer developed a solution which uses the ETL Framework together with the Hadoop MapReduce engine. In the end they managed to cut ETL time in half without investing in new, expensive hardware. The transition from separate SQL scripts and SQL*Loader scenarios was relatively easy because they used the ability of the ETL framework to execute native SQL and Oracle SQL*Loader. They also used the Data Explorer ETL edition to create scenarios in the visual ETL Designer.


Project description: universal database development tool

Challenges

Working as a freelancer, the customer was looking for an inexpensive, universal database development tool. He had very specific requirements for the tool, such as support for certain databases and the ability to work with packages and stored procedures. Most importantly, since he was dealing with a lot of different databases, he was looking for a tool with built-in any-to-any data migration capabilities. He also needed the ability to view and edit data directly in the tool, including CLOBs and BLOBs, complex data types and arrays. An intuitive and familiar drag-and-drop interface was also a plus.

Solution

Customer selected the Data Explorer SQL edition. There are a lot of great products on the market, so the choice he made was not easy. The main selling point was the Data Migration Wizard, an easy-to-use any-to-any data migration tool integrated directly into Data Explorer. Customer is now able to perform data conversions without writing a single line of code. The fact that the other requirements were also met certainly helped.


Project description: educational tool

Challenges

As an educational institution, the customer needed a tool for teaching database management courses to undergraduate students. One of the requirements was the ability to view and search metadata such as database objects, constraints, etc. They also needed an easy way to query databases, so the visual query designer was a big plus.

Solution

Customer received a free site license for the Data Explorer Pro edition which was installed in several labs. Toolsverse used it as an opportunity to beta test many useful database development and database management features which are currently available in the product.


Project description: integration testing and development support

Challenges

When developing a data-driven application, creating integration tests can be a problem. You need to have data in a cohesive state in order to test things reliably. Customer experimented with several testing frameworks such as DBUnit but still was not satisfied. The data preparation process was mostly manual, meaning that developers needed to describe all data needed for a test either in XML or by other means. Supporting the tests became a full-time job since every change in the database structure required changes in N tests as well.

Solution

Customer developed an integration testing framework which uses the ETL Framework to prepare data for the tests. Particular cohesive sets of data are extracted upfront, zipped, and later fed to the ETL Framework, which updates data in the designated test database. The extract and load processes are fully automated and require only a small configuration file for each test or test suite. The file includes table names and other parameters. Developers don't need to include or exclude columns, think about dependencies, etc. The test database is flushed before the continuous integration build runs the entire test suite. Customer developed a large and growing list of ETL scenarios for different domains, so as a positive side effect developers are now able to quickly get data from the production or QA environments to test and fix things.
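
Purely for illustration, a sketch of what a per-test configuration and setup step might look like; the property names are invented and loadDataset is a hypothetical stand-in for the customer's real loader call, not a Toolsverse API:

    import java.io.*;
    import java.util.*;

    public class TestDataSetupSketch {
        // Reads a small per-test configuration (table list plus the zipped
        // dataset to load) and refreshes the test database before the suite
        // runs. loadDataset is a hypothetical placeholder for the real loader.
        public static void prepare(String configFile) throws IOException {
            Properties config = new Properties();
            try (InputStream in = new FileInputStream(configFile)) {
                config.load(in);
            }
            String dataset = config.getProperty("dataset");        // e.g. orders-domain.zip
            String[] tables = config.getProperty("tables").split(",");
            for (String table : tables) {
                loadDataset(dataset, table.trim());                // flush and reload each table
            }
        }

        private static void loadDataset(String dataset, String table) {
            // placeholder: unzip the extract and load it into the test database
            System.out.println("loading " + table + " from " + dataset);
        }
    }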