public final class CommonEtlUtils
extends Object
Modifier and Type | Field and Description |
---|---|
static String |
MODE_PROP
MODE property.
|
static String |
PARENT_ROW_ID_FIELD_NAME
The Constant PARENT_ROW_ID_FIELD_NAME.
|
static String |
PASSIVE_MODE
The PASSIVE MODE.
|
static String |
PROXY_HOST_PROP
PROXY_HOST property.
|
static String |
PROXY_PORT_PROP
PROXY_PORT property.
|
static String |
ROW_ID_FIELD_NAME
The Constant ROW_ID_FIELD_NAME.
|
Constructor and Description |
---|
CommonEtlUtils() |
Modifier and Type | Method and Description |
---|---|
static DataSet |
addDimensions(DataSet drivingDataSet,
String keys,
String addColumns,
boolean keepAllFields,
DataSet... dataSetsToJoin)
Add dimensions to the data set.
|
static DataSetFields |
cloneFields(DataSetFields fields)
Clone fields.
|
static DataSet |
denormalize(DataSet source,
boolean cleanUp)
Pivot data set which has multiple versions of the same field.
|
static DataSet |
executeSql(DataSet dataSet,
String sql)
Execute sql on any data set.
|
static DataSet |
executeSql(DataSet dataSet,
String sql,
TypedKeyValue<String,Object>... args)
Execute sql on any data set.
|
static DataSet |
executeSql(String sql,
DataSet... dataSets)
Execute sql on arrays of data sets.
|
static DataSet |
executeSql(String sql,
TypedKeyValue<String,Object>[] args,
DataSet... dataSets)
Execute sql on arrays of data sets.
|
static DataSet |
extractDimension(String keys,
DataSet drivingDataSet,
String name)
Extract dimension from multidimensional data set.
|
static DataSet |
filter(DataSet dataSet,
String conditions)
Filter data set and keep the source data set intact.
|
static DataSet |
filter(DataSet dataSet,
String conditions,
boolean keepOriginal)
Filter data set.
|
static Map<String,String> |
getCaseInsensitiveKeys(String key)
Gets the case insensitive keys.
|
static Map<String,List<FieldMapping>> |
getFieldMappingPerDataSet(List<FieldMapping> mapping)
Gets the field mapping per data set.
|
static String |
getFieldName(List<FieldMapping> mapping,
int index,
boolean isSource)
Gets the fully qualified field name.
|
static DataSetFields |
getFieldsAfterIncludeExclude(DataSet dataSet,
Set<String> includeFields,
Set<String> excludeFields)
Gets the map of fields after include and exclude.
|
static String |
getFieldsAsString(DataSetFields fields)
Gets the fields as a comma delimited string.
|
static Map<String,FieldDef> |
getFieldsExceptExcluded(String excludedFields,
Map<String,FieldDef> dataSetFields)
Returns the map of the fields except given excluded fields.
|
static Object |
getFieldValue(DataSet dataSet,
String sql,
String fieldName)
Execute sql on any data set and return value of the given field.
|
static Object |
getFieldValue(DataSet dataSet,
String sql,
String fieldName,
TypedKeyValue<String,Object>... args)
Execute sql on any data set and return value of the given field.
|
static TypedKeyValue<String,String> |
getFrom(String sql)
Gets the from and modified SQL statement without from.
|
static String |
getKey(DataSet dataSet,
DataSetRecord record,
Map<String,FieldDef> keys,
boolean ignoreCase,
boolean doTrim)
Gets the string representation of the key for the given record and map of
key fields.
|
static Map<String,FieldDef> |
getKeyFields(String keys,
Map<String,FieldDef> dataSetFields)
Returns the map of the key fields for the given keys.
|
static DataSetRecord |
getRecordAfterInlcudeExclude(DataSet dataSet,
DataSetRecord record,
int cols,
boolean isSelective,
Set<String> includeFields,
Set<String> excludeFields)
Gets the record after include and exclude.
|
static DataSet |
getSelectedDataSet(DataSet dataSet,
TypedKeyValue<int[],int[]> selected)
Gets the selected data set.
|
static DataSet |
intersect(DataSet drivingDataSet,
DataSet secondDataSet,
String keys)
Performs intersect operations on two data sets.
|
static DataSet |
join(DataSet drivingDataSet,
String keys,
String include,
String exclude,
TypedKeyValue<DataSet,Boolean>... dataSetsToJoin)
Joins data sets.
|
static DataSet |
keyValueDenormalization(DataSet source,
DataSet newDataSet,
String groupByColumns,
String includeColumns,
String keyColumn,
String valueColumn,
boolean ignoreCase,
boolean doTrim,
String fieldsToHave)
This transformation groups rows by combination of 'group by' fields.
|
static DataSet |
keyValueDenormalization(DataSet source,
DataSet newDataSet,
String groupByColumns,
String includeColumns,
String keyColumn,
String valueColumn,
boolean ignoreCase,
boolean doTrim,
String fieldsToHave,
String combine)
This transformation groups rows by combination of 'group by' fields.
|
static DataSet |
keyValueNormalization(DataSet source,
String normalizeColumns,
String keyColumn,
String valueColumn,
boolean ignoreEmpty)
This transformation performs key-value normalization.
|
static DataSet |
keyValueNormalization(DataSet source,
String normalizeColumns,
String keyColumn,
String valueColumn,
boolean ignoreEmpty,
String doNotNormalizeColumns)
This transformation performs key-value normalization.
|
static Object |
lookup(DataSet dataSet,
String conditions,
String fieldName)
Find the field value using given filter conditions and field name.
|
static DataSet |
matrix(DataSet drivingDataSet,
int maxColsInRow)
This transformation leaves
|
static DataSetData |
mergeData(DataSetFields fields,
DataSetData to,
DataSetRecord driving,
DataSet... from)
Merge and normalize data.
|
static DataSetFields |
mergeFields(DataSetFields... allFields)
Merge fields from different data sets.
|
static DataSet |
minus(DataSet drivingDataSet,
DataSet dataSetToMinus,
String keys)
Performs minus operations on two data sets.
|
static DataSetData |
normalizeDataSetData(DataSetFields fields,
DataSetData data,
DataSetRecord driving,
DataSet dataSet)
Normalize data set data using given common list of fields, update given
data.
|
static Map<String,DataSet> |
normalizeNestedDataSet(DataSet source,
Set<String> toExclude)
Normalize nested data set.
|
static Map<String,DataSet> |
normalizeNestedDataSet(String parentRowId,
String name,
DataSet source,
Map<String,DataSet> existingDimensions,
Set<String> toExclude)
Normalize nested data set.
|
static DataSet |
pivot(DataSet dataSet,
String theKeys,
String theFields,
String theInclude,
String theExclude,
boolean denorm,
boolean ignoreCase,
boolean doTrim,
int maxFields,
String leading)
This method performs pivoting operations on data set, such as grouping,
de-normalization, etc.
|
static DataSet |
pivot(DataSet dataSet,
String theKeys,
String theFields,
String theInclude,
String theExclude,
boolean denorm,
boolean ignoreCase,
boolean doTrim,
int maxFields,
String leading,
DataSet drivingDataSet)
This method performs pivoting operations on data set, such as grouping,
de-normalization, etc.
|
static void |
processFiles(Alias alias,
IoProcessorCallable callable,
boolean disconnect,
boolean closeIfNotDone,
boolean doneOnDone,
String ownerName,
Driver driver)
Process files for the given alias.
|
static void |
processFiles(Alias alias,
IoProcessorCallable callable,
boolean disconnect,
Driver driver)
Process files for the given alias.
|
static List<TypedKeyValue<Integer,FieldDef>> |
reorderFields(DataSetFields fields,
String pattern)
Reorder fields according to the pattern.
|
static DataSetRecord |
reorderFieldsInRecord(DataSetRecord record,
DataSetFields fields,
DataSetFields originalFields)
Reorder fields in record.
|
static DataSetRecord |
reorderFieldsInRecord(DataSetRecord newRecord,
DataSetRecord record,
DataSetFields fields,
Object keyFieldValue,
Object valueFieldValue,
String[] actualFields,
String combine)
Reorder fields in record.
|
static LinkedHashMap<String,DataSet> |
split(DataSet dataSet,
String keys)
Splits the data set on multiple data sets using given key field(s).
|
static LinkedHashMap<String,DataSet> |
split(DataSet dataSet,
String keys,
int dsSize)
Splits the data set on multiple data sets using given key field(s).
|
static List<DataSetRecord> |
splitRecord(DataSetRecord record,
int maxColsInRow,
DataSetFields fields)
Split record.
|
static DataSet |
transform(DataSet source,
List<FieldMapping> mapping)
Transform source data set into destination using given mapping.
|
static DataSet |
union(DataSet dataSet1,
DataSet dataSet2,
String keys,
boolean unionAll,
String include,
String exclude)
Performs union of the two data sets.
|
public static final String MODE_PROP
public static final String PROXY_HOST_PROP
public static final String PROXY_PORT_PROP
public static final String PASSIVE_MODE
public static final String ROW_ID_FIELD_NAME
public static final String PARENT_ROW_ID_FIELD_NAME
public static DataSet addDimensions(DataSet drivingDataSet, String keys, String addColumns, boolean keepAllFields, DataSet... dataSetsToJoin) throws Exception
drivingDataSet
- the driving data setkeys
- the key fieldsaddColumns
- the columns to add (otherwise use data set name)keepAllFields
- if true keep all columns from data set to join, otherwise only
columns which are not included in the keydataSetsToJoin
- the data sets to joinException
- in case of any errorpublic static DataSet denormalize(DataSet source, boolean cleanUp)
source
- the source data setcleanUp
- if true
the source data set will be clearedpublic static DataSet extractDimension(String keys, DataSet drivingDataSet, String name) throws Exception
keys
- the key columnsdrivingDataSet
- the driving data setname
- the nameException
- in case of any errorpublic static DataSetFields getFieldsAfterIncludeExclude(DataSet dataSet, Set<String> includeFields, Set<String> excludeFields)
dataSet
- the data setincludeFields
- the include fieldsexcludeFields
- the exclude fieldspublic static String getFieldsAsString(DataSetFields fields)
fields
- the fieldspublic static String getKey(DataSet dataSet, DataSetRecord record, Map<String,FieldDef> keys, boolean ignoreCase, boolean doTrim)
dataSet
- the data setrecord
- the recordkeys
- the key fieldsignoreCase
- if true
ignore char casedoTrim
- if true
truncate stringpublic static Map<String,String> getCaseInsensitiveKeys(String key)
key
- the keypublic static Map<String,FieldDef> getKeyFields(String keys, Map<String,FieldDef> dataSetFields)
keys
- the keysdataSetFields
- the fieldspublic static Map<String,FieldDef> getFieldsExceptExcluded(String excludedFields, Map<String,FieldDef> dataSetFields)
excludedFields
- the excluded fieldsdataSetFields
- the fieldspublic static DataSetRecord getRecordAfterInlcudeExclude(DataSet dataSet, DataSetRecord record, int cols, boolean isSelective, Set<String> includeFields, Set<String> excludeFields)
dataSet
- the data setrecord
- the source recordcols
- the number of columnsisSelective
- if true select only fields which are included or not excludedincludeFields
- the include fieldsexcludeFields
- the exclude fieldspublic static Map<String,DataSet> normalizeNestedDataSet(String parentRowId, String name, DataSet source, Map<String,DataSet> existingDimensions, Set<String> toExclude) throws Exception
parentRowId
- the parent row idname
- the namesource
- the source data setexistingDimensions
- the existing dimensionstoExclude
- the to excludeException
- the exceptionpublic static Map<String,DataSet> normalizeNestedDataSet(DataSet source, Set<String> toExclude) throws Exception
source
- the source data settoExclude
- the to excludeException
- the exceptionpublic static DataSet getSelectedDataSet(DataSet dataSet, TypedKeyValue<int[],int[]> selected)
dataSet
- the data setselected
- the selected rows and columnspublic static DataSet intersect(DataSet drivingDataSet, DataSet secondDataSet, String keys) throws Exception
drivingDataSet
- the driving data setsecondDataSet
- the second datasetkeys
- the key fieldsException
- in case of any errorpublic static DataSet pivot(DataSet dataSet, String theKeys, String theFields, String theInclude, String theExclude, boolean denorm, boolean ignoreCase, boolean doTrim, int maxFields, String leading) throws Exception
dataSet
- the data settheKeys
- the fields to "group by"theFields
- the calculated fieldstheInclude
- the fields to includetheExclude
- the fields to excludedenorm
- if true - denormalize data set at the endignoreCase
- if true - ignore case of the key fieldsdoTrim
- if true - trim key fieldsmaxFields
- the maximum number of fieldsleading
- the leading fieldsException
- in case of any errorpublic static DataSet pivot(DataSet dataSet, String theKeys, String theFields, String theInclude, String theExclude, boolean denorm, boolean ignoreCase, boolean doTrim, int maxFields, String leading, DataSet drivingDataSet) throws Exception
dataSet
- the data settheKeys
- the fields to "group by"theFields
- the calculated fieldstheInclude
- the fields to includetheExclude
- the fields to excludedenorm
- if true - denormalize data set at the endignoreCase
- if true - ignore case of the key fieldsdoTrim
- if true - trim key fieldsmaxFields
- the maximum number of fieldsleading
- the leading fieldsdrivingDataSet
- the driving data setException
- in case of any errorpublic static DataSet filter(DataSet dataSet, String conditions) throws Exception
dataSet
- the data setconditions
- the filter conditionsException
- in case of any errorpublic static Object lookup(DataSet dataSet, String conditions, String fieldName) throws Exception
dataSet
- the data setconditions
- the filter conditionsfieldName
- the field nameException
- in case of any errorpublic static DataSet filter(DataSet dataSet, String conditions, boolean keepOriginal) throws Exception
dataSet
- the data setconditions
- the filter conditionskeepOriginal
- if true keep original dataException
- in case of any errorpublic static TypedKeyValue<String,String> getFrom(String sql)
sql
- the sqlpublic static DataSet executeSql(DataSet dataSet, String sql) throws Exception
dataSet
- the data setsql
- the sql to executeException
- in case of any errorpublic static DataSet executeSql(String sql, DataSet... dataSets) throws Exception
sql
- the sql to executedataSets
- the array of data setsException
- in case of any errorpublic static DataSet executeSql(DataSet dataSet, String sql, TypedKeyValue<String,Object>... args) throws Exception
dataSet
- the data setsql
- the sql to executeargs
- the SQL parameters as key/value pairsException
- in case of any errorpublic static DataSetFields cloneFields(DataSetFields fields)
fields
- the fieldspublic static DataSet executeSql(String sql, TypedKeyValue<String,Object>[] args, DataSet... dataSets) throws Exception
sql
- the sql to executeargs
- the SQL parameters as key/value pairsdataSets
- the array of data setsException
- in case of any errorpublic static Object getFieldValue(DataSet dataSet, String sql, String fieldName, TypedKeyValue<String,Object>... args) throws Exception
dataSet
- the data setsql
- the sql to executefieldName
- the field nameargs
- the SQL parameters as key/value pairsException
- in case of any errorpublic static Object getFieldValue(DataSet dataSet, String sql, String fieldName) throws Exception
dataSet
- the data setsql
- the sql to executefieldName
- the field nameException
- in case of any error@SafeVarargs public static DataSet join(DataSet drivingDataSet, String keys, String include, String exclude, TypedKeyValue<DataSet,Boolean>... dataSetsToJoin) throws Exception
drivingDataSet
- the driving data setkeys
- the key fieldsinclude
- the fields to includeexclude
- the fields to excludedataSetsToJoin
- the data sets to joinException
- in case of any errorpublic static DataSet keyValueDenormalization(DataSet source, DataSet newDataSet, String groupByColumns, String includeColumns, String keyColumn, String valueColumn, boolean ignoreCase, boolean doTrim, String fieldsToHave) throws Exception
Before: id attribute value 1 first_name John 1 last_name Doe 2 email test@yahoo.com 2 ssn 123 After: id first_name last_name email ssn 1 John Doe 2 test@yahoo.com 123
source
- the source datasetnewDataSet
- the new datasetgroupByColumns
- the group by columnsincludeColumns
- include these columns in addition to groupByColumnskeyColumn
- the key columnvalueColumn
- the value columnignoreCase
- if true - ignore character case when comparing key columnsdoTrim
- if true - trim column values when comparing key columnsfieldsToHave
- the comma delimited list of fields to have in the dataset.
This parameter can be null.Exception
- in case of any errorpublic static DataSet keyValueDenormalization(DataSet source, DataSet newDataSet, String groupByColumns, String includeColumns, String keyColumn, String valueColumn, boolean ignoreCase, boolean doTrim, String fieldsToHave, String combine) throws Exception
Before: id attribute value 1 first_name John 1 last_name Doe 2 email test@yahoo.com 2 ssn 123 After: id first_name last_name email ssn 1 John Doe 2 test@yahoo.com 123
source
- the source datasetnewDataSet
- the new datasetgroupByColumns
- the group by columnsincludeColumns
- include these columns in addition to groupByColumnskeyColumn
- the key columnvalueColumn
- the value columnignoreCase
- if true - ignore character case when comparing key columnsdoTrim
- if true - trim column values when comparing key columnsfieldsToHave
- the comma delimited list of fields to have in the dataset.
This parameter can be null.combine
- the field to combine key-values into (must exist)Exception
- in case of any errorpublic static DataSet keyValueNormalization(DataSet source, String normalizeColumns, String keyColumn, String valueColumn, boolean ignoreEmpty) throws Exception
Before: id first_name last_name email ssn 1 John Doe 2 test@yahoo.com 123 After: id attribute value 1 first_name John 1 last_name Doe 2 email test@yahoo.com 2 ssn 123
source
- the sourcenormalizeColumns
- the columns to normalize (transform to key-value pairs where
key is a column name and value is columns value)keyColumn
- the name of the key columnvalueColumn
- the name of the value columnignoreEmpty
- if true empty columns with empty values will be ignoredException
- in case of any errorpublic static DataSet keyValueNormalization(DataSet source, String normalizeColumns, String keyColumn, String valueColumn, boolean ignoreEmpty, String doNotNormalizeColumns) throws Exception
Before: id first_name last_name email ssn 1 John Doe 2 test@yahoo.com 123 After: id attribute value 1 first_name John 1 last_name Doe 2 email test@yahoo.com 2 ssn 123
source
- the sourcenormalizeColumns
- the columns to normalize (transform to key-value pairs where
key is a column name and value is columns value)keyColumn
- the name of the key columnvalueColumn
- the name of the value columnignoreEmpty
- if true empty columns with empty values will be ignoreddoNotNormalizeColumns
- the columns to not normalizeException
- in case of any errorpublic static DataSet matrix(DataSet drivingDataSet, int maxColsInRow)
<= maxColsInRow columns in one row and moves the rest to the next row.
drivingDataSet
- the data set to transformmaxColsInRow
- the max cols in row@SafeVarargs public static DataSetFields mergeFields(DataSetFields... allFields)
allFields
- the all fieldspublic static DataSetData normalizeDataSetData(DataSetFields fields, DataSetData data, DataSetRecord driving, DataSet dataSet)
fields
- the fieldsdata
- the datadriving
- the driving recorddataSet
- the data set to normalizepublic static DataSetData mergeData(DataSetFields fields, DataSetData to, DataSetRecord driving, DataSet... from)
fields
- the fieldsto
- the data to merge intodriving
- the driving recordfrom
- the array of data sets to merge data frompublic static DataSet minus(DataSet drivingDataSet, DataSet dataSetToMinus, String keys) throws Exception
drivingDataSet
- the driving data setdataSetToMinus
- the data set to joinkeys
- the key fieldsException
- in case of any errorpublic static void processFiles(Alias alias, IoProcessorCallable callable, boolean disconnect, Driver driver) throws Exception
alias
- the aliascallable
- the instance of the class which implements callable interfacedisconnect
- if true disconnect on exitException
- in case of any errorpublic static void processFiles(Alias alias, IoProcessorCallable callable, boolean disconnect, boolean closeIfNotDone, boolean doneOnDone, String ownerName, Driver driver) throws Exception
alias
- the aliascallable
- the instance of the class which implements callable interfacedisconnect
- if true disconnect on exitcloseIfNotDone
- the close if not donedoneOnDone
- the done on doneownerName
- the owner namedriver
- the driverException
- in case of any errorpublic static List<TypedKeyValue<Integer,FieldDef>> reorderFields(DataSetFields fields, String pattern)
fields
- the fieldspattern
- the patternpublic static DataSetRecord reorderFieldsInRecord(DataSetRecord record, DataSetFields fields, DataSetFields originalFields)
record
- the original recordfields
- the fieldsoriginalFields
- the original fieldspublic static DataSetRecord reorderFieldsInRecord(DataSetRecord newRecord, DataSetRecord record, DataSetFields fields, Object keyFieldValue, Object valueFieldValue, String[] actualFields, String combine)
newRecord
- the new recordrecord
- the recordfields
- the fieldskeyFieldValue
- the key field valuevalueFieldValue
- the value field valueactualFields
- the actual fieldscombine
- the combine fieldpublic static LinkedHashMap<String,DataSet> split(DataSet dataSet, String keys)
dataSet
- the data setkeys
- the keyspublic static LinkedHashMap<String,DataSet> split(DataSet dataSet, String keys, int dsSize)
dataSet
- the data setkeys
- the keysdsSize
- the maximum data set sizepublic static List<DataSetRecord> splitRecord(DataSetRecord record, int maxColsInRow, DataSetFields fields)
record
- the record to splitmaxColsInRow
- the max number of columns in the rowfields
- the fieldspublic static DataSet union(DataSet dataSet1, DataSet dataSet2, String keys, boolean unionAll, String include, String exclude) throws Exception
dataSet1
- the first data setdataSet2
- the second data setkeys
- the key fields used when unionAll == falseunionAll
- if false exclude rows with duplicated keysinclude
- the fields to includeexclude
- the fields to excludeException
- in case of any errorpublic static String getFieldName(List<FieldMapping> mapping, int index, boolean isSource)
mapping
- the mappingindex
- the index of the fieldisSource
- the is sourcepublic static Map<String,List<FieldMapping>> getFieldMappingPerDataSet(List<FieldMapping> mapping)
mapping
- the original mappingpublic static DataSet transform(DataSet source, List<FieldMapping> mapping) throws Exception
source
- the source data setmapping
- the mappingException
- in case of any errorCopyright © 2010-2020 Toolsverse. All Rights Reserved.