Teradata MultiLoad – Part 1



Teradata MultiLoad  or MLoad is a command driven load utility for fast, high volume data maintenance on multiple tables or views in Teradata database.

Why it is called Multi-load?

MultiLoad  can perform multiple DML operations, including INSERT, UPDATE, DELETE, and UPSERT on up to five (5) empty/populated target tables at the same time. In case of Fastload, it loads the data to only one target table which should be empty.

 

MultiLoad  Modes:-

MultiLoad  Import-

  • Each MultiLoad import task can perform multiple INSERT, UPDATE, DELETE and UPSERT operation on five target tables in parallel.
  • Can import data from the network attached system or mainframe attached system using custom access module.

MultiLoad  DELETE-

  • Each MultiLoad delete can remove large volumes of data from a single table.

The MultiLoad  DELETE is used to perform global (all amps) delete operation on just one table. The main features of this delete is that it bypasses the transient journal (TJ) and can be restarted if it fails prior to finishing. We can’t use primary index in the MultiLoad  delete operation as primary index access is AMP specific, but MultiLoad  delete is built for global delete.

 

How MultiLoad  DELETE works and why it is fast?

MLoad DELETE operation checks for the entire block of data and after deleting the eligible rows, it will write back the entire block once and one checkpoint will be written. In case of failure before finishing, during restart, it will start deleting from the next data block without checkpoints.

In conventional delete method Teradata uses Transient Journal. In case of failure, all deleted rows are put back into the table from the TJ as a rollback. This rollback can take longer time to finish than delete. MultiLoad  delete does not do roll back; it does restart.

 

BLOCK Level operation:-

Same as FastLoad, Teradata MultiLoad  uses block level operation to overcome the I/O bottleneck. It packs the data into 64KB block into the client system and sends to one AMP to write them to disks to the AMPs.  This is much faster than witting one row at time like BTEQ.

For fallback protected table’s data are being loaded in the background once the base table has been loaded.

 

Data conversion capabilities:-

If an input data field with character data type is targeted for a column with date data type specification, Teradata MultiLoad  can change the input data specification to date before inserting it into the target table.

Below are the conversions supported by MLoad-

  • Numeric-to-numeric (for example, integer to decimal)
  • Character-to-numeric
  • Character-to-date
  • Date-to-character

 

Limitations of MultiLoad:-

No Unique Secondary Index- MultiLoad  does not support unique secondary index (USI) same as Fastload as in case of USI subtable may create in different AMP where the actual row resides. But unlike FastLoad, it supports non-unique secondary index (NUSI) as in this case suitable will be created in the same AMP. In MultiLoad, each AMP works in parallel and independently. This is the reason; it supports NUSI but not USI.

No Referential Integrity- Referential Integrity (RI) on the target table not supported by the Teradata MultiLoad.  RI requires too much system checking to maintain the referential constraints to a different table.

Triggers are not allowed- Triggers are involved with the more than one table. MultiLoad deals with the only one target table. Simply ALTER the Triggers to the DISABLED status prior to using MultiLoad.

No concatenation of the input files- It could impact the restart process if the files were concatenated in different sequences or data was deleted between runs.

No Aggregate, exponential operator or arithmetic functions- If you need data conversions or math, you might be better off using an INMOD to prepare the data prior to loading it.

Also note that MultiLoad  does not support SELECT, Foreign key references, Hash indexes, Join indexes, NOPI tables.