Featured Post

Applying Email Validation to a JavaFX TextField Using Binding

This example uses the same controller as in a previous post but adds a use case to support email validation.  A Commons Validator object is ...

Tuesday, May 24, 2011

Mapping the Auto Increment Column in Talend Open Studio

When you're working with an auto-incremented column in Talend Open Studio, you must be careful not to map the column if it's used as an output.
Databases like Microsoft Access use auto-increment columns to provide for unique record identifiers.  This is in contrast to databases like Oracle which maintain a separate facility for sequences (ROWID aside).  When working with Talend, don't map a value to the column, otherwise a database error will the thrown.

(This takes the form of a "General error" referencing the ODBC driver if a tAccessOutput component is used.)

The metadata functions in Talend like the "Retrieve schema" button will include auto-increment columns.  For example

Generated from Retrieve Schema
To use this schema in the map, there are a few techniques.

  1. Abandon it.  In the TOS User Guide, the schema used in the tMysqlOutput component sample is created by hand in a tMap.  Connect the the tMap and press the Reset DB Types on the Edit Schema dialog.
  2. Tweak the Repository.  Drag the Repository's definition on the canvas.  Change to a Built-in and remove the auto-increment column. 
  3. Build the schema in the tMysqlOutput component.  The opposite of the "Abandon it" where the schema is pushed backwards onto a tMap.
Access Output with Built-in Schema
This screenshot is tweaking Repository metadata.  The schema started with an ETL_BUDGET_ID column that was removed.
Schema without Auto-Increment Column
 If you're working with a database output with an auto-increment column, be careful not to map a value for it.  Otherwise, there will be a DB error.

2 comments:

  1. Access is good but what about Oracle?

    ReplyDelete
    Replies
    1. Oracle has a separate ID-generating mechanism, the sequence. Oracle identity columns are defined as any integer type, and can be mapped in Talend without an error. Look up the sequence beforehand in a query (SELECT mysequence.NEXTVAL FROM DUAL) and route the result to a tOracleOutput's ID column.

      Alternatively, you can hack the schema to remove the column (see "Tweak the repository"). Then,

      1. Add the column in the Advanced settings > additional columns table of the tOracleOutput.
      2. Set a column name, position, reference column, and a SQL expression.
      3. For the SQL expression, use "mysequence.NEXTVAL" where mysequence is the name of your sequence.

      Good luck

      Delete