Featured Post

Applying Email Validation to a JavaFX TextField Using Binding

This example uses the same controller as in a previous post but adds a use case to support email validation.  A Commons Validator object is ...

Monday, December 27, 2010

Why Use a Lookup Function?

Several blog posts and tutorials have been created for Pervasive Data Integrator Lookup Functions. In Data Integrator, a Lookup Function is a SQL query or file read that takes a key and returns a value, possibly a default. The returned value can then be used in a target mapping in Map Designer.

One use of the Lookup Function is to retrieve a value for a normalized data model.  In a normalized data model, repeating groups of data are removed.  This is typical of a relational database (SQL Server, Oracle) but less typical of a spreadsheet.


'Brand' is not normalized in the spreadsheet; it is repeated for each and every row in the spreadsheet. A data element (like Brand) is easier to maintain if it is stored in a single place.  So, a representation of this data in an RDBMS would likely store Brand in its own table, separate from the rest of the hotel data.


A single Excel workbook, with all the data in a single Sheet, becomes two relational tables.  And the transformation -- from one workbook to two tables -- requires the two target tables to be loaded in sequence.  This is where the Lookup Function comes in.

Loading data into the relational tables is dones in two Map Designer transformations for each of the two tables: BRAND and HOTEL.  The BRAND table is loaded from the input spreadsheet using the OnDataChange1 event which adds a record for each distinct data element in the Brand column of the spreadsheet.

In the HOTEL transformation, a HOTEL record is loaded for each row in the input spreadsheet; Hotel Name does not repeat.  However, the HOTEL table does not contain a textual Brand value.  Rather, it is using a numerical database identifier.  The Lookup Function makes this linkage by consulting the previously-loaded BRAND table.

Lookup Functions aren't unique to Data Integrator.  For example, you might find something like this in an Oracle stored procedure.

  v_contact_id := get_contact_id(p_email);
 


Where get_contact_id would run a SQL query with a SELECT on contact_id and p_email used in the WHERE clause.

Lookup Functions have uses beyond dealing with a normalized data model.  New product code or other conversions can find new values as a transformation is processed.  There is a performance hit in looking up values with each record, particularly if the data set is large.

Here is a blog post on dealing with performance hits "When Good Lookups Go Bad" .

No comments:

Post a Comment