Pervasive Map Designer is a very effective tool for creating source to table mappings where there is a one-to-one correspondence between source record and target record. That is, a given target record is written out or skipped for each source record. Certain multimode connectors give the additional capability of directing the flow of source records to more than one target. But what about a single source record resulting in multiple target records in the same target table?
Consider the following scenario. An Excel spreadsheet with a column of encoded values stored as a comma-separated list.
Each row in the spreadsheet will need to break out the Services column and write out a record for each service. One source record may produce multiple target records, all of the same record type, going to the same table, but without bounds (there aren't service1, service2, serviceN columns). Additionally, a lookup will need to be performed prior to the insert in order to find the key.
To process input in this manner, switch from Map Designer to Process Designer. Manipulate DJImport and DJExport objects within a RIFL script to read the source service strings, parse a service string, and write out one or more service records using the parsed value and a key (the result of a lookup).
This is a single-step process with two Process Variables defined: imp (DJImport) and exp (DJExport).
The code behind the Load Services step follows. Note that there is an extra line break in the ConnectString for display purposes. Also, the write operation hasn't been coded yet. Look for Part 2.
' include reusable code module Include "C:\Users\Carl2\Cosmos9_Work\WorkspaceDemos\CID.itable.rifl" ' prevents eof errors calling imp.Fields() for last line On Error Resume Next Dim sBrand Dim aServices(10) Dim i Dim sCID ' initialize incore lookup functions CID_Init() ' imp references DJImport from Process Variables Set imp = New DJImport "Excel XP" ' string created using wizard imp.ConnectString = "Database=C:\\Users\\Public\\Documents\\Data\\ATD Test \ Data.xls;Table=hotels;Encoding=OEM;HeaderRecordRow=1" While imp.AtEOF = 0 ' split comma-separated list into array aServices = Split(imp.Fields("Services"), ",") For i = LBound(aServices) to UBound(aServices) ' look up id in database sBrand = imp.Fields("Brand") sCID = CID_CID_Lookup(sBrand, "") ' Replace LogMessage() with database insert (djexport) in Part 2 LogMessage("INFO", sCID & ", " & aServices(i)) Next i ' advance to next record imp.RecordNumber = imp.RecordNumber + 1 Wend
The Include statement at the top points to a fully-qualified path. Macros won't work here, so if you want a re-usable RIFL script, store the lookup functions in a folder off the install directory and use a relative path with the Include statement.
Map Designer works well with structures in which there is a one-to-one correspondence between source and target records. But Map Designer doesn't work as well when you need to parse the input prior to insertion. In that case, build a set of loops using DJImport and DJExport objects as you would with a stored procedure. But DJImport and DJExport have the added benefit that they can be used for non-RDBMS sources and targets.
This post is replicated from http://my.opera.com/walkerca/blog/2010/12/07/go-procedural-loading-more-than-one-target-record-per-source-record-part-1.
Talend Open Studio has a great way of handling this. Use a tNormalize and tFilterColumns on the input spreadsheet, then apply the lookups with a tMap. Look for a Jan 10 or 11 post on this.
ReplyDelete