Contact information in databases changes with new technologies. "Email" and "web site" were added as standard contact fields in the 90's. Now, contacts can have Twitter handles, Facebook user names, YouTube channels and multiple email addresses (Google Circles, etc.).
Software applications can't always keep up with the new technologies. Accounting packages like Sage Peachtree will provide custom fields for handling this type of thing. An email address can be stored in Peachtree Accounting 2012's contact information, but you'll need to use a custom field to store a Twitter handle.
Legacy Data
The following data set is inspired from such a scenario. The screenshot is taken from Talend Open Studio's (TOS) tFileInputExcel wizard. It shows three contact records with a mandatory Email address and three custom fields (Custom_Field1, Custom_Field2, CustomF_Field3) which may contain additional email addresses.
tFileInputExcel Wizard Showing Legacy Data |
The target schema will move the repeating email columns into a single table. Each email column (Email, Custom Field 1) will be realized in its own record in the target. Empty email columns will not produce a target record.
The following screenshot shows a TOS job producing the desired output. Three records are read-in; nine are output by the tSplitRow. Each email field is saved in a target record (3 email fields x 3 input records = 9 output records). A tFilterRow is added to remove records that have an empty email field. Only the first contact record has its Custom Field 2 set, so two records are filtered.
TOS Job Showing Converted Output |
The tSplitRow component is configured by defining three mappings. First_Name and Last_Name are repeated for each mapping. Each of the three email columns (Email, Custom Field 1, Custom Field 2) differentiate the column mappings.
tSplitRow Column Mappings |
Nice to read this post. Really help full. I was stuck to solve some similar problem since last 2 days. Thanks for helpful information. keep posting
ReplyDelete