A multi-schema text file has a varying structure in which each line may represent a different record type. The following example comes from the Talend documentation.
01;SOFT MUSIC DANCE ALBUM;RICHARDSON;15/12/2005
02;We Danced
02;She's Everytying
02;Once in a Lifetime Love
03;National Library
01;COUNTRY MUSIC ALBUM;WHITE;02/01/2006
02;Fall Into Me
02;Another Try
02;Something About Her
Songs ("We Danced", etc.) are grouped into compact discs ("SOFT MUSIC DANCE ALBUM"). There is also a third record type, "Library". This could be expressed in an XML document like this.
XML Document from a Multi-Schema Text File
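As a rough illustration of the target structure, the grouping can be sketched in plain Python, outside Talend. The element names (Discs, Disc, Name, Author, Date, Songs, Song, Libraries, Library) and the function name build_xml are assumptions made for this sketch and may differ from the names used in the actual job. The sample records are copied verbatim from the example above.

```python
import xml.etree.ElementTree as ET

# Sample records copied verbatim from the example in this post.
SAMPLE = """\
01;SOFT MUSIC DANCE ALBUM;RICHARDSON;15/12/2005
02;We Danced
02;She's Everytying
02;Once in a Lifetime Love
03;National Library
01;COUNTRY MUSIC ALBUM;WHITE;02/01/2006
02;Fall Into Me
02;Another Try
02;Something About Her
"""


def build_xml(text):
    """Group song (02) and library (03) records under the preceding disc (01)."""
    root = ET.Element("Discs")  # hypothetical root element name
    disc = None
    for line in text.splitlines():
        if not line.strip():
            continue
        record_type, *fields = line.split(";")
        if record_type == "01":
            name, author, date = fields
            disc = ET.SubElement(root, "Disc")
            ET.SubElement(disc, "Name").text = name
            ET.SubElement(disc, "Author").text = author
            ET.SubElement(disc, "Date").text = date
            # Containers are created up front so their order is fixed.
            ET.SubElement(disc, "Songs")
            ET.SubElement(disc, "Libraries")
        elif disc is None:
            raise ValueError("child record found before any disc record")
        elif record_type == "02":
            ET.SubElement(disc.find("Songs"), "Song").text = fields[0]
        elif record_type == "03":
            ET.SubElement(disc.find("Libraries"), "Library").text = fields[0]
        else:
            raise ValueError(f"unknown record type: {record_type!r}")
    return root


if __name__ == "__main__":
    print(ET.tostring(build_xml(SAMPLE), encoding="unicode"))
```

The only state carried between lines is the current disc, which is what makes the file "multi-schema": a 02 or 03 record has meaning only relative to the most recent 01 record.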
It might seem possible to connect each schema of a multi-schema component like tFileInputMSDelimited to its own tAdvancedFileOutputXML in Append mode. Such a job looks like it would work, and the record counts all check out.
WRONG: Appended Elements Won't Show Up
However, the XML output doesn't render correctly. The subelements "Songs" and "Libraries" are missing.
XML Document Missing Key Subelements
Instead of connecting each schema output with a Main row, chain the tAdvancedFileOutputXML components together using OnSubJobOk triggers.
With OnSubJobOk
Component Configuration
The configuration of the tFileInputMSDelimited is found in the Talend help files. In the OnSubJobOk version of the job, the tFileInputMSDelimited is duplicated, one copy for each output schema.
The tAdvancedFileOutputXML components are in append mode (except for the first one) and write to the same XML file.
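The multi-pass idea can be mimicked in plain Python as a conceptual analogy only; this is not how Talend implements append mode. The sketch assumes that each child row carries the name of its parent disc, so an appended element can be attached to the right container. The function names (rows_of_type, first_pass, append_pass) and the element names are hypothetical.

```python
import xml.etree.ElementTree as ET


def rows_of_type(text, wanted):
    """Re-read the source and yield (disc_name, fields) for one record type.

    The name of the most recent disc (01) record is carried along with
    every row, so child rows can be matched to their parent later.
    """
    disc_name = None
    for line in text.splitlines():
        record_type, *fields = line.split(";")
        if record_type == "01":
            disc_name = fields[0]
        if record_type == wanted:
            yield disc_name, fields


def first_pass(text):
    """First output: create the document with only the disc containers."""
    root = ET.Element("Discs")  # hypothetical root element name
    for name, fields in rows_of_type(text, "01"):
        disc = ET.SubElement(root, "Disc")
        ET.SubElement(disc, "Name").text = name
        ET.SubElement(disc, "Author").text = fields[1]
        ET.SubElement(disc, "Date").text = fields[2]
    return root


def append_pass(root, text, wanted, container, element):
    """Later outputs: append one record type under the matching disc."""
    by_name = {d.findtext("Name"): d for d in root.findall("Disc")}
    for disc_name, fields in rows_of_type(text, wanted):
        disc = by_name[disc_name]
        group = disc.find(container)
        if group is None:
            group = ET.SubElement(disc, container)
        ET.SubElement(group, element).text = fields[0]


if __name__ == "__main__":
    text = "01;SOFT MUSIC DANCE ALBUM;RICHARDSON;15/12/2005\n02;We Danced\n03;National Library\n"
    doc = first_pass(text)                                   # first subjob
    append_pass(doc, text, "02", "Songs", "Song")            # second subjob
    append_pass(doc, text, "03", "Libraries", "Library")     # third subjob
    print(ET.tostring(doc, encoding="unicode"))
```

Because each pass runs to completion before the next one starts, the containers appear in pass order (Songs before Libraries), which mirrors the sequential behavior of chaining with OnSubJobOk.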
Mapping that Produces Top-Level Disc Container
The elements in the top-level container (Author, Date) should appear in all schemas. This ensures the correct ordering of the XML elements, which might be validated against an xs:sequence element in an XSD.
Mapping that Produces Song Element
Mapping that Produces the Libraries Element
Connecting a bunch of tAdvancedFileOutputXMLs didn't work initially, but by restructuring the job, you can produce an XML document from a text file.
UPDATE
This is a screenshot of the schema used, repeated in each of the three tFileInputMSDelimited components.
Schema Used in "With SubJob Ok" Job
Hi,
I can't find a way to do the same thing with tXmlMap?
I miss this feature; I would use it in DataServices exposed through the Talend ESB container.
Carl,
What is the mapping schema of the MSInputDelimited?
It appears from the mapping as though the MSInputDelimited schema would contain the previous record ids (i.e. records "02" would contain "01" as a field, and records "03" would contain fields of "01" and "02"?): see the DISCName mapping within Songs and the DISCName mapping within Library.
Hi. I added a screenshot under the section heading "UPDATE".