A multi-schema text file has a varying structure in which each line may represent a different record type. The following example is from the Talend documentation.
01;SOFT MUSIC DANCE ALBUM;RICHARDSON;15/12/2005
02;Once in a Lifetime Love
01;COUNTRY MUSIC ALBUM;WHITE;02/01/2006
02;Fall Into Me
02;Something About Her
Songs ("Once in a Lifetime Love", etc.) are grouped in Compact Discs ("SOFT MUSIC DANCE ALBUM"). There's also a third record type, "Library", which is not shown in the excerpt above. This could be expressed in an XML document like this.
|XML Document from a Multi-Schema Text File|
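To make the target structure concrete outside of Talend, here is a minimal Python sketch that reads the sample records and nests each song under the disc that precedes it. The element names "Disc", "Songs", "Author" and "Date" follow the names used in this post. The root element "Discs" and the "Title" and "Song" elements are assumptions made for illustration, and the "Library" record type is skipped because its layout isn't shown in the excerpt.

```python
import xml.etree.ElementTree as ET

SAMPLE = """\
01;SOFT MUSIC DANCE ALBUM;RICHARDSON;15/12/2005
02;Once in a Lifetime Love
01;COUNTRY MUSIC ALBUM;WHITE;02/01/2006
02;Fall Into Me
02;Something About Her
"""

def build_xml(text):
    """Group 02 (song) records under the most recent 01 (disc) record."""
    root = ET.Element("Discs")  # assumed root element name
    songs = None  # the <Songs> container of the current disc
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        record_type = fields[0]
        if record_type == "01":
            disc = ET.SubElement(root, "Disc")
            ET.SubElement(disc, "Title").text = fields[1]  # assumed name
            ET.SubElement(disc, "Author").text = fields[2]
            ET.SubElement(disc, "Date").text = fields[3]
            songs = ET.SubElement(disc, "Songs")
        elif record_type == "02":
            if songs is None:
                raise ValueError("song record found before any disc record")
            ET.SubElement(songs, "Song").text = fields[1]  # assumed name
        # Any other record type (such as Library) is ignored in this sketch.
    return root

root = build_xml(SAMPLE)
print(ET.tostring(root, encoding="unicode"))
```

The key point is that the record type in the first field decides which schema applies to the rest of the line, and the parent of a song is determined by file order rather than by a key in the record.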
It might seem possible to connect each schema of a multi-schema component like tFileInputMSDelimited to its own tAdvancedFileOutputXML set to Append mode. This job looks like it would work, and the record counts all check out.
|WRONG: Appended Elements Won't Show Up|
However, the XML output is incomplete: the subelements "Songs" and "Libraries" are missing.
|XML Document Missing Key Subelements|
Instead of connecting each schema output with a Main row, chain the tAdvancedFileOutputXML components together using OnSubJobOk triggers.
The configuration of tFileInputMSDelimited is covered in the Talend help files. In the OnSubJobOk version of the job, the tFileInputMSDelimited is duplicated, with one copy for each output schema.
All of the tAdvancedFileOutputXML components write to the same XML file, and all except the first are in Append mode.
|Mapping that Produces Toplevel Disc Container|
The elements in the top-level container, Author and Date, should appear in all schemas. This ensures the correct ordering of the XML elements, which might be validated against an xs:sequence element in an XSD.
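As a rough illustration of why ordering matters, the following Python sketch checks whether the children of a container appear in a declared order, which is the constraint an xs:sequence imposes. It is a simplified stand-in, not real XSD validation, and the order in `EXPECTED_ORDER` is an assumption based on the element names mentioned in this post.

```python
import xml.etree.ElementTree as ET

# Assumed child order for the Disc container (hypothetical, for illustration).
EXPECTED_ORDER = ["Author", "Date", "Songs", "Libraries"]

def follows_sequence(parent, expected_order):
    """Return True if the children of parent never go backwards
    relative to expected_order. Unknown child tags fail the check."""
    rank = {name: i for i, name in enumerate(expected_order)}
    last = -1
    for child in parent:
        if child.tag not in rank or rank[child.tag] < last:
            return False
        last = rank[child.tag]
    return True

in_order = ET.fromstring(
    "<Disc><Author>WHITE</Author><Date>02/01/2006</Date><Songs/></Disc>"
)
out_of_order = ET.fromstring(
    "<Disc><Songs/><Author>WHITE</Author><Date>02/01/2006</Date></Disc>"
)

print(follows_sequence(in_order, EXPECTED_ORDER))      # True
print(follows_sequence(out_of_order, EXPECTED_ORDER))  # False
```

A document in which an appended "Songs" element lands ahead of "Author" and "Date" would fail a check like this, which is why the top-level elements are kept in every mapping.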
|Mapping that Produces Song Element|
|Mapping that Produces the Libraries Element|
Connecting a set of tAdvancedFileOutputXML components directly to the multi-schema input didn't work, but by restructuring the job with OnSubJobOk triggers, you can produce the XML document from the text file.
This is a screenshot of the schema, which is repeated in each of the three tFileInputMSDelimited components.
|Schema Used in "With SubJob Ok" Job|