tExtractXMLField is a Talend Open Studio (TOS) component that parses a field containing XML and breaks the structure out into individual fields. Once the data is out of XML, the resulting schema can be processed using other TOS components like tMap.
If an update is required, use a tXMLMap to rebuild the structure and direct the flow to an output component like tMySqlOutput. A tMap will need to be added in order to convert the Document output of tXMLMap to an output schema; implicit cast-style conversions or tConvertType won't work.
<?xml version="1.0" encoding="utf-8"?>
The XML document can be saved in a Text column of a MySQL database. The Text column is like an Oracle Clob which removes the maxlength limit of the typical VARCHAR data type. This database record stores an id (preferences_id) and a unique business key (username) along with the XML for easy retrieval by a RESTful web service.
|DB Schema Showing a Text Column with XML|
The use case described in this post is to migrate the data stored in the XML document to a new value. Specifically, the theme named "modern" is to be renamed "modern1". One way to handle this is to use a string replacement, but a more flexible way to handle this is to retain the data's definition by dealing with the perferences_xml field as XML rather than a simple string. To apply the new theme name, the XML document will be read, converted internally into something usable by TOS, modified with the results of a lookup value, and written back out the the database.
|Job Parsing and Re-forming XML|
The tExtractXMLField parses and queries the XML document with XPaths. From the XML document, it creates preference_theme and preference_color columns. Although preference_color is not modified, it needs to be extracted so that it can be re-formed later.
|Join Tables to Update Preferences|
The tXMLMap takes the individual fields -- now converted by the preferences_update lookup -- and re-forms the XML document in a field "preferences_xml". To use tXMLMap, make sure that the target type is Document. This will make the basic subelement/attribute right-click menu functions available. Build the structure and map the fields as you would with any XML component.
A second tMap is used to convert the Document coming from the tXMLMap into the String used by the target tMySqlOutput. It seems like this should happen automatically, but you'll get an error even if you add an interceding tConvertType. Rather, call the toString() method of the Document as in this configuration.
|Document to String Conversion|
Finally, an update is executed using a tMySqlOutput component. This update uses the Repository schema for preferences. But it uses the Advanced settings' Field options as specified below to prevent username from being overridden with a NULL value. Update is the action used.
|Update Only preferences_xml|
A Note about tWriteXMLField:
tWriteXMLField has similar capabilities to tXMLMap. However, tWriteXMLField will only output a single field of XML. That's good for inserting, however it may not work for updating where a unique key or id needs to be carried through. See TDI-8392 on Talend's bug tracker for more info.