Storing an XML document in the database can give a performance boost by caching. Using XMLType rather than VARCHAR2 can store more text (it's a LOB). The Oracle functions mentioned in this post can be efficient particularly if the document is large; more work is done in the database as it's being retrieved.
Oracle uses the SYS.XMLTYPE data type to store XML documents. These documents can be manipulated in the database using XPATH manipulation routines such as UPDATEXML, APPENDCHILDXML, and DELETEXML.
For example, an extended SQL statement for a table in which a customer_id is stored along with a block of XML in a field 'invoice_xml'.
UPDATE invoice_tbl
SET invoice_xml = DELETEXML(invoice_xml, 'Invoices/Invoice')
WHERE customer_id = 1001
will remove all the nodes matching the Xpath 'Invoices/Invoice'. The Xpath syntax supports indexing (Invoices/Invoice[1] manipulates the first invoice in a list) and attribute selection: Invoices/Invoice[@CustomerId='1001'].
To integrate this in Talend Open Studio, use a tOracleOutput component configured to update. Handle the update by mapping a key field. Use the Advanced settings to insert the special Oracle command (DELETEXML, etc.) into the update.
Applying a Vendor-specific Function to a SQL Update |
This video walks through a Talend Job that uses the Advanced settings to pass the DELETEXML function to Oracle through TOS.
Some XML documents are simple and can be retrieved and transformed in the middle tier or even the front-end. But for large, complex documents that are updated in small amounts, storing a cached document can help with performance. Talend Open Studio's normal update coding -- a tMap into a tOracleOutput -- can take advantage of the special Oracle XML functions like DELETEXML through the Advanced settings tab.
thanks
ReplyDelete