Text fields have size limits. In Oracle, a VARCHAR2 field holds at most 4,000 bytes by default (the "4k limit"). A Microsoft Access Text field holds at most 255 characters. Both databases offer an alternative storage type for larger amounts of text: Oracle has the Clob and Access has the Memo. Choosing Clob or Memo over VARCHAR2 or Text may restrict the operations you can perform on the field, such as sorting or comparisons.
This distinction between the storage types isn't important to Talend Open Studio. Retrieving the metadata from the database table creates a field of type String or Object. From a coding perspective, the two are treated identically, since a Java String is an Object.
One difference in dealing with large amounts of text occurs on the input side. If the text comes from a large file or a web service, the Talend input components may have trouble when the text contains line breaks. Components like tFileInputDelimited or tFileInputFullRow read their input row by row, so they will not necessarily map a whole file to a single Clob field.
In this job, a file's contents are gathered in a global variable called "stringBuffer", which is a java.lang.StringBuffer. The StringBuffer is then written out to a database Clob field (in this example, an Access Memo). A tRowGenerator assists with the write.
Reading a Large File into a Database Clob
tSetGlobalVar Defines a StringBuffer
Unpack the StringBuffer and Append a Line
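The two steps captioned above (defining the buffer, then unpacking it and appending each line) can be sketched in plain Java. This is only an illustration outside of Talend: the `globalMap` here is an ordinary `HashMap` standing in for the job's global variables, the class and method names are hypothetical, and re-adding a newline after each line is an assumption about how the line breaks are preserved.

```java
import java.util.HashMap;
import java.util.Map;

public class ClobBufferSketch {
    // Stand-in for the job's global variables (assumption: a plain Map<String, Object>).
    public static final Map<String, Object> globalMap = new HashMap<>();

    // Mirrors the tSetGlobalVar step: register an empty StringBuffer under "stringBuffer".
    public static void defineBuffer() {
        globalMap.put("stringBuffer", new StringBuffer());
    }

    // Mirrors the per-line step: unpack the buffer from the map and append one line.
    // The trailing "\n" restores the line break lost when the file is read line by line
    // (an assumption; drop it if the line breaks should not be kept).
    public static void appendLine(String line) {
        StringBuffer sb = (StringBuffer) globalMap.get("stringBuffer");
        sb.append(line).append("\n");
    }

    public static void main(String[] args) {
        defineBuffer();
        appendLine("first line");
        appendLine("second line");
        // The buffer now holds the whole text as one value.
        System.out.print(globalMap.get("stringBuffer"));
    }
}
```

Because the map stores a reference to the StringBuffer, appending to the unpacked object updates the global variable without putting it back into the map.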
A tRowGenerator is used to produce a single row. It acts as a flow adapter that allows a tAccessOutput component to be used. The tRowGenerator maps a hardcoded recordName field and a dynamic recordData field. The recordData field is formed by retrieving the stringBuffer global variable and calling its toString() method to produce a String for use in a tMap.
tRowGenerator Forming a Row Based on Global Var
Straightforward tMap
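The row-forming step can be sketched the same way. The `Row` class, the `buildRow` method, and the "sample record" value are hypothetical names chosen for illustration. Only the cast of the global variable to StringBuffer followed by `toString()` reflects the step described above.

```java
import java.util.HashMap;
import java.util.Map;

public class RowFromGlobalVar {
    // Hypothetical row holder with the two fields named in the text.
    public static class Row {
        public String recordName;
        public String recordData;
    }

    // Builds the single row: a hardcoded name plus the buffer contents as a String.
    public static Row buildRow(Map<String, Object> globalMap) {
        Row row = new Row();
        row.recordName = "sample record"; // hardcoded placeholder value
        row.recordData = ((StringBuffer) globalMap.get("stringBuffer")).toString();
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("stringBuffer", new StringBuffer("line one\nline two\n"));
        Row row = buildRow(globalMap);
        System.out.println(row.recordName + " holds " + row.recordData.length() + " characters");
    }
}
```

From this point on the text is an ordinary String, so the downstream mapping and output steps need no special handling for the large field.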