The Java library I'll be working with is the Apache Commons Lang library (see its Javadoc for details).
An Example
The image below shows a job in Talend Open Studio. An input Excel file is mapped directly to an output Excel file. However, since some of the fields in the source are empty, null/empty-string checks are required to keep the output spreadsheet's columns aligned.
Talend Open Studio Job with tLibraryLoad Component
import org.apache.commons.lang.StringUtils;
Then, in the tMap component, add a Java expression that makes the StringUtils.isBlank call.
tMap with a Commons Lang StringUtils Call
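As a rough illustration of what such a tMap expression might look like, here is a minimal sketch. The row name row1 and the column name comment are hypothetical; the job above does not name them. So that the sketch runs without the Commons Lang JAR, isBlank here is a local stand-in that mirrors the documented behavior of StringUtils.isBlank (true for null, empty, and whitespace-only strings); inside Talend you would call the library method instead.

```java
// Stand-alone sketch; BlankCheckSketch and blankToEmpty are illustrative names.
class BlankCheckSketch {

    // Local stand-in for org.apache.commons.lang.StringUtils.isBlank:
    // true for null, "" and strings made up only of whitespace.
    static boolean isBlank(String s) {
        if (s == null) {
            return true;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Same shape as a tMap output expression such as
    //   StringUtils.isBlank(row1.comment) ? "" : row1.comment
    // (row1 and comment are hypothetical names).
    static String blankToEmpty(String value) {
        return isBlank(value) ? "" : value;
    }

    public static void main(String[] args) {
        System.out.println("[" + blankToEmpty(null) + "]");
        System.out.println("[" + blankToEmpty("   ") + "]");
        System.out.println("[" + blankToEmpty("abc") + "]");
    }
}
```

Normalizing blank values to an empty string is only one option; the same ternary could just as well supply a default such as "N/A".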
When you use tLibraryLoad, the JAR file is copied into Talend's internals. This makes it eligible to be exported along with a job. Don't try to adjust the IDE's classpath, or any other classpath, to find your JAR.
Flexibility
There are many possibilities for integration with this kind of flexibility. This example focused on some useful string handling routines from a popular Java library. But with Java, there is so much code out there that more capable libraries like Hibernate or JUnit could find their way into integration scenarios.
Talend comes with a lot of JARs including several versions of Commons Lang. Rather than browse the file system for your downloaded JAR in the tLibraryLoad component configuration panel, try scanning the popup menu for "Commons Lang 2.5".
Hi Carl,
I need to bring over Chinese characters and am using tLibraryLoad to load charset.jar.
My question for you: what should the corresponding import statement be?
Thanks in advance.
Hi,
Consider upgrading to Java 6, which contains charsets.jar for Chinese encodings: Big5, GB18030, GB2312, and GBK. You can also try placing charsets.jar in the jre/lib/ext folder of the JRE used by Talend and by the target platform.
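One quick way to see whether a given JRE actually provides these encodings is to ask java.nio.charset.Charset directly. The sketch below is only a check you could run with the same JRE that Talend uses; the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper: reports which charset names the running JRE supports.
class CharsetCheckSketch {

    static Map<String, Boolean> checkSupport(String... names) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : names) {
            // Charset.isSupported returns false when no provider offers the charset.
            result.put(name, Charset.isSupported(name));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(checkSupport("Big5", "GB18030", "GB2312", "GBK"));
    }
}
```

If any of the names comes back false, the extended charsets are probably missing from that JRE.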
Thanks Carl. It works when there is only one row in the source.
When there is more than one row in the source, I get ???
When I use tLibraryLoad to load charset.jar, what should the corresponding import statement be (like import org.apache.commons.lang.StringUtils;)?
I don't think you'll need an import statement. These classes are sun.* implementations that will be instantiated by the more familiar java.io.* classes, as in InputStreamReader below.
FileInputStream fis = new FileInputStream("test.txt");
InputStreamReader isr = new InputStreamReader(fis, "GBK");
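To try this outside Talend, a slightly fuller sketch might write a short sample to a temporary file and read it back with the same encoding name. The class name, method name, and sample text are made up for illustration, and the sketch assumes the running JRE supports GBK.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Illustrative round trip: write text in a named encoding, then read it back.
class EncodingRoundTripSketch {

    static String roundTrip(String text, String encoding) {
        try {
            File file = File.createTempFile("encoding-sketch", ".txt");
            try {
                try (Writer out = new OutputStreamWriter(new FileOutputStream(file), encoding)) {
                    out.write(text);
                }
                // No import of a charset class is needed; the name is resolved at runtime.
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(new FileInputStream(file), encoding))) {
                    return in.readLine();
                }
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            // Includes UnsupportedEncodingException when the name is unknown.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sample = "\u4e2d\u6587"; // two Chinese characters, written as escapes
        System.out.println(sample.equals(roundTrip(sample, "GBK")));
    }
}
```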
Thanks Carl. Do we know why it works fine if there is only one source row, while TOS produces ???? marks when there are multiple rows in the source?
Is Charset being invoked only for one row?
Can you isolate the first row (the one that's working) and repeat it? That way you can make sure that the problem is not with the data. For example, the first row may be a header that uses standard characters and subsequent rows contain local variations that are causing the error.
It brings over only one row with Chinese characters, and not just the first: any single row you filter with a where clause. If the where clause returns more than one row, it displays ???
I also tested the following case.
I kept taking one row at a time from the source by filtering differently. The entire table transfers nicely to the target if I run the same map multiple times, changing the where clause each time.
Which components are you using?
Oracle to Netezza. Just source to target.
Netezza to Netezza. Just source to target.
I haven't worked with Netezza yet. Can you produce a properly encoded text file using only a tOracleInput and a tFileOutputDelimited? Set the character encoding to "CUSTOM" for the text file on the Advanced settings tab and add the appropriate Chinese encoding: "Big5", "GBK", "GB18030".
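For background on the symptom itself: Java produces literal question marks when text is encoded with a charset that cannot represent the characters. The sketch below only demonstrates that general behavior and is not a diagnosis of this particular job; the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;

// Illustrative only: encode text to bytes and decode it again with one charset.
class QuestionMarkSketch {

    static String encodeAndDecode(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        // getBytes(Charset) substitutes the charset's replacement byte
        // for any character the charset cannot map.
        byte[] bytes = text.getBytes(cs);
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        String sample = "\u4e2d\u6587"; // two Chinese characters, written as escapes
        System.out.println(encodeAndDecode(sample, "ISO-8859-1")); // prints ??
        System.out.println(sample.equals(encodeAndDecode(sample, "GBK")));
    }
}
```

If an intermediate step in a job falls back to a Latin or default encoding, the characters are already lost at that point, which is why writing a text file with an explicit encoding is a useful way to narrow things down.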