Sunday, December 29, 2013

Finding an Escaped Character with tScriptRules

If you want to look for a String containing a backslash-N (\n) using tScriptRules, you'll need a whopping 8 backslashes to escape the search criteria through layers of tScriptRules and JEXL.

tScriptRules is a Talend component hosted on the Exchange.  You can save a collection of business rules in a file and apply those rules in a Talend data flow.  tScriptRules is based on Commons JEXL, a JavaScript-like expression language, and Talend itself is built on Java.  This means that when you add an expression to a tScriptRules rule, you'll need to escape Strings when certain character combinations appear.

For example, suppose you pull the following String from a user interface

  This is a comment with a \n newline.

and you want to identify data like this for cleaning, perhaps through a manual review.

You can use tScriptRules to scan a table or text file, spooling the result to a tLogRow or other output component.  To do this, specify a JEXL expression using the regular expression match operator (=~) that will route the String to a tLogRow.  The reason for the explosion of backslashes is that we're escaping for both Java and JavaScript (JEXL).  The target String also contains a backslash, which must itself be escaped in the regular expression.

  input_row.line =~ '.*(\\\\\\\\n).*'

This sends any input like the sample String presented earlier to the Filter flow (versus the Reject flow).  If you'd like to reverse the flows, wrap the JEXL expression in the not operator (! or not).

  !(input_row.line =~ '.*(\\\\\\\\n).*')
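
To see why the count comes out to eight, here is a rough sketch in plain Java that peels the layers off one at a time.  It does not call tScriptRules or JEXL.  The stripOneLayer helper is a hypothetical stand-in for a string-literal parser that turns each doubled backslash into a single one, and the two-layer model (Java, then JEXL) is an assumption based on the description above.  It needs Java 11 or later for String.repeat.

```java
import java.util.regex.Pattern;

class BackslashLayers {

    // The pattern text as typed into the rule: 8 backslashes before the n.
    // Built with repeat() so this source file adds no escaping puzzle of its own.
    static final String AS_TYPED = ".*(" + "\\".repeat(8) + "n).*";

    // Sample data: one backslash character followed by the letter n.
    static final String SAMPLE = "This is a comment with a " + "\\" + "n newline.";

    // Hypothetical stand-in for one string-literal parsing layer:
    // every pair of backslashes collapses into a single backslash.
    static String stripOneLayer(String s) {
        return s.replace("\\\\", "\\");
    }

    static long countBackslashes(String s) {
        return s.chars().filter(c -> c == '\\').count();
    }

    public static void main(String[] args) {
        String afterFirst = stripOneLayer(AS_TYPED);     // assumed first layer (Java)
        String afterSecond = stripOneLayer(afterFirst);  // assumed second layer (JEXL)

        System.out.println(countBackslashes(AS_TYPED));     // 8
        System.out.println(countBackslashes(afterFirst));   // 4
        System.out.println(countBackslashes(afterSecond));  // 2

        // Four backslashes as a regex means "two backslashes, then n": no match.
        System.out.println(Pattern.matches(afterFirst, SAMPLE));   // false

        // Two backslashes as a regex means "one backslash, then n": match.
        System.out.println(Pattern.matches(afterSecond, SAMPLE));  // true
    }
}
```

Each layer doubles the count: one backslash in the data, two for the regular expression, four after one literal layer, and eight after two.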

The following job rejects one record because it contains the backslash-N sequence and allows two other records to continue on.

Job Escaping the Backslash-N Character

