Featured Post

Applying Email Validation to a JavaFX TextField Using Binding

This example uses the same controller as in a previous post but adds a use case to support email validation.  A Commons Validator object is ...

Monday, April 11, 2011

Three Error Handling Strategies in Talend Open Studio

You can recover from some errors.  Others, like system or network failures are fatal.  But even in the fatal case, your Talend Open Studio job should die gracefully, notifying the operations team and leaving the data in a good state.  This post presents three error handling strategies for your Talend jobs.

Some Talend Open Studio job errors are alternate paths that, though infrequent, occur often enough to justify special programming. This programming may come in the form of guard conditions, special logic applied to route the special case to another subjob.  For an example of these type of errors, see this blog post on ETL Filter Patterns.

Other errors are related to system and network activity or are bugs.  There are a few ways to handle this class of error in Talend Open Studio.

Do Nothing

For simple jobs, say an automated administrative task, you can rely on the exception throwing of Talend Open Studio.  An example is a simple input to output job where a database failure in writing the output results in a system error.  This is expressed in the Run View as a red stack trace.

Simple Job with No Extra Error Handling Configured 
Subjob or Component Error Triggers

Each subjob and component has a return code that can drive additional processing.  The Subjob Ok/Error and Component Ok/Error can be used to steer the error toward an error handling routine like the tSendMail component.  This example looks for a connection error (the database is off) or a file processing error (the database is on, but the table name is wrong).

Both an individual subjob and a finer-grain component can be tested.  The screenshot shows two tSendMail routines being called from an OnSubjobError trigger.

Error Handling Tailored to the Subjob (or Component)
While testing the individual subjobs and components has the advantage of providing error handling tied to the specific case, there are disadvantages in maintenance and testing.  Maintenance suffers because the job  becomes cluttered with extra components which can confuse the normal processing, less frequent processing, and the error handling.  Testing is harder because there are more test cases.

Sometimes, there is a need for this level of detail.  You may want to send a file that represents an intermediate stage of processing via email.  This file isn't available throughout the job, and not every failure can handle this.

tAssertCatcher

A more general strategy is to define an error handling subjob to be performed when an error -- any error -- occurs.  This has the important advantage of consolidating the error handling, dramatically reducing testing.  It puts the burden of testing for error conditions on Talend (where it belongs).

To implement the general strategy, use the tAssertCatcher component which will be invoked whenever any component throws an error.

A Shared Error Handler with tAssertCatcher

If there's a failure in the XSL component (tXSLT) or other component resulting in a Java exception, the job will continue with the error handler (in this case a tLogRow) attached to the tAssertCatcher. tAssertCatcher can route an error message to other handlers like a tSendmail.

tAssertCatcher Config
Components like tXSLT don't need any additional configuration to use tAssertCatcher.  The tFileInputXML has a "Die on error" checkbox that needs to be set.

In the following screenshot, the database component tMSSqlOutput_1 has "Die on error" set.  If the flag is not set, then the tMSSqlOutput will print a message and the tAssertCatcher will not be called.  This particular example caught errors from the connection component (bad login) and the tMSSqlOutput component (DB-generated unique constraint violation and invalid insert of identity column).

An Example with Database Components


Let Talend Work

Handling system errors is different than alternate paths and conditions that arise during coding a Talend job.  Sometimes, you'll have a specific error routine for a specific system error condition.  But where possible, let Talend throw the system errors and catch them with a tAssertCatcher. 

83 comments:

  1. Excellent tutorial.

    I've tried applying this concept using a tOracleInput component with a OnSubJobOk trigger to a tMap component. The tMap component won't seem to accept a trigger conditional, forcing me to divide my logic to 2 steps (checking first for a connection using tOracleConnection, then running my query using the tOracleInput).

    Could you explain why you can't run a trigger directly to a tMap? Is there another way to consolidate this error checking on a query without having to have an explicit connection component?

    Thanks!

    ReplyDelete
  2. Hi,

    tMap needs an input component like tOracleInput to provide it with operands. This is the flow of records from a query.

    A tOracleInput can include connection information. However, most of the jobs that I write use a separate tOracleConnection. This is so that the connection can be applied to all of the components in the job: tOracleOutput, other tOracleInputs. The configuration, number of connections, and properties like auto-commit are controlled centrally. It's rare that I have only a single DB input in a job.

    To handle a tOracleConnection error, connect your error handling code using an OnSubjobError trigger. Alternatively, a tLogCatcher can be configured to accept Java Exceptions which will be thrown if the tOracleConnection fails.

    Good luck.

    ReplyDelete
  3. Thanks, this has been a huge help!

    ReplyDelete
  4. Hi,

    Can you tell us if it's possible to resume from an exception ?
    Let's say I have an infinite loop that calls a tFTPPut every 30 minutes, if there is a connection problem the job will stop by throwing an exception, I'm trying to make it goo back to the loop and try again until the FTP server is available.

    Thanks

    ReplyDelete
  5. Have you tried checking the exit status of the tFTPPut component and building a loop from that? Prior to the tFTPPut component, set a flag "NumRetries". Direct the OnComponentError or OnSubjobError connector to a tJava component that will increment this variable.

    To avoid a long-running problem, define a context variable 'MaxRetries'. Before calling the tFTPPut, check this variable with a Run If connection or other component.

    If the exception is still being thrown, make sure there are not catchers registered for "Java Exception".

    ReplyDelete
  6. I developed my own sendmail component, but I cannot connect other component to it in "on component ok". Could you please help me?

    ReplyDelete
    Replies
    1. Hi,

      Can you verify that you have these CONNECTOR elements in your XML config?

      Delete
  7. I used the third method (A more general strategy is to define an error handling subjob to be performed when an error -- any error -- occurs ), even the job runs successfully with out any error on any components the tdie is executing, how do I avoid it?
    thanks in advance

    ReplyDelete
    Replies
    1. Hi,

      tDie is the wrong component. Try tAssertCatcher. I'm correcting the post with an updated screenshot.

      Delete
    2. thanks for your reply Carl, will try and let you know.

      Delete
    3. No problem. The section was updated with an example that uses tAssertCatcher. I verified that the job doesn't invoke the error handler if there are no errors and that the error handler is called if there is an error in the tXSLT and the tFileInputXML components.

      See the note about setting "Die on error" if available in a component.

      Delete
    4. thanks Carl, I saw the update, however I use lot of database related components such as toralceinput, toracleout, toracleSCD, I tested with tAssertCatcher and it doesnt pick up error like "cannot insert NULL" or anything that happens in that database components, but however it picks up errors like invalid objects names etc.. How do I handle some of the error that occurs using the database components. thanks in advance.

      Delete
    5. Are the "Die on error" settings on the tOracleOutput and tOracleSCD (Basic settings) checked?

      Delete
    6. Carl, I havent checked it, will try and see. thanks.

      Delete
    7. Carl, I didnt get a chance to test it yet, i was using the tLogCatcher before and today I came to know about tAssertCatcher, so what is the difference between them. thanks.

      Delete
    8. The difference between tLogCatcher and tAssertCatcher is the schema used to communicate the errors in the project.

      tLogCatcher uses moment, pid, root_pid, father_pid, project, job, context, priority, type, orgin, message, and code.

      tLogCatcher will also capture tWarn messages.

      tAssertCatcher uses moment, pid, project, log, language, origin, status, substatus, and description.

      Both handle tDie messages although tAssertCatcher calls these tAssert messages.

      Delete
    9. thanks Carl.

      Delete
    10. Carl, I tried checking the "Die on Error" and it sends the error through tAssertCatcher and I have a sendmail component, but I know the "Die on Error" stops the process, but how do I continue the process just getting the error messages. thanks.

      Delete
    11. Use a Run If connector routed to a tWarn to throw an error to a tLogcatcher. For more details on this, search the blog for a post called "Handling Database Warnings" I just put up (May 7, 2012). There's a video too. Be sure to Like and +1 anything you found useful.

      Delete
    12. thanks Carl, will check it out.

      Delete
    13. Carl, I see in the RunIF you check for number of lines inserted == 0, what do I need to do, if it partially inserts rows and warns later or for example as I asked before what if it tries to insert null values to a not null column etc., what global variables do I need to use in the RUN IF? thanks.

      Delete
    14. If you're able to turn off batching, you can open up a rejects stream which will give you the error that resulted in a failed insert (not null, unique, etc).

      In that case, you'd use _REJECTED > 0. However, if you need to have batching, say for performance reasons, continue with _INSERT.

      You can always do the null checks ahead of time with a tFilterRow or similar component.

      Check the "Handling Database Warnings" post for an update (section "Getting More Information").

      Delete
    15. ok thanks Carl, will check there

      Delete
    16. Hi Carl, not sure if you have a post for handling using the same table for multiple lookups without using the same table multiple times, for example, take a Date Dimension table, if I need to reslove keys for the Birthdate, Hiredate, TerminationDate from the Date Dimension table, currently I using the same table 3 times for resoving keys for the 3 fields, how do I overcome this by looking up the table one time and reslove all the 3 keys, please let me know. I am using 4.2.3r.....version of Talend. thank you.

      Delete
    17. Hi,

      Take a look at a post from May 14, 2012 called "Efficient Lookups with Talend Open Studio's Hash Components" on this blog.

      Delete
  8. thanks Carl, was busy, will take a look now.

    ReplyDelete
  9. Hi Carl, I posted a question on "Efficient Lookups with Talend Open Studio's Hash Components" about how to clear the thashinput/output from the memory, please take a look and let me know. thanks.

    ReplyDelete
  10. Hi Experts,

    My talend job reads some host names from a table and then I am using the tSSH component to get the df -K command outputs for those hosts. In a case where talend is not able to connect to a host I want to log a warning and resume the command execution with the next record. How do I achieve that?

    Cheers!

    Chanaka

    ReplyDelete
    Replies
    1. Hi,

      Take a look at the May 22, 2012 post "Your Own Exception Handler for Standard Talend Components" on this blog.

      Delete
    2. Hi Carl,
      How can we genereate Batch_id to load in target,which is not present in source.

      thanks,
      Ranjan

      Delete
  11. Hi Carl,
    thanks for excellent blog!

    I have a problem related to exception handling on subjobs. My job has many subjobs (up to 4 levels down) and i'm trying to establish a single exception handler in the main job. I've tried different strategies but something fails - maybe you can suggest some general principles to avoid typical mistakes?

    Regards,
    Martins

    ReplyDelete
    Replies
    1. I thought not subjobs but child jobs.

      My goal is to have one place for all warning and error handling - to write them to database and more dangerous ones to report through e-mail.

      One of problem is that when I put a tWarn in a child job (and it executes) and a tLogCatcher - all works fine. But if try to catch that tWarn message in parent job - it doesn't work. The same with tDie.

      The main question is - is it possible at all to catch child job errors and messages in parent job?

      Delete
    2. Hi Martins,

      Everything in Talend becomes a Java class, so I tried a few combinations of child jobs, tAssertCatcher, and re-throws ("throw new Exception()"). I didn't find a solution worth posting.

      There's definitely a need to break jobs into child jobs (.class files size, reusability). However, I'd try to eliminate one of the levels on the scheduler side by finding a job scheduler that supports processing chains (sometimes called "job streams").

      Maybe there's a refactoring of your jobs that you can do to eliminate another layer? Something like grouping all of the lower-level child jobs that behave in a similar manner together if they're separated by different use cases today.

      Delete
    3. Thanks, Carl,
      I chose a simple workaround - using copy-paste I put in each childjob a tLogCather component connected to tRunjob which calls another child job that does the necessary work.

      Delete
  12. Hi Carl, I have a problem with the tOracleoutput component, where I set the action on table to be drop and create, first time it runs fine but later on the action on table is not there and see only the action on data option in the basis settings, wondering why it is changing, not sure what I am missing do you have any blog on this I can take a look. thanks in advance.

    John

    ReplyDelete
    Replies
    1. Hi John,

      I haven't run into your particular problem with tOracleOutput. The Oracle work I did with Talend used tables that were already created. In general, I create the tables with scripts rather than the Talend "Action on table" settings.

      This post may be of interest as it describes working with the DDL generated from the "Action on table" settings. Different DB, though. http://bekwam.blogspot.com/2011/10/controlling-ddl-behind-talend-open.html

      Check the JIRA on Talend for your particular version.

      Good luck.

      Delete
    2. thanks Carl, will take a look.

      John

      Delete
  13. Hi Carl, I just posted a question in the following link

    http://bekwam.blogspot.com/2011/05/procedural-update-in-talend-open-studio.html?showComment=1341891554898#c3966067130247382279

    just sending this as a backup, could you please take a look and let me know there is way to handle that requirement in Talend.

    thanks
    John

    ReplyDelete
    Replies
    1. There is...Take a look at http://bekwam.blogspot.com/2012/07/theta-join-with-talend-open-studio.html.

      Delete
  14. Hello Carl!
    I have a service installed in windows which fires commands every minute on a user made terminal to fetch data from TPF window into database. In case this service stops, database is no longer updated.
    How can I use Talend to notify authority via email that the service has stopped working?

    thanks in advance!

    ReplyDelete
    Replies
    1. Hi Hitesh,

      Job scheduling software like Control-M will keep track of your jobs, and if one isn't run, an alert can be generated.

      If you're not using this type of software with Talend, you'll need to provide your own mechanism. I see a lot of batch jobs -- Talend included -- using a polling process to check to see whether something has run. The Talend job writes out or updates a record with a process status. This process status is updated at the end of the job.

      A polling job is run that looks at this record, examining the process status, last run date, and other information recorded by Talend.

      Delete
  15. Hi Carl
    I am really impressed with your answers. Much appreciated.

    I am new to talend and I am working on an enhancement to fail the entire job when there is first error (whatever be the error). Cos the downstream system is heavily dependentant on previous job's data.

    So my requirement is --> I have a workflow in which 10 jobs are attached most are sequential load and only few are parallel. After going through all these threads, I checked on Die on error for all targets and added a joblet which has tlogcatcher and sendmail. Is this sufficient? or do i need to include tDie at every job?

    Say out of 10 sequential job, 2nd one has failed, then entire job should just STOP at 2nd one and do nothing than sending an email from failure_job_alert joblet. Please help me with this.

    ReplyDelete
  16. I don't use Enterprise (joblets), but for TOS, I catch the errors using tLogCatcher and route the data to an error handler (email, etc) in the calling job.

    ReplyDelete
    Replies
    1. so when we use tlogcatcher does it fail the entire job?

      Delete
    2. tDie at the end of the tLogCatcher subjob.

      Delete
  17. Hi Carl, this question is related to CLOB datatype in oracle, my source and destination is oracle and one of the comment field is CLOB and using Talend to move the data, I used the tConvertType component which changed to the type to Object, but I am getting error, Please let me know if you have any experience with this datatype.

    thanks
    John

    ReplyDelete
    Replies
    1. Hi John,

      Have you seen the CLOB-related posts on the blog? At the top of the page type 'clob' in the search.

      Here's one on working with XMLType, a CLOB-like Oracle data type.

      http://bekwam.blogspot.com/2011/12/exporting-oracle-xml-to-file-with.html

      And another one: http://bekwam.blogspot.com/2011/05/large-chunks-of-text-in-talend-open.html

      There are a few other hits.

      Delete
  18. thanks Carl, I didnt notice it, will take a look.

    John

    ReplyDelete
  19. Thanks curl your error handling systems are really useful to me..

    ReplyDelete
  20. Hi, I am new to Talend Studio. I designed a job to transform data from phpmysql to phpMysql itself. The following error occurs:

    java.net.ConnectException: connect: Address is invalid on local machine, or port is not valid on remote machine
    ......
    Exception in component tMysqlOutput_1
    java.lang.NullPointerException
    at routines.system.RunStat.sendMessages(RunStat.java:286)
    [statistics] connecting to socket on port 4041

    However the port 4041 in the console error is shown differently at different turns of run. The actual port is 3306. What could be the reason for this connectivity problem?Is this because of installation error?Please help.Thanks.

    ReplyDelete
    Replies
    1. Can you compare your Talend MySQL job with the job shown in the video? I apologize for the sound quality; my later videos are better. http://www.youtube.com/watch?v=5L1AtlPqsVg

      Delete
  21. frnds i want to send Email alert to my Team mates when my MDMconnections goes down...in ETL talend..how can i do this ..??? im using oracle DB
    pls help me thanks

    ReplyDelete
    Replies
    1. Have you tried a tSendMail component as demonstrated in this post? http://bekwam.blogspot.com/2012/05/handling-database-warnings-with-talend.html

      Delete
  22. Is there a way to identify the log files that are generated by Talend Administrator ?
    I would like to get those logs and email. Usually, I download the logs from the administrator from job execution monitoring. I would like to automate and get these logs in email.
    Thanks in advance!!

    ReplyDelete
    Replies
    1. Have you tried writing a job or joblet that uses the tSendMail component to send the files as an attachment?

      Delete
  23. Hello,

    I have an issue with a rather complex job, and no idea on how to catch the error. I would be thankful if you had a suggestion.
    I have roughly 200 remote database I'm trying to call and retrieve data from.
    However I sometimes get a
    "Exception in component tMSSqlInput
    java.sql.SQLException: Network error IOException: Connection timed out (...)"

    This kills the job ( taking the other threads and the next with him ) resulting in much less data retrieved than actually available.

    I would like to catch that error and kill only the one thread it concerns, or any other idea that would help the job to fallback gracefully.

    Any hint?

    Regards.

    ReplyDelete
  24. Dear Carl,

    Thanks for the nice tutorial. In a Talend job, we would like to log all errors to a file send an email to the administrator.

    We have no problem about the email part, however I found that sometimes some Talend components only write error to the stderr without triggering an onComponentError event or giving a Rejected output.

    Any suggestion on how to properly log these errors? We understand that if we execute the Talend job in command line, we can use > 2&1 to capture all stdout and stderr, however we are seeking for solution coded inside the Talend job.

    Thanks.

    ReplyDelete
    Replies
    1. Hi,

      In my jobs, I distinguish between errors and rejected records. I don't want to abort a million record job because one record is missing an email for example. I make heavy use of tSchemaCompliance or tScriptRules (if you need more flexibility) and pipe the rejected records out to a text file for an email attachment. The email is labeled "REJECTED RECORDS" indicating that someone who knows the source data should investigate. It may be a non-issue.

      For true errors that will abort processing like a network error or a bad password, I catch exceptions in a tLogCatcher. These go into an email flagged as "ERROR" for a visible call-to-action for the ops team.

      Delete
    2. Try tRedirectOutput

      Delete
  25. Hello, Can you explain about the Error Management. execution job can continue with or with out error. ???

    ReplyDelete
    Replies
    1. Many components have a "Die on error" operation that will allow processing to continue of the value is unchecked

      If you catch the error, you can continue processing new components. I wouldn't recommend trying to return to the area whether the exception is thrown in case you wind up in an infinite loop. That is, if you catch the error, connect the tLogCatcher up to new components on the canvas.

      Delete
  26. I'm running Studio v5.6.1 and find that no components seem to implement OnComponentError. Looking at the source I can't figure out how to do it legitimately. In my custom components I end up having to call:

    <%=cid%>_error(e, "<%=cid%>", globalMap);

    When I wish to activate the OnComponentError flows. This works, but is probably not the way to do it.

    ReplyDelete
  27. Hi, i'm running the same job concurrently and that writes on a DB table. Before write i check if already exists a certain record, if not it insterts. In this table there isn't a constraint violation and multiple line will occur. If i set a constraint violation on that table, the job will throw an exception but i want it to continue to insert without duplicate lines, How can i do that?

    ReplyDelete
    Replies
    1. The concurrency problem you're facing here is that the INSERT from the first process can come between the check (a SELECT) and the INSERT from the second process. One approach is to make the check + insert atomic in a stored procedure that locks the table when the procedure is entered, performs the SELECT, conditionally performs the INSERT, then unlocks the table. If the INSERT or SELECT takes a long time, you may not want to go with this approach because your processes will block too much. However, if the INSERT is quick, this is a robust way keep the checks in sync with the additions.

      Another option is indirection. Have the concurrent processes load a staging table with a meaningless surrogate key. Everything gets inserted, including "dups" that are really distinguished by the special key (generated by auto identity, sequence, etc). Then, at the end of the concurrent processes, run a single process to move the data to its correct location without duplicates.

      I don't recommend dropping the constraints as you'll start to push data issues up into the app or reporting layers.

      None of these recommendations are specific to Talend.

      Delete
  28. Hi Carl,

    Do you need a tAssertCatcher in every job for error handling or you can use one on the top job only?

    I have a chain of jobs one call others etc and I have one action I want to execute when ever there is any error.

    Thanks,

    David

    ReplyDelete
    Replies
    1. You might want to try a tDie and see if you can get the CHILD_RETURN_CODE from the tRunJob. A positive error code in a tDie implies a failure, then you can unpack this in the caller. The umbrella exception handler might not give you enough insight into your job chain and might make recovery more difficult.

      Good luck

      Delete
  29. I'm getting Null pointer exception in tPartitioner_1 component .Can u suggest what might be happening there.

    Exception in component tPartitioner_1
    java.lang.NullPointerException
    at java.util.concurrent.ConcurrentHashMap.put(Unknown Source)
    at java.util.concurrent.ConcurrentHashMap.putAll(Unknown Source)
    at java.util.concurrent.ConcurrentHashMap.(Unknown Source)
    at aghub_irt.test_0_1.TEST.tOracleInput_1Process(TEST.java:14423)
    at aghub_irt.test_0_1.TEST.runJobInTOS(TEST.java:17428)
    at aghub_irt.test_0_1.TEST.main(TEST.java:17042)
    [statistics] disconnected

    ReplyDelete
    Replies
    1. Unfortunately, I haven't used the tPartitioner component before. Double-check the Oracle input by routing it to a tLogRow to make sure that the query is producing a proper result set.

      Good luck

      Delete
    2. The key used to partition could be null which might be causing this problem.

      Delete
  30. Hi,

    I am trying to increase the throughput of my Talend Job I store all required records to a tHash and then try to inset this records in my table the commit speed is just 400 rows/second using tJdbcOutput (ojdbc7.jar - Talend 5.6.1). Using an existing connection the tJdbcOutput was even slower. I need to commit atleast 500000 records at a time and it takes too long almost an hour. I have tried experimenting with the batch size and commit every but the problem is either all the records should commit or all of them should rollback if an error occurs. Any ideas on how this can be achieved.

    Abhishek

    ReplyDelete
    Replies
    1. Hi,

      I wouldn't use a hash for this and instead opt for a transaction-less (NOLOGGING) staging table into which you can load the records as fast as possible. If there is any failure, you'll provide your own compensating transaction which will DELETE or TRUNCATE this staging table. Make sure that the Talend job is running on the same machine as the server to reduce the networking performance hit.

      To get the contents of this staging table into the target, use in-database Talend components (SQL template, tELT, etc). JDBC can be very inefficient with a batch of this size. Often times, to do a complex transformation with JDBC, you'll spend as much time pulling data out of the database as you will with putting data into the database.

      Good luck

      Delete
    2. Thanks a lot for the response.
      Problem is in a production environment I will not be allowed to Run the Talend application on the same machine as the server on which the Database resides.
      In addition there is some data cleaning too which is done in memory using tJava components before trying to commit to the database. We have a stage schema in which we are doing the transformations problem is these data spikes need to be handled in the Stage itself. To transfer data from Stage to Target we are using PL/SQL.
      Is there anyway I can optimize the the jdbc commit part since fetching to memory seems reasonably quick.

      Delete
    3. I usually don't do a lot in Java code and instead have the database do as much work as possible. For example, you might stage the data quickly, then run bulk updates to set flags, add in normalized columns, or remove items. Say you needed to normalize a category code. Load the data in as quickly as possible (NOLOGGING, no indexes), add a column to the staging table for the normalized value, then run a bulk update on the new column with the normalized code (retaining the old value in a different column).

      Something like this is preferable to Java because of error handling. Working in stages means that the DB load is not an all-or-nothing. A failure can produce valuable troubleshooting info.

      I stay away from such a large amount of uncommitted work. I've seen things really bog down at > 100k records and a few years ago trashed a redo log with a data load of 1m records.

      If you're committed to keeping the Talend job as-is, you could try a different driver. If you're using the thin driver, install the Oracle client where Talend is running and use the OCI driver. That would speed the network communication between the Talend job and the database, but won't address the fundamental problems.

      Good luck

      Delete
    4. Thanks Carl much appreciated !!

      Delete
  31. Hi,
    I have one job ,that it enters records into the database.code and every thing was fine.
    now my requirement is ,some times, some problem with my database so my job is taking more time to start (like 10min or more for 40000 records).so in this case how can i handle my job to terminate the process based on the time.and what component can i place to get same( long time taking) to my mail
    Awaiting for reply,
    regards,
    rekha

    ReplyDelete
  32. Hi Carl,

    Great Post!! Cleared many doubts.
    I have one question to ask.
    I have one job and inside it I have prejob and postjob which is making connection and closing it.
    In between prejob and postjob I have 10 parallel subjobs which are loading data from one table to another.

    In my case, if any of the subjob fails, I do not want any other subjob to stop instead continue with their task, which is default behavior and its working fine.

    But for the jobs which failed I want to log that data into table. By reading above article and q&a, I used tlogcather component with each subjob. Its like
    i/p1 -tmap ->O/p1->on component error ->tlogcathcher1 ->tlogrow1
    i/p2 -tmap ->O/p2->on component error ->tlogcathcher2 ->tlogrow2

    Questions:
    In tlogrow I am getting error message repeated, i am not able to
    2017-02-07 07:10:52|dVjZYl|dVjZYl|dVjZYl|TEST_NEW|Hana_job_with_different_length|Dev|6|Java Exception|tSAPHanaOutput_2|com.sap.db.jdbc.exceptions.BatchUpdateExceptionSapDB:SAP DBTech JDBC: [274]: inserted value too large for column: Failed in "NAME" column with the value 'Lyndon'(input position 11)|1
    2017-02-07 07:10:52|dVjZYl|dVjZYl|dVjZYl|TEST_NEW|Hana_job_with_different_length|Dev|6|Java Exception|tSAPHanaOutput_2|com.sap.db.jdbc.exceptions.BatchUpdateExceptionSapDB:SAP DBTech JDBC: [274]: inserted value too large for column: Failed in "NAME" column with the value 'Lyndon'(input position 11)|1

    Question2: I want subjob name to be printed but instead it is showing component name"tSAPHanaOutput_2". I updated component name in the subjob and even gave some name to subjob itself, but it is not reflecting. What i am missing?

    Question3: As I shown above, I added on component error to o/p component only, what if error comes from i/p component, do I need to place logcatcher for each component?

    question4:
    This is related to my design. In case my subjob fails and admin come to know that this subjob got failed, then how he will run only that particular subjob? Is there any component to manage that.

    Sorry for putting so many question which may seem lame to you, but I am new to Talend and this dwh thing. I will appreciate you response on this.

    Thanks,
    R

    ReplyDelete
  33. Hi Carl,

    Regarding my questions
    q1: understood and fixed it.
    q2: understood. tlogcatcher will always show component name which is throwing an error. Fixed it and now I can see name which is helping in identifying the job.
    q3: Understood: Only one logcatcher is enough. It will capture error thrown from any of the component in the job.
    q4: Still investigating that how I can manage that.
    Any suggestion for q4 would be great help to me.

    Thanks,
    R

    ReplyDelete
    Replies
    1. Hi,

      I haven't coded in Talend for a while, but I think my advice will still hold. Break your Talend job up into several jobs and schedule them in a job chain using scheduling software like Control-M or AutoMate. These scheduling programs identify the failing job for the admin, allowing them to re-run it in isolation or pick up where the failure left off. The enterprise version of Talend may have this feature, but I haven't used it.

      I don't know of a clean way to communicate a failure point in a Talend Open Studio job. When I didn't have a scheduler more than cron or the Windows Task Scheduler, I had to pack in extra logging, reporting, and global variables to help admins. This type of custom solution involved a lot of training -- you're effectively building out the scheduling software mentioned in the prior paragraph. While this works, it is usually not a very successful project because you'll have to keep working on the job past the deadline since all the failures likely won't be known at the start.

      Good luck,
      Carl

      Delete
  34. Hi,

    Nice tutorial. Can you please provide me materials on how to write the exceptions and filters in tmap Talend.

    ReplyDelete
  35. Great BlogPost..Really knowledgeable. Thank you.
    I have one question..How can we handle the Salesforce connection in Bulk Upload.Our kob is failing almost every day , But after one restart it works fine means connection is successful

    ReplyDelete
  36. I insert record into store procedure from other oracle table.
    Records is 100 and after 50th record inserted then getting error and job stop.
    I want that if i getting error after inserting record into store procedure , then error goes to one separate table and remaining records insert into store procedure.
    Kindly help us .

    ReplyDelete
  37. I insert record into store procedure from other oracle table.
    Records is 100 and after 50th record inserted then getting error and job stop.
    I want that if i getting error after inserting record into store procedure , then error goes to one separate table and remaining records insert into store procedure.
    Kindly help us .

    ReplyDelete