Featured Post

Applying Email Validation to a JavaFX TextField Using Binding

This example uses the same controller as in a previous post but adds a use case to support email validation.  A Commons Validator object is ...

Friday, March 4, 2011

Excel 2007 Dates in Talend Open Studio

Talend Open Studio builds metadata from Excel 2007.  When dates are used in Excel 2007, Talend Open Studio ignores the formatting that may have been put in the spreadsheet, even if the schema is specified to handle strings rather than date objects.

Excel 2007 stores its dates using a numeric representation that doesn't convey a particular format.  Take the following spreadsheet as an example.

A Spreadsheet Using Different Date Formats
When this file -- generated from Microsoft Excel 2010 Professional -- is read into Talend Open Studio as metadata using the "Create file excel" command, each date is displayed in the same way.

Excel Metadata in Talend Open Studio
This is the case whether "date" is specified when describing the schema or "string" is specified.  If "date" is used, then a string pattern supporting formatting can be added.  See the following screen shot which uses a date for the first date field and string for the remainder.

Mixed Settings for Handling Excel Metadata
The result is that a formatted string value is available to components like tLogRow.  The following screenshot shows the results of a Job containing a tFileInputExcel run into a tLogRow using the mixed schema.  Notice the difference in formatting between the first date field and the other date fields.

Examining a tExcelFileInput Run

For flexible Jobs, treat dates as dates throughout your Talend Job.  RDBMS have different default formats and you'd like to avoid conversions in your SQL (like Oracle TO_DATE functions).  If the date format is important, it's best to tackle this at the source.  Excel 2007 can save a spreadsheet as a delimited file that will preserve the dates exactly as they're seen on the screen.

3 comments:

  1. Great tutorial on Excel and Talend. Thank you!

    ReplyDelete
  2. Thanks, it did clear few doubts in my mind.
    However in my task I have generated dates using RAND function in my Excel sheet.
    They are displayed as (1) November 30, 2001 (2) May 2, 1980

    My SQL server database's table's column has data type as 'smalldatetime'.

    I am not being able to copy data from Excel to SQL while keeping both, input and output's datatypes as Date or String.

    What do I need to do?

    ReplyDelete
    Replies
    1. What does the date look like if you put in a tLogRow immediately after the excel input component?

      Delete