Talend Open Studio builds metadata from Excel 2007. When dates are used in Excel 2007, Talend Open Studio ignores the formatting that may have been put in the spreadsheet, even if the schema is specified to handle strings rather than date objects.
Excel 2007 stores its dates using a numeric representation that doesn't convey a particular format. Take the following spreadsheet as an example.
|
A Spreadsheet Using Different Date Formats |
When this file -- generated from Microsoft Excel 2010 Professional -- is read into Talend Open Studio as metadata using the "Create file excel" command, each date is displayed in the same way.
|
Excel Metadata in Talend Open Studio |
This is the case whether "date" is specified when describing the schema or "string" is specified. If "date" is used, then a string pattern supporting formatting can be added. See the following screen shot which uses a date for the first date field and string for the remainder.
|
Mixed Settings for Handling Excel Metadata |
The result is that a formatted string value is available to components like tLogRow. The following screenshot shows the results of a Job containing a tFileInputExcel run into a tLogRow using the mixed schema. Notice the difference in formatting between the first date field and the other date fields.
|
Examining a tExcelFileInput Run |
For flexible Jobs, treat dates as dates throughout your Talend Job. RDBMS have different default formats and you'd like to avoid conversions in your SQL (like Oracle TO_DATE functions). If the date format is important, it's best to tackle this at the source. Excel 2007 can save a spreadsheet as a delimited file that will preserve the dates exactly as they're seen on the screen.
Great tutorial on Excel and Talend. Thank you!
ReplyDeleteThanks, it did clear few doubts in my mind.
ReplyDeleteHowever in my task I have generated dates using RAND function in my Excel sheet.
They are displayed as (1) November 30, 2001 (2) May 2, 1980
My SQL server database's table's column has data type as 'smalldatetime'.
I am not being able to copy data from Excel to SQL while keeping both, input and output's datatypes as Date or String.
What do I need to do?
What does the date look like if you put in a tLogRow immediately after the excel input component?
Delete