Featured Post

Applying Email Validation to a JavaFX TextField Using Binding

This example uses the same controller as in a previous post but adds a use case to support email validation.  A Commons Validator object is ...

Friday, March 4, 2011

Five Date Formats in Talend Open Studio

It's best to standardize on one date format throughout your application.  The date format for the RDBMS is a good candidate, like "dd-MM-yy" for Oracle.  But dates can come from other sources, so you'll occasionally need to format a date.

Schemas in Talend Open Studio can automatically turn text strings into date objects.  Once Talend recognizes a field as a date, that date can be made available to other components without having to worry about format.

The following screenshot is a text file with fields of various date formats.

Input File with Different Date Formats
Load the file using a tFileInputDelimited.  When creating the schema, define each field as a date (rather than a string).  Specify a pattern to handle the different formats.  Talend Open Studio provides a handy code complete that will help select the correct format.  The following date pattern is for "December 2, 2010" which isn't in the code complete list.

"MMMM dd, yyyy"

MMMM uses the full month, like "December".  MMM is for an abbreviated month, "Dec".  MM is used for numeric values ("12").

This is a simple tFileInputDelimited to tLogRow Job.
Job Showing Date Conversions

The following schema identifies each field as a date and provides a date pattern.  It's for the tFileInputDelimited.

Schema with Different Date Format
This schema converts the input strings to a date format.  Once the fields are in date format, they can be reformatted.  The tLogRow schema applies a uniform date format to all of the date fields.
Schema Applying a Common Date Format
Running the job produces the following.

Running a Job with a Common Date Format
When dealing with dates, it's best to standardize on a format.  However, dates can come from a lot of other sources -- a web application, legacy data, external interface -- so, become familiar with handling dates in a schema or a dedicated component like tConvertType.

4 comments:

  1. Does the implicit date conversion work only with text files? In my tests, date conversion from a database String/Varchar seems to fail.

    ReplyDelete
    Replies
    1. Hi,

      I've always stored dates as their proper format in the database, so I'm not sure about the implicit conversion. Are you using type=Date and dbtype=VARCHAR? I would try a tConvertType as a next step.

      I've noticed with Excel that the date handling isn't straightforward. The spreadsheet stores the date as a number rather than a String. So, once I wasn't able to read in the value then convert it using a date format (it was already a date).

      Delete
  2. I was using Varchar for date in the database, so, I suppose it just doesn't do implicit conversion from an RMDB. I ended up just converting my dates to proper date fields in the DB.

    Thanks.

    ReplyDelete
  3. Nice blog Carl thanks for sharing.

    ReplyDelete